My big complaint with SIP is latency - we should be able to get latency way down, and call pleasure is enhanced significantly with low latency - how does this help?
Latency in a VoIP phone call has nothing to do with the control protocols and everything to do with your connectivity and the configuration of the jitter buffers on either end.
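To make the jitter part concrete, here is a minimal sketch of the running interarrival jitter estimate that RTP receivers keep (the smoothing by 1/16 follows RFC 3550). The function names and the use of milliseconds are my own choices for illustration; a real jitter buffer would size itself from an estimate like this plus its own policy.

```python
def update_jitter(jitter, transit_prev, transit_now):
    """One step of the RFC 3550 estimate: J += (|D| - J) / 16."""
    d = abs(transit_now - transit_prev)
    return jitter + (d - jitter) / 16.0


def estimate_jitter(send_times, recv_times):
    """Run the estimator over matched send/receive timestamps (same units, e.g. ms)."""
    jitter = 0.0
    prev_transit = None
    for sent, received in zip(send_times, recv_times):
        # Transit time includes any fixed clock offset; only its variation matters.
        transit = received - sent
        if prev_transit is not None:
            jitter = update_jitter(jitter, prev_transit, transit)
        prev_transit = transit
    return jitter
```

A perfectly steady path gives a jitter of zero no matter how long the path is, which is why a long but stable route can get away with a small buffer.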
There should be no practical difference in audio performance, either latency or quality, between different VoIP signaling protocols unless something is wrong. If you're experiencing latency on calls that is significantly greater than your network latency plus jitter, then something is either misconfigured or malfunctioning.
That said, as RIPT attaches the media to the signaling in a single flow of packets, it should have far fewer problems negotiating NATs and similar scenarios that have been known to cause problems for SIP. Those problems cause complete loss of audio in one or both directions, though; anything affecting call quality is almost certainly a network performance issue.
The idea that RIPT will be phone endpoint to phone endpoint for media is ridiculous, which means latency will go up. I hope I misread the spec!
Because phones are usually double-NATed / double-firewalled, Vonage / RingCentral and friends will have the phones connect to their servers and pass the media that way.
I finally got the media side to hand off pretty directly from our phone system to whatever media endpoint is suggested to handle the call termination for SIP - latency is improved, but the media bridges to the local telco etc. seem to be adding their own latency.
What is crazy is you used to be able to call three floors down (separate network but next door) and get no latency. Literally. The call was circuit switched through something in town or sometimes closer.
Now with RIPT that media is going to travel to the RingCentral west point server, then back, to go three floors, and YES, latency will be higher than with 30-year-old tech. Imagine that. Billions spent, and the quality is worse.
As you've noted, the direct media thing is great but doesn't really work reliably with phones behind NAT unless there's special processing at the NAT side. And of course most NAT devices that offer said processing get it wrong and usually make things worse.
I've been working in VoIP for years. We do direct media when we can, but we bounce it through the server fairly often because NAT sucks. RIPT seems to be built primarily with cloud-hosted applications in mind, which need to scale and are likely being connected to by clients behind NAT. It's explicitly not like SIP, where every user agent is equal and would theoretically communicate directly where possible.
This is why I wish IPv6 had a real killer app to make it catch on, it'd make life so much easier for me and SIP would work how it's supposed to.
Usually that means your ATA is not configured properly. With the correct dial plan settings, call setup time should be comparable to the PSTN.
I have many other complaints with SIP, mostly related to the assumption by many clients that there is never any packet loss during call setup, but call setup time is not one.
SIP (Session Initiation Protocol) is used for initiating, maintaining, and terminating sessions, while media transport is delegated to other protocols (RTP, SRTP).
This would change with RIPT, and latency is definitely a consideration:
I'm upvoting your comment because, while technically naive, it squarely addresses what matters to users. As others noted, SIP and RIPT have absolutely nothing to do with call quality. RIPT is intended to make it easier to write software that implements calling features.
In a former life I was the director of call-processing / media-handling development at Vonage. SIP was an overly-complicated, overly-featureful protocol that fought us at every turn. But our really frustrating problems were with our users' Internet connections. Nobody cared what calling features we had, but media latency, jitter, and just plain connection flakiness were major drivers of customer churn.
For a VoIP provider, SIP is a vast improvement over telco standards such as SS7 and Megaco (H.248). But it envisioned a utopian, distributed world where end-users directly connected with one another over the Internet, with maybe a little bit of help from directory services. In the real world, VoIP is implemented as a strictly client-server architecture. This is driven by needs for security, PSTN interconnection, accounting, and above all coordination of state management. SIP smears call state over the entire network; every piece involved is sovereign of its own little part of the puzzle, and has to cajole the other parts into making appropriate state changes. For example, call parking (putting a call on hold and then picking it up on another phone) in idiomatic SIP requires a parking server just to keep the call alive. [0] [1]
Based on a quick look, RIPT seems to resolve some of SIP's headaches. It's still too complicated. For customer-facing apps all we really need is a persistent connection to the server, with events saying "customer wants to call X", "incoming call", "customer pushed button Y" and "here's a media stream". Let the server handle call state. PSTN and other interop is a whole 'nother problem space with little in common with customer-facing features, only requires call-leg state rather than call/session management, and should be addressed with a completely different protocol.
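For what it's worth, the "events over a persistent connection, server owns the state" idea can be sketched in a few lines. Everything here is hypothetical (the event names, the JSON wire format, the `Server` class); it is not RIPT or any real product's protocol, just an illustration of how little the client needs to know when the server keeps the call state.

```python
import json
from dataclasses import dataclass
from enum import Enum


class EventType(Enum):
    DIAL = "dial"                    # "customer wants to call X"
    INCOMING_CALL = "incoming_call"  # "incoming call"
    BUTTON = "button"                # "customer pushed button Y"
    MEDIA = "media"                  # "here's a media stream"


@dataclass
class Event:
    type: EventType
    call_id: str
    detail: str = ""

    def to_wire(self) -> str:
        """Serialize to one JSON line (an assumed format, for illustration)."""
        return json.dumps(
            {"type": self.type.value, "call_id": self.call_id, "detail": self.detail}
        )

    @staticmethod
    def from_wire(line: str) -> "Event":
        raw = json.loads(line)
        return Event(EventType(raw["type"]), raw["call_id"], raw.get("detail", ""))


class Server:
    """Holds all call state; clients only send and receive events."""

    def __init__(self):
        self.calls = {}  # call_id -> state string

    def handle(self, event: Event) -> str:
        if event.type is EventType.DIAL:
            self.calls[event.call_id] = "ringing"
        elif event.type is EventType.MEDIA:
            self.calls[event.call_id] = "active"
        elif event.type is EventType.BUTTON and event.detail == "hold":
            self.calls[event.call_id] = "held"
        elif event.type is EventType.BUTTON and event.detail == "resume":
            # Any phone on the account can resume: the server kept the call alive.
            self.calls[event.call_id] = "active"
        return self.calls.get(event.call_id, "unknown")
```

With the state in one place, parking a call is just a "hold" event from one phone followed by a "resume" event from another, with no separate parking server needed to keep the call alive.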
I have configured SIP to avoid some of the client / server model. You can do that on a local network you control, and if you can get your firewall to play along or can expose a SIP endpoint to the public internet directly, you can cut a fair bit of latency out.
My question - will RIPT REDUCE latency in the ordinary case vs well configured SIP?
I've driven latency reduction by doing the onsite SIP PBX to a set of analog phone lines - in some setups that BEATS the round trips.
SIP would work much better if every phone were on a publicly routed unfiltered network. But that simply isn't the case. Others posting are claiming the phones under RIPT will be able to talk directly to each other - but how?
The stack adds latency - audio encoding, bufferbloat, etc - whatever is happening results in worse audio latency than 30-year-old tech, which makes little sense.
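A rough way to see where the milliseconds go is to add up a one-way budget. This is only a back-of-the-envelope sketch: the function and parameter names are mine, and the numbers in the usage note are illustrative assumptions, not measurements of any real deployment.

```python
def one_way_latency_ms(frame_ms, codec_lookahead_ms, network_ms, jitter_buffer_ms):
    """Sum the main contributors to one-way mouth-to-ear delay, in milliseconds.

    frame_ms:           audio collected before a packet can be sent (packetization)
    codec_lookahead_ms: extra samples the encoder needs before it can emit a frame
    network_ms:         one-way network transit, including any relay detour
    jitter_buffer_ms:   how long the receiver holds packets to smooth out jitter
    """
    return frame_ms + codec_lookahead_ms + network_ms + jitter_buffer_ms


def compare(direct, relayed):
    """Return how many extra milliseconds the relayed path costs over the direct one."""
    return one_way_latency_ms(**relayed) - one_way_latency_ms(**direct)
```

With made-up figures of 20 ms frames, no lookahead, 1 ms transit and a 40 ms buffer on a direct LAN path, versus 40 ms transit and a 60 ms buffer through a distant relay, the totals come to 61 ms and 120 ms. Note that even the direct path is dominated by framing and buffering rather than the wire, which a circuit-switched call never had to pay.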
These days, latency is very low for most calls in North America. In fact, when we calculate MOS (which is a score for your call quality), we don’t even use latency if it’s under 100ms.
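As a rough illustration of ignoring latency under 100 ms, here is a simplified MOS estimate. The R-to-MOS curve is the standard mapping from the ITU-T E-model (G.107); the base R value of 93.2 is the commonly quoted default, and the linear delay penalty (0.05 R per ms above the threshold) is a made-up placeholder, not the formula any particular provider uses.

```python
def r_to_mos(r):
    """Map an E-model R factor to MOS using the ITU-T G.107 conversion curve."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


def delay_penalty(latency_ms, threshold_ms=100.0, per_ms=0.05):
    """Hypothetical penalty: zero below the threshold, linear above it."""
    return max(0.0, latency_ms - threshold_ms) * per_ms


def estimate_mos(latency_ms, loss_penalty=0.0, base_r=93.2):
    """Subtract impairments from an assumed base R, then convert to MOS."""
    r = base_r - delay_penalty(latency_ms) - loss_penalty
    return r_to_mos(r)
```

Under these assumptions a 50 ms call and a 99 ms call score identically, and the score only starts dropping once latency passes the threshold.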
These days, I find more call quality issues are a result of issues on the LAN rather than the Internet. More specifically, I see voice quality degradation caused by routers/firewalls that are not quite able to deal with the volume of traffic being pushed through them.