The Session Initiation Protocol
Development of the Session Initiation Protocol (SIP) began in the mid-1990s, around the same time as the H.323 protocol. As a peer-to-peer technology at the application layer, it relies on SIP-enabled endpoint devices such as IP phones or workstations for most of its call processing and control. It therefore began as a much lighter protocol than its counterpart H.323, which leans heavily on centralized gateways and servers.
At the turn of the century, a rivalry erupted between supporters of the SIP and H.323 protocols for domination of the IP telephony industry. SIP soon became the protocol of choice for instant messaging, and in 2002 it was adopted by the 3rd Generation Partnership Project (3GPP Release 5) as the signaling protocol for cellular networks.
Consumers benefited through early VoIP adopters such as Vonage and Packet8, and as IP telephony matured, the SIP standard was integrated into the larger ISP networks, the IP PBX market, and manufacturers' endpoint devices.
As of late 2007, SIP has become the de facto signaling standard for VoIP and multimedia communications. It has been said that SIP has benefited from being an IETF standard, the IETF being quicker than the ITU to adapt to industry forces. SIP is supported by both standards bodies, as is H.323.
SIP's emergence as the protocol of choice came in no small part from its role in IMS (the IP Multimedia Subsystem) and the consequent development of the concept of presence. While presence is by no means SIP's exclusive territory, the standards committees have worked hard to put SIP at the forefront of presence technology. When people know where a colleague is and whether that colleague is available, and can communicate through a variety of applications and devices, business productivity and mobility soar and the old days of playing phone tag become obsolete.
Because it is a text-based protocol similar to HTTP, SIP integrates well with other Internet applications such as email, instant messaging, and voice/video conferencing and collaboration. It is also modular: participants in different locations who join a SIP-enabled multimedia conference can use whatever capabilities their devices support.
For example, a conference is initiated at headquarters using voice, video, and application sharing. A colleague at a branch office is invited through a SIP-enabled video phone, and the president of the company joins the conference from his Corvette as he travels to meet his girlfriend. The participants at headquarters can see, hear, and collaborate, while the branch manager can see and hear what's going on, and the president can listen to the whole conference on his cell phone. This is a departure from the lowest-common-denominator mentality of other protocols, where all endpoints must have the same capabilities.
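The conference example can be pictured as each endpoint joining only the media streams its device supports. The following is a minimal Python sketch under stated assumptions: the endpoint names, the media labels, and both helper functions are hypothetical, and real SIP endpoints agree on media through the session description carried in SIP messages rather than through a function like this.

```python
# Illustrative only: endpoint names and media labels are hypothetical.
CONFERENCE_MEDIA = {"audio", "video", "app-sharing"}

ENDPOINTS = {
    "headquarters": {"audio", "video", "app-sharing"},
    "branch-video-phone": {"audio", "video"},
    "president-cell-phone": {"audio"},
}


def negotiated_media(offered, supported):
    """Return the offered media streams that an endpoint also supports."""
    return sorted(offered & supported)


def conference_view(offered, endpoints):
    """Map each endpoint name to the subset of offered media it will use."""
    return {name: negotiated_media(offered, caps) for name, caps in endpoints.items()}


if __name__ == "__main__":
    for name, media in conference_view(CONFERENCE_MEDIA, ENDPOINTS).items():
        print(f"{name}: {', '.join(media)}")
```

No endpoint is forced down to the capabilities of the weakest participant; each simply takes the intersection of what is offered and what it can handle.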
The logical elements making up the SIP standard are the User Agent, Back-to-Back User Agent, Proxy Server, Redirect Server, and the Registrar. One big advantage of SIP is that these logical components can coexist with other applications on existing network components, making a costly infrastructure upgrade unnecessary.
User Agent
User Agents are endpoint devices that initiate and terminate sessions through a series of request/response exchanges. The UA is defined by RFC 3261 and consists of a UA client application (UAC) and a UA server application (UAS). The UAC initiates the call, and the device receiving the call acts as the UAS. A single endpoint device, e.g. an IP phone, can serve as both UAC and UAS. In a closed IP environment, endpoint devices can find each other without the help of the other entities.
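Because SIP is text based, the UAC/UAS exchange described above can be sketched with plain string handling. The Python below is a toy illustration, not a conforming implementation: the function names and the example addresses are hypothetical, and headers that a real SIP message requires (Via and Max-Forwards, for instance) are omitted for brevity.

```python
CRLF = "\r\n"


def build_request(method, target_uri, headers):
    """Build a SIP request: request line, headers, then an empty line."""
    lines = [f"{method} {target_uri} SIP/2.0"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return CRLF.join(lines) + CRLF + CRLF


def build_response(status_code, reason, headers):
    """Build a SIP response: status line, headers, then an empty line."""
    lines = [f"SIP/2.0 {status_code} {reason}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return CRLF.join(lines) + CRLF + CRLF


def parse_message(text):
    """Split a SIP message into its start line and a dict of headers."""
    head, _, _body = text.partition(CRLF + CRLF)
    start_line, *header_lines = head.split(CRLF)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")  # split on the first colon only
        headers[name.strip()] = value.strip()
    return start_line, headers


def uas_handle(request_text):
    """Toy UAS: answer INVITE with 200 OK, any other method with 501."""
    start_line, headers = parse_message(request_text)
    method = start_line.split(" ", 1)[0]
    # Echo the headers that tie the response to the request.
    echoed = {k: headers[k] for k in ("To", "From", "Call-ID", "CSeq") if k in headers}
    if method == "INVITE":
        return build_response(200, "OK", echoed)
    return build_response(501, "Not Implemented", echoed)


if __name__ == "__main__":
    invite = build_request(
        "INVITE",
        "sip:bob@example.com",
        {
            "To": "<sip:bob@example.com>",
            "From": "<sip:alice@example.com>",
            "Call-ID": "abc123@alice-pc.example.com",
            "CSeq": "1 INVITE",
        },
    )
    print(invite)
    print(uas_handle(invite))
```

Here the caller's device plays the UAC role by building the INVITE, and the called device plays the UAS role by answering it; the same device would swap roles when it receives a call.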
B2BUA
A Back-to-Back User Agent is an application that acts as an intermediary between two endpoints and is seen as an endpoint by each of the parties involved. It maintains the state of the call and is also responsible for call termination. It can act as a gateway, for example representing an endpoint on the IP side to an endpoint on the PSTN, and vice versa.
Proxy Server
Another intermediary, the proxy server has both UAC and UAS functionality and can process requests by passing them on to other SIP servers. A proxy can translate a request and rewrite the message before passing it on if required. SIP proxy servers perform address translation within the domain by resolving email-style URLs or telephone numbers to IP addresses, and use DNS to find SIP servers outside the domain. SIP deployments use the ENUM standard, which is built on DNS, to map outside telephone numbers to addresses reachable over IP.
Proxy servers are only involved in call setup and termination, leaving call control to the endpoints themselves.
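The ENUM mapping mentioned above starts by turning a telephone number into a DNS name: the digits are reversed, separated by dots, and placed under the e164.arpa domain. The Python sketch below shows only that first step. The function name is hypothetical, the sample number is a fictional one, and the DNS query that would follow is left out.

```python
def enum_domain(e164_number, suffix="e164.arpa"):
    """Turn an E.164 telephone number into the DNS name used for ENUM."""
    # Keep only ASCII digits, dropping "+", spaces, and punctuation.
    digits = [ch for ch in e164_number if ch.isascii() and ch.isdigit()]
    if not digits:
        raise ValueError("number contains no digits")
    # Reverse the digits so the most specific part comes first, as in DNS.
    return ".".join(reversed(digits)) + "." + suffix


if __name__ == "__main__":
    print(enum_domain("+1-202-555-0123"))
```

A proxy would then query DNS for that name to learn where the number can be reached over IP.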
Redirect Server
A Redirect Server maps a request from a client to the closest URL of the party being called, then sends it back to the original requestor. A Redirect Server does not pass requests on to other servers. For example, a client request for a person's workstation IP phone may be redirected to their cell phone address if they are on the road.
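The redirect behavior can be sketched as a lookup that answers the requestor instead of forwarding the request. In this minimal Python illustration the location table, the addresses, and the `redirect` function are all hypothetical; the 302 response carrying a Contact header is the standard way a SIP redirect tells the client where to try next.

```python
# Hypothetical table: the address that was called -> where the user is now.
CURRENT_LOCATION = {
    "sip:carol@example.com": "sip:carol-mobile@cell.example.net",
}


def redirect(request_uri, locations=CURRENT_LOCATION):
    """Return (status_code, reason, headers) for a redirect server's reply.

    The server never forwards the request; it only tells the requestor
    which address to contact instead.
    """
    contact = locations.get(request_uri)
    if contact is None:
        return 404, "Not Found", {}
    return 302, "Moved Temporarily", {"Contact": f"<{contact}>"}


if __name__ == "__main__":
    print(redirect("sip:carol@example.com"))
    print(redirect("sip:nobody@example.com"))
```

The client that receives the 302 reply is responsible for sending a new request to the address in the Contact header.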
Registrar Server
The Registrar, as the name suggests, registers users into a database as they come online. Information indicating their identity and the devices on which they choose to be reached is stored by IP address, phone number, or URL. Users can also contact the Registrar to update their location, i.e. presence, and indicate to the server a number or device they can be reached at.
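A registrar's database can be thought of as a table of bindings from a user's public address to the devices where that user can currently be reached. The Python sketch below is a simplified in-memory model: the class, its method names, and the addresses are hypothetical, and the one-hour default lifetime is an assumption chosen for illustration.

```python
import time


class Registrar:
    """Toy in-memory registrar: user address -> reachable contact addresses."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._bindings = {}  # user address -> {contact: expiry time}

    def register(self, user, contact, expires=3600):
        """Add or refresh a binding; expires <= 0 removes it."""
        if expires <= 0:
            self._bindings.get(user, {}).pop(contact, None)
            return
        self._bindings.setdefault(user, {})[contact] = self._clock() + expires

    def lookup(self, user):
        """Return the contacts for a user whose bindings have not expired."""
        now = self._clock()
        contacts = self._bindings.get(user, {})
        return sorted(c for c, expiry in contacts.items() if expiry > now)


if __name__ == "__main__":
    registrar = Registrar()
    registrar.register("sip:dave@example.com", "sip:dave@192.0.2.10")
    registrar.register("sip:dave@example.com", "sip:dave-mobile@cell.example.net")
    print(registrar.lookup("sip:dave@example.com"))
```

A proxy or redirect server sharing this table could call `lookup` to decide where to send or redirect an incoming request.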
The Registrar and the Redirect components can and often do reside on a SIP Proxy Server, as is the case with Microsoft's Real Time Communications Server.
Now over ten years old, the Session Initiation Protocol is the accepted standard for unified communications and is considered to be in the "polishing" phase of development. SIP enjoys a wide base of acceptance among equipment manufacturers and application developers, and as time goes on, new SIP-enabled devices and applications are expected to broaden that base even further.
Cell phone manufacturers Nokia and Ericsson have embraced the technology, and networking giant Cisco Systems has integrated the protocol into most of its devices, from IP phones to the H.323-based CallManager PBX server. New web applications such as click-to-talk are already coming to market and many more are on the horizon, benefiting such vertical markets as retail, customer service/support call centers, and the health care industry.
One big coup for SIP is Microsoft's upcoming Office Communications Server. Microsoft claims that the product, which is based on the SIP standard, will save businesses millions in hardware costs as the face of communications evolves toward a soft phone application platform.
Detractors will say that after ten years of development, SIP has grown up to be an obese adolescent. Indeed, the many extensions added to address IMS and presence since 2000 seem to indicate a loss of focus, and the 3GPP implementation has morphed it into something resembling a proprietary cellular protocol.
Nevertheless, the implementation of SIP on a wide variety of platforms, along with its interoperability and its ability to integrate with other applications and protocols, indicates that the Session Initiation Protocol will be with us for some time to come.