Today most Internet users are in a private network behind one or several NATs. This is not a problem for your browser (HTTP, FTP...) or email client (SMTP, POP...) because they use protocols that can operate behind NATs. The problem appears when you try to use your SIP phone to REGISTER or to place a call to another softphone in a different private network across the Internet (public network).
The problem with SIP is that the IP addresses and ports at which an agent (e.g. PC, box or mobile phone) can be contacted or answered are embedded in the SIP message itself.
For example, when you REGISTER to a registrar (e.g. Serving-CSCF through Proxy-CSCF) you add a Contact header (IP address and port - Contact: <sip:email@example.com:27208;transport=udp>) telling the registrar where to send all your incoming calls (INVITEs). The registrar binds this contact address to your Address-Of-Record (a.k.a. AOR).
If you are in a private network you will put your private IP address and port in this Contact header, and the problem is that this address is not reachable from the elements outside your private network.
This means that INVITE requests will never reach your agent.
You will have the same problem when you are the caller, as your SDP carries the IP addresses and ports where RTP packets shall be sent to you.
So the NAT problem concerns both the signaling (SIP/SDP) and media (RTP/RTCP) planes.
The « rport » parameter is the simplest way to deal with the NAT problem on the signaling plane (SIP only) for connectionless protocols (e.g. UDP). rport is defined in RFC 3581 and applies to both connectionless (e.g. UDP) and connection-oriented (e.g. TCP, TLS, SCTP) protocols.
Because connection-oriented protocols such as TCP are bidirectional, NAT traversal is easier for them: all responses are sent back over the same connection on which the request was received.
The idea behind rport is that when you add this parameter to your requests (Via: SIP/2.0/UDP 192.168.16.108:27208;branch=z9hG4bK1234;rport), all responses are sent back to the ip:port from which the request was received. This parameter MUST be added to the topmost Via header.
The response will contain a new parameter (« received ») holding the IP address from which the request was received, and the rport value will be filled with the mapped (public) source port. In this way the client/caller (the sender of the request) can learn its public/reflexive IP address and port (Via: SIP/2.0/UDP 192.168.16.108:27208;branch=z9hG4bK1234;rport=1234;received=10.1.1.1).
By comparing these values with the address and port it wrote in the Via header, the client can tell whether it is behind a NAT or not.
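As an illustration of how a client might read « received » and « rport » back from the topmost Via of a response, here is a minimal sketch in C. The helper names (via_param, via_behind_nat) are hypothetical and not part of any SIP stack. The parsing is deliberately simplified: it matches parameter names case-sensitively and assumes a single Via value, which a real parser should not do.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy the value of ";name=value" found in a Via
 * header into out. Returns 1 if the parameter is present with a value. */
int via_param(const char *via, const char *name, char *out, size_t outlen)
{
    assert(via != NULL && name != NULL && out != NULL && outlen > 0);

    size_t nlen = strlen(name);
    const char *p = via;

    while ((p = strchr(p, ';')) != NULL) {
        p++; /* skip the ';' */
        if (strncmp(p, name, nlen) == 0 && p[nlen] == '=') {
            const char *val = p + nlen + 1;
            size_t vlen = strcspn(val, ";, \r\n");
            if (vlen == 0 || vlen >= outlen)
                return 0;
            memcpy(out, val, vlen);
            out[vlen] = '\0';
            return 1;
        }
    }
    return 0;
}

/* Returns 1 if the response Via reports an address or port different
 * from the sent-by we wrote, i.e. a NAT is probably on the path. */
int via_behind_nat(const char *via, const char *local_ip, int local_port)
{
    char received[64];
    char rport[16];

    if (via_param(via, "received", received, sizeof received) &&
        strcmp(received, local_ip) != 0)
        return 1;
    if (via_param(via, "rport", rport, sizeof rport) &&
        atoi(rport) != local_port)
        return 1;
    return 0;
}
```

With the Via from the example above, via_behind_nat(via, "192.168.16.108", 27208) reports a NAT because both the received address and the rport value differ from what the client sent.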
The problem with this solution is that it only deals with SIP messages and cannot be used for RTP/RTCP packets, as the SDP will still contain the wrong IP addresses and ports. A solution to this issue could be using Symmetric RTP/RTCP (see below).
Symmetric RTP / RTP Control Protocol (RTCP)
As we have seen above, we cannot use « rport » to deal with RTP/RTCP packets, which are almost always transported over UDP. A solution to this problem could be using « Symmetric RTP/RTCP » as per RFC 4961. This solution is a bit like using « rport ».
In this case the caller (INVITE originator) creates the request as per RFC 3261 as usual. When the 2xx (with SDP) or 1xx (with SDP) is received, the caller ignores the IP addresses and ports given in the response and sends its RTP/RTCP packets to the IPs/ports from which the peer's RTP/RTCP packets actually arrive (like rport).
- Bob sends an INVITE to Alice
- Alice sends back a 2xx (with SDP) or 1xx (with SDP) to Bob
- Bob waits for first RTP/RTCP packets to come from Alice
- Alice sends her first RTP/RTCP packets from IP-a and Port-a to IP-b and Port-b (Bob's address and port as defined in his SDP).
- Instead of sending RTP/RTCP packets to IP-asdp and Port-asdp as specified in Alice's SDP, Bob begins sending his media stream to IP-a and Port-a.
As you can imagine, this solution only works with some NATs.
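The steps above amount to "latching" onto the source of the first media packet. The following is a minimal sketch of that bookkeeping in C, seen from Bob's side. The struct and function names are hypothetical, the addresses are kept as dotted IPv4 strings for readability, and no socket code is shown.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical transport address: dotted IPv4 string plus UDP port. */
struct rtp_addr {
    char     ip[16];
    uint16_t port;
};

/* Hypothetical per-stream state kept by the caller (Bob). */
struct rtp_peer {
    struct rtp_addr sdp;     /* IP-asdp:Port-asdp, kept for reference only */
    struct rtp_addr latched; /* IP-a:Port-a, source of the first packet    */
    int             has_latched;
};

void rtp_peer_init(struct rtp_peer *peer, const char *sdp_ip, uint16_t sdp_port)
{
    assert(peer != NULL && sdp_ip != NULL);
    assert(strlen(sdp_ip) < sizeof peer->sdp.ip);
    memset(peer, 0, sizeof *peer);
    strcpy(peer->sdp.ip, sdp_ip);
    peer->sdp.port = sdp_port;
}

/* Call for every incoming RTP/RTCP packet with its observed source.
 * Only the first packet sets the destination. */
void rtp_peer_on_packet(struct rtp_peer *peer, const char *src_ip, uint16_t src_port)
{
    assert(peer != NULL && src_ip != NULL);
    assert(strlen(src_ip) < sizeof peer->latched.ip);
    if (!peer->has_latched) {
        strcpy(peer->latched.ip, src_ip);
        peer->latched.port = src_port;
        peer->has_latched = 1;
    }
}

/* Destination for outgoing media, or NULL while Bob is still waiting
 * for the first packet from Alice. */
const struct rtp_addr *rtp_peer_dest(const struct rtp_peer *peer)
{
    assert(peer != NULL);
    return peer->has_latched ? &peer->latched : NULL;
}
```

Latching on the very first packet is the simplest policy; it is an assumption of this sketch, not something the steps above mandate.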
Session border controller
As its name says, it controls sessions (SIP calls), both media and SIP packets, and sits at the border between two networks.
In our case (NAT traversal) the role of the SBC is to inspect/control all outgoing and incoming SIP/RTP/RTCP packets. All SIP packets are inspected and the IP addresses and ports within each packet are rewritten (e.g. from private IP to public IP).
This solution has several problems:
- Very expensive (€€$$££)
- As the SBC operates on the SIP packets, it MUST be aware of all headers (or functions) in order to know what should be changed and what should not. This means that the SBC MUST always be up to date in order to handle SIP packets efficiently
- Most SBCs can only handle well-known protocols such as SIP, RTP or RTCP and will drop all unknown protocols
- There are also many problems when end-to-end encryption (e.g. TLS, SRTP or IPSec) is used, unless the SBC has the key (which is a bad idea)
- As the SBC is used to relay all RTP/RTCP packets, it introduces additional delay (bad sound quality)
- When an unreliable transport is used, relaying packets could also increase packet loss (QoS problem)
In the 3GPP IMS context the SBC is in most cases bundled with the Proxy-CSCF (P-CSCF plus IMS-ALG), and this could resolve the security issue (SIP-IPSec).
STUN was first defined in RFC 3489, which has been obsoleted by RFC 5389. STUN stands for « Session Traversal Utilities for NAT » and is a client-server protocol (request <-> response) like SIP.
There are also « indications » which, unlike « binding requests », don't generate responses.
Both unreliable (e.g. UDP) and reliable (e.g. TCP or TLS) transports are supported.
Like SIP, when an unreliable transport is used there is the notion of transactions and retransmissions.
When STUN is used, the client learns its public IP address and port (a.k.a. reflexive transport address) by sending a binding request to the STUN server (default UDP/TCP port: 3478 and default TLS port: 5349). In some cases the server will challenge (401) the client, which should then resend its request with its credentials (a mechanism similar to HTTP digest authentication).
If the request is successfully authenticated by the server, a success binding response is sent back to the client. This response contains a STUN attribute (XOR-MAPPED-ADDRESS) with the client's public IP address, family and port. This is also called the « reflexive transport address ».
Once this reflexive transport address is known, the client can for example begin REGISTERing using this address and port in its Contact header.
====== Step 1: Sending binding request ======
In this request I send my STUN request from "192.168.16.108:1115" to the server in order to get the public IP address and port associated to this local socket/Private address (file descriptor).
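To make the wire format concrete, here is a minimal sketch in C that builds the 20-byte header of an attribute-less Binding request as laid out in RFC 5389 (message type, message length, magic cookie, 96-bit transaction ID). The function name is hypothetical. The caller is assumed to supply a randomly generated transaction ID, and sending the buffer over the socket is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STUN_HEADER_SIZE     20
#define STUN_TID_SIZE        12
#define STUN_BINDING_REQUEST 0x0001u
#define STUN_MAGIC_COOKIE    0x2112A442u

/* Writes a Binding request without attributes into buf, in network
 * byte order. Returns the number of bytes written, or 0 if buf is
 * too small. */
size_t stun_build_binding_request(uint8_t *buf, size_t buflen,
                                  const uint8_t tid[STUN_TID_SIZE])
{
    assert(buf != NULL && tid != NULL);
    if (buflen < STUN_HEADER_SIZE)
        return 0;

    /* Message type: Binding request. */
    buf[0] = (uint8_t)(STUN_BINDING_REQUEST >> 8);
    buf[1] = (uint8_t)(STUN_BINDING_REQUEST & 0xFF);
    /* Message length: size of the attributes only, here zero. */
    buf[2] = 0;
    buf[3] = 0;
    /* Magic cookie. */
    buf[4] = (uint8_t)(STUN_MAGIC_COOKIE >> 24);
    buf[5] = (uint8_t)((STUN_MAGIC_COOKIE >> 16) & 0xFF);
    buf[6] = (uint8_t)((STUN_MAGIC_COOKIE >> 8) & 0xFF);
    buf[7] = (uint8_t)(STUN_MAGIC_COOKIE & 0xFF);
    /* Transaction ID, echoed back by the server in its response. */
    memcpy(buf + 8, tid, STUN_TID_SIZE);

    return STUN_HEADER_SIZE;
}
```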
====== Step 2 Success binding response ======
In step 2 the STUN client receives a response from the server with its public IP address and port. To match the response with the request we compare the transaction IDs.
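The transaction ID comparison can be sketched as follows in C. The function name is hypothetical. It only inspects the fixed 20-byte header (Binding success response type 0x0101, magic cookie, transaction ID) and does not validate the message length or any attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if msg is a Binding success response whose transaction ID
 * equals the one used in our request, 0 otherwise. */
int stun_is_binding_success_for(const uint8_t *msg, size_t len,
                                const uint8_t tid[12])
{
    static const uint8_t cookie[4] = { 0x21, 0x12, 0xA4, 0x42 };

    assert(msg != NULL && tid != NULL);
    if (len < 20)
        return 0;                     /* shorter than a STUN header */
    if (msg[0] != 0x01 || msg[1] != 0x01)
        return 0;                     /* not a Binding success response */
    if (memcmp(msg + 4, cookie, sizeof cookie) != 0)
        return 0;                     /* magic cookie missing */
    return memcmp(msg + 8, tid, 12) == 0;
}
```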
As some routers rewrite the content of the packets the IP address and port are not sent "as is" but in XOR format (into the XOR-MAPPED-ADDRESS attribute).
To retrieve the port:
uint16_t port = ntohs(*((uint16_t*)payloadPtr)); /* payloadPtr points at the X-Port field */
port ^= 0x2112; /* First two bytes of the STUN2 magic cookie. */
To retrieve the IPv4 address:
uint32_t addr = ntohl(*((uint32_t*)payloadPtr)); /* payloadPtr now points at the X-Address field */
addr ^= 0x2112A442; /* The STUN2 magic cookie */
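Putting both snippets together, a self-contained decoder for the IPv4 form of the attribute value could look like the sketch below. The function name is hypothetical. It reads the bytes one by one instead of casting the pointer, which avoids alignment issues and the need for ntohs/ntohl, and it assumes value points at the start of the attribute value (reserved byte, family, X-Port, X-Address).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STUN_MAGIC_COOKIE 0x2112A442u
#define STUN_FAMILY_IPV4  0x01

/* Decodes an IPv4 XOR-MAPPED-ADDRESS value. On success returns 1 and
 * stores the address and port in host byte order. */
int stun_decode_xor_mapped_ipv4(const uint8_t *value, size_t len,
                                uint32_t *addr, uint16_t *port)
{
    assert(value != NULL && addr != NULL && port != NULL);
    if (len < 8 || value[1] != STUN_FAMILY_IPV4)
        return 0;

    /* Fields are big-endian on the wire. */
    uint16_t xport = (uint16_t)((value[2] << 8) | value[3]);
    uint32_t xaddr = ((uint32_t)value[4] << 24) | ((uint32_t)value[5] << 16) |
                     ((uint32_t)value[6] << 8)  |  (uint32_t)value[7];

    /* Port is XORed with the 16 most significant bits of the cookie,
     * the address with the whole cookie. */
    *port = (uint16_t)(xport ^ (STUN_MAGIC_COOKIE >> 16));
    *addr = xaddr ^ STUN_MAGIC_COOKIE;
    return 1;
}
```

For the value bytes 00 01 A1 47 E1 12 A6 43 this yields port 32853 and address 0xC0000201, i.e. 192.0.2.1.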
====== Step 3 Sending first SIP request ======
From steps 1 and 2 the SIP agent can assert that [192.168.16.108:1159] is mapped to [188.8.131.52:1115] (pay attention to the port mapping).
In step 3, when sending its first REGISTER request, it will use this public IP address and port to build its Contact address. As the rport option is used, you could keep the Via IP address and port unchanged (or not).
The major problem with STUN is that it cannot be used behind symmetric NATs. In the next parts I will explain how to overcome this problem by using TURN and ICE.