rtcweb P. Patel
Internet-Draft C. Jennings
Intended status: Informational S. Nandakumar
Expires: April 20, 2016 J. Rosenberg
D. Wing
October 18, 2015

Firewall Traversal for WebRTC


Traversal of RTP through corporate firewalls has traditionally been complex, requiring the deployment of Session Border Controllers (SBCs) or wide open pinholes. This draft proposes a simple technique that allows WebRTC based RTP traffic to traverse firewalls without complex firewall configuration and without deployment of SBCs or other middleboxes.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on April 20, 2016.

Copyright Notice

Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

1. Problem Statement

WebRTC [I-D.ietf-rtcweb-overview] based voice and video communications systems are becoming far more common inside enterprises, which often need voice and video media to traverse the enterprise firewall. This can happen when a device inside the firewall such as a web browser or phone is exchanging media with a conference bridge or gateway outside the firewall, or it can happen when a device inside the firewall is talking to a device in another enterprise or behind a different firewall.

This problem is not unique to WebRTC media of course. It is common practice for enterprise administrators to block outbound UDP through the corporate firewall. This is done for several reasons:

  1. The lack of any kind of return messages means that there is no way to know that the recipient of the UDP traffic really wants it. Infected computers within the enterprise could utilize UDP as the source of a DDoS attack. If the firewall permitted such outbound traffic, the enterprise could in effect be a contributing source to such an attack. By blocking UDP, the enterprise IT admin ensures that this cannot happen - at least not to external targets.
  2. There have been prior attacks that have utilized UDP as a command and control channel for orchestrating DDoS attacks. At the time, UDP had little usage within enterprises (most VoIP was internal to the enterprise when it existed at all). Consequently, infosec departments have deemed it safer to block UDP outright in order to prevent such further incidents.
  3. Many IT administrators enable various packet inspection operations on traffic flowing through the firewall. High volume UDP traffic - such as voice or video - can be costly to inspect. As such, in cases where there is a need for traversal of such traffic, IT has preferred to deploy an SBC that, in essence, verifies that the traffic is VoIP and authorizes its egress. The IT administrator then enables traffic to/from the SBC through the firewall. In other words, VoIP authorization is delegated to an outsourced SBC.

As more and more IP communications services move to the cloud, there is an increased need for VoIP traffic to traverse the enterprise firewall. At the same time, the entire point of a cloud service is that it does not require the deployment of on premises infrastructure, making SBC-based solutions less desirable. An alternative solution that has been historically used is to enable outbound UDP in the firewall to specific IP addresses, corresponding to the external service (TURN servers or conference servers) that the enterprise wishes to authorize. With more applications running on virtual machines within cloud compute platforms like Amazon EC2, IP addresses are decreasingly usable as identifiers for a service. VMs running TURN servers or conferencing servers may be established and torn down by the day, hour or even minute, with continuously changing IP addresses. Given the multitenant nature of such providers, IT departments are unwilling to whitelist the IP addresses for the entire block used by such providers.

Consequently, there is a growing need for solutions that allow VoIP traversal through the corporate firewall that alleviate the concerns above. This issue is further exacerbated by the growing adoption of WebRTC by enterprise applications, which provide a ready source of RTP traffic which often needs to traverse the firewall.

2. Solution Requirements

We believe the solution must meet the following requirements:

REQ-1: The solution must enable traversal of real-time media without requiring deployment of additional media intermediaries on premise (e.g., no SBC required)

REQ-2: The solution must not require the whitelisting of specific external IP addresses

REQ-3: The solution must enable the enterprise to be sure that the receiving party of the traffic desires the traffic

REQ-4: The solution must work with P2P calls between users in different enterprises without requiring a TURN server

REQ-5: The solution must work with cloud services external to the enterprise which terminate media on servers, such as conference servers, voicemail servers, and so on.

REQ-6: The solution must not require decryption of either signaling or media traffic at the firewall or at any other intermediary

REQ-7: The solution must allow the IT department to easily make policy decisions about which applications are allowed, or not allowed, to traverse the firewall

REQ-8: The solution must not require inspection of every single packet that traverses the firewall

REQ-9: The solution must provide a minimum level of proof that the traffic is WebRTC media or data and not something else

REQ-10: The solution must work with WebRTC traffic. Note that solving this for non-WebRTC is a non-requirement.

3. Solution Overview

Many of the reasons for blocking UDP at the corporate firewall have their origins in the lack of a three-way handshake for UDP traffic. TCP’s three-way handshake ensures that the receiving party of the connection desires the traffic. Similarly, HTTP traffic easily traverses the firewall since it provides application identification information in the URL.

Consequently, the solution proposed here relies on the ICE connectivity checks, which provide a similar handshake and ensure consent of the remote party.

The firewall looks for an initial STUN transaction to a STUN server to learn which application is using the port (based on the STUN HOST attribute; see Section 6). Next, the firewall watches the outbound ICE connectivity check on that port and allows inbound ICE connectivity checks that are going to the same location that sent the outbound request and that have the correct random ufrag value that was created by the client inside the firewall. After a successful ICE connectivity check, the firewall allows other media to flow on the same 5-tuple that had the successful ICE connectivity check. Timers are used to remove the various pinholes created.

In addition, the initial outbound STUN packets can contain the STUN HOST attribute which the firewall can use to make an authorization decision on the application.

The end result is a system where:

4. Firewall Processing

The firewall processing is broken into four stages: recognizing STUN packets, mapping to an application, making a policy decision as to whether each STUN packet should trigger a pinhole to be created, and managing the lifetime of any pinholes that are created.

4.1. Terminology

The key words defined in [RFC2119] are used in this specification.

The term 3-tuple is used to refer to IP address, protocol (which is always UDP), and port that the firewall sees as the address of the client inside the firewall.

The term 4-tuple is used to refer to the 3-tuple plus the ICE ufrag that was sent in the STUN request message by the client inside the firewall.

The term 5-tuple is used to refer to the 3-tuple plus the IP address and port of the device outside the firewall.

When matching a ufrag, if it is a STUN request that came from outside the firewall, the two halves of the username on either side of the “:” need to be swapped before matching.
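The swap described above can be sketched as follows. This is a minimal illustration, not part of the proposal itself; the function name normalize_username and the from_outside flag are assumptions introduced here.

```python
def normalize_username(username: str, from_outside: bool) -> str:
    """Return a STUN USERNAME in the orientation used by inside clients.

    Requests that arrive from outside the firewall carry the two ufrag
    halves in the opposite order, so they are swapped before matching.
    """
    if not from_outside:
        return username
    left, sep, right = username.partition(":")
    if not sep:
        # No ":" present, so this is not an ICE-style username; leave it alone.
        return username
    return f"{right}:{left}"
```

With this helper, a firewall could compare normalize_username(u, from_outside=True) for an inbound request against the username it stored from the earlier outbound request.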

4.2. Recognizing STUN packets

STUN messages all have a magic cookie value of 0x2112A442 in the 5th through 8th bytes (byte offsets 4 to 7) of the message. This can be used to quickly filter out nearly all UDP packets that are not STUN packets. Many firewalls are capable of doing this in hardware. STUN supports an optional FINGERPRINT attribute that provides a 32 bit CRC over the message.
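A software version of this filter might look like the sketch below. It checks the magic cookie together with two other cheap header properties from [RFC5389] (the two most significant bits are zero and the length is a multiple of four). The function name looks_like_stun is an assumption made for illustration, and FINGERPRINT validation is deliberately left out.

```python
import struct

STUN_MAGIC_COOKIE = 0x2112A442
STUN_HEADER_LEN = 20  # type (2) + length (2) + cookie (4) + transaction ID (12)


def looks_like_stun(payload: bytes) -> bool:
    """Cheap header-only test for whether a UDP payload could be STUN."""
    if len(payload) < STUN_HEADER_LEN:
        return False
    msg_type, msg_len, cookie = struct.unpack_from("!HHI", payload, 0)
    if msg_type & 0xC000:
        return False  # the two most significant bits of a STUN message are zero
    if msg_len % 4 != 0 or STUN_HEADER_LEN + msg_len > len(payload):
        return False  # attributes are padded to 4 bytes and must fit the packet
    return cookie == STUN_MAGIC_COOKIE
```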

Option A: Firewalls SHOULD look at outbound UDP packets and if they have the correct magic cookie they can classify them as STUN packets.

Option B: The firewall looks for any outgoing STUN requests to the STUN port (3478). When it finds one, it stores the 3-tuple (source address, source port, and protocol=UDP) and, for the next 30 seconds, checks any packets from this 3-tuple to see if they are ICE connectivity checks.
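Option B amounts to a small timed watch list keyed by 3-tuple. The sketch below is one possible shape for it; the class name StunWatchList, its method names, and the choice to restart the 30 second window on each new STUN request are all assumptions, not something this document specifies.

```python
from typing import Dict, Tuple

ThreeTuple = Tuple[str, str, int]  # (inside IP address, protocol, inside port)
STUN_PORT = 3478
WATCH_SECONDS = 30.0


class StunWatchList:
    """Tracks 3-tuples whose packets should be checked for ICE activity."""

    def __init__(self) -> None:
        self._expires: Dict[ThreeTuple, float] = {}

    def note_outbound(self, src_ip: str, src_port: int, dst_port: int,
                      is_stun_request: bool, now: float) -> None:
        # Only requests aimed at the STUN port start (or restart) a watch.
        if is_stun_request and dst_port == STUN_PORT:
            self._expires[(src_ip, "UDP", src_port)] = now + WATCH_SECONDS

    def should_inspect(self, src_ip: str, src_port: int, now: float) -> bool:
        key = (src_ip, "UDP", src_port)
        deadline = self._expires.get(key)
        if deadline is None:
            return False
        if now >= deadline:
            del self._expires[key]  # watch window has elapsed
            return False
        return True
```

Time is passed in explicitly (for example from time.monotonic()) so the logic can be exercised without waiting.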

Open Issues:

4.3. Application Mapping

The STUN HOST attribute (Section 6) carries the fully qualified domain name of the STUN server that is being contacted in the STUN requests. So for example, if a browser was on a page such as example.com and that page used the WebRTC calls to set up a connection to stun.example.com, the STUN request’s HOST attribute would have the value stun.example.com. Similarly, when contacting a STUN or TURN server over TLS or DTLS, the TLS SNI [RFC6066] value provides the name of the host. For systems that provide a unique STUN server name for each application, this allows the firewall to map the STUN port to the application using it and to use that for logging and making any policy decisions.

Once the firewall receives a STUN packet from the inside to the outside on a new 3-tuple, it MUST create an internal record to track any additional traffic on this 3-tuple. If the STUN packet contains a HOST attribute, then the value it contains is saved in this record and referred to as the application name. The firewall might wish to put the application name in the log files for this 3-tuple.
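Extracting the application name means walking the STUN attribute list, which uses the type-length-value layout with 4-byte padding from [RFC5389]. The sketch below shows one way to do that. No code point has been assigned for HOST, so HOST_ATTR_TYPE_PLACEHOLDER is a made-up value from the comprehension-optional range used purely for illustration; the function names are likewise assumptions.

```python
import struct
from typing import Optional

HOST_ATTR_TYPE_PLACEHOLDER = 0xC0FF  # hypothetical; no value has been assigned
STUN_HEADER_LEN = 20


def find_attribute(message: bytes, attr_type: int) -> Optional[bytes]:
    """Return the value of the first attribute of the given type, if any."""
    if len(message) < STUN_HEADER_LEN:
        return None
    declared_len = struct.unpack_from("!H", message, 2)[0]
    end = min(len(message), STUN_HEADER_LEN + declared_len)
    offset = STUN_HEADER_LEN
    while offset + 4 <= end:
        a_type, a_len = struct.unpack_from("!HH", message, offset)
        value_start = offset + 4
        if value_start + a_len > end:
            return None  # truncated attribute
        if a_type == attr_type:
            return message[value_start:value_start + a_len]
        # Skip the value plus padding up to the next 4-byte boundary.
        offset = value_start + a_len + (-a_len % 4)
    return None


def application_name(message: bytes) -> Optional[str]:
    """Decode the HOST attribute as UTF-8, or return None if absent or invalid."""
    raw = find_attribute(message, HOST_ATTR_TYPE_PLACEHOLDER)
    if raw is None:
        return None
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None
```

The returned string is what a firewall would store in its per-3-tuple record and write to its logs.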

It is important to realize that any application inside the firewall can lie about the value of the HOST attribute. However, a web browser that is trusted will not allow the JavaScript running in the web browser to lie about the value of the HOST attribute.

4.4. Policy decision

Once the firewall has received a STUN packet from inside the firewall, it needs to decide if the packet is acceptable. For most situations the firewall SHOULD accept all outbound STUN packets. This is similar to allowing all outbound TCP flows. Some firewalls may choose to look at other factors including the outside UDP port and the application name for this 3-tuple.

In general, WebRTC media can be sent on a wide range of UDP ports, but the two ports that are commonly used are the RTP port (5004) and the TURN port (3478). Some firewalls MAY choose to only allow flows where the destination port on the outside of the firewall is one of these.

Some firewalls MAY decide to whitelist or blacklist connections based on the application name.
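The policy choices in this section can be combined into a single decision function. The sketch below is illustrative only: the function name accept_outbound_stun, its parameters, and the convention that list entries are lowercase are assumptions. With no optional arguments it accepts every outbound STUN packet, matching the default recommended above.

```python
from typing import Collection, Optional

DEFAULT_MEDIA_PORTS = frozenset({5004, 3478})  # RTP and TURN ports named above


def accept_outbound_stun(dst_port: int,
                         app_name: Optional[str],
                         allowed_ports: Optional[Collection[int]] = None,
                         blocked_apps: Collection[str] = (),
                         allowed_apps: Optional[Collection[str]] = None) -> bool:
    """Decide whether an outbound STUN packet may trigger a pinhole."""
    # Optional restriction on the outside destination port.
    if allowed_ports is not None and dst_port not in allowed_ports:
        return False
    # Domain names compare case-insensitively; lists are assumed lowercase.
    name = app_name.lower() if app_name else None
    if name is not None and name in blocked_apps:
        return False
    # With a whitelist in force, packets with no application name are refused.
    if allowed_apps is not None and name not in allowed_apps:
        return False
    return True
```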

4.5. Creating the pinhole rules

Once a STUN packet is accepted, the firewall MUST create a temporary rule that causes the firewall to allow any inbound or outbound ICE messages on this 4-tuple. This pinhole MUST be valid for at least 5 seconds from the time of creation.

The firewall keeps track of the STUN transaction ID for all STUN request messages that traverse the 4-tuple, along with the 5-tuple they were sent on and their direction (inbound or outbound). If the firewall sees a STUN Binding success response with a matching transaction ID on the same 5-tuple but in the opposite direction from the STUN request, then a valid ICE connectivity check has happened. The firewall MUST then create a pinhole for this 5-tuple that allows any UDP traffic to flow across that 5-tuple. This pinhole MUST be valid for at least 30 seconds from the time of creation.

The firewall continues watching ICE connectivity checks across this 5-tuple as described in the previous paragraph, and any time a valid ICE connectivity check happens, the lifetime of the pinhole is extended by 30 seconds. The procedures in [I-D.ietf-rtcweb-stun-consent-freshness] will ensure that an ICE connectivity check is done more often than every 30 seconds. This is designed to work with BEHAVE-compliant NATs and firewalls as specified in [RFC4787].
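The transaction matching and lifetime extension described in this section can be sketched as a small state table. The class and method names below are assumptions for illustration, and the sketch omits two things a real firewall would need: the short-lived 4-tuple rule for ICE messages and the aging out of requests that never receive a response.

```python
from typing import Dict, Tuple

# (inside IP, inside port, outside IP, outside port); the protocol is always UDP.
FiveTuple = Tuple[str, int, str, int]
PINHOLE_SECONDS = 30.0


class PinholeTable:
    """Opens and refreshes 5-tuple pinholes on valid ICE connectivity checks."""

    def __init__(self) -> None:
        self._pending: Dict[Tuple[FiveTuple, bytes], str] = {}
        self._open_until: Dict[FiveTuple, float] = {}

    def on_request(self, flow: FiveTuple, transaction_id: bytes,
                   direction: str) -> None:
        # Remember which way the request travelled ("inbound" or "outbound").
        self._pending[(flow, transaction_id)] = direction

    def on_success_response(self, flow: FiveTuple, transaction_id: bytes,
                            direction: str, now: float) -> bool:
        request_direction = self._pending.get((flow, transaction_id))
        # The response must match a stored request and travel the other way.
        if request_direction is None or request_direction == direction:
            return False
        del self._pending[(flow, transaction_id)]
        self._open_until[flow] = now + PINHOLE_SECONDS  # create or extend
        return True

    def allows(self, flow: FiveTuple, now: float) -> bool:
        deadline = self._open_until.get(flow)
        if deadline is None:
            return False
        if now >= deadline:
            del self._open_until[flow]  # timer expired; remove the pinhole
            return False
        return True
```

Each later connectivity check on the same 5-tuple passes through on_success_response again, which pushes the expiry another 30 seconds into the future.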

4.6. Media vs Data Statistics

WebRTC can send audio and video as well as carry a data channel. Confidential data could leave an enterprise by a video camera being pointed at a document, but IT departments are often more concerned about the data channel. It is easy for the firewall to separately track the amount of RTP media and non-media data for each WebRTC flow. If the first byte of the UDP message is 23, it is non-media data; if it is in the range 128 to 191 (inclusive), it is audio or video data. More information about this can be found in [I-D.ietf-avtcore-rfc5764-mux-fixes]. Network management systems on the firewall can track these two separately, which can help identify unusual usage.
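The first-byte test lends itself to a very small classifier. In the sketch below the media range is taken as 128 to 191 inclusive, which is the RTP/RTCP range used by the DTLS-SRTP demultiplexing rules; the function names and the "media", "data", and "other" labels are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable


def classify_first_byte(payload: bytes) -> str:
    """Label a UDP payload as "media", "data", or "other" by its first byte."""
    if not payload:
        return "other"
    first = payload[0]
    if 128 <= first <= 191:
        return "media"  # RTP or RTCP
    if first == 23:
        return "data"   # DTLS application data, which carries the data channel
    return "other"      # STUN, DTLS handshake, and anything else


def tally_bytes(payloads: Iterable[bytes]) -> Counter:
    """Sum payload sizes per class for one flow."""
    totals: Counter = Counter()
    for payload in payloads:
        totals[classify_first_byte(payload)] += len(payload)
    return totals
```

A management system could keep one such counter per 5-tuple and flag flows whose "data" total is unusually large compared with their "media" total.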

5. WebRTC Browsers

For this to work correctly, this specification would require browsers to include the FINGERPRINT and HOST attributes in STUN requests sent to a STUN server used to gather candidates. Note that they MUST NOT be included in STUN requests sent peer to peer or sent to ensure media consent.

Open Issue: how much randomness for ICE ufrag

Open Issue: Does adding the HOST reduce user privacy?

Open Issue: Would only including HOST when it matched the HTTP ORIGIN improve privacy?

6. STUN HOST attribute

This specification defines a new STUN attribute called HOST and uses the syntax defined in Section 15 of [RFC5389]. This attribute is of type comprehension-optional. The value of the HOST attribute is a variable length value. It MUST contain a UTF-8 [RFC3629] encoded sequence of characters.

The HOST attribute identifies the fully qualified domain name of the application provider that is serving the WebRTC application and also operating the STUN server. The WebRTC EndPoint MUST include this attribute as part of the ICE candidate gathering phase and there MUST be only one HOST attribute in a given STUN binding request.
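On the sending side, the attribute is an ordinary STUN type-length-value element: a 16-bit type, a 16-bit length that counts the value before padding, the UTF-8 value, and zero padding to a 4-byte boundary. The sketch below encodes one. Because IANA has not assigned a code point, HOST_ATTR_TYPE_PLACEHOLDER is a made-up value from the comprehension-optional range, and the function name is also an assumption.

```python
import struct

HOST_ATTR_TYPE_PLACEHOLDER = 0xC0FF  # hypothetical; no value has been assigned


def encode_host_attribute(fqdn: str,
                          attr_type: int = HOST_ATTR_TYPE_PLACEHOLDER) -> bytes:
    """Encode a HOST attribute as a STUN TLV with 4-byte padding."""
    value = fqdn.encode("utf-8")
    if len(value) > 0xFFFF:
        raise ValueError("value too long for a STUN attribute")
    padding = b"\x00" * (-len(value) % 4)
    # The length field covers the value only, not the padding.
    return struct.pack("!HH", attr_type, len(value)) + value + padding
```

The resulting bytes would be appended to the attribute section of a Binding request, with the message length in the STUN header increased by the same amount.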

7. Deployment Advice

7.1. WebRTC Servers

It is suggested that WebRTC media servers and TURN servers with public IP address(es) that can receive incoming packets from anywhere on the Internet listen for UDP on port 5004 for RTP media servers and port 3478 for TURN servers. UDP destined for port 53 or 123 is often allowed by firewalls that otherwise block UDP.

7.2. Firewall Admins

Often the approach has been to lock down everything, so that all UDP is blocked. This simply causes applications to do things like embed the data in normal looking HTTP or HTTPS requests. Malware and viruses use similar approaches. Just turning off all UDP results in a poor user experience some of the time, which results in users moving to applications and devices outside the firewall. The IT department loses the visibility into what is going on and can no longer protect its users when their computers become compromised. Allowing things that users want to use to work and monitoring them to detect when things have gone wrong is very valuable.

8. Design Consideration

8.1. Why not just use TCP?


9. IANA Considerations

IANA is requested to add the HOST attribute to the STUN attribute registry. The value for HOST is to be allocated from the expert review comprehension-optional range (0xC000 - 0xFFFF).

Value Name Reference

This specification defines the HOST attribute in Section 6.

10. Security Considerations

Enterprises have a range of concerns around WebRTC traffic traversal of the firewall. The major concerns that are raised include:

  1. Unlike TCP, UDP does not have a connection where a device inside the firewall has confirmed that it wants to talk to the thing outside.
  2. Incoming UDP pinholes allow out of band packets to be spoofed into connecting as there is no equivalent of a TCP sequence number to check.
  3. UDP has been used by malware command and control protocols so we block it.
  4. We do not want to enable ways for data to be exfiltrated outside the firewall with no monitoring.
  5. An encrypted data channel in WebRTC can be used to bring malware into the company.
  6. An encrypted media or data channel in WebRTC can be used as a command and control channel for malware inside the firewall.
  7. An encrypted data channel in WebRTC can be used by an outside attacker to exfiltrate private files from inside the firewall.

TODO - Describe to what degree these are addressed. Be clear about attacks due to JavaScript inside the firewall and attacks due to executables inside the firewall.

10.1. Privacy

Unlike the previous version of this draft, we think that using HOST instead of ORIGIN minimizes any privacy concerns. The HOST is already known to the operator of the STUN server, as they run it. It often contains exactly the same information as the existing STUN REALM attribute. It has roughly the same information as the TLS SNI [RFC6066] when STUN is run over DTLS.

11. Alternate Approaches

11.1. SDN Control of Firewall

An alternative way of solving this problem is for the web application running in the browser to inform the web site what ports and IP addresses it is using. The web site then contacts the appropriate SDN controller and requests that the SDN controller tell the appropriate firewall what pinholes to open. This can be made to work in some deployments but not all, as it is often not clear how to find the correct SDN controller or how to set up a relationship such that the SDN controller trusts a web site outside the firewall enough to let it tell the controller to open holes in the firewall.

SDN-based approaches should be pursued alongside this approach, as they complement each other.

11.2. Anycast Whitelist

Deploying media or TURN servers on a single anycast IP address also makes it easier for firewall administrators to whitelist the address. Concerns have been raised that two packets sent from the same host to a given anycast address may get delivered to different servers. This is certainly possible in theory, but in practice it does not seem to happen in the limited experiments done so far.

12. Acknowledgements

Many thanks for review from Shaun Cooley, Teh Cheng, and Alissa Cooper.

The definition of the HOST STUN attribute was motivated by discussion around the draft-ietf-tram-stun-origin document, and we want to thank Alan Johnston, Justin Uberti, John Yoakum, and Kundan Singh.

13. References

13.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997.
[RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO 10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November 2003.
[RFC5389] Rosenberg, J., Mahy, R., Matthews, P. and D. Wing, "Session Traversal Utilities for NAT (STUN)", RFC 5389, DOI 10.17487/RFC5389, October 2008.

13.2. Informative References

[I-D.ietf-avtcore-rfc5764-mux-fixes] Petit-Huguenin, M. and G. Salgueiro, "Multiplexing Scheme Updates for Secure Real-time Transport Protocol (SRTP) Extension for Datagram Transport Layer Security (DTLS)", Internet-Draft draft-ietf-avtcore-rfc5764-mux-fixes-02, March 2015.
[I-D.ietf-rtcweb-overview] Alvestrand, H., "Overview: Real Time Protocols for Browser-based Applications", Internet-Draft draft-ietf-rtcweb-overview-14, June 2015.
[I-D.ietf-rtcweb-stun-consent-freshness] Perumal, M., Wing, D., R, R., Reddy, T. and M. Thomson, "STUN Usage for Consent Freshness", Internet-Draft draft-ietf-rtcweb-stun-consent-freshness-16, August 2015.
[RFC4787] Audet, F. and C. Jennings, "Network Address Translation (NAT) Behavioral Requirements for Unicast UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787, January 2007.
[RFC6066] Eastlake 3rd, D., "Transport Layer Security (TLS) Extensions: Extension Definitions", RFC 6066, DOI 10.17487/RFC6066, January 2011.

Authors' Addresses

Pradeep Patel Cisco EMail: pradpate@cisco.com
Cullen Jennings Cisco EMail: fluffy@iii.ca
Suhas Nandakumar Cisco EMail: snandaku@cisco.com
Jonathan Rosenberg Cisco EMail: jdrosen@cisco.com
Dan Wing Cisco EMail: dwing@cisco.com