Internet Engineering Task Force                                MidCom WG
Internet Draft                                                P. Cordell
draft-cordell-midcom-span-discuss-00.txt     Ridgeway Systems & Software
29 August, 2002                                                       
Expires: 29 February, 2003

   
                       SPAN Discussion Issues


STATUS OF THIS MEMO

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as work in progress.

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   This document collects points of discussion surrounding the
   pre-midcom SPAN deliverable (SPAN = Simple Protocol for Augmenting
   NATs).  As far as possible it is intended to act as a collation point
   for facts surrounding the SPAN deliverable.  Where a discussion item
   is not black and white it attempts to collate opinion from all angles
   as far as the author is able to without bias.  It does not draw
   conclusions of any sort.  

1. Introduction

   This document collects points of discussion surrounding the
   pre-midcom SPAN deliverable.  As far as possible it is intended to
   act as a collation point for facts surrounding the SPAN deliverable.


Cordell                                                          [Page 1]
Internet Draft              SPAN Discussion                  August 2002


   Where a discussion item is not black and white it attempts to collate
   opinion from all angles as far as the author is able to without bias.
   It does not draw conclusions of any sort.  

   It is intended that other designers will contribute to the knowledge
   base presented by this document without compromising its goal of
   being as impartial as possible.  

   It is suggested that the designers should present their personal
   conclusions on the various topics raised in this document in separate
   documents.  Hopefully a number of these documents will be
   collaborative efforts!

   This document is very much in the form of a brain dump.  As it is not
   expected to go beyond being an interim Internet Draft, only minimal
   effort has been made to make it more presentable.  Its main purpose
   is to capture design issues and it is more akin to a set of meeting
   minutes than a formal design document.

      Note: This document was originally drafted during the early stages
      of the pre-midcom design team.  Design decisions in SPAN-A have
      been made as a result of drawing conclusions from the issues
      raised in this document.

2. Definitions

   Client: A device in the Inner Network running the client side of the
      SPAN protocol.  Also known as a SPAN client.  In some cases the
      client may be run on a proxy and so may not run on the same device
      as the client for the protocol that is being traversed.

   Connection: A data path between two devices identified by source
      address and port and destination address and port.  For the
      purposes of this document, a connection may be TCP or UDP, even
      though UDP is connectionless.

   Inner Address: An address valid for the Inner Network.

   Inner Network: A network that is separated from an Outer Network such
      that the addresses in the inner network need to be mapped to
      unique addresses in the outer network using a NAT function before
      connections can be made across the outer network.  

   NAT Function: A device for connecting the address spaces of an Inner
      Network and an Outer Network.  A number of addresses from the
      outer network are assigned to represent connections made from the
      inner network.  Typically the number of addresses allocated from
      the outer network address space will be smaller than the actual
      number of addresses in use in the inner network and so a dynamic
      mapping between inner addresses and outer addresses needs to be
      made.  


   NAT Address: The address used to represent an Inner Address on the
      Outer Network.

   Outer Network: A network that connects multiple Inner Networks.  Note
      that it is possible for an Outer Network to also be an Inner
      Network at some other level.  Thus Outer Networks can be nested.

   Outer Address: An address valid for the Outer Network.

   Relay: A device in a shared network that will accept incoming packets
      and forward them to a client in a 'private' network.

   TCP Connection: A connection that uses TCP.

   UDP Connection: A connection that uses UDP.  (Note that because UDP
      is connectionless this is really more of an association rather
      than a true connection.)

3. Background Information

3.1. Types of NAT/NAPT

   [STUN] lists a number of different types of NAT/NAPT that are in
   deployment.  These are (with paraphrased descriptions):

      Full Cone: New connections using the same inner address are mapped
         to the same NAT address.  Return packets allowed from anywhere.

      Restricted Cone: New connections using the same inner address are
         mapped to the same NAT address.  Return packets only allowed
         from addresses that have already been sent to.

      Port Restricted Cone: New connections using the same inner address
         are mapped to the same NAT address.  Return packets only
         allowed from addresses and ports that have already been sent
         to.

      Symmetric: Each source address/port and destination address/port
         combination is given a separate NAT address.
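
   The differences between these types can be sketched as a small
   model.  This is an illustration only: the NatModel class, its kind
   strings and the example addresses are assumptions made for this
   sketch, and the inbound filtering shown for the symmetric type (exact
   address and port match) is an assumption that is not stated in the
   descriptions above.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

Addr = Tuple[str, int]  # (IP address, port)


@dataclass
class NatModel:
    """Toy model of the NAT types listed above (hypothetical names)."""
    kind: str  # "full", "restricted", "port_restricted" or "symmetric"
    nat_ip: str = "192.0.2.1"  # documentation address, assumed
    next_port: int = 40000
    mappings: Dict[object, Addr] = field(default_factory=dict)
    sent_to: Dict[Addr, Set[Addr]] = field(default_factory=dict)

    def outbound(self, src: Addr, dst: Addr) -> Addr:
        """Return the NAT address used for a packet from src to dst."""
        # Cone types key the mapping on the inner address only; the
        # symmetric type keys it on the source/destination combination.
        key = (src, dst) if self.kind == "symmetric" else src
        if key not in self.mappings:
            self.mappings[key] = (self.nat_ip, self.next_port)
            self.next_port += 1
        nat_addr = self.mappings[key]
        self.sent_to.setdefault(nat_addr, set()).add(dst)
        return nat_addr

    def inbound_allowed(self, nat_addr: Addr, remote: Addr) -> bool:
        """Decide whether a return packet from remote is let through."""
        peers = self.sent_to.get(nat_addr)
        if peers is None:
            return False  # no mapping exists for this NAT address
        if self.kind == "full":
            return True  # return packets allowed from anywhere
        if self.kind == "restricted":
            return remote[0] in {ip for ip, _ in peers}
        # Port restricted cone, and (by assumption) symmetric.
        return remote in peers
```

   With this model, a cone NAT returns the same NAT address for two
   destinations contacted from one inner address, whereas the symmetric
   type returns a different NAT address for each destination.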

   Additionally, there is 1-to-1 NAT, which maps a separate NAT address
   to each private address in use.  

   {N.B. I'm assuming that 1-to-1 NAT does Cone.  Is this true?}

   STUN can be used for UDP with 1-to-1 and full cone types.  With
   variations to the protocols that use it, STUN can also be used with
   various restricted cone types.  SPAN is intended to be used with all
   types, but is required for the symmetric type.  SPAN is also intended
   to address inbound TCP.

   The residential market currently uses 1-to-1 NATs, but this area is
   migrating to more use of Port Restricted Cone.  The enterprise
   segment is predominantly Port Restricted Cone [Mahadev].

   {What types of NAT are used in the airport and NAT scenarios?}

4. Firewall Security Policy

   It is important for SPAN not to compromise site security policy.
   This is not only with regard to the letter of the law (i.e. the
   firewall rules), but also the spirit of the law (the intent that led
   to a particular set of rules being written).

   In this respect there are a number of different types of inbound
   connection that SPAN could potentially allow, the two main types
   being TCP and UDP.

5. UDP Issues

   Incoming UDP can be handled in two ways:

      UDP permanent cone: A SPAN relay could open a port that,
         throughout the lifetime of the relay connection, allows packets
         from any location to be relayed to the client.

      UDP collapsing cone: The UDP port on the relay starts as a cone,
         but will collapse down after receiving the first packet so that
         it will only forward packets received from the initial source.
         All other received packets are discarded.
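
   The two behaviours described above can be sketched as a per-port
   filter on the relay.  This is a minimal sketch only; the class and
   attribute names are hypothetical and do not come from any SPAN
   specification.

```python
class RelayUdpPort:
    """Toy filter for one UDP port on a relay (hypothetical names)."""

    def __init__(self, collapsing: bool):
        self.collapsing = collapsing
        self.locked_source = None  # set once a collapsing cone collapses

    def accept(self, source) -> bool:
        """Return True if a packet from source should be forwarded."""
        if not self.collapsing:
            return True  # permanent cone: forward from any location
        if self.locked_source is None:
            # The first packet received collapses the cone onto its source.
            self.locked_source = source
            return True
        # After the collapse, all other sources are discarded.
        return source == self.locked_source
```

   Note that in the collapsing case the decision depends only on which
   packet arrives first, not on whether its sender is the party the
   client intended to hear from.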

   Both methods begin as a cone because the source of packets is
   typically not known a priori.

   The permanent cone is attractive because it allows multiple incoming
   call attempts.  The permanent cone is also required to allow remote
   RTCP sender reports and receiver reports to come from different
   locations on the remote client (even though this may not happen
   often).

   A UDP permanent cone will allow attacker packets to get to a client
   in addition to packets from the genuine remote party.  This may allow
   the attacker to compromise the client.  Further, if the NAT is
   intentionally a symmetric NAT, the presence of the permanent cone
   relay will undermine what little security the NAT offers.

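   On the other hand, a collapsing cone makes it easy for an attacker to
   launch a denial of service attack based on stealing the service: if
   the attacker's packet is the first to arrive, the cone collapses onto
   the attacker and packets from the genuine remote party are discarded.

   Once an attacker has access to the path in this way, they may carry
   out any attack that they might carry out over a non-collapsing cone.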

   For both forms it is necessary to decide whether the address that
   data is received from needs to be identified to the client.  Such
   information would allow the relaying of protocols that require
   sending replies back to the original source address, and would allow
   a client to know that it is receiving packets from multiple sources,
   which may suggest some sort of attack.

   There may be a case for hiding this information from the client in
   the interests of attempting to conform more closely to the assumed
   site security policy.  In the case of the permanent cone, if address
   hiding is thought desirable, it will be possible for the relay to
   identify the source of the packets to the client using logical
   identifiers.  This will allow the client to tell which session the
   received packets relate to, but will not give it detailed information
   about the actual source.  This implies some additional multiplexing
   in the channel (and additional work if the component was embedded in
   a NAT), which might not be desirable.  The use of logical identifiers
   instead of real source addresses can be seen as a violation of the
   end-to-end principle.  Going further, not including any indication of
   source address (real or otherwise) is an even bigger violation of the
   principle, and it may follow that relays MUST include the source
   address in all cases.  
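
   As an illustration of the logical identifier idea, the relay could
   hand the client a small opaque number in place of each real source
   address.  The sketch below is one possible shape for this, with
   hypothetical names; it is not a proposal for the actual encoding.

```python
class SourceAliaser:
    """Maps real source addresses to opaque identifiers (hypothetical)."""

    def __init__(self):
        self._ids = {}  # real source address -> logical identifier

    def alias(self, source) -> int:
        """Return a stable identifier for source without revealing it."""
        if source not in self._ids:
            self._ids[source] = len(self._ids) + 1
        return self._ids[source]
```

   The client can then tell that packets carrying different identifiers
   come from different sources, but learns nothing else about them.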

   It is not clear at this time whether only one of the types will be
   used, or whether the client will be able to tell the server what
   behaviour is required.

      Almost by definition, STUN does not have to consider firewall
      security as it is only usable in the absence of a firewall (i.e.
      STUN only works with cone NATs, but a firewall will typically make
      the NAT appear like a symmetric NAT).

6. TCP Issues

   The handling of in-bound TCP comes down to the characteristics of the
   TCP listeners on the relay.  There are two options:

      (1)   One-shot listener - A listener is posted on the relay that
            is terminated as soon as the first incoming connection is
            received.

      (2)   Persistent listener - The listener remains active and able
            to receive additional incoming connections after the first
            connection is received.

   The problem with (1) is that if a one-shot listener is used to route
   incoming call notification (e.g. as in H.323), the address to which
   those notifications are routed is lost as soon as the first call
   notification is received.  The client then has to obtain a new
   'one-shot listener' address and re-register that address.  This
   potentially opens up an attack where the attacker can repeatedly
   force the client (or clients) to keep re-registering with the proxy.
   This effectively represents an amplification of computing load by the
   time it reaches the registrar and amounts to a form of DDoS.


   The conundrum for (2) is that its main benefit is also its main
   weakness. The main benefit is that it is able to allow multiple
   clients in the 'private' domain to post listeners that can be used to
   accept multiple incoming calls.  The main weakness is that multiple
   hosts in the 'private' domain are able to accept multiple incoming
   connections.  It seems that it is impossible to allow the benefit
   without the problem!

   (2) is mainly of use when multiple clients are behind a firewall/NAT
   and the initial protocol signaling is done over TCP (e.g. H.323, SIP
   over TCP or TLS).  Where only one client is behind a firewall or NAT,
   or it is possible to deploy a proxy within the protected network,
   hard coding of the NAT may be possible (e.g. port 5060 is mapped to
   the client).  However, even in these cases relay forwarding may be
   required if it is not possible to configure the NAT.  (What is the
   situation with residential ADSL, cable modems etc?)

   Initially it appears that there are at least two classes of problem
   that a persistent TCP listener enables:

      (a)   It allows unauthorised servers to be set up for essentially
            malicious purposes,

      (b)   It allows servers to be set up that, through ignorance, are
            not sufficiently hardened against attack.

   In the case of (a) it needs to be decided whether this represents a
   significantly greater threat to enterprise security policy than a
   number of other inside attacks that already exist.

   In the case of (b) it needs to be decided whether the threat of a
   persistent TCP listener is really any more of a threat than a
   one-shot listener.

   Both types of listener likely require some sort of public database
   (such as a SIP proxy) to map external requests to the dynamically
   allocated ports on the relay.  Method (1) may be used for receiving
   multiple inbound call requests if, on receiving a new connection, the
   client immediately requests a new listener and re-registers with the
   public proxy.  The presence of a public proxy is likely to work
   against the benefit realized by using what amounts to port hopping,
   and if this is thought to be a threat it might be necessary to
   somehow ban use of both types of TCP listener initiating sessions.

   On the other hand, if an attacker is attempting to find victims by
   port scanning the relay, as long as multiple clients are using a
   single relay, it is unlikely that the attacker will be able to
   repeatedly connect to the same client and attack it in a sustained
   way.  The attacker may also not be able to readily associate a
   particular communication session with a particular client and so
   attacks requiring multiple sessions will be harder.  But this may
   simply slow the attacker down, rather than preventing the problem.


   One option to allow for persistent listeners with the slightly higher
   security of one-shot listeners, might be for the client to explicitly
   signal acceptance of an incoming connection and for the relay to get
   involved in some form of TCP accept rate limiting.
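
   One way the relay might rate limit TCP accepts is with a token
   bucket.  The sketch below covers only the rate limiting part (not the
   explicit acceptance signal from the client); the class name, the
   parameters and the use of a caller-supplied clock are assumptions
   made for illustration.

```python
class AcceptRateLimiter:
    """Token bucket limiting accepts on a persistent listener (sketch)."""

    def __init__(self, rate_per_sec: float, burst: int, now: float = 0.0):
        self.rate = rate_per_sec    # tokens added per second
        self.burst = burst          # maximum number of stored tokens
        self.tokens = float(burst)  # start with a full bucket
        self.last = now             # time of the previous decision

    def allow(self, now: float) -> bool:
        """Return True if an incoming connection may be accepted at now."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # over the limit: refuse or drop the connection
```

   A relay using something like this would call allow() with the current
   time for each incoming connection on the listener and only pass the
   connection to the client when it returns True.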

   (2) potentially makes the client susceptible to a DoS attack based on
   request flooding.  However, (1) enables other forms of DoS attack,
   such as service stealing.  

   Before (2) is discarded, it needs to be weighed up whether (1) is
   practical enough to make (2) unnecessary, whether the threat of (2)
   is sufficiently greater than the threat of (1), and whether there are
   comparable threats associated with allowing inbound UDP connections.

   There is obviously much discussion to be had here! 

6.1. Listener Lifetime Management

   Both types of TCP listener may be in the listening state for a long
   time before they are actually triggered.  In both cases it may be
   necessary to terminate the listener before it actually receives an
   incoming connection.  

   A one-shot listener could be associated with a TCP connection between
   the relay and the client, and the lifetimes of the two tied together
   such that when the TCP connection is closed, the listener is also
   closed.  

   A persistent listener will likely require some form of out-of-band
   control to close it down.  Both hard state and soft state mechanisms
   could be used to maintain the status of the listener.
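
   The difference between the two mechanisms can be sketched as follows:
   with soft state the listener disappears unless the client keeps
   refreshing it, while with hard state it stays until explicitly
   closed.  The class below models both on one object; its names and
   the caller-supplied clock are hypothetical.

```python
class SoftStateListener:
    """Listener that expires unless refreshed (hypothetical sketch)."""

    def __init__(self, ttl: float, now: float):
        self.ttl = ttl                # lifetime granted per refresh
        self.expires_at = now + ttl
        self.closed = False

    def is_active(self, now: float) -> bool:
        """True while the listener is neither closed nor expired."""
        return not self.closed and now < self.expires_at

    def refresh(self, now: float) -> bool:
        """Soft state: extend the lifetime; fails once expired or closed."""
        if not self.is_active(now):
            return False
        self.expires_at = now + self.ttl
        return True

    def close(self) -> None:
        """Hard state style: explicit out-of-band teardown."""
        self.closed = True
```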

7. Outbound Forwarding through the Relay

   An issue is whether it is ever necessary to send out-bound packets
   through the relay.  (The other option being to always send them
   directly from the client.)  Enabling the relay to do outbound
   forwarding would allow packets to flow back along the same logical
   path (defined solely by source and destination addresses and ports)
   that they were received on.

   This is primarily an issue for UDP as any inbound TCP connection is
   implicitly bi-directional and thus requires no explicit forwarding
   rules.

   There seem to be four options here: 

      (1)   do not allow the relay to do outbound forwarding, 

      (2)   allow the relay to forward packets to the destination that
            packets were first received from, 


      (3)   allow the relay to forward outbound packets based on an
            explicit command from the client that remains in force until
            a subsequent forwarding command is issued, during which time
            multiple UDP packets may be forwarded, and

      (4)   allow the relay to forward outbound packets based on per UDP
            packet explicit commands from the client.

   It may be that (1) is sufficient.  (2) enables an effective denial of
   service attack in which an attacker can simply send port scanning
   packets to the relay and steal service.  (3) is much harder to
   implement as it implies out-of-band control, or multiplexed in-band
   control.  (4) implies multiplexed in-band control.
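
   The options differ only in where the relay gets the outbound
   destination from.  The function below is a sketch of that decision,
   with hypothetical parameter names; it says nothing about how the
   commands in (3) and (4) would actually be carried.

```python
from typing import Optional, Tuple

Addr = Tuple[str, int]  # (IP address, port)


def outbound_destination(option: int,
                         first_source: Optional[Addr] = None,
                         standing_command: Optional[Addr] = None,
                         per_packet_command: Optional[Addr] = None
                         ) -> Optional[Addr]:
    """Return where the relay forwards an outbound packet, or None."""
    if option == 1:
        return None                 # relay never forwards outbound
    if option == 2:
        return first_source         # back to whoever sent first
    if option == 3:
        return standing_command     # last explicit forwarding command
    if option == 4:
        return per_packet_command   # destination supplied with the packet
    raise ValueError("option must be 1, 2, 3 or 4")
```

   Under option (2) the destination is chosen by whoever reaches the
   relay port first, which is why a port scanning attacker can steal the
   service.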

   (2) is probably only useful when something like symmetric RTP has
   been explicitly indicated in the signalling, as in the general case
   RTP packets do not go back from whence they came.

   One reason for relay based out-bound forwarding might be legal
   interception, although thus far the IETF has decided not to support
   such features.  Additionally, service provider firewalls are likely
   to be a more appropriate location to support this function.

   To decide whether this is necessary, we need to look at:

         RTP

         SIP

         Use of symmetric RTP in SIP {is this still an option?}

         H.323 RAS

         H.323 Annex E

         Whether remote NATs will cause problems with non-symmetric
         paths.

   When carrying out the above analysis it may be necessary to consider
   real-world implementations rather than simply what the standards say.
   For example, experience has shown that some H.323 endpoints expect
   RTP and RTCP data to come from the same location that they are
   sending it to, even though there is no basis for this mode of
   operation in the various recommendations.

8. In-band Control Vs. Out-of-Band Control

   There needs to be communication between the client and the relay.
   Three possibilities exist: 

      (1)   Control uses a different transport connection to the data
            path,


      (2)   Control is multiplexed onto the same transport connection as
            the data using some form of multiplexing,

      (3)   Control initially uses the transport connection, and then
            irreversibly switches over to data transport once the
            connection has been suitably configured.

   (1) is out-of-band and (3) is in-band control.  Whether (2) is
   in-band or out-of-band is a matter of opinion.  It is in-band if the
   transport is considered to be the level of multiplexing, and
   out-of-band if you consider that the necessary muxing on top of the
   transport connection is part of the multiplexing.

   (1) and (2) allow control and data to be exchanged at any time that
   it is required to do so.  The most obvious benefit of this is that it
   allows controlled termination of the relay operation.  It also
   readily allows the protocol to be extended.  

   (1) requires a separate transport connection per client for the
   control.  (2) requires some form of in-band multiplexing, although
   with a constrained data set and suitable care it may be possible to
   define a multiplexing scheme that does not normally require
   additional copying of the data in order to insert header shims.
   However, in the general case it does require special handling on
   occasion, and so may make it less attractive if the relay function is
   integrated into a NAT.
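
   For comparison, the straightforward way to do (2) over a stream
   transport is to prefix every unit with a small header shim.  The
   sketch below assumes a made-up 3 byte header (1 byte kind, 2 byte
   length, network byte order); this layout is an assumption for
   illustration and is exactly the kind of per-packet copying that a
   more careful scheme would try to avoid.

```python
import struct

HEADER = struct.Struct("!BH")  # kind (1 byte), payload length (2 bytes)
CONTROL, DATA = 0, 1           # hypothetical channel identifiers


def frame(kind: int, payload: bytes) -> bytes:
    """Prefix payload with the shim header."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for a 2 byte length field")
    return HEADER.pack(kind, len(payload)) + payload


def deframe(buffer: bytes):
    """Split buffer into complete (kind, payload) frames.

    Returns the list of frames and any trailing bytes that do not yet
    form a complete frame.
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        kind, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # payload not fully received yet
        frames.append((kind, buffer[offset + HEADER.size:end]))
        offset = end
    return frames, buffer[offset:]
```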

   (2) and (3) will often require the client user to be authenticated
   for each flow that is set up (e.g. twice per RTP session), whereas
   (1) allows for user authentication at initial control channel setup,
   and then a more localized authentication scheme for each flow that is
   set up thereafter.  (Here user authentication is assumed to involve
   accessing user records, whereas a localized scheme may simply be
   based on some cryptographic token that does not require access to per
   user information.)  It may be possible to ameliorate the difference
   between user authentication and localized authentication by using
   some form of user credential caching.  This may be complicated by the
   fact that typically two data paths will be set up at the same time
   (RTP and RTCP) and hence, at the time of authenticating the second
   data path, the first has not yet completed caching.  There are
   solutions to this problem, but their complexity has to be compared to
   the complexity involved in setting up a separate control connection.

   User authentication may also adopt a challenge-based scheme to
   prevent exchange of actual passwords.  This would make per flow user
   authentication less attractive as it requires more round-trips.

   Adopting a scheme where the data path authenticates using a more
   localized scheme allows the media relays to not have access to the
   user records.  This simplifies their implementation, and hence helps
   with scalability.
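
   One possible form of such a localized scheme is a keyed hash over a
   flow identifier and an expiry time, issued at control channel setup
   and checked by the media relay.  The sketch below assumes a key
   (relay_key) shared between the issuer and the relay, and a made-up
   token layout; both are assumptions for illustration only.

```python
import hashlib
import hmac


def issue_token(relay_key: bytes, flow_id: bytes, expires_at: int) -> bytes:
    """Create a token the relay can check without any user records."""
    message = flow_id + b"|" + str(expires_at).encode()
    tag = hmac.new(relay_key, message, hashlib.sha256).hexdigest().encode()
    return message + b"|" + tag


def verify_token(relay_key: bytes, token: bytes, now: int) -> bool:
    """Check the token's integrity and that it has not expired."""
    try:
        flow_id, expires, tag = token.rsplit(b"|", 2)
        expires_at = int(expires)
    except ValueError:
        return False  # malformed token
    message = flow_id + b"|" + expires
    expected = hmac.new(relay_key, message, hashlib.sha256).hexdigest().encode()
    # compare_digest avoids leaking the tag through timing differences.
    return hmac.compare_digest(tag, expected) and now < expires_at
```

   The relay needs only the shared key to verify a flow, which is what
   allows it to operate without access to the user records.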


   In the case of (1), if the out-of-band control channel is based on
   TCP then TLS can readily be used to help encrypt and authenticate it.
   In the case of (2) or (3), IPSec or a more application specific
   scheme will have to be adopted for the UDP sessions.

   {Does IPSec allow for things like certificate exchange?}

   The requirement for multiple streams to have a particular port
   relationship (such as RTP and RTCP) may favour an out-of-band
   scheme as the second of the pair will effectively have been set up
   out-of-band.  Hence, using an in-band scheme to set up pairs would
   result in both in-band and out-of-band techniques and at first sight
   may lead to a more complex design than a purely out-of-band
   technique.  More detailed design should readily resolve this
   particular issue.

9. Keep Alives

   What methods should be used to keep UDP NAT and firewall bindings
   alive?  Are zero length UDP packets sufficient?  

   For streams that don't have much traffic (e.g. call signalling paths
   - assuming we have such things) is there a trade off between
   continuous keep-alives over UDP, and transporting the UDP data over
   TCP, which has better state management?  Even if it has benefits, the
   latter is likely to be seen as a violation of firewall policy as it
   does not allow an administrator to allow TCP traffic while blocking
   UDP traffic.  On the other hand the feeling seems to be that TCP is a
   bigger threat to security than UDP, so maybe UDP over TCP is not that
   bad!

   TCP NAT binding lifetimes vary widely.  Some last as long as 24
   hours.  The Linux TCP NAT bindings appear to be as short as 15
   minutes.  One option to keep the TCP NAT bindings alive is to use TCP
   Keep-Alive.  The recommended default period for this is 2 hours,
   which is longer than the shortest binding lifetimes.  It is
   recommended that stacks make this value configurable, but in a number
   of cases it isn't configurable or changing it requires a re-build of
   the OS.  These considerations suggest that the TCP keep-alive
   mechanism isn't appropriate to keep the NAT bindings alive, and some
   other method is required.
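
   The arithmetic behind this can be made explicit.  The sketch below
   uses the figures quoted above (15 minute and 24 hour bindings, 2 hour
   keep-alive period); the function names and the safety factor of one
   half are assumptions chosen for illustration.

```python
def binding_survives(binding_timeout_s: float, keepalive_period_s: float) -> bool:
    """True if keep-alives arrive before the NAT binding times out."""
    return keepalive_period_s < binding_timeout_s


def keepalive_interval(binding_timeout_s: float, safety_factor: float = 0.5) -> float:
    """Pick a keep-alive interval comfortably inside the binding lifetime."""
    if not 0 < safety_factor < 1:
        raise ValueError("safety_factor must be between 0 and 1")
    return binding_timeout_s * safety_factor


TCP_KEEPALIVE_DEFAULT_S = 2 * 60 * 60   # 2 hours
SHORT_BINDING_S = 15 * 60               # 15 minutes
LONG_BINDING_S = 24 * 60 * 60           # 24 hours
```

   With these figures the default keep-alive period keeps a 24 hour
   binding alive but not a 15 minute one; an application level method
   would need to send something roughly every 7.5 minutes in the latter
   case under the assumed safety factor.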

10. ICMP

   Typically an ICMP message will be sent to a remote client if a UDP
   packet is received on a port that is not in use.  This can give
   information to an attacker on which ports are active on a relay.  To
   prevent this it may be worth recommending that ICMP reports are not
   sent in this situation.  This mirrors the way some firewalls can be
   configured.

11. IP Fragmentation


   Do we need to do/specify anything about this?

12. SCTP

   Do we need to cover SCTP or can we leave it for further study (FFS)?

13. Security Considerations

   The main consideration throughout this document has been security.
   The overriding philosophy of this document, as mentioned previously,
   is that it is important for SPAN not to compromise site security
   policy.  This is not only with regard to the letter of the law (i.e.
   the firewall rules), but also the spirit of the law (the intent that
   led to a particular set of rules being written).  

   Inferring such intent is difficult and there are likely to be as many
   opinions on the subject as there are people contributing to the
   debate.  This document attempts to tread a path comparing threats
   that already exist against threats that may be introduced by a
   SPAN-type protocol, tempered with the knowledge that even minor
   concessions to functionality start to affect a site's security
   situation.  (Even the deployment of something simple like STUN might
   have an impact on a site's intended security policy.)  This is a
   difficult area that is not exclusive to SPAN-like protocols.  For
   example, Teredo faces exactly the same sorts of issues and, to some
   degree, so does OPES.

   The situation where a SPAN-type deployment may work without
   administrator intervention is where any outbound TCP connection is
   allowed through the firewall.  The main reason for this type of rule
   is to enable FTP operating in PASV mode without an ALG.  However,
   most major firewalls and the majority of small DSL firewalls include
   FTP ALGs or stateful inspection, so such 'any outbound protocol is
   OK' rules are no longer required and are becoming increasingly rare.
   Indeed, even protocols such as HTTP and POP3 are subject to
   considerable scrutiny by various proxies to avoid the introduction of
   viruses.

   Additionally, there is good reason to avoid rules that allow any
   outbound protocol.  Chief among these reasons is avoiding the
   effects of trojans such as Back Orifice that, having obtained
   resources on an internal machine, will attempt to make an outbound
   connection back to their host.

   Hence, it would appear that in many situations, particularly those
   where security is considered important, it would not be possible to
   run a SPAN-like protocol without involvement of the administrator.

   Another issue for SPAN is that somebody inside an enterprise might be
   able to use it for malicious purposes.  Chances are that they would
   have to be technically literate to make use of SPAN, in which case
   they could implement their own solution without knowledge of SPAN.
   Also, Back Orifice and a number of other attacks, such as sendmail
   attacks, already make use of outbound connections made from
   compromised machines, and the presence of something like SPAN would
   not make such attacks any easier.  There are also probably many
   better ways for a malicious internal person to operate than using
   SPAN.  For example, rather than post an internal web server they
   could simply copy all the data they wanted onto a CD-ROM and post it
   externally.  Such an approach is much less traceable, and far more
   sensible from the malicious person's perspective.

   One thing that would seem to be beneficial to the community of
   administrators would be to have guidance that there are protocols
   such as SPAN around and that there are simple ways to prevent them
   compromising a site's intended firewall security policy.  One way to
   do this would be to publish a SPAN-like protocol and include in the
   specification ways that it and other protocols can be blocked by
   administrators.  That way firewall vendors can include a suitable set
   of firewall rules 'out-of-the-box', the reasons for such rules can be
   openly discussed and administrators will not get any unexpected
   results.

14. References

   [STUN] J. Rosenberg, "STUN - Simple Traversal of UDP Through NATs",
         IETF Internet Draft, draft-rosenberg-midcom-stun-01.txt,
         March 1, 2002.

   [Mahadev] Information kindly provided by Mahadev Somasundaram in a
         private e-mail.

15. Authors' Addresses

   Pete Cordell              
   Ridgeway Systems & Software
   66 Suttons Business Park
   Reading 
   RG6 1AZ
   England
   pcordell@ridgewaysystems.com
   
