INTERNET DRAFT                                         Yogesh Prem Swami
File: draft-swami-tcp-lmdr-03.txt                               Khiem Le
Expires: January 14, 2005                          Nokia Research Center
                                                                  Dallas
                                                           July 15, 2004


           Lightweight Mobility Detection and Response (LMDR)
                           Algorithm for TCP


Status of this Memo
   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of [RFC2026].

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Abstract

     TCP congestion control is based on the assumption that the
     end-to-end path of a connection changes very infrequently (most
     likely due to router failure) after connection establishment. This
     assumption allows a TCP sender to compute (predict) a new
     congestion window (cwnd) based on the ACKs from the previous cwnd.
     With host mobility, however, the assumption of a "constant path"
     does not hold, and the present congestion control and avoidance
     mechanisms can lead to suboptimal system performance. In this
     document we describe a TCP option that allows a receiver to inform
     the sender about a subnet change, so that the sender can react to
     optimize performance.


Expires: January 15, 2005                                       [Page 1]

draft-swami-tcp-lmdr-03.txt                                July 15, 2004


1. Introduction

     TCP congestion control [RFC2581] is based on the assumption that
     the end-to-end path of a connection does not change--or at most
     changes infrequently--once the connection is established. Based on
     this assumption, TCP increases its data rate whenever it receives
     positive feedback in the form of new ACKs (i.e., ACKs for new
     data). Without the assumption of a "constant path," however, the
     TCP sender cannot continue at the old data rate: ACKs received for
     packets sent on the old path reflect only the congestion state of
     that path, not of the new path.

     When a TCP sender or receiver changes its point of attachment to
     the Internet (henceforth referred to as "changes subnets" or
     "changes path"), the entire end-to-end path between the sender and
     receiver can change. Therefore, relying on the rate of arrival of
     ACKs as the only criterion for congestion control can lead to
     suboptimal system performance.

     In this document, we describe a network-layer-independent mechanism
     by which hosts can propagate path-change information to their
     peers, so that the peers can react to optimize performance. We
     assume that a mobile host always knows its own subnet information
     (for example, by looking at its neighbor cache, destination cache,
     default router, or a combination of these [RFC2461]), but that it
     currently has no way to inform its peer of a subnet change.

     Please note that some network layer mobility management techniques,
     such as Mobile-IPv6 [JPA03] with route optimization, may be used to
     indirectly derive a peer's mobility information (for example, by
     looking into the binding cache), but these schemes do not work in
     other cases such as Mobile-IPv6 with reverse tunneling, Mobile-IPv4
     [RFC3344], or traditional cellular networks. Once a TCP sender has
     mobility information about itself or its peer, it can use the
     congestion response described in Section-5 to adjust its data rate.

     Please also note that we are not trying to solve the link-up/
     link-down problem. Link-up/link-down issues are related to link
     layer mechanisms, which may or may not be triggered by a subnet
     change. For example, unplugging and replugging an Ethernet cable
     constitutes a link-up/link-down event, even though the host might
     remain in the same subnet after replugging the cable. LMDR, on the
     other hand, has been designed for just one purpose: to facilitate
     subnet change notification and to optimize performance if there is
     a subnet change.

     Furthermore, we consider packet loss due to bit errors to be
     different from packet loss due to host mobility. LMDR MUST NOT be
     used as a general mechanism to recover from packet loss due to bit
     error. Conceptually, loss due to bit errors is different from loss
     due to mis-routed packets.

2. Terminology

     The key words "MUST," "MUST NOT," "REQUIRED," "SHALL," "SHALL NOT,"
     "SHOULD," "SHOULD NOT," "RECOMMENDED," "MAY," "OPTIONAL," and
     "silently ignore" in this document are to be interpreted as
     described in [RFC2119].

     Mobile Node (MN):
          A host (not a router) capable of changing its point of
          attachment to the Internet without breaking transport layer
          connectivity. Hosts that change their point of attachment to
          the Internet but use DHCP or another mechanism to get a new IP
          address are not considered mobile.

     Old Subnet:
          MN's point of attachment (subnet prefix) to the Internet prior
          to movement. Old Subnet and Old Path are often used
          interchangeably in this document.

     New Subnet:
          MN's point of attachment after movement. New Subnet and New
          Path are used interchangeably in this document.

     INIT_WINDOW:
          The initial congestion window size at the start of connection
          as described in [RFC3390].

     Stale ACK:
          ACKs corresponding to the data sent on the Old Path. These
          ACKs don't contain meaningful congestion information about the
          new path and should be ignored for congestion response on the
          new path.

3. Congestion Issues with Subnet Change

     For concreteness, the description below assumes network mobility
     based on Mobile IP, but the same concepts are readily applicable to
     other types of networks.

     To illustrate the problem, consider Figure-1. At time=T, the MN is
     reachable on Subnet-1 through AR-1 and has the care-of address
     <Subnet-1, MN>. While MN is "attached" to AR-1, packets between
     TCP-Sender and <Subnet-1, MN> are routed using PATH-1. Let's assume
     that after some period of time, at T+1, MN moves (hands over) to
     Subnet-2 and is reachable through AR-2 with the care-of address
     <Subnet-2, MN>. While MN is attached to AR-2, all packets between
     TCP-Sender and <Subnet-2, MN> are routed using PATH-2.


                          <---------PATH-1---------->

                            /---------\   +---------+
                            |         |   |         | Subnet-1
                        +---+ Cloud-1 +---+  AR-1   +-->>>>>MN
                        |   |         |   |         |  (Time=T)
       +------------+   |   \----++---/   +---------+
       |            |   |        ||            |
       | TCP Sender +---+        ^V PATH-3    ^V^ PATH-4
       |            |   |        ||            |
       +------------+   |   /----++---\   +----+----+
                        |   |         |   |         | Subnet-2
                        +---+ Cloud-2 +---+  AR-2   +-->>>>>MN
                            |         |   |         |  (Time=T+1)
                            \---------/   +---------+

                           <--------PATH-2----------->


                                   Figure-1


     During the transient period, when MN moves from Subnet-1 to
     Subnet-2, AR-1 may (or may not) buffer and forward packets destined
     to and from <Subnet-2, MN> through PATH-3 or through PATH-4 [K03].

     We make the distinction between PATH-3 and PATH-4 to emphasize the
     fact that PATH-4 may belong to a well provisioned network that has
     dynamic equilibrium for mobile users. Such networks are designed to
     accommodate extremely bursty traffic. PATH-3, on the other hand,
     may consist of arbitrary routers without proper provisioning.

     Let's assume that a TCP connection was progressing between MN and
     TCP Sender when the user moves from Subnet-1 to Subnet-2. We now
     analyze the problem of congestion on different paths shown above.

3.1 Congestion On PATH-1

     Congestion on PATH-1 is governed by basic slow-start and congestion
     avoidance mechanisms [RFC2581]. As long as MN remains in Subnet-1,
     standard congestion control algorithms are sufficient. But once it
     moves from Subnet-1 to Subnet-2, two different scenarios are
     possible depending on the network topology.

     Scenario-1: Access Routers Don't Tunnel Packets to new Subnet.

          In this scenario (typical of Mobile-IPv4), all packets
          destined to <Subnet-1, MN> are dropped by AR-1 once the mobile
          has moved (this happens if the access routers don't have
          enough packet forwarding information). Since the latency
          involved in establishing a new tunnel is on the order of one
          RTT (2*RTT in case of Mobile-IPv6), roughly an entire window's
          worth of data will be dropped by AR-1. Because of this window
          loss, the sender will time out in most cases.

          In this scenario, the TCP sender has to wait unnecessarily for
          an RTO before it can initiate its loss recovery algorithm. In
          addition, the sender's SS_THRESH will be set to an arbitrary
          value that has no correlation with the BDP of the new path. An
          arbitrary SS_THRESH severely impacts the throughput of the
          connection: if the BDPs of the two paths are substantially
          different, the sender spends a long time trying to reach a
          reasonable throughput on the new path. For example, consider
          the case where the BDP on the old path was 10 packets, while
          the BDP on the new path is 1000 packets. With a normal
          timeout-based loss recovery algorithm, the sender's SS_THRESH
          will be set to 10 packets, and reaching a reasonable
          throughput of at least 500 packets (i.e., half of the BDP)
          will require (log_2(10/2) + (500-5)) round trips (recall that
          the data rate increase during congestion avoidance is just one
          packet per RTT). Contrast this with a scheme where the sender
          resets SS_THRESH to a large value after the subnet change and
          spends only log_2(500/2) RTTs to reach a reasonable
          throughput.
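
          The arithmetic above can be checked with a short simulation.
          The sketch below is not part of LMDR. It assumes an idealized
          sender whose window starts at one packet, doubles each round
          trip in slow start (capped at SS_THRESH), and grows by one
          packet per round trip in congestion avoidance. The function
          name rtts_to_reach is hypothetical, and because of these
          modelling choices the counts differ slightly from the
          closed-form estimates in the text.

```python
def rtts_to_reach(target: int, ssthresh: int, init_cwnd: int = 1) -> int:
    """Count round trips until cwnd (in packets) first reaches target."""
    cwnd = init_cwnd
    rtts = 0
    while cwnd < target:
        if cwnd < ssthresh:
            # Slow start: double per round trip, without overshooting ssthresh.
            cwnd = min(2 * cwnd, ssthresh)
        else:
            # Congestion avoidance: one extra packet per round trip.
            cwnd += 1
        rtts += 1
    return rtts


# SS_THRESH left over from the old path (10 packets) versus a large value.
stale = rtts_to_reach(target=500, ssthresh=10)
reset = rtts_to_reach(target=500, ssthresh=10**9)
print(stale, reset)
```

          Under these assumptions the sender needs several hundred round
          trips with the stale SS_THRESH but fewer than ten with a large
          one, which is the order-of-magnitude gap the example is meant
          to show.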

     Scenario-2: Access Routers Tunnel Packets to the new Subnet
          In this scenario, all packets destined to <Subnet-1, MN> are
          forwarded to <Subnet-2, MN> by AR-1 [K03]. In this case, AR-1
          can forward packets to <Subnet-2, MN> using PATH-3 or PATH-4.
          We consider these two paths separately in the following
          sections.

3.2 Congestion On PATH-3

     If AR-1 starts forwarding packets to AR-2 using PATH-3, PATH-3 will
     experience a sudden burst of data. In addition, if multiple MNs
     move between AR-2 and AR-1, PATH-3 may become congested. But while
     sending packets on PATH-3 is bad for other connections, dropping
     them is bad for the connections that change subnets (Section-3.1).

3.3 Congestion On PATH-4

     In many cases, it's reasonable to assume that wireless service
     providers will have a well provisioned network that can accommodate
     highly bursty traffic. Such networks may have a dynamic equilibrium
     where the average transit traffic from AR-1 to AR-2 is the same as
     the transit traffic from AR-2 to AR-1. Such well provisioned paths
     are, however, not possible Internet-wide.

3.4 Congestion On PATH-2

     Since the MN is able to receive packets even after moving away from
     AR-1, it will continue to generate ACKs in an orderly fashion.
     These ACKs will traverse PATH-3 or PATH-4 and finally reach the
     TCP sender. But the segments sent by the TCP sender in response to
     these ACKs will travel on PATH-2 (assuming the TCP sender has
     received the binding update to send data on the new path).
     Unfortunately, the TCP sender has no congestion information about
     PATH-2, and using the old congestion window may cause congestion on
     PATH-2. This problem becomes worse as the number of mobile users or
     the rate of subnet change increases. Consider, for example, the
     case where a train moves across a subnet boundary due to wireless
     radio coverage limitations, and hundreds of mobile users on that
     train hand off to a new subnet. In these cases, the new subnet will
     see a burst of data that can cause unnecessary packet loss and
     timeouts.

     Conversely, if PATH-2 is much more lightly loaded than PATH-1, and
     if the sender is in congestion avoidance, it will spend many RTTs
     before reaching a reasonable throughput.

     To summarize:

     a) If packets from the old subnet are tunneled to the new subnet,
        then the influx of TCP connections in the new subnet may add to
        network congestion and cause unnecessary packet loss and
        timeouts. Furthermore, if the new subnet is lightly loaded, the
        sender will spend a lot of time trying to reach a reasonable
        throughput.

     b) If packets are not tunneled to the new subnet, then the sender
        may have to wait for an RTO before it can start loss recovery.
        In addition, the SS_THRESH update after a timeout may further
        degrade performance if the BDPs of the two paths are very
        different.


4. Subnet Change Detection

     Subnet change detection is itself a two-step process. First, a
     mobile node needs to know whether it has moved from one subnet to
     another; second, it needs to propagate this information to its
     peer. Detecting when a mobile node has moved is a neighbor
     discovery [RFC2461] problem and is beyond the scope of this
     document. In this document we assume that hosts can determine
     path-change information either from lower layers or through other
     out-of-band mechanisms.

     We now focus on how a mobile node can propagate this information
     to its peer. To do so, we propose to use a TCP option.

4.1 LMDR TCP Option

     The basic idea behind the LMDR option is to use a counter, which is
     decremented every time there is a subnet change. At the start of
     the connection, both endpoints use this option in the SYN packet
     and agree on an initial counter value of 7 (each side has its own
     counter). After the SYN exchange is completed, the hosts do not
     send this option until there is a subnet change.

     When there is a subnet change, the Initiator (the host that wants
     to inform its peer--the Responder--about the subnet change)
     decrements the counter and sends this option in every subsequent
     ACK or data packet. When the Responder sees an LMDR option, it
     echoes back the Initiator's counter, and it keeps echoing the
     value until the Initiator stops sending the option. The Initiator,
     in turn, keeps sending the option until it has received an echoed
     value. In short, the Initiator keeps sending the LMDR option until
     the Responder "acknowledges" that it has received the subnet change
     notification, which the Responder does by echoing the LMDR counter
     back to the Initiator. Note that if both the Initiator and the
     Responder move simultaneously, the host that has the larger
     Initial TCP sequence number should assume the role of Initiator.

     Following is the LMDR TCP Option format:

                 +----------------+----------------+----+------+------+
                 |      TYPE      |     LENGTH     |RES | CNTR | ECNT |
                 +----------------+----------------+----+------+------+

     TYPE: (8 Bits) TCP Option Type. Value set to 25 for experimental
     purposes.

     LENGTH: (8 Bits) TCP Option Length. Value = 3.


     RES: (2 Bits) Reserved bits. The sender should set these bits to
     zero. The receiver should ignore them.

     CNTR: (3 Bits) The subnet counter value of the host sending this
     option. This value is decremented once for every subnet change
     (i.e., if the mobile host moves from x1.y1.z1/24 to x2.y2.z2/24,
     and the counter value in x1.y1.z1/24 was C1, then the counter value
     in x2.y2.z2/24 will be C1-1). As long as the mobile is in the same
     subnet, it should send the same counter value.

     ECNT: (3 Bits) The echoed value of CNTR. When the Responder
     receives an LMDR option, it should copy the CNTR value to ECNT.
     Moreover, the Responder should use its own subnet counter to fill
     in the CNTR value. Following is an example of how it works.

     Let's say MN-A has a subnet counter CNTR-A = 5 and MN-B has CNTR-B
     = 3 before subnet change. Let's assume that node B moves to a new
     subnet. See Figure-2 for details of the message exchange.


                            [NO LMDR OPTION]
                 MN-A <-----------------------------------> MN-B
        ( my_subnet_count = 5 )                  ( my_subnet_count = 3)
        ( rem_subnet_count = 3)                  ( rem_subnet_count = 5)


                   Time = T (MN-B moves to a new subnet)
                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

                          LMDR: CNTR=2 (3-1), ECNT=5
                 MN-A <-----------------------------------  MN-B
           ( rem_subnet_count = 3)              ( my_subnet_count = 2)
           ( my_subnet_count = 5 )              ( rem_subnet_count = 5)

                          LMDR: CNTR=5, ECNT=2
                 MN-A -------------------------------------> MN-B
           (B Has Moved. Echo back ECNT=2)        (Stop sending LMDR)


                                   Figure-2
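
     The option layout and the two messages of Figure-2 can be sketched
     as follows. This is an illustration only: the helper names
     encode_lmdr and decode_lmdr are hypothetical, and the bit order
     within the third byte (RES in the two most significant bits,
     followed by CNTR, then ECNT) is an assumption read off the
     left-to-right field order of the format diagram.

```python
from typing import Tuple

LMDR_TYPE = 25   # experimental option type given in the text
LMDR_LENGTH = 3  # TYPE + LENGTH + one byte holding RES/CNTR/ECNT


def encode_lmdr(cntr: int, ecnt: int) -> bytes:
    """Build the 3-byte option; RES is always sent as zero."""
    if not (0 <= cntr <= 7 and 0 <= ecnt <= 7):
        raise ValueError("CNTR and ECNT are 3-bit fields")
    return bytes([LMDR_TYPE, LMDR_LENGTH, (cntr << 3) | ecnt])


def decode_lmdr(raw: bytes) -> Tuple[int, int]:
    """Return (CNTR, ECNT); the receiver ignores the RES bits."""
    if len(raw) != LMDR_LENGTH or raw[0] != LMDR_TYPE or raw[1] != LMDR_LENGTH:
        raise ValueError("not an LMDR option")
    flags = raw[2]
    return (flags >> 3) & 0x7, flags & 0x7


# Figure-2: MN-B moves (CNTR 3 -> 2) and MN-A echoes the new value.
from_b = encode_lmdr(cntr=2, ecnt=5)
from_a = encode_lmdr(cntr=5, ecnt=2)
print(from_b.hex(), from_a.hex())
```

     Packing both counters into one byte keeps the option at three
     bytes, which matches the LENGTH value given above.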


     Following are the details of subnet change detection algorithm:


     1. Each TCP implementation should keep three new
        variables--my_subnet_count, rem_subnet_count, and
        in_transition--to facilitate the mobility detection and
        response algorithm. my_subnet_count and rem_subnet_count hold
        the mobility count information about the local and remote hosts
        respectively. in_transition is set to one when the Responder
        receives the first LMDR option. The value is reset to zero when
        the Responder receives a packet without the LMDR option set.

     2. At connection setup, a client and server willing to use
        mobility detection MUST each send the LMDR option with CNTR=7
        in their SYN packets. Only if both endpoints agree to use the
        LMDR option should the TCP sender process future LMDR options.

     3. For each packet sent, each host should determine
        if it has moved to a new subnet. If either of the endpoints
        determines that it has moved, it SHOULD update its variables as
        follows:

                   my_subnet_count = (my_subnet_count - 1);
                   in_transition = 1;

        The node that updates this value is referred to as the
        Initiator. The Initiator SHOULD send an LMDR option in every
        packet as long as in_transition == 1. If the Initiator is also a
        data sender, it MUST follow the congestion response algorithm
        described in Section-5. In addition, the Initiator MUST keep the
        in_transition value unaltered until it receives a packet with

                           ECNT == my_subnet_count;

        (i.e., until the most recent CNTR value is echoed back by the
        Responder).

     4. When the Responder receives a valid TCP packet (i.e., a packet
        that meets the sequence number and ACK sequence number criteria
        of RFC 793), it should compare the value of 'CNTR' with the
        value of rem_subnet_count. If the two values are equal, it
        should conclude that the Initiator has not moved and MUST NOT
        update its in_transition variable. (It MUST nevertheless keep
        echoing back the LMDR option. Note that in the case of a
        simultaneous move this results in the option being sent in
        every subsequent packet. To break this infinite loop, the host
        with the larger Initial TCP sequence number should assume the
        role of Initiator.)

        Finally, if the values of rem_subnet_count and CNTR in the
        LMDR option differ, the Responder should conclude that the
        Initiator has moved. In addition, the Responder MUST update the
        variables as follows:

                  rem_subnet_count = CNTR; in_transition = 1;

        After making these changes, the TCP sender MUST follow the
        congestion response algorithm as described in Section-5.
        Moreover, the value of in_transition SHOULD be reset when the
        responder receives a packet from the Initiator without the LMDR
        option (in other words, this guarantees that the Initiator has
        received the option).
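
     Steps 1 through 4 can be sketched as a small per-connection state
     machine. The following is a minimal illustration, not a definitive
     implementation. The class LmdrState, its method names, and the
     is_initiator flag are hypothetical. The sketch also makes three
     simplifying assumptions that the text does not state: the 3-bit
     counter wraps from 0 back to 7, a host attaches the option only
     while its own in_transition is set, and simultaneous moves are not
     modelled.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Option = Tuple[int, int]  # (CNTR, ECNT) as carried in an LMDR option


@dataclass
class LmdrState:
    my_subnet_count: int = 7    # value agreed during the SYN exchange
    rem_subnet_count: int = 7
    in_transition: int = 0
    is_initiator: bool = False  # hypothetical helper flag, not in the draft

    def subnet_changed(self) -> None:
        """Step 3: the local host has moved and becomes the Initiator."""
        # Assumption: the 3-bit counter wraps from 0 back to 7.
        self.my_subnet_count = (self.my_subnet_count - 1) % 8
        self.in_transition = 1
        self.is_initiator = True

    def option_to_send(self) -> Optional[Option]:
        """Option to attach to the next outgoing packet, or None."""
        if self.in_transition:
            return (self.my_subnet_count, self.rem_subnet_count)
        return None

    def receive(self, option: Optional[Option]) -> None:
        """Steps 3 and 4: process one valid incoming packet."""
        if self.is_initiator:
            # Initiator: wait until the peer echoes our current counter.
            if option is not None and option[1] == self.my_subnet_count:
                self.in_transition = 0
                self.is_initiator = False
            return
        if option is None:
            # Responder: the Initiator has stopped sending the option.
            self.in_transition = 0
        elif option[0] != self.rem_subnet_count:
            # Responder: CNTR changed, so the peer has moved.
            self.rem_subnet_count = option[0]
            self.in_transition = 1


a, b = LmdrState(), LmdrState()
b.subnet_changed()             # B moves: its counter goes from 7 to 6
notice = b.option_to_send()    # B -> A
a.receive(notice)
echo = a.option_to_send()      # A -> B, echoing B's counter
b.receive(echo)
a.receive(b.option_to_send())  # B has stopped sending the option
```

     After this exchange both hosts are back to in_transition = 0 and
     host A has recorded the new counter value of host B.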

NOTE: In certain network architectures it's possible that a mobile
   (and the associated link technology) has sufficient congestion
   information about the new path. In these cases, if the congestion on
   the new path is low, one MAY choose not to indicate subnet change
   information to the sender since there is no need to reduce the data
   rate. However, the mobility information MUST be indicated if no such
   information is available or if the congestion information is not for
   the entire path (i.e., if the congestion information is only for a
   part of the new path, then the Initiator MUST inform about subnet
   change).

5. Congestion Response after Subnet Change

     The goal of congestion response after subnet change is to minimize
     congestion on PATH-2. In principle, congestion response for PATH-2
     has the same requirements as that of a new connection: The sender
     should have no more than INIT_WINDOW worth of data outstanding on
     the *new path* and the SS_THRESH should be set to a large value.
     What makes the problem complex is the fact that connections after
     subnet change have non-zero packets in flight. ***The congestion
     response after subnet change MUST therefore ignore the Stale ACKs
     and MUST base its congestion control response solely on the new
     ACKs (i.e., ACKs generated for data sent on the new path).***

     The idea behind the congestion response is to send an INIT_WINDOW
     worth of new data packets at the time when in_transition field is
     set to one, and not send any packets until the in_transition field
     is set to zero. Since the in_transition field will remain set for
     at least one RTT on the new path, it guarantees that the TCP sender
     would behave like a standard TCP connection. Following are the
     details of the congestion response algorithm.

     1. When the TCP sender concludes that there is a subnet change,
        its value of in_transition should be set to 1 (as described
        above in Section-4). At this time, the data sender should
        increase its congestion window as:


                             cwnd = cwnd + INIT_WINDOW;

        and send INIT_WINDOW worth of data on the new path and restart
        the RTO timer as if this were a new connection [RFC2988].

     2. For each subsequent ACK received, the sender should adjust the
        congestion window such that *no new data packet is sent* into
        the network. This behavior should continue until in_transition
        is reset to 0 or there is a timeout. Once in_transition is set
        to zero, the sender should mark the unsacked packets as lost and
        set the packets in flight to INIT_WINDOW - 1. The sender MUST
        also set the congestion window to INIT_WINDOW + 1 and initiate
        loss recovery in slow start.
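
     The two steps above can be sketched as follows. This is a minimal
     illustration under stated assumptions, not a definitive
     implementation: windows are counted in segments rather than bytes,
     INIT_WINDOW = 3 and LARGE_SSTHRESH are illustrative values, the
     SenderState class and function names are hypothetical, SS_THRESH is
     raised when the transition ends (the text does not say exactly
     when), and the timeout case is not modelled.

```python
from dataclasses import dataclass

INIT_WINDOW = 3           # segments; illustrative value only
LARGE_SSTHRESH = 1 << 30  # stand-in for "a large value"


@dataclass
class SenderState:
    cwnd: int
    ssthresh: int
    in_flight: int
    in_transition: int = 0


def on_subnet_change(s: SenderState) -> int:
    """Step 1: open the window by INIT_WINDOW; return segments to send."""
    s.in_transition = 1
    s.cwnd += INIT_WINDOW
    s.in_flight += INIT_WINDOW
    return INIT_WINDOW


def on_ack(s: SenderState) -> int:
    """Step 2: while in transition, Stale ACKs release no new data."""
    if s.in_transition:
        return 0
    return max(0, s.cwnd - s.in_flight)


def on_transition_end(s: SenderState) -> None:
    """Step 2, end of transition: restart from a near-initial window."""
    s.in_transition = 0
    s.in_flight = INIT_WINDOW - 1    # unsacked old-path data counted as lost
    s.cwnd = INIT_WINDOW + 1
    s.ssthresh = LARGE_SSTHRESH      # assumption: reset here


s = SenderState(cwnd=20, ssthresh=10, in_flight=20)
burst = on_subnet_change(s)          # new data sent on the new path
held = on_ack(s)                     # nothing sent for a Stale ACK
on_transition_end(s)
after = on_ack(s)                    # window now allows new data
```

     Freezing transmission during the transition is what prevents ACKs
     for old-path data from clocking new segments onto the new path.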

6. Architectural Considerations

     Architecturally, the method described above does not add any new
     architectural features to the system. Although LMDR requires a TCP
     receiver to look into some parameters and data structures (local to
     that stack) that are specific to the IP layer, this should not be a
     problem either from an implementation point of view or from a
     theoretical point of view. In most cases, the TCP layer already
     consults the IP layer for MTU information, at the very least.

7. Security Considerations

     Since the LMDR option is valid only for an acceptable ACK [RFC793],
     it is immune to passive attacks as long as the congestion window is
     not on the order of 2^31 bytes. However, LMDR is not safe against
     active DoS attacks (present TCP is not safe either). We will
     describe a security mechanism to protect against active attacks if
     there is a requirement from the working group.

8. Acknowledgments

     We would like to thank Shashikant Maheshwari and Mark Allman for
     their comments and suggestions on a previous version of this draft.


9. REFERENCES

     [RFC2581]  M. Allman, V. Paxson, W. Stevens, "TCP Congestion
                Control," RFC 2581, Apr 1999.

     [K03]      R. Koodli, "Fast Handover for Mobile IPv6," Internet
                draft; work in progress, draft-ietf-mobileip-fast-
                mipv6-07.txt, Sept 2003.


     [RFC2461]  T. Narten, E. Nordmark, W. Simpson, "Neighbor Discovery
                for IP Version 6 (IPv6)," RFC 2461, Dec 1998.

     [JPA03]    D. Johnson, C. Perkins, J. Arkko, "Mobility Support in
                IPv6," Internet draft; work in progress, draft-ietf-
                mobileip-ipv6-24.txt, June 2003.

     [RFC3344]  C. Perkins, "IP Mobility Support for IPv4," RFC 3344,
                Aug 2002.

     [RFC3390]  M. Allman, S. Floyd, C. Partridge, "Increasing TCP's
                Initial Window," RFC 3390, Oct 2002.

     [RFC3360]  S. Floyd, "Inappropriate TCP Resets Considered Harmful,"
                RFC 3360, Aug 2002.

     [RFC3517]  E. Blanton, M. Allman, K. Fall, L. Wang, "A
                Conservative SACK-based Loss Recovery Algorithm for
                TCP," RFC 3517, Apr 2003.

     [RFC2018]  M. Mathis, J. Mahdavi, S. Floyd, A. Romanow, "TCP
                Selective Acknowledgment Options," RFC 2018, Oct 1996.

     [RFC2988]  V. Paxson, M. Allman, "Computing TCP's Retransmission
                Timer," RFC 2988, Nov 2000.

     [RFC793]   J. Postel, "Transmission Control Protocol," RFC 793,
                Sept 1981.

     [RFC2119]  S. Bradner, "Key words for use in RFCs to Indicate
                Requirement Levels," RFC 2119, Mar 1997.

     [RFC2026]  S. Bradner, "The Internet Standards Process -- Revision
                3," RFC 2026, Oct 1996.

10. IPR Statement

     The IETF has been notified of intellectual property rights claimed
     in regard to some or all of the specification contained in this
     document. For more information consult the on-line list of claimed
     rights at http://www.ietf.org/ipr.

Author's Address:

   Yogesh Prem Swami                   Khiem Le
   Nokia Research Center, Dallas       Nokia Research Center, Dallas
   6000 Connection Drive               6000 Connection Drive
   Irving, TX-75063, USA.              Irving, TX-75063, USA.

   E-Mail: yogesh.swami@nokia.com      E-Mail: khiem.le@nokia.com
   Ph    : +1 972 374 0669             Ph    : +1 972 894 4882

