Internet Engineering Task Force P. Sarolahti INTERNET DRAFT Nokia Research Center File: draft-sarolahti-tsvwg-tcp-frto-03.txt M. Kojo University of Helsinki January, 2003 Expires: July, 2003 F-RTO: A TCP RTO Recovery Algorithm for Avoiding Unnecessary Retransmissions Status of this Memo This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of [RFC2026]. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. Abstract Spurious retransmission timeouts (RTOs) cause suboptimal TCP performance, because they often result in unnecessary retransmission of the last window of data. This document describes the "Forward RTO Recovery" (F-RTO) algorithm for detecting spurious TCP RTOs. F-RTO is a TCP sender only algorithm that does not require any TCP options to operate. After retransmitting the first unacknowledged segment triggered by an RTO, the F-RTO algorithm at a TCP sender monitors the incoming acknowledgements to determine whether the timeout was spurious and to decide whether to send new segments or retransmit unacknowledged segments. The algorithm effectively helps to avoid additional unnecessary retransmissions and thereby improves TCP performance in case of a spurious timeout. Expires: July 2003 [Page 1] draft-sarolahti-tsvwg-tcp-frto-03.txt January 2003 1. Introduction The TCP protocol [Pos81] has two methods for triggering retransmissions. Primarily, the TCP sender relies on incoming duplicate ACKs, which indicate that the receiver is missing some of the data. After a required amount of successive duplicate ACKs have arrived at the sender, it retransmits the first unacknowledged segment [APS99]. Secondarily, the TCP sender maintains a retransmission timer which triggers retransmission of segments, if they have not been acknowledged within the retransmission timer expiration period. When the retransmission timer expires, the congestion window is initialized to one segment and unacknowledged segments are retransmitted using the slow-start algorithm. The retransmission timer is adjusted dynamically based on the measured round-trip times [PA00]. It has been pointed out that the retransmission timer can expire spuriously and trigger unnecessary retransmissions when no segments have been lost [GL02]. After a spurious RTO the acknowledgements of original segments arrive at the sender, usually triggering unnecessary retransmissions of whole window of segments during the RTO recovery. Furthermore, after a spurious RTO a conventional TCP sender increases the congestion window in slow start, injecting a large number of data segments to the network within one round-trip time. There are a number of potential reasons for spurious RTOs. First, some mobile networking technologies involve sudden delay peaks on transmission because of actions taken during a hand-off. Second, arrival of competing traffic, possibly with higher priority, on a low-bandwidth link or some other change in available bandwidth involves a sudden increase of round-trip time which may trigger a spurious retransmission timeout. A persistently reliable link layer can also cause a sudden delay when several data frames are lost for some reason. This document does not distinguish the different causes of such a delay, but discusses the spurious RTO caused by delay in general. This document describes an alternative RTO recovery algorithm called "Forward RTO-Recovery" (F-RTO) to be used for detecting spurious RTO and thus avoiding unnecessary retransmissions following the RTO. When the RTO is not spurious, the F-RTO algorithm reverts back to the conventional RTO recovery algorithm and should have similar performance. F-RTO does not require any TCP options in its operation, and it can be implemented by modifying only the TCP sender. This is different from alternative algorithms (Eifel [LK00] and DSACK-based algorithms [BA02]) that have been suggested for detecting unnecessary retransmissions. The Eifel algorithm uses TCP timestamps for Expires: July 2003 [Page 2] draft-sarolahti-tsvwg-tcp-frto-03.txt January 2003 detecting a spurious timeout and the DSACK-based algorithms require that the SACK option with DSACK extension [FMMP00] is in use. With DSACK, the TCP receiver can report if it has received a duplicate segment, making it possible for the sender to detect afterwards whether it has made unnecessary retransmissions. When an RTO occurs, the F-RTO sender retransmits the first unacknowledged segment normally. If the next two acknowledgements advance the window, the F-RTO sender continues sending new data and exits the recovery. However, if either of the next two acknowledgements is a duplicate ACK, there was no sufficient evidence of spurious RTO; therefore the F-RTO sender retransmits the unacknowledged segments in slow start similarly to the traditional algorithm. The F-RTO algorithm only attempts to avoid unnecessary retransmissions after a RTO. Eifel can also be used in avoiding unnecessary retransmissions in other events, for example due to packet reordering. This document is organized as follows. Section 2 describes the basic F-RTO algorithm. Section 3 outlines an optional enhancement to the F- RTO algorithm that takes leverage on the TCP Selective Acknowledgment Option [MMFR96] and Section 4 presents an alternative of F-RTO that uses the TCP timestamp option. Section 5 discusses the possible actions to be taken after detecting a spurious RTO, and Section 6 discusses the security considerations. 2. F-RTO Algorithm The F-RTO algorithm affects the TCP sender behavior only after a retransmission timeout. Otherwise the TCP behavior remains unmodified. This section describes a basic version of the F-RTO algorithm that does not require TCP options to work. The actions taken in response to spurious RTO are not described in this document, but we discuss the different alternatives for congestion control in Section 5. When the retransmission timer expires, the F-RTO algorithm takes the following steps at the TCP sender. 1) When RTO expires, the TCP sender SHOULD retransmit first unacknowledged segment. The highest sequence number transmitted so far is stored in variable "send_high". The TCP sender MAY postpone adjusting the congestion control parameters for the next two incoming ACKs, until it has got more input on whether the RTO was spurious or Expires: July 2003 [Page 3] draft-sarolahti-tsvwg-tcp-frto-03.txt January 2003 not. If the TCP sender adjusts the congestion control parameters at this point, it may store the earlier values of the parameters to be able to restore the values when it detects that the RTO was spurious. 2) When the first acknowledgement after the RTO arrives at the sender, the sender chooses the following actions depending on whether the ACK advances the window or whether it is a duplicate ACK. a) If the acknowledgement is a duplicate ACK OR it is acknowledging a sequence number equal or above to the value of send_high, the TCP sender SHOULD revert to the conventional recovery and not enter step 3 of this algorithm. The sender MUST set cwnd to 1 * MSS. This duplicate ACK is triggered by a segment that was sent before the RTO retransmission. This is possible, for example, if the RTO expired during fast recovery while forward transmissions are triggering duplicate ACKs. Furthermore, if a segment retransmitted during fast recovery is lost, it needs to be retransmitted again by retransmission timer. In this case it is also possible that the duplicate ACK is triggered by a new segment transmitted during the fast recovery before the RTO. b) If the acknowledgement advances the window AND it is below the value of send_high, the TCP sender MAY transmit two new (previously unsent) segments. Sending two new segments at this point is equally aggressive to the conventional RTO recovery algorithm, which would have increased its cwnd to 2 * MSS when the first valid ACK arrives after RTO. It is possible that the sender can transmit only one new segment at this time, because the receiver window limits it, or because the TCP sender does not have more data to send. This does not prevent the algorithm from working. In any case, the TCP sender SHOULD transmit at least one segment, either new data or from the retransmission queue. If the sender retransmits the next unacknowledged segment, it MUST NOT enter the step 3 of this algorithm, but continue retransmitting similarly to the conventional RTO recovery algorithm. If the first acknowledgement after RTO does not acknowledge all of the data that was retransmitted in step 1, the TCP sender MUST NOT enter step 3 of this algorithm. Otherwise, a malicious receiver acknowledging partial segments could cause the sender to declare the RTO spurious in a case where data was lost. When receiving an acknowledgement for a partial segment, the TCP Expires: July 2003 [Page 4] draft-sarolahti-tsvwg-tcp-frto-03.txt January 2003 sender SHOULD revert to conventional RTO recovery. 3) When the second acknowledgement after the RTO arrives at the sender, either declare the RTO spurious, or start retransmitting the unacknowledged segments. a) If the acknowledgement is a duplicate ACK, the TCP sender MUST set congestion window to no more than 3 * MSS, and continue with the slow start algorithm retransmitting unacknowledged segments. The duplicate ACK indicates that at least one segment other than the segment which triggered RTO is lost in the last window of data. There is no sufficient evidence that the RTO was spurious. Therefore, the sender proceeds with retransmissions similarly to the conventional RTO recovery algorithm, with the send_high variable stored when the retransmission timer expired to avoid unnecessary fast retransmits. b) If the acknowledgement advances the window and acknowledges data beyond the highest sequence number that was retransmitted on RTO, the TCP sender SHOULD declare the RTO spurious. Because the TCP sender has retransmitted only one segment after the RTO, this acknowledgement indicates that an originally transmitted segment has arrived at the receiver. This is regarded as a strong indication of a spurious RTO. The TCP sender should not assume that the unacknowledged segments are lost, and it should continue by sending new previously unsent segments. If this algorithm branch is taken, the TCP sender SHOULD set the value of send_high variable to SND.UNA in order to disable the Reno "bugfix" [FH99]. The send_high variable was proposed for avoiding unnecessary multiple fast retransmits when RTO expires during fast recovery with NewReno TCP. As the sender has not retransmitted other segments but the one that triggered RTO, the problem addressed by the bugfix cannot occur. Therefore, if there are duplicate ACKs arriving at the sender after the RTO, they are likely to indicate a packet loss, hence fast retransmit should be used to allow efficient recovery. If there are not enough duplicate ACKs arriving at the sender after a packet loss, the retransmission timer expires another time and the sender enters step 1 of this algorithm. If the TCP sender does not have any new data to send in algorithm branch (2b), or the receiver window limits the transmission, the sender SHOULD revert back to retransmitting unacknowledged data Expires: July 2003 [Page 5] draft-sarolahti-tsvwg-tcp-frto-03.txt January 2003 similarly to the regular TCP. The motivation for this is to ensure that the flow of segments into the network does not stop. In the worst case that would result in additional RTO significantly degrading the TCP performance. The TCP sender could try to proceed with the F-RTO algorithm by alternatively transmitting one segment from the tail of the retransmission queue, if it is not possible to transmit new data in algorithm step (2b). Another option would be to transmit data beyond the advertised receiver window. If the RTO was spurious, the receiver is likely to be able to store the segment at the time when it arrives. However, the current recommendation is to revert to the conventional RTO recovery if sending new data is not possible, because we believe the benefits of doing otherwise are not very remarkable. After the RTO is declared spurious, the TCP sender cannot detect if the unnecessary RTO retransmission was lost. In principle the loss of the RTO retransmission should be taken as a congestion signal, and thus there is a small possibility that the F-RTO sender violates the congestion control rules, if it chooses to fully revert congestion control parameters after detecting a spurious RTO. The Eifel detection algorithm has a similar property, but the DSACK option can be used to detect whether the retransmitted segment was successfully delivered to the receiver. The F-RTO algorithm has a side-effect on the TCP round-trip time measurement. Because the TCP sender avoids most of the unnecessary retransmissions after a spurious RTO, the sender is able to take round-trip time samples of the delayed segments. This would not be possible due to retransmission ambiguity, if the regular RTO recovery is used without TCP timestamps. As a result, the RTO estimator is likely have larger values with F-RTO than with the regular TCP after the spurious RTO. We believe this is an advantage in the networks that are prone to delay spikes. It is possible that the F-RTO algorithm does not always avoid unnecessary retransmissions after spurious RTO. If packet reordering or packet duplication occurs on the segment that triggered the spurious RTO, the F-RTO algorithm may not detect the spurious RTO. Additionally, if a spurious RTO occurs during fast recovery, the F- RTO algorithm often cannot detect the spurious RTO. However, we consider these cases relatively rare, and note that in cases where F- RTO fails to detect the spurious RTO, it performs similarly to the regular RTO recovery. Expires: July 2003 [Page 6] draft-sarolahti-tsvwg-tcp-frto-03.txt January 2003 3. A SACK-enhanced version of the F-RTO algorithm This section describes an alternative version of the F-RTO algorithm, that makes use of TCP Selective Acknowledgement Option [MMFR96]. By using the SACK option the TCP sender can detect spurious RTOs in most of the cases when packet reordering or packet duplication is present, or when the TCP sender is under loss recovery. The difference to the basic F-RTO algorithm is that the sender may declare RTO spurious even when duplicate ACKs follow the RTO, if the SACK blocks acknowledge new data that was not transmitted after RTO. DCLOR is a related TCP enhancement that uses SACK option for avoiding unnecessary retransmissions after a spurious RTO [SL02]. However, DCLOR is different from F-RTO in that it does not declare the RTO spurious before all segments outstanding when the RTO occurs have been acknowledged. The SACK-enhanced F-RTO algorithm takes the following steps: 1) When RTO expires, the TCP sender SHOULD retransmit first unacknowledged segment. The TCP sender should also store the highest sequence number transmitted in variable "send_high". 2) The first acknowledgement after RTO arrives at the sender. a) if the cumulative ACK acknowledges all segments up to send_high stored in algorithm step 1, the TCP sender SHOULD revert to the conventional RTO recovery and it MUST set congestion window to no more than 2 * MSS. The sender does not enter step 3 of this algorithm. b) otherwise, the TCP sender MAY transmit two new segments. If the TCP sender does not transmit any previously unsent data, it MUST NOT enter step 3 of this algorithm, but revert to the conventional RTO recovery. 3) The second acknowledgement after RTO arrives at the sender. a) if the ACK acknowledges data above send_high, either in SACK blocks or as a cumulative ACK, the sender MUST set congestion window to no more than 3 * MSS and proceed with slow start, retransmitting unacknowledged segments. The sender SHOULD take this branch also when the acknowledgement is a duplicate ACK and it does not contain any new SACK blocks for previously unacknowledged data below send_high. Expires: July 2003 [Page 7] draft-sarolahti-tsvwg-tcp-frto-03.txt January 2003 b) if the ACK does not acknowledge data above send_high and some previously unacknowledged data below send_high is acknowledged, the TCP sender SHOULD declare the RTO spurious. If there are unacknowledged holes between the received SACK blocks, those segments SHOULD be retransmitted similarly to the conventional SACK recovery algorithm. In addition, send_high should be set to its earlier value, since no loss recovery was needed due to the RTO. As with the basic version of the F-RTO algorithm, in step (2b) the sender may transmit only one segment if the receiver window does not allow more, or there are no more application data. 4. On using the TCP timestamps with F-RTO The basic F-RTO algorithm suggests applying the conventional RTO recovery if the receiver window or application limits the transmission of new previously unsent data, and in such a case it is possible that the F-RTO algorithm cannot be used to detect a spurious RTO. The F-RTO sender can avoid the need of transmitting new previously unsent segments after RTO, if it has TCP timestamps [BBJ92] available. The Eifel detection algorithm [LK00] describes how the TCP timestamps can be used to avoid unnecessary retransmissions after a spurious RTO. However, if the RTO is declared spurious based on the timestamp echoed with the first acceptable ACK following the RTO, the TCP sender may falsely declare the RTO spurious and continue by transmitting new data when the RTO was caused by loss of acknowledgements. The Eifel algorithm may signal spurious RTO falsely, if the first data segment retransmitted after RTO was not lost, but the corresponding acknowledgement was, and the acknowledgement does not include DSACK option [FMMP00]. If sender and receiver implement DSACK, this problem can be avoided. An alternative algorithm for detecting spurious RTOs by using TCP timestamps without DSACK is described below. When TCP timestamps are available, the F-RTO sender MAY apply the following algorithm. 1) When RTO expires, retransmit first unacknowledged segment and store the timestamp of retransmitted segment in variable "RetransmitTS". Store the highest sequence number transmitted so far in variable "send_high". 2) Wait until the first ACK that acknowledges previously unacknowledged data arrives at the sender. If duplicate ACKs arrive, they are processed normally while the sender stays in this step of the algorithm. Expires: July 2003 [Page 8] draft-sarolahti-tsvwg-tcp-frto-03.txt January 2003 a) if the timestamp echoed with the ACK is later or equal than what is stored in "RetransmitTS", the TCP sender SHOULD revert to the conventional RTO recovery and it MUST NOT enter step 3 of this algorithm. The sender should adjust the congestion window according to the standard congestion control rules. b) if the timestamp echoed with the first ACK is earlier than what is stored in "RetransmitTS", the TCP sender SHOULD transmit the first unacknowledged segment and enter step 3 of this algorithm. 3) When the next acknowledgement arrives at the sender, it SHOULD apply one of the following branches of the algorithm. a) if the timestamp echoed with the ACK is later or equal than what is stored in "RetransmitTS", or if the acknowledgement is duplicate ACK, the TCP sender SHOULD revert to the conventional RTO recovery. The TCP sender MUST set the congestion window to no more than 2 * MSS. b) if the timestamp echoed with the ACK is earlier than what is stored in "RetransmitTS", the TCP sender SHOULD declare the RTO spurious. send_high SHOULD be set to the value of SND.UNA to cancel the NewReno bugfix, as described in Section 2. The drawback of this algorithm compared to the original Eifel detection is that the above-presented algorithm can make two unnecessary retransmissions instead of one. In addition, packet reordering, packet duplication, or packet loss for the next segment after the one that triggered RTO may prevent the detection of spurious RTO. Therefore, it may be desirable to apply the basic F- RTO or the SACK-enhanced version of the F-RTO algorithm whenever the sender is able to transmit previously unsent data when the first ACK after RTO arrives. However, we believe the algorithm above effectively avoids false spurious RTO signals. 5. Taking Actions after Detecting Spurious RTO Upon retransmission timeout, a conventional TCP sender assumes that outstanding segments are lost and starts retransmitting the unacknowledged segments. When the RTO is detected to be spurious, the TCP sender should not start retransmitting based on the RTO. For example, if the sender was in congestion avoidance phase transmitting new previously unsent segments, it should continue transmitting previously unsent segments after detecting spurious RTO. In addition, it is suggested that the RTO estimation is reinitialized and the RTO timer is adjusted to a more conservative value in order to avoid Expires: July 2003 [Page 9] draft-sarolahti-tsvwg-tcp-frto-03.txt January 2003 subsequent spurious RTOs [LG02]. Different approaches have been suggested for adjusting the congestion control state after a spurious RTO. This document does not recommend any of the alternatives below, but considers the response to spurious RTO as a subject of further research. 1) Revert the congestion control parameters to the state before the RTO [LG02]. This appears to be a justified decision, because it is similar to the situation in which the RTO did not expire spuriously. However, we identified two concerns in this approach: First, some detection mechanisms, such as F-RTO or the Eifel Detection algorithm, do not notice the loss of the spurious retransmission, thus introducing a small risk of violation of the congestion control principles. Second, a spurious RTO indicates that some part of the network was unable to deliver packets for a while, which can be considered as a potential indication of congestion. 2) Reduce ssthresh and congestion window when detecting a spurious RTO [SKR02]. For example, ssthresh and cwnd could be set to half of their earlier values, as done with the other congestion notification events. This alternative would be conservative enough considering the possibility of not detecting a packet loss of the RTO-triggered retransmission, but the TCP sender should avoid reducing the congestion window more than once in a round-trip time. 3) Reset congestion window to one segment and proceed with slow start, once the pipe is assumed to be empty from earlier packets [SL02]. This would be a justified action to take if the spurious RTO is assumed to be caused due to changes in the network conditions, such as a change in the available bandwidth or a wireless handoff to another point in the network. Disadvantage of this alternative is that it is rather inefficient on a network paths with high delay, and on the other hand, it may result in slow start overshoot. 6. Security Considerations No additional security threats on TCP due to the F-RTO algorithm are known. Acknowledgements Expires: July 2003 [Page 10] draft-sarolahti-tsvwg-tcp-frto-03.txt January 2003 We are grateful to Reiner Ludwig, Andrei Gurtov, Josh Blanton, Mark Allman, Sally Floyd, Yogesh Swami, and Mika Liljeberg for the discussion and feedback contributed to this text. Normative References [APS99] M. Allman, V. Paxson, and W. Stevens. TCP Congestion Con- trol. RFC 2581, April 1999. [MMFR96] M. Mathis, J. Mahdavi, S. Floyd, and A. Romanow. TCP Selec- tive Acknowledgement Options. RFC 2018, October 1996. [PA00] V. Paxson and M. Allman. Computing TCP's Retransmission Timer. RFC 2988, November 2000. [Pos81] J. Postel. Transmission Control Protocol. RFC 793, Septem- ber 1981. Informative References [ABF01] M. Allman, H. Balakrishnan, and S. Floyd. Enhancing TCP's Loss Recovery Using Limited Transmit. RFC 3042, January 2001. [BA02] E. Blanton and M. Allman. On Making TCP more Robust to Packet Reordering. ACM Computer Communication Review, 32(1), January 2002. [BBJ92] D. Borman, R. Braden, and V. Jacobson. TCP Extensions for High Performance. RFC 1323, May 1992. [FH99] S. Floyd and T. Henderson. The NewReno Modification to TCP's Fast Recovery Algorithm. RFC 2582, April 1999. [FMMP00] S. Floyd, J. Mahdavi, M. Mathis, and M. Podolsky. An Exten- sion to the Selective Acknowledgement (SACK) Option to TCP. RFC 2883, July 2000. [GL02] A. Gurtov and R. Ludwig. Evaluating the Eifel Algorithm for TCP in a GPRS Network. In Proc. of European Wireless, Flo- rence, Italy, February 2002 [LG02] R. Ludwig and A. Gurtov. The Eifel Response Algorithm for TCP. Internet draft "draft-ietf-tsvwg-tcp-eifel- response-02.txt". December 2002. Work in progress. [LK00] R. Ludwig and R.H. Katz. The Eifel Algorithm: Making TCP Expires: July 2003 [Page 11] draft-sarolahti-tsvwg-tcp-frto-03.txt January 2003 Robust Against Spurious Retransmissions. ACM Computer Com- munication Review, 30(1), January 2000. [SKR02] P. Sarolahti, M. Kojo, and K. Raatikainen. F-RTO: A New Recovery Algorithm for TCP Retransmission Timeouts. Univer- sity of Helsinki, Dept. of Computer Science. Series of Pub- lications C, No. C-2002-07. February 2002. Available at: http://www.cs.helsinki.fi/research/iwtcp/papers/f-rto.ps [SL02] Y. Swami and K. Le. DCLOR: De-correlated Loss Recovery using SACK option for spurious timeouts. Internet draft "draft-swami-tsvwg-tcp-dclor-00.txt". November 2002. Work in progress. Appendix A: Scenarios This section discusses different scenarios where RTOs occur and how the basic F-RTO algorithm performs in those scenarios. The interesting scenarios are a sudden delay triggering RTO, loss of a retransmitted packet during fast recovery, link outage causing the loss of several packets, and packet reordering. A performance evaluation with a more thorough analysis on a real implementation of F-RTO is given in [SKR02]. A.1. Sudden delay An unexpectedly long delay can trigger an RTO, should it occur on a single packet blocking the following packets, or appear as increased RTTs for several successive packets. The example below illustrates the sequence of packets and acknowledgements seen by the TCP sender that follows the F-RTO algorithm, when a sudden delay occurs triggering RTO but no packets are lost. For simplicity, delayed acknowledgements are not used in the example. ... (cwnd = 6, ssthresh < 6, FlightSize = 5) 1. SEND(10) 2. ACK(6) 3. SEND(11) 4. (set ssthresh <- 3) 5. SEND(6) 6. ACK(7) 7. SEND(12) 8. SEND(13) 9. ACK(8) (set cwnd <- 3, FlightSize = 6) 10. ACK(9) (cwnd = 3, FlightSize = 5) 11. ACK(10) (cwnd = 3, FlightSize = 4) 12. ACK(11) (cwnd = 4, FlightSize = 3) Expires: July 2003 [Page 12] draft-sarolahti-tsvwg-tcp-frto-03.txt January 2003 13. SEND(14) ... When a sudden delay long enough to trigger RTO occurs at step 4, the TCP sender retransmits the first unacknowledged segment (step 5). Because the next ACK advances the cumulative ACK point, the TCP sender continues by sending two new data segments (steps 7, 8) and adjusts cwnd to 3 MSS. Because the second acknowledgement arriving after the RTO also advances the cumulative ACK point, the TCP sender exits the recovery and continues with the congestion avoidance. From this point on the retransmissions are invoked either by fast retransmit or when triggered by the retransmission timer. Because the TCP sender reduces cwnd when receiving the first ACK after RTO and sends the two new data segments at steps 7 and 8, it has to wait until the FlightSize is reduced to the level of congestion window before it can continue transmitting again at step 13. A.2. Loss of a retransmission If a retransmitted segment is lost, the only way to retransmit it again is to wait for the RTO to trigger the retransmission. Once the segment is successfully received, the receiver usually acknowledges several segments cumulatively. The example below shows a scenario where retransmission (of segment 6) is lost, as well as a later segment (segment 9) in the same window. The limited transmit [ABF01] or SACK TCP [MMFR96] enhancements are not in use in this example. ... (cwnd = 6, ssthresh < 6, FlightSize = 5) 1. SEND(10) 2. ACK(6) 3. SEND(11) 4. ACK(6) 5. ACK(6) 6. ACK(6) 7. SEND(6) (set cwnd <- 6, set ssthresh <- 3) 8. ACK(6) 9. (set ssthresh <- 2) 10. SEND(6) 11. ACK(9) 12. SEND(12) 13. SEND(13) 14. ACK(9) (set cwnd <- 3) 15. SEND(9) 16. SEND(10) 17. SEND(11) 18. ACK(11) Expires: July 2003 [Page 13] draft-sarolahti-tsvwg-tcp-frto-03.txt January 2003 ... In the example above, segment 6 is lost and the sender retransmits it after three duplicate ACKs in step 7. However, the retransmission is also lost, and the sender has to wait for the RTO to expire before retransmitting it again. Because the first ACK following the RTO advances the cumulative ACK point (step 11), the sender transmits two new segments. The second ACK in step 14 does not advance the cumulative ACK point, and the sender enters the slow start, sets cwnd to 3 * MSS, and retransmits the next three unacknowledged segments, as per the F-RTO algorithm description given in Section 2. After this the receiver acknowledges all segments transmitted prior to entering recovery and the sender can continue transmitting new data in congestion avoidance. A.3. Link outage A performance study shows that F-RTO performs similarly to the regular recovery when consecutive packets are lost both up- and downstream as a result of link outage, triggering an RTO [SKR02]. If the RTO was not spurious but some data was actually lost, one of the next two ACKs after RTO does not advance the cumulative ACK point when RTO was caused by data loss, because the basic F-RTO retransmits only one segment after RTO. As a result, F-RTO sender continues by retransmitting unacknowledged segments similarly to the conventional RTO recovery. A.4. Packet reordering Since F-RTO modifies the TCP sender behavior only after a retransmission timeout and it is intended to avoid unnecessary retransmits only after spurious RTO, we limit the discussion on the effects of packet reordering in F-RTO behavior to the cases where packet reordering occurs immediately after RTO. We consider the retransmission timeout due to packet reordering to be very rare case, since reordering often triggers fast retransmit due to duplicate ACKs caused by out-of-order segments. Should packet reordering occur after an RTO, duplicate ACKs arrive to the sender, taking the F-RTO algorithm to retransmit in slow start as a regular RTO recovery would do. Although this might not be the correct action, it is similar to the behavior of the regular TCP, making F-RTO a safe modification also in the presence of reordering. Authors' Addresses Pasi Sarolahti Nokia Research Center Expires: July 2003 [Page 14] draft-sarolahti-tsvwg-tcp-frto-03.txt January 2003 P.O. Box 407 FIN-00045 NOKIA GROUP Finland Phone: +358 50 4876607 EMail: pasi.sarolahti@nokia.com http://www.cs.helsinki.fi/u/sarolaht/ Markku Kojo University of Helsinki Department of Computer Science P.O. Box 26 FIN-00014 UNIVERSITY OF HELSINKI Finland Phone: +358 9 1914 4179 EMail: markku.kojo@cs.helsinki.fi Expires: July 2003 [Page 15]