INTERNET-DRAFT                                          Balaji Venkat
                                  HCL-Technologies India Pvt Limited,
Expires June 1999            (HCL-Cisco software development center),
                                                       Chennai, India
                                                        December 1998


          MTU discovery using TCP MSS and Discussion on
                 MSS value in SYN acknowledgment

Status of this memo

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in
   progress".

   To learn the current status of any Internet-Draft, please check the
   "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
   directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ftp.ietf.org (US East Coast), or
   ftp.isi.edu (US West Coast).

   Distribution of this memo is unlimited.

Abstract

   Path MTU discovery as it exists now finds the least MTU of a given
   path. Traceroute using an IP option [3] provides a method for
   finding the MTU on each hop, using an ICMP message sent as a reply
   from the target host, with the output link MTU carried in a portion
   of the message.

   The method proposed in this document intends to find the MTU on
   each hop of an internet path without using the ICMP message for
   traceroute. It aims to achieve the same goal as traceroute using an
   IP option, but by different means. Discovery of the MTU of each
   router on an internet path would serve as a valuable network
   debugging tool. As proposed, the method has the advantage of being
   automatically supported by all routers that support the TCP layer.
   It has two disadvantages: it generates quite a few TCP packets, and
   the time it takes to discover each MTU along the path is quite
   substantial.

   This document specifies the MTU discovery mechanism using the
   existing IP and TCP options and the ICMP message types that exist
   on all routers in the internet that support the TCP layer. This
   method is suggested as an alternative to Traceroute using an IP
   option [3]. The intention is not to obsolete RFC 1393.

   This document also suggests that, by default, a reply SYN packet
   from a target host should include an MSS value that is derived from
   the MTU of a connected network.

Table of contents

   1. Introduction
   2. Path MTU discovery and MTU discovery today
   3. MTU discovery (an alternative)
   4. Leveraging from traceroute
   5. TCP Maximum Segment Size
   6. Basic Algorithm
   7. References
   8. Author's address

Acknowledgements

   This proposal is the author's own idea. The mechanism proposed here
   is a further enhancement of RFC 1191 by Mogul & Deering [1]. It
   utilizes the TCP connection setup and traceroute mechanisms to
   achieve its purpose.

1. Introduction

   When an IP host has a large amount of data to send to a
   destination, the data is transmitted as a series of IP datagrams.
   It is recommended that these datagrams be of the largest size that
   does not require fragmentation anywhere along the path from the
   source to the destination. (For a further analysis of this topic,
   see [1].) This datagram size is referred to as the Path MTU (PMTU),
   and it is equal to the minimum of the MTUs of each hop in the path.

   To discover the MTU of each hop on an internet path, there exists
   the traceroute with IP option mechanism suggested by Malkin [3],
   which makes use of an ICMP message to get the output link MTU. The
   method suggested in this draft offers an alternative to traceroute
   with IP option; it combines the mechanism employed by traceroute
   prior to RFC 1393 [3] with TCP connection setup.

2. Path MTU discovery and MTU discovery today

   The technique as it exists today involves using the Don't Fragment
   (DF) bit in the IP header to dynamically discover the PMTU of a
   path. The basic idea is that a source host initially assumes that
   the PMTU of a path is the (known) MTU of its first hop, and sends
   all datagrams on that path with the DF bit set. If any of the
   datagrams are too large to be forwarded without fragmentation by
   some router along the path, that router will discard them and
   return an ICMP "Datagram Too Big" message as per RFC 1191 [1]. This
   is the ICMP Destination Unreachable message with a code meaning
   "Fragmentation needed and DF set" [2]. On receiving such a message,
   the source host reduces its assumed PMTU for the path.

   The PMTU discovery process ends when the host's estimate of the
   PMTU is low enough that its datagrams can be delivered without
   fragmentation. Or, the host may elect to end the discovery process
   by ceasing to set the DF bit in the datagram headers; it may do so,
   for example, because it is willing to have datagrams fragmented in
   some circumstances. Normally, the host continues to set DF in all
   datagrams, so that if the route changes and the new PMTU is lower,
   it will be discovered.

   As per RFC 1191, if an intermediate router's next hop has an MTU
   lower than the size of the datagram, so that fragmentation would be
   required, the ICMP "Datagram Too Big" message it sends reports the
   MTU of the constricting hop in a previously unused field of the
   ICMP header.
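
   The discovery process described above can be illustrated with a
   small simulation. This is only a sketch under stated assumptions:
   the function name discover_pmtu and the modelling of a path as a
   list of hop MTUs are inventions for illustration, and no packets
   are actually sent.

```python
def discover_pmtu(hop_mtus):
    """Simulate RFC 1191 style Path MTU discovery.

    hop_mtus lists the MTU of each hop in path order (illustrative model).
    Returns the final PMTU estimate and the number of datagrams sent.
    """
    if not hop_mtus:
        raise ValueError("a path needs at least one hop")
    estimate = hop_mtus[0]   # start from the (known) first-hop MTU
    sent = 0
    while True:
        sent += 1            # send a datagram of size `estimate` with DF set
        # The first hop whose MTU is too small discards the datagram and
        # reports its MTU in the ICMP "Datagram Too Big" message.
        reported = next((mtu for mtu in hop_mtus if mtu < estimate), None)
        if reported is None:
            return estimate, sent   # delivered without fragmentation
        estimate = reported


# Hypothetical path: four hops with these MTUs, in order.
pmtu, sent = discover_pmtu([1500, 1006, 296, 1500])
print(pmtu, sent)   # 296 3
```

   In this hypothetical run the source learns the values 1006 and 296
   on its way down to the PMTU, but it never learns the MTU of the
   last hop, since that hop is not a constriction.
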

   This method provides the Path MTU and nothing more; it does not
   report the MTU of each intervening hop in the path.

   MTU discovery today involves using the ICMP "Traceroute" message to
   discover the MTU of each intermediate hop in an internet path.
   Setting an appropriate IP option (section 2.2 of Malkin [3]) and
   sending the datagram to the target hop achieves this, and prompts
   the target hop to send the ICMP "Traceroute" message with the
   output link MTU.

3. MTU discovery (an alternative)

   The mechanism proposed in this draft intends to find the MTU of
   each intervening hop in a given path. This information would be
   provided using a technique that is a combination of traceroute
   prior to RFC 1393 and TCP connection setup. The MTU discovery
   mechanism would gather the MTU of each hop on an internet path and
   provide it to the user of this mechanism.

4. Leveraging from traceroute

   This utility would leverage traceroute, as it existed prior to RFC
   1393, to find the intermediate hops to a destination on a given
   internet path. Traceroute's algorithm would be required for that
   very purpose. This would be done as specified by RFC 792, using the
   TTL field in the IP header [2]. This method does not intend to use
   the traceroute using IP option mechanism suggested by Malkin [3].
   In fact, it intends to provide an alternative mechanism for
   discovering the MTU on each hop of an internet path.

5. TCP Maximum Segment Size

   The other mechanism in this alternative method, which follows up on
   what is done by traceroute, is the initial packet exchange during
   TCP connection setup.

   The maximum segment size (MSS) is the largest chunk of data that
   TCP will send to the other end. When a connection is established,
   each end can announce its MSS. The resulting IP datagram is
   normally 40 bytes larger: 20 bytes for the TCP header and 20 bytes
   for the IP header.
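
   The relation between MSS and datagram size can be written out as a
   short sketch. The helper names mss_from_mtu and datagram_size are
   hypothetical, and the arithmetic assumes fixed 20-byte IP and TCP
   headers with no options.

```python
IP_HEADER_BYTES = 20    # fixed IP header, no options (assumption)
TCP_HEADER_BYTES = 20   # fixed TCP header, no options (assumption)
OVERHEAD = IP_HEADER_BYTES + TCP_HEADER_BYTES


def mss_from_mtu(mtu):
    """Largest MSS that fits in one unfragmented datagram of size mtu."""
    return mtu - OVERHEAD


def datagram_size(mss):
    """Size of the IP datagram that carries a full segment of mss bytes."""
    return mss + OVERHEAD


print(mss_from_mtu(1500))   # 1460
print(datagram_size(536))   # 576
```
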

   When a connection is established, each end has the option of
   announcing the MSS it expects to receive. The SYN segment sent in
   the TCP connection setup contains the MSS option. If one end does
   not receive an MSS from the other end, a default of 536 bytes is
   assumed. Thus if the other end sends 536 as its MSS, the MTU
   calculated from it would correspondingly be 576 bytes (536 + 40)
   and nothing more.

   When TCP sends a SYN segment, either because a local application
   wants to initiate a connection or because a connection request has
   been received from another host, it can send an MSS value up to the
   outgoing interface's MTU minus the size of the fixed TCP and IP
   headers. For an Ethernet this implies an MSS of up to 1460 bytes.

   The destination to which the connection is intended MAY then
   announce its MSS value in the acknowledgement for the SYN. This is
   a method discussed by Mogul & Deering [1].

   We then have to argue for the target host using MTU - (40 bytes +
   overhead) as the MSS value in its replying SYN segment.

   Limiting the MSS value to the minimum of the default MSS (536) and
   the value derived from the MTU of the connected network would
   needlessly limit segments to 536 bytes whenever the least MTU along
   the entire path is greater than 576. Why limit the size of the
   segment to a value lower than what can be transmitted without
   fragmentation? The suggestion is therefore to always return the
   MTU-derived value of the MSS to the connection-seeking host.

   This suggestion is based on section 3 of RFC 1191 [1], which
   states:

      "Actually, many implementations always send an MSS option, but
      set the value to 536 if the destination is non-local. This
      behaviour was correct when the internet was full of hosts that
      did not follow the rule that datagrams larger than 576 octets
      should not be sent to non-local destinations.
      Now that most hosts do follow this rule, it is unnecessary to
      limit the value in the TCP MSS option to 536 for non-local
      peers. Moreover, doing this prevents PMTU discovery from
      discovering PMTUs larger than 576, so hosts SHOULD no longer
      lower the value they send in the MSS option. The MSS option
      should be 40 octets less than the size of the largest datagram
      the host is able to reassemble (MMS_R, as defined in [1]); in
      many cases, this will be the architectural limit of 65495
      (65535 - 40) octets. A host MAY send an MSS value derived from
      the MTU of its connected network (the maximum MTU over its
      connected networks, for a multi-homed host); this should not
      cause problems for PMTU discovery, and may dissuade a broken
      peer from sending enormous datagrams."

   Thus a more effective method would be to calculate the MSS value
   set in the MSS option of the SYN segment as the minimum of the
   MTU-derived MSS and a default MSS, where the default MSS would be
   40 octets less than the largest datagram the host can reassemble.
   There is a problem here, however: as Mogul & Deering [1] note, a
   value of 65495 could quite possibly tickle sign-bit bugs in some IP
   implementations.

   +----------+                 +-----------+   MTU=1500   +-----------+
   |  host A  |-----------------|  host B   |--------------|  host C   |
   +----------+MTU=296   MTU=296+-----------+   MTU=1500   +-----------+

      SYN <--------------------------------------------------------
      SYN -------------------------------------------------------->

                        Fig 1.0 SYN with MSS

   Consider three hosts A, B and C connected in the manner shown in
   Fig 1.0. Say host C wants to initiate a TCP connection with host A.
   The MTUs of the various networks are as shown. The SYN from host C
   is sent with an MSS value of 1460, which is the MTU of 1500 minus
   40 bytes. In reply, the host A stack responds with an MSS of 256,
   which is the MTU of the outgoing interface on host A (296) minus 40
   bytes for the TCP and IP headers.
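
   How the MSS announced by host A could be read from a replying SYN
   and turned back into an MTU is sketched below. The option layout
   used (kind 2, length 4, 16-bit value in network byte order) is the
   standard TCP MSS option format from RFC 793; the helper name
   parse_mss and the hand-built option bytes are assumptions made for
   illustration.

```python
import struct


def parse_mss(options: bytes):
    """Return the MSS from raw TCP option bytes, or None if absent."""
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:            # End of Option List
            break
        if kind == 1:            # No-Operation, a single byte
            i += 1
            continue
        if i + 1 >= len(options):
            break                # truncated option
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break                # malformed option
        if kind == 2 and length == 4:
            return struct.unpack("!H", options[i + 2:i + 4])[0]
        i += length              # skip any other option
    return None


# Option bytes as host A in Fig 1.0 might send them (built by hand here).
syn_reply_options = struct.pack("!BBH", 2, 4, 256)
mss = parse_mss(syn_reply_options)
print(mss, mss + 40)   # 256 296
```
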

   This mechanism offers a way to obtain the MTU of the interface on
   each of the hops in an internet path. Combining it with
   traceroute's mechanism for identifying the intermediate hosts makes
   it possible to discover the MTU of each hop in an internet path.

6. Basic Algorithm

   The basic algorithm for identifying the MTU on each hop would be to
   traceroute the intermediate hops to a given destination. These hops
   are stored, and a connection is then initiated iteratively to the
   telnet port of each of them, with an MSS value of 65535 in the
   outgoing SYN segment.

   When the connection is initiated, the SYN packet carries the
   source's MSS value, and the hop whose MTU is to be discovered
   replies with its own MSS value. The MTU of that hop's interface is
   then obtained by adding 40 to the value returned in the MSS option
   of the TCP header. Iterating through the list of hops yields the
   MTU of each hop.

   Once the MTU is computed, a FIN packet would be sent to the telnet
   port of the target hop, and the connection closed with the
   appropriate packets exchanged for connection closure.

   For hops that do not support the TCP layer in their stack
   implementation, the source would either time out (if the hop does
   not return an ICMP Unreachable error) or receive an ICMP
   Unreachable message from the hop's IP layer. In either case a
   default value of 576 would be assumed as the MTU for that hop. Thus
   a returned value of 576 bytes would denote that MTU discovery on
   that hop did not work.

   If the telnet port on a target hop is not available, or if telnet
   is not supported on that hop, the discovery could try alternate
   ports of the kind that are available by default on most routers.

   A certain amount of overhead is expected in terms of TCP packet
   exchanges every time a connection is set up and torn down to find
   the MTU.
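
   The steps of the basic algorithm can be sketched as follows. This
   is an illustration under stated assumptions, not an implementation
   of the draft: discover_hop_mtus and probe_mss are hypothetical
   names, the probing of a hop (sending the SYN, reading the MSS from
   the reply and closing the connection) is left to a caller-supplied
   function, and the choice of port 80 as an alternate port is an
   assumption for the example only.

```python
DEFAULT_MTU = 576      # assumed when a hop gives no usable TCP reply
HEADER_BYTES = 40      # 20-byte IP header + 20-byte TCP header
TELNET_PORT = 23


def discover_hop_mtus(hops, probe_mss, ports=(TELNET_PORT,)):
    """Map each hop to an MTU derived from the MSS in its SYN reply.

    hops      -- hop addresses in path order, e.g. from a TTL-based
                 traceroute
    probe_mss -- hypothetical callable (hop, port) -> MSS announced in
                 the hop's replying SYN, or None on timeout or ICMP
                 Unreachable
    ports     -- ports to try in order: telnet first, then alternates
    """
    results = {}
    for hop in hops:
        mtu = DEFAULT_MTU                 # fallback if every probe fails
        for port in ports:
            mss = probe_mss(hop, port)
            if mss is not None:
                mtu = mss + HEADER_BYTES  # MTU = returned MSS + 40
                break
        results[hop] = mtu
    return results


# Stand-in for real probing: canned replies keyed by (hop, port).
fake_replies = {("192.0.2.1", 23): 1460, ("192.0.2.2", 80): 256}


def fake_probe(hop, port):
    return fake_replies.get((hop, port))


hops = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
print(discover_hop_mtus(hops, fake_probe, ports=(23, 80)))
# {'192.0.2.1': 1500, '192.0.2.2': 296, '192.0.2.3': 576}
```

   In this example the second hop answers only on the alternate port,
   and the third hop never answers, so it is reported with the default
   of 576.
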
   These per-hop connection setups and teardowns are the price to be
   paid for obtaining the MTU of each intermediate hop along the way
   to a given destination.

7. References

   [1] J. Mogul, S. Deering, "Path MTU discovery", RFC 1191, DECWRL
       and Stanford University, November 1990.

   [2] J. Postel, "Internet Control Message Protocol", RFC 792, SRI
       Network Information Center, September 1981.

   [3] G. Malkin, "Traceroute using an IP option", RFC 1393, Xylogics,
       Inc., January 1993.

8. Author's address

   V. Balaji Venkat
   HCL-Technologies India Pvt Limited,
   (HCL-Cisco software development center),
   49/50 Nelson Manickam road,
   Chennai - 600 029
   Tamil Nadu, India.

   Phone : 091-44-481 9938
   Fax   : 091-44-481 9939
   Email : bvenkat@cisco.com