Internet Engineering Task Force  Inter-Domain Multicast Routing Working Group
INTERNET-DRAFT                                                      W. Fenner
draft-ietf-idmr-traceroute-ipm-01.txt                              Xerox PARC
                                                                    S. Casner
                                                             Precept Software
                                                            November 26, 1996
                                                             Expires: 3/31/97

               A "traceroute" facility for IP Multicast.

Status of this Memo

This document is an Internet Draft. Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its Areas, and
its Working Groups. Note that other groups may also distribute working
documents as Internet Drafts.

Internet Drafts are draft documents valid for a maximum of six months.
Internet Drafts may be updated, replaced, or obsoleted by other
documents at any time. It is not appropriate to use Internet Drafts as
reference material or to cite them other than as a "working draft" or
"work in progress."

To learn the current status of any Internet-Draft, please check the
"1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).

Distribution of this document is unlimited.

Abstract

This draft describes the IGMP multicast traceroute facility. As the
deployment of IP multicast has spread, it has become clear that a
method for tracing the route that a multicast IP packet takes from a
source to a particular receiver is absolutely required. Unlike unicast
traceroute, multicast traceroute requires a special packet type and
implementation on the part of routers. This specification describes the
required functionality.

This document is a product of the Inter-Domain Multicast Routing working
group within the Internet Engineering Task Force. Comments are solicited
and should be addressed to the working group's mailing list at
idmr@cs.ucl.ac.uk and/or the author(s).

Casner, Fenner             Expires March 1997                   [Page 1]

Internet Draft    draft-ietf-idmr-traceroute-ipm-01.txt    November 1996

1.  Introduction

The unicast "traceroute" program allows the tracing of a path from one
machine to another, using mechanisms that already existed in IP.
Unfortunately, no such existing mechanisms can be applied to IP
multicast paths. The key mechanism for unicast traceroute is the ICMP
TTL exceeded message, which is specifically precluded as a response to
multicast packets. Thus, we specify the multicast "traceroute" facility
to be implemented in multicast routers and accessed by diagnostic
programs. While it is a disadvantage that a new mechanism is required,
the multicast traceroute facility can provide additional information
about packet rates and losses that the unicast traceroute cannot, and
generally requires fewer packets to be sent.

Goals:

+    To be able to trace the path that a packet would take from some
     source to some destination.

+    To be able to isolate packet loss problems (e.g., congestion).

+    To be able to isolate configuration problems (e.g., TTL threshold).

+    To minimize packets sent (e.g. no flooding, no implosion).

2.  Overview

Tracing from a source to a multicast destination is hard, since you
don't know down which branch of the multicast tree the destination lies.
This means that you have to flood the whole tree to find the path from
one source to one destination. However, walking up the tree from
destination to source is easy, as all existing multicast routing
protocols know the previous hop for each source. Tracing from
destination to source can involve only routers on the direct path.

The party requesting the traceroute (which need be neither the source
nor the destination) sends a traceroute Query packet to the last-hop
multicast router for the given destination. The last-hop router turns
the Query into a Request packet by adding a response data block
containing its interface addresses and packet statistics, and then
forwards the Request packet via unicast to the router that it believes
is the proper previous hop for the given source.
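
The upstream walk described above can be pictured with a small sketch.
Everything in it is an illustrative assumption rather than part of this
specification: the router names, the PREVIOUS_HOP table standing in for
each router's routing state, and the trace_upstream helper. It is only
meant to show why a destination-to-source trace touches just the routers
on the direct path.

```python
# Hypothetical topology: each router's previous hop toward one source.
# None marks the first-hop router (the source is directly connected).
PREVIOUS_HOP = {
    "last-hop": "mid-1",
    "mid-1": "mid-2",
    "mid-2": "first-hop",
    "first-hop": None,
    "off-path": "mid-1",  # another branch of the tree; never visited
}


def trace_upstream(last_hop, max_hops):
    """Return the routers visited walking from last_hop toward the source."""
    visited = []
    router = last_hop
    while router is not None and len(visited) < max_hops:
        visited.append(router)
        # Each router only needs to know its own previous hop.
        router = PREVIOUS_HOP.get(router)
    return visited


path = trace_upstream("last-hop", max_hops=32)
print(" <- ".join(path))
```

Lowering max_hops in this sketch cuts the walk short, which is loosely
analogous to limiting the number of hops requested in a trace.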

Each hop adds its response data to the end of the Request packet, then
unicast forwards it to the previous hop. The first hop router (the
router that believes that packets from the source originate on one of
its directly connected networks) changes the packet type to indicate a
Response packet and sends the completed response to the response
destination address. The response may be returned before reaching the
first hop router if a fatal error condition such as "no route" is
encountered along the path.

3.  Multicast Traceroute header

The header for all multicast traceroute packets is as follows:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   IGMP Type   |    # hops     |           checksum            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Multicast Group Address                    |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                        Source Address                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Destination Address                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Response Address                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   resp ttl    |                   Query ID                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3.1.  IGMP Type: 8 bits

     The IGMP type field is defined to be 0x1F for traceroute queries
     and requests. The IGMP type field is changed to 0x1E when the
     packet is completed and sent as a response from the first hop
     router to the querier. Two codes are required so that multicast
     routers won't attempt to process a completed response in those
     cases where the initial query was issued from a router or the
     response is sent via multicast.

3.2.  # hops: 8 bits

     This field specifies the maximum number of hops that the requester
     wants to trace.
     If there is some error condition in the middle of the path that
     keeps the traceroute request from reaching the first-hop router,
     this field can be used to perform an expanding-length search to
     trace the path to just before the problem.

3.3.  Checksum: 16 bits

     This is the standard IGMP checksum.

3.4.  Group address

     This field specifies the group address to be traced, or zero if no
     group-specific information is desired. Note that
     non-group-specific traceroutes may not be possible with certain
     multicast routing protocols.

3.5.  Source address

     This field specifies the IP address of the multicast source for the
     path being traced. The traceroute request proceeds hop-by-hop from
     the intended multicast receiver towards this source.

3.6.  Destination address

     This field specifies the IP address of the multicast receiver for
     the path being traced. The trace starts at this destination and
     proceeds toward the source.

3.7.  Response Address

     This field specifies where the completed traceroute response packet
     gets sent. It can be a unicast address or a multicast address, as
     explained in section 6.2.

3.8.  resp ttl: 8 bits

     This field specifies the TTL at which to multicast the response, if
     the response address is a multicast address.

3.9.  Query ID: 24 bits

     This field is used as a unique identifier for this traceroute
     request so that duplicate or delayed responses may be detected and
     to minimize collisions when a multicast response address is used.

4.  Response data

Each router adds a "response data" segment to the traceroute packet
before it forwards it on.
The response data looks like this:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Query Arrival Time                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  Incoming Interface Address                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  Outgoing Interface Address                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  Previous-Hop Router Address                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Input packet count on incoming interface            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Output packet count on outgoing interface           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      Total number of packets for this source-group pair       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Rtg Protocol  |    FwdTTL     |MBZ| Src Mask  | ForwardingErr |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

4.1.  Query Arrival Time

     The Query Arrival Time is a 32-bit NTP timestamp specifying the
     arrival time of the traceroute request packet at this router. The
     32-bit form of an NTP timestamp consists of the middle 32 bits of
     the full 64-bit form; that is, the low 16 bits of the integer part
     and the high 16 bits of the fractional part.

4.2.  Incoming Interface Address

     This field specifies the address of the interface on which packets
     from this source are expected to arrive, or 0 if unknown.

4.3.  Outgoing Interface Address

     This field specifies the address of the interface on which packets
     from this source flow to the specified destination, or 0 if
     unknown.

4.4.  Previous-Hop Router Address

     This field specifies the router from which this router expects
     packets from this source, or 0 if unknown.

4.5.  Input packet count on incoming interface

     This field contains the number of multicast packets received for
     all groups and sources on the incoming interface, or 0xffffffff if
     no count can be reported.

4.6.  Output packet count on outgoing interface

     This field contains the number of multicast packets that have been
     transmitted for all groups and sources on the outgoing interface,
     or 0xffffffff if no count can be reported.

4.7.  Total number of packets for this source-group pair

     This field counts the number of packets from the specified source
     forwarded by this router to the specified group, or 0xffffffff if
     no count can be reported.

4.8.  Rtg Protocol: 8 bits

     This field describes the routing protocol in use between this
     router and the previous-hop router. Specified values include:

     1 - DVMRP
     2 - MOSPF
     3 - PIM
     4 - CBT
     5 - PIM using special routing table
     6 - PIM using a static route
     7 - DVMRP using a static route

4.9.  FwdTTL: 8 bits

     This field contains the TTL that a packet is required to have
     before it will be forwarded over the outgoing interface.

4.10.  Src Mask: 6 bits

     This field contains the number of 1's in the netmask this router
     has for the source (i.e. a value of 24 means the netmask is
     0xffffff00).

4.11.  ForwardingErr: 8 bits

     This field contains a forwarding error code. Specified values
     include:

     0x00  No error
     0x01  Traceroute request arrived on an interface to which this
           router would not forward for this source,group,destination.
     0x02  This router has sent a prune upstream for the group.
     0x03  This router has stopped forwarding in response to a request
           from the next hop router.
     0x04  The group is subject to administrative scoping at this hop.
     0x05  This router has no route for the source.
     0x07  This router is not forwarding this source,group for an
           unspecified reason.
     0x08  Reached Rendezvous Point or Core
     0x09  Traceroute request arrived on the expected RPF interface for
           this source,group.
     0x0A  Traceroute request arrived on an interface which is not
           enabled for multicast.
     0x81  There was not enough room to insert another response data
           block in the packet.
     0x82  The next hop router does not understand traceroute requests.
     0x83  Traceroute is administratively prohibited.

     Note that if a router discovers there is not enough room in a
     packet to insert its response, it puts the 0x81 error code in the
     previous router's ForwardingErr field, overwriting any error the
     previous router placed there. It is expected that a multicast
     traceroute client, upon receiving this error, will restart the
     trace at the last hop listed in the packet.

     The 0x80 bit of the ForwardingErr code is used to indicate a fatal
     error. A fatal error is one where the router may know the previous
     hop but cannot forward the message to it.

5.  Router Behavior

All of these actions are performed in addition to (NOT instead of)
forwarding the packet, if applicable. E.g. a multicast packet that has
TTL remaining MUST still get forwarded.

5.1.  Traceroute Query

     Upon receiving a traceroute Query message (a request with no
     response blocks filled in), a router must examine the traceroute
     request to see if it is the proper last-hop router for the
     destination address in the packet. It is the proper last-hop
     router if it has a multicast-capable interface on the same subnet
     as the Destination Address and is the router that would forward
     traffic from the given source onto that subnet. It is also the
     proper last-hop router if the Destination Address is the address of
     one of its interfaces and either it is the router that would
     forward traffic from the given source onto that subnet or there is
     no other router on that subnet.
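
The two last-hop rules above can be sketched as a small predicate. This
is a minimal illustration under stated assumptions, not router code: the
Interface record, its forwards_source and other_routers flags (which
stand in for whatever routing and neighbor state a real router keeps),
and the is_proper_last_hop name are all hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Interface:
    """Hypothetical summary of one router interface (not from the draft)."""
    address: str             # interface address, e.g. "10.1.1.1"
    network: str             # attached subnet, e.g. "10.1.1.0/24"
    multicast_capable: bool
    forwards_source: bool    # router forwards the source onto this subnet
    other_routers: bool      # another router is present on this subnet


def is_proper_last_hop(interfaces, destination):
    """Apply the two last-hop rules of section 5.1 to a Destination Address."""
    dest = ipaddress.ip_address(destination)
    for ifc in interfaces:
        # Second rule: the destination is one of this router's own
        # interface addresses, and the router either forwards the source
        # onto that subnet or is the only router there.
        if dest == ipaddress.ip_address(ifc.address):
            if ifc.forwards_source or not ifc.other_routers:
                return True
        # First rule: the destination is on a multicast-capable attached
        # subnet onto which this router forwards the source's traffic.
        if (dest in ipaddress.ip_network(ifc.network)
                and ifc.multicast_capable and ifc.forwards_source):
            return True
    return False
```

For example, with one interface that forwards the source onto
10.1.1.0/24, a Destination Address of 10.1.1.77 satisfies the first
rule, while a host on a subnet the router does not forward onto
satisfies neither.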

     A router may receive a traceroute Query message via either unicast
     or multicast. If the Query is received via multicast and the
     router determines that it is not the proper last-hop router, the
     packet should be silently dropped. If the Query is received via
     unicast and the router determines that it is not the proper
     last-hop router, a response block with an error code of 0x01 must
     be inserted and the response forwarded to the response address as
     described in section 5.4. If the router knows which router is the
     correct last-hop router, it puts that router's address in the
     "Previous-Hop Router Address" field of the response.

     When a router receives a traceroute request with no response blocks
     and it determines that it is the proper last-hop router, it inserts
     a response block and forwards the traceroute request towards the
     router that it expects to be the previous hop for this source and
     group (or, if no group is specified, the previous hop for this
     source).

5.2.  Traceroute Request

     When a router receives a traceroute request with some number of
     response blocks filled in, it first checks the interface from which
     it received the traceroute request. If the reception interface is
     not one to which the router would forward data from the source, an
     error code of 0x01 is noted and processing continues. If the
     reception interface is the interface from which the router would
     expect data to arrive from the source, an error code of 0x09 is
     noted and processing continues.

     If a router receives a traceroute Request with some number of
     response blocks filled in and the packet destination is a multicast
     address, it must silently drop the packet.

     If a router has no way to determine a route for the source, an
     error code of 0x05 is noted and processing continues. The router
     fills in as many fields as possible in the response packet, and
     then forwards the packet on or returns it to the requester.
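
The arrival checks for a Request can be sketched as follows. This is a
hedged illustration, not a definitive implementation: the constant names
and the note_request_error helper are hypothetical, and the order in
which the checks are applied is an assumption, since the text above does
not say which code wins when more than one condition holds. The sketch
lets the RPF-interface code take precedence over the wrong-interface
code, on the reading that the later "noted" code replaces the earlier
one.

```python
# Forwarding error codes taken from section 4.11; the names are made up.
NO_ERROR = 0x00
WRONG_IF = 0x01   # arrived on an interface the router would not forward to
NO_ROUTE = 0x05   # router has no route for the source
RPF_IF = 0x09     # arrived on the expected RPF interface


def note_request_error(rx_if, rpf_if, forwarding_ifs, dst_is_multicast):
    """Return the error code to note, or None to silently drop the packet.

    rx_if            interface the Request arrived on
    rpf_if           interface data from the source is expected on,
                     or None if the router has no route for the source
    forwarding_ifs   interfaces the router forwards the source's data to
    dst_is_multicast True if the Request's IP destination is multicast
    """
    if dst_is_multicast:
        return None          # Requests sent to a multicast address are dropped
    if rpf_if is None:
        return NO_ROUTE      # nothing else can be checked without a route
    if rx_if == rpf_if:
        return RPF_IF
    if rx_if not in forwarding_ifs:
        return WRONG_IF
    return NO_ERROR
```

In every case except the silent drop, processing continues after the
code is noted: the router still fills in its response block.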

     If the Previous-hop router is known for the source and group (or,
     if no group is specified, the previous-hop router for the source)
     and the number of response blocks is less than the number
     requested, the packet is forwarded to that router. Otherwise, it
     is sent to the Response Address in the header, with the indicated
     TTL if the Response Address is a multicast address.

5.3.  Traceroute response

     A router must forward all traceroute response packets normally,
     with no special processing.

5.4.  Sending Traceroute Responses

5.4.1.  Destination Address

     A traceroute response must be sent to the Response Address in the
     traceroute header.

5.4.2.  TTL

     If the Response Address is unicast, the router inserts its normal
     unicast TTL in the IP header. If the Response Address is
     multicast, the router copies the Response TTL from the traceroute
     header into the IP header.

5.4.3.  Source Address

     If the Response Address is unicast, the router may use any of its
     interface addresses as the source address, preferring globally
     routable addresses. If the Response Address is multicast, the
     router MUST use a globally routable source address, if it has one.
     If the router does not have a globally routable address attached to
     any interface, then it SHOULD NOT try to send a multicast response.

5.4.4.  Sourcing Multicast Responses

     When a router sources a multicast response, the response packet
     MUST be forwarded as if it were received on the outgoing interface.

6.  Using multicast traceroute

Several problems may arise when attempting to use multicast traceroute.

6.1.  Last hop router

     The traceroute querier may not know which is the last hop router,
     or that router may be behind a firewall that blocks unicast packets
     but passes multicast packets.
     In these cases, the traceroute request should be multicast to the
     group being traced (since the last hop router listens to that
     group). All routers except the correct last hop router should
     ignore any multicast traceroute request received via multicast.
     Traceroute requests which are multicast to the group being traced
     must include the Router Alert IP option [Katz96].

     If the traceroute querier is attached to the same router as the
     destination of the request, the traceroute request may be multicast
     to 224.0.0.2 (ALL-ROUTERS.MCAST.NET) if the last-hop router is not
     known.

6.2.  First hop router

     The traceroute querier may not be unicast reachable from the first
     hop router. In this case, the querier should set the traceroute
     response address to a multicast address, and should set the
     response TTL to a value sufficient for the response from the first
     hop router to reach the querier. It may be appropriate to start
     with a small TTL and increase in subsequent attempts until a
     sufficient TTL is reached, up to an appropriate maximum (such as
     192).

     The IANA has assigned 224.0.1.32, MTRACE.MCAST.NET, as the default
     multicast group for multicast traceroute responses. Other groups
     may be used if needed, e.g. when using mtrace to diagnose problems
     with the IANA-assigned group.

6.3.  Broken intermediate router

     A broken intermediate router might simply not understand traceroute
     packets, and drop them. The querier would then get no response at
     all from its traceroute requests. It should then perform a
     hop-by-hop search by varying the # hops field until it gets a
     response (both linear and binary search are options, but binary is
     likely to be slower because a failure requires waiting for a
     timeout).

6.4.  Trace termination

     When performing an expanding hop-by-hop trace, it is necessary to
     determine when to stop expanding.

6.4.1.  Arriving at source

     A trace can be determined to have arrived at the source if the last
     router in the trace has an interface on the same subnet as the
     source. (***BAD HEURISTIC***! A router might have secondary
     subnets attached to it but not have an address on any of those
     subnets)

6.4.2.  Fatal Error

     A trace has encountered a fatal error if the last Forwarding Error
     in the trace has the 0x80 bit set.

6.4.3.  No Previous Hop

     A trace cannot continue if the last Previous Hop in the trace is
     set to 0.

7.  Problem Diagnosis

7.1.  Forwarding Inconsistencies

     The forwarding error code can tell if a group is unexpectedly
     pruned or administratively scoped.

7.2.  TTL problems

     By taking the maximum of (hops from source + forwarding TTL
     threshold) over all hops, you can discover the TTL required for the
     source to reach the destination.

7.3.  Congestion

     By taking two traces, you can find packet loss information by
     comparing the difference in input packet counts to the difference
     in output packet counts at the previous hop. On a point-to-point
     link, any difference in these numbers implies packet loss. Since
     the packet counts may be changing as the trace query is
     propagating, there may be small errors (off by 1 or 2) in these
     statistics. However, these errors will not accumulate if multiple
     traces are taken to expand the measurement period. On a shared
     link, the count of input packets can be larger than the number of
     output packets at the previous hop, due to other routers or hosts
     on the link injecting packets. This appears as "negative loss"
     which may mask real packet loss.

     In addition to the counts of input and output packets for all
     multicast traffic on the interfaces, the response data includes a
     count of the packets forwarded by a node for the specified
     source-group pair.
     Taking the difference in this count between two traces and then
     comparing those differences between two hops gives a measure of
     packet loss just for traffic from the specified source to the
     specified receiver via the specified group. This measure is not
     affected by shared links.

     On a point-to-point link that is a multicast tunnel, packet loss is
     usually due to congestion in unicast routers along the path of that
     tunnel. On native multicast links, loss is more likely in the
     output queue of one hop, perhaps due to priority dropping, or in
     the input queue at the next hop. The counters in the response data
     do not allow these cases to be distinguished. Differences in
     packet counts between the incoming and outgoing interfaces on one
     node cannot generally be used to measure queue overflow in the node
     because some packets may be routed only to or from other interfaces
     on that node.

     In the multicast extensions for SunOS 4.1.x from Xerox PARC, both
     the output packet count and the packet forwarding count for the
     source-group pair are incremented before priority dropping for rate
     limiting occurs and before the packets are put onto the interface
     output queue which may overflow. These drops will appear as
     (positive) loss on the link even though they occur within the
     router.

     In release 3.3/3.4 of the UNIX multicast extensions, a multicast
     packet generated on a router will be counted as having come in an
     interface even though it did not. This can create the appearance
     of negative loss even on a point-to-point link.

     In releases up through 3.5/3.6, packets were not counted as input
     on an interface if the reverse-path forwarding check decided that
     the packets should be dropped. That causes the packets to appear
     as lost on the link if they were output by the upstream hop.
     This situation can arise when two routers on the path for the group
     being traced are connected by a shared link, and the path for some
     other group does not flow between those two routers because the
     downstream router receives packets for the other group on another
     interface, but the upstream router is the elected forwarder to
     other routers or hosts on the shared link.

7.4.  Link Utilization

     Again, with two traces, you can divide the difference in the input
     or output packet counts at some hop by the difference in time
     stamps from the same hop to obtain the packet rate over the link.
     If the average packet size is known, then the link utilization can
     also be estimated to see whether packet loss may be due to the rate
     limit or the physical capacity on a particular link being exceeded.

7.5.  Time delay

     If the routers have synchronized clocks, it is possible to estimate
     propagation and queueing delay from the differences between the
     timestamps at successive hops.

8.  Acknowledgments

This specification started largely as a transcription of Van Jacobson's
slides from the 30th IETF, and the implementation in mrouted 3.3 by Ajit
Thyagarajan. Van's original slides credit Steve Casner, Steve Deering,
Dino Farinacci and Deb Agrawal. A multicast traceroute client, mtrace,
has been implemented by Ajit Thyagarajan, Steve Casner and Bill Fenner.

9.  Security Considerations

Security issues are not discussed in this memo.

10.  References

Katz96   Katz, D., "IP Router Alert Option," RFC XXXX, Cisco Systems,
         April 1996.

11.  Authors' Addresses

     William C. Fenner
     Xerox PARC
     3333 Coyote Hill Road
     Palo Alto, CA 94304
     Phone: +1 415 812 4816
     Email: fenner@parc.xerox.com

     Stephen L. Casner
     Precept Software, Inc.
     21580 Stevens Creek Blvd, Suite 207
     Cupertino, CA 95014
     Email: casner@precept.com