Internet Draft                                             David Allan
Document: draft-allan-mpls-a-bit-00.txt                Nortel Networks
Category: Standards Track                                   April 2003


           The Case for the 'A' Bit in the MPLS and IP PID

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright (C) The Internet Society (2003). All Rights Reserved.

Abstract

   This memo describes the underlying rationale for inclusion of the
   LSR alert bit in the proposed MPLS payload ID.

Sub-IP ID Summary

   [to be removed when published]

   WHERE DOES IT FIT IN THE PICTURE OF THE SUB-IP WORK

   Fits in MPLS and PWE3.

   WHY IS IT TARGETED AT THESE WGs

   This draft addresses a number of issues associated with
   instrumenting/controlling MPLS LSPs and PWs in general.

Allan et.al                 Expires October 2003

1. Introduction

   The Internet-Draft [MPLSPID] has numerous commonalities with other
   proposals for an MPLS PID. All proposals include some well-known
   value in the first nibble and a means of identifying the protocol
   used in the subsequent payload. The only significant difference
   between the proposals is that [MPLSPID] includes an 'LSR alert'
   bit.

   This memo describes the underlying rationale for inclusion of the
   'LSR alert' or 'A' bit.

2. The 'A' bit mechanism

   The 'A' bit is provided as an alternative mechanism to the router
   alert reserved label [RFC3032]. Intermediate LSRs along an LSP that
   recognize the MPLS and IP PW payload ID must process the protocol
   PDU when the 'A' bit is set in the control word. The intent is to
   provide a hop-by-hop mechanism that is unaffected by deployed ECMP
   (which may impact fate sharing between RA-labeled flows and the
   original LSP) and that maximizes commonality of forwarding between
   hop-by-hop and normal LSP flows in the internal implementation of
   labeled payload handling.

   Specification of the 'A' bit would require that compliant LSRs
   check the MPLS top-of-stack label entry and, if the 'S' bit is
   set, examine the first nibble of each packet. If the nibble
   identifies the extended payload CW, the LSR checks the alert bit
   and, if it is set, processes the payload (which in most cases will
   also include subsequently forwarding the payload).

3. Discussion

   There are aspects of the direction that proactive fault detection
   has taken that will introduce as many problems as they solve. This
   is an artifact of the currently available mechanisms for
   distinguishing fault detection messaging (discussed extensively in
   [FRAMEWORK]).

   One specifically pathological mode of failure is misdirection of
   traffic, as this is a defect NOT detected or recovered by means
   other than path-specific testing. Misdirection of traffic is a
   prime motivation of MPLS OAM efforts [REQUIRE]. Unlike problems
   detected adjacent to the source of the fault, such as link or node
   failures, detection of misdirection via e2e probing will have no
   associated IGP notification that could act as a coordinating
   mechanism for how nodes remote to the problem respond.

   MP2P LSPs and label stacking mean that large numbers of LSP
   ingresses may be impacted by a problem with a single forwarding
   table entry. One option for detection of such problems is the use
   of LSP-PING [LSP-PING] proactively on some number of the
   potentially impacted LSPs.
   A defect in that table entry will misdirect all of the associated
   ping transactions. This will be reported to the ping originators,
   which in some implementations may result in the ping originators
   initiating traceroute, isolating the problem and alarming. The
   ping originators operate in an independent and unsynchronized
   manner, so a defect may trigger a significant amount of redundant
   diagnostic traffic and alarms.

   Use of a one-way transaction (either LSP-PING in 'do not reply'
   mode or the simpler FEC-CV [FEC-CV]) is a significant improvement,
   as responsibility for handling the problem is pushed to a much
   reduced set of network elements. However, this is still limited in
   recovery capability, as the egress would only know that it is some
   arbitrary number of hops downstream of the problem and would still
   need to take further action to isolate and recover from the
   problem. This would have security implications if the egress
   simply generated unsolicited fault notifications to untrusted
   peers some arbitrary number of hops across the network.

   When there are nested LSPs stacked on a misdirected LSP, the set
   of nested labels will also be misdirected. Some will not progress
   past the next LSR when there is no corresponding ILM, but some
   number will collide with existing label values and merge their
   traffic into existing LSPs. When combined with e2e testing, even
   with an egress detection paradigm, this will result in a large
   number of LSRs in the network independently detecting a common
   failure.

   With that as the background, one conclusion we have come to is
   that misbranching detection is BEST performed hop by hop, such
   that detection will frequently occur within one hop of the fault
   and prior to any significant fan-out of misdirected nested LSPs.
   This minimizes the number of alarms associated with the fault, and
   may permit some misbranching faults to be automatically corrected.

4. Current hop by hop Mechanisms

4.1 Router Alert

   The current hop-by-hop mechanism is to prepend the Router Alert
   label to the current label stack.

   Use of the router alert label on top of the label under test will
   be subject to significant implementation variations that will
   impact the validity of any hop-by-hop testing using the router
   alert mechanism.

   The combination of ECMP (or other hashing-based load spreading
   mechanisms) and label stacking means that use of any reserved
   label will interfere with fate sharing of flows. A path may fail
   or be misdirected and not be detected by probes prepended with the
   router alert label.

   We believe that the above two reasons disqualify the router alert
   label from consideration as a solution.

4.2 TTL Manipulation

   Alternatively, use of TTL=1 to relay messages hop by hop down the
   LSP will promptly break if the mechanism is not ubiquitously
   deployed. Non-implementation breaks the chain instead of simply
   forwarding the message transparently. This disqualifies hop-by-hop
   TTL manipulation as a candidate solution.

   Ideally, some method of identifying hop-by-hop flows with minimal
   impact on label stack semantics is required (hence the 'A' bit).

5. Load Spreading

   Load spreading can occur in the MPLS architecture when an FEC to
   NHLFE mapping or ILM mapping resolves to multiple NHLFEs. The
   ability to distinguish hop-by-hop probing of the network suggests
   a way forward. Hop-by-hop forwarding of misbranching probes can
   exercise the set of NHLFEs via any of several mechanisms:

   a) Replication: an incoming probe is simply replicated across the
      set of NHLFEs.

   b) Round-robin: an NHLFE selector selects the next NHLFE in the
      set for forwarding upon receipt of a misbranching probe.

   c) Probe examination: a probe from any given source, or directed
      to a given src/dest pair, always resolves to a specific NHLFE.
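
   The three options above can be illustrated with a minimal Python
   sketch. It is not taken from any implementation: the function
   names, the representation of an NHLFE as a string, and the use of
   a SHA-256 hash for option (c) are all assumptions made purely for
   illustration.

```python
import hashlib
from itertools import cycle
from typing import Iterator, List, Sequence


def replicate(nhlfes: Sequence[str]) -> List[str]:
    """(a) Replication: one copy of the probe goes to every NHLFE in the set."""
    return list(nhlfes)


def round_robin(nhlfes: Sequence[str]) -> Iterator[str]:
    """(b) Round-robin: each successive probe takes the next NHLFE in the set."""
    return cycle(nhlfes)


def by_probe_examination(nhlfes: Sequence[str], src: str, dst: str) -> str:
    """(c) Probe examination: a given src/dest pair always maps to one NHLFE.

    Hashing the pair is an assumed way of getting a stable mapping; the
    draft does not say how the probe is examined.
    """
    digest = hashlib.sha256(f"{src}|{dst}".encode()).digest()
    return nhlfes[int.from_bytes(digest[:4], "big") % len(nhlfes)]
```

   Option (a) multiplies probe load by the size of the set, option (b)
   visits every NHLFE over successive probes, and option (c) gives
   repeatable forwarding for a given probe at the cost of exercising
   only one NHLFE per src/dest pair.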

   It should be noted that proactive detection of a problem will have
   different requirements than fault isolation. A round-robin
   approach will not provide consistent forwarding from probe to
   probe, as there is no guarantee that two identical probes will
   share common forwarding, nor is it required to in this
   application. LSP-PING, when used to isolate a fault and inserted
   either e2e or employing TTL exhaust, would be required to produce
   consistent results for a given LSP and src/dest tuple and would do
   so independent of the hop-by-hop technique.

   A round-robin approach would most likely produce acceptable
   detection times without magnifying the probe load on the network.
   It would be expected to provide faster detection times than random
   payload manipulation techniques (altering a 127/8 destination
   address) and, compared to using traceroute to establish a specific
   127/8 test plan, would respond to topology changes faster than
   periodic traceroute.

6. Protocol Options

   There are currently two protocol proposals that have suitable
   functional characteristics for hop-by-hop fault detection. They
   differ in implementation complexity. In both cases they would be
   used with the MPLS and IP PW payload ID.

6.1 LSP-PING

   One would be the use of LSP-PING, either in 'do not reply' mode or
   with the reply semantics modified such that a reply is only
   generated when either a fault is detected or the LSP egress is
   reached. The LSR detecting the fault is delegated responsibility
   for reporting the problem and initiating corrective action.

   Each LSR intercepting the 'alert' designated ping message would
   check the FEC TLV and compare this with the FEC for the LSP. If
   the FEC was invalid, the LSR would treat this as a detected fault.
   If the FEC TLV contained a valid FEC, the probe would then be
   forwarded to the next LSR. This would be an improvement over
   simply running traceroute continuously, as the number of messages
   would be significantly reduced.
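
   The per-hop check described above might look roughly like the
   following Python sketch. The names (Action, check_alert_ping), the
   representation of a FEC as an IPv4 prefix string, and the use of
   exact prefix equality as the comparison are assumptions for
   illustration only; the LSP-PING drafts define the actual TLV
   processing.

```python
import ipaddress
from enum import Enum


class Action(Enum):
    FORWARD = "forward probe to next LSR"
    REPORT_FAULT = "report misbranching fault"
    EGRESS_REACHED = "egress reached; reply or discard per reply mode"


def check_alert_ping(probe_fec: str, lsp_fec: str, is_egress: bool) -> Action:
    """Compare the FEC carried in the probe with the FEC bound to the LSP."""
    probe = ipaddress.ip_network(probe_fec)
    expected = ipaddress.ip_network(lsp_fec)
    if probe != expected:
        # The LSR that detects the mismatch owns reporting and correction.
        return Action.REPORT_FAULT
    return Action.EGRESS_REACHED if is_egress else Action.FORWARD
```

   For example, check_alert_ping("10.2.0.0/16", "10.1.0.0/16", False)
   yields Action.REPORT_FAULT, while a matching FEC at a transit LSR
   yields Action.FORWARD.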

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  TBD  |A| rsvd. | PA | Protocol ID |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   //                          IP Header                          //
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   //                         UDP Header                          //
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Version Number         |         Must Be Zero          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Message Type  |  Reply mode   |  Return Code  | Return Subcode|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sender's Handle                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   TimeStamp Sent (seconds)                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 TimeStamp Sent (microseconds)                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 TimeStamp Received (seconds)                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               TimeStamp Received (microseconds)               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  TLV type = Target FEC Stack  |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    SubTLV type = IPv4 FEC     |          Length = 5           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          IPv4 Prefix                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     mask      |
   +---------------+

   Figure 1: LSP-PING (IPv4 FEC) with MPLS & IP Payload CW

   Note that if the option of specifying that the payload following
   the IP PW control word is an IP protocol were adopted, then in
   'do not reply' mode the LSP-PING PDU could be simplified by
   dispensing with the IP and UDP headers.

6.2 FEC-CV

   The other would be the use of FEC-CV as proposed for Y.1711.
   Each LSR intercepting the 'alert' designated FEC-CV message would
   compare the FEC filter with the expected value (Boolean 'AND'
   operation). If no mismatch is detected, the probe would be
   forwarded to the next LSR. FEC-CV is a one-way probe message, so
   it would simply be discarded at the LSP egress.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  TBD  |1| rsvd. | PA | Protocol ID (TBD) |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Function (=7) |                 Reserved (=0)                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                                                               |
   |                             TTSI                              |
   |                                                               |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                          FEC Filter                           |
   |                                                               |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Reserved (=0)         |            BIP 16             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 2: FEC-CV PDU with Payload ID CW

7. Interlayer coordination

   When a misbranching defect occurs for an LSP that transports LSPs
   (e.g. TE trunk or PWs on a PSN), it is desirable to minimize the
   number of points detecting the problem. There are two scenarios
   that can be considered:

   1) The fault misdirects MPLS LSPs without changing the MPLS level
      (label swap problem).

      When an LSP is detected as being in a misforwarded state,
      probably the most secure response that the network can offer
      is to silently discard all traffic, or at least all labeled
      traffic, transported by the LSP. This has the additional
      benefit of not forwarding any misbranching probes that have
      been inserted into client LSPs. This will have the desirable
      property of ensuring client LSPs will not alarm.

   2) The fault misdirects traffic by altering the MPLS level (push
      or pop problem).

      A premature pop fault will result in a significant number of
      misbranching defects being detected by the immediately
      adjacent LSRs.
      It may also result in no-ILM conditions in the LSR where the
      unexpected pop occurred. LSRs should limit the rate of
      management notifications associated with misdirection of
      traffic.

      An unexpected push in the network would be more difficult to
      isolate if the traffic merged with an existing LSP. Detection
      of the fault would not occur until probes reached the egress
      of the LSP into which the traffic was merged.

8. Implications of partial deployment

   Partial deployment diminishes but does not eliminate the value of
   the hop-by-hop audit. Nodes that do not implement 'A' bit
   functionality will simply forward the misbranching probes without
   processing them. If the fault has occurred in an MP2P LSP, and for
   security or other operational reasons detection of such a fault
   leads to silent discard of all traffic, then detection by
   virtually any node upstream of the egress node will reduce the
   amount of traffic impacted by the misbranching fault.

9. Conclusions

   Most simply expressed, IP is hop-by-hop forwarding. Hop-by-hop
   detection of MPLS forwarding problems for LSPs set up via LDP is
   consistent with the IP paradigm, whereas proactive e2e path
   testing for the same is not.

   Proactive hop-by-hop verification of forwarding is only practical
   if a sufficiently lightweight mechanism exists such that the
   network is not degraded by proactive probing. Section 6 explores
   some possibilities and suggests that this is not an insurmountable
   obstacle.

   Hop-by-hop verification of forwarding requires a mechanism for
   distinguishing hop-by-hop probes that has maximum commonality with
   the handling of the label of interest and immunity to deployed
   ECMP. This is what motivates the 'A' bit proposal.
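
   As an illustration of the kind of lightweight check the 'A' bit
   implies (Section 2), the following Python sketch tests the 'S' bit
   of the top label stack entry and then the first octet of the
   payload. The label stack entry layout follows RFC 3032. The value
   PID_NIBBLE is a hypothetical placeholder, since the first-nibble
   value is still TBD, and the position of the 'A' bit immediately
   after that nibble is read from Figures 1 and 2; the function name
   is likewise an assumption.

```python
import struct

# Hypothetical placeholder: the draft leaves the first-nibble value as TBD.
PID_NIBBLE = 0x2
# Assumed from Figures 1 and 2: 'A' is the bit right after the first nibble.
A_BIT_MASK = 0x08


def should_intercept(packet: bytes) -> bool:
    """Return True if this LSR should process the payload hop by hop."""
    if len(packet) < 5:
        return False
    # Top label stack entry: label (20 bits), Exp (3), S (1), TTL (8).
    (entry,) = struct.unpack("!I", packet[:4])
    bottom_of_stack = (entry >> 8) & 0x1
    if not bottom_of_stack:
        return False  # further labels follow; forward as normal
    first_octet = packet[4]
    if first_octet >> 4 != PID_NIBBLE:
        return False  # payload does not start with the payload ID CW
    return bool(first_octet & A_BIT_MASK)
```

   A packet that fails any of these tests follows the normal label
   forwarding path, which is the commonality of handling that the
   proposal is aiming for.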

10. References

   [MPLSPID]   Allan, D., 'The MPLS and IP PW Payload ID', Internet
               Draft, draft-allan-mpls-pid-00, April 2003

   [FEC-CV]    Allan, D., 'Overview of the FEC-CV proposed extension
               to the Y.1711 protocol', IETF Internet Draft,
               draft-allan-fec-cv-overview-00.txt

   [FRAMEWORK] IETF Internet Draft, draft-allan-mpls-oam-frmwk-04,
               February 2003

   [LSP-PING]  Pan, P. et.al., 'Detecting Data Plane Liveness',
               Internet Draft, draft-ietf-mpls-lsp-ping-02, April 2003

   [REQUIRE]   Nadeau et.al., 'OAM Requirements for MPLS Networks',
               IETF Internet Draft,
               draft-nadeau-ietf-oam-requirements-01, February 2003

11. Author's Address

   David Allan
   Nortel Networks              Phone: 1-613-763-6362
   3500 Carling Ave.            Email: dallan@nortelnetworks.com
   Ottawa, Ontario, CANADA

12. Full Copyright Statement

   Copyright (C) The Internet Society (2003). Except as set forth
   below, authors retain all their rights.

   This document and translations of it may be copied and furnished
   to others, and derivative works that comment on or otherwise
   explain it or assist in its implementation may be prepared,
   copied, published and distributed, in whole or in part, without
   restriction of any kind, provided that the above copyright notice
   and this paragraph are included on all such copies and derivative
   works. However, this document itself may not be modified in any
   way, such as by removing the copyright notice or references to the
   Internet Society or other Internet organizations, except as needed
   for the purpose of developing Internet standards in which case the
   procedures for rights in submissions defined in the IETF Standards
   Process must be followed, or as required to translate it into
   languages other than English.

   The limited permissions granted above are perpetual and will not
   be revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.