                                                  Neil Harrison
Internet Draft                                    Peter Willis
Document: draft-harrison-mpls-oam-req-00.txt      British Telecom
Expires: November 2001
                                                  Shahram Davari
                                                  PMC-Sierra

Enriqu G. Cuevas                                  Ben Mack-Crane
AT&T Laboratories                                 Tellabs

Elke Franze                                       Hiroshi Ohta
Deutsche Telekom                                  NTT

Tricci So                                         Sanford Goldfless
Caspian Network                                   Feihong Chen
                                                  Lucent

                                                  May 2001


                Requirements for OAM in MPLS Networks


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright(C) The Internet Society (2001). All Rights Reserved.

Abstract

   This draft provides motivation and requirements for user-plane OAM
   (Operation and Maintenance) functionality in MPLS networks.

Harrison et.al           Expires November 2001                  Page 1

                Requirements for OAM in MPLS Networks        May 2001

   Motivation for this draft arose from network operators' need for
   tools that ensure the reliability and performance of MPLS LSPs
   (Label Switched Paths). User-plane OAM tools are required to verify
   that LSPs have been set up and are available to deliver customer
   data to target destinations according to the QoS (Quality of
   Service) guarantees given in SLAs (Service Level Agreements).

   Requirements presented in this draft include but are not limited to:

   . Tools to efficiently detect and localize defects in the MPLS layer
   . Mechanisms for fast defect notification
   . Availability and performance criteria
   . Triggers for corrective actions (e.g. protection switching) when
     failures occur.

Table of Contents

   1. Introduction
   2. Definitions
   3. Motivation for MPLS OAM functions
   4. Requirements for OAM functions
   5. Security Considerations
   6. References
   7. Authors' Addresses

Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.

1. Introduction

   This Internet draft provides motivation and requirements for OAM
   (Operation and Maintenance) for the user-plane in MPLS networks.

   It is recognized that OAM functionality is important in public
   networks for ease of network operation, for verifying network
   performance, and for reducing operational costs. OAM functionality
   is especially important for networks that are required to deliver
   (and hence be measurable against) QoS (Quality of Service) and
   availability performance parameters/objectives.

2. Definitions

   This document introduces some new terminology, which is required to
   discuss the functional network components associated with OAM.

   Functional Architecture   Meaning
   Term
   -----------------------   ----------------------------------------
   Client/server             A term referring to the transparent
   (relationship between     transport of a client (i.e. higher)
   layer networks)           layer link connection by a server
                             (i.e. lower) layer network trail.

   Link connection           A partition of a layer N trail that
                             exists between two logically adjacent
                             switching points within the layer N
                             network.

   LSP Tunnel                An LSP Tunnel is an LSP with a well-
                             defined source (ingress point) and sink
                             (egress point).

   Subnetwork                A subnetwork is a contiguous topological
                             region of a network delimited by its set
                             of peripheral access points, and is
                             characterized by the possible routing
                             across the subnetwork between those
                             access points. A network is the largest
                             subnetwork and a node is the smallest
                             subnetwork (at least in practical
                             physical terms, though there are smaller
                             sub-networks within nodes).

   Trail                     A generic transport entity at layer N
                             which is composed of a client payload
                             (which can be a packet from a client at
                             higher layer N-1) with specific overhead
                             added at layer N to ensure the
                             forwarding integrity of the server
                             transport entity at layer N.

   Trail termination point   A source or sink point of a trail at
                             layer N, at which the trail overhead is
                             added or removed respectively. A trail
                             termination point must have a unique
                             means of identification within the layer
                             network.

3. Motivation for MPLS OAM functions

   It is recognized that OAM functionality is important in public
   networks to ensure that agreed-upon SLAs are met, reduce operational
   costs, verify network performance, and facilitate network
   operations.

   Network operators need OAM functionality to:

   1. Detect MPLS user-plane defects:

      MPLS introduces a unique functional layer in the network. MPLS
      layer OAM functionality is not a substitute for lower layer OAM
      (also known as server layer) or higher layer OAM (also known as
      client layer). Moreover, the MPLS nesting capability (realized
      through label stack encoding [2]) allows LSPs to create layer
      networks in their own right, and these will hence have defects
      that are specific to the MPLS LSP layer networks.
      MPLS user-plane defects are those that are encountered during
      transport of customer data. Although some MPLS control-plane OAM
      functions may be available, network operators cannot rely
      exclusively on fate sharing with the control plane to detect all
      transport defects, because:

      a. There will not be full commonality of all components traversed
         by an LSP and the control plane. Therefore control plane
         survival is not an authoritative indication of the health of
         an LSP.

      b. It is possible for an MPLS network not to have a control-plane
         (when LSPs are set up statically) or to have user data
         transported on paths that are not used by signaling (when LSPs
         are not routed hop-by-hop).

   2. Verify whether the availability and Quality of Service guarantees
      given in SLAs (Service Level Agreements) are in fact being met by
      the connection. The ability to determine availability and QoS
      performance in order to satisfy SLAs is critical to network
      operators who wish to deploy numerous LSPs and dynamic routing in
      core MPLS networks.

   3. Reduce operating costs, by allowing efficient detection and
      handling of defects. Lack of efficient automatic defect detection
      forces operators to increase their engineering and support
      workforce, and hence increases operating costs.

   4. Determine LSP availability and performance reliably and
      accurately for accounting/billing purposes. This is required to
      ensure that customers are not inappropriately charged for
      degraded service or service outages.

   5. Permit rapid localization of defects.

   6. Reduce the duration of defects and thus improve the availability
      performance.

   7. Protect customer traffic by detecting traffic mis-connections, so
      that a customer's confidential data is not delivered to the wrong
      destination (a condition that may otherwise go undetected).

   8. Help to decrease the number of defects that are not apparent
      until the customer reports a problem.

   9. Allow consequent defects on LSPs to be detected, and the
      necessary actions taken, even if a network element (NE) fails
      without notifying the NMS of the failure (silent failure).

   10. Improve the security of MPLS networks by detecting
       mis-connections, and therefore helping to prevent a customer's
       traffic from being exposed to another party.

4. Requirements for OAM functions

   This section describes the high level requirements that have been
   identified and requested by a number of service providers. The
   requirements include but are not limited to:

   1. Both on demand and continuous connectivity verification of LSPs
      to confirm that defects do not exist on the target LSPs.

   2. If a defect occurs, it is necessary to detect, notify and
      localize it immediately and to take the necessary actions. This
      minimizes the interruption of service by providing the network
      with sufficient information to take corrective action to bypass
      the defect (protection switching, re-routing, etc.), and
      minimizes the time needed to correct the defect and return the
      failed resource to the available state. It is necessary that
      defects be detected and notified automatically, without operator
      intervention, for this purpose.

   3. A defect event in a given layer network should not cause multiple
      alarm events to be raised (in the same layer network or in client
      layer networks).

   4. OAM functions should be able to perform stably in large scale
      networks.

   5. The operator actions needed to set up and activate MPLS OAM
      functions should be minimized, so that the functions remain easy
      to use even in large scale networks where the number of LSPs
      tends to be large.

   6. OAM functions must be optional to the operator and be used only
      by the networks that need them. Operators choose which functions
      to use and which LSPs to apply them to.

   7. OAM functions must be backward compatible.
      LSRs that do not support such functions must silently discard or
      pass through the OAM packets without disturbing the traffic or
      causing unnecessary actions.

   8. Measurement of availability and QoS performance.

   9. The OAM functionality of an MPLS layer should not be dependent on
      any specific server or client layer technology. This is critical
      to ensure that layer networks can evolve (or new/old layer
      networks be added/removed) without impacting other layer
      networks. The control-plane of a given layer network must also
      have its own OAM. [Note - Control-plane OAM is outside the scope
      of this draft.]

   10. All the major defect conditions must be identified with
       in-service measurable entry and exit criteria, and all
       consequent actions must be specified. At least the following
       MPLS user-plane defects need to be detected:

       a. Loss of LSP connectivity (due to a server layer failure or a
          failure within the MPLS layer);
       b. Swapped LSP trails;
       c. LSP mismerging (of two or more LSP trails), including loops;
       d. Unintended replication (e.g. unintended multicasting).

   11. It is important to specify how unavailable/available state
       transitions relate to the stopping/starting of the aggregation
       of available state QoS metrics.

   12. Connectivity status assessment must not be dependent on user
       traffic behavior.

   13. The OAM tools provided should ensure (as far as reasonably
       practicable) that customers do not have to act as failure
       detectors for the operator.

   14. Under fault conditions a layer network is not expected to behave
       in a predictable manner. Therefore OAM functions should not rely
       on the defective layer functioning in a reliable and predictable
       manner for fault diagnosis.

5. Security Considerations

   The OAM function described in this document enhances the security of
   MPLS networks by detecting mis-connections, and therefore preventing
   customers' traffic from being exposed to other customers.

   The MPLS OAM functions as defined in this document do not raise any
   new security issues for MPLS networks.

6. References

   [1] IETF, RFC 3031, Multiprotocol Label Switching Architecture,
       Category: Standards Track, January 2001.

   [2] IETF, RFC 3032, MPLS label stack encoding, Category: Standards
       Track, January 2001.

7. Authors' Addresses

   Neil Harrison
   British Telecom                 Phone: 44-1604-845933
   Heath Bank                      Email: neil.2.Harrison@bt.com
   Rugby Road, Harleston
   South Hampton, UK

   Peter Willis
   British Telecom                 Phone: 44-1473-645178
   BT, PP RSB10/PP3 B81            Email: peter.j.willis@bt.com
   Adastrial Park
   Martlesham, Ipswich, UK

   Shahram Davari
   PMC-Sierra
   411 Legget Drive                Phone: 1-613-271-4018
   Kanata, ON, Canada              Email: Shahram_Davari@pmc-sierra.com

   Ben Mack-Crane
   Tellabs
   4951 Indiana Ave                Phone: 1-630-512-7255
   Lisle, IL, USA                  Email: ben.mack-crane@tellabs.com

   Hiroshi Ohta
   NTT
   Y-709A, 1-1 Hikarino'ka         Phone: 81-468-59-8840
   Yokosuka-Shi                    Email: ohta.hiroshi@nslab.ntt.co.jp
   Kanagawa, Japan

   Sanford Goldfless
   Lucent Technologies
   200 Nickerson Road              Phone: 508-786-3655
   Marlborough, MA 01752           Email: sgoldfless@lucent.com

   Feihong Chen
   Lucent Technologies
   200 Nickerson Road              Phone: 508-786-3675
   Marlborough, MA 01752           Email: fchen6@lucent.com

   Tricci So
   Caspian Network
   170 Baytech Drive               Phone: 408-382-5217
   San Jose, CA                    Email: tso@caspiannetworks.com
   U.S.A. 94070

   Elke Franze
   Deutsche Telekom
   T-Systems
   T-Nova, Technologiezentrum      Phone: +49 6151 83 5459
   D-64307 Darmstadt               Email: elke.franze@t-systems.de
   Darmstadt, Germany

   Enriqu G. Cuevas
   AT&T Laboratories
   Room A2-1E03                    Phone: +1 732 420 3252
   200 S. Laurel Avenue            Email: ecuevas@att.com
   Middletown, NJ 07748 USA
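
   As an illustration of requirement 10 in Section 4, the following
   minimal Python sketch shows one way a sink trail termination point
   might tell the four listed user-plane defects apart. It is a sketch
   under stated assumptions, not a mechanism defined by this draft: it
   assumes that the source periodically inserts connectivity
   verification packets carrying its unique trail termination point
   identifier, and that the sink counts the identifiers it receives in
   a fixed observation window. The names Defect, classify_window,
   expected_id and expected_count are hypothetical.

```python
from collections import Counter
from enum import Enum
from typing import Iterable


class Defect(Enum):
    """Defect classes from requirement 10 (names are illustrative)."""
    NONE = "no defect"
    LOSS_OF_CONNECTIVITY = "loss of LSP connectivity"
    SWAPPED = "swapped LSP trails"
    MISMERGE = "LSP mismerging"
    REPLICATION = "unintended replication"


def classify_window(expected_id: str,
                    received_ids: Iterable[str],
                    expected_count: int) -> Defect:
    """Classify one observation window at the sink of an LSP.

    Assumption: the source sends `expected_count` verification packets
    per window, each carrying the source identifier `expected_id`.
    """
    counts = Counter(received_ids)
    from_expected = counts.get(expected_id, 0)
    from_others = sum(counts.values()) - from_expected

    if from_expected == 0 and from_others == 0:
        # Nothing arrived at all: server layer or MPLS layer failure.
        return Defect.LOSS_OF_CONNECTIVITY
    if from_expected == 0:
        # Only foreign sources were seen: the trail has been swapped.
        return Defect.SWAPPED
    if from_others > 0:
        # Expected and foreign sources both seen: trails have merged.
        return Defect.MISMERGE
    if from_expected > expected_count:
        # More copies than the source sent: replication or a loop.
        return Defect.REPLICATION
    # Partial loss (fewer packets than expected) is not treated as a
    # defect in this simplified sketch.
    return Defect.NONE
```

   Because the decision uses only the verification packets, it does not
   depend on user traffic behavior (requirement 12). Entry and exit
   criteria, such as how many consecutive windows must agree before a
   defect is declared or cleared, are deliberately left out.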