MPLS Working Group                                      Heinrich Hummel
Internet Draft                                        Jochen Grimminger
Expiration Date: November 2002                               Siemens AG
                                                               May 2002

                           Hierarchical LSP
               draft-hummel-mpls-hierarchical-lsp-01.txt

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

For potential updates to the above required-text see: http://www.ietf.org/ietf/1id-guidelines.txt

Abstract

This draft pursues the standardization of the "Hierarchical LSP", which is a label-switched sequence of other LSPs, i.e. of hierarchically next-lower LSPs, which eventually, after some recursive cycles, consist of label-switched sequences of physical interfaces, i.e. of LSPs as of today. Hierarchically stacked LSPs will eliminate the notorious N-square problem when virtual networks (or even virtual networks on top of virtual networks) are to be built.

Hummel                         March 2002                      [Page 1]
                  Hierarchical LSP        Exp. Nov. 2002

1 Introduction and motivation

This draft is written in favor of extending [CR-LDP] so as to enable "Hierarchical LSPs" (= H-LSPs), each being a label-switched sequence of other already existing LSPs, herein called sub H-LSPs. A sub H-LSP is either a label-switched sequence of other already existing LSPs as well (i.e. also an H-LSP) or a conventional LSP, i.e. a label-switched sequence of physical interfaces.

Similar work is done by [3], where the H-LSP is called FA-LSP (FA = forwarding adjacency). That draft specifies the FA-LSP so as to "inject" the resulting "virtual link" back into OSPF, so that some other entity may eventually find several such virtual links, build a hierarchically higher FA-LSP out of them, inject this hierarchically higher FA-LSP back into OSPF again, etc.

However, this draft pursues an additional goal, which is to overcome the notorious N-square problem. With the help of H-LSPs an effective full mesh of tunnels shall be accomplished by using just O(n) (elementary) tunnels, which may form a contiguous partial mesh, rather than a full mesh using O(n**2) tunnels. Sufficiently many H-LSPs are to be established so that there is a concatenated sequence of elementary tunnels from any of the n sites to any other. These H-LSPs will only consume network resources (control state, NHLFE) at the endpoints of the elementary tunnels used, not at any P-router.

From the network core's point of view, the costs (including performance costs) of the H-LSPs will even be equal to zero if the H-LSPs are completely invisible to every P-router. This is accomplished if the messages (Label Request, Label Mapping, ...) for establishing and terminating the H-LSPs are sent through the elementary tunnels as well.

PE-based VPNs (RFC2547) try to reduce the severity of the N-square problem

1) by employing PE-to-PE tunnels instead of CE-to-CE tunnels, whereby at each PE there may be several attached CEs, and

2) by sharing PE-to-PE tunnels among different customer enterprises (whereby the traffic flows will be disaggregated at the egress PE based on the bottom-most "VC-label").

CE-based VPNs, which are said to have a four times bigger market share, cannot use either of these mechanisms.

There is also a third mechanism for improving scalability, which is to employ merging LSPs.
As a matter of fact, merging LSPs do accomplish a significant reduction of "next hops", i.e. of MPLS switching state resources, of up to 80 %. But the remaining 20 % of O(n**2) is still O(n**2).

Two examples demonstrate the savings due to H-LSPs:

1) CE-based VPN with n = 50,000 CEs

Note that it is a stated requirement to find scalable VPN solutions for up to n = 50,000 CEs.

Worst solution a): Deploy n*(n-1) = 2,499,950,000 p2p CE-to-CE LSPs.

Solution b): Deploy n = 50,000 mp2p CE-to-CE LSPs (such CE-to-CE LSPs are not yet standardized). Assuming a saving of 80 %, the costs are still comparable to maintaining 499,990,000 CE-to-CE p2p LSPs.

Favored solution c): Deploy a partial mesh of n-1 pairs of mutually inverse elementary CE-to-CE tunnels, which contiguously interconnect all 50,000 CEs. Additional H-LSPs (= sequences of LSPs which reuse some of these elementary tunnels again and again) will enable an effective full mesh. 2*(n-1) = 99,998 elementary CE-to-CE tunnels are required.

2) Simulation of a PE-based VPN with n = 100 PEs, 200 P-routers, 550 links

The PE distribution as well as the meshing is done randomly but fairly.

Worst solution a): Full mesh using n*(n-1) p2p LSPs: takes 99,874 next hop links.

Solution b): Full mesh using n (merging) mp2p LSPs: takes 19,886 next hop links, i.e. saves about 80 % compared with a).

Favored solution c): Partial mesh with n-1 = 99 pairs of mutually inverse directed elementary tunnels: takes 338 next hop links, i.e. saves about 99.7 % compared with a).

The additionally required H-LSPs only consume resources at the edges of the elementary tunnels: either n*(n-1) - 2*(n-1) = (n-1)*(n-2) p2p H-LSPs or n mp2p H-LSPs are required.

Additional arguments in favor of using H-LSPs:

Multicast:

Note that none of the LSPs according to a) or b) can be used for VPN-multicast traffic: Using these LSPs, the same multicast data would have to be sent out as often as there are receiver nodes (PEs, CEs), or, which is no good solution either, would have to be transmitted via one additional multicast delivery channel tree per VPN (which must even carry blind traffic, a trade-off made in order to avoid multiple application-dedicated multicast delivery channel trees).

VPN-multicast based on solution c) means: do multicast while using the elementary LSPs as "parent link" or "child links" of the multicast delivery tree. No additional elementary LSPs are needed, and no P-router is bothered with VPN-multicast.

VPN-Multipath:

VPN-Multipath means to provide multiple alternate paths (e.g. differently routed, or for differently aggregated data) between any pair of sites. VPN-Multipath is useful for traffic balancing, fast path restoration, and QoS-sensitive data aggregation. A multiple full mesh according to solutions a) or b) is certainly not scalable, whereas according to c) only additional H-LSPs are required, but no additional elementary LSPs.

Carrier's carrier networking:

VPNs may be built based on a "network of LSPs" which is rented by some "virtual service provider": the elementary LSPs between the PEs of some VPN may already be H-LSPs themselves.

MPLS over IP and over IPSEC due to H-LSPs:

An H-LSP may be built such that IP tunnels (L2TP, GRE, IPSEC) are concatenated as well, together with MPLS LSPs (ffs). Accordingly, a service provider may deploy VPN tunneling which spans both its MPLS domains and its IP domains. If IPSEC tunnels are used, they will be concatenated only "on safe ground". The number of security associations is reduced: not n, but only as many as there are neighboring PEs/CEs according to the partial mesh. The VC-label can still be exploited, too.

2 Definition of the H-LSP

The H-LSP of level m is a concatenated (label-switched) sequence of sub H-LSPs of levels w, with 1 <= w <= m-1, whereby at least one of them has level m-1.
In the same recursive manner, each of these sub H-LSPs is defined as well. Eventually some of these sub H-LSPs may be, and definitely some of their sub-sub H-LSPs will be, LSPs of level 1, i.e. conventional LSPs as they are well known today.

We may also envision non-MPLS tunnels (L2TP, IPSEC, GRE) to be used as sub H-LSPs of level 1 (ffs).

The sequence of sub H-LSPs may be linear (p2p H-LSP), merging (mp2p H-LSP) or branching (p2mp H-LSP). The establishment of the H-LSP may be done in Downstream-on-Demand mode or in Unsolicited-Downstream mode.

The rest of this draft is focused on p2p H-LSPs established in Downstream-on-Demand mode. The other cases may be the subject of further drafts in the future.

Example:

The following figure shows H-LSP U16, which shall carry labelled user packets from PE1 to PE6, passing PE2,...,PE5 as well as P-routers P1,...,P8 as displayed. It will concatenate/label-switch the three sub H-LSPs U13, U34 and U46. Sub H-LSP U13 is itself a concatenation of LSPs U12 and U23. Sub H-LSP U46 is itself a concatenation of LSPs U45 and U56. Sub H-LSP U34 is a conventional LSP.

Figure 1:

PE1-P1---P2--PE2----P3---PE3----P4---P5---PE4---P6-----PE5--P7-P8---PE6
|   |    |   |      |    |      |    |    |     |      |    |  |    |
|   |    |   |      |    |      |    |    |     |      |    |  |    |
|a--|b---|0--|c-----|0---|d-----|e---|0---|f----|0-----|g---|h-|0---|
|--LSP U12-->|--LSP U23->|-sub H-LSP U34->|---LSP U45->|--LSP U56-->|
|            |           |                |            |            |
|            |           |                |            |            |
|a1----------|0----------|                |b1----------|0-----------|
|---sub H-LSP U13------->|                |---sub H-LSP U46-------->|
|                        |                |                         |
|                        |                |                         |
|a2----------------------|b2--------------|0------------------------|
|------------H-LSP U16--------------------------------------------->|

Labels: a, b, c, d, e, f, g, h, a1, b1, a2, b2, 0 (= IPv4 Explicit Null)
P1 to P8 indicate P-routers
PE1 to PE6 indicate Provider Edge routers

LSP U12 shall carry user packets from PE1 via P1 and P2 to PE2, which are initially labelled with label a, which is swapped with label b at P1, which is swapped with label 0 (= IPv4 Explicit Null label) at P2.

LSP U23 shall carry user packets from PE2 via P3 to PE3, which are initially labelled with label c, which is swapped with label 0 at P3.

(Sub H-)LSP U34 shall carry user packets from PE3 via P4 and P5 to PE4, which are initially labelled with label d, which is swapped with label e at P4, which is swapped with label 0 at P5.

LSP U45 shall carry user packets from PE4 via P6 to PE5, which are initially labelled with label f, which is swapped with label 0 at P6.

LSP U56 shall carry user packets from PE5 via P7 and P8 to PE6, which are initially labelled with label g, which is swapped with label h at P7, which is swapped with label 0 at P8.

Sub H-LSP U13 shall carry user packets from PE1 to PE3 with an initial label stack = (a, a1). PE2 will pop the received top-most 0-label and swap a1 with (c, 0).

Sub H-LSP U46 shall carry user packets from PE4 to PE6 with an initial label stack = (f, b1). PE5 will pop the received top-most 0-label and swap b1 with (g, 0).
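
The label operations described so far for sub H-LSP U13 can be traced hop by hop. The following is only a minimal sketch under stated assumptions: the label stack is modelled as a Python list with the top-most label first, the table and function names (SWAP, EDGE_NODES, forward, trace) are hypothetical, and an edge node is assumed to pop every received leading Explicit Null label before it swaps.

```python
EXPLICIT_NULL = "0"

# Hypothetical per-node tables: received top label -> labels that replace it.
# The entries mirror the text: U12 (a -> b -> 0), U23 (c -> 0), and PE2
# swapping a1 with (c, 0) in order to concatenate U12 and U23 into U13.
SWAP = {
    "P1": {"a": ["b"]},
    "P2": {"b": [EXPLICIT_NULL]},
    "PE2": {"a1": ["c", EXPLICIT_NULL]},
    "P3": {"c": [EXPLICIT_NULL]},
}
EDGE_NODES = {"PE2"}  # assumption: edge nodes pop leading Explicit Null labels


def forward(node, stack):
    """Return the label stack (top first) as it leaves the given node."""
    stack = list(stack)
    if node in EDGE_NODES:
        while stack and stack[0] == EXPLICIT_NULL:
            stack.pop(0)
    top = stack.pop(0)
    return SWAP[node][top] + stack


def trace(path, stack):
    """Record the stack sent by PE1 and the stack leaving every later node."""
    hops = [("PE1", tuple(stack))]
    for node in path:
        stack = forward(node, stack)
        hops.append((node, tuple(stack)))
    return hops


U13_TRACE = trace(["P1", "P2", "PE2", "P3"], ["a", "a1"])
for node, stack in U13_TRACE:
    print(node, stack)
```

Under these assumptions the stack evolves as (a, a1), (b, a1), (0, a1), (c, 0), (0, 0), so PE3 receives two Explicit Null labels on a packet carried by U13.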

H-LSP U16 shall be established such that user packets will be carried from PE1 to PE6 based on an initial label stack (a, a1, a2). PE3 shall pop two 0-labels and swap a2 with label stack (d, b2). PE4 will pop one 0-label and swap label b2 with label stack (f, b1, 0).

3 Knowing all relevant sub H-LSPs

The entity in PE1 of the above example which is about to establish H-LSP U16 shall only consider the sub H-LSPs whose LSP-IDs are U13, U34 and U46. It should be ignorant of how they are "composed" (i.e. whether these sub H-LSPs themselves have zero, one, or several nested sub H-LSPs). That should be a local matter of each respective ingress router: from this perspective, being the ingress router of U13, PE1 must be able to derive all nested LSPs of U13 (here U12 only) based on LSP-ID U13. This needs to be enabled (see 3.1).

An H-LSP may be used for carrying U-Plane data, C-Plane data, or both. Accordingly it may be called U-Plane H-LSP, C-Plane H-LSP or U/C-Plane H-LSP. In the above example U16 shall become a U-Plane H-LSP, concatenating the U-Plane or U/C-Plane sub H-LSPs U13, U34 and U46 (see Figure 2).

The establishment messages (LABEL_REQUEST resp. LABEL_MAPPING) for establishing U16 shall be forwarded via the C-Plane or U/C-Plane sub H-LSPs C13, C34, C46 resp. C64, C43, C31. These sub H-LSPs shall be derivable:

PE1 must be able to derive C13 from U13.
PE3 must be able to derive C31 from U13 as well as C34 from U34.
PE4 must be able to derive C43 from U34 as well as C46 from U46.
This needs to be enabled (see 3.2).

Figure 2:

PE1-P1---P2--PE2----P3---PE3----P4---P5---PE4---P6-----PE5--P7-P8---PE6
|                        |                |                         |
|---sub H-LSP U13------->|-sub H-LSP U34->|---sub H-LSP U46-------->|
|                        |                |                         |
|                        |                |                         |
|     LABEL_REQUEST      | LABEL_REQUEST  |      LABEL_REQUEST      |
|---sub H-LSP C13------->|-sub H-LSP C34->|---sub H-LSP C46-------->|
|                        |                |                         |
|     LABEL_MAPPING      | LABEL_MAPPING  |      LABEL_MAPPING      |
|<--sub H-LSP C31--------|<-sub H-LSP C43-|<--sub H-LSP C64---------|
|                        |                |                         |
|                        |                |                         |

3.1 Deriving all nested sub-sub H-LSPs

The ingress node of any conventional LSP Z of level 1, which may eventually be used by some H-LSPs, shall allocate a MIB-entry as follows:

{LSP-ID Z; S-bit=1; first physical interface of LSP Z; first label of LSP Z}

This entry will be retrievable based on its first component, which is LSP-ID Z.

The ingress node of any sub H-LSP of level > 1, which may eventually be used by some H-LSPs of even higher hierarchical levels, shall allocate a similar MIB-entry.

Let's assume H-LSP X concatenates a sequence of (H-)LSPs which begins with H-LSP Y. Let's also assume that H-LSP Y concatenates a sequence of conventional LSPs which begins with LSP Z.

The router which is the ingress of X, Y and Z shall altogether allocate the following MIB-entries for X, Y and Z:

{LSP-ID X; S-bit=0; LSP-ID Y; 1st label of LSP X}
{LSP-ID Y; S-bit=0; LSP-ID Z; 1st label of LSP Y}
{LSP-ID Z; S-bit=1; 1st phys. interface of Z; 1st label of LSP Z}

Note that the S-bit has the same purpose as the S-bit in a label stack: it marks the last entry of the chain.

As soon as any LSP or H-LSP has been established successfully, the respective MIB-entry shall be allocated in some table.

Starting with the right LSP-ID, e.g. with LSP-ID X, we can navigate through the chain of MIB-entries so as to retrieve the complete initial label stack as well as the initial physical link.
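
The navigation through the MIB-entries can be sketched as follows. This is only an illustration under stated assumptions: the MibEntry type, its field names, the function name and the placeholder values (label_x, label_y, label_z, if_z) are hypothetical and not defined by this draft. The label of the level-1 LSP is placed on top of the stack, in line with the stack (a, a1, a2) used for U16 in section 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MibEntry:
    lsp_id: str
    s_bit: int        # 1: next_ref is a physical interface, 0: a nested LSP-ID
    next_ref: str     # nested LSP-ID (s_bit == 0) or first physical interface
    first_label: str  # first label of this (H-)LSP


def initial_stack_and_interface(mib, lsp_id):
    """Follow the entries from lsp_id until the S-bit is set.

    Returns (label stack with the top-most label first, physical interface).
    """
    labels = []
    entry = mib[lsp_id]
    while True:
        labels.append(entry.first_label)
        if entry.s_bit == 1:
            break
        entry = mib[entry.next_ref]
    # Labels were collected from the highest level downwards, but the
    # level-1 label has to sit on top of the stack, so reverse the order.
    return list(reversed(labels)), entry.next_ref


# Entries of the router that is ingress of X, Y and Z (placeholder values).
MIB = {e.lsp_id: e for e in (
    MibEntry("X", 0, "Y", "label_x"),
    MibEntry("Y", 0, "Z", "label_y"),
    MibEntry("Z", 1, "if_z", "label_z"),
)}

STACK, INTERFACE = initial_stack_and_interface(MIB, "X")
print(STACK, INTERFACE)
```

A table keyed by LSP-ID keeps each lookup step independent of the nesting depth; the loop terminates at the entry whose S-bit is 1, assuming the chain of entries is free of cycles.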

As we will see in section 5, we need this information for two reasons:

1) for sending some LDP message through the respective Control Plane sub H-LSP;

2) for building an NHLFE so as to concatenate two consecutive sub H-LSPs.

Examples:

PE1 shall maintain, after H-LSP U16 has been successfully established, the following MIB-entries:

{LSP-ID U16; S-bit=0; LSP-ID U13; label a2}
{LSP-ID U13; S-bit=0; LSP-ID U12; label a1}
{LSP-ID U12; S-bit=1; interface to P1; label a}

PE2 shall maintain:

{LSP-ID U23; S-bit=1; interface to P3; label c}

PE3 shall maintain:

{LSP-ID U34; S-bit=1; interface to P4; label d}

PE4 shall maintain:

{LSP-ID U46; S-bit=0; LSP-ID U45; label b1}
{LSP-ID U45; S-bit=1; interface to P6; label f}

PE5 shall maintain:

{LSP-ID U56; S-bit=1; interface to P7; label g}

3.2 Deriving the relevant C-Plane sub H-LSPs

The ingress endpoint router of a User Plane sub H-LSP must be able to derive the LSP-ID of the respective parallel Control Plane sub H-LSP (which has the same ingress and the same egress).

The egress endpoint router of a User Plane sub H-LSP must be able to derive the LSP-ID of the respective inverse Control Plane sub H-LSP.

Note that there may be several parallel User Plane sub H-LSPs (sharing the same ingress and the same egress), e.g. routed differently or carrying different types of traffic (voice, video, data, ...). They may have in common a parallel C-Plane sub H-LSP (with the same ingress and the same egress), which is either an additional sub H-LSP or one of these U-Plane sub H-LSPs.

Also note that there may be several parallel C-Plane sub H-LSPs (sharing the same ingress and the same egress), e.g. each dedicated to a different VPN.

Here is an (incomplete) list of property information pertaining to any sub H-LSP, which will also help to correlate User Plane sub H-LSPs and Control Plane sub H-LSPs as needed:

- its LSP-ID
- its ingress and its egress router addresses
- its plane type (U-Plane only, C-Plane only, U-Plane AND C-Plane)
- if U-Plane, (inherited) QoS/SLA/traffic parameters, bandwidth, color, preference, FEC etc.
- if U-Plane, the respective parallel (if not identical) C-Plane LSP
- if C-Plane, the LSP-ID of the inverse C-Plane LSP
- whether it is shared among several VPNs/communities or is exclusively owned by some specific VPN/community
- ID of the VPN/community which exclusively owns this H-LSP, if applicable
- etc.

All this information may be made available to whichever router needs it, either by BGP, OSPF, IS-IS or LDP extensions or by static configuration.

4 Explicit routing and Hop-by-Hop routing

4.1 Explicit Routing

The ingress edge router sends the entire list of LSP-IDs of the User Plane sub H-LSPs to be concatenated in an ER TLV. (Note that in current [CR-LDP] the ER HOP TLVs may contain LSP-IDs of C-Plane LSPs, which however are not necessarily identical to the User Plane sub H-LSPs to be concatenated. Some enhancement of the ER HOP TLV will be required, see section 6.2.)

Each transit edge router strips off the first entry before forwarding the list, so that each of them receives:

- First listed entry: LSP-ID of the U-Plane sub H-LSP incoming from upstream.
- Second listed entry: LSP-ID of the U-Plane sub H-LSP outgoing to downstream.

Based on the locally available property information (see 3.2) it may derive the LSP-IDs of the two needed C-Plane sub H-LSPs, in particular,

- from the first listed entry, the inverse directed C-Plane sub H-LSP outgoing to upstream;
- from the second listed entry, the parallel directed C-Plane sub H-LSP outgoing to downstream.
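
The processing of the explicit route list at an edge router can be sketched as follows, using the sub H-LSPs of the U16 example. This is a sketch under stated assumptions: PROPERTIES stands in for the locally available property information of section 3.2 (kept in a single table here for brevity, although each router would only hold its own part), and the key names and the function name are hypothetical.

```python
# Hypothetical property table: U-Plane LSP-ID -> related C-Plane LSP-IDs.
PROPERTIES = {
    "U13": {"parallel_c": "C13", "inverse_c": "C31"},
    "U34": {"parallel_c": "C34", "inverse_c": "C43"},
    "U46": {"parallel_c": "C46", "inverse_c": "C64"},
}


def process_er_list(er_list):
    """Evaluate a received list of U-Plane LSP-IDs at a transit or egress PE."""
    incoming = er_list[0]
    # The egress PE receives a single entry and has no outgoing sub H-LSP.
    outgoing = er_list[1] if len(er_list) > 1 else None
    return {
        "incoming_u": incoming,
        "outgoing_u": outgoing,
        # LABEL_MAPPING travels upstream via the inverse C-Plane sub H-LSP.
        "mapping_via": PROPERTIES[incoming]["inverse_c"],
        # LABEL_REQUEST travels downstream via the parallel C-Plane sub H-LSP.
        "request_via": PROPERTIES[outgoing]["parallel_c"] if outgoing else None,
        # The first entry is stripped off before the list is forwarded.
        "forwarded_er_list": er_list[1:],
    }


AT_PE3 = process_er_list(["U13", "U34", "U46"])
AT_PE4 = process_er_list(AT_PE3["forwarded_er_list"])
AT_PE6 = process_er_list(AT_PE4["forwarded_er_list"])
print(AT_PE3)
print(AT_PE4)
print(AT_PE6)
```

With the list (U13, U34, U46) sent by PE1, PE3 derives C31 and C34, PE4 derives C43 and C46, and PE6 as egress only derives C64, which matches the derivations required in section 3.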

4.2 Hop-by-Hop Routing

The ingress edge router computes the first-hop U-Plane sub H-LSP based on FEC, parameters, etc. and sends its LSP-ID in a new "Next U-Plane sub H-LSP" TLV to the first transit PE router. Each transit PE repeats the same computation and replaces the received "Next U-Plane sub H-LSP" TLV by the self-computed one before it forwards the Label Request message to the next-hop PE.

In this way each transit PE will know the incoming as well as the outgoing U-Plane sub H-LSP, too, and is also able to derive the respective C-Plane sub H-LSPs.

5 Downstream-on-Demand process for establishing the H-LSP of level m

In comparison with conventional LSP setup, special attention is to be given to:

1) Sending some LDP message through the respective Control Plane sub H-LSP.

2) Building an NHLFE so as to concatenate two consecutive sub H-LSPs.

5.1 Sending an LDP message through the Control Plane sub H-LSP tunnel

Wherever a LABEL_REQUEST or a LABEL_MAPPING is to be sent to the next PE, we must determine the respective Control Plane sub H-LSP: its LSP-ID, then its initial label stack and its first physical interface.

Its LSP-ID can be derived as described in 3.2. Its initial label stack and its first physical interface can be derived as described in 3.1.

5.2 Label-switching of two consecutive User Plane sub H-LSPs

A transit PE which has received a label (value x) inside a Label TLV in the LABEL_MAPPING message must concatenate the upstream User Plane sub H-LSP Uu with the downstream User Plane sub H-LSP Ud. It assigns a new label (value y), replaces x by y in the Label TLV and forwards it in the LABEL_MAPPING message to the next upstream PE.

It must also allocate an NHLFE which is retrievable based on label value y. This NHLFE shall contain the first physical interface and the initial label stack of sub H-LSP Ud. That information can be retrieved as described in 3.1 and entered into the NHLFE.
Additionally, label value x is to be entered into the NHLFE so as to become the bottom-most label.

Example:

PE4 shall receive label x = 0 and assign label y = b2. Based on LSP-ID U46 it will retrieve the physical interface to P6 as well as the top-most label stack portion (f, b1). PE4 will form an NHLFE which is retrievable based on label y = b2 and which encompasses {interface to P6, label f, label b1, label x=0}.

6 Syntactical enhancements of [CR-LDP]

6.1 Enhancement of the LSP ID TLV

According to [CR-LDP] the LABEL_REQUEST message carries an LSP ID TLV. It is proposed to enhance this LSP ID TLV so that it may indicate that the LSP to be built will be an H-LSP. Furthermore, it may indicate whether it shall become a p2p H-LSP, an mp2p H-LSP or a p2mp H-LSP. Finally, it shall also indicate whether it may be used for transmitting U-Plane data, C-Plane data or both.

6.2 Enhancement of the ER HOP TLV

As pointed out in section 4.1, a different ER TLV is required which contains a list of ER HOP TLVs, whereby each ER HOP denotes the LSP-ID of another U-Plane sub H-LSP (and not of another C-Plane sub H-LSP as known in [CR-LDP]).

It is proposed to specify some of the RESERVED bits inside the ER HOP TLV so as to indicate:

a) the contained LSP-ID is the LSP-ID of a U-Plane sub H-LSP (a parallel C-Plane sub H-LSP may have to be derived from it);

b) the contained LSP-ID is the LSP-ID of a U-Plane sub H-LSP which may also be used as a C-Plane sub H-LSP.

6.3 New "Next U-Plane sub H-LSP" TLV

In 4.2 a new "Next U-Plane sub H-LSP" TLV is mentioned, which may carry the LSP-ID of the next U-Plane sub H-LSP. It may contain just one (enhanced) ER HOP TLV as described in 6.2.

7 References

[1] H. Hummel, J. Grimminger (Siemens AG): Partially meshed base tunnels plus hierarchical mp2p tunnel sequence LSPs, draft-hummel-ppvpn-mp2p-tunnel-sequencing-00.txt

[2] H. Hummel (Siemens AG): Tree/Ring/Meshy VPN tunnel systems, draft-hummel-ppvpn-tunnel-systems-00.txt

[3] K. Kompella (Juniper Networks): LSP Hierarchy with Generalized MPLS TE, draft-ietf-mpls-lsp-hierarchy-04.txt

[CR-LDP] RFC 3212 "Constraint-Based LSP Setup using LDP"

8 Authors' Addresses

Heinrich Hummel
Siemens AG
Hofmannstrasse 51
81379 Munich, Germany
Tel: +49 89 722 32057
Email: heinrich.hummel@icn.siemens.de

Jochen Grimminger
Siemens AG
Otto-Hahn-Ring 6
81739 Munich, Germany
Tel: +49 89 636 417410
Email: Jochen.Grimminger@mchp.siemens.de

Full Copyright Statement

Copyright (C) The Internet Society (March 2000). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.