CCAMP Working Group A. DÆAchille Ed.(CoRiTel) Internet Draft M. Listanti (CoRiTel) Expires: April 2005 U. Monaco Ed.(CoRiTel) F. Ricciato (CoRiTel) D. Ali (CoRiTel) V. Sharma (Metanoia, Inc.) October 2004 Diverse Inter-Region Path Setup/Establishment draft-dachille-diverse-inter-region-path-setup-01.txt Status of this Memo "By submitting this Internet-Draft, I certify that any applicable patent or other IPR claims of which I am aware have been disclosed, or will be disclosed, and any of which I become aware will be disclosed, in accordance with RFC 3668." This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. Abstract This memo describes a mechanism to establish end-to-end diverse or disjoint label switched paths (LSPs) across multiple regions, as required for inter-area and inter-AS traffic engineering. The mechanism relies on the joint computation, in each region, of the LSP segments of the diversely routed LSPs, thereby achieving the simplicity of the per-domain approach with effectively no changes to the signaling protocols. This is achieved by the use of a new RSVP-TE object, called the Associated Route Object or ARO, which is defined Dachille et al. Expires ûApril 2005 [Page 1] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 in this document and used in the signaling messages in support of the scheme. Table of Contents Status of this Memo...............................................1 1. Change History.................................................2 2. Introduction...................................................3 3. Terminology....................................................4 4. Network Model..................................................4 5. Novel Proposal: The Joint Selection approach (JSA).............7 5.1 ARO object: semantics, format, and subobjects.................8 5.1.1 ARO Sub-Objects.............................................9 5.1.2 Sub-object 1: IPv4 address.................................10 5.2 Full ARO expansion (Inter-area case).........................12 5.3 General Inter-Region Mesh Topology...........................14 5.4 Detailed Nodal Processing of ARO.............................16 6. Comparison with Alternative Proposals for Diverse Path........18 6.1 ISPA Iterated Shortest Path Approach (ISPA)..................18 6.2 Path Computation Server/Path Computation Element (PCS/PCE) Approach.........................................................20 7. Inter-region Path Selection Algorithms........................21 8. Operational Considerations: Setup Failure, Crankback, SRLGs...21 8.1 Handling Setup Failure.......................................21 8.2 Crankback....................................................22 8.3 SRLG Disjointedness Constraints..............................23 9. Experimental Results..........................................23 10. Security Considerations......................................24 10.1 Encrypting the ARO..........................................25 10.2 Use of Call Controllers.....................................25 11. References...................................................25 11.1 Normative References........................................26 11.2 Informative References......................................26 12. Appendix: Analysis of Crankback Probability..................27 13. Intellectual Property Considerations.........................29 13.1 IPR Disclosure Acknowledgement..............................29 14. Full Copyright Statement.....................................29 1. Change History October 2004: Revised to incorporate minor feedback received, and address minor edits. Added preliminary inputs on simulations results. Expires - April 2005 [Page 2] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 July 2004: Reissued as draft-dachille-diverse-inter-region-path- setup-00.txt. - Extensive reorganization of document. - Removed references to virtual links. - Incorporated major comments from Seoul IETF and subsequent IETF CCAMP ML discussions. - Reorganized to confirm to new IETF draft publication guidelines. January 2004: Reissued as draft-dachille-inter-area-path-protection- 1.txt, with some editorial corrections. January 2004: First issued as draft-dachille-inter-area-path- protection-00.txt. 2. Introduction End-to-end diverse path setup is an important requirement for both inter-area and inter-AS traffic engineering [INTER-AS-REQ],[INTER- AREA-REQ]. Such diverse paths have a number of useful applications: load balancing, the ability to use multiple paths when one has insufficient bandwidth, the establishment of multiple redundant paths between VoIP gateways, and the establishment of end-to-end disjoint LSPs for protection/restoration. In this document, we provide a mechanism and signaling protocol extensions to enable the joint and distributed computation of diverse paths in a multi-region (inter-area or inter-AS) network. Our scheme allows for each node to have only a limited knowledge of the network topology, while still achieving diverse path setup with minimal extensions to RSVP-TE signaling. The document is structured as follows: In Section 3, we introduce some terminology used in the rest of the document. In Section 4, we discuss our network model for both the inter-area and inter-AS cases, while in Section 5, we introduce our novel scheme, the Joint Selection Algorithm (JSA). We present the Associated Router Object (ARO) introduced in support of our scheme, and discuss its semantics and format. We then discuss the operation of our scheme for the inter-area and inter-AS cases, respectively. In Section 6, we compare our scheme with some other proposals that may also be used for diverse path computation. In particular, the Iterated Shortest Path Approach Expires - April 2005 [Page 3] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 (ISPA) scheme, which uses the exclude route [EXCLUDE-ROUTE] object, and the Path Computation Server scheme, which requires protocol objects for communication between path computation entities [INTER- AREA-AS-TE]. In Section 7, we highlight several path selection algorithms, any of which could be used with our scheme, while in Section 8, we discuss some operational considerations related to setup failure, crankback and SRLG disjointness. Finally, in Section 9, we discuss ways in which the privacy of inter- AS information may be retained across ASs when using our scheme. 3. Terminology ARO Associated Route Object. Defined in this document ASBR AS Border Node BN Border Node ERO Explicit Route Object, see [RSVP-TE] GLSP Generalized LSP; more details can be found in [GMPLS] PATH Downstream RSVP message, see [RSVP] Region ID A region identifier, see Section 5.1.4 RESV Upstream RSVP message, see [RSVP] RRO eXclude Route Object, see [EXCLUDE-ROUTE] SRLG A group of links that may be affected by the same fault; TE Traffic Engineering XRO Record Route Object, see [RSVP-TE] 4. Network Model We assume that the network can be modeled as a conglomerate of interconnected regions. Within each region we distinguish between border nodes and internal nodes. A border node (BN) is placed at the boundary of one or more regions, while an internal node is completely embedded within a single one. Note that a region might include only border nodes, with no internal nodes. Expires - April 2005 [Page 4] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 We assume that each BN has complete knowledge of the intra-region topology, and only a summarized view of the inter-region connectivity. On the intra-region side, each BN knows all the nodes and links included in the regions(s) that it is connected to, along with their associated attributes (e.g. available capacity, degree of reliability, etc.). On the inter-region side, each BN maintains a list of region-level paths for each destination: each record in this list represents the ordered sequence of crossed regions towards the specific destination, similar to the information that BGP (OSPF) already distributes at inter-AS (inter-area) level. As a working hypothesis, we will assume that every inter-region connection is duplicated, i.e. involves at least two different BNs on each side. This assumption is required to enforce node- disjointedness, and corresponds to the usual way in which inter- AS/inter-area connections are arranged in practice. Any possible extensions to this scheme to cope with a network where the duplicated inter-area connection hypothesis does not apply are considered as a point for further studies. In the rest of this draft, we will assume a MPLS control plane [MPLS], involving RSVP-TE signaling and OSPF-TE/ISIS-TE intra-domain routing. However, this proposal can be easily extended to a GMPLS control plane, as needed for instance in the case of optical networks. Within the assumptions made above, the concepts and mechanisms proposed here could apply to several scenarios. For instance, a region can be considered as an abstraction of: - an OSPF area, in a large IGP domain, as in Fig. 1; in this case, the BN are directly connected to various areas; - an Autonomous System (AS), in the inter-domain context, as depicted in Fig. 2; in this case, AS Border Routers (ASBRs) belonging to different ASs are connected through inter-AS links. The proposed approach could also be applied in a traffic engineered optical network if optical islands could be mapped to an area/AS. In this case, the message definitions and processing given in Section 5 would need to be adapted for a GMPLS control plane; the case for handling optical islands as separate administrative entities (no mapping to specific areas/ASs) is for further study. Expires - April 2005 [Page 5] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 In the AS scenario, we assume that the interconnection between different ASs is realized by means of ASBR-ASBR single-hop links and that an ASBR is allowed to flood the TE information related to the ASBR-ASBR links even if no IGP-TE is performed on those links, as stated in [INTER-AREA-AS-TE]. An ASBR is also allowed to flood the IGP adjacencies (ASBR-ASBR links), in order to allow routing on those links in case multiple routes can be selected. Note that this inter- AS link information is not flooded beyond the AS to which the link is attached, thus retaining privacy of AS information. The presence of ASBR-ASBR links is the main difference between the inter-area and inter-AS scenarios. This assumption allows the ingress ASBR at a given AS to compute the path(s) up to the first ASBR in the next hop AS; otherwise, the ingress ASBR of a given AS could have computed the path(s) only up to the egress ASBR of the current AS, which would then compute the inter-AS path(s) segment(s) to the next hop AS. The importance of this assumption will be clear in Section 4, where the computation mechanism is described. area 1 area 0 area 2 /-----------\ /------------\ /-------------\ / \ / \ / \ --/-- --\-- --\-- --\-- | |***********| |*********** | |*************| | | A | | C | * | E | | G | | | | | * | | | | ----- ----- * ----- ----- | * | * | *| | * | * | *| | * | * | *| | * | * | *| ----- ----- * ----- ----- | | | | * | | | | | B | | D | * | F | | H | | |***********| |************| |*************| | --\-- --/-- --/-- --/-- \ / \ / \ / \-----------/ \------------/ \-------------/ Expires - April 2005 [Page 6] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 ****** = intra-area link Fig. 1 ******** ******** ** ** ** ** * ------- ------- * * |ASBR1|~~~~~~~~|ASBR3| * * ------- ~ ~ ------- * * * ~ ~ * * * AS1 * ~~ * AS2 * * * ~~ * * * * ~ ~ * * * ------- ~ ~ ------- * * |ASBR2|~~~~~~~~|ASBR4| * * ------- ------- * ** ** ** ** ******** ******** ~~~~~~~ = inter-AS link Fig.2 5. Novel Proposal: The Joint Selection approach (JSA) We propose a novel scheme, called "Joint Selection Approach", based on the joint selection of diversely routed LSPs, for example the first and second LSPs when two dijsointly routed LSPs are required. This mechanism assumes that a disjoint route selection algorithm is implemented in the BNs, such as, for example the Suurballe algorithm [SUURBALLE] or the Bhandari algorithm [BHANDARI] or one of their extensions to cope with SRLG-disjointedness. A new RSVP-TE object called Associated Route Object (ARO), described in Section 5.1 is defined and used in the signaling messages in support of this scheme. Expires - April 2005 [Page 7] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 The mechanism foresees a preliminary phase in which the head-node selects the inter-region level paths for the two diverse LSPs (for e.g. two diversely routed LSPs in case of disjoint protection path computation). These might be fully or partially overlapping at the region-level. The region-level path for the primary LSP is included in the Explicit Route Object (ERO, [RSVP-TE]), while the region-level path for the secondary LSP is included in the Associated Route Object (ARO), proposed in this work. In other words, the head-node initializes the ERO and ARO with the region-level paths, while the JSA distributed scheme will expand them into link-level paths, i.e., determine the exact intra-region paths in terms of crossed nodes and links, and establish the LSP along such paths. This is achieved in two sequential steps: Step 1: Distributed expansion of the ERO, and setup of the primary LSP; expansion of the ARO limited to the regions that are in common with the ERO; Step 2: Completion of ARO expansion (for the regions NOT in common with the ERO) and setup of the secondary LSP. The ARO expansion only has to be performed by those intermediate BNs that lie in regions shared between the primary and secondary LSP: for such path segments a detailed node list is stored in the ARO in the place of a region identifier. Next we provide a detailed description of the JSA operation. For the sake of simplicity, we will first consider in Section 5.2 the simple case of a network with linear inter-region connectivity, as depicted in Fig 1. In this case the region-level paths for the primary and secondary LSP must necessarily fully overlap, thus requiring complete ARO expansion in Step 1 itself. In Section 5.3, we will describe the JSA mechanism in the general case of meshed inter-region connectivity, as reported in Fig 5. In this case, the region-level routes of the two LSPs may be partially or completely disjoint, requiring only a partial ARO expansion in Step 2. 5.1 ARO object: semantics, format, and subobjects This is a new RSVP-TE object used in JSA. Expires - April 2005 [Page 8] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 The route of the second path associated with a first path is advertised by means of the Associated Route Object (ARO). Class value and C_Type value are [TBD]. The ARO Object has the following format (Fig. 3a): 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ----------------------------------------------------------------- | | // SUB-OBJECTS // | | ----------------------------------------------------------------- Fig. 3a It is easy to notice that this format is very similar to that of ERO object [RSVP-TE]. Indeed it is in fact the same, except for Class and C_Type. We can think of this object as playing the role of the ôfuture ERO of the second pathö, and this explains why it is so similar to the ERO object. As happens with ERO, the sub-objects represent an abstract node, i.e. a group of real nodes, which may consist even of a single physical node (this concept is well described in [RSVP-TE]). 5.1.1 ARO Sub-Objects The ARO object is a collection of variable length sub-objects, as defined for ERO object; each sub-object has the following format: 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 ------------------------------------------------------------------ | | | | | Type | Length | SUB-OBJECTS Contents // | | | | ------------------------------------------------------------------ Expires - April 2005 [Page 9] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 Fig. 3b Type: the Type field specifies the type of the information included in the ôsub-object contentsö field; in fact it is necessary to distinguish not only between IP addresses and Region_IDs, but also among different kind of IP addresses and Region_IDs; the possible values are: -IPv4 address (Type = 1) -IPv6 address (Type = 2) -Autonomous System ID (Type = 32) -OSPF Area.ID (Type = 33) These last two ID are the so called ôRegion_ID". Length: the length field specifies the total length of the sub- objects in bytes, including the Type and Length fields; it must be a multiple of 4 and its value must be at least 4. 5.1.2 Sub-object 1: IPv4 address The format of this sub-object is shown in Fig. 4a: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ------------------------------------------------------------------ | | | | | Type = 1 | Length | IPv4 Address (4 Bytes) | | | | | ------------------------------------------------------------------ | | | | | IPv4 Address (continued) | Addr.len | Resvd | | | | | ------------------------------------------------------------------ Fig. 4a Expires - April 2005 [Page 10] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 In this case, Type = 1, Length = 8; IPv4 address: an IPv4 address treated as a prefix, based on the value of the field Addr_len. Bits beyond this prefix should be ignored on receipt. 5.1.3 Sub-object 2 : IPv6 address The format of this sub-object is shown in Fig. 4b 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ------------------------------------------------------------------ | | | | | Type = 2 | Length | IPv6 Address (16 Bytes) | | | | | ------------------------------------------------------------------ | | | IPv6 Address (continued) | | __________ | ------------------------------------------------------------------ | | | IPv6 Address (continued) | | | ------------------------------------------------------------------ | | | IPv6 Address (continued) | | | ------------------------------------------------------------------ | | | | | IPv6 Address (continued) | Addr.len | Resvd | | | | | ------------------------------------------------------------------ Fig. 4b In this case, Type = 2, Length = 20; IPv6 address: an IPv6 address treated as a prefix, based on the value of the field Addr.len. Bits beyond this prefix should be ignored on receipt. Expires - April 2005 [Page 11] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 5.1.4 Sub-object 32-33: Region_ID These two sub-objects have the same format, but differ one from the other in the Type field, and also by the structure of the identifiers whether they identify AS or OSPF areas; we decided to use the same format for these sub-objects because they are processed in the same manner. This format is shown in Fig. 4c 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ------------------------------------------------------------------ | | | | |Type = 32, 33 | Length | Region_ID | | | | | ------------------------------------------------------------------ Fig. 4c In this case, Type = 32, 33 Length = 4 Region_ID: An identifier that specifies an AS (Type=32), an OSPF area in a large IGP network (Type = 33). 5.2 Full ARO expansion (Inter-area case) Fig 1 may represent a multi-area OSPF network; in fact, OSPF areas are organized in a star topology, with a central area 0, the backbone, and all other areas connected to it. If we assume that each destination is connected to a single area, we can exclude stub areas that do not include source or destination nodes, so we end up with a chain of length 3. Suppose that BN A receives a diverse connection demand towards BN H; JSA foresees two steps: -Step 1: Joint and distributed computation of the two LSPs and installation of the first one. This step is divided in the following phases: - Step 1a: A computes two node-disjoint paths towards node H within area 1. This computation is performed through a disjoint route selection algorithm, e.g. [SUURBALLE] and Expires - April 2005 [Page 12] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 [BHANDARI]. The computed paths say A-C and A-B-D represent ERO and ARO expansion in Area 1. - Step 1b: A starts the signaling procedure to establish the primary LSP; the PATH message includes the following objects: - ERO specifying the route C(Strict)-A0(Loose)-A2(Loose)- H(Loose). A0 and A2 are area identifier (Area_ID) of area 0 and 2; - ARO specifying the route A-B-D-A0-A2. - Step 1c: BN C receives the PATH message and selects the two disjoint paths within Area 0. If no protection is required (i.e. no ARO in the PATH message), only one route is computed and only ERO expansion is completed; - Step 1d: node C performs ARO and ERO expansion and continues the signaling procedure; the PATH message contains now: -ERO specifying the route E(Strict)-A2(Loose)-H(Loose); -ARO specifying the route A-B-D-F-A2, since A0 has been expanded to the route (D-F); - Step 1e: BN E performs the last ERO and ARO expansion within Area 2: -ERO specifying the route G(Strict)-H(Strict); -ARO specifying the route A-B-D-F-H; the tail-end node H is added; - Step 1f: the tail-end node H duplicates the ARO and sends it back in the RESV message for the primary LSP. -Step 2: Setup of the secondary LSP; this step is divided into the following phases: - Step 2a: when BN A receives the ARO of Step 1, it copies the detailed node-level path list into the Explicit Route Object of the secondary LSP. Because of fully ARO expansion, ERO includes all nodes marked as Strict. - Step 2b: BN B,D, and F forward the PATH message along the route described in the ERO; - Step 2c: The PATH message arrives at the tail-end node H, which sends back RESV message to establish the secondary LSP; Expires - April 2005 [Page 13] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 - Step 2d: if the RESV message is received by BN A, the secondary LSP is correctly setup and traffic can be routed along one or both of the LSPs (for example, the working one, in case a pair of protection LSPs was being established). If the second LSP cannot be established, the first LSP must be torn down. In any case, if one of the previous steps is unsuccessful, the setup fails and a PATH-ERR message will notify the head-end node. In Section 8 some possible suggestions to address the case in which one of the previous step fails are proposed. 5.3 General Inter-Region Mesh Topology In this section we will describe JSA operations on a general inter- region mesh topology. In particular, we will refer to the network of Fig. 5, which can model a multi-AS network. We assume that the head- end node A receives a diverse path setup demand towards node H and the route selection algorithm at BN A returns the following inter-AS routes: P1 = AS1-AS3-AS4-AS5 and P2 = AS1-AS2-AS4-AS5. As inter-AS paths are partially overlapping, partial ARO expansion occurs. The JSA mechanism proceeds as follows: - Step 1: Joint and distributed computation of the two LSPs and setup of the primary LSP. This step is performed in the following phases: - Step 1a: BN A computes two disjoint paths towards node H, within AS 1; in this case, node A MUST select one ASBR for the next ASs (in this example AS2 and AS3) before the disjoint route selection algorithm is performed; the two LSPs will not share the next AS according to the selected inter-AS path. Assume that nodes B5 and B9 are chosen (the way this selection occurs depend on the path selection algorithm implemented in the ASBR). - Step 1b: Node A starts the signaling procedure, sending the PATH message along the primary LSP through ASs 1-3-4-5 including the following objects: - ERO specifying all the nodes between node A and node B5, marked as Strict, AS3(Loose)-AS4(Loose)- AS5(Loose)-node H(Loose); Expires - April 2005 [Page 14] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 - ARO specifying all the nodes between node A and node B9, then AS2-AS4-AS5. AS2, AS4, and AS5 are the AS_IDs, of the secondary LSP inter-AS route; - Step 1c: Upon receiving the PATH message, node B5 checks if AS3 (the AS through which H is reachable) is included in ARO; it is not, then only the primary LSP crosses AS 3. BN B5 performs an ERO expansion, as described in [EXCLUDE-ROUTE], while ARO is not be processed, and forwards the PATH message. Assume that nodes B13 is the next ASBR selected for the segment of the primary LSP through AS4. - Step 1d: Node B13 controls if AS4 is included in ARO; since the two LSP will share AS 4, node B13 performs the disjoint route selection algorithm. In this computation node B13 MUST prune node B14, since it must be the first node of the primary LSP segment in AS4. Assume that ASBRs B13 and B20 are selected along the primary LSP and B15 and B19 along the secondary. - Step 1e: ASBR B15 perform ERO and ARO expansion and forwards the PATH message to ASBR B20 with the following objects: - ERO specifying all the nodes between B13 and B20, marked as Strict, AS5(Loose)-H(Loose); - ARO specifying all the nodes between nodes A and B9, AS2, all the nodes between B15 and B19 and then AS5; - Step 1f: Since AS5 is included in ARO, node B20 must perform the disjoint route selection algorithm on AS5. ERO and ARO expansion occur as described in steps above. - Step 1g: Node H sends the RESV message to setup the primary LSP, copying the ARO into it. This ends the first step. -Step 2: Establishment of the secondary LSP computed in Step 1. This step is divided in the following phases: - Step 2a: BN A starts the signaling procedure to install the secondary LSP. A sends a PATH message including the ERO specifying the route obtained from the ARO it received in the RESV message of Step 1; all nodes are marked as Strict, while AS_ID are substituted by type 32 Loose sub-objects (see [RSVP-TE]); - Step 2b: The secondary LSP is established as described in [EXCLUDE-ROUTE], with the usual ERO expansion. Expires - April 2005 [Page 15] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 **** ** ** * * |----------B9 AS2 * | * * | B10 * | | ** ** | | B11**B12 | | | | *B1* | | | **** --- B2------| | ------------------- ** ** | A | * | | * * --- AS1 * | | * AS5 * * * | | * --- * * -----------------| | B19 | H | ** B4------| | | /** * --- *B3* | | | / B20** | | | | / / | | **** B15**B16 / / | | ** ** ** ** / / | B6 B7-----B14 B17 / | * AS3 * * AS4 * / |---------B5 * * B18 * * * * ** ** ** ** **B8------------B13*** A = Head-end node H = Tail-end node --------- = Inter-AS Link Bi = AS Border Router "i" Fig. 5 5.4 Detailed Nodal Processing of ARO In this section we explain how the ARO object is processed. Since the purpose of the ARO is to keep track of the nodes along the secondary path, the only operations performed on it will be reading it or adding sub-objects to it. Expires - April 2005 [Page 16] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 When a PATH message arrives at a node, the node first checks whether the received PATH message contains an ARO; if no ARO is present, the requested path is unprotected, and the procedure explained in [EXCLUDE-ROUTE] can be followed; on the other hand, if the requested path is protected, the ARO is included in the PATH message sent along the first LSP and processed as follows. A BN receiving a PATH message relative to the first LSP MUST check if Region_ID of the next region towards the tail-end node, say region X, is included in one of the ARO sub-objects (e.g. in the example of Section 4.3, node B13 checks if AS_ID AS4 is included in the ARO). - If region X ID is included, the first and second LSPs will share the region X and the border node MUST perform disjoint route selection algorithm on the topology obtained pruning the following nodes, depending on whether the two LSPs share the previous region along the primary path: - If the two LSPs do not share the previous region as a common one, all border nodes connecting to regions either not included in the ARO, or included in the ERO, MUST be pruned. This operation guarantees edge continuity and disjointedness of the two LSPs within the region X. - Otherwise the two LSPs share the previous region and the ARO includes the last border node computed along the secondary route. Except the last BN in the ARO, all border nodes connecting to regions either not included in the ARO or not included in the ERO, MUST be pruned. Moreover, all the border nodes connecting to next regions that are included in the ERO and in the ARO MUST also be pruned, except for one single border node per region. The selection of this border node depends on the implemented path selection algorithm. In this way it is assured that the two LSPs will follow the computed region-level paths. - If region X ID is not included in the ARO, the secondary LSP will not share region X with the first one. The border node, MUST only perform an ERO expansion as described in [EXCLUDE-ROUTE] and no ARO processing is required. When a PATH message containing ERO and ARO objects arrives at the tail-end node, the ARO object is copied into the RESV message that is sent back to the head-end node. Likewise, the ARO will not be Expires - April 2005 [Page 17] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 processed by intermediate nodes, but only by the head-end node, since it provides the route of the secondary path. When such a RESV message arrives at the head-end node, the ARO must be used to create the ERO included in the PATH message sent along the second LSP, with the following constraints: - Every node included in the ARO is copied into ERO and marked as Strict; - If ARO contains Region_IDs (only in the partial ARO expansion case), a proper Type-32 ERO sub-object is added to ERO (see [RSVP-TE] for details). When the head-end node sends a PATH including such an ERO, it is processed as in [EXCLUDE-ROUTE] or in [VASSEUR]; ERO expansion occurs whenever the next hop is marked as Loose. If this step is successful, a RESV message will allocate resources for the secondary path, otherwise the primary path previously installed must be torn down. 6. Comparison with Alternative Proposals for Diverse Path Setup Alternative solutions to the problem of routing end-to-end disjoint LSPs in a multi-area/AS MPLS network can be found in [EXCLUDE-ROUTE] and [INTER-AREA-AS-TE]. In this section, we briefly outline the key differences between JSA and these alternative approaches. 6.1 ISPA Iterated Shortest Path Approach (ISPA) A distributed solution to the problem of routing end-to-end disjoint LSPs in a multi-area/AS MPLS network was proposed in [EXCLUDE-ROUTE]. Therein a new RSVP-TE object was introduced, namely the XRO (eXclude Route Object) and the issue of diverse LSPs setup in a multi-region scenario is described as an example of one practical application of the XRO. We will refer to this approach as the Iterated Shortest-Path Approach (ISPA). In this section, we provide a brief description of ISPA, and outline some of its limitations. With reference to the network of Fig 1, we describe the ISPA assuming that a diverse path setup from node A to node H is required. The mechanism foresees two steps: Expires - April 2005 [Page 18] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 - Step 1: The first LSP is setup in a distributed manner along the shortest end-to-end path, say P1. This involves ERO expansion. In the RESV message the Record Route Object RRO (see [RSVP-TE] for details) collect the list of nodes along the first LSP; - Step 2: the list of nodes along P1 is included in the XRO. A PATH message containing the XRO starts the setup of the second LSP. Every node included into the XRO will be pruned before performing the route selection algorithm for the second path P2. With respect to JSA, ISPA has two main drawbacks, namely: i) trapping and ii) sub-optimal route selection. The problem of trapping was introduced in [DOVERSPIKE], and applies whenever a trap topology is present in the network. In [TRAP-AVOID] it is claimed that this problem may occur with a probability as high as 30% in a typical mesh network when node-disjoint paths are required. Fig. 1 shows a sample network affected by trapping. The network consists of three areas, namely area 1 (nodes A,B,C,D), area 0 (nodes C,D,E,F) and area 2 (nodes E,F,G,H), and links with equal costs. Assuming that node A receives a diverse path request towards node H, in the first step, ISPA will select and setup path A- C-F-H as the first LSP. Because of the pruning, after the first path selection no more paths are available from A to H (i.e., nodes C and F will be included into the XRO Object). On the other hand we already showed that JSA correctly find a pair of disjoint paths, namely A-B-D-F-H and A-C-E- G-H. Beyond trapping, a second drawback of ISPA is that, it may lead to sub-optimal path selection if the chosen optimization metric is the sum of the weights of both the paths; this drawback may arise even in the case of a single area network. In fact it could be easily shown that performing iterative instances of a shortest path selection algorithm by pruning at each iteration all the nodes in the resulting path, may lead to worse paths selection than a joint selection algorithm can do. Basically the asset of the JSA is the joint paths computation that may lead to better paths selection with respect to an iterated approach. Expires - April 2005 [Page 19] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 6.2 Path Computation Server/Path Computation Element (PCS/PCE) Approach Another distributed solution is proposed in [INTER-AREA-AS-TE]. It is a mechanism using one or more distributed servers, called PCE/PCS (Path Computation Element/Path Computation Server). In a network modeled as described in Section 4, a BN could play the role of PCE, or the PCE could be located in other nodes; this requires that a discovery procedure be implemented. Obviously, this scheme requires the implementation of a protocol to allow PCE-PCE communication, and therefore further information overhead for the network. We give a brief description of the PCE. Basically, the mechanism foresees three steps: - Step 1: discovery by the head-end LSR of a PCE capable of serving its path computation request. The PCE can either be statically configured or dynamically discovered via IGP extensions; - Step 2: an RSVP Path computation request is sent to the selected PCE. The PCE of Region(i) relays the path computation request to the selected PCE peer in Region(i+1) located in the next hop Region. The process is iterated until the destination Region is reached. - Step 3: once a requesting PCE receives an RSVP Path computation reply, it computes the shortest path from Region(i) to Region(i+1) by means of a Shortest Path Tree computation. The SPT is then communicated to the head-end LSR, which can select the optimal inter-region path. This scheme, as the JSA, is distributed; however, since it does not rely on summarized information [FARREL], it can achieve optimal inter-region path computation. An optimal path is defined as the path that would have been computed making use of some CSPF algorithm in the absence of multiple IGP areas. Path optimality is beyond the scope of this document. On the other hand, optimality is reached by communicating the entire SPT to the head-end LSR; this may involve a non-negligible information overhead, in particular as the number of the crossed ASes AND the number of diversely routed LSPs grow. Expires - April 2005 [Page 20] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 7. Inter-region Path Selection Algorithms According to the mechanisms described in Section 5, a region-level path selection for both the first and the second LSPs must be performed before the joint computation scheme can start. Therefore, an inter-region path selection algorithm must be available at the head-node. We showed how node-disjointedness (i.e., at the intra-region level) between the two paths can be enforced regardless of whether the inter-region level paths determined by the head-node are partially or fully overlapping. The metrics used by the inter-region path selection algorithm could be either derived from the inter-region routing protocol (OSPF/BGP) or provided by some commercial agreements between Service Providers. In any case, the scheme proposed here is independent of the particular inter-region path selection algorithm used. Even though the policy that a given node applies in the selection of the region-level paths, in case multiple paths are available, is out of scope for this document, some possible criteria for the inter- region path selection algorithm could be to find: - the shortest region-level paths (that minimize the sum of the cost of both inter-region paths); - fully disjoint inter-region paths (except for the head and the tail regions) or alternatively paths computed in order to maximize the region-level disjointedness; - fully overlapping inter-region paths; - region-level paths can also be statically configured. 8. Operational Considerations: Setup Failure, Crankback, SRLGs 8.1 Handling Setup Failure The issue of handling setup failures needs to be better investigated, since could be tackled by means of several draft proposals; below we propose some possible solutions. If one of the steps described in Section 5.2 and 5.3 is unsuccessful the following options can be exercised: - a PATH-ERR message can be sent back, and the entire procedure restarted, with a maximum of N tries (the parameter N may be optimized to minimize signaling overhead on the network); Expires - April 2005 [Page 21] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 - the reiteration of the entire procedure could also be improved by means of the XRO [EXCLUDE-ROUTE], which could be used to exclude from the new iteration a segment or the full path on which the failure happened; in this case, the PATH-ERR message should bring back the node(s) to exclude; - crankback; an expanded description of how crankback could be used to handle setup failures could be found below in this Section. 8.2 Crankback A interesting point is the relation of crankback within the ARO scheme. We first observe that the JSA scheme has the ability to make possible diverse path setup with crankback [CRANKBACK] (if it becomes necessary), particularly when there is a particular trap topology (remember that JSA can handle cases of traps that are completely embedded within a region), whereas some other sequential schemes can fail completely, requiring the entire primary and secondary to be setup from scratch. Two main conditions under which a crankback could be triggered are: i) Lack of bandwidth ii) Existence of a trap-like case (see below) Regarding i) bandwidth exhaustion should be considered an "exceptional" event in an operational network (it indicates that the carrier should have increased the link bandwidth some time ago). In this case, the ARO scheme may have to crankback, but any scheme that is efficient in the most common cases, while leading to some additional burden (i.e., crankback signaling) in some rare cases is clearly preferable to any alternative scheme that adds significant signaling overhead in all cases [RFC 3439]. Regarding ii), as explained in the Appendix (Section 12), there might sometimes be some "particular" intra-region topologies that, coupled with some "special" inter-AS connectivity schemes, could lead the ARO scheme to crankback. However, as shown in the Appendix, the probability of cranking back N hops, is an exponentially decreasing function of N. Further, as discussed in the Appendix, even in the event that the ARO scheme needs to crankback, in most cases, it will need to do so only one hop. We also note there that alternative schemes such as the PCS/PCE approach, are able to avoid crankback at the cost of a significant increase in computational complexity (having to compute the shortest Expires - April 2005 [Page 22] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 path tree rooted at the source/destination and extending to every possible ASBR between the source and destination). Alternatively, if these schemes were to target scalability, they would do so by introducing some partial pruning in the computation of the shortest path tree, which would make them susceptible to crankback as well. 8.3 SRLG Disjointedness Constraints The Joint Selection Approach could be easily extended to achieve SRLG-disjointedness in the inter-area scenario. This is because when considering IGP areas, the problem of incompatible or duplicated SRLG assignments is a non-issue. In fact, IGP areas would be under the control of a single provider, and the provider can reasonably be expected to assign distinct, globally unique SRLGs to the links in different areas. In the inter-AS scenario, the problem of assigning globally unique and consistent SRLGs across different ASs is _universal, and all schemes related to restoration have to deal with it. In particular, any_ scheme that does distributed path computation has to deal with it. Thus, whatever mechanisms (or inter-provider agreements) are devised to consistently assign SRLG values to links in different provider networks will also apply to the JSA approach. 9. Experimental Results We have performed some initial simulations comparing the ISPA and the JSA algorithms over a wide variety of topologies and ABR densities. The topology consisted of 3 areas connected in a linear segment as in Fig. 1,with the inter-area topology being fairly complex, and derived from realistic ISP topologies. As an example, Table 1 lists reports the names and numbers of nodes in each inter-area topology, and also provides the number of routers with degree 1 within each area. ISP Name Total No. of Nodes Nodes of degree 1 Abovenet 368 37 ebobe 161 27 Tiscali 248 82 Exodus 246 32 The last column provides the effective number of nodes within each area that could not be chosen as the head-end or the tail-end of an Expires - April 2005 [Page 23] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 inter-region LSP. Indeed, the nodes with node degree = 1 were removed from the topology since they could not be the head-end nor the tail- end of an inter-region protected LSP, nor could they act as intermediate nodes. For each of the ABR selection methods a set of 24 topologies has been derived considering the dispositions of the four ISP topologies over the three areas of the chain. We built up inter-area networks with 2 and 3 ABRs and ran simulations over the 4 sets of topologies (2 and 3 ABRs, random, and maximum degree ABR selection method). On each topology we resolved 500 instances of inter-area double path computation demands by means of the two distributed mechanisms (ISPA and JSA), and the centralized one (optimum case). Source and destination have been randomly chosen respectively in area 0 and area 2 among nodes with degree greater than 1. For each instance, we recorded full paths (if any) of the primary and secondary LSPs found by the three mechanisms. Our preliminary results (see [PERFCOMP]) basically indicate that: -- if diverse paths exist, JSA succeeds in finding them ôalmost alwaysö (JSA failed only in few cases over all the simulated instances); -- there are some critical network configurations (trapping configurations) over which ISPA obtain very poor performance with respect to JSA and the optimum; on the average 9% of the topologies caused such a problem; -- there is not an appreciable difference between the cost of diverse paths found by the distributed mechanisms and the optimum bound (when ISPA and JSA find the paths they perform a meaningful selection). In summary, we can say, based on our results thus far, that JSA performs very closely to the optimum (negligible probability of trapping, costs approximating the optimal). The main advantage of JSA over ISPA is in the higher robusteness from trapping, while the main advantage of JSA over other methods (es. PCE) is simplicity and scalability. We will discuss and present these results further during the upcoming IETF. 10. Security Considerations We recognize that there are cases where one operator may be unwilling to export the internal routes of its AS to the outside world (AS of Expires - April 2005 [Page 24] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 another operator), as may be required when the ARO is expanded (Section 4.3). In such cases, we propose two solutions to help retain the confidentiality of an operator's AS-specific details (other solutions may also be possible and we welcome inputs on those). 10.1 Encrypting the ARO In the first approach, it may be possible to encrypt the portion of the ARO that refers to internal routes, with only the identity of the border routers exported unencrypted. With this mechanism, the ARO portion referring to AS X, which is expanded by the border nodes of AS X, will travel downstream (in the PATH) and upstream (in the RESV) encrypted. The head end will copy the ARO into the ERO of the alternative path in a transparent way. When the messages arrive at the border nodes of AS X, they could decrypt and forward the PATH message along the pre-computed internal route (assuming all border nodes of AS X share the same keys). Notably, since the upstream nodes are removed from the ERO upon processing, the internal routes of AS X are never exposed to the outside world. 10.2 Use of Call Controllers In the second approach, each AS may have an AS-specific call controller. During the setup of the first path, when the border node at AS X computes the ARO, it passes it on to the call controller. Only the identity of the border node at which the second LSP will enter AS X is recorded in the ARO. Thus, the ARO can be thought of as a loose route, specifying only the border nodes at various Ass through which the path of the second LSP passes. As before, the head end simply copies this ARO into the ERO of the second (diverse) LSP. Upon receipt at an intermediate border node, the node queries its call controller to extract the complete path of the LSP through its AS, and replaces the loose segment between itself and the border node at the subsequent AS with this path. As earlier, by the time the PATH message arrives at the border node of the next AS, the entries in corresponding to the internal route through AS X have been "consumed", and these are never exposed to the outside. 11. References Expires - April 2005 [Page 25] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 11.1 Normative References [INTER-AS-REQ] R. Zhang, JP Vasseur, draft-ietf-tewg-interas-te- req-07, June 2004, work in progress. [INTER-AREA-REQ] JL Leroux, JP Vasseur, J. Boyle, draft-ietf-tewg- inter-area-mpls-te-req-02, June 2004, work in progress. [RSVP-TE] RSVP-TE: Extensions to RSVP for LSP Tunnels, RFC3209. [GMPLS] Generalized Multi-Protocol Label Switching, Signaling Functional Description, RFC3471. [RSVP] Resource ReserVation Protocol, ver.1, RFC2205. [MPLS] Multi Protocol Label Switching Architecture, RFC3031. 11.2 Informative References [EXCLUDE-ROUTE] CY Lee, A. Farrel, S. De Cnodder, ôExclude routes û extension to RSVP-TEö, draft-ietf-ccamp-rsvp- te-exclude-route-02, July 2004, work in progress. [INTER-AREA-AS-TE] JP Vasseur, A. Ayyangar, draft-vasseur-ccamp- inter-area-as-te-00, February 2004, work in progress. [SUURBALLE] J.W. Suurballe, ôDisjoint paths in a networkö, Networks, pp.125-145, 1974. [BHANDARI] R. Bhandari, ôSurvivable Networks ûAlgorithms for Diverse Routingö, AT&T Labs, Norwood, MA: Kluwer, 1999. [VASSEUR] J.P. Vasseur ôInter-AS MPLS Traffic Engineeringö, draft-vasseur-inter-as-te-01, work in progress. [DOVERSPIKE] B. Doverspike, Guanzhi Li, C Kalmanek, ôFiber Span Protection in mesh optical networksö, AT&T Labs, Optical Network Magazines vol.3, No.3, May/June 2002. Expires - April 2005 [Page 26] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 [TRAP-AVOID] Dahai Xu, Yizhi Xiong, et al., ôTrap avoidance and protection schemes in networks with shared risk link groupsö, IEEE Journal of Lightwave Technology, November 2003. [FARREL] A. Farrel, JP Vasseur, A. Ayyangar, ôA framework for Inter-Domain MPLS Traffic Engineergingö, draft-farrel-ccamp-inter-domain-framework.00.txt, May 2004, work in progress. [CRANKBACK] Farrel, A., et al., "Crankback Signaling Extensions for MPLS Signaling", draft-ietf-ccamp- crankback-01.txt, January 2004, work in progress. [1] [PERFCOMP] Ali, D., Monaco, U., "Performance Comparison of Different Diverse Inter-region LSP setup Mechanisms," pre-liminary report, CoRiTel and Univ. of Rome, October 2004. (Available from the authors.) 12. Appendix: Analysis of Crankback Probability Our goal in this appendix is to demonstrate that the probability of crankback in the ARO scheme is extremely low, and that in most cases, if crankback becomes necessary, the ARO scheme will have to crankback at most one hop. Let P be the probability that a situation that causes crankback occurs for one area (i.e. the probability of the joint event: "particular" topology in the AS(k) AND_ "special" connectivity scheme between AS(k) and AS(k+1)). Therefore, the probability that ARO signaling will have to crankback of N backwards hops (=AS) is P ^ N (P to the power of N), which should be negligible for more than 1 hop since in real cases P<<1. Therefore, in the worst case, for _parallel computation_there is the necessity to crankback completely only with a probability P^N. But most crankback events, if any at all, should be confined in a single hop, as explained below. Moreover, the "worst-case" for the alternative _sequential computation_ (i.e., presence of at least one trap topology along the path) occurs with a higher probability than the worst-case for ARO, and always leads to complete crankback. Expires - April 2005 [Page 27] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 Now, we explain more precisely what we mean by a "particular" intra area topology, and a "special" interconnectivity scheme. Let assume that at some point along the path, say at AS(J), you have only 2 egress points to AS(J+1) (we assumed that are not less than 2 in the draft), and 3 ingress point to AS(J+1), and that the first signaling phase for ARO has hitted I1(J+1) while only I2(J+1) and I3(J+1) have disjoint pairs towards the next AS(J+2). In this case it is not difficult to show that only 1-link crankback (i.e., from I(J+1) to E(J) suffices to overcome the block. Let us assume an even worse case, in which there are 3 egress points from AS(J) and 3 ingress on AS(J+1), with just 1:1 links between the E(J) and I(J+1) sets. Similarly to above, assume that the topology inside AS(J+1) is such that only I2(J+1) and I3(J+1) have disjoint pairs towards the next AS(J+2), while the signaling procedure ha hitted I1(J+1). In this case, since we assumed 1:1 links between E(J) and I(J+1), the scheme has to crankback 1-hop, until I(J) ! But then, from I(J) in general one could find a disjoint pair passing from the other two egress points in AS(J), in our case E2(J) and E3(J), thus avoiding E1(J). If this is not the case, one has hit a "particular" topology: one in which the selected I(J) (say I1(J) and I2(J) ) have a disjoint pair to E1(J)+E2(J) or to E1(J)+E3(J), but not_to E2(J)+E3(J) , AND there are other candidate ingress points to AS(J) other than I(J). 9. AuthorsÆ Addresses Fabio Ricciato Alessio DÆAchille CoRiTeL CoRiTeL Via Cavour 256 Via Cavour 256 00184 Roma 00184 Roma Italia Italia Phone: +39 06 47825184 Phone: +39 06 47825184 Email: ricciato@coritel.it Email: dachille@coritel.it Marco Listanti Ugo Monaco CoRiTeL CoRiTeL Via Cavour 256 Via Cavour 256 00184 Roma 00184 Roma Italia Italia Phone: +39 06 47825184 Phone: +39 06 47825184 Email: Email: marco@infocom.uniroma1.it monaco@infocom.uniroma1.it Expires - April 2005 [Page 28] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 Daniele Ali Vishal Sharma CoRiTeL Metanoia, Inc. Via Cavour 256 888 Villa St, Suite 200 00184 Roma Mountain View, CA 94041 Italia United States of America Phone: +39 06 47825184 Phone: +1-650-641-0082 Email: ali@coritel.it Email: v.sharma@ieee.org 13. Intellectual Property Considerations The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79. Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr. The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf- ipr@ietf.org. 13.1 IPR Disclosure Acknowledgement By submitting this Internet-Draft, I certify that any applicable patent or other IPR claims of which I am aware have been disclosed, and any of which I become aware will be disclosed, in accordance with RFC 3668. 14. Full Copyright Statement "Copyright (C) The Internet Society (2004). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights." Expires - April 2005 [Page 29] draft-dachille-diverse-inter-region-path-setup-01.txt October 2004 "This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE Of THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE." Expires - April 2005 [Page 30]