Inter-Domain Routing H. Gredler Internet-Draft J. Medved Intended status: Standards Track Juniper Networks, Inc. Expires: January 12, 2012 S. Previdi Cisco Systems, Inc. July 11, 2011 Advertising Link-State Information in BGP draft-gredler-bgp-te-01 Abstract This document defines a new Border Gateway Protocol Network Layer Reachability Information (BGP NLRI) encoding format that can be used to distribute a network topologies' link and node information. Links can be either physical links connecting physical nodes, or virtual paths between physical or abstract nodes. The network topology information is carried via the BGP, thereby reusing protocol algorithms, operational experience, and administrative processes, such as inter-provider peering agreements. The BGP protocol carrying Link State information would provide a well-defined, uniform, policy-controlled interface from the network to outside servers that need to learn the network topology in real- time, for example an ALTO Server or a Path Computation Server. Having Traffic Engineering (TE) information from remote areas and/or Autonomous Systems would allow path computation for inter-area and/or inter-AS source-routed unicast and multicast tunnels. Requirements Language The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119] Status of this Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any Gredler, et al. Expires January 12, 2012 [Page 1] Internet-Draft Link-State Information in BGP July 2011 time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on January 12, 2012. Copyright Notice Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Gredler, et al. Expires January 12, 2012 [Page 2] Internet-Draft Link-State Information in BGP July 2011 Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 3. Transcoding Link State Information into a BGP NLRI . . . . . . 5 3.1. NLRI format . . . . . . . . . . . . . . . . . . . . . . . 5 3.2. TLV Format . . . . . . . . . . . . . . . . . . . . . . . . 7 3.3. Node Descriptors . . . . . . . . . . . . . . . . . . . . . 7 3.3.1. Local Node Descriptors . . . . . . . . . . . . . . . . 8 3.3.2. Remote Node Descriptors . . . . . . . . . . . . . . . 8 3.3.3. Node Descriptor Sub-TLVs . . . . . . . . . . . . . . . 9 3.3.4. Router-ID Anchoring Example: ISO Pseudonode . . . . . 9 3.3.5. Router-ID Anchoring Example: OSPFv2 to IS-IS Migration . . . . . . . . . . . . . . . . . . . . . . 10 3.4. Link Descriptors . . . . . . . . . . . . . . . . . . . . . 10 3.5. Link Attributes . . . . . . . . . . . . . . . . . . . . . 11 3.5.1. MPLS Protocol TLV . . . . . . . . . . . . . . . . . . 12 3.5.2. TE Default Metric TLV . . . . . . . . . . . . . . . . 12 3.5.3. IGP Link Metric TLV . . . . . . . . . . . . . . . . . 13 3.5.4. Shared Risk Link Group TLV . . . . . . . . . . . . . . 13 3.5.5. OSPF specific link attribute TLV . . . . . . . . . . . 14 3.5.6. IS-IS specific link attribute TLV . . . . . . . . . . 14 3.6. Node Attributes . . . . . . . . . . . . . . . . . . . . . 15 3.6.1. Node Flag Bits TLV . . . . . . . . . . . . . . . . . . 15 3.6.2. OSPF Specific Node Properties TLV . . . . . . . . . . 15 3.6.3. IS-IS Specific Node Properties TLV . . . . . . . . . . 16 3.7. IGP Area Information . . . . . . . . . . . . . . . . . . . 16 3.8. Inter-AS Links . . . . . . . . . . . . . . . . . . . . . . 17 4. Link to Path Aggregation . . . . . . . . . . . . . . . . . . . 17 4.1. Example: No Link Aggregation . . . . . . . . . . . . . . . 17 4.2. Example: ASBR to ASBR Path Aggregation . . . . . . . . . . 18 4.3. Example: Multi-AS Path Aggregation . . . . . . . . . . . . 18 5. Originating the TED NLRI . . . . . . . . . . . . . . . . . . . 18 6. Receiving the TED NLRI . . . . . . . . . . . . . . . . . . . . 19 7. Use Cases . . . . . . . . . . . . . . . . . . . . . . . . . . 19 7.1. MPLS TE . . . . . . . . . . . . . . . . . . . . . . . . . 19 7.2. ALTO Server Network API . . . . . . . . . . . . . . . . . 20 7.3. Path Computation Element (PCE) TED Synchronization Protocol . . . . . . . . . . . . . . . . . . . . . . . . . 21 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 21 9. Security Considerations . . . . . . . . . . . . . . . . . . . 21 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 21 11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 22 11.1. Normative References . . . . . . . . . . . . . . . . . . . 22 11.2. Informative References . . . . . . . . . . . . . . . . . . 23 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 23 Gredler, et al. Expires January 12, 2012 [Page 3] Internet-Draft Link-State Information in BGP July 2011 1. Introduction Today, the contents of a link-state database usually has the scope of an IGP area. There are several use cases that could benefit from knowing the topology in a remote area or Autonomous System, but today no mechanism exists to distribute this information beyond an IGP area. This draft proposes to use BGP as the distribution mechanism for exchanging link-state data between routers in different IGP areas and/or Autonomous Systems. The mechanism can also be used to exchange topology and TE data between the network and external network-aware applications, such as the Alto Servers. The Border Gateway Protocol (BGP [RFC4271]) has grown beyond its original intention of disseminating IPv4 Inter-domain routing paths. A modern BGP implementation can be viewed as a ubiquitous database replication mechanism, which allows replication of many different state information types across arbitrary distribution graphs. Its built-in loop protection mechanism (AS path, Cluster List attributes) enables building of stable and redundant distribution topologies. In addition to IP routing, applications that use BGP for state distribution are L2VPN, VPLS, MAC-VPN, Route-target information, and Flowspec for firewalling. Using BGP as a dissemination protocol for topology data is a logical consequence. A router maintains one or more databases for storing link-state information about nodes and links in any given area. Link attributes stored in these databases are: local/remote IP addresses, local/ remote interface indices, metric, link bandwidth, reservable bandwidth, per CoS class reservation state, preemption and Shared Risk Link Groups (SRLG). The router's BGP process can retrieve topology from one of the link-state databases and distribute it to peer BGP Speakers using the encoding specified in this draft. A BGP Speaker may distribute the real physical topology from the Link State database or the Traffic Engineering database, or create an abstracted topology, where virtual, aggregated nodes are connected by virtual paths. Aggregated nodes can be created, for example, out of multiple routers in a POP. Abstracted topology can also be a mix of physical and virtual nodes and physical and virtual links. Consumers of the network topology and TE data are peer routers in other areas either in the router's own AS or in remote ASes, or entities outside the network that may need network and/or TE data to optimize their behavior. Gredler, et al. Expires January 12, 2012 [Page 4] Internet-Draft Link-State Information in BGP July 2011 2. Scope The scope of Link State NLRI are the static attributes / metrics of a path between two routers. The path can be a physical link or multiple links aggregated into a path. Dynamic data, such as reservable bandwidth or delay metrics, is out of scope of this draft. 3. Transcoding Link State Information into a BGP NLRI The MP_REACH and MP_UNREACH attributes are BGP's containers for carrying opaque information. Each Link State NLRI describes either a single node or link. All link and node information shall be encoded using a TBD AFI / SAFI 1 or SAFI 128 header into those attributes. SAFI 1 shall be used for Internet routing (Public) and SAFI 128 shall be used for VPN routing (Private) applications. In order for two BGP speakers to exchange Link-State NLRI, they must use BGP Capabilities Advertisement to ensure that they both are capable of properly processing such NLRI. This is done as specified in [RFC4760], by using capability code 1 (multiprotocol BGP), with an AFI of TBD and an SAFI of 1 or 128. 3.1. NLRI format 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | NLRI Type | Total NLRI Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | | Link-State NLRI (variable) | | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 1: Link State SAFI 1 NLRI Format Gredler, et al. Expires January 12, 2012 [Page 5] Internet-Draft Link-State Information in BGP July 2011 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | NLRI Type | Total NLRI Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | + Route Distinguisher + | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | | Link-State NLRI (variable) | | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 2: Link State SAFI 128 NLRI Format The 'Total NLRI Length" field contains the cumulative length of all the TLVs in the NLRI. For VPN applications it also includes the length of the Route Distinguisher. The 'NLRI Type' field can contain one of the following values: Type = 1: Link NLRI, contains link descriptors and link attributes Type = 2: Node NLRI, contains node attributes The Link NLRI (NLRI Type = 1) is shown in the following figure. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Local Node Descriptors (variable) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Remote Node Descriptors (variable) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Link Descriptors (variable) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Link Attributes (variable) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 3: The Link NLRI format The Node NLRI (NLRI Type = 2) is shown in the following figure. Gredler, et al. Expires January 12, 2012 [Page 6] Internet-Draft Link-State Information in BGP July 2011 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Local Node Descriptors (variable) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Node Attributes (variable) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 4: The Node NLRI format 3.2. TLV Format The Node Descriptors, Link Descriptors, Link Attribute, and Node Attribute fields are described using a set of Type/Length/Value triplets. The format of each TLV is shown in Figure 5. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | | Value (variable) | | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 5: TLV format The Length field defines the length of the value portion in octets (thus a TLV with no value portion would have a length of zero). The TLV is not padded to four-octet alignment; Unrecognized types are ignored. 3.3. Node Descriptors Each link gets anchored by at least a pair of router-IDs. Since there are many Router-IDs formats (32 Bit IPv4 router-ID, 56 Bit ISO Node-ID and 128 Bit IPv6 router-ID) a link may be anchored by more than one Router-ID pair. The set of Local and Remote Node Descriptors describe which Protocols Router-IDs will be following to "anchor" the link described by the "Link attribute TLVs". There must be at least one "like" router-ID pair of a Local Node Descriptors and a Remote Node Descriptors per-protocol. If a peer sends an illegal combination in this respect, then this is handled as an NLRI error, described in [RFC4760]. It is desirable that the Router-ID assignments inside the Node anchor are globally unique. However there may be router-ID spaces (e.g. Gredler, et al. Expires January 12, 2012 [Page 7] Internet-Draft Link-State Information in BGP July 2011 ISO) where not even a global registry exists, or worse, Router-IDs have been allocated following private-IP RFC 1918 [RFC1918] allocation. In order to disambiguate the Router-IDs the local and remote Autonomous System number TLVs of the anchor nodes may be included in the NLRI. The Local and Remote Autonomous System TLVs are 4 octets wide as described in [RFC4893]. 2-octet AS Numbers shall be expanded to 4-octet AS Numbers by zeroing the two MSB octets. 3.3.1. Local Node Descriptors The Local Node Descriptors TLV (Type 256) contains Node Descriptors for the node anchoring the local end of the link. The length of this TLV is variable. The value contains one or more Node Descriptor Sub- TLVs defined in Section 3.3.3. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | | Node Descriptor Sub-TLVs (variable) | | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 6: Local Node Descriptors TLV format 3.3.2. Remote Node Descriptors The Remote Node Descriptors TLV (Type 257) contains Node Descriptors for the node anchoring the remote end of the link. The length of this TLV is variable. The value contains one or more Node Descriptor Sub-TLVs defined in Section 3.3.3. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | | Node Descriptor Sub-TLVs (variable) | | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 7: Remote Node Descriptors TLV format Gredler, et al. Expires January 12, 2012 [Page 8] Internet-Draft Link-State Information in BGP July 2011 3.3.3. Node Descriptor Sub-TLVs The Node Descriptor Sub-TLV type codepoints and lengths are listed in the following table: +------+-------------------+--------+ | Type | Description | Length | +------+-------------------+--------+ | 258 | Autonomous System | 4 | | 259 | IPv4 Router-ID | 4 | | 260 | IPv6 Router-ID | 16 | | 261 | ISO Node-ID | 7 | +------+-------------------+--------+ Table 1: Node Descriptor Sub-TLVs The TLV values in Node Descriptor Sub-TLVs are as follows: Autonomous System: opaque value (32 Bit AS ID) IPv4 Router ID: opaque value (can be an IPv4 address or an 32 Bit router ID) IPv6 Router ID: opaque value (can be an IPv6 address or 128 Bit router ID) ISO Node ID: ISO node-ID (6 octets ISO system-ID plus PSN octet) 3.3.4. Router-ID Anchoring Example: ISO Pseudonode IS-IS Pseudonodes are a good example for the variable Router-ID anchoring. Consider Figure 8. This represents a Broadcast LAN between a pair of routers. The "real" (=non pseudonode) routers have both an IPv4 Router-ID and IS-IS Node-ID. The pseudonode does not have an IPv4 Router-ID. Two unidirectional links (Node1, Pseudonode 1) and (Pseudonode 1, Node 2) are being generated. The NRLI for (Node1, Pseudonode1) encodes local IPv4 router-ID, local ISO node-ID and remote ISO node-id) The NLRI for (Pseudonode1, Node2) encodes a local ISO node-ID, remote IPv4 router-ID and remote ISO node-id. Gredler, et al. Expires January 12, 2012 [Page 9] Internet-Draft Link-State Information in BGP July 2011 +-----------------+ +-----------------+ +-----------------+ | Node1 | | Pseudonode 1 | | Node2 | |1921.6800.1001.00|--->|1921.6800.1001.02|--->|1921.6800.1002.00| | 192.168.1.1 | | | | 192.168.1.2 | +-----------------+ +-----------------+ +-----------------+ Figure 8: IS-IS Pseudonodes 3.3.5. Router-ID Anchoring Example: OSPFv2 to IS-IS Migration Migrating gracefully from one IGP to another requires congruent operation of both routing protocols during the migration period. The target protocol (IS-IS) supports more router-ID spaces than the source (OSPFv2) protocol. When advertising a point-to-point link between an OSPFv2-only router and an OSPFv2 and IS-IS enabled router the following link information may be generated. Note that the IS-IS router also supports the IPv6 traffic engineering extensions RFC 6119 [RFC6119] for IS-IS. The NRLI encodes local IPv4 router-id, remote IPv4 router-id, remote ISO node-id and remote IPv6 node-id. 3.4. Link Descriptors The 'Link Descriptor' field is a set of Type/Length/Value (TLV) triplets. The format of each TLV is shown in Figure 5. The 'Link descriptor' TLVs uniquely identify a link between a pair of anchor Routers. The encoding of 'Link Descriptor' TLVs, i.e. the Codepoints in 'Type', and the 'Length' and 'Value' fields are the same as defined in [RFC5305], [RFC5307], and [RFC6119] for sub-TLVs in the Extended IS reachability TLV. The Codepoints are in the IANA Protocol Registry for IS-IS, sub-TLV Codepoints for TLV 22, [IANA-ISIS]. Although the encodings for 'Link Descriptor' TLVs were originally defined for IS-IS, the TLVs can carry data sourced either by IS-IS or OSPF. The following link descriptor TLVs are valid in the Link NLRI: Gredler, et al. Expires January 12, 2012 [Page 10] Internet-Draft Link-State Information in BGP July 2011 +------+-------------------------------+------------------------+ | Type | Description | Defined in: | +------+-------------------------------+------------------------+ | 4 | Link Local/Remote Identifiers | [RFC5307], Section 1.1 | | 6 | IPv4 interface address | [RFC5305], Section 3.2 | | 8 | IPv4 neighbor address | [RFC5305], Section 3.3 | | 12 | IPv6 interface address | [RFC6119], Section 4.2 | | 13 | IPv6 neighbor address | [RFC6119], Section 4.3 | +------+-------------------------------+------------------------+ Table 2: Link Descriptor TLVs 3.5. Link Attributes The 'Link Attributes' field is a set of Type/Length/Value (TLV) triplets. The format of each TLV is shown in Figure 5. For Codepoints < 255, the encoding of 'Link Attributes' TLVs, i.e. the Codepoints in 'Type', and the 'Length' and 'Value' fields are the same as defined in [RFC5305], [RFC5307], and [RFC6119] for sub-TLVs in the Extended IS reachability TLV. The Codepoints are in the IANA Protocol Registry for IS-IS, sub-TLV Codepoints for TLV 22, [IANA-ISIS]. Although the encodings for 'Link Attributes' TLVs were originally defined for IS-IS, the TLVs can carry data sourced either by IS-IS or OSPF. For Codepoints > 255, the encoding of 'Link Attributes' TLVs is described in subsequent sections. The following link attribute TLVs are valid in the Link NLRI: +-------+--------------------------------+------------------------+ | Type | Description | Defined in: | +-------+--------------------------------+------------------------+ | 3 | Administrative group (color) | [RFC5305], Section 3.1 | | 9 | Maximum link bandwidth | [RFC5305], Section 3.3 | | 10 | Max. reservable link bandwidth | [RFC5305], Section 3.5 | | 11 | Unreserved bandwidth | [RFC5305], Section 3.6 | | 20 | Link Protection Type | [RFC5307], Section 1.2 | | 64509 | MPLS Protocol | Section 3.5.1 | | 64510 | TE Default Metric | Section 3.5.2 | | 64511 | IGP Link Metric | Section 3.5.3 | | 64512 | Shared Risk Link Group | Section 3.5.4 | | 64513 | OSPF specific link attribute | Section 3.5.5 | | 64514 | IS-IS specific link attribute | Section 3.5.6 | +-------+--------------------------------+------------------------+ Table 3: Link Attribute TLVs Gredler, et al. Expires January 12, 2012 [Page 11] Internet-Draft Link-State Information in BGP July 2011 3.5.1. MPLS Protocol TLV The MPLS Protocol TLV (Type 64511) carries a bit mask describing which MPLS signaling protocols are enabled. The length of this TLV is 1. The value is a bit array of 8 flags, where each bit represents an MPLS Protocol capability. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |L R | +-+-+-+-+-+-+-+-+ Figure 9: MPLS Protocol TLV The following bits are defined: +-----+---------------------------------------------+-----------+ | Bit | Description | Reference | +-----+---------------------------------------------+-----------+ | 0 | Label Distribution Protocol (LDP) | [RFC5036] | | 1 | Extension to RSVP for LSP Tunnels (RSVP-TE) | [RFC3209] | | 2-7 | Reserved for future use | | +-----+---------------------------------------------+-----------+ Table 4: MPLS Protocol TLV Codes 3.5.2. TE Default Metric TLV The TE Default Metric TLV (Type 64512) carries the TE Default metric for this link. This TLV corresponds to the IS-IS TE Default metric sub-TLV (Type 18), defined in RFC5305, Section 3.7 [RFC5305], and the OSPF TE Metric sub-TLV (Type 5), defined in RFC3630, Section 2.5.5 [RFC3630]. If the value in the TE Default metric TLV is derived from IS-IS TE Default Metric, then the upper 8 bits of this TLV are set to 0. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | TE Default Metric | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 10: TE Default metric TLV format Gredler, et al. Expires January 12, 2012 [Page 12] Internet-Draft Link-State Information in BGP July 2011 3.5.3. IGP Link Metric TLV The IGP Metric TLV (Type 64513) carries the IGP metric for this link. This attribute is only present if the IGP link metric is different from the TE Default Metric (Type 18). The length of this TLV is 3. If the length of the IGP link metric from which the IGP Metric value is derived is less than 3 (e.g. for OSPF link metrics or non-wide IS-IS metric), then the upper bits of the TLV are set to 0. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IGP Link Metric | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 11: IGP Link Metric TLV format 3.5.4. Shared Risk Link Group TLV The Shared Risk Link Group (SRLG) TLV (Type 64514) carries the Shared Risk Link Group information (see Section 2.3, "Shared Risk Link Group Information", of [RFC4202]). It contains a data structure consisting of a (variable) list of SRLG values, where each element in the list has 4 octets, as shown in Figure 12. The length of this TLV is 4 * (number of SRLG values). 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Shared Risk Link Group Value | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ............ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Shared Risk Link Group Value | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 12: Shared Risk Link Group TLV format Note that there is no SRLG TLV in OSPF-TE. In IS-IS the SRLG information is carried in two different TLVs: the IPv4 (SRLG) TLV (Type 138) defined in [RFC5307], and the IPv6 SRLG TLV (Type 139) defined in [RFC6119]. Since the Link State NLRI uses variable Router-ID anchoring, both IPv4 and IPv6 SRLG information can be carried in a single TLV. Gredler, et al. Expires January 12, 2012 [Page 13] Internet-Draft Link-State Information in BGP July 2011 3.5.5. OSPF specific link attribute TLV The OSPF specific link attribute TLV is an envelope that transparently carries optional link properties TLVs advertised by an OSPF router. The value field contains one or more optional OSPF link attribute TLVs. An originating router shall use this TLV for encoding information specific to the OSPF protocol or new OSPF extensions for which there is no protocol neutral representation in the BGP link-state NLRI. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | | OSPF specific link attributes (variable) | | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 13: OSPF specific link attribute format 3.5.6. IS-IS specific link attribute TLV The IS-IS specific link attribute TLV is an envelope that transparently carries optional link properties TLVs advertised by an IS-IS router. The value field contains one or more optional IS-IS link attribute TLVs. An originating router shall use this TLV for encoding information specific to the IS-IS protocol or new IS-IS extensions for which there is no protocol neutral representation in the BGP link-state NLRI. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | | IS-IS specific link attributes (variable) | | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 14: IS-IS specific link attribute format Gredler, et al. Expires January 12, 2012 [Page 14] Internet-Draft Link-State Information in BGP July 2011 3.6. Node Attributes The following node attribute TLVs are valid in the Node NLRI: +-------+--------------------------------+----------+ | Type | Description | Length | +-------+--------------------------------+----------+ | 65515 | Node Flag Bits | 1 | | 65516 | OSPF Specific Node Properties | variable | | 65517 | IS-IS Specific Node Properties | variable | +-------+--------------------------------+----------+ Table 5: Node Attribute TLVs 3.6.1. Node Flag Bits TLV The Node Flag Bits TLV (Type 1) carries a bit mask describing node attributes. The value is a bit array of 8 flags, where each bit represents an MPLS Protocol capability. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Flags | +-+-+-+-+-+-+-+-+ Figure 15: Node Flag Bits TLV format The bits are defined as follows: +-----+--------------+-----------+ | Bit | Description | Reference | +-----+--------------+-----------+ | 0 | Overload Bit | [RFC1195] | | 1 | Attached Bit | [RFC1195] | | 2 | External Bit | [RFC2328] | | 3 | ABR Bit | [RFC2328] | +-----+--------------+-----------+ Table 6: Node Flag Bits Definitions 3.6.2. OSPF Specific Node Properties TLV The OSPF Specific Node Properties TLV is an envelope that transparently carries optional node properties TLVs advertised by an OSPF router. The value field contains one or more optional OSPF node Gredler, et al. Expires January 12, 2012 [Page 15] Internet-Draft Link-State Information in BGP July 2011 property TLVs, such as the OSPF Router Informational Capabilities TLV defined in [RFC4970], or the OSPF TE Node Capability Descriptor TLV described in [RFC5073]. An originating router shall use this TLV for encoding information specific to the OSPF protocol or new OSPF extensions for which there is no protocol neutral representation in the BGP link-state NLRI. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | | OSPF specific node properties (variable) | | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 16: OSPF specific Node property format 3.6.3. IS-IS Specific Node Properties TLV The IS-IS Router Specific Node Properties TLV is an envelope that transparently carries optional node specific TLVs advertised by an IS-IS router. The value field contains one or more optional IS-IS node property TLVs, such as the IS-IS TE Node Capability Descriptor TLV described in [RFC5073]. An originating router shall use this TLV for encoding information specific to the IS-IS protocol or new IS-IS extensions for which there is no protocol neutral representation in the BGP link-state NLRI. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | | IS-IS specific node properties (variable) | | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 17: IS-IS specific Node property format 3.7. IGP Area Information IGP Area information can be carried in BGP communities. An implementation should support configuration that maps IGP areas to BGP communities. Gredler, et al. Expires January 12, 2012 [Page 16] Internet-Draft Link-State Information in BGP July 2011 3.8. Inter-AS Links The main source of TE information is the IGP, which is not active on inter-AS links. In order to inject a non-IGP enabled link into the BGP link-state RIB an implementation must support configuration of static links. 4. Link to Path Aggregation Distribution of all links available in the global Internet is certainly possible, however not desirable from a scaling and privacy point of view. Therefore an implementation may support link to path aggregation. Rather than advertising all specific links of a domain, an ASBR may advertise an "aggregate link" between a non-adjacent pair of nodes. The "aggregate link" represents the aggregated set of link properties between a pair of non-adjacent nodes. The actual methods to compute the path properties (of bandwidth, metric) are outside the scope of this document. The decision whether to advertise all specific links or aggregated links is an operator's policy choice. To highlight the varying levels of exposure, the following deployment examples shall be discussed. 4.1. Example: No Link Aggregation Consider Figure 18. Both AS1 and AS2 operators want to protect their inter-AS {R1,R3}, {R2, R4} links using RSVP-FRR LSPs. If R1 wants to compute its link-protection LSP to R3 it needs to "see" an alternate path to R3. Therefore the AS2 operator exposes its topology. All BGP TE enabled routers in AS1 "see" the full topology of AS and therefore can compute a backup path. Note that the decision if the direct link between {R3, R4} or the {R4, R5, R3) path is used is made by the computing router. AS1 : AS2 : R1-------R3 | : | \ | : | R5 | : | / R2-------R4 : : Figure 18: no-link-aggregation Gredler, et al. Expires January 12, 2012 [Page 17] Internet-Draft Link-State Information in BGP July 2011 4.2. Example: ASBR to ASBR Path Aggregation The brief difference between the "no-link aggregation" example and this example is that no specific link gets exposed. Consider Figure 19. The only link which gets advertised by AS2 is an "aggregate" link between R3 and R4. This is enough to tell AS1 that there is a backup path. However the actual links being used are hidden from the topology. AS1 : AS2 : R1-------R3 | : | | : | | : | R2-------R4 : : Figure 19: asbr-link-aggregation 4.3. Example: Multi-AS Path Aggregation Service providers in control of multiple-ASes may even decide to not expose their internal inter-AS links. Consider Figure 20. Rather than exposing all specific R3 to R6 links, AS3 is modeled as a single node which connects to the border routers of the aggregated domain. AS1 : AS2 : AS3 : : R1-------R3----- | : : \ | : : vR0 | : : / R2-------R4----- : : : : Figure 20: multi-as-aggregation 5. Originating the TED NLRI A BGP Speaker must be configured to originate TED NLRIs. Usually export of the TED database into BGP is enabled on ASBRs and ABRs. The BGP Speaker shall throttle the rate of TED NLRI updates. An implementation shall provide a configuration attribute for the Gredler, et al. Expires January 12, 2012 [Page 18] Internet-Draft Link-State Information in BGP July 2011 interval between updates. The minimum interval between updates is 30 seconds. 6. Receiving the TED NLRI This section describes the processing of TED NLRIs at the receiving BGP Speaker. TE attributes for a link received from an IGP have higher priority than TED NLRIs received via BGP. Multiple BGP Speakers may advertise the same TED NLRI; the receiving BGP Speaker can individually choose the source BGP Speaker for each NLRI. The AS_PATH attribute is used both for loop detection and for NLRI selection: the TED NLRI with shorter AS_PATH length is preferred. The Community and Extended Community path attributes are stored in the RIB and may be used in operator-defined policies. Communities can also be used to encode the IGP Area information. All other path attributes are ignored. 7. Use Cases 7.1. MPLS TE If a router wants to compute a MPLS TE path across IGP areas TED lacks visibility of the complete topology. This is an issue for large scale networks that need to segment their core networks into distinct areas because inter-area TE cannot get deployed there. Current solutions for inter area TE only compute the path for the first area. The router only has full topological visibility for the first area along the path, but not for subsequent areas. The best practice is to use a technique called "loose-hop-expansion" which uses the IGP computed shortest path topology for the remainder of the path. Therefore no non-SPF based path setup is possible across areas. This has disadvantages for path protection and path engineering applications, as shown in Figure 21. Gredler, et al. Expires January 12, 2012 [Page 19] Internet-Draft Link-State Information in BGP July 2011 ............................... ................................... : Area 51 : : Area 0 : : +--------+ +--------+ +--------+ : : ************************************************************ : : * +-------| R1 |-----| ABR1 |-----| R3 |-------+ * : : * | ######## | | # | | | | * : : * | # +--------+ +----|-#-+ +--------+ | * : : +-*-|-#-+ : :| # +---|-*-+ : : | * # | : :| # | * | : : | S # | : :| # | D | : : | # | : :| # | | : : +---|-#-+ : :| # +---|---+ : : | # +--------+ +----|-#-+ +--------+ | : : | ############################# | | | | : : +-------| R2 |-----| ABR2 |-----| R4 |-------+ : : | | | | | | : : +--------+ +--------+ +--------+ : : : : : :.............................: :.................................: ...... **** Primary LSP : : Area Boundary #### Bypass LSP :....: Figure 21: MPLS TE Bypass LSP problem Router S sets up an RSVP LSP from S to D. Although it has only visibility into Area 51, the LSP setup ultimately succeeds, as shortest path first routing from ABR1 onwards routes the RSVP message towards destination D. What does not work is to setup a Link Protection bypass LSP protection for the R1 to ABR1 link as shown in the figure. The problem is that the TE database at Router R1 does not have path visibility of the link between ABR1 and ABR2, such that it can compute the Link Bypass LSP. 7.2. ALTO Server Network API An ALTO Server is an entity that generates an abstracted network topology and provides it to network-aware applications over a web service based API. Example applications are p2p clients or trackers, or CDNs. The abstracted network topology comes in the form of two maps: the network map that specifies allocation of prefixes to PIDs, and the cost map that specifies the cost between the PIDs. For more details, see [I-D.ietf-alto-protocol]. ALTO abstract network topologies can be auto-generated from the physical topology of the underlying network. The generation would typically be based on policies and rules set by the operator. Both Gredler, et al. Expires January 12, 2012 [Page 20] Internet-Draft Link-State Information in BGP July 2011 prefix and TE data are required: prefix data is required to generate the network maps, TE (topology) data is required to generate the cost maps. Prefix data is carried and originated in BGP, TE data is originated and carried in an IGP. Without BGP TE NLRI the ALTO Server would have to peer with both BGP Speakers and IGP in multiple areas and/or ASes to obtain all the necessary network topology data. The BGP TE NLRI allows for a single interface between the network and the ALTO Server. 7.3. Path Computation Element (PCE) TED Synchronization Protocol RFC4655, Section 5.2, Figure 2 [RFC4655] describes a Path Computation Element (PCE) which synchronizes its traffic engineering database (TED) by use of a routing protocol. This memo describes the first standardized protocol for PCE to learn about inter-AS or inter-area TE information. 8. IANA Considerations This document requests a code point from the registry of Address Family Numbers This document requests creation of a new registry for node anchor, link descriptor and link attribute TLVs. The range of Codepoints in the registry is 0-65535. Values 0-255 will shadow Codepoints of the IANA Protocol Registry for IS-IS, sub-TLV Codepoints for TLV 22. Values 256-65535 will be used for Codepoints that are specific to the BGP TE NLRI. The registry will be initialized as shown in Table 2 and Table 3. Allocations within the registry will require documentation of the proposed use of the allocated value and approval by the Designated Expert assigned by the IESG (see [RFC5226]). Note to RFC Editor: this section may be removed on publication as an RFC. 9. Security Considerations This draft does not affect the BGP security model. 10. Acknowledgements We would like to thank Nischal Sheth from Juniper Networks for his input and contributions to this text. We would like to thank Alia Atlas, David Ward, John Scudder, Kaliraj Vairavakkalai, and Yakov Rekhter from Juniper Networks, Les Ginsberg and Mike Shand from Cisco Gredler, et al. Expires January 12, 2012 [Page 21] Internet-Draft Link-State Information in BGP July 2011 Systems, and Richard Woundy from Comcast for their comments. 11. References 11.1. Normative References [IANA-ISIS] "IS-IS TLV Codepoint, Sub-TLVs for TLV 22", . [RFC1195] Callon, R., "Use of OSI IS-IS for routing in TCP/IP and dual environments", RFC 1195, December 1990. [RFC1918] Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G., and E. Lear, "Address Allocation for Private Internets", BCP 5, RFC 1918, February 1996. [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. [RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, April 1998. [RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V., and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP Tunnels", RFC 3209, December 2001. [RFC3630] Katz, D., Kompella, K., and D. Yeung, "Traffic Engineering (TE) Extensions to OSPF Version 2", RFC 3630, September 2003. [RFC4202] Kompella, K. and Y. Rekhter, "Routing Extensions in Support of Generalized Multi-Protocol Label Switching (GMPLS)", RFC 4202, October 2005. [RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway Protocol 4 (BGP-4)", RFC 4271, January 2006. [RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter, "Multiprotocol Extensions for BGP-4", RFC 4760, January 2007. [RFC4893] Vohra, Q. and E. Chen, "BGP Support for Four-octet AS Number Space", RFC 4893, May 2007. [RFC5036] Andersson, L., Minei, I., and B. Thomas, "LDP Specification", RFC 5036, October 2007. Gredler, et al. Expires January 12, 2012 [Page 22] Internet-Draft Link-State Information in BGP July 2011 [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 5226, May 2008. [RFC5305] Li, T. and H. Smit, "IS-IS Extensions for Traffic Engineering", RFC 5305, October 2008. [RFC5307] Kompella, K. and Y. Rekhter, "IS-IS Extensions in Support of Generalized Multi-Protocol Label Switching (GMPLS)", RFC 5307, October 2008. [RFC6119] Harrison, J., Berger, J., and M. Bartlett, "IPv6 Traffic Engineering in IS-IS", RFC 6119, February 2011. 11.2. Informative References [I-D.ietf-alto-protocol] Alimi, R., Penno, R., and Y. Yang, "ALTO Protocol", draft-ietf-alto-protocol-08 (work in progress), May 2011. [RFC4655] Farrel, A., Vasseur, J., and J. Ash, "A Path Computation Element (PCE)-Based Architecture", RFC 4655, August 2006. [RFC4970] Lindem, A., Shen, N., Vasseur, JP., Aggarwal, R., and S. Shaffer, "Extensions to OSPF for Advertising Optional Router Capabilities", RFC 4970, July 2007. [RFC5073] Vasseur, J. and J. Le Roux, "IGP Routing Protocol Extensions for Discovery of Traffic Engineering Node Capabilities", RFC 5073, December 2007. Authors' Addresses Hannes Gredler Juniper Networks, Inc. 1194 N. Mathilda Ave. Sunnyvale, CA 94089 US Email: hannes@juniper.net Gredler, et al. Expires January 12, 2012 [Page 23] Internet-Draft Link-State Information in BGP July 2011 Jan Medved Juniper Networks, Inc. 1194 N. Mathilda Ave. Sunnyvale, CA 94089 US Email: jmedved@juniper.net Stefano Previdi Cisco Systems, Inc. Via Del Serafico, 200 Roma 00142 Italy Email: sprevidi@cisco.com Gredler, et al. Expires January 12, 2012 [Page 24]