PCEP Extension for Distribution of Link-State and TE Information.

In Multiprotocol Label Switching (MPLS) and Generalized MPLS (GMPLS), a Traffic Engineering Database (TED) is used in computing paths for connection-oriented packet services and for circuits. The TED contains all relevant information that a Path Computation Element (PCE) needs to perform its computations. It is important that the TED be complete and accurate each time the PCE performs a path computation.

In MPLS and GMPLS, interior gateway routing protocols (IGPs) have been used to create and maintain a copy of the TED at each node running the IGP. One of the benefits of the PCE architecture is the use of computationally more sophisticated path computation algorithms and the realization that these may need enhanced processing power not necessarily available at each node participating in an IGP.

Section 4.3 of describes the potential load of the TED on a network node and proposes an architecture where the TED is maintained by the PCE rather than the network nodes. However, it does not describe how a PCE would obtain the information needed to populate its TED. A PCE may construct its TED by participating in the IGP ( and for MPLS-TE; and for GMPLS). An alternative is offered by BGP-LS .

proposes some other approaches for learning and maintaining the Link-State and TE information directly on a PCE as an alternative to IGPs and BGP flooding and investigates the impact from the PCE, routing protocol, and node perspectives.

describes the specifications for the Path Computation Element Communication Protocol (PCEP). PCEP specifies the communication between a Path Computation Client (PCC) and a PCE, or between two PCEs, based on the PCE architecture .

This document describes a mechanism by which Link-State and TE information can be collected from networks and shared with a PCE using PCEP itself. This is achieved using a new PCEP message format.
The mechanism is applicable to physical and virtual links and can further be subject to various policies.

A network node maintains one or more databases for storing link-state and TE information about nodes and links in any given area. Link attributes stored in these databases include: local/remote IP addresses, local/remote interface identifiers, link metric and TE metric, link bandwidth, reservable bandwidth, per-CoS class reservation state, preemption, and Shared Risk Link Groups (SRLG). The node's PCEP process can retrieve topology from these databases and distribute it to a PCE, either directly or via another PCEP Speaker, using the encoding specified in this document.

Further, describes the Hierarchical-PCE architecture, where a parent PCE maintains a domain topology map. A child PCE MAY transport (abstract) Link-State and TE information to a parent PCE using the mechanism described in this document.

describes LSP state synchronization between PCCs and PCEs in the case of a stateful PCE. This document does not make any change to the LSP state synchronization process. The mechanism described in this document operates on top of the existing LSP state synchronization.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in .

The terminology is as per and .

As per , the mechanism specified in this draft is applicable in the following cases:

* Where there is no IGP or BGP-LS running in the network.
* Where there is no IGP or BGP-LS running at the PCE to learn link-state and TE information.
* Where there is IGP or BGP-LS running but with a need for a faster TE and link-state population and convergence at the PCE.
  * A PCE may receive partial information (say basic TE, link-state) from IGP and other information (optical and impairment) from PCEP.
  * A PCE may receive an incremental update (as opposed to the entire information of the node/link).
  * A PCE may receive full information from both existing mechanisms (IGP or BGP) and PCEP.
* Where there is a need for transporting (abstract) Link-State and TE information from a child PCE to a parent PCE in H-PCE ; as well as from a Physical Network Controller (PNC) to a Multi-Domain Service Coordinator (MDSC) in Abstraction and Control of TE Networks (ACTN) .

A PCC may further choose to send only local information or both local and remote learned information. How a PCE manages the link-state (and TE) information is implementation specific and thus out of scope of this document.

The following key requirements associated with link-state (and TE) distribution are identified for PCEP:

* A PCEP speaker supporting this draft MUST have a mechanism to advertise the Link-State (and TE) distribution capability.
* A PCC supporting this draft MUST have the capability to report the link-state (and TE) information to the PCE. This includes self-originated information and remote information learned via routing protocols. The PCC MUST be capable of performing the initial bulk sync at the time of session initialization as well as reporting changes afterwards.
* A PCE MAY learn link-state (and TE) information from PCEP as well as from existing mechanisms such as IGP/BGP-LS. The PCEP extension MUST have a mechanism to link the information learned via other means. There MUST NOT be any changes to the existing link-state (and TE) population mechanism via IGP/BGP-LS. The PCEP extension SHOULD keep the properties in a protocol (IGP or BGP-LS) neutral way, such that an implementation may not need to know about any OSPF or IS-IS or BGP protocol specifics.
* It SHOULD be possible to encode only the changes in link-state (and TE) properties (after the initial sync) in PCEP messages.
* The same mechanism should be used for MPLS-TE as well as for GMPLS, optical, and impairment-aware properties.
* The same mechanism should be used for PCE-to-PCE link-state (and TE) synchronization.
* The extension in this draft SHOULD be extensible to support various architecture options listed in .

Several new functions are required in PCEP to support distribution of link-state (and TE) information. A function can be initiated either from a PCC towards a PCE (C-E) or from a PCE towards a PCC (E-C). The new functions are:

* Capability advertisement (E-C, C-E): both the PCC and the PCE must announce during PCEP session establishment that they support the PCEP extensions for distribution of link-state (and TE) information defined in this document.
* Link-State (and TE) synchronization (C-E): after the session between the PCC and a PCE is initialized, the PCE must learn Link-State (and TE) information before it can perform path computations. In the case of a stateful PCE, it is RECOMMENDED that this operation be done before LSP state synchronization.
* Link-State (and TE) Report (C-E): a PCC sends an LS (and TE) report to a PCE whenever the Link-State and TE information changes.

In this document, we define a new PCEP message called LS Report (LSRpt), a PCEP message sent by a PCC to a PCE to report link-state (and TE) information. Each LS Report in an LSRpt message can contain the node or link properties. A unique PCEP-specific LS identifier (LS-ID) is also carried in the message to identify a node or link; it remains constant for the lifetime of a PCEP session. This identifier on its own is sufficient when no IGP or BGP-LS is running in the network for the PCE to learn link-state (and TE) information. In case the PCE learns some information from PCEP and some from the existing mechanisms, the PCC SHOULD include the IGP or BGP-LS identifier so that the information populated via PCEP can be mapped to the information learned via IGP/BGP-LS. See for details.

During the PCEP Initialization Phase, PCEP Speakers (PCE or PCC) advertise their support of LS (and TE) distribution via PCEP extensions. A PCEP Speaker includes the "LS Capability" TLV, described in , in the OPEN Object to advertise its support for PCEP-LS extensions. The presence of the LS Capability TLV in the PCC's OPEN Object indicates that the PCC is willing to send LS Reports whenever local link-state (and TE) information changes. The presence of the LS Capability TLV in the PCE's OPEN message indicates that the PCE is interested in receiving LS Reports whenever local link-state (and TE) information changes.

The PCEP protocol extensions for LS (and TE) distribution MUST NOT be used if one or both PCEP Speakers have not included the LS Capability TLV in their respective OPEN message. If a PCE supports the extensions of this draft but did not advertise this capability, then upon receipt of an LSRpt message from the PCC, it SHOULD generate a PCErr with error-type 19 (Invalid Operation), error-value TBD1 (Attempted LS Report if LS capability was not advertised) and it will terminate the PCEP session.

The LS reports sent by a PCC MAY carry the remote link-state (and TE) information learned via existing means like IGP and BGP-LS only if both PCEP Speakers set the R (remote) Flag in the "LS Capability" TLV to 'Remote Allowed (R Flag = 1)'. If this is not the case and LS reports carry remote link-state (and TE) information, then a PCErr with error-type 19 (Invalid Operation) and error-value TBD1 (Attempted LS Report if LS remote capability was not advertised) is generated and the PCEP session is terminated.
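The capability rules above can be pictured with a small sketch. It is illustrative only: the names `LsCapability`, `negotiate`, and `report_allowed` are hypothetical and not part of this specification, and the three mode strings are simply labels chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LsCapability:
    """LS Capability TLV as seen in one speaker's OPEN (hypothetical model)."""
    remote: bool  # R flag: remote LS information allowed


def negotiate(pcc: Optional[LsCapability], pce: Optional[LsCapability]) -> str:
    """Return the LS distribution mode for a session.

    None means the speaker did not include the LS Capability TLV.
    """
    if pcc is None or pce is None:
        # The extensions MUST NOT be used unless both speakers advertised them.
        return "disabled"
    if pcc.remote and pce.remote:
        # Remote information is allowed only if both sides set the R flag.
        return "local-and-remote"
    return "local-only"


def report_allowed(mode: str, carries_remote_info: bool) -> bool:
    """Return False when an LSRpt would trigger PCErr error-type 19."""
    if mode == "disabled":
        return False
    if carries_remote_info and mode != "local-and-remote":
        return False
    return True
```

A receiver could call `report_allowed` for each incoming LSRpt and, on `False`, generate the PCErr and close the session as described above.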

The purpose of LS Synchronization is to provide a checkpoint-in-time state replica of a PCC's link-state (and TE) database in a PCE. State Synchronization is performed immediately after the Initialization phase (see ). In the case of a stateful PCE (), it is RECOMMENDED that LS synchronization be done before LSP state synchronization.

During LS Synchronization, a PCC first takes a snapshot of the state of its database, then sends the snapshot to a PCE in a sequence of LS Reports. Each LS Report sent during LS Synchronization has the SYNC Flag in the LS Object set to 1. The end of synchronization marker is an LSRpt message with the SYNC Flag set to 0 for an LS Object with LS-ID equal to the reserved value 0. If the PCC has no link-state to synchronize, it will only send the end of synchronization marker.

Either the PCE or the PCC MAY terminate the session using the PCEP session termination procedures during the synchronization phase. If the session is terminated, the PCE MUST clean up the state it received from this PCC. The session re-establishment MUST be re-attempted per the procedures defined in , including use of a back-off timer.

If the PCC encounters a problem which prevents it from completing the LS synchronization, it MUST send a PCErr message with error-type TBD2 (LS Synchronization Error) and error-value 5 (indicating an internal PCC error) to the PCE and terminate the session.

The PCE does not send positive acknowledgements for properly received LS synchronization messages. It MUST respond with a PCErr message with error-type TBD2 (LS Synchronization Error) and error-value 1 (indicating an error in processing the LSRpt) if it encounters a problem with the LS Report it received from the PCC, and it MUST terminate the session.

The LS reports can carry local as well as remote link-state (and TE) information depending on the R flag in the LS capability TLV.

The successful LS Synchronization sequence is shown in .

          PCC                      PCE
           |                        |
           |-----LSRpt, SYNC=1----->| (Sync start)
           |                        |
           |-----LSRpt, SYNC=1----->|
           |            .           |
           |            .           |
           |            .           |
           |-----LSRpt, SYNC=1----->|
           |            .           |
           |            .           |
           |            .           |
           |                        |
           |-----LSRpt, SYNC=0----->| (End of sync marker
           |                        |  LS Report
           |                        |  for LS-ID=0)
           |                        | (Sync done)

The sequence where the PCE fails during the LS Synchronization phase is shown in .

          PCC                      PCE
           |                        |
           |-----LSRpt, SYNC=1----->|
           |            .           |
           |            .           |
           |            .           |
           |-----LSRpt, SYNC=1----->|
           |                        |
           |---LSRpt,SYNC=1         |
           |         \    ,-PCErr---|
           |          \  /          |
           |           \/           |
           |           /\           |
           |          /  `--------->| (Ignored)
           |<--------`              |

The sequence where the PCC fails during the LS Synchronization phase is shown in .

          PCC                      PCE
           |                        |
           |-----LSRpt, SYNC=1----->|
           |            .           |
           |            .           |
           |            .           |
           |-------- PCErr--------->|
           |                        |
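As a rough illustration of the PCC side of LS Synchronization, the sketch below turns a database snapshot into the sequence of LS Reports followed by the end of synchronization marker. The `LsReport` class and the `synchronize` function are hypothetical names used for this example only; the payload is treated as an opaque byte string.

```python
from dataclasses import dataclass
from typing import Dict, List

END_OF_SYNC_LS_ID = 0                      # reserved LS-ID of the end-of-sync marker
RESERVED_LS_IDS = {0, 0xFFFFFFFFFFFFFFFF}  # values a real node/link may not use


@dataclass(frozen=True)
class LsReport:
    ls_id: int
    sync: bool           # SYNC flag of the LS object
    payload: bytes = b""


def synchronize(database: Dict[int, bytes]) -> List[LsReport]:
    """Build the LS Report sequence for one synchronization run."""
    snapshot = dict(database)  # checkpoint-in-time copy of the PCC database
    if RESERVED_LS_IDS & snapshot.keys():
        raise ValueError("database uses a reserved LS-ID")
    # Every report sent during synchronization carries SYNC=1.
    reports = [LsReport(ls_id, True, payload)
               for ls_id, payload in sorted(snapshot.items())]
    # The marker is an LS object with LS-ID 0 and SYNC=0.
    reports.append(LsReport(END_OF_SYNC_LS_ID, False))
    return reports
```

With an empty database the function returns only the marker, which matches the case where the PCC has no link-state to synchronize.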

These optimizations are described in .

The PCC MUST report any changes in the link-state (and TE) information to the PCE by sending an LS Report carried in an LSRpt message to the PCE. Each node and link is uniquely identified by a PCEP LS identifier (LS-ID). The LS reports may carry local as well as remote link-state (and TE) information depending on the R flag in the LS capability TLV. If the R flag is set, the report MAY also include the IGP or BGP-LS identifier so that the information populated via PCEP can be mapped to the information learned via IGP/BGP-LS. More details about the LSRpt message are in .

A permanent PCEP session MUST be established between a PCE and PCC supporting link-state (and TE) distribution via PCEP. In the case of session failure, session re-establishment MUST be re-attempted per the procedures defined in .

As defined in , a PCEP message consists of a common header followed by a variable-length body made of a set of objects that can be either mandatory or optional. An object is said to be mandatory in a PCEP message when the object must be included for the message to be considered valid. For each PCEP message type, a set of rules is defined that specify the set of objects that the message can carry. An implementation MUST form the PCEP messages using the object ordering specified in this document.

A PCEP LS Report message (also referred to as LSRpt message) is a PCEP message sent by a PCC to a PCE to report the link-state (and TE) information. An LSRpt message can carry more than one LS Report. The Message-Type field of the PCEP common header for the LSRpt message is set to [TBD3]. The format of the LSRpt message is as follows:

   <LSRpt Message> ::= <Common Header>
                       <ls-report-list>

   Where:

   <ls-report-list> ::= <LS>[<ls-report-list>]
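To make the message layout concrete, here is a minimal sketch that prepends the PCEP common header to a list of already-encoded LS objects. The common header layout (3-bit version, 5-bit flags, 8-bit Message-Type, 16-bit Message-Length covering the header) is the base PCEP one; the Message-Type value used here is a placeholder because [TBD3] is not yet assigned, and `build_lsrpt` is a hypothetical helper name.

```python
import struct
from typing import List

PCEP_VERSION = 1
LSRPT_MESSAGE_TYPE = 0xF0  # placeholder only: the real value is [TBD3]


def build_lsrpt(ls_objects: List[bytes]) -> bytes:
    """Return an LSRpt message: common header followed by the LS objects."""
    if not ls_objects:
        # The LS object is mandatory, so an empty report list is not valid.
        raise ValueError("an LSRpt message needs at least one LS object")
    body = b"".join(ls_objects)
    length = 4 + len(body)  # Message-Length includes the 4-byte common header
    if length > 0xFFFF:
        raise ValueError("message too long for the 16-bit length field")
    # First byte: version in the top 3 bits, flags (all zero) in the low 5 bits.
    header = struct.pack("!BBH", PCEP_VERSION << 5, LSRPT_MESSAGE_TYPE, length)
    return header + body
```

Several LS Reports can be batched by passing several encoded LS objects in one call.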

The LS object is a mandatory object which carries LS information of a node or a link. Each LS object has a unique LS-ID as described in . If the LS object is missing, the receiving PCE MUST send a PCErr message with Error-type=6 (Mandatory Object missing) and Error-value=[TBD4] (LS object missing).

A PCE may choose to implement a limit on the LS information a single PCC can populate. If an LSRpt is received that causes the PCE to exceed this limit, it MUST send a PCErr message with error-type 19 (Invalid Operation) and error-value 4 (indicating resource limit exceeded) in response to the LSRpt message triggering this condition and SHOULD terminate the session.
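The per-PCC limit could be enforced along the following lines. This is only a sketch under one assumption: the limit is expressed as a maximum number of distinct LS-IDs, which is an implementation choice and not something this document mandates. `PccLsStore` and `LsLimitExceeded` are hypothetical names.

```python
from typing import Dict


class LsLimitExceeded(Exception):
    """Maps to PCErr error-type 19 (Invalid Operation), error-value 4."""
    error_type = 19
    error_value = 4


class PccLsStore:
    """LS information learned from a single PCC, with an entry limit."""

    def __init__(self, max_entries: int) -> None:
        self.max_entries = max_entries  # assumed unit: number of LS-IDs
        self.entries: Dict[int, bytes] = {}

    def accept(self, ls_id: int, payload: bytes) -> None:
        """Store or update one LS report, refusing it if the limit is hit."""
        is_new = ls_id not in self.entries
        if is_new and len(self.entries) >= self.max_entries:
            raise LsLimitExceeded(f"limit of {self.max_entries} entries reached")
        self.entries[ls_id] = payload
```

Updates to an LS-ID that is already stored never count against the limit in this sketch; only new LS-IDs do.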

If a PCEP speaker has advertised the LS capability on the PCEP session, the PCErr message MAY include the LS object. If the error reported is the result of an LS report, then the LS-ID number MUST be the one from the LSRpt that triggered the error. The format of a PCErr message from is extended as follows:

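   <PCErr Message> ::= <Common Header>
                       ( <error-obj-list> [<Open>] ) | <error>
                       [<error-list>]

   <error-obj-list>::=<PCEP-ERROR>[<error-obj-list>]

   <error>::=[<request-id-list> | <ls-id-list>]
              <error-obj-list>

   <request-id-list>::=<RP>[<request-id-list>]

   <ls-id-list>::=<LS>[<ls-id-list>]

   <error-list>::=<error>[<error-list>]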

The PCEP objects defined in this document are compliant with the PCEP object format defined in . The P flag and the I flag of the PCEP objects defined in this document MUST always be set to 0 on transmission and MUST be ignored on receipt since these flags are exclusively related to path computation requests.
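The handling of the P and I flags can be sketched with the base PCEP common object header (8-bit Object-Class, 4-bit Object-Type, 2 reserved bits, P, I, and a 16-bit Object Length that includes the header). The function names are hypothetical, and the Object-Class passed in by a caller would be a placeholder until [TBD6] is assigned.

```python
import struct
from typing import Tuple


def encode_object_header(object_class: int, object_type: int,
                         body_length: int) -> bytes:
    """Encode a common object header with Res, P and I all set to 0."""
    if not 0 <= object_class <= 0xFF:
        raise ValueError("Object-Class must fit in 8 bits")
    if not 0 <= object_type <= 0xF:
        raise ValueError("Object-Type must fit in 4 bits")
    total = 4 + body_length  # Object Length includes the 4-byte header
    if total % 4 != 0 or total > 0xFFFF:
        raise ValueError("object length must be a multiple of 4 and fit 16 bits")
    ot_and_flags = object_type << 4  # low nibble (Res, P, I) stays zero
    return struct.pack("!BBH", object_class, ot_and_flags, total)


def decode_object_header(data: bytes) -> Tuple[int, int, int]:
    """Return (Object-Class, Object-Type, Object Length); P and I are ignored."""
    object_class, ot_and_flags, length = struct.unpack("!BBH", data[:4])
    return object_class, ot_and_flags >> 4, length
```

Dropping the low nibble on decode is one simple way to satisfy "MUST be ignored on receipt".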

This document defines a new optional TLV for use in the OPEN Object.

The LS-CAPABILITY TLV is an optional TLV for use in the OPEN Object for link-state (and TE) distribution via PCEP capability advertisement. Its format is shown in the following figure:

The type of the TLV is [TBD5] and it has a fixed length of 4 octets. The value comprises a single field - Flags (32 bits): R (remote - 1 bit): if set to 1 by a PCC, the R Flag indicates that the PCC allows reporting of remote LS information learned via other means like IGP and BGP-LS; if set to 1 by a PCE, the R Flag indicates that the PCE is capable of receiving remote LS information (from the PCC point of view). The R Flag must be advertised by both a PCC and a PCE for LSRpt messages to report remote as well as local LS information on a PCEP session. The TLVs related to IGP/BGP-LS identifier MUST be encoded when both PCEP speakers have the R Flag set. Unassigned bits are considered reserved. They MUST be set to 0 on transmission and MUST be ignored on receipt. Advertisement of the LS capability implies support of local link-state (and TE) distribution, as well as the objects, TLVs and procedures defined in this document.
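A possible encoding of this TLV is sketched below, using the generic PCEP TLV layout (16-bit Type, 16-bit Length, then the value). Two things are assumptions made only for the example: the Type value stands in for the unassigned [TBD5], and the R flag is placed in the least significant bit of the Flags field, since the bit position is defined by the figure rather than by the text.

```python
import struct

LS_CAPABILITY_TLV_TYPE = 0xFF01  # placeholder only: the real value is [TBD5]
LS_CAPABILITY_TLV_LENGTH = 4     # fixed length of the value, in octets
R_FLAG = 0x00000001              # assumed bit position of the R (remote) flag


def encode_ls_capability(remote: bool) -> bytes:
    """Encode the LS-CAPABILITY TLV; unassigned flag bits are sent as 0."""
    flags = R_FLAG if remote else 0
    return struct.pack("!HHI", LS_CAPABILITY_TLV_TYPE,
                       LS_CAPABILITY_TLV_LENGTH, flags)


def decode_ls_capability(data: bytes) -> bool:
    """Return the R flag; unassigned flag bits are ignored on receipt."""
    tlv_type, length, flags = struct.unpack("!HHI", data)
    if tlv_type != LS_CAPABILITY_TLV_TYPE or length != LS_CAPABILITY_TLV_LENGTH:
        raise ValueError("not an LS-CAPABILITY TLV")
    return bool(flags & R_FLAG)
```

Remote reporting is enabled on a session only when both the PCC's and the PCE's TLV decode to `True`.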

The LS (link-state) object MUST be carried within LSRpt messages and MAY be carried within PCErr messages. The LS object contains a set of fields used to specify the target node or link. It also contains a flag indicating to a PCE that the LS synchronization is in progress. The TLVs used with the LS object correlate with the IGP/BGP-LS encodings. LS Object-Class is [TBD6]. Four Object-Type values are defined for the LS object so far: LS Node: LS Object-Type is 1. LS Link: LS Object-Type is 2. LS IPv4 Topology Prefix: LS Object-Type is 3. LS IPv6 Topology Prefix: LS Object-Type is 4. The format of all types of LS object is as follows:

Protocol-ID (8-bit): This field provides the source information. In case the PCC only provides local information (R flag is not set), it MUST use Protocol-ID as Direct. The following values are defined (same as ):

Flags (32-bit):

* S (SYNC - 1 bit): the S Flag MUST be set to 1 on each LSRpt sent from a PCC during LS Synchronization. The S Flag MUST be set to 0 in other LSRpt messages sent from the PCC.
* R (Remove - 1 bit): on LSRpt messages, the R Flag indicates that the node/link/prefix has been removed from the PCC. Upon receiving an LS Report with the R Flag set to 1, the PCE SHOULD remove all state for the node/link/prefix identified by the LS identifier from its database.

Unassigned bits are considered reserved. They MUST be set to 0 on transmission and MUST be ignored on receipt.

LS-ID (64-bit): A PCEP-specific identifier for the node or link or prefix information. A PCC creates a unique LS-ID for each node/link/prefix that is constant for the lifetime of a PCEP session. The PCC will advertise the same LS-ID on all PCEP sessions it maintains at a given time. All subsequent PCEP messages then address the node/link/prefix by the LS-ID. The values of 0 and 0xFFFFFFFFFFFFFFFF are reserved.

TLVs that may be included in the LS Object are described in the following sections.
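On the PCE side, the flag and LS-ID rules might be applied as in the following sketch. The class names are hypothetical, attributes are modelled as a plain dictionary, and merging the attributes of an update into the stored entry is one possible way to handle incremental updates, not a requirement of this document.

```python
from dataclasses import dataclass, field
from typing import Dict

RESERVED_LS_IDS = {0, 0xFFFFFFFFFFFFFFFF}


@dataclass
class LsObject:
    ls_id: int
    sync: bool = False    # S flag
    remove: bool = False  # R flag
    attributes: Dict[str, object] = field(default_factory=dict)


class PceLsDatabase:
    """LS state a PCE keeps for one PCEP session (illustrative only)."""

    def __init__(self) -> None:
        self.entries: Dict[int, Dict[str, object]] = {}
        self.synchronized = False

    def apply(self, obj: LsObject) -> None:
        if obj.ls_id == 0 and not obj.sync:
            # LS-ID 0 with SYNC=0 is the end-of-synchronization marker.
            self.synchronized = True
            return
        if obj.ls_id in RESERVED_LS_IDS:
            raise ValueError("reserved LS-ID used for a node/link/prefix")
        if obj.remove:
            # R flag: drop all state for this node/link/prefix.
            self.entries.pop(obj.ls_id, None)
            return
        # Create the entry or merge changed attributes into it.
        self.entries.setdefault(obj.ls_id, {}).update(obj.attributes)
```

A session handler would call `apply` once per LS object, in the order the objects appear in each LSRpt message.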

In case of remote link-state (and TE) population when existing IGP/BGP-LS are also used, OSPF and IS-IS may run multiple routing protocol instances over the same link as described in . See and for more information. These instances define independent "routing universes". The 64-Bit 'Identifier' field is used to identify the "routing universe" where the LS object belongs. The LS objects representing IGP objects (nodes or links or prefix) from the same routing universe MUST have the same 'Identifier' value; LS objects with different 'Identifier' values MUST be considered to be from different routing universes. The format of the ROUTING-UNIVERSE TLV is shown in the following figure:

The table below lists the 'Identifier' values that are defined as well-known in this draft (same as ).

If this TLV is not present the default value 0 is assumed.

As described in , each link is anchored by a pair of Router-IDs that are used by the underlying IGP, namely, the 48-bit ISO System-ID for IS-IS and the 32-bit Router-ID for OSPFv2 and OSPFv3. In case additional auxiliary Router-IDs are used for TE, these MUST also be included in the link attribute TLV (see ). It is desirable that the Router-ID assignments inside the Node Descriptor are globally unique. Some considerations for globally unique Node/Link/Prefix identifiers are described in .

The Local Node Descriptors TLV contains Node Descriptors for the node anchoring the local end of the link. This TLV MUST be included in the LS Report when, during a given PCEP session, a node/link/prefix is first reported to a PCE. A PCC sends to a PCE the first LS Report either during State Synchronization, or when a new node/link/prefix is learned at the PCC. The value contains one or more Node Descriptor Sub-TLVs, which allow specification of a flexible key for any given node/link/prefix information such that global uniqueness of the node/link/prefix is ensured.

The value contains one or more Node Descriptor Sub-TLVs defined in .

The Remote Node Descriptors TLV contains Node Descriptors for the node anchoring the remote end of the link. This TLV MUST be included in the LS Report when, during a given PCEP session, a link is first reported to a PCE. A PCC sends to a PCE the first LS Report either during State Synchronization, or when a new link is learned at the PCC. The length of this TLV is variable. The value contains one or more Node Descriptor Sub-TLVs defined in .

The Node Descriptor Sub-TLV types and lengths are listed in the following table:

The sub-TLV values in Node Descriptor TLVs are defined as follows (similar to ):

* Autonomous System: opaque value (32-bit AS Number).
* BGP-LS Identifier: opaque value (32-bit ID). In conjunction with the ASN, it uniquely identifies the BGP-LS domain as described in . This sub-TLV is present only if the node implements BGP-LS and the ID is set by the operator.
* OSPF Area ID: identifies the 32-bit area to which the LS object belongs. The Area Identifier allows the different LS objects of the same node to be discriminated.
* IGP Router ID: opaque value. Usage is described in for IGP Router ID. In case only local information is transported and the PCE learns link-state (and TE) information only from PCEP, it contains the unique local TE IPv4 or IPv6 router ID.
* Multi-Topology-ID: usage is described in for MT-ID.

There can be at most one instance of each sub-TLV type present in any Node Descriptor.
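The "at most one instance" rule lends itself to a simple check while parsing. The sketch below assumes that Node Descriptor sub-TLVs use the generic PCEP TLV layout (16-bit Type, 16-bit Length, value padded to a 4-byte boundary with the padding not counted in Length); that assumption, the function name, and any sub-TLV type numbers a caller passes in are illustrative, because the sub-TLV table itself is not reproduced here.

```python
import struct
from typing import Dict


def parse_node_descriptor(value: bytes) -> Dict[int, bytes]:
    """Split a Node Descriptor value into {sub-TLV type: sub-TLV value}."""
    sub_tlvs: Dict[int, bytes] = {}
    offset = 0
    while offset < len(value):
        if offset + 4 > len(value):
            raise ValueError("truncated sub-TLV header")
        sub_type, length = struct.unpack_from("!HH", value, offset)
        offset += 4
        if offset + length > len(value):
            raise ValueError("truncated sub-TLV value")
        if sub_type in sub_tlvs:
            # At most one instance of each sub-TLV type is allowed.
            raise ValueError(f"duplicate sub-TLV type {sub_type}")
        sub_tlvs[sub_type] = value[offset:offset + length]
        offset += length + (-length % 4)  # skip padding to 4-byte alignment
    return sub_tlvs
```

How a receiver reacts to a duplicate (for example, which PCErr it sends) is not covered by this sketch.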

The Multi-Topology ID (MT-ID) TLV carries one or more IS-IS or OSPF Multi-Topology IDs for a link, node or prefix. The semantics of the IS-IS MT-ID are defined in Section 7.2 of . The MT-ID TLV MAY be present in a Link Descriptor, a Prefix Descriptor, or in the attribute of a node (Node Attributes TLV) in the LS object. The format and handling of the MT-ID TLV are as defined in . In a Link or Prefix Descriptor, only a single MT-ID TLV containing the MT-ID of the topology where the link or the prefix is reachable is allowed. In case one wants to advertise multiple topologies for a given Link Descriptor or Prefix Descriptor, multiple reports need to be generated where each LS object contains a unique MT-ID. In the attribute of a node (Node Attributes TLV) in the LS object, one MT-ID TLV containing the array of MT-IDs of all topologies where the node is reachable is allowed.

The Link Descriptors TLV contains Link Descriptors for each link. This TLV MUST be included in the LS Report when during a given PCEP session a link is first reported to a PCE. A PCC sends to a PCE the first LS Report either during State Synchronization, or when a new link is learned at the PCC. The length of this TLV is variable. The value contains one or more Link Descriptor Sub-TLVs. The 'Link descriptor' TLVs uniquely identify a link among multiple parallel links between a pair of anchor routers similar to .

The Link Descriptor Sub-TLV types and lengths are listed in the following table:

The format and semantics of the 'value' fields in most 'Link Descriptor' sub-TLVs correspond to the format and semantics of value fields in IS-IS Extended IS Reachability sub-TLVs, defined in , and . Although the encodings for 'Link Descriptor' TLVs were originally defined for IS-IS, the TLVs can carry data sourced either by IS-IS or OSPF or direct. The information about a link present in the LSA/LSP originated by the local node of the link determines the set of sub-TLVs in the Link Descriptor of the link as described in .

The 'Prefix Descriptor' field is a set of Type/Length/Value (TLV) triplets. 'Prefix Descriptor' TLVs uniquely identify an IPv4 or IPv6 Prefix originated by a Node. The following TLVs are valid as Prefix Descriptors in the IPv4/IPv6 Prefix-

This is an optional attribute that is used to carry node attributes. The node attribute TLV may be encoded in the LS node Object.

The Node Attributes Sub-TLV types and lengths are listed in the following table:

Link attribute TLV may be encoded in the LS Link Object. The format and semantics of the 'value' fields in some 'Link Attribute' sub-TLVs correspond to the format and semantics of value fields in IS-IS Extended IS Reachability sub-TLVs, defined in , and . Although the encodings for 'Link Attribute' TLVs were originally defined for IS-IS, the TLVs can carry data sourced either by IS-IS or OSPF or direct.

The following 'Link Attribute' sub-TLVs are valid:

Prefix attribute TLV may be encoded in the LS Prefix Object. Prefixes are learned from the IGP (IS-IS or OSPF) or BGP topology with a set of IGP attributes (such as metric, route tags, etc.). This section describes the different attributes related to the IPv4/IPv6 prefixes. Prefix Attributes TLVs SHOULD be encoded in the LS Prefix Object.

The following 'Prefix Attribute' sub-TLVs are valid:

The main source of LS (and TE) information is the IGP, which is not active on inter-AS links. In some cases, the IGP may have information of inter-AS links (, ). In other cases, an implementation SHOULD provide a means to inject inter-AS links into PCEP. The exact mechanism used to provision the inter-AS links is outside the scope of this document.

This document extends PCEP for LS (and TE) distribution including a new LSRpt message with new object and TLVs. Procedures and protocol extensions defined in this document do not affect the overall PCEP security model. See , . Tampering with the LSRpt message may have an effect on path computations at the PCE. It also provides adversaries an opportunity to eavesdrop and learn sensitive information and plan sophisticated attacks on the network infrastructure. The PCE implementation SHOULD provide mechanisms to prevent strains created by network flaps and by the amount of LS (and TE) information. Thus it is suggested that any mechanism used for securing the transmission of other PCEP messages be applied here as well. As a general precaution, it is RECOMMENDED that these PCEP extensions only be activated on authenticated and encrypted sessions belonging to the same administrative authority.

TBD.

This section contains the global table of all TLVs/Sub-TLVs in LS object defined in this document.

This document borrows some of the structure and text from the . Thanks to Eric Wu, Venugopal Kondreddy, Mahendra Singh Negi, and Zhengbin Li for the reviews.