| Internet-Draft | RCA | August 2023 | 
| Decraene, et al. | Expires 26 February 2024 | [Page] | 
RFC 5492 allows a BGP speaker to advertise its capabilities to its peer. When a route is propagated beyond the immediate peer, it is useful to allow certain capabilities to be conveyed further. In particular, it is useful to advertise forwarding plane features.¶
This specification defines a BGP transitive attribute to carry such capability information, the "Router Capabilities Attribute," or RCA. Unlike the capabilities defined by RFC 5492, those conveyed in the RCA apply solely to the routes advertised by the BGP UPDATE that contains the particular RCA.¶
This specification also defines an RCA capability that can be used to advertise the ability to process the MPLS Entropy Label as an egress LSR for all NLRI advertised in the BGP UPDATE. It updates RFC 6790 and RFC 7447 concerning this BGP signaling.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 26 February 2024.¶
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
[RFC5492] allows a Border Gateway Protocol (BGP) speaker to advertise its capabilities to its peer. When a route is propagated beyond the immediate peer, it is useful to allow certain capabilities to be conveyed further. In particular, it may be useful to advertise forwarding plane features.¶
This specification defines a BGP optional transitive attribute to carry such capability information, the "Router Capabilities Attribute", or RCA. (Note that this specification should not be confused with RFC 5492 BGP Capabilities.)¶
Since the RCA is intended chiefly for conveying information about forwarding plane features, it needs to be regenerated whenever the BGP route's next hop is changed. Owing to the properties of BGP transitive attributes, this can't be guaranteed (an intermediate router that doesn't implement this specification would be expected to propagate the RCA as opaque data), so the RCA encodes the next hop of its originator. If the RCA passes through a router that changes the next hop without regenerating the RCA, the encoded next hop and the route's next hop will fail to match when later examined, and the recipient can act accordingly. This scheme allows RCA support to be introduced into a network incrementally. Complete details are provided in Section 2.¶
An RCA carried in a given BGP UPDATE message conveys information that relates to all Network Layer Reachability Information (NLRI) advertised in that particular UPDATE, and only to those NLRI. A different UPDATE message originated by the same source might not include an RCA, and if so, NLRI carried in that UPDATE would not be affected by the RCA. By implication, if a router wishes to use RCA to describe all NLRI it originates, it needs to include an RCA with each UPDATE it sends. In this respect, despite its similar naming, the RCA is unlike RFC 5492 BGP Capabilities.¶
This specification also defines an RCA capability, called "ELCv3", to advertise the ability to process the Multiprotocol Label Switching (MPLS) Entropy Label as an egress Label Switching Router (LSR) for all NLRI advertised in the BGP UPDATE. It updates [RFC6790] and [RFC7447] with regard to this BGP signaling; this is further discussed in Section 3. Although ELCv3 is only relevant to NLRI of labeled address families, a future RCA capability might be applicable to NLRI of any address family.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
The BGP Router Capabilities attribute (RCA attribute, or just RCA) is an optional, transitive BGP path attribute with type code 39. The RCA always includes a network layer address identifying the next hop of the route the RCA accompanies. The RCA signals potentially useful information, so it is desirable to make it transitive; the next hop data is to ensure correctness if it traverses BGP speakers that do not understand the RCA. This is further explained below.¶
The Attribute Data field of the RCA attribute is encoded as a header portion that identifies the originator of the attribute, followed by one or more capability Type-Length-Value (TLV) triples:¶
     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Address Family Identifier   |     SAFI      | Next Hop Len  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~             Network Address of Next Hop (variable)            ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                   Capability TLVs (variable)                  ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The meanings of the header fields (Address Family Identifier, SAFI or Subsequent Address Family Identifier, Length of Next Hop, and Network Address of Next Hop) are as given in Section 3 of [RFC4760].¶
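As a non-normative illustration, the header layout above can be sketched in Python as follows. The function names are hypothetical, and the example assumes an IPv4 next hop (AFI 1) on a labeled-unicast route (SAFI 4); it is one possible reading of the field layout, not a reference implementation.

```python
import ipaddress
import struct

AFI_IPV4 = 1       # IANA Address Family Number for IPv4
SAFI_LABELED = 4   # labeled unicast, the SAFI used with RFC 8277


def encode_rca_header(afi: int, safi: int, next_hop: bytes) -> bytes:
    """Pack AFI (2 octets), SAFI (1), Next Hop Len (1), then the next hop."""
    if len(next_hop) > 0xFF:
        raise ValueError("next hop does not fit a one-octet length field")
    return struct.pack("!HBB", afi, safi, len(next_hop)) + next_hop


def decode_rca_header(data: bytes):
    """Return (afi, safi, next_hop, remaining_bytes); raise ValueError if short."""
    if len(data) < 4:
        raise ValueError("truncated RCA header")
    afi, safi, nh_len = struct.unpack_from("!HBB", data, 0)
    end = 4 + nh_len
    if len(data) < end:
        raise ValueError("next hop runs past the end of the attribute")
    return afi, safi, data[4:end], data[end:]


header = encode_rca_header(AFI_IPV4, SAFI_LABELED,
                           ipaddress.IPv4Address("192.0.2.1").packed)
print(header.hex())  # 00010404c0000201
```

Whatever follows the next hop in the Attribute Data field is returned untouched as `remaining_bytes`, since it holds the capability TLVs.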
In turn, each Capability is a TLV:¶
     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Capability Code        |        Capability Length      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                  Capability Value (variable)                  ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Capability Code: a two-octet unsigned binary integer that indicates the type of capability advertised and unambiguously identifies an individual capability.¶
Capability Length: a two-octet unsigned binary integer that indicates the length, in octets, of the Capability Value field. A length of 0 indicates that no Capability Value field is present.¶
Capability Value: a variable-length field. It is interpreted according to the value of the Capability Code.¶
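The three fields above could be encoded and decoded along the following lines. This is a minimal, non-normative sketch; `encode_capability` and `decode_capability` are hypothetical names, and the code point used in the example is taken from the experimental range purely for illustration.

```python
import struct


def encode_capability(code: int, value: bytes = b"") -> bytes:
    """Encode one capability as Code (2 octets), Length (2 octets), Value."""
    if not 0 <= code <= 0xFFFF:
        raise ValueError("Capability Code must fit in two octets")
    if len(value) > 0xFFFF:
        raise ValueError("Capability Value too long for a two-octet length")
    return struct.pack("!HH", code, len(value)) + value


def decode_capability(data: bytes, offset: int = 0):
    """Decode the TLV starting at offset; return (code, value, next_offset)."""
    if len(data) - offset < 4:
        raise ValueError("truncated capability header")
    code, length = struct.unpack_from("!HH", data, offset)
    start = offset + 4
    end = start + length
    if end > len(data):
        raise ValueError("Capability Length exceeds remaining data")
    return code, data[start:end], end


tlv = encode_capability(65500, b"\x2a")  # experimental code, one-octet value
print(tlv.hex())  # ffdc00012a
```

A Capability Length of 0 simply yields an empty value, which matches the statement that no Capability Value field is present in that case.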
A BGP speaker MUST NOT include more than one instance of a capability with the same Capability Code, Capability Length, and Capability Value. Note, however, that processing multiple instances of such a capability does not require special handling, as additional instances do not change the meaning of the announced capability; thus, a BGP speaker MUST be prepared to accept such multiple instances.¶
BGP speakers MAY include more than one instance of a capability (as identified by the Capability Code) with different Capability Value. Processing of these capability instances is specific to the Capability Code and MUST be described in the document introducing the new capability.¶
Capability TLVs MUST be placed in the RCA in increasing order of Capability Code. (In the event of multiple instances of a capability with the same Capability Code as discussed above, no further sorting order is defined here.) Although the major sorting order is mandated, an implementation MUST be prepared to consume capabilities in any order, for robustness reasons.¶
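A sender might satisfy the duplicate and ordering rules above with a small normalization step before encoding. The sketch below is illustrative only; `prepare_capabilities` is a hypothetical helper that represents each capability as a `(code, value)` pair.

```python
def prepare_capabilities(caps):
    """Drop exact duplicates, then sort by Capability Code.

    Two entries with the same code and the same value are duplicates (the
    length is implied by the value). Entries that share a code but differ in
    value are kept, in their original relative order, because sorted() is
    stable and no secondary sort order is defined.
    """
    seen = set()
    unique = []
    for code, value in caps:
        if (code, value) not in seen:
            seen.add((code, value))
            unique.append((code, value))
    return sorted(unique, key=lambda cap: cap[0])


print(prepare_capabilities([(7, b"b"), (1, b""), (7, b"a"), (1, b"")]))
# [(1, b''), (7, b'b'), (7, b'a')]
```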
Suppose a BGP speaker S has a route R it wishes to advertise with next hop N to its peer.¶
If S is originating R into BGP, it MAY include with it an RCA attribute that carries capability TLVs describing aspects of R. S MUST set the header portion of the RCA to be equal to N, using the encoding given above.¶
If S has received R from some other BGP speaker, two possibilities exist. First, S could be propagating R without changing N. In that case, S need take no special action; it SHOULD simply propagate the RCA unchanged unless specifically configured otherwise. Indeed, we observe that this is no different from the default action a BGP speaker takes with an unrecognized optional transitive attribute -- it is treated as opaque data and propagated.¶
Second, S could be changing R in some way, and in particular, it could be changing N. If S has changed N it MUST NOT propagate the RCA unchanged. It SHOULD include a newly-constructed RCA attribute with R, constructed as described above in the "originating R into BGP" case. Any given capability TLV carried by the newly-constructed RCA attribute might use information from the received RCA attribute as input to its construction, possibly as straightforwardly as simply copying the TLV; the details of this are specific to the definition of each capability. Any capability TLVs received by S that are for capabilities not supported by S will not be included in the version of R constructed by S.¶
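The two propagation cases can be summarized in a short, non-normative sketch. All names here are hypothetical. The `rebuilders` mapping stands in for the per-capability construction rules, which this document leaves to each capability's definition: a code that is absent from the mapping is treated as unsupported by S, and a rebuilder may return `None` to withhold the capability. Omitting the RCA entirely when no TLV survives is an assumption of the sketch, as is ignoring capabilities that S might add on its own behalf.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Capability = Tuple[int, bytes]                   # (Capability Code, Value)
Rebuilder = Callable[[bytes], Optional[bytes]]   # per-capability rule


@dataclass(frozen=True)
class Rca:
    next_hop: bytes
    capabilities: Tuple[Capability, ...]


def rca_to_propagate(received: Optional[Rca], old_next_hop: bytes,
                     new_next_hop: bytes,
                     rebuilders: Dict[int, Rebuilder]) -> Optional[Rca]:
    """Choose the RCA, if any, to attach when S re-advertises route R."""
    if received is None:
        return None
    if new_next_hop == old_next_hop:
        return received              # N unchanged: propagate as received
    rebuilt = []
    for code, value in received.capabilities:
        rebuild = rebuilders.get(code)
        if rebuild is None:
            continue                 # capability not supported by S: dropped
        new_value = rebuild(value)
        if new_value is not None:
            rebuilt.append((code, new_value))
    if not rebuilt:
        return None                  # sketch assumption: send no empty RCA
    return Rca(new_next_hop, tuple(rebuilt))
```

For example, a speaker that supports only capability code 1 and is willing to copy it verbatim would pass `{1: lambda value: value}` as `rebuilders`.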
An implementation SHOULD send the RCA and its contained capabilities by default. An implementation SHOULD provide configuration control of whether any given capability is sent. An implementation MAY provide finer-grained control on propagation based on attributes of the peering session, as discussed in Section 6.1.¶
We note that due to the nature of BGP optional transitive path attributes, any BGP speaker that does not implement this specification will propagate the RCA, the requirements of this section notwithstanding. Such a speaker will not update the RCA, however.¶
An implementation SHOULD accept the RCA and its contained capabilities by default. An implementation SHOULD provide configuration control of whether any given capability is accepted. An implementation MAY provide finer-grained control on propagation based on attributes of the peering session, as discussed in Section 6.1.¶
When a BGP speaker receives a BGP route that includes the RCA, it MUST compare the address given in the header portion of the RCA and illustrated in Figure 1 to the next hop of the BGP route. If the two match, the RCA may be further processed. If the two do not match, it means that some intermediate BGP speaker that handled the route in transit does not support the RCA and changed the next hop of the route. In this case, the contents of the RCA cannot be used, and the RCA MUST be discarded without further processing, except that the contents MAY be logged.¶
In considering whether the next hop "matches", a semantic match is sought. While bit-for-bit equality is a trivial test of matching, there may be certain cases where the two are not bit-for-bit equal, but still "match". An example is when a MP_REACH Next Hop encodes both a global and a link-local IPv6 address. In that case, the link-local address might be removed during Internal BGP (IBGP) propagation; the two would still be considered to match if they were equal on the global part. See Section 3 of [RFC2545].¶
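One way to express the semantic match is sketched below. It is non-normative and covers only the IPv6 example given above; `next_hops_match` is a hypothetical name, and the sketch assumes plain IPv6 next hops of 16 octets (global) or 32 octets (global followed by link-local), as in RFC 2545. Other address families fall back to bit-for-bit comparison.

```python
import ipaddress

AFI_IPV6 = 2  # IANA Address Family Number for IPv6


def next_hops_match(afi: int, rca_next_hop: bytes, route_next_hop: bytes) -> bool:
    """Compare the next hop in the RCA header with the route's next hop."""
    if rca_next_hop == route_next_hop:
        return True                  # bit-for-bit equality always matches
    if (afi == AFI_IPV6
            and len(rca_next_hop) in (16, 32)
            and len(route_next_hop) in (16, 32)):
        # Compare only the global part; a link-local address may have been
        # added or removed along the way.
        return rca_next_hop[:16] == route_next_hop[:16]
    return False


global_addr = ipaddress.IPv6Address("2001:db8::1").packed
link_local = ipaddress.IPv6Address("fe80::1").packed
print(next_hops_match(AFI_IPV6, global_addr + link_local, global_addr))  # True
```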
A BGP speaker receiving a Capability Code that it supports behaves as defined in the document defining the Capability Code. A BGP speaker receiving a Capability Code that it does not support MUST ignore that Capability Code. In particular, it MUST NOT be handled as an error.¶
The presence of a capability SHOULD NOT influence route selection or route preference, unless tunneling is used to reach the BGP next hop or the selected route has been learned from External BGP (that is, the next hop is in a different Autonomous System). Indeed, it is in general impossible for a node to know that all BGP routers of the Autonomous System (AS) will understand a given capability, and if different routers within an AS were to use a different preference for a route, forwarding loops could result unless tunneling is used to reach the BGP next hop.¶
An RCA is considered malformed if the length of the attribute is inconsistent with the lengths of the contained capability TLVs.¶
A BGP UPDATE message with a malformed RCA SHALL be handled using the approach of "attribute discard" defined in [RFC7606].¶
Unknown Capability Codes MUST NOT be considered to be an error.¶
An RCA that contains no capability TLVs MAY be considered malformed, although it is observed that the prescribed behavior of "attribute discard" is semantically no different from that of having no TLVs to process.¶
A document that specifies a new RCA Capability should provide specifics regarding what constitutes an error for that RCA Capability.¶
If a capability TLV is malformed, that capability TLV SHOULD be ignored and removed. Other capability TLVs SHOULD be processed as usual. If a given capability TLV requires different error-handling treatment than described in the previous sentences, its specification should provide specifics.¶
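The error-handling rules above might be combined in a receiver as in the following non-normative sketch. `parse_capability_tlvs` and the `validators` mapping are hypothetical: each validator is a per-capability check supplied by the implementation, and returning `None` is used here to signal that the whole RCA is malformed and should be handled by "attribute discard". Whether an RCA with no TLVs is treated as malformed is left to the caller.

```python
import struct


def parse_capability_tlvs(tlv_bytes, validators):
    """Parse the capability TLVs that follow the RCA header.

    Returns a list of (code, value) pairs, or None when the TLV lengths are
    inconsistent with the attribute length (malformed RCA).
    """
    capabilities = []
    offset = 0
    while offset < len(tlv_bytes):
        if len(tlv_bytes) - offset < 4:
            return None              # not enough octets for Code and Length
        code, length = struct.unpack_from("!HH", tlv_bytes, offset)
        offset += 4
        if len(tlv_bytes) - offset < length:
            return None              # Value would run past the attribute end
        value = tlv_bytes[offset:offset + length]
        offset += length
        validate = validators.get(code)
        if validate is not None and not validate(value):
            continue                 # malformed TLV: ignored and removed
        # Unknown codes are not an error; they are kept here as opaque data
        # so that the caller can simply skip the ones it does not support.
        capabilities.append((code, value))
    return capabilities
```

For instance, passing `{1: lambda value: len(value) == 0}` as `validators` would drop a capability with code 1 whose length is not zero while still processing the other TLVs.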
In the corner case where multiple nodes use the same IP address as their BGP next hop, such as with anycast nodes as described in [RFC4786], a BGP speaker MUST NOT advertise a given capability unless all nodes sharing this same IP address support this capability. The network operator operating those anycast nodes is responsible for ensuring that an anycast node does not advertise a capability not supported by all nodes sharing this anycast address. The means for accomplishing this are beyond the scope of this document.¶
The foregoing sections define the RCA as a container for capability TLVs. The Entropy Label Capability is one such capability.¶
When BGP [RFC4271] is used for distributing labeled NLRI as described in, for example, [RFC8277], the route may include the ELCv3 as part of the RCA. The inclusion of this capability with a route indicates that the egress of the associated Label Switched Path (LSP) can process entropy labels as an egress LSR for that route -- see Section 4.2 of [RFC6790]. Below, we refer to this for brevity as being "EL-capable."¶
For historical reasons, this capability is referred to as "ELCv3", to distinguish it from the prior Entropy Label Capability (ELC) defined in [RFC6790] and deprecated in [RFC7447], and the ELCv2 described in [I-D.scudder-bgp-entropy-label].¶
The ELCv3 has capability code 1, capability length 0, and carries no value:¶
     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Capability Code = 1      |       Capability Length = 0   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
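To make the encoding concrete, the following non-normative sketch builds a complete RCA path attribute carrying only the ELCv3. The helper name is hypothetical. The attribute flags value 0xC0 (Optional and Transitive bits set, per RFC 4271) and the use of the one-octet attribute length are assumptions that fit an originating speaker and a short attribute; the example next hop is an IPv4 address on a SAFI 4 route.

```python
import ipaddress
import struct

ATTR_FLAGS_OPTIONAL_TRANSITIVE = 0xC0   # Optional + Transitive (RFC 4271)
ATTR_TYPE_RCA = 39                      # type code of the RCA
ELCV3_TLV = struct.pack("!HH", 1, 0)    # Capability Code 1, Length 0


def build_rca_with_elcv3(afi: int, safi: int, next_hop: bytes) -> bytes:
    """Return the RCA path attribute (flags, type, length, data) with ELCv3."""
    data = struct.pack("!HBB", afi, safi, len(next_hop)) + next_hop + ELCV3_TLV
    if len(data) > 0xFF:
        raise ValueError("sketch handles only the one-octet attribute length")
    return struct.pack("!BBB", ATTR_FLAGS_OPTIONAL_TRANSITIVE,
                       ATTR_TYPE_RCA, len(data)) + data


attr = build_rca_with_elcv3(1, 4, ipaddress.IPv4Address("192.0.2.1").packed)
print(attr.hex())  # c0270c00010404c000020100010000
```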
When a BGP speaker S has a route R it wishes to advertise with next hop N to its peer, it SHOULD include the ELCv3 capability if it knows that the egress of the associated LSP L is EL-capable; otherwise, it MUST NOT include the ELCv3 capability. Specific conditions where S would know that the egress is EL-capable are if S:¶
The ELCv3 MAY be advertised with routes that are labeled, such as those using SAFI 4 [RFC8277]. It MUST NOT be advertised with unlabeled routes.¶
(Below, we assume that "includes the ELCv3" implies that the containing RCA has passed the checks specified in Section 2.3. If it had not passed, then the RCA would have been discarded and the ELCv3 would be deemed not to have been included.)¶
When a BGP speaker receives an unlabeled route that includes the ELCv3, it MUST discard the ELCv3.¶
When a BGP speaker receives a labeled route that includes the ELCv3, it indicates the associated LSP supports entropy labels. This implies that the receiving BGP speaker, if acting as ingress, MAY insert an entropy label as per Section 4.2 of [RFC6790].¶
The ELCv3 is considered malformed and must be disregarded if its length is other than zero.¶
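The receive-side rules for the ELCv3 can be condensed into a single predicate, sketched below in a non-normative way. The function name is hypothetical, and the input is assumed to be the list of `(code, value)` pairs taken from an RCA that has already passed the next hop check of Section 2.3.

```python
ELCV3_CODE = 1


def may_insert_entropy_label(capabilities, route_is_labeled: bool) -> bool:
    """Return True if a valid ELCv3 accompanies a labeled route."""
    if not route_is_labeled:
        return False                 # ELCv3 on an unlabeled route is discarded
    for code, value in capabilities:
        if code == ELCV3_CODE and len(value) == 0:
            return True              # well-formed ELCv3: length is zero
    return False                     # absent, or present only in malformed form


print(may_insert_entropy_label([(1, b"")], route_is_labeled=True))  # True
```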
The ELCv3 functionality introduced in this document replaces the "BGP Entropy Label Capability Attribute" (ELC attribute) that was introduced by [RFC6790], and deprecated by [RFC7447]. The latter RFC specifies that the ELC attribute, BGP path attribute 28, "MUST be treated as any other unrecognized optional, transitive attribute". This specification revises that requirement.¶
As the current specification was developed, it became clear that due to incompatibilities between how the ELC attribute is processed by different fielded implementations, the most prudent handling of attribute 28 is not to propagate it as an unrecognized optional, transitive attribute, but to discard it. Therefore, this specification updates [RFC7447] by instead requiring that an implementation that receives the ELC attribute MUST discard it.¶
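In an implementation that holds received path attributes as a list of `(type_code, data)` pairs, the discard rule could look like the following non-normative sketch. The representation and the function name are assumptions made for illustration.

```python
ELC_ATTRIBUTE_TYPE = 28   # deprecated BGP Entropy Label Capability Attribute


def discard_elc_attribute(path_attributes):
    """Return the received path attributes with any ELC attribute removed."""
    return [(type_code, data) for type_code, data in path_attributes
            if type_code != ELC_ATTRIBUTE_TYPE]


received = [(1, b"\x00"), (28, b""), (39, b"\x00\x01\x04\x04")]
print(discard_elc_attribute(received))
# [(1, b'\x00'), (39, b'\x00\x01\x04\x04')]
```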
IANA has made a temporary allocation in the BGP Path Attributes registry of the Border Gateway Protocol (BGP) Parameters group. IANA is requested to make this allocation permanent.¶
| Value | Code | Reference | 
|---|---|---|
| 39 | BGP Router Capabilities (RCA) | (this doc) | 
IANA is requested to create a new registry called "BGP Router Capability Codes" within the Border Gateway Protocol (BGP) Parameters group. The registry's allocation policy is First Come, First Served. It is seeded with the following values:¶
| Value | Description | Reference | Change Controller | 
|---|---|---|---|
| 0 | reserved | (this doc) | IETF | 
| 1 | ELCv3 | (this doc) | IETF | 
| 65500 - 65534 | reserved for experimental use | (this doc) | IETF | 
| 65535 | reserved | (this doc) | IETF | 
The header portion of the RCA contains the next hop the attribute's originator included when sending it. This will typically be an IP address of the router in question. This may be an infrastructure address the network operator does not intend to announce beyond the border of its Autonomous System, and it may even be considered, in some weak sense, confidential information.¶
In order to be fit for purpose, the attribute needs to be able to propagate between Autonomous Systems when they are under the control of the same administrator, but for anticipated uses it need not be sent to other Autonomous Systems. At time of writing, work [I-D.uttaro-idr-bgp-oad] is underway to standardize a method of distinguishing between the two categories of external Autonomous Systems, and if such a distinction is available, an implementation can take advantage of it by constraining the RCA and its contained capabilities to only propagate by default to and from the former category of Autonomous Systems. If such a distinction is not available, a network operator may prefer to configure routers peering with Autonomous Systems not under their administrative control to not send or accept the RCA or its contained capabilities, unless there is an identified need to do so.¶
The foregoing notwithstanding, control of RCA propagation can't be guaranteed in all cases -- if a border router doesn't implement this specification, the attribute, like all BGP optional transitive attributes, will propagate to neighboring Autonomous Systems. Similarly, if a border router receiving the attribute from an external Autonomous System doesn't implement this specification, it will store and propagate the attribute, the requirements of Section 2.3 notwithstanding. So, sometimes this information could leak beyond its intended scope. (Note that it will only propagate as far as the first router that does support this specification, at which point it will typically be discarded due to a non-matching next hop, per Section 2.3.)¶
If the attribute leaks beyond its intended scope, capabilities within it would potentially be exposed. Specifications for individual capabilities should consider the consequences of such unintended exposure, and should identify any necessary constraints on propagation.¶
Insertion of an ELCv3 by an attacker could cause forwarding to fail. Deletion of an ELCv3 by an attacker could cause one path in the network to be overutilized and another to be underutilized. However, we note that an attacker able to accomplish either of these (below, an "on-path attacker") could equally insert or remove any other BGP path attribute or message. The former attack described above denies service for a given route, which can be accomplished by an on-path attacker in any number of ways even absent ELCv3. The latter attack defeats an optimization but nothing more; it seems dubious that an attacker would go to the trouble of doing so rather than launching some more damaging attack.¶
A router that supports this specification could also have other means to know that an egress is EL-capable; for example, it could support ELCv2 [I-D.scudder-bgp-entropy-label], or it could know through configuration. If a router learns through any means that an egress is EL-capable, it MAY treat the egress as EL-capable. For example, reception of a valid ELCv2 would be sufficient (even if a valid ELCv3 is not received), and similarly, reception of a valid ELCv3 would be sufficient (even if a valid ELCv2 is not received). The details of which methods are accepted for signaling EL capability are beyond the scope of this specification but SHOULD be configurable by the user.¶
The authors of this specification thank Wes Hardaker and Gyan Mishra for their review and comments.¶
This specification derives from two earlier documents, [I-D.ietf-idr-next-hop-capability] and [I-D.scudder-bgp-entropy-label].¶
[I-D.ietf-idr-next-hop-capability] included the following acknowledgements:¶
    The Entropy Label Next-Hop Capability defined in this document is
    based on the ELC BGP attribute defined in section 5.2 of [RFC6790].
    The authors wish to thank John Scudder for the discussions on this
    topic and Eric Rosen for his in-depth review of this document.
    The authors wish to thank Jie Dong and Robert Raszuk for their
    review and comments.
[I-D.scudder-bgp-entropy-label] included the following acknowledgements:¶
    Thanks to Swadesh Agrawal, Alia Atlas, Bruno Decraene, Martin
    Djernaes, John Drake, Adrian Farrell, Keyur Patel, Toby Rees, and
    Ravi Singh, for their discussion of this issue.