Internet-Draft BGP-SPF Selection Rules October 2023
Dong, et al. Expires 25 April 2024
Workgroup:
Link State Vector Routing Working Group
Internet-Draft:
draft-dong-lsvr-bgp-spf-selection-00
Published:
Intended Status:
Informational
Expires:
Authors:
J. Dong
Huawei Technologies
J. Chen
Huawei Technologies
S. Fang
Huawei Technologies

Proposed Update to BGP Link-State SPF NLRI Selection Rules

Abstract

For network scenarios such as Massively Scaled Data Centers (MSDCs), BGP is extended to distribute Link-State (LS) information and to compute routes using the Shortest Path First (SPF) algorithm. BGP-LS-SPF leverages the mechanisms of both the BGP protocol and the BGP-LS protocol extensions, with new selection rules defined for BGP-LS-SPF NLRI. This document proposes updates to the BGP-LS-SPF NLRI selection rules to ensure a deterministic selection result. The proposed updates can also help mitigate some issues in BGP-LS-SPF route convergence. This document updates the NLRI selection rules in I-D.ietf-lsvr-bgp-spf.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 25 April 2024.

1. Introduction

For network scenarios such as Massively Scaled Data Centers (MSDCs), BGP is extended to distribute Link-State (LS) information and to compute routes using the Shortest Path First (SPF) algorithm. BGP-LS-SPF leverages the mechanisms of both the BGP protocol and the BGP-LS protocol extensions, with new selection rules for BGP-LS-SPF NLRI defined in [I-D.ietf-lsvr-bgp-spf]. For all BGP-LS-SPF NLRIs, the NLRI selection rules are defined as follows:

  1. NLRI originated by directly connected BGP SPF peers are preferred.

  2. The NLRI with the most recent Sequence Number TLV, i.e., the highest sequence number, is selected.

  3. The NLRI received from the BGP SPF speaker with the numerically larger BGP Identifier is preferred.
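
One way to read these three rules is as an ordered comparison in which each later rule only breaks ties left by the earlier ones. The following Python sketch illustrates that reading. The Candidate type, its field names, and the sample values are hypothetical and are not defined by [I-D.ietf-lsvr-bgp-spf].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One received copy of a BGP-LS-SPF NLRI (hypothetical model)."""
    peer: str              # label of the peer the copy was received from
    from_originator: bool  # rule 1: peer is directly connected and originated the NLRI
    sequence: int          # rule 2: value carried in the Sequence Number TLV
    peer_bgp_id: int       # rule 3: BGP Identifier of the advertising speaker


def preference_key(c: Candidate) -> tuple:
    # Tuples compare element by element, so an earlier rule always
    # outranks a later one (assumed ordering of the rules).
    return (c.from_originator, c.sequence, c.peer_bgp_id)


def select(candidates: list) -> Candidate:
    return max(candidates, key=preference_key)


copies = [
    Candidate("A", from_originator=False, sequence=5, peer_bgp_id=10),
    Candidate("B", from_originator=False, sequence=5, peer_bgp_id=20),
    Candidate("C", from_originator=False, sequence=4, peer_bgp_id=30),
]
print(select(copies).peer)  # B: sequence 5 beats 4, then BGP Identifier 20 beats 10
```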

In some cases, these rules are not sufficient to produce a deterministic selection result. In some failure cases, they can also delay the distribution of the latest link-state information, which in turn delays route convergence in the network.

This document first describes the network scenarios in which the existing NLRI selection rules are insufficient, and then proposes updates to the BGP-LS-SPF NLRI selection rules.

2. Network Scenarios Which Triggered This Update

2.1. Delayed Convergence during Link Failure

Section 6.5.2 of [I-D.ietf-lsvr-bgp-spf] describes NLRI advertisement in the case of node failures. In some cases, however, the current NLRI selection rules can delay route convergence.

  +-----+         +-----+  link down  +-----+        +-----+
  | R1  +---------+  R2 +------X------+  R3 +--------+ R5  |
  +-----+         +--\--+             +--/--+        +-----+
                      \                 /
  R1-R2: down to up    \               /
                        \             /
                         \           /
                          \         /
                           \+-----+/
                            |  R4 |
                            +--+--+
                               |
                               |
                               |
                               |
                            +--+--+
                            |  R6 |
                            +-----+

                            Figure 1

As shown in the example in Figure 1, a failure of the BGP session between R2 and R3 is detected by R3, using either BFD or another detection mechanism. R3 cannot distinguish a node failure of R2 from a link failure of R2-R3. To avoid unnecessary route flaps, and as described in Section 6.5.2 of [I-D.ietf-lsvr-bgp-spf], R3 therefore holds all the NLRIs received from R2 for the period of NLRIImplicitWithdrawalDelay. If the state of link R1-R2 changes from down to up during this period, R2 originates an updated link NLRI for R1-R2 with a greater sequence number and advertises it to its neighboring nodes. Because of the R2-R3 failure, R3 cannot receive the updated link NLRI directly from R2, but it can receive it from R4. However, according to NLRI selection rule 1, R3 prefers the link NLRI for R1-R2 that it received directly from R2, and so does not treat the copy received from R4 as the latest one. Consequently, R3 neither uses the latest link NLRI for R1-R2 in its SPF computation nor advertises it to its neighbors, which delays convergence of the network.
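
The selection outcome at R3 can be reproduced with a short Python sketch. It assumes that the three existing rules are applied in order as tie-breakers. The sequence numbers 10 and 11 and the BGP Identifiers 2 and 4 are illustrative values only.

```python
# Each copy of the link NLRI R1-R2 as seen by R3, written as
# (received_from, from_originator, sequence, peer_bgp_id).
# All numeric values are assumed for illustration.
held_from_r2 = ("R2", True, 10, 2)   # retained during NLRIImplicitWithdrawalDelay
fresh_via_r4 = ("R4", False, 11, 4)  # re-originated by R2 after R1-R2 came up


def preference_key(copy):
    _, from_originator, sequence, peer_bgp_id = copy
    # Rule 1 first, then rule 2, then rule 3 (assumed ordering).
    return (from_originator, sequence, peer_bgp_id)


best = max([held_from_r2, fresh_via_r4], key=preference_key)
print(best[0], best[2])  # R2 10: the stale copy wins on rule 1

# For contrast only (not a proposed fix): the sequence number alone
# would identify the fresh copy.
newest = max([held_from_r2, fresh_via_r4], key=lambda copy: copy[2])
print(newest[0], newest[2])  # R4 11
```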

2.2. Unnecessary Redundant Advertisement

According to the rules in [I-D.ietf-lsvr-bgp-spf], among BGP-LS-SPF NLRIs with the same sequence number, the NLRI received from the speaker with the numerically larger BGP Identifier is preferred. In some cases, this can cause unnecessary redundant advertisement of the same NLRI.

  +----+  new  +----+         +----+       +----+
  | R6 +-------+ R1 +---------+ R2 +-------+ R5 |
  +----+       +-+--+         +-+--+       +----+
                 |              |
                 |              |
                 |              |
                 |              |
                 |              |
               +-+--+         +-+--+
               | R3 +---------+ R4 |
               +----+         +----+

                            Figure 2

As shown in the example in Figure 2, a new BGP session is established between R1 and R6, and R1 advertises the link NLRI for R1-R6 to its neighboring nodes (R2 and R3). R2 first receives the link NLRI for R1-R6 directly from R1 and advertises it to its other neighbors (R4 and R5). R4 receives the link NLRI for R1-R6 with the same sequence number from both R3 and R2. Under the rule of numerically larger BGP Identifier, R4 prefers the NLRI received from R3 (this example assumes that BGP Identifiers increase with the router number), and then advertises this link NLRI to R2. By the same rule, R2 prefers the NLRI received from R4 and advertises it again to R5, which duplicates its earlier advertisement of the same link NLRI.
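
The following Python sketch traces only R4's part of this example. It assumes that each router's BGP Identifier equals its router number, that the copy from R2 arrives before the copy from R3, and that a node advertises the NLRI each time its selected copy changes. The function names are hypothetical.

```python
def preference_key(copy):
    # copy = (received_from, from_originator, sequence, peer_bgp_id)
    _, from_originator, sequence, peer_bgp_id = copy
    return (from_originator, sequence, peer_bgp_id)


def process_arrivals(arrivals):
    """Return the peers whose copy became the selected one, in order."""
    best = None
    selections = []
    for copy in arrivals:
        if best is None or preference_key(copy) > preference_key(best):
            best = copy
            selections.append(copy[0])  # each change triggers an advertisement
    return selections


# Link NLRI R1-R6 with sequence number 1, as received by R4.
via_r2 = ("R2", False, 1, 2)
via_r3 = ("R3", False, 1, 3)

print(process_arrivals([via_r2, via_r3]))  # ['R2', 'R3']: selected twice
print(process_arrivals([via_r3, via_r2]))  # ['R3']: selected once
```

In the first arrival order the content of the NLRI is identical in both selections, so the second advertisement carries no new link-state information.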

2.3. Parallel BGP-LS-SPF Peers

In some scenarios, the BGP single-hop peering model is used between directly connected BGP nodes. When two or more parallel links exist between the BGP nodes, multiple BGP sessions are established between the peering nodes, and each session is used to distribute BGP-LS-SPF NLRIs.

               parallel BGP sessions

  +----+       +----+         +----+       +----+
  |    |       |    +---------+    |       |    |
  | R3 +-------+ R1 +---------+ R2 +-------+ R4 |
  +----+       +-+--+         +-+--+       +----+

                            Figure 3

As shown in the example in Figure 3, there are two parallel links between R1 and R2, and a separate BGP session is established on each link; from R2's perspective, these sessions are with peers R1.1 and R1.2. Under the existing BGP-LS-SPF NLRI selection rules, for the same NLRI with the same sequence number, R2 may select either the route received from peer R1.1 or the route received from peer R1.2 as the best. To facilitate network operation and troubleshooting, it is preferable that NLRI selection produce a deterministic result once the network reaches a relatively stable state. Rules for selecting the preferred NLRI among parallel peering sessions are therefore needed.
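
The tie can be illustrated with a short Python sketch. It assumes the three existing rules are applied in order, and the sequence number 7 and BGP Identifier 1 are illustrative values. Because both sessions terminate on R1, every field the existing rules compare is identical, and the result depends only on the order in which the copies are considered.

```python
def preference_key(copy):
    # copy = (session, from_originator, sequence, peer_bgp_id)
    _, from_originator, sequence, peer_bgp_id = copy
    return (from_originator, sequence, peer_bgp_id)


# The same NLRI received by R2 from R1 over the two parallel sessions.
# Both sessions belong to R1, so the BGP Identifier is the same.
over_link_1 = ("R1.1", True, 7, 1)
over_link_2 = ("R1.2", True, 7, 1)

# None of the three existing rules separates the two copies.
tied = preference_key(over_link_1) == preference_key(over_link_2)
print(tied)  # True

# max() returns the first maximal item, so the result follows input order.
first = max([over_link_1, over_link_2], key=preference_key)[0]
second = max([over_link_2, over_link_1], key=preference_key)[0]
print(first, second)  # R1.1 R1.2
```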

3. Update to BGP-LS-SPF Selection Rules

This document proposes to update the selection rules for all BGP-LS-SPF NLRI as follows:

  1. NLRI originated by directly connected BGP SPF peers SHOULD be preferred.

  2. The NLRI with the most recent Sequence Number TLV, i.e., the highest sequence number, SHOULD be selected.

  3. For NLRIs received from EBGP peers, the NLRI with the smaller number of AS numbers in the AS_PATH attribute SHOULD be preferred.

  4. For NLRIs received from IBGP peers, the NLRI with the smaller number of Cluster IDs in the CLUSTER_LIST attribute SHOULD be preferred.

  5. The NLRI received from the BGP SPF speaker with the numerically larger BGP Identifier SHOULD be preferred.

  6. The NLRI received from the BGP SPF peer with the smaller peer address SHOULD be preferred.

New rules 3 and 4 address the redundant advertisement problem described in Section 2.2. New rule 6 addresses the non-deterministic selection problem described in Section 2.3.
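
As a minimal sketch, the six rules can be modeled in Python as a single ordered comparison. The Candidate type, its field names, the AS numbers, and the addresses are hypothetical. The sketch also assumes that the rules are applied in order as tie-breakers, that an NLRI received from an EBGP peer has an empty CLUSTER_LIST, and that peer addresses are compared numerically; this document does not specify how addresses of different families compare.

```python
from dataclasses import dataclass
from ipaddress import ip_address


@dataclass(frozen=True)
class Candidate:
    """One received copy of a BGP-LS-SPF NLRI (hypothetical model)."""
    from_originator: bool  # rule 1
    sequence: int          # rule 2
    as_path: tuple         # rule 3: AS numbers in AS_PATH
    cluster_list: tuple    # rule 4: Cluster IDs in CLUSTER_LIST
    peer_bgp_id: int       # rule 5
    peer_address: str      # rule 6


def preference_key(c: Candidate) -> tuple:
    # The largest key wins, so "smaller is preferred" fields are negated.
    return (
        c.from_originator,
        c.sequence,
        -len(c.as_path),
        -len(c.cluster_list),
        c.peer_bgp_id,
        -int(ip_address(c.peer_address)),
    )


def select(candidates: list) -> Candidate:
    return max(candidates, key=preference_key)


# Rule 3: same sequence number, but the copy with the shorter AS_PATH wins
# even though it comes from the peer with the smaller BGP Identifier.
shorter = Candidate(False, 7, (65002, 65001), (), 2, "192.0.2.2")
longer = Candidate(False, 7, (65004, 65003, 65001), (), 4, "192.0.2.4")
print(select([longer, shorter]) is shorter)  # True

# Rule 6: parallel sessions to the same speaker differ only in peer address,
# so the result no longer depends on the order of the candidates.
s1 = Candidate(True, 7, (65001,), (), 1, "192.0.2.1")
s2 = Candidate(True, 7, (65001,), (), 1, "192.0.2.5")
print(select([s2, s1]).peer_address)  # 192.0.2.1
print(select([s1, s2]).peer_address)  # 192.0.2.1
```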

There are several options for solving the problem illustrated in Section 2.1. The details will be discussed further and documented in a future version of this document.

4. IANA Considerations

This document makes no request of IANA.

5. Security Considerations

The mechanisms described in this document update the NLRI selection rules for BGP-LS-SPF. They do not introduce any security considerations beyond those described in [RFC4271] and [RFC4272].

6. Acknowledgements

The authors would like to thank Haibo Wang, Jun Ge, and Li Zhang for their valuable discussion and suggestions.

7. References

7.1. Normative References

[I-D.ietf-lsvr-bgp-spf]
Patel, K., Lindem, A., Zandi, S., and W. Henderickx, "BGP Link-State Shortest Path First (SPF) Routing", Work in Progress, Internet-Draft, draft-ietf-lsvr-bgp-spf-28, <https://datatracker.ietf.org/doc/html/draft-ietf-lsvr-bgp-spf-28>.
[RFC4271]
Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A Border Gateway Protocol 4 (BGP-4)", RFC 4271, DOI 10.17487/RFC4271, <https://www.rfc-editor.org/info/rfc4271>.

7.2. Informative References

[RFC4272]
Murphy, S., "BGP Security Vulnerabilities Analysis", RFC 4272, DOI 10.17487/RFC4272, <https://www.rfc-editor.org/info/rfc4272>.

Authors' Addresses

Jie Dong
Huawei Technologies
China
Jinqiang Chen
Huawei Technologies
China
Sheng Fang
Huawei Technologies
China