Network Working Group                                          Enke Chen
Internet Draft                                              Naiming Shen
Expiration Date: January 2005                           Redback Networks

             Advertisement of the Group Best Paths in BGP

                draft-chen-bgp-group-path-update-00.txt

1. Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

2. Abstract

In this document we first identify and qualify the Group Best Paths for an address prefix as in general the necessary and sufficient subset of paths that need to be advertised by a BGP route reflector or a BGP confederation ASBR in order to eliminate the MED-type route oscillations and to achieve consistent routing in a network. We then propose a mechanism for BGP that would allow a route reflector or a confederation ASBR to advertise the Group Best Paths. The proposed mechanism is designed such that the vast majority of the BGP speakers in a network need only minor software changes in order to deploy the mechanism.

Chen & Shen                                                     [Page 1]

Internet Draft     draft-chen-bgp-group-path-update-00.txt     June 2004

3. Introduction

The current BGP update procedures [1, 2, 3] can be characterized as "best path based" as an UPDATE message only deals with the advertisement of the best paths which are identified by the address prefixes.
As documented in [4], the routing information reduction by BGP Route Reflection [2] or BGP Confederation [3] can result in persistent IBGP route oscillations with certain routing setups and network topologies. Except for a couple of artificially engineered network topologies, the MED attribute [1] has played a pivotal role in virtually all of the known persistent IBGP route oscillations. For the sake of brevity, we use the term "MED-type route oscillation" hereafter to refer to a persistent IBGP route oscillation in which the MED plays a role.

In order to eliminate the MED-type route oscillations, clearly a route reflector or a confederation ASBR needs to advertise more than just the best path for an address prefix. Various efforts have been made trying to extend BGP to advertise multiple paths for an address prefix. We believe, however, that we should first tackle the most fundamental issue of identifying the "right" subset of paths for an address prefix that needs to be advertised by a route reflector or a confederation ASBR, and then introduce BGP extensions to support such route advertisements. In addition, the solution needs to have reasonable implementation and deployment properties.

In this document we first identify and qualify the Group Best Paths for an address prefix as in general the necessary and sufficient subset of paths that need to be advertised by a BGP route reflector or a BGP confederation ASBR in order to eliminate the MED-type route oscillations and to achieve consistent routing in a network. We then propose a mechanism for BGP that would allow a route reflector or a confederation ASBR to advertise the Group Best Paths. The proposed mechanism is designed such that the vast majority of the BGP speakers in a network need only minor software changes in order to deploy the mechanism.

4. Specification of Requirements

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [9].

5. What to Advertise

The term neighbor-AS for a route refers to the neighbor AS from which the route was received. The calculation of the neighbor-AS is specified in Sect. 9.1.2.2 of [1], and Section 7.2 of [3].

By definition the MED is comparable only among routes with the same neighbor-AS. As a result, the route selection procedures specified in [1] would conceptually involve two steps: first organize the paths for an address prefix into groups according to their respective neighbor-AS's, and calculate the most preferred one (termed "Group Best Path") for each of the groups; then calculate the overall best path among all the Group Best Paths. Note that the overall best path would also naturally be a Group Best Path.

As a generally recommended and widely adopted practice, a route reflection cluster or a confederation sub-AS should be designed such that the IGP metrics for links within a cluster (or confederation sub-AS) are smaller than the IGP metrics for the links between the clusters (or confederation sub-ASes). This practice helps achieve consistent routing within a route reflection cluster or a confederation sub-AS.

When the aforementioned practice for devising a route reflection cluster or confederation sub-AS is followed in a network, we claim that the advertisement of all the Group Best Paths by a route reflector or a confederation ASBR is sufficient to eliminate the MED-type route oscillations in the network. This claim can be validated as follows.

Observe that a network which maintains a full IBGP mesh is free of the MED-type route oscillations, and only the external paths that survive the (Local_Pref, AS-PATH Length, Origin, MED) comparison [1] would contribute to the route selection in the network.
Consider a route reflection cluster in which there exist one or more external paths that would survive the (Local_Pref, AS-PATH Length, Origin, MED) comparison among all the external paths in the network. One such external path in each group would be selected as the Group Best Path by the route reflector in the cluster. Due to the constraint on the IGP metrics as described previously, this path would remain as the Group Best Path and would be advertised to all other clusters regardless of whether a path is received from another cluster.

On the other hand, when no path in a route reflection cluster would survive the (Local_Pref, AS-PATH Length, Origin, MED) comparison among all the external paths in the network, the Group Best Path (when one exists) for a route reflector would be from another cluster. Clearly the advertisement of the Group Best Path by the route reflector to the clients only depends on the paths received from other clusters.

Therefore there is no MED-type route oscillation in the network as the advertisement of a Group Best Path to a peer does not depend on the paths received from that peer.

The claim for the confederation can be validated similarly.

We also note that in general it is necessary to make available a path that would survive the (Local_Pref, AS-PATH Length, Origin, MED) comparison in order to prevent another path with less preferred MED from being selected as an active path in the network.

We therefore conclude that when the recommended practice for devising a route reflection cluster or confederation sub-AS with respect to the IGP metrics is followed, the Group Best Paths for an address prefix are in general the necessary and sufficient set that needs to be advertised by a route reflector or a confederation ASBR in order to eliminate the MED-type route oscillation and to achieve consistent routing.
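
The two-step selection described in this section can be illustrated with a short sketch. It is a simplified model under stated assumptions, not the full decision process of [1]: the Path class, its field names, the helper functions, and the use of the IGP metric as the only tie-breaker after MED are all hypothetical and chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Path:
    """Hypothetical, simplified view of one BGP path for an address prefix."""
    name: str
    neighbor_as: int   # neighbor AS from which the route was received
    local_pref: int    # higher is preferred
    as_path_len: int   # shorter is preferred
    origin: int        # lower is preferred
    med: int           # lower is preferred; comparable only within a group
    igp_metric: int    # lower is preferred (assumed tie-breaker after MED)


def pre_med_key(p: Path) -> Tuple[int, int, int]:
    """Sort key for (Local_Pref, AS-PATH Length, Origin); smaller wins."""
    return (-p.local_pref, p.as_path_len, p.origin)


def group_best_paths(paths: List[Path]) -> Dict[int, Path]:
    """Step 1: select the most preferred path in each neighbor-AS group."""
    groups: Dict[int, List[Path]] = defaultdict(list)
    for p in paths:
        groups[p.neighbor_as].append(p)
    # All members of a group share a neighbor-AS, so MED may be compared.
    return {
        asn: min(members, key=lambda p: pre_med_key(p) + (p.med, p.igp_metric))
        for asn, members in groups.items()
    }


def overall_best(paths: List[Path]) -> Path:
    """Step 2: select among the Group Best Paths without comparing MED."""
    candidates = group_best_paths(paths).values()
    return min(candidates, key=lambda p: pre_med_key(p) + (p.igp_metric,))


# Example data (invented): three paths for one prefix, two neighbor-AS groups.
paths = [
    Path("a", neighbor_as=10, local_pref=100, as_path_len=2, origin=0, med=10, igp_metric=5),
    Path("b", neighbor_as=10, local_pref=100, as_path_len=2, origin=0, med=1, igp_metric=20),
    Path("c", neighbor_as=6, local_pref=100, as_path_len=2, origin=0, med=0, igp_metric=10),
]
best_by_group = group_best_paths(paths)
print({asn: p.name for asn, p in best_by_group.items()})
print(overall_best(paths).name)
```

In this example, path "a" has the lowest IGP metric but loses to "b" on MED within the neighbor-AS 10 group, so the Group Best Paths are "b" and "c", and "c" is the overall best path.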
One exception is that a Group Best Path does not need to be advertised if it is lost to another Group Best Path prior to the MED comparison.

6. NLRI Encoding for Route Withdraw

A Group Best Path for an address prefix can be identified by the combination of the neighbor-AS and the address prefix.

The current NLRI encodings specified in [1, 5, 6] suffice for a reachable route as the neighbor-AS is implicitly carried in the AS-PATH attribute of an UPDATE message.

To withdraw a path with a particular neighbor-AS, the current NLRI encodings are revised by prepending the neighbor-AS to the existing encodings. That is, the NLRI encodings specified in [1] and [5] are revised as follows:

   +---------------------------+
   |   Neighbor-AS (4 octets)  |
   +---------------------------+
   |   Length (1 octet)        |
   +---------------------------+
   |   Prefix (variable)       |
   +---------------------------+

and the NLRI encoding specified in [6] is modified as follows:

   +---------------------------+
   |   Neighbor-AS (4 octets)  |
   +---------------------------+
   |   Length (1 octet)        |
   +---------------------------+
   |   Label (3 octets)        |
   +---------------------------+
   .............................
   +---------------------------+
   |   Prefix (variable)       |
   +---------------------------+

The value 0 can be used in the Neighbor-AS field of the revised NLRI encodings to withdraw all the paths associated with an address prefix.

The usage of the revised NLRI encodings is specified in the Operation section.

7. Group Best Path Based Update Capability

The "Group Best Path Based Update Capability" is a new BGP capability [7]. The Capability Code for this capability is specified in the "IANA Considerations" section of this document. The Capability Length field of this capability is one octet. The Capability Value field has only one field, Send-Mode, which specifies the procedures the BGP speaker would follow in sending updates.
The Send-Mode field has two possible values:

   Value    Symbolic Name

     0      Best Path
     1      Group Path

When advertising the capability to an internal peer, or to a confederation external peer, a BGP speaker conveys to the peer that the speaker is capable of receiving the Group Best Path based updates (as well as the best path based updates). In addition, as detailed in the Operation section, the speaker would send either the best path based updates or the Group Best Path based updates, depending on the setting of the Send-Mode field and whether the "Group Best Path Based Update Capability" is received from the peer.

8. Operation

A BGP speaker that has implemented the procedures for receiving the Group Best Path based updates SHOULD advertise the "Group Best Path Based Update Capability" to its internal peers and confederation external peers using BGP Capabilities advertisement [7]. The setting of the Send-Mode field depends on the configuration.

The "Group Best Path Based Update Capability" SHOULD NOT be sent to an external peer. A BGP speaker MUST ignore the "Group Best Path Based Update Capability" from a peer that is neither internal nor confederation external.

A BGP speaker MUST follow the existing best path based update procedures with a peer unless the BGP speaker advertises the "Group Best Path Based Update Capability" and also receives the capability from the peer.

Consider that two BGP speakers A and B advertise the "Group Best Path Based Update Capability" to each other. The NLRI encodings in an UPDATE message MUST follow the ones specified in [1, 5, 6] unless the Send-Mode field of the capability advertised by one speaker is set to "Group Path", in which case that speaker MUST use the revised NLRI encoding specified in this document to withdraw a path.

In addition, the following procedures MUST be followed in formatting and processing UPDATE messages between the two BGP speakers.
- When Speaker A sets the Send-Mode field to "Best Path" in the Capability advertised, it means that Speaker A would follow the existing best path based update procedures in route advertisement. Speaker B MUST follow the best path based update procedures in processing UPDATE messages received from Speaker A.

- When Speaker A sets the Send-Mode field to "Group Path" in the Capability advertised, then Speaker A MUST explicitly update a path identified by the combination of the neighbor-AS and the address prefix. One exception is that the value 0 can be used in the Neighbor-AS field of the NLRI encoding to withdraw all the paths associated with the address prefix.

  Note that the neighbor-AS is implicitly carried in the AS-PATH attribute for a reachable route. When processing a reachable route received from Speaker A, Speaker B must first calculate the neighbor-AS from the AS-PATH attribute, and then use the combination of the neighbor-AS and the address prefix to identify the path being updated.

  When processing a route withdraw from Speaker A, Speaker B MUST use the combination of the neighbor-AS and the address prefix to identify the path being withdrawn, unless the Neighbor-AS field is zero, in which case all paths associated with the address prefix are withdrawn.

9. Route Reflection and Confederation

As discussed in the "What to Advertise" section, the Group Best Paths (instead of the best path) for an address prefix should be advertised by a route reflector or a confederation ASBR.

9.1. Route Reflection

The procedures for sending updates to an IBGP peer by a route reflector are revised as follows:

- A route reflector SHOULD advertise to its clients all the Group Best Paths received from its non-client IBGP peers.

- A route reflector SHOULD advertise to its non-client IBGP peers all the Group Best Paths received from its clients.

- A route reflector MAY also advertise to its clients all the Group Best Paths received from its clients.

A route reflector SHOULD set the Send-Mode field to "Group Path" in the "Group Best Path Based Update Capability" advertised to an IBGP peer.

9.2. Confederation

The procedures for sending updates within a confederation by a confederation ASBR are revised as follows:

- A confederation ASBR SHOULD advertise to its IBGP peers all the Group Best Paths received from its confederation external peers.

- A confederation ASBR SHOULD advertise to confederation external peers all the Group Best Paths learned from its IBGP peers.

A confederation ASBR SHOULD set the Send-Mode field to "Group Path" in the "Group Best Path Based Update Capability" advertised to an IBGP peer, and to a confederation external peer.

9.3. Update Optimization

A Group Best Path does not need to be advertised if it is lost to another Group Best Path in route selection prior to Step C - MED Comparison specified in Sect. 9.1.2.2 of [1]. This optimization is recommended in order to minimize the number of paths advertised.

10. Remarks

Depending on the network topology and routing setup, the proposed mechanism may increase the number of BGP paths advertised, and thus increase the memory usage by these BGP paths. However, the number of BGP paths advertised for an address prefix is bounded by the number of neighbor-AS's (for that address prefix), which is typically small because of the small number of upstream providers for a customer and the nature of advertising only customer routes at the inter-exchange points.

The implementation of the proposed mechanism may require additional route advertisement states to be maintained in order to avoid duplicate updates. However, the additional states for an address prefix would be per neighbor-AS based rather than per path based.
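
As an illustration of the revised withdraw encoding (Section 6) and of the per neighbor-AS state kept by a receiver (Section 8), the following is a minimal sketch for IPv4 prefixes only. The names encode_withdraw, decode_withdraw, and PeerPaths are hypothetical. Network byte order for the Neighbor-AS field and the use of the first AS in the AS-PATH as the neighbor-AS are simplifying assumptions made here; the actual neighbor-AS calculation is the one specified in [1] and [3].

```python
import ipaddress
import struct
from typing import Dict, List, Tuple

WITHDRAW_ALL = 0  # Neighbor-AS value that withdraws every path for the prefix


def encode_withdraw(neighbor_as: int, prefix: str) -> bytes:
    """Encode Neighbor-AS (4 octets), Length (1 octet), Prefix (variable)."""
    net = ipaddress.IPv4Network(prefix)
    n_octets = (net.prefixlen + 7) // 8  # minimum octets holding the prefix
    header = struct.pack("!IB", neighbor_as, net.prefixlen)  # assumed byte order
    return header + net.network_address.packed[:n_octets]


def decode_withdraw(data: bytes) -> Tuple[int, str]:
    """Decode one revised IPv4 withdraw NLRI into (neighbor_as, prefix)."""
    neighbor_as, length = struct.unpack("!IB", data[:5])
    n_octets = (length + 7) // 8
    addr = data[5:5 + n_octets].ljust(4, b"\x00")  # pad back to 4 octets
    return neighbor_as, f"{ipaddress.IPv4Address(addr)}/{length}"


class PeerPaths:
    """Paths learned from one peer that sends Group Best Path based updates."""

    def __init__(self) -> None:
        # State is kept per (prefix, neighbor-AS), not per individual path.
        self.paths: Dict[Tuple[str, int], dict] = {}

    def update(self, prefix: str, as_path: List[int], attrs: dict) -> None:
        """Install or replace the path for this prefix and neighbor-AS."""
        neighbor_as = as_path[0]  # simplification; assumes a non-empty AS-PATH
        self.paths[(prefix, neighbor_as)] = attrs

    def withdraw(self, nlri: bytes) -> None:
        """Remove one path, or all paths for the prefix if Neighbor-AS is 0."""
        neighbor_as, prefix = decode_withdraw(nlri)
        if neighbor_as == WITHDRAW_ALL:
            for key in [k for k in self.paths if k[0] == prefix]:
                del self.paths[key]
        else:
            self.paths.pop((prefix, neighbor_as), None)
```

A receiver built this way would replace a path only when a new reachable route arrives with the same prefix and the same neighbor-AS, which is the property that keeps the additional state per neighbor-AS rather than per path.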

11. Deployment Considerations

While the route reflectors or confederation ASBRs in a network need to send the Group Best Path based updates, the vast majority of the BGP speakers in the network only need to receive the Group Best Path based updates, which would involve just minor software changes:

- Advertise the capability with the Send-Mode set to "Best Path".

- Process received UPDATE messages based on the neighbor-AS and address prefix.

To deploy the mechanism in a network, it is recommended that the BGP speakers (one at a time) be first upgraded to a software version that supports the new capability, and be configured to advertise the new capability with the Send-Mode field set to "Best Path". Then on a per route reflection cluster or confederation sub-AS basis, the route reflectors or the confederation ASBRs are re-configured to set the Send-Mode field to "Group Path" in the capability advertised.

It should be emphasized that in order to eliminate the MED-type route oscillations in a network using the proposed mechanism, the recommended practice for devising a route reflection cluster or confederation sub-AS with respect to the IGP metrics should be followed. The requirement could be relaxed by imposing some other topological constraints in a network, and that is for further study.

12. IANA Considerations

This document defines a new BGP capability (Group Best Path Based Update Capability). The capability code needs to be assigned.

13. Security Considerations

This extension to BGP does not change the underlying security issues inherent in the existing BGP [8].

14. Acknowledgments

TBD

15. References

15.1. References

[1] Rekhter, Y., T. Li, and S. Hares, "A Border Gateway Protocol 4 (BGP-4)", draft-ietf-idr-bgp4-23.txt, November 2003.

[2] Bates, T., R. Chandra, and E. Chen, "BGP Route Reflection - An Alternative to Full Mesh IBGP", RFC 2796, April 2000.

[3] Traina, P., D. McPherson, and J. Scudder, "Autonomous System Confederations for BGP", draft-ietf-idr-rfc3065bis-02.txt, May 2004.

[4] McPherson, D., V. Gill, D. Walton, and A. Retana, "Border Gateway Protocol (BGP) Persistent Route Oscillation Condition", RFC 3345, August 2002.

[5] Bates, T., R. Chandra, D. Katz, and Y. Rekhter, "Multiprotocol Extensions for BGP-4", RFC 2858, June 2000.

[6] Rekhter, Y. and E. Rosen, "Carrying Label Information in BGP-4", RFC 3107, May 2001.

[7] Chandra, R. and J. Scudder, "Capabilities Advertisement with BGP-4", draft-ietf-idr-rfc2842bis-02.txt, April 2002.

[8] Heffernan, A., "Protection of BGP Sessions via the TCP MD5 Signature Option", RFC 2385, August 1998.

[9] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

16. Authors' Addresses

Enke Chen
Redback Networks Inc.
300 Holger Way
San Jose, CA 95134

EMail: enke@redback.com

Naiming Shen
Redback Networks Inc.
300 Holger Way
San Jose, CA 95134

EMail: naiming@redback.com

17. Intellectual Property Notice

The IETF takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on the IETF's procedures with respect to rights in standards-track and standards-related documentation can be found in BCP-11.
Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementors or users of this specification can be obtained from the IETF Secretariat.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights which may cover technology that may be required to practice this standard. Please address the information to the IETF Executive Director.

18. Full Copyright Statement

Copyright (C) The Internet Society (2004). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.