Network Working Group                                         N. Dubois
                                                            B. Decraene
                                                          B. Fondeviole
Internet Draft                                           France Telecom
Document: draft-dubois-bgp-planned-maintenance-00.txt         June 2004
Expiration Date: December 2004


                  Graceful Shutdown of BGP Sessions

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC 2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   To ease the maintenance of BGP-4 sessions and limit the amount of
   traffic that is lost during planned maintenance on routers, a
   specific mechanism is proposed to gracefully shut down a router or a
   session. It is proposed that a router first withdraw the routes it
   has advertised to its peers in order to initiate their convergence.
   Once a timer has expired, the router can proceed with closing the
   BGP sessions and consequently remove its peers' routes from its RIB
   (Routing Information Base).

1. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.


Dubois                  Expires December 2004                  [Page 1]

Internet Draft        BGP planned maintenance                March 2004

2. Introduction

   The BGP-4 protocol is heavily used in service provider networks.
   For resiliency purposes, most IP network operators deploy redundant
   routers to minimize the risk associated with router failures. When a
   Service Provider wants to upgrade or remove a particular router that
   maintains one or several BGP sessions, it is highly desirable to
   avoid any traffic loss.

   Currently, the BGP-4 Finite State Machine (FSM) does not include any
   operation to prevent traffic loss in case of planned maintenance.

   This draft proposes a refinement of the BGP-4 protocol machinery to
   enhance the administrative shutdown process of a BGP-4 peer. It
   proposes a mechanism that allows BGP peers to re-route IP flows
   before they might be discarded by the router to be shut down.

   The proposed approach can be smoothly deployed, since it is fully
   backward compatible with the BGP-4 protocol specification. An
   adequate implementation of this mechanism should minimize the loss
   of traffic in most foreseen maintenance contexts.

3. Planned Shutdown Procedure

3.1. Basic Concept

   When a router needs to be rebooted, we propose a new behavior for
   the BGP-4 sessions that need to be shut down. Instead of sending a
   BGP NOTIFICATION message and/or tearing the TCP session down, we
   propose the following two-step procedure:

   Step 1: The router withdraws all the routes it has advertised to its
   peers (which means it MUST send an UPDATE message to its peers with
   a non-empty "Withdrawn Routes" field). It does not clear any BGP-4
   session, nor does it remove the BGP-4 routes it has received from
   its peers from its Adj-RIB-In and Loc-RIB tables. By doing so,
   forwarding is not impacted and no traffic is lost for any
   destination prefix that is reachable through a backup BGP router.

   Step 2: Once a timer has expired, the BGP-4 sessions are shut down
   and BGP NOTIFICATION messages are sent to the relevant peers with
   error code 6 (Cease). Subcode 2 is also sent if the router supports
   the sending of Cease subcodes [4].
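
   The two steps above can be sketched in a few lines of Python. This
   is only an illustrative model, not part of the proposal: the Router
   class, its field names, the message tuples, and the injected wait
   callable are all hypothetical, and no real BGP implementation is
   implied. Subcode 0 (unspecific) stands in for a router that does not
   support the Cease subcodes of [4].

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

CEASE = 6            # NOTIFICATION error code "Cease"
ADMIN_SHUTDOWN = 2   # Cease subcode 2, as referenced in Step 2


@dataclass
class Router:
    """Hypothetical model of a router performing a planned shutdown."""
    # Prefixes advertised to each peer (Adj-RIB-Out), keyed by peer name.
    adj_rib_out: Dict[str, Set[str]]
    # Prefixes learned from peers; untouched until the sessions close.
    loc_rib: Set[str]
    supports_cease_subcodes: bool = True
    # Log of the messages this router has sent, in order.
    sent: List[Tuple] = field(default_factory=list)

    def graceful_shutdown(self, delay: float,
                          wait: Callable[[float], None]) -> None:
        # Step 1: withdraw everything advertised to each peer, while
        # keeping the sessions up and the Loc-RIB intact.
        for peer, prefixes in self.adj_rib_out.items():
            if prefixes:
                self.sent.append(("UPDATE", peer, tuple(sorted(prefixes))))
                prefixes.clear()

        # Planned maintenance timer: peers converge during this delay.
        wait(delay)

        # Step 2: close every session with a Cease NOTIFICATION.
        subcode = ADMIN_SHUTDOWN if self.supports_cease_subcodes else 0
        for peer in self.adj_rib_out:
            self.sent.append(("NOTIFICATION", peer, CEASE, subcode))

        # Only now are the peers' routes removed.
        self.loc_rib.clear()
```

   Passing the wait function as a parameter keeps the sketch testable;
   a caller that really wants to block could pass time.sleep together
   with the configured timer value.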

   Advantages of this solution are the following:

   - If another router provides an alternate path towards a set of
     destination prefixes, the IP flows are re-routed before the
     session termination and no traffic is lost during rerouting, since
     both the forwarding and the Loc-RIB tables are maintained while
     the peers are re-computing their forwarding tables.

   - This mechanism is backward compatible with the existing BGP-4
     protocol specification, so there are no risks associated with it
     and it can be deployed incrementally.

   Please note that the rerouting is effective even if the backup
   router advertises a less specific route than those that were
   withdrawn by the router to be shut down.

3.2. Examples

   Let us consider the following example (Figure 1 below) where one
   customer router (denoted as "CUST" in the figure) is dual-homed to
   two SP routers, denoted as "ASBR1" and "ASBR2".

                       '
                AS1    '    AS2
                       '
             /-----------ASBR1-----P1----
            /              |
           /               |
       CUST                |
           \               |
            \              |
             \-----------ASBR2-----P2----
                       '
                AS1    '    AS2
                       '

               Figure 1: Dual-Home Peering Example.

   Let us assume that traffic is normally conveyed by the CUST-ASBR1
   link and that the SP wants to shut down ASBR1 for maintenance
   purposes.

   The standard behavior is:

   1. ASBR1 tears all its BGP-4 sessions down.

   2. As a result, it removes all its BGP-4 routes from its RIB and FIB
      tables.

   3. Its BGP-4 peers remove all the routes that were announced by the
      peer being shut down.

   While its peers converge:

   - CUST continues to send traffic to ASBR1. ASBR1 drops this traffic
     because it has no route to the destination.

   - P1 continues to send traffic to ASBR1. ASBR1 either drops this
     traffic because it has no route to the CUST prefixes, or sends
     this traffic back to P1, thus creating a routing loop.

   From the customer's point of view, the traffic is lost during the
   BGP-4 convergence time.

   With the new behavior defined in this document:

   - On all its BGP-4 sessions, ASBR1 withdraws all previously
     announced routes.

   - ASBR1 still has all the routes in its Adj-RIB-In, Loc-RIB, and FIB
     tables, so it keeps forwarding traffic in every direction.

   - Its peers take the BGP-4 withdraw messages into account and start
     rerouting traffic accordingly.

   - Once the timer has expired, ASBR1 closes its BGP-4 sessions with
     its peers. No traffic is lost.

3.3. Specification of the "Planned Maintenance" Timer

   To trigger the shutdown of BGP-4 sessions once BGP UPDATE messages
   with a non-empty Withdrawn Routes field have been sent by the router
   being shut down, a planned maintenance timer SHOULD be available and
   SHOULD be configurable. This timer SHOULD be triggered when all the
   aforementioned BGP UPDATE messages have been sent. One conservative
   suggestion for the initial setting of this timer is 300 seconds.

   When multiple sessions are shut down at the same time, the timer is
   triggered when the last route has been withdrawn. When only one
   session is shut down, the timer is applicable to this session only.

   When the timer expires, BGP NOTIFICATION messages are sent to the
   peers.

                         Timer               Timer
                         Starts              Expires
   |-------------|    |---------------|    |-------------|
   |Normal       |    |BGP            |    |BGP sessions |
   |BGP operation|--->|Withdraws sent |--->|closed       |
   |-------------|    |---------------|    |-------------|

3.4. Specification of the Cease-code notification

   It is REQUIRED to send error code 6 (Cease) in the final BGP
   NOTIFICATION message.

3.5. Applicability Statement

   This mechanism is applicable to e-BGP and i-BGP sessions.

   It SHOULD NOT be used to withdraw BGP NLRI whose BGP Next Hop is not
   the router being shut down. Otherwise, routing loops may appear
   (depending on the network topology) for the duration of the planned
   maintenance timer.
   For that reason, this mechanism is not applicable on a route
   reflector for reflected iBGP sessions.

   This mechanism is applicable to any address family. If the BGP-4
   implementation allows closing a subset of the AFIs carried in an
   MP-BGP-4 session, this mechanism is applicable to this subset of
   AFIs.

   The mechanism provides its best results when the whole BGP-4 process
   is shut down. However, it is also applicable when one session or
   even one address family undergoes a planned maintenance.

3.6. Specification of the Mechanism for the Maintenance of all BGP-4
     Sessions

   The router SHOULD send BGP UPDATE messages with a non-empty
   Withdrawn Routes field for all routes that have been stored in its
   Adj-RIB-Out table.

   When all the necessary UPDATE messages have been sent, it SHOULD
   trigger the planned maintenance timer.

   When the planned maintenance timer expires, it SHOULD close the
   BGP-4 sessions and send the appropriate BGP NOTIFICATION messages.

3.7. Specification of the Mechanism for the Maintenance of One Session

   When only one session is shut down, the mechanism is applicable if
   another mechanism such as graceful restart [3] cannot be applied.
   Its purpose is to limit the amount of traffic that will be dropped.
   The detailed behavior is explained below.

   For the session to be shut down, the router SHOULD send BGP UPDATE
   messages with a non-empty Withdrawn Routes field for all the
   relevant routes that have been stored in the Adj-RIB-Out table.

   For the rest of the active BGP-4 sessions, the router SHOULD send
   BGP UPDATE messages with a non-empty Withdrawn Routes field for all
   the relevant routes that have been stored in the Adj-RIB-In of the
   maintained session and advertised to other peers. This prevents
   peers from sending traffic destined to the maintained BGP peer and
   initiates their convergence toward a new path.
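
   The selection of routes to withdraw for a single maintained session
   can be illustrated with a short Python sketch. The function name and
   the dictionary-based RIB representation are hypothetical, and the
   sketch makes one simplifying assumption that is not in the text
   above: any prefix present in the Adj-RIB-In of the maintained
   session and advertised to another peer is assumed to be advertised
   because of that session (no best-path comparison is modelled).

```python
from typing import Dict, Set


def withdrawals_for_session(
    maintained: str,
    adj_rib_in: Dict[str, Set[str]],
    adj_rib_out: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Return, per peer, the prefixes to place in the Withdrawn Routes field.

    adj_rib_in[peer]  : prefixes received from that peer
    adj_rib_out[peer] : prefixes advertised to that peer
    """
    learned = adj_rib_in.get(maintained, set())
    plan: Dict[str, Set[str]] = {}
    for peer, advertised in adj_rib_out.items():
        if peer == maintained:
            # On the maintained session: withdraw everything advertised.
            targets = set(advertised)
        else:
            # On other sessions: withdraw only what was learned over the
            # maintained session and then advertised to this peer.
            # ASSUMPTION: the maintained session is the source of the
            # advertised path for these prefixes.
            targets = advertised & learned
        if targets:
            plan[peer] = targets
    return plan
```

   In the topology of Figure 1, maintaining only the CUST session on
   ASBR1 would, under this model, withdraw the SP routes toward CUST
   and the customer prefixes toward P1, while leaving any other routes
   advertised to P1 in place.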

   Since the Adj-RIB-In and the Loc-RIB contents are not modified by
   the mechanism, the router keeps forwarding traffic for all the
   routes that are being withdrawn for maintenance purposes.

   When the planned maintenance timer expires, the router SHOULD close
   the BGP-4 session and send the appropriate BGP NOTIFICATION message.

   Note that this mechanism is not applicable if the router neither
   maintains an Adj-RIB-In nor supports route refresh, since it needs
   to know the NLRI contained in each Adj-RIB-In.

   Interaction with Other Peers

   While the "planned maintenance" timer is running, the router SHOULD
   keep on processing BGP-4 messages received from its peers, in order
   to keep in sync with the other BGP peers. Otherwise, if the BGP
   peers do not all have a consistent view of the network, routing
   loops may occur.

   If it receives a BGP UPDATE message (with a possibly non-empty
   Withdrawn Routes field), it MUST process it and update the contents
   of its routing tables accordingly. If necessary, it MUST advertise
   the modification to all its peers except the peer from which it
   received the UPDATE message.

   The router initiating the planned maintenance SHOULD NOT send any
   new BGP messages through the BGP session(s) being maintained
   because:

   - These peers SHOULD NOT receive any new routes, so the router
     SHOULD NOT send BGP UPDATE messages towards such peers.

   - They have no active routes received from this router, so there is
     no need to send new BGP UPDATE messages with a non-empty Withdrawn
     Routes field: these routes have already been withdrawn.

3.8. Configuration and Deployment Guidelines

   It is recommended that the "planned maintenance" mechanism be
   automatically triggered when an operator shuts down a router that
   has some active BGP-4 sessions, or when an operator wants to tear a
   BGP-4 session down.

   Remark: In some existing implementations, it is possible to obtain
   the desired behavior with route policy configuration statements.
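
   The rules given under "Interaction with Other Peers" in Section 3.7
   can be sketched as follows. This is a deliberately simplified model
   rather than a description of any implementation: the function name
   relay_update is hypothetical, the Loc-RIB is reduced to a mapping
   from prefix to the peer it was learned from, and best-path selection
   is not modelled at all.

```python
from typing import Dict, Iterable, List, Set, Tuple

# (destination peer, announced prefixes, withdrawn prefixes)
Message = Tuple[str, Tuple[str, ...], Tuple[str, ...]]


def relay_update(
    source: str,
    announced: Iterable[str],
    withdrawn: Iterable[str],
    loc_rib: Dict[str, str],
    peers: Iterable[str],
    maintained: Set[str],
) -> List[Message]:
    """Process an UPDATE received while the maintenance timer runs.

    The routing table is always updated, but nothing new is sent on
    sessions under maintenance or back to the peer that sent the UPDATE.
    """
    # Remove only the routes that were actually learned from `source`.
    removed = [p for p in withdrawn if loc_rib.get(p) == source]
    for prefix in removed:
        del loc_rib[prefix]

    # ASSUMPTION: a newly announced route simply replaces any existing
    # entry; a real implementation would run the BGP decision process.
    added = list(announced)
    for prefix in added:
        loc_rib[prefix] = source

    if not added and not removed:
        return []

    return [
        (peer, tuple(sorted(added)), tuple(sorted(removed)))
        for peer in peers
        if peer != source and peer not in maintained
    ]
```

   Keeping the table update unconditional while filtering only the
   outgoing messages mirrors the text above: the router stays in sync
   with its peers, yet the sessions being maintained see no new routes.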

4. Security Considerations

   The BGP-4 shutdown mechanism described in this document does not
   introduce any change as far as the Security Considerations section
   of [1] is concerned.

5. Normative References

   [1] Rekhter, Y. and T. Li (editors), "A Border Gateway Protocol 4
       (BGP-4)", Internet Draft draft-ietf-idr-bgp4-23.txt.

   [2] Bates, T., Rekhter, Y., Chandra, R. and D. Katz, "Multiprotocol
       Extensions for BGP-4", RFC 2858, June 2000.

   [3] Sangli, S., Rekhter, Y., Fernando, R., Scudder, J. and E. Chen,
       "Graceful Restart Mechanism for BGP", Work in Progress
       (draft-ietf-idr-restart-10.txt).

   [4] Chen, E. and V. Gillet, "Subcodes for BGP Cease Notification
       Message", draft-ietf-idr-cease-subcode-05.txt, March 2004.

   [5] Rosen, E. and Y. Rekhter, "BGP/MPLS VPNs", RFC 2547, March 1999.

6. Informative References

   [6] Kompella, K. and Y. Rekhter, "Virtual Private LAN Service",
       draft-ietf-l2vpn-vpls-bgp-01.txt, January 2004, Work in
       Progress.

7. Acknowledgments

   The authors would like to thank Christian Jacquenet, Vincent Gillet,
   Xavier Vinet and Jean-Louis le Roux for the useful discussions on
   this subject, and for their review and comments.

8. Authors' Addresses

   Nicolas Dubois
   France Telecom R&D
   38-40 rue de general Leclerc
   92794 Issy Moulineaux cedex 9
   France
   Email: nicolas.dubois@francetelecom.com

   Bruno Decraene
   France Telecom R&D
   38-40 rue de general Leclerc
   92794 Issy Moulineaux cedex 9
   France
   Email: bruno.decraene@francetelecom.com

   Benoit Fondeviole
   France Telecom R&D
   38-40 rue de general Leclerc
   92794 Issy Moulineaux cedex 9
   France
   Email: benoit.fondeviole@francetelecom.com

IPR Disclosure Acknowledgement

   By submitting this Internet-Draft, I certify that any applicable
   patent or other IPR claims of which I am aware have been disclosed,
   and any of which I become aware will be disclosed, in accordance
   with RFC 3668.

Full Copyright Statement

   Copyright (C) The Internet Society (2004). All Rights Reserved.
   This document is subject to rights licences and restrictions
   contained in BCP 78 and except as set forth therein, the authors
   retain all their rights.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist its implementation may be prepared, copied, published and
   distributed, in whole or in part, without restriction of any kind,
   provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.