INTERNET DRAFT                                            H. Berkowitz
draft-berkowitz-bgpcon-00.txt                          Nortel Networks
                                                          January 2001

      Benchmarking Methodology for Exterior Routing Convergence

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright (C) The Internet Society (2001). All Rights Reserved.

Abstract

   This document defines a specific set of tests that vendors can use
   to measure and report the convergence performance of BGP-4
   processes. It does not consider the forwarding performance of such
   routers once they have converged. A separate document will define
   convergence in interior routing.

   This memo will consider changes in forwarding performance while a
   router is reconverging, but RFC 2544 remains the methodology
   document for benchmarking forwarding performance.

1. Introduction

   This document defines a specific set of tests that implementers can
   use to measure and report the convergence performance of BGP
   routers. It does not consider the forwarding performance of such
   routers once they have converged, with the caveat that the effect of
   the reconvergence process on forwarding performance can be
   considered. Indeed, the techniques here are appropriate for pure
   route servers as well as for devices that do both path determination
   and packet forwarding.
   The results of these tests will provide the user with comparable
   data from different vendors with which to evaluate these devices.
   RFC 2544 remains the methodology document for forwarding
   performance.

   Labovitz, Ahuja, et al. have done, and are continuing to do,
   valuable work on Internet-wide convergence. Their measurements,
   however, reflect a wide range of factors affecting convergence,
   including media speeds, propagation times, policies, etc. Whenever
   possible, terminology in this document is consistent with Labovitz
   et al. Their presentation [Ahuja 2000a] does not formalize the
   definition of convergence, but, in any case, there appear to be
   several useful meanings of "BGP convergence time." Lack of standard
   terminology both makes research results difficult to compare and
   generates FUD for Internet operators and consumers. Existing
   benchmarking documents, such as RFC 2544, focus on forwarding
   performance rather than convergence.

2. Requirements

   The key words MUST, SHOULD, etc., are used as defined in RFC 2119.

3. Workloads and Scenarios

   Providing useful convergence information for BGP routers depends
   significantly on the intended use of the router. Workload,
   principally the size of the full routing table and the number of BGP
   peers, but also additional processing such as route filtering, flap
   dampening, authentication, etc., will affect any router. Not all BGP
   routers are intended for the same applications. This section
   presents some representative scenarios, but, in practice, the tester
   of a given router will need to develop workload parameters that are
   appropriate for the intended purpose.

   The goal of this specification is not to prescribe numeric values
   for these parameters, but simply to identify the parameters and
   require them to accompany a compliant test report.

   A given test report must include:

      Number of routes to be in the converged routing table of the
         device under test (DUT)
      Number of eBGP peers
      For each peer, the number of routes to be received and to be
         advertised
      Number of iBGP peers

   The number of routes will vary with the proposed application.
   Realistic numbers should be based on the size of a current
   default-free routing table (exclusive of internal routes). This
   table is referred to as DFRT and the number of routes it contains as
   NDFRT. It is the Routing Information Base (RIB) of the DUT.
   Depending on the router implementation, one or more Forwarding
   Information Bases (FIB) may need to be generated from the RIB before
   a router can advertise and forward at full speed.

   Be aware that many service providers will have substantial numbers
   of internal and non-aggregated customer routes, so the routing table
   of a large provider's core router could very well contain 1.5 NDFRT
   or more routes. Smaller RIBs may be used with routers explicitly
   intended for edge use with defaults, provided the assumptions are
   cited.

   Appendix A presents some scenarios for typical BGP applications.

4. Types of Convergence

   Two significantly different types of convergence time tend to be
   lumped together in product specifications. The first is the time
   needed for a BGP speaker to build a full table after initialization,
   or for a particular peering session to rebuild its table after a
   hard reset. The second is the time needed for a router to respond to
   a new announcement or withdrawal.

4.1 Reference Configuration

   For tests in which the number of peers is not a performance
   parameter of interest, use the configuration in Figure 1:

      TR1==========+---------+==========TR3
       |           |         |
       D1          |         |
       |           |   DUT   |
      TR2==========|         |
                   +---------+

   D1 is a prefix reachable by both TR1 and TR2. It is assumed that
   neither TR1 nor TR2 is the originating AS for the announcement of
   D1.

   More complex peering arrangements will involve up to n Test Routers,
   as shown in Figure 2.
   It is recommended that the Figure 1 configuration always be tested
   as a baseline, and that additional reports then be made to show the
   effect on performance of increasing the number of peers.

      TR1==========+---------+==========TR3
       |           |         |
       D1          |         |
       |           |   DUT   |
      TR2==========|         |
                   |         |
      ...
      TRn==========+---------+

                      Figure 2

   Interface speeds will be specified as part of the test report. At
   least 100 Mbps is recommended, so media delays are not a significant
   component of the convergence time.

   In the absence of other route selection criteria, TR1 shall have an
   IP address that makes it most preferred.

4.3 Events in the Convergence Process

   [Ahuja 2000a] defines the events:

      Tup    -- A new route is advertised
      Tdown  -- A route is withdrawn (i.e., single-homed failure)
      Tshort -- Advertise a shorter/better ASPath (i.e., primary path
                repaired)
      Tlong  -- Advertise a longer/worse ASPath (i.e., primary path
                fails)

   In this memo, the meanings of Tup and Tdown are preserved and
   extended from [Ahuja 2000a]. The notation Tup(TRx) means a Tup event
   advertised by test router TRx to the router being tested (i.e., the
   DUT). The sense of the Tshort and Tlong events is also preserved,
   but the basic criterion for selecting a "better" route is the final
   tiebreaker defined in RFC 1771, the router ID. As a consequence,
   this memorandum uses the events Tbetter, Tworse, and Tbest.

   While ASPath is quite likely to be the most common tiebreaker in the
   operational Internet, it is not actually part of the RFC-defined
   route preference algorithm. AS path prepending is another widely
   used, though nonstandard, means of influencing route preference, but
   questions have been raised regarding its scalability in an
   ever-growing Internet.

5. Measurement

   Measurements can be defined either as internal or external. Internal
   measurements examine the RIB/FIB of the DUT. While they are more
   accurate in principle, they require measurement hooks in the
   implementation, as described in [Trotter].
   External measurements start with a stimulus from one or more
   "upstream" routers and end with a specific event causing an
   advertisement to be sent to a "downstream" peer. In the reference
   configuration above, external measurements are defined with respect
   to TR3 as the downstream router.

6. eBGP tests

   All routers in this configuration have a policy of
   ADVERTISE ALL/ACCEPT ALL [RFC 2622]. Tests with prefix filtering,
   community-based preferences, authentication, etc., as well as
   performance under flap, are TBD.

   Not all eBGP applications are alike. While the tests in this section
   are applicable to a wide range of configurations, testers may select
   configurations that are most relevant to the intended product use.
   Such configurations include:

   1. Interprovider peering, characterized by an exchange of customer
      routes, which, in the case of major providers, may be in the tens
      of thousands of routes but smaller than the default-free table.

   2. Transit services, where the transit customer advertises a
      relatively small number of routes toward the provider, but
      variously may take full default-free routes, customer routes, or
      default only from the provider.

6.1 eBGP Initial Convergence

   While initial convergence is relatively simple to measure, and often
   is the basis of product specifications, it is operationally far less
   significant than reconvergence after changes. A "carrier-grade"
   router should not initialize often, and the soft reset option
   reduces the need to rebuild views. The initialization time,
   therefore, can be amortized over a long period of time and may
   disappear into the noise when compared to reconvergence.

6.1.1 Initial Convergence Time

   The test begins with OPEN messages sent from TR1 and TR2 to the DUT.
   Each Test Router sends a standard routing table of TBD routes. The
   test ends when the DUT begins to advertise the last route in the
   routing table to TR3.
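
   As an illustration only, the external form of this measurement can
   be sketched as a computation over a timestamped event log collected
   at the test routers. The Event record, its field names, the event
   kinds "OPEN_SENT" and "UPDATE_RECEIVED", and the function name are
   all hypothetical and are not defined by this memo. The sketch also
   assumes that the test routers share a synchronized clock, that the
   start of the test is the earliest OPEN from an upstream router, and
   that the tester knows which prefix is the "last route" in the table.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Event:
    """One timestamped observation at a test router (hypothetical format)."""
    time: float                   # seconds, on a clock assumed common to all TRs
    router: str                   # "TR1", "TR2", or "TR3"
    kind: str                     # "OPEN_SENT" or "UPDATE_RECEIVED"
    prefix: Optional[str] = None  # prefix carried by an UPDATE; None for OPEN


def initial_convergence_time(
    events: Iterable[Event],
    last_prefix: str,
    upstream: Tuple[str, ...] = ("TR1", "TR2"),
    downstream: str = "TR3",
) -> float:
    """Seconds from the first upstream OPEN to the first arrival of
    last_prefix at the downstream router."""
    events = list(events)
    # Start of test: OPENs sent by the upstream test routers.
    opens = [e.time for e in events
             if e.kind == "OPEN_SENT" and e.router in upstream]
    # End of test: the DUT's advertisement of the last route, as seen at TR3.
    adverts = [e.time for e in events
               if e.kind == "UPDATE_RECEIVED"
               and e.router == downstream
               and e.prefix == last_prefix]
    if not opens:
        raise ValueError("no OPEN sent by an upstream test router")
    if not adverts:
        raise ValueError("last route never advertised to the downstream router")
    return min(adverts) - min(opens)


# Made-up log purely for illustration; prefixes are documentation ranges.
log = [
    Event(10.0, "TR1", "OPEN_SENT"),
    Event(10.25, "TR2", "OPEN_SENT"),
    Event(30.0, "TR3", "UPDATE_RECEIVED", "192.0.2.0/24"),
    Event(42.5, "TR3", "UPDATE_RECEIVED", "198.51.100.0/24"),
]
print(initial_convergence_time(log, "198.51.100.0/24"))  # 32.5
```

   Because the end of the interval is observed at TR3, the result
   includes the media delay between the DUT and TR3 as well as the
   DUT's own processing time.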

6.2 eBGP Reconvergence

   For all of these measurements, report any route filters,
   authentication, and reverse path verification used.

6.2.1 Time to Add Newly Advertised Route

   The DUT has been initialized, with no path to D1. Measurement time
   begins when TR1 announces D1 to the DUT.

6.2.1.1 Time to Readvertise D1

   Measurement time stops when the DUT advertises D1 to TR3.

6.2.1.2 Time to Begin Forwarding to D1

   Prior to TR1 advertising D1, TR2 attempts to forward to TR3 via the
   DUT. Measurement time ends when TR3 receives a TR1-originated packet
   via the DUT.

6.2.2 Time to Change to Alternate Path after Withdrawal

   The DUT has been initialized and has paths to D1 via both TR1 and
   TR2. TR1's path is preferred, but TR1 withdraws it with Tdown(TR1).
   Reconvergence occurs when the TR2-advertised path becomes active.

6.2.2.1 Time to Readvertise D1

   Measurement time stops when the DUT advertises D1 to TR3.

6.2.2.2 Time to Begin Forwarding to D1

   Prior to TR1 advertising D1, TR2 attempts to forward to TR3 via the
   DUT. Measurement time ends when TR3 receives a TR1-originated packet
   via the DUT.

6.2.3 Time to Reconverge after Sequential Withdrawal and New
      Announcement

   The DUT has been initialized and has a path to D1 via TR1, not TR2.
   Simultaneously, TR1 sends Tdown(TR1) and TR2 announces the new route
   with Tbest(TR2).

6.2.3.1 Time to Readvertise D1

   Measurement time stops when the DUT advertises D1 to TR3.

6.2.3.2 Time to Begin Forwarding to D1

   Prior to TR1 advertising D1, TR2 attempts to forward to TR3 via the
   DUT. Measurement time ends when TR3 receives a TR1-originated packet
   via the DUT.

7. iBGP

7.1 Mesh tests

   Repeat the topologies of Section 6, but within the same AS. The test
   report shall show the specific test configuration(s). It is highly
   desirable that the result show the effect of increasing the number
   of peers on routing performance.

7.2 Route Reflector tests

      TR1==========+---------+==========TR3
       |           |         |
       D1          |         |
       |           |   DUT   |
      TR2==========|         |
                   |         |
      ...
      TRn==========+---------+

7.2.1 DUT as Route Reflector

   The DUT acts as the cluster server in a single-server cluster. Let
   TR1 and TR2 be clients of the DUT, and repeat the tests of
   Section 6.

7.2.2 DUT as Route Reflector in Multiple-Reflector Cluster

   The DUT acts as one of the cluster servers in a multi-server
   cluster. TRn will be the additional server. There will be iBGP
   peering between TRn and DUT, between DUT and TR1, between TRn and
   TR1, between DUT and TR2, and between TRn and TR2. Let TR1 and TR2
   be clients of the DUT, and repeat the tests of Section 6.

7.2.3 DUT as Route Reflector Client

   The DUT acts as a client in a single-server cluster. Let TR1 be the
   cluster reflector. TR2, and additional routers as desired, serve as
   clients. Test results shall state the number of clients.

7.2.4 DUT as Route Reflector Client in Multiple-Reflector Cluster

   The DUT acts as one of the clients in a multi-server cluster. TRn
   will be the additional server. There will be iBGP peering between
   TR1 and TRn, between DUT and TR1, between DUT and TRn, between TR2
   and TR1, and between TR2 and TRn.

8. Modifiers

   It might be useful to know the DUT performance under a number of
   conditions; some of these conditions are noted below. The reported
   results SHOULD include as many of these conditions as the test
   equipment is able to generate. The suite of tests SHOULD be first
   run without any modifying conditions and then repeated under each of
   the conditions separately.

8.1 Filters

8.1.1 Representative Customer Ingress Filtering

   Following the principles of [RFC 2827], perform the eBGP tests with
   a filter to accept a single prefix from TR1, while the DUT is being
   sent a 10-route table and a full (TBD) table.

8.2 Bursty traffic/route flap

   Let TRF be a router that will generate only flapping routes.

      TR1==========+---------+==========TR3
       |           |         |
       D1          |         |
       |           |   DUT   |
      TR2==========|         |
                   |         |
      ...
      TRF==========+---------+

8.2.1 Flap Isolation Test

   TRF will advertise a continuously flapping route.
   Repeat the eBGP convergence tests.

8.2.2 Flap Rejection Tests

   Repeat the eBGP Reconvergence Tests while one route in the TR1
   peering flaps continuously.

8.3 Communities

8.3.1 Community-based Acceptance

   Perform the eBGP tests with a filter to accept TBD prefixes tagged
   with community XXX, sent as part of a full (TBD) table.

8.3.2 Community Advertising

   Perform the eBGP advertising tests, but adding a community YYY.

9. Security Considerations

   Security issues are not addressed in this document.

10. Acknowledgements

   Thanks to Francis Ovenden for review and Abha Ahuja for
   encouragement.

11. References

   [Ahuja 2000a]  "An Experimental Study of Delayed Internet Routing
                  Convergence." Abha Ahuja, Farnam Jahanian, Abhijit
                  Bose, Craig Labovitz. RIPE 37 - Routing WG.

   [RFC 2439]     "BGP Route Flap Damping." C. Villamizar, R. Chandra,
                  R. Govindan. November 1998.

   [RFC 2544]     "Benchmarking Methodology for Network Interconnect
                  Devices." S. Bradner, J. McQuaid. March 1999.

   [RFC 2622]     "Routing Policy Specification Language (RPSL)."
                  C. Alaettinoglu, C. Villamizar, E. Gerich, D. Kessens,
                  D. Meyer, T. Bates, D. Karrenberg, M. Terpstra.
                  June 1999.

   [RFC 2827]     "Network Ingress Filtering: Defeating Denial of
                  Service Attacks which employ IP Source Address
                  Spoofing." P. Ferguson, D. Senie. May 2000.

   [RFC 2918]     "Route Refresh Capability for BGP-4." E. Chen.
                  September 2000.

   [Trotter]      "Terminology for Forwarding Information Base (FIB)
                  based Router Performance Benchmarking." Work in
                  Progress, IETF draft-ietf-bmwg-fib-term-00.txt

12. Author's Address

   Howard Berkowitz
   Nortel Networks
   5012 S. 25th St
   PO Box 6897
   Arlington VA 22206
   Phone: +1 703 998-5819 (ESN 451-5819)
   Fax:   +1 703 998-5058
   EMail: hberkowi@nortelnetworks.com
          hcb@clark.net

Full Copyright Statement

   Copyright (C) The Internet Society (2000). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.