v6ops                                                         B. Dickson
Internet-Draft                                       Afilias Canada, Inc
Expires: August 9, 2008                                 February 6, 2008

Aggregation: Methods and Benefits, for IPv4, IPv6 or other binary addresses
draft-dickson-v6ops-aggregation-00

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on August 9, 2008.

Copyright Notice

Copyright (C) The IETF Trust (2008).

Dickson Expires August 9, 2008 [Page 1]
Internet-Draft Aggregation Methods and Benefits February 2008

Abstract

This Internet Draft discusses general benefits of aggregation, with quantitative examples of different aggregation strategies for the same set of allocations. Recommended "best practices" for service providers and enterprises are listed, as well as "how-to" information.

Author's Note

This Internet Draft is intended for Informational status, for v6ops or other suitable WG.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [2].

Table of Contents

   1. Background
   2. Description of the Problem: Route Table Scaling
   3. Allocation vs Aggregation
   4. Block Allocation vs Assignments
   5. Locations for allocations
   6. Allocation Pools
   7. Sizes of allocations
      7.1. Example
   8. Benefits and Consequences of Hierarchical Aggregation
      8.1. Caveats
      8.2. Law of Scaling
      8.3. Routing Table Size
      8.4. Routing Stability
      8.5. Routing Convergence
   9. Security Considerations
   10. IANA Considerations
   11. Acknowledgements
   12. Informative References
   Author's Address
   Intellectual Property and Copyright Statements

1. Background

IPv4 and IPv6 are both protocols with address schemes that involve binary addresses and longest-match routing. This allows for routing tables with overlapping network entries that differ in the length of the entry.

The real benefit of longest-match routing is that summary routing information, i.e. aggregate routes, can be used without needing the more-specific prefixes. Distribution of more-specific prefixes can be limited to locations where they are needed to disambiguate next-hop choices. This tends to be topologically restricted, and thus only the aggregate routes are needed in the wider distribution of prefixes.

In order to achieve maximum benefit from this set-up, however, allocation of prefixes needs to be done in such a way as to support aggregation. Prefixes need to be assigned from topologically-appropriate aggregate blocks.

We do not discuss the assignment techniques within these blocks in this document; that is a subject for another document.

2. Description of the Problem: Route Table Scaling

The problem being addressed is the practical operational issue on the (hybrid IPv4+IPv6) Internet: routing table size. Table size is impacted by address aggregation and address allocation, two activities that go hand-in-hand.

This is true both in the context of the global routing table and in the internal routing table requirements of individual ISPs and enterprises.

Routing convergence time, while less obvious, is another function which is scale-sensitive. Hardware forwarding tables (where they exist) are also very scale-limited, expensive, and generally only upgradeable by upgrading the routers containing them. These grow approximately 1:1 with the routing table for any given router.

It should be noted that aggregation itself cannot be controlled directly or mandated on its own. The ability to aggregate is driven directly by the allocation schemes used.

3. Allocation vs Aggregation

Aggregation is the act of combining a number of smaller things into a bigger thing.
In the context of prefixes in an internet, aggregation can only occur on bit boundaries, and only when the objects being combined are contiguous and have sufficiently similar properties (such as AS path). Most important is the contiguous property.

Consequently, one way to view aggregation is as the reverse of de-aggregation, i.e. announcing more-specific prefixes from a CIDR block.

Unless an initial allocation and any reservation for further allocation are contiguous, no aggregation between the two is possible. So, initial and subsequent allocations from the same larger block in fact look like a de-aggregation followed by aggregation.

Internal use and assignment can similarly be viewed as de-aggregation, with the summarization happening at the border of the entity doing the aggregation (e.g. an ASN border router).

4. Block Allocation vs Assignments

In order to avoid confusion, we will refer to allocations of prefixes to third parties (e.g. end-users), or of internal prefixes to groups or departments (for an enterprise), as "Assignments".

When we use the term "Allocation", we are referring to the creation of an aggregate block out of which subsequent assignments will be made.

The main issues we are concerned with are:

   o  Size of allocations
   o  Structure of allocations
   o  Size-specific allocation pools

5. Locations for allocations

Bottom-tier allocations should be made as close to customers as possible, ideally on aggregation routers.

Subsequent tier allocations should be made wherever topology permits, such as by Point-of-Presence (POP), city, region, and continent, as appropriate.

6. Allocation Pools

It is not strictly necessary to create pools from which only one size of assignment is made. While such a scheme may be simpler to administer, it leads to inefficiency, especially when too many pool sizes are needed, or where per-size pools do not map well to the aggregation topology.

The main considerations should be whether there is uncertainty over the distribution of larger assignments among the allocation blocks, and whether the sizes of assignments might be so large as to dominate blocks.

It should be reasonable to include assignments (in terms of predicted use by allocation) up to 1/4 the size of the allocation block itself. The presence of predicted/reserved space in the allocation which is unused is not likely to dramatically affect the HD ratio for a few instances.

With optimum assignment techniques, assignment ability is not order-sensitive. This means that if sufficient space remains, assignments should be possible.

Exhausting an allocation is not overly significant. The only impact is that a new allocation is needed when this occurs.

7. Sizes of allocations

The way to ensure proper sizing of allocations is to gauge the per-location requirements, in terms of number of assignments and assignment sizes, and to work bottom-up.

At each layer of aggregation, the total of assignments needed (in binary) gets rounded up to the next power of 2. This becomes the appropriate allocation size.

Note well: the sizes of "sibling" allocations within a higher-tier allocation block do not need to be the same. Variable-size blocks should in fact be expected. Efficient assignment techniques are sufficient to ensure optimum packing of the subordinate allocations.

Depending on the address assignment regime, it may be appropriate to "pad" the assignments themselves universally, e.g. by adding some number of bits to the size, such as 4 bits.
This is iterated at each layer of the hierarchy of aggregation. The presumption is that assignments can and will be made with near-optimal efficiency.

It is necessary to work from some basic understanding of the ability to populate customer assignments on the bottom-tier aggregation locations:

   o  Slightly conservative is better than either extreme.
   o  Too small an allocation results in poor aggregation and little-to-no benefit.
   o  Too large an allocation can result in space exhaustion, renumbering, or hole-punching, all of which are bad results.
   o  Slightly small means some allocations might be outgrown, with little waste.
   o  Slightly large means room for growth by customers is available.
   o  At the bottom level, the very-long-term view should be taken, e.g. max out the access equipment allocations.

7.1. Example

Let's consider a medium-sized ISP, with a mix of business and residential DSL customers. The ISP has several POPs in a few cities. The ISP decides that residential customers will get /56 prefix assignments (e.g. via DHCP with prefix delegation), small businesses also get /56's, and large businesses get /48's.

Here is how the bottom-up worksheet would be done, and what the allocation block sizes would be. Note the differences in sizes of siblings.
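The sizing rule from Section 7 (total the demand at a layer, then round up to the next power of 2) can be sketched in a few lines of Python. This is only an illustration: the function name aggregate_prefix_len and the (count, prefix length) input format are assumptions made here, not part of the draft, and no optional "padding" bits are applied. The worksheet entries that follow are consistent with this arithmetic.

```python
def aggregate_prefix_len(demands):
    """Return the prefix length of the smallest block that holds all demands.

    `demands` is a list of (count, prefix_len) pairs, e.g. (200, 56) for
    200 assignments of size /56. Hypothetical helper for illustration only.
    """
    # Measure everything in units of the longest (smallest) prefix present.
    unit = max(plen for _, plen in demands)
    total_units = sum(count << (unit - plen) for count, plen in demands)
    # Rounding the total up to the next power of 2 needs this many bits.
    bits_needed = (total_units - 1).bit_length()
    return unit - bits_needed


# Bottom tier: the four residential DSLAMs of POP A1 in the example.
dslams_a1 = [
    aggregate_prefix_len([(200, 56)]),
    aggregate_prefix_len([(600, 56)]),
    aggregate_prefix_len([(2000, 56)]),
    aggregate_prefix_len([(200, 56)]),
]
# Next tier up: the POP block must hold one block per DSLAM.
pop_a1 = aggregate_prefix_len([(1, plen) for plen in dslams_a1])
print(dslams_a1, pop_a1)
```

The same function is applied again at each higher tier (POP, city, ISP), feeding in the block sizes computed one tier below, which is why sibling blocks naturally come out with different sizes.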
City A
   POP A1
      Residential DSLAM 1: 200 x /56           -- aggregate of /48
      Residential DSLAM 2: 600 x /56           -- aggregate of /46
      Residential DSLAM 3: 2000 x /56          -- aggregate of /45
      Residential DSLAM 4: 200 x /56           -- aggregate of /48
      Total aggregate for POP A1               -- aggregate of /44
   POP A2
      Business DSLAM 1: 200 x /56 + 10 x /48   -- aggregate of /44
      Business DSLAM 2: 20 x /48               -- aggregate of /43
      Residential DSLAM 3: 1000 x /56          -- aggregate of /46
      Residential DSLAM 4: 500 x /56           -- aggregate of /47
      Total aggregate for POP A2               -- aggregate of /42
   Total aggregate for City A                  -- aggregate of /41

City B
   POP B1
      Residential DSLAM 1: 200 x /56           -- aggregate of /48
      Residential DSLAM 2: 500 x /56           -- aggregate of /47
      Residential DSLAM 3: 200 x /56           -- aggregate of /48
      Total aggregate for POP B1               -- aggregate of /46
   POP B2
      Business DSLAM 1: 30 x /56 + 4 x /48     -- aggregate of /45
      Business DSLAM 2: 10 x /48               -- aggregate of /44
      Residential DSLAM 3: 200 x /56           -- aggregate of /48
      Residential DSLAM 4: 500 x /56           -- aggregate of /47
      Total aggregate for POP B2               -- aggregate of /43
   Total aggregate for City B                  -- aggregate of /42

City C
   POP C1
      Residential DSLAM 1: 200 x /56           -- aggregate of /48
      Residential DSLAM 2: 600 x /56           -- aggregate of /46
      Residential DSLAM 3: 2000 x /56          -- aggregate of /45
      Residential DSLAM 4: 200 x /56           -- aggregate of /48
      Business DSLAM 5: 4 x /48 + 20 x /56     -- aggregate of /45
      Total aggregate for POP C1               -- aggregate of /43
   Total aggregate for City C                  -- aggregate of /43

Total aggregate for ISP                        -- aggregate of /40

8. Benefits and Consequences of Hierarchical Aggregation

8.1. Caveats

Aggregation, allocations, and assignments are not one-time events.
They are necessarily a process, which must be maintained with great diligence if the benefits are to be maintained. The effort required to keep this ordered structure is substantially less than that of cleaning up the results of badly managed address allocations.

The main caveat, however, is that all address assignments *and* allocations must be viewed, at all levels, as "non-portable". Assignments and allocations are fundamentally tied to their topological locations. This means:

   o  If an entity gets moved topologically, all of its assignments or allocations must be renumbered.
   o  The new assignments or allocations must come from the new topological parent.
   o  The old assignments/allocations must be returned to the old parent.

As a general rule, those who move more often should have less dependence on the permanence of their addresses. This means that the impact on the recipient of addresses should not be dire. It also means that operators should ensure that they have tools which are as good at supporting renumbering as they are at numbering new recipients.

The following IPv6 example is used to demonstrate the number of prefixes needed for reaching destinations in an Interior Gateway Protocol (IGP) area of a given size. The laws governing scalability are identified, and examples are used to illustrate how important scaling is to IGPs, regardless of specific IGP technology.

The presumption being made is that allocation is made according to the topology, and that aggregation is done to the maximum degree possible, everywhere it is possible.
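As a rough preview of the arithmetic used in the example, the per-level prefix counts can be sketched as follows. This is one reading of the example, not a definitive model: it assumes each aggregating router carries the prefixes of its own level, the prefixes of the level directly below it (if any), and a single covering aggregate when a level exists above it. The function name abr_prefix_counts is hypothetical.

```python
def abr_prefix_counts(bits_per_level):
    """Prefixes carried by the aggregating router at each level.

    `bits_per_level` lists the subnet-hierarchy bits used at each level,
    top level first, e.g. [8, 8, 8] for three 8-bit levels.
    Hypothetical helper for illustration only.
    """
    counts = []
    last = len(bits_per_level) - 1
    for i, bits in enumerate(bits_per_level):
        count = 2 ** bits  # prefixes at this router's own level
        if i > 0:
            count += 1  # one super-aggregate covering everything above
        if i < last:
            count += 2 ** bits_per_level[i + 1]  # prefixes one level down
        counts.append(count)
    return counts


# The four ways of splitting 24 bits that the example compares.
for split in ([24], [8, 16], [12, 12], [8, 8, 8]):
    print(split, abr_prefix_counts(split))
```

Under these assumptions the worst case for any single router is governed by the largest adjacent pair of levels, which is why splitting the same 24 bits into more, smaller levels reduces the count so sharply.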
   o  IPv6 address block (allocated out to LIR networks): /32
   o  Customer end assignments: /56
   o  Number of bits of allocation ability: 24
   o  Maximum number of prefixes to assign: 2^24 (approximately 16M)

    bits 1-32                     bits 33-56         bits 57-64  bits 65-128
   +-----------------------------+------------------+-----------+------+
   | Network Block (PA from RIR) | Subnet Hierarchy | Cust_Bits | IfId |
   +-----------------------------+------------------+-----------+------+

                                Figure 1

Each value in the Subnet Hierarchy is a different discrete assignment (/56) to an end site. Cust_Bits are the subnets within the end-site prefix. The following table compares different structures of Subnet Hierarchy.

   +--------+--------------------+---------------------------------+
   |        | No. bits per level | Total prefixes on ABR per level |
   | Levels +------+------+------+----------+----------+-----------+
   |        |  1   |  2   |  3   |    1     |    2     |     3     |
   +--------+------+------+------+----------+----------+-----------+
   |   1    |  24  |      |      |   16M    |          |           |
   +--------+------+------+------+----------+----------+-----------+
   |   2    |  8   |  16  |      |   66k    |   65k    |           |
   +--------+------+------+------+----------+----------+-----------+
   |   2    |  12  |  12  |      |    8k    |    4k    |           |
   +--------+------+------+------+----------+----------+-----------+
   |   3    |  8   |  8   |  8   |   512    |   513    |    257    |
   +--------+------+------+------+----------+----------+-----------+

                                Figure 2

We'll use the examples from Figure 1 and Figure 2, and extrapolate the rules on prefix counts in a multi-level aggregation regime.

First case: all of the assignments are made without any aggregation. The IGP needs to carry all 16M prefixes. It would likely have some difficulty coping with storing those, let alone achieving routing convergence, with today's technology.

Second case: a two-level hierarchy, with the delineation point at the /40 boundary.
This gives a top-level IGP carrying 8 bits of aggregated prefixes, or 2^8 = 256 prefixes. Similarly, the second level has 2^16 prefixes, or 65k prefixes. That is still a substantial number.

Third case: a two-level hierarchy, with the top level and the second level each having about 2^12 prefixes. The routers in the IGP doing aggregation would have 2 * 2^12, or about 8k, prefixes.

Fourth case: a three-level hierarchy. The router bordering level 2 and level 3 would also need the third-level prefixes but, for the first-level prefixes, would need only the single super-aggregate for the whole block. If each of the boundaries is an 8-bit boundary, the second-from-the-bottom tier routers in the IGP would need 1 + 2^8 + 2^8 prefixes, or 513 prefixes.

Hierarchical assignments result in drastic reductions in the number of prefixes needed.

8.2. Law of Scaling

The general rule is that the maximum prefix count in a hierarchy is the maximum, over all levels i, of the value (1 + 2^Bi + 2^B(i+1)), where Bi is the number of bits used for level "i" of the hierarchy.

Thus, the greatest benefit is achieved by putting an upper bound on max(Bi), i.e. by minimizing the size of each layer of the hierarchy. This can be done by aggregating at every reasonable place topologically suitable for aggregation.

8.3. Routing Table Size

Hierarchical aggregation results in routers typically seeing only prefix information at the same level of the hierarchy as themselves, plus summarization prefixes for one higher level of the hierarchy, as seen by the aggregators themselves.

This drastically reduces the requirements for the size of routing tables within an organization.

This benefit continues to be seen even when aggregation routers grow beyond their initial aggregation pool and need subsequent pools assigned, so long as those additional aggregates are also assigned from the hierarchy.

8.4. Routing Stability

Aggregation also limits the impact of routing updates. In a hierarchy of aggregations of prefixes, aggregation typically suppresses reachability information for more-specific prefixes. This limits the scope of routing flaps, and improves network-wide routing stability. Routing flaps propagate only to the aggregator, and not higher in the hierarchy.

8.5. Routing Convergence

Thus, we can see that not only external aggregation at the top level, but also hierarchical aggregation within a block of addresses, has benefit and is highly recommended for any organization with sufficient resources (address space) allocated to it.

Routing convergence scales on the order of N*log(N), where N is the number of prefixes. Reducing N by orders of magnitude has profound benefits on speed of convergence, i.e. improvements that are also orders of magnitude.

The fewer prefixes there are in a routing table, the faster routing can converge. This is especially true for SPF protocols, such as OSPF or IS-IS, where convergence time is on the order of N*log(N) for N prefixes. The smallest number N is achieved with routing topologies that implement hierarchical aggregation which mirrors topology. (This is classic OSPF summarization methodology.)

9. Security Considerations

Owing to the abstract nature of this document, there are no security considerations.

10. IANA Considerations

This document has no actions for IANA.

11. Acknowledgements

The author wishes to acknowledge the helpful guidance of Joe Abley, Brian Haberman, and Jari Arkko. The author also thanks Marla Azinger, Scott Leibrand, Bob Hinden, and Iljitsch van Beijnum.
12. Informative References

   [1]  Postel, J., "Internet Protocol", STD 5, RFC 791, September 1981.

   [2]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

   [3]  Fuller, V., Li, T., Yu, J., and K. Varadhan, "Classless Inter-Domain Routing (CIDR): an Address Assignment and Aggregation Strategy", RFC 1519, September 1993.

   [4]  Hinden, R. and S. Deering, "IP Version 6 Addressing Architecture", RFC 4291, February 2006.

Author's Address

   Brian Dickson
   Afilias Canada, Inc
   4141 Yonge St, Suite 204
   North York, ON M2P 2A8
   Canada

   Email: briand@ca.afilias.info
   URI: www.afilias.info

Full Copyright Statement

Copyright (C) The IETF Trust (2008).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

Acknowledgment

Funding for the RFC Editor function is provided by the IETF Administrative Support Activity (IASA).