Network Working Group                                      A. Przygienda
Internet-Draft                                                   Juniper
Intended status: Standards Track                                  Y. Lee
Expires: March 12, 2020                                        A. Sharma
                                                                 Comcast
                                                                R. White
                                                                 Juniper
                                                       September 9, 2019


                            Flood Reflectors
                  draft-przygienda-flood-reflector-00

Abstract

   This document specifies an optional IS-IS extension that allows the
   creation of L2 flood reflector topologies that are independent of
   the resulting forwarding within L1 areas, when those areas are used
   as 'transit' to guarantee L2 connectivity between L2 "islands".

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on March 12, 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

Przygienda, et al.       Expires March 12, 2020                 [Page 1]

Internet-Draft      draft-przygienda-flood-reflector      September 2019

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Description
   2.  Further Details
   3.  Flood Reflection TLV
   4.  Non-Forwarding Adjacency Sub-TLV
   5.  Procedures
   6.  Adjacency Forming Procedures
   7.  Special Considerations
   8.  IANA Considerations
   9.  Security Considerations
   10. Acknowledgements
   11. References
     11.1.  Informative References
     11.2.  Normative References
   Authors' Addresses

1.  Description

   Due to the inherent properties of link-state protocols, the number
   of IS-IS routers within a flooding domain is limited by the
   processing and flooding overhead on each node.  While that number
   can be maximized by well-written implementations and techniques such
   as exponential back-offs, IS-IS will still reach a saturation point
   where no further routers can be added to a single flooding domain.
   In certain deployment scenarios of L2 backbones, this limit presents
   an obstacle.

   While the standard solution to increase the scale of an IS-IS
   deployment is to break it up into multiple L1 flooding domains and a
   single L2 backbone, an alternative way is to think about "multiple"
   L2 flooding domains connected via L1 flooding domains.
   In such a solution, the L2 flooding domains are connected by "L1/L2
   lanes" through the L1 areas to form a single L2 backbone again.
   However, in the simplest implementation, this requires the inclusion
   of most, or all, of the transit L1 routers as L1/L2 to allow traffic
   to flow along optimal paths through such transit areas.  With that,
   it ultimately does not help to reduce the number of L2 routers or to
   increase the scalability of the L2 backbone.

   [Figure: ASCII diagram omitted.  L2 routers R1, R2, and R3 attach to
   L1/L2 routers 00, 01, and 02; L2 routers R4, R5, and R6 attach to
   L1/L2 routers 20, 21, and 22; L1 routers 10, 11, and 12 sit between
   the two groups of L1/L2 routers and are richly meshed with them.]

                                Figure 1

   Figure 1 is an example of a network where a topologically rich L1
   area is used to provide transit between six different routers in L2
   "partitions" (R1-R6).
   To take advantage of the cornucopia of paths in the L1 transit, all
   the intermediate systems could be placed into both L1 and L2, but
   this essentially combines the separate L2 flooding domains into a
   single one, triggering maximum L2 scale limitations again.

   A more effective solution would reduce the number of links and
   routers exposed in L2, while still utilizing the full L1 topology
   when forwarding through the network.

   A mechanism like the one described for OSPF in [RFC8099] could be
   used in IS-IS to build a full mesh of tunnels over the L1 transit,
   but a full mesh of tunnels can also quickly limit the scaling.  The
   network in Figure 2 would expose 6 L1/L2 nodes and (5 * 6)/2 = 15 L2
   tunnels.  In a slightly larger network with a comparable topology
   containing 15 L1/L2 edge nodes, the number grows very quickly to
   (14 * 15)/2 = 105 tunnels.

   [Figure: ASCII diagram omitted.  L2 routers R1, R2, and R3 attach to
   L1/L2 routers 00, 01, and 02; L2 routers R4, R5, and R6 attach to
   L1/L2 routers 20, 21, and 22; the six L1/L2 routers are connected to
   each other by a full mesh of L2 tunnels.]
                                Figure 2

   BGP, described in [RFC4271], faced a similar scaling problem, which
   has been solved in many networks by deploying BGP route reflectors,
   as described in [RFC4456].  A further crucial observation is that
   BGP route reflectors do not necessarily need to be in the forwarding
   path.

   We suggest here a similar solution for IS-IS.  A good approximation
   of what a "flood reflector" approach would look like is shown in
   Figure 3, where router 11 is used as the 'reflector.'  All L1/L2
   routers build an L2 tunnel to such reflectors, so we end up with
   only 6 L2 tunnels instead of the 15 of a full mesh.  Multiple such
   reflectors can be used, of course, allowing the network operator to
   balance between resilience, path utilization, and state in the
   control plane.  The resulting L2 tunnel scale is roughly R * n,
   where n is the number of L1/L2 edge routers and R is the redundancy
   factor or, in other words, the number of flood reflectors used.
   This compares quite favorably with the roughly n^2 / 2 tunnels used
   in a fully meshed L2 solution.

   [Figure: ASCII diagram omitted.  Router 11 is an L1/L2 flood
   reflector (FR); each of the L1/L2 routers 00, 01, 02, 20, 21, and 22
   has a single L2 tunnel to router 11; L2 routers R1, R2, and R3
   attach to 00, 01, and 02, and L2 routers R4, R5, and R6 attach to
   20, 21, and 22.]

                                Figure 3

   This proposal, however, without further qualification would
   concentrate forwarded traffic at router 11.
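   The tunnel counts quoted in this section can be checked with a few
   lines of Python.  This is only a sketch under stated assumptions:
   the function names are hypothetical, the full-mesh count uses the
   exact n * (n - 1) / 2 that the text approximates as n^2 / 2, and the
   reflector count assumes that each of the n L1/L2 edge routers builds
   exactly one L2 tunnel to each of the R flood reflectors.

```python
def full_mesh_tunnels(n: int) -> int:
    """L2 tunnels needed to fully mesh n L1/L2 edge routers."""
    return n * (n - 1) // 2


def reflector_tunnels(n: int, r: int) -> int:
    """L2 tunnels when each of n edge routers tunnels to each of r reflectors.

    Assumption: one tunnel per (edge router, reflector) pair.
    """
    return r * n


if __name__ == "__main__":
    # (n edge routers, r reflectors); the first two rows match the text.
    for n, r in [(6, 1), (15, 1), (15, 2)]:
        print(
            f"n={n:2d} R={r}: full mesh={full_mesh_tunnels(n):3d} "
            f"reflected={reflector_tunnels(n, r):3d}"
        )
```

   With a second reflector added for redundancy, the 15-node case
   still needs only 30 L2 tunnels under these assumptions, compared
   with 105 for the full mesh.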
   It would hence be desirable to decouple the forwarding plane from
   the control plane, so that router 11 can reflood information without
   being placed in the forwarding path and therefore without becoming a
   forwarding plane bottleneck.

   To achieve that goal, multiple pieces will be necessary, only one of
   which is a local protocol extension on the L1/L2 leafs and the
   'flood reflectors'.  In first approximation these pieces include:

   o  A full mesh of L1 tunnels between the L1/L2 routers, ideally
      load-balancing across all available L1 links.  This harnesses all
      forwarding paths between the L1/L2 edge nodes without injecting
      unneeded state into the L2 flooding domain or creating 'choke
      points' at the 'flood reflectors.'

   o  A 'non-forwarding adjacency' for all the adjacencies built for
      the purpose of reflecting flooding information.  This allows
      these 'flood reflectors' to participate in the IS-IS control
      plane without being used in the forwarding plane.  This is a
      purely local operation on the L1/L2 ingress; it does not require
      replacing or modifying any routers not involved in the reflection
      process.

   o  Some system to support reflector redundancy, and potentially some
      way to auto-discover and advertise such adjacencies as
      non-forwarding.  This may allow L2 nodes outside the L1 area to
      perform optimizations in the future based on this information.

2.  Further Details

   Several considerations should be noted in relation to such a flood
   reflection mechanism.

   First, this allows multi-area IS-IS deployments to scale without any
   major modifications in the IS-IS implementation on most of the nodes
   deployed in the network.  Unmodified (traditional) L2 routers will
   compute reachability across the transit L1 area using the
   non-forwarding adjacencies.

   Second, the flooding reflectors are not required to participate in
   forwarding traffic through the L1 transit area.
   These flooding reflectors can be hosted on virtual devices outside
   the forwarding topology.

   Third, astute readers will realize that flooding reflection may
   cause the use of suboptimal paths.  This is similar to the BGP route
   reflection suboptimal routing problem described in
   [ID.draft-ietf-idr-bgp-optimal-route-reflection-19].  The L2
   computation determines the egress L1/L2 and with that can create
   illusions of ECMP where there is none, and in certain scenarios can
   lead to an L1/L2 egress which is not globally optimal.  This
   represents a straightforward instance of the trade-off between the
   amount of control plane state and the optimal use of paths through
   the network, often encountered when aggregating routing information.

   One possible solution to this problem is to expose additional
   topology information into the L2 flooding domains.  In the example
   network given, links from router 01 to router 02 can be exposed into
   L2 even when 01 and 02 are participating in flood reflection.  This
   information would allow the L2 nodes to build 'shortcuts' when the
   L2 flood reflected part of the topology looks more expensive to
   cross distance-wise.

   Another possible variation is for an implementation to have the L1
   tunnel cost approximate the cost of the underlying topology.

   Redundancy in the solution is trivial to achieve by building
   multiple flood reflectors into the L1 area.  All reflectors remain
   completely stateless and do not need any kind of synchronized
   algorithms amongst themselves beyond the standard IS-IS flooding
   procedures and database.

3.  Flood Reflection TLV

   The Flood Reflection TLV indicates the participation of a node as
   reflector and/or client.  It is included in L1 area scope flooded
   LSPs and in L1 and L2 IIHs.
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |    Length     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Priority    | FR Cluster ID |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Type:  TBD

   Length:  The length, in octets, of the following fields.

   Reflector Priority:  Priority of the router to act as flood
      reflector in the cluster.  A value of 0 indicates that the router
      is a client in the cluster.  Any value higher than 0 indicates
      preference to be a flood reflector.  Higher values are to be
      preferred by clients.

   FR Cluster ID:  Flood Reflector Cluster Identifier, which allows a
      node to participate in possibly multiple clusters.

4.  Non-Forwarding Adjacency Sub-TLV

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |    Length     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | FR Cluster ID |
   +-+-+-+-+-+-+-+-+

   Type:  TBD

   Length:  The length, in octets, of the following fields.

   FR Cluster ID:  Flood Reflector Cluster Identifier to which this NFA
      belongs.

5.  Procedures

   There are a number of points to consider when implementing and
   deploying this solution, including:

      A router participating in flood reflection MUST be configured as
      an L1/L2 router.  It originates the Flood Reflection TLV with
      area flooding scope in L1 only.  Normally routers on the edge of
      the area, i.e. with non-FR L2 adjacencies, will advertise
      themselves as clients.  Any L1/L2 non-client router in the area
      can act as FR.

      A flood reflector can participate in a single cluster only; the
      clients are free to participate in multiple clusters at the same
      time.

      Upon reception of a Flood Reflection TLV, a router acting as
      client MUST, unless it already has such L2 adjacencies, initiate
      tunnels towards all the FRs with the highest priority and MAY
      initiate such tunnels to FRs with lower priority.
      L2 adjacencies over such tunnels MUST be marked as non-forwarding
      adjacencies.  If the client has a direct L2 adjacency with the
      flood reflector, it SHOULD use it instead of instantiating a
      tunnel.

      Upon reception of a Flood Reflection TLV, a router acting as
      client SHOULD, unless it already has such direct L1 adjacencies,
      initiate tunnels towards all the other clients in its clusters.
      L1 *only* adjacencies SHOULD be built over such tunnels to ensure
      their liveness, but other means can be used (since those
      adjacencies are used for L1 forwarding, it is prudent to
      advertise them into L1 as forwarding links).

      On the reflection client, after L2 and L1 computation, all
      non-forwarding adjacencies used as next-hops for L2 routes MUST
      be examined and replaced with the correct L1 tunnel next-hop to
      the egress.  Due to the rules in Section 6, the computation in
      the resulting topology is relatively simple: the L2 SPF from a
      flood reflector client is guaranteed to reach the FR within one
      hop and, in the following hop, the L2 egress to which it has an
      L1 forwarding tunnel.  However, if the topology has L2 paths
      which are not flood reflected and look "shorter" than the path
      through the FR, then the computation will have to track the
      egress out of the L1 domain by a more advanced algorithm.

      A node, when advertising the L2 NFA, SHOULD include the
      Non-Forwarding Adjacency Sub-TLV in the Extended IS reachability
      TLV and the MT-ISN TLV.

6.  Adjacency Forming Procedures

   To ensure loop-free routing, the ingress routers MUST follow normal
   L2 computation to generate L2 routes.  This is because nodes outside
   the L1 area may not be aware that flood reflection is performed.
   The computation of the resulting shortcuts through the L1 area needs
   to be able to easily determine the egress L1/L2 router where the
   tunnel tail-end is located.
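   As an illustration only, the next-hop replacement step from
   Section 5 might be sketched in Python as follows.  The sketch
   assumes the simple case described there, in which the L2 SPF path
   from the client reaches the FR in the first hop and the egress L1/L2
   router in the second.  The names L2Route, resolve_next_hops,
   nfa_adjacencies, and l1_tunnels are hypothetical and not part of
   this specification.  Routes for which no L1 tunnel to the egress is
   known are only reported, not handled.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class L2Route:
    """Hypothetical result of the normal L2 SPF for one prefix."""
    prefix: str
    spf_path: Tuple[str, ...]  # nodes after the local client, in SPF order
    next_hop: str              # adjacency selected by the L2 SPF


def resolve_next_hops(
    routes: List[L2Route],
    nfa_adjacencies: Set[str],
    l1_tunnels: Dict[str, str],
) -> Tuple[Dict[str, str], List[str]]:
    """Replace non-forwarding next-hops with L1 tunnels to the egress.

    Returns (prefix -> next-hop, prefixes left on a non-forwarding
    adjacency because no L1 tunnel to the egress was found).
    """
    resolved: Dict[str, str] = {}
    unresolved: List[str] = []
    for route in routes:
        if route.next_hop not in nfa_adjacencies:
            # Ordinary L2 adjacency: keep the SPF result unchanged.
            resolved[route.prefix] = route.next_hop
            continue
        # Assumed simple case: spf_path[0] is the FR and spf_path[1] is
        # the egress L1/L2 router (a client of the same FR).
        egress = route.spf_path[1] if len(route.spf_path) > 1 else None
        tunnel = l1_tunnels.get(egress) if egress is not None else None
        if tunnel is None:
            resolved[route.prefix] = route.next_hop
            unresolved.append(route.prefix)
        else:
            resolved[route.prefix] = tunnel
    return resolved, unresolved


if __name__ == "__main__":
    # Client 00 from Figure 3: R4 is reached via FR 11 and egress 20.
    routes = [
        L2Route("R4-prefix", ("11", "20", "R4"), "nfa-to-11"),
        L2Route("R1-prefix", ("R1",), "l2-to-R1"),
    ]
    print(resolve_next_hops(routes, {"nfa-to-11"}, {"20": "l1-tunnel-to-20"}))
```

   The more advanced case mentioned in Section 5, where L2 paths that
   are not flood reflected look "shorter" than the path through the
   FR, is deliberately outside the scope of this sketch.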
   To prevent complex scenarios of flood reflectors building L2
   adjacencies within a cluster or across clusters, or hierarchies of
   reflectors, a flood reflector MUST NOT form an L2 adjacency with a
   peer if the peer is not a client with the same Cluster ID.  This
   ensures that an L2 computation on an ingress, for any link or
   adjacency following a non-forwarding adjacency, will always traverse
   a client of the flood reflector to exit the flooding domain.  This
   allows shortcuts through the L1 area to be used without any danger
   of forwarding loops.

   Depending on the pseudo-node choice in the case of a broadcast
   domain with multiple flood reflectors attached, this can lead to a
   partitioned LAN.  A router discovering such a condition MUST
   initiate an alarm and declare misconfiguration.

7.  Special Considerations

   In pathological cases, setting the overload bit in L1 (but not in
   L2) can partition L1 forwarding while allowing L2 reachability
   through non-forwarding adjacencies to exist.  In such a case a node
   cannot replace a route through a non-forwarding adjacency with an L1
   shortcut.  The client can then use the L2 tunnel to the flood
   reflector for forwarding, while it MUST initiate an alarm and
   declare misconfiguration.

   A flood reflector with directly L2 attached prefixes should
   advertise those in L1 as well since, based on the preference for L1
   routes, the clients will then not try to use the L2 non-forwarding
   adjacency to route packets towards them.  An extreme corner case is
   when the flood reflector is reachable only via an L2 non-forwarding
   adjacency (due to an underlying L1 partition).  In that case the
   client can use the L2 tunnel to the flood reflector for forwarding
   towards those prefixes, while it MUST initiate an alarm and declare
   misconfiguration.
   Instead of modifying the computation procedures, one could imagine a
   flood reflector solution where the FR would re-advertise the L2
   prefixes with a 'third-party' next-hop.  That, however, would have
   less desirable convergence properties than the solution proposed and
   would force a fork-lift upgrade of all L2 routers to make sure they
   disregard such prefixes unless they are in the same L1 domain as the
   FR.

8.  IANA Considerations

   This document will request IANA to assign a new TLV type value in
   the IS-IS TLV Codepoints registry.

   This document will request IANA to assign a new sub-TLV type value
   in the 'Sub-TLVs for TLVs 22, 23, 25, 141, 222, and 223 (Extended IS
   reachability, IS Neighbor Attribute, L2 Bundle Member Attributes,
   inter-AS reachability information, MT-ISN, and MT IS Neighbor
   Attribute TLVs)' registry.

9.  Security Considerations

   This document introduces no new security concerns to IS-IS or other
   specifications referenced in this document.

10.  Acknowledgements

   Thanks to Shraddha and Chris Bowers for thorough review.

11.  References

11.1.  Informative References

   [ID.draft-ietf-idr-bgp-optimal-route-reflection-19]
              Raszuk, R., et al., "BGP Optimal Route Reflection",
              July 2019.

   [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
              Border Gateway Protocol 4 (BGP-4)", RFC 4271,
              DOI 10.17487/RFC4271, January 2006.

   [RFC4456]  Bates, T., Chen, E., and R. Chandra, "BGP Route
              Reflection: An Alternative to Full Mesh Internal BGP
              (IBGP)", RFC 4456, DOI 10.17487/RFC4456, April 2006.

   [RFC8099]  Chen, H., Li, R., Retana, A., Yang, Y., and Z. Liu, "OSPF
              Topology-Transparent Zone", RFC 8099,
              DOI 10.17487/RFC8099, February 2017.

11.2.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.
Authors' Addresses

   Tony Przygienda
   Juniper
   1137 Innovation Way
   Sunnyvale, CA
   USA

   Email: prz _at_ juniper.net


   Yiu Lee
   Comcast
   1800 Bishops Gate Blvd
   Mount Laurel, NJ  08054
   US

   Email: Yiu_Lee _at_ comcast.com


   Alankar Sharma
   Comcast
   1800 Bishops Gate Blvd
   Mount Laurel, NJ  08054
   US

   Email: Alankar_Sharma _at_ comcast.com


   Russ White
   Juniper
   1137 Innovation Way
   Sunnyvale, CA
   USA

   Email: russw _at_ juniper.net