INTERNET-DRAFT                                          N. Malhotra, Ed.
                                                                K. Patel
                                                                  Arrcus
Intended Status: Proposed Standard                            J. Rabadan
                                                                   Nokia
Expires: Sept 12, 2019                                      Mar 11, 2019


                LSoE-based PE-CE Control Plane for EVPN
                    draft-malhotra-bess-evpn-lsoe-00

Abstract

   In an EVPN network, EVPN PEs provide VPN bridging and routing service
   to connected CE devices based on BGP EVPN control plane. At present,
   there is no PE-CE control plane defined for an EVPN PE to learn CE
   MAC, IP, and any other routes from a CE that may be distributed in
   EVPN control plane to enable unicast flows between CE devices. As a
   result, EVPN PEs rely on data plane based gleaning of source MACs for
   CE MAC learning, ARP/ND snooping for CE IPv4/IPv6 learning, and in
   some cases, local configuration for learning prefix routes behind a
   CE. A PE-CE control plane alternative to this traditional learning
   approach, where applicable, offers certain distinct advantages that
   in turn result in simplified EVPN operation.

   This document defines a PE-CE control plane as an optional
   alternative to traditional non-control-plane based PE-CE learning in
   an EVPN network. It defines PE-CE control plane procedures and TLVs
   based on LSoE as the base protocol, enumerates advantages that may be
   achieved by using this PE-CE control plane, and discusses in detail
   EVPN use cases that are simplified as a result.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1  Introduction
      1.1 Terminology
   2. PE <-> CE Control Plane Overview
   3. TLVs
      3.1 Overlay IPv4 Encapsulation PDU
      3.2 Overlay IPv6 Encapsulation PDU
      3.3 Overlay IPv4 Prefix Encapsulation PDU
      3.4 Overlay IPv6 Prefix Encapsulation PDU
   4. CE MAC/IP Learning on a PE AC
      4.1 PE <-> CE LSoE Session Establishment
      4.2 CE MAC/IP Learning
   5. PE Any-cast GW MAC/IP Learning on CE
   6. Remote CE MAC/IP Learning on CE
   7. PE <-> CE Control Plane with EVPN All-active Multi-Homing
      7.1 All-active Multi-Homing Mode
      7.2 Source MAC
      7.3 CE MAC/IP Learning with EVPN All-active Multi-Homing
      7.4 LAG Member Link Failure
         7.4.1 Session Re-establishment
         7.4.2 TLV Retention
      7.4 LAG Failure
      7.5 Example PE <-> CE Control Plane Flow with All-active
          Multi-Homing
   8. Software Neighbor Tables
   9. MAC/IP Learning Conflict Resolution
   10. PE-CE Overlay Prefix Learning
   11. Asymmetric EVPN-IRB
   12. Centralized Gateway EVPN-IRB
   13. Use Cases
      13.1 Simplified EVPN Operations
         13.1.1 EVPN All-active Multi-Homing
         13.1.2 Convergence on CE Host Moves
            13.1.2.1 Silent Hosts
            13.1.2.2 Probing
         13.1.3 ARP Gleaning Latency
      13.2 Applicability to non-EVPN Use Cases
   14. Summary
   15. References
      15.1 Normative References
      15.2 Informative References
   16. Acknowledgements
   Contributors
   Authors' Addresses

1 Introduction

   In an EVPN network, CE devices typically connect to an EVPN PE via
   layer-2 interfaces that terminate in a BD on the PE. Multi-homed LAG
   interfaces together with EVPN all-active multi-homing procedures are
   used to achieve PE-CE link and PE node redundancy for fault-tolerance
   and load-balancing. PEs provide overlay bridging and, optionally,
   first-hop routing service for these CE devices based on an EVPN
   control plane that is used to distribute CE MAC, IP, and prefix
   reachability across PEs.

   At present, there is no PE-CE control plane defined for an EVPN PE to
   learn connected CE host MACs and IPs. As a result, EVPN PEs rely on:

   o data plane based gleaning of source MAC for MAC learning,
   o ARP snooping for IPv4 + MAC learning, and
   o ND snooping for IPv6 + MAC learning.

   A PE-CE control plane alternative to this traditional learning
   approach, where applicable, can offer some distinct advantages across
   various boot-up, mobility, and convergence scenarios:

   o PE-CE learning is decoupled from non-deterministic hashing of data,
     ARP, and ND packets from CEs over all-active multi-homed LAG
     interfaces.
   o PE-CE learning is decoupled from non-deterministic periodicity of
     data traffic from CEs or, in an extreme scenario, from a CE device
     being silent for an extended period.
   o PE-CE learning is decoupled from non-deterministic CE behavior with
     respect to unsolicited ARPs and NAs following boot-up and moves.
   o PE-CE learning is decoupled from latencies associated with data
     packet triggered ARP and ND gleaning.

   This in turn simplifies certain EVPN operations such as aliasing,
   MAC and IP syncing across multi-homing PEs, and probing on MAC/IP
   moves. In addition, it helps achieve deterministic convergence
   behavior across various boot-up, mobility, and failure scenarios.

   A PE may also use local policy configuration for learning prefixes
   behind a CE that does not run a dynamic routing protocol. A PE-CE
   control plane can provide an operationally simpler alternative to
   local configuration for such use cases, where CE and PE devices are
   not under the same configuration management entity.

   This document defines a new PE-CE control plane as an alternative to
   traditional data-plane and ARP/ND snooping based PE-CE host learning
   and to local configuration-based PE-CE prefix learning. It defines
   PE-CE control plane procedures and TLVs based on [LSOE] as the base
   protocol, enumerates advantages that may be achieved by using this
   PE-CE control plane, and discusses in detail EVPN operations that are
   simplified as a result.

   Use of PE-CE control plane defined in this document is intended to be
   optional and backwards compatible with CEs that use traditional PE-CE
   learning within the same BD.

1.1 Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   The following terms are used in this document:

   o LSoE: Link State over Ethernet Protocol defined in [LSOE]
   o EVPN-IRB: A BGP-EVPN distributed control plane based integrated
     routing and bridging fabric overlay discussed in [EVPN-IRB]
   o Underlay: IP or MPLS fabric core network that provides IP or MPLS
     routed reachability between EVPN PEs.
   o Overlay: VPN or service layer network consisting of EVPN PEs OR VPN
     provider-edge (PE) switch-router devices that runs on top of an
     underlay routed core.
   o EVPN PE: A PE switch-router in a data-center fabric that runs
     overlay BGP-EVPN control plane and connects to overlay CE host
     devices.
     An EVPN PE may also be the first-hop layer-3 gateway for CE/host
     devices. This document refers to EVPN PE as a logical function in a
     data-center fabric. This EVPN PE function may be physically hosted
     on a top-of-rack switching device (ToR) OR at layer(s) above the
     ToR in the Clos fabric. An EVPN PE is typically also an IP or MPLS
     tunnel end-point for overlay VPN flows.
   o CE: A tenant host device that has layer 2 connectivity to an EVPN
     PE switch-router, either directly OR via intermediate switching
     device(s).
   o Symmetric EVPN-IRB: An overlay fabric first-hop routing
     architecture as defined in [EVPN-IRB], wherein overlay host-to-host
     routed inter-subnet flows are routed at both ingress and egress
     EVPN PEs.
   o Asymmetric EVPN-IRB: An overlay fabric first-hop routing
     architecture as defined in [EVPN-IRB], wherein overlay host-to-host
     routed inter-subnet flows are routed and bridged at ingress PE and
     bridged at egress PEs.
   o Centralized EVPN-IRB: An overlay fabric first-hop routing
     architecture, wherein overlay host-to-host routed inter-subnet
     flows are routed at a centralized gateway, typically at one of the
     spine layers, and where EVPN PEs are pure bridging devices.
   o ARP: Address Resolution Protocol [RFC 826].
   o ND: IPv6 Neighbor Discovery Protocol [RFC 4861].
   o Ethernet-Segment: physical Ethernet or LAG port that connects an
     access device to an EVPN PE, as defined in [RFC 7432].
   o ESI: Ethernet Segment Identifier as defined in [RFC 7432].
   o LAG: Layer-2 link-aggregation, also known as layer-2 bundle,
     port-channel, or bond interface.
   o EVPN all-active multi-homing: PE-CE all-active multi-homing
     achieved via a multi-homed layer-2 LAG interface on a CE with
     member links to multiple PEs and related EVPN procedures on the
     PEs.
   o EVPN Aliasing: multi-homing procedure as defined in [RFC 7432].
   o BD: Broadcast Domain.
   o Bridge Table: An instantiation of a broadcast domain on a MAC-VRF.
   o AC: A PE Attachment Circuit. This may be an access (untagged) or
     trunk (tagged) layer-2 interface that is a member of a local VLAN
     or a BD.

2. PE <-> CE Control Plane Overview

   The Link State over Ethernet (LSoE) protocol is defined in [LSOE] as
   a protocol that runs over Ethernet links to auto-discover a connected
   neighbor's layer-2 and layer-3 attributes and encapsulations for the
   purpose of bringing up upper layer routing protocols. This document
   leverages LSoE as a PE-CE protocol in an EVPN network fabric on
   access links between an EVPN PE and CE. Specifically,

   o PE-CE control plane based on LSoE protocol is proposed for CE MAC
     learning as an alternative to data-plane based source MAC learning.
   o PE-CE control plane based on LSoE protocol is proposed for CE
     MAC-IP adjacency learning as an alternative to MAC-IP learning
     based on ARP/ND snooping.
   o PE-CE control plane based on LSoE is proposed for learning of IP
     Prefixes and associated overlay indexes, as an alternative to local
     configuration on the PE for the use case defined in section 4.1 of
     [EVPN-PREFIX-ADV].

   Note that any specification related to the base LSoE protocol itself
   is considered out of scope for this document and will continue to be
   covered in the base protocol spec. This document will instead focus
   on procedures and TLV extensions needed to achieve the above learning
   on PE-CE links in an EVPN network. Any text that relates to the base
   protocol included in this document is simply background information
   in the context of use cases covered in this document. The reader
   should refer to the base LSoE protocol document for the exact LSoE
   protocol specification.

                +------------------------+
                | Underlay Network Fabric|
                +------------------------+

                    BGP-EVPN Peering
             <------------------------------>
      +------+         +------+       +------+
      | PE1  |  .....  | PE2  |       | PE3  |
      +------+         +------+       +------+
         |                 \             /
    LSoE Session            \    ESI    /
         |              LSoE \         / LSoE
      CE-host          to PE2  CE-Host   to PE3

                            Figure 1

   An LSoE session is established on layer-2 access interfaces between
   the EVPN PE and each connected CE host device. A session end-point is
   identified by a peer device MAC address on a layer-2 interface. LSoE
   HELLO messages are used for end-point discovery and OPEN messages are
   exchanged between two end-points to establish an LSoE peering. Once
   LSoE peering is established, encapsulation TLVs are exchanged for
   learning.

   In the context of an EVPN network, CE Attachment Circuits (AC logical
   interfaces) typically terminate in a BD on the PE, with multi-homed
   LAG interfaces used for EVPN all-active multi-homing. CE hosts may be
   directly connected to EVPN PEs via access ports, or may be connected
   on trunk-ports via another switch. In a common EVPN-IRB design, EVPN
   PEs also function as distributed first-hop gateways for hosts in a
   BD. While symmetric and asymmetric IRB designs are possible as
   discussed in [EVPN-IRB], procedures described in subsequent sections
   assume symmetric IRB with distributed any-cast gateways on EVPN PEs.
   Any deviations from these procedures for asymmetric IRB design or a
   centralized IRB design will be covered in future updates to this
   document.

   The next few sections will focus on additional LSoE TLVs and
   procedures needed for PE-CE learning on EVPN PE ACs without and with
   all-active multi-homing.

3. TLVs

   This section defines new TLVs that are used by the PE-CE control
   plane defined in this document.

3.1 Overlay IPv4 Encapsulation PDU

   A new encapsulation PDU type is defined for the purpose of carrying
   overlay IPv4 and MAC bindings.
   Alternatively, it may also be used to carry an overlay MAC with a
   NULL IPv4 address in a non-IRB use case.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Type = 8    |          PDU Length           |     Count     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               |                 IPv4 Address                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               |   PrefixLen   |E|    Rsvd     |               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
   |                          MAC Address                          |
   +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               |                 IPv4 Address                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               |   PrefixLen   |E|    Rsvd     |               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
   |                          MAC Address                          |
   +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               |                    more ...                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                                Figure 2

   o A new LSoE PDU type (8) is requested for this PDU.
   o The IPv4 Address is that of an overlay.
   o MAC address carries the MAC binding for the particular IPv4 address
     if one is set in the PDU. If an IPv4 address is not set, it simply
     signals an overlay MAC address.
   o EVPN flag 'E' indicates if this encapsulation is being sent on
     behalf of a remote host learnt via EVPN. Use of this flag is
     covered in a later section.

   This PDU is used to carry PE's any-cast gateway IPv4 address and MAC
   bindings to a CE host device. Optionally, it may also be used to
   relay a remote CE's IPv4 address and MAC bindings to a local CE host
   within a subnet, as well as to send local CE IPv4 address and MAC
   binding to the PE. Procedures related to use of this PDU are
   discussed in subsequent sections.
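
   As an illustration of the layout in Figure 2 only, the following
   Python sketch packs a list of overlay IPv4/MAC bindings into a type 8
   PDU. It is not part of this specification. It assumes that Count is a
   16-bit field, that 'E' is the most significant bit of the flags
   octet, that PDU Length covers the whole PDU including a 5-octet
   header, and that a NULL IPv4 address is encoded as 0.0.0.0. The
   function and constant names are hypothetical; the base header
   encoding is governed by [LSOE].

```python
import ipaddress
import struct

PDU_TYPE_OVERLAY_IPV4 = 8   # type value requested in this section
FLAG_E = 0x80               # assumption: 'E' is the top bit of the flags octet
HEADER_LEN = 5              # assumption: Type (1) + PDU Length (2) + Count (2)


def encode_overlay_ipv4_pdu(bindings):
    """Pack a list of (ipv4 or None, prefix_len, mac, evpn_flag) tuples."""
    body = bytearray()
    for ipv4, prefix_len, mac, evpn_flag in bindings:
        # Assumption: a MAC-only (non-IRB) binding carries the all-zero address.
        body += ipaddress.IPv4Address(ipv4 or "0.0.0.0").packed
        body += struct.pack("!BB", prefix_len, FLAG_E if evpn_flag else 0)
        mac_octets = bytes.fromhex(mac.replace(":", ""))
        if len(mac_octets) != 6:
            raise ValueError(f"bad MAC address: {mac}")
        body += mac_octets
    # Assumption: PDU Length counts the header as well as the entries.
    header = struct.pack("!BHH", PDU_TYPE_OVERLAY_IPV4,
                         HEADER_LEN + len(body), len(bindings))
    return header + bytes(body)


pdu = encode_overlay_ipv4_pdu([
    ("10.0.0.1", 32, "00:11:22:33:44:55", False),   # IPv4 + MAC binding
    (None, 0, "00:11:22:33:44:66", False),          # MAC-only binding
])
```

   Each entry occupies 12 octets under these assumptions (4 for the
   address, 1 for PrefixLen, 1 for flags, 6 for the MAC).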

   In comparison to the IPv4 Encapsulation PDU defined in [LSOE], this
   PDU explicitly signals a MAC binding that MAY be different from the
   device MAC used to establish an LSoE peering via the HELLO/OPEN
   message exchange.

   The encapsulation list in this PDU MUST follow full replace semantics
   as in the LSoE protocol specification.

3.2 Overlay IPv6 Encapsulation PDU

   A new encapsulation PDU type is defined for the purpose of carrying
   overlay IPv6 and MAC bindings:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Type = 9    |          PDU Length           |     Count     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               |                                               |
   +-+-+-+-+-+-+-+-+                                               +
   |                                                               |
   +                                                               +
   |                                                               |
   +                                                               +
   |                         IPv6 Address                          |
   +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               |   PrefixLen   |E|R|O|   Rsvd  |               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
   |                          MAC Address                          |
   +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               |                    more ...                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                                Figure 3

   o A new LSoE PDU type (9) is requested for this PDU.
   o The IPv6 Address is that of an overlay.
   o MAC address carries the MAC binding for IPv6 address in the PDU.
   o An EVPN flag 'E' indicates if this encapsulation is being sent on
     behalf of a remote host learnt via EVPN. Usage of this flag is
     covered in a later section.
   o A Router flag 'R' is used to carry "Router Flag" or "R-bit" as
     defined in [RFC4861]. Usage of this flag for the purpose of
     installing ND cache entries based on learning via this TLV is as
     defined in [RFC4861].
   o An Override flag 'O' is used to carry "Override Flag" or "O-bit" as
     defined in [RFC4861].
     Usage of this flag for the purpose of installing ND cache entries
     based on learning via this TLV is as defined in [RFC4861]

   This PDU is used to carry PE's any-cast gateway IPv6 address and MAC
   bindings to a CE host device. Optionally, it may also be used to
   relay a remote CE's IPv6 address and MAC bindings to a local CE
   within a subnet, as well as to send local CE IPv6 address and MAC
   bindings to the PE. Procedures related to usage of this PDU are
   discussed in subsequent sections.

   The encapsulation list contained in this PDU MUST follow full replace
   semantics as in the LSoE protocol specification.

3.3 Overlay IPv4 Prefix Encapsulation PDU

   A new encapsulation PDU type is defined for the purpose of carrying
   overlay IPv4 prefix routes for prefixes behind a CE that does not run
   a dynamic routing protocol for use-case as defined in section 4.1 of
   [EVPN-PREFIX-ADV]:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Type = 10   |          PDU Length           |     Count     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               |         Prefix Count          |               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  IPv4 Prefix                  |   PrefixLen   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          IPv4 Prefix                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   PrefixLen   |                    More...                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             GW IP                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Rsvd      |                    More...                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                                Figure 4

   A CE device as defined in [EVPN-PREFIX-ADV], with prefixes behind it
   MAY use the above PDU to send these prefixes to an EVPN PE with itself
   as the GW.
   An EVPN PE MAY then advertise prefixes received via this PDU as RT-5,
   with TS as the GW, as defined in [EVPN-PREFIX-ADV].

   o A new LSoE PDU type (10) is requested for this PDU.
   o IPv4 Prefix is set to a prefix behind a CE.
   o PrefixLen is set to IPv4 prefix length for the advertised prefix.
   o GW-IP is set to the CE IPv4 address (advertised via Type 8 PDU).
     Multiple prefixes may be set for a single GW IP.

   The encapsulation list contained in this PDU MUST follow full replace
   semantics as in the LSoE protocol specification.

3.4 Overlay IPv6 Prefix Encapsulation PDU

   A new encapsulation PDU type is defined for the purpose of carrying
   overlay IPv6 prefix routes for prefixes behind a CE that does not run
   a dynamic routing protocol, for the use case defined in section 4.1
   of [EVPN-PREFIX-ADV]:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Type = 11   |          PDU Length           |     Count     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               |         Prefix Count          |               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
   |                                                               |
   +                                                               +
   |                                                               |
   +                                                               +
   |                          IPv6 Prefix                          |
   +                                               +-+-+-+-+-+-+-+-+
   |                                               |   PrefixLen   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                                                               +
   |                                                               |
   +                          IPv6 Prefix                          +
   |                                                               |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   PrefixLen   |                    more...                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                                                               +
   |                                                               |
   +                             GW IP                             +
   |                                                               |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Rsvd      |                    more...                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                                Figure 5

   A CE device as defined in [EVPN-PREFIX-ADV], with prefixes behind it
   MAY use the above PDU to send these prefixes to an EVPN PE with itself
   as the GW.
   An EVPN PE MAY then advertise prefixes received via this PDU as RT-5,
   with TS as the GW, as defined in [EVPN-PREFIX-ADV].

   o A new LSoE PDU type (11) is requested for this PDU.
   o IPv6 Prefix is set to an IPv6 prefix behind a CE.
   o PrefixLen is set to IPv6 prefix length for the advertised prefix.
   o GW-IP is set to the CE IPv6 address (advertised via Type 9 PDU).
     Multiple prefixes may be set for a single GW IP.

   The encapsulation list contained in this PDU MUST follow full replace
   semantics as in the LSoE protocol specification.

4. CE MAC/IP Learning on a PE AC

   This section defines procedures for learning a connected CE MAC and
   IP on a PE local attachment circuit (AC).

4.1 PE <-> CE LSoE Session Establishment

   On an EVPN PE,

   o A HELLO and/or OPEN PDU sent from a CE host source MAC is received
     on a tagged or untagged interface that is a member of a local BD,
     referred to here as an AC.
   o OPEN messages are exchanged with the host on the AC.
   o LSoE session is established to the host source MAC and bound to a
     local AC.

4.2 CE MAC/IP Learning

   Overlay IPv4 and IPv6 encapsulation PDU types 8/9 from a CE are used
   for the purpose of CE MAC/IP learning on a PE:

   o The EVPN flag 'E' MUST NOT be set in type 8/9 PDU from a CE.
   o A MAC entry for the MAC received in a type 8/9 PDU MUST be
     installed in the MAC-VRF table pointing to the AC to which the
     session is bound.
   o If an IPv4/IPv6 address is set in the PDU, an IPv4/IPv6 neighbor
     binding MUST be established for the IPv4/IPv6 address in the PDU to
     the MAC address in the PDU. In other words, a next-hop re-write for
     these IPv4/IPv6 neighbor entries MUST be installed using the MAC
     address in the PDU, and if required by forwarding logic, bound to
     the AC associated with the LSoE session.
   o Note that an IPv4/IPv6 address might not be set in a type 8/9 PDU
     received from a CE, in which case this PDU is only used for MAC
     learning. This may be the case in a non-IRB EVPN network, wherein
     an EVPN PE is not a first-hop router for the attached CEs.
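
   The learning rules above can be sketched as follows. This is an
   illustrative Python model, not an implementation of any product or of
   [LSOE]: the class and field names are hypothetical, the tables are
   plain dictionaries, and full replace handling of the encapsulation
   list is omitted. This document does not say what a PE does with an
   entry that arrives from a CE with 'E' set; skipping the entry is an
   assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class OverlayEncap:
    mac: str
    ip: Optional[str] = None     # None models an entry with no IP address set
    evpn_flag: bool = False      # the 'E' flag


@dataclass
class PeBroadcastDomain:
    mac_vrf: Dict[str, str] = field(default_factory=dict)                # MAC -> AC
    neighbors: Dict[str, Tuple[str, str]] = field(default_factory=dict)  # IP -> (MAC, AC)

    def learn_from_ce(self, session_ac: str,
                      encaps: List[OverlayEncap]) -> List[OverlayEncap]:
        """Install entries from a CE's type 8/9 PDU; return skipped entries."""
        skipped = []
        for encap in encaps:
            if encap.evpn_flag:
                # A CE MUST NOT set 'E'. Skipping is this sketch's assumption.
                skipped.append(encap)
                continue
            # MAC entry points to the AC to which the LSoE session is bound.
            self.mac_vrf[encap.mac] = session_ac
            if encap.ip is not None:
                # Neighbor binding: next-hop re-write uses the MAC in the PDU.
                self.neighbors[encap.ip] = (encap.mac, session_ac)
        return skipped


bd = PeBroadcastDomain()
skipped = bd.learn_from_ce("ac-1", [
    OverlayEncap(mac="00:11:22:33:44:55", ip="10.0.0.1"),
    OverlayEncap(mac="00:11:22:33:44:66"),                      # MAC-only entry
    OverlayEncap(mac="00:11:22:33:44:77", ip="10.0.0.9", evpn_flag=True),
])
```

   In this run the first entry yields both a MAC entry and a neighbor
   binding, the second yields a MAC entry only, and the third is
   skipped.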

5. PE Any-cast GW MAC/IP Learning on CE

   If LSoE based host learning is enabled on a PE with a distributed
   any-cast gateway on the EVPN PE,

   o EVPN PE MUST send type 8/9 Overlay Encapsulation PDUs on associated
     ACs with LSoE sessions toward CE hosts.
   o Type 8/9 PDUs from an EVPN PE MUST be encoded with the any-cast
     gateway IPv4/IPv6 address and any-cast gateway MAC address.
   o EVPN flag 'E' MUST NOT be set in this PDU.
   o A CE MAY process type 8/9 PDUs to establish GW IP to MAC bindings
     and learn gateway MAC to LAG AC bindings, similar to handling of
     type 8/9 PDUs on the PE described above.

   Handling of type 8/9 PDUs for the purpose of gateway learning on the
   host is desirable but optional. A CE MAY continue to use ARP and ND
   for this purpose.

6. Remote CE MAC/IP Learning on CE

   For CE to CE intra-subnet flows across the overlay, a CE needs to
   learn and install a neighbor IP to MAC binding for remote CEs. This
   is handled today either by flooding ARP/ND requests across the
   overlay bridge or, optionally, by implementing an ARP/ND suppression
   cache on the PE that is populated via MAC+IP EVPN route-type 2. In
   the latter case, ARP/ND request frames are trapped on the PE, which
   sends a local ARP/ND reply on behalf of the remote CE. If LSoE based
   learning is enabled in the fabric, LSoE may be used for this purpose
   to avoid overlay ARP/ND flooding and data frame triggered ARP
   learning, and to avoid maintaining an ARP suppression cache on the
   PE.

   o Remote MAC-IP routes learned via BGP EVPN route-type 2 that are
     imported to a local MAC-VRF MAY also be sent as type 8/9 PDUs on
     LSoE sessions to CEs over local ACs in that BD.
   o EVPN flag 'E' MUST be set in this encapsulation in the PDU.
   o A CE MAY install IPv4/IPv6 neighbor MAC bindings for remote CEs
     within a subnet based on 'E' flagged type 8/9 PDUs received from
     the PE.
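
   As a rough illustration of how the 'E' flag distinguishes the two
   kinds of entries a PE sends toward a CE, the Python sketch below
   builds the entry list for one address family. The names are
   hypothetical, the remote routes are modeled as simple (IP, MAC)
   pairs rather than real EVPN route-type 2 objects, and relaying of
   remote routes is shown as a switch because it is optional.

```python
from typing import List, NamedTuple, Optional, Tuple


class OverlayEncap(NamedTuple):
    ip: Optional[str]
    mac: str
    evpn_flag: bool   # the 'E' flag


def encaps_toward_ce(gateway_ip: str, gateway_mac: str,
                     remote_mac_ip_routes: List[Tuple[str, str]],
                     relay_remote: bool = True) -> List[OverlayEncap]:
    """Build the type 8 (or type 9) entry list a PE sends on a local AC."""
    # Any-cast gateway binding: 'E' MUST NOT be set.
    encaps = [OverlayEncap(gateway_ip, gateway_mac, evpn_flag=False)]
    if relay_remote:
        # Remote CE bindings imported from EVPN: 'E' MUST be set.
        for ip, mac in remote_mac_ip_routes:
            encaps.append(OverlayEncap(ip, mac, evpn_flag=True))
    return encaps


entries = encaps_toward_ce(
    "10.0.0.254", "00:00:5e:00:01:01",
    [("10.0.0.20", "00:11:22:33:44:20"), ("10.0.0.21", "00:11:22:33:44:21")],
)
```

   A CE that processes these entries can tell the gateway binding from
   relayed remote bindings by the flag alone.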

   Handling of type 8/9 PDUs for this purpose is optional but desirable
   to get the full benefit of a fabric that is completely set up on
   boot-up, avoids overlay flooding, and is decoupled from latencies
   associated with data plane driven ARP and ND learning.

7. PE <-> CE Control Plane with EVPN All-active Multi-Homing

                 +------------------------+
                 | Underlay Network Fabric|
                 +------------------------+

                      BGP-EVPN Peering
   <--------------------------------------------------->
   +------+       +------+         +------+       +------+
   | PE1  |       | PE2  |  .....  | PEx  |       | PEy  |
   +------+       +------+         +------+       +------+
       \             /                 \             /
        \           /                   \           /
         \         /                     \         /
          \ ESI-a /                       \ ESI-b /
      LSoE \     / LSoE               LSoE \     / LSoE
      to PE1\   / to PE2              to PEx\   / to PEy
           CE-Host                         CE-Host

                            Figure 6

   In an EVPN all-active multi-homing setup, a LAG interface on the CE
   includes member physical ports that connect to multiple PE devices. A
   subset of these member ports that terminate at a PE are configured as
   members of a local LAG interface at that PE. A LAG AC at the PE is a
   logical interface in a BD, identified by this LAG interface and
   optionally, an Ethernet Tag in case of trunk ports.

   In order for LSoE based learning to work with EVPN all-active
   multi-homing, a separate LSoE peering MUST be established between the
   CE host and each PE device. For this reason, while an EVPN PE MAY
   form an LSoE peering to a CE host on its local LAG AC, the CE host
   MUST form an LSoE peering to a PE on a local LAG "member physical
   port". A configurable All-active Multi-Homing mode is defined below
   in order to be able to bind an LSoE peering to a LAG member-port as
   opposed to a LAG interface.

7.1 All-active Multi-Homing Mode

   When configured to run on a local LAG port in this mode,

   o LSoE HELLO messages MUST be replicated on ALL LAG member ports.
   o An LSoE OPEN message sent in response to a HELLO MUST be sent on
     the LAG member port on which the HELLO was received.
   o An LSoE session MUST be bound to the local LAG member port on which
     the OPEN message was received.
   o LSoE encapsulation PDUs MUST be sent on the local LAG member port
     on which the session was bound.
   o LSoE Keep-Alives MUST be sent on the local LAG member port on which
     the session was bound.

   Note that this may result in a PE receiving multiple HELLO PDUs from
   a CE MAC. This however is harmless, as per the [LSOE] specification.
   A PE simply drops redundant HELLOs from a MAC that it has already
   replied to with an OPEN, within a retry time window.

7.2 Source MAC

   LSoE relies on the source MAC address in the Ethernet frame to
   establish a peering. When running LSoE on a LAG port (in all-active
   multi-homing mode or regular mode), LSoE frames MUST use the LAG
   interface MAC as the source MAC address in the Ethernet frame.

7.3 CE MAC/IP Learning with EVPN All-active Multi-Homing

   In order to accomplish MAC/IP learning of CE host devices multi-homed
   to EVPN fabric PEs via EVPN All-active Multi-Homing:

   o A multi-homed CE device MUST be configured to run LSoE on a local
     LAG interface in the All-active Multi-Homing mode defined above.
   o EVPN PE MAY run LSoE on local LAG interfaces to multi-homed CE
     devices in regular mode.
   o EVPN PEs that share the same Ethernet Segment MUST use unique
     source MACs (that of the local LAG) in HELLO/OPEN messages to
     establish separate LSoE sessions to a CE.

   With the above rules in place,

   o An LSoE session on the CE is bound to a local LAG member-port.
   o An LSoE session on the PE is bound to a local LAG AC port.
   o A single LSoE session is established at the PE to a CE on the local
     LAG AC.
   o 'N' LSoE sessions are established at the CE, one to each PE on a
     local LAG member interface, where N = number of multi-homing PEs in
     an Ethernet Segment.
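
   The CE-side behavior in All-active Multi-Homing mode can be sketched
   in Python as below. This is a toy model under stated assumptions:
   frames are tuples rather than real LSoE PDUs, the class and method
   names are hypothetical, and ignoring an OPEN from a PE MAC that
   already has a session is a choice of this sketch rather than a rule
   from this document or from [LSOE].

```python
class CeAllActiveLsoe:
    """Toy model of a CE running LSoE on a LAG in All-active Multi-Homing mode."""

    def __init__(self, lag_mac, member_ports):
        self.lag_mac = lag_mac                  # source MAC for all LSoE frames
        self.member_ports = tuple(member_ports)
        self.sessions = {}                      # PE LAG MAC -> local member port

    def hello_frames(self):
        # HELLOs are replicated on ALL LAG member ports.
        return [(port, ("HELLO", self.lag_mac)) for port in self.member_ports]

    def on_open(self, rx_port, peer_mac):
        """Handle an OPEN from a PE; return the OPEN to send back, if any."""
        if rx_port not in self.member_ports:
            raise ValueError(f"{rx_port} is not a member of this LAG")
        if peer_mac in self.sessions:
            return None                         # assumption: keep the existing session
        # The session is bound to the member port on which the OPEN arrived,
        # and the reply leaves on that same member port.
        self.sessions[peer_mac] = rx_port
        return (rx_port, ("OPEN", self.lag_mac, peer_mac))

    def tx_port(self, peer_mac):
        # Encapsulation PDUs and Keep-Alives use the bound member port.
        return self.sessions[peer_mac]


h1 = CeAllActiveLsoe("LMAC1", ["i121", "i122", "i123", "i124",
                               "i131", "i132", "i133", "i134"])
hellos = h1.hello_frames()
reply_pe2 = h1.on_open("i122", "LMAC2")
reply_pe3 = h1.on_open("i132", "LMAC3")
```

   With two multi-homing PEs the CE ends up with N = 2 sessions, each
   bound to one member port, matching the rules listed above.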

   Once an LSoE session is established as above, all other host learning
   procedures defined earlier for CE MAC/IP learning on a PE's AC port
   apply as is to a LAG AC in an EVPN all-active multi-homing setup.

7.4 LAG Member Link Failure

   On a CE that is running in all-active multi-homing mode, an LSoE
   session to a PE is bound to a LAG member interface. If the link that
   the LSoE session is bound to fails, the LSoE session will get torn
   down at the CE by virtue of the session interface going down. If the
   CE has additional active member link(s) to this PE, a new LSoE
   session must be established on one of the active member links via
   HELLO PDUs sent by the CE on its remaining active member links to the
   PE.

7.4.1 Session Re-establishment

   The LSoE session at the CE is torn down immediately following the
   session interface failure. While the LAG interface at the PE is still
   operationally UP, the LSoE session at the PE is subject to Keep Alive
   PDUs received from the CE. Once the session expires at the PE because
   of missed Keep Alive PDUs from the CE, the PE will respond to a HELLO
   on one of the active member links with an OPEN to re-establish a new
   session. Note that the new session is still bound to the LAG AC at
   the PE and to a new member link at the CE.

7.4.2 TLV Retention

   TLVs learnt from a CE over a failed session MUST be retained at the
   PE if the PE LAG AC is still operationally up following a member link
   failure because of active member link(s) in the LAG. TLV retention
   logic at the PE MAY be based on an age-out time that is a local
   matter at the PE. TLV age-out time MUST be higher than the missed
   Keep Alive duration, after which the session is considered closed.
   Once a new LSoE session is established, the PE MUST implement a mark
   and sweep logic to reconcile retained TLVs from the CE peer with the
   new set of TLVs received from this CE.
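
   One way to picture the mark and sweep reconciliation is the Python
   sketch below. It is an illustration only: TLV state is modeled as a
   dictionary keyed by IP address with the MAC as value, the function
   name is hypothetical, and age-out timers are not modeled.

```python
def reconcile_tlvs(retained, received):
    """Reconcile retained bindings with those received on the new session.

    Returns (table, added_or_changed, withdrawn).
    """
    stale = set(retained)            # mark: every retained entry starts as stale
    table = dict(retained)
    added_or_changed = []
    for key, value in received.items():
        if retained.get(key) != value:
            added_or_changed.append(key)
        table[key] = value
        stale.discard(key)           # entry was re-learnt, so it is no longer stale
    for key in stale:                # sweep: drop entries the CE did not re-send
        del table[key]
    return table, sorted(added_or_changed), sorted(stale)


retained = {"10.0.0.1": "00:11:22:33:44:55", "10.0.0.2": "00:11:22:33:44:66"}
received = {"10.0.0.1": "00:11:22:33:44:55", "10.0.0.3": "00:11:22:33:44:77"}
table, added, withdrawn = reconcile_tlvs(retained, received)
```

   The point of reconciling instead of flushing is that unchanged
   entries, such as 10.0.0.1 here, stay installed throughout, so only
   the differences need to be reflected in forwarding state and in
   EVPN.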
7.4 LAG Failure

   When a LAG member link failure results in the LAG interface being
   operationally down, the TLV age-out logic discussed above MUST NOT be
   in effect. The LSoE session MAY be considered DOWN immediately on the
   LAG going down at the PE. This is so that, in the event of a total
   connectivity loss between a PE and a CE, CE learnt routes can be
   withdrawn immediately.

7.5 Example PE <-> CE Control Plane Flow with All-active Multi-Homing

   An example LSoE over all-active multi-homing session flow is
   discussed below for clarity.

      +-------------+      +-------------+
      |             |      |             |
      |     PE2     |      |     PE3     |
      |             |      |             |
      +-+-----------+      +-+-----------+
        |    LAG    |        |    LAG    |
        ++--+---+--++        ++--+---+--++
         |  |   |  |          |  |   |  |
         |  |   |  |          |  |   |  |
         |  |   |  |          |  |   |  |
         |  |   |  |          |  |   |  |
         |  |   |  |          |  |   |  |
      +--+--+---+--+----------+--+---+--++
      |               LAG                |
      +----------------------------------+-+
      |                                    |
      |                 H1                 |
      |                                    |
      +------------------------------------+

                     Figure 7

   Example topology with CE H1 multi-homed to PE2 and PE3 via an EVPN
   all-active multi-homing LAG with four member ports to each PE:

      H1 member ports to PE2:  i121, i122, i123, i124
                                 |     |     |     |
      PE2 member ports to H1:  i211, i212, i213, i214

      H1 member ports to PE3:  i131, i132, i133, i134
                                 |     |     |     |
      PE3 member ports to H1:  i311, i312, i313, i314

      H1 LAG port to PE2/PE3:  MLAG1
      PE2 LAG port to H1:      LAG2
      PE3 LAG port to H1:      LAG3

      H1 LAG MAC:   LMAC1
      PE2 LAG MAC:  LMAC2
      PE3 LAG MAC:  LMAC3

      H1 running LSoE on MLAG1 in All-active Multi-Homing mode
      PE2 running LSoE on LAG2 in regular mode
      PE3 running LSoE on LAG3 in regular mode

               PE2                   H1                  PE3
                |       HELLOs       |       HELLOs       |
            LAG2|<-------------------|------------------->|LAG3
            LAG2|<-------------------|------------------->|LAG3
            LAG2|<-------------------|------------------->|LAG3
            LAG2|<-------------------|------------------->|LAG3
                |                    |                    |
                |        OPEN        |        OPEN        |
                |------------------->|<-------------------|
            LAG2|                i122|i132                |LAG3
                |                    |                    |
                |        OPEN        |        OPEN        |
                |<-------------------|------------------->|
            LAG2|                i122|i132                |LAG3
                |                    |                    |
      Session to|          Session to|Session to          |Session to
   LMAC1 on LAG2|       LMAC2 on i122|LMAC3 on i132       |LMAC1 on LAG3
                |                    |                    |
                |     Encap-PDU      |     Encap-PDU      |
                |<-------------------|------------------->|
            LAG2|                i122|i132                |LAG3
                |        ACK         |        ACK         |
                |------------------->|<-------------------|
            LAG2|                    |                    |LAG3
                |                    |                    |
                |    Overlay-PDU     |    Overlay-PDU     |
                |------------------->|<-------------------|
            LAG2|                    |                    |LAG3
                |        ACK         |        ACK         |
                |<-------------------|------------------->|
            LAG2|                i122|i132                |LAG3
                |                    |                    |

                              Figure 8

   In the example flow shown above:

   o  H1: originates HELLO(SMAC=LMAC1) on all MLAG member ports
   o  PE2: Multiple HELLO(SMAC=LMAC1) copies received on port LAG2
   o  PE3: Multiple HELLO(SMAC=LMAC1) copies received on port LAG3
   o  PE2: A single OPEN(SMAC=LMAC2, DMAC=LMAC1) sent on port LAG2
   o  PE3: A single OPEN(SMAC=LMAC3, DMAC=LMAC1) sent on port LAG3
   o  PE2/PE3: duplicate HELLOs from the same source LMAC1 are ignored
   o  H1: OPEN(SMAC=LMAC2, DMAC=LMAC1) received on member port i122
   o  H1: OPEN(SMAC=LMAC1, DMAC=LMAC2) sent on member port i122
   o  H1: Session established to LMAC2 on MLAG1 member port i122
   o  PE2: Session established to LMAC1 on LAG AC LAG2
   o  H1: OPEN(SMAC=LMAC3, DMAC=LMAC1) received on member port i132
   o  H1: OPEN(SMAC=LMAC1, DMAC=LMAC3) sent on member port i132
   o  H1: Session established to LMAC3 on MLAG1 member port i132
   o  PE3: Session established to LMAC1 on LAG AC LAG3
   o  H1: IP encapsulation PDUs (type 4/5) sent to LMAC2 and LMAC3
   o  PE2/PE3: H1 MAC and IP are learned
   o  PE2/PE3: overlay IP encapsulation PDUs (type 8/9) sent to LMAC1
   o  H1: Any-cast GW MAC and IP are learned
   o  H1: Remote host MAC and IP are learned

8. Software Neighbor Tables

   Some networking stack implementations rely on ARP and ND populated
   neighbor tables for software forwarding. In order to inter-work with
   such an implementation, an LSoE learned IPv4/IPv6 neighbor entry MAY
   also be installed in the ARP and ND neighbor tables as a static /
   permanent entry. In addition,

   o  Pre-installing LSoE learned neighbor entries may help reduce
      potential conflict with ARP or ND learned neighbor entries.

   o  Pre-installing LSoE learned neighbor entries may help reduce
      reliance on data traffic triggered ARP requests / ND solicitations
      and the associated learning latency.

   With respect to installing IPv6 entries learnt via LSoE in the IPv6
   ND cache, the Router flag (R-bit) and Override flag (O-bit) received
   in an LSoE PDU should be handled as defined in [RFC4861].

9. MAC/IP Learning Conflict Resolution

   If LSoE learned neighbor entries are not already installed as static
   entries in the ARP/ND neighbor table, it is possible that a neighbor
   IPv4/IPv6 adjacency may be learned both via LSoE and ARP/ND. Even if
   LSoE learned entries were pre-installed in the neighbor table, a race
   condition is still possible, leading to a potential conflict between
   ARP/ND learned and LSoE learned neighbor IP adjacencies. In such
   scenarios, the LSoE learned entry should be preferred for the purpose
   of programming neighbor IP adjacencies in forwarding.
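   The preference rule for neighbor adjacencies can be illustrated with
   a short Python sketch. The types and the function name below
   (Source, Neighbor, select_for_forwarding) are hypothetical, and the
   addresses are documentation examples; the sketch only shows the
   selection step, not how entries are learned or programmed.

```python
from dataclasses import dataclass
from enum import IntEnum


class Source(IntEnum):
    ARP_ND = 1  # learned via ARP/ND
    LSOE = 2    # learned over the PE-CE LSoE session (preferred)


@dataclass(frozen=True)
class Neighbor:
    ip: str
    mac: str
    source: Source


def select_for_forwarding(candidates):
    """Pick one entry per IP, preferring the LSoE learned entry on conflict."""
    best = {}
    for entry in candidates:
        current = best.get(entry.ip)
        if current is None or entry.source > current.source:
            best[entry.ip] = entry
    return best


candidates = [
    Neighbor("192.0.2.10", "00:00:5e:00:53:01", Source.ARP_ND),
    Neighbor("192.0.2.10", "00:00:5e:00:53:02", Source.LSOE),
    Neighbor("192.0.2.11", "00:00:5e:00:53:03", Source.ARP_ND),
]
selected = select_for_forwarding(candidates)
```

   An entry that is known only via ARP/ND (192.0.2.11 here) is still
   used; the preference only matters when both sources report the same
   IP address.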
   With respect to MAC-VRF entries, it is recommended that data plane
   learning be turned off when LSoE based learning is enabled. However,
   if it is not, data plane learned entries MUST be reconciled with LSoE
   learned entries in software and, in case of a conflict, LSoE learned
   entries MUST be preferred.

10. PE-CE Overlay Prefix Learning

   [EVPN-PREFIX-ADV] section 4.1 defines a use case wherein a PE may
   advertise IP prefixes and subnets behind a CE. In this use case, the
   CE device does not run a dynamic routing protocol. Instead, these
   prefixes are learnt on the PE via local policy or configuration.
   Prefixes are then advertised by the PE as RT-5 with the CE as the GW.

   The PE-CE control plane defined in this document MAY be used to learn
   these prefixes from a CE as an alternative to local configuration on
   the PE. Once an LSoE session is established between a CE and a PE, as
   discussed earlier,

   o  A CE MAY send type 10/11 PDUs with these IPv4/IPv6 prefixes over
      an LSoE session to a PE with the CE IP as the GW IP.

   o  A PE MAY advertise prefixes learnt via type 10/11 PDUs as RT-5
      with the CE IP as the GW IP.

   To summarize, a PE would advertise:

   o  RT-2 for the CE MAC-IP learnt via type 8/9 PDU

   o  RT-5 for prefixes learnt via type 10/11 PDU with GW IP = CE IP

11. Asymmetric EVPN-IRB

   Any deviations from the above procedures proposed in this document
   for an asymmetric IRB design will be covered in subsequent updates to
   this document.

12. Centralized Gateway EVPN-IRB

   Any deviations from the above procedures proposed in this document
   for a centralized GW based IRB design will be covered in subsequent
   updates to this document.

13. Use Cases

13.1 Simplified EVPN Operations

   This section discusses in detail the benefits and simplifications
   that may be achieved in the context of an EVPN network if one chooses
   to implement the PE-CE control plane defined in this document, as
   opposed to using traditional data-plane and ARP/ND snooping based
   PE-CE learning.

13.1.1 EVPN All-active Multi-Homing

                   +------------------------+
                   | Underlay Network Fabric|
                   +------------------------+

                        BGP-EVPN Peering
      <--------------------------------------------------->
      +------+     +------+           +------+     +------+
      | PE1  |     | PE2  |   .....   | PEx  |     | PEy  |
      +------+     +------+           +------+     +------+
         \             /                 \             /
          \           /                   \           /
           \         /                     \         /
            \ ESI-a /                       \ ESI-b /
           LAG Bundle                      LAG Bundle
           to CE Host                      to CE Host

                             Figure 9

   Data plane and ARP/ND snooping based MAC/IP learning on PE-CE
   all-active multi-homed LAG ports is subject to unpredictable hashing
   of ARP, ND, and data frames from host to PE. As an example, an ARP
   request for a connected host might originate at PE1, but the
   resulting ARP response from the host might be received at PE2.
   Redundant EVPN PEs in all-active multi-homing mode typically handle
   this unpredictability via a combination of the methods below:

   o  PEs can handle unsolicited ARP and ND response frames.

   o  PEs can implement an additional mechanism to SYNC ARP, ND, and MAC
      tables across all PEs in a redundancy group for optimal forwarding
      to locally connected hosts.

   o  PEs can implement EVPN aliasing procedures discussed in [RFC7432]
      OR re-originate SYNCed MAC-IP adjacencies as local RT-2 to achieve
      MAC ECMP across the overlay.

   o  PEs can also re-originate SYNCed MAC-IP adjacencies as local RT-2
      to achieve IP ECMP across the overlay OR implement IP aliasing
      procedures discussed in [EVPN-IP-ALIASING].

   o  PEs can also ensure EVPN sequence number SYNC for local MAC
      entries for EVPN mobility procedures to work correctly, as
      discussed in [EVPN-IRB-MOBILITY].
   The PE-CE control plane learning alternative defined in this document
   fully decouples MAC and IP learning over MLAG ports from
   unpredictable hashing of data, ARP, and ND frames on all-active
   multi-homed LAG member links. As a result, the above procedures,
   which essentially result from data-plane PE-CE learning on all-active
   multi-homed LAGs, can be simplified via the PE-CE control plane
   alternative defined in this document.

13.1.2 Convergence on CE Host Moves

           +------------------------+
           | Underlay Network Fabric|
           +------------------------+

                BGP-EVPN Peering
      <--------------------------------->
      +------+    +------+       +------+
      | PE1  |    | PE2  | ..... | PEx  |
      +------+    +------+       +------+
         |           |              |
       Hosts       Hosts          Hosts

                   Figure 10

   Host mobility across EVPN PE switches is a common occurrence in a
   data center fabric for flexibility in work load placement across a
   DC. Further, a host move must result in minimal, if any, disruption
   to traffic flows / services to / from the device. Data plane and
   ARP/ND snooping based PE-CE learning may result in unpredictable
   convergence times following host moves in the following cases:

   o  A host may or may not send any data packet immediately following a
      move.

   o  A host may or may not send an unsolicited ARP following a move.

   While the probing procedures discussed in the next sub-sections are
   typically used to minimize convergence time, certain scenarios
   discussed below may still result in extended convergence times and
   flooding.

13.1.2.1 Silent Hosts

   If a host is silent for an extended period following a move from PE1
   to PE2, any bridged traffic flow destined to this host will continue
   to be black-holed by PE1 until the MAC ages out at PE1. Once the MAC
   ages out at PE1, any bridged traffic flow destined to the host is
   flooded across the overlay bridge.
   Flooding of unknown unicast traffic on the overlay is enabled for
   this purpose.

   In summary, PE-CE learning that is based on data-plane and ARP/ND
   snooping may be subject to non-deterministic convergence time and
   flooding following host moves because it is heavily dependent on
   unpredictable CE behavior. The PE-CE control plane based learning
   defined in this document fully decouples convergence in such
   scenarios from non-deterministic data flows and unsolicited ARP/ND
   behavior on a CE.

13.1.2.2 Probing

   ARP and ND probing procedures are typically used to achieve host
   re-learning and convergence following host moves across the overlay:

   o  Following a host move from PE1 to PE2, the host's MAC is
      discovered at PE2 as a local MAC via data frames received from the
      host. If PE2 has a prior REMOTE MAC-IP host route for this MAC
      from PE1, an ARP probe is typically triggered at PE2 to learn the
      MAC-IP as a local IP adjacency, which in turn triggers an EVPN
      RT-2 advertisement for this MAC-IP across the overlay with new
      reachability via PE2.

   o  Following a host move from PE1 to PE2, once PE1 receives a MAC or
      MAC-IP route from PE2 with a higher sequence number, an ARP probe
      is triggered at PE1 to clear the stale local MAC-IP neighbor
      adjacency OR re-learn the local MAC-IP in case the host has moved
      back or is a duplicate.

   o  Following a local MAC age-out, if there is a local IP adjacency
      with this MAC, an ARP probe is triggered for this IP to either
      re-learn the local MAC and maintain local L3 and L2 reachability
      to this host OR to clear the ARP entry in case the host is indeed
      no longer local.

      Note that clearing of stale ARP entries following a move is
      required for traffic to converge in the event that the host was
      silent and not discovered at its new location. Once the stale ARP
      entry for the host is cleared, a routed traffic flow destined for
      the host can re-trigger ARP discovery for this host at the new
      location.
      ARP flooding on the overlay MUST also be done to enable ARP
      discovery via routed flows.

   o  Alternatively, the ARP probing timer may be tuned to be smaller
      than the MAC aging timer to avoid MAC age-out.

   The PE-CE control plane learning alternative defined in this document
   decouples host learning following moves from unpredictable host
   behavior with respect to sending data traffic and unsolicited ARPs,
   and as a result from ARP probing and MAC aging timer settings. Host
   move handling is hence greatly simplified to a predictable and
   deterministic behavior.

13.1.3 ARP Gleaning Latency

   If a CE's ARP binding is not already learned on a PE via an
   unsolicited ARP sent by the CE following events such as boot-up,
   flaps, and moves, a data frame that needs to be routed to the CE
   triggers the ARP or ND discovery process on the PE. On a typical
   hardware switching platform, an IP packet that does not resolve to a
   link-layer re-write is punted to the host stack, which delivers
   packets with incomplete link-layer resolution to ARP or ND for
   resolution. An ARP request / ND Solicitation is generated for the CE
   IP, and an ARP response or NA results in installing a link-layer
   re-write for the CE IP.

   In an EVPN multi-homing environment, this procedure is further
   complicated as the response is only received by one of the PEs, which
   may or may not be the one that generated the ARP or ND request. The
   learned neighbor binding is SYNCed to other PEs that share the
   multi-homed Ethernet Segment. Routed flows can now be forwarded to
   the host via all PEs.

   Latency associated with such data frame driven ARP discovery may
   result in a significant initial convergence hit following triggers
   that warrant re-gleaning of the CE IP to MAC binding.
   The PE-CE control plane learning alternative defined in this document
   results in proactive host learning following these scenarios,
   potentially avoiding a convergence hit on initial data packets.

13.2 Applicability to non-EVPN Use Cases

   While the LSoE based host learning procedure described in this
   document focuses on the EVPN-IRB overlay fabric use case, it may also
   have benefits and applicability in non-EVPN use cases. Applicability
   of the procedures described in this document to non-EVPN use cases is
   a topic for further study.

14. Summary

   A PE-CE control plane is proposed as an alternative to data plane and
   ARP/ND snooping based PE-CE host MAC/IP learning, and for PE-CE
   prefix learning. With a PE-CE control plane, CE host MAC and IP are
   deterministically learned on host boot-up, on host configuration,
   across host moves, on convergence triggers such as link failures,
   flaps, and PE re-boots, and on all-active multi-homing LAG links. A
   PE-CE control plane decouples CE MAC and IP learning from traffic
   flows sourced by a CE, from varying CE behavior with respect to
   sending unsolicited ARP/ND frames, and from hashing of CE sourced
   frames over all-active multi-homed LAG links. As a result, it helps
   achieve a predictable and reliable convergence behavior across these
   triggers and helps simplify certain EVPN procedures that are
   otherwise needed with data-plane and ARP/ND snooping based PE-CE
   learning. In addition, it may also be used for non-host learning use
   cases such as prefix learning.

15. References

15.1 Normative References

   [RFC7432]  Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
              Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
              Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February
              2015.
   [LSOE]     Bush, R., Austein, R., and K. Patel, "Link State Over
              Ethernet", Feb 2019.

   [EVPN-IRB] Sajassi, A., Salem, S., Thoria, S., Drake, J., and J.
              Rabadan, "Integrated Routing and Bridging in EVPN", July
              2018.

   [EVPN-PREFIX-ADV]
              Rabadan, J., Henderickx, W., Drake, J., Lin, W., and A.
              Sajassi, "IP Prefix Advertisement in EVPN", May 2018.

   [EVPN-IRB-MOBILITY]
              Malhotra, N., Sajassi, A., Rabadan, J., Drake, J.,
              Lingala, A., and A. Patekar, "Extended Mobility Procedures
              for EVPN-IRB", Jan 2019.

   [EVPN-IP-ALIASING]
              Sajassi, A. and G. Badoni, "L3 Aliasing and Mass
              Withdrawal Support for EVPN", July 2017.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", May 2017.

15.2 Informative References

16. Acknowledgements

   The authors would like to thank Randy Bush and Rob Austein for their
   detailed review and feedback to ensure consistency with the base LSOE
   protocol specification, as well as for helping build the detailed
   LSOE flows included in this document. The authors would also like to
   thank Ali Sajassi and John Drake for their detailed review and very
   valuable input on PE-CE protocol design for EVPN use cases, as well
   as on structuring this document for EVPN use cases.

Contributors

   Randy Bush
   Arrcus & IIJ
   5147 Crystal Springs
   Bainbridge Island, WA 98110
   United States of America

   Email: randy@psg.com

Authors' Addresses

   Neeraj Malhotra (Editor)
   Arrcus
   2077 Gateway Place, Suite #400
   San Jose, CA 95119, USA

   Email: neeraj.ietf@gmail.com

   Keyur Patel
   Arrcus
   2077 Gateway Place, Suite #400
   San Jose, CA 95119, USA

   Email: keyur@arrcus.com

   Jorge Rabadan
   Nokia
   777 E. Middlefield Road
   Mountain View, CA 94043, USA

   Email: jorge.rabadan@nokia.com