Internet-Draft EVPN Context Label June 2020
Wang & Song Expires 12 December 2020
Workgroup:
BESS WG
Published:
Intended Status:
Standards Track
Expires:
Authors:
Y. Wang
ZTE Corporation
B. Song
ZTE Corporation

Context Label for MPLS EVPN

Abstract

EVPN is designed to provide a better VPLS service than [RFC4761] and [RFC4762], and it has indeed introduced many new features that could not be achieved in those older VPLS implementations. However, EVPN did not inherit all the features of the older VPLS, and a few issues arise for EVPN only.

Some of these issues can be attributed to the MP2P nature of EVPN labels. The PW label in the older VPLS is a label for a P2P VC, so it carries more context in the data plane than a mere identifier of its VSI instance. The EVPN label, by contrast, identifies only its VSI instance and cannot stand for the ingress PE in the data plane. As a result, several issues arise with the MPLS EVPN service; the cases addressed in this document are source squelching in hub-and-spoke scenarios, per-ingress statistics, AR-REPLICATOR support, and anycast tunnel usage on SPEs (see Section 6).

This document introduces a compound label stack to take advantage of both P2P VC and MP2P EVPN labels.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 12 December 2020.

1. Terminology and Acronyms

This document uses the following acronyms and terms:

2. Problem Statement

EVPN is designed to provide a better VPLS service than [RFC4761] and [RFC4762], and it has indeed introduced many new features that could not be achieved in those older VPLS implementations. However, EVPN did not inherit all the features of the older VPLS, and a few issues arise for EVPN only.

Some of these issues can be attributed to the MP2P nature of EVPN labels. The PW label in the older VPLS is a label for a P2P VC, so it carries more context in the data plane than a mere identifier of its VSI instance. The EVPN label, by contrast, identifies only its VSI instance and cannot stand for the ingress PE in the data plane. As a result, several issues arise with the MPLS EVPN service; the cases addressed in this document are source squelching in hub-and-spoke scenarios, per-ingress statistics, AR-REPLICATOR support, and anycast tunnel usage on SPEs (see Section 6).

This document therefore introduces a compound label stack to take advantage of both P2P VC and MP2P EVPN labels.

3. Using VC Label to Add Context Data to EVPN Flows

To add as much context to an EVPN data packet as the older VPLS provides, we can construct an infrastructure consisting of a full mesh of context-VCs among the EVPN PEs.

Take the context-VCs between PE-i and PE-j as an example. VC-ij is the context-VC from PE-i to PE-j, and VC-ji is the context-VC from PE-j to PE-i. VC-ij identifies the PE-i node on PE-j, and VC-ji identifies the PE-j node on PE-i. The VC label for VC-ij is called L-ij, and the VC label for VC-ji is called L-ji.

Because L-ij identifies the ingress PE of a data packet, PE-i can push L-ij onto an EVPN data packet so that PE-j can distinguish packets sent by PE-i from other data packets.
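The full mesh described above can be illustrated with a small, non-normative Python sketch. It assumes that each L-ij is allocated by the egress PE-j (consistent with the SRGB-based derivation in Section 4.2) and that labels start at an arbitrary value; the function names and label values are hypothetical and are not defined by this document.

```python
from itertools import permutations


def build_context_vc_mesh(pes, first_label=1000):
    """Allocate one context-VC label L-ij per ordered PE pair (i, j).

    Assumption: L-ij is allocated by the egress PE-j, so label values
    only need to be unique per egress PE. first_label is arbitrary.
    """
    next_label = {pe: first_label for pe in pes}
    labels = {}  # (ingress, egress) -> L-ij
    for ingress, egress in permutations(pes, 2):
        labels[(ingress, egress)] = next_label[egress]
        next_label[egress] += 1
    return labels


def ingress_of(labels, egress, label):
    """On the egress PE, map a received context-VC label to its ingress PE."""
    for (i, j), value in labels.items():
        if j == egress and value == label:
            return i
    return None


mesh = build_context_vc_mesh(["PE1", "PE2", "PE3"])
l_12 = mesh[("PE1", "PE2")]  # label PE1 pushes on packets sent to PE2
```

With three PEs the mesh contains six unidirectional context-VCs, and PE2 can recover "PE1" from l_12 alone, which is the property the EVPN label by itself lacks.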

There are two styles of context-VC in this draft: the shared context-VC and the per-EVI context-VC.

3.1. The Shared Context VCs

The shared context-VCs are dedicated to identifying the context of a data packet, while the EVPN label still identifies the EVPN instance.

Note that a shared context VC can typically be shared by all the EVPN instances between its ingress PE and egress PE. In other words, we do not have to establish a dedicated mesh of context VCs for each EVPN service. For this reason, we call the shared context VCs a common infrastructure for those EVPN services.

3.2. The per-EVI Context VCs

The per-EVI context VCs are used to identify both the context (typically the ingress-PE) and the EVPN instance for a data packet at the same time. In other words, we have to establish a dedicated set of per-EVI context VCs for each specified EVPN service.

4. Signalling for Shared Context VCs

The VCs of a context VC infrastructure are set up by a context VC container, which implements VC signalling to set up the VCs. Two existing signalling protocols can be reused to set up context VCs for a context VC container.

4.1. Kompella Signalling for Context VC

The signalling used by a Kompella VPLS instance per [RFC4761] can also be used by a context VC container.

Unlike an ordinary Kompella VPLS instance, a context VC container uses the signalling only to set up the context VCs. The two are the same in signalling but different in the data plane. Take the PW between PE-i and PE-j as an example: it is constructed from VC-ij and VC-ji, and neither of the two context VCs identifies a MAC-VRF. In other words, the PW is a context PW.

Note that context VC containers do not have a MAC-VRF or a MAC table; they are just containers for context VCs.

4.2. SR-MPLS signalling for CSL-based Context VC

SR-MPLS signalling is very similar to the singleton pattern of Kompella VPLS in its signalling behavior, despite the different data plane and service procedures. The SID is similar to the VE-ID, and the SRGB is similar to the label block.

So the LSPs established by the SR-MPLS signalling can be reinterpreted as context VCs in another label space, named S. These context VCs use the same label values as the SR-LSPs and are established at the same time, but in a different label space. Take VC-ij as an example: its label value L-ij is the same as the SID label for PE-i (the label for the SR-LSP destined to PE-i) in PE-j's SRGB. However, VC-ij is established in the context label space S, which is identified by a static label; that static label is the CL of label space S. VC-ij is therefore not necessarily established in the same label space as the SID label for PE-i.

The context VC signalling may be that of [RFC8665], [RFC8666], or [RFC8667]. The context VCs may be established along with the SR-LSPs.

Note that the context VC label is a Context-Specific Label (CSL), which is why the context-VC is called a CSL-based context-VC. The CL of the SR-signalling-based context-VCs may have the same value throughout a domain. In that case, the PEs in the domain do not have to signal the CL to each other.

        +---------------------------------+
        |  underlay ethernet header       |
        +---------------------------------+
        |  PSN tunnel label               |
        +---------------------------------+
        |  EVPN label                     |
        +---------------------------------+
        |  Static CL for Label Space S    |
        +---------------------------------+
        |  Context VC Label = L-ij        |
        +---------------------------------+
        |  overlay ethernet or IP header  |
        +---------------------------------+

Figure 1: Encapsulation of CSL-based Context VC

Note that the static CL is the context label for L-ij, while L-ij is the context label for the payload.
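As a non-normative illustration of the derivation in this section, the sketch below computes L-ij as the label that PE-j's SRGB assigns to PE-i's SID index (SRGB base plus index, as in ordinary SR-MPLS) and assembles the label stack of Figure 1. The SRGB range, SID index, and all other label values are hypothetical.

```python
def context_vc_label(egress_srgb_base, egress_srgb_size, ingress_sid_index):
    """Return L-ij: the label PE-j's SRGB assigns to PE-i's SID index."""
    if not 0 <= ingress_sid_index < egress_srgb_size:
        raise ValueError("SID index falls outside the egress PE's SRGB")
    return egress_srgb_base + ingress_sid_index


def figure1_label_stack(psn_tunnel_label, evpn_label, static_cl, l_ij):
    """Label stack in the order drawn in Figure 1, outermost label first."""
    return [psn_tunnel_label, evpn_label, static_cl, l_ij]


# Hypothetical values: PE-j uses SRGB 16000-23999 and PE-i has SID index 11.
l_ij = context_vc_label(16000, 8000, 11)
stack = figure1_label_stack(psn_tunnel_label=24001, evpn_label=30010,
                            static_cl=1500, l_ij=l_ij)
```

Although l_ij has the same numeric value as the SR-LSP label toward PE-i, it is looked up only after the static CL has selected label space S, so the two uses do not collide.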

5. Signalling for per-EVI Context VCs

The IMET route per [RFC7432] has a corresponding route type in MVPN: it is, in effect, the Intra-AS I-PMSI route per [RFC6514]. The difference between them is that an IMET route does not handle a responding Leaf A-D route, whereas an Intra-AS I-PMSI route does.

The Leaf A-D route per [I-D.ietf-bess-evpn-bum-procedure-updates] is required for per-EVI context VCs. In this draft, we use the Leaf A-D route with an IR-PTA to establish per-EVI context-VCs. The Leaf A-D route is generated in response to an IMET route.

This is an update to [RFC7432]. Backward compatibility is described in Section 5.3.

5.1. Construct Leaf A-D Route for IR

PE1 constructs a Leaf A-D route with an IR-PTA for EVI1 in response to an IMET route R1 with an IR-PTA, which was previously received from PE2. The key fields of the IMET route are included in the "Route Key" field of the Leaf A-D route (say R2), along with the ORIP of PE1 itself. We call the ORIP of PE1 itself the Leaf A-D route's "self-ORIP" to distinguish it from the Leaf A-D route's Root-ORIP. So the "Route Type specific" field of the Leaf A-D route is on a per-<EVI1, PE2> basis.

5.1.1. Advertising Per-platform VC Label

The MPLS label field in the IR-PTA of the Leaf A-D route is allocated on a per-<EVI1, PE2> basis in the per-platform label space on PE1. So the per-EVI context VC identifies EVI1 as well.

Note that PE1 may already have advertised an IMET route R3 to PE2 before advertising the above Leaf A-D route.

5.1.2. Advertising Context-Specific VC label

Note that the per-<EVI, ingress PE> label allocation (see Section 5.1.1) may consume too many labels in the per-platform label space. Sometimes we want to use the same EVPN label in all Leaf A-D routes and IMET routes of the same EVI. So in this section we allocate a context-specific label (CSL) for a context VC.

The EVPN label is still allocated from the per-platform label space, and it identifies the EVPN instance as per [RFC7432]. But it also identifies a context label space, CLS1. The VC label of the context VC is allocated in CLS1, so we say that the VC label is a context-specific VC label.
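The difference in per-platform label consumption between the two allocation styles can be made concrete with a short, non-normative calculation. It assumes one EVPN label per EVI in the CSL-based style and one label per <EVI, ingress PE> pair in the style of Section 5.1.1; the EVI and PE counts are hypothetical.

```python
def per_platform_labels_needed(num_evis, num_remote_pes, use_csl):
    """Per-platform labels one PE must allocate for its context VCs.

    use_csl=False: one label per <EVI, ingress PE> (Section 5.1.1).
    use_csl=True:  one EVPN label per EVI; the per-ingress labels are
                   allocated in the context label space instead.
    """
    if use_csl:
        return num_evis
    return num_evis * num_remote_pes


# Hypothetical deployment: 1000 EVIs, each with 50 remote ingress PEs.
without_csl = per_platform_labels_needed(1000, 50, use_csl=False)
with_csl = per_platform_labels_needed(1000, 50, use_csl=True)
```

In this example the per-platform demand drops from 50000 labels to 1000, because the per-ingress labels no longer come out of the per-platform label space.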

We introduce a new BGP Extended Community called the Context-specific Label (CSL) Entry Extended Community. The CSL-Entry EC has the same format as the Context Label Space ID Extended Community (Section 3.1 of [I-D.ietf-bess-mvpn-evpn-aggregation-label]), except for a few notable differences.

The "sub-type" field of the CSL-Entry EC has a different codepoint from that of the CLS-ID EC. The ID-Value of the CSL-Entry EC is an MPLS label in a context-specific label space identified by the PTA label. The ingress PE pushes the MPLS label in the CSL-Entry EC onto the label stack before the PTA label. Typically, the MPLS label of the CSL-Entry EC is a downstream-assigned label, which means that the PE receiving the CSL-Entry EC uses it as an outgoing label, not as an incoming label.

When the Leaf A-D route is constructed, the IR-PTA label is the EVPN label, as per [RFC7432]. The ID-Value in the CSL-Entry EC, however, is a label (say L1) that is allocated on a per-TPE basis in CLS1. In fact, L1 is the context-specific VC label of the context VC of that ingress TPE, which is why the context VC is called a CSL-based context VC. So on ingress PEs the CLS1-specific VC label needs to be pushed onto the label stack before the EVPN label (which identifies CLS1).

Note that the CSL-Entry ECs (for different EVIs) received from the same TPE may carry the same label, because all EVI labels on the same PE may identify the same Context-specific Label Space (CLS). In that case, we can select a single EVI to use the Leaf A-D route with the CSL-Entry EC. This EVI is called the administrating EVI (admin-EVI). The context VC label carried in the Leaf A-D routes of the admin-EVI is then used in place of the PTA label of the IMET route with the same ORIP in all other ordinary EVIs. Note that the other ordinary EVIs do not use Leaf A-D routes with an IR-PTA in their signalling procedures; they use ordinary IMET routes instead. The admin-EVI needs to be configured on all EVPN PEs in that case.

This encapsulation is illustrated in the following figure:

                     +---------------------------------+
                     |  underlay ethernet header       |
                     +---------------------------------+
                     |  PSN tunnel label               |
                     +---------------------------------+
                     |  EVPN label                     |
                     +---------------------------------+
                     |  Context VC Label               |
                     +---------------------------------+
                     |  overlay ethernet or IP header  |
                     +---------------------------------+

Figure 2: Encapsulation of EVI-Specific VC Label for EVPN Payload

Note that the Context-VC Label here is not the CLS-ID of the EVPN label; rather, the EVPN label is the CLS-ID of the Context-VC Label. That is why the CLS-ID EC of [I-D.ietf-bess-mvpn-evpn-aggregation-label] is not appropriate for this encapsulation.
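For readers who want to see the Figure 2 label stack on the wire, the following non-normative sketch encodes it with the standard 32-bit MPLS label stack entry layout (20-bit label, 3-bit traffic class, bottom-of-stack bit, 8-bit TTL). The label values and the TTL are hypothetical, and the traffic class is left at zero for simplicity.

```python
import struct


def encode_label_stack(labels, ttl=64):
    """Encode labels (outermost first) as MPLS label stack entries.

    The bottom-of-stack (S) bit is set only on the last entry, which in
    Figure 2 is the Context VC Label.
    """
    out = b""
    for pos, label in enumerate(labels):
        if not 0 <= label < 2 ** 20:
            raise ValueError("MPLS labels are 20 bits wide")
        bottom = 1 if pos == len(labels) - 1 else 0
        out += struct.pack("!I", (label << 12) | (bottom << 8) | ttl)
    return out


# Hypothetical values: PSN tunnel label, EVPN label, Context VC Label.
wire = encode_label_stack([24001, 30010, 17])
```

The Context VC Label is pushed first and therefore ends up innermost, so the egress PE reads the EVPN label first and uses it to select the label space in which the Context VC Label is looked up.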

Note that when the SPE nodes change the PTA label to a new value (because of BGP next-hop rewriting), the CSL-Entry in the same EVPN route is not rewritten. This is similar to the behavior of the ESI Label EC of the EAD per ES route.

5.2. Establish Ingress Replication List by Leaf A-D Route

PE2 receives the Leaf A-D route (say R2) that responds to the IMET route R1 that PE2 itself previously advertised, and PE2 has previously received an IMET route R3 with the same ORIP as the self-ORIP of R2. Given that R1, R2, and R3 all have an IR-PTA, PE2 SHOULD use R2 instead of R3 to install the Ingress Replication List (IRL) item for PE1, and R3 will no longer be used to install the IRL item for PE1 from then on.

Note that when R2 includes a CSL-Entry EC, the ID-Value of the CSL-Entry EC is used as the outgoing label of the IRL item. The MPLS label of the IR-PTA is used as the context label (CL) of the CSL-Entry in the NHLFE. No ILM entry is installed for the CSL of R2 on PE2.
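The selection rule in this section can be sketched, non-normatively, as a function that chooses the service labels PE2 installs in the IRL item for PE1. The class and function names are hypothetical; the sketch assumes the returned list is ordered outermost label first and omits the PSN tunnel label.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LeafAD:
    """The parts of Leaf A-D route R2 that matter for the IRL item."""
    pta_label: int                          # MPLS label of the IR-PTA
    csl_entry_label: Optional[int] = None   # ID-Value of the CSL-Entry EC


def irl_outgoing_labels(imet_r3_label: int,
                        leaf_ad: Optional[LeafAD]) -> List[int]:
    """Service labels for the IRL item toward PE1, outermost first."""
    if leaf_ad is None:
        # No Leaf A-D route received: keep using IMET route R3.
        return [imet_r3_label]
    if leaf_ad.csl_entry_label is None:
        # R2 carries a per-platform VC label (Section 5.1.1).
        return [leaf_ad.pta_label]
    # R2 carries a CSL-Entry EC: the PTA label acts as the context
    # label and the CSL is pushed beneath it (Section 5.1.2).
    return [leaf_ad.pta_label, leaf_ad.csl_entry_label]
```

For example, irl_outgoing_labels(30010, LeafAD(30010, 17)) yields [30010, 17], while a peer that never sends a Leaf A-D route keeps the single label learned from R3.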

5.3. Backward Compatibility

In [RFC7432], the LIR flag of an IMET route is required to be zero when it is advertised and to be ignored on receipt.

This means that the LIR flag is reserved in IMET routes, but technically it can be used in the future. The constraints on future use of the LIR flag are no more severe than those on any other reserved PTA flag in IMET routes.

So when PE2 sets the LIR flag to one in an IMET route and sends it to PE1, PE2 does not expect that the IMET route must be answered by a Leaf A-D route. When no corresponding Leaf A-D route is received from PE1, the IMET route from PE1 is still used as per [RFC7432]. But when PE1 is a new PE following this draft, PE1 will indeed respond to the IMET route with a Leaf A-D route.

6. Solutions

6.1. Solution for Source-Squelching in Hub-Spoke Scenarios

    PEs1--------RR1--------PEh---------RR2--------PEs3
                /
    PEs2-------/

Figure 3: Hub PE and Spoke PEs

Take the above use case as an example: there are three spoke PEs and one hub PE. The spoke PEs are PEs1, PEs2 and PEs3. The hub PE is PEh. Two of the spoke PEs (PEs1 and PEs2) are connected to the same RR group, and the third one connects to another RR group.

Although we can advertise different EVPN labels for different RR groups, we can't advertise different EVPN labels for PEs1 and PEs2.

But PEh can request PEs1 or PEs2 to push the label of the context VC from them to PEh. Thanks to the context VC label, PEh can tell where a packet came from; in other words, PEh can decide where the packet must not be sent.

The signaling for the hub PE to request the spoke PE to push the context VC label will be added in future versions.

Note that although PEs1 and PEs2 can receive EVPN routes from each other, they will not import these routes because of the hub/spoke behavior.
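A non-normative sketch of the squelching decision on PEh follows. It assumes that PEh keeps a table from received context VC labels to the spoke PEs that push them; the function name, table name, and label values are hypothetical, and the request signalling is out of scope here.

```python
def allowed_egress_spokes(context_vc_label, spoke_by_label, spokes):
    """Spokes PEh may forward to, excluding the packet's source spoke.

    If the label is unknown (or absent), no spoke is excluded.
    """
    source = spoke_by_label.get(context_vc_label)
    return [spoke for spoke in spokes if spoke != source]


# Hypothetical context VC labels allocated by PEh for PEs1 and PEs2.
spoke_by_label = {1001: "PEs1", 1002: "PEs2"}
spokes = ["PEs1", "PEs2", "PEs3"]
targets = allowed_egress_spokes(1001, spoke_by_label, spokes)
```

Here a packet carrying label 1001 is known to come from PEs1, so PEs1 is removed from the candidate list even though PEs1 and PEs2 share the same EVPN label.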

6.2. Solution for per ingress statistics

We use CSL-based per-EVI context-VCs (see Section 5.1.2) to collect per-ingress statistics.

Note that the per-platform label space can be used as CLS1 at the same time. In that case, the inner context-VC label is similar to the downstream-assigned ESI label in its ILM-lookup behavior. Such a context-VC is also very similar to the shared context VC.

Note that when PE1 sends a Leaf A-D route with a CSL-Entry EC to PE2 but PE2 does not recognize the CSL-Entry EC, PE2 will encapsulate the EVPN label without the inner context-VC label. If CLS1 is actually identical to the per-platform label space, this works as well as [RFC7432], although per-ingress statistics cannot be collected.

Note that legacy PEs will not send a Leaf A-D route in response to an IMET route even if the LIR flag in the IMET route is set to one. So when legacy PEs and new PEs following this section coexist in the same EVI, they can interwork well, but only the new PEs can collect per-ingress statistics.
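The per-ingress accounting can be sketched, non-normatively, as counters keyed by EVI and ingress PE. The class name, the label values, and the "unknown" bucket for packets that arrive without a context-VC label (for example from a legacy PE) are assumptions of this sketch.

```python
from collections import Counter


class PerIngressStats:
    """Packet and octet counters keyed by (EVI, ingress PE)."""

    def __init__(self, ingress_by_vc_label):
        self.ingress_by_vc_label = dict(ingress_by_vc_label)
        self.packets = Counter()
        self.octets = Counter()

    def account(self, evi, context_vc_label, length):
        """Count one packet; context_vc_label is None if none was pushed."""
        ingress = self.ingress_by_vc_label.get(context_vc_label, "unknown")
        self.packets[(evi, ingress)] += 1
        self.octets[(evi, ingress)] += length
        return ingress


# Hypothetical context-VC labels for two ingress PEs.
stats = PerIngressStats({17: "PE2", 18: "PE3"})
stats.account("EVI1", 17, 1500)
stats.account("EVI1", 17, 64)
stats.account("EVI1", None, 128)   # legacy sender, no context-VC label
```

Traffic from legacy PEs is still forwarded and counted per EVI, but it cannot be attributed to a specific ingress PE.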

6.3. Solution for AR REPLICATOR in MPLS EVPN

    LEAF1--------REPLICATOR1--------RNVE1
                     /
    LEAF2-----------/

Figure 4: AR REPLICATOR in MPLS EVPN

When the REPLICATOR1 node receives an IMET route with AR-role = AR-LEAF from the LEAF1 node, REPLICATOR1 SHOULD respond to it with a Leaf A-D route with an AR-PTA. The MPLS label field of the AR-PTA (say the AR-PTA Label) is allocated following the same rules as the IR-PTA label in Section 5.1. When LEAF1 receives this Leaf A-D route, it treats the route as a Replicator-AR route for the same ORIP, and the control-plane procedures then follow [I-D.ietf-bess-evpn-optimized-ir]. When REPLICATOR1 receives data packets carrying the AR-PTA Label, REPLICATOR1 performs source squelching for LEAF1, which means that these data packets are not forwarded back to LEAF1.

Note that the old Replicator-AR route, which takes the form of an IMET route, is not used by an MPLS EVPN AR-REPLICATOR, because the Leaf A-D routes take its place on a per-AR-LEAF basis. The old Regular-IR route, however, can still be used by MPLS EVPN AR-REPLICATORs.

Note that the AR-REPLICATOR does not have to set the LIR flag of its IMET routes to one. We suggest that when receiving an IMET route with AR-role = AR-LEAF and tunnel-encapsulation = MPLS, the above Leaf A-D route SHOULD be generated for that IMET route, even if the LIR flag is set to zero.

6.4. Solution for anycast tunnel usage on SPE

       /--------SPE1-------\
     TPE1                   TPE2
       \--------SPE2-------/

Figure 5: SPE with Anycast Tunnel

Take the above use case as an example: the two SPEs are the egress nodes of an anycast SR-MPLS tunnel. The anycast SR-MPLS tunnel is used to transport flows from TPE1 to either SPE1 or SPE2 according to load-balancing procedures. So SPE1 and SPE2 have to advertise the same EVPN label independently for a given EVPN route.

6.4.1. Control-plane

In fact, SPE1 and SPE2 can simply inherit the EVPN label (say EVL4) from TPE2 and advertise it to TPE1 along with a context-VC label (say VCL4). The context-VC label is for the shared context-VC from TPE2 to SPE1 or SPE2. We can make the VC labels from TPE2 to SPE1 and SPE2 have the same value through configuration.

Note that the context-VCs can be established according to Section 4.2, in which case VCL4 has the same value as the label of the SR-LSP to TPE2. Note also that VCL4 identifies a label space (say the TPE2-specific CLS) that is dedicated to turning the EVPN label received from TPE2 into an incoming label on the SPEs. When SPE1 receives EVPN labels from different TPEs, SPE1 MUST use a different CLS for each TPE to install the corresponding ILM entry for label swapping.

6.4.2. Data-plane

The label stack on the anycast SR-MPLS tunnel is constructed by TPE1 as follows:

        +---------------------------------+
        |  underlay ethernet header       |
        +---------------------------------+
        |  Anycast SR-TL = SR_LSP_to_SPEx |
        +---------------------------------+
        |  Static CL = CL4                |
        +---------------------------------+
        |  Context-VC Label = VCL4        |
        +---------------------------------+
        |  EVPN label = EVL4              |
        +---------------------------------+
        |  overlay ethernet or IP header  |
        +---------------------------------+

Figure 6: Anycast SPE dataplane

Note that the SR Tunnel Label (TL) in the label stack is the SR-LSP label from TPE1 to SPE1 or SPE2.

Note that the context-VC is also constructed in a context label space (say CLS4), which is identified by a static label (say CL4). CLS4 is identified by the same CL4 on all PEs of the service domain, so the label stacks on the anycast tunnel are the same for SPE1 and SPE2.

SPE1/SPE2 then performs an ILM lookup for the EVPN label in the "TPE2-specific label space", which is identified by the context-VC label VCL4. The label operation is swap, and the new outgoing EVPN label has the same value.
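The nested lookup on the SPE can be sketched, non-normatively, as follows. The sketch assumes the anycast tunnel label has already been popped, models each TPE-specific label space as a dictionary, and returns only the service label toward the egress TPE (the outgoing tunnel label is omitted). All label values are hypothetical, as is the second egress, TPE3, which is added only to show why separate label spaces are needed.

```python
def spe_swap(stack, static_cl, ilm_by_vc_label):
    """Process [CL, context-VC label, EVPN label, ...] on an SPE.

    Returns (next_hop, outgoing service labels).
    """
    if len(stack) < 3 or stack[0] != static_cl:
        raise ValueError("stack does not start with the static CL")
    vc_label, evpn_label = stack[1], stack[2]
    # The context-VC label selects the TPE-specific label space; the
    # EVPN label is then looked up inside that space and swapped.
    out_label, next_hop = ilm_by_vc_label[vc_label][evpn_label]
    return next_hop, [out_label] + list(stack[3:])


CL4, VCL4, EVL4 = 1500, 16002, 30010   # hypothetical label values
ilm_by_vc_label = {
    VCL4: {EVL4: (EVL4, "TPE2")},      # TPE2-specific label space
    16003: {EVL4: (EVL4, "TPE3")},     # a second, hypothetical egress TPE
}
result = spe_swap([CL4, VCL4, EVL4], CL4, ilm_by_vc_label)
```

Because the same EVPN label value may be advertised by different TPEs, keeping one label space per context-VC label is what lets the SPE resolve EVL4 unambiguously.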

7. Security Considerations

This section will be added in future versions.

8. IANA Considerations

The IANA considerations for CSL-Entry EC in Section 5.1.2 will be added in future versions.

9. Acknowledgements

The authors would like to thank the following for their comments and review of this document:

Benchong Xu.

10. References

10.1. Normative References

[I-D.ietf-bess-evpn-bum-procedure-updates]
Zhang, Z., Lin, W., Rabadan, J., Patel, K., and A. Sajassi, "Updates on EVPN BUM Procedures", Work in Progress, Internet-Draft, draft-ietf-bess-evpn-bum-procedure-updates-08, <https://tools.ietf.org/html/draft-ietf-bess-evpn-bum-procedure-updates-08>.
[I-D.ietf-bess-evpn-optimized-ir]
Rabadan, J., Sathappan, S., Lin, W., Katiyar, M., and A. Sajassi, "Optimized Ingress Replication solution for EVPN", Work in Progress, Internet-Draft, draft-ietf-bess-evpn-optimized-ir-06, <https://tools.ietf.org/html/draft-ietf-bess-evpn-optimized-ir-06>.
[I-D.ietf-bess-mvpn-evpn-aggregation-label]
Zhang, Z., Rosen, E., Lin, W., Li, Z., and I. Wijnands, "MVPN/EVPN Tunnel Aggregation with Common Labels", Work in Progress, Internet-Draft, draft-ietf-bess-mvpn-evpn-aggregation-label-03, <https://tools.ietf.org/html/draft-ietf-bess-mvpn-evpn-aggregation-label-03>.
[RFC6514]
Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP Encodings and Procedures for Multicast in MPLS/BGP IP VPNs", RFC 6514, DOI 10.17487/RFC6514, <https://www.rfc-editor.org/info/rfc6514>.
[RFC7432]
Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, <https://www.rfc-editor.org/info/rfc7432>.

10.2. Informative References

[RFC4761]
Kompella, K., Ed. and Y. Rekhter, Ed., "Virtual Private LAN Service (VPLS) Using BGP for Auto-Discovery and Signaling", RFC 4761, DOI 10.17487/RFC4761, <https://www.rfc-editor.org/info/rfc4761>.
[RFC5331]
Aggarwal, R., Rekhter, Y., and E. Rosen, "MPLS Upstream Label Assignment and Context-Specific Label Space", RFC 5331, DOI 10.17487/RFC5331, <https://www.rfc-editor.org/info/rfc5331>.
[RFC8665]
Psenak, P., Ed., Previdi, S., Ed., Filsfils, C., Gredler, H., Shakir, R., Henderickx, W., and J. Tantsura, "OSPF Extensions for Segment Routing", RFC 8665, DOI 10.17487/RFC8665, <https://www.rfc-editor.org/info/rfc8665>.
[RFC8666]
Psenak, P., Ed. and S. Previdi, Ed., "OSPFv3 Extensions for Segment Routing", RFC 8666, DOI 10.17487/RFC8666, <https://www.rfc-editor.org/info/rfc8666>.
[RFC8667]
Previdi, S., Ed., Ginsberg, L., Ed., Filsfils, C., Bashandy, A., Gredler, H., and B. Decraene, "IS-IS Extensions for Segment Routing", RFC 8667, DOI 10.17487/RFC8667, <https://www.rfc-editor.org/info/rfc8667>.

Authors' Addresses

Yubao Wang
ZTE Corporation
No. 50 Software Ave, Yuhuatai District
Nanjing
China

Bing Song
ZTE Corporation
No. 50 Software Ave, Yuhuatai District
Nanjing
China