SRv6 and MPLS interworking

Cisco Systems, swaagraw@cisco.com
Cisco Systems, zali@cisco.com
Cisco Systems, cfilsfil@cisco.com
Bell Canada, Canada, daniel.voyer@bell.ca
LinkedIn, USA, gdawra.ietf@gmail.com
Huawei Technologies, China, lizhenbin@huawei.com

Routing
SPRING
interworking, Segment Routing

This document describes SRv6 and MPLS/SR-MPLS interworking and co-existence
procedures.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119.

The incremental deployment of SRv6 into existing networks requires SRv6 to
interwork and co-exist with SR-MPLS/MPLS. This document introduces interworking
scenarios and building blocks for solutions to interconnect them.

This document assumes SR-MPLS-IPv4 for MPLS domains, but the design works equally
for SR-MPLS-IPv6, LDP-IPv4/IPv6 and RSVP-TE-MPLS label binding protocols.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP
14 when, and only
when, they appear in all capitals, as shown here.

A multi-domain network can be
generalized as a central domain C with many leaf domains around it.
Specifically, the document looks at a service flow from an ingress PE in an
ingress leaf domain (LI), through the C domain and up to an egress PE of
the egress leaf domain (LE). Each domain runs its own IGP instance.
A domain has a single data plane type applicable both for its overlay
and its underlay.

Various SRv6 and SR-MPLS-IPv4 interworking scenarios are possible.

The scenarios below cover various cascading combinations of SRv6 and MPLS
networks, e.g.,
SR-MPLS-IPv4 <-> SRv6 <-> SR-MPLS-IPv4 <-> SRv6 <->
SR-MPLS-IPv4, though not all combinations are described, for brevity.

Provider edge devices run MPLS-based or
SRv6 Service SID-based BGP L3 (e.g. VPN) or
L2 (e.g. EVPN) services through service Route Reflectors. Service endpoint
signaling through border routers and the corresponding forwarding state
provide interworking over the intermediate transport domain.
SRv6 over MPLS (6oM)

LI and LE domains are SRv6 data plane, C is MPLS data plane.

L3/L2 BGP SRv6 services extend
between PEs. The ingress PE encapsulates the payload in an outer IPv6
header where the SRv6 Service SID is the last segment or
destination address (DA).

Transport IW border nodes forward SRv6-encapsulated traffic destined to
the egress PE over the MPLS C domain.

MPLS over SRv6 (Mo6)

LI and LE domains are MPLS data plane, C is SRv6 data plane.

L3/L2 BGP MPLS services extend between PEs.
The ingress PE encapsulates the payload in an MPLS service label and
sends it through an MPLS LSP to the egress PE.

Transport IW nodes forward the encapsulated label stack to the egress PE over
the SRv6 C domain.

Note: The easiest and most probable deployment is "ships in the night", i.e.
supporting dual stack and IPv4 MPLS in each domain.

Service IW covers L3/L2 service signaling discontinuity, i.e. an SRv6 service
SID-based PE
interworks with a BGP MPLS-based PE for service connectivity. L3/L2 service
BGP signaling and forwarding state provide interworking over the intermediate
domain.

SRv6 to MPLS (6toM): The ingress PE encapsulates the payload
in an outer IPv6 header where the destination address is the SRv6
Service SID. The payload is
delivered to the egress PE with the MPLS service label
that it advertised with service prefixes.

MPLS to SRv6 (Mto6): The ingress PE encapsulates the payload
in an MPLS service label. The payload is delivered to the egress PE in an IPv6
header whose destination address is the SRv6 service SID that it advertised with
service prefixes.
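
The four scenario names above can be summarized in a small sketch. This is an illustrative classifier only, not part of the specification: the `Plane` enum, both function names and the "none" return value are assumptions made for this example.

```python
from enum import Enum


class Plane(Enum):
    """Data plane type of a domain, or of a PE's service encapsulation."""
    SRV6 = "SRv6"
    MPLS = "MPLS"


def transport_scenario(li: Plane, c: Plane, le: Plane) -> str:
    """Name the transport IW case for the LI, C and LE domain data planes."""
    if li is not le:
        # Assumption of this sketch: both leaf domains share one data plane.
        raise ValueError("LI and LE are expected to use the same data plane")
    if li is c:
        return "none"  # same data plane end to end, nothing to interwork
    return "6oM" if li is Plane.SRV6 else "Mo6"


def service_scenario(ingress_pe: Plane, egress_pe: Plane) -> str:
    """Name the service IW case for the ingress and egress PE encapsulations."""
    if ingress_pe is egress_pe:
        return "none"
    return "6toM" if ingress_pe is Plane.SRV6 else "Mto6"
```

For instance, `transport_scenario(Plane.SRV6, Plane.MPLS, Plane.SRV6)` names the 6oM case, while `service_scenario(Plane.MPLS, Plane.SRV6)` names the Mto6 case.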
The following terms used within this document are defined in
: Segment Routing, SR-MPLS, SRv6, SR Domain,
Segment ID (SID), SRv6 SID, Prefix-SID.

Domain: Without loss of generality, a domain is assumed to be instantiated
by a single IGP instance, or by a network within an IGP if there is a clear
separation of data plane.

Node k has a classic IPv6 loopback address Ak::1/128.

A SID at node k with locator block B and function F is represented by B:k:F::

A SID list is represented as <S1, S2, S3> where S1 is the first SID
to visit, S2 is the second SID to visit and S3 is the last SID to
visit along the SR path.

(SA,DA) (S3, S2, S1; SL) represents an IPv6 packet with:

IPv6 header with source address SA, destination address DA and
SRH as next-header

SRH with SID list <S1, S2, S3> with SegmentsLeft = SL

Note the difference between the <> and () symbols: <S1, S2, S3>
represents a SID list where S1 is the first SID and S3 is the last
SID to traverse. (S3, S2, S1; SL) represents the same SID list but
encoded in the SRH format where the rightmost SID in the SRH is the
first SID and the leftmost SID in the SRH is the last SID. When
referring to an SR policy in a high-level use-case, it is simpler
to use the <S1, S2, S3> notation. When referring to an
illustration of the detailed packet behavior, the (S3, S2, S1; SL)
notation is more convenient.

This document introduces new SRv6 SID behaviors. These behaviors are executed
on border routers between the SRv6 and MPLS domains.

The "Endpoint with decapsulation and MPLS table lookup" behavior
("End.DTM" for short).

The End.DTM SID MUST be the last segment in an SR Policy, and a SID
instance is associated with an MPLS table.

When N receives a packet destined to S and S is a local End.DTM SID,
N does:

The "Endpoint with decapsulation and MPLS label push" behavior
("End.DPM" for short).

The End.DPM SID MUST be the last segment, and a SID
instance is associated with a label stack.

When N receives a packet destined to S and S is a local End.DPM SID,
N does:

The H.Encaps.M behavior encapsulates a received MPLS label stack
packet in an IPv6 header with an SRH.
Together, the MPLS label stack and its payload become the payload of
the new IPv6 packet. The Next Header field of the SRH MUST be set
to 137.

The H.Encaps.M.Red behavior is an optimization of the H.Encaps.M
behavior. H.Encaps.M.Red reduces the length of the SRH by excluding
the first SID in the SRH of the pushed IPv6 header. The first SID is
only placed in the Destination Address field of the pushed IPv6 header.
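
To make the difference between H.Encaps.M and H.Encaps.M.Red concrete, here is a minimal sketch that builds the outer header fields from a SID list written in the <S1, S2, S3> notation. The `Encapsulated` class, the `h_encaps_m` function and the service label value used in the usage note are assumptions for illustration. The sketch also assumes that SegmentsLeft is the number of SIDs minus one in both variants, and that the reduced variant always drops the SRH when the policy has a single SID.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

MPLS_IN_IP = 137  # Next Header value that announces an MPLS payload


@dataclass(frozen=True)
class Encapsulated:
    source: str
    destination: str                 # first SID of the policy
    srh: Optional[Tuple[str, ...]]   # SRH order (last SID first); None if no SRH
    segments_left: int
    next_header: int                 # of the SRH, or of the IPv6 header if no SRH
    labels: Tuple[int, ...]          # received MPLS label stack, carried as payload


def h_encaps_m(source: str, sid_list: Sequence[str], labels: Sequence[int],
               reduced: bool = False) -> Encapsulated:
    """Sketch of H.Encaps.M (reduced=False) and H.Encaps.M.Red (reduced=True)."""
    if not sid_list:
        raise ValueError("SID list must not be empty")
    # The reduced variant keeps the first SID only in the destination address.
    carried = sid_list[1:] if reduced else sid_list
    # Assumption: with nothing left to carry, the SRH is omitted entirely.
    srh = tuple(reversed(carried)) if carried else None
    return Encapsulated(
        source=source,
        destination=sid_list[0],
        srh=srh,
        segments_left=len(sid_list) - 1,
        next_header=MPLS_IN_IP,
        labels=tuple(labels),
    )
```

With the SID list <B:5:E::, B:7:DTM::> and `reduced=True`, the result has destination B:5:E:: and an SRH carrying only B:7:DTM:: with SegmentsLeft = 1; with `reduced=False` the SRH carries both SIDs.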
The push of the SRH MAY be omitted when the SRv6 Policy only contains
one segment and there is no need to use any flag, tag or TLV. In such a
case, the Next Header field of the IPv6 header MUST be set to 137.

A Binding Segment (BSID) is bound to an SR policy. Furthermore,
an SR-MPLS label can be bound to an SRv6 Policy and an SRv6 SID can be bound to
an SR-MPLS Policy. The IW SR-PCE solution leverages these
BSIDs as segments of the SR policy on the headend domain to represent an
intermediate domain of a different data plane type. In summary, an intermediate
domain of a different data plane type is represented in the SID list by a BSID
of the ingress domain's data plane type.

The reference figure shows the multi-domain
network topology and its description. The
procedures in this section are illustrated using this topology.

The following is assumed for data plane support of the various nodes:

Nodes 2, 3, 5, 6, 8 and 9 are provider (P) routers, which need to support
a single data plane type.

Nodes 1 and 10 are PEs. They support a single data plane type in overlay and
underlay.

Border routers 4 and 7 need to support both the SRv6 and SR-MPLS-IPv4
data planes.

A VPN route is advertised via service RRs (S-RR) between an egress PE (node 10)
and an ingress PE (node 1).

For illustration, the SRGB range starts from 16000 and the prefix SID of a node
is 16000 plus the node number.

As described earlier, transport IW requires:

For 6oM, tunneling traffic destined to the SRv6 Service SID of the egress PE
over the MPLS C domain.

For Mo6, tunneling the MPLS label stack bound to the IPv4 loopback address of
the egress PE over the SRv6 C domain.

This draft enhances two well-known solutions to achieve the above:

An SR-PCE multi-domain On Demand Next-hop (ODN)
SR policy stitching
end to end across different data plane domains using
interconnecting binding SIDs. These procedures can be used
when overlay prefixes are signaled with a color extended community.

BGP inter-domain routing procedures advertising the PE locator or IPv4 loopback
address for best-effort end-to-end connectivity.
This procedure provides a best-effort path as well as a path that satisfies
the intent (e.g. low latency), across multiple domains. Service routes
(VPN/EVPN) are received on ingress PE with color extended community from
egress PE. A Color is a 32-bit numerical value that associates an SR Policy
with an intent .
The ingress PE does not know how to compute the traffic-engineered path through
the multi-domain network to the egress PE and requests it from the SR-PCE. The
SR-PCE is aware of the interworking requirement at border nodes, as it is fed
with BGP-LS topological information from each domain. It programs an
intermediate-domain, data-plane-specific policy on border nodes for the given
intent and represents it in the end-to-end path SID list on the ingress PE,
leveraging BSIDs.

The sections below describe 6oM and Mo6 IW with SR-PCE.

A service prefix (e.g. VPN or EVPN) is received on the head-end (node 1)
with color extended community (C1) from the egress PE (node 10) with an
SRv6 service SID. The PCE computes the (C1,10) path via nodes 2, 5 and 8.
It programs an SR policy at border node 4 with a segment list of nodes 5 and 7,
bound to an End.BM BSID. The SR-PCE responds
to node 1 with SRv6 segments along the path meeting the required SLA, including
the End.BM BSID at node 4 to traverse the SR-MPLS-IPv4 C domain.

For example, the SR-PCE creates SR-MPLS policy (C1,7) at node 4 with
segments <16005,16007>. It is bound to the End.BM behavior with
SRv6 BSID B:4:BM-C1-7::

The data plane operations for the above-mentioned interworking example
are described in the following:

Node 1 performs SRv6 function H.Encaps.Red with the VPN service SID
and SRv6 Policy (C1,10).
Packet leaving node 1: IPv6 ((A:1::, B:2:E::) (B:10::DT4, B:8:E::,
B:4:BM-C1-7:: ; SL=3))

Node 2 performs the End function.
Packet leaving node 2: IPv6 ((A:1::, B:4:BM-C1-7::) (B:10::DT4,
B:8:E::, B:4:BM-C1-7:: ; SL=2))

Node 4 (border router) performs the End.BM function.
Packet leaving node 4: MPLS (16005,16007,2) ((A:1::, B:8:E::)
(B:10::DT4, B:8:E::, B:4:BM-C1-7:: ; SL=1))

Node 7 performs a native IPv6 lookup due to the PHP behavior for 16007.
Packet leaving node 7: IPv6 ((A:1::, B:8:E::) (B:10::DT4, B:8:E::,
B:4:BM-C1-7:: ; SL=1))

Node 8 performs the End (PSP) function.
Packet leaving node 8: IPv6 ((A:1::, B:10::DT4))

Node 10 performs the End.DT function, looks up the IP destination in the VRF
and sends traffic to the CE.

Refer for Mo6 scenario.
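
The 6oM data plane steps listed above (node 1 through node 10) can be replayed with a short sketch. It is a simplified model, not an implementation: the `Packet` class and the helper names are assumptions, the SRH is stored in (S3, S2, S1) order, and the trailing label 2 (IPv6 explicit null) is taken from the example packet rather than from a normative rule.

```python
from dataclasses import dataclass, replace
from typing import Optional, Sequence, Tuple

IPV6_EXPLICIT_NULL = 2  # last label in the example packet leaving node 4


@dataclass(frozen=True)
class Packet:
    sa: str
    da: str
    srh: Optional[Tuple[str, ...]]  # SRH order: index 0 holds the last SID
    sl: int                         # SegmentsLeft
    labels: Tuple[int, ...] = ()    # MPLS labels in front of the IPv6 header


def end(pkt: Packet, psp: bool = False) -> Packet:
    """End: decrement SegmentsLeft and copy the next SID into the DA."""
    if pkt.srh is None or pkt.sl == 0:
        raise ValueError("no segment left to process")
    sl = pkt.sl - 1
    da = pkt.srh[sl]
    if psp and sl == 0:
        # Penultimate segment pop: the SRH is removed with the last update.
        return replace(pkt, da=da, srh=None, sl=0)
    return replace(pkt, da=da, sl=sl)


def end_bm(pkt: Packet, policy_labels: Sequence[int]) -> Packet:
    """End.BM: End processing, then push the bound SR-MPLS policy labels."""
    inner = end(pkt)
    return replace(inner, labels=tuple(policy_labels) + (IPV6_EXPLICIT_NULL,))


def mpls_domain_exit(pkt: Packet) -> Packet:
    """All transport labels are gone once the packet reaches node 7."""
    return replace(pkt, labels=())


leaving_1 = Packet("A:1::", "B:2:E::",
                   ("B:10::DT4", "B:8:E::", "B:4:BM-C1-7::"), 3)
leaving_2 = end(leaving_1)                     # node 2, End
leaving_4 = end_bm(leaving_2, (16005, 16007))  # node 4, End.BM
leaving_7 = mpls_domain_exit(leaving_4)        # node 7, native IPv6 lookup
leaving_8 = end(leaving_7, psp=True)           # node 8, End with PSP
```

Each `leaving_N` value corresponds to the "Packet leaving node N" line of the walk-through, so the sketch can be used to check the SegmentsLeft and destination address values by hand.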
An MPLS service prefix (e.g. VPN or EVPN) is received on the head-end (node 1)
with color extended community (C1) from the egress PE (node 10). The PCE
computes the color-aware C1 path via nodes 2, 5 and 8. It programs an SRv6
policy bound to an MPLS BSID at border node 4, with an SRv6 segment list along
the required color-aware path whose last segment has behavior End.DTM.
The SR-PCE responds to node 1 with an MPLS segment list including the MPLS BSID
of the SRv6 policy at node 4 to traverse the SRv6 core domain.

For example, the SR-PCE creates SRv6 policy (C1,7) at node 4 with segments
<B:5:E::,B:7:DTM::>. It is bound to MPLS BSID 24407.

The data plane operations for the above-mentioned interworking example
are described in the following:

Node 1 performs MPLS label stack encapsulation with the VPN label and
SR-MPLS Policy (C1,10).
Packet leaving node 1 towards node 2 (note: PHP of node 2 prefix SID):
MPLS packet (16004,24407,16008,16010,vpn_label)

Node 2 forwards traffic towards node 4 (PHP of 16004).
Packet leaving node 2:
MPLS packet (24407,16008,16010,vpn_label)

Node 4 steers MPLS traffic into the SRv6 policy bound to 24407.
Packet leaving node 4:
IPv6 ((A:4::, B:5:E::) (B:7:DTM:: ; SL=1) NH=137) MPLS (16008,16010,vpn_label)

Node 7 receives an IPv6 packet with DA=B:7:DTM::. It performs the End.DTM
behavior to remove the IPv6 header and looks up 16008 in the MPLS table.
Packet leaving node 7 towards node 8 (PHP of 16008):
MPLS packet (16010,vpn_label)

Node 8 forwards traffic towards node 10 (PHP of 16010).
Packet leaving node 8:
MPLS packet (vpn_label)

Node 10 performs a vpn_label lookup and sends traffic to the CE.

The procedures described below build upon BGP 3107
and to
advertise transport reachability for PE IPv4 loopbacks or SRv6 locators
across a multi-domain network. The procedures leverage the existing BGP
AFI/SAFIs BGP IPv6 Unicast (2/1) and BGP-LU (1/4, 2/4). Next-hop-self on border
routers provides independence from the intra-domain tunnel technology used in
different domains.
The sections below describe 6oM and Mo6 IW with BGP procedures for
best-effort paths to a locator or loopback prefix. The procedures are
equally applicable to intent-aware paths, i.e., a locator assigned for a
given intent, for instance from an IGP-FlexAlgo. They are also applicable
to color-aware routes recursing over
intent-aware intra-domain paths.

Refer for 6oM scenario.

SRv6-based L3/L2 BGP services are signaled with an SRv6 Service SID
between PEs through Service RRs with no color extended community.
Ingress PEs need reachability to the remote locator to send traffic to the SRv6
service SID.
The egress border router learns local PE locators through the IGP.
These should be redistributed into BGP like any IPv6 global prefixes.
Alternatively, the locator is advertised by the PE in the BGP IPv6 unicast
address family (AFI=2, SAFI=1) to border nodes.

The egress border router advertises LE domain PE locators in
BGP IPv6 LU (AFI=2, SAFI=4) with a local label (explicit NULL) to the ingress
border router with IPv4 next hops. These next hops have SR-MPLS-IPv4
LSP paths built in the C domain. It may advertise a summary prefix covering
all locators in the LE domain.

If the ingress border router advertises remote locators in the LI domain
to the ingress PE in the BGP address family (AFI=2, SAFI=1), it attaches a
local End behavior SID as the SRv6 SID in Prefix-SID attribute TLV type 5.
Alternatively, it may leak remote locators into the LI IGP domain such that
P routers also have reachability.

The ingress PE learns the remote locator over the BGP IPv6 address family
(AFI=2, SAFI=1) or through the LI IGP. When learnt through BGP, the SRv6 SID
carried in Prefix-SID attribute TLV type 5 tunnels traffic to the ingress border
node in the LI domain, as P routers (nodes 2 and 3) will not be aware of the
remote locator.

Control plane example:

Routing Protocol (RP) @ 10:
In ISIS, advertise locator B:10::/48.
BGP AFI=1, SAFI=128 originates a VPN route RD:V/v via B:10::1 with
Prefix-SID attribute B:10:DT4::. This route is advertised to the service RR.

RP @ 7:
ISIS redistributes B:10::/48 into BGP.
BGP originates B:10::/48 in AFI=2/SAFI=4 with next hop node 7 and
label explicit null among border routers.

RP @ 4:
BGP learns B:10::/48 with next hop node 7 and outgoing label.
BGP advertises B:10::/48 in AFI=2/SAFI=1 with next hop B:4::1 and
Prefix-SID attribute TLV type 5 carrying the local End behavior
function B:4:END:: to node 1.
Alternatively, BGP redistributes the remote locator or a summary route into the
LI domain IGP.

RP @ 1:
BGP learns B:10::/48 via B:4::1 and Prefix-SID attribute TLV type
5 with SRv6 SID B:4:END::
Alternatively, B:10::/48 or summary route reachability is learned
through ISIS.
BGP AFI=1, SAFI=128 learns service prefix RD:V/v, next hop B:10::1 and
Prefix-SID attribute TLV type 5 with SRv6 SID B:10:DT4::

FIB state

Refer for Mo6 scenario.
MPLS-based L3/L2 BGP services are signaled with the IPv4 next hop of the PE
through Service RRs with no color extended community. The ingress PE needs
labelled reachability to the remote PE IPv4 loopback address advertised as
next hop with service routes.

BGP-LU advertises IPv4 PE
loopbacks. Next-hop-self is performed on border routers.

The following are options and protocol extensions
to tunnel the IPv4 PE loopback LSP through the SRv6 C domain.

Intuitive solution for an MPLS-minded operator.

Existing BGP-LU label cross-connect on border routers for each PE
IPv4 loopback address.

The lookups at the ingress border router are based on the BGP 3107
label as usual.

Only the SR-MPLS IGP label to the next hop is replaced by an
IPv6 tunnel with DA = SRv6 SID associated with the DTM behavior in the C domain.

Ingress border router forwarding performs a 3107 label swap and
H.Encaps.M with DA = SRv6 SID associated with the DTM behavior.

Similar to MPLS-over-IP.

Existing BGP-LU updates between border routers signal the SRv6 SID
associated with the DTM behavior.

A "SRv6 tunnel for label route" TLV of the BGP Prefix-SID Attribute is proposed
to signal the SRv6 SID used to tunnel an MPLS packet, with the label in the NLRI
at the top of its label stack, through the SRv6/IPv6 domain. The following
describes the control plane
and corresponding FIB state to achieve such tunneling:

Control plane example

Routing Protocol (RP) @ 10:
ISIS originates its IPv4 PE loopback with Node SID 16010.
BGP AFI=1, SAFI=4 originates the IPv4 loopback address with next hop
node 10 and optionally label index=10 in the Label-Index TLV of the
Prefix-SID attribute.
BGP AFI=1, SAFI=128 originates a VPN route RD:V/v with next hop node 10.
This route is advertised to the service RR.

RP @ 7:
ISIS v6 advertises locator B:7::/48 in the C domain.
BGP learns the node 10 IPv4 loopback address with an outgoing label.
It allocates a local label (based on the label index if present) and
programs a label swap to the outgoing label and the MPLS LSP to the next hop.
BGP AFI=1, SAFI=4 advertises the IPv4 loopback address of node 10
to node 4. The NLRI label is set to the local label, and SRv6 SID B:7:DTM:: is
carried in the SRv6 SID Information Sub-TLV of the
"SRv6 tunnel for label route" TLV in the Prefix-SID attribute. If received,
label index=10 in the Label-Index TLV of the Prefix-SID attribute is also
signaled.

RP @ 4:
ISIS v4 originates its IPv4 loopback with prefix SID 16004
in the LI domain.
BGP learns the node 10 IPv4 loopback address from node 7 with an
outgoing label. It allocates a local label (based on the label index
if present) and programs a label swap and H.Encaps.M.Red with the
IPv6 header destination address set to the SRv6 SID received in the
"SRv6 tunnel for label route" TLV of the Prefix-SID attribute, i.e. B:7:DTM::.
BGP AFI=1, SAFI=4 advertises the IPv4 loopback address of node 10 to
node 1. The NLRI label is set to the local label, and the
"SRv6 tunnel for label route" TLV is not signaled in the Prefix-SID attribute.

RP @ 1:
BGP learns the IPv4 loopback address of node 10 from node 4 with an
outgoing label. It programs a route to push the outgoing label and
the MPLS LSP to the next hop, i.e. node 4.
BGP AFI=1, SAFI=128 learns service prefix RD:V/v, next hop
IPv4 loopback address of node 10 and the service label.

Forwarding state at different nodes:

During transition, when the MPLS data plane is still enabled in the C domain, an
ABR that does not understand the "SRv6 tunnel for label route" TLV in the
BGP Prefix-SID Attribute, or that follows operator-configured local policy,
can continue MPLS encapsulation using the label in the NLRI and the LSP to the
next hop.
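
The forwarding state that the control plane example above would produce at border nodes 4 and 7 can be sketched as two lookup tables. This is a hedged illustration only: the table names, the dictionary packet model and the outgoing label 24010 at node 7 are hypothetical, and the sketch assumes both border routers derive their local BGP-LU label from label index 10 and an SRGB starting at 16000. The intra-domain LSP label towards the next hop is left out.

```python
from typing import Dict, List, Tuple

SRGB_BASE = 16000
MPLS_IN_IP = 137
LABEL_INDEX_NODE10 = 10

# Assumption: local BGP-LU labels follow the label index on both border routers.
LU_LABEL = SRGB_BASE + LABEL_INDEX_NODE10

# Node 4: local label -> (label advertised by node 7, SRv6 SID signaled by node 7).
NODE4_FIB: Dict[int, Tuple[int, str]] = {LU_LABEL: (LU_LABEL, "B:7:DTM::")}

# Node 7: local label -> outgoing label learnt from the LE domain (hypothetical).
NODE7_MPLS_TABLE: Dict[int, int] = {LU_LABEL: 24010}


def node4_forward(labels: List[int]) -> dict:
    """Label swap, then H.Encaps.M.Red with a single SID (no SRH needed)."""
    out_label, dtm_sid = NODE4_FIB[labels[0]]
    return {
        "sa": "A:4::",
        "da": dtm_sid,
        "next_header": MPLS_IN_IP,  # IPv6 Next Header, since the SRH is omitted
        "labels": [out_label] + labels[1:],
    }


def node7_forward(packet: dict) -> List[int]:
    """End.DTM: drop the IPv6 header, look up the exposed top label, swap it."""
    if packet["da"] != "B:7:DTM::" or packet["next_header"] != MPLS_IN_IP:
        raise ValueError("not an End.DTM packet for node 7")
    labels = packet["labels"]
    return [NODE7_MPLS_TABLE[labels[0]]] + labels[1:]
```

Feeding `node4_forward([LU_LABEL, service_label])` into `node7_forward` shows the service label passing through the SRv6 C domain untouched while only the BGP-LU label is swapped at each border router.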
For each PE IPv4 loopback address, the existing BGP 3107 label cross-connect
on the area border router is replaced by a label to SRv6 SID cross-connect,
or vice versa. In effect, it creates a translation from the 3107
label to an SRv6 SID at the ingress of the SRv6 domain, and from the SRv6 SID
to the 3107 label at the egress.

For each BGP-LU route (IPv4 loopback address of a PE) received
from the LE domain on the egress border router, allocate an SRv6 SID of DPM
behavior bound to the PE address. Lookup of the SRv6 SID results in
decapsulation of the IPv6 header and a push of the BGP-LU outgoing label and
the MPLS LSP to the next hop.
Advertise the BGP route to the PE address with the SRv6 SID to the ingress
border router.

The ingress border router allocates a local label and advertises it to the LI
domain. The lookups at the ingress border router are based on the BGP 3107
label as usual. The lookup results in the SRv6 SID of DPM behavior signaled by
the egress border node. Decapsulate the BGP 3107 label and perform H.Encaps.M
with DA = SRv6 SID.

Section 2.2 of
describes how an existing BGP advertisement can signal the SRv6 SID associated
with the DPM behavior from the egress to the ingress border router.

As described earlier, service IW needs BGP SRv6-based L2/L3 PEs to interwork
with BGP MPLS-based L2/L3 PEs.

There are a number of different ways of handling this scenario, as
detailed below.

A gateway is a router that supports both BGP SRv6-based L2/L3 services
and BGP MPLS-based L2/L3 services for a service instance
(e.g. L3 VRF, EVPN EVI). It terminates service encapsulation and
performs an L2/L3 destination lookup in the service instance.

A border router between the SRv6 domain and the SR-MPLS-IPv4 domain is
suitable for the gateway role.

Transport reachability to SRv6 PE and gateway locators in the SRv6 domain,
or MPLS LSPs to PE/gateway IPv4 loopbacks, can be exchanged in the IGP or
through the transport IW mechanisms detailed earlier.

The gateway exchanges BGP L2/L3 service prefixes with SRv6-based service PEs
via a set of service RRs. This session learns and advertises L3/L2 service
prefixes with the SRv6 service SID in the Prefix-SID attribute.

The gateway exchanges BGP L2/L3 service prefixes with MPLS-based service
PEs via a set of distinct service RRs. This session learns and advertises
L3/L2 service prefixes with service labels.

An L2/L3 prefix received from one domain is locally installed in the service
instance and re-advertised to the other domain with modified service
encapsulation information.

A prefix learned with an SRv6 service SID from an SRv6 PE is installed
in the service instance with an instruction to perform H.Encaps.
It is advertised to the MPLS service PE with a service label.
When the gateway receives traffic with the service label from an MPLS service
PE, it performs a destination lookup in the service instance. The lookup results
in an instruction to perform H.Encaps with the DA being the SRv6 Service SID
learnt with the prefix from the SRv6 PE.

A prefix learned with an MPLS service label from an MPLS service PE is
installed in the service instance with an instruction to perform service label
encapsulation and send to the MPLS LSP to the next hop. It is advertised to the
SRv6 service PE with an SRv6 service SID of the appropriate behavior
(e.g. DT4/DT6/DT2U). When the gateway
receives traffic with the SRv6 Service SID as DA of the IPv6 header from an
SRv6 service PE, it performs a destination lookup in the service instance
after decapsulation of the IPv6 header. The lookup results in an instruction to
push the service label and send the packet to the next hop.

A couple of border routers can act as gateways for redundancy. The solution can
scale horizontally by distributing service instances among them.

This is similar to the inter-AS option B procedures, except that the service
label cross-connect
on the border router is replaced with a service label to SRv6 service SID, or
vice versa, translation on the IW node.
The IW node does not need a service instance like a VRF or EVI.

The IW node exchanges BGP L2/L3 service prefixes with SRv6-based service PEs
via a set of service RRs. This BGP session learns and advertises L3/L2 service
prefixes with the SRv6 service SID in the Prefix-SID attribute.

The IW node exchanges BGP L2/L3 service prefixes with MPLS-based service
PEs via a set of distinct service RRs. This BGP session learns and advertises
L3/L2 service prefixes with service labels.

The IW node allocates an SRv6 SID of behavior End.DPM that results in pushing
the service label and the MPLS label stack to the service next hop for a BGP
L2/L3 service learnt from an MPLS PE. It advertises the service to the SRv6
domain.
The IW node allocates a service label that results in H.Encaps with the IPv6
header DA set to the SRv6 SID signaled in the BGP L2/L3 service learnt from an
SRv6 PE.
It advertises the service to the MPLS domain with the allocated service label.
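
The two translations performed by the IW node can be sketched as a pair of tables, one per direction. The `IwNode` class, its method names, the SID naming scheme `B:4:DPMn::` and the starting label value 30000 are all assumptions for illustration; the text above does not prescribe how SIDs or labels are allocated.

```python
from itertools import count
from typing import Dict, Tuple


class IwNode:
    """Sketch of service label <-> SRv6 service SID translation state."""

    def __init__(self, locator: str = "B:4", first_label: int = 30000) -> None:
        self.locator = locator
        self._next_label = count(first_label)
        self._next_function = count(1)
        # End.DPM SID -> (service label to push, MPLS service next hop)
        self.dpm_sids: Dict[str, Tuple[int, str]] = {}
        # Local service label -> remote SRv6 service SID used as H.Encaps DA
        self.service_labels: Dict[int, str] = {}

    def learn_from_mpls_pe(self, service_label: int, next_hop: str) -> str:
        """Allocate an End.DPM SID to advertise towards the SRv6 domain."""
        sid = f"{self.locator}:DPM{next(self._next_function)}::"
        self.dpm_sids[sid] = (service_label, next_hop)
        return sid

    def learn_from_srv6_pe(self, service_sid: str) -> int:
        """Allocate a service label to advertise towards the MPLS domain."""
        label = next(self._next_label)
        self.service_labels[label] = service_sid
        return label

    def forward_srv6_to_mpls(self, destination: str) -> Tuple[int, str]:
        """Traffic to an End.DPM SID: decapsulate, push label, send to next hop."""
        return self.dpm_sids[destination]

    def forward_mpls_to_srv6(self, label: int) -> str:
        """Traffic with a local service label: H.Encaps with this SID as DA."""
        return self.service_labels[label]
```

Because both tables are keyed by the locally allocated identifier, the IW node needs no per-service routing lookup, which matches the statement that it does not need a VRF or EVI.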
In addition, the draft also addresses migration and co-existence of
SRv6 and SR-MPLS-IPv4. Co-existence means a network that supports
both SRv6 and MPLS in a given domain. This may be a transient state when a
brownfield SR-MPLS-IPv4 network upgrades to SRv6 (migration), or a permanent
state when some devices are not capable of SRv6 but support native IPv6
and SR-MPLS-IPv4.

These procedures will be detailed in a future revision.

Failures within a domain are taken care of by existing FRR
mechanisms.

The procedures listed in
provide protection in the SR-PCE multi-domain On Demand Next-hop (ODN) SR policy
based approach.

Convergence on failure of border routers can be achieved by well-known methods
for the BGP inter-domain routing approach:

BGP Add-Path provides diverse path visibility.

BGP backup path pre-programming.

Sub-second convergence on border router failure notified by the local IGP.

This document introduces the new SRv6 Endpoint behaviors "End.DTM" and
"End.DPM".
IANA is requested to assign identifier values in the
"SRv6 Endpoint Behaviors" sub-registry under the "Segment Routing Parameters"
registry.
The authors would like to acknowledge Kamran Raza, Dhananjaya Rao,
Stephane Litkowski, Pablo Camarillo, Ketan Talaulikar