SR-MPLS over IP

Alibaba: xiaohu.xxh@alibaba-inc.com
Huawei: stewart.bryant@gmail.com
Juniper: afarrel@juniper.net
Cisco: bashandy@cisco.com
Nokia: wim.henderickx@nokia.com
Huawei: lizhenbin@huawei.com

MPLS Segment Routing (SR-MPLS in short) is an MPLS data plane-based
source routing paradigm in which the sender of a packet is allowed to
partially or completely specify the route the packet takes through the
network by imposing stacked MPLS labels on the packet. SR-MPLS could be
leveraged to realize a source routing mechanism across MPLS, IPv4, and
IPv6 data planes by using an MPLS label stack as a source routing
instruction set while preserving backward compatibility with
SR-MPLS.

This document describes how SR-MPLS capable routers and IP-only
routers can seamlessly co-exist and interoperate through the use of
SR-MPLS label stacks and IP encapsulation/tunneling such as
MPLS-in-UDP.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP 14
when, and only when, they appear in all capitals, as shown here.

MPLS Segment Routing (SR-MPLS in short) is an MPLS data
plane-based source routing paradigm in which the sender of a packet is
allowed to partially or completely specify the route the packet takes
through the network by imposing stacked MPLS labels on the packet.
SR-MPLS could be leveraged to realize a source routing mechanism across
MPLS, IPv4, and IPv6 data planes by using an MPLS label stack as a
source routing instruction set while preserving backward compatibility
with SR-MPLS. More specifically, the source routing instruction set
information contained in a source routed packet could be uniformly
encoded as an MPLS label stack no matter whether the underlay is IPv4,
IPv6, or MPLS.

This document describes how SR-MPLS capable routers and IP-only
routers can seamlessly co-exist and interoperate through the use of
SR-MPLS label stacks and IP encapsulation/tunneling such as
MPLS-in-UDP.

Although the source routing instructions are encoded as MPLS labels,
this is a hardware convenience rather than an indication that the whole
MPLS protocol stack needs to be deployed. In particular, the MPLS
control protocols are not used in this or any other form of SR-MPLS.

 describes various use cases for
tunneling SR-MPLS over IP. describes a typical
application scenario and how the packet forwarding happens. describes
the forwarding procedures of different
elements when UDP encapsulation is adopted for source routing.

This memo makes use of the terms defined in
and .

Tunneling SR-MPLS using IPv4 and/or IPv6 tunnels is useful at least
in the following use cases:

*  Incremental deployment of the SR-MPLS technology may be
   facilitated by tunneling SR-MPLS packets across parts of a network
   that are not SR-MPLS enabled using an IP tunneling mechanism such as
   MPLS-in-UDP. The tunnel destination address
   is the address of the next SR-MPLS-capable node along the path
   (i.e., the egress of the active node segment). This is shown in
   . If encoding of entropy is desired, IP tunneling mechanisms that
   allow encoding of entropy, such as MPLS-in-UDP encapsulation where
   the source port of the UDP header is used as an entropy field, may
   be used to maximize the utilization of ECMP and/or UCMP, especially
   when it is difficult to make use of the entropy label mechanism.
   Refer to for more discussion about using entropy labels in
   SR-MPLS.

*  Tunneling MPLS into IP provides a transition technology that
   enables SR in an IPv4 and/or IPv6 network where many routers have
   not yet been upgraded to have SRv6 capabilities. It could be
   deployed as an interim solution until full-featured SRv6 is
   available on more platforms. This is shown in .

This section describes the construction of forwarding information
base (FIB) entries and the forwarding behavior that allow the deployment
of SR-MPLS when some routers in the network are IP only (i.e., do not
support SR-MPLS). Note that the examples described in and assume that OSPF or ISIS is
enabled; in fact, other mechanisms of discovery and advertisement could
be used, including other routing protocols (such as BGP) or a central
controller.

This sub-section describes how to construct the forwarding
information base (FIB) entry on an SR-MPLS-capable router when some or
all of the next-hops along the shortest path towards a prefix-SID are
IP-only routers.

Consider router A that receives a labeled packet with top label
L(E) that corresponds to the prefix-SID SID(E) of prefix P(E)
advertised by router E. Suppose the ith next-hop router (termed NHi)
along the shortest path from router A toward SID(E) is not SR-MPLS
capable. That is, both routers A and E are SR-MPLS capable, but some
router NHi along the shortest path from A to E is not SR-MPLS capable.
The following processing steps apply:

1.  Router E is SR-MPLS capable so it advertises the
    SR-Capabilities sub-TLV including the SRGB as described in and .

2.  Router E advertises the prefix-SID SID(E) of prefix P(E), so it
    MUST also advertise the encapsulation endpoint and the tunnel type
    of any tunnel used to reach E. It does this using the mechanisms
    described in or
    .

3.  If A and E are in different IGP areas/levels, then:

    *  The OSPF Tunnel Encapsulation TLV or the ISIS Tunnel
       Encapsulation sub-TLV is flooded
       domain-wide.

    *  The OSPF SID/label range TLV or the
       ISIS SR-Capabilities Sub-TLV is
       advertised domain-wide. This way router A knows the
       characteristics of the router that originated the
       advertisement of SID(E) (i.e., router E).

    *  When router E advertises the prefix P(E):

       -  If router E is running ISIS it uses the extended
          reachability TLV (TLVs 135, 235, 236, 237) and associates
          the IPv4 and/or IPv6 source router ID sub-TLV(s).

       -  If router E is running OSPF it uses the OSPFv2 Extended
          Prefix Opaque LSA and sets the
          flooding scope to AS-wide.

    *  If router E is running ISIS and advertises the ISIS
       capabilities TLV (TLV 242), it MUST
       set the "router-ID" field to a valid value or include an IPv6
       TE router-ID sub-TLV (TLV 12), or do both. The "S" bit
       (flooding scope) of the ISIS capabilities TLV (TLV 242) MUST
       be set to "1".

4.  Router A programs the FIB entry for prefix P(E) corresponding
    to the SID(E) as follows:

    *  If the NP flag in OSPF or the P flag in ISIS is clear:
       pop the top label.

    *  If the NP flag in OSPF or the P flag in ISIS is set:
       swap the top label to a value equal to SID(E) plus the
       lower bound of the SRGB of E.

    *  Encapsulate the packet according to the encapsulation
       advertised in
       or .

    *  Send the packet towards the next hop NHi.

The description in this section assumes that the label associated
with each prefix-SID advertised by the owner of the prefix-SID is
a Penultimate Hop Popping (PHP) label. That is, the NP flag in OSPF
or the P flag in ISIS associated with the prefix SID is not set.

In the example shown in , assume that
routers A, E, G, and H are SR-MPLS-capable while the remaining
routers (B, C, D, and F) are only capable of forwarding IP packets.
Routers A, E, G, and H advertise their Segment Routing related
information via IS-IS or OSPF.

Now assume that router A wants to send a packet via the explicit
path {E->G->H}. Router A will impose an MPLS label stack
corresponding to that explicit path on the packet. Since the next
hop toward router E is only IP-capable, router A replaces the top
label (which indicates router E) with a UDP-based tunnel for MPLS
(i.e., MPLS-over-UDP) to router E and then
sends the packet. In other words, router A pops the top label and
then encapsulates the MPLS packet in a UDP tunnel to router E.

When the IP-encapsulated MPLS packet arrives at router E, router
E strips the IP-based tunnel header and then processes the
decapsulated MPLS packet. The top label indicates that the packet
must be forwarded toward router G. Since the next hop toward router
G is only IP-capable, router E replaces the current top label with
an MPLS-over-UDP tunnel toward router G and sends it out. That is,
router E pops the top label and then encapsulates the MPLS packet in
a UDP tunnel to router G.

When the packet arrives at router G, router G will strip the
IP-based tunnel header and then process the decapsulated MPLS
packet. The top label indicates that the packet must be forwarded
toward router H. Since the next hop toward router H is only
IP-capable, router G would replace the current top label with an
MPLS-over-UDP tunnel toward router H and send it out. However, this
would leave the original packet that router A wanted to send to
router H encapsulated in UDP as if it were MPLS even though the
original packet could have been any protocol. That is, the final
SR-MPLS label has been popped, exposing the payload packet.

To handle this, when a router (here it is router G) pops the
final SR-MPLS label, it inserts an explicit null label before
encapsulating the packet with an
MPLS-over-UDP tunnel toward router H and sending it out. That is,
router G pops the top label, discovers it has reached the bottom of
stack, pushes an explicit null label, and then encapsulates the MPLS
packet in a UDP tunnel to router H.

 demonstrates the packet walk in the
case where the label associated with each prefix-SID advertised by
the owner of the prefix-SID is not a Penultimate Hop Popping (PHP)
label (i.e., the NP flag in OSPF or the P flag in ISIS
associated with the prefix SID is set).

As can be seen from the figure, the SR-MPLS label for each
segment is left in place until the end of the segment where it is
popped and the next instruction is processed. Further description
can be found in .

Although the description in the previous two sections is based on
the use of prefix-SIDs, tunneling SR-MPLS packets is also useful when
the top label of a received SR-MPLS packet indicates an adjacency-SID
and the corresponding adjacent node to that adjacency-SID is not
capable of MPLS forwarding but can still process SR-MPLS packets. In
this scenario the top label would be replaced by an IP tunnel toward
that adjacent node and then forwarded over the corresponding link
indicated by the adjacency-SID.

When encapsulating an MPLS packet with an IP tunnel header that
is capable of encoding entropy (such as ),
the corresponding entropy field (the source port in the case of a UDP
tunnel) MAY be filled with an entropy value that is generated by the
encapsulator to uniquely identify a flow. However, what constitutes
a flow is locally determined by the encapsulator. For instance, if
the MPLS label stack contains at least one entropy label and the
encapsulator is capable of reading that entropy label, the entropy
label value could be directly copied to the source port of the UDP
header. Otherwise, the encapsulator may have to perform a hash on
the whole label stack or the five-tuple of the SR-MPLS payload if
the payload is determined to be an IP packet. To avoid re-performing
the hash or hunting for the entropy label each time the packet is
encapsulated in a UDP tunnel, it MAY be desirable that the entropy
value contained in the incoming packet (i.e., the UDP source port
value) is retained when stripping the UDP header and is re-used as
the entropy value of the outgoing packet.

This section provides supplementary details to the description found
in . specifies an IP-based encapsulation for
MPLS, i.e., MPLS-in-UDP, which is applicable in circumstances where
IP-based encapsulation for MPLS is required together with fine-grained
load balancing of MPLS packets over IP networks across Equal-Cost
Multipath (ECMP) paths and/or Link Aggregation Groups (LAGs). This
section provides details about the forwarding procedure
when UDP encapsulation is adopted for SR-MPLS over IP.

Nodes that are SR capable can process SR-MPLS packets. Not all of the
nodes in an SR domain are SR capable. Some nodes may be "legacy routers"
that cannot handle SR packets but can forward IP packets. An SR capable
node may advertise its capabilities using the IGP as described in . There are six types of node in an SR domain:

*  Domain ingress nodes that receive packets and encapsulate them
   for transmission across the domain. Those packets may be any payload
   protocol including native IP packets or packets that are already
   MPLS encapsulated.

*  Legacy transit nodes that are IP routers but that are not SR
   capable (i.e., are not able to perform segment routing).

*  Transit nodes that are SR capable but that are not identified by
   a SID in the SID stack.

*  Transit nodes that are SR capable and need to perform SR routing
   because they are identified by a SID in the SID stack.

*  The penultimate SR capable node on the path that processes the
   last SID on the stack on behalf of the domain egress node.

*  The domain egress node that forwards the payload packet for
   ultimate delivery.

The following sub-sections describe the processing behavior in each
case.

Domain ingress nodes receive packets from outside the domain and
encapsulate them to be forwarded across the domain. Received packets
may already be SR-MPLS packets (in the case of connecting two SR-MPLS
networks across a native IP network), or may be native IP or MPLS
packets.

In the latter case, the packet is classified by the domain ingress
node and an SR-MPLS stack is imposed. In the former case the SR-MPLS
stack is already in the packet. The top entry in the stack is popped
from the stack and retained for use below.

The packet is then encapsulated in UDP with the destination port
set to 6635 to indicate "MPLS-UDP" or to 6636 to indicate
"MPLS-UDP-DTLS" as described in . The source
UDP port is set randomly or to provide entropy as described in and , above.

The packet is then encapsulated in IP for transmission across the
network. The IP source address is set to the address of the domain
ingress node, and the destination address is set to the address
corresponding to the label that was previously popped from the stack.

This processing is equivalent to sending the packet out of a
virtual interface that corresponds to a virtual link between the
ingress node and the next hop SR node realized by a UDP tunnel. The
packet is then sent into the IP network and is routed according to the
local FIB, applying hashing to resolve any ECMP choices.

A legacy transit node is an IP router that has no SR capabilities.
When such a router receives an SR-MPLS-in-UDP packet it will carry out
normal TTL processing and, if the packet is still live, it will forward
it as it would any other UDP-in-IP packet. The packet will be routed
toward the destination indicated in the packet header using the local
FIB and applying hashing to resolve any ECMP choices.

If the packet is mistakenly addressed to the legacy router, the UDP
tunnel will be terminated and the packet will be discarded either
because the MPLS-in-UDP port is not supported or because the uncovered
top label has not been allocated. This is, however, a misconnection
and should not occur unless there is a routing error.

Just because a node is SR capable and receives an SR-MPLS-in-UDP
packet does not mean that it performs SR processing on the packet.
Only routers identified by SIDs in the SR stack need to do such
processing.

Routers that are not addressed by the destination address in the IP
header simply treat the packet as a normal UDP-in-IP packet, carrying
out normal TTL processing and, if the packet is still live, routing the
packet according to the local FIB and applying hashing to resolve any
ECMP choices.

This is important because it means that the SR stack can be kept
relatively small and the packet can be steered through the network
using shortest path first routing between selected SR nodes.

An SR capable node that is addressed by the topmost SID in the
stack when that is not the last SID in the stack (i.e., the S bit is
not set) is an SR transit node. When an SR transit node receives an
SR-MPLS-in-UDP packet that is addressed to it, it acts as follows:

1.  Perform TTL processing as normal for an IP packet.

2.  Determine that the packet is addressed to the local node.

3.  Find that the payload is UDP and that the destination port
    indicates MPLS-in-UDP.

4.  Strip the IP and UDP headers.

5.  Examine the label at the top of the stack and process according
    to the FIB entry (see ).

    a.  If the top label identifies this node then no PHP was used
        on the incoming segment and the label is popped. Continue the
        processing with the new top label.

    b.  Retain the value of the top label.

    c.  If the top label was advertised requesting PHP, pop the
        label. (Note that the case where this is the last label in the
        stack is covered in .)

6.  Encapsulate the packet in UDP with the destination port set to
    6635 (or 6636 for DTLS) and the source port set for entropy. The
    entropy value SHOULD be retained from the received UDP header or
    MAY be freshly generated since this is a new UDP tunnel (see ).

7.  Encapsulate the packet in IP with the IP source address set to
    this transit router, and the destination address set to the
    address corresponding to the SID for the label value retained
    earlier.

8.  Send the packet into the IP network routing the packet
    according to the local FIB and applying hashing to resolve any
    ECMP choices.

The penultimate SR transit node is an SR transit node as described
in where the top label is the last label on
the stack. When a penultimate SR transit node receives an
SR-MPLS-in-UDP packet that is addressed to it, it processes the packet
as any other transit node would (see ) except for a special
case if PHP is supported for the final SID.

If PHP is allowed for the final SID the penultimate SR transit node
acts as follows:

1.  Perform TTL processing as normal for an IP packet.

2.  Determine that the packet is addressed to the local node.

3.  Find that the payload is UDP and that the destination port
    indicates MPLS-in-UDP.

4.  Strip the IP and UDP headers.

5.  Examine the label at the top of the stack and process according
    to the FIB entry (see ).

    a.  If the top label identifies this node then no PHP was used
        on the incoming segment and the label is popped. Continue the
        processing with the new top label.

    b.  Retain the value of the top label.

    c.  If the top label was advertised requesting PHP, pop the
        label. This will have been the last label in the stack. Push
        an explicit null label (0 for IPv4
        and 2 for IPv6) with bottom of stack (S bit) set.

6.  Encapsulate the packet in UDP with the destination port set to
    6635 (or 6636 for DTLS) and the source port set for entropy. The
    entropy value SHOULD be retained from the received UDP header or
    MAY be freshly generated since this is a new UDP tunnel.

7.  Encapsulate the packet in IP with the IP source address set to
    this transit router, and the destination address set to the domain
    egress node IP address corresponding to the SID for the label
    value retained earlier.

8.  Send the packet into the IP network routing the packet
    according to the local FIB and applying hashing to resolve any
    ECMP choices.

End-to-end SR paths are composed of multiple segments. The end
point of each segment is identified by a SID in the SID stack. In
normal SR processing a penultimate hop is the router that performs
SR routing immediately prior to the end-of-segment router. PHP
applies at the penultimate router in a segment.

With SR-MPLS-in-UDP encapsulation, each SR segment is achieved
using an MPLS-in-UDP tunnel that runs the full length of the
segment. The SR SID stack on a packet is only examined at the head
and tail ends of this segment. Thus, each segment is effectively one
hop long in the SR overlay network and if there is any PHP
processing it takes place at the head-end of the segment.

The domain egress acts as follows:

1.  Perform TTL processing as normal for an IP packet.

2.  Determine that the packet is addressed to the local node.

3.  Find that the payload is UDP and that the destination port
    indicates MPLS-in-UDP.

4.  Strip the IP and UDP headers.

5.  Examine the label at the top of the stack and process according
    to the FIB entry (see ).

    a.  If the top label identifies this node then no PHP was used
        on the incoming segment and the label is popped. Continue the
        processing with the new top label.

    b.  If there is another label it should be the explicit null.
        Pop it but retain its value.

6.  Forward the payload packet according to its type (as
    potentially indicated by the value of the popped explicit null
    label) and the local routing/forwarding mechanisms.

Thanks to Joel Halpern, Bruno Decraene, Loa Andersson, Ron Bonica,
Eric Rosen, Jim Guichard, and Gunter Van De Velde for their insightful
comments on this draft.

No IANA action is required.

TBD.
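
As an informal illustration of the forwarding behavior described in
this document, the following Python sketch models the label operations
only: the choice between pop and swap when the FIB entry is programmed,
and the per-node processing of a packet sent along the explicit path
{E->G->H} when PHP is in use. It is a sketch under stated assumptions
rather than a definitive implementation. The label values, the
addresses, and the names SID_OWNER, fib_action, process_at_node, and
walk are hypothetical and do not appear in this document, and each
label is mapped directly to the router that owns it instead of being
resolved through IGP advertisements.

```python
EXPLICIT_NULL_V4 = 0  # explicit null label for an IPv4 payload
EXPLICIT_NULL_V6 = 2  # explicit null label for an IPv6 payload

# Hypothetical table: label -> (owning router, tunnel endpoint address).
SID_OWNER = {
    1005: ("E", "192.0.2.5"),
    1007: ("G", "192.0.2.7"),
    1008: ("H", "192.0.2.8"),
}


def fib_action(sid_index, srgb_lower_bound, no_php_flag):
    """Label operation programmed for a prefix-SID.

    no_php_flag models the NP flag (OSPF) or the P flag (ISIS).
    """
    if not no_php_flag:
        return ("pop", None)
    # Swap to SID(E) plus the lower bound of the SRGB of E.
    return ("swap", srgb_lower_bound + sid_index)


def process_at_node(stack, payload_is_ipv6=False):
    """PHP-style processing at an SR node (assumption: PHP everywhere).

    Returns (tunnel destination address, label stack to encapsulate).
    """
    if not stack:
        raise ValueError("empty label stack")
    top, rest = stack[0], list(stack[1:])
    _, tunnel_dest = SID_OWNER[top]
    if not rest:
        # The last SR-MPLS label was popped: push an explicit null so
        # that the tunnel payload is still an MPLS packet.
        rest = [EXPLICIT_NULL_V6 if payload_is_ipv6 else EXPLICIT_NULL_V4]
    return tunnel_dest, rest


def walk(stack):
    """List (tunnel endpoint, stack on the wire) for each tunnel used."""
    hops = []
    while stack[0] in SID_OWNER:
        endpoint = SID_OWNER[stack[0]][0]
        _, stack = process_at_node(stack)
        hops.append((endpoint, stack))
    return hops
```

Under these assumptions, walk([1005, 1007, 1008]) yields the tunnel
endpoints E, G, and H in turn, and the stack carried on the last tunnel
consists solely of the explicit null label.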
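
The MPLS-in-UDP encapsulation and the handling of the entropy value
described in this document can likewise be sketched in Python. The
sketch below is illustrative only. The function names are hypothetical.
The label stack entry layout (20-bit label, 3-bit traffic class,
bottom-of-stack bit, 8-bit TTL) and the Entropy Label Indicator value 7
come from the MPLS specifications rather than from this document. The
use of CRC-32 as the hash, the mapping of the entropy value into the
port range 49152-65535, and the zero UDP checksum are assumptions made
for brevity.

```python
import struct
import zlib

MPLS_UDP_PORT = 6635  # destination port indicating "MPLS-UDP"
ELI = 7               # Entropy Label Indicator (reserved label value)


def encode_label_stack(labels, ttl=64):
    """Encode labels as 4-byte stack entries; S bit set on the last one."""
    out = b""
    for i, label in enumerate(labels):
        s_bit = 1 if i == len(labels) - 1 else 0
        # label: bits 31-12, traffic class (left at 0): bits 11-9,
        # S bit: bit 8, TTL: bits 7-0.
        out += struct.pack("!I", (label << 12) | (s_bit << 8) | ttl)
    return out


def entropy_source_port(labels, received_port=None):
    """Choose the UDP source port used as the entropy field."""
    if received_port is not None:
        # Re-use the entropy value carried by the incoming UDP tunnel.
        return received_port
    if ELI in labels[:-1]:
        # The entropy label immediately follows the ELI.
        value = labels[labels.index(ELI) + 1]
    else:
        # Otherwise hash the whole label stack (CRC-32 is an assumption).
        value = zlib.crc32(encode_label_stack(labels))
    # Assumption: keep 14 bits and set the two top bits so that the
    # port falls in the range 49152-65535.
    return 0xC000 | (value & 0x3FFF)


def udp_encapsulate(labels, src_port):
    """Prepend a UDP header to the encoded label stack.

    The MPLS payload is omitted and the checksum is left at zero.
    """
    payload = encode_label_stack(labels)
    header = struct.pack("!HHHH", src_port, MPLS_UDP_PORT, 8 + len(payload), 0)
    return header + payload
```

A transit node that retains the entropy value would pass the source
port of the received UDP header as received_port, so that neither the
hash nor the search for an entropy label is repeated at each tunnel.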