Common Control and Measurement Plane I. Hussain, Ed.
Internet-Draft R. Valiveti
Intended status: Informational Infinera Corp
Expires: September 14, 2017 Q. Wang, Ed.
L. Andersson, Ed.
M. Chen
H. Zheng
March 13, 2017

GMPLS Routing and Signaling Framework for Flexible Ethernet (FlexE)


This document specifies GMPLS Control Plane Signalling and Routing protocol extensions for Flexible Ethernet (FlexE). The FlexE data plane was specified by the Optical Internetworking Forum (OIF) in two implementation agreements in 2016.

Unlike earlier Ethernet data planes, FlexE allows the Ethernet Physical layer (PHY) rates and the Media Access Control layer (MAC) rates to be decoupled.

This document also specifies the use cases of FlexE technology, GMPLS control plane requirements, framework and architecture.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on September 14, 2017.

Copyright Notice

Copyright (c) 2017 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

1. Introduction

Traditionally, Ethernet MAC rates were constrained to match the rates of the Ethernet PHY(s). OIF's implementation agreement [OIFMLG3] was the first step in allowing MAC rates to be different than the PHY rates standardized by IEEE. A recently approved implementation agreement [OIFFLEXE1] allows for complete decoupling of the MAC data rates and the Ethernet PHY(s).

This includes support for:

  1. MAC rates which are greater than the rate of a single PHY (multiple PHYs are bonded to achieve this)
  2. MAC rates which are less than the rate of a PHY (sub-rate)
  3. Multiple FlexE Clients carried over a single PHY, or over a collection of bonded PHYs

The capabilities supported by the FlexE implementation agreement version 1.0 are:

  1. Support a large rate Ethernet MAC over bonded Ethernet PHYs, e.g. supporting a 200G MAC over 2 bonded 100GBASE-R PHY(s)
  2. Support a sub-rate Ethernet MAC over a single Ethernet PHY, e.g. supporting a 50G MAC over a 100GBASE-R PHY
  3. Support a collection of flexible Ethernet clients over a single Ethernet PHY, e.g. supporting two MACs with the rates 25G and 50G over a single 100GBASE-R PHY
  4. Support a sub-rate Ethernet MAC over bonded PHYs, e.g. supporting a 150G Ethernet client over 2 bonded 100GBASE-R PHY(s)
  5. Support a collection of Ethernet MAC clients over bonded Ethernet PHYs, e.g. supporting a 50G MAC and a 150G MAC over 2 bonded Ethernet PHY(s)

All networks which support the bonding of Ethernet interfaces (as per [OIFFLEXE1]) include a basic building block, which consists of two FlexE Shim functions (located at opposite ends of a link) and the (logical) point-to-point links that carry the Ethernet PHY signals between the two FlexE Shim functions. These logical point-to-point PHY links can be realized in a variety of ways:

  1. As direct point-to-point links with no intervening transport network.
  2. By transparently transporting the Ethernet PHY(s) via an Optical Transport Network. Optical Transport Networks (defined by [G.709] and [G798]) have recently expanded the traditional bit (or codeword) transparent transport of Ethernet client signals to include support for the usecases identified in the OIF FlexE implementation agreement.
  3. By tunneling the Ethernet PHY(s) over some other type of network (e.g. IP/MPLS). Thus, for example, the Ethernet PHY signals could be carried over a pseudowire (PW) or an LSP in the IP/MPLS network. Note that the OIF implementation agreement [OIFFLEXE1] only includes support for 100G Ethernet PHY(s). As a result of the encapsulation into a PW, the bandwidth of the PW will be much larger than the bit rate of the Ethernet PHY (i.e. 100G), and such a pseudowire cannot be transported in networks that only include 100G Ethernet links. This scenario is realizable when (a) higher rate Ethernet PHY(s), e.g. 200G/400G, are supported or (b) OIF extends the FlexE groups to include lower rate Ethernet PHY(s), e.g. at the 25G/50G rate. Further study is needed to ensure that these scenarios are realizable, practical, and beneficial to operators. With this in mind, the current draft doesn't include any coverage for this scenario.

This Internet-Draft examines the usecases that arise when the logical links between FlexE capable devices are (a) point-to-point links without any intervening network or (b) realized via Optical Transport Networks. This draft considers the variants in which the two peer FlexE devices are both customer-edge devices, or customer-edge/provider-edge devices. This list of usecases will help identify the Control Plane (i.e. Routing and Signaling) extensions that may be required.

1.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

2. Terminology

  1. CE (Customer Edge): the group of functions that support the termination/origination of data received from or sent to the network.
  2. Crunching: (Editor's note: text to be submitted)
  3. Ethernet PHY: an entity representing 100G-R Physical Coding Sublayer (PCS), Physical Media Attachment (PMA), and Physical Media Dependent (PMD) layers.
  4. FlexE Calendar: The total capacity of a FlexE group is represented as a collection of slots which have a granularity of 5G. The calendar for a FlexE group composed of n 100G PHYs is represented as an array of 20n slots (each representing 5G of bandwidth). This calendar is partitioned into sub-calendars, with 20 slots per 100G PHY. Each FlexE client is mapped into one or more calendar slots (based on the bandwidth of the FlexE client).
  5. FlexE Client: an Ethernet flow based on a MAC data rate that may or may not correspond to any Ethernet PHY rate.
  6. FlexE Group: A FlexE Group is composed of 1 to n Ethernet PHYs. In the first version of FlexE each PHY is identified by a number in the range [1-254].
  7. FlexE Interface: A logical interface that is composed of 1 to n Ethernet interfaces.
  8. FlexE Link: A logical link that connects two FlexE interfaces residing in two adjacent nodes.
  9. FlexE Shim: the layer that maps or demaps the FlexE clients carried over a FlexE group.
  10. FlexE Sub-Interface: A channelized logical sub-interface that is allocated specific slots from a FlexE interface. The number of slots depends on the rate of the FlexE Client that will be transmitted through this sub-interface.
  11. FlexE Sub-Link: A logical link that connects two FlexE sub-interfaces residing in two adjacent nodes.
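
The FlexE Calendar definition above can be illustrated with a minimal Python sketch. It assumes 100G PHYs with 20 slots of 5G each, as stated in the terminology. The function names (`slots_needed`, `build_calendar`, `sub_calendar`), the client names, and the first-fit slot assignment are illustrative assumptions and are not taken from [OIFFLEXE1].

```python
SLOT_GBPS = 5          # calendar slot granularity (from the terminology above)
SLOTS_PER_PHY = 20     # 20 slots per 100G PHY


def slots_needed(client_gbps):
    """Number of 5G calendar slots a FlexE client of the given rate occupies."""
    if client_gbps <= 0 or client_gbps % SLOT_GBPS != 0:
        raise ValueError("client rate must be a positive multiple of 5G")
    return client_gbps // SLOT_GBPS


def build_calendar(num_phys, clients):
    """Return a list of 20*num_phys entries mapping slot index -> client name.

    `clients` maps a (hypothetical) client name to its rate in Gbps.
    Slots are handed out first-fit (an assumption); unused slots stay None.
    """
    calendar = [None] * (SLOTS_PER_PHY * num_phys)
    next_free = 0
    for name, rate in clients.items():
        count = slots_needed(rate)
        if next_free + count > len(calendar):
            raise ValueError("FlexE group has too few free slots for " + name)
        for slot in range(next_free, next_free + count):
            calendar[slot] = name
        next_free += count
    return calendar


def sub_calendar(calendar, phy_index):
    """Slots belonging to one PHY (phy_index is 0-based in this sketch)."""
    start = phy_index * SLOTS_PER_PHY
    return calendar[start:start + SLOTS_PER_PHY]


# Example from Section 1: a 50G and a 150G MAC over 2 bonded 100GBASE-R PHYs.
cal = build_calendar(2, {"client-50G": 50, "client-150G": 150})
print(len(cal), cal.count("client-50G"), cal.count("client-150G"))
```

In this example the 50G client occupies 10 slots and the 150G client occupies 30 slots, so the 150G client spans both sub-calendars of the group.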

3. Usecases

3.1. FlexE Unaware transport

The FlexE Shim layer in a router maps the FlexE client(s) over the FlexE group. The transport network is unaware of FlexE. Each PHY of the FlexE group is carried independently across the transport network over the same fiber route. The FlexE Shim in the router tolerates end-to-end skew across the network. In this usecase, the router makes flexible use of the full capacity of the FlexE group, and depends on legacy transport equipment to realize PCS-codeword-transparent transport of 100GbE. It allows striping of PHYs in the FlexE group over multiple line cards in the transport equipment. It is worth mentioning that in this case, the FlexE Shim layer is terminated at the routers, and the coordination of operations related to FlexE clients (e.g. creating new FlexE clients, deleting existing FlexE clients, and resizing the bandwidth of existing FlexE clients, if desired) happens between the two routers. Note that the transport network is completely transparent to the FlexE signals, and doesn't participate in any FlexE protocols.


    +                           FlexE Ethernet Client(s)        +
    +                                                           +
                     +  FlexE skew tolerance                  +
                     +  for end-to-end distance               +

+-----------+ 2x100GE +---------+   +----------+     +------------+
|           |         |         |   |          |     |            |
| Router1   |         |         |   |          |     |            |
|FlexE Shim +---------+ A-end   |   |  Z-end   +-----+Router 2    |
|           |         | (FlexE  |   |  (FlexE  |     |(FlexE Shim)|
|           +---^-----+ unaware)|   |  unaware)+-----+            |
|           |   |     |         |   |          |     |            |
|           |   |     |         |   |          |     |            |
+-----------+   +     +---------+   +----------+     +------------+
                 FlexE Group

+--------------+                                  +----------------+
| FlexE Clients|                                  | FlexE Client(s)|
+--------------+                                  +----------------+
| FlexE Shim   |                                  |  FlexE Shim    |
+----+----+----+                                  +----+------+----+
|PHY |  |  PHY |                                  |  PHY |   | PHY |
+---+---+--+---+                                  +---+--+   +--+--+
    |      |          +-----+           +-----+       |         |
    |      +----------+ PHY |           | PHY |-------+         |
    |                 +-----+           +-----+                 |
    |                 | ODU4+-----------+ ODU4|                 |
    |                 +-----+           +-----+                 |
    |                                                           |
    |                 +-----+           +-----+                 |
    +-----------------+ PHY |           | PHY +-----------------+
                      +-----+           +-----+
                      | ODU4+-----------+ ODU4|
                      +-----+           +-----+


Figure 1: FlexE unaware transport

3.2. FlexE Aware transport

This scenario represents an optimization of the FlexE unaware transport presented in Section 3.1, and illustrated in Figure 1. In this application (see Figure 2), the devices at the edge of the transport network do not terminate the FlexE shim layer, but are aware of (a) the composition of the FlexE group (i.e. the set of all contained Ethernet PHYs) and (b) the format of the FlexE overhead. At the ingress to the transport network, the transport network edge removes the unavailable calendar slots, and retains all available calendar slots (whether they are allocated or not). At the egress point of the transport network, the edge device adds the unavailable calendar slots back. The result is that the FlexE Shim layers at both routers see exactly the same input that they saw in the FlexE unaware scenario, with the added benefit that the line (or DWDM) side bandwidth has been optimized to be sufficient to carry only the available calendar slots in all of the Ethernet PHY(s) in the FlexE group.
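
The ingress and egress behaviour described above can be sketched as follows. This is a minimal illustration, not the procedure of [OIFFLEXE1] or [G.709]: the slot representation (a list of 20 entries per PHY), the `UNAVAILABLE` marker, and the function names `crunch` and `uncrunch` are assumptions made for the example.

```python
UNAVAILABLE = "unavailable"   # hypothetical marker for an unavailable slot


def crunch(phy_slots, unavailable):
    """Ingress: drop unavailable slots, keep every available one (used or not)."""
    return [s for i, s in enumerate(phy_slots) if i not in unavailable]


def uncrunch(crunched, unavailable, slots_per_phy=20):
    """Egress: re-insert the unavailable slots at their original positions."""
    it = iter(crunched)
    return [UNAVAILABLE if i in unavailable else next(it)
            for i in range(slots_per_phy)]


# A PHY whose slots 15..19 are unavailable. Slots 0..9 carry a 50G client,
# slots 10..14 are available but unallocated (None).
unavail = set(range(15, 20))
phy = ["client-50G"] * 10 + [None] * 5 + [UNAVAILABLE] * 5

line_side = crunch(phy, unavail)            # what crosses the transport network
restored = uncrunch(line_side, unavail)     # what the far-end router sees
print(len(line_side), restored == phy)
```

Note that the unallocated slots 10..14 survive the round trip: only unavailable slots are removed, which is why the routers see the same signal as in the FlexE unaware case.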

The transport network edge device could learn of the set of unavailable calendar slots in a variety of ways; a few examples are listed below:

  1. In this scenario, the transport network edge does not expect the number of unavailable calendar slots to change dynamically. The set of unavailable calendar slots is configured against each Ethernet PHY in the FlexE group. The FlexE demux function in the transport network edge device (A) compares the information about calendar slots which are expected to be unavailable (as per user supplied configuration), with the corresponding information encoded by the customer edge device in the FlexE overhead (as specified in [OIFFLEXE1]). If there is a mismatch between the unavailable calendar slots in any of the PHYs within a FlexE group, the transport edge node software could raise an alarm to report the inconsistency between the provisioning information at the transport network edge, and the customer edge device.
  2. The Transport network edge is configured to act in a "slave" mode. In this mode, the FlexE demux function at the Transport network edge (A) receives the information about the available/unavailable calendar slots by observing the FlexE overhead (as specified in [OIFFLEXE1]) and uses this information to calculate the bandwidth of the ODUflex (or fixed rate ODUs) connection that could carry the FlexE PCS end-to-end. This scenario allows for the set of available/unavailable calendar slots to change (slowly) with time -- but comes with the complexity of resizing the ODUflex connection in response to changes in the number of available calendar slots.
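
The two ways of learning the unavailable calendar slots listed above can be sketched in Python. The dictionaries mapping a PHY number to a set of unavailable slot indices, and the function names, are hypothetical; decoding of the FlexE overhead itself is defined in [OIFFLEXE1] and is not shown.

```python
def check_unavailable_slots(configured, signaled):
    """Static mode: compare per-PHY unavailable slot sets.

    `configured` comes from user provisioning, `signaled` is what was decoded
    from the FlexE overhead. Returns the PHY numbers whose sets disagree.
    """
    phys = set(configured) | set(signaled)
    return sorted(p for p in phys
                  if configured.get(p, set()) != signaled.get(p, set()))


def available_slot_count(signaled, phy, slots_per_phy=20):
    """In "slave" mode, derive the available slot count from the overhead alone."""
    return slots_per_phy - len(signaled.get(phy, set()))


configured = {1: {18, 19}, 2: set()}
signaled = {1: {18, 19}, 2: {19}}

mismatched = check_unavailable_slots(configured, signaled)
if mismatched:
    # A real node would raise an alarm here; printing stands in for that.
    print("ALARM: unavailable-slot mismatch on PHY(s)", mismatched)
print(available_slot_count(signaled, 2))
```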

Note that the process of removing unavailable calendar slots from a FlexE PHY is called "crunching" (see [OIFFLEXE1]). The following additional notes apply to Figure 2:

  1. As in the FlexE unaware case, all PHYs of the FlexE group MUST be terminated between the same two FlexE shims.
  2. The crunched FlexE PHYs are independently transported through the transport network. The number of used (and unused) calendar slots can be different across the FlexE group. In particular, if all the calendar slots in a FlexE PHY are in use, the crunching operation leaves the original signal intact.
  3. In this illustration, the different FlexE PHY(s) are transported using ODUflex containers in the transport network. These ODUflex connections can be of different rates.
  4. In the most general form, G.709 Section 17.12 [G.709] allows for a FlexE group consisting of m Ethernet PHY(s) to be crunched, combined, and transported using n ODUflex containers (where n can range between 1 and m). In other words, the ITU G.709 recommendation allows for (but does not require support for) the degenerate cases in which (a) each Ethernet PHY within the group is transported using its own ODUflex, and (b) all the PHY(s) are crunched, combined and transported over a single ODUflex container. If all the sub-calendar slots in a given PHY are available, it is possible to transport the content of the PHY in one of two ways: (a) as shown in Figure 2, or (b) using a FlexE unaware (i.e. PCS-codeword transparent transport) mode. The latter approach (of using FlexE unaware transport) for a few select (fully-utilized) PHYs is not attractive from the perspective of skew between the PHYs that comprise the FlexE group. For simplicity, the preferred mode of operation will be one in which the same mapping procedure is used for all member PHYs of a FlexE group.
  5. When the crunched FlexE PHY(s) have a rate that is identical to that of a standard Ethernet PHY, it is possible that the transport network may utilize standard ODU containers such as ODU2e, ODU4 etc. As currently defined by ITU G.709 Section 17.12 [G.709], the crunched signal is always mapped to an ODUflex, and the mapping to a fixed rate ODU signal is not required. This option could be dropped if it results in any significant simplification.
  6. The bandwidth of the ODUflex connections shall be computed based on the total number of available 5G calendar slots in the subset of PHY(s) which are transported over this ODUflex entity (see Section 3.2, G.709:Table 7-2 [G.709]).
  7. As in the FlexE unaware case, the FlexE Shim layer is terminated at the routers, and the coordination of operations related to FlexE clients, e.g. creating new FlexE clients, deleting existing FlexE clients, and resizing the bandwidth of existing FlexE clients (if desired) happens between the two routers. Note that the transport network is completely transparent to the FlexE signals, and doesn't participate in any FlexE protocols. As long as the set of available (and unavailable) calendar slots on the PHY(s) does not change after the initial setup, the transport network is not required to make any changes to the number/rates of ODUflex connections which were created at service setup time.
  8. In the FlexE aware case, the OTN pipes are sized to match the currently configured set of available/unavailable calendar slots across the FlexE group. If this set of available/unavailable calendar slots on the PHY(s) is allowed to dynamically change, the ODUflex connections would also require resizing to match the new usage of available slots. However, the ODUflex hitless resizing mechanism defined in G.7044 [G7044] has the following restrictions: (a) the ODUflex connection being resized must have a bandwidth of 100G or less, and (b) the ODUflex connection cannot traverse OTUCn links, which were introduced in the latest revision of G.709. With the present limitations in the ODUflex resizing mechanism, the dynamic adjustment of ODUflex bandwidth (for the FlexE aware case) is possible only if (a) the transport network edge maps each crunched PHY to its own ODUflex connection, (b) the Ethernet PHY rates are 100G or less, and (c) the ODUflex connection does not traverse any OTUCn links along the end-to-end path. As a result, this scenario is not considered in this document.
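
Notes 6 and 8 above can be restated as a small Python sketch. It only counts available slots and evaluates conditions (a) to (c) from note 8; the exact ODUflex bit rate is defined in [G.709] and is deliberately not computed here. The function and parameter names are hypothetical.

```python
def available_slots(unavailable_per_phy, slots_per_phy=20):
    """Total available 5G slots over the PHYs carried in one ODUflex.

    `unavailable_per_phy` is a list with one set of unavailable slot
    indices per PHY mapped into that ODUflex.
    """
    return sum(slots_per_phy - len(u) for u in unavailable_per_phy)


def dynamic_resize_possible(phys_per_oduflex, phy_rate_gbps, traverses_otucn):
    """Evaluate conditions (a)-(c) of note 8.

    phys_per_oduflex: number of crunched PHYs mapped into each ODUflex
                      connection of the FlexE group.
    phy_rate_gbps:    Ethernet PHY rate in Gbps.
    traverses_otucn:  True if any ODUflex connection crosses an OTUCn link.
    """
    one_phy_per_oduflex = all(n == 1 for n in phys_per_oduflex)      # (a)
    rate_ok = phy_rate_gbps <= 100                                   # (b)
    return one_phy_per_oduflex and rate_ok and not traverses_otucn   # (c)


# Two 100G PHYs, each in its own ODUflex, 5 unavailable slots on the second.
print(available_slots([set()]), available_slots([set(range(15, 20))]))
print(dynamic_resize_possible([1, 1], 100, traverses_otucn=False))
print(dynamic_resize_possible([2], 100, traverses_otucn=False))
```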

[N1]RV: The figure may need further editing to accurately depict the signal hierarchy.


                  FlexE Ethernet Client(s)
              FlexE skew tolerance
               for end-to-end distance

+--------+ 2 x 100GE +---------+      +---------+    +------+
|  R1    |           |         |      |         +----+  R2  |
|  (FlexE+-----------+  NE A   |      |  NE Z   |    |(FlexE|
|  Shim) |           | (FlexE  |      | (FlexE  +----+ Shim |
|        +-----^-----+ aware)  |      | aware)  |    |      |
|        |     |     |         |      |         |    |      |
+--------+     +     +---------+      +---------+    +------+
                     FlexE Group
+-------------+                                +-------------+
|FlexE clients|                                |FlexE clients|
+-------------+                                +-------------+
| FlexE Shim  |                                | FlexE Shim  |
+-------------+                                +-------------+
|  PHY |  PHY |                                |  PHY |  PHY |
+-------------+                                +-------------+
 |     |                                         |     |
 |     |  +-------------+        +------------+  |     |
 |     |  |  FlexE-psg  |        | FlexE-psg  |  |     |
 |     |  +-------------+        +------------+  |     |
 |     +--+ PHY|ODUflex +--------|ODUFlex|PHY +--+     |
 |        +-------------+        +------------+        |
 |                                                     |
 |        +-------------+        +------------+        |
 |        |  FlexE-psg  |        | FlexE-psg  |        |
 |        +-------------+        +------------+        |
 +--------+ PHY|ODUflex +--------|ODUFlex|PHY +--------+
          +-------------+        +------------+

     Legend:
     R1, R2    - Routers (supporting the FlexE clients)
     NE A, Z   - Transport Network Edge nodes
     FlexE-psg - FlexE partial rate (sub) group signal
                 (per G.709:17.12)


Figure 2: FlexE Aware Transport

3.3. FlexE Termination in Transport

These usecases build upon the basic router-transport equipment connectivity illustrated in Figure 1. The FlexE shim layer at the router maps the set of FlexE clients over the FlexE group, as usual. This section considers various usecases in which the equipment located at the edge of the transport network instantiates the FlexE Shim function which peers with the FlexE shim on the customer device. In the router to network direction, the transport edge node terminates the FlexE shim layer, extracts one or more FlexE client signals, and transports them through the network. That is, these usecases are distinguished from the FlexE unaware cases in that the FlexE group and the FlexE shim layer end at the transport network edge, and only the extracted FlexE client signals transit the optical network. In the network to router direction, the transport edge node maps a set of FlexE clients to the FlexE group (i.e. performing the same functions as the router which connects to the transport network). The various usecases differ in the combination of service endpoints in the transport network. In the FlexE termination scenarios, the distance between the FlexE Shims is limited to the normal Ethernet link distance. The FlexE shims in the router and in the transport equipment need to support only a small amount of skew.

3.3.1. FlexE Client at Both endpoints

In this scenario, the service consists of transporting a FlexE client through the transport network, and possibly combining this FlexE client with other FlexE clients into a FlexE group at the endpoints. The FlexE client signal is BMP mapped into an ODUflex (of the appropriate rate) and then switched across the OTN. Figure 3 illustrates the scenario involving the mapping of a FlexE client to an ODUflex envelope; this figure only shows the signal "stack" at the service endpoints, and doesn't illustrate the switching of the ODUflex entity through the OTN. The ODUflex signal is then carried over a sequence of OTUk links (with a maximum rate of 100G), and/or OTUCn links (with rates of n x 100G). Although Figure 3 illustrates the scenario in which one FlexE client is transported within the OTN, the following points should be noted:

  1. When the FlexE Shim termination function recovers multiple FlexE client signals (at node A), the FlexE signals can be transported independently. In other words, it is not a requirement that all the FlexE client signals be co-routed.
  2. Conversely, at the egress node, FlexE clients from different endpoints can be combined via the FlexE shim, eventually exiting the transport edge node over an Ethernet group.
  3. The description presented above (implicitly) assumes that the FlexE Client signals have a constant bit rate which does not change after the service setup. In the scenarios in which the FlexE Client signal rates are permitted to be dynamically adjusted (i.e. resized), the resizing process would require coordination across three resizing domains: (a) between Router1 and Node A, (b) the ODUflex connection between the transport edge nodes A and Z, and (c) between Node Z and Router2. This usecase is not considered in this document since G.709 [G.709] has dropped support for the hitless resizing of ODUflex connections with bandwidths larger than 100G. In the absence of a hitless B100G ODUflex resizing mechanism, a resize will have to be realized by treating it like a request for a new service with a new (increased or decreased) rate. The FlexE client bandwidth resize applicability for various use cases is summarized in Table 1.


 +--------+ 2 x 100GE +---------+       +----------+      +--------+
 |        |           |         |       |          |      |        |
 | Router1|           |         |       |          |      |        |
 | FlexE  +-----------+ A-end   |       |  Z-end   +------+Router2 |
 | Shim   |           | (FlexE  |       |  (FlexE  |      |FlexE   |
 |        +-----^-----+  term)  |       |  term)   +------+ Shim   |
 |        |     |     |         |       |          |      |        |
 |        |     |     |         |       |          |      |        |
 +--------+     +     +---------+       +----------+      +--------+
           FlexE Group

 +-----------+   +--------------+    +-------------+   +-----------+
 | Client(s) |   | Client       |    | Client      |   | Client(s) |
 +-----------+   +--------+-----+    +------+------+   +-----------+
 | FlexE Shim|   | Shim   |     |    |      | Shim |   | FlexE Shim|
 +-----------+   +--------+ ODU |    | ODU  +------+   +-----------+
 | PHY(s)    |   | PHY(s) | flex|    | flex |PHY(s)|   | PHY(s)    |
 +---+-------+   +---+----+--+--+    +---+--+---+--+   +---+-------+
 |               |           |           |      |          |
 +---------------+           +-----------+------+----------+


Figure 3: FlexE termination: FlexE clients at both endpoints

3.3.2. Interworking of FlexE Client w/ Native Client at the other endpoint

The OIF implementation agreement [OIFFLEXE1] currently supports FlexE client signals carried over one or more 100GBASE-R PHY(s). There is a calendar of 5G timeslots associated with each PHY, and each FlexE client can make use of a number of timeslots (possibly distributed across the members of the FlexE group). This implies that the FlexE client rates are multiples of 5Gbps. When the rate of a FlexE client signal matches the MAC rate corresponding to an existing Ethernet PHY, i.e. 10GBASE-R/40GBASE-R/100GBASE-R, there is a need for the FlexE client signal to interwork with the native Ethernet client received from a single (non-FlexE capable) Ethernet PHY. This capability is expected to be extended to any Ethernet PHY rates that the IEEE may define in the future (e.g. 25G, 50G, 200G etc.). In these cases, although the bit rate of the FlexE client matches the MAC rate of the other endpoint, the 64B66B PCS codewords for the FlexE client need to be transformed (via ordered set translation) to match the specification for the specific Ethernet PHY. These details are described in Section 7.2.2 of [OIFFLEXE1] and are not elaborated any further in this document.
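
As a rough illustration of the rate conditions in the paragraph above, the following Python sketch checks whether a client rate is a valid multiple of 5G and whether it matches one of the native MAC rates named in the text. The function names are hypothetical, and the sketch says nothing about the ordered set translation itself, which is specified in [OIFFLEXE1].

```python
SLOT_GBPS = 5
# MAC rates of the existing PHYs named above: 10GBASE-R/40GBASE-R/100GBASE-R.
NATIVE_MAC_RATES_GBPS = (10, 40, 100)


def is_flexe_client_rate(rate_gbps):
    """FlexE client rates are positive multiples of the 5G timeslot."""
    return rate_gbps > 0 and rate_gbps % SLOT_GBPS == 0


def matches_native_phy(rate_gbps):
    """True if the client rate equals the MAC rate of a listed native PHY."""
    return is_flexe_client_rate(rate_gbps) and rate_gbps in NATIVE_MAC_RATES_GBPS


for rate in (10, 25, 40, 150):
    print(rate, is_flexe_client_rate(rate), matches_native_phy(rate))
```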

Figure 4 illustrates a scenario involving the interworking of a 10G FlexE client with a 10GBASE-R native Ethernet signal. In this example, the network wrapper is ODU2e.


 +--------+ 2 x 100GE +-------+           +-------+      +--------+
 |        |           |       |           |       |      |        |
 | Router1|           |       |           |       |      |        |
 |(FlexE  +-----------+ A-end |           | Z-end | 10GE |Router 2|
 | Shim)  |           |(FlexE |           |       +------+        |
 |        +-----^-----+ term) |           |       |      |        |
 |        |     |     |       |           |       |      |        |
 |        |     |     |       |           |       |      |        |
 +--------+     +     +-------+           +-------+      +--------+
          FlexE Group

 +-----------+   +---------------+
 | Client(s) |   | Client        |     +------------+    +---------+
 +-----------+   +-------+-------+     |   10GE PCS |    | 10GE PCS|
 | FlexE Shim|   | Shim  |       |     +-------+----+    +---------+
 +-----------+   +-------+  ODU  |     | ODU2e | PHY|    | PHY     |
 | PHY(s)    |   | PHY(s)|  2e   |     +---+---+--+-+    +-----+---+
 +---+-------+   +---+-------+---+         |      |            |
     |               |       |             |      |            |
     |               |       |             |      |            |
     +---------------+       +-------------+      +------------+


Figure 4: FlexE client interop with Native Ethernet Client

3.3.3. Interworking of FlexE client w/ Client from OIF_MLG

As explained in the Introduction (Section 1), OIFMLG3 [OIFMLG3] introduced support for carrying 10GE and 40GE client signals over a group of 100GBASE-R Ethernet PHY(s). While the most recent implementation agreement doesn't call it out explicitly, it is expected that the FlexE clients (as defined in [OIFFLEXE1]) and the 10GBASE-R/40GBASE-R clients supported by OIFMLG3 [OIFMLG3] will interoperate.

Figure 5 illustrates a scenario involving the interworking of a 10G FlexE client with a 10GBASE-R client supported by an OIFMLG3 interface. In this example, the network wrapper is ODU2e.


 +--------+ 2 x 100GE +---------+       +---------+      +---------+
 |        |           |         |       |         |      |         |
 | Router1|           |         |       |         |      |         |
 | FlexE  +-----------+ A-end   |       |  Z-end  +------+Router 2 |
 | Shim   |           | (FlexE  |       |         |      |(MLG-3.0)|
 |        +-----^-----+ term)   |       |         +------+         |
 |        |     |     |         |       |         |      |         |
 |        |     |     |         |       |         |      |         |
 +--------+     +     +---------+       +---------+      +---------+
           FlexE Group


+-----------+   +-------------+      +--------------+   +----------+
| Client(s) |   | Client      |      | 10GE PCS     |   | 10GE Cl. |
+-----------+   +--------+----+      +------+-------+   +----------+
| FlexE Shim|   | Shim   |    |      |      | MLG3  |   | MLG3     |
+-----------+   +--------+ ODU|      | ODU  +-------+   +----------+
| PHY(s)    |   | PHY(s) | 2e |      | 2e   | PHY(s)|   | PHY(s)   |
+---+-------+   +---+----+--+-+      +---+--+---+---+   +---+------+
    |               |       |            |      |            |
    +---------------+       +------------+      +------------+


Figure 5: FlexE client interop with Ethernet Client supported by MLG3

3.4. Back-to-Back FlexE

This section covers a degenerate FlexE termination scenario in which Router1, Router2, and Router3 are interconnected through back-to-back FlexE groups without an intermediate transport network (see Figure 6). Even in scenarios where there is a transport network providing FlexE unaware/aware transport services for this pair of FlexE groups, the FlexE layer network can be viewed as an overlay on top of the underlying transport network. As such, all of the FlexE Shim operations (e.g. adding/deleting FlexE clients, resizing existing clients) proceed in the same manner -- regardless of whether the routers are directly connected or not.

In this example, the FlexE Shim at Router2 extracts one or more FlexE client signals from the FlexE group connected to Router1, and multiplexes these extracted FlexE signals into the FlexE group towards the appropriate router (e.g. Router3). Note that each of the extracted FlexE client signals can be independently routed towards its respective FlexE group.


 +--------+ 2 x 100GE +---------+ 3 x 100GE +---------+
 |        |           |         |           |         |
 | Router1|           |         |           |         |
 | FlexE  +-----------+ Router2 +-----------+ Router3 |
 | Shim   |           | FlexE   +-----------+ FlexE   |
 |        +-----^-----+ Shim    +-----^-----+ Shim    |
 |        |     |     |         |     |     |         |
 |        |     |     |         |     |     |         |
 +--------+     +     +---------+     +     +---------+
           FlexE Group           FlexE Group


Figure 6: Back-to-Back FlexE

3.5. FlexE Client BW Resizing

The hop-by-hop (a hop is delimited by two FlexE Shim functions) resizing of a FlexE client signal operates by maintaining two sets of calendar slots for each client: the present and the future. Once the configuration of both sets of calendar slots for a specific client is complete, the node signals to its peer to switch from the present set to the new set of calendar slots. Note that the switch to the new set of calendar slots is unidirectional, and the process is executed independently for both directions of transfer. This process makes use of the following FlexE overhead (as per [OIFFLEXE1]):

  1. Currently active FlexE calendar (containing a list of mappings between the 5G tributary slots and the FlexE client signals)
  2. Future calendar to which the sender wants to transition
  3. Calendar switch request bit (CR)
  4. Calendar switch acknowledge bit (CA)
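
The interplay of the four overhead elements listed above can be pictured with a deliberately simplified Python sketch of one direction of one hop. The class names, the "A"/"B" calendar labels, and the single-step request/acknowledge exchange are assumptions made for illustration; the exact bit semantics and timing are defined in [OIFFLEXE1] and are not reproduced here.

```python
class CalendarSender:
    """One direction of a FlexE hop, seen from the transmitting shim."""

    def __init__(self, slots):
        self.calendars = {"A": [None] * slots, "B": [None] * slots}
        self.active = "A"     # calendar currently in use
        self.cr = False       # calendar switch request (CR) sent to the peer

    @property
    def standby(self):
        return "B" if self.active == "A" else "A"

    def configure_future(self, new_calendar):
        """Load the future slot-to-client mapping into the standby calendar."""
        if len(new_calendar) != len(self.calendars[self.standby]):
            raise ValueError("calendar length mismatch")
        self.calendars[self.standby] = list(new_calendar)

    def request_switch(self):
        self.cr = True        # ask the peer to switch to the future calendar

    def on_ack(self, ca):
        """Switch to the future calendar once the peer acknowledges (CA)."""
        if self.cr and ca:
            self.active = self.standby
            self.cr = False
        return self.active


class CalendarReceiver:
    def on_request(self, cr):
        """Hypothetical peer behaviour: acknowledge any pending request."""
        return bool(cr)       # value of the CA bit sent back


tx, rx = CalendarSender(slots=20), CalendarReceiver()
tx.configure_future(["client-1"] * 4 + [None] * 16)   # client-1 gets 4 slots
tx.request_switch()
print(tx.on_ack(rx.on_request(tx.cr)))
```

Because the switch is unidirectional, a bidirectional resize would run one such exchange per direction, and an end-to-end resize would repeat it on every hop.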

FlexE client resizing operations are supported and can be achieved via the configuration of Calendar A and Calendar B. It is worth noting that there is no guarantee that such resizing will be hitless. Table 1 provides a summary of client bandwidth resize applicability in various use cases presented in this document.

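Table 1: FlexE Client Resizing

+-----------+-----------+---------+-------------+----------------------------------+
| FlexE     | FlexE     | Use     | Transport   | Resizing supported?              |
| Shim      | Shim      | case    | network     |                                  |
| endpoint  | endpoint  |         | function    |                                  |
| 1         | 2         |         |             |                                  |
+-----------+-----------+---------+-------------+----------------------------------+
| CE (e.g.  | CE        | Section | FlexE       | Yes. Done at endpoints. The OTN  |
| router)   |           | 3.1     | unaware     | pipes are configured for the     |
|           |           |         | transport   | maximum number of calendar slots |
|           |           |         |             | across each PHY in the FlexE     |
|           |           |         |             | group. Therefore, no resizing is |
|           |           |         |             | required in the OTN layer.       |
+-----------+-----------+---------+-------------+----------------------------------+
| CE (e.g.  | CE        | Section | FlexE       | Supported at the endpoints only  |
| router)   |           | 3.2     | aware       | if the set of                    |
|           |           |         | transport   | available/unavailable calendar   |
|           |           |         |             | slots is constant. Not supported |
|           |           |         |             | otherwise (see notes at the end  |
|           |           |         |             | of Section 3.2).                 |
+-----------+-----------+---------+-------------+----------------------------------+
| CE (e.g.  | Transport | Section | FlexE       | Not supported due to the lack of |
| router)   | Network   | 3.3.1   | Termination | a general (i.e. one that works   |
|           | Edge      |         | in          | regardless of the ODUflex        |
|           |           |         | Transport   | bandwidth) hitless ODUflex       |
|           |           |         |             | resizing in G.709.               |
+-----------+-----------+---------+-------------+----------------------------------+
| CE (e.g.  | CE        | Section | No          | Yes. Done at endpoints by CE(s). |
| router)   |           | 3.4     | transport   | Thus, for example, in Figure 6,  |
|           |           |         | network     | the resizing of the end-to-end   |
|           |           |         |             | FlexE client circuit with a      |
|           |           |         |             | scope of Router1-Router2-Router3 |
|           |           |         |             | is accomplished by correctly     |
|           |           |         |             | coordinating the resizing        |
|           |           |         |             | operations across these two      |
|           |           |         |             | segments: Router1-Router2,       |
|           |           |         |             | Router2-Router3. It is expected  |
|           |           |         |             | that the exact sequence of       |
|           |           |         |             | hop-by-hop resize operations is  |
|           |           |         |             | different between bandwidth      |
|           |           |         |             | increase/decrease scenarios.     |
+-----------+-----------+---------+-------------+----------------------------------+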

4. Requirements

This section summarizes the requirements for FlexE Group and FlexE Client signaling and routing. The requirements are derived from the use cases described in Section 3 of this document. Data plane requirements and/or solutions (e.g. crunching of tributary slots, adding unavailable tributary slots, etc.) are not explicitly mentioned in the following text. Given that the control plane sets up circuits that transport client streams, there are no implications for the control plane in matters of delay, jitter tolerance, etc. The requirements listed in this section will be used to identify the Control Plane (i.e. Routing and Signaling) extensions that will be required to support FlexE services in an OTN.

A Control Plane solution will be compliant with the specification in Section 7 if it meets all the mandatory (MUST, SHALL) requirements. The solution may also meet the optional (SHOULD, MAY) requirements.

The solution SHALL support the creation of a FlexE group, consisting of one or more (i.e., in the 1 to 254 range) 100GE Ethernet PHY(s).

There are several alternatives that can meet this requirement, e.g. routing and signaling protocols, or a centralized controller/management system with network access to the FlexE mux/demux at each FlexE group termination point.
The solution SHOULD be able to verify that the collection of Ethernet PHY(s) included in a FlexE group have the same characteristics (e.g. number of PHYs, rate of PHYs, etc.) at the peer FlexE shims.
The solution SHALL support the ability to delete a FlexE group.
It SHALL support the ability to administratively lock/unlock a FlexE group.
It SHALL be possible to add/remove PHY(s) to/from an operational FlexE group while the group has been administratively locked.

[Note: Since the addition/removal of Ethernet PHY(s) is done only when the group has been locked, the data plane operation of the FlexE group ceases until the group is placed in an unlocked state.]
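The group lifecycle requirements above (creation with 1 to 254 PHYs, deletion, administrative lock/unlock, and PHY addition/removal only while locked) can be illustrated with a minimal sketch. The class and method names are hypothetical and do not correspond to any defined protocol object; the choice to reject removal of the last PHY is an assumption made for the example.

```python
class FlexEGroup:
    """Toy model of a FlexE group and its administrative state."""

    MAX_PHYS = 254  # upper bound on group size taken from the requirement text

    def __init__(self, group_number, phys):
        phys = set(phys)
        if not 1 <= len(phys) <= self.MAX_PHYS:
            raise ValueError("a FlexE group needs between 1 and 254 PHYs")
        self.group_number = group_number
        self.phys = phys
        self.locked = False  # groups start unlocked in this sketch

    def lock(self):
        self.locked = True

    def unlock(self):
        self.locked = False

    def _require_locked(self):
        if not self.locked:
            raise RuntimeError("group must be administratively locked first")

    def add_phy(self, phy):
        self._require_locked()
        if phy not in self.phys and len(self.phys) >= self.MAX_PHYS:
            raise ValueError("group already holds the maximum number of PHYs")
        self.phys.add(phy)

    def remove_phy(self, phy):
        self._require_locked()
        if self.phys == {phy}:
            # Assumption: removing the last PHY is a group deletion instead.
            raise ValueError("cannot remove the last PHY of a group")
        self.phys.discard(phy)
```

A caller would lock the group, change its membership, and unlock it again, accepting that traffic is interrupted while the group is locked.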
The solution SHALL support the ability to advertise (and discover) the information about FlexE capable nodes, and the FlexE group instances they are supporting.
It SHALL be possible to assign the transport network treatment for a FlexE group to one of the following choices: (a) FlexE unaware transport, (b) FlexE aware transport, (c) FlexE termination in Transport.
For the FlexE unaware case, each of the Ethernet PHY(s) in the FlexE group SHALL be mapped independently to the appropriately sized ODU container (as per [G.709]), and switched across the transport network [OIFFLEXE1]. The control plane SHALL be capable of co-routing the ODU signals that are transporting the member PHY(s) between the two FlexE Shim functions.

[Note: Insert applicable references to ITU, OIF spec for hard skew tolerances]
In the FlexE aware mode, the OTN SHALL crunch the PHY(s), and map them to one or more ODUflex connections as per [G.709].

When two or more ODUflex connections are used to transport the collection of FlexE PHY(s) in a FlexE group, the system SHALL support the ability to constrain the routes for these ODUflex connections (e.g. co-route them) so that the end-to-end skew is kept to a minimum (and within the range supported by the FlexE Shims).
The system SHALL allow the addition (or removal) of one or more FlexE clients against the FlexE group which is being terminated. The addition (or removal) of a FlexE client SHALL NOT affect the services for the other FlexE client signals.
The system SHALL allow the FlexE client signals to flexibly span the set of Ethernet PHY(s) which comprise the FlexE group. In other words, it SHALL be possible to distribute any FlexE client over an arbitrary combination of calendar slots (whose total capacity matches the client bitrate) chosen from a subset of the PHY(s).
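As a rough illustration of distributing a client over calendar slots drawn from several PHYs, the sketch below picks free slots until their total capacity matches the client bitrate. The function name, the input layout, and the simple "lowest PHY first" policy are assumptions for the example; a real implementation may apply any selection policy.

```python
SLOT_GBPS = 5  # 5G slot granularity, as used throughout this document


def pick_slots(free_slots, client_gbps):
    """Choose (phy, slot) pairs whose total capacity equals client_gbps.

    free_slots maps a PHY number to the list of its free slot indices.
    """
    if client_gbps <= 0 or client_gbps % SLOT_GBPS:
        raise ValueError("client rate must be a positive multiple of 5G")
    needed = client_gbps // SLOT_GBPS
    # Flatten the free slots, walking the PHYs in ascending order.
    candidates = [(phy, slot)
                  for phy in sorted(free_slots)
                  for slot in free_slots[phy]]
    if len(candidates) < needed:
        raise ValueError("not enough free calendar slots in the group")
    return candidates[:needed]
```

For example, a 25G client offered two free slots on PHY 1 and three on PHY 2 would be spread over both PHYs.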
When the FlexE group is terminated on the Transport edge node, this node SHOULD be capable of resizing one or more FlexE clients (using the "A/B" calendar signaling defined by OIF) (see Section 3.5). It is acceptable that this resizing is not hitless, and that the client signal incurs a glitch during the resizing operation.

There is no requirement for the OTN network to support the hitless resizing of the ODUflex connection which is transporting the FlexE client signal.
The solution SHALL support FlexE client resizing without affecting any existing FlexE clients within the same FlexE group.

5. Framework

This section discusses the environment in which FlexE operates. This includes both what FlexE runs over and what applications run on top of FlexE.

5.1. FlexE Layer Model

Based on the use cases described in Section 3, FlexE has a different mapping hierarchy in each case. This section describes the FlexE layer model for each of these cases.

5.1.1. Layer Model in FlexE Unaware Case

This case is depicted in Section 3.1. The FlexE Ethernet client represents an end-to-end connection from Router 1 to the destination Router 2. The FlexE Ethernet client signal is first mapped into the slots of FlexE at Router 1, then the FlexE signal is carried by Ethernet PHYs towards the destination Router 2. When the Ethernet PHYs arrive at the Transport network edge node A-end, each PHY is mapped into a separate ODU4 connection and then forwarded across the OTN network towards the ODU layer connection destination Z-end.

Note: in this case, more than one FlexE client can be carried by the FlexE layer.

Four different layers exist in this case, and the mapping hierarchy can be seen in Figure 1.

5.1.2. Layer Model in FlexE Terminating Case

This case is depicted in Section 3.3; take Section 3.3.1 as an example. The FlexE Ethernet client signal is first mapped into the slots of FlexE at Router 1, then the FlexE signal is carried by Ethernet PHYs towards the Transport Network edge node A-end. When the FlexE signal arrives at node A-end, node A-end first terminates the Ethernet PHY signal and the FlexE signal, extracts the FlexE Ethernet client signal, then maps the Ethernet client signal into an ODU signal and forwards it across the OTN network towards the destination node Z-end. Node Z-end first terminates the ODU signal, extracts the FlexE client signal from the ODU signal, then maps the Ethernet client signal into a FlexE signal, which is then carried by Ethernet PHYs towards the destination Router 2.

Two segments of FlexE connection exist in this case. One is from Router 1 to node A-end, and the other is from node Z-end to Router 2. The mapping hierarchy can be seen in Figure 3.

5.1.3. Layer Model in FlexE Aware Case

This case is depicted in Section 3.2. The FlexE Ethernet client is transferred from R1 to the destination R2, while the intermediate nodes NE A and NE Z are capable of the "crunching" and "combining" operations. The FlexE Ethernet client signal is first mapped into the slots of FlexE at R1, then the FlexE signal is carried by Ethernet PHYs towards the destination R2. When the Ethernet PHY signal arrives at node NE A, NE A first discards the unavailable slots, then maps the remaining FlexE slots onto an ODU connection. According to the description in [G.709], these FlexE slots can be carried across the OTN network via one or more ODUflex signals, which are carried in ODUCn/OTUCn/OTSiA signals.

Two kinds of mapping hierarchy exist in this case: in one, the FlexE connection is carried by Ethernet PHYs; in the other, the FlexE connection (e.g., FlexE-psg) is carried by ODUflex. Both can be seen in Figure 2.

5.2. GMPLS Considerations

The goal of this section is to provide an insight into the application of GMPLS as a control mechanism in FlexE networks. Specific control-plane requirements for the support of FlexE networks are covered in Section 5.3. This section describes how the FlexE Shim layer specific attributes are modelled for control purposes in different network scenarios, based on the FlexE capabilities described in the OIF Flex Ethernet (FlexE) Implementation Agreement [OIFFLEXE1].

5.2.1. General Considerations

The GMPLS control of the FlexE layer deals with the establishment of FlexE connections that traverse FlexE capable nodes. GMPLS labels are used to locally represent a FlexE connection and the slot assignment information associated with its client.

5.2.2. Consideration of FlexE LSPs

The FlexE LSP is a control-plane representation of a FlexE Connection and MUST be carried by an Ethernet PHYs LSP or an ODU LSP in the network.

Figure 1 depicts a scenario in which the FlexE LSP is carried over an Ethernet PHYs LSP from Router 1 (R1) to Router 2 (R2). When there is a need to set up an end-to-end FlexE connection to carry a FlexE Ethernet client signal at R1, R1 first checks whether there are enough resources for setting up the FlexE LSP. If so, R1 first sets up the Ethernet PHYs LSP from R1 to R2, and then sets up the FlexE LSP over the Ethernet PHYs LSP. This process actually includes three signalling procedures: the first sets up multiple ODU4 LSPs to carry the Ethernet PHYs, the second sets up multiple Ethernet PHY connections to carry the FlexE LSP, and the third sets up the FlexE connection to carry the FlexE Ethernet client signal. The signalling of the FlexE LSP SHOULD be able to reserve resources for the Ethernet client.

Figure 2 depicts the case in which the FlexE LSP is carried over an ODU LSP between NE A and NE Z. This case differs from the one in Figure 1, and is used to support cases such as those in which the Ethernet PHY rate is greater than the wavelength rate, or the wavelength rate is not an integral multiple of the PHY rate. Both NE A and NE Z support the partial-rate capability, which means that when the FlexE LSP over Ethernet PHYs arrives at NE A, NE A first discards the unavailable slots and then maps the remaining FlexE slots into the ODU signal.

5.2.3. Control-Plane Modelling of FlexE Network Elements

FlexE is a new kind of transport technology, which brings a number of new constraints. These constraints are listed as follows:

  • Unavailable slots: these differ from "unused" slots in that it is known, due to transport network constraints, that not all of the calendar slots generated by the FlexE mux will reach the FlexE demux, and therefore no FlexE client should be assigned to those slots. As defined in the Flex Ethernet Implementation Agreement, unavailable slots are always at the end of the sub-calendar configuration for the respective PHY.
  • Unused slots: unused slots can be allocated to an Ethernet client as an available resource.
  • Partial-rate capability: the partial-rate capability is usually supported by OTN edge equipment. Equipment that supports partial-rate is capable of discarding unavailable slots and transferring the remaining slots across the OTN transport network.
  • Slot granularity: currently, only one slot granularity (5G) is defined in the OIF Flex Ethernet (FlexE) Implementation Agreement.
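The distinction between unavailable, unused, and assigned slots can be made concrete with a small sketch that classifies the slots of one PHY sub-calendar. The figure of 20 slots per 100G PHY comes from the OIF FlexE Implementation Agreement rather than from this section; the function name and the string labels are hypothetical.

```python
SLOTS_PER_100G_PHY = 20  # 20 x 5G calendar slots per 100G PHY (OIF FlexE IA)


def classify_slots(unavailable_count, assigned):
    """Return the state of every slot in one PHY sub-calendar.

    unavailable_count: number of unavailable slots, which always sit at
        the end of the sub-calendar.
    assigned: dict mapping a slot index to the client using that slot.
    """
    if not 0 <= unavailable_count <= SLOTS_PER_100G_PHY:
        raise ValueError("unavailable_count out of range")
    first_unavailable = SLOTS_PER_100G_PHY - unavailable_count
    states = []
    for index in range(SLOTS_PER_100G_PHY):
        if index >= first_unavailable:
            if index in assigned:
                # No client may be placed on an unavailable slot.
                raise ValueError(f"slot {index} is unavailable")
            states.append("unavailable")
        elif index in assigned:
            states.append(f"client:{assigned[index]}")
        else:
            states.append("unused")
    return states
```

Only the slots reported as "unused" would be offered to new clients; a partial-rate capable node would discard the trailing "unavailable" slots before mapping into the OTN.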

5.2.4. FlexE Layer Resource Allocation Considerations

The FlexE LSP provides resources to its client layer, mainly by exposing information about its unused slots to that layer. Besides the slot information, some other attributes also need to be specified when allocating resources during the connection setup process.

  • FlexE group number: a set of Ethernet PHYs can be bonded together and used as a whole as one FlexE LSP. FlexE LSPs between the same source and destination equipment SHOULD NOT have the same FlexE group number. The source and destination equipment SHOULD be aware of the existence of different FlexE groups and of which Ethernet PHYs belong to which FlexE group.
  • PHY number: a dynamic, logical number that is assigned through the control plane or management plane, is unique within the context of (source, destination), and has a one-to-one correlation with a physical port. This information is also carried in the FlexE overhead. The source and destination equipment SHOULD negotiate a value for every Ethernet PHY within one FlexE group.
  • Slot assignment information: the FlexE LSP is transferred based on slot positions, so the equipment SHOULD be able to tell which slot is assigned to which client.
  • Partial-rate: during resource allocation, the location where partial-rate mapping happens should be indicated.
  • Granularity: currently, only one slot granularity (5G) is defined in the OIF Flex Ethernet (FlexE) Implementation Agreement [OIFFLEXE1].
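One possible way to picture the PHY number attribute is as a one-to-one mapping from physical ports to logical numbers that are not yet in use between the same source and destination. The sketch below is illustrative only: the function name, the port naming, the "lowest free number" policy, and the 1 to 254 number range (borrowed from the group size requirement in Section 4) are assumptions.

```python
def assign_phy_numbers(ports, in_use=frozenset(), max_number=254):
    """Map each physical port to a distinct, currently unused PHY number.

    ports: iterable of physical port identifiers for one FlexE group.
    in_use: PHY numbers already taken in this (source, destination) context.
    """
    free_numbers = (n for n in range(1, max_number + 1) if n not in in_use)
    mapping = {}
    for port in dict.fromkeys(ports):  # drop duplicates, keep order
        try:
            mapping[port] = next(free_numbers)
        except StopIteration:
            raise ValueError("no PHY numbers left for this node pair") from None
    return mapping
```

The resulting mapping is what the two ends would need to agree on, whether through signalling or through the management plane.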

5.2.5. Neighbour Discovery and Link Property Correlation

There are potential interworking problems between different FlexE capable equipment. Devices might not be able to support the interworking of every slot, due to constraints of the transport network equipment or other constraints. In this case, two directly connected FlexE capable devices SHOULD run the neighbour discovery process and correlate the link properties to determine which slots are unavailable and which slots can be used by the client. The neighbour discovery protocol can run over the in-band FlexE section management channel, or over an out-of-band management channel.
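A minimal sketch of the correlation step, assuming each end reports the slots it considers unavailable per PHY: a slot is treated as usable only if neither end marks it unavailable. The function name, the input layout, and the default of 20 slots per PHY are assumptions for the example, not part of any defined protocol.

```python
def correlate_link(local_unavailable, remote_unavailable, slots_per_phy=20):
    """Return, per PHY known to both ends, the slots usable by clients.

    Each argument maps a PHY number to the slot indices that the
    corresponding end of the link reports as unavailable.
    """
    usable = {}
    for phy in local_unavailable.keys() & remote_unavailable.keys():
        blocked = set(local_unavailable[phy]) | set(remote_unavailable[phy])
        usable[phy] = [s for s in range(slots_per_phy) if s not in blocked]
    return usable
```

PHYs reported by only one end are left out of the result, which is one simple way to flag a membership mismatch for further handling.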

5.2.6. Routing and Topology Dissemination

The topology and routing information is used by the path computation entity to compute an end-to-end path. Besides the basic connectivity information, there are also some FlexE specific attributes that should be taken into consideration.

  • Partial-rate: partial-rate capability is a special feature which allows equipment to discard unavailable slots and transfer the remaining slots across the OTN transport network. The path computation entity is more likely to compute a feasible path if this capability is taken into consideration.
  • Unavailable slot information: this information indicates that certain slots SHOULD NOT be considered when computing an end-to-end path. The unavailable slots cannot be used to forward signals because of transport constraints.
  • Unused slot information: unused slots can be allocated to the path as an available resource.
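To show how advertised slot information might feed path computation, the sketch below prunes links that do not offer enough unused slots and then runs a breadth-first search over what remains. It is a toy under stated assumptions: the function name, the link representation (a node pair mapped to a count of unused, available slots), and the use of hop count as the only metric are all hypothetical.

```python
from collections import deque


def find_path(links, src, dst, slots_needed):
    """Return a fewest-hop path whose links all have enough unused slots.

    links maps an (a, b) node pair to the number of unused slots on that
    bidirectional link; unavailable slots are assumed to be excluded.
    """
    adjacency = {}
    for (a, b), unused in links.items():
        if unused >= slots_needed:  # prune links that cannot carry the client
            adjacency.setdefault(a, []).append(b)
            adjacency.setdefault(b, []).append(a)
    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in adjacency.get(node, []):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None  # no feasible path with the requested number of slots
```

A fuller model could also record whether a node supports partial-rate and only admit links with unavailable slots when both of their endpoints do.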

5.3. Control-Plane Protocol Requirements

The control of FlexE networks brings some additional requirements to the GMPLS protocols. This section summarizes those requirements for the signalling, routing, and link management protocols.

5.3.1. Support for Signalling of FlexE

The aim of the signalling is to set up an end-to-end LSP for the FlexE signal.

The signalling procedures shall be able to assign FlexE related attributes for an LSP, which include the FlexE group number for a FlexE LSP. This FlexE group number is unique and can be used to indicate a group of Ethernet PHYs bonded together.

The signalling procedures shall be able to assign a unique PHY number for each bonded Ethernet PHY, and the correlation between the assigned PHY number and the real physical port number SHOULD also be indicated when signalling.

The signalling procedures shall be able to configure the slot information allocated for a FlexE LSP.

The signalling procedures shall be able to indicate the place where partial-rate mapping happens.

The signalling procedures shall be able to support the non-hitless resizing of a FlexE client.

5.3.2. Support for Routing of FlexE

The routing protocol extensions are mainly based on the functionality that is described in [RFC4202] and these extensions are made to fit into FlexE network.

The routing protocol SHALL distribute sufficient information to compute paths to enable the signalling procedure to establish LSPs as described in the previous sections.

The routing protocol SHALL update its advertisements of available resources and capabilities to include the partial-rate support information and unused slot information on each Ethernet PHY port.

5.3.3. Support for Neighbour Discovery and Link Property Correlation

The control plane MAY include support for neighbour discovery such that a FlexE network can be constructed in a "plug-and-play" manner.

The control plane SHOULD allow the nodes at opposite ends of a link to correlate the properties that they will apply to the link. Such a correlation SHOULD include at least the identities of the nodes and the identities that they apply to the link. Other FlexE specific properties, such as unavailable slot information for the link, SHOULD also be correlated. Such neighbour discovery and link property correlation, if provided, MUST be able to operate both in-band and out-of-band.

6. Architecture

This section discusses the different parts of FlexE signaling and routing and how these parts interoperate.

FlexE control plane technology SHOULD be able to set up end-to-end connections in different cases, which may include the management of the FlexE group, the assignment of resources to the FlexE client, and so on.

The FlexE routing mechanism is used to provide resource availability information for setting up FlexE connections, such as information about the Ethernet PHYs and partial-rate support. Based on the resource availability information advertised by the routing protocol, an end-to-end FlexE connection is computed, and then the signalling protocol is used to set up the end-to-end connection.

7. Solution

8. Acknowledgements

9. IANA Considerations

This memo includes no request to IANA.

Note to the RFC Editor: This section should be removed before publishing.

10. Security Considerations


11. Contributors

Khuzema Pithewan, Infinera Corp,

Fatai Zhang, Huawei,

Jie Dong, Huawei,

Zongpeng Du, Huawei,

Xian Zhang, Huawei,

James Huang, Huawei,

Qiwen Zhong, Huawei,

12. References

12.1. Normative References

[G.709] ITU, "Optical Transport Network Interfaces", July 2016.
[G7044] ITU, "Hitless adjustment of ODUflex(GFP)", October 2011.
[G798] ITU, "Characteristics of optical transport network hierarchy equipment functional blocks", February 2014.
[OIFFLEXE1] OIF, "Flex Ethernet Implementation Agreement Version 1.0 (OIF-FLEXE-01.0)", March 2016.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997.
[RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V. and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP Tunnels", RFC 3209, DOI 10.17487/RFC3209, December 2001.
[RFC3471] Berger, L., "Generalized Multi-Protocol Label Switching (GMPLS) Signaling Functional Description", RFC 3471, DOI 10.17487/RFC3471, January 2003.
[RFC3473] Berger, L., "Generalized Multi-Protocol Label Switching (GMPLS) Signaling Resource ReserVation Protocol-Traffic Engineering (RSVP-TE) Extensions", RFC 3473, DOI 10.17487/RFC3473, January 2003.
[RFC3630] Katz, D., Kompella, K. and D. Yeung, "Traffic Engineering (TE) Extensions to OSPF Version 2", RFC 3630, DOI 10.17487/RFC3630, September 2003.
[RFC4202] Kompella, K. and Y. Rekhter, "Routing Extensions in Support of Generalized Multi-Protocol Label Switching (GMPLS)", RFC 4202, DOI 10.17487/RFC4202, October 2005.
[RFC4203] Kompella, K. and Y. Rekhter, "OSPF Extensions in Support of Generalized Multi-Protocol Label Switching (GMPLS)", RFC 4203, DOI 10.17487/RFC4203, October 2005.
[RFC4204] Lang, J., "Link Management Protocol (LMP)", RFC 4204, DOI 10.17487/RFC4204, October 2005.

12.2. Informative References

[OIFMLG3] OIF, "Multi-Lane Gearbox Implementation Agreement Version 3.0 (OIF-MLG-3.0)", April 2016.
[RFC3945] Mannie, E., "Generalized Multi-Protocol Label Switching (GMPLS) Architecture", RFC 3945, DOI 10.17487/RFC3945, October 2004.

Appendix A. Additional Stuff

This becomes an Appendix.

Authors' Addresses

Iftekhar Hussain (editor) Infinera Corp 169 Java Drive Sunnyvale, CA 94089 USA EMail:
Radha Valiveti Infinera Corp 169 Java Drive Sunnyvale, CA 94089 USA EMail:
Qilei Wang (editor) ZTE Nanjing, CN EMail:
Loa Andersson (editor) Huawei Stockholm, Sweden EMail:
Mach Chen Huawei CN EMail:
Haomian Zheng Huawei CN EMail: