Identification of Overlay Operations, Administration, and Maintenance (OAM)
ZTE Corp.
gregimirsky@gmail.com
Routing
RTGWG Working Group
Internet-Draft
Overlay Network
OAM
This document analyzes how the presence of an Operations, Administration, and Maintenance (OAM)
control command and/or special data is identified in some overlay networks and the impact that the choice of identification
may have on OAM functionality.
Operations, Administration, and Maintenance (OAM) protocols are used to detect and localize defects in the network
and to monitor network performance. Some OAM functions, e.g., failure detection, work in the
network proactively, while others, e.g., defect localization, are usually performed on demand.
These tasks are achieved by a combination of active, passive, and hybrid OAM methods, as defined in .
This document analyzes how the presence of an Operations, Administration, and Maintenance (OAM)
control command and/or special data, i.e., an OAM packet, is identified in some overlay networks, and the impact that the choice of identification
may have on the OAM functionality of active and hybrid
OAM methods for the respective overlay network encapsulation.
AMM Alternate Marking Method
BIER Bit Indexed Explicit Replication
DetNet Deterministic Networks
GUE Generic UDP Encapsulation
HTS Hybrid Two-step
NSH Network Service Header
NVO3 Network Virtualization Overlays
OAM Operations, Administration, and Maintenance
SFC Service Function Chaining
TLV Type-Length-Value
VXLAN-GPE Generic Protocol Extension for VXLAN
Underlay Network or Underlay Layer: The network that provides
connectivity between the DetNet nodes. An MPLS network that provides LSP
connectivity between DetNet nodes is an example of an underlay layer.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
"MAY", and "OPTIONAL" in this document are to be interpreted as
described in BCP 14
when, and only when, they appear in all capitals, as shown here.
New overlay network encapsulations are analyzed in two groups:
encapsulations that support optional meta-data;
fixed-size encapsulations.
A number of the new encapsulation protocols (e.g., Geneve ,
GUE , and SFC NSH )
support the use of Type-Length-Value (TLV)
encoding to include optional information in the header. OAM is identified in these protocols
as follows:
Geneve:
O (1 bit): after the WGLC discussion, the interpretation of the O field has changed. The O field now identifies a control packet.
This packet contains a control message.
Control messages are sent between tunnel endpoints. Tunnel
Endpoints MUST NOT forward the payload and transit devices MUST
NOT attempt to interpret it. Since these are infrequent control
messages, it is RECOMMENDED that tunnel endpoints direct these
packets to a high priority control queue (for example, to direct
the packet to a general purpose CPU from a forwarding ASIC or to
separate out control traffic on a NIC). Transit devices MUST NOT
alter forwarding behavior on the basis of this bit, such as ECMP
link selection.
GUE:
The C-bit provides a separate namespace to
carry formatted data that is implicitly addressed
to the decapsulator in order to monitor or control the state or behavior of a tunnel.
The payload is interpreted as a control message with the type specified in
the proto/ctype field. The format and contents of the control message are indicated by the type and can be variable length.
SFC NSH:
O bit: Setting this bit indicates an OAM packet.
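The Geneve and NSH flags listed above can be read from the first octets of the respective base headers. The following Python sketch is illustrative only. It assumes the base header layouts given in the published Geneve and NSH specifications; the function names geneve_flags and nsh_flags are hypothetical and do not come from any specification or library.

```python
import struct


def geneve_flags(header: bytes) -> dict:
    """Decode the leading fields of a Geneve base header (assumed layout)."""
    if len(header) < 8:
        raise ValueError("a Geneve base header is 8 octets long")
    first, second, protocol_type = struct.unpack("!BBH", header[:4])
    return {
        "version": first >> 6,
        "opt_len_words": first & 0x3F,   # options length in 4-octet words
        "o_bit": bool(second & 0x80),    # set: control packet
        "c_bit": bool(second & 0x40),    # set: critical options present
        "protocol_type": protocol_type,  # type of the payload after the header
    }


def nsh_flags(header: bytes) -> dict:
    """Decode the NSH base header, i.e., its first 4 octets (assumed layout)."""
    if len(header) < 4:
        raise ValueError("an NSH base header is 4 octets long")
    (word,) = struct.unpack("!I", header[:4])
    return {
        "version": word >> 30,
        "o_bit": bool(word & 0x20000000),    # set: OAM packet
        "ttl": (word >> 22) & 0x3F,
        "length_words": (word >> 16) & 0x3F,  # total NSH length in 4-octet words
        "md_type": (word >> 8) & 0x0F,
        "next_protocol": word & 0xFF,
    }


if __name__ == "__main__":
    # A Geneve header with only the O bit set and an arbitrary Protocol Type.
    print(geneve_flags(bytes([0x00, 0x80, 0x65, 0x58, 0, 0, 0, 0])))
    # An NSH base header with the O bit set.
    print(nsh_flags(struct.pack("!I", (1 << 29) | (63 << 22) | (6 << 16) | (1 << 8) | 0x03)))
```

In both headers the flag and the field that names the payload protocol are decoded independently, which is why their combination needs interpretation rules.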
Common to Geneve and NSH is the use of a dedicated flag to identify an OAM packet together with
a field that identifies the protocol of the payload that immediately follows
the encapsulation header. points out that if the value of that field is interpreted as none,
i.e., no payload follows the header, then OAM may be included in TLVs, thus creating an active OAM
packet. A limitation of this mechanism for supporting active OAM methods is the size
of the data that can be included in a TLV. For example, the maximum size of data in an NSH Meta-data Type 2,
as defined in section 2.5.1 , is 512 octets. The maximum length of data in a Geneve Option,
per section 3.5 , is 128 octets. Thus, using one TLV as an active OAM packet
would not allow creating larger test packets, which are useful when measuring packet loss and latency with synthetic traffic
as part of the service activation procedure.
suggests that the O bit be used to identify an OAM packet and that the Next Protocol
field identify the OAM function:
While the presence of OAM marker in
the overlay header (e.g., O bit in the NSH header) indicates it as
OAM packet, it is not sufficient to signal for which OAM function the
packet is intended.
At the same time, some in-situ OAM proposals, e.g., ,
suggest using TLVs to communicate hybrid OAM commands and data. The proposed resolution of
using the combination of the O bit and the Next Protocol field:
... the O bit MUST NOT be set for regular customer
traffic which also carries IOAM data and the O bit MUST be set for
OAM packets which carry only IOAM data without any regular data
payload.
implies that the O bit identifies only an active
OAM packet and is not set when hybrid OAM methods are used.
One of the possible solutions for encapsulations with meta-data has been specified in :
To identify an active OAM message, the value of the Next Protocol field MUST be set to Active SFC OAM.
The rules of interpreting the values of O bit and the Next Protocol field are as follows:
O bit set and the Next Protocol value is not one of those identifying an active or hybrid OAM protocol (per definitions),
e.g., the Active SFC OAM defined in this specification - a Fixed-Length Context Header
or Variable-Length Context Header(s) contain an OAM command or data,
and the type of payload is determined by the Next Protocol field;
O bit set and the Next Protocol value is one of those identifying an active or hybrid OAM protocol -
the payload that immediately follows the SFC NSH contains an OAM command or data;
O bit is clear - no OAM in a Fixed-Length Context Header or Variable-Length Context Header(s),
and the payload is determined by the value of the Next Protocol field;
O bit is clear, and the Next Protocol value is one of those identifying an active or hybrid OAM protocol -
this combination MUST be identified and reported as erroneous. An implementation MAY have
control to enable processing of the OAM payload.
From the above-listed rules follows the recommendation to avoid the
combination of OAM in a Fixed-Length Context Header or Variable-
Length Context Header(s) and in the payload immediately following the
SFC NSH because there is no unambiguous way to identify such
combination using the O bit and the Next Protocol field.
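The four rules for interpreting the O bit and the Next Protocol field can be sketched as a small decision function. This Python example is illustrative only: the names OamLocation and locate_oam are hypothetical, and the value 0xFE stands in for whatever Next Protocol values identify an active or hybrid OAM protocol; it is a placeholder, not an assigned code point.

```python
from enum import Enum

# Placeholder set of Next Protocol values that identify an active or
# hybrid OAM protocol. 0xFE is an assumption made for illustration only.
EXAMPLE_OAM_NEXT_PROTOCOLS = frozenset({0xFE})


class OamLocation(Enum):
    NONE = "no OAM"
    CONTEXT_HEADERS = "OAM in Context Header(s)"
    PAYLOAD = "OAM in the payload following the NSH"
    ERROR = "erroneous combination"


def locate_oam(o_bit: bool, next_protocol: int,
               oam_protocols=EXAMPLE_OAM_NEXT_PROTOCOLS) -> OamLocation:
    """Apply the four interpretation rules to one NSH base header."""
    names_oam = next_protocol in oam_protocols
    if o_bit and not names_oam:
        # OAM is carried in Context Header(s); the payload type is
        # whatever the Next Protocol field says.
        return OamLocation.CONTEXT_HEADERS
    if o_bit and names_oam:
        # The payload immediately following the NSH is OAM.
        return OamLocation.PAYLOAD
    if names_oam:
        # O bit clear but Next Protocol names an OAM protocol:
        # to be identified and reported as erroneous.
        return OamLocation.ERROR
    return OamLocation.NONE


if __name__ == "__main__":
    for o_bit in (True, False):
        for next_protocol in (0x03, 0xFE):
            print(o_bit, hex(next_protocol), locate_oam(o_bit, next_protocol).value)
```

The function returns exactly one location per header, which reflects why OAM carried in both the Context Header(s) and the payload cannot be signaled with these two fields alone.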
A number of the new encapsulation protocols (e.g., VXLAN-GPE ,
BIER )
use a fixed-size header. OAM is identified in these protocols
as follows:
VXLAN-GPE:
OAM Flag Bit (O bit): The O bit is set to indicate that the packet
is an OAM packet.
BIER:
An OAM packet is identified by the value of the Next Protocol field.
The IANA BIER Next Protocol Identifiers registry includes the identifier for OAM (5).
The use of a combination of the OAM Flag Bit and the Next Protocol field in VXLAN-GPE
requires clarification of the header interpretation when the OAM Flag Bit is set and
the value of the Next Protocol field is one of those defined in section 3.2 of .
The BIER encapsulation, defined in , identifies an OAM message immediately following the BIER header
by the value of the Next Protocol field.
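The difference between the two fixed-size headers can be illustrated with a short Python sketch. It assumes the header layouts of the VXLAN-GPE and BIER encapsulation specifications (O bit as the least significant bit of the first VXLAN-GPE octet, Next Protocol in the fourth octet; BIER Proto and BFIR-id in the third 32-bit word). The function names and the payload_protocols parameter are hypothetical; the caller supplies the set of Next Protocol values it treats as defined payload types.

```python
import struct

BIER_PROTO_OAM = 5  # OAM identifier in the BIER Next Protocol registry


def vxlan_gpe_oam(header: bytes, payload_protocols: frozenset) -> dict:
    """Read the O bit and Next Protocol of a VXLAN-GPE header (assumed layout)."""
    if len(header) < 8:
        raise ValueError("a VXLAN-GPE header is 8 octets long")
    flags, next_protocol = header[0], header[3]
    o_bit = bool(flags & 0x01)
    return {
        "o_bit": o_bit,
        "next_protocol": next_protocol,
        # The combination whose interpretation needs clarification:
        # O bit set while Next Protocol names a defined payload type.
        "needs_clarification": o_bit and next_protocol in payload_protocols,
    }


def bier_oam(header: bytes) -> dict:
    """Read Proto and BFIR-id from a BIER header (assumed layout)."""
    if len(header) < 12:
        raise ValueError("need at least the first 12 octets of the BIER header")
    (word,) = struct.unpack("!I", header[8:12])
    proto = (word >> 16) & 0x3F
    return {
        "proto": proto,
        "bfir_id": word & 0xFFFF,
        "is_oam": proto == BIER_PROTO_OAM,
    }


if __name__ == "__main__":
    # Illustrative set of payload protocol values chosen by the caller.
    defined = frozenset({0x01, 0x02, 0x03, 0x04})
    print(vxlan_gpe_oam(bytes([0x0D, 0, 0, 0x01, 0, 0, 0x2A, 0]), defined))
    print(bier_oam(bytes(8) + struct.pack("!I", (BIER_PROTO_OAM << 16) | 0x0102)))
```

BIER needs only one field to decide whether OAM follows the header, whereas VXLAN-GPE exposes two independent indicators that can disagree.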
Availability of the packet originator's source information is required for active two-way OAM,
e.g., echo request/reply. When the underlay network is
IPv4/IPv6, the source information will be derived from the underlay.
However, when an MPLS underlay network is used, the encapsulation of an active OAM packet has to follow specific rules:
if available, use the Sender ID in the overlay domain (for example, the BFIR ID in BIER );
use IP/UDP encapsulation of an OAM packet in the overlay (similar to Section 4.3 ).
In addition to active methods, an OAM toolset may include methods that do not use test packets specially constructed and injected into the
network. defines OAM methods that are neither entirely active nor entirely passive
but are a combination of both as hybrid methods.
One example of a hybrid OAM method, in-situ OAM, is mentioned in .
Another example, the Alternate Marking Method (AMM) , enables on-path OAM functions, e.g., delay and loss measurements,
using the data traffic. Because the impact of AMM on the network can be minimized,
the measured metrics can be correlated with the network conditions experienced by the specific service.
Of all the encapsulations listed in , BIER has allocated a field that may be used for AMM,
as discussed in . The applicability of AMM to other overlay
protocols, i.e., SFC NSH, discussed in ,
Geneve , and in IPv6 networks , has been actively discussed.
Hybrid Two-step (HTS), defined in , provides
on-path collection and transport of the telemetry information. HTS enables accurate and consistent
measurements by separating the measurement action from the transport of the collected data while ensuring that the follow-up
packet that carries the telemetry information follows the data packet that triggered the measurement.
OAM control commands and data may be present as part of the overlay encapsulation header or as a payload
that follows the overlay network header. The recommendations are as follows:
OAM in the overlay header, if supported by the overlay network, is identified
by a dedicated flag. Using this method for active OAM is possible, but the functionality is limited.
OAM that follows the overlay header is identified as a payload type, e.g., by the value of the Next Protocol field.
This document makes no requests to IANA. This section may be removed.
This document analyzes the identification of OAM in overlay networks
and does not raise any security concerns or issues in addition to those common to networking.