NVO3 P. Jain
Internet-Draft Nuage Networks
Intended status: Standards Track K. Singh
Expires: December 09, 2013 F. Balus
Nuage Networks
W. Henderickx
Alcatel-Lucent
V. Bannai
PayPal
June 07, 2013

Detecting VXLAN Segment Failure
draft-jain-nvo3-vxlan-ping-00

Abstract

This proposal describes a mechanism that can be used to detect Data Path Failures and sanity of VXLAN Control and Data Plane for a given VXLAN Segment. This document defines the following:

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on December 09, 2013.

Copyright Notice

Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.


Table of Contents

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC2119 [RFC2119].

When used in lower case, these words convey their typical use in common language, and are not to be interpreted as described in RFC2119 [RFC2119].

1. Introduction

VXLAN [I-D.draft-mahalingam-dutt-dcops-vxlan]is a tunneling mechanism to overlay Layer 2 networks on top of Layer 3 networks. In most cases the end point of the tunnel (VTEP) is intended to be at the edge of the network, typically connecting an access switch to an IP transport network. The access switch could be a physical or a virtual switch located within the hypervisor on the server which is connected to End System which is a VM.

This document describes a mechanism that can be used to detect Data Plane failures and sanity of VXLAN Control and Data Plane for a given VXLAN Segment.

There proposal describes:

The proposal defines the information to check correct operation of the Data Plane, as well as a mechanism to verify the Data Plane against the Control Plane for a given VXLAN Segment.

Its is important consideration in this proposal to carry VXLAN Echo Request along same data path that normal VXLAN Packets would traverse.

The tenants VM(s) are not aware of the Virtualization Overlays and as such the need for the verification of the Data Path MUST solely rest with the Cloud Provider. The use cases where the Tenant VM(s) need to be aware of the Data Plane failures is beyond the scope of this document.

2. Terminology

Terminology used in this document:

VXLAN: Virtual eXtensible Local Area Network.

VTEP: VXLAN Tunnel End Point.

VM: Virtual Machine.

VNI: VXLAN Network Identifier (or VXLAN Segment ID)

NVE: Network Virtulalized Edge

End System: Could be VM etc. - System whose data is expected to go over VXLAN Segment.

Other terminologies are as defined in [I-D.draft-mahalingam-dutt-dcops-vxlan].

3. Need for VXLAN Ping

When a VXLAN Segment fails to deliver user traffic, there is a need to provide a tool that would enable users, as Cloud Providers to detect such failures, and a mechanism to isolate faults. It may also be desirable to test the data path before mapping End System traffic to the VXLAN Segment.

The basic idea is to facilitate following verifications:-

To facilitate verification of the above requirements, this document proposes sending of a Packet (called an "VXLAN Echo Request") along the same data path as other Packets belonging to this Segment. VXLAN Echo Request also carries information about the VXLAN Segment whose Data Path is to be verified. This Echo Request is forwarded just like any other End System Data Packet belonging to that VXLAN Segment, as it contains the same VXLAN Encapsulation as regular End System's data which uses the given VXLAN Segment.

On receiving VXLAN Echo Request at the end of the VXLAN Segment, it is sent to the Control Plane of the Terminating VTEP, which in-turn would respond with VXLAN Echo Reply.

	 
                          +------- L3 Network ------+ 
                          |                         | 
                          |       Tunnel Overlay    | 
             +------------+---------+       +---------+------------+ 
             | +----------+-------+ |       | +---------+--------+ | 
             | |  Overlay Module  | |       | |  Overlay Module  | | 
        NVE1 | |   +-----------+  | |       | |   +-----------+  | | NVE2
             | |   | OAM Module|  | |       | |   | OAM Module|  | | 
             | |   +-----------+  | |       | |   +-----------+  | | 
             | +------------------+ |       | +------------------+ |
             +----+------------+----+       +----+-----------+-----+ 
                  |            |                 |           | 
           -------+------------+-----------------+-----------+------- 
                  |            |                 |           | 
                  |            |                 |           | 
                 Tenant Systems                 Tenant Systems    
	                               
								   Fig 1
      

As depicted in Fig 1, VXLAN OAM Module SHOULD reside at NVE device and Echo Request and Reply SHOULD be handled at the NVE.

4. Packet Format

VXLAN Echo Request/Reply is a IPv4 UDP Packet. Following is the format of VXLAN Echo Packet:



       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |  Message Type |   Reply mode  |  Return Code  | Return Subcode|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                        Originator Handle                      |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                        Sequence Number                        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                    TimeStamp Sent (seconds)                   |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                  TimeStamp Sent (microseconds)                |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                  TimeStamp Received (seconds)                 |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                TimeStamp Received (microseconds)              |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                            TLVs ...                           |
      .                                                               .
      .                                                               .
      .                                                               .
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

      

The Message Type is one of the following:-

Value What it means
1 VXLAN Echo Request
2 VXLAN Echo Reply

Reply Mode Values:-

Value What it means
1 Do not reply
2 Reply via an IPv4/IPv6 UDP Packet

VXLAN Echo Request with 1 (Do not reply) in the Reply Mode field may be used for one-way connectivity tests; the receiving node may log gaps in the Sequence Numbers and/or maintain delay/jitter statistics. For normal operation VXLAN Echo Request would have 2 (Reply via an IPv4 UDP Packet) in the Reply Mode field.

The Originator's Handle is filled in by the Originator, and returned unchanged by the receiver in the Echo Reply (if any). There value used for this field can be implementation dependent, this MAY be used by the Originator for matching up requests with replies.

The Sequence Number is assigned by the sender of the VXLAN echo request and can be (for example) used to detect missed replies.

The TimeStamp Sent is the time-of-day (in seconds and microseconds, according to the sender's clock) in NTP format [NTP] when the VXLAN Echo Request is sent. The TimeStamp Received in an Echo Reply is the time-of-day (according to the receiver's clock) in NTP format that the corresponding Echo Request was received.

TLVs (Type-Length-Value tuples) have the following format:


       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |             Type              |            Length             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                             Value                             |
      .                                                               .
      .                                                               .
      .                                                               .
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

  

Types are defined below; Length is the length of the Value field in octets. The Value field depends on the Type; it is zero padded to align to a 4-octet boundary.

Example of the TLV for VXLAN ping is as follows.


       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |    Type = 1 (VXLAN ping)    |          Length                 |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          VXLAN VNI                            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                     IPv4 Sender Address                       |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     
                       TLV if Sender Address is IPv4
	 
       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |    Type = 1 (VXLAN ping)    |          Length                 |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          VXLAN VNI                            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      |                     IPv6 Sender Address                       |
      |                                                               |
      |                                                               |	  
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
	 
	 
	 
	                   TLV if Sender Address is IPv6
 

5. Return Codes

Sender MUST always set the Return Code set to zero. The receiver can set it to one of the values listed below when replying back to Echo-Request.

Following are the Return Codes (Suggested):-

Value What it means
0 No return code
1 Malformed Echo Request Received
2 No VN-ID Present
3 Return-Code-OK

6. Procedure for VXLAN Ping

VXLAN Echo Request is used to test Data Plane and its view of Control Plane for particular VXLAN Segment. The VXLAN Segment to be verified is identified by the "VN-ID". For the Data Plane verification, the complete VXLAN Echo Request Packet would be encapsulated with the VXLAN header of the given VXLAN Segment. For control plane's view of the given VXLAN Segment at the Terminating VTEP, the TLV will be encoded under Echo Request Packet, to identify presence of given VXLAN Segment.

When VXLAN Echo Request is received, the Terminating VTEP is expected to verify the consistency of the Control Plane and Data Plane for the VXLAN Segment identified by the VN-ID.

6.1. Sending VXLAN Echo Request

Inner Ethernet Header for the VXLAN Echo Request Packet should have the Destination Mac set to 00-00-5E-90-XX-XX (to be assigned IANA); the Source Mac is set to Mac Address of the Originating VTEP. In the payload portion of this VXLAN Frame. The IP header is set as follows: the source IP Address is a routable Address of the sender; the destination IP Address is a (randomly chosen) IPv4 Address from the range 127/8, IPv6 addresses are chosen from the range 0:0:0:0:0:FFFF:127/104. The IP TTL is set to 255. The UDP header is set as follows: the source UDP port is chosen by the sender. It is desired to choose the Source UDP port so as to exercise the entropy for ECMP/load balancing across the VXLAN Overlay, and it left to the implementation; the destination UDP port is set to xxxx (assigned by IANA for VXLAN Echo Requests). The rest of the Echo Request is encoded as define in above section "Packet Format", VN-ID in the TLV for Echo Request MUST be same as the VN-ID for the VXLAN Segment. The IPv4 Sender Address is set to the Address of Originating VTEP's IP Address. The VXLAN Router Alert option [I-D.draft-singh-nvo3-vxlan-router-alert] MUST be set in the VXLAN header as shown below. A VXLAN Echo Request is sent encapsulated with the VXLAN header corresponding to the VXLAN Segment being tested.

    VXLAN Header:
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |R|R|R|R|I|R|R|RA|           Reserved                           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                VXLAN Network Identifier (VNI) |   Reserved    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

       RA: Router Alter Bit (Proposed)
	   
	   

The sender chooses a Originator's Handle and a Sequence Number. When sending subsequent VXLAN Echo Requests, the sender SHOULD increment the Sequence Number by 1.

The TimeStamp Sent is set to the time-of-day (in seconds and microseconds) that the Echo Request is sent. The TimeStamp Received is set to zero. Also, the Reply Mode must be set to the desired reply mode. The Return Code and Subcode are set to zero.

6.2. Receiving VXLAN Echo Request

At the Terminating VTEP, sending VXLAN Echo Request to the Control Plane is triggered by one of the following Packet processing exceptions: VXLAN Router Alert option, [I-D.draft-singh-nvo3-vxlan-router-alert] the Inner Destination MAC Address of 00-00-5E-90-XX-XX as defined in above section, and the Destination IP Address in the 127/8 Address range for IPv4 Address, or 0:0:0:0:0:FFFF:127/104 for IPv6 Address. The Control Plane further identifies the VXLAN Echo Request by UDP destination port xxxx.

Once the VXLAN Echo Request Packed is identified at Control Plane, it is processed as follows:-

6.3. Sending VXLAN Echo Reply

VXLAN Echo Reply is a UDP Packet. It MUST ONLY be sent in response to VXLAN Echo Request. The Source IP Address is a routable Address of the replier; the source port is the well-known UDP port for VXLAN ping. The Destination IP Address and UDP port are copied from the Source IP Address and UDP port of the Echo Request. The IP TTL is set to 255.

The format of the Echo Reply is the same as the Echo Request. The Originator Handle, the Sequence Number, and TimeStamp Sent are copied from the Echo Request; the TimeStamp Received is set to the time-of- day that the Echo Request is received (note that this information is most useful if the time-of-day clocks on the requester and the replier are synchronized).The replier MUST fill in the Return Code and Subcode, as determined in the previous subsection.

6.4. Receiving VXLAN Echo Reply

An Originating VTEP should only receive VXLAN Echo Reply in response to an VXLAN Echo Request that it sent. Thus, on receipt of VXLAN echo reply, Originating VTEP should parse the Packet to ensure that it is well-formed, then attempt to match up the Echo Reply with an Echo Request that it had previously sent, using the destination UDP port and the Originator Handle. If no match is found, then VTEP should drop the Echo Reply Packet; otherwise, it checks the Sequence Number to see if it matches.

7. Security Considerations

TBD

8. Management Considerations

None

9. Acknowledgements

This document is the outcome of many discussions among many people, including Diego Garcia Del Rio, Saurabh Shrivastava and Suresh Boddapati of Nuage Networks and Jorge Rabadan of Alcatel-Lucent, Inc.

10. IANA Considerations

Action-1: This specification reserves a IANA UDP Port Number to be used when sending the VXLAN Echo Request Packet.

Action-2: This specification reserves a IANA Ethernet unicast Address for VXLAN Exception handling. This Address needs to be reserved from the block. "IANA Ethernet Address block - Unicast Use"

11. References

11.1. Normative References

[I-D.draft-mahalingam-dutt-dcops-vxlan] Mahalingam, M., Dutt, D., Agarwal, P., Kreeger, L., Sridhar, T., Bursell, M. and C. Wright, "VXLAN: A Framework for Overlaying Virtualized Layer 2 Networks over Layer 3 Networks", May 2013.
[I-D.draft-lasserre-nvo3-framework] Lasserre, M., Balus, F., Morin, T., Bitar, N. and Y. Rekhter, "Framework for DC Network Virtualization", September 2011.
[I-D.draft-singh-nvo3-vxlan-router-alert] Singh, K., Jain, P., Balus, F. and W. Henderickx, "VxLAN Router Alert Option", May 2013.

11.2. Informative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC4379] Kompella, K. and G. Swallow, "Detecting Multi-Protocol Label Switched (MPLS) Data Plane Failures", RFC 4379, February 2006.
[RFC4330] Mills, D., "Simple Network Time Protocol (SNTP) Version 4 for IPv4, IPv6 and OSI", RFC 4330, January 2006.

Authors' Addresses

Pradeep Jain Nuage Networks 755 Ravendale Drive Mountain View, CA, 94043 USA EMail: pradeep@nuagenetworks.net
Kanwar Singh Nuage Networks 755 Ravendale Drive Mountain View, CA, 94043 USA EMail: kanwar@nuagenetworks.net
Florin Balus Nuage Networks 755 Ravendale Drive Mountain View, CA, 94043 USA EMail: florin@nuagenetworks.net
Wim Henderickx Alcatel-Lucent Copernicuslaan 50 Antwerp, 2018 Belgium EMail: wim.henderickx@alcatel-lucent.be
Vinay Bannai PayPal 2211 N. First St, San Jose, 95131 USA EMail: vbannai@paypal.com