Enhancing Virtual Network Encapsulation with IPv6
draft-smith-enhance-vne-with-ipv6-00

Abstract

A variety of network virtualization over layer 3 methods are currently being developed and deployed. These methods treat IPv4 and IPv6 as equivalent underlay network transports. This memo suggests how IPv6's additional capabilities may be used to enhance virtual network encapsulation.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on December 4, 2014.

Copyright Notice

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

1. Introduction
2. Terminology
3. Carrying the Virtual Network Context ID in the Flow Label Field
4. Using /64s to Identify NVEs
5. Carrying Tenant Packet information in IIDs
6. Permanent Virtual Network Multicast Group Identifier
7. Per-Virtual Network Multicast Group Addresses
8. Per-NVE Multicast Addresses
9. Security Considerations
10. Acknowledgements
11. Change Log [RFC Editor please remove]
12. References
12.1. Normative References
12.2. Informative References
Author's Address

1. Introduction

A variety of network virtualization over layer 3 methods are currently being developed and deployed [I-D.mahalingam-dutt-dcops-vxlan][I-D.sridharan-virtualization-nvgre][I-D.davie-stt]. Each of these methods treats both IPv4 and IPv6 as functionally equivalent underlay network transports, with both providing unicast and multicast capabilities.

IPv6 provides a number of capabilities not available in IPv4. This memo suggests how they may be used to enhance the encapsulation of virtual network traffic when transported over an IPv6 underlay network.

This memo does not consider how virtual network signalling protocols could be enhanced using IPv6. However, it may be possible to use similar techniques to enhance these signalling protocols when being transported over IPv6.

2. Terminology

This memo adopts the terminology described in [I-D.ietf-nvo3-framework], summarised and supplemented below.

Tenant Systems - devices operated by the user of the virtual network service. They may be hosts or other network elements such as routers, and are not aware of the virtual network service.

IPv6 Underlay Network - the IPv6 network across which Tenant Packets are carried, encapsulated within IPv6, and most likely also within some other Network Virtualization header. It is assumed that this network supports both unicast and multicast IPv6 transport.

Tenant Packet - a packet originated by a Tenant System, tunneled within IPv6 across the IPv6 Underlay Network. The most common Tenant Packets are likely to be Ethernet/IEEE 802.3 frames, although other link-layer frame types, and network layer packets such as IPv6 or IPv4 packets, could be supported.

Virtual Network - a single logical network interconnecting Tenant Systems. The IPv6 Underlay Network provides transport services for some or all of the Virtual Network.

Virtual Network Context Identifier (Virtual Network Context ID) - an identifier used to specify the Virtual Network a Tenant Packet belongs to while the packet is carried over the IPv6 Underlay Network.

Network Virtualization Edge (NVE) - a device or function within a device that performs IPv6 encapsulation of Tenant Packets on ingress to the IPv6 Underlay Network and IPv6 decapsulation of Tenant Packets on egress from the IPv6 Underlay Network. Other network virtualization related headers may be added or removed during the IPv6 encapsulation or decapsulation procedure.

3. Carrying the Virtual Network Context ID in the Flow Label Field

The IPv6 Flow Label field is a 20-bit field in a fixed location early in the IPv6 header [RFC2460][RFC6437]. It is intended to be used to identify flows between a pair of source and destination IPv6 addresses, as an alternative to identifying flows using transport layer header port numbers, which may be located deeper within the IPv6 packet, perhaps following a number of other IPv6 extension headers, or hidden by IPsec.

One of the expected (and encouraged) uses of the Flow Label field is as an input into link or path selection when using stateless load balancing of traffic across multiple links, using methods such as Equal Cost Multi-Path (ECMP) or Link Aggregation Groups (LAGs) [RFC6436][RFC6438].

The Flow Label field could be used to carry a whole or partial copy of the Virtual Network Context ID, providing it as an input into a stateless load balancing method. Traffic for different Virtual Networks may then be distributed across different links/paths within the IPv6 Underlay Network.

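Alternatively, the Flow Label field could carry the Virtual Network Context ID itself, rather than a copy of it, providing support for up to 2^20 (approximately 1 million) Virtual Networks. As no separate Virtual Network Context ID field would then be needed, this would reduce the encapsulation overhead of tunneling over IPv6.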
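As an illustration only, the following Python sketch shows one way a Virtual Network Context ID could be placed in, and recovered from, the Flow Label field of the first four octets of an IPv6 header. The function names are hypothetical and not part of any specification. The sketch assumes the whole 20-bit value is the Virtual Network Context ID, and it rejects zero because [RFC6437] treats a zero Flow Label as unset.

```python
import struct

FLOW_LABEL_BITS = 20
FLOW_LABEL_MASK = (1 << FLOW_LABEL_BITS) - 1  # 0xFFFFF, i.e. 2^20 - 1


def pack_first_word(vn_context_id: int, traffic_class: int = 0) -> bytes:
    """Build the first 4 octets of an IPv6 header.

    Layout: 4-bit version (6), 8-bit traffic class, 20-bit flow label.
    The flow label is assumed to hold the Virtual Network Context ID.
    """
    if not 0 < vn_context_id <= FLOW_LABEL_MASK:
        raise ValueError("VN Context ID must be non-zero and fit in 20 bits")
    word = (6 << 28) | ((traffic_class & 0xFF) << 20) | vn_context_id
    return struct.pack("!I", word)


def unpack_vn_context_id(header: bytes) -> int:
    """Recover the 20-bit flow label from the start of an IPv6 header."""
    (word,) = struct.unpack("!I", header[:4])
    return word & FLOW_LABEL_MASK
```

For example, pack_first_word(0x12345) yields the octets 60 01 23 45, and unpack_vn_context_id() returns 0x12345 from them.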

A drawback of using the Flow Label to carry the Virtual Network Context ID is that it is a 'best effort' field, meaning that it may be changed as it transits the network without any protection by an end-to-end checksum, including when other fields in the IPv6 header are protected by IPsec [RFC4302]. A change of the Flow Label field value when used to carry the Virtual Network Context ID would mean the Tenant Packet would either be delivered to the incorrect Virtual Network or would be dropped because the specified Virtual Network does not exist. Incorrect Virtual Network delivery would likely be unacceptable to the Virtual Network's user for security reasons.

This could be resolved by protecting the integrity of the Flow Label field value using a checksum carried in some other virtual network related header, and validating that checksum when the IPv6 tunnelling header is removed, before delivery to the corresponding Virtual Network.

Note that a Flow Label value of zero has been deemed to mean that the Flow Label value has not been set [RFC6437], and can therefore be changed as the IPv6 packet traverses the network. This would preclude the use of the Flow Label field to carry a Virtual Network Context ID value of zero, because if it were changed by an intermediary device it would fail the Flow Label integrity check using checksum information carried by some other tunnelling header.
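The integrity check described above might be sketched as follows. This memo does not specify a checksum algorithm or the header that would carry it; the use of CRC-32 here, and the function names, are assumptions made purely for illustration.

```python
import zlib
from typing import Optional


def flow_label_check(flow_label: int) -> int:
    """Checksum computed by the ingress NVE over the 20-bit Flow Label.

    CRC-32 is an arbitrary choice for this sketch; the value would be
    carried in some other virtual network related header.
    """
    return zlib.crc32(flow_label.to_bytes(3, "big"))


def validate_vn_context_id(flow_label: int, carried_check: int) -> Optional[int]:
    """Return the Virtual Network Context ID, or None if the packet must be dropped."""
    if flow_label == 0:
        # Zero means "not set" and may be rewritten in transit, so it
        # cannot be used as a Virtual Network Context ID.
        return None
    if flow_label_check(flow_label) != carried_check:
        # The Flow Label was changed somewhere in the underlay network.
        return None
    return flow_label
```

An egress NVE would call validate_vn_context_id() when removing the IPv6 tunnelling header, and deliver the Tenant Packet only if a Virtual Network Context ID is returned.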

Carrying Virtual Network Context ID information in the Flow Label field is also likely to assist with troubleshooting and facilitate traffic analysis using standard IPv6 tools that can analyse the Flow Label field.

4. Using /64s to Identify NVEs

Networks operating IPv6 have large numbers of /64 subnets; at a minimum, even the smallest end site is expected to be assigned a /56 or 256 /64s [RFC6177].

Instead of assigning NVEs unicast IPv6 addresses, NVEs could be assigned /64 prefixes. An NVE would then announce its /64 prefix into the IPv6 Underlay Network's routing domain, using an IGP or EGP such as OSPF or BGP. This would provide reachability and availability information to other NVEs, and support multihoming and load sharing when an NVE has multiple attachments to the IPv6 Underlay Network. Automated discovery of NVEs could be facilitated by attaching a special identifier to the NVE /64 announcements using route attributes, using mechanisms such as OSPF's External Route Tag or a BGP community.

If multiple NVEs are attached to the same Tenant System network segment, they could all be assigned and all announce the same /64 prefix. This would result in unicast Tenant Packets encapsulated in unicast IPv6 packets being more optimally forwarded to the closest NVE that provides access to the Tenant System network segment, and would also provide redundancy if one of these NVEs announcing the same /64 prefix fails.

As the NVEs are now identified by /64 prefixes, for unicast Tenant Packets, the source and destination IPv6 addresses used for the IPv6 encapsulation can be the Subnet-Router anycast address, which is formed by combining the NVE /64 prefix with an IID portion value of all zeros [RFC4291]. For multicast traffic, the source address used can be the Subnet-Router anycast address, while the destination address used is an IPv6 multicast address used to reach the appropriate NVEs.
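The derivation of the Subnet-Router anycast address from an NVE's /64 prefix can be sketched with Python's standard ipaddress module. The function name and the example prefix are illustrative assumptions only.

```python
import ipaddress


def subnet_router_anycast(nve_prefix: str) -> ipaddress.IPv6Address:
    """Return the Subnet-Router anycast address for an NVE /64 prefix.

    The address is the prefix followed by an all-zeros IID, which is
    the network address of the /64.
    """
    net = ipaddress.IPv6Network(nve_prefix)
    if net.prefixlen != 64:
        raise ValueError("NVEs are assumed to be identified by /64 prefixes")
    return net.network_address
```

For an NVE assigned the (example) prefix fd00:1234:5678:1::/64, the encapsulating address would be fd00:1234:5678:1::.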

[I-D.carpenter-6man-why64] reports that some IPv6 routers provide optimal forwarding performance for /64 or shorter prefixes. Assigning /64s to NVEs would gain the best performance from this class of IPv6 routers when carrying traffic across the IPv6 Underlay Network.

5. Carrying Tenant Packet information in IIDs

If /64s are used to identify NVEs, then the 8 octet IID portions of the encapsulating IPv6 header's source and destination addresses (for unicast traffic), or of its source address only (for multicast traffic), can be used to carry copies of fields of the encapsulated Tenant Packets. This is instead of setting the IID portion to the Subnet-Router anycast address IID previously suggested.

The fields likely to be most valuable to copy from the Tenant Packets into the outer IPv6 address IID portions are the addressing fields. As the IPv6 source and destination address fields can be used for stateless load balancing (e.g., ECMP or LAG), the variation in the IID portions of the address, as a result of being copies of the Tenant Packet addresses, will improve the effectiveness of load balancing, while preserving in-order delivery of Tenant Packets between pairs of Tenant Systems.

When the IPv6 IID portions are used to carry Tenant Packet values, the receiving NVE would need to treat none of the received IID values as having any special significance. In other words, none of the IID values described in [RFC5453] are to be considered reserved. This exception would only apply to the /64 prefix(es) an NVE is using for IPv6 Underlay Network tunneling.

For most if not all Tenant Packet addresses, the 8 octet IPv6 IID field will be more than large enough to hold a complete copy of the Tenant Packet addresses. To reduce tunneling overhead, these fields could be removed from the Tenant Packet while being tunneled, and restored when the IPv6 packet arrives at the destination NVE, as part of the process of IPv6 decapsulation.

Note that the IPv6 header is not protected by an end-to-end checksum, so removing the Tenant Packet fields during IPv6 encapsulation should only be performed when the removed fields are protected by a Tenant Packet checksum. The validation of this checksum could occur either when the Tenant Packet is reconstructed by the destination NVE, dropping corrupted frames as early as possible, or be left to be validated by the Tenant Packet destination, increasing NVE performance at the cost of possibly forwarding corrupted Tenant Packets towards their destination Tenant System.

As Tenant Packet corruption is likely to be rare, it is recommended to leave this validation to the final Tenant Packet destination. It would be useful for a network operator to be able to switch on validation at an NVE temporarily for troubleshooting purposes.

If the Tenant Packet addresses are smaller than the IPv6 address IID portions, other Tenant Packet fields could be copied into the remaining parts of the IPv6 address IID portions, and also removed from the Tenant Packet, further increasing stateless load-balancing effectiveness and reducing tunneling overhead. For example, for Ethernet/IEEE 802.3 Tenant Packets, the type/length field could also be copied into the remaining two octets of the IPv6 source address IID portion, after the 6 octet Ethernet/IEEE 802.3 source address, and then removed from the original Tenant Packet.
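The Ethernet/IEEE 802.3 example above could be sketched as follows. The function names are hypothetical, the prefix and MAC address are examples, and placing the 6 octet address first followed by two further octets is one possible layout rather than a defined encoding.

```python
import ipaddress
from typing import Tuple

IID_MASK = (1 << 64) - 1


def outer_address(nve_prefix: str, mac: bytes,
                  extra: bytes = b"\x00\x00") -> ipaddress.IPv6Address:
    """Form an outer IPv6 address from an NVE /64 and Tenant Packet fields.

    The IID holds the 6 octet Ethernet address followed by 2 further
    octets, e.g. the type/length field for the source address.
    """
    net = ipaddress.IPv6Network(nve_prefix)
    if net.prefixlen != 64:
        raise ValueError("NVE prefix must be a /64")
    if len(mac) != 6 or len(extra) != 2:
        raise ValueError("expected a 6 octet address and 2 extra octets")
    iid = int.from_bytes(mac + extra, "big")
    return ipaddress.IPv6Address(int(net.network_address) | iid)


def split_outer_address(addr: ipaddress.IPv6Address) -> Tuple[bytes, bytes]:
    """Recover the Ethernet address and the 2 extra octets at the egress NVE."""
    iid = (int(addr) & IID_MASK).to_bytes(8, "big")
    return iid[:6], iid[6:]
```

With the prefix fd00:0:0:a::/64, source address 02:00:de:ad:be:ef and type/length value 0x86DD, the outer source address would be fd00:0:0:a:200:dead:beef:86dd, and split_outer_address() restores both fields during decapsulation.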

This optimisation of carrying Tenant Packet field values in the IID portions of the IPv6 encapsulating header's address fields, and removing them from the Tenant Packet, could be indicated to the destination NVE implicitly by the Virtual Network Context ID, or via some other header carried in the IPv6 packet.

Carrying Tenant Packet addresses and other fields in the IID fields of the IPv6 Underlay Network header would expose this information to IPv6 traffic analysis tools such as IPFIX [RFC7011], providing the IPv6 Underlay Network operator with more detailed information about traffic volumes and other information between individual Tenant Systems.

6. Permanent Virtual Network Multicast Group Identifier

To simplify and automate configuration, a permanent IPv6 multicast group identifier could be assigned by IANA, in accordance with the allocation guidelines specified in [RFC3307], to be used for encapsulation of multicast Tenant Packets in IPv6 multicast packets.

This group ID would be used to form Interface-Local, Link-Local, and Site-Local scope multicast addresses. Each NVE would then subscribe to these scoped multicast addresses for the permanent group ID. The range of different scopes will allow an origin NVE to constrain the forwarding domain of IPv6 multicast packets holding multicast Tenant Packets if necessary or useful.

Other multicast scopes that may be useful for NVE encapsulation operation might be the Realm-Local, Admin-Local, and Organization-Local scopes [I-D.droms-6man-multicast-scopes], also used with the IANA reserved group ID.
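To illustrate how a single permanent group ID yields one multicast address per scope, the sketch below builds the addresses in Python. No group ID has been assigned by IANA for this purpose; the value 0x1234 is a placeholder, and the names used are assumptions for illustration.

```python
import ipaddress

# Placeholder only: no group ID has been assigned by IANA for this use.
HYPOTHETICAL_GROUP_ID = 0x1234

# Multicast scope values for the scopes named in this section.
SCOPES = {
    "interface-local": 0x1,
    "link-local": 0x2,
    "realm-local": 0x3,
    "admin-local": 0x4,
    "site-local": 0x5,
    "organization-local": 0x8,
}


def scoped_group_address(scope: int, group_id: int) -> ipaddress.IPv6Address:
    """Form ff0<scope>::<group_id>, a permanent (flags = 0) multicast address."""
    if not 0 < group_id <= 0xFFFFFFFF:
        raise ValueError("group ID must be a non-zero 32-bit value")
    value = (0xFF << 120) | (scope << 112) | group_id
    return ipaddress.IPv6Address(value)


# The three addresses each NVE would subscribe to by default.
DEFAULT_GROUPS = [
    scoped_group_address(SCOPES[name], HYPOTHETICAL_GROUP_ID)
    for name in ("interface-local", "link-local", "site-local")
]
```

With the placeholder group ID, the default subscriptions would be ff01::1234, ff02::1234 and ff05::1234.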

7. Per-Virtual Network Multicast Group Addresses

Using a single well known multicast group to flood IPv6 encapsulated multicast Tenant Packets to all NVEs for all Virtual Networks may eventually impact network performance, due to the volume of multicast traffic being sent to NVEs at which the corresponding Virtual Network is not present.

Reducing network load may be achieved by using multiple multicast groups to distribute IPv6 encapsulated multicast Tenant Packets only to NVEs where the Virtual Network is present.

[RFC3306] describes how to create multicast addresses using a unicast IPv6 prefix, between 0 and 64 bits in length. For each unicast IPv6 derived multicast prefix, 32 bits are available for the Group ID. These group IDs are created using the guidelines specified in [RFC3307]. For dynamically created multicast addresses, [RFC3307] restricts the group ID range to (in IPv6 address form) ::8000:0000 to ::FFFF:FFFF, a range of 31 bits or approximately 2 billion unique groups. The leading high order bit in the Group ID corresponds to the 'T' bit value in the multicast address flag, which indicates a Temporary multicast address. This ensures that when the multicast group is mapped to a link-layer address, by copying the lower 32 bits of the multicast address to the link-layer multicast address range (e.g., 33:33:xx:xx:xx:xx for Ethernet/IEEE 802.3), the link-layer multicast address does not collide with Permanent IPv6 multicast addresses at the link-layer.
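The link-layer mapping mentioned above can be sketched as follows for Ethernet/IEEE 802.3. The function name is hypothetical.

```python
import ipaddress


def ethernet_multicast_mac(group: str) -> str:
    """Map an IPv6 multicast address to its 33:33:xx:xx:xx:xx Ethernet address.

    Only the low-order 32 bits (the group ID) survive the mapping.
    """
    addr = ipaddress.IPv6Address(group)
    if not addr.is_multicast:
        raise ValueError("not an IPv6 multicast address")
    low32 = int(addr) & 0xFFFFFFFF
    octets = b"\x33\x33" + low32.to_bytes(4, "big")
    return ":".join(f"{octet:02x}" for octet in octets)
```

A dynamic group ID such as 0x8000002A maps to 33:33:80:00:00:2a, whose third octet has its high order bit set, whereas a group ID below 0x80000000 never produces such an address.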

These 31 bits of dynamic group IDs, available for a unicast prefix, could be used to form a unique multicast group address per Virtual Network, using the Virtual Network Context ID, by combining it with an IPv6 prefix used by all NVEs. The NVEs would be informed of the common IPv6 prefix using manual configuration or a signalling protocol.
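A per-Virtual Network multicast address formed this way might be computed as in the sketch below, which follows the [RFC3306] address layout (flags, scope, reserved, prefix length, network prefix, group ID). The function name, the default Site-Local scope and the example prefix are assumptions; placing the Virtual Network Context ID directly in the low 31 bits of the group ID is one possible mapping.

```python
import ipaddress


def vn_multicast_address(common_prefix: str, vn_context_id: int,
                         scope: int = 0x5) -> ipaddress.IPv6Address:
    """Form a unicast-prefix-based multicast address for one Virtual Network.

    Layout: 0xFF | flags (P=1, T=1) | scope | 8 reserved bits |
            8-bit prefix length | 64-bit network prefix | 32-bit group ID.
    """
    net = ipaddress.IPv6Network(common_prefix)
    if net.prefixlen > 64:
        raise ValueError("prefix must be 64 bits or shorter")
    if not 0 <= vn_context_id < (1 << 31):
        raise ValueError("VN Context ID must fit in 31 bits")
    group_id = 0x80000000 | vn_context_id      # dynamic group ID range
    prefix64 = int(net.network_address) >> 64  # upper 64 bits, zero padded
    value = ((0xFF << 120) | (0x3 << 116) | (scope << 112)
             | (net.prefixlen << 96) | (prefix64 << 32) | group_id)
    return ipaddress.IPv6Address(value)
```

For the common prefix fd00:1234:5678::/48 and Virtual Network Context ID 42, this gives ff35:30:fd00:1234:5678:0:8000:2a.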

The common IPv6 prefix used to form these addresses does not have to be related to any of the /64 prefixes being used by the NVEs. However, it is recommended to relate them intuitively, by using an IPv6 prefix such as an aggregate prefix that covers the set of identifying /64 prefixes being used by the NVEs attached to the same IPv6 Underlay Network. This would simplify configuration by reducing errors, simplify troubleshooting, and may benefit inter-domain multicast routing (I don't know much about this; can multicast groups be aggregated across inter-domain boundaries?).

With larger numbers of Virtual Networks, one multicast group per Virtual Network may exceed the IPv6 Underlay Network's capacity to reliably track multicast group membership for all of the present multicast groups.

The preferred option in this situation would be to create another IPv6 Underlay Network, and to move some, ideally half, of the Virtual Networks to the new IPv6 Underlay Network. This would preserve the efficiency of one multicast group per Virtual Network, as well as increasing encapsulation network unicast and multicast capacity.

An alternative, although less efficient option would be to map multiple Virtual Networks onto each multicast group. Assuming ascending Virtual Network Context IDs starting at 0 or 1, a simple stateless mapping scheme would be to use a bitwise mask to zero the lower order bits of the Virtual Network Context ID. The number of Virtual Network Context IDs mapped to each multicast group would be equal to 2^(number of zeroed bits). For example, zeroing the lower 8 bits of the Virtual Network Context ID would result in 256 Virtual Network Context IDs mapping to the same multicast group. Alternatively, a lookup table and some other allocation method could be used to map present Virtual Network Context IDs to the corresponding multicast groups.
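The bitwise mask mapping can be sketched in a few lines of Python. The function names are illustrative only.

```python
def shared_group_key(vn_context_id: int, zeroed_bits: int) -> int:
    """Zero the low-order bits of a Virtual Network Context ID.

    All Virtual Network Context IDs with the same result share one
    multicast group.
    """
    mask = ~((1 << zeroed_bits) - 1)
    return vn_context_id & mask


def vns_per_group(zeroed_bits: int) -> int:
    """Number of Virtual Network Context IDs mapped to each multicast group."""
    return 1 << zeroed_bits
```

With 8 zeroed bits, Virtual Network Context IDs 0x1200 through 0x12FF all map to 0x1200, while 0x1300 begins the next group.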

8. Per-NVE Multicast Addresses

As each NVE is identified by a /64 prefix, the method of forming multicast addresses described in [RFC3306] could also be used by an NVE to generate multicast group addresses specific to its /64 prefix. This may be useful when multiple NVEs are using the same /64 prefix for performance and redundancy purposes, and the origin NVE can determine that it only needs to send encapsulated multicast Tenant Packets to the set of NVEs sharing a single /64 prefix.

NVEs creating multicast groups for all of their present Virtual Network Context IDs for their /64 prefix may not be practical, as it would increase the number of multicast group memberships the IPv6 Underlay Network needs to track proportional to the number of NVEs and the number of Virtual Networks. Mapping multiple Virtual Networks to a multicast group may also consume excessive multicast membership tracking resources, as the amount of traffic towards one or more NVEs using a single /64 is likely to be small.

Instead, the NVEs should use the IANA permanent multicast group ID to form the per-NVE /64 derived multicast addresses. As before, the NVE would then subscribe to the Interface-Local, Link-Local and Site-Local scope forms of this multicast address, and optionally other multicast scopes.

9. Security Considerations

Within a trusted IPv6 Underlay Network, copying or carrying virtual network or tenant system packet attributes in IPv6 header fields will not significantly further expose them to untrusted parties, as they are likely to already exist in clear text within the IPv6 packet payload.

However, if the IPv6 Underlay Network is to span portions of the Internet, the IPv6 packets should be carried within IPsec or some other secure tunneling protocol that provides confidentiality, integrity and authenticity, to mitigate pervasive monitoring [RFC7258] and other security concerns.

In particular, when using IPsec, tunnel mode should be used. Otherwise, the exposed IPv6 Underlay Network packet headers, or the Tenant Packets they carry, would facilitate analysis of Tenant System traffic, by exposing detailed information about the numbers and identities of the Virtual Networks, possibly globally unique details of individual Tenant Systems, and volumes of traffic between distinct Tenant Systems. (Perhaps some text about setting the IPv6 Flow Label in the IPsec outer IPv6 packet to a hash of some of the inner IPv6 packet fields (derived from the Tenant Packets), with a periodically changing salt mixed into the hash, for the benefit of ECMP/LAG while traversing the Internet? Perhaps there could be an RFC about using the outer Flow Label field with IPsec tunnel mode? (and perhaps transport mode?))

To reduce the possibility of accidental forwarding of IPv6 Underlay Network traffic onto the Internet, it is recommended that the IPv6 Underlay Network is numbered using the Unique Local Address space [RFC4193], with egress packet filters dropping ULA source or destination packets at the network's Internet boundary. Additional egress packet filters at the edge of the IPv6 Underlay Network, for the ULA address space in use within the IPv6 Underlay Network, would provide further protection against accidental forwarding of IPv6 Underlay Network traffic onto the Internet.
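The egress filter decision described above amounts to a simple prefix match, sketched below in Python. The function name is hypothetical; a real deployment would express this rule in the packet filter configuration of the boundary routers.

```python
import ipaddress

# Unique Local Address space [RFC4193].
ULA_SPACE = ipaddress.IPv6Network("fc00::/7")


def drop_at_internet_boundary(src: str, dst: str) -> bool:
    """Return True if a packet leaving towards the Internet should be dropped.

    A packet is dropped when either its source or its destination
    address lies within the ULA space.
    """
    return any(ipaddress.IPv6Address(addr) in ULA_SPACE for addr in (src, dst))
```

For example, a packet from fd00::1 to 2001:db8::1 would be dropped, while a packet between two addresses outside fc00::/7 would be forwarded.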

10. Acknowledgements

Review and comments were provided by Your Name Here!

This memo was prepared using the xml2rfc tool.

11. Change Log [RFC Editor please remove]

draft-smith-enhance-vne-with-ipv6-00, initial version, 2014-06-02

12. References

12.1. Normative References

[RFC6177]	Narten, T., Huston, G. and L. Roberts, "IPv6 Address Assignment to End Sites", BCP 157, RFC 6177, March 2011.
[RFC7011]	Claise, B., Trammell, B. and P. Aitken, "Specification of the IP Flow Information Export (IPFIX) Protocol for the Exchange of Flow Information", STD 77, RFC 7011, September 2013.
[RFC7258]	Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an Attack", BCP 188, RFC 7258, May 2014.

12.2. Informative References

[I-D.carpenter-6man-why64]	Carpenter, B., Chown, T., Gont, F., Jiang, S., Petrescu, A. and A. Yourtchenko, "Analysis of the 64-bit Boundary in IPv6 Addressing", Internet-Draft draft-carpenter-6man-why64-01, February 2014.
[I-D.davie-stt]	Davie, B. and J. Gross, "A Stateless Transport Tunneling Protocol for Network Virtualization (STT)", Internet-Draft draft-davie-stt-06, April 2014.
[I-D.droms-6man-multicast-scopes]	Droms, R., "IPv6 Multicast Address Scopes", Internet-Draft draft-droms-6man-multicast-scopes-02, July 2013.
[I-D.ietf-nvo3-framework]	Lasserre, M., Balus, F., Morin, T., Bitar, N. and Y. Rekhter, "Framework for DC Network Virtualization", Internet-Draft draft-ietf-nvo3-framework-03, July 2013.
[I-D.mahalingam-dutt-dcops-vxlan]	Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger, L., Sridhar, T., Bursell, M. and C. Wright, "VXLAN: A Framework for Overlaying Virtualized Layer 2 Networks over Layer 3 Networks", Internet-Draft draft-mahalingam-dutt-dcops-vxlan-04, May 2013.
[I-D.sridharan-virtualization-nvgre]	Sridharan, M., Greenberg, A., Venkataramaiah, N., Wang, Y., Duda, K., Ganga, I., Lin, G., Pearson, M., Thaler, P. and C. Tumuluri, "NVGRE: Network Virtualization using Generic Routing Encapsulation", Internet-Draft draft-sridharan-virtualization-nvgre-02, February 2013.
[RFC2460]	Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", RFC 2460, December 1998.
[RFC3306]	Haberman, B. and D. Thaler, "Unicast-Prefix-based IPv6 Multicast Addresses", RFC 3306, August 2002.
[RFC3307]	Haberman, B., "Allocation Guidelines for IPv6 Multicast Addresses", RFC 3307, August 2002.
[RFC4193]	Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast Addresses", RFC 4193, October 2005.
[RFC4291]	Hinden, R. and S. Deering, "IP Version 6 Addressing Architecture", RFC 4291, February 2006.
[RFC4302]	Kent, S., "IP Authentication Header", RFC 4302, December 2005.
[RFC5453]	Krishnan, S., "Reserved IPv6 Interface Identifiers", RFC 5453, February 2009.
[RFC6436]	Amante, S., Carpenter, B. and S. Jiang, "Rationale for Update to the IPv6 Flow Label Specification", RFC 6436, November 2011.
[RFC6437]	Amante, S., Carpenter, B., Jiang, S. and J. Rajahalme, "IPv6 Flow Label Specification", RFC 6437, November 2011.
[RFC6438]	Carpenter, B. and S. Amante, "Using the IPv6 Flow Label for Equal Cost Multipath Routing and Link Aggregation in Tunnels", RFC 6438, November 2011.

Author's Address

Mark Smith
In My Own Time
PO BOX 521
HEIDELBERG, VIC 3084
AU

EMail: markzzzsmith@yahoo.com.au