Network Working Group                                    F. Templin, Ed.
Internet-Draft                              Boeing Research & Technology
Obsoletes: rfc5320, rfc5558, rfc5720,                       June 2, 2016
           rfc6179, rfc6706 (if approved)
Intended status: Standards Track
Expires: December 4, 2016

Asymmetric Extended Route Optimization (AERO)
draft-templin-aerolink-67.txt

Abstract

This document specifies the operation of IP over tunnel virtual links using Asymmetric Extended Route Optimization (AERO). Nodes attached to AERO links can exchange packets via trusted intermediate routers that provide forwarding services to reach off-link destinations and redirection services for route optimization. AERO provides an IPv6 link-local address format known as the AERO address that supports operation of the IPv6 Neighbor Discovery (ND) protocol and links IPv6 ND to IP forwarding. Admission control, address/prefix provisioning and mobility are supported by the Dynamic Host Configuration Protocol for IPv6 (DHCPv6), and route optimization is naturally supported through dynamic neighbor cache updates. Although DHCPv6 and IPv6 ND messaging are used in the control plane, both IPv4 and IPv6 are supported in the data plane. AERO is a widely-applicable tunneling solution using standard control messaging exchanges as described in this document.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on December 4, 2016.

Copyright Notice

Copyright (c) 2016 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.


1. Introduction

This document specifies the operation of IP over tunnel virtual links using Asymmetric Extended Route Optimization (AERO). The AERO link can be used for tunneling to neighboring nodes over either IPv6 or IPv4 networks, i.e., AERO views the IPv6 and IPv4 networks as equivalent links for tunneling. Nodes attached to AERO links can exchange packets via trusted intermediate routers that provide forwarding services to reach off-link destinations and redirection services for route optimization [RFC5522].

AERO provides an IPv6 link-local address format known as the AERO address that supports operation of the IPv6 Neighbor Discovery (ND) [RFC4861] protocol and links IPv6 ND to IP forwarding. Admission control, address/prefix provisioning and mobility are supported by the Dynamic Host Configuration Protocol for IPv6 (DHCPv6) [RFC3315], and route optimization is naturally supported through dynamic neighbor cache updates. Although DHCPv6 and IPv6 ND messaging are used in the control plane, both IPv4 and IPv6 can be used in the data plane. AERO is a widely-applicable tunneling solution using standard control messaging exchanges as described in this document. The remainder of this document presents the AERO specification.

2. Terminology

The terminology in the normative references applies; the following terms are defined within the scope of this document:

AERO link

a Non-Broadcast, Multiple Access (NBMA) tunnel virtual overlay configured over a node's attached IPv6 and/or IPv4 networks. All nodes on the AERO link appear as single-hop neighbors from the perspective of the virtual overlay even though they may be separated by many underlying network hops. AERO can also operate over native multiple access link types (e.g., Ethernet, WiFi etc.) when a tunnel virtual overlay is not needed.

AERO interface

a node's attachment to an AERO link. Nodes typically have a single AERO interface; support for multiple AERO interfaces is also possible but out of scope for this document. AERO interfaces do not require Duplicate Address Detection (DAD) and therefore set the administrative variable DupAddrDetectTransmits to zero [RFC4862].

AERO address

an IPv6 link-local address constructed as specified in Section 3.3 and assigned to a Client's AERO interface.

AERO node

a node that is connected to an AERO link and that participates in IPv6 ND and DHCPv6 messaging over the link.

AERO Client ("Client")

a node that issues DHCPv6 messages to receive IP Prefix Delegations (PDs) from one or more AERO Servers. Following PD, the Client assigns an AERO address to the AERO interface for use in DHCPv6 and IPv6 ND exchanges with other AERO nodes.

AERO Server ("Server")

a node that configures an AERO interface to provide default forwarding and DHCPv6 services for AERO Clients. The Server assigns an administratively provisioned IPv6 link-local unicast address to support the operation of DHCPv6 and the IPv6 ND protocol. An AERO Server can also act as an AERO Relay.

AERO Relay ("Relay")

a node that configures an AERO interface to relay IP packets between nodes on the same AERO link and/or forward IP packets between the AERO link and the native Internetwork. The Relay assigns an administratively provisioned IPv6 link-local unicast address to the AERO interface the same as for a Server. An AERO Relay can also act as an AERO Server.

AERO Forwarding Agent ("Forwarding Agent")

a node that performs data plane forwarding services as a companion to an AERO Server.

ingress tunnel endpoint (ITE)

an AERO interface endpoint that injects tunneled packets into an AERO link.

egress tunnel endpoint (ETE)

an AERO interface endpoint that receives tunneled packets from an AERO link.

underlying network

a connected IPv6 or IPv4 network routing region over which the tunnel virtual overlay is configured. A typical example is an enterprise network, but many other use cases are also in scope.

underlying interface

an AERO node's interface point of attachment to an underlying network.

link-layer address

an IP address assigned to an AERO node's underlying interface. When UDP encapsulation is used, the UDP port number is also considered as part of the link-layer address; otherwise, UDP port number is set to the constant value '0'. Link-layer addresses are used as the encapsulation header source and destination addresses.

network layer address

the source or destination address of the encapsulated IP packet.

end user network (EUN)

an internal virtual or external edge IP network that an AERO Client connects to the rest of the network via the AERO interface.

AERO Service Prefix (ASP)

an IP prefix associated with the AERO link and from which AERO Client Prefixes (ACPs) are derived (for example, the IPv6 ACP 2001:db8:1:2::/64 is derived from the IPv6 ASP 2001:db8::/32).

AERO Client Prefix (ACP)

a more-specific IP prefix taken from an ASP and delegated to a Client.

Throughout the document, the simple terms "Client", "Server" and "Relay" refer to "AERO Client", "AERO Server" and "AERO Relay", respectively. Capitalization is used to distinguish these terms from DHCPv6 client/server/relay [RFC3315].

The terminology of DHCPv6 [RFC3315] and IPv6 ND [RFC4861] (including the names of node variables and protocol constants) applies to this document. Also throughout the document, the term "IP" is used to generically refer to either Internet Protocol version (i.e., IPv4 or IPv6).

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119]. Lower case uses of these words are not to be interpreted as carrying RFC2119 significance.

3. Asymmetric Extended Route Optimization (AERO)

The following sections specify the operation of IP over Asymmetric Extended Route Optimization (AERO) links:

3.1. AERO Link Reference Model

                           .-(::::::::)
                        .-(:::: IP ::::)-.
                       (:: Internetwork ::)
                        `-(::::::::::::)-'
                           `-(::::::)-' 
                                |
    +--------------+   +--------+-------+   +--------------+
    |AERO Server S1|   | AERO Relay R1  |   |AERO Server S2|
    |  Nbr: C1; R1 |   |   Nbr: S1; S2  |   |  Nbr: C2; R1 |
    |  default->R1 |   |(P1->S1; P2->S2)|   |  default->R1 |
    |    P1->C1    |   |      ASP A1    |   |    P2->C2    |
    +-------+------+   +--------+-------+   +------+-------+
            |                   |                  |
    X---+---+-------------------+------------------+---+---X
        |                  AERO Link                   |
  +-----+--------+                            +--------+-----+
  |AERO Client C1|                            |AERO Client C2|
  |    Nbr: S1   |                            |   Nbr: S2    |
  | default->S1  |                            | default->S2  |
  |    ACP P1    |                            |    ACP P2    |
  +--------------+                            +--------------+
        .-.                                         .-.
     ,-(  _)-.                                   ,-(  _)-.
  .-(_   IP  )-.                              .-(_   IP  )-.
 (__    EUN      )                           (__    EUN      )
    `-(______)-'                                `-(______)-'
         |                                           |
     +--------+                                  +--------+
     | Host H1|                                  | Host H2|
     +--------+                                  +--------+

Figure 1: AERO Link Reference Model

Figure 1 presents the AERO link reference model.

Each AERO node maintains an AERO interface neighbor cache and an IP forwarding table. For example, AERO Relay R1 in the diagram has neighbor cache entries for Servers S1 and S2 as well as IP forwarding table entries for the ACPs delegated to Clients C1 and C2. In common operational practice, there may be many additional Relays, Servers and Clients. (Although not shown in the figure, AERO Forwarding Agents may also be provided for data plane forwarding offload services.)

3.2. AERO Link Node Types

AERO Relays provide default forwarding services to AERO Servers. Relays forward packets between neighbors connected to the same AERO link and also forward packets between the AERO link and the native IP Internetwork. Relays present the AERO link to the native Internetwork as a set of one or more AERO Service Prefixes (ASPs) and serve as a gateway between the AERO link and the Internetwork. AERO Relays maintain an AERO interface neighbor cache entry for each AERO Server, and maintain an IP forwarding table entry for each AERO Client Prefix (ACP). AERO Relays can also be configured to act as AERO Servers.

AERO Servers provide default forwarding services to AERO Clients. Each Server also peers with each Relay in a dynamic routing protocol instance to advertise its list of associated ACPs. Servers configure a DHCPv6 server function to facilitate Prefix Delegation (PD) exchanges with Clients. Each delegated prefix becomes an ACP taken from an ASP. Servers forward packets between AERO interface neighbors, and maintain an AERO interface neighbor cache entry for each AERO Relay. They also maintain both neighbor cache entries and IP forwarding table entries for each of their associated Clients. AERO Servers can also be configured to act as AERO Relays.

AERO Clients act as requesting routers to receive ACPs through DHCPv6 PD exchanges with AERO Servers over the AERO link. Each Client MAY associate with a single Server or with multiple Servers, e.g., for fault tolerance, load balancing, etc. Each IPv6 Client receives at least a /64 IPv6 ACP, and may receive even shorter prefixes. Similarly, each IPv4 Client receives at least a /32 IPv4 ACP (i.e., a singleton IPv4 address), and may receive even shorter prefixes. AERO Clients maintain an AERO interface neighbor cache entry for each of their associated Servers as well as for each of their correspondent Clients.

AERO Forwarding Agents provide data plane forwarding services as companions to AERO Servers. Note that while Servers are required to perform both control and data plane operations on their own behalf, they may optionally enlist the services of special-purpose Forwarding Agents to offload data plane traffic.

3.3. AERO Addresses

An AERO address is an IPv6 link-local address with an embedded ACP and assigned to a Client's AERO interface. The AERO address is formed as follows:

For IPv6, the AERO address begins with the prefix fe80::/64 and includes in its interface identifier the base prefix taken from the Client's IPv6 ACP. The base prefix is determined by masking the ACP with the prefix length. For example, if the AERO Client receives the IPv6 ACP:

   2001:db8:1000:2000::/56

it constructs its AERO address as:

   fe80::2001:db8:1000:2000

For IPv4, the AERO address is formed from the lower 64 bits of an IPv4-mapped IPv6 address [RFC4291] that includes the base prefix taken from the Client's IPv4 ACP. For example, if the AERO Client receives the IPv4 ACP:

it constructs its AERO address as:

The AERO address remains stable as the Client moves between topological locations, i.e., even if its link-layer addresses change.

NOTE: In some cases, prospective neighbors may not have advance knowledge of the Client's ACP length and may therefore send initial IPv6 ND messages with an AERO destination address that matches the ACP but does not correspond to the base prefix. For example, if the Client receives the IPv6 ACP 2001:db8:1000:2000::/56 then subsequently receives an IPv6 ND message with destination address fe80::2001:db8:1000:2001, it accepts the message as though it were addressed to fe80::2001:db8:1000:2000.
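
The address construction and the acceptance rule in the NOTE can be illustrated with a minimal, non-normative sketch based on Python's standard ipaddress module. The function names aero_address and matches_acp are hypothetical, and the IPv4 prefix 192.0.2.32/28 mentioned afterwards is an illustrative documentation prefix assumed for this example only.

```python
import ipaddress


def aero_address(acp: str) -> ipaddress.IPv6Address:
    """Form the link-local AERO address for an IPv6 or IPv4 ACP (sketch)."""
    # strict=False masks the ACP with its prefix length to get the base prefix.
    net = ipaddress.ip_network(acp, strict=False)
    if net.version == 6:
        if net.prefixlen > 64:
            raise ValueError("an IPv6 ACP is expected to be /64 or shorter")
        # The upper 64 bits of the base prefix become the interface identifier.
        iid = int(net.network_address) >> 64
    else:
        # Lower 64 bits of the IPv4-mapped IPv6 address ::ffff:a.b.c.d.
        iid = (0xFFFF << 32) | int(net.network_address)
    return ipaddress.IPv6Address((0xFE80 << 112) | iid)


def matches_acp(dst: str, acp: str) -> bool:
    """Return True if an AERO destination address falls within the ACP."""
    net = ipaddress.ip_network(acp, strict=False)
    value = int(ipaddress.IPv6Address(dst))
    if value >> 64 != 0xFE80 << 48:
        return False  # not within fe80::/64
    iid = value & ((1 << 64) - 1)
    if net.version == 6:
        return ipaddress.IPv6Address(iid << 64) in net
    if iid >> 32 != 0xFFFF:
        return False  # not derived from an IPv4-mapped address
    return ipaddress.IPv4Address(iid & 0xFFFFFFFF) in net
```

Under these assumptions, aero_address("2001:db8:1000:2000::/56") yields fe80::2001:db8:1000:2000, and matches_acp("fe80::2001:db8:1000:2001", "2001:db8:1000:2000::/56") is true, in line with the NOTE. For the assumed IPv4 prefix 192.0.2.32/28 the sketch yields fe80::ffff:192.0.2.32.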

3.4. AERO Interface Characteristics

AERO interfaces use encapsulation (see: Section 3.10) to exchange packets with neighbors attached to the AERO link. AERO interfaces maintain a neighbor cache, and AERO nodes use both DHCPv6 PD and IPv6 ND control messaging. AERO Clients send DHCPv6 Solicit, Rebind, Renew and Release messages to AERO Servers, which respond with DHCPv6 Reply messages. These messages result in the creation, modification and deletion of neighbor cache entries.

AERO interfaces use unicast IPv6 ND Neighbor Solicitation (NS), Neighbor Advertisement (NA), Router Solicitation (RS) and Router Advertisement (RA) messages the same as for any IPv6 link. AERO interfaces use two IPv6 ND redirection message types -- the first known as a Predirect message and the second being the standard Redirect message (see Section 3.17). AERO links further use link-local-only addressing; hence, AERO nodes ignore any Prefix Information Options (PIOs) they may receive in RA messages over an AERO interface.

AERO interface ND messages include one or more Source/Target Link-Layer Address Options (S/TLLAOs) formatted as shown in Figure 2:

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |    Type = 2   |   Length = 3  |           Reserved            |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |    Link ID    |    NDSCPs     |  DSCP #1  |Prf|  DSCP #2  |Prf|
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |  DSCP #3  |Prf|  DSCP #4  |Prf| ....
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |        UDP Port Number        |                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
     |                                                               |
     +                                                               +
     |                          IP Address                           |
     +                                                               +
     |                                                               |
     +                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 2: AERO Source/Target Link-Layer Address Option (S/TLLAO) Format

In this format, Link ID is an integer value between 0 and 255 corresponding to an underlying interface of the target node, and NDSCPs encodes an integer value between 0 and 64 indicating the number of Differentiated Services Code Point (DSCP) octets that follow. Each DSCP octet is a 6-bit integer DSCP value followed by a 2-bit Preference ("Prf") value. Each DSCP value encodes an integer between 0 and 63 associated with this Link ID, where the value 0 means "default" and other values are interpreted as specified in [RFC2474]. The 'Prf' qualifier for each DSCP value is set to the value 0 ("deprecated"), 1 ("low"), 2 ("medium"), or 3 ("high") to indicate a preference level for packet forwarding purposes. When a particular DSCP value is not specified, its preference level is set to "medium" by default.

UDP Port Number and IP Address are set to the addresses used by the target node when it sends encapsulated packets over the underlying interface. When UDP is not used as part of the encapsulation, UDP Port Number is set to the value '0'. When the encapsulation IP address family is IPv4, IP Address is formed as an IPv4-mapped IPv6 address [RFC4291].
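
As a non-normative illustration of the field encodings described above, the following sketch packs a single DSCP octet and the UDP Port Number / IP Address fields. The helper names (dscp_octet, parse_dscp_octet, link_layer_field) are hypothetical. The sketch assumes network bit order, i.e., the DSCP value occupies the six most significant bits of the octet, as Figure 2 suggests, and it does not attempt to build the complete option or its Length field.

```python
import ipaddress
import struct

# Preference levels as listed in the text.
PRF = {"deprecated": 0, "low": 1, "medium": 2, "high": 3}


def dscp_octet(dscp: int, prf: str = "medium") -> int:
    """Encode a 6-bit DSCP value followed by a 2-bit Prf value."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    return (dscp << 2) | PRF[prf]


def parse_dscp_octet(octet: int) -> tuple:
    """Split a DSCP octet back into (dscp, prf)."""
    return octet >> 2, octet & 0x3


def link_layer_field(ip: str, udp_port: int = 0) -> bytes:
    """Pack UDP Port Number (2 octets) and IP Address (16 octets).

    IPv4 addresses are written as IPv4-mapped IPv6 addresses; a port of 0
    indicates that UDP is not part of the encapsulation.
    """
    addr = ipaddress.ip_address(ip)
    if addr.version == 4:
        addr = ipaddress.IPv6Address("::ffff:" + str(addr))
    return struct.pack("!H", udp_port) + addr.packed
```

For instance, dscp_octet(46, "high") evaluates to 187 (0xBB), and link_layer_field always returns 18 octets regardless of the encapsulation address family.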

AERO interfaces may be configured over multiple underlying interfaces. For example, common mobile handheld devices have both wireless local area network ("WLAN") and cellular wireless links. These links are typically used "one at a time" with low-cost WLAN preferred and highly-available cellular wireless as a standby. In a more complex example, aircraft frequently have many wireless data link types (e.g. satellite-based, terrestrial, air-to-air directional, etc.) with diverse performance and cost properties.

If a Client's multiple underlying interfaces are used "one at a time" (i.e., all other interfaces are in standby mode while one interface is active), then Redirect, Predirect and unsolicited NA messages include only a single TLLAO with Link ID set to a constant value.

If the Client has multiple active underlying interfaces, then from the perspective of IPv6 ND it would appear to have multiple link-layer addresses. In that case, Redirect and Predirect messages MAY include multiple TLLAOs -- each with a Link ID that corresponds to a specific underlying interface of the Client.

3.5. AERO Link Registration

When an administrative authority first deploys a set of AERO Relays and Servers that comprise an AERO link, they also assign a unique domain name for the link, e.g., "linkupnetworks.example.com". Next, if administrative policy permits Clients within the domain to serve as correspondent nodes for Internet mobile nodes, the administrative authority adds a Fully Qualified Domain Name (FQDN) for each of the AERO link's ASPs to the Domain Name System (DNS) [RFC1035]. The FQDN is based on the suffix "aero.linkupnetworks.net" with a prefix formed from the wildcard-terminated reverse mapping of the ASP [RFC3596][RFC4592], and resolves to a DNS PTR resource record. For example, for the ASP '2001:db8:1::/48' within the domain name "linkupnetworks.example.com", the DNS database contains:

'*.1.0.0.0.8.b.d.0.1.0.0.2.aero.linkupnetworks.net. PTR linkupnetworks.example.com'

This DNS registration advertises the AERO link's ASPs to prospective correspondent nodes.
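
The owner name in the example record can be derived mechanically from the ASP. The following non-normative sketch shows one way to do so; the function name asp_fqdn is hypothetical, and the sketch assumes an IPv6 ASP whose prefix length is a multiple of 4 so that it maps onto whole nibble labels.

```python
import ipaddress


def asp_fqdn(asp: str, suffix: str = "aero.linkupnetworks.net") -> str:
    """Build the wildcard-terminated reverse-mapping name for an IPv6 ASP."""
    net = ipaddress.IPv6Network(asp)
    if net.prefixlen % 4:
        raise ValueError("this sketch handles nibble-aligned prefixes only")
    # Hex digits of the prefix, most significant nibble first.
    digits = net.network_address.exploded.replace(":", "")
    nibbles = digits[: net.prefixlen // 4]
    # Reverse the nibbles and prepend the wildcard label.
    return "*." + ".".join(reversed(nibbles)) + "." + suffix + "."
```

With the ASP from the example, asp_fqdn("2001:db8:1::/48") returns "*.1.0.0.0.8.b.d.0.1.0.0.2.aero.linkupnetworks.net.".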

3.6. AERO Interface Initialization

3.6.1. AERO Relay Behavior

When a Relay enables an AERO interface, it first assigns an administratively provisioned link-local address fe80::ID to the interface. Each fe80::ID address MUST be unique among all AERO nodes on the link, and MUST NOT collide with any potential AERO addresses nor the special addresses fe80:: and fe80::ffff:ffff:ffff:ffff. (The fe80::ID addresses are typically taken from the available range fe80::/96, e.g., as fe80::1, fe80::2, fe80::3, etc.) The Relay then engages in a dynamic routing protocol session with all Servers on the link (see: Section 3.7), and advertises its assigned ASP prefixes into the native IP Internetwork.

Each Relay subsequently maintains an IP forwarding table entry for each ACP covered by its ASP(s), and maintains a neighbor cache entry for each Server on the link. Relays exchange NS/NA messages with AERO link neighbors the same as for any AERO node; however, they typically do not perform explicit Neighbor Unreachability Detection (NUD) (see: Section 3.18) since the dynamic routing protocol already provides reachability confirmation.

3.6.2. AERO Server Behavior

When a Server enables an AERO interface, it assigns an administratively provisioned link-local address fe80::ID the same as for Relays. The Server further configures a DHCPv6 server function to facilitate DHCPv6 PD exchanges with AERO Clients. The Server maintains a neighbor cache entry for each Relay on the link, and manages per-ACP neighbor cache entries and IP forwarding table entries based on control message exchanges. Each Server also engages in a dynamic routing protocol with each Relay on the link (see: Section 3.7).

When the Server receives an NS/RS message from a Client on the AERO interface it returns an NA/RA message. The Server further provides a simple link-layer conduit between AERO interface neighbors. Therefore, packets enter the Server's AERO interface from the link layer and are forwarded back out the link layer without ever leaving the AERO interface and therefore without ever disturbing the network layer.

3.6.3. AERO Client Behavior

When a Client enables an AERO interface, it uses the special address fe80::ffff:ffff:ffff:ffff to obtain one or more ACPs from an AERO Server via DHCPv6 PD. Next, it assigns the corresponding AERO address(es) to the AERO interface and creates a neighbor cache entry for the Server, i.e., the DHCPv6 PD exchange bootstraps autoconfiguration of unique link-local address(es). The Client maintains a neighbor cache entry for each of its Servers and each of its active correspondent Clients. When the Client receives Redirect/Predirect messages on the AERO interface it updates or creates neighbor cache entries, including link-layer address information.

3.6.4. AERO Forwarding Agent Behavior

When a Forwarding Agent enables an AERO interface, it assigns the same link-local address(es) as the companion AERO Server. The Forwarding Agent thereafter provides data plane forwarding services based solely on the forwarding information assigned to it by the companion AERO Server.

3.7. AERO Routing System

The AERO routing system is based on a private instance of the Border Gateway Protocol (BGP) [RFC4271] that is coordinated between Relays and Servers and does not interact with either the public Internet BGP routing system or the native IP Internetwork interior routing system. Relays advertise only a small and unchanging set of ASPs to the native routing system instead of the full dynamically changing set of ACPs.

In a reference deployment, each AERO Server is configured as an Autonomous System Border Router (ASBR) for a stub Autonomous System (AS) using an AS Number (ASN) that is unique within the BGP instance, and each Server further peers with each Relay but does not peer with other Servers. Similarly, Relays do not peer with each other, since they will reliably receive all updates from all Servers and will therefore have a consistent view of the AERO link ACP delegations.

Each Server maintains a working set of associated ACPs, and dynamically announces new ACPs and withdraws departed ACPs in its BGP updates to Relays. Clients are expected to remain associated with their current Servers for extended timeframes; however, Servers SHOULD selectively suppress BGP updates for impatient Clients that repeatedly associate and disassociate with them in order to dampen routing churn.

Each Relay configures a black-hole route for each of its ASPs. By black-holing the ASPs, the Relay will maintain forwarding table entries only for the ACPs that are currently active, and all other ACPs will correctly result in destination unreachable failures due to the black hole route.

Scaling properties of the AERO routing system are limited by the number of BGP routes that can be carried by Relays. Assuming O(10^6) as a reasonable maximum number of BGP routes, this means that O(10^6) Clients can be serviced by a single set of Relays. A means of increasing scaling would be to assign a different set of Relays for each set of ASPs. In that case, each Server still peers with each Relay, but the Server institutes route filters so that each set of Relays only receives BGP updates for the ASPs they aggregate. For example, if the ASP for the AERO link is 2001:db8::/32, a first set of Relays could service the ASP segment 2001:db8::/40, a second set of Relays could service 2001:db8:0100::/40, a third set could service 2001:db8:0200::/40, etc.

Assuming up to O(10^3) sets of Relays, the AERO routing system can then accommodate O(10^9) ACPs with no additional overhead for Servers and Relays (for example, it should be possible to service 4 billion /64 ACPs taken from a /32 ASP and even more for shorter ASPs). In this way, each set of Relays services a specific set of ASPs that they advertise to the native routing system, and each Server configures ASP-specific routes that list the correct set of Relays as next hops. This arrangement also allows for natural incremental deployment, and can support small scale initial deployments followed by dynamic deployment of additional Clients, Servers and Relays without disturbing the already-deployed base.
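
The segment and capacity arithmetic above can be checked with a short, non-normative sketch using Python's ipaddress module. The function names asp_segments and acp_capacity are hypothetical, and the /40 segment length is simply the one used in the example; other segment lengths would give a different number of Relay sets.

```python
import ipaddress
from itertools import islice


def asp_segments(asp: str, segment_len: int, count: int) -> list:
    """Return the first `count` ASP segments of the given prefix length."""
    net = ipaddress.ip_network(asp)
    return [str(s) for s in islice(net.subnets(new_prefix=segment_len), count)]


def acp_capacity(asp: str, acp_len: int = 64) -> int:
    """Number of ACPs of length `acp_len` that fit within the ASP."""
    net = ipaddress.ip_network(asp)
    return 2 ** (acp_len - net.prefixlen)
```

For the ASP 2001:db8::/32, asp_segments("2001:db8::/32", 40, 3) lists the three segments named in the text (printed in compressed form, e.g., 2001:db8:100::/40), and acp_capacity("2001:db8::/32") is 2^32, i.e., roughly 4 billion /64 ACPs.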

Note that in an alternate routing arrangement each set of Relays could advertise the aggregated ASP for the link into the native routing system even though each Relay services only a segment of the ASP. In that case, a Relay upon receiving a packet with a destination address covered by the ASP segment of another Relay can simply tunnel the packet to the correct Relay. The tradeoff then is the penalty for Relay-to-Relay tunneling compared with reduced routing information in the native routing system.

3.8. AERO Interface Neighbor Cache Maintenance

Each AERO interface maintains a conceptual neighbor cache that includes an entry for each neighbor it communicates with on the AERO link, the same as for any IPv6 interface [RFC4861]. AERO interface neighbor cache entries are said to be one of "permanent", "static" or "dynamic".

Permanent neighbor cache entries are created through explicit administrative action; they have no timeout values and remain in place until explicitly deleted. AERO Relays maintain a permanent neighbor cache entry for each Server on the link, and AERO Servers maintain a permanent neighbor cache entry for each Relay. Each entry maintains the mapping between the neighbor's fe80::ID network-layer address and corresponding link-layer address.

Static neighbor cache entries are created through DHCPv6 PD exchanges and remain in place for durations bounded by prefix lifetimes. AERO Servers maintain static neighbor cache entries for the ACPs of each of their associated Clients, and AERO Clients maintain a static neighbor cache entry for each of their associated Servers. When an AERO Server sends a Reply message response to a Client's Solicit, Rebind or Renew message, it creates or updates a static neighbor cache entry based on the Client's DHCP Unique Identifier (DUID) as the Client identifier, the AERO address(es) corresponding to the Client's ACP(s) as the network-layer address(es), the prefix lifetime as the neighbor cache entry lifetime, the Client's encapsulation IP address and UDP port number as the link-layer address and the prefix length(s) as the length to apply to the AERO address(es). When an AERO Client receives a Reply message from a Server, it creates or updates a static neighbor cache entry based on the Reply message link-local source address as the network-layer address, the prefix lifetime as the neighbor cache entry lifetime, and the encapsulation IP source address and UDP source port number as the link-layer address.

Dynamic neighbor cache entries are created or updated based on receipt of a Predirect/Redirect message, and are garbage-collected if not used within a bounded timescale. AERO Clients maintain dynamic neighbor cache entries for each of their active correspondent Client ACPs with lifetimes based on IPv6 ND messaging constants. When an AERO Client receives a valid Predirect message it creates or updates a dynamic neighbor cache entry for the Predirect target network-layer and link-layer addresses plus prefix length. The node then sets an "AcceptTime" variable in the neighbor cache entry to ACCEPT_TIME seconds and uses this value to determine whether packets received from the correspondent can be accepted. When an AERO Client receives a valid Redirect message it creates or updates a dynamic neighbor cache entry for the Redirect target network-layer and link-layer addresses plus prefix length. The Client then sets a "ForwardTime" variable in the neighbor cache entry to FORWARD_TIME seconds and uses this value to determine whether packets can be sent directly to the correspondent. The Client also sets a "MaxRetry" variable to MAX_RETRY to limit the number of keepalives sent when a correspondent may have gone unreachable.

It is RECOMMENDED that FORWARD_TIME be set to the default constant value 30 seconds to match the default REACHABLE_TIME value specified for IPv6 ND [RFC4861].

It is RECOMMENDED that ACCEPT_TIME be set to the default constant value 40 seconds to allow a 10 second window so that the AERO redirection procedure can converge before AcceptTime decrements below FORWARD_TIME.

It is RECOMMENDED that MAX_RETRY be set to 3 the same as described for IPv6 ND address resolution in Section 7.3.3 of [RFC4861].

Different values for FORWARD_TIME, ACCEPT_TIME, and MAX_RETRY MAY be administratively set, if necessary, to better match the AERO link's performance characteristics; however, if different values are chosen, all nodes on the link MUST consistently configure the same values. Most importantly, ACCEPT_TIME SHOULD be set to a value that is sufficiently longer than FORWARD_TIME to allow the AERO redirection procedure to converge.
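
The interaction between AcceptTime and ForwardTime can be sketched as a pair of countdown timers in a dynamic neighbor cache entry. The following is a minimal, non-normative illustration; the class name DynamicNeighborEntry, its method names, and the explicit tick() call are assumptions made for this example (an implementation would more likely use timestamps or timer callbacks).

```python
from dataclasses import dataclass

# RECOMMENDED defaults from this section (seconds, except MAX_RETRY).
FORWARD_TIME = 30
ACCEPT_TIME = 40
MAX_RETRY = 3


@dataclass
class DynamicNeighborEntry:
    accept_time: float = 0.0   # remaining time packets may be accepted
    forward_time: float = 0.0  # remaining time packets may be sent directly
    max_retry: int = MAX_RETRY

    def on_predirect(self) -> None:
        """A valid Predirect was received: start accepting packets."""
        self.accept_time = ACCEPT_TIME

    def on_redirect(self) -> None:
        """A valid Redirect was received: start forwarding directly."""
        self.forward_time = FORWARD_TIME
        self.max_retry = MAX_RETRY

    def tick(self, elapsed: float) -> None:
        """Decrement both timers, never going below zero."""
        self.accept_time = max(0.0, self.accept_time - elapsed)
        self.forward_time = max(0.0, self.forward_time - elapsed)

    def can_accept(self) -> bool:
        return self.accept_time > 0

    def can_forward(self) -> bool:
        return self.forward_time > 0
```

With the default values, an entry refreshed by a Predirect still accepts packets for 10 seconds after an entry refreshed at the same moment by a Redirect has stopped forwarding, which is the convergence window described above.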

When there may be a Network Address Translator (NAT) between the Client and the Server, or if the path from the Client to the Server should be tested for reachability, the Client can send periodic RS messages to the Server to receive RA replies. The RS/RA messaging will keep NAT state alive and test Server reachability without disturbing the DHCPv6 server.

3.9. AERO Interface Sending Algorithm

IP packets enter a node's AERO interface either from the network layer (i.e., from a local application or the IP forwarding system), or from the link layer (i.e., from the AERO tunnel virtual link). Packets that enter the AERO interface from the network layer are encapsulated and admitted into the AERO link, i.e., they are tunneled to an AERO interface neighbor. Packets that enter the AERO interface from the link layer are either re-admitted into the AERO link or delivered to the network layer where they are subject to either local delivery or IP forwarding. Since each AERO node may have only partial information about neighbors on the link, AERO interfaces may forward packets with link-local destination addresses at a layer below the network layer. This means that AERO nodes act as both IP routers/hosts and sub-IP layer forwarding nodes. AERO interface sending considerations for Clients, Servers and Relays are given below.

When an IP packet enters a Client's AERO interface from the network layer, if the destination is covered by an ASP the Client searches for a dynamic neighbor cache entry with a non-zero ForwardTime and an AERO address that matches the packet's destination address. (The destination address may be either an address covered by the neighbor's ACP or the (link-local) AERO address itself.) If there is a match, the Client uses a link-layer address in the entry as the link-layer address for encapsulation then admits the packet into the AERO link. If there is no match, the Client instead uses the link-layer address of a neighboring Server as the link-layer address for encapsulation.
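
The Client lookup described above can be sketched in Python as follows. This is an illustration only: the DynamicNeighbor record, the client_next_hop function and the string form of the link-layer address are all hypothetical, and matching against the AERO address is simplified to an exact comparison. Destinations not covered by an ASP are assumed to fall through to the Server.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class DynamicNeighbor:
    aero_address: str     # link-local AERO address of the neighbor
    acp: str              # ACP delegated to the neighbor
    link_layer_addr: str  # hypothetical "address:port" string
    forward_time: int     # seconds remaining; 0 means do not forward


def client_next_hop(dest, asps, dynamic_cache, server_lladdr):
    """Pick the link-layer address a Client uses for encapsulation."""
    addr = ipaddress.ip_address(dest)
    in_asp = any(addr in ipaddress.ip_network(p) for p in asps)
    if in_asp or addr.is_link_local:
        for n in dynamic_cache:
            if n.forward_time <= 0:
                continue  # entry may not be used for forwarding
            if (addr == ipaddress.ip_address(n.aero_address)
                    or addr in ipaddress.ip_network(n.acp)):
                return n.link_layer_addr
    # No usable dynamic entry: send via a neighboring Server.
    return server_lladdr
```

For example, with one dynamic entry for 2001:db8:1::/48, a packet to 2001:db8:1::5 would use that entry's link-layer address, while a packet to 2001:db8:2::5 would be sent to the Server.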

When an IP packet enters a Server's AERO interface from the link layer, if the destination is covered by an ASP the Server searches for a neighbor cache entry with an AERO address that matches the packet's destination address. (The destination address may be either an address covered by the neighbor's ACP or the AERO address itself.) If there is a match, the Server uses a link-layer address in the entry as the link-layer address for encapsulation and re-admits the packet into the AERO link. If there is no match, the Server instead uses the link-layer address in a permanent neighbor cache entry for a Relay selected through longest-prefix-match as the link-layer address for encapsulation.

When an IP packet enters a Relay's AERO interface from the network layer, the Relay searches its IP forwarding table for an entry that is covered by an ASP and also matches the destination. If there is a match, the Relay uses the link-layer address in the corresponding neighbor cache entry as the link-layer address for encapsulation and admits the packet into the AERO link. When an IP packet enters a Relay's AERO interface from the link layer, if the destination is not a link-local address and does not match an ASP the Relay removes the packet from the AERO interface and uses IP forwarding to forward the packet to the Internetwork. If the destination address is a link-local address or a non-link-local address that matches an ASP, and there is a more-specific ACP entry in the IP forwarding table, the Relay uses the link-layer address in the corresponding neighbor cache entry as the link-layer address for encapsulation and re-admits the packet into the AERO link. When an IP packet enters a Relay's AERO interface from either the network layer or the link layer, and the packet's destination address matches an ASP but there is no more-specific ACP entry, the Relay drops the packet and returns an ICMP Destination Unreachable message (see: Section 3.14).
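
The three outcomes for a packet that a Relay receives from the link layer can be sketched as below. This is a simplified Python illustration: the function name, the action strings and the acp_routes mapping (ACP prefix to neighbor link-layer address) are hypothetical, and link-local destinations are not modeled.

```python
import ipaddress


def relay_link_layer_action(dest, asps, acp_routes):
    """Classify a non-link-local packet arriving from the link layer.

    Returns (action, link_layer_addr); link_layer_addr is None unless
    the packet is re-admitted into the AERO link.
    """
    addr = ipaddress.ip_address(dest)
    in_asp = any(addr in ipaddress.ip_network(p) for p in asps)
    if not in_asp:
        # Leaves the AERO interface and is forwarded by ordinary IP.
        return ("forward_to_internetwork", None)
    matches = [p for p in acp_routes if addr in ipaddress.ip_network(p)]
    if not matches:
        # Covered by an ASP but no more-specific ACP entry exists.
        return ("drop_and_send_unreachable", None)
    # Longest-prefix match among the ACP entries.
    best = max(matches, key=lambda p: ipaddress.ip_network(p).prefixlen)
    return ("readmit", acp_routes[best])
```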

When an AERO Server receives a packet from a Relay via the AERO interface, the Server MUST NOT forward the packet back to the same or a different Relay.

When an AERO Relay receives a packet from a Server via the AERO interface, the Relay MUST NOT forward the packet back to the same Server.

When an AERO node re-admits a packet into the AERO link without involving the network layer, the node MUST NOT decrement the network layer TTL/Hop-count.

When an AERO node forwards a data packet to the primary link-layer address of a Server, it may receive Redirect messages with an SLLAO that include the link-layer address of an AERO Forwarding Agent. The AERO node SHOULD record the link-layer address in the neighbor cache entry for the neighbor and send subsequent data packets via this address instead of the Server's primary address (see: Section 3.16).

3.10. AERO Interface Encapsulation and Re-encapsulation

AERO interfaces encapsulate IP packets according to whether they are entering the AERO interface from the network layer or are being re-admitted into the same AERO link they arrived on. This latter form of encapsulation is known as "re-encapsulation".

The AERO interface encapsulates packets per the Generic UDP Encapsulation (GUE) procedures in [I-D.ietf-nvo3-gue][I-D.herbert-gue-fragmentation], or through an alternate encapsulation format (see: Appendix A). For packets entering the AERO link from the IP layer, the AERO interface copies the "TTL/Hop Limit", "Type of Service/Traffic Class" [RFC2983], "Flow Label" [RFC6438] (for IPv6) and "Congestion Experienced" [RFC3168] values in the packet's IP header into the corresponding fields in the encapsulation IP header. For packets undergoing re-encapsulation within the AERO link, the AERO interface instead copies the "TTL/Hop Limit", "Type of Service/Traffic Class", "Flow Label" and "Congestion Experienced" values in the original encapsulation IP header into the corresponding fields in the new encapsulation IP header, i.e., the values are transferred between encapsulation headers and *not* copied from the encapsulated packet's network-layer header.
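
The difference between the two cases lies only in which header the values are taken from. A minimal Python sketch, in which headers are plain dictionaries and the key names and function name are hypothetical:

```python
# Hypothetical field names for the values that are carried over.
COPIED_FIELDS = ("hop_limit", "traffic_class", "flow_label", "ecn")


def new_outer_fields(inner_header, previous_outer_header=None):
    """Return the field values for the new encapsulation IP header.

    On first encapsulation the values come from the packet's own IP
    header. On re-encapsulation they come from the encapsulation
    header the packet arrived with, not from the inner packet.
    """
    if previous_outer_header is None:
        source = inner_header
    else:
        source = previous_outer_header
    return {name: source[name] for name in COPIED_FIELDS}
```

One consequence is that the inner packet's TTL/Hop Limit is left untouched while the packet is relayed between AERO nodes below the network layer.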

When GUE encapsulation is used, the AERO interface next sets the UDP source port to a constant value that it will use in each successive packet it sends, and sets the UDP length field to the length of the encapsulated packet plus 8 bytes for the UDP header itself plus the length of the GUE header (or 0 if GUE direct IP encapsulation is used). For packets sent to a Server, the AERO interface sets the UDP destination port to 8060, i.e., the IANA-registered port number for AERO. For packets sent to a correspondent Client, the AERO interface sets the UDP destination port to the port value stored in the neighbor cache entry for this correspondent. The AERO interface then either includes or omits the UDP checksum according to the GUE specification.
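
The UDP length and destination port rules above can be written out as a short Python sketch. The function names are hypothetical, and the GUE header length is passed in by the caller rather than assumed.

```python
AERO_PORT = 8060     # port number for AERO, as stated above
UDP_HEADER_LEN = 8   # bytes


def udp_length(encapsulated_len, gue_header_len=0):
    """UDP length field: packet + UDP header + GUE header.

    gue_header_len is 0 when GUE direct IP encapsulation is used.
    """
    total = encapsulated_len + UDP_HEADER_LEN + gue_header_len
    if total > 0xFFFF:
        raise ValueError("does not fit in the 16-bit UDP length field")
    return total


def udp_dest_port(to_server, neighbor_port=None):
    """Servers are reached on the AERO port; correspondent Clients on
    the port stored in their neighbor cache entry."""
    if to_server:
        return AERO_PORT
    if neighbor_port is None:
        raise ValueError("no port recorded for this correspondent")
    return neighbor_port
```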

For IPv4 encapsulation, the AERO interface sets the DF bit as discussed in Section 3.13.

3.11. AERO Interface Decapsulation

AERO interfaces decapsulate packets destined either to the AERO node itself or to a destination reached via an interface other than the AERO interface the packet was received on. Decapsulation is per the procedures specified for the appropriate encapsulation format.

3.12. AERO Interface Data Origin Authentication

AERO nodes employ simple data origin authentication procedures for encapsulated packets they receive from other nodes on the AERO link. In particular:

  • AERO Servers and Relays accept encapsulated packets with a link-layer source address that matches a permanent neighbor cache entry.
  • AERO Servers accept authentic encapsulated DHCPv6 messages from Clients, and create or update a static neighbor cache entry for the Client based on the specific DHCPv6 message type.
  • AERO Clients and Servers accept encapsulated packets if there is a static neighbor cache entry with a link-layer address that matches the packet's link-layer source address.
  • AERO Clients, Servers and Relays accept encapsulated packets if there is a dynamic neighbor cache entry with an AERO address that matches the packet's network-layer source address, with a link-layer address that matches the packet's link-layer source address, and with a non-zero AcceptTime.

Note that this simple data origin authentication is effective in environments in which link-layer addresses cannot be spoofed. In other environments, each AERO message must include a signature that the recipient can use to authenticate the message origin.
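
The acceptance rules listed above can be sketched as a single predicate. This Python illustration uses a hypothetical CacheEntry record and role strings; it does not model the DHCPv6 case that creates or updates static entries, and it simplifies AERO address matching to an exact comparison.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CacheEntry:
    kind: str                        # "permanent", "static" or "dynamic"
    link_layer_addr: str
    aero_address: Optional[str] = None
    accept_time: int = 0             # seconds; used by dynamic entries


def accept_packet(role, l2_src, l3_src, cache):
    """Return True if an encapsulated packet passes the origin check.

    role is "client", "server" or "relay".
    """
    for e in cache:
        if e.link_layer_addr != l2_src:
            continue
        if e.kind == "permanent" and role in ("server", "relay"):
            return True
        if e.kind == "static" and role in ("client", "server"):
            return True
        if (e.kind == "dynamic" and e.aero_address == l3_src
                and e.accept_time > 0):
            return True
    return False
```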

3.13. AERO Interface MTU and Fragmentation

The AERO interface is the node's attachment to the AERO link. The AERO interface acts as a tunnel ingress when it sends a packet to an AERO link neighbor and as a tunnel egress when it receives a packet from an AERO link neighbor.

AERO links over IP networks have a maximum link MTU of 64KB minus the encapsulation overhead (i.e., 64KB-ENCAPS), since the maximum packet size in the base IP specifications is 64KB [RFC0791][RFC2460]. While IPv6 jumbograms can be up to 4GB [RFC2675], they are considered optional for IPv6 nodes [RFC6434] and therefore out of scope for this document. MTU and fragmentation considerations for tunnels are further discussed in [RFC4459].

The AERO interface can configure either an indefinite MTU (i.e., 64KB-ENCAPS) or a smaller fixed MTU that determines the maximum sized packet that can be admitted into the AERO interface. The MTU for each AERO interface neighbor is therefore constrained by the minimum of the MTU of the AERO interface, the MTU of the underlying interface used for tunneling (minus ENCAPS), and the path MTU within the tunnel (minus ENCAPS).
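
As a worked example of this constraint, the per-neighbor MTU is a simple minimum. The Python sketch below uses hypothetical names and takes 64KB as 65535 bytes, the largest value the IP length fields can express.

```python
MAX_IP_PACKET = 65535  # largest packet in the base IP specifications


def neighbor_mtu(encaps, underlying_mtu, path_mtu, interface_mtu=None):
    """Largest packet that can be admitted for a given neighbor.

    interface_mtu=None models the "indefinite" setting (64KB-ENCAPS).
    """
    if interface_mtu is None:
        interface_mtu = MAX_IP_PACKET - encaps
    return min(interface_mtu, underlying_mtu - encaps, path_mtu - encaps)
```

With 48 bytes of encapsulation overhead, an underlying interface MTU of 1500 and a path MTU of 1400, the result is 1352 bytes.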

IPv6 specifies a minimum link MTU of 1280 bytes [RFC2460]. This is the minimum packet size the AERO interface MUST admit without returning an ICMP Packet Too Big (PTB) message. Although IPv4 specifies a smaller minimum link MTU of 68 bytes [RFC0791], AERO interfaces also observe a 1280 byte minimum for IPv4 even if some fragmentation is needed.

The vast majority of links in the Internet configure an MTU of at least 1500 bytes. Original source hosts have therefore become conditioned to expect that IP packets up to 1500 bytes in length will either be delivered to the final destination or a suitable PTB message returned. However, PTB messages may be crafted for malicious purposes such as denial of service, or lost in the network [RFC2923] resulting in failure of the IP Path MTU Discovery (PMTUD) mechanisms [RFC1191][RFC1981]. For these reasons, the tunnel ingress sends encapsulated packets to the tunnel egress subject to the specific path considerations as follows:

3.13.1. Operational Assurance of Sufficient MTU

When there is operational assurance that all links in the paths that the tunnel may traverse are capable of passing packets up to S bytes in length, the ingress can admit all packets up to (S-ENCAPS) bytes without loss due to path MTU restrictions and without invoking fragmentation. An example is a closed data center where it is known that all links configure an MTU of at least 9KB. Note, however, that since there may be additional encapsulations on the path from the ingress to the egress, it may not always be sufficient to rely on operational assurance. In that case, the ingress should observe one or both of the following approaches.

3.13.2. Classical Path MTU Discovery with Reactive Fragmentation

When the original source, ingress and egress are all within the same well-managed administrative domain, the ingress admits a packet into the tunnel if it is no larger than the current path MTU estimate for this egress (initially set to the MTU of the underlying link to be used for tunneling minus ENCAPS). Otherwise, the ingress drops the packet and sends a network layer (L3) PTB message back to the original source. Additionally, the ingress SHOULD cache the MTU value in any link-layer (L2) PTB messages it receives from a router on the path to the egress as a new path MTU estimate. Thereafter, the ingress SHOULD periodically reset the path MTU estimate to the MTU of the underlying link minus ENCAPS to detect path MTU increases.

These procedures apply when the path MTU for this egress is no smaller than (1280+ENCAPS) bytes; otherwise, the ingress can either declare the egress unreachable or commence fragmentation in a manner that parallels the standard behavior specified in [RFC2473]. In that case, the ingress encapsulates all packets that are no larger than 1280 bytes while using encapsulation layer fragmentation if necessary as specified in Section 3.13.3. (For IPv4 packets with DF=0 that are larger than 1280 bytes, the ingress instead uses IPv4 fragmentation before encapsulation.)

3.13.3. Proactive Fragmentation with Expectation of Packetization Layer Path MTU Discovery

When the original source, ingress and egress are not all within the same well-managed administrative domain, the ingress admits all packets up to 1500 bytes in length even if some fragmentation is needed, and admits larger packets without fragmentation in case they are able to traverse the tunnel in one piece.

Several factors must be considered when fragmentation is needed. For AERO links over IPv4, the IP ID field is only 16 bits in length, meaning that fragmentation at high data rates could result in data corruption due to reassembly misassociations [RFC6864][RFC4963] (see: Section 3.13.5). For AERO links over both IPv4 and IPv6, studies have also shown that IP fragments are dropped unconditionally over some network paths [I-D.taylor-v6ops-fragdrop]. For these reasons, when fragmentation is needed the ingress inserts an encapsulation layer fragment header and applies tunnel fragmentation in the manner suggested in Section 3.1.7 of [RFC2764] instead of IP fragmentation. Since the fragment header reduces the room available for packet data, but the original source has no way to control its insertion, the ingress MUST include the fragment header length in the ENCAPS length even for packets in which the header is absent.

The ingress therefore sends encapsulated packets to the egress according to the following algorithm:

  • For IP packets that are no larger than (1280-ENCAPS) bytes, the ingress encapsulates the packet and admits it into the tunnel without fragmentation. For IPv4 AERO links, the ingress sets the Don't Fragment (DF) bit to 0 so that these packets will be delivered to the egress even if there is a restricting link in the path, i.e., unless lost due to congestion or routing errors.
  • For IP packets that are larger than (1280-ENCAPS) bytes but no larger than 1500 bytes, the ingress encapsulates the packet and inserts an encapsulation layer fragment header. Next, the ingress fragments the packet into a minimum number of non-overlapping fragments where the first fragment (including ENCAPS) is no larger than 1024 bytes and the remaining fragments are no larger than the first. Each fragment consists of corresponding encapsulation headers followed by the fragment of the encapsulated packet itself. The ingress then admits the fragments into the tunnel, and for IPv4 sets the DF bit to 0 in the IP encapsulation header. These fragmented encapsulated packets will be delivered to the egress, which reassembles them into a whole packet. The egress therefore MUST be capable of reassembling packets up to (1500+ENCAPS) bytes in length; hence, it is RECOMMENDED that the egress be capable of reassembling at least 2KB.
  • For IPv4 packets that are larger than 1500 bytes and with the DF bit set to 0, the ingress uses ordinary IPv4 fragmentation to break the unencapsulated packet into a minimum number of non-overlapping fragments where the first fragment (including ENCAPS) is no larger than 1024 bytes and the remaining fragments are no larger than the first. The ingress then encapsulates each fragment (and for IPv4 sets the DF bit to 0) then admits them into the tunnel. These fragments will be delivered to the final destination via the egress.
  • For all other IP packets, if the packet is larger than the current path MTU estimate for this egress, the ingress drops the packet and returns an L3 PTB message to the original source with MTU set to the larger of 1500 bytes or the current path MTU estimate. Otherwise, the ingress encapsulates the packet and admits it into the tunnel without fragmentation (and for IPv4 sets the DF bit to 1). Since PTB messages may either be lost or contain insufficient information, however, it is RECOMMENDED that original sources that send unfragmentable IP packets larger than 1500 bytes use Packetization Layer Path MTU Discovery (PLPMTUD) [RFC4821].
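
The four cases above can be summarized as a decision function. The Python sketch below is illustrative only: the function name, the action strings and the return shape are hypothetical. The returned outer DF value is meaningful only when the encapsulation header is IPv4.

```python
def ingress_action(pkt_len, encaps, path_mtu, is_ipv4=False, df=True):
    """Classify a packet per the proactive fragmentation algorithm.

    pkt_len  - length of the unencapsulated IP packet in bytes
    path_mtu - current path MTU estimate for this egress
    df       - DF bit of the original packet (IPv4 only)
    Returns (action, outer_df, ptb_mtu).
    """
    if pkt_len <= 1280 - encaps:
        return ("encapsulate", 0, None)
    if pkt_len <= 1500:
        # Encapsulation layer fragment header, then tunnel fragmentation.
        return ("encapsulate_then_tunnel_fragment", 0, None)
    if is_ipv4 and not df:
        # Ordinary IPv4 fragmentation before encapsulation.
        return ("ipv4_fragment_then_encapsulate", 0, None)
    if pkt_len > path_mtu:
        return ("drop_and_send_ptb", None, max(1500, path_mtu))
    return ("encapsulate", 1, None)
```

For instance, with ENCAPS of 48 bytes and a path MTU estimate of 1452 bytes, a 1400 byte packet is fragmented at the encapsulation layer, while an unfragmentable 2000 byte packet is dropped with a PTB message reporting 1500 bytes.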

A first exception to these procedures occurs when the ingress and egress are both within the same well-managed administrative domain. In that case, the ingress MAY initially admit all packets into the tunnel without fragmentation. If the ingress subsequently receives an L2 PTB message reporting a size smaller than (1500+ENCAPS) it can commence fragmentation per the above algorithm.

A second exception occurs when the original source and ingress are both within the same well-managed administrative domain. In that case, if the underlying interface used by the ingress for tunneling configures an MTU smaller than (1500+HLEN) bytes, the ingress MAY drop packets that are larger than 1280 bytes and larger than the underlying interface MTU following encapsulation, and return an L3 PTB message to the original source.

3.13.4. Accommodating Large Control Messages

The tunnel ingress MUST accommodate control messages (i.e., IPv6 ND, DHCPv6, etc.) even if the path MTU is insufficient to deliver the message without fragmentation. For control messages that are larger than the known or assumed minimum path MTU, the ingress encapsulates the packet and inserts an encapsulation layer fragment header. Next, the ingress breaks the packet into a minimum number of non-overlapping fragments where the first fragment (including ENCAPS) is no larger than 1024 bytes and the remaining fragments are no larger than the first. The ingress then encapsulates each fragment (and for IPv4 sets the DF bit to 0) then admits them into the tunnel.
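
The fragment sizing rule used here (and for data packets in Section 3.13.3) can be sketched as follows. This Python example is an assumption-laden illustration: the function name is hypothetical, every fragment is assumed to carry ENCAPS bytes of headers, and fragment payloads are aligned to 8 bytes in the manner of IPv6 fragmentation, which this section does not itself require.

```python
def plan_fragments(payload_len, encaps, max_frag=1024, align=8):
    """Return the on-wire size of each fragment, headers included.

    Produces the minimum number of fragments such that the first is
    no larger than max_frag bytes (including ENCAPS) and no later
    fragment is larger than the first.
    """
    room = (max_frag - encaps) // align * align  # payload per fragment
    if room <= 0:
        raise ValueError("encapsulation overhead leaves no room for data")
    sizes = []
    remaining = payload_len
    while remaining > 0:
        chunk = min(room, remaining)
        sizes.append(chunk + encaps)
        remaining -= chunk
    return sizes
```

With ENCAPS of 48 bytes, a 1500 byte message yields two fragments of 1024 and 572 bytes.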

Control messages that exceed the 2KB minimum reassembly size rarely occur in current operational practice; however, the egress SHOULD be able to reassemble them if they appear in future applications. This means that the egress SHOULD include a configuration knob allowing the operator to set a larger reassembly buffer size if large control messages become more common in the future.

The ingress MAY send large control messages without fragmentation if there is assurance that large packets can traverse the tunnel without fragmentation.

3.13.5. Integrity

When fragmentation is needed, there must be assurance that reassembly can be safely conducted without incurring data corruption. Sources of corruption can include implementation errors, memory errors and misassociations of fragments from a first datagram with fragments of another datagram. The first two conditions (implementation and memory errors) are mitigated by modern systems and implementations that have demonstrated integrity through decades of operational practice. The third condition (reassembly misassociations) must be accounted for by AERO.

The fragmentation procedure described in the above algorithms can reuse standard IPv6 fragmentation and reassembly code. Since encapsulation layer fragment headers include a 32-bit ID field, there would need to be 2^32 packets alive in the network before a second packet with a duplicate ID enters the system with the (remote) possibility for a reassembly misassociation. For 1280 byte packets, and for a maximum network lifetime value of 60 seconds [RFC2460], this means that the ingress would need to produce ~(7 *10^11) bits/sec in order for a duplication event to be possible. This exceeds the bandwidth of modern data link technologies at the time of this writing, but not necessarily so for future data links. Although wireless data links commonly used by AERO Clients support vastly lower data rates, the aggregate data rates between AERO Servers and Relays may be substantial. However, high speed data links in the network core are expected to configure larger MTUs (e.g., 4KB, 8KB or even larger) such that unfragmented packets can be used. Hence, no integrity check is included to cover fragmentation and reassembly procedures.

When the ingress sends an IPv4-encapsulated packet with the DF bit set to 0 in the above algorithms, there is a chance that the packet may be fragmented by an IPv4 router somewhere within the tunnel. Since the largest such packet is only 1280 bytes, however, it is very likely that the packet will traverse the tunnel without incurring a restricting link. Even when a link within the tunnel configures an MTU smaller than 1280 bytes, it is very likely that it does so due to limited performance characteristics [RFC3819]. This means that the tunnel would not be able to convey fragmented IPv4-encapsulated packets fast enough to produce reassembly misassociations, as discussed above. However, AERO must also account for the possibility of tunnel paths that traverse a high-speed IPv4 link with a degenerate MTU.

Since the IPv4 header includes only a 16-bit ID field, there would only need to be 2^16 packets alive in the network before a second packet with a duplicate ID enters the system. For 1280 byte packets, and for a maximum network lifetime value of 120 seconds [RFC0791], this means that the ingress would only need to produce ~(5 *10^6) bits/sec in order for a duplication event to be possible - a value that is well within range for modern wired and wireless data link technologies.
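
The data rates quoted in this section follow from one formula: the number of distinct IDs, times the packet size in bits, divided by the maximum network lifetime. A short Python sketch of the arithmetic (the function name is hypothetical):

```python
def min_rate_for_id_wrap(id_bits, packet_bytes, lifetime_seconds):
    """Bits/sec an ingress must sustain before an ID can repeat while
    an earlier packet with the same ID may still be in the network."""
    return (2 ** id_bits) * packet_bytes * 8 / lifetime_seconds


# 32-bit encapsulation layer ID, 1280 byte packets, 60 second lifetime
rate_32 = min_rate_for_id_wrap(32, 1280, 60)    # about 7.3e11 bits/sec

# 16-bit IPv4 ID, 1280 byte packets, 120 second lifetime
rate_16 = min_rate_for_id_wrap(16, 1280, 120)   # about 5.6e6 bits/sec

if __name__ == "__main__":
    print(f"32-bit ID: {rate_32:.3g} bits/sec")
    print(f"16-bit ID: {rate_16:.3g} bits/sec")
```

The five orders of magnitude between the two results are the reason an integrity check is considered only for the IPv4 case.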

Therefore, if there is strong operational assurance that no IPv4 links capable of supporting data rates of 5Mbps or more configure an MTU smaller than 1280 bytes, the ingress MAY omit an integrity check for the IPv4 fragmentation and reassembly procedures; otherwise, the ingress SHOULD include an integrity check. When an upper layer encapsulation (e.g., IPsec) already includes an integrity check, the ingress need not include an additional check. Otherwise, the ingress calculates the encapsulation layer checksum over the encapsulated packet and writes the value into the encapsulation layer checksum header. The egress will then verify the checksum and discard the packet if the checksum is incorrect.

3.14. AERO Interface Error Handling

When an AERO node admits encapsulated packets into the AERO interface, it may receive link-layer (L2) or network-layer (L3) error indications.

An L2 error indication is an ICMP error message generated by a router on the path to the neighbor or by the neighbor itself. The message includes an IP header with the address of the node that generated the error as the source address and with the link-layer address of the AERO node as the destination address.

The IP header is followed by an ICMP header that includes an error Type, Code and Checksum. For ICMPv6 [RFC4443], the error Types include "Destination Unreachable", "Packet Too Big (PTB)", "Time Exceeded" and "Parameter Problem". For ICMPv4 [RFC0792], the error Types include "Destination Unreachable", "Fragmentation Needed" (a Destination Unreachable Code that is analogous to the ICMPv6 PTB), "Time Exceeded" and "Parameter Problem".

The ICMP header is followed by the leading portion of the packet that generated the error, also known as the "packet-in-error". For ICMPv6, [RFC4443] specifies that the packet-in-error includes: "As much of invoking packet as possible without the ICMPv6 packet exceeding the minimum IPv6 MTU" (i.e., no more than 1280 bytes). For ICMPv4, [RFC0792] specifies that the packet-in-error includes: "Internet Header + 64 bits of Original Data Datagram"; however, [RFC1812] Section 4.3.2.3 updates this specification by stating: "the ICMP datagram SHOULD contain as much of the original datagram as possible without the length of the ICMP datagram exceeding 576 bytes".

The L2 error message format is shown in Figure 3:

     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     ~                               ~
     |        L2 IP Header of        |
     |         error message         |
     ~                               ~
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |         L2 ICMP Header        |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ---
     ~                               ~   P
     |   IP and other encapsulation  |   a
     | headers of original L3 packet |   c
     ~                               ~   k
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   e
     ~                               ~   t
     |        IP header of           |   
     |      original L3 packet       |   i
     ~                               ~   n
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
     ~                               ~   e
     |    Upper layer headers and    |   r
     |    leading portion of body    |   r
     |   of the original L3 packet   |   o
     ~                               ~   r
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ---

Figure 3: AERO Interface L2 Error Message Format

  • When an AERO node receives an L2 Parameter Problem message, it processes the message the same as described for ordinary ICMP errors in the normative references [RFC0792][RFC4443].
  • When an AERO node receives persistent L2 IPv4 Time Exceeded messages, the IP ID field may be wrapping before earlier fragments have been processed. In that case, the node SHOULD begin including IPv4 integrity checks (see: Section 3.13.5).
  • When an AERO Client receives persistent L2 Destination Unreachable messages in response to tunneled packets that it sends to one of its dynamic neighbor correspondents, the Client SHOULD test the path to the correspondent using Neighbor Unreachability Detection (NUD) (see Section 3.18). If NUD fails, the Client SHOULD set ForwardTime for the corresponding dynamic neighbor cache entry to 0 and allow future packets destined to the correspondent to flow through a Server.
  • When an AERO Client receives persistent L2 Destination Unreachable messages in response to tunneled packets that it sends to one of its static neighbor Servers, the Client SHOULD test the path to the Server using NUD. If NUD fails, the Client SHOULD delete the neighbor cache entry and attempt to associate with a new Server.
  • When an AERO Server receives persistent L2 Destination Unreachable messages in response to tunneled packets that it sends to one of its static neighbor Clients, the Server SHOULD test the path to the Client using NUD. If NUD fails, the Server SHOULD cancel the DHCPv6 PD for the Client's ACP, withdraw its route for the ACP from the AERO routing system and delete the neighbor cache entry (see Section 3.18 and Section 3.19).
  • When an AERO Relay or Server receives an L2 Destination Unreachable message in response to a tunneled packet that it sends to one of its permanent neighbors, it discards the message since the AERO routing system is likely in a temporary transitional state that will soon re-converge. In case of a prolonged outage, however, the AERO routing system will compensate for Relays or Servers that have fallen silent.
  • When an AERO node receives an L2 PTB message, it caches the MTU field value of the L2 ICMP header then translates the message into an L3 PTB message if possible and forwards the message toward the original source as described below. Note that in some instances the packet-in-error field of an L2 PTB message may not include enough information for translation to an L3 PTB message. In that case, the AERO interface simply discards the L2 PTB message since translation of L2 PTB messages to L3 PTB messages can provide a useful optimization when possible, but is not critical for sources that correctly use PLPMTUD.

To translate an L2 PTB message to an L3 PTB message, the AERO node discards the L2 IP and ICMP headers, and also discards the encapsulation headers of the original L3 packet. Next the node encapsulates the included segment of the original L3 packet in an L3 IP and ICMP header, and sets the ICMP header Type and Code values to appropriate values for the L3 IP protocol. When the AERO node, AERO link neighbor and original source are all within the same administrative domain, the node writes the maximum of 1280 bytes and (L2 MTU - ENCAPS) into the MTU field of the L3 ICMP header. Otherwise, the node translates L2 PTB messages for which (L2 MTU - ENCAPS) is no less than 1500 bytes and discards all other L2 PTBs.
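
The MTU value carried in the translated message can be sketched as below. This Python illustration uses a hypothetical function name. For the case where the nodes are not all within the same administrative domain, the text above says only which L2 PTB messages are translated, so writing (L2 MTU - ENCAPS) into the L3 MTU field in that case is an assumption.

```python
def l3_ptb_mtu(l2_mtu, encaps, same_domain):
    """Return the MTU for the L3 PTB message, or None to discard.

    same_domain is True when the AERO node, the AERO link neighbor
    and the original source share one administrative domain.
    """
    candidate = l2_mtu - encaps
    if same_domain:
        return max(1280, candidate)
    if candidate >= 1500:
        return candidate  # assumed value; see the note above
    return None           # L2 PTB is discarded, not translated
```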

The node next writes the IP source address of the original L3 packet as the destination address of the L3 PTB message and determines the next hop to the destination. If the next hop is reached via the AERO interface, the node uses the IPv6 address "::" or the IPv4 address "0.0.0.0" as the IP source address of the L3 PTB message. Otherwise, the node uses one of its non link-local addresses as the source address of the L3 PTB message. The node finally calculates the ICMP checksum over the L3 PTB message and writes the Checksum in the corresponding field of the L3 ICMP header. The L3 PTB message therefore is formatted as follows:

     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     ~                               ~
     |        L3 IP Header of        |
     |         error message         |
     ~                               ~
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |         L3 ICMP Header        |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  ---
     ~                               ~   p
     |        IP header of           |   k
     |      original L3 packet       |   t
     ~                               ~ 
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   i  
     ~                               ~   n
     |    Upper layer headers and    |
     |    leading portion of body    |   e
     |   of the original L3 packet   |   r
     ~                               ~   r
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ---

Figure 4: AERO Interface L3 Error Message Format

When an AERO Relay receives an L3 packet for which the destination address is covered by an ASP, if there is no more-specific routing information for the destination the Relay drops the packet and returns an L3 Destination Unreachable message. The Relay first writes the IP source address of the original L3 packet as the destination address of the L3 Destination Unreachable message and determines the next hop to the destination. If the next hop is reached via the AERO interface, the Relay uses the IPv6 address "::" or the IPv4 address "0.0.0.0" as the IP source address of the L3 Destination Unreachable message and forwards the message to the next hop within the AERO interface. Otherwise, the Relay uses one of its non link-local addresses as the source address of the L3 Destination Unreachable message and forwards the message via a link outside the AERO interface.

When an AERO node receives any L3 error message via the AERO interface, it examines the destination address in the L3 IP header of the message. If the next hop toward the destination address of the error message is via the AERO interface, the node re-encapsulates and forwards the message to the next hop within the AERO interface. Otherwise, if the source address in the L3 IP header of the message is the IPv6 address "::" or the IPv4 address "0.0.0.0", the node writes one of its non link-local addresses as the source address of the L3 message and recalculates the IP and/or ICMP checksums. The node finally forwards the message via a link outside of the AERO interface.

3.15. AERO Router Discovery, Prefix Delegation and Address Configuration

3.15.1. AERO DHCPv6 Service Model

Each AERO Server configures a DHCPv6 server function to facilitate PD requests from Clients. Each Server is provisioned with a database of ACP-to-Client ID mappings for all Clients enrolled in the AERO system, as well as any information necessary to authenticate each Client. The Client database is maintained by a central administrative authority for the AERO link and securely distributed to all Servers, e.g., via the Lightweight Directory Access Protocol (LDAP) [RFC4511] or a similar distributed database service.

Therefore, no Server-to-Server DHCPv6 PD delegation state synchronization is necessary, and Clients can optionally hold separate delegations for the same ACPs from multiple Servers. In this way, Clients can associate with multiple Servers, and can receive new delegations from new Servers before deprecating delegations received from existing Servers. This provides the Client with a natural fault-tolerance and/or load balancing profile.

AERO Clients and Servers exchange Client link-layer address information using an option format similar to the Client Link Layer Address Option (CLLAO) defined in [RFC6939]. Due to practical limitations of CLLAO, however, AERO interfaces instead use Vendor-Specific Information Options as described in the following sections.

3.15.2. AERO Client Behavior

AERO Clients discover the link-layer addresses of AERO Servers via static configuration (e.g., from a flat-file map of Server addresses and locations), or through an automated means such as DNS name resolution. In the absence of other information, the Client resolves the FQDN "linkupnetworks.[domainname]" where "linkupnetworks" is a constant text string and "[domainname]" is a DNS suffix for the Client's underlying network (e.g., "example.com"). After discovering the link-layer addresses, the Client associates with one or more of the corresponding Servers.

To associate with a Server, the Client acts as a requesting router to request ACPs through a two-message (i.e., Solicit/Reply) DHCPv6 PD exchange [RFC3315][RFC3633]. The Client's Solicit message includes fe80::ffff:ffff:ffff:ffff as the IPv6 source address, 'All_DHCP_Relay_Agents_and_Servers' as the IPv6 destination address and the link-layer address of the Server as the link-layer destination address. The Solicit message also includes a Client Identifier option with a DUID and an Identity Association for Prefix Delegation (IA_PD) option. If the Client is pre-provisioned with ACPs associated with the AERO service, it MAY also include the ACPs in the IA_PD to indicate its preferences to the DHCPv6 server.

The Client also SHOULD include an AERO Link-registration Request (ALREQ) option in the Solicit message to register one or more links with the Server. The Server will include an AERO Link-registration Reply (ALREP) option in the corresponding Reply message as specified in Section 3.15.3. (The Client MAY omit the ALREQ option, in which case the Server will still include an ALREP option in its Reply with "Link ID" set to 0.)

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |      OPTION_VENDOR_OPTS       |         option-len (1)        |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                   enterprise-number = 45282                   |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |  opt-code = OPTION_ALREQ (0)  |         option-len (2)        |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |     Link ID   |  DSCP #1  |Prf|  DSCP #2  |Prf|   ...
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

Figure 5: AERO Link-registration Request (ALREQ) Option

The format for the ALREQ option is shown in Figure 5.

In the above format, the Client sets 'option-code' to OPTION_VENDOR_OPTS, sets 'option-len (1)' to the length of the option following this field, sets 'enterprise-number' to 45282 (see: "IANA Considerations"), sets 'opt-code' to the value 0 ("OPTION_ALREQ") and sets 'option-len (2)' to the length of the remainder of the option. The Client includes appropriate 'Link ID', 'DSCP' and 'Prf' values for the underlying interface over which the Solicit message will be issued, in the same way as specified for an S/TLLAO (see: Section 3.4). The Server will register each value with the Link ID in the Client's neighbor cache entry. The Client finally includes any necessary authentication options to identify itself to the DHCPv6 server, and sends the encapsulated Solicit message via the underlying interface corresponding to Link ID. (Note that this implies that, to register additional link-layer addresses and their associated DSCPs, the Client must send additional Rebind messages with ALREQ options to the Server following the initial PD exchange, using different underlying interfaces and their corresponding Link IDs.)
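The ALREQ layout in Figure 5 can be sketched as a byte encoder. This is a minimal illustration only: the function name `build_alreq` is hypothetical, the value 17 for OPTION_VENDOR_OPTS is the DHCPv6 option code from [RFC3315], and each (DSCP, Prf) pair is packed as a 6-bit DSCP followed by a 2-bit Prf, as drawn in the figure.

```python
import struct

OPTION_VENDOR_OPTS = 17          # DHCPv6 option code per RFC 3315
AERO_ENTERPRISE_NUMBER = 45282   # enterprise-number shown in Figure 5
OPTION_ALREQ = 0                 # vendor-specific opt-code for ALREQ

def build_alreq(link_id, dscp_prf_pairs):
    """Encode an ALREQ option for one Link ID and its (DSCP, Prf) pairs."""
    body = bytes([link_id])  # raises ValueError if not in 0..255
    for dscp, prf in dscp_prf_pairs:
        if not (0 <= dscp < 64 and 0 <= prf < 4):
            raise ValueError("DSCP is 6 bits and Prf is 2 bits")
        body += bytes([(dscp << 2) | prf])
    # option-len (2): length of the remainder of the option
    inner = struct.pack("!HH", OPTION_ALREQ, len(body)) + body
    vendor_data = struct.pack("!I", AERO_ENTERPRISE_NUMBER) + inner
    # option-len (1): length of everything following that field
    return struct.pack("!HH", OPTION_VENDOR_OPTS, len(vendor_data)) + vendor_data
```

For example, `build_alreq(1, [(46, 3)])` registers Link ID 1 with DSCP 46 at preference 3.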

When the Client receives its ACP via a Reply from the AERO Server, it creates a static neighbor cache entry with the Server's link-local address as the network-layer address and the Server's encapsulation address as the link-layer address. The Client then considers the link-layer address of the Server as the primary default encapsulation address for forwarding packets for which no more-specific forwarding information is available. The Client further caches any ASPs included in the ALREP option as ASPs to apply to the AERO link.

Next, the Client autoconfigures an AERO address for each of the delegated ACPs, assigns the address(es) to the AERO interface and sub-delegates the ACPs to its attached EUNs and/or the Client's own internal virtual interfaces. Alternatively, the Client can configure as many addresses as it wants from /64 prefixes taken from the ACPs and assign them to either an internal virtual interface ("weak end-system") or to the AERO interface itself ("strong end-system") [RFC1122] while black-holing the remaining portions of the /64s. Finally, the Client assigns one or more default IP routes to the AERO interface with the link-local address of a Server as the next hop.

After AERO address autoconfiguration, the Client SHOULD begin using the AERO address as the source address for further DHCPv6 messaging. The Client subsequently renews its ACP delegations through each of its Servers by sending Renew messages with the link-layer address of a Server as the link-layer destination address and the same options that were used in the initial PD request. Note that if the Client does not issue a Renew before the delegations expire (e.g., if the Client has been out of touch with the Server for a considerable amount of time) it must re-initiate the DHCPv6 PD procedure.

Since the addresses assigned to the Client's AERO interface are obtained from the unique ACP delegations it receives, there is no need for DAD on AERO links. Other nodes maliciously attempting to hijack addresses from an authorized Client's ACPs will be denied access to the network by the Server due to an unacceptable link-layer address and/or security parameters (see: Security Considerations).

When a Client attempts to perform a DHCPv6 PD exchange with a Server that is too busy to service the request, the Client may receive either a "NoPrefixAvail" status code in the Server's Reply per [RFC3633] or no reply at all. In that case, the Client SHOULD discontinue DHCPv6 PD attempts through this Server and try another Server.
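A minimal sketch of this failover behavior follows. The `pd_exchange` callable is hypothetical and stands in for a complete Solicit/Reply exchange; it is assumed to return `None` when no reply is received, or a `(status, prefixes)` tuple otherwise. The value 6 is the NoPrefixAvail status code from [RFC3633].

```python
STATUS_SUCCESS = 0
NO_PREFIX_AVAIL = 6  # DHCPv6 status code per RFC 3633

def associate(servers, pd_exchange):
    """Try each Server in turn; return (server, prefixes) or None.

    pd_exchange: callable(server) -> None on timeout, else (status, prefixes).
    """
    for server in servers:
        reply = pd_exchange(server)
        if reply is None:
            continue  # no reply at all: try another Server
        status, prefixes = reply
        if status == NO_PREFIX_AVAIL:
            continue  # Server too busy: discontinue attempts through it
        if status == STATUS_SUCCESS:
            return server, prefixes
    return None
```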

3.15.2.1. Autoconfiguration for Constrained Platforms

On some platforms (e.g., popular cell phone operating systems), the act of assigning a default IPv6 route and/or assigning an address to an interface may not be permitted from a user application due to security policy. Typically, those platforms include a TUN/TAP interface [TUNTAP] that acts as a point-to-point conduit between user applications and the AERO interface. In that case, the Client can instead generate a "synthesized RA" message. The message conforms to [RFC4861] and is prepared as follows:

  • the IPv6 source address is the Client's AERO address
  • the IPv6 destination address is all-nodes multicast
  • the Router Lifetime is set to a time that is no longer than the ACP DHCPv6 lifetime
  • the message does not include a Source Link Layer Address Option (SLLAO)
  • the message includes a Prefix Information Option (PIO) with a /64 prefix taken from the ACP as the prefix for autoconfiguration

The Client then sends the synthesized RA message via the TUN/TAP interface, where the operating system kernel will interpret it as though it were generated by an actual router. The operating system will then install a default route and use StateLess Address AutoConfiguration (SLAAC) to configure an IPv6 address on the TUN/TAP interface. Methods for similarly installing an IPv4 default route and IPv4 address on the TUN/TAP interface are based on synthesized DHCPv4 messages [RFC2131].
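The synthesized RA described above might be assembled as in the following sketch, which builds only the ICMPv6 message body (the IPv6 header and the write to the TUN/TAP device are omitted). The function names, the Cur Hop Limit of 64, the on-link and autonomous flags in the PIO, and the choice of the first /64 of the ACP are assumptions of this sketch; the field layout follows [RFC4861].

```python
import ipaddress
import struct

ALL_NODES = ipaddress.IPv6Address("ff02::1")

def icmpv6_checksum(src, dst, payload):
    """Internet checksum over the IPv6 pseudo-header (next header 58) and payload."""
    data = src.packed + dst.packed + struct.pack("!I3xB", len(payload), 58) + payload
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_synthesized_ra(aero_addr, acp, acp_lifetime):
    """Return the ICMPv6 bytes of a synthesized RA with a single PIO."""
    src = ipaddress.IPv6Address(aero_addr)
    net = ipaddress.IPv6Network(acp)
    if net.prefixlen > 64:
        raise ValueError("ACP must contain at least one /64")
    # Router Lifetime is a 16-bit field and must not exceed the ACP lifetime.
    router_lifetime = min(acp_lifetime, 0xFFFF)
    # Type 134, code 0, checksum placeholder, Cur Hop Limit, flags,
    # Router Lifetime, Reachable Time, Retrans Timer.
    ra = struct.pack("!BBHBBHII", 134, 0, 0, 64, 0, router_lifetime, 0, 0)
    # PIO: type 3, length 4 (units of 8 octets), prefix length 64, L and A flags,
    # valid and preferred lifetimes, reserved, then the 128-bit prefix.
    pio = struct.pack("!BBBBIII", 3, 4, 64, 0xC0, acp_lifetime, acp_lifetime, 0)
    pio += net.network_address.packed  # first /64 taken from the ACP
    msg = bytearray(ra + pio)          # note: no SLLAO is appended
    struct.pack_into("!H", msg, 2, icmpv6_checksum(src, ALL_NODES, bytes(msg)))
    return bytes(msg)
```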

3.15.3. AERO Server Behavior

AERO Servers configure a DHCPv6 server function on their AERO links. AERO Servers arrange to add their encapsulation layer IP addresses (i.e., their link-layer addresses) to a static map of Server addresses for the link and/or the DNS resource records for the FQDN "linkupnetworks.[domainname]" before entering service.

When an AERO Server receives a prospective Client's Solicit on its AERO interface, and the Server is too busy to service the message, it SHOULD return a Reply with status code "NoPrefixAvail" per [RFC3633]. Otherwise, the Server authenticates the message. If authentication succeeds, the Server determines the correct ACPs to delegate to the Client by searching the Client database.

When the Server delegates the ACPs, it also creates IP forwarding table entries so that the AERO routing system will propagate the ACPs to all Relays that aggregate the corresponding ASP (see: Section 3.7). Next, the Server prepares a Reply message to send to the Client while using fe80::ID as the IPv6 source address, the link-local address taken from the Client's Solicit as the IPv6 destination address, the Server's link-layer address as the source link-layer address, and the Client's link-layer address as the destination link-layer address. The Server also includes IA_PD options with the delegated ACPs. Since the Client may experience a fault that prevents it from issuing a Release before departing from the network, Servers should set a short prefix lifetime (e.g., 40 seconds) so that stale prefix delegation state can be flushed out of the network.

The Server also includes an ALREP option that includes the UDP Port Number and IP Address values it observed when it received the ALREQ in the Client's original DHCPv6 message (if present) followed by the ASP(s) for the AERO link. The ALREP option is formatted as shown in Figure 6:

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |      OPTION_VENDOR_OPTS       |         option-len (1)        |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                   enterprise-number = 45282                   |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |  opt-code = OPTION_ALREP (1)  |         option-len (2)        |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |    Link ID    |    Reserved   |         UDP Port Number       |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +                                                               +
     |                                                               |
     +                          IP Address                           +
     |                                                               |
     +                                                               +
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +              AERO Service Prefix (ASP) #1     +-+-+-+-+-+-+-+-+
     |                                               |  Prefix Len   |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +              AERO Service Prefix (ASP) #2     +-+-+-+-+-+-+-+-+
     |                                               |  Prefix Len   |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     ~                                                               ~
     ~                                                               ~

Figure 6: AERO Link-registration Reply (ALREP) Option

Each ASP in the ALREP option is encoded as described in Section 3.3, except that the low-order 8 bits of the ASP field encode the prefix length instead of the low-order 8 bits of the prefix. Therefore, the longest prefix that can appear as an ASP is /56 for IPv6 or /24 for IPv4. (Note that if the Client did not include an ALREQ option in its DHCPv6 message, the Server MUST still include an ALREP option in the corresponding reply with 'Link ID' set to 0.)
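The ASP field encoding and the ALREP layout in Figure 6 can be sketched as follows for the IPv6 case only. The function names are hypothetical, and the sketch assumes that an IPv6 ASP occupies the high-order 56 bits of the 64-bit field; the IPv4 encoding is not shown because it depends on text outside this section. The 'IP Address' field is returned as a raw 128-bit value.

```python
import ipaddress
import struct

def encode_asp(prefix):
    """Pack an IPv6 ASP (at most /56) into a 64-bit field; low 8 bits = length."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen > 56:
        raise ValueError("longest IPv6 ASP is /56")
    return (int(net.network_address) >> 64) | net.prefixlen

def decode_asp(field):
    """Inverse of encode_asp: recover the IPv6 prefix from a 64-bit ASP field."""
    plen = field & 0xFF
    if plen > 56:
        raise ValueError("longest IPv6 ASP is /56")
    return ipaddress.IPv6Network(((field & 0xFFFFFFFFFFFFFF00) << 64, plen))

def parse_alrep(option):
    """Parse an ALREP option into (link_id, udp_port, ip_address, [ASPs])."""
    code, _len1, enterprise, opt, len2 = struct.unpack_from("!HHIHH", option, 0)
    if (code, enterprise, opt) != (17, 45282, 1):
        raise ValueError("not an AERO ALREP option")
    body = option[12:12 + len2]
    link_id, _reserved, port = struct.unpack_from("!BBH", body, 0)
    ip = ipaddress.IPv6Address(body[4:20])
    asps = [decode_asp(struct.unpack_from("!Q", body, off)[0])
            for off in range(20, len(body), 8)]
    return link_id, port, ip, asps
```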

When the Server admits the Reply message into the AERO interface, it creates a static neighbor cache entry for the Client based on the DUID and AERO addresses with lifetime set to no more than the delegation lifetimes and the Client's link-layer address as the link-layer address for the Link ID specified in the ALREQ. The Server then uses the Client link-layer address information in the ALREQ option as the link-layer address for encapsulation based on the (DSCP, Prf) information.

After the initial DHCPv6 PD exchange, the AERO Server maintains the neighbor cache entry for the Client until the delegation lifetimes expire. If the Client issues a Renew, the Server extends the lifetimes. If the Client issues a Release, or if the Client does not issue a Renew before the lifetime expires, the Server deletes the neighbor cache entry for the Client and withdraws the IP routes from the AERO routing system.
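The lifetime handling above can be sketched as a small table keyed by Client DUID. The class and method names are hypothetical, time is passed in explicitly to keep the example deterministic, and route withdrawal is represented by a caller-supplied callback.

```python
class ServerNeighborCache:
    """Tracks Client neighbor cache entries by delegation expiry time."""

    def __init__(self, withdraw_routes):
        self._expiry = {}                       # DUID -> absolute expiry time
        self._withdraw_routes = withdraw_routes # called with the DUID on deletion

    def on_reply(self, duid, lifetime, now):
        """Initial PD exchange completed: create the entry."""
        self._expiry[duid] = now + lifetime

    def on_renew(self, duid, lifetime, now):
        """Renew extends the lifetime of an existing entry."""
        if duid not in self._expiry:
            return False
        self._expiry[duid] = now + lifetime
        return True

    def on_release(self, duid):
        """Release deletes the entry and withdraws the routes."""
        self._delete(duid)

    def expire(self, now):
        """Delete every entry whose lifetime ended without a Renew."""
        for duid in [d for d, t in self._expiry.items() if t <= now]:
            self._delete(duid)

    def _delete(self, duid):
        if self._expiry.pop(duid, None) is not None:
            self._withdraw_routes(duid)

    def __contains__(self, duid):
        return duid in self._expiry
```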

3.15.3.1. Lightweight DHCPv6 Relay Agent (LDRA)

AERO Clients and Servers are always on the same link (i.e., the AERO link) from the perspective of DHCPv6. However, in some implementations the DHCPv6 server and AERO interface driver may be located in separate modules. In that case, the Server's AERO interface driver module can act as a Lightweight DHCPv6 Relay Agent (LDRA)[RFC6221] to relay DHCPv6 messages to and from the DHCPv6 server module.

When the LDRA receives a DHCPv6 message from a Client, it prepares an ALREP option as described above, then wraps the option in a Relay-Supplied Options option (RSOO) [RFC6422]. The LDRA then incorporates the option into the Relay-Forward message and forwards the message to the DHCPv6 server.

When the DHCPv6 server receives the Relay-Forward message, it caches the ALREP option and authenticates the encapsulated DHCPv6 message. The DHCPv6 server subsequently ignores the ALREQ option itself, since the relay has already included the ALREP option.

When the DHCPv6 server prepares a Reply message, it then includes the ALREP option in the body of the message along with any other options, then wraps the message in a Relay-Reply message. The DHCPv6 server then delivers the Relay-Reply message to the LDRA, which discards the Relay-Reply wrapper and delivers the DHCPv6 message to the Client.

3.15.4. Deleting Link Registrations

After an AERO Client registers its Link IDs and their associated (DSCP, Prf) values with the AERO Server, the Client may wish to delete one or more Link registrations, e.g., if an underlying link becomes unavailable. To do so, the Client prepares a Rebind message that includes an AERO Link-registration Delete (ALDEL) option and sends the Rebind message to the Server over any available underlying link. The ALDEL option is formatted as shown in Figure 7:

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |      OPTION_VENDOR_OPTS       |         option-len (1)        |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                   enterprise-number = 45282                   |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |  opt-code = OPTION_ALDEL (2)  |         option-len (2)        |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |   Link ID #1  |  Link ID #2   |  Link ID #3   |    ...
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

Figure 7: AERO Link-registration Delete (ALDEL) Option

If the Client wishes to discontinue use of a Server and thereby delete all of its Link ID associations, it must issue a Release to delete the entire neighbor cache entry, i.e., instead of issuing a Rebind with one or more ALDEL options.
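The ALDEL layout in Figure 7 can be sketched as a byte encoder in the same manner as the other vendor-specific options. The function name `build_aldel` is hypothetical, and the value 17 for OPTION_VENDOR_OPTS is the DHCPv6 option code from [RFC3315].

```python
import struct

OPTION_VENDOR_OPTS = 17          # DHCPv6 option code per RFC 3315
AERO_ENTERPRISE_NUMBER = 45282   # enterprise-number shown in Figure 7
OPTION_ALDEL = 2                 # vendor-specific opt-code for ALDEL

def build_aldel(link_ids):
    """Encode an ALDEL option listing the Link IDs to be deleted."""
    if not link_ids:
        raise ValueError("ALDEL needs at least one Link ID")
    body = bytes(link_ids)  # one octet per Link ID; raises if not in 0..255
    inner = struct.pack("!HH", OPTION_ALDEL, len(body)) + body
    vendor_data = struct.pack("!I", AERO_ENTERPRISE_NUMBER) + inner
    return struct.pack("!HH", OPTION_VENDOR_OPTS, len(vendor_data)) + vendor_data
```

The resulting bytes would be carried in a Rebind message; deleting all Link IDs at once is instead done with a Release, as noted above.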

3.16. AERO Forwarding Agent Behavior

AERO Servers MAY associate with one or more companion AERO Forwarding Agents as platforms for offloading high-speed data plane traffic. When an AERO Server receives a Client's Solicit/Renew/Rebind/Release message, it services the message then forwards the corresponding Reply message to the Forwarding Agent. When the Forwarding Agent receives the Reply message, it creates, updates or deletes a neighbor cache entry with the Client's AERO address and link-layer information included in the Reply message. The Forwarding Agent then forwards the Reply message back to the AERO Server, which forwards the message to the Client. In this way, Forwarding Agent state is managed in conjunction with Server state, with the Client responsible for reliability.

When an AERO Server receives a data packet on an AERO interface with a network layer destination address for which it has distributed forwarding information to a Forwarding Agent, the Server returns a Redirect message to the source neighbor (subject to rate limiting) then forwards the data packet as usual. The Redirect message includes a TLLAO with the link-layer address of the Forwarding Agent.

When the source neighbor receives the Redirect message, it SHOULD record the link-layer address in the TLLAO as the encapsulation address to use for sending subsequent data packets. However, the source MUST continue to use the primary link-layer address of the Server as the encapsulation address for sending control messages.
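The split between data plane and control plane encapsulation addresses can be sketched as follows. The class and attribute names are hypothetical; the sketch only shows that a Redirect changes the address used for data packets while control messages keep using the Server's primary link-layer address.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ServerNeighbor:
    """Source neighbor's view of a Server and its Forwarding Agent."""
    primary_lladdr: str                    # Server's primary link-layer address
    redirected_lladdr: Optional[str] = None  # learned from a Redirect TLLAO

    def on_redirect(self, tllao_lladdr):
        """Record the Forwarding Agent address carried in the TLLAO."""
        self.redirected_lladdr = tllao_lladdr

    def encap_address(self, is_control):
        """Pick the encapsulation address for the next outgoing message."""
        if is_control or self.redirected_lladdr is None:
            return self.primary_lladdr
        return self.redirected_lladdr
```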

3.17. AERO Intradomain Route Optimization

When a source Client forwards packets to a prospective correspondent Client within the same AERO link domain (i.e., one for which the packet's destination address is covered by an ASP), the source Client MAY initiate an intra-domain AERO route optimization procedure. It is important to note that this procedure is initiated by the Client; if the procedure were initiated by the Server, the Server would have no way of knowing whether the Client was actually able to contact the correspondent over the route-optimized path.

The procedure is based on an exchange of IPv6 ND messages using a chain of AERO Servers and Relays as a trust basis. This procedure is in contrast to the Return Routability procedure required for route optimization to a correspondent Client located in the Internet as described in Section 3.22. The following sections specify the AERO intradomain route optimization procedure.

3.17.1. Reference Operational Scenario

Figure 8 depicts the AERO intradomain route optimization reference operational scenario, using IPv6 addressing as the example (while not shown, a corresponding example for IPv4 addressing can be easily constructed). The figure shows an AERO Relay ('R1'), two AERO Servers ('S1', 'S2'), two AERO Clients ('C1', 'C2') and two ordinary IPv6 hosts ('H1', 'H2'):