Network Working Group C. Hopps
Internet-Draft LabN Consulting, L.L.C.
Intended status: Standards Track March 11, 2019
Expires: September 12, 2019

IP Traffic Flow Security
draft-hopps-ipsecme-iptfs-00

Abstract

This document describes a mechanism to enhance IPsec traffic flow security by adding traffic flow confidentiality to encrypted IP encapsulated traffic. Traffic flow confidentiality is provided by obscuring the size and frequency of IP traffic using a fixed-sized, constant-send-rate IPsec tunnel. The solution allows for congestion control as well.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on September 12, 2019.

Copyright Notice

Copyright (c) 2019 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.



1. Introduction

Traffic Analysis ([RFC4301], [AppCrypt]) is the act of extracting information about data being sent through a network. While one may directly obscure the data through the use of encryption [RFC4303], the traffic pattern itself exposes information due to variations in its shape and timing ([I-D.iab-wire-image], [AppCrypt]). Hiding the size and frequency of traffic is referred to as Traffic Flow Confidentiality (TFC) per [RFC4303].

[RFC4303] provides for TFC by allowing padding to be added to encrypted IP packets and allowing for sending all-pad packets (indicated using protocol 59). This method has the major limitation that it can significantly under-utilize the available bandwidth.

The IP-TFS solution provides for full TFC without the aforementioned bandwidth limitation. To do this we use a constant-send-rate IPsec [RFC4303] tunnel with fixed-sized encapsulating packets; however, these fixed-sized packets can contain partial, full or multiple IP packets to maximize the bandwidth of the tunnel.

For a comparison of the overhead of IP-TFS with the TFC solution prescribed by [RFC4303], see Appendix A.

Additionally, IP-TFS provides a way to deal with network congestion [RFC2914]. This is important when the IP-TFS user is not in full control of the domain through which the IP-TFS tunnel path flows.

1.1. Terminology & Concepts

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

This document assumes familiarity with IP security concepts described in [RFC4301].

2. The IP-TFS Tunnel

As mentioned in Section 1, IP-TFS utilizes an IPsec [RFC4303] tunnel as its transport. To provide for full TFC, fixed-sized encapsulating packets are sent at a constant rate on the tunnel.

The primary input to the tunnel algorithm is the requested bandwidth of the tunnel. Two values are then required to provide this bandwidth: the fixed size of the encapsulating packets and the rate at which to send them.

The fixed packet size may either be specified manually or be determined through the use of Path MTU Discovery ([RFC1191] and [RFC8201]).

Given the encapsulating packet size and the requested tunnel bandwidth, the correct packet send rate can be calculated. The packet send rate is the requested bandwidth divided by the payload size of the encapsulating packet.
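
As an illustration, the following non-normative Python sketch computes the packet send rate from a requested tunnel bandwidth and a fixed encapsulating packet size. The 40 octet per-packet overhead and the assumption that the requested bandwidth refers to inner (payload) bandwidth are taken from Appendix A and are illustrative only.

   # Non-normative sketch: derive the constant send rate for the tunnel.
   # The 40 octet per-packet overhead (outer IP + ESP + IP-TFS header) is
   # the value used in Appendix A; actual overhead depends on the ESP
   # algorithms in use.
   def packet_send_rate(requested_bw_bps, fixed_packet_size, overhead=40):
       """Return the IP-TFS packet send rate in packets per second."""
       payload_size = fixed_packet_size - overhead   # octets of inner data
       return requested_bw_bps / (payload_size * 8)

   # Example: a 100 Mbps tunnel with 1500 octet encapsulating packets
   # sends roughly 8562 packets per second.
   print(packet_send_rate(100_000_000, 1500))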

The egress of the IP-TFS tunnel SHOULD NOT impose any restrictions on tunnel packet size or arrival rate. Packet size and send rate are entirely a function of the ingress (sending) side of the IP-TFS tunnel. Indeed, the ingress (sending) side of the IP-TFS tunnel MUST be allowed by the egress side to vary the size and rate at which it sends encapsulating packets, including sending them larger, smaller, faster or slower than the requested size and rate.

2.1. Tunnel Content

As previously mentioned, one issue with the TFC padding solution in [RFC4303] is the large amount of wasted bandwidth as only one IP packet can be sent per encapsulating packet. In order to maximize bandwidth IP-TFS breaks this one-to-one association.

With IP-TFS we fragment as well as aggregate the inner IP traffic flow into fixed-sized encapsulating IP tunnel packets. We only pad the tunnel packets if there is no data available to be sent at the time of tunnel packet transmission.

In order to do this we create a new payload data type identified with a new IP protocol number, IPTFS_PROTOCOL (TBD). A payload of IPTFS_PROTOCOL type consists of a 32 bit header followed by one or more full or partial data blocks.

2.1.1. IPSec/ESP Payload

 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
 . Outer Encapsulating Header ...                                  .
 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
 . ESP Header...                                                   .
 +-----------------------------------------------------------------+
 |               ...            :           BlockOffset            |
 +-----------------------------------------------------------------+
 |       Data Blocks Payload ...                                   ~
 ~                                                                 ~
 ~                                                                 |
 +-----------------------------------------------------------------|
 . ESP Trailer...                                                  .
 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Figure 1: Layout of IP-TFS IPSec Packet

The BlockOffset value is either zero or an offset into, or past the end of, the data blocks payload. If the value is zero, a new data block immediately follows the fixed header (i.e., the BlockOffset value). Conversely, if the BlockOffset value is non-zero, it points at the start of the next data block. The BlockOffset can point past the end of the data blocks payload, which means that the next data block occurs in a subsequent encapsulating packet. When the BlockOffset is non-zero, the data immediately following the header belongs to the previous data block that is still being re-assembled.
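
The following non-normative Python fragment sketches how an egress implementation might act on the BlockOffset value when processing a received payload; the function and variable names are illustrative only. The return value is the offset of the first new data block within this payload, or None when no new data block starts in it.

   # Non-normative sketch of egress BlockOffset handling.  'data' is the
   # data blocks portion of a decrypted payload and 'pending' collects the
   # unfinished data block carried over from earlier packets (illustrative).
   def handle_payload(block_offset, data, pending):
       if block_offset == 0:
           # A new data block starts immediately after the 4 octet header.
           return 0
       if block_offset >= len(data):
           # The whole payload continues the in-progress data block; the
           # next data block starts in a later encapsulating packet.
           pending.extend(data)
           return None
       # The first block_offset octets finish the in-progress data block;
       # a new data block starts right after them.
       pending.extend(data[:block_offset])
       return block_offset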

2.1.2. Data-Blocks

 +-----------------------------------------------------------------+
 | Type  | rest of IPv4, IPv6 or pad.
 +--------

Figure 2: Layout of IP-TFS data block

A data-block is defined by a 4-bit type code followed by the data block data. The type values have been carefully chosen to coincide with the IPv4/IPv6 version field values so that no per-data-block type overhead is required to encapsulate an IP packet. Likewise, the length of the data block is extracted from the encapsulated IPv4 or IPv6 packet's length field.
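
The following non-normative sketch shows how the data-block type and length might be recovered from the initial octets of a data block. The treatment of the IPv6 length (Payload Length plus the 40 octet fixed header) and of pad blocks is an assumption of this sketch based on [RFC8200].

   # Non-normative sketch: recover the type and total length of a data
   # block from its initial octets ('block' starts at a data block
   # boundary; this is an illustrative helper, not a complete parser).
   def data_block_type_and_length(block):
       block_type = block[0] >> 4      # high nibble: IP version, or 0 for pad
       if block_type == 4:             # IPv4: Total Length at octets 2-3
           return block_type, int.from_bytes(block[2:4], "big")
       if block_type == 6:             # IPv6: Payload Length at octets 4-5,
           return block_type, 40 + int.from_bytes(block[4:6], "big")  # plus header
       if block_type == 0:             # pad: runs to the end of the payload
           return block_type, None
       raise ValueError("unknown data block type")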

2.1.3. No Implicit Padding

It is worth noting that there is no need for implicit padding at the end of an encapsulating packet. Even when the start of a data block occurs near the end of an encapsulating packet, such that there is no room for the length field of the encapsulated header in the current encapsulating packet, the fact that the length field comes at a known location and is guaranteed to be present is enough to fetch the length from the subsequent encapsulating packet's payload.

2.1.4. IP Header Value Mapping

[RFC4301] provides some direction on when and how to map various values from an inner IP header to the outer encapsulating header, namely the Don't-Fragment (DF) bit ([RFC0791] and [RFC8200]), the Differentiated Services (DS) field [RFC2474] and the Explicit Congestion Notification (ECN) field [RFC3168]. Unlike [RFC4301], with IP-TFS we may, and often will, encapsulate more than one IP packet per ESP packet. To deal with this we further restrict these mappings. In particular, the inner DF bit is never mapped, as it is unrelated to the IP-TFS tunnel functionality: the inner packets are never directly fragmented, and the inner packets do not affect the fragmentation of the outer encapsulating packets. Likewise, the ECN value need not be mapped, as any congestion related to the constant-send-rate IP-TFS tunnel is unrelated (by design!) to the inner traffic flow. Finally, by default the DS field SHOULD NOT be copied, although an implementation MAY choose to allow configuration to override this behavior. An implementation SHOULD also allow the DS value to be set by configuration.

2.2. Exclusive SA Use

It is not the intention of this specification to allow for mixed use of an IPsec SA. In other words, an SA that is created for IP-TFS is exclusively for IP-TFS use and MUST NOT have non-IP-TFS payloads such as IP (IP protocol 4), TCP transport (IP protocol 6), or ESP pad packets (protocol 59) intermixed with IP-TFS (IP protocol TBD) payloads. While it's possible to envision making the algorithm work in the presence of sequence number skips in the IP-TFS payload stream, the added complexity is not deemed worthwhile. Other IPsec uses can configure and use their own SAs.

2.3. Initiation of TFS mode

While normally a user will configure their IPsec tunnel to operate in IP-TFS mode to start, we also allow IP-TFS mode to be enabled post-SA creation. This may be useful for debugging or other purposes. In this late enabled mode the receiver would switch to IP-TFS mode on receipt of the first ESP payload with the IPTFS_PROTOCOL indicated as the payload type.

2.4. Example of an encapsulated IP packet flow

Below we show an example inner IP packet flow within the encapsulating tunnel packet stream. Notice how encapsulated IP packets can start and end anywhere, and how more than one or less than one may occur in a single encapsulating packet.

  Offset: 0        Offset: 100    Offset: 2900    Offset: 1400
 [ ESP1  (1500) ][ ESP2  (1500) ][ ESP3  (1500) ][ ESP4  (1500) ]
 [--800--][--800--][60][-240-][--4000----------------------][pad]

Figure 3: Inner and Outer Packet Flow

The encapsulated IP packet flow (lengths include IP header and payload) is as follows: an 800 octet packet, an 800 octet packet, a 60 octet packet, a 240 octet packet, a 4000 octet packet.

The BlockOffset values in the 4 IP-TFS payload headers for this packet flow would thus be 0, 100, 2900 and 1400, respectively. The first encapsulating packet ESP1 has a zero BlockOffset, which points at the IP data block immediately following the IP-TFS header. The following packet ESP2's BlockOffset points inward 100 octets to the start of the 60 octet data block. The third encapsulating packet ESP3 contains the middle portion of the 4000 octet data block, so the offset points past its end and into the fourth encapsulating packet. The fourth packet ESP4's offset is 1400, pointing at the padding which follows the completion of the continued 4000 octet packet.

Having the BlockOffset always point at the next available data block allows for quick recovery with minimal inner packet loss in the presence of outer encapsulating packet loss.
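
The example offsets above can be reproduced with the following non-normative sketch, which, like Figure 3, simplifies by assuming each encapsulating packet carries 1500 octets of data-block payload (header and trailer sizes are ignored).

   # Non-normative sketch reproducing the example BlockOffset values.
   def block_offsets(inner_sizes, payload_size=1500):
       offsets, queued, carry = [], list(inner_sizes), 0
       while queued or carry:
           offsets.append(carry)            # octets continuing a prior block
           room = payload_size
           sent = min(carry, room)          # finish the unfinished inner packet
           carry -= sent
           room -= sent
           while queued and room > 0:       # then start new inner packets
               take = min(queued[0], room)
               room -= take
               if take == queued[0]:
                   queued.pop(0)            # inner packet fully placed
               else:
                   carry = queued.pop(0) - take   # remainder spills over
           # any remaining 'room' is filled with a pad data block
       return offsets

   print(block_offsets([800, 800, 60, 240, 4000]))   # [0, 100, 2900, 1400]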

2.5. Modes of operation

Just as with normal IPsec tunnels, IP-TFS tunnels are unidirectional. Bidirectional functionality is achieved by setting up two tunnels, one in each direction.

An IP-TFS tunnel can operate in two modes: a non-congestion-controlled mode and a congestion-controlled mode.

2.5.1. Non-Congestion Controlled Mode

In the non-congestion-controlled mode, IP-TFS sends fixed-sized packets at a constant rate. The send rate is not automatically adjusted, regardless of any network congestion (i.e., packet loss).

For reasons similar to those given in [RFC7510], the non-congestion-controlled mode should only be used where the user has full administrative control over the path the tunnel will take. This is required so the user can guarantee the bandwidth and also be sure that the tunnel is not negatively affecting network congestion [RFC2914]. In this case packet loss should be reported to the administrator (e.g., via syslog, YANG notification, SNMP traps, etc.) so that any failures due to a lack of bandwidth can be corrected.

2.5.2. Congestion Controlled Mode

With the congestion controlled mode, IP-TFS adapts to network congestion by lowering the packet send rate to accommodate the congestion, as well as raising the rate when congestion subsides.

If congestion were handled in the network at an octet level, we might consider lowering the IPsec (encapsulation) packet size to adapt; however, as congestion is normally handled in the network by dropping packets, we instead choose to lower the frequency at which we send our fixed-sized packets. This choice also minimizes transport overhead.

The output of a congestion control algorithm SHOULD adjust the frequency at which the ingress sends packets until the congestion is accommodated. While this document does not standardize the congestion control algorithm, the algorithm used by an implementation SHOULD conform to the guidelines in [RFC2914].
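
By way of illustration only (this document does not standardize a congestion control algorithm), an ingress could drive a simple additive-increase/multiplicative-decrease adjustment of its send rate from the drop reports described in Section 3.

   # Non-normative sketch: one possible loss-driven rate adjustment.  The
   # AIMD rule and all constants here are illustrative; they are not
   # specified by this document.
   def adjust_rate(rate_pps, drop_count, min_pps, max_pps, step_pps=10):
       if drop_count > 0:
           rate_pps /= 2            # multiplicative decrease on reported loss
       else:
           rate_pps += step_pps     # additive increase while loss free
       return max(min_pps, min(rate_pps, max_pps))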

When choosing a congestion control algorithm, it is worth noting that IP-TFS does not provide reliable delivery of IP traffic, so per-packet ACKs are not required and are not provided.

It is worth noting that the adjustable send rate of the congestion-controlled IP-TFS tunnel is controlled by network congestion. As long as the encapsulated traffic flow's shape and timing do not directly affect the network congestion, the variations in the tunnel rate will not weaken the provided traffic flow confidentiality.

2.5.2.1. Circuit Breakers

In addition to congestion control, implementations MAY choose to define and implement circuit breakers [RFC8084] as a recovery method of last resort. Enabling circuit breakers is also a reason a user may wish to enable congestion information reports even when using the non-congestion-controlled mode of operation. The definition of circuit breakers is outside the scope of this document.

3. Congestion Information

In order to support the congestion control mode, the receiver (egress tunnel endpoint) MUST send regular packet drop reports to the sender (ingress tunnel endpoint). These reports indicate the number of packet drops during a sequence of packets. The sequence or range of packets is identified using the start and end ESP sequence numbers of the packet range.
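
For illustration, the report contents might be derived from the ESP sequence numbers seen during the reporting interval, as in the following non-normative sketch (sequence number wrap and extended sequence numbers are ignored for brevity).

   # Non-normative sketch: build one congestion information report from
   # the ESP sequence numbers received in the reporting interval.
   def drop_report(received_seqs):
       start, end = min(received_seqs), max(received_seqs)
       expected = end - start + 1
       drops = expected - len(set(received_seqs))
       return {"AckSeqStart": start, "AckSeqEnd": end,
               "DropCount": min(drops, 0xFFFFFF)}   # 24 bit counter saturates

   print(drop_report([1, 2, 3, 5, 6, 9]))   # 3 drops: sequence numbers 4, 7, 8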

These congestion information reports MAY also be sent when in the non-congestion controlled mode to allow for reporting from the sending device or to implement Circuit Breakers [RFC8084].

The congestion information is sent using IKEv2 INFORMATIONAL exchange notifications [RFC7296]. These notifications are sent at a configured interval (which can be set to 0 to disable the sending of the reports).

3.1. ECN Support

In addition to normal packet loss information, IP-TFS supports use of the ECN bits in the encapsulating IP header [RFC3168] for identifying congestion. If ECN use is enabled and a packet arrives at the egress endpoint with the Congestion Experienced (CE) value set, the receiver counts that packet as dropped, although it does not drop it. When CE information is used in calculating the packet drop count, the receiver also sets the E bit in the congestion information notification data. In order to respond quickly to the congestion indication, the receiver MAY immediately send a congestion information notification to the sender upon receiving a packet with the CE indication. This additional immediate send SHOULD be done at most once per normal congestion information sending interval.

As noted in [RFC3168] the ECN bits are not protected by IPsec and thus may constitute a covert channel. For this reason ECN use SHOULD NOT be enabled by default.

4. Configuration

IP-TFS is meant to be deployable with a minimal amount of configuration. All IP-TFS specific configuration (i.e., in addition to the underlying IPsec tunnel configuration) should be able to be specified at the tunnel ingress (sending) side alone (i.e., single-ended provisioning).

4.1. Bandwidth

Bandwidth is a local configuration option. For non-congestion controlled mode the bandwidth SHOULD be configured. For congestion controlled mode one can configure the bandwidth or have no configuration and let congestion control discover the maximum bandwidth available. No standardized configuration method is required.

4.2. Fixed Packet Size

The fixed packet size to be used for the tunnel encapsulation packets can be configured manually or can be automatically determined using Path MTU discovery (see [RFC1191] and [RFC8201]). No standardized configuration method is required.

4.3. Congestion Information Configuration

If congestion control mode is to be used, or if the user wishes to receive congestion information on the sender for circuit breaking or other operational notifications in the non-congestion controlled mode, IP-TFS will need to configure the egress tunnel endpoint to send congestion information periodically.

In order to configure the sending interval of periodic congestion information on the egress tunnel endpoint, we utilize the IKEv2 Configuration Payload (CP) [RFC7296]. Implementations MAY also allow for manual (or default) configuration of this interval; however, implementations of IP-TFS MUST support configuration using the IKEv2 exchange described below.

We utilize a new IKEv2 configuration attribute, TFS_INFO_INTERVAL (TBD), to configure the sending interval from the egress endpoint of the tunnel. This value is configured using a CFG_REQUEST payload and is acknowledged by the receiver using a CFG_REPLY payload. This configuration exchange SHOULD be sent during the IKEv2 configuration exchanges occurring as the tunnel is first brought up. The sending interval value MAY also be changed at any time afterwards using a similar CFG_REQUEST/CFG_REPLY payload inside an IKEv2 INFORMATIONAL exchange.

In the absence of a congestion information configuration exchange the sending interval is up to the receiving device configuration.

The sending interval value is given in milliseconds and is 16 bits wide; however, it is not recommended that values below 1/10th of a second are used as this could lead to early exhaustion of the Message ID field used in the IKEv2 INFORMATIONAL exchange to send the congestion information.

{{question: Could we get away with sending the info using the same message ID each time? We have a timestamp that would allow for duplicate detection, and the payload will be authenticated by IKEv2. }}

A sending interval value of 0 disables sending of the congestion information.

5. Packet and Data Formats

5.1. IPSec

5.1.1. Payload Format

                      1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |V|          Reserved           |          BlockOffset            |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |       DataBlocks ...
 +-+-+-+-+-+-+-+-+-+-+-

V:

A 1 bit version field that MUST be set to zero. If received as one the packet MUST be dropped.
Reserved:

A 15 bit field set to 0 and ignored on receipt.
BlockOffset:

A 16 bit unsigned integer counting the number of octets following this 32 bit header before the next data block. It can also point past the end of the containing packet in which case the data entirely belongs to the previous data block. If the offset extends into subsequent packets the subsequent 32 bit IP-TFS headers are not counted by this value.
DataBlocks:

A variable number of octets that constitute the continuation of a previous data block and/or one or more new data blocks.
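
A non-normative encoding sketch of this header follows; V and Reserved occupy the most significant 16 bits and BlockOffset the least significant 16 bits.

   # Non-normative sketch: pack and unpack the 4 octet IP-TFS payload header.
   import struct

   def pack_iptfs_header(block_offset):
       return struct.pack("!HH", 0, block_offset)   # V=0, Reserved=0

   def unpack_iptfs_header(data):
       high, block_offset = struct.unpack("!HH", data[:4])
       if high >> 15:                               # V bit set to one
           raise ValueError("packet MUST be dropped")
       return block_offset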

5.1.2. Data Blocks

                      1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | Type  | IPv4, IPv6 or pad...
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

Type:

A 4 bit field where 0x0 identifies a pad data block, 0x4 indicates an IPv4 data block, and 0x6 indicates an IPv6 data block.

5.1.2.1. IPv4 Data Block

                      1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |  0x4  |  IHL  |  TypeOfService  |         TotalLength           |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | Rest of the inner packet ...
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

These values are the actual values within the encapsulated IPv4 header. In other words, the start of this data block is the start of the encapsulated IP packet.

Type:

A 4 bit value of 0x4 indicating IPv4 (i.e., first nibble of the IPv4 packet).
TotalLength:

The 16 bit unsigned integer length field of the IPv4 inner packet.

5.1.2.2. IPv6 Data Block

                      1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |  0x6  | TrafficClass  |               FlowLabel                 |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |          TotalLength          | Rest of the inner packet ...
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

These values are the actual values within the encapsulated IPv6 header. In other words, the start of this data block is the start of the encapsulated IP packet.

Type:

A 4 bit value of 0x6 indicating IPv6 (i.e., first nibble of the IPv6 packet).
TotalLength:

The 16 bit unsigned integer length field of the inner IPv6 packet.

5.1.2.3. Pad Data Block

                      1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |  0x0  | Padding ...
 +-+-+-+-+-+-+-+-+-+-+-

Type:

A 4 bit value of 0x0 indicating a padding data block.
Padding:

Extends to the end of the encapsulating packet.

5.2. IKEv2

5.2.1. IKEv2 Congestion Information Configuration Attribute

The following defines the configuration attribute structure used in the IKEv2 [RFC7296] configuration exchange to set the congestion information report sending interval.

                      1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |R|       Attribute Type        |             Length              |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |            Interval           |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

R:

1 bit set to 0.
Attribute Type:

15 bit value set to TFS_INFO_INTERVAL (TBD).
Length:

2 octet length set to 2.
Interval:

A 2 octet unsigned integer. The sending interval in milliseconds.
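
A non-normative sketch of the attribute encoding follows. The TFS_INFO_INTERVAL type value is not yet assigned; the constant below is a placeholder only.

   # Non-normative sketch: encode the TFS_INFO_INTERVAL configuration
   # attribute (R bit zero, 15 bit type, 2 octet length, 2 octet value).
   import struct

   TFS_INFO_INTERVAL = 0x7ABC        # placeholder; actual value is TBD

   def pack_info_interval(interval_ms):
       return struct.pack("!HHH", TFS_INFO_INTERVAL & 0x7FFF, 2, interval_ms)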

5.2.2. IKEv2 Congestion Information Notification Data

We utilize a send-only (i.e., no response expected) IKEv2 INFORMATIONAL exchange (37) to transmit the congestion information, using a notification payload of type TFS_CONGEST_INFO (TBD). The Response bit should be set to 0. As no response is expected, the only payload should be the congestion information in the notification payload. The following diagram defines the notification payload data.

                      1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |E|  Reserved   |                  DropCount                      |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                          Timestamp                              |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                          AckSeqStart                            |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                          AckSeqEnd                              |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

E:

A 1 bit value that, if set, indicates that packets with the Congestion Experienced (CE) ECN value set were received and used in calculating the DropCount value.
Reserved:

A 7 bit field set to 0 and ignored on receipt.
DropCount:

A 24 bit unsigned integer count of the drops that occurred between AckSeqStart and AckSeqEnd. If the drops exceed the resolution of the counter then set to the maximum value (i.e., 0xFFFFFF).
AckSeqStart:

A 32 bit unsigned integer containing the first ESP sequence number (as defined in [RFC4303]) of the packet range that this information relates to.
AckSeqEnd:

A 32 bit unsigned integer containing the last ESP sequence number (as defined in [RFC4303]) of the packet range that this information relates to.
Timestamp:

A 32 bit unsigned integer containing the lower 32 bits of a running monotonic millisecond timer of when this notification data was created/sent. This value is used to determine duplicates and drop counts of this information. Implementations should deal with wrapping of this timer value.
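
A non-normative sketch of encoding this notification data:

   # Non-normative sketch: encode the congestion information notification
   # data (E bit, 7 reserved bits, 24 bit DropCount, then three 32 bit words).
   import struct

   def pack_congestion_info(e_bit, drop_count, timestamp_ms,
                            ack_seq_start, ack_seq_end):
       first = (e_bit << 31) | min(drop_count, 0xFFFFFF)
       return struct.pack("!IIII", first, timestamp_ms & 0xFFFFFFFF,
                          ack_seq_start, ack_seq_end)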

6. IANA Considerations

This document requests that a protocol number IPTFS_PROTOCOL be allocated by IANA from the "Assigned Internet Protocol Numbers" registry for identifying the IP-TFS ESP payload format.

   Type: TBD
   Description: IP-TFS ESP payload format.
   Reference: This document

Additionally, this document requests that an attribute value TFS_INFO_INTERVAL (TBD) be allocated by IANA from the "IKEv2 Configuration Payload Attribute Types" registry.

   Type: TBD
   Description: The sending interval of congestion information from the egress tunnel endpoint.
   Reference: This document

Additionally, this document requests that a notify message status type TFS_CONGEST_INFO (TBD) be allocated by IANA from the "IKEv2 Notify Message Types - Status Types" registry.

   Type: TBD
   Description: Congestion information from the egress tunnel endpoint.
   Reference: This document

7. Security Considerations

This document describes a mechanism to add Traffic Flow Confidentiality to IP traffic. Use of this mechanism is expected to increase the security of the traffic being transported. Other than the additional security afforded by using this mechanism, IP-TFS utilizes the security protocols [RFC4303] and [RFC7296] and so their security considerations apply to IP-TFS as well.

As noted previously in Section 2.5.2, for TFC to be fully maintained the encapsulated traffic flow should not affect network congestion in a predictable way; if it would, then use of the non-congestion-controlled mode should be considered instead.

8. References

8.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997.
[RFC4303] Kent, S., "IP Encapsulating Security Payload (ESP)", RFC 4303, DOI 10.17487/RFC4303, December 2005.
[RFC7296] Kaufman, C., Hoffman, P., Nir, Y., Eronen, P. and T. Kivinen, "Internet Key Exchange Protocol Version 2 (IKEv2)", STD 79, RFC 7296, DOI 10.17487/RFC7296, October 2014.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017.

8.2. Informative References

[AppCrypt] Schneier, B., "Applied Cryptography: Protocols, Algorithms, and Source Code in C", November 2017.
[I-D.iab-wire-image] Trammell, B. and M. Kuehlewind, "The Wire Image of a Network Protocol", Internet-Draft draft-iab-wire-image-01, November 2018.
[RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, DOI 10.17487/RFC0791, September 1981.
[RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, DOI 10.17487/RFC1191, November 1990.
[RFC2474] Nichols, K., Blake, S., Baker, F. and D. Black, "Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers", RFC 2474, DOI 10.17487/RFC2474, December 1998.
[RFC2914] Floyd, S., "Congestion Control Principles", BCP 41, RFC 2914, DOI 10.17487/RFC2914, September 2000.
[RFC3168] Ramakrishnan, K., Floyd, S. and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP", RFC 3168, DOI 10.17487/RFC3168, September 2001.
[RFC4301] Kent, S. and K. Seo, "Security Architecture for the Internet Protocol", RFC 4301, DOI 10.17487/RFC4301, December 2005.
[RFC7510] Xu, X., Sheth, N., Yong, L., Callon, R. and D. Black, "Encapsulating MPLS in UDP", RFC 7510, DOI 10.17487/RFC7510, April 2015.
[RFC8084] Fairhurst, G., "Network Transport Circuit Breakers", BCP 208, RFC 8084, DOI 10.17487/RFC8084, March 2017.
[RFC8200] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", STD 86, RFC 8200, DOI 10.17487/RFC8200, July 2017.
[RFC8201] McCann, J., Deering, S., Mogul, J. and R. Hinden, "Path MTU Discovery for IP version 6", STD 87, RFC 8201, DOI 10.17487/RFC8201, July 2017.

Appendix A. Comparisons of IP-TFS

A.1. Comparing Overhead

A.1.1. IP-TFS Overhead

The overhead of IP-TFS is 40 octets per outer packet. The octet overhead attributable to an inner packet is therefore 40 multiplied by the (fractional) number of outer packets required to carry that inner packet. The overhead as a percentage of inner packet size is thus a constant that depends only on the outer payload (MTU) size.

   OH = 40 * Inner Packet Size / Outer Payload Size
   OH % of Inner Packet Size = 100 * OH / Inner Packet Size
   OH % of Inner Packet Size = 4000 / Outer Payload Size
		     Type  IP-TFS  IP-TFS  IP-TFS 
		      MTU     576    1500    9000 
		    PSize     536    1460    8960 
		   -------------------------------
		       40   7.46%   2.74%   0.45% 
		      576   7.46%   2.74%   0.45% 
		     1500   7.46%   2.74%   0.45% 
		     9000   7.46%   2.74%   0.45% 

Figure 4: IP-TFS Overhead as Percentage of Inner Packet Size
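
The percentages in Figure 4 follow directly from the formula above, as the following non-normative check shows.

   # Non-normative check of Figure 4: the overhead percentage depends only
   # on the outer payload size (outer MTU minus 40 octets of overhead).
   for mtu in (576, 1500, 9000):
       payload_size = mtu - 40
       print(mtu, round(4000 / payload_size, 2))   # 7.46, 2.74, 0.45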

A.1.2. ESP with Padding Overhead

The overhead per inner packet for constant-send-rate padded ESP (i.e., traditional IPSec TFC) is 36 octets plus any padding, unless fragmentation is required.

When fragmentation of the inner packet is required to fit in the outer IPsec packet, the overhead is the number of outer packets required to carry the fragmented inner packet, times both the inner IP overhead (20) and the outer packet overhead (36), minus the initial inner IP overhead, plus any required tail padding in the last encapsulating packet. The required tail padding is the number of required packets times the difference of the Outer Payload Size and the IP Overhead, minus the Inner Payload Size. So:

  Inner Payload Size = IP Packet Size - IP Overhead
  Outer Payload Size = MTU - IPSec Overhead

                Inner Payload Size
  NF0 = ----------------------------------
         Outer Payload Size - IP Overhead

  NF = CEILING(NF0)

  OH = NF * (IP Overhead + IPsec Overhead)
       - IP Overhead
       + NF * (Outer Payload Size - IP Overhead)
       - Inner Payload Size

  OH = NF * (IPSec Overhead + Outer Payload Size)
       - (IP Overhead + Inner Payload Size)

  OH = NF * (IPSec Overhead + Outer Payload Size)
       - Inner Packet Size

A.2. Overhead Comparison

The following tables collect the overhead values for some common L3 MTU sizes in order to compare them. The first table is the number of octets of overhead for a given L3 MTU sized packet. The second table is the percentage of overhead in the same MTU sized packet.

        Type  ESP+Pad  ESP+Pad  ESP+Pad  IP-TFS  IP-TFS  IP-TFS 
      L3 MTU      576     1500     9000     576    1500    9000 
       PSize      540     1464     8964     536    1460    8960 
     -----------------------------------------------------------
          40      500     1424     8924     3.0     1.1     0.2 
         128      412     1336     8836     9.6     3.5     0.6 
         256      284     1208     8708    19.1     7.0     1.1 
         536        4      928     8428    40.0    14.7     2.4 
         576      576      888     8388    43.0    15.8     2.6 
        1460      268        4     7504   109.0    40.0     6.5 
        1500      228     1500     7464   111.9    41.1     6.7 
        8960     1408     1540        4   668.7   245.5    40.0 
        9000     1368     1500     9000   671.6   246.6    40.2 

Figure 5: Overhead comparison in octets

       Type  ESP+Pad  ESP+Pad   ESP+Pad  IP-TFS  IP-TFS  IP-TFS 
        MTU      576     1500      9000     576    1500    9000 
      PSize      540     1464      8964     536    1460    8960 
     -----------------------------------------------------------
         40  1250.0%  3560.0%  22310.0%   7.46%   2.74%   0.45% 
        128   321.9%  1043.8%   6903.1%   7.46%   2.74%   0.45% 
        256   110.9%   471.9%   3401.6%   7.46%   2.74%   0.45% 
        536     0.7%   173.1%   1572.4%   7.46%   2.74%   0.45% 
        576   100.0%   154.2%   1456.2%   7.46%   2.74%   0.45% 
       1460    18.4%     0.3%    514.0%   7.46%   2.74%   0.45% 
       1500    15.2%   100.0%    497.6%   7.46%   2.74%   0.45% 
       8960    15.7%    17.2%      0.0%   7.46%   2.74%   0.45% 
       9000    15.2%    16.7%    100.0%   7.46%   2.74%   0.45% 

Figure 6: Overhead as Percentage of Inner Packet Size

A.3. Comparing Available Bandwidth

Another way to compare the two solutions is to look at the amount of available bandwidth each solution provides. The following sections consider and compare the percentage of available bandwidth. For the sake of providing a well understood baseline we will also include normal (unencrypted) Ethernet as well as normal ESP values.

A.3.1. Ethernet

In order to calculate the available bandwidth we first calculate the per-packet overhead in bits. The total overhead of Ethernet is 14+4 octets of header and CRC plus an additional 20 octets of framing (preamble, start, and inter-packet gap), for a total of 38 octets. Additionally, the minimum Ethernet payload is 46 octets. A non-normative sketch showing how the Ethernet and IP-TFS values in the following figures are derived is given after Figure 9.

      Size  E + P  E + P  E + P  IPTFS  IPTFS  IPTFS  Enet   ESP 
       MTU    590   1514   9014    590   1514   9014   any   any 
        OH     74     74     74     78     78     78    38    74 
     ------------------------------------------------------------
        40    614   1538   9038     45     42     40    84   114 
       128    614   1538   9038    146    134    129   166   202 
       256    614   1538   9038    293    269    258   294   330 
       536    614   1538   9038    614    564    540   574   610 
       576   1228   1538   9038    659    606    581   614   650 
      1460   1842   1538   9038   1672   1538   1472  1498  1534 
      1500   1842   3076   9038   1718   1580   1513  1538  1574 
      8960  11052  10766   9038  10263   9438   9038  8998  9034 
      9000  11052  10766  18076  10309   9480   9078  9038  9074 

Figure 7: L2 Octets Per Packet

     Size  E + P  E + P  E + P  IPTFS  IPTFS  IPTFS  Enet   ESP   
      MTU  590    1514   9014   590    1514   9014   any    any   
       OH  74     74     74     78     78     78     38     74    
    --------------------------------------------------------------
       40  2.0M   0.8M   0.1M   27.3M  29.7M  31.0M  14.9M  11.0M 
      128  2.0M   0.8M   0.1M   8.5M   9.3M   9.7M   7.5M   6.2M  
      256  2.0M   0.8M   0.1M   4.3M   4.6M   4.8M   4.3M   3.8M  
      536  2.0M   0.8M   0.1M   2.0M   2.2M   2.3M   2.2M   2.0M  
      576  1.0M   0.8M   0.1M   1.9M   2.1M   2.2M   2.0M   1.9M  
     1460  678K   812K   138K   747K   812K   848K   834K   814K  
     1500  678K   406K   138K   727K   791K   826K   812K   794K  
     8960  113K   116K   138K   121K   132K   138K   138K   138K  
     9000  113K   116K   69K    121K   131K   137K   138K   137K  

Figure 8: Packets Per Second on 10G Ethernet

 Size   E + P   E + P   E + P   IPTFS   IPTFS   IPTFS    Enet     ESP 
          590    1514    9014     590    1514    9014     any     any 
           74      74      74      78      78      78      38      74 
----------------------------------------------------------------------
   40   6.51%   2.60%   0.44%  87.30%  94.93%  99.14%  47.62%  35.09% 
  128  20.85%   8.32%   1.42%  87.30%  94.93%  99.14%  77.11%  63.37% 
  256  41.69%  16.64%   2.83%  87.30%  94.93%  99.14%  87.07%  77.58% 
  536  87.30%  34.85%   5.93%  87.30%  94.93%  99.14%  93.38%  87.87% 
  576  46.91%  37.45%   6.37%  87.30%  94.93%  99.14%  93.81%  88.62% 
 1460  79.26%  94.93%  16.15%  87.30%  94.93%  99.14%  97.46%  95.18% 
 1500  81.43%  48.76%  16.60%  87.30%  94.93%  99.14%  97.53%  95.30% 
 8960  81.07%  83.22%  99.14%  87.30%  94.93%  99.14%  99.58%  99.18% 
 9000  81.43%  83.60%  49.79%  87.30%  94.93%  99.14%  99.58%  99.18% 

Figure 9: Percentage of Bandwidth on 10G Ethernet
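
The following non-normative sketch shows how the native Ethernet and IP-TFS columns of Figures 7 and 8 are derived. It assumes that the MTU values in the column headings include the 14 octet Ethernet header, and it uses the 38 octets of Ethernet overhead, 46 octet minimum payload and 40 octets of IP-TFS overhead described above.

   # Non-normative sketch reproducing the Ethernet and IP-TFS columns of
   # Figures 7 (L2 octets per inner packet) and 8 (packets per second).
   ENET_OH, ENET_MIN, IPTFS_OH = 38, 46, 40
   LINK_BPS = 10_000_000_000

   def enet_l2_octets(inner):
       return max(inner, ENET_MIN) + ENET_OH

   def iptfs_l2_octets(inner, mtu):
       wire = mtu + 24                        # add CRC (4) and framing (20)
       payload = wire - ENET_OH - IPTFS_OH    # inner octets per outer packet
       return inner * wire / payload          # wire octets per inner packet

   def pps(l2_octets):
       return LINK_BPS / (l2_octets * 8)

   # A 1460 octet inner packet with a 1514 octet IP-TFS MTU uses 1538 wire
   # octets (~812K packets/sec); native Ethernet uses 1498 (~834K).
   print(round(pps(iptfs_l2_octets(1460, 1514))),
         round(pps(enet_l2_octets(1460))))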

A sometimes unexpected result of using IP-TFS (or any packet aggregating tunnel) is that, for small to medium sized packets, the available bandwidth is actually greater than native Ethernet. This is due to the reduction in Ethernet framing overhead. This increased bandwidth is paid for with an increase in latency. This latency is the time to send the unrelated octets in the outer tunnel frame. The following table illustrates the latency for some common values on a 10G Ethernet link. The table also includes latency introduced by padding if using ESP with padding.

	             ESP+Pad  ESP+Pad  IP-TFS   IP-TFS  
	             1500     9000     1500     9000    
                                          
	      ------------------------------------------
	         40  1.14 us  7.14 us  1.17 us  7.17 us 
	        128  1.07 us  7.07 us  1.10 us  7.10 us 
	        256  0.97 us  6.97 us  1.00 us  7.00 us 
	        536  0.74 us  6.74 us  0.77 us  6.77 us 
	        576  0.71 us  6.71 us  0.74 us  6.74 us 
	       1460  0.00 us  6.00 us  0.04 us  6.04 us 
	       1500  1.20 us  5.97 us  0.00 us  6.00 us 

Figure 10: Added Latency

Notice that the latency values are very similar between the two solutions; however, whereas IP-TFS provides for constant high bandwidth, in some cases even exceeding native Ethernet, ESP with padding often greatly reduces available bandwidth.

Appendix B. Acknowledgements

We would like to thank Don Fedyk for help in reviewing this work.

Appendix C. Contributors

The following people made significant contributions to this document.

   Lou Berger
   LabN Consulting, L.L.C.

   Email: lberger@labn.net

Author's Address

   Christian Hopps
   LabN Consulting, L.L.C.

   Email: chopps@chopps.org