Network Working Group                                 F. L. Templin, Ed.
Internet-Draft                              Boeing Research & Technology
Updates: RFC2675 (if approved)                           12 January 2023
Intended status: Standards Track
Expires: 16 July 2023

                               IP Parcels
                    draft-templin-intarea-parcels-23

Abstract

IP packets (both IPv4 and IPv6) contain a single unit of upper layer protocol data which becomes the retransmission unit in case of loss. Upper layer protocols including the Transmission Control Protocol (TCP) and transports over the User Datagram Protocol (UDP) prepare data units known as "segments", with traditional arrangements including a single segment per IP packet. This document presents a new construct known as the "IP Parcel" which permits a single packet to carry multiple upper layer protocol segments, essentially creating a "packet-of-packets". IP parcels provide an essential building block for improved performance and efficiency while encouraging larger Maximum Transmission Units (MTUs) in the Internet as discussed in this document.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 16 July 2023.

Copyright Notice

Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.

Templin                   Expires 16 July 2023                  [Page 1]
Internet-Draft                  IP Parcels                  January 2023

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.

Table of Contents

1.  Introduction
2.  Terminology
3.  Background and Motivation
4.  IP Parcel Formation
5.  TCP Parcels
6.  UDP Parcels
7.  Transmission of IP Parcels
8.  Parcel Path Qualification
9.  Integrity
10. RFC2675 Updates
11. IPv4 Jumbograms
12. Implementation Status
13. IANA Considerations
14. Security Considerations
15. Acknowledgements
16. References
  16.1.  Normative References
  16.2.  Informative References
Appendix A.  IP Parcel Futures
Appendix B.  Change Log
Author's Address

1.  Introduction

IP packets (both IPv4 [RFC0791] and IPv6 [RFC8200]) contain a single unit of upper layer protocol data which becomes the retransmission unit in case of loss. Upper layer protocols such as the Transmission Control Protocol (TCP) [RFC9293] and transports over the User Datagram Protocol (UDP) [RFC0768] (including QUIC [RFC9000], LTP [RFC5326] and others) prepare data units known as "segments", with traditional arrangements including a single segment per IP packet. This document presents a new construct known as the "IP Parcel" which permits a single packet to carry multiple upper layer protocol segments. This essentially creates a "packet-of-packets" with the IP layer and full {TCP,UDP} headers appearing only once but with possibly more than one segment included.

Parcels are formed when an upper layer protocol entity identified by the "5-tuple" (source address, destination address, source port, destination port, protocol number) prepares a data buffer beginning with an Integrity Block of up to 256 2-octet Checksums followed by their corresponding upper layer protocol segments that can be broken out into smaller sub-parcels and/or individual packets if necessary. All segments except the final one must be equal in length and no larger than 65535 octets (minus headers), while the final segment must be no larger than the others but may be smaller. The upper layer protocol entity then delivers the buffer, number of segments and non-final segment size to lower layers which append a {TCP,UDP} header and an IP header plus extensions that identify this as a parcel and not an ordinary packet.

Parcels can be forwarded over consecutive parcel-capable links in a path until arriving at a router where the next hop is via a link that does not support parcels, a parcel-capable link with a size restriction, or an ingress middlebox Overlay Multilink Network (OMNI) Interface [I-D.templin-intarea-omni] that spans intermediate Internetworks using adaptation layer encapsulation and fragmentation. In the first case, the router transforms the parcel into individual IP packets then forwards each via the next hop link. In the second case, the router breaks the parcel into smaller sub-parcels and forwards them via the next hop link. In the final case, the OMNI interface breaks the parcel into smaller sub-parcels if necessary then encapsulates each (sub-)parcel in headers suitable for traversing the Internetworks while applying adaptation layer fragmentation if necessary.

These OMNI interface sub-parcels may then be recombined into one or more larger parcels by an egress middlebox OMNI interface which either delivers them locally or forwards them over additional parcel-capable links on the path to the final destination. Reordering and even loss or damage of individual segments in the network is therefore possible, but what matters is that the number of parcels delivered to the final destination should be kept to a minimum for the sake of efficiency and that the loss or receipt of individual segments (and not parcel size) determines the retransmission unit.

The following sections discuss rationale for creating and shipping IP parcels as well as the actual protocol constructs and procedures involved. IP parcels provide an essential building block for improved performance and efficiency while encouraging larger Maximum Transmission Units (MTUs) in the Internet. It is further expected that the parcel concept will drive future innovation in applications, operating systems, network equipment and data links.

2.  Terminology

The Oxford Languages dictionary defines a "parcel" as "a thing or collection of things wrapped in paper in order to be carried or sent by mail". Indeed, there are many examples of parcel delivery services worldwide that provide an essential transit backbone for efficient business and consumer transactions.

In this same spirit, an "IP parcel" is simply a collection of up to 256 upper layer protocol segments wrapped in an efficient package for transmission and delivery (i.e., a "packet-of-packets") while a "singleton IP parcel" is simply a parcel that contains a single segment. IP parcels are distinguished from ordinary packets through the special header constructions discussed in this document.

The IP parcel construct is defined for both IPv4 and IPv6. Where the document refers to "IPv4 header length", it means the total length of the base IPv4 header plus all included options, i.e., as determined by consulting the Internet Header Length (IHL) field. Where the document refers to "IPv6 header length", however, it means only the length of the base IPv6 header (i.e., 40 octets), while the length of any extension headers is referred to separately as the "IPv6 extension header length". Finally, the term "IP header plus extensions" refers generically to an IPv4 header plus all included options or an IPv6 header plus all included extension headers.

Where the document refers to "{TCP, UDP} header length", it means the length of either the TCP header plus options (20 or more octets) or the UDP header (8 octets). It is important to note that only a single IP header and a single full upper layer header appear in each parcel regardless of the number of segments included. This distinction often provides a significant savings in overhead made possible only by IP parcels.

Where the document refers to checksum calculations, it means the standard Internet checksum unless otherwise specified.
The same as for TCP [RFC9293], UDP [RFC0768] and IPv4 [RFC0791], the standard Internet checksum is defined as (sic) "the 16-bit one's complement of the one's complement sum of all (pseudo-)headers plus data, padded with zero octets at the end (if necessary) to make a multiple of two octets". A notional Internet checksum algorithm can be found in [RFC1071], while practical implementations require special attention to byte ordering "endianness" to ensure interoperability between diverse architectures.

The term "Maximum Transmission Unit (MTU)" is widely understood in Internetworking terminology to mean the largest packet size that can traverse a single link ("link MTU") or an entire path ("path MTU") without requiring IP layer fragmentation. Where the document refers to "parcel path MTU", it means the maximum-sized IP parcel that can traverse the forward path to the destination as determined through parcel path qualification (see: Section 8). Note that this size may be larger than the maximum-sized singleton IP (jumbo) packet that can traverse the same path, since intermediate nodes are permitted to break oversized parcels into smaller sub-parcels but cannot do the same for singleton IP packets.

The term "parcel-capable link" refers to any data link medium (physical or virtual) capable of transiting a {TCP,UDP}/IP packet that employs the parcel-specific constructions specified in this document. The link MUST be capable of forwarding parcels with at least one segment of maximum size, therefore each parcel-capable link MUST configure an MTU of at least 64KB and SHOULD configure a larger MTU if possible. Currently, only the OMNI link satisfies these properties, but future link designs should also incorporate parcel support.
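
The standard Internet checksum quoted earlier in this section can be sketched in a few lines of Python. This is only an illustration in the spirit of the notional algorithm in [RFC1071]; the function name internet_checksum is an assumption of this sketch and is not defined by this document. Octet pairs are read explicitly in network byte order, which sidesteps the host "endianness" concern noted above.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's complement of the one's complement sum.

    Hypothetical helper for illustration only; not part of the specification.
    """
    if len(data) % 2:
        data += b"\x00"  # pad with a zero octet to a multiple of two octets
    total = 0
    for i in range(0, len(data), 2):
        # Combine each octet pair as a big-endian (network order) 16-bit word.
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        # Fold any carry out of the low 16 bits back in (end-around carry).
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

A receiver that recomputes the checksum over the same data with the transmitted checksum appended obtains 0 when the data arrived intact.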

The Automatic Extended Route Optimization (AERO) [I-D.templin-intarea-aero] and Overlay Multilink Network Interface (OMNI) [I-D.templin-intarea-omni] technologies provide an ideal architectural framework for transmission of IP parcels. AERO/OMNI are expected to provide an operational environment for IP parcels beginning from the earliest deployment phases and extending to accommodate continuous growth. As more and more parcel-capable links begin to emerge in data centers and other edge networks, AERO/OMNI will provide a transit backbone for true IP parcel Internetworking.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119][RFC8174] when, and only when, they appear in all capitals, as shown here.

3.  Background and Motivation

Studies have shown that applications can improve their performance by sending and receiving larger packets due to reduced numbers of system calls and interrupts as well as larger atomic data copies between kernel and user space. Larger packets also result in reduced numbers of network device interrupts and better network utilization (e.g., due to header overhead reduction) in comparison with smaller packets.

A first study [QUIC] involved performance enhancement of the QUIC protocol [RFC9000] using the Linux Generic Segmentation Offload and Generic Receive Offload (GSO/GRO) facility. GSO/GRO provides a robust (but non-standard) service similar in nature to the IP parcel service described here, and its application has shown significant performance increases due to the increased transfer unit size between the operating system kernel and QUIC applications.
Unlike IP parcels, however, GSO/GRO perform fragmentation and reassembly at the transport layer with the transport protocol segment size limited by the path MTU (typically 1500 octets or smaller in today's Internet).

A second study [I-D.templin-dtn-ltpfrag] showed that GSO/GRO also improves performance for the Licklider Transmission Protocol (LTP) [RFC5326] used for the Delay Tolerant Networking (DTN) Bundle Protocol [RFC9171] for segments larger than the actual path MTU through the use of OMNI interface encapsulation and fragmentation. Historically, the NFS protocol also saw significant performance increases using larger (single-segment) UDP datagrams even when IP fragmentation was invoked, and LTP still follows this profile today. Moreover, LTP shows this (single-segment) performance increase profile extending to the largest possible segment size, which suggests that additional performance gains are possible using (multi-segment) IP parcels that approach or even exceed 65535 octets.

TCP also benefits from larger packet sizes and efforts have investigated TCP performance using jumbograms internally with changes to the Linux GSO/GRO facilities [BIG-TCP]. The idea is to use the jumbo payload option internally and to allow GSO/GRO to use buffer sizes larger than 65535 octets, but with the understanding that links that support jumbos natively are not yet widely available. Hence, IP parcels provide a packaging that can be considered in the near term under current deployment limitations.

A limiting consideration for sending large packets is that they are often lost at links with smaller MTUs, and the resulting Packet Too Big (PTB) message may be lost somewhere in the path back to the original source. This "Path MTU black hole" condition can degrade performance unless robust path probing techniques are used; however, the best case performance always occurs when no packets are lost due to size restrictions.

These considerations therefore motivate a design where transport protocols should employ a maximum segment size no larger than 65535 octets (minus headers), while parcels that carry multiple segments may themselves be significantly larger. Then, even if the network needs to sub-divide the parcels into smaller sub-parcels to forward further toward the final destination, an important performance optimization for the original source, final destination and network path as a whole can be realized.

An analogy: when a consumer orders 50 small items from a major online retailer, the retailer does not ship the order in 50 separate small boxes. Instead, the retailer packs as many of the small items as possible into one or a few larger boxes (i.e., parcels) then places the parcels on a semi-truck or airplane. The parcels may then pass through one or more regional distribution centers where they may be repackaged into different parcel configurations and forwarded further until they are finally delivered to the consumer. But most often, the consumer will only find one or a few parcels at their doorstep and not 50 separate small boxes. This flexible parcel delivery service greatly reduces shipping and handling cost for all including the retailer, regional distribution centers and finally the consumer.

4.  IP Parcel Formation

An upper layer protocol entity (identified by the 5-tuple described above) forms an IP parcel when it prepares a data buffer containing the concatenation of an Integrity Block of up to 256 2-octet Checksums followed by their corresponding upper layer protocol segments (with each TCP non-first segment preceded by a 4-octet Sequence Number). All non-final segments MUST be equal in length while the final segment MUST NOT be larger and MAY be smaller.
Each non-final segment MUST NOT be larger than 65535 octets minus the length of the {TCP,UDP} header, minus the length of the IP header (plus options/extensions), minus 2 octets for the per-segment Checksum. (Note that this also satisfies the case of ingress middlebox OMNI interfaces in the path that would process the headers as upper layer protocol payload during IPv6 encapsulation/fragmentation.)

The upper layer protocol entity then presents the buffer and non-final segment size L to lower layers, noting that the buffer may be larger than 65535 octets if it includes sufficient segments of a large enough size to exceed that value. If the buffer plus headers would together be no larger than the parcel path MTU, lower layers then append a single full {TCP,UDP} header (plus options) followed by a single IP header (plus options/extensions). If the buffer would cause a single parcel to exceed the parcel path MTU, lower layers instead break the buffer up into multiple smaller buffers (each with an integral number of segments) and append separate {TCP,UDP}/IP headers for each as independent parcels. The IP layer then presents each parcel to a network interface attachment to either an ordinary parcel-capable link or an OMNI link that performs adaptation layer encapsulation and fragmentation (see: Section 7).
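
The size limits above reduce to simple arithmetic, sketched here in Python under stated assumptions. The function names are hypothetical, every segment is charged its full length L plus its 2-octet Checksum (so a shorter final segment and the 4-octet TCP first-segment adjustment are ignored, which errs on the conservative side), and the header lengths are supplied by the caller.

```python
MAX_SEGMENTS = 256   # an Integrity Block carries at most 256 Checksums
CHECKSUM_LEN = 2     # octets per Integrity Block Checksum


def max_segment_size(ip_hdr_len: int, transport_hdr_len: int) -> int:
    """Largest permitted non-final segment size L, in octets."""
    return 65535 - transport_hdr_len - ip_hdr_len - CHECKSUM_LEN


def segments_per_parcel(parcel_path_mtu: int, ip_hdr_len: int,
                        transport_hdr_len: int, seg_len: int) -> int:
    """Segments of length seg_len that fit in one parcel (simplified model)."""
    room = parcel_path_mtu - ip_hdr_len - transport_hdr_len
    return max(0, min(MAX_SEGMENTS, room // (seg_len + CHECKSUM_LEN)))


def split_segment_counts(total_segments: int, per_parcel: int) -> list:
    """Segment counts of the independent parcels formed from one buffer."""
    if per_parcel <= 0:
        raise ValueError("no segment fits within the parcel path MTU")
    full, rest = divmod(total_segments, per_parcel)
    return [per_parcel] * full + ([rest] if rest else [])
```

For example, with a 40-octet IPv6 header plus an 8-octet Hop-by-Hop Options header (48 octets in total) and an 8-octet UDP header, a buffer of 100 segments of 1400 octets sent over a 65536-octet parcel path MTU would be split into parcels of 46, 46 and 8 segments under this model.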

The IP layer includes a Jumbo Payload option in the IP header formed as shown in Figure 1:

                                |<------- Option Header ------->|
                                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                |  Option Type  |  Opt Data Len |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Nsegs     |             Jumbo Payload Length              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|<------------------------ Option Data ------------------------>|

                  Figure 1: Jumbo Payload Option Format

For IPv4, the Jumbo Payload option format follows from [RFC2675] except that the IP layer sets option type to '00001011' and option length to '00000110' noting that the length distinguishes this type from its obsoleted use as the "IPv4 Probe MTU" option [RFC1063]. The IP layer also interprets the most significant option data octet as an Nsegs field that encodes a value J between 0 and 255 and sets the Jumbo Payload Length field to a 3-octet value M that encodes the length of the IPv4 header plus the length of the {TCP,UDP} header plus the combined length of the Integrity Block plus all concatenated segments. The IP layer next sets the IPv4 header DF bit to 1, then sets the IPv4 header Total Length field to the non-final segment size L. Note that the IP layer can form true IPv4 jumbograms (as opposed to parcels) by instead setting the IPv4 header Total Length field to 0 and treating the entire 4 octets of the option data as the Jumbo Payload Length (see: Section 11).

For IPv6, the IP layer includes a Jumbo Payload option in an IPv6 Hop-by-Hop Options extension header formatted the same as for IPv4 above, but with option type set to '11000010' and option length set to '00000100'.
The IP layer sets the option data Nsegs field to a 1-octet value J between 0 and 255 and sets the Jumbo Payload Length field to a 3-octet value M that encodes the lengths of all IPv6 extension headers present plus the length of the {TCP,UDP} header plus the combined length of the Integrity Block plus all concatenated segments. The IP layer next sets the IPv6 header Payload Length field to L. Note that the IP layer can form true IPv6 jumbograms (as opposed to parcels) by instead setting the IPv6 header Payload Length field to 0 and treating the entire 4 octets of the option data as the Jumbo Payload Length (see: [RFC2675]).

The IP layer then prepares the rest of the {TCP,UDP}/IP parcel according to the formats shown in Figure 2:

    TCP/IP Parcel Structure            UDP/IP Parcel Structure
+------------------------------+   +------------------------------+
|IP Hdr plus options/extensions|   |IP Hdr plus options/extensions|
~  {Total, Payload} Length = L ~   ~  {Total, Payload} Length = L ~
| Nsegs = J; Jumbo Length = M  |   | Nsegs = J; Jumbo Length = M  |
+------------------------------+   +------------------------------+
|                              |   |                              |
~   TCP header (plus options)  ~   ~         UDP header           ~
| (Includes Sequence Number 0) |   |                              |
+------------------------------+   +------------------------------+
|                              |   |                              |
~       Integrity Block        ~   ~       Integrity Block        ~
|                              |   |                              |
+------------------------------+   +------------------------------+
~                              ~   ~                              ~
~   Segment 0 (L-4 octets)     ~   ~    Segment 0 (L octets)      ~
+------------------------------+   +------------------------------+
~  Sequence Number 1 followed  ~   ~                              ~
~   by Segment 1 (L octets)    ~   ~    Segment 1 (L octets)      ~
+------------------------------+   +------------------------------+
~  Sequence Number 2 followed  ~   ~                              ~
~   by Segment 2 (L octets)    ~   ~    Segment 2 (L octets)      ~
+------------------------------+   +------------------------------+
~             ...              ~   ~             ...              ~
~             ...              ~   ~             ...              ~
+------------------------------+   +------------------------------+
~  Sequence Number J followed  ~   ~                              ~
~   by Segment J (K octets)    ~   ~    Segment J (K octets)      ~
+------------------------------+   +------------------------------+

              Figure 2: {TCP,UDP}/IP Parcel Structure

where the total number of segments is (J + 1), L is the length of each non-final segment which MUST NOT be larger than 65535 octets (minus headers) and K is the length of the final segment which MUST NOT be larger than L. (Note that when J is 0, K and L are one and the same value.)

The {TCP,UDP} header is then immediately followed by an Integrity Block containing (J + 1) 2-octet Checksums concatenated in numerical order as shown in Figure 3:

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Checksum (0)         |          Checksum (1)         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Checksum (2)         |              ...              ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+              ...              ~
~              ...                             ...              ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Checksum (J-1)        |          Checksum (J)         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 3: Integrity Block Format

The Integrity Block is then followed by (J + 1) upper layer protocol segments. For TCP, the TCP header Sequence Number field encodes a 4-octet starting sequence number for the first segment only, while each additional segment is preceded by its own 4-octet Sequence Number field. For this reason, the length of the first segment is only (L-4) octets since the 4-octet TCP header Sequence Number field applies to that segment. (All non-first TCP segments instead begin with their own Sequence Numbers, with the 4-octet length included in L and K.)
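
As a rough illustration of the option layout in Figure 1 and of how the final segment length K follows from M, J and L, the Python sketch below encodes and decodes the six option octets. All function names are assumptions of this sketch. The final-segment calculation covers only the UDP over IPv6 case, where M counts the IPv6 extension headers, the 8-octet UDP header, the Integrity Block and all segments; the TCP case, with its (L-4)-octet first segment, is deliberately left out.

```python
IPV4_JUMBO_TYPE = 0x0B   # option type '00001011'
IPV6_JUMBO_TYPE = 0xC2   # option type '11000010'
UDP_HDR_LEN = 8


def encode_jumbo_option(nsegs: int, jumbo_len: int, ipv6: bool = True) -> bytes:
    """Build the option: type, length, Nsegs (J), 3-octet Jumbo Payload Length (M)."""
    if not 0 <= nsegs <= 255:
        raise ValueError("Nsegs must fit in one octet")
    if not 0 <= jumbo_len < (1 << 24):
        raise ValueError("Jumbo Payload Length must fit in three octets")
    if ipv6:
        header = bytes([IPV6_JUMBO_TYPE, 4])   # '00000100': option data octets only
    else:
        header = bytes([IPV4_JUMBO_TYPE, 6])   # '00000110': whole option length
    return header + bytes([nsegs]) + jumbo_len.to_bytes(3, "big")


def decode_jumbo_option(option: bytes) -> tuple:
    """Return (J, M) from a 6-octet Jumbo Payload option."""
    return option[2], int.from_bytes(option[3:6], "big")


def udp_ipv6_final_segment_len(jumbo_len: int, ext_hdr_len: int,
                               nsegs: int, seg_len: int) -> int:
    """Final segment length K for a UDP/IPv6 parcel (hypothetical helper)."""
    integrity_block_len = 2 * (nsegs + 1)
    return (jumbo_len - ext_hdr_len - UDP_HDR_LEN
            - integrity_block_len - nsegs * seg_len)
```

For instance, a UDP/IPv6 parcel with an 8-octet Hop-by-Hop Options header, J = 2, L = 1000 and a 500-octet final segment has M = 8 + 8 + 6 + 2000 + 500 = 2522.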

Following parcel construction, the Nsegs value unambiguously determines the number of 2-octet Checksums present in the Integrity Block and (together with the Jumbo Payload Length) also determines the number of parcel data segments present. Receivers therefore observe the following:

*  if the Jumbo Payload Length indicates insufficient space for the full Integrity Block plus at least one data segment octet, the receiver discards the parcel.

*  if the length of the payload following the Integrity Block is (J * L) or less, the receiver processes all initial Checksums along with their corresponding segments up to the end of the payload and ignores any remaining Checksums.

*  if the length of the payload following the Integrity Block is greater than ((J + 1) * L) the receiver processes all Checksums with their corresponding segments and ignores any remaining payload beyond the end of the final segment.

Note: per-segment Checksums appear in a contiguous Integrity Block immediately following the {TCP,UDP}/IP headers instead of inline with the parcel segments to greatly increase the probability that they will appear in the contiguous head of a kernel receive buffer even if the parcel was subject to OMNI interface IPv6 fragmentation. This condition may not always hold if the IPv6 fragments also incur IPv4 encapsulation and fragmentation over paths that traverse slow IPv4 links with small MTUs. In that case, performance is bounded by the unavoidable slow link traversal and not the overhead for pulling the fragmented Integrity Block into the contiguous head of a kernel receive buffer.

5.  TCP Parcels

A TCP Parcel is an IP Parcel that includes an IP header plus extensions with a Jumbo Payload option with Nsegs/J encoding one less than the number of segments and Jumbo Payload length encoding a value up to 16MB.
The IP header plus extensions is then followed by a TCP header plus options (20 or more octets), which is then followed by an Integrity Block with (J + 1) consecutive 2-octet Checksums. The Integrity Block is then followed by (J + 1) consecutive segments, where the first segment is (L-4)-octets in length and uses the 4-octet sequence number found in the TCP header, each intermediate segment is L octets in length (including its own 4-octet Sequence Number segment header) and the final segment is K octets in length (including its own 4-octet Sequence Number segment header). The value L is encoded in the IP header {Total, Payload} Length field while J is encoded in the Nsegs octet. The overall length of the parcel as well as final segment length K are determined by the Jumbo Payload length M as discussed above.

The source prepares TCP Parcels in a similar fashion as for TCP jumbograms [RFC2675]. The source calculates a checksum of the TCP header plus IP pseudo-header only (see: Section 9), but with the TCP header Sequence Number field temporarily set to 0 during the calculation since the true sequence number will be included as a pseudo header for the first segment. The source then writes the calculated value in the TCP header Checksum field as-is (i.e., without converting calculated '0' values to 'ffff') and finally rewrites the actual sequence number back into the Sequence Number field. (Nodes that verify the header checksum first perform the same operation of temporarily setting the Sequence Number field to 0 and then resetting to the actual value following checksum verification.)

The source then calculates the checksum of the first segment beginning with the sequence number found in the full TCP header as a 4-octet pseudo-header and extending over the (L-4)-octet length of the segment.
The source next calculates the checksum for each L octet intermediate segment independently over the length of the segment (beginning with its sequence number), then finally calculates the checksum of the K octet final segment (beginning with its sequence number). As the source calculates each per-segment checksum for segments i=(0 thru J), it writes the value into the corresponding Integrity Block Checksum(i) field as-is.

See: Section 9 for further discussion.

6.  UDP Parcels

A UDP Parcel is an IP Parcel that includes an IP header plus extensions with a Jumbo Payload option with Nsegs/J encoding one less than the number of segments and Jumbo Payload length encoding a value up to 16MB. The IP header plus extensions is then followed by an 8-octet UDP header followed by an Integrity Block with (J + 1) consecutive 2-octet Checksums followed by (J + 1) upper layer protocol segments. Each segment must begin with a transport-specific start delimiter (e.g., a segment identifier) included by the transport layer user of UDP. The length of the first segment L is encoded in the IP {Total, Payload} Length field while J is encoded in the Nsegs octet. The overall length of the parcel as well as the final segment length are determined by the Jumbo Payload length M as discussed above.

The source prepares UDP Parcels in a similar fashion as for UDP jumbograms [RFC2675] and MUST therefore set the UDP header length field to 0. The source then calculates the checksum of the UDP header plus IP pseudo-header (see: Section 9) and writes the calculated value in the UDP header Checksum field as-is (i.e., without converting calculated '0' values to 'ffff'). The source then calculates a separate checksum for each segment for which checksums are enabled independently over the length of the segment.
As the source calculates each per-segment checksum for segments i=(0 thru J), it writes the value into the corresponding Integrity Block Checksum(i) field with calculated '0' values written as 'ffff'; for segments with checksums disabled, the source instead writes the value '0'.

See: Section 9 for further discussion.

7.  Transmission of IP Parcels

The IP layer of the source next presents each parcel to a network interface for transmission over a parcel-capable link. For ordinary IP interface attachments to parcel-capable links, the interface simply admits each parcel into the link the same as for any IP packet after which it may then be forwarded by any number of routers over additional consecutive parcel-capable links possibly even traversing the entire forward path to the final destination. If any router in the path does not recognize the parcel construct, it may drop the parcel and return an ICMP "Parameter Problem" message. For this reason, the source should perform parcel path qualification before sending parcels over new paths (see: Section 8).

If the router recognizes parcels but the next hop link in the path does not, or if the parcel would exceed the next hop parcel MTU, the router instead opens the parcel. The router then forwards each enclosed segment in singleton IP packets or in a set of smaller sub-parcels that each contain a subset of the original parcel's segments. The router prepares each singleton IP packet or smaller sub-parcel for transmission to the next hop as follows.

For transmission of singleton IP packets over links that do not support parcels, the router removes the Jumbo Payload option and Integrity Block then copies the {TCP,UDP}/IP headers followed by each segment into (J + 1) separate singleton IP packets. The router then sets IP {Total, Payload} length for each singleton based on its segment length according to the standards [RFC0791] [RFC8200].
For TCP, the router then clears the ACK flag in all except the first singleton and sets the TCP header Sequence Number field based on the segment's sequence number according to [RFC9293] while removing the per-segment Sequence Number field itself and subtracting the checksum of the sequence number only from the segment's checksum found in the Integrity Block. The router then calculates the TCP header checksum only according to the standards (i.e., while including a pseudo-header of the IP header), adds the current segment's remaining checksum value and forwards the packet.

For UDP, the router instead sets the UDP length field according to [RFC0768]. If the current UDP segment checksum found in the Integrity Block is 0, the router then sets the UDP header checksum to 0 and forwards the packet. Otherwise, the router calculates the UDP header checksum according to the standards (i.e., while including a pseudo-header of the IP header). If the current UDP segment checksum is not 'ffff', the router then adds the value to the header checksum; otherwise, the router re-calculates the current UDP segment checksum. If the calculated value is 'ffff' the router adds 'ffff' to the header checksum; otherwise, the re-calculated value was either incorrect or 0 and the router adds nothing. The router then writes the final checksum value into the UDP checksum field (or writes 'ffff' if the final value was 0) and forwards the packet.

For transmission of smaller sub-parcels over parcel-capable links, the router breaks the original parcel into smaller groups of segments that would fit within the parcel path MTU by determining the number of segments of length L that can fit into each sub-parcel under the size constraints.
For example, if the router determines that a sub-parcel can contain 3 segments of length L, it creates sub-parcels with the first containing Integrity Block Checksums/Segments 0-2, the second containing Checksums/Segments 3-5, etc., and with the final containing any remaining Checksums/Segments. The router then appends identical {TCP,UDP}/IP headers plus extensions to each sub-parcel while resetting L and M in each according to the above equations with Nsegs/J set to 2 for each intermediate sub-parcel and with Nsegs/J set to one less than the remaining number of segments for the final sub-parcel. For TCP, the router then sets the TCP Sequence Number field to the value that appears in the first sub-parcel segment while removing the first segment Sequence Number field (if present) and also clears the ACK flag in all sub-parcels except the first. For both TCP and UDP, the router finally resets the {TCP,UDP} header checksum according to ordinary parcel formation procedures (see above) then forwards each (sub-)parcel over the outgoing parcel-capable link.

For transmission of original parcels or sub-parcels over OMNI interfaces, the OMNI Adaptation Layer (OAL) of this First Hop Segment (FHS) OAL source node then forwards the parcel to the next OAL hop which may be either an OAL intermediate node or a Last Hop Segment (LHS) OAL destination. OMNI interface upper layer protocol processing procedures are specified in detail in the remainder of this section, while lower layer encapsulation and fragmentation procedures are specified in detail in [I-D.templin-intarea-omni].

When the OAL source forwards a parcel (whether generated by a local application or generated by another node then forwarded over one or more parcel-capable links), it first assigns a monotonically-incrementing (modulo 127) "Parcel ID" for adaptation layer processing.
If necessary, the OAL source then subdivides the parcel into sub-parcels the same as for the IP layer parcel subdivision procedures discussed above. The OAL source next assigns a different monotonically-incrementing Identification value for each sub-parcel of the same "Parcel ID" then performs adaptation layer encapsulation and fragmentation and finally forwards them to the next OAL hop which forwards further toward the OAL destination as necessary. If sub-dividing an IP parcel under current size constraints would result in more than 64 sub-parcels, each successive group of at most 64 sub-parcels must be transmitted under a new Parcel ID value to avoid Identification value overlap between successive groups.

When the sub-parcels arrive at the OAL destination, the node can optionally retain them along with their Parcel ID and Identifications for a brief time to support re-combining with peer sub-parcels of the same original parcel identified by the adaptation layer 4-tuple consisting of the (source, destination, Identification, Parcel ID) fields. This re-combining entails the concatenation of Checksums/Segments included in sub-parcels with the same Parcel ID and with Identification values within 64 of one another to create a larger sub-parcel possibly even as large as the entire original parcel. Order of concatenation need not be strictly enforced, with the exception that the sub-parcel containing the final segment must occur as a final concatenation and not as an intermediate. The OAL destination then appends a common {TCP,UDP}/IP header plus extensions to each re-combined sub-parcel while resetting J, K, L and M in each according to the above equations. For TCP, if any sub-parcels have the ACK bit set the OAL destination also sets the ACK bit in the re-combined sub-parcel TCP header. The OAL destination then resets the {TCP,UDP}/IP header checksum for each re-combined sub-parcel.
If the OAL destination is also the final destination, it then delivers the sub-parcels to the IP layer which processes them according to the 5-tuple information supplied by the original source. Otherwise, the OAL destination forwards each sub-parcel toward the final destination the same as for an ordinary IP packet as discussed above.

Note: sub-dividing a larger parcel into two or more sub-parcels entails replication of the {TCP,UDP}/IP headers. For TCP, the process entails copying the full TCP/IP header from the original parcel while writing the sequence number of the first sub-parcel segment into the TCP Sequence Number field, clearing the ACK bit if necessary as discussed above and truncating the (new) first segment Sequence Number field. For UDP, the process entails copying the full UDP/IP header from the original parcel into each sub-parcel. For both TCP and UDP, the process finally includes recalculating and resetting Nsegs and Jumbo Payload Length then recalculating the {TCP,UDP} header checksum. Note that the per-segment Integrity Block Checksum values in the sub-parcel segments themselves are still valid and need not be recalculated.

Note: re-combining two or more sub-parcels into a larger parcel entails a reverse process of the above in which the {TCP,UDP}/IP headers of non-first sub-parcels are discarded and their included segments concatenated following those of a first sub-parcel. For TCP, the process includes setting the ACK in the TCP header only if ACK was set in any of the original sub-parcels. For both TCP and UDP, the process finally includes recalculating and resetting Nsegs and Jumbo Payload Length then recalculating the {TCP,UDP} header checksum as discussed above (the per-segment Integrity Block Checksums need not be recalculated). The OAL destination can instead avoid this process if it would negatively impact performance, noting that forwarding individual sub-parcels without delay and without re-combining is always acceptable.
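The sub-dividing and re-combining notes above can be illustrated with a minimal Python sketch. The `SubParcel` class and the `subdivide` and `recombine` functions are hypothetical names introduced only for illustration; they model a parcel as in-memory lists rather than a wire format, and they omit the TCP Sequence Number handling and header checksum recalculation that the text also requires.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SubParcel:
    """Hypothetical in-memory view of a (sub-)parcel; not a wire format."""
    segments: List[bytes]   # upper layer protocol segments
    checksums: List[int]    # per-segment Integrity Block Checksums
    ack: bool = False       # TCP ACK flag (unused for UDP)

    @property
    def nsegs(self) -> int:
        # Nsegs is one less than the number of segments carried.
        return len(self.segments) - 1


def subdivide(parcel: SubParcel, per_sub: int) -> List[SubParcel]:
    """Split a parcel into sub-parcels of at most per_sub segments each."""
    subs = []
    for start in range(0, len(parcel.segments), per_sub):
        subs.append(SubParcel(
            segments=parcel.segments[start:start + per_sub],
            # Per-segment checksums remain valid and are copied unchanged.
            checksums=parcel.checksums[start:start + per_sub],
            # The ACK flag is retained only in the first sub-parcel.
            ack=parcel.ack and start == 0,
        ))
    return subs


def recombine(subs: List[SubParcel]) -> SubParcel:
    """Concatenate sub-parcels (assumed already in order) into one parcel."""
    return SubParcel(
        segments=[seg for sub in subs for seg in sub.segments],
        checksums=[csum for sub in subs for csum in sub.checksums],
        # ACK is set only if it was set in any of the sub-parcels.
        ack=any(sub.ack for sub in subs),
    )
```

With 8 segments and room for 3 per sub-parcel, `subdivide` yields Nsegs values of 2, 2 and 1, matching the example in which intermediate sub-parcels carry Nsegs 2 and the final one carries one less than the remaining segment count.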
Note: while the OAL destination and/or final destination could theoretically re-combine the sub-parcels of multiple different parcels with identical upper layer protocol 5-tuples and intermediate segment lengths, this process could become complicated when the different parcels each have differing final segment lengths. Since this might interfere with any perceived performance advantage, the decision of whether and how to perform inter-parcel concatenation is an implementation matter.

Note: sub-dividing of IP parcels over OMNI links occurs only at an OAL ingress node while re-combining of IP parcels occurs only at an OAL egress node. Therefore, intermediate OAL nodes do not participate in the sub-dividing or recombining processes. For TCP, the ACK bit must be managed as specified above to avoid confusing receivers with gratuitous duplicate ACKs.

8.  Parcel Path Qualification

To determine whether parcels are supported over at least an initial portion of the forward path toward the final destination, the original source can send IP parcels that contain Jumbo Payload options formatted as "Parcel Probes". The purpose of the probe is to elicit a "Parcel Reply" and possibly also an ordinary upper layer protocol probe reply from the final destination. The former is used to establish the parcel path MTU, while the latter determines the (transport layer) maximum segment size.

If the original source receives a positive Parcel Reply, it marks the path as "parcels supported" and ignores any ordinary ICMP [RFC0792][RFC4443] and/or Packet Too Big (PTB) messages [RFC1191][RFC8201] concerning the probe. If the original source instead receives a negative Parcel Reply or no reply, it marks the path as "parcels not supported" and may regard any ordinary ICMP and/or PTB messages concerning the probe (or its contents) as indications of a possible MTU restriction.
The original source can therefore send Parcel Probes in parallel with sending real data as ordinary IP packets/parcels. The Parcel Probes will traverse parcel-capable links joined by routers on the forward path possibly extending all the way to the destination. If the original source receives a Parcel Reply, it can continue using IP parcels.

Parcel Probes include the same Jumbo Payload option type used for ordinary parcels (see: Section 4) but set a different option length and include a 4-octet "(Parcel) Path MTU" field into which conformant routers write the minimum link MTU observed in a similar fashion as described in [RFC1063][I-D.ietf-6man-mtu-option]. Parcel Probes include one or more upper layer protocol segments corresponding to the 5-tuple for the flow, which may also include {TCP,UDP} segment size probes used for packetization layer path MTU discovery [RFC4821] [RFC8899].

The original source sends Parcel Probes unidirectionally in the forward path toward the final destination to elicit a Parcel Reply, since it will often be the case that IP parcels are supported only in the forward path and not in the return path. Parcel Probes may be dropped in the forward path by any node that does not recognize IP parcels, but Parcel Replies must be packaged to avoid filtering since parcels may not be recognized along portions of the return path. For this reason, the Jumbo Payload options included in Parcel Probes are always packaged as IPv4 header options or IPv6 Hop-by-Hop options while Parcel Replies are returned as UDP/IP encapsulated ICMPv6 PTB messages with a "Parcel Reply" Code value (see: [I-D.templin-intarea-omni]).
Original sources send Parcel Probes that include a Jumbo Payload option coded in an alternate format as shown in Figure 4:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Option Type  |  Opt Data Len |    Nonce-1    |     Check     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Nsegs     |             Jumbo Payload Length              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   (Parcel) Path MTU (PMTU)                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-                       Nonce-2                       -+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          Figure 4: Parcel Probe Jumbo Payload Option Format

For IPv4, the original source includes the option as an IPv4 header option with Type set to '00001011' the same as for an ordinary IPv4 parcel (see: Section 4) but with Length set to '00010100' to distinguish this as a probe. The original source sets Nonce-1 to '11111111', sets Check to the same value that will appear in the TTL of the outgoing IPv4 header, sets PMTU to the MTU of the outgoing IPv4 interface and sets Nonce-2 to a 64-bit random number. The source next includes a {TCP,UDP} header followed by an Integrity Block with Checksums followed by their upper layer protocol Segments in the same format as for an ordinary parcel. (The source can also form a NULL probe by setting Protocol to "No Next Header (59)" and including an Integrity Block with Checksum fields set to 0 followed by NULL segments with zero, random and/or other disposable payloads.) The source then sets {Nsegs, Jumbo Payload Length, IPv4 Total Length} and calculates the header and per-segment checksums the same as for an ordinary parcel. The source finally sends the Parcel Probe via the outbound IPv4 interface.

According to [RFC7126], middleboxes (i.e., routers, security gateways, firewalls, etc.)
that do not observe this specification SHOULD drop IP packets that contain option type '00001011' ("IPv4 Probe MTU") but some might instead either attempt to implement [RFC1063] or ignore the option altogether. IPv4 middleboxes that observe this specification instead MUST process the option as a Parcel Probe as specified below.

For IPv6, the original source includes the probe option as an IPv6 Hop-by-Hop option with Type set to '11000010' the same as for an ordinary IPv6 parcel (see: Section 4) but with Length set to '00010010' to distinguish this as a probe. The original source sets Nonce-1 to '11111111', sets Check to the same value that will appear in the Hop Limit of the outgoing IPv6 header, sets PMTU to the MTU of the outgoing IPv6 interface and sets Nonce-2 to a 64-bit random number. The source next includes a {TCP,UDP} header followed by upper layer protocol Segments along with their Integrity Block Checksums in the same format as for an ordinary parcel. (The source can also form a NULL probe by setting Next Header to "No Next Header (59)" and including an Integrity Block with Checksum fields set to 0 followed by NULL segments with zero, random and/or other disposable payloads.) The source then sets {Nsegs, Jumbo Payload Length, IPv6 Payload Length} and calculates the header and per-segment checksums the same as for an ordinary parcel. The source finally sends the Parcel Probe via the outbound IPv6 interface.

According to [RFC2675], middleboxes (i.e., routers, security gateways, firewalls, etc.) that recognize the IPv6 Jumbo Payload option but do not observe this specification SHOULD return an ICMPv6 Parameter Problem message (and presumably also drop the packet) due to the different option length. IPv6 middleboxes that observe this specification instead MUST process the option as a Parcel Probe as specified below.
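The probe option encoding described above can be sketched in Python as follows. This is only an illustration under stated assumptions: `build_probe_option` is a hypothetical helper, and the field widths (1-octet Nsegs followed by a 3-octet Jumbo Payload Length) are inferred from the option lengths given in the text and from the 1-octet Nsegs and 3-octet length described in Section 9, not taken from a normative encoding.

```python
import struct

IPV4_JUMBO_TYPE = 0b00001011   # IPv4 option Type given in the text
IPV6_JUMBO_TYPE = 0b11000010   # IPv6 Hop-by-Hop option Type given in the text


def build_probe_option(ipv6: bool, hops: int, nsegs: int,
                       jumbo_len: int, if_mtu: int, nonce2: bytes) -> bytes:
    """Pack a Parcel Probe Jumbo Payload option (20 octets in total)."""
    if len(nonce2) != 8:
        raise ValueError("Nonce-2 is a 64-bit value")
    if not 0 <= nsegs <= 255 or not 0 <= jumbo_len < (1 << 24):
        raise ValueError("Nsegs or Jumbo Payload Length out of range")
    opt_type = IPV6_JUMBO_TYPE if ipv6 else IPV4_JUMBO_TYPE
    # The IPv4 Length counts the whole option (20 octets), while the IPv6
    # Opt Data Len excludes the Type and Length octets (18 octets).
    opt_len = 0b00010010 if ipv6 else 0b00010100
    return (
        # Nonce-1 is all ones; Check mirrors the outgoing TTL/Hop Limit.
        struct.pack("!BBBB", opt_type, opt_len, 0xFF, hops)
        # Assumed layout: 1-octet Nsegs, 3-octet Jumbo Payload Length.
        + struct.pack("!B", nsegs) + jumbo_len.to_bytes(3, "big")
        # PMTU starts as the MTU of the outgoing interface.
        + struct.pack("!I", if_mtu)
        + nonce2
    )
```

A source would call this with a fresh random Nonce-2 (for example from `os.urandom(8)`) and retain that value so that it can later be matched against the copy echoed in a Parcel Reply.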
When a router that observes this specification receives an IP Parcel Probe it first compares Nonce-1 with '11111111' and Check with the IP header TTL/Hop Limit; if either value differs, the router MUST drop the probe and return a negative Parcel Reply (see below). Otherwise, if the next hop link is non-parcel-capable or configures an MTU that is too small to pass the probe, the router compares the PMTU value with the MTU of the inbound link for the probe and MUST (re)set PMTU to the lower MTU. The router then MUST return a positive Parcel Reply (see below) and convert the probe into an ordinary IP packet(s) the same as was described previously for routers forwarding to non-parcel-capable links. If the next hop IP link configures a sufficiently large MTU to pass the packet(s), the router then MUST forward each packet to the next hop; otherwise, it MUST drop each packet and return a suitable PTB. If the next hop IP link both supports parcels and configures an MTU that is large enough to pass the probe, the router instead compares the probe PMTU value with the MTUs of both the inbound and outbound links for the probe and MUST (re)set PMTU to the lower MTU. The router then MUST reset Check to the same value that will appear in the TTL/Hop Limit of the outgoing IP header, and MUST forward the Parcel Probe to the next hop.

The final destination may therefore receive either one or more ordinary IP packets or intact Parcel Probes. If the final destination receives ordinary IP packets, it performs any necessary integrity checks then delivers the packets to upper layers which will return an upper layer probe response if necessary. If the final destination receives a Parcel Probe, it first compares Nonce-1 with '11111111' and Check with the IP header TTL/Hop Limit; if either value differs, the final destination MUST drop the probe and return a negative Parcel Reply.
Otherwise, the final destination compares the probe PMTU value with the MTU of the inbound link and MUST (re)set PMTU to the lower MTU. The final destination then MUST return a positive Parcel Reply and deliver the probe contents to upper layers the same as for an ordinary IP parcel.

When a router or final destination returns a Parcel Reply, it prepares an ICMPv6 PTB message [RFC4443] with Code set to "Parcel Reply" (see: [I-D.templin-intarea-omni]) and with MTU set to either the PMTU value reported in the Parcel Probe for a positive reply or to the value 0 for a negative reply. The node then writes its own IP address as the Parcel Reply source and writes the source of the Parcel Probe as the Parcel Reply destination (for IPv4 Parcel Probes, the node writes the Parcel Reply addresses as IPv4-Compatible IPv6 addresses [RFC4291]). The node next copies as much of the leading portion of the Parcel Probe (beginning with the IP header) as possible into the "packet in error" field without causing the Parcel Reply to exceed 512 octets in length, then calculates the ICMPv6 header checksum. Since IPv6 packets cannot traverse IPv4 paths, and since middleboxes often filter ICMPv6 messages as they traverse IPv6 paths, the node next wraps the Parcel Reply in UDP/IP headers of the correct IP version with the IP source and destination addresses copied from the Parcel Reply and with UDP port numbers set to the UDP port number for OMNI [I-D.templin-intarea-omni]. In the process, the node either calculates or omits the UDP checksum as appropriate and (for IPv4) clears the DF bit. The node finally sends the prepared Parcel Reply to the original source of the probe.

After sending a Parcel Probe the original source may therefore receive a UDP/IP encapsulated Parcel Reply (see above) and/or an upper layer protocol probe reply.
If the source receives a Parcel Reply, it first verifies the checksum then matches the enclosed PTB message with the original Parcel Probe by examining the Nonce-2 field echoed in the ICMPv6 "packet in error" field containing the leading portion of the probe. If the PTB does not match, the source discards the Parcel Reply; otherwise, it continues to process. If the Parcel Reply MTU is 0, the source marks the path as "parcels not supported"; otherwise, it marks the path as "parcels supported" and also records the MTU value as the parcel path MTU for the forward path to this destination. (Note that this size may be larger than the maximum-sized singleton jumbogram that can traverse the path.)

After receiving a positive Parcel Reply, the original source can continue sending IP parcels addressed to the final destination up to the size of the parcel path MTU; any upper layer protocol probe replies will determine the maximum segment size that can be included in the parcel as an upper layer consideration. After receiving a negative Parcel Reply (or no reply) the original source should refrain from sending parcels until a path change event might occur. In both cases, the original source should periodically re-initiate Parcel Path Qualification for as long as it desires to use the IP parcel service. If at any time performance appears to degrade, the original source should reduce the size of the parcels it sends and/or begin sending singleton IP packets instead.

The original source can also use Parcel Path Qualification to qualify the path for ordinary IP jumbograms simply by setting the IP header length field to 0 and formatting the probe body as an ordinary jumbogram no larger than the maximum size that can be represented in the 32-bit Jumbo Payload Length. (The source can also form a NULL probe by setting Protocol/Next Header to "No Next Header (59)" and including a zero, random and/or other disposable jumbo payload.)
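The parcel (not jumbogram) probe checks and reply handling described earlier in this section can be summarized with a short Python sketch. The function names are hypothetical and exist only for illustration; checksum verification, ICMPv6 parsing and UDP/IP encapsulation are assumed to happen elsewhere and are omitted.

```python
from typing import Optional, Tuple


def probe_is_intact(nonce1: int, check: int, hop_count: int) -> bool:
    """True if Nonce-1 is all ones and Check equals the received TTL/Hop Limit.

    A False result corresponds to dropping the probe and returning a
    negative Parcel Reply.
    """
    return nonce1 == 0xFF and check == hop_count


def update_pmtu(pmtu: int, inbound_mtu: int,
                outbound_mtu: Optional[int] = None) -> int:
    """Lower PMTU to the smallest MTU observed at this node.

    outbound_mtu is None when the probe is not forwarded as a parcel
    (non-parcel-capable next hop, or the node is the final destination).
    """
    mtus = [pmtu, inbound_mtu]
    if outbound_mtu is not None:
        mtus.append(outbound_mtu)
    return min(mtus)


def process_reply(sent_nonce2: bytes, echoed_nonce2: bytes,
                  reply_mtu: int) -> Optional[Tuple[str, Optional[int]]]:
    """Return (path state, parcel path MTU), or None if the reply is discarded."""
    if echoed_nonce2 != sent_nonce2:
        return None                                # not a reply to our probe
    if reply_mtu == 0:
        return ("parcels not supported", None)     # negative Parcel Reply
    return ("parcels supported", reply_mtu)        # positive Parcel Reply
```

For instance, a probe arriving with PMTU 65536 over a 100000-octet inbound link and leaving over an 80000-octet parcel-capable link keeps PMTU at 65536, whereas a 9000-octet inbound link would lower it to 9000.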
Routers that forward the (Jumbogram) Parcel Probe will recognize the 0 IP header length as an indication that the probe is a true Jumbogram (i.e., not a parcel). Each router sets PMTU to the largest Jumbogram size it is capable of forwarding, then forwards the probe to the next hop. If the next hop MTU is too small, the router instead drops the probe and returns a negative (Jumbogram) Parcel Reply. Therefore, only the destination itself may return a positive (Jumbogram) Parcel Reply with the resulting PMTU value. This especially implies the largest possible Jumbogram size may be significantly less than the largest possible parcel size, since forwarding nodes can sub-divide parcels but cannot sub-divide singleton Jumbograms.

Note: when a Parcel Probe forwarded into an ingress OMNI interface is broken into sub-parcels, each sub-parcel includes its own copy of the Parcel Probe header. When multiple sub-parcels of the same Parcel Probe arrive at an egress OMNI interface, the interface optionally re-combines the sub-parcels while retaining the Parcel Probe header. It is therefore possible that a single Parcel Probe with multiple upper layer protocol segments could generate multiple Parcel Replies.

Note: The original source includes Nonce-1 and Check fields as the first 2 octets of Parcel Probes in case a router on the path overwrites the values in a wayward attempt to implement [RFC1063]. Parcel Probe recipients should therefore regard a Nonce-1 value other than '11111111' as an indication that the field was either intentionally or accidentally altered by a previous hop node that does not recognize parcels.

Note: The MTU value returned in a Parcel Reply determines only the maximum IP parcel size for the path, while the maximum upper layer protocol segment size may be significantly smaller.
The upper layer protocol segment size is instead determined separately according to any upper layer protocol probing.

Note: When the OMNI interface of an ingress middlebox receives a Parcel Probe with PMTU larger than 64KB (but no larger than 16MB), it can optionally leave PMTU unchanged (i.e., if it intends to support parcel subdivision internally) or rewrite PMTU to 64KB to disable adaptation layer parcel sub-division. Regardless of the decision taken by the ingress middlebox, correct behavior will be observed by the final destination whether or not the egress middlebox elects to recombine sub-parcels.

Note: If a router or final destination receives a Parcel Probe but does not recognize the parcel construct, it drops the probe without further processing (and may return an ICMP error). The original source will then consider the probe as lost and parcels cannot be used.

9.  Integrity

The {TCP,UDP}/IP header plus each segment of a (multi-segment) IP parcel includes its own integrity check. This means that IP parcels can support stronger and more discrete integrity checks for the same amount of upper layer protocol data compared to an ordinary IP packet or Jumbogram. The {TCP,UDP} header integrity checks can be verified at each hop to ensure that parcels with errored headers are dropped to avoid mis-delivery. The per-segment Integrity Block Checksums are set by the source and verified by the final destination, noting that TCP parcels must honor the sequence number discipline discussed in Section 5.

IP parcels can range in length from as small as only the {TCP,UDP}/IP headers plus a single Integrity Block Checksum with a non-zero length segment to as large as the headers plus (256 * (65535 minus headers)) octets.
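As a worked instance of the upper bound just stated, the following Python fragment evaluates "headers plus (256 * (65535 minus headers))" literally. The 28-octet header size is purely an assumed example (a 20-octet IPv4 header plus an 8-octet UDP header with no options); real parcels carry additional option and Integrity Block octets that this sketch does not model.

```python
def max_parcel_size(header_len: int) -> int:
    """Evaluate headers + (256 * (65535 - headers)) for a given header size."""
    return header_len + 256 * (65535 - header_len)


# Assumed example: 20-octet IPv4 header plus 8-octet UDP header.
print(max_parcel_size(28))   # prints 16769820
```

Under that assumption the bound is 16,769,820 octets, which is slightly less than 16MB and therefore still fits in a 3-octet length field.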
Although 32-bit link layer integrity checks provide sufficient protection for contiguous data blocks up to approximately 9KB, reliance on link-layer integrity checks may be inadvisable for links with significantly larger MTUs and may not be possible at all for links such as tunnels over IPv4 that invoke fragmentation. Moreover, the segment contents of a received parcel may arrive in an incomplete and/or rearranged order with respect to their original packaging.

Lower layer protocol entities calculate and verify the {TCP,UDP}/IP parcel header Checksums at their layer, since an errored header could result in mis-delivery to the wrong upper layer protocol entity. If a lower layer protocol entity on the path detects an incorrect {TCP,UDP}/IP Checksum it discards the entire IP parcel unless the header(s) can somehow be repaired.

To support the parcel header checksum calculation, lower layer protocol entities use modified versions of the {TCP,UDP}/IPv4 "pseudo-header" found in [RFC0768][RFC9293], or the {TCP,UDP}/IPv6 "pseudo-header" found in Section 8.1 of [RFC8200]. Note that while the contents of the two IP protocol version-specific pseudo-headers beyond the address fields are the same, the order in which the contents are arranged differs and must be honored according to the specific IP protocol version as shown in Figure 5. This allows for maximum reuse of widely deployed code while ensuring interoperability.
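The differing field order of the two parcel pseudo-headers can be sketched in Python as shown here. The helper name `parcel_pseudo_header` is hypothetical, and the widths of the zero (1 octet), Next Header (1 octet) and Segment Length (2 octets) fields are assumptions chosen so that they fill one 32-bit word alongside the 1-octet Nsegs and 3-octet Upper-Layer Packet Length that this section describes.

```python
import ipaddress
import struct


def parcel_pseudo_header(src: str, dst: str, next_header: int,
                         segment_len: int, nsegs: int, ulp_len: int) -> bytes:
    """Build a parcel pseudo-header for a pair of IPv4 or IPv6 addresses."""
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    if s.version != d.version:
        raise ValueError("source and destination IP versions differ")
    # Word shared by both versions: 1-octet Nsegs, 3-octet upper-layer length.
    segs_word = struct.pack("!B", nsegs) + ulp_len.to_bytes(3, "big")
    if s.version == 4:
        # IPv4 order: zero | Next Header | Segment Length, then Nsegs word.
        proto_word = struct.pack("!BBH", 0, next_header, segment_len)
        return s.packed + d.packed + proto_word + segs_word
    # IPv6 order: Nsegs word first, then Segment Length | zero | Next Header.
    proto_word = struct.pack("!HBB", segment_len, 0, next_header)
    return s.packed + d.packed + segs_word + proto_word
```

Under these assumptions the IPv4 pseudo-header is 16 octets and the IPv6 pseudo-header is 40 octets; the resulting bytes would be fed into an ordinary Internet checksum routine together with the {TCP,UDP} header.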
                      IPv4 Parcel Pseudo-Header
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      IPv4 Source Address                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    IPv4 Destination Address                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      zero     |  Next Header  |        Segment Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Nsegs     |           Upper-Layer Packet Length           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                      IPv6 Parcel Pseudo-Header
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                      IPv6 Source Address                      ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                    IPv6 Destination Address                   ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Nsegs     |           Upper-Layer Packet Length           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Segment Length         |      zero     |  Next Header  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          Figure 5: {TCP,UDP}/IP Parcel Pseudo-Header Formats

where the following fields appear in both pseudo-headers but with different ordering:

* Source Address is the 4-octet IPv4 or 16-octet IPv6 source address of the prepared parcel.

* Destination Address is the 4-octet IPv4 or 16-octet IPv6 destination address of the prepared parcel.

* zero encodes the constant value '0'.

* Next Header is the IP protocol number corresponding to the upper layer protocol, i.e., TCP or UDP.

* Segment Length is the value that appears in the IPv4 Total Length or IPv6 Payload Length field of the prepared parcel.

* Nsegs is a 1-octet value one less than the number of segments included, and must contain a number between 0 and 255 (this is the same value that appears in the Jumbo Payload Option Nsegs field).
* Upper-Layer Packet Length is the 3-octet length of the {TCP,UDP} header plus data (this value can be derived from the Jumbo Payload Length by subtracting the IPv4 header length for IPv4 or IPv6 extension header length for IPv6).

Upper layer protocol entities use socket options to coordinate per-segment checksum processing with lower layers. If the upper layer sets a SO_NO_CHECK(TX) socket option, the upper layer is responsible for supplying per-segment checksums on transmission and the lower layer forwards the IP parcel to the next hop without further processing; otherwise, the lower layer calculates the per-segment checksums before forwarding. If the upper layer sets a SO_NO_CHECK(RX) socket option, the upper layer is responsible for verifying per-segment checksums on reception and the lower layer delivers each received parcel body to the upper layer without further processing; otherwise, the lower layer verifies the per-segment parcel checksums before delivering.

When the upper layer protocol entity of the source sends a parcel body to lower layers, it prepends an Integrity Block of (J + 1) 2-octet Checksum fields and includes a 4-octet Sequence Number field with each TCP non-first segment. If the SO_NO_CHECK(TX) socket option is set, the upper layer protocol either calculates each segment checksum and writes the value into the corresponding Checksum field (and for UDP with '0' values written as 'ffff') or for UDP writes the value '0' to disable checksums for specific segments. If the SO_NO_CHECK(TX) socket option is clear, the upper layer instead writes the value '0' for UDP to disable or any non-zero value to enable checksums for specific segments.

When the lower layer protocol entity of the source receives the parcel body from upper layers, if the SO_NO_CHECK(TX) socket option is set the lower layer appends the {TCP,UDP}/IP headers and forwards the parcel to the next hop without further processing.
If the SO_NO_CHECK(TX) socket option is clear, the lower layer instead calculates the checksum for each segment with a non-zero value in the corresponding Integrity Block Checksum field and overwrites the calculated value into the Checksum field (and for UDP with '0' values written as 'ffff'). When the lower layer protocol entity of the destination receives a parcel from the source, if the SO_NO_CHECK(RX) socket option is set the lower layer delivers the parcel body to the upper layer without further processing, and the upper layer is responsible for per- Templin Expires 16 July 2023 [Page 24] Internet-Draft IP Parcels January 2023 segment checksum verification. If the SO_NO_CHECK(RX) socket option is clear, the lower layer instead calculates the checksum for each TCP segment (or each UDP segment with a non-zero value in the corresponding Integrity Block Checksum field) and marks a corresponding field for the segment in an ancillary data structure as one of "correct", "incorrect" or "disabled". The lower layer then delivers both the parcel body (beginning with the Integrity block) and ancillary data to the upper layer which can then determine which segments have correct/incorrect/disabled checksums. Note: The Integrity Block itself is intentionally omitted from the IP Parcel {TCP,UDP} header checksum calculation. This permits destinations to accept as many intact segments as possible from received parcels with checksum block bit errors, whereas the entire parcel would need to be discarded if the header checksum also covered the Integrity Block. Note: IP parcels and jumbograms that set Protocol/Next Header to "No Next Header (59)" do not include a {TCP,UDP} Checksum field and therefore do not include a header checksum. Intermediate nodes simply forward these NULL parcels/jumbos without verifying a header checksum, while destination nodes simply discard them after returning a Parcel Reply, if necessary. 10. 
RFC2675 Updates Section 3 of [RFC2675] provides a list of certain conditions to be considered as errors. In particular: error: IPv6 Payload Length != 0 and Jumbo Payload option present error: Jumbo Payload option present and Jumbo Payload Length < 65,536 Implementations that obey this specification ignore these conditions and do not regard them as errors. 11. IPv4 Jumbograms By defining a new IPv4 Jumbo Payload option, this document also implicitly enables a true IPv4 jumbogram service defined as an IPv4 packet with a Jumbo Payload option included and with Total Length set to 0. All other aspects of IPv4 jumbograms are the same as for IPv6 jumbograms [RFC2675]. Templin Expires 16 July 2023 [Page 25] Internet-Draft IP Parcels January 2023 12. Implementation Status Common widely-deployed implementations include services such as TCP Segmentation Offload (TSO) and Generic Segmentation/Receive Offload (GSO/GRO). These services support a robust (but non-standard) service that has been shown to improve performance in many instances. UDP/IPv4 parcels have been implemented in the linux-5.10.67 kernel and ION-DTN ion-open-source-4.1.0 source distributions. Patch distribution found at: "https://github.com/fltemplin/ip-parcels.git". Performance analysis with a single-threaded receiver has shown that including increasing numbers of segments in a single parcel produces measurable performance gains over fewer numbers of segments due to more efficient packaging and reduced system calls/interrupts. For example, sending parcels with 30 2000-octet segments shows a 48% performance increase in comparison with ordinary IP packets with a single 2000-octet segment. Since performance is strongly bounded by single-segment receiver processing time (with larger segments producing dramatic performance increases), it is expected that parcels with increasing numbers of segments will provide a performance multiplier on multi-threaded receivers in parallel processing environments. 13. 
IANA Considerations

The IANA is instructed to change the "MTUP - MTU Probe" entry in the 'ip option numbers' registry to the "JUMBO - IPv4 Jumbo Payload" option. The Copy and Class fields must both be set to 0, and the Number and Value fields must both be set to '11'. The reference must be changed to this document [RFCXXXX].

14. Security Considerations

In the control plane, original sources match the Nonce values in received Parcel Replies with their corresponding Parcel Probes. If the values match, the reply is likely an authentic response to a probe. In environments where stronger authentication is necessary, nodes that send Parcel Replies can apply the message authentication services specified for AERO/OMNI.

In the data plane, multi-layer security solutions may be needed to ensure confidentiality, integrity and availability. Since parcels are defined only for TCP and UDP, IP layer securing services such as IPsec-AH/ESP [RFC4301] cannot be applied directly to parcels, although they can certainly be used at lower layers such as for transmission of parcels over VPNs and/or OMNI link secured spanning trees. Since the IP layer does not manipulate segments exchanged with upper layers, parcels do not interfere with transport- or higher-layer security services such as (D)TLS/SSL [RFC8446] which may provide greater flexibility in some environments.

Further security considerations related to IP parcels are found in the AERO/OMNI specifications.

15. Acknowledgements

This work was inspired by ongoing AERO/OMNI/DTN investigations. The concepts were further motivated through discussions on the IETF intarea and 6man lists as well as with Boeing colleagues.

A considerable body of work over recent years has produced useful "segmentation offload" facilities available in widely-deployed implementations.

16. References

16.1.
Normative References

[RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, DOI 10.17487/RFC0768, August 1980.

[RFC0791] Postel, J. and RFC Publisher, "Internet Protocol", STD 5, RFC 791, DOI 10.17487/RFC0791, September 1981.

[RFC0792] Postel, J. and RFC Publisher, "Internet Control Message Protocol", STD 5, RFC 792, DOI 10.17487/RFC0792, September 1981.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997.

[RFC2675] Borman, D., Deering, S., Hinden, R., and RFC Publisher, "IPv6 Jumbograms", RFC 2675, DOI 10.17487/RFC2675, August 1999.

[RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing Architecture", RFC 4291, DOI 10.17487/RFC4291, February 2006.

[RFC4443] Conta, A., Deering, S., Gupta, M., Ed., and RFC Publisher, "Internet Control Message Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6) Specification", STD 89, RFC 4443, DOI 10.17487/RFC4443, March 2006.

[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017.

[RFC8200] Deering, S., Hinden, R., and RFC Publisher, "Internet Protocol, Version 6 (IPv6) Specification", STD 86, RFC 8200, DOI 10.17487/RFC8200, July 2017.

[RFC9293] Eddy, W., Ed., "Transmission Control Protocol (TCP)", STD 7, RFC 9293, DOI 10.17487/RFC9293, August 2022.

16.2. Informative References

[BIG-TCP] Dumazet, E., "BIG TCP, Netdev 0x15 Conference (virtual), https://netdevconf.info/0x15/session.html?BIG-TCP", 31 August 2021.

[I-D.ietf-6man-mtu-option] Hinden, R. M. and G. Fairhurst, "IPv6 Minimum Path MTU Hop-by-Hop Option", Work in Progress, Internet-Draft, draft-ietf-6man-mtu-option-15, 10 May 2022.

[I-D.templin-dtn-ltpfrag] Templin, F., "LTP Fragmentation", Work in Progress, Internet-Draft, draft-templin-dtn-ltpfrag-09, 25 July 2022.
[I-D.templin-intarea-aero] Templin, F., "Automatic Extended Route Optimization (AERO)", Work in Progress, Internet-Draft, draft-templin-intarea-aero-10, 8 December 2022.

[I-D.templin-intarea-omni] Templin, F. L., "Transmission of IP Packets over Overlay Multilink Network (OMNI) Interfaces", Work in Progress, Internet-Draft, draft-templin-intarea-omni-11, 9 January 2023.

[QUIC] Ghedini, A., "Accelerating UDP packet transmission for QUIC, https://blog.cloudflare.com/accelerating-udp-packet-transmission-for-quic/", 8 January 2020.

[RFC1063] Mogul, J., Kent, C., Partridge, C., and K. McCloghrie, "IP MTU discovery options", RFC 1063, DOI 10.17487/RFC1063, July 1988.

[RFC1071] Braden, R., Borman, D., and C. Partridge, "Computing the Internet checksum", RFC 1071, DOI 10.17487/RFC1071, September 1988.

[RFC1191] Mogul, J., Deering, S., and RFC Publisher, "Path MTU discovery", RFC 1191, DOI 10.17487/RFC1191, November 1990.

[RFC4301] Kent, S., Seo, K., and RFC Publisher, "Security Architecture for the Internet Protocol", RFC 4301, DOI 10.17487/RFC4301, December 2005.

[RFC4821] Mathis, M., Heffner, J., and RFC Publisher, "Packetization Layer Path MTU Discovery", RFC 4821, DOI 10.17487/RFC4821, March 2007.

[RFC5326] Ramadas, M., Burleigh, S., and S. Farrell, "Licklider Transmission Protocol - Specification", RFC 5326, DOI 10.17487/RFC5326, September 2008.

[RFC7126] Gont, F., Atkinson, R., and C. Pignataro, "Recommendations on Filtering of IPv4 Packets Containing IPv4 Options", BCP 186, RFC 7126, DOI 10.17487/RFC7126, February 2014.

[RFC8201] McCann, J., Deering, S., Mogul, J., Hinden, R., Ed., and RFC Publisher, "Path MTU Discovery for IP version 6", STD 87, RFC 8201, DOI 10.17487/RFC8201, July 2017.

[RFC8446] Rescorla, E. and RFC Publisher, "The Transport Layer Security (TLS) Protocol Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018.
[RFC8899] Fairhurst, G., Jones, T., Tüxen, M., Rüngeler, I., Völker, T., and RFC Publisher, "Packetization Layer Path MTU Discovery for Datagram Transports", RFC 8899, DOI 10.17487/RFC8899, September 2020.

[RFC9000] Iyengar, J., Ed., Thomson, M., Ed., and RFC Publisher, "QUIC: A UDP-Based Multiplexed and Secure Transport", RFC 9000, DOI 10.17487/RFC9000, May 2021.

[RFC9171] Burleigh, S., Fall, K., and E. Birrane, III, "Bundle Protocol Version 7", RFC 9171, DOI 10.17487/RFC9171, January 2022.

Appendix A. IP Parcel Futures

Historic and current-day data links configure Maximum Transmission Units (MTUs) that are far smaller than the desired state for the future of IP parcel transmission. When the first Ethernet data links were deployed many decades ago, their 1500 octet MTU set a strong precedent that was widely adopted. This same size now appears as the predominant MTU limit for most paths in the Internet today, although modern link deployments with larger MTUs up to 9KB have begun to emerge.

In the late 1980s, the Fiber Distributed Data Interface (FDDI) standard defined a new link type with MTU slightly larger than 4500 octets. The goal of the larger MTU was to increase performance by a factor of 10 over the ubiquitous 10Mbps and 1500-octet MTU Ethernet technologies of the time. Many factors including a failure to harmonize MTU diversity and an Ethernet performance increase to 100Mbps led to poor FDDI market reception. In the next decade, the 1990s saw new initiatives including ATM/AAL5 (9KB MTU) and HiPPI (64KB MTU) which offered high-speed data link alternatives with larger MTUs, but again the inability to harmonize diversity derailed their momentum.
By the end of the 1990s and leading into the 2000s, the emergence of the 1Gbps, 10Gbps and even faster Ethernet performance levels seen today has obscured the fact that the modern Internet of the 21st century is still operating with 20th century MTUs! To bridge this gap, increased OMNI interface deployment in the near future will provide a virtual link type that can pass IP parcels over paths that traverse traditional data links with small MTUs.

Performance analysis has proven that (single-threaded) receive-side performance is bounded by upper layer protocol segment size, with performance increasing in direct proportion to segment size. Experiments have also shown measurable (single-threaded) performance increases from including larger numbers of segments per parcel, with steady gains as the number of segments grows. However, parallel receive-side processing will provide performance multiplier benefits since the multiple segments that arrive in a single parcel can be processed simultaneously instead of serially.

In addition to the clear near-term benefits, IP parcels will increase performance to new levels as future parcel-capable links with very large MTUs begin to emerge. These links will provide MTUs far in excess of 64KB to as large as 16MB. With such large MTUs, the traditional CRC-32 (or even CRC-64) error checking with errored packet discard discipline will no longer apply for large parcels. Instead, parcels larger than a link-specific threshold will include Forward Error Correction (FEC) codes so that errored parcels can be repaired at the receiver's data link layer then delivered to upper layers rather than being discarded and triggering retransmission of large amounts of data.
Even if the FEC repairs are incomplete or imperfect, all parcels can still be delivered to upper layers where the individual segment checksums will detect and discard any damaged data not repaired by lower layers.

These new "super-links" will appear mostly in the network edges (e.g., high-performance data centers) and not as often in the middle of the Internet. (However, some space-domain links that extend over enormous distances may also benefit.) For this reason, a common use case will include parcel-capable super-links in the edge networks of both parties of an end-to-end session with an OMNI link connecting the two over wide area Internetworks. Medium- to moderately large-sized IP parcels over OMNI links will already provide considerable performance benefits for wide-area end-to-end communications while truly large IP parcels over super-links can provide boundless increases for localized bulk transfers in edge networks or for deep space long haul transmissions. The ability to grow and adapt without practical bound enabled by IP parcels will inevitably encourage new data link development leading to future innovations in new markets that will revolutionize the Internet.

Until these new links begin to emerge, however, parcels will already provide a tremendous benefit to end systems by allowing applications to send and receive segment buffers larger than 65535 octets in a single system call. By expanding the current operating system call data copy limit from its current 16-bit length to a 32-bit length, applications will be able to send and receive maximum-length parcel buffers even if lower layers need to break them into multiple parcels to fit within the underlying interface MTU. For applications such as the Delay Tolerant Networking (DTN) Bundle Protocol [RFC9171], this will allow applications to send and receive entire large upper layer protocol constructs (such as DTN bundles) in a single system call.
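As a non-normative illustration of the last point, the sketch below shows one way a lower layer might break a single large application buffer into multiple parcels that each fit the interface MTU. The function name split_into_parcels and the parameters segment_size, header_overhead and max_segments are hypothetical and are not defined by this document. In particular, the default cap of 64 segments per parcel and the way header overhead is charged against the MTU are assumptions made only for illustration.

```python
from typing import List


def split_into_parcels(buffer: bytes, segment_size: int, mtu: int,
                       header_overhead: int = 0,
                       max_segments: int = 64) -> List[List[bytes]]:
    """Split one application buffer into parcels (lists of segments).

    All parameters are illustrative assumptions:
      segment_size     upper layer segment size in octets
      mtu              underlying interface MTU in octets
      header_overhead  octets reserved per parcel for headers (assumed)
      max_segments     cap on segments per parcel (assumed default)
    """
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")

    # Number of whole segments that fit in one parcel under the MTU.
    per_parcel = min(max_segments, (mtu - header_overhead) // segment_size)
    if per_parcel < 1:
        raise ValueError("MTU too small for even one segment")

    # Cut the buffer into segments; only the final one may be short.
    segments = [buffer[i:i + segment_size]
                for i in range(0, len(buffer), segment_size)]

    # Group consecutive segments into parcels of at most per_parcel each.
    return [segments[i:i + per_parcel]
            for i in range(0, len(segments), per_parcel)]
```

Under these assumptions, a 100000-octet buffer with 2000-octet segments, a 9000-octet MTU and 100 octets of header overhead yields 13 parcels of at most 4 segments each, so the application could hand the whole buffer to the lower layer in one call even though it exceeds 65535 octets.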
Appendix B. Change Log

<< RFC Editor - remove prior to publication >>

Changes from earlier versions:

* Submit for Intarea Standards Track RFC Publication.

Author's Address

Fred L. Templin (editor)
Boeing Research & Technology
P.O. Box 3707
Seattle, WA 98124
United States of America

Email: fltemplin@acm.org