IP Parcels

Boeing Research & Technology
P.O. Box 3707
Seattle, WA 98124
USA
fltemplin@acm.org

Internet-Draft

IP packets (both IPv4 and IPv6) contain a single unit of upper layer
protocol data which becomes the retransmission unit in case of loss.
Upper layer protocols including the Transmission Control Protocol (TCP)
and transports over the User Datagram Protocol (UDP) prepare data units
known as "segments", with traditional arrangements including a single
segment per IP packet. This document presents a new construct known as
the "IP Parcel" which permits a single packet to carry multiple upper
layer protocol segments, essentially creating a "packet-of-packets". IP
parcels provide an essential building block for improved performance,
efficiency and integrity while encouraging larger Maximum Transmission
Units (MTUs) in the Internet.

IP packets (both IPv4 and IPv6) contain a single unit of upper layer
protocol data which becomes the retransmission unit in case of loss.
Upper layer protocols such as the Transmission Control Protocol (TCP)
and transports over the User Datagram Protocol (UDP) (including QUIC,
LTP and others) prepare data units known as
"segments", with traditional arrangements including a single segment per
IP packet. This document presents a new construct known as the "IP
Parcel" which permits a single packet to carry multiple upper layer
protocol segments. This essentially creates a "packet-of-packets" with
the IP layer and full {TCP,UDP} headers appearing only once but with
possibly more than one segment included.

Parcels are formed when an upper layer protocol entity identified by
the "5-tuple" (source address, destination address, source port,
destination port, protocol number) prepares a data buffer beginning with
an Integrity Block of up to 256 2-octet Checksums followed by their
corresponding upper layer protocol segments that can be broken out
into smaller sub-parcels and/or individual packets if necessary. All
segments except the final one must be equal in length and no larger
than 65535 octets (minus headers), while the final segment must not
be larger than the others but may be smaller. The upper layer protocol
entity then delivers the buffer, number of segments and non-final
segment size to lower layers which append a {TCP,UDP} header and
an IP header plus extensions that identify this as a parcel and
not an ordinary packet.

Parcels can be forwarded over consecutive parcel-capable links in
a path until arriving at a router where the next hop is via a link
that does not support parcels, a parcel-capable link with a size
restriction, or an ingress middlebox Overlay Multilink Network
(OMNI) Interface that
spans intermediate Internetworks using adaptation layer encapsulation
and fragmentation. In the first case, the router transforms the parcel
into individual IP packets and forwards them via the next hop link.
In the second case, the router breaks the parcel into smaller
sub-parcels and forwards them via the next hop link. In the final
case, the OMNI interface breaks the parcel into smaller sub-parcels
if necessary then encapsulates each (sub-)parcel in headers suitable
for traversing the Internetworks while applying adaptation layer
fragmentation if necessary.

These OMNI interface sub-parcels may then be recombined into one
or more larger parcels by an egress middlebox OMNI interface which
either delivers them locally or forwards them over additional
parcel-capable links on the path to the final destination.
Reordering and even loss or damage of individual segments within the
network is therefore possible, but what matters is that the number
of parcels delivered to the final destination should be kept to a
minimum for the sake of efficiency and that the loss or receipt of
individual segments (and not parcel size) determines the retransmission
unit.

The following sections discuss rationale for creating and shipping
IP parcels as well as the actual protocol constructs and procedures
involved. IP parcels provide an essential building block for improved
performance, efficiency and integrity while encouraging larger Maximum
Transmission Units (MTUs) in the Internet. It is further expected that
the parcel concept will drive future innovation in applications,
operating systems, network equipment and data links.

The Oxford Languages dictionary defines a "parcel" as "a thing or
collection of things wrapped in paper in order to be carried or sent by
mail". Indeed, there are many examples of parcel delivery services
worldwide that provide an essential transit backbone for efficient
business and consumer transactions.

In this same spirit, an "IP parcel" is simply a collection of up to
256 upper layer protocol segments wrapped in an efficient package for
transmission and delivery (i.e., a "packet-of-packets") while a
"singleton IP parcel" is simply a parcel that contains a single segment.
IP parcels are distinguished from ordinary packets through the special
header constructions discussed in this document.

The IP parcel construct is defined for both IPv4 and IPv6. Where the
document refers to "IPv4 header length", it means the total length of
the base IPv4 header plus all included options, i.e., as determined by
consulting the Internet Header Length (IHL) field. Where the document
refers to "IPv6 header length", however, it means only the length of the
base IPv6 header (i.e., 40 octets), while the length of any extension
headers is referred to separately as the "IPv6 extension header length".
Finally, the term "IP header plus extensions" refers generically to an
IPv4 header plus all included options or an IPv6 header plus all
included extension headers.

Where the document refers to "{TCP, UDP} header length", it means
the length of either the TCP header plus options (20 or more octets)
or the UDP header (8 octets). It is important to note that only a
single IP header and a single full upper layer header appear in each
parcel regardless of the number of segments included. This distinction
often provides significant savings in overhead made possible only
by IP parcels.

Where the document refers to checksum calculations, it means the
standard Internet checksum unless otherwise specified. As for
TCP, UDP and IPv4, the standard Internet checksum is defined as
(sic) "the 16-bit one's complement of the one's complement sum of all
(pseudo-)headers plus data, padded with zero octets at the end (if
necessary) to make a multiple of two octets". A notional Internet
checksum algorithm can be found in , while
practical implementations require special attention to byte ordering
("endianness") to ensure interoperability between diverse architectures.

The Automatic Extended Route Optimization (AERO) and Overlay Multilink Network
Interface (OMNI) technologies
provide an ideal architectural framework for transmission of IP parcels.
AERO/OMNI are expected to provide an operational environment for IP
parcels beginning from the earliest deployment phases and extending to
accommodate continuous growth. As more and more parcel-capable links
begin to emerge, e.g., in data centers, edge networks, space-domain
links, etc., AERO/OMNI will provide a transit backbone for true IP
parcel Internetworking.

The term "parcel-capable link" refers to any data link medium
(physical or virtual) capable of transiting a {TCP,UDP}/IP packet
that employs the parcel-specific constructions specified in this
document. The link MUST be capable of forwarding all parcels
with segment lengths no larger than the minimum of the link Maximum
Transmission Unit (MTU) and 65535, while applying parcel subdivision
if necessary (see: ). Currently, only the OMNI
link satisfies these properties, but new and existing link types
are encouraged to incorporate parcel support in their designs.

The term "Maximum Transmission Unit (MTU)" is widely understood
in Internetworking terminology to mean the largest packet size that
can traverse a single link ("link MTU") or an entire path ("path MTU")
without requiring IP layer fragmentation. If the MTU value returned
during parcel path qualification is larger than 65535, it determines
only the maximum parcel size that a router can forward over a
restricting link without performing subdivision; otherwise, it
determines both the maximum parcel size and maximum size for a
single parcel segment (see: ).

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP 14
when, and only when,
they appear in all capitals, as shown here.

Studies have shown that applications can improve their performance by
sending and receiving larger packets due to reduced numbers of system
calls and interrupts as well as larger atomic data copies between kernel
and user space. Larger packets also result in reduced numbers of network
device interrupts and better network utilization (e.g., due to header
overhead reduction) in comparison with smaller packets.

A first study involved performance enhancement
of the QUIC protocol using the Linux Generic
Segmentation/Receive Offload (GSO/GRO) facility. GSO/GRO provides a robust
(but non-standard) service similar in nature to the IP parcel
service described here, and its application has shown significant
performance increases due to the increased transfer unit size between
the operating system kernel and QUIC applications. Unlike IP parcels,
however, GSO/GRO perform fragmentation and reassembly at the transport
layer with the transport protocol segment size limited by the path MTU
(typically 1500 octets or smaller in today's Internet).

A second study showed that
GSO/GRO also improves performance for the Licklider Transmission
Protocol (LTP) used for the Delay Tolerant
Networking (DTN) Bundle Protocol for segments
larger than the actual path MTU through the use of OMNI interface
encapsulation and fragmentation. Historically, the NFS protocol also
saw significant performance increases using larger (single-segment)
UDP datagrams even when IP fragmentation is invoked, and LTP still
follows this profile today. Moreover, LTP shows this (single-segment)
performance increase profile extending to the largest possible segment
size which suggests that additional performance gains are possible
using (multi-segment) IP parcels that approach or even exceed
65535 octets.

TCP also benefits from larger packet sizes and efforts have
investigated TCP performance using jumbograms internally with changes to
the Linux GSO/GRO facilities. The idea is to
use the jumbo payload option internally and to allow GSO/GRO to use
buffer sizes larger than 65535 octets, but with the understanding that
links that support jumbos natively are not yet widely available. Hence,
IP parcels provide a packaging that can be considered in the near term
under current deployment limitations.

A limiting consideration for sending large packets is that they are
often lost at links with smaller MTUs, and the resulting Packet Too Big
(PTB) message may be lost somewhere in the path back to the original
source. This "Path MTU black hole" condition can degrade performance
unless robust path probing techniques are used; however, the best case
performance always occurs when no packets are lost due to size
restrictions.

These considerations therefore motivate a design where transport
protocols should employ a maximum segment size no larger than 65535
octets (minus headers), while parcels that carry multiple segments may
themselves be significantly larger. Then, even if the network needs to
sub-divide the parcels into smaller sub-parcels to forward further
toward the final destination, an important performance optimization for
the original source, final destination and network path as a whole can
be realized. This performance advantage is accompanied by an overall
improvement in integrity and efficiency.

An analogy: when a consumer orders 50 small items from a major online
retailer, the retailer does not ship the order in 50 separate small
boxes. Instead, the retailer packs as many of the small items as
possible into one or a few larger boxes (i.e., parcels) then places the
parcels on a semi-truck or airplane. The parcels may then pass through
one or more regional distribution centers where they may be repackaged
into different parcel configurations and forwarded further until they
are finally delivered to the consumer. But most often, the consumer will
only find one or a few parcels at their doorstep and not 50 separate
small boxes. This flexible parcel delivery service greatly reduces
shipping and handling cost for all including the retailer, regional
distribution centers and finally the consumer.

An upper layer protocol entity (identified by the 5-tuple as
above) forms an IP parcel when it prepares a data buffer containing the
concatenation of an Integrity Block of up to 256 2-octet Checksums
followed by their corresponding upper layer protocol segments (with each
TCP non-first segment preceded by a 4-octet Sequence Number). All non-final
segments MUST be equal in length while the final segment MUST NOT be
larger and MAY be smaller. Each non-final segment MUST NOT be larger
than the minimum of 65535 octets and the path MTU, minus the length of
the {TCP,UDP} header, minus the length of the IP header (plus
options/extensions), minus 2 octets for the per-segment Checksum.
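
The size limit in the preceding sentence can be written out as a small calculation. The following is a minimal sketch, not part of the specification: the function name `max_segment_size` and its parameter names are hypothetical, and the caller is assumed to supply header lengths that already include all options and extensions.

```python
def max_segment_size(path_mtu, ip_hdr_len, transport_hdr_len):
    """Upper bound on the non-final segment size, per the rule above.

    path_mtu          -- path MTU in octets (may exceed 65535)
    ip_hdr_len        -- IP header length including options/extensions
    transport_hdr_len -- {TCP,UDP} header length including TCP options
    """
    checksum_len = 2  # one 2-octet Integrity Block Checksum per segment
    limit = min(65535, path_mtu) - transport_hdr_len - ip_hdr_len - checksum_len
    if limit <= 0:
        raise ValueError("path MTU too small to carry any segment data")
    return limit
```

For example, with a 9000-octet path MTU, a 40-octet IPv6 header plus an 8-octet Hop-by-Hop Options header (48 octets in total) and an 8-octet UDP header, `max_segment_size(9000, 48, 8)` gives 8942 octets.
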
(Note that this also satisfies the case of ingress middlebox OMNI
interfaces in the path that would process the headers as upper layer
protocol payload during IPv6 encapsulation/fragmentation.)

The upper layer protocol entity then presents the buffer and
non-final segment size L to lower layers (noting that the buffer may be
larger than 65535 octets if it includes sufficient segments of a large
enough size to exceed that value). If the buffer plus headers would
together be no larger than the first hop link MTU or path MTU, the
lower layer then appends a single full {TCP,UDP} header (plus
options) followed by a single IP header (plus options/extensions).
If the buffer would cause a single parcel to exceed the link/path
MTU, the lower layer instead breaks the buffer up into multiple smaller
buffers (each with an integral number of segments) and appends separate
{TCP,UDP}/IP headers for each as separate parcels. (Note: if the first
hop link MTU is larger than the path MTU, the lower layer can either
limit the size of the parcels it sends to the path MTU or send parcels
as large as the link MTU with the understanding that a router in the
path may need to do extra work to subdivide it into smaller parcels.)

The IP layer then presents each parcel to a network interface attachment
to either an ordinary parcel-capable link or an OMNI link that performs
adaptation layer encapsulation and fragmentation (see: ).
The IP layer includes a Jumbo Payload option in the IP header formed as shown
in :

For IPv4, the Jumbo Payload option format follows from except that
the IP layer sets option type to
'00001011' and option length to '00000110' noting that the length
distinguishes this type from its obsoleted use as the "IPv4 Probe MTU"
option. The IP layer also interprets the most
significant option data octet as an Nsegs field that encodes a value J
between 0 and 255 and sets the Jumbo Payload Length field to a
3-octet value M that encodes the length of the IPv4 header
plus the length of the {TCP,UDP} header plus the combined
length of the Integrity Block plus all concatenated segments. The IP
layer next sets the IPv4 header DF bit to 1 and Total Length field to
the non-final segment size L. Note that the IP layer can form true
IPv4 jumbograms (as opposed to parcels) by instead setting the IPv4
header Total Length field to 0 and treating the entire 4 octets of
the option data as the Jumbo Payload Length (see: ).

For IPv6, the IP layer includes a Jumbo Payload option in an IPv6
Hop-by-Hop Options extension header formatted the same as for IPv4
above, but with option type set to '11000010' and option length set to
'00000100'. The IP layer sets the option data Nsegs field to a 1-octet
value J between 0 and 255 and sets the Jumbo Payload Length field to a
3-octet value M that encodes the lengths of all IPv6 extension headers
present plus the length of the {TCP,UDP} header plus the
combined length of the Integrity Block plus all concatenated segments.
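
The option encoding described so far for IPv4 and IPv6 can be illustrated with a short sketch. This is an informal example under stated assumptions, not a normative encoder: the function names are hypothetical, the fields are assumed to be laid out as type, length, Nsegs, then the 3-octet Jumbo Payload Length, and the length is assumed to be in network (big-endian) byte order.

```python
# Option type and length values as given in the text.
IPV4_OPT_TYPE = 0b00001011  # IPv4 option type
IPV4_OPT_LEN = 0b00000110   # 6: type + length + 4 octets of option data
IPV6_OPT_TYPE = 0b11000010  # IPv6 Jumbo Payload option type
IPV6_OPT_LEN = 0b00000100   # 4: octets of option data only


def encode_parcel_option(ip_version, nsegs_j, jumbo_len_m):
    """Return the 6 octets: type, length, Nsegs (J), 3-octet length (M)."""
    if not 0 <= nsegs_j <= 255:
        raise ValueError("Nsegs (J) must fit in one octet")
    if not 0 <= jumbo_len_m < 2**24:
        raise ValueError("Jumbo Payload Length (M) must fit in three octets")
    if ip_version == 4:
        opt_type, opt_len = IPV4_OPT_TYPE, IPV4_OPT_LEN
    elif ip_version == 6:
        opt_type, opt_len = IPV6_OPT_TYPE, IPV6_OPT_LEN
    else:
        raise ValueError("ip_version must be 4 or 6")
    # Assumption: M is carried most significant octet first.
    return bytes([opt_type, opt_len, nsegs_j]) + jumbo_len_m.to_bytes(3, "big")


def decode_parcel_option(option):
    """Return (type, length, J, M) from a 6-octet parcel option."""
    if len(option) != 6:
        raise ValueError("expected exactly 6 octets")
    return option[0], option[1], option[2], int.from_bytes(option[3:6], "big")
```

Because J is one less than the number of segments, a parcel with four segments would be encoded with `nsegs_j=3`.
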
The IP layer next sets the IPv6 header Payload Length field to L. Note
that the IP layer can form true IPv6 jumbograms (as opposed to parcels)
by instead setting the IPv6 header Payload Length field to 0 and treating
the entire 4 octets of the option data as the Jumbo Payload Length
(see: ).

The IP layer then prepares the rest of the {TCP,UDP}/IP parcel
according to the formats shown in :

where the total number of segments is (J + 1), L
is the length of each non-final segment which MUST NOT be larger than
65535 octets (minus headers) and K is the length of the final segment
which MUST NOT be larger than L. (Note that when J is 0, K and L
are one and the same value.)

The {TCP,UDP} header is then immediately followed by an Integrity
Block containing (J + 1) 2-octet Checksums concatenated in numerical
order as shown in :
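
As an informal illustration (not the draft's own figure), the Integrity Block and segment layout can be sketched in Python. The names `internet_checksum` and `build_parcel_body` are hypothetical. The sketch covers the generic layout only: it writes each checksum as-is and deliberately leaves out the TCP Sequence Number handling and the UDP conversion of calculated '0' values that the later TCP and UDP rules require.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of data."""
    if len(data) % 2:
        data += b"\x00"  # pad with a zero octet to a multiple of two
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def build_parcel_body(segments):
    """Return (J, L, body), where body is the Integrity Block followed
    by the concatenated segments."""
    if not 1 <= len(segments) <= 256:
        raise ValueError("a parcel carries between 1 and 256 segments")
    seg_len = len(segments[0])  # L, the non-final segment size
    if any(len(s) != seg_len for s in segments[:-1]):
        raise ValueError("all non-final segments must be equal in length")
    if len(segments[-1]) > seg_len:
        raise ValueError("the final segment must not be larger than L")
    # (J + 1) 2-octet Checksums, in the same order as the segments.
    integrity_block = b"".join(
        internet_checksum(s).to_bytes(2, "big") for s in segments
    )
    return len(segments) - 1, seg_len, integrity_block + b"".join(segments)
```

Calling `build_parcel_body([b"abcd", b"efgh", b"ij"])` yields J = 2 and L = 4, with a 6-octet Integrity Block followed by the 10 octets of segment data.
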
The Integrity Block is then followed by (J + 1) upper layer
protocol segments. For TCP, the TCP header Sequence Number field
encodes a 4-octet starting sequence number for the first segment
only, while each additional segment is preceded by its own 4-octet
Sequence Number field. For this reason, the length of the first
segment is only (L-4) octets since the 4-octet TCP header
Sequence Number field applies to that segment. (All non-first
TCP segments instead begin with their own Sequence Numbers,
with the 4-octet length included in L and K.)

Following parcel construction, the Nsegs value unambiguously
determines the number of 2-octet Checksums present in the Integrity
Block and (together with the Jumbo Payload Length) also determines
the number of parcel data segments present. Receivers therefore
observe the following:

*  if the Jumbo Payload Length indicates insufficient space for
   the full Integrity Block plus at least one data segment octet,
   the receiver discards the parcel.

*  if the length of the payload following the Integrity Block
   is (J * L) or less, the receiver processes all initial
   Checksums along with their corresponding segments up to the
   end of the payload and ignores any remaining Checksums.

*  if the length of the payload following the Integrity Block is
   greater than ((J + 1) * L) the receiver processes all Checksums
   with their corresponding segments and ignores any remaining
   payload beyond the end of the final segment.

Note: per-segment Checksums appear in a contiguous Integrity Block
immediately following the {TCP,UDP}/IP headers instead of inline with
the parcel segments to greatly increase the probability that they will
appear in the contiguous head of a kernel receive buffer even if the
parcel was subject to OMNI interface IPv6 fragmentation. This condition
may not always hold if the IPv6 fragments also incur IPv4 encapsulation
and fragmentation over paths that traverse slow IPv4 links with small
MTUs. In that case, performance is bounded by the unavoidable slow link
traversal and not the overhead for pulling a fragmented Integrity
Block into the contiguous head of a kernel receive buffer.

A TCP Parcel is an IP Parcel that includes an IP header plus
extensions with a Jumbo Payload option with Nsegs/J encoding one
less than the number of segments and Jumbo Payload length encoding
a value up to (2**24 - 1). The IP header plus extensions is then
followed by a TCP header plus options (20 or more octets), which is
then followed by an Integrity Block with (J + 1) consecutive 2-octet
Checksums. The Integrity Block is then followed by (J + 1) consecutive
segments, where the first segment is (L-4) octets in length and uses
the 4-octet sequence number found in the TCP header, each intermediate
segment is L octets in length (including its own 4-octet Sequence
Number segment header) and the final segment is K octets in length
(including its own 4-octet Sequence Number segment header). The value
L is encoded in the IP header {Total, Payload} Length field while J
is encoded in the Nsegs octet. The overall length of the parcel as
well as final segment length K are determined by the Jumbo Payload
length M as discussed above.

The source prepares TCP Parcels in a similar fashion as for TCP
jumbograms. The source calculates a checksum
of the TCP header plus IP pseudo-header only (see: ),
but with the TCP header Sequence Number field temporarily set to 0
during the calculation since the true sequence number will be included
as a pseudo header for the first segment. The source then writes the
calculated value in the TCP header Checksum field as-is (i.e., without
converting calculated '0' values to 'ffff') and finally re-writes the
actual sequence number back into the Sequence Number field. (Nodes
that verify the header checksum first perform the same operation of
temporarily setting the Sequence Number field to 0 and then resetting
to the actual value following checksum verification.)

The source then calculates the checksum of the first segment
beginning with the sequence number found in the full TCP header as a
4-octet pseudo-header then extending over the remaining (L-4) octet
length of the segment. The source next calculates the checksum for
each L octet intermediate segment independently over the length of
the segment (beginning with its sequence number), then finally
calculates the checksum of the K octet final segment (beginning
with its sequence number). As the source calculates each per-segment
checksum for segment(i) (for i = 0 thru J), it writes the value into
the corresponding Integrity Block Checksum(i) field as-is.

See: for further discussion.

A UDP Parcel is an IP Parcel that includes an IP header plus
extensions with a Jumbo Payload option with Nsegs/J encoding one
less than the number of segments and Jumbo Payload length encoding
a value up to (2**24 - 1). The IP header plus extensions is then
followed by an 8-octet UDP header followed by an Integrity Block
with (J + 1) consecutive 2-octet Checksums followed by (J + 1)
upper layer protocol segments. Each segment must begin with a
transport-specific start delimiter (e.g., a segment identifier)
included by the transport layer user of UDP. The length of the first
segment L is encoded in the IP {Total, Payload} Length field while
J is encoded in the Nsegs octet. The overall length of the parcel
as well as the final segment length are determined by the Jumbo
Payload length M as discussed above.

The source prepares UDP Parcels in a similar fashion as for UDP
jumbograms and therefore MUST set the UDP header
length field to 0. The source then calculates the checksum of the UDP
header plus IP pseudo-header (see: ) and
writes the calculated value in the UDP header Checksum field as-is
(i.e., without converting calculated '0' values to 'ffff').

The source then calculates a separate checksum for each segment
for which checksums are enabled independently over the length of the
segment. As the source calculates each per-segment checksum for
segment(i) (for i = 0 thru J), it writes the value into the
corresponding Integrity Block Checksum(i) field with calculated
'0' values converted to 'ffff'; for segments with checksums
disabled, the source instead writes the value '0'.

See: for further discussion.

The IP layer of the source next presents each parcel to a network
interface for transmission. For ordinary IP interface attachments to
parcel-capable links, the interface simply admits each parcel into
the link the same as for any IP packet after which it may then be
forwarded by any number of routers over additional consecutive
parcel-capable links possibly even traversing the entire forward
path to the final destination. If any router in the path does not
recognize the parcel construct, it may drop the parcel and return
an ICMP "Parameter Problem" message. For this reason, the source
should perform parcel path qualification before sending parcels
over new paths (see: ).

If the router recognizes parcels but the next hop link in the path
does not, or if the parcel would exceed the next hop link MTU, the
router instead opens the parcel. The router then forwards each enclosed
segment in singleton IP packets or in a set of smaller sub-parcels that
each contain a subset of the original parcel's segments. If the next
hop link is via an OMNI interface, the router instead proceeds according
to OMNI Adaptation Layer procedures. These considerations are discussed
in detail in the following sections.

For transmission of singleton IP packets over links that do not
support parcels, the router removes the Jumbo Payload option, sets
aside and remembers the Integrity Block (and for TCP also truncates
the Sequence Number headers of each non-first segment while remembering
their values) then copies the {TCP,UDP}/IP headers followed by segment(i)
(for i = 0 thru J) into (J + 1) individual singleton IP packets. The router then
sets IP {Total, Payload} length for each singleton(i) based on the length
of segment(i) according to the standards. The router then processes each singleton(i)
according to upper layer protocol conventions.

For TCP, the router clears the SYN/ACK flags in all except
singleton(0) then calculates the checksum for singleton(0)'s TCP/IP
headers only according to but with the Sequence
Number value saved and the field set to 0. The router then adds Integrity
Block Checksum(0) to the calculated value and writes the sum into
singleton(0)'s TCP checksum field. The router then resets the Sequence
Number field to singleton(0)'s saved sequence number and forwards
singleton(0) to the next hop. The router next calculates the checksum
of singleton(1)'s TCP/IP headers with the Sequence Number field set
to 0 and saves the calculated value. In each non-first singleton(i)
(for i = 1 thru J), the router then adds the saved value to Integrity
Block Checksum(i), writes the sum into singleton(i)'s TCP checksum
field, sets the TCP Sequence Number field to singleton(i)'s sequence
number then forwards singleton(i) to the next hop.

For UDP, the router sets the UDP length field according to in each
singleton(i) (for i = 0 thru J). If Integrity
Block Checksum(i) is 0, the router then sets the UDP header checksum
to 0, forwards singleton(i) to the next hop and continues to the next.
The router next calculates the checksum over singleton(i)'s UDP/IP
headers only according to . If Integrity Block
Checksum(i) is not 'ffff', the router then adds the value to the header
checksum; otherwise, the router re-calculates the checksum for segment(i).
If the re-calculated segment(i) checksum value is 'ffff' or '0' the
router adds the value to the header checksum; otherwise, it continues
to singleton(i+1) (see note). The router finally writes the total
checksum value into the UDP checksum field for singleton(i) (or
writes 'ffff' if the total was '0') and forwards singleton(i) to
the next hop.

Note: for each UDP singleton(i), the router must recalculate
the segment checksum if Checksum(i) is 'ffff', since that value is
shared by both '0' and 'ffff' calculated checksums. If recalculating
the checksum produces an incorrect value, segment(i) is considered
errored and the router can optionally drop or forward (noting that
the forwarded singleton would simply be discarded as an error by
the final destination).

Note: for each {TCP,UDP} singleton(i), the router can optionally
re-calculate and verify the segment checksum unconditionally before
forwarding, but this may introduce undesirable extra delay and
processing overhead.

For transmission of smaller sub-parcels over parcel-capable links,
the router breaks the original parcel into smaller groups of segments
that would fit within the path MTU by determining the number of
segments of length L that can fit into each sub-parcel under the size
constraints. For example, if the router determines that a sub-parcel
can contain 3 segments of length L, it creates sub-parcels with the
first containing Integrity Block Checksums/Segments 0-2, the second
containing Checksums/Segments 3-5, etc., and with the final containing
any remaining Checksums/Segments.

The router then appends identical {TCP,UDP}/IP headers (including the
Jumbo Payload option and any other extensions) to each sub-parcel while
resetting L and M in each according to the above equations, with Nsegs/J
set to one less than the number of segments in each non-final sub-parcel
(2 in the example above) and with Nsegs/J set to one
less than the remaining number of segments for the final sub-parcel. For
TCP, the router then sets the TCP Sequence Number field to the value
that appears in the first sub-parcel segment while removing the first
segment Sequence Number field (if present) and also clears the SYN/ACK
flags in all sub-parcels except the first. For both TCP and UDP, the
router finally resets the {TCP,UDP} header checksum according to
ordinary parcel formation procedures (see above) then forwards each
(sub-)parcel over the outgoing parcel-capable link.

Note: sub-dividing a larger parcel into two or more sub-parcels
entails replication of the {TCP,UDP}/IP headers (including the
Jumbo Payload option and any other extensions). For TCP, the process
entails copying the full TCP/IP header from the original parcel while
writing the sequence number of the first sub-parcel segment into the TCP
Sequence Number field, clearing the SYN/ACK flags if necessary as discussed
above and truncating the (new) first segment Sequence Number field. For
UDP, the process entails copying the full UDP/IP header from the original
parcel into each sub-parcel. For both TCP and UDP, the process finally
includes recalculating and resetting Nsegs and Jumbo Payload Length then
recalculating the {TCP,UDP} header checksum. Note that the per-segment
Integrity Block Checksum values in the sub-parcel segments themselves
are still valid and need not be recalculated.

For transmission of original parcels or sub-parcels over OMNI
interfaces, the OMNI Adaptation Layer (OAL) of this First Hop Segment (FHS)
OAL source node then forwards the parcel to the next OAL hop which may be
either an OAL intermediate node or a Last Hop Segment (LHS) OAL destination.
OMNI interface upper layer protocol processing procedures are specified in
detail in the remainder of this section, while lower layer encapsulation
and fragmentation procedures are specified in detail in .

When the OAL source forwards a parcel (whether generated by a local
application or generated by another node then forwarded over one or more
parcel-capable links), it first assigns a monotonically-incrementing
(modulo 127) "Parcel ID" for adaptation layer processing. If necessary,
the OAL source then subdivides the parcel into sub-parcels the same
as for the IP layer parcel subdivision procedures discussed above. The
OAL source next assigns a different monotonically-incrementing
Identification value for each sub-parcel of the same "Parcel ID" then
performs adaptation layer encapsulation and fragmentation and finally
forwards them to the next OAL hop which forwards further toward the
OAL destination as necessary. If sub-dividing an IP parcel under
current size constraints would result in more than 64 sub-parcels,
each successive group of at most 64 sub-parcels must be transmitted
under a new Parcel ID value to avoid Identification value overlaps
between successive groups.

When the sub-parcels arrive at the OAL destination, the node can
optionally retain them along with their Parcel ID and Identifications
for a brief time to support re-combining with peer sub-parcels of the
same original parcel identified by the adaptation layer 4-tuple
consisting of the (source, destination, Identification, Parcel ID)
fields. This re-combining entails the concatenation of Checksums/Segments
included in sub-parcels with the same Parcel ID and with Identification
values within 64 of one another to create a larger sub-parcel possibly
even as large as the entire original parcel. Order of concatenation need
not be strictly enforced, with the exception that the sub-parcel containing
the final segment must occur as a final concatenation and not as an
intermediate. The OAL destination then appends a common {TCP,UDP}/IP
header plus extensions to each re-combined sub-parcel while resetting J,
K, L and M in each according to the above equations. For TCP, if any
sub-parcels have the SYN/ACK flags set the OAL destination also sets
the SYN/ACK flags in the re-combined sub-parcel TCP header. The OAL
destination then resets the {TCP,UDP}/IP header checksum for each
re-combined sub-parcel. If the OAL destination is also the final
destination, it then delivers the sub-parcels to the IP layer which
processes them according to the 5-tuple information supplied by the
original source. Otherwise, the OAL destination forwards each sub-parcel
toward the final destination the same as for an ordinary IP packet as
discussed above.

Note: re-combining two or more sub-parcels into a larger parcel
entails a process in which the {TCP,UDP}/IP headers of non-first
sub-parcels are discarded and their included segments concatenated
following those of a first sub-parcel. For TCP, the process includes
setting the SYN/ACK flags in the TCP header only if SYN/ACK were set
in any of the original sub-parcels. For both TCP and UDP, the process
finally includes recalculating and resetting Nsegs and Jumbo Payload
Length then recalculating the {TCP,UDP} header checksum as discussed
above (the per-segment Integrity Block Checksums need not be
recalculated). The OAL destination can instead avoid this process
if it would negatively impact performance, noting that forwarding
individual sub-parcels without delay and without re-combining is
always acceptable.

Note: while the OAL destination and/or final destination could
theoretically re-combine the sub-parcels of multiple different parcels
with identical upper layer protocol 5-tuples and intermediate segment
lengths, this process could become complicated when the different
parcels each have differing final segment lengths. Since this might
interfere with any perceived performance advantage, the decision of
whether and how to perform inter-parcel concatenation is an
implementation matter.

Note: sub-dividing of IP parcels over OMNI links occurs only at an
OAL ingress node while re-combining of IP parcels occurs only at an OAL
egress node. Therefore, intermediate OAL nodes do not participate in
the sub-dividing or re-combining processes. For TCP, the SYN/ACK flags
must be managed as specified above to avoid confusing receivers with
gratuitous duplicate ACKs.

To determine whether parcels are supported over at least an initial
portion of the forward path toward the final destination, the original
source can send IP parcels that contain Jumbo Payload options formatted
as "Parcel Probes". The purpose of the probe is to elicit a "Parcel
Reply" and possibly also an upper layer protocol-specific probe reply
from the final destination.

If the original source receives a positive Parcel Reply, it marks
the path as "parcels supported" and ignores any ordinary ICMP and/or
Packet Too Big (PTB) messages concerning the probe. If the original
source instead receives a negative Parcel Reply
or no reply, it marks the path as "parcels not supported" and may regard
any ordinary ICMP and/or PTB messages concerning the probe (or its
contents) as indications of a possible MTU restriction.

The original source can therefore send Parcel Probes in parallel with
sending real data as ordinary IP packets/parcels. The parcel probes will
traverse parcel-capable links joined by routers on the forward path
possibly extending all the way to the destination. If the original
source receives a Parcel Reply, it can continue using IP parcels.

Parcel Probes include the same Jumbo Payload option type used for
ordinary parcels (see: ) but set a different
option length and include a 4-octet "Path MTU" field into which
conformant routers write the minimum link MTU observed in a
similar fashion as described in . Parcel Probes include one or more
upper layer protocol segments corresponding to the 5-tuple for the
flow, which may also include {TCP,UDP} segment size probes used for
packetization layer path MTU discovery .

The original source sends Parcel Probes unidirectionally in the
forward path toward the final destination to elicit a Parcel Reply,
since it will often be the case that IP parcels are supported only
in the forward path and not in the return path. Parcel Probes may be
dropped in the forward path by any node that does not recognize IP
parcels, but Parcel Replies must be packaged to avoid filtering since
parcels may not be recognized along portions of the return path. For
this reason, the Jumbo Payload options included in Parcel Probes are
always packaged as IPv4 header options or IPv6 Hop-by-Hop options while
Parcel Replies are returned as UDP/IP encapsulated ICMPv6 PTB messages
with a "Parcel Reply" Code value (see: ).

Original sources send Parcel Probes that include a Jumbo Payload
option coded in an alternate format as shown in :
For IPv4, the original source includes the option as an IPv4
header option with Type set to '00001011' the same as for an ordinary IPv4
parcel (see: ) but with Length set to '00010100'
to distinguish this as a probe. The original source sets Nonce-1 to
'11111111', sets Check to the same value that will appear in the TTL
of the outgoing IPv4 header, sets PMTU
to the MTU of the outgoing IPv4 interface and sets Nonce-2 to a 64-bit
random number. The source next includes a {TCP,UDP} header followed by
an Integrity Block with Checksums followed by their upper layer protocol
Segments in the same format as for an ordinary parcel. (The source can
also form a NULL probe by setting Protocol to "No Next Header (59)" and
including an Integrity Block with Checksum fields set to '0' followed by
NULL segments with null, random and/or other disposable payloads.) The
source then sets {Nsegs, Jumbo Payload Length, IPv4 Total Length} and
calculates the header and per-segment checksums the same as for an
ordinary parcel. The source finally sends the Parcel Probe via the
outbound IPv4 interface. According to ,
middleboxes (i.e., routers, security gateways, firewalls, etc.) that
do not observe this specification SHOULD drop IP packets that contain
option type '00001011' ("IPv4 Probe MTU") but some might instead either
attempt to implement or ignore the option
altogether. IPv4 middleboxes that observe this specification instead
MUST process the option as a Parcel Probe as specified below.

For IPv6, the original source includes the probe option as an IPv6
Hop-by-Hop option with Type set to '11000010' the same as for an
ordinary IPv6 parcel (see: ) but with Length set
to '00010010' to distinguish this as a probe. The original source sets
Nonce-1 to '11111111', sets Check to the same value that will appear in
the Hop Limit of the outgoing IPv6 header, sets PMTU to the MTU of the
outgoing IPv6 interface and sets Nonce-2 to a 64-bit random number. The
source next includes a {TCP,UDP} header followed by upper
layer protocol Segments along with their Integrity Block Checksums
in the same format as for an ordinary parcel. (The source can also
form a NULL probe by setting Next Header to "No Next Header (59)"
and including an Integrity Block with Checksum fields set to '0'
followed by NULL segments with zero, random and/or other disposable payloads.)
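
The probe option values that the source sets for IPv4 and IPv6 can be gathered in a short, non-normative sketch. The class name `ParcelProbeOption` and the function `new_probe_option` are hypothetical names introduced here for illustration only. The sketch is an in-memory view of the fields named in the text; it does not claim any on-the-wire field ordering and omits Nsegs and Jumbo Payload Length.

```python
import secrets
from dataclasses import dataclass

# Option Type/Length values quoted in the text, written as binary literals.
IPV4_OPTION_TYPE = 0b00001011    # IPv4 header option Type
IPV4_PROBE_LENGTH = 0b00010100   # Length that marks an IPv4 option as a probe
IPV6_OPTION_TYPE = 0b11000010    # IPv6 Hop-by-Hop option Type
IPV6_PROBE_LENGTH = 0b00010010   # Length that marks an IPv6 option as a probe


@dataclass
class ParcelProbeOption:
    """Hypothetical in-memory view of the probe fields (not a wire layout)."""
    opt_type: int
    opt_length: int
    nonce1: int   # set to '11111111' by the original source
    check: int    # copy of the outgoing TTL (IPv4) or Hop Limit (IPv6)
    pmtu: int     # initialized to the MTU of the outgoing interface
    nonce2: int   # 64-bit random number


def new_probe_option(ip_version: int, ttl_or_hop_limit: int,
                     interface_mtu: int) -> ParcelProbeOption:
    """Fill in the probe fields the way the original source is described to."""
    if ip_version == 4:
        opt_type, opt_length = IPV4_OPTION_TYPE, IPV4_PROBE_LENGTH
    elif ip_version == 6:
        opt_type, opt_length = IPV6_OPTION_TYPE, IPV6_PROBE_LENGTH
    else:
        raise ValueError("ip_version must be 4 or 6")
    return ParcelProbeOption(
        opt_type=opt_type,
        opt_length=opt_length,
        nonce1=0xFF,
        check=ttl_or_hop_limit,
        pmtu=interface_mtu,
        nonce2=secrets.randbits(64),
    )
```

For example, `new_probe_option(6, 64, 9000)` yields an option with Type 0xC2, Length 18, Check 64 and PMTU 9000.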
The source then sets {Nsegs, Jumbo Payload Length, IPv6 Payload Length}
and calculates the header and per-segment checksums the same as for an
ordinary parcel. The source finally sends the Parcel Probe via the outbound
IPv6 interface. According to , middleboxes
(i.e., routers, security gateways, firewalls, etc.) that recognize the
IPv6 Jumbo Payload option but do not observe this specification SHOULD
return an ICMPv6 Parameter Problem message (and presumably also drop the
packet) due to the different option length. IPv6 middleboxes that
observe this specification instead MUST process the option as a Parcel
Probe as specified below.

When a router that observes this specification receives an
IP Parcel Probe it first compares Nonce-1 with '11111111' and
Check with the IP header TTL/Hop Limit; if either value differs, the router
MUST drop the probe and return a negative Parcel Reply (see below). Otherwise,
if the next hop link is non-parcel-capable the router compares the PMTU value
with the MTU of the inbound link for the probe and MUST (re)set PMTU to
the lower MTU. The router then MUST return a positive Parcel Reply (see
below) and convert the probe into an ordinary IP packet(s) the same as was
described previously for routers forwarding to non-parcel-capable links.
If the next hop IP link configures a sufficiently large MTU to pass the
packet(s), the router then MUST forward each packet to the next hop;
otherwise, it MUST drop each packet and return a (single) suitable PTB.
If the next hop IP link both supports parcels and configures an MTU that
is large enough to pass the probe, the router instead compares the probe
PMTU value with the MTUs of both the inbound and outbound links for the
probe and MUST (re)set PMTU to the lower MTU. The router then MUST reset
Check to the same value that will appear in the TTL/Hop Limit of the
outgoing IP header, and MUST forward the Parcel Probe to the next hop.
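
The validation and PMTU update steps above can be sketched as follows. This is a minimal, non-normative illustration: the names `Probe` and `process_probe` and the returned action strings are hypothetical, and the sketch assumes the outgoing TTL/Hop Limit is the received value decremented by one as in ordinary IP forwarding. It does not model conversion to ordinary packets, subdivision, or PTB generation.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    """Hypothetical view of the probe fields a router examines."""
    nonce1: int
    check: int
    pmtu: int
    ttl: int   # TTL / Hop Limit of the received IP header


def process_probe(probe: Probe, inbound_mtu: int, outbound_mtu: int,
                  next_hop_parcel_capable: bool) -> str:
    """Return the action a conformant router would take for this probe."""
    # Nonce-1 must still be '11111111' and Check must equal the TTL/Hop Limit.
    if probe.nonce1 != 0xFF or probe.check != probe.ttl:
        return "drop+negative-reply"

    if not next_hop_parcel_capable:
        # Only the inbound link MTU is compared in this case.
        probe.pmtu = min(probe.pmtu, inbound_mtu)
        return "positive-reply+convert-to-packets"

    # Parcel-capable next hop: compare against both inbound and outbound MTUs.
    probe.pmtu = min(probe.pmtu, inbound_mtu, outbound_mtu)
    # Assumption: the outgoing TTL/Hop Limit is the received value minus one.
    probe.ttl -= 1
    probe.check = probe.ttl
    return "forward-probe"
```

A probe arriving with PMTU 9000 and TTL 64 over a 9000-octet inbound link and forwarded onto a 4000-octet parcel-capable link leaves with PMTU 4000 and Check 63.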
If the next hop IP link supports parcels but configures an MTU that is
too small to pass the probe, the router resets PMTU and Check the same as
above then subdivides the probe into multiple smaller probes small enough
to traverse the link.

The final destination may therefore receive either one or more
ordinary IP packets or intact Parcel Probes. If the final destination
receives ordinary IP packets, it performs any necessary integrity checks
then delivers the packets to upper layers which will return an upper layer
probe response if necessary. If the final destination receives a Parcel
Probe, it first compares Nonce-1 with '11111111' and Check with the IP
header TTL/Hop Limit; if either value differs, the final destination
MUST drop the probe and return a negative Parcel Reply. Otherwise, the
final destination compares the probe PMTU value with the MTU of the
inbound link and MUST (re)set PMTU to the lower MTU. The final destination
then MUST return a positive Parcel Reply and deliver the probe contents
to upper layers the same as for an ordinary IP parcel.

When a router or final destination returns a Parcel Reply, it
prepares an ICMPv6 PTB message with Code set to
"Parcel Reply" (see: ) and with
MTU set to either the PMTU value reported in the Parcel Probe for a positive
reply or to the value '0' for a negative reply. The node then writes its
own IP address as the Parcel Reply source and writes the source of the
Parcel Probe as the Parcel Reply destination (for IPv4 Parcel Probes,
the node writes the Parcel Reply addresses as IPv4-Compatible IPv6
addresses ). The node next copies as much of
the leading portion of the Parcel Probe (beginning with the IP header)
as possible into the "packet in error" field without causing the Parcel
Reply to exceed 512 octets in length, then calculates the ICMPv6 header
checksum. Since IPv6 packets cannot traverse IPv4 paths, and since
middleboxes often filter ICMPv6 messages as they traverse IPv6 paths,
the node next wraps the Parcel Reply in UDP/IP headers of the correct
IP version with the IP source and destination addresses copied from
the Parcel Reply and with UDP port numbers set to the UDP port number
for OMNI . In the process, the
node either calculates or omits the UDP checksum as appropriate and
(for IPv4) clears the DF bit. The node finally sends the prepared
Parcel Reply to the original source of the probe.

After sending a Parcel Probe the original source may therefore
receive a UDP/IP encapsulated Parcel Reply (see above) and/or an upper
layer protocol probe reply. If the source receives a Parcel Reply, it
first verifies the checksum then matches the enclosed PTB message
with the original Parcel Probe by examining the Nonce-2 field echoed in
the ICMPv6 "packet in error" field containing the leading portion of the
probe. If the PTB message does not match, the source discards the Parcel
Reply; otherwise, it continues processing. If the Parcel Reply MTU is '0',
the source marks the path as "parcels not supported"; otherwise, it
marks the path as "parcels supported" and also records the MTU value
as the path MTU for the forward path to this destination. If the
MTU value is 65535 or smaller, the value represents both a segment
size restriction and largest whole parcel size that can traverse
the path without subdivision. If the MTU value is larger, the
value determines only the largest whole parcel size.

After receiving a positive Parcel Reply, the original source can
continue sending IP parcels addressed to the final destination; any
upper layer protocol probe replies may further reduce the maximum
segment size that can be included in the parcel as an upper layer
consideration. After receiving a negative Parcel Reply (or no reply)
the original source should refrain from sending parcels until a path
change event might occur. In both cases, the original source should
periodically re-initiate Parcel Path Qualification for as long as it
desires to use the IP parcel service. If at any time performance
appears to degrade, the original source should reduce the size of
the parcels it sends and/or begin sending singleton IP packets
instead.

The original source can also use Parcel Path Qualification to
qualify the path for ordinary IP jumbograms simply by setting the IP
header length field to '0' and formatting the probe body as an ordinary
jumbogram no larger than the maximum size that can be represented in the
32-bit Jumbo Payload Length. (The source can also form a NULL probe by
setting Protocol/Next Header to "No Next Header (59)" and including a
zero, random and/or other disposable jumbo payload.) Routers that
forward the (Jumbogram) Parcel Probe will recognize the '0' IP header
length as an indication that the probe is a true Jumbogram (i.e., and
not a parcel). Each router sets PMTU to the largest Jumbogram size
it is capable of forwarding, then forwards the probe to the next hop.
If the next hop MTU is too small, the router instead drops the probe
and returns a negative (Jumbogram) Parcel Reply. Therefore, only the
destination itself may return a positive (Jumbogram) Parcel Reply with
the resulting PMTU value. This especially implies the largest possible
Jumbogram size may be significantly smaller than the largest possible
parcel size, since forwarding nodes can sub-divide parcels but
cannot sub-divide singleton Jumbograms.

Note: when a Parcel Probe forwarded into an ingress OMNI interface is
broken into sub-parcels, each sub-parcel includes its own copy of the
Parcel Probe header. When multiple sub-parcels of the same Parcel Probe
arrive at an egress OMNI interface, the interface optionally re-combines
the sub-parcels while retaining the Parcel Probe header. It is therefore
possible that a single Parcel Probe with multiple upper layer protocol
segments could generate multiple Parcel Replies.

Note: The original source includes Nonce-1 and Check fields as the
first 2 octets of Parcel Probes in case a router on the path
overwrites the values in a wayward attempt to implement . Parcel Probe
recipients should therefore regard a
Nonce-1 value other than '11111111' as an indication that
the field was either intentionally or accidentally altered by a
previous hop node that does not recognize parcels.

Note: If a router or final destination receives a Parcel Probe but
does not recognize the parcel construct, it drops the probe without
further processing (and may return an ICMP error). The original
source will then consider the probe as lost and parcels cannot
be used.

The {TCP,UDP}/IP header plus each segment of a (multi-segment) IP
parcel includes its own integrity check. This means that IP parcels can
support stronger and more discrete integrity checks for the same amount
of upper layer protocol data compared to an ordinary IP packet or
Jumbogram. The {TCP,UDP} header integrity checks can be verified at
each hop to ensure that parcels with errored headers are detected.
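
The per-segment integrity check can be illustrated with a short, non-normative sketch. It assumes (this is an assumption of the sketch, not a statement of this specification) that each Integrity Block Checksum is the standard 16-bit ones'-complement Internet checksum computed over the segment octets alone; any pseudo-header or sequence number coverage is not modeled. The function names are hypothetical.

```python
from typing import List


def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement Internet checksum over a byte string."""
    if len(data) % 2:
        data += b"\x00"              # pad odd-length data with a zero octet
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:               # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_integrity_block(segments: List[bytes], udp: bool = False) -> List[int]:
    """Compute one Checksum per segment (assumed coverage: segment only)."""
    block = []
    for segment in segments:
        value = internet_checksum(segment)
        if udp and value == 0:
            value = 0xFFFF           # for UDP, a calculated '0' is written as 'ffff'
        block.append(value)
    return block


def verify_segments(segments: List[bytes], block: List[int],
                    udp: bool = False) -> List[bool]:
    """Mark each segment as correct (True) or incorrect (False)."""
    results = []
    for segment, stored in zip(segments, block):
        if udp and stored == 0:
            results.append(True)     # a '0' Checksum means disabled for UDP
            continue
        results.append(build_integrity_block([segment], udp)[0] == stored)
    return results
```

Because each segment carries its own Checksum, a receiver using this approach can accept the intact segments of a parcel and reject only the damaged ones.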
The per-segment Integrity Block Checksums are set by the source and
verified by the final destination, noting that TCP parcels must
honor the sequence number discipline discussed in
.

IP parcels can range in length from as small as only the {TCP,UDP}/IP
headers plus a single Integrity Block Checksum with a non-zero length
segment to as large as the headers plus (256 * (65535 minus headers)) octets.
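
As a worked example of the upper bound just stated, the following sketch evaluates "headers plus (256 * (65535 minus headers))". The function name `max_parcel_length` is hypothetical, and the 48-octet header length used in the usage note (a 40-octet IPv6 header plus an 8-octet UDP header, ignoring extension headers and options) is a simplifying assumption chosen only for illustration.

```python
MAX_SEGMENTS = 256        # largest number of segments in one parcel
MAX_UNIT_LENGTH = 65535   # bound on headers plus one segment


def max_parcel_length(header_len: int) -> int:
    """Evaluate headers + 256 * (65535 - headers) for a given header length."""
    if not 0 < header_len < MAX_UNIT_LENGTH:
        raise ValueError("header_len must be between 1 and 65534 octets")
    return header_len + MAX_SEGMENTS * (MAX_UNIT_LENGTH - header_len)
```

With the assumed 48 octets of headers, `max_parcel_length(48)` evaluates to 16764720 octets, i.e., just under 16MB.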
Although 32-bit link layer integrity checks provide sufficient protection
for contiguous data blocks up to approximately 9KB, reliance on link-layer
integrity checks may be inadvisable for links with significantly larger
MTUs and may not be possible at all for links such as tunnels over IPv4
that invoke fragmentation. Moreover, the segment contents of a received
parcel may arrive in an incomplete and/or rearranged order with respect
to their original packaging.

Lower layer protocol entities calculate and verify {TCP,UDP}/IP
parcel header Checksums at their layer, since an errored header could
result in mis-delivery to the wrong upper layer protocol entity. If a
lower layer protocol entity on the path detects an incorrect
{TCP,UDP}/IP Checksum it discards the entire IP parcel unless the
header(s) can somehow be repaired.

To support the parcel header checksum calculation, lower layer
protocol entities use modified versions of the {TCP,UDP}/IPv4
"pseudo-header" found in ,
or the {TCP,UDP}/IPv6 "pseudo-header" found in Section 8.1 of
. Note that while the contents of the
two IP protocol version-specific pseudo-headers beyond the address
fields are the same, the order in which the contents are arranged
differs and must be honored according to the specific IP protocol
version as shown in . This allows for maximum
reuse of widely deployed code while ensuring interoperability.

where the following fields appear in both pseudo-headers
but with different ordering:

Source Address is the 4-octet IPv4 or 16-octet IPv6 source
address of the prepared parcel.

Destination Address is the 4-octet IPv4 or 16-octet IPv6
destination address of the prepared parcel.

zero encodes the constant value '0'.

Next Header is the IP protocol number corresponding to the upper
layer protocol, i.e., TCP or UDP.

Segment Length is the value that appears in the IPv4 Total
Length or IPv6 Payload Length field of the prepared parcel.

Nsegs is a 1-octet value one less than the number of segments
included, and must contain a number between 0 and 255 (this is
the same value that appears in the Jumbo Payload Option Nsegs
field).

Upper-Layer Packet Length is the 3-octet length of the
{TCP,UDP} header plus data (this value can be derived from
the Jumbo Payload Length by subtracting the IPv4 header length
for IPv4 or IPv6 extension header length for IPv6).

Upper layer protocol entities use socket options to coordinate
per-segment checksum processing with lower layers. If the upper layer
sets a SO_NO_CHECK(TX) socket option, the upper layer is responsible for
supplying per-segment checksums on transmission and the lower layer
forwards the IP parcel to the next hop without further processing;
otherwise, the lower layer supplies the per-segment checksums before
forwarding. If the upper layer sets a SO_NO_CHECK(RX) socket option,
the upper layer is responsible for verifying per-segment checksums on
reception and the lower layer delivers each received parcel body to
the upper layer without further processing; otherwise, the lower
layer verifies the per-segment parcel checksums before delivering.

When the upper layer protocol entity of the source sends a parcel
body to lower layers, it prepends an Integrity Block of (J + 1) 2-octet
Checksum fields and includes a 4-octet Sequence Number field with each
TCP non-first segment. If the SO_NO_CHECK(TX) socket option is set, the
upper layer protocol either calculates each segment checksum and writes
the value into the corresponding Checksum field (and for UDP with '0'
values written as 'ffff') or writes the value '0' to disable checksums
for specific segments (for UDP only). If the SO_NO_CHECK(TX) socket
option is clear, the upper layer instead writes the value '0' (for
UDP only) to disable or any non-zero value to enable checksums for
specific segments.

When the lower layer protocol entity of the source receives the
parcel body from upper layers, if the SO_NO_CHECK(TX) socket option is
set the lower layer appends the {TCP,UDP}/IP headers and forwards the
parcel to the next hop without further processing. If the
SO_NO_CHECK(TX) socket option is clear, the lower layer instead
calculates the checksum for each segment with a non-zero value in the
corresponding Integrity Block Checksum field and overwrites the
calculated value into the Checksum field (and for UDP with '0'
values written as 'ffff').

When the lower layer protocol entity of the destination receives a
parcel from the source, if the SO_NO_CHECK(RX) socket option is set the
lower layer delivers the parcel body to the upper layer without further
processing, and the upper layer is responsible for per-segment checksum
verification. If the SO_NO_CHECK(RX) socket option is clear, the lower
layer instead verifies the checksum for each TCP segment (or each
UDP segment with a non-zero value in the corresponding Integrity Block
Checksum field) and marks a corresponding field for the segment in an
ancillary data structure as one of "correct" or "incorrect". (For UDP,
if the Checksum is '0' the lower layer protocol unconditionally marks
the segment as "correct".) The lower layer then delivers both the parcel
body (beginning with the Integrity Block) and ancillary data to the
upper layer which can then determine which segments have
correct/incorrect checksums, noting that a '0' checksum always
means that the checksum for this segment is disabled.

Note: The Integrity Block itself is intentionally omitted from the IP
Parcel {TCP,UDP} header checksum calculation. This permits destinations
to accept as many intact segments as possible from received parcels with
checksum block bit errors, whereas the entire parcel would need to be
discarded if the header checksum also covered the Integrity Block.

Note: IP parcels and jumbograms that set Protocol/Next Header to
"No Next Header (59)" do not include a {TCP,UDP} Checksum field and
therefore do not include a header checksum. Intermediate nodes simply
forward these NULL parcels/jumbos without verifying a header checksum,
while destination nodes simply discard them after returning a Parcel
Reply, if necessary.

Section 3 of provides a list of certain
conditions to be considered as errors. In particular:

error: IPv6 Payload Length != 0 and Jumbo Payload option present

error: Jumbo Payload option present and Jumbo Payload Length < 65,536

Implementations that obey this specification ignore these conditions
and do not regard them as errors.

By defining a new IPv4 Jumbo Payload option, this document also
implicitly enables a true IPv4 jumbogram service defined as an IPv4
packet with a Jumbo Payload option included and with Total Length set to
0. All other aspects of IPv4 jumbograms are the same as for IPv6
jumbograms .

Common widely-deployed implementations include services such as TCP
Segmentation Offload (TSO) and Generic Segmentation/Receive Offload
(GSO/GRO). These services support a robust (but non-standard)
service that has been shown to improve performance in many
instances.

UDP/IPv4 parcels have been implemented in the linux-5.10.67 kernel and
ION-DTN ion-open-source-4.1.0 source distributions. The patch distribution
is found at: "https://github.com/fltemplin/ip-parcels.git".

Performance analysis with a single-threaded receiver has shown that
including increasing numbers of segments in a single parcel produces
measurable performance gains over fewer numbers of segments due to more
efficient packaging and reduced system calls/interrupts. For example,
sending parcels with 30 2000-octet segments shows a 48% performance
increase in comparison with ordinary IP packets with a single
2000-octet segment.

Since performance is strongly bounded by single-segment receiver
processing time (with larger segments producing dramatic performance
increases), it is expected that parcels with increasing numbers of
segments will provide a performance multiplier on multi-threaded
receivers in parallel processing environments.

The IANA is instructed to change the "MTUP - MTU Probe" entry in the
'ip option numbers' registry to the "JUMBO - IPv4 Jumbo Payload" option.
The Copy and Class fields must both be set to 0, and the Number and
Value fields must both be set to '11'. The reference must be changed to
this document [RFCXXXX].

In the control plane, original sources match the Nonce values
in received Parcel Replies with their corresponding Parcel Probes.
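
The matching step can be sketched as follows, under stated assumptions. The function `process_parcel_reply`, the `outstanding` table keyed by Nonce-2, and the returned dictionary are hypothetical constructs for illustration; parsing of the UDP/IP encapsulated ICMPv6 message and checksum verification are assumed to have already taken place.

```python
from typing import Dict, Optional


def process_parcel_reply(outstanding: Dict[int, str], echoed_nonce2: int,
                         reply_mtu: int) -> Optional[dict]:
    """Match a Parcel Reply to an outstanding probe and interpret its MTU.

    outstanding maps the Nonce-2 of each probe sent to its destination.
    Returns None when no probe matches (the reply is discarded).
    """
    # Entries are looked up, not removed, since one probe that was
    # subdivided in transit may generate more than one reply.
    destination = outstanding.get(echoed_nonce2)
    if destination is None:
        return None
    if reply_mtu == 0:
        # A negative reply carries MTU '0'.
        return {"destination": destination, "parcels_supported": False,
                "path_mtu": None}
    return {
        "destination": destination,
        "parcels_supported": True,
        "path_mtu": reply_mtu,
        # 65535 or smaller: the value also restricts the segment size.
        "limits_segment_size": reply_mtu <= 65535,
    }
```

A reply whose echoed nonce is not in the table is dropped without changing any path state, which is the property the nonce check is meant to provide.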
If the values match, the reply is likely an authentic response to
a probe. In environments where stronger authentication is necessary,
nodes that send Parcel Replies can apply the message authentication
services specified for AERO/OMNI.

In the data plane, multi-layer security solutions may be needed
to ensure confidentiality, integrity and availability. Since parcels
are defined only for TCP and UDP, IP layer securing services such as
IPsec-AH/ESP cannot be applied directly to
parcels, although they can certainly be used at lower layers such as
for transmission of parcels over VPNs and/or OMNI link secured
spanning trees. Since the IP layer does not manipulate segments
exchanged with upper layers, parcels do not interfere with
transport- or higher-layer security services such as (D)TLS/SSL
which may provide greater flexibility in
some environments.

Further security considerations related to IP parcels are found
in the AERO/OMNI specifications.

This work was inspired by ongoing AERO/OMNI/DTN investigations. The
concepts were further motivated through discussions on the IETF intarea
and 6man lists as well as with Boeing colleagues.

A considerable body of work over recent years has produced useful
"segmentation offload" facilities available in widely-deployed
implementations.

Accelerating UDP packet transmission for QUIC,
https://blog.cloudflare.com/accelerating-udp-packet-transmission-for-quic/

BIG TCP, Netdev 0x15 Conference (virtual),
https://netdevconf.info/0x15/session.html?BIG-TCP

Historic and current-day data links configure Maximum Transmission
Units (MTUs) that are far smaller than the desired state for the future
of IP parcel transmission. When the first Ethernet data links were
deployed many decades ago, their 1500 octet MTU set a strong precedent
that was widely adopted. This same size now appears as the predominant
MTU limit for most paths in the Internet today, although modern link
deployments with larger MTUs up to 9KB have begun to emerge.

In the late 1980's, the Fiber Distributed Data Interface (FDDI)
standard defined a new link type with MTU slightly larger than 4500
octets. The goal of the larger MTU was to increase performance by a
factor of 10 over the ubiquitous 10Mbps and 1500-octet MTU Ethernet
technologies of the time. Many factors including a failure to harmonize
MTU diversity and an Ethernet performance increase to 100Mbps led to
poor FDDI market reception. In the next decade, the 1990's saw new
initiatives including ATM/AAL5 (9KB MTU) and HiPPI (64KB MTU) which
offered high-speed data link alternatives with larger MTUs but again
the inability to harmonize diversity derailed their momentum. By the
end of the 1990s and leading into the 2000's, emergence of the 1Gbps,
10Gbps and even faster Ethernet performance levels seen today has
obscured the fact that the modern Internet of the 21st century is
still operating with 20th century MTUs!

To bridge this gap, increased OMNI interface deployment in the
near future will provide a virtual link type that can
pass IP parcels over paths that traverse traditional data links with
small MTUs. Performance analysis has proven that (single-threaded)
receive-side performance is bounded by upper layer protocol segment
size, with performance increasing in direct proportion with segment
size. Experiments have also shown measurable (single-threaded) performance
increases from including larger numbers of segments per parcel, with steady
gains as the number of segments grows. However, parallel
receive-side processing will provide performance multiplier benefits
since the multiple segments that arrive in a single parcel can be
processed simultaneously instead of serially.

In addition to the clear near-term benefits, IP parcels will increase
performance to new levels as future parcel-capable links with very
large MTUs begin to emerge. These links will provide MTUs far in excess
of 64KB to as large as 16MB. With such large MTUs, the traditional CRC-32
(or even CRC-64) error checking with errored packet discard discipline
will no longer apply for large parcels. Instead, parcels larger than a
link-specific threshold will include Forward Error Correction (FEC)
codes so that errored parcels can be repaired at the receiver's data
link layer then delivered to upper layers rather than being discarded
and triggering retransmission of large amounts of data. Even if the
FEC repairs are incomplete or imperfect, all parcels can still be
delivered to upper layers where the individual segment checksums
will detect and discard any damaged data not repaired by lower layers.

These new "super-links" will appear mostly in the network edges
(e.g., high-performance data centers) and not as often in the middle
of the Internet. (However, some space-domain links that
extend over enormous distances may also benefit.) For this reason, a
common use case will include parcel-capable super-links in the edge
networks of both parties of an end-to-end session with an OMNI link
connecting the two over wide area Internetworks. Medium- to moderately
large-sized IP parcels over OMNI links will already provide considerable
performance benefits for wide-area end-to-end communications while truly
large IP parcels over super-links can provide boundless increases for
localized bulk transfers in edge networks or for deep space long haul
transmissions. The ability to grow and adapt without practical bound
enabled by IP parcels will inevitably encourage new data link
development leading to future innovations in new markets that will
revolutionize the Internet.

Until these new links begin to emerge, however, parcels will already
provide a tremendous benefit to end systems by allowing applications to
send and receive segment buffers larger than 65535 octets in a single
system call. By expanding the operating system call data copy
limit from its current 16-bit length to a 32-bit length, applications
will be able to send and receive maximum-length parcel buffers even if
lower layers need to break them into multiple parcels to fit within the
underlying interface MTU. For applications such as the Delay Tolerant
Networking (DTN) Bundle Protocol , this will
allow applications to send and receive entire large upper layer
protocol constructs (such as DTN bundles) in a single system call.

<< RFC Editor - remove prior to publication >>

Changes from earlier versions:

Submit for Intarea Standards Track RFC Publication.