Some Congestion Experienced in TCP
Redacted
Portland
97217
United States
OR
rgrimes@freebsd.org
Redacted
Liberec 30
463 11
Czech Republic
pete@heistp.net
Internet
TCP Maintenance and Minor Extensions
Internet-Draft
SCE
TCP
This memo classifies a TCP code point
ESCE ("Echo Some Congestion Experienced") for use in
feedback of IP code point SCE ("Some Congestion Experienced").
Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
Introduction
This memo requests a TCP header codepoint for use as ESCE.
This memo limits its scope to the definition of the TCP codepoint ESCE,
with a few brief illustrations of how it may be used.
SCE provides early and proportional feedback to the CC (congestion control)
algorithms for transport protocols, including but not limited to TCP. The
reference implementation is a Linux kernel modified to support SCE, including:
- Enhancements to Linux's Cake (Common Applications Kept Enhanced) AQM to
support SCE signaling
- Modifications to the TCP receive path to reflect SCE signals back to the
sender
- The addition of three new TCP CC algorithms that modify the originals to add
SCE support: Reno-SCE, DCTCP-SCE and Cubic-SCE (work in progress as of this
writing)
Background
defines the IP SCE codepoint.
TCP Receiver
The mechanism defined to feed back SCE signals to the sender explicitly
makes use of the ESCE ("Echo Some Congestion Experienced") code point
in the TCP header.
Single ACK implementation
Upon receipt of a packet, an ACK is immediately generated
and the SCE codepoint is copied into the ESCE codepoint of the ACK.
This keeps the count of SCE-marked and unmarked bytes
accurately reflected in the ACK packet(s).
This implementation is valid, but has the downside of increasing ACK traffic.
It is NOT RECOMMENDED, but is useful for experimental work.
Simple Delayed ACK implementation
Upon receipt of a packet without an SCE codepoint, traditional delayed
ACK processing is performed.
Upon receipt of a packet with an SCE codepoint, immediate ACK processing
SHOULD be done.
This still allows some delaying of ACKs,
but provides earlier feedback of the congested state.
It has the negative effect of over-signalling ESCE.
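The rule above can be sketched as follows. This is an illustration only, not part of the reference implementation: the class name, the two-segment delayed-ACK threshold, and the decision to set ESCE on the ACK triggered by an SCE-marked packet are assumptions, and the delayed-ACK timer is not modelled.

```python
class SimpleDelayedAckReceiver:
    """Hypothetical receiver: delay ACKs unless the packet carries SCE."""

    def __init__(self, delack_segments=2):
        self.delack_segments = delack_segments  # assumed delayed-ACK threshold
        self.pending = 0                        # segments received, not yet ACKed

    def on_packet(self, sce):
        """Return None if the ACK is delayed, else (segments_acked, esce)."""
        self.pending += 1
        if sce or self.pending >= self.delack_segments:
            ack = (self.pending, sce)  # an SCE mark forces an immediate ACK
            self.pending = 0
            return ack
        return None


rx = SimpleDelayedAckReceiver()
first = rx.on_packet(sce=False)   # ACK is delayed
second = rx.on_packet(sce=True)   # immediate ACK covering both segments
```

In this sketch `second` acknowledges two segments with ESCE set although only one of them arrived with SCE, which is one way the over-signalling can arise.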
Dithered Delayed ACK implementation
Upon receipt of a packet the SCE codepoint is stored in the TCP state.
The state of multiple packets may be stored.
Upon generation of an ACK, normal or delayed,
the stored SCE state is used to set the state of ESCE.
If no SCE state is in the TCP state,
then the ESCE code point MUST NOT be set.
If all of the packets to be ACKed have SCE
state set then the ESCE code point MUST be set in the ACK.
If some of the packets to be ACKed have SCE state set then some
proportional number of ACK packets SHOULD be sent with the ESCE code point set.
Though this may defer an ESCE congestion signal when there is no next packet for some time,
it is generally accepted that such sparse flows are not the source of congestion,
and thus the delayed signal is of low impact.
The goal is to have the same number of bytes marked with ESCE
as arrived with SCE.
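One way to approximate this byte-for-byte goal is a carry-over credit counter, sketched below. This is an assumption made for illustration, not the algorithm used by the reference implementation; the class and attribute names are hypothetical.

```python
class DitheredAckReceiver:
    """Hypothetical receiver that echoes SCE proportionally by byte count."""

    def __init__(self):
        self.pending_bytes = 0  # bytes received since the last ACK
        self.sce_credit = 0     # SCE-marked bytes not yet echoed via ESCE

    def on_packet(self, length, sce):
        """Store the SCE state of an arriving packet in the TCP state."""
        self.pending_bytes += length
        if sce:
            self.sce_credit += length

    def make_ack(self):
        """Return (acked_bytes, esce) for an ACK, normal or delayed."""
        acked = self.pending_bytes
        self.pending_bytes = 0
        # Set ESCE only when enough marked bytes have accumulated to
        # account for every byte this ACK covers; otherwise carry the
        # credit forward to a later ACK.
        esce = acked > 0 and self.sce_credit >= acked
        if esce:
            self.sce_credit -= acked
        return acked, esce
```

With no stored SCE state the credit is zero and ESCE is never set; when every packet covered by an ACK is marked, the credit always suffices and ESCE is always set. When every second packet is marked and each ACK covers two packets, every second ACK carries ESCE, so the ESCE-marked byte count matches the SCE-marked byte count.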
Advanced ACK implementation
The Advanced ACK implementation immediately flushes any pending ACKs
up to the previous segment when the state of the SCE marking changes,
allowing consecutive packets with the same SCE state to be coalesced by the normal delayed-ACK logic.
The ACK volume is then inflated only slightly compared to an unmarked connection,
and may actually involve fewer ACKs than a connection involving CE marks or losses,
during which delayed ACKs are temporarily disabled.
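The flush-on-change behaviour can be sketched as below. The class name and the two-segment delayed-ACK threshold are assumptions for illustration, and the delayed-ACK timer is omitted.

```python
class AdvancedAckReceiver:
    """Hypothetical receiver: flush pending ACKs when the SCE state changes."""

    def __init__(self, delack_segments=2):
        self.delack_segments = delack_segments  # assumed delayed-ACK threshold
        self.pending = 0                        # segments waiting for an ACK
        self.pending_sce = False                # SCE state shared by those segments

    def on_packet(self, sce):
        """Return the list of (segments_acked, esce) ACKs this arrival emits."""
        acks = []
        if self.pending and sce != self.pending_sce:
            # SCE state changed: ACK everything up to the previous segment.
            acks.append((self.pending, self.pending_sce))
            self.pending = 0
        self.pending += 1
        self.pending_sce = sce
        if self.pending >= self.delack_segments:
            # Normal delayed-ACK coalescing of same-state segments.
            acks.append((self.pending, self.pending_sce))
            self.pending = 0
        return acks


rx = AdvancedAckReceiver()
emitted = [rx.on_packet(s) for s in (False, True, True, False)]
```

Here the unmarked first segment is ACKed alone when the marking changes, and the two marked segments are coalesced into a single ESCE ACK, so each ACK covers segments of only one SCE state.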
ACK Thinning
ACK thinning has been considered,
given that Cake includes an optional ack-filter which performs thinning.
We have, for example, added consideration of the ESCE bit to Cake's ack-filter.
Mathematically, the most extreme errors possible in either direction,
due to ack thinning,
are easily corrected during subsequent RTTs.
TCP Sender
The recommended response to each single segment marked with ESCE
is to reduce cwnd by an amortised 1/sqrt(cwnd) segments. If the growth rate is greater
than that provided by the Reno-linear algorithm (e.g. slow-start exponential or CUBIC
polynomial), then the growth rate SHOULD also be reduced.
Other responses, such as the 1/cwnd from DCTCP, are also acceptable but may perform less well.
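A minimal sketch of the recommended 1/sqrt(cwnd) response follows, with cwnd expressed in segments. The function name and the two-segment floor are assumptions for illustration; the growth-rate reduction is not modelled.

```python
import math


def on_esce_segments(cwnd, marked_segments=1, min_cwnd=2.0):
    """Return the new cwnd (in segments) after ESCE-marked segments are ACKed.

    min_cwnd is an assumed floor, not a value taken from this memo.
    """
    for _ in range(marked_segments):
        # Each marked segment removes 1/sqrt(cwnd) segments from the window.
        cwnd = max(min_cwnd, cwnd - 1.0 / math.sqrt(cwnd))
    return cwnd


after_one_mark = on_esce_segments(100.0)          # 100 - 1/10
after_full_window = on_esce_segments(100.0, 100)  # every segment in one window marked
```

With a window of 100 segments, a single mark removes 0.1 segment, while a whole window of marks removes roughly 10 segments, that is, about sqrt(cwnd).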
There are no changes to the response functions with respect to CE or packet loss
specified by this draft,
hence and are still applicable
This is still an area of continued investigation.
Related Work
More Accurate ECN Feedback in TCP
AccECN replaces the definition of the ECE and CWR bits (and the former NS bit) with its own three-bit field.
This new interpretation is predicated on successfully negotiating AccECN,
and is not useful to SCE implementations because it provides no information about any ECT(1) codepoints received,
and SCE does not need or use the extra information about CE marks that the three-bit field does provide.
Hence SCE may be considered mutually exclusive with AccECN on any given connection.
AccECN supports a fallback to RFC 3168 style signalling during the three-way handshake
by recognising the normal requests and responses of an RFC 3168 endpoint.
SCE endpoints also exhibit RFC 3168 behaviour during the handshake,
so this mutual exclusivity occurs naturally.
There will therefore be no confusion on the wire between the two experiments,
even though SCE does not explicitly negotiate its upgrade from plain RFC 3168 behaviour.
The latter is consistent with the (now historic) Nonce Sum specification,
which also did not explicitly negotiate support,
and used the same additional ECN codepoint and TCP header bit that SCE is now requesting.
IANA Considerations
This document requests one of the reserved bits in the TCP header,
with the former TCP NS ("Nonce Sum") bit (bit 7) being suggested due to similarities with its previous usage.
[RFC8311] (section 3) obsoletes the NS codepoint, making it available for use.
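For illustration, the sketch below reads and writes the suggested bit in a raw TCP header. It assumes ESCE is assigned to the former NS position, which this document only suggests; the constant and function names are hypothetical, and the TCP checksum is not recomputed.

```python
import struct

# Former NS bit: bit 7 of the 16-bit data-offset/flags word at byte offset 12
# (bits 0-3 data offset, bits 4-6 reserved). Assumed assignment for ESCE.
ESCE_MASK = 0x0100


def has_esce(tcp_header: bytes) -> bool:
    """Return True if the assumed ESCE bit is set in a raw TCP header."""
    (word,) = struct.unpack_from("!H", tcp_header, 12)
    return bool(word & ESCE_MASK)


def set_esce(tcp_header: bytes, on: bool) -> bytes:
    """Return a copy of the header with the assumed ESCE bit set or cleared."""
    (word,) = struct.unpack_from("!H", tcp_header, 12)
    word = (word | ESCE_MASK) if on else (word & ~ESCE_MASK)
    return tcp_header[:12] + struct.pack("!H", word) + tcp_header[14:]


# 20-byte header with data offset 5 and only the ACK flag set.
plain_ack = bytes(12) + struct.pack("!H", 0x5010) + bytes(6)
esce_ack = set_esce(plain_ack, True)
```

Only the low-order bit of byte 12 changes between `plain_ack` and `esce_ack`; the data offset and the other flags are left untouched.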
Security Considerations
There are no security considerations.
Normative References
Informative References
Cake - Common Applications Kept Enhanced
Some Congestion Experienced Reference Implementation GitHub Repository