Packet Loss Signaling for Encrypted Protocols
Orange Labs
alexandre.ferrieux@orange.com
Orange Labs
isabelle.hamchaoui@orange.com
Akamai Technologies
150 Broadway
Cambridge
MA
1122
USA
ilubashe@akamai.com
Loss Signaling
TSVWG
Internet-Draft
This document describes a protocol-independent method that employs two bits to
allow endpoints to signal packet loss in a way that can be used by network
devices to measure and locate the source of the loss. The signaling method
applies to all protocols with a protocol-specific way to identify packet
loss. The method is especially valuable when applied to protocols that encrypt
transport headers and do not allow an alternative method for loss detection.
Packet loss is a pervasive problem of day-to-day network operation, and
proactively detecting, measuring, and locating it is crucial to maintaining high
QoS and timely resolution of crippling end-to-end throughput issues. To this
effect, in a TCP-dominated world, network operators have been heavily relying on
information present in the clear in TCP headers: sequence and acknowledgment
numbers, and SACK when enabled. These allow for quantitative estimation of
packet loss by passive on-path observation, and the lossy segment (upstream or
downstream from the observation point) can be quickly identified by moving the
passive observer around.
With encrypted protocols, the equivalent transport headers are encrypted and
passive packet loss observation is not possible, as described in
.
Since encrypted protocols could be routed by the network differently, and the
fraction of Internet traffic delivered using encrypted protocols is increasing
every year, it is imperative to measure packet loss experienced by encrypted
protocol users directly instead of relying on measuring TCP loss between similar
endpoints.
Following the recommendation in of making path signals explicit,
this document proposes adding two explicit loss bits to the clear portion of the
protocol headers to restore network operators’ ability to maintain high QoS for
users of encrypted protocols. These bits can be added to an unencrypted portion
of a header belonging to any protocol layer, e.g. the two most significant bits of
the TTL field in IP (see ) and IPv6 (see ) headers or reserved
bits in a QUIC v1 header (see ).
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”,
“SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be
interpreted as described in .
The proposal introduces two bits that are to be present in every packet capable
of loss reporting. These are packets that include protocol headers with the loss
bits. Only loss of packets capable of loss reporting is reported using loss
bits.
Whenever this specification refers to packets, it is referring only to packets
capable of loss reporting.
Q: The “sQuare signal” bit is toggled every N outgoing packets as explained
below in .
L: The “Loss event” bit is set to 0 or 1 according to the Unreported Loss
counter, as explained below in .
Each endpoint maintains appropriate counters independently and separately for
each connection (each subflow for multipath connections).
The sQuare Value is initialized to the Initial Q Value (0 or 1) and is reflected
in the Q bit of every outgoing packet. The sQuare value is inverted after
sending every N packets (Q Period is 2*N).
The choice of the Initial Q Value and Q Period is determined by the protocol
containing the Q and L bits. For example, the values can be protocol constants (e.g.
“Initial Q Value” is 0, and “Q Period” is 128), or they can be set explicitly
for each connection (e.g. “Initial Q Value” is whatever value the initial packet
has, and “Q Period” is set per a dedicated TCP option on SYN and SYN/ACK).
Observation points can estimate the upstream losses by counting the number of
packets during a half period of the square signal, as described in .
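The sender-side Q bit logic described above can be sketched as follows. This is a minimal illustration, not a normative implementation: the class name `SquareSignal`, the method `next_q`, and the default N of 64 (half of the example Q Period of 128) are assumptions made for the example.

```python
class SquareSignal:
    """Sender-side Q bit state for one connection (hypothetical helper)."""

    def __init__(self, n: int = 64, initial_q: int = 0) -> None:
        if n <= 0:
            raise ValueError("N must be positive")
        self.n = n              # packets per half period (Q Period is 2*N)
        self.q = initial_q      # current sQuare Value (Initial Q Value)
        self.sent_in_block = 0  # packets already sent with the current value

    def next_q(self) -> int:
        """Return the Q bit for the next outgoing packet."""
        q = self.q
        self.sent_in_block += 1
        if self.sent_in_block == self.n:
            # N packets carried this value: invert for the next block.
            self.q ^= 1
            self.sent_in_block = 0
        return q
```

With N=3 and an Initial Q Value of 0, eight consecutive calls to `next_q()` yield 0, 0, 0, 1, 1, 1, 0, 0.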
The Unreported Loss counter is initialized to 0, and the L bit of every outgoing
packet indicates whether the Unreported Loss counter is positive (L=1 if the
counter is positive, and L=0 otherwise).
The value of the Unreported Loss counter is decremented every time a packet with
L=1 is sent.
The value of the Unreported Loss counter is incremented for every packet that
the protocol declares lost, using whatever loss detection machinery the protocol
employs. If the protocol is able to rescind the loss determination later, the
Unreported Loss counter SHOULD NOT be decremented due to the rescission.
Observation points can estimate the end-to-end loss, as determined by the
upstream endpoint’s loss detection machinery, by counting packets in this
direction with an L bit equal to 1, as described in .
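The Unreported Loss counter rules above can be sketched as a small per-connection state machine. The names `LossReporter`, `on_packets_declared_lost`, and `next_l` are hypothetical; how the protocol declares loss is outside the scope of the sketch.

```python
class LossReporter:
    """Sender-side L bit state for one connection (hypothetical helper)."""

    def __init__(self) -> None:
        self.unreported_loss = 0  # initialized to 0

    def on_packets_declared_lost(self, count: int = 1) -> None:
        """Called when the protocol's loss detection declares packets lost."""
        self.unreported_loss += count
        # A later rescission of the loss determination is deliberately
        # not modeled: the counter is not decremented for it.

    def next_l(self) -> int:
        """Return the L bit for the next outgoing packet."""
        if self.unreported_loss > 0:
            self.unreported_loss -= 1  # one loss reported per L=1 packet
            return 1
        return 0
```

Each lost packet therefore results in exactly one outgoing packet with L=1, sent as soon as the sender has a packet to transmit.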
There are three sources of observable loss:
upstream loss - loss between the sender and the observation point
downstream loss - loss between the observation point and the destination
observer loss - loss by the observer itself that does not cause downstream loss
The upstream and downstream loss together constitute end-to-end loss.
The Q and L bits allow detection and measurement of the types of loss listed
above.
The Loss Event bit allows an observer to calculate the end-to-end loss rate by
counting packets with L bit value of 0 and 1 for a given connection. The
end-to-end loss rate is the fraction of packets with L=1.
The simplifying assumption here is that upstream loss affects packets with L=0
and L=1 equally. This assumption may not hold exactly if some loss is caused by
tail-drop in a network device: if the sender's congestion controller reduces the
packet send rate after loss, packets with L=1 may be sent after a sufficient
delay that they have a greater chance of arriving at the observer.
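As an illustration of the end-to-end measurement, the observer-side computation reduces to a ratio of counts. The function name `end_to_end_loss_rate` and the representation of observed packets as a sequence of L bit values are assumptions of this sketch.

```python
from typing import Iterable, Optional


def end_to_end_loss_rate(l_bits: Iterable[int]) -> Optional[float]:
    """Fraction of observed packets of one connection that carry L=1.

    Returns None when no packets were observed.
    """
    total = 0
    marked = 0
    for bit in l_bits:
        total += 1
        if bit == 1:
            marked += 1
    if total == 0:
        return None
    return marked / total
```

For example, one packet with L=1 among four observed packets gives an estimated end-to-end loss rate of 0.25.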
Blocks of N (half of Q Period) consecutive packets are sent with the same value
of the Q bit, followed by another block of N packets with inverted value of the
Q bit. Hence, knowing the value of N, an on-path observer can estimate the
amount of loss after observing at least N packets. The upstream loss rate is one
minus the average number of packets in a block of packets with the same Q value
divided by N.
The observer needs to be able to tolerate packet reordering that can blur the
edges of the square signal.
The Q Period needs to be chosen carefully, since the observation could become
too unreliable in case of packet reordering and loss if Q Period is too
small. However, when Q Period is too large, short connections may not yield a
useful upstream loss measurement.
The observer needs to differentiate packets as belonging to different
connections, since they use independent counters.
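A simplified sketch of the upstream loss estimate for a single connection follows. It assumes that packets arrive in order, that no complete block of N packets is lost, and that the first and last observed blocks may be partial and are therefore discarded; a practical observer would also need the reordering tolerance discussed above. The function name `upstream_loss_rate` is hypothetical.

```python
from itertools import groupby
from typing import Optional, Sequence


def upstream_loss_rate(q_bits: Sequence[int], n: int) -> Optional[float]:
    """Estimate upstream loss from the Q bits of one connection.

    q_bits: Q bit values in arrival order (assumed free of reordering).
    n: half of the Q Period, i.e. packets sent per block.
    Returns None if no complete block was observed.
    """
    # Length of each run of identical Q values.
    runs = [sum(1 for _ in group) for _, group in groupby(q_bits)]
    # The first and last runs may be truncated by the observation window.
    complete = runs[1:-1]
    if not complete:
        return None
    average = sum(complete) / len(complete)
    # Loss rate is one minus the surviving fraction of each block.
    # Clamped at 0: a run longer than N means blocks merged, which this
    # sketch does not attempt to untangle.
    return max(0.0, 1.0 - average / n)
```

With N=4 and complete blocks of 3, 4, and 3 packets observed, the estimate is 1 - (10/3)/4, or about 0.167.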
Upstream loss is calculated by observing the actual packets that did not suffer
the upstream loss. End-to-end loss, however, is calculated by observing
subsequent packets after the sender’s protocol detected the loss. Hence,
end-to-end loss is generally observed with a delay of between 1 RTT (loss
declared due to multiple duplicate acknowledgments) and 1 RTO (loss declared due
to a timeout) relative to the upstream loss.
The connection RTT can sometimes be estimated by timing protocol handshake
messages. This RTT estimate can be greatly improved by observing a dedicated
protocol mechanism for conveying RTT information, such as the Latency Spin bit
of .
Whenever the observer needs to perform a computation that uses both upstream and
end-to-end loss rate measurements, it SHOULD use upstream loss rate leading the
end-to-end loss rate by approximately 1 RTT. If the observer is unable to
estimate RTT of the connection, it should accumulate loss measurements over time
periods of at least 4 times the typical RTT for the observed connections.
If the calculated upstream loss rate exceeds the end-to-end loss rate calculated
in , then either the Q Period is too short for the amount of
packet reordering or there is observer loss, described in . If
this happens, the observer SHOULD adjust the calculated upstream loss rate to
match end-to-end loss rate.
Because downstream loss affects only those packets that did not suffer upstream
loss, the end-to-end loss rate (e) relates to the upstream loss rate (u) and
downstream loss rate (d) as (1-u)(1-d)=1-e. Hence, d=(e-u)/(1-u).
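The relation can be checked numerically with a short sketch. The function name `downstream_loss_rate` is hypothetical; the clamping of u to e reflects the adjustment recommended earlier for the case where the measured upstream loss rate exceeds the end-to-end loss rate.

```python
from typing import Optional


def downstream_loss_rate(e: float, u: float) -> Optional[float]:
    """Solve (1-u)(1-d) = 1-e for d, given loss rates in [0, 1].

    Returns None when u is 1, since no packet reached the observer and
    the downstream loss rate is undefined.
    """
    if not (0.0 <= e <= 1.0 and 0.0 <= u <= 1.0):
        raise ValueError("loss rates must be within [0, 1]")
    # Adjust the upstream rate down to the end-to-end rate if it exceeds it.
    u = min(u, e)
    if u == 1.0:
        return None
    return (e - u) / (1.0 - u)
```

For e=0.05 and u=0.02, this gives d=0.03/0.98, roughly 0.0306, and (1-0.02)(1-d) evaluates back to 0.95.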
A typical deployment of a passive observation system includes a network tap
device that mirrors network packets of interest to a device that performs
analysis and measurement on the mirrored packets. The observer loss is the loss
that occurs on the mirror path.
Observer loss affects upstream loss rate measurement since it causes the
observer to account for fewer packets in a block of identical Q bit values (see
the upstream loss measurement above). The end-to-end loss rate measurement,
however, is unaffected by the observer loss, since it is a measurement of the
fraction of packets with the L bit set, and the observer loss would affect all
packets equally
(see ).
The need to adjust the upstream loss rate down to match end-to-end loss rate as
described in is a strong indication of the observer loss,
whose magnitude is between the amount of such adjustment and the entirety of the
upstream loss measured in .
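The bounds on observer loss stated above can be expressed as a small helper. This is an illustrative sketch; the function name `observer_loss_bounds` is hypothetical, and both inputs are assumed to be loss rates measured over the same interval.

```python
from typing import Optional, Tuple


def observer_loss_bounds(
    measured_upstream: float, end_to_end: float
) -> Optional[Tuple[float, float]]:
    """Return (lower, upper) bounds on the observer loss rate.

    Returns None when the measured upstream rate does not exceed the
    end-to-end rate, i.e. no adjustment is needed and the measurements
    give no indication of observer loss.
    """
    if measured_upstream <= end_to_end:
        return None
    adjustment = measured_upstream - end_to_end
    # Lower bound: the size of the adjustment.
    # Upper bound: the entire measured upstream loss.
    return (adjustment, measured_upstream)
```

For a measured upstream rate of 0.05 and an end-to-end rate of 0.02, the observer loss lies between roughly 0.03 and 0.05.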
Accurate loss information is not critical to the operation of any protocol,
though its presence for a sufficient number of connections is important for the
operation of the networks.
The loss bits are amenable to “greasing” described in , if the
protocol designers are not ready to dedicate (and ossify) bits used for loss
reporting to this function. The greasing could be accomplished similarly to the
Latency Spin bit greasing in . Namely, implementations could
decide that a fraction of connections should not encode loss information in the
loss bits and, instead, the bits would be set to arbitrary values. The observers
would need to be ready to ignore connections with loss information more
resembling noise than the expected signal.
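One possible shape for such greasing is sketched below. The class name `GreasedLossBits`, the `grease_fraction` parameter, and the per-connection decision are assumptions of this example; the document does not prescribe a particular mechanism or fraction.

```python
import random
from typing import Optional, Tuple


class GreasedLossBits:
    """Per-connection choice between real and arbitrary loss bits."""

    def __init__(self, grease_fraction: float,
                 rng: Optional[random.Random] = None) -> None:
        self.rng = rng if rng is not None else random.Random()
        # Decide once per connection whether loss bits carry noise.
        self.greased = self.rng.random() < grease_fraction

    def encode(self, q: int, l: int) -> Tuple[int, int]:
        """Return the (Q, L) bits to place in the outgoing packet."""
        if self.greased:
            # Arbitrary values: observers are expected to discard
            # connections that look like noise.
            return (self.rng.getrandbits(1), self.rng.getrandbits(1))
        return (q, l)
```

A `grease_fraction` of 0.0 never greases, and a value of 1.0 greases every connection.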
Passive loss observation has been a part of the network operations for a long
time, so exposing loss information to the network does not add new security
concerns.
Guarding users’ privacy is an important goal for modern protocols and protocol
extensions per . While an explicit loss signal – a preferred way to
share loss information per – helps to minimize unintentional
exposure of additional information, implementations of loss reporting must
ensure that loss information does not compromise the protocol’s privacy goals.
For example, QUIC allows changing Connection IDs in the middle of
a connection to reduce the likelihood of a passive observer linking old and new
subflows to the same device. A QUIC implementation would need to reset all
counters when it changes the Connection ID used for outgoing packets. It would
also need to avoid incrementing the Unreported Loss counter for loss of packets
sent with a different Connection ID.
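The two requirements in this example can be sketched as follows. This is not taken from any QUIC implementation: the class `PrivacyAwareLossState`, its methods, and the choice to restart the Q bit at 0 after a reset are assumptions made for illustration.

```python
class PrivacyAwareLossState:
    """Loss-bit counters tied to the current outgoing Connection ID."""

    def __init__(self, connection_id: bytes, n: int = 64) -> None:
        self.n = n  # packets per half of the Q Period
        self._reset(connection_id)

    def _reset(self, connection_id: bytes) -> None:
        self.connection_id = connection_id
        self.q = 0                # sQuare Value restarts (assumed initial 0)
        self.sent_in_block = 0    # packets sent with the current Q value
        self.unreported_loss = 0  # pending losses are discarded

    def change_connection_id(self, new_connection_id: bytes) -> None:
        """Reset all counters when the outgoing Connection ID changes."""
        self._reset(new_connection_id)

    def on_packet_lost(self, sent_with_connection_id: bytes) -> None:
        """Count a loss only if the packet used the current Connection ID."""
        if sent_with_connection_id == self.connection_id:
            self.unreported_loss += 1
```

Losses of packets sent under the old Connection ID are ignored after the change, so the loss signal cannot be used to link the old and new subflows.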
This document makes no request of IANA.
The sQuare Bit was originally specified by Kazuho Oku in early proposals for
loss measurement.
Internet Protocol
Internet Protocol, Version 6 (IPv6) Specification
This document specifies version 6 of the Internet Protocol (IPv6). It obsoletes RFC 2460.
Transport Protocol Path Signals
This document discusses the nature of signals seen by on-path elements examining transport protocols, contrasting implicit and explicit signals. For example, TCP's state machine uses a series of well-known messages that are exchanged in the clear. Because these are visible to network elements on the path between the two nodes setting up the transport connection, they are often used as signals by those network elements. In transports that do not exchange these messages in the clear, on-path network elements lack those signals. Often, the removal of those signals is intended by those moving the messages to confidential channels. Where the endpoints desire that network elements along the path receive these signals, this document recommends explicit signals be used.
Key words for use in RFCs to Indicate Requirement Levels
In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.
QUIC: A UDP-Based Multiplexed and Secure Transport
This document defines the core of the QUIC transport protocol. Accompanying documents describe QUIC's loss detection and congestion control and the use of TLS for key negotiation.
The Impact of Transport Header Confidentiality on Network Operation and Evolution of the Internet
This document describes implications of applying end-to-end encryption at the transport layer. It identifies in-network uses of transport layer header information. It then reviews the implications of developing end-to-end transport protocols that use authentication to protect the integrity of transport information or encryption to provide confidentiality of the transport protocol header and expected implications of transport protocol design and network operation. Since transport measurement and analysis of the impact of network characteristics have been important to the design of current transport protocols, it also considers the impact on transport and application evolution.
Applying GREASE to TLS Extensibility
This document describes GREASE (Generate Random Extensions And Sustain Extensibility), a mechanism to prevent extensibility failures in the TLS ecosystem. It reserves a set of TLS protocol values that may be advertised to ensure peers correctly handle unknown values.
Application-Layer Traffic Optimization (ALTO) Protocol
Applications using the Internet already have access to some topology information of Internet Service Provider (ISP) networks. For example, views to Internet routing tables at Looking Glass servers are available and can be practically downloaded to many network application clients. What is missing is knowledge of the underlying network topologies from the point of view of ISPs. In other words, what an ISP prefers in terms of traffic optimization -- and a way to distribute it. The Application-Layer Traffic Optimization (ALTO) services defined in this document provide network information (e.g., basic network location structure and preferences of network paths) with the goal of modifying network resource consumption patterns while maintaining or improving application performance. The basic information of ALTO is based on abstract maps of a network. These maps provide a simplified view, yet enough information about a network for applications to effectively utilize them. Additional services are built on top of the maps. This document describes a protocol implementing the ALTO services. Although the ALTO services would primarily be provided by ISPs, other entities, such as content service providers, could also provide ALTO services. Applications that could use the ALTO services are those that have a choice to which end points to connect. Examples of such applications are peer-to-peer (P2P) and content delivery networks.