Internet-Draft Collaborative Host/Network Signaling: Us March 2024
Rajagopalan, et al. Expires 5 September 2024
Transport and Services Working Group
Intended Status:
S. Rajagopalan
Cloud Software Group
D. Wing
Cloud Software Group
M. Boucadair
T. Reddy

Signaling Use Cases for Collaborative Traffic Differentiation


Host-to-network (and vice versa) signaling can improve the user experience by informing the network which flows are more important and which packets within a flow are more important without having to disclose the content of the packets being delivered. The differentiated service may be provided at the network (e.g., packet discard preference), the sender (e.g., adaptive transmission or session migration), or through cooperation of both the host and the network.

This document outlines a set of use-cases that highlight the need for a mechanism to share metadata about flows between a host and its network in order to enable different traffic treatment. Such a mechanism is typically implemented using a signaling protocol between the host and a set of trusted network elements.

About This Document

This note is to be removed before publishing as an RFC.

The latest revision of this draft can be found at Status information for this document may be found at

Discussion of this document takes place on the Transport and Services Working Group mailing list (, which is archived at Subscribe at

Source for this draft and an issue tracker can be found at

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 5 September 2024.

1. Introduction

Bandwidth constraints exist predominantly in the access network (e.g., radio access networks). Users served by these networks use various hosts that run various applications, each with different connectivity needs for an optimal user experience. These needs are not fixed; they change over time depending on the application and even on how an application is used (e.g., the user's preferences).

The simple network diagram below marks with a "B" (for Bottleneck) where such bandwidth and performance constraints usually exist. Other bottlenecks may be experienced in segments not shown in the figure, such as interconnection links or the infrastructure that hosts the service (e.g., during flash crowds). A bottleneck may be limited in time and may or may not follow regular patterns.

host --B-- WLAN access point --B-- router ---- router ---- router ---- host

(Segments, left to right: User Network, ISP Network, Transit Network, Content Network)

Complications that are induced by such phenomena may be eliminated by adequate dimensioning and upgrades. However, such upgrades may not always be immediately possible or economically justified.

Complementary mitigations are thus needed to soften these complications by introducing some collaboration between hosts and networks to adjust their behaviors.

For traffic sent in either direction, the network elements that terminate a bandwidth-constraining link (or that are located a few hops away from such a link) can be fed with flow metadata. Such augmentation allows those network elements to make autonomous decisions to prioritize, delay, or drop packets, especially when performing reactive resource management. Absent such metadata, these network elements have no means to guide the enforcement of the reactive resource policy.

There are several challenges with this metadata augmentation:

The metadata signals from a content provider are more likely to be authentic (if adequate authorization and validation are in place), but the metadata signals from other hosts may be "wrong", undesired by the peer host, or maliciously crafted to contain improper metadata. Attempts to automate the identification of content providers have included HTTP "Host" header inspection and TLS SNI inspection, both of which are expected to fail as encrypted SNI and privacy-enhancing proxies become more prevalent.

Another mechanism to authorize metadata signals from a content provider is to configure the ISP equipment with the content network's source IP addresses (or other labels that may be visible on the packets) and provide a differentiated service to the traffic that matches these criteria. However, such an arrangement may have scalability issues. An approach to mitigate these issues is to limit the set of content networks and ISP networks that put these arrangements in place. Such limitations would benefit large players (large ISPs and large content networks) and disadvantage small players (and new players).

A more egalitarian approach would provide the same benefit to all parties -- large and small -- and also provide richer signaling to further improve user experience and metadata interoperability. This would allow all parties to become part of the "Internet fast lane".

The authorization problem exists with technologies as relatively simple as DiffServ, and it persists with many other recently discussed metadata signaling mechanisms, including embedding information in the UDP payload ([I-D.trammell-plus-spec]), UDP options ([I-D.kaippallimalil-tsvwg-media-hdr-wireless]), overloading the IPv6 Flow Label ([I-D.cc-v6ops-wlcg-flow-label-marking]), and Hop-by-Hop Options. One mechanism suggested occasionally is to encrypt or integrity protect the metadata with a key; such a key could be established using a signaling protocol, see Section 6.2.

There is some consensus that applications can benefit from collaborative signaling with the network ([IAB], [ATIS]). This document provides use-cases to further detail the need for such signaling.

2. Scope & Running Experiments

This document does not define a signaling protocol, nor does it take a position on whether a new signaling protocol, an extension to an existing protocol, or more than one signaling protocol is needed.

However, this document provides a reference for understanding the intended benefits of enabling collaboration between hosts and networks. These benefits are yet to be backed up with more evidence. It would be reasonable for the IETF to endorse some experimental work to solicit more feedback and to assess the benefits under various setups.

3. Conventions and Definitions

Intentional Management:

network policy such as a (monthly) bandwidth quota, a bandwidth limit, or quality (delay and/or jitter) assurances.

Reactive Management:

network reactions to congestion events, with very short to very long durations (e.g., varying wireless and mobile air interface conditions).

4. Various Approaches for Collaborative Signaling

Figure 1 depicts examples of approaches to establish channels to convey and share metadata between hosts, networks, and servers.

Metadata exchanges can occur in one direction or in both directions of a flow.

Each approach involves a client, the network(s), and a server:

(1) Proxied Connection: User Data+Metadata is carried over Secure Connection 1 and over Secure Connection 2.

(2) Out-of-band Metadata Sharing: an End-to-End Secure Connection carries User Data; Metadata is carried over Secure Connection 1, and (Optional) Metadata over Secure Connection 2. The diagram also carries a "GLUE CXs" label.

(3) Client-centric Metadata Sharing: Metadata is carried over a Secure Connection; an End-to-End Secure Connection carries User Data+Metadata.

Figure 1: Candidate Signaling Approaches

The client-centric metadata sharing approach is of particular interest because it preserves privacy and also takes advantage of clients having a full view of their available network attachments.

5. Use Cases

5.1. Generic Cases

5.1.1. Priority Between Flows (Inter-Flow) of The Same Host

Certain flows being received by a host (or by an application on a host) are less or more important than other flows of the same host. For example, a host downloading a software update is generally considered less important than another host doing interactive audio/video or gaming. By signaling the relative importance of flows to a network element, the network element can (de-)prioritize those flows to best accommodate the needs of the various applications (on the same host) and of the various hosts on a network.

5.1.2. Priority Within a Flow (Intra-Flow)

Interactive Audio/Video has long been using [RTP] which runs over UDP. As described in Section of [RFC7478], there is value in differentiating between voice, video and data. Today's video streaming is exclusively over TCP but will migrate to QUIC and eventually is likely to support unreliable transport ([RFC9221], [I-D.kpugin-rush]). With unreliable transport of video in RTP or QUIC, it is beneficial to differentiate the important video keyframes from other video frames. Other applications such as gaming and remote desktop also benefit from differentiating their packets to the network.

Many of these flows do not originate from a content provider's network. Thus, the flows originate from an IP address that is not known before connection establishment, so there needs to be a way for the client to authorize the network elements to receive and hopefully to honor the metadata of those packets.

5.2. Detailed Use Cases

5.2.1. Video Streaming

Streaming video contains the occasional key frame ("i-frame") containing a full video frame. These are necessary to rebuild receiver state after loss of delta frames. The key frames are therefore more critical to deliver to the receiver than delta frames.

Streaming video also contains audio frames, which can be encoded separately and thus can be signaled separately. Audio is more critical than video for almost all applications, but its importance (relative to other packets in the flow) is still an application decision. In the example below, audio is the most important (high importance, realtime, reliable), video key frames come next (low importance, realtime, reliable), and both types of video delta frames (P-frame and B-frame) are the least important (low importance, discard, unreliable).

Video Streaming Metadata:

Based on the metadata types listed in [I-D.rwbr-sconepro-flow-metadata], example host-to-network metadata parameters for video streaming are given below.

Table 1: Example Values for Video Streaming Metadata
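+---------------------------+------------+--------------+------------+
| Traffic type              | Importance | PacketNature | PacketType |
+---------------------------+------------+--------------+------------+
| video I-frame (key frame) | low        | realtime     | reliable   |
| video delta P-frame       | low        | discard      | unreliable |
| video delta B-frame       | low        | discard      | unreliable |
| audio                     | high       | realtime     | reliable   |
+---------------------------+------------+--------------+------------+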
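
As a minimal sketch of how a network element might act on the values in Table 1 during congestion, the Python fragment below encodes the table rows and orders queued packets from first-to-drop to last-to-drop. The names FlowMetadata, VIDEO_STREAMING, drop_rank, and drop_order are hypothetical and are not defined by [I-D.rwbr-sconepro-flow-metadata]. The ranking rule itself (packets marked discard and unreliable go first, then other low-importance packets) is an assumption made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowMetadata:
    importance: str     # "low" or "high"
    packet_nature: str  # e.g., "realtime" or "discard"
    packet_type: str    # "reliable" or "unreliable"


# Rows of Table 1, keyed by traffic type.
VIDEO_STREAMING = {
    "video I-frame (key frame)": FlowMetadata("low", "realtime", "reliable"),
    "video delta P-frame": FlowMetadata("low", "discard", "unreliable"),
    "video delta B-frame": FlowMetadata("low", "discard", "unreliable"),
    "audio": FlowMetadata("high", "realtime", "reliable"),
}


def drop_rank(md: FlowMetadata) -> int:
    """Lower rank means dropped earlier (assumed policy, not from the draft)."""
    if md.packet_nature == "discard" and md.packet_type == "unreliable":
        return 0
    if md.importance == "low":
        return 1
    return 2


def drop_order(traffic_types):
    """Sort traffic types from first-to-drop to last-to-drop."""
    return sorted(traffic_types, key=lambda t: drop_rank(VIDEO_STREAMING[t]))


queued = ["audio", "video I-frame (key frame)", "video delta B-frame"]
print(drop_order(queued))
```

With this assumed policy, delta frames are shed before key frames, and audio is shed last.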

5.2.2. Interactive Media

Examples: VoIP, gaming.

Requirement: Signal that the flow needs low jitter and low delay. However, the network can provide low-jitter/low-delay treatment to only a limited number of flows per host, perhaps as few as one. This requires feedback signaling that indicates when low-jitter and low-delay flows are already subscribed to other hosts. In response, the user and the application will likely continue, occasionally re-attempting to obtain the desired quality of service from the network.

In many scenarios a game or VoIP application will want to signal different metadata for the same type of packet in each direction. For example, for a game, video in the server-to-client direction might be more important than audio, whereas in the client-to-server direction input devices (e.g., keystrokes) might be more important than audio.

Both gaming (video in both directions, audio in both directions, input devices from client to server) and interactive audio/video (VoIP, video conference) involve important traffic in both directions and are thus slightly more complicated use-cases than the previous example. Additionally, most Internet service providers constrain upstream bandwidth, so proper packet treatment is critical in the upstream direction.


Based on the metadata types listed in [I-D.rwbr-sconepro-flow-metadata], example host-to-network metadata parameters for interactive media are given below.

Interactive A/V, downstream Metadata:

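Table 2: Example Values for Interactive A/V, downstream
+-------------------+------------+--------------+------------+
| Traffic type      | Importance | PacketNature | PacketType |
+-------------------+------------+--------------+------------+
| video key frame   | low        | realtime     | reliable   |
| video delta frame | low        | discard      | unreliable |
| audio             | high       | realtime     | reliable   |
+-------------------+------------+--------------+------------+

Table 3: Example Values for Interactive A/V, upstream
+-------------------+------------+--------------+------------+
| Traffic type      | Importance | PacketNature | PacketType |
+-------------------+------------+--------------+------------+
| video key frame   | low        | realtime     | reliable   |
| video delta frame | low        | discard      | unreliable |
| audio             | high       | realtime     | reliable   |
+-------------------+------------+--------------+------------+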

Many interactive audio/video applications also support sharing the presenter's screen, file, video, or pictures. During this sharing, the presenter's video is less important but the screen or picture is more important. This change of importance can be conveyed in metadata to the network, as in the table below:

Interactive A/V, upstream Metadata:

Table 4: Example Values for Interactive A/V with picture sharing, upstream
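+-------------------+------------+--------------+------------+
| Traffic type      | Importance | PacketNature | PacketType |
+-------------------+------------+--------------+------------+
| video key frame   | low        | realtime     | reliable   |
| video delta frame | low        | discard      | unreliable |
| audio             | high       | realtime     | reliable   |
| picture sharing   | high       | realtime     | reliable   |
+-------------------+------------+--------------+------------+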

Todo: this section on cooperation needs editing.

5.2.3. Bulk Data Transfer

Examples: backup/restore, software update, RSS feed update, email, printing to a print server

Requirement: Signal the flow as below best-effort.


Table 5
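+--------------+------------+--------------+------------+----------+
| Traffic type | Importance | PacketNature | PacketType | Comments |
+--------------+------------+--------------+------------+----------+
| File copy    | low        | bulk         | reliable   |          |
| Printing     | high       | bulk         | reliable   |          |
+--------------+------------+--------------+------------+----------+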

5.2.4. Mixed Traffic

Examples: Desktop Virtualization, Office software in the cloud (editing local files, typing is interactive while save operation is bulk transfer)

Requirement: The signaled metadata will vary depending on the nature of each packet. With a variety of traffic going through the session, some packets carry interactive traffic while others carry bulk transfer. There can be a combination of reliable and unreliable traffic within the same session through multiple streams. Host-to-network signaling plays a vital role in handling mixed traffic effectively, for ideal user interactivity and network performance.

Example packet metadata for a Desktop Virtualization application (such as Citrix Virtual Apps and Desktops - CVAD) is shown in two tables: client-to-server traffic (Table 6) and server-to-client traffic (Table 7).

Remote Desktop Virtualization Metadata:

Based on the metadata types listed in [I-D.rwbr-sconepro-flow-metadata], example host-to-network metadata parameters for remote desktop virtualization are given below.

Table 6: Example Values for Remote Desktop Virtualization Metadata, client to server
+-------------------------------------------+------------+--------------+------------+
| Traffic type                              | Importance | PacketNature | PacketType |
+-------------------------------------------+------------+--------------+------------+
| User typing                               | high       | realtime     | reliable   |
| Mouse click/End Position                  | high       | realtime     | reliable   |
| Interactive audio                         | high       | keep         | unreliable |
| Authentication - Finger print, smart card | low        | realtime     | reliable   |
| Interactive video key frame               | low        | keep         | unreliable |
| Mouse position tracking                   | low        | discard      | unreliable |
| Interactive video delta frame             | low        | discard      | unreliable |
+-------------------------------------------+------------+--------------+------------+

Comments for Table 6:

Mouse click/End Position: The start and end points of a pointer movement are vital to ensure that the user action is completed correctly. So, these points have to be transmitted reliably with real-time priority. **

Interactive video key frame: Video key frames form the base frames of a video upon which the next 'n' video updates are applied. These frames are hence critical, and without them the video would not be coherent until the next critical frame is received. Retransmits of these are harmful to the UX. ***

Mouse position tracking: When the pointer is moved from one point to another, the coordinates of the pointer between the two points can be lost without much impact on the UX, as long as the start and end points arrive. This ensures that the user action is completed, even if the experience seems glitchy.

Table 7: Example Values for Remote Desktop Virtualization Metadata, server to client
+---------------------------------------------------+------------+--------------+------------+
| Traffic type                                      | Importance | PacketNature | PacketType |
+---------------------------------------------------+------------+--------------+------------+
| Glyph critical                                    | high       | realtime     | reliable   |
| Interactive (or streaming) audio                  | high       | keep         | unreliable |
| Haptic feedback                                   | high       | discard      | unreliable |
| Interactive (or streaming) video key frame        | low        | keep         | unreliable |
| File copy                                         | low        | bulk         | reliable   |
| Interactive (or streaming) video predictive frame | low        | discard      | unreliable |
| Glyph smoothing                                   | low        | discard      | unreliable |
+---------------------------------------------------+------------+--------------+------------+

Comments for Table 7:

Glyph critical: The frames that form the base for the image are more critical and need to be transmitted as reliably as possible. Retransmits of these are harmful to the UX. **

Haptic feedback: Virtualized haptic feedback is real-time and of high importance, but feedback that is delivered late is of no use. So, dropping the packet altogether and not retransmitting it makes more sense.

Interactive (or streaming) video key frame: Video key frames form the base frames of a video upon which the next 'n' video updates are applied. These frames are hence critical, and without them the video would not be coherent until the next critical frame is received. Retransmits of these are harmful to the UX. ***

Interactive (or streaming) video predictive frame: Video predictive frames can be lost, which results in a minor glitch but does not compromise the user activity; the video is still coherent and useful. The reception of a subsequent video key frame mitigates the loss in quality caused by lost predictive frames.

Glyph smoothing: The smoothing elements of a glyph can be lost and the result is still a recognizable image, although of lesser quality. Hence, these can be marked as loss tolerant because the user action is still completed with a small compromise to the UX. Moreover, the reception of the next glyph critical frame mitigates the loss in quality caused by lost glyph smoothing elements.

*** A video key frame should be handled differently by the network depending on whether it belongs to a streaming application or to a remote desktop application. A video streaming application carries only video and audio. In contrast, a remote desktop application might be playing a video and its associated audio while at the same time the user is editing a document. The user's keystrokes and the resulting glyphs need to be prioritized over the video, lest the user think their inputs are being ignored (and type the same characters again). Hence, the values differ for the same kind of traffic when the application differs.
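
The point that one traffic type can map to different metadata depending on the application can be sketched as a lookup keyed by both application and traffic type. This is a minimal illustration in Python: the METADATA table and the metadata_for function are hypothetical names, the application labels are invented for the example, and the tuples simply copy the (Importance, PacketNature, PacketType) values from Table 1, Table 6, and Table 7.

```python
# Hypothetical lookup table; values are copied from Tables 1, 6, and 7.
# Each value is (Importance, PacketNature, PacketType).
METADATA = {
    ("video streaming", "video key frame"): ("low", "realtime", "reliable"),
    ("remote desktop", "video key frame"): ("low", "keep", "unreliable"),
    ("remote desktop", "user typing"): ("high", "realtime", "reliable"),
    ("remote desktop", "glyph critical"): ("high", "realtime", "reliable"),
}


def metadata_for(application: str, traffic_type: str):
    """Return the metadata tuple for one packet class of one application."""
    return METADATA[(application, traffic_type)]


streaming = metadata_for("video streaming", "video key frame")
desktop = metadata_for("remote desktop", "video key frame")

# The same traffic type carries different metadata in the two applications.
print(streaming != desktop)
```

Keying on the application as well as the traffic type keeps the application in control of relative importance, which is consistent with the statement that importance is an application decision.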

5.2.5. Assisted Offload

There are cases (e.g., crises) where "normal" network resources cannot be used at maximum and, thus, a network would seek to reduce or offload some of the traffic during these events -- often called 'reactive traffic policy'. An example of such a use case is a cellular network that is overly used (with radio resources exhausted) while alternative network attachments are available to the host.

Network-to-host signals are useful to put in place adequate traffic distribution policies (e.g., prefer the use of alternate paths, offload a network).
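
A minimal sketch of how a host might react to such a network-to-host signal is shown below in Python. The Attachment type, its offload_requested flag, and the choose_attachment function are hypothetical; this document does not define the signal's format. The list order is assumed to express the host's own preference, and the fallback to the first attachment when every network asks for offload is an illustrative choice.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Attachment:
    name: str
    # Hypothetical flag set when the network signals that it wants
    # traffic moved elsewhere (reactive traffic policy).
    offload_requested: bool = False


def choose_attachment(attachments: List[Attachment]) -> Optional[Attachment]:
    """Pick the first preferred attachment that has not requested offload."""
    for attachment in attachments:
        if not attachment.offload_requested:
            return attachment
    # Every network asked for offload: stay on the most preferred one.
    return attachments[0] if attachments else None


paths = [Attachment("cellular", offload_requested=True), Attachment("wlan")]
print(choose_attachment(paths).name)
```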

6. Operational Considerations

6.1. Abuse and Constraints

It is important that not every flow be prioritized; otherwise, the network devolves into the best-effort network that existed prior to metadata signaling. It is a requirement that mechanisms exist to prevent this occurrence.

Such a mechanism might be simple. For example, a cellular network might allow one flow from a subscriber to declare itself as important; other flows from that subscriber are denied attempts to prioritize themselves. The mechanism might be more complex, where authentication and authorization are performed by an enterprise network which, itself, decides which flows are important based on its policy, and only the enterprise network communicates flow priorities to the ISP network. The enterprise might prioritize certain users (e.g., IT staff), certain equipment (e.g., audio/video equipment in a conference room), or anything else its policies call for.
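
The simple variant described above (a fixed number of prioritized flows per subscriber) can be sketched as follows in Python. The PriorityAdmission class and its method names are hypothetical, and the default limit of one flow mirrors the cellular example; a real deployment would also need authentication of the subscriber, which is out of scope for this sketch.

```python
from collections import defaultdict


class PriorityAdmission:
    """Tracks which flows each subscriber has marked as important."""

    def __init__(self, max_flows_per_subscriber: int = 1):
        self.max_flows = max_flows_per_subscriber
        self.active = defaultdict(set)  # subscriber -> set of flow ids

    def request(self, subscriber: str, flow_id: str) -> bool:
        """Return True if the flow is (or already was) admitted as important."""
        flows = self.active[subscriber]
        if flow_id in flows:
            return True
        if len(flows) >= self.max_flows:
            return False  # quota used up: deny the prioritization attempt
        flows.add(flow_id)
        return True

    def release(self, subscriber: str, flow_id: str) -> None:
        """Free the slot when the prioritized flow ends."""
        self.active[subscriber].discard(flow_id)


admission = PriorityAdmission()
print(admission.request("subscriber-1", "voip-call"))    # first flow admitted
print(admission.request("subscriber-1", "game-stream"))  # second flow denied
```

The denial result is also the natural place to return feedback to the host, so that the application knows its prioritization request was not honored.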

6.2. Key Establishment

Various proposals have suggested establishing a key to validate per-packet metadata or to decrypt per-packet metadata. However, most proposals have not specified how this key would be established. A signaling protocol from the receiving host to its ISP could establish such a key. The host can then convey the key to the sending host to use to integrity protect or encrypt the per-packet metadata.

  • Note: The CPU overhead of validating or decrypting such per-packet metadata needs to be carefully considered (and further assessed via experiments) by the signaling protocol proposing such keying. Also, the required operational setup should be documented.
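
As a rough sketch of integrity protection with an established key, the Python fragment below computes and checks an HMAC-SHA256 tag over per-packet metadata using the standard hmac and hashlib modules. Several things here are assumptions made for illustration: the key is generated locally rather than delivered by a signaling protocol, the textual metadata encoding is invented, and the function names protect and verify are hypothetical. No proposal cited in this document is known to use this exact construction.

```python
import hashlib
import hmac
import secrets


def protect(key: bytes, metadata: bytes) -> bytes:
    """Compute an integrity tag over the metadata (HMAC-SHA256)."""
    return hmac.new(key, metadata, hashlib.sha256).digest()


def verify(key: bytes, metadata: bytes, tag: bytes) -> bool:
    """Check the tag in constant time, as a network element might."""
    return hmac.compare_digest(protect(key, metadata), tag)


# Stand-in for a key that a signaling protocol would establish.
key = secrets.token_bytes(32)

# Hypothetical metadata encoding, for illustration only.
metadata = b"importance=high;nature=realtime;type=reliable"
tag = protect(key, metadata)

print(verify(key, metadata, tag))
print(verify(key, b"importance=low;nature=discard;type=unreliable", tag))
```

Each verification costs one HMAC computation per packet, which is the kind of overhead the note above asks to be assessed experimentally.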

6.3. Metadata Version/Capability Exchange

The sender has to convey metadata in a way that is understood by the various network elements on the path -- each of which might be operated by different entities and have different capabilities. For example, the Wi-Fi access point might be operated by an enterprise network, hotel, or home user, whereas the upstream router is operated by the ISP. Each of those might support different versions of the same metadata, or might need the metadata expressed in different ways.

The signaling protocol would provide a way to learn the needs of those networks, and provide metadata signaling satisfying most or all of their needs.
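
One way to read "satisfying most or all of their needs" is to pick the metadata version understood by the largest number of on-path elements. The Python sketch below does that under stated assumptions: each element is assumed to advertise a set of integer version numbers, the element names are invented, and select_version is a hypothetical helper. Ties are broken in favor of the higher version, which is an arbitrary choice for the example.

```python
from collections import Counter
from typing import Dict, Optional, Set


def select_version(capabilities: Dict[str, Set[int]]) -> Optional[int]:
    """Pick the version supported by the most elements (ties: highest)."""
    support = Counter(
        version
        for versions in capabilities.values()
        for version in versions
    )
    if not support:
        return None  # no element advertised any version
    return max(support, key=lambda v: (support[v], v))


# Hypothetical advertisements learned through the signaling protocol.
path = {
    "wifi-access-point": {1, 2},
    "isp-router": {2, 3},
}
print(select_version(path))
```

When no single version is understood by every element, a sender could instead express the metadata separately for each element, at the cost of additional per-packet or per-flow state.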

7. Requirements Summary

TODO summary.

8. Security Considerations

TODO Security

9. IANA Considerations

This document has no IANA actions.

10. Informative References

[ATIS] "Content Classification for Traffic Optimization".

[I-D.cc-v6ops-wlcg-flow-label-marking] Carder, D. W., Chown, T., McKee, S., and M. Babik, "Use of the IPv6 Flow Label for WLCG Packet Marking", Work in Progress, Internet-Draft, draft-cc-v6ops-wlcg-flow-label-marking-02.

[I-D.kaippallimalil-tsvwg-media-hdr-wireless] Kaippallimalil, J., Gundavelli, S., and S. Dawkins, "Media Handling Considerations for Wireless Networks", Work in Progress, Internet-Draft, draft-kaippallimalil-tsvwg-media-hdr-wireless-04.

[I-D.kpugin-rush] Pugin, K., Frindell, A., Ferret, J. C., and J. Weissman, "RUSH - Reliable (unreliable) streaming protocol", Work in Progress, Internet-Draft, draft-kpugin-rush-02.

[I-D.rwbr-sconepro-flow-metadata] Rajagopalan, S., Wing, D., Boucadair, M., and T. Reddy.K, "Flow Metadata for Collaborative Host/Network Signaling", Work in Progress, Internet-Draft, draft-rwbr-sconepro-flow-metadata-00.

[I-D.trammell-plus-spec] Trammell, B. and M. Kühlewind, "Path Layer UDP Substrate Specification", Work in Progress, Internet-Draft, draft-trammell-plus-spec-01.

[IAB] Arkko, J., Hardie, T., Pauly, T., and M. Kühlewind, "Considerations on Application - Network Collaboration Using Path Signals", RFC 9419, DOI 10.17487/RFC9419.

[RFC7478] Holmberg, C., Hakansson, S., and G. Eriksson, "Web Real-Time Communication Use Cases and Requirements", RFC 7478, DOI 10.17487/RFC7478.

[RFC9221] Pauly, T., Kinnear, E., and D. Schinazi, "An Unreliable Datagram Extension to QUIC", RFC 9221, DOI 10.17487/RFC9221.

[RTP] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550.


TODO acknowledge.

Authors' Addresses

Sridharan Rajagopalan
Cloud Software Group Holdings, Inc.
United States of America
Dan Wing
Cloud Software Group Holdings, Inc.
United States of America
Mohamed Boucadair
Tirumaleswar Reddy