The CONNECT-UDP HTTP Method

The CONNECT-UDP HTTP Method Google LLC

1600 Amphitheatre Parkway Mountain View, California 94043 United States of America dschinazi.ietf@gmail.com

Internet-Draft This document describes the CONNECT-UDP HTTP method. CONNECT-UDP is similar to the HTTP CONNECT method, but it uses UDP instead of TCP. Discussion of this work is encouraged to happen on the MASQUE IETF mailing list or on the GitHub repository which contains the draft: . Discussion Venues Source for this draft and an issue tracker can be found at .

Introduction This document describes the CONNECT-UDP HTTP method. CONNECT-UDP is similar to the HTTP CONNECT method (see section 4.3.6 of ), but it uses UDP instead of TCP . Discussion of this work is encouraged to happen on the MASQUE IETF mailing list or on the GitHub repository which contains the draft: .

Conventions and Definitions The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 when, and only when, they appear in all capitals, as shown here. In this document, we use the term "proxy" to refer to the HTTP server that opens the UDP socket and responds to the CONNECT-UDP request. If there are HTTP intermediaries (as defined in Section 2.3 of ) between the client and the proxy, those are referred to as "intermediaries" in this document.

Supported HTTP Versions The CONNECT-UDP method is defined for all versions of HTTP. When the HTTP version used runs over QUIC , UDP payloads can be sent over QUIC DATAGRAM frames . Otherwise they are sent on the stream where the CONNECT-UDP request was made. Note that, when the HTTP version in use does not support multiplexing streams (such as HTTP/1.1), then any reference to "stream" in this document is meant to represent the entire connection.

The CONNECT-UDP Method The CONNECT-UDP method requests that the recipient establish a tunnel over a single HTTP stream to the destination origin server identified by the request-target and, if successful, thereafter restrict its behavior to blind forwarding of packets, in both directions, until the tunnel is closed. Tunnels are commonly used to create an end-to-end virtual connection, which can then be secured using QUIC or another protocol running over UDP. The request-target of a CONNECT-UDP request is a URI which uses the "masque" scheme and an immutable path of "/". For example: When using HTTP/2 or later, CONNECT-UDP requests use HTTP pseudo-headers with the following requirements:

The ":method" pseudo-header field is set to "CONNECT-UDP".
The ":scheme" pseudo-header field is set to "masque".
The ":path" pseudo-header field is set to "/".
The ":authority" pseudo-header field contains the host and port to connect to (similar to the authority-form of the request-target of CONNECT requests; see , Section 5.3).

A CONNECT-UDP request that does not conform to these restrictions is malformed (see , Section 8.1.2.6). The recipient proxy establishes a tunnel by directly opening a UDP socket to the request-target. Any 2xx (Successful) response indicates that the proxy has opened a socket to the request-target and is willing to proxy UDP payloads. Any response other than a successful response indicates that the tunnel has not yet been formed. A proxy MUST NOT send any Transfer-Encoding or Content-Length header fields in a 2xx (Successful) response to CONNECT-UDP. A client MUST treat a response to CONNECT-UDP containing any Content-Length or Transfer-Encoding header fields as malformed. A payload within a CONNECT-UDP request message has no defined semantics; a CONNECT-UDP request with a non-empty payload is malformed. Note that the CONNECT-UDP stream is used to convey UDP packets, but they are not semantically part of the request or response themselves. Responses to the CONNECT-UDP method are not cacheable.

Datagram Encoding of Proxied UDP Packets When the HTTP connection supports HTTP/3 datagrams , UDP packets can be encoded using QUIC DATAGRAM frames. This support is ascertained by checking the received value of the H3_DATAGRAM SETTINGS Parameter. If the client has both sent and received the H3_DATAGRAM SETTINGS Parameter with value 1 on this connection, it SHOULD attempt to use HTTP/3 datagrams. This is accomplished by requesting a datagram flow identifier from the flow identifier allocation service . That service generates an even flow identifier, and the client sends it to the proxy by using the "Datagram-Flow-Id" header; see . A CONNECT-UDP request with an odd flow identifier is malformed. The proxy that is creating the UDP socket to the destination responds to the CONNECT-UDP request with a 2xx (Successful) response, and indicates it supports datagram encoding by echoing the "Datagram-Flow-Id" header. Once the client has received the "Datagram-Flow-Id" header on the successful response, it knows that it can use the HTTP/3 datagram encoding to send proxied UDP packets for this particular request. It then encodes the payload of UDP datagrams into the payload of HTTP/3 datagrams. Is the CONNECT-UDP response does not carry the "Datagram-Flow-Id" header, then the datagram encoding is not available for this request. A CONNECT-UDP response that carries the "Datagram-Flow-Id" header but with a different flow identifier than the one sent on the request is malformed. When the proxy processes a new CONNECT-UDP request, it MUST ensure that the datagram flow identifier is not equal to flow identifiers from other requests: if it is, the proxy MUST reject the request with a 4xx (Client Error) status code. Extensions MAY weaken or remove this requirement. Clients MAY optimistically start sending proxied UDP packets before receiving the response to its CONNECT-UDP request, noting however that those may not be processed by the proxy if it responds to the CONNECT-UDP request with a failure or without echoing the "Datagram-Flow-Id" header, or if the datagrams arrive before the CONNECT-UDP request. Note that a proxy can send the H3_DATAGRAM SETTINGS Parameter with a value of 1 while disabling datagrams on a particular request by not echoing the "Datagram-Flow-Id" header. If the proxy does this, it MUST NOT treat receipt of datagrams as an error, because the client could have sent them optimistically before receiving the response. In this scenario, the proxy MUST discard those datagrams. Extensions to CONNECT-UDP MAY leverage parameters on the "Datagram-Flow-Id" header (parameters are defined in Section 3.1.2 of ). Proxies MUST NOT echo parameters on the "Datagram-Flow-Id" header if it does not understand their semantics.

Stream Chunks The bidirectional stream that the CONNECT-UDP request was sent on is a sequence of CONNECT-UDP Stream Chunks, which are defined as a sequence of type-length-value tuples using the following format (using the notation from the "Notational Conventions" section of ):

CONNECT-UDP Stream Format

CONNECT-UDP Stream Chunk Format

CONNECT-UDP Stream Chunk Type:: A variable-length integer indicating the Type of the CONNECT-UDP Stream Chunk. Endpoints that receive a chunk with an unknown CONNECT-UDP Stream Chunk Type MUST silently skip over that chunk.
CONNECT-UDP Stream Chunk Length:: The length of the CONNECT-UDP Stream Chunk Value field following this field. Note that this field can have a value of zero.
CONNECT-UDP Stream Chunk Value:: The payload of this chunk. Its semantics are determined by the value of the CONNECT-UDP Stream Chunk Type field.

Stream Encoding of Proxied UDP Packets CONNECT-UDP Stream Chunks can be used to convey UDP payloads, by using a CONNECT-UDP Stream Chunk Type of UDP_PACKET (value 0x00). The payload of UDP packets is encoded in its unmodified entirety in the CONNECT-UDP Stream Chunk Value field. This is necessary when the version of HTTP in use does not support QUIC DATAGRAM frames, but MAY also be used when datagrams are supported. Note that empty UDP payloads are allowed.

Proxy Handling Unlike TCP, UDP is connection-less. The proxy that opens the UDP socket has no way of knowing whether the destination is reachable. Therefore it needs to respond to the CONNECT-UDP request without waiting for a TCP SYN-ACK. Proxies can use connected UDP sockets if their operating system supports them, as that allows the proxy to rely on the kernel to only send it UDP packets that match the correct 5-tuple. If the proxy uses a non-connected socket, it MUST validate the IP source address and UDP source port on received packets to ensure they match the client's CONNECT-UDP request. Packets that do not match MUST be discarded by the proxy. The lifetime of the socket is tied to the CONNECT-UDP stream. The proxy MUST keep the socket open while the CONNECT-UDP stream is open. Proxies MAY choose to close sockets due to a period of inactivity, but they MUST close the CONNECT-UDP stream before closing the socket.

HTTP Intermediaries HTTP/3 DATAGRAM flow identifiers are specific to a given HTTP/3 connection. However, in some cases, an HTTP request may travel across multiple HTTP connections if there are HTTP intermediaries involved; see Section 2.3 of . Intermediaries that support both CONNECT-UDP and HTTP/3 datagrams MUST negotiate flow identifiers separately on the client-facing and server-facing connections. This is accomplished by having the intermediary parse the "Datagram-Flow-Id" header on all CONNECT-UDP requests it receives, and sending the same value in the "Datagram-Flow-Id" header on the response. The intermediary then ascertains whether it can use datagrams on the server-facing connection. If they are supported (as indicated by the H3_DATAGRAM SETTINGS parameter), the intermediary uses its own flow identifier allocation service to allocate a flow identifier for the server-facing connection, and waits for the server's reply to see if the server sent the "Datagram-Flow-Id" header on the response. The intermediary then translates datagrams between the two connections by using the flow identifier specific to that connection. An intermediary MAY also choose to use datagrams on only one of the two connections, and translate between datagrams and streams.

Performance Considerations Proxies SHOULD strive to avoid increasing burstiness of UDP traffic: they SHOULD NOT queue packets in order to increase batching. When the protocol running over UDP that is being proxied uses congestion control (e.g., ), the proxied traffic will incur at least two nested congestion controllers. This can reduce performance but the underlying HTTP connection MUST NOT disable congestion control unless it has an out-of-band way of knowing with absolute certainty that the inner traffic is congestion-controlled. When the protocol running over UDP that is being proxied uses loss recovery (e.g., ), and the underlying HTTP connection runs over TCP, the proxied traffic will incur at least two nested loss recovery mechanisms. This can reduce performance as both can sometimes independently retransmit the same data. To avoid this, HTTP/3 datagrams SHOULD be used.

Security Considerations There are significant risks in allowing arbitrary clients to establish a tunnel to arbitrary servers, as that could allow bad actors to send traffic and have it attributed to the proxy. Proxies that support CONNECT-UDP SHOULD restrict its use to authenticated users. Because the CONNECT method creates a TCP connection to the target, the target has to indicate its willingness to accept TCP connections by responding with a TCP SYN-ACK before the proxy can send it application data. UDP doesn't have this property, so a CONNECT-UDP proxy could send more data to an unwilling target than a CONNECT proxy. However, in practice denial of service attacks target open TCP ports so the TCP SYN-ACK does not offer much protection in real scenarios. Proxies MUST NOT introspect the contents of UDP payloads as that would lead to ossification of UDP-based protocols by proxies.

IANA Considerations

HTTP Method This document will request IANA to register "CONNECT-UDP" in the HTTP Method Registry (IETF review) maintained at <>.

URI Scheme Registration This document will request IANA to register the URI scheme "masque". The syntax definition below uses Augmented Backus-Naur Form (ABNF) . The definitions of "host" and "port" are adopted from . The syntax of a MASQUE URI is: The "host" and "port" component MUST NOT be empty, and the "port" component MUST NOT be 0.

Stream Chunk Type Registration This document will request IANA to create a "CONNECT-UDP Stream Chunk Type" registry. This registry governs a 62-bit space, and follows the registration policy for QUIC registries as defined in . In addition to the fields required by the QUIC policy, registrations in this registry MUST include the following fields:

Type:: A short mnemonic for the type.
Description:: A brief description of the type semantics, which MAY be a summary if a specification reference is provided.

The initial contents of this registry are: Each value of the format 37 * N + 23 for integer values of N (that is, 23, 60, 97, ...) are reserved; these values MUST NOT be assigned by IANA and MUST NOT appear in the listing of assigned values.

Normative References Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content The Hypertext Transfer Protocol (HTTP) is a stateless \%application- level protocol for distributed, collaborative, hypertext information systems. This document defines the semantics of HTTP/1.1 messages, as expressed by request methods, request header fields, response status codes, and response header fields, along with the payload of messages (metadata and body content) and mechanisms for content negotiation. User Datagram Protocol Transmission Control Protocol Key words for use in RFCs to Indicate Requirement Levels In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements. Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words RFC 2119 specifies common key words that may be used in protocol specifications. This document aims to reduce the ambiguity by clarifying that only UPPERCASE usage of the key words have the defined special meanings. QUIC: A UDP-Based Multiplexed and Secure Transport This document defines the core of the QUIC transport protocol. QUIC provides applications with flow-controlled streams for structured communication, low-latency connection establishment, and network path migration. QUIC includes security measures that ensure confidentiality, integrity, and availability in a range of deployment circumstances. Accompanying documents describe the integration of TLS for key negotiation, loss detection, and an exemplary congestion control algorithm. DO NOT DEPLOY THIS VERSION OF QUIC DO NOT DEPLOY THIS VERSION OF QUIC UNTIL IT IS IN AN RFC. This verion is still a work in progress. For trial deployments, please use earlier versions. Note to Readers Discussion of this draft takes place on the QUIC working group mailing list (quic@ietf.org (mailto:quic@ietf.org)), which is archived at https://mailarchive.ietf.org/arch/search/?email_list=quic Working Group information can be found at https://github.com/quicwg; source code and issues list for this draft can be found at https://github.com/quicwg/base-drafts/labels/-transport. An Unreliable Datagram Extension to QUIC This document defines an extension to the QUIC transport protocol to add support for sending and receiving unreliable datagrams over a QUIC connection. Discussion of this work is encouraged to happen on the QUIC IETF mailing list quic@ietf.org (mailto:quic@ietf.org) or on the GitHub repository which contains the draft: https://github.com/quicwg/ datagram (https://github.com/quicwg/datagram). Uniform Resource Identifier (URI): Generic Syntax A Uniform Resource Identifier (URI) is a compact sequence of characters that identifies an abstract or physical resource. This specification defines the generic URI syntax and a process for resolving URI references that might be in relative form, along with guidelines and security considerations for the use of URIs on the Internet. The URI syntax defines a grammar that is a superset of all valid URIs, allowing an implementation to parse the common components of a URI reference without knowing the scheme-specific requirements of every possible identifier. This specification does not define a generative grammar for URIs; that task is performed by the individual specifications of each URI scheme. [STANDARDS-TRACK] Hypertext Transfer Protocol Version 2 (HTTP/2) This specification describes an optimized expression of the semantics of the Hypertext Transfer Protocol (HTTP), referred to as HTTP version 2 (HTTP/2). HTTP/2 enables a more efficient use of network resources and a reduced perception of latency by introducing header field compression and allowing multiple concurrent exchanges on the same connection. It also introduces unsolicited push of representations from servers to clients. This specification is an alternative to, but does not obsolete, the HTTP/1.1 message syntax. HTTP's existing semantics remain unchanged. Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing The Hypertext Transfer Protocol (HTTP) is a stateless application-level protocol for distributed, collaborative, hypertext information systems. This document provides an overview of HTTP architecture and its associated terminology, defines the "http" and "https" Uniform Resource Identifier (URI) schemes, defines the HTTP/1.1 message syntax and parsing requirements, and describes related security concerns for implementations. Using QUIC Datagrams with HTTP/3 The QUIC DATAGRAM extension provides application protocols running over QUIC with a mechanism to send unreliable data while leveraging the security and congestion-control properties of QUIC. However, QUIC DATAGRAM frames do not provide a means to demultiplex application contexts. This document defines how to use QUIC DATAGRAM frames when the application protocol running over QUIC is HTTP/3 by adding an identifier at the start of the frame payload. This allows HTTP messages to convey related information using unreliable DATAGRAM frames, ensuring those frames are properly associated with an HTTP message. Discussion of this work is encouraged to happen on the MASQUE IETF mailing list (masque@ietf.org (mailto:masque@ietf.org)) or on the GitHub repository which contains the draft: https://github.com/DavidSchinazi/draft-h3-datagram. Structured Field Values for HTTP This document describes a set of data types and associated algorithms that are intended to make it easier and safer to define and handle HTTP header and trailer fields, known as "Structured Fields", "Structured Headers", or "Structured Trailers". It is intended for use by specifications of new HTTP fields that wish to use a common syntax that is more restrictive than traditional HTTP field values. Augmented BNF for Syntax Specifications: ABNF Internet technical specifications often need to define a formal syntax. Over the years, a modified version of Backus-Naur Form (BNF), called Augmented BNF (ABNF), has been popular among many Internet specifications. The current specification documents ABNF. It balances compactness and simplicity with reasonable representational power. The differences between standard BNF and ABNF involve naming rules, repetition, alternatives, order-independence, and value ranges. This specification also supplies additional rule definitions and encoding for a core lexical analyzer of the type common to several Internet specifications. [STANDARDS-TRACK]

Acknowledgments This document is a product of the MASQUE Working Group, and the author thanks all MASQUE enthusiasts for their contibutions. This proposal was inspired directly or indirectly by prior work from many people. In particular, the author would like to thank Eric Rescorla for suggesting to use an HTTP method to proxy UDP. Thanks to Lucas Pardue for their inputs on this document.