Internet-Draft                                            M. Milutinovic
Expires: Mar 16, 2020                                        UC Berkeley
Intended status: Proposed Standard                             M. Toomim
                                                       Invisible College
                                                              B. Bellomy
                                                       Invisible College
                                                            Nov 18, 2019


                              Range Patch
                  draft-toomim-httpbis-range-patch-00

Abstract

   A uniform approach for expressing changes to state over HTTP.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.  The list of current Internet-Drafts is at
   http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   https://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   https://www.ietf.org/shadow.html

Table of Contents

   1. Introduction
   2. Range Patch
      2.1. Multiple Range Patches
      2.2. Stand-Alone Range Patch
      2.3. URI Fragment Identifiers
   3. Range Units
      3.1. Bytes Range Unit
      3.2. JSON Range Unit
      3.3. Lines Range Unit
   4. IANA Considerations
      4.1. Range Unit Registrations
      4.2. The +patch Structured Syntax Suffix
   5. Checking Capabilities
   6. Race Conditions
   7. Security Considerations
   8. Conventions
   9. Copyright Notice
   10. References
      10.1. Normative References
      10.2. Informative References

1. Introduction

   This document describes a uniform approach for expressing changes to
   state over HTTP.  It builds upon [RFC7233] and details how patches
   can be defined using range units, ranges, and content.

   Any patch is expressed in the form:

      "range X in units Y of the data was replaced with content Z"

   A range unit defines how the original content (the content being
   patched) is parsed to locate the region being patched, and how that
   region is then replaced with new content.

2. Range Patch

   [RFC7233] effectively already defines how a patch operating on byte
   units can be represented over HTTP, using the Content-Range,
   Content-Type, and Content-Length HTTP headers.

   Example:

      HTTP/1.1 206 Partial Content
      Date: Wed, 15 Nov 1995 06:25:24 GMT
      Last-Modified: Wed, 15 Nov 1995 04:58:08 GMT
      Content-Range: bytes 21010-47021/47022
      Content-Length: 26012
      Content-Type: image/gif

      ... 26012 bytes of partial image data ...

   The same approach can be used to describe a range inside content
   interpreted not as bytes but, for example, as JSON [RFC8259] or a
   JSON-compatible structure.  We define such a JSON range unit in
   Section 3.2.
   For example, given the following JSON document:

      {"foo":
          {"bar": [
              {"some": "thing"},
              {"no": "thing"},
              {"mo": "re"},
              {"baz": {"1": {"two": "tree"}}}
          ]}}

   One might make the following request:

      GET /api/document/1 HTTP/1.1
      Host: example.com
      Accept: application/json
      Range: json=/foo/bar/3/baz

   And receive the following response:

      HTTP/1.1 206 Partial Content
      Date: Thu, 31 Oct 2019 07:51:08 GMT
      Last-Modified: Fri, 18 Oct 2019 17:44:39 GMT
      Content-Range: json /foo/bar/3/baz
      Content-Length: 22
      Content-Type: application/json

      {"1": {"two": "tree"}}

   [RFC7233] defines and allows a Range header only for the GET request
   method.  In this document, we define the behavior for other request
   methods.  Which methods a given resource supports, and which of those
   methods accept range patches as defined in this document, is left to
   the server to define.

   When issuing a non-GET request to a resource, a range patch can be
   provided using the Range header field.

      PATCH /api/image/1 HTTP/1.1
      Host: example.com
      Range: bytes=21010-47021
      Content-Length: 26012
      Content-Type: image/gif

      ... 26012 bytes of new partial image data ...

   And for JSON:

      PATCH /api/document/1 HTTP/1.1
      Host: example.com
      Range: json=/foo/bar/3/baz
      Content-Length: 25
      Content-Type: application/json

      {"2": {"three": "flour"}}

   A patch with empty contents corresponds to deletion of the existing
   content at the specified range.  A patch with a zero-length range but
   non-empty contents corresponds to inserting content immediately
   before the location of the zero-length range.  A patch with non-empty
   contents at a non-zero-length range corresponds to replacing the
   existing content at the range with new content.

   When a server supports the Range header with non-GET requests, it
   MUST NOT ignore the Range header when used with a non-GET request.
   When a server does not support the Range header with non-GET
   requests, it SHOULD generate a 416 (Range Not Satisfiable) or a 400
   (Bad Request) response when a non-GET request with a Range header is
   made.
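
   As an illustration only, the deletion, insertion, and replacement
   cases described above can be sketched for the bytes range unit in
   Python.  The names parse_bytes_range and apply_bytes_patch are
   hypothetical and are not defined by this document.  The sketch
   assumes the inclusive "first-last" form of [RFC7233] plus the
   zero-length forms of Section 3.1, and does not handle suffix or
   open-ended ranges.

```python
import re
from typing import Tuple


def parse_bytes_range(spec: str, length: int) -> Tuple[int, int]:
    """Turn a bytes range spec into a half-open (start, end) slice.

    Hypothetical helper: handles "first-last" (inclusive, RFC 7233),
    a bare offset "N" (zero-length range before byte N), and "-0"
    (zero-length range after the last byte).
    """
    if spec == "-0":
        return length, length
    match = re.fullmatch(r"(\d+)(?:-(\d+))?", spec)
    if match is None:
        raise ValueError(f"unsupported bytes range: {spec!r}")
    start = int(match.group(1))
    if match.group(2) is None:
        end = start                      # zero-length range: insertion point
    else:
        end = int(match.group(2)) + 1    # last-byte-pos is inclusive
    if not 0 <= start <= end <= length:
        raise ValueError(f"range {spec!r} not satisfiable for {length} bytes")
    return start, end


def apply_bytes_patch(data: bytes, spec: str, content: bytes) -> bytes:
    """Replace the bytes selected by spec with content.

    Empty content deletes, a zero-length range inserts, and anything
    else replaces.
    """
    start, end = parse_bytes_range(spec, len(data))
    return data[:start] + content + data[end:]
```

   With data b"hello world", the range "0-4" with content b"HELLO"
   replaces the first word, the range "5" with content b"," inserts a
   comma before the space, the range "5-10" with empty content deletes
   the tail, and the range "-0" appends.
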
   Proxies SHOULD NOT drop the Range header for non-GET requests.  To
   ensure correct handling of non-GET requests with the Range header, a
   requester can check the server's support for it as described in
   Section 5.

2.1. Multiple Range Patches

   Multiple range patches can also be combined in one request.  This is
   done by reusing the [RFC7233] mechanism for transferring multiple
   parts in a multipart/byteranges payload, as described in Section 4.1
   of [RFC7233].

   When issuing a non-GET request to a resource, multiple range patches
   can be provided as well:

      PATCH /api/document/1 HTTP/1.1
      Host: example.com
      Content-Length: 200
      Content-Type: multipart/byteranges; boundary=THIS_STRING_SEPARATES

      --THIS_STRING_SEPARATES
      Content-Type: application/json
      Range: json=/foo/bar/2/mo

      42
      --THIS_STRING_SEPARATES
      Content-Type: application/json
      Range: json=/foo/bar/1/no

      "person"

2.2. Stand-Alone Range Patch

   When range patches are transmitted outside of an HTTP session, a
   stand-alone range patch format can be used.  For example, in this
   format a patch can be stored in a file or sent to a mailing list, or
   a version control system can display a patch in the range patch
   format.

   The format reuses structure from HTTP and consists of headers
   separated from the patch body by an empty line.  Only the
   Content-Range header is required.

   Example:

      Content-Range: json /foo/bar/3/baz

      {"1": {"two": "tree"}}

   Additional headers can be provided.  The format can also carry
   multiple range patches.  In that case the patch starts with a
   Content-Type header defining the boundary.

   Example:

      Content-Type: multipart/byteranges; boundary=THIS_STRING_SEPARATES

      --THIS_STRING_SEPARATES
      Content-Range: json /foo/bar/2/mo

      42
      --THIS_STRING_SEPARATES
      Content-Range: json /foo/bar/1/no

      "person"

   Stand-alone range patches can be transmitted over HTTP as-is as well.
   This can be used to provide the patch which has been used in a
   previous non-GET request.  A Content-Type with the "+patch" suffix
   identifies such a stand-alone range patch.
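
   The single-patch form of the stand-alone format (headers, an empty
   line, then the body) can be read with a short Python sketch.  The
   RangePatch type and the parse_standalone_patch function are
   hypothetical names, not part of this document.  The sketch assumes
   that either LF or CRLF line endings may appear, which this document
   does not specify, and it does not handle the multipart form.

```python
import re
from typing import NamedTuple


class RangePatch(NamedTuple):
    """One stand-alone range patch (hypothetical container type)."""
    unit: str        # range unit, e.g. "json" or "bytes"
    range_spec: str  # everything after the unit in Content-Range
    body: bytes      # patch content, kept as raw bytes


def parse_standalone_patch(raw: bytes) -> RangePatch:
    # Headers end at the first empty line. Accepting both LF and CRLF
    # is an assumption of this sketch.
    parts = re.split(rb"\r?\n\r?\n", raw, maxsplit=1)
    head = parts[0]
    body = parts[1] if len(parts) > 1 else b""

    headers = {}
    for line in head.splitlines():
        name, sep, value = line.partition(b":")
        if not sep:
            raise ValueError(f"malformed header line: {line!r}")
        # Header names are compared case-insensitively, as in HTTP.
        headers[name.strip().lower()] = value.strip()

    content_range = headers.get(b"content-range")
    if content_range is None:
        raise ValueError("Content-Range header is required")

    # "json /foo/bar/3/baz" -> unit "json", range spec "/foo/bar/3/baz"
    unit, _, range_spec = content_range.decode("ascii").partition(" ")
    return RangePatch(unit, range_spec.strip(), body)
```

   The body is left as bytes because stand-alone range patches are
   binary data; interpreting it is up to the range unit and the media
   type.
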
   For example, the patch used in the PATCH request example above could
   be retrieved as:

      HTTP/1.1 200 OK
      Date: Thu, 31 Oct 2019 07:51:08 GMT
      Last-Modified: Fri, 18 Oct 2019 17:44:39 GMT
      Content-Length: 62
      Content-Type: application/json+patch

      Content-Range: json /foo/bar/3/baz

      {"2": {"three": "flour"}}

   Stand-alone range patches are binary data.

2.3. URI Fragment Identifiers

   For media types which support range patches, ranges can be used as
   URI fragment identifiers as well.  For example, the URI:

      /api/document/1#json=/foo/bar/0

   identifies a fragment with the following content:

      {"some": "thing"}

   Multiple ranges are supported as well, and they identify multiple
   fragments:

      /api/document/1#json=/foo/bar/0,/foo/bar/1

3. Range Units

   Range units define how content is parsed into a structure.  Each
   range unit defines a corresponding range specification, which is a
   string describing a range under that unit.  Different range units can
   be compatible with content expressed through different media types.

3.1. Bytes Range Unit

   The bytes range unit is already specified in [RFC7233].  We extend it
   by allowing a zero-length range using a zero-length-byte-range-spec.

      zero-length-byte-range-spec = 1*DIGIT

   A zero-length range is a byte offset used to identify a location
   immediately before which new content can be inserted with a patch.

   Additionally, we note that the range "-0" is allowed, is a
   zero-length range, and identifies a location immediately after the
   last byte of data.  This allows appending bytes to data.

   Note that the bytes range unit operates on encoded content as
   specified in any Content-Encoding header.  That holds both for GET
   and non-GET requests.

3.2. JSON Range Unit

   The JSON range unit operates on JSON and JSON-compatible data
   structures.  Its range specification is based on JSON pointer as
   described in [RFC6901].  The content of the range MUST always be
   valid JSON by itself.

   The JSON range unit is identified with "json".

   JSON pointer provides capabilities to identify a single element of a
   data structure.
   Here we extend it to allow a range of elements for arrays and
   strings, by extending the rules in Section 4 of [RFC6901] for how a
   reference token modifies which value is referenced:

   o  If the currently referenced value is a JSON array, the reference
      token can be composed of two sets of digits (according to the ABNF
      syntax for array indices as specified in Section 4 of [RFC6901]),
      delimited by the character "-".  Each set of digits represents an
      unsigned base-10 integer value.  The first integer value MUST be
      smaller than the number of elements in the array.  The second
      integer value MUST be smaller than or equal to the number of
      elements in the array.  The second integer value MUST be larger
      than or equal to the first integer value.  If any of these
      requirements is violated, an error condition is raised.  The new
      referenced value is a new array with a subset of elements starting
      at the zero-based index of the first integer value, and ending at
      the element before the zero-based index of the second integer
      value (the first index is inclusive, the second index is
      exclusive).

   o  If the currently referenced value is a JSON array, the reference
      token can be the character "-".  The new referenced value is a
      zero-length array corresponding to the position immediately after
      the end of the current array.  This design makes such a JSON
      pointer compatible with the use of JSON pointers in JSON Patch
      [RFC6902].  This allows appending array elements to an array.

   o  If the currently referenced value is a JSON string, the scheme for
      JSON arrays is used to index into the string and makes the new
      referenced value a substring of the currently referenced value.
      String indexing is done by code units.

   A range of elements can be specified only as the last reference token
   in a JSON range.  It follows that a range of elements can be
   specified only once.
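
   The rules above can be sketched in Python as follows.  This is a
   minimal, non-normative illustration: the names evaluate_json_range
   and JsonRangeError are hypothetical.  The sketch assumes that a
   string accepts only a range token or "-" (not a single index), and
   it indexes strings by Python code points, which differs from the
   code units required above for characters outside the Basic
   Multilingual Plane.

```python
import re
from typing import Any

# Array index per RFC 6901 Section 4: "0" or digits without a leading zero.
_INDEX = r"(?:0|[1-9][0-9]*)"
_RANGE = re.compile(rf"({_INDEX})-({_INDEX})")


class JsonRangeError(ValueError):
    """Raised when a JSON range cannot be evaluated (hypothetical name)."""


def evaluate_json_range(document: Any, pointer: str) -> Any:
    if pointer == "":
        return document
    if not pointer.startswith("/"):
        raise JsonRangeError("a non-empty pointer must start with '/'")
    # RFC 6901 unescaping: "~1" first, then "~0".
    tokens = [t.replace("~1", "/").replace("~0", "~")
              for t in pointer[1:].split("/")]
    value = document
    for position, token in enumerate(tokens):
        is_last = position == len(tokens) - 1
        if isinstance(value, dict):
            if token not in value:
                raise JsonRangeError(f"no member {token!r}")
            value = value[token]
        elif isinstance(value, (list, str)):
            match = _RANGE.fullmatch(token)
            if token == "-" or match:
                if not is_last:
                    raise JsonRangeError("a range must be the last token")
                if token == "-":
                    return value[len(value):]   # zero-length, after the end
                first, second = int(match[1]), int(match[2])
                if not (first < len(value) and first <= second <= len(value)):
                    raise JsonRangeError(f"range {token!r} out of bounds")
                return value[first:second]      # inclusive start, exclusive end
            # Assumption of this sketch: no single-index access into strings.
            if isinstance(value, str) or not re.fullmatch(_INDEX, token):
                raise JsonRangeError(f"invalid token {token!r}")
            index = int(token)
            if index >= len(value):
                raise JsonRangeError(f"index {index} out of bounds")
            value = value[index]
        else:
            raise JsonRangeError("cannot descend into a scalar value")
    return value
```

   For instance, evaluating "/foo/1-3" against the document
   {"foo": ["bar", "baz", "bax"]} selects the last two elements, while
   "/foo/1-3/0" raises an error because the range is not the last
   reference token.
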
   For example, given the JSON document:

      {
         "foo": [
            "bar",
            "baz",
            "bax"
         ]
      }

   The following JSON strings evaluate to the accompanying JSON values:

      "/foo"         ["bar", "baz", "bax"]
      "/foo/0"       "bar"
      "/foo/0-1"     ["bar"]
      "/foo/1-3"     ["baz", "bax"]
      "/foo/1-1"     []
      "/foo/-"       []
      "/foo/3-3"     // error
      "/foo/4-4"     // error
      "/foo/1-0"     // error
      "/foo/1-4"     // error
      "/foo/1-3/0"   // error
      "/foo/0/1-3"   "ar"

   The JSON ranges "/foo/1-1" and "/foo/-" are of little utility on
   their own, but serve as zero-length ranges that identify a location
   immediately before which new content can be inserted with a patch.

   The JSON range unit always operates on non-encoded content, ignoring
   any Content-Encoding header.  That holds both for GET and non-GET
   requests.

3.3. Lines Range Unit

   For textual contents, the lines range unit operates on lines.  Line
   positions are numbered starting with zero (with line position zero
   always being identical with character position zero).  Ranges
   identified by lines include the line endings.  If the content does
   not contain any line endings, then it consists of a single (the
   first) line.

   Implementers should be aware of the fact that line endings in textual
   contents can be represented by characters or character sequences
   other than CR+LF.  Besides CR and LF, there are also NEL and CR+NEL.
   In general, the encoding of line endings can also depend on the
   character encoding of the textual contents, and implementations have
   to take this into account where necessary.

   The lines range unit is identified with "lines".

   The lines range specification is defined by:

      lines-range-spec = first-line "-" second-line
      first-line       = 1*DIGIT
      second-line      = 1*DIGIT

   Each lines range consists of two sets of digits, delimited by the
   character "-".  Each set of digits represents an unsigned base-10
   integer value.  The first integer value MUST be smaller than the
   number of lines in the contents.  The second integer value MUST be
   smaller than or equal to the number of lines in the contents.
   The second integer value MUST be larger than or equal to the first
   integer value.  The range consists of the lines starting at the line
   corresponding to the first integer value, and ending at the line
   before the line corresponding to the second integer value (the first
   integer is inclusive, the second integer is exclusive).

   A lines range in which the first and second integer values are equal
   is empty and of little utility on its own, but serves as a
   zero-length range to identify a location immediately before which new
   content can be inserted with a patch.

   Additionally, the lines range specification can be the character "-",
   which represents a zero-length range and identifies a location
   immediately after the last line of the textual contents.  This allows
   appending lines to textual contents.

   The lines range unit always operates on non-encoded content, ignoring
   any Content-Encoding header.  That holds both for GET and non-GET
   requests.

4. IANA Considerations

4.1. Range Unit Registrations

   This document registers the following range units:

   +-------------+---------------------------------------+-------------+
   | Range Unit  | Description                           | Reference   |
   | Name        |                                       |             |
   +-------------+---------------------------------------+-------------+
   | json        | a JSON pointer range on JSON and      | Section 3.2 |
   |             | JSON-compatible data structures       |             |
   +-------------+---------------------------------------+-------------+
   | lines       | a range of lines of textual contents  | Section 3.3 |
   +-------------+---------------------------------------+-------------+

   The change controller is: "IETF (iesg@ietf.org) - Internet
   Engineering Task Force".

4.2. The +patch Structured Syntax Suffix

   This document registers the following media type structured syntax
   suffix:

   Name:  Range patch

   +suffix:  +patch

   References:  See Section 2.2 of this document.

   Encoding considerations:  Stand-alone range patches are binary data.

   Fragment identifier considerations:

      The syntax and semantics of fragment identifiers specified for
      +patch SHOULD be as specified for range patches themselves.  (At
      publication of this document, there is no fragment identification
      syntax defined for range patches themselves.)

      The syntax and semantics for fragment identifiers for a specific
      "xxx/yyy+patch" SHOULD be processed as follows:

         For cases defined in +patch, where the fragment identifier
         resolves per the +patch rules, then process as specified in
         +patch.

         For cases defined in +patch, where the fragment identifier does
         not resolve per the +patch rules, then such a fragment
         identifier SHOULD identify a fragment which is obtained by
         intersection of the fragment identifier and the underlying
         range patch range specification for "xxx/yyy+patch".

         For cases not defined in +patch, then such a fragment
         identifier SHOULD identify a fragment which is obtained by
         intersection of the fragment identifier and the underlying
         range patch range specification for "xxx/yyy+patch".

   Interoperability considerations:  n/a

   Security considerations:  See Section 7 of this document.

   Contact:  IETF HTTP Working Group (ietf-http-wg@w3.org)

   Author/Change controller:  IETF (iesg@ietf.org) - Internet
      Engineering Task Force

5. Checking Capabilities

   A server may or may not support non-GET requests with a Range header.
   The default behavior of servers is simply to ignore unknown or
   unsupported headers.  In the case of a range patch, this implies that
   a request issuing a patch to a specific subsection of a resource
   might be interpreted by a server as a request to overwrite the entire
   resource with the patch, leaving the resource in a corrupted state.
   To determine whether or not the server can fulfill such a request
   correctly, the requester may first issue an OPTIONS request:

      OPTIONS /api/document/1
      Range-Request-Method: PATCH
      Range-Request-Units: json,bytes

   To which the server may reply in the affirmative:

      HTTP/1.1 204 No Content
      Connection: keep-alive
      Range-Request-Allow-Methods: PATCH
      Range-Request-Allow-Units: json,bytes
      Version: 33a64df551425fcc55e4d42a148795d9f25f89d4

   In the partial negative:

      HTTP/1.1 204 No Content
      Connection: keep-alive
      Range-Request-Allow-Methods: PATCH
      Range-Request-Allow-Units: json
      Version: 33a64df551425fcc55e4d42a148795d9f25f89d4

   Or in the complete negative:

      HTTP/1.1 204 No Content
      Connection: keep-alive
      Range-Request-Allow-Methods:
      Range-Request-Allow-Units:

   Empty header fields are allowed per Section 2.1 of [RFC2616].

   Also note the presence of the Version header, discussed in Section 6.
   The server may preemptively send this to obviate the need for another
   GET prior to a range patch request.

6. Race Conditions

   As with standard PUT, POST, and PATCH requests, a non-GET request
   with a Range header carries the risk of a mid-air collision with
   another simultaneous request.  If one requester updates a resource,
   and another requester, not being aware of that update, issues a
   second update, the resource may be left in an unexpected state.

   Standard PUT, POST, and PATCH requests handle this with the ETag and
   If-Match headers.  However, these headers vary based on the
   Content-Encoding of the request.

   Alternatively, requests can use the Versioning in [Braid-HTTP] to
   determine the ordering of simultaneous requests, and can specify
   consistency guarantees with [Merge-Types].

   The server may return a Version header in response to HTTP requests
   directed at a given resource.  Correspondingly, when issuing a range
   patch, the requester may include a Version header containing the
   version of the resource it intends to update.
   If the server cannot merge the patch at the given version, it must
   return a 409 (Conflict) response.

7. Security Considerations

   Both GET and non-GET requests with a Range header are potentially
   susceptible to denial-of-service attacks because of the effort
   required to compute or apply the patch.

8. Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

9. Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

10. References

10.1. Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC7233]  Fielding, R., Lafon, Y., and J. Reschke, "Hypertext
              Transfer Protocol (HTTP/1.1): Range Requests", RFC 7233,
              June 2014.

   [RFC6901]  Bryan, P., Zyp, K., and M. Nottingham, "JavaScript Object
              Notation (JSON) Pointer", RFC 6901, April 2013.

   [RFC6902]  Bryan, P., and M. Nottingham, "JavaScript Object Notation
              (JSON) Patch", RFC 6902, April 2013.

10.2. Informative References

   [Merge-Types]
              draft-toomim-httpbis-merge-types-00

   [Braid-HTTP]
              draft-toomim-httpbis-braid-http-00

   [RFC8259]  Bray, T., "The JavaScript Object Notation (JSON) Data
              Interchange Format", RFC 8259, December 2017.

   [RFC2616]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
              Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
              Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

Authors' Addresses

   For more information, the authors of this document are best contacted
   via Internet mail:

   Mitar Milutinovic
   UC Berkeley, EECS Department
   775 Soda Hall #1776
   Berkeley, CA 94720-1776

   EMail: mitar.ietf@tnode.com
   Web:   https://mitar.tnode.com/

   Michael Toomim
   Invisible College, Berkeley
   2053 Berkeley Way
   Berkeley, CA 94704

   EMail: toomim@gmail.com
   Web:   https://invisible.college/@toomim

   Bryn Bellomy
   Invisible College, Berkeley
   2053 Berkeley Way
   Berkeley, CA 94704

   EMail: bryn@signals.io
   Web:   https://invisible.college/@bryn