Defining Well-Known Uniform Resource Identifiers (URIs)

mnot@mnot.net https://www.mnot.net/

General Internet-Draft

This memo defines a path prefix for “well-known locations”, “/.well-known/”, in selected Uniform Resource Identifier (URI) schemes.

RFC EDITOR: please remove this section before publication.

This draft is a proposed revision of RFC 5785. The issues list for this draft can be found at https://github.com/mnot/I-D/labels/rfc5785bis. The most recent (often unpublished) draft is at https://mnot.github.io/I-D/rfc5785bis/. Recent changes are listed at https://github.com/mnot/I-D/commits/gh-pages/rfc5785bis. See also the draft’s current status in the IETF datatracker, at https://datatracker.ietf.org/doc/draft-nottingham-rfc5785bis/.

Some applications on the Web require the discovery of information about an origin (sometimes called “site-wide metadata”) before making a request. For example, the Robots Exclusion Protocol (http://www.robotstxt.org/) specifies a way for automated processes to obtain permission to access resources; likewise, the Platform for Privacy Preferences tells user-agents how to discover privacy policy before interacting with an origin server.

While there are several ways to access per-resource metadata (e.g., HTTP headers, WebDAV’s PROPFIND), the perceived overhead associated with them (in terms of client-perceived latency, deployment difficulties, or both) often precludes their use in these scenarios.

When this happens, one solution is to designate a “well-known location” for data or services related to the origin overall, so that it can be easily located. However, this approach has the drawback of risking collisions, both with other such designated “well-known locations” and with resources that the origin has created (or wishes to create).

To address this, this memo defines a path prefix in HTTP(S) URIs for these “well-known locations”, “/.well-known/”. Future specifications that need to define a resource for such metadata can register their use to avoid collisions and minimise impingement upon origins’ URI space.

Well-known URIs can also be used with other URI schemes, but only when those schemes’ definitions explicitly allow it.

As per RFC 7320, “publishing independent standards that mandate particular forms of URI substructure is inappropriate, because that essentially usurps ownership.” Well-known URIs are not an escape hatch from the requirements therein; they are a very limited carve-out of the path name space owned by the authority, ceded to standard use for a designated purpose.

That purpose is to facilitate discovery of information about an origin when it isn’t practical to use other mechanisms; for example, when discovering policy that needs to be evaluated before a resource is accessed, or when the information applies to many (or all) of the origin’s resources.

Typically, the resource(s) identified by a well-known URI will make information about the origin (e.g., policy) available directly, or provide references to other URIs that provide it. In general, that information should be applicable to most origins (i.e., Web sites), while acknowledging that some origins might not use a particular well-known location, for various reasons.

In keeping with the Architecture of the World-Wide Web, well-known URIs are not intended for general information retrieval or establishment of large URI namespaces.

Specifically, well-known URIs are not a “protocol registry” for applications and protocols that wish to use HTTP as a substrate. Instead, such applications and protocols are encouraged to use an absolute URI to bootstrap their operation, rather than using a hostname and a well-known URI.

Exceptionally, the registry expert(s) may approve such a registration for documents in the IETF Stream (see RFC 5741), in consultation with the IESG, provided that the protocol in question cannot be bootstrapped with a URI (e.g., the discovery mechanism can only carry a hostname). However, merely making a resource easier to locate is not a sufficient reason for registration. Likewise, future use unsupported by the specification in question is not a sufficient reason to register a well-known location.

Well-known locations are also not suited for information on topics other than the origin that they are located upon; for example, creating a well-known resource about a business entity or organisational structure presumes that Internet hosts and organisations share structure, and is likely to have significant deployment issues in environments where this is not true.

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119.

A well-known URI is a URI whose path component begins with the characters “/.well-known/”, and whose scheme is “HTTP”, “HTTPS”, or another scheme that has explicitly been specified to use well-known URIs.

Applications that wish to mint new well-known URIs MUST register them, following the procedures in Section 5.1.

For example, if an application registers the name ‘example’, the corresponding well-known URI on ‘http://www.example.com/’ would be ‘http://www.example.com/.well-known/example’.

Registered names MUST conform to the segment-nz production in RFC 3986. This means they cannot contain the “/” character.

Note that this specification defines neither how to determine the authority to use for a particular context, nor the scope of the metadata discovered by dereferencing the well-known URI; both should be defined by the application itself.

Typically, a registration will reference a specification that defines the format and associated media type to be obtained by dereferencing the well-known URI.

It MAY also contain additional information, such as the syntax of additional path components, query strings and/or fragment identifiers to be appended to the well-known URI, or protocol-specific details (e.g., HTTP method handling).

Note that this specification does not define a format or media-type for the resource located at “/.well-known/”, and clients should not expect a resource to exist at that location.

Well-known URIs are only valid when rooted in the top of the path’s hierarchy; they MUST NOT be used in other parts of the path. For example, “/.well-known/example” is a valid use, but “/foo/.well-known/example” is not.
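The rules above can be illustrated with a minimal Python sketch. It is not part of this memo: the function names are hypothetical, the regular expression is one reading of the segment-nz production in RFC 3986, and only the “http” and “https” schemes are assumed to be enabled.

```python
import re
from urllib.parse import urlsplit

# segment-nz = 1*pchar, where pchar is unreserved / pct-encoded /
# sub-delims / ":" / "@" (RFC 3986, Section 3.3).
_SEGMENT_NZ = re.compile(r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})+")

WELL_KNOWN_PREFIX = "/.well-known/"

# Assumption: other schemes would be added here only once their
# definitions explicitly allow well-known URIs.
ALLOWED_SCHEMES = {"http", "https"}


def is_valid_name(name: str) -> bool:
    """Return True if name is non-empty and matches segment-nz (so no "/")."""
    return _SEGMENT_NZ.fullmatch(name) is not None


def well_known_uri(origin: str, name: str) -> str:
    """Build the well-known URI for a registered name on an origin."""
    if not is_valid_name(name):
        raise ValueError(f"not a valid segment-nz name: {name!r}")
    parts = urlsplit(origin)
    if parts.scheme not in ALLOWED_SCHEMES or not parts.netloc:
        raise ValueError(f"unsupported or incomplete origin: {origin!r}")
    # The prefix is always rooted at the top of the path hierarchy.
    return f"{parts.scheme}://{parts.netloc}{WELL_KNOWN_PREFIX}{name}"


def is_well_known_uri(uri: str) -> bool:
    """Check scheme, top-level prefix, and a syntactically valid name.

    This is slightly stricter than the bare prefix definition: the bare
    "/.well-known/" path is rejected because it carries no name.
    """
    parts = urlsplit(uri)
    if parts.scheme not in ALLOWED_SCHEMES:
        return False
    if not parts.path.startswith(WELL_KNOWN_PREFIX):
        return False
    first_segment = parts.path[len(WELL_KNOWN_PREFIX):].split("/", 1)[0]
    return is_valid_name(first_segment)
```

With these helpers, well_known_uri("http://www.example.com/", "example") yields the URI from the example above, while is_well_known_uri() rejects “/foo/.well-known/example” because the prefix is not at the top of the path. The sketch checks syntax only; it does not consult the registry.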

This memo does not specify the scope of applicability of metadata or policy obtained from a well-known URI, and does not specify how to discover a well-known URI for a particular application. Individual applications using this mechanism must define both aspects.

Applications minting new well-known URIs, as well as administrators deploying them, will need to consider several security-related issues, including (but not limited to) exposure of sensitive data, denial-of-service attacks (in addition to normal load issues), server and client authentication, vulnerability to DNS rebinding attacks, and attacks where limited access to a server grants the ability to affect how well-known URIs are served.

Security-sensitive applications using well-known locations should consider that some server administrators might be unaware that the “.well-known” directory exists (especially on operating systems that hide directories whose names begin with “.”). This means that if an attacker has write access to the .well-known directory, they would be able to control its contents, possibly without the administrator realising it.

This document specifies procedures for the well-known URI registry, first defined in RFC 5785.

Well-known URIs are registered on the advice of one or more experts (appointed by the IESG or their delegate), under the Specification Required policy (using terminology from RFC 8126).

To allow for the allocation of values prior to publication, the expert(s) may approve registration once they are satisfied that such a specification will be published.

Registration requests can be sent to the wellknown-uri-review@ietf.org mailing list for review and comment, with an appropriate subject (e.g., “Request for well-known URI: example”).

URI suffix: The name requested for the well-known URI, relative to “/.well-known/”; e.g., “example”.

Change controller: For Standards-Track RFCs, state “IETF”. For others, give the name of the responsible party. Other details (e.g., postal address, e-mail address, home page URI) may also be included.

Specification document(s): Reference to the document that specifies the field, preferably including a URI that can be used to retrieve a copy of the document. An indication of the relevant sections may also be included, but is not required.

Related information: Optionally, citations to additional documents containing further relevant information.
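As an illustration only, a registration request could be assembled along the following lines. This is a hedged sketch: the class and method names are hypothetical, the field labels follow the registration template of RFC 5785, the “; ” separator between multiple documents is an arbitrary choice, and the specification reference in the usage note is a placeholder rather than a real document.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WellKnownRegistration:
    """Hypothetical container for the four registration template fields."""

    uri_suffix: str
    change_controller: str
    specification_documents: List[str]
    related_information: List[str] = field(default_factory=list)

    def subject(self) -> str:
        # Subject line in the style suggested for the review mailing list.
        return f"Request for well-known URI: {self.uri_suffix}"

    def render(self) -> str:
        # The suffix is a single path segment, so it cannot contain "/".
        if not self.uri_suffix or "/" in self.uri_suffix:
            raise ValueError("URI suffix must be non-empty and contain no '/'")
        if not self.specification_documents:
            raise ValueError("at least one specification document is needed")
        lines = [
            f"URI suffix: {self.uri_suffix}",
            f"Change controller: {self.change_controller}",
            "Specification document(s): "
            + "; ".join(self.specification_documents),
        ]
        # Related information is optional and omitted when empty.
        if self.related_information:
            lines.append(
                "Related information: " + "; ".join(self.related_information)
            )
        return "\n".join(lines)
```

For instance, WellKnownRegistration("example", "IETF", ["(placeholder specification reference)"]) produces the subject “Request for well-known URI: example” and a three-line body. The sketch only formats text; review and approval remain with the registry expert(s).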

RFC 6454: The Web Origin Concept
RFC 7320: URI Design and Ownership
RFC 5741: RFC Streams, Headers, and Boilerplates
RFC 2119: Key words for use in RFCs to Indicate Requirement Levels
RFC 3986: Uniform Resource Identifier (URI): Generic Syntax
RFC 8126: Guidelines for Writing an IANA Considerations Section in RFCs
W3C Recommendation: The Platform for Privacy Preferences 1.0 (P3P1.0) Specification
RFC 4918: HTTP Extensions for Web Distributed Authoring and Versioning (WebDAV)
W3C Recommendation: Architecture of the World Wide Web, Volume One
RFC 7231: Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content
RFC 5785: Defining Well-Known Uniform Resource Identifiers (URIs)

Aren’t well-known locations bad for the Web?
They are, but for various reasons – both technical and social – they are sometimes necessary. This memo defines a “sandbox” for them, to reduce the risks of collision and to minimise the impact upon pre-existing URIs on sites.

Why /.well-known?
It’s short, descriptive, and according to search indices, not widely used.

What impact does this have on existing mechanisms, such as P3P and robots.txt?
None, until they choose to use this mechanism.

Why aren’t per-directory well-known locations defined?
Allowing every URI path segment to have a well-known location (e.g., “/images/.well-known/”) would increase the risks of colliding with a pre-existing URI on a site, and generally these solutions are found not to scale well, because they’re too “chatty”.

I want to use a well-known location to make it easy to configure my protocol that uses HTTP.
This is not what well-known locations are for; see the discussion of appropriate use of well-known URIs earlier in this memo.

Discuss appropriate and inappropriate uses more fully
Adjust IANA instructions
Update references
Various other clarifications