INTERNET-DRAFT                                                B. Bos
draft-bos-http-redirect-00.txt                             W3C/INRIA
Expires 1 January 2000                                  30 June 1999


        Handling of fragment identifiers in redirected URLs

Status of this memo

This document is [probably going to be] an Internet-Draft and is in
full conformance with all provisions of Section 10 of RFC
2026[RFC2026].

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Abstract

The HTTP 1.1 specification describes how a server can answer a request
with a redirection, instructing the client to get the resource from a
different URL. It doesn't explain what to do with any fragment
identifier that might have been on the original URL, and this omission
has resulted in different clients handling fragments in different
ways. This draft gives rules towards a more consistent handling by
future HTTP clients.

Comments on this draft can be sent to bert@w3.org

Description of the problem

The HTTP 1.1 protocol[HTTP] contains a facility whereby servers can
inform clients that the resource they requested is not available at
the requested address, but at some other. The server sends back a
status code such as 301 or 302 and the correct URI of the resource.
Clients then typically issue a new request, to the same or to a
different server, with the new URI.

URIs may contain a fragment identifier, indicated by a # (hash mark)
in the URI[URI]. For example

    http://www.w3.org/TR/REC-xml-names#NT-NCName

A client that is retrieving this fragment will ask a server for the
resource "http://www.w3.org/TR/REC-xml-names" and will then locate the
fragment "NT-NCName" in that resource. It depends on the client and on
the type of the resource what is done with the fragment. A browser
displaying an HTML[HTML] page usually scrolls the view port so that
the indicated fragment is at the top. In the example, the fragment
identifier is a single name, but again depending on the type of
resource, it may be a complex expression.

The problem is what happens when a URI with a fragment identifier gets
redirected. Assume that when the client sends the URL
"http://www.w3.org/TR/REC-xml-names" to a server, it will receive a
status code 301, which means "Moved permanently", and a new URL. Let's
assume the new URL is

    http://www.w3.org/TR/REC-xml-names/

i.e., with an extra slash compared to the original URL. The question
is whether the client should interpret this as

    http://www.w3.org/TR/REC-xml-names/#NT-NCName

or as

    http://www.w3.org/TR/REC-xml-names/

The former assumes that the document may have changed location, but
that it is still the same document and it still contains the same
fragment. The latter assumes that, because the document changed
location, it probably also changed contents, and doesn't have that
fragment anymore.

The HTTP 1.1 specification talks about a single resource which is
available at one or more locations or in one or more representations,
so the former interpretation appears to be the right one. It may be
the case that some of those alternative representations do not allow
fragments to be identified, but we will have to assume that at least
one of them does.
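
As an illustration only, the example above can be written out in a few
lines of Python. The helper name split_fragment and the variable names
are assumptions made for this sketch and are not part of the draft; the
only library call used is urldefrag from the standard urllib.parse
module.

```python
from urllib.parse import urldefrag


def split_fragment(uri):
    """Split a URI into the part sent to the server and the fragment.

    Hypothetical helper for illustration: the fragment never goes to
    the server, the client keeps it and applies it to the response.
    """
    resource, fragment = urldefrag(uri)
    return resource, fragment


original = "http://www.w3.org/TR/REC-xml-names#NT-NCName"
resource, fragment = split_fragment(original)

# The server answers the request for `resource` with a 301 and this URL.
redirect_target = "http://www.w3.org/TR/REC-xml-names/"

# The two interpretations a client can choose between:
keeps_fragment = redirect_target + "#" + fragment  # same document, moved
drops_fragment = redirect_target                   # treated as a new document
```

Here keeps_fragment corresponds to the first interpretation in the text
and drops_fragment to the second.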

But HTTP 1.1 doesn't talk explicitly about fragment identifiers, which
has resulted in the sad fact that at the time of writing, there are
clients that drop the fragment identifier upon a redirect. Anecdotal
evidence suggests that in fact only about one third of Web browsers
re-apply the fragment identifier to the redirected URL.

This draft therefore explains how to apply the fragment identifier in
case of a redirection.

Detailed specification

There are different cases, depending on which type of redirection is
used, and on whether the new URI itself contains a fragment
identifier.

We assume that a client issued an HTTP GET request for a particular
URI (referred to as the "original URI"). This draft does not specify
what happens with other kinds of requests, such as HEAD, PUT and POST.

If the server returns a response code of 300 ("multiple choice"), 301
("moved permanently"), 302 ("moved temporarily") or 303 ("see other"),
and if the server also returns one or more URIs where the resource can
be found, then the client SHOULD treat the new URIs as if the fragment
identifier of the original URI was added at the end.

The exception is when a returned URI already has a fragment
identifier. In that case the original fragment identifier MUST NOT be
added to it.

If the client retrieves the resource using the new URI and the
resource turns out to be of a type that doesn't allow fragments to be
identified, then the client SHOULD silently ignore the fragment ID and
not issue an error message.

The response codes 304 ("not modified") and 305 ("use proxy") both
indicate that the resource can be found in a different way, but do not
specify a new URI. The resource is still identified by the original
URI with the original fragment identifier.
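
The rules in this section can be sketched in Python as follows. This
is a minimal illustration, not a reference implementation: the function
name effective_uri, the constant CARRY_OVER_CODES and the example.org
URIs in the usage lines are hypothetical. The sketch handles one
returned URI at a time, assumes it is absolute, treats an empty
fragment ("#" with nothing after it) as no fragment, and leaves the
original URI untouched for any status code the draft does not mention;
those last three points are assumptions, not statements of the draft.

```python
from urllib.parse import urldefrag

# Status codes for which the original fragment SHOULD carry over.
CARRY_OVER_CODES = {300, 301, 302, 303}


def effective_uri(original_uri, status, new_uri=None):
    """Return the URI, with fragment, that identifies the resource
    after a response to a GET request (hypothetical helper)."""
    _, original_fragment = urldefrag(original_uri)

    if status in CARRY_OVER_CODES and new_uri:
        target, new_fragment = urldefrag(new_uri)
        if new_fragment:
            # The returned URI has its own fragment: the original
            # fragment MUST NOT be added to it.
            return new_uri
        if original_fragment:
            # Treat the new URI as if the original fragment was
            # added at the end.
            return target + "#" + original_fragment
        return target

    # 304 and 305 carry no new URI; the original URI and fragment
    # still identify the resource. Other codes are left unchanged
    # here as an assumption of this sketch.
    return original_uri


moved = effective_uri(
    "http://www.w3.org/TR/REC-xml-names#NT-NCName",
    301,
    "http://www.w3.org/TR/REC-xml-names/",
)
own_fragment = effective_uri(
    "http://example.org/a#old", 302, "http://example.org/b#new"
)
not_modified = effective_uri("http://example.org/a#old", 304)
```

For a 300 response listing several URIs, a client following this sketch
would call effective_uri once for each of them. Ignoring a fragment
that the retrieved resource type cannot resolve happens after retrieval
and is outside the scope of this helper.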

Open issue

If a resource is available in several representations (as indicated
by the 300 response code), it may be the case that some of these
representations would be able to identify the fragment, but not using
the same fragment identifier. For example, one of the representations
may be an HTML file with elements carrying ID attributes, while
another may be a Postscript file with page numbers. The author of both
may consider them to be the same resource and may want to map page
numbers to IDs and vice versa.

There is currently no way for a server to tell a client about such
mappings of fragment identifiers between different representations of
a resource.

A suggestion for a future version of HTTP may be to add an (optional)
Fragment header to the request, which holds the fragment identifier.
Even simpler may be to allow an HTTP request to contain a fragment
identifier.

Security considerations

No new security considerations are added to those already present in
HTTP 1.1.

References

[HTML] Dave Raggett, Arnaud Le Hors, Ian Jacobs. "HTML 4.0
    Specification." December 1997, revised April 1998. W3C
    Recommendation REC-html40-19980424. Available at URL
    http://www.w3.org/TR/REC-html40/

[HTTP] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, T. Berners-Lee.
    "Hypertext Transfer Protocol -- HTTP/1.1." January 1997. Internet
    RFC 2068. Available at URL
    http://www.w3.org/Protocols/rfc2068/rfc2068

[RFC2026] S. Bradner. "The Internet Standards Process -- Revision 3."
    October 1996. Internet RFC 2026. Available at URL
    ftp://ftp.nordu.net/rfc/rfc2026.txt

[URI] T. Berners-Lee, L. Masinter, M. McCahill. "Uniform Resource
    Locators (URL)." December 1994. Internet RFC 1738. Available at
    URL ftp://ftp.nordu.net/rfc/rfc1738.txt

Author's address

    Bert Bos
    W3C/INRIA
    2004, route des Lucioles
    B.P. 93
    06902 Sophia Antipolis Cedex
    France

    tel: +33 (0)4 92 38 76 92
    e-mail: bert@w3.org