Network Working Group                                       Jacob Palme
Internet Draft                                 Stockholm University/KTH
draft-palme-text-html-00.txt                                     Sweden
Category-to-be: Experimental                              November 1995
Expires May 1996


    The Text/HTML content type and the Content-Location MIME header

                                  or

                Sending HTML documents via MIME e-mail


Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as ``work in progress.''

To learn the current status of any Internet-Draft, please check the
``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).

This memo provides information for the Internet community. This memo
does not specify an Internet standard of any kind, since this document
is mainly a compilation of information taken from other RFC-s.
Distribution of this memo is unlimited.


Abstract

This memo specifies how to send HTML-formatted documents in Internet
mail. The memo particularly addresses the handling of hyperlinks in
HTML documents which refer to other body parts in the same message. In
order to do this, the memo introduces one new MIME content-header with
the name "Content-Location" and one new parameter to the MIME header
"Content-Type: Text/HTML" with the name "linking".


Palme                                                          [Page 1]

draft-palme-text-html-01.txt                                  July 1995


Table of contents

1.  Introduction
2.  Terminology
3.  The Content-Location MIME content-header
4.  Parameters for the Content-Type: Text/HTML
5.  Use of relative URL-s in Text/HTML contents
6.  Use of the Content-Type: multipart/related
7.  Use of Content-type: Multipart/alternative
8.  Combination of Content-Type: Multipart/related and
    Multipart/alternative.
9.  Links to other body parts
    9.1 Location-method: Use of the Content-Location field
    9.2 Filename-method: Use of file names
    9.3 CID-method: Use of CID URL-s
    9.4 Recommended choice of method:
10. Indication of method used
11. Encoding considerations
12. Security considerations
13. Acknowledgments
14. References
15. Author's address


1. Introduction

The HTML format is a very common format for documents in the Internet,
and there is an obvious need to be able to send documents in this
format in e-mail [SMTP, RFC822]. The "text/html; version=2.0" media
type is defined in [HTML2]. This memo gives additional specifications
and advice on how to use the text/html media type as a Content-Type in
MIME [MIME] e-mail messages.


2. Terminology

Most of the terms used in this memo are defined in other RFC-s. For
example, URL is defined in [URL], and URI, absolute URI, and relative
URI are defined in [HTML2].


3. The Content-Location MIME content-header

An additional MIME heading field is defined with the name
"Content-Location". This header field can occur in any MIME message
heading or content heading. Its value is an absolute URI. The data
below the current heading, and the data retrievable through this URI,
should be identical. In practice, at present only those URI-s which
are URL-s are used, but it is anticipated that other forms of URI-s
will be used in the future. This heading is similar to the Location
header as defined in [HTTP].

The syntax for the new heading field is, using the syntax definition
tools from [RFC822]:

   content-location ::= "Content-Location:" <"> uri <">

where uri is at present (November 1995) restricted to the syntax for
URL-s as defined in [URL]. This syntax will be widened when the
definition of URI syntax becomes more stable.
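
As an illustration only, the field syntax above can be checked
mechanically. The following Python sketch is not part of this memo's
specification: the function name parse_content_location is
hypothetical, the presence of a URL scheme is used as a simple stand-in
for "absolute URI", and an unquoted value is accepted as an assumption
because the need for the quotes is not settled. Folded header lines are
not handled.

```python
import re
from urllib.parse import urlsplit

# Field names are case-insensitive, as for other RFC 822 header fields.
# Assumption: the value may be quoted (as in the syntax above) or bare.
_FIELD = re.compile(
    r'^Content-Location:\s*(?:"([^"]*)"|(\S+))\s*$', re.IGNORECASE
)


def parse_content_location(line: str) -> str:
    """Return the URL carried by a single, unfolded Content-Location line.

    Raises ValueError if the line is not a Content-Location field or if
    the value has no URL scheme (that is, if it is a relative URL).
    """
    match = _FIELD.match(line)
    if match is None:
        raise ValueError("not a Content-Location field: %r" % line)
    # Group 1 holds a quoted value, group 2 a bare one.
    value = match.group(1) if match.group(1) is not None else match.group(2)
    if not urlsplit(value).scheme:
        raise ValueError("value is not an absolute URL: %r" % value)
    return value


if __name__ == "__main__":
    print(parse_content_location(
        'Content-Location: "http://www.dsv.su.se/images/DSV-sign-eng.gif"'
    ))
```

A receiving mailer could use such a check before comparing the value
with the URI-s found in a Text/HTML body part; a relative value such as
"signeng.gif" is rejected with a ValueError.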

Question: Are the quotes (<">) around the value necessary? HTTP does
not use such quotes around the value of its Location heading field.


4. Parameters for the Content-Type: Text/HTML

The optional "version" parameter for the Content-Type: Text/HTML
indicates the version of HTML used, with "2.0" as default value. The
new optional parameter "linking" is defined in chapter 10 below.


5. Use of relative URL-s in Text/HTML contents

Relative URL-s SHOULD NOT be used in Content-Type: Text/HTML except in
one of the following three cases:

(a) There is a BASE element in the HTML document which resolves the
    relative URL into a non-relative URL.

(b) The relative URL refers to another body part in the same message
    as defined in chapter 9.2 Filename-method below.

(c) In special cases, where the sender and recipient have prior
    agreement on the resolution of relative URL-s.


6. Use of the Content-Type: multipart/related

A message can contain one or more Text/HTML body parts and also
contain, as separate body parts, data to which hyperlinks (as defined
in [HTML2]) in the Text/HTML body part refer. If this is done, it is
recommended to use the multipart/related Content-Type as defined in
[REL]. The root (as defined in [REL]) should then be of the
Content-Type: Text/HTML.


7. Use of Content-type: Multipart/alternative

If a message is sent to recipients, not all of whom may have mailers
capable of handling the Text/HTML content-type, then the Content-Type:
Multipart/Alternative [MIME] can be used, for example with
Content-Type: Text/plain as the first choice, and Content-Type:
Text/HTML as the second choice.


8. Combination of Content-Type: Multipart/related and
   Multipart/alternative.

The Content-Type: Multipart/related, as defined in chapter 6 above,
and the Content-Type: Multipart/alternative, as defined in chapter 7
above, can be combined in the same message.
It is then recommended to put the Multipart/alternative inside the
Multipart/related.

Example:

   Content-Type: Multipart/related; boundary="boundary-example-1";
                 type=Text/HTML

   --boundary-example-1
   Content-Type: Multipart/alternative; boundary="boundary-example-2"

   --boundary-example-2
   Content-Type: Text/plain

   ... plain text version of the document for recipients whose
   mailers cannot handle Text/HTML ...

   --boundary-example-2
   Content-Type: Text/HTML

   ... text of the HTML document ...

   --boundary-example-2--
   --boundary-example-1
   Content-Type: Image/GIF

   ... a body part, to which the HTML document has a link ...

   --boundary-example-1--


9. Links to other body parts

A Text/HTML body part may contain hyperlinks to documents which are
included as other body parts in the same message. Three ways to do
this are specified in this memo:


9.1 Location-method: Use of the Content-Location field

With this method, all URI-s in the Text/HTML document SHOULD be
absolute URI-s as defined in [HTML2], and it SHOULD be possible to use
these URI-s to retrieve the referred document using the protocol
defined for retrieval of this particular URL scheme in [URL] (subject
to access control).

For each distinct URI in the Text/HTML document which refers to data
sent in the same MIME message, there SHOULD be a separate body part in
the message containing this data. Each such body part SHOULD contain a
Content-Location heading field, and the value of this field SHOULD be
identical to the URI as used in the Text/HTML document.

The receiving mailer can then resolve the hyperlink either by using
the URI in the normal way, or by using the data in the body part whose
Content-Location contains the same URI.

Example:

   Content-Type: Multipart/related; boundary="boundary-example-1";
                 type=Text/HTML

   --boundary-example-1
   Content-Type: Text/HTML

   ...
   text of the HTML document, which might contain a hyperlink to
   the other body part, for example through a statement such as:

   --boundary-example-1
   Content-Type: Image/GIF
   Content-Location: "http://www.dsv.su.se/images/DSV-sign-eng.gif"

   --boundary-example-1--


9.2 Filename-method: Use of file names

With this method, the hyperlink URIs in the Text/HTML document which
refer to other body parts in the same message SHOULD have a very
simple format. This simple format is relative URL-s of the form

   relative-url ::= 1ALPHA 0*7ALPHADIGIT [ "." 1*3ALPHADIGIT ]
   ALPHADIGIT   ::= ALPHA / DIGIT

i.e. 1-8 characters plus 0-3 extension characters, using only ASCII
letters and digits and beginning with a letter. This simple format is
chosen to match the file name formats permitted in most operating
systems in wide use today.

For each distinct URI in the Text/HTML document which refers to data
sent in the same MIME message, there should be a separate body part in
the message containing this data. Each such body part SHOULD contain a
Content-Disposition header [RFC1806] with a filename parameter. The
value of this filename parameter should be identical to the relative
URI as used in the Text/HTML document.

The value of the Content-Disposition header should be "inline" if the
URI in the Text/HTML document is used for an inline HTML element, such
as an IMG element, and should be "attachment" if the URI in the
Text/HTML document is used for a hyperlink to a document which is
activated at the request of the recipient, such as an A element.

Example:

   Content-Type: Multipart/related; boundary="boundary-example-1";
                 type=Text/HTML

   --boundary-example-1
   Content-Type: Text/HTML

   ...
   text of the HTML document, which might contain a hyperlink to
   the other body part, for example through a statement such as:

   --boundary-example-1
   Content-Type: Image/GIF
   Content-Disposition: inline; filename=signeng.gif

   --boundary-example-1--


9.3 CID-method: Use of CID URL-s

With this method, the hyperlink URIs in the Text/HTML document which
refer to other body parts in the same message SHOULD be CID
(Content-ID) URL-s as defined in [URL] and [CIDMID].

For each distinct URI in the Text/HTML document which refers to data
sent in the same MIME message, there should be a separate body part in
the message containing this data. Each such body part SHOULD have a
Content-ID header [MIME]. The value of this Content-ID header should
be identical to the CID as used in the Text/HTML document.

Example:

   Content-Type: Multipart/related; boundary="boundary-example-1";
                 type=Text/HTML

   --boundary-example-1
   Content-Type: Text/HTML

   ... text of the HTML document, which might contain a hyperlink to
   the other body part, for example through a statement such as:

   --boundary-example-1
   Content-Type: Image/GIF
   Content-ID: sign-eng*jpalme@dsv.su.se

   --boundary-example-1--


9.4 Recommended choice of method:

In addition to using the methods described in this chapter, a
Text/HTML content may always contain URI-s which are resolvable only
by the method defined for their URI scheme and which do not refer to
any data in separate body parts of the same message.

   Method       Body part identifi-    Recommendation
                cation method
   ----------   --------------------   ------------------------------
   Location-    Content-Location       When the referred document is
   method                              publicly available and
                                       retrievable using the scheme
                                       used in the URI.

   Filename-    filename in Content-   For private documents or
   method       Disposition header     documents not retrievable.

   CID-method   Content-ID             For experimental use between
                                       consenting partners.


10. Indication of method used

There should be an additional optional parameter to the Content-Type:
Text/HTML header, with the name "linking" and the syntax:

   linking ::= "linking=" linkmethod

where linkmethod can have the following values:

   external   Only absolute URI-s are used, retrieved using the scheme
              defined for each URI, and no URI refers to other body
              parts in the same message.

   location   Location-method as defined in chapter 9.1.

   filename   Filename-method as defined in chapter 9.2.

   cid        CID-method as defined in chapter 9.3.

The default value, if this parameter is omitted, is "external".


11. Encoding considerations

There are two recommended ways to encode 8-bit characters in Text/HTML
contents:

(1) Let the charset of the content part be iso-8859-1, and encode the
    content with the quoted-printable encoding method.

(2) Let the charset of the content part be us-ascii, and encode
    non-us-ascii characters in the text using the Data character
    encoding defined in [HTML2].

Both these encoding methods are permitted, and they can also be mixed
in the same document. Recipients must be capable of handling both
encoding alternatives. However, it is recommended that encoding method
(2) above be used when sending Text/HTML messages.

If only method (2) is used, the charset parameter should be
"us-ascii". If method (1), or a mixture of method (1) and method (2),
is used, the charset parameter should be "iso-8859-1".


12. Security considerations

There is a potential security risk if a Content-Location header heads
a body part whose data is not identical to the data retrievable using
the URI in that Content-Location header. To reduce this risk, it may
be unsuitable to cache the data in such a way that the cached data can
be used for retrieval of this URL from documents other than those
included in the same message as the Content-Location header.


13. Acknowledgments

Harald Tveit Alvestrand, Keith Moore, Ed Levinson, Al Gilman, Valdis
Kletnieks, Larry Masinter and several other people have helped me with
preparing this memo. I alone take responsibility for any errors which
may still be in the memo.


14. References

Temporary note: This list contains some references to Internet drafts.
It is anticipated that these Internet drafts will become RFC-s before
this memo. The references in this memo will then be changed to refer
to the corresponding RFC-s instead.

Ref.        Author, title
---------   --------------------------------------------------------

[CIDMID]    E. Levinson: "Content-ID and Message-ID Uniform Resource
            Locators", , October 1995.

[HOSTS]     R. Braden (editor): "Requirements for Internet Hosts --
            Application and Support", STD 3, RFC 1123, October 1989.

[HTML2]     T. Berners-Lee, D. Connolly: "Hypertext Markup Language -
            2.0", RFC 1866, November 1995.

[HTTP]      T. Berners-Lee, R. Fielding, H. Frystyk: "Hypertext
            Transfer Protocol -- HTTP/1.0", , April 1996.

[MIME]      N. Borenstein, N. Freed: "MIME (Multipurpose Internet Mail
            Extensions) Part One: Mechanisms for Specifying and
            Describing the Format of Internet Message Bodies",
            RFC 1521, September 1993.

[NEWS]      M.R. Horton, R. Adams: "Standard for interchange of USENET
            messages", RFC 1036, December 1987.

[REL]       H.T. Alvestrand, E. Levinson: "The MIME Multipart/Related
            Content-type", , January 1995.

[RFC1806]   R. Troost, S. Dorner: "Communicating Presentation
            Information in Internet Messages: The Content-Disposition
            Header", RFC 1806, June 1995.

[RFC822]    D. Crocker: "Standard for the format of ARPA Internet text
            messages", STD 11, RFC 822, August 1982.

[SMTP]      J. Postel: "Simple Mail Transfer Protocol", STD 10,
            RFC 821, August 1982.

[URL]       T. Berners-Lee, L. Masinter, M. McCahill: "Uniform
            Resource Locators (URL)", RFC 1738, December 1994.


15. Author's address

Jacob Palme                          Phone: +46-8-16 16 67
Stockholm University and KTH         Fax: +46-8-783 08 29
Electrum 230                         E-mail: jpalme@dsv.su.se
S-164 40 Kista, Sweden