Network Working Group                               Donald Eastlake 3rd
INTERNET-DRAFT                                                 Motorola
Expires: May 2001                                         November 2000


                Protocol versus Document Points of View
                -------- ------ -------- ------ -- ----


Status of This Document

This draft is intended to become an Informational RFC. Its
distribution is unlimited. Please send comments to the author.

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC 2026. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, and
its working groups. Note that other groups may also distribute working
documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.


Abstract

Two points of view are contrasted: the "document" point of view, where
objects of interest are like pieces of paper, and the "protocol" point
of view, where objects of interest are like composite protocol
messages. While each point of view has its place, inappropriate
adherence to a purely document point of view is detrimental to protocol
design.


D. Eastlake 3rd                                                [Page 1]

INTERNET-DRAFT          Document versus Protocol           October 2000


Table of Contents

   Status of This Document
   Abstract
   Table of Contents
   1. Introduction
   2. Points of View
      2.1 Basic Point of View
      2.2 The Question of Meaning
      2.3 Processing or Lack Thereof
      2.4 Canonicalization and Security
   3. Examples
   4. Synthesis of the Points of View
   References
   Author's Address
   Expiration and File Name


1. Introduction

Much of the IETF's traditional work has concerned low level binary
protocol constructs. These are almost always viewed from the protocol
point of view as defined below. But as higher level application
constructs and syntaxes are involved in the standards process,
difficulties can arise due to participants who are fixated on the
document point of view.

An example of something designed, to a significant extent, from the
document point of view is the X.509v3 Certificate [X509v3]. An example
of something that can easily be viewed both ways, and where the best
results frequently require attention to not only the document but also
the protocol point of view, is the eXtensible Markup Language (XML
[XML]).


2. Points of View

The following subsections contrast the document and protocol points of
view. Each view is exaggerated for effect. The document point of view
is indicated in paragraphs headed "DOCUM" while the protocol point of
view is indicated in paragraphs headed "PROTO".


2.1 Basic Point of View

DOCUM: What is important are complete digital documents viewed by
people, or things which are very close equivalents. A major concern is
to be able to present such documents as directly as possible to a court
or adjudicator should a dispute arise.

PROTO: What is important are bits on the wire generated and consumed by
well defined computer processes, or things which are very close
equivalents. Although pieces of such messages may end up being included
in or influencing data displayed to a person, a protocol message as a
whole is only viewed by a geek when debugging. If you actually ever
have to prove something about such a message in a court, there isn't
any way to avoid having expert witnesses interpret it.


2.2 The Question of Meaning

DOCUM: The "meaning" of a document is a deep and interesting human
question. It is probably necessary for the document to include or
reference human language policy and/or warranty/disclaimer information.
It is reasonable to consult attorneys and require some minimal human
readable statements to be "within the four corners" of the document
(i.e., actually embedded in the digital structure).

PROTO: The "meaning" of a protocol message is clear from the protocol
specification and is frequently defined in terms of the state machines
of the sender and recipient. Protocol messages are only truly
meaningful to the processes producing and consuming them, and those
processes have additional context. Adding human readable text that is
not functionally required is silly. Consulting attorneys may needlessly
complicate the protocol and in the worst case tie any design effort in
knots.


2.3 Processing or Lack Thereof

DOCUM: The standard model of a document is as a quasi-static object
somewhat like a piece of paper. About all you do to documents is
transfer them as a whole from one storage area to another or add
attachments. (Possibly you might want an extract from a document or to
combine multiple documents into a summary, but this isn't the common
case.)

PROTO: The standard model of a protocol message is as an ephemeral
composite object created by a source process and consumed by a
destination process.
Normally a message is constructed from information contained in, or
pieces of, other messages previously received by the sending process,
as well as from local information.


2.4 Canonicalization and Security

Canonicalization is the transformation of the information in a message
into a "standard" form, discarding "insignificant" information.
Examples are encoding into a standard character set or changing line
endings into a standard encoding, and discarding the information as to
what the original character set or line ending encodings were.
Obviously, what is "standard" and what is "insignificant" varies with
the application or protocol and can be tricky to determine.

DOCUM: From the document point of view, canonicalization is extremely
suspect if not outright evil. After all, if you have a piece of paper
with writing on it, any modification to "standardize" its format can be
an unauthorized change in the original message as created by the
author. From the document point of view, digital signatures are like
authenticating signatures or seals or time stamps on the bottom of the
"piece of paper". They do not justify and should not depend on the
slightest change in the message appearing above them. Similarly, from
the document point of view, encryption is just putting the "piece of
paper" in a vault that only certain people can open, and does not
justify any standardization or canonicalization of the message.

PROTO: From the protocol point of view, you know that you just have a
pile of bits that have never been seen and never will be seen by a
person. In some cases, a human sensible representation of some of the
bits may be shown to a person. But, for protocols of realistic
complexity, most of the parts of the message will be artifacts of
encoding, protocol structure, and computer representation rather than
anything intended for a person to see.
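
To make the definition of canonicalization at the start of this section
concrete, the following is a minimal sketch in Python. It is not taken
from this draft. The function names (canonicalize, digest), the sample
message, and the choice of UTF-8, LF line endings, and Unicode NFC as
the "standard" form are all assumptions made for illustration; a real
protocol would have to specify its own canonical form. A SHA-256 digest
stands in for the value that a signature algorithm would sign.

```python
import hashlib
import unicodedata


def canonicalize(raw: bytes, charset: str) -> bytes:
    """Reduce a text message part to one illustrative canonical form.

    Assumed canonical form (hypothetical, for this sketch only):
    UTF-8 encoding, LF line endings, Unicode NFC normalization.
    """
    text = raw.decode(charset)
    # Discard the original line ending convention: CRLF and bare CR to LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Discard differences between equivalent Unicode spellings.
    text = unicodedata.normalize("NFC", text)
    # Discard the original character set by re-encoding as UTF-8.
    return text.encode("utf-8")


def digest(data: bytes) -> str:
    """Stand-in for the hash step of signature generation/verification."""
    return hashlib.sha256(data).hexdigest()


# The same content in two idiosyncratic forms: Latin-1 with CRLF line
# endings, and UTF-16 with LF line endings and a decomposed accent.
form_a = "Caf\u00e9 order\r\nqty: 2\r\n".encode("latin-1")
form_b = "Cafe\u0301 order\nqty: 2\n".encode("utf-16")

# Digests over the raw bytes differ even though the content is the same.
raw_match = digest(form_a) == digest(form_b)

# Digests over the canonical forms agree.
canonical_match = (
    digest(canonicalize(form_a, "latin-1"))
    == digest(canonicalize(form_b, "utf-16"))
)

print(raw_match)        # False
print(canonical_match)  # True
```

In this sketch the information discarded as "insignificant" is exactly
the original character set, line ending convention, and Unicode
spelling; whether those are in fact insignificant depends on the
application.
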
In theory, the "original" idiosyncratic form of any digitally signed
part could be conveyed unchanged through the computer processes which
implement the protocol and usefully signed in that form, but in
practical systems of any complexity, this always proves unreasonably
difficult for at least some parts of some messages. Thus, the signed
data must be canonicalized as part of the signing and verification
processes. Even if, miraculously, an initial system design avoids all
cases of signed message part reconstruction based on processed data or
re-encoding based on character set or line ending or capitalization or
numeric representation or time zones or whatever, later revisions and
extensions are almost certain to require such reconstruction and/or
re-encoding.

Because of this, from the protocol point of view, canonicalization is
always required. It is just a question of exactly what canonicalization
or canonicalizations to apply. Thus, for protocol systems of practical
complexity, you are faced with the choice of (1) doing no
canonicalization and having brittle signatures, useless due to
insignificant failures to verify, or (2) doing the sometimes difficult
and tricky work of designing an appropriate canonicalization or
canonicalizations to be used as part of signature generation and
verification, producing robust and useful signatures.

While the application of canonicalization is more obvious with digital
signatures, it may also apply to encryption. In particular, sometimes
elements of the environment where the encrypted data is found affect
its interpretation, for example, the character encoding or the bindings
of dummy symbols. When the data is decrypted, it may be into an
environment with a different character encoding and different dummy
symbol bindings. With a plain text message part, it is usually clear
which of these environmental elements need to be conveyed with the
message.
But an encrypted message part is opaque. Thus some canonical
representation that incorporates such environmental factors may be
needed.


3. Examples

(to be added)


4. Synthesis of the Points of View

There are some merits to each point of view. Certainly the document
point of view is simpler and easier and would thus be preferred if it
meets the needs of an application.

The protocol point of view can come close to encompassing the document
point of view as a limiting case. In particular, as the complexity of
messages declines to a single payload (perhaps with attachments), and
the mutability of the payload declines to some standard binary format
that needs no canonicalization, and the number of parties and amount of
processing as messages are transferred declines, and the portion of the
message intended for more or less direct human consumption increases,
the protocol point of view would be narrowed to something close to the
document point of view. Even when the document point of view is
questionable, the addition of a few options to a protocol, such as
minimal and/or no canonicalization or optional policy statement/pointer
inclusion, will usually satisfy the perceived needs of those holding a
strictly document point of view.

On the other hand, the document point of view is hard to stretch to
encompass the protocol case. From an extreme document point of view,
canonicalization is wrong, inclusion of human language policy text
within every object should be mandatory, etc. Failure to incorporate
the protocol viewpoint as described above in the design of protocols of
realistic complexity may have fatal consequences.


References

[X509v3] ITU-T Recommendation X.509 version 3 (1997). "Information
         Technology - Open Systems Interconnection - The Directory
         Authentication Framework" ISO/IEC 9594-8:1997.

[XML]    Extensible Markup Language (XML) 1.0 Recommendation. T. Bray,
         J. Paoli, C. M. Sperberg-McQueen. February 1998.


Author's Address

The author of this document is:

   Donald E. Eastlake 3rd
   Motorola
   155 Beaver Street
   Milford, MA 01757 USA

   Phone: +1 508-261-5434 (w)
          +1 508-634-2066 (h)
   Fax:   +1 508-261-4777 (w)
   EMail: Donald.Eastlake@motorola.com


Expiration and File Name

This draft expires May 2001.

Its file name is .