Network Working Group                                   M. Garcia-Martin
Internet-Draft                                    Nokia Siemens Networks
Intended status: Standards Track                          M. Matuszewski
Expires: May 12, 2008                                              Nokia
                                                        November 9, 2007


          An Extensible Data Format (XML) for Describing Files
               draft-garcia-app-area-file-data-format-00

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on May 12, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   This document defines an Extensible Markup Language (XML) data
   format for describing files.

Garcia-Martin & Matuszewski   Expires May 12, 2008              [Page 1]
Internet-Draft          XML Data Format for Files          November 2007

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  The 'file-metadata' XML Document
     3.1.  Full 'file-metadata' document
     3.2.  Partial 'file-metadata' document: patch operations
     3.3.  XML Schema
     3.4.  Examples
       3.4.1.  Example of a full 'file-metadata' document
       3.4.2.  Example of a partial 'file-metadata' document
       3.4.3.  Example of a full 'file-metadata' document
       3.4.4.  Example of a partial 'file-metadata' document
   4.  Security Considerations
   5.  IANA Considerations
   6.  References
     6.1.  Normative References
     6.2.  Informative References
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

   Recently, there has been growing interest in defining a standard
   format for describing files and their associated metadata.  One
   example is the Session Initiation Protocol (SIP) file sharing
   framework [I-D.garcia-sipping-file-sharing-framework], which
   describes a usage of SIP for publishing file metadata.  Other
   usages, for example those based on the Hypertext Transfer Protocol
   (HTTP) [RFC2616], have also been discussed, and the growing use of
   XML in IETF protocols is expected to increase the demand for such a
   format.

   This document creates a generic XML document format for describing
   files and their associated metadata.  The document format is
   extensible, so future needs can be addressed through extensions.
   Applications that need to describe files are expected to use this
   format as their standard format.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in BCP 14, RFC 2119
   [RFC2119] and indicate requirement levels for compliant
   implementations.

3.  The 'file-metadata' XML Document

   A 'file-metadata' document is an XML document [W3C.REC-xml-20001006]
   that MUST be well-formed and SHOULD be valid.  A 'file-metadata'
   document MUST be based on XML 1.0 and MUST be encoded using UTF-8
   [RFC3629].  This specification makes use of XML namespaces for
   identifying 'file-metadata' documents.  The namespace URI for
   elements defined by this specification is a URN [RFC2141], using the
   namespace identifier 'ietf'.  This URN is:

      urn:ietf:params:xml:ns:file

   The 'file-metadata' documents are identified with the MIME type
   "application/file+xml" and are instances of the XML schema defined
   in Section 3.3.

   The XML schema that defines the constraints of the 'file-metadata'
   document provides support for full and partial 'file-metadata'
   documents, so that applications that can accommodate differentiated
   versions of XML documents can use partial content to signal a change
   in one or more files.  The XML schema contains provisions for two
   root elements, namely and , exactly one of which MUST be present in
   a valid 'file-metadata' document.  The element is used to describe a
   full 'file-metadata' document, i.e., one containing a full state of
   the available files.  A full 'file-metadata' document MUST be used
   in any initial publication or initial notification.  In contrast,
   the element is used to describe a partial 'file-metadata' document.
   The element contains a number of patch operations that, once applied
   to a previous version of a full 'file-metadata' document, create an
   updated full document.
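
   The requirements stated at the start of this section (well-formed
   XML, UTF-8 encoding, and the 'urn:ietf:params:xml:ns:file'
   namespace) can be checked mechanically.  The following Python
   sketch is illustrative only and is not part of this specification.
   The function name and the root element name "placeholder-root" are
   assumptions made for the example, and the sketch does not validate
   a document against the schema of Section 3.3 or verify the XML
   version.

```python
import xml.etree.ElementTree as ET

# Namespace URN defined in Section 3 of this document.
FILE_NS = "urn:ietf:params:xml:ns:file"

# Hypothetical sample: "placeholder-root" is NOT an element name taken
# from the schema; it only stands in for one of the two root elements.
SAMPLE = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<placeholder-root xmlns="urn:ietf:params:xml:ns:file" version="1"/>'
)


def check_file_metadata(raw: bytes) -> str:
    """Return the local name of the root element.

    Raises ValueError if the bytes are not UTF-8, if the XML is not
    well-formed, or if the root element is outside the namespace.
    """
    try:
        raw.decode("utf-8")  # the document MUST be encoded using UTF-8
    except UnicodeDecodeError as exc:
        raise ValueError("document is not valid UTF-8") from exc

    try:
        root = ET.fromstring(raw)  # the document MUST be well-formed
    except ET.ParseError as exc:
        raise ValueError(f"document is not well-formed: {exc}") from exc

    # ElementTree spells qualified names as "{namespace}local-name".
    prefix = "{" + FILE_NS + "}"
    if not root.tag.startswith(prefix):
        raise ValueError("root element is not in the file namespace")
    return root.tag[len(prefix):]
```

   A receiver could run such a check before any schema validation and
   discard documents for which it raises an error; how a failure is
   reported is a local matter that this sketch does not address.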
   The XML schema rules require that only one root element is present
   in an XML document.  Therefore, a 'file-metadata' document compliant
   with the XML schema definition contains either a root element or a
   root element, but not both.

   Because a 'file-metadata' document is either full or partial, the
   two forms are described separately in Section 3.1 and Section 3.2,
   respectively.

3.1.  Full 'file-metadata' document

   A full 'file-metadata' document begins with the root element tag
   that describes a collection of files.

   The element contains a mandatory 'version' attribute.  When
   'file-metadata' documents are used with protocols that provide the
   notion of a session, such as SIP [RFC3261], an initial appearance of
   a 'file-metadata' document in a session selects an initial value for
   the 'version' attribute of the element.  Subsequent 'file-metadata'
   documents within the same session MUST increment the value of the
   'version' attribute by one, regardless of whether they are full or
   partial, and place it in the or element, as appropriate.  As a
   consequence, the counter of the 'version' attribute is shared
   between and elements.

   The element consists of one or more child elements.  The element MAY
   contain a element, a element, and MAY contain other elements and
   attributes from different namespaces for the purposes of
   extensibility; elements or attributes from unknown namespaces MUST
   be ignored.

   Each element represents the description data of a file.  It includes
   an 'id' attribute that contains a unique identifier.  The value of
   the 'id' attribute MUST be unique within the 'file-metadata'
   document.

   The element indicates the age of the document.

   The element contains a string value, which is usually used for a
   human readable comment.  A element MAY appear as a child element of
   element.
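
   The shared 'version' counter described above lends itself to a
   simple receiver-side check.  The following Python sketch is
   illustrative only: the class and method names are assumptions, and
   the choice to reject an out-of-sequence document while keeping the
   last accepted value is one possible policy, not one mandated by
   this document.

```python
from typing import Optional


class VersionTracker:
    """Tracks the 'version' counter shared by full and partial
    'file-metadata' documents within one session."""

    def __init__(self) -> None:
        # No document seen yet: the first one selects the initial value.
        self.last: Optional[int] = None

    def accept(self, version: int) -> bool:
        """Return True if `version` is the expected next value.

        The first document in a session is always accepted.  Every
        later document, full or partial, must carry the previous
        value plus one.
        """
        if self.last is None or version == self.last + 1:
            self.last = version
            return True
        # Gap or repeat: an update was lost, duplicated, or reordered.
        # Assumption of this sketch: the state is left unchanged.
        return False
```

   For example, after accepting versions 5 and 6, a document carrying
   version 8 is rejected because version 7 has not been seen; the
   receiver can then tell that an update is missing.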
   The element consists of one element and one or more elements.  The
   element MAY also contain other elements and attributes from
   different namespaces for the purposes of extensibility; elements or
   attributes from unknown namespaces MUST be ignored.

   The element groups a number of elements that represent the invariant
   data of the file, i.e., file metadata that is common across
   different instances of the file.  For example, the element provides
   a description for the hash or size of a file.  In contrast, data
   that is specific to the location of the file is grouped in the
   element.  This can include a Uniform Resource Identifier (URI) of
   the user who hosts the file or a human readable description of the
   file.

   The element contains an 'id' attribute whose value MUST be unique
   within the XML document.  The element contains zero or one , , , and
   elements.  The element MAY also contain other elements and
   attributes from different namespaces for the purposes of
   extensibility; elements or attributes from unknown namespaces MUST
   be ignored.

   The element contains a persistent, location-independent resource
   identifier expressed as a Uniform Resource Name (URN) [RFC2141] that
   is allocated to the file and uniquely identifies it.  If present,
   the value of the element MUST be formatted according to the URN
   syntax specified in RFC 2141 [RFC2141].

   The element contains the Multipurpose Internet Mail Extensions
   (MIME) type of the file.  If present, the value of the element
   SHOULD contain an IANA registered MIME media type expressed in
   type/subtype format.

   The element contains the file size in octets.

   The element contains the result of a hashing operation on the file.
   The hash MUST be computed using the US Secure Hash Algorithm 1
   (SHA1) [RFC3174] and MUST be expressed in hexadecimal format.

   One or more elements can be included in the element.
   Each element provides information that is related to a particular
   instance of the file, rather than the file itself.  Each element
   contains an 'id' attribute whose value MUST be unique within the XML
   document.  The element also contains one or more and elements, and
   zero or one , , , , , , and elements.  Additionally, the element MAY
   contain other elements and attributes from different namespaces for
   the purposes of extensibility; elements or attributes from unknown
   namespaces MUST be ignored.

   The element contains a location-dependent, typically
   protocol-specific file identifier expressed as a Uniform Resource
   Identifier (URI) [RFC3986].  The element can be, for example, an
   HTTP or FTP URI.

   The element provides a container for a URI that resolves to the URI
   of the user where the file is available.  For example, this can be a
   SIP or SIPS URI.  This might be useful when it is not possible to
   provide a URI (in the element) that resolves to the file itself, but
   there is instead a URI that resolves to the user that hosts the
   file.

   The element provides a SIP Globally Routable User Agent URI (GRUU)
   [I-D.ietf-sip-gruu] that points to the SIP instance in the User
   Agent where the file is available.  This element complements the by
   providing a pointer to the SIP instance that is hosting the file.

   The element contains the file name.

   The element contains a human readable text describing the file.

   The element contains a URI that points to an icon that represents
   the file.  This is typically applicable to image or video files.

   The element indicates the date and time at which the file was
   created.

   The element indicates the date and time at which the file was last
   modified.

   The element indicates the date and time at which the file was last
   read.

   The element is a container of keywords associated with the file.
   Its main purpose is to assist indexing and search engines.
   The element contains one or more elements (notice the singular form
   of the child elements).  The element MAY contain other attributes
   from different namespaces for the purposes of extensibility;
   attributes from unknown namespaces MUST be ignored.

   Each element contains one word that represents a keyword associated
   with the file.  A element SHOULD NOT contain any white space.  If
   several keywords are to be included, each one should be included in
   a separate element.

3.2.  Partial 'file-metadata' document: patch operations

   A partial 'file-metadata' document begins with the root element tag
   that describes a collection of XML patch operations
   [I-D.ietf-simple-xml-patch-ops] that are to be applied to a previous
   version of a full 'file-metadata' document.

   The element contains a mandatory 'version' attribute whose counter
   is shared with the 'version' attribute of the element.  Each new
   partial 'file-metadata' document MUST increment the 'version'
   attribute value by one with respect to the previously sent version.
   The value of the 'version' attribute can be used to ensure
   consistent updates: the recipient of the document can use the
   'version' number to properly order received documents and to detect
   lost updates.

   The element consists of one or more child <add>, <replace>, or
   <remove> elements whose type definitions are included from the XML
   Patch Operations [I-D.ietf-simple-xml-patch-ops] identified with the
   namespace "urn:ietf:params:xml:schema:xml-patch-ops".  The element
   MAY contain other elements and attributes from different namespaces
   for the purposes of extensibility; elements or attributes from
   unknown namespaces MUST be ignored.

   The <add> element is used to add new content to the 'file-metadata'
   document.  The details of the <add> element are discussed in the XML
   Patch Operations [I-D.ietf-simple-xml-patch-ops].
   The <replace> element is used to update content in the
   'file-metadata' document.  The details of the <replace> element are
   discussed in the XML Patch Operations
   [I-D.ietf-simple-xml-patch-ops].

   The <remove> element is used to remove content from the
   'file-metadata' document.  The details of the <remove> element are
   discussed in the XML Patch Operations
   [I-D.ietf-simple-xml-patch-ops].

   Once all the patch operations have been applied to the previous full
   'file-metadata' document, a new full 'file-metadata' document is
   created with the same 'version' attribute value, letting a
   subsequent partial 'file-metadata' document operate on the last full
   document.

3.3.  XML Schema

   Implementations according to this specification MUST comply with the
   following XML schema that defines the constraints of the
   'file-metadata' document:

      XML Schema Definition to provide information about available
      files at a host.

            Figure 1: 'file-metadata' document XML schema

3.4.  Examples

3.4.1.  Example of a full 'file-metadata' document

   Figure 2 is an example of a 'file-metadata' document.

      image/jpeg
      230432
      72245FE8653DDAF371362F86D471913EE4A2CE2E
      coolpic.jpg
      This is my latest cool picture from my summer vacation
      sip:miguel.an.garcia@example.com
      sip:miguel.an.garcia@example.com;
         gr=urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6
      http://www.example.com/coolpic-icon.jpg
      2006-05-09T09:30:47+03:00
      2006-05-09T10:24:34+03:00
      2006-05-10T14:24:32+03:00
      summer vacation
      2007-11-12T09:55:28Z
      There is only one available file

         Figure 2: Example of a full 'file-metadata' document

   The example in Figure 2 shows a full 'file-metadata' document.
   The example contains the description of a single file: an image
   file.  The source of the information provides a description of the
   file (an element) that contains the static data of the file in the
   element and the variable data (that depends on the actual instance
   of the file) in the element.  The element contains a number of
   characteristics of the file that would not change across different
   instances, such as the MIME type, the size, and the hash of the
   file.  In contrast, the element contains the data related to the
   particular instance of the file, such as the name assigned by the
   user to the file, a human readable description, the GRUU that points
   to the SIP User Agent where the file is stored, and the creation,
   modification, and read dates.  Last, a and a elements indicate the
   date and time when the XML document was created and a human readable
   note.

3.4.2.  Example of a partial 'file-metadata' document

   Figure 3 is an example of a partial 'file-metadata' document.

      message/msrp
      IETFers chat room
      Dedicated chat room for IETF discussions
      sip:miguel.an.garcia@example.com;
         gr=urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6
      sip:miguel.an.garcia@example.com
      2007-11-12T11:00:00Z
      Now I have two available files

        Figure 3: Example of a partial 'file-metadata' document

   The example in Figure 3 shows a partial 'file-metadata' document.
   The document contains the patch operations that add one new file to
   the existing list of files, so applying the patch to the initial
   'file-metadata' document of Figure 2 yields a document that contains
   the description of two files.
   The 'version' attribute of the element is incremented by one with
   respect to the 'version' attribute of the element of the full
   'file-metadata' document in Figure 2.  The document replaces the
   previous and elements with new values.

3.4.3.  Example of a full 'file-metadata' document

   Figure 4 is an example of a full 'file-metadata' document.

      audio/3gpp
      34987
      E05DA01A590E31F6E3100AD7BEC39C63464A1CD0
      recording-1.3gp
      Bob's speech at a conference
      sip:miguel.an.garcia@example.com
      sip:miguel.an.garcia@example.com;
         gr=urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6
      2006-05-01T01:30:47+03:00
      2006-05-02T02:24:34+03:00
      2006-05-03T03:24:32+03:00
      Bob speech
      bob-speech.3gp
      Bob talking about nanotechnology
      sip:alice@example.com
      sip:alice@example.com;
         gr=urn:uuid:f81d4fae-7dec-11d0-4de9-00a2ac4e398a
      2006-05-01T01:30:47+03:00
      2006-05-02T02:24:34+03:00
      2006-05-24T05:12:07+02:00
      Bob nanotechnology
      2007-11-12T14:01:02Z
      There is a single file available at two endpoints

         Figure 4: Example of a full 'file-metadata' document

   The example in Figure 4 shows a full 'file-metadata' document.  The
   document describes a single audio file that is available at two
   different hosts.  Thus, the 'file-metadata' document starts with a
   element that contains the description of the file in the element.
   The element contains an element and two elements.  The element
   contains descriptive invariant data of the file.  Each of the
   elements contains data related to the particular instance where the
   file is available.

3.4.4.  Example of a partial 'file-metadata' document

   Figure 5 is an example of a partial 'file-metadata' document.

      nanotalk.3gp
      Nanotechnology speech
      sip:bob@example.com;
         gr=urn:uuid:f81d4fae-7dec-11d0-5d3a-bbc333431122
      sip:bob@example.com
      2006-06-07T17:26:04+03:00
      2007-11-12T18:02:02Z

        Figure 5: Example of a partial 'file-metadata' document

   Figure 5 contains a number of XML patch operations to be applied to
   the full 'file-metadata' document included in Figure 4.  The
   document in Figure 5 starts with a root element, indicating that it
   is a partial 'file-metadata' document.  The 'version' attribute is
   incremented by one with respect to the 'version' attribute of the
   element of the full 'file-metadata' document of Figure 4.  The
   document contains an <add> element that first selects the element
   whose 'id' attribute is set to "nkcdn0".  Then it appends, at the
   end of the existing child elements, a new element that describes the
   availability of the same file at a different endpoint.  The first
   <replace> element selects the element of the whose 'id' attribute is
   set to "idea1dof" and replaces the value with a new date and time.
   Then, the following <replace> elements replace the and elements,
   respectively.

      Note: the 'sel' attribute of the element in Figure 5 is split
      across two lines due to formatting restrictions.  It will appear
      as a single line in XML documents.

4.  Security Considerations

   TBD

5.  IANA Considerations

   TBD

6.  References

6.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2141]  Moats, R., "URN Syntax", RFC 2141, May 1997.

   [RFC3174]  Eastlake, D. and P. Jones, "US Secure Hash Algorithm 1
              (SHA1)", RFC 3174, September 2001.

   [RFC3339]  Klyne, G., Ed. and C. Newman, "Date and Time on the
              Internet: Timestamps", RFC 3339, July 2002.

   [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
              10646", STD 63, RFC 3629, November 2003.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, January 2005.

   [W3C.REC-xml-20001006]
              Paoli, J., Maler, E., Sperberg-McQueen, C., and T. Bray,
              "Extensible Markup Language (XML) 1.0 (Second Edition)",
              World Wide Web Consortium Recommendation
              REC-xml-20001006, October 2000.

   [I-D.ietf-simple-xml-patch-ops]
              Urpalainen, J., "An Extensible Markup Language (XML)
              Patch Operations Framework Utilizing XML Path Language
              (XPath) Selectors", draft-ietf-simple-xml-patch-ops-03
              (work in progress), August 2007.

6.2.  Informative References

   [RFC2616]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
              Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
              Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

   [RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
              A., Peterson, J., Sparks, R., Handley, M., and E.
              Schooler, "SIP: Session Initiation Protocol", RFC 3261,
              June 2002.

   [I-D.ietf-sip-gruu]
              Rosenberg, J., "Obtaining and Using Globally Routable
              User Agent (UA) URIs (GRUU) in the Session Initiation
              Protocol (SIP)", draft-ietf-sip-gruu-15 (work in
              progress), October 2007.

   [I-D.garcia-sipping-file-sharing-framework]
              Garcia-Martin, M., "Sharing Files with the Session
              Initiation Protocol (SIP)",
              draft-garcia-sipping-file-sharing-framework-00 (work in
              progress), June 2007.

Authors' Addresses

   Miguel A. Garcia-Martin
   Nokia Siemens Networks
   P.O.Box 22
   Nokia Siemens Networks, FIN  02022
   Finland

   Email: miguel.garcia@nsn.com


   Marcin Matuszewski
   Nokia
   P.O.Box 407
   NOKIA GROUP, FIN  00045
   Finland

   Email: marcin.matuszewski@nokia.com

Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
   IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
   WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
   WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE
   ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
   FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).