INTERNET-DRAFT                                              J. Boynton
                                                   Produx House, Corp.
Expires six months from -->                             5 August, 2001

                    Uniform Object Locator -- UOL

Status of This Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Abstract

A Uniform Object Locator (UOL) provides a general-purpose identifier for "human" interaction with object oriented and relational data. UOL is designed to meet the recommendations for URI queries laid out in "Uniform Resource Identifiers (URI) Generic Syntax" [RFC2396]. This document defines the syntax and semantics of UOL, including both absolute and relative forms, and guidelines for their use; it revises features, definitions, and examples given in "draft-boynton-uol-02" and updates the scheme to provide additional functionality.
Boynton                  Expires August 2001                  [page 1]

INTERNET-DRAFT        Uniform Object Locator        5 February, 2001

Table of Contents

   1  Introduction
      1.1  Purpose
      1.2  General Description
      1.3  Terminology
   2  UOL Syntactic Components
      2.1  Path Components
         2.1.1  Authority
         2.1.2  Object Constructor
         2.1.3  Object Name
      2.2  Absolute and Relative Forms
      2.3  Reserved Markers
      2.4  Example UOL
   3  Object Elements
      3.1  Objects
      3.2  Containers
      3.3  Undefined
   4  Name Elements
      4.1  Names
      4.2  Index Names
      4.3  Default Name
   5  Reserved UOL Parameters
   6  Relational Data
   7  UOL, XPath, and XML
   8  References
   9  Author's Address (send comments)

1 Introduction

1.1 Purpose

A Uniform Object Locator (UOL) provides an intuitive, hierarchical, "human-readable" identifier for defining the location and relationship of data within object oriented and relational data structures.

UOL is intended for general purpose use as a command line argument/ID for retrieval and storage of data. It also defines elements that identify relative forms of UOL for simplified concatenation and efficient serialization of UOL within a data stream.

1.2 General Description

The UOL is constructed from a hierarchical list of elements in similar fashion to a URL path component, with the '/' (forward slash) as the delimiter for directory elements. A "//" (double slash) is used to define an authority element.

UOL maps a data structure by providing two separate components that define strings of related hierarchical elements. The two components are separated by a '#' (crosshatch) and divide the UOL into re-usable segments: an object constructor and an object name. When used together, the two segments form a unique all-purpose identifier.

By promoting re-use of the component strings, UOL intends to reduce errors associated with data entry and interpretation. It also allows both user and machine to efficiently and accurately derive data relationships without regard to location.
1.3 Terminology

For clarity, the specific roles of elements within the UOL are defined below.

   element
      A single, un-delimited string of characters.

   directory
      A typical hierarchical string of elements delimited by a '/' (forward slash).

   attribute
      A parameter element descending from an object or directory that terminates a path.

   schema
      For this draft, a schema is a subset of nodes, directories, or attributes defined by a schema, DTD (Document Type Definition), or class definition.

   object
      A collection of directories and/or attributes defined by a schema. An object can be visualized as a re-usable tree of nodes descending from a single directory. In UOL, objects are identified by the prefix "." and by a corresponding name element.

   container
      An object that serves as the parent component to one or more named sub-elements. A container is identified by the prefix "..".

   constructor
      A complete collection of object and/or directory nodes followed by a single attribute; where no attribute is given, the directory path itself.

   object name
      A hierarchical collection of "." delimited elements that, when used together, provide the full name for a UOL constructor.

   name element
      An individual element within an object name. Its value is mapped to an object element within a constructor.

2 UOL Syntactic Components

UOL characteristics are derived from widely implemented URL syntax and semantics. A UOL consists of a scheme, authority, constructor (path component), and name fragment. A UOL may also be appended to a URI as a query component [RFC2396]. The UOL scheme itself does not define a query component.

   #

Where used as a query to a URL, the authority and path component of the URL comprise the authority for the UOL.

   ?#

2.1 Path Components

Elements within a UOL form objects and containers by mapping key names to directories and attributes within a path.
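
As an informal illustration of the components described above, the following Python sketch splits a UOL string into its authority, constructor, and name elements. The names split_uol and UolParts are hypothetical and not part of the UOL scheme. The sketch assumes no scheme prefix, reserved marker, or parameter element is present, and that name elements do not themselves contain a "." character.

```python
from typing import List, NamedTuple, Optional


class UolParts(NamedTuple):
    """Hypothetical container for the pieces of a UOL string."""
    authority: Optional[str]  # None when the UOL has no "//" prefix
    constructor: str          # path component before the '#'
    names: List[str]          # '.'-delimited name elements after the '#'


def split_uol(uol: str) -> UolParts:
    # The '#' (crosshatch) separates the constructor from the object name.
    constructor, sep, name = uol.partition("#")
    authority = None
    if constructor.startswith("//"):
        # "//" introduces the authority, which ends at the next '/'.
        authority, slash, rest = constructor[2:].partition("/")
        constructor = slash + rest
    # Name elements are delimited by "." (dot); no '#' means no names.
    names = name.split(".") if sep else []
    return UolParts(authority, constructor, names)


parts = split_uol("//company/..branch/..dept/.log/hours/total#LA.sales.Jones")
print(parts.authority)    # company
print(parts.constructor)  # /..branch/..dept/.log/hours/total
print(parts.names)        # ['LA', 'sales', 'Jones']
```

Splitting on '#' before looking for the authority keeps the two re-usable segments independent, so either one can be swapped without re-parsing the other.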
2.1.1 Authority

The authority component defines a top level root for resolving a UOL path to an absolute form. All elements within the UOL are bound by the context of the authority and may only resolve to elements that descend from the authority.

2.1.2 Object Constructor

The constructor component maps the relationship of objects, directories, and attributes through the use of a path metaphor. In most cases, the constructor does not map the actual location of data. Instead, a UOL parser uses the constructor as a pattern for binding keys (name elements) to object elements within the constructor. The "compiled" form is then used to retrieve or store data.

2.1.3 Object Name

The name component consists of key elements that are mapped to object elements within the constructor. Name elements are appended, from left to right, in the order of the object elements they identify.

The constructor and name component are separated by a '#' (crosshatch). Name elements are delimited by a "." (dot).

2.2 Absolute and Relative Forms

A UOL can be absolute or relative. An absolute UOL will resolve to, or contain, a top level element preceded by two forward slashes "//" (the authority).

For directory elements descending from an object or authority, UOLs beginning with "../", "./", or "/" are resolved in a manner consistent with the behavior for URI path components in Section 5.2 of [RFC2396]. The prefix itself is not considered an element of the UOL path.

In addition to the components above, UOL provides two special elements for resolving objects. ".~/" and "..~/" resolve a relative path to the current object or container object, respectively.

During UOL parsing, the relative elements above are disambiguated from object elements by the absence of intervening characters between ".", "~", or "/".
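
The disambiguation rule in the preceding paragraph can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name strip_relative_prefixes is hypothetical, a leading "/" is left in place rather than treated as a prefix, and the sketch performs no resolution, it only separates the relative elements from the rest of the path.

```python
from typing import List, Tuple

# Relative elements, longest first so that "..~/" is tried before "../".
RELATIVE_PREFIXES = ("..~/", ".~/", "../", "./")


def strip_relative_prefixes(path: str) -> Tuple[List[str], str]:
    """Split leading relative elements from the remainder of a UOL path."""
    prefixes: List[str] = []
    while True:
        for prefix in RELATIVE_PREFIXES:
            if path.startswith(prefix):
                prefixes.append(prefix)
                path = path[len(prefix):]
                break
        else:
            # No relative element matched: whatever follows is an object,
            # container, directory, or attribute element.
            return prefixes, path


print(strip_relative_prefixes("..~/..~/.dept/administrator"))
# (['..~/', '..~/'], '.dept/administrator')
print(strip_relative_prefixes("..branch/.dept/administrator"))
# ([], '..branch/.dept/administrator')
```

Because ".dept" and "..branch" have characters between the dots and the next "/", neither matches a relative element and both are left in the path as object elements.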
The following elements are defined for resolving a partial UOL to:

   "/"     = AUTHORITY

   "./"    = the current DIRECTORY | OBJECT

   "../"   = the next higher DIRECTORY | OBJECT. This element will not resolve above OBJECT | AUTHORITY.

   ".~/"   = the current OBJECT. Combinations of this element will resolve to higher OBJECT nodes.

   "..~/"  = the current CONTAINER. Combinations of this element will resolve to higher CONTAINER nodes.

Identical prefixes may be combined such that successive attempts to match the element will resolve to higher nodes within the UOL. The last successful match will resolve the UOL to an absolute form. If no match occurs, an application will indicate that no such element exists.

Note: Semantics for combining unlike special elements are not defined by UOL and are considered unsafe.

2.3 Reserved Markers

The UOL scheme reserves the characters '=', '!', and '$' for assigning special behavior to a UOL. The characters modify the manner in which a UOL is treated by an application.

Each marker is reserved for use as a UOL prefix. Each marker modifies the entire relative or absolute UOL segment to which it is applied. Marked UOLs must not have white space between the marker and the UOL it modifies.

   link = "="
      Identifies the UOL as a variable within a formula or data cell. When encountered as a value to an argument, the value referenced by the UOL is returned.

   comment = "!"
      Identifies the UOL as a reference to a "comment" value. The comment provides specific details regarding the data mapped by the UOL.

   constant = "$"
      Indicates that a UOL is mapped to a constant in-memory location that cannot be moved, translated, or resolved to another path.

Note: When used in conjunction with other markers, "=" must always appear first, then "!", then "$".

2.4 Example UOL

The following section provides examples of how object elements are used in a UOL.
The examples given are for a fictitious company with branch offices in three cities: LA, Dallas, and Tampa. Each office has three departments: sales, accounting, and personnel.

The example below shows a UOL referencing the total hours worked by an employee named Jones, from the sales department of the LA branch office.

   "//company/..branch/..dept/.log/hours/total#LA.sales.Jones"

The following pseudo XML code illustrates the construction concepts for the above UOL.

The next example uses the same constructor to reference the total hours worked by an accounting employee at the Dallas branch.

   "/..branch/..dept/.log/hours/total#Dallas.accounting.Smith"

Like the constructor, object names can also be re-used. In the example below, a constructor is modified to retrieve several records for a personnel associate in Tampa.

   "/..branch/..dept/.log/hours/total#Tampa.personnel.Carter"
   "/..branch/..dept/.log/hours/overtime#Tampa.personnel.Carter"
   "/..branch/..dept/.log/hours/vacation#Tampa.personnel.Carter"

The next example shows absolute and relative forms of UOL. The UOLs shown point to an attribute in the department object (.dept).

   abs = "/..branch/.dept/administrator#Tampa.personnel"
   rel = "..~/..~/.dept/administrator#Tampa.personnel"
   rel = ".~/.~/administrator#personnel"

Note that the second relative UOL inherits the branch name "Tampa". Therefore, it does not need to provide it.

3 Object Elements

3.1 Objects

In the UOL scheme, an object is a directory element that declares its sub-elements to represent a tree of objects, directories, or attributes defined by a schema, DTD (Document Type Definition), or class definition.

All UOL objects are mapped by name elements appended to the name component of the UOL. An object element is identified by the prefix "." (dot).
It is important to note that the values "./", "../", ".~/", and "..~/" are reserved for resolving a relative UOL path to an absolute form and are NOT considered object elements (see Section 2.2).

   object reference    "/..object/dir/attr#name"
   relative reference  "../dir/attr"

3.2 Containers

An object element with a prefix of ".." (two dots) is interpreted as a "container" for other objects. A container element extends an object by sharing its name space as well as its name element. It does not, however, reference the object's attributes. Instead, the container diverts the UOL path to nested objects within its scope.

UOL allows objects and containers to be expressed within the same name space because the elements are mutually exclusive (they cannot be visible at the same time). The following relative UOLs illustrate this concept by describing attributes and tags within an HTML table.

   ref to - .table/border#table1="10"
   ref to - ..table/..tr/.td/align#table1.row1.td1="Center"

   Example Text
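
The left-to-right mapping of name elements onto object and container elements can be sketched in Python as follows. The function name bind_names and the (kind, element, name) tuple layout are hypothetical. The sketch assumes the constructor and name elements have already been separated, and that an object element with no remaining name element takes the empty string as its name; the undefined object "~" is not handled.

```python
from typing import List, Tuple

# Relative elements (with the trailing "/" already split off) are not objects.
RELATIVE_ELEMENTS = {".", "..", ".~", "..~"}


def bind_names(constructor: str, names: List[str]) -> List[Tuple[str, str, str]]:
    """Pair each object or container element with its name element."""
    remaining = list(names)
    bound = []
    for element in constructor.split("/"):
        if element == "" or element in RELATIVE_ELEMENTS:
            continue  # empty segment or relative prefix
        if element.startswith(".."):
            kind, label = "container", element[2:]
        elif element.startswith("."):
            kind, label = "object", element[1:]
        else:
            continue  # plain directory or attribute: takes no name element
        # Assumption: an unnamed object element falls back to the empty name.
        name = remaining.pop(0) if remaining else ""
        bound.append((kind, label, name))
    return bound


print(bind_names("..table/..tr/.td/align", ["table1", "row1", "td1"]))
# [('container', 'table', 'table1'), ('container', 'tr', 'row1'), ('object', 'td', 'td1')]
```

The same constructor can then be re-bound to a different list of name elements without being re-parsed, which is the re-use the scheme is designed to encourage.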
3.3 Undefined

To facilitate use of UOL in describing SGML documents and popular subsets of SGML, such as HTML, UOL provides an all-purpose undefined object. The object is primarily used to identify text between mark-up tags.

The undefined object is represented by a "~" (tilde). Within a UOL constructor, it has the same characteristics as an attribute but, because it is an object, it has an associated name element. The name element assigned is always an index value corresponding to the location of the element within its container.

   html = my text
   uol  = /..font/~#fnt1.1="my text"

4 Name Elements

4.1 Names

Each object name space is assigned a unique name id. The name is appended to the object name (if any) inherited from the parent directory. The new object name is then inherited by all sub-directories of the identified object element. Elements in the object name are delimited by a "." (dot).

   "/.object/sub_dir/field#name"
   "/.parent_object/.child_object/field#parent_name.child_name"

4.2 Index Names

UOL allows name substitution with index values. The index name is enclosed by "(" and ")" (open and closed parentheses). For any list of object names, the integer index may be used in place of the actual name. Where no list is given, the name element will be parsed to the string value of the index number (i.e. "(9)" == "9").

It is envisioned that name elements containing parentheses may contain platform dependent code such as functions, arithmetic operators, and variables. Therefore, parsers should not attempt to interpret the index value directly. Also, because such content is likely to include reserved and unsafe characters, such code should be converted to "x-form-url-encoded" strings. Characters outside of the parentheses, but within the name element, should be removed.

4.3 Default Name

The "" (empty string) is reserved for naming default object elements.
Applications can use the default object for storing schemas or default values. Where a named object appears with a default object, the "." (dot) delimiter implies the default name.

   default name (empty)     "//authority/..object1/..object2/attr#"
   default name (leading)   "//authority/..object1/..object2/attr#.name2"
   default name (trailing)  "//authority/..object1/..object2/attr#name1"

5 Reserved UOL Parameters

Though UOL does not define a language for interacting with its components, it does provide a limited convention for passing parameters to a host application. Unlike other elements in a UOL, parameter elements are case insensitive.

The parameters provide useful information on directory elements (including objects) within the UOL constructor. The information provided may be used by software for memory allocation, assigning permissions, indexing, and for the creation of "in-memory" objects based upon UOL patterns.

UOL parameters are delimited from the constructor by an "@" (commercial at sign). Where present, "@" indicates both the start of a parameter and the end of a directory. All characters between "@" and the next "#", or the end of a UOL, are considered a parameter element.

To simplify concatenation of a parameter element within a UOL, it is permissible to append a parameter to a directory element without removing the directory's trailing forward slash. However, a well formed UOL should have the "/" removed.

   "/.object@attlist#Name1" | "/.object/@attlist#Name1"

The following parameters are reserved for UOL:

   @names = Used to obtain a line delimited string of name elements. Each represents a "human readable" key extension in the current object directory.

   @schema = Used to obtain a line delimited list of UOL strings representing a complete schema for sub-elements of the current directory.
   The list provides partial UOL constructors that will resolve correctly when combined with the current path. Each element path is enumerated in turn. The object name fragment (if any) is removed. This list is "read-only".

   @attlist = A non-recursive list of elements contained in the current directory.

   @code = The class description or file for an object that will use data from this directory.

   @codebase = The path description or URL to a class described by "@code".

   @aka = "Also Known As" holds a single, alternate name for describing an object element. Where present, "aka" should provide the most common generic substitute for the actual name used.

   @functions = Used to retrieve or update a line delimited list of functions that will interact with values associated with a UOL directory. Functions should be stated in a form consistent with a "classid" URI; see Section 13.3 of [HTML401].

6 Relational Data

[Pending] This section will discuss conventions for expressing fields within a relational database through the use of nested UOL strings.

7 UOL, XPath, and XML

User interaction with parsed XML data is one possible use for the UOL scheme. However, UOL is NOT proposed for referencing attributes from within XML schemas, DTDs, or payload. This functionality is provided by XML Path Language, which was specifically written for this task. At the time this draft was updated, the W3C recommendation for XML Path Language was [XPath10].

8 References

   [RFC2396]  T. Berners-Lee, R. Fielding, and L. Masinter, "Uniform Resource Identifiers (URI) Generic Syntax". IETF RFC 2396, August 1998.

   [HTML401]  D. Raggett, A. Le Hors, and I. Jacobs, Editors, "HTML 4.01 Specification". W3C Recommendation HTML401, 24 December 1999.

   [XPath10]  J. Clark and S. DeRose, Editors, "XML Path Language (XPath) Version 1.0". W3C Recommendation XPath10, 16 November 1999.

9 Author's Address

   Jon L. Boynton
   Produx House, Corp.
   19300 Nalle Rd.
   North Ft. Myers, FL 33917

   Phone 941 543 4491
   Email Comments to jon@datamessenger.com