Network Working Group                                         C. Bormann
Internet-Draft                                    Universität Bremen TZI
Intended status: Informational                               4 July 2024
Expires: 5 January 2025


                CBOR Extended Diagnostic Notation (EDN)
                    draft-ietf-cbor-edn-literals-10

Abstract

   The Concise Binary Object Representation, CBOR (STD 94, RFC 8949),
   defines a "diagnostic notation" in order to be able to converse about
   CBOR data items without having to resort to binary data.  RFC 8610
   extends this into what is known as Extended Diagnostic Notation
   (EDN).

   This document sets forth a further step of evolution of EDN, and it
   is intended to serve as a single reference target in specifications
   that use EDN.  It specifies how to add application-oriented
   extensions to the diagnostic notation.  It then defines two such
   extensions for text representations of epoch-based date/times and of
   IP addresses and prefixes (RFC 9164).

   A few further additions close some gaps in usability.  It modifies
   one extension specified in Appendix G.4 of RFC 8610 to enable further
   increasing usability.  To facilitate tool interoperation, this
   document specifies a formal ABNF definition for EDN as defined today,
   and it adds media types.

About This Document

   This note is to be removed before publishing as an RFC.

   The latest revision of this draft can be found at
   https://cbor-wg.github.io/edn-literal/.  Status information for this
   document may be found at
   https://datatracker.ietf.org/doc/draft-ietf-cbor-edn-literals/.

   Discussion of this document takes place on the cbor Working Group
   mailing list (mailto:cbor@ietf.org), which is archived at
   https://mailarchive.ietf.org/arch/browse/cbor/.  Subscribe at
   https://www.ietf.org/mailman/listinfo/cbor/.

   Source for this draft and an issue tracker can be found at
   https://github.com/cbor-wg/edn-literal.

Bormann                  Expires 5 January 2025                 [Page 1]

Internet-Draft   CBOR Extended Diagnostic Notation (EDN)       July 2024

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.
   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 5 January 2025.

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Revised BSD License text as described in Section 4.e of the
   Trust Legal Provisions and are provided without warranty as described
   in the Revised BSD License.

Table of Contents

   1.  Introduction
     1.1.  Structure of This Document
     1.2.  Terminology
     1.3.  (Non-)Objectives of this Document
   2.  Application-Oriented Extension Literals
     2.1.  The "dt" Extension
     2.2.  The "ip" Extension
   3.  Stand-in Representations in Binary CBOR
     3.1.  Handling unknown application-extension identifiers
     3.2.  Handling information deliberately elided from an EDN
           document
   4.  ABNF Definitions
     4.1.  Overall ABNF Definition for Extended Diagnostic Notation
     4.2.  ABNF Definitions for app-string Content
       4.2.1.  h: ABNF Definition of Hexadecimal representation of a
               byte string
       4.2.2.  b64: ABNF Definition of Base64 representation of a byte
               string
       4.2.3.  dt: ABNF Definition of RFC 3339 Representation of a
               Date/Time
       4.2.4.  ip: ABNF Definition of Textual Representation of an IP
               Address
   5.  IANA Considerations
     5.1.  CBOR Diagnostic Notation Application-extension Identifiers
           Registry
     5.2.  Encoding Indicators
     5.3.  Media Type
     5.4.  Content-Format
     5.5.  Stand-in Tags
   6.  Security considerations
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Appendix A.  EDN and CDDL
   Acknowledgements
   Author's Address

1.  Introduction

   For the Concise Binary Object Representation (CBOR), Section 8 of RFC
   8949 [STD94] in conjunction with Appendix G of [RFC8610] defines a
   "diagnostic notation" in order to be able to converse about CBOR data
   items without having to resort to binary data.
   Diagnostic notation syntax is based on JSON, with extensions for
   representing CBOR constructs such as binary data and tags.

   (Standardizing this together with the actual interchange format does
   not serve to create another interchange format, but enables the use
   of a shared diagnostic notation in tools for and in documents about
   CBOR.)

   This document sets forth a further step of evolution of EDN, and it
   is intended to serve as a single reference target in specifications
   that use EDN.  It specifies how to add application-oriented
   extensions to the diagnostic notation.  It then defines two such
   extensions for text representations of epoch-based date/times and of
   IP addresses and prefixes [RFC9164].

   A few further additions close some gaps in usability.  It modifies
   one extension specified in Appendix G.4 of RFC 8610 to enable further
   increasing usability.  To facilitate tool interoperation, this
   document specifies a formal ABNF definition for EDN as defined today.
   (See Section 4.1 for an overall ABNF grammar as well as the ABNF
   definitions in Section 4.2 for grammars for both the byte string
   presentations predefined in [STD94] and the application-extensions
   defined here.)

   In addition, this document finally registers a media type identifier
   and a content-format for CBOR diagnostic notation.  This does not
   elevate its status as an interchange format, but recognizes that
   interaction between tools is often smoother if media types can be
   used.

   |  Examples in RFCs often do not use media type identifiers, but
   |  special sourcecode type names that are allocated in
   |  https://www.rfc-editor.org/materials/sourcecode-types.txt.
   |  At the time of writing, this resource lists four sourcecode
   |  type names that can be used in RFCs for including CBOR data
   |  items and CBOR-related languages:
   |
   |  *  cbor (which is actually not useful, as CBOR is a binary
   |     format and cannot be used in textual examples in an RFC),
   |
   |  *  cbor-diag (which is another name for EDN, as defined in
   |     the present document),
   |
   |  *  cbor-pretty (which is a possibly annotated and
   |     pretty-printed hexdump of an encoded CBOR data item, along
   |     the lines of the grammar of Section 4.2.1, as used for
   |     instance for some of the examples in Appendix A.3 of
   |     [RFC9290]), and
   |
   |  *  cddl (which is used for the Concise Data Definition
   |     Language, CDDL, see Section 1.2 below).

1.1.  Structure of This Document

   After introductory material, Section 2 introduces the concept of
   application-oriented extension literals and defines the "dt" and "ip"
   extensions.  Section 3 defines mechanisms for dealing with unknown
   application-oriented literals and deliberately elided information.
   Section 4 gives the formal syntax of EDN in ABNF, with explanations
   for some features of and additions to this syntax, as an overall
   grammar (Section 4.1) and specific grammars for the content of
   app-string and byte-string literals (Section 4.2).  This is followed
   by the conventional sections for IANA Considerations (5), Security
   considerations (6), and References (7.1, 7.2).  An informational
   comparison of EDN with CDDL follows in Appendix A.

1.2.  Terminology

   Section 8 of RFC 8949 [STD94] defines the original CBOR diagnostic
   notation, and Appendix G of [RFC8610] supplies a number of extensions
   to the diagnostic notation that result in the Extended Diagnostic
   Notation (EDN).  The diagnostic notation extensions include popular
   features such as embedded CBOR (encoded CBOR data items in byte
   strings) and comments.
   A simple diagnostic notation extension that enables representing CBOR
   sequences was added in Section 4.2 of [RFC8742].  As diagnostic
   notation is not used in the kind of interchange situations where
   backward compatibility would pose a significant obstacle, there is
   little point in not using these extensions.

   Therefore, when we refer to "_diagnostic notation_", we mean to
   include the original notation from Section 8 of RFC 8949 [STD94] as
   well as the extensions from Appendix G of [RFC8610], Section 4.2 of
   [RFC8742], and the present document.  However, we stick to the
   abbreviation "_EDN_" as it has become quite popular and is more
   sharply distinguishable from other meanings than "DN" would be.

   In a similar vein, the term "ABNF" in this document refers to the
   language defined in [STD68] as extended in [RFC7405], where the
   "characters" of Section 2.3 of RFC 5234 [STD68] are Unicode scalar
   values.  The term "CDDL" (Concise Data Definition Language) refers to
   the data definition language defined in [RFC8610] and its registered
   extensions (such as those in [RFC9165]), as well as
   [I-D.ietf-cbor-update-8610-grammar].  Additional information about
   the relationship between the two languages EDN and CDDL is captured
   in Appendix A.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

1.3.  (Non-)Objectives of this Document

   Section 8 of RFC 8949 [STD94] states the objective of defining a
   human-readable diagnostic notation with CBOR.  In particular, it
   states:

   |  All actual interchange always happens in the binary format.

   One important application of EDN is the notation of CBOR data for
   humans: in specifications, on whiteboards, and for entering test
   data.  A number of features, such as comments in string literals, are
   mainly useful for people-to-people communication via EDN.  Programs
   also often output EDN for diagnostic purposes, such as in error
   messages or to enable comparison (including generation of diffs via
   tools) with test data.

   For comparison with test data, it is often useful if different
   implementations generate the same (or similar) output for the same
   CBOR data items.  This is comparable to the objectives of
   deterministic serialization for CBOR data items themselves
   (Section 4.2 of RFC 8949 [STD94]).  However, there are even more
   representation variants in EDN than in binary CBOR, and there is
   little point in specifically endorsing a single variant as
   "deterministic" when other variants may be more useful for human
   understanding, e.g., the << >> notation as opposed to h''; an EDN
   generator may have quite a few options that control what presentation
   variant is most desirable for the application that it is being used
   for.

   Because of this, a deterministic representation is not defined for
   EDN, and there is no expectation for "roundtripping" from EDN to CBOR
   and back, i.e., for an ability to convert EDN to binary CBOR and back
   to EDN while achieving exactly the same result as the original input
   EDN, which possibly was created by humans or by a different EDN
   generator.

   However, there is a certain expectation that EDN generators can be
   configured to some basic output format, which:

   *  looks like JSON where that is possible;

   *  inserts encoding indicators only where the binary form differs
      from preferred encoding;

   *  uses hexadecimal representation (h'') for byte strings, not b64''
      or embedded CBOR (<<>>);

   *  does not generate elaborate blank space (newlines, indentation)
      for pretty-printing, but does use common blank spaces such as
      after , and :.

   Additional features such as ensuring deterministic map ordering
   (Section 4.2 of RFC 8949 [STD94]) on output, or even deviating from
   the basic configuration in some systematic way, can further assist in
   comparing test data.  Information obtained from a CDDL model can help
   in choosing application-oriented literals or specific string
   representations such as embedded CBOR or b64'' in the appropriate
   places.

2.  Application-Oriented Extension Literals

   This document extends the syntax used in diagnostic notation for byte
   string literals to also be available for application-oriented
   extensions.

   As per Section 8 of RFC 8949 [STD94], the diagnostic notation can
   notate byte strings in a number of [RFC4648] base encodings, where
   the encoded text is enclosed in single quotes, prefixed by an
   identifier (»h« for base16, »b32« for base32, »h32« for base32hex,
   »b64« for base64 or base64url).

   This syntax can be thought to establish a name space, with the names
   "h", "b32", "h32", and "b64" taken, but other names being
   unallocated.  The present specification defines additional names for
   this namespace, which we call _application-extension identifiers_.
   For the quoted string, the same rules apply as for byte strings.
   In particular, the escaping rules that were adapted from JSON strings
   are applied equivalently for application-oriented extensions, e.g.,
   within the quoted string \\ stands for a single backslash and \'
   stands for a single quote.

   An application-extension identifier is a name consisting of a
   lower-case ASCII letter (a-z) and zero or more additional ASCII
   characters that are either lower-case letters or digits (a-z0-9).

   Application-extension identifiers are registered in a registry
   (Section 5.1).

   Prefixing a single-quoted string, an application-extension identifier
   is used to build an application-oriented extension literal, which
   stands for a CBOR data item the value of which is derived from the
   text given in the single-quoted string using a procedure defined in
   the specification for an application-extension identifier.

   An application-extension (such as dt) MAY also define the meaning of
   a variant of the application-extension identifier where each
   lower-case character is replaced by its upper-case counterpart (such
   as DT), for building an application-oriented extension literal using
   that all-uppercase variant as the prefix of a single-quoted string.

   As a convention for such definitions, using the all-uppercase variant
   implies making use of a tag appropriate for this application-oriented
   extension (such as tag number 1 for DT).

   Examples for application-oriented extensions to CBOR diagnostic
   notation can be found in the following sections.

2.1.  The "dt" Extension

   The application-extension identifier "dt" is used to notate a
   date/time literal that can be used as an Epoch-Based Date/Time as per
   Section 3.4.2 of RFC 8949 [STD94].

   The text of the literal is a Standard Date/Time String as per
   Section 3.4.1 of RFC 8949 [STD94].

   The value of the literal is a number representing the result of a
   conversion of the given Standard Date/Time String to an Epoch-Based
   Date/Time.  If fractional seconds are given in the text (production
   time-secfrac in Figure 4), the value is a floating-point number; the
   value is an integer number otherwise.  In the all-upper-case variant
   of the app-prefix, the value is enclosed in a tag number 1.

   As an example, the CBOR diagnostic notation

      dt'1969-07-21T02:56:16Z',
      dt'1969-07-21T02:56:16.5Z',
      DT'1969-07-21T02:56:16Z'

   is equivalent to

      -14159024,
      -14159023.5,
      1(-14159024)

   See Section 4.2.3 for an ABNF definition for the content of dt
   literals.

2.2.  The "ip" Extension

   The application-extension identifier "ip" is used to notate an IP
   address literal that can be used as an IP address as per Section 3 of
   [RFC9164].

   The text of the literal is an IPv4address or IPv6address as per
   Section 3.2.2 of [RFC3986].

   With the lower-case app-string ip, the value of the literal is a byte
   string representing the binary IP address.  With the upper-case
   app-string IP, the literal is such a byte string tagged with tag
   number 54, if an IPv6address is used, or tag number 52, if an
   IPv4address is used.

   As an additional case, the upper-case app-string IP'' can be used
   with a prefix such as 2001:db8::/56 or 192.0.2.0/24, with the
   equivalent tag as its value.  (Note that [RFC9164] representations of
   address prefixes need to implement the truncation of the address byte
   string as described in Section 4.2 of [RFC9164]; see example below.)
   For completeness, the lower-case variant ip'2001:db8::/56' or
   ip'192.0.2.0/24' stands for an unwrapped [56,h'20010db8'] or
   [24,h'c00002']; however, in this case the information on whether an
   address is IPv4 or IPv6 often needs to come from the context.
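
   The conversions that Sections 2.1 and 2.2 describe can be sketched
   as follows.  This is a minimal, non-normative illustration in Python
   that uses only the standard library; the function names dt_value,
   ip_value, and ip_tag_number are hypothetical and not part of this
   specification, the regular expression is a simplification of the
   date/time grammar referenced in Section 4.2.3, and the removal of
   trailing zero bytes from prefixes is an assumption about the
   truncation rule of [RFC9164] that matches the examples in this
   document.

```python
import calendar
import ipaddress
import re

# Simplified shape of a Standard Date/Time String (illustration only;
# the normative grammar is the ABNF referenced in Section 4.2.3).
_DT = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})[Tt](\d{2}):(\d{2}):(\d{2})(\.\d+)?"
    r"([Zz]|[+-]\d{2}:\d{2})"
)


def dt_value(text):
    """Value of dt'...': an int, or a float if fractional seconds are given."""
    m = _DT.fullmatch(text)
    if m is None:
        raise ValueError("not a date/time string: " + text)
    fields = tuple(int(g) for g in m.groups()[:6])
    epoch = calendar.timegm(fields)  # seconds since 1970-01-01T00:00:00Z
    zone = m.group(8)
    if zone not in ("Z", "z"):
        offset = int(zone[1:3]) * 3600 + int(zone[4:6]) * 60
        epoch -= offset if zone[0] == "+" else -offset
    if m.group(7) is not None:
        return epoch + float(m.group(7))
    return epoch


def ip_value(text):
    """Untagged value of ip'...': bytes, or [length, bytes] for a prefix."""
    if "/" in text:
        # strict=True rejects prefixes with non-zero host bits.
        net = ipaddress.ip_network(text, strict=True)
        nbytes = (net.prefixlen + 7) // 8
        # Assumption: trailing zero bytes are dropped after truncation.
        prefix = net.network_address.packed[:nbytes].rstrip(b"\x00")
        return [net.prefixlen, prefix]
    return ipaddress.ip_address(text).packed


def ip_tag_number(text):
    """Tag number that the upper-case IP'...' variant wraps around the value."""
    address = ipaddress.ip_address(text.split("/")[0])
    return 54 if address.version == 6 else 52
```

   With these definitions, dt_value("1969-07-21T02:56:16Z") yields
   -14159024 and ip_value("192.0.2.0/24") yields the pair of 24 and the
   bytes c0 00 02, in line with the examples given in this section.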

   Note that there is no direct representation of an address combined
   with a prefix length; this can be represented as
   52([ip'192.0.2.42',24]), if needed.

   Examples: the CBOR diagnostic notation

      ip'192.0.2.42',
      IP'192.0.2.42',
      IP'192.0.2.0/24',
      ip'2001:db8::42',
      IP'2001:db8::42',
      IP'2001:db8::/64'

   is equivalent to

      h'c000022a',
      52(h'c000022a'),
      52([24,h'c00002']),
      h'20010db8000000000000000000000042',
      54(h'20010db8000000000000000000000042'),
      54([64,h'20010db8'])

   See Section 4.2.4 for an ABNF definition for the content of ip
   literals.

3.  Stand-in Representations in Binary CBOR

   In some cases, an EDN consumer cannot construct actual CBOR items
   that represent the CBOR data intended for eventual interchange.  This
   document defines stand-in representations for two such cases:

   *  The EDN consumer does not know (or does not implement) an
      application-extension identifier used in the EDN document
      (Section 3.1) but wants to preserve the information for a later
      processor.

   *  The generator of some EDN intended for human consumption (such as
      in a specification document) may not want to include parts of the
      final data item, destructively replacing complete subtrees or
      possibly just parts of a lengthy string by _elisions_
      (Section 3.2).

   Implementation note: Typically, the ultimate applications will fail
   if they encounter tags unknown to them, which the ones defined in
   this section likely are.  Where chains of tools are involved in
   processing EDN, it may be useful to fail earlier than at the ultimate
   receiver in the chain unless specific processing options (e.g.,
   command line flags) are given that indicate which of these stand-ins
   are expected at this stage in the chain.

3.1.  Handling unknown application-extension identifiers

   When ingesting CBOR diagnostic notation, any application-oriented
   extension literals are usually decoded and transformed into the
   corresponding data item during ingestion.  If an
   application-extension is not known or not implemented by the
   ingesting process, this is usually an error and processing has to
   stop.

   However, in certain cases, it can be desirable to exceptionally carry
   an uninterpreted application-oriented extension literal in an
   ingested data item, allowing its decoding to be postponed to a
   specific later stage of ingestion.

   This specification defines a CBOR Tag for this purpose: The
   Diagnostic Notation Unresolved Application-Extension Tag, tag number
   CPA999 (Section 5.5).  The content of this tag is an array of two
   text strings: The application-extension identifier, and the
   (escape-processed) content of the single-quoted string.  For example,
   dt'1969-07-21T02:56:16Z' can be provisionally represented as
   /CPA/999(["dt", "1969-07-21T02:56:16Z"]).

   If a stage of ingestion is not prepared to handle the Unresolved
   Application-Extension Tag, this is an error and processing has to
   stop, as if this stage had been ingesting an unknown or unimplemented
   application-extension literal itself.

   // RFC-Editor: This document uses the CPA (code point allocation)
   // convention described in [I-D.bormann-cbor-draft-numbers].  For
   // each usage of the term "CPA", please remove the prefix "CPA" from
   // the indicated value and replace the residue with the value
   // assigned by IANA; perform an analogous substitution for all other
   // occurrences of the prefix "CPA" in the document.  Finally, please
   // remove this note.

3.2.  Handling information deliberately elided from an EDN document

   When using EDN for exposition in a document or on a whiteboard, it is
   often useful to be able to leave out parts of an EDN document that
   are not of interest at that point of the exposition.

   To facilitate this, this specification supports the use of an
   _ellipsis_ (notated as three or more dots in a row, as in ...) to
   indicate parts of an EDN document that have been elided (and
   therefore cannot be reconstructed).

   Upon ingesting EDN as a representation of a CBOR data item for
   further processing, the occurrence of an ellipsis usually is an error
   and processing has to stop.

   However, it is useful to be able to process EDN documents with
   ellipses in the automation scripts for the documents using them.
   This specification defines a CBOR Tag that can be used in the
   ingestion for this purpose: The Diagnostic Notation Ellipsis Tag, tag
   number CPA888 (Section 5.5).  The content of this tag either is

   1.  null (indicating a data item entirely replaced by an ellipsis),
       or it is

   2.  an array, the elements of which are alternating between fragments
       of a string and the actual elisions, represented as ellipses
       carrying a null as content.

   Elisions can stand in for entire subtrees, e.g. in:

      [1, 2, ..., 3]
      { "a": 1,
        "b": ...,
        ...: ...
      }

   A single ellipsis (or key/value pair of ellipses) can imply eliding
   multiple elements in an array (members in a map); if more detailed
   control is required, a data definition language such as CDDL can be
   employed.  (Note that the stand-in form defined here does not allow
   multiple key/value pairs with an ellipsis as a key: the CBOR data
   item would not be valid.)
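
   The two forms of tag content listed above can be sketched as follows.
   This is a non-normative Python illustration: the Tag class and the
   helper elided_string are hypothetical names introduced only for this
   sketch, Python's built-in Ellipsis object stands for an ellipsis in
   the EDN source, and the number 888 is the CPA888 placeholder rather
   than an assigned tag number.

```python
from dataclasses import dataclass
from typing import Any

ELLIPSIS_TAG = 888  # placeholder for CPA888; the final value is set by IANA


@dataclass(frozen=True)
class Tag:
    """Hypothetical in-memory model of a CBOR tag (number plus content)."""
    number: int
    content: Any


# Form 1: a data item entirely replaced by an ellipsis.
ELIDED = Tag(ELLIPSIS_TAG, None)


def elided_string(parts):
    """Form 2: string fragments alternating with elisions.

    `parts` is a list of str (or bytes) fragments and Ellipsis markers.
    Adjacent fragments are joined so that fragments and elisions
    alternate in the resulting array.  Mixing str and bytes fragments
    raises TypeError in this sketch.
    """
    content = []
    for part in parts:
        if part is Ellipsis:
            content.append(ELIDED)
        elif content and not isinstance(content[-1], Tag):
            content[-1] = content[-1] + part  # join adjacent fragments
        else:
            content.append(part)
    return Tag(ELLIPSIS_TAG, content)
```

   For instance, elided_string(["Herewith I buy", ..., "gned: Alice &
   Bob"]) produces a tag whose content is the first fragment, the
   stand-in Tag(888, None), and the last fragment, in that order.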

   Subtree elisions can be represented in a CBOR data item by using
   /CPA/888(null) as the stand-in:

      [1, 2, 888(null), 3]
      { "a": 1,
        "b": 888(null),
        888(null): 888(null)
      }

   Elisions also can be used as part of a (text or byte) string:

      { "contract": "Herewith I buy" + ... + "gned: Alice & Bob",
        "signature": h'4711...0815',
      }

   The example "contract" uses string concatenation as per Appendix G.4
   of [RFC8610] as updated by Section 4.1, extending that by allowing
   ellipses; while the example "signature" uses special syntax that
   allows the use of ellipses between the bytes notated _inside_ h''
   literals.

   String elisions can be represented in a CBOR data item by a stand-in
   that wraps an array of string fragments alternating with ellipsis
   indicators:

      { "contract": /CPA/888(["Herewith I buy", 888(null),
                              "gned: Alice & Bob"]),
        "signature": 888([h'4711', 888(null), h'0815']),
      }

   Note that the use of elisions is different from "commenting out" EDN
   text, e.g.:

      { "contract": "Herewith I buy" /.../ "gned: Alice & Bob",
        "signature": h'4711/.../0815',
        # ...: ...
      }

   The consumer of this EDN will ignore the comments and therefore will
   have no idea after ingestion that some information has been elided;
   validation steps may then simply fail instead of being informed about
   the elisions.

4.  ABNF Definitions

   This section collects grammars in ABNF form ([STD68] as extended in
   [RFC7405]) that serve to define the syntax of EDN and some
   application-oriented literals.

   Implementation note: The ABNF definitions in this section are
   intended to be useful in a Parsing Expression Grammar (PEG) parser
   interpretation (see Appendix A of [RFC8610] for an introduction to
   PEG).

4.1.  Overall ABNF Definition for Extended Diagnostic Notation

   This subsection provides an overall ABNF definition for the syntax of
   CBOR extended diagnostic notation.

   To complete the parsing of an app-string with prefix, say, p, the
   processed sqstr inside it is further parsed using the ABNF definition
   specified for the production app-string-p in Section 4.2.

   For simplicity, the internal parsing for the built-in EDN prefixes is
   specified in the same way.  ABNF definitions for h'' and b64'' are
   provided in Section 4.2.1 and Section 4.2.2.  However, the prefixes
   b32'' and h32'' are not in wide use and an ABNF definition in this
   document could therefore not be based on implementation experience.

   seq             = S [item S *(OC item S) OC]
   one-item        = S item S
   item            = map / array / tagged
                   / number / simple
                   / string / streamstring

   string1         = (tstr / bstr) spec
   string1e        = string1 / ellipsis
   ellipsis        = 3*"." ; "..." or more dots
   string          = string1e *(S "+" S string1e)

   number          = (basenumber / decnumber / infin) spec
   sign            = "+" / "-"
   decnumber       = [sign] (1*DIGIT ["." *DIGIT] / "." 1*DIGIT)
                            ["e" [sign] 1*DIGIT]
   basenumber      = [sign] "0" ("x" 1*HEXDIG
                                 [["." *HEXDIG] "p" [sign] 1*DIGIT]
                               / "x" "." 1*HEXDIG "p" [sign] 1*DIGIT
                               / "o" 1*ODIGIT
                               / "b" 1*BDIGIT)
   infin           = %s"Infinity"
                   / %s"-Infinity"
                   / %s"NaN"
   simple          = %s"false"
                   / %s"true"
                   / %s"null"
                   / %s"undefined"
                   / %s"simple(" S item S ")"
   uint            = "0" / DIGIT1 *DIGIT
   tagged          = uint spec "(" S item S ")"

   app-prefix      = lcalpha *lcalnum ; including h and b64
                   / ucalpha *ucalnum ; tagged variant, if defined
   app-string      = app-prefix sqstr
   sqstr           = "'" *single-quoted "'"
   bstr            = app-string / sqstr / embedded
                     ; app-string could be any type
   tstr            = DQUOTE *double-quoted DQUOTE
   embedded        = "<<" seq ">>"

   array           = "[" spec S [item S *(OC item S) OC] "]"
   map             = "{" spec S [kp S *(OC kp S) OC] "}"
   kp              = item S ":" S item

   ; We allow %x09 HT in prose, but not in strings
   blank           = %x09 / %x0A / %x0D / %x20
   non-slash       = blank / %x21-2e / %x30-D7FF / %xE000-10FFFF
   non-lf          = %x09 / %x0D / %x20-D7FF / %xE000-10FFFF
   S               = *blank *(comment *blank)
   comment         = "/" *non-slash "/"
                   / "#" *non-lf %x0A

   ; optional comma (ignored)
   OC              = ["," S]

   ; check semantically that strings are either all text or all bytes
   ; note that there must be at least one string to distinguish
   streamstring    = "(_" S string S *(OC string S) OC ")"
   spec            = ["_" *wordchar]

   double-quoted   = unescaped
                   / "'"
                   / "\" DQUOTE
                   / "\" escapable

   single-quoted   = unescaped
                   / DQUOTE
                   / "\" "'"
                   / "\" escapable

   escapable       = %s"b" ; BS backspace U+0008
                   / %s"f" ; FF form feed U+000C
                   / %s"n" ; LF line feed U+000A
                   / %s"r" ; CR carriage return U+000D
                   / %s"t" ; HT horizontal tab U+0009
                   / "/"   ; / slash (solidus) U+002F (JSON!)
                   / "\"   ; \ backslash (reverse solidus) U+005C
                   / (%s"u" hexchar) ; uXXXX U+XXXX

   hexchar         = "{" (1*"0" [ hexscalar ] / hexscalar) "}"
                   / non-surrogate
                   / (high-surrogate "\" %s"u" low-surrogate)
   non-surrogate   = ((DIGIT / "A"/"B"/"C" / "E"/"F") 3HEXDIG)
                   / ("D" ODIGIT 2HEXDIG )
   high-surrogate  = "D" ("8"/"9"/"A"/"B") 2HEXDIG
   low-surrogate   = "D" ("C"/"D"/"E"/"F") 2HEXDIG
   hexscalar       = "10" 4HEXDIG / HEXDIG1 4HEXDIG
                   / non-surrogate / 1*3HEXDIG

   ; Note that no other C0 characters are allowed, including %x09 HT
   unescaped       = %x0A ; new line
                   / %x0D ; carriage return -- ignored on input
                   / %x20-21
                        ; omit 0x22 "
                   / %x23-26
                        ; omit 0x27 '
                   / %x28-5B
                        ; omit 0x5C \
                   / %x5D-D7FF ; skip surrogate code points
                   / %xE000-10FFFF

   DQUOTE          = %x22    ; " double quote

   DIGIT           = %x30-39 ; 0-9
   DIGIT1          = %x31-39 ; 1-9
   ODIGIT          = %x30-37 ; 0-7
   BDIGIT          = %x30-31 ; 0-1
   HEXDIG          = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
   HEXDIG1         = DIGIT1 / "A" / "B" / "C" / "D" / "E" / "F"
   ; Note: double-quoted strings as in "A" are case-insensitive in ABNF

   lcalpha         = %x61-7A ; a-z
   lcalnum         = lcalpha / DIGIT
   ucalpha         = %x41-5A ; A-Z
   ucalnum         = ucalpha / DIGIT
   wordchar        = "_" / lcalnum / ucalpha ; [_a-z0-9A-Z]

             Figure 1: Overall ABNF Definition of CBOR EDN

   While an ABNF grammar defines the set of character strings that are
   considered to be valid EDN by this ABNF, the mapping of these
   character strings into the generic data model of CBOR is not always
   obvious.

   The following additional items should help in the interpretation:

   *  decnumber stands for an integer in the usual decimal notation,
      unless at least one of the optional parts starting with "." and
      "e" is present, in which case it stands for a floating point value
      in the usual decimal notation.  Note that the grammar now allows
      3. for 3.0 and .3 for 0.3 (also for hexadecimal floating point
      below); implementers are advised that some platform numeric
      parsers accept only a subset of the floating point syntax in this
      document and may require some preprocessing to use here.

   *  basenumber stands for an integer in the usual base 16/hexadecimal
      ("0x"), base 8/octal ("0o"), or base 2/binary ("0b") notation,
      unless the optional part containing a "p" is present, in which
      case it stands for a floating point number in the usual
      hexadecimal notation (which uses a mantissa in hexadecimal and an
      exponent in decimal notation, see Section 5.12.3 of [IEEE754],
      Section 6.4.4.2 of [C], or Section 5.13.4 of [Cplusplus];
      floating-suffix/floating-point-suffix from the latter two is not
      used here).

   *  When decnumber or basenumber stands for an integer, the
      corresponding CBOR data item is represented using major type 0 or
      1 if possible, or using tag 2 or 3 if not.  In the latter case,
      this specification does not define any encoding indicators that
      apply.  If fine control over encoding is desired, this can be
      expressed by being explicit about the representation as a tag:
      E.g., 987654321098765432310, which is equivalent to 2(h'35 8a 75
      04 38 f3 80 f5 f6') in its preferred serialization, might be
      written as 2_3(h'00 00 00 35 8a 75 04 38 f3 80 f5 f6'_1) if
      leading zeros need to be added during serialization to obtain
      specific sizes for tag head, byte string head, and the overall
      byte string.

      Otherwise, and for infin, a floating point data item with major
      type 7 is used in preferred serialization (unless modified by an
      encoding indicator, which then needs to be _1, _2, or _3).  For
      this, the number range needs to fit into a binary64 (or the size
      corresponding to the encoding indicator), and the precision will
      be adjusted to binary64 before further applying preferred
      serialization (or to the size corresponding to the encoding
      indicator).  Tag 4/5 representations are not generated in these
      cases.
      Future app-prefixes could be defined to allow more control for
      obtaining a tag 4/5 representation directly from a hex or decimal
      floating point literal.

   *  spec stands for an encoding indicator.

      (In the following, an abbreviation of the form ai=nn gives nn as
      the numeric value of the field _additional information_, the
      low-order 5 bits of the initial byte: see Section 3 of RFC 8949
      [STD94].)

      As per Section 8.1 of RFC 8949 [STD94]:

      -  an underscore _ on its own stands for indefinite length
         encoding (ai=31, only available behind the opening
         brace/bracket for map and array: strings have a special syntax
         streamstring for indefinite length encoding except for the
         special cases ''_ and ""_), and

      -  _0 to _3 stand for ai=24 to ai=27, respectively.

      Surprisingly, Section 8.1 of RFC 8949 [STD94] does not address
      ai=0 to ai=23; the assumption seems to be that preferred
      serialization (Section 4.1 of RFC 8949 [STD94]) will be used when
      converting CBOR diagnostic notation to an encoded CBOR data item,
      so leaving out the encoding indicator for a data item with a
      preferred serialization will implicitly use ai=0 to ai=23 if that
      is possible.  The present specification allows making this
      explicit:

      -  _i ("immediate") stands for encoding with ai=0 to ai=23.

      While no pressing use for further values for encoding indicators
      comes to mind, this is an extension point for EDN; Section 5.2
      defines a registry for additional values.

   *  string and the rules preceding it in the same block realize both
      the representation of strings that are split up into multiple
      chunks (Appendix G.4 of [RFC8610]) and the use of ellipses to
      represent elisions (Section 3.2).
      Note that the syntax defined here for concatenation of components
      uses an explicit + operator between the components to be
      concatenated; Appendix G.4 of [RFC8610] used simple juxtaposition,
      which got in the way of making the use of commas optional in other
      places (OC).

      The example equivalent text strings from Appendix G.4 of [RFC8610]
      now read:

         "Hello world"
         "Hello " + "world"
         "Hello" + h'20' + "world"
         "" + h'48656c6c6f20776f726c64' + ""

      Similarly, the following byte string values are equivalent:

         'Hello world'
         'Hello ' + 'world'
         'Hello ' + h'776f726c64'
         'Hello' + h'20' + 'world'
         '' + h'48656c6c6f20776f726c64' + '' + b64''
         h'4 86 56c 6c6f' + h' 20776 f726c64'

      The semantic processing of these rules is relatively complex:

      -  A single ... is a general ellipsis, which can stand for any
         data item.

      -  An ellipsis can be concatenated (on one or both sides) with
         string chunks (string1); the result is a CBOR tag with tag
         number CPA888 that contains an array of the joined-together
         spans of such chunks plus the ellipses, each represented by
         888(null).

      -  A concatenated sequence of string chunks is simply joined
         together.  In both cases of joining strings, the rules of
         Appendix G.4 of [RFC8610] need to be followed; in particular,
         if a text string results from the joining operation, that
         result needs to be valid UTF-8.

      -  Some of the strings may be app-strings.  If the type of the
         app-string is an actual string, joining of chunked strings
         occurs as with directly notated strings; otherwise the
         occurrence of more than one app-string or an app-string
         together with a directly notated string cannot be processed.

4.2.  ABNF Definitions for app-string Content

   This subsection provides ABNF definitions for application-oriented
   extension literals defined in [STD94] and in this specification.
   These grammars describe the _decoded_ content of the sqstr components
   that combine with the application-extension identifiers to form
   application-oriented extension literals.  Each of these may make use
   of ABNF rules defined in Figure 1.

4.2.1.  h: ABNF Definition of Hexadecimal representation of a byte
        string

   The syntax of the content of byte strings represented in hex, such as
   h'', h'0815', or h'/head/ 63 /contents/ 66 6f 6f' (another
   representation of << "foo" >>), is described by the ABNF in Figure 2.
   This syntax accommodates both lower case and upper case hex digits,
   as well as blank space (including comments) around each hex digit.

   app-string-h = S *(HEXDIG S HEXDIG S / ellipsis S)
                  ["#" *non-lf]
   ellipsis = 3*"."
   HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
   DIGIT = %x30-39 ; 0-9
   blank = %x09 / %x0A / %x0D / %x20
   non-slash = blank / %x21-2e / %x30-10FFFF
   non-lf = %x09 / %x0D / %x20-D7FF / %xE000-10FFFF
   S = *blank *(comment *blank )
   comment = "/" *non-slash "/"
           / "#" *non-lf %x0A

     Figure 2: ABNF Definition of Hexadecimal Representation of a Byte
                                  String

4.2.2.  b64: ABNF Definition of Base64 representation of a byte string

   The syntax of the content of byte strings represented in base64 is
   described by the ABNF in Figure 3.

   This syntax allows both the classic (Section 4 of [RFC4648]) and the
   URL-safe (Section 5 of [RFC4648]) alphabet to be used.  It
   accommodates, but does not require, base64 padding.  Note that
   inclusion of classic base64 makes it impossible to have in-line
   comments in b64, as "/" is valid base64-classic.
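   As an informal illustration only (it is not part of this
   specification), the handling of b64 content described above might be
   sketched in Python as follows.  The function name decode_b64_content
   is hypothetical.  The sketch is deliberately more lenient than the
   ABNF: it does not check where comments, blank space, or padding
   characters are placed, only that what remains decodes as base64.

```python
import base64
import re


def decode_b64_content(content: str) -> bytes:
    """Decode the content of a b64'...' literal (hypothetical helper).

    Assumption: leniency beyond the ABNF is acceptable for illustration.
    """
    # End-of-line comments start with "#" and run to the next line feed
    # (or to the end of the content).
    without_comments = re.sub(r"#[^\n]*", "", content)
    # The b64 grammar permits only LF and SP as blank space.
    compact = without_comments.replace("\n", "").replace(" ", "")
    # Accept both alphabets by mapping URL-safe characters to classic ones.
    compact = compact.replace("-", "+").replace("_", "/")
    # Padding is optional: drop whatever is there, then restore it.
    digits = compact.rstrip("=")
    if len(digits) % 4 == 1:
        raise ValueError("a single trailing base64 digit cannot encode a byte")
    padded = digits + "=" * (-len(digits) % 4)
    # validate=True rejects characters outside the base64 alphabet.
    return base64.b64decode(padded, validate=True)
```

   For example, under these assumptions the content SGVsbG8gd29ybGQ
   (no padding) decodes to the same bytes as 'Hello world'.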
   app-string-b64 = B *(4(b64dig B))
                    [b64dig B b64dig B ["=" B "=" / b64dig B ["="]] B]
                    ["#" *inon-lf]
   b64dig = ALPHA / DIGIT / "-" / "_" / "+" / "/"
   B = *iblank *(icomment *iblank)
   iblank = %x0A / %x20  ; Not HT or CR (gone)
   icomment = "#" *inon-lf %x0A
   inon-lf = %x20-D7FF / %xE000-10FFFF
   ALPHA = %x41-5a / %x61-7a
   DIGIT = %x30-39

    Figure 3: ABNF Definition of Base64 Representation of a Byte String

4.2.3.  dt: ABNF Definition of RFC 3339 Representation of a Date/Time

   The syntax of the content of dt literals can be described by the ABNF
   for date-time from [RFC3339] as summarized in Section 3 of [RFC9165]:

   app-string-dt = date-time

   date-fullyear   = 4DIGIT
   date-month      = 2DIGIT  ; 01-12
   date-mday       = 2DIGIT  ; 01-28, 01-29, 01-30, 01-31 based on
                             ; month/year
   time-hour       = 2DIGIT  ; 00-23
   time-minute     = 2DIGIT  ; 00-59
   time-second     = 2DIGIT  ; 00-58, 00-59, 00-60 based on leap sec
                             ; rules
   time-secfrac    = "." 1*DIGIT
   time-numoffset  = ("+" / "-") time-hour ":" time-minute
   time-offset     = "Z" / time-numoffset

   partial-time    = time-hour ":" time-minute ":" time-second
                     [time-secfrac]
   full-date       = date-fullyear "-" date-month "-" date-mday
   full-time       = partial-time time-offset

   date-time       = full-date "T" full-time
   DIGIT           =  %x30-39 ; 0-9

    Figure 4: ABNF Definition of RFC3339 Representation of a Date/Time

4.2.4.  ip: ABNF Definition of Textual Representation of an IP Address

   The syntax of the content of ip literals can be described by the ABNF
   for IPv4address and IPv6address in Section 3.2.2 of [RFC3986], as
   included in slightly updated form in Figure 5.
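   As an informal illustration only (it is not part of this
   specification), splitting ip content into an address and an optional
   prefix length might be sketched in Python as follows.  The function
   name parse_ip_content is hypothetical.  The sketch relies on Python's
   standard ipaddress module, whose accepted syntax is not guaranteed to
   coincide exactly with the ABNF of this section, and the upper bound
   it applies to the prefix length (32 or 128) is an assumption that the
   grammar itself does not express.

```python
import ipaddress


def parse_ip_content(content: str):
    """Split ip'...' content into (address, prefix length or None).

    Hypothetical helper; approximates app-string-ip = IPaddress ["/" uint].
    """
    address_text, slash, prefix_text = content.partition("/")
    # ip_address() raises ValueError for text it does not recognize.
    address = ipaddress.ip_address(address_text)
    if not slash:
        return address, None
    # uint = "0" / DIGIT1 *DIGIT: ASCII digits only, no leading zeros.
    is_uint = (
        prefix_text.isascii()
        and prefix_text.isdigit()
        and (prefix_text == "0" or not prefix_text.startswith("0"))
    )
    if not is_uint:
        raise ValueError(f"prefix length is not a uint: {prefix_text!r}")
    prefix_length = int(prefix_text)
    # Assumption: the length must fit the address family (32 or 128 bits).
    if prefix_length > address.max_prefixlen:
        raise ValueError(f"prefix length {prefix_length} is too large")
    return address, prefix_length
```

   Under these assumptions, the content 2001:db8::/64 yields an IPv6
   address object together with the prefix length 64, while 192.0.2.42
   yields an IPv4 address object and no prefix length.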
   app-string-ip = IPaddress ["/" uint]

   IPaddress =  IPv4address
             / IPv6address

   ; ABNF from RFC 3986, re-arranged for PEG compatibility:

   IPv6address =                            6( h16 ":" ) ls32
               /                       "::" 5( h16 ":" ) ls32
               / [ h16               ] "::" 4( h16 ":" ) ls32
               / [ h16 *1( ":" h16 ) ] "::" 3( h16 ":" ) ls32
               / [ h16 *2( ":" h16 ) ] "::" 2( h16 ":" ) ls32
               / [ h16 *3( ":" h16 ) ] "::"    h16 ":"   ls32
               / [ h16 *4( ":" h16 ) ] "::"              ls32
               / [ h16 *5( ":" h16 ) ] "::"              h16
               / [ h16 *6( ":" h16 ) ] "::"

   h16         = 1*4HEXDIG
   ls32        = ( h16 ":" h16 ) / IPv4address
   IPv4address = dec-octet "." dec-octet "." dec-octet "." dec-octet
   dec-octet   = "25" %x30-35         ; 250-255
               / "2" %x30-34 DIGIT    ; 200-249
               / "1" 2DIGIT           ; 100-199
               / %x31-39 DIGIT        ; 10-99
               / DIGIT                ; 0-9

   HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
   DIGIT = %x30-39 ; 0-9
   DIGIT1 = %x31-39 ; 1-9
   uint = "0" / DIGIT1 *DIGIT

   Figure 5: ABNF Definition of Textual Representation of an IP Address

5.  IANA Considerations

   // RFC Editor: please replace RFC-XXXX with the RFC number of this
   // RFC, [IANA.cbor-diagnostic-notation] with a reference to the new
   // registry group, and remove this note.

5.1.  CBOR Diagnostic Notation Application-extension Identifiers
      Registry

   IANA is requested to create an "Application-Extension Identifiers"
   registry in a new "CBOR Diagnostic Notation" registry group
   [IANA.cbor-diagnostic-notation], with the policy "expert review"
   (Section 4.5 of RFC 8126 [BCP26]).

   The experts are instructed to be frugal in the allocation of
   application-extension identifiers that are suggestive of generally
   applicable semantics, keeping them in reserve for
   application-extensions that are likely to enjoy wide use and can make
   good use of their conciseness.
   The expert is also instructed to direct the registrant to provide a
   specification (Section 4.6 of RFC 8126 [BCP26]), but can make
   exceptions, for instance when a specification is not available at the
   time of registration but is likely forthcoming.  If the expert
   becomes aware of application-extension identifiers that are deployed
   and in use, they may also initiate a registration on their own if
   they deem such a registration can avert potential future collisions.

   Each entry in the registry must include:

   Application-Extension Identifier:
      a lower case ASCII [STD80] string that starts with a letter and
      can contain letters and digits after that ([a-z][a-z0-9]*).  No
      other entry in the registry can have the same
      application-extension identifier.

   Description:
      a brief description

   Change Controller:
      (see Section 2.3 of RFC 8126 [BCP26])

   Reference:
      a reference document that provides a description of the
      application-extension identifier

   The initial content of the registry is shown in Table 1; all initial
   entries have the Change Controller "IETF".
   +==================================+===================+===========+
   | Application-extension Identifier | Description       | Reference |
   +==================================+===================+===========+
   | h                                | Reserved          | RFC8949   |
   +----------------------------------+-------------------+-----------+
   | b32                              | Reserved          | RFC8949   |
   +----------------------------------+-------------------+-----------+
   | h32                              | Reserved          | RFC8949   |
   +----------------------------------+-------------------+-----------+
   | b64                              | Reserved          | RFC8949   |
   +----------------------------------+-------------------+-----------+
   | dt                               | Date/Time         | RFC-XXXX  |
   +----------------------------------+-------------------+-----------+
   | ip                               | IP Address/Prefix | RFC-XXXX  |
   +----------------------------------+-------------------+-----------+

       Table 1: Initial Content of Application-extension Identifier
                                 Registry

5.2.  Encoding Indicators

   IANA is requested to create an "Encoding Indicators" registry in the
   newly created "CBOR Diagnostic Notation" registry group
   [IANA.cbor-diagnostic-notation], with the policy "specification
   required" (Section 4.6 of RFC 8126 [BCP26]).

   The experts are instructed to be frugal in the allocation of encoding
   indicators that are suggestive of generally applicable semantics,
   keeping them in reserve for encoding indicator registrations that are
   likely to enjoy wide use and can make good use of their conciseness.
   If the expert becomes aware of encoding indicators that are deployed
   and in use, they may also solicit a specification and initiate a
   registration on their own if they deem such a registration can avert
   potential future collisions.

   Each entry in the registry must include:

   Encoding Indicator:
      an ASCII [STD80] string that starts with an underscore and can
      contain zero or more underscores, letters and digits after that
      (_[_A-Za-z0-9]*).
      No other entry in the registry can have the same Encoding
      Indicator.

   Description:
      a brief description.  This description may employ an abbreviation
      of the form ai=nn, where nn is the numeric value of the field
      _additional information_, the low-order 5 bits of the initial byte
      (see Section 3 of RFC 8949 [STD94]).

   Change Controller:
      (see Section 2.3 of RFC 8126 [BCP26])

   Reference:
      a reference document that provides a description of the encoding
      indicator

   The initial content of the registry is shown in Table 2; all initial
   entries have the Change Controller "IETF".

   +====================+===================+===========+
   | Encoding Indicator | Description       | Reference |
   +====================+===================+===========+
   | _                  | Indefinite Length | RFC8949,  |
   |                    | Encoding (ai=31)  | RFC-XXXX  |
   +--------------------+-------------------+-----------+
   | _i                 | ai=0 to ai=23     | RFC-XXXX  |
   +--------------------+-------------------+-----------+
   | _0                 | ai=24             | RFC8949,  |
   |                    |                   | RFC-XXXX  |
   +--------------------+-------------------+-----------+
   | _1                 | ai=25             | RFC8949,  |
   |                    |                   | RFC-XXXX  |
   +--------------------+-------------------+-----------+
   | _2                 | ai=26             | RFC8949,  |
   |                    |                   | RFC-XXXX  |
   +--------------------+-------------------+-----------+
   | _3                 | ai=27             | RFC8949,  |
   |                    |                   | RFC-XXXX  |
   +--------------------+-------------------+-----------+

       Table 2: Initial Content of Encoding Indicator Registry

      |  As the "Reference" column reflects, all the encoding indicators
      |  initially registered are already defined in Section 8.1 of RFC
      |  8949 [STD94], with the exception of _i, which is defined in
      |  Section 4.1 of the present document.

5.3.  Media Type

   IANA is requested to add the following Media-Type to the "Media
   Types" registry [IANA.media-types].
   +=================+=============================+=============+
   | Name            | Template                    | Reference   |
   +=================+=============================+=============+
   | cbor-diagnostic | application/cbor-diagnostic | RFC-XXXX,   |
   |                 |                             | Section 5.3 |
   +-----------------+-----------------------------+-------------+

        Table 3: New Media Type application/cbor-diagnostic

   Type name:  application

   Subtype name:  cbor-diagnostic

   Required parameters:  N/A

   Optional parameters:  N/A

   Encoding considerations:  binary (UTF-8)

   Security considerations:  Section 6 of RFC XXXX

   Interoperability considerations:  none

   Published specification:  Section 5.3 of RFC XXXX

   Applications that use this media type:  Tools interchanging a
      human-readable form of CBOR

   Fragment identifier considerations:  The syntax and semantics of
      fragment identifiers is as specified for "application/cbor".  (At
      publication of RFC XXXX, there is no fragment identification
      syntax defined for "application/cbor".)

   Additional information:

      Deprecated alias names for this type:  N/A

      Magic number(s):  N/A

      File extension(s):  .diag

      Macintosh file type code(s):  N/A

   Person & email address to contact for further information:  CBOR WG
      mailing list (cbor@ietf.org), or IETF Applications and Real-Time
      Area (art@ietf.org)

   Intended usage:  LIMITED USE

   Restrictions on usage:  CBOR diagnostic notation represents CBOR data
      items, which are the format intended for actual interchange.  The
      media type application/cbor-diagnostic is intended to be used
      within documents about CBOR data items, in diagnostics for human
      consumption, and in other representations of CBOR data items that
      are necessarily text-based such as in configuration files or other
      data edited by humans, often under source-code control.
   Author/Change controller:  IETF

   Provisional registration:  no

5.4.  Content-Format

   IANA is requested to register a Content-Format number in the "CoAP
   Content-Formats" sub-registry, within the "Constrained RESTful
   Environments (CoRE) Parameters" Registry [IANA.core-parameters], as
   follows:

   +=============================+================+======+===========+
   | Content-Type                | Content Coding | ID   | Reference |
   +=============================+================+======+===========+
   | application/cbor-diagnostic | -              | TBD1 | RFC-XXXX  |
   +-----------------------------+----------------+------+-----------+

                     Table 4: New Content-Format

   TBD1 is to be assigned from the space 256..9999, according to the
   procedure "IETF Review or IESG Approval", preferably a number less
   than 1000.

5.5.  Stand-in Tags

   // RFC-Editor: This document uses the CPA (code point allocation)
   // convention described in [I-D.bormann-cbor-draft-numbers].  For
   // each usage of the term "CPA", please remove the prefix "CPA" from
   // the indicated value and replace the residue with the value
   // assigned by IANA; perform an analogous substitution for all other
   // occurrences of the prefix "CPA" in the document.  Finally, please
   // remove this note.

   In the "CBOR Tags" registry [IANA.cbor-tags], IANA is requested to
   assign the tags in Table 5 from the "specification required" space
   (suggested assignments: 888 and 999), with the present document as
   the specification reference.
   +========+===========+==================================+===========+
   | Tag    | Data      | Semantics                        | Reference |
   |        | Item      |                                  |           |
   +========+===========+==================================+===========+
   | CPA888 | null or   | Diagnostic Notation Ellipsis     | RFC-XXXX  |
   |        | array     |                                  |           |
   +--------+-----------+----------------------------------+-----------+
   | CPA999 | array     | Diagnostic Notation              | RFC-XXXX  |
   |        |           | Unresolved Application-Extension |           |
   +--------+-----------+----------------------------------+-----------+

                       Table 5: Values for Tags

6.  Security considerations

   The security considerations of [STD94] and [RFC8610] apply.

   The EDN specification provides two explicit extension points,
   application-extension identifiers (Section 5.1) and encoding
   indicators (Section 5.2).  Extensions introduced this way can have
   their own security considerations (see, e.g., Section 5 of
   [I-D.bormann-cbor-e-ref]).  When implementing tools that support the
   use of EDN extensions, the implementer needs to be careful not to
   inadvertently introduce a vector for an attacker to invoke extensions
   not planned for by the tool operator, who might not have considered
   security considerations of specific extensions such as those posed by
   their use of dereferenceable identifiers (Section 6 of
   [I-D.bormann-t2trg-deref-id]).  For instance, tools might require
   explicitly enabling the use of each extension that is not on an
   allowlist.  This task can possibly be made less onerous by combining
   it with a mechanism for supplying any parameters controlling such an
   extension.

7.  References

7.1.  Normative References

   [BCP26]    Best Current Practice 26.
              At the time of writing, this BCP comprises the following:

              Cotton, M., Leiba, B., and T. Narten, "Guidelines for
              Writing an IANA Considerations Section in RFCs", BCP 26,
              RFC 8126, DOI 10.17487/RFC8126, June 2017.
   [C]        International Organization for Standardization,
              "Information technology — Programming languages — C",
              Fourth Edition, ISO/IEC 9899:2018, June 2018.  The text
              of the standard is also available via
              https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2310.pdf

   [Cplusplus]
              International Organization for Standardization,
              "Programming languages — C++", Sixth Edition, ISO/IEC
              14882:2020, December 2020.  The text of the standard is
              also available via
              https://isocpp.org/files/papers/N4860.pdf

   [IANA.cbor-tags]
              IANA, "Concise Binary Object Representation (CBOR) Tags".

   [IANA.core-parameters]
              IANA, "Constrained RESTful Environments (CoRE)
              Parameters".

   [IANA.media-types]
              IANA, "Media Types".

   [IEEE754]  IEEE, "IEEE Standard for Floating-Point Arithmetic", IEEE
              Std 754-2019, DOI 10.1109/IEEESTD.2019.8766229.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3339]  Klyne, G. and C. Newman, "Date and Time on the Internet:
              Timestamps", RFC 3339, DOI 10.17487/RFC3339, July 2002.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, DOI 10.17487/RFC3986, January 2005.

   [RFC7405]  Kyzivat, P., "Case-Sensitive String Support in ABNF",
              RFC 7405, DOI 10.17487/RFC7405, December 2014.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC8610]  Birkholz, H., Vigano, C., and C. Bormann, "Concise Data
              Definition Language (CDDL): A Notational Convention to
              Express Concise Binary Object Representation (CBOR) and
              JSON Data Structures", RFC 8610, DOI 10.17487/RFC8610,
              June 2019.

   [RFC8742]  Bormann, C., "Concise Binary Object Representation (CBOR)
              Sequences", RFC 8742, DOI 10.17487/RFC8742, February 2020.
   [RFC9164]  Richardson, M. and C. Bormann, "Concise Binary Object
              Representation (CBOR) Tags for IPv4 and IPv6 Addresses
              and Prefixes", RFC 9164, DOI 10.17487/RFC9164,
              December 2021.

   [STD68]    Internet Standard 68.
              At the time of writing, this STD comprises the following:

              Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234,
              DOI 10.17487/RFC5234, January 2008.

   [STD80]    Internet Standard 80.
              At the time of writing, this STD comprises the following:

              Cerf, V., "ASCII format for network interchange", STD 80,
              RFC 20, DOI 10.17487/RFC0020, October 1969.

   [STD94]    Internet Standard 94.
              At the time of writing, this STD comprises the following:

              Bormann, C. and P. Hoffman, "Concise Binary Object
              Representation (CBOR)", STD 94, RFC 8949,
              DOI 10.17487/RFC8949, December 2020.

7.2.  Informative References

   [I-D.bormann-cbor-e-ref]
              Bormann, C., "External References to Values in CBOR
              Diagnostic Notation (EDN)", Work in Progress,
              Internet-Draft, draft-bormann-cbor-e-ref-00,
              29 February 2024.

   [I-D.bormann-t2trg-deref-id]
              Bormann, C. and C. Amsüss, "The "dereferenceable
              identifier" pattern", Work in Progress, Internet-Draft,
              draft-bormann-t2trg-deref-id-03, 2 March 2024.

   [I-D.ietf-cbor-update-8610-grammar]
              Bormann, C., "Updates to the CDDL grammar of RFC 8610",
              Work in Progress, Internet-Draft,
              draft-ietf-cbor-update-8610-grammar-06, 24 June 2024.

   [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
              Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006.

   [RFC9165]  Bormann, C., "Additional Control Operators for the Concise
              Data Definition Language (CDDL)", RFC 9165,
              DOI 10.17487/RFC9165, December 2021.

   [RFC9290]  Fossati, T. and C.
              Bormann, "Concise Problem Details for Constrained
              Application Protocol (CoAP) APIs", RFC 9290,
              DOI 10.17487/RFC9290, October 2022.

   [STD90]    Internet Standard 90.
              At the time of writing, this STD comprises the following:

              Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
              Interchange Format", STD 90, RFC 8259,
              DOI 10.17487/RFC8259, December 2017.

Appendix A.  EDN and CDDL

   This appendix is for information.

   EDN was designed as a language to provide a human-readable
   representation of an instance, i.e., a single CBOR data item or CBOR
   sequence.  CDDL was designed as a language to describe an (often
   large) set of such instances (which itself constitutes a language),
   in the form of a _data definition_ or _grammar_ (or sometimes called
   _schema_).

   The two languages share some similarities, not the least because they
   have mutually inspired each other.  But they have very different
   roots:

   *  EDN syntax is an extension to JSON syntax [STD90].  (Any
      (interoperable) JSON text is also valid EDN.)

   *  CDDL syntax is inspired by ABNF's syntax [STD68].

   For engineers that are using both EDN and CDDL, it is easy to write
   "CDDLisms" or "EDNisms" into their drafts that are meant to be in the
   other language.  (This is one more of the many motivations to always
   validate formal language instances with tools.)

   Important differences include:

   *  Comment syntax.  CDDL inherits ABNF's semicolon-delimited
      end-of-line comments, while EDN finds nothing in JSON that could
      be inherited here.  Inspired by JavaScript, EDN simplifies
      JavaScript's copy of the original C comment syntax to be delimited
      by single slashes (where line ends are not of interest); it also
      adds end-of-line comments starting with #.

      EDN:   { / alg / 1: -7 / ECDSA 256 / } ,
             { 1:   # alg
               -7   # ECDSA 256
             }

      CDDL:  ? 1 => int / tstr,  ; algorithm identifier

   *  Syntax for tags.
      CDDL's tag syntax is part of the system for referring to CBOR's
      fundamentals (the major type 6, in this case) and (with
      [I-D.ietf-cbor-update-8610-grammar]) allows specifying the actual
      tag number separately, while EDN's tag syntax is a simple decimal
      number and a pair of parentheses.

      EDN:   98([h'',   # empty encoded protected header
                 {},    # empty unprotected header
                 ...    # rest elided here
                ])

      CDDL:  COSE_Sign_Tagged = #6.98(COSE_Sign)

   *  Previously, the use of comma as separator character.

      // Move this after embedded CBOR as it actually no longer is a
      // difference.  But first check the diff...

      JSON requires commas as separators between array elements and map
      members; these commas also were required in the original
      diagnostic notation defined in [STD94] and the EDN defined in
      [RFC8610].  They are now optional in the places where EDN syntax
      allows commas.  (EDN also allows, but does not require, a trailing
      comma before the closing bracket/brace, enabling an easier to
      maintain "terminator" style of their use).  CDDL's comma
      separators in the equivalent contexts (CDDL groups) are entirely
      optional (and actually are terminators, which together with their
      optionality allows them to be used like separators as well, or
      even not at all).  In summary, comma use is now aligned between
      EDN and CDDL, in a fully backwards compatible way.

   *  Embedded CBOR.  EDN has a special syntax to describe the content
      of byte strings that are encoded CBOR data items.  CDDL can
      specify these with a control operator, which looks very different.

      EDN:   98([<< {/alg/ 1: -7 /ECDSA 256/} >>, # == h'a10126'
                 ...                              # rest elided here
                ])

      CDDL:  serialized_map = bytes .cbor header_map

Acknowledgements

   The concept of application-oriented extensions to diagnostic
   notation, as well as the definition for the "dt" extension, were
   inspired by the CoRAL work by Klaus Hartke.
   (TBD)

Author's Address

   Carsten Bormann
   Universität Bremen TZI
   Postfach 330440
   D-28359 Bremen
   Germany
   Phone: +49-421-218-63921
   Email: cabo@tzi.org