Internet-Draft | CBOR EDN: Literals and ABNF | October 2023 |
Bormann | Expires 19 April 2024 |
The Concise Binary Object Representation, CBOR (STD 94, RFC 8949), defines a "diagnostic notation" in order to be able to converse about CBOR data items without having to resort to binary data.¶
This document specifies how to add application-oriented extensions to the diagnostic notation. It then defines two such extensions for text representations of epoch-based date/times and of Constrained Resource Identifiers (draft-ietf-core-href).¶
To facilitate tool interoperation, this document also specifies a formal ABNF definition for extended diagnostic notation (EDN) that accommodates application-oriented literals.¶
This note is to be removed before publishing as an RFC.¶
The latest revision of this draft can be found at https://cbor-wg.github.io/edn-literal/. Status information for this document may be found at https://datatracker.ietf.org/doc/draft-ietf-cbor-edn-literals/.¶
Discussion of this document takes place on the cbor Working Group mailing list (mailto:cbor@ietf.org), which is archived at https://mailarchive.ietf.org/arch/browse/cbor/. Subscribe at https://www.ietf.org/mailman/listinfo/cbor/.¶
Source for this draft and an issue tracker can be found at https://github.com/cbor-wg/edn-literal.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 19 April 2024.¶
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
For the Concise Binary Object Representation, CBOR, Section 8 of [STD94] in conjunction with Appendix G of [RFC8610] defines a "diagnostic notation" in order to be able to converse about CBOR data items without having to resort to binary data. Diagnostic notation syntax is based on JSON, with extensions for representing CBOR constructs such as binary data and tags. (Standardizing this together with the actual interchange format does not serve to create another interchange format, but enables the use of a shared diagnostic notation in tools for and in documents about CBOR.)¶
This document specifies how to add application-oriented extensions to the diagnostic notation. It then defines two such extensions for text representations of epoch-based date/times and of Constrained Resource Identifiers [I-D.ietf-core-href].¶
To facilitate tool interoperation, this document also specifies a formal ABNF definition for extended diagnostic notation (EDN) that accommodates application-oriented literals. (See Appendix A.1 for an overall ABNF grammar as well as the ABNF definitions in Appendix A.2 for grammars for both the byte string presentations predefined in [STD94] and the application-extensions).¶
Note that Section 2.1 and Appendix A.2.5 about CRIs may move to the [I-D.ietf-core-href] specification, depending on the relative speed of approval; the later document gets the section.¶
Section 8 of [STD94] defines the original CBOR diagnostic notation, and Appendix G of [RFC8610] supplies a number of extensions to the diagnostic notation that result in the Extended Diagnostic Notation (EDN). The diagnostic notation extensions include popular features such as embedded CBOR (encoded CBOR data items in byte strings) and comments. A simple diagnostic notation extension that enables representing CBOR sequences was added in Section 4.2 of [RFC8742]. As diagnostic notation is not used in the kind of interchange situations where backward compatibility would pose a significant obstacle, there is little point in not using these extensions.¶
Therefore, when we refer to "diagnostic notation", we mean to include the original notation from Section 8 of [STD94] as well as the extensions from Appendix G of [RFC8610], Section 4.2 of [RFC8742], and the present document. However, we stick to the abbreviation "EDN" as it has become quite popular and is more sharply distinguishable from other meanings than "DN" would be.¶
In a similar vein, the term "ABNF" in this document refers to the language defined in [STD68] as extended in [RFC7405], where the "characters" of Section 2.3 of [STD68] are Unicode scalar values. The term "CDDL" refers to the data definition language defined in [RFC8610] and its registered extensions (such as those in [RFC9165]), as well as [I-D.ietf-cbor-update-8610-grammar].¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
Section 8 of [STD94] states the objective of defining a human-readable diagnostic notation with CBOR. In particular, it states:¶
All actual interchange always happens in the binary format.¶
One important application of EDN is the notation of CBOR data for humans: in specifications, on whiteboards, and for entering test data. A number of features, such as comments in string literals, are mainly useful for people-to-people communication via EDN. Programs also often output EDN for diagnostic purposes, such as in error messages or to enable comparison (including generation of diffs via tools) with test data.¶
For comparison with test data, it is often useful if different implementations generate the same (or similar) output for the same CBOR data items. This is comparable to the objectives of deterministic serialization for CBOR data items themselves (Section 4.2 of [STD94]). However, there are even more representation variants in EDN than in binary CBOR, and there is little point in specifically endorsing a single variant as "deterministic" when other variants may be more useful for human understanding, e.g., the << >> notation as opposed to h''; an EDN generator may have quite a few options that control what presentation variant is most desirable for the application that it is being used for.¶
Because of this, a deterministic representation is not defined for EDN, and there is little expectation of "roundtripping": the ability to convert EDN to binary CBOR and back to EDN while achieving exactly the same result as the original input EDN, which possibly was created by humans or by a different EDN generator.¶
However, there is a certain expectation that EDN generators can be configured to some basic output format, which:¶
looks like JSON where that is possible;¶
inserts encoding indicators only where the binary form differs from preferred encoding;¶
uses hexadecimal representation (h'') for byte strings, not b64'' or embedded CBOR (<<>>);¶
does not generate elaborate blank space (newlines, indentation) for pretty-printing, but does use common blank spaces such as after "," and ":".¶
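As a rough illustration of the basic output format described in the list above, the following Python sketch renders a few Python values as EDN text. The function name to_edn and the mapping from Python types to CBOR data items are assumptions made for this example only; tags, encoding indicators, and floating-point corner cases are not covered.

```python
import json


def to_edn(item):
    """Render a Python value in a basic, JSON-like EDN form (hypothetical helper)."""
    if isinstance(item, bytes):
        # Byte strings use the hexadecimal form h'', not b64'' or << >>.
        return "h'" + item.hex() + "'"
    if isinstance(item, list):
        return "[" + ", ".join(to_edn(x) for x in item) + "]"
    if isinstance(item, dict):
        # Map keys can be any data item, so they are rendered recursively.
        members = (to_edn(k) + ": " + to_edn(v) for k, v in item.items())
        return "{" + ", ".join(members) + "}"
    # Integers, floats, text strings, booleans and None look like JSON.
    return json.dumps(item)


print(to_edn({"a": [1, b"\x08\x15"], 2: None}))
# prints: {"a": [1, h'0815'], 2: null}
```

The single blank after "," and ":" and the absence of newlines or indentation correspond to the last item of the list; a real generator would likely expose these choices as options.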
Additional features such as ensuring deterministic map ordering (Section 4.2 of [STD94]) on output, or even deviating from the basic configuration in some systematic way, can further assist in comparing test data. Information obtained from a CDDL model can help in choosing application-oriented literals or specific string representations such as embedded CBOR or b64'' in the appropriate places.¶
This document extends the syntax used in diagnostic notation for byte string literals to also be available for application-oriented extensions.¶
As per Section 8 of [STD94], the diagnostic notation can notate byte strings in a number of [RFC4648] base encodings, where the encoded text is enclosed in single quotes, prefixed by an identifier (»h« for base16, »b32« for base32, »h32« for base32hex, »b64« for base64 or base64url).¶
This syntax can be thought to establish a name space, with the names "h", "b32", "h32", and "b64" taken, but other names being unallocated. The present specification defines additional names for this namespace, which we call application-extension identifiers. For the quoted string, the same rules apply as for byte strings. In particular, the escaping rules of JSON strings are applied equivalently for application-oriented extensions, e.g., within the quoted string \\ stands for a single backslash and \' stands for a single quote.¶
An application-extension identifier is a name consisting of a lower-case ASCII letter (a-z) and zero or more additional ASCII characters that are either lower-case letters or digits (a-z0-9).¶
Application-extension identifiers are registered in a registry (Section 4.1).¶
Prefixing a single-quoted string, an application-extension identifier is used to build an application-oriented extension literal, which stands for a CBOR data item the value of which is derived from the text given in the single-quoted string using a procedure defined in the specification for an application-extension identifier.¶
An application-extension (such as dt) MAY also define the meaning of a variant of the application-extension identifier where each lower-case character is replaced by its upper-case counterpart (such as DT), for building an application-oriented extension literal using that all-uppercase variant as the prefix of a single-quoted string.¶
As a convention for such definitions, using the all-uppercase variant implies making use of a tag appropriate for this application-oriented extension (such as tag number 1 for DT).¶
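The structure of an application-oriented extension literal described above can be sketched in Python as follows. This is a minimal illustration, not the grammar of Appendix A: the function name split_app_string is hypothetical, and only the two escapes mentioned in the text (\\ and \') are handled; the remaining JSON escapes are rejected rather than interpreted.

```python
import re

# Identifier: a lower-case letter followed by lower-case letters or digits,
# or the all-uppercase variant of such an identifier.
_APP_STRING = re.compile(
    r"([a-z][a-z0-9]*|[A-Z][A-Z0-9]*)'((?:[^'\\]|\\.)*)'", re.DOTALL
)


def _unescape(match):
    ch = match.group(1)
    if ch in ("\\", "'"):
        return ch
    # Simplification of this sketch: other JSON escapes are not implemented.
    raise ValueError("escape \\%s not handled in this sketch" % ch)


def split_app_string(literal):
    """Return (identifier, uses_uppercase_variant, escape-processed content)."""
    m = _APP_STRING.fullmatch(literal)
    if m is None:
        raise ValueError("not an application-oriented extension literal")
    prefix, raw = m.groups()
    content = re.sub(r"\\(.)", _unescape, raw, flags=re.DOTALL)
    return prefix.lower(), prefix.isupper(), content


print(split_app_string("DT'1969-07-21T02:56:16Z'"))
# prints: ('dt', True, '1969-07-21T02:56:16Z')
```

Returning the upper-case flag separately lets a caller apply the tagging convention for the all-uppercase variant after the content has been interpreted.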
Examples for application-oriented extensions to CBOR diagnostic notation can be found in the following sections.¶
Finally, this document registers a media type identifier and a content-format for CBOR diagnostic notation. This does not elevate its status to that of an interchange format, but recognizes that interaction between tools is often smoother if media types can be used.¶
The application-extension identifier "cri" is used to notate a Constrained Resource Identifier literal as per [I-D.ietf-core-href].¶
The text of the literal is a URI Reference as per [RFC3986] or an IRI Reference as per [RFC3987].¶
The value of the literal is a CRI that can be converted to the text of the literal using the procedure of Section 6.1 of [I-D.ietf-core-href]. Note that there may be more than one CRI that can be converted to the URI/IRI given; implementations are expected to favor the simplest variant available and make non-surprising choices otherwise.¶
As an example, the CBOR diagnostic notation¶
cri'https://example.com/bottarga/shaved'¶
is equivalent to¶
[-4, ["example", "com"], ["bottarga", "shaved"]]¶
See Appendix A.2.5 for an ABNF definition for the content of cri literals.¶
The application-extension identifier "dt" is used to notate a date/time literal that can be used as an Epoch-Based Date/Time as per Section 3.4.2 of [STD94].¶
The text of the literal is a Standard Date/Time String as per Section 3.4.1 of [STD94].¶
The value of the literal is a number representing the result of a conversion of the given Standard Date/Time String to an Epoch-Based Date/Time. If fractional seconds are given in the text (production time-secfrac in Figure 4), the value is a floating-point number; the value is an integer number otherwise. In the all-upper-case variant of the app-prefix, the value is enclosed in a tag number 1.¶
As an example, the CBOR diagnostic notation¶
dt'1969-07-21T02:56:16Z', dt'1969-07-21T02:56:16.5Z', DT'1969-07-21T02:56:16Z'¶
is equivalent to¶
-14159024, -14159023.5, 1(-14159024)¶
See Appendix A.2.3 for an ABNF definition for the content of dt literals.¶
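The conversion for dt literals can be illustrated with a short Python sketch. The function name dt_value and the regular expression are assumptions of this example (the authoritative syntax is the ABNF referenced above); leap seconds are not handled, and the all-upper-case DT variant, which would additionally wrap the result in tag number 1, is left out.

```python
import re
from datetime import datetime, timedelta, timezone

# Approximation of an RFC 3339 date-time, for illustration only.
_DATE_TIME = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})[Tt](\d{2}):(\d{2}):(\d{2})(\.\d+)?"
    r"([Zz]|[+-]\d{2}:\d{2})"
)
_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def dt_value(text):
    """Return the epoch-based value of the content of a dt'' literal."""
    m = _DATE_TIME.fullmatch(text)
    if m is None:
        raise ValueError("not a date/time string: %r" % text)
    year, month, day, hour, minute, second = (int(g) for g in m.group(1, 2, 3, 4, 5, 6))
    fraction, zone = m.group(7), m.group(8)
    if zone in ("Z", "z"):
        offset = timedelta(0)
    else:
        sign = -1 if zone[0] == "-" else 1
        offset = sign * timedelta(hours=int(zone[1:3]), minutes=int(zone[4:6]))
    moment = datetime(year, month, day, hour, minute, second, tzinfo=timezone(offset))
    whole_seconds = (moment - _EPOCH) // timedelta(seconds=1)
    # Integer without fractional seconds, floating point otherwise.
    return whole_seconds + float(fraction) if fraction else whole_seconds


print(dt_value("1969-07-21T02:56:16Z"))    # prints: -14159024
print(dt_value("1969-07-21T02:56:16.5Z"))  # prints: -14159023.5
```

The two printed values match the first two items of the example above.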
The application-extension identifier "ip" is used to notate an IP address literal that can be used as an IP address as per Section 3 of [RFC9164].¶
The text of the literal is an IPv4address or IPv6address as per Section 3.2.2 of [RFC3986].¶
With the lower-case app-string ip, the value of the literal is a byte string representing the binary IP address. With the upper-case app-string IP, the literal is such a byte string tagged with tag number 54, if an IPv6address is used, or tag number 52, if an IPv4address is used.¶
As an additional case, the upper-case app-string IP'' can be used with a prefix such as 192.0.2.0/24, with the equivalent tag as its value. (Note that [RFC9164] representations of address prefixes need to implement the truncation of the address byte string as described in Section 4.2 of [RFC9164]; see example below.) For completeness, the lower-case variant ip'192.0.2.0/24' stands for an unwrapped [24,h'c00002']; however, in this case the information on whether an address is IPv4 or IPv6 often needs to come from the context.¶
Note that there is no direct representation of an address combined with a prefix length; this can be represented as 52([ip'192.0.2.42',24]), if needed.¶
Examples: the CBOR diagnostic notation¶
ip'192.0.2.42', IP'192.0.2.42', IP'192.0.2.0/24', ip'2001:db8::42', IP'2001:db8::42', IP'2001:db8::/64'¶
is equivalent to¶
h'c000022a', 52(h'c000022a'), 52([24,h'c00002']), h'20010db8000000000000000000000042', 54(h'20010db8000000000000000000000042'), 54([64,h'20010db8'])¶
See Appendix A.2.4 for an ABNF definition for the content of ip literals.¶
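The values of ip and IP literals can be reproduced with Python's standard ipaddress module, as in the following sketch. The function name ip_value and the representation of a tag as a (tag number, content) tuple are assumptions of this example; note also that the ipaddress module may accept a few textual forms that the ABNF in Appendix A.2.4 does not.

```python
import ipaddress


def ip_value(text, tagged=False):
    """Value of ip'text' (tagged=False) or IP'text' (tagged=True)."""
    if "/" in text:
        # Prefix form: [prefix length, address bytes without trailing zero bytes].
        # strict=True rejects prefixes with non-zero bits after the prefix length.
        network = ipaddress.ip_network(text, strict=True)
        content = [network.prefixlen, network.network_address.packed.rstrip(b"\x00")]
        version = network.version
    else:
        address = ipaddress.ip_address(text)
        content = address.packed
        version = address.version
    if not tagged:
        return content
    # Tag 52 for IPv4, tag 54 for IPv6.
    return (52 if version == 4 else 54, content)


print(ip_value("192.0.2.42").hex())            # prints: c000022a
print(ip_value("192.0.2.0/24", tagged=True))   # prints: (52, [24, b'\xc0\x00\x02'])
```

Removing trailing zero bytes with rstrip is how this sketch approximates the truncation of the address byte string for prefixes.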
In some cases, an EDN consumer cannot construct actual CBOR items that represent the CBOR data intended for eventual interchange. This document defines stand-in representation for two such cases:¶
The EDN consumer does not know (or does not implement) an application-extension identifier used in the EDN document (Section 3.1) but wants to preserve the information for a later processor.¶
The generator of some EDN intended for human consumption (such as in a specification document) may not want to include parts of the final data item, destructively replacing complete subtrees or possibly just parts of a lengthy string by elisions (Section 3.2).¶
When ingesting CBOR diagnostic notation, any application-oriented extension literals are usually decoded and transformed into the corresponding data item during ingestion. If an application-extension is not known or not implemented by the ingesting process, this is usually an error and processing has to stop.¶
However, in certain cases, it can be desirable to exceptionally carry an uninterpreted application-oriented extension literal in an ingested data item, so that its decoding can be postponed to a specific later stage of ingestion.¶
This specification defines a CBOR Tag for this purpose: The Diagnostic Notation Unresolved Application-Extension Tag, tag number CPA999 (Section 4.5). The content of this tag is an array of two text strings: The application-extension identifier, and the (escape-processed) content of the single-quoted string. For example, dt'1969-07-21T02:56:16Z' can be provisionally represented as /CPA/ 999(["dt", "1969-07-21T02:56:16Z"]).¶
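A consumer that opts into this exceptional behavior could be structured roughly as in the following Python sketch. The names resolve_app_literal and handlers, and the representation of a tag as a (tag number, content) tuple, are assumptions of this example; the tag number 999 is the suggested value and remains subject to IANA assignment.

```python
UNRESOLVED_TAG = 999  # suggested value (CPA999), not yet assigned


def resolve_app_literal(identifier, content, handlers):
    """Decode an application-oriented literal, or keep it as a stand-in.

    handlers is a hypothetical mapping from application-extension
    identifiers to functions that turn the literal text into a data item.
    """
    handler = handlers.get(identifier)
    if handler is None:
        # Unknown or unimplemented identifier: carry the literal along unresolved.
        return (UNRESOLVED_TAG, [identifier, content])
    return handler(content)


print(resolve_app_literal("dt", "1969-07-21T02:56:16Z", {}))
# prints: (999, ['dt', '1969-07-21T02:56:16Z'])
```

A consumer following the default behavior would instead raise an error where this sketch returns the stand-in.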
RFC-Editor: This document uses the CPA (code point allocation) convention described in [I-D.bormann-cbor-draft-numbers]. For each usage of the term "CPA", please remove the prefix "CPA" from the indicated value and replace the residue with the value assigned by IANA; perform an analogous substitution for all other occurrences of the prefix "CPA" in the document. Finally, please remove this note.¶
EDN supports the use of an ellipsis (notated as three or more dots in a row, as in ...) to indicate parts of an EDN document that have been elided (and therefore cannot be reconstructed).¶
This specification defines a CBOR Tag for this purpose: The Diagnostic Notation Ellipsis Tag, tag number CPA888 (Section 4.5). The content of this tag either is¶
null (indicating a data item entirely replaced by an ellipsis), or it is¶
an array, the elements of which are alternating between fragments of a string and the actual elisions, represented as ellipses carrying a null as content.¶
Elisions can stand in for entire subtrees, e.g. in:¶
[1, 2, ..., 3] , { "a": 1, "b": ..., ...: ... }¶
A single ellipsis (or key/value pair of ellipses) can imply eliding multiple elements in an array (members in a map); if more detailed control is required, a data definition language such as CDDL can be employed. (Note that the stand-in form defined here does not allow multiple key/value pairs with an ellipsis as a key as the CBOR data item would not be valid.)¶
Subtree elisions can be represented in a CBOR data item by using /CPA/888(null) as the stand-in:¶
[1, 2, 888(null), 3] , { "a": 1, "b": 888(null), 888(null): 888(null) }¶
Elisions also can be used as part of a (text or byte) string:¶
{ "contract": "Herewith I buy" ... "gned: Alice & Bob", "signature": h'4711...0815', }¶
The example "contract" uses string concatenation as per Appendix G.4 of [RFC8610], extending that by allowing ellipses; while the example "signature" uses special syntax that allows the use of ellipses between the bytes notated inside h'' literals.¶
String elisions can be represented in a CBOR data item by a stand-in that wraps an array of string fragments alternating with ellipsis indicators:¶
{ "contract": /CPA/888(["Herewith I buy", 888(null), "gned: Alice & Bob"]), "signature": 888([h'4711', 888(null), h'0815']), }¶
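The construction of these stand-ins can be sketched in Python as follows. The names stand_in and ELIDED and the (tag number, content) tuple representation of tags are assumptions of this example, and 888 is the suggested tag number, still subject to IANA assignment.

```python
ELLIPSIS_TAG = 888   # suggested value (CPA888), not yet assigned
ELIDED = object()    # hypothetical marker for an ellipsis in the parsed input


def _join(chunks):
    # "".join for text strings, b"".join for byte strings.
    return type(chunks[0])().join(chunks)


def stand_in(parts):
    """Build the data item for a sequence of string chunks and ELIDED markers."""
    if not parts:
        raise ValueError("at least one chunk or ellipsis is required")
    if all(p is not ELIDED for p in parts):
        return _join(parts)              # plain concatenation, no elision
    if len(parts) == 1:
        return (ELLIPSIS_TAG, None)      # a data item replaced entirely
    fragments, run = [], []
    for part in parts:
        if part is ELIDED:
            if run:
                fragments.append(_join(run))
                run = []
            fragments.append((ELLIPSIS_TAG, None))
        else:
            run.append(part)
    if run:
        fragments.append(_join(run))
    return (ELLIPSIS_TAG, fragments)


print(stand_in(["Herewith I buy", ELIDED, "gned: Alice & Bob"]))
# prints: (888, ['Herewith I buy', (888, None), 'gned: Alice & Bob'])
```

Adjacent chunks between two ellipses are joined before being placed in the array, so each array element is either a string fragment or an ellipsis indicator.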
Note that the use of elisions is different from "commenting out" EDN text, e.g.¶
{
  "contract": "Herewith I buy" /.../ "gned: Alice & Bob",
  "signature": h'4711/.../0815',
  # ...: ...
}¶
The consumer of this EDN will ignore the comments and therefore will have no idea after ingestion that some information has been elided; validation steps may then simply fail instead of being informed about the elisions.¶
RFC Editor: please replace RFCthis with the RFC number of this RFC, [IANA.cbor-diagnostic-notation] with a reference to the new registry group, and remove this note.¶
IANA is requested to create an "Application-Extension Identifiers" registry in a new "CBOR Diagnostic Notation" registry group [IANA.cbor-diagnostic-notation], with the policy "expert review" (Section 4.5 of [BCP26]).¶
The experts are instructed to be frugal in the allocation of application-extension identifiers that are suggestive of generally applicable semantics, keeping them in reserve for application-extensions that are likely to enjoy wide use and can make good use of their conciseness. The expert is also instructed to direct the registrant to provide a specification (Section 4.6 of [BCP26]), but can make exceptions, for instance when a specification is not available at the time of registration but is likely forthcoming. If the expert becomes aware of application-extension identifiers that are deployed and in use, they may also initiate a registration on their own if they deem such a registration can avert potential future collisions.¶
Each entry in the registry must include:¶
a lower case ASCII [STD80] string that starts with a letter and can contain letters and digits after that ([a-z][a-z0-9]*). No other entry in the registry can have the same application-extension identifier.¶
a brief description¶
a change controller (see Section 2.3 of [BCP26])¶
a reference document that provides a description of the application-extension identifier¶
The initial content of the registry is shown in Table 1; all entries have the Change Controller "IETF".¶
Application-extension Identifier | Description | Reference |
---|---|---|
h | Reserved | RFC8949 |
b32 | Reserved | RFC8949 |
h32 | Reserved | RFC8949 |
b64 | Reserved | RFC8949 |
cri | Constrained Resource Identifier | RFCthis |
dt | Date/Time | RFCthis |
ip | IP Address/Prefix | RFCthis |
IANA is requested to create an "Encoding Indicators" registry in the newly created "CBOR Diagnostic Notation" registry group [IANA.cbor-diagnostic-notation], with the policy "specification required" (Section 4.6 of [BCP26]).¶
The experts are instructed to be frugal in the allocation of encoding indicators that are suggestive of generally applicable semantics, keeping them in reserve for encoding indicator registrations that are likely to enjoy wide use and can make good use of their conciseness. If the expert becomes aware of encoding indicators that are deployed and in use, they may also solicit a specification and initiate a registration on their own if they deem such a registration can avert potential future collisions.¶
Each entry in the registry must include:¶
an ASCII [STD80] string that starts with an underscore and can contain zero or more underscores, letters and digits after that (_[_A-Za-z0-9]*). No other entry in the registry can have the same Encoding Indicator.¶
a brief description¶
a change controller (see Section 2.3 of [BCP26])¶
a reference document that provides a description of the encoding indicator¶
The initial content of the registry is shown in Table 2; all entries have the Change Controller "IETF".¶
Encoding Indicator | Description | Reference |
---|---|---|
_ | Indefinite Length Encoding (ai=31) | RFC8949, RFCthis |
_i | ai=0 to ai=23 | RFCthis |
_0 | ai=24 | RFC8949, RFCthis |
_1 | ai=25 | RFC8949, RFCthis |
_2 | ai=26 | RFC8949, RFCthis |
_3 | ai=27 | RFC8949, RFCthis |
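To illustrate how the entries of Table 2 relate to the additional information (ai) value in the initial byte of an encoded item, the following Python sketch encodes an unsigned integer (major type 0) under a given encoding indicator. The function name encode_uint is hypothetical, and the sketch deliberately ignores all other major types and the indefinite length indicator.

```python
import struct

# Encoding indicator -> (ai value, struct format of the argument that follows).
_SIZED = {"_0": (24, ">B"), "_1": (25, ">H"), "_2": (26, ">I"), "_3": (27, ">Q")}


def encode_uint(value, indicator=None):
    """Encode an unsigned integer, honoring an optional encoding indicator."""
    if value < 0:
        raise ValueError("only unsigned integers are handled in this sketch")
    if indicator is None:
        # No indicator given: use the shortest form that fits.
        if value < 24:
            indicator = "_i"
        else:
            indicator = next(
                (name for name, (_, fmt) in _SIZED.items()
                 if value < 256 ** struct.calcsize(fmt)),
                "_3",
            )
    if indicator == "_i":
        if value > 23:
            raise ValueError("_i requires a value in the range 0..23")
        return bytes([value])  # major type 0: the initial byte is the ai value
    ai, fmt = _SIZED[indicator]
    try:
        return bytes([ai]) + struct.pack(fmt, value)
    except struct.error as exc:
        raise ValueError("value does not fit the requested encoding") from exc


print(encode_uint(1).hex())        # prints: 01
print(encode_uint(1, "_1").hex())  # prints: 190001
```

In EDN, the second call corresponds to the notation 1_1, and the first to a plain 1 (or, explicitly, 1_i).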
IANA is requested to add the following Media-Type to the "Media Types" registry [IANA.media-types].¶
Name | Template | Reference |
---|---|---|
cbor-diagnostic | application/cbor-diagnostic | RFC XXXX, Section 4.3 |
application¶
cbor-diagnostic¶
N/A¶
N/A¶
binary (UTF-8)¶
none¶
Section 4.3 of RFC XXXX¶
Tools interchanging a human-readable form of CBOR¶
The syntax and semantics of fragment identifiers is as specified for "application/cbor". (At publication of RFC XXXX, there is no fragment identification syntax defined for "application/cbor".)¶
CBOR WG mailing list (cbor@ietf.org), or IETF Applications and Real-Time Area (art@ietf.org)¶
COMMON¶
none¶
IETF¶
no¶
IANA is requested to register a Content-Format number in the "CoAP Content-Formats" sub-registry, within the "Constrained RESTful Environments (CoRE) Parameters" Registry [IANA.core-parameters], as follows:¶
Content-Type | Content Coding | ID | Reference |
---|---|---|---|
application/cbor-diagnostic | - | TBD1 | RFC XXXX |
TBD1 is to be assigned from the space 256..999.¶
In the "CBOR Tags" registry [IANA.cbor-tags], IANA is requested to assign the tags in Table 5 from the "specification required" space (suggested assignments: 888 and 999), with the present document as the specification reference.¶
Tag | Data Item | Semantics | Reference |
---|---|---|---|
CPA888 | null or array | Diagnostic Notation Ellipsis | [RFCthis] |
CPA999 | array | Diagnostic Notation Unresolved Application-Extension | [RFCthis] |
The security considerations of [STD94] and [RFC8610] apply.¶
This appendix provides an overall ABNF definition for the syntax of CBOR extended diagnostic notation.¶
To complete the parsing of an app-string with prefix, say, p, the processed sqstr inside it is further parsed using the ABNF definition specified for the production app-string-p in Appendix A.2.¶
For simplicity, the internal parsing for the built-in EDN prefixes is specified in the same way. ABNF definitions for h'' and b64'' are provided in Appendix A.2.1 and Appendix A.2.2. However, the prefixes b32'' and h32'' are not in wide use and an ABNF definition in this document could therefore not be based on implementation experience.¶
While an ABNF grammar defines the set of character strings that are considered to be valid EDN by this ABNF, the mapping of these character strings into the generic data model of CBOR is not always obvious.¶
The following additional items should help in the interpretation:¶
decnumber stands for an integer in the usual decimal notation, unless at least one of the optional parts starting with "." and "e" are present, in which case it stands for a floating point value in the usual decimal notation.¶
basenumber stands for an integer in the usual base 16/hexadecimal ("0x"), base 8/octal ("0o"), or base 2/binary ("0b") notation, unless the optional part containing a "p" is present, in which case it stands for a floating point number in the usual hexadecimal notation (which uses a mantissa in hexadecimal and an exponent in decimal notation).¶
spec stands for an encoding indicator. As per Section 8.1 of [STD94]:¶
an underscore _ on its own stands for indefinite length encoding (ai=31, only available behind the opening brace/bracket for map and array: strings have a special syntax streamstring for indefinite length encoding except for the special cases ''_ and ""_), and¶
_0 to _3 stand for ai=24 to ai=27, respectively.¶
Surprisingly, Section 8.1 of [STD94] does not address ai=0 to ai=23; the assumption seems to be that preferred serialization (Section 4.1 of [STD94]) will be used when converting CBOR diagnostic notation to an encoded CBOR data item, so leaving out the encoding indicator for a data item with a preferred serialization will implicitly use ai=0 to ai=23 if that is possible. The present specification allows making this explicit:¶
_i ("immediate") stands for encoding with ai=0 to ai=23.¶
While no pressing use for further values for encoding indicators comes to mind, this is an extension point for EDN; Section 4.2 defines a registry for additional values.¶
string and the rules preceding it in the same block realize both the representation of strings that are split up into multiple chunks (Appendix G.4 of [RFC8610]) and the use of ellipses to represent elisions (Section 3.2). The semantic processing of these rules is relatively complex:¶
A single ... is a general ellipsis, which can stand for any data item.¶
An ellipsis can be surrounded (on one or both sides) by string chunks; the result is a CBOR tag with tag number CPA888 that contains an array with joined-together spans of such chunks plus the ellipses represented by 888(null).¶
A simple sequence of string chunks is simply joined together. In both cases of joining strings, the rules of Appendix G.4 of [RFC8610] need to be followed; in particular, if a text string results from the joining operation, that result needs to be valid UTF-8.¶
Some of the strings may be app-strings. If the type of the app-string is an actual string, joining of chunked strings occurs as with directly notated strings; otherwise the occurrence of more than one app-string or an app-string together with a directly notated string cannot be processed.¶
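The interpretation of decnumber and basenumber given earlier in this list can be approximated in Python as shown in the following sketch. The function name parse_number is hypothetical, and Python's own number parsers are somewhat more lenient than the ABNF (for example, they accept underscores between digits), so this is an illustration of the integer versus floating-point distinction rather than a validating parser.

```python
def parse_number(text):
    """Map the text of a decnumber or basenumber to a Python int or float."""
    body = text.lstrip("+-").lower()
    if body.startswith(("0x", "0o", "0b")):
        # basenumber: an integer unless the part containing "p" is present.
        if "p" in body:
            if not body.startswith("0x"):
                raise ValueError("a 'p' exponent is only handled for 0x here")
            return float.fromhex(text)
        return int(text, 0)
    # decnumber: an integer unless a "." or "e" part is present.
    if "." in body or "e" in body:
        return float(text)
    return int(text)


print(parse_number("0x1e"))     # prints: 30
print(parse_number("1e3"))      # prints: 1000.0
print(parse_number("0x1.8p1"))  # prints: 3.0
```

Checking for a base prefix first matters because "e" is a hexadecimal digit: 0x1e is the integer 30, not a floating-point value.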
This appendix provides ABNF definitions for application-oriented extension literals defined in [STD94] and in this specification. These grammars describe the decoded content of the sqstr components that combine with the application-extension identifiers to form application-oriented extension literals. Each of these may make use of rules defined in Figure 1.¶
The syntax of the content of byte strings represented in hex, such as h'', h'0815', or h'/head/ 63 /contents/ 66 6f 6f' (another representation of << "foo" >>), is described by the ABNF in Figure 2. This syntax accommodates both lower case and upper case hex digits, as well as blank space (including comments) around each hex digit.¶
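A decoder for this content could look roughly like the following Python sketch. It is an approximation made for illustration: the function name decode_hex_content is hypothetical, only comments delimited by single slashes are removed, and ellipses inside h'' literals are not handled.

```python
import re


def decode_hex_content(text):
    """Decode the content of an h'' literal into bytes."""
    # Drop /comments/, then all blank space around the hex digits.
    without_comments = re.sub(r"/[^/]*/", "", text)
    digits = "".join(without_comments.split())
    if len(digits) % 2 or not re.fullmatch(r"[0-9A-Fa-f]*", digits):
        raise ValueError("not valid h'' content: %r" % text)
    return bytes.fromhex(digits)


print(decode_hex_content("/head/ 63 /contents/ 66 6f 6f"))
# prints: b'cfoo'
```

The result b'cfoo' is the encoded CBOR text string "foo": the head byte 0x63 followed by the three bytes of the string.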
The syntax of the content of byte strings represented in base64 is described by the ABNF in Figure 3.¶
This syntax allows both the classic (Section 4 of [RFC4648]) and the URL-safe (Section 5 of [RFC4648]) alphabet to be used. It accommodates, but does not require base64 padding. Note that inclusion of classic base64 makes it impossible to have in-line comments in b64, as "/" is valid base64-classic.¶
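The acceptance of both alphabets and of optional padding can be illustrated with the following Python sketch based on the standard base64 module. The function name decode_b64_content is hypothetical, and the sketch does not attempt to cover any blank space that the ABNF may permit within the content.

```python
import base64


def decode_b64_content(text):
    """Decode the content of a b64'' literal into bytes."""
    # Map the URL-safe alphabet onto the classic one so that both are accepted.
    classic = text.translate(str.maketrans("-_", "+/"))
    # Padding is optional: drop whatever is present, then restore it.
    classic = classic.rstrip("=")
    classic += "=" * (-len(classic) % 4)
    return base64.b64decode(classic, validate=True)


print(decode_b64_content("Zm9v"))  # prints: b'foo'
print(decode_b64_content("Zm8"))   # prints: b'fo'
```

Because "/" is a regular character of the classic alphabet, a decoder of this kind has no way to recognize in-line comments, which is the limitation noted above.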
The syntax of the content of dt literals can be described by the ABNF for date-time from [RFC3339] as summarized in Section 3 of [RFC9165]:¶
The syntax of the content of ip literals can be described by the ABNF for IPv4address and IPv6address in Section 3.2.2 of [RFC3986], as included in slightly updated form in Figure 5.¶
The syntax of the content of cri literals can be described by the ABNF for URI-reference in Section 4.1 of [RFC3986], as reproduced in Figure 6. If the content is not ASCII only (i.e., for IRIs), first apply Section 3.1 of [RFC3987] and apply this grammar to the result.¶
EDN was designed as a language to provide a human-readable representation of an instance, i.e., a single CBOR data item or CBOR sequence. CDDL was designed as a language to describe an (often large) set of such instances (which itself constitutes a language), in the form of a data definition or grammar (or sometimes called schema).¶
The two languages share some similarities, not the least because they have mutually inspired each other. But they have very different roots:¶
EDN syntax is an extension to JSON syntax [STD90]. (Any (interoperable) JSON text is also valid EDN.)¶
For engineers that are using both EDN and CDDL, it is easy to write "CDDLisms" or "EDNisms" into their drafts that are meant to be in the other language. (This is one more of the many motivations to always validate formal language instances with tools.)¶
Important differences include:¶
Comment syntax. CDDL inherits ABNF's semicolon-introduced end-of-line comments, while EDN finds nothing in JSON that could be inherited here. Inspired by JavaScript, EDN simplifies JavaScript's copy of the original C comment syntax to be delimited by single slashes (where line ends are not of interest); it also adds end-of-line comments starting with #.¶
Syntax for tags. CDDL's tag syntax is part of the system for referring to CBOR's fundamentals (the major type 6, in this case) and (with [I-D.ietf-cbor-update-8610-grammar]) allows specifying the actual tag number separately, while EDN's tag syntax is a simple decimal number and a pair of parentheses.¶
Separator character. Like JSON, EDN requires commas as separators between array elements and map members and doesn't allow a trailing comma before the closing bracket/brace. CDDL's comma separators in these contexts (CDDL groups) are optional (and actually are terminators, which together with their optionality allows them to be used like separators as well or even not at all).¶
Embedded CBOR. EDN has a special syntax to describe the content of byte strings that are encoded CBOR data items. CDDL can specify these with a control operator, which looks very different.¶
The concept of application-oriented extensions to diagnostic notation, as well as the definition for the "dt" extension were inspired by the CoRAL work by Klaus Hartke.¶