Additional Control Operators for CDDL

Universität Bremen TZI
Postfach 330440
D-28359 Bremen
Germany
+49-421-218-63921
cabo@tzi.org

Internet-Draft

Abstract

The Concise Data Definition Language (CDDL), standardized in RFC 8610,
provides "control operators" as its main language extension point.

The present document defines a number of control operators that did
not make it into RFC 8610: .plus, .cat and .det for the construction of constants,
.abnf/.abnfb for including ABNF (RFC 5234/RFC 7405) in CDDL specifications, and
.feature for indicating the use of a non-basic feature in an instance.

Introduction

The Concise Data Definition Language (CDDL), standardized in RFC 8610,
provides "control operators" as its main language extension point.

The present document defines a number of control operators that did
not make it into RFC 8610:

Name       Purpose
.plus      Numeric addition
.cat       String concatenation
.det       String concatenation, pre-dedenting
.abnf      ABNF in CDDL (text strings)
.abnfb     ABNF in CDDL (byte strings)
.feature   Detecting feature use in extension points

Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
"MAY", and "OPTIONAL" in this document are to be interpreted as
described in BCP 14 when, and only when, they
appear in all capitals, as shown here.

This specification uses terminology from RFC 8610.
In particular, with respect to control operators, "target" refers to
the left hand side operand, and "controller" to the right hand side operand.

Computed Literals

CDDL as defined in RFC 8610 does not have any mechanisms to compute
literals. As an 80 % solution, this specification adds three control
operators: .plus for numeric addition, .cat for string
concatenation, and .det for string concatenation with dedenting of
the right hand side (controller).

For these operators, as with all control operators, targets and
controllers are types. The resulting type is therefore formally a
function of the elements of the cross-product of the two types.
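
One way to picture this is the following minimal Python sketch. It is not a description of any actual CDDL tool: the helper name apply_control and the use of plain Python sets to stand in for CDDL types with several alternatives are assumptions made for illustration. The result is obtained by applying the operator to every (target, controller) pair.

```python
from itertools import product

def apply_control(op, targets, controllers):
    """Apply a binary operator to every (target, controller) pair."""
    return {op(t, c) for t, c in product(targets, controllers)}

# A target with alternatives 1 / 2 and a controller with alternatives
# 10 / 20, combined by addition, yield four possible results.
sums = apply_control(lambda t, c: t + c, {1, 2}, {10, 20})
```

With a unique target and a unique controller, the result set has exactly one element, which is the case every tool can be expected to handle.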
Not all tools may be able to work with non-unique targets or
controllers.

Numeric Addition

In many cases in a specification, numbers are needed relative to a
base number. The .plus control identifies a number that is
constructed by adding the numeric values of the target and of the
controller.

Target and controller MUST be numeric.
If the target is a floating point number and the controller an integer
number, or vice versa, the sum is converted into the type of the
target; converting from a floating point number to an integer selects
its floor (the largest integer less than or equal to the floating
point number).

The example in contains the generic definition of a group
interval that gives a lower and an upper bound and optionally a
tolerance.
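
The addition and conversion rule stated above can be sketched in Python as follows. This is an illustration under stated assumptions, not the behavior of a particular CDDL implementation: the function name cddl_plus is hypothetical, and Python int and float stand in for CDDL integer and floating point values.

```python
import math

def cddl_plus(target, controller):
    """Sketch of .plus: add both operands, then convert to the target's type."""
    for operand in (target, controller):
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(operand, bool) or not isinstance(operand, (int, float)):
            raise TypeError("target and controller must be numeric")
    total = target + controller
    if isinstance(target, int):
        # Converting a floating point sum to an integer selects its floor.
        return math.floor(total)
    return float(total)

next_key = cddl_plus(3, 1)       # integer base plus integer offset
floored = cddl_plus(3, -1.5)     # sum 1.5, converted to the integer target
as_float = cddl_plus(1.5, 2)     # integer controller, floating point target
```

Note that the floor, not truncation toward zero, applies to negative sums as well: a sum of -2.5 with an integer target becomes -3.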
rect combines two of these groups into a map, one group for the X
dimension and one for the Y dimension.

String Concatenation

It is often useful to be able to compose string literals out of
component literals defined in different places in the specification.

The .cat control identifies a string that is built from a
concatenation of the target and the controller.
Target and controller MUST be strings.
The result of the operation has the type of the target.
The concatenation is performed on the bytes in both strings.
If the target is a text string, the result of that concatenation MUST
be valid UTF-8.

The example in
builds a text string named a by concatenating the target text string "foo"
and the controller byte string entered in a text form byte string literal.
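
The rules for .cat given in this section can be sketched in Python as follows. The function name cddl_cat is hypothetical, Python str and bytes stand in for CDDL text and byte strings, and the values bound to a and b are illustrative stand-ins chosen to contain newlines; they are not taken from the original figure.

```python
def cddl_cat(target, controller):
    """Sketch of .cat: concatenate bytes, return the target's string type."""
    def to_bytes(value):
        if isinstance(value, str):
            return value.encode("utf-8")
        if isinstance(value, bytes):
            return value
        raise TypeError("target and controller must be strings")

    joined = to_bytes(target) + to_bytes(controller)
    if isinstance(target, str):
        # A text string result has to be valid UTF-8; decode() raises otherwise.
        return joined.decode("utf-8")
    return joined

# Text string target, byte string controller: the result is a text string.
a = cddl_cat("foo", b"\n  bar\n  baz\n")
# The same text written as a single literal with escaped newlines.
b = "foo\n  bar\n  baz\n"
```

Both names end up denoting the same text string; the first form keeps the line structure visible in the source.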
(This particular idiom is useful when the text string contains
newlines, which, as shown in the example for b, may be harder to
read when entered in the format that the pure CDDL text string
notation inherits from JSON.)

String Concatenation with Dedenting

Multi-line string literals for various applications, including
embedded ABNF (see "Embedded ABNF"), need to be set flush left, at least
partially.
Often, having some indentation in the source code for the literal can
promote readability, as in .

The control operator .det works like .cat, except that both
arguments (target and controller) are independently dedented before
the concatenation takes place.
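
A minimal Python sketch of this behavior follows, using the two-step definition of dedenting given in this section: find the smallest number of leading spaces over all non-blank lines, then remove that many leading spaces from every line (or all of them, on a blank line that has fewer). The names dedent and cddl_det and the sample rules are hypothetical, and the sketch handles text strings only.

```python
def dedent(text):
    """Strip the smallest common run of leading spaces from every line."""
    lines = text.split("\n")

    def leading_spaces(line):
        return len(line) - len(line.lstrip(" "))

    # Only non-blank lines determine the amount to remove.
    amounts = [leading_spaces(line) for line in lines if line.strip()]
    amount = min(amounts, default=0)
    # Blank lines may have fewer spaces than that; remove what is there.
    return "\n".join(line[min(amount, leading_spaces(line)):] for line in lines)

def cddl_det(target, controller):
    """Sketch of .det on text: dedent both operands, then concatenate."""
    return dedent(target) + dedent(controller)

rules = """
    oid = 1*arc
    arc = %x00-7f
"""
combined = cddl_det("oid", rules)
# Dedenting a single item: the equivalent of lhs .cat ("" .det rhs).
only_rhs = "  lhs" + cddl_det("", "  rhs")
```

Because each operand is dedented independently, the indentation of the target has no influence on how much is removed from the controller.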
For the purposes of this specification, we define dedenting as:

1. determining the smallest amount of left-most white space (number of
   leading space characters) in all the non-blank lines, and

2. removing exactly that number of leading space characters from each
   line. For blank (white space only or empty) lines, there may be
   fewer (or no) leading space characters than this amount, in which
   case all leading space is removed.

(The name .det is a shortcut for "dedenting cat".
The maybe more obvious name .dedcat has not been chosen
as it is longer and may invoke unpleasant images.)

Occasionally, dedenting of only a single item is needed.
This can be achieved by using this operator with an empty string,
e.g., "" .det rhs or lhs .det "", which can in turn be combined
with a .cat: in the construct lhs .cat ("" .det rhs), only rhs
is dedented.

Embedded ABNF

Many IETF protocols define allowable values for their text strings in
ABNF (RFC 5234, RFC 7405).
It is often desirable to define a text string type in CDDL by
employing existing ABNF embedded into the CDDL specification.
Without specific ABNF support in CDDL, that ABNF would usually need to
be translated into a regular expression (if that is even possible).

ABNF is added to CDDL in the same way that regular
expressions were added: by defining a .abnf control operator.
The target is usually text or some restriction on it, the controller
is the text of an ABNF specification.

There are several small issues, with solutions given here:

* ABNF can be used to define byte sequences as well as UTF-8 text
  strings interpreted as Unicode scalar sequences. This means this
  specification defines two control operators: .abnfb for ABNF
  denoting byte sequences and .abnf for denoting sequences of
  Unicode scalar values (code points) represented as UTF-8 text strings.
  Both control operators can be applied to targets of either string
  type; the ABNF is applied to the sequence of bytes in the string,
  interpreting that as a sequence of bytes (.abnfb) or as a sequence
  of code points represented as a UTF-8 text string (.abnf).
  The controller string MUST be a text string.

* ABNF defines a list of rules, not a single expression (called
  "elements" in RFC 5234). This is resolved by requiring the
  controller string to be one valid "element", followed by zero or
  more valid "rule" separated from the element by a newline; so the
  controller string can be built by preceding a piece
  of valid ABNF by an "element" that selects from that ABNF and a newline.

* For the same reason, ABNF requires newlines; specifying newlines in
  CDDL text strings is tedious (and leads to essentially unreadable
  ABNF). The workaround employs the .cat operator introduced in
  "String Concatenation" and the syntax for text in byte strings.
  As is customary for ABNF, the syntax of ABNF itself (NOT the syntax
  expressed in ABNF!) is relaxed to allow a single linefeed as a
  newline.

* One set of rules provided in an ABNF specification is often used in
  multiple positions, in particular staples such as DIGIT and ALPHA.
  (Note that all rules referenced need to be defined in each ABNF
  operator controller string;
  there is no implicit import of Core ABNF or other rules.)
  The composition this calls for can be provided by the .cat
  operator, and/or by .det if there is indentation to be disposed of.

These points are combined into an example in , which uses
ABNF from RFC 3339 to specify one each of the CBOR tags defined in
RFC 8943 and RFC 8949.

Features

Traditionally, the kind of validation enabled by languages such as
CDDL provided a Boolean result: valid, or invalid.

In rapidly evolving environments, this is too simplistic. The data
models described by a CDDL specification may continually be enhanced
by additional features, and it would be useful even for a
specification that does not yet describe a specific future feature to
identify the extension point the feature can use, accepting such
extensions while marking them as such.

The .feature control annotates the target as making use of the
feature named by the controller. The latter will usually be a string.
A tool that validates an instance against that specification may mark
the instance as using a feature that is annotated by the
specification.

More specifically, the tool's diagnostic output might contain
the controller (right hand side) as a feature name, and the target
(left hand side) as a feature detail. However, in some cases, the target has
too much detail, and the specification might want to hint the tool
that more limited detail is appropriate. In this case, the controller
should be an array, with the first element being the feature name
(that would otherwise be the entire controller), and the second
element being the detail (usually another string), as illustrated in
. shows what could be the definition of a person, with
potential extensions beyond name and organization being marked
further-person-extension.
Extensions that are known at the time this definition is written can be
collected into $$person-extensions. However, future extensions
would be deemed invalid unless the wildcard at the end of the map is
added.
These extensions could then be specifically examined by a user or a
tool that makes use of the validation result; the label (map key)
actually used makes a fine feature detail for the tool's diagnostic
output.

Leaving out the entire extension point would mean that instances that
make use of an extension would be marked as wholesale invalid, making
the entire validation approach much less useful.
Leaving the extension point in, but not marking its use as special,
would render mistakes such as using the label organisation instead of
organization invisible. shows another example where .feature provides for
type extensibility.

A CDDL tool may simply report the set of features being used; the
control then only provides information to the process requesting the
validation.
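
A reporting-only tool of this kind could be sketched in Python as follows. The sketch assumes a validator that hands over one (controller, target value) pair for every .feature match. The names feature_entry and collect_features are hypothetical, as are the sample labels "nickname" and "number" and the array controller in the last sample entry.

```python
from collections import defaultdict

def feature_entry(controller, target_value):
    """Return (feature name, feature detail) for one .feature match."""
    if isinstance(controller, (list, tuple)):
        # Array controller: [feature name, detail chosen by the specification].
        return controller[0], controller[1]
    # Plain controller: the matched target value serves as the detail.
    return controller, target_value

def collect_features(matches):
    """Group details by feature name for a validator's diagnostic output."""
    used = defaultdict(set)
    for controller, target_value in matches:
        name, detail = feature_entry(controller, target_value)
        used[name].add(detail)
    return dict(used)

report = collect_features([
    ("further-person-extension", "organisation"),
    ("further-person-extension", "nickname"),
    (["json", "number"], 42),   # hypothetical array controller
])
```

Here the misspelled label organisation shows up as a detail of further-person-extension instead of passing unnoticed, and the array controller replaces the matched value 42 with the more limited detail "number".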
One could also imagine a tool that takes arguments allowing the tool to accept
certain features and reject others (enable/disable). The latter approach
could for instance be used for a JSON/CBOR switch, as illustrated in
.

It remains to be seen if the enable/disable approach can lead to new
idioms of using CDDL. The language currently has no way to enforce
mutually exclusive use of features, as would be needed in this example.

IANA Considerations

This document requests IANA to register the contents of
the following table into the registry
"CDDL Control Operators" of the IANA "Concise Data Definition Language (CDDL)" registries:

Name       Reference
.plus      [RFCthis]
.cat       [RFCthis]
.det       [RFCthis]
.abnf      [RFCthis]
.abnfb     [RFCthis]
.feature   [RFCthis]

Implementation Status

An early implementation of the control operator .feature has been
available in the CDDL tool described in since version 0.8.11.
The validator warns about each feature being used and provides the set
of target values used with the feature.
The other control operators defined in this specification are also
implemented as of versions 0.8.21 and 0.8.26 (double-handed .det).

Andrew Weiss' cddl-rs has an ongoing implementation of this draft
which is feature-complete except for the ABNF and dedenting support (https://github.com/anweiss/cddl/pull/79).

Security Considerations

The security considerations of RFC 8610 apply.

References

RFC 8610: Concise Data Definition Language (CDDL): A Notational Convention to Express Concise Binary Object Representation (CBOR) and JSON Data Structures. This document proposes a notational convention to express Concise Binary Object Representation (CBOR) data structures (RFC 7049). Its main goal is to provide an easy and unambiguous way to express structures for protocol messages and data formats that use CBOR or JSON.

IANA: Concise Data Definition Language (CDDL).

RFC 5234: Augmented BNF for Syntax Specifications: ABNF. Internet technical specifications often need to define a formal syntax. Over the years, a modified version of Backus-Naur Form (BNF), called Augmented BNF (ABNF), has been popular among many Internet specifications. The current specification documents ABNF. It balances compactness and simplicity with reasonable representational power. The differences between standard BNF and ABNF involve naming rules, repetition, alternatives, order-independence, and value ranges. This specification also supplies additional rule definitions and encoding for a core lexical analyzer of the type common to several Internet specifications. [STANDARDS-TRACK]

RFC 7405: Case-Sensitive String Support in ABNF. This document extends the base definition of ABNF (Augmented Backus-Naur Form) to include a way to specify US-ASCII string literals that are matched in a case-sensitive manner.

RFC 2119: Key words for use in RFCs to Indicate Requirement Levels. In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.

RFC 8174: Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words. RFC 2119 specifies common key words that may be used in protocol specifications. This document aims to reduce the ambiguity by clarifying that only UPPERCASE usage of the key words have the defined special meanings.

cddl-rs.

RFC 3339: Date and Time on the Internet: Timestamps. This document defines a date and time format for use in Internet protocols that is a profile of the ISO 8601 standard for representation of dates and times using the Gregorian calendar.

RFC 8943: Concise Binary Object Representation (CBOR) Tags for Date. The Concise Binary Object Representation (CBOR), as specified in RFC 7049, is a data format whose design goals include the possibility of extremely small code size, fairly small message size, and extensibility without the need for version negotiation. In CBOR, one point of extensibility is the definition of CBOR tags. RFC 7049 defines two tags for time: CBOR tag 0 (date/time string as per RFC 3339) and tag 1 (POSIX "seconds since the epoch"). Since then, additional requirements have become known. This specification defines a CBOR tag for a date text string (as per RFC 3339) for applications needing a textual date representation within the Gregorian calendar without a time. It also defines a CBOR tag for days since the date 1970-01-01 in the Gregorian calendar for applications needing a numeric date representation without a time. This specification is the reference document for IANA registration of the CBOR tags defined.

RFC 8949: Concise Binary Object Representation (CBOR). The Concise Binary Object Representation (CBOR) is a data format whose design goals include the possibility of extremely small code size, fairly small message size, and extensibility without the need for version negotiation. These design goals make it different from earlier binary serializations such as ASN.1 and MessagePack. This document obsoletes RFC 7049, providing editorial improvements, new details, and errata fixes while keeping full compatibility with the interchange format of RFC 7049. It does not create a new version of the format.

Acknowledgements

Jim Schaad suggested several improvements.
The .feature feature was developed out of a discussion with Henk Birkholz.
Paul Kyzivat helped isolate the need for .det.

.det is an abbreviation for "dedenting cat", but Det is also the name
of a German TV cartoon character created in the 1960s.