Internet-Draft Formal FSM October 2021
Petit-Huguenin Expires 14 April 2022 [Page]
Workgroup:
Network Working Group
Internet-Draft:
draft-petithuguenin-formal-fsm-00
Published:
Intended Status:
Standards Track
Expires:
Author:
M. Petit-Huguenin
Impedance Mismatch LLC

A Formal Language to Describe State Machines

Abstract

This document describes a meta-language for finite state machines.

Editorial Note

The intent of this Internet-Draft is not to be published as an RFC but to solicit reviews on the meta-language described in it. The meta-language described will become a section in the specific Internet-Draft that will succeed it.

The ABNF was generated using the RFC5234 DSL described in computerate specifying [I-D.petithuguenin-computerate-specifying].

Discussion of this document takes place on the FDT mailing list (mailto:fdt@w3.org), which is archived at https://mailarchive.ietf.org/arch/browse/fdt/.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 14 April 2022.

Table of Contents

1. Introduction

TBD.

2. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119][RFC8174] when, and only when, they appear in all capitals, as shown here.

3. Presentation Language for Finite State Machine

The only tentative of formalization of a standalone presentation language for Finite State Machines (FSM) at the IETF was Cosmogol [I-D.bortzmeyer-language-state-machines], which did not generate enough interest from the community to be published as an RFC.

Instead this document proposes an ad-hoc meta-language to describe FSMs and purposely delays efforts to make it a standard.

3.1. Semantics

The state machines formalized by this meta-language are based on Mealy machines, i.e, as finite state machines whose next state and actions are determined both by its current state and the current event, in the form of a transition. This model is enriched by the possibility of having actions, states and events carrying additional data. Regardless of these data the state machines described are still deterministic, in that for each possible value carried by a state and for each possible value carried by an event, only one transition is possible. This implies that there MUST NOT be a combination of state and event values that does not trigger exactly one of the transitions described in that state machine.

A simple calculus is used to assign values to the data carried by the next state and by the actions triggered by a transition. This permits to carry as much information as possible in a compact way to the people or processes that will implement, or verify conformance to, the described state machine.

3.2. Syntax

The syntax of the meta-language is described in the following sections in bottom up order by using ABNF [RFC5234], including the Core Rules defined in appendix B.

The retransmission state machine underlying section 6.2.1 of STUN [RFC8489] will be used to illustrate the syntax of the meta-language.

3.2.1. Formatting

Each element of the language is presented either on a single line or unfolded on several lines with the subsequent lines starting with either a space or an horizontal tab character.

  • LWS   =  [*WSP CRLF] 1*WSP

3.2.2. Names

Constants, states, events, actions, and their eventual parameters are all assigned to a name. Bindings are assignments of names to a parameter values in the context of a transition.

A name is composed of an alphabetic character, followed by a repetition of either alphanumeric characters, the "-" character, or the "_" character. Names are case-sensitive.

  • name  =  ALPHA *ALPHA / DIGIT / "-" / "_"

"init", and "RTO" are examples of name.

3.2.3. Constants

Constants are immutable values that are used by transitions and set to a specific value before the state machine is instantiated. The specific value used is not part of the definition of the state machine.

Constants can also be thought as total functions that takes the current value of all the possible bindings available when that constant is accessed and return a value. Here it is the function that is constant, although the value returned may not be when it depends on the values of bindings.

Constants are declared with the keyword "constant". The name of a constant MUST be unique among all constant names defined in an FSM.

  • constant =  name [LWS] "<-" [LWS] "constant"

"RTO <- constant" and "POWER2 <- constant" are two examples of constants used by the STUN retransmission state machine.

3.2.4. State

States represent the possible values of an implicit binding that is the current state in the execution of a state machine. State can carry parameters, which must be named at the same time than the state. States are declared with the keyword "state". The name of a state MUST be unique among all state names defined in an FSM. The name of a state parameter MUST be unique among all state parameter names for a specific state.

  • state =  name [LWS] "<-" [LWS] "state" *(LWS name)

"Init <- state" and "Sending <- state request count" are two examples of states used by the STUN retransmission state machine (we use the convention that states names start with an uppercase).

In the second example the Sending state carries the number of remaining retransmissions and the request to be retransmitted.

3.2.5. Event

An event is a signal external of the state machine that when received triggers a transition. An event can carry parameters which are declared similarly to states. Events are declared with the keyword "event". The name of an event MUST be unique among all event names defined in an FSM. The name of an event parameter MUST be unique among all event parameter names for a specific event.

  • state =  name [LWS] "<-" [LWS] "state" *(LWS name)

The "init" event is implicitly defined and is always the first event received by a state machine. It does not carry any parameter.

"appRequest <- event request" and "timeout <- event" are two examples of events used by the STUN retransmission state machine (we use the convention that event names start with a lowercase).

3.2.6. Action

An action is a signal sent by the state machine following a transition, and that is sent externally. An action can carry parameters which are declared similarly to states. Actions are declared with the keyword "action". The name of an action MUST be unique among all action names defined in an FSM. Although actions names are not used elsewhere, the name of an action parameter MUST be unique among all action parameter names for a specific action.

  • action   =  name [LWS] "<-" [LWS] "action" *(LWS name)

Note that an external piece of code can be represented as either a constant, or as both an action and one or more events. A rule of thumb is that an action/event should be used for something that is taking time or that is doing I/O and that a constant as function should be used when the value can be calculated by using only the constants and bindings defined in the state machine.

"netRequest <- action request" and "timer <- action delay" are two examples of actions used by the STUN retransmission state machine (we use the convention that action names start with a lowercase).

3.2.7. Event and State Patterns

An event pattern is used to select the particular event for which a particular transition is described. The pattern matching is done on the name of the event and possibly on values for the parameters of that event.

Using "*" as the name of the event means that this transition is for all possible events. In that case it is not possible to pattern match on event parameters.

The meta-language permits to combine two different ways to pattern match on a parameter: positional pattern matching and named pattern matching.

Positional pattern matching is useful when most of the parameters need to be pattern matched or will be used later. On the other hand if only one or two parameters are needed, it becomes quickly annoying to use positional pattern matching.

Named pattern matching is useful when a few of the parameters need to be pattern matched or will be used later. On the other hand if most of the parameters are needed, it becomes quickly annoying to have to name all of them.

Combining the two ways permits to get the best of each type of pattern matching, especially if the most used parameters are declared first.

So state parameters can be patterned-matched using positional pattern, i.e., using either a value, a name, or a "*" for each of the parameter, in the same order than parameters were declared for the event.

Using a value means that this parameter must have this exact value.

Using a name in a positional pattern binds that name to the value of the parameter in that position. It means that there is no restriction to the value that this parameter can take, but that we can reference that value elsewhere. That name can be identical to the parameter name, but it does not have to be.

Using a "*" in a positional pattern means that there is no restriction of the value of that parameter, but that we will not be able to use that value.

  • num-value            =  "0" / (%x31-39 *DIGIT)
    
    str-value            =  DQUOTE *( %x01-21 / %x32-5B / %x5D-7F /
                            ("\" DQUOTE) / ("\" "\") ) DQUOTE
    
    bool-value           =  "true" / "false"
    
    value                =  num-value / str-value / bool-value
    
    pattern              =  "*" / name / value
    
    positional-patterns  =  pattern *(LWS pattern)

Alternatively variables can be pattern-matched using named patterns, in which each parameter is named and possibly associated with either a value, a name, or "*" by using the "=" symbol. The parameter name MUST be one of the names used when declaring the event. Parameter names do not have to be used in the same order that declared in the event, but they cannot be used more than once.

Like for positional pattern matching, using a value means that the named parameter must have this exact value. A parameter name alone, i.e. without the "=" symbol, implicitly creates a binding to that parameter, binding that has the same name than the parameter.

Using a value after the "=" symbol means that this parameter must have this exact value.

Using a name after the "=" symbol binds that name to the value of the parameter. It means that there is no restriction to the value that this parameter can take, but that use that value elsewhere. That name can be identical to the parameter name, but it does not have to be.

Using a "*" after the "=" symbol means that there is no restriction of the value of that parameter, but that we do not need to use that value.

A name declared as an event parameter but that is not used in either the positional or named patterns is implicitly associated with "*". This means that an empty set of named patterns will match any values for any parameter for the specified event.

  • named-pattern  =  name [[LWS] "=" [LWS] pattern]
    
    named-patterns =  "{" [LWS] [named-pattern *([LWS] "," [LWS]
                      named-pattern)] [LWS] "}"

Positional and named patterns can be combined by listing first a subset of the positional patterns followed by the named patterns, as long as positional patterns come first, and as long as named patterns do not use the name of a parameter that is already used by a positional pattern.

  • patterns =  "*" / ("(" [LWS] "*" [LWS] ")") / name / ("(" [LWS]
                name [LWS] ")") / ("(" [LWS] name [LWS
                positional-patterns] [LWS named-patterns] [LWS] ")")

The state pattern is identical to the event pattern, but with the name of a state replacing the name of the event, and the names of the state parameters replacing the names of the event parameters.

"*", "init", "(Sending r c)", and "(appRequest {})" are examples of event or state patterns.

3.2.8. Expression

An expression is the combination of constant names, binding names, or values with binary operators between each of them.

Each operators is assigned a precedence that permits to unambiguously resolve the order these operators are processed. In the following ABNF, an operator defined by a rule with a name ending with a lower number binds loosely than an operator with a rule name ending with an higher number.

  • ops-1 =  "||"
    
    ops-2 =  "&&"
    
    ops-3 =  "<" / "<=" / "==" / "!=" / ">" / ">="
    
    ops-4 =  "+" / "-"
    
    ops-5 =  "*" / "/" / "%"
    
    ops   =  ops-1 / ops-2 / ops-3 / ops-4 / ops-5

Parentheses can be used to override the precedence rules. The two unary operators binds as close as possible to the next value, name or opening parenthesis.

  • un-ops   =  "!" / "-"
    
    sub-expr =  ([un-ops [LWS]] name) / ([un-ops [LWS]] value) /
                ( [un-ops [LWS]] ("(" [LWS] ops [LWS] ")") )
    
    expr     =  sub-expr *([LWS] ops [LWS] sub-expr)

The type (numeric, character string, or boolean) of an expression is inferred by recursively applying a series of rules on the combination of names, values, and operators for that expression.

  • The type of a numeric value is the numeric type.
  • The type of a character string value is the character string type.
  • The type of a boolean value is the boolean type.
  • The type of the operators "||" and "&&" is boolean if the expressions on the right and left of them are of type boolean.
  • The type of the operators "<", "<=", "==", "!=", ">", and "`>="` is numeric if the expressions on the right and left of them are of type numeric.
  • The type of the operators "<", "<=", "==", "!=", ">", and ">=" is character string if the expressions on the right and left of them are of type character string.
  • The type of the operators "+", "-", "*", "/", and "%" is numeric if the expressions on the right and left of them are of type numeric.
  • the type of the unary operator "!" is boolean if the expression on the right of it is of type boolean.
  • the type of the unary operator "-" is numeric if the expression on the right of it is of type numeric.
  • Each binding associated to an event parameter or state parameter has the same type than the parameter.
  • Each constant in a whole state machine have a unique type.
  • Each event parameter, state parameter, or action parameter in a whole state machine has a unique type.
  • Any expression that does not match one of these rules is incorrect.

"(1 + 1) * 4 < name && true || !false" is an example of expression. Its type is the boolean type.

3.2.9. State Assignment

A state assignment assigns an expression to each parameter of the next state. The expression can use any constant, and any binding that is defined in the state or event pattern in the transition. The type of the expressions assigned to a specific parameter of a state MUST be the same across all transitions.

Most of the time state parameters are assigned to the same value than is carried by the previous state. To simplify this use case, it is also possible to assign only the parameters that require modification, and to implicitly assign the same value by default for the others. Positional assignments and named assignments can be used in the same way than positional patterns and named patterns are.

  • named-assignment  =  name [[LWS] "=" [LWS] expr]
    
    named-assignment  =  ("{" [LWS] "}") / ("{" [LWS] named-assignment
                         *([LWS] "," [LWS] named-assignment) [LWS]
                         "}")
    
    state-assignment  =  name named-assignment

"Disabled {last=last+1}" is an example of state assignment.

3.2.10. Action Assignment

An action assignment assigns an expression to each parameter of an action. The expression can use any constant, and any binding that is defined in the state or event pattern in the transition. The type of the expressions assigned to a specific parameter of an action MUST be the same across all transitions.

The number of expressions after the name of the action must be equal to the number of parameters declared for this action.

  • action-assignment =  name *(LWS expr)

"timer (RTO * POWER2 c)" is an example of action assignment.

3.2.11. Transition

A transition describes what happen when an event is received on a state and under an optional condition.

A transition is composed of 5 elements: a state pattern, an event pattern, an optional condition which is an expression of type boolean, an optional next state, and a possibly empty set of actions.

If the next state is not present then the transition leaves the state machine in the same state.

  • transition  =  patterns LWS patterns [[LWS] "|" [LWS] expr] [LWS]
                   "->" [[LWS] state-assignment] *(LWS
                   action-assignment)

Note that the result of an expression cannot be used in an expression, as the result of each expression is built only from the values set in the event and state patterns. That makes the execution of the state change and actions atomic, deterministic, and unambiguous.

The name of all the bindings in a transition MUST be unique in that transition.

The following is an example of transition.

  • timeout (Disabled {last}) ->
      Disabled {last=last+1}
      timer CONNECTIVITY_TIMEOUT
      probe (last+1) 0
    Figure 1

3.2.12. State Machine

A state machine is composed of constants, states, events, actions, transitions, empty lines, or comments.

  • element        =  constant / state / event / action / transition /
                      ("#" *VCHAR)
    
    state-machine  =  [element] *(CRLF [element])

The following example is the STUN retransmission state machine.

  • init * -> Init
    (appRequest r) Init ->
      Sending r 0
      netRequest r
      timer RTO
    
    * Init ->
    
    (appRequest _) (Sending {}) ->
    
    timeout (Sending r c) | c < RC - 1 ->
      Sending {count=c+1}
      netRequest r
      timer (RTO * POWER2 c)
    
    timeout (Sending r c) | c = RC - 1 ->
      Last
      netRequest r
      timer (RTO * RM)
    
    netIcmp Sending ->
      End
      cancelTimer
      appError
    
    netResponse Sending
      End
      cancelTimer
      appResponse
    
    (appRequest _) Last ->
    
    timeout Last ->
      End
      appTimeout
    
    netIcmp Last ->
      End
      cancelTimer
      appError
    
    netResponse Last
      End
      cancelTimer
      appResponse
    
    * End ->
    Figure 2

4. References

4.1. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, , <https://www.rfc-editor.org/info/rfc2119>.
[RFC5234]
Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", RFC 5234, DOI 10.17487/RFC5234, , <https://www.rfc-editor.org/info/rfc5234>.
[RFC8174]
Leiba, B., "RFC 2119 Key Words: Clarifying the Use of Capitalization", BCP 14, RFC 8174, DOI 10.17487/RFC8174, , <https://www.rfc-editor.org/info/rfc8174>.

4.2. Informative References

[I-D.bortzmeyer-language-state-machines]
Bortzmeyer, S., "Cosmogol: a language to describe finite state machines", Work in Progress, Internet-Draft, draft-bortzmeyer-language-state-machines-01, , <https://datatracker.ietf.org/doc/html/draft-bortzmeyer-language-state-machines-01>.
[I-D.petithuguenin-computerate-specifying]
Petit-Huguenin, M., "The Computerate Specifying Paradigm", Work in Progress, Internet-Draft, draft-petithuguenin-computerate-specifying-13, , <https://datatracker.ietf.org/doc/html/draft-petithuguenin-computerate-specifying-13>.
[RFC8489]
Petit-Huguenin, M., Salgueiro, G., Rosenberg, J., Wing, D., Mahy, R., and P. Matthews, "Session Traversal Utilities for NAT (STUN)", RFC 8489, DOI 10.17487/RFC8489, , <https://www.rfc-editor.org/info/rfc8489>.

Appendix A. ABNF

The complete ABNF for the meta-language is listed here:

LWS                  =  [*WSP CRLF] 1*WSP

name                 =  ALPHA *ALPHA / DIGIT / "-" / "_"

constant             =  name [LWS] "<-" [LWS] "constant"

state                =  name [LWS] "<-" [LWS] "state" *(LWS name)

event                =  name [LWS] "<-" [LWS] "event" *(LWS name)

action               =  name [LWS] "<-" [LWS] "action" *(LWS name)

num-value            =  "0" / (%x31-39 *DIGIT)

str-value            =  DQUOTE *( %x01-21 / %x32-5B / %x5D-7F / ("\"
                        DQUOTE) / ("\" "\") ) DQUOTE

bool-value           =  "true" / "false"

value                =  num-value / str-value / bool-value

pattern              =  "*" / name / value

positional-patterns  =  pattern *(LWS pattern)

named-pattern        =  name [[LWS] "=" [LWS] pattern]

named-patterns       =  "{" [LWS] [named-pattern *([LWS] "," [LWS]
                        named-pattern)] [LWS] "}"

patterns             =  "*" / ("(" [LWS] "*" [LWS] ")") / name / ("("
                        [LWS] name [LWS] ")") / ("(" [LWS] name [LWS
                        positional-patterns] [LWS named-patterns]
                        [LWS] ")")

ops-1                =  "||"

ops-2                =  "&&"

ops-3                =  "<" / "<=" / "==" / "!=" / ">" / ">="

ops-4                =  "+" / "-"

ops-5                =  "*" / "/" / "%"

un-ops               =  "!" / "-"

sub-expr             =  ([un-ops [LWS]] name) / ([un-ops [LWS]]
                        value) / ( [un-ops [LWS]] ("(" [LWS] ops
                        [LWS] ")") )

expr                 =  sub-expr *([LWS] ops [LWS] sub-expr)

named-assignment     =  name [[LWS] "=" [LWS] expr]

named-assignment     =  ("{" [LWS] "}") / ("{" [LWS] named-assignment
                        *([LWS] "," [LWS] named-assignment) [LWS]
                        "}")

state-assignment     =  name named-assignment

action-assignment    =  name *(LWS expr)

transition           =  patterns LWS patterns [[LWS] "|" [LWS] expr]
                        [LWS] "->" [[LWS] state-assignment] *(LWS
                        action-assignment)

element              =  constant / state / event / action /
                        transition / ("#" *VCHAR)

state-machine        =  [element] *(CRLF [element])

Acknowledgements

Thanks to Stephane Bryant and Colin Perkins for the comments, suggestions, and questions that helped improve this document.

Author's Address

Marc Petit-Huguenin
Impedance Mismatch LLC