Network Working Group                  Richard Price, Siemens/Roke Manor 
Internet-Draft                        Robert Finking, Siemens/Roke Manor 
Expires: February 2004               Abigail Surtees, Siemens/Roke Manor 
                                         Mark A West, Siemens/Roke Manor 
 
                                                        October 20, 2003 
 
          Formal Notation for Robust Header Compression (ROHC-FN) 
                 <draft-price-rohc-formal-notation-00.txt> 
  
Status of this memo 
 
   This document is an Internet-Draft and is in full conformance with 
   all provisions of Section 10 of RFC2026. 
    
   Internet-Drafts are working documents of the Internet Engineering 
   Task Force (IETF), its areas, and its working groups.  Note that 
   other groups may also distribute working documents as Internet-
   Drafts. 
    
   Internet-Drafts are draft documents valid for a maximum of six months 
   and may be updated, replaced, or obsoleted by other documents at any 
   time.  It is inappropriate to use Internet-Drafts as reference 
   material or cite them other than as "work in progress". 
    
   The list of current Internet-Drafts can be accessed at 
   http://www.ietf.org/ietf/1id-abstracts.txt 
    
   The list of Internet-Draft Shadow Directories can be accessed at 
   http://www.ietf.org/shadow.html 
    
   This document is a submission of the IETF ROHC WG.  Comments should 
   be directed to its mailing list, rohc@ietf.org. 
    
    
Abstract 
         
   This document defines a proposal for the ROHC-FN: a formal notation 
   for specifying how to compress and decompress fields from an 
   arbitrary protocol stack.  ROHC-FN is proposed with the intention of 
   simplifying the creation of new compression profiles to fit within the 
   ROHC [RFC-3095] framework. 
    
    
Price et al.                                                    [Page 1]  
Internet-Draft                   ROHC-FN                October 20, 2003   
                  

Table of contents
   1.  Introduction..................................................2  
   2.  Terminology...................................................3 
   3.  Overview of ROHC-FN...........................................3 
   4.  Normative definition of ROHC-FN...............................7 
   5.  Encoding methods..............................................12 
   6.  Prolog definitions of encoding methods........................25 
   7.  Bit level worked example......................................34 
   8.  Security considerations.......................................40 
   9.  Acknowledgements..............................................40 
   10. Authors' addresses............................................40 
   11. References....................................................40 
    
   Appendix A. Supporting Prolog Code................................42 
    
1. Introduction 
                
   This draft is the new proposal for the formal notation.  The 
   intention is to update draft-ietf-rohc-formal-notation-01.txt with 
   this notation if the members of the ROHC working group agree with 
   this proposal. 
    
   ROHC-FN is a simple notation designed to help with the creation of 
   new ROHC [RFC-3095] header compression profiles.  ROHC-FN offers a 
   library of "encoding methods" that are often used in ROHC profiles, 
   so new profiles can be defined without needing to redefine this 
   library from scratch. 
    
   Informally, an encoding method is just a function that converts 
   uncompressed data into compressed data.  The simplest encoding 
   methods only have one input and output: the input is an uncompressed 
   field and the output is the compressed version of the field.  More 
   complex encoding methods can compress multiple fields at the same 
   time, e.g. "list" encoding from [RFC-3095], which is designed to 
   compress an ordered list of fields. 
    
   The features required for defining ROHC-FN are offered by the 
   programming language Prolog.  As such, ROHC-FN is defined both in 
   English and in Prolog.  The English definition is more digestible 
   but less formal.  The Prolog definition is less digestible but 
   totally precise, in that it allows any profile defined using ROHC-FN 
   to be compiled and executed, so that the profile's behaviour can be 
   observed on real data.  Hence, where any ambiguity appears in the 
   English definition, the Prolog definition will clarify the issue.  
   There should, however, be no conflicts between the English and 
   Prolog definitions.  Any such conflicts should be reported to the 
   authors. 
    

 
   Note that this draft contains a standalone definition of ROHC-FN 
   (i.e. there is no need to understand Prolog in order to understand 
   ROHC-FN). 
    
    
2. Terminology 
               
   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 
   document are to be interpreted as described in RFC-2119 [RFC-2119]. 
    
   Control field  
        
     Control fields are transmitted from a ROHC compressor to a ROHC  
     decompressor, but are not part of the uncompressed protocol header  
     itself.  An example is a checksum field over the header to ensure  
     robustness against bit errors and dropped packets. 
    
   Encoding method 
    
     Encoding methods are functions that can be applied to compress  
     fields in a protocol header. 
    
   Field 
    
     ROHC-FN divides the protocol to be compressed into a set of  
     contiguous bit patterns known as fields. 
    
   Library of encoding methods 
    
     The library of encoding methods contains a number of commonly used  
     encoding methods for compressing header fields. 
    
   Profile 
     
     A ROHC [RFC-3095] profile is a description of how to compress a  
     certain protocol stack over a certain type of link.  Each profile  
     includes packet formats to compress the headers and a state machine  
     to control the actions of each endpoint. 
    
3. Overview of ROHC-FN 
                       
   This section gives an overview of ROHC-FN and explains how it can be 
   used to compress header fields as part of a ROHC profile. 
    
3.1. Scope of ROHC-FN 
                      
   The following section describes the scope of ROHC-FN, and explains 
   how it relates to the overall ROHC framework and also how it relates 
   to specific ROHC profiles. 
    
 
 
   The ROHC framework is common to all profiles: it defines the general 
   principles for doing ROHC compression.  It defines the profile 
   concept, which makes ROHC a general platform for compression schemes.  
   It sets link layer requirements, and in particular negotiation 
   requirements for all ROHC profiles.  It defines a set of common 
   functions such as Context Identifiers (CIDs) and padding and 
   segmentation (useful if the link layer can only handle a limited 
   range of packet sizes).  It also defines common packet formats (IR, 
   IR-DYN, Feedback, Short-CID expander, etc.), and it defines a 
   generic, profile independent, handling of feedback. 
    
   A ROHC profile is a description of how to compress a certain protocol 
   stack over a certain type of link.  For example, ROHC profiles are 
   available for RTP/UDP/IP and many other protocol stacks. 
    
   Each ROHC profile can be further subdivided into the following two 
   components: 
    
   a)  Packet formats for compressing and decompressing headers 
   b)  State machine 
    
   The job of the packet formats is to define how to compress and 
   decompress headers.  The packet formats must define the compressed 
   version of each uncompressed header (and vice versa). 
    
   The packet formats will typically compress headers relative to a 
   "context" of field values from previous headers in a flow.  This 
   improves the overall compression ratio by exploiting the redundancy 
   between successive headers. 
    
   The job of the state machine is to ensure that the profile is robust 
   against bit errors and dropped packets.  The state machine manages 
   the context, providing feedback and other mechanisms to ensure that 
   the compressor and decompressor contexts are kept in sync. 
    
   ROHC-FN is designed to help provide the packet formats for use in new 
   ROHC profiles.  It offers a library of encoding methods for 
   compressing fields, and a mechanism for combining these encoding 
   methods to create new packet formats tailored to a specific protocol 
   stack.  Note however that the state machine for the new profiles is 
   beyond the scope of ROHC-FN, and must be provided separately. 
    
3.2. Example using IPv4 
    
   Rather than immediately diving in with a formal definition of ROHC-
   FN, the following section will give an overview of how the notation 
   is used by means of an example.  The example will develop the formal 
   notation for an encoding method capable of compressing a single, 
   well-known header: the IPv4 header. 
 

 
   The first step is to specify the overall encoding method for the IPv4 
   header.  In this case we will use the single_packet_format encoding 
   method.  This encoding method compresses a header by dividing it into 
   fields, compressing each field in turn, and then sending a single 
   packet containing the compressed version of each field.  We define 
   this by writing the following in ROHC-FN: 
    
   ipv4_header           ::=   single_packet_format, 
    
   { 
    
   The symbol "::=" means "is encoded as", so the above expression 
   defines that the IPv4 header is encoded by sending a single packet 
   format (containing the compressed version of each field in the IPv4 
   header). 
    
   Note the opening curly brace, which indicates that subsequent 
   definitions are local to the ipv4_header.  This scoping mechanism 
   helps to clarify which fields belong to which headers: it becomes 
   especially useful when compressing complex protocol stacks with 
   several headers and fields, often sharing the same names. 
    
   The next step is to specify the fields contained in the uncompressed 
   IPv4 header, which is accomplished in ROHC-FN as follows: 
    
     uncompressed_data   ::=   version,        %  4 bits 
                               header_length,  %  4 bits 
                               tos,            %  6 bits 
                               ecn,            %  2 bits 
                               length,         % 16 bits 
                               id,             % 16 bits 
                               reserved,       %  1 bit 
                               dont_frag,      %  1 bit 
                               more_fragments, %  1 bit 
                               offset,         % 13 bits 
                               ttl,            %  8 bits 
                               protocol,       %  8 bits 
                               checksum,       % 16 bits 
                               src_addr,       % 32 bits 
                               dest_addr,      % 32 bits 
    
   After this, we specify the fields contained in the compressed header.  
   Exactly what appears in this list of fields depends on the encoding 
   methods used to encode the uncompressed fields - we may be able to 
   compress certain fields down to 0 bits, in which case they do not 
   need to be sent in the compressed header at all as explained below.  
   Note that the order of the fields in the compressed header is 
   independent of the order of the fields in the uncompressed header. 
    

 
     compressed_data     ::=   src_addr,       % 32 bits 
                               dest_addr,      % 32 bits  
                               length,         % 16 bits 
                               id,             % 16 bits  
                               ttl,            %  8 bits 
                               protocol,       %  8 bits 
                               tos,            %  6 bits 
                               ecn,            %  2 bits 
                               dont_frag,      %  1 bit 
                                
   The next step is to specify the encoding methods for each field in 
   the IPv4 header.  These will be taken from well-known encoding 
   methods in the ROHC-FN library.  Note that the intention here is to 
   illustrate the use of the notation rather than to describe the 
   optimum method of compressing IPv4 headers.  For the purpose of the 
   example we will therefore use just three encoding methods from the 
   ROHC-FN library. 
    
   The "value" encoding method can compress any field whose length and 
   value are fixed.  No compressed bits need to be sent because the 
   field can be reconstructed using its known size and value.  The 
   "value" encoding method is used to compress five fields in the IPv4 
   header as described below: 
    
     version             ::=   value(4, 4), 
     header_length       ::=   value(4, 5), 
     reserved            ::=   value(1, 0), 
     more_fragments      ::=   value(1, 0), 
     offset              ::=   value(13, 0), 
    
   Note that the first parameter indicates the length of the 
   uncompressed field in bits, and the second parameter gives its 
   integer value. 
    
   The "irregular" encoding method can be used to encode any field whose 
   length is fixed.  It is a very general encoding method that can be 
   used for fields to which no other encoding method applies.  All of 
   the bits in the uncompressed field need to be sent; hence this 
   encoding does not give any compression. 
    
     tos                 ::=   irregular(6), 
     ecn                 ::=   irregular(2), 
     length              ::=   irregular(16), 
     id                  ::=   irregular(16), 
     dont_frag           ::=   irregular(1), 
     ttl                 ::=   irregular(8), 
     protocol            ::=   irregular(8), 
     src_addr            ::=   irregular(32), 
     dest_addr           ::=   irregular(32), 
    

 
   The final encoding method is at the opposite extreme of generality: 
   "inferred_ip_checksum" is a specific encoding method for calculating 
   the IP checksum from the rest of the header values.  Like the "value" 
   encoding method, no compressed bits need to be sent, since the field 
   value can be entirely reconstructed using the values in the other 
   fields of the IP header. 
    
     checksum            ::=   inferred_ip_checksum 
   } 
 
   We have now defined the format of the compressed IPv4 header, and 
   provided enough information to allow an implementation to construct 
   the compressed header from an uncompressed header and vice versa.  
   This completes the example.  
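
   As an informal illustration, the IPv4 example can be mimicked in a 
   few lines of Python.  This is only a sketch and is not part of 
   ROHC-FN: the function names, the use of '0'/'1' strings to hold 
   bits, and the choice to reject headers that do not fit the profile 
   are all assumptions made for the illustration.  The field names, 
   widths and orderings are those of the example.

```python
# Uncompressed IPv4 header: field names and widths in bits, in wire order.
UNCOMPRESSED = [
    ("version", 4), ("header_length", 4), ("tos", 6), ("ecn", 2),
    ("length", 16), ("id", 16), ("reserved", 1), ("dont_frag", 1),
    ("more_fragments", 1), ("offset", 13), ("ttl", 8), ("protocol", 8),
    ("checksum", 16), ("src_addr", 32), ("dest_addr", 32),
]
WIDTH = dict(UNCOMPRESSED)
UNCOMP_ORDER = [name for name, _ in UNCOMPRESSED]
# Compressed header order; every field listed here is "irregular".
COMP_ORDER = ["src_addr", "dest_addr", "length", "id", "ttl",
              "protocol", "tos", "ecn", "dont_frag"]
# Fields encoded with "value": known constants, sent as zero bits.
VALUE_FIELDS = {"version": 4, "header_length": 5, "reserved": 0,
                "more_fragments": 0, "offset": 0}


def to_bits(fields, order):
    """Concatenate the named fields as a string of '0'/'1' characters."""
    return "".join(format(fields[n], "0%db" % WIDTH[n]) for n in order)


def from_bits(bits, order):
    """Split a bit string into the named fields."""
    fields, pos = {}, 0
    for n in order:
        fields[n] = int(bits[pos:pos + WIDTH[n]], 2)
        pos += WIDTH[n]
    return fields


def ip_checksum(fields):
    """Internet checksum of the header, computed with checksum = 0."""
    bits = to_bits(dict(fields, checksum=0), UNCOMP_ORDER)
    total = sum(int(bits[i:i + 16], 2) for i in range(0, len(bits), 16))
    while total >> 16:                      # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def compress(header):
    """20-byte IPv4 header -> compressed bit string (121 bits)."""
    bits = format(int.from_bytes(header, "big"), "0160b")
    fields = from_bits(bits, UNCOMP_ORDER)
    for name, expected in VALUE_FIELDS.items():
        if fields[name] != expected:
            raise ValueError("value() does not apply to " + name)
    if fields["checksum"] != ip_checksum(fields):
        raise ValueError("checksum cannot be inferred")
    return to_bits(fields, COMP_ORDER)


def decompress(bits):
    """Compressed bit string -> original 20-byte IPv4 header."""
    fields = from_bits(bits, COMP_ORDER)
    fields.update(VALUE_FIELDS)                 # "value" fields
    fields["checksum"] = ip_checksum(fields)    # "inferred_ip_checksum"
    return int(to_bits(fields, UNCOMP_ORDER), 2).to_bytes(20, "big")
```

   With these definitions the 160-bit header shrinks to the 121 bits 
   listed under compressed_data, and decompress(compress(h)) returns h 
   for any header h that fits the profile.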
    
3.3. Adding robustness 
                       
   ROHC profiles are designed to be "robust" against packet loss and 
   residual bit errors on the link over which header compression takes 
   place.  A well-designed profile can cope with these errors without 
   losing additional packets or introducing additional bit errors in the 
   decompressed headers. 
    
   ROHC-FN offers two techniques to help ensure that a ROHC profile is 
   robust.  Firstly, the encoding methods in the ROHC-FN library are 
   designed to tolerate a certain number of dropped or misordered 
   packets between the compressor and decompressor.  For example, Least-
   Significant Bit (LSB) encoding can robustly compress fields that 
   change by a small value between successive headers. 
    
   Secondly, the "CRC" encoding method can be used to provide a CRC over 
   the original uncompressed header, to detect faulty decompressed 
   headers and prevent them from mistakenly being used to update the 
   context.  This situation is illustrated in Figure 1: 
    
                                        CRC failure 
    +--------------+ +--------------+ ================ +--------------+ 
   -| Valid header |-| Valid header |-|Invalid header|-| Valid header |- 
    +--------------+ +--------------+ ================ +--------------+ 
          |  |             |  |                              |  | 
   >------+  +------>------+  +--------------->--------------+  +------> 
                 context                   context 
    
         Figure 1: Preventing accidental corruption of the context 
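
   The behaviour shown in Figure 1 can be sketched as follows.  This is 
   an illustration only: the class and function names are hypothetical, 
   and the low 8 bits of zlib.crc32 are used purely as a stand-in 
   check value; it is not one of the CRCs defined for ROHC.

```python
import zlib


def header_crc(header: bytes) -> int:
    # Stand-in check value (assumption): low 8 bits of CRC-32.
    return zlib.crc32(header) & 0xFF


class Decompressor:
    """Keeps the last header that passed its CRC as the context."""

    def __init__(self):
        self.context = None

    def accept(self, decompressed: bytes, received_crc: int) -> bool:
        # A header that fails the CRC is discarded and must not be
        # used to update the context.
        if header_crc(decompressed) != received_crc:
            return False
        self.context = decompressed
        return True
```

   A header that fails the check leaves the context untouched, so the 
   next valid header is still decompressed relative to the last valid 
   one.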
    
4. Normative definition of ROHC-FN 
                                   
   This section gives the normative definition of ROHC-FN, including its 
   syntax and any data structures that it requires. 
    
4.1. ROHC-FN syntax 
 
 
 
    
   Defining how to compress a field or header using ROHC-FN is extremely 
   simple.  All that needs to be provided is the following: 
    
   a)  A name for the field or header to be compressed. 
   b)  An encoding method, together with any parameters it needs, 
       including subfield parameters. 
    
   For example: 
    
     field_name ::= encoding_method(param1, param2, ...), 
     { 
        sub_field_1  ::= ... 
        sub_field_2  ::= ... 
        etc. 
     } 
    
   All formal notation is represented using this simple construct.  
   Because the construct can be nested, complex relationships can be 
   notated. 
    
4.2. Comments 
    
   Comments do not affect the formal meaning of what is notated, but can 
   be used to improve readability.  Their use is entirely optional. 
    
   Note that many readers will read a profile in terms of its intuitive 
   English meaning, and will not necessarily differentiate between the 
   formal and commentary parts of the profile.  It is therefore 
   essential that any comments are correct.  They should not be 
   considered of lesser importance than the rest of the notation in a 
   profile, and should be strictly consistent with it. 
    
   If the profile author does wish to insert free English text into the 
   profile, in order to explain why something has been done a particular 
   way, to clarify the intended meaning of the notation, or to elaborate 
   on some point, they can do so by use of one of the two commenting 
   styles described below. 
    
4.2.1. End Of Line Comments 
                            
    
   The end of line comment style makes use of the % comment character.  
   Any text between the % character and the end of the line has no 
   formal meaning.  For example: 
    
     %----------------------------------------------------------------- 
     %    IR-REPLICATE packet formats 
     %----------------------------------------------------------------- 
    
     % The following fields are included in all of the IR-REPLICATE 
     % packet formats: 
     % 
    
     replicate_common    ::=   discriminator,    %    8 bits 
                               tcp.seq_number,   %   32 bits 
                               tcp.flags.ecn,    %    2 bits 
    
4.2.2. Block Comments 
                      
   The block comment style makes use of the /* and */ delimiters.  Any 
   text between the /* and the */ has no formal meaning.  For example: 
    
     /****************************************************************** 
      *   IR-REPLICATE packet formats 
      *****************************************************************/ 
    
     /* The following fields are included in all of the IR-REPLICATE 
      * packet formats: 
      */ 
    
     replicate_common    ::=   discriminator,    /*   8 bits */ 
                               tcp.seq_number,   /*  32 bits */ 
                               tcp.flags.ecn,    /*   2 bits */ 
    
   The block comment style allows comments to be nested (i.e. comments 
   inside comments are allowed).  For example: 
    
     /* Old version temporarily kept as a comment; delete when finalised 
      * 
      *replicate_common    ::=   discriminator,           /*   8 bits */ 
      *                          tcp.scaled_seq_number,   /*  22 bits */ 
      *                          tcp.seq_number_residue,  /*  10 bits */ 
      *                          tcp.flags.ecn,           /*   2 bits */ 
      */ 
    
    
   Readers familiar with the C, C++ or Java programming languages, in 
   which block comments do not nest, should take careful note of this 
   fact! 
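
   A tool that reads ROHC-FN therefore has to count nesting depth when 
   skipping block comments.  The following Python sketch shows one way 
   of removing both comment styles.  The function name is hypothetical, 
   and the treatment of a % character inside a block comment (it is 
   simply skipped) is an assumption.

```python
def strip_comments(text: str) -> str:
    """Remove % end-of-line comments and nested /* ... */ comments."""
    out = []
    depth = 0          # current nesting depth of block comments
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair == "/*":
            depth += 1
            i += 2
        elif pair == "*/" and depth > 0:
            depth -= 1
            i += 2
        elif depth > 0:
            i += 1                      # inside a block comment
        elif text[i] == "%":
            # Skip to the end of the line, keeping the newline itself.
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

   A single flag instead of the depth counter would end the comment at 
   the first */ and expose the rest of the commented-out text.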
 

 
4.3. Implementation structures 
                               
   The following section gives some information about the data that must 
   be stored by an implementation of a ROHC profile.  ROHC-FN assumes 
   that the data is available as a single structure, indexed by the name 
   of the relevant field.  Note, however, that provided the relevant 
   data is available, the exact way in which the data structure is 
   stored is up to the implementation itself. 
    
   ROHC-FN assumes that for each field to be compressed, the following 
   eight attributes are available: 
    
     uncomp, uncomp_start, uncomp_length, comp, comp_start, comp_length,  
     context, updated_context 
    
   The notation to access any of the attributes of a particular field 
   is the name of the attribute followed by the field name in brackets.  
   For example, 
    
     uncomp(tcp_ip.options.list_length) 
 
   gives the uncompressed value of the tcp_ip.options.list_length field.  
   Each of the attributes is explained in more detail below. 
    
4.3.1. The uncomp attribute 
                            
   The uncomp attribute contains the uncompressed value of the field.  
   This can be either the value of a field from the uncompressed header 
   or the uncompressed value of a control field.  All fields have an 
   uncomp attribute. 
    
4.3.2. The uncomp_start attribute 
                                  
   The uncomp_start attribute contains the position in the header that 
   the uncompressed field starts at, specified in bits.  Control fields 
   do not make use of this attribute. 
    
4.3.3. The uncomp_length attribute 
                                   
   The uncomp_length attribute contains the length of the uncompressed 
   field, specified in bits.  All fields have an uncomp_length 
   attribute. 
    
4.3.4. The comp attribute 
                          
   The comp attribute contains the compressed value of the field, i.e. 
   the value of the field as it appears in the compressed header.  Note 
   that this attribute is not used by every field, since some fields do 
   not appear in the compressed header at all.  


 
    
4.3.5. The comp_start attribute 
    
   The comp_start attribute contains the position in the compressed 
   header that the field starts at, specified in bits.  All fields 
   which appear in the compressed header make use of this attribute.  
    
4.3.6. The comp_length attribute 
                                 
   The comp_length attribute contains the length of the compressed 
   field, specified in bits.  All fields which appear in the 
   compressed header make use of this attribute. 
    
4.3.7. The context attribute 
                             
   The purpose of the context attribute is to allow inter-packet 
   compression.  An analogy can be found in MPEG video compression.  
   Reasonable video compression can be achieved simply by treating each 
   frame as a still image and compressing it e.g. using JPEG.  However 
   MPEG compression takes advantage of the fact that successive frames 
   of video can be compressed more efficiently by taking into account 
   the similarities between them.  Similarly, the context attribute 
   contains information about the previous value of the field.  Note 
   that there will be no context for the first packet in a stream. 
    
   The context attribute is actually key to efficient compression, since 
   the behaviour of one header is very often related to the behaviour of 
   previous headers in a flow.  For example, the RTP Sequence Number 
   field increases by 1 for each consecutive header in an RTP stream. 
    
   ROHC profiles take into account the dependency between successive 
   headers by storing and referencing the context attribute.  However, 
   whilst it is possible to do this explicitly, most of the time the 
   context is referenced implicitly by the encoding methods. 
    
   An implementation of ROHC-FN should allow encoding methods to read 
   values from the context, and should be able to update the context 
   with the new field values from the current header (or some other 
   value if that is appropriate). 
    
   All fields make use of the context attribute. 
    
4.3.8. The updated_context attribute 
                                     
   The updated_context attribute contains the value that the context 
   attribute will take for the next header.  The state machine for a 
   ROHC profile defines a specific point at which the context is 
   updated: at this point the updated_context attribute should be copied 
   into the context attribute. 
    
   All fields make use of the updated_context attribute. 
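
   One possible in-memory layout for the eight attributes is sketched 
   below in Python.  ROHC-FN does not mandate any particular storage, 
   so the class names, the use of None for an attribute that a field 
   does not use, and the commit_context method are all assumptions 
   made for the illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FieldAttributes:
    """The eight attributes ROHC-FN assumes for every field."""
    uncomp: Optional[int] = None
    uncomp_start: Optional[int] = None    # unused by control fields
    uncomp_length: Optional[int] = None
    comp: Optional[int] = None            # unused if not sent
    comp_start: Optional[int] = None
    comp_length: Optional[int] = None
    context: Optional[int] = None         # None for the first packet
    updated_context: Optional[int] = None


class FieldStore:
    """A single structure indexed by field name."""

    def __init__(self):
        self._fields: Dict[str, FieldAttributes] = {}

    def field(self, name: str) -> FieldAttributes:
        # Create the record on first access.
        return self._fields.setdefault(name, FieldAttributes())

    def commit_context(self) -> None:
        # Called at the point chosen by the profile's state machine:
        # updated_context becomes the context for the next header.
        for attrs in self._fields.values():
            attrs.context = attrs.updated_context
```

   In this layout, store.field("tcp_ip.options.list_length").uncomp 
   plays the role of uncomp(tcp_ip.options.list_length).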
 
 
 
    
5. Encoding methods 
    
   The ROHC [RFC-3095] standard contains a number of different 
   techniques for compressing header fields (LSB encoding, value 
   encoding, list-based compression etc.).  Each of these techniques can 
   be added to the ROHC-FN library so that they can be reused when 
   creating new ROHC profiles. 
    
   The following encoding methods are all defined in English; a formal 
   Prolog definition for each is given in the next section.  The sub-
   section numbers are the same as those in the next section to make it 
   straightforward to refer from English to Prolog and vice versa, 
   without cluttering up the English definitions with Prolog. 
    
5.1. Basic encoding methods 
 
   This section defines the simplest set of encoding methods.  All these 
   encoding methods are self-contained in that they do not need to refer 
   to other fields. 
    
5.1.1. Value 
             
   The value encoding method is used to encode header fields which 
   always have a fixed size and value.  E.g. the IPv6 header version 
   number is a four bit field that always has the value 6: 
    
     version             ::=   value(4, 6) 
    
   Since the value is fixed, it is omitted from the compressed header.  
   As with all omitted fields the author of a profile has the option of 
   notating a value encoded field as a zero bit field in the compressed 
   header field order list, if they so wish. 
    
5.1.2. Irregular 
                 
   The "irregular" encoding method leaves the field untouched.  The 
   field in the compressed packet will have an identical bit pattern to 
   the original field in the uncompressed packet.  E.g. 
    
     age_in_years        ::=   irregular(16) 
    
   Note that since the field divisions specified in the profile are 
   completely arbitrary, there is no reason not to take what is 
   specified as a single field in a header specification and break it 
   down into smaller fields. 
    
   Using this technique, fields which are only irregular in part can be 
   better compressed.  E.g. if the above field was the age in years of 
   the human who originated the packet, and if we knew from the protocol 
   definition that the field would never have a value greater than 123, 
   we would know that the most significant bits would always be zero, so 
   we might encode it as follows: 
    
     age_in_years_part_1 ::=   value(9,0), 
     age_in_years_part_2 ::=   irregular(7) 
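
   The effect of this split can be sketched in Python.  The function 
   names and the use of a '0'/'1' string for the compressed bits are 
   assumptions made for the illustration.

```python
def compress_age(age: int) -> str:
    """Send only the 7 low-order bits; the top 9 bits are value(9,0)."""
    if age >> 7:
        raise ValueError("top 9 bits are not zero; value(9,0) fails")
    return format(age, "07b")


def decompress_age(bits: str) -> int:
    """Rebuild the 16-bit field: nine known zero bits, then 7 sent bits."""
    return int("0" * 9 + bits, 2)
```

   The 16-bit field is thus carried in 7 bits, since any value up to 
   123 fits below 2^7 = 128.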
 
5.1.3. Static 
              
   The "static" encoding method compresses a field whose length and 
   value are the same as in the previous header in the flow.  E.g. 
    
     src_port            ::=   static 
    
   Since the field value is the same as the previous field value, the 
   entire field can be reconstructed from the context, so it is 
   compressed to zero bits and does not appear in the compressed header. 
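
   A minimal Python sketch of this behaviour follows.  The function 
   names are hypothetical, and raising an error when the encoding does 
   not apply is an assumption about how an implementation might signal 
   failure.

```python
def static_compress(value, context):
    """Return the compressed bits for a static field: always none."""
    if context is None or value != context:
        raise ValueError("static encoding does not apply")
    return ""          # zero bits appear in the compressed header


def static_decompress(context):
    """The whole field is reconstructed from the context."""
    return context
```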
    
5.1.4. LSB 
           
   The "lsb" encoding method compresses a field whose value differs by a 
   small amount from the value stored in the context.  E.g.  
    
     msn                 ::=   lsb(2,0) 
    
   The "lsb(k, p)" encoding method can compress a field f whose value 
   lies between (context(f) - p) and (context(f) - p + 2^k - 1) 
   inclusive.  In particular, if p = 0 then the field value can only 
   stay the same or increase relative to the previous header in the 
   flow.  If p = -1 then it can only increase, whereas if p = 2^k then 
   it can only decrease. 
    
   The compressed field takes up the specified number of bits in the 
   compressed header.  See the ROHC [RFC-3095] standard for a full 
   definition of LSB encoding. 
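
   The interval rule for "lsb(k, p)" can be sketched in Python as 
   follows.  This is an illustration rather than the normative 
   definition: the function names are hypothetical, and wrap-around at 
   the limits of the field width is ignored.

```python
def lsb_encode(value: int, context: int, k: int, p: int) -> int:
    """Return the k least significant bits of value."""
    low = context - p
    high = low + (1 << k) - 1
    if not low <= value <= high:
        raise ValueError("value outside the interval for lsb(k, p)")
    return value & ((1 << k) - 1)


def lsb_decode(bits: int, context: int, k: int, p: int) -> int:
    """Return the unique value in the interval whose k LSBs match."""
    low = context - p
    mask = (1 << k) - 1
    # Distance from the bottom of the interval, modulo 2^k.
    return low + ((bits - low) & mask)
```

   For lsb(2,0) with a context value of 5, the interval is 5 to 8: the 
   value 7 is sent as the two bits 11 and decoded back to 7.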
    
5.1.5. Index 
             
   The "index" encoding method compresses a field whose value is one of 
   a list of possible values.  E.g. 
    
     id_flags            ::=   index(1, ['11111000':'10001111']) 
    
   The index encoding method takes two parameters.  The first is the 
   number of bits used to encode the index.  The second is the list of 
   possible values the field can take.  For "index(n, the_list)", 
   the_list can contain up to 2^n items. 
    
   The compressed packet contains the index of the value to be 
   compressed.  The leftmost item in the list has an index of 0, the 
   next item an index of 1, and so on. 
    

 
   The compressed field takes up the specified number of bits in the 
   compressed header. 
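
   A Python sketch of "index(n, the_list)" is given below.  The 
   function names are hypothetical, and field values are held as 
   strings of bits only for readability.

```python
def index_encode(value, n, the_list):
    """Return the position of value in the_list, to be sent in n bits."""
    if len(the_list) > (1 << n):
        raise ValueError("the_list has more than 2^n items")
    return the_list.index(value)    # ValueError if value is not listed


def index_decode(index, the_list):
    """Look the value up again at the decompressor."""
    return the_list[index]
```

   With the two-item list from the id_flags example, the second value 
   is sent as the single bit 1.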
     
    
5.2. Relative Field Encoding Methods 
                                     
   The encoding methods in this section are all able to encode a field 
   whose value can be inferred from the value of another field or 
   fields. 
    
   Fields can be referred to outside of the scope they are defined in, 
   by using the '.' scoping notation.  So for example, to refer to 
   field_1 from outside the scope of the test_single_format header 
   (where it is defined), use 'test_single_format.field_1'.  The same 
   scoping mechanism can be used for subfields within fields. 
    
 
5.2.1. Same As 
    
   The "same_as" encoding method is used for fields that are always 
   identical to another field.  Although a header rarely contains two 
   identical fields, "same_as" is also useful for supplying the 
   subfields of encoding methods that need to refer to other fields.  
   For example: 
    
     count               ::=   inferred_offset(4), 
     { 
       base_field        ::=   same_as(test_offset.id), 
       offset            ::=   value(4,3) 
     } 
    
   Since the same_as encoding method gets the entire value of the field 
   from another field, it takes up zero bits in the compressed header. 
    
5.2.2. Group 
             
   The "group" encoding method is used to group two or more 
   noncontiguous uncompressed fields together, so that they can be 
   treated as a single field for compression.  This encoding method 
   takes a single argument, which is the list of fields to be joined 
   together. This argument is specified as a subfield. For example: 
    
     ecn_and_reserved         ::=   group, 
     { 
       field_list             ::=   ip.ecn, 
                                    tcp.ecn, 
                                    tcp.reserved 
     } 
    

   Since the group encoding method gets the entire value of the field 
   from the fields that it is composed of, it takes up zero bits in the 
   compressed header. 
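
   A minimal Python sketch of grouping is given below.  It assumes that
   the fields are joined in the order in which they appear in
   field_list, which the text above does not state explicitly; the
   function names and the sample field values are hypothetical.

```python
def group_fields(uncompressed: dict, field_list: list) -> str:
    """Join the bit strings of the named fields into a single field."""
    return "".join(uncompressed[name] for name in field_list)


def split_group(bits: str, field_list: list, lengths: dict) -> dict:
    """Inverse operation: cut the grouped bits back into the fields."""
    fields = {}
    position = 0
    for name in field_list:
        fields[name] = bits[position:position + lengths[name]]
        position += lengths[name]
    if position != len(bits):
        raise ValueError("grouped field has unexpected length")
    return fields
```

   Grouping 'ip.ecn', 'tcp.ecn' and 'tcp.reserved' in this way yields
   one field that can then be handed to a single encoding method.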
    
5.2.3. Expression 
                  
   This encoding method is used when the uncompressed value of the 
   field is defined by a mathematical expression.  The expression can be 
   made up of any of the following components: 
    
   Integers              Integers can be expressed as decimal values, 
                         binary values (prefixed by 0b), or hex values 
                         (prefixed by 0x).  Negative integers are 
                         prefixed by a "-" sign. 
 
   Operators             The operators +, -, *, / and ^ are available, 
                         along with ( and ) for grouping.  Note that 
                         k / v is undefined if k is not an integer 
                         multiple of v (i.e. if it does not evaluate to 
                         an integer).   
                         The precedence for each of the operators, along 
                         with parentheses is given below (higher 
                         precedence first): 
                         (, ) 
                         ^ 
                         *, / 
                         +, - 
    
   floor (k, v)          Returns k / v rounded down to the nearest 
                         integer (undefined for v == 0). 
    
   mod (k, v)            Returns k - v * floor(k, v). 
    
   log2 (v)              Returns the smallest integer k where v <= 2^k,  
                         i.e. it returns the smallest number of bits in 
                         which value v can be stored. 
    
    
   The expression may refer to any of the attributes in the data 
   structure stored for each field (see above), but the following 
   attributes are most likely to be useful: 
    
   uncomp        - the uncompressed value of the field,  
   uncomp_length - the length of the uncompressed field in bits, 
   comp          - the compressed value of the field,  
   comp_length   - the length of the compressed field in bits. 
    
   To access any of the attributes for a particular field, write the 
   name of the attribute, followed by the field name in brackets.  E.g. 
    
     uncomp(tcp_ip.options.list_length) 
 
 
   This retrieves the uncompressed value of the list length of the TCP 
   options list.  Note that if any of the attributes used in the 
   expression is undefined, the value of the expression is undefined.  
   Here is a complete example of expression encoding, which uses the 
   above attribute: 
    
     data_offset     ::= expression((uncomp(tcp_ip.options.list_length) 
                                     + 160) / 32) 
    
   Since the value of an expression encoded field is constructed 
   entirely from the expression, it takes up zero bits in the compressed 
   header.  
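
   The arithmetic rules given above can be sketched in Python as
   follows.  The function names exact_div and data_offset are
   hypothetical, and log2 is restricted here to v >= 1 (an assumption,
   since the definition above gives no smallest k for other values).

```python
def exact_div(k: int, v: int) -> int:
    """The '/' operator: defined only if k is an integer multiple of v."""
    if v == 0 or k % v != 0:
        raise ValueError("k / v is undefined")
    return k // v


def floor(k: int, v: int) -> int:
    """k / v rounded down to the nearest integer (undefined for v == 0)."""
    if v == 0:
        raise ValueError("floor(k, 0) is undefined")
    return k // v  # Python's // already rounds towards minus infinity


def mod(k: int, v: int) -> int:
    """k - v * floor(k, v), as defined in the text."""
    return k - v * floor(k, v)


def log2(v: int) -> int:
    """Smallest integer k >= 0 such that v <= 2**k."""
    if v < 1:
        raise ValueError("log2 is only sketched for v >= 1")
    k = 0
    while 2 ** k < v:
        k += 1
    return k


def data_offset(list_length_bits: int) -> int:
    """The data_offset expression: (list_length + 160) / 32."""
    return exact_div(list_length_bits + 160, 32)
```

   For a TCP header without options the list length is 0 bits and the
   expression evaluates to 5; a list length that is not a multiple of
   32 bits leaves the expression undefined.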
    
5.2.4. Constant 
 
   "Constant" encoding works in the same manner as expression, but the 
   expression must yield a value that is constant for all headers. 
    
    
5.2.5. Derived value 
 
   The "derived_value" encoding method is similar to the value encoding 
   method, except that the length and field value do not have to be 
   constant, since they are specified as subfields rather than as 
   inline parameters.  For example: 
    
     tcp.seq_number  ::=  derived_value, 
     { 
       field_length  ::=  constant(8), 
       field_value   ::=  expression(uncomp(tcp.seq_number.residue) + 
                                     (uncomp(tcp.seq_number.scaled) * 
                                      uncomp(tcp.payload_size))) 
     }  
    
   If constant encoding is used for both fields, the encoding method is 
   identical to value encoding. For example, 
    
     field_1         ::= value(4, 11) 
    
   has identical meaning to: 
    
    
     field_1         ::=  derived_value, 
     { 
       field_length  ::=  constant(4), 
       field_value   ::=  constant(11) 
     }  
    
    
   The number of bits that derived_value encoding takes up in the 
   compressed header depends on the encoding methods used for the length 
   and value. The above examples would both take up zero bits in the 
   header since the constant and expression encoding methods both take 
   up zero bits in the compressed header. If both length and value 
   encoding methods take up bits in the compressed header, the length 
   encoding is done first, followed by the value encoding. 
    
5.2.6. Inferred_translate 
                          
   TBD. 
    
5.2.7. Inferred_size 
                     
   The "inferred_size" encoding method infers the value of a field from 
   the total amount of remaining data in the header. 
    
   The first parameter specifies the length of the uncompressed field in 
   bits, and the second parameter specifies an offset that will be
   subtracted from the total data length when deriving the value of the 
   field.  E.g. 
 
     size_field          ::=   inferred_size(4, -8) 
    
   Since the value of the field is only dependent on the size of the 
   data, which is known, the encoded field is zero bits long. 
    
5.2.8. Inferred_offset 
                       
   The "inferred_offset" encoding method compresses a field that takes a 
   known offset relative to a certain base value.  In typical usage the 
   base value will be specified as the value of another field, although 
   any value can be specified. 
    
   The method has three parameters.  The first parameter, length, 
   defines the length of the field in bits.  An offset of up to 
   (2^length - 1) can be specified from the base value.  The length 
   parameter is specified in parentheses in the normal way.  Offset 
   addition is done modulo 2^length, so negative offsets are possible. 
    
   The second parameter specifies the base value, along with how to 
   encode that value in the compressed header.  The third parameter 
   specifies the offset from the base value, along with the encoding 
   method for that.  Because these two parameters allow for the 
   specification of encoding methods, they are specified using 
   subfields, rather than as regular parameters.  E.g. 
 
     id              ::=   inferred_offset(16), 
     { 
       base_field    ::=   same_as(msn), 
       offset        ::=   static 
     } 
    
   This says that the id field is 16 bits long, and has a static offset 
   from the value of the MSN.  The exact number of bits it takes to 
   encode an inferred offset field depends on the encoding methods used 
   for the base_field and offset.  The above example takes zero bits, 
   since both the same_as and static encoding methods compress down to 
   zero bits. 
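
   The modulo arithmetic used by inferred_offset can be sketched in
   Python as follows.  The function names are hypothetical; field
   values are treated as unsigned integers.

```python
def infer_offset(field_value: int, base_value: int, length: int) -> int:
    """Compressor side: offset of the field from the base value."""
    return (field_value - base_value) % (2 ** length)


def apply_offset(base_value: int, offset: int, length: int) -> int:
    """Decompressor side: rebuild the field from base value and offset."""
    return (base_value + offset) % (2 ** length)
```

   For a 16-bit id of 5 and an MSN of 65534, the offset is 7, and
   adding 7 to the MSN modulo 2^16 recovers the id.  An offset of
   2^16 - 1 behaves as an offset of -1.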
    
5.2.9. Inferred_sequence 
                         
   TBD 
    
5.2.10. Inferred_ip_checksum 
                             
   The "inferred_ip_checksum" encoding method is a very specific 
   encoding method used to compress the IP checksum field.  It should 
   only be used for that purpose: 
 
     checksum        ::=   inferred_ip_checksum 
    
   Since the checksum can be constructed solely from the other fields in 
   the header, zero bits are sent for this encoding. 
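
   For reference, the standard IPv4 header checksum (the ones'
   complement of the ones' complement sum of the 16-bit words of the
   header, with the checksum field itself excluded) can be sketched in
   Python as follows.  The function name is hypothetical, and the
   sketch accepts any header with an even number of octets.

```python
def ip_header_checksum(header: bytes) -> int:
    """Compute the IPv4 header checksum, ignoring octets 10 and 11."""
    if len(header) < 20 or len(header) % 2 != 0:
        raise ValueError("header must be at least 20 octets, even length")
    total = 0
    for i in range(0, len(header), 2):
        if i == 10:
            continue  # skip the checksum field itself (bits 80 to 95)
        total += (header[i] << 8) | header[i + 1]
    # Fold any carries back into the low 16 bits.
    while total > 0xFFFF:
        total = (total & 0xFFFF) + (total >> 16)
    return 0xFFFF - total
```

   Because the checksum field is skipped, the function returns the same
   result whether it is given the original header or a header in which
   the checksum has not yet been filled in.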
    
5.3. Control field encoding methods 
                                    
   This section provides a selection of encoding methods for handling 
   control fields, i.e. fields which appear in the compressed header to 
   control the compression in some way and do not appear in the 
   uncompressed header at all.  
    
5.3.1. Literal Discriminator 
                             
   The literal_discriminator encoding method writes a literal bit string 
   into the compressed header.  It is one of two discriminator encoding 
   methods intended to be used in conjunction with the 
   multiple_packet_formats encoding method, which allows for more than 
   one method of compression for a given header.  The 
   literal_discriminator encoding method allows the unique bit pattern 
   to be specified, in binary, which identifies the particular method of 
   compression that has been used.  The syntax for the 
   literal_discriminator encoding method is unusual - the discriminator 
   is simply specified in between two single quote marks.  For example: 
    
     discriminator     ::=   '011' 
    
   The discriminator is added into the compressed header as is, so it 
   takes up however many bits are in the given literal bit pattern. 
 
5.3.2. Control field 
                     
 
   This encoding method is used for fields that need to be sent in the 
   compressed header, but which do not appear in the uncompressed header 
   at all.  It takes two parameters: base_field, which is the field the 
   control field is based on, and compressed_method, which specifies the 
   method used to encode the given field.  E.g. 
    
     order_data           ::=   control_field, 
     {
       base_field         ::=   same_as(test_list.list_of_fields.order),
       compressed_method  ::=   irregular(1)
     } 
    
   The exact encoding of a control field, and the number of bits it 
   takes up are determined by the encoding method used by 
   compressed_method. 
    
5.3.3. Self-describing values 
                              
   TBD 
    
5.3.4. Network Byte Order 
                          
   TBD 
    
5.3.5. Scale 
             
   TBD 
    
5.3.6. CRC 
           
   The "CRC" encoding method provides a CRC calculated across the 
   original uncompressed header.  The size of the CRC can be altered 
   depending on the characteristics of the link over which the protocol 
   is to be transmitted.  A sufficiently long CRC should be provided to 
   ensure the probability that an unexpected error will be missed is 
   negligible.  E.g. a 3-bit CRC: 
    
     crc_field   ::=   crc (3) 
    
   The CRC algorithm will be described here in a later version of this 
   document. 
    
5.3.7. Optional field 
                      
   The "optional_field" encoding method allows for fields that may or 
   may not be present in the header.  This encoding method takes two 
   arguments, condition, which controls whether the field is present, 
   and field_val, which describes how to encode the value of the field 
   when it is present.  E.g. 
    
    
     extension_bits   ::=  optional_field, 
     { 
       condition      ::=  same_as(message_extended), 
       field_val      ::=  lsb(4, 0) 
     } 
    
   The condition is considered to be "false" if it evaluates to zero, 
   and "true" otherwise. 
    
   Note that the condition must not depend on a field which occurs later 
   in the packet than the optional field, otherwise the decompressor 
   will not be able to check the condition at the point when it needs to 
   know whether to include the optional field or not. 
    
   The exact length of the field in the compressed header depends on the 
   encoding methods used for "condition" and "field_val", and for a 
   particular packet of course it depends on whether the condition is 
   true or not. 
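
   The decompressor's handling of an optional field can be sketched in
   Python as follows.  To keep the sketch short it assumes that the
   condition has already been evaluated and that the field value is
   carried as a fixed number of bits; the function name and parameters
   are hypothetical.

```python
def read_optional(condition: int, bits: str, width: int):
    """Return (field_bits or None, remaining compressed bits)."""
    if condition == 0:
        return None, bits  # "false": the field is absent
    if len(bits) < width:
        raise ValueError("compressed header too short")
    return bits[:width], bits[width:]
```

   A condition of zero leaves the compressed bits untouched, whereas
   any non-zero condition consumes the field from the front of them.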
    
5.4. Packet format encoding methods 
                                    
   This section details encoding methods used to encode whole headers.  
   All the encoding methods described above are designed to encode 
   single fields within headers; the packet format encoding methods 
   build the individual fields up into packets.  Each one contains a 
   list of fields and corresponding encoding methods, by which the 
   whole packet can be encoded. 
    
5.4.1. Single packet format 
 
   The "single_packet_format" encoding method specifies a single fixed 
   encoding for a given kind of header.  This is the simplest packet 
   encoding method.  E.g. 
    

     test_single_format    ::=   single_packet_format, 
     { 
       uncompressed_data   ::=   field_1   : 4 bits 
                                 field_2   : 4 bits 
    
       compressed_data     ::=   field_2   : 0 bits 
                                 field_1   : 4 bits 
    
       field_1             ::=   irregular(4), 
       field_2             ::=   value(4, 9) 
     } 
    
   This specifies the order (and length) of the fields in the 
   uncompressed header, followed by the order (and length) of the fields 
   in the compressed header, followed by a list of encoding techniques 
   for each field. 
    
   The compressed data will appear in the order specified by the field 
   order list "compressed_data", with each individual field being 
   encoded in the manner given for that field.  Consequently the length 
   of the compressed data will be the total of the lengths of all the 
   individual fields.  The above example would encode field_2 first 
   (zero bits long), followed by field_1 (four bits long), giving a 
   total length of four bits. 
    
   Note that the order of the fields specified in compressed_data does 
   not have to match the order they appear in the uncompressed_data.  
   Fields of zero bits length may be omitted from the field order list, 
   since their position in the list is not significant.  So, without 
   changing the meaning, we could have written the above as: 
    
     test_single_format    ::=   single_packet_format, 
     { 
       uncompressed_data   ::=   field_1,   % 4 bits 
                                 field_2,   % 4 bits 
    
       compressed_data     ::=   field_1,   % 4 bits 
    
       field_1             ::=   irregular(4), 
       field_2             ::=   value(4, 9) 
     } 
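
   The behaviour of test_single_format can be sketched in Python as
   follows, with headers held as bit strings.  The function names are
   hypothetical and the sketch hard-codes the two field encodings from
   the example rather than interpreting the notation.

```python
FIELD_2_VALUE = "1001"  # value(4, 9): field_2 is always 9


def compress_single(header: str) -> str:
    """Compress the 8-bit uncompressed header to 4 bits."""
    if len(header) != 8:
        raise ValueError("uncompressed header must be 8 bits")
    field_1, field_2 = header[:4], header[4:]
    if field_2 != FIELD_2_VALUE:
        raise ValueError("field_2 does not match value(4, 9)")
    # field_2 contributes zero bits; irregular(4) sends field_1 as is.
    return field_1


def decompress_single(compressed: str) -> str:
    """Rebuild the 8-bit header in uncompressed_data order."""
    if len(compressed) != 4:
        raise ValueError("compressed header must be 4 bits")
    return compressed + FIELD_2_VALUE
```

   A header whose field_2 is not 9 cannot be compressed with this
   format at all, which is one motivation for offering alternative
   formats.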
    
    
5.4.2. Multiple packet formats 
                               
   This encoding method allows multiple encodings for a given header.  
   This allows different compression techniques to be used at different 
   times, depending on what is the most efficient way of compressing a 
   particular header. 
    

   For example a field may have a fixed value most of the time, but very 
   occasionally the fixed value may change.  Using single_packet_format, 
   this field would have to be encoded as irregular, even though the 
   value only changes rarely.  Using multiple_packet_formats however we 
   can provide two alternative encodings, one for when the value remains 
   fixed and another for when the value changes. 
    
   The encoding method is notated in a similar way to the 
   single_packet_format encoding method; there are however a number of 
   differences.  Firstly, it is necessary to specify the number of 
   alternative packet formats that are defined, which is done via the 
   co_format_count field.  This is a control field in that it doesn't 
   appear in the uncompressed header.  Typically it will be encoded as a 
   constant and so won't take up any bits in the compressed header 
   either, for example: 
    
     co_format_count     ::=   constant(2), 
    
   Secondly, the field names are different.  "uncompressed_data" becomes 
   "uncompressed_format", and "compressed_data" is split into several 
   fields, since whilst there is still only a single definition of the 
   uncompressed packet format, there are obviously several alternative 
   compressed packet formats.  These are defined via fields named 
   co_format_0, co_format_1, co_format_2 etc., each of which has a 
   separate set of field encodings associated with it.  In particular 
   each co_format must include a discriminator which uniquely identifies 
   that particular co_format. 
    
   The third difference is that the field encodings appear as subfields 
   of each compressed packet format.  This is necessary to make it 
   explicit which encoding methods are to be used for which compressed 
   packet format, for example: 
    
     co_format_0         ::=   discriminator, 
                               field_1, 
     { 
       discriminator     ::=   '0', 
       field_1           ::=   static 
     } 
    
   Note that the discriminator must always appear first in the field 
   order list, since the decompressor needs to know what packet format 
   it is dealing with before it can do anything else with the rest of 
   the packet. 
    
   Finally, default encoding methods can be specified for each field.  
   The default encoding methods specify the encoding method to use for a 
   field if a given co_format does not give an encoding method for that 
   field.  This prevents the same encoding method from having to be 
   spelt out for every co_format.  There is no need to specify a field 
   order list for the default encoding methods, since the field order is 
   specified individually for each co_format, so "..." can be given 
   instead.  For example: 
 
     default_methods     ::=   ... , 
     { 
       field_1           ::=   value(4,1), 
       field_2           ::=   value(4,2) 
     } 
    
   Note that the normal case will be for all default encodings to be 
   compressed to zero bits, in which case they are irrelevant to 
   compressed field order.  However if any default encodings are used 
   which compress to greater than zero bits, their position in the field 
   order list must be specified explicitly for each packet format. 
    
   Putting this all together, here is a complete example of multiple 
   packet formats: 
    
     test_packet_formats   ::=   multiple_packet_formats, 
     { 
       co_format_count     ::=   constant(2), 
    
       co_format_0         ::=   discriminator, 
                                 field_1, 
       { 
         discriminator     ::=   '0', 
         field_1           ::=   static 
       }, 
    
       co_format_1         ::=   discriminator, 
                                 field_1, 
       { 
         discriminator     ::=   '11', 
         field_1           ::=   irregular(4) 
       }, 
    
       uncompressed_format ::=   field_1, 
                                 field_2, 
    
       default_methods     ::=   ... , 
       { 
         field_2           ::=   value(4,2) 
       } 
     } 
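
   The behaviour of test_packet_formats can be sketched in Python as
   follows.  The function names are hypothetical, the context is
   modelled as a plain dictionary, and the compressor in this sketch
   always picks the shortest format that applies, although it is free
   to choose any valid one.

```python
FIELD_2_VALUE = "0010"  # default method value(4, 2)


def compress_multi(header: str, context: dict) -> str:
    """Choose co_format_0 if field_1 is unchanged, else co_format_1."""
    if len(header) != 8 or header[4:] != FIELD_2_VALUE:
        raise ValueError("header does not fit any packet format")
    field_1 = header[:4]
    if context.get("field_1") == field_1:
        return "0"  # co_format_0: discriminator only, field_1 is static
    context["field_1"] = field_1
    return "11" + field_1  # co_format_1: discriminator, irregular(4)


def decompress_multi(compressed: str, context: dict) -> str:
    """Read the discriminator first, then the rest of the format."""
    if compressed == "0":
        field_1 = context["field_1"]  # KeyError if no context exists yet
    elif compressed.startswith("11") and len(compressed) == 6:
        field_1 = compressed[2:]
        context["field_1"] = field_1
    else:
        raise ValueError("unknown discriminator")
    return field_1 + FIELD_2_VALUE
```

   The first header of a flow is sent in 6 bits using co_format_1; as
   long as field_1 keeps its value, every later header shrinks to the
   single discriminator bit of co_format_0.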
    
5.4.3. List of known length 
                            
   The "list_of_known_length" encoding method compresses a list of items 
   that do not necessarily occur in the same order for every header.  
   Example applications for "list" encoding include TCP options and TCP 
   SACK blocks. 
 
 
   The encoding method requires two subfields to be supplied: the 
   overall length of the list (in bits), and the items that can occur in 
   the list.  Each list item is a single field, which must also be 
   compressed by supplying a suitable encoding method. 
    
   The list_of_known_length encoding method allows the list items to 
   occur in any order in the uncompressed header.  Moreover, it is not 
   necessary for all of the list items to be present in every header.  
   Once the total list size (in bits) is reached, the 
   list_of_known_length encoding method stops compressing list items, 
   even if some of the items have not yet occurred in the list. 
    
   If there is more than one valid way of ordering the list items, then 
   the choice of which way to use is left to the compressor. 
    
   The set of list items that are present, and the order in which they 
   occur can change between successive headers.  When they change this 
   information must be sent to the decompressor, so that it knows which 
   fields to reconstruct and which order to place them in the 
   uncompressed header.  The list_of_known_length encoding method sends 
   the order and presence information to the decompressor by creating 
   two new control fields called "order" and "presence". 
    
   The order information starts with a 6 bit field, which specifies how 
   many entries there are in the order list. The rest of the order list 
   is a string of n entries, indicating the order of possible options, 
   where n is specified by the 6 bit field at the start of the order 
   information.  For a list which can contain N different types of list 
   item, the length of each entry in the list will be the minimum number 
   of bits required to represent the N different types of entry.  For 
   example, if there are between 5 and 8 different types of entry, then 
   each entry is 3 bits long. 
    
   The order list contains the indices of the list items in the order in 
   which they occurred.  The presence data is a list of 1-bit flags, one 
   per entry in the list, which are set to "1" to indicate that the list 
   item is present or "0" if not.  These flags occur in the order in 
   which the entries appear in the list encoding.  
    
   The complete description of list encoding, along with at least one 
   example, will be included in later versions of this document. 
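
   Pending the complete description, the layout of the order
   information as described above can be sketched in Python as follows.
   The function names are hypothetical, and the sketch covers only the
   order information, not the presence flags.

```python
def entry_width(num_types: int) -> int:
    """Minimum number of bits needed to tell num_types entries apart."""
    return max(num_types - 1, 0).bit_length()


def encode_order(order: list, num_types: int) -> str:
    """Encode a 6-bit entry count followed by one index per entry."""
    if len(order) > 63:
        raise ValueError("entry count does not fit in 6 bits")
    width = entry_width(num_types)
    bits = format(len(order), "06b")
    for index in order:
        if not 0 <= index < num_types:
            raise ValueError("index out of range")
        if width > 0:
            bits += format(index, "0{}b".format(width))
    return bits
```

   With 5 possible item types each entry is 3 bits wide, so an order
   list of three entries occupies 6 + 3 * 3 = 15 bits.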
 
    
5.5. Miscellaneous encoding methods 
                                    
   This section introduces some miscellaneous encoding methods that can 
   be used to compress fields in a protocol header. 
 
5.5.1. Uncompressible 
                      
 
   TBD 
    
5.5.2. No update 
                 
   TBD 
    
6. Prolog Definitions of Encoding Methods 
                                          
   This section contains the Prolog definitions of the encoding methods 
   described in English in the previous section.  The sub-section 
   numbers are the same as those in the previous section to make it 
   straightforward to refer from Prolog to English and vice versa. 
    
   Note that if the Prolog definitions given below are used in 
   conjunction with a profile to compress real data, all possible 
   encodings of each packet will be given by Prolog (if asked for).  If 
   more than one possible encoding is available for a given packet, a 
   particular implementation of a compressor is free to choose any 
   suitable encoding (not necessarily the most efficient).  Therefore, a 
   correct implementation of the decompressor needs to be able to handle 
   all the alternative encodings given. 
    
   The Prolog definitions given in this section rely on underpinning 
   Prolog routines, which are included in Appendix A. 
    
6.1. Basic encoding methods 
 
 
6.1.1. Value 
             
   value(NUM_BITS, V) :- 
     get_current_comp(NAME), 
     evaluate(NUM_BITS, V, PROPOSED_VALUE), 
     ( 
       doing(compression) -> 
         extract_bits(NAME, NUM_BITS, 0), 
         uncomp(NAME, VALUE), 
         PROPOSED_VALUE = VALUE, 
         comp(NAME, '') 
       ; 
       doing(decompression) -> 
         comp(NAME, ''), 
         uncomp(NAME, PROPOSED_VALUE)  
     ), 
     updated_context(NAME, PROPOSED_VALUE). 
    

6.1.2. Irregular 
                    
   irregular(NUM_BITS) :- 
     get_current_comp(NAME), 
     ( 
       doing(compression) -> 
         extract_bits(NAME, NUM_BITS, 0), 
         uncomp(NAME, VALUE), 
         comp(NAME, VALUE) 
       ; 
       doing(decompression) -> 
         extract_bits(NAME, NUM_BITS, 1),
         comp(NAME, VALUE), 
         uncomp(NAME, VALUE) 
     ), 
     updated_context(NAME, VALUE). 
    
6.1.3. Static 
              
   static :- 
     get_current_comp(NAME), 
     context(NAME, VALUE), 
     defined(VALUE), 
     atom_length(VALUE, N), 
     ( 
       doing(compression) -> 
         extract_bits(NAME, N, 0), 
         uncomp(NAME, V), VALUE = V, 
         comp(NAME, '') 
       ; 
       doing(decompression) -> 
         comp(NAME, ''), 
         uncomp(NAME, VALUE) 
     ), 
     updated_context(NAME, VALUE). 
    

6.1.4. LSB
           
   lsb(K, P) :- 
     get_current_comp(NAME), 
     context(NAME, CONTEXT_VALUE),
     defined(CONTEXT_VALUE), 
     atom_length(CONTEXT_VALUE, N), 
     evaluate(N, P - CONTEXT_VALUE, BASE), 
     ( 
       doing(compression) -> 
         extract_bits(NAME, N, 0), 
         uncomp(NAME, U_VALUE), 
         evaluate(N, (U_VALUE + BASE) mod 2^K, X), 
         evaluate(N, U_VALUE + BASE, Y), 
         X = Y, 
         evaluate(K, U_VALUE, C_VALUE), 
         comp(NAME, C_VALUE) 
       ; 
       doing(decompression) -> 
         extract_bits(NAME, K, 1), 
         comp(NAME, C_VALUE), 
         evaluate(N, (C_VALUE + BASE) mod 2^K, X), 
         evaluate(N, X - BASE, U_VALUE), 
         uncomp(NAME, U_VALUE) 
     ), 
     updated_context(NAME, U_VALUE). 
    
6.1.5. Index 
             
   TBD 
    
6.2. Relative Field Encoding Methods 
                                     
 
6.2.1. Same As 
    
   same_as(QFIELD_NAME) :- 
     get_current_field(CURRENT_NAME), 
     term_to_atom(QFIELD_NAME, ATOMISED_FIELD_NAME), 
     ( 
       doing(compression) -> 
         uncomp(ATOMISED_FIELD_NAME, UNCOMPRESSED_FIELD_VALUE), 
         uncomp(CURRENT_NAME, UNCOMPRESSED_FIELD_VALUE) 
       ; 
       doing(decompression) -> 
         uncomp(CURRENT_NAME, UNCOMPRESSED_FIELD_VALUE), 
         uncomp(ATOMISED_FIELD_NAME, UNCOMPRESSED_FIELD_VALUE) 
     ). 
    

6.2.2. Expression         
                  
   expression(EXPRESSION) :- 
     get_current_field(NAME),
     precision(NUM_BITS), 
     evaluate(NUM_BITS, EXPRESSION, VALUE), 
     uncomp(NAME, VALUE). 
    
6.2.3. Constant 
 
   constant(VALUE) :- 
     expression(VALUE). 
    
6.2.4. Choice 
              
   TBD 
    
6.2.5. Inferred_translate 
                          
   TBD. 
    
6.2.6. Inferred_size 
                     
   inferred_size(LENGTH, OFFSET) :- 
     get_current_field(NAME), 
     ( 
       uncomp('', WHOLE_HEADER), 
       atom_length(WHOLE_HEADER, HEADER_LENGTH), 
       precision(P), 
       evaluate(P, (HEADER_LENGTH - OFFSET) / 8, TEMP_FIELD_VALUE), 
       evaluate(LENGTH, TEMP_FIELD_VALUE, PROPOSED_FIELD_VALUE), 
       ( 
         doing(compression) -> 
           extract_bits(NAME, LENGTH, 0), 
           uncomp(NAME, FIELD_VALUE), 
           PROPOSED_FIELD_VALUE = FIELD_VALUE, 
           comp(NAME, '') 
         ; 
         doing(decompression) -> 
           comp(NAME, ''), 
           uncomp(NAME, PROPOSED_FIELD_VALUE) 
       ) -> 
         true 
       ; 
       doing(decompression) -> 
         comp(NAME, ''), 
         evaluate(LENGTH, 0, PROPOSED_FIELD_VALUE), 
         uncomp(NAME, PROPOSED_FIELD_VALUE) 
     ). 
    
6.2.7. Inferred_offset 
 
 
   inferred_offset(LENGTH) :- 
     get_current_field(NAME), 
     qualify_name('base_field', NAME, BASE_FIELD), 
     qualify_name('offset', NAME, OFFSET), 
     BASE_FIELD, 
     uncomp(BASE_FIELD, BASE_FIELD_VALUE), 
     ( 
      doing(compression) -> 
        extract_bits(NAME, LENGTH, 0), 
        uncomp(NAME, FIELD_VALUE), 
        evaluate(LENGTH, FIELD_VALUE - BASE_FIELD_VALUE, PADDED_OFFSET), 
        uncomp(NAME, PADDED_OFFSET), 
        uncomp_start(OFFSET, 0), 
        OFFSET, 
        uncomp(NAME, FIELD_VALUE) 
      ; 
      doing(decompression) -> 
        OFFSET, 
        uncomp_start(OFFSET, 0), 
        uncomp(OFFSET, OFFSET_VALUE), 
        evaluate(LENGTH, OFFSET_VALUE + BASE_FIELD_VALUE, 
                 PADDED_FIELD_VALUE), 
        uncomp(NAME, PADDED_FIELD_VALUE) 
     ). 
    
6.2.8. Inferred_sequence 
                         
   TBD 
    

6.2.9. Inferred_ip_checksum 
                            
   inferred_ip_checksum :- 
     get_current_field(NAME), 
     ( 
       qualify_name(_, HEADER, NAME), 
       uncomp(HEADER, HDR_BITS), 
       sub_atom(HDR_BITS, 0, 80, _, PRE), 
       sub_atom(HDR_BITS, 96, 64, _, POST) -> 
         eval16(PRE, VALUE1), 
         eval16(POST, VALUE2), 
         sum16(VALUE1, VALUE2, TEMP), 
         evaluate(16, 65535 - TEMP, CHECKSUM), 
         ( 
           doing(compression) -> 
             sub_atom(HDR_BITS, 80, 16, _, CHECKSUM), 
             uncomp(NAME, CHECKSUM), 
             comp(NAME, '') 
           ; 
           doing(decompression) -> 
             comp(NAME, ''), 
             uncomp(NAME, CHECKSUM) 
         ) 
       ; 
       doing(decompression) -> 
         comp(NAME, ''), 
         evaluate(16, 0, CHECKSUM), 
         uncomp(NAME, CHECKSUM) 
     ). 
    
6.3. Control field encoding methods 
                                    
6.3.1. Discriminator 
                     
   discriminator(DISCRIMINATOR) :- 
     get_current_field(NAME), 
     atom_length(DISCRIMINATOR, NUM_BITS), 
     ( 
       doing(compression) -> 
         uncomp(NAME, ''), 
         comp(NAME, DISCRIMINATOR) 
       ; 
       doing(decompression) -> 
         extract_bits(NAME, NUM_BITS, 1), 
         comp(NAME, D1), DISCRIMINATOR = D1, 
         uncomp(NAME, '') 
     ). 
    

6.3.2. Control field 
                     
   control_field :-
     get_current_field(NAME), 
     qualify_name(base_field, NAME, BASE_NAME), 
     qualify_name(compressed_method, NAME, Q_NAME), 
     qualify_name(_, QUALIFIER, NAME), 
     ( 
       doing(compression) -> 
         BASE_NAME, 
         uncomp(BASE_NAME, VALUE), 
         uncomp(NAME, VALUE), 
         uncomp_start(Q_NAME, 0), 
         Q_NAME, 
         uncomp(Q_NAME, UNCOMP_VALUE), 
         VALUE = UNCOMP_VALUE, 
         comp(Q_NAME, COMP_VALUE), 
         comp(NAME, COMP_VALUE) 
       ; 
       doing(decompression) -> 
         comp(QUALIFIER, VALUE), 
         comp(NAME, VALUE), 
         comp_start(NAME, START), 
         comp_start(Q_NAME, START), 
         Q_NAME, 
         comp(Q_NAME, COMP_VALUE), 
         comp(NAME, COMP_VALUE), 
         uncomp(Q_NAME, UNCOMP_VALUE), 
         uncomp(NAME, UNCOMP_VALUE), 
         uncomp(BASE_NAME, UNCOMP_VALUE), 
         BASE_NAME 
     ). 
    
6.3.3. Self-describing values 
                              
   TBD 
    
6.3.4. Network Byte Order 
                          
   TBD 
    
6.3.5. Scale 
             
   TBD 
    

6.3.6. CRC 
           
   crc(N) :- 
     evaluate(N, 0, V),         
     discriminator(V). 
   % Placeholder; the correct definition will follow in a later version 
    
6.3.7. Optional field 
                      
   TBD 
    
6.4. Packet format encoding methods 
                                    
6.4.1. Single packet format 
 
   single_packet_format :- 
     get_current_field(NAME), 
     qualify_name('uncompressed_data', NAME, U_NAME), 
     qualify_name('compressed_data', NAME, C_NAME), 
     chosen_packet_format(NAME, U_NAME, C_NAME, no_common_format). 
    
6.4.2. Compressed packet formats 
                                 
   multiple_packet_formats :- 
     get_current_field(NAME), 
     ( 
       PREFIX = co -> 
         true 
       ; 
       PREFIX = replicate 
     ), 
     qualify_name('uncompressed_format', NAME, U_NAME), 
     qualify_name('chosen_format', NAME, CHOSEN_FORMAT), 
     qualify_name('chosen_common', NAME, CHOSEN_COMMON), 
     atom_concat(PREFIX, '_formats', FORMATS), 
     qualify_name(FORMATS, NAME, Q_FORMATS), 
     Q_FORMATS, 
     uncomp(Q_FORMATS, NUM_FORMATS), 
     format_name(NUM_FORMATS, FORMAT_NUM_ATOM), 
     concat_atom([PREFIX, '_format_', FORMAT_NUM_ATOM], 
                 UNQUALIFIED_C_NAME), 
     atom_concat(PREFIX, '_common', UNQUALIFIED_COMMON_NAME), 
     qualify_name(UNQUALIFIED_C_NAME, NAME, C_NAME), 
     qualify_name(UNQUALIFIED_COMMON_NAME, NAME, COMMON_NAME), 
     comp(CHOSEN_FORMAT, C_NAME), 
     comp(CHOSEN_COMMON, COMMON_NAME), 
     chosen_packet_format(NAME, U_NAME, C_NAME, COMMON_NAME). 
    

6.4.3. List of known length
                            
   list_of_known_length :- 
     get_current_field(NAME),
     qualify_name('list_length', NAME, LIST_LENGTH), 
     qualify_name('list_items', NAME, LIST_ITEMS), 
     qualify_name(_, QUALIFIER, NAME), 
     ( 
       doing(compression) -> 
         uncomp(QUALIFIER, HEADER), 
         uncomp_start(NAME, START), 
         sub_atom(HEADER, START, _, 0, SUB_HEADER), 
         uncomp(NAME, SUB_HEADER), 
         LIST_LENGTH, 
         LIST_ITEMS, 
         uncomp(LIST_ITEMS, UNCOMP_HEADER), 
         atom_prefix(SUB_HEADER, UNCOMP_HEADER), 
         uncomp(NAME, UNCOMP_HEADER), 
         comp(LIST_ITEMS, COMP_HEADER), 
         comp(NAME, COMP_HEADER) 
       ; 
       doing(decompression) -> 
         comp(QUALIFIER, HEADER), 
         comp_start(NAME, START), 
         sub_atom(HEADER, START, _, 0, SUB_HEADER), 
         comp(NAME, SUB_HEADER), 
         LIST_LENGTH, 
         LIST_ITEMS, 
         comp(LIST_ITEMS, COMP_HEADER), 
         atom_prefix(SUB_HEADER, COMP_HEADER), 
         comp(NAME, COMP_HEADER), 
         uncomp(LIST_ITEMS, UNCOMP_HEADER), 
         uncomp(NAME, UNCOMP_HEADER) 
     ). 
    
6.4.4. List_n 
              
   TBD 
    
    
6.5. Miscellaneous encoding methods 
                                    
6.5.1. Uncompressible 
                      
   TBD 
    
6.5.2. No update 
                 
   TBD 
 

7. Bit level worked example                                                               
                            
   This section gives a worked example at the bit level, showing how a
   simple profile describes the compression of real data from an 
   imaginary packet format.  The example used has been kept fairly 
   simple, whilst still aiming to illustrate some of the intricacies 
   that arise in use of the notation. 
    
   All the formal notation in this section has been tested using the 
   Prolog definitions of the encoding methods given in section 6. 
    
7.1. Example Packet Format 

   Our imaginary header contains information about a packet of 
   sandwiches.  It is just 8 bits long, consisting of two four-bit 
   fields: 
    
     1. number of sandwiches 
     2. number of extras (including cake, fruit, etc.) 
    
   So, for example, 10010010 would indicate a packet with 9 sandwiches 
   and 2 extras (1001 is 9 and 0010 is 2 in binary). 
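
   The field layout can be sketched in a few lines of Python.  The
   function names here are illustrative and are not part of ROHC-FN;
   headers are represented as strings of '0' and '1' characters, as in
   the examples in this section.

```python
def parse_sandwich_header(bits: str) -> tuple[int, int]:
    """Split an 8-bit header into (num_sandwiches, num_extras)."""
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 8 binary digits")
    return int(bits[:4], 2), int(bits[4:], 2)


def build_sandwich_header(num_sandwiches: int, num_extras: int) -> str:
    """Inverse of parse_sandwich_header; each field must fit in 4 bits."""
    if not (0 <= num_sandwiches < 16 and 0 <= num_extras < 16):
        raise ValueError("each field must be in the range 0..15")
    return format(num_sandwiches, "04b") + format(num_extras, "04b")


fields = parse_sandwich_header("10010010")  # first 4 bits, then last 4 bits
```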
    
7.2. Initial Encoding 
                      
   An initial definition based solely on the above information is: 
 
     sandwich_header       ::=   single_packet_format, 
     { 
       uncompressed_data   ::=   num_sandwiches  : 4 bits 
                                 num_extras      : 4 bits 
                                
       compressed_data     ::=   num_sandwiches  : 4 bits 
                                 num_extras      : 4 bits 
                                
       num_sandwiches      ::=   irregular(4), 
       num_extras          ::=   irregular(4) 
     } 
    
   This defines the packet nicely, but doesn't actually offer any 
   compression.  If we use it to encode the above header, we get: 
    
     Uncompressed header: 10010010 
     Compressed header:   10010010 
    
   This is because we have stated that both fields are irregular - i.e. 
   we don't know anything about their behaviour. 


    
7.3. Basic Compression

   If packets of sandwiches were standardized to always contain two 
   extras, regardless of the number of sandwiches, then the second field 
   would always be 0010.  The second field however remains in the header 
   for backward compatibility reasons.  We now have: 
 
      
     sandwich_header       ::=   single_packet_format, 
     { 
       uncompressed_data   ::=   num_sandwiches  : 4 bits 
                                 num_extras      : 4 bits 
    
       compressed_data     ::=   num_sandwiches  : 4 bits 
                                 num_extras      : 0 bits 
    
       num_sandwiches      ::=   irregular(4), 
       num_extras          ::=   value(4, 2) 
     } 
    
   Using this simple scheme, we have successfully encoded the fact that 
   one of the fields has a permanently fixed value of two, and therefore 
   contains no useful information.  Note that we could just as well have 
   omitted "num_extras  : 0 bits" from the definition of the compressed 
   data if we so wished. 
    
   Using this new encoding on the above header, we get: 
    
     Uncompressed header: 10010010 
     Compressed header:   1001 
    
   This halves the amount of data we need to transmit.  However, this 
   encoding fails to take any advantage of a stream of identical 
   packets: 
 
     Uncompressed header: 10010010 
     Compressed header:   1001 
    
     Uncompressed header: 10010010 
     Compressed header:   1001 
    
     Uncompressed header: 10010010 
     Compressed header:   1001 
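
   The behaviour of this single-format profile can be mimicked with a
   minimal Python sketch.  It hard-codes the two encoding methods used
   above (irregular(4) sends the field verbatim, value(4, 2) sends
   nothing and requires the field to equal 2); it is an illustration
   under those assumptions, not the Prolog definition from section 6.

```python
EXTRAS = format(2, "04b")  # value(4, 2): the fixed bit pattern 0010


def compress(header: str) -> str:
    """Compress an 8-bit sandwich header under the single-format profile."""
    num_sandwiches, num_extras = header[:4], header[4:]
    # value(4, 2): contributes 0 compressed bits, but the field must match.
    if num_extras != EXTRAS:
        raise ValueError("num_extras is not 2; this profile cannot encode it")
    # irregular(4): the field is sent as-is.
    return num_sandwiches


def decompress(compressed: str) -> str:
    """Rebuild the 8-bit header from the 4 compressed bits."""
    if len(compressed) != 4:
        raise ValueError("expected exactly 4 compressed bits")
    return compressed + EXTRAS


stream = ["10010010"] * 3
compressed_stream = [compress(h) for h in stream]
```

   Every header in the stream compresses to the same 4 bits, because
   nothing in this profile refers to the preceding header.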
    
7.4. Inter-packet compression 
                              
   The profile we have defined so far has not compressed the 
   num_sandwiches field at all.  This field can take any value, and so 
   there is no better single method of encoding this field than the 
   irregular encoding already used.  However using the 
   multiple_packet_formats encoding we can avoid having to stick to a 
   single encoding method. 
    
   Ideally, we would avoid encoding the field whenever its value is the 
   same as that of the same field in the preceding header.  This is 
   exactly what static encoding does: 
    
   sandwich_header       ::=   multiple_packet_formats, 
   { 
     uncompressed_format ::=   num_sandwiches,  % 4 bits 
                               num_extras,      % 4 bits 
                                
     co_format_count     ::=   constant(2), 
    
     co_format_0         ::=   discriminator,   % 1 bits 
                               num_sandwiches,  % 0 bits 
                               num_extras,      % 0 bits 
     { 
       discriminator     ::=   '0', 
       num_sandwiches    ::=   static, 
       num_extras        ::=   value(4, 2) 
     }, 
      
     co_format_1         ::=   discriminator,   % 1 bits 
                               num_sandwiches,  % 4 bits 
                               num_extras,      % 0 bits 
     { 
       discriminator     ::=   '1', 
       num_sandwiches    ::=   irregular(4), 
       num_extras        ::=   value(4, 2) 
     } 
   } 
    
   Note that we have had to add a discriminator field, in order that the 
   decompressor knows which packet format we have used.  The format with 
   a static number of sandwiches is now just 1 bit long.  However, the 
   original packet format (with an irregular number of sandwiches) has 
   also grown by one bit.  An important consideration when creating 
   multiple packet formats is whether the extra format occurs frequently 
   enough that the average compressed header length is shorter as a 
   result.  For example if no two packets of sandwiches, with the same 
   number of sandwiches in, were ever transmitted consecutively, then 
   the static format packet would never be used and all we have just 
   achieved is to lengthen our packet by one bit.  However it turns out 
   that it is quite common to send out consecutive packets of sandwiches 
   which have the same number of sandwiches in, so we achieve a 
   significant saving by being able to encode the headers of such 
   packets in a single bit. 
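
   The trade-off can be made concrete with a little arithmetic.  In the
   sketch below, p_repeat is an assumed input: the fraction of headers
   whose number of sandwiches equals that of the preceding header.  The
   two-format profile costs 1 bit for such headers and 5 bits
   otherwise, against a constant 4 bits for the single-format profile
   of section 7.3.

```python
STATIC_BITS = 1         # discriminator '0' only
IRREGULAR_BITS = 1 + 4  # discriminator '1' plus 4-bit num_sandwiches
SINGLE_FORMAT_BITS = 4  # compressed length without a discriminator


def mean_length_two_formats(p_repeat: float) -> float:
    """Mean compressed bits per header for the static/irregular profile."""
    return p_repeat * STATIC_BITS + (1 - p_repeat) * IRREGULAR_BITS


# Break-even point: solve p * 1 + (1 - p) * 5 = 4 for p.
break_even = (IRREGULAR_BITS - SINGLE_FORMAT_BITS) / (IRREGULAR_BITS - STATIC_BITS)
```

   Under these assumptions the extra format pays for itself only when
   more than a quarter of the headers repeat the previous number of
   sandwiches.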
    

   Using this profile on a stream of the above header, we now get: 
    
     Uncompressed header: 10010010 
     Compressed header:   11001 
    
     Uncompressed header: 10010010 
     Compressed header:   0 ; 11001 
    
     Uncompressed header: 10010010 
     Compressed header:   0 ; 11001 
    
   The first header in the stream is compressed the same way as before, 
   except that it now has the extra 1 bit discriminator at the start.  
   When a second header arrives, with the same number of sandwiches as 
   the first, it can now be compressed in two possible ways, either as a 
   single bit (0), or in the same way as previously. 
    
   Prolog execution of a profile will show all possible encodings of a 
   packet as defined by a given profile, separated by semi-colons.  
   Either of the above encodings for the packet could be produced by a 
   valid implementation, although of course a good implementation would 
   always pick the encoding which led to the best compression of the 
   packet stream (which is not necessarily the smallest encoding for a 
   particular packet). 
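
   The following Python sketch imitates this enumeration for the
   two-format profile of this section.  The function names, and the
   choice of passing the preceding uncompressed header as an argument,
   are assumptions made for illustration; picking the shortest
   encoding per packet is just one simple policy.

```python
from typing import Optional


def encodings(header: str, previous: Optional[str]) -> list[str]:
    """List every valid compressed form of header under the profile.

    previous is the preceding uncompressed header, or None for the first
    header in the stream.
    """
    num_sandwiches, num_extras = header[:4], header[4:]
    if num_extras != "0010":          # value(4, 2) must hold in both formats
        return []
    results = []
    # co_format_0: discriminator '0', num_sandwiches static (unchanged).
    if previous is not None and previous[:4] == num_sandwiches:
        results.append("0")
    # co_format_1: discriminator '1', num_sandwiches irregular(4).
    results.append("1" + num_sandwiches)
    return results


def smallest(header: str, previous: Optional[str]) -> str:
    """Pick the shortest encoding; fails if the header cannot be encoded."""
    return min(encodings(header, previous), key=len)
```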
    
7.5. Variable Length Discriminators 
 
   Suppose we do some analysis on sandwich flows and discover that 
   whilst it is usual for successive packets to have the same number of 
   sandwiches in them, on the occasions when they don't, the packet is 
   almost always a "diet" packet.  The number of sandwiches in a diet 
   packet is always one.  To encode the flow more efficiently a packet 
   format needs to be written to reflect this. 
    
   This now gives a total of three packet formats, which means we need 
   three discriminators to differentiate between them.  The obvious 
   solution here is to increase the number of bits in the discriminator 
   from one to two and, for example, use discriminators 00, 01, and 10.  
   However, we can do slightly better than this. 
    
   Any set of discriminators in which no discriminator is a prefix of 
   another will suffice, so we can use 0, 10 and 11.  If the 
   discriminator starts with 0, that is the whole discriminator.  If it 
   starts with 1, the decompressor knows it has to check one more bit to 
   determine the packet format. 
    
   It would be erroneous to use 0, 01 and 10 as discriminators since 
   after reading an initial 0, the decompressor would have no way of 
   knowing if the next bit was a second bit of discriminator, or the 
   first bit of the next field in the packet stream. 
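
   The condition that separates the two sets of discriminators can be
   checked mechanically: no discriminator may be a prefix of another.
   A minimal Python sketch of such a check (the function name is
   illustrative, not part of ROHC-FN):

```python
from itertools import permutations


def is_prefix_free(discriminators: list[str]) -> bool:
    """True if no discriminator is a prefix of a different list entry."""
    return not any(
        longer.startswith(shorter)
        for shorter, longer in permutations(discriminators, 2)
    )
```

   With this check, the set 0, 10, 11 is accepted, while 0, 01, 10 is
   rejected because 0 is a prefix of 01.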
    
   This gives us the following: 
 
 
     sandwich_header       ::=   multiple_packet_formats, 
     { 
       uncompressed_format ::=   num_sandwiches,  % 4 bits 
                                 num_extras,      % 4 bits 
    
       co_format_count     ::=   constant(3), 
    
       co_format_0         ::=   discriminator,   % 1 bits 
                                 num_sandwiches,  % 0 bits 
                                 num_extras,      % 0 bits 
       { 
         discriminator     ::=   '0', 
         num_sandwiches    ::=   static, 
         num_extras        ::=   value(4, 2) 
       }, 
    
       co_format_1         ::=   discriminator,   % 2 bits 
                                 num_sandwiches,  % 0 bits 
                                 num_extras,      % 0 bits 
       { 
         discriminator     ::=   '10', 
         num_sandwiches    ::=   value(4, 1), 
         num_extras        ::=   value(4, 2) 
       }, 
    
       co_format_2         ::=   discriminator,   % 2 bits 
                                 num_sandwiches,  % 4 bits 
                                 num_extras,      % 0 bits 
       { 
         discriminator     ::=   '11', 
         num_sandwiches    ::=   irregular(4), 
         num_extras        ::=   value(4, 2) 
       } 
     } 
    
   Here is some example output: 
    
     Uncompressed header: 10010010 
     Compressed header:   111001 
    
     Uncompressed header: 10010010 
     Compressed header:   0 ; 111001 
    
     Uncompressed header: 10010010 
     Compressed header:   0 ; 111001 
    
     Uncompressed header: 00010010 
     Compressed header:   10 ; 110001 
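
   To show how a decompressor can read the variable-length
   discriminators, here is a hedged Python sketch for the three-format
   profile.  It assumes, purely for illustration, that compressed
   headers are concatenated into one bit string and that the previous
   number of sandwiches is the only context kept; neither assumption
   comes from the notation itself.

```python
from typing import Optional

NUM_EXTRAS = "0010"  # value(4, 2) in every format


def decompress_one(bits: str, prev_sandwiches: Optional[str]) -> tuple[str, str]:
    """Decode one compressed header from the front of bits.

    Returns (uncompressed_header, remaining_bits).  prev_sandwiches is the
    4-bit num_sandwiches of the preceding header, or None if there is none.
    """
    if bits.startswith("0"):                      # co_format_0: static
        if prev_sandwiches is None:
            raise ValueError("static format used without a previous header")
        return prev_sandwiches + NUM_EXTRAS, bits[1:]
    if bits.startswith("10"):                     # co_format_1: value(4, 1)
        return "0001" + NUM_EXTRAS, bits[2:]
    if bits.startswith("11") and len(bits) >= 6:  # co_format_2: irregular(4)
        return bits[2:6] + NUM_EXTRAS, bits[6:]
    raise ValueError("truncated or empty compressed header")


def decompress_stream(bits: str) -> list[str]:
    """Decode a concatenation of compressed headers."""
    headers, prev = [], None
    while bits:
        header, bits = decompress_one(bits, prev)
        headers.append(header)
        prev = header[:4]
    return headers
```

   Because the discriminators are prefix-free, the decoder never needs
   to look ahead beyond the discriminator to know which format it is
   reading.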


    
7.6. Default encoding 
                      
   There is some redundancy in the notation used to define the profile 
   so far.  The num_extras field is the same in every packet format, and 
   is redefined each time.  If the sandwich protocol was changed at some 
   time in the future (e.g. suppose the number of extras is no longer 
   fixed to 2), the num_extras field would have to be changed in every 
   packet. 
    
   This problem can be avoided by specifying a default encoding for this 
   field, which also leads to a more concisely notated profile: 
    
     sandwich_header_5     ::=   multiple_packet_formats, 
     { 
       uncompressed_format ::=   num_sandwiches,  % 4 bits 
                                 num_extras,      % 4 bits 
    
       co_format_count     ::=   constant(3), 
    
       co_format_0         ::=   discriminator,   % 1 bits
                                 num_sandwiches,  % 0 bits 
       { 
         discriminator     ::=   '0', 
         num_sandwiches    ::=   static 
       }, 
    
       co_format_1         ::=   discriminator,   % 2 bits 
                                 num_sandwiches,  % 0 bits 
       { 
         discriminator     ::=   '10', 
         num_sandwiches    ::=   value(4, 1) 
       }, 
    
       co_format_2         ::=   discriminator,   % 2 bits 
                                 num_sandwiches,  % 4 bits 
       { 
         discriminator     ::=   '11', 
         num_sandwiches    ::=   irregular(4) 
       }, 
      
       default_methods     ::=   ... , 
       { 
         num_extras        ::=   value(4,2) 
       } 
     } 
    
   The above profile behaves in exactly the same way as the one notated 
   previously. 
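
   One way to picture the effect of default_methods is as a dictionary
   merge, sketched below in Python.  The data layout and the rule that
   an encoding given explicitly in a format overrides the default are
   assumptions made for this illustration; the draft text does not
   spell out the precedence.

```python
DEFAULT_METHODS = {"num_extras": ("value", 4, 2)}

FORMATS = {
    "co_format_0": {"discriminator": "0", "num_sandwiches": ("static",)},
    "co_format_1": {"discriminator": "10", "num_sandwiches": ("value", 4, 1)},
    "co_format_2": {"discriminator": "11", "num_sandwiches": ("irregular", 4)},
}


def resolve(formats: dict, defaults: dict) -> dict:
    """Fill in each format's missing field encodings from the defaults."""
    return {name: {**defaults, **fields} for name, fields in formats.items()}


resolved = resolve(FORMATS, DEFAULT_METHODS)
```

   After the merge, every format carries the same num_extras encoding
   as in the fully written-out profile, while FORMATS itself is left
   unchanged.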
    

    
8. Security considerations 
                           
   This draft describes a formal notation similar to ABNF [RFC-2234], 
   and hence is not believed to raise any security issues.  
 
9. Acknowledgements 
    
   A number of important concepts and ideas have been borrowed from ROHC 
   [RFC-3095].  Updates to the LIST encoding methods owe much to 
   discussions with Qian Zhang and Hongbin Liao. 
    
   Thanks to Paul Ollis for field labeling; and to Rob Hancock and 
   Stephen McCann for putting up with the authors' arguments and making 
   helpful suggestions, frequently against the tide! 
    
   The authors would also like to thank Carsten Bormann, Ghyslain 
   Pelletier, Christian Schmidt, Max Riegel and Lars-Erik Jonsson for 
   their comments and encouragement.  We haven't always agreed, but the 
   arguments have been fun! 
    
10. Authors' addresses 
                       
   Richard Price         Tel: +44 1794 833681 
   Email:                richard.price@roke.co.uk 
    
   Robert Finking        Tel: +44 1794 833189 
   Email:                robert.finking@roke.co.uk 
    
   Abigail Surtees       Tel: +44 1794 833131 
   Email:                abigail.surtees@roke.co.uk 
    
   Mark A West           Tel: +44 1794 833311 
   Email:                mark.a.west@roke.co.uk 
    
    
   Roke Manor Research Ltd 
   Romsey, Hants, SO51 0ZN 
   United Kingdom 
   http://www.roke.co.uk 
    
11. References 
               
   [RFC-2026]  "The Internet Standards Process - Revision 3", Scott 
               Bradner, RFC 2026, Internet Engineering Task Force,  
               October 1996 
    
   [RFC-2119]  "Key words for use in RFCs to Indicate Requirement 
               Levels", Scott Bradner, RFC 2119, Internet Engineering  
               Task Force, March 1997 
    
 
   [RFC-2234]  "Augmented BNF for Syntax Specifications: ABNF",   
               D. Crocker and P. Overell, RFC 2234, Internet Engineering  
               Task Force, November 1997 
    
   [RFC-3095]  "RObust Header Compression (ROHC)", Carsten Bormann et 
               al., RFC 3095, Internet Engineering Task Force, July 2001 
    
    
Appendix A.    Supporting Prolog Code 
                                      
   This appendix will, in a later version of the document, contain the 
   supporting Prolog code that is needed in order to execute a profile 
   written in ROHC-FN. 

