Network Working Group                               Hans Hannu, Ericsson
INTERNET-DRAFT                             Jan Christoffersson, Ericsson
Expires: August 2001                           Krister Svanbro, Ericsson
                                                                  Sweden
                                                       February 23, 2001


             RObust GEneric message size Reduction (ROGER)
                    <draft-hannu-rohc-roger-00.txt>


Status of this memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This document is an individual submission to the ROHC working group
   in IETF. Comments should be directed to the authors or to the ROHC
   mailing list (rohc@cdt.luth.se).


Abstract

   Using existing ASCII based application signaling protocols over
   bandwidth limited channels, such as cellular access channels, creates
   problems such as long session setup times and long control times, and
   wastes scarce radio resources. This draft provides a robust and
   efficient compression scheme for ASCII based protocols, which reduces
   these problems.


Hannu, Christoffersson, Svanbro                                 [Page 1]


INTERNET-DRAFT   RObust GEneric message size Reduction     Feb 23, 2001


TABLE OF CONTENTS

   1.  Introduction..................................................4

   2.  General description...........................................4

   3.  Terminology...................................................6

   4.  Compression algorithm.........................................7
   4.1.  Dictionary build-up and maintenance.........................8

   5.  Message header................................................9
   5.1.  Message ID field...........................................10
   5.2.  Bit-mask...................................................11
   5.3.  The CRC for messages.......................................11
   5.4.  Errors in the Dynamic Dictionary...........................12
   5.5.  Wrap around................................................12
   5.6.  Avoiding deadlock..........................................13

   6.  Compressor-Decompressor entities.............................13
   6.1.  No contact mode............................................13
   6.1.1.  Compression..............................................14
   6.1.2.  Decompression............................................14
   6.2.  Limited contact mode.......................................14
   6.2.1.  Compression..............................................15
   6.2.2.  Decompression............................................16
   6.3.  Full contact mode..........................................16
   6.3.1.  Compression..............................................17
   6.3.2.  Decompression............................................17
   6.4.  Move of table content......................................17
   6.4.1.  TRD to DD................................................17
   6.4.2.  TST to DD................................................18

   7.  Relation to header compression...............................18
   7.1.  ROHC and ROGER.............................................19
   7.1.1.  ROHC and ROGER, limited contact mode.....................19
   7.1.1.1.  Packet types...........................................19
   7.1.2.  ROHC and ROGER, full contact mode........................19
   7.2.  ROGER realized outside of ROHC scheme......................20
   7.2.1. ROGER realized outside of ROHC scheme, ltd contact mode...20
   7.2.2. ROGER realized outside of ROHC scheme, full contact mode..20

   8.  Evaluation of compression scheme.............................21

   9.  Conclusion...................................................22

   10. Security considerations......................................22

   11. IANA considerations..........................................22

   12. Acknowledgments..............................................23

   13. Intellectual property considerations.........................23

   14. Authors addresses............................................23

   15. References...................................................23


1. Introduction

   Two communication technologies have become commonly used by the
   general public in recent years: cellular telephony and the
   Internet. Cellular telephony has provided its users with the freedom
   of mobility - the possibility of always being reachable with
   reasonable service quality no matter where they are. However, until
   now the main service provided has been speech. With the Internet, the
   conditions have been almost the opposite. While flexibility for all
   kinds of usage has been its strength, its focus has been on fixed
   connections and large terminals, and the experienced quality of some
   services (such as Internet telephony) has generally been low. Due to
   new enhanced technologies this is about to change. Internet and
   cellular technologies are beginning to merge. Future cellular
   "phones" will contain an IP-stack and support voice over IP in
   addition to web-browsing, e-mail, etc. One could say that the
   Internet is going mobile, or that cellular systems are becoming much
   more than telephony depending on one's point of view.

   Commonly used terms in this technical area are "all-IP" and "IP all
   the way". These terms all relate to the case where IP is used end to
   end, even if the path involves cellular links. This is done for all
   types of traffic, both the user data (e.g. voice or streaming) and
   control signaling data (e.g. SIP or RTSP). A great benefit of this is
   the service flexibility introduced by IP combined with the freedom
   provided by continuous mobility. A high cost, on the other hand, is
   the relatively large overhead the IP protocol suite typically
   introduces, due to large headers and text-based signaling protocols.

   It is very important in cellular systems to use the scarce radio
   resources in an efficient way. It must be possible to support a
   sufficient number of users per cell, otherwise costs will be
   prohibitive [CELL].

   The ROHC (RObust Header Compression) working group has successfully
   solved the problem of reducing bandwidth requirements for the header
   parts of e.g. RTP/UDP/IP packets [ROHC]. With this obstacle removed,
   new challenges of enhancing mobile Internet performance arise. One of
   these relates to application signaling protocols. Protocols such as
   SIP [SIP], SDP [SDP] and RTSP [RTSP] will typically be used to set up
   and control applications also in a mobile Internet. However, the
   generous size of the protocol messages combined with their
   request/response nature creates delays and wastes bandwidth.
   Compression of these messages should be considered in order to
   increase spectrum efficiency and reduce transmission delay [APP].

2. General description

   This chapter describes compression of protocol data above IP/UDP or
   IP/TCP. The solution is a framework which is robust to packet loss
   and will give efficient compression of ASCII based protocol messages.

   Furthermore, the compression is transparent, i.e. a compressed
   message will after decompression be identical to the original
   message.

   The framework is especially suitable for SIP/SDP, RTSP and HTTP
   [HTTP], but could also be used for other ASCII based protocols.

   Three possible compression/decompression scenarios are identified.
   The scenarios differ from each other depending on to what extent the
   compressor and decompressor can communicate. In all three cases it is
   assumed that the messages are compressed when sent over a narrow band
   link, such as a cellular link. This means that one of the entities
   may reside in the terminal equipment (mobile phone, thin client etc.)
   and the other (somewhere) in the core network of a cellular system.
   The different cases and how to handle them are described in Chapter
   6.

   The compression scheme enhances dictionary based compression of
   single messages by compressing messages from designated packet
   flow(s) in a sequential manner. Already transmitted messages are used
   when compressing new messages. The great gain in doing so stems from
   the fact that previous messages will contain much of the information
   or text strings that are found in later messages.

   In order to classify messages as belonging to a certain flow the
   messages must all pass through the points where the
   compressor/decompressor entities reside. That is, packets that go to
   different mobile terminals cannot in general belong to the same
   compressed packet flow.

   The method takes advantage of the possibility to acknowledge received
   messages. The acknowledgements can either be sent with messages
   travelling in the opposite direction or using a dedicated backwards
   channel. All sent and received messages are temporarily stored. Once
   the messages are acknowledged, they are put in a dictionary which is
   used for compression/decompression of future messages. Details on
   where the messages are stored while waiting for acknowledgements are
   given in Chapter 3. The dictionary management and a more thorough
   description of the different scenarios are given in Chapter 6.

   The compression algorithm used for compression is based on the
   Lempel-Ziv algorithm which replaces strings in the message by
   references to previous occurrences in the message or dictionary. The
   use of a dictionary will greatly increase the compression efficiency.
   More information on the compression algorithm can be found in Chapter
   4.

   An important feature needed in order for the proposed compression
   scheme to be efficient is a method to classify flows, that is, a
   method for identifying the type of protocol which is carried on top
   of IP/UDP, [IP], [UDP], or IP/TCP, [TCP]. Such a method makes it
   possible to avoid compressing data that is not suitable for this
   type of compression algorithm. This is not an elementary problem.
   One solution could be to look for certain protocol characteristics
   in the data transported by UDP or TCP.


3. Terminology

   * Static Dictionary (SD)
     A dictionary which is static, i.e. does not change during or
     between compression of message flows. The dictionary contains
     protocol-specific Header field names, Methods, Status-codes etc.
     The static dictionary is known by the compressor and decompressor
     prior to compression/decompression at both sides of the link. The
     SD is used for both compression and decompression.

   * Dynamic Dictionary (DD)
     Contains acknowledged messages (or parts of them), which have been
     transmitted during the session. The dynamic dictionary is known by
     both the compressor and the decompressor on the opposite side. The
     DD is empty when the compression begins and is updated according to
     some specific scheme during the message sequence. The DD is used
     for both compression and decompression.

   * Temporary Receiver Dictionary (TRD)
     Messages (or parts of them), which have traversed the link, are
     stored at the receiver side in the TRD. When a receiving entity is
     positive that the opposite side knows that the messages have been
     received, the messages are moved from the TRD to the DD. The TRD is
     used for compression only.

   * Temporary Sender Table (TST)
     Messages that have been sent over the link are stored in the TST at
     the sending side until it is positive that the messages have been
     received at the opposite side, then they are moved to the DD.

   * Temporary Receiver Dictionary Table (TRDT)
     The TRDT is used to keep track of when to move a message from the
     TRD to the DD. When a message has been put in the TRD and an
     acknowledgement has been sent indicating that the message in the
     TRD has been received, the sequence number of the sent message is
     put in the TRDT. When an acknowledgement of the message whose
     sequence number is in the TRDT arrives, the message in the TRD is
     moved to the DD.

   * Headers
     In order for the two entities to keep track of which messages have
     been sent from and received by the other entity the compressed
     messages are supplied with a header. The header holds the sequence
     number of the present message and the sequence numbers of all the
     received messages which have not yet been acknowledged.

   * Context
     The context in ROHC, [ROHC], contains the information necessary to
     perform compression and decompression. The context belongs to a
     certain flow of packets, which is identified by the IP source and
     destination address in combination with the source and destination
     port. For RTP, the SSRC identifier is also used to identify the
     context. The lifetime of the context is not specified within ROHC
     (implementation issue). The context of ROGER is the dictionaries
     and the tables, in this draft called ROGER context. The ROGER
     context is identified in the same way as ROHC context; IP address
     and port numbers of UDP or TCP.
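
   The relationship between the tables defined above can be pictured
   with a small data structure. The following Python sketch is only an
   illustration under stated assumptions: the class name RogerContext,
   the method names and the function context_for are hypothetical, and
   the order in which messages enter the DD (which must be identical on
   both sides, see Section 6.4) is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class RogerContext:
    """Hypothetical container for the dictionaries and tables of one flow."""
    static_dict: bytes = b""                                    # SD
    dynamic_dict: bytearray = field(default_factory=bytearray)  # DD
    trd: dict = field(default_factory=dict)   # peer message ID -> received message
    tst: dict = field(default_factory=dict)   # own message ID -> sent message
    trdt: dict = field(default_factory=dict)  # own message ID -> peer IDs it acknowledged

    def on_receive(self, peer_id, message):
        # A received message waits in the TRD.
        self.trd[peer_id] = message

    def on_send(self, own_id, message, acked_peer_ids=()):
        # A sent message waits in the TST; if it carried acknowledgements,
        # remember which received messages it acknowledged (TRDT).
        self.tst[own_id] = message
        if acked_peer_ids:
            self.trdt[own_id] = list(acked_peer_ids)

    def on_own_message_acked(self, own_id):
        # The peer confirmed our message: move it from the TST to the DD,
        # and move the received messages it acknowledged from the TRD to the DD.
        sent = self.tst.pop(own_id, None)
        if sent is not None:
            self.dynamic_dict += sent
        for peer_id in self.trdt.pop(own_id, []):
            received = self.trd.pop(peer_id, None)
            if received is not None:
                self.dynamic_dict += received


contexts = {}


def context_for(src_addr, dst_addr, src_port, dst_port):
    """Look up (or create) the ROGER context of a flow by addresses and ports."""
    key = (src_addr, dst_addr, src_port, dst_port)
    return contexts.setdefault(key, RogerContext())
```

   A context is created the first time a flow is seen and is reused for
   every later message with the same addresses and ports.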


4. Compression algorithm

   The default compression algorithm used to compress messages is a
   slightly modified LZSS [LZSS], which is of Lempel-Ziv type. The
   algorithm works by scanning through the file from left to right and
   replacing repeated strings by references to the most recent previous
   occurrence in the file. The reference is of the form (offset, length
   of match) and is typically represented using two bytes.

   The implementation of LZSS must be done so that it is possible to
   compress and decompress messages using dictionaries. A logical
   representation of how this can be achieved is as follows, see also
   Figure 4.1:

   Compression

   1. Append the message to the dictionary and compress the extended
      file using LZSS.
   2. Separate the part of the compressed file that corresponds to the
      dictionary from the part which corresponds to the message. This is
      possible since LZSS processes the file from left to right and the
      part which has already been compressed does not change as the
      compression proceeds. That is, compressing the dictionary by
      itself or compressing it with a message appended to it will
      produce the same output (apart from the compressed message).

   Decompression

   1. Append the compressed message to the compressed dictionary and
      decompress the extended file.
   2. Separate the message from the dictionary.

   It is of course of vital importance that the same dictionary is used
   by both the compressor and decompressor.

   +--------------+---------+                  +--------+------+
   |  Dictionary  | Message |       --->       |   CD   |  CM  |
   +--------------+---------+   Compression    +--------+------+


   +--------------+                            +--------+
   |  Dictionary  |                 --->       |   CD   |
   +--------------+             Compression    +--------+

   Figure 4.1. Compression of the dictionary with a message appended and
   the dictionary by itself. CD is the compressed dictionary and CM is
   the compressed message.

   The LZSS implementation should be tailored to enable the split of the
   compressed file into these two parts in a simple fashion. To
   facilitate the splitting the implementation should not replace a
   repeated string which runs from the dictionary and into the message
   with a single reference. The part of the string in the dictionary
   should be replaced with one reference and the part of the string in
   the message should be replaced by another reference.
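
   One way to picture dictionary-based compression is to preload the
   search history with the dictionary and to emit output only for the
   message, so that no reference ever has to be split at the boundary.
   The following Python sketch illustrates this idea only; it is not
   the modified LZSS of this draft. The token representation (a literal
   byte value or an (offset, length) tuple), the threshold MIN_MATCH
   and the naive search are assumptions made for readability, and no
   bit-level packing is attempted.

```python
MIN_MATCH = 3  # assumed threshold: shorter matches are sent as literals


def compress(dictionary: bytes, message: bytes) -> list:
    """Encode `message` as literals and (offset, length) references.

    References may point into the dictionary or into the part of the
    message that has already been processed.
    """
    history = bytearray(dictionary)
    tokens = []
    i = 0
    while i < len(message):
        best_len, best_off = 0, 0
        max_len = len(message) - i
        # Naive search over every start position in the history.
        for start in range(len(history)):
            length = 0
            while (length < max_len
                   and start + length < len(history)
                   and history[start + length] == message[i + length]):
                length += 1
            # ">=" prefers the most recent occurrence among equal lengths.
            if length > 0 and length >= best_len:
                best_len, best_off = length, len(history) - start
        if best_len >= MIN_MATCH:
            tokens.append((best_off, best_len))
            history += message[i:i + best_len]
            i += best_len
        else:
            tokens.append(message[i])
            history.append(message[i])
            i += 1
    return tokens


def decompress(dictionary: bytes, tokens: list) -> bytes:
    """Rebuild the message from tokens, using the same dictionary."""
    history = bytearray(dictionary)
    for token in tokens:
        if isinstance(token, tuple):
            offset, length = token
            start = len(history) - offset
            history += history[start:start + length]
        else:
            history.append(token)
    # Separate the message from the dictionary.
    return bytes(history[len(dictionary):])
```

   With a dictionary that already holds an earlier SIP request, most of
   a later request to the same URI collapses into a single reference.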

   Compression with LZSS is valid for virtually all types of protocol
   data, not just ASCII based. However, compression would probably not
   be as efficient for other types of data.

   Note: LZSS is chosen as the default compression algorithm in this
   draft. However, it is left as an open issue how LZSS could be
   modified or if some other compression algorithm should be used, in
   order to enhance the performance of ROGER.


4.1. Dictionary build-up and maintenance

   The dynamic dictionary is specific to each packet flow. That is, each
   new packet flow, identified by its IP address and port number, gives
   rise to its own dynamic dictionary. The dictionary is kept as long as
   packets arrive. Determining whether a packet flow is still active can
   be done using a timer. It could also be possible to identify the end
   of a packet flow from the semantics of the protocol, but this would
   complicate the compressor scheme by forcing the compressor to know
   the semantics of the compressed protocols and also to keep track of
   the type of the transmitted messages. Once a packet flow has ended,
   the DD, TRD and TST are emptied.

   Note: How long a dynamic dictionary is kept after the last packet has
   arrived is decided by the system where ROGER is implemented. However,
   the ROGER context at the compressor and the decompressor must be kept
   equally long; otherwise a compressed message may arrive that cannot
   be decompressed.

   There are several different strategies which could be used to
   determine what to update the dynamic dictionary with. The method
   should ensure a high compression efficiency while keeping the
   dictionary size and the complexity of the scheme at a reasonable
   level. Some possibilities on how to update the dictionary are given
   below:

   *Append all messages to the dictionary. This would give very large
    dictionaries in long sessions with many messages.

   *Only use the first (or last) n messages of the session. Typically,
    n would be small to ensure that the dictionary does not grow to an
    unreasonable size. Still, it would have to be large enough to make
    the compression efficient.

   *Append only new strings or rows to the dictionary. This would also
    ensure a slower growth of the dictionary. However, it remains to be
    investigated how this will affect the compression efficiency.

   *Messages, strings or rows could also be deleted from the dictionary
    to avoid having an unreasonably large dictionary. One strategy for
    this could be to delete from the beginning of the dictionary, i.e.
    deleting the oldest parts first.

   Note: It still remains to be investigated what is the most efficient
   way to update the dictionary. However, the dictionaries at compressor
   and decompressor must be updated according to the same procedure,
   otherwise it is impossible to decompress a compressed message.
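
   As an illustration of one of the strategies listed above (append
   every message, delete the oldest parts first), the following Python
   sketch keeps the dictionary below a fixed size. The class name
   BoundedDictionary and the size limit are assumptions; the draft does
   not prescribe a particular strategy or limit.

```python
class BoundedDictionary:
    """Hypothetical dynamic dictionary that drops its oldest bytes first."""

    def __init__(self, max_size=4096):
        self.max_size = max_size  # assumed limit, chosen by the implementation
        self.data = bytearray()

    def append(self, message: bytes):
        # Append the acknowledged message, then trim from the beginning
        # so that the dictionary never exceeds max_size bytes.
        self.data += message
        overflow = len(self.data) - self.max_size
        if overflow > 0:
            del self.data[:overflow]
```

   Both entities would have to apply exactly the same rule, with the
   same limit, for the dictionaries to stay identical.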


5. Message header

   To achieve robustness and keep track of sent and received messages, a
   header is added to the compressed message. There are two header types
   available, one for the basic operation used for the great majority of
   headers and one extended header used to verify the correctness of the
   dynamic dictionary. The basic header consists of a message
   identification field, a bit-mask for indication of previously
   received messages and finally a cyclic redundancy code (CRC). Figure
   5.1 shows the basic header.

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | Message ID    |       Bit-    |
   +---+---+---+---+               +
   |             mask              |
   +---+---+---+---+---+---+---+---+
   |              CRC              |
   +---+---+---+---+---+---+---+---+
   /   Compressed message          /
   +---+---+---+---+---+---+---+---+

   Figure 5.1. Basic message header

   *Message ID - 4 bits: The message identification field (number). The
    number is increased by one for each sent message. This number is
    used by the receiving entity (decompressor) to determine which
    message it has received. The field can also be used as a code point
    to signal special actions, see Section 5.1.

   *Bit-mask - 12 bits: Indicates received messages by their ID.
    These messages have been received at the entity generating this
    message and are stored in its TRD. Once a message has been moved
    from the TRD it will not be indicated in the bit-mask any further.
    See Chapter 6 for details.

   *CRC - 8 bits: The checksum is computed over the uncompressed
    message and saved in the header. After decompression, the checksum
    is computed again and compared to the CRC in the header. If these
    CRCs fail to match, an error has occurred.
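
   The layout of Figure 5.1 occupies three octets. The following Python
   sketch packs and unpacks these fields, assuming that bit 0 in the
   figure is the most significant bit of each octet (network bit
   order). The function names are hypothetical.

```python
def pack_basic_header(message_id: int, bit_mask: int, crc: int) -> bytes:
    """Pack the 4-bit Message ID, 12-bit bit-mask and 8-bit CRC."""
    if not (0 <= message_id <= 0xF and 0 <= bit_mask <= 0xFFF
            and 0 <= crc <= 0xFF):
        raise ValueError("header field out of range")
    return bytes([
        (message_id << 4) | (bit_mask >> 8),  # ID and upper 4 mask bits
        bit_mask & 0xFF,                      # lower 8 mask bits
        crc,
    ])


def unpack_basic_header(packet: bytes):
    """Return (message_id, bit_mask, crc, compressed_message)."""
    if len(packet) < 3:
        raise ValueError("packet shorter than the basic header")
    message_id = packet[0] >> 4
    bit_mask = ((packet[0] & 0x0F) << 8) | packet[1]
    return message_id, bit_mask, packet[2], packet[3:]
```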


5.1. Message ID field

   The bit-mask of 12 bits can indicate 12 received messages. Using 4
   bits for the message identification field gives 16 possible numbers.
   Using 12 numbers for message identification leaves 4 numbers for
   indication of other actions.

   * Identification

     +---+---+---+---+       +---+---+---+---+
     | 0   0   0   1 |  to   | 1   1   0   0 |
     +---+---+---+---+       +---+---+---+---+

     are used as message identification numbers.

   * Code points

     +---+---+---+---+
     | 0   0   0   0 | : Flush everything in the DD, TRD and TST and use
     +---+---+---+---+   this message as message number 1.

     +---+---+---+---+
     | 1   1   0   1 | : The header contains a CRC for the DD. This
     +---+---+---+---+   implies that the header has a special form, see
                         Section 5.4.
     +---+---+---+---+
     | 1   1   1   0 | : Reserved.
     +---+---+---+---+

     +---+---+---+---+
     | 1   1   1   1 | : Do not place any part of this message in the
     +---+---+---+---+   TRD of the receiving entity or in the TST of
                         the sending entity.

   Every sent or received message is put in a suitable table or
   dictionary. If this should not be the case, it must be signaled to
   the decompressor using the last of the above code points.
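
   The division of the 16 possible Message ID values can be summarized
   in a small lookup. In this Python sketch the function name and the
   returned labels are hypothetical; only the numeric ranges are taken
   from the list above.

```python
def classify_message_id(value: int) -> str:
    """Map a 4-bit Message ID value to its role."""
    if not 0 <= value <= 0b1111:
        raise ValueError("Message ID is a 4-bit field")
    if value == 0b0000:
        return "flush"           # empty DD, TRD and TST; restart at number 1
    if value <= 0b1100:
        return "identification"  # ordinary message numbers 1..12
    if value == 0b1101:
        return "dd-crc"          # extended header with a CRC for the DD
    if value == 0b1110:
        return "reserved"
    return "no-store"            # do not place the message in TRD or TST
```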


5.2. Bit-mask

   There are 12 bits in the mask. To indicate that a message has been
   received, the bit corresponding to the message's identification number
   is set. Thus, to indicate a message with identification number 1 the
   bit-mask is set to:

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | Message ID    | 0   0   0   0 |
   +---+---+---+---+               +
   | 0   0   0   0   0   0   0   1 |
   +---+---+---+---+---+---+---+---+
   |          CRC for Message      |
   +---+---+---+---+---+---+---+---+
   /   Compressed message          /
   +---+---+---+---+---+---+---+---+

   To indicate messages with identification numbers 2, 4 and 9 the bit-
   mask is set to:

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | Message ID    | 0   0   0   1 |
   +---+---+---+---+               +
   | 0   0   0   0   1   0   1   0 |
   +---+---+---+---+---+---+---+---+
   |          CRC for Message      |
   +---+---+---+---+---+---+---+---+
   /   Compressed message          /
   +---+---+---+---+---+---+---+---+

   Note also that a received message should be indicated in the
   following sent messages until the received message is moved from the
   TRD to the DD.
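
   The two examples above correspond to setting bit (ID - 1) of a 12 bit
   integer, counted from the least significant bit. A minimal Python
   sketch of this mapping follows; the function names are hypothetical.

```python
def mask_from_ids(ids) -> int:
    """Build the 12-bit acknowledgement mask from message IDs 1..12."""
    mask = 0
    for message_id in ids:
        if not 1 <= message_id <= 12:
            raise ValueError("only IDs 1..12 can be indicated")
        mask |= 1 << (message_id - 1)
    return mask


def ids_from_mask(mask: int) -> list:
    """List the message IDs indicated by a 12-bit mask."""
    return [i for i in range(1, 13) if mask & (1 << (i - 1))]
```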


5.3. The CRC for messages

   To discover residual bit errors in the messages an 8 bit CRC is
   computed over the message before compression. The CRC is then placed
   in the message header as shown in Figure 5.1. After decompression, a
   CRC is computed over the decompressed message and compared to the CRC
   in the header. If these CRCs do not match, an error has occurred. In
   this case the message is not placed in the TRD and not acknowledged.
   The CRC polynomial is given by C(x) = 1 + x + x^2 + x^8.
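
   A bitwise computation with this polynomial can be sketched in Python
   as follows. This draft does not state the bit ordering or the
   initial register value, so the sketch assumes most significant bit
   first processing and an all-zero initial value; an actual
   implementation would need both sides to agree on these choices.

```python
def crc8(data: bytes, init: int = 0x00) -> int:
    """CRC with C(x) = 1 + x + x^2 + x^8, processed MSB first (assumed)."""
    poly = 0x07  # x^2 + x + 1; the x^8 term is implicit
    crc = init
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def message_is_intact(decompressed: bytes, header_crc: int) -> bool:
    """Compare the CRC of the decompressed message with the header CRC."""
    return crc8(decompressed) == header_crc
```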


5.4. Errors in the Dynamic Dictionary

   To obtain robustness to errors in the Dynamic Dictionary which would
   cause the decompression of messages to fail, a CRC can be computed
   over the dynamic dictionary. This CRC is compared to the CRC
   computed over the dynamic dictionary at the other entity. This can be
   useful when the decompressor has failed to decompress several
   consecutive messages. A CRC computed over the dynamic dictionary is
   signaled using the code point 1 1 0 1. The extended header used in
   combination with this code point is as follows:

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 1   1   0   1 | Message ID    |
   +---+---+---+---+---+---+---+---+
   |              Bit-             |
   +               +---+---+---+---+
   |     Mask      |   CRC for     |
   +---+---+---+---+               +
   |   Dynamic Dictionary          |
   +---+---+---+---+---+---+---+---+
   |       CRC for Message         |
   +---+---+---+---+---+---+---+---+
   /   Compressed message          /
   +---+---+---+---+---+---+---+---+

   The CRC for the dynamic dictionary is 12 bits and the CRC polynomial
   is given by C(x) = 1 + x^2 + x^3 + x^11 + x^12.
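
   The extended header occupies five octets. The following Python
   sketch computes a 12 bit CRC with the polynomial above and packs the
   extended header. As for the message CRC, bit ordering and initial
   value are not stated in this draft; most significant bit first and
   an all-zero initial value are assumed, and the function names are
   hypothetical.

```python
def crc12(data: bytes) -> int:
    """CRC with C(x) = 1 + x^2 + x^3 + x^11 + x^12, MSB first (assumed)."""
    poly = 0x80D  # x^11 + x^3 + x^2 + 1; the x^12 term is implicit
    crc = 0
    for byte in data:
        for bit in range(7, -1, -1):
            in_bit = (byte >> bit) & 1
            top = (crc >> 11) & 1
            crc = (crc << 1) & 0xFFF
            if top ^ in_bit:
                crc ^= poly
    return crc


def pack_extended_header(message_id: int, bit_mask: int,
                         dd_crc: int, message_crc: int) -> bytes:
    """Pack code point 1101, Message ID, bit-mask, DD CRC and message CRC."""
    if not (0 <= message_id <= 0xF and 0 <= bit_mask <= 0xFFF
            and 0 <= dd_crc <= 0xFFF and 0 <= message_crc <= 0xFF):
        raise ValueError("header field out of range")
    return bytes([
        0xD0 | message_id,                       # 1 1 0 1 + Message ID
        bit_mask >> 4,                           # upper 8 mask bits
        ((bit_mask & 0xF) << 4) | (dd_crc >> 8), # lower mask, upper DD CRC
        dd_crc & 0xFF,                           # lower 8 DD CRC bits
        message_crc,
    ])
```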


5.5. Wrap around

   A wrap around problem arises when no acknowledgment has been received
   for the message ID number that is next in turn to be assigned to a
   new compressed message. There are some possible solutions to this
   problem: assign the next following free ID number, or, if no ID
   number is free, use the code point which indicates that this message
   is not to be saved in the TRD or in the TST, so that it is not used
   for further compression. This approach might reduce the compression
   efficiency in case the following messages differ substantially from
   the previous messages stored in the dictionaries. Message ID numbers
   could be freed even if no acknowledgment has been received. However,
   this must be done very carefully to maintain robustness. One approach
   is to free a message ID if an acknowledgement has been received for
   some later message and a prescribed time has expired.

   Note: The choice of solution is left to the implementation.
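
   One of the solutions mentioned above, taking the next free ID number
   and falling back to the code point 1 1 1 1 when none is free, can be
   sketched in Python as follows. The function name and the
   representation of outstanding IDs as a set are assumptions.

```python
LAST_ID = 12        # identification numbers run from 1 to 12
NO_STORE = 0b1111   # code point: do not save this message in TRD or TST


def next_message_id(last_id: int, unacknowledged: set) -> int:
    """Pick the next ID after `last_id` that is not awaiting an ACK.

    `last_id` is 0 before the first message has been sent.
    """
    candidate = last_id
    for _ in range(LAST_ID):
        candidate = candidate % LAST_ID + 1  # 12 wraps around to 1
        if candidate not in unacknowledged:
            return candidate
    return NO_STORE
```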


5.6. Avoiding deadlock

   If more than 12 consecutive messages in one direction are lost, the
   compressor runs out of ID numbers and a deadlock may occur. To avoid
   this, the following scheme will restart the compression. Messages
   sent within a prescribed time after the 12th message are sent with
   the code point 1 1 1 1. This forces the compressor to wait for the
   acknowledgement of the 12th message. After the prescribed time
   period, if no acknowledgment has been received, the code point
   0 0 0 0 is sent. This signals that the DD, TRD and TST are emptied and
   the compression scheme restarts. If no acknowledgement is received
   for this message the procedure is repeated.

   Note: The prescribed time period is system dependent and should thus
   be decided by the implementation for the system.
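
   The restart rule can be expressed as a choice between two code
   points once all ID numbers are in use. In this Python sketch the
   function name and the use of seconds as the time unit are
   assumptions; the length of the waiting period is left to the
   implementation as noted above.

```python
FLUSH = 0b0000     # empty DD, TRD and TST and restart the scheme
NO_STORE = 0b1111  # message is not saved in TRD or TST


def id_after_exhaustion(elapsed_seconds: float, wait_period: float) -> int:
    """Code point to send once all 12 ID numbers are unacknowledged.

    `elapsed_seconds` is the time since the 12th message was sent.
    """
    if elapsed_seconds < wait_period:
        return NO_STORE  # keep waiting for the ACK of the 12th message
    return FLUSH         # give up waiting and restart the compression
```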


6. Compressor-Decompressor entities

   In the following sections of this draft the compressor/decompressor
   entities will be referred to as entity-u (entity-uplink) and
   entity-d (entity-downlink), see Figure 6.1. Entity-u's compressor
   sends messages to entity-d's decompressor and entity-d's compressor
   sends messages to entity-u's decompressor. Depending on how the
   compressor and decompressor at one entity reside, or more
   specifically, to what extent they are able to communicate, different
   compression modes are possible. The compression efficiency will vary
   depending on the applied mode. The following three sections describe
   the different scenarios. Although Figure 6.1 shows a mobile
   communicating with a base station, the ROGER scheme could be applied
   to other types of systems and scenarios.

   Mobile, entity-d            Fixed network, entity-u
    |    ................              |  |
   ++                   /              +--+
   ||                   ............>   ||
   ++                                   /\
                                       /  \

   Figure 6.1. Placement of the compression entities


6.1. No contact mode

   The compressor and decompressor residing at the same entity are
   unable to communicate. Also, the decompressor at entity-u is unable
   to communicate with the compressor at entity-d and vice versa, making
   use of acknowledgements impossible. This precludes the use of ROGER
   headers, so in this mode no ROGER header is attached to the message.
   It also implies that the compressor at entity-d can never be certain
   that a sent message has been received at the decompressor of
   entity-u. Consequently, use of a dynamic dictionary would make
   decompression impossible if a previous packet had been lost. Only
   the static dictionary can be used without losing robustness against
   packet losses. Figure 6.2 shows this scenario.

                   Entity-d                    Entity-u
                  +---------+                 +---------+
                  |  Comp.  |---------------->| Decomp. |
                  +---------+                 +---------+

                  +---------+                 +---------+
                  | Decomp. |<----------------|  Comp.  |
                  +---------+                 +---------+

   Figure 6.2. No information about the arrival of sent messages reaches
   the compressor.


6.1.1. Compression

   Compression is carried out by using the static dictionary only.

   * Compression steps

   1) Compress message using SD
   2) Send message


6.1.2. Decompression

   Decompression is carried out by using the static dictionary only.

   * Decompression steps

   1) Decompress message using SD
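
   Compression against a static dictionary only can be tried out with
   the preset-dictionary feature of Python's zlib module. Note that
   zlib implements DEFLATE, which is used here merely as a readily
   available Lempel-Ziv type stand-in for the LZSS variant of this
   draft. The dictionary content and the function names are
   assumptions made for illustration.

```python
import zlib

# Assumed static dictionary: a few SIP strings known to both sides.
STATIC_DICT = (b"INVITE ACK BYE SIP/2.0\r\nVia: SIP/2.0/UDP "
               b"\r\nFrom: \r\nTo: \r\nCall-ID: \r\nCSeq: "
               b"\r\nContent-Length: ")


def compress_no_contact(message: bytes, static_dict: bytes = STATIC_DICT) -> bytes:
    """Step 1 of compression: compress the message using the SD only."""
    compressor = zlib.compressobj(zdict=static_dict)
    return compressor.compress(message) + compressor.flush()


def decompress_no_contact(data: bytes, static_dict: bytes = STATIC_DICT) -> bytes:
    """Decompress a message using the same SD."""
    decompressor = zlib.decompressobj(zdict=static_dict)
    return decompressor.decompress(data) + decompressor.flush()
```

   Since no state is carried from one message to the next, the loss of
   a packet does not affect the decompression of later packets.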


6.2. Limited contact mode

   The decompressor at e.g. entity-u can acknowledge messages to the
   compressor at entity-d. The decompressor at entity-u has a system
   provided link, which from ROGER's point of view looks like a direct
   link, to the compressor at entity-d, see Figure 6.3.

    Entity-d                    Entity-u
   +---------+                 +---------+
   |  Comp.  |---------------->| Decomp. |
   |         |<-----ACK--------|         |
   +---------+                 +---------+

   Figure 6.3. Acknowledgements generated by decompressor.


   From ROGER's point of view, the basic acknowledgement has the same
   format as the message header described in Chapter 5, except for the
   CRC, see Figure 6.4.

   0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   |    ACK ID     |       Bit-    |
   +---+---+---+---+               +
   |             mask              |
   +---+---+---+---+---+---+---+---+
   Figure 6.4.  Basic acknowledgement.

   *ACK ID - 4 bits: The acknowledgement identification number. The
    number is incremented by one for each sent acknowledgement. This
    number is used by the receiving entity (compressor) to indicate to
    the originator of this message (decompressor at the other entity)
    that the acknowledgement has been received.

   *Bit-mask - 12 bits: Indicates received messages by their ID.
    These messages have been received at the entity generating this
    acknowledgement and are stored in its TRD.
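
   A minimal sketch of how the basic acknowledgement of Figure 6.4
   could be packed into two octets is given below. The function names
   are hypothetical, and the mapping of message ID n to bit n of the
   bit-mask is an assumption made for illustration; the actual mapping
   follows the message header definition in Chapter 5.

```python
def pack_basic_ack(ack_id: int, received_ids) -> bytes:
    """Pack a 4-bit ACK ID and a 12-bit bit-mask into two octets."""
    mask = 0
    for msg_id in received_ids:
        if not 0 <= msg_id < 12:
            raise ValueError("only 12 messages fit in the bit-mask")
        mask |= 1 << msg_id          # assumed: message ID n <-> bit n
    word = ((ack_id & 0xF) << 12) | mask   # ACK ID wraps modulo 16
    return word.to_bytes(2, "big")


def unpack_basic_ack(data: bytes):
    """Return (ack_id, list of acknowledged message IDs)."""
    word = int.from_bytes(data[:2], "big")
    ack_id = word >> 12
    mask = word & 0x0FFF
    return ack_id, [i for i in range(12) if mask & (1 << i)]
```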

   In situations when the decompressor wants to verify the correctness
   of its dynamic dictionary, i.e. send code point 1 1 0 1, the extended
   acknowledgement should be used, see Figure 6.5.

   0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 1   1   0   1 |    ACK ID     |
   +---+---+---+---+---+---+---+---+
   |              Bit-             |
   +               +---+---+---+---+
   |     Mask      |   CRC for     |
   +---+---+---+---+               +
   |   Dynamic Dictionary          |
   +---+---+---+---+---------------+
   Figure 6.5.  Extended acknowledgement.
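
   The extended acknowledgement of Figure 6.5 occupies four octets. A
   possible packing is sketched below. The field widths (4-bit code
   point, 4-bit ACK ID, 12-bit bit-mask, 12-bit CRC) are read from the
   figure; the function names are hypothetical, and the CRC is treated
   as an opaque 12-bit value since its computation is not defined in
   this section.

```python
VERIFY_DD = 0b1101  # code point 1 1 0 1 from Figure 6.5


def pack_extended_ack(ack_id: int, mask: int, dd_crc: int) -> bytes:
    """Pack code point, ACK ID, bit-mask and DD CRC into four octets."""
    word = (
        (VERIFY_DD << 28)
        | ((ack_id & 0xF) << 24)
        | ((mask & 0xFFF) << 12)
        | (dd_crc & 0xFFF)
    )
    return word.to_bytes(4, "big")


def unpack_extended_ack(data: bytes):
    """Return (ack_id, mask, dd_crc); reject other code points."""
    word = int.from_bytes(data[:4], "big")
    if word >> 28 != VERIFY_DD:
        raise ValueError("not an extended acknowledgement")
    return (word >> 24) & 0xF, (word >> 12) & 0xFFF, word & 0xFFF
```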

   Section 7.1 describes how ROGER could be realized within ROHC and
   also how the feedback (acknowledging) is handled with ROHC.


6.2.1. Compression

   When the session starts, the dynamic dictionary and the TST are
   empty. The message is compressed using the static and the dynamic
   dictionary and is also stored in the TST. The message header
   indicates which message is sent and which acknowledgements have
   been received.


   * Compression steps:

   1) If necessary, move content of TST to DD
   2) Compress using SD+DD
   3) Put message in TST
   4) Attach header
   5) Send message


6.2.2. Decompression

   Decompression is done by first looking at the header attached to the
   compressed message. The header indicates which messages were in the
   DD when the message was compressed. That is, the bit-mask indicates
   which acknowledgements have arrived at the compressor, and the
   messages corresponding to those acknowledgements are used in the
   compression process. The decompressor makes sure that the same
   messages are used for decompression, i.e. it moves the content of
   the TRD that is indicated in the bit-mask to the DD. The received
   message is put into the TRD and an acknowledgement is sent to the
   compressor. One could consider using a sparse acknowledging scheme
   here.

   * Decompression steps:

   1) If necessary move content of TRD to DD, see Section 6.4.1.
   2) Decompress message using SD+DD
   3) Put message in TRD
   4) Send Acknowledgement
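
   The compression steps of Section 6.2.1 and the decompression steps
   above can be sketched together as follows. This is an illustration
   only, under several assumptions: zlib with a preset dictionary
   stands in for LZSS, the header is a Python tuple rather than the
   format of Chapter 5, message IDs do not wrap, and the DD content is
   concatenated in message ID order so that both sides build the same
   dictionary. All class and function names are hypothetical.

```python
import zlib

SD = b"To: From: Via: Call-ID: CSeq: SIP/2.0"  # hypothetical static dictionary


def _zdict(dd):
    # Assumed ordering rule: DD messages are concatenated by message ID.
    return SD + b"".join(dd[i] for i in sorted(dd))


class LimitedCompressor:
    def __init__(self):
        self.dd = {}        # msg_id -> acknowledged message
        self.tst = {}       # msg_id -> sent, not yet acknowledged message
        self.acked = set()  # message IDs for which an ACK has arrived
        self.next_id = 0

    def on_ack(self, msg_ids):
        self.acked |= set(msg_ids)

    def compress(self, message):
        # 1) If necessary, move content of TST to DD
        for i in sorted(self.acked):
            if i in self.tst:
                self.dd[i] = self.tst.pop(i)
        # 2) Compress using SD+DD
        comp = zlib.compressobj(zdict=_zdict(self.dd))
        body = comp.compress(message) + comp.flush()
        # 3) Put message in TST
        msg_id = self.next_id
        self.next_id += 1
        self.tst[msg_id] = message
        # 4) Attach header: message ID and the IDs of the messages in the DD
        return (msg_id, frozenset(self.dd)), body


class LimitedDecompressor:
    def __init__(self):
        self.dd = {}   # msg_id -> message known to be in the compressor's DD
        self.trd = {}  # msg_id -> received message

    def decompress(self, header, body):
        msg_id, in_dd = header
        # 1) If necessary, move content of TRD to DD
        for i in sorted(in_dd):
            if i in self.trd:
                self.dd[i] = self.trd.pop(i)
        # 2) Decompress message using SD+DD
        decomp = zlib.decompressobj(zdict=_zdict(self.dd))
        message = decomp.decompress(body) + decomp.flush()
        # 3) Put message in TRD
        self.trd[msg_id] = message
        # 4) The returned msg_id is what the caller acknowledges
        return message, msg_id
```

   Because a message only enters the DD after its acknowledgement has
   reached the compressor, a lost message never becomes part of the
   dictionary and later messages remain decompressible.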


6.3. Full contact mode

   The compressor and decompressor on both sides reside together. Thus,
   both sent and received messages can be used in the compression
   process since the compressor and decompressor share dictionaries. The
   decompressor uses the compressor with which it resides to inform the
   compressor on the other side that a message has been received. This
   is done by an indication in the bit-mask of the sent message's
   header. Figure 6.6 shows this scenario.


   +---------+                 +----------+
   | Comp./  |---------------->| Decomp./ |
   | Decomp. |<----------------| Comp.    |
   +---------+                 +----------+

   Figure 6.6. Compressor and decompressor reside together and are able
   to share the context.


   How this mode is handled within the ROHC scheme is described in
   Section 7.1.2.


6.3.1. Compression

   Compression is done using the SD, the DD and the TRD. The sent
   message is put into the TST. The message ID is put into the header
   together with the bit-mask indicating which messages have been
   received but not yet put into the DD.

   * Compression steps:

   1) Compress using SD+DD+TRD
   2) Put message in TST
   3) Attach header
   4) Send message


6.3.2. Decompression

   The decompression starts by reading the message header. If the
   bit-mask acknowledges a previously sent message which itself
   acknowledged an earlier received message, this earlier received
   message is moved from the TRD to the DD. If the bit-mask indicates
   that a previously sent message has been received at the other
   entity, this previously sent message is moved from the TST to the
   DD.

   * Decompression steps:

   1) If necessary move content of TRD to DD, see Section 6.4.1.
   2) If necessary move content of TST to DD, see Section 6.4.2.
   3) Decompress message using SD+DD
   4) Put message in TRD
   5) Send acknowledgment if needed
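
   One way to read the full contact mode steps is sketched below. It
   is an illustration under assumptions, not the definitive procedure:
   zlib with a preset dictionary stands in for LZSS, the header is a
   Python tuple, message IDs do not wrap, dictionary content is
   ordered by (origin, message ID), and acknowledgements are carried
   only in the bit-mask of the next sent message. The Endpoint class
   and its members are hypothetical names.

```python
import zlib

SD = b"To: From: Via: Call-ID: CSeq: SIP/2.0"  # hypothetical static dictionary


class Endpoint:
    """One entity; its compressor and decompressor share all tables."""

    def __init__(self, name, peer):
        self.name, self.peer = name, peer
        self.dd = {}    # (origin, msg_id) -> message present on both sides
        self.tst = {}   # own msg_id -> sent message awaiting acknowledgement
        self.trd = {}   # peer msg_id -> received message
        self.trdt = {}  # own msg_id -> peer msg_ids indicated in its header
        self.next_id = 0

    @staticmethod
    def _zdict(entries):
        # Assumed ordering rule so that both sides build the same dictionary.
        return SD + b"".join(entries[key] for key in sorted(entries))

    def send(self, message):
        # 1) Compress using SD+DD+TRD
        entries = dict(self.dd)
        entries.update({(self.peer, i): m for i, m in self.trd.items()})
        comp = zlib.compressobj(zdict=self._zdict(entries))
        body = comp.compress(message) + comp.flush()
        # 2) Put message in TST
        msg_id, self.next_id = self.next_id, self.next_id + 1
        self.tst[msg_id] = message
        # 3) Attach header; remember which received messages it indicates
        mask = frozenset(self.trd)
        self.trdt[msg_id] = mask
        return (msg_id, mask), body

    def receive(self, header, body):
        msg_id, mask = header
        for acked in sorted(mask):
            # 1) If necessary move content of TRD to DD (Section 6.4.1)
            for peer_id in self.trdt.pop(acked, ()):
                if peer_id in self.trd:
                    self.dd[(self.peer, peer_id)] = self.trd.pop(peer_id)
            # 2) If necessary move content of TST to DD (Section 6.4.2)
            if acked in self.tst:
                self.dd[(self.name, acked)] = self.tst.pop(acked)
        # 3) Decompress message using SD+DD
        decomp = zlib.decompressobj(zdict=self._zdict(self.dd))
        message = decomp.decompress(body) + decomp.flush()
        # 4) Put message in TRD
        self.trd[msg_id] = message
        return message
```

   In this sketch a repeated acknowledgement is harmless: the TRDT
   mapping and the TST entry are removed on the first move, so a
   second occurrence of the same ID in a bit-mask moves nothing.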


6.4. Move of table content

   This section defines when to move contents from the TRD or TST to the
   DD. In general the order for movement is:
   1) move contents from TRD to DD
   2) move contents from TST to DD.
   The TRD and the TST may each contain several messages. Only the
   messages that correspond to a certain acknowledgement are moved.


6.4.1. TRD to DD

   When a message is sent carrying indications of received messages in
   the TRD, a mapping between the ID of the sent message and the IDs of
   the indicated messages is stored in the TRDT. When a later message
   is received by this entity, the entity extracts the acknowledged
   message IDs from the received message header. The acknowledged
   message IDs are compared with the IDs stored in the TRDT. If there
   is a match, the corresponding contents of the TRD (given by the
   mapping) are moved to the DD and the mapping is removed from the
   TRDT. If the next received message carries the same acknowledgement,
   it causes no difficulties since the mapping has already been removed
   from the TRDT.


6.4.2. TST to DD

   The contents of the TST are moved to the DD when an acknowledgement
   is received for the message stored in the TST. The TST must be
   constructed so that if subsequent messages acknowledge the same
   message, no further move of content from the TST to the DD takes
   place.


7. Relation to header compression

   The protocols discussed in this draft, i.e. SIP/SDP, RTSP and HTTP
   are all carried on UDP/IP or TCP/IP. In order to utilize the benefits
   of using ROGER, attention should be paid to UDP/IP or TCP/IP
   compression as well.

   An efficient method for header compression of UDP/IP is given by
   ROHC. In the near future ROHC profiles for compressing TCP/IP will
   also be available.

   An appealing solution is to handle the UDP/IP compression with ROHC
   and the compression of the application signaling messages with ROGER.
   Figure 7.1 shows a packet before and after compression. CM is the
   compressed information, e.g. SIP/SDP, H is the ROGER header and R is
   the ROHC header handling the UDP/IP part of the packet.

        +--------+---------+                      +---+---+----+
        | IP/UDP | SIP/SDP |---- compression ---->| R | H | CM |
        +--------+---------+                      +---+---+----+
             Figure 7.1. Packet before and after compression.

   To identify messages that have been compressed with ROGER, there are
   some alternatives depending on the environment. For example, a link
   layer identification method could be used, or in conjunction with
   ROHC, a profile number.

   As ROGER compresses the part above TCP/IP and UDP/IP, it is assumed
   that context identification is handled by the underlying TCP/IP or
   UDP/IP compression scheme.


7.1. ROHC and ROGER

   The compression scheme ROGER could be made part of the ROHC
   framework. Profile 2 in ROHC is a UDP/IP compression profile. A new
   profile which has the same functions and packet types as profile 2
   and includes ROGER could be defined. Thus, the new profile would also
   compress the UDP payload with ROGER. A new profile can also be
   defined in a similar manner using the future ROHC TCP profile.


7.1.1. ROHC and ROGER, limited contact mode

   The limited contact mode, see Section 6.2, does not require any
   changes to the ROHC scheme or any additional features from the system
   that are not already required by ROHC. The next section describes the
   packet types to fit ROGER into the ROHC scheme.


7.1.1.1. Packet types

   The packet types of this combination of ROHC and ROGER have the same
   formats as defined in ROHC for profile 2 with the addition of the
   ROGER message header, see Section 5. The ROGER message header is
   placed at the end of the profile 2 ROHC header.

   The feedback types are the same as for ROHC profile 2, with the
   addition of this option:

   +---+---+---+---+---+---+---+---+
   |  Opt Type = 5 |  Opt Len = *  |
   +---+---+---+---+---+---+---+---+
   |  ID Number    |     Bit-      |
   +---+---+---+---+               +   - ROGER feedback
   |             mask              |
   +---+---+---+---+---+---+---+---+
   /                               /   - Other types of feedback
   +---+---+---+---+---+---+---+---+

   *If the "Opt Len" field has a value larger than 2 octets, there are
    more feedback options in this packet, which start after the ROGER
    feedback.
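
   The ROGER feedback option could be built and parsed along the
   following lines. The sketch assumes that "Opt Len" counts the
   octets following the first octet of the option, so that a value of
   2 covers exactly the ID number and the bit-mask; this
   interpretation, and the function names, are assumptions.

```python
ROGER_OPT_TYPE = 5  # "Opt Type = 5" in the figure above


def pack_roger_feedback(id_number: int, mask: int, extra: bytes = b"") -> bytes:
    """Build the option; `extra` holds any further feedback options."""
    opt_len = 2 + len(extra)
    if opt_len > 0xF:
        raise ValueError("Opt Len is a 4-bit field")
    first = (ROGER_OPT_TYPE << 4) | opt_len
    word = ((id_number & 0xF) << 12) | (mask & 0xFFF)
    return bytes([first]) + word.to_bytes(2, "big") + extra


def parse_roger_feedback(data: bytes):
    """Return (id_number, mask, remaining feedback octets)."""
    opt_type, opt_len = data[0] >> 4, data[0] & 0xF
    if opt_type != ROGER_OPT_TYPE or opt_len < 2:
        raise ValueError("not a ROGER feedback option")
    word = int.from_bytes(data[1:3], "big")
    return word >> 12, word & 0xFFF, data[3:1 + opt_len]
```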


7.1.2. ROHC and ROGER, full contact mode

   ROGER gains in compression efficiency if the dictionaries can be
   shared between the compressor and decompressor at the entities. How
   to share contexts is not defined in the ROHC scheme and can therefore
   be regarded as an implementation issue. However, this feature must be
   implemented on both sides of the link. The use of shared dictionaries
   requires that it is possible to associate an entity's inbound and
   outbound flows.

   The criteria for associating the flows could be the IP-addresses and
   possibly also the port numbers. It is necessary that both the uplink
   and downlink flows pass through the same point.

   To enable the use of shared dictionaries it is up to the underlying
   system to associate flows going in both directions and pass the ROGER
   headers with the compressed information to ROGER.


7.2. ROGER realized outside of the ROHC scheme

   ROGER can of course be realized outside of the ROHC scheme, but this
   places somewhat more requirements on the system in which ROGER is
   used. For the three modes defined in this draft there is one common
   requirement on the system (underlying link layer): the system must
   handle the negotiation of whether ROGER is to be used, and if so in
   which mode. The following two sections describe the requirements for
   the two latter modes. These requirements are mostly implementation
   matters and probably do not need to be standardized. In the no
   contact mode there are no other requirements since in this mode only
   the static dictionary is used.


7.2.1. ROGER realized outside of the ROHC scheme, limited contact mode

   In the limited contact mode the dictionaries are not shared among
   packet flows in opposite directions. There is one context per
   direction and flow.
   The additional link layer requirements in this mode are:

   *The system must be able to identify the flow(s) that correspond to
    a certain context. As the IP-header and UDP (or TCP) header are in
    the clear this can be done by looking at the IP-addresses and port
    numbers.

   *As ROGER does not define any piggyback headers, the system must
    provide the feedback to the ROGER entities, e.g. by a dedicated
    feedback channel.


7.2.2. ROGER realized outside of the ROHC scheme, full contact mode

   In the full contact mode the dictionaries are shared among packet
   flows in both directions. One context (the ROGER dictionaries) can be
   used by multiple flows in both directions.
   The additional link layer requirements in this mode are:


   *The system must be able to identify the flow(s) that correspond to
    a certain context. As the IP-header and UDP (or TCP) header are in
    the clear this can be done by looking at the IP-addresses and port
    numbers.

   *The system must be able to associate uplink packet flow(s) with
    downlink packet flow(s), since both sent and received messages
    are used in the compression process.

   *Since this mode requires that the system can associate uplink flows
    with downlink flows, acknowledgement of messages is handled with
    the bit-mask in the headers of the compressed messages.
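
   The difference between the flow identification needed in the
   limited contact mode and the flow association needed in the full
   contact mode can be illustrated with context lookup keys. The
   sketch below is an assumption about how a system might do this
   using IP addresses and port numbers; the Flow type and both key
   functions are hypothetical.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def limited_contact_key(flow: Flow):
    """One context per direction and flow: the key is direction-sensitive."""
    return (flow.src_ip, flow.src_port, flow.dst_ip, flow.dst_port)


def full_contact_key(flow: Flow):
    """Uplink and downlink flows map to the same context: the two
    endpoints are put in a canonical order before forming the key."""
    a = (flow.src_ip, flow.src_port)
    b = (flow.dst_ip, flow.dst_port)
    return (a, b) if a <= b else (b, a)
```

   With these keys, an uplink flow and the corresponding downlink flow
   select two different contexts in the limited contact mode and one
   shared context in the full contact mode.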


8. Evaluation of compression scheme

   A small test in the limited contact and full contact mode situations
   was carried out to evaluate the performance of ROGER. The messages
   from a SIP trace of a call setup were compressed. The packet flow
   consisted of 13 messages sent between a client and a SIP proxy.

   The compression was performed using an implementation of the LZSS
   algorithm as described in Section 4. To determine the size of the
   compressed file, the size of the compressed dictionary was compared
   to the size of the compressed extended file.

   The static dictionary used for the compression and decompression is
   built up from header field names, e.g. To:, From:, and Via:.

   In the limited contact mode it was assumed that every message was
   acknowledged before the next message was sent, i.e. a dedicated
   feedback channel was available. This gives slightly better
   compression than piggy-backing acknowledgements on SIP messages
   travelling in the opposite direction, since piggy-backing gives a
   slower dictionary expansion.

   The results in terms of compression factors (size uncompressed/size
   compressed) are given in Table 1. The overall compression factors
   were 3.3 for the limited contact mode and 4.6 for the full contact
   mode.

   Message #        Originating source       Compression factor
                                             Limited   Full
       1                  Client               1.5      1.5
       2                  Proxy                1.5      5.6
       3                  Proxy                1.9      3.1
       4                  Client               3.5      4.9
       5                  Proxy                5.2      6.4
       6                  Client               5.7      5.7
       7                  Proxy                6.8      8.1
       8                  Proxy                7.4      8.3
       9                  Proxy                6.9      7.0
       10                 Client               7.2      7.4
       11                 Proxy                7.1      7.8
       12                 Proxy                6.6      7.9
       13                 UAC                  7.8      7.8
                                      Overall: 3.3      4.6
   Table 1. Compression factor per message and mode.

   As can be seen from the table, compression is more efficient in the
   full contact mode. This is due to the fact that the dictionaries grow
   faster since both sent and received messages are used.
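
   Note that the figures 3.3 and 4.6 in the last row of Table 1 are
   not arithmetic means of the per-message factors, which are about
   5.3 and 6.3. They are consistent with an overall factor computed as
   total uncompressed size divided by total compressed size, in which
   large early messages with low factors weigh more. The sketch below
   shows both computations; the message sizes in the usage note are
   hypothetical, since the draft does not list them.

```python
# Per-message compression factors copied from Table 1.
LIMITED = [1.5, 1.5, 1.9, 3.5, 5.2, 5.7, 6.8, 7.4, 6.9, 7.2, 7.1, 6.6, 7.8]
FULL = [1.5, 5.6, 3.1, 4.9, 6.4, 5.7, 8.1, 8.3, 7.0, 7.4, 7.8, 7.9, 7.8]


def mean_factor(factors):
    """Arithmetic mean of the per-message compression factors."""
    return sum(factors) / len(factors)


def overall_factor(uncompressed_sizes, factors):
    """Total uncompressed size divided by total compressed size."""
    compressed = [u / f for u, f in zip(uncompressed_sizes, factors)]
    return sum(uncompressed_sizes) / sum(compressed)
```

   For example, with hypothetical sizes of 1000 and 100 octets and
   factors 1.5 and 7.8, overall_factor gives about 1.6 while the mean
   of the two factors is 4.65.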


9. Conclusions

   This draft has presented the compression scheme ROGER for application
   signaling protocols. The scheme is simple to implement and shows
   promising results for compression of the ASCII based protocols
   SIP/SDP, and can be expected to have a similar performance on other
   ASCII based protocols such as RTSP.

   Depending on what the system's link layer can support there are three
   different modes of operation: no contact mode, limited contact mode,
   and full contact mode.

   ROGER can easily fit in the ROHC framework.

   Requirements on the system if ROGER is run outside of the ROHC
   framework are listed.


10. Security considerations

   In general encryption and compression do not go together very well.
   More specifically, encrypted messages cannot be compressed
   efficiently. It is of course possible to run a lossless compression
   algorithm like ROGER on an encrypted message, but the compression
   will most likely not decrease the size of the message.

   These points also hold for the use of ROGER. The use of ROGER does
   not affect the possibility to use encryption algorithms. Use of ROGER
   on encrypted messages is possible, although not believed to result in
   any size reduction of the encrypted message.
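
   The effect can be illustrated with a general purpose lossless
   compressor. In the sketch below, zlib stands in for ROGER and
   random octets stand in for ciphertext; both substitutions are
   assumptions made for illustration only.

```python
import os
import zlib


def compression_factor(data: bytes) -> float:
    """Size uncompressed divided by size compressed."""
    return len(data) / len(zlib.compress(data))


# Repetitive ASCII signaling text compresses well ...
plain = b"Via: SIP/2.0/UDP pc33.example.com\r\n" * 20
# ... while random octets (a stand-in for ciphertext) do not.
ciphertext_like = os.urandom(4096)
```

   Here compression_factor(plain) is well above 1, whereas
   compression_factor(ciphertext_like) is expected to be at or
   slightly below 1.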


11. IANA considerations

   If ROGER is to be a part of the ROHC framework a ROHC profile
   identifier must be reserved by the IANA for the IP/UDP/ROGER profile
   defined in this document and also a ROHC profile identifier for a
   future IP/TCP/ROGER profile.


12. Acknowledgements

   Thanks to Arto Mahkonen, Ericsson LMF, and Lars-Erik Jonsson,
   Ericsson Erisoft, for their valuable input to this work.


13. Intellectual property rights considerations

   Ericsson has filed patent applications that might possibly have
   technical relations to this contribution.
   See further: http://www.ietf.org/ietf/IPR/ERICSSON-General 


14. Authors' Addresses

     Hans Hannu               Tel: +46 920 20 21 84
     Ericsson Erisoft AB
     Lulea, Sweden            EMail: Hans.Hannu@epl.ericsson.se

     Jan Christoffersson      Tel: +46 920 20 28 40
     Ericsson Erisoft AB
     Lulea, Sweden            EMail: Jan.Christoffersson@epl.ericsson.se

     Krister Svanbro          Tel: +46 920 20 20 77
     Ericsson Erisoft AB
     Lulea, Sweden            EMail: Krister.Svanbro@epl.ericsson.se


15. References

   [APP]       H. Hannu, J. Christoffersson and K. Svanbro, Application
               signaling over cellular links, Internet Draft (work in
               progress), November 2000.
               <draft-hannu-rohc-signaling-cellular-00.txt>

   [CELL]      L. Westberg and M. Lindqvist, Realtime traffic over
               cellular access networks, Internet Draft (work in
               progress), November 2000.
               <draft-westberg-realtime-cellular-03.txt>

   [HTTP]      R. Fielding, et al., Hypertext Transfer Protocol -
               HTTP/1.1, RFC 2616, June 1999.

   [IP]        J. Postel, Internet Protocol, RFC 791, September 1981.

   [LZSS]      J.A. Storer and T.G. Szymanski, Data Compression via
               Textual Substitution, Journal of the ACM 29, 1982.

   [ROHC]      C. Bormann, et al., RObust Header Compression, Internet
               Draft (work in progress), February 2001.
               <draft-ietf-rohc-rtp-08.txt>

   [RTSP]      H. Schulzrinne, A. Rao and R. Lanphier, Real Time
               Streaming Protocol (RTSP), RFC 2326, April 1998.

   [SDP]       M. Handley and V. Jacobson, SDP: Session Description
               Protocol, RFC 2327, April 1998.

   [SIP]       M. Handley, H. Schulzrinne, E. Schooler and J. Rosenberg,
               SIP: Session Initiation Protocol, RFC 2543, March 1999.

   [TCP]       J. Postel, Transmission Control Protocol, RFC 793,
               September 1981.

   [UDP]       J. Postel, User Datagram Protocol, RFC 768, August 1980.


   This Internet-Draft expires in August 2001.

