Random Linear Network Coding (RLNC)-Based Symbol Representation
draft-heide-nwcrg-rlnc-00

Abstract

This document describes the use of Random Linear Network Coding (RLNC) schemes for erasure correction in data transfer, with an emphasis on RLNC encoded symbol representations in data packets. Both block and sliding window RLNC code implementations are described. By providing erasure correction using randomly generated repair symbols, such RLNC-based schemes offer advantages in accommodating varying frame sizes and dynamically changing connections, reducing the need for feedback, and lowering the amount of state information needed at the sender and receiver.

Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on August 11, 2018.

Copyright Notice

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

1. Introduction

1.1. Random Linear Network Coding (RLNC) Basics
1.2. Generation-Based RLNC
1.3. Sliding Window RLNC
1.4. Recoding

2. Coding Parameter Design Considerations
3. Symbol Representation

3.1. Representation Setup
3.2. Field Types and Formats
3.3. Small Encoding Window

3.3.1. Examples

3.4. Large Encoding Window

4. Security Considerations
5. IANA Considerations
6. References

6.1. Normative References
6.2. Informative References

Authors' Addresses

1. Introduction

Network Coding is an emerging coding discipline that jointly improves network reliability and efficiency. In general communication networks, source coding is performed as a compression mechanism to reduce data redundancy and to reduce resources necessary for data transportation over the network. Channel coding, on the other hand, introduces redundancy for data transmission reliability over lossy channels. Network coding adds another layer of coding in-between these two. Random Linear Network Coding (RLNC) is an efficient network coding approach that enables network nodes to generate independently and randomly linear mappings of input to output data symbols over a finite field.

This document describes the use of RLNC schemes, with an emphasis on RLNC encoded symbol representations in data packets. Specifically, a block RLNC scheme and a sliding window RLNC scheme are discussed in the context of varying data frame sizes.

1.1. Random Linear Network Coding (RLNC) Basics

Unlike conventional communication systems based on the "store-and-forward" principle, RLNC allows network nodes to independently and randomly combine input source data into coded symbols over a finite field [HK03]. Such an approach enables receivers to decode and recover the original source data as long as enough linearly independent coded symbols, with sufficient degrees of freedom, are received. At the sender, RLNC can introduce redundancy into data streams in a granular way. At the receiver, RLNC enables progressive decoding and reduces feedback necessary for retransmission. Collectively, RLNC provides network utilization and throughput improvements, high degrees of robustness and decentralization, reduces transmission latency, and simplifies feedback and state management.

To encode using RLNC, original source data are divided into symbols of a given size and linearly combined. Each symbol is multiplied with a scalar coding coefficient drawn randomly from a finite field, and the resulting coded symbol is of the same size as the original data symbols.

Thus, each RLNC encoding operation can be viewed as creating a linear equation in the data symbols, where the random scalar coding coefficients can be grouped and viewed as a coding vector. Similarly, the overall encoding process where multiple coded symbols are generated can be viewed as a system of linear equations with randomly generated coefficients. Any number of coded symbols can be generated from a set of data symbols, similarly to expandable forward error correction codes specified in [RFC5445] and [RFC3453]. Coding vectors must be implicitly or explicitly transmitted from the sender to the receiver for successful decoding of the original data. For example, sending a seed for generating pseudo-random coding coefficients can be considered as an implicit transmission of the coding vectors. In addition, while coding vectors are often transmitted together with coded data in the same data packet, it is also possible to separate the transmission of coding coefficient vectors from the coded data, if desired.

To reconstruct the original data from coded symbols, a network node collects a finite but sufficient number of degrees of freedom for solving the system of linear equations. This is beneficial over conventional approaches as the network node is no longer required to gather each individual data symbol. In general, the network node needs to collect slightly more independent coded symbols than there are original data symbols, where the slight overhead arises because coding coefficients are drawn at random, with a non-zero probability that a coding vector is linearly dependent on another coding vector, and that one coded symbol is linearly dependent on another coded symbol. This overhead can be made arbitrarily small, provided that the finite field used is sufficiently large.

A unique advantage of RLNC is the ability to re-encode or "recode" without first decoding. Recoding can be performed jointly on existing coded symbols, partially decoded symbols, or uncoded systematic data symbols. This feature allows intermediate network nodes to re-encode and generate new linear combinations on the fly, thus increasing the likelihood of innovative transmissions to the receiver. Recoded symbols and recoded coefficient vectors have the same size as before and are indistinguishable from the original coded symbols and coefficient vectors.

In practical implementations of RLNC, the original source data are often divided into multiple coding blocks or "generations" where coding is performed over each individual generation to lower the computational complexity of the encoding and decoding operations. Alternatively, a convolutional approach can be used, where coding is applied to overlapping spans of data symbols, possibly of different spanning widths, viewed as a sliding coding window. In generation-based RLNC, not all symbols within a single generation need to be present for coding to start. Similarly, a sliding window can be variable-sized, with more data symbols added to the coding window as they arrive. Thus, innovative coded symbols can be generated as data symbols arrive. This "on-the-fly" coding technique reduces coding delays at transmit buffers, and together with rateless encoding operations, enables the sender to start emitting coded packets as soon as data is received from an upper layer in the protocol stack, adapting to fluctuating incoming traffic flows. Injecting coded symbols based on a dynamic transmission window also breaks the decoding delay lower bound imposed by traditional block codes and is well suited for delay-sensitive applications and streaming protocols.

When coded symbols are transmitted through a communication network, erasures may occur, depending on channel conditions and interactions with underlying transport protocols. RLNC can efficiently repair such erasures, potentially improving protocol response to erasure events to ensure reliability and throughput over the communication network. For example, in a point-to-point connection, RLNC can proactively compensate for packet erasures by generating Forward Erasure Correcting (FEC) redundancy, especially when a packet erasure probability can be estimated. As any number of coded symbols may be generated from a set of data symbols, RLNC is naturally suited for adapting to network conditions by adjusting redundancy dynamically to fit the level of erasures, and by updating coding parameters during a session. Alternatively, packet erasures may be repaired reactively by using feedback requests from the receiver to the sender, or by a combination of FEC and retransmission. RLNC simplifies state and feedback management and coordination as only a desired number of degrees of freedom needs to be communicated from the receiver to the sender, instead of indications of the exact packets to be retransmitted. The need to exchange packet arrival state information is therefore greatly reduced in feedback operations.

The advantages of RLNC in state and feedback management are apparent in a multicast setting. In this one-to-many setup, uncorrelated losses may occur, and any retransmitted data symbol is likely to benefit only a single receiver. By comparison, a transmitted RLNC coded symbol is likely to carry a new degree of freedom that may correct different errors at different receivers simultaneously. Similarly, RLNC offers advantages in coordinating multiple paths, multiple sources, mesh networking and cooperation, and peer-to-peer operations.

A more detailed introduction to network coding including RLNC is provided in the books [MS11] and [HL08].

1.2. Generation-Based RLNC

This section describes a generation-based RLNC scheme.

In generation-based RLNC, input data as received from an upper layer in the protocol stack is segmented into equal-sized blocks, denoted as generations, and each generation is further segmented into equal-sized data symbols for encoding, with paddings added when necessary. Encoding and decoding are performed over each individual generation. Figure 1 below provides an illustrative example where each generation includes four data symbols, and a systematic RLNC code is generated with rate 2/3.

Data Symbols:
          Generation-1                 Generation-2       
 +============================++============================+  
 | +---+  +---+  +---+  +---+ || +---+  +---+  +---+  +---+ |
 | | 1 |  | 2 |  | 3 |  | 4 | || | 5 |  | 6 |  | 7 |  | 8 | | ...
 | +---+  +---+  +---+  +---+ || +---+  +---+  +---+  +---+ | 
 +============================++============================+   
                  |                           |                   
                  |                           |                                 
Systematic        |                           |     
Symbols:          V                           V
 +---++---++---++---++---++---+ +---++---++---++---++---++---+ 
 | 1 || 2 || 3 || 4 || C1|| C2| | 5 || 6 || 7 || 8 || C3|| C4|  ...
 +---++---++---++---++---++---+ +---++---++---++---++---++---+

Figure 1: Generation-based RLNC with rate 2/3, systematic encoding performed on data symbols within each generation.

Symbols can be of any size, although symbol sizes typically depend on application or system specifications. In scenarios with highly varying input data frame sizes, a small symbol size may be desirable for achieving flexibility and transmission efficiency, with one or more symbols concatenated to form a coded data packet. In this context, existing basic FEC schemes [RFC5445] do not support the use of a single header for multiple coded symbols, whereas the symbol representation design as described herein provides this option.

For any protocol that utilizes generation-based RLNC, a setup process is necessary for establishing a connection and conveying coding parameters from the sender to the receiver. Such coding parameters can include one or more of field size, code specifications, index of the current generation being encoded at the sender, generation size, code rate, and desired feedback frequency or probability. Some coding parameters are updated dynamically during the transmission process, reflecting the coding operations over sequences of generations, and adjusting to channel conditions and resource availability. For example, an outer header can be added to the symbol representation specified in this document to indicate the current generation encoded within the symbol representation. Such information is essential for proper recoding and decoding operations, but the exact design of the outer header is outside the scope of the current document. At the minimum, an outer header should indicate the current generation, generation size, symbol size, and field size. Section 2 provides a detailed discussion of coding parameter considerations.

1.3. Sliding Window RLNC

This section describes a sliding-window RLNC scheme. Sliding window RLNC was first described in [SS09]

In sliding-window RLNC, input data as received from an upper layer in the protocol stack is segmented into equal-sized data symbols for encoding. In some implementations, the sliding encoding window can expand in size as new data packets arrive, until it is closed off by an explicit instruction, such as a feedback message that re-initiates the encoding window. In some implementations, the size of the sliding encoding window is upper bounded by some parameter, fixed or dynamically determined by online behavior such as packet loss or congestion estimation. Figure 3 below provides an example of a systematic finite sliding window code with rate 2/3.

 Data Symbols:
  +---+  +---+  +---+  +---+  +---+  +---+  +---+  +---+
  | 1 |  | 2 |  | 3 |  | 4 |  | 5 |  | 6 |  | 7 |  | 8 |      ...
  +---   +---+  +---+  +---+  +---+  +---+  +---+  +---+           
  |    C1    |             |             |             |
  +==========+             |             |             |
  |            C2          |             |             |
  +========================+             |             |
                |            C3          |             |
                +========================+             |
                              |              C4        |
                              +========================+      ...
                         |
                         |
Systematically           |
Coded Symbols:           V
+---++---++---++---++---++---++---++---++---++---++---++---+ 
| 1 || 2 || C1|| 3 || 4 || C2|| 5 || 6 || C3|| 7 || 8 || C4|...
+---++---++---++---++---++---++---++---++---++---++---++---+

Figure 3: Finite sliding-window RLNC with code rate 2/3, systematic encoding performed on data symbols within the sliding coding window.

For any protocol that utilizes sliding-window RLNC, a setup process is necessary for establishing a connection and conveying coding parameters from the sender to the receiver. Such coding parameters can include one or more of field size, code specifications, symbol ordering, encoding window position, encoding window size, code rate, and desired feedback frequency or probability. Some coding parameters can also be updated dynamically during the transmission process in accordance to channel conditions and resource availability. For example, an outer header can be added to the symbol representation specified in this document to indicate an encoding window position, as a starting index for current data symbols being encoded within the symbol representation. Again, such information is essential for proper recoding and decoding operations, but the exact design of the outer header is outside the scope of the current document. At the minimum, an outer header should indicate the current encoding window position, encoding window size, symbol size, and field size. Section 2 provides a detailed discussion of coding parameter considerations.

Once a connection is established, RLNC coded packets comprising one or more coded symbols are transmitted from the sender to the receiver. The sender can transmit in either a systematic or coded fashion, with or without receiver feedback. In progressive decoding of RLNC coded symbols, the notion of “seen” packets can be utilized to provide degree of freedom feedbacks. Seen packets are those packet that have contributed to a received coded packet, where generally the oldest such packet that has yet to be declared seen is declared as seen [SS09].

1.4. Recoding

Recoding is the process where coded or partially decoded symbols are re-encoded without first being fully decoded. To recode, both coded symbols and corresponding coding coefficient vectors are linearly combined, respectively, with new randomly generated recoding coefficients. Recoded symbols and recoded coefficient vectors generally have the same size as before and are indistinguishable from the original coded symbols and coding coefficient vectors. Recoding is typically performed at intermediate network nodes, in either an intra-session or an inter-session fashion. Intra-session coding refers to coding between packets of the same flow or session, while inter-session coding refers to combination of packets from different flows or sessions in a network.

As recoding requires the same operations on the coding coefficient vectors as applied to the coded symbols, coding coefficients must be updated by recoding coefficients. This is generally achieved by having coding coefficient accessible at recoding nodes so that they may be updated using the recoding coefficients. Thus, either the original coding coefficients or reversible representations of the coding coefficients need to be communicated from upstream nodes to the recoding nodes.

2. Coding Parameter Design Considerations

For any protocol that utilizes generation-based or sliding-window RLNC, several coding parameters must be communicated from the sender to the receiver as part of the protocol design. Without elaborating on all such parameters, this section examines those essential for symbol representation design, including field size, symbol size, maximum number of symbols, and maximum generation or window size.

As RLNC is performed over a finite field, field size determines the complexity of the required mathematical computations. Larger field sizes translate to higher probability of successful decoding, as randomly generated coding coefficient vectors are more likely to be independent from each other. However, larger field sizes may also result in higher computational complexity, leading to longer decoding latency, higher energy usage, and other hardware requirements for both the encoder and the decoder. A finite field size of 2 or the binary Finite Field FF(2) should always be supported since addition, multiplication, and division over FF(2) are equivalent to elementary XOR, AND, and IDENTITY operations respectively. It is also desirable to support a field size of 2^8, corresponding to a single byte, and where operations are performed over the binary extension field FF(2^8) with polynomial x^8+x^4+x^3+x^2+1.

The choice of symbol size typically depends on the application or system specification. For example, a symbol size may be chosen based on the size of a maximum transmission unit (MTU) so that datagrams are not fragmented as they traverse a network, while also ensuring no symbol bits are unnecessarily wasted. The current symbol representation design is flexible and can accommodate any symbol size in bytes. For example, an IP packet is typically in the range between 500 and 1500 bytes, while a much smaller datagram having a size of 90 bytes may be used by satellite communication networks. The current symbol representation can be configured to support either or both cases in different implementations.

The generation size or coding window size is a tradeoff between the strength of the code and the computational complexity of performing the coding operations. With a larger generation/window size, fewer generations or coding windows are needed to enclose a data message of a given size, thus reducing protocol overhead for coordinating individual generations or coding windows. In addition, a larger generation/window size increases the likelihood that a received coded symbol is innovative with respect to previously received symbols, thus amortizing retransmission or FEC overheads. Conversely, when coding coefficients are attached, larger generation/window size also lead to larger overheads per packet. The generation/window size to be used can be signaled between the sender and receiver when the connection is first established.

Lastly, to successfully decode RLNC coded symbols, sufficient degrees of freedom are required at the decoder. The maximum number of redundant symbols that can be transmitted is therefore limited by the number of linearly independent coding coefficient vectors that can be supported by the system. For example, if coding vectors are constructed using a pseudo-random generator, the maximum number of redundant symbols that can be transmitted is limited by the number of available generator states.

3. Symbol Representation

This section provides a symbol representation design for implementing RLNC-based erasure correction schemes. In this symbol representation design, multiple symbols are concatenated and associated with a single symbol representation header.

The symbol representation design is provided for constructing a data payload portion of a data packet for a protocol that utilizes a generation-based or sliding-window RLNC, where recoding can be used at intermediate nodes. A data packet data payload comprises one or more symbol representations. Each symbol representation in turn comprises one or more symbols that can be systematic, coded or recoded. The use of this symbol representation design is not limited by transmission schemes. It can be applied to unicast, multiple-unicast, multicast, multi-path, and multi-source settings and the like.

Coding coefficient vectors must be implicitly or explicitly transmitted from the sender to the receiver, generally along with the coded data for successful decoding of the original data. One option is to attach each coding coefficient vector to the corresponding coded symbol as a header, thus also enabling recoding at intermediate nodes. Another option is to attach the current state of a pseudo-random generator for generating the coding coefficient vector, to reduce the size of the header. Adding a header to each symbol may result in a high overhead when the symbol size is small or when generation or sliding window size is large. Adding a joint header to the beginning of each generation may also cause synchronization to be re-initiated only at the beginning of each generation instead of every symbol. In what follows, a symbol representation is provided that allow for both of these options such that both a general representation with coding coefficients and a compact representation with a seed for generating the coding coefficients can be used, in order to reduce the header overhead.

3.1. Representation Setup

This section specifies a symbol representation that enables both a general form with coding vectors attached, and a compact form where a seed is attached instead for the first symbol in the symbol representation, and where subsequent coding vectors are deduced from the first one. Different maximum generation and window size are supported for RLNC encoding, recoding, and decoding.

To encode over a set of data symbols, a coding vector as described in Section 1.1 is first generated, comprising a number of finite field elements as specified by a generation/window size variable. In the case of systematic codes, systematic symbols correspond to unit coding vectors.

Figure 4 illustrates the general symbol representation design. Four header fields precede the symbol data: TYPE flag, SYMBOLS, ENCODER RANK, and SEED or CODING COEFFICIENTS. The TYPE Flag indicates if the symbol is systematic, coded, or recoded. SYMBOLS indicates the number of symbols in the “Symbol(s) Data” field. ENCODER RANK represents the current rank of the encoder, which is the number of symbols being linearly combined. SEED is used to generate the coding coefficient vector(s) using a pseudo-random number generator, for a compact form of the symbol representation. CODING COEFFICIENTS field is a list of SYMBOLS number of coding vectors used to generate the ensuing SYMBOL(S) DATA.

+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10| 11| 12| 13| 14| 15|
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|  TYPE |    SYMBOLS    |             ENCODER RANK              |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|                  SEED or CODING COEFFICIENTS                  |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|                         SYMBOL(S) DATA                        |  
|                               ...                             |
|                                                               |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

Figure 4: A general symbol representation design.

3.2. Field Types and Formats

The TYPE Flag indicates if the symbol is systematic, coded, or recoded, and has the following properties:

2 bits long.
If the TYPE flag is ‘1’, all symbols included in this symbol representation are systematic or uncoded, with symbol index starting from ENCODER RANK. This option allows for efficient representation of systematic symbols.
If the TYPE is '2', all symbols included in this symbol representation are coded, with coding vectors generated using the included SEED and the ENCODER RANK. This option allows for compact and efficient representation of coded symbols, which may also subsequently be recoded.
If the TYPE is '3', all symbols included in this symbol representation are either coded or recoded, with the coding coefficients included constituting ENCODER RANK coefficients each.

SYMBOLS indicates the number of symbols in the “Symbol(s) Data” field, and has the following properties:

4 bits long. A maximum number of 15 symbols are concatenated within each symbol representation.
The special case of SYMBOLS = 0 indicates that zero symbols are included, and consequently the size of ‘Symbol(s) Data’ is 0 bytes. This can, for example, be used to implement a flush functionality or ensure that protocol operations do not stop in certain case for purely event-driven protocols.

ENCODER RANK represents the current rank of the encoder, and has the following properties:

MUST be no larger than generation/window size.
If TYPE flag is '1', ENCODER RANK is the symbol index of the first data symbol in this symbol representation.
If TYPE flag is '2' or '3', ENCODER RANK is the number of data symbols over which coding was performed for all coded symbols in this symbol representation.
Coded symbols can be generated before a generation or window is filled. ENCODER RANK describes the number of original symbols included in the coded symbol(s).

SEED is used to generate the coding coefficient vector(s) using a pseudo-random number generator, for a compact form of the symbol representation, and has the following properties:

The SEED field is only present when TYPE flag is '2'. If TYPE is '1' or '3', this field is absent.
The pseudo-random generator MUST be seeded with this value and all coding coefficient vectors are produced by the same generator. For example, if ENCODER RANK is 12, then coding vector for the first symbol in this symbol representation is coefficients 0 through 11 generated by the pseudo-random generator seeded by SEED, and coding vector for the second symbol in this symbol representation is coefficients 12 through 23 generated by the pseudo-random generator seeded by SEED. If generation/window size is larger than ENCODER RANK, the remaining coefficients in the coding vector are zero.
To ensure that SEED can be interpreted correctly at the receiver, the same pseudo-random number generator MUST be used by the sender and a recoding or receiving node. Otherwise, more than one SEED field would need to be used.
8 bits long. Thus, 256 different seed values can be served. One SEED is used per symbol representation, each of which can contain up to 15 symbols, all derived using the same SEED. For distinct ENCODER RANKs, different coding vectors would be generated from the same SEED, since only an ENCODER RANK number of coefficients from the random generator is grouped as a coding coefficient vector, before progressing to the next coding vector for the next symbol in the symbol representation. Consequently, the maximal number of coded symbols that can be generated for a generation is |SEED| * |SYMBOLS| * |ENCODER RANK| which in the best case is (2^8)*(2^4-1)*(2^10) ~ 2^22, which for all practical considerations can be considered as an infinite number of coded symbols. If all coded symbols that can be represented using a SEED is exhausted, symbols where the coding vectors is included can be sent instead.
In the case where no random number generator is available, or where its use is not desired, the coding coefficients can be produced by other means, such as functions of the data, state of the network, or the like, and transmitted explicitly by setting the TYPE flag to ‘3’.

CODING COEFFICIENTS field is a list of SYMBOLS number of coding vectors used to generate the ensuing SYMBOL(S) DATA, and has the following properties:

The CODING COEFFICIENT field is only present when TYPE flag is '3'. If TYPE is '1' or '2', this field is absent.
Each coding vector comprises ENCODER RANK number of coding coefficients, each coding coefficient having a predetermined field size.

3.3. Small Encoding Window

In a first small encoding window symbol representation, ENCODER RANK is 10 bits long, and the maximum generation/window size is 2^10.

Figures 5 to 7 below illustrate systematic, coded, and recoded symbol representations within an encoding window of size 2^10. Systematic symbols are uncoded. Coded symbols are compact in form and comprises a seed for coding coefficient generation. Recoded symbols are general in form and comprises the coding coefficient vectors explicitly.

+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10| 11| 12| 13| 14| 15|   
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|   1   |    SYMBOLS    |             ENCODER RANK              |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|                         SYMBOL(S) DATA                        |  
|                               ...                             |
|                                                               |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

Figure 5: A systematic symbol representation.

+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10| 11| 12| 13| 14| 15|   
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|   2   |    SYMBOLS    |             ENCODER RANK              |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
+             SEED              |        SYMBOL(S) DATA         |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|                    SYMBOL(S) DATA continued                   |  
|                               ...                             |
|                                                               |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

Figure 6: A compact, coded symbol representation.

+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10| 11| 12| 13| 14| 15|   
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|   3   |    SYMBOLS    |             ENCODER RANK              |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|                      CODING COEFFICIENTS                      |
|                               ...                             |
|                                                               |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|                         SYMBOL(S) DATA                        |  
|                               ...                             |
|                                                               |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

Figure 7: A recoded symbol representation.

3.3.1. Examples

The following examples show different symbol representations for an illustrative case where the symbol size is 2 bytes, generation/window size is 8, and field size is 2^8.

Example 1: Three systematic symbols with ID 0, 1 and 2. As the TYPE flag is '1' , SEED/CODING COEFFICIENTS is absent, and ENCODER RANK is the symbol index of the first data symbol with ID 0 in this compact symbol representation.

+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10| 11| 12| 13| 14| 15|   
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|   1   |       3       |               0                       |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|                    Systematic Symbol 0 Data                   |  
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|                    Systematic Symbol 1 Data                   |  
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|                    Systematic Symbol 2 Data                   |  
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

Figure 8: A symbol representation with 3 systematic, uncoded symbols.

Example 2: Two coded symbols using a compact representation. In this example, TYPE is '2', the SEED to the pseudo-random number generator shared by the sender and receiver is 4. The coding vector for Symbol A is coefficients 0 to 7 generated by the pseudo-random number generator, the coding vector for symbol B is coefficients 8 to 15 generated by the pseudo-random number generator.

+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10| 11| 12| 13| 14| 15|   
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|   2   |       2       |               8                       |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|               4               |      Coded Symbol A Data      | 
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|       Coded Symbol A Data     |      Coded Symbol B Data      |  
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|       Coded Symbol B Data     |
+---+---+---+---+---+---+---+---+

Figure 9: A symbol representation with 2 coded symbols.

Example 3: Two recoded symbols. Coefficients A0 to A7 constitutes the coding vector for Symbol A, coefficients B0 to B7 constitutes the coding vector for symbol B . In practical implementations, symbols sizes are much larger than 2, leading to amortization of the coding coefficient overheads.

+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10| 11| 12| 13| 14| 15|   
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|   2   |       2       |               8                       |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|              A0               |              A1               | 
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|                              ...                              |     
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|              A6               |              A7               |     
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|              B0               |              B1               | 
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|                              ...                              |     
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|              B6               |              B7               |   
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|                      Coded Symbol A Data                      |  
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|                      Coded Symbol B Data                      |  
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

Figure 10: A symbol representation with 2 recoded symbols having coding coefficients attached.

3.4. Large Encoding Window

In a second large encoding window symbol representation, ENCODER RANK is 18-bit long, and the maximum generation/window size is 2^18.

Figures 11 to 13 below illustrate systematic, coded, and recoded symbol representations within an encoding window of size 2^18. Systematic symbols are uncoded. Coded symbols are compact in form and comprises a seed for coding coefficient generation. Recoded symbols are general in form and comprises the coding coefficient vectors explicitly.

+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10| 11| 12| 13| 14| 15|   
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|   1   |    SYMBOLS    |             ENCODER RANK              |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|     ENCODER RANK continued    |         SYMBOL(S) DATA        |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|                    SYMBOL(S) DATA continued                   |  
|                               ...                             |
|                                                               |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

Figure 11: A systematic symbol representation.

+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10| 11| 12| 13| 14| 15|   
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|   2   |    SYMBOLS    |             ENCODER RANK              |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|     ENCODER RANK continued    |             SEED              |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|                         SYMBOL(S) DATA                        |  
|                               ...                             |
|                                                               |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

Figure 12: A coded symbol representation.

+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10| 11| 12| 13| 14| 15|   
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|   3   |    SYMBOLS    |             ENCODER RANK              |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|     ENCODER RANK continued    |      CODING COEFFICIENTS      |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|                 CODING COEFFICIENTS continued                 |
|                               ...                             |
|                                                               |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|                         SYMBOL(S) DATA                        |  
|                               ...                             |
|                                                               |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

Figure 13: A recoded symbol representation.

4. Security Considerations

This document does not present new security considerations.

5. IANA Considerations

This document has no actions for IANA.

6. References

6.1. Normative References

[RFC2119]	Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997.
[RFC3453]	Luby, M., Vicisano, L., Gemmell, J., Rizzo, L., Handley, M. and J. Crowcroft, "The Use of Forward Error Correction (FEC) in Reliable Multicast", RFC 3453, DOI 10.17487/RFC3453, December 2002.
[RFC5445]	Watson, M., "Basic Forward Error Correction (FEC) Schemes", RFC 5445, DOI 10.17487/RFC5445, March 2009.

6.2. Informative References

[HK03]	Ho, T., Koetter, R., Medard, M., Karger, D. and M. Effros, "The Benefits of Coding over Routing in a Randomized Setting", July 2003.
[HL08]	Ho, T. and D. Lun, "Network Coding: An Introduction", April 2008.
[MS11]	Medard, M. and A. Sprintson, "Network Coding: Fundamentals and Applications", October 2011.
[SS09]	Sundararajan, J., Shah, D., Medard, M., Mitzenmacher, M. and J. Barros, "Network Coding Meets TCP", April 2009.

Authors' Addresses

Janus Heide Steinwurf Aps Aalborg, Denmark EMail: janus@steinwurf.com

Shirley Shi Code On Network Coding LLC Cambridge, USA EMail: xshi@alum.mit.edu

Muriel Medard Code On Network Coding LLC Cambridge, USA EMail: muriel.medard@codeontechnologies.com

Vince Chook Inmarsat PLC London, United Kingdom EMail: Vince.Chook@inmarsat.com