Robust Header Compression                              Peter J. McCann
INTERNET DRAFT                                              Tom Hiller
Document: draft-mccann-rohc-gehcoarch-01.txt       Lucent Technologies
                                                        February, 2001

    Requirements and Architecture for Zero-Byte Header Compression

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026 [Bradner96].

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

1. Abstract

Efficient transmission of voice over wireless links requires significant engineering effort. Because of the high cost of bandwidth on such links, special techniques for compression of voice data and its transmission over the air have been developed. The compression techniques and the wireless physical layers have been co-designed for maximum spectral efficiency and human perceptual euphony. Voice over IP (VOIP) applications should be able to leverage this engineering effort when used over wireless links. We advocate a "zero-byte header compression" approach to this problem in order to enable the end-to-end service model while achieving maximum spectral efficiency. This document outlines an architectural framework for a wireless VOIP application, including the wireless link layer and its interface to typical IP stack implementations, and discusses the protocol elements that should be standardized between the various components.
McCann, Hiller                Expires 08/2001                        1
GEHCOARCH                                               February, 2001

2. Introduction

Voice over IP (VOIP) promises to change radically the way that telephony services are built and delivered. Integration of voice with the Internet will not just be a change in the way traffic is carried; rather, new types of services will be made possible by the integration of voice with existing Internet applications such as the World Wide Web and e-mail. The key to these new services will be a platform that offers open programmability while offering a transport for VOIP in an integrated, robust, and efficient way.

Wireless links offer great challenges to the transport of voice traffic, and significant engineering effort has gone into making them efficient for circuit voice applications. New voice compression algorithms ("codecs"), such as EVRC [TIA-IS127], SMV [TIA-SMV], or AMR [ETSI-AMR] have been developed to minimize the amount of data that must be carried, and special over-the-air channels have been implemented to carry these codecs with a minimum of overhead bits and minimal latency.

VOIP flows will be carried inside the Real-time Transport Protocol (RTP) [Shulzrinne96] on wired links. However, for wireless links, the situation is less clear. The limited bandwidth of wireless links makes it impractical to transmit the entire IP/UDP/RTP header with every packet, as the overhead would be prohibitive. It is possible to compress these headers by transmitting only updates to the fields that change rather than the entire header [Bormann01], but these compression schemes are complex and can never entirely eliminate the overhead due to RTP. Even when the header is compressed down to one byte per frame on average, the impact on spectral capacity is significant. Also, the variable-sized frames produced by these compression protocols are unsuitable for typical wireless links that support only a limited number of frame sizes.
The fundamental reason why these schemes cannot achieve the same efficiency as circuit data is that they discard information that is available at the physical channel layer, including the real-time nature of the traffic, which can assist in reconstructing the RTP header.

This document describes an architectural framework that allows such real-time information to be used while not restricting the choice of call control protocol, placement of call feature servers, or mobile station architecture. We show that such a scheme can achieve complete transparency in the reconstruction of IP headers given a few reasonable assumptions about the behavior of the real-time physical link and the RTP packet stream. A companion document [Hiller01] gives a concrete realization of part of the architecture by extending the base ROHC protocol with a zero-byte compression profile.

2.1 Wireless Technology Considerations

Cellular wireless technologies will support distinct bearer channels for real-time audio flows versus non-real-time data. Data for TCP, such as web or e-mail traffic, will suffer from the lossy nature of the wireless link unless a link-layer retransmission protocol is used to improve its reliability. Such a retransmission protocol (called the Radio Link Protocol or RLP in the emerging cellular data networks) does improve reliability but only at the expense of additional buffering and latency. Real-time audio streams cannot tolerate the additional latency, which could be on the order of 1 second under adverse radio conditions. For this reason, a separate bearer channel will be used for voice that does no retransmission. This bearer will be very similar to the existing circuit voice channels. The architecture outlined below allows the mobile station to make effective use of this channel for VOIP.

The architecture of a mobile station should allow maximum flexibility in its hardware and software choices.
Two basic mobile station models have been identified in the wireless data community. A "network model" station is one that is completely integrated, such as a phone plus browser or a palmtop with integrated radio hardware. Such a device usually has a real-time operating system, a DSP chip for processing the audio codec, and an embedded IP stack implementation. In contrast, a "relay model" station is one that is split in two: it consists of a piece of terminal equipment (such as a laptop computer) connected to a piece of radio equipment, usually by a serial connection. The idea is to make use of the mostly stock operating system on the terminal equipment, while "relaying" the data to and from the wireless network via the radio equipment.

We take the point of view that VOIP applications must be supported for both kinds of mobile stations. While network model phones will offer a tightly integrated set of services, relay model stations are likely to offer a much more open and programmable environment on the terminal equipment. As these devices evolve we expect the distinction between network and relay models will blur as the wireless device moves closer to the UNIX notion of a "network interface" to a stock operating system, and the operating system evolves to take on more real-time functionality.

3. Requirements

A zero-byte header compression scheme makes use of physical channel timing to accurately reconstruct the RTP header fields. The basic requirement of such a scheme is to accurately reconstruct headers while adding a minimum of overhead bits when compared to ordinary circuit voice.

Approximate voice activity factors (probability distribution of frame sizes) for the Selectable Mode Vocoder (SMV) are given in Figure 1. These reflect one party's activity during a typical two-way interactive voice call.

   Rate       Activity %    Payload (bits)
   Full           20             171
   Half           20              80
   Quarter        10              40
   Eighth         50              16

   Figure 1: Activity of the 3GPP2 Selectable Mode Vocoder

This vocoder is designed to operate synchronously with the underlying physical channel: it outputs one of the above frame sizes every 20 milliseconds. Which frame size is output depends on the characteristics of the speech being compressed; typically, full-rate (171 bit) frames are used during active talk spurts, interspersed with half- and quarter-rate frames as needed. Eighth-rate frames are used mainly during silence periods, but they also contain information about the noise components present in the silence, which is referred to as "comfort noise generation". Also, the physical link typically requires that some frame be transmitted during every 20ms interval so that power control can be maintained, and the eighth-rate frames play this role.

The cdma2000 air interface has been designed with these frame sizes in mind, to support optimal transport of circuit voice. It is not possible to perform a marginal adjustment to the frame sizes to accommodate header overhead. This makes application of the basic ROHC RTP profile problematic at best: if one byte of LSB-encoded sequence number is added to a frame, it must be carried in the next-higher frame format. For a full-rate frame, there is no next-higher frame format and so those frames could not be transported without breaking the synchronization with the underlying physical link and introducing additional framing, for example with the use of PPP HDLC flags or the ROHC segmentation mechanism. This would introduce another 1 or 2 bytes of overhead per frame, and would also have a multiplier effect on the frame error rate since most vocoder frames would now span two physical frames. Finally, this lack of synchronization would introduce an occasional lag between the vocoded frame time and real time that could add to the end-to-end latency and jitter of the RTP flow.
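
The frame-format problem described above can be illustrated with a minimal sketch. It uses only the payload sizes from Figure 1 and the one-byte header from the text; the names FRAME_FORMATS, HEADER_BITS, and smallest_format_for are hypothetical and chosen for illustration only.

```python
# Payload capacity in bits of each 20 ms frame format, taken from Figure 1.
FRAME_FORMATS = {"eighth": 16, "quarter": 40, "half": 80, "full": 171}

# One byte of LSB-encoded sequence number, as discussed in the text.
HEADER_BITS = 8


def smallest_format_for(bits_needed):
    """Return the smallest frame format that can hold bits_needed bits.

    Returns None when no single physical frame format is large enough.
    """
    for name, capacity in sorted(FRAME_FORMATS.items(), key=lambda kv: kv[1]):
        if capacity >= bits_needed:
            return name
    return None


if __name__ == "__main__":
    for name, payload in FRAME_FORMATS.items():
        carrier = smallest_format_for(payload + HEADER_BITS)
        print(name, "payload plus header is carried in:", carrier)
```

Under these assumptions every vocoder frame is pushed into the next-higher format, and a full-rate frame plus header fits in no format at all, which is the case that forces segmentation or extra framing.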

Even a very conservative calculation, assuming these problems can be overcome and ignoring the contribution from eighth-rate 16 bit frames, yields an additional 400 bits per second from the header and segmentation overheads. Compared to the average 3720 bps circuit voice rate, this overhead (greater than 10%) would significantly diminish the number of calls that can be handled in a given amount of spectrum.

We conclude that because the codec and physical link have been co-engineered to such tight tolerances, we should endeavor to use the vocoder/physical link largely unchanged from its existing implementation for circuit voice. By not imposing any new format requirements on the vocoded frames, we allow development of future codecs to proceed with maximum flexibility.

By making use of the real-time nature of the physical link, it is possible to eliminate header overhead while retaining the transparency offered by the basic ROHC RTP profile. This is because the time at which a frame arrives can be used as an indication of the proper sequence number that should be assigned to the corresponding RTP packet. In the opposite direction, packets can be scheduled for transmission during the correct physical layer interval so that all end-to-end semantics are preserved.

In order for real-time to serve as a proxy for the RTP sequence number, it must be the case that the sequence number increments by one for every physical layer epoch. This would be satisfied if the transmitter sends a vocoded frame for every epoch, as is done by the existing cdma2000 vocoders even during silence intervals. Under this assumption, synchronization for play-back or for cryptosync could then be based on the accurate sequence number or timestamp included with each packet. Note that in 3G systems the mobile node transmits continuously even during silence so that the network may monitor power.
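
As a minimal sketch of how frame timing can stand in for the RTP sequence number, the following maps a physical frame index to RTP fields and back. The function names, the base offsets, and the timestamp step of 160 (which assumes an 8 kHz RTP clock and 20 ms frames) are assumptions made for illustration; they are not taken from any specification of the zero-byte profile.

```python
SEQ_MODULUS = 1 << 16   # RTP sequence numbers are 16 bits wide
TS_MODULUS = 1 << 32    # RTP timestamps are 32 bits wide

# Assumption: 8 kHz RTP clock and 20 ms frames give 160 ticks per frame.
TS_STEP = 160


def rtp_fields_for_frame(frame_index, base_seq, base_ts, ts_step=TS_STEP):
    """Decompressor side: derive (sequence number, timestamp) for the
    frame that arrived frame_index intervals after the reference frame."""
    seq = (base_seq + frame_index) % SEQ_MODULUS
    ts = (base_ts + frame_index * ts_step) % TS_MODULUS
    return seq, ts


def frame_offset_for_seq(seq, base_seq):
    """Compressor side: frame interval, relative to the reference frame,
    in which a packet with this sequence number should be sent.

    The result is only unambiguous within one 16-bit sequence cycle."""
    return (seq - base_seq) % SEQ_MODULUS


if __name__ == "__main__":
    print(rtp_fields_for_frame(3, base_seq=65534, base_ts=0))
    print(frame_offset_for_seq(1, base_seq=65534))
```

The sketch relies on the assumption stated above: one vocoded frame is sent in every physical layer epoch, so the sequence number advances by exactly one per interval.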

We assume that IPv4 Identifiers (used in fragmentation and re-assembly) can be taken from a contiguous range and do not need to be encoded with every packet. Only when a new range of Identifiers is chosen does an update need to be sent. We also assume that other fields of an RTP header such as CSRCs are updated rarely, if at all, and so these updates can be carried over the sister reliable data link to the peer without imposing much additional overhead. Finally, we assume that other events, such as a reset of the physical link due to hard handoff or a sequence number slippage due to clock drift, are similarly rare. All of these updates can be transported over the sister reliable data link when they occur. These updates can make use of the same mechanism used to initialize the flow parameters, in order to minimize the complexity of the decompressor, or special optimized update methods may be developed.

In summary, we pose the following set of requirements for a zero-byte compression scheme:

   - Little or no overhead compared to circuit voice.
   - No change to the format of voice payloads.
   - Transparent decompression of all IP/UDP/RTP fields.

To meet these requirements, we pose the following as assumptions on the operating environment:

   - The underlying link is synchronous: it provides precise indication of when a given frame arrived.
   - The RTP stream is well behaved: the sequence number is incremented every 20 milliseconds, and IPv4 Identifiers are taken from a contiguous range.

4. Reference Architecture

Our reference architecture is shown in Figure 2.

   [Diagram showing: Other NRT Apps, VOIP Control, Zero-Byte Control, IP Protocol Stack, Header Comp/Decomp, Data Link Layer, Audio Hardware, Codec Impl, Physical Channel(s), and the Peer System]

Figure 2.
Reference architecture for a system implementing zero-byte header compression.

The architecture diagram consists of nine components connected to a peer system. Note that we expect zero-byte header compression to be somewhat asymmetric in that it will usually be implemented between a mobile station, where the VOIP and other applications reside, and a peer network entity that is just a data link termination point and a first-hop Internet router. As such, the peer system in the network will likely be missing the audio hardware and codec implementation, and may not participate in the VOIP control. Also, the mobile station may not need to actually perform header compression and decompression if its codec implementation is connected directly to the physical channel, which will likely be required to achieve the desired latency guarantees.

The component named "Zero-Byte Control" would consist of the protocol logic used to set up and maintain the zero-byte header compression context. We assume this will be realized as a profile extension to the ROHC framework, and give a concrete realization of the needed protocols in a companion document [Hiller01].

In the following subsections we discuss each of the architectural elements in turn. The next section will discuss the interfaces between them.

3.1 Non Real-time Components

It is important to distinguish between the real-time and non real-time components of Figure 2. This is especially important for a relay model mobile station, as it impacts which elements of stock operating systems can be reused and which must be implemented as new real-time extensions. In this subsection we examine the non real-time components.

3.1.1 VOIP Control

The VOIP control component is the implementation of the call signaling protocol, such as SIP [Handley00] or H.323 [ITU-H323].
We make no assumptions on which protocol is used, and we do not require the network-side peer system to contain this element. The mobile station will use one of the VOIP signaling protocols to interact with call feature servers that could be anywhere on the Internet. We assume that this component will open network-layer connections and will have access to the transport endpoint identifiers for the IP/UDP/RTP flow. However, we do not require this element to actually process audio data; it will probably be implemented in user-space and would therefore add unpredictable latency to such flows.

3.1.3 IP Protocol Stack Implementation

We assume that the mobile station implements an IP protocol stack in conformance with RFC 1122 [Braden89]. Note that such an implementation is usually not capable of supporting hard real-time tasks.

3.1.4 Data Link Layer

The data link layer is the interface between the IP protocol stack and the wireless network device. For cdma2000, this will be PPP [TIA-IS835]. For GPRS, this will be LLC [ETSI-LLC], and for UMTS, this will be PDCP [ETSI-PDCP]. For cdma2000, we assume a mostly stock PPP implementation for interaction with the physical channels that support data and perform retransmission. However, because the data link layer is not a hard real-time component, we would not place it on the audio traffic path inside the mobile station.

3.1.5 Zero-Byte Control

The Zero-Byte Control component is responsible for negotiating the use of ROHC parameters with the peer system and for setting up context information such as the fixed portion of the IP/UDP/RTP header. It will interact with the VOIP control component to acquire these parameters, and will send them across the data link layer to the peer system.
It will also interact with the wireless device (possibly through the data link layer) to establish the physical audio channels and will identify the channel to be used when sending context information to the peer system. It will also need to receive indications from the physical layer when the channel is reset, such as during hard handoff, so that the context can be re-synchronized. Finally, it should get indications from the audio hardware and codec (or the header compression component if no codec implementation is present) about sequence number slippage due to clock drift so that re-synchronization updates can be sent to the peer.

3.1.6 Other Non-Real-Time Applications

We expect the terminal equipment to be a general-purpose computer, and as such it will have other applications running. These applications may interact with other components such as the IP protocol stack, but in general will not be hard real-time tasks. These applications must co-exist with all the other components.

3.2 Real-time Components

Because we make use of the real-time nature of the physical channel, several components must be implemented as real-time tasks. For a network model phone, this is similar to existing practice: a tightly integrated, real-time operating system on an embedded device schedules the audio sampling and playback to coincide with the physical frame rate of the wireless link. For a relay model terminal, we wish to make use of the audio hardware on the connected terminal equipment. This may require that the components be implemented using special real-time extensions to existing stock operating systems.

3.2.1 Audio Hardware

The audio hardware consists of the analog-to-digital (A/D) and digital-to-analog (D/A) converters used for sampling and playing back sound, along with the analog microphones and speakers. In a network model phone this consists of the integrated equipment that is part of the phone.
In a relay model terminal it would be the "sound card" or other audio peripheral. To achieve the required hard real-time performance we assume that special software drivers may be required in such relay model terminals.

3.2.2 Codec Implementation

The codec implementation converts the sampled audio to and from the special wireless-specific encoding format. For a network model phone, this encoding is carried out on dedicated Digital Signal Processing (DSP) hardware. In a relay model terminal, we assume this is performed on the general purpose CPU of the terminal equipment.

3.2.3 Physical Channel

As mentioned before, there will be two physical channels supporting the mobile station: one that runs RLP retransmission, supporting the latency tolerant data applications; and another that resembles a voice circuit. VOIP control signaling will traverse the data-oriented RLP channel, while the voice bearer traffic will traverse the real-time circuit-like channel. Both channels must be available to the upper layers regardless of whether a relay model or network model terminal is used.

The voice channel supports real-time traffic and performs no buffering. It will send a frame at precise, periodic intervals, such as 20 milliseconds for cdma2000. The codec implementation must be able to supply frames for the physical channel at exactly this rate.

We assume that a physical channel is established at a given point in time and that both peers can count the number of frame intervals that have elapsed so far. The channel may be "reset" by handoff events (such as hard handoffs in cdma2000) that do not necessarily result in a change of peer system, but which may require re-synchronization of compression/decompression state with the physical channel.
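
The frame counting assumption above can be sketched as follows. This is an illustrative sketch only: the class name FrameCounter, its methods, and the use of integer millisecond times are hypothetical, and a real implementation would take its timing from the physical layer rather than from a software clock.

```python
class FrameCounter:
    """Counts physical frame intervals since the channel was established
    or last reset. All times are integer milliseconds on a shared clock."""

    def __init__(self, established_ms, frame_ms=20):
        self.epoch_ms = established_ms  # time of the first frame interval
        self.frame_ms = frame_ms        # 20 ms for cdma2000, per the text

    def frames_elapsed(self, now_ms):
        """Number of whole frame intervals elapsed since the epoch."""
        if now_ms < self.epoch_ms:
            raise ValueError("time precedes channel establishment")
        return (now_ms - self.epoch_ms) // self.frame_ms

    def reset(self, reset_ms):
        """Restart the count, for example after a hard handoff. The peer
        must be told the new reference so both sides stay in step."""
        self.epoch_ms = reset_ms


if __name__ == "__main__":
    counter = FrameCounter(established_ms=1000)
    print(counter.frames_elapsed(1059))
    counter.reset(2000)
    print(counter.frames_elapsed(2040))
```

Because both peers derive the count from the same establishment or reset instant, a count taken on one side identifies the same physical frame on the other.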

3.2.4 Header Compression/Decompression

We expect the codec implementation to be directly connected to the physical channel on the mobile terminal side, and so concrete IP/UDP/RTP headers may not necessarily appear inside the mobile terminal. Therefore, the header compression/decompression component is only really necessary on the network side of the zero-byte header compression protocol.

This component is drawn next to the data link layer in the diagram, and may in fact be integrated into the data link layer implementation. It is responsible for classifying each packet coming down from the IP protocol stack against the fixed IP/UDP/RTP header fields we are attempting to compress. The value of these fields is established by the Zero-Byte control component and installed into the header compression component, possibly via the data link layer. Once the header has been stripped this component must schedule the payload for transmission on the physical layer at the appropriate frame interval, according to the sequence number and timestamp received in the header.

In the opposite direction, when packets arrive on the network side from the physical channel, this component is responsible for regenerating the proper IP/UDP/RTP header and passing the packet on to the IP protocol stack. It makes use of the physical arrival time to generate the proper timestamp and sequence number in the RTP header.

Because the header compression/decompression component is sending and receiving packets from the IP protocol stack, it is at best a soft real-time component. However, it must interact with the physical voice channel, which is a hard real-time component, both to properly record the frame arrival time and to schedule outgoing packets for transmission.
If the header compression/decompression is implemented in a separate network element from the physical channel, as is likely to be the case in the emerging cellular architectures [TIA-IS835], then this interaction could be accomplished with the proper use of sequence numbers on the interfaces between them so that each physical frame carries the information about precisely when it arrived or when it is to be transmitted.

4. Interfaces

In this section we examine the interfaces between the above components. We distinguish between those interfaces that should be implemented as protocols, suitable for standardization in the IETF or elsewhere, and those that should remain Application Programming Interfaces (APIs) that may or may not need to be standardized.

4.1 Protocol Reference Points

In terms of new protocols, the interfaces that need to be standardized are listed below. Some of these interfaces are opportunities for IETF protocols, while others should be carried out by other standards-setting organizations.

4.1.1 Zero-Byte Control to Data Link Layer

The Zero-Byte control component needs to negotiate the use of ROHC with its peer and convey the static portion of the IP/UDP/RTP header to the peer. This should be done in such a way that the network side is not required to participate in the VOIP control protocol. This means the network side depends on the mobile station to tell it which RTP flows the header stripping component should classify as appropriate for sending over the physical voice channel. Rather than create a new network-layer protocol, we advocate using new data link messages between the two systems to convey this information.

We advocate extending ROHC with a new zero-byte profile. The GEHCO proposal [Hiller01] is an attempt at this, and this work should be carried out in the IETF.
The particular mapping of ROHC onto a data link layer such as PPP should also be performed in the IETF. The mapping of ROHC onto other link layers is a job for other standards bodies.

4.1.2 Data Link Layer to Physical Channel

Mobile terminals running PPP will typically generate an octet stream that is appropriate for an underlying physical channel running RLP. However, prior to running PPP the mobile terminal must take steps to establish the channel. Also, we require that the terminal be able to dynamically establish and release the voice channels used for real-time audio. For a network model phone this may be supported by APIs within the phone, but for a relay model terminal this signaling needs to be carried out across a serial port. Such signaling is usually the province of a modem control protocol ("AT commands") and standardization is probably best carried out in the International Telecommunication Union (ITU).

Note that in addition to the usual signaling to establish and release channels, we also need to obtain identifying information for each channel, along with a precise reference for the time at which the first physical voice frame will be transported across the wireless link. This information will be used by the Zero-Byte control component to communicate the initial timestamp and sequence number offsets to the peer. It must be possible to signal this information during a running PPP session.

Also, the precise timing of handoff events in the network must be communicated to the Zero-Byte control implementation so that it can properly re-synchronize the compression state or carry out negotiation with the new peer system, if the handoff resulted in a change of attachment point. Note that some handoff events resulting in a reset of the physical channel will not result in a change of peer attachment point, depending on the architecture of the underlying access network.
If Zero-Byte control state is proactively transferred from a source peer system to a target peer system, the relationship between RTP timestamps and physical layer frames must be preserved.

4.1.3 Physical Channel to Codec or Header Compression/Decompression

As stated above, the physical channel will interface directly to the codec implementation on the mobile station side and to a header compression/decompression process on the network side. For a network model phone, the codec interface may be a proprietary API. However, for a relay model terminal, we must standardize a new way to transport the frames across a serial connection in real-time. This will require that we multiplex the real-time frames with the non-real-time data for PPP. This multiplexing could be carried out with the use of escape characters on the serial interface; again, this work is probably best carried out within the ITU. Any new special characters would need to be properly inserted into the ACCM of the PPP implementation.

On the network side, the physical voice channel may be separated from the header compression/decompression process by an IP network. If this is the case then each physical frame must carry a sequence number that indicates the exact frame time that it was received or is to be transmitted over the air. Standardization of such interfaces is best carried out within the 3rd Generation Partnership Projects (3GPPs).

4.2 API Reference Points

Other interfaces between the components are best done as Application Programming Interfaces (APIs) and may or may not need to be standardized. In any case we do not advocate the standardization of APIs within the IETF and we discuss these interfaces for illustration purposes only.

4.2.1 VOIP Control to Zero-Byte Control

The VOIP control component is responsible for end-to-end VOIP signaling such as SIP [Handley00] or H.323 [ITU-H323].
We expect these applications to be implemented by many different people and to use standard operating system interfaces. Also, these applications should work the same way when used in wireless or wireline settings, except that the codecs should be tailored for the specific link layer currently in use.

When used over wireless links, we expect that applications will want to make use of the optimized real-time path outlined above (audio hardware to codec to physical channel) rather than taking audio data into user space, performing a user space codec transformation, constructing RTP packets, and writing them to a standard UDP socket. Such user space manipulation of audio traffic would introduce unpredictable latency to the flow.

To enable the optimized real-time path, the VOIP control protocol should signal to the Zero-Byte control component that it has completed VOIP signaling and is ready to begin audio bearer flow. This signal might be a system call containing the IP/UDP/RTP parameters that have been negotiated and the codec to be used. This system call would be a one-line addition to existing VOIP client implementations.

4.2.2 Zero-Byte Control to Real-time Path

When the Zero-Byte control component receives a signal from the VOIP control component that the VOIP signaling has been completed, it must take the following steps:

   1) Open the new physical voice bearer channel;
   2) Send the peer system information about the flow, including the static header fields and identification of the physical bearer channel; and, finally,
   3) Trigger the audio hardware to begin sampling, and the codec implementation to begin encoding/decoding.

The first step could be accomplished via an interface to the data link layer, or may be accomplished directly.
In any case we need to acquire precise timing information about when the first voice physical frame will be sent and this timing needs to be related to the internal time reference that will be used for RTP timestamps. In the second step, this timing information is used to inform the peer what header fields should be placed on the first physical frame. The third step requires interaction with the real-time components such as the audio hardware and codec implementation, to enable the real-time data to start flowing.

Note that whenever an event takes place that requires re-synchronization of the compression state, such as a physical layer reset or sequence number slippage due to clock drift, the Zero-Byte control component must update its peer with the appropriate state. This update should include an offset, calculated from the time the channel was established or reset, indicating to which physical layer frame the update applies. Such offset-indicating updates should also be sent when any of the normally static header fields, such as TTL, TOS, or CSRCs change. This will enable completely transparent decompression of RTP header fields.

4.2.3 Header Compression/Decompression to Data Link Layer

The header compression component must classify all traffic from the IP protocol stack as to whether it is part of the RTP flow that needs to be sent on the voice physical channel. Because it must examine each packet, it will probably be fairly tightly integrated with the data link layer. The header decompression component produces IP packets from the physical voice frames and sends them up the IP protocol stack. Getting packets to the IP protocol stack may be implemented by passing the packets through the data link layer.

4.2.4 Other Interfaces

The mobile terminal potentially will be executing many simultaneous applications and we expect all of the standard interfaces (network sockets, GUI) to be present.
Note that ordinary applications may want to use the audio hardware while a voice call is in progress. This could be disallowed, or a special "audio mixer" process could be introduced between the audio hardware and the codec implementation to allow such simultaneous access. For example, a system beep noise might be mixed into the telephone call in such a way that only the mobile terminal user would hear it.

Much ado has been made about the proper reconstruction of the IP Identification field for each RTP packet. We note that RTP payloads are required to stay within the path MTU [Handley99] and should never experience fragmentation. However, in order to avoid any possibility of Identification field collision with other packets that may be fragmented, a new interface could be implemented between the Zero-Byte control component and the IP protocol stack to "reserve" a range of Identification values for use by the RTP flow. If the header decompression component always increments the Identification field by one for each reconstructed header, and wraps around to the beginning of the range when it would otherwise overflow, then no additional work is necessary to ensure uniqueness of IP Identification fields.

5. Conclusions

This draft has presented an architecture for zero-byte header compression and its implications for both a mobile station and the supporting network.

On the network side, with this architecture the peer in the network does not need to be aware of the VOIP control between the mobile and a SIP/H323 server that could be anywhere in the network. When the header compression/decompression is performed in a network element that is physically separated from the physical channel (e.g., a PDSN from 3GPP2 [TIA-IS835]), the hard real-time requirements on this element can be alleviated through the proper use of sequence numbers on its interface to the radio channel elements.
On the mobile side, this draft provides high-level requirements for support of zero-byte header compression in the form of protocol interfaces and APIs. Both monolithic network-style mobiles and relay-phone mobiles with laptops are discussed. Proper architecture of the mobile station allows the segregation of hard real-time processing from the non-real-time IP stack and applications.

Furthermore, convergence of wireline and wireless applications is a long-standing goal in the wireless industry. This architecture allows mobile end systems to run VOIP-based applications developed for wireline access in the wireless environment (although with wireless-specific codecs). The impact on VOIP applications could be as little as one line of code in the VOIP client itself.

Finally, the draft has outlined protocol work items suitable for the IETF as well as external standards bodies, including the ITU and 3rd Generation Partnership Projects. Any necessary APIs could be standardized by a collaboration between operating system vendors (open source or otherwise) and third-party application developers, driven by wireless service providers.

6. References

[Bormann01] Bormann, C. (ed.), "RObust Header Compression (ROHC)," draft-ietf-rohc-rtp-09.txt, March 2001. Work In Progress.

[Braden89] Braden, R. (ed.), "Requirements for Internet Hosts -- Communication Layers," RFC 1122, October 1989.

[Bradner96] Bradner, S., "The Internet Standards Process, Revision 3," RFC 2026, October 1996.

[ETSI-AMR] European Telecommunications Standards Institute, "Adaptive Multi-Rate (AMR) Speech Transcoding," 3G TS 26.090, February 2000.

[ETSI-LLC] European Telecommunications Standards Institute, GSM 04.64.

[ETSI-PDCP] European Telecommunications Standards Institute, 3G TS 25.323.

[Handley99] Handley, M., and Perkins, C., "Guidelines for Writers of RTP Payload Format Specifications," RFC 2736, December 1999.

[Handley00] Handley, M., Schulzrinne, H., Schooler, E., and Rosenberg, J., "SIP: Session Initiation Protocol," draft-ietf-sip-rfc2543bis-01.txt, August 2000. Work In Progress.

[Hiller01] Hiller, T., and McCann, P., "Good Enough Header COmpression (GEHCO)," draft-hiller-rohc-gehco-01.txt, February 2001. Work In Progress.

[ITU-H323] International Telecommunications Union, "Packet Based Multimedia Communications Systems," ITU-T Rec. H.323, September 1999.

[Shulzrinne96] Schulzrinne, H., Casner, S., Frederick, R., and Jacobson, V., "RTP: A Transport Protocol for Real-Time Applications," RFC 1889, January 1996.

[TIA-IS127] Telecommunications Industry Association, "Enhanced Variable Rate Codec, Speech Service Option 3 for Wideband Spread Spectrum Digital Systems," TIA/EIA/IS-127, February 1997.

[TIA-IS835] Telecommunications Industry Association, "Wireless IP Network Standard," TIA/EIA/IS-835, June 2000.

[TIA-SMV] Telecommunications Industry Association, "Selectable Mode Vocoder Service Option for Wideband Spread Spectrum Communication Systems," TIA PN4575, 3GPP2 C.P9001, 1997.

7. Authors' Addresses

Peter J. McCann
Lucent Technologies
Rm 2Z-305
263 Shuman Blvd
Naperville, IL 60566-7050
USA
Phone: +1 630 713 9359
FAX: +1 630 713 4982
EMail: mccap@lucent.com

Tom Hiller
Lucent Technologies
Rm 2F-218
263 Shuman Blvd
Naperville, IL 60566-7050
USA
Phone: +1 630 979 7673
FAX: +1 630 979 7673
EMail: tom.hiller@lucent.com

Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights.
Information on the IETF's procedures with respect to rights in standards-track and standards-related documentation can be found in BCP-11. Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF Secretariat.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to practice this standard. Please address the information to the IETF Executive Director.

Full Copyright Statement

Copyright (C) The Internet Society (2001). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.