MMUSIC Working Group                                    M. Garcia-Martin
Internet-Draft                                              M. Willekens
Intended status: Informational                    Nokia Siemens Networks
Expires: May 19, 2008                                              P. Xu
                                                     Huawei Technologies
                                                       November 16, 2007


Multiple Packetization Times in the Session Description Protocol (SDP): Problem Statement & Requirements
draft-garcia-mmusic-multiple-ptimes-problem-01.txt

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on May 19, 2008.

Abstract

This document provides a problem statement and requirements with respect to the presence of a single packetization time (ptime/maxptime) attribute in SDP media descriptions that contain several media formats (audio codecs).



Table of Contents

1.  Introduction
2.  Some Definitions
3.  Some references
4.  Problem Statement
5.  Requirements
6.  Solutions already proposed
    6.1.  Method 1
    6.2.  Method 2
    6.3.  Method 3
    6.4.  Method 4
    6.5.  Method 5
    6.6.  Method 6
    6.7.  Method 7
    6.8.  Method 8
    6.9.  Method 9
    6.10.  Method 10
7.  Conclusion and next steps
8.  Security Considerations
9.  IANA Considerations
10.  References
    10.1.  Normative References
    10.2.  Informative References
    Authors' Addresses
    Intellectual Property and Copyright Statements





1.  Introduction

The Session Description Protocol (SDP) [1] provides a means to describe multimedia sessions for the purposes of session announcement, session invitation, and other forms of multimedia session initiation. A session description in SDP includes the session name and purpose, the media comprising the session, the information needed to receive the media (addresses, ports, formats, etc.), and other information.

In the SDP media description part, the m-line contains the media type (e.g. audio), a transport port, a transport protocol (e.g. RTP/AVP) and a media format description which depends on the transport protocol.

For the transport protocol RTP/AVP or RTP/SAVP, the media format sub-field can contain a list of RTP payload type numbers; see Table 4 of the RTP Profile for Audio and Video Conferences with Minimal Control [14]. For example, "m=audio 49232 RTP/AVP 3 15 18" indicates the audio encoders GSM, G728 and G729.

Further, the media description part can contain additional attribute lines that complement or modify the media description line. Of interest for this memo are the 'ptime' and 'maxptime' attributes. According to RFC 4566 [1], the 'ptime' attribute gives the length of time in milliseconds represented by the media in a packet, and 'maxptime' gives the maximum amount of media that can be encapsulated in each packet, expressed as time in milliseconds. These attributes modify the whole media description line, which can contain an extensive list of payload types. In other words, these attributes are not specific to a given codec.

RFC 4566 [1] also indicates that it should not be necessary to know ptime to decode RTP or vat audio, since the 'ptime' attribute is intended as a recommendation for the encoding/packetization of audio. However, once more, the existing 'ptime' attribute defines the desired packetization time for all the payload types listed in the corresponding media description line.

End-devices can be configured with several codecs, and a different packetization time can be indicated for each codec. However, there is no clear way to exchange this information between user agents, which can result in lower voice quality, network problems, or performance problems in the end-devices.




2.  Some Definitions

The Session Description Protocol (SDP) [1] defines the 'ptime' and 'maxptime' attributes as follows:

a=ptime:<packet time>

This gives the length of time in milliseconds represented by the media in a packet. This is probably only meaningful for audio data, but may be used with other media types if it makes sense. It should not be necessary to know ptime to decode RTP or vat audio, and it is intended as a recommendation for the encoding/packetization of audio. It is a media-level attribute, and it is not dependent on charset.

a=maxptime:<maximum packet time>

This gives the maximum amount of media that can be encapsulated in each packet, expressed as time in milliseconds. The time SHALL be calculated as the sum of the time the media present in the packet represents. For frame-based codecs, the time SHOULD be an integer multiple of the frame size. This attribute is probably only meaningful for audio data, but may be used with other media types if it makes sense. It is a media-level attribute, and it is not dependent on charset. Note that this attribute was introduced after RFC 2327, and non-updated implementations will ignore this attribute.

Additional encoding parameters MAY be defined in the future, but codec-specific parameters SHOULD NOT be added. Parameters added to an "a=rtpmap:" attribute SHOULD only be those required for a session directory to make the choice of appropriate media to participate in a session. Codec-specific parameters should be added in other attributes (for example, "a=fmtp:").

Note: RTP audio formats typically do not include information about the number of samples per packet. If a non-default (as defined in the RTP Audio/Video Profile) packetization is required, the "ptime" attribute is used as given above.
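As an informal illustration of how these definitions relate to frame-based codecs, the following Python sketch derives the number of codec frames carried per packet from a given 'ptime' and checks that a 'maxptime' value is an integer multiple of the frame size, as recommended above. The frame durations used are assumptions taken from the RTP A/V profile; the code is an illustration only, not part of any specification.

# Illustration only: relate ptime/maxptime to frame-based codecs.
# Frame durations (ms) are assumed values from the RTP A/V profile.
FRAME_MS = {"G729": 10, "G723": 30, "G728": 2.5}

def frames_per_packet(codec: str, ptime_ms: float) -> float:
    """Number of codec frames carried when each packet spans ptime_ms."""
    return ptime_ms / FRAME_MS[codec]

def maxptime_is_valid(codec: str, maxptime_ms: float) -> bool:
    """maxptime SHOULD be an integer multiple of the codec frame size."""
    return maxptime_ms % FRAME_MS[codec] == 0

print(frames_per_packet("G729", 20))   # 2.0 frames per packet
print(maxptime_is_valid("G723", 60))   # True: 2 x 30 ms frames
print(maxptime_is_valid("G723", 40))   # False: not a multiple of 30 ms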




3.  Some references

Many RFCs refer to the 'ptime' and 'maxptime' attributes to give definitions, recommendations, requirements, or default values. The following list summarizes them.

SDP [1] gives the definitions of ptime/maxptime.

The SDP offer/answer model [2] gives some requirements on ptime for the offerer and the answerer. If the ptime attribute is present for a stream, it indicates the desired packetization interval that the offerer would like to receive. The ptime attribute MUST be greater than zero. The answerer MAY include a non-zero ptime attribute for any media stream; this indicates the packetization interval that the answerer would like to receive. There is no requirement that the packetization interval be the same in each direction for a particular stream.

The SDP transport-independent bandwidth modifier [5] indicates that ptime could be a candidate for deriving bandwidth information, but that it should not be used for that purpose; the use of another parameter is proposed.

SDP conventions for ATM bearer connections [6]: it is not recommended that ptime be used in ATM applications, since packet period information is provided by other parameters (e.g., the profile type and number in the 'm' line, and the 'vsel', 'dsel' and 'fsel' attributes). For AAL1 applications, 'ptime' is not applicable and should be flagged as an error. If used in AAL2 and AAL5 applications, 'ptime' should be consistent with the rest of the SDP description. The 'vsel', 'dsel' and 'fsel' attributes refer generically to codecs; they can be used for service-specific codec negotiation and assignment in non-ATM as well as ATM applications. The 'vsel' attribute indicates a prioritized list of one or more 3-tuples for voice service. Each 3-tuple indicates a codec, an optional packet length and an optional packetization period. This complements the 'm' line information and should be consistent with it. The 'vsel' attribute refers to all directions of a connection: for a bidirectional connection, these are the forward and backward directions; for a unidirectional connection, this is either the backward or the forward direction. The 'vsel' attribute is not meant to be used with bidirectional connections that have asymmetric codec configurations described in a single SDP descriptor; for these, the 'onewaySel' attribute should be used. A 'vsel' 3-tuple is structured as an encodingName, a packetLength and a packetTime. The packetLength is a decimal integer representation of the packet length in octets. The packetTime is a decimal integer representation of the packetization interval in microseconds. The parameters packetLength and packetTime can be set to "-" when not needed. Also, the entire 'vsel' media attribute line can be omitted when not needed.

SIP/SDP static dictionary for SigComp [7].

SIP device requirements and configuration [8]: in some cases, operators want to control which codecs may be used in their network. The desired subset of codecs supported by the device SHOULD be configurable, along with the order of preference. Service providers SHOULD have the possibility of plugging in their own codecs of choice. The codec settings MAY include the packet length and other parameters like silence suppression or comfort noise generation. The set of available codecs will be used in the codec negotiation according to RFC 3264. Example: Codecs="speex/8000;ptime=20;cng=on,gsm;ptime=30"

RTSP [9]: format-specific parameters are conveyed using the "fmtp" media attribute. The syntax of the "fmtp" attribute is specific to the encoding(s) to which the attribute refers. Note that the packetization interval is conveyed using the "ptime" attribute.

MGCP [10]: the packetization period in milliseconds is encoded as the keyword "p", followed by a colon and a decimal number. If the Call Agent specifies a range of values, the range is specified as two decimal numbers separated by a hyphen (as specified for the "ptime" parameter in SDP).

MGCP ATM package [11]: packet time changed ("ptime(#)"): if armed via an R:atm/ptime, a media gateway signals a packetization period change through an O:atm/ptime. The decimal number in parentheses is optional; it is the new packetization period in milliseconds. In AAL2 applications, the pftrans event can be used to cover packetization period changes (and codec changes). Voice codec selection (vsel): this is a prioritized list of one or more 3-tuples describing voice service. Each vsel 3-tuple indicates a codec, an optional packet length and an optional packetization period.

Gateway control protocol [12].

Registration of the text/red MIME sub-type [13].

RTP/AVP [14].

RTP payload for MPEG-4 audio/visual streams [15].

RTP payload for G.722.1 [16].

RTP payload for AMR and AMR-WB [17]: the maxptime SHOULD be a multiple of the frame size. If this parameter is not present, the sender MAY encapsulate any number of speech frames into one RTP packet.

RTP payload for distributed speech recognition [18]: the maxptime SHOULD be a multiple of the frame pair size (20 ms). If this parameter is not present, maxptime is assumed to be 80 ms. Note that, since the performance of most speech recognizers is extremely sensitive to consecutive FP losses, if the user of the payload format expects a high packet loss ratio for the session, it MAY explicitly choose a maxptime value for the session that is shorter than the default value.

RTP payload for EVRC and SMV [19]: the parameters maxptime and maxinterleave are exchanged at the initial setup of the session. In one-to-one sessions, the sender MUST respect these values set by the receiver, and MUST NOT interleave/bundle more packets than what the receiver signals that it can handle. This ensures that the receiver can allocate a known amount of buffer space that will be sufficient for all interleaving/bundling used in that session. During the session, the sender may decrease the bundling value or interleaving length (so that less buffer space is required at the receiver), but never exceed the maximum value set by the receiver. This prevents the situation where a receiver needs to allocate more buffer space in the middle of a session but is unable to do so. Additionally, senders MUST NOT bundle more codec data frames in a single RTP packet than indicated by maxptime (see Section 12 of that RFC) if it is signaled, and SHOULD NOT bundle more codec data frames in a single RTP packet than will fit in the MTU of the underlying network. If maxptime is not signaled, the default maxptime value SHALL be 200 milliseconds.

RTP payload for iLBC [20]: the maxptime SHOULD be a multiple of the frame size. This attribute is probably only meaningful for audio data, but may be used with other media types if it makes sense. It is a media attribute, and is not dependent on charset. Note that this attribute was introduced after RFC 2327, and non-updated implementations will ignore it. The ptime parameter cannot be used for the purpose of specifying the iLBC operating mode, due to the fact that for certain values it is impossible to distinguish which mode is to be used (e.g., when ptime=60, it is impossible to distinguish whether a packet is carrying 2 frames of 30 ms or 3 frames of 20 ms).

RTP payload for a 64 kbit/s transparent call [21].

RTP payload formats for distributed speech recognition [22]: if maxptime is not present, maxptime is assumed to be 80 ms. Note that, since the performance of most speech recognizers is extremely sensitive to consecutive FP losses, if the user of the payload format expects a high packet loss ratio for the session, it MAY explicitly choose a maxptime value for the session that is shorter than the default value.

RTP payload for AC-3 [23].

RTP payload for BroadVoice speech [24]: the maxptime SHOULD be a multiple of the duration of a single codec data frame (5 ms).

RTP payload for VMR-WB [25]: the parameters "maxptime" and "ptime" should in most cases not affect interoperability; however, the setting of these parameters can affect the performance of the application.

RTP payload for AMR-WB+ [26].

RTP payload MIME type registration [27].




4.  Problem Statement

The packetization time is an important parameter that helps reduce packet overhead. Many voice codecs use a certain frame length to determine the coded voice filter parameters and try to find an optimum between the perceived voice quality (measured by the Mean Opinion Score (MOS)) and the required bit rate. When a packet-oriented network is used for the transfer, the packet header adds overhead. As such, it makes sense to combine several voice frames in one packet (up to a Maximum Transmission Unit (MTU)) to find a good balance between the required network resources, the end-device resources, and the perceived voice quality as influenced by packet loss, packet delay and jitter. When the packet size decreases, the bandwidth efficiency is reduced. When the packet size increases, the packetization delay can have a negative impact on the perceived voice quality.
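This tradeoff can be quantified with a back-of-the-envelope calculation. The Python sketch below is illustrative only: it assumes IPv4/UDP/RTP headers of 20 + 8 + 12 = 40 octets per packet and a G.729 payload rate of 8 kbit/s, and shows how the total bandwidth in one direction decreases, while the packetization delay increases, as the packetization time grows.

# Illustrative bandwidth calculation; header sizes and codec rate are
# assumptions (IPv4 + UDP + RTP = 40 octets, G.729 at 8 kbit/s).
HEADER_OCTETS = 20 + 8 + 12
CODEC_KBPS = 8

def total_kbps(ptime_ms: int) -> float:
    """Total IP-layer bandwidth in one direction for a given ptime."""
    payload_octets = CODEC_KBPS * ptime_ms / 8   # octets of media per packet
    packets_per_second = 1000 / ptime_ms
    return (payload_octets + HEADER_OCTETS) * 8 * packets_per_second / 1000

for ptime in (10, 20, 30, 60):
    print(f"ptime={ptime:2d} ms -> {total_kbps(ptime):5.1f} kbit/s")
# ptime=10 ms ->  40.0 kbit/s
# ptime=20 ms ->  24.0 kbit/s
# ptime=30 ms ->  18.7 kbit/s
# ptime=60 ms ->  13.3 kbit/s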

The RTP Profile for Audio and Video Conferences with Minimal Control [14], Table 1, indicates the frame size and default packetization time for different codecs. The G728 codec has a frame size of 2.5 ms/frame and a default packetization time of 20 ms/packet. The G729 codec has a frame size of 10 ms/frame and a default packetization time of 20 ms/packet.

As more and more telephony traffic is carried over IP networks, the quality perceived by the end-user should be no worse than that of classical telephony services. For VoIP service providers, it is very important that endpoints receive audio with the best possible codec and packetization time. The packetization time depends on the selected codec for the audio communication and on other factors, such as the Maximum Transmission Unit (MTU) of the network and the type of access network technology.

As such, the packetization time is clearly a function of the codec and the network access technology. During the establishment of a new session or the modification of an existing session, an endpoint should be able to express its preference with respect to the packetization time for each codec. This would mean that the creator of the SDP prefers the remote endpoint to use a certain packetization time when sending media with that codec.

RFC 4566 [1] provides the means for expressing a packetization time that affects all the payload types declared in the media description line. So, there are no means to indicate the desired packetization time on a per-payload-type basis. Implementations have been using proprietary mechanisms for indicating the packetization time per payload type, leading to a lack of interoperability in this area. One of these mechanisms is the 'maxmptime' attribute, defined in ITU-T Recommendation V.152 [3], which "indicates the supported packetization period for all codec payload types". Another one is the 'mptime' attribute, defined in the PacketCable Network-Based Call Signaling Protocol Specification [4], which indicates "a list of packetization period values the endpoint is capable of using (sending and receiving) for this connection". While all have similar semantics, there is obviously no interoperability between them, creating a nightmare for the implementer who happens to be defining a common SDP stack for different applications.

A few RTP payload format descriptions, such as RFC 3267 [17], RFC 3016 [15], and RFC 3952 [20], indicate that the packetization time for such payloads should be indicated in the 'ptime' attribute in SDP. However, since the 'ptime' attribute affects all the payload formats included in the media description line, it is not possible to create a media description line that contains all the mentioned payload formats with different packetization times. The workarounds range from using a single packetization time for all the payload types to creating a media description line that contains a single payload type.

The issue of a given packetization time for a specific codec has been captured in past RFCs. For example, RFC 4504 [8] contains a set of requirements for SIP telephony devices. Section 3.8 of that RFC also provides background information on the need for a packetization time, which could be set by either the user or the administrator of the device, on a per-codec basis. However, once more, if several payload formats are offered in the same media description line in SDP, there is no way to indicate different packetization times per payload format.

Below is an example which indicates how the ptime can cause interworking problems between different implementations.




m=audio 1234 RTP/AVP 0 4 8
a=ptime:30
 Example1 

The media formats 0 and 8 are PCMU (u-law) and PCMA (A-law), sample-based codecs with a default packetization time of 20 ms, although a packetization time of 30 ms can also be used. Media format 4 is G723, a frame-based codec with a frame size of 30 ms. As such, the lowest packetization time that suits all these codecs is 30 ms. If the receiver initializes its voice sample buffer based on this 30 ms value, but the sender sends the media with the PCMU codec at its default packetization time of 20 ms, the receiver has to wait for another voice packet before its buffer is filled with a total of 30 ms of audio. This can cause disruptions in the synchronous playback of the digitized voice.
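To make the mismatch concrete, the following sketch (an illustration using the values assumed in the example above, not a normative behaviour) computes how much media the receiver ends up buffering before it can start playout when it has dimensioned its buffer for the signalled 30 ms while the sender packetizes PCMU at its 20 ms default.

import math

# Assumed values from Example 1: the receiver sizes its playout buffer for
# the signalled ptime, while the sender uses the PCMU default packetization.
SIGNALLED_PTIME_MS = 30
ACTUAL_PTIME_MS = 20

packets_needed = math.ceil(SIGNALLED_PTIME_MS / ACTUAL_PTIME_MS)
buffered_ms = packets_needed * ACTUAL_PTIME_MS

print(f"packets needed to reach {SIGNALLED_PTIME_MS} ms of audio: {packets_needed}")
print(f"media buffered before playout: {buffered_ms} ms "
      f"({buffered_ms - SIGNALLED_PTIME_MS} ms more than planned)")
# packets needed to reach 30 ms of audio: 2
# media buffered before playout: 40 ms (10 ms more than planned)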




5.  Requirements

The main requirement comes from the implementation and media gateway community, which makes use of hardware-based solutions, e.g., DSP or FPGA implementations with silicon constraints on the amount of buffer space.

Some implementations use the ptime/codec information to make QoS budget calculations: when the packetization time is known for a codec with a certain frame size and frame data rate, the throughput efficiency can be calculated.

Currently, ptime and maxptime are "indication" attributes and are optional. When these parameters are used for resource reservation and for hardware initialization, a value negotiated between the "offerer" and the "answerer" becomes a requirement.
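The following sketch illustrates why a negotiated value matters for such hardware-based implementations. It computes the per-channel receive-buffer space a gateway would have to reserve; the codec bit rates, the maxptime of 60 ms and the buffer depth of two packets are assumptions chosen for illustration, not values taken from any specification.

# Illustrative per-channel buffer sizing for a DSP/FPGA-based gateway.
# Codec payload rates (kbit/s), maxptime and buffer depth are assumed values.
CODEC_KBPS = {"PCMU": 64, "G729": 8, "G723": 6.3}

def buffer_octets(codec: str, maxptime_ms: float, depth_packets: int = 2) -> int:
    """Worst-case buffer space for depth_packets packets of maxptime media."""
    octets_per_packet = CODEC_KBPS[codec] * maxptime_ms / 8
    return int(octets_per_packet * depth_packets)

# With a single media-level maxptime, every channel must be provisioned for
# the worst case over all offered codecs; a per-codec value would allow a
# tighter reservation once the codec actually in use is known.
worst_case = max(buffer_octets(codec, 60) for codec in CODEC_KBPS)
print(f"reserved per channel with one maxptime of 60 ms: {worst_case} octets")
print(f"needed if G729 with maxptime 30 ms were negotiated: "
      f"{buffer_octets('G729', 30)} octets")
# reserved per channel with one maxptime of 60 ms: 960 octets
# needed if G729 with maxptime 30 ms were negotiated: 60 octets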

There can be different sources for the ptime/maxptime values, e.g., the RTP/AVP profile, the end-user device configuration, the network operator, intermediaries, or the receiver.

The codec and the ptime/maxptime values can differ between the uplink and downlink directions.




6.  Solutions already proposed

In recent years, different solutions have been proposed and implemented with the goal of making the ptime a function of the codec instead of the media description that contains the list of codecs. The purpose of this list is only to indicate what kinds of proposals have already been made to address the SDP interworking issues caused by differing implementations and interpretations of the RFCs. It is just a list and does not impose any preference for a certain solution.

In all these proposals, a semantic grouping of the codec-specific information is made, either by giving a new interpretation to the sequence of the parameters or by providing new additional attributes.

All these methods go against the basic rule indicated in the RFCs, which state that ptime and maxptime are media specific and NOT codec specific. They do not solve the interworking issues; instead, they make them worse by adding yet more interpretations and implementations.

To avoid further divergence, the implementation community is strongly asking for a standardized solution.




6.1.  Method 1

Write the rtpmap first, followed by the ptime when it is related to the codec.




m=audio 1234 RTP/AVP 4 0
a=rtpmap:4 G723/8000
a=rtpmap:0 PCMU/8000
a=ptime:20
a=fmtp:4 bitrate=6400
 Method1 

Some SDP encoders first write the media line, followed by the rtpmaps and then the value attributes.




6.2.  Method 2

Grouping of all codec specific information together.




m=audio 1234 RTP/AVP 4 0
a=rtpmap:4 G723/8000
a=fmtp:4 bitrate=6400
a=rtpmap:0 PCMU/8000
a=ptime:20
 Method2 

Most implementers are in favor of this proposal, i.e., writing the value attributes associated with an rtpmap immediately after that rtpmap.
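As a hedged illustration of this order-based convention, the sketch below attaches each 'a=ptime' line to the most recently seen 'a=rtpmap' line. This is only a sketch of the interpretation described above, not a standardized parsing rule; RFC 4566 defines 'ptime' as a media-level attribute, which is precisely why such ordering conventions are fragile across implementations.

# Sketch of the ordering convention of Methods 1 and 2: a ptime line is
# taken to apply to the payload type of the most recent rtpmap above it.
# This is NOT what RFC 4566 specifies (ptime is media-level).
def ptime_per_payload(sdp_lines):
    result, current_pt = {}, None
    for line in sdp_lines:
        if line.startswith("a=rtpmap:"):
            current_pt = line.split(":", 1)[1].split()[0]
        elif line.startswith("a=ptime:") and current_pt is not None:
            result[current_pt] = int(line.split(":", 1)[1])
    return result

method2_example = [
    "m=audio 1234 RTP/AVP 4 0",
    "a=rtpmap:4 G723/8000",
    "a=fmtp:4 bitrate=6400",
    "a=rtpmap:0 PCMU/8000",
    "a=ptime:20",
]
print(ptime_per_payload(method2_example))   # {'0': 20} under this convention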




6.3.  Method 3

Use the ptime for every codec after its rtpmap definition.




m=audio 1234 RTP/AVP 0 18  4
a=rtpmap:18 G729/8000
a=ptime:30

a=rtpmap:0 PCMU/8000
a=ptime:40

a=rtpmap:4 G723/8000
a=ptime:60
 Method3 




6.4.  Method 4

Create a new "mptime" (multiple ptime) attribute with a construct similar to the m-line.




m=audio 1234 RTP/AVP 0 18  4
a=mptime:40 30 60
 Method4 




6.5.  Method 5

Use of a new "x-ptime" attribute




6.6.  Method 6

Use of different m-lines with one codec per m-line




m=audio 1234 RTP/AVP 0
a=rtpmap:0 PCMU/8000
a=ptime:40

m=audio 1234 RTP/AVP 18
a=rtpmap:18 G729/8000
a=ptime:30

m=audio 1234 RTP/AVP 4
a=rtpmap:4 G723/8000
a=ptime:60
 Method6 




6.7.  Method 7

Use of the ptime in the fmtp attribute




m=audio 1234 RTP/AVP 4 18
a=rtpmap:18 G729/8000
a=fmtp:18 annexb=yes;ptime=20
a=maxptime:40

a=rtpmap:4 G723/8000
a=fmtp:4 bitrate=6.3;annexa=yes;ptime=30
a=maxptime:60
 Method7 




6.8.  Method 8

Use of the 'vsel' parameter as done for ATM bearer connections. The following example indicates a first preference for G.729 or G.729a (both are interoperable) as the voice encoding scheme. A packet length of 10 octets and a packetization interval of 10 ms are associated with this codec. G726-32 is the second preference stated in this line, with an associated packet length of 40 octets and a packetization interval of 10 ms. If the packet length and packetization interval are intended to be omitted, this media attribute line contains '-', as in the second line of the example. An informal sketch of how such a line decomposes into 3-tuples follows the example.




a=vsel:G729 10 10000 G726-32 40 10000
a=vsel:G729 - - G726-32 - -
 Method8 
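As an informal reading aid, the sketch below parses a 'vsel' line into the 3-tuples described in Section 3 (encoding name, packet length in octets, packetization interval in microseconds, with "-" for omitted values). It is an illustration only and is not taken from RFC 3108.

# Illustration of the vsel 3-tuple structure:
# <encodingName> <packetLength octets | "-"> <packetTime microseconds | "-">
def parse_vsel(line: str):
    tokens = line.split(":", 1)[1].split()
    tuples = []
    for i in range(0, len(tokens), 3):
        name, length, ptime_us = tokens[i:i + 3]
        tuples.append((name,
                       None if length == "-" else int(length),
                       None if ptime_us == "-" else int(ptime_us)))
    return tuples

print(parse_vsel("a=vsel:G729 10 10000 G726-32 40 10000"))
# [('G729', 10, 10000), ('G726-32', 40, 10000)]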




6.9.  Method 9

Use of the V.152 "maxmptime" attribute




6.10.  Method 10

Use of the PacketCable "mptime" attribute




7.  Conclusion and next steps

This memo advocates the need for a standardized mechanism to indicate the packetization time on a per-codec basis, allowing the creator of the SDP to include, in the same media description line, several payload formats with different packetization times.

This memo encourages discussion on the MMUSIC WG mailing list in the IETF. The ultimate goal is to define a standard mechanism that fulfils the requirements highlighted in this memo.

The goal is to find a solution that does not require changes to implementations that have followed the existing RFC guidelines and that are able to receive any packetization time.

A clear solution has to be described for the resource constraint problem in hardware-based solutions: either an extension or modification of the current SDP, or a clarification of how these issues can be solved with the existing RFCs.




8.  Security Considerations

This memo discusses a problem statement and requirements. As such, no protocol that can suffer attacks is defined.




9.  IANA Considerations

This document does not request IANA to take any action.




10.  References




10.1. Normative References

[1] Handley, M., Jacobson, V., and C. Perkins, “SDP: Session Description Protocol,” RFC 4566, July 2006.
[2] Rosenberg, J. and H. Schulzrinne, “An Offer/Answer Model with Session Description Protocol (SDP),” RFC 3264, June 2002.



10.2. Informative References

[3] ITU-T, “Procedures for supporting voice-band data over IP networks,” ITU-T Recommendation V.152, January 2005.
[4] PacketCable, “PacketCable Network-Based Call Signaling Protocol Specification,” PacketCable PKT-SP-EC-MGCP-I11-050812, August 2005.
[5] Westerlund, M., “A Transport Independent Bandwidth Modifier for the Session Description Protocol (SDP),” RFC 3890, September 2004.
[6] Kumar, R. and M. Mostafa, “Conventions for the use of the Session Description Protocol (SDP) for ATM Bearer Connections,” RFC 3108, May 2001.
[7] Garcia-Martin, M., Bormann, C., Ott, J., Price, R., and A. Roach, “The Session Initiation Protocol (SIP) and Session Description Protocol (SDP) Static Dictionary for Signaling Compression (SigComp),” RFC 3485, February 2003.
[8] Sinnreich, H., Lass, S., and C. Stredicke, “SIP Telephony Device Requirements and Configuration,” RFC 4504, May 2006.
[9] Schulzrinne, H., Rao, A., and R. Lanphier, “Real Time Streaming Protocol (RTSP),” RFC 2326, April 1998.
[10] Andreasen, F. and B. Foster, “Media Gateway Control Protocol (MGCP) Version 1.0,” RFC 3435, January 2003.
[11] Kumar, R., “Asynchronous Transfer Mode (ATM) Package for the Media Gateway Control Protocol (MGCP),” RFC 3441, January 2003.
[12] Groves, C., Pantaleo, M., Anderson, T., and T. Taylor, “Gateway Control Protocol Version 1,” RFC 3525, June 2003.
[13] Jones, P., “Registration of the text/red MIME Sub-Type,” RFC 4102, June 2005.
[14] Schulzrinne, H. and S. Casner, “RTP Profile for Audio and Video Conferences with Minimal Control,” STD 65, RFC 3551, July 2003.
[15] Kikuchi, Y., Nomura, T., Fukunaga, S., Matsui, Y., and H. Kimata, “RTP Payload Format for MPEG-4 Audio/Visual Streams,” RFC 3016, November 2000.
[16] Luthi, P., “RTP Payload Format for ITU-T Recommendation G.722.1,” RFC 3047, January 2001.
[17] Sjoberg, J., Westerlund, M., Lakaniemi, A., and Q. Xie, “Real-Time Transport Protocol (RTP) Payload Format and File Storage Format for the Adaptive Multi-Rate (AMR) and Adaptive Multi-Rate Wideband (AMR-WB) Audio Codecs,” RFC 3267, June 2002.
[18] Xie, Q., “RTP Payload Format for European Telecommunications Standards Institute (ETSI) European Standard ES 201 108 Distributed Speech Recognition Encoding,” RFC 3557, July 2003.
[19] Li, A., “RTP Payload Format for Enhanced Variable Rate Codecs (EVRC) and Selectable Mode Vocoders (SMV),” RFC 3558, July 2003.
[20] Duric, A. and S. Andersen, “Real-time Transport Protocol (RTP) Payload Format for internet Low Bit Rate Codec (iLBC) Speech,” RFC 3952, December 2004.
[21] Kreuter, R., “RTP Payload Format for a 64 kbit/s Transparent Call,” RFC 4040, April 2005.
[22] Xie, Q. and D. Pearce, “RTP Payload Formats for European Telecommunications Standards Institute (ETSI) European Standard ES 202 050, ES 202 211, and ES 202 212 Distributed Speech Recognition Encoding,” RFC 4060, May 2005.
[23] Link, B., Hager, T., and J. Flaks, “RTP Payload Format for AC-3 Audio,” RFC 4184, October 2005.
[24] Chen, J., Lee, W., and J. Thyssen, “RTP Payload Format for BroadVoice Speech Codecs,” RFC 4298, December 2005.
[25] Ahmadi, S., “Real-Time Transport Protocol (RTP) Payload Format for the Variable-Rate Multimode Wideband (VMR-WB) Audio Codec,” RFC 4348, January 2006.
[26] Sjoberg, J., Westerlund, M., Lakaniemi, A., and S. Wenger, “RTP Payload Format for the Extended Adaptive Multi-Rate Wideband (AMR-WB+) Audio Codec,” RFC 4352, January 2006.
[27] Casner, S., “Media Type Registration of Payload Formats in the RTP Profile for Audio and Video Conferences,” RFC 4856, February 2007.



Authors' Addresses

  Miguel A. Garcia-Martin
  Nokia Siemens Networks
  P.O.Box 6
  Nokia Siemens Networks, FIN 02022
  Finland
Email:  miguel.garcia@nsn.com
  
  Marc Willekens
  Nokia Siemens Networks
  Atealaan 34
  Herentals, BE 2200
  Belgium
Email:  marc.willekens@nsn.com
  
  Peili Xu
  Huawei Technologies
  Bantian
  Longgang, Shenzhen 518129
  China
Email:  xupeili@huawei.com



Full Copyright Statement

Intellectual Property