Codec Working Group                                      Juin-Hwey Chen
Internet-Draft                                              Jes Thyssen
Intended status: Standards Track                   Broadcom Corporation
Expires: October 28, 2010                                April 29, 2010


                        BroadVoice Speech Codecs
                         draft-chen-bv-codec-00

Status of this Memo

This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.  This document may contain material
from IETF Documents or IETF Contributions published or made publicly
available before November 10, 2008.  The person(s) controlling the
copyright in some of this material may not have granted the IETF Trust
the right to allow modifications of such material outside the IETF
Standards Process.  Without obtaining an adequate license from the
person(s) controlling the copyright in such materials, this document
may not be modified outside the IETF Standards Process, and derivative
works of it may not be created outside the IETF Standards Process,
except to format it for publication as an RFC or to translate it into
languages other than English.

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups.  Note that other
groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time.  It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

This Internet-Draft will expire on October 28, 2010.

Copyright Notice

Copyright (c) 2010 IETF Trust and the persons identified as the
document authors.  All rights reserved.
Chen and Thyssen        Expires October 28, 2010                [Page 1]

Internet-Draft              BroadVoice codec                  April 2010

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document.  Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document.

Abstract

BroadVoice(R) [bv-website] is a family of two open-source speech codecs
suitable for Voice over IP (VoIP) applications.  It is designed to
achieve high speech quality with relatively low complexity and a very
low coding delay.  BroadVoice consists of two variants: a 16 kb/s
narrowband codec for 8 kHz sampling called BroadVoice16, or BV16, and a
32 kb/s wideband codec for 16 kHz sampling called BroadVoice32, or
BV32.  BV16 and BV32 are standard codecs of PacketCable(TM), SCTE(R),
and ANSI for VoIP applications in cable telephony, and they are also
listed as optional codecs in the ITU-T Recommendations J.161 and J.361,
respectively.  This document describes the BV16 and BV32 speech codecs.

Table of Contents

1.  Introduction
2.  Overview of the BroadVoice Family of Codecs
3.  The BroadVoice16 (BV16) Codec
4.  The BroadVoice32 (BV32) Codec
5.  Security Considerations
6.  IANA Considerations
7.  Acknowledgments
8.  References
  8.1.  Normative References
  8.2.  Informative References
Authors' Addresses

1.  Introduction

This document describes the BroadVoice family of speech codecs, which
consists of (1) a 16 kb/s narrowband codec called BroadVoice16, or
BV16, operating at a sampling rate of 8 kHz, and (2) a 32 kb/s wideband
codec called BroadVoice32, or BV32, operating at a sampling rate of
16 kHz.

The BV16 codec was standardized by the cable industry through
CableLabs(R) as a standard codec in PacketCable 1.5 and PacketCable
2.0.  It was also standardized by the Society of Cable
Telecommunications Engineers (SCTE) and by the American National
Standards Institute (ANSI) as the ANSI/SCTE 24-21 2006 standard
[bv16-ANSI].  BV16 is also listed as an optional codec in the ITU-T
Recommendation J.161.  Similarly, BV32 is a standard codec in the
following standards: PacketCable 2.0, ANSI/SCTE 24-23 2007 [bv32-ANSI],
and ITU-T Recommendation J.361.

Because the BV16 and BV32 coding algorithms are already specified in
detail in the ANSI/SCTE standard specification documents, the
specifications are not repeated here.  Instead, Sections 3 and 4 give
links to the ANSI/SCTE specification documents for BV16 and BV32,
respectively.  The rest of this document gives an overview of the
BroadVoice family of codecs, their attributes, and other relevant
information.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [rfc2119].

2.  Overview of the BroadVoice Family of Codecs

BroadVoice [bv-icassp] is a family of speech codecs developed by
Broadcom Corporation for Voice over IP (VoIP) applications.  It is
based on Two-Stage Noise Feedback Coding (TSNFC) [tsnfc-icassp] rather
than the popular Code-Excited Linear Prediction (CELP) coding paradigm.
The RTP [rfc3550] payload formats for BV16 and BV32 are specified in
RFC 4298.
BV16 and BV32 have very similar codec structures and share most of the
algorithm modules, so if the two are implemented together, substantial
code sharing and memory reduction can be achieved.

To encourage widespread use of BroadVoice in diverse speech compression
applications, Broadcom is providing both the floating-point and
fixed-point C source code of BroadVoice on a royalty-free basis under
the GNU Lesser General Public License (LGPL), version 2.1, as published
by the Free Software Foundation.  The BroadVoice open source C code,
audio demonstrations, and related information are available at
http://www.broadcom.com/broadvoice.

BroadVoice was designed from the ground up for voice transmission over
IP networks.  The main design goal was to make the coding delay and
codec complexity as low as possible while keeping the output speech
quality as close to transparent as possible.

The following list summarizes the attributes of BV16 and BV32:

o  Ultra-low algorithmic buffering delay (5 ms)

o  Relatively low computational complexity (about 12 MIPS for BV16 and
   17 MIPS for BV32) and memory requirements

o  High output quality for voice; acceptable quality for music

o  Sampling rates of either 8 kHz (for BV16) or 16 kHz (for BV32)

o  Bit-rate of either 16 kb/s (for BV16) or 32 kb/s (for BV32)

o  Robustness to packet loss (typically 0.5 MOS degradation at about 5%
   random packet loss rate)

o  Open source implementation (floating-point and fixed-point C)

o  No known patent enforcement activities or royalty-bearing patent
   pools as of the date of submission

o  BV16 and BV32 are not bit-exact standards; implementation details
   MAY deviate from those specified in the ANSI/SCTE specifications for
   BV16 and BV32 as long as the bit-stream compatibility with the
   ANSI/SCTE BV16 and BV32 standards is maintained.
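
The sampling rates and bit rates listed above, combined with the 5 ms
frame size stated in Sections 3 and 4, fix the number of samples and
bits in each frame.  The following Python sketch works through that
arithmetic.  The function name frame_layout and its parameters are
illustrative assumptions; they are not part of the BroadVoice open
source code or of the ANSI/SCTE specifications.

```python
# Illustrative arithmetic only; frame_layout is a hypothetical helper,
# not an API of the BroadVoice open source code.

FRAME_MS = 5  # frame size in milliseconds (Sections 3 and 4)


def frame_layout(sample_rate_hz, bit_rate_bps, frame_ms=FRAME_MS):
    """Return (samples per frame, bits per frame, bits per sample)."""
    samples, rem_samples = divmod(sample_rate_hz * frame_ms, 1000)
    bits, rem_bits = divmod(bit_rate_bps * frame_ms, 1000)
    if rem_samples or rem_bits:
        raise ValueError("frame does not hold a whole number of samples or bits")
    return samples, bits, bits / samples


if __name__ == "__main__":
    for name, rate, bps in (("BV16", 8000, 16000), ("BV32", 16000, 32000)):
        samples, bits, bits_per_sample = frame_layout(rate, bps)
        print(f"{name}: {samples} samples -> {bits} bits "
              f"({bits_per_sample:g} bits/sample)")
```

For BV16 this gives 40 samples and 80 bits per frame, and for BV32 it
gives 80 samples and 160 bits per frame, which agrees with the figures
in Sections 3 and 4.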
Some algorithm modules described in the ANSI/SCTE BV16 and BV32
specification documents are meant to illustrate the concepts behind the
modules and may not be the most efficient way to implement them.  For
example, version 1.1 of the BroadVoice open source code implements some
of the efficient excitation vector quantization (VQ) codebook search
methods described in [tsnfc-icassp] and [bv-efficient-vq]; these
methods are mathematically equivalent to the excitation VQ codebook
search method described in the ANSI/SCTE BV16 and BV32 specifications
but are computationally more efficient.

Broadcom licenses, on a royalty-free basis, its patents that are
necessary to practice techniques used in Broadcom's official version of
the BroadVoice open source code.  Implementations that deviate from the
techniques used in Broadcom's BroadVoice codecs may increase exposure
to third-party patents.  Therefore, to minimize potential intellectual
property issues, it is RECOMMENDED that implementers of BroadVoice
codecs use only techniques implemented in Broadcom's official version
of BroadVoice open source C code.

3.  The BroadVoice16 (BV16) Codec

The BroadVoice16 codec [bv16-asilomar] has a frame size of 5 ms and
operates at a sampling rate of 8 kHz.  BV16 encodes every 40 samples of
8 kHz sampled input speech into 80 bits, resulting in a bit rate of
2 bits/sample, or 16 kb/s.

A detailed description of the encoding and decoding principles of the
BV16 codec is given in the ANSI/SCTE 24-21 2006 standard specification
document [bv16-ANSI], which is available at the following link:

http://www.scte.org/documents/pdf/Standards/ANSISCTE24212006.pdf

A BV16 decoder MAY include an adaptive postfilter (PF) to reduce the
perceived level of coding noise.
It MAY also include packet loss concealment (PLC) to conceal (at least
partially) the quality-degrading effects of packet loss.  Both PF and
PLC are post-processing steps applied after the speech signal is
decoded, so they do not affect bit-stream compatibility.  Therefore, PF
and PLC are not an essential part of the BV16 specification.  The
ANSI/SCTE BV16 specification describes an example PF and an example PLC
scheme, but implementers can implement their own PF and PLC schemes
without affecting bit-stream compatibility with BroadVoice codecs.

4.  The BroadVoice32 (BV32) Codec

The BroadVoice32 codec also has a frame size of 5 ms but operates at a
sampling rate of 16 kHz.  BV32 encodes every 80 samples of 16 kHz
sampled input speech into 160 bits, resulting in a bit rate of
2 bits/sample, or 32 kb/s.

A detailed description of the encoding and decoding principles of the
BV32 codec is given in the ANSI/SCTE 24-23 2007 standard specification
document [bv32-ANSI], which is available at the following link:

http://www.scte.org/documents/pdf/Standards/ANSI_SCTE24-232007.pdf

A BV32 decoder MAY include an adaptive postfilter (PF) to reduce the
perceived level of coding noise, although one is not strictly necessary
because the output speech quality of BV32 is already quite high without
a postfilter.  A BV32 decoder MAY also include packet loss concealment
(PLC) to conceal (at least partially) the quality-degrading effects of
packet loss.  Both PF and PLC are post-processing steps applied after
the speech signal is decoded, so they do not affect bit-stream
compatibility.  The ANSI/SCTE BV32 specification describes an example
PLC scheme, but implementers can implement their own PLC schemes
without affecting bit-stream compatibility.

5.  Security Considerations

A potential denial-of-service threat exists for data encoding using
compression techniques that have non-uniform receiver-end computational
load.  An attacker can inject pathological datagrams into the stream
that are complex to decode and cause the receiver to become overloaded.
However, the decoder complexity of BV16 and BV32 does not exhibit any
significant non-uniformity.

6.  IANA Considerations

This document has no actions for IANA.

7.  Acknowledgments

The authors would like to thank Cheng-Chieh Lee and Robert Zopf for
their partial contributions in the following areas: floating-point C
code, fixed-point C code, optimized assembly code, and performance
testing of BroadVoice codecs.

8.  References

8.1.  Normative References

[rfc2119]  Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels", RFC 2119.

[bv16-ANSI]  "BV16 Speech Codec Specification for Voice over IP
           Applications in Cable Telephony", American National
           Standard, ANSI/SCTE 24-21 2006.

[bv32-ANSI]  "BV32 Speech Codec Specification for Voice over IP
           Applications in Cable Telephony", American National
           Standard, ANSI/SCTE 24-23 2007.

[rfc3550]  Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson,
           "RTP: A Transport Protocol for Real-Time Applications",
           RFC 3550.

8.2.  Informative References

[bv-website]  BroadVoice(R) Speech Codec Open Source C Code, BroadVoice
           website, http://www.broadcom.com/broadvoice/.

[bv-icassp]  Juin-Hwey Chen and Jes Thyssen, "The Broadvoice Speech
           Coding Algorithm", Proceedings of 2007 IEEE International
           Conference on Acoustics, Speech, and Signal Processing
           (ICASSP 2007), Volume 4, April 2007.
[tsnfc-icassp]  Juin-Hwey Chen, "Novel Codec Structures For Noise
           Feedback Coding of Speech", Proceedings of 2006 IEEE
           International Conference on Acoustics, Speech, and Signal
           Processing (ICASSP 2006), Volume 1, May 2006.

[bv16-asilomar]  Juin-Hwey Chen and Jes Thyssen, "BroadVoice(R)16: A
           PacketCable Speech Coding Standard for Cable Telephony",
           Proceedings of Fortieth Asilomar Conference on Signals,
           Systems and Computers, ACSSC 2006, October-November 2006.

[bv-efficient-vq]  Jes Thyssen and Juin-Hwey Chen, "Efficient VQ
           Techniques and General Noise Shaping in Noise Feedback
           Coding", Proceedings of Interspeech 2006 ICSLP, September
           2006.

Authors' Addresses

Juin-Hwey (Raymond) Chen
5300 California Avenue
Irvine, CA 92617
USA

Email: rchen@broadcom.com


Jes Thyssen
5300 California Avenue
Irvine, CA 92617
USA

Email: jthyssen@broadcom.com