123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558559560561562563564565566567568569570571572573574575576577578579580581582583584585586587588589590591592593594595596597598599600601602603604605606607608609610611612613614615616617618619620621622623624625626627628629630631632633634635636637638639640641642643644645646647648649650651652653654655656657658659660661662663664665666667668669670671672673674675676677678679680681682683684685686687688689690691692693694695696697698699700701702703704705706707708709710711712713714715716717718719720721722723724725726727728729730731732733734735736737738739740741742743744745746747748749750751752753754755756757758759760761762763764765766767768769770771772773774775776777778779780781782783784785786787788789790791792793794795796797798799800801802803804805806807808809810811812813814815816817818819820821822823824825826827828829830831832833834835836837838839840841842843844845846847848849850851852853854855856857858859860861862863864865866867868869870871872873874875876877878879880881882883884885886887888889890891892893894895896897898899900901902903904905906907908909910911912913914915916917918919920921922923924925926927928929930931932933934935936937938939940941942943944945946947948949950951952953954955956957958959960961962963964965966967968969970971972973974975976977978979980981982983984985986987988989990991992993994995996997998999100010011002100310041005100610071008 |
- AVT G. Herlein
- Internet-Draft
- Intended status: Standards Track J. Valin
- Expires: October 24, 2007 University of Sherbrooke
- A. Heggestad
- April 22, 2007
- RTP Payload Format for the Speex Codec
- draft-ietf-avt-rtp-speex-01 (non-final)
- Status of this Memo
- By submitting this Internet-Draft, each author represents that any
- applicable patent or other IPR claims of which he or she is aware
- have been or will be disclosed, and any of which he or she becomes
- aware will be disclosed, in accordance with Section 6 of BCP 79.
- Internet-Drafts are working documents of the Internet Engineering
- Task Force (IETF), its areas, and its working groups. Note that
- other groups may also distribute working documents as Internet-
- Drafts.
- Internet-Drafts are draft documents valid for a maximum of six months
- and may be updated, replaced, or obsoleted by other documents at any
- time. It is inappropriate to use Internet-Drafts as reference
- material or to cite them other than as "work in progress."
- The list of current Internet-Drafts can be accessed at
- http://www.ietf.org/ietf/1id-abstracts.txt.
- The list of Internet-Draft Shadow Directories can be accessed at
- http://www.ietf.org/shadow.html.
- This Internet-Draft will expire on October 24, 2007.
- Copyright Notice
- Copyright (C) The Internet Society (2007).
- Herlein, et al. Expires October 24, 2007 [Page 1]
- Internet-Draft Speex April 2007
- Abstract
- Speex is an open-source voice codec suitable for use in Voice over IP
- (VoIP) type applications. This document describes the payload format
- for Speex generated bit streams within an RTP packet. Also included
- here are the necessary details for the use of Speex with the Session
- Description Protocol (SDP).
- Herlein, et al. Expires October 24, 2007 [Page 2]
- Internet-Draft Speex April 2007
- Editors Note
- All references to RFC XXXX are to be replaced by references to the
- RFC number of this memo, when published.
- Table of Contents
- 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
- 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5
- 3. RTP usage for Speex . . . . . . . . . . . . . . . . . . . . . 6
- 3.1. RTP Speex Header Fields . . . . . . . . . . . . . . . . . 6
- 3.2. RTP payload format for Speex . . . . . . . . . . . . . . . 6
- 3.3. Speex payload . . . . . . . . . . . . . . . . . . . . . . 6
- 3.4. Example Speex packet . . . . . . . . . . . . . . . . . . . 7
- 3.5. Multiple Speex frames in a RTP packet . . . . . . . . . . 7
- 4. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 9
- 4.1. Media Type Registration . . . . . . . . . . . . . . . . . 9
- 4.1.1. Registration of media type audio/speex . . . . . . . . 9
- 5. SDP usage of Speex . . . . . . . . . . . . . . . . . . . . . . 11
- 6. Security Considerations . . . . . . . . . . . . . . . . . . . 14
- 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 15
- 8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 16
- 8.1. Normative References . . . . . . . . . . . . . . . . . . . 16
- 8.2. Informative References . . . . . . . . . . . . . . . . . . 16
- Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 17
- Intellectual Property and Copyright Statements . . . . . . . . . . 18
- Herlein, et al. Expires October 24, 2007 [Page 3]
- Internet-Draft Speex April 2007
- 1. Introduction
- Speex is based on the CELP [CELP] encoding technique with support for
- either narrowband (nominal 8kHz), wideband (nominal 16kHz) or ultra-
- wideband (nominal 32kHz). The main characteristics can be summarized
- as follows:
- o Free software/open-source
- o Integration of wideband and narrowband in the same bit-stream
- o Wide range of bit-rates available
- o Dynamic bit-rate switching and variable bit-rate (VBR)
- o Voice Activity Detection (VAD, integrated with VBR)
- o Variable complexity
- To be compliant with this specification, implementations MUST support
- 8 kHz sampling rate (narrowband)" and SHOULD support 8 kbps bitrate.
- The sampling rate MUST be 8, 16 or 32 kHz.
- Herlein, et al. Expires October 24, 2007 [Page 4]
- Internet-Draft Speex April 2007
- 2. Terminology
- The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
- "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
- document are to be interpreted as described in RFC2119 [RFC2119] and
- indicate requirement levels for compliant RTP implementations.
- Herlein, et al. Expires October 24, 2007 [Page 5]
- Internet-Draft Speex April 2007
- 3. RTP usage for Speex
- 3.1. RTP Speex Header Fields
- The RTP header is defined in the RTP specification [RFC3550]. This
- section defines how fields in the RTP header are used.
- Payload Type (PT): The assignment of an RTP payload type for this
- packet format is outside the scope of this document; it is
- specified by the RTP profile under which this payload format is
- used, or signaled dynamically out-of-band (e.g., using SDP).
- Marker (M) bit: The M bit is set to one to indicate that the RTP
- packet payload contains at least one complete frame
- Extension (X) bit: Defined by the RTP profile used.
- Timestamp: A 32-bit word that corresponds to the sampling instant
- for the first frame in the RTP packet.
- 3.2. RTP payload format for Speex
- The RTP payload for Speex has the format shown in Figure 1. No
- additional header fields specific to this payload format are
- required. For RTP based transportation of Speex encoded audio the
- standard RTP header [RFC3550] is followed by one or more payload data
- blocks. An optional padding terminator may also be used.
- 0 1 2 3
- 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | RTP Header |
- +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
- | one or more frames of Speex .... |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | one or more frames of Speex .... | padding |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- Figure 1: RTP payload for Speex
- 3.3. Speex payload
- For the purposes of packetizing the bit stream in RTP, it is only
- necessary to consider the sequence of bits as output by the Speex
- encoder [speexenc], and present the same sequence to the decoder.
- The payload format described here maintains this sequence.
- A typical Speex frame, encoded at the maximum bitrate, is approx. 110
- Herlein, et al. Expires October 24, 2007 [Page 6]
- Internet-Draft Speex April 2007
- octets and the total number of Speex frames SHOULD be kept less than
- the path MTU to prevent fragmentation. Speex frames MUST NOT be
- fragmented across multiple RTP packets,
- An RTP packet MAY contain Speex frames of the same bit rate or of
- varying bit rates, since the bit-rate for a frame is conveyed in band
- with the signal.
- The encoding and decoding algorithm can change the bit rate at any 20
- msec frame boundary, with the bit rate change notification provided
- in-band with the bit stream. Each frame contains both "mode"
- (narrowband, wideband or ultra-wideband) and "sub-mode" (bit-rate)
- information in the bit stream. No out-of-band notification is
- required for the decoder to process changes in the bit rate sent by
- the encoder.
- Sampling rate values of 8000, 16000 or 32000 Hz MUST be used. Any
- other sampling rates MUST NOT be used.
- The RTP payload MUST be padded to provide an integer number of octets
- as the payload length. These padding bits are LSB aligned in network
- octet order and consist of a 0 followed by all ones (until the end of
- the octet). This padding is only required for the last frame in the
- packet, and only to ensure the packet contents ends on an octet
- boundary.
- 3.4. Example Speex packet
- In the example below we have a single Speex frame with 5 bits of
- padding to ensure the packet size falls on an octet boundary.
- 0 1 2 3
- 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | RTP Header |
- +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
- | ..speex data.. |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | ..speex data.. |0 1 1 1 1|
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- 3.5. Multiple Speex frames in a RTP packet
- Below is an example of two Speex frames contained within one RTP
- packet. The Speex frame length in this example fall on an octet
- boundary so there is no padding.
- Speex codecs [speexenc] are able to detect the bitrate from the
- Herlein, et al. Expires October 24, 2007 [Page 7]
- Internet-Draft Speex April 2007
- payload and are responsible for detecting the 20 msec boundaries
- between each frame.
- 0 1 2 3
- 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | RTP Header |
- +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
- | ..speex frame 1.. |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | ..speex frame 1.. | ..speex frame 2.. |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | ..speex frame 2.. |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- Herlein, et al. Expires October 24, 2007 [Page 8]
- Internet-Draft Speex April 2007
- 4. IANA Considerations
- This document defines the Speex media type.
- 4.1. Media Type Registration
- This section describes the media types and names associated with this
- payload format. The section registers the media types, as per
- RFC4288 [RFC4288]
- 4.1.1. Registration of media type audio/speex
- Media type name: audio
- Media subtype name: speex
- Required parameters:
- None
- Optional parameters:
- ptime: see RFC 4566. SHOULD be a multiple of 20 msec.
- maxptime: see RFC 4566. SHOULD be a multiple of 20 msec.
- Encoding considerations:
- This media type is framed and binary, see section 4.8 in
- [RFC4288].
- Security considerations: See Section 6
- Interoperability considerations:
- None.
- Published specification: RFC XXXX [This RFC].
- Applications which use this media type:
- Audio streaming and conferencing applications.
- Additional information: none
- Person and email address to contact for further information :
- Herlein, et al. Expires October 24, 2007 [Page 9]
- Internet-Draft Speex April 2007
- Alfred E. Heggestad: aeh@db.org
- Intended usage: COMMON
- Restrictions on usage:
- This media type depends on RTP framing, and hence is only defined
- for transfer via RTP [RFC3550]. Transport within other framing
- protocols is not defined at this time.
- Author: Alfred E. Heggestad
- Change controller:
- IETF Audio/Video Transport working group delegated from the IESG.
- Herlein, et al. Expires October 24, 2007 [Page 10]
- Internet-Draft Speex April 2007
- 5. SDP usage of Speex
- When conveying information by SDP [RFC4566], the encoding name MUST
- be set to "speex". An example of the media representation in SDP for
- offering a single channel of Speex at 8000 samples per second might
- be:
- m=audio 8088 RTP/AVP 97
- a=rtpmap:97 speex/8000
- Note that the RTP payload type code of 97 is defined in this media
- definition to be 'mapped' to the speex codec at an 8kHz sampling
- frequency using the 'a=rtpmap' line. Any number from 96 to 127 could
- have been chosen (the allowed range for dynamic types).
- The value of the sampling frequency is typically 8000 for narrow band
- operation, 16000 for wide band operation, and 32000 for ultra-wide
- band operation.
- If for some reason the offerer has bandwidth limitations, the client
- may use the "b=" header, as explained in SDP [RFC4566]. The
- following example illustrates the case where the offerer cannot
- receive more than 10 kbit/s.
- m=audio 8088 RTP/AVP 97
- b=AS:10
- a=rtmap:97 speex/8000
- In this case, if the remote part agrees, it should configure its
- Speex encoder so that it does not use modes that produce more than 10
- kbit/s. Note that the "b=" constraint also applies on all payload
- types that may be proposed in the media line ("m=").
- An other way to make recommendations to the remote Speex encoder is
- to use its specific parameters via the a=fmtp: directive. The
- following parameters are defined for use in this way:
- ptime: duration of each packet in milliseconds.
- sr: actual sample rate in Hz.
- ebw: encoding bandwidth - either 'narrow' or 'wide' or 'ultra'
- (corresponds to nominal 8000, 16000, and 32000 Hz sampling rates).
- Herlein, et al. Expires October 24, 2007 [Page 11]
- Internet-Draft Speex April 2007
- vbr: variable bit rate - either 'on' 'off' or 'vad' (defaults to
- off). If on, variable bit rate is enabled. If off, disabled. If
- set to 'vad' then constant bit rate is used but silence will be
- encoded with special short frames to indicate a lack of voice for
- that period.
- cng: comfort noise generation - either 'on' or 'off'. If off then
- silence frames will be silent; if 'on' then those frames will be
- filled with comfort noise.
- mode: Speex encoding mode. Can be {1,2,3,4,5,6,any} defaults to 3
- in narrowband, 6 in wide and ultra-wide.
- Examples:
- m=audio 8008 RTP/AVP 97
- a=rtpmap:97 speex/8000
- a=fmtp:97 mode=4
- This examples illustrate an offerer that wishes to receive a Speex
- stream at 8000Hz, but only using speex mode 4.
- Several Speex specific parameters can be given in a single a=fmtp
- line provided that they are separated by a semi-colon:
- a=fmtp:97 mode=any;mode=1
- The offerer may indicate that it wishes to send variable bit rate
- frames with comfort noise:
- m=audio 8088 RTP/AVP 97
- a=rtmap:97 speex/8000
- a=fmtp:97 vbr=on;cng=on
- The "ptime" attribute is used to denote the packetization interval
- (ie, how many milliseconds of audio is encoded in a single RTP
- packet). Since Speex uses 20 msec frames, ptime values of multiples
- of 20 denote multiple Speex frames per packet. Values of ptime which
- are not multiples of 20 MUST be ignored and clients MUST use the
- default value of 20 instead.
- Implementations SHOULD support ptime of 20 msec (i.e. one frame per
- packet)
- In the example below the ptime value is set to 40, indicating that
- Herlein, et al. Expires October 24, 2007 [Page 12]
- Internet-Draft Speex April 2007
- there are 2 frames in each packet.
- m=audio 8008 RTP/AVP 97
- a=rtpmap:97 speex/8000
- a=ptime:40
- Note that the ptime parameter applies to all payloads listed in the
- media line and is not used as part of an a=fmtp directive.
- Values of ptime not multiple of 20 msec are meaningless, so the
- receiver of such ptime values MUST ignore them. If during the life
- of an RTP session the ptime value changes, when there are multiple
- Speex frames for example, the SDP value must also reflect the new
- value.
- Care must be taken when setting the value of ptime so that the RTP
- packet size does not exceed the path MTU.
- Herlein, et al. Expires October 24, 2007 [Page 13]
- Internet-Draft Speex April 2007
- 6. Security Considerations
- RTP packets using the payload format defined in this specification
- are subject to the security considerations discussed in the RTP
- specification [RFC3550], and any appropriate RTP profile. This
- implies that confidentiality of the media streams is achieved by
- encryption. Because the data compression used with this payload
- format is applied end-to-end, encryption may be performed after
- compression so there is no conflict between the two operations.
- A potential denial-of-service threat exists for data encodings using
- compression techniques that have non-uniform receiver-end
- computational load. The attacker can inject pathological datagrams
- into the stream which are complex to decode and cause the receiver to
- be overloaded. However, this encoding does not exhibit any
- significant non-uniformity.
- As with any IP-based protocol, in some circumstances a receiver may
- be overloaded simply by the receipt of too many packets, either
- desired or undesired. Network-layer authentication may be used to
- discard packets from undesired sources, but the processing cost of
- the authentication itself may be too high.
- Herlein, et al. Expires October 24, 2007 [Page 14]
- Internet-Draft Speex April 2007
- 7. Acknowledgements
- The authors would like to thank Equivalence Pty Ltd of Australia for
- their assistance in attempting to standardize the use of Speex in
- H.323 applications, and for implementing Speex in their open source
- OpenH323 stack. The authors would also like to thank Brian C. Wiles
- <brian@streamcomm.com> of StreamComm for his assistance in developing
- the proposed standard for Speex use in H.323 applications.
- The authors would also like to thank the following members of the
- Speex and AVT communities for their input: Ross Finlayson, Federico
- Montesino Pouzols, Henning Schulzrinne, Magnus Westerlund.
- Thanks to former authors of this document; Simon Morlat, Roger
- Hardiman, Phil Kerr
- Herlein, et al. Expires October 24, 2007 [Page 15]
- Internet-Draft Speex April 2007
- 8. References
- 8.1. Normative References
- [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
- Requirement Levels", BCP 14, RFC 2119, March 1997.
- [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
- Jacobson, "RTP: A Transport Protocol for Real-Time
- Applications", STD 64, RFC 3550, July 2003.
- [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
- Description Protocol", RFC 4566, July 2006.
- 8.2. Informative References
- [CELP] "CELP, U.S. Federal Standard 1016.", National Technical
- Information Service (NTIS) website http://www.ntis.gov/.
- [RFC4288] Freed, N. and J. Klensin, "Media Type Specifications and
- Registration Procedures", BCP 13, RFC 4288, December 2005.
- [speexenc]
- Valin, J., "Speexenc/speexdec, reference command-line
- encoder/decoder", Speex website http://www.speex.org/.
- Herlein, et al. Expires October 24, 2007 [Page 16]
- Internet-Draft Speex April 2007
- Authors' Addresses
- Greg Herlein
- 2034 Filbert Street
- San Francisco, California 94123
- United States
- Email: gherlein@herlein.com
- Jean-Marc Valin
- University of Sherbrooke
- Department of Electrical and Computer Engineering
- University of Sherbrooke
- 2500 blvd Universite
- Sherbrooke, Quebec J1K 2R1
- Canada
- Email: jean-marc.valin@usherbrooke.ca
- Alfred E. Heggestad
- Biskop J. Nilssonsgt. 20a
- Oslo 0659
- Norway
- Email: aeh@db.org
- Herlein, et al. Expires October 24, 2007 [Page 17]
- Internet-Draft Speex April 2007
- Full Copyright Statement
- Copyright (C) The Internet Society (2007).
- This document is subject to the rights, licenses and restrictions
- contained in BCP 78, and except as set forth therein, the authors
- retain all their rights.
- This document and the information contained herein are provided on an
- "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
- OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
- ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
- INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
- INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
- WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
- Intellectual Property
- The IETF takes no position regarding the validity or scope of any
- Intellectual Property Rights or other rights that might be claimed to
- pertain to the implementation or use of the technology described in
- this document or the extent to which any license under such rights
- might or might not be available; nor does it represent that it has
- made any independent effort to identify any such rights. Information
- on the procedures with respect to rights in RFC documents can be
- found in BCP 78 and BCP 79.
- Copies of IPR disclosures made to the IETF Secretariat and any
- assurances of licenses to be made available, or the result of an
- attempt made to obtain a general license or permission for the use of
- such proprietary rights by implementers or users of this
- specification can be obtained from the IETF on-line IPR repository at
- http://www.ietf.org/ipr.
- The IETF invites any interested party to bring to its attention any
- copyrights, patents or patent applications, or other proprietary
- rights that may cover technology that may be required to implement
- this standard. Please address the information to the IETF at
- ietf-ipr@ietf.org.
- Acknowledgment
- Funding for the RFC Editor function is provided by the IETF
- Administrative Support Activity (IASA).
- Herlein, et al. Expires October 24, 2007 [Page 18]
|