A/V Transport Payloads Working Group                          T. Edwards
Internet-Draft                                                       FOX
Intended status: Standards Track                         August 15, 2016
Expires: February 16, 2017

RTP Payload for SMPTE ST 291 Ancillary Data
draft-ietf-payload-rtp-ancillary-05

Abstract

This memo describes an RTP Payload format for SMPTE Ancillary data, as defined by SMPTE ST 291-1. SMPTE Ancillary data is generally used along with professional video formats to carry a range of ancillary data types, including time code, Closed Captioning, and the Active Format Description (AFD).

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on February 16, 2017.

Copyright Notice

Copyright (c) 2016 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.


1. Introduction

This memo describes an RTP Payload format for the Society of Motion Picture and Television Engineers (SMPTE) Ancillary data (ANC), as defined by SMPTE ST 291-1 [ST291]. ANC can carry a range of data types, including time code, Closed Captioning, and the Active Format Description (AFD).

ANC is generally associated with the carriage of metadata within the bit stream of a Serial Digital Interface (SDI) such as SMPTE ST 259 [ST259], the standard definition (SD) Serial Digital Interface (with ANC data inserted as per SMPTE ST 125 [ST125]), or SMPTE ST 292-1 [ST292], the 1.5 Gb/s Serial Digital Interface for high definition (HD) television applications.

ANC data packet payload definitions for a specific application are specified by a SMPTE Standard, Recommended Practice, Registered Disclosure Document, or by a document generated by another organization, a company, or an individual (an Entity). When a payload format is registered with SMPTE, an application document describing the payload format is required, and the registered ancillary data packet is identified by a registered data identification word.

This memo describes an RTP payload that supports ANC data packets regardless of whether they originate from an SD or HD interface, or if the ANC data packet is from the vertical ancillary space (VANC) or the horizontal ancillary space (HANC), or if the ANC packet is located in the luma (Y) or color-difference (C) channel. Sufficient information is provided to enable the ANC data packets at the output of the decoder to be restored to their original locations in the serial digital video signal raster (if that is desired).

It should be noted that the ancillary data flag (ADF) word is not specifically carried in this RTP payload. The ADF may be specified in a document defining an interconnecting digital video interface; otherwise, a default ADF is specified by SMPTE ST 291-1 [ST291].

This ANC payload can be used by itself, or used along with a range of RTP video formats. In particular, it has been designed so that it could be used along with RFC 4175 [RFC4175] "RTP Payload Format for Uncompressed Video" or RFC 5371 [RFC5371] "RTP Payload Format for JPEG 2000 Video Streams."

The data model in this document for the ANC data RTP payload is based on the data model of SMPTE ST 2038 [ST2038], which standardizes the carriage of ANC data packets in an MPEG-2 Transport Stream.

1.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

2. RTP Payload Format for SMPTE ST 291 Ancillary Data

The format of an RTP packet containing SMPTE ST 291 Ancillary Data is shown below:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X| CC    |M|    PT       |        sequence number        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           synchronization source (SSRC) identifier            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Extended Sequence Number    |            Length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ANC_Count     |C|   Line_Number       |   Horizontal_Offset   | 
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        DID        |        SDID       |   Data_Count      | R |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       User_Data_Words...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Checksum_Word   |                word_align                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   (next ANC data packet)...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 1: SMPTE Ancillary Data RTP Packet Format

RTP packet header fields SHALL be interpreted as per RFC 3550 [RFC3550], with the following specifics:

Timestamp: 32 bits

The timestamp field is interpreted in a similar fashion to RFC 4175 [RFC4175]:
For progressive scan video, the timestamp SHALL denote the sampling instant of the frame to which the ancillary data in the RTP packet belongs. RTP packets MUST NOT include ANC data from multiple frames, and all RTP packets with ANC data belonging to the same frame MUST have the same timestamp.

For interlaced video, the timestamp SHALL denote the sampling instant of the field to which the ancillary data in the RTP packet belongs. RTP packets MUST NOT include ANC data from multiple fields, and all RTP packets belonging to the same field MUST have the same timestamp.

If the sampling instant does not correspond to an integer value of the clock, the value SHALL be truncated to the next lowest integer, with no ambiguity. Section 3.1 describes recommended timestamp clock rates.

Marker bit (M): 1 bit

The marker bit set to "1" SHALL indicate the last ANC RTP packet for a frame (for progressive scan video) or the last ANC RTP packet for a field (for interlaced video).
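
As an informal illustration of the timestamp truncation rule above, the following Python sketch computes the RTP timestamp for the Nth frame (or field). The function name rtp_timestamp, its parameters, and the initial_offset argument are assumptions made for this example and are not defined by this memo. The 90 kHz default matches the clock rate used in the SDP examples of Section 3.2.

```python
from fractions import Fraction
import math


def rtp_timestamp(index, rate, clock_rate=90000, initial_offset=0):
    """Return the 32-bit RTP timestamp for frame/field number `index`.

    `rate` is the frame (or field) rate in Hz, as an int or Fraction.
    `initial_offset` stands in for the initial timestamp value chosen
    by the sender (hypothetical parameter for this sketch).
    """
    # Exact sampling instant expressed in clock ticks.
    ticks = Fraction(index) * clock_rate / Fraction(rate)
    # Truncate to the next lowest integer, then wrap to 32 bits.
    return (initial_offset + math.floor(ticks)) % 2**32


if __name__ == "__main__":
    rate = Fraction(60000, 1001)  # 59.94 Hz: 1501.5 ticks per frame
    print([rtp_timestamp(n, rate) for n in range(4)])
```

Exact rational arithmetic is used here so that non-integer frame periods (such as 59.94 Hz at a 90 kHz clock) do not accumulate floating-point error before truncation.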

2.1. Payload Header Definitions

The ANC RTP payload header fields are defined as:

Extended Sequence Number: 16 bits

The high order bits of the extended 32-bit sequence number, in network byte order. This is the same as the Extended Sequence Number field in RFC 4175 [RFC4175].

Length: 16 bits

Number of octets of the ANC RTP payload, beginning with the "C" bit of the first ANC data packet.

ANC_Count: 8 bits

This field is the count of the total number of ANC data packets carried in the RTP payload. A single ANC RTP packet payload SHALL NOT carry more than 255 ANC data packets.

If more than 255 ANC data packets need to be carried in a field or frame, additional RTP packets carrying ANC data may be sent with the same RTP timestamp but with different sequence numbers. An ANC_Count of 0 SHALL indicate that there are no ANC data packets in the payload (for example, an RTP packet with the marker bit set to indicate the last ANC RTP packet in a field/frame, even though that RTP packet carries no actual ANC data packets).
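
As a non-normative sketch, the three payload header fields above could be read as follows. The function name parse_anc_payload_header and the returned dictionary keys are hypothetical. The sketch assumes that `payload` holds the octets immediately following the RTP header (including any CSRC list or header extension) and that `rtp_seq` is the 16-bit sequence number from the RTP header.

```python
import struct


def parse_anc_payload_header(payload: bytes, rtp_seq: int) -> dict:
    """Decode Extended Sequence Number, Length and ANC_Count."""
    if len(payload) < 5:
        raise ValueError("payload too short for the ANC payload header")
    # "!" selects network byte order with no padding: 2 + 2 + 1 octets.
    ext_seq, length, anc_count = struct.unpack_from("!HHB", payload, 0)
    # The extended field supplies the high-order 16 bits of a 32-bit
    # sequence number; the RTP header supplies the low-order 16 bits.
    full_seq = (ext_seq << 16) | (rtp_seq & 0xFFFF)
    return {
        "sequence_number_32": full_seq,
        "length": length,
        "anc_count": anc_count,
    }


if __name__ == "__main__":
    example = bytes([0x00, 0x01, 0x00, 0x10, 0x02])
    print(parse_anc_payload_header(example, rtp_seq=0xFFFE))
```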

For each ANC data packet in the payload, the following ANC data packet header fields MUST be present:

C: 1 bit

For HD signals, this flag, when set to "1", indicates that the ANC data corresponds to the color-difference channel (C). When set to "0", this flag indicates that the ANC data corresponds to the luma (Y) channel. For SD signals, this flag SHALL be set to "0".

Line_Number: 11 bits

This field contains the line number (as defined in ITU-R BT.1700 [BT1700] for SD video or ITU-R BT.1120 [BT1120] for HD video) that corresponds to the location of the ANC data packet in an SDI raster. A value of 0x7FF (all bits in the field are '1') SHALL indicate that the ANC data is carried without a specific line location within the field or frame.

Note that the lines that are available to convey ANC data are as defined in the applicable sample structure specification (e.g., SMPTE 274M [ST274], SMPTE ST 296 [ST296], ITU-R BT.656 [BT656]) and may be further restricted per SMPTE RP 168 [RP168].

Horizontal_Offset: 12 bits

This field defines the location of the ANC data packet in an SDI raster relative to the start of active video (SAV). A value of 0 means that the Ancillary Data Flag (ADF) of the ANC data packet begins immediately following SAV. For HD, this SHALL be in units of luma sample numbers as specified by the defining document of the particular image (e.g., SMPTE 274M [ST274] for 1920 x 1080 active images, or SMPTE ST 296 [ST296] for 1280 x 720 progressive active images). For SD, this SHALL be in units of (27MHz) multiplexed word numbers, as specified in SMPTE ST 125 [ST125]. A value of 0xFFF (all bits in the field are '1') SHALL indicate that the ANC data is carried without any specific location within the line.

Note that HANC space in the digital blanking area will generally have higher luma sample numbers than any samples in the active digital line.

An ANC data packet with the header fields Line_Number of 0x7FF and Horizontal_Offset of 0xFFF SHALL be considered to be carried without any specific location within the field or frame, and in such a case the "C" field SHALL be ignored.
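
The following Python sketch illustrates how the C, Line_Number, and Horizontal_Offset fields could be unpacked from their combined 24 bits, including the "no specific location" case described above. The function name decode_anc_location, the constant names, and the dictionary keys are assumptions for this example only. The sketch takes the 24 bits as a single integer and does not address where those bits sit within the RTP payload.

```python
LINE_UNSPECIFIED = 0x7FF      # all 11 Line_Number bits set
HOFFSET_UNSPECIFIED = 0xFFF   # all 12 Horizontal_Offset bits set


def decode_anc_location(bits24: int) -> dict:
    """Split a 24-bit value into C (1), Line_Number (11), Horizontal_Offset (12)."""
    c = (bits24 >> 23) & 0x1
    line_number = (bits24 >> 12) & 0x7FF
    horizontal_offset = bits24 & 0xFFF
    unlocated = (line_number == LINE_UNSPECIFIED
                 and horizontal_offset == HOFFSET_UNSPECIFIED)
    return {
        # C is ignored when the packet carries no specific location.
        "c": None if unlocated else c,
        "line_number": line_number,
        "horizontal_offset": horizontal_offset,
        "unlocated": unlocated,
    }


if __name__ == "__main__":
    # C = 1, Line_Number = 9, Horizontal_Offset = 0
    print(decode_anc_location((1 << 23) | (9 << 12) | 0))
    # Every bit set: no specific location, C ignored
    print(decode_anc_location(0xFFFFFF))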

For each ANC data packet in the payload, immediately after the ANC data packet header fields, the following data fields MUST be present, with the fields DID, SDID, Data_Count, User_Data_Words, and Checksum_Word representing the 10-bit words carried in the ANC data packet, as per SMPTE ST 291-1 [ST291]:

DID: 10 bits

Data Identification Word

SDID: 10 bits

Secondary Data Identification Word. Used only for a "Type 2" ANC data packet. Note that in a "Type 1" ANC data packet, this word will actually carry the Data Block Number (DBN).

Data_Count: 10 bits

The lower 8 bits of Data_Count, corresponding to bits b7 (MSB) through b0 (LSB) of the 10-bit Data_Count word, contain the actual count of 10-bit words in User_Data_Words. Bit b8 is the even parity for bits b7 through b0, and bit b9 is the inverse (logical NOT) of bit b8.

R: 2 reserved bits

R is a field of two reserved bits that MUST be set to zero.

User_Data_Words: integer number of 10-bit words

User_Data_Words (UDW) are used to convey information of a type as identified by the DID word or the DID and SDID words. The number of 10-bit words in the UDW is defined by the Data_Count field.

Checksum_Word: 10 bits

The Checksum_Word can be used to determine the validity of the ANC data packet from the DID word through the UDW. It consists of 10 bits, where bits b8 (MSB) through b0 (LSB) define the checksum value and bit b9 is the inverse (logical NOT) of bit b8. The checksum value is equal to the nine least significant bits of the sum of the nine least significant bits of the DID word, the SDID word, the Data_Count word, and all User_Data_Words in the ANC data packet. The checksum is initialized to zero before calculation, and any end carry resulting from the checksum calculation is ignored.

word_align: bits as needed to complete 32-bit word

Word align contains as many "0" bits as are needed to complete the last 32-bit word of an ANC data packet's data in the RTP payload. If an ANC data packet in the RTP payload ends aligned with a word boundary, there is no need to add any word alignment bits. Word align should be used even for the last ANC data packet in an RTP packet.
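
The Data_Count parity bits, the Checksum_Word calculation, and the word_align length described above can be sketched in Python as follows. This is an informal illustration, not a reference implementation: the function names with_parity, checksum_word, and word_align_bits are hypothetical, and the DID/SDID/UDW values in the usage lines are arbitrary 10-bit words chosen only to exercise the arithmetic.

```python
def with_parity(count: int) -> int:
    """Build the 10-bit Data_Count word from an 8-bit count."""
    count &= 0xFF
    b8 = bin(count).count("1") & 1   # even parity over b7..b0
    b9 = b8 ^ 1                      # b9 is the inverse of b8
    return (b9 << 9) | (b8 << 8) | count


def checksum_word(did: int, sdid: int, data_count: int, udw) -> int:
    """Compute the 10-bit Checksum_Word over DID through the last UDW."""
    total = 0                        # checksum is initialized to zero
    for word in (did, sdid, data_count, *udw):
        # Add the nine LSBs of each word; any end carry is ignored.
        total = (total + (word & 0x1FF)) & 0x1FF
    b9 = ((total >> 8) & 1) ^ 1      # b9 is the inverse of b8
    return (b9 << 9) | total


def word_align_bits(bits_used: int) -> int:
    """Number of zero bits needed to reach the next 32-bit boundary."""
    return (-bits_used) % 32


if __name__ == "__main__":
    dc = with_parity(3)
    print(hex(dc))
    print(hex(checksum_word(0x161, 0x102, dc, [0x200, 0x101, 0x102])))
    # Three UDW plus the Checksum_Word occupy 40 bits when they start
    # on a 32-bit boundary, as drawn for the first packet in Figure 1.
    print(word_align_bits(40))
```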

When reconstructing an SDI signal based on this payload, it is important to place ANC data packets into the locations indicated by the ANC payload header fields Line_Number and Horizontal_Offset, and also to follow the requirements of SMPTE ST 291-1 [ST291] Section 7 "Ancillary Data Space Formatting (Component or Composite Interface)", which include rules on the placement of initial ANC data into allowed spaces as well as the contiguity of ANC data packet sequences within those spaces in order to assure that the resulting ANC data packets in the SDI signal are valid.

Senders of this payload SHOULD transmit available ANC data packets as soon as practical to reduce end-to-end latency, especially if receivers will be embedding the received ANC data packet into an SDI signal emission. One millisecond is a reasonable upper bound for the amount of time between when an ANC data packet becomes available to a sender and the emission of an RTP payload containing that ANC data packet.

ANC data packets with headers that specify specific location within a field or frame SHOULD be sent in raster scan order, both in terms of packing position within an RTP packet and in terms of transmission time of RTP packets.

3. Payload Format Parameters

This RTP payload format is identified using the video/smpte291 media type, which is registered in accordance with RFC 4855 [RFC4855], and using the template of RFC 6838 [RFC6838].

Note that the Media Type Definition is in the "video" tree due to the expected use of SMPTE ST 291 Ancillary Data along with video formats.

3.1. Media Type Definition

Type name: video

Subtype name: smpte291

Required parameters:

Optional parameters:

Encoding considerations: This media type is framed and binary; see Section 4.8 of RFC 6838 [RFC6838].

Security considerations: See Section 5 of [this RFC]

Interoperability considerations: Data items in smpte291 can be very diverse. Receivers might only be capable of interpreting a subset of the possible data items. Some implementations may care about the location of the ANC data packets in the SDI raster, but other implementations may not care.

Published specification: [this RFC]

Applications that use this media type: Devices that stream real-time professional video, especially those that must interoperate with legacy serial digital interfaces (SDI).

Additional Information:

Person & email address to contact for further information: T. Edwards <thomas.edwards@fox.com>, IETF Payload Working Group <payload@ietf.org>

Intended usage: COMMON

Restrictions on usage: This media type depends on RTP framing, and hence is only defined for transfer via RTP RFC 3550 [RFC3550]. Transport within other framing protocols is not defined at this time.

Author: T. Edwards <thomas.edwards@fox.com>

Change controller: IETF Audio/Video Transport Payloads working group delegated from the IESG.

3.2. Mapping to SDP

The mapping of the above defined payload format media type and its parameters SHALL be done according to Section 3 of RFC 4855 [RFC4855].

DID and SDID values SHALL be specified in hexadecimal with a "0x" prefix (such as "0x61"). The ABNF as per RFC 5234 [RFC5234] of the fmtp line shall be:

        TwoHex = "0x" 1*2(HEXDIG)
        DidSdid = "DID_SDID={" TwoHex "," TwoHex "}"
        FormatSpecificParameters = DidSdid *(";" DidSdid)
           

For example, EIA 608 Closed Caption data would be signalled with the parameter DID_SDID={0x61,0x02}. If a DID_SDID parameter is not specified, then the ancillary data stream may potentially contain ancillary data packets of any type.

Multiple DID_SDID parameters may be specified (separated by semicolons) to signal the presence of multiple types of ANC data in the stream. DID_SDID={0x61,0x02};DID_SDID={0x41,0x05}, for example, signals the presence of EIA 608 Closed Captions as well as AFD/Bar Data.
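
As an informal illustration of the ABNF above, the following Python sketch extracts (DID, SDID) pairs from the parameter portion of an fmtp line. The function name parse_did_sdid is hypothetical. Matching is case-insensitive because ABNF literal strings are case-insensitive under RFC 5234; tolerating whitespace around the semicolons is an assumption of this sketch and is not part of the grammar.

```python
import re

# One DidSdid element: "DID_SDID={" TwoHex "," TwoHex "}"
_DID_SDID = re.compile(
    r"DID_SDID=\{0x([0-9A-F]{1,2}),0x([0-9A-F]{1,2})\}",
    re.IGNORECASE,
)


def parse_did_sdid(params: str):
    """Return a list of (DID, SDID) integer pairs from fmtp parameters."""
    if not params.strip():
        # No DID_SDID parameter: the stream may carry any ANC type.
        return []
    pairs = []
    for item in params.split(";"):
        match = _DID_SDID.fullmatch(item.strip())
        if match is None:
            raise ValueError(f"not a DID_SDID parameter: {item!r}")
        pairs.append((int(match.group(1), 16), int(match.group(2), 16)))
    return pairs


if __name__ == "__main__":
    print(parse_did_sdid("DID_SDID={0x61,0x02};DID_SDID={0x41,0x05}"))
```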

A sample SDP mapping for ancillary data is as follows:

        m=video 30000 RTP/AVP 112
        a=rtpmap:112 smpte291/90000
        a=fmtp:112 DID_SDID={0x61,0x02};DID_SDID={0x41,0x05}

In this example, a dynamic payload type 112 is used for ancillary data. The 90 kHz RTP timestamp clock rate is specified in the "a=rtpmap" line after the subtype. In the "a=fmtp:" line, DID 0x61 and SDID 0x02 are specified (registered to EIA 608 Closed Caption Data by SMPTE), and also DID 0x41 and SDID 0x05 (registered to AFD/Bar Data).

3.2.1. Grouping ANC data RTP Streams with Associated Video Streams

The ANC RTP payload format will often be used in groupings with associated video essence streams. Any legal SDP grouping mechanism could be used. Implementers may wish to use the Lip Synchronization (LS) grouping defined in RFC 5888 [RFC5888], which requires that "m" lines that are grouped together using LS semantics MUST synchronize the playout of the corresponding media streams.

A sample SDP mapping for grouping ANC data with RFC 4175 video using LS semantics is as follows:

        v=0
        o=Al 123456 11 IN IP4 host.example.com
        s=Professional Networked Media Test
        i=A test of synchronized video and ANC data
        t=0 0
        a=group:LS V1 M1
        m=video 50000 RTP/AVP 96
        c=IN IP4 233.252.0.1/255
        a=rtpmap:96 raw/90000
        a=fmtp:96 sampling=YCbCr-4:2:2; width=1280; height=720; depth=10
        a=mid:V1
        m=video 50010 RTP/AVP 97
        c=IN IP4 233.252.0.2/255
        a=rtpmap:97 smpte291/90000
        a=fmtp:97 DID_SDID={0x61,0x02};DID_SDID={0x41,0x05}
        a=mid:M1
          

3.3. Offer/Answer Model and Declarative Considerations

Receivers may wish to receive ANC data streams with specific DID_SDID parameters. Thus, when offering ANC data streams using the Session Description Protocol (SDP) in an Offer/Answer model [RFC3264] or in a declarative manner (e.g., SDP in the Real-Time Streaming Protocol (RTSP) [RFC2326] or the Session Announcement Protocol (SAP) [RFC2974]), the offerer may provide a list of ANC streams available with specific DID_SDID parameters in the fmtp line. The answerer may respond with all or a subset of the streams offered, along with fmtp lines with all or a subset of the DID_SDID parameters offered, or the answerer may reject the offer.

4. IANA Considerations

One media type (video/smpte291) has been defined and needs registration in the media types registry. See Section 3.1.

5. Security Considerations

RTP packets using the payload format defined in this specification are subject to the security considerations discussed in the RTP specification [RFC3550] and any applicable RTP profile, e.g., AVP [RFC3551].

To avoid potential buffer overflow attacks, receivers should take care to validate that the ANC data packets in the RTP payload are of the appropriate length (using the Data_Count field) for the ANC data type specified by DID & SDID. Also the Checksum_Word should be checked against the ANC data packet to ensure that its data has not been damaged in transit.
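
A receiver-side check along these lines might look like the following Python sketch. It is illustrative only: the function name validate_anc_packet and the optional expected_count argument (standing in for a per-type length that an implementation would look up by DID and SDID) are assumptions of this example, and the sample words in the usage line are arbitrary.

```python
def validate_anc_packet(did, sdid, data_count, udw, checksum,
                        expected_count=None):
    """Return a list of problems found in one decoded ANC data packet."""
    problems = []
    count = data_count & 0xFF
    parity = bin(count).count("1") & 1
    b8 = (data_count >> 8) & 1
    b9 = (data_count >> 9) & 1
    if b8 != parity or b9 == b8:
        problems.append("Data_Count parity bits are inconsistent")
    if count != len(udw):
        problems.append("Data_Count does not match the number of UDW")
    if expected_count is not None and count != expected_count:
        problems.append("unexpected length for this DID/SDID")
    # Nine-bit sum over DID, SDID, Data_Count and all UDW; b9 = NOT b8.
    total = sum(w & 0x1FF for w in (did, sdid, data_count, *udw)) & 0x1FF
    expected = ((((total >> 8) & 1) ^ 1) << 9) | total
    if checksum != expected:
        problems.append("Checksum_Word mismatch")
    return problems


if __name__ == "__main__":
    print(validate_anc_packet(0x161, 0x102, 0x203,
                              [0x200, 0x101, 0x102], 0x269))
```

Returning a list of problems rather than raising on the first one lets a re-embedder log every defect before deciding whether to drop the ANC data packet.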

Some receivers will simply move the ANC data packet bits from the RTP payload into a serial digital interface (SDI). It may still be a good idea for these "re-embedders" to perform the above mentioned validity tests to prevent downstream SDI systems from becoming confused by bad ANC data packets, which could be used for a denial of service attack.

"Re-embedders" into SDI should also double check that the Line_Number and Horizontal_Offset lead to the ANC data packet being inserted into a legal area to carry ancillary data in the SDI video bit stream of the output video format.

6. References

6.1. Normative References

[BT1120] ITU-R, "BT.1120-8, Digital Interfaces for HDTV Studio Signals", January 2012.
[BT1700] ITU-R, "BT.1700, Characteristics of Composite Video Signals for Conventional Analogue Television Systems", February 2005.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997.
[RFC3550] Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550, July 2003.
[RFC4855] Casner, S., "Media Type Registration of RTP Payload Formats", RFC 4855, DOI 10.17487/RFC4855, February 2007.
[RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, DOI 10.17487/RFC5234, January 2008.
[RFC6838] Freed, N., Klensin, J. and T. Hansen, "Media Type Specifications and Registration Procedures", BCP 13, RFC 6838, DOI 10.17487/RFC6838, January 2013.
[ST291] SMPTE, "ST 291-1:2011, Ancillary Data Packet and Space Formatting", 2011.

6.2. Informative References

[BT656] ITU-R, "BT.656-5, Interfaces for Digital Component Video Signals in 525-Line and 625-Line Television Systems Operating at the 4:2:2 Level of Recommendation ITU-R BT.601", December 2007.
[RFC2326] Schulzrinne, H., Rao, A. and R. Lanphier, "Real Time Streaming Protocol (RTSP)", RFC 2326, DOI 10.17487/RFC2326, April 1998.
[RFC2974] Handley, M., Perkins, C. and E. Whelan, "Session Announcement Protocol", RFC 2974, DOI 10.17487/RFC2974, October 2000.
[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, DOI 10.17487/RFC3264, June 2002.
[RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and Video Conferences with Minimal Control", STD 65, RFC 3551, DOI 10.17487/RFC3551, July 2003.
[RFC4175] Gharai, L. and C. Perkins, "RTP Payload Format for Uncompressed Video", RFC 4175, DOI 10.17487/RFC4175, September 2005.
[RFC5371] Futemma, S., Itakura, E. and A. Leung, "RTP Payload Format for JPEG 2000 Video Streams", RFC 5371, DOI 10.17487/RFC5371, October 2008.
[RFC5888] Camarillo, G. and H. Schulzrinne, "The Session Description Protocol (SDP) Grouping Framework", RFC 5888, DOI 10.17487/RFC5888, June 2010.
[RP168] SMPTE, "RP 168:2009, Definition of Vertical Interval Switching Point for Synchronous Video Switching", 2009.
[ST125] SMPTE, "ST 125:2013, SDTV Component Video Signal Coding 4:4:4 and 4:2:2 for 13.5 MHz and 18 MHz Systems", 2013.
[ST2038] SMPTE, "ST 2038:2008, Carriage of Ancillary Data Packets in an MPEG-2 Transport Stream", 2008.
[ST259] SMPTE, "ST 259:2008, SDTV Digital Signal/Data - Serial Digital Interface", 2008.
[ST274] SMPTE, "ST 274:2008, 1920 x 1080 Image Sample Structure, Digital Representation and Digital Timing Reference Sequences for Multiple Picture Rates", 2008.
[ST292] SMPTE, "ST 292-1:2012, 1.5 Gb/s Signal/Data Serial Interface", 2012.
[ST296] SMPTE, "ST 296:2012, 1280 x 720 Progressive Image 4:2:2 and 4:4:4 Sample Structure - Analog and Digital Representation and Analog Interface", 2012.

Author's Address

Thomas G. Edwards
FOX
10201 W. Pico Blvd.
Los Angeles, CA 90035
USA

Phone: +1 310 369 6696
EMail: thomas.edwards@fox.com