3.1.5.24 Extension for video receive capabilities a=x-caps

This protocol defines a video media-level attribute, a=x-caps, which represents what a video receiver is capable of receiving. A video capability a= line defines the video capabilities for each of the video codecs that the video receiver is capable of receiving.<49>

This attribute is optional; if it is missing, a video sender SHOULD set the video receive capabilities of the remote peer to Common Intermediate Format (CIF) at 15 fps and Video Graphics Array (VGA) at 15 fps. Quarter CIF (QCIF, which represents a video resolution of 176 width by 144 height), CIF, and VGA MUST be advertised in the list of video capabilities for a media stream that has an a=label:main-video SDP attribute. A video capability of VGA at 13 fps MUST be treated as VGA at 15 fps.
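
The defaults above can be illustrated with a minimal Python sketch. It shows one way a sender might choose the assumed receive capabilities when no a=x-caps attribute is present, and how VGA at 13 fps could be normalized. The names VideoCap, DEFAULT_CAPS, normalize, and effective_caps are hypothetical and are not defined by this protocol.

```python
from typing import List, NamedTuple, Optional


class VideoCap(NamedTuple):
    width: int
    height: int
    fps: float


# Assumed receive capabilities when the peer sends no a=x-caps attribute:
# CIF at 15 fps and VGA at 15 fps.
DEFAULT_CAPS = [VideoCap(352, 288, 15.0), VideoCap(640, 480, 15.0)]


def normalize(cap: VideoCap) -> VideoCap:
    # VGA at 13 fps is treated as VGA at 15 fps.
    if (cap.width, cap.height) == (640, 480) and cap.fps == 13.0:
        return cap._replace(fps=15.0)
    return cap


def effective_caps(advertised: Optional[List[VideoCap]]) -> List[VideoCap]:
    # None stands for "the a=x-caps attribute was missing" in this sketch.
    if advertised is None:
        return list(DEFAULT_CAPS)
    return [normalize(cap) for cap in advertised]
```

In this sketch, passing None models a missing attribute; any other list is returned with only the VGA 13 fps entry adjusted.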

High-definition (HD), which represents a video resolution of 1280 width by 720 height, is an additional video capability that can be advertised by a video receiver.

High-definition 1080p (HD1080p), which represents a video resolution of 1920 width by 1080 height, is an additional video capability that can be advertised by a video receiver.

This media attribute has the following format in ABNF notation, as described in [RFC5234]:

Video-Capabilities-media-type-attribute:

 "a=x-caps:" <video-payload-type> SPACE <list-of-video-capabilities> CRLF

video-payload-type: The RTP payload type number for video, such as 121 for x-rtvc1.

list-of-video-capabilities:

 <video-capability>";"<video-capability>

video-capability:

 <Capability-ID>":"<Width-of-video-frame>":"<Height-of-video-frame>":"<frames-per-second>":"<maximum-bitrate-bits-per-second>":"<additional-attributes>

Capability-ID (integer): A random integer that is unique among the listed capability IDs for that m= line. The value is between 1 and 2147483647 in the entire video-capabilities-media-type-attribute.

Width-of-video-frame (integer): The width is one of the following values:

  • 176 for QCIF

  • 352 for CIF

  • 640 for VGA

  • 1280 for HD

  • 1920 for HD1080p

Height-of-video-frame (integer): The height is one of the following values:

  • 144 for QCIF

  • 288 for CIF

  • 480 for VGA

  • 720 for HD

  • 1080 for HD1080p

frames-per-second (float): Value SHOULD be less than or equal to 30.0 fps. Any values beyond 30.0 SHOULD be treated as 30.0. The value specifies the maximum frame rate the receiver is capable of receiving.

maximum-bitrate-bits-per-second (integer): Ignored, and reserved for future use.

additional-attributes: Any additional attributes SHOULD be ignored; they are reserved for future use.
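
The field layout described above can be sketched as a small Python parser. This is an illustration under stated assumptions, not a normative implementation: the names VideoCapability, parse_capability, and parse_x_caps are hypothetical, and the choice to raise ValueError on malformed input, on fewer than six colon-separated fields, or on duplicate capability IDs is this sketch's own reading of the format.

```python
from typing import List, NamedTuple, Tuple

MAX_CAPABILITY_ID = 2147483647


class VideoCapability(NamedTuple):
    cap_id: int
    width: int
    height: int
    fps: float
    max_bitrate: int  # parsed but otherwise ignored; reserved for future use


def parse_capability(text: str) -> VideoCapability:
    fields = text.split(":")
    # Assumption: all six colon-separated fields are present.
    if len(fields) < 6:
        raise ValueError("expected six ':'-separated fields: %r" % text)
    cap_id = int(fields[0])
    if not 1 <= cap_id <= MAX_CAPABILITY_ID:
        raise ValueError("capability ID out of range: %d" % cap_id)
    # Frame rates beyond 30.0 are treated as 30.0.
    fps = min(float(fields[3]), 30.0)
    # fields[5:] holds the additional attributes, which are ignored.
    return VideoCapability(cap_id, int(fields[1]), int(fields[2]), fps, int(fields[4]))


def parse_x_caps(line: str) -> Tuple[int, List[VideoCapability]]:
    prefix = "a=x-caps:"
    if not line.startswith(prefix):
        raise ValueError("not an a=x-caps attribute")
    payload_type, _, cap_list = line[len(prefix):].strip().partition(" ")
    caps = [parse_capability(item) for item in cap_list.split(";")]
    ids = [cap.cap_id for cap in caps]
    if len(set(ids)) != len(ids):
        raise ValueError("capability IDs are not unique")
    return int(payload_type), caps
```

The parser keeps the width and height as received and does not check them against the listed resolutions; a stricter receiver could add that check.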

A protocol peer, upon receiving the video capabilities, SHOULD<50> do the following:

  • If there is a "," in a video-capability attribute, anything from the "," to the end of that video-capability (";" or CRLF) SHOULD be ignored.

  • If there is a syntax error in a=x-caps, the whole a=x-caps SHOULD be ignored and video receive capabilities of the remote peer SHOULD be set as CIF at 15 fps or VGA at 15 fps.
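
The two receiver rules above can be sketched in Python as follows. The function name receive_caps and the constants CIF_15 and VGA_15 are hypothetical. What counts as a syntax error here (a missing prefix, a non-numeric field, or fewer than four fields left after the "," is cut off) is an assumption of this sketch, as is the choice of CIF at 15 fps as the default fallback; VGA at 15 fps can be passed instead.

```python
from typing import List, Tuple

Cap = Tuple[int, int, float]  # (width, height, frames per second)

CIF_15: Cap = (352, 288, 15.0)
VGA_15: Cap = (640, 480, 15.0)


def receive_caps(line: str, fallback: Cap = CIF_15) -> List[Cap]:
    try:
        prefix = "a=x-caps:"
        if not line.startswith(prefix):
            raise ValueError("not an a=x-caps attribute")
        body = line[len(prefix):].rstrip("\r\n")
        payload_type, _, cap_list = body.partition(" ")
        int(payload_type)  # raises ValueError if not numeric
        caps = []
        for item in cap_list.split(";"):
            # Ignore everything from "," to the end of this video-capability.
            item = item.split(",", 1)[0]
            fields = item.split(":")
            if len(fields) < 4:
                raise ValueError("too few fields: %r" % item)
            int(fields[0])  # capability ID must be an integer
            fps = min(float(fields[3]), 30.0)
            caps.append((int(fields[1]), int(fields[2]), fps))
        return caps
    except ValueError:
        # Syntax error: ignore the whole a=x-caps and fall back.
        return [fallback]
```

For example, receive_caps("a=x-caps:121 garbage") yields the fallback list, while a capability with trailing ",..." text is parsed as if that text were absent.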

The following is an example a=x-caps attribute:

 a=x-caps:121 263:1920:1080:30.0:2000000:1;4359:1280:720:30.0:1500000:1;8455:640:480:30.0:600000:1;12551:640:360:30.0:600000:1;16647:352:288:15.0:250000:1;20743:424:240:15.0:250000:1;24839:176:144:15.0:180000:1

The a=x-caps attribute is not supported for the H.264UC or ULPFEC-UC video media formats (section 3.1.5.3). An a=x-caps attribute SHOULD NOT be included for H.264UC or ULPFEC-UC. An a=x-caps attribute for H.264UC or ULPFEC-UC, if present in a received SDP message, MUST be ignored.
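
As an illustration of the last rule, the following Python sketch filters out received a=x-caps attributes whose payload type maps to H.264UC or ULPFEC-UC. The function name applicable_x_caps is hypothetical, and the a=rtpmap encoding names "X-H264UC" and "x-ulpfecuc" used to recognize the two formats are assumptions of this sketch; the authoritative names are those given in section 3.1.5.3.

```python
from typing import Dict, List

# Assumed a=rtpmap encoding names (compared in lowercase) for the formats
# that do not support a=x-caps.
UNSUPPORTED_ENCODINGS = {"x-h264uc", "x-ulpfecuc"}


def applicable_x_caps(media_lines: List[str]) -> List[str]:
    """Return the a=x-caps lines of one m= section that are not ignored."""
    encoding_by_pt: Dict[str, str] = {}
    for line in media_lines:
        if line.startswith("a=rtpmap:"):
            pt, _, encoding = line[len("a=rtpmap:"):].partition(" ")
            encoding_by_pt[pt] = encoding.split("/")[0].lower()

    kept = []
    for line in media_lines:
        if line.startswith("a=x-caps:"):
            pt = line[len("a=x-caps:"):].split(" ", 1)[0]
            if encoding_by_pt.get(pt) in UNSUPPORTED_ENCODINGS:
                continue  # a=x-caps for H.264UC or ULPFEC-UC is ignored
            kept.append(line)
    return kept
```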