2.2.1 RTP Packets

The syntax of the RTP header is as specified in [RFC3550] section 5.1. The fields of the fixed RTP header have their usual meaning, which is specified in [RFC3550] section 5.1 and [RFC3551] section 2, with the following additional notes:
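As an illustration of the header layout referenced above, the following is a minimal sketch of parsing the fixed RTP header and CSRC list as laid out in [RFC3550] section 5.1. The names `RtpHeader` and `parse_rtp_header` are hypothetical and are not defined by this protocol; header extensions and padding are not processed.

```python
import struct
from typing import NamedTuple, Tuple


class RtpHeader(NamedTuple):
    """Hypothetical container for the fixed RTP header fields."""
    version: int
    padding: bool
    extension: bool
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrc: Tuple[int, ...]


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the 12-byte fixed header plus the CSRC list (network byte order)."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the 12-byte fixed RTP header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    csrc_count = b0 & 0x0F                 # CC: number of 32-bit CSRC entries
    end = 12 + 4 * csrc_count
    if len(packet) < end:
        raise ValueError("packet truncated inside the CSRC list")
    csrc = struct.unpack("!%dI" % csrc_count, packet[12:end])
    return RtpHeader(
        version=b0 >> 6,                   # V: top two bits
        padding=bool(b0 & 0x20),           # P
        extension=bool(b0 & 0x10),         # X
        marker=bool(b1 & 0x80),            # M
        payload_type=b1 & 0x7F,            # PT: low seven bits
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
        csrc=csrc,
    )
```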

Marker bit (M): In audio streams, if silence suppression is enabled, the marker bit (M) SHOULD be one for the first packet of a talk spurt and zero for all other packets. Failure to do so can result in reduced audio quality at the receiving end. If silence suppression is disabled, the marker bit can be one for the first packet in the stream, but SHOULD<3> be zero for all other packets. In video streams, the marker bit MUST be one for the last packet sent for each video frame, and zero for all other packets.
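The marker bit rules above can be sketched as two small helper functions. The function and parameter names are hypothetical. The sketch assumes that the sender chooses to set the bit on the first packet of the stream when silence suppression is disabled, which the text permits but does not require.

```python
def audio_marker(silence_suppression: bool,
                 first_in_talk_spurt: bool,
                 first_in_stream: bool) -> int:
    """Return the marker bit for an outgoing audio packet."""
    if silence_suppression:
        # One for the first packet of a talk spurt, zero otherwise.
        return 1 if first_in_talk_spurt else 0
    # Suppression disabled: optionally one for the first packet only
    # (this sketch assumes the sender opts to set it).
    return 1 if first_in_stream else 0


def video_marker(last_packet_of_frame: bool) -> int:
    """Return the marker bit for an outgoing video packet."""
    # One for the last packet sent for each video frame, zero otherwise.
    return 1 if last_packet_of_frame else 0
```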

Payload type (PT): The payload type field identifies the format of the RTP payload and determines its interpretation by the application. Codecs that are not assigned to static payload types MUST be assigned to a payload type within the dynamic range, which is 0x60 to 0x7f (96 to 127 decimal).

Codecs with payload type numbers in the dynamic range can use a different payload type number for send and receive.<4>
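As a hedged illustration, the dynamic-range check and the separate send and receive numbering could be tracked as shown below. `PayloadTypeMap` and its methods are hypothetical names; how the numbers are actually negotiated is outside the scope of this sketch.

```python
DYNAMIC_PT_MIN = 0x60  # 96
DYNAMIC_PT_MAX = 0x7F  # 127


def is_dynamic_payload_type(pt: int) -> bool:
    """True if pt falls within the dynamic payload type range."""
    return DYNAMIC_PT_MIN <= pt <= DYNAMIC_PT_MAX


class PayloadTypeMap:
    """Hypothetical bookkeeping for per-direction dynamic payload types."""

    def __init__(self) -> None:
        self.send = {}  # codec name -> payload type used when sending
        self.recv = {}  # payload type seen when receiving -> codec name

    def assign(self, codec: str, send_pt: int, recv_pt: int) -> None:
        for pt in (send_pt, recv_pt):
            if not is_dynamic_payload_type(pt):
                raise ValueError("payload type %d is not in the dynamic range" % pt)
        self.send[codec] = send_pt
        self.recv[recv_pt] = codec
```

For example, a codec could be sent as payload type 122 while being received as payload type 107, provided both numbers lie in the dynamic range.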

Codecs with payload type numbers in the static range MUST be used as specified in the following table. Codecs with payload types in the dynamic range can use a different payload type number, but MUST be used with the clock rate, packetization times (P-times), and number of channels specified in the following table.

The following table lists audio codecs with payload type numbers, clock rates, P-times, and channels:

| Payload type | Codec | Clock rate | P-times | Channels |
|---|---|---|---|---|
| 0 | G.711 µ-Law<5> | 8000 | 10, 20, 40, 60 | 1 |
| 3 | GSM 6.10<6> | 8000 | 20, 40, 60 | 1 |
| 4 | G.723.1 | 8000 | 30, 60, 90 | 1 |
| 8 | G.711 A-Law<7> | 8000 | 10, 20, 40, 60 | 1 |
| 9 or 117 | G.722<8> | 8000 | 20, 40, 60 | 1 |
| 13 | Comfort Noise<9> | 8000 | Not Applicable | 1 |
| 103 | Silk<10> | 8000 | 60 | 1 |
| 104 | Silk | 16000 | 20, 60, 100 | 1 |
| 106 | OPUS | 48000 | 20, 60 | 2 |
| 111 | Siren | 16000 | 20, 40, 60, 100, 200 | 1 |
| 112 | G.722.1 | 16000 | 20, 40, 60 | 1 |
| 114 | RT Audio | 16000 | 20, 40, 60 | 1 |
| 115 | RT Audio | 8000 | 20, 40, 60 | 1 |
| 116 | G.726 | 8000 | 20, 40, 60 | 1 |
| 117 | G.722<11> | 8000 | 20, 40, 60 | 2 |
| 118 | Comfort Noise<12> | 16000 | Not Applicable | 1 |
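One way to apply the audio codec table is to encode it as data and check a stream's parameters against it. The following is a sketch only: `AUDIO_CODECS` and `audio_params_allowed` are hypothetical names, the lookup assumes the payload type numbers listed in the table are in use (codecs in the dynamic range can be renumbered), and Comfort Noise rows are treated as accepting any P-time because the table marks P-times as not applicable.

```python
# (payload types, codec, clock rate, allowed P-times in ms or None, channels)
AUDIO_CODECS = [
    ((0,), "G.711 µ-Law", 8000, (10, 20, 40, 60), 1),
    ((3,), "GSM 6.10", 8000, (20, 40, 60), 1),
    ((4,), "G.723.1", 8000, (30, 60, 90), 1),
    ((8,), "G.711 A-Law", 8000, (10, 20, 40, 60), 1),
    ((9, 117), "G.722", 8000, (20, 40, 60), 1),
    ((13,), "Comfort Noise", 8000, None, 1),
    ((103,), "Silk", 8000, (60,), 1),
    ((104,), "Silk", 16000, (20, 60, 100), 1),
    ((106,), "OPUS", 48000, (20, 60), 2),
    ((111,), "Siren", 16000, (20, 40, 60, 100, 200), 1),
    ((112,), "G.722.1", 16000, (20, 40, 60), 1),
    ((114,), "RT Audio", 16000, (20, 40, 60), 1),
    ((115,), "RT Audio", 8000, (20, 40, 60), 1),
    ((116,), "G.726", 8000, (20, 40, 60), 1),
    ((117,), "G.722", 8000, (20, 40, 60), 2),
    ((118,), "Comfort Noise", 16000, None, 1),
]


def audio_params_allowed(pt: int, clock_rate: int,
                         ptime_ms: int, channels: int) -> bool:
    """True if some table row with this payload type matches the parameters."""
    for pts, _codec, rate, ptimes, chans in AUDIO_CODECS:
        if pt in pts and rate == clock_rate and chans == channels:
            if ptimes is None or ptime_ms in ptimes:
                return True
    return False
```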

The following table lists video codecs with payload type numbers and clock rates:

| Payload type | Codec | Clock rate |
|---|---|---|
| 34 | H.263 [MS-H26XPF]<13> | 90000 |
| 121 | RT Video | 90000 |
| 122 | H.264 [MS-H264PF]<14> | 90000 |
| 123 | H.264 FEC [MS-H264PF]<15> | 90000 |

The following table lists data codecs with payload type numbers and clock rates<16>:

| Payload type | Codec | Clock rate |
|---|---|---|
| 127 | x-data | 90000 |

SSRC: This field identifies the synchronization source. The identifier SHOULD either be chosen randomly or be configured by the methods specified in [MS-SDPEXT] section 3.1.5.31. The SSRC value MUST NOT be zero. The loop detection and collision resolution algorithms from [RFC3550] section 8.2 MUST NOT be used, because this protocol MAY use a Media Source ID (MSI) in the CSRC list of a packet from a mixer; see the definition of the CSRC list that follows for details. When the SSRC is not configured, it SHOULD NOT be changed within 2 seconds of the start of the stream or of a previous SSRC change, to prevent packets from being ignored by the throttling algorithm described in section 3.1.
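The SSRC selection rules for the non-configured case can be sketched as follows. `random_ssrc` and `SsrcState` are hypothetical names, and the caller is assumed to supply timestamps in seconds from a monotonic clock. Configuration through [MS-SDPEXT] is not modeled.

```python
import secrets

MIN_SSRC_CHANGE_INTERVAL = 2.0  # seconds, per the SHOULD NOT rule above


def random_ssrc() -> int:
    """Pick a random 32-bit SSRC, never zero."""
    while True:
        ssrc = secrets.randbits(32)
        if ssrc != 0:
            return ssrc


class SsrcState:
    """Hypothetical sender-side state for a randomly chosen SSRC."""

    def __init__(self, now: float) -> None:
        self.ssrc = random_ssrc()
        self.last_change = now  # the start of the stream counts as a change

    def try_change(self, now: float) -> bool:
        """Change the SSRC unless the last change was under 2 seconds ago."""
        if now - self.last_change < MIN_SSRC_CHANGE_INTERVAL:
            return False
        new_ssrc = random_ssrc()
        while new_ssrc == self.ssrc:
            new_ssrc = random_ssrc()
        self.ssrc = new_ssrc
        self.last_change = now
        return True
```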

CSRC list: This list identifies the contributing sources for the payload contained in this packet, as defined by [RFC3550] section 5.1. This protocol differs from [RFC3550] in the CSRC values: it uses the MSI, instead of the SSRC, of the contributing sources<17>. Additionally, for audio packets coming from mixers, the first CSRC in the list SHOULD be the dominant speaker at the moment the packet was generated, even if its audio stream is not included in the mix. For example, the packet sent by the mixer to the dominant speaker itself has its own SSRC on the CSRC list, even though its audio is not actually mixed in that packet. This means that a receiver MUST be able to handle receiving its own SSRC in the first position of the CSRC list without detecting a loop. CSRC list positions other than the first maintain their usual meaning, and a receiver can detect a loop if it receives its own SSRC in those positions.
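The receiver-side handling of the CSRC list described above can be sketched as follows. The function names are hypothetical, and `own_id` stands for whatever identifier the receiver would recognize as its own in the list.

```python
from typing import Optional, Sequence


def dominant_speaker(csrc_list: Sequence[int]) -> Optional[int]:
    """Return the first CSRC entry, which a mixer SHOULD set to the
    dominant speaker for audio packets, or None if the list is empty."""
    return csrc_list[0] if csrc_list else None


def loop_detected(own_id: int, csrc_list: Sequence[int]) -> bool:
    """Report a loop only if own_id appears after the first position.

    The first position is skipped because the mixer can place the
    receiver's own identifier there when the receiver is the dominant
    speaker, even though its audio is not in the mix.
    """
    return own_id in csrc_list[1:]
```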