CS 584 / CMPE 584
Multimedia Communications
Spring 2006-07
Advances in the Transport Layer
(RTP)
Shahab Baqai
LUMS
Outline
– Why RTP?
– RTP Features
– RTP Packet
– RTCP
– RTCP Reports
– Jitter Estimation
2
3
– Unicast
– Multicast
– Allow monitoring of the data delivery
– Scalable to large multicast networks
– Provide minimal control and identification functionality
– Audio/video profile in RFC 1890
4
– Why??
– Pros: multiplexing and checksum services
– Cons: does not support retransmissions upon packet loss
– Length indication
– Framing mechanism
– Possibility of packing multiple RTP packets into one network/transport packet
5
– Payload type identification
– Resequencing of out-of-order packets
– Timely playout of the media data using timestamps
– Monitors the quality of service
– Conveys information about the participants in an ongoing session
– Can run over any type of network, such as TCP/IP, ATM, or Frame Relay
location of a packet
– E.g. sequence numbers can be used in video decoding, without necessarily decoding packets in sequence
6
– Relies on lower layer services to do so
– Assumes neither that the underlying network is reliable nor that it delivers packets in sequence
– Provides sequence numbers to determine the proper location of a packet
7
– As a protocol, it is deliberately not complete
– RTP is tailored to applications through modifications, rather than by providing options
– A profile specs
Defines
– Payload format specs
8
– Application level framing
– Integrated layer processing
9
[Figure: protocol stack. Application: PCMA audio, MPEG2 video; RTP; Transport: UDP; Network: IP; Data Link: Ethernet, Frame Relay]
10
video once received
IP Header | UDP Header | RTP Header | RTP Video Payload
IP Header | UDP Header | RTP Header | RTP Audio Payload
11
– Fixed RTP packet header
– List of contributing sources (possibly empty)
– Payload data
– Data transported by RTP in a packet
E.g. audio samples, compressed video data, etc …
12
– (V) Version: 2 bits
– (P) Padding: 1 bit
– (X) Extension: 1 bit
– (CC) CSRC Count: 4 bits
– (M) Marker: 1 bit
– (PT) Payload Type: 7 bits
– Sequence Number: 16 bits
– Time Stamp: 32 bits
– SSRC: 32 bits
– CSRC List
[Figure: RTP packet layout. First 32-bit word: V=2, P, X, CC, M, PT, Sequence Number; followed by Time Stamp, SSRC, CSRC identifiers …, and Payload]
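The field widths listed above can be turned into a small parser. The following is a minimal sketch, assuming network byte order and the usual field order (V, P, X, CC in the first octet; M and PT in the second); the names `RtpHeader` and `parse_rtp_header` are illustrative and not part of any library.

```python
import struct
from typing import NamedTuple, Tuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrc_list: Tuple[int, ...]


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the 12-octet fixed header and the CSRC list that follows it."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F                      # CSRC count: low 4 bits of octet 0
    end = 12 + 4 * cc                   # each CSRC identifier is 32 bits
    if len(packet) < end:
        raise ValueError("packet truncated inside the CSRC list")
    csrcs = struct.unpack("!%dI" % cc, packet[12:end])
    return RtpHeader(
        version=b0 >> 6,                # top 2 bits
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=cc,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
        csrc_list=csrcs,
    )
```

The payload (and any header extension) starts at offset `12 + 4 * csrc_count`; the sketch does not interpret it.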
13
– Version (V): indicates the version of RTP in use. The current version is 2.
– Padding (P): if P is set, the packet contains one or more additional padding octets at the end, which are not part of the payload.
– The last octet of padding contains a count of how many padding octets are to be ignored, including itself.
– Padding is needed by some encryption algorithms, which require fixed block sizes, or for carrying several RTP packets in a lower-layer PDU.
– Extension (X): if X is set, the fixed header is followed by exactly one header extension, whose format is defined by the profile.
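The padding rule for the P bit can be sketched as follows. This is an illustrative helper (`strip_padding` is a made-up name), assuming the last octet's count includes the count octet itself; it only checks against the 12-octet fixed header, not against a CSRC list or header extension.

```python
def strip_padding(packet: bytes) -> bytes:
    """Return the packet without its trailing padding octets.

    The P bit in the header is left untouched; a real implementation
    would normally track the payload length instead of copying bytes.
    """
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    if not packet[0] & 0x20:            # P bit clear: nothing to remove
        return packet
    pad = packet[-1]                    # last octet holds the padding count
    if pad == 0 or pad > len(packet) - 12:
        raise ValueError("invalid padding count")
    return packet[:-pad]
```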
14
– CSRC Count (CC): indicates the number of CSRC identifiers that follow the fixed header.
– The field has a non-zero value only if the packet has passed through a mixer.
– Marker (M): if M is set, it marks a significant event, such as a frame boundary, in the packet stream.
– For example, an RTP marker bit is set if the packet contains a few bits of the previous frame along with the current frame.
– Payload Type (PT): indicates the payload type carried by the RTP packet.
– The RTP Audio/Video Profile (AVP) contains a default static mapping of payload type codes to payload formats.
– Additional payload types can be registered with IANA.
15
– Sequence Number: increments by one for each RTP data packet sent; the initial value is random.
– The receiver can use the sequence number not only to detect packet loss but also to restore the packet sequence.
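Because the sequence number is only 16 bits wide, a receiver has to compare values modulo 2^16. The sketch below shows one common way to do this; the function names are hypothetical, and the approach assumes the packets being compared are less than 32768 sequence numbers apart.

```python
import functools
from typing import List


def seq_delta(a: int, b: int) -> int:
    """Signed distance from b to a modulo 2**16 (positive if a is newer)."""
    return ((a - b + 0x8000) & 0xFFFF) - 0x8000


def missing_between(prev: int, cur: int) -> List[int]:
    """Sequence numbers skipped between two consecutively received packets."""
    return [(prev + k) & 0xFFFF for k in range(1, seq_delta(cur, prev))]


def reorder(seqs: List[int]) -> List[int]:
    """Restore sending order for a small window of sequence numbers."""
    return sorted(seqs, key=functools.cmp_to_key(seq_delta))
```

For instance, `missing_between(65534, 1)` reports 65535 and 0 as lost, even though the counter wrapped in between.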
– Time Stamp: reflects the sampling instant of the first octet in the RTP data packet.
– The sampling instant must be derived from a clock that increments monotonically and linearly in time, to allow synchronization and jitter calculations at the receiver.
– The initial value should be random, so as to prevent known-plaintext attacks.
– Several packets may have equal timestamps (e.g. when they belong to the same video frame), and timestamps may even be out of order, e.g. for interpolated frames in MPEG.
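On the sender side, the sequence number and timestamp rules can be sketched together. The class below is a hypothetical illustration, assuming an 8000 Hz audio clock and 20 msec chunks in the usage example (so the timestamp advances by 160 ticks per packet); it writes no CSRC list, padding, or header extension.

```python
import secrets
import struct


class RtpPacketizer:
    """Builds RTP packets with random initial sequence number and timestamp."""

    def __init__(self, payload_type: int, clock_rate: int):
        self.payload_type = payload_type
        self.clock_rate = clock_rate              # timestamp ticks per second
        self.ssrc = secrets.randbits(32)          # random 32-bit identifier
        self.seq = secrets.randbits(16)           # random initial value
        self.timestamp = secrets.randbits(32)     # random initial value

    def packetize(self, payload: bytes, ticks: int, marker: bool = False) -> bytes:
        """Prefix payload with a fixed header, then advance seq and timestamp."""
        header = struct.pack(
            "!BBHII",
            0x80,                                  # V=2, P=0, X=0, CC=0
            (int(marker) << 7) | self.payload_type,
            self.seq,
            self.timestamp,
            self.ssrc,
        )
        self.seq = (self.seq + 1) & 0xFFFF                       # +1 per packet
        self.timestamp = (self.timestamp + ticks) & 0xFFFFFFFF   # sampling clock
        return header + payload


# Usage: 20 msec of 8000 Hz audio is 160 clock ticks per packet.
packetizer = RtpPacketizer(payload_type=8, clock_rate=8000)
ticks_per_chunk = packetizer.clock_rate * 20 // 1000
first = packetizer.packetize(b"\x00" * 160, ticks_per_chunk)
second = packetizer.packetize(b"\x00" * 160, ticks_per_chunk)
```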
16
17
– SSRC (synchronization source): the source of a stream of RTP packets
– 32-bit numeric identifier, randomly chosen and meant to be globally unique
– Carried in the RTP header (thus not dependent on the network address)
– Allows the receiver to group packets from the same synchronization source
To identify the use of the same timing & sequence number space
– A participant need not use the same SSRC identifier for all RTP sessions in a multimedia session
Binding of identifiers is provided through RTCP
– If a participant generates multiple streams in one RTP session (separate video cameras) each must be identified as a different SSRC
18
– Must not just pick IP address, or other local identifier
– Must not just call random() without carefully initializing state
– Probability of an SSRC collision when joining a 1000-user session: about 1 in 5 million
– If a source sees another with the same SSRC, it sends a BYE and picks again
– Receivers can temporarily suppress a conflicting SSRC using the transport address
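The collision figures can be checked with a short calculation. This is a sketch under the assumption that identifiers are drawn uniformly from the 32-bit space; the function names are illustrative. Joining a session whose N members already hold distinct identifiers collides with probability N / 2^32, while N sources all choosing at the same moment collide with probability of roughly 1 - exp(-N^2 / 2^33), the approximation used in RFC 3550.

```python
import math
import secrets


def new_ssrc() -> int:
    """Draw a 32-bit identifier from the operating system's random source."""
    return secrets.randbits(32)


def p_collision_on_join(n_existing: int) -> float:
    """Chance that one newcomer picks an identifier already in use."""
    return n_existing / 2**32


def p_collision_simultaneous(n: int) -> float:
    """Approximate chance of any collision when n sources pick at once."""
    return 1.0 - math.exp(-n * n / 2**33)


odds = 1 / p_collision_on_join(1000)      # about 4.3 million to one
burst = p_collision_simultaneous(1000)    # on the order of 1e-4
```

Under these assumptions the join case works out to about 1 in 4.3 million for 1000 users, the same order of magnitude as the figure quoted on the slide.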
19
20
– A multicast group address is assigned
– 2 UDP ports are assigned
One for RTP-packaged audio data
One for RTCP
– Address & port info is distributed to the intended participants
– An encryption key is also distributed if privacy is required
– Allocation and distribution mechanisms are outside the scope of RTP
21
– Audio conferencing application at each participant sends audio data in small chunks (e.g. of 20 msec duration)
An RTP header is prepended to each chunk
Senders can change encoding during a conference, e.g.:
– To accommodate a new participant connected through a low-bandwidth link
– To react to network congestion
– Performed separately for each source
– To estimate how many packets are lost
RTP packet (header + data) is encapsulated in a UDP packet
22
– Dynamic participant group
Useful to know:
– Audio application periodically multicasts:
A reception report
The name of its user
– RTCP BYE packet sent when participant leaves conference
23
– Separate RTP sessions & RTCP packets for each medium (audio & video)
Different UDP port pairs and/or multicast addresses
Use the same distinguished (canonical) name for both sessions in RTCP packets to associate the two sessions
Separation allows participants to receive only one medium
– Synchronized playback of a source’s audio & video achieved using timing info carried in the RTCP packets for both sessions
24
Used to mix multiple audio streams into one to handle a low-bandwidth link
– Resynchronize incoming audio packets
– Reconstruct the 20 msec spacing generated by senders
– Mix the reconstructed audio streams into a single stream
– Translate the audio encoding into a lower-bandwidth encoding
– Forward the lower-bandwidth packet stream on its way
– RTP header includes:
Means to identify sources contributing to a mixed stream
25
– A session is defined by a particular pair of destination transport addresses
One network address
A pair of ports (one RTP & one RTCP)
May be common to all participants (as in IP multicast)
May be different for each (individual network address + common port address)
26
– Payload type may be switched during a session
– SSRC identifies a single timing and sequence number space
Different clock rates for different streams
27
– RTCP sender & receiver reports describe only one timing & sequence number space per SSRC
– An RTP mixer would not be able to combine interleaved streams of incompatible media
– Carrying different media in one session precludes:
Use of different network paths
Use of different network resource allocation
Reception of only the desired media (e.g. audio, not video)
28
– Allow RTP traffic to pass through an Internet firewall
– Mix and/or recode data to fit over a low-bandwidth link
– Typically used in the firewall case
– Allows sources to remain distinct even though packets all appear to come from the translator’s network address
– May also change payload type or combine multiple packets into one
– Output of a mixer has a newly assigned SSRC with new sequence numbers and timestamps
– Original sources are listed in each packet in the CSRC list
29
– RTP / RTCP header contains a padding bit to recover the original packet length
– Random start values of the sequence number prevent known-plaintext attacks
– Decryption is verified by sanity checking fields in the header
– Algorithm choice is left to the application by external means
– Network layer encryption can be used where available
– Authentication could be accomplished via encapsulation
– Good for hardware which both decrypts & processes payload data
30
Mixer is an intermediate system that combines RTP streams from different sources into a single stream. It can change the data format of the RTP packets.
31
– Receiver report (RR), Sender report (SR), and Source description (SDES)
– To modify sender transmission rates
– For diagnostic purposes
32
– Monitoring tools can get info without sorting through all data packets
– Overall packet rate is reduced, saving both bandwidth & processing cost
– No explicit count needed: overall length is provided at a lower layer
– Losses in multicast distribution can be quickly isolated
– Senders can adapt to current network conditions
– Sources retain identity over time even if a collision occurs in SSRC
– RTP streams from multiple tools can be bound together
33
– Might be included in a session announcement or inferred by scope
– RTP currently recommends 5% of total session bandwidth
– Deviations from this should be specified in the profile documents
– Each participant keeps track of the number of active senders & receivers
– Senders get 25% of total RTCP bandwidth, receivers get the rest
– Minimum report interval is 5 secs., to avoid bursts on small sessions
– Actual interval is randomized to avoid unintentional synchronization
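The rules above can be combined into a rough interval calculation. This is a simplified sketch loosely following the algorithm in RFC 3550, not a complete implementation: the function and parameter names are illustrative, the 25% sender share is applied only when senders are at most a quarter of the members (a condition taken from the RFC rather than from the slide), and the RFC's further timer compensation and initial-interval rules are omitted.

```python
import random


def rtcp_interval(members: int, senders: int, session_bw_bps: float,
                  avg_rtcp_size_bytes: float, we_sent: bool,
                  rtcp_fraction: float = 0.05,
                  min_interval: float = 5.0) -> float:
    """Deterministic RTCP report interval in seconds, before randomization."""
    rtcp_bw = session_bw_bps * rtcp_fraction / 8.0   # RTCP share, bytes/second
    n = members
    if 0 < senders <= members * 0.25:
        if we_sent:
            rtcp_bw *= 0.25          # senders share a quarter of RTCP bandwidth
            n = senders
        else:
            rtcp_bw *= 0.75          # receivers share the rest
            n = members - senders
    # Time for everyone in our group to send one average-sized report.
    return max(avg_rtcp_size_bytes * n / rtcp_bw, min_interval)


def randomized(interval: float, rng: random.Random) -> float:
    """Spread reports over [0.5, 1.5] times the computed interval."""
    return interval * rng.uniform(0.5, 1.5)
```

With 1000 members, 10 senders, a 1 Mbit/s session and 100-byte reports, a receiver gets an interval of about 21 seconds, while a sender falls back to the 5-second minimum.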
34
– Supervise the network QoS
– Flow control and congestion control
– Identification of participants
– Persistent id (CNAME = Canonical Name)
– Determine the number of participants
– Session information
– RTCP traffic < 5%
– SR : sender reports
information on the source
source statistics
– RR : reception reports
receiver statistics
– SDES : source description
CNAME
– BYE : end of the participation
– APP : application specific functions
35
36
37
38
39
– Absolute timestamp (NTP)
– Timestamp (RTP)
– Number of RTP packets sent
– Number of RTP bytes sent
40
– Fraction of lost packets
– Number of lost packets
– Last sequence number received
– Estimation of the jitter
– Timestamp of the last SR received (LSR)
– Delay since the last SR received (DLSR)
41
– Any two reports can be differenced to get activity over an interval
– NTP timestamps in reports allow computation of rates
– Monitoring tools need not know anything about particular encodings
– Average packet rate & average data rate over any interval
– Monitoring tools can compute this without reading any of the data
– Extended sequence number can be used to compute packets expected
– Packets lost & packets expected give the long-term loss rate
– Fraction lost field gives the short-term loss rate, with only a single report
– LSR & DLSR give the sender the ability to compute round-trip time
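The loss and round-trip computations can be sketched as below. The helper names and argument layout are illustrative; the arithmetic follows the conventions of RFC 3550, where the fraction lost is an 8-bit fixed-point value (lost/expected scaled by 256) and LSR, DLSR and the arrival time are all expressed in units of 1/65536 second.

```python
from typing import Tuple


def loss_stats(base_seq: int, ext_max_seq: int, received: int,
               expected_prior: int, received_prior: int) -> Tuple[int, int]:
    """Return (cumulative packets lost, fraction lost since the last report)."""
    expected = ext_max_seq - base_seq + 1       # from extended sequence numbers
    lost = expected - received                  # long-term loss
    expected_interval = expected - expected_prior
    received_interval = received - received_prior
    lost_interval = expected_interval - received_interval
    if expected_interval == 0 or lost_interval <= 0:
        fraction = 0
    else:
        fraction = (lost_interval << 8) // expected_interval
    return lost, fraction


def round_trip_time(arrival: int, lsr: int, dlsr: int) -> float:
    """Sender-side RTT in seconds from a receiver report.

    arrival is the time the report was received, taken as the middle
    32 bits of the NTP timestamp, the same format as LSR.
    """
    return ((arrival - lsr - dlsr) & 0xFFFFFFFF) / 65536.0
```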
42
– $D_i = (R_i - R_{i-1}) - (S_i - S_{i-1})$
– $J_i = \frac{15}{16} J_{i-1} + \frac{1}{16} |D_i|$
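The two jitter formulas translate directly into a running estimator. In this sketch R is the arrival time and S the RTP timestamp of each packet, both assumed to be in the same clock units; the class name is illustrative and timestamp wraparound is ignored for brevity.

```python
from typing import Optional


class JitterEstimator:
    """Running interarrival jitter: J = (15/16) * J + (1/16) * |D|."""

    def __init__(self) -> None:
        self.jitter = 0.0
        self._prev_transit: Optional[int] = None

    def update(self, arrival: int, rtp_timestamp: int) -> float:
        # D = (R_i - R_{i-1}) - (S_i - S_{i-1}), rewritten as the
        # difference of the two relative transit times (R - S).
        transit = arrival - rtp_timestamp
        if self._prev_transit is not None:
            d = abs(transit - self._prev_transit)
            self.jitter += (d - self.jitter) / 16.0
        self._prev_transit = transit
        return self.jitter


# Usage: packets sent every 160 ticks; the second arrives 10 ticks late.
estimator = JitterEstimator()
samples = [estimator.update(r, s) for r, s in [(0, 0), (170, 160), (320, 320)]]
```

The 1/16 gain smooths out individual spikes, so a single late packet moves the estimate by only a sixteenth of its deviation.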
43
– Default mapping between payload type numbers & encodings
– Use of marker bits
– Frequency of the RTP clock used to generate timestamps
44
– RTP header extension
– Additional RTCP packet types
– RTCP report interval
– Use of security / encryption
45
46
– Dynamic payload type assignment possible
– Static payload type only if encoding is beneficial to the entire Internet community
47
– 8000, 11025, 16000, 22000, 24000, 32000, 44100, 48000 Hz
– G.711, G.723
– GSM based
– µ-law, A-law
– …
48
– H.261, H.263
– MPEG 1 / MPEG 2
– JPEG
– CelB
– …
49
50
– Such as the previous value of the motion vector
51
– The H.261 information is carried as payload data within the RTP packet
52
– Payload type is set to H.261 payload format
– RTP timestamp encodes the sampling instant of the first video image contained in the RTP packet
Will be the same for all RTP packets that belong to the same image
Packets carrying data from different frames must have different timestamps
Frames may be distinguished by timestamp
If multiple frames are contained in one RTP packet, the timestamp refers to the first frame
– The RTP header’s marker bit is set to 1 in the last packet of a video frame
Otherwise must be zero
53
– Start Bit Position (SBIT, 3 bits):
Indicates the number of most significant bits that should be ignored in the first data octet
– End Bit Position (EBIT, 3 bits):
Indicates the number of least significant bits that should be ignored in the last data octet
– Intra Frame Encoded (I, 1 bit):
Set to 1 if this stream contains only INTRA-coded blocks
Set to 0 if the stream may or may not contain INTRA-coded blocks
This bit must not change during the course of an RTP session
54
– Motion Vector Flag (V, 1 bit):
Set to 0 if motion vectors are not used in this stream
Set to 1 if motion vectors may or may not be used in this stream
This bit must not change during the course of an RTP session
– GOB Number (GOBN, 4 bits):
Encodes the current GOB number at the start of the packet
Set to 0 if packet begins with a GOB header
– Macroblock Address Predictor (MBAP, 5 bits):
Indicates the address of the last MB encoded in the previous packet
The predictor ranges from 0 to 32, to predict the valid MBAs 1-33
Since the bitstream cannot be fragmented between the GOB header and the first macroblock (MB 1), the predictor at the start of a packet can never be zero
Thus the range is 1-32, which is made to fit in 5 bits by biasing the value by -1
MBAP is set to 0 if the packet starts with a GOB header
55
– Quantizer (QUANT, 5 bits):
Quantizer value (MQUANT or GQUANT) in effect prior to the start of this packet
– Horizontal Motion Vector Data (HMVD, 5 bits):
Reference horizontal motion vector data (MVD) Set to 0 if :
– Vertical Motion Vector Data (VMVD, 5 bits):
Reference vertical motion vector data (MVD) Set to 0 if :
I & V are “hint” flags as they can be inferred from the encoded bitstream. They are included to help the decoder make otherwise impossible optimizations based on these hints. These bits cannot change for the duration of the stream
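The H.261 payload header fields described above add up to 32 bits (3 + 3 + 1 + 1 + 4 + 5 + 5 + 5 + 5), so they can be unpacked from a single word. The sketch below assumes the fields appear in the order listed, most significant bits first, and returns HMVD and VMVD as raw 5-bit values without interpreting their sign; the function name and dictionary keys are illustrative.

```python
import struct
from typing import Dict


def parse_h261_header(data: bytes) -> Dict[str, int]:
    """Split the 4-octet H.261 payload header into its bit fields."""
    if len(data) < 4:
        raise ValueError("need at least 4 octets for the H.261 header")
    (word,) = struct.unpack("!I", data[:4])
    return {
        "sbit": word >> 29,              # 3 bits: bits to skip in first octet
        "ebit": (word >> 26) & 0x7,      # 3 bits: bits to skip in last octet
        "i": (word >> 25) & 0x1,         # 1 bit: INTRA-only hint
        "v": (word >> 24) & 0x1,         # 1 bit: motion vector hint
        "gobn": (word >> 20) & 0xF,      # 4 bits: GOB number
        "mbap": (word >> 15) & 0x1F,     # 5 bits: MB address predictor, biased by -1
        "quant": (word >> 10) & 0x1F,    # 5 bits: quantizer in effect
        "hmvd": (word >> 5) & 0x1F,      # 5 bits: raw horizontal MVD
        "vmvd": word & 0x1F,             # 5 bits: raw vertical MVD
    }
```

The H.261 bitstream itself starts immediately after these four octets.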
56
– Fragmentation on GOB boundary
– Fragmentation on MB boundary
57
58
– Requests the sender encode the next full frame using intra-coding
– Notifies the sender that certain RTP packets were lost
– The encoder should encode the GOBs in the next frame which correspond to the lost GOBs using intra-coding
59
– Provides maximum interoperability among MPEG systems
– Provides interoperability with other Internet-based end systems
60
61
– E.g. sender switches data sources
62
63
– MBZ: unused, reserved for future use. Must be zero
– TR (10 bits):
Temporal reference of the current frame within the current GOP
Value ranges between 0 & 1023
Same in all RTP packets belonging to the same frame
– Sequence Header Present (S, 1 bit):
Set to 1 when the RTP packet contains a sequence header
64
– Beginning of Slice (B, 1 bit – BS):
start code is preceded by only one or more Video_sequence_header, GOP_header and/or Picture_header
– End of Slice (E, 1 bit –ES)
– Picture Type (P, 3 bits):
given picture
– FBV (1 bit):
– BFC (3 bits):
65
– FFV (1 bit):
– FFC (3 bits):
– FBV, BFC, FFV and FFC are obtained from the most recent picture header
66
67
– MBZ: unused, reserved for future use. Must be zero
– Fragment Offset (16 bits):
68
– rtsp://cs.lums.edu.pk/class_video
– SDP (Session Description Protocol)
– start, pause, resume, end