
Adaptive Loss Concealment for Internet Telephony Applications

Adaptive Loss Concealment for Internet Telephony Applications. Henning Sanneck, GMD FOKUS, Berlin (sanneck@fokus.gmd.de). Supported by the USMInt (DFN) and Multicube (ACTS) projects.


  1. Adaptive Loss Concealment for Internet Telephony Applications
     Henning Sanneck, GMD FOKUS, Berlin, sanneck@fokus.gmd.de
     Supported by the USMInt (DFN) and Multicube (ACTS) projects

  2. Overview
     • Motivation
     • Receiver-Based Concealment
     • Adaptive Packetization / Concealment (Sender/Receiver operation)
     • Properties (packet sizes / header overhead, delay)
     • Subjective Test
     • Conclusions

  3. Motivation: Loss of Speech Packets
     • Congestion in the Internet / Mbone → packet loss → speech signal dropouts → need to enhance speech quality
     • Solutions: bandwidth adaptation, resource reservation, differentiated services, redundancy/FEC, interleaving, receiver-based concealment

  4. Packet Repetition (Receiver-Based)
     [Figure: three waveform panels over sample number n: Sender s(n), Receiver s~(n), Receiver (concealed) ŝ(n)]
     Legend: n: sample number; p(n): pitch period; L: packet size

  5. Pitch Waveform Replication (Receiver-Based)
     [Figure: three waveform panels over sample number n: Sender s(n), Receiver s~(n), Receiver (concealed) ŝ(n)]
     Legend: p(n): pitch period; L: packet size

  6. New approach
     [Figure: three waveform panels over sample number n: Sender s(n), Receiver s~(n), Receiver (concealed) ŝ(n)]
     Legend: c: "chunk" number; p(c): pitch period; L(c,c-1): packet size

  7. Adaptive Packetization / Concealment
     Sender-supported concealment: choose the packetization interval adaptively
     • packet size (size of the lost segment) relates to the "importance" of the packet content
     • pre-processing of the undistorted signal
     • enables a simple concealment operation at the receiver (high probability that the contents of adjacent packets resemble each other)

  8. Sender: Adaptive Packetization
     • auto-correlation of the signal → partitioning into "chunks"
     • speech content transition check: voiced/unvoiced
     • packetization: 2 chunks per packet (because of header overhead)
     (figure: 110 ms of speech)
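The sender steps on this slide can be illustrated with a minimal sketch. It assumes a plain autocorrelation peak search over lags between p_min = 30 and p_max = 160 samples (the range given on the packet-size slide) and cuts one chunk per estimated pitch period. The function names are hypothetical, the voiced/unvoiced transition check is left out, and the slides do not say that the author's implementation works exactly this way.

```python
import math

P_MIN = 30    # shortest pitch period in samples (from the slides)
P_MAX = 160   # longest pitch period in samples (2 * P_MAX = 320)

def estimate_pitch(frame, p_min=P_MIN, p_max=P_MAX):
    """Return the lag in [p_min, p_max] with the largest autocorrelation."""
    best_lag, best_r = p_min, float("-inf")
    for lag in range(p_min, min(p_max, len(frame) - 1) + 1):
        r = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag

def chunk_signal(signal):
    """Partition the signal into chunks of one estimated pitch period each."""
    chunks, pos = [], 0
    while pos < len(signal):
        frame = signal[pos:pos + 2 * P_MAX]
        # A tail too short for a pitch estimate becomes the final chunk.
        p = estimate_pitch(frame) if len(frame) > P_MIN else len(frame)
        chunks.append(signal[pos:pos + p])
        pos += p
    return chunks

def packetize(chunks):
    """Group two consecutive chunks into one packet."""
    return [chunks[i:i + 2] for i in range(0, len(chunks), 2)]

if __name__ == "__main__":
    # Synthetic "voiced" test tone with a pitch period of 80 samples.
    tone = [math.sin(2 * math.pi * n / 80) for n in range(800)]
    packets = packetize(chunk_signal(tone))
    print([sum(len(c) for c in p) for p in packets])
```

Because every chunk is at most P_MAX samples long, a packet of two chunks never exceeds 2 * P_MAX = 320 samples, which matches the upper end of the packet-size range on the next slide.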

  9. Packet Size Frequency Distribution
     • Packet size now depends on the speaker's pitch (range: p_min = 30, 2p_max = 320 samples; 4...40 ms)
     [Figure: relative frequency l n / L over packet length l [samples] and pitch frequency f_S / p_V [Hz]]
     (Relative frequency n weighted with size l)

  10. Relative Packet Header Overhead
     Typical value for IP telephony: O = 20%
     • fixed packetization interval: 160 samples [20 ms]
     • RTP/UDP/IP per-packet overhead: o = 40 octets
     AP/C:
     Speaker        Estimated overhead o/(o+2p_v) [%]   Measured overhead O [%]
     Male low       20.16                               20.14
     Male high      22.97                               22.83
     Female low     25.72                               24.84
     Female high    28.62                               27.98
     (mean pitch period: p_v)
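As a worked example of the overhead figures, the sketch below evaluates O = o/(o + payload) for the fixed 160-sample packet and the estimate o/(o + 2p_v) for AP/C. It assumes one octet per sample, which is what the 20% value for 160 samples and a 40-octet header implies. The function names and the sample value p_v = 79.2 are illustrative assumptions, not taken from the slides.

```python
HEADER_OCTETS = 40  # RTP/UDP/IP per-packet overhead o, from the slide

def relative_overhead(payload_octets, header_octets=HEADER_OCTETS):
    """Relative header overhead O in percent: o / (o + payload)."""
    return 100.0 * header_octets / (header_octets + payload_octets)

def apc_overhead(mean_pitch_period, octets_per_sample=1):
    """Estimated AP/C overhead: a packet holds two chunks of about p_v samples."""
    return relative_overhead(2 * mean_pitch_period * octets_per_sample)

if __name__ == "__main__":
    print(relative_overhead(160))        # fixed 20 ms packets -> 20.0
    print(round(apc_overhead(79.2), 2))  # hypothetical mean pitch period
```

A shorter mean pitch period means smaller packets and therefore a larger share of header octets, which is the trend the table shows from "Male low" to "Female high".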

  11. Receiver: Concealment
     [Figure: left packet (chunks c_11, c_12), lost packet (c_21, c_22) of length l with boundary info p(c_21), right packet (c_31, c_32). Resampling with k = p(c_12) / p(c_21) and k = p(c_31) / p(c_22) yields the replacement packet (ĉ_21, ĉ_22) between the left and right packets.]
     • resampling: no specific distortions introduced (such as those introduced by, e.g., PWR)
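The resampling step can be sketched as follows. The slide gives the ratios k = p(c_12)/p(c_21) and k = p(c_31)/p(c_22) but not the interpolation method, so linear interpolation is an assumption here, and the function names are hypothetical. The lengths of the lost chunks are taken as known at the receiver.

```python
def resample(chunk, target_len):
    """Stretch or shrink a chunk to target_len samples (linear interpolation)."""
    k = len(chunk) / target_len          # ratio of source to target period
    out = []
    for i in range(target_len):
        pos = i * k
        lo = min(int(pos), len(chunk) - 1)
        hi = min(lo + 1, len(chunk) - 1)  # clamp at the last sample
        frac = pos - lo
        out.append((1 - frac) * chunk[lo] + frac * chunk[hi])
    return out

def conceal(left_packet, right_packet, p21, p22):
    """Build a replacement for a lost packet from its two neighbours.

    left_packet = [c11, c12], right_packet = [c31, c32];
    p21 and p22 are the lengths of the lost chunks c21 and c22.
    """
    c12, c31 = left_packet[1], right_packet[0]
    return [resample(c12, p21), resample(c31, p22)]

if __name__ == "__main__":
    left = [[0.0] * 70, [0.1] * 72]
    right = [[0.2] * 78, [0.0] * 80]
    replacement = conceal(left, right, p21=74, p22=76)
    print([len(c) for c in replacement])
```

Each lost chunk is filled from the chunk that borders it, so the replacement keeps the lost packet's duration while borrowing the waveform shape of its nearest neighbour.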

  12. Receiver: Concealment (contd.)
     [Figure: three waveform panels over sample number n: Sender s(n), Receiver s~(n), Receiver (concealed) ŝ(n)]
     Legend: c: "chunk" number; p(c): pitch period; L(c,c-1): packet size

  13. Discussion
     • "characteristic" information: 2 octets (own and following intra-packet boundary)
     • additional delay (buffering):
       sender: [p_max, 2p_max - p_min] = 20...36 ms
       receiver: [p_min, 2p_max] = 4...40 ms (on loss only)
     • computational complexity / processing delay: low
     • backwards compatible with existing tools
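The delay ranges follow from the pitch bounds and the sampling rate. A short check, assuming the 8 kHz rate and the p_min = 30, p_max = 160 sample bounds used elsewhere in the slides (the helper name is hypothetical):

```python
SAMPLE_RATE_HZ = 8000   # 8 kHz speech, as in the subjective test
P_MIN, P_MAX = 30, 160  # pitch period bounds in samples

def to_ms(samples, rate=SAMPLE_RATE_HZ):
    """Convert a duration in samples to milliseconds."""
    return 1000.0 * samples / rate

sender_delay = (to_ms(P_MAX), to_ms(2 * P_MAX - P_MIN))  # (20.0, 36.25)
receiver_delay = (to_ms(P_MIN), to_ms(2 * P_MAX))        # (3.75, 40.0)

if __name__ == "__main__":
    print("sender:", sender_delay, "ms")
    print("receiver:", receiver_delay, "ms")
```

The slide rounds these to 20...36 ms and 4...40 ms.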

  14. Subjective Test: Test Procedure
     • four signals (different speakers), PCM 16 bit linear, 8 kHz
     • comparison with "silence substitution" and PWR
     • random, yet isolated packet losses
     • 40 test conditions: 4 speakers x (3 algorithms x 3 loss rates + original)
     • thirteen non-expert listeners judged on the MOS scale
     • anchoring: Original = 5, "Worst Case" = 1 (50% loss)
     • test conditions in rapid, random sequence
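One way to generate "random, yet isolated" packet losses is sketched below. The slides do not describe how the loss patterns were produced, so this is only an illustrative assumption: it draws loss positions at random so that no two lost packets are adjacent. The function name is hypothetical.

```python
import random

def isolated_losses(num_packets, num_lost, rng=random):
    """Pick num_lost packet indices at random with no two of them adjacent."""
    if num_lost > (num_packets + 1) // 2:
        raise ValueError("too many losses to keep them isolated")
    # Choose positions in a shortened range, then spread them apart:
    # adding the rank i to the i-th smallest pick forces a gap of >= 2.
    picks = sorted(rng.sample(range(num_packets - num_lost + 1), num_lost))
    return [s + i for i, s in enumerate(picks)]

if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed for a repeatable pattern
    print(isolated_losses(20, 6, rng))
```

With isolated losses at most every second packet can be dropped, which is consistent with the 50% loss rate used as the "Worst Case" anchor.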

  15. Subjective Test: Results
     [Figure, two panels: "MOS: Silence Substitution" and "MOS: Pitch Waveform Replication"; each shows MOS (1 to 5) over sample loss rate [%] and pitch frequency f_S / p_V [Hz]]

  16. Subjective Test: Results (contd.)
     [Figure, two panels: "MOS: Adaptive Packetization / Concealment" and "Standard deviation of MOS (AP/C)"; each plotted over sample loss rate [%] and pitch frequency f_S / p_V [Hz]]

  17. Conclusions
     • Sender preprocessing (Adaptive Packetization) → pre-defined parts of the signal are dropped → less perceptible distortion, simple concealment
     • low overhead (data, delay, processing, deployment)
     • Future/ongoing work: frame-based codec support / integration; complement the end-to-end mechanism with queue management at routers (loss burstiness!)
     • http://www.fokus.gmd.de/research/cc/glone/products/voice/apc
