Adaptive Loss Concealment for Internet Telephony Applications




SLIDE 1

Adaptive Loss Concealment for Internet Telephony Applications

Henning Sanneck

GMD FOKUS, Berlin

sanneck@fokus.gmd.de

Supported by the USMInt (DFN) and Multicube (ACTS) projects

SLIDE 2

Overview

  • Motivation
  • Receiver-Based Concealment
  • Adaptive Packetization / Concealment (sender/receiver operation)

  • Properties (packet sizes / header overhead, delay)
  • Subjective Test
  • Conclusions

SLIDE 3

Motivation: Loss of Speech Packets

  • Congestion in the Internet / MBone: packet loss → speech signal dropouts → need to enhance speech quality
  • Solutions: bandwidth adaptation, resource reservation, differentiated services, redundancy/FEC, interleaving, receiver-based concealment

SLIDE 4

Packet Repetition (Receiver-Based)

L: packet size, p(n): pitch period, n: sample number

[Figure: three waveform panels over sample number n: sender signal s(n), receiver signal ŝ(n), and concealed receiver signal s̃(n); the packet size L is marked.]
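As a rough illustration of packet repetition, the sketch below fills each lost packet with a copy of the packet that preceded it. The function name, the use of `None` to mark a lost packet, and the fallback to silence when the very first packet is lost are assumptions made for this example; they are not taken from the slides.

```python
from typing import List, Optional


def conceal_by_repetition(packets: List[Optional[List[float]]],
                          packet_size: int) -> List[float]:
    """Rebuild the sample stream; a lost packet (None) is replaced by a
    copy of the most recently played packet."""
    output: List[float] = []
    # Assumption: play silence if there is no earlier packet to repeat.
    previous = [0.0] * packet_size
    for packet in packets:
        if packet is None:
            packet = list(previous)  # repeat the last packet
        output.extend(packet)
        previous = packet
    return output
```

Because the repeated packet is not aligned to the pitch period, the waveform phase generally jumps at the packet boundaries, which is the weakness the following methods address.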

SLIDE 5

Pitch Waveform Replication (Receiver-Based)

L: packet size, p(n): pitch period

[Figure: three waveform panels over sample number n: sender signal s(n), receiver signal ŝ(n), and concealed receiver signal s̃(n); the packet size L and the pitch period p are marked.]
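A minimal sketch of the idea behind pitch waveform replication (PWR): the last pitch period received before the loss is repeated until the gap of L samples is filled. The pitch period is assumed to be supplied by a separate estimator at the receiver; the function and parameter names are hypothetical.

```python
from typing import List, Sequence


def pitch_waveform_replication(history: Sequence[float],
                               pitch_period: int,
                               lost_length: int) -> List[float]:
    """Fill a gap of lost_length samples by cyclically repeating the last
    pitch_period samples that were received before the loss."""
    if not 0 < pitch_period <= len(history):
        raise ValueError("pitch_period must lie within the available history")
    cycle = list(history[-pitch_period:])  # last complete pitch cycle
    return [cycle[i % pitch_period] for i in range(lost_length)]
```

For example, with history `[0, 1, 2, 3, 4]`, a pitch period of 2 and a gap of 5 samples, the replacement is `[3, 4, 3, 4, 3]`.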

SLIDE 6

New approach

L(c,c-1): packet size, p(c): pitch period, c: "chunk" number

[Figure: three waveform panels over sample number n: sender signal s(n), receiver signal ŝ(n), and concealed receiver signal s̃(n); the packet size L(c,c-1) and the pitch period p(c) are marked.]

SLIDE 7

Adaptive Packetization / Concealment

Sender-supported concealment: choose the packetization interval adaptively

  • packet size (size of the lost segment) relates to the "importance" of the packet content
  • pre-processing of the undistorted signal
  • enables a simple concealment operation at the receiver (high probability that the contents of adjacent packets resemble each other)

SLIDE 8

Sender: Adaptive Packetization

  • Auto-correlation of signal → partitioning ("chunks")
  • speech content transition check: voiced/unvoiced
  • packetization: 2 chunks/packet (header overhead)

(110 ms speech)
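The sender steps listed on this slide could be sketched roughly as follows. This is an illustration under stated assumptions, not the author's implementation: the normalized auto-correlation search, the 0.5 voicing threshold, the fixed chunk size for unvoiced segments, and all function names are hypothetical. Only the pitch-period range (pmin = 30, 2pmax = 320 samples) and the two-chunks-per-packet rule come from the slides.

```python
import math
from typing import List, Sequence, Tuple

P_MIN = 30    # shortest pitch period in samples (from the slides)
P_MAX = 160   # longest pitch period in samples (slides give 2*pmax = 320)


def estimate_pitch_period(window: Sequence[float]) -> Tuple[int, float]:
    """Return (lag, score): the lag in [P_MIN, P_MAX] with the highest
    normalized auto-correlation, preferring the smallest such lag."""
    best_lag, best_score = P_MIN, -2.0
    for lag in range(P_MIN, P_MAX + 1):
        a, b = window[:-lag], window[lag:]
        energy = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        score = sum(x * y for x, y in zip(a, b)) / energy if energy > 0 else 0.0
        if score > best_score + 1e-9:  # tolerance keeps the first of equal peaks
            best_lag, best_score = lag, score
    return best_lag, best_score


def chunk_signal(signal: Sequence[float],
                 voiced_threshold: float = 0.5) -> List[List[float]]:
    """Partition the signal into chunks of one estimated pitch period each."""
    chunks: List[List[float]] = []
    pos = 0
    while pos + 2 * P_MAX <= len(signal):
        lag, score = estimate_pitch_period(signal[pos:pos + 2 * P_MAX])
        # Hypothetical voiced/unvoiced rule: weak periodicity -> fixed-size chunk.
        length = lag if score >= voiced_threshold else P_MAX
        chunks.append(list(signal[pos:pos + length]))
        pos += length
    if pos < len(signal):
        chunks.append(list(signal[pos:]))  # remaining tail as one final chunk
    return chunks


def packetize(chunks: List[List[float]]) -> List[List[List[float]]]:
    """Group consecutive chunks two per packet, as stated on the slide."""
    return [chunks[i:i + 2] for i in range(0, len(chunks), 2)]
```

Putting two chunks in each packet halves the per-packet header cost compared with sending one chunk per packet, at the price of a larger lost segment when a packet is dropped.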

SLIDE 9

Packet Size Frequency Distribution

  • Packet size is now dependent on the speaker's pitch (range: pmin = 30, 2pmax = 320 samples; 4...40 ms)

[Figure: relative frequency of the packet length l [samples], plotted against pitch frequency fS / pV [Hz]; the relative frequency n is weighted with the size l (axis label: l n / L).]

SLIDE 10

Relative Packet Header Overhead

Typical value for IP telephony: O = 20%

  • fixed packetization interval: 160 samples [20 ms]
  • RTP/UDP/IP per-packet overhead: o = 40 octets

AP/C (mean pitch period: pv):

Speaker       Estimated overhead o/(o+2pv) [%]   Measured overhead O [%]
Male low      20.16                              20.14
Male high     22.97                              22.83
Female low    25.72                              24.84
Female high   28.62                              27.98
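The overhead figures on this slide follow from O = o / (o + payload). The sketch below reproduces the 20% reference value and inverts the AP/C estimate o/(o+2pv) to recover the mean pitch period it implies. It assumes one octet per sample, which is what the 20% figure for 160 samples and o = 40 octets implies; the function names are hypothetical.

```python
def relative_overhead(header_octets: float, payload_octets: float) -> float:
    """Fraction of each packet that is header: o / (o + payload)."""
    return header_octets / (header_octets + payload_octets)


def mean_pitch_period_from_overhead(overhead: float,
                                    header_octets: float = 40.0) -> float:
    """Invert O = o / (o + 2*pv) for pv (assumes one octet per sample)."""
    return header_octets * (1.0 - overhead) / (2.0 * overhead)


# Fixed packetization: 160 samples of payload, 40 octets of RTP/UDP/IP header.
fixed = relative_overhead(40, 160)                # 0.2, i.e. 20%

# "Male low" estimate of 20.16% corresponds to pv of roughly 79 samples.
pv_male_low = mean_pitch_period_from_overhead(0.2016)
```

At 8 kHz a mean pitch period of about 79 samples corresponds to a pitch frequency of roughly 100 Hz, so shorter pitch periods (higher voices) yield smaller packets and a larger relative overhead, in line with the table.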

SLIDE 11

Receiver: Concealment

resampling: no specific distortions introduced (unlike, e.g., with PWR)

[Figure: a lost packet (chunks c21, c22) between the left packet (c11, c12) and the right packet (c31, c32). The replacement packet consists of the chunks ĉ21 and ĉ22, obtained by resampling with a factor k that is a ratio of two pitch periods, k = p( ) / p( ), using the boundary info. Two waveform panels illustrate the resampling.]
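The resampling step can be illustrated with a small helper that stretches or shrinks a neighbouring chunk to the length of the lost chunk. Linear interpolation is chosen here only for brevity and is an assumption, as are the function name and the idea of passing the target length directly; in the scheme on the slide the target length would come from the transmitted boundary info.

```python
from typing import List, Sequence


def resample_chunk(chunk: Sequence[float], target_length: int) -> List[float]:
    """Resample chunk to target_length samples by linear interpolation,
    keeping the first and last sample values."""
    if target_length <= 0 or len(chunk) == 0:
        raise ValueError("chunk and target_length must be non-empty/positive")
    if target_length == 1 or len(chunk) == 1:
        return [float(chunk[0])] * target_length
    step = (len(chunk) - 1) / (target_length - 1)  # input samples per output sample
    out: List[float] = []
    for i in range(target_length):
        pos = i * step
        left = min(int(pos), len(chunk) - 2)  # clamp so left + 1 stays in range
        frac = pos - left
        out.append(chunk[left] * (1.0 - frac) + chunk[left + 1] * frac)
    return out
```

For instance, `resample_chunk([0.0, 2.0, 4.0], 5)` yields `[0.0, 1.0, 2.0, 3.0, 4.0]`: one pitch cycle is kept intact and only its duration changes, so no cycle is cut or repeated mid-period.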

SLIDE 12

Receiver: Concealment (contd.)

L(c,c-1): packet size, p(c): pitch period, c: "chunk" number

[Figure: three waveform panels over sample number n: sender signal s(n), receiver signal ŝ(n), and concealed receiver signal s̃(n); the packet size L(c,c-1) and the pitch period p(c) are marked.]

SLIDE 13

Discussion

  • "characteristic" information: 2 octets (own and following intra-packet boundary)
  • additional delay (buffering): sender: [pmax, 2pmax − pmin] = 20...36 ms; receiver: [pmin, 2pmax] = 4...40 ms (on loss only)
  • computational complexity/processing delay: low
  • backwards compatible with existing tools
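The delay bounds quoted above can be checked with a few lines of arithmetic. The sketch assumes the 8 kHz sampling rate used elsewhere in the slides and takes pmax = 160 samples from the stated 2pmax = 320; the constant and function names are hypothetical.

```python
SAMPLE_RATE_HZ = 8000
P_MIN = 30     # samples
P_MAX = 160    # samples, from 2*pmax = 320


def samples_to_ms(samples: int) -> float:
    """Convert a duration in samples to milliseconds."""
    return 1000.0 * samples / SAMPLE_RATE_HZ


# Sender buffering: between pmax and 2*pmax - pmin samples.
sender_delay_ms = (samples_to_ms(P_MAX), samples_to_ms(2 * P_MAX - P_MIN))

# Receiver buffering (on loss only): between pmin and 2*pmax samples.
receiver_delay_ms = (samples_to_ms(P_MIN), samples_to_ms(2 * P_MAX))
```

This gives 20.0 to 36.25 ms at the sender and 3.75 to 40.0 ms at the receiver, which the slide rounds to 20...36 ms and 4...40 ms.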

SLIDE 14

Subjective Test: Test Procedure

  • four signals (different speakers), PCM 16 bit linear, 8 kHz
  • comparison with "silence substitution" and PWR
  • random, yet isolated packet losses
  • 40 test conditions: 4 speakers × (3 algorithms × 3 loss rates + original)
  • thirteen non-expert listeners judged on the MOS scale
  • anchoring: original = 5, "worst case" = 1 (50% loss)
  • test conditions in rapid, random sequence
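The count of 40 test conditions can be verified by enumerating them. In the sketch below the speaker and loss-rate labels are placeholders, since the slides do not list the actual loss rates, and the seeded shuffle only illustrates presenting the conditions in random order.

```python
import itertools
import random

SPEAKERS = ["speaker 1", "speaker 2", "speaker 3", "speaker 4"]
ALGORITHMS = ["silence substitution", "PWR", "AP/C"]
LOSS_RATES = ["loss rate A", "loss rate B", "loss rate C"]  # placeholder labels

# One undistorted original per speaker, plus every algorithm/loss-rate pair.
conditions = [(speaker, "original", None) for speaker in SPEAKERS]
conditions += list(itertools.product(SPEAKERS, ALGORITHMS, LOSS_RATES))

# Random presentation order; the seed is fixed only to make the run repeatable.
random.Random(0).shuffle(conditions)

num_conditions = len(conditions)  # 4 * (3 * 3 + 1)
```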

SLIDE 15

Subjective Test: Results

[Figure: two 3-D plots of MOS (1 to 5) over pitch frequency fS / pV [Hz] (90 to 170) and sample loss rate [%]. Panels: "MOS: Silence Substitution" and "MOS: Pitch Waveform Replication".]

SLIDE 16

Subjective Test: Results (contd.)

[Figure: two 3-D plots over pitch frequency fS / pV [Hz] (90 to 170) and sample loss rate [%]. Panels: "MOS: Adaptive Packetization / Concealment (AP/C)" (MOS 1 to 5) and "Standard deviation of MOS (AP/C)" (0 to 1.5).]

SLIDE 17

Conclusions

  • Sender preprocessing (Adaptive Packetization): pre-defined parts of the signal are dropped → less perceptible distortion, simple concealment
  • low overhead (data, delay, processing, deployment)
  • Future/ongoing work: frame-based codec support / integration; complement the end-to-end mechanism with queue management at routers (loss burstiness!)
  • http://www.fokus.gmd.de/research/cc/glone/products/voice/apc