Reproduction of 22.2 multichannel audio with virtual rendering

Reproduction of 22.2 multichannel audio with virtual rendering. ITU-R Workshop “Topics on the Future of Audio in Broadcasting”, Wed 15th July 2015, Popov Room, 16:30 - 20:00. Satoshi OODE, Science and Technology Research Laboratories, Japan Broadcasting Corporation (NHK).


SLIDE 1

Reproduction of 22.2 multichannel audio with virtual rendering

ITU-R Workshop “Topics on the Future of Audio in Broadcasting” Wed 15th July 2015 Popov Room 16:30 - 20:00

Satoshi OODE Science and Technology Research Laboratories Japan Broadcasting Corporation (NHK)

SLIDE 2

Overview

 8K SHV broadcasting service

  • is planned to begin in 2016.
  • is composed of stereo, 5.1ch and 22.2ch with metadata related to dialogue level control.

 Requirement of 22.2 multichannel audio

  • Three-dimensional spatial impression is achieved by loudspeakers placed at 30-45 degree intervals.

 Reproductions of 22.2 multichannel audio for home use

  • Theatrical environment using 24 loudspeakers or more.
  • Rendering to other channel configurations such as 9.1ch.
  • Loudspeakers integrated with the display using virtual rendering (Binaural reproduction over loudspeakers).

  • Headphone using virtual rendering (Binaural reproduction).
SLIDE 3

22.2 multichannel audio specified in Rec. BS.2051

 The 22.2 multichannel sound system

  • is specified as system H in Rec. BS.2051.
  • consists of three layers.

 Top layer: 9 channels including the overhead loudspeaker.
 Middle layer: 10 channels.
 Bottom layer: 3 channels plus 2 LFE channels.

[Figure: channel layout of the 22.2 multichannel sound system]
Top layer (9 channels): TpFC TpBC TpBL TpSiL TpFL TpBR TpSiR TpFR TpC
Middle layer (10 channels): FC FLc BL BC BR SiR FRc FR SiL FL
Bottom layer (3 channels + 2 LFE): BtFC BtFL BtFR LFE2 LFE1
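The three-layer structure can be written down as a small lookup table. The sketch below is only an illustration: the dictionary name `LAYOUT_22_2` and the helper `count_channels` are hypothetical, and only the channel labels are taken from the slide.

```python
# Channel labels as listed on the slide, grouped by layer.
# LAYOUT_22_2 and count_channels are illustrative names, not part of any standard API.
LAYOUT_22_2 = {
    "top": ["TpFL", "TpFC", "TpFR", "TpSiL", "TpC", "TpSiR", "TpBL", "TpBC", "TpBR"],
    "middle": ["FL", "FLc", "FC", "FRc", "FR", "SiL", "SiR", "BL", "BC", "BR"],
    "bottom": ["BtFL", "BtFC", "BtFR"],
    "lfe": ["LFE1", "LFE2"],
}


def count_channels(layout):
    """Return (number of full-range channels, number of LFE channels)."""
    full_range = sum(len(layout[layer]) for layer in ("top", "middle", "bottom"))
    return full_range, len(layout["lfe"])


if __name__ == "__main__":
    full_range, lfe = count_channels(LAYOUT_22_2)
    print(f"{full_range}.{lfe}")
```

The count of 22 full-range channels and 2 LFE channels is where the name "22.2" comes from.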

SLIDE 4

Roadmap of 8K SHV broadcasting service

 8K SHV pilot broadcasting service is planned to begin in 2016.
 Recommendations BT.2020 and BS.2051 were developed and some related recommendations were revised.
 ARIB Standard STD-B32, which specifies audio coding, was also updated in Japan.

[Figure: roadmap timeline with year marks (2000), 2004, 2008, 2012, 2016, 2018 and 2020. Milestones shown: R&D Start, London Olympics, Rio Olympics, Pilot broadcasting, Full service, Tokyo Olympics. Standards shown: Rec. BS.2051, Rec. BT.2020, Rec. BT.2077, Rec. BS.1116, BS.1196 (Rec. BS.1770), ARIB STD-B32 2014.]

SLIDE 5

Audio service in 8K SHV broadcasting

 The revised ARIB Standard STD-B32 provides specifications of audio coding for the advanced satellite broadcasting system.

 New features are as follows.

  • Transmission of the down-mixing coefficients for each programme.
  • Dialogue level control and dialogue replacement.
  • Audio coding including lossless transmission.

 NHK plans to broadcast 8K SHV…

  • using MPEG-4 AAC in satellite broadcasting.
  • using stereo, 5.1ch and 22.2ch audio formats.
  • with metadata related to down-mixing for each programme.
  • with metadata related to dialogue level control function.

 The information is reported in Report ITU-R BS.2159-7.

SLIDE 6

Dialogue level control function

 There are many complaints about the intelligibility of dialogue, although dialogue is the most important content.
 Listeners can control the dialogue level separately from the total level.

In traditional channel-based broadcasting, the total level can be changed. In 8K SHV broadcasting, the relative dialogue level can also be changed.

[Figure: level bars for BGM+SE and Dialogue, with the dialogue level increased and decreased relative to BGM+SE.]
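A minimal sketch of relative dialogue level control, assuming the receiver has the dialogue and the BGM+SE available as separate sample streams. The function names and the single-stream simplification are assumptions for illustration, not the receiver processing defined in ARIB STD-B32.

```python
def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def mix_with_dialogue_gain(dialogue, background, dialogue_gain_db, total_gain_db=0.0):
    """Scale the dialogue relative to BGM+SE, mix, then apply the total level.

    dialogue, background: equal-length lists of samples (hypothetical mono streams).
    """
    dialogue_gain = db_to_linear(dialogue_gain_db)
    total_gain = db_to_linear(total_gain_db)
    return [total_gain * (dialogue_gain * d + b) for d, b in zip(dialogue, background)]


if __name__ == "__main__":
    dialogue = [1.0, 0.5]
    bgm_se = [0.2, 0.2]
    # Raise only the dialogue by 6 dB; the BGM+SE level stays unchanged.
    print(mix_with_dialogue_gain(dialogue, bgm_se, dialogue_gain_db=6.0))
```

Changing `total_gain_db` alone corresponds to the traditional volume control; changing `dialogue_gain_db` alters the balance between dialogue and BGM+SE.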

SLIDE 7

Dialogue level control in 22.2 multichannel audio

[Figure: 22.2 channel layout, as on slide 3: top layer 9 channels, middle layer 10 channels, bottom layer 3 channels + 2 LFE.]

 Some channels are used only for dialogue, depending on the individual programme.
 Dialogue channels carry a “dialogue” flag.

SLIDE 8

Metadata for dialogue level control specified in ARIB STD-B32

Descriptor: Explanation
ext_dialogue_status: Existence of dialogue channels.
num_dialogue_chans: Number of main dialogue channels.
num_additional_lang_chans: Number of alternative dialogue channels.
dialogue_src_index[i]: Index of dialogue channels.
dialogue_main_lang_comment_bytes: Byte count of characters indicating content of main dialogue.
dialogue_main_lang_comment_data: Byte data of characters indicating content of main dialogue.
dialogue_main_lang_code: Language code of main dialogue.
dialogue_additional_lang_code[i]: Language code of ith alternative dialogue.
dialogue_additional_lang_comment_bytes[i]: Byte count of characters indicating content of ith alternative dialogue.
dialogue_additional_lang_comment_data[i]: Byte data of characters indicating content of ith alternative dialogue.
dialogue_gain_index[i]: Gain of alternative dialogue channels. (0000: 0 dB, 0001: –1 dB, 0010: –2 dB, ..., 1110: –14 dB, 1111: –∞ dB)
sn_dialogue_plus_index: Maximum gain of dialogue channels in receiver. (000: 0 dB, 001: +3 dB, 010: +6 dB, ..., 110: +18 dB, 111: +∞ dB)
sn_dialogue_minus_index: Minimum gain of dialogue channels in receiver. (000: 0 dB, 001: –3 dB, 010: –6 dB, ..., 110: –18 dB, 111: –∞ dB)
additional_dialogue_data_sync: Data stream element in which alternative dialogue data is packed.
additional_dialogue_index: Index of alternative dialogue channels corresponding to the “i” of dialogue_additional_lang_code[i].

Limiting the range of dialogue level control is important for broadcasters because the dialogue source is not always clean.
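The gain fields in the table can be decoded and used to limit a listener's request, as sketched below. This is an illustration under assumptions: the values elided by "..." in the table are assumed to follow the same 1 dB and 3 dB steps as the listed entries, and the function names are hypothetical rather than taken from ARIB STD-B32.

```python
import math


def dialogue_gain_db(index):
    """Decode the 4-bit dialogue_gain_index: 0000 -> 0 dB, ..., 1110 -> -14 dB, 1111 -> -inf."""
    if not 0 <= index <= 0b1111:
        raise ValueError("dialogue_gain_index is a 4-bit field")
    return -math.inf if index == 0b1111 else float(-index)


def sn_dialogue_plus_db(index):
    """Decode the 3-bit sn_dialogue_plus_index: 000 -> 0 dB, ..., 110 -> +18 dB, 111 -> +inf."""
    if not 0 <= index <= 0b111:
        raise ValueError("sn_dialogue_plus_index is a 3-bit field")
    return math.inf if index == 0b111 else 3.0 * index


def sn_dialogue_minus_db(index):
    """Decode the 3-bit sn_dialogue_minus_index: 000 -> 0 dB, ..., 110 -> -18 dB, 111 -> -inf."""
    if not 0 <= index <= 0b111:
        raise ValueError("sn_dialogue_minus_index is a 3-bit field")
    return -math.inf if index == 0b111 else -3.0 * index


def clamp_user_gain(requested_db, plus_index, minus_index):
    """Limit the listener's requested dialogue gain to the range set by the broadcaster."""
    upper = sn_dialogue_plus_db(plus_index)
    lower = sn_dialogue_minus_db(minus_index)
    return max(lower, min(requested_db, upper))


if __name__ == "__main__":
    # Broadcaster allows +6 dB (010) up and -3 dB (001) down; the listener asks for +12 dB.
    print(clamp_user_gain(12.0, 0b010, 0b001))
```

Clamping in the receiver is one way the broadcaster's limits could be enforced when the dialogue source contains residual background sound.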

SLIDE 9

Viewing angle of 8K Super Hi-Vision

 System parameters are specified in Rec. ITU-R BT.2020.
 8K SHV has a viewing angle of 100° in azimuth and 60° in elevation when the listener is positioned at a viewing distance of 0.75 H, where H is the height of the display.
 8K SHV requires wider and higher sound fields than 5.1ch or stereo to match the sound image with the visual image.

[Figure: sound stage from left to right at viewing distances of 1.5H and 0.75H, compared for stereo, 5.1ch and 22.2ch.]
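The horizontal viewing angle can be checked with simple trigonometry. The sketch below assumes a flat 16:9 screen and a viewer on the screen's central axis; the function name is hypothetical.

```python
import math


def horizontal_viewing_angle_deg(distance_in_heights, aspect_ratio=16 / 9):
    """Horizontal angle subtended by the screen, for a viewer on the central axis.

    distance_in_heights: viewing distance as a multiple of the display height H.
    aspect_ratio: width / height of the display (16:9 assumed here).
    """
    half_width = aspect_ratio / 2.0  # half the screen width, in units of H
    return 2.0 * math.degrees(math.atan(half_width / distance_in_heights))


if __name__ == "__main__":
    for distance in (0.75, 1.5):
        angle = horizontal_viewing_angle_deg(distance)
        print(f"{distance} H -> {angle:.1f} degrees")
```

At 0.75 H this gives an angle close to the 100° quoted for 8K SHV, and the angle narrows as the viewing distance grows.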

SLIDE 10

Characteristics of 22.2 multichannel audio

 Stable localization of frontal sound over the entire area of the large-screen image
 Sound image reproduced in all directions around the listener, including elevation
 3D spatial impression augmenting the listener’s sense of reality
 Wide listening area with excellent sound quality
 Compatible with existing multichannel sound systems
 Suitable for live recording, mixing, and transmission

[Figure: 5.1 compared with 22.2.]

SLIDE 11

Loudspeaker intervals: requirements of 22.2 multichannel audio

 Spatial impression by 8 loudspeakers is almost the same as that by 24 loudspeakers.

[Figure: mean difference grade score (vertical axis, -3.50 to 0.50) versus loudspeaker arrangement (loudspeaker interval): 12 (30°), 8 (45°), 6a and 6b (60°), 4a and 4b (90°), 3a and 3b (120°). Series: top layer (45°, 40 subjects) and middle layer (0°, 20 subjects). Pictured arrangements: 24 loudspeakers at 15 degree intervals and 8 loudspeakers at 45 degree intervals. Legend: Reference, Test item.]
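The loudspeaker counts and intervals in the figure are related by a simple division, assuming the loudspeakers are equally spaced on a full circle around the listener. The function name below is hypothetical.

```python
def loudspeaker_interval_deg(count):
    """Angular interval between adjacent loudspeakers equally spaced on a full circle."""
    if count <= 0:
        raise ValueError("count must be positive")
    return 360.0 / count


if __name__ == "__main__":
    for count in (24, 12, 8, 6, 4, 3):
        print(count, "loudspeakers ->", loudspeaker_interval_deg(count), "degree intervals")
```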

SLIDE 12

Reproductions of 22.2 multichannel audio for home use

 Theatrical environment using 24 loudspeakers or more.
 Rendering to other channel configurations such as 9.1ch.
 Down-mixing to stereo or 5.1ch.
 Loudspeakers integrated with display using virtual rendering.
 Headphone with virtual rendering (Binaural).

SLIDE 13

Theatrical environment using 24 loudspeakers or more

 22.2 multichannel audio is usually reproduced by 24 loudspeakers.
 Additional loudspeakers are added depending on the listening environment.
  • Subwoofers for bass management of full-range channels (room size).
  • Full-range loudspeakers on the side to keep uniformity (room shape).
 The relative positions of channels, not their absolute positions, are important.

[Figure: position of the FL loudspeaker in the ideal environment and in other room layouts.]

SLIDE 14

Rendering to other channel configurations such as 9.1ch

 Rendering based on channel position
  • When the number of loudspeakers is large, rendering is used.
 Down-mixing
  • When 7.1ch, 5.1ch or stereo is used, down-mixing coefficients are used to prevent dialogue from becoming unclear due to spatial masking.

The spatial impression deteriorates as the number of loudspeakers decreases.

[Figure: channel layouts for 22.2ch, 9.1ch, 5.1ch and stereo; rendering is used toward 9.1ch, and down-mix toward 5.1ch and stereo.]
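Down-mixing with per-programme coefficients amounts to a weighted sum of input channels for each output channel. The sketch below shows only that mechanism: the coefficient values in `EXAMPLE_STEREO_COEFFS` are hypothetical placeholders, not values from ARIB STD-B32 or from any broadcast programme, and only a few input channels are shown.

```python
def downmix(frame, coefficients):
    """Apply down-mixing coefficients to one frame of samples.

    frame: dict mapping input channel label -> sample value.
    coefficients: dict mapping output channel -> {input channel: gain}.
    Input channels missing from the frame are treated as silent.
    """
    return {
        out: sum(gain * frame.get(src, 0.0) for src, gain in gains.items())
        for out, gains in coefficients.items()
    }


# Hypothetical coefficients for illustration only; in the broadcast service
# the coefficients are transmitted as metadata for each programme.
EXAMPLE_STEREO_COEFFS = {
    "L": {"FL": 1.0, "FC": 0.5, "SiL": 0.5},
    "R": {"FR": 1.0, "FC": 0.5, "SiR": 0.5},
}

if __name__ == "__main__":
    frame = {"FL": 1.0, "FR": 0.0, "FC": 1.0}
    print(downmix(frame, EXAMPLE_STEREO_COEFFS))
```

Because the coefficients travel with the programme, a mix with dialogue in FC can use a higher FC gain than a music programme would.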

SLIDE 15

Loudspeakers integrated with display using virtual rendering

– How to introduce 8K SHV Audio into the home environment

  • High-quality sound requires 24 discrete loudspeakers.
  • Installing 24 loudspeakers is excessive for a home environment.

– Compact and convenient system

  • Loudspeaker system should be integrated with SHV display.

– A 12-loudspeaker system integrated with an 85-inch SHV FPD was developed.

[Figure: 12 loudspeakers integrated with 8K SHV Flat Panel Display.]

SLIDE 16

Loudspeakers integrated with display using virtual rendering

 For front 11 channels
  • 8 channels around the display are directly reproduced (marked as red circles).
  • 3 front channels on the display are reproduced using vertical pair-wise amplitude panning.

[Figure: 12 loudspeakers integrated with 8K SHV Flat Panel Display, illustrating the vertical pair-wise panning method and the horizontal pair-wise panning method.]
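Pair-wise amplitude panning places a sound image between two loudspeakers by splitting one signal between them with different gains. The sketch below assumes a constant-power sine/cosine law; the slide does not state which panning law the NHK system uses, and the function names are hypothetical.

```python
import math


def pairwise_gains(position):
    """Gains for a loudspeaker pair under a constant-power (sine/cosine) law.

    position: 0.0 places the image at the first loudspeaker (e.g. below the
    display), 1.0 at the second (e.g. above the display).
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be between 0.0 and 1.0")
    angle = position * math.pi / 2.0
    return math.cos(angle), math.sin(angle)


def pan(sample, position):
    """Split one sample between the two loudspeakers of a pair."""
    gain_a, gain_b = pairwise_gains(position)
    return gain_a * sample, gain_b * sample


if __name__ == "__main__":
    # An image halfway between the pair gets equal gains on both loudspeakers.
    print(pairwise_gains(0.5))
```

With this law the sum of the squared gains stays at 1, so the total radiated power does not change as the image moves.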

SLIDE 17

Loudspeakers integrated with display using virtual rendering

 11 side, back and overhead channels around the listener are reproduced by binaural reproduction* over 11 front loudspeakers.
(* to simulate acoustical propagation characteristics from the loudspeaker to each ear.)
 Binaural reproduction over loudspeakers has been studied since the 1960s.
  • Some studies suggested that horizontally arrayed loudspeakers can perform binaural reproduction very well.
  • The system realizes immersive audio with only front loudspeakers.

[Figure: 12 loudspeakers integrated with 8K SHV Flat Panel Display.]

SLIDE 18

Loudspeakers integrated with display using virtual rendering

Binaural reproduction over loudspeakers

22.2 multichannel sound field* is simulated using HRTFs corresponding to each loudspeaker.

*11 side, back and overhead channels

[Figure: block diagram. The channels TpC, TpSiL, TpSiR, TpBL, TpBR, TpBC, BC, SiR, BR, SiL and BL are each filtered by a pair of HRTFs (shown for TpBL as F^1_TpBL and F^2_TpBL); signal labels z_1, z_2, x_1 and x_2.]
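Simulating a sound field with HRTFs can be sketched in the time domain: each channel signal is convolved with the head-related impulse response (HRIR, the time-domain counterpart of an HRTF) for each ear, and the results are summed per ear. The function names and the toy impulse responses below are assumptions for illustration; measured HRIRs are far longer.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def binaural_synthesis(channel_signals, hrirs):
    """Sum all channels, each filtered by its left-ear and right-ear HRIR.

    channel_signals: dict channel label -> list of samples.
    hrirs: dict channel label -> (left_ear_ir, right_ear_ir).
    Returns (left_ear_signal, right_ear_signal).
    """
    length = max(
        len(sig) + max(len(hrirs[ch][0]), len(hrirs[ch][1])) - 1
        for ch, sig in channel_signals.items()
    )
    ears = [[0.0] * length, [0.0] * length]
    for ch, sig in channel_signals.items():
        for ear in (0, 1):
            for n, value in enumerate(convolve(sig, hrirs[ch][ear])):
                ears[ear][n] += value
    return ears[0], ears[1]


if __name__ == "__main__":
    # Toy impulse responses (hypothetical values, not measured HRIRs).
    signals = {"TpBL": [1.0, 0.0], "BC": [0.0, 1.0]}
    toy_hrirs = {"TpBL": ([1.0], [0.0, 0.5]), "BC": ([0.5], [0.5])}
    left, right = binaural_synthesis(signals, toy_hrirs)
    print(left, right)
```

In practice the filtering would be done with FFT-based convolution, but the per-ear sum over channels is the same.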

SLIDE 19

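Loudspeakers integrated with display using virtual rendering

Binaural reproduction over loudspeakers

[Figure: original sound field with signals x_1 and x_2, and reproduced sound field with signals y_1 and y_2. The filter matrix H drives the loudspeakers; G_11, G_12, G_21 and G_22 are the transfer functions from the loudspeakers to the ears, including the acoustic crosstalk paths.]

\[
\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}
=
\begin{pmatrix} G_{11} & G_{12} \\ G_{21} & G_{22} \end{pmatrix}
H
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
\]

 Crosstalk cancellation is achieved when the product GH is the unit matrix.
 Condition number is one of the factors for system stability.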
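For two loudspeakers and two ears, the condition that GH is the unit matrix can be met at a single frequency by inverting the 2 x 2 matrix G. The sketch below is an illustration only: the complex values in the example G are hypothetical, and a practical design would repeat this per frequency bin and usually add regularization.

```python
def invert_2x2(g):
    """Invert a 2x2 complex matrix given as [[g11, g12], [g21, g22]]."""
    (g11, g12), (g21, g22) = g
    det = g11 * g22 - g12 * g21
    if abs(det) < 1e-12:
        raise ValueError("G is singular at this frequency; no exact canceller exists")
    return [[g22 / det, -g12 / det], [-g21 / det, g11 / det]]


def multiply_2x2(a, b):
    """Matrix product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


if __name__ == "__main__":
    # Hypothetical transfer functions at one frequency: direct paths of 1.0,
    # crosstalk paths attenuated to 0.5 with a 90 degree phase shift.
    G = [[1.0, 0.5j], [0.5j, 1.0]]
    H = invert_2x2(G)
    print(multiply_2x2(G, H))  # close to the unit matrix
```

When the determinant of G is close to zero the entries of H become very large, which is one reason the condition number of G matters.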

SLIDE 20

Loudspeakers integrated with display using virtual rendering

– The condition number is one of the factors that indicate the stability of binaural reproduction.
– Increasing the number of loudspeakers gives more stable binaural reproduction regardless of loudspeaker configuration.

[Figure: condition number Cond(G) (vertical axis, 1 to 8) versus frequency [Hz] (horizontal axis, up to 2 x 10^4), in four panels: two loudspeakers, four loudspeakers, six loudspeakers and twelve loudspeakers. Labels: 2 loudspeaker control, 12 loudspeaker control.]
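The condition number of G at one frequency is the ratio of its largest to its smallest singular value. For a matrix with two rows (one per ear) and N columns (one per loudspeaker), it can be computed from the eigenvalues of the 2 x 2 Hermitian matrix G G^H, as sketched below. The function name and the example matrices are hypothetical and do not reproduce the plotted data.

```python
import math


def condition_number(g):
    """2-norm condition number of a 2 x N complex matrix.

    g: two rows (one per ear), each a list of N transfer-function values.
    Uses the closed-form eigenvalues of the 2x2 Hermitian matrix G G^H.
    """
    row1, row2 = g
    p = sum(abs(v) ** 2 for v in row1)
    r = sum(abs(v) ** 2 for v in row2)
    q = sum(complex(a) * complex(b).conjugate() for a, b in zip(row1, row2))
    mean = (p + r) / 2.0
    radius = math.sqrt(((p - r) / 2.0) ** 2 + abs(q) ** 2)
    smallest = mean - radius
    if smallest <= 0.0:
        return math.inf  # rank deficient: the two ears cannot be controlled independently
    return math.sqrt((mean + radius) / smallest)


if __name__ == "__main__":
    # Hypothetical examples: weak crosstalk versus strong crosstalk.
    print(condition_number([[1.0, 0.1], [0.1, 1.0]]))
    print(condition_number([[1.0, 0.9], [0.9, 1.0]]))
```

A large value means the two rows of G are nearly parallel, so the inverse filters need high gains and small errors in G are amplified.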

SLIDE 21

Headphone with virtual rendering (Binaural reproduction)

 For mobile or personal use, a 22.2 multichannel headphone processor was developed.
  • The 22.2 multichannel sound field is simulated using HRTFs corresponding to each loudspeaker.
 The system, with audio engineers’ HRTFs installed, has already been used in 22.2 multichannel sound production.

[Figure: 22.2 multichannel headphone processor. The channels FL, BtFL, TpFL, FLc, BtFC, FC, TpFC, FR, BtFR, TpFR, FRc, TpC, TpSiL, TpSiR, TpBL, TpBR, TpBC, BC, SiR, BR, SiL and BL are each filtered by a pair of HRTFs (shown for TpBL as F^1_TpBL and F^2_TpBL); signal labels z_1, z_2, x_1 and x_2.]

SLIDE 22

Conclusion

 8K SHV broadcasting service

  • is composed of stereo, 5.1ch and 22.2ch,
  • with metadata related to dialogue level control,
    toward personalization, especially for older listeners, and
    toward adaptation to the listening environment.

 Reproductions of 22.2 multichannel audio for home use

  • Theatrical environment using 24 loudspeakers or more.
  • Rendering to other channel configurations such as 9.1ch.
  • Loudspeakers integrated with the display using virtual rendering (Binaural reproduction over loudspeakers).

  • Headphone using virtual rendering (Binaural reproduction).
SLIDE 23

Thank you for your attention