Speech Synthesis and Perception with Envelope Cue - PowerPoint PPT Presentation

slide-1
SLIDE 1

Speech Synthesis and Perception with Envelope Cue

Signals and Systems

slide-2
SLIDE 2

Contents

  • Background
  • Implementation
  • Results and Discussion
  • Improvement

slide-3
SLIDE 3

BACKGROUND | PART 1

slide-4
SLIDE 4

History - Artificial Cochlea

1748

  • First extra-auricular electrical stimulation

1905

  • Invention of an electrical stimulating system

1930

  • An electrode placed in the acoustic nerve produced a copy of the speech waveform.

slide-5
SLIDE 5

1961

  • The first true cochlear implant was implanted by the American otologist William "Bill" House.

1984

  • The FDA approved cochlear implants for use in adults.

2000

  • Implants were approved for infants over 12 months old.

slide-6
SLIDE 6
slide-7
SLIDE 7

IMPLEMENTATION | PART 2

slide-8
SLIDE 8

Figure 1. The operation of a four-channel cochlear implant. Reprinted from "Introduction to cochlear implants," by P. C. Loizou, 1999, IEEE Engineering in Medicine and Biology Magazine, vol. 18, no. 1.

slide-9
SLIDE 9
slide-10
SLIDE 10

band = 8; order = 4

synthesize.m: modulation
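The modulation step of a tone vocoder can be sketched roughly as follows. This is not the code from synthesize.m, which the slides do not show. It is a minimal Python illustration assuming the band envelopes are already available. The function name `tone_vocode`, the two toy envelopes, and the carrier frequencies are hypothetical.

```python
import math

def tone_vocode(envelopes, carrier_freqs, fs):
    """Sum of sine carriers, each amplitude-modulated by its band envelope."""
    if len(envelopes) != len(carrier_freqs):
        raise ValueError("need one carrier frequency per envelope")
    n_samples = len(envelopes[0])
    out = [0.0] * n_samples
    for env, fc in zip(envelopes, carrier_freqs):
        for n in range(n_samples):
            out[n] += env[n] * math.sin(2.0 * math.pi * fc * n / fs)
    return out

fs = 16000
n_samples = 1600  # 0.1 s of signal
# Hypothetical envelopes for illustration: band 1 fades in, band 2 fades out.
env1 = [n / n_samples for n in range(n_samples)]
env2 = [1.0 - n / n_samples for n in range(n_samples)]
y = tone_vocode([env1, env2], carrier_freqs=[500.0, 2000.0], fs=fs)
print(f"{len(y)} samples, peak amplitude {max(abs(v) for v in y):.2f}")
```

In a full vocoder there would be one envelope per analysis band (8 on this slide), and each carrier would typically sit near the centre of its band.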

slide-11
SLIDE 11
order = 4

synthesize.m: 8 band-pass filters
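One way to picture the split into 8 bands is the sketch below. The slides do not state the frequency range or the spacing rule, so the 200 Hz to 7000 Hz range, the logarithmic spacing, and the helper names `band_edges` and `band_centers` are assumptions made purely for illustration. Only the band count of 8 comes from the slide.

```python
import math

def band_edges(n_bands, f_low=200.0, f_high=7000.0):
    """Return n_bands + 1 edge frequencies spaced evenly on a log scale.

    f_low and f_high are assumed values, not taken from the slides.
    """
    if n_bands < 1:
        raise ValueError("n_bands must be at least 1")
    ratio = f_high / f_low
    return [f_low * ratio ** (i / n_bands) for i in range(n_bands + 1)]

def band_centers(edges):
    """Geometric mean of each pair of adjacent edges (candidate carrier frequencies)."""
    return [math.sqrt(lo * hi) for lo, hi in zip(edges, edges[1:])]

edges = band_edges(8)
centers = band_centers(edges)
for k, (lo, hi) in enumerate(zip(edges, edges[1:]), start=1):
    print(f"band {k}: {lo:7.1f} Hz to {hi:7.1f} Hz, carrier {centers[k - 1]:7.1f} Hz")
```

Each pair of adjacent edges would then define the pass band of one 4th-order band-pass filter.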

slide-12
SLIDE 12

SNR = -5 dB

add_ssn.m
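Mixing noise into speech at a target SNR comes down to scaling the noise so that the power ratio matches. The sketch below is not the contents of add_ssn.m. It uses a sine tone in place of speech and white Gaussian noise in place of speech-shaped noise, and the names `power` and `mix_at_snr` are hypothetical.

```python
import math
import random

def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in x) / len(x)

def mix_at_snr(speech, noise, snr_db):
    """Scale noise so that speech power / noise power matches snr_db, then add."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    target_noise_power = power(speech) / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / power(noise))
    scaled = [gain * n for n in noise]
    return [s + n for s, n in zip(speech, scaled)], scaled

# Toy stand-ins: a 440 Hz tone for speech, white noise for speech-shaped noise.
fs = 16000
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 440 * t / fs) for t in range(fs)]
noise = [rng.gauss(0.0, 1.0) for _ in range(fs)]

noisy, scaled_noise = mix_at_snr(speech, noise, snr_db=-5.0)
achieved = 10.0 * math.log10(power(speech) / power(scaled_noise))
print(f"achieved SNR: {achieved:.2f} dB")
```

At -5 dB the noise carries about 3.2 times the power of the speech, which is why the later tasks treat it as a demanding condition.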

slide-13
SLIDE 13

GUI.m

slide-14
SLIDE 14

RESULTS AND DISCUSSION | PART 3

slide-15
SLIDE 15

Task1 Variation in Channel Number

Butterworth filters: order = 4, f_cutoff = 50 Hz

  • N=1
  • N=2
  • N=4
  • N=8
  • N=16
  • N=20
  • N=32
slide-16
SLIDE 16

Why is N limited?

  • Instability of filters
  • Interference between electrodes
  • Continuous interleaved sampling
slide-17
SLIDE 17

Task2 Variation in Cut-off Frequency

Set the number of bands to N = 4.

Implement the tone vocoder while varying the LPF cut-off frequency.

Describe how the LPF cut-off frequency affects the intelligibility of the synthesized sentence.
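The role of the LPF cut-off can be illustrated with a small envelope extractor: rectify the band signal, then low-pass it. The sketch below swaps in a one-pole smoother for the 4th-order Butterworth filter named on the slides, purely to stay dependency-free. The test signal, the function names `envelope` and `modulation_depth`, and the 100 Hz modulation rate are assumptions. The point it shows is that a lower cut-off keeps only the slower envelope fluctuations.

```python
import math

def envelope(x, fs, f_cutoff):
    """Full-wave rectify x, then smooth it with a one-pole low-pass filter.

    The one-pole filter is a stand-in; the slides use 4th-order Butterworth filters.
    """
    alpha = 1.0 - math.exp(-2.0 * math.pi * f_cutoff / fs)
    y, out = 0.0, []
    for v in x:
        y += alpha * (abs(v) - y)
        out.append(y)
    return out

def modulation_depth(env):
    """(max - min) / (max + min) over the second half, skipping the filter start-up."""
    tail = env[len(env) // 2:]
    hi, lo = max(tail), min(tail)
    return (hi - lo) / (hi + lo)

fs = 16000
# Test signal: a 1 kHz tone whose amplitude fluctuates 100 times per second.
x = [(1.0 + 0.8 * math.sin(2 * math.pi * 100 * n / fs))
     * math.sin(2 * math.pi * 1000 * n / fs) for n in range(fs)]

depths = {fc: modulation_depth(envelope(x, fs, fc)) for fc in (20, 50, 100, 400)}
for fc, d in depths.items():
    print(f"f_cutoff = {fc:3d} Hz -> modulation depth {d:.2f}")
```

The four cut-off values match those compared in Task 2. The printed depth grows with the cut-off because more of the 100 Hz fluctuation, and some residual carrier ripple, survives the smoothing.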
slide-18
SLIDE 18

Task2 Results and Conclusion

  • f_cutoff = 20 Hz
  • f_cutoff = 50 Hz
  • f_cutoff = 100 Hz
  • f_cutoff = 400 Hz

N=4

slide-19
SLIDE 19

Task3 Noise & Variation in Band Number

Generate a noisy signal at SNR = -5 dB.

Set the LPF cut-off frequency to 50 Hz.

Implement the tone vocoder while varying the number of bands.

Describe how the number of bands affects the intelligibility of the synthesized sentence, and compare the findings with those obtained in Task 1.

slide-20
SLIDE 20

Task3 Results and Conclusion

  • N=2
  • N=4
  • N=6
  • N=8
  • N=16
slide-21
SLIDE 21

Task4 Noise & Variation in Cut-off Frequency

Generate a noisy signal at SNR = -5 dB.

Set the number of bands to N = 6.

Implement the tone vocoder while varying the LPF cut-off frequency.

Describe how the LPF cut-off frequency affects the intelligibility of the synthesized sentence.

slide-22
SLIDE 22

Task4 Noise & Variation in Cut-off Frequency

slide-23
SLIDE 23
English & Chinese Comparison

  • Synthesized speech is likely to lose its tonal (pitch) information.
  • Chinese: tonal; English: non-tonal

Processed:

slide-24
SLIDE 24

English & Chinese Comparison

Reprinted from "Study of the spectral characteristics of cochlear implant speech processing strategies" (电子耳蜗言语处理策略的频谱特征研究, in Chinese) by Chen Yousheng (陈又圣), et al. (2017), Journal of Biomedical Engineering (生物医学工程学杂志) 34(5): 760-766.

slide-25
SLIDE 25

How about music?

slide-26
SLIDE 26
slide-27
SLIDE 27

IMPROVEMENT | PART 4

slide-28
SLIDE 28

Noise Reduction

  • S. V. Vaseghi, Advanced Digital Signal Processing and Noise Reduction. 2008.
slide-29
SLIDE 29

Noise Reduction using Wiener filters

  • Original
  • Noisy
  • Noise Reduced
  • Synthesized (noisy)
  • Synthesized (noise reduced)
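The core of Wiener-filter noise reduction is a gain between 0 and 1 applied to each frequency band, H = S / (S + N), where S and N are the signal and noise powers in that band. The sketch below assumes the noise power is known and estimates S by subtracting it from the noisy power. The slides do not describe their implementation, so the function name `wiener_gains`, the `floor` parameter, and the band powers are hypothetical.

```python
def wiener_gains(noisy_power, noise_power, floor=0.0):
    """Per-band Wiener gain H = S / (S + N), with S estimated as max(noisy - N, 0)."""
    gains = []
    for p_noisy, p_noise in zip(noisy_power, noise_power):
        p_signal = max(p_noisy - p_noise, 0.0)
        gain = p_signal / (p_signal + p_noise) if p_noise > 0 else 1.0
        gains.append(max(gain, floor))
    return gains

# Hypothetical band powers for illustration only (not measured data).
noisy_power = [10.0, 4.0, 1.5, 1.0]
noise_power = [1.0, 1.0, 1.0, 1.0]

gains = wiener_gains(noisy_power, noise_power)
for k, g in enumerate(gains, start=1):
    print(f"band {k}: gain {g:.2f}")
```

Bands dominated by speech keep a gain close to 1, while bands that contain mostly noise are attenuated towards 0.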
slide-30
SLIDE 30

References:

[1] A. Mudry and M. Mills, "The early history of the cochlear implant: a retrospective," JAMA Otolaryngol Head Neck Surg, vol. 139, no. 5, pp. 446-53, May 2013.

[2] R. V. Shannon, F. G. Zeng, V. Kamath, J. Wygonski, and M. Ekelid, "Speech recognition with primarily temporal cues," Science, vol. 270, no. 5234, pp. 303-4, Oct 13 1995.

[3] S. V. Vaseghi, Advanced Digital Signal Processing and Noise Reduction. Wiley, 2008.

[4] F. Chen, et al., "Evaluation of noise reduction methods for sentence recognition by Mandarin-speaking cochlear implant listeners," Ear and Hearing, vol. 36, no. 1, pp. 61-71, 2015.

[5] Chen Yousheng (陈又圣), et al., "Study of the spectral characteristics of cochlear implant speech processing strategies" (电子耳蜗言语处理策略的频谱特征研究, in Chinese), Journal of Biomedical Engineering (生物医学工程学杂志), vol. 34, no. 5, pp. 760-766, 2017.

[6] Gong Shusheng (龚树生) and Hao Jin (郝瑾), "Domestically produced cochlear implants: a heavy responsibility and a long road ahead" (国产人工耳蜗,任重道远, in Chinese), Chinese Medical Digest: Otorhinolaryngology (中国医学文摘: 耳鼻咽喉科学), vol. 28, no. 5, pp. 231-236, 2013.

slide-31
SLIDE 31

THANK YOU