Speech and Language CS 188: Artificial Intelligence Spring 2011

CS 188: Artificial Intelligence Spring 2011
Speech and Language
Pieter Abbeel – UC Berkeley
Slides from Dan Klein

Speech and Language
§ Speech technologies
  § Automatic speech recognition (ASR)
  § Text-to-speech synthesis (TTS)
  § Dialog systems
§ Language processing technologies
  § Machine translation
  § Information extraction
  § Web search, question answering
  § Text classification, spam filtering, etc …

Digitizing Speech

Speech in an Hour
§ Speech input is an acoustic wave form
[Figure: waveform of “speech lab”, labeled s p ee ch l a b, with the “l” to “a” transition enlarged]
Graphs from Simon Arnfield’s web tutorial on speech, Sheffield: http://www.psyc.leeds.ac.uk/research/cogn/speech/tutorial/

Spectral Analysis
§ Frequency gives pitch; amplitude gives volume
  § sampling at ~8 kHz phone, ~16 kHz mic (kHz=1000 cycles/sec)
§ Fourier transform of wave displayed as a spectrogram
  § darkness indicates energy at each frequency
[Figure: waveform (amplitude) and spectrogram (frequency) of “speech lab”, labeled s p ee ch l a b]

Adding 100 Hz + 1000 Hz Waves
[Figure: summed wave, amplitude from -0.9654 to 0.99, Time (s) from 0 to 0.05]
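The "Adding 100 Hz + 1000 Hz Waves" and "Spectral Analysis" slides can be tried out numerically. The following is a minimal sketch, not the lecture's own code: it samples the sum of a 100 Hz and a 1000 Hz sine wave at the ~16 kHz microphone rate over the 0.05 s shown in the figure, then uses a naive discrete Fourier transform to recover the two frequency components. The equal amplitudes of 0.5 and all function names are assumptions made for illustration; in practice a library routine such as `numpy.fft.rfft` would replace the slow `dft_magnitudes` loop.

```python
import cmath
import math

SAMPLE_RATE_HZ = 16_000  # ~16 kHz microphone rate from the slide
DURATION_S = 0.05        # time span of the figure's x-axis


def two_tone(freqs=(100.0, 1000.0), amps=(0.5, 0.5),
             sample_rate=SAMPLE_RATE_HZ, duration=DURATION_S):
    """Sample a sum of sine waves (the amplitudes are assumed, not from the slide)."""
    n = round(sample_rate * duration)
    return [sum(a * math.sin(2 * math.pi * f * i / sample_rate)
                for f, a in zip(freqs, amps))
            for i in range(n)]


def dft_magnitudes(wave):
    """Naive O(n^2) DFT: magnitude of each non-negative frequency bin."""
    n = len(wave)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(wave)))
            for k in range(n // 2 + 1)]


def dominant_frequencies(wave, sample_rate=SAMPLE_RATE_HZ, count=2):
    """Return the `count` strongest frequencies in Hz, in ascending order."""
    mags = dft_magnitudes(wave)
    hz_per_bin = sample_rate / len(wave)  # frequency resolution of the DFT
    strongest = sorted(range(len(mags)), key=mags.__getitem__)[-count:]
    return sorted(k * hz_per_bin for k in strongest)


if __name__ == "__main__":
    print(dominant_frequencies(two_tone()))
```

With 800 samples the bins are 20 Hz apart, so both tones fall exactly on a bin and the spectrum shows two clean peaks, which is the picture the "Spectrum" slide draws.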

Spectrum
Frequency components (100 and 1000 Hz) on x-axis
[Figure: amplitude against frequency in Hz, with components at 100 and 1000]

Part of [ae] from “lab”
§ Note complex wave repeating nine times in figure
§ Plus a smaller wave that repeats 4 times for every large pattern
§ Large wave has frequency of 250 Hz (9 times in .036 seconds)
§ Small wave roughly 4 times this, or roughly 1000 Hz
§ Two little tiny waves on top of each peak of the 1000 Hz wave

Back to Spectra
§ Spectrum represents these freq components
§ Computed by Fourier transform, an algorithm which separates out each frequency component of a wave
§ x-axis shows frequency, y-axis shows magnitude (in decibels, a log measure of amplitude)
§ Peaks at 930 Hz, 1860 Hz, and 3020 Hz

Acoustic Feature Sequence
§ Time slices are translated into acoustic feature vectors (~39 real numbers per slice)
[Figure: spectrogram (frequency axis) cut into time slices … e_12 e_13 e_14 e_15 e_16 …]
§ These are the observations, now we need the hidden states X

State Space
§ P(E|X) encodes which acoustic vectors are appropriate for each phoneme (each kind of sound)
§ P(X|X’) encodes how sounds can be strung together
§ We will have one state for each sound in each word
§ From some state x, can only:
  § Stay in the same state (e.g. speaking slowly)
  § Move to the next position in the word
  § At the end of the word, move to the start of the next word
§ We build a little state graph for each word and chain them together to form our state space X

HMMs for Speech
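The state space described on the "State Space" slide can be sketched as a small transition table. This is an illustrative sketch under stated assumptions, not the lecture's implementation: the two-word lexicon reuses the sound labels from the "speech lab" figure, the stay probability of 0.5 is invented, and the next word is chosen uniformly (the slides use bigram probabilities for that step instead). The names `LEXICON` and `build_state_space` are hypothetical.

```python
# Hypothetical toy lexicon: word -> sequence of sounds
# (labels taken from the "speech lab" figure).
LEXICON = {
    "speech": ["s", "p", "ee", "ch"],
    "lab": ["l", "a", "b"],
}


def build_state_space(lexicon, p_stay=0.5):
    """Return {state: {next_state: probability}} for P(X | X').

    A state is (word, position): one state per sound in each word.
    p_stay is an assumed value, and the next word is picked uniformly.
    """
    word_starts = [(word, 0) for word in lexicon]
    transitions = {}
    for word, sounds in lexicon.items():
        last = len(sounds) - 1
        for pos in range(len(sounds)):
            state = (word, pos)
            out = {state: p_stay}  # stay in the same state (speaking slowly)
            if pos < last:
                # move to the next position in the word
                out[(word, pos + 1)] = 1.0 - p_stay
            else:
                # end of the word: move to the start of the next word
                share = (1.0 - p_stay) / len(word_starts)
                for start in word_starts:
                    out[start] = out.get(start, 0.0) + share
            transitions[state] = out
    return transitions


if __name__ == "__main__":
    for state, out in build_state_space(LEXICON).items():
        print(state, out)
```

Every row has at most a handful of nonzero entries, which is what keeps inference over the chained word graphs tractable.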

Transitions with Bigrams
[Figure from Huang et al page 618]

Decoding
§ While there are some practical issues, finding the words given the acoustics is an HMM inference problem
§ We want to know which state sequence x_{1:T} is most likely given the evidence e_{1:T}:
  $x^*_{1:T} = \arg\max_{x_{1:T}} P(x_{1:T} \mid e_{1:T})$
§ From the sequence x, we can simply read off the words

What is NLP?
§ Fundamental goal: analyze and process human language, broadly, robustly, accurately …
§ End systems that we want to build:
  § Ambitious: speech recognition, machine translation, information extraction, dialog interfaces, question answering …
  § Modest: spelling correction, text categorization …

Problem: Ambiguities
§ Headlines:
  § Enraged Cow Injures Farmer With Ax
  § Hospitals Are Sued by 7 Foot Doctors
  § Ban on Nude Dancing on Governor’s Desk
  § Iraqi Head Seeks Arms
  § Local HS Dropouts Cut in Half
  § Juvenile Court to Try Shooting Defendant
  § Stolen Painting Found by Tree
  § Kids Make Nutritious Snacks
§ Why are these funny?

Parsing as Search

Grammar: PCFGs
§ Natural language grammars are very ambiguous!
§ PCFGs are a formal probabilistic model of trees
  § Each “rule” has a conditional probability (like an HMM)
  § Tree’s probability is the product of all rules used
§ Parsing: Given a sentence, find the best tree – search!

ROOT → S          375/420
S → NP VP .       320/392
NP → PRP          127/539
VP → VBD ADJP     32/401
…..
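The statement that a tree's probability is the product of all rules used can be checked with the four rules listed on the PCFG slide. The sketch below multiplies only those four probabilities, so the result is the probability of a partial derivation: the ADJP expansion and the lexical rules are not given in the slides and are left out rather than invented. The names `RULE_PROB` and `derivation_probability` are hypothetical.

```python
from fractions import Fraction

# Rule probabilities exactly as listed on the slide: (parent, children) -> P(rule | parent)
RULE_PROB = {
    ("ROOT", ("S",)): Fraction(375, 420),
    ("S", ("NP", "VP", ".")): Fraction(320, 392),
    ("NP", ("PRP",)): Fraction(127, 539),
    ("VP", ("VBD", "ADJP")): Fraction(32, 401),
}


def derivation_probability(rules_used, rule_prob=RULE_PROB):
    """Multiply the conditional probabilities of every rule used in a derivation."""
    prob = Fraction(1)
    for rule in rules_used:
        prob *= rule_prob[rule]
    return prob


if __name__ == "__main__":
    # Partial derivation: ROOT -> S, S -> NP VP ., NP -> PRP, VP -> VBD ADJP
    partial = derivation_probability(list(RULE_PROB))
    print(partial, float(partial))
```

Using `Fraction` keeps the product exact. A real parser would typically sum log probabilities instead, to avoid underflow on large trees.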

Syntactic Analysis
Hurricane Emily howled toward Mexico's Caribbean coast on Sunday packing 135 mph winds and torrential rain and causing panic in Cancun, where frightened tourists squeezed into musty shelters.

Machine Translation
§ Translate text from one language to another
§ Recombines fragments of example translations
§ Challenges:
  § What fragments? [learning to translate]
  § How to make efficient? [fast translation search]

Levels of Transfer

Machine Translation
[demo: MT]
