Toward a goal of understanding the mind, Naomi Feldman, University of Maryland



SLIDE 1

Toward a goal of understanding the mind

Naomi Feldman
University of Maryland
December 4, 2014

SLIDE 2

Language in the mind

  • How do learners acquire speech sound categories?
  • How do listeners perceive speech in noisy situations?
  • How do listeners represent the speech they hear?
  • Higher-level questions about grammar, discourse, etc.

SLIDE 3

A common approach

measurements from the lab

  • Behavioral and neural data from humans in the lab
  • Cognitive model of the phenomenon being studied

SLIDE 4

A recurring theme in cognition

Well-designed systems are tuned to fit their environment

  • “emergent” (neural modeling)
  • “optimal” (cognitive science)
  • “data-driven” (computer science)

SLIDE 5

A recurring theme in cognition

Well-designed systems are tuned to fit their environment

  • “emergent” (neural modeling)
  • “optimal” (cognitive science)
  • “data-driven” (computer science)

To understand the mind, we need to study the environment

SLIDE 6

Understanding the mind

From cognitive/brain science: measurements from the lab

  • Behavioral and neural data from humans in the lab
  • Cognitive model of the phenomenon being studied

SLIDE 7

Understanding the mind

From computing/engineering: characteristics of the environment

  • Collections of data from the environment (e.g., corpora)
  • Features that help systems generalize from those data

From cognitive/brain science: measurements from the lab

  • Behavioral and neural data from humans in the lab
  • Cognitive model of the phenomenon being studied

SLIDE 8

“Big data for cognitive science”

  • Jimmy Lin
SLIDE 9

An example from language

  • Dynamic signal that’s changing continuously
  • Contains information in both frequency and time

How is speech represented?

Aren Jansen, Caitlin Richter

Supported by NSF BCS-1320410

SLIDE 10

Developing representations

[Timeline figure: age in months, 6 to 12]

Speech perception becomes tuned to the native language

(Werker & Tees, 1984; Kuhl et al., 1992)

SLIDE 11

Developing representations

[Timeline figure: age in months, 6 to 12]

  • 6-8 months: discriminate non-native consonant contrasts
  • 10-12 months: poor discrimination of non-native consonant contrasts

Speech perception becomes tuned to the native language

(Werker & Tees, 1984; Kuhl et al., 1992)

SLIDE 12

Developing representations

[Timeline figure: age in months, 6 to 12]

  • 6 months: some language-specific perception of vowels
  • 6-8 months: discriminate non-native consonant contrasts
  • 10-12 months: poor discrimination of non-native consonant contrasts

Speech perception becomes tuned to the native language

(Werker & Tees, 1984; Kuhl et al., 1992)

SLIDE 13

Representation matters

Bob vs. Siri

People are much better than speech recognition systems at generalizing from the data they hear

SLIDE 14

Representation matters

Bob vs. Siri

People are much better than speech recognition systems at generalizing from the data they hear

Non-native listeners often fail to perceive unfamiliar phonetic distinctions

SLIDE 15

How is speech represented?

From computing/engineering: characteristics of the environment

  • Speech corpora from many different languages
  • Effective methods for representing the speech signal

From cognitive/brain science: measurements from the lab

  • Extensive data from human listeners
  • Cognitive models of language acquisition and processing

SLIDE 16

How is speech represented?

distribution of sounds in the input

listeners’ performance on a discrimination task

“same” or “different”?

Cognitive model that connects distributions of sounds in the input to performance on a laboratory task

(Feldman et al., 2009)
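
One way to picture a model of this kind is a listener who treats each stimulus as a noisy version of an intended sound drawn from a Gaussian category, and who perceives the posterior mean of that intended sound. The sketch below is a minimal illustration under that assumption, not the published implementation; the function names, the two categories, and all numeric values are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def perceived_value(stimulus, categories, noise_var):
    """Posterior mean of the intended sound given a noisy stimulus.

    categories: list of (prior, mean, variance) tuples describing the
    distribution of sounds in the input (hypothetical values in this sketch).
    """
    # p(category | stimulus) is proportional to prior * N(stimulus; mean, var + noise)
    weights = [p * gaussian_pdf(stimulus, mu, var + noise_var)
               for p, mu, var in categories]
    total = sum(weights)
    estimate = 0.0
    for w, (_, mu, var) in zip(weights, categories):
        # Within one category, the estimate is pulled toward the category mean.
        shrunk = (var * stimulus + noise_var * mu) / (var + noise_var)
        estimate += (w / total) * shrunk
    return estimate


def perceptual_distance(s1, s2, categories, noise_var):
    """Predicted discriminability: distance between the two perceived values."""
    return abs(perceived_value(s1, categories, noise_var)
               - perceived_value(s2, categories, noise_var))


# Two hypothetical sound categories on a one-dimensional acoustic continuum.
CATEGORIES = [(0.5, -1.0, 0.25), (0.5, 1.0, 0.25)]
NOISE_VAR = 0.25

# Both pairs are 0.4 apart acoustically.
within = perceptual_distance(1.0, 1.4, CATEGORIES, NOISE_VAR)
across = perceptual_distance(-0.2, 0.2, CATEGORIES, NOISE_VAR)
print(f"within-category: {within:.3f}, across-boundary: {across:.3f}")
```

Under these assumed settings, the pair that straddles the category boundary is predicted to be easier to tell apart than the equally spaced pair near a category center, which is the kind of link between input distributions and "same"/"different" judgments the slide describes.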

SLIDE 17

How is speech represented?

Different representations imply different input distributions

Which representations best predict human discrimination data?

Speech Representation 1 Speech Representation 2

listeners’ performance on a discrimination task

“same” or “different”?

SLIDE 18

Speaker normalization

[Figure: log likelihood as a function of number of dimensions (1 to 12), for unnormalized and normalized representations]

(Richter et al., in prep)
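
A comparison of this kind can be sketched as scoring each candidate representation by the log likelihood it assigns to listeners' "same"/"different" responses. The sketch below assumes a simple binomial response model and is not the analysis from Richter et al.; the function name and every number in it are hypothetical.

```python
import math


def log_likelihood(predicted_p_different, n_different, n_trials):
    """Log likelihood of observed "different" counts under predicted probabilities.

    The binomial coefficient is omitted because it is the same for every
    representation being compared on the same data.
    """
    total = 0.0
    for p, k, n in zip(predicted_p_different, n_different, n_trials):
        p = min(max(p, 1e-9), 1 - 1e-9)  # keep the logarithms finite
        total += k * math.log(p) + (n - k) * math.log(1 - p)
    return total


# Hypothetical human data: three stimulus pairs, 20 trials each.
n_trials = [20, 20, 20]
n_different = [18, 10, 3]

# Hypothetical predictions derived from two speech representations.
representation_1 = [0.90, 0.50, 0.15]
representation_2 = [0.60, 0.60, 0.60]

ll_1 = log_likelihood(representation_1, n_different, n_trials)
ll_2 = log_likelihood(representation_2, n_different, n_trials)
best = "Representation 1" if ll_1 > ll_2 else "Representation 2"
print(f"{ll_1:.2f} vs {ll_2:.2f}: {best} predicts the data better")
```

A higher (less negative) log likelihood means the representation's predicted discrimination pattern is closer to what listeners actually did.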

SLIDE 19

A central role for the mind

From computing/engineering: characteristics of the environment

  • Collections of data from the environment (e.g., corpora)
  • Features that help systems generalize from those data

From cognitive/brain science: measurements from the lab

  • Behavioral and neural data from humans in the lab
  • Cognitive model of the phenomenon being studied

SLIDE 20

Benefit to cognitive science

  • Ecological validity for evaluating hypotheses about cognitive representations of speech
  • Engineering tools provide hypotheses and insights regarding cognitive representations
      • Methods for normalizing across speakers (Wegmann et al., 1996)
      • RASTA is essentially an edge detector for speech (Hermansky & Morgan, 1994)
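
The "edge detector" description can be illustrated with a band-pass filter applied over time to one log-energy trajectory: steady input is suppressed and changes are emphasized. The sketch below is a RASTA-style filter only; the recursion and its coefficients follow the form commonly quoted for RASTA but are treated as an assumption here, and the function name and test signals are hypothetical.

```python
def rasta_like_filter(log_energy, pole=0.98):
    """Band-pass filter one log-energy trajectory over time.

    Assumed recursion for this sketch:
        y[t] = pole * y[t-1] + 0.1 * (2*x[t] + x[t-1] - x[t-3] - 2*x[t-4])
    Samples before the start are taken to equal the first sample, so a
    signal that never changes produces no output.
    """
    if not log_energy:
        return []
    padded = [log_energy[0]] * 4 + list(log_energy)
    output = []
    previous = 0.0
    for t in range(4, len(padded)):
        # Weighted difference of recent samples: a smoothed temporal slope.
        slope = 0.1 * (2 * padded[t] + padded[t - 1]
                       - padded[t - 3] - 2 * padded[t - 4])
        # Leaky integration lets the response to an edge decay slowly.
        previous = pole * previous + slope
        output.append(previous)
    return output


# A steady trajectory versus one with an abrupt onset at frame 10.
steady = rasta_like_filter([5.0] * 50)
onset = rasta_like_filter([0.0] * 10 + [1.0] * 190)
peak = max(onset)
print(f"steady max: {max(steady):.3f}, onset peak at frame {onset.index(peak)}")
```

The steady trajectory yields zero output, while the onset yields a transient that peaks shortly after the change and then decays, which is the sense in which such a filter responds to edges in the signal.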

SLIDE 21

Benefit to engineering

  • Speech representations that yield good performance on speech recognition tasks also predict human data best (Richter et al., in prep)
  • Can cognitive models of phonetic learning improve zero-resource speech recognition systems that learn representations from unlabeled data?

SLIDE 22

Connections to neuroscience?

  • Existing data on neural activity when listening to speech (e.g., Mesgarani et al., 2014; Näätänen et al., 1997; Toscano et al., 2010)
  • Use neural data to investigate relationships between neural activation patterns and cognitive representations
  • How do feature representations relate to neural activations computed from the same stretch of speech?