SLIDE 1

Toward a goal of understanding the mind

Naomi Feldman University of Maryland December 4, 2014

SLIDE 2

Language in the mind

• How do learners acquire speech sound categories?
• How do listeners perceive speech in noisy situations?
• How do listeners represent the speech they hear?
• Higher-level questions about grammar, discourse, etc.

SLIDE 3

A common approach

measurements from the lab

• Behavioral and neural data from humans in the lab
• Cognitive model of the phenomenon being studied

SLIDE 4

A recurring theme in cognition

Well-designed systems are tuned to fit their environment

• “emergent” (neural modeling)
• “optimal” (cognitive science)
• “data-driven” (computer science)

SLIDE 5

A recurring theme in cognition

Well-designed systems are tuned to fit their environment

• “emergent” (neural modeling)
• “optimal” (cognitive science)
• “data-driven” (computer science)

To understand the mind, we need to study the environment

SLIDE 6

Understanding the mind

From cognitive/brain science: measurements from the lab

• Behavioral and neural data from humans in the lab
• Cognitive model of the phenomenon being studied

SLIDE 7

Understanding the mind

From computing/engineering: characteristics of the environment

• Collections of data from the environment (e.g., corpora)
• Features that help systems generalize from those data

From cognitive/brain science: measurements from the lab

• Behavioral and neural data from humans in the lab
• Cognitive model of the phenomenon being studied

SLIDE 8

“Big data for cognitive science”

Jimmy Lin

SLIDE 9

An example from language

• Dynamic signal that’s changing continuously
• Contains information in both frequency and time

How is speech represented?

Aren Jansen Caitlin Richter

Supported by NSF BCS-1320410

SLIDE 10

Developing representations

[Timeline: age in months, 6 to 12]

Speech perception becomes tuned to the native language

(Werker & Tees, 1984; Kuhl et al., 1992)

SLIDE 11

Developing representations

[Timeline: age in months, 6 to 12]

• 6-8 months: discriminate non-native consonant contrasts
• 10-12 months: poor discrimination of non-native consonant contrasts

Speech perception becomes tuned to the native language

(Werker & Tees, 1984; Kuhl et al., 1992)

SLIDE 12

Developing representations

[Timeline: age in months, 6 to 12]

• 6 months: some language-specific perception of vowels
• 6-8 months: discriminate non-native consonant contrasts
• 10-12 months: poor discrimination of non-native consonant contrasts

Speech perception becomes tuned to the native language

(Werker & Tees, 1984; Kuhl et al., 1992)

SLIDE 13

Representation matters

Bob vs. Siri

People are much better than speech recognition systems at generalizing from the data they hear

SLIDE 14

Representation matters

Bob vs. Siri

People are much better than speech recognition systems at generalizing from the data they hear

Non-native listeners often fail to perceive unfamiliar phonetic distinctions

SLIDE 15

How is speech represented?

From computing/engineering: characteristics of the environment

• Speech corpora from many different languages
• Effective methods for representing the speech signal

From cognitive/brain science: measurements from the lab

• Extensive data from human listeners
• Cognitive models of language acquisition and processing

SLIDE 16

How is speech represented?

distribution of sounds in the input

listeners’ performance on a discrimination task

“same” or “different”?

Cognitive model that connects distributions of sounds in the input to performance on a laboratory task

(Feldman et al., 2009)
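
One way a model could connect the distribution of sounds in the input to "same"/"different" judgments is sketched below. This is an illustrative simplification, not the published implementation: it assumes a one-dimensional acoustic space, Gaussian sound categories, and Gaussian noise, and it treats the listener's percept as the posterior mean of the speaker's intended target. The function names, the `(prior, mean, variance)` category format, and all numbers are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a Normal(mean, var) distribution at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def perceived_value(stimulus, categories, noise_var):
    """Posterior mean of the intended target, given a noisy stimulus.

    Assumption: target ~ Normal(mean, var) within each category, and
    stimulus | target ~ Normal(target, noise_var).
    categories is a list of (prior, mean, var) tuples (hypothetical format).
    """
    # Unnormalized posterior probability of each category given the stimulus.
    weights = [
        prior * normal_pdf(stimulus, mean, var + noise_var)
        for prior, mean, var in categories
    ]
    total = sum(weights)
    estimate = 0.0
    for weight, (_prior, mean, var) in zip(weights, categories):
        # Within a category, the percept is pulled toward the category mean.
        shrink = var / (var + noise_var)
        estimate += (weight / total) * (shrink * stimulus + (1.0 - shrink) * mean)
    return estimate


def perceived_distance(s1, s2, categories, noise_var):
    """Distance between two percepts; larger values predict more 'different' responses."""
    return abs(
        perceived_value(s1, categories, noise_var)
        - perceived_value(s2, categories, noise_var)
    )


# Hypothetical input distribution: two equally likely categories.
example_categories = [(0.5, 0.0, 1.0), (0.5, 10.0, 1.0)]
within_category = perceived_distance(-0.5, 0.5, example_categories, noise_var=1.0)
across_boundary = perceived_distance(4.5, 5.5, example_categories, noise_var=1.0)
print(within_category, across_boundary)
```

Under these assumptions, two stimuli that are equally far apart acoustically are predicted to be easier to tell apart near a category boundary than near a category center, so changing the input distribution changes the predicted discrimination pattern.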

SLIDE 17

How is speech represented?

Different representations imply different input distributions

Which representations best predict human discrimination data?

Speech Representation 1 Speech Representation 2

listeners’ performance on a discrimination task

“same” or “different”?
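
A simple way to ask which representation best predicts discrimination data is to score each one by the log likelihood it assigns to listeners' responses. The sketch below assumes that each representation, combined with some cognitive model, yields a predicted probability of a "different" response on every trial. It is a generic Bernoulli log likelihood and not necessarily the analysis used in the work cited in this talk. All names and numbers are hypothetical.

```python
import math


def log_likelihood(predicted_p_different, responses):
    """Bernoulli log likelihood of same/different responses.

    predicted_p_different: model probability of a 'different' response per trial.
    responses: True if the listener said 'different', False for 'same'.
    """
    total = 0.0
    for p, said_different in zip(predicted_p_different, responses):
        # Clip to avoid log(0) when a model is overconfident.
        p = min(max(p, 1e-9), 1.0 - 1e-9)
        total += math.log(p) if said_different else math.log(1.0 - p)
    return total


def best_representation(predictions_by_representation, responses):
    """Return the name of the highest-scoring representation and all scores."""
    scores = {
        name: log_likelihood(predictions, responses)
        for name, predictions in predictions_by_representation.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical data: four trials and two candidate representations.
example_responses = [True, True, False, False]
example_predictions = {
    "representation_1": [0.9, 0.8, 0.2, 0.1],
    "representation_2": [0.5, 0.5, 0.5, 0.5],
}
winner, all_scores = best_representation(example_predictions, example_responses)
print(winner, all_scores)
```

Higher (less negative) log likelihood means the representation's predictions match the human responses more closely.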

SLIDE 18

Speaker normalization

[Figure: log likelihood (y-axis, 200 to 1800) by number of dimensions (x-axis, 1 to 12), for unnormalized and normalized representations]

(Richter et al., in prep)

SLIDE 19

A central role for the mind

From computing/engineering: characteristics of the environment

• Collections of data from the environment (e.g., corpora)
• Features that help systems generalize from those data

From cognitive/brain science: measurements from the lab

• Behavioral and neural data from humans in the lab
• Cognitive model of the phenomenon being studied

SLIDE 20

Benefit to cognitive science

• Ecological validity for evaluating hypotheses about cognitive representations of speech
• Engineering tools provide hypotheses and insights regarding cognitive representations
  - Methods for normalizing across speakers (Wegmann et al., 1996)
  - RASTA is essentially an edge detector for speech (Hermansky & Morgan, 1994)
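
To make "normalizing across speakers" concrete, the sketch below uses per-speaker z-scoring of a single acoustic measurement. This is a deliberately simple stand-in and not the method cited above. The `(speaker_id, value)` input format and the function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def normalize_by_speaker(tokens):
    """Z-score each value relative to its own speaker's mean and spread.

    tokens: list of (speaker_id, value) pairs (hypothetical format).
    Returns the normalized values in the original order.
    """
    values_by_speaker = defaultdict(list)
    for speaker, value in tokens:
        values_by_speaker[speaker].append(value)

    # Per-speaker mean and population standard deviation.
    stats = {
        speaker: (mean(values), pstdev(values))
        for speaker, values in values_by_speaker.items()
    }

    normalized = []
    for speaker, value in tokens:
        speaker_mean, speaker_sd = stats[speaker]
        # A speaker with no variation gets 0.0 rather than a division by zero.
        normalized.append((value - speaker_mean) / speaker_sd if speaker_sd > 0 else 0.0)
    return normalized


# Hypothetical measurements from two speakers with different overall ranges.
example_tokens = [("a", 100.0), ("a", 200.0), ("b", 300.0), ("b", 500.0)]
print(normalize_by_speaker(example_tokens))
```

After this kind of normalization, tokens from different speakers are expressed on a common scale, so category structure is not confounded with differences in each speaker's overall range.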

SLIDE 21

Benefit to engineering

• Speech representations that yield good performance on speech recognition tasks also predict human data best (Richter et al., in prep)
• Can cognitive models of phonetic learning improve zero-resource speech recognition systems that learn representations from unlabeled data?

SLIDE 22

Connections to neuroscience?

• Existing data on neural activity when listening to speech (e.g., Mesgarani et al., 2014; Näätänen et al., 1997; Toscano et al., 2010)
• Use neural data to investigate relationships between neural activation patterns and cognitive representations
• How do feature representations relate to neural activations computed from the same stretch of speech?