Score Distribution Based Term Specific Thresholding for Spoken Term Detection



slide-1
SLIDE 1

Score Distribution Based Term Specific Thresholding for Spoken Term Detection

  • D. Can
  • M. Saraçlar

Boğaziçi University, Department of Electrical & Electronics Engineering, BUSIM Lab

slide-2
SLIDE 2

Introduction Thresholding for Spoken Term Detection Experiments Summary

Outline

1. Introduction
2. Thresholding for Spoken Term Detection
     • Global Thresholding
     • Term Weighted Value Based Term Specific Thresholding
     • Score Distribution Based Term Specific Thresholding
3. Experiments
     • Setup
     • Results

Can, Saraçlar: Score Distribution Based Term Specific Thresholding for STD

slide-3
SLIDE 3

Application: Sign Dictionary

slide-4
SLIDE 4

Anatomy of a Spoken Term Detection (STD) System

Block diagram, with one stage highlighted per overlay:

  • INDEXING: Speech Database → ASR → Index
  • SEARCH: User Query → Preprocess → Search Engine (over the Index)
  • RETRIEVAL: Is the candidate score larger than τ? yes → Retrieve, no → Dispose
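The three stages of the diagram can be sketched in a few lines of Python. This is only a toy illustration, not the system from the talk: the `Candidate` record, the helpers `build_index`, `search`, and `retrieve`, and the example ASR hypotheses are all hypothetical, and a real system would index recognition lattices rather than a flat word list.

```python
from collections import defaultdict
from typing import NamedTuple


class Candidate(NamedTuple):
    """One hypothesized occurrence of a term (hypothetical record layout)."""
    utterance: str
    start_sec: float
    score: float  # posterior score in [0, 1]


def build_index(asr_output):
    """INDEXING: map each recognized term to its candidate occurrences."""
    index = defaultdict(list)
    for term, utterance, start_sec, score in asr_output:
        index[term.lower()].append(Candidate(utterance, start_sec, score))
    return index


def search(index, query):
    """SEARCH: preprocess the query and look up its candidates."""
    return index.get(query.strip().lower(), [])


def retrieve(candidates, tau):
    """RETRIEVAL: keep candidates whose score is larger than tau."""
    return [c for c in candidates if c.score > tau]


# Made-up ASR hypotheses: (term, utterance id, start time, posterior score).
asr_output = [
    ("haber", "utt1", 0.4, 0.92),
    ("haber", "utt2", 3.1, 0.15),
    ("spor", "utt2", 5.0, 0.70),
]
index = build_index(asr_output)
hits = retrieve(search(index, " Haber "), tau=0.5)
```

The rest of the talk is about how to choose τ, which this sketch simply takes as an argument.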

slide-9
SLIDE 9

Global Thresholding

[Figure: normalized histogram of posterior scores for an example query. x-axis: Posterior Score; y-axis: n. Curves: Incorrect Class Distribution, Correct Class Distribution, Incorrect Class EM Estimate, Correct Class EM Estimate. The threshold splits the score axis into a Reject region and an Accept region.]

  • Pick a global threshold θ for all query terms
  • Apply binary thresholding
  • Vary θ for different operating points
  • No term specific behavior, no joint processing of candidates, hence poor performance!
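As a minimal sketch of global thresholding, the snippet below applies one θ to the candidates of every query term and sweeps θ to move between operating points. The function name `global_threshold` and the scores are made up for illustration.

```python
def global_threshold(candidates_by_term, theta):
    """Accept every candidate whose score exceeds the single threshold theta."""
    return {
        term: [score > theta for score in scores]
        for term, scores in candidates_by_term.items()
    }


# Made-up posterior scores for two query terms.
candidates_by_term = {
    "frequent_term": [0.95, 0.80, 0.55, 0.10],
    "rare_term": [0.45, 0.05],
}

# Sweeping theta traces out different operating points.
for theta in (0.3, 0.5, 0.7):
    decisions = global_threshold(candidates_by_term, theta)
    accepted = sum(sum(d) for d in decisions.values())
    print(f"theta={theta:.1f}: {accepted} candidates accepted")
```

Because θ ignores which term a candidate belongs to, the best candidate of `rare_term` is rejected as soon as θ passes 0.45, regardless of how the scores of that term are distributed.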

slide-15
SLIDE 15

Term Weighted Value (TWV) [NIST, 2006]

$$\mathrm{TWV} = 1 - \frac{1}{Q} \sum_{k=1}^{Q} \left\{ P_{\mathrm{miss}}(q_k) + \beta\, P_{\mathrm{FA}}(q_k) \right\}$$

$$P_{\mathrm{miss}}(q_k) = 1 - \frac{C(q_k)}{R(q_k)}, \qquad P_{\mathrm{FA}}(q_k) = \frac{A(q_k) - C(q_k)}{T - C(q_k)}$$

  • $Q$: Number of queries
  • $R(q_k)$: Number of occurrences of query $q_k$
  • $A(q_k)$: Total number of retrieved documents for $q_k$
  • $C(q_k)$: Number of correctly retrieved documents for $q_k$
  • $T$: Total duration of the speech archive
  • $\beta$: Cost of false alarms relative to hits
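The TWV formula can be evaluated directly from per-query counts. The sketch below follows the definitions on this slide; the function name `term_weighted_value`, the counts, the duration, and the value of β are made-up illustration values, and the code assumes every query occurs at least once (R > 0).

```python
def term_weighted_value(stats, total_duration, beta):
    """Compute TWV from per-query counts.

    stats: one (R, A, C) tuple per query, where R is the number of true
    occurrences, A the number of retrieved documents, and C the number of
    correctly retrieved documents. Assumes R > 0 for every query.
    """
    total = 0.0
    for r, a, c in stats:
        p_miss = 1.0 - c / r
        p_fa = (a - c) / (total_duration - c)
        total += p_miss + beta * p_fa
    return 1.0 - total / len(stats)


# Two made-up queries: one with a miss and two false alarms, one perfect.
stats = [(4, 5, 3), (2, 2, 2)]
twv = term_weighted_value(stats, total_duration=1000.0, beta=10.0)
```

A system that retrieves exactly the true occurrences of every query scores 1, and every miss or false alarm lowers the value.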

slide-16
SLIDE 16

TWV Based Term Specific Thresholding [Miller et al., 2007]

$$\hat{V}_{\mathrm{hit}}(q_k) = \frac{1}{\hat{N}_{\mathrm{true}}(q_k)}, \qquad \hat{C}_{\mathrm{FA}}(q_k) = \frac{\beta}{T - \hat{N}_{\mathrm{true}}(q_k)}$$

$$\hat{\theta}(q_k) = \frac{\hat{C}_{\mathrm{FA}}(q_k)}{\hat{C}_{\mathrm{FA}}(q_k) + \hat{V}_{\mathrm{hit}}(q_k)}$$

  • $\hat{N}_{\mathrm{true}}(q_k)$: Expected count of occurrences of $q_k$, estimated as the sum of the posterior scores of its candidates
  • $\hat{\theta}(q_k)$: Optimal threshold for $q_k$, maximizing TWV in the expected sense
  • Term specific expected counts → Term specific thresholds
  • Vary β for different operating points
  • Only the sum of individual scores affects the threshold!
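A minimal sketch of the TWV based threshold, assuming the expected count is the sum of the candidate posterior scores for the term. The function name `twv_threshold` and all numbers are illustrative, and the code assumes at least one candidate with a nonzero score.

```python
def twv_threshold(scores, total_duration, beta):
    """Term specific threshold from the expected count of true occurrences."""
    n_true = sum(scores)                     # expected count of occurrences
    v_hit = 1.0 / n_true                     # estimated value of a hit
    c_fa = beta / (total_duration - n_true)  # estimated cost of a false alarm
    return c_fa / (c_fa + v_hit)


# Two made-up terms with different score profiles but the same score sum.
peaked = [0.9, 0.9, 0.2]
flat = [0.9, 0.6, 0.5]
theta_peaked = twv_threshold(peaked, total_duration=1000.0, beta=100.0)
theta_flat = twv_threshold(flat, total_duration=1000.0, beta=100.0)
```

Both terms receive the same threshold even though their scores are distributed differently, which is the limitation noted in the last bullet.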

slide-19
SLIDE 19

Exploiting Score Distributions [Manmatha et al., 2001]

[Figure: normalized histogram of posterior scores for an example query. x-axis: Posterior Score; y-axis: n. Curves: Incorrect Class Distribution, Correct Class Distribution, Incorrect Class EM Estimate, Correct Class EM Estimate.]

  • Scores follow exponential-like distributions
  • Model both classes ($c_0$, $c_1$) with exponential distributions:

$$p(x \mid c_0) = \lambda_0 e^{-\lambda_0 x}, \qquad p(x \mid c_1) = \lambda_1 e^{-\lambda_1 (1 - x)}$$

  • Model all candidates as a mixture of exponentials:

$$p(x) = P(c_0)\, p(x \mid c_0) + P(c_1)\, p(x \mid c_1)$$

  • Use EM to estimate the parameters ($\lambda_0$, $\lambda_1$, $P(c_0)$, $P(c_1)$)
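The mixture model is small enough to write out directly. The sketch below codes the two class densities and the mixture as they appear on the slide, plus the class posterior obtained from Bayes' rule. The function names and parameter values are hypothetical, and the densities are left unnormalized on [0, 1], as in the formulas above.

```python
import math


def p_incorrect(x, lam0):
    """p(x | c0): decays away from score 0."""
    return lam0 * math.exp(-lam0 * x)


def p_correct(x, lam1):
    """p(x | c1): decays away from score 1."""
    return lam1 * math.exp(-lam1 * (1.0 - x))


def mixture_pdf(x, lam0, lam1, prior0):
    """p(x) = P(c0) p(x | c0) + P(c1) p(x | c1), with P(c1) = 1 - P(c0)."""
    return prior0 * p_incorrect(x, lam0) + (1.0 - prior0) * p_correct(x, lam1)


def posterior_correct(x, lam0, lam1, prior0):
    """P(c1 | x) by Bayes' rule."""
    return (1.0 - prior0) * p_correct(x, lam1) / mixture_pdf(x, lam0, lam1, prior0)


# Made-up parameters for one query term.
params = dict(lam0=8.0, lam1=6.0, prior0=0.7)
low = posterior_correct(0.05, **params)
high = posterior_correct(0.95, **params)
```

With these values a candidate scoring 0.05 is almost certainly incorrect and one scoring 0.95 almost certainly correct; the interesting decisions are in between.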

slide-21
SLIDE 21

Computing Term Specific Thresholds

Cost scheme:

$$\gamma_c(d) = \begin{cases} 1, & d = 0,\ c \in c_1 \\ \alpha, & d = 1,\ c \in c_0 \end{cases}$$

where $c$ is a candidate, $d$ is a decision, and $\alpha$ is the cost of false alarms relative to hits.

  • Estimate mixture parameters → each component ∼ a class, mixture weights ∼ priors
  • The Bayes-optimal threshold θ is given as:

$$\theta = \frac{\lambda_1 + \log(\lambda_0 / \lambda_1) + \log\big(P(c_0) / P(c_1)\big) + \log \alpha}{\lambda_0 + \lambda_1}$$

  • Different operating points can be achieved by changing α.
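The threshold follows from accepting a candidate when rejecting it has the higher expected cost, that is, when P(c1) p(x|c1) > α P(c0) p(x|c0); taking logarithms and solving for x gives the expression for θ. The sketch below computes θ and checks numerically that the two cost-weighted terms balance there. The function name `bayes_threshold` and the parameter values are illustrative assumptions.

```python
import math


def bayes_threshold(lam0, lam1, prior0, alpha):
    """Bayes-optimal score threshold for the two-exponential mixture."""
    prior1 = 1.0 - prior0
    numerator = (
        lam1
        + math.log(lam0 / lam1)
        + math.log(prior0 / prior1)
        + math.log(alpha)
    )
    return numerator / (lam0 + lam1)


# Made-up mixture parameters for one query term.
lam0, lam1, prior0 = 8.0, 6.0, 0.7
theta = bayes_threshold(lam0, lam1, prior0, alpha=1.0)

# At theta the cost of a false alarm and the cost of a miss are equal.
cost_accept = 1.0 * prior0 * lam0 * math.exp(-lam0 * theta)
cost_reject = (1.0 - prior0) * lam1 * math.exp(-lam1 * (1.0 - theta))
```

Raising α makes false alarms more expensive and pushes θ up, which is how different operating points are reached.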

slide-23
SLIDE 23

Experimental Setup

Query Set
  • 10229 single word queries selected from Turkish Radio and Television Channel 2 (TRT2) hearing impaired news

LVCSR System
  • Acoustic Data: 111 hours of BN data
      • Train set: 100 hours (from various TV and radio broadcasts)
      • Test set: 11 hours (from TRT2 hearing impaired news)
  • Language Data: 100M words from various text sources
  • IBM Attila Speech Recognition Toolkit
      • Baseline MLE models
      • WER on the test set: 17%

slide-25
SLIDE 25

Experimental Results

[Figure: precision-recall curves. Axes: Precision and Recall. Curves: Global Thresholding; Term Specific Thresholding (TWV); EMM + EM + MBR Detection; Cheat + EMM + MBR Detection.]

  • Better performance in the high precision region
  • Large room for improvement
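For reference, one common way to obtain a point on a precision-recall curve is to pool the accept/reject decisions over all candidates and compare them with ground truth labels. The slides do not say how the curves were computed, so the sketch below is an assumption; `precision_recall` and the data are made up, and recall here only counts true occurrences that appear in the candidate list.

```python
def precision_recall(decisions, labels):
    """Pooled precision and recall for boolean decisions and labels."""
    accepted = sum(decisions)
    correct = sum(1 for d, y in zip(decisions, labels) if d and y)
    relevant = sum(labels)
    precision = correct / accepted if accepted else 1.0
    recall = correct / relevant if relevant else 1.0
    return precision, recall


# Made-up candidate scores and ground truth labels.
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [True, True, False, True, False]

# Each threshold yields one operating point on the curve.
curve = []
for theta in (0.5, 0.7):
    decisions = [s > theta for s in scores]
    curve.append(precision_recall(decisions, labels))
```

Sweeping the operating-point parameter of each method (θ, β, or α) and collecting these pairs gives one curve per method.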

slide-28
SLIDE 28

Summary

  • Exploiting score distributions leads to a viable term specific thresholding method
  • Proposed method has a large potential as indicated by the cheat experiment
  • Superior to TWV based method in the high precision region

slide-29
SLIDE 29

Appendix

References I

Manmatha, R., Rath, T., and Feng, F. (2001). Modeling score distributions for combining the outputs of search engines. In SIGIR '01, pages 267–275, New York, NY, USA. ACM.

Miller, D. R. H., Kleber, M., Kao, C., Kimball, O., Colthurst, T., Lowe, S. A., Schwartz, R. M., and Gish, H. (2007). Rapid and accurate spoken term detection. In Proc. Interspeech, pages 314–317.

NIST (2006). The Spoken Term Detection (STD) 2006 Evaluation Plan. http://www.nist.gov/speech/tests/std/.

slide-30
SLIDE 30

Appendix

EM Parameter Updates

Model all candidates as a mixture of exponentials:

$$p(x) = P(c_0)\, p(x \mid c_0) + P(c_1)\, p(x \mid c_1)$$

Use EM to estimate the parameters ($\lambda_0$, $\lambda_1$, $P(c_0)$, $P(c_1)$) given the scores $x_i$ for $i = 1, \dots, N$.

First compute, for $j = 0, 1$:

$$P(c_j \mid x_i) = \frac{P(c_j)\, p(x_i \mid c_j)}{p(x_i)}$$

Then update:

$$P(c_j) = \frac{1}{N} \sum_i P(c_j \mid x_i), \qquad \lambda_0 = \frac{\sum_i P(c_0 \mid x_i)}{\sum_i P(c_0 \mid x_i)\, x_i}, \qquad \lambda_1 = \frac{\sum_i P(c_1 \mid x_i)}{\sum_i P(c_1 \mid x_i)\,(1 - x_i)}$$
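The updates translate almost line by line into code. The following is a minimal sketch, not the authors' implementation: the function name `em_exponential_mixture`, the initial values, the fixed iteration count, and the small `eps` guard against division by zero are all assumptions, and the scores are made up.

```python
import math


def em_exponential_mixture(scores, lam0=5.0, lam1=5.0, prior0=0.5, iterations=50):
    """Fit the two-exponential mixture to scores in [0, 1] with EM."""
    eps = 1e-12  # guard against division by zero (an added assumption)
    n = len(scores)
    for _ in range(iterations):
        # E-step: posterior of the incorrect class c0 for every score.
        post0 = []
        for x in scores:
            w0 = prior0 * lam0 * math.exp(-lam0 * x)
            w1 = (1.0 - prior0) * lam1 * math.exp(-lam1 * (1.0 - x))
            post0.append(w0 / (w0 + w1))
        # M-step: update the prior and the two rate parameters.
        sum0 = sum(post0)
        sum1 = n - sum0
        prior0 = sum0 / n
        lam0 = sum0 / max(sum(p * x for p, x in zip(post0, scores)), eps)
        lam1 = sum1 / max(
            sum((1.0 - p) * (1.0 - x) for p, x in zip(post0, scores)), eps
        )
    return lam0, lam1, prior0, 1.0 - prior0


# Made-up candidate scores for one query: six low, four high.
scores = [0.02, 0.05, 0.08, 0.10, 0.15, 0.20, 0.85, 0.90, 0.95, 0.97]
lam0, lam1, prior0, prior1 = em_exponential_mixture(scores)
```

On this toy data the estimated prior of the incorrect class settles near the fraction of low scores, and the fitted parameters can be passed straight to the threshold formula.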