DCU at the NTCIR-11 SpokenQuery&Doc Task
David N. Racca, Gareth J.F. Jones
CNGL Centre for Global Intelligent Content School of Computing, Dublin City University Dublin, Ireland
Overview
We participated in the slide-group SQ-SCR.
[Diagram: processing pipeline. Data: lecture WAV, query WAV, IPU WAV, manual annotated transcripts (labelled "Provided by organisers"). Components: VAD, OpenSMILE, Julius LVCSR, ChaSen. Steps: annotation removal ("%m %M %y"; also labelled "%M" and "%m %M or %m %y"), forced alignment, capitalisation, lecture normalisation. Features: F0 and loudness every 10 ms, raw and normalised. ASR: 10-best hypotheses per IPU. Outputs: manual transcripts, ASR transcripts, enriched manual transcripts, enriched ASR transcripts.]
Lecture normalisation: $v_{norm} = \frac{v_{raw} - \min_v}{\max_v - \min_v}$
[Figure: example of the prosodic features of one term occurrence, labelled "tf-idf".]
Pitch (F0): $\max(f0_{i,j}^{k}) = 280.44$ Hz raw, $0.58$ normalised
Loudness: $\max(l_{i,j}^{k}) = 1.16$ raw, $0.37$ normalised
Duration: start ≈ 1.02 s, end ≈ 2.36 s, $d = 2.36\,\mathrm{s} - 1.02\,\mathrm{s} = 1.34\,\mathrm{s}$
Lecture normalisation: $v_{norm} = \frac{v_{raw} - \min_v}{\max_v - \min_v}$
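The lecture normalisation and the duration feature above can be illustrated with a minimal sketch. The function name `min_max_normalise`, the list `raw_f0`, and its sample values are hypothetical and not taken from the slides; only the formula and the 1.02 s / 2.36 s timestamps come from the source. The handling of the constant-signal case is also an assumption.

```python
from typing import List, Sequence


def min_max_normalise(values: Sequence[float]) -> List[float]:
    """Scale values to [0, 1] using the min and max of the same lecture."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumed behaviour for a constant signal: map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical raw F0 samples (Hz) taken every 10 ms within one lecture.
raw_f0 = [120.0, 180.5, 280.44, 95.0, 400.0]
norm_f0 = min_max_normalise(raw_f0)

# Duration of a term occurrence from its aligned start and end times (s).
start, end = 1.02, 2.36
duration = end - start
```

Normalising per lecture, rather than over the whole collection, presumably reduces differences between speakers and recording conditions.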
[Equations: only the fragment $\ldots - \min_k\{\min(f0_{i,j}^{k})\}$ is recoverable from this slide.]
[Diagram: indexing pipeline. Enriched transcripts (IPUs with prosody) → IPU grouping → slide-group segments with prosody → Terrier indexing → segment index.]
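The IPU grouping step can be sketched as follows. The slides do not say how IPUs are assigned to slide groups, so this is an assumption: each IPU goes to the slide group whose time interval contains the IPU's start time. The `IPU` class, the `group_ipus` function, and the interval representation are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class IPU:
    start: float  # seconds from the beginning of the lecture
    end: float
    text: str


def group_ipus(ipus: List[IPU], groups: List[Tuple[float, float]]) -> Dict[int, str]:
    """Concatenate the text of every IPU whose start time falls in a slide group."""
    parts: Dict[int, List[str]] = {g: [] for g in range(len(groups))}
    for ipu in sorted(ipus, key=lambda u: u.start):
        for g, (g_start, g_end) in enumerate(groups):
            # Assumed rule: half-open interval [g_start, g_end).
            if g_start <= ipu.start < g_end:
                parts[g].append(ipu.text)
                break
    return {g: " ".join(p) for g, p in parts.items()}


# Hypothetical lecture with two slide groups and three IPUs.
segments = group_ipus(
    [IPU(0.0, 2.0, "a"), IPU(12.0, 13.5, "c"), IPU(3.0, 5.0, "b")],
    [(0.0, 10.0), (10.0, 20.0)],
)
```

Each resulting segment would then be indexed as one retrieval unit.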
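― Probabilistic model with BM25 weighting:
$tf(i,j) = \frac{\cdots}{tf_{i,j} + k_1\left(1 - b + b\,\frac{dl_j}{avdl}\right)}$ (numerator not recoverable from the slide)
$idf(i,C) = \log\left(\frac{N}{n_i} + 1\right)$
― Three definitions for [symbol lost in extraction] were explored: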
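The BM25-style weighting above can be sketched as below, under several assumptions. The numerator of the tf component is not legible on the slide, so this sketch takes it to be $tf_{i,j}$. The defaults `k1 = 1.2` and `b = 0.75`, the natural logarithm, and all function names are also illustrative choices, not values confirmed by the slides.

```python
import math


def bm25_tf(tf: float, dl: float, avdl: float, k1: float = 1.2, b: float = 0.75) -> float:
    """Saturating, length-normalised term frequency.

    ASSUMPTION: the numerator is tf; the slide only shows the denominator.
    """
    return tf / (tf + k1 * (1.0 - b + b * dl / avdl))


def idf(n_docs: int, doc_freq: int) -> float:
    """idf(i, C) = log(N / n_i + 1); the log base is assumed to be e."""
    return math.log(n_docs / doc_freq + 1.0)


def tf_idf_weight(tf: float, dl: float, avdl: float, n_docs: int, doc_freq: int) -> float:
    """Baseline weight of term i in segment j: tf(i, j) * idf(i, C)."""
    return bm25_tf(tf, dl, avdl) * idf(n_docs, doc_freq)
```

With this form the tf component grows with the raw count but stays below 1, and segments longer than the average length `avdl` are penalised through `b`.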
Lecture transcript   w(i,j)   ac(i,j)   Parameters   uMAP    pwMAP   fMAP
Manual               LI       LPr       0.7          .1369   .0976   .1005
Manual               LI       Pr        0.7          .1369   .0951   .0995
Manual               G        LP        1, 1         .1326   .0960   .0989
Manual               TF-IDF   -         -            .1270   .0950   .0972
Match                LI       LPr       0.5          .0842   .0508   .0524
Match                LI       Dur       0.3          .0819   .0498   .0521
Match                G        Pr        1, 1         .0786   .0473   .0499
Match                LI       Pr        0.7          .0778   .0490   .0501
Match                TF-IDF   -         -            .0682   .0477   .0486
UnmatchAMLM          G        P         3, 1         .0288   .0208   .0131
UnmatchAMLM          LI       LP        0.5          .0278   .0210   .0135
UnmatchAMLM          LI       LPr       0.2          .0271   .0205   .0132
UnmatchAMLM          LI       P         0.9          .0227   .0206   .0129
UnmatchAMLM          TF-IDF   -         -            .0222   .0203   .0128
Parameters: α, θir, θac
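To make the size of the differences in the table above easier to read, the relative uMAP gain of the top prosodic run over the TF-IDF baseline can be computed for each transcript type. This is a small arithmetic sketch using only values from the table. The helper `relative_gain` is a hypothetical name, and treating the first row of each group as the best run is an assumption based on uMAP.

```python
def relative_gain(score: float, baseline: float) -> float:
    """Relative improvement over the baseline, in percent."""
    return 100.0 * (score - baseline) / baseline


# (best prosodic uMAP, TF-IDF uMAP) per transcript type, copied from the table.
umap = {
    "Manual": (0.1369, 0.1270),
    "Match": (0.0842, 0.0682),
    "UnmatchAMLM": (0.0288, 0.0222),
}

gains = {name: round(relative_gain(best, base), 1) for name, (best, base) in umap.items()}
```

The absolute differences are small in every condition, so the percentages should be read with the low baseline scores in mind.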
[Figure: MAP by spoken query type (Manual, Match, UnmatchAMLM) for LI-Pr-0.7, LI-LPr-0.7 and TF_IDF.]
[Figure: MAP by spoken query type (Manual, Match, UnmatchAMLM) for LI-LPr-0.5, LI-Pr-0.7, LI-Dur-0.3 and TF_IDF.]
[Figure: MAP by spoken query type (Manual, Match, UnmatchAMLM) for LI-LPr-0.2, LI-LPr-0.5, LI-P-0.9 and TF_IDF.]
[Figure: Query 1, prosodic-based vs TF_IDF. AveP by spoken query type (Manual, Match, Unmatch).]