Developing, Collecting and Sharing DyViS: A Forensic Phonetic Journey
Kirsty McDougall
(with particular thanks to Francis Nolan, Gea de Jong-Lendle and Toby Hudson)
WYRED Workshop, IAFPA 2018, University of Huddersfield
DyViS Collaborators
Toby Hudson, Gea de Jong-Lendle, Francis Nolan
Image: http://nomanbefore.com/oxford-vs-cambridge/
(or at least determine how frequently particular voice features or combinations occur), we need population databases for speech
→ populations must be phonetically controlled, i.e. contain a large number of speakers with the same:
→ need to hold demographic characteristics constant and examine variation between the speakers (“between-speaker variation”)
very few large-scale speech databases for phonetically controlled populations available
German
Japanese
(Japanese National Research Institute for Police Sciences) (Osanai, Tanimosto, Kido and Suzuki 1995)
Mean f0 data for 100 male & 50 female German speakers (Künzel 1989: 121, Figure 3)
is not straightforward
limits, but extensive variability possible within these
background noise…
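A small numerical sketch may help separate the two kinds of variation discussed here: between-speaker variation (how far speakers' averages differ from one another) and within-speaker variability (how much one speaker varies across recordings). The F0 values and speaker labels below are invented for illustration and are not DyViS data; variance is only one possible measure, chosen here as an assumption.

```python
from statistics import mean, pvariance

# Hypothetical mean-F0 measurements (Hz): three recordings per speaker.
f0_by_speaker = {
    "spk01": [100.0, 104.0, 102.0],
    "spk02": [120.0, 118.0, 122.0],
    "spk03": [90.0, 94.0, 92.0],
}

def variation_summary(data):
    """Return (between-speaker variance, mean within-speaker variance)."""
    # Between-speaker: spread of the per-speaker averages.
    speaker_means = [mean(values) for values in data.values()]
    between = pvariance(speaker_means)
    # Within-speaker: average spread of each speaker's own recordings.
    within = mean(pvariance(values) for values in data.values())
    return between, within

between, within = variation_summary(f0_by_speaker)
print(f"between-speaker variance: {between:.2f} Hz^2")
print(f"within-speaker variance:  {within:.2f} Hz^2")
```

With these invented numbers the between-speaker variance is much larger than the within-speaker variance, which is the pattern a useful speaker-discriminating feature would need to show.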
ESRC no. RES-000-23-1248
Department of Linguistics, University of Cambridge
Gea de Jong, Toby Hudson, Kirsty McDougall, Francis Nolan
2005-2009
‘Dynamic Variability in Speech: A Forensic Phonetic Study of British English’
(non-contemporaneous)
this way before
proportion of Cambridge University students
studied (SSBE has its own specific social profile)
linguistic research
(2 styles)
Task  Description                               Spontaneous  Read  Studio Quality  Telephone Quality
1     Simulated police interview                x                  x
2     Telephone conversation with ‘accomplice’  x                  x               x
3     Read passage                                           x     x
4     Read sentences                                         x     x
(Künzel 2001, Nolan 2002)
telephone landline
effects
(technical challenge: no mobile signal in Cambridge sound-treated recording studio…)
at different times, e.g. incriminating phone call and police interview
10-14 weeks after initial session
couldn’t be repeated → further creative genius needed for additional spontaneous tasks…
who spoke English with a ‘standard Southern accent’
trained research assistant, native SSBE speaker
asked to leave a message
places lived
assistant
after recording session
University of Cambridge
[Figure: recording set-up. Labels: sound-treated booth (subject), research room (researcher), Mic, Recorder 1, Recorder 2, Telephone 1, Telephone 2, BT external telephone line, intercept via Prospect balance unit (TC22). Image credit: Gea de Jong-Lendle]
‘lying’)
Subject = drug deal suspect
facts in black (OK to admit) OR red (must NOT be revealed)
[Example stimulus slides. Person labels: Scott Weadon; tour guide; friend from secondary school: Buckley School; regularly chat on Skype; Robert Freeman; see regularly in the pub. Map labels: Yewtree Reservoir; Yewtree Footpath; Dexter Road; Badger Pass; "you met Robert Freeman here last Wednesday"]
Method:
The researcher phones the sound-treated booth via an external telephone line.
Recording conditions:
The subject’s speech is recorded directly into the microphone and indirectly via a telephone intercept.
Content:
The researcher requests a short debriefing from the subject about the mock interview.
Example…
“Report: Hoards of Heroin in Parkville last Thursday
Police announced last night that they have arrested one
quantities of heroin at the Parkville petrol station at 10:15 pm last Thursday. The suspect, who cannot be named, works as a hairdresser in Carter Town. He is employed by Mr Eugene Burke at Eugene’s Hairdressers on Reeve Causeway, opposite the city tour bus stop. Reeve Causeway is north of the hypermarket on Pighty Road. …. ”
Sentences designed to elicit target variables in phonetically controlled contexts
That driver was a CREEP yesterday.
We decided to HIDE today.
He had a difficult YOUTH I reckon.
It won’t be King’s Cross; we’ll meet at EUSTON next time.
etc.
Format: randomised x 6
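As an illustration of how such a presentation list might be produced, the sketch below reads "randomised x 6" as six repetitions of the sentence set, each in a fresh random order. That reading, the fixed seed and the function name are assumptions, and only the four sentences quoted above are included.

```python
import random

# Four of the read sentences quoted above; the full set is not reproduced here.
sentences = [
    "That driver was a CREEP yesterday.",
    "We decided to HIDE today.",
    "He had a difficult YOUTH I reckon.",
    "It won't be King's Cross; we'll meet at EUSTON next time.",
]

def randomised_blocks(items, repetitions=6, seed=0):
    """Return `repetitions` independently shuffled copies of `items`."""
    rng = random.Random(seed)  # fixed seed (an assumption) keeps the order reproducible
    blocks = []
    for _ in range(repetitions):
        block = list(items)  # copy so the original order is left untouched
        rng.shuffle(block)
        blocks.append(block)
    return blocks

blocks = randomised_blocks(sentences)
for number, block in enumerate(blocks, start=1):
    print(f"Block {number}:")
    for sentence in block:
        print("  " + sentence)
```

Shuffling within each block, rather than across all repetitions at once, guarantees that every sentence occurs exactly once per block.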
anybody undertaking university research
system for filenames, transcription conventions, etc.
instructions
examples
sociophonetics…
→ formant dynamics & speaker characteristics
(McDougall and Nolan 2007)
→ sound change and speaker identity
(de Jong et al. 2007a, de Jong et al. 2007b)
→ perception of telephone speech
(Lawrence, Nolan and McDougall 2008)
→ SSBE fundamental frequency distribution
(Hudson et al. 2007)
[Figure: vowel formant plot. Axes: Frequency of F2 (Hz), Frequency of F1 (Hz). Vowels plotted: heed, had, hard, hoard, hood, who'd]
(2008; Nolan, McDougall and Hudson)
investigating earwitness identification, especially the effect of the telephone
Postdoctoral Fellowship (2010-2015; McDougall) → ‘A phonetic theory of voice similarity’
Grant (McDougall) → Voice Similarity and accent differences: YorViS → Comparison of SSBE and York English
(not German!), e.g. J.P. French Associates, Martin Barry Forensic Voice Services, Duckworth Consultancy
[Figure: histogram, "Distribution of Speaker F0 Means". x-axis: F0 (Hz), bins labelled 51 to 141 in steps of 5, plus "More"; y-axis ticks 5 to 25]
Hudson et al. (2007)
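The binning behind a histogram of this kind can be sketched as follows. The bin labels (51 to 141 Hz in steps of 5, plus "More") come from the figure's x-axis; treating each label as an inclusive upper bin edge is an assumption about how the chart was built, and the example values are invented, not DyViS data.

```python
from bisect import bisect_left
from collections import Counter

# Upper bin edges read off the x-axis of the histogram: 51, 56, ..., 141 Hz.
BIN_EDGES = list(range(51, 142, 5))

def bin_label(f0_mean):
    """Label of the first bin whose upper edge is >= f0_mean, else 'More'."""
    index = bisect_left(BIN_EDGES, f0_mean)
    return str(BIN_EDGES[index]) if index < len(BIN_EDGES) else "More"

def f0_histogram(speaker_means):
    """Count speakers per bin, keyed by bin label."""
    return Counter(bin_label(value) for value in speaker_means)

# Hypothetical per-speaker mean F0 values in Hz (not DyViS data).
example_means = [98.4, 101.0, 103.7, 106.2, 112.9, 118.5, 145.0]
histogram = f0_histogram(example_means)
for label in [str(edge) for edge in BIN_EDGES] + ["More"]:
    if histogram[label]:
        print(f"{label:>4}: {'#' * histogram[label]}")
```

A population histogram like this is what allows a casework F0 value to be described as typical or unusual for the reference population.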
forensic applications, e.g.
traditional phonetic variables to undertake speaker recognition
recordings of multi-speaker conversations
agencies and private companies
http://www.oxfordwaveresearch.com/
system in conjunction with traditional techniques with respect to casework examples
introduction in casework
essential to both of these tasks – yet such database development beyond scope of a casework-focussed firm
http://www.jpfrench.com/
enhancements for casework, including software for formant plotting
http://www.mbfvs.co.uk/
collaboration with Kirsty McDougall investigating speaker-specific patterns of disfluency in normally fluent speakers using DyViS
University of Cambridge Humanities Research Grant
for Forensic Analysis (McDougall and Duckworth 2017) Duckworth Consultancy
Martin Duckworth
− er [er]
− erm [erm]
− others, e.g. ah [fpo]
− ‘grammatical’ [pg]
− ‘other’ [po]
lateral [prov]
McDougall and Duckworth (2017) Speech Communication
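A minimal sketch of how tags like these could be tallied for one speaker is given below. It assumes, purely for illustration, that the tags are typed inline in square brackets in a plain-text transcript; the snippet is invented, and only the tag labels shown on the slide are used.

```python
import re
from collections import Counter

# Tags as they appear on the slide; category glosses are not asserted here.
TAGS = ("er", "erm", "fpo", "pg", "po", "prov")
TAG_PATTERN = re.compile(r"\[(" + "|".join(TAGS) + r")\]")

def count_tags(annotated_text):
    """Count each bracketed tag in an annotated transcript string."""
    counts = Counter({tag: 0 for tag in TAGS})  # start every tag at zero
    counts.update(TAG_PATTERN.findall(annotated_text))
    return counts

# Hypothetical annotated snippet (not taken from the DyViS recordings).
snippet = "I went [er] down the road [pg] and then [erm] I [er] turned left [po]"
counts = count_tags(snippet)
for tag in TAGS:
    print(f"[{tag}]: {counts[tag]}")
```

Per-speaker counts of this kind would normally be normalised by the amount of speech before speakers are compared; the normalising unit is left open here.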
implemented in casework by Duckworth Consultancy since 2011
J.P. French Associates in casework since 2015 (IAFPA 2018 paper: McDougall, Rhodes, et al.) Duckworth Consultancy
Martin Duckworth
Erica Gold – clicks, f0, AR; LR approaches
Vince Hughes – LR approaches, reference populations
Colleen Kavanagh – consonant features
Nathan Atkinson – earwitness evidence
4 funded research projects:
Analysis of DyViS data – voice quality, long-term formants, ASR
Newcastle, Middlesbrough, Sunderland – DyViS tasks
Bradford, Kirklees and Wakefield – DyViS tasks
PI K. McDougall
York English, 20 male speakers – DyViS tasks
https://sites.google.com/site/yorkfss/research/grants-and-projects/voice-and-identity---source-filter-biometric
https://www.york.ac.uk/language/research/projects/tuuls/
http://wyredproject.co.uk/
(McDougall, Duckworth and Hudson 2015)
(East Lancashire accent)
hope avenue onto reeve causeway then down the high street and then onto pightly road
down hope avenue onto reeve causeway and left
Rhythm in Varieties of English
Used DyViS for SSBE data; replicated methodology to collect Indian English spontaneous speech corpus
e.g. Anne Fabricius
English
Wikström (2013) ‘An acoustic study of the RP English LOT and THOUGHT vowels’ JIPA
https://link.springer.com/book/10.1007/978-3-662-47818-9
should – and some more
transcription, editing, file management….
References
Anderson, A. H., M. Bader, E. G. Bard, E. Boyle, G. Doherty, S. Garrod, S. Isard, J. Kowtko, J. McAllister, J. Miller, C. Sotillo, H. S. Thompson and R. Weinert (1991) ‘The HCRC Map Task Corpus.’ Language and Speech 34.4: 351-366.
de Jong, G., K. McDougall, T. Hudson and F. Nolan (2007) ‘The speaker-discriminating power of sounds undergoing historical change: a formant-based study.’ In J. Trouvain and W. Barry (eds.), Proceedings of the 16th International Congress of Phonetic Sciences, 6-10 August 2007, Saarbrücken, 1813-1816.
Classification II: Selected Papers. Berlin: Springer. 130-141.
Duckworth, M., K. McDougall, G. de Jong and L. Shockey (2011) ‘The consistency of formant measurements in high quality audio data: the effect of agreeing measurement procedures.’ International Journal of Speech, Language and the Law 18.1: 35-51.
Fuchs, R. (2012) Speech Rhythm in Varieties of English: Evidence from Educated Indian English and British English. Berlin: Springer-Verlag.
Hudson, T., G. de Jong, K. McDougall, P. Harrison and F. Nolan (2007) ‘F0 statistics for 100 young male speakers of Standard Southern British English.’ In J. Trouvain and W. Barry (eds.), Proceedings of the 16th International Congress of Phonetic Sciences, 6-10 August 2007, Saarbrücken, 1809-1812.
Jessen, M., O. Köster and S. Gfroerer (2005) ‘Influence of vocal effort on average and variability of fundamental frequency.’ International Journal of Speech, Language and the Law 12.2: 174-213.
Lawrence, S., F. Nolan and K. McDougall (2008) ‘Acoustic and perceptual effects of telephone transmission on vowel quality.’ International Journal of Speech, Language and the Law 15.2: 159-190.
Künzel, H. J. (2001) ‘Beware of the “telephone effect”: the influence of telephone transmission on the measurement of formant frequencies.’ International Journal of Speech, Language and the Law 8.1: 80-99.
Nolan, F. (2002) ‘The 'telephone effect' on formants: a response.’ Forensic Linguistics 9.1: 74-82.
McDougall, K. and M. Duckworth (2017) ‘Profiling fluency: an analysis of individual variation in disfluencies in adult males.’ Speech Communication 95: 16-27.
McDougall, K., M. Duckworth and T. Hudson (2015) ‘Individual and group variation in disfluency features: a cross-accent investigation.’ In The Scottish Consortium for ICPhS 2015 (ed.), Proceedings of the 18th International Congress of Phonetic Sciences, 10-14 August 2015, Glasgow. Paper number 0308.1-5. <http://www.icphs.info/pdfs/Papers/ICPHS0308.pdf>
McDougall, K. and F. Nolan (2007) ‘Discrimination of speakers using the formant dynamics of /uː/ in British English.’ In J. Trouvain and W. Barry (eds.), Proceedings of the 16th International Congress of Phonetic Sciences, 6-10 August 2007, Saarbrücken, 1825-1828.
McDougall, K., R. Rhodes, M. Duckworth, J.P. French, C. Kirchhübel and J. Wormald (2018) ‘Applying disfluency analysis in forensic speaker comparison casework.’ Paper presented at the International Association for Forensic Phonetics and Acoustics Annual Conference, Huddersfield, 29 July – 1 August 2018.
Nolan, F., K. McDougall, G. de Jong and T. Hudson (2009) ‘The DyViS database: style-controlled recordings of 100 homogeneous speakers for forensic phonetic research.’ International Journal of Speech, Language and the Law 16.1: 31-57.
Osanai, T., M. Tanimosto, H. Kido and T. Suzuki (1995) ‘Text-dependent speaker verification using isolated word utterances based
Wikström, J. (2013) ‘An acoustic study of the RP English LOT and THOUGHT vowels.’ Journal of the International Phonetic Association 43.1: 37-47.