Introduction: VoxCeleb, VoxConverse & VoxSRC
Arsha Nagrani, Joon Son Chung & Andrew Zisserman
Workshop Programme
19:00  Introduction: “VoxCeleb, VoxConverse & VoxSRC” – Arsha Nagrani, Joon Son Chung & Andrew Zisserman
19:25  Keynote: “X-vectors: Neural Speech Embeddings for Speaker Recognition” – Daniel Garcia-Romero
20:00  Speaker verification: leaderboards and winners for Tracks 1-3
20:05  Participant talks from Tracks 1, 2 and 3, live Q&A
20:50  Coffee break
21:10  Keynote: “Tackling Multispeaker Conversation Processing based on Speaker Diarization and Multispeaker Speech Recognition” – Shinji Watanabe
21:40  Diarization: leaderboards and winners for Track 4
21:42  Participant talks from Track 4, live Q&A
22:00  Wrap-up discussions and closing
Organisers
Andrew Zisserman, Joon Son Chung, Jaesung Huh, Andrew Brown, Ernesto Coto, Weidi Xie, Mitch McLaren, Doug Reynolds, Arsha Nagrani
Introduction
1. Data: VoxCeleb and VoxConverse
2. Challenge mechanics: new tracks, rules and metrics
VoxCeleb datasets
- Multi-speaker environments
- Varying audio quality and background channel noise
- Freely available
[Example settings: studio interviews, outdoor and pitch interviews, red carpet interviews]
VoxCeleb – automatic pipeline
Transferring labels from vision to speech:
Input video → face + landmark detection → active speaker detection → face verification against known identities (e.g. Felicity Jones) → matched VoxCeleb segments
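The stages above can be read as one sequential filter over a video. A minimal sketch, where every helper name is a hypothetical placeholder standing in for the corresponding stage, not a published API:

```python
# High-level sketch of the automatic labelling pipeline described above.
# All helper functions are hypothetical placeholders for the named stages.

def label_speech_segments(video, celebrity_gallery):
    """Transfer identity labels from vision to speech for one video."""
    faces = detect_faces_and_landmarks(video)             # 1. face + landmark detection
    tracks = build_face_tracks(faces)                     # group detections over time
    segments = []
    for track in tracks:
        if not is_active_speaker(track, video.audio):     # 2. active speaker detection
            continue
        identity = verify_face(track, celebrity_gallery)  # 3. face verification
        if identity is not None:                          # keep only confident matches
            segments.append((track.time_span, identity))
    return segments  # (time span, speaker identity) pairs for the audio
```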
VoxCeleb Statistics
              Train      Validation
# Speakers    5,994      1,251
# Utterances  1,092,009  153,516
- VoxCeleb2 dev set -> primary data for speaker verification
- Validation toolkit for scoring (a minimal scoring sketch follows below)
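To make the verification task concrete, here is a minimal sketch of trial-list scoring: cosine-score each pair of utterance embeddings. The trial layout mirrors the public VoxCeleb test lists; the speaker encoder producing the embeddings is assumed, not provided.

```python
import numpy as np

def cosine_score(emb1, emb2):
    """Cosine similarity between two speaker embeddings."""
    return float(np.dot(emb1, emb2) /
                 (np.linalg.norm(emb1) * np.linalg.norm(emb2)))

def score_trials(trials, embeddings):
    """trials: iterable of (label, utt1, utt2), where label is 1 for a
    same-speaker pair and 0 otherwise. embeddings maps utterance id to a
    NumPy vector from any speaker encoder (assumed, not provided here)."""
    labels, scores = [], []
    for label, utt1, utt2 in trials:
        labels.append(label)
        scores.append(cosine_score(embeddings[utt1], embeddings[utt2]))
    return np.array(labels), np.array(scores)
```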
A more challenging test set - VoxMovies
- Hard samples of VoxCeleb identities speaking in movies
- Playing characters, showing strong emotions, background noise
[Figure: paired VoxCeleb vs. VoxMovies clips illustrating accent change, background music and emotion]
A more challenging test set - VoxMovies
- Audio dataset, but we use visual methods to collect it (the VoxCeleb automatic pipeline)
[Figure: Steve Martin in VoxCeleb vs. VoxMovies]
Audio speaker diarization
- Solving “who spoke when” in multi-speaker video.
http://www.robots.ox.ac.uk/~vgg/data/voxconverse/
Diarization - The VoxConverse dataset
- 526 videos from YouTube
- Mostly debates, talk shows, news segments
VoxConverse – automatic audio-visual diarization method
Input video → face detection & face track clustering → active speaker detection → audio-visual source separation → speaker verification
Chung, Joon Son, et al. "Spot the conversation: speaker diarisation in the wild." INTERSPEECH (2020).
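A high-level sketch of how those stages fit together, under the recipe described in the citation above. Every helper name is a hypothetical placeholder for the corresponding component, not a real API:

```python
# Sketch of the audio-visual diarization pipeline (after Chung et al., 2020).
# All helper functions are hypothetical placeholders for the named stages.

def diarize(video):
    """Assign a speaker label to every speech region: 'who spoke when'."""
    tracks = cluster_face_tracks(detect_faces(video))      # face detection + track clustering
    segments = []
    for track in tracks:
        for region in active_speaker_regions(track, video.audio):  # when this face talks
            # Isolate this speaker's voice where speech overlaps.
            audio = separate_source(region, track, video)          # A/V source separation
            segments.append((region, speaker_embedding(audio)))    # for speaker verification
    # Link segments of the same voice across the video (e.g. off-screen speech).
    return assign_speaker_labels(segments)
```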
The VoxCeleb Speaker Recognition Challenge
VoxSRC-2020 tracks – TWO NEW tracks this year!
- Track 1: Supervised speaker verification (closed)
- Track 2: Supervised speaker verification (open)
- Track 3: Self-supervised speaker verification (closed)
- Track 4: Speaker diarization (open)
New Tracks
Track 3: Self-Supervised
- No speaker labels allowed
- Can use future frames, visual frames, or any other objective from the video itself (one label-free objective is sketched after this list)
Track 4: Speaker Diarization
- Solving “who spoke when” in multi-speaker video
- Speaker overlap, challenging background conditions
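For Track 3, one common label-free objective is contrastive learning: two segments cropped from the same clip form a positive pair, and every other pairing in the batch acts as a negative. This is an illustrative assumption, not the challenge's prescribed method; a minimal PyTorch sketch:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(emb_a, emb_b, temperature=0.07):
    """NT-Xent-style loss. emb_a[i] and emb_b[i] embed two segments drawn
    from the same clip (a positive pair); all other rows in the batch act
    as negatives. No speaker labels are required anywhere."""
    emb_a = F.normalize(emb_a, dim=1)
    emb_b = F.normalize(emb_b, dim=1)
    logits = emb_a @ emb_b.t() / temperature   # (B, B) cosine similarities
    targets = torch.arange(emb_a.size(0))      # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

# Hypothetical usage, with `encoder` mapping waveform crops to embeddings:
#   loss = contrastive_loss(encoder(crop_a), encoder(crop_b))
```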
Mechanics
- Metrics (Tracks 1-3): DCF and EER, following NIST SRE 2018 (an EER sketch follows below)
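A minimal sketch of how EER falls out of the trial scores: sweep the decision threshold and find where the two error rates cross. minDCF comes from the same sweep, weighted with the NIST SRE 2018 costs; only EER is shown here to stay short.

```python
import numpy as np

def compute_eer(labels, scores):
    """Equal error rate: the operating point where the false-acceptance
    and false-rejection rates cross. labels: 1 = same-speaker trial,
    0 = different-speaker trial; scores: higher = more likely same."""
    order = np.argsort(scores)[::-1]          # accept trials from highest score down
    labels = np.asarray(labels)[order]
    n_tar = labels.sum()
    n_non = len(labels) - n_tar
    far = np.cumsum(1 - labels) / n_non       # false-acceptance rate at each cut
    frr = 1 - np.cumsum(labels) / n_tar       # false-rejection rate at each cut
    i = np.argmin(np.abs(far - frr))          # closest crossing point
    return (far[i] + frr[i]) / 2
```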
- Metrics (Track 4): DER and JER; overlapping speech is scored, with a collar of 0.25 s (see the sketch below)
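A sketch of DER scoring using pyannote.metrics, one common open-source scorer (an assumption for illustration, not necessarily the official challenge tooling). Note that pyannote's collar argument is the total width removed around each reference boundary, so 0.5 corresponds to 0.25 s on each side.

```python
from pyannote.core import Annotation, Segment
from pyannote.metrics.diarization import DiarizationErrorRate

# Toy reference and system output; spk_B overlaps spk_A from 8 s to 10 s.
reference = Annotation()
reference[Segment(0.0, 10.0)] = "spk_A"
reference[Segment(8.0, 15.0)] = "spk_B"

hypothesis = Annotation()
hypothesis[Segment(0.0, 9.5)] = "spk_1"
hypothesis[Segment(9.5, 15.0)] = "spk_2"

# skip_overlap=False: overlapping speech is scored, as in the challenge.
metric = DiarizationErrorRate(collar=0.5, skip_overlap=False)
print(f"DER = {metric(reference, hypothesis):.3f}")
```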
- Only 1 submission per day, 5 in total
- Submissions via CodaLab
- New, more difficult test sets
- Manual verification of all speech segments
- In addition, annotators pay particular