Speaker Change Detection using Siamese Networks


  1. Speaker Change Detection using Siamese Networks
     • Siamese layers share their weights
     • Classifier is trained using binary cross-entropy
     • Input features are PLPs
     Diagram: Acoustic Data (Left Segment) → Siamese BLSTM → Left embedding; Acoustic Data (Right Segment) → Siamese BLSTM → Right embedding; both embeddings → Classifier → Same/Different
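The weight sharing on slide 1 can be illustrated with a minimal sketch. The slide does not give layer sizes, how the two embeddings are combined, or the classifier form, so everything below is an assumption: `encode` is a toy stand-in (mean pooling plus a tanh layer) for the shared BLSTM, the classifier is a logistic unit on the concatenated embeddings, and the dimensions are hypothetical.

```python
import math
import random

def encode(frames, weights):
    """Toy stand-in for the shared BLSTM (assumption): mean-pool frames, then tanh layer."""
    dim = len(frames[0])
    mean = [sum(f[k] for f in frames) / len(frames) for k in range(dim)]
    return [math.tanh(sum(w * m for w, m in zip(row, mean))) for row in weights]

def classify(left_emb, right_emb, clf_weights, clf_bias):
    """Logistic unit on the concatenated embeddings (assumption): P(different speaker)."""
    z = clf_bias + sum(w * v for w, v in zip(clf_weights, left_emb + right_emb))
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(p, label, eps=1e-12):
    """Binary cross-entropy for one pair; label is 1 (different) or 0 (same)."""
    p = min(max(p, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

rng = random.Random(0)
feat_dim, emb_dim, n_frames = 13, 4, 20  # hypothetical sizes, not from the slides

# One weight matrix, used by both branches: this is what "Siamese" means here.
shared = [[rng.uniform(-0.5, 0.5) for _ in range(feat_dim)] for _ in range(emb_dim)]
clf_w = [rng.uniform(-0.5, 0.5) for _ in range(2 * emb_dim)]

# Random vectors standing in for PLP frames of the left and right segments.
left = [[rng.gauss(0, 1) for _ in range(feat_dim)] for _ in range(n_frames)]
right = [[rng.gauss(0, 1) for _ in range(feat_dim)] for _ in range(n_frames)]

left_emb = encode(left, shared)
right_emb = encode(right, shared)
p_change = classify(left_emb, right_emb, clf_w, 0.0)
loss = binary_cross_entropy(p_change, label=1)
print(f"P(different)={p_change:.3f}  BCE={loss:.3f}")
```

Because both branches call `encode` with the same `shared` matrix, identical segments always map to identical embeddings, which is the property the Siamese layers rely on.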

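  2. Pre-training of the Siamese Layers
     • Gender classification
       Diagram: BLSTM → Male/Female
     • Contrastive Divergence
       Diagram: left $x_l$ → BLSTM, right $x_r$ → BLSTM
       $\min \sum_{i=1}^{N} \Big[ \mathbb{1}[y_l = y_r]\, d\big(x_l^{(i)}, x_r^{(i)}\big) + \mathbb{1}[y_l \neq y_r]\, \max\big(0,\ \Delta - d\big(x_l^{(i)}, x_r^{(i)}\big)\big) \Big]$
     • Triplet Loss
       Diagram: anchor $x_a$ → BLSTM, positive $x_p$ → BLSTM, negative $x_n$ → BLSTM
       $\min \sum_{i=1}^{N} \max\big(0,\ \Delta + d\big(x_a^{(i)}, x_p^{(i)}\big) - d\big(x_a^{(i)}, x_n^{(i)}\big)\big)$
     Here $d$ is the distance between the BLSTM outputs of two segments, $\Delta$ is the margin, $N$ is the number of training examples, and the indicator $\mathbb{1}[\cdot]$ selects pairs whose labels match or differ.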
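The two distance-based objectives on slide 2 can be written out for a single training example as a minimal sketch. The loss the slide calls Contrastive Divergence has the form of the standard pairwise contrastive loss, which is what `contrastive_loss` implements. The function names, the margin values, and the definition of cosine distance as one minus cosine similarity are assumptions; the slides only name the distances "Cosine" and "Euclidean".

```python
import math

def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_distance(u, v):
    """1 - cosine similarity (assumed definition; inputs must be non-zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def contrastive_loss(left, right, same, margin, dist=euclidean):
    """Pull matching pairs together; push differing pairs at least `margin` apart."""
    d = dist(left, right)
    return d if same else max(0.0, margin - d)

def triplet_loss(anchor, positive, negative, margin, dist=euclidean):
    """Require the negative to be at least `margin` farther from the anchor than the positive."""
    return max(0.0, margin + dist(anchor, positive) - dist(anchor, negative))

# Hand-picked embeddings: d(anchor, positive) = 5, d(anchor, negative) = 10.
anchor, positive, negative = [0.0, 0.0], [3.0, 4.0], [6.0, 8.0]

print(contrastive_loss(anchor, positive, same=True, margin=6.0))   # 5.0
print(contrastive_loss(anchor, positive, same=False, margin=6.0))  # 1.0
print(triplet_loss(anchor, positive, negative, margin=1.0))        # 0.0
print(triplet_loss(anchor, positive, negative, margin=7.0))        # 2.0
```

Summing either function over all training examples gives the objective that is minimized during pre-training. With a margin of 1.0 the triplet is already satisfied (loss 0), so it would contribute no gradient.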

  3. Validation Data Classification Accuracy (%)

     | Pretraining            | Distance  | Freeze Siamese layers | Accuracy (%) |
     |------------------------|-----------|-----------------------|--------------|
     | Gender classification  | -         | Yes                   | 76.9         |
     | Gender classification  | -         | No                    | 78.1         |
     | Contrastive divergence | Cosine    | Yes                   | 76.7         |
     | Contrastive divergence | Cosine    | No                    | 87.3         |
     | Contrastive divergence | Euclidean | Yes                   | 77.4         |
     | Contrastive divergence | Euclidean | No                    | 87.5         |
     | Triplet loss           | Cosine    | Yes                   | 84.6         |
     | Triplet loss           | Cosine    | No                    | 87.9         |
     | Triplet loss           | Euclidean | Yes                   | 82.7         |
     | Triplet loss           | Euclidean | No                    | 89.0         |
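To make the effect of freezing easier to read off slide 3, the short sketch below computes, for each pre-training setup, how much validation accuracy changes when the Siamese layers are not frozen. The accuracies are copied from the slide; the dictionary layout and the helper name `unfreeze_gain` are illustrative choices, not part of the presentation.

```python
# (pretraining, distance, frozen) -> validation accuracy (%), copied from slide 3
accuracy = {
    ("Gender classification", "-", True): 76.9,
    ("Gender classification", "-", False): 78.1,
    ("Contrastive divergence", "Cosine", True): 76.7,
    ("Contrastive divergence", "Cosine", False): 87.3,
    ("Contrastive divergence", "Euclidean", True): 77.4,
    ("Contrastive divergence", "Euclidean", False): 87.5,
    ("Triplet loss", "Cosine", True): 84.6,
    ("Triplet loss", "Cosine", False): 87.9,
    ("Triplet loss", "Euclidean", True): 82.7,
    ("Triplet loss", "Euclidean", False): 89.0,
}

def unfreeze_gain(pretraining, distance):
    """Accuracy difference (percentage points) between unfrozen and frozen Siamese layers."""
    unfrozen = accuracy[(pretraining, distance, False)]
    frozen = accuracy[(pretraining, distance, True)]
    return round(unfrozen - frozen, 1)

for pretraining, distance in sorted({(p, d) for p, d, _ in accuracy}):
    gain = unfreeze_gain(pretraining, distance)
    print(f"{pretraining:24s} {distance:10s} {gain:+.1f}")

# Configuration with the highest validation accuracy in the table.
best = max(accuracy, key=accuracy.get)
print("best:", best, accuracy[best])
```

In every row pair, the unfrozen configuration scores higher than the frozen one, and the largest differences appear for the contrastive setups.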
