SeqMix: Augmenting Active Sequence Labeling via Sequence Mixup
Rongzhi Zhang, Yue Yu, Chao Zhang Georgia Institute of Technology
EMNLP 2020
Introduction
Sequence labeling is core to many NLP tasks, for example part-of-speech (POS) tagging.
[Figure: overview of the SeqMix framework. Components: labeled set, unlabeled set, active learning model (fitted model), active query policy, data annotation, pairing function, discriminator. Data shown: top K samples, newly labeled data, paired samples, generated sequences, eligible generations, augmentation data, labeled data. Arrow labels: sample, add sample, run, train.]
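The components named in the framework figure can be read as one round of an augmented active learning loop. The sketch below is only an illustration under stated assumptions: every function and parameter name (`augmented_active_round`, `query_policy`, `annotate`, `pair`, `mix`, `score`, `score_range`) is hypothetical, pairing is assumed to operate on the newly labeled data, and the score range is treated as an open interval. Model training on the returned labeled set is left to the caller.

```python
from typing import Callable, List, Tuple

Example = Tuple[List[str], List[str]]  # (tokens, labels)


def augmented_active_round(
    labeled: List[Example],
    unlabeled: List[List[str]],
    query_policy: Callable[[List[List[str]], int], List[List[str]]],
    annotate: Callable[[List[str]], Example],
    pair: Callable[[List[Example]], List[Tuple[Example, Example]]],
    mix: Callable[[Example, Example], Example],
    score: Callable[[Example], float],
    score_range: Tuple[float, float],
    k: int,
) -> Tuple[List[Example], List[List[str]]]:
    """One hypothetical round: query, annotate, pair, mix, filter, add."""
    # The active query policy picks the top K samples from the unlabeled set.
    queried = query_policy(unlabeled, k)
    remaining = [u for u in unlabeled if u not in queried]

    # Data annotation turns the queried samples into newly labeled data.
    newly_labeled = [annotate(u) for u in queried]

    # The pairing function selects pairs; mixing yields generated sequences.
    generated = [mix(a, b) for a, b in pair(newly_labeled)]

    # The discriminator keeps only generations whose score is in range
    # (open interval: an assumption of this sketch).
    low, high = score_range
    eligible = [g for g in generated if low < score(g) < high]

    # Newly labeled data and augmentation data join the labeled set.
    return labeled + newly_labeled + eligible, remaining
```

Passing the components as callables keeps the sketch independent of any particular query policy, mixing scheme, or discriminator.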
[Figure legend: labeled sequence, unlabeled sequence, mixed sequence]
Label density threshold η0 = 3/5: sequences with the same length and valid label density η ≥ η0 get paired.
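The pairing rule for whole sequences can be illustrated with a short sketch. It assumes that the valid label density of a sequence is the fraction of its tokens carrying a label other than an "outside" tag (`"O"` here), which is an assumption of this example, as are the names `label_density` and `pair_whole_sequences`. Exact fractions are used so that a density equal to the threshold passes reliably.

```python
from fractions import Fraction
from itertools import combinations
from typing import List, Tuple


def label_density(labels: List[str], outside: str = "O") -> Fraction:
    """Fraction of tokens whose label is not the outside tag (assumed definition)."""
    if not labels:
        return Fraction(0)
    return Fraction(sum(label != outside for label in labels), len(labels))


def pair_whole_sequences(
    label_seqs: List[List[str]],
    eta0: Fraction = Fraction(3, 5),
) -> List[Tuple[int, int]]:
    """Index pairs of sequences with equal length and density >= eta0."""
    dense = [i for i, labels in enumerate(label_seqs)
             if label_density(labels) >= eta0]
    return [(i, j) for i, j in combinations(dense, 2)
            if len(label_seqs[i]) == len(label_seqs[j])]
```

For instance, two five-token sequences with three entity labels each meet the 3/5 threshold and are paired, while a five-token sequence with a single entity label is skipped.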
Label density threshold η0 = 2/3: sub-sequences with the same length and valid label density η ≥ η0 get paired.
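For sub-sequences, one way to find candidates is to slide a fixed-length window over each label sequence and keep the windows that meet the density threshold. The sketch below makes that concrete; the window length `s`, the outside tag `"O"`, and the function name `valid_windows` are assumptions of this example, not details given on the slide.

```python
from fractions import Fraction
from typing import List


def valid_windows(
    labels: List[str],
    s: int,
    eta0: Fraction = Fraction(2, 3),
    outside: str = "O",
) -> List[int]:
    """Start offsets of length-s windows whose valid label density is >= eta0."""
    if s < 1:
        raise ValueError("window length s must be at least 1")
    starts = []
    for start in range(len(labels) - s + 1):
        window = labels[start:start + s]
        n_valid = sum(label != outside for label in window)
        # Density of the window: valid labels over window length.
        if Fraction(n_valid, s) >= eta0:
            starts.append(start)
    return starts
```

Because every window has the same length `s`, any two windows returned for different sequences already satisfy the same-length condition and can be paired directly.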
Label density threshold η0 = 2/3: sub-sequences with the same length, consistent labels, and valid label density η ≥ η0 get paired.
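The additional label constraint can be expressed as a single predicate over two candidate sub-sequences. This sketch interprets "consistent labels" as the two label sequences being identical position by position, which is an assumption of this example; the name `can_pair_label_constrained` and the outside tag `"O"` are likewise hypothetical.

```python
from fractions import Fraction
from typing import List


def can_pair_label_constrained(
    labels_a: List[str],
    labels_b: List[str],
    eta0: Fraction = Fraction(2, 3),
    outside: str = "O",
) -> bool:
    """True if both sub-sequences have equal length, identical labels,
    and a valid label density of at least eta0."""
    if not labels_a or len(labels_a) != len(labels_b):
        return False
    # Assumed meaning of "consistent": labels match at every position.
    if labels_a != labels_b:
        return False
    # Identical labels imply identical density, so one check suffices.
    n_valid = sum(label != outside for label in labels_a)
    return Fraction(n_valid, len(labels_a)) >= eta0
```

Under this reading, mixing two such sub-sequences leaves the labels unchanged and only the tokens differ.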
The improvements to different active learning approaches provided by SeqMix.
The performance of SeqMix with varying discriminator score ranges.
representation of language models.
plausibility of the generated data.