UTD HLTRI at TAC 2019: DDI Track
Ramon Maldonado, Maxwell Weinzierl, & Sanda M. Harabagiu
The University of Texas at Dallas
Human Language Technology Research Institute
http://www.hlt.utdallas.edu/~{ramon, max, sanda}
Outline
- 1. Introduction
- 2. The Approach
  – 1. Pipeline Overview
  – 2. Preprocessing
  – 3. Multi-Task Transformer
  – 4. Postprocessing
- 3. Results
- 4. Conclusion
Introduction
Multi-task neural model for:
- Task 1: entity identification
- Task 2: relation identification
- Task 3*: concept normalization
- Task 4: normalized relation identification
Introduction
Problem
- Sentence-level
- Binary Relation identification
Our Approach
- Multi-task learning
  – Sentence classification
  – Mention boundary detection
  – Relation extraction
  – PK effect classification
- Pre-trained Transformer for shared representation
The Approach
[Figure: FDA Label Drug-Drug Interaction Pipeline]
- Input: Structured Product Labels (SPLs)
- Preprocessing
  – Annotation propagation: mentions, relations, pseudo-triggers
  – Tokenization: spaCy, WordPiece
- Multi-task Transformer Net for Identifying Drug-Drug Interactions
  – BERT shared representation
  – Sentence Classifier, Mention Boundary, Relation Extractor, PKE Classifier
  – Outputs: mentions, relations, PK effects
- Postprocessing
  – Mention filtering
  – Continuation linking
  – Unused mention/relation filtering
- Normalization: UMLS, MED-RT, SNOMED-CT
- Task outputs
  – Task 1: Mentions
  – Task 2: Relations
  – Task 3: Normalized Mentions
  – Task 4: Label Interactions
Preprocessing
- Binary Relations
  – (Trigger, Precipitant, Effect) -> (Trigger, Precipitant) and (Trigger, Effect)
  – Pseudo-triggers for specific interactions (SIs) in some pharmacodynamic interactions (PDIs)
  – PK effects as attributes
- Mention annotation propagation
  – Ease the learning problem
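The decomposition of ternary interactions into binary relations can be sketched as follows. This is a minimal illustration, not the authors' code: the `Interaction` and `BinaryRelation` types, the role names, and the example mentions are all hypothetical, and pseudo-trigger insertion is not modeled.

```python
from typing import List, NamedTuple, Optional


class Interaction(NamedTuple):
    trigger: str
    precipitant: str
    effect: Optional[str]  # None when no effect mention is annotated


class BinaryRelation(NamedTuple):
    trigger: str
    argument: str
    role: str  # "precipitant" or "effect" (hypothetical role names)


def to_binary(interaction: Interaction) -> List[BinaryRelation]:
    """Split one ternary interaction into relations that share its trigger."""
    relations = [
        BinaryRelation(interaction.trigger, interaction.precipitant, "precipitant")
    ]
    if interaction.effect is not None:
        relations.append(
            BinaryRelation(interaction.trigger, interaction.effect, "effect")
        )
    return relations


# Hypothetical example mentions, for illustration only.
example = Interaction("increased risk", "ketoconazole", "bleeding")
for relation in to_binary(example):
    print(relation)
```

Because both binary relations keep the same trigger, the original triple can later be recovered by grouping on that trigger.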
Preprocessing
- Tokenization
  – spaCy
  – WordPiece using BERT vocab
- C-IOBES tagging
  – Continuation necessary for disjoint spans
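One way to picture C-IOBES tagging for a disjoint mention is sketched below. The slides do not spell out the tag inventory, so this assumes that the first fragment of a mention receives ordinary IOBES tags and every later fragment receives a `C-` (continuation) tag; the function names, the label `SI`, and the example sentence are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # token offsets, end exclusive


def iobes(start: int, end: int, label: str) -> List[str]:
    """IOBES tags for one contiguous fragment."""
    length = end - start
    if length == 1:
        return [f"S-{label}"]
    return [f"B-{label}"] + [f"I-{label}"] * (length - 2) + [f"E-{label}"]


def tag_mention(tags: List[str], fragments: List[Span], label: str) -> None:
    """Write tags for a (possibly disjoint) mention into `tags` in place.

    Assumption: the first fragment gets IOBES tags, later fragments get C- tags.
    """
    first, *rest = sorted(fragments)
    tags[first[0]:first[1]] = iobes(first[0], first[1], label)
    for start, end in rest:
        tags[start:end] = [f"C-{label}"] * (end - start)


tokens = ["increase", "in", "blood", "pressure", "and", "heart", "rate"]
tags = ["O"] * len(tokens)
# Hypothetical disjoint mention: "increase in ... heart rate"
tag_mention(tags, [(0, 2), (5, 7)], "SI")
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Plain IOBES cannot mark the second fragment as belonging to an earlier mention, which is why a continuation tag is needed for disjoint spans.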
Multi-Task Transformer
[Figure: Multi-Task Transformer network for Identifying Drug-Drug Interactions (MTTDDI)]
- BERT Sentence Encoder: input [CLS] t1 t2 t3 ... tn [SEP]; outputs sentence embedding s and contextual token embeddings c1 c2 c3 ... cn
- Sentence Classifier: softmax layer over s; identifies sentences containing interactions
- Mention Boundary Labeler: CRF over c1 c2 ... cn producing b1 b2 ... bn, the mention type and boundary labels for all words
- Relation Extractor: trigger embedding, argument embedding, and context embedding fed to a softmax layer; produces relation labels r for all mention pairs
- PKE Classifier: softmax layer applied if r is a PKI; produces PKI effect codes
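The control flow between the four components can be sketched as below. This is only an illustration of how the outputs might be chained at prediction time, assuming the sentence classifier acts as a filter and that relations are scored for every ordered mention pair; the function `run_heads`, its callable parameters, and the placeholder effect code are hypothetical and stand in for the trained softmax and CRF layers.

```python
from typing import Callable, List, Optional, Tuple

Prediction = Tuple[str, str, str, Optional[str]]  # trigger, argument, relation, PK code


def run_heads(
    sentence: str,
    has_interaction: Callable[[str], bool],
    label_mentions: Callable[[str], List[str]],
    label_relation: Callable[[str, str, str], Optional[str]],
    pk_effect_code: Callable[[str, str, str], str],
) -> List[Prediction]:
    """Chain the four task outputs for one sentence."""
    if not has_interaction(sentence):  # sentence classifier
        return []
    mentions = label_mentions(sentence)  # mention boundary labeler
    predictions = []
    for trigger in mentions:
        for argument in mentions:
            if trigger == argument:
                continue
            relation = label_relation(sentence, trigger, argument)
            if relation is None:
                continue
            # The PKE classifier is consulted only when the relation is a PKI.
            code = (
                pk_effect_code(sentence, trigger, argument)
                if relation == "PKI"
                else None
            )
            predictions.append((trigger, argument, relation, code))
    return predictions


# Stub components standing in for the trained layers.
result = run_heads(
    "toy sentence",
    has_interaction=lambda s: True,
    label_mentions=lambda s: ["trigger_a", "drug_b"],
    label_relation=lambda s, t, a: "PKI" if (t, a) == ("trigger_a", "drug_b") else None,
    pk_effect_code=lambda s, t, a: "CODE-1",  # placeholder, not a real effect code
)
print(result)
```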
BERT Sentence Encoder
Sentence Classifier
Mention Boundary Labeler
Relation Extractor
Pharmacokinetic Effect Classifier
Postprocessing
- Filtering
  – Invalid boundary tag sequences
  – Repeated mentions
  – Mentions not involved in an interaction
- C-spans linked to closest mention
- Reconstruct ternary interactions from binary through shared trigger
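Two of these postprocessing steps can be sketched in a few lines. The sketch assumes that "closest" means the smallest token gap between spans and that every precipitant sharing a trigger is paired with every effect sharing it; both are guesses about details the slides leave open, and all names and example values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Span = Tuple[int, int]  # token offsets, end exclusive


def gap(a: Span, b: Span) -> int:
    """Number of tokens between two spans (0 if they touch or overlap)."""
    return max(b[0] - a[1], a[0] - b[1], 0)


def link_continuations(mentions: List[Span], c_spans: List[Span]) -> Dict[Span, List[Span]]:
    """Attach each continuation span to the closest mention."""
    linked: Dict[Span, List[Span]] = {mention: [] for mention in mentions}
    if not mentions:
        return linked
    for c_span in c_spans:
        closest = min(mentions, key=lambda mention: gap(mention, c_span))
        linked[closest].append(c_span)
    return linked


def rebuild_interactions(
    binary: List[Tuple[str, str, str]]
) -> List[Tuple[str, str, Optional[str]]]:
    """Group (trigger, argument, role) relations by trigger into ternary interactions."""
    by_trigger = defaultdict(lambda: {"precipitant": [], "effect": []})
    for trigger, argument, role in binary:
        by_trigger[trigger][role].append(argument)
    interactions = []
    for trigger, arguments in by_trigger.items():
        effects = arguments["effect"] or [None]
        for precipitant in arguments["precipitant"]:
            for effect in effects:
                interactions.append((trigger, precipitant, effect))
    return interactions


print(link_continuations([(0, 2), (8, 9)], [(5, 7)]))
print(rebuild_interactions([
    ("T1", "ketoconazole", "precipitant"),
    ("T1", "bleeding", "effect"),
    ("T2", "warfarin", "precipitant"),
]))
```

Under these assumptions a trigger with no precipitant yields no interaction, which matches the idea of filtering mentions that are not involved in one.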
Postprocessing
- Normalization
  – String matching
  – SNOMED-CT: specific interactions
  – MED-RT: drug classes
  – UNII: precipitants
  – Augmented with atoms from UMLS
- Map precipitants first to MED-RT, then to UNII if no match was found
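The MED-RT-then-UNII fallback for precipitants can be sketched as a two-stage dictionary lookup. This is a minimal illustration assuming exact, case-insensitive string matching; the function name, the toy lexicons, and the identifiers in them are placeholders, not real MED-RT or UNII codes.

```python
from typing import Dict, Optional, Tuple


def normalize_precipitant(
    text: str, med_rt: Dict[str, str], unii: Dict[str, str]
) -> Optional[Tuple[str, str]]:
    """Return (vocabulary, identifier) for a precipitant mention, or None."""
    key = text.strip().lower()
    if key in med_rt:  # drug classes are tried first
        return ("MED-RT", med_rt[key])
    if key in unii:  # fall back to substances
        return ("UNII", unii[key])
    return None


# Toy lexicons with placeholder identifiers (not real codes).
med_rt_lexicon = {"cyp3a4 inhibitors": "MEDRT-PLACEHOLDER-1"}
unii_lexicon = {"ketoconazole": "UNII-PLACEHOLDER-1"}

for mention in ["CYP3A4 inhibitors", "Ketoconazole", "grapefruit juice"]:
    print(mention, "->", normalize_precipitant(mention, med_rt_lexicon, unii_lexicon))
```

In this picture, augmenting the lexicons with UMLS atoms would simply add more surface strings as keys pointing at the same identifiers.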
Postprocessing Task 4
- Inferred from unique interactions between normalized mentions
- PK effect codes from MTTDDI
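Inferring label-level interactions amounts to deduplicating sentence-level interactions once their mentions are normalized. The sketch below assumes each interaction is a tuple of (precipitant identifier, interaction type, effect identifier or PK effect code) and that interactions with an unnormalized precipitant are skipped; the tuple layout, function name, and identifiers are hypothetical.

```python
from typing import Iterable, List, Optional, Tuple

# (precipitant id, interaction type, effect id or PK effect code)
NormalizedInteraction = Tuple[Optional[str], str, Optional[str]]


def label_interactions(
    found: Iterable[NormalizedInteraction],
) -> List[NormalizedInteraction]:
    """Keep each distinct normalized interaction once, in first-seen order."""
    seen = set()
    unique = []
    for precipitant_id, interaction_type, effect_id in found:
        if precipitant_id is None:  # precipitant could not be normalized
            continue
        key = (precipitant_id, interaction_type, effect_id)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Placeholder identifiers, for illustration only.
sentence_level = [
    ("DRUG-1", "PKI", "CODE-1"),
    ("DRUG-1", "PKI", "CODE-1"),  # same interaction found in another sentence
    (None, "PDI", "EFFECT-2"),
    ("DRUG-2", "PDI", "EFFECT-2"),
]
print(label_interactions(sentence_level))
```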
Results
Evaluated MTTDDI against two alternate configurations:
- UTDHLTRI Run3: No sentence filtering/targeted training
- Run3 + Filtering: Dedicated Learners
System              Task1   Task2   Task3   Task4
Best Submission     65.38   49.03   62.39   17.56
Median              48.97   37.13   45.53   17.56
UTDHLTRI Run3       35.04   27.48   28.66   17.56
Run3 + Filtering    56.03   42.29   45.73   24.07
MTTDDI              54.39   41.34   44.08   25.20
* Bold indicates best score. Italics indicates best score among LDIIP systems.