UTD HLTRI at TAC 2019: DDI Track


SLIDE 1

UTD HLTRI at TAC 2019: DDI Track

Ramon Maldonado, Maxwell Weinzierl, & Sanda M. Harabagiu

The University of Texas at Dallas Human Language Technology Research Institute http://www.hlt.utdallas.edu/~{ramon, max, sanda}

SLIDE 2

Outline

  • 1. Introduction
  • 2. The Approach
    – 2.1 Pipeline Overview
    – 2.2 Preprocessing
    – 2.3 Multi-Task Transformer
    – 2.4 Postprocessing
  • 3. Results
  • 4. Conclusion
SLIDE 3

Introduction

Multi-task neural model for:

  • Task 1: entity identification
  • Task 2: relation identification
  • Task 3*: concept normalization
  • Task 4: normalized relation identification
SLIDE 4

Introduction

Problem

  • Sentence-level
  • Binary Relation identification

Our Approach

  • Multi-task learning
    – Sentence classification
    – Mention boundary detection
    – Relation extraction
    – PK effect classification
  • Pre-trained Transformer for shared representation

SLIDE 6

The Approach

[Figure: FDA Label Drug-Drug Interaction Pipeline]

  • Input: Structured Product Labels (SPLs)
  • Preprocessing
    – Annotation propagation: mentions, relations, pseudo-triggers
    – Tokenization: spaCy, WordPiece
  • Multi-task Transformer Net for Identifying Drug-Drug Interactions
    – BERT shared representation
    – Sentence Classifier, Mention Boundary, Relation Extractor, PKE Classifier
    – Outputs: mentions, relations, PK effects
  • Postprocessing
    – Mention filtering
    – Continuation linking
    – Unused mention/relation filtering
  • Normalization: UMLS, MED-RT, SNOMED-CT
  • Outputs
    – Task 1: Mentions
    – Task 2: Relations
    – Task 3: Normalized Mentions
    – Task 4: Label Interactions

SLIDE 8

Preprocessing

  • Binary relations
    – (Trigger, Precipitant, Effect) ->
        • (Trigger, Precipitant)
        • (Trigger, Effect)
    – Pseudo-triggers for SIs in some PDIs
    – PK effects as attributes
  • Mention annotation propagation
    – Ease the learning problem
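
The split of a ternary interaction into trigger-anchored binary relations can be sketched as follows. This is a minimal illustration, not the authors' code: the `Interaction` type, the role names, and the example mentions are all assumptions made for the sketch.

```python
from typing import List, NamedTuple, Optional, Tuple


class Interaction(NamedTuple):
    """One annotated interaction; field names are illustrative assumptions."""
    trigger: str
    precipitant: str
    effect: Optional[str]  # None when no effect mention is annotated


def to_binary_relations(inter: Interaction) -> List[Tuple[str, str, str]]:
    """Split (Trigger, Precipitant, Effect) into (role, trigger, argument) pairs."""
    pairs = [("precipitant", inter.trigger, inter.precipitant)]
    if inter.effect is not None:
        pairs.append(("effect", inter.trigger, inter.effect))
    return pairs


# Invented example mentions, used only to show the shape of the output.
example = Interaction("increase", "ketoconazole", "risk of bleeding")
binary = to_binary_relations(example)
```

Because both pairs keep the same trigger, the original triple can later be recovered by grouping on that trigger.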

SLIDE 9

Preprocessing

  • Tokenization
    – spaCy
    – WordPiece using BERT vocab
  • C-IOBES tagging
    – Continuation necessary for disjoint spans
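
A rough sketch of how a continuation tag can encode disjoint spans is given below. The slides do not spell out the exact tagging rules, so this assumes that the first fragment of a mention receives ordinary IOBES tags and every later fragment receives C tags. The function name, span format, and example sentence are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # inclusive token indices (start, end)


def c_iobes_tags(n_tokens: int, mentions: List[Tuple[str, List[Span]]]) -> List[str]:
    """Tag tokens with IOBES for a mention's first fragment and C for later ones.

    Assumption: mentions do not overlap each other.
    """
    tags = ["O"] * n_tokens
    for label, fragments in mentions:
        fragments = sorted(fragments)
        start, end = fragments[0]
        if start == end:
            tags[start] = f"S-{label}"          # single-token fragment
        else:
            tags[start] = f"B-{label}"          # beginning
            for i in range(start + 1, end):
                tags[i] = f"I-{label}"          # inside
            tags[end] = f"E-{label}"            # end
        for start, end in fragments[1:]:
            for i in range(start, end + 1):
                tags[i] = f"C-{label}"          # continuation of a disjoint mention
    return tags


tokens = ["DrugA", "may", "increase", "the", "plasma", "levels"]
tags = c_iobes_tags(
    len(tokens),
    [("Precipitant", [(0, 0)]), ("Trigger", [(2, 2), (4, 5)])],
)
```

Here the trigger "increase ... plasma levels" is split by "the", so its second fragment is tagged as a continuation rather than as a new mention.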

SLIDE 11

Multi-Task Transformer

Multi-Task Transformer network for Identifying Drug-Drug Interactions (MTTDDI)

[Figure: MTTDDI architecture]

  • Input: [CLS] t1 t2 ... tn [SEP]
  • BERT Sentence Encoder: sentence representation s and token representations c1 ... cn
  • Sentence Classifier: softmax layer over s; identifies sentences containing interactions
  • Mention Boundary Labeler: CRF over c1 ... cn producing b1 ... bn; mention type and boundary labels for all words
  • Relation Extractor: trigger, argument, and context embeddings fed to a softmax layer producing r; relation labels for all mention pairs
  • PKE Classifier: softmax layer, applied if r is a PKI; PKI effect codes

SLIDE 12

BERT Sentence Encoder

[Figure: MTTDDI architecture diagram, repeated from SLIDE 11]

SLIDE 13

Sentence Classifier

[Figure: MTTDDI architecture diagram, repeated from SLIDE 11]

SLIDE 14

Mention Boundary Labeler

[Figure: MTTDDI architecture diagram, repeated from SLIDE 11]
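
The slides name a CRF for the Mention Boundary Labeler but do not describe decoding. Viterbi is the usual decoder for a linear-chain CRF, and a minimal sketch of it follows. The three-tag set and all scores are hand-set assumptions for illustration. In a trained model the emission scores would come from the encoder and the transition scores would be learned.

```python
from typing import List, Sequence


def viterbi(emissions: Sequence[Sequence[float]],
            transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag index sequence.

    emissions[t][j]: score of tag j at token t.
    transitions[i][j]: score of moving from tag i to tag j.
    """
    n_tags = len(transitions)
    score = list(emissions[0])
    backptr: List[List[int]] = []
    for emit in emissions[1:]:
        new_score, ptrs = [], []
        for j in range(n_tags):
            # Best previous tag for reaching tag j at this position.
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            ptrs.append(best_i)
        score = new_score
        backptr.append(ptrs)
    # Follow back-pointers from the best final tag.
    path = [max(range(n_tags), key=lambda j: score[j])]
    for ptrs in reversed(backptr):
        path.append(ptrs[path[-1]])
    return path[::-1]


O, B, E = 0, 1, 2      # toy tag set: outside, begin, end
BAD = -1e4             # strongly penalised transition
transitions = [
    [0.0, 0.0, BAD],   # O -> O, B allowed; O -> E penalised
    [BAD, BAD, 0.0],   # B must be followed by E in this toy set
    [0.0, 0.0, BAD],   # E -> O, B allowed; E -> E penalised
]
emissions = [
    [0.1, 1.0, 0.0],
    [0.9, 0.0, 0.8],   # per-token argmax would pick O here
    [1.0, 0.0, 0.1],
]
best_path = viterbi(emissions, transitions)
```

Picking the best tag per token independently would give B, O, O, which is not a valid boundary sequence. Decoding over the whole sequence yields B, E, O instead.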

SLIDE 15

Relation Extractor

[Figure: MTTDDI architecture diagram, repeated from SLIDE 11]
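
The diagram shows trigger, argument, and context embeddings feeding a softmax layer that outputs the relation label r. One way this could work is sketched below. The slides do not say how the three embeddings are built, so the sketch assumes mean pooling over the trigger span, the argument span, and the whole sentence, followed by concatenation and a single linear layer. All vectors and weights are toy values.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Average a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def relation_probs(token_vecs: Sequence[Vector],
                   trigger_span: Tuple[int, int],
                   argument_span: Tuple[int, int],
                   weights: Sequence[Vector],
                   bias: Sequence[float]) -> List[float]:
    """Score one (trigger, argument) pair; spans are inclusive token indices."""
    t0, t1 = trigger_span
    a0, a1 = argument_span
    trigger = mean_pool(token_vecs[t0:t1 + 1])
    argument = mean_pool(token_vecs[a0:a1 + 1])
    context = mean_pool(token_vecs)              # assumption: whole-sentence context
    features = trigger + argument + context      # list concatenation
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)


token_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # toy contextual vectors c1..c3
weights = [[1.0, 0.0, 0.0, 0.0, 0.0, 0.0],          # toy 2-class linear layer
           [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]
probs = relation_probs(token_vecs, (0, 0), (2, 2), weights, [0.0, 0.0])
```

Running this for every candidate mention pair in a sentence would give the "relation labels for all mention pairs" named in the figure.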

SLIDE 16

Pharmacokinetic Effect Classifier

[Figure: MTTDDI architecture diagram, repeated from SLIDE 11]

SLIDE 18

Postprocessing

  • Filtering
    – Invalid boundary tag sequences
    – Repeated mentions
    – Mentions not involved in an interaction
  • C-spans linked to closest mention
  • Reconstruct ternary interactions from binary through shared trigger
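
Reconstructing ternary interactions through a shared trigger can be sketched as a grouping step. This is an assumption-laden illustration: the relation tuple layout and mention identifiers are hypothetical, and pairing every precipitant with every effect under one trigger is a guess at the behaviour, which the slides do not confirm.

```python
from collections import defaultdict
from itertools import product
from typing import Dict, List, Optional, Tuple

BinaryRelation = Tuple[str, str, str]          # (role, trigger_id, argument_id)
Ternary = Tuple[str, str, Optional[str]]       # (trigger_id, precipitant_id, effect_id)


def reconstruct_ternary(relations: List[BinaryRelation]) -> List[Ternary]:
    """Group binary relations by trigger and rebuild (trigger, precipitant, effect)."""
    by_trigger: Dict[str, Dict[str, List[str]]] = defaultdict(
        lambda: {"precipitant": [], "effect": []}
    )
    for role, trigger, argument in relations:
        by_trigger[trigger][role].append(argument)

    interactions: List[Ternary] = []
    for trigger, args in by_trigger.items():
        effects: List[Optional[str]] = list(args["effect"]) or [None]
        # A trigger with no precipitant yields no interaction.
        for precipitant, effect in product(args["precipitant"], effects):
            interactions.append((trigger, precipitant, effect))
    return interactions


relations = [
    ("precipitant", "T1", "M1"),
    ("effect", "T1", "M2"),
    ("precipitant", "T2", "M3"),
]
ternary = reconstruct_ternary(relations)
```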

SLIDE 19

Postprocessing

  • Normalization
    – String matching
    – SNOMED-CT: specific interactions
    – MED-RT: drug classes
    – UNII: precipitants
    – Augmented with atoms from UMLS
  • Map precipitants first to MED-RT, then to UNII if no match was found
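
The MED-RT-then-UNII fallback for precipitants can be sketched with two lookup tables. The matching rule (case-insensitive, whitespace-normalized exact match) is an assumption, and the dictionary entries and identifiers are placeholders, not real MED-RT or UNII codes.

```python
from typing import Dict, Optional, Tuple


def normalize_key(text: str) -> str:
    """Lowercase and collapse whitespace so near-identical strings match."""
    return " ".join(text.lower().split())


def normalize_precipitant(mention: str,
                          medrt: Dict[str, str],
                          unii: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Try MED-RT first, then UNII; return (vocabulary, code) or None."""
    key = normalize_key(mention)
    if key in medrt:
        return ("MED-RT", medrt[key])
    if key in unii:
        return ("UNII", unii[key])
    return None


# Placeholder tables; a real system would load these from the vocabularies.
medrt_table = {"cyp3a4 inhibitors": "MEDRT-PLACEHOLDER-1"}
unii_table = {"ketoconazole": "UNII-PLACEHOLDER-1"}

class_match = normalize_precipitant("CYP3A4  Inhibitors", medrt_table, unii_table)
drug_match = normalize_precipitant("Ketoconazole", medrt_table, unii_table)
no_match = normalize_precipitant("unknown drug", medrt_table, unii_table)
```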

SLIDE 20

Postprocessing Task 4

  • Inferred from unique interactions between normalized mentions

  • PK effect codes from MTTDDI
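
Collecting the unique label-level interactions can be sketched as an order-preserving de-duplication over normalized tuples. The tuple layout and the codes below are placeholders chosen for illustration.

```python
from typing import Iterable, List, Optional, Tuple

# (precipitant_code, interaction_type, effect_code); effect_code may be None
LabelInteraction = Tuple[str, str, Optional[str]]


def unique_interactions(interactions: Iterable[LabelInteraction]) -> List[LabelInteraction]:
    """Drop repeated tuples while keeping first-seen order."""
    return list(dict.fromkeys(interactions))


sentence_level = [
    ("CODE-A", "PKI", "PK-EFFECT-1"),
    ("CODE-B", "PDI", "CODE-X"),
    ("CODE-A", "PKI", "PK-EFFECT-1"),   # same interaction found in another sentence
]
label_level = unique_interactions(sentence_level)
```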
SLIDE 22

Results

Evaluated MTTDDI against two alternate configurations:

  • UTDHLTRI Run3: No sentence filtering/targeted training
  • Run3 + Filtering: Dedicated Learners

System              Task1   Task2   Task3   Task4
Best Submission     65.38   49.03   62.39   17.56
Median              48.97   37.13   45.53   17.56
UTDHLTRI Run3       35.04   27.48   28.66   17.56
Run3 + Filtering    56.03   42.29   45.73   24.07
MTTDDI              54.39   41.34   44.08   25.20

* Bold indicates best score. Italics indicates best score among LDIIP systems.

SLIDE 24

Questions