Effective transfer learning for clinical applications

SLIDE 1

Effective transfer learning for clinical applications

Benjamin van der Burgh (LIACS)

SLIDE 2

OVERVIEW

  • 1. Transfer learning in NLP
  • 2. Experiments on Dutch data
  • 3. Well-being tracking using clinical journals

SLIDE 3

PROJECT BACKGROUND

▰ Physiotherapists keep journals
▰ Can we quantify well-being from text?
▰ Not a conventional task, no labeled data
▰ What can we do about it?

SLIDE 4

TRANSFER LEARNING

Learning with a head start

SLIDE 5

TRANSFER LEARNING

▰ Deep neural networks
▰ First train a model on a different but similar task
▰ The model learns reusable representations / features
▰ Replace the last layer(s) to adjust to the target task
▰ Continue training the model on the target dataset
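
The recipe above can be sketched in a few lines of plain Python. This is a minimal illustration, not the setup used in the talk: `pretrained_encoder` is a hypothetical stand-in for the body of a network trained on the source task, the data are toy values, and as a simplification the encoder is kept frozen while only a new last layer is trained (the slide describes continuing to train the whole model).

```python
import math

def pretrained_encoder(x):
    """Hypothetical frozen encoder: stands in for a pretrained network body."""
    return [x[0] + x[1], x[0] - x[1]]

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clip so math.exp cannot overflow
    return 1.0 / (1.0 + math.exp(-z))

def train_head(features, labels, epochs=200, lr=0.5):
    """Train a fresh logistic-regression 'last layer' on top of the features."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for f, y in zip(features, labels):
            p = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)
            g = p - y  # gradient of the log loss with respect to the logit
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b

def predict(x, head):
    """Encode the input with the frozen encoder, then apply the new head."""
    w, b = head
    f = pretrained_encoder(x)
    return 1 if sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b) >= 0.5 else 0

# Toy target dataset (illustrative values only).
target_inputs = [(1.0, 1.0), (2.0, 0.5), (0.5, 1.5),
                 (-1.0, -1.0), (-2.0, -0.5), (-0.5, -1.5)]
target_labels = [1, 1, 1, 0, 0, 0]

head = train_head([pretrained_encoder(x) for x in target_inputs], target_labels)
predictions = [predict(x, head) for x in target_inputs]
print(predictions)
```

Because the encoder already separates the two classes, the small head fits the target data with only six examples, which is the "head start" the section title refers to.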

SLIDE 6

Source: http://ruder.io/nlp-imagenet/

SLIDE 7

TRANSFER LEARNING IN NLP

▰ Generic task in NLP: language modelling
▰ Example: “I’m not half the man I …”
▰ Dataset source: Wikipedia, CommonCrawl, etc.
▰ Suitable architectures
  ▻ RNN-based: ULMFiT (AWD-LSTM)
  ▻ Self-attention models: Transformer, BERT
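
To make the language-modelling task concrete, here is a minimal sketch that estimates next-word probabilities from bigram counts. It is far simpler than the neural architectures listed above and is only meant to show the shape of the task: given the preceding text, assign probabilities to the next word. The corpus and the function names are illustrative assumptions.

```python
from collections import Counter, defaultdict

def train_bigram_lm(sentences):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word_probs(counts, word):
    """Return P(next word | word) as a dict; empty if the word was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return {}
    total = sum(followers.values())
    return {w: c / total for w, c in followers.items()}

# Toy corpus (illustrative only); real models use Wikipedia-scale text.
corpus = [
    "the cat sat on the mat",
    "the cat sat on the sofa",
    "the dog sat on the mat",
]
lm = train_bigram_lm(corpus)
print(next_word_probs(lm, "sat"))
print(next_word_probs(lm, "the"))
```

No labels are needed: the text itself supplies the prediction targets, which is why large unlabeled corpora are enough for this stage.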

SLIDE 9

FINE-TUNING LANGUAGE MODEL

▰ Adjust the model to the idiosyncrasies of the target
▰ Example: “Patient has pain in the …”
▰ Use the language model as encoder for the target
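
The effect of fine-tuning can be illustrated with a toy count-based language model: a model built on generic text is updated with target-domain text, after which its predictions follow the domain's usage. This is a sketch under stated assumptions, not the ULMFiT procedure; both corpora and all function names are hypothetical.

```python
from collections import Counter, defaultdict

def count_bigrams(sentences, counts=None):
    """Add bigram counts from `sentences` to `counts` (a new model if None)."""
    counts = counts if counts is not None else defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def fine_tune(generic, target_sentences):
    """Copy the generic model, then continue counting on target-domain text."""
    tuned = defaultdict(Counter, {w: Counter(c) for w, c in generic.items()})
    return count_bigrams(target_sentences, tuned)

def predict_next(counts, word):
    """Most frequent follower of `word`, or None if the word is unknown."""
    followers = counts.get(word.lower())
    return followers.most_common(1)[0][0] if followers else None

# Hypothetical corpora: generic text uses "patient" as an adjective,
# clinical notes use it as a noun.
generic_corpus = ["the patient man waited", "a patient man listened"]
clinical_corpus = [
    "patient has pain in the knee",
    "patient has pain in the shoulder",
    "patient has less pain today",
]

generic_lm = count_bigrams(generic_corpus)
tuned_lm = fine_tune(generic_lm, clinical_corpus)
print(predict_next(generic_lm, "patient"))
print(predict_next(tuned_lm, "patient"))
```

The generic model is left untouched, so the same starting point can be fine-tuned separately for different target domains.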

SLIDE 10

THREE-STAGE PROCESS

Generic LM → Fine-tuned LM → Target Task

SLIDE 11

EXPERIMENTS

Transfer learning on Dutch data

SLIDE 12

EXPERIMENTS WITH ULMFIT

▰ Language model trained on Dutch Wikipedia
▰ Dataset of 110k Dutch book reviews [1]
  ▻ {1, 2} → negative
  ▻ {4, 5} → positive
  ▻ {3} → neutral
▰ 18836 training examples, 50% pos / 50% neg

[1] 110k Dutch Book Reviews Dataset for Sentiment Analysis https://benjaminvdb.github.io/110kDBRD
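
The label mapping on this slide is simple enough to write down directly. The sketch below assumes the numbers 1 to 5 are the review ratings; the function name and the sample ratings are hypothetical and are not taken from the dataset's own tooling.

```python
from collections import Counter

def rating_to_label(rating):
    """Map a 1-5 review rating to a sentiment label, as listed on the slide."""
    if rating in (1, 2):
        return "negative"
    if rating in (4, 5):
        return "positive"
    if rating == 3:
        return "neutral"
    raise ValueError(f"rating must be between 1 and 5, got {rating!r}")

# Hypothetical sample of ratings, for illustration only.
sample_ratings = [5, 1, 3, 4, 2, 2, 5]
label_counts = Counter(rating_to_label(r) for r in sample_ratings)
print(label_counts)
```

A balanced positive/negative training split, as reported on the slide, would then be drawn from the reviews labeled positive and negative.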

SLIDE 13

EXPERIMENTAL RESULTS

▰ Training the language model took days
▰ Fine-tuning the encoder took an hour
▰ Training the classifier took minutes
▰ Accuracy: 94%
▰ Off-the-shelf software and hardware

SLIDE 15

ADVANTAGES

  • 1. Improved data efficiency
  • 2. Models can be shared
  • 3. Or even collaboratively trained

→ Federated Learning [1]

[1] Federated Learning: Collaborative Machine Learning without Centralized Training Data https://ai.googleblog.com/2017/04/federated-learning-collaborative.html
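
The slides only name federated learning; as a rough illustration of how collaborative training can work without pooling data, the sketch below shows the weighted-averaging step used in federated averaging. Each site trains locally and shares only its model weights, which a coordinator combines in proportion to the number of local examples. The function name and the numbers are illustrative assumptions.

```python
def federated_average(client_weights, client_sizes):
    """Average client weight vectors, weighted by local dataset size."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one dataset size per client weight vector")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all weight vectors must have the same length")
    total = sum(client_sizes)
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

# Two hypothetical clinics: one with 1 local example, one with 3.
clinic_weights = [[1.0, 2.0], [3.0, 4.0]]
clinic_sizes = [1, 3]
global_weights = federated_average(clinic_weights, clinic_sizes)
print(global_weights)
```

Only the weight vectors leave each clinic in this scheme; the journal texts themselves stay local.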

slide-16
SLIDE 16

WELL-BEING TRACKING

Learning from subjective data

SLIDE 17

WELL-BEING TRACKING

▰ Well-being tracking using journal text (SOAP)
▰ Multivariate regression: positive and negative
▰ No labeled data available
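
A regression with two outputs, one positive and one negative score per journal entry, can be sketched as follows. This is a minimal linear model fitted with gradient descent on synthetic numbers, not the model from the talk: the two input features (for example, counts of positive and negative cue words) and all values are hypothetical.

```python
def fit_multi_output(X, Y, epochs=2000, lr=0.1):
    """Fit one linear weight row per output by minimizing mean squared error."""
    n, d, k = len(X), len(X[0]), len(Y[0])
    W = [[0.0] * d for _ in range(k)]
    for _ in range(epochs):
        grads = [[0.0] * d for _ in range(k)]
        for x, y in zip(X, Y):
            for j in range(k):
                err = sum(W[j][i] * x[i] for i in range(d)) - y[j]
                for i in range(d):
                    grads[j][i] += 2.0 * err * x[i] / n
        for j in range(k):
            for i in range(d):
                W[j][i] -= lr * grads[j][i]
    return W

def predict_scores(W, x):
    """Return [positive score, negative score] for one feature vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in W]

# Synthetic data: features are hypothetical cue-word counts,
# targets are [positive score, negative score].
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
Y = [[2.0, 0.0], [0.0, 3.0], [2.0, 3.0], [4.0, 3.0]]

W = fit_multi_output(X, Y)
scores = predict_scores(W, [1.0, 2.0])
print(scores)
```

Keeping positive and negative as separate outputs lets an entry score high on both, which a single sentiment scale cannot express.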

SLIDE 18

LABEL DATA

Experts quantify the contents of a journal entry on a positive and negative axis.

SLIDE 20

TAKEAWAYS

… no, not that kind of takeaway

SLIDE 21

SUMMARY

▰ Transfer learning in NLP is possible
▰ State-of-the-art results while easy to use
▰ Unlock knowledge in subjective data
▰ Models can be shared

SLIDE 22

RELATED WORK

▰ bert-as-service [1]
▰ Self-supervised learning for image data [2]
▰ Sentiment analysis using text in psychiatry [3]

[1] bert-as-service: https://github.com/hanxiao/bert-as-service
[2] Selfie: Self-supervised Pretraining for Image Embedding: https://arxiv.org/abs/1906.02940
[3] Distinguishing Clinical Sentiment: The Importance of Domain Adaptation in Psychiatric Patient Health Records: https://arxiv.org/abs/1904.03225

SLIDE 23

FURTHER RESEARCH

▰ Can privacy be preserved when models are shared?
▰ How can we make machine learning more accessible?
▰ What can be learned from subjective data?
▰ How to explain ‘deep results’?

SLIDE 24

SHARE MODELS

Help patients while preserving privacy

You can download mine from: https://github.com/benjaminvdb/110kDBRD

SLIDE 25

THANKS!

Any questions? You can find me at @BenjaminBurgh & b.van.der.burgh@liacs.leidenuniv.nl