Few-Shot Learning For Text Classification - Master's Thesis by Shaour - PowerPoint PPT Presentation



SLIDE 1

Few-Shot Learning For Text Classification

Master's Thesis by Shaour Haider
First Referee: Prof. Dr. Benno Stein
Second Referee: Prof. Dr. Volker Rodehorst

SLIDE 2

Overview

  • Introduction
  • Approaches And Results
  • Related Work
  • Future Work

SLIDE 3

What is text classification?

  • Input: a paragraph A and a fixed set of classes C = {c1, c2, …, cn}
  • Output: a predicted class c ∈ C
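As a toy illustration of this setup (not from the thesis), a minimal keyword-matching classifier over a hypothetical class set C:

```python
# Toy illustration of text classification: map a paragraph to one class
# from a fixed set C. The keyword lists are hypothetical, for illustration.
KEYWORDS = {
    "sports": {"football", "goal", "team", "baseball"},
    "spam": {"winner", "prize", "click"},
}

def classify(paragraph: str) -> str:
    """Return the class whose keyword set overlaps the paragraph most."""
    tokens = set(paragraph.lower().split())
    return max(KEYWORDS, key=lambda c: len(tokens & KEYWORDS[c]))

print(classify("The team kicked the ball to score a goal"))  # sports
```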

Introduction

SLIDE 4

Why Text Classification?

  • Sentiment Analysis
  • Spam Detection
  • Topic Classification

[Images: Sentiment Analysis, Spam Detection, Topic Classification]

SLIDE 5

Few-Shot Learning

Situation:

  • Limited data

Few-shot learning aims to learn a classifier from a limited number of labeled examples (< 10).

[Figure: a 4-way 1-shot task, shown as a Train Set and a Test Set of (Class, Paragraph) pairs]
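The N-way K-shot setting can be made concrete by sampling episodes. A sketch (the pool data and helper name are hypothetical) that builds a few-shot train/test split from a labeled pool:

```python
import random

def make_episode(pool, n_way=4, k_shot=1, n_query=1, seed=0):
    """Sample an N-way K-shot episode from a labeled pool.

    pool: dict mapping class name -> list of paragraphs.
    Returns (train_set, test_set) as lists of (paragraph, class) pairs.
    """
    rng = random.Random(seed)
    classes = rng.sample(sorted(pool), n_way)  # pick N classes
    train, test = [], []
    for c in classes:
        # K support examples per class, plus held-out query examples
        examples = rng.sample(pool[c], k_shot + n_query)
        train += [(x, c) for x in examples[:k_shot]]
        test += [(x, c) for x in examples[k_shot:]]
    return train, test

pool = {f"class_{i}": [f"paragraph {i}.{j}" for j in range(5)] for i in range(6)}
train, test = make_episode(pool, n_way=4, k_shot=1)
print(len(train), len(test))  # 4 4
```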

SLIDE 6

Datasets

Terminologies:

  • Train Set (few-shot training set)
  • Test Set (testing set)
  • Base Dataset: an additional dataset that is disjoint from the train and test sets of the target dataset
  • Target Dataset: the dataset from which the few-shot train and test sets are drawn

SLIDE 7

Let's Implement: Baseline with Bag of Words

[Diagram: Train: Target Training Data (Few) → Feature Extraction (bag of words over the Target Dataset) → Classifier → Target Loss & Update. Test: Target Testing Data → Feature Extraction (fixed weights) → Classifier → Target Accuracy Assessment.]

Approaches And Results
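A minimal sketch of this baseline pipeline, assuming a nearest-centroid classifier over bag-of-words counts (the slide does not specify the classifier; the centroid head is an illustrative stand-in):

```python
from collections import Counter

def bow(text, vocab):
    """Bag-of-words count vector for one text over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

def fit(train):
    """'Train' step: build the vocabulary and one mean BoW vector per class."""
    vocab = sorted({w for x, _ in train for w in x.lower().split()})
    groups = {}
    for x, c in train:
        groups.setdefault(c, []).append(bow(x, vocab))
    centroids = {c: [sum(col) / len(vecs) for col in zip(*vecs)]
                 for c, vecs in groups.items()}
    return vocab, centroids

def predict(x, vocab, centroids):
    """'Test' step: score a paragraph against each class centroid."""
    v = bow(x, vocab)
    score = lambda c: sum(p * q for p, q in zip(v, centroids[c]))
    return max(centroids, key=score)

vocab, centroids = fit([("football goal team", "sports"),
                        ("prize winner click", "spam")])
print(predict("the team scored a goal", vocab, centroids))  # sports
```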

SLIDE 8

Baseline: Bag of Words

            K=1    K=3    K=9
  BOW       0.48   0.58   0.74

[Chart: baseline BOW accuracy for K=1, 3, 9]

Approaches And Results

SLIDE 9

Problems with the bag of words

  • Overfitting
  • Vocabulary mismatch

Example paragraphs (class: Sports) and their bag-of-words vectors:

Football is a family of team sports that involve, to varying degrees, kicking a ball to score a goal.

[0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 0 0 0 0 1 0 1 1 0 0 1 0 0 1 0 0 0 0 0 0 1 1 1 1 0 0 2 0 1 0 0 0]

While football continued to be played in various forms throughout Britain, its public schools (equivalent to private schools in other countries) are widely credited with four key achievements in the creation of modern football codes.

[0 1 0 0 1 0 0 0 1 0 1 0 0 1 1 1 1 1 0 0 1 0 0 2 1 1 0 0 0 3 0 0 1 1 0 0 1 1 0 1 1 1 1 2 0 0 0 0 1 1 2 1 0 1 1 1]

Baseball evolved from older bat-and-ball games already being played in England by the mid-18th century.

[1 0 1 1 0 1 1 1 0 1 0 1 1 0 0 0 0 0 0 1 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0]
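The vocabulary-mismatch problem can be seen directly: two paragraphs on the same topic may share no surface tokens, so their bag-of-words vectors have zero overlap:

```python
# Two sports paragraphs (shortened from the slide) that share no tokens,
# so their BoW vectors do not overlap at all.
def tokens(s):
    return set(s.lower().replace(",", "").replace(".", "").split())

a = tokens("Football is a family of team sports")
b = tokens("Baseball evolved from older bat-and-ball games")
print(a & b)  # empty set: zero BoW overlap despite the shared topic
```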

SLIDE 10

Better representations: Pre-Trained FastText or BERT Model

[Diagram: Train: Target Training Data (Few) → Feature Extraction (pre-trained FastText or BERT model) → Classifier → Target Loss & Update. Test: Target Testing Data → Feature Extraction (fixed weights) → Classifier → Target Accuracy Assessment.]
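A sketch of feature extraction with pre-trained word vectors, where a paragraph is represented by the average of its word embeddings. The random vectors below are stand-ins for real FastText or BERT weights:

```python
import random

random.seed(0)
DIM = 8
# Hypothetical pre-trained word vectors; in the thesis these come from
# FastText or BERT. Random vectors stand in for them here.
EMB = {w: [random.gauss(0, 1) for _ in range(DIM)]
       for w in "football goal team baseball bat".split()}

def embed(text):
    """Feature extraction: average the pre-trained vectors of known words."""
    vecs = [EMB[w] for w in text.lower().split() if w in EMB]
    if not vecs:
        return [0.0] * DIM
    return [sum(col) / len(vecs) for col in zip(*vecs)]

features = embed("Football team scores a goal")
print(len(features))  # 8
```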

SLIDE 11

Baseline: FastText & BERT

            K=1            K=3            K=9
  BOW       0.48           0.58           0.74
  FastText  0.66 (+0.18)   0.78 (+0.20)   0.84 (+0.10)
  Bert      0.73 (+0.25)   0.84 (+0.26)   0.89 (+0.15)

[Chart: baseline accuracy of BOW, FastText, and BERT for K=1, 3, 9]

SLIDE 12

Can we improve any further?

[Image: Transfer Learning]

SLIDE 13

Approach: Transfer Learning (Bag of Words)

[Diagram: Pre-training: Pre-Training Data (Many, from the Base Dataset) → Feature Extraction → Model Pre-Training Loss & Update. The pre-trained weights are then fixed. Train: Target Training Data (Few) → Feature Extraction (fixed weights) → Classifier → Target Loss & Update. Test: Target Testing Data → Feature Extraction (fixed weights) → Classifier → Target Accuracy Assessment.]
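The control flow of this transfer-learning pipeline can be sketched as follows. The pre-training step is a stand-in (a random per-word projection rather than weights actually learned on the base dataset), and only the classifier head is fit on the few target examples:

```python
import random

random.seed(0)
DIM = 4

def pretrain_extractor(base_texts):
    """Pre-training stand-in. In the thesis this step learns feature
    weights on the base dataset (Wikipedia section-heading classification);
    here random per-word vectors play the role of the learned weights."""
    vocab = {w for t in base_texts for w in t.lower().split()}
    return {w: [random.gauss(0, 1) for _ in range(DIM)] for w in vocab}

def extract(text, weights):
    """Frozen feature extractor, reused unchanged on the target task."""
    feats = [0.0] * DIM
    for w in text.lower().split():
        for i, v in enumerate(weights.get(w, [0.0] * DIM)):
            feats[i] += v
    return feats

def fit_head(few_shot_train, weights):
    """Only the classifier head is trained on the few target examples:
    here, one mean feature vector per class."""
    groups = {}
    for x, c in few_shot_train:
        groups.setdefault(c, []).append(extract(x, weights))
    return {c: [sum(col) / len(vs) for col in zip(*vs)]
            for c, vs in groups.items()}

def predict(x, weights, head):
    f = extract(x, weights)
    dist = lambda c: sum((p - q) ** 2 for p, q in zip(f, head[c]))
    return min(head, key=dist)

weights = pretrain_extractor(["football goal", "prize winner"])   # base data
head = fit_head([("football goal", "sports"), ("prize winner", "spam")],
                weights)                                          # target data
print(predict("football goal", weights, head))  # sports
```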

SLIDE 14

Model: Standard

SLIDE 15

Results: Transfer Learning - Standard Model

            K=1            K=3            K=9
  BOW       0.49 (+0.01)   0.52 (-0.06)   0.71 (-0.03)

[Chart: BOW accuracy with standard transfer learning for K=1, 3, 9]

SLIDE 16

Transfer Learning with a Pre-Trained FastText & BERT Model

[Diagram: Pre-training: Pre-Training Data (Many, from the Base Dataset) → Feature Extraction (pre-trained FastText or BERT model) → Model Pre-Training Loss & Update. The pre-trained weights are then fixed. Train: Target Training Data (Few) → Feature Extraction (fixed weights) → Classifier → Target Loss & Update. Test: Target Testing Data → Feature Extraction (fixed weights) → Classifier → Target Accuracy Assessment.]

SLIDE 17

Results: Transfer Learning - Standard Model

            K=1            K=3            K=9
  BOW       0.49 (+0.01)   0.52 (-0.06)   0.71 (-0.03)
  FastText  0.62 (-0.04)   0.75 (-0.03)   0.81 (-0.03)
  Bert      0.73 (0.00)    0.84 (0.00)    0.88 (-0.01)

[Charts: baseline BOW accuracy (0.48, 0.58, 0.74) and FastText/BERT accuracy with standard transfer learning, for K=1, 3, 9]

SLIDE 18

Model: Modified

SLIDE 19

Results: Transfer Learning - Modified Model

            K=1            K=3            K=9
  BOW       0.68 (+0.20)   0.75 (+0.17)   0.84 (+0.10)
  FastText  0.69 (+0.03)   0.78 (0.00)    0.83 (-0.01)
  Bert      0.73 (0.00)    0.81 (-0.03)   0.87 (-0.02)

[Chart: accuracy of BOW, FastText, and BERT with modified transfer learning for K=1, 3, 9]

SLIDE 20

Complete Results

                                          K=1    K=3    K=9
  BOW - Baseline                          0.48   0.58   0.74
  FastText - Baseline                     0.66   0.78   0.84
  Bert - Baseline                         0.73   0.84   0.89
  BOW - Standard Transfer Learning        0.49   0.52   0.71
  FastText - Standard Transfer Learning   0.62   0.75   0.81
  Bert - Standard Transfer Learning       0.73   0.84   0.88
  BOW - Modified Transfer Learning        0.68   0.75   0.84
  FastText - Modified Transfer Learning   0.69   0.78   0.83
  Bert - Modified Transfer Learning       0.73   0.81   0.87

SLIDE 21

Results Summary

  • An average improvement of 10-20% with modified transfer learning using BOW representations, compared to the baseline scores of the BOW model.
  • A general increase in accuracy with the size of the training task.
  • No real improvement when fine-tuning the representations from the advanced pre-trained models FastText and BERT.
  • BOW representations can be improved by pre-training on a Wikipedia section-heading classification task.

SLIDE 22

Few-shot learning approaches:

  • Metric Learning
  • Meta Learning

Related Work

SLIDE 23

Metric Learning

[Images: Siamese network and Relation Network, from "Advances in few-shot learning"]

SLIDE 24

Meta Learning

  • MAML
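MAML learns an initialization that adapts to a new task in a few gradient steps. A first-order sketch on a toy one-parameter regression problem (an illustrative setup, not the thesis experiments):

```python
def loss_grad(theta, a):
    """Gradient of the task loss L(theta) = (theta - a)^2 for task 'a'."""
    return 2.0 * (theta - a)

def maml_step(theta, tasks, inner_lr=0.1, outer_lr=0.01):
    """One outer update: adapt to each task with a single inner gradient
    step, then move theta along the post-adaptation gradient
    (first-order MAML approximation)."""
    outer_grad = 0.0
    for a in tasks:
        adapted = theta - inner_lr * loss_grad(theta, a)  # inner adaptation
        outer_grad += loss_grad(adapted, a)               # post-adapt gradient
    return theta - outer_lr * outer_grad / len(tasks)

theta = 0.0
for _ in range(500):
    theta = maml_step(theta, tasks=[-1.0, 1.0, 2.0])
# theta converges toward the task mean (2/3): a shared initialization
# from which each task is reachable in one inner step.
```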

SLIDE 25

Future Work

  • Using other few-shot learning approaches such as meta learning and metric learning.
  • Increasing the dataset by not limiting it to level-2 section headings; this would require more computational resources.
  • Using the bert-large model instead of bert-base.
  • Finding the peak accuracy score for the BERT model.
  • Testing the trained classifier on topic classification data other than Wikipedia.

SLIDE 26

Thank you

SLIDE 27

Additional Slides

SLIDE 28

Additional Slides

SLIDE 29

Related Work: Metric Learning

  • Siamese Networks

[Diagram: Input 1 and Input 2 each pass through the same Neural Network; a Distance Metric compares the two outputs]

  • Matching Networks

[Diagram: Support Set Instances compared against a Query Set Instance]
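A Siamese network applies one shared encoder to both inputs and compares the encodings with a distance metric. A toy sketch where character-bigram counts stand in for the learned encoder:

```python
from collections import Counter

def encode(text):
    """Shared 'encoder' applied to both inputs (toy: character bigrams;
    a Siamese network would use one shared neural network here)."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))

def distance(x1, x2):
    """Distance metric between the two encodings; small for similar inputs."""
    e1, e2 = encode(x1), encode(x2)
    return sum(abs(e1[k] - e2[k]) for k in set(e1) | set(e2))

print(distance("football", "football"))  # 0 -- identical inputs
```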

SLIDE 30

Related Work: Metric Learning

  • Prototypical Networks & Relation Networks
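Prototypical networks average the support embeddings of each class into a prototype and classify a query by its nearest prototype. A minimal sketch with toy 2-D embeddings (real embeddings would come from a learned encoder):

```python
def prototypes(support):
    """Mean embedding per class from the support set."""
    groups = {}
    for emb, label in support:
        groups.setdefault(label, []).append(emb)
    return {c: [sum(col) / len(vs) for col in zip(*vs)]
            for c, vs in groups.items()}

def classify(query_emb, protos):
    """Assign the query to the nearest prototype (squared Euclidean)."""
    dist = lambda c: sum((p - q) ** 2 for p, q in zip(query_emb, protos[c]))
    return min(protos, key=dist)

support = [([0.0, 0.0], "A"), ([0.2, 0.0], "A"),
           ([1.0, 1.0], "B"), ([1.2, 1.0], "B")]
protos = prototypes(support)
print(classify([0.1, 0.1], protos))  # A
```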

SLIDE 31

Related Work: Meta Learning

[Diagram: model weights W and meta-parameters Θ]

SLIDE 32

Related Work: Transfer Learning

  • Baseline