Text Mining on Clinical Data (Robert McHardy)


slide-1
SLIDE 1

Institut für Maschinelle Sprachverarbeitung

Robert McHardy

Text Mining on Clinical Data

slide-2
SLIDE 2

Outline

  • Motivation
  • Medical Entity Recognition
  • Anonymization of Medical Reports
  • Knowledge-based Biomedical Word Sense Disambiguation
  • Extraction of Potential Adverse Drug Events
  • Resources

5.12.2017 Universität Stuttgart 2

slide-3
SLIDE 3

Motivation — Different Users

slide-8
SLIDE 8

Motivation — Why do we need Text Mining on Clinical Data?

  • Doctors need to know whether a drug is safe to use, as fast as possible
  • We don't want to suffer from unsafe drugs
  • Researchers want to use the data, so it has to be anonymized

slide-9
SLIDE 9

Motivation — PubMed, again!

slide-10
SLIDE 10

Unified Medical Language System Metathesaurus (UMLS)

slide-11
SLIDE 11

Medical Entity Recognition — Overview

  • According to Abacha and Zweigenbaum, medical entity recognition consists of two parts:
  • Detecting phrases that refer to medical entities
  • Assigning semantic categories to the entities found

slide-13
SLIDE 13

Medical Entity Recognition — Overview

Variants of one entity: Diabetes type 1, Type 1 diabetes, IDDM, Juvenile diabetes, T1D

slide-17
SLIDE 17

Medical Entity Recognition — Noun Phrase Chunking

Pharmacodynamic studies, including positron-emission tomography (PET) and computed tomography (CT) […]

  • Many tools for NP chunking are available
  • Maximum recall is desired
  • Open-domain tools such as the IMS TreeTagger are suitable
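
As a rough illustration of NP chunking, the sketch below groups already POS-tagged tokens into noun phrases. It assumes Penn Treebank-style tags and a very simple pattern (determiners, adjectives and nouns, ending in a noun). The tagged tokens are written by hand and the function name `chunk_noun_phrases` is hypothetical; this is not TreeTagger output or its API.

```python
def chunk_noun_phrases(tagged_tokens):
    """Group maximal runs of determiner/adjective/noun tags that end in a noun."""
    np_tags = ("DT", "JJ", "NN")  # prefixes: "NN" also matches NNS, NNP, NNPS
    chunks, current = [], []
    # A sentinel token with a non-NP tag flushes the last run.
    for word, tag in list(tagged_tokens) + [("", "END")]:
        if tag.startswith(np_tags):
            current.append((word, tag))
            continue
        # Trim trailing non-nouns so that every chunk ends in a noun.
        while current and not current[-1][1].startswith("NN"):
            current.pop()
        if current:
            chunks.append(" ".join(w for w, _ in current))
        current = []
    return chunks


# Hand-tagged fragment of the example sentence (tags are an assumption).
tagged = [
    ("Pharmacodynamic", "JJ"), ("studies", "NNS"), (",", ","),
    ("including", "VBG"), ("positron-emission", "NN"), ("tomography", "NN"),
    ("and", "CC"), ("computed", "JJ"), ("tomography", "NN"),
]
print(chunk_noun_phrases(tagged))
```

A permissive pattern like this favours recall over precision, which matches the goal stated above; later filtering steps can discard spurious chunks.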

slide-18
SLIDE 18

Medical Entity Recognition — MetaMap and the UMLS

  • MetaMap is a tool which maps noun phrases in raw text to UMLS concepts
  • This is done according to a matching score

slide-24
SLIDE 24

Medical Entity Recognition — MetaMap and the UMLS

  • Three problems with MetaMap:
  • Noun phrase chunking performs worse than with specialized NLP tools
  • Medical entity detection often returns verbs and general words that aren't medical entities (MEs)
  • Some ambiguity remains: UMLS can provide several concepts for a term and several semantic categories for a concept

slide-25
SLIDE 25

Medical Entity Recognition — MetaMap and the UMLS

Pharmacodynamic studies, including positron-emission tomography (PET) and computed tomography (CT) […]

Cold (term): Cold temperature; Common cold; Chronic obstructive lung disease
Cold storage (term): Cold storage

slide-30
SLIDE 30

Medical Entity Recognition — MetaMap+

  • Use tools like TreeTagger for the NP chunking
  • Filter NPs with a stop-word list
  • Search in specialized lists for candidate terms
  • Annotate entities with MetaMap
  • Filter frequent errors and overly broad semantic types

slide-31
SLIDE 31

Medical Entity Recognition — MetaMap+

  • Voting mechanism to disambiguate semantic categories
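
Two of the MetaMap+ steps can be sketched without MetaMap itself: the stop-word filter on noun phrases and the vote over candidate semantic categories. The slides do not specify the voting scheme, so the sketch assumes a plain majority vote. The stop-word list and both function names are hypothetical.

```python
from collections import Counter

STOP_WORDS = {"the", "a", "an", "this", "patient"}  # hypothetical stop-word list


def filter_noun_phrases(noun_phrases, stop_words=STOP_WORDS):
    """Keep only noun phrases that contain at least one non-stop word."""
    return [
        np for np in noun_phrases
        if any(tok.lower() not in stop_words for tok in np.split())
    ]


def vote_category(candidate_categories):
    """Majority vote; with equal counts the category seen first wins."""
    if not candidate_categories:
        return None
    return Counter(candidate_categories).most_common(1)[0][0]


print(filter_noun_phrases(["the patient", "chest pain", "a CT scan"]))
print(vote_category(["Problem", "Test", "Problem"]))
```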

slide-32
SLIDE 32

Medical Entity Recognition — Support Vector Machines (SVMs)

  • Word level features:
  • words of the NP
  • number of words of the NP
  • window of words around the NP
  • Orthographical features:
  • first letter capitalized
  • all letters upper-/lowercase
  • contains abbreviation(s)
  • POS tags
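
The word-level and orthographical features listed above could be collected per noun phrase roughly as follows. This is a sketch under assumptions: the abbreviation test (a short all-uppercase token) is a heuristic chosen for illustration, POS tags are left out because they need a tagger, and `np_features` is a hypothetical name.

```python
def np_features(tokens, start, end, window=2):
    """Feature dict for the noun phrase tokens[start:end] (requires start < end)."""
    phrase = tokens[start:end]
    left = tokens[max(0, start - window):start]
    right = tokens[end:end + window]
    return {
        "words": tuple(w.lower() for w in phrase),
        "num_words": len(phrase),
        "left_context": tuple(w.lower() for w in left),
        "right_context": tuple(w.lower() for w in right),
        "first_cap": phrase[0][0].isupper(),
        "all_upper": all(w.isupper() for w in phrase),
        "all_lower": all(w.islower() for w in phrase),
        # Heuristic (assumption): a short all-uppercase token is an abbreviation.
        "has_abbreviation": any(w.isupper() and 2 <= len(w) <= 5 for w in phrase),
    }


tokens = ["including", "positron-emission", "tomography", "(", "PET", ")"]
features = np_features(tokens, 1, 3)
print(features["num_words"], features["all_lower"], features["right_context"])
```

For an SVM these values would still have to be turned into a numeric vector, for example by one-hot encoding the words.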

slide-36
SLIDE 36

Medical Entity Recognition — BIO-CRFs

Pharmacodynamic studies, including positron-emission tomography (PET) and computed tomography (CT) […]

  • Words are annotated with the tags B, I and O
  • B-x: beginning of a phrase of class x
  • I-x: inside (continuation) of a phrase of class x
  • O: outside any entity
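
The tagging scheme can be illustrated by converting entity spans into one tag per token. The spans and the class label below are chosen by hand for illustration and are not taken from the annotated corpus; `to_bio` is a hypothetical helper.

```python
def to_bio(tokens, entities):
    """entities: list of (start, end, label) token spans, end exclusive."""
    tags = ["O"] * len(tokens)
    for start, end, label in entities:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


tokens = ["Pharmacodynamic", "studies", ",", "including",
          "positron-emission", "tomography", "(", "PET", ")"]
entities = [(4, 6, "Test"), (7, 8, "Test")]  # illustrative spans
for token, tag in zip(tokens, to_bio(tokens, entities)):
    print(f"{token}\t{tag}")
```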

slide-37
SLIDE 37

Medical Entity Recognition — BIO-CRFs

  • Word level features:
  • The word itself
  • Window of words
  • Lemmas
  • Orthographical features:
  • Upper/lowercase
  • contains a digit
  • prefixes and suffixes
  • POS tags
  • (Optional: semantic category of the word, provided by MetaMap+)

slide-38
SLIDE 38

Medical Entity Recognition — Evaluation

  • Corpus contains discharge summaries and progress notes
  • De-identified and annotated by hand
  • Entities: Problem, Treatment and Test
  • Overall 76,665 sentences

slide-39
SLIDE 39

Medical Entity Recognition — Evaluation

Setting          Precision  Recall  F-Score
MetaMap          15.52      16.10   15.80
MetaMap+         48.68      56.46   52.28
SVM              43.65      47.16   45.33
BIO-CRF          70.15      83.31   76.17
BIO-CRF-Hybrid   72.18      83.78   77.55
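
The F-Score column is consistent with the balanced F-measure, the harmonic mean of precision and recall. The small check below recomputes it from the reported precision and recall values; it matches the table up to small rounding differences.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (balanced F-measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


reported = {
    "MetaMap": (15.52, 16.10),
    "MetaMap+": (48.68, 56.46),
    "SVM": (43.65, 47.16),
    "BIO-CRF": (70.15, 83.31),
    "BIO-CRF-Hybrid": (72.18, 83.78),
}
for setting, (p, r) in reported.items():
    print(f"{setting}: {f_score(p, r):.2f}")
```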

slide-40
SLIDE 40

Anonymization of Medical Reports

slide-44
SLIDE 44

Anonymization of Medical Reports — What is anonymization?

  • De-identification: completely remove all personal health information (PHI)
  • Anonymization: identify and classify personal information in the documents, then replace it instead of deleting it
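
The difference between the two can be sketched on character spans that have already been detected and classified. The surrogate values and helper names are hypothetical, and the spans are assumed not to overlap.

```python
def find_span(text, phrase, category):
    """Helper for the example: locate a phrase and attach its PHI category."""
    start = text.index(phrase)
    return (start, start + len(phrase), category)


def replace_spans(text, phi_spans, replacement_for):
    """Rebuild the text, swapping each (start, end, category) span."""
    out, last = [], 0
    for start, end, category in sorted(phi_spans):
        out.append(text[last:start])
        out.append(replacement_for(category))
        last = end
    out.append(text[last:])
    return "".join(out)


def deidentify(text, phi_spans):
    """De-identification: delete the PHI."""
    return replace_spans(text, phi_spans, lambda category: "")


def anonymize(text, phi_spans, surrogates):
    """Anonymization: replace the PHI with a surrogate of the same category."""
    return replace_spans(text, phi_spans, lambda category: surrogates[category])


text = "Seen by Dr. Miller at Mercy Hospital."
spans = [find_span(text, "Dr. Miller", "DOCTOR"),
         find_span(text, "Mercy Hospital", "HOSPITAL")]
surrogates = {"DOCTOR": "Dr. Smith", "HOSPITAL": "General Hospital"}  # hypothetical
print(deidentify(text, spans))
print(anonymize(text, spans, surrogates))
```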

slide-45
SLIDE 45

Anonymization of Medical Reports — Anonymization vs De-Identification

  • With anonymization, the documents remain (easily) readable
  • It is unclear whether a remaining piece of PHI comes from the original document or is an arbitrary replacement

slide-48
SLIDE 48

Anonymization of Medical Reports — The features

  • Use Named Entity Recognition to detect the PHI
  • No "deep knowledge information" such as syntactic information (POS tags) or domain-specific resources
  • Orthographical features, frequency and phrasal information, dictionaries and contextual information
  • Trigger words within a certain window indicate PHI
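
A trigger-word feature of this kind might look like the sketch below. The trigger list, the window size and the name `has_trigger` are assumptions made for illustration; the slides do not give the actual lists or window.

```python
TRIGGERS = {"dr", "dr.", "md", "hospital", "clinic"}  # hypothetical trigger list


def has_trigger(tokens, index, window=2, triggers=TRIGGERS):
    """True if a trigger word occurs within `window` tokens of tokens[index]."""
    lo = max(0, index - window)
    hi = min(len(tokens), index + window + 1)
    return any(
        tokens[i].lower() in triggers
        for i in range(lo, hi)
        if i != index
    )


tokens = ["Seen", "by", "Dr.", "Miller", "at", "Mercy", "Hospital"]
print(has_trigger(tokens, 3))            # "Miller", next to "Dr."
print(has_trigger(tokens, 0, window=1))  # "Seen", only "by" in range
```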

slide-49
SLIDE 49

Anonymization of Medical Reports — Machine learning techniques

  • Use of decision trees for NER
  • C4.5 for building the tree
  • Boosting for improving the performance of the decision tree

slide-51
SLIDE 51

Anonymization of Medical Reports — Boosting and C4.5

  • C4.5 is used for building the decision tree
  • The tree doesn't have to be binary
  • The data is split according to the information gain
  • Boosting combines multiple (weak) classifiers into one strong classifier
  • The decision of each (weak) classifier is weighted
  • Training a boosting classifier means finding these weights
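
Both ideas can be sketched in a few lines: the information gain of a multiway split, and a weighted vote over classifier decisions. C4.5 itself normalizes the gain into a gain ratio; the sketch stops at plain information gain as named above. The classifier weights are given by hand here, whereas boosting would learn them. The feature name and all function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction from splitting on rows[i][feature], one branch per value."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[feature], []).append(label)
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder


def weighted_vote(predictions, weights):
    """Combine classifier decisions by summing the weights per predicted label."""
    scores = Counter()
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(scores, key=scores.get)


rows = [{"capitalized": True}, {"capitalized": True},
        {"capitalized": False}, {"capitalized": False}]
labels = ["PHI", "PHI", "non-PHI", "non-PHI"]
print(information_gain(rows, labels, "capitalized"))
print(weighted_vote(["PHI", "non-PHI", "non-PHI"], [0.9, 0.3, 0.4]))
```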

slide-53
SLIDE 53

Anonymization of Medical Reports — Training

  • Trusted entities found in first training phase
  • Iterative training with new features

slide-54
SLIDE 54

Anonymization of Medical Reports — Performance of the system

  • Several models and two baselines were tested
  • Majority baseline that only predicts non-PHI
  • C4.5 decision tree without Boosting but with domain-specific extensions
  • Iteratively trained models

slide-55
SLIDE 55

Anonymization of Medical Reports — Performance of the system

            Evaluation                       Evaluation standardized
Model       9-way F   8-way P/R/F            9-way F   8-way P/R/F
Majority    94.29     0.00                   94.29     0.00
C4.5        99.46     97.92/92.12/94.93      99.52     97.92/93.19/95.49
ITR1_BEST   99.61     98.92/93.97/96.38      99.74     98.47/96.04/97.42
ITR1_VOTE   99.64     98.99/94.35/96.61      99.75     98.79/96.41/97.58
ITR2_BEST   99.65     98.79/94.72/96.71      99.75     98.81/96.39/97.58
ITR2_VOTE   99.65     98.79/94.73/96.71      99.75     98.89/96.42/97.64

slide-59
SLIDE 59

Anonymization of Medical Reports — Analysis of the feature set

  • The dictionaries were useless
  • Basic features were the most important
  • Orthographical and frequency information were the second and third most important
  • Sentence position and quotation marks/brackets didn't help

slide-60
SLIDE 60

Anonymization of Medical Reports — Errors

Error       Percentage of errors
DOCTOR      23 %
LOCATION    20 %
HOSPITAL    16 %
DATE        10 %

slide-61
SLIDE 61

Knowledge-based Biomedical Word Sense Disambiguation — Ambiguity

  • We already know genes, proteins and diseases are ambiguous
  • Biomedical texts also contain some form of standard language
  • Building a manually annotated corpus for statistical WSD is expensive

slide-64
SLIDE 64

Knowledge-based Biomedical WSD — Machine Readable Dictionaries

  • Build a vector for each concept the word represents
  • Build a vector for the context
  • Cosine similarity to select the most similar concept

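$$\mathrm{MRD}(c) = \operatorname*{arg\,max}_{c \in C_w} \frac{c \cdot c_x}{|c| \cdot |c_x|}$$

where $C_w$ is the set of concept vectors for the ambiguous word $w$ and $c_x$ is the context vector.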
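
A minimal sketch of this selection step, assuming bag-of-words count vectors. The definition words for the two concepts are invented for illustration and are not UMLS definitions; `cosine` and `disambiguate` are hypothetical names.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity of two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def disambiguate(context_tokens, concept_definitions):
    """Return the concept whose vector is most similar to the context vector."""
    context = Counter(context_tokens)
    return max(
        concept_definitions,
        key=lambda concept: cosine(Counter(concept_definitions[concept]), context),
    )


# Invented definition words for two senses of "cold".
concepts = {
    "Common cold": ["viral", "infection", "nose", "throat"],
    "Cold temperature": ["low", "temperature", "weather"],
}
context = ["patient", "has", "a", "viral", "infection", "of", "the", "throat"]
print(disambiguate(context, concepts))
```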

slide-65
SLIDE 65

Knowledge-based Biomedical WSD — PageRank

  • Idea: Use the topology of the resource network
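
One way to use the network topology is a personalized PageRank that restarts at the concepts found in the context and then compares the scores of the candidate senses. The sketch below is a plain power iteration on a toy graph; the graph, the seed set and the parameter values are assumptions for illustration, not the configuration used in the evaluated system.

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """graph: node -> list of neighbours; every node needs at least one neighbour
    and every seed must be a node, otherwise probability mass is lost."""
    nodes = list(graph)
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        # Restart mass goes only to the seed (context) nodes.
        new_rank = {n: (1 - damping) * teleport[n] for n in nodes}
        for node in nodes:
            if not graph[node]:
                continue
            share = damping * rank[node] / len(graph[node])
            for neighbour in graph[node]:
                new_rank[neighbour] += share
        rank = new_rank
    return rank


# Toy concept graph (undirected edges listed in both directions).
graph = {
    "common_cold": ["virus", "infection"],
    "cold_temperature": ["weather"],
    "virus": ["common_cold", "infection"],
    "infection": ["common_cold", "virus"],
    "weather": ["cold_temperature"],
}
rank = personalized_pagerank(graph, seeds={"virus", "infection"})
candidates = ["common_cold", "cold_temperature"]
print(max(candidates, key=rank.get))
```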

slide-66
SLIDE 66

Knowledge-based Biomedical WSD — Automatic Corpus Extraction

  • Idea: Use one really big corpus (e.g. MEDLINE) and extract multiple corpora for training
  • Get monosemous relatives of an ambiguous term
  • Use the result to train a statistical model

slide-67
SLIDE 67

Knowledge-based Biomedical WSD — Automatic Corpus Extraction

"Surgical repair" OR ("repair" AND ["Corneal Transplantation" OR "Corneal Transplantations" OR "Corneal Graftings" OR "Corneal Grafting" OR "Cornea Transplantation" OR "Repair of the Middle Ear"])

  • Monosemous synonyms: "Surgical repair"
  • Ambiguous term: "repair"
  • Monosemous terms from related concepts: "Corneal Transplantation", "Corneal Transplantations", "Corneal Graftings", "Corneal Grafting", "Cornea Transplantation", "Repair of the Middle Ear"
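
A query of this shape can be assembled from the three ingredients. The sketch mirrors the bracket notation of the example above; whether that notation is valid search syntax for a given engine is not claimed here, and `build_query` is a hypothetical helper.

```python
def build_query(ambiguous_term, monosemous_synonyms, related_terms):
    """Monosemous synonyms stand alone; the ambiguous term must co-occur
    with a monosemous term from a related concept."""
    def quoted(term):
        return f'"{term}"'

    parts = [quoted(s) for s in monosemous_synonyms]
    if related_terms:
        related = " OR ".join(quoted(t) for t in related_terms)
        parts.append(f"({quoted(ambiguous_term)} AND [{related}])")
    return " OR ".join(parts)


print(build_query(
    "repair",
    ["Surgical repair"],
    ["Corneal Transplantation", "Corneal Grafting"],
))
```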

slide-69
SLIDE 69

Knowledge-based Biomedical WSD — Journal Descriptor Indexing

  • Idea: Use semantic types for disambiguation

In the mouse, the process of implantation is initiated by the attachment reaction between the blastocyst trophectoderm and uterine luminal epithelium that occurs at 2200–2300 h on day 4 (day 1 = vaginal plug) of pregnancy.

1000 Implantation <1> (Blastocyst Implantation, natural) [Organism Function]
1000 Implantation <2> (Implantation procedure) [Therapeutic or Preventive Procedure]

slide-70
SLIDE 70

Knowledge-based Biomedical WSD — ST vector for implantation

Rank  Abbreviation  Semantic Type                         Score
57    aapp          Amino Acid, Peptide, or Protein       0.3373
5     diap          Diagnostic Procedure                  0.6637
39    emst          Embryonic Structure                   0.4168
13    orgf          Organism Function                     0.6013
1     spco          Spatial Concept                       0.7027
2     topp          Therapeutic or Preventive Procedure   0.6937
108   vtbt          Vertebrate                            0.1748

slide-71
SLIDE 71

Knowledge-based Biomedical WSD — ST vector for blastocyst implantation

Rank  Abbreviation  Semantic Type                         Score
1     orgf          Organism Function                     0.5506
4     emst          Embryonic Structure                   0.5132
12    spco          Spatial Concept                       0.4340
13    topp          Therapeutic or Preventive Procedure   0.4316
16    diap          Diagnostic Procedure                  0.4182
45    aapp          Amino Acid, Peptide, or Protein       0.2766
92    vtbt          Vertebrate                            0.1746
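
With the two semantic type (ST) vectors at hand, the disambiguation step reduces to comparing the scores of the candidate senses' semantic types. The sketch below uses the scores shown for "implantation" alone and for "blastocyst implantation"; the dictionary layout and the name `best_sense` are assumptions.

```python
def best_sense(candidates, st_scores):
    """candidates: sense name -> semantic type abbreviation.
    Pick the sense whose semantic type scores highest."""
    return max(candidates, key=lambda sense: st_scores[candidates[sense]])


candidates = {
    "Blastocyst Implantation, natural": "orgf",  # Organism Function
    "Implantation procedure": "topp",            # Therapeutic or Preventive Procedure
}
scores_word_only = {"orgf": 0.6013, "topp": 0.6937}
scores_with_context = {"orgf": 0.5506, "topp": 0.4316}

print(best_sense(candidates, scores_word_only))
print(best_sense(candidates, scores_with_context))
```

On the ambiguous word alone the procedure sense scores higher, while adding "blastocyst" as context moves Organism Function to the top, which selects the natural implantation sense.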

slide-72
SLIDE 72

Knowledge-based Biomedical WSD — Evaluation

Method                         Accuracy all   Accuracy JDI set
Machine Readable Dictionary    0.639          0.653
PageRank                       0.583          0.587
Automatic Corpus Extraction    0.683          0.693
Journal Descriptor Indexing    -              0.748
Linear Combination             0.762          0.779
Combined Voting                0.760          0.774
Maximum Frequency Sense        0.855          0.867
Naive Bayes                    0.883          0.906

slide-73
SLIDE 73

Extraction of Potential Adverse Drug Events — Where to get the data from?

  • ADE corpus
  • Contains 2972 MEDLINE case reports
  • Relations are annotated only at sentence level
  • Drugs and conditions that are not in a relation were not annotated

slide-74
SLIDE 74

Extraction of Potential Adverse Drug Events — Where to get the data from?

  • ADE-EXT corpus
  • NER with dictionaries for identification of drugs and conditions
  • Every drug-condition pair not previously annotated as a relation formed a False relation
  • Overall 5969 False relations
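
Generating the False relations for one sentence can be sketched as a set difference over all drug-condition pairs. The entity names are placeholders and `false_relations` is a hypothetical helper.

```python
from itertools import product


def false_relations(drugs, conditions, annotated):
    """All drug-condition pairs in a sentence that were not annotated as related."""
    return [(drug, condition)
            for drug, condition in product(drugs, conditions)
            if (drug, condition) not in annotated]


drugs = ["drug_a", "drug_b"]
conditions = ["condition_x", "condition_y"]
annotated = {("drug_b", "condition_x")}  # the only annotated (True) relation
print(false_relations(drugs, conditions, annotated))
```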

slide-75
SLIDE 75

Extraction of Potential Adverse Drug Events — Drug-Cause-Condition

slide-76
SLIDE 76

Extraction of Potential Adverse Drug Events — How the system was built

  • Nested annotations such as "acute lithium toxicity" were removed
  • The resulting corpus was divided into a training and a test set (90:10 split)
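
A 90:10 split, and the k-fold partitioning commonly used on the training portion, can be sketched as follows. Shuffling with a fixed seed and assigning folds by index are assumptions; the slides do not say how the documents were assigned.

```python
import random


def train_test_split(items, test_fraction=0.1, seed=0):
    """Shuffle reproducibly, then cut off the test portion."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def k_folds(items, k=10):
    """Yield (train, held_out) pairs; fold i holds out every k-th item from i."""
    for i in range(k):
        held_out = items[i::k]
        train = [x for j, x in enumerate(items) if j % k != i]
        yield train, held_out


documents = list(range(100))  # stand-ins for document ids
train, test = train_test_split(documents)
print(len(train), len(test))
for fold_train, fold_held_out in k_folds(train, k=10):
    print(len(fold_train), len(fold_held_out))
```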

slide-77
SLIDE 77

Extraction of Potential Adverse Drug Events — How the system was built

  • Tokens were enriched with POS tags, lemmas and named-entity flags

slide-79
SLIDE 79

Extraction of Potential Adverse Drug Events — Evaluation

  • F-Score of 0.87 over the ADE-TRAIN-EXT corpus with 10-fold cross-validation
  • Errors are typically caused by missing context or by distantly co-occurring, inter-related entities
  • Example: A 65-year-old patient chronically treated with the selective serotonin reuptake inhibitor (SSRI) citalopram developed confusion, agitation, tachycardia, tremors, myoclonic jerks and unsteady gait, consistent with serotonin syndrome, following initiation of fentanyl, and all symptoms and signs resolved following discontinuation of fentanyl

slide-80
SLIDE 80

Extraction of Potential Adverse Drug Events — Training size

#Documents   F-Score mean   F-Score standard deviation
10           0.55           0.38
20           0.64           0.37
50           0.82           0.09
100          0.78           0.04
200          0.84           0.04
500          0.84           0.02
1000         0.86           0.01
2000         0.87           0.01

slide-81
SLIDE 81

Take-away

  • Different approaches for detecting medical entities
  • What anonymization is and how to do it
  • Disambiguation of biomedical terms
  • What adverse drug events are and how to extract them

slide-82
SLIDE 82

Resources

  • Gurulingappa, Mateen-Rajput and Toldo, 2012: Extraction of potential adverse drug events from medical case reports
  • Uzuner, South, Shen and DuVall, 2010: i2b2/VA challenge on concepts, assertions, and relations in clinical text
  • Humphrey, Rogers, Kilicoglu, Demner-Fushman and Rindflesch, 2006: Word Sense Disambiguation by Selecting the Best Semantic Type Based on Journal Descriptor Indexing: Preliminary Experiment
  • Szarvas, Farkas and Busa-Fekete, 2007: State-of-the-art Anonymization of Medical Records Using an Iterative Machine Learning Framework
  • Jimeno-Yepes and Aronson, 2010: Knowledge-based biomedical word sense disambiguation: comparison of approaches

slide-83
SLIDE 83

Resources

  • Agirre and Soroa, 2009: Personalizing PageRank for Word Sense Disambiguation
  • Abacha and Zweigenbaum, 2011: Medical Entity Recognition: A Comparison of Semantic and Statistical Methods
