SLIDE 1

Minimally Supervised Classification to Semantic Categories using Automatically Acquired Symmetric Patterns

Roy Schwartz+, Roi Reichart* and Ari Rappoport+

+The Hebrew University, *Technion IIT

COLING 2014

SLIDE 2

Minimally Supervised Classification to Semantic Categories using Automatically Acquired Symmetric Patterns @ Schwartz et al. Image source: http://www.slideshare.net/halucinex/friend-word-map

SLIDE 3

– ... tokens to date, friend lists and recent ...
– ... by my dear friend and companion, Fritz von ...
– ... even have a friend who never fails ...
– ... by my worthy friend Doctor Haygarth of ...
– ... and as a friend pointed out to ...
– ... partner, in-laws, relatives or friends speak a different ...
– ... petition to a friend Go to the ...
– ... otherwise, to a friend or family member ...
– ... images from my friend Rory though - ...
– ... great, and a friend as well as a colleague, who, ...
– …

Examples taken from the ukWaC corpus (Baroni et al., 2009)

The Distributional Similarity (DS) Hypothesis (Harris, 1954)


SLIDE 5

[Figure: word-space diagram with pairwise similarity scores (0.5, 0.76, 0.12, 0.51, ...), illustrating the similarity Θ(friend, colleague)]


SLIDE 7

– ... by my dear friend and companion, Fritz von ...
– ... partner, in-laws, relatives or friends speak a different ...
– ... great, and a friend as well as a colleague, who, ...
– …


SLIDE 9

– friend and companion
– companion and friend
– relatives or friends
– friends or relatives
– friend as well as a colleague
– colleague as well as a friend



SLIDE 12

[Chart: comparison of the Symmetric Patterns, Senna and Brown models]

SLIDE 13

Overview


  • The task

– Minimally supervised semantic classification

  • The method

– Automatically acquired symmetric patterns

  • Results

– Symmetric patterns outperform strong baselines by > 12% accuracy

SLIDE 14

The Task


  • Binary Classification of Nouns into Semantic Categories

– Is “dog” an animal?
– Is “couch” a tool?

  • Use minimal supervision

SLIDE 15

The Task

Example


  • Animals


Dog Cat House Couch Purse Rat Car Mole Chair Hammer Computer Owl Apple Whale

SLIDE 16

The Task

Goal


  • Animals


Dog Cat House Couch Purse Rat Car Mole Chair Hammer Computer Owl Apple Whale

SLIDE 17

Symmetric Patterns Contexts

SLIDE 18

Symmetric Patterns to Word Similarity


  • SXY: the number of times X and Y appeared in the same symmetric pattern

SLIDE 19

Symmetric Patterns to Word Similarity


  • Θ(orange, apple):

1. … apples and oranges …
2. … oranges as well as apples …
…
K. … neither apple nor orange …

  • Θ(orange, apple) = K / Z

– Z: a normalization factor

SLIDE 20

Symmetric Patterns to Word Similarity


  • Θ(France, England):

1. … England or France …
2. … from France to England …
…
M. … England and France …

  • Θ(France, England) = M / Z

– Z: a normalization factor
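This counting scheme can be sketched in a few lines. This is a toy illustration, not the paper's implementation: the corpus, the pattern set, and the normalization Z (here simply the total pair count) are all simplified stand-ins.

```python
import re
from collections import Counter

# Hypothetical mini-corpus and a hand-picked subset of patterns;
# the real method uses automatically acquired patterns over a large corpus.
CORPUS = [
    "apples and oranges", "oranges as well as apples",
    "England or France", "England and France",
]
PATTERNS = [
    re.compile(r"(\w+) and (\w+)"),
    re.compile(r"(\w+) or (\w+)"),
    re.compile(r"(\w+) as well as (\w+)"),
]

# Count how often each unordered word pair fills the same symmetric pattern.
pair_counts = Counter()
for sentence in CORPUS:
    for pattern in PATTERNS:
        for x, y in pattern.findall(sentence):
            pair_counts[frozenset((x, y))] += 1

def theta(x, y):
    """Θ(x, y) = K / Z: joint pattern count over a normalization factor.
    Z here is just the total count, a placeholder for the paper's Z."""
    return pair_counts[frozenset((x, y))] / sum(pair_counts.values())

print(theta("apples", "oranges"))  # 0.5  (K = 2, Z = 4)
```

Note that the pair is stored unordered (`frozenset`), so "apples and oranges" and "oranges as well as apples" contribute to the same count.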

SLIDE 21

Symmetric Patterns


SLIDE 23

Automatically Extracted Symmetric Patterns

The Davidov and Rappoport (2006) Algorithm


  • A graph-based algorithm

– Input: a corpus of plain text
– Output: a set of symmetric patterns

SLIDE 24

Automatically Extracted Symmetric Patterns

The Davidov and Rappoport (2006) Algorithm


  • The idea: search for patterns with interchangeable word pairs

– For each pattern candidate, compute a symmetry measure (M)
– Select the patterns with the highest M values

SLIDE 25

Automatically Extracted Symmetric Patterns

The Davidov and Rappoport (2006) Algorithm


  • The M measure counts the proportion of pattern instances that appear in both directions (“cat and dog” + “dog and cat”)

– See paper for more details

  • High M value → a symmetric pattern
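A simplified version of such a symmetry check can be sketched as follows. The exact M measure of Davidov and Rappoport (2006) differs, so the instance lists and the proportion used here are illustrative assumptions.

```python
from collections import Counter

def symmetry_measure(instances):
    """Fraction of ordered pattern instances (x, y) whose reverse (y, x)
    also occurs in the corpus — a simplified proxy for the M measure."""
    counts = Counter(instances)
    both = sum(c for (x, y), c in counts.items() if counts[(y, x)] > 0)
    return both / sum(counts.values())

# "X and Y" looks symmetric: pairs tend to occur in both orders.
and_instances = [("cat", "dog"), ("dog", "cat"), ("bread", "butter")]
# "X of Y" does not: reversed pairs are rare.
of_instances = [("king", "england"), ("cup", "tea"), ("top", "hill")]

print(symmetry_measure(and_instances))  # 2/3: two of three instances reversed
print(symmetry_measure(of_instances))   # 0.0: no instance reversed
```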
SLIDE 26

Automatically Extracted Symmetric Patterns

The Davidov and Rappoport (2006) Algorithm


  • Twenty symmetric patterns are extracted

– “X and Y”, “X or Y”
– “X and the Y”, “X rather than Y”, “X versus Y”



SLIDE 29

Word Similarity Measures

SXY  Similarity Between Words X and Y


  • Symmetric patterns

– Extract a set of symmetric patterns from plain text
– SXY: the number of times X and Y participate in the same symmetric pattern

  • Baselines:

– Senna word embeddings (Collobert et al., 2011): SXY = cosine similarity between the word embeddings of X and Y
– Brown clusters (Brown et al., 1992): SXY = 1 − tree distance between the clusters of X and Y
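The two baseline scores can be sketched as follows. The cosine part is standard; the Brown-cluster part is a guess at the slide's "1 − tree distance", normalizing the path distance between two cluster bit strings — an assumption, not the paper's exact formula.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense word vectors (e.g. Senna)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def tree_similarity(path_x, path_y):
    """Brown clusters are leaves of a binary tree, written as bit strings.
    Return 1 - (tree distance / max possible distance) — an assumed
    normalization of the '1 - tree distance' score on the slide."""
    # Length of the common prefix = depth of the lowest common ancestor.
    lca = 0
    for a, b in zip(path_x, path_y):
        if a != b:
            break
        lca += 1
    dist = (len(path_x) - lca) + (len(path_y) - lca)
    return 1 - dist / (len(path_x) + len(path_y))

print(cosine([1.0, 2.0], [2.0, 4.0]))      # 1.0: parallel vectors
print(tree_similarity("0010", "0011"))     # 0.75: siblings in the tree
print(tree_similarity("0010", "1100"))     # 0.0: maximally distant leaves
```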

SLIDE 30

Word Classification


  • Reminder: Our Task

– Minimally-supervised semantic word classification

SLIDE 31

Word Classification


  • An undirected weighted graph

– Nodes are words
– Edge weights are word similarity scores

SLIDE 32

Word Classification


  • Goal: propagate labels from a few labeled seed words to the rest of the graph

SLIDE 33

Label Propagation Algorithms


  • Iterative variant of k-Nearest Neighbors

– See paper for details

  • Baselines

– Normalized graph cut algorithm (Yu and Shi, 2003)
– Modified Adsorption (MAD) algorithm (Talukdar and Crammer, 2009)
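A minimal sketch of iterative k-NN label propagation, assuming only a similarity function over words; the toy 1-D positions and the majority-vote update are illustrative simplifications, not the paper's exact variant.

```python
from collections import Counter

def propagate_labels(words, seeds, sim, k=3, max_iters=10):
    """Iteratively give each unlabeled word the majority label of its k most
    similar currently-labeled neighbors; seed labels stay fixed."""
    labels = dict(seeds)
    for _ in range(max_iters):
        changed = False
        for w in words:
            if w in seeds:
                continue  # never overwrite a seed
            neighbours = [v for v in words if v != w and v in labels]
            if not neighbours:
                continue
            nearest = sorted(neighbours, key=lambda v: sim(w, v), reverse=True)[:k]
            votes = Counter(labels[v] for v in nearest)
            new_label = votes.most_common(1)[0][0]
            if labels.get(w) != new_label:
                labels[w] = new_label
                changed = True
        if not changed:
            break  # converged
    return labels

# Toy graph: similarity derived from hand-made 1-D positions (illustrative only).
POS = {"dog": 0, "cat": 1, "rat": 2, "car": 9, "bus": 10}
sim = lambda a, b: -abs(POS[a] - POS[b])
result = propagate_labels(list(POS), {"dog": "animal", "bus": "vehicle"}, sim, k=1)
print(result)  # cat and rat join "animal"; car joins "vehicle"
```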

SLIDE 34

Experimental Setup

Experiments


  • A subset of the CSLB property norms dataset (Devereux et al., 2013)

– 450 concrete nouns
– Thirty human annotators assigned each noun to semantic categories
– animals, tools, food, clothes

  • Symmetric pattern based scores computed using the Google Books n-gram corpus

  • Number of labeled seed words

– 4, 10, 20, 40


SLIDE 36


Results

Word Similarity Measures

The best symmetric patterns model outperforms every other model by 12.5% accuracy and 0.13 F1 points.

SLIDE 37


More Results

  • When using as few as four labeled seed words

– Accuracy results are 82-94%
– F1 scores are 0.64-0.86

  • Symmetric patterns are superior to the other word similarity measures across

– semantic categories
– label propagation algorithms
– labeled seed set sizes
– evaluation measures

SLIDE 38


Symmetric Patterns

  • Interpretable
  • Efficient to compute

– A count model, no vector or matrix computation

  • Capture a different signal than bag-of-words or word n-gram models

SLIDE 39


Future Work

  • Integrating symmetric pattern information into deep network models

– Enhancing bag-of-words models with symmetric pattern information
– Integrating word embeddings with symmetric pattern based vectors

SLIDE 40


Summary

  • Minimally supervised semantic classification
  • Symmetric patterns are an effective way to compute word similarity

– And they can be extracted automatically from plain text!

  • 82%-94% accuracy using only four labeled examples per category

– A 12.5% accuracy improvement over strong baselines

SLIDE 41

roys02@cs.huji.ac.il http://www.cs.huji.ac.il/~roys02/
