
Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Marti A. Hearst, UC Berkeley


  1. Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample
     Marti A. Hearst, UC Berkeley
     Marti Hearst, NYU Semantics ’08

     Statistical Approaches
     ► An alternative to hand-coded meaning.
     ► Solve sub-problems first, e.g., acquiring semantic relations.

  2. Tricks I Like
     ► Unambiguous Cues
     ► Lots o’ Text
     ► Rewrite and Verify

     Trick: Lots o’ Text
     ► Idea: words in the same syntactic context are semantically related.
        ▪ Hindle, ACL ’90, “Noun classification from predicate-argument structure.”
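
The idea on this slide can be illustrated with a minimal sketch. It assumes toy (verb, object) pairs invented for illustration and uses plain cosine similarity over verb-context counts, which is a simplification and not the measure used in Hindle's paper. The names `PAIRS`, `context_vectors`, and `cosine` are hypothetical.

```python
from collections import Counter
from math import sqrt

# Toy (verb, object-noun) pairs; invented for illustration only.
PAIRS = [
    ("drink", "wine"), ("drink", "beer"), ("pour", "wine"), ("pour", "beer"),
    ("sip", "wine"), ("drive", "car"), ("park", "car"), ("drive", "truck"),
    ("park", "truck"), ("drink", "water"), ("pour", "water"),
]

def context_vectors(pairs):
    """Map each noun to a Counter of the verbs it appears as the object of."""
    vectors = {}
    for verb, noun in pairs:
        vectors.setdefault(noun, Counter())[verb] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

vectors = context_vectors(PAIRS)
sim_wine_beer = cosine(vectors["wine"], vectors["beer"])  # shared verbs: drink, pour
sim_wine_car = cosine(vectors["wine"], vectors["car"])    # no shared verbs
```

In this toy data, nouns that share verbs ("wine", "beer") score higher than nouns that share none ("wine", "car").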

  3. Trick: Lots o’ Text
     ► Idea: words in the same syntactic context are semantically related.
        ▪ Nakov & Hearst, ACL/HLT ’08, “Solving Relational Similarity Problems Using the Web as a Corpus”

     Trick: Lots o’ Text
     ► Idea: bigger is better than smarter!
        ▪ Banko & Brill, ACL ’01, “Scaling to Very, Very Large Corpora for Natural Language Disambiguation”

  4. Trick: Lots o’ Text
     ► Idea: apply web-scale n-grams to every problem imaginable.
        ▪ Lapata & Keller, HLT/NAACL ’04, “Web as a Baseline: Evaluating the Performance of Unsupervised Web-Based Models for a Range of NLP Tasks”
     Results table (columns: = supervised, > supervised) covering: MT candidate selection, noun compound bracketing, article suggestion, adjective ordering, noun compound interpretation.

     Limitation
     ► Sometimes counts alone are too ambiguous.
     Solution
     ► Bootstrap from unambiguous contexts.
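
As a rough sketch of the count-based idea from the n-gram slide, the snippet below picks an adjective order (one of the listed tasks) by comparing n-gram frequencies. The counts in `NGRAM_COUNTS` are invented stand-ins for web-scale frequencies, and `preferred_order` is a hypothetical helper, not the model from the cited paper.

```python
# Hypothetical bigram counts standing in for web-scale n-gram frequencies.
NGRAM_COUNTS = {
    ("big", "red"): 1200,
    ("red", "big"): 40,
    ("old", "wooden"): 900,
    ("wooden", "old"): 15,
}

def preferred_order(adj_a, adj_b, counts=NGRAM_COUNTS):
    """Return the adjective pair in whichever order is more frequent.

    Ties (including two unseen orders) keep the input order, which is
    exactly the case where counts alone are too ambiguous.
    """
    forward = counts.get((adj_a, adj_b), 0)
    backward = counts.get((adj_b, adj_a), 0)
    return (adj_a, adj_b) if forward >= backward else (adj_b, adj_a)
```

For example, `preferred_order("red", "big")` returns `("big", "red")` under these assumed counts.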

  5. Trick: Use Unambiguous Context
     ► … to build statistics for ambiguous contexts.
        ▪ Hindle & Rooth, ACL ’91, “Structural Ambiguity and Lexical Relations”
     Example: PP attachment
        I eat spaghetti with sauce.
     Bootstrap from unambiguous contexts:
        Spaghetti with sauce is delicious.
        I eat with a fork.

     Trick: Use Unambiguous Context
     ► … to identify semantic relations (lexico-syntactic contexts)
        ▪ Hearst, COLING ’92, “Automatic Acquisition of Hyponyms from Large Text Corpora”
     Example: Hyponym Identification
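
The PP attachment example can be sketched as follows. This is a simplified illustration, assuming a handful of invented unambiguous observations and a smoothed estimate of P(prep | head); it is not the lexical association score from the cited paper. `UNAMBIGUOUS`, `association`, and `attach` are hypothetical names.

```python
# Toy unambiguous attachments, invented for illustration:
# ("noun", head, prep): the PP can only modify the noun (e.g. subject position).
# ("verb", head, prep): there is no object noun for the PP to attach to.
UNAMBIGUOUS = [
    ("noun", "spaghetti", "with"),   # "Spaghetti with sauce is delicious."
    ("verb", "eat", "with"),         # "I eat with a fork."
    ("noun", "spaghetti", "with"),
    ("noun", "spaghetti", "with"),
    ("verb", "eat", "at"),
    ("verb", "eat", "in"),
    ("noun", "spaghetti", "from"),
]

def association(kind, head, prep, data=UNAMBIGUOUS):
    """Estimate P(prep | head) from unambiguous cases, with add-one smoothing."""
    head_total = sum(1 for k, h, _ in data if k == kind and h == head)
    with_prep = sum(1 for k, h, p in data if k == kind and h == head and p == prep)
    preps = {p for _, _, p in data}
    return (with_prep + 1) / (head_total + len(preps))

def attach(verb, noun, prep):
    """Choose the attachment site whose head is more associated with prep."""
    noun_score = association("noun", noun, prep)
    verb_score = association("verb", verb, prep)
    return "noun" if noun_score > verb_score else "verb"
```

With these toy counts, `attach("eat", "spaghetti", "with")` prefers the noun, matching the reading of "I eat spaghetti with sauce"; the outcome depends entirely on the assumed data.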

  6. Combine Tricks 1 and 2

     Trick: Use Unambiguous Contexts + Lots o’ Text
     ► Combine lexico-syntactic patterns with occurrence counts.
        ▪ Kozareva, Riloff, Hovy, HLT-ACL ’08, “Semantic Class Learning from the Web with Hyponym Pattern Linkage Graphs”.
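
A minimal sketch of combining a lexico-syntactic pattern with occurrence counts is shown below. It assumes a single "such as" pattern and a toy corpus invented for illustration; it is not the pattern linkage graph method of the cited paper. The threshold of 2 and the names `CORPUS`, `PATTERN`, and `count_hyponym_pairs` are assumptions.

```python
import re
from collections import Counter

# Toy corpus invented for illustration.
CORPUS = [
    "He plays instruments such as guitars and drums.",
    "Fruits such as apples are cheap.",
    "She likes fruits such as apples and pears.",
    "They sell instruments such as guitars.",
]

# One lexico-syntactic pattern: "<hypernym> such as <hyponym> [and <hyponym>]"
PATTERN = re.compile(r"(\w+) such as (\w+)(?: and (\w+))?", re.IGNORECASE)

def count_hyponym_pairs(sentences):
    """Count (hyponym, hypernym) pairs matched by the pattern."""
    counts = Counter()
    for sentence in sentences:
        for match in PATTERN.finditer(sentence):
            hypernym = match.group(1).lower()
            for hyponym in match.groups()[1:]:
                if hyponym:  # the second hyponym slot is optional
                    counts[(hyponym.lower(), hypernym)] += 1
    return counts

pair_counts = count_hyponym_pairs(CORPUS)
# Keep only pairs seen at least twice (an assumed threshold).
reliable = {pair for pair, n in pair_counts.items() if n >= 2}
```

The pattern proposes candidate pairs and the counts filter them; on larger text the threshold would need tuning.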

  7. Trick: Use Unambiguous Contexts + Lots o’ Text
     ► Combine (usually) unambiguous surface patterns with occurrence counts.
        ▪ Nakov & Hearst, HLT/EMNLP ’05, “Using the Web as an Implicit Training Set: Application to Structural Ambiguity Resolution”.
     Left dash: cell-cycle analysis → left
     Possessive marker: brain’s stem cell → right
     Parentheses: growth factor (beta) → left
     Punctuation: health care, provider → left
     Abbreviation: tum. necr. (TN) factor → right
     Concatenation: healthcare reform → left

     Trick: Use Unambiguous Contexts + Lots o’ Text
     ► Identify a “protagonist” in each text to learn narrative structure
        ▪ Chambers & Jurafsky, ACL ’08, “Unsupervised Learning of Narrative Event Chains”.
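
The surface-pattern slide (dash, possessive, parentheses, concatenation) can be sketched as a simple voting scheme for bracketing a three-word noun compound. The snippets below are invented stand-ins for web text, only four of the listed cues are covered, and `surface_cues` and `bracket` are hypothetical helpers rather than the authors' implementation.

```python
# Hypothetical snippets standing in for web text.
SNIPPETS = [
    "cell-cycle analysis of yeast",
    "the cell-cycle analysis showed",
    "healthcare reform bill",
    "the brain's stem cell niche",
]

def surface_cues(w1, w2, w3):
    """Map each surface variant of 'w1 w2 w3' to the bracketing it suggests."""
    return {
        f"{w1}-{w2} {w3}": "left",      # left dash
        f"{w1}{w2} {w3}": "left",       # concatenation
        f"{w1} {w2} ({w3})": "left",    # parentheses
        f"{w1}'s {w2} {w3}": "right",   # possessive marker
    }

def bracket(w1, w2, w3, snippets=SNIPPETS):
    """Vote for left or right bracketing by counting cue occurrences."""
    votes = {"left": 0, "right": 0}
    for variant, side in surface_cues(w1, w2, w3).items():
        votes[side] += sum(s.count(variant) for s in snippets)
    if votes["left"] == votes["right"]:
        return None  # no evidence either way
    return max(votes, key=votes.get)
```

Each cue is only "usually" unambiguous, so summing counts over many occurrences is what makes the vote usable.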

  8. Trick 3: Rewrite & Verify

     Trick: Rewrite & Verify
     ► Check if alternatives exist in text
        ▪ Nakov & Hearst, HLT/EMNLP ’05, “Using the Web as an Implicit Training Set: Application to Structural Ambiguity Resolution”.
        ▪ Example: NP bracketing
        ▪ Prepositional
           ► stem cells in the brain → right
           ► stem cells from the brain → right
           ► cells from the brain stem → left
        ▪ Verbal
           ► virus causing human immunodeficiency → left
           ► pain associated with arthritis migraine → left
        ▪ Copula
           ► office building that is a skyscraper → right
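
The prepositional case above can be sketched as: generate paraphrases of a compound "w1 w2 w3", then check which ones occur in text. This is a minimal illustration assuming a tiny invented text and only the two prepositions shown on the slide; `TEXT`, `paraphrases`, and `verify` are hypothetical names.

```python
PREPOSITIONS = ("in", "from")

# Hypothetical text standing in for a web-scale corpus.
TEXT = (
    "stem cells in the brain can divide. "
    "researchers isolated stem cells from the brain. "
    "cells from the brain stem were also studied."
)

def paraphrases(w1, w2, w3):
    """Yield (paraphrase, bracketing) pairs for the compound 'w1 w2 w3'."""
    for prep in PREPOSITIONS:
        yield f"{w2} {w3} {prep} the {w1}", "right"   # supports [w1 [w2 w3]]
        yield f"{w3} {prep} the {w1} {w2}", "left"    # supports [[w1 w2] w3]

def verify(w1, w2, w3, text=TEXT):
    """Count how often each bracketing's paraphrases occur in the text."""
    votes = {"left": 0, "right": 0}
    for phrase, side in paraphrases(w1, w2, w3):
        votes[side] += text.count(phrase)
    return votes
```

Calling `verify("brain", "stem", "cells")` tallies evidence for both readings; with real data the counts would come from a search engine or a large corpus rather than a string.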

  9. Towards New Approaches to Semantic Analysis

     Ideas
     ► Inducing Semantic Grammars
        ▪ Boggess, Agarwal, & Davis, AAAI ’91, “Disambiguation of Prepositional Phrases in Automatically Labelled Technical Text”

  10. Ideas
     ► Use Cognitive Linguistics
        ▪ Hearst, ’90, ’92, “Direction-Based Text Interpretation”.
        ▪ Talmy’s Force Dynamics + Reddy’s Conduit Metaphor
        ▪ Path Model
        ▪ Solves: Was the person in favor of or opposed to the idea?

     Using Cognitive Linguistics
     ► Talmy’s Theory of Force Dynamics
        ▪ Talmy, “Force Dynamics in Language and Thought,” in Parasession on Causatives and Agentivity, Chicago Linguistic Society 1985.
        ▪ Describes how the interaction of agents with respect to force is lexically and grammatically expressed.
        ▪ Posits two opposing entities: Agonist and Antagonist.
        ▪ Each entity expresses an intrinsic force: towards rest or motion.
        ▪ The balance of the strengths of the entities determines the outcome of the event.
     ► Grammatical expression includes using a clause headed by “despite” to express a weaker antagonist.
