Identifying Prominent Arguments in Online Debates Using Semantic Textual Similarity

Filip Boltužić and Jan Šnajder
Text Analysis and Knowledge Engineering Lab
FER, University of Zagreb

Second Workshop on Argumentation Mining, NAACL 2015
Denver, Colorado, 4 June 2015

Should marijuana be legalized?
User comment 1: No, because marijuana lessen the brain’s ability for cognitive thinking.
User comment 2: There have been plenty of highway deaths associated with marajuanna use.
User comment 3: The Legalization of marijuana would lower are crime rates in the United States of America by at least 15 to 20
User comment 4: Marijuana is proven to cause depression and change brain patterns in odd ways among other things

Should marijuana be legalized?
[Figures: California marijuana poll; APFR 2014 survey]

Should marijuana be legalized?
No, damages health
  User comment 1: No, because marijuana lessen the brain’s ability for cognitive thinking.
  User comment 4: Marijuana is proven to cause depression and change brain patterns in odd ways among other things

Online Discussions
Online discussions are a growing source of mass opinion
Ways of expressing opinion vary: implicit premises, value judgements, irony

Arguments from opinions
Clustering similar opinions gives an argument
Arguments may be related

Task Description
Identifying Prominent Arguments
Identifying reasonings and opinions to cluster into arguments.
Input:
  1. Noisy comments from online discussions
Output:
  1. Set of argument clusters
  2. Representative argument of each cluster

Related Work
Argumentation mining [Palau and Moens, 2009]
Argument supervised classification
  Argument recognition [Boltužić and Šnajder, 2014]
  Reason classification [Hasan and Ng, 2014]
  Argument tags [Conrad et al., 2012]
Argument unsupervised topic modeling
  Identifying arguing expressions [Trabelsi and Zaïane, 2014]
Stance classification
  Stance on forum posts [Anand et al., 2011]

Outline
  1. Corpus
  2. Model
  3. Evaluation

Corpus
[Hasan and Ng, 2014] annotated threaded debates with arguments
We extract pairs of gold arguments and comments
  Ignoring non-argumentative content
  Sentence-level comments
Comment: Medically speaking marijuana is one of the safest and most effective medications for the widest variety diseases known
Gold argument: Used as a medicine for its positive effects

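Corpus
Majority pro: 2028 comments (65%)
Four topics
  Should gay marriage be legal? (GM)
  Should marijuana be legalized? (MAR)
  Is Obama a good president? (OBA)
  Should abortion be legalized? (ABO)

|            | GM Pro | GM Con | MAR Pro | MAR Con | OBA Pro | OBA Con | ABO Pro | ABO Con |
|------------|--------|--------|---------|---------|---------|---------|---------|---------|
| #Arguments | 5      | 4      | 5       | 5       | 8       | 8       | 7       | 5       |
| #Comments  | 639    | 197    | 585     | 239     | 358     | 272     | 446     | 368     |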
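The "2028 (65%)" figure can be cross-checked against the per-topic comment counts on the corpus slide. The snippet below is only a sanity check; the dictionary layout and variable names are chosen here for illustration and are not from the presentation.

```python
# Pro/con comment counts per topic, copied from the corpus slide.
comments = {
    "GM": (639, 197),
    "MAR": (585, 239),
    "OBA": (358, 272),
    "ABO": (446, 368),
}

pro_total = sum(pro for pro, _ in comments.values())
all_total = sum(pro + con for pro, con in comments.values())
pro_share = pro_total / all_total

# Prints: 2028 3104 65
print(pro_total, all_total, round(100 * pro_share))
```

The pro comments sum to 2028 out of 3104 comments overall, which rounds to the 65% stated on the slide.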

Argument similarity
Vector-space similarity
  Bag-of-words (BoW)
    Inverse sentence frequency weight
  Neural network skip-gram [Mikolov et al., 2013]
    Word-vector sum for sentences
  Cosine distance
Semantic textual similarity (STS) [Šarić et al., 2012]
  Text comparison features
  Output: real-valued similarity score
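As a rough illustration of the bag-of-words variant, the sketch below weights word counts by inverse sentence frequency and compares sentences by cosine distance. The slide only names these ingredients, so the whitespace tokenizer, the log(N / sentence frequency) formula, and all function names are assumptions made here, not the authors' implementation.

```python
import math
from collections import Counter


def tokenize(sentence):
    # Naive lowercase whitespace tokenizer (an assumption for illustration).
    return sentence.lower().split()


def isf_weights(sentences):
    # Inverse sentence frequency: log(N / number of sentences containing the word).
    n = len(sentences)
    sent_freq = Counter(word for s in sentences for word in set(tokenize(s)))
    return {word: math.log(n / count) for word, count in sent_freq.items()}


def bow_vector(sentence, weights):
    # Sparse vector: term count times ISF weight; unknown words are ignored.
    counts = Counter(tokenize(sentence))
    return {w: c * weights[w] for w, c in counts.items() if w in weights}


def cosine_similarity(u, v):
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def distance_matrix(sentences):
    # Pairwise cosine distances (1 - cosine similarity) between all sentences.
    weights = isf_weights(sentences)
    vectors = [bow_vector(s, weights) for s in sentences]
    return [[1.0 - cosine_similarity(a, b) for b in vectors] for a in vectors]


# Toy sentences invented for this example.
toy = [
    "marijuana damages the brain",
    "marijuana causes brain damage",
    "legalization lowers crime rates",
]
dist = distance_matrix(toy)
print(dist[0][1] < dist[0][2])
```

The first two toy sentences share weighted words, so they end up closer to each other than to the third, which shares none. The skip-gram variant would replace `bow_vector` with a sum of pretrained word vectors while keeping the same cosine distance.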

Clustering
Hierarchical agglomerative clustering (HAC) [Xu et al., 2005]
  Input: distance matrix
  Output: hierarchical structures
Linkage criterion
  Complete linkage
  Ward’s method
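To make the clustering step concrete, here is a minimal sketch of agglomerative clustering with complete linkage over a precomputed distance matrix. It is a plain, quadratic-per-merge illustration written for this transcript (the function name and the toy matrix are hypothetical), not the code used in the presented experiments; Ward's method is not shown.

```python
def complete_linkage_hac(dist, n_clusters):
    """Agglomerative clustering with complete linkage.

    dist is a symmetric matrix (list of lists) of pairwise distances.
    Merging stops once n_clusters clusters remain.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Complete linkage: distance between the two farthest members.
        return max(dist[i][j] for i in a for j in b)

    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        # Merge the closest pair of clusters.
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Toy distance matrix: items 0 and 1 are close, as are items 2 and 3.
toy_dist = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.85, 0.9],
    [0.9, 0.85, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.0],
]
print(complete_linkage_hac(toy_dist, 2))  # [[0, 1], [2, 3]]
```

In practice a library routine such as SciPy's `scipy.cluster.hierarchy.linkage`, which supports both `'complete'` and `'ward'` methods, would likely be preferable to a hand-written loop.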

Cluster evaluation
Evaluation metrics
  Comparison against gold corpus labels
  Hierarchical clustering stopping criterion: number of gold labels
Supervised measures
  Adjusted Rand Index (ARI)
  V-measure (V)
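Both measures compare predicted cluster assignments with gold labels. The sketch below follows the standard textbook definitions of ARI and V-measure (with natural-log entropies and equal weighting of homogeneity and completeness); the function names are chosen here, and nothing in it is taken from the authors' evaluation code.

```python
import math
from collections import Counter


def adjusted_rand_index(gold, pred):
    # Pair-counting agreement between two labelings, corrected for chance.
    n = len(gold)
    if n < 2:
        return 1.0

    def pairs(k):
        return k * (k - 1) // 2

    sum_cells = sum(pairs(c) for c in Counter(zip(gold, pred)).values())
    sum_gold = sum(pairs(c) for c in Counter(gold).values())
    sum_pred = sum(pairs(c) for c in Counter(pred).values())
    expected = sum_gold * sum_pred / pairs(n)
    maximum = (sum_gold + sum_pred) / 2
    if maximum == expected:
        return 1.0
    return (sum_cells - expected) / (maximum - expected)


def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    # H(target | given), estimated from label co-occurrence counts.
    n = len(target)
    joint = Counter(zip(target, given))
    given_counts = Counter(given)
    return -sum(c / n * math.log(c / given_counts[g]) for (_, g), c in joint.items())


def v_measure(gold, pred):
    # Harmonic mean of homogeneity and completeness.
    h_gold, h_pred = entropy(gold), entropy(pred)
    homogeneity = 1.0 if h_gold == 0 else 1.0 - conditional_entropy(gold, pred) / h_gold
    completeness = 1.0 if h_pred == 0 else 1.0 - conditional_entropy(pred, gold) / h_pred
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


# A clustering that matches the gold labels up to renaming scores 1.0 on both.
gold = [0, 0, 1, 1]
pred = [1, 1, 0, 0]
print(adjusted_rand_index(gold, pred), v_measure(gold, pred))
```

scikit-learn ships equivalent functions (`adjusted_rand_score` and `v_measure_score` in `sklearn.metrics`), which would be the more usual choice outside a self-contained example.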

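Cluster evaluation

| Model (linkage)            | OBA V | OBA ARI | MAR V | MAR ARI | GM V | GM ARI | ABO V | ABO ARI |
|----------------------------|-------|---------|-------|---------|------|--------|-------|---------|
| STS (Complete)             | .11   | .02     | .05   | .03     | .05  | .01    | .06   | .02     |
| BoW (Complete)             | .15   | .03     | .04   | .00     | .04  | .01    | .04   | .01     |
| BoW (Ward’s)               | .27   | .04     | .17   | .02     | .15  | .04    | .07   | .24     |
| Skip-gram (Complete)       | .21   | .04     | .13   | .02     | .10  | .04    | .20   | .03     |
| Skip-gram (Ward’s)         | .19   | .15     | .23   | .30     | .10  | .25    | .07   | .08     |
| Skip-gram (Ward’s) pro/con | .24   | .08     | .20   | .07     | .25  | .20    | .16   | .07     |

Ward’s linkage: best performance
Word embeddings: best performance
Separating by stance improves performance on two topics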

Clustering quality
Cluster matching
  Manual cluster matching to gold arguments on MAR topic
  Medoid as cluster representative
  Compare medoid to gold label
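The slide does not define how the medoid is computed; under the usual definition it is the cluster member with the smallest total distance to all other members. A minimal sketch under that assumption (the function name and toy matrix are hypothetical):

```python
def medoid(cluster, dist):
    """Return the index in cluster with the smallest summed distance to the rest.

    cluster is a list of item indices; dist is a full pairwise distance matrix.
    """
    return min(cluster, key=lambda i: sum(dist[i][j] for j in cluster))


# Toy distances: item 1 sits between items 0 and 2, so it is the medoid.
toy_dist = [
    [0.0, 0.2, 0.9],
    [0.2, 0.0, 0.3],
    [0.9, 0.3, 0.0],
]
print(medoid([0, 1, 2], toy_dist))  # 1
```

Unlike a centroid, the medoid is always an actual comment from the cluster, which is what makes it usable as a human-readable representative to compare against the gold argument.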

Cluster matching example
Example 1
  Cluster medoid: the economy would get billions of dollars in a new industry if it were legalized (...) no longer would this revenue go directly into the black market.
  Gold argument: Legalized marijuana can be controlled and regulated by the government
Example 2
  Cluster medoid: There are thousands of deaths every year from tobacco and alcohol, yet there has never been a recorded death due to marijuana.
  Gold argument: Does not cause any damage to our bodies

Error analysis
Main problems identified
  Background knowledge
  Idiomatic language
  Grammatical errors
  Fine/coarse arguments

Error analysis: Background knowledge
Comment: Pot is also one of the most high priced exports of Central American Countries and the Carribean
  Not addictive
  Legalized marijuana can be controlled and regulated by the government

Error analysis: Argument granularity
Specific
  Damages our bodies
  Responsible for brain damage
  Damaging our bodies
General
  the economy would get billions of dollars (...) no longer would this revenue go directly into the black market.
    Economy profits
  If the tax on cigarettes can be $5.00/pack imagine what we could tax pot for!
    Tax benefits
  Legalized marijuana can be controlled and regulated by the government

Wrap Up
Baseline unsupervised identification of prominent arguments
  Hierarchical clustering
  Textual similarity measure
  0.15 to 0.30 V-measure
Future work
  Semi-supervised approach
  Argument hierarchy analysis

References

Anand, P., Walker, M., Abbott, R., Tree, J. E. F., Bowmani, R., and Minor, M. (2011). Cats rule and dogs drool!: Classifying stance in online debate. In Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis, pages 1–9.

Boltužić, F. and Šnajder, J. (2014). Back up your stance: Recognizing arguments in online discussions. In Proceedings of the First Workshop on Argumentation Mining, pages 49–58.

Conrad, A., Wiebe, J., et al. (2012). Recognizing arguing subjectivity and argument tags. In Proceedings of the Workshop on Extra-Propositional Aspects of Meaning in Computational Linguistics, pages 80–88.

Hasan, K. S. and Ng, V. (2014). Why are you taking this stance? Identifying and classifying reasons in ideological debates. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 751–762.

Mikolov, T., Chen, K., Corrado, G., and Dean, J. (2013). Efficient estimation of word representations in vector space. In Proceedings of ICLR, Scottsdale, AZ, USA.

Palau, R. M. and Moens, M.-F. (2009). Argumentation mining: The detection, classification and structure of arguments in text. In Proceedings of the 12th International Conference on Artificial Intelligence and Law, pages 98–107. ACM.
