Argument-Level Interactions for Persuasion Comments Evaluation using Co-attention
Lu Ji, Zhongyu Wei, Xiangkun Hu, Yang Liu, Qi Zhang, Xuanjing Huang
COLING 2018, 2018.7.25
Outline
1. Introduction and Motivation
2. Our Approach
3. Experiments
4. Analysis on Co-attention Network
5. Conclusion
Introduction and Motivation
Computational argumentation:
- argument unit detection (Al-Khatib et al., 2016)
- argument structure prediction (Peldszus and Stede, 2015)
- argumentation scheme classification (Feng et al., 2014)
- ...
Online debating forums:
- https://reddit.com/r/changemyview
- http://convinceme.net
- http://debatepedia.idebate.org
Introduction and Motivation
Change My View (CMV) on Reddit.com
Introduction and Motivation
Related Work
- Winning Arguments: Interaction Dynamics and Persuasion Strategies in Good-faith Online Discussions (Tan et al., 2016)
- Is This Post Persuasive? Ranking Argumentative Comments in Online Forum (Wei and Liu, 2016)
Common ground:
- Both construct datasets from the ChangeMyView subreddit.
- Labels are given by the online debating forum.
- Both design interaction-based features and argument-only features at the word level to judge whether a post is persuasive.
Introduction and Motivation
Related Work
- What Makes a Convincing Argument? Empirical Analysis and Detecting Attributes of Convincingness in Web Argumentation (Habernal and Gurevych, 2016)
- Why Can't You Convince Me? Modeling Weaknesses in Unpersuasive Arguments (Persing and Ng, 2017)
Common ground:
- Both construct datasets from online debating forums.
- Both annotate a corpus of debate comments.
- Both design argumentation-based features at the word level to analyze the persuasiveness of arguments.
Introduction and Motivation
Related Work
- Argument-related features are considered only at the word level.
- The interaction between the two participants is largely ignored.
- Our work represents debate comments in units of arguments and studies their interaction at the argument level.
- Our hypothesis: the stronger the interaction, the more persuasive the text.
Introduction and Motivation
Original Post: Philosophy doesn't seem to have any practical applications. [What value does philosophy have in the modern age, right now, aside from contemplating things?] I have read the argument that it is impossible to argue that philosophy is useless without using philosophy. [What do you gain from studying philosophy that could not be gained from thoughtful introspection?]

Positive Reply: What do you gain from studying philosophy that could not be gained from thoughtful introspection? [Two answers. #1 rigor and #2 it saves us from reinventing the wheel.] [Why do you think we should start from scratch in all value decisions rather than seeking to understand the work that has been done in the past?]

Negative Reply: What do you gain from studying philosophy that could not be gained from thoughtful introspection? [Ask yourself the same question about math.] Your argument seems to be that studying philosophy is a waste of time because it has no practical use.

Figure: An example of dialogical argumentation.
System Framework
Figure: Overall architecture of the proposed model. GRU encoders turn the original post (OP) and the reply (R) into argument vectors {r_i^OP} and {r_j^R}; the co-attention network relates the two sides; an aggregation network with GRUs and attention pooling summarizes them into u^OP and O*; and a dense layer combines this with additional features (X_feat) to output the score S(OP, R).
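The argument-vector stage of the pipeline can be sketched in plain numpy. This is a minimal, hypothetical GRU encoder with random untrained parameters; the paper's actual model uses a CNN over words together with a BiGRU, so this only illustrates how a sequence of word vectors is folded into a single argument vector.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h, x, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step: update gate z, reset gate r, candidate state h_tilde."""
    z = sigmoid(Wz @ x + Uz @ h)               # update gate
    r = sigmoid(Wr @ x + Ur @ h)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))   # candidate state
    return (1 - z) * h + z * h_tilde           # interpolate old and new state

def encode_argument(word_vecs, params, d):
    """Run a GRU over an argument's word vectors; the final hidden
    state serves as the argument vector r_i."""
    h = np.zeros(d)
    for x in word_vecs:
        h = gru_step(h, x, *params)
    return h
```

The same encoder would be applied to every argument of the original post and of the reply to produce the sets {r_i^OP} and {r_j^R} fed into the co-attention network.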
System Framework
Figure: The detailed structure of the co-attention network. Post argument vectors {r_i^OP} and reply argument vectors {r_j^R} form an alignment matrix; softmax normalization over it yields the reply argument to post argument attention and the post argument to reply argument attention, while post to reply argument attention weights the reply argument vectors against the whole-post vector u^OP.
System Framework
- Reply argument to post argument attention computes the relevance of each reply argument to every post argument and yields a set of new post representations.
- Post argument to reply argument attention computes the relevance of each post argument to every reply argument and yields a set of new reply representations.
- Post to reply argument attention computes the relevance of each reply argument to the entire post and yields a single new reply representation.
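The three attentions can be sketched as below. This is a minimal numpy illustration; the dot-product alignment score is an assumption here (the paper's exact scoring function may differ), but the direction of each softmax matches the three bullets above.

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def co_attention(P, R):
    """P: (n, d) post argument vectors; R: (m, d) reply argument vectors.
    Both attentions share one alignment matrix A[j, i] = r_j . p_i."""
    A = R @ P.T                       # (m, n) alignment matrix
    # reply argument to post argument attention: each reply argument gets
    # a distribution over post arguments -> new post representations
    P_new = softmax(A, axis=1) @ P    # (m, d)
    # post argument to reply argument attention: each post argument gets
    # a distribution over reply arguments -> new reply representations
    R_new = softmax(A, axis=0).T @ R  # (n, d)
    return P_new, R_new

def post_to_reply(u_op, R):
    """Post to reply argument attention: weight each reply argument by its
    relevance to the whole-post vector u_op and sum into one vector."""
    w = softmax(R @ u_op, axis=0)     # (m,) reply weights
    return w @ R                      # (d,) new reply representation
```

Each softmax produces a proper probability distribution over the attended side, so the new representations are convex combinations of the original argument vectors.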
Experiments
Dataset:
- /r/ChangeMyView subreddit
- 3,456 training instances and 807 test instances (Tan et al., 2016)

Table: The statistics of the datasets used in our experiments.

                 |       Training Set        |         Test Set
                 | Avew  Varw   Avep  Varp   | Avew  Varw   Avep  Varp
Original post    | 10    49.5   14    163.7  | 11    53.2   15    133.7
Positive reply   | 10    46.3   14    125.0  | 10    44.1   13    123.8
Negative reply   | 10    39.2   11    82.0   | 10    44.7   10    69.5

- Avew: average number of words per argument.
- Varw: variance of the number of words per argument.
- Avep: average number of arguments per post.
- Varp: variance of the number of arguments per post.
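These per-argument and per-post statistics follow directly from the segmented corpus; a minimal sketch over hypothetical toy data (the real segmentation is the paper's):

```python
import statistics

# Hypothetical toy corpus: each post is a list of arguments,
# each argument a list of words.
posts = [
    [["philosophy", "has", "no", "use"], ["introspection", "suffices"]],
    [["rigor"], ["do", "not", "reinvent", "the", "wheel"]],
]

words_per_arg = [len(arg) for post in posts for arg in post]
args_per_post = [len(post) for post in posts]

ave_w = statistics.mean(words_per_arg)        # Avew
var_w = statistics.pvariance(words_per_arg)   # Varw (population variance)
ave_p = statistics.mean(args_per_post)        # Avep
var_p = statistics.pvariance(args_per_post)   # Varp
```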
Experiments
Models for Comparison:
- Tan et al. (2016) designed interplay features, argument-related features, and text-style features to predict whether a reply is persuasive.
- WB employs a BiGRU to encode posts at the word level.
- CB employs CNN+BiGRU to encode posts at the argument level.
- WOF uses word-overlap features to evaluate argumentation quality.
- CBCA adds the co-attention network to CB.
- CBWOF adds the word-overlap features to CB.
- CBWOF_I is CBWOF with only the post argument to reply argument attention of the co-attention network.
- CBWOF_II is CBWOF with only the reply argument to post argument attention of the co-attention network.
- CBWOF_III is CBWOF with only the post to reply argument attention of the co-attention network.
- CBCAWOF is our full proposed model.
Experiments
Table: The performance of different approaches on our dataset.

Model                                                | Pair accuracy
Tan et al. (2016)                                    | 65.70
Word-level BiGRU (WB)                                | 61.22
CNN+BiGRU (CB)                                       | 63.34
Word Overlap Features (WOF)                          | 63.59
CNN+BiGRU+Co-Att (CBCA)                              | 66.96
CNN+BiGRU+Word Overlap Features (CBWOF)              | 68.08
CNN+BiGRU+Att_III+Word Overlap Features (CBWOF_III)  | 69.95
CNN+BiGRU+Att_I+Word Overlap Features (CBWOF_I)      | 70.07
CNN+BiGRU+Att_II+Word Overlap Features (CBWOF_II)    | 70.20
CNN+BiGRU+Co-Att+Word Overlap Features (CBCAWOF)     | 70.45*

Bold: best performance; underline: performance of the state-of-the-art method; *: significantly better than Tan et al. (2016), p < 0.01.
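Pair accuracy on this task counts how often the model scores the delta-awarded reply above its counterpart for the same post. A minimal sketch, using a hypothetical word-overlap scorer as a stand-in for the trained model's S(OP, R):

```python
def pair_accuracy(instances, score):
    """instances: (op, positive_reply, negative_reply) triples, where the
    positive reply is the one awarded a delta; score(op, reply) returns a
    persuasiveness score. A pair is correct when the positive reply wins."""
    correct = sum(score(op, pos) > score(op, neg) for op, pos, neg in instances)
    return correct / len(instances)

# Hypothetical toy scorer: word overlap between reply and original post.
def overlap_score(op, reply):
    return len(set(op.split()) & set(reply.split()))

instances = [
    ("philosophy has no practical use",
     "philosophy gives rigor and practical methods",  # delta-awarded reply
     "ask yourself about math"),
]
acc = pair_accuracy(instances, overlap_score)
```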
Analysis on Co-attention Network
- Task definition: for each argument in the reply, identify the arguments in the original post that interact with it.