Argument-Level Interactions for Persuasion Comments Evaluation using Co-attention
Lu Ji, Zhongyu Wei, Xiangkun Hu, Yang Liu, Qi Zhang, Xuanjing Huang
COLING 2018, 2018.7.25

SLIDE 1

Argument-Level Interactions for Persuasion Comments Evaluation using Co-attention

Lu Ji, Zhongyu Wei, Xiangkun Hu, Yang Liu, Qi Zhang, Xuanjing Huang
Frontiers of Natural Language Processing Technology Seminar and COLING 2018 Paper Pre-presentation Session, 2018.7.25

SLIDE 2

Outline

1. Introduction and Motivation
2. Our Approach
3. Experiments
4. Analysis on Co-attention Network
5. Conclusion

SLIDE 3

Introduction and Motivation

Computational argumentation:
• argument unit detection (Al-Khatib et al., 2016)
• argument structure prediction (Peldszus and Stede, 2015)
• argumentation scheme classification (Feng et al., 2014)
• ...

Online debating forums:
• https://reddit.com/r/changemyview
• http://convinceme.net
• http://debatepedia.idebate.org

SLIDE 4

Introduction and Motivation

Change My View (CMV) on Reddit.com

SLIDE 5

Introduction and Motivation

Related Work

• Winning Arguments: Interaction Dynamics and Persuasion Strategies in Good-faith Online Discussions (Tan et al., 2016)
• Is This Post Persuasive? Ranking Argumentative Comments in Online Forum (Wei and Liu, 2016)

Common ground:
• Both construct datasets from the ChangeMyView subreddit.
• Labels are given by the online debating forum.
• Both design interaction-based and argument-only features at the word level to judge whether a post is persuasive.

SLIDE 6

Introduction and Motivation

Related Work

• What Makes a Convincing Argument? Empirical Analysis and Detecting Attributes of Convincingness in Web Argumentation (Habernal and Gurevych, 2016)
• Why Can't You Convince Me? Modeling Weaknesses in Unpersuasive Arguments (Persing and Ng, 2017)

Common ground:
• Both construct datasets from online debating forums.
• Both annotate a corpus of debate comments.
• Both design argumentation-based features at the word level to analyze the persuasiveness of arguments.

SLIDE 7

Introduction and Motivation

Related Work

• Argument-related features are considered at the word level.
• The interaction between the two participants is largely ignored.
• Our work represents the debate comments in units of arguments and studies their interaction at the argument level.
• The stronger the interaction, the stronger the persuasiveness of the text.

SLIDE 8

Introduction and Motivation

Original Post: Philosophy doesn't seem to have any practical applications. [What value does philosophy have in the modern age, right now, aside from contemplating things?] I have read the argument that it is impossible to argue that philosophy is useless without using philosophy. [What do you gain from studying philosophy that could not be gained from thoughtful introspection?]

Positive Reply: What do you gain from studying philosophy that could not be gained from thoughtful introspection? [Two answers. #1 rigor and #2 it saves us from reinventing the wheel.] [Why do you think we should start from scratch in all value decisions rather than seeking to understand the work that has been done in the past?]

Negative Reply: What do you gain from studying philosophy that could not be gained from thoughtful introspection? [Ask yourself the same question about math.] Your argument seems to be that studying philosophy is a waste of time because it has no practical use.

Figure: An example of dialogical argumentation.

SLIDE 9

Outline

1. Introduction and Motivation
2. Our Approach
3. Experiments
4. Analysis on Co-attention Network
5. Conclusion

SLIDE 10

System Framework

Overall architecture of the proposed model.

SLIDE 11

System Framework

Figure: Overall architecture of the proposed model. The arguments of the original post {r_i^OP}, i = 1..n, and of the reply {r_j^R}, j = 1..m, are encoded with GRUs; the co-attention network relates the two sides at the argument level; an aggregation network and attention pooling produce the output vector o*, which is concatenated with the features X_feat and fed to a dense layer to yield the score S(OP, R).
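The figure itself is not reproduced here, so as a rough illustration of the data flow it depicts, the following PyTorch-style sketch wires the stages together: GRU encoding of the argument sequences, an argument-level cross-attention (simplified here to a single dot-product direction; the full three-way co-attention is sketched under SLIDE 14), an aggregation GRU, attention pooling, and a dense layer over the pooled vector concatenated with X_feat. All layer sizes and the alignment function are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class PersuasivenessScorer(nn.Module):
    """Rough sketch of the pipeline: argument vectors -> BiGRU -> co-attention
    -> aggregation -> attention pooling -> dense score S(OP, R).
    Layer sizes and the dot-product alignment are assumptions for illustration."""

    def __init__(self, arg_dim=128, hid=64, n_feat=10):
        super().__init__()
        self.encoder = nn.GRU(arg_dim, hid, batch_first=True, bidirectional=True)
        self.aggregator = nn.GRU(4 * hid, hid, batch_first=True, bidirectional=True)
        self.pool = nn.Linear(2 * hid, 1)            # attention-pooling scores
        self.out = nn.Linear(2 * hid + n_feat, 1)    # dense layer over pooled vector + X_feat

    def forward(self, op_args, reply_args, x_feat):
        # op_args: (B, n, arg_dim) argument vectors of the original post
        # reply_args: (B, m, arg_dim) argument vectors of the reply
        H_op, _ = self.encoder(op_args)              # (B, n, 2*hid)
        H_r, _ = self.encoder(reply_args)            # (B, m, 2*hid)

        # simplified argument-level interaction: each reply argument attends over post arguments
        A = torch.bmm(H_r, H_op.transpose(1, 2))                      # alignment matrix (B, m, n)
        op_for_reply = torch.bmm(torch.softmax(A, dim=-1), H_op)      # (B, m, 2*hid)

        U, _ = self.aggregator(torch.cat([H_r, op_for_reply], dim=-1))  # (B, m, 2*hid)

        # attention pooling over the reply arguments
        w = torch.softmax(self.pool(U).squeeze(-1), dim=-1)           # (B, m)
        o = torch.bmm(w.unsqueeze(1), U).squeeze(1)                   # (B, 2*hid)

        return self.out(torch.cat([o, x_feat], dim=-1)).squeeze(-1)   # S(OP, R)
```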

SLIDE 13

System Framework

Figure: The detailed structure of the co-attention network. An alignment matrix between the post arguments r_1^OP, r_2^OP, ..., r_n^OP and the reply arguments r_1^R, r_2^R, ..., r_m^R is normalized with softmax to give the reply-argument-to-post-argument attention and the post-argument-to-reply-argument attention; in addition, the post-to-reply argument attention weights the reply arguments by their relevance to the whole-post vector u^OP.

SLIDE 14

System Framework

• Reply-argument-to-post-argument attention computes the relevance of each reply argument to every post argument and obtains a set of new post representations.
• Post-argument-to-reply-argument attention computes the relevance of each post argument to every reply argument and helps learn a set of new reply representations.
• Post-to-reply argument attention computes the relevance of each reply argument to the entire post, which contributes to learning a new reply representation.
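A minimal sketch of these three attentions, assuming dot-product alignment scores and already-encoded argument representations (the exact alignment function, the dimensions, and the way u^OP is computed are not specified here and may differ in the paper):

```python
import torch

def co_attention(H_op, H_r, u_op):
    """Sketch of the three attentions described above (dot-product alignment is an assumption).
    H_op: (B, n, d) post argument representations
    H_r:  (B, m, d) reply argument representations
    u_op: (B, d)    a single vector summarizing the whole original post
    """
    A = torch.bmm(H_r, H_op.transpose(1, 2))                 # alignment matrix, (B, m, n)

    # 1) reply-argument-to-post-argument attention:
    #    each reply argument attends over all post arguments -> new post representations
    new_post = torch.bmm(torch.softmax(A, dim=-1), H_op)     # (B, m, d)

    # 2) post-argument-to-reply-argument attention:
    #    each post argument attends over all reply arguments -> new reply representations
    new_reply = torch.bmm(torch.softmax(A.transpose(1, 2), dim=-1), H_r)   # (B, n, d)

    # 3) post-to-reply argument attention:
    #    the whole-post vector attends over the reply arguments -> one new reply representation
    w = torch.softmax(torch.bmm(H_r, u_op.unsqueeze(-1)).squeeze(-1), dim=-1)   # (B, m)
    reply_summary = torch.bmm(w.unsqueeze(1), H_r).squeeze(1)                   # (B, d)

    return new_post, new_reply, reply_summary
```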

SLIDE 18

System Framework

Overall architecture of the proposed model.

SLIDE 22

Outline

1. Introduction and Motivation
2. Our Approach
3. Experiments
4. Analysis on Co-attention Network
5. Conclusion

SLIDE 23

Experiments

Dataset:
• /r/ChangeMyView subreddit
• 3,456 training instances and 807 testing instances (Tan et al., 2016)

Table: The statistics of the datasets used in our experiments.

|                | Avew (train) | Varw (train) | Avep (train) | Varp (train) | Avew (test) | Varw (test) | Avep (test) | Varp (test) |
|----------------|--------------|--------------|--------------|--------------|-------------|-------------|-------------|-------------|
| Original post  | 10           | 49.5         | 14           | 163.7        | 11          | 53.2        | 15          | 133.7       |
| Positive reply | 10           | 46.3         | 14           | 125.0        | 10          | 44.1        | 13          | 123.8       |
| Negative reply | 10           | 39.2         | 11           | 82.0         | 10          | 44.7        | 10          | 69.5        |

  • Avew represents the average number of words per argument.
  • Avep represents the average number of arguments per post.
  • Varw indicates the variance of the number of words per argument.
  • Varp indicates the variance of the number of arguments per post.
SLIDE 24

Experiments

Models for comparison:
• Tan et al. (2016): designed interplay features, argument-related features, and text-style features to predict whether a reply is persuasive.
• WB: employs a BiGRU to encode posts at the word level.
• CB: employs CNN+BiGRU to encode posts at the argument level.
• WOF: uses the word-overlap features to evaluate argumentation quality.
• CBCA: adds the co-attention network to the model CB.
• CBWOF: adds the word-overlap features to the model CB.
• CBWOF_I: CBWOF with the post-argument-to-reply-argument attention of the co-attention network.
• CBWOF_II: CBWOF with the reply-argument-to-post-argument attention of the co-attention network.
• CBWOF_III: CBWOF with the post-to-reply argument attention of the co-attention network.
• CBCAWOF: our proposed model.
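The slide does not spell out which overlap statistics the word-overlap features (WOF) contain, so the snippet below is only a plausible illustration of the idea; the function name and the three ratios are hypothetical, not the paper's actual feature set.

```python
import re

def word_overlap_features(op_text, reply_text, stopwords=frozenset()):
    """Hypothetical word-overlap features between an original post and a reply."""
    def content_words(text):
        return {w for w in re.findall(r"[a-z']+", text.lower()) if w not in stopwords}

    op_words, reply_words = content_words(op_text), content_words(reply_text)
    common = op_words & reply_words
    return {
        "jaccard": len(common) / max(len(op_words | reply_words), 1),
        "frac_reply_in_op": len(common) / max(len(reply_words), 1),
        "frac_op_in_reply": len(common) / max(len(op_words), 1),
    }
```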

SLIDE 25

Experiments

Table: The performance of different approaches on our datasets.

| Model                                                | Pair accuracy |
|------------------------------------------------------|---------------|
| Tan et al. (2016)                                    | 65.70         |
| Word-level BiGRU (WB)                                | 61.22         |
| CNN+BiGRU (CB)                                       | 63.34         |
| Word Overlap Features (WOF)                          | 63.59         |
| CNN+BiGRU+Co-Att (CBCA)                              | 66.96         |
| CNN+BiGRU+Word Overlap Features (CBWOF)              | 68.08         |
| CNN+BiGRU+Att_III+Word Overlap Features (CBWOF_III)  | 69.95         |
| CNN+BiGRU+Att_I+Word Overlap Features (CBWOF_I)      | 70.07         |
| CNN+BiGRU+Att_II+Word Overlap Features (CBWOF_II)    | 70.20         |
| CNN+BiGRU+Co-Att+Word Overlap Features (CBCAWOF)     | 70.45*        |

Bold: best performance; underline: performance of the state-of-the-art method; *: significantly better than Tan et al.,2016 (p < 0.01).
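Pair accuracy follows from the pairwise setup: each test instance is a (positive reply, negative reply) pair for the same original post, and the pair counts as correct when the model scores the persuasive reply higher. A tiny sketch, assuming ties count as errors:

```python
def pair_accuracy(score_pairs):
    """score_pairs: iterable of (positive_reply_score, negative_reply_score) tuples,
    one per (original post, positive reply, negative reply) test triple."""
    pairs = list(score_pairs)
    correct = sum(1 for pos, neg in pairs if pos > neg)   # strictly higher score for the positive reply
    return correct / len(pairs)

# pair_accuracy([(0.8, 0.3), (0.4, 0.6), (0.7, 0.1)])  ->  2/3
```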

SLIDE 29

Outline

1. Introduction and Motivation
2. Our Approach
3. Experiments
4. Analysis on Co-attention Network
5. Conclusion

SLIDE 30

Interactive argument pair extraction

• Task definition: for each argument in the reply, identify the arguments in the original post that interact with it.
• Dataset construction: we sample 50 triples of the form (original post, positive reply, negative reply) from the training set and split them into 100 post-reply pairs.
• Dataset annotation: 365 argument pairs are identified by three annotators.

SLIDE 31

Interactive argument pair extraction

• Co-attention Network (CN): interactive argument pairs extracted based on the results of our co-attention network.
• Word-overlap Similarity (WS): interactive argument pairs extracted based on the word-overlap similarity between argument pairs.

Table: Experimental results of WS and CN on the self-constructed dataset.

| Model | P@1   | P@2   | P@3   | P@4   | P@5   | MRR   |
|-------|-------|-------|-------|-------|-------|-------|
| WS    | 17.53 | 30.41 | 39.72 | 48.49 | 53.97 | 31.33 |
| CN    | 22.19 | 36.71 | 43.84 | 49.86 | 54.24 | 39.41 |
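The slide does not define P@k and MRR for this task; since the reported P@k grows with k, one plausible reading is a top-k hit rate per reply argument (query), averaged over queries, with MRR computed from the rank of the first correct partner argument. A small sketch under that assumption:

```python
def hit_at_k(ranked_relevance, k):
    """1.0 if a truly interactive post argument appears in the top-k ranking, else 0.0.
    One possible reading of P@k for this pair-extraction task; the paper's definition may differ."""
    return 1.0 if any(ranked_relevance[:k]) else 0.0

def reciprocal_rank(ranked_relevance):
    """Reciprocal rank of the first truly interactive post argument (0.0 if none is retrieved)."""
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            return 1.0 / rank
    return 0.0

# ranked_relevance holds 0/1 flags for the candidate post arguments, best-ranked first.
# e.g. for queries [[0, 1, 0], [1, 0, 0]]: mean hit_at_k(q, 1) = 0.5 and mean RR = 0.75
```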

SLIDE 32

Outline

1. Introduction and Motivation
2. Our Approach
3. Experiments
4. Analysis on Co-attention Network
5. Conclusion

SLIDE 33

Conclusion

• We propose to incorporate argument-level interactions for better evaluation of the quality of persuasion comments.
• We propose a novel co-attention network to capture the detailed interactions between the original post and the reply at the argument level.
• Experimental results on a benchmark dataset show that our proposed model achieves better performance than the state-of-the-art method.
• We formalize a task of interactive argument pair extraction to further understand how the attention mechanism works.

SLIDE 34

THANK YOU
