

  1. Sentiment Analysis of Peer Review Texts for Scholarly Papers. Ke Wang & Xiaojun Wan {wangke17,wanxiaojun}@pku.edu.cn. July 9, 2018. Institute of Computer Science and Technology, Peking University, Beijing, China

  2. Outline 1. Introduction 2. Related Work 3. Framework 4. Experiments 5. Conclusion and Future Work

  3. Outline 1. Introduction 2. Related Work 3. Framework 4. Experiments 5. Conclusion and Future Work

  4. Introduction • The boom of scholarly papers • Motivations • Help the review submission system detect the consistency between review texts and scores. • Help the chair write a comprehensive meta-review. • Help authors further improve their papers. Figure 1: An example of a peer review text and the analysis results.

  5. Introduction • Challenges • Long length. • Mixture of non-opinionated and opinionated texts. • Mixture of pros and cons. • Contributions • We built two evaluation datasets (ICLR-2017 and ICLR-2018). • We propose a multiple instance learning network with a novel abstract-based memory mechanism (MILAM). • Evaluation results demonstrate the efficacy of our proposed model and show the great benefit of using the abstract as memory.

  6. Outline 1. Introduction 2. Related Work 3. Framework 4. Experiments 5. Conclusion and Future Work

  7. Related Work • Sentiment Classification: Sentiment analysis has been widely explored in many text domains, but few studies have tried to perform it in the domain of peer reviews for scholarly papers. • Multiple Instance Learning: MIL can extract instance labels (sentence-level polarities) from bags (reviews in our case), but none of the previous work has been applied to this challenging task. • Memory Network: A memory network utilizes external information for greater capacity and efficiency. • Study on Peer Reviews: These tasks are related to, but different from, the sentiment analysis task addressed in this study.

  8. Outline 1. Introduction 2. Related Work 3. Framework 4. Experiments 5. Conclusion and Future Work

  9. Framework • Architecture (see Figure 2). The model has three stacked parts: an input representation layer that turns each sentence of the review and of the paper abstract into a vector via sentence embedding, convolution, and max pooling; a sentence classification layer that matches each review-sentence vector against the abstract sentences through the abstract-based memory mechanism (matched attention, response content, MLP, and softmax); and a review classification layer that combines the sentence-level distributions through document attention into a review-level distribution. Figure 2: The architecture of MILAM

  10. Framework. Input Representation Layer: (I) A sentence $S$ of length $L$ (padded where necessary) is represented as $S = w_1 \oplus w_2 \oplus \cdots \oplus w_L$, $S \in \mathbb{R}^{L \times d}$ (1). (II) The convolutional layer: $f^{(q)}_k = \tanh(W_c \cdot w_{k-l+1:k} + b_c)$ (2), $f^{(q)} = [f^{(q)}_1, f^{(q)}_2, \cdots, f^{(q)}_{L-l+1}]$ (3). (III) A max-pooling layer: $u_q = \max\{f^{(q)}\}$ (4). Finally, the representations of the review text $\{S^r_i\}_{i=1}^{n}$ and the abstract text $\{S^a_j\}_{j=1}^{m}$ are denoted as $[I_i]_{i=1}^{n}$ and $[M_j]_{j=1}^{m}$ respectively, where $I_i, M_j \in \mathbb{R}^{z}$.
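The input representation layer above (Eqs. 1-4) is a standard convolution-plus-max-pooling sentence encoder. A minimal PyTorch sketch follows; the filter width l, the number of filters z, and all parameter names are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn

class SentenceEncoder(nn.Module):
    """CNN sentence encoder: padded word embeddings -> convolution -> max pooling (Eqs. 1-4)."""
    def __init__(self, d: int = 300, z: int = 128, l: int = 3):
        super().__init__()
        # z convolutional filters of width l over word embeddings of dimension d
        self.conv = nn.Conv1d(in_channels=d, out_channels=z, kernel_size=l)

    def forward(self, S: torch.Tensor) -> torch.Tensor:
        # S: (batch, L, d), one padded sentence per row (Eq. 1)
        f = torch.tanh(self.conv(S.transpose(1, 2)))  # (batch, z, L - l + 1), Eqs. 2-3
        u = f.max(dim=2).values                       # max pooling over positions, Eq. 4
        return u                                      # (batch, z): sentence vectors I_i or M_j
```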

  11. Framework. Sentence Classification Layer: (I) Obtain a matched attention vector $E^{(i)} = [e^{(i)}_t]_{t=1}^{m}$ which indicates the weights of the memories. (II) Calculate the response content $R^{(i)} \in \mathbb{R}^{z}$ using this matched attention vector. (III) Use an MLP to obtain the final representation vector of each sentence in the review text: $V_i = f_{mlp}(I_i \| R^{(i)}; \theta_{mlp})$ (5). (IV) Use the softmax classifier to get the sentence-level distribution over sentiment labels: $P_i = \mathrm{softmax}(W_p \cdot V_i + b_p)$ (6). Finally, we obtain new high-level representations of sentences in the review text by leveraging relevant abstract information.
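Steps III and IV amount to concatenating each sentence vector with its abstract response and passing the result through an MLP and a softmax classifier. A hedged PyTorch sketch, with hidden sizes and parameter names assumed:

```python
import torch
import torch.nn as nn

class SentenceClassifier(nn.Module):
    """Fuse a sentence vector I_i with its abstract response R^(i) and predict
    a sentence-level sentiment distribution P_i (Eqs. 5-6)."""
    def __init__(self, z: int = 128, hidden: int = 128, num_classes: int = 2):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * z, hidden), nn.Tanh())  # f_mlp
        self.out = nn.Linear(hidden, num_classes)                      # W_p, b_p

    def forward(self, I_i: torch.Tensor, R_i: torch.Tensor):
        V_i = self.mlp(torch.cat([I_i, R_i], dim=-1))  # Eq. 5: V_i = f_mlp(I_i || R^(i))
        P_i = torch.softmax(self.out(V_i), dim=-1)     # Eq. 6: sentence-level distribution
        return V_i, P_i
```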

  12. Framework. Review Classification Layer: (I) Use separate LSTM modules to produce forward and backward hidden vectors: $\overrightarrow{h_i} = \overrightarrow{\mathrm{LSTM}}(V_i)$, $\overleftarrow{h_i} = \overleftarrow{\mathrm{LSTM}}(V_i)$, $h_i = \overrightarrow{h_i} \| \overleftarrow{h_i}$ (7). (II) The importance $a_i$ of each sentence is measured as follows: $h'_i = \tanh(W_a \cdot h_i + b_a)$, $a_i = \frac{\exp(h'_i)}{\sum_j \exp(h'_j)}$ (8). (III) Finally, we obtain a document-level distribution over sentiment labels as the weighted sum of the sentence-level distributions: $P^{(c)}_{review} = \sum_i a_i P^{(c)}_i$, $c \in [1, C]$ (9).
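A sketch of the review classification layer under the same assumptions: a bidirectional LSTM over the sentence vectors V_i, a scalar attention score per sentence, and the review-level distribution as the attention-weighted sum of the sentence-level distributions P_i (Eqs. 7-9). Hidden sizes are illustrative.

```python
import torch
import torch.nn as nn

class ReviewClassifier(nn.Module):
    """BiLSTM over sentence vectors, attention weights a_i, and the review-level
    distribution as the weighted sum of sentence distributions (Eqs. 7-9)."""
    def __init__(self, z: int = 128, hidden: int = 128):
        super().__init__()
        self.bilstm = nn.LSTM(z, hidden, batch_first=True, bidirectional=True)
        self.W_a = nn.Linear(2 * hidden, 1)  # scores each h_i with a scalar

    def forward(self, V: torch.Tensor, P: torch.Tensor):
        # V: (batch, n, z) sentence vectors; P: (batch, n, C) sentence distributions
        H, _ = self.bilstm(V)             # Eq. 7: h_i = forward || backward
        scores = torch.tanh(self.W_a(H))  # Eq. 8: h'_i = tanh(W_a h_i + b_a)
        a = torch.softmax(scores, dim=1)  # attention weights a_i over sentences
        P_review = (a * P).sum(dim=1)     # Eq. 9: review-level distribution
        return P_review, a.squeeze(-1)
```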

  13. Framework • Abstract-based Memory Mechanism. (1) Get the matched attention vector $E^{(i)}$ over the memories: $e'_t = \mathrm{LSTM}(\hat{h}_{t-1}, M_t)$, $(\hat{h}_0 = I_i,\ t = 1, \ldots, m)$ (10); $e^{(i)}_t = \frac{\exp(e'_t)}{\sum_j \exp(e'_j)}$ (11); $E^{(i)} = [e^{(i)}_t]_{t=1}^{m}$ (12). (2) Calculate the response content $R^{(i)}$: $R^{(i)} = \sum_{t=1}^{m} e^{(i)}_t M_t$ (13). (3) Use $R^{(i)}$ and $I_i$ to compute the new sentence representation vector $V_i$: $V_i = f_{mlp}(I_i \| R^{(i)}; \theta_{mlp})$ (14).
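The memory mechanism can be sketched as below. The slides write e'_t = LSTM(ĥ_{t−1}, M_t); the scalar projection used here to turn each LSTM state into a score is an assumption made only to keep the sketch self-contained.

```python
import torch
import torch.nn as nn

class AbstractMemory(nn.Module):
    """Match a review-sentence vector I_i against abstract-sentence memories M_1..M_m
    and return the matched attention E^(i) and response content R^(i) (Eqs. 10-13)."""
    def __init__(self, z: int = 128):
        super().__init__()
        self.cell = nn.LSTMCell(z, z)   # consumes one memory M_t per step
        self.score = nn.Linear(z, 1)    # assumed projection of each step to a scalar e'_t

    def forward(self, I_i: torch.Tensor, M: torch.Tensor):
        # I_i: (batch, z); M: (batch, m, z) abstract sentence memories
        h, c = I_i, torch.zeros_like(I_i)         # Eq. 10: ĥ_0 = I_i
        scores = []
        for t in range(M.size(1)):
            h, c = self.cell(M[:, t, :], (h, c))  # ĥ_t from ĥ_{t-1} and M_t
            scores.append(self.score(h))          # e'_t
        e = torch.softmax(torch.cat(scores, dim=1), dim=1)  # Eqs. 11-12: E^(i)
        R_i = torch.bmm(e.unsqueeze(1), M).squeeze(1)       # Eq. 13: R^(i) = Σ_t e_t M_t
        return e, R_i
```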

  14. Framework • Objective Function • Our model only needs the review's sentiment label, while each sentence's sentiment label is unobserved. • The categorical cross-entropy loss: $L(\theta) = -\sum_{review \in T} \sum_{c=1}^{C} P^{(c)}_{review} \log(\bar{P}^{(c)}_{review})$ (15).
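A minimal sketch of the review-level loss (Eq. 15), assuming the gold label is given as a class index and averaging over a batch rather than summing over the whole training set T:

```python
import torch

def review_loss(P_review: torch.Tensor, gold: torch.Tensor) -> torch.Tensor:
    """Categorical cross-entropy on review-level labels only (Eq. 15);
    sentence-level labels are never observed during training.
    P_review: (batch, C) predicted review distributions; gold: (batch,) class indices."""
    eps = 1e-8  # numerical safety for the log
    picked = P_review.gather(1, gold.unsqueeze(1)).squeeze(1)  # probability of the gold class
    return -(picked + eps).log().mean()
```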

  15. Outline 1. Introduction 2. Related Work 3. Framework 4. Experiments 5. Conclusion and Future Work

  16. Experiments • Evaluation Datasets • Statistics for the ICLR-2017 and ICLR-2018 datasets:

      Data Set     #Papers   #Reviews   #Sentences   #Words
      ICLR-2017    490       1517       24497        9868
      ICLR-2018    954       2875       58329        13503

  • The score distributions of the two datasets (figure omitted).

  17. Experiments • Comparison of review sentiment classification accuracy on the 2-class task {reject (score ∈ [1, 5]), accept (score ∈ [6, 10])}

  18. Experiments • Comparison of review sentiment classification accuracy on the 3-class task {reject (score ∈ [1, 4]), borderline (score ∈ [5, 6]), accept (score ∈ [7, 10])}
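The score-to-label bucketing behind the 2-class and 3-class tasks can be written as a small helper. This assumes ICLR's 1-10 rating scale with higher scores meaning a more positive recommendation; the function names are illustrative.

```python
def label_2class(score: int) -> str:
    """2-class task: reject for scores 1-5, accept for scores 6-10."""
    return "accept" if score >= 6 else "reject"

def label_3class(score: int) -> str:
    """3-class task: reject for 1-4, borderline for 5-6, accept for 7-10."""
    if score <= 4:
        return "reject"
    if score <= 6:
        return "borderline"
    return "accept"
```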

  19. Experiments • Sentence-Level Classification Results. We randomly selected 20 reviews, a total of 213 sentences, and manually labeled the sentiment polarity of each sentence. Figure 3: Example opinionated sentences with predicted polarity scores extracted from a review text.

  20. Experiments • Influence of Abstract Text. Figure 4: Example sentences in a review text and the most relevant sentence in the paper abstract for each. The sentence with the largest weight in the matched attention vector $E^{(i)}$ is considered most relevant. The red text marks similar content shared by the review text and the abstract text.

  21. Experiments • Influence of Abstract Text. • A simple method of using abstract texts as a contrast experiment: remove the sentences that are similar to the paper abstract's sentences from the review text and use the remaining text for classification. (The similarity threshold is set to 0.7.) Figure 5: Comparison of using and not using the paper abstract via this simple method.
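This contrast baseline can be sketched as follows. The slides only say "similar" with a threshold of 0.7; cosine similarity between sentence vectors is an assumption, as is the helper name.

```python
import torch
import torch.nn.functional as F

def remove_abstract_like_sentences(review_vecs, abstract_vecs, review_sents, threshold=0.7):
    """Drop review sentences whose best cosine similarity to any abstract sentence exceeds
    the threshold and keep the rest for classification (assumed similarity measure).
    review_vecs: (n, z); abstract_vecs: (m, z); review_sents: list of n sentence strings."""
    r = F.normalize(review_vecs, dim=1)
    a = F.normalize(abstract_vecs, dim=1)
    best_sim = (r @ a.t()).max(dim=1).values  # (n,) best match against the abstract
    return [s for s, sim in zip(review_sents, best_sim.tolist()) if sim <= threshold]
```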

  22. Experiments • Influence of Borderline Reviews. Figure 6: Experimental results on different datasets with, without, and with only borderline reviews.

  23. Experiments • Cross-Year Experiments. Figure 7: Results of cross-year experiments. Model@ICLR-* means the model is trained on the ICLR-* dataset.

  24. Experiments • Cross-Domain Experiments. We further collected 87 peer reviews for submissions to NLP conferences (CoNLL, ACL, EMNLP, etc.), including 57 positive reviews (accept) and 30 negative reviews (reject). Figure 8: Results of cross-domain experiments. * means the performance improvement over the first three methods is statistically significant with p-value < 0.05 by sign test. Model@ICLR-* means the model is trained on the ICLR-* dataset.

  25. Experiments • Final Decision Prediction for Scholarly Papers. • Methods to predict the final decision of a paper based on several review scores. • Voting: $\text{Decision} = \begin{cases} \text{Accept} & \text{if } \#\text{accept} > \#\text{reject} \\ \text{Reject} & \text{otherwise} \end{cases}$ (16). • Simple Average: simply average the scores of all reviews. If the average score is larger than or equal to 0.6, the paper is predicted as a final accept, and otherwise a final reject. • Confidence-based Average: $\text{overall\_score} = \frac{1}{|S|} \sum_{i=1}^{|S|} S_i \cdot \frac{1}{6 - \text{ReviewerConfidence}_i}$ (17).
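The three decision rules can be sketched as plain functions. The 0.6 threshold is taken from the slides and suggests scores normalized to [0, 1]; reusing that threshold for the confidence-based average is an assumption, since the slides only define the overall score (Eq. 17).

```python
def decide_voting(labels):
    """Voting (Eq. 16): accept iff strictly more accept reviews than reject reviews."""
    accepts = sum(1 for x in labels if x == "accept")
    return "Accept" if accepts > len(labels) - accepts else "Reject"

def decide_simple_average(scores, threshold=0.6):
    """Simple average: accept iff the mean review score reaches the threshold."""
    return "Accept" if sum(scores) / len(scores) >= threshold else "Reject"

def decide_confidence_average(scores, confidences, threshold=0.6):
    """Confidence-based average (Eq. 17): weight each score by 1 / (6 - confidence),
    so reviews with higher confidence (1-5 scale) count more."""
    overall = sum(s / (6 - c) for s, c in zip(scores, confidences)) / len(scores)
    return "Accept" if overall >= threshold else "Reject"
```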

  26. Experiments • Final Decision Prediction for Scholarly Papers. • Results of final decision prediction for scholarly papers. Figure 9: Results of final decision prediction for scholarly papers.

  27. Outline 1. Introduction 2. Related Work 3. Framework 4. Experiments 5. Conclusion and Future Work

  28. Conclusion and Future Work • Contributions • We built two evaluation datasets (ICLR-2017 and ICLR-2018). • We propose a multiple instance learning network with a novel abstract-based memory mechanism (MILAM). • Evaluation results demonstrate the efficacy of our proposed model and show the great benefit of using the abstract as memory. • Future Work • Collect more peer reviews. • Try more sophisticated deep learning techniques. • Several other sentiment analysis tasks: prediction of fine-grained review scores, automatic writing of meta-reviews, prediction of best papers...
