semisupervised autoencoder for sentiment analysis



  1. Semisupervised Autoencoder for Sentiment Analysis. Shuangfei Zhai, Zhongfei Zhang. 이종진, Seoul National University (ga0408@snu.ac.kr). July 06, 2018.

  2. ◮ Traditional autoencoders fall short in at least two respects.
     – Scalability with the high dimensionality of the vocabulary.
     – Handling of task-irrelevant words.
     ◮ The proposed method is devised to learn highly discriminative feature maps.

  3. ◮ $x$: n-gram count data, $y$: label, $\tilde{x}$: reconstruction of $x$.
     ◮ Loss function of the traditional autoencoder:
       $$D(\tilde{x}, x) = \|\tilde{x} - x\|^2 \tag{1}$$
     – The reconstruction is pushed to be accurate on frequent words.
     ◮ Loss function of the proposed autoencoder:
       $$D(\tilde{x}, x) = (\theta^T(\tilde{x} - x))^2 \tag{2}$$
     – $\theta$ are the weights of a linear classifier for the label.
     – The reconstruction only needs to be accurate along the directions to which the linear classifier is sensitive.
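The contrast between Eq. (1) and Eq. (2) can be seen on a toy example. The following is a minimal sketch in plain Python; the three-word vocabulary, the weights `theta`, and both reconstructions are made-up illustrative values, not data from the slides.

```python
def squared_error(x_rec, x):
    """Eq. (1): sum of squared coordinate-wise reconstruction errors."""
    return sum((r - v) ** 2 for r, v in zip(x_rec, x))


def classifier_weighted_error(theta, x_rec, x):
    """Eq. (2): squared projection of the error onto the classifier weights."""
    projection = sum(t * (r - v) for t, r, v in zip(theta, x_rec, x))
    return projection ** 2


# Hypothetical 3-word vocabulary; the classifier ignores the first (frequent) word.
theta = [0.0, 1.5, -2.0]
x = [5.0, 1.0, 0.0]
x_rec_a = [3.0, 1.0, 0.0]  # error only on the word the classifier ignores
x_rec_b = [5.0, 0.0, 0.0]  # error on a word the classifier relies on

loss1_a = squared_error(x_rec_a, x)
loss1_b = squared_error(x_rec_b, x)
loss2_a = classifier_weighted_error(theta, x_rec_a, x)
loss2_b = classifier_weighted_error(theta, x_rec_b, x)
```

Under Eq. (1) reconstruction `a` is penalized more than `b`; under Eq. (2) the ordering flips, because `a` errs only in a direction where `theta` is zero.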

  4. ◮ $D(\tilde{x}, x) = (\theta^T(\tilde{x} - x))^2$ can be justified from the perspective of Bregman divergence.
     ◮ SVM2 (squared hinge loss) objective:
       $$L(\theta) = \sum_i (\max(0, 1 - y_i\theta^T x_i))^2 + \lambda\|\theta\|^2 \tag{3}$$
     ◮ With $\theta$ fixed, define
       $$f(x_i) = (\max(0, 1 - y_i\theta^T x_i))^2 \tag{4}$$
     ◮ Reconstruct $\tilde{x}_i$ so that $f(\tilde{x}_i)$ deviates little from $f(x_i)$.
     – We would like $\tilde{x}_i$ to still be correctly classified by the pretrained linear classifier.
     – Deriving the Bregman divergence from $f(x_i)$ and using it as the loss function of the subsequent autoencoder training guides the autoencoder toward reconstruction errors that do not confuse the classifier.
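Eq. (3) and Eq. (4) can be written out directly. This is a minimal sketch in plain Python; the helper names (`dot`, `f`, `svm2_objective`) and the toy data are illustrative assumptions, not part of the slides.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def f(theta, x, y):
    """Eq. (4): squared hinge loss of one example, with theta held fixed."""
    return max(0.0, 1.0 - y * dot(theta, x)) ** 2


def svm2_objective(theta, X, Y, lam):
    """Eq. (3): sum of squared hinge losses plus an L2 penalty on theta."""
    data_term = sum(f(theta, x, y) for x, y in zip(X, Y))
    return data_term + lam * dot(theta, theta)


# Toy data (hypothetical): three 2-dimensional examples with labels in {+1, -1}.
theta = [1.0, -1.0]
X = [[2.0, 0.0], [0.5, 0.0], [0.0, 0.5]]
Y = [1, 1, -1]

per_example = [f(theta, x, y) for x, y in zip(X, Y)]
objective = svm2_objective(theta, X, Y, lam=0.1)
```

The first example satisfies the margin and contributes zero loss; the other two lie inside the margin and each contribute 0.25.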

  5. ◮ Bregman divergence with respect to $f$:
       $$D_f(\tilde{x}, x) = f(\tilde{x}) - (f(x) + \nabla f(x)^T(\tilde{x} - x)) \tag{5}$$
     ◮ $f(x_i)$ is a quadratic function of $x_i$, so the Hessian follows as
       $$H(x_i) = \begin{cases} \theta\theta^T, & \text{if } 1 - y_i\theta^T x_i > 0 \\ 0, & \text{otherwise} \end{cases} \tag{6}$$
     ◮ For SVM2 the Bregman divergence is simply $(x - \tilde{x})^T H (x - \tilde{x})$:
       $$D_f(\tilde{x}, x) = \begin{cases} (\theta^T(\tilde{x}_i - x_i))^2, & \text{if } 1 - y_i\theta^T x_i > 0 \\ 0, & \text{otherwise} \end{cases} \tag{7}$$
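The step from Eq. (5) to Eq. (7) can be checked numerically. The sketch below, in plain Python with made-up values, evaluates the Bregman divergence of the squared hinge loss directly and compares it with $(\theta^T(\tilde{x} - x))^2$. It assumes that both $x$ and $\tilde{x}$ lie on the active side of the hinge ($1 - y\theta^T x > 0$), where $f$ is exactly quadratic.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def f(theta, x, y):
    """Squared hinge loss of Eq. (4)."""
    return max(0.0, 1.0 - y * dot(theta, x)) ** 2


def grad_f(theta, x, y):
    """Gradient of f with respect to the input x (not theta)."""
    margin = max(0.0, 1.0 - y * dot(theta, x))
    return [-2.0 * margin * y * t for t in theta]


def bregman(theta, x_rec, x, y):
    """Eq. (5): f(x_rec) minus the first-order expansion of f around x."""
    diff = [r - v for r, v in zip(x_rec, x)]
    return f(theta, x_rec, y) - (f(theta, x, y) + dot(grad_f(theta, x, y), diff))


# Hypothetical values chosen so that both points violate the margin.
theta = [0.1, -0.2, 0.05]
x = [1.0, 0.0, 2.0]
x_rec = [0.5, 0.5, 1.5]
y = 1

d_bregman = bregman(theta, x_rec, x, y)
d_closed = dot(theta, [r - v for r, v in zip(x_rec, x)]) ** 2
```

Both quantities agree up to floating-point error, which is the active case of Eq. (7).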

  6. The Bayesian Marginalization
     ◮ Estimating $\theta$ with a single classifier can introduce bias.
     ◮ Bayesian approach, borrowing the idea of energy-based models:
       $$p(\theta) = \frac{\exp(-\beta L(\theta))}{\int \exp(-\beta L(\theta))\, d\theta} \tag{8}$$
     ◮ Rewrite $D(\tilde{x}, x) = \int (\theta^T(\tilde{x} - x))^2 p(\theta)\, d\theta$, which can be estimated with a sampling method (MCMC).
     ◮ Approximate $p(\theta)$ by a Gaussian $\tilde{p}(\theta) = N(\hat{\theta}, \Sigma)$; then
       $$D(\tilde{x}, x) = (\hat{\theta}^T(\tilde{x} - x))^2 + (\Sigma^{1/2}(\tilde{x} - x))^T(\Sigma^{1/2}(\tilde{x} - x)) \tag{9}$$
     ◮ $\Sigma = \frac{1}{\beta}\left(\mathrm{diag}\left(\sum_i \mathbb{I}(1 - y_i\theta^T x_i > 0)\, x_i^2\right)\right)^{-1}$
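Eq. (9) is the expectation of $(\theta^T(\tilde{x} - x))^2$ under $\theta \sim N(\hat{\theta}, \Sigma)$. The sketch below, in plain Python, computes the diagonal $\Sigma$ and the closed form, then compares the closed form with a simple Monte Carlo average. Direct Gaussian sampling stands in for MCMC here, and the function names, the `eps` safeguard, and the toy data are all illustrative assumptions.

```python
import random


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def laplace_diag_cov(theta, X, Y, beta, eps=1e-8):
    """Diagonal of Sigma = (1/beta) * diag(sum_i I(1 - y_i theta^T x_i > 0) x_i^2)^-1.

    eps is a hypothetical safeguard against division by zero.
    """
    curvature = [0.0] * len(theta)
    for x, y in zip(X, Y):
        if 1.0 - y * dot(theta, x) > 0:  # only margin violators contribute
            for j, v in enumerate(x):
                curvature[j] += v * v
    return [1.0 / (beta * (c + eps)) for c in curvature]


def marginalized_loss(theta_hat, sigma_diag, x_rec, x):
    """Eq. (9) specialised to a diagonal Sigma."""
    diff = [r - v for r, v in zip(x_rec, x)]
    mean_term = dot(theta_hat, diff) ** 2
    var_term = sum(s * d * d for s, d in zip(sigma_diag, diff))
    return mean_term + var_term


def monte_carlo_loss(theta_hat, sigma_diag, x_rec, x, n_samples, rng):
    """Average (theta^T (x_rec - x))^2 over theta ~ N(theta_hat, Sigma)."""
    diff = [r - v for r, v in zip(x_rec, x)]
    total = 0.0
    for _ in range(n_samples):
        theta = [rng.gauss(m, s ** 0.5) for m, s in zip(theta_hat, sigma_diag)]
        total += dot(theta, diff) ** 2
    return total / n_samples


# Toy data (hypothetical).
theta_hat = [1.0, -1.0]
X = [[2.0, 0.0], [0.5, 1.0], [1.0, 0.5]]
Y = [1, 1, -1]
sigma_diag = laplace_diag_cov(theta_hat, X, Y, beta=2.0)

x, x_rec = [1.0, 2.0], [1.5, 1.0]
closed_form = marginalized_loss(theta_hat, sigma_diag, x_rec, x)
sampled = monte_carlo_loss(theta_hat, sigma_diag, x_rec, x, 50000, random.Random(0))
```

The second term of Eq. (9) adds a penalty in directions where $\theta$ is uncertain, so reconstruction errors are not ignored merely because a single point estimate $\hat{\theta}$ happens to have a small weight there.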

  7. Experiments
     ◮ Datasets: IMDB dataset / Amazon review data for five item categories.
     ◮ Method
     – Bag of words with unigrams or bigrams.
     – Normalization of the counts $c_{i,j}$:
       $$x_{i,j} = \frac{\log(1 + c_{i,j})}{\max_j \log(1 + c_{i,j})} \tag{10}$$
     – Compared models: DAE / DAE with finetuning / NN / logistic regression with dropout / Semisupervised Bregman Divergence Autoencoder (SBDAE) / SBDAE with finetuning.
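The normalization in Eq. (10) maps each document's counts into $[0, 1]$. A minimal sketch in plain Python follows; the function name `normalize_counts`, the handling of all-zero documents, and the count matrix are illustrative assumptions.

```python
import math


def normalize_counts(counts):
    """Eq. (10): log(1 + c_ij) divided by the largest log(1 + c_ij) in row i."""
    normalized = []
    for row in counts:
        logged = [math.log(1.0 + c) for c in row]
        row_max = max(logged)
        # An all-zero document has row_max == 0; it is left as zeros (assumption).
        normalized.append([v / row_max if row_max > 0 else 0.0 for v in logged])
    return normalized


# Hypothetical count matrix: rows are documents, columns are n-grams.
counts = [[3, 0, 1], [0, 0, 0], [7, 7, 1]]
features = normalize_counts(counts)
```

After normalization the most frequent n-gram of every non-empty document has value 1, and the logarithm dampens the gap between frequent and rare n-grams.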

  8. Experiments
     ◮ Book (example review record)
     – id1: lost credability,quickly!!:chalupa, id2: 4423
     – asin: 055380121X
     – product name / product type
     – helpful: 12 of 15
     – rating: 2.0
     – title / date / reviewer / reviewer location
     – review text: I admit, I haven’t finished this book. A friend recommended it to me as I have been having problems with insomnia. I was interested in reading a book about women’s health issues and this one sounded intriguing UNTIL she started in with her tarot cards, interest in astrology and angels. Granted, I am not a firm believer in just "the hard facts" but its really hard to believe anything this woman writes after it is clear that common sense isn’t alternative enough for her!

  9. Experiments

  10. Experiments
