Computational methods for text analysis



  1. Computational methods for text analysis. BA program “Sociology and Social Informatics”. Kirill Maslinsky, 2018. Higher School of Economics, Saint Petersburg

  2. Why do you need it

  3. Just to learn to make those pictures (just kidding)

  4. Scale up
     • population studied: “all social media users of a town”
     • time spans: “all of post-Soviet history”
     • geographical scope: “all educational migration in Russia”

  5. Course goals
     • provide a basic understanding of how to properly use collections of texts as quantitative evidence,
     • and make this knowledge practical

  6. Course content

  7. Bread and butter: Topic modeling

  8. Killer feature: Word embeddings

  9. The icing on the cake: Sentiment analysis

  12. Course topics
      • Basic word statistics:
        • lexical statistics (word frequency distributions),
        • distributional semantics (word co-occurrence patterns),
        • vector representation of text.
      • Methods for supervised and unsupervised modeling:
        • dictionary methods,
        • document classification and clustering,
        • topic modeling,
        • word embeddings,
        • sequence modeling.
      • Applied tasks:
        • automating content analysis (extracting theme and topic),
        • sentiment analysis,
        • information extraction from unstructured text.
      • this is a really very boring slide, isn’t it?

  13. What to expect

  14. How coursework will be organized
      • an interesting recent article
      • with an explanation of the necessary concepts and methods during the lecture
      • followed by a detailed analysis of the method in class
      • concluded by the task of reproducing the method with your own data

  15. Expectations
      Practical work with real texts in class and at home.
      • command line
      • mining your own text collection
      • R scripts
      • bugs in scripts, googling, bugs in scripts again
      • seeking and getting help from your peers and course instructor
      • happy ending

  16. Work in groups

  17. What you can learn
      • State of the art of natural language processing:
        • solved problems
        • topical issues and unsolved problems
      • Terms:
        • a minimal vocabulary of necessary linguistic terms (with meanings! :))
        • appropriate keywords to search for current research and tools
      • Tools:
        • where to apply methods for computational text analysis and how to interpret their results
        • existing software for text analysis (for Russian and English)
        • existing linguistic resources: dictionaries, corpora, pre-trained models (for Russian and English)
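
Two of the basic word statistics named in the course topics, word frequency distributions and word co-occurrence patterns, can be illustrated with a minimal sketch. This is not material from the course: the slides mention R scripts, and Python is used here only for illustration. The function names (`tokenize`, `word_frequencies`, `cooccurrences`) and the three toy sentences are made up for this example, and the tokenizer assumes plain English text.

```python
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and return its word tokens (assumes English letters only)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(docs):
    """Count how often each word occurs across all documents."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return counts


def cooccurrences(docs):
    """Count unordered word pairs that appear together in the same document."""
    pairs = Counter()
    for doc in docs:
        # Sorting the document vocabulary gives each pair one canonical order.
        vocab = sorted(set(tokenize(doc)))
        pairs.update(combinations(vocab, 2))
    return pairs


# Toy collection, invented for illustration only.
docs = [
    "students move to the city",
    "students study in the city",
    "the city grows",
]

freq = word_frequencies(docs)
pairs = cooccurrences(docs)

print(freq["city"])                  # occurrences of "city" in the collection
print(pairs[("city", "students")])   # documents containing both words
```

Counting co-occurrence at the document level is only one possible choice; a sliding window of a few words around each token is a common alternative when documents are long.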
