machine learning for computational linguistics

  1. Machine Learning for Computational Linguistics Çağrı Çöltekin University of Tübingen Seminar für Sprachwissenschaft April 12, 2016

  2. When/where
     Outline: Practical matters / What and why / The course plan
     ▶ Lectures: Tuesday/Thursday 08:30 at Hörsaal 0.02
     ▶ Office hours: Tuesday 10:00-12:00, or by appointment (email ccoltekin@sfs.uni-tuebingen.de).
     ▶ Course web page: http://coltekin.net/cagri/courses/ml
     ▶ Reading material: no (single) textbook. Course web page will include pointers to reading material for each lecture.

  3. Literature
     ▶ James et al. (2013) [online!]
     ▶ Hastie, Tibshirani, and J. Friedman (2009) [online!]
     ▶ Barber (2012)
     ▶ Murphy (2012)
     ▶ Bishop (2006)
     ▶ Mitchell (1997)
     ▶ Goodfellow, Bengio, and Courville (2016) [online copy]
     ▶ Alpaydın (2004)
     ▶ Witten and Frank (2005)
     ▶ Richert (2015)
     ▶ Lantz (2015)
     ▶ Cho (2015)
     ▶ Goldberg (2015)

  4. Evaluation
     ▶ Homeworks (30%): 6 (maybe 7) homeworks, to be done individually.
     ▶ Term project / term paper (70%)
        ▶ Team work (up to 3 team members) is encouraged
        ▶ The project has to include a machine learning ‘experiment’
        ▶ The results should be presented in a term paper (details will be announced later)
        ▶ You should already start thinking about project topics, and forming teams

  5. Machine learning is …
     ▶ Mitchell (1997): "The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience."
     ▶ Barber (2012): "Machine Learning is the study of data-driven methods capable of mimicking, understanding and aiding human and biological information processing tasks."
     ▶ James et al. (2013): "Statistical learning refers to a vast set of tools for understanding data."

  8. Machine learning and computational linguistics
     ▶ Majority of the computational linguistic tasks and applications are based on machine learning
        ▶ Tokenization
        ▶ Part of speech tagging
        ▶ Parsing
        ▶ …
        ▶ Speech recognition
        ▶ Named Entity recognition
        ▶ Document classification
        ▶ Question answering
        ▶ Machine translation
        ▶ …

  9. Refresher: linear algebra (Thursday!)
     ▶ Vectors, vector operations, their geometric interpretations
     ▶ Vector norms, distances between vectors
     ▶ Matrices, matrix operations
     ▶ Some useful matrix properties
     ▶ Linear transformations
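
As a small illustration of the "vector norms, distances between vectors" item, here is a minimal sketch in plain Python. The function names and the example vectors are assumptions chosen for illustration; they are not taken from the course material.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def l1_norm(v):
    """L1 (Manhattan) norm: sum of absolute values."""
    return sum(abs(a) for a in v)

def l2_norm(v):
    """L2 (Euclidean) norm: the geometric length of the vector."""
    return math.sqrt(dot(v, v))

def euclidean_distance(u, v):
    """Length of the difference vector u - v."""
    return l2_norm([a - b for a, b in zip(u, v)])

def cosine_similarity(u, v):
    """Cosine of the angle between u and v (1.0 means same direction)."""
    return dot(u, v) / (l2_norm(u) * l2_norm(v))

# Hypothetical example vectors.
u = [3.0, 4.0]
v = [4.0, 3.0]
print(l1_norm(u), l2_norm(u))
print(euclidean_distance(u, v))
print(cosine_similarity(u, v))
```

The distance measures how far apart two vectors are, while cosine similarity only compares their directions, which is often the more useful notion for comparing word or document vectors of different lengths.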

  10. Refresher: probability and statistics (Next week)
     ▶ Probabilities: where do they come from?
     ▶ Random variables, probability distributions
     ▶ Joint, conditional, marginal probabilities, chain rule
     ▶ Bayes’ formula
     ▶ Some concepts from information theory
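
To make the "Bayes’ formula" item concrete, the sketch below computes P(H | E) = P(E | H) P(H) / P(E), with P(E) obtained by marginalizing over H and not-H. The spam-filtering scenario and all numbers are invented for illustration only.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """Return P(H | E) from P(H), P(E | H) and P(E | not H)."""
    # Marginal probability of the evidence (law of total probability).
    evidence = likelihood * prior + likelihood_given_not * (1.0 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: 20% of messages are spam; a given word occurs
# in 50% of spam messages and in 5% of non-spam messages.
p_spam_given_word = posterior(prior=0.2, likelihood=0.5,
                              likelihood_given_not=0.05)
print(p_spam_given_word)
```

Under these assumed numbers, seeing the word raises the probability of spam from the prior 0.2 to 5/7, roughly 0.71.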

  11. Regression
     [Figure: regression plot of y against x; both axes run from 20 to 80]
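
The slide only shows a plot, so the following is one possible illustration of regression rather than the method used in the course: a least-squares fit of a straight line y = slope * x + intercept in plain Python. The data points are made up.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on y = x + 5.
xs = [20.0, 40.0, 60.0, 80.0]
ys = [25.0, 45.0, 65.0, 85.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)

# Predict y for an unseen x.
print(slope * 50.0 + intercept)
```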

  12. Classification
     [Figure: points labeled + and – in the (x1, x2) plane]

  13. Classification
     [Figure: points labeled + and – in the (x1, x2) plane, plus one point marked ?]

  15. Machine learning basics
     ▶ How to measure success in an ML experiment?
     ▶ Variance and bias
     ▶ Overfitting and underfitting
     ▶ Cross validation
     ▶ Training/test/development set split
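
As a rough sketch of the "cross validation" item, the function below partitions item indices into k folds, each used once as the test set while the rest serve as training data. The function name is hypothetical, and the sketch does not shuffle the data, which a real experiment would usually do first.

```python
def k_fold_splits(n_items, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    indices = list(range(n_items))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

folds = list(k_fold_splits(10, 3))
for train, test in folds:
    print("train:", train, "test:", test)
```

Averaging the evaluation score over all folds gives an estimate that depends less on one particular train/test split.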

  16. Unsupervised learning
     ▶ Clustering
     ▶ Density estimation
     ▶ Dimensionality reduction

  17. Neural networks
     [Figure: feed-forward network with an input layer (x1, x2, x3, x4), a hidden layer, and an output layer producing the outputs]
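
The network in the figure can be sketched as a forward pass through two fully connected layers. This is a minimal illustration only: the layer sizes (4 inputs, 3 hidden units, 2 outputs), the sigmoid activation, and all weight values are assumptions, not details recoverable from the slide.

```python
import math

def sigmoid(z):
    """Logistic activation, squashing any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dense_layer(inputs, weights, biases):
    """Fully connected layer: one sigmoid unit per row of `weights`."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

# Hypothetical parameters: 4 inputs -> 3 hidden units -> 2 outputs.
hidden_weights = [
    [0.2, -0.1, 0.4, 0.0],
    [-0.3, 0.5, 0.1, 0.2],
    [0.1, 0.1, -0.2, 0.3],
]
hidden_biases = [0.0, 0.1, -0.1]
output_weights = [
    [0.5, -0.4, 0.3],
    [-0.2, 0.6, 0.1],
]
output_biases = [0.0, 0.0]

def forward(x):
    """Compute the network outputs for one input vector (x1, ..., x4)."""
    hidden = dense_layer(x, hidden_weights, hidden_biases)
    return dense_layer(hidden, output_weights, output_biases)

outputs = forward([1.0, 0.5, -1.0, 2.0])
print(outputs)
```

Training would adjust the weights and biases to reduce the error on labeled examples; the sketch covers only prediction.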

  18. Distributed representations
     ▶ Sparse feature representations
     ▶ Dense representations
     ▶ Word/character embeddings
     ▶ How to obtain meaningful combinations?
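
The contrast between sparse and dense representations can be sketched as follows. The three-word vocabulary and the embedding values are invented for illustration, and averaging is shown only as one simple way of combining word vectors, not as the approach taken in the course.

```python
vocab = ["the", "cat", "sat"]
index = {word: i for i, word in enumerate(vocab)}

def one_hot(word):
    """Sparse representation: a single 1 at the word's vocabulary index."""
    vec = [0] * len(vocab)
    vec[index[word]] = 1
    return vec

# Hypothetical 2-dimensional embedding table, one row per vocabulary word.
# In practice these values are learned from data.
embeddings = [
    [0.1, -0.3],
    [0.7, 0.2],
    [-0.5, 0.4],
]

def embed(word):
    """Dense representation: look up the word's row in the table."""
    return embeddings[index[word]]

def average_embedding(words):
    """Combine word vectors by element-wise averaging."""
    vectors = [embed(w) for w in words]
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]

print(one_hot("cat"))
print(embed("cat"))
print(average_embedding(["the", "cat", "sat"]))
```

The one-hot vector grows with the vocabulary and treats all words as equally distant, while the dense vectors have a fixed small dimension and can place similar words close together.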

  19. Deep learning
     ▶ Convolutional networks
     ▶ Recurrent networks
     ▶ Auto-encoder/decoders
     ▶ …

  20. Bayesian learning (if time allows)
     ▶ Bayesian inference
     ▶ Graphical models

  21. Summary
     ▶ Besides what is covered during the course, we will note some topics that are not covered:
        ▶ Decision trees, random forests
        ▶ Rule learning
        ▶ Memory based learning
        ▶ Support vector machines
        ▶ Local regression / generalized additive models
        ▶ Learning sequences (e.g., HMMs)
        ▶ Active learning
        ▶ Reinforcement learning
        ▶ Ensemble methods
        ▶ …

  22. Machine learning books/resources
     The following is an unsorted list of machine learning related books and resources.
     ▶ Abu-Mostafa, Yaser S., Malik Magdon-Ismail, and Hsuan-Tien Lin (2012). Learning from Data: A Short Course. AMLBook.com. isbn: 9781600490064.
     ▶ Alpaydın, Ethem (2004). Introduction to machine learning. Adaptive computation and machine learning. MIT Press. isbn: 0262012111,9780262012119.
     ▶ Barber, David (2012). Bayesian Reasoning and Machine Learning. Cambridge University Press. isbn: 9780521518147.
     ▶ Bishop, Christopher M. (2006). Pattern Recognition and Machine Learning. Springer. isbn: 978-0387-31073-2.
     ▶ Bowles, Michael (2015). Machine Learning in Python: Essential Techniques for Predictive Analysis. Wiley. isbn: 9781118961766.
     ▶ Cho, Kyunghyun (2015). Natural Language Understanding with Distributed Representation. arXiv preprint arXiv:1511.07916.
     ▶ Flach, Peter (2012). Machine Learning: The Art and Science of Algorithms that Make Sense of Data. Cambridge University Press. isbn: 9781107096394.
     ▶ Goldberg, Yoav (2015). A Primer on Neural Network Models for Natural Language Processing. url: http://www.cs.biu.ac.il/~yogo/nnlp.pdf
     ▶ Goodfellow, Ian, Yoshua Bengio, and Aaron Courville (2016). “Deep Learning”. Book in preparation for MIT Press. url: http://www.deeplearningbook.org
     ▶ Grus, Joel (2015). Data Science from Scratch: First Principles with Python. O’Reilly Media. isbn: 9781491904404.
