COMP 138: Reinforcement Learning Instructor: Jivko Sinapov Webpage: https://www.eecs.tufts.edu/~jsinapov/teaching/comp150_RL_Fall2020/

  1. COMP 138: Reinforcement Learning Instructor : Jivko Sinapov Webpage : https://www.eecs.tufts.edu/~jsinapov/teaching/comp150_RL_Fall2020/

  2. BE a reinforcement learner ● You, as a class, will act as the learning agent

  3. BE a reinforcement learner ● You, as a class, will act as the learning agent ● Actions: wave, clap, or nod

  4. BE a reinforcement learner ● You, as a class, will act as the learning agent ● Actions: wave, clap, or nod ● Observations: color, reward

  5. BE a reinforcement learner ● You, as a class, will act as the learning agent ● Actions: wave, clap, or nod ● Observations: color, reward ● Goal: find an optimal policy

  6. BE a reinforcement learner ● You, as a class, will act as the learning agent ● Actions: wave, clap, or stand ● Observations: color, reward ● Goal: find an optimal policy – What is a policy? What makes a policy optimal?
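
One common way to write down answers to the two questions on this slide, using the standard notation of Sutton and Barto. The slides defer the formalism to the board, so the symbols below ($\pi$, $v_\pi$, $\gamma$, $R_t$) are an assumption about the notation used in class, not a transcription of it. A policy maps states to action probabilities, and a policy is optimal if its expected discounted return is at least as high as that of any other policy in every state:

```latex
\pi(a \mid s) = \Pr\{A_t = a \mid S_t = s\}, \qquad
v_\pi(s) = \mathbb{E}_\pi\!\left[\sum_{k=0}^{\infty} \gamma^{k} R_{t+k+1} \,\middle|\, S_t = s\right]

\pi_* \text{ is optimal} \iff v_{\pi_*}(s) \ge v_\pi(s) \quad \text{for all states } s \text{ and all policies } \pi
```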

  7. How did you do it? ● What is your policy, and how is it represented? ● What does the world look like?
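
A minimal sketch of how the classroom game could be simulated, assuming the policy is a lookup table from observed color to action. The slides do not say how the world actually behaved, so the color names, the hidden reward rule (`HIDDEN_BEST`), and the random color transitions are all hypothetical. The learner is a simple epsilon-greedy agent with sample-average value estimates:

```python
import random

ACTIONS = ["wave", "clap", "nod"]      # actions from the slides
COLORS = ["red", "green", "blue"]      # hypothetical observation values

# Hypothetical hidden world: the rewarded action for each color.
# The learner never reads this table directly.
HIDDEN_BEST = {"red": "wave", "green": "clap", "blue": "nod"}


def world_step(color, action, rng):
    """Return (reward, next_color); reward is 1 for the hidden best action."""
    reward = 1.0 if HIDDEN_BEST[color] == action else 0.0
    return reward, rng.choice(COLORS)  # assumed: next color is random


def learn_policy(steps=2000, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    # Value estimate and visit count for every (color, action) pair.
    q = {(c, a): 0.0 for c in COLORS for a in ACTIONS}
    n = {(c, a): 0 for c in COLORS for a in ACTIONS}
    color = rng.choice(COLORS)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)                        # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(color, a)])  # exploit
        reward, next_color = world_step(color, action, rng)
        n[(color, action)] += 1
        # Incremental sample-average update of the value estimate.
        q[(color, action)] += (reward - q[(color, action)]) / n[(color, action)]
        color = next_color
    # The learned policy: for each color, the action with the highest estimate.
    return {c: max(ACTIONS, key=lambda a: q[(c, a)]) for c in COLORS}


policy = learn_policy()
```

Because the next color here does not depend on the action, this toy world is closer to a contextual bandit than to a full sequential problem; the in-class demonstration may well have been richer.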

  8. What actually happened...

  9. What actually happened...

  10. Now, let’s formalize this (board or writing projector)

  11. About this course ● Reinforcement Learning theory & practice ● Theory at the start and practice towards the end ● Syllabus = the course web page: https://www.eecs.tufts.edu/~jsinapov/teaching/comp150_RL/

  12. Where does RL fall within the field of Artificial Intelligence?

  13. Where does RL fall within the field of Artificial Intelligence? ● AI → ML → RL

  14. Where does RL fall within the field of Artificial Intelligence? ● AI → ML → RL ● Types of Machine Learning: – Supervised : learn from labeled examples – Unsupervised : learn from unlabeled examples – Reinforcement : learn through interaction

  15. Reduced Formalism

  16. Reduced Formalism (board or writing projector)

  17. Take-home Message ● Agent’s perspective: only the policy is under control ● State representation and reward function are given ● Focus on policy algorithms ● Appeal: program agents by just specifying goals ● Practice: need to pick state representation and reward function
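
A small sketch of the "program agents by just specifying goals" idea, under stated assumptions: a hypothetical five-cell corridor, two hand-written reward functions, and value iteration standing in for whatever policy algorithm is used. None of these names come from the slides. The planner is identical in both calls; only the reward function changes, and the resulting policy changes with it:

```python
N_STATES = 5                          # hypothetical corridor, cells 0..4
ACTIONS = {"left": -1, "right": +1}
GAMMA = 0.9                           # discount factor (assumed value)


def step(state, action):
    """Deterministic move, clipped at the corridor ends."""
    return min(max(state + ACTIONS[action], 0), N_STATES - 1)


def reward_reach_right(state, action, next_state):
    """Goal specification A: being at the right end is rewarded."""
    return 1.0 if next_state == N_STATES - 1 else 0.0


def reward_reach_left(state, action, next_state):
    """Goal specification B: being at the left end is rewarded."""
    return 1.0 if next_state == 0 else 0.0


def greedy_policy(reward_fn, sweeps=100):
    """Run value iteration, then return the greedy action for each state."""
    v = [0.0] * N_STATES

    def backup(s, a):
        s2 = step(s, a)
        return reward_fn(s, a, s2) + GAMMA * v[s2]

    for _ in range(sweeps):
        # Synchronous sweep: every backup reads the previous value table.
        v = [max(backup(s, a) for a in ACTIONS) for s in range(N_STATES)]
    return [max(ACTIONS, key=lambda a: backup(s, a)) for s in range(N_STATES)]


policy_right = greedy_policy(reward_reach_right)
policy_left = greedy_policy(reward_reach_left)
```

The state representation (a cell index) and both reward functions were chosen by hand here, which is the practical burden the last bullet points to.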

  18. Example Applications

  19. Example Applications

  20. Reading Assignment ● Chapters 1 and 2 of Sutton and Barto ● Reading response on Canvas due 9/11 before class starts

  21. Programming Assignments ● Students are required to complete 4 minor programming assignments of their choosing ● Default options: programming exercises from Sutton and Barto (let’s look at some examples)
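
To give a flavor of what such an exercise might involve, here is a rough sketch in the spirit of the 10-armed bandit testbed from Chapter 2 of Sutton and Barto: epsilon-greedy action selection with sample-average value estimates. The slides do not say which exercises were shown in class, so this is only an illustrative guess; the function name `run_bandit` and its parameters are hypothetical.

```python
import random


def run_bandit(k=10, steps=1000, epsilon=0.1, seed=0):
    """Epsilon-greedy with sample-average estimates on one k-armed bandit."""
    rng = random.Random(seed)
    # True action values, drawn from N(0, 1) and hidden from the agent.
    true_values = [rng.gauss(0.0, 1.0) for _ in range(k)]
    estimates = [0.0] * k
    counts = [0] * k
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(k)                             # explore
        else:
            arm = max(range(k), key=lambda a: estimates[a])    # exploit
        # Noisy reward centered on the chosen arm's true value.
        reward = rng.gauss(true_values[arm], 1.0)
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return true_values, estimates, counts, total_reward / steps


true_values, estimates, counts, average_reward = run_bandit()
```

Averaging `average_reward` over many seeds and comparing several values of `epsilon` would be a natural next step for an assignment of this kind.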

  22. Discussion Moderation ● Each student will lead a reading discussion once during the semester ● Students can team up in pairs ● Sign-up sheet will be posted to Canvas tonight ● Extra credit for anyone who volunteers for slots in the next week ● Presentation materials, notes, or a description of what will be discussed should be emailed to me 48 hours before the class

  23. Next time...

  24. COMP 150: Reinforcement Learning

  25. Domains and Applications

  26. Curriculum Learning ● Example: QuickChess game variants

  27. The Curriculum Learning Problem ● Task = MDP ● Diagram labels: Environment, Agent, State, Action, Reward, Task Creation, Sequencing, Transfer Learning, Target task [Narvekar et al. 2016]

  28. Textbook The authors have made the book available: http://incompleteideas.net/book/bookdraft2017nov5.pdf

  29. Course Organization ● Taught as a seminar: students take turns presenting the readings ● Will cover both theory and practice ● Final projects – you will complete a project in which you ask (and then answer) a relevant RL research question
