Prof. Sameer Singh CS 175: PROJECTS IN AI (IN MINECRAFT) WINTER


  1. Prof. Sameer Singh CS 175: PROJECTS IN AI (IN MINECRAFT) WINTER 2017 April 6, 2017

  2. Upcoming…
     Misc.
     • Check out course webpage and schedule
     • Check out Canvas, especially for deadlines
     • Do the survey by tomorrow, April 7, 2017
     Homework
     • Homework 1 will be up soon
     • Meanwhile, install and get Malmo working
     • Due: April 14, 2017
     Project
     • Teams are due April 17, 2017; proposals April 21, 2017
     • Start assembling teams now! (use Piazza)
     • Start thinking of project ideas
     CS 175: PROJECTS IN AI (SPRING 2017)

  3. Projects in AI in Minecraft
     • Project Overview
     • Some Project Ideas
     • Introduction to Reinforcement Learning

  4. Projects in AI in Minecraft
     • Project Overview
     • Some Project Ideas
     • Introduction to Reinforcement Learning

  5. What is AI?
     "Artificial intelligence is anything computers can't do yet." - Douglas Hofstadter
     https://en.wikipedia.org/wiki/AI_effect

  6. What can a project be?
     • Research: do difficult things automatically; Minecraft is just a testbed
     • Practical tool: help players do things that are otherwise time-consuming
     • “Art”: use AI/ML to create stuff in the world; just cool!

  7. Technical Solution
     Use Artificial Intelligence or Machine Learning algorithms
     Artificial Intelligence:
     • Heuristic/Adversarial/Local Search
     • Logic
     • Planning
     • Bayesian Networks
     • Natural Language Processing
     • Computer Vision
     • Constraint Satisfaction
     Machine Learning:
     • Supervised Learning
     • Unsupervised Learning
     • Reinforcement Learning
     • Recommendation Systems
     • Computer Vision
     • Deep Learning
     • Time Series Modeling

  8. Evaluation: How would YOU define that your project was a success?
     Quantitative Evaluation
     Numerical Metrics:
     • Accuracy, F1, AUC, …
     • Time to “run”, time to “train”
     Baselines:
     • What would be currently used?
     • What are reasonable “simpler” methods?
     By how much?
     • We hope to improve the METRIC by AMOUNT over BASELINE!
     • (I won’t hold you to it, just want you to think about it)
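The "improve the METRIC by AMOUNT over BASELINE" template can be made concrete with a small sketch. The labels below are made up for illustration, and the majority-class baseline is just one assumed example of a "simpler" method, not something the slide prescribes.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def majority_baseline(y_train, n):
    """Simple baseline: always predict the most frequent training label."""
    most_common = max(set(y_train), key=y_train.count)
    return [most_common] * n


# Hypothetical data, for illustration only.
y_train = [0, 0, 0, 1, 1]
y_true = [1, 0, 1, 1, 0, 0, 0, 0]
y_model = [1, 0, 1, 0, 0, 0, 0, 1]
y_base = majority_baseline(y_train, len(y_true))

# METRIC = accuracy, BASELINE = majority class, AMOUNT = the difference.
improvement = accuracy(y_true, y_model) - accuracy(y_true, y_base)
print(f"Accuracy improves by {improvement:.3f} over the majority baseline")
```

Reporting F1 next to accuracy is useful here because the majority baseline never predicts the positive class, so its F1 is 0 even though its accuracy looks respectable.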

  9. Evaluation: How would YOU define that your project was a success?
     Qualitative Evaluation
     Simple Example Cases:
     • What are examples that your idea will “definitely” work on?
     • What is the expected output on these?
     Error Analysis and Introspection:
     • Are there plots/figures to verify the behavior?
     • If it doesn’t work, how will you improve it?
     The Super-Impressive Example:
     • What is the best example? “awesome if it works”
     • E.g. something that perfectly captures your idea!

  10. You will have doubts!
      • Is it too ambitious?
      • Is it too simple?
      • Is there data to train my classifier?
      • Is my evaluation inappropriate?
      • Is there a different algorithm I should use?
      • Can I only use off-the-shelf code?
      Use Piazza! Every team has to meet me during Week 4.
      Discussion will cover many simple situations.
      Both the TA and I are available for appointments.

  11. Projects in AI in Minecraft
      • Project Overview
      • Some Project Ideas
      • Introduction to Reinforcement Learning

  12. Projects in AI in Minecraft
      • Course Information
      • Some Project Ideas
      • Introduction to Reinforcement Learning

  13. Reinforcement Learning
      Agent learns to do things by trying things, and succeeding/failing
      Navigation:
      • Explore the map without dying
      • Solve mazes
      • Learn the best way home from anywhere
      • Get to the highest hill in the map
      Learn Recipes:
      • Figure out best way to make items
      • Without any knowledge of the recipes
      Combat:
      • Learn to hide/find shelter
      • Learn to fight, example paper: http://alekhagarwal.net/arxiv_geql.pdf

  14. Reinforcement Learning
      Agent learns to do things by trying things, and succeeding/failing
      • Observation: what the agent sees
      • Action: what the agent can do
      • Reward: what the agent likes/dislikes
        New Item: ++, Goal: ++, No Item: -, Died: ---
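One way to read the reward column is as a lookup table from events to numbers. The sketch below assumes hypothetical event names and numeric values chosen only to mirror the ++, - and --- marks on the slide; a real project would pick its own scale.

```python
# Assumed numeric values mirroring the slide's ++ / - / --- marks.
REWARDS = {
    "new_item": 2.0,   # New Item: ++
    "goal": 2.0,       # Goal: ++
    "no_item": -1.0,   # No Item: -
    "died": -3.0,      # Died: ---
}


def reward_for(events):
    """Sum the rewards of all events observed in one time step.

    Unknown events contribute 0, i.e. the agent is indifferent to them.
    """
    return sum(REWARDS.get(event, 0.0) for event in events)


# Example: the agent picked up a new item but died in the same step.
print(reward_for(["new_item", "died"]))
```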

  15. Reinforcement Learning
      Next few lectures will go into details (and more ideas).
      For now, let’s look at non-RL ideas.

  16. Describe the Scene
      • “Houses and a pig on a grassy field during the day.”
      • “Pig staring at me in a village.”

  17. Live Commentator
      “Hit a rabbit”

  18. How is this even possible?
      • Specified in Malmo: 3 blocks in a line; grass blocks as floor; daylight, clear weather
      • Training signal: the known description, “3 blocks in a line”
      • Machine Learning: Deep Learning, CNN + LSTM

  19. Many Variations of These
      [Diagram: Your code; Agent/World in Malmo; Render; “Label”; Machine Learning; multipliers x1000, x100000, x100000]
      “Label” and the corresponding task:
      • object: object detection
      • objects: ~caption generation
      • action: ~action detection, “commentary”
      • depth of pixel: ~stereoscopy, depth/distance prediction

  20. Captions to Speech
      Why are you making me read?
      “Pig staring at me in a village.”

  21. Natural Language Navigation
      From trivial to hardest:
      > Go forward till you hit a wall
      > Go to the pig
      > Go to the house on the right
      > Go behind the house
      Quite Difficult!

  22. Natural Language Interface
      From trivial to hardest:
      > Choose steel pickaxe and dig
      > Go and destroy that window
      > Put the blue block on the closest wall
      > Find a tree and chop it
      Quite Difficult!

  23. SHRDLU (from 1970!)
      http://hci.stanford.edu/winograd/shrdlu/

  24. Natural Speech to Commands
      Why are you making me type?
      • Off-the-shelf Speech to Text systems
      • Online Speech to Text APIs

  25. Photo to Minecraft Character
      Input: photo of a person. Your project. Output: Minecraft skin.
      • Need to label data?
      • Can you use existing classifiers, like Visual QA?

  26. Recipe Planners
      Input: Inventory and “Need”(s). Output: Steps.
      > Get 2 wood planks
      > Make a stick
      > Get 2 diamonds
      > Make diamond sword
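A minimal sketch of such a planner, assuming a hand-written recipe table: the `RECIPES` dictionary and item names are hypothetical and simplified to match the slide (one crafting step yields one item), not the game's actual recipes or any Malmo API.

```python
# Hypothetical, simplified recipes: item -> {ingredient: quantity}.
RECIPES = {
    "stick": {"wood plank": 2},
    "diamond sword": {"stick": 1, "diamond": 2},
}


def _plan(item, count, inventory, steps):
    """Append to `steps` what is needed to obtain `count` of `item`."""
    # Use up whatever is already in the inventory first.
    used = min(inventory.get(item, 0), count)
    inventory[item] = inventory.get(item, 0) - used
    missing = count - used
    if missing == 0:
        return
    if item not in RECIPES:
        # Raw material: it has to be gathered in the world.
        steps.append(f"Get {missing} {item}")
        return
    for _ in range(missing):
        # Craftable item: obtain each ingredient, then craft.
        for ingredient, quantity in RECIPES[item].items():
            _plan(ingredient, quantity, inventory, steps)
        steps.append(f"Make {item}")


def make_plan(need, inventory=None):
    """Return the ordered list of steps to obtain one `need`."""
    steps = []
    _plan(need, 1, dict(inventory or {}), steps)  # copy: caller's dict is untouched
    return steps


for step in make_plan("diamond sword"):
    print(">", step)
```

With an empty inventory this reproduces the four steps on the slide; calling `make_plan("diamond sword", {"stick": 1})` skips the wood and stick steps because the stick is already available.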

  27. Lots of other possibilities
      • Many other games in Minecraft
      • Create AI for those?
      • One AI that works for all of those?
      http://www.planetminecraft.com/

  28. Projects in AI in Minecraft
      • Course Information
      • Some Project Ideas
      • Introduction to Reinforcement Learning

  29. Projects in AI in Minecraft
      • Course Information
      • Some Project Ideas
      • Introduction to Reinforcement Learning (based on slides by David Silver)

  30. Reinforcement Learning

  31. What makes it different?
      • No direct supervision, only rewards
      • Feedback is delayed, not instantaneous
      • Time really matters, i.e. data is sequential
      • Agent’s actions affect what data it will receive
      Examples:
      • Fly stunt maneuvers in a helicopter
      • Defeat the world champion at Backgammon
      • Manage an investment portfolio
      • Control a power station
      • Make a humanoid robot walk
      • Play many different Atari games better than humans
      • Beat the world champion in Go

  32. Agent-Environment Interface
      Agent:
      • decides on an action
      • receives next observation
      • receives next reward
      Environment:
      • executes the action
      • computes next observation
      • computes next reward
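The interface above can be sketched as a loop between two objects. Everything here is a toy assumption for illustration: a 1-D corridor environment, the reward values, and the class and method names (`step`, `act`) are hypothetical and are not Malmo's API.

```python
import random


class CorridorEnvironment:
    """Toy world: cells 0..length-1, the goal is the rightmost cell."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def step(self, action):
        # Execute the action (-1 = left, +1 = right), staying inside the corridor.
        self.position = max(0, min(self.length - 1, self.position + action))
        # Compute the next observation and the next reward.
        observation = self.position
        done = self.position == self.length - 1
        reward = 10.0 if done else -1.0  # assumed values: goal bonus, step cost
        return observation, reward, done


class RandomAgent:
    """Tries things: picks left or right at random."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def act(self, observation):
        return self.rng.choice([-1, +1])


class AlwaysRightAgent:
    """Fixed policy, used as a reference for the best possible total."""

    def act(self, observation):
        return +1


def run_episode(agent, env, max_steps=100):
    """Alternate agent and environment; return the cumulative reward."""
    observation, total = env.position, 0.0
    for _ in range(max_steps):
        action = agent.act(observation)                # agent decides on an action
        observation, reward, done = env.step(action)   # environment responds
        total += reward
        if done:
            break
    return total


print(run_episode(AlwaysRightAgent(), CorridorEnvironment(5)))
print(run_episode(RandomAgent(seed=0), CorridorEnvironment(5)))
```

The random agent can never beat the always-right policy in this corridor, since every wasted step costs reward; a learning agent would sit between the two and improve over episodes.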

  33. Reward, R_t
      • +, positive (Good)
      • -, negative (Bad)
      • How well the agent is doing
      • Nothing about WHY it is doing well; could have little to do with A_{t-1}
      Agent is trying to maximize its cumulative reward

  34. Example of Rewards
      • Fly stunt maneuvers in a helicopter
        • +ve reward for following desired trajectory
        • −ve reward for crashing
      • Defeat the world champion at Backgammon
        • +/−ve reward for winning/losing a game
      • Manage an investment portfolio
        • +ve reward for each $ in bank
      • Control a power station
        • +ve reward for producing power
        • −ve reward for exceeding safety thresholds
      • Make a humanoid robot walk
        • +ve reward for forward motion
        • −ve reward for falling over
      • Play many different Atari games better than humans
        • +/−ve reward for increasing/decreasing score

  35. Sequential Decision Making
      • Actions have long term consequences
      • Rewards may be delayed
      • May be better to sacrifice short term reward for long term benefit
      Examples:
      • A financial investment (may take months to mature)
      • Refuelling a helicopter (might prevent a crash later)
      • Blocking opponent moves (might eventually help win)
      • Spend a lot of money and go to college (earn more later)
      • Don’t commit crimes (rewarded by not going to jail)
      • Get started on Malmo/project soon (make it an easy quarter)
      A key aspect of intelligence: how far ahead are you able to plan?
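A small worked example of trading short term reward for long term benefit. The two reward sequences are invented numbers for illustration; the point is only that which choice looks better depends on how far ahead the rewards are summed.

```python
def cumulative_reward(rewards, horizon=None):
    """Sum of rewards over the first `horizon` steps (all steps if None)."""
    return sum(rewards if horizon is None else rewards[:horizon])


# Hypothetical per-step rewards for two choices.
greedy = [5, 0, 0, 0, 0]      # take a small payoff immediately
patient = [-2, -2, 0, 0, 20]  # pay a cost now (e.g. college), earn more later

for horizon in (2, 5):
    g = cumulative_reward(greedy, horizon)
    p = cumulative_reward(patient, horizon)
    better = "greedy" if g > p else "patient"
    print(f"Planning {horizon} steps ahead: greedy={g}, patient={p}, prefer {better}")
```

An agent that only looks two steps ahead prefers the greedy choice, while one that looks five steps ahead prefers the patient one.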
