- Prof. Sameer Singh
CS 175: PROJECTS IN AI (IN MINECRAFT) WINTER 2017
April 6, 2017
Upcoming
Check out course webpage and schedule
Check out Canvas, especially for deadlines
Misc.
Do the survey by tomorrow, April 7, 2017
CS 175: PROJECTS IN AI (SPRING 2017) 2
Misc.
Homework
Project
https://en.wikipedia.org/wiki/AI_effect
Research: Do difficult things automatically, Minecraft is just a testbed
Practical Tool: Help players do things that are otherwise time-consuming
“Art”: Use AI/ML to create stuff in the world
Just cool!
Use Artificial Intelligence or Machine Learning algorithms
Artificial Intelligence, Machine Learning, Natural Language Processing, Computer Vision
Supervised Learning, Unsupervised Learning, Deep Learning, Reinforcement Learning
Heuristic/Adversarial/Local Search, Planning, Constraint Satisfaction, Logic
Bayesian Networks, Time Series Modeling, Recommendation Systems
How would YOU define that your project was a success?
Quantitative Evaluation
Numerical Metrics:
Baselines:
By how much?
We hope to improve the METRIC by AMOUNT over BASELINE!
(I won’t hold you to it, just want you to think about it)
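The "improve the METRIC by AMOUNT over BASELINE" statement can be made concrete with a few lines of arithmetic. This is only a minimal sketch: the function name `improvement` and the example numbers are made up for illustration and are not part of the course material.

```python
def improvement(metric: float, baseline: float, higher_is_better: bool = True) -> dict:
    """Absolute and relative improvement of your system's metric over a baseline."""
    # Flip the sign for metrics where smaller is better (e.g. number of steps taken).
    delta = metric - baseline if higher_is_better else baseline - metric
    relative = delta / abs(baseline) if baseline != 0 else float("nan")
    return {"absolute": delta, "relative": relative}


# Hypothetical numbers: the agent reaches the goal in 75% of episodes,
# while a random-action baseline reaches it in 50% of episodes.
result = improvement(metric=0.75, baseline=0.5)
print(f"Improved by {result['absolute']:.2f} ({result['relative']:.0%} relative) over the baseline")
```

Stating the metric, the baseline, and the target amount up front makes it easy to check later whether the project met its own definition of success.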
How would YOU define that your project was a success?
Qualitative Evaluation
Simple Example Cases:
will “definitely” work on?
Error Analysis and Introspection:
The Super-Impressive Example
Every team has to meet me during Week 4.
Is it too simple? Is it too ambitious? Is my evaluation inappropriate?
Can I only use off-the-shelf code? Is there data to train my classifier? Is there a different algorithm I should use?
Both the TA and I are available for appointments
Discussion will cover many simple situations
Use Piazza!
Navigation, Learn Recipes, Combat
Agent learns to do things by trying things, and succeeding/failing
http://alekhagarwal.net/arxiv_geql.pdf
Agent learns to do things by trying things, and succeeding/failing
Observation: What the agent sees
Action: What the agent can do
Reward: What the agent likes/dislikes
New Item++ No Item- Goal++ Died---
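One way to read the reward examples above is as a lookup table from events to numbers. The sketch below assumes specific magnitudes (the slide only gives "++", "-" and "---"), and the event names and the `reward_for` helper are hypothetical.

```python
# Assumed magnitudes: the slide only shows signs, so these numbers are illustrative.
REWARDS = {
    "new_item": 2.0,   # "New Item++"
    "no_item": -1.0,   # "No Item-"
    "goal": 2.0,       # "Goal++"
    "died": -3.0,      # "Died---"
}


def reward_for(events):
    """Sum the rewards of all events that happened in one time step."""
    return sum(REWARDS.get(event, 0.0) for event in events)


# A made-up episode: one list of events per time step.
episode = [["no_item"], ["new_item"], ["no_item"], ["goal"]]
step_rewards = [reward_for(events) for events in episode]
total = sum(step_rewards)
print(step_rewards, total)
```

The relative sizes matter more than the exact values: here dying is assumed to cost more than a new item gains, which is one possible design choice among many.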
Next few lectures will go into details (and more ideas) For now, let’s look at non-RL ideas
Houses and a pig on a grassy field during the day. Pig staring at me in a village.
“Hit a rabbit”
3 blocks in a line
Grass blocks as floor
Daylight, clear weather
Malmo
Machine Learning: Deep Learning, CNN + LSTM
Training Signal: “3 blocks in a line”
[Diagram: Your code, “Label”, Agent/World in Malmo, Render, Machine Learning, “Label”; x1000, x100000, x100000]
~caption generation
action: ~action detection, “commentary”
depth of pixel: ~stereoscopy, depth/distance prediction
Why are you making me read? Pig staring at me in a village.
Quite Difficult!
(trivial)
> Go forward till you hit a wall
> Go to the pig
> Go to the house on the right
> Go behind the house
(hardest)
Quite Difficult!
(trivial)
> Choose steel pickaxe and dig
> Go and destroy that window
> Put the blue block on the closest wall
> Find a tree and chop it
(hardest)
http://hci.stanford.edu/winograd/shrdlu/
Why are you making me type?
Off the shelf Speech to Text systems
Online Speech to Text APIs
Photo of a person
Minecraft Skin
Your Project
Need to label data?
Can you use existing classifiers, like Visual QA?
Inventory
“Need”(s)
Steps
> Get 2 wood planks
> Make a stick
> Get 2 diamonds
> Make diamond sword
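Turning an inventory and a "need" into a list of steps can be sketched as a small recursive planner. The recipe table below is a simplified, hypothetical stand-in (it ignores crafting-table layout and how many units a real recipe yields), and the `plan` function and item names are assumptions made for illustration.

```python
from collections import Counter

# Simplified, hypothetical recipes: item -> ingredients needed for one unit.
RECIPES = {
    "stick": {"wood_plank": 2},
    "diamond_sword": {"stick": 1, "diamond": 2},
}


def plan(need, count, inventory, steps):
    """Append to `steps` what is required to obtain `count` units of `need`."""
    # Use up whatever the inventory already holds.
    have = min(inventory[need], count)
    inventory[need] -= have
    missing = count - have
    if missing == 0:
        return
    if need not in RECIPES:
        # Raw material: it has to be collected in the world.
        steps.append(f"Get {missing} {need}")
        return
    for _ in range(missing):
        for ingredient, quantity in RECIPES[need].items():
            plan(ingredient, quantity, inventory, steps)
        steps.append(f"Make {need}")


steps = []
plan("diamond_sword", 1, Counter(), steps)
print(steps)
# ['Get 2 wood_plank', 'Make stick', 'Get 2 diamond', 'Make diamond_sword']
```

With an empty inventory this reproduces the four steps listed above; starting with a stick already in the inventory would skip the first two.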
http://www.planetminecraft.com/
Many other games in Minecraft
Create AI for those?
One AI that works for all of those?
Based on slides by David Silver
No direct supervision, only rewards
Feedback is delayed, not instantaneous
Time really matters, i.e. data is sequential
Agent’s actions affect what data it will receive
Examples
Agent
Environment
How well the agent is doing +, positive (Good)
Nothing about WHY it is doing well, could have little to do with $A_{t-1}$
Agent is trying to maximize its cumulative reward
Actions have long term consequences
Rewards may be delayed
May be better to sacrifice short term reward for long term benefit
Examples
A key aspect of intelligence, how far ahead are you able to plan?
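The trade-off between short term reward and long term benefit shows up when two reward sequences are summed. The sequences below are invented for illustration; only the idea of comparing cumulative reward comes from the slides.

```python
from itertools import accumulate


def cumulative_reward(rewards):
    """Total reward collected over an episode."""
    return sum(rewards)


# Hypothetical reward sequences over five time steps.
greedy = [1, 1, 1, 0, 0]       # grabs small rewards right away
patient = [-1, -1, 0, 0, 10]   # pays a cost first, gets a large reward later

print(list(accumulate(greedy)))   # running totals: [1, 2, 3, 3, 3]
print(list(accumulate(patient)))  # running totals: [-1, -2, -2, -2, 8]
print(cumulative_reward(greedy), cumulative_reward(patient))
```

The patient sequence is behind for the first four steps and only comes out ahead at the end, which is why an agent that plans just one or two steps ahead would never choose it.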
Given an environment (produces observations and rewards)
Reinforcement Learning
Automated agent that selects actions to maximize total rewards in the environment
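The interaction between an environment and an agent can be sketched as a simple loop. The `CorridorEnv` class and its reward values are a hypothetical toy, not Malmo and not anything defined in the course; the `reset`/`step` method names are just a common convention assumed here.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: walk right along a corridor to reach the goal."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position  # the observation is simply the current position

    def step(self, action):
        # action is -1 (step left) or +1 (step right); walls clamp the position.
        self.position = max(0, min(self.length, self.position + action))
        done = self.position == self.length
        reward = 10.0 if done else -1.0  # assumed values: goal bonus, per-step cost
        return self.position, reward, done


def run_episode(env, choose_action, max_steps=100):
    """Run one episode and return the total reward collected by the agent."""
    observation = env.reset()
    total = 0.0
    for _ in range(max_steps):
        action = choose_action(observation)
        observation, reward, done = env.step(action)
        total += reward
        if done:
            break
    return total


def always_right(observation):
    return 1


def random_agent(observation):
    return random.choice([-1, 1])


print(run_episode(CorridorEnv(), always_right))  # 6.0: four steps at -1, then +10
print(run_episode(CorridorEnv(), random_agent))  # varies from run to run
```

A learning algorithm would sit inside `choose_action` and adjust it between episodes so that the total reward goes up.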
What does the choice of action depend on?
History: everything that happened so far
$H_t = O_1 R_1 A_1 O_2 R_2 A_2 O_3 R_3, \ldots, A_{t-1} O_t R_t$
State, $S_t$, can be
$O_t$
$O_t R_t$
$A_{t-1} O_t R_t$
$O_{t-3} O_{t-2} O_{t-1} O_t$
In general, $S_t = f(H_t)$
You, as AI designer, specify this function
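Each choice of state listed above is a different function of the history. The sketch below stores the history as a list of (observation, reward, action) records; the `Step` type, the function names, and the example observations are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Optional


class Step(NamedTuple):
    observation: str
    reward: float
    action: Optional[str]  # None for the current step: the next action is not chosen yet


def state_last_observation(history: List[Step]):
    """State is the latest observation only."""
    return history[-1].observation


def state_observation_reward(history: List[Step]):
    """State is the latest observation and reward."""
    return (history[-1].observation, history[-1].reward)


def state_action_observation_reward(history: List[Step]):
    """State is the previous action plus the latest observation and reward."""
    previous_action = history[-2].action if len(history) > 1 else None
    return (previous_action, history[-1].observation, history[-1].reward)


def state_last_k_observations(history: List[Step], k: int = 4):
    """State is the last k observations."""
    return tuple(step.observation for step in history[-k:])


# A made-up history of five time steps.
history = [
    Step("grass", 0.0, "forward"),
    Step("grass", 0.0, "forward"),
    Step("tree", 0.0, "left"),
    Step("pig", 1.0, "forward"),
    Step("wall", -1.0, None),
]
print(state_last_observation(history))
print(state_action_observation_reward(history))
print(state_last_k_observations(history))
```

Richer states keep more of the history and can support better decisions, at the cost of a larger state space to learn over.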
Current state $S_t$, next action $A_t$
Deterministic Policy: $A_t = \pi(S_t)$
Stochastic Policy: $\pi(a \mid s) = P(A_t = a \mid S_t = s)$
Good policy: Leads to larger cumulative reward
Bad policy: Leads to worse cumulative reward
(we will explore this more in the next week)
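The two kinds of policy can be sketched as a lookup table and a table of probabilities. The states, actions, and probability values below are invented for illustration; only the distinction between a deterministic and a stochastic policy comes from the slide.

```python
import random


def deterministic_policy(state):
    """Deterministic policy: each state maps to exactly one action."""
    table = {"wall": "left", "pig": "forward"}  # hypothetical state -> action table
    return table.get(state, "forward")


# Stochastic policy: for each state, a probability for every action (assumed values).
STOCHASTIC = {
    "wall": {"forward": 0.0, "left": 0.5, "right": 0.5},
    "pig": {"forward": 0.5, "left": 0.25, "right": 0.25},
}


def stochastic_policy(state, rng=random):
    """Sample an action a with probability pi(a | s) for the given state s."""
    probabilities = STOCHASTIC[state]
    actions = list(probabilities)
    weights = [probabilities[action] for action in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


print(deterministic_policy("wall"))  # always "left"
print(stochastic_policy("wall"))     # "left" or "right", never "forward"
```

A stochastic policy is one simple way to keep trying different actions, since the same state does not always lead to the same choice.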
Rules are unknown
Dynamics are unknown
https://www.youtube.com/watch?v=V1eYniJ0Rnk
https://www.youtube.com/watch?v=CIF2SBVY-J0
https://www.youtube.com/watch?v=I2WFvGl4y8c