Lecture 40 – Final Exam Review
Mark Hasegawa-Johnson 5/6/2020
Some sample problems:
- DNNs: Practice Final, question 23
- Reinforcement learning: Practice Final, question 24
- Games: Practice Final, question 25
- Game theory: Practice …
You have a two-layer neural network trained as an animal classifier. The input feature vector is $\vec{y} = [y_1, y_2, y_3, 1]$, where $y_1$, $y_2$, and $y_3$ are some features, and the 1 is multiplied by the bias. There are two hidden nodes, and three output nodes, $\vec{z} = [z_1, z_2, z_3]$, corresponding to the three output classes $z_1 = \Pr(\text{dog}\mid\vec{y})$, $z_2 = \Pr(\text{cat}\mid\vec{y})$, $z_3 = \Pr(\text{skunk}\mid\vec{y})$. Hidden node activations are sigmoid; output node activations are softmax.
[Figure: the two-layer network. Inputs $y_1, y_2, y_3$ and a constant 1; weights $w_{ij}$; hidden nodes $h_1, h_2$; three softmax output nodes. Maltese puppy photo by http://www.birdphotos.com, own work, CC BY 3.0, https://commons.wikimedia.org/w/index.php?curid=4409510]
(a) A Maltese puppy has feature vector $\vec{y} = [2, 20, -1, 1]$. All weights and biases are initialized to zero. What is $\vec{z}$?
(a) A Maltese puppy has feature vector $\vec{y} = [2, 20, -1, 1]$. All weights and biases are initialized to zero. What is $\vec{z}$?

Hidden node excitations are both: $0 \times \vec{y} = 0$. Therefore, hidden node activations are both:

$\frac{1}{1+e^{-0}} = \frac{1}{1+1} = \frac{1}{2}$
(a) A Maltese puppy has feature vector $\vec{y} = [2, 20, -1, 1]$. All weights and biases are initialized to zero. What is $\vec{z}$?

Output node excitations are all: $0 \times \vec{h} = 0$. Therefore, output node activations are all:

$\frac{e^0}{\sum_{k=1}^{3} e^0} = \frac{1}{3}$
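As a quick sanity check, here is a minimal Python sketch of this forward pass (sigmoid hidden layer, softmax output layer, all weights zero); the function and variable names are mine, not part of the problem:

```python
import numpy as np

def sigmoid(e):
    return 1.0 / (1.0 + np.exp(-e))

def softmax(e):
    exps = np.exp(e - np.max(e))  # subtract the max for numerical stability
    return exps / np.sum(exps)

y = np.array([2.0, 20.0, -1.0, 1.0])  # feature vector, with the bias 1 appended
W1 = np.zeros((2, 4))                 # input-to-hidden weights, all zero
W2 = np.zeros((3, 2))                 # hidden-to-output weights, all zero

h = sigmoid(W1 @ y)   # -> [0.5, 0.5]
z = softmax(W2 @ h)   # -> [1/3, 1/3, 1/3]
print(h, z)
```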
(b) Let $w_{ij}$ be the weight connecting the $i$th output node to the $j$th hidden node. What is $\frac{\partial z_i}{\partial w_{ij}}$? Write your answer in terms of $z_i$, $w_{ij}$, and/or $h_j$ for appropriate values of $i$ and/or $j$.
(b) What is $\frac{\partial z_i}{\partial w_{ij}}$?

Answer: OK, first we need the definition of softmax. Let's write it in lots of parts, so it will be easier to differentiate.

$z_i = \frac{\text{num}}{\text{den}}$

where "num" is the numerator of the softmax function:

$\text{num} = \exp(\varepsilon_i)$

"den" is the denominator of the softmax function:

$\text{den} = \sum_{k=1}^{3} \exp(\varepsilon_k)$

And both of those are written in terms of the softmax excitations, let's call them $\varepsilon_k$:

$\varepsilon_k = \sum_j w_{kj} h_j$
(b) What is $\frac{\partial z_i}{\partial w_{ij}}$?

Now we differentiate each part:

$\frac{\partial z_i}{\partial w_{ij}} = \frac{1}{\text{den}}\frac{\partial\,\text{num}}{\partial w_{ij}} - \frac{\text{num}}{\text{den}^2}\frac{\partial\,\text{den}}{\partial w_{ij}}$

$\frac{\partial\,\text{num}}{\partial w_{ij}} = \exp(\varepsilon_i)\frac{\partial \varepsilon_i}{\partial w_{ij}}$

$\frac{\partial\,\text{den}}{\partial w_{ij}} = \sum_k \exp(\varepsilon_k)\frac{\partial \varepsilon_k}{\partial w_{ij}} = \exp(\varepsilon_i)\frac{\partial \varepsilon_i}{\partial w_{ij}}$

(only $\varepsilon_i$ depends on $w_{ij}$, so only the $k=i$ term survives)

$\frac{\partial \varepsilon_i}{\partial w_{ij}} = h_j$
(b) What is $\frac{\partial z_i}{\partial w_{ij}}$?

Putting it all back together again:

$\frac{\partial z_i}{\partial w_{ij}} = \frac{1}{\sum_{k=1}^{3}\exp(\varepsilon_k)}\exp(\varepsilon_i)h_j - \frac{\exp(\varepsilon_i)}{\left(\sum_{k=1}^{3}\exp(\varepsilon_k)\right)^2}\exp(\varepsilon_i)h_j$

$\frac{\partial z_i}{\partial w_{ij}} = z_i h_j - z_i^2 h_j$
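This result is easy to check numerically. A minimal sketch (not part of the exam solution; the weights and hidden activations are random stand-ins): perturb one weight, recompute the softmax, and compare the finite difference against $z_i h_j - z_i^2 h_j$:

```python
import numpy as np

def softmax(e):
    exps = np.exp(e - np.max(e))
    return exps / np.sum(exps)

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))   # output weights w_ij
h = rng.normal(size=2)        # hidden activations h_j
i, j, eps = 0, 1, 1e-6

z = softmax(W @ h)
analytic = z[i] * h[j] - z[i] ** 2 * h[j]   # z_i h_j - z_i^2 h_j

W_perturbed = W.copy()
W_perturbed[i, j] += eps
numeric = (softmax(W_perturbed @ h)[i] - z[i]) / eps

print(analytic, numeric)   # the two values should agree closely
```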
A cat lives in a two-room apartment, and can choose between two actions: purr, or walk. It starts in room $s_0 = 1$, where it receives the reward $r_0 = 2$ (petting). It then implements the following sequence of actions: $a_0 = \text{walk}$, $a_1 = \text{purr}$. In response, it observes the following sequence of states and rewards: $s_1 = 2$, $r_1 = 5$ (food), $s_2 = 2$.

(a) The cat starts out with a Q-table whose entries are all $Q(s,a) = 0$. It performs TD-learning using each of the two SARS sequences described above, with a relatively low learning rate ($\alpha = 0.05$) and a relatively low discount factor ($\gamma = 3/4$). Which entries in the Q-table have changed, after this learning, and what are their new values?
Time step 0: $\text{SARS} = (1, \text{walk}, 2, 2)$

$Q_{\text{local}} = r(1) + \gamma \max_a Q(2,a) = 2 + \tfrac{3}{4}\max(0,0) = 2$

$Q(1,\text{walk}) = Q(1,\text{walk}) + \alpha\left(Q_{\text{local}} - Q(1,\text{walk})\right) = 0 + 0.05 \times (2 - 0) = 0.1$

Time step 1: $\text{SARS} = (2, \text{purr}, 5, 2)$

$Q_{\text{local}} = r(2) + \gamma \max_a Q(2,a) = 5 + \tfrac{3}{4}\max(0,0) = 5$

$Q(2,\text{purr}) = Q(2,\text{purr}) + \alpha\left(Q_{\text{local}} - Q(2,\text{purr})\right) = 0 + 0.05 \times (5 - 0) = 0.25$
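The same two updates, as a minimal Python sketch (the table layout and function name are mine; actions are encoded as walk=0 and purr=1):

```python
import numpy as np

alpha, gamma = 0.05, 0.75
Q = np.zeros((3, 2))   # rows: states (index 0 unused); columns: walk=0, purr=1

def td_update(s, a, r, s_next):
    q_local = r + gamma * np.max(Q[s_next])   # one-step TD target
    Q[s, a] += alpha * (q_local - Q[s, a])

td_update(1, 0, 2, 2)   # time step 0: (s=1, a=walk, r=2, s'=2) -> Q(1,walk) = 0.1
td_update(2, 1, 5, 2)   # time step 1: (s=2, a=purr, r=5, s'=2) -> Q(2,purr) = 0.25
print(Q)
```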
(b) The cat decides, instead, to use model-based learning. Based on these two observations, it estimates $P(s'|s,a)$ with Laplace smoothing, where the smoothing constant is $k=1$. Find $P(s'|2,\text{purr})$.

Time step 0: $\text{SARS} = (1, \text{walk}, 2, 2)$. Time step 1: $\text{SARS} = (2, \text{purr}, 5, 2)$.
(b) Find $P(s'|2,\text{purr})$.

$P(s'=1 \mid s=2, a=\text{purr}) = \frac{1 + \text{Count}(s=2, a=\text{purr}, s'=1)}{2 + \sum_{s'}\text{Count}(s=2, a=\text{purr}, s')} = \frac{1}{2+1} = \frac{1}{3}$

$P(s'=2 \mid s=2, a=\text{purr}) = \frac{1 + \text{Count}(s=2, a=\text{purr}, s'=2)}{2 + \sum_{s'}\text{Count}(s=2, a=\text{purr}, s')} = \frac{1+1}{2+1} = \frac{2}{3}$
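The same computation as a short Python sketch (the Counter-based bookkeeping is mine, not prescribed by the problem):

```python
from collections import Counter

k, states = 1, [1, 2]
transitions = [(1, "walk", 2), (2, "purr", 2)]   # (s, a, s') from the two SARS tuples
counts = Counter(transitions)

def p(s_next, s, a):
    # Laplace smoothing: add k to each count, and k * (number of states) to the total
    total = sum(counts[(s, a, sn)] for sn in states)
    return (k + counts[(s, a, s_next)]) / (k * len(states) + total)

print(p(1, 2, "purr"), p(2, 2, "purr"))   # -> 1/3, 2/3
```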
(c) The cat estimates $R(1)=2$, $R(2)=5$, and the following $P(s'|s,a)$ table. It chooses the policy $\pi(1)=\text{purr}$, $\pi(2)=\text{walk}$. What is the policy-dependent utility of each room? Write two equations in the two unknowns $U(1)$ and $U(2)$; don't solve.
P(s'|s,a):

                a=purr          a=walk
              s=1    s=2      s=1    s=2
     s'=1     2/3    1/3      1/3    2/3
     s'=2     1/3    2/3      2/3    1/3
(c) Answer: policy-dependent utility is just like Bellman's equation, but without the max operation. The equations are

$U(1) = R(1) + \gamma \sum_{s'} P\left(s' \mid s=1, \pi(1)\right) U(s')$

$U(2) = R(2) + \gamma \sum_{s'} P\left(s' \mid s=2, \pi(2)\right) U(s')$
(c) Answer: So to solve, we just plug in the values for all variables except $U(1)$ and $U(2)$:

$U(1) = 2 + \left(\tfrac{3}{4}\right)\left(\tfrac{2}{3}U(1) + \tfrac{1}{3}U(2)\right)$

$U(2) = 5 + \left(\tfrac{3}{4}\right)\left(\tfrac{2}{3}U(1) + \tfrac{1}{3}U(2)\right)$
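The exam says not to solve, but as a check, the two equations rearrange into the linear system $(I - \gamma P_\pi)U = R$, which numpy can solve (a sketch, not part of the required answer):

```python
import numpy as np

gamma = 0.75
R = np.array([2.0, 5.0])        # R(1), R(2)
P_pi = np.array([[2/3, 1/3],    # P(s'|s=1, purr): the row for pi(1) = purr
                 [2/3, 1/3]])   # P(s'|s=2, walk): the row for pi(2) = walk

# U = R + gamma * P_pi @ U  =>  (I - gamma * P_pi) @ U = R
U = np.linalg.solve(np.eye(2) - gamma * P_pi, R)
print(U)   # -> [11. 14.], i.e. U(1) = 11, U(2) = 14
```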
(d) Since it has some extra time, and excellent Python programming skills, the cat decides to implement deep reinforcement learning, using an actor-critic algorithm. Inputs are one-hot encodings of state and action. What are the input and output dimensions of the actor network, and of the critic network?
(d) Actor network is $\pi_a(s)$ = probability that action $a$ is the best action, where $a=1$ or $a=2$. So the output has two dimensions. Input is the state, $s$. If there are two states, encoded using a one-hot vector, then state 1 is encoded as $s = [1,0]$, and state 2 is encoded as $s = [0,1]$. So, two dimensions.
(d) Critic network is $Q(s,a)$ = quality of action $a$ in state $s$. Quality is a scalar (for any given action and state), so the output has one dimension. Input is the state, $s$, and the action, $a$. The problem statement says that each is a one-hot vector, so the input is $s = [1,0]$ or $s = [0,1]$ concatenated with $a = [1,0]$ or $a = [0,1]$, for a total of 4 dimensions.
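A minimal sketch of the two shapes (assuming PyTorch; the hidden-layer size of 8 is an arbitrary choice, not part of the problem):

```python
import torch
import torch.nn as nn

# Actor: 2-dim one-hot state in, 2-dim action distribution out.
actor = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 2), nn.Softmax(dim=-1))
# Critic: 4-dim input (state one-hot concatenated with action one-hot), scalar quality out.
critic = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))

s = torch.tensor([1.0, 0.0])       # state 1, one-hot
a = torch.tensor([0.0, 1.0])       # action 2, one-hot
print(actor(s))                    # two probabilities
print(critic(torch.cat([s, a])))   # one scalar
```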
Girl with Cards by Lucius Kutchin, 1933, Smithsonian American Art Museum
Consider a game with eight cards, sorted into four stacks of two cards each, so that in each stack one card is on top. The game proceeds as follows.
1. MAX chooses a pair of stacks.
2. MIN chooses a stack, within the pair that MAX chose.
3. MAX receives the face value of the top card (c), and MIN receives 9-c.
(a) What is the value of the MAX node?

[Game-tree figure: leaves 2 and 4 under one MIN node, leaves 6 and 6 under the other. The MIN node values are 2 and 6, so the MAX node's value is 6.]
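The same computation as a tiny minimax sketch (leaf values taken from the figure as reconstructed above; "L" and "R" label the two pairs of stacks):

```python
# MIN takes the smaller top card within the chosen pair; MAX picks the better pair.
tree = {"L": [2, 4], "R": [6, 6]}                    # leaf values from the figure
min_values = {pair: min(cards) for pair, cards in tree.items()}
print(min_values, max(min_values.values()))          # -> {'L': 2, 'R': 6} 6
```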
Rule change: after MAX chooses a pair of stacks, he is permitted to look at the top card in any one stack. He must show the card to MIN, then replace it, so that it remains the top card in that stack. Define the belief state, b, to be the set of all possible outcomes of the game, i.e., the starting belief state is the set b = {1,2,3,4,5,6,7,8}.
1. PREDICT operation modifies the belief state based on the action of a player.
2. OBSERVE operation modifies the belief state based on MAX's observation.
Suppose MAX chooses the action R. He then turns up the top card in the rightmost deck, revealing it to be a 7. What is the resulting belief state?
Starting belief state is the set b = {1,2,3,4,5,6,7,8}.
1. PREDICT operation modifies the belief state based on the action of a player: MAX chooses the action R.
2. OBSERVE operation modifies the belief state based on MAX's observation: MAX observes that 7 is on top of 5.
Final belief state is therefore b = {4,8,7}.
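Here is a minimal sketch of these two belief-state operations in Python. The stack contents are an assumption on my part, chosen to be consistent with the final belief state above (the right-hand pair holds {5,7} and {4,8}, with 7 on top of 5), so treat this as an illustration rather than the official game setup:

```python
# Belief state = set of possible game outcomes (cards MAX might win).

def predict(belief, stacks):
    """PREDICT: after MAX picks a pair, the outcome must be a card in that pair."""
    return belief & set().union(*stacks)

def observe(belief, stack, revealed):
    """OBSERVE: revealing a stack's top card rules out that stack's hidden card."""
    return belief - (stack - {revealed})

right_pair = [{5, 7}, {4, 8}]      # assumed contents of the two right stacks
b = set(range(1, 9))               # starting belief state {1,...,8}
b = predict(b, right_pair)         # MAX chooses R -> {4, 5, 7, 8}
b = observe(b, right_pair[0], 7)   # 7 revealed on top of 5 -> {4, 7, 8}
print(b)
```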
(a) Two cookies, three roommates. We decide to use a VCG auction, with proceeds going into a cookie fund. The bids are: $5 (the judge), $3 (the construction worker), $6 (the DJ). Calculate the net value (value received minus price paid) of each roommate, and the amount that goes into the cookie fund.
VCG auction: Cookies go to the N highest bidders, i.e., the judge and the DJ. They each pay b(N+1), i.e., $3. Because they each pay b(N+1), it's a dominant strategy to bid what the cookie is really worth to each of them, so we can assume that's what they've done. ($5 = $3 + $2 and $6 = $3 + $3: each winner's bid equals the price paid plus their net value.)
Value to the construction worker: $0, because they didn't get a cookie, or spend any money. Value to the judge: $5 (value of the cookie) - $3 (price paid) = $2. Value to the DJ: $6 (value of the cookie) - $3 (price paid) = $3. Value to the cookie fund: 2 x $3 = $6.
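The same bookkeeping as a short Python sketch of VCG pricing (the names and data structure are mine):

```python
bids = {"judge": 5, "construction worker": 3, "DJ": 6}
n = 2                                          # N = number of cookies

ranked = sorted(bids, key=bids.get, reverse=True)
winners, price = ranked[:n], bids[ranked[n]]   # winners each pay the (N+1)st bid = $3

for person, bid in bids.items():
    net = bid - price if person in winners else 0
    print(person, net)                         # judge 2, construction worker 0, DJ 3
print("cookie fund:", n * price)               # -> 6
```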
(b) Three cookies, two roommates. One cookie is deluxe, worth $10. The other two are regular, worth $1 each. Possible outcomes:
1. A chooses deluxe ($10), B chooses regular, then B gets the third ($2), or vice versa.
2. A and B each choose a regular, then they split the deluxe ($6 each).
3. A and B each choose deluxe, then they fight, and the dog eats all of the cookies ($0).
Payoffs (A, B):

                    B: Regular    B: Deluxe
    A: Regular        6, 6          2, 10
    A: Deluxe        10, 2          0, 0

Find the mixed-strategy Nash equilibrium.
If B chooses deluxe with probability p, then it is rational for A to choose randomly only if

2p + 6(1 - p) = 0p + 10(1 - p)

…in other words, random choice is rational for A only if p = 2/3.
A: Random choice is rational only if B chooses deluxe with probability p = 2/3.
B: Random choice is rational only if A chooses deluxe with probability q = 2/3.
So p = 2/3, q = 2/3 is a Nash equilibrium.
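A quick numeric check of the indifference condition (a sketch; the payoff matrix is the one reconstructed above):

```python
import numpy as np

# Roommate A's payoffs: rows = A's action (Regular, Deluxe), cols = B's action.
A = np.array([[6.0, 2.0],
              [10.0, 0.0]])
q = 2/3                        # probability that B chooses Deluxe
b_mix = np.array([1 - q, q])   # B's mixed strategy over (Regular, Deluxe)

print(A @ b_mix)   # both entries are 10/3: A is indifferent, so randomizing is
                   # a best response; by symmetry the same holds for B at p = 2/3
```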
What would happen if we produced an AI with the goal of making as many paper clips as possible… and it succeeded?
A "weapon of math destruction" is a statistical model used in a way that differs from the purpose for which it was designed, that optimizes a proxy rather than the thing it's actually trying to optimize, and that produces harmful feedback loops.
Uses Facebook as an illustrative model of the way in which the drive to provide customers what they want is often, but not always, in the best interest of society.
Argues that the greatest threat of AI is not that it will replace human beings, but that it will fail in ways that human beings are unable to predict, because no human would ever fail in that way.