Probability and Statistics for Computer Science


  1. Upon entry to the Zoom room, students' speakers are muted for the quality of sound.
     Please "raise up" the hand to speak; you will be un-muted for audio.
     You can use private "chat" to write to the instructor.
     Take note of the poll question. Can you see it? Check Piazza post #360 if you can.
     Don't share your screen during this lecture.

  2. Probability and Statistics for Computer Science
     "Covariance is coming back in matrix!"
     $\mathrm{cov}(X, Y) = E[(X - E[X])(Y - E[Y])]$
     Credit: Wikipedia
     Hongye Liu, Teaching Assistant Prof, CS361, UIUC, 03.25.2020
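
The covariance formula on this slide can be checked numerically. The following is a minimal sketch in plain Python, assuming the expectation is taken as the average over equally weighted paired samples (so it divides by N, not N - 1). The names `mean` and `cov` and the sample values are illustrative and not from the lecture.

```python
def mean(values):
    """Average of a non-empty list of numbers (stands in for E[.])."""
    return sum(values) / len(values)


def cov(xs, ys):
    """cov(X, Y) = E[(X - E[X]) (Y - E[Y])] over paired samples."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


# Illustrative data: Y is exactly 2 * X.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

print(cov(xs, ys))  # covariance of X and Y
print(cov(xs, xs))  # cov(X, X) is the variance of X
```

Because Y = 2X in this example, cov(X, Y) comes out as twice the variance of X, and swapping the arguments leaves the result unchanged.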

  3. Last time
     - Review of Maximum Likelihood Estimation (MLE)
     - Bayesian Inference (MAP)
     Check out the discussion videos and the pdf file for MLE.

  4. Content
     - Review of Bayesian inference
     - Visualizing high dimensional data & Summarizing data
     - Refresh of some linear algebra
     - The covariance matrix

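  5. Bayesian inference for $P(\theta \mid D)$
     - The posterior $P(\theta \mid D)$ is a probability distribution over $\theta$.
     - Maximum likelihood: the likelihood function $L(\theta) = P(D \mid \theta)$ is a function of $\theta$ but NOT a probability distribution.
     - Bayes rule: $P(\theta \mid D) = \dfrac{P(D \mid \theta)\,P(\theta)}{P(D)}$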
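
The point that a likelihood is a function of θ but not a probability distribution over θ can be seen numerically. The sketch below assumes a Binomial likelihood with illustrative values N = 10 and k = 7; the helper names `likelihood` and `integrate` are hypothetical, and the integral is approximated with a simple midpoint rule.

```python
from math import comb


def likelihood(theta, n, k):
    """Binomial likelihood P(D | theta) for k heads in n flips."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


def integrate(f, a=0.0, b=1.0, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


n, k = 10, 7  # illustrative data, an assumption for this sketch

# Integrating over theta with the data fixed does NOT give 1.
area_over_theta = integrate(lambda t: likelihood(t, n, k))

# Summing over all outcomes k with theta fixed DOES give 1.
total_over_k = sum(likelihood(0.6, n, j) for j in range(n + 1))

print(area_over_theta, total_over_k)
```

For a Binomial likelihood the integral over θ equals 1 / (N + 1), so here it is close to 1/11 rather than 1. Dividing by P(D) in Bayes rule is what turns the product of likelihood and prior into a proper distribution over θ.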

  6. Beta distribution
     - A distribution is a Beta distribution if it has the following pdf:
       $P(\theta) = K(\alpha, \beta)\,\theta^{\alpha-1}(1-\theta)^{\beta-1}$, with $\theta \in [0, 1]$, $\alpha > 0$, $\beta > 0$
       $K(\alpha, \beta) = \dfrac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}$
     - Is an expressive family of distributions
     - $\mathrm{Beta}(\alpha = 1, \beta = 1)$ is uniform
     [Figure: pdf of the Beta distribution, density against θ on [0, 1], for Beta(1,1), Beta(5,5), Beta(50,50), Beta(70,70), Beta(20,50) and Beta(0.5,0.5)]
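
The Beta pdf on this slide can be evaluated directly. This is a minimal sketch using only the standard library; the function names `beta_norm` and `beta_pdf` are illustrative. It computes K(α, β) through `math.lgamma` rather than `math.gamma`, an implementation choice made here to avoid overflow for large parameters, not something stated in the lecture.

```python
from math import exp, lgamma


def beta_norm(alpha, beta):
    """K(alpha, beta) = Gamma(alpha + beta) / (Gamma(alpha) * Gamma(beta))."""
    return exp(lgamma(alpha + beta) - lgamma(alpha) - lgamma(beta))


def beta_pdf(theta, alpha, beta):
    """Beta density at theta.

    Assumption of this sketch: only the open interval (0, 1) is evaluated,
    and the endpoints return 0.0 so that parameters below 1 (for example
    Beta(0.5, 0.5)) never trigger a division by zero.
    """
    if not 0.0 < theta < 1.0:
        return 0.0
    return beta_norm(alpha, beta) * theta ** (alpha - 1) * (1 - theta) ** (beta - 1)


# Beta(1, 1) is uniform: the density is 1 everywhere inside (0, 1).
print([beta_pdf(t, 1, 1) for t in (0.1, 0.5, 0.9)])

# Beta(5, 5) is symmetric and peaks at theta = 0.5.
print(beta_pdf(0.5, 5, 5))
```

Larger, balanced parameters such as Beta(50, 50) concentrate the density more tightly around 0.5, which is what the curves in the figure illustrate.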

  7. Beta distribution as the conjugate prior for Binomial likelihood
     - The likelihood is Binomial($N$, $k$):
       $P(D \mid \theta) = \binom{N}{k}\,\theta^{k}(1-\theta)^{N-k}$
     - The Beta distribution is used as the prior:
       $P(\theta) = K(\alpha, \beta)\,\theta^{\alpha-1}(1-\theta)^{\beta-1}$ for $\theta \in [0, 1]$; otherwise $P(\theta) = 0$
     - So $P(\theta \mid D) \propto \theta^{\alpha+k-1}(1-\theta)^{\beta+N-k-1}$
     - Then the posterior is $\mathrm{Beta}(\alpha + k, \beta + N - k)$:
       $P(\theta \mid D) = K(\alpha + k, \beta + N - k)\,\theta^{\alpha+k-1}(1-\theta)^{\beta+N-k-1}$, $\theta \in [0, 1]$

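  8. The posterior for this example
     - $P(\theta \mid D)$ is a Beta distribution, a continuous distribution:
       $P(\theta \mid D) = K(\alpha + k, \beta + N - k)\,\theta^{\alpha+k-1}(1-\theta)^{\beta+N-k-1}$, $\theta \in [0, 1]$; $\theta$ is the random variable.
     - The Binomial distribution is discrete:
       $P(X = k) = \binom{N}{k}\,\theta^{k}(1-\theta)^{N-k}$, $k \ge 0$; $k$ is the random variable.
     - Why is $P(\theta \mid D)$ not Binomial?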

  9. The update of Bayesian posterior
     - Since the posterior is in the same family as the conjugate prior, the posterior can be used as a new prior if more data is observed.
     - Suppose we start with a uniform prior on the probability θ of heads:

         N     k     α     β
         -     -     1     1
         3     0     1     4
        10     7     8     7
        30    17    25    20
       100    72    97    48

     [Figure: horizontal axis θ]
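
The sequence of α and β values on this slide follows from applying the conjugate update once per batch of data. A minimal sketch, where the function name `update` and the variable names are illustrative:

```python
def update(alpha, beta, n, k):
    """Beta(alpha, beta) prior plus k heads in n flips gives
    the posterior Beta(alpha + k, beta + n - k)."""
    return alpha + k, beta + n - k


# (N, k) batches as listed on the slide.
batches = [(3, 0), (10, 7), (30, 17), (100, 72)]

alpha, beta = 1, 1  # uniform prior, Beta(1, 1)
history = [(alpha, beta)]
for n, k in batches:
    # The posterior from the previous batch is the prior for this one.
    alpha, beta = update(alpha, beta, n, k)
    history.append((alpha, beta))

for a, b in history:
    print(a, b)
```

Each printed pair matches one row of the α and β columns, ending at Beta(97, 48) after all four batches.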

  10. Maximize the Bayesian posterior (MAP)
      - The posterior of the previous example is
        $P(\theta \mid D) = K(\alpha + k, \beta + N - k)\,\theta^{\alpha+k-1}(1-\theta)^{\beta+N-k-1}$
      - Differentiating and setting to 0 gives the MAP estimate
        $\hat{\theta} = \dfrac{\alpha - 1 + k}{\alpha + \beta - 2 + N}$
      - With the 1st prior ($\alpha = 1$, $\beta = 1$), $k = 96$ and $N = 143$:
        $\hat{\theta} = \dfrac{1 - 1 + 96}{1 + 1 - 2 + 143} \approx 0.67$
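
The MAP formula can be evaluated exactly with rational arithmetic. The sketch below assumes integer α, β, N and k, and uses the pooled counts from the update table on the previous slide (3 + 10 + 30 + 100 = 143 flips, 0 + 7 + 17 + 72 = 96 heads). The function name `map_estimate` is illustrative.

```python
from fractions import Fraction


def map_estimate(alpha, beta, n, k):
    """MAP estimate (alpha - 1 + k) / (alpha + beta - 2 + n).

    Assumes integer arguments. The formula gives the interior maximum of the
    Beta(alpha + k, beta + n - k) posterior when both of its parameters
    exceed 1.
    """
    denominator = alpha + beta - 2 + n
    if denominator <= 0:
        raise ValueError("alpha + beta - 2 + n must be positive")
    return Fraction(alpha - 1 + k, denominator)


# Uniform prior Beta(1, 1) with all 143 flips (96 heads) at once.
theta_hat = map_estimate(1, 1, 143, 96)
print(theta_hat, float(theta_hat))

# Same answer from the final posterior Beta(97, 48) with no new data.
print(map_estimate(97, 48, 0, 0))
```

With a uniform prior the MAP estimate reduces to k / N, which is the maximum likelihood estimate, so the two approaches agree in this case.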

  11. Conjugate prior for other likelihood functions
      - What is the conjugate prior if the likelihood is Bernoulli or geometric? Beta
      - What is the conjugate prior if the likelihood is Poisson or Exponential? Gamma
      - What is the conjugate prior if the likelihood is normal with known variance? Normal

  12. Content
      - Review of Bayesian inference
      - Visualizing high dimensional data & Summarizing data
      - Refresh of some linear algebra
      - The covariance matrix

  13. A data set with 7 dimensions
      - Seed data set from the UCI Machine Learning site:

             areaA   perimeterP   compactness   lengthKernel   widthKernel   asymmetry   lengthGroove   Label
         0   15.26   14.84        0.871         5.763          3.312         2.221       5.22           1
         1   14.88   14.57        0.8811        5.554          3.333         1.018       4.956          1
         2   14.29   14.09        0.905         5.291          3.337         2.699       4.825          1
         3   13.84   13.94        0.8955        5.324          3.379         2.259       4.805          1
         4   16.14   14.99        0.9034        5.658          3.562         1.355       5.175          1
         5   14.38   14.21        0.8951        5.386          3.312         2.462       4.956          1
         6   14.69   14.49        0.8799        5.563          3.259         3.586       5.219          1
         7   …

  14. Matrix format of a dataset in the textbook
      [Handwritten sketch: a data matrix with N columns and d rows, where d is the # of features; one row is labeled areaA]
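
The difference between the table layout of the seed data and a matrix layout can be sketched in a few lines. This example assumes the textbook convention is one column per data item and one row per feature (a d × N matrix), which is a reading of the handwritten sketch rather than something the slide states in full. It uses only the first five rows of the seed table, without the label column, and the variable names are illustrative.

```python
features = ["areaA", "perimeterP", "compactness", "lengthKernel",
            "widthKernel", "asymmetry", "lengthGroove"]

# Table layout: one row per data item (N = 5 items, d = 7 features).
table = [
    [15.26, 14.84, 0.871, 5.763, 3.312, 2.221, 5.22],
    [14.88, 14.57, 0.8811, 5.554, 3.333, 1.018, 4.956],
    [14.29, 14.09, 0.905, 5.291, 3.337, 2.699, 4.825],
    [13.84, 13.94, 0.8955, 5.324, 3.379, 2.259, 4.805],
    [16.14, 14.99, 0.9034, 5.658, 3.562, 1.355, 5.175],
]

# Matrix layout (assumed): transpose so each row holds one feature.
matrix = [list(column) for column in zip(*table)]
d, n = len(matrix), len(matrix[0])

# A first summary of the data: the mean of each feature.
means = [sum(row) / n for row in matrix]

print(d, n)
for name, m in zip(features, means):
    print(name, round(m, 4))
```

In this layout the first row of `matrix` collects every areaA value, so per-feature summaries such as the mean become row-wise operations.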
