slide-1
SLIDE 1 Homework and Scribing
Scribes today: Sarthah & Hao (Thanks!)
Available Friday, Dec 12:
  • Homework I (due Fri, Dec 26)
  • Scribing schedule + LaTeX template
  • Lecture 1 notes
slide-2
SLIDE 2 Distribution Recap: Prior and Posterior
Example: biased coin.
  • Prior density: $p(\theta) = \mathrm{Beta}(\theta; \alpha, \beta)$
  • Likelihood: $y_n \sim \mathrm{Binom}(\theta)$
  • Posterior: $p(\theta \mid y_{1:N}) = \mathrm{Beta}(\theta; \alpha + a, \beta + b)$, with sufficient statistics $a = \sum_n y_n$ and $b = N - a$
[Figure: prior and posteriors after 1, 10, and 1000 trials: $p(\theta) = \mathrm{Beta}(\theta; 2, 2)$, $p(\theta \mid y_1) = \mathrm{Beta}(\theta; 2, 3)$, $p(\theta \mid y_{1:10}) = \mathrm{Beta}(\theta; 6, 4)$, and, with $a = 672$, $b = 328$, $p(\theta \mid y_{1:1000}) = \mathrm{Beta}(\theta; 674, 330)$]
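The conjugate update above is just two additions. A minimal sketch in Python (assuming NumPy and SciPy; the Beta(2, 2) prior echoes the slide's figure, but the simulated coin and its bias are illustrative choices of mine):

```python
import numpy as np
from scipy import stats

# Prior hyperparameters: p(theta) = Beta(theta; alpha, beta).
alpha, beta = 2.0, 2.0

# Simulate N = 1000 coin flips (the true bias 0.67 is illustrative).
rng = np.random.default_rng(0)
y = rng.binomial(1, 0.67, size=1000)

# Sufficient statistics: a = number of heads, b = N - a.
a = int(y.sum())
b = len(y) - a

# Conjugate update: p(theta | y_1:N) = Beta(theta; alpha + a, beta + b).
posterior = stats.beta(alpha + a, beta + b)
print(f"a={a}, b={b}, posterior mean = {posterior.mean():.3f}")
```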
slide-3
SLIDE 3 Bayesian Regression
A Bayesian regression model places a distribution over functions:
  $y_n = f(x_n) + \epsilon_n$,  $f \sim p(f)$,  $\epsilon_n \sim p(\epsilon)$,  $n = 1, \ldots, N$
[Figure: functions drawn from the prior $p(f)$, with observation noise $\epsilon$]
Linear basis function model:
  $f(x_n) := w^\top x_n = \sum_{d=1}^{D} w_d x_{n,d}$ in the linear case, or more generally $f(x_n) := w^\top \phi(x_n)$ for basis functions $\phi: \mathbb{R}^{D'} \to \mathbb{R}^{D}$
In matrix form: $y = Xw + \epsilon$, where $y$ is $N \times 1$, $X$ is $N \times D$ (row $n$ is $x_n^\top$, or $\phi(x_n)^\top$ in the basis-function case), and $w$ is $D \times 1$.
slide-4
SLIDE 4 Example: Polynomial Basis
  $y_n = w^\top \phi(x_n) + \epsilon_n$,  $w \sim p(w)$,  $\epsilon_n \sim p(\epsilon)$,  $n = 1, \ldots, N$
Reduces to linear regression when $\phi$ is the identity.
Polynomial basis: $\phi_d(x_n) = x_n^d$
A prior on $w$ implies a prior over functions $f$; see the sketch below.
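To make the induced prior over functions concrete, a minimal sketch (assuming NumPy; the standard normal prior on $w$, degree 3, and the grid are illustrative choices):

```python
import numpy as np

def poly_features(x, degree):
    """Polynomial basis: phi_d(x) = x**d for d = 0, ..., degree."""
    return np.stack([x**d for d in range(degree + 1)], axis=1)

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 100)
Phi = poly_features(x, degree=3)  # shape (100, 4)

# Each weight draw w ~ p(w) defines one random function f(x) = Phi @ w,
# so sampling weights traces out samples from the implied prior over f.
prior_functions = [Phi @ rng.standard_normal(Phi.shape[1]) for _ in range(5)]
```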
slide-5
SLIDE 5 Overfitting and Underfitting
  • Underfitting: residuals are strongly correlated
  • Overfitting: poor generalization
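Both failure modes are easy to reproduce with the polynomial basis from the previous slide. A sketch (the sine data and the degrees 1 and 15 are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0, 1, 20)
y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(20)

for degree in (1, 15):
    Phi = np.stack([x**d for d in range(degree + 1)], axis=1)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    residuals = y - Phi @ w
    # Degree 1 underfits: residuals follow the sine, i.e. are strongly
    # correlated. Degree 15 fits the noise and generalizes poorly.
    print(degree, np.round(residuals, 2))
```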
slide-6
SLIDE 6 Ridge Regression (L2 Regularization)
Objective:
  $L_{\mathrm{ridge}}(w) = \sum_{n=1}^{N} (y_n - w^\top \phi(x_n))^2 + \lambda \sum_{d=1}^{D} w_d^2$
[Figure: fitted curves for $\lambda = 1$ and $\lambda = 10$]
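The objective in code, for concreteness (a sketch; `Phi` holds the rows $\phi(x_n)^\top$, and the names are mine):

```python
import numpy as np

def ridge_objective(w, Phi, y, lam):
    """L_ridge(w) = sum_n (y_n - w^T phi(x_n))^2 + lam * sum_d w_d^2."""
    residuals = y - Phi @ w
    return residuals @ residuals + lam * (w @ w)
```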
slide-7
SLIDE 7 Ridge Regression: Probabilistic Interpretation
Model:
  $y_n = w^\top \phi(x_n) + \epsilon_n$,  $w_d \sim \mathrm{Normal}(0, s^2)$,  $\epsilon_n \sim \mathrm{Normal}(0, \sigma^2)$
Equivalently $y_n \sim \mathrm{Normal}(w^\top \phi(x_n), \sigma^2)$. Note that $p(y)$ does not depend on $w$.
Maximum a posteriori (MAP) estimation:
  $\hat{w} = \arg\max_w p(w \mid y) = \arg\max_w \frac{p(y, w)}{p(y)} = \arg\max_w p(y, w) = \arg\max_w \log p(y, w)$  (log is monotonic)
  $\log p(y, w) = \sum_n \log p(y_n \mid w) + \log p(w)$
Since $\log \mathrm{Normal}(x; \mu, \sigma^2) = -\tfrac{1}{2}\log(2\pi\sigma^2) - \frac{(x - \mu)^2}{2\sigma^2}$,
  $\sum_{n=1}^{N} \log p(y_n \mid w) = \sum_{n=1}^{N} \Big[ -\tfrac{1}{2}\log(2\pi\sigma^2) - \frac{(y_n - w^\top \phi(x_n))^2}{2\sigma^2} \Big]$
where the first term is constant in $w$ and can be ignored.
slide-8
SLIDE 8 Ridge Regression: Probabilistic Interpretation
  $L_{\mathrm{ridge}}(w) = \sum_{n=1}^{N} (y_n - w^\top \phi(x_n))^2 + \lambda \sum_{d=1}^{D} w_d^2$
Maximum a posteriori estimation for
  $y_n = w^\top \phi(x_n) + \epsilon_n$,  $w_d \sim \mathrm{Normal}(0, s^2)$,  $\epsilon_n \sim \mathrm{Normal}(0, \sigma^2)$:
  $\log p(y, w) = \sum_n \log p(y_n \mid w) + \log p(w)$  (depends on $s$ and $\sigma$)
  $\phantom{\log p(y, w)} = -\frac{1}{2\sigma^2} \sum_{n=1}^{N} (y_n - w^\top \phi(x_n))^2 - \frac{1}{2s^2} \sum_{d=1}^{D} w_d^2 + \mathrm{const}$
Hence
  $\arg\min_w L_{\mathrm{ridge}}(w) = \arg\max_w \log p(y, w)$  when  $\lambda = \sigma^2 / s^2$
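A quick numerical check of this correspondence (a sketch with synthetic data; `sigma2` and `s2` are illustrative variances): scaling the negative log joint by $2\sigma^2$ recovers the ridge objective exactly, so the two share a minimizer.

```python
import numpy as np

rng = np.random.default_rng(3)
N, D = 50, 4
Phi = rng.standard_normal((N, D))
y = Phi @ rng.standard_normal(D) + 0.3 * rng.standard_normal(N)

sigma2, s2 = 0.09, 0.5   # noise variance sigma^2 and prior variance s^2
lam = sigma2 / s2        # lambda = sigma^2 / s^2

def neg_log_joint(w):
    """-log p(y, w), with constants dropped."""
    return ((y - Phi @ w) ** 2).sum() / (2 * sigma2) + (w @ w) / (2 * s2)

def ridge(w):
    return ((y - Phi @ w) ** 2).sum() + lam * (w @ w)

# 2 * sigma2 * (-log p(y, w)) == L_ridge(w) for every w (constants
# dropped), so argmin L_ridge = argmax log p(y, w).
for _ in range(5):
    w = rng.standard_normal(D)
    assert np.isclose(2 * sigma2 * neg_log_joint(w), ridge(w))
```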
slide-9
SLIDE 9 Ridge Regression
  $y_n = w^\top \phi(x_n) + \epsilon_n$,  $w_d \sim \mathrm{Normal}(0, s^2)$,  $\epsilon_n \sim \mathrm{Normal}(0, \sigma^2)$,  $\hat{w} = \arg\max_w \log p(w \mid y)$
[Figure: MAP fits for $\lambda = \sigma^2 / s^2 = 0$, $1$, and $10$]
  • $\sigma \to 0$: precise observations
  • $s \to \infty$: uninformative prior
  • $\lambda = 10$ corresponds to $E[\epsilon_n^2] = 10\, E[w_d^2]$
slide-10
SLIDE 10 Posterior Predictive Distribution
[Figure: observations $y_1, y_2, y_3$ and the predictive distribution at $x^*$, for small and large prior variance $s$]
  $y^* \sim p(y \mid f(x^*))$,  $f \sim p(f \mid y_{1:N})$
i.e., we predict $y^*$ given the previous observations.
slide-11
SLIDE 11 Posterior Predictive Distribution
[Figure: predictive mean $f^*$ and uncertainty at $x^*$ given $y_1, y_2, y_3$, for small and large prior variance $s$]
  $p(y^* \mid y_{1:N}) = \int df\; p(y^* \mid f)\, p(f \mid y_{1:N})$
  $f^* = E[y^* \mid y_{1:N}]$
(The full distribution gives a Gaussian process; the expected value gives kernel ridge regression.)
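For the linear basis function model both factors are Gaussian and the integral is closed-form. A minimal sketch of computing $f^*$ and the predictive variance (assuming the standard conjugate Gaussian posterior over $w$, which these slides take as given; the function and variable names are mine):

```python
import numpy as np

def posterior_predictive(Phi, y, phi_star, sigma2, s2):
    """Mean and variance of p(y* | y_1:N) for Bayesian linear regression."""
    D = Phi.shape[1]
    # Conjugate Gaussian posterior: w | y_1:N ~ Normal(mu, Sigma).
    Sigma = np.linalg.inv(Phi.T @ Phi / sigma2 + np.eye(D) / s2)
    mu = Sigma @ Phi.T @ y / sigma2
    # f* = E[y* | y_1:N]; the predictive variance adds the noise sigma^2.
    f_star = mu @ phi_star
    var_star = phi_star @ Sigma @ phi_star + sigma2
    return f_star, var_star
```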
slide-12
SLIDE 12 Regression with Kernels: Motivation
Idea: what if we used lots of features, $D \gg N$?
  $y_n = w^\top \phi(x_n) + \epsilon_n$,  $y = \Phi w + \epsilon$
We want to calculate the predictive mean:
  $E[y^* \mid y] = \int dy^*\, dw\; y^*\, p(y^* \mid w)\, p(w \mid y)$
  $\phantom{E[y^* \mid y]} = \int dw\; E[y^* \mid w]\, p(w \mid y)$
  $\phantom{E[y^* \mid y]} = \int dw\; w^\top \phi(x^*)\, p(w \mid y)$
  $\phantom{E[y^* \mid y]} = E[w \mid y]^\top \phi(x^*)$
When the posterior is Gaussian, $\hat{w} = \arg\max_w p(w \mid y) = E[w \mid y]$ (take as given for now), so the predictive mean is $\hat{w}^\top \phi(x^*)$.
slide-13
SLIDE 13 Regression with Kernels: Motivation
Idea: what if we used lots of features, $D \gg N$?
  $y_n = w^\top \phi(x_n) + \epsilon_n$,  $y = \Phi w + \epsilon$
  $f^* = E[y^* \mid y] = \hat{w}^\top \phi(x^*)$
Solve for $\hat{w} = \arg\max_w \log p(y, w) = \arg\min_w \|y - \Phi w\|^2 + \lambda\, w^\top w$ by setting the derivative with respect to $w$ to zero:
  $0 = -2\, \Phi^\top (y - \Phi \hat{w}) + 2\lambda \hat{w}$
  $\Rightarrow\; \Phi^\top y = (\Phi^\top \Phi + \lambda I)\, \hat{w}$
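In code, the normal equations above are a single solve (a sketch; solving the linear system is preferable to forming the inverse explicitly):

```python
import numpy as np

def ridge_fit(Phi, y, lam):
    """Solve (Phi^T Phi + lam I) w = Phi^T y for the ridge/MAP weights."""
    D = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(D), Phi.T @ y)
```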
slide-14
SLIDE 14 Regression with Kernels: Motivation
Idea: what if we used lots of features, $D \gg N$?
  $y_n = w^\top \phi(x_n) + \epsilon_n$,  $y = \Phi w + \epsilon$
  $\Phi^\top y = (\Phi^\top \Phi + \lambda I)\, \hat{w} \;\Rightarrow\; \hat{w} = (\Phi^\top \Phi + \lambda I)^{-1} \Phi^\top y$
Ridge regression inverts a $D \times D$ matrix: $O(D^3)$.
Alternative form:
  $(\Phi^\top \Phi + \lambda I)^{-1} \Phi^\top = \Phi^\top (\Phi \Phi^\top + \lambda I)^{-1}$
This inverts an $N \times N$ matrix instead: when $D \gg N$, $O(N^3)$ is better than $O(D^3)$.
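The identity is easy to check numerically (a sketch with a random $\Phi$ in the $D \gg N$ regime; dimensions and $\lambda$ are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
N, D, lam = 10, 500, 0.1
Phi = rng.standard_normal((N, D))

# (Phi^T Phi + lam I_D)^{-1} Phi^T  ==  Phi^T (Phi Phi^T + lam I_N)^{-1}
lhs = np.linalg.solve(Phi.T @ Phi + lam * np.eye(D), Phi.T)
rhs = Phi.T @ np.linalg.inv(Phi @ Phi.T + lam * np.eye(N))
assert np.allclose(lhs, rhs)  # inverting N x N instead of D x D
```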
slide-15
SLIDE 15 The Kernel Trick
  $\Phi \Phi^\top = \begin{pmatrix} \phi(x_1)^\top \phi(x_1) & \cdots & \phi(x_1)^\top \phi(x_N) \\ \vdots & & \vdots \\ \phi(x_N)^\top \phi(x_1) & \cdots & \phi(x_N)^\top \phi(x_N) \end{pmatrix} =: K$
Idea: use a kernel function $k(x_i, x_j) := \phi(x_i)^\top \phi(x_j)$.
Expected value:
  $f^* = \phi(x^*)^\top \hat{w} = \big(\phi(x^*)^\top \Phi^\top\big) (K + \lambda I)^{-1} y = \sum_n k(x^*, x_n)\, \big[(K + \lambda I)^{-1} y\big]_n$
We never need to compute $\phi$ explicitly.
slide-16
SLIDE 16 Kernel Ridge Regression
  $f^* = \bar{k}^\top (K + \lambda I)^{-1} y$,  where  $y = (y_1, \ldots, y_N)^\top$,  $K_{nm} = k(x_n, x_m)$,  $\bar{k}_n = k(x^*, x_n)$
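Putting the whole pipeline together, a minimal sketch (the RBF kernel, its lengthscale, and the sine data are illustrative choices of mine, not from the slides):

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale=0.2):
    """k(x, x') = exp(-(x - x')^2 / (2 * lengthscale^2))."""
    diff = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def kernel_ridge_predict(x_train, y_train, x_star, lam=0.1):
    """f* = kbar^T (K + lam I)^{-1} y, never touching phi explicitly."""
    K = rbf_kernel(x_train, x_train)    # K_nm = k(x_n, x_m)
    kbar = rbf_kernel(x_star, x_train)  # kbar_n = k(x*, x_n)
    alpha = np.linalg.solve(K + lam * np.eye(len(x_train)), y_train)
    return kbar @ alpha

# Usage: fit noisy sine data, predict on a dense grid.
rng = np.random.default_rng(5)
x = rng.uniform(0, 1, 30)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(30)
f_star = kernel_ridge_predict(x, y, np.linspace(0, 1, 200))
```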