Review
- Linear separability (and use of features)
- Class probabilities for linear discriminants
sigmoid (logistic) function
- Applications: USPS, fMRI
!! !" # " ! # #$" #$! #$% #$& ' ( ) φ1 φ2 0.5 1 0.5 1
figure from book
1
maximum conditional likelihood
each example adds a penalty to all weight vectors that misclassify it; the penalty is approximately piecewise linear
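A small numerical sketch of that penalty (the `penalty` helper and the ±1 label convention are my assumptions, not from the slides): the per-example negative log-likelihood is the softplus of the margin y·(w·x), near zero for confidently correct examples and roughly linear for misclassified ones.

```python
import numpy as np

# per-example penalty: negative log-likelihood of the correct label.
# For labels y in {-1, +1} and margin m = y * (w . x):
#   penalty(m) = -log sigmoid(m) = log(1 + exp(-m))
def penalty(margin):
    return np.log1p(np.exp(-np.asarray(margin, dtype=float)))

margins = np.array([-4.0, -2.0, 0.0, 2.0, 4.0])
p = penalty(margins)
# for strongly misclassified examples (large negative margin) the penalty
# grows approximately linearly, matching the "piecewise linear" picture
```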
!! !" !# $ # " ! $ % # & " ' ! ()*+ !,-./0)1/2/*+
2
!! !" # " ! $ # #%! #%& #%' #%( "
3
!" !# !$ !% " % $ & ' !& !$ !% " %
4
P(Y = k | x) = exp(w_k · x) / Z
Z = Σ_k' exp(w_k' · x)
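One plausible reading of the P(Y=k) and Z lines is the softmax over per-class linear scores; a minimal sketch under that assumption (the function name and the max-subtraction stabilization are mine):

```python
import numpy as np

def class_probs(W, x):
    """P(Y=k | x) = exp(w_k . x) / Z with Z = sum over classes."""
    s = W @ x            # one linear score per class
    s = s - s.max()      # stabilize the exponentials; Z rescales identically
    e = np.exp(s)
    return e / e.sum()   # divide by the normalizer Z
```

The probabilities sum to one by construction, which is all the normalizer Z does.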
5
figure from book
6
Z =
common priors: L2 (ridge), L1 (sparsity)
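These priors show up as additive penalties on the conditional negative log-likelihood; a sketch under the usual assumptions (labels in {0, 1}; the function names and the strength `lam` are my choices):

```python
import numpy as np

def neg_log_lik(w, X, y):
    # logistic negative log-likelihood for labels y in {0, 1}
    z = X @ w
    return float(np.sum(np.log1p(np.exp(z)) - y * z))

def map_objective(w, X, y, lam, prior="l2"):
    # Gaussian prior -> L2 (ridge) penalty; Laplace prior -> L1 (sparsity)
    if prior == "l2":
        pen = lam * float(np.sum(w ** 2))
    else:
        pen = lam * float(np.sum(np.abs(w)))
    return neg_log_lik(w, X, y) + pen
```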
7
e.g., glm function in R
called “iteratively reweighted least squares” (IRLS); for L1, slightly harder (less software available)
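A minimal IRLS sketch for unregularized logistic regression (the standard textbook recursion, not code from the course; the jitter constant and iteration count are arbitrary choices):

```python
import numpy as np

def irls_logistic(X, y, n_iter=25):
    """Fit logistic regression by iteratively reweighted least squares.
    X: (n, d) design matrix, y: (n,) labels in {0, 1}."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))   # current predicted probabilities
        r = p * (1.0 - p) + 1e-9             # IRLS weights (+ jitter for stability)
        z = X @ w + (y - p) / r              # "working response"
        # Newton step written as a weighted least-squares solve
        w = np.linalg.solve(X.T @ (r[:, None] * X), X.T @ (r * z))
    return w
```

Each iteration is a weighted least-squares solve, which is why generic GLM software can reuse linear-regression machinery.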
8
[figure: P(I. virginica) as a function of petal length]
9
10
conditional MLE: max_w P(Y | X, w)
conditional MAP: max_w P(W = w | X, Y)
why?
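One standard answer to the “why?” (a textbook derivation, not copied from the slides): by Bayes’ rule the two objectives differ only by a log-prior term.

```latex
\arg\max_w P(w \mid X, Y)
  = \arg\max_w \frac{P(Y \mid X, w)\, P(w)}{P(Y \mid X)}
  = \arg\max_w \left[\log P(Y \mid X, w) + \log P(w)\right]
```

So conditional MAP is conditional MLE plus a log-prior penalty: a Gaussian prior gives L2 (ridge) and a Laplace prior gives L1 (sparsity).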
11
!" !# !$ " $ #" !% !& !' " ' &
12
!!" !#" " #" !" " "$! "$% "$& "$' #
13
may still lead to bad results for other reasons! e.g., not enough data, bad model class, …
14
- Graphical model is big and highly connected
- Variables are high-arity or continuous
- Inference reduces to numerical integration (and/or summation)
We’ll look at randomized algorithms
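As a toy illustration of the randomized approach (the integrand and sample size are my choices, not from the lecture): instead of numerically integrating, draw samples and average.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible
n = 100_000

# estimate E[X^2] for X ~ Uniform(0, 1); the exact answer is 1/3
est = sum(random.random() ** 2 for _ in range(n)) / n
# a sample average replaces the integral; error shrinks like 1/sqrt(n)
```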
15
!! !"#$ " "#$ ! !! !"#% !"#& !"#' !"#( " "#( "#' "#& "#% ! " ! ( ) ' $ * + ,-+.*/
16
simultaneously try to discover it and estimate amount
17
18
Eliazar and Parr, IJCAI-03
19
!! !"#$ " "#$ ! " !" %" &" '" $" (" )"
20
!! !"#$ " "#$ ! " !" %" &" '" $" (" )"
21
22
23
24
25
!! !"#$ " "#$ ! " !" %" &" '" $" (" )"
26
problem: where f is big but Q is small, samples are rare yet carry large weight
27
W_i = P(x_i) / Q(x_i)
E_Q[W g(X)] = E_P[g(X)]
28
want ∫ f(x) dx vs. E_P[f(X)]; use W = 1/Q vs. W = P/Q
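The W = P/Q case can be sketched as follows (importance sampling; the target, proposal, and f here are illustrative choices, not from the slides):

```python
import math
import random

random.seed(1)
n = 200_000

def p(x):
    # target density P: standard normal
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

q = 1.0 / 10.0  # proposal density Q: Uniform(-5, 5)

# estimate E_P[f(X)] with f(x) = x^2 (exact answer: 1) by sampling from Q
# and reweighting each sample by W = P(x) / Q(x)
samples = (random.uniform(-5.0, 5.0) for _ in range(n))
est = sum(x * x * p(x) / q for x in samples) / n
```

Sampling from the easy proposal Q while reweighting by P/Q recovers an expectation under the hard target P.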
29
30
31
32