SLIDE 29 Statistical Learning Theory: Summary

- There is a 3-way tradeoff between ε, m, and the complexity of the hypothesis space H.
- The complexity of H can be measured by the VC dimension.
- For a fixed hypothesis space, we should try to minimize training set error (empirical risk minimization).
- For a variable-sized hypothesis space, we should be willing to accept some training set errors in order to reduce the VC dimension k (structural risk minimization).
- Margin theory shows that by changing γ, we continuously change the effective VC dimension of the hypothesis space. Large γ means small effective VC dimension (fat shattering dimension).
- Soft margin theory tells us that we should be willing to accept an increase in ||ξ||² in order to get an increase in γ.
- We will be able to implement structural risk minimization within a single optimizer by having a dual objective function that tries to maximize γ while minimizing ||ξ||².
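The 3-way tradeoff between ε, m, and VC dimension can be made concrete with a small sketch. The bound below is one standard form of the VC generalization bound (holding with probability at least 1 − δ); the exact constants vary across textbooks, so the numbers are illustrative only.

```python
import math

def vc_error_bound(m, d, delta=0.05):
    """One standard form of the VC bound on generalization error:
    epsilon <= sqrt((d * (ln(2m/d) + 1) + ln(4/delta)) / m).
    Constants differ between sources; treat this as a sketch."""
    return math.sqrt((d * (math.log(2 * m / d) + 1) + math.log(4 / delta)) / m)

# Holding d fixed, more data (larger m) tightens the bound on epsilon;
# holding m fixed, a richer hypothesis space (larger VC dimension d)
# loosens it.
print(vc_error_bound(m=1000, d=10))
print(vc_error_bound(m=1000, d=100))
print(vc_error_bound(m=10000, d=10))
```

This is why structural risk minimization pays: shrinking d can lower the bound even if it costs a few training errors.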
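The soft-margin tradeoff in the last two bullets can be sketched as a single objective. A minimal illustration, assuming a linear classifier f(x) = w·x + b with geometric margin γ = 1/||w||, squared slack penalty ||ξ||², and a hypothetical toy dataset chosen for this example; a penalty weight C (not on the slide) balances the two terms, as in standard soft-margin SVM formulations.

```python
import numpy as np

# Hypothetical toy data: four well-separated points plus one point
# that sits close to the decision boundary.
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -1.0], [0.4, 0.2]])
y = np.array([1, 1, -1, -1, 1])

def soft_margin_objective(w, b, C):
    # Slack xi_i = max(0, 1 - y_i * (w.x_i + b)): how far point i
    # falls short of functional margin 1.
    xi = np.maximum(0.0, 1.0 - y * (X @ w + b))
    # Minimizing ||w||^2 maximizes gamma = 1/||w||; the C * ||xi||^2
    # term charges for the slack we accept in exchange.
    return w @ w + C * (xi @ xi), xi

# A wide-margin w (small ||w||, large gamma) accepts some slack on the
# boundary point; a narrow-margin w (large ||w||) separates cleanly.
wide, _ = soft_margin_objective(np.array([0.5, 0.5]), 0.0, C=1.0)
narrow, _ = soft_margin_objective(np.array([2.0, 2.0]), 0.0, C=1.0)
print(wide, narrow)
```

With C = 1 the wide-margin classifier wins despite its nonzero slack, which is exactly the tradeoff the slide describes: pay a little in ||ξ||² to buy a larger γ.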