CSC 411 Lecture 9: SVMs and Boosting

SLIDE 1

CSC 411 Lecture 9: SVMs and Boosting

Roger Grosse, Amir-massoud Farahmand, and Juan Carrasquilla

University of Toronto

UofT CSC 411: 09-Classification Odds and Ends 1 / 34

SLIDE 2

Overview

Support Vector Machines
Connection between Exponential Loss and AdaBoost

SLIDE 3

Binary Classification with a Linear Model

Classification: Predict a discrete-valued target
Binary classification: Targets t ∈ {−1, +1}
Linear model:
z = w⊤x + b
y = sign(z)
Question: How should we choose w and b?
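The linear model above can be sketched in a few lines of NumPy. This is only an illustration: the function name `predict`, the weights, and the inputs are made up, and mapping z = 0 to +1 is an arbitrary tie-break that the slide does not specify.

```python
import numpy as np

def predict(X, w, b):
    """Return labels in {-1, +1} for each row of X using y = sign(w^T x + b)."""
    z = X @ w + b
    # np.sign would return 0 at z == 0; sending that case to +1 is an assumption.
    return np.where(z >= 0, 1, -1)

# Hypothetical 2-D example (values chosen for illustration only).
w = np.array([1.0, -1.0])
b = 0.5
X = np.array([[2.0, 0.0],   # z = 2.5
              [0.0, 2.0]])  # z = -1.5
y = predict(X, w, b)        # array([ 1, -1])
```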

SLIDE 4

Zero-One Loss

We can use the 0−1 loss function, and find the weights that minimize it over data points:

$$\mathcal{L}_{0-1}(y, t) = \begin{cases} 0 & \text{if } y = t \\ 1 & \text{if } y \neq t \end{cases} \;=\; \mathbb{I}\{y \neq t\}.$$

But minimizing this loss is computationally difficult, and it can’t distinguish different hypotheses that achieve the same accuracy.
We investigated some other loss functions that are easier to minimize, e.g., logistic regression with the cross-entropy loss L_CE.
Let’s consider a different approach, starting from the geometry of binary classifiers.
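To see why the 0−1 loss cannot distinguish hypotheses with the same accuracy, here is a small sketch. The 1-D data set and the two thresholds are invented for illustration, and averaging the loss over data points is an assumption about how it is aggregated.

```python
import numpy as np

def predict(X, w, b):
    """Linear classifier y = sign(w^T x + b), with z == 0 sent to +1 (assumption)."""
    return np.where(X @ w + b >= 0, 1, -1)

def zero_one_loss(y, t):
    """Average 0-1 loss: the fraction of predictions that differ from the targets."""
    return float(np.mean(y != t))

# Hypothetical 1-D data set, linearly separable.
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
t = np.array([-1, -1, 1, 1])

# Two different hypotheses: same weight, different bias.
loss_a = zero_one_loss(predict(X, np.array([1.0]), 0.0), t)
loss_b = zero_one_loss(predict(X, np.array([1.0]), 0.9), t)
# Both losses are 0.0, although the second boundary passes much closer to x = -1.
```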

SLIDE 5

Separating Hyperplanes

Suppose we are given these data points from two different classes and want to find a linear classifier that separates them.

SLIDE 6

Separating Hyperplanes

The decision boundary looks like a line because x ∈ ℝ², but think about it as a (D − 1)-dimensional hyperplane.
Recall that a hyperplane is described by points x ∈ ℝ^D such that f(x) = w⊤x + b = 0.

SLIDE 7

Separating Hyperplanes

There are multiple separating hyperplanes, described by different parameters (w, b).

SLIDE 8

Separating Hyperplanes

SLIDE 9

Optimal Separating Hyperplane

Optimal Separating Hyperplane: A hyperplane that separates two classes and maximizes the distance to the closest point from either class, i.e., maximize the margin of the classifier.

Intuitively, ensuring that a classifier is not too close to any data points leads to better generalization on the test data.

SLIDE 10

Geometry of Points and Planes

Recall that the decision hyperplane is orthogonal (perpendicular) to w.
The vector

$$w^* = \frac{w}{\|w\|_2}$$

is a unit vector pointing in the same direction as w.

The same hyperplane could equivalently be defined in terms of w∗.

SLIDE 11

Geometry of Points and Planes

The (signed) distance of a point x′ to the hyperplane is

$$\frac{w^\top x' + b}{\|w\|_2}$$
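A minimal numeric check of the signed-distance formula, using made-up values for w, b, and the query points. The helper name `signed_distance` is illustrative.

```python
import numpy as np

def signed_distance(x, w, b):
    """Signed distance from point x to the hyperplane w^T x + b = 0."""
    return (w @ x + b) / np.linalg.norm(w)

# Hypothetical hyperplane 3*x1 + 4*x2 - 5 = 0, so ||w||_2 = 5.
w = np.array([3.0, 4.0])
b = -5.0

d_pos = signed_distance(np.array([3.0, 4.0]), w, b)  # (9 + 16 - 5) / 5 = 4.0
d_neg = signed_distance(np.array([0.0, 0.0]), w, b)  # -5 / 5 = -1.0

# Rescaling (w, b) by a positive constant describes the same hyperplane
# and leaves the distance unchanged.
d_scaled = signed_distance(np.array([3.0, 4.0]), 2.0 * w, 2.0 * b)
```

The sign tells which side of the hyperplane the point lies on, and the magnitude is the usual Euclidean distance.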

SLIDE 12

Maximizing Margin as an Optimization Problem

Recall: the classification for the i-th data point is correct when

$$\operatorname{sign}(w^\top x^{(i)} + b) = t^{(i)}$$

This can be rewritten as

$$t^{(i)} (w^\top x^{(i)} + b) > 0$$

Enforcing a margin of C:

$$t^{(i)} \cdot \underbrace{\frac{w^\top x^{(i)} + b}{\|w\|_2}}_{\text{signed distance}} \geq C$$

SLIDE 13

Maximizing Margin as an Optimization Problem

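Max-margin objective:

$$\max_{w,b} \; C \quad \text{s.t.} \quad \frac{t^{(i)}(w^\top x^{(i)} + b)}{\|w\|_2} \geq C, \quad i = 1, \ldots, N$$

Plug in C = 1/‖w‖₂ and simplify:

$$\underbrace{\frac{t^{(i)}(w^\top x^{(i)} + b)}{\|w\|_2} \geq \frac{1}{\|w\|_2}}_{\text{geometric margin constraint}} \iff \underbrace{t^{(i)}(w^\top x^{(i)} + b) \geq 1}_{\text{algebraic margin constraint}}$$

Equivalent optimization objective:

$$\min \; \|w\|_2^2 \quad \text{s.t.} \quad t^{(i)}(w^\top x^{(i)} + b) \geq 1, \quad i = 1, \ldots, N$$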
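The substitution C = 1/‖w‖₂ relies on the fact that rescaling (w, b) does not move the hyperplane. The sketch below illustrates this on an invented data set: after rescaling so that the smallest algebraic margin equals 1, the geometric margin equals 1/‖w‖₂. The function name and all numbers are assumptions for illustration.

```python
import numpy as np

def geometric_margin(X, t, w, b):
    """Smallest signed distance t * (w^T x + b) / ||w||_2 over the data set."""
    return np.min(t * (X @ w + b)) / np.linalg.norm(w)

# Hypothetical separable data set and one separating hyperplane.
X = np.array([[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0], [0.0, -2.0]])
t = np.array([1, 1, -1, -1])
w = np.array([3.0, 3.0])
b = 0.0

# Rescale (w, b) so that the smallest algebraic margin t * (w^T x + b) is exactly 1.
scale = np.min(t * (X @ w + b))      # 6.0 for these values
w_n, b_n = w / scale, b / scale      # w_n = [0.5, 0.5]

margin_before = geometric_margin(X, t, w, b)
margin_after = geometric_margin(X, t, w_n, b_n)
# Both equal sqrt(2), and margin_after == 1 / ||w_n||_2.
```

So maximizing the geometric margin under this normalization is the same as making ‖w‖₂ as small as possible.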

SLIDE 14

Maximizing Margin as an Optimization Problem

SLIDE 15

Maximizing Margin as an Optimization Problem

Algebraic max-margin objective:

$$\min_{w,b} \; \|w\|_2^2 \quad \text{s.t.} \quad t^{(i)}(w^\top x^{(i)} + b) \geq 1, \quad i = 1, \ldots, N$$

Observe: if the margin constraint is not tight for x(i), we could remove it from the training set and the optimal w would be the same. The important training examples are the ones with algebraic margin 1, and are called support vectors. Hence, this algorithm is called the (hard) Support Vector Machine (SVM) (or Support Vector Classifier). SVM-like algorithms are often called max-margin or large-margin.
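Given a solution (w, b), the support vectors can be read off as the training points whose algebraic margin equals 1. The sketch below assumes (w, b) has already been found by some solver; the data, the solution, and the tolerance are made up. For this toy set, w = (1, 0), b = 0 achieves geometric margin 1, which is half the distance between the closest pair of opposite-class points, so it is the max-margin solution.

```python
import numpy as np

def support_vector_mask(X, t, w, b, tol=1e-8):
    """Boolean mask of points whose constraint t * (w^T x + b) >= 1 is tight."""
    margins = t * (X @ w + b)
    return np.isclose(margins, 1.0, atol=tol)

# Hypothetical data set and its hard-margin solution (assumed, not computed here).
X = np.array([[1.0, 0.0], [3.0, 1.0], [-1.0, 0.0], [-2.0, -2.0]])
t = np.array([1, 1, -1, -1])
w = np.array([1.0, 0.0])
b = 0.0

mask = support_vector_mask(X, t, w, b)   # margins are [1, 3, 1, 2]
support_vectors = X[mask]                # rows [1, 0] and [-1, 0]
```

The points with margins 3 and 2 have slack in their constraints, so dropping them would leave the optimal w unchanged.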

SLIDE 16

Non-Separable Data Points

How can we apply the max-margin principle if the data are not linearly separable?

SLIDE 17

Maximizing Margin for Non-Separable Data Points

Main Idea: Allow some points to be within the margin or even be misclassified; we represent this with slack variables ξi. But constrain or penalize the total amount of slack.

SLIDE 18

Maximizing Margin for Non-Separable Data Points

Soft margin constraint:

$$\frac{t^{(i)}(w^\top x^{(i)} + b)}{\|w\|_2} \geq C(1 - \xi_i), \quad \text{for } \xi_i \geq 0.$$

Penalize $\sum_i \xi_i$

SLIDE 19

Maximizing Margin for Non-Separable Data Points

Soft-margin SVM objective:

$$\min_{w,b,\xi} \; \frac{1}{2}\|w\|_2^2 + \gamma \sum_{i=1}^{N} \xi_i$$

$$\text{s.t.} \quad t^{(i)}(w^\top x^{(i)} + b) \geq 1 - \xi_i, \quad i = 1, \ldots, N$$

$$\xi_i \geq 0, \quad i = 1, \ldots, N$$

γ is a hyperparameter that trades off the margin with the amount of slack.

◮ For γ = 0, we’ll get w = 0. (Why?)
◮ As γ → ∞ we get the hard-margin objective.

Note: it is also possible to constrain $\sum_i \xi_i$ instead of penalizing it.
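One way to think about the γ = 0 case: slack then costs nothing, so w = 0 with every ξi = 1 satisfies all constraints and drives the objective to zero. The sketch below checks this on an invented, non-separable data set. The helper names and the tolerance are assumptions.

```python
import numpy as np

def soft_margin_objective(w, xi, gamma):
    """0.5 * ||w||_2^2 + gamma * sum_i xi_i."""
    return 0.5 * w @ w + gamma * np.sum(xi)

def is_feasible(X, t, w, b, xi, tol=1e-12):
    """Check t_i (w^T x_i + b) >= 1 - xi_i and xi_i >= 0 for every i."""
    margins = t * (X @ w + b)
    return bool(np.all(margins >= 1 - xi - tol) and np.all(xi >= -tol))

# Hypothetical non-separable data: the positive point sits between two negatives.
X = np.array([[1.0, 0.0], [-1.0, 0.0], [2.0, 0.0]])
t = np.array([1, -1, -1])

w0, b0, xi0 = np.zeros(2), 0.0, np.ones(3)
feasible = is_feasible(X, t, w0, b0, xi0)             # True: 0 >= 1 - 1
obj_gamma0 = soft_margin_objective(w0, xi0, 0.0)      # 0.0, the smallest possible value
obj_gamma1 = soft_margin_objective(w0, xi0, 1.0)      # 3.0, slack is now penalized
```

With γ > 0 the same point costs γ·N, so the optimizer has a reason to trade some ‖w‖ for less slack.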

SLIDE 20

From Margin Violation to Hinge Loss

Let’s simplify the soft margin constraint by eliminating ξi. Recall:

$$t^{(i)}(w^\top x^{(i)} + b) \geq 1 - \xi_i, \quad i = 1, \ldots, N$$

$$\xi_i \geq 0, \quad i = 1, \ldots, N$$

Rewrite as ξi ≥ 1 − t(i)(w⊤x(i) + b).

Case 1: 1 − t(i)(w⊤x(i) + b) ≤ 0

◮ The smallest non-negative ξi that satisfies the constraint is ξi = 0.

Case 2: 1 − t(i)(w⊤x(i) + b) > 0

◮ The smallest ξi that satisfies the constraint is ξi = 1 − t(i)(w⊤x(i) + b).

Hence, ξi = max{0, 1 − t(i)(w⊤x(i) + b)}.
Therefore, the slack penalty can be written as

$$\sum_{i=1}^{N} \xi_i = \sum_{i=1}^{N} \max\{0, 1 - t^{(i)}(w^\top x^{(i)} + b)\}.$$

We sometimes write max{0, y} = (y)₊
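The two cases above collapse into a single elementwise maximum. A short sketch, with invented data and an arbitrary (w, b), showing the smallest feasible slack for each point:

```python
import numpy as np

def optimal_slack(X, t, w, b):
    """Smallest feasible slack per example: max(0, 1 - t_i (w^T x_i + b))."""
    return np.maximum(0.0, 1.0 - t * (X @ w + b))

# Hypothetical data and classifier (values chosen for illustration only).
X = np.array([[2.0, 0.0], [0.5, 0.0], [-1.0, 0.0], [1.0, 0.0]])
t = np.array([1, 1, -1, -1])
w = np.array([1.0, 0.0])
b = 0.0

xi = optimal_slack(X, t, w, b)
# t * z = [2, 0.5, 1, -1], so xi = [0, 0.5, 0, 2]:
#   outside the margin -> 0, inside the margin -> 0.5,
#   exactly on the margin -> 0, misclassified -> 2.
total_slack = xi.sum()   # 2.5
```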

SLIDE 21

From Margin Violation to Hinge Loss

If we write y(i)(w, b) = w⊤x(i) + b, then the optimization problem can be written as

$$\min_{w,b} \; \sum_{i=1}^{N} \left(1 - t^{(i)} y^{(i)}(w, b)\right)_+ + \frac{1}{2\gamma}\|w\|_2^2$$

The loss function L_H(y, t) = (1 − ty)₊ is called the hinge loss.
The second term is the squared L2-norm of the weights, scaled by 1/(2γ).
Hence, the soft-margin SVM can be seen as a linear classifier with hinge loss and an L2 regularizer.
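The step from the soft-margin objective to the hinge-loss form can be spelled out as follows. This is a sketch of the substitution, assuming γ > 0 so that dividing the objective by γ does not change the minimizer; "⟺" here means the problems have the same minimizing (w, b).

```latex
\begin{align*}
&\min_{w,b,\xi}\ \tfrac{1}{2}\|w\|_2^2 + \gamma \sum_{i=1}^{N} \xi_i
  \quad \text{s.t. } t^{(i)} y^{(i)}(w,b) \ge 1 - \xi_i,\ \ \xi_i \ge 0 \\
\Longleftrightarrow\ &\min_{w,b}\ \tfrac{1}{2}\|w\|_2^2
  + \gamma \sum_{i=1}^{N} \bigl(1 - t^{(i)} y^{(i)}(w,b)\bigr)_+
  && \text{(optimal } \xi_i = (1 - t^{(i)} y^{(i)}(w,b))_+ \text{)} \\
\Longleftrightarrow\ &\min_{w,b}\ \sum_{i=1}^{N} \bigl(1 - t^{(i)} y^{(i)}(w,b)\bigr)_+
  + \tfrac{1}{2\gamma}\|w\|_2^2
  && \text{(divide by } \gamma > 0 \text{)}
\end{align*}
```

Read this way, a large γ corresponds to weak regularization and a small γ to strong regularization.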

SLIDE 22

Revisiting Loss Functions for Classification

Hinge loss compared with other loss functions
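The figure for this slide did not survive extraction, so which curves it showed is not recoverable here. As a stand-in, the sketch below evaluates three losses the lecture mentions by name (0−1, hinge, and the cross-entropy of a logistic model) as functions of z for a positive target. The function names and the grid of z values are assumptions, and the cross-entropy is written in its {−1, +1}-target form log(1 + exp(−tz)).

```python
import numpy as np

def zero_one(z, t):
    """0-1 loss on the prediction sign(z); z == 0 is counted as an error here."""
    return (np.sign(z) != t).astype(float)

def hinge(z, t):
    """Hinge loss (1 - t z)_+."""
    return np.maximum(0.0, 1.0 - t * z)

def logistic_cross_entropy(z, t):
    """Cross-entropy of a logistic model with targets in {-1, +1}: log(1 + exp(-t z))."""
    return np.logaddexp(0.0, -t * z)

z = np.array([-2.0, 0.0, 0.5, 1.0, 3.0])
t = 1

l01 = zero_one(z, t)                 # [1, 1, 0, 0, 0]
lh = hinge(z, t)                     # [3, 1, 0.5, 0, 0]
lce = logistic_cross_entropy(z, t)   # strictly positive everywhere
```

On this grid the hinge loss upper-bounds the 0−1 loss and is exactly zero once tz ≥ 1, while the cross-entropy never reaches zero.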

SLIDE 23

SVMs: What we Left Out

What we left out:
How to fit w:

◮ One option: gradient descent
◮ Can reformulate with the Lagrange dual

The “kernel trick” converts it into a powerful nonlinear classifier. We’ll cover this later in the course.
Classic results from learning theory show that a large margin implies good generalization.
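The gradient-descent option mentioned above could look roughly like the sketch below. Because the hinge loss has a kink at margin 1, this is strictly subgradient descent, not plain gradient descent. It is not the lecture's own implementation: the function name, step size, iteration count, and toy data are all assumptions, and a constant step size only hovers near the optimum.

```python
import numpy as np

def fit_soft_margin_svm(X, t, gamma=1.0, lr=0.01, n_steps=2000):
    """Subgradient descent on 0.5*||w||^2 + gamma * sum_i max(0, 1 - t_i (w^T x_i + b))."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(n_steps):
        margins = t * (X @ w + b)
        active = margins < 1            # examples with positive hinge loss
        # Subgradient of the objective: the regularizer contributes w,
        # each active example contributes -gamma * t_i * x_i (and -gamma * t_i for b).
        grad_w = w - gamma * (t[active] @ X[active])
        grad_b = -gamma * np.sum(t[active])
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical linearly separable toy data.
X = np.array([[2.0, 2.0], [3.0, 1.0], [-2.0, -2.0], [-1.0, -3.0]])
t = np.array([1.0, 1.0, -1.0, -1.0])

w, b = fit_soft_margin_svm(X, t)
pred = np.where(X @ w + b >= 0, 1, -1)   # matches t on this toy set
```

In practice a decaying step size or a dedicated solver would be the more usual choice; the loop above is only meant to show where the hinge loss and the L2 term enter the update.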

SLIDE 24

AdaBoost Revisited

Part 2: reinterpreting AdaBoost in terms of what we’ve learned about loss functions.

SLIDE 25

AdaBoost Revisited

[Figure: panels labeled "Samples" and "Re-weighted Samples" (three times), with classifiers labeled h1 and h2.]

<latexit sha1_base64="RJacZp1vCxsEVFHPWV5JdBplswg=">ACP3icdVBLS8NAGNzUV62vVo9egkXxVJIi6LHYi8eK9gFtKJvNJl26j7C7EUrIT/Cqv8ef4S/wJl69uU1zsC0d+GCY+QaG8WNKlHacT6u0tb2zu1ferxwcHh2fVGunPSUSiXAXCSrkwIcKU8JxVxN8SCWGDKf4r4/bc/9/guWigj+rGcx9hiMOAkJgtpIT5Nxc1ytOw0nh71O3ILUQYHOuGZdjQKBEoa5RhQqNXSdWHsplJogirPKFE4hmgKIzw0lEOGlZfmXTP70iBHQpjms7V/8nUsiUmjHfDKoJ2rVm4ubPD1h2bJGIyGJkQnaYKy01eGdlxIeJxpztCgbJtTWwp6PZwdEYqTpzBCITJ4gG02ghEibiSujPJi2BWOQByozy7qrO6TXrPhOg38abeui82LoNzcAGugQtuQs8gA7oAgQi8ArewLv1YX1Z39bP4rVkFZkzsATr9w/O/bCP</latexit><latexit sha1_base64="RJacZp1vCxsEVFHPWV5JdBplswg=">ACP3icdVBLS8NAGNzUV62vVo9egkXxVJIi6LHYi8eK9gFtKJvNJl26j7C7EUrIT/Cqv8ef4S/wJl69uU1zsC0d+GCY+QaG8WNKlHacT6u0tb2zu1ferxwcHh2fVGunPSUSiXAXCSrkwIcKU8JxVxN8SCWGDKf4r4/bc/9/guWigj+rGcx9hiMOAkJgtpIT5Nxc1ytOw0nh71O3ILUQYHOuGZdjQKBEoa5RhQqNXSdWHsplJogirPKFE4hmgKIzw0lEOGlZfmXTP70iBHQpjms7V/8nUsiUmjHfDKoJ2rVm4ubPD1h2bJGIyGJkQnaYKy01eGdlxIeJxpztCgbJtTWwp6PZwdEYqTpzBCITJ4gG02ghEibiSujPJi2BWOQByozy7qrO6TXrPhOg38abeui82LoNzcAGugQtuQs8gA7oAgQi8ArewLv1YX1Z39bP4rVkFZkzsATr9w/O/bCP</latexit><latexit sha1_base64="RJacZp1vCxsEVFHPWV5JdBplswg=">ACP3icdVBLS8NAGNzUV62vVo9egkXxVJIi6LHYi8eK9gFtKJvNJl26j7C7EUrIT/Cqv8ef4S/wJl69uU1zsC0d+GCY+QaG8WNKlHacT6u0tb2zu1ferxwcHh2fVGunPSUSiXAXCSrkwIcKU8JxVxN8SCWGDKf4r4/bc/9/guWigj+rGcx9hiMOAkJgtpIT5Nxc1ytOw0nh71O3ILUQYHOuGZdjQKBEoa5RhQqNXSdWHsplJogirPKFE4hmgKIzw0lEOGlZfmXTP70iBHQpjms7V/8nUsiUmjHfDKoJ2rVm4ubPD1h2bJGIyGJkQnaYKy01eGdlxIeJxpztCgbJtTWwp6PZwdEYqTpzBCITJ4gG02ghEibiSujPJi2BWOQByozy7qrO6TXrPhOg38abeui82LoNzcAGugQtuQs8gA7oAgQi8ArewLv1YX1Z39bP4rVkFZkzsATr9w/O/bCP</latexit><latexit 
sha1_base64="RJacZp1vCxsEVFHPWV5JdBplswg=">ACP3icdVBLS8NAGNzUV62vVo9egkXxVJIi6LHYi8eK9gFtKJvNJl26j7C7EUrIT/Cqv8ef4S/wJl69uU1zsC0d+GCY+QaG8WNKlHacT6u0tb2zu1ferxwcHh2fVGunPSUSiXAXCSrkwIcKU8JxVxN8SCWGDKf4r4/bc/9/guWigj+rGcx9hiMOAkJgtpIT5Nxc1ytOw0nh71O3ILUQYHOuGZdjQKBEoa5RhQqNXSdWHsplJogirPKFE4hmgKIzw0lEOGlZfmXTP70iBHQpjms7V/8nUsiUmjHfDKoJ2rVm4ubPD1h2bJGIyGJkQnaYKy01eGdlxIeJxpztCgbJtTWwp6PZwdEYqTpzBCITJ4gG02ghEibiSujPJi2BWOQByozy7qrO6TXrPhOg38abeui82LoNzcAGugQtuQs8gA7oAgQi8ArewLv1YX1Z39bP4rVkFZkzsATr9w/O/bCP</latexit>

h3

<latexit sha1_base64="BTVvM4uCgypSE15+jQl7JzQyFQw=">ACP3icdVBLS8NAGNzUV62vVo9egkXxVBIV9FjsxWNF+4A2lM1mky7dR9jdCXkJ3jV3+P8Bd4E6/e3KY52JYOfDMfAPD+DElSjvOp1Xa2Nza3invVvb2Dw6PqrXjrhKJRLiDBWy70OFKeG4o4muB9LDJlPc+ftGZ+7wVLRQR/1tMYewxGnIQEQW2kp/HoelStOw0nh71K3ILUQYH2qGZdDAOBEoa5RhQqNXCdWHsplJogirPKMFE4hmgCIzwlEOGlZfmXTP73CiBHQpjms7V/8nUsiUmjLfDKox2rZm4nrPD1m2aJGIyGJkQlaYy1eGdlxIeJxpzNC8bJtTWwp6NZwdEYqTp1BCITJ4gG42hEibiSvDPJi2BGOQByozy7rLO6S7lXDdRru4029eV9sXAan4AxcAhfcgiZ4AG3QAQhE4BW8gXfrw/qyvq2f+WvJKjInYAHW7x/Q1rCQ</latexit><latexit sha1_base64="BTVvM4uCgypSE15+jQl7JzQyFQw=">ACP3icdVBLS8NAGNzUV62vVo9egkXxVBIV9FjsxWNF+4A2lM1mky7dR9jdCXkJ3jV3+P8Bd4E6/e3KY52JYOfDMfAPD+DElSjvOp1Xa2Nza3invVvb2Dw6PqrXjrhKJRLiDBWy70OFKeG4o4muB9LDJlPc+ftGZ+7wVLRQR/1tMYewxGnIQEQW2kp/HoelStOw0nh71K3ILUQYH2qGZdDAOBEoa5RhQqNXCdWHsplJogirPKMFE4hmgCIzwlEOGlZfmXTP73CiBHQpjms7V/8nUsiUmjLfDKox2rZm4nrPD1m2aJGIyGJkQlaYy1eGdlxIeJxpzNC8bJtTWwp6NZwdEYqTp1BCITJ4gG42hEibiSvDPJi2BGOQByozy7rLO6S7lXDdRru4029eV9sXAan4AxcAhfcgiZ4AG3QAQhE4BW8gXfrw/qyvq2f+WvJKjInYAHW7x/Q1rCQ</latexit><latexit sha1_base64="BTVvM4uCgypSE15+jQl7JzQyFQw=">ACP3icdVBLS8NAGNzUV62vVo9egkXxVBIV9FjsxWNF+4A2lM1mky7dR9jdCXkJ3jV3+P8Bd4E6/e3KY52JYOfDMfAPD+DElSjvOp1Xa2Nza3invVvb2Dw6PqrXjrhKJRLiDBWy70OFKeG4o4muB9LDJlPc+ftGZ+7wVLRQR/1tMYewxGnIQEQW2kp/HoelStOw0nh71K3ILUQYH2qGZdDAOBEoa5RhQqNXCdWHsplJogirPKMFE4hmgCIzwlEOGlZfmXTP73CiBHQpjms7V/8nUsiUmjLfDKox2rZm4nrPD1m2aJGIyGJkQlaYy1eGdlxIeJxpzNC8bJtTWwp6NZwdEYqTp1BCITJ4gG42hEibiSvDPJi2BGOQByozy7rLO6S7lXDdRru4029eV9sXAan4AxcAhfcgiZ4AG3QAQhE4BW8gXfrw/qyvq2f+WvJKjInYAHW7x/Q1rCQ</latexit><latexit 
sha1_base64="BTVvM4uCgypSE15+jQl7JzQyFQw=">ACP3icdVBLS8NAGNzUV62vVo9egkXxVBIV9FjsxWNF+4A2lM1mky7dR9jdCXkJ3jV3+P8Bd4E6/e3KY52JYOfDMfAPD+DElSjvOp1Xa2Nza3invVvb2Dw6PqrXjrhKJRLiDBWy70OFKeG4o4muB9LDJlPc+ftGZ+7wVLRQR/1tMYewxGnIQEQW2kp/HoelStOw0nh71K3ILUQYH2qGZdDAOBEoa5RhQqNXCdWHsplJogirPKMFE4hmgCIzwlEOGlZfmXTP73CiBHQpjms7V/8nUsiUmjLfDKox2rZm4nrPD1m2aJGIyGJkQlaYy1eGdlxIeJxpzNC8bJtTWwp6NZwdEYqTp1BCITJ4gG42hEibiSvDPJi2BGOQByozy7rLO6S7lXDdRru4029eV9sXAan4AxcAhfcgiZ4AG3QAQhE4BW8gXfrw/qyvq2f+WvJKjInYAHW7x/Q1rCQ</latexit>

hT

<latexit sha1_base64="OPt3ynkhCqxqBzDpgrYhiyO9cFA=">ACP3icdVBLS8NAGNzUV62vVo9egkXxVBIR9FjsxWPFvqANZbPZJEv3EXY3Qgn9CV719/gz/AXexKs3t2kOtqUDHwz38AwfkKJ0o7zaZW2tnd298r7lYPDo+OTau20p0QqEe4iQYUc+FBhSjuaqIpHiQSQ+ZT3Pcnrbnf8FSEcE7epgj8GIk5AgqI30HI8742rdaTg57HXiFqQOCrTHNetqFAiUMsw1olCpoesk2sug1ARPKuMUoUTiCYwkNDOWRYeVnedWZfGiWwQyHNcW3n6v9EBplSU+abTwZ1rFa9ubjJ0zGbLWs0EpIYmaANxkpbHd57GeFJqjFHi7JhSm0t7Pl4dkAkRpODYHI5AmyUQwlRNpMXBnlwawlGIM8UDOzrLu64zrp3TRcp+E+3dabD8XGZXAOLsA1cMEdaIJH0AZdgEAEXsEbeLc+rC/r2/pZvJasInMGlmD9/gEN3rCx</latexit><latexit sha1_base64="OPt3ynkhCqxqBzDpgrYhiyO9cFA=">ACP3icdVBLS8NAGNzUV62vVo9egkXxVBIR9FjsxWPFvqANZbPZJEv3EXY3Qgn9CV719/gz/AXexKs3t2kOtqUDHwz38AwfkKJ0o7zaZW2tnd298r7lYPDo+OTau20p0QqEe4iQYUc+FBhSjuaqIpHiQSQ+ZT3Pcnrbnf8FSEcE7epgj8GIk5AgqI30HI8742rdaTg57HXiFqQOCrTHNetqFAiUMsw1olCpoesk2sug1ARPKuMUoUTiCYwkNDOWRYeVnedWZfGiWwQyHNcW3n6v9EBplSU+abTwZ1rFa9ubjJ0zGbLWs0EpIYmaANxkpbHd57GeFJqjFHi7JhSm0t7Pl4dkAkRpODYHI5AmyUQwlRNpMXBnlwawlGIM8UDOzrLu64zrp3TRcp+E+3dabD8XGZXAOLsA1cMEdaIJH0AZdgEAEXsEbeLc+rC/r2/pZvJasInMGlmD9/gEN3rCx</latexit><latexit sha1_base64="OPt3ynkhCqxqBzDpgrYhiyO9cFA=">ACP3icdVBLS8NAGNzUV62vVo9egkXxVBIR9FjsxWPFvqANZbPZJEv3EXY3Qgn9CV719/gz/AXexKs3t2kOtqUDHwz38AwfkKJ0o7zaZW2tnd298r7lYPDo+OTau20p0QqEe4iQYUc+FBhSjuaqIpHiQSQ+ZT3Pcnrbnf8FSEcE7epgj8GIk5AgqI30HI8742rdaTg57HXiFqQOCrTHNetqFAiUMsw1olCpoesk2sug1ARPKuMUoUTiCYwkNDOWRYeVnedWZfGiWwQyHNcW3n6v9EBplSU+abTwZ1rFa9ubjJ0zGbLWs0EpIYmaANxkpbHd57GeFJqjFHi7JhSm0t7Pl4dkAkRpODYHI5AmyUQwlRNpMXBnlwawlGIM8UDOzrLu64zrp3TRcp+E+3dabD8XGZXAOLsA1cMEdaIJH0AZdgEAEXsEbeLc+rC/r2/pZvJasInMGlmD9/gEN3rCx</latexit><latexit 
sha1_base64="OPt3ynkhCqxqBzDpgrYhiyO9cFA=">ACP3icdVBLS8NAGNzUV62vVo9egkXxVBIR9FjsxWPFvqANZbPZJEv3EXY3Qgn9CV719/gz/AXexKs3t2kOtqUDHwz38AwfkKJ0o7zaZW2tnd298r7lYPDo+OTau20p0QqEe4iQYUc+FBhSjuaqIpHiQSQ+ZT3Pcnrbnf8FSEcE7epgj8GIk5AgqI30HI8742rdaTg57HXiFqQOCrTHNetqFAiUMsw1olCpoesk2sug1ARPKuMUoUTiCYwkNDOWRYeVnedWZfGiWwQyHNcW3n6v9EBplSU+abTwZ1rFa9ubjJ0zGbLWs0EpIYmaANxkpbHd57GeFJqjFHi7JhSm0t7Pl4dkAkRpODYHI5AmyUQwlRNpMXBnlwawlGIM8UDOzrLu64zrp3TRcp+E+3dabD8XGZXAOLsA1cMEdaIJH0AZdgEAEXsEbeLc+rC/r2/pZvJasInMGlmD9/gEN3rCx</latexit>

H(x) = sign T X

t=1

αtht(x) !

<latexit sha1_base64="GB1VUXh93SXCU1F/0RNvxLUajOE=">ACfHicdVHLatAFB2r9R9Oe2ym6Fui0OpkUpskJzSbLFOIkELnianwlDZmHmLkqMUJ/k6/Jt30Z0rHjheNQw4MHM65B+49k9dKeorjP73o3v0HDx9tPO4/efrs+YvB5stjbxsncCKsu40B49KGpyQJIWntUPQucKT/Hx/4Z/8ROelNUc0r3GqoTSykAIoSNng68HoYov8pTwglovS9PxVGFBI576Rmct7SbdjyOegqoryIhXGS0SqZNlRVvZYBiP4yX4bZKsyJCtcJht9t6nMysajYaEAu/PkrimaQuOpFDY9dPGYw3iHEo8C9SARj9tl4d2/F1QZrywLjxDfKn+n2hBez/XeZjUQJVf9xbiXR5VurupqdI6GWQp7jDWtqViZ9pKUzeERlwvWzSKk+WL5vlMOhSk5oGACHkpuKjAgaDwP/10GWz3rdZgZr4LzSbrPd4mx5/GSTxOvn8e7n1bdbzBXrM3bMQSts32AE7ZBMm2CW7Yr/Y797f6G30Ifp4PRr1VplX7AaiL/8ARbLDuA=</latexit><latexit sha1_base64="GB1VUXh93SXCU1F/0RNvxLUajOE=">ACfHicdVHLatAFB2r9R9Oe2ym6Fui0OpkUpskJzSbLFOIkELnianwlDZmHmLkqMUJ/k6/Jt30Z0rHjheNQw4MHM65B+49k9dKeorjP73o3v0HDx9tPO4/efrs+YvB5stjbxsncCKsu40B49KGpyQJIWntUPQucKT/Hx/4Z/8ROelNUc0r3GqoTSykAIoSNng68HoYov8pTwglovS9PxVGFBI576Rmct7SbdjyOegqoryIhXGS0SqZNlRVvZYBiP4yX4bZKsyJCtcJht9t6nMysajYaEAu/PkrimaQuOpFDY9dPGYw3iHEo8C9SARj9tl4d2/F1QZrywLjxDfKn+n2hBez/XeZjUQJVf9xbiXR5VurupqdI6GWQp7jDWtqViZ9pKUzeERlwvWzSKk+WL5vlMOhSk5oGACHkpuKjAgaDwP/10GWz3rdZgZr4LzSbrPd4mx5/GSTxOvn8e7n1bdbzBXrM3bMQSts32AE7ZBMm2CW7Yr/Y797f6G30Ifp4PRr1VplX7AaiL/8ARbLDuA=</latexit><latexit sha1_base64="GB1VUXh93SXCU1F/0RNvxLUajOE=">ACfHicdVHLatAFB2r9R9Oe2ym6Fui0OpkUpskJzSbLFOIkELnianwlDZmHmLkqMUJ/k6/Jt30Z0rHjheNQw4MHM65B+49k9dKeorjP73o3v0HDx9tPO4/efrs+YvB5stjbxsncCKsu40B49KGpyQJIWntUPQucKT/Hx/4Z/8ROelNUc0r3GqoTSykAIoSNng68HoYov8pTwglovS9PxVGFBI576Rmct7SbdjyOegqoryIhXGS0SqZNlRVvZYBiP4yX4bZKsyJCtcJht9t6nMysajYaEAu/PkrimaQuOpFDY9dPGYw3iHEo8C9SARj9tl4d2/F1QZrywLjxDfKn+n2hBez/XeZjUQJVf9xbiXR5VurupqdI6GWQp7jDWtqViZ9pKUzeERlwvWzSKk+WL5vlMOhSk5oGACHkpuKjAgaDwP/10GWz3rdZgZr4LzSbrPd4mx5/GSTxOvn8e7n1bdbzBXrM3bMQSts32AE7ZBMm2CW7Yr/Y797f6G30Ifp4PRr1VplX7AaiL/8ARbLDuA=</latexit><latexit 
sha1_base64="GB1VUXh93SXCU1F/0RNvxLUajOE=">ACfHicdVHLatAFB2r9R9Oe2ym6Fui0OpkUpskJzSbLFOIkELnianwlDZmHmLkqMUJ/k6/Jt30Z0rHjheNQw4MHM65B+49k9dKeorjP73o3v0HDx9tPO4/efrs+YvB5stjbxsncCKsu40B49KGpyQJIWntUPQucKT/Hx/4Z/8ROelNUc0r3GqoTSykAIoSNng68HoYov8pTwglovS9PxVGFBI576Rmct7SbdjyOegqoryIhXGS0SqZNlRVvZYBiP4yX4bZKsyJCtcJht9t6nMysajYaEAu/PkrimaQuOpFDY9dPGYw3iHEo8C9SARj9tl4d2/F1QZrywLjxDfKn+n2hBez/XeZjUQJVf9xbiXR5VurupqdI6GWQp7jDWtqViZ9pKUzeERlwvWzSKk+WL5vlMOhSk5oGACHkpuKjAgaDwP/10GWz3rdZgZr4LzSbrPd4mx5/GSTxOvn8e7n1bdbzBXrM3bMQSts32AE7ZBMm2CW7Yr/Y797f6G30Ifp4PRr1VplX7AaiL/8ARbLDuA=</latexit>

errt = PN

i=1 wiI{ht(x(i) 6= t(i)}

PN

i=1 wi

<latexit sha1_base64="5U5SI+PZ7uJNFYG/ReSMtHFZ5Po=">ACqXicdVFNaxsxENVuvxL3y2mPvYiaFpeC2Q2F5hIzaWnNIE4Mc06i1aetYX1sZFm2xixP7LH/pJeI9t7SJzmCcHTm3lieFNUjhMkr9R/Ojxk6fPtrY7z1+8fPW6u/PmzJnachyI40dFcyBFBqGKFDCqLAVCHhvJgfLuvnv8A6YfQpLioYKzbVohScYZDy7jxDuEYP1jY50n2alZxTzNXq9yL/bS5PK/c0GzHxr8LT0s6L0182l74tPDc0XFsH+F0teg/miybu9ZJCsQO+TtCU90uI434k+ZhPDawUauWTOXaRJhWPLAouoelktYOK8TmbwkWgmilwY79KpaEfgjKhpbHhaqQr9bDM+XcQhWhUzGcuc3aUnyohjPV3NXk1FgRZMEfKGxMi+Xe2Atd1Qiar4cta0nR0OWa6ERY4CgXgTAe/IJTPmNhNxiW2clWRn9olGJ64pbJps53idnu4M0GaQnX3oH39qMt8g78p70SUq+kgPynRyTIeHkD/kXkSiKP8cn8Sj+uW6No9bzltxBzG8AkrLQdg=</latexit><latexit sha1_base64="5U5SI+PZ7uJNFYG/ReSMtHFZ5Po=">ACqXicdVFNaxsxENVuvxL3y2mPvYiaFpeC2Q2F5hIzaWnNIE4Mc06i1aetYX1sZFm2xixP7LH/pJeI9t7SJzmCcHTm3lieFNUjhMkr9R/Ojxk6fPtrY7z1+8fPW6u/PmzJnachyI40dFcyBFBqGKFDCqLAVCHhvJgfLuvnv8A6YfQpLioYKzbVohScYZDy7jxDuEYP1jY50n2alZxTzNXq9yL/bS5PK/c0GzHxr8LT0s6L0182l74tPDc0XFsH+F0teg/miybu9ZJCsQO+TtCU90uI434k+ZhPDawUauWTOXaRJhWPLAouoelktYOK8TmbwkWgmilwY79KpaEfgjKhpbHhaqQr9bDM+XcQhWhUzGcuc3aUnyohjPV3NXk1FgRZMEfKGxMi+Xe2Atd1Qiar4cta0nR0OWa6ERY4CgXgTAe/IJTPmNhNxiW2clWRn9olGJ64pbJps53idnu4M0GaQnX3oH39qMt8g78p70SUq+kgPynRyTIeHkD/kXkSiKP8cn8Sj+uW6No9bzltxBzG8AkrLQdg=</latexit><latexit sha1_base64="5U5SI+PZ7uJNFYG/ReSMtHFZ5Po=">ACqXicdVFNaxsxENVuvxL3y2mPvYiaFpeC2Q2F5hIzaWnNIE4Mc06i1aetYX1sZFm2xixP7LH/pJeI9t7SJzmCcHTm3lieFNUjhMkr9R/Ojxk6fPtrY7z1+8fPW6u/PmzJnachyI40dFcyBFBqGKFDCqLAVCHhvJgfLuvnv8A6YfQpLioYKzbVohScYZDy7jxDuEYP1jY50n2alZxTzNXq9yL/bS5PK/c0GzHxr8LT0s6L0182l74tPDc0XFsH+F0teg/miybu9ZJCsQO+TtCU90uI434k+ZhPDawUauWTOXaRJhWPLAouoelktYOK8TmbwkWgmilwY79KpaEfgjKhpbHhaqQr9bDM+XcQhWhUzGcuc3aUnyohjPV3NXk1FgRZMEfKGxMi+Xe2Atd1Qiar4cta0nR0OWa6ERY4CgXgTAe/IJTPmNhNxiW2clWRn9olGJ64pbJps53idnu4M0GaQnX3oH39qMt8g78p70SUq+kgPynRyTIeHkD/kXkSiKP8cn8Sj+uW6No9bzltxBzG8AkrLQdg=</latexit><latexit 
sha1_base64="5U5SI+PZ7uJNFYG/ReSMtHFZ5Po=">ACqXicdVFNaxsxENVuvxL3y2mPvYiaFpeC2Q2F5hIzaWnNIE4Mc06i1aetYX1sZFm2xixP7LH/pJeI9t7SJzmCcHTm3lieFNUjhMkr9R/Ojxk6fPtrY7z1+8fPW6u/PmzJnachyI40dFcyBFBqGKFDCqLAVCHhvJgfLuvnv8A6YfQpLioYKzbVohScYZDy7jxDuEYP1jY50n2alZxTzNXq9yL/bS5PK/c0GzHxr8LT0s6L0182l74tPDc0XFsH+F0teg/miybu9ZJCsQO+TtCU90uI434k+ZhPDawUauWTOXaRJhWPLAouoelktYOK8TmbwkWgmilwY79KpaEfgjKhpbHhaqQr9bDM+XcQhWhUzGcuc3aUnyohjPV3NXk1FgRZMEfKGxMi+Xe2Atd1Qiar4cta0nR0OWa6ERY4CgXgTAe/IJTPmNhNxiW2clWRn9olGJ64pbJps53idnu4M0GaQnX3oH39qMt8g78p70SUq+kgPynRyTIeHkD/kXkSiKP8cn8Sj+uW6No9bzltxBzG8AkrLQdg=</latexit>

αt = 1 2 log 1 − errt errt

  • <latexit sha1_base64="o3z3ZC8kU1t+nm6vA42CJrkmjUo=">ACk3icdVFNaxRBEO0dv7Lr10bx5KVxUeLBZSYICiKExIMXIYKbBDLUtNbM9OkP4buGnFp5pf5Szx61T9h72YOZkMKGl69Vw+qXxWNkp7S9NcguX7zt17O8PR/QcPHz0e7z458bZ1AmfCKuvOCvCopMEZSVJ41jgEXSg8LS6O1vrpd3ReWvONVg3ONVRGlIARWoxng2HOaimhgXxjzwvHYiQdWG/46OoKFvlCkva64U3PCf8QGd6KhC1s9z52sanq9GE/Sabopfh1kPZiwvo4Xu4NX+dKVqMhocD78yxtaB7AkRQKu1HemxAXECF5xEa0OjnYfP/jr+MzJKX1sVniG/Y/x0BtPcrXcRJDVT7bW1N3qRrburnKqsk5GW4gZha1sq38+DNE1LaMTlsmWrOFm+PghfSoeC1CoCENEvBRc1xLgpnm2Ub4zhyGoNZum7mGy2neN1cLI/zdJp9vXt5OCwz3iHPWcv2B7L2Dt2wD6zYzZjgv1kv9kf9jd5lnxIDpNPl6PJoPc8ZVcq+fIPOUDMCw=</latexit><latexit sha1_base64="o3z3ZC8kU1t+nm6vA42CJrkmjUo=">ACk3icdVFNaxRBEO0dv7Lr10bx5KVxUeLBZSYICiKExIMXIYKbBDLUtNbM9OkP4buGnFp5pf5Szx61T9h72YOZkMKGl69Vw+qXxWNkp7S9NcguX7zt17O8PR/QcPHz0e7z458bZ1AmfCKuvOCvCopMEZSVJ41jgEXSg8LS6O1vrpd3ReWvONVg3ONVRGlIARWoxng2HOaimhgXxjzwvHYiQdWG/46OoKFvlCkva64U3PCf8QGd6KhC1s9z52sanq9GE/Sabopfh1kPZiwvo4Xu4NX+dKVqMhocD78yxtaB7AkRQKu1HemxAXECF5xEa0OjnYfP/jr+MzJKX1sVniG/Y/x0BtPcrXcRJDVT7bW1N3qRrburnKqsk5GW4gZha1sq38+DNE1LaMTlsmWrOFm+PghfSoeC1CoCENEvBRc1xLgpnm2Ub4zhyGoNZum7mGy2neN1cLI/zdJp9vXt5OCwz3iHPWcv2B7L2Dt2wD6zYzZjgv1kv9kf9jd5lnxIDpNPl6PJoPc8ZVcq+fIPOUDMCw=</latexit><latexit sha1_base64="o3z3ZC8kU1t+nm6vA42CJrkmjUo=">ACk3icdVFNaxRBEO0dv7Lr10bx5KVxUeLBZSYICiKExIMXIYKbBDLUtNbM9OkP4buGnFp5pf5Szx61T9h72YOZkMKGl69Vw+qXxWNkp7S9NcguX7zt17O8PR/QcPHz0e7z458bZ1AmfCKuvOCvCopMEZSVJ41jgEXSg8LS6O1vrpd3ReWvONVg3ONVRGlIARWoxng2HOaimhgXxjzwvHYiQdWG/46OoKFvlCkva64U3PCf8QGd6KhC1s9z52sanq9GE/Sabopfh1kPZiwvo4Xu4NX+dKVqMhocD78yxtaB7AkRQKu1HemxAXECF5xEa0OjnYfP/jr+MzJKX1sVniG/Y/x0BtPcrXcRJDVT7bW1N3qRrburnKqsk5GW4gZha1sq38+DNE1LaMTlsmWrOFm+PghfSoeC1CoCENEvBRc1xLgpnm2Ub4zhyGoNZum7mGy2neN1cLI/zdJp9vXt5OCwz3iHPWcv2B7L2Dt2wD6zYzZjgv1kv9kf9jd5lnxIDpNPl6PJoPc8ZVcq+fIPOUDMCw=</latexit><latexit 
sha1_base64="o3z3ZC8kU1t+nm6vA42CJrkmjUo=">ACk3icdVFNaxRBEO0dv7Lr10bx5KVxUeLBZSYICiKExIMXIYKbBDLUtNbM9OkP4buGnFp5pf5Szx61T9h72YOZkMKGl69Vw+qXxWNkp7S9NcguX7zt17O8PR/QcPHz0e7z458bZ1AmfCKuvOCvCopMEZSVJ41jgEXSg8LS6O1vrpd3ReWvONVg3ONVRGlIARWoxng2HOaimhgXxjzwvHYiQdWG/46OoKFvlCkva64U3PCf8QGd6KhC1s9z52sanq9GE/Sabopfh1kPZiwvo4Xu4NX+dKVqMhocD78yxtaB7AkRQKu1HemxAXECF5xEa0OjnYfP/jr+MzJKX1sVniG/Y/x0BtPcrXcRJDVT7bW1N3qRrburnKqsk5GW4gZha1sq38+DNE1LaMTlsmWrOFm+PghfSoeC1CoCENEvBRc1xLgpnm2Ub4zhyGoNZum7mGy2neN1cLI/zdJp9vXt5OCwz3iHPWcv2B7L2Dt2wD6zYzZjgv1kv9kf9jd5lnxIDpNPl6PJoPc8ZVcq+fIPOUDMCw=</latexit>

wi ← wi exp

  • 2αtI{ht(x(i)) 6= t(i)}
  • <latexit sha1_base64="E7ey2D1vUl4iSw4a8mCWl+cdJ5s=">ACl3icdZHbahsxEIbl7SGpe4jTXpXeiJoW+8bshkJ719BAyF0TqJOUyF208qxXRIeNvELPtseY4+QG7TV6i89kXjkAHBP9/oh+GfrFTSYxz/6USPHj95urH5rPv8xctXW73t18feVk7AWFhl3WnGPShpYIwSFZyWDrjOFJxk53uL+clvcF5a8wPnJUw0nxmZS8ExoLT38zKVlCnIkTtnL2nbwlW5ZAO6w7gqC54iZd8N1LRIA2RZXl81v+qBHDZDygxcUFx2lDaUOTkrcJj2+vEoboveF8lK9MmqDtPtzkc2taLSYFAo7v1ZEpc4qblDKRQ0XVZ5KLk45zM4C9JwDX5Stxk09EMgU5pbF5B2tL/HTX3s91Fn5qjoVfny3gQzMsdHOXqZl1MmApHhisbYv5l0ktTVkhGLFcNq8URUsXR6FT6UCgmgfBRfBLQUXBHRcYTtdlrbHes1pzM/VNSDZz/G+ON4ZJfEoOfrU3/2yniTvCPvyYAk5DPZJQfkIyJINfkhtySv9Hb6Gu0Hx0sv0adlecNuVPR0T9Ty80k</latexit><latexit sha1_base64="E7ey2D1vUl4iSw4a8mCWl+cdJ5s=">ACl3icdZHbahsxEIbl7SGpe4jTXpXeiJoW+8bshkJ719BAyF0TqJOUyF208qxXRIeNvELPtseY4+QG7TV6i89kXjkAHBP9/oh+GfrFTSYxz/6USPHj95urH5rPv8xctXW73t18feVk7AWFhl3WnGPShpYIwSFZyWDrjOFJxk53uL+clvcF5a8wPnJUw0nxmZS8ExoLT38zKVlCnIkTtnL2nbwlW5ZAO6w7gqC54iZd8N1LRIA2RZXl81v+qBHDZDygxcUFx2lDaUOTkrcJj2+vEoboveF8lK9MmqDtPtzkc2taLSYFAo7v1ZEpc4qblDKRQ0XVZ5KLk45zM4C9JwDX5Stxk09EMgU5pbF5B2tL/HTX3s91Fn5qjoVfny3gQzMsdHOXqZl1MmApHhisbYv5l0ktTVkhGLFcNq8URUsXR6FT6UCgmgfBRfBLQUXBHRcYTtdlrbHes1pzM/VNSDZz/G+ON4ZJfEoOfrU3/2yniTvCPvyYAk5DPZJQfkIyJINfkhtySv9Hb6Gu0Hx0sv0adlecNuVPR0T9Ty80k</latexit><latexit sha1_base64="E7ey2D1vUl4iSw4a8mCWl+cdJ5s=">ACl3icdZHbahsxEIbl7SGpe4jTXpXeiJoW+8bshkJ719BAyF0TqJOUyF208qxXRIeNvELPtseY4+QG7TV6i89kXjkAHBP9/oh+GfrFTSYxz/6USPHj95urH5rPv8xctXW73t18feVk7AWFhl3WnGPShpYIwSFZyWDrjOFJxk53uL+clvcF5a8wPnJUw0nxmZS8ExoLT38zKVlCnIkTtnL2nbwlW5ZAO6w7gqC54iZd8N1LRIA2RZXl81v+qBHDZDygxcUFx2lDaUOTkrcJj2+vEoboveF8lK9MmqDtPtzkc2taLSYFAo7v1ZEpc4qblDKRQ0XVZ5KLk45zM4C9JwDX5Stxk09EMgU5pbF5B2tL/HTX3s91Fn5qjoVfny3gQzMsdHOXqZl1MmApHhisbYv5l0ktTVkhGLFcNq8URUsXR6FT6UCgmgfBRfBLQUXBHRcYTtdlrbHes1pzM/VNSDZz/G+ON4ZJfEoOfrU3/2yniTvCPvyYAk5DPZJQfkIyJINfkhtySv9Hb6Gu0Hx0sv0adlecNuVPR0T9Ty80k</latexit><latexit 
sha1_base64="E7ey2D1vUl4iSw4a8mCWl+cdJ5s=">ACl3icdZHbahsxEIbl7SGpe4jTXpXeiJoW+8bshkJ719BAyF0TqJOUyF208qxXRIeNvELPtseY4+QG7TV6i89kXjkAHBP9/oh+GfrFTSYxz/6USPHj95urH5rPv8xctXW73t18feVk7AWFhl3WnGPShpYIwSFZyWDrjOFJxk53uL+clvcF5a8wPnJUw0nxmZS8ExoLT38zKVlCnIkTtnL2nbwlW5ZAO6w7gqC54iZd8N1LRIA2RZXl81v+qBHDZDygxcUFx2lDaUOTkrcJj2+vEoboveF8lK9MmqDtPtzkc2taLSYFAo7v1ZEpc4qblDKRQ0XVZ5KLk45zM4C9JwDX5Stxk09EMgU5pbF5B2tL/HTX3s91Fn5qjoVfny3gQzMsdHOXqZl1MmApHhisbYv5l0ktTVkhGLFcNq8URUsXR6FT6UCgmgfBRfBLQUXBHRcYTtdlrbHes1pzM/VNSDZz/G+ON4ZJfEoOfrU3/2yniTvCPvyYAk5DPZJQfkIyJINfkhtySv9Hb6Gu0Hx0sv0adlecNuVPR0T9Ty80k</latexit>
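The three quantities in this diagram (weighted error, coefficient, and re-weighting) can be sketched for a single boosting round as follows. This is a minimal illustration rather than a reference implementation: the function name `adaboost_round` and the toy predictions are assumptions, and the weak learner's outputs are simply given as a list.

```python
import math

def adaboost_round(weights, predictions, targets):
    """One AdaBoost round: weighted error, coefficient, and re-weighted samples."""
    # Indicator I{h_t(x^(i)) != t^(i)} for each sample.
    miss = [1.0 if p != t else 0.0 for p, t in zip(predictions, targets)]
    err = sum(w * m for w, m in zip(weights, miss)) / sum(weights)
    alpha = 0.5 * math.log((1.0 - err) / err)   # assumes 0 < err < 1
    # Only misclassified samples have their weight increased.
    new_weights = [w * math.exp(2.0 * alpha * m) for w, m in zip(weights, miss)]
    return err, alpha, new_weights

targets = [+1, +1, -1, -1]
predictions = [+1, -1, -1, -1]        # hypothetical weak learner output, one mistake
weights = [0.25, 0.25, 0.25, 0.25]
err, alpha, new_weights = adaboost_round(weights, predictions, targets)
```

In this toy case the single misclassified sample ends up with weight 0.75 while the other three stay at 0.25, so the next weak learner is pushed toward getting that sample right.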


slide-26
SLIDE 26

Additive Models

Consider a hypothesis class $\mathcal{H}$ with each $h_i : x \mapsto \{-1, +1\}$ within $\mathcal{H}$, i.e., $h_i \in \mathcal{H}$. These are the “weak learners”, and in this context they’re also called bases.

An additive model with m terms is given by

$$H_m(x) = \sum_{i=1}^{m} \alpha_i h_i(x),$$

where $(\alpha_1, \cdots, \alpha_m) \in \mathbb{R}^m$.

Observe that we’re taking a linear combination of base classifiers, just like in boosting. We’ll now interpret AdaBoost as a way of fitting an additive model.
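As a small illustration of the additive model, the sketch below evaluates a weighted sum of three base classifiers on a scalar input. The threshold classifiers `h1`, `h2`, `h3`, the coefficients, and the helper `additive_model` are all hypothetical choices made for the example.

```python
def additive_model(alphas, learners):
    """Return H_m as a function: H_m(x) = sum_i alpha_i * h_i(x)."""
    def H(x):
        return sum(a * h(x) for a, h in zip(alphas, learners))
    return H

# Hypothetical weak learners on scalar inputs, each with outputs in {-1, +1}.
h1 = lambda x: 1 if x > 0.0 else -1
h2 = lambda x: 1 if x > 2.0 else -1
h3 = lambda x: -1 if x > 4.0 else 1

H3 = additive_model([0.5, 0.3, 0.4], [h1, h2, h3])
score = H3(1.0)                     # 0.5*(+1) + 0.3*(-1) + 0.4*(+1)
label = 1 if score >= 0 else -1     # taking the sign gives a class prediction
```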


slide-27
SLIDE 27

Stagewise Training of Additive Models

A greedy approach to fitting additive models, known as stagewise training:

  1. Initialize $H_0(x) = 0$
  2. For m = 1 to T:

     ◮ Compute the m-th hypothesis and its coefficient

     $$(h_m, \alpha_m) \leftarrow \operatorname*{argmin}_{h \in \mathcal{H},\, \alpha} \sum_{i=1}^{N} \mathcal{L}\left( H_{m-1}(x^{(i)}) + \alpha h(x^{(i)}),\; t^{(i)} \right)$$

     ◮ Add it to the additive model

     $$H_m = H_{m-1} + \alpha_m h_m$$
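The stagewise procedure can be sketched directly for a finite set of candidate hypotheses, with the joint minimization over h and α done by brute force. Everything below is an assumption made for illustration: the threshold "stumps" on scalar inputs, the grid of α values, the toy data, and the choice of exponential loss as the plugged-in loss.

```python
import math

def stagewise_fit(xs, ts, learners, alphas, loss, T):
    """Greedy stagewise fit of H_T = sum_m alpha_m * h_m over finite candidate sets."""
    scores = [0.0] * len(xs)                  # H_0(x^(i)) = 0
    model, history = [], []
    for _ in range(T):
        best = None
        for h in learners:
            preds = [h(x) for x in xs]
            for a in alphas:
                total = sum(loss(s + a * p, t) for s, p, t in zip(scores, preds, ts))
                if best is None or total < best[0]:
                    best = (total, h, a, preds)
        total, h_m, a_m, preds = best
        model.append((a_m, h_m))
        history.append(total)                 # training loss after adding term m
        scores = [s + a_m * p for s, p in zip(scores, preds)]   # H_m = H_{m-1} + alpha_m h_m
    return model, history

def exp_loss(y, t):
    return math.exp(-t * y)

def make_stump(threshold, sign):
    # Hypothetical weak learner: outputs sign if x > threshold, else -sign.
    return lambda x: sign if x > threshold else -sign

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ts = [+1, +1, -1, -1, +1, +1]
learners = [make_stump(th, s) for th in (0.5, 1.5, 2.5, 3.5, 4.5) for s in (+1, -1)]
alphas = [0.05 * k for k in range(1, 41)]     # grid over (0, 2]
model, history = stagewise_fit(xs, ts, learners, alphas, exp_loss, T=3)
```

The brute-force search is only workable because the candidate sets are tiny; the point of the slides that follow is that for the exponential loss this inner minimization has a closed form.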


slide-28
SLIDE 28

Additive Models with Exponential Loss

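Consider the exponential loss

$$\mathcal{L}_E(y, t) = \exp(-ty).$$

We want to see how the stagewise training of additive models can be done. The stagewise update is

$$(h_m, \alpha_m) \leftarrow \operatorname*{argmin}_{h \in \mathcal{H},\, \alpha} \sum_{i=1}^{N} \exp\left( -\left[ H_{m-1}(x^{(i)}) + \alpha h(x^{(i)}) \right] t^{(i)} \right),$$

and the objective can be rewritten as

$$\begin{aligned}
\sum_{i=1}^{N} \exp\left( -\left[ H_{m-1}(x^{(i)}) + \alpha h(x^{(i)}) \right] t^{(i)} \right)
&= \sum_{i=1}^{N} \exp\left( -H_{m-1}(x^{(i)}) t^{(i)} - \alpha h(x^{(i)}) t^{(i)} \right) \\
&= \sum_{i=1}^{N} \exp\left( -H_{m-1}(x^{(i)}) t^{(i)} \right) \exp\left( -\alpha h(x^{(i)}) t^{(i)} \right) \\
&= \sum_{i=1}^{N} w_i^{(m)} \exp\left( -\alpha h(x^{(i)}) t^{(i)} \right).
\end{aligned}$$

Here we defined $w_i^{(m)} \triangleq \exp\left( -H_{m-1}(x^{(i)}) t^{(i)} \right)$.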
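A quick numerical check of this factorization, under made-up values for the previous scores, the targets, a candidate hypothesis, and α (none of these numbers come from the lecture):

```python
import math

# Hypothetical values: H_{m-1}(x^(i)), targets t^(i), and candidate outputs h(x^(i)).
prev_scores = [0.8, -0.3, 0.1, -1.2]
targets = [+1, +1, -1, -1]
h_out = [+1, -1, -1, +1]
alpha = 0.4

# Direct objective: sum_i exp(-(H_{m-1}(x_i) + alpha * h(x_i)) * t_i)
direct = sum(math.exp(-(s + alpha * h) * t)
             for s, h, t in zip(prev_scores, h_out, targets))

# Factored objective with w_i = exp(-H_{m-1}(x_i) * t_i)
weights = [math.exp(-s * t) for s, t in zip(prev_scores, targets)]
factored = sum(w * math.exp(-alpha * h * t)
               for w, h, t in zip(weights, h_out, targets))
```

The two sums agree up to floating-point rounding, which is the sense in which the previous rounds enter the new problem only through the per-sample weights.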


slide-29
SLIDE 29

Additive Models with Exponential Loss

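We want to solve the following minimization problem:

$$(h_m, \alpha_m) \leftarrow \operatorname*{argmin}_{h \in \mathcal{H},\, \alpha} \sum_{i=1}^{N} w_i^{(m)} \exp\left( -\alpha h(x^{(i)}) t^{(i)} \right).$$

If $h(x^{(i)}) = t^{(i)}$, we have $\exp\left( -\alpha h(x^{(i)}) t^{(i)} \right) = \exp(-\alpha)$.

If $h(x^{(i)}) \neq t^{(i)}$, we have $\exp\left( -\alpha h(x^{(i)}) t^{(i)} \right) = \exp(+\alpha)$.

(Recall that we are in the binary classification case with {−1, +1} output values.)

We can divide the summation into two parts:

$$\begin{aligned}
\sum_{i=1}^{N} w_i^{(m)} \exp\left( -\alpha h(x^{(i)}) t^{(i)} \right)
&= e^{-\alpha} \sum_{i=1}^{N} w_i^{(m)} \mathbb{I}\{h(x^{(i)}) = t^{(i)}\} + e^{\alpha} \sum_{i=1}^{N} w_i^{(m)} \mathbb{I}\{h(x^{(i)}) \neq t^{(i)}\} \\
&= (e^{\alpha} - e^{-\alpha}) \sum_{i=1}^{N} w_i^{(m)} \mathbb{I}\{h(x^{(i)}) \neq t^{(i)}\} + e^{-\alpha} \sum_{i=1}^{N} w_i^{(m)} \left[ \mathbb{I}\{h(x^{(i)}) = t^{(i)}\} + \mathbb{I}\{h(x^{(i)}) \neq t^{(i)}\} \right]
\end{aligned}$$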
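The split of the sum into a misclassified part and a constant part can be checked numerically. The weights, targets, and hypothesis outputs below are hypothetical values chosen only to exercise the identity.

```python
import math

weights = [0.5, 1.5, 1.0, 2.0]      # hypothetical w_i^(m) > 0
targets = [+1, +1, -1, -1]
h_out = [+1, -1, -1, +1]            # hypothetical h(x^(i)); samples 2 and 4 are wrong
alpha = 0.4

# Left-hand side: sum_i w_i * exp(-alpha * h(x_i) * t_i)
lhs = sum(w * math.exp(-alpha * h * t) for w, h, t in zip(weights, h_out, targets))

# Right-hand side: (e^a - e^-a) * (weight of mistakes) + e^-a * (total weight)
w_wrong = sum(w for w, h, t in zip(weights, h_out, targets) if h != t)
w_total = sum(weights)
rhs = (math.exp(alpha) - math.exp(-alpha)) * w_wrong + math.exp(-alpha) * w_total
```

Only `w_wrong` depends on the hypothesis, which is what makes the later minimization over h separate cleanly from the one over α.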

slide-30
SLIDE 30

Additive Models with Exponential Loss

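$$\begin{aligned}
\sum_{i=1}^{N} w_i^{(m)} \exp\left( -\alpha h(x^{(i)}) t^{(i)} \right)
&= (e^{\alpha} - e^{-\alpha}) \sum_{i=1}^{N} w_i^{(m)} \mathbb{I}\{h(x^{(i)}) \neq t^{(i)}\} + e^{-\alpha} \sum_{i=1}^{N} w_i^{(m)} \left[ \mathbb{I}\{h(x^{(i)}) = t^{(i)}\} + \mathbb{I}\{h(x^{(i)}) \neq t^{(i)}\} \right] \\
&= (e^{\alpha} - e^{-\alpha}) \sum_{i=1}^{N} w_i^{(m)} \mathbb{I}\{h(x^{(i)}) \neq t^{(i)}\} + e^{-\alpha} \sum_{i=1}^{N} w_i^{(m)}.
\end{aligned}$$

Let us first optimize h: The second term on the RHS does not depend on h, and for $\alpha > 0$ the factor $e^{\alpha} - e^{-\alpha}$ is positive. So we get

$$h_m \leftarrow \operatorname*{argmin}_{h \in \mathcal{H}} \sum_{i=1}^{N} w_i^{(m)} \exp\left( -\alpha h(x^{(i)}) t^{(i)} \right) \equiv \operatorname*{argmin}_{h \in \mathcal{H}} \sum_{i=1}^{N} w_i^{(m)} \mathbb{I}\{h(x^{(i)}) \neq t^{(i)}\}.$$

This means that $h_m$ is the minimizer of the weighted 0/1-loss.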
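To see this equivalence concretely, the sketch below picks the best hypothesis from a small finite set under both criteria and for several positive values of α. The stumps, data, and weights are hypothetical, chosen so that the weighted 0/1-loss has a unique minimizer.

```python
import math

def make_stump(threshold, sign):
    # Hypothetical weak learner: outputs sign if x > threshold, else -sign.
    return lambda x: sign if x > threshold else -sign

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ts = [-1, -1, +1, +1, -1]
weights = [1.0, 2.0, 1.5, 0.5, 0.7]          # hypothetical w_i^(m)
candidates = [(th, s) for th in (0.5, 1.5, 2.5, 3.5) for s in (+1, -1)]

def weighted_01(params):
    h = make_stump(*params)
    return sum(w for w, x, t in zip(weights, xs, ts) if h(x) != t)

def exp_objective(params, alpha):
    h = make_stump(*params)
    return sum(w * math.exp(-alpha * h(x) * t) for w, x, t in zip(weights, xs, ts))

best_01 = min(candidates, key=weighted_01)
best_exp = {a: min(candidates, key=lambda p: exp_objective(p, a))
            for a in (0.1, 0.5, 2.0)}
```

For every positive α tried, the exponential-loss minimizer coincides with the weighted 0/1-loss minimizer, here the stump with threshold 1.5 and sign +1.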


slide-31
SLIDE 31

Additive Models with Exponential Loss

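Now that we have obtained $h_m$, we want to find $\alpha$. Define the weighted classification error:

$$\mathrm{err}_m = \frac{\sum_{i=1}^{N} w_i^{(m)} \mathbb{I}\{h_m(x^{(i)}) \neq t^{(i)}\}}{\sum_{i=1}^{N} w_i^{(m)}}$$

With this definition, and because $h_m$ attains the inner minimum over $h \in \mathcal{H}$, we have

$$\begin{aligned}
\min_{\alpha} \min_{h \in \mathcal{H}} \sum_{i=1}^{N} w_i^{(m)} \exp\left( -\alpha h(x^{(i)}) t^{(i)} \right)
&= \min_{\alpha} \left\{ (e^{\alpha} - e^{-\alpha}) \sum_{i=1}^{N} w_i^{(m)} \mathbb{I}\{h_m(x^{(i)}) \neq t^{(i)}\} + e^{-\alpha} \sum_{i=1}^{N} w_i^{(m)} \right\} \\
&= \min_{\alpha} \left\{ (e^{\alpha} - e^{-\alpha}) \, \mathrm{err}_m \sum_{i=1}^{N} w_i^{(m)} + e^{-\alpha} \sum_{i=1}^{N} w_i^{(m)} \right\}
\end{aligned}$$

Take the derivative w.r.t. $\alpha$ and set it to zero. After dividing by $\sum_{i=1}^{N} w_i^{(m)}$, this gives $(e^{\alpha} + e^{-\alpha}) \, \mathrm{err}_m - e^{-\alpha} = 0$. We get that

$$e^{2\alpha} = \frac{1 - \mathrm{err}_m}{\mathrm{err}_m} \;\Rightarrow\; \alpha = \frac{1}{2} \log\left( \frac{1 - \mathrm{err}_m}{\mathrm{err}_m} \right).$$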
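The closed-form coefficient can be sanity-checked against a brute-force search over α. The sketch below assumes a hypothetical weighted error of 0.2 and works with the objective divided by the total weight, which does not change the minimizer.

```python
import math

def objective(alpha, err):
    """Exponential-loss objective divided by sum_i w_i: (e^a - e^-a) * err + e^-a."""
    return (math.exp(alpha) - math.exp(-alpha)) * err + math.exp(-alpha)

err = 0.2                                            # hypothetical err_m, 0 < err_m < 0.5
alpha_closed = 0.5 * math.log((1.0 - err) / err)     # 0.5 * log(4) = log(2)

# Brute-force minimization over a fine grid of alpha values in (0, 3].
grid = [k / 1000.0 for k in range(1, 3001)]
alpha_grid = min(grid, key=lambda a: objective(a, err))
```

The grid minimizer lands within one grid step of the closed-form value, as expected for a convex objective in α.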


slide-32
SLIDE 32

Additive Models with Exponential Loss

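The updated weights for the next iteration are

$$\begin{aligned}
w_i^{(m+1)} &= \exp\left( -H_m(x^{(i)}) t^{(i)} \right) \\
&= \exp\left( -\left[ H_{m-1}(x^{(i)}) + \alpha_m h_m(x^{(i)}) \right] t^{(i)} \right) \\
&= \exp\left( -H_{m-1}(x^{(i)}) t^{(i)} \right) \exp\left( -\alpha_m h_m(x^{(i)}) t^{(i)} \right) \\
&= w_i^{(m)} \exp\left( -\alpha_m h_m(x^{(i)}) t^{(i)} \right) \\
&= w_i^{(m)} \exp\left( -\alpha_m \left( 2 \mathbb{I}\{h_m(x^{(i)}) = t^{(i)}\} - 1 \right) \right) \\
&= \exp(\alpha_m) \, w_i^{(m)} \exp\left( -2 \alpha_m \mathbb{I}\{h_m(x^{(i)}) = t^{(i)}\} \right).
\end{aligned}$$

The factor $\exp(\alpha_m)$ multiplies the weights of all samples equally, so it does not affect the minimization of $h_{m+1}$ or $\alpha_{m+1}$.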
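The sketch below checks, on hypothetical numbers, that the two forms of the weight update agree and that dropping the common factor exp(α_m) leaves the normalized weights unchanged. The helper `normalize` and all values are assumptions for illustration.

```python
import math

targets = [+1, +1, -1, -1]
h_out = [+1, -1, -1, -1]            # hypothetical h_m(x^(i)); sample 2 is misclassified
weights = [0.4, 0.2, 0.3, 0.1]      # hypothetical w_i^(m)
alpha = 0.7                         # hypothetical alpha_m

def correct(h, t):
    return 1.0 if h == t else 0.0   # indicator I{h_m(x^(i)) = t^(i)}

# Form 1: w_i * exp(-alpha * h * t)
w_a = [w * math.exp(-alpha * h * t) for w, h, t in zip(weights, h_out, targets)]
# Form 2: exp(alpha) * w_i * exp(-2 * alpha * I{h = t})
w_b = [math.exp(alpha) * w * math.exp(-2.0 * alpha * correct(h, t))
       for w, h, t in zip(weights, h_out, targets)]
# Form 2 without the common factor exp(alpha)
w_c = [w * math.exp(-2.0 * alpha * correct(h, t))
       for w, h, t in zip(weights, h_out, targets)]

def normalize(ws):
    total = sum(ws)
    return [w / total for w in ws]
```

Since the weighted error is a ratio of weighted sums, any common rescaling of the weights cancels, which is why implementations are free to normalize the weights after each round.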


slide-33
SLIDE 33

Additive Models with Exponential Loss

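To summarize, we obtain the additive model $H_m(x) = \sum_{i=1}^{m} \alpha_i h_i(x)$ with

$$h_m \leftarrow \operatorname*{argmin}_{h \in \mathcal{H}} \sum_{i=1}^{N} w_i^{(m)} \mathbb{I}\{h(x^{(i)}) \neq t^{(i)}\},$$

$$\alpha_m = \frac{1}{2} \log\left( \frac{1 - \mathrm{err}_m}{\mathrm{err}_m} \right), \quad \text{where } \mathrm{err}_m = \frac{\sum_{i=1}^{N} w_i^{(m)} \mathbb{I}\{h_m(x^{(i)}) \neq t^{(i)}\}}{\sum_{i=1}^{N} w_i^{(m)}},$$

$$w_i^{(m+1)} = w_i^{(m)} \exp\left( -\alpha_m h_m(x^{(i)}) t^{(i)} \right).$$

We derived the AdaBoost algorithm!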
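Putting the three summary equations together gives a compact training loop. The sketch below is one possible rendering under stated assumptions: the weak learners are hypothetical threshold stumps on scalar inputs, the hypothesis class is a small finite list, and the early-exit guard on the error is an addition for numerical safety rather than something taken from the slides.

```python
import math

def make_stump(threshold, sign):
    """Hypothetical weak learner: outputs sign if x > threshold, else -sign."""
    return lambda x: sign if x > threshold else -sign

def adaboost(xs, ts, learners, T):
    weights = [1.0] * len(xs)            # w_i^(1) = exp(0) = 1, since H_0 = 0
    model = []
    for _ in range(T):
        def weighted_miss(h):
            return sum(w for w, x, t in zip(weights, xs, ts) if h(x) != t)
        h_m = min(learners, key=weighted_miss)        # weighted 0/1-loss minimizer
        err = weighted_miss(h_m) / sum(weights)
        if err <= 0.0 or err >= 0.5:
            break                        # guard added for this sketch only
        alpha = 0.5 * math.log((1.0 - err) / err)
        model.append((alpha, h_m))
        # w_i^(m+1) = w_i^(m) * exp(-alpha_m * h_m(x_i) * t_i)
        weights = [w * math.exp(-alpha * h_m(x) * t)
                   for w, x, t in zip(weights, xs, ts)]
    return model, weights

def predict(model, x):
    score = sum(a * h(x) for a, h in model)
    return 1 if score >= 0 else -1

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ts = [+1, +1, -1, -1, +1, +1]
learners = [make_stump(th, s) for th in (0.5, 1.5, 2.5, 3.5, 4.5) for s in (+1, -1)]
model, weights = adaboost(xs, ts, learners, T=3)
exp_loss_total = sum(weights)            # equals sum_i exp(-t_i * H_T(x_i))
```

Because the weights are left unnormalized, their sum is exactly the training exponential loss of the current additive model, so it can be monitored directly as training proceeds.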


slide-34
SLIDE 34

Revisiting Loss Functions for Classification

If AdaBoost is minimizing exponential loss, what does that say about its behavior (compared to, say, logistic regression)? This interpretation allows boosting to be generalized to lots of other loss functions!
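One way to start exploring this question is to tabulate the two losses as functions of the margin t·z. The sketch below assumes the logistic (cross-entropy) loss written for targets in {−1, +1}, i.e. log(1 + exp(−t·z)); the margin values are arbitrary.

```python
import math

def exp_loss(margin):
    # Exponential loss as a function of the margin t * z.
    return math.exp(-margin)

def logistic_loss(margin):
    # Cross-entropy for targets in {-1, +1}: log(1 + exp(-t * z)).
    return math.log1p(math.exp(-margin))

margins = [-4.0, -2.0, 0.0, 2.0, 4.0]
table = [(m, exp_loss(m), logistic_loss(m)) for m in margins]
for m, le, ll in table:
    print(f"margin={m:+.1f}  exponential={le:8.3f}  logistic={ll:6.3f}")
```

For large negative margins the exponential loss grows exponentially while the logistic loss grows roughly linearly, so badly misclassified points weigh far more heavily under the exponential loss.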
