A Theoretical Analysis of Metric Hypothesis Transfer Learning

Michaël Perrot (MICHAEL.PERROT@UNIV-ST-ETIENNE.FR)
Amaury Habrard (AMAURY.HABRARD@UNIV-ST-ETIENNE.FR)
Université de Lyon, Université Jean Monnet de Saint-Etienne, Laboratoire Hubert Curien, CNRS, UMR5516, F-42000, Saint-Etienne, France.

Abstract

We consider the problem of transferring some a priori knowledge in the context of supervised metric learning approaches. While this setting has been successfully applied in some empirical contexts, no theoretical evidence exists to justify this approach. In this paper, we provide a theoretical justification based on the notion of algorithmic stability adapted to the regularized metric learning setting. We propose an on-average-replace-two-stability model allowing us to prove fast generalization rates when an auxiliary source metric is used to bias the regularizer. Moreover, we prove a consistency result from which we show the interest of considering biased weighted regularized formulations and we provide a solution to estimate the associated weight. We also present some experiments illustrating the interest of the approach in standard metric learning tasks and in a transfer learning problem where few labelled data are available.

1. Introduction

Many machine learning problems, such as clustering, classification or ranking, require accurately comparing examples by means of distances or similarities. Designing a good metric for the task at hand is thus of crucial importance. Manually tuning a metric is in general difficult and tedious, so a recent trend consists in learning the metrics directly from data. This has led to the emergence of supervised metric learning, see (Bellet et al., 2013; Kulis, 2013) for up-to-date surveys. The underlying idea is to automatically infer the parameters of a metric in order to capture the idiosyncrasies of the data. In a supervised classification perspective, this is generally done by trying to satisfy pair-based constraints aiming at assigning a small (resp. large) score to pairs of examples of the same class (resp. different class). Most of the existing work has notably focused on learning Mahalanobis-like distances of the form $d_M(x, x') = \sqrt{(x - x')^T M (x - x')}$, where $M$ is a positive semi-definite (PSD) matrix [1], the learned matrix being typically plugged into a k-Nearest Neighbor classifier, allowing one to achieve a better accuracy than the standard Euclidean distance.

Recently, there has been growing interest in methods able to take into account some background knowledge (Parameswaran & Weinberger, 2010; Cao et al., 2013; Bohné et al., 2014) for learning $M$. This is in particular the case for supervised regularized metric learning approaches where the regularizer is biased with respect to an auxiliary metric given in the form of a matrix. The main objective here is to make use of this a priori knowledge in a setting where only few labelled data are available to help learning. For example, in the context of learning a PSD matrix $M$ plugged into a Mahalanobis-like distance as discussed above, let $I$ be the identity matrix used as auxiliary knowledge; $\|M - I\|$ is then a biased regularizer often considered. This regularization can be interpreted as follows: learn $M$ while trying to stay close to the Euclidean distance or, from another standpoint, try to learn a matrix $M$ which performs better than $I$. Other standard matrices can be used, such as $\Sigma^{-1}$, the inverse of the variance-covariance matrix. Note that if we take the 0 matrix, we retrieve the classical unbiased regularization term.

Another useful setting comes when $I$ is replaced by any auxiliary matrix $M_S$ learned from another task. This corresponds to a transfer learning approach where the biased regularization can be interpreted as transferring the knowledge brought by $M_S$ for learning $M$. This setting is appropriate when the distributions over training and testing domains are different but related. Domain adaptation strate-

[1] Note that this distance is a generalization of some well-known distances: when $M = I$, $I$ being the identity matrix, we retrieve the Euclidean distance; when $M = \Sigma^{-1}$, where $\Sigma$ is the variance-covariance matrix of the data at hand, it actually corresponds to the original definition of a Mahalanobis distance.

Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 2015. JMLR: W&CP volume 37. Copyright 2015 by the author(s).
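As an illustration of the two quantities discussed in the introduction, the following minimal Python sketch evaluates a Mahalanobis-like distance and a biased regularizer on toy data. It is not the authors' implementation. The function names (`mahalanobis_like`, `biased_regularizer`), the example vectors and the matrix `M` are hypothetical. The use of the Frobenius norm is an assumption, since the text above leaves the norm unspecified.

```python
import math


def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def mahalanobis_like(x, x_prime, M):
    """d_M(x, x') = sqrt((x - x')^T M (x - x')), with M assumed PSD."""
    diff = [a - b for a, b in zip(x, x_prime)]
    quad = sum(d * md for d, md in zip(diff, mat_vec(M, diff)))
    return math.sqrt(max(quad, 0.0))  # clamp tiny negative rounding errors


def biased_regularizer(M, M_aux):
    """||M - M_aux|| computed as a Frobenius norm (an assumption)."""
    return math.sqrt(sum((a - b) ** 2
                         for row_m, row_a in zip(M, M_aux)
                         for a, b in zip(row_m, row_a)))


identity = [[1.0, 0.0], [0.0, 1.0]]
zero = [[0.0, 0.0], [0.0, 0.0]]
M = [[2.0, 0.0], [0.0, 0.5]]          # hypothetical learned PSD matrix
x, x_prime = [1.0, 2.0], [4.0, 6.0]   # hypothetical examples

d_euclid = mahalanobis_like(x, x_prime, identity)  # M = I: Euclidean distance
d_M = mahalanobis_like(x, x_prime, M)
reg_biased = biased_regularizer(M, identity)       # bias towards I
reg_unbiased = biased_regularizer(M, zero)         # 0 matrix: plain ||M||
```

Replacing `identity` in `biased_regularizer` by a matrix learned on another task would correspond to the transfer setting with $M_S$ described above, under the same assumptions.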

