Metric Learning Applied for Automatic Large Image Classification


  1. Metric Learning Applied for Automatic Large Image Classification. Sahilu Wendeson (IT4BI). Supervisors: Toon Calders (PhD), ULB and Salim Jouili (PhD), EuraNova. September 2014, UPC.

  2. Image Database Classification. How? Using k-Nearest Neighbor (kNN), which depends on the quality of the distance measure.

  3. k-Nearest Neighbor (kNN) Classifier. The classifier depends on the distance measure: the k neighbors are chosen based on the Euclidean distance measure. [Figure: a query point Q with its k=1 and k=5 Euclidean neighborhoods.]
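As a minimal illustration of this dependence, the sketch below classifies a query with scikit-learn's KNeighborsClassifier under two different metrics; the data and the query point are made up for the example, and the prediction can differ between metrics.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy 2-D dataset: two classes whose features have very different scales.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0, 0], [1.0, 5.0], (50, 2)),
               rng.normal([3, 0], [1.0, 5.0], (50, 2))])
y = np.array([0] * 50 + [1] * 50)

q = np.array([[1.5, 4.0]])  # query point

# Same k, two distance measures: the chosen neighbors (and label) can differ.
for metric in ("euclidean", "manhattan"):
    knn = KNeighborsClassifier(n_neighbors=5, metric=metric).fit(X, y)
    print(metric, "->", knn.predict(q))
```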

  4. Outline: 1) METRIC LEARNING 2) INDEXING 3) OBJECTIVES and CONTRIBUTIONS 4) EXPERIMENTAL RESULTS 5) DISCUSSION, CONCLUSION and FUTURE WORK

  5. 1. METRIC LEARNING
  • Goal: maximize the accuracy of kNN by learning the distance measure.
  • The traditional (Euclidean) metric space lacks the ability [1]:
    – to account for correlation between features;
    – to provide curved as well as linear decision boundaries.
  • Metric learning addresses these limitations using the Mahalanobis metric space.
  [1] R. O. Duda, "Pattern Recognition for HCI," Department of Electrical Engineering, San Jose State University, p. 5, http://www.cs.princeton.edu/courses/archive/fall08/cos436/Duda/PR_Mahal/PR_Mahal.htm, 1997.

  6. Mahalanobis Metric Space
  • Mahalanobis space [1]: d_M(x, y) = sqrt((x − y)^T M (x − y)), where M belongs to the cone of symmetric positive semidefinite (PSD) matrices.
  • Euclidean space: d(x, y) = sqrt((x − y)^T (x − y)), the special case M = I.
  • Using the Cholesky decomposition M = G^T G, d_M can be rewritten as d_M(x, y) = ||G(x − y)||.
  [1] http://www.ias.ac.in/resonance/Volumes/04/06/0020-0026.pdf
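A minimal numerical check of this identity, assuming nothing beyond the definitions above (the matrix M below is a made-up PSD example):

```python
import numpy as np

rng = np.random.default_rng(0)

# Build an arbitrary symmetric PSD matrix M = A^T A for the example.
A = rng.normal(size=(4, 4))
M = A.T @ A

x, y = rng.normal(size=4), rng.normal(size=4)

# Mahalanobis distance from the definition: sqrt((x-y)^T M (x-y)).
d_def = np.sqrt((x - y) @ M @ (x - y))

# Same distance via the Cholesky factor: M = G^T G, so d_M = ||G(x-y)||.
G = np.linalg.cholesky(M).T  # numpy returns lower-triangular L with M = L L^T
d_chol = np.linalg.norm(G @ (x - y))

print(d_def, d_chol)  # the two values agree up to floating-point error
```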

  7. Euclidean vs. Mahalanobis Space. [Figure: the same Age vs. Weight data shown in Euclidean space and in the Mahalanobis space induced by M = G^T G.]

  8. Metric Learning Algorithms
  • Metric learning algorithm approaches:
    – driven by nearest neighbors
    – information-theoretical
    – online
    – etc.
  • Nearest neighbor approaches:
    – MMC, Xing et al. (2002)
    – NCA, leave-one-out cross validation (LOO), Goldberger et al. (2004)
    – MCML, Globerson and Roweis (2005)
    – LMNN, Weinberger et al. (2005)

  9. LMNN. [Figure: a local neighborhood under the Euclidean metric is mapped by G (with M = G^T G) to the Mahalanobis metric; target neighbors are pulled inside a margin and impostors are pushed out.]
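For reference, a compact NumPy sketch of the LMNN objective from Weinberger et al.: a pull term over target neighbors plus a hinged push term over impostors. The function name, the weight c, and the toy target-neighbor choice are illustrative, not taken from the slides.

```python
import numpy as np

def lmnn_loss(G, X, y, targets, c=0.5):
    """LMNN objective for a linear map G (so M = G^T G).

    targets[i] lists the indices of x_i's target neighbors
    (same-class points fixed a priori, e.g. its 3 nearest in Euclidean space).
    """
    Z = X @ G.T                         # map all points: z_i = G x_i
    loss = 0.0
    for i, neighbors in enumerate(targets):
        for j in neighbors:
            pull = np.sum((Z[i] - Z[j]) ** 2)
            loss += (1 - c) * pull
            # Push: impostors l (different class) that invade the unit margin.
            for l in np.flatnonzero(y != y[i]):
                hinge = 1 + pull - np.sum((Z[i] - Z[l]) ** 2)
                loss += c * max(0.0, hinge)
    return loss

# Tiny demo: one target neighbor per point (first same-class point stands in
# for the true nearest target neighbor).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
y = rng.integers(0, 2, size=20)
targets = [[j for j in range(len(X)) if y[j] == y[i] and j != i][:1]
           for i in range(len(X))]
print(lmnn_loss(np.eye(4), X, y, targets))
```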

  10. Summary
  • LMNN [1]:
    – scales to large datasets
    – has fast test-time performance
    – convex optimization, which can be solved efficiently
    – makes no assumption about the data
    – requires the number of targets and their prior assignment
  • Euclidean space, multi-pass LMNN [1]:
    – sensitive to outliers
    – dimension reduction via Principal Component Analysis (PCA) [2]
  [1] http://www.cse.wustl.edu/~kilian/papers/jmlr08_lmnn.pdf
  [2] http://computation.llnl.gov/casc/sapphire/pubs/148494.pdf

  11. Dimension Reduction
  • PCA:
    – optimal in the mean-square error sense
    – linear dimension reduction
    – based on the covariance matrix of the variables
    – used to reduce computation time and avoid overfitting
  [Figure: data projected onto its principal components.]
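A minimal sketch of this step with scikit-learn; the random matrix stands in for MNIST-like features, and the target dimensionality of 164 is the value the comparison table later reports for Mnist.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 784))   # placeholder for MNIST-like features

# Fit PCA on the training set only, then reuse it for test/query points.
pca = PCA(n_components=164)              # 784 -> 164 reduced dimensions
X_train_red = pca.fit_transform(X_train)

print(X_train_red.shape)                    # (1000, 164)
print(pca.explained_variance_ratio_.sum())  # fraction of variance retained
```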

  12. Experimental pipeline (metric learning):
  • Split the labeled dataset: 70% training set (build), 30% test set.
  • Metric learning: normalize and reduce dimension, plug in LMNN, keep the best PSD matrix M = G^T G.
  • Build the model using LMNN; test the model on the test set.
  • Evaluation: intra/inter distance ratio and kNN error ratio.
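An end-to-end sketch of this pipeline under stated assumptions: StandardScaler plays the normalization step, and the identity matrix stands in for the factor G that an LMNN solver would return (so the sketch runs without a metric-learning dependency, whose APIs vary across versions).

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
y = rng.integers(0, 3, size=500)          # placeholder labeled dataset

# 70/30 split as in the slide.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Normalize and reduce dimension (fit on training data only).
scaler = StandardScaler().fit(X_tr)
pca = PCA(n_components=10).fit(scaler.transform(X_tr))
prep = lambda Z: pca.transform(scaler.transform(Z))

# Learn G with an LMNN solver; the identity below is a runnable stand-in
# for the learned Cholesky factor of M = G^T G.
G = np.eye(10)

# kNN in the learned space: transform by G, then use plain Euclidean kNN.
knn = KNeighborsClassifier(n_neighbors=5).fit(prep(X_tr) @ G.T, y_tr)
error = 1 - knn.score(prep(X_te) @ G.T, y_te)
print(f"kNN error rate: {error:.3f}")
```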

  13. Intra/Inter Distance Ratio. [Bar chart: intra/inter distance ratio for each of the 10 MNIST classes, comparing the Mahalanobis and Euclidean metrics; number of targets = 3.]
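The slides do not spell out how the ratio is computed; a common definition, assumed here, is the mean pairwise distance within a class divided by the mean distance from that class to all other classes (lower is better):

```python
import numpy as np
from scipy.spatial.distance import cdist

def intra_inter_ratio(X, y, cls):
    """Mean within-class distance over mean between-class distance for `cls`."""
    inside, outside = X[y == cls], X[y != cls]
    # Average intra-class distance over distinct pairs (exclude zero diagonal).
    n = len(inside)
    intra_mean = cdist(inside, inside).sum() / (n * (n - 1))
    inter_mean = cdist(inside, outside).mean()
    return intra_mean / inter_mean

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = rng.integers(0, 3, size=300)           # placeholder labeled data
print([round(intra_inter_ratio(X, y, c), 3) for c in range(3)])
```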

  14. (The experimental pipeline of slide 12, shown again; the second evaluation metric is the kNN error ratio.)

  15. kNN Error Ratio (LMNN vs. Euclidean metric, k = 5, number of targets = 3)
  Dataset | Mahalanobis | Euclidean
  Mnist   | 1.5 | 2.5
  ISOLET  | 4.7 | 9.6
  Bal     | 4.3 | 7.4
  Faces   | 2.6 | 5.9
  Iris    | 3.2 | 4.3

  16. Comparison Statistics
                          Mnist   Letters  Isolet  Bal    Wines  Iris
  #inputs                 70000   20000    7797    535    152    128
  #features               784     16       617     4      13     4
  #reduced dimensions     164     16       172     4      13     4
  #training examples      60000   14000    6238    375    106    90
  #testing examples       10000   6000     1559    161    46     38
  #classes                10      26       26      3      3      3
  kNN error rate (%):
  Euclidean               2.12    4.68     8.98    18.33  25.00  4.87
  PCA                     2.43    4.68     8.60    18.33  25.00  4.87
  RCA                     5.93    4.34     5.71    12.31  2.28   3.71
  MMC                     15.66   30.96    3.55    N/A    N/A    N/A
  NCA                     5.33    28.67    4.32    N/A    N/A    N/A
  LMNN                    1.72    3.60     4.36    11.16  8.72   4.37
  LMNN (multiple passes)  1.69    2.80     4.30    5.86   7.59   4.26

  Weinberger, K. Q. and L. K. Saul (2009). "Distance metric learning for large margin nearest neighbor classification." The Journal of Machine Learning Research 10: 207-244.

  17. Image Database Classification using kNN. Exhaustive kNN has intractable time complexity on large image databases; the solution is Approximate Nearest Neighbor (ANN) search. [Image source: http://www.cs.utexas.edu/~grauman/courses/spring2008/datasets.htm]

  18. Outline: 1) METRIC LEARNING 2) INDEXING 3) OBJECTIVES and CONTRIBUTIONS 4) EXPERIMENTAL RESULTS 5) DISCUSSION, CONCLUSION and FUTURE WORK

  19. 2. Locality Sensitive Hashing
  • Idea: hash functions under which similar objects are more likely to receive the same hash value [1], giving sub-linear time search.
  • Hashing methods for fast Approximate Nearest Neighbor (ANN) search. [Figure: a query Q with search radius r and an approximation ratio c.]
  • LSH families have been designed for:
    – cosine similarity
    – L_p distance measures
    – Hamming distance
    – Jaccard index for set similarity
    – ...
  [1] Indyk and Motwani, 1998. http://people.csail.mit.edu/indyk/mmds.pdf

  20. Example: LSH. Take random projections of the data and quantize each projection with a few bits. [Figure: a feature vector hashed to the bit string 1100 by four random projections.] [Source: www.cs.utexas.edu/~grauman/.../jain_et_al_cvpr2008.ppt]

  21. Cosine Similarity LSH
  • r is a d-dimensional random hyperplane drawn from a Gaussian distribution.
  • Basic hashing function [1]: h_r(x) = 1 if r^T x >= 0, and 0 otherwise.
  • Learned hashing function [1]: hash after applying the learned transform G, i.e., threshold r^T G x.
  • A series of b randomized LSH functions h_r1 ... h_rb maps each database image and the query Q to a b-bit key. [Figure: database images and Q hashed to keys such as 10010, 10110, 10100.] In both cases, only the colliding instances (<< n) are searched.
  [1] Jain, Kulis, and Grauman. Fast Image Search for Learned Metrics. In CVPR, 2008.
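A compact sketch of both variants: the basic hash thresholds r^T x, the learned hash thresholds r^T G x, following the Jain et al. formulation above. The identity matrix is a runnable stand-in for the learned factor G of M = G^T G, and the database is synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
d, b = 64, 16                      # feature dimension, bits per hash key

R = rng.normal(size=(b, d))        # b random Gaussian hyperplanes
G = np.eye(d)                      # stand-in for the learned factor of M = G^T G

def hash_key(x, R, G=None):
    """b-bit cosine-similarity LSH key; pass G to hash in the learned space."""
    z = x if G is None else G @ x
    bits = (R @ z >= 0).astype(int)
    return "".join(map(str, bits))

X = rng.normal(size=(1000, d))     # image feature database (placeholder)
table = {}
for i, x in enumerate(X):
    table.setdefault(hash_key(x, R, G), []).append(i)

q = rng.normal(size=d)
candidates = table.get(hash_key(q, R, G), [])  # only colliding instances searched
print(len(candidates), "candidates out of", len(X))
```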

  22. Euclidean Space Hashing
  • Basic Euclidean space: h(x) = floor((a^T x + b) / w).
  • Learned Euclidean space: h(x) = floor((a^T G x + b) / w).
  • a is a d-dimensional vector whose entries are chosen independently from a p-stable distribution.
  • Geometrically: choose a random line, project onto it, and partition the line into equi-width segments of width w; b is a real number chosen uniformly at random from the range [0, w].
  • To guarantee accuracy, L hash table(s) are used to probe near neighbors in each bucket, each table keyed by K hash functions.
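A sketch of the p-stable scheme with L tables of K hash functions each (Gaussian projections are 2-stable, matching the Euclidean case); the parameters w, K, L and the data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, K, L, w = 32, 8, 3, 4.0          # dim, functions per table, tables, bucket width

# One (A, b) pair per table: K Gaussian (2-stable) projections and offsets in [0, w].
tables = [(rng.normal(size=(K, d)), rng.uniform(0, w, size=K), {})
          for _ in range(L)]

def key(x, A, b):
    """K-dimensional bucket id: floor((a^T x + b) / w) per hash function."""
    return tuple(np.floor((A @ x + b) / w).astype(int))

X = rng.normal(size=(2000, d))      # database (placeholder)
for i, x in enumerate(X):
    for A, b, buckets in tables:
        buckets.setdefault(key(x, A, b), []).append(i)

# Query: union of the colliding buckets across all L tables.
q = X[0] + 0.1 * rng.normal(size=d)
candidates = set()
for A, b, buckets in tables:
    candidates.update(buckets.get(key(q, A, b), []))
print(len(candidates), "candidates; true neighbor found:", 0 in candidates)
```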

  23. Euclidean Space Hashing. [Figure: an image database indexed into L = 3 hash tables, each keyed by K hash functions; a query Q probes one bucket per table, so no exhaustive scan of the database is involved.]

  24. Outline: 1) METRIC LEARNING 2) INDEXING 3) OBJECTIVES and CONTRIBUTIONS 4) EXPERIMENTAL RESULTS 5) DISCUSSION, CONCLUSION and FUTURE WORK

  25. 3. OBJECTIVES and CONTRIBUTIONS
  • The main objectives of this thesis are:
    – to study and implement a metric learning algorithm, a dimension reduction technique, and LSH in different metric spaces;
    – to establish and implement machine learning evaluation techniques.
  • The original contribution of the thesis is to formulate a fresh learned approach for both cosine similarity and Euclidean metric space hashing.

  26. Outline: 1) METRIC LEARNING 2) INDEXING 3) OBJECTIVES and CONTRIBUTIONS 4) EXPERIMENTAL RESULTS 5) DISCUSSION, CONCLUSION and FUTURE WORK

  27. Experimental pipeline (indexing):
  • Split the labeled dataset: 90% training set (build), 10% query set.
  • Metric learning: normalize and reduce dimension, plug in LMNN, keep the best PSD matrix M; decompose M = G^T G and transform the data with G.
  • Hashing (LSH): cosine similarity hashing (basic and learned) and Euclidean space hashing (original and learned).
  • Evaluation: time complexity, computational complexity, and query accuracy.

  28. Time Complexity: Exhaustive vs. Euclidean Space Hashing, 3NN. [Bar chart: query time in milliseconds for exhaustive search vs. Euclidean hashing on the three datasets below; hashing is faster on each.]
  Dataset           | Instances | Dimension
  LetterRecognition | 20,000    | 16
  Isolet            | 7,796     | 617
  Mnist             | 70,000    | 784
