Siamese Neural Networks and Similarity Learning


  1. Siamese Neural Networks and Similarity Learning

  2. What can ML do for us? • Classification problem: a neural network maps an input image to a label such as CAT.

  3. What can ML do for us? • Classification problem on ImageNet with thousands of categories.

  4. What can ML do for us? • Performance on ImageNet – the size of the blobs indicates the number of parameters. A. Canziani et al., "An Analysis of Deep Neural Network Models for Practical Applications", arXiv:1605.07678, 2016.

  5. What can ML do for us? • Regression problem: pose regression – a pretrained network extracts a feature vector y ∈ R^2048, and FC layers perform linear regression to a position p ∈ R^3 and an orientation q ∈ R^4.

  6. What can ML do for us? • Regression problem: bounding box regression. D. Held et al., "Learning to Track at 100 FPS with Deep Regression Networks", ECCV 2016.

  7. What can ML do for us? • Third type of problem: classification labels image A as "person, face, female" and image B as "person, face, male".

  8. What can ML do for us? • Third type of problem: given images A and B, is it the same person?

  9. What can ML do for us? • Third type of problem: similarity learning – comparison and ranking between A and B.

  10. Similarity learning: when and why? • Application: unlocking your iPhone with your face (training phase).

  11. Similarity learning: when and why? • Application: unlocking your iPhone with your face (testing phase): A → YES, B → NO. Can be solved as a classification problem.

  12. Similarity learning: when and why? • Application: a face recognition system so students can enter the exam room without an ID check – training on Person 1, Person 2, Person 3.

  13. Similarity learning: when and why? • Application: a face recognition system so students can enter the exam room without an ID check. What is the problem with this approach? Scalability – we need to retrain our model every time a new student registers for the course.

  14. Similarity learning: when and why? • Application: a face recognition system so students can enter the exam room without an ID check. Can we train one model and use it every year?

  15. Similarity learning: when and why? • Learn a similarity function: a pair of images of the same person gets a high similarity score, a pair of images of different people gets a low similarity score.

  16. Similarity learning: when and why? • Learn a similarity function: testing – if d(A, B) > τ, not the same person.

  17. Similarity learning: when and why? • Learn a similarity function: if d(A, B) < τ, same person.

  18. Similarity learning • How do we train a network to learn similarity?

  19. Siamese Neural Networks

  20. Similarity learning • How do we train a network to learn similarity? A CNN followed by FC layers produces a representation of my face in 128 values. Taigman et al., "DeepFace: Closing the Gap to Human-Level Performance in Face Verification", CVPR 2014.

  21. Similarity learning • How do we train a network to learn similarity? Image A is encoded as f(A) and image B as f(B). Taigman et al., "DeepFace: Closing the Gap to Human-Level Performance in Face Verification", CVPR 2014.

  22. Similarity learning • Siamese network = shared weights: the same network computes f(A) from A and f(B) from B. Taigman et al., "DeepFace: Closing the Gap to Human-Level Performance in Face Verification", CVPR 2014.

  23. Similarity learning • Siamese network = shared weights • We use the same network to obtain an encoding f(A) of the image • To be done: compare the encodings. Taigman et al., "DeepFace: Closing the Gap to Human-Level Performance in Face Verification", CVPR 2014.

  24. Similarity learning • Distance function: d(A, B) = ||f(A) − f(B)||₂ • Training: learn the parameters of f such that – if A and B depict the same person, d(A, B) is small – if A and B depict different people, d(A, B) is large. Taigman et al., "DeepFace: Closing the Gap to Human-Level Performance in Face Verification", CVPR 2014.
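
A minimal PyTorch sketch of this setup – a two-branch encoder with shared weights plus the distance d(A, B). The concrete layers, the 112×112 input size, and the 128-value embedding are illustrative assumptions, not the DeepFace architecture:

```python
import torch
import torch.nn as nn

class SiameseEncoder(nn.Module):
    """Illustrative encoder f; both branches share these weights."""
    def __init__(self, embedding_dim=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> (batch, 64, 1, 1)
        )
        self.fc = nn.Linear(64, embedding_dim)

    def forward(self, x):
        return self.fc(self.cnn(x).flatten(1))  # encoding f(x)

encoder = SiameseEncoder()
A = torch.randn(1, 3, 112, 112)  # dummy images; the size is an assumption
B = torch.randn(1, 3, 112, 112)
# Shared weights: the *same* module encodes both images.
d = torch.norm(encoder(A) - encoder(B), p=2, dim=1)  # d(A, B) = ||f(A) - f(B)||_2
```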

  25. Similarity learning • Loss function for a positive pair: if A and B depict the same person, d(A, B) should be small: L(A, B) = ||f(A) − f(B)||₂²

  26. Similarity learning • Loss function for a negative pair: if A and B depict different people, d(A, B) should be large – better to use a hinge loss: L(A, B) = max(0, m² − ||f(A) − f(B)||₂²). If two elements are already far apart, do not spend energy pushing them even further apart.

  27. Similarity learning • Contrastive loss: L(A, B) = y · ||f(A) − f(B)||₂² + (1 − y) · max(0, m² − ||f(A) − f(B)||₂²). The first term (positive pair, y = 1) reduces the distance between the elements; the second term (negative pair, y = 0) pushes the elements apart up to the margin.
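
A minimal sketch of this loss in PyTorch (the batched formulation and the variable names are mine; m is the margin hyperparameter):

```python
import torch

def contrastive_loss(fA, fB, y, margin=1.0):
    """Contrastive loss over a batch of embedding pairs.

    fA, fB: (batch, dim) embeddings from the shared encoder.
    y:      (batch,) labels, 1 for a positive pair, 0 for a negative pair.
    margin: m; negative pairs are only pushed apart up to this margin.
    """
    d2 = ((fA - fB) ** 2).sum(dim=1)                      # ||f(A) - f(B)||^2
    pos = y * d2                                          # pull positives together
    neg = (1 - y) * torch.clamp(margin ** 2 - d2, min=0)  # push negatives up to margin
    return (pos + neg).mean()
```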

  28. Similarity learning • Training the siamese network – you can compute the weight updates for each branch independently and then average them • This loss function allows us to learn to bring positive pairs together and push negative pairs apart.
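
Since the weights are shared there is really only one set of parameters, so in an autograd framework the gradients from both branches accumulate automatically. A sketch of one training step, reusing the SiameseEncoder and contrastive_loss sketches above (the random tensors stand in for a real batch of image pairs):

```python
import torch

# SiameseEncoder and contrastive_loss are the sketches defined earlier.
encoder = SiameseEncoder()
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-4)

imgs_a = torch.randn(8, 3, 112, 112)        # dummy batch of image pairs
imgs_b = torch.randn(8, 3, 112, 112)
labels = torch.randint(0, 2, (8,)).float()  # 1 = same person, 0 = different

loss = contrastive_loss(encoder(imgs_a), encoder(imgs_b), labels)
optimizer.zero_grad()
loss.backward()   # gradients from both branches accumulate in the shared weights
optimizer.step()
```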

  29. Triplet Loss

  30. Triplet loss • The triplet loss allows us to learn a ranking. Given an anchor (A), a positive (P), and a negative (N), we want: ||f(A) − f(P)||₂² < ||f(A) − f(N)||₂². Schroff et al., "FaceNet: A Unified Embedding for Face Recognition and Clustering", CVPR 2015.

  31. Triplet loss • The triplet loss allows us to learn a ranking: ||f(A) − f(P)||₂² < ||f(A) − f(N)||₂² ⇒ ||f(A) − f(P)||₂² − ||f(A) − f(N)||₂² < 0 ⇒ (with a margin m) ||f(A) − f(P)||₂² − ||f(A) − f(N)||₂² + m < 0. Schroff et al., "FaceNet: A Unified Embedding for Face Recognition and Clustering", CVPR 2015.

  32. Triplet loss • From ||f(A) − f(P)||₂² − ||f(A) − f(N)||₂² + m < 0 we obtain the loss L(A, P, N) = max(0, ||f(A) − f(P)||₂² − ||f(A) − f(N)||₂² + m). Schroff et al., "FaceNet: A Unified Embedding for Face Recognition and Clustering", CVPR 2015.
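
A sketch of this loss in PyTorch (batched; the margin value is an arbitrary choice). PyTorch also ships nn.TripletMarginLoss, which by default uses the non-squared L2 distance:

```python
import torch

def triplet_loss(fA, fP, fN, margin=0.2):
    """Triplet loss: push the anchor-positive distance below the
    anchor-negative distance by at least the margin m.

    fA, fP, fN: (batch, dim) embeddings of anchor, positive, negative.
    """
    d_ap = ((fA - fP) ** 2).sum(dim=1)  # ||f(A) - f(P)||^2
    d_an = ((fA - fN) ** 2).sum(dim=1)  # ||f(A) - f(N)||^2
    return torch.clamp(d_ap - d_an + margin, min=0).mean()
```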

  33. Triplet loss • Hard negative mining: training with hard cases for L(A, P, N) = max(0, ||f(A) − f(P)||₂² − ||f(A) − f(N)||₂² + m) • Train for a few epochs • Choose the hard cases where d(A, P) ≈ d(A, N) • Train with those to refine the learned distance.
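
One way the selection step might look in code – a sketch under my own assumptions: one candidate negative per anchor, and a tolerance band deciding when d(A, P) ≈ d(A, N) counts as a hard case:

```python
import torch

def mine_hard_triplets(encoder, anchors, positives, candidates, band=0.1):
    """Keep triplets whose negative is 'hard': d(A, P) close to d(A, N).

    anchors, positives, candidates: (batch, C, H, W) image tensors, with
    one candidate negative per anchor. 'band' is a hypothetical tolerance.
    """
    with torch.no_grad():  # mining only selects data, no gradients needed
        fA, fP, fN = encoder(anchors), encoder(positives), encoder(candidates)
        d_ap = torch.norm(fA - fP, dim=1)
        d_an = torch.norm(fA - fN, dim=1)
        hard = (d_an - d_ap).abs() < band  # d(A, P) ≈ d(A, N)
    return anchors[hard], positives[hard], candidates[hard]
```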

  34. Triplet loss • Training pulls the positive towards the anchor and pushes the negative away from it (before/after embedding diagram).

  35. Triplet loss: test time • Just do nearest-neighbor search!
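
This is why the approach scales: enrolling a new identity only requires storing its embedding, not retraining. A sketch of the test-time lookup (the rejection threshold τ and all names are mine):

```python
import torch

def identify(query, gallery_embeddings, gallery_labels, encoder, tau=0.8):
    """Nearest-neighbor search in the learned embedding space.

    gallery_embeddings: (N, dim) precomputed f(x) for the enrolled faces.
    tau: hypothetical rejection threshold for unknown faces.
    """
    with torch.no_grad():
        q = encoder(query.unsqueeze(0))                    # f(query), shape (1, dim)
        dists = torch.norm(gallery_embeddings - q, dim=1)  # distance to every gallery face
        best = int(torch.argmin(dists))
    return gallery_labels[best] if dists[best] < tau else None  # reject if too far
```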

  36. Triplet loss challenges • Random sampling does not work: the number of possible triplets is O(n³), so the network would need to be trained for a very long time. • Even with hard negative mining, there is a risk of getting stuck in local minima.

  37. Several approaches to improve similarity learning

  38. Improving similarity learning • Loss: contrastive vs. triplet loss • Sampling: choose the best triplets to train with – sample the space wisely (diversity of classes + hard cases) • Ensembles: why not use several networks, each of them trained with a subset of triplets? • Can we use a classification loss for similarity learning?
