RS-2DLDA: Random Subspace Two-dimensional LDA for Face Recognition


SLIDE 1

RS-2DLDA

Random Subspace Two-dimensional LDA for Face Recognition

Garrett Bingham

B.S. Computer Science & Mathematics, Yale University ’19

July 25, 2017

RS-2DLDA Garrett Bingham 1

SLIDE 2

Table of Contents

  • 1. Algorithm
      – Background Information
      – RS-2DLDA
  • 2. Experiments
      – Datasets
      – Experiment Design
      – Results
  • 3. Conclusions
      – Insights
      – Future Work
      – References


SLIDE 3

Algorithm


SLIDE 4

Linear Discriminant Analysis

  • Keeps images of the same person close together, while spreading those of different people farther apart
  • Generally more effective in face recognition than PCA

Principal Component Analysis

  • Spreads the data out as much as possible
  • Useful in dimensionality reduction
  • Doesn’t consider which images belong to which people

Image source: https://sebastianraschka.com/images/blog/2014/linear-discriminant-analysis/lda_1.png

SLIDE 5

2D Version

Normally, image matrices are first converted to 1D vectors. In 2DLDA and 2DPCA, however, they are left in matrix form. Advantages include:

  • Avoids using high-dimensional vectors, making it computationally efficient
  • The structure of the face is preserved


SLIDE 6

Right, Left, and Bilateral Projection Schemes

Right: Y_j = A_j V        Left: Y_j = U^T A_j        Bilateral: Y_j = U^T A_j V

Image reconstruction by R2DPCA and L2DPCA of an example ORL image.


SLIDE 7

2DLDA

In 2DLDA we keep the top d eigenvectors. Each image matrix is multiplied by these eigenvectors and projected to a new space. If we choose d too small, accuracy will be poor. On the other hand, if d is too large the algorithm will overfit to the training data and won’t be able to generalize to new information.
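The projection above can be sketched in code. This is a minimal, illustrative implementation of the right-projection variant (R2DLDA) on image matrices; the function name, the small ridge term added to keep the within-class scatter invertible, and the toy data are assumptions made here for the example, not part of the original slides.

```python
import numpy as np

def r2dlda_projection(images, labels, d):
    """Right-projection 2DLDA sketch: images stay h x w matrices.

    Builds w x w between-class (Gb) and within-class (Gw) scatter
    matrices directly from image matrices, then keeps the top d
    eigenvectors of inv(Gw) @ Gb as the projection matrix V.
    """
    images = np.asarray(images, dtype=float)
    labels = np.asarray(labels)
    mean_all = images.mean(axis=0)
    w = images.shape[2]
    Gb = np.zeros((w, w))
    Gw = np.zeros((w, w))
    for c in np.unique(labels):
        cls = images[labels == c]
        mean_c = cls.mean(axis=0)
        diff = mean_c - mean_all
        Gb += len(cls) * diff.T @ diff
        for A in cls:
            Gw += (A - mean_c).T @ (A - mean_c)
    # Small ridge keeps Gw invertible when training data are scarce.
    evals, evecs = np.linalg.eig(np.linalg.solve(Gw + 1e-6 * np.eye(w), Gb))
    order = np.argsort(evals.real)[::-1]
    return evecs[:, order[:d]].real

# Toy usage: six random 8x6 "images" from two classes, keep d = 3.
rng = np.random.default_rng(0)
imgs = rng.normal(size=(6, 8, 6)) + np.repeat([0.0, 2.0], 3)[:, None, None]
V = r2dlda_projection(imgs, [0, 0, 0, 1, 1, 1], d=3)
Y = imgs[0] @ V  # projected feature matrix, shape (8, 3)
```

Projecting each image as Y = A V shrinks the feature width from w to d while leaving the row structure of the face intact.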


SLIDE 8

Applying the Random Subspace Method

In RS-2DLDA, we select d eigenvectors at random to make one classifier. We can repeat this many times, creating several different classifiers. This allows us to utilize the information from all eigenvectors without risk of overfitting.

If each classifier makes different mistakes, then we can combine their outputs into one final decision to increase accuracy. This is known as a diverse set of classifiers. We can quantify diversity with an entropy measure.
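The sampling step that creates one subspace per classifier can be sketched as follows; the function name and parameters are illustrative choices for this example, not names from the original slides.

```python
import random

def build_subspace_ensemble(eigvec_indices, d, n_classifiers, seed=0):
    """Random subspace method sketch: each classifier gets d eigenvectors
    sampled without replacement from the retained pool, so the ensemble
    collectively covers the whole pool while no single classifier uses
    enough of it to overfit.
    """
    rng = random.Random(seed)
    return [sorted(rng.sample(eigvec_indices, d))
            for _ in range(n_classifiers)]

# 50 classifiers, each built on 5 of 20 retained eigenvectors.
subspaces = build_subspace_ensemble(list(range(20)), d=5, n_classifiers=50)
```

Because each classifier sees a different 5-eigenvector slice of the same pool, their errors tend to differ, which is exactly the diversity the ensemble exploits.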


SLIDE 9

Entropy Experiments

Figure: Choosing 10 random eigenvectors results in high entropy

If we choose too few eigenvectors, each classifier is too weak and makes a large number of errors, resulting in low entropy. On the other hand, if we choose too many eigenvectors, then the classifiers will make near identical predictions, again resulting in low entropy.
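The slides do not write out the entropy measure; assuming it is the standard ensemble-entropy diversity measure in the style of Kuncheva and Whitaker (an assumption, chosen because it matches the "low when classifiers agree, high when they split" behavior described above), it can be sketched as:

```python
import math

def ensemble_entropy(correct):
    """Entropy diversity sketch (Kuncheva & Whitaker style, assumed).

    `correct` is an N x L 0/1 table: correct[j][i] == 1 iff classifier i
    got test sample j right. Returns 0 when all L classifiers agree on
    every sample and 1 when they split as evenly as possible.
    """
    n_samples = len(correct)
    n_clf = len(correct[0])
    scale = 1.0 / (n_clf - math.ceil(n_clf / 2))
    total = 0.0
    for row in correct:
        n_right = sum(row)
        total += scale * min(n_right, n_clf - n_right)
    return total / n_samples

print(ensemble_entropy([[1, 1, 1, 1, 1]]))  # full agreement -> 0.0
print(ensemble_entropy([[1, 1, 0, 0, 0]]))  # maximal split  -> 1.0
```

This matches the slide's observation: very weak classifiers (almost all wrong) and near-identical classifiers (almost all right) both drive the per-sample split toward unanimity, and entropy falls.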


SLIDE 10

Entropy Experiments

Figure: Entropy levels off

Not training enough classifiers results in low entropy, because we haven’t sampled the eigenvectors thoroughly. Changing these parameters affects the entropy, but the accuracy remains more or less stable. This implies that although we aren’t affecting the unweighted accuracy, we may be able to increase overall accuracy with a weighted voting scheme.


SLIDE 11

Estimating Classifier Performance

The adjusted Rand index (ARI) is a clustering similarity measure. We use it to measure how well each classifier keeps images of the same person close together. We suspect that classifiers with a higher ARI will be more accurate on the testing data, so we give classifiers with a higher ARI more weight in the final decision. If we raise each ARI to a common exponent b, we can control how much extra influence the strong classifiers have over the weak ones.
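A minimal sketch of this ARI-to-the-power-b weighting, with hypothetical helper names; clipping negative ARIs to zero and normalizing the weights to sum to one are readability choices made here and do not change which class wins the vote.

```python
def ari_weights(ari_scores, b):
    """Raise each classifier's ARI to a common exponent b: larger b gives
    strong classifiers proportionally more say in the final vote."""
    raw = [max(a, 0.0) ** b for a in ari_scores]  # clip negative ARIs
    total = sum(raw)
    return [w / total for w in raw]

def weighted_vote(predictions, weights):
    """Combine per-classifier predicted labels into one final decision."""
    tally = {}
    for label, w in zip(predictions, weights):
        tally[label] = tally.get(label, 0.0) + w
    return max(tally, key=tally.get)

# With b = 4 the single strong classifier (ARI .9) outvotes two weak ones.
decision = weighted_vote(["alice", "bob", "bob"],
                         ari_weights([0.9, 0.5, 0.5], b=4))
print(decision)  # "alice"; with b = 0 (equal weights) "bob" would win
```

Sweeping b and measuring accuracy on held-out data is then a one-dimensional tuning problem, which is what the next slide's figure explores.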


SLIDE 12

Weighting Scheme

Figure: The weighting scheme is tested on MORPH-II using two different random seeds

Choosing a moderate value for b gives an extra 2-5% in overall accuracy. The optimal value for b may be different with datasets of different size and difficulty.


SLIDE 13

Experiments


SLIDE 14

ORL

The ORL dataset contains images of 40 people collected from 1992 to 1994. Each image was collected in a lab, is grayscale, and has dimension 112 × 92. There are 10 images per person, each with minor variations in pose and facial expression.

Available at http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html


SLIDE 15

MORPH-II

The MORPH-II dataset [5] is a longitudinal dataset collected over five years. It contains 55,134 images from 13,617 individuals. Subjects’ ages range from 16 to 77 years. MORPH-II suffers from high variability in pose, facial expression, and illumination. To account for this, the images were rotated so that the eyes were horizontal, and then cropped to 70 × 60 to reduce noise from the background or the subject’s hair.

Example pre-processed MORPH-II images


SLIDE 16

Experiment Design: ORL

All 40 people from ORL were used in the experiments. 5 images per person were selected at random for training and 5 for testing. The eigenvectors from 2D-LDA and 2D-PCA were computed, and those that explained less than 1% of the variability (or discriminability in the case of 2D-LDA) were discarded because they are likely just random noise. Of the remaining eigenvectors, I choose 5 at random to build a classifier. I build 50 such classifiers and then use a weighted voting scheme to combine their results into a final decision. Each classifier uses KNN (k = 1) to make its individual decision. Different distance metrics are considered.
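The 1% filter described above can be sketched as follows; the function name and the toy eigenvalues are illustrative assumptions for this example.

```python
import numpy as np

def retain_informative(eigenvalues, threshold=0.01):
    """Keep only eigenvector indices whose eigenvalue explains at least
    `threshold` of the total variability (or discriminability, for the
    2D-LDA eigenvalues); the rest are likely just random noise.
    """
    eigenvalues = np.asarray(eigenvalues, dtype=float)
    ratios = eigenvalues / eigenvalues.sum()
    return np.flatnonzero(ratios >= threshold)

# Toy spectrum: the last two eigenvalues explain < 1% each and are dropped.
kept = retain_informative([50.0, 30.0, 19.5, 0.3, 0.2])
print(kept)  # [0 1 2]
```

The retained indices form the pool from which each classifier then draws its 5 (ORL) or 10 (MORPH-II) random eigenvectors.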


SLIDE 17

Experiment Design: MORPH-II

50 arbitrary people were chosen from MORPH-II. 5 images per person were selected at random for training and 5 for testing. The eigenvectors from 2D-LDA and 2D-PCA were computed, and those that explained less than 1% of the variability (or discriminability in the case of 2D-LDA) were discarded because they are likely just random noise. Of the remaining eigenvectors, I choose 10 at random to build a classifier. I build 50 such classifiers and then use a weighted voting scheme to combine their results into a final decision. Each classifier uses KNN (k = 5) to make its individual decision. Different distance metrics are considered.


SLIDE 18

Results: ORL

Each experiment was conducted thirty times and the results averaged to obtain the following accuracies:

Face Recognition on ORL

Algorithm |         Euclidean                     |         Cosine
          | Weighted     Unweighted    Original   | Weighted     Unweighted    Original
B2DLDA    | .931 (.017)  .924 (.017)   .935       | .939 (.015)  .936 (.016)   .940
L2DLDA    | .914 (.013)  .909 (.014)   .940       | .937 (.016)  .935 (.017)   .940
R2DLDA    | .929 (.013)  .923 (.016)   .935       | .948 (.015)  .943 (.013)   .945
B2DPCA    | .914 (.013)  .911 (.014)   .870       | .908 (.015)  .908 (.013)   .865
L2DPCA    | .895 (.011)  .893 (.010)   .865       | .884 (.012)  .884 (.013)   .860
R2DPCA    | .905 (.010)  .903* (.011)  .895       | .916 (.016)  .914 (.016)   .895

Figure: Experiments conducted on ORL. Standard error is shown in parentheses, and top accuracy in bold. The framework introduced by Nguyen et al. in [2] is denoted (*).


SLIDE 19

Results: MORPH-II

Each experiment was conducted thirty times and the results averaged to obtain the following accuracies:

Face Recognition on MORPH-II

Algorithm |         Euclidean                     |         Cosine
          | Weighted     Unweighted    Original   | Weighted     Unweighted    Original
B2DLDA    | .727 (.019)  .678 (.026)   .764       | .781 (.018)  .786 (.015)   .768
L2DLDA    | .743 (.008)  .735 (.009)   .756       | .788 (.010)  .780 (.012)   .776
R2DLDA    | .704 (.016)  .662 (.018)   .704       | .723 (.018)  .733 (.016)   .704
B2DPCA    | .706 (.013)  .701 (.013)   .564       | .702 (.018)  .692 (.013)   .556
L2DPCA    | .678 (.009)  .667 (.011)   .552       | .670 (.018)  .660 (.016)   .544
R2DPCA    | .611 (.010)  .609* (.007)  .580       | .609 (.009)  .612 (.009)   .584

Figure: Experiments conducted on MORPH-II. Standard error is shown in parentheses, and top accuracy in bold. The framework introduced by Nguyen et al. in [2] is denoted (*).


SLIDE 20

Conclusions


SLIDE 21

Insights

  • Cosine distance consistently outperforms Euclidean. This could be a result of the high illumination variability in MORPH-II: cosine distance measures the angle between two observations, so it captures similarity while disregarding magnitude.
  • The LDA-based algorithms achieve higher accuracies than their PCA counterparts. This was expected, and is due to the fact that the eigenvectors from LDA are based on maximizing discriminability instead of just explaining variability.
  • The weighting scheme is effective but does not generalize well to other algorithms and datasets.


SLIDE 22

Future Work

  • Further investigation into the differences of the right, left, and bilateral versions of 2DLDA and 2DPCA.
  • More difficult face recognition problems.
  • Developing a more systematic and robust weighting scheme that can be applied regardless of algorithm or dataset.
  • Instead of KNN, each classifier could use support vector machines (SVM).


SLIDE 23

Acknowledgements

Dr. Chen, Dr. Wang, and Troy Kling put in an incredible amount of work to make this program run smoothly. We have all accomplished so much in such a short time, and none of it would have been possible without their support.


SLIDE 24

References I

[1] T. K. Ho. The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(8):832–844, Aug 1998.

[2] N. Nguyen, W. Liu, and S. Venkatesh. Random Subspace Two-Dimensional PCA for Face Recognition, pages 655–664. Springer Berlin Heidelberg, Berlin, Heidelberg, 2007.

[3] S. Noushath, G. H. Kumar, and P. Shivakumara. (2D)²LDA: An efficient approach for face recognition. Pattern Recognition, 39(7):1396–1400, 2006.

[4] R. Polikar. Ensemble based systems in decision making. IEEE Circuits and Systems Magazine, 6(3):21–45, 2006.


SLIDE 25

References II

[5] K. Ricanek and T. Tesafaye. MORPH: A longitudinal image database of normal adult age-progression. In 7th International Conference on Automatic Face and Gesture Recognition (FGR06), pages 341–345, April 2006.

[6] A. Xu, X. Jin, Y. Jiang, and P. Guo. Complete two-dimensional PCA for face recognition. In 18th International Conference on Pattern Recognition (ICPR’06), volume 3, pages 481–484, 2006.

[7] J. Yang, D. Zhang, A. F. Frangi, and J. Yang. Two-dimensional PCA: A new approach to appearance-based face representation and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(1):131–137, Jan 2004.
