Learning Loss for Active Learning
Rymarczyk D., Zieliński B., Tabor J., Sadowski M., Titov M.
Agenda
1. Active Learning introduction
2. Base methods in AL for Deep Learning
3. Learning Loss for Active Learning
4. Our ideas for Active Learning
5. Future plans
6. Bibliography
Active Learning
[Diagram: the active-learning loop. Samples are selected from the unlabeled dataset and labeled by an oracle; the model is trained on the labeled set, and its predictions (predicted vs. true labels) guide which samples to query next.]
Challenges in Active Learning
1. By what criterion are samples chosen for the labelling process?
2. How many samples should be included in the labelling process?
3. Is the oracle infallible?
4. Multi-oracle scenarios.
5. Online learning.
6. Can we use unlabeled data, and how?
7. How does the oracle perceive the AL system?
8. ...
Base methods in AL for Deep Learning
Random sampling to label
[Diagram: a random subset of the unlabeled dataset is sent to the labelling process.]
Base methods in AL for Deep Learning
Core-set approach: k-Center-Greedy algorithm / k-means++
[Diagram: samples are selected so that the labeled set covers the feature space, i.e. each new sample maximizes the distance to the already-labeled points.]
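A minimal sketch of this greedy k-center selection (assuming precomputed embeddings in a `features` array; the greedy rule follows the core-set approach [4], but the implementation details here are my own):

```python
import numpy as np

def k_center_greedy(features, labeled_idx, budget):
    """Greedy k-center selection in the spirit of the core-set approach [4].

    features    : (n, d) array of embeddings for the whole pool (assumed precomputed)
    labeled_idx : indices of already-labeled samples (the initial centers)
    budget      : number of new samples to pick for labelling
    """
    # Distance from every point to its nearest already-labeled center.
    min_dist = np.min(
        np.linalg.norm(features[:, None, :] - features[labeled_idx][None, :, :], axis=-1),
        axis=1,
    )
    selected = []
    for _ in range(budget):
        idx = int(np.argmax(min_dist))          # farthest point from all current centers
        selected.append(idx)
        # Update nearest-center distances with the newly added center.
        min_dist = np.minimum(min_dist, np.linalg.norm(features - features[idx], axis=1))
    return selected
```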
Base methods in AL for Deep Learning
Uncertainty-based approach: entropy
[Diagram: each unlabeled sample carries an uncertainty score (0.1, 0.2, 0.3, 0.4, 0.5, 0.9, ... in the figure); the samples the model is most uncertain about are sent to the labelling process.]
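Entropy-based selection is short enough to sketch directly; `probs` is assumed to hold the current model's softmax outputs on the unlabeled pool:

```python
import numpy as np

def entropy_sampling(probs, budget):
    """Pick the `budget` samples with the highest predictive entropy.

    probs : (n, num_classes) softmax outputs on the unlabeled pool
    """
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    return np.argsort(-entropy)[:budget]        # indices of the most uncertain samples
```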
Base methods in AL for Deep Learning
Experiment - learning episode:
[Diagram: one learning episode on CIFAR-10. The model is trained on 1000 labeled images, predicts labels for the unlabeled pool, and 1000 further samples are chosen to be labeled.]
Base methods in AL for Deep Learning
Experiment - 10 x learning episodes:
[Diagram: after 10 learning episodes the labeled subset of CIFAR-10 has grown to 10000 images.]
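A hedged sketch of this experimental protocol (1000 labeled CIFAR-10 images per episode, 10 episodes); `train_model` and `score_samples` are hypothetical helpers standing in for the training loop and the chosen acquisition function:

```python
import numpy as np

def run_active_learning(pool_size, train_model, score_samples,
                        initial=1000, per_episode=1000, episodes=10):
    """Hypothetical AL loop matching the CIFAR-10 setup on the previous slides."""
    rng = np.random.default_rng(0)
    labeled = set(rng.choice(pool_size, size=initial, replace=False).tolist())
    for _ in range(episodes):
        model = train_model(sorted(labeled))            # train on the current labeled set
        unlabeled = [i for i in range(pool_size) if i not in labeled]
        scores = score_samples(model, unlabeled)        # entropy, core-set distance, predicted loss, ...
        ranked = [i for _, i in sorted(zip(scores, unlabeled), reverse=True)]
        labeled.update(ranked[:per_episode])            # send the top-scoring samples to the oracle
    return labeled
```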
Base methods in AL for Deep Learning
Experiment - Results
Learning Loss for Active Learning
Motivation:
1. None of the basic methods uses information from the inner layers of the NN.
2. The best measure of a NN's error is the value of its loss function.
3. More advanced methods require:
   a. modifications of the architecture,
   b. training another neural network,
   c. training a generative model,
   d. finding adversarial examples,
   e. Bayesian deep learning,
   f. model ensembles.
Learning Loss for Active Learning
Architecture modifications: the loss-prediction module
[Diagram: a small loss-prediction module is attached to intermediate feature maps of the target network and outputs a predicted loss for every input; samples with the highest predicted loss are sent to labelling.]
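A rough PyTorch sketch of such a loss-prediction head (the GAP + FC design follows the description in [1], but the channel sizes and embedding dimension below are assumptions for a ResNet-18-like backbone):

```python
import torch
import torch.nn as nn

class LossPredictionModule(nn.Module):
    """Each intermediate feature map is global-average-pooled, projected to a small
    embedding, and the concatenated embeddings are mapped to one predicted-loss scalar."""

    def __init__(self, feature_channels=(64, 128, 256, 512), embed_dim=128):
        super().__init__()
        self.fcs = nn.ModuleList([nn.Linear(c, embed_dim) for c in feature_channels])
        self.out = nn.Linear(embed_dim * len(feature_channels), 1)

    def forward(self, feature_maps):
        # feature_maps: list of (B, C_i, H_i, W_i) tensors taken from the target network
        embeds = []
        for fmap, fc in zip(feature_maps, self.fcs):
            pooled = fmap.mean(dim=(2, 3))              # global average pooling -> (B, C_i)
            embeds.append(torch.relu(fc(pooled)))       # (B, embed_dim)
        return self.out(torch.cat(embeds, dim=1)).squeeze(1)   # (B,) predicted losses
```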
Learning Loss for Active Learning
Loss function for learning the loss
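As far as I recall from [1], the module is not trained by regressing the loss directly (the scale of the true loss shrinks as training progresses); instead, a pairwise margin ranking loss over sample pairs (i, j) within the mini-batch is used, roughly:

```latex
\mathcal{L}_{\text{loss}}\big(\hat{l}_i, \hat{l}_j\big) =
  \max\!\Big(0,\; -\mathbb{1}(l_i, l_j)\,\big(\hat{l}_i - \hat{l}_j\big) + \xi\Big),
\qquad
\mathbb{1}(l_i, l_j) =
  \begin{cases}
    +1 & \text{if } l_i > l_j,\\
    -1 & \text{otherwise,}
  \end{cases}
```

where l is the true target loss, l-hat the predicted loss, and xi a margin (see [1] for the exact formulation).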
Learning Loss for Active Learning
Results from the paper. Experiment details on CIFAR-10:
- network trained for 200 epochs, lr = 0.1
- after 160 epochs, lr = 0.01
- after epoch 120, the loss-prediction module no longer influences the target network's (conv) weights
Our ideas for Active Learning
1. Remove the loss-prediction module and use a decoder or a VAE instead. Send to the labelling process the samples with the highest reconstruction loss (a sketch follows).
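A minimal sketch of this reconstruction-based selection (the `autoencoder` model and a pool loader yielding `(image, pool_index)` pairs are assumptions):

```python
import numpy as np
import torch

@torch.no_grad()
def select_by_reconstruction_error(autoencoder, pool_loader, budget, device="cpu"):
    """Rank unlabeled samples by per-image reconstruction error of a decoder/VAE
    and send the worst-reconstructed ones to the oracle."""
    autoencoder.eval()
    errors, indices = [], []
    for x, idx in pool_loader:                          # loader yields (image, pool index)
        x = x.to(device)
        recon = autoencoder(x)                          # assumed to return the reconstruction
        err = ((recon - x) ** 2).flatten(1).mean(dim=1) # per-sample MSE
        errors.append(err.cpu().numpy())
        indices.append(idx.numpy())
    errors, indices = np.concatenate(errors), np.concatenate(indices)
    return indices[np.argsort(-errors)[:budget]]        # highest reconstruction error first
```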
Our ideas for Active Learning
2. Make an adversarial example for each image and choose the images that require the smallest modification to flip the prediction (a sketch follows). DONE: https://arxiv.org/pdf/1802.09841.pdf
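The paper above uses DeepFool-style perturbations; a cruder, self-contained proxy (my assumption, not the paper's exact method) is to grow an FGSM step until the prediction flips and use the required magnitude as the margin score, selecting the images with the smallest value:

```python
import torch
import torch.nn.functional as F

def fgsm_margin(model, x, eps_grid=(0.25/255, 0.5/255, 1/255, 2/255, 4/255, 8/255)):
    """Smallest FGSM step that flips the prediction for a single image x of shape
    (1, C, H, W); smaller values suggest the sample sits close to the decision boundary."""
    model.eval()
    x = x.clone().requires_grad_(True)
    logits = model(x)
    pred = logits.argmax(dim=1)
    F.cross_entropy(logits, pred).backward()            # gradient w.r.t. the current prediction
    step = x.grad.sign()
    for eps in eps_grid:
        with torch.no_grad():
            if (model(x + eps * step).argmax(dim=1) != pred).item():
                return eps                               # prediction flipped at this magnitude
    return float("inf")                                  # never flipped within the grid
```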
Our ideas for Active Learning
3. Take inspiration from GANs: train a discriminator to distinguish between the labeled and the unlabeled dataset, then label the samples most confidently classified as unlabeled (a sketch follows). DONE: https://arxiv.org/pdf/1907.06347.pdf
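A small sketch of this discriminative selection (the `features` embeddings and the logistic-regression discriminator are my assumptions; [3] uses a learned network on deep features):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def discriminative_selection(features, labeled_idx, budget):
    """Train a labeled-vs-unlabeled discriminator on embeddings and label the
    samples that look 'most unlabeled'."""
    is_unlabeled = np.ones(len(features), dtype=int)
    is_unlabeled[labeled_idx] = 0                       # 0 = labeled, 1 = unlabeled
    clf = LogisticRegression(max_iter=1000).fit(features, is_unlabeled)
    p_unlabeled = clf.predict_proba(features)[:, 1]     # probability of class "unlabeled"
    candidates = np.where(is_unlabeled == 1)[0]
    return candidates[np.argsort(-p_unlabeled[candidates])][:budget]
```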
Our ideas for Active Learning
4. A neural network learns the easy examples first. Can the history of learning differentiate between the labeled and the unlabeled datasets?
Our ideas for Active Learning
4. History of learning
[Diagram: for both labeled and unlabeled samples, the model's raw predictions are recorded every 20 epochs to build a history record.]
Our ideas for Active Learning
4. History of learning
[Diagram: the history records of labeled samples and of 1000 unlabeled samples are fed to a RandomForest classifying labeled vs. unlabeled; the 100 unlabeled samples with the highest probability of being unlabeled are selected for labelling (a sketch follows).]
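A sketch of this history experiment (the flattened history layout and the RandomForest hyperparameters are assumptions):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def history_selection(history, labeled_idx, budget=100):
    """history     : (n_samples, n_checkpoints * n_classes) predictions recorded every 20 epochs
    labeled_idx : indices of the samples that already have labels
    Returns the `budget` unlabeled samples most confidently classified as unlabeled."""
    y = np.ones(history.shape[0], dtype=int)            # 1 = unlabeled
    y[labeled_idx] = 0                                   # 0 = labeled
    rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(history, y)
    p_unlabeled = rf.predict_proba(history)[:, 1]
    unlabeled = np.where(y == 1)[0]
    return unlabeled[np.argsort(-p_unlabeled[unlabeled])][:budget]
```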
Our ideas for Active Learning
4. History of learning
Our ideas for Active Learning
5. Different moments of the history of learning
Our ideas for Active Learning
6. Maybe the NN is already past the critical point: do not fine-tune it.
Our ideas for Active Learning
7. Take the history of the inner layers.
Our ideas for Active Learning
8. Why is entropy so good? Can we do better?
Our ideas for Active Learning
9. Is history even worth something?
Future plans
1. Investigate ways of finding dataset outliers.
2. Do more research on the history of learning.
Future plans
1. IDEA: use augmentation to check how the prediction for an image is sustained under different transformations. Take the samples with the highest m (a sketch follows).
[Diagram: predictions y', y'', y''' for different augmentations of the same image.]
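A sketch of this augmentation-consistency score (the variance of the softmax outputs across augmentations stands in for the unspecified m; `augmentations` is assumed to be a list of tensor-to-tensor transforms):

```python
import torch

@torch.no_grad()
def augmentation_disagreement(model, x, augmentations):
    """Score one image x of shape (1, C, H, W) by how much its softmax prediction
    changes across augmentations (y', y'', y''' on the diagram above)."""
    model.eval()
    probs = torch.stack([torch.softmax(model(aug(x)), dim=1) for aug in augmentations])
    return probs.var(dim=0).sum().item()                # total variance across augmentations
```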
Bibliography
1. Yoo, Donggeun, and In So Kweon. "Learning Loss for Active Learning." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019. https://arxiv.org/pdf/1905.03677.pdf
2. Ducoffe, Melanie, and Frederic Precioso. "Adversarial Active Learning for Deep Networks: A Margin Based Approach." arXiv preprint arXiv:1802.09841, 2018. https://arxiv.org/pdf/1802.09841.pdf
3. Gissin, Daniel, and Shai Shalev-Shwartz. "Discriminative Active Learning." arXiv preprint arXiv:1907.06347, 2019. https://arxiv.org/pdf/1907.06347.pdf
4. Sener, Ozan, and Silvio Savarese. "Active Learning for Convolutional Neural Networks: A Core-Set Approach." arXiv preprint arXiv:1708.00489, 2017. https://arxiv.org/pdf/1708.00489.pdf