SLIDE 32 - Benjamin Aubin, Institut de Physique Théorique, NeurIPS 2018
Large number of hidden units, K = Θ_p(1)
[Figure: generalization error ε_g(α) versus α = (# of samples)/(# hidden units × input size). Axes: α from 2 to 14, ε_g(α) from 0.0 to 0.5. Curves: Bayes-optimal ε_g(α) and AMP ε_g(α). Annotations revealed frame by frame: non-specialized hidden units, specialized hidden units, discontinuous specialization, and the computational gap between the AMP and Bayes-optimal curves.]
Gaussian weights - sign activation
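The model behind the figure, a committee machine with Gaussian weights and sign activations, can be sketched in a few lines. This is a minimal illustration (not the AMP algorithm from the poster): a hypothetical teacher network labels inputs, and an uninformed student with independently drawn weights is scored against it, which is what ε_g measures before any samples are seen.

```python
import numpy as np

rng = np.random.default_rng(0)

p, K = 100, 3        # input size, number of hidden units (odd K avoids ties)
n_test = 5000

def committee(W, X):
    # Committee machine output: y = sign( sum_k sign(w_k . x) )
    return np.sign(np.sign(X @ W.T).sum(axis=1))

# Teacher and (hypothetical, untrained) student with i.i.d. Gaussian weights
W_teacher = rng.standard_normal((K, p))
W_student = rng.standard_normal((K, p))

# Empirical generalization error: fraction of fresh inputs where they disagree
X = rng.standard_normal((n_test, p))
eps_g = np.mean(committee(W_teacher, X) != committee(W_student, X))
print(f"generalization error: {eps_g:.3f}")
```

By symmetry two independent committee machines disagree on about half of the inputs, so the printed value sits near 0.5, the left edge of the curves in the figure; learning from α·K·p samples is what drives ε_g(α) down.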
TO KNOW MORE: Poster #111
https://github.com/benjaminaubin/TheCommitteeMachine