Modern machine learning methods for trustworthy science
Why neural networks don't work (and how to use them)
Tom Charnock, Institut d'Astrophysique de Paris
A note on the term "bias": when a quantity is intrinsically unknowable, it is biased; when there is some offset which could in principle be corrected, it is also called biased. I (almost always) mean the former.
A neural network $\mathrm{NN}(w, \theta): x \to y$, with weights $w$ and hyperparameters $\theta$, is an approximation to a model $f: x \to y$.
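To make this concrete, here is a minimal sketch of a network as a parametrised function $\mathrm{NN}(w, \theta): x \to y$. The two-layer architecture, the `tanh` activation and all names are illustrative assumptions, not from the talk.

```python
import numpy as np

def init_weights(rng, width, scale=0.1):
    """Draw random weights w for a 1-input, 1-output network of given width."""
    return {
        "W1": scale * rng.standard_normal((1, width)),
        "b1": np.zeros(width),
        "W2": scale * rng.standard_normal((width, 1)),
        "b2": np.zeros(1),
    }

def nn(w, x):
    """NN(w, theta): x -> y, a smooth parametrised approximation to some model f."""
    h = np.tanh(x @ w["W1"] + w["b1"])   # hidden layer
    return h @ w["W2"] + w["b2"]         # linear output layer

rng = np.random.default_rng(0)
w = init_weights(rng, width=32)          # the hidden width plays the role of theta
x = np.linspace(-1.0, 1.0, 5).reshape(-1, 1)
y = nn(w, x)
print(y.shape)                           # one prediction per input
```

The point is only that the network is an ordinary function of $(w, \theta)$; everything that follows is about how those parameters are treated statistically.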
A wildly complex likelihood surface describes how likely we are to obtain the targets from the data.
What are we actually interested in?
$$p(y \mid x) = \int \mathrm{d}w\, \mathrm{d}\theta\; p(y \mid x, w, \theta)\, p(w, \theta)$$

- Posterior predictive density $p(y \mid x)$: how likely are the true targets given some data?
- Likelihood $p(y \mid x, w, \theta)$: how likely are the targets to be generated by a particular network?
- Probability density $p(w, \theta)$: what is the probability of obtaining a particular network with particular parameter values?
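This integral can be sketched by Monte Carlo: draw networks from $p(w, \theta)$ and average the likelihood $p(y \mid x, w, \theta)$. The toy "network" $y = w x$, the Gaussian noise scale and the standard normal prior are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 0.5                                  # assumed observation noise

def likelihood(y, x, w, sigma):
    """p(y | x, w, theta): Gaussian density around the network output w*x."""
    return np.exp(-0.5 * ((y - w * x) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def predictive(y, x, n_samples=100_000):
    """Monte Carlo estimate of p(y|x) under a standard normal prior on w."""
    w = rng.standard_normal(n_samples)       # samples from p(w)
    return likelihood(y, x, w, sigma).mean() # average over the prior

p = predictive(y=0.0, x=1.0)
print(p)  # marginal density of observing y=0 at x=1
```

For this linear-Gaussian toy the marginal is itself Gaussian with variance $1 + \sigma^2$, so the estimate can be checked analytically.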
Where does this information about the weights and hyperparameters come from?
Training and validation data
Training data and targets: $\{x_i^{\rm train}, y_i^{\rm train} \mid i \in [1, n_{\rm train}]\}$
Validation data and targets: $\{x_i^{\rm val}, y_i^{\rm val} \mid i \in [1, n_{\rm val}]\}$
Posterior distribution of weights and hyperparameters:
$$p(w, \theta \mid \{x, y\}^{\rm train}, \{x, y\}^{\rm val}) \propto p(\{y\}^{\rm train}, \{y\}^{\rm val} \mid \{x\}^{\rm train}, \{x\}^{\rm val}, w, \theta)\, p(w, \theta)$$
The failing of traditional training
We would like an approximator $f(w, \theta): x \to y$ whose cost function is smooth and convex. But for a neural network $\mathrm{NN}(w, \theta): x \to y$, the cost function built from the likelihood,
$$\Lambda(w, \theta) = -\ln p(\{y\} \mid \{x\}, w, \theta),$$
is complex and non-convex in $w$ and $\theta$.
Optimising (or training) a network
What are the maximum likelihood estimates of the weights?
$$(w^{\rm MLE}, \theta^*) = \underset{w,\,\theta}{\mathrm{argmax}}\; p(\{y\}^{\rm train} \mid \{x\}^{\rm train}, w, \theta)$$
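Training is then just minimisation of $\Lambda(w) = -\ln p(\{y\}^{\rm train} \mid \{x\}^{\rm train}, w, \theta)$ by gradient descent. A hedged sketch for a one-weight linear "network" with a Gaussian likelihood (where the cost reduces to least squares); the data, learning rate and step count are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, 100)
y = 3.0 * x + 0.1 * rng.standard_normal(100)   # data from f(x) = 3x plus noise

w = 0.0                                        # initial weight
lr = 0.1                                       # learning rate (a hyperparameter)
for _ in range(200):
    grad = -2.0 * np.mean(x * (y - w * x))     # dΛ/dw for a Gaussian likelihood
    w -= lr * grad                             # gradient descent step

print(w)  # the maximum likelihood estimate, close to the true slope 3.0
```

Note that the procedure returns a single point $w^{\rm MLE}$, which is exactly the problem the following slides describe.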
Local maximum likelihood estimates
The main problem...
We degenerate the posterior:
$$p(w, \theta \mid \{x, y\}^{\rm train}) \propto p(\{y\}^{\rm train} \mid \{x\}^{\rm train}, w, \theta)\, p(w, \theta) \to \delta(w - w^{\rm MLE},\, \theta - \theta^*)$$
All predictions are then single (probably incorrect) estimates: the predictive density collapses to the output of one network,
$$p(y \mid x) = p(y \mid x, w^{\rm MLE}, \theta^*),$$
i.e. a point estimate $\hat{y}(x)$.
There is no way to interpret how close $\hat{y}$ is to $y$, because the likelihood is non-interpretably complex.
Are there better methods?
Variational inference
$$p(y \mid x) = \int \mathrm{d}w\, \mathrm{d}\theta\, \mathrm{d}\alpha\; p(y \mid x, w, \theta)\, q(w, \theta \mid \alpha, \{x, y\}^{\rm train})\, p(\alpha),$$
where $q$ is a variational distribution with variational parameters $\alpha$.
Still depends on fixed weights in the complex likelihood surface and on the choice of variational distribution:
$$p(y \mid x) = \int \mathrm{d}w\, \mathrm{d}\theta\, \mathrm{d}\alpha\; p(y \mid x, w, \theta)\, q(w, \theta \mid \alpha, \{x, y\}^{\rm train})\, \delta(\alpha - \alpha^{\rm MLE}) = \int \mathrm{d}w\, \mathrm{d}\theta\; p(y \mid x, w, \theta)\, q(w, \theta \mid \alpha^{\rm MLE}, \{x, y\}^{\rm train}).$$
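A minimal sketch of why variational predictions still hinge on fitted point values: predictions average over a Gaussian variational distribution $q$, but the parameters of $q$ are themselves fixed estimates. The Gaussian form, the values standing in for the fitted variational parameters, and the toy network $y = wx$ are all assumptions for illustration; no fitting is done here.

```python
import numpy as np

rng = np.random.default_rng(3)
mu, s = 3.0, 0.2        # stand-ins for fitted variational parameters (assumed)

def predict(x, n_samples=10_000):
    """Average the toy network output y = w*x over q(w) = N(mu, s^2)."""
    w = mu + s * rng.standard_normal(n_samples)   # samples from q
    return np.mean(w * x)

print(predict(1.0))  # close to mu: the point estimates still drive everything
```

The spread of $q$ gives some uncertainty on $y$, but if the fitted parameters sit in a bad part of the likelihood surface, every prediction inherits that error.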
Bayesian neural networks
$$p(y \mid x) = \int \mathrm{d}w\, \mathrm{d}\theta\; p(y \mid x, w, \theta)\, p(w, \theta \mid \{x, y\}^{\rm train}) \propto \int \mathrm{d}w\, \mathrm{d}\theta\; p(y \mid x, w, \theta) \prod_i p(y_i^{\rm train} \mid x_i^{\rm train}, w, \theta)\, p(w, \theta).$$
Sample the likelihood of the training data.
Still dependent on the training data!
- Classical network: $p(w, \theta \mid \{x, y\}^{\rm train}) \to \delta(w - w^{\rm MLE},\, \theta - \theta^*)$
- Variational inference: $p(w, \theta \mid \{x, y\}^{\rm train}) = q(w, \theta \mid \alpha^{\rm MLE}, \{x, y\}^{\rm train})$
- Bayesian networks: $p(w, \theta \mid \{x, y\}^{\rm train}) \propto \prod_i p(y_i^{\rm train} \mid x_i^{\rm train}, w, \theta)\, p(w, \theta)$
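The Bayesian-network posterior over weights can be sampled directly, for instance with a Metropolis random walk. A sketch for a one-weight toy network $y = wx$; the noise scale, proposal width and chain length are illustrative assumptions, not a recipe for real networks (where the dimensionality makes this far harder).

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.uniform(-1, 1, 50)
y = 3.0 * x + 0.1 * rng.standard_normal(50)       # toy training data

def log_post(w):
    """Log of the likelihood product times a standard normal prior on w."""
    log_like = -0.5 * np.sum(((y - w * x) / 0.1) ** 2)
    log_prior = -0.5 * w ** 2
    return log_like + log_prior

w, samples = 0.0, []
for _ in range(5000):
    prop = w + 0.05 * rng.standard_normal()       # random-walk proposal
    if np.log(rng.uniform()) < log_post(prop) - log_post(w):
        w = prop                                  # Metropolis accept
    samples.append(w)

post = np.array(samples[1000:])                   # drop burn-in
print(post.mean(), post.std())  # posterior mean near 3, with genuine spread
```

Unlike the classical point estimate, the chain returns a distribution over weights, so predictions carry uncertainty, but that distribution is still conditioned entirely on the training set.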
Problems with physical models...
How can we use a neural network then?
Build it into the physical model
Method 1 : Infer the data, physics and the neural network
Method 2 : Understand the likelihood (using neural physical engines)
Method 3 : Likelihood-free inference
Compare the distance between observed summaries and simulation summaries, and select results within a tolerance $\epsilon$.
Conclusions
Neural networks are not to be trusted, but they can make trusty companions when the correct framework is introduced. Using statistics, we can build neural networks into the forward model to obtain unbiased results.
For more information read my new blog bit.ly/ProbNN