Modern machine learning methods for trustworthy science


  1. Modern machine learning methods for trustworthy science
Tom Charnock, Institut d'Astrophysique de Paris

 
 


  2. Why neural networks don't work (and how to use them)
Tom Charnock, Institut d'Astrophysique de Paris

 
 


  3. Why neural networks don't work
Tom Charnock, Institut d'Astrophysique de Paris

 
 


  4. Apologies about the term "bias":
- when something is intrinsically unknowable, it is biased
- if there is some offset, which could in principle be corrected, it is biased

  5. Apologies about the term "bias":
- when something is intrinsically unknowable, it is biased
- if there is some offset, which could in principle be corrected, it is biased
I (almost always) mean the top one.

  6. NN(w, θ) : d → t, an approximation to a model ℳ : d → t (data d to targets t, with network weights w and hyperparameters θ).

  7. A crazy likelihood surface of how likely we are to get targets from data

  8. What are we actually interested in?

  9. P(t|d) = ∫ dw dθ P(t|d, w, θ) P(w, θ)

  10. P(t|d) = ∫ dw dθ P(t|d, w, θ) P(w, θ)
- Posterior (predictive density) P(t|d): how likely are the true targets given some data?
- Likelihood P(t|d, w, θ): how likely are the targets to be generated by a particular network?
- Probability density P(w, θ): what is the probability of obtaining a particular network with particular parameter values?

  11. P(t|d) = ∫ dw dθ P(t|d, w, θ) P(w, θ)
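A hedged sketch of how this predictive density could be estimated by Monte Carlo: draw (w, θ) from P(w, θ) and average the per-draw likelihoods. The Gaussian likelihood with noise scale sigma and the sample_weights callable are assumptions for illustration, not part of the slides.

```python
import numpy as np

def predictive_density(t, d, sample_weights, nn, sigma=0.1, n_samples=1000, seed=0):
    """Monte Carlo estimate of P(t|d) = ∫ dw dθ P(t|d, w, θ) P(w, θ)."""
    rng = np.random.default_rng(seed)
    density = 0.0
    for _ in range(n_samples):
        w, theta = sample_weights(rng)          # draw (w, θ) ~ P(w, θ)
        mu = nn(d, w, theta)                    # network prediction for this draw
        # assumed Gaussian likelihood P(t|d, w, θ) with noise scale sigma
        density += np.exp(-0.5 * np.sum((t - mu) ** 2) / sigma ** 2) \
                   / (np.sqrt(2.0 * np.pi) * sigma) ** t.size
    return density / n_samples                  # average of the likelihood over draws
```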
 
 
 


  12. Where does this information about the weights and hyperparameters come from?

  13. Training and validation data

  14. Training and validation data
Training data and targets: {d, t}^train ≡ {d_i, t_i | i ∈ [1, n_train]}
Validation data and targets: {d, t}^val ≡ {d_i, t_i | i ∈ [1, n_val]}
Posterior distribution of weights and hyperparameters:
P(w, θ | {d, t}^train, {d, t}^val) ∝ P({d, t}^train, {d, t}^val | w, θ) π(w, θ)
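A sketch of the unnormalised log-posterior implied by this proportionality, under an assumed Gaussian likelihood and Gaussian prior (both illustrative choices, not specified on the slides):

```python
import numpy as np

def log_posterior(w, theta, d_train, t_train, nn, sigma=0.1, prior_scale=1.0):
    """Unnormalised log P(w, θ | {d,t}^train): log-likelihood plus log-prior."""
    resid = t_train - nn(d_train, w, theta)
    log_like = -0.5 * np.sum(resid ** 2) / sigma ** 2                     # assumed Gaussian noise
    log_prior = -0.5 * sum(np.sum(p ** 2) for p in w) / prior_scale ** 2  # assumed Gaussian prior on w
    return log_like + log_prior
```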

  15. The failing of traditional training

  16. The failing of traditional training
Approximator: NN(w, θ) : d → t, with a cost Λ(w, θ) that is assumed smooth and convex.
Cost function and likelihood: Λ(w, θ) = −ln P(t|d, w, θ), but the likelihood P(t|d, w, θ) is complex and non-convex in w and θ.

  17. Optimising (or training) a network

  18. Optimising (or training) a network
What are the maximum likelihood estimates of the weights?
w_MLE = argmax_w [ P({t}^train | {d}^train, w, θ*) ]
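A minimal sketch of training as maximum-likelihood estimation, as in the argmax above: gradient descent on the negative log training likelihood. The quadratic cost (Gaussian noise) and the crude finite-difference gradients are assumptions for illustration; in practice the optimiser only finds a local maximum-likelihood estimate (next slide).

```python
import numpy as np

def neg_log_like(w_flat, d, t, predict):
    """Λ(w) = -ln P(t|d, w, θ) up to a constant, assuming Gaussian noise."""
    resid = t - predict(d, w_flat)
    return 0.5 * np.sum(resid ** 2)

def fit_mle(w_flat, d, t, predict, lr=1e-2, steps=500, eps=1e-5):
    """Gradient descent towards w_MLE = argmax_w P({t}^train | {d}^train, w, θ*)."""
    for _ in range(steps):
        grad = np.zeros_like(w_flat)
        for i in range(w_flat.size):               # crude finite-difference gradient
            dw = np.zeros_like(w_flat)
            dw[i] = eps
            grad[i] = (neg_log_like(w_flat + dw, d, t, predict)
                       - neg_log_like(w_flat - dw, d, t, predict)) / (2 * eps)
        w_flat = w_flat - lr * grad                # descend the non-convex surface
    return w_flat                                  # a *local* maximum-likelihood estimate
```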

  19. Local maximum likelihood estimates


  20. The main problem...

  21. We degenerate the posterior:
P(w, θ | {d, t}^train) ∝ P({d, t}^train | w, θ) π(w, θ) → δ(w − w_MLE, θ − θ*)

  22. We degenerate the posterior:
P(w, θ | {d, t}^train) ∝ P({d, t}^train | w, θ) π(w, θ) → δ(w − w_MLE, θ − θ*)

  23. All predictions are (probably incorrect) estimates:
P(t|d) = δ(t − t̂(d))

  24. There is no way to interpret how close t̂ is to t...

  25. There is no way to interpret how close t̂ is to t...
Because the likelihood is non-interpretably complex

  26. Are there better methods?

  27. Variational inference

  28. P(t|d) = ∫ dw dθ dw′ P(t|d, w, θ) Q(w | w′, θ, {d, t}^train) π(w′, θ)


  29. Still depends on fixed weights in the complex likelihood surface and on the choice of variational distribution:
P(t|d) = ∫ dw dθ dw′ P(t|d, w, θ) Q(w | w′, θ, {d, t}^train) × δ(w′ − w_MLE, θ − θ*)
       = ∫ dw P(t|d, w, θ*) Q(w | w_MLE, θ*, {d, t}^train).
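A hedged sketch of mean-field variational inference over the weights: approximate the weight posterior by a diagonal Gaussian Q(w | μ, σ) and maximise a single-sample Monte Carlo estimate of the ELBO. The Gaussian likelihood/prior and the reparameterised draw are illustrative assumptions, not the specific scheme on the slides.

```python
import numpy as np

def elbo(mu, log_sigma, d, t, predict, rng, prior_scale=1.0, noise=0.1):
    """Single-sample ELBO for a diagonal-Gaussian Q(w | μ, σ) over the weights."""
    sigma = np.exp(log_sigma)
    w = mu + sigma * rng.normal(size=mu.shape)               # reparameterised draw w ~ Q
    resid = t - predict(d, w)
    log_like = -0.5 * np.sum(resid ** 2) / noise ** 2        # assumed Gaussian log P(t|d, w)
    log_prior = -0.5 * np.sum(w ** 2) / prior_scale ** 2     # assumed Gaussian log π(w)
    log_q = -0.5 * np.sum(((w - mu) / sigma) ** 2) - np.sum(log_sigma)  # log Q(w), up to a constant
    return log_like + log_prior - log_q                      # maximise over (μ, log σ)
```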

  30. Bayesian neural networks

  31. Bayesian neural networks

  32. P(t|d) = ∫ dw dθ P(t|d, w, θ) P(w, θ | {d, t}^train)
         ∝ ∫ dw dθ P(t|d, w, θ) × ∏_i P(t_i^train | d_i^train, w, θ) π(w, θ)
Sample the likelihood of the training data.

  33. Still dependent on the training data!
Classical network: P(w, θ | {d, t}^train) → δ(w − w_MLE, θ − θ*)
Variational inference: P(w, θ | {d, t}^train) = Q(w | w_MLE, θ*, {d, t}^train)
Bayesian networks: P(w, θ | {d, t}^train) ∝ ∏_i P(t_i^train | d_i^train, w, θ) π(w, θ)
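A sketch of how the Bayesian-network posterior above could actually be sampled: a random-walk Metropolis step over the weights using the log of ∏_i P(t_i^train | d_i^train, w, θ) π(w, θ), with predictions then averaged over the samples. The proposal scale and the log_post callable are assumptions for illustration.

```python
import numpy as np

def sample_weight_posterior(w0, log_post, n_samples=2000, step=0.02, seed=0):
    """Random-walk Metropolis draws from P(w | {d,t}^train) ∝ likelihood × prior."""
    rng = np.random.default_rng(seed)
    w, lp = w0.copy(), log_post(w0)
    samples = []
    for _ in range(n_samples):
        prop = w + step * rng.normal(size=w.shape)       # random-walk proposal
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:         # Metropolis accept/reject
            w, lp = prop, lp_prop
        samples.append(w.copy())
    return np.array(samples)
# Predictions then average nn(d, w, θ) over the samples instead of using a single w_MLE.
```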

  34. Problems with physical models...

  35. Problems with physical models...

  36. How can we use a neural network then?

  37. Build it into the physical model

  38. Method 1: Infer the data, physics and the neural network

  39. Method 2: Understand the likelihood (using neural physical engines)

  40. [figure-only slide, no recoverable text]

  41. Method 3: Likelihood-free inference

  42. Compare the distance between observed summaries and simulation summaries and select results within ε
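A hedged sketch of the rejection step described above (approximate Bayesian computation): draw parameters from the prior, simulate, summarise, and keep only draws whose summaries lie within ε of the observed summaries. simulate, summarise, prior_sample and eps are placeholders to be supplied by the actual analysis.

```python
import numpy as np

def rejection_abc(observed, prior_sample, simulate, summarise, eps, n_draws=10000, seed=0):
    """Keep prior draws whose simulated summaries fall within ε of the observed summaries."""
    rng = np.random.default_rng(seed)
    s_obs = summarise(observed)
    accepted = []
    for _ in range(n_draws):
        params = prior_sample(rng)                    # parameters drawn from the prior
        s_sim = summarise(simulate(params, rng))      # summaries of one forward simulation
        if np.linalg.norm(s_sim - s_obs) < eps:       # distance between summaries within ε
            accepted.append(params)
    return np.array(accepted)                         # samples from the approximate posterior
```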


  43. Conclusions

  44. Conclusions
- Neural networks are not to be trusted
- They can make trusty companions when the correct framework is introduced
- Using statistics we can build neural networks into the forward model to get unbiased results

  45. For more information, read my new blog: bit.ly/ProbNN
