Gaussian Processes for Machine Learning NEIL LAWRENCE UNIVERSITY OF SHEFFIELD @lawrennd

[Figure: Global information storage capacity in optimally compressed bytes. Annotations: SVMs, ConvNets, dominate NIPS, Developed]

Coal Google Facebook Amazon Tin Startups

The Data are Not Enough • Four pillars: • Deterministic/Stochastic • Mechanistic/Empirical • Goal: model complex phenomena over time • Problem: • Mechanistic models are often inaccurate • Data is often not rich enough for an empirical approach • Question 1: How do we combine an inaccurate physical model with machine learning?

Central Dogma • DNA → (transcription) → mRNA → (translation) → protein

Decision: Transcription Factors • mRNA: measured using microarray since 1998 • Translation • TF protein: difficult to measure • Transcription • Other mRNAs: measured using microarray since 1998

Mechanistic Model • mRNA: $m_{TF}(t)$ • Translation: $\frac{\mathrm{d}p_{TF}(t)}{\mathrm{d}t} = s_f m_{TF}(t) - d_f p_{TF}(t)$ • TF protein: $p_{TF}(t)$ • Transcription: $\frac{\mathrm{d}m_i(t)}{\mathrm{d}t} = s_i p_{TF}(t) - d_i m_i(t)$ • Other mRNAs: $m_i(t)$
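The two linear differential equations on this slide can be integrated numerically to see how the TF protein and a target mRNA respond to a given TF mRNA profile. The sketch below is a minimal illustration, not the method used in the talk: the forward-Euler scheme, the parameter values, the zero initial conditions and the constant TF mRNA input are all assumptions chosen for illustration.

```python
import numpy as np

def simulate(m_tf, t, s_f=1.0, d_f=0.5, s_i=1.0, d_i=0.8):
    """Forward-Euler integration of the two-stage model
        dp_TF/dt = s_f * m_TF(t) - d_f * p_TF(t)   (translation)
        dm_i/dt  = s_i * p_TF(t) - d_i * m_i(t)    (transcription)
    starting from p_TF(0) = m_i(0) = 0 (assumed initial conditions)."""
    p = np.zeros_like(t)
    m = np.zeros_like(t)
    for k in range(len(t) - 1):
        dt = t[k + 1] - t[k]
        p[k + 1] = p[k] + dt * (s_f * m_tf[k] - d_f * p[k])
        m[k + 1] = m[k] + dt * (s_i * p[k] - d_i * m[k])
    return p, m

t = np.linspace(0.0, 20.0, 2001)
m_tf = np.ones_like(t)        # assumed input: constant TF mRNA level
p_tf, m_i = simulate(m_tf, t)
```

With a constant input the model settles towards the steady state $p_{TF} = s_f m_{TF} / d_f$ and $m_i = s_i p_{TF} / d_i$, which is a convenient sanity check on the integration.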

Need to Model $p_{TF}(t)$ • Gaussian process: a probabilistic model for functions. • Formally known as a stochastic process. • Multivariate Gaussian is normally defined by a mean vector, $\boldsymbol{\mu}$, and a covariance matrix, $\mathbf{C}$: $\mathbf{y} \sim \mathcal{N}(\boldsymbol{\mu}, \mathbf{C})$ • Gaussian process defined by a mean function, $\mu(t)$, and a covariance function, $c(t, t')$: $y(t) \sim \mathcal{N}(\mu(t), c(t, t'))$
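The link between the two definitions on this slide is that evaluating the mean and covariance functions at a finite set of inputs gives an ordinary multivariate Gaussian, which can then be sampled. The sketch below illustrates this under stated assumptions: the exponentiated quadratic covariance, its parameters, the input grid and the jitter term are illustrative choices, not taken from the slides.

```python
import numpy as np

def eq_cov(t, t_prime, variance=1.0, lengthscale=2.0):
    """Exponentiated quadratic covariance function c(t, t') (assumed choice)."""
    diff = t[:, None] - t_prime[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)

rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 25)               # 25 input locations
mu = np.zeros_like(t)                        # zero mean function
C = eq_cov(t, t) + 1e-6 * np.eye(len(t))     # small jitter for numerical stability

# If z ~ N(0, I) and C = L L^T, then mu + L z ~ N(mu, C).
L = np.linalg.cholesky(C)
samples = mu[:, None] + L @ rng.standard_normal((len(t), 3))  # one sample per column
```

Each column of `samples` is one draw of the function at the 25 inputs; a longer lengthscale gives smoother draws.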

Zero Mean Gaussian Sample / Zero Mean Gaussian Process Sample [Figure: covariance $\mathbf{C}$ (index against index) with samples from Gaussian ($y$ against index); covariance function $c(t, t')$ ($t$ against $t'$) with samples from Gaussian process ($y(t)$ against $t$)]

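Gaussian Processes [Figure: observations $(x_1, y_1)$ and $(x_2, y_2)$, with $p(\mathbf{f} | \mathbf{x})$, $p(y_1 | f_1)$, $p(y_2 | f_2)$ and $p(\mathbf{f} | \mathbf{y}, \mathbf{x})$]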
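Conditioning a Gaussian process on a handful of noisy observations gives a Gaussian posterior over function values in closed form. The sketch below shows that computation for two observations; the covariance function, the noise variance and the data values are assumptions made for illustration, and the Cholesky-based solve is one common way to organise the algebra rather than the talk's own implementation.

```python
import numpy as np

def eq_cov(a, b, variance=1.0, lengthscale=1.0):
    """Exponentiated quadratic covariance (assumed choice)."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)

def gp_posterior(x, y, x_star, noise_var=0.01):
    """Posterior mean and covariance of f at x_star given noisy observations y at x."""
    K = eq_cov(x, x) + noise_var * np.eye(len(x))   # covariance of the observations
    K_s = eq_cov(x, x_star)                         # cross-covariance
    K_ss = eq_cov(x_star, x_star)                   # prior covariance at x_star
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))   # K^{-1} y
    v = np.linalg.solve(L, K_s)
    mean = K_s.T @ alpha
    cov = K_ss - v.T @ v
    return mean, cov

x = np.array([-1.0, 1.0])      # two inputs (illustrative values)
y = np.array([0.5, -0.3])      # two noisy observations (illustrative values)
x_star = np.linspace(-3.0, 3.0, 50)
mean, cov = gp_posterior(x, y, x_star)
```

Near the observed inputs the posterior variance shrinks towards the noise level; far from them it returns to the prior variance.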

Results • $m_{TF}(t)$ • $\frac{\mathrm{d}p_{TF}(t)}{\mathrm{d}t} = s_f m_{TF}(t) - d_f p_{TF}(t)$ • $p_{TF}(t)$ • $\frac{\mathrm{d}m_i(t)}{\mathrm{d}t} = s_i p_{TF}(t) - d_i m_i(t)$ • $m_i(t)$ • TPAMI, 2 PNAS papers, 2 Comp Bio

MATLAB Demo • demo_2016_04_28_amazon.m

Further Challenge • This model inter-relates different functions with mechanistic understanding. • What if you need to inter-relate across different modalities of data at different scales? • E.g. biopsy images + genetic test + mammogram for breast cancer diagnostics.

The Data are Not Enough • Four pillars: • Deterministic/Stochastic • Mechanistic/Empirical • Goal: model complex phenomena over time • Problem: • Mechanistic models are often inaccurate • Data is often not rich enough for an empirical approach • Question 2: How do we formulate the right representations to integrate different data modalities?

Classical Latent Variables [Diagram: variables x and y]

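Classical Treatment • Assume a priori that $\mathbf{x} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$ • Relate linearly to $\mathbf{y}$: $\mathbf{y} = \mathbf{W}\mathbf{x} + \boldsymbol{\epsilon}$ • Framework covers many classical models: PCA, Factor Analysis, ICA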
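A short simulation makes the linear latent variable model concrete: draw latent points from a standard Gaussian, map them linearly and add noise. The sketch below assumes isotropic Gaussian noise (the probabilistic PCA case), and the dimensions, noise level and random mapping matrix are illustrative choices. Under these assumptions the observed data are Gaussian with covariance $\mathbf{W}\mathbf{W}^\top + \sigma^2 \mathbf{I}$, which the sample covariance should approximate.

```python
import numpy as np

rng = np.random.default_rng(1)
n, q, d = 20000, 2, 5          # samples, latent dimension, observed dimension (assumed)
noise_std = 0.1                # assumed isotropic noise standard deviation

W = rng.standard_normal((d, q))              # illustrative mapping matrix
X = rng.standard_normal((n, q))              # x ~ N(0, I), one row per sample
E = noise_std * rng.standard_normal((n, d))  # Gaussian noise
Y = X @ W.T + E                              # y = W x + noise for every sample

C_model = W @ W.T + noise_std**2 * np.eye(d)   # covariance implied by the model
C_empirical = np.cov(Y, rowvar=False)          # covariance estimated from the samples
max_abs_error = np.abs(C_empirical - C_model).max()
```

Because the mapping is linear, Gaussian latent points always give Gaussian observations, whatever the choice of mapping matrix.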

Render Gaussian Non Gaussian • $y = f(x)$ [Figure axes: $x$, $y$]

Use Abstraction for Complex Systems High Level Ideas Stratification of Concepts Low Level Mechanisms

Biology and Health Health ? ? ? Molecular Biology

Neuroscience Behaviour ? ? ? Neuron Firing

$g(x) = f_9(f_8(f_7(f_6(\cdots f_1(x)))))$ [Diagram: $x$ passed through $f_1(\cdot)$, $f_2(\cdot)$, ..., $f_9(\cdot)$ in turn to give $g(x)$]

Stochastic Process Composition • A new approach to forming stochastic processes • Mathematical composition: $y(x) = f_1(f_2(f_3(x)))$ • Properties of resulting process highly non-Gaussian • Allows for hierarchical structured form of model. • Learning in models of this type has become known as: deep learning.
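Composition can be illustrated by drawing a sample from one Gaussian process and feeding its output in as the input of the next. The sketch below draws a single sample from a three-layer composition; the exponentiated quadratic covariance, unit parameters, one-dimensional layers and jitter value are assumptions for illustration, and it only samples from the prior rather than performing any learning.

```python
import numpy as np

def eq_cov(a, b, variance=1.0, lengthscale=1.0):
    """Exponentiated quadratic covariance (assumed choice)."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)

def sample_gp(inputs, rng, jitter=1e-6):
    """Draw one zero-mean GP sample evaluated at the given inputs."""
    C = eq_cov(inputs, inputs) + jitter * np.eye(len(inputs))
    return np.linalg.cholesky(C) @ rng.standard_normal(len(inputs))

rng = np.random.default_rng(2)
x = np.linspace(-3.0, 3.0, 100)

h3 = sample_gp(x, rng)     # f_3(x)
h2 = sample_gp(h3, rng)    # f_2(f_3(x)): the previous output is the new input
y = sample_gp(h2, rng)     # f_1(f_2(f_3(x)))
```

Each layer is Gaussian given its input, but the composed output `y` viewed as a function of `x` is no longer a Gaussian process.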

(200 iterations)

(converged)

model              MSE (train)   MSE (test)
mlp (200 iters)    108.5         1185.1
mlp (converged)    24.0          1338.2
gp                 59.2          1095.4
deep gp (2)        146.2         833.7
deep gp (3)        182.5         843.6
One hundred hidden nodes, one hundred inducing points

Regression
data set   n      p    GP          Sparse GP   Deep GP
housing    506    13   2.78±0.54   2.77±0.60   2.69±0.49
redwine    588    11   0.72±0.06   0.62±0.04   0.62±0.04
energy1    768    8    0.48±0.07   0.50±0.07   0.49±0.07
energy2    768    8    0.59±0.08   1.66±0.21   1.39±0.49
concrete   1030   8    5.26±0.67   5.81±0.62   5.66±0.62

Bayesian Optimization • Check http://sheffieldml.github.io/GPyOpt/

Use Abstraction for Complex Systems High Level Ideas Stratification of Concepts Low Level Mechanisms

Example: Motion Capture Modelling

Modelling Digits

Numerical Issues

Health • Complex system • Scarce data • Different modalities • Poor understanding of mechanism • Large scale • PLoS Comp Bio, Nature Communications [Diagram labels: genotype, epigenotype, environment, state of health, organ states, cell states, gene expression, clinical tests, clinical notes, treatment, survival analysis, X-ray, biopsy]

To Find Out More • Gaussian Process Summer School • 12th-15th September 2016 in Sheffield • This year in parallel with/themed as a UQ orientated school (co-organisation with Rich Wilkinson) • Occurring alongside ENBIS Meeting • http://gpss.cc/

Future • Methodology • Deep GPs (also current) • Latent Force Models (current but dormant) • Latent Action Models and Stochastic Optimal Control (new) • Probabilistic Geometries (starting) • Exemplar Applications • Health and Biology (existing) • Developing world (existing) • Robotics at different scales (starting) • Perception: vision (dormant) haptic (new)

Summary • Complex systems: • ‘big data’ is too ‘small’. • The data are not enough. • Need data-efficient methods • http://www.theguardian.com/media-network/2016/jan/28/google-ai-go-grandmaster-real-winner-deepmind • Solutions: • Hybrid mechanistic-empirical models • Structured models for automated data assimilation

Thank you Neil Lawrence http://inverseprobability.com @lawrennd

The Digital Oligarchy • Response to concentration of power with data • CitizenMe • London based start up • User-centric data modelling • New challenges in ML • Integration of ML, systems, cryptography.

Open Data Science and Africa Challenge • “Whole pipeline challenge” • Make software available • Teach summer schools • Support local meetings • Publicity in the Guardian • Opportunities to deploy pipeline solution

Disease Incidence for Malaria

Uganda • Spatial models of disease

Deployed with UN Global Pulse Lab http://pulselabkampala.ug/hmis/
