Prediction of Genetic Values using Neural Networks


  1. Prediction of Genetic Values using Neural Networks. Paulino Perez¹, Daniel Gianola², Jose Crossa¹. ¹CIMMYT, Mexico; ²University of Wisconsin, Madison. September 2014, SLU, Sweden. (Slide 1/26)

  2. Contents
     1. Introduction
     2. Non-linear models and NN
     3. Model fitting
     4. Case study: Wheat
     5. Application examples

  3. Introduction
     High-density marker panels enable genomic selection (GS).
     Marker-based models perform better than pedigree-based models (e.g. de los Campos et al., 2009).
     Most research has been done with linear additive models (see eq. 1).
     It might be possible to increase accuracy using non-linear models with dominance and additive effects.

     y_i = \sum_{j=1}^{p} x_{ij} \beta_j + e_i   (1)
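The additive model in eq. (1) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the marker matrix, effects, and noise are all simulated, and a ridge (penalized) estimate stands in for the standard GS baseline.

```python
import numpy as np

# Sketch of eq. (1): y_i = sum_j x_ij * beta_j + e_i, with simulated data.
rng = np.random.default_rng(0)
n, p = 100, 500                       # individuals, markers
X = rng.integers(0, 2, size=(n, p))   # binary marker codes (e.g. DArT-like)
beta = rng.normal(0.0, 0.05, size=p)  # additive marker effects
e = rng.normal(0.0, 0.5, size=n)      # residuals
y = X @ beta + e                      # phenotypes under the additive model

# Ridge (penalized) estimate of marker effects, a common GS baseline:
lam = 1.0
beta_hat = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
y_hat = X @ beta_hat
print(np.corrcoef(y, y_hat)[0, 1])    # in-sample predictive correlation
```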

  4. Introduction (continued). Recent studies with non-additive effects: [table shown on slide; values not captured in transcript]

  5. Introduction (continued). [slide content not captured in transcript]

  6. Non-linear models and neural networks

     y_i = \mu + f(x_i) + e_i   (2)

     Any non-linear function can be exactly represented (Kolmogorov's theorem) as:

     f(x_i) = f(x_{i1}, ..., x_{ip}) = \sum_{q=1}^{2p+1} g\left( \sum_{r=1}^{p} \lambda_r h_q(x_{ir}) \right)   (3)

     In neural networks (NN), non-linear functions are "approximated" as sums of finite series of smooth functions. The most basic and well-known NN is the Single Hidden Layer Feed-Forward Neural Network (SHLNN).

  7. Non-linear models and NN (continued). Figure 1: Graphical representation of a SHLNN.

  8. Non-linear models and NN (continued). Figure 2: Inputs (e.g. markers) and output (phenotype) for a SHLNN.

  9. Non-linear models and NN (continued)
     Prediction has two (automated) steps:
     1. Inputs are transformed non-linearly in the hidden layer.
     2. Outputs from the hidden layer are combined to obtain predictions.

     y_i = \mu + \sum_{k=1}^{S} w_k \, g_k\left( b_k + \sum_{j=1}^{p} x_{ij} \beta_j^{[k]} \right) + e_i

     where g_k(·) is the activation (transformation) function.
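The two prediction steps above can be sketched directly. This is an illustrative forward pass, not the authors' implementation; tanh is assumed as the activation g, as is common in Bayesian-regularized networks, and all parameter values are placeholders.

```python
import numpy as np

# Sketch of the SHLNN prediction:
# y_i = mu + sum_k w_k * g(b_k + sum_j x_ij * beta_j^[k]), with g = tanh.
def shlnn_predict(X, mu, w, b, B):
    """X: (n, p) inputs; w, b: (S,) weights and biases; B: (p, S) connection strengths."""
    hidden = np.tanh(X @ B + b)   # step 1: non-linear transform in the hidden layer
    return mu + hidden @ w        # step 2: linear combination of hidden outputs

rng = np.random.default_rng(1)
n, p, S = 50, 20, 3
X = rng.normal(size=(n, p))
y_hat = shlnn_predict(X, mu=0.5,
                      w=rng.normal(size=S),
                      b=rng.normal(size=S),
                      B=rng.normal(size=(p, S)))
print(y_hat.shape)  # one prediction per individual
```

With all weights and biases set to zero the network collapses to the intercept mu, which is a quick sanity check on the two-step structure.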

  10. Model fitting
      Parameters to be estimated in a NN are the weights (w_1, ..., w_S), biases (b_1, ..., b_S), connection strengths (\beta_1^{[1]}, ..., \beta_p^{[1]}; ...; \beta_1^{[S]}, ..., \beta_p^{[S]}), \mu, and \sigma_e^2.
      As the number of predictors (p) and of neurons (S) increases, the number of parameters to estimate grows quickly ⟹ can cause over-fitting.
      To prevent over-fitting, use penalized methods, e.g. via Bayesian approaches.

  11. Model fitting: Empirical Bayes (section divider repeating the contents slide)

  12. Model fitting: Empirical Bayes
      MacKay (1995) developed an Empirical Bayes framework for estimating the parameters of a NN.
      Let \theta = (w_1, ..., w_S, b_1, ..., b_S, \beta_1^{[1]}, ..., \beta_p^{[1]}; ...; \beta_1^{[S]}, ..., \beta_p^{[S]}, \mu)'
      with prior p(\theta | \sigma_\theta^2) = MN(0, \sigma_\theta^2 I).
      Estimation requires two steps:
      1) Obtain the conditional posterior modes of the elements of \theta, assuming \sigma_\theta^2 and \sigma_e^2 known. These are obtained by maximizing

      p(\theta | y, \sigma_\theta^2, \sigma_e^2) = \frac{p(y | \theta, \sigma_e^2) \, p(\theta | \sigma_\theta^2)}{p(y | \sigma_\theta^2, \sigma_e^2)} = \frac{p(y | \theta, \sigma_e^2) \, p(\theta | \sigma_\theta^2)}{\int_{R^m} p(y | \theta, \sigma_e^2) \, p(\theta | \sigma_\theta^2) \, d\theta},

      which is equivalent to minimizing the "augmented" sum of squares:

      F(\theta) = \frac{1}{2\sigma_e^2} \sum_{i=1}^{n} e_i^2 + \frac{1}{2\sigma_\theta^2} \sum_{j=1}^{m} \theta_j^2   (4)
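The augmented sum of squares in eq. (4) is simple enough to state as code. This is a hedged sketch of the objective only, not of the fitting algorithm; the residuals and parameter vector below are placeholder values for illustration.

```python
import numpy as np

# Eq. (4): F(theta) = (1/(2*sigma2_e)) * sum_i e_i^2
#                   + (1/(2*sigma2_theta)) * sum_j theta_j^2
def augmented_ss(e, theta, sigma2_e, sigma2_theta):
    return (e @ e) / (2.0 * sigma2_e) + (theta @ theta) / (2.0 * sigma2_theta)

e = np.array([0.5, -0.5])      # placeholder residuals
theta = np.array([1.0, 2.0])   # placeholder parameter vector
print(augmented_ss(e, theta, sigma2_e=1.0, sigma2_theta=1.0))  # 0.25 + 2.5 = 2.75
```

Note how the second term penalizes large weights: shrinking sigma2_theta increases the penalty, which is exactly how the Bayesian prior regularizes the network.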

  13. Model fitting: Empirical Bayes (continued)
      2) Update \sigma_\theta^2, \sigma_e^2 by maximizing the marginal likelihood of the data, p(y | \sigma_\theta^2, \sigma_e^2). The marginal log-likelihood is approximated as:

      \log p(y | \sigma_\theta^2, \sigma_e^2) \approx k + \frac{n}{2} \log \beta + \frac{m}{2} \log \alpha - \frac{1}{2} \log |\Sigma| - F(\theta) \big|_{\theta = \theta_{MAP}}

      where \Sigma = \frac{\partial^2}{\partial \theta \, \partial \theta'} F(\theta). It can be shown that this function is maximized when:

      \alpha = \frac{\gamma}{2 \sum_{j=1}^{m} \theta_j^2}, \quad \beta = \frac{n - \gamma}{2 \sum_{i=1}^{n} e_i^2}, \quad \gamma = m - 2\alpha \, \mathrm{Trace}(\Sigma^{-1})

      Iterate between 1) and 2) until convergence.
      NOTE: similar to using BLUP and ML in Gaussian linear models.

  14. Model fitting: problems with the approach
      A huge number of parameters to estimate: m = 1 + S × (1 + 1 + p), where S is the number of neurons and p is the number of covariates.
      The Gauss-Newton algorithm used to minimize (4) requires solving linear systems of order m × m, complexity O(m³).
      The updating formulas for the variance components require inverting a matrix of order m × m, complexity O(m³).
      Alternatives: derivative-free algorithms (may have poor performance, unstable); parallel computing.

  15. Model fitting: brnn
      We developed an R package (brnn) that implements the Empirical Bayes approach to fitting a NN. It will be available in a few months on the R mirrors. Figure 3: Help page for the brnn package.

  16. Case study: additive genetic effects (wheat)
      Prediction of grain yield (GY) and days to heading (DTH) in wheat: 306 wheat lines from the Global Wheat Program of CIMMYT, genotyped with 1,717 binary (DArT) markers.
      Two traits analyzed:
      1. GY (5 environments).
      2. DTH (10 environments).
      Bayesian regularized neural networks were fitted using the MCMC approach.
      The predictive ability of BRNN was compared against standard models by generating 50 random partitions with 90% of observations in training and 10% in testing.
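The validation scheme just described (50 random 90/10 partitions, scored by predictive correlation) can be sketched generically. This is an illustration, not the study's code: `fit` and `predict` are hypothetical stand-ins for any model, and an ordinary least-squares baseline on simulated data is used as a toy check.

```python
import numpy as np

# Sketch of the slide's validation scheme: repeated random partitions,
# 90% training / 10% testing, scoring the test-set correlation.
def cv_correlations(X, y, fit, predict, n_splits=50, test_frac=0.10, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = max(1, int(round(test_frac * n)))
    cors = []
    for _ in range(n_splits):
        idx = rng.permutation(n)
        test, train = idx[:n_test], idx[n_test:]
        model = fit(X[train], y[train])
        y_hat = predict(model, X[test])
        cors.append(np.corrcoef(y[test], y_hat)[0, 1])
    return np.array(cors)

# Toy check with an OLS baseline on simulated data (306 "lines", 10 covariates):
rng = np.random.default_rng(2)
X = rng.normal(size=(306, 10))
y = X @ rng.normal(size=10) + rng.normal(size=306)
ols_fit = lambda Xtr, ytr: np.linalg.lstsq(Xtr, ytr, rcond=None)[0]
ols_pred = lambda b, Xte: Xte @ b
print(cv_correlations(X, y, ols_fit, ols_pred).mean())
```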

  17. Case study: Wheat (continued). Table 1: Correlations between observed and predicted phenotypes for DTH and GY ("winner" underlined; table values not captured in transcript). NOTE: non-parametric methods better in 15/15 comparisons.

  18. Case study: Wheat (continued). Figure 4: Correlations for each of the 50 partitions and 10 environments for days to heading (DTH), for different combinations of models.

  19. Application examples: toy examples

      # Example 1
      # Noisy triangle wave function, similar to example 1 in Foresee and Hagan (1997)
      library(brnn)

      # Generating the data
      x1 <- seq(0, 0.23, length.out = 25)
      y1 <- 4 * x1 + rnorm(25, sd = 0.1)
      x2 <- seq(0.25, 0.75, length.out = 50)
      y2 <- 2 - 4 * x2 + rnorm(50, sd = 0.1)
      x3 <- seq(0.77, 1, length.out = 25)
      y3 <- 4 * x3 - 4 + rnorm(25, sd = 0.1)
      x <- c(x1, x2, x3)
      y <- c(y1, y2, y3)
      X <- as.matrix(x)

      # Fit a Bayesian regularized NN with 2 neurons
      neurons <- 2
      out <- brnn(y, X, neurons = neurons)
      cat("Message: ", out$reason, "\n")
      plot(x, y, xlim = c(0, 1), ylim = c(-1.5, 1.5),
           main = "Bayesian Regularization for ANN 1-2-1")

      Note: type library(brnn) and then demo('Example_1') to run this example in the R console.

  20. Application examples (continued). [Figure: fitted triangle wave data, y vs x on (0, 1) × (−1.5, 1.5); legend compares the Matlab and R fits.]

  21. Application examples (continued)

      # Example 2: two inputs and one output. The data, from Paciorek and
      # Schervish (2004), come from a two-input, one-output function with
      # Gaussian noise with mean zero and standard deviation 0.25.
      library(brnn)
      data(twoinput)
      X <- normalize(as.matrix(twoinput[, 1:2]))
      y <- as.vector(twoinput[, 3])
      neurons <- 10
      out <- brnn(y, X, neurons = neurons)
      cat("Message: ", out$reason, "\n")

      f <- function(x1, x2, theta, neurons)
        predictions.nn(X = cbind(x1, x2), theta, neurons)
      x1 <- seq(min(X[, 1]), max(X[, 1]), length.out = 50)
      x2 <- seq(min(X[, 2]), max(X[, 2]), length.out = 50)  # X[, 2], not X[, 1]
      z <- outer(x1, x2, f, theta = out$theta, neurons = neurons)  # predicted surface

      transformation_matrix <- persp(x1, x2, z, main = "Fitted model",
                                     sub = expression(y == italic(g) ~ (bold(x)) + e),
                                     col = "lightgreen", theta = 30, phi = 20, r = 50,
                                     d = 0.1, expand = 0.5, ltheta = 90, lphi = 180,
                                     shade = 0.75, ticktype = "detailed", nticks = 5)
      points(trans3d(X[, 1], X[, 2],
                     f(X[, 1], X[, 2], theta = out$theta, neurons = neurons),
                     transformation_matrix), col = "red")
