
slide-1
SLIDE 1

Machine Learning and Rendering

Alex Keller, Director of Research nvidia.com/siggraph2018 East Building, Ballroom BC

slide-2
SLIDE 2

Machine Learning and Rendering

Course web page at https://sites.google.com/site/mlandrendering/

⌅ 14:00 From Machine Learning to Graphics and back – Alexander Keller, NVIDIA ⌅ 14:40 Robust & Efficient Light Transport by Machine Learning – Jaroslav Kˇ

rivánek, Charles University, Prague

⌅ 15:15 Deep Learning for Light Transport Simulation – Jan Novàk, Disney Research ⌅ 16:05 Neural Realtime Rendering in Image Space – Anton Kaplanyan, Facebook Reality Labs ⌅ 16:40 Deep Realtime Rendering – Marco Salvi, NVIDIA 2

slide-3
SLIDE 3

Modern Path Tracing

Light transport simulation

■ ways to formulate the radiance $L_r$ reflected in a surface point $x$

$$L_r(x,\omega_r) = \int_{S^2(x)} L_i(x,\omega)\, f_r(\omega_r,x,\omega)\cos\theta_x\, d\omega$$

[figure: scene with point $P$, light source $L$, and camera]

slide-4
SLIDE 4

Modern Path Tracing

Light transport simulation

■ ways to formulate the radiance $L_r$ reflected in a surface point $x$

$$L_r(x,\omega_r) = \int_{S^2(x)} L_i(x,\omega)\, f_r(\omega_r,x,\omega)\cos\theta_x\, d\omega$$
$$= \int_{\partial V} V(x,y)\, L_i(x,\omega)\, f_r(\omega_r,x,\omega)\cos\theta_x\, \frac{\cos\theta_y}{|x-y|^2}\, dy$$

[figure: scene with point $P$, light source $L$, and camera]

slide-5
SLIDE 5

Modern Path Tracing

Light transport simulation

■ ways to formulate the radiance $L_r$ reflected in a surface point $x$

$$L_r(x,\omega_r) = \int_{S^2(x)} L_i(x,\omega)\, f_r(\omega_r,x,\omega)\cos\theta_x\, d\omega$$
$$= \int_{\partial V} V(x,y)\, L_i(x,\omega)\, f_r(\omega_r,x,\omega)\cos\theta_x\, \frac{\cos\theta_y}{|x-y|^2}\, dy$$
$$= \int_{\partial V}\int_{\partial V} V(x',y)\,\delta_x(x')\, L_i(x',\omega)\, f_r(\omega_r,x',\omega)\, \frac{\cos\theta_{x'}\cos\theta_y}{|x'-y|^2}\, dx'\, dy$$

[figure: scene with point $P$, light source $L$, and camera]

slide-6
SLIDE 6

Modern Path Tracing

Light transport simulation

■ ways to formulate the radiance $L_r$ reflected in a surface point $x$

$$L_r(x,\omega_r) = \int_{S^2(x)} L_i(x,\omega)\, f_r(\omega_r,x,\omega)\cos\theta_x\, d\omega$$
$$= \int_{\partial V} V(x,y)\, L_i(x,\omega)\, f_r(\omega_r,x,\omega)\cos\theta_x\, \frac{\cos\theta_y}{|x-y|^2}\, dy$$
$$= \int_{\partial V}\int_{\partial V} V(x',y)\left(\lim_{r(x)\to 0}\frac{\chi_B(x-x')}{\pi r(x)^2}\right) L_i(x',\omega)\, f_r(\omega_r,x',\omega)\, \frac{\cos\theta_{x'}\cos\theta_y}{|x'-y|^2}\, dx'\, dy$$

[figure: scene with point $P$, light source $L$, and camera]

slide-7
SLIDE 7

Modern Path Tracing

Light transport simulation

■ ways to formulate the radiance $L_r$ reflected in a surface point $x$

$$L_r(x,\omega_r) = \int_{S^2(x)} L_i(x,\omega)\, f_r(\omega_r,x,\omega)\cos\theta_x\, d\omega$$
$$= \int_{\partial V} V(x,y)\, L_i(x,\omega)\, f_r(\omega_r,x,\omega)\cos\theta_x\, \frac{\cos\theta_y}{|x-y|^2}\, dy$$
$$= \lim_{r(x)\to 0}\int_{\partial V}\int_{\partial V} V(x',y)\, \frac{\chi_B(x-x')}{\pi r(x)^2}\, L_i(x',\omega)\, f_r(\omega_r,x',\omega)\, \frac{\cos\theta_y\cos\theta_{x'}}{|x'-y|^2}\, dx'\, dy$$

[figure: scene with point $P$, light source $L$, and camera]

slide-8
SLIDE 8

Modern Path Tracing

Light transport simulation

■ ways to formulate the radiance $L_r$ reflected in a surface point $x$

$$L_r(x,\omega_r) = \int_{S^2(x)} L_i(x,\omega)\, f_r(\omega_r,x,\omega)\cos\theta_x\, d\omega$$
$$= \int_{\partial V} V(x,y)\, L_i(x,\omega)\, f_r(\omega_r,x,\omega)\cos\theta_x\, \frac{\cos\theta_y}{|x-y|^2}\, dy$$
$$= \lim_{r(x)\to 0}\int_{\partial V}\int_{S^2(y)} \frac{\chi_B(x-h(y,\omega))}{\pi r(x)^2}\, L_i(h(y,\omega),\omega)\, f_r(\omega_r,h(y,\omega),\omega)\cos\theta_y\, d\omega\, dy$$

[figure: scene with point $P$, light source $L$, and camera]

slide-9
SLIDE 9

Modern Path Tracing

Light transport simulation

■ ways to formulate the radiance $L_r$ reflected in a surface point $x$

$$L_r(x,\omega_r) = \int_{S^2(x)} L_i(x,\omega)\, f_r(\omega_r,x,\omega)\cos\theta_x\, d\omega$$
$$= \int_{\partial V} V(x,y)\, L_i(x,\omega)\, f_r(\omega_r,x,\omega)\cos\theta_x\, \frac{\cos\theta_y}{|x-y|^2}\, dy$$
$$= \lim_{r(x)\to 0}\int_{\partial V}\int_{S^2(y)} \frac{\chi_B(x-h(y,\omega))}{\pi r(x)^2}\, L_i(h(y,\omega),\omega)\, f_r(\omega_r,h(y,\omega),\omega)\cos\theta_y\, d\omega\, dy$$
$$= \int_{S^2(x)} L_i(x,\omega)\, f_r(\omega_r,x,\omega)\cos\theta_x\, d\omega$$

[figure: scene with point $P$, light source $L$, and camera]

slide-10
SLIDE 10

Modern Path Tracing

Light transport simulation

■ ways to formulate the radiance $L_r$ reflected in a surface point $x$

$$L_r(x,\omega_r) = \int_{S^2(x)} L_i(x,\omega)\, f_r(\omega_r,x,\omega)\cos\theta_x\, d\omega$$
$$= \int_{\partial V} V(x,y)\, L_i(x,\omega)\, f_r(\omega_r,x,\omega)\cos\theta_x\, \frac{\cos\theta_y}{|x-y|^2}\, dy$$
$$= \lim_{r(x)\to 0}\int_{\partial V}\int_{S^2(y)} \frac{\chi_B(x-h(y,\omega))}{\pi r(x)^2}\, L_i(h(y,\omega),\omega)\, f_r(\omega_r,h(y,\omega),\omega)\cos\theta_y\, d\omega\, dy$$
$$= \int_{S^2(x)} \left(\lim_{r(x)\to 0}\frac{\int_{B(x)} w(x,x')\, L_i(x',\omega)\, dx'}{\int_{B(x)} w(x,x')\, dx'}\right) f_r(\omega_r,x,\omega)\cos\theta_x\, d\omega$$

[figure: scene with point $P$, light source $L$, and camera]

slide-11
SLIDE 11

Modern Path Tracing

Light transport simulation

■ ways to formulate the radiance $L_r$ reflected in a surface point $x$

$$L_r(x,\omega_r) = \int_{S^2(x)} L_i(x,\omega)\, f_r(\omega_r,x,\omega)\cos\theta_x\, d\omega$$
$$= \int_{\partial V} V(x,y)\, L_i(x,\omega)\, f_r(\omega_r,x,\omega)\cos\theta_x\, \frac{\cos\theta_y}{|x-y|^2}\, dy$$
$$= \lim_{r(x)\to 0}\int_{\partial V}\int_{S^2(y)} \frac{\chi_B(x-h(y,\omega))}{\pi r(x)^2}\, L_i(h(y,\omega),\omega)\, f_r(\omega_r,h(y,\omega),\omega)\cos\theta_y\, d\omega\, dy$$
$$= \lim_{r(x)\to 0}\int_{S^2(x)} \frac{\int_{B(x)} w(x,x')\, L_i(x',\omega)\, dx'}{\int_{B(x)} w(x,x')\, dx'}\, f_r(\omega_r,x,\omega)\cos\theta_x\, d\omega$$

[figure: scene with point $P$, light source $L$, and camera]

slide-12
SLIDE 12

Modern Path Tracing

Light transport simulation

■ path tracing: starting paths from the camera and iterating scattering and ray tracing
– bad for small light sources, good for large light sources

[path diagrams]

slide-13
SLIDE 13

Modern Path Tracing

Light transport simulation

■ path tracing with next event estimation by shadow rays (dashed lines)
– good for small light sources, bad for close light sources

[path diagrams]

slide-14
SLIDE 14

Modern Path Tracing

Light transport simulation

■ light tracing, i.e. paths starting from the light source connected to the camera
– can capture some caustics where path tracing and next event estimation do not work

[path diagrams]

slide-15
SLIDE 15

Modern Path Tracing

Light transport simulation

■ all obvious ways to generate light transport paths
– which ones are good?

[path diagrams]

slide-16
SLIDE 16

Modern Path Tracing

Light transport simulation

■ bidirectional path tracing, optimally combining all techniques by weighting each contribution
– $\sum_{i=0}^{l} w_{l,i} = 1$ for path length $l \ge 1$, $l \in \mathbb{N}$

[diagram: weighted sum of path diagrams, $w_{1,1}\cdot(\cdot) + w_{1,0}\cdot(\cdot) + w_{2,2}\cdot(\cdot) + \dots + w_{3,0}\cdot(\cdot)$]

slide-17
SLIDE 17

Modern Path Tracing

Light transport simulation

■ bidirectional path tracing, optimally combining all techniques by weighting each contribution
– $\sum_{i=0}^{l} w_{l,i} = 1$ for path length $l \ge 1$, $l \in \mathbb{N}$

[diagram: weighted sum of path diagrams, $w_{1,1}\cdot(\cdot) + w_{1,0}\cdot(\cdot) + w_{2,2}\cdot(\cdot) + \dots + w_{3,0}\cdot(\cdot)$]

■ problem of insufficient techniques, for example, if only one $w_{l,i} \neq 0$

slide-18
SLIDE 18

Modern Path Tracing

Numerical integro-approximation

■ Monte Carlo methods

$$g(y) = \int_{[0,1)^s} f(y,x)\, dx$$

slide-19
SLIDE 19

Modern Path Tracing

Numerical integro-approximation

■ Monte Carlo methods

$$g(y) = \int_{[0,1)^s} f(y,x)\, dx \approx \frac{1}{n}\sum_{i=1}^{n} f(y,x_i)$$

– uniform, independent, unpredictable random samples $x_i$
– simulated by pseudo-random numbers

[plot: a growing number of random samples in the unit square]

slide-24
SLIDE 24

Modern Path Tracing

Numerical integro-approximation

■ Monte Carlo methods

$$g(y) = \int_{[0,1)^s} f(y,x)\, dx \approx \frac{1}{n}\sum_{i=1}^{n} f(y,x_i)$$

– uniform, independent, unpredictable random samples $x_i$
– simulated by pseudo-random numbers

[plot: random samples in the unit square]

■ quasi-Monte Carlo methods

$$g(y) = \int_{[0,1)^s} f(y,x)\, dx \approx \frac{1}{n}\sum_{i=1}^{n} f(y,x_i)$$

slide-25
SLIDE 25

Modern Path Tracing

Numerical integro-approximation

■ Monte Carlo methods

$$g(y) = \int_{[0,1)^s} f(y,x)\, dx \approx \frac{1}{n}\sum_{i=1}^{n} f(y,x_i)$$

– uniform, independent, unpredictable random samples $x_i$
– simulated by pseudo-random numbers

[plot: random samples in the unit square]

■ quasi-Monte Carlo methods

$$g(y) = \int_{[0,1)^s} f(y,x)\, dx \approx \frac{1}{n}\sum_{i=1}^{n} f(y,x_i)$$

– much more uniform, correlated samples $x_i$
– realized by low-discrepancy sequences, which are progressive Latin-hypercube samples

[plot: a growing number of low-discrepancy samples in the unit square]

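The two estimators above differ only in how the sample points $x_i$ are chosen. A minimal sketch contrasting them on a toy one-dimensional integrand (the function name `van_der_corput` and the test integrand $f(x) = x^2$ are my choices; the radical inverse is a standard low-discrepancy construction):

```python
import random

def van_der_corput(i, base=2):
    """Radical inverse of i in the given base: the i-th point of a
    one-dimensional low-discrepancy sequence."""
    x, f = 0.0, 1.0 / base
    while i > 0:
        x += (i % base) * f
        i //= base
        f /= base
    return x

def f(x):
    return x * x  # integral over [0,1) is 1/3

n = 1024
random.seed(42)
mc  = sum(f(random.random())      for i in range(n)) / n
qmc = sum(f(van_der_corput(i + 1)) for i in range(n)) / n
print(abs(mc - 1/3), abs(qmc - 1/3))  # the QMC error is typically much smaller
```

The same one-line swap of the point generator is all that distinguishes the two methods in a renderer as well.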

slide-30
SLIDE 30

Modern Path Tracing

Pushbutton paradigm

■ deterministic
– may improve the speed of convergence
– reproducible and simple to parallelize

slide-31
SLIDE 31

Modern Path Tracing

Pushbutton paradigm

■ deterministic
– may improve the speed of convergence
– reproducible and simple to parallelize
■ unbiased
– zero difference between expectation and mathematical object
– not sufficient for convergence

slide-32
SLIDE 32

Modern Path Tracing

Pushbutton paradigm

■ deterministic
– may improve the speed of convergence
– reproducible and simple to parallelize
■ biased
– allows for ameliorating the problem of insufficient techniques
– can tremendously increase efficiency

slide-33
SLIDE 33

Modern Path Tracing

Pushbutton paradigm

■ deterministic
– may improve the speed of convergence
– reproducible and simple to parallelize
■ biased
– allows for ameliorating the problem of insufficient techniques
– can tremendously increase efficiency
■ consistent
– error vanishes with an increasing set of samples
– no persistent artifacts introduced by the algorithm

▸ Quasi-Monte Carlo image synthesis in a nutshell
▸ The Iray light transport simulation and rendering system

slide-35
SLIDE 35

Reconstruction from noisy input: Massively parallel path space filtering (link)

slide-36
SLIDE 36

From Machine Learning to Graphics

slide-37
SLIDE 37

Machine Learning

Taxonomy

■ unsupervised learning from unlabeled data
– examples: clustering, auto-encoder networks

slide-38
SLIDE 38

Machine Learning

Taxonomy

■ unsupervised learning from unlabeled data
– examples: clustering, auto-encoder networks
■ semi-supervised learning by rewards
– example: reinforcement learning

slide-39
SLIDE 39

Machine Learning

Taxonomy

■ unsupervised learning from unlabeled data
– examples: clustering, auto-encoder networks
■ semi-supervised learning by rewards
– example: reinforcement learning
■ supervised learning from labeled data
– examples: support vector machines, decision trees, artificial neural networks

slide-40
SLIDE 40

Reinforcement Learning

Goal: maximize reward

■ state transition yields reward $r_{t+1}(a_t \mid s_t) \in \mathbb{R}$

[diagram: agent in state $s_t$ selects action $a_t$; the environment returns the next state $s_{t+1}$ and the reward $r_{t+1}(a_t \mid s_t)$]

slide-41
SLIDE 41

Reinforcement Learning

Goal: maximize reward

■ state transition yields reward $r_{t+1}(a_t \mid s_t) \in \mathbb{R}$
■ learn a policy $\pi_t$
– to select an action $a_t \in A(s_t)$
– given the current state $s_t \in S$

[diagram: agent in state $s_t$ selects action $a_t$; the environment returns the next state $s_{t+1}$ and the reward $r_{t+1}(a_t \mid s_t)$]

slide-42
SLIDE 42

Reinforcement Learning

Goal: maximize reward

■ state transition yields reward $r_{t+1}(a_t \mid s_t) \in \mathbb{R}$
■ learn a policy $\pi_t$
– to select an action $a_t \in A(s_t)$
– given the current state $s_t \in S$

[diagram: agent in state $s_t$ selects action $a_t$; the environment returns the next state $s_{t+1}$ and the reward $r_{t+1}(a_t \mid s_t)$]

■ maximizing the discounted cumulative reward

$$V(s_t) \equiv \sum_{k=0}^{\infty} \gamma^k \cdot r_{t+1+k}(a_{t+k} \mid s_{t+k}), \quad \text{where } 0 < \gamma < 1$$
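For a finite reward sequence, the discounted cumulative reward above can be evaluated directly; a minimal sketch (the helper name `discounted_return` is my choice):

```python
def discounted_return(rewards, gamma=0.9):
    """V = sum_k gamma^k * r_{t+1+k} for a finite reward sequence."""
    v = 0.0
    for k, r in enumerate(rewards):
        v += (gamma ** k) * r
    return v

print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))  # 1 + 0.5 + 0.25 = 1.75
```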

slide-43
SLIDE 43

Reinforcement Learning

Q-Learning [Watkins 1989]

■ learns an optimal action selection policy for any given Markov decision process

$$Q'(s,a) = (1-\alpha)\cdot Q(s,a) + \alpha\cdot\big(r(s,a) + \gamma\cdot V(s')\big) \quad \text{for a learning rate } \alpha \in [0,1]$$

slide-44
SLIDE 44

Reinforcement Learning

Q-Learning [Watkins 1989]

■ learns an optimal action selection policy for any given Markov decision process

$$Q'(s,a) = (1-\alpha)\cdot Q(s,a) + \alpha\cdot\big(r(s,a) + \gamma\cdot V(s')\big) \quad \text{for a learning rate } \alpha \in [0,1]$$

with the following options for the discounted cumulative reward

$$V(s') \equiv \max_{a' \in A} Q(s',a') \quad \text{consider best action in next state } s'$$

slide-45
SLIDE 45

Reinforcement Learning

Q-Learning [Watkins 1989]

■ learns an optimal action selection policy for any given Markov decision process

$$Q'(s,a) = (1-\alpha)\cdot Q(s,a) + \alpha\cdot\big(r(s,a) + \gamma\cdot V(s')\big) \quad \text{for a learning rate } \alpha \in [0,1]$$

with the following options for the discounted cumulative reward

$$V(s') \equiv \begin{cases} \max_{a' \in A} Q(s',a') & \text{consider best action in next state } s' \\ \sum_{a' \in A} \pi(s',a')\, Q(s',a') & \text{policy-weighted average over a discrete action space} \end{cases}$$

slide-46
SLIDE 46

Reinforcement Learning

Q-Learning [Watkins 1989]

■ learns an optimal action selection policy for any given Markov decision process

$$Q'(s,a) = (1-\alpha)\cdot Q(s,a) + \alpha\cdot\big(r(s,a) + \gamma\cdot V(s')\big) \quad \text{for a learning rate } \alpha \in [0,1]$$

with the following options for the discounted cumulative reward

$$V(s') \equiv \begin{cases} \max_{a' \in A} Q(s',a') & \text{consider best action in next state } s' \\ \sum_{a' \in A} \pi(s',a')\, Q(s',a') & \text{policy-weighted average over a discrete action space} \\ \int_A \pi(s',a')\, Q(s',a')\, da' & \text{policy-weighted average over a continuous action space} \end{cases}$$
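A sketch of one tabular update with the first two options for $V(s')$ (the dictionary layout of `Q` and the helper name `q_update` are my assumptions):

```python
def q_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.9, policy=None):
    """One Q-learning step: Q'(s,a) = (1-alpha)*Q(s,a) + alpha*(r + gamma*V(s'))."""
    actions = Q[s_next].keys()
    if policy is None:
        # V(s') as the best action in the next state s'
        v = max(Q[s_next][a2] for a2 in actions)
    else:
        # V(s') as the policy-weighted average over a discrete action space
        v = sum(policy[s_next][a2] * Q[s_next][a2] for a2 in actions)
    Q[s][a] = (1 - alpha) * Q[s][a] + alpha * (r + gamma * v)
    return Q[s][a]

Q = {"s0": {"left": 0.0, "right": 0.0}, "s1": {"left": 1.0, "right": 2.0}}
q_update(Q, "s0", "right", r=1.0, s_next="s1", alpha=0.5, gamma=0.9)
print(Q["s0"]["right"])  # 0.5 * (1 + 0.9 * 2) = 1.4
```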

slide-47
SLIDE 47

Reinforcement Learning

Maximize reward by learning importance sampling online

■ radiance integral equation

$$L(x,\omega) = L_e(x,\omega) + \int_{S^2_+(x)} f_s(\omega_i,x,\omega)\cos\theta_i\, L(h(x,\omega_i),\omega_i)\, d\omega_i$$

slide-48
SLIDE 48

Reinforcement Learning

Maximize reward by learning importance sampling online

■ structural equivalence of integral equation and Q-learning

$$L(x,\omega) = L_e(x,\omega) + \int_{S^2_+(x)} f_s(\omega_i,x,\omega)\cos\theta_i\, L(h(x,\omega_i),\omega_i)\, d\omega_i$$
$$Q'(s,a) = (1-\alpha)\, Q(s,a) + \alpha\left(r(s,a) + \gamma \int_A \pi(s',a')\, Q(s',a')\, da'\right)$$


slide-53
SLIDE 53

Reinforcement Learning

Maximize reward by learning importance sampling online

■ structural equivalence of integral equation and Q-learning

$$L(x,\omega) = L_e(x,\omega) + \int_{S^2_+(x)} f_s(\omega_i,x,\omega)\cos\theta_i\, L(h(x,\omega_i),\omega_i)\, d\omega_i$$
$$Q'(s,a) = (1-\alpha)\, Q(s,a) + \alpha\left(r(s,a) + \gamma \int_A \pi(s',a')\, Q(s',a')\, da'\right)$$

■ graphics example: learning the incident radiance

$$Q'(x,\omega) = (1-\alpha)\, Q(x,\omega) + \alpha\left(L_e(y,\omega) + \int_{S^2_+(y)} f_s(\omega_i,y,\omega)\cos\theta_i\, Q(y,\omega_i)\, d\omega_i\right)$$

slide-54
SLIDE 54

Reinforcement Learning

Maximize reward by learning importance sampling online

■ structural equivalence of integral equation and Q-learning

$$L(x,\omega) = L_e(x,\omega) + \int_{S^2_+(x)} f_s(\omega_i,x,\omega)\cos\theta_i\, L(h(x,\omega_i),\omega_i)\, d\omega_i$$
$$Q'(s,a) = (1-\alpha)\, Q(s,a) + \alpha\left(r(s,a) + \gamma \int_A \pi(s',a')\, Q(s',a')\, da'\right)$$

■ graphics example: learning the incident radiance

$$Q'(x,\omega) = (1-\alpha)\, Q(x,\omega) + \alpha\left(L_e(y,\omega) + \int_{S^2_+(y)} f_s(\omega_i,y,\omega)\cos\theta_i\, Q(y,\omega_i)\, d\omega_i\right)$$

to be used as a policy for selecting an action $\omega$ in state $x$ to reach the next state $y := h(x,\omega)$
– the learning rate $\alpha$ is the only parameter left

▸ Technical Note: Q-Learning

slide-55
SLIDE 55

Reinforcement Learning

Online algorithm for guiding light transport paths

Function pathTrace(camera, scene)
    throughput ← 1
    ray ← setupPrimaryRay(camera)
    for i ← 0 to ∞ do
        y, n ← intersect(scene, ray)
        if isEnvironment(y) then
            return throughput · getRadianceFromEnvironment(ray, y)
        else if isAreaLight(y) then
            return throughput · getRadianceFromAreaLight(ray, y)
        ω, pω, fs ← sampleBsdf(y, n)
        throughput ← throughput · fs · cos(n, ω) / pω
        ray ← (y, ω)

slide-56
SLIDE 56

Reinforcement Learning

Online algorithm for guiding light transport paths

Function pathTrace(camera, scene)
    throughput ← 1
    ray ← setupPrimaryRay(camera)
    for i ← 0 to ∞ do
        y, n ← intersect(scene, ray)
        if i > 0 then
            $Q'(x,\omega) = (1-\alpha)\, Q(x,\omega) + \alpha\left(L_e(y,\omega) + \int_{S^2_+(y)} f_s(\omega_i,y,\omega)\cos\theta_i\, Q(y,\omega_i)\, d\omega_i\right)$
        if isEnvironment(y) then
            return throughput · getRadianceFromEnvironment(ray, y)
        else if isAreaLight(y) then
            return throughput · getRadianceFromAreaLight(ray, y)
        ω, pω, fs ← sampleScatteringDirectionProportionalToQ(y)
        throughput ← throughput · fs · cos(n, ω) / pω
        ray ← (y, ω)
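A minimal sketch of the guided scattering step, assuming Q is stored per surface point as a small table over direction bins; the function name `sample_proportional_to_q` and the uniform fallback for an untrained table are my assumptions, standing in for sampleScatteringDirectionProportionalToQ:

```python
import random

def sample_proportional_to_q(q_bins):
    """Pick a direction bin with probability proportional to its Q value.
    Returns (bin index, probability of that bin)."""
    total = sum(q_bins)
    if total <= 0.0:                       # nothing learned yet: uniform fallback
        i = random.randrange(len(q_bins))
        return i, 1.0 / len(q_bins)
    xi, cdf = random.random() * total, 0.0
    for i, q in enumerate(q_bins):
        cdf += q
        if xi < cdf:
            return i, q / total
    return len(q_bins) - 1, q_bins[-1] / total

random.seed(1)
q = [0.1, 0.7, 0.2]                        # learned Q over three direction bins
i, pdf = sample_proportional_to_q(q)       # bin 1 is selected most of the time
```

Dividing the path throughput by the returned probability, exactly as with `pω` in the pseudocode above, keeps the guided estimator unbiased.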

slide-57
SLIDE 57

Approximate solution Q stored on discretized hemispheres across the scene surface

slide-58
SLIDE 58

2048 paths traced with BRDF importance sampling in a scene with challenging visibility

slide-59
SLIDE 59

Path tracing with online reinforcement learning at the same number of paths

slide-60
SLIDE 60

Metropolis light transport at the same number of paths

slide-61
SLIDE 61

Reinforcement Learning

Guiding paths to where the value Q comes from

■ shorter expected path length
■ dramatically reduced number of paths with zero contribution
■ very efficient online learning by learning Q from Q

slide-62
SLIDE 62

Reinforcement Learning

Guiding paths to where the value Q comes from

■ shorter expected path length
■ dramatically reduced number of paths with zero contribution
■ very efficient online learning by learning Q from Q
■ directions for research
– representation of the value Q: data structures from games
– importance sampling proportional to the integrand, i.e. the product of the policy $\gamma\cdot\pi$ times the value Q

▸ On-line learning of parametric mixture models for light transport simulation
▸ Product importance sampling for light transport path guiding
▸ Fast product importance sampling of environment maps
▸ Learning light transport the reinforced way
▸ Practical path guiding for efficient light-transport simulation

slide-63
SLIDE 63

From Graphics back to Machine Learning

slide-64
SLIDE 64

Artificial Neural Networks in a Nutshell

Supervised learning of high dimensional function approximation

■ input layer $a_0$, $L-1$ hidden layers, and output layer $a_L$

[diagram: fully connected layers $a_{0,0},\dots,a_{0,n_0-1}$ through $a_{L,0},\dots,a_{L,n_L-1}$]

slide-65
SLIDE 65

Artificial Neural Networks in a Nutshell

Supervised learning of high dimensional function approximation

■ input layer $a_0$, $L-1$ hidden layers, and output layer $a_L$

[diagram: fully connected layers $a_{0,0},\dots,a_{0,n_0-1}$ through $a_{L,0},\dots,a_{L,n_L-1}$]

– $n_l$ rectified linear units (ReLU) $a_{l,i} = \max\{0, \sum_j w_{l,j,i}\, a_{l-1,j}\}$ in layer $l$

slide-66
SLIDE 66

Artificial Neural Networks in a Nutshell

Supervised learning of high dimensional function approximation

■ input layer $a_0$, $L-1$ hidden layers, and output layer $a_L$

[diagram: fully connected layers $a_{0,0},\dots,a_{0,n_0-1}$ through $a_{L,0},\dots,a_{L,n_L-1}$]

– $n_l$ rectified linear units (ReLU) $a_{l,i} = \max\{0, \sum_j w_{l,j,i}\, a_{l-1,j}\}$ in layer $l$
– backpropagating the error $\delta_{l-1,i} = \sum_{a_{l,j}>0} \delta_{l,j}\, w_{l,j,i}$

slide-67
SLIDE 67

Artificial Neural Networks in a Nutshell

Supervised learning of high dimensional function approximation

■ input layer $a_0$, $L-1$ hidden layers, and output layer $a_L$

[diagram: fully connected layers $a_{0,0},\dots,a_{0,n_0-1}$ through $a_{L,0},\dots,a_{L,n_L-1}$]

– $n_l$ rectified linear units (ReLU) $a_{l,i} = \max\{0, \sum_j w_{l,j,i}\, a_{l-1,j}\}$ in layer $l$
– backpropagating the error $\delta_{l-1,i} = \sum_{a_{l,j}>0} \delta_{l,j}\, w_{l,j,i}$, update the weights $w'_{l,j,i} = w_{l,j,i} - \lambda\, \delta_{l,j}\, a_{l-1,i}$ if $a_{l,j} > 0$
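The three formulas above can be mirrored in a few lines of plain Python; the nested-list weight layout `w[from][to]` is my choice, and, matching the slides, there are no bias terms:

```python
def forward(weights, a0):
    """ReLU layers: a_{l,i} = max(0, sum_j w[j][i] * a_{l-1,j})."""
    activations = [a0]
    for w in weights:                       # w[j][i]: weight from unit j to unit i
        prev = activations[-1]
        nxt = [max(0.0, sum(w[j][i] * prev[j] for j in range(len(prev))))
               for i in range(len(w[0]))]
        activations.append(nxt)
    return activations

def backward(weights, activations, delta_out, lam=0.01):
    """Backpropagate delta_{l-1,i} over active units only and apply the
    update w' = w - lam * delta_{l,j} * a_{l-1,i} if a_{l,j} > 0."""
    delta = delta_out
    for l in range(len(weights) - 1, -1, -1):
        w, a_prev, a_cur = weights[l], activations[l], activations[l + 1]
        new_delta = [sum(delta[j] * w[i][j] for j in range(len(delta)) if a_cur[j] > 0)
                     for i in range(len(a_prev))]
        for i in range(len(a_prev)):        # update after computing new_delta
            for j in range(len(delta)):
                if a_cur[j] > 0:
                    w[i][j] -= lam * delta[j] * a_prev[i]
        delta = new_delta
    return delta

weights = [[[1.0, -1.0], [2.0, 0.5]]]       # one layer, 2 inputs -> 2 units
acts = forward(weights, [1.0, 1.0])
print(acts[-1])                             # [3.0, 0.0]: the second unit is inactive
```

Because the second unit's activation is zero, neither its incoming weights nor its error contribute to the update, exactly as the $a_{l,j} > 0$ condition in the formulas prescribes.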

slide-68
SLIDE 68

Artificial Neural Networks in a Nutshell

Supervised learning of high dimensional function approximation

■ example architectures
– classifier

▸ Multilayer feedforward networks are universal approximators
▸ Approximation capabilities of multilayer feedforward networks
▸ Universal approximation bounds for superpositions of a sigmoidal function

slide-69
SLIDE 69

Artificial Neural Networks in a Nutshell

Supervised learning of high dimensional function approximation

■ example architectures
– classifier
– generator

▸ Multilayer feedforward networks are universal approximators
▸ Approximation capabilities of multilayer feedforward networks
▸ Universal approximation bounds for superpositions of a sigmoidal function

slide-70
SLIDE 70

Artificial Neural Networks in a Nutshell

Supervised learning of high dimensional function approximation

■ example architectures
– classifier
– generator
– auto-encoder

▸ Multilayer feedforward networks are universal approximators
▸ Approximation capabilities of multilayer feedforward networks
▸ Universal approximation bounds for superpositions of a sigmoidal function

slide-71
SLIDE 71

Efficient Training of Artificial Neural Networks

Using an integral equation for supervised learning

■ Q-learning

$$Q'(x,\omega) = (1-\alpha)\, Q(x,\omega) + \alpha\left(L_e(y,\omega) + \int_{S^2_+(y)} f_s(\omega_i,y,\omega)\cos\theta_i\, Q(y,\omega_i)\, d\omega_i\right)$$

slide-72
SLIDE 72

Efficient Training of Artificial Neural Networks

Using an integral equation for supervised learning

■ Q-learning

$$Q'(x,\omega) = (1-\alpha)\, Q(x,\omega) + \alpha\left(L_e(y,\omega) + \int_{S^2_+(y)} f_s(\omega_i,y,\omega)\cos\theta_i\, Q(y,\omega_i)\, d\omega_i\right)$$

for $\alpha = 1$ yields the residual, i.e. the loss

$$\Delta Q := Q(x,\omega) - \left(L_e(y,\omega) + \int_{S^2_+(y)} f_s(\omega_i,y,\omega)\cos\theta_i\, Q(y,\omega_i)\, d\omega_i\right)$$

slide-73
SLIDE 73

Efficient Training of Artificial Neural Networks

Using an integral equation for supervised learning

■ Q-learning

$$Q'(x,\omega) = (1-\alpha)\, Q(x,\omega) + \alpha\left(L_e(y,\omega) + \int_{S^2_+(y)} f_s(\omega_i,y,\omega)\cos\theta_i\, Q(y,\omega_i)\, d\omega_i\right)$$

for $\alpha = 1$ yields the residual, i.e. the loss

$$\Delta Q := Q(x,\omega) - \left(L_e(y,\omega) + \int_{S^2_+(y)} f_s(\omega_i,y,\omega)\cos\theta_i\, Q(y,\omega_i)\, d\omega_i\right)$$

■ supervised learning algorithm
– light transport paths generated by a low-discrepancy sequence for online training

slide-74
SLIDE 74

Efficient Training of Artificial Neural Networks

Using an integral equation for supervised learning

■ Q-learning

$$Q'(x,\omega) = (1-\alpha)\, Q(x,\omega) + \alpha\left(L_e(y,\omega) + \int_{S^2_+(y)} f_s(\omega_i,y,\omega)\cos\theta_i\, Q(y,\omega_i)\, d\omega_i\right)$$

for $\alpha = 1$ yields the residual, i.e. the loss

$$\Delta Q := Q(x,\omega) - \left(L_e(y,\omega) + \int_{S^2_+(y)} f_s(\omega_i,y,\omega)\cos\theta_i\, Q(y,\omega_i)\, d\omega_i\right)$$

■ supervised learning algorithm
– light transport paths generated by a low-discrepancy sequence for online training
– learn the weights of an artificial neural network for $Q(x,n)$ by back-propagating the loss of each path

▸ A machine learning driven sky model
▸ Global illumination with radiance regression functions
▸ Machine learning and integral equations
▸ Neural importance sampling

slide-75
SLIDE 75

Efficient Training of Artificial Neural Networks

Learning from noisy/sampled labeled data

■ find the set of weights $\theta$ of an artificial neural network $f$ to minimize the summed loss $L$
– using clean targets $y_i$ and data $\hat x_i$ distributed according to $\hat x \sim p(\hat x \mid y_i)$

$$\arg\min_\theta \sum_i L\big(f_\theta(\hat x_i), y_i\big)$$

slide-76
SLIDE 76

Efficient Training of Artificial Neural Networks

Learning from noisy/sampled labeled data

■ find the set of weights $\theta$ of an artificial neural network $f$ to minimize the summed loss $L$
– using clean targets $y_i$ and data $\hat x_i$ distributed according to $\hat x \sim p(\hat x \mid y_i)$

$$\arg\min_\theta \sum_i L\big(f_\theta(\hat x_i), y_i\big)$$

– using targets $\hat y_i$ distributed according to $\hat y \sim p(\hat y)$ instead

$$\arg\min_\theta \sum_i L\big(f_\theta(\hat x_i), \hat y_i\big)$$

slide-77
SLIDE 77

Efficient Training of Artificial Neural Networks

Learning from noisy/sampled labeled data

■ find the set of weights $\theta$ of an artificial neural network $f$ to minimize the summed loss $L$
– using clean targets $y_i$ and data $\hat x_i$ distributed according to $\hat x \sim p(\hat x \mid y_i)$

$$\arg\min_\theta \sum_i L\big(f_\theta(\hat x_i), y_i\big)$$

– using targets $\hat y_i$ distributed according to $\hat y \sim p(\hat y)$ instead

$$\arg\min_\theta \sum_i L\big(f_\theta(\hat x_i), \hat y_i\big)$$

⇒ allows for much faster training of artificial neural networks used in simulations
■ amounts to learning integration and integro-approximation

▸ Noise2Noise: Learning image restoration without clean data
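The reason noisy targets suffice is that for an L2 loss the minimizer is a (conditional) mean, so zero-mean noise on the targets barely moves the optimum. A one-parameter toy illustration (the target value 5.0 and the noise level are arbitrary choices):

```python
import random

random.seed(0)
clean = 5.0
noisy_targets = [clean + random.gauss(0.0, 1.0) for _ in range(100000)]

# The minimizer of sum_i (theta - y_i)^2 is the mean of the targets,
# so zero-mean target noise leaves the optimum (almost) unchanged.
theta = sum(noisy_targets) / len(noisy_targets)
print(theta)  # close to the clean value 5.0
```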

slide-78
SLIDE 78

Example Applications of Artificial Neural Networks in Rendering

Learning from noisy/sampled labeled data

■ denoising quasi-Monte Carlo rendered images
– noisy targets computed 2000× faster than clean targets

slide-79
SLIDE 79

Example Applications of Artificial Neural Networks in Rendering

Sampling according to a distribution given by observed data

■ generative adversarial network (GAN)

▸ image source
▸ Tutorial on GANs

slide-82
SLIDE 82

Example Applications of Artificial Neural Networks in Rendering

Sampling according to a distribution given by observed data

■ generative adversarial network (GAN)
– update the generator $G$ using

$$\nabla_{\theta_g} \sum_{i=1}^{m} \log\big(1 - D(G(\xi_i))\big)$$

▸ image source
▸ Tutorial on GANs

slide-83
SLIDE 83

Example Applications of Artificial Neural Networks in Rendering

Sampling according to a distribution given by observed data

■ generative adversarial network (GAN)
– update the discriminator $D$ ($k$ times) using

$$\nabla_{\theta_d} \frac{1}{m}\sum_{i=1}^{m} \big[\log D(x_i) + \log\big(1 - D(G(\xi_i))\big)\big]$$

– update the generator $G$ using

$$\nabla_{\theta_g} \sum_{i=1}^{m} \log\big(1 - D(G(\xi_i))\big)$$

▸ image source
▸ Tutorial on GANs

slide-84
SLIDE 84

Example Applications of Artificial Neural Networks in Rendering

Sampling according to a distribution given by observed data

■ Celebrity GAN

▸ Progressive growing of GANs for improved quality, stability, and variation

slide-85
SLIDE 85

Example Applications of Artificial Neural Networks in Rendering

Replacing simulations by learned predictions for more efficiency

■ much faster simulation of participating media
– hierarchical stencil of volume densities as input to the neural network

▸ Deep scattering: Rendering atmospheric clouds with radiance-predicting neural networks
▸ Learning particle physics by example: Accelerating science with generative adversarial networks

slide-86
SLIDE 86

Neural Networks linear in Time and Space

slide-87
SLIDE 87

Neural Networks linear in Time and Space

Complexity

■ the brain
– about $10^{11}$ nerve cells with up to $10^4$ connections to others

slide-88
SLIDE 88

Neural Networks linear in Time and Space

Complexity

■ the brain
– about $10^{11}$ nerve cells with up to $10^4$ connections to others
■ artificial neural networks
– number of neural units

$$n = \sum_{l=1}^{L} n_l \quad \text{where } n_l \text{ is the number of neurons in layer } l$$

slide-89
SLIDE 89

Neural Networks linear in Time and Space

Complexity

■ the brain
– about $10^{11}$ nerve cells with up to $10^4$ connections to others
■ artificial neural networks
– number of neural units

$$n = \sum_{l=1}^{L} n_l \quad \text{where } n_l \text{ is the number of neurons in layer } l$$

– number of weights

$$n_w = \sum_{l=1}^{L} n_{l-1}\cdot n_l$$

slide-90
SLIDE 90

Neural Networks linear in Time and Space

Complexity

■ the brain
– about $10^{11}$ nerve cells with up to $10^4$ connections to others
■ artificial neural networks
– number of neural units

$$n = \sum_{l=1}^{L} n_l \quad \text{where } n_l \text{ is the number of neurons in layer } l$$

– number of weights

$$n_w = \sum_{l=1}^{L} c\cdot n_l$$

– constrain to a constant number $c$ of weights per neuron

slide-91
SLIDE 91

Neural Networks linear in Time and Space

Complexity

■ the brain
– about $10^{11}$ nerve cells with up to $10^4$ connections to others
■ artificial neural networks
– number of neural units

$$n = \sum_{l=1}^{L} n_l \quad \text{where } n_l \text{ is the number of neurons in layer } l$$

– number of weights

$$n_w = \sum_{l=1}^{L} c\cdot n_l = c\cdot n$$

– constrain to a constant number $c$ of weights per neuron to reach complexity linear in $n$
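A quick sketch of the counts above for a small fully connected network (the layer sizes and the constant $c$ are arbitrary examples):

```python
def num_units(layer_sizes):
    """n = sum of n_l over layers l = 1..L (the input layer is not counted)."""
    return sum(layer_sizes[1:])

def num_weights(layer_sizes):
    """Fully connected: n_w = sum of n_{l-1} * n_l."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

def num_weights_sparse(layer_sizes, c):
    """With a constant number c of weights per neuron: n_w = c * n."""
    return c * num_units(layer_sizes)

sizes = [784, 256, 256, 10]
print(num_units(sizes), num_weights(sizes), num_weights_sparse(sizes, 32))
```

The sparse count grows linearly in the number of units, while the fully connected count grows with the products of consecutive layer widths.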

slide-92
SLIDE 92

Neural Networks linear in Time and Space

Sampling proportional to the weights of the trained neural units

⌅ partition of unit interval by sums Pk := ∑k

j=1 |wj| of normalized absolute weights

P0 1 Pm P1 P2 Pm1 w1 w2 wm

30

slide-93
SLIDE 93

Neural Networks linear in Time and Space

Sampling proportional to the weights of the trained neural units

⌅ partition of unit interval by sums Pk := ∑k

j=1 |wj| of normalized absolute weights

P0 1 Pm P1 P2 Pm1 w1 w2 wm

– using a uniform random variable ξ 2 [0,1) to

select input i , Pi1  ξ < Pi satisfying Prob({Pi1  ξ < Pi}) = |wi|

30

slide-94
SLIDE 94

Neural Networks linear in Time and Space

Sampling proportional to the weights of the trained neural units

⌅ partition of the unit interval by the sums P_k := ∑_{j=1}^{k} |w_j| of the normalized absolute weights

[Diagram: partition 0 = P_0 < P_1 < P_2 < … < P_{m−1} < P_m = 1, with segments of lengths |w_1|, |w_2|, …, |w_m|]

  – using a uniform random variable ξ ∈ [0,1) to select input i ⇔ P_{i−1} ≤ ξ < P_i, satisfying Prob({P_{i−1} ≤ ξ < P_i}) = |w_i|

⌅ in fact a derivation of quantization to ternary weights in {−1, 0, +1}
  – integer weights result from neurons referenced more than once
  – relation to DropConnect and Dropout
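The selection by inverting the partition P_k might be sketched as follows; the function name `sample_input` and the example weights are illustrative, not from the course:

```python
def sample_input(weights, xi):
    """Select input i such that P_{i-1} <= xi < P_i, where P_k is the sum of
    the first k normalized absolute weights, so Prob(i) = |w_i| / sum_j |w_j|."""
    total = sum(abs(w) for w in weights)
    P = 0.0
    for i, w in enumerate(weights):
        P += abs(w) / total
        if xi < P:
            return i
    return len(weights) - 1  # guard against floating-point rounding

# Propagating only the sign of the selected weight yields a ternary
# weight in {-1, 0, +1}; repeated selections accumulate integer weights.
w = [0.5, -0.3, 0.2]
i = sample_input(w, 0.6)           # 0.6 falls into the second segment
ternary = 1 if w[i] > 0 else -1
```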

slide-95
SLIDE 95

Neural Networks linear in Time and Space

Sampling proportional to the weights of the trained neural units

[Plot: test accuracy vs. percent of fully connected layers sampled, for LeNet on MNIST, LeNet on CIFAR-10, AlexNet on CIFAR-10, and AlexNet on ILSVRC12 (top-1 and top-5 accuracy)]

slide-96
SLIDE 96

Neural Networks linear in Time and Space

Sampling paths through networks

⌅ complexity bounded by the number of paths times the depth L of the network

slide-97
SLIDE 97

Neural Networks linear in Time and Space

Sampling paths through networks

⌅ complexity bounded by the number of paths times the depth L of the network
⌅ application after training
  – backwards random walks using sampling proportional to the weights of a neuron
  – compression and quantization by importance sampling

slide-98
SLIDE 98

Neural Networks linear in Time and Space

Sampling paths through networks

⌅ complexity bounded by the number of paths times the depth L of the network
⌅ application after training
  – backwards random walks using sampling proportional to the weights of a neuron
  – compression and quantization by importance sampling
⌅ application before training
  – uniform (bidirectional) random walks to connect inputs and outputs
  – sparse from scratch
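A minimal sketch of sampling paths before training, assuming plain uniform forward walks; the function name and the edge-set representation are my own, not from the course:

```python
import random

def sample_sparse_edges(layers, n_paths, seed=0):
    """Uniform random walks from a random input unit to an output unit;
    the union of traversed edges defines the sparse connectivity that is
    then trained from scratch."""
    rng = random.Random(seed)
    edges = [set() for _ in range(len(layers) - 1)]  # one edge set per layer
    for _ in range(n_paths):
        i = rng.randrange(layers[0])                 # start at a random input
        for l in range(len(layers) - 1):
            j = rng.randrange(layers[l + 1])         # step into the next layer
            edges[l].add((i, j))
            i = j
    return edges

# Each path traverses one edge per layer, so the total number of edges is
# at most n_paths * L: complexity bounded by paths times depth.
edges = sample_sparse_edges([784, 300, 300, 10], n_paths=64)
```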

slide-99
SLIDE 99

Neural Networks linear in Time and Space

Sampling paths through networks

⌅ sparse from scratch

[Diagram: network with layers of units a_{l,0}, a_{l,1}, a_{l,2}, …, a_{l,n_l−1} for l = 0, …, L]

slide-100
SLIDE 100

Neural Networks linear in Time and Space

Sampling paths through networks

⌅ sparse from scratch

[Diagram: network with layers of units a_{l,0}, a_{l,1}, a_{l,2}, …, a_{l,n_l−1} for l = 0, …, L; unit a_{2,1} emphasized in this build]

slide-101
SLIDE 101

Neural Networks linear in Time and Space

Sampling paths through networks

⌅ sparse from scratch

[Diagram: network with layers of units a_{l,0}, a_{l,1}, a_{l,2}, …, a_{l,n_l−1} for l = 0, …, L, with sampled paths connecting inputs to outputs]

  – guaranteed connectivity ▶ Monte Carlo methods and neural networks


slide-105
SLIDE 105

Neural Networks linear in Time and Space

Sampling paths through networks

⌅ sparse from scratch

[Diagram: network with layers of units a_{l,0}, a_{l,1}, a_{l,2}, …, a_{l,n_l−1} for l = 0, …, L, with sampled paths connecting inputs to outputs]

  – guaranteed connectivity and coverage ▶ Monte Carlo methods and neural networks
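One way the coverage guarantee could be realized is to seed one walk per input unit and one backward walk per output unit, so every input and output lies on at least one path; this bidirectional scheme and the function name are assumptions for illustration:

```python
import random

def sample_paths_with_coverage(layers, seed=0):
    """Bidirectional sampling: one forward walk per input unit and one
    backward walk per output unit, guaranteeing every input and output
    is connected through the sparse network."""
    rng = random.Random(seed)
    edges = [set() for _ in range(len(layers) - 1)]
    for i in range(layers[0]):                     # forward: cover all inputs
        u = i
        for l in range(len(layers) - 1):
            v = rng.randrange(layers[l + 1])
            edges[l].add((u, v))
            u = v
    for j in range(layers[-1]):                    # backward: cover all outputs
        v = j
        for l in reversed(range(len(layers) - 1)):
            u = rng.randrange(layers[l])
            edges[l].add((u, v))
            v = u
    return edges

edges = sample_paths_with_coverage([4, 3, 2])
```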

slide-106
SLIDE 106

Neural Networks linear in Time and Space

Test accuracy for a 4-layer feedforward network (784/300/300/10) trained sparse from scratch

[Plot: test accuracy vs. number of per-pixel paths through the network (20 to 300), for MNIST and Fashion MNIST]

slide-107
SLIDE 107

From Machine Learning to Graphics and back

Summary

⌅ light transport and reinforcement learning are described by the same integral equation
  – learn where radiance comes from
⌅ neural networks of linear complexity by path tracing
  – ternarization and quantization of trained artificial neural networks
  – sparse from scratch training