SLIDE 1

Time-delay reservoir computers: nonlinear stability of functional differential systems and optimal nonlinear information processing capacity. Applications to stochastic nonlinear time series forecasting.

Lyudmila Grigoryeva (1), Julie Henriques (2), Laurent Larger (2), Juan-Pablo Ortega (3,4)

(1) Universität Konstanz, Germany
(2) Université Bourgogne Franche-Comté, France
(3) Universität Sankt Gallen, Switzerland
(4) CNRS, France

Financial and Insurance Mathematics Seminar

L. Grigoryeva, J. Henriques, L. Larger, J.-P. Ortega (Universität Konstanz, Université Bourgogne Franche-Comté, Universität Sankt Gallen, CNRS) · Time-delay reservoir computers · DarrylFest, July 2017 · 1 / 71

SLIDE 2

Outline

1. Machine learning in a nutshell
   - Discrete vs. continuous time
   - Deterministic vs. stochastic
2. Static problems, neural networks, and approximation theorems
3. Dynamic problems and reservoir computing
4. Universality theorems
   - The control-theoretical approach
   - The filter/operator approach
5. Time-delay reservoir computers
   - Hardware realizations, scalability, and big-data compatibility
   - Models and performance estimations
6. Application examples

SLIDE 3

Machine learning in a nutshell

We approach machine learning as an input/output problem.

Input: denoted by z. It contains the information available for solving the problem (historical data, explanatory factors, features of the individuals to be classified).

Output: denoted generically by y. It contains the solution of the problem (forecasted data, explained variables, classification results).

The approach is purely empirical: it is based not on first principles but on a training/testing routine. We distinguish between static/discrete-time and continuous-time setups, and between deterministic and stochastic situations, since they lead to very different levels of mathematical complexity.

SLIDE 4

Machine learning in a nutshell

Examples

Deterministic setup: an explicit functional relation (via a merely measurable function) is assumed between input and output.

- Static/discrete time: observables or diagnostic variables in complex physical or noiseless engineering systems (domotics), translators, memory tasks, games.
- Continuous time: integration or path continuation of (chaotic) differential equations: molecular dynamics, structural mechanics, vibration analysis, space mission design, autopilot systems, robotics.

Stochastic setup: the input and the output are random variables or processes, and only a probabilistic dependence is assumed between them.

- Static/discrete time: image classification, speech recognition, time series forecasting, volatility filtering, factor analysis.
- Continuous time: physiological time series classification, financial bubble detection.

SLIDE 5

Machine learning in a nutshell

Setups considered

Static/discrete time, deterministic:
  Ingredients: z ∈ R^n, y ∈ R^q
  Problem to be solved: y = f(z), f measurable
  Object to be trained: real/complex function
  Approach and source of universality: approximation theory (Stone-Weierstraß)

Static/discrete time, stochastic:
  Ingredients: z ∈ (L^2(Ω, F, P))^n, y ∈ (L^2(Ω, F, P))^q
  Problem to be solved: E[y | z]
  Object to be trained: conditional expectation
  Approach and source of universality: (semi-)parametric statistics

Continuous time, deterministic:
  Ingredients: z ∈ C^∞([a, b], R^n), y ∈ C^∞([a, b], R^q)
  Problem to be solved: y(·) = F(z(·))
  Object to be trained: functional/operator, causal filter
  Approach and source of universality: control theory, functional data analysis

Continuous time, stochastic:
  Ingredients: z and y are R^n- and R^q-valued processes adapted to a given filtration F
  Problem to be solved: E[y(·) | z(·)]
  Object to be trained: stochastic causal filter
  Approach and source of universality: Kalman filter, stochastic control theory

SLIDE 6

Static problems, neural networks, and approximation theorems The deterministic case

Neural networks

(Figure: feedforward network with input layer z_1, ..., z_4, one hidden layer of five neurons with weights w^1, and output y with weights w^2.)

y = \psi\left( \sum_{i=1}^{5} w^2_i \, \psi\left( \sum_{j=1}^{4} w^1_{ij} z_j \right) \right),   \psi a sigmoid function.   (1)
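Equation (1) is small enough to evaluate directly. The sketch below does so for the slide's 4-input, 5-hidden-unit network; the weight values are random and purely illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def two_layer_net(z, W1, w2):
    """Evaluate eq. (1): y = psi(sum_i w2_i * psi(sum_j W1_ij z_j))."""
    hidden = sigmoid(W1 @ z)      # five hidden activations
    return sigmoid(w2 @ hidden)   # scalar output

rng = np.random.default_rng(0)
z = rng.standard_normal(4)        # four inputs, as in the slide's figure
W1 = rng.standard_normal((5, 4))  # input-to-hidden weights w^1_{ij}
w2 = rng.standard_normal(5)       # hidden-to-output weights w^2_i
y = two_layer_net(z, W1, w2)
assert 0.0 < y < 1.0              # a sigmoid output lies in (0, 1)
```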

SLIDE 7

Static problems, neural networks, and approximation theorems The deterministic case

Universality in neural networks and approximation theorems

Neural networks are implemented as a machine learning device by tuning the weights w_i with a gradient descent algorithm (backpropagation) that minimizes the approximation error on a training set. In the deterministic case, the objective is to recover an explicit functional relation between input and output; in the absence of noise there is no danger of overfitting.

SLIDE 8

Static problems, neural networks, and approximation theorems The deterministic case

Universality problem: how large is the class of input-output functions that can be generated using feedforward neural networks as in (1)?

Hilbert's 13th problem on multivariate functions: can every continuous function of three variables be expressed as a composition of finitely many continuous functions of two variables? This question is a generalization of the original problem for algebraic functions, posed at the 1900 ICM in Paris and in [Hil27].

SLIDE 9

Static problems, neural networks, and approximation theorems The deterministic case

The Kolmogorov-Arnold representation theorem and Kolmogorov-Sprecher networks

Theorem (Kolmogorov-Arnold [Kol56, Arn57]). There exist fixed continuous increasing functions \varphi_{pq}(x) on I = [0, 1] such that each continuous function f on I^n can be written as

f(x_1, ..., x_n) = \sum_{q=1}^{2n+1} g_q\left( \sum_{p=1}^{n} \varphi_{pq}(x_p) \right),

where the g_q are properly chosen continuous functions of one variable.

This amounts to saying that the only genuinely multivariate function is the sum! Note that this is a representation theorem, not an approximation theorem.

SLIDE 10

Static problems, neural networks, and approximation theorems The deterministic case

Theorem (Sprecher [Spr65, Spr96, Spr97]). There exist constants \lambda_p and fixed continuous increasing functions \varphi_q(x) on I = [0, 1] such that each continuous function f on I^n can be written as

f(x_1, ..., x_n) = \sum_{q=1}^{2n+1} g_q\left( \sum_{p=1}^{n} \lambda_p \varphi_q(x_p) \right),

where the g_q are properly chosen continuous functions of one variable.

The g_q depend on f, but the \lambda_p and \varphi_q do not. All the information contained in the multivariate continuous function f is thus encoded in the single-variable continuous functions g_q. This is not ideal for machine learning applications, because we would need to train the g_q functions; it can still be done (see the CMAC in [CG92]).

SLIDE 11

Static problems, neural networks, and approximation theorems The deterministic case

The Kolmogorov-Sprecher network (taken from [CG92])

SLIDE 12

Static problems, neural networks, and approximation theorems The deterministic case

The Cybenko and the Hornik et al. theorems

Definition. A squashing function is a non-decreasing map \psi : R → [0, 1] such that lim_{\lambda → −∞} \psi(\lambda) = 0 and lim_{\lambda → ∞} \psi(\lambda) = 1.

SLIDE 13

Static problems, neural networks, and approximation theorems The deterministic case

Approximation of continuous functions

Theorem (Cybenko [Cyb89]). Let \psi be a continuous squashing function. Then the functions G_{\psi,N} : I^n → R of the form

G_{\psi,N}(z; \theta) = \sum_{j=1}^{N} w^2_j \, \psi( \langle w^1_j, z \rangle + \theta_j ),   w^1_j, z ∈ R^n, w^2 ∈ R^N, \theta_j ∈ R,

are dense in C(I^n); that is, given any function f ∈ C(I^n) and \epsilon > 0, there is a sum of this type for which |G_{\psi,N}(z; \theta) − f(z)| < \epsilon for all z ∈ I^n.

This result proves that any continuous function can be approximated by a feedforward neural network with a single hidden layer built from a given continuous activation function.
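Cybenko's theorem is an existence statement, but its content is easy to see numerically. The sketch below even fixes the inner weights w^1_j and offsets \theta_j at random and fits only the outer weights w^2 by least squares (a simplification; the theorem lets all parameters vary); the target cos(2πz) and the parameter ranges are illustrative choices:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
N = 200                               # hidden units
zs = np.linspace(0.0, 1.0, 400)       # grid on I = [0, 1]
f = np.cos(2 * np.pi * zs)            # target continuous function

w1 = rng.uniform(-20, 20, N)          # inner weights w^1_j, fixed at random
th = rng.uniform(-20, 20, N)          # offsets theta_j
H = sigmoid(np.outer(zs, w1) + th)    # hidden activations, shape (400, N)
w2, *_ = np.linalg.lstsq(H, f, rcond=None)  # outer weights by least squares

err = np.max(np.abs(H @ w2 - f))      # sup-norm error on the grid
assert err < 0.05
```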

SLIDE 14

Static problems, neural networks, and approximation theorems The deterministic case

Approximation of measurable functions and functions with finite support

The Hornik, Stinchcombe, and White [HSW89] theorems:

- The previous theorem holds even if the function f is only measurable and the activation function is a (not necessarily continuous) squashing function.
- Functions with finite support can be attained exactly by a feedforward neural network with a single hidden layer if the activation function attains 0 and 1: let {z_1, ..., z_k} be a set of distinct points in R^n and let f : R^n → R be an arbitrary function; then there exists a feedforward neural network with k neurons in its hidden layer and transfer function G_{\psi,N} such that G_{\psi,N}(z_i; \theta) = f(z_i) for all i.
- The network can be trained so that it learns not only the function but also its derivatives [HSW90, GW92].
- This result has been extended to backpropagation (as opposed to feedforward) neural networks by Hecht-Nielsen [HN89].
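The finite-support statement can be rendered concretely: with k distinct points, k steep sigmoid units with thresholds placed between consecutive points give a nonsingular activation matrix, and a k×k linear solve for the output weights fits the k prescribed values exactly. The slope and threshold placement below are illustrative choices, not from the slides:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
z = np.sort(rng.uniform(0.0, 1.0, 5))        # k = 5 distinct points
f = rng.standard_normal(5)                   # arbitrary target values f(z_i)

slope = 1e4                                  # steep sigmoids approximate steps
# unit j switches on between z_{j-1} and z_j, so at the sample points the
# activation matrix is numerically close to lower triangular with 1s
thresholds = np.concatenate(([z[0] - 1.0], (z[:-1] + z[1:]) / 2))
H = sigmoid(slope * (z[:, None] - thresholds[None, :]))  # shape (5, 5)
w = np.linalg.solve(H, f)                    # k hidden units, exact fit
assert np.max(np.abs(H @ w - f)) < 1e-6
```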

SLIDE 15

Static problems, neural networks, and approximation theorems The deterministic case

The Maurey-Jones-Barron Theorem

Let G be a set of approximating functions: splines with free nodes, trigonometric polynomials with free frequencies, feedforward neural networks. Variable-basis approximation consists of using the set

span_n G := { \sum_{i=1}^{n} w_i g_i  |  w_i ∈ R, g_i ∈ G }.

When G is a subset of a normed linear space (X, ‖·‖), we use the G-variation

‖f‖_G := inf { c > 0  |  f/c ∈ cl conv (G ∪ −G) }.

SLIDE 16

Static problems, neural networks, and approximation theorems The deterministic case

Theorem. Let (X, ‖·‖) be a Hilbert space, G a bounded subset, and s_G := sup_{g ∈ G} ‖g‖. For every f ∈ X and every positive integer n,

‖f − span_n G‖ ≤ \sqrt{ \frac{(s_G ‖f‖_G)^2 − ‖f‖^2}{n} }.

Any function in a ball of radius r in G-variation can therefore be approximated by a neural network with n hidden units computing functions from G within accuracy r/\sqrt{n}. This estimate holds for any number of variables: there is no curse of dimensionality.

SLIDE 17

Static problems, neural networks, and approximation theorems The deterministic case

Implementation

It involves three main issues:

1. Choice of an architecture: squashing function, number of layers, number of neurons in each layer, and connectivity between them.
2. Estimation of the connectivity weights: a supervised learning approach is taken; realizations of the input and the output are used to minimize an error function via a gradient descent method. Potential problems: local minima and flat gradients in deep structures.
3. Cross-validation and regularization: a posteriori verification of the goodness of the architecture chosen in the first point:
   - Deterministic case: is this the most economical structure for a prescribed accuracy level in the approximation problem?
   - Stochastic case: are we overfitting?
   In both cases the solution is obtained using new architectures selected via cross-validation or pruning techniques (see [KvD03] for references and [SX99, SCHU16] for Lasso-related approaches).

SLIDE 18

Static problems, neural networks, and approximation theorems The stochastic case

Non-linear regressions

The deterministic universal approximation properties of neural networks yield non-parametric estimators for non-linear regression functions. Consider the following heteroscedastic regression model:

y_t = f(z_t) + \epsilon_t,   {z_t} ∼ IID(p(z)),   \epsilon_t | (z_t = z) ∼ IID(0, s^2_\epsilon(z) < ∞),   t = 1, ..., T,   (2)

and assume that the functions f, s^2_\epsilon : R^n → R are continuous and bounded. Notice that the hypotheses in (2) imply that, in this case, E[y_t | z_t] = f(z_t). In order to estimate the regression function f, we fit a neural network with one hidden layer and a sufficiently large number N of neurons using realizations {z_1, ..., z_T} and {y_1, ..., y_T} of the input and the output, which yields the following estimator \hat{\theta}_N of the weights vector \theta:

\hat{\theta}_N = arg min_\theta \frac{1}{T} \sum_{t=1}^{T} \{ y_t − G_{\psi,N}(z_t; \theta) \}^2.   (3)
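A minimal numerical sketch of the regression setup (2)-(3), with one simplification: the inner weights are drawn at random and kept fixed, so the minimization in (3) reduces to least squares over the outer weights instead of full gradient descent over all of \theta. The target f and noise scale are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
T, N = 2000, 100
z = rng.uniform(-2, 2, T)
f_true = np.tanh(2 * z)                       # regression function f
# heteroscedastic noise: conditional variance grows with |z|
y = f_true + 0.1 * (1 + np.abs(z)) * rng.standard_normal(T)

w1 = rng.uniform(-5, 5, N)                    # random inner weights, kept fixed
th = rng.uniform(-5, 5, N)
H = sigmoid(np.outer(z, w1) + th)
w2, *_ = np.linalg.lstsq(H, y, rcond=None)    # outer weights minimize (3) given w1, th

mse = np.mean((H @ w2 - f_true) ** 2)         # error against the true f, not the noisy y
assert mse < 0.01
```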

SLIDE 19

Static problems, neural networks, and approximation theorems The stochastic case

Under appropriate conditions, \hat{\theta}_N converges in probability, as T → ∞ and for fixed N, to the parameter vector \theta_N that corresponds to the best approximation of f(z) by a function of type G_{\psi,N}(z; \theta), that is,

\theta_N = arg min_\theta E[\{ f(z_t) − G_{\psi,N}(z_t; \theta) \}^2].   (4)

Under somewhat stronger assumptions, asymptotic normality of the estimator can be shown [FN00].

Remark: many important examples, such as nonlinear state-space models (ARSV, for instance), do not satisfy the independence hypothesis on the input signal z_t, or there is simply no such function f, which makes necessary the use of other tools like the nonlinear Kalman filter.

SLIDE 20

Dynamic problems and reservoir computing The deterministic case

Offline and online computing

Turing machines compute batches of information sequentially and offline; computations have a beginning and an end. Neuronal and "behaving" systems compute online, as information arrives (possibly desynchronized and with different sampling frequencies), and reuse the results of previous computations.

(Figure: offline vs. online computations, taken from [Maa11].)

SLIDE 21

Dynamic problems and reservoir computing The deterministic case

Mathematical formulation of reservoir computing

Reservoir computing is based on three main principles:

- The input signal z(t) ∈ R^n is inserted as the external forcing of the flow F_t : R^N × R^n → R^N of a non-autonomous dynamical system (the reservoir): x(t) = F_t(x_0, z(t)). (5) The value x(t) is the reservoir state at time t.
- A static readout h : R^N → R^q is trained in order to obtain the desired output y(t) out of the input z(t): y(t) = h(x(t)).
- Multitasking: different readouts can be trained on the same reservoir output in order to extract different pieces of information about the input.
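These principles can be sketched with a discrete-time reservoir in a few lines: a fixed random recurrent map plays the role of F_t and a linear readout h is trained by ridge regression, here on a one-step memory task y(t) = z(t−1). The spectral-radius rescaling and the task are illustrative choices in the style of echo state networks, not prescriptions from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 100, 2000
z = rng.uniform(-1, 1, T)                        # scalar input signal

W = rng.standard_normal((N, N))
W *= 0.8 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius < 1
w_in = rng.uniform(-1, 1, N)

x = np.zeros(N)
X = np.zeros((T, N))
for t in range(T):                               # drive the reservoir with z
    x = np.tanh(W @ x + w_in * z[t])
    X[t] = x

# train a linear readout to recall the previous input, y(t) = z(t-1)
y = np.roll(z, 1); y[0] = 0.0
lam = 1e-6
w_out = np.linalg.solve(X.T @ X + lam * np.eye(N), X.T @ y)
mse = np.mean((X @ w_out - y)[100:] ** 2)        # skip the initial transient
assert mse < 1e-2
```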

SLIDE 22

Dynamic problems and reservoir computing The deterministic case

A fundamentally new approach to neural computing [Jae01, JH04, MNM02, VSDS07, LJ09]. Defining features of RC: the fading-memory, separation, and approximation properties [LJ09]. It is a modification of the traditional RNN in which the architecture and the neuron weights of the network are created in advance (for example, randomly) and remain unchanged during the training stage. If the readout layer is linear, then inference and theoretical performance evaluation become possible.

SLIDE 23

Dynamic problems and reservoir computing The deterministic case

Reservoir computing and neural processes

The reservoir computing approach resembles neural processes in which sensory inputs (input signals) are pre-processed by the neural microcircuits of a cortical column, and various single neurons (readouts) then extract information from them and send it to other brain areas. The use of different readouts serves different computational goals; in the case of the visual cortex, for example: determining size, direction of motion, and identity of objects. The division of information processing between reservoir and readout is very efficient (one processing step serves several computational goals) and helps explain the energy efficiency of the brain. Neurophysiological evidence: spike trains coming from different projection neurons of the same cortical column tend to be weakly correlated.

SLIDE 24

Universality Theorems The control theoretical approach

Universality result for neural circuits [MJS07]

Suppose that we are given an external continuous-time input z(t) and a solution y(t) of a non-autonomous nth-order differential equation of the form

y^{(n)}(t) = G(y(t), y′(t), y′′(t), ..., y^{(n−1)}(t)) + z(t).   (6)

Then, for any non-autonomous dynamical system of the form

\dot{x}(t) = f(x(t)) + g(x(t)) · v(t),   f, g : R^n → R^n,   (7)

that has the fading memory property (see below), there exist a feedback K : R^n × R → R and a smooth readout h : R^n → R such that any solution y(t) of (6) can be written as y(t) = h(x(t)), with x(t) the solution of the system

\dot{x}(t) = f(x(t)) + g(x(t)) · K(x(t), z(t) + z_0(t)),   x(0) = 0,

where z_0(t) is a fixed input satisfying z_0(t) = 0 for all t ≥ 1.

SLIDE 25

Universality Theorems The control theoretical approach

The proof of this result is control-theory based. The feedback K and the readout h depend only on the function G that characterizes the system to be simulated, not on the external input z(t) to be processed. Since these two functions are static, they are ideal targets for learning; in many situations h is chosen to be linear, and training is carried out by solving a simple (regularized) regression problem. This result shows that RCs constructed out of dynamical systems of the form (7), together with suitable feedback and readout functions, have the computational power of a universal Turing machine. This follows from the fact that every Turing machine can be simulated by systems of equations of the form (6) (see [Bra95, SS94, SS92, Orp97]). The dynamical systems (7) include as a particular case the standard systems of nonlinear differential equations used to model the dynamics of firing rates in recurrent circuits of neurons.

SLIDE 26

Universality Theorems The operator approach

The filtering/operator point of view [MNM02]

We now use operators L : C^0(R^n) → C^0(R^N) instead of flows F_t as in (5) in order to transform the input signal into the reservoir state curve: L(z(·)) = x(·).

Time invariance: let U^n_{t_0} be the time-shift operator for curves in R^n, that is, U^n_{t_0}(z(·))(t) = z(t + t_0). The filter L is called time invariant if L ∘ U^n_{t_0} = U^N_{t_0} ∘ L.

Causality: L is causal if L(z(·))(t) does not depend on z(s) for s > t.

Fading memory property: L(z(·))(0) can be approximated by the outputs L(u(·))(0) for any other input u that approximates z on a sufficiently long time interval [−T, 0] going back into the past.

SLIDE 27

Universality Theorems The operator approach

Equivalently, in order to compute the most significant bits of L(z(·))(0) it is not necessary to know the precise value of the input function z at any time s, nor to know anything about the values of z beyond a finite time interval into the past. Fading memory filters are automatically causal. The category of time-invariant fading memory filters is large and includes well-known examples like Volterra series; it can actually be shown [BC85, MS00] that any time-invariant fading memory filter can be approximated by a (possibly infinite) Volterra series.

Separation property: a class L of filters has the separation property if for any two inputs z and u such that z(s) ≠ u(s) for some s ≤ t, there exists L ∈ L such that L(z)(t) ≠ L(u)(t).
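A one-line example of a causal, time-invariant fading memory filter is the exponential moving average. The sketch below checks numerically that two inputs agreeing only on the recent past produce nearly identical present outputs; the decay factor 0.9 is an arbitrary choice:

```python
import numpy as np

def exp_filter(z, a=0.9):
    """Causal, time-invariant filter with fading memory:
    x[t] = a * x[t-1] + (1 - a) * z[t]."""
    x = 0.0
    out = []
    for v in z:
        x = a * x + (1 - a) * v
        out.append(x)
    return np.array(out)

rng = np.random.default_rng(1)
T = 200
z = rng.standard_normal(T)
u = rng.standard_normal(T)
u[100:] = z[100:]          # u agrees with z on the recent past only
xz, xu = exp_filter(z), exp_filter(u)
# the distant past is forgotten geometrically (factor a per step), so the
# two present outputs nearly coincide
assert abs(xz[-1] - xu[-1]) < 1e-3
```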

SLIDE 28

Universality Theorems The operator approach

Universality theorem [MM04]

Theorem. Let F be an arbitrary time-invariant filter that satisfies the fading memory property. Assume the availability of a space L of fading memory filters that satisfies the pointwise separation property. Then, for any chosen accuracy, there exist m ∈ N, filters L_1, ..., L_m in the space L, and a readout function h : R^m → R such that F can be approximated by the composition h ∘ (L_1, ..., L_m).

SLIDE 29

Universality Theorems The operator approach

Observations

Examples

Dynamic networks [MS00]: feedforward neural networks with time-varying weights. Synaptic dynamical models by Tsodyks, Pawelzik, and Markram [TPM98].

Remarkable consequence: for a large variety of classes L of basis filters (such as delay lines, linear filters, dynamic synapses, or circuits with fading memory), the pointwise separation property, in combination with sufficiently "flexible" readout maps, endows the resulting RC with universal computational power in the giant class of filters that are time-invariant and have fading memory. Fading memory filters only generate fading memory filters, which is not the case in the neural circuit approach; this is a major computational jump. The proof goes via the Stone-Weierstrass theorem.

SLIDE 30

Time-delay reservoir computers

TDRs are based on the interaction of the discrete input signal z(t) ∈ R with the solution space of a time-delay differential equation (TDDE) of the form

\dot{x}(t) = −x(t) + f(x(t − τ), I(t), θ),   (8)

where f is a nonlinear smooth function (the nonlinear kernel), θ ∈ R^K is the parameter vector, τ > 0 is the delay, x(t) ∈ R, and I(t) ∈ R is obtained via temporal multiplexing of the input signal z(t) over the delay period. The choice of nonlinear kernel f is determined by the physical implementation; we consider two parametric families of kernels:

Mackey-Glass [MG77]: f(x, I, θ) = η(x + γI) / (1 + (x + γI)^p),   θ = (η, γ, p)

Ikeda [Ike79]: f(x, I, θ) = η sin^2(x + γI + φ),   θ = (η, γ, φ)

These kernels are used in the electronic [ASV+11] and optoelectronic [LSB+12] RC realizations.

SLIDE 31

Time-delay reservoir computers

Discrete time model of TDR

Consider the Euler time-discretization of (8) with integration step d := τ/N:

(x(t) − x(t − d))/d = −x(t) + f(x(t − τ), I(t), θ).   (9)

Define neuron layers x(t) ∈ R^N and input layers I(t) := Cz(t) ∈ R^N by setting x_i(t) := x(tτ − (N − i)d), I_i(t) := I(tτ − (N − i)d), i ∈ {1, ..., N}, t ∈ Z, where x_i(t) is the ith neuron value of the tth layer of the reservoir. Then the solutions of (9) are given by

x_i(t) = e^{−ξ} x_{i−1}(t) + (1 − e^{−ξ}) f(x_i(t − 1), I_i(t), θ),   x_0(t) := x_N(t − 1),   ξ := log(1 + d).

A smooth map F : R^N × R^N × R^K → R^N hence specifies the neuron values as a recursion,

x(t) = F(x(t − 1), I(t), θ),   (10)

where F is constructed out of the nonlinear kernel map f; F is referred to as the reservoir map.
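The recursion obtained from the Euler scheme can be sketched directly; the Mackey-Glass kernel and all parameter values below (η, γ, p, τ, N) are illustrative choices, not the ones used later in the talk:

```python
import numpy as np

def mackey_glass(x, I, eta=0.9, gamma=0.5, p=2):
    """Mackey-Glass nonlinear kernel f(x, I, theta)."""
    u = x + gamma * I
    return eta * u / (1 + np.abs(u) ** p)

def tdr_step(x_prev, I_t, d):
    """One layer of the TDR recursion derived from the Euler scheme (9)."""
    N = x_prev.size
    xi = np.log(1 + d)
    e = np.exp(-xi)                   # e^{-xi} = 1 / (1 + d)
    x = np.empty(N)
    left = x_prev[-1]                 # x_0(t) := x_N(t - 1)
    for i in range(N):
        x[i] = e * left + (1 - e) * mackey_glass(x_prev[i], I_t[i])
        left = x[i]
    return x

rng = np.random.default_rng(0)
N, T = 50, 200
d = 1.0 / N                           # tau = 1, so d = tau / N
C = rng.uniform(-1, 1, N)             # input mask
z = rng.uniform(-1, 1, T)
x = np.zeros(N)
for t in range(T):
    x = tdr_step(x, C * z[t], d)      # I(t) = C z(t): temporal multiplexing
assert np.max(np.abs(x)) < 1.0        # states stay bounded for these parameters
```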

SLIDE 32

Time-delay reservoir computers

(Figure) Architecture of the time-delay reservoir (TDR) and the three modules of the reservoir computer (RC): the input layer A (mask C applied to z_1, ..., z_T), the time-delay reservoir B (neuron layers x_1(t), ..., x_N(t)), and the readout layer C (W_out).

SLIDE 33

Time-delay reservoir computers

Input and output modules

Input: take a multi-dimensional time series z(t) ∈ R^n as the input signal. For each t define I(t) := Cz(t) ∈ R^N, where C ∈ M_{N,n} is the so-called input mask that takes care of the dimensional and temporal multiplexing.

Output: let the training be carried out with a teaching signal y(t) ∈ R^n that is used to construct a readout W_out out of the solution of the ridge regression

W_out := arg min_{W ∈ M_{N,n}} \sum_{t=1}^{T^*} ‖W^⊤ x(t) − y(t)‖^2 + λ ‖W‖^2_{Frob},   (11)

whose solution is

W_out = (XX^⊤ + λ I_N)^{−1} X Y,   (12)

where X ∈ M_{N,T^*} is the reservoir output given by X_{i,j} := x_i(j), Y ∈ M_{T^*,n} is the teaching matrix containing the vectors y(t), t ∈ {1, ..., T^*}, organized by rows, and λ ∈ R is a regularization parameter (usually obtained via cross-validation).
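The closed form (12) can be checked numerically against the objective (11): the ridge cost is strictly convex, so the closed-form W_out should not be beaten by nearby perturbations. The matrix sizes and λ below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, Tstar, lam = 20, 3, 500, 1e-3
X = rng.standard_normal((N, Tstar))      # reservoir output, X[i, j] = x_i(j)
Y = rng.standard_normal((Tstar, n))      # teaching matrix, rows y(t)

# closed-form ridge solution (12)
W_out = np.linalg.solve(X @ X.T + lam * np.eye(N), X @ Y)

def ridge_cost(W):
    """Objective (11): sum_t ||W^T x(t) - y(t)||^2 + lam * ||W||_Frob^2."""
    return np.sum((X.T @ W - Y) ** 2) + lam * np.sum(W ** 2)

# the closed form beats small random perturbations, as a minimizer should
base = ridge_cost(W_out)
for _ in range(10):
    assert base <= ridge_cost(W_out + 1e-3 * rng.standard_normal((N, n)))
```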

SLIDE 34

Time-delay reservoir computers Hardware realizations, scalability, and big data compatibility

Physical implementation: reservoir computing (RC) devices

A major feature of RC is the possibility of constructing physical realizations of reservoirs instead of simulating them on a computer. Chaotic dynamical systems can be used to construct reservoirs that exhibit the RC features: with chaotic electronic oscillators as in [ASV+11], or with optoelectronic devices as in [LSB+12].

(Figure) Optoelectronic implementation of RC with a single nonlinear element subject to delayed feedback [LSB+12].

SLIDE 35

Time-delay reservoir computers Hardware realizations, scalability, and big data compatibility

SLIDE 36

Time-delay reservoir computers Hardware realizations, scalability, and big data compatibility

SLIDE 37

Time-delay reservoir computers Models and performance estimations

Universality vs performance optimization

Universality is a reassuring feature, but in practice there are architecture restrictions on:

- the basis filters available;
- the functional form of the readout.

It is hence important to be able to evaluate RC performance for a given architecture and a given task, so that it can be optimized by tuning the available parameters in the setup. We do so in what follows in the TDR setup, so that we can also test the robustness of a given setup with respect to modifications in the task and the parameters.

SLIDE 38

Time-delay reservoir computers Models and performance estimations

Optimal performance: stability and unimodality

Behavior of the reservoir performance in a quadratic memory task as a function of c̄ and var(c). The top panels show how the performance degrades very quickly as soon as c̄ and var(c) separate from zero. The bottom panels depict the reservoir performance as a function of the various output means and variances. Red markers indicate the cases in which the reservoir visits the stability basin of a contiguous stable equilibrium, showing how unimodality is associated with optimal performance.

SLIDE 39

Time-delay reservoir computers Models and performance estimations

Stability analysis

Theorem (Grigoryeva, Henriques, Larger, JPO, 2015). Let x0 be an equilibrium of the reservoir time-delay differential equation in the autonomous regime, that is, when I(t) = 0, and suppose that there exist ε > 0 and kε ∈ R such that one of the following conditions holds:

(i) |f(x + x0, 0, θ) − x0| ≤ kε |x| for all x ∈ (−ε, ε);
(ii) |(f(x + x0, 0, θ) − x0)/x| ≤ kε for all x ∈ (−ε, ε), x ≠ 0.

If |kε| < 1 then x0 is asymptotically stable. If |kε| ≤ 1 then x0 is stable.

Corollary (Grigoryeva, Henriques, Larger, JPO, 2015). Let x0 be an equilibrium of the reservoir TDDE and suppose that the nonlinear reservoir kernel function f is continuously differentiable at x0. If |∂x f(x0, 0, θ)| < 1 (respectively, |∂x f(x0, 0, θ)| ≤ 1), then x0 is asymptotically stable (respectively, stable).

SLIDE 40

Time-delay reservoir computers Models and performance estimations

Corollary (Stability of the equilibria of the Ikeda TDDE; Grigoryeva, Henriques, Larger, JPO, 2015). Consider the reservoir TDDE in the autonomous regime based on the Ikeda kernel,

f(x, 0, θ) = η sin²(x + φ).    (13)

The Ikeda nonlinear TDDE exhibits two families of equilibria:

(i) The trivial solution x0 = 0, for any η ∈ R and φ = πn, n ∈ Z. The equilibrium x0 = 0 is asymptotically stable for any η ∈ R.
(ii) The non-trivial equilibria x0, obtained as solutions of the equation x0 = η sin²(x0 + φ), for any η ∈ R and φ ≠ πn, n ∈ Z. These equilibria are asymptotically stable (respectively, stable) if

|sin(2x0 + 2φ)| < 1/|η|  (respectively, |sin(2x0 + 2φ)| ≤ 1/|η|).    (14)

When |η| < 1 (respectively, |η| ≤ 1), there exists only one non-trivial equilibrium, and it is always asymptotically stable (respectively, stable).
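The equilibrium equation and criterion (14) are easy to check numerically. The following sketch (our own illustration, not part of the original slides; the function names are ours) locates the solutions of x0 = η sin²(x0 + φ) by bisection on sign changes and classifies their stability:

```python
import math

def ikeda_equilibria(eta, phi, steps=20000):
    """Locate solutions of x = eta * sin^2(x + phi) by scanning for sign
    changes of g(x) = eta * sin^2(x + phi) - x and refining by bisection."""
    x_max = abs(eta) + 1.0  # every equilibrium satisfies |x0| <= |eta|
    g = lambda x: eta * math.sin(x + phi) ** 2 - x
    xs = [-x_max + 2 * x_max * k / steps for k in range(steps + 1)]
    roots = []
    for a, b in zip(xs, xs[1:]):
        if g(a) == 0.0:
            roots.append(a)
        elif g(a) * g(b) < 0:
            lo, hi = a, b
            for _ in range(80):  # bisection refinement
                mid = 0.5 * (lo + hi)
                if g(lo) * g(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    return roots

def is_asymptotically_stable(x0, eta, phi):
    """Criterion (14): |sin(2 x0 + 2 phi)| < 1 / |eta|."""
    return abs(math.sin(2 * x0 + 2 * phi)) < 1.0 / abs(eta)
```

For |η| < 1, as the corollary states, the scan finds a single non-trivial equilibrium and the criterion confirms it is asymptotically stable.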

SLIDE 41

Time-delay reservoir computers Models and performance estimations

Stability of the TDR: discrete time approximation

Proposition (Grigoryeva, Henriques, Larger, JPO, 2015). The point x0 ∈ R is an equilibrium of the reservoir time-delay differential equation in the autonomous regime, that is, when I(t) = 0, if and only if the vector x0 := x0 iN is a fixed point of the N-dimensional discretized nonlinear time-delay reservoir

x(t) = F(x(t − 1), I(t), θ)    (15)

in the autonomous regime, that is, when I(t) = 0N.

Theorem (Grigoryeva, Henriques, Larger, JPO, 2015). Let x0 = x0 iN be a fixed point of the N-dimensional recursion x(t) = F(x(t − 1), I(t), θ) in the autonomous regime. Then, x0 ∈ RN is asymptotically stable (respectively, stable) if |∂x f(x0, 0, θ)| < 1 (respectively, |∂x f(x0, 0, θ)| ≤ 1).
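As a quick numerical illustration of the proposition (our own sketch; the neuron-wise update below, with factor e^{−d} and ring coupling x_0(t) := x_N(t − 1), is an assumption modeled on standard TDR discretizations and is not spelled out on this slide), one can iterate the autonomous recursion and watch the state converge to a uniform vector x0 iN whose entries solve the scalar fixed-point equation x0 = f(x0, 0, θ):

```python
import math

def tdr_step(x_prev, inputs, f, d):
    """One step of an assumed discretized TDR x(t) = F(x(t-1), I(t), theta):
    x_i(t) = e^{-d} x_{i-1}(t) + (1 - e^{-d}) f(x_i(t-1), I_i(t)),
    with the ring convention x_0(t) := x_N(t-1)."""
    ed = math.exp(-d)
    x = []
    carry = x_prev[-1]  # x_0(t) := x_N(t-1)
    for xi_prev, Ii in zip(x_prev, inputs):
        xi = ed * carry + (1.0 - ed) * f(xi_prev, Ii)
        x.append(xi)
        carry = xi
    return x

# Ikeda kernel in the autonomous regime (I(t) = 0_N), contracting since |eta| < 1
eta, phi, d, N = 0.8, 0.3, 0.5, 10
f = lambda x, I: eta * math.sin(x + I + phi) ** 2

state = [0.1 * i for i in range(N)]
for _ in range(500):
    state = tdr_step(state, [0.0] * N, f, d)

spread = max(state) - min(state)             # the state becomes uniform
residual = abs(f(state[0], 0.0) - state[0])  # entries solve x0 = f(x0, 0)
```

Both `spread` and `residual` shrink to numerical zero, in line with the equivalence stated in the proposition.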

SLIDE 42

Time-delay reservoir computers Models and performance estimations

The approximating model and nonlinear memory capacity

Consider a stable equilibrium x0 ∈ R of the autonomous system associated to (8) or, equivalently, a stable fixed point x0 := (x0, . . . , x0)⊤ ∈ RN of (10). We construct the approximation of (10) by using its linearization at x0 with respect to the delayed self-feedback and its Rth-order Taylor expansion with respect to its dependence on the signal injection:

x(t) = F(x0, 0N, θ) + A(x0, θ)(x(t − 1) − x0) + ε(t),    (16)

where A(x0, θ) := DxF(x0, 0N, θ) and ε(t) is given by

ε(t) = (1 − e^{−ξ}) (qR(z(t), c1), . . . , qR(z(t), c1, . . . , cN))⊤,

with

qR(z(t), c1, . . . , cr) := Σ_{i=1}^{R} (z(t)^i / i!) (∂_I^{(i)} f)(x0, 0, θ) Σ_{j=1}^{r} e^{−(r−j)ξ} c_j^i,

and (∂_I^{(i)} f)(x0, 0, θ) the ith-order partial derivative of the nonlinear kernel f with respect to I(t), evaluated at (x0, 0, θ).

SLIDE 43

Time-delay reservoir computers Models and performance estimations

Let the input signal be {z(t)}t∈Z ∼ IID(0, σz²); then {I(t)}t∈Z ∼ IID(0N, ΣI), with ΣI := σz² c c⊤, and {ε(t)}t∈Z ∼ IID(µε, Σε) with

µε = (1 − e^{−ξ}) (qR(µz, c1), . . . , qR(µz, c1, . . . , cN))⊤, where µz^i := E[z(t)^i],

and Σε := E[(ε(t) − µε)(ε(t) − µε)⊤] ∈ SN with entries

(Σε)ij = (1 − e^{−ξ})² ((qR(·, c1, . . . , ci) · qR(·, c1, . . . , cj))(µz) − qR(µz, c1, . . . , ci) qR(µz, c1, . . . , cj)), i, j = 1, . . . , N.

The process (16) is a VAR(1) model

x(t) − µx = A(x0, θ)(x(t − 1) − µx) + (ε(t) − µε)    (17)

with µx = (IN − A(x0, θ))^{−1}(F(x0, 0N, θ) − A(x0, θ)x0 + µε) and an autocovariance function Γ(k) := E[(x(t) − µx)(x(t − k) − µx)⊤], k ∈ Z, recursively determined by the Yule-Walker equations [Lüt05]:

vec(Γ(0)) = (IN² − A(x0, θ) ⊗ A(x0, θ))^{−1} vec(Σε),  Γ(k) = A(x0, θ)Γ(k − 1),  Γ(−k) = Γ(k)⊤.

SLIDE 44

Time-delay reservoir computers Models and performance estimations

The nonlinear memory capacity estimations

An h-lag memory task is determined by a function H : R^{h+1} → R (in general nonlinear) that is used to generate y(t) := H(z(t), z(t − 1), . . . , z(t − h)) ∈ R out of the reservoir input {z(t)}t∈Z.

Recall that the optimal linear readout Wout adapted to the memory task H is given by the solution of a ridge (or Tikhonov [Tik43]) linear regression problem:

(Wout, aout) := arg min_{W∈RN, a∈R} E[(W⊤x(t) + a − y(t))²] + λ‖W‖².    (18)

Using the fact that {x(t)}t∈Z is the unique stationary solution of the VAR(1) system (17) approximating the TDR, one obtains

Wout = (Γ(0) + λIN)^{−1} Cov(y(t), x(t)),    (19)
aout = E[y(t)] − Wout⊤ µx,    (20)

where µx and Γ(0) ∈ SN are provided in (17), and Cov(y(t), x(t)) is a vector in RN that has to be determined for every specific memory task H.
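Equations (19)-(20) express the readout purely in terms of population moments. A small numpy sketch (synthetic Γ(0), µx and a linear target; the names and parameter values are ours) illustrating them:

```python
import numpy as np

rng = np.random.default_rng(1)
N, lam = 4, 0.1

M = rng.standard_normal((N, N))
Gamma0 = M @ M.T + N * np.eye(N)  # stand-in for the autocovariance Gamma(0)
mu_x = rng.standard_normal(N)
w_true, a_true = rng.standard_normal(N), 0.7

# population moments of the noise-free linear target y(t) = w_true . x(t) + a_true
cov_yx = Gamma0 @ w_true          # Cov(y(t), x(t)) = Gamma(0) w_true
mean_y = w_true @ mu_x + a_true   # E[y(t)]

W_out = np.linalg.solve(Gamma0 + lam * np.eye(N), cov_yx)  # (19)
a_out = mean_y - W_out @ mu_x                              # (20)
```

For λ = 0 the readout recovers w_true and a_true exactly; for λ > 0 it is the usual ridge-shrunk version of the same solution.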

SLIDE 45

Time-delay reservoir computers Models and performance estimations

The error committed by the reservoir when using the optimal readout is

MSE_H = var(y(t)) − Cov(y(t), x(t))⊤ (Γ(0) + λIN)^{−1} (Γ(0) + 2λIN) (Γ(0) + λIN)^{−1} Cov(y(t), x(t)).

Using the VAR(1) approximating model (17) of the RC, the corresponding H-memory capacity is

CH(θ, c, λ) = Cov(y(t), x(t))⊤ (Γ(0) + λIN)^{−1} (Γ(0) + 2λIN) (Γ(0) + λIN)^{−1} Cov(y(t), x(t)) / var(y(t)).    (21)

Additionally, 0 ≤ CH(θ, c, λ) ≤ 1. Once a specific reservoir and task H have been fixed, the capacity function CH(θ, c, λ) can be explicitly written down; it can hence be used to find reservoir parameters θopt and an input mask copt that maximize it, by solving the optimization problem

(θopt, copt) := arg max_{θ∈RK, c∈RN} CH(θ, c, λ).    (23)
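The capacity (21) is a normalized quadratic form in Cov(y(t), x(t)). A hedged sketch (synthetic but self-consistent moments for a linear target with additive noise; all names are ours) checking that it lies in [0, 1] and equals 1 for a noise-free linear task with λ = 0:

```python
import numpy as np

def capacity(Gamma0, cov_yx, var_y, lam):
    """H-memory capacity (21):
    cov^T (G + lam I)^{-1} (G + 2 lam I) (G + lam I)^{-1} cov / var(y)."""
    N = Gamma0.shape[0]
    R = np.linalg.solve(Gamma0 + lam * np.eye(N), cov_yx)
    return R @ (Gamma0 + 2 * lam * np.eye(N)) @ R / var_y

rng = np.random.default_rng(2)
N = 5
M = rng.standard_normal((N, N))
Gamma0 = M @ M.T + N * np.eye(N)  # stand-in reservoir autocovariance
w = rng.standard_normal(N)
sigma_noise2 = 0.5

# self-consistent moments for y(t) = w . x(t) + noise
cov_yx = Gamma0 @ w
var_y = float(w @ Gamma0 @ w) + sigma_noise2
```

With λ = 0 the capacity reduces to the population R² of the linear regression of y(t) on x(t); increasing λ can only decrease it, since (Γ + λI)^{−1}(Γ + 2λI)(Γ + λI)^{−1} is decreasing in λ in the Loewner order.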

SLIDE 46

Time-delay reservoir computers Models and performance estimations

Optimal nonlinear capacity

The h-lag quadratic memory task. Take a quadratic task function of the form H(zh(t)) := zh(t)⊤ Q zh(t), for some symmetric (h + 1)-dimensional matrix Q. In this case

var(y(t)) = (µz⁴ − σz⁴) Σ_{i=1}^{h+1} Qii² + 4σz⁴ Σ_{i=1}^{h+1} Σ_{j>i}^{h+1} Qij²,

and

Cov(y(t), xi(t)) = (1 − e^{−ξ}) Σ_{j=1}^{h+1} Σ_{r=1}^{N} Qjj (A^{j−1})ir (sR(µz, c1, . . . , cr) − σz² qR(µz, c1, . . . , cr)),

where the polynomial sR in the variable x is defined as sR(x, c1, . . . , cr) := x² · qR(x, c1, . . . , cr).
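The variance formula for the quadratic task can be checked against hand-computable cases. A short sketch (the function name is ours):

```python
import numpy as np

def var_quadratic_task(Q, mu4, sigma2):
    """var(y) for y = z_h^T Q z_h with IID inputs of variance sigma2 and
    fourth moment mu4:
    (mu4 - sigma2^2) * sum_i Q_ii^2 + 4 * sigma2^2 * sum_{i<j} Q_ij^2."""
    Q = np.asarray(Q, dtype=float)
    diag_part = np.sum(np.diag(Q) ** 2)
    off_part = np.sum(np.triu(Q, 1) ** 2)
    return (mu4 - sigma2 ** 2) * diag_part + 4 * sigma2 ** 2 * off_part
```

For standard Gaussian inputs (σz² = 1, µz⁴ = 3): with Q = I₂ the task is y = z₁² + z₂², whose variance is 2 + 2 = 4; with Q purely off-diagonal (Q₁₂ = Q₂₁ = 1) the task is y = 2 z₁ z₂, whose variance is again 4. The formula reproduces both values.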

SLIDE 47

Time-delay reservoir computers Models and performance estimations

Error exhibited by a TDR computer with a Mackey-Glass kernel in a 3-lag quadratic memory task as a function of the separation between neurons d and the parameter γ, respectively. The points in the surfaces of the middle and right panels are the result of Monte Carlo evaluations of the NMSE exhibited by the discrete and continuous time TDRs, respectively. The left panel was constructed by modeling the reservoir with an approximating VAR(1) model.

SLIDE 48

Time-delay reservoir computers Models and performance estimations

Error exhibited by a TDR computer with a Mackey-Glass kernel in a 6-lag quadratic memory task as a function of the separation between neurons d and the parameter η. The points in the surfaces of the middle and right panels are the result of Monte Carlo evaluations of the NMSE exhibited by the discrete and continuous time TDRs, respectively. The left panel was constructed modeling the reservoir with an approximating VAR(1) model.

SLIDE 49

Time-delay reservoir computers Models and performance estimations

SLIDE 50

Time-delay reservoir computers Models and performance estimations

Other applications of the reservoir model

Evaluation of the finite sample training and testing errors. Given a reservoir output X of size T, the total mean square reservoir training error conditional on X and for any teaching signal Y is given by

MSE_total,λ | X = trace(Σ) + (1/T) trace[ trace(Σ) (Rλ X̃ A X̃⊤ Rλ X̃ X̃⊤ − 2 IN+1) + λ² T² Rλ W̃ W̃⊤ Rλ X̃ X̃⊤ ],

where N is the number of neurons of the reservoir, X̃ := (iT ‖ X⊤)⊤ is the reservoir output augmented with the T-vector of ones iT, and W̃ := (a ‖ W⊤)⊤ with W := Γ(0)^{−1} Cov(x(t), y(t)) and a := µy − W⊤µx. Finally, Rλ := (X̃ A X̃⊤ + λT IN+1)^{−1} and Σ := Cov(y(t), y(t)) − W⊤ Γ(0) W.

SLIDE 51

Time-delay reservoir computers Models and performance estimations

The RC defining features. Consider the reservoir model driven by the real-valued and not necessarily stationary input signal {z(t)}t∈Z.

(i) Let c ∈ RN be an input mask and I(t) := c z(t) the corresponding input forcing. Let

F_I^R(I(t), x0, θ) := Σ_{i=1}^{R} (1/i!) D_I^{(i)} F(x0, 0N, θ) (I(t) ⊗ · · · ⊗ I(t))  (i factors)

and assume that one of the following conditions holds:
(a) The map F_I^R(·, x0, θ) : RN → RN is injective.
(b) The input signal is bounded.
If A(x0, θ) := DxF(x0, 0N, θ) has no zero eigenvalues, then the reservoir model satisfies the separation property.

(ii) If the input signal {z(t)}t∈Z is strictly stationary with finite automoments up to order 2R, it is bounded, and the linear map A(x0, θ) is such that ‖A(x0, θ)‖ < 1, then the reservoir model satisfies the uniform fading memory property.

SLIDE 52

Application examples

Application examples: usual benchmarks and early applications

RC has outperformed well-established methods of nonlinear system identification, prediction, and classification (see [LJ09] for a review):

  • Prediction of chaotic dynamics (three orders of magnitude accuracy improvement [JH04])
  • Nonlinear wireless channel equalization (two orders of magnitude improvement [JH04])
  • Japanese Vowel benchmark (zero test error rate; previous best: 1.8% [JLPS07])
  • Financial forecasting (winner of the international forecasting competition NN3¹)
  • Isolated spoken digit recognition (word error rate on the benchmark improved from 0.6% for the previous best system to 0.2%, and further to 0% test error in more recent works [JLPS07, ASV+11, LSB+12, PDS+12, BSMF13])
  • NARMA model identification task [AP00, RT11]

¹ http://www.neural-forecasting-competition.com/NN3/index.htm

SLIDE 53

Application examples

Application examples: classification

  • Deep RC networks outperform all state-of-the-art techniques in written digit classification on the MNIST corpus [JDD+15]. RC halves the error exhibited by deep neural network committees, and the results are robust to the presence of various noises.
  • A similar architecture [TJSM10] has shown performance comparable to state-of-the-art technology in the phoneme recognition problem based on the TIMIT corpus, with a competitive training effort.
  • Hi-res EEG signals: monitoring of epileptic seizures in animals [BSVS09, BVvM+11, BVN+13, NDK11] and discrimination of the emotion valence in humans [KHBG15].
  • Electrocardiogram (ECG) signals [LS13].
  • Fuel cell diagnostics [Hugo].

SLIDE 54

Application examples

Application examples: forecasting

  • Industrial production time series [WSS08, WS10]
  • Great Lakes water levels [Cou10]
  • Short-term wind speed forecasting [FLdA+08]
  • Water inflow forecasting [SOP+07]
  • Short-term electric consumption [DS12] and temperature [DOS13]
  • Telephone call load [BSU+15]
  • Short-term stock price prediction [LYS09], with applications to intelligent stock trading systems [LYS11]

SLIDE 55

Application examples

Application examples: volatility forecasting

TDRs have been shown in [GHLO14] to outperform standard multivariate parametric models in the modeling of realized financial volatility and correlations.

Average realized volatility forecasting performance using RC and VEC(1,1) models estimated via maximum likelihood (MLE). The sMSFE reported is obtained with the estimated parametric models. All the TDRs considered have been generated using the nonlinear Mackey-Glass kernel with p = 2.

SLIDE 56

Application examples

SLIDE 57

Application examples

A final example: volatility filtering

The standard ARSV model is given by the prescription

yt = µ + σt εt,  {εt} ∼ IID(0, 1),
bt = γ + φ bt−1 + wt,  {wt} ∼ IID(0, σw²),    (24)

where bt := log(σt²), γ ∈ R, φ ∈ (−1, 1). Assume that {εt} and {wt} are uncorrelated (this can be relaxed to account for leverage effects and the asymmetric behavior of stock prices).

Observations:
  • the process {σt} is a non-traded stochastic latent variable that, unlike in GARCH-type models [Eng82, Bol86], is not a predictable process that can be written as a function of previous returns and volatilities;
  • the unique stationary returns process induced by (24), provided that φ ∈ (−1, 1), is a white noise (no autocorrelation) with finite moments of arbitrary order.
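A minimal simulator for (24) (our own sketch; the parameter values are illustrative, not from the slides) makes the two observations concrete: the simulated returns show essentially no autocorrelation, while their squares do, which is the volatility clustering driven by the latent process {σt}:

```python
import numpy as np

def simulate_arsv(T, mu=0.0, gamma=-0.2, phi=0.95, sigma_w=0.2, seed=0):
    """Simulate the ARSV model (24): y_t = mu + sigma_t eps_t,
    b_t = gamma + phi b_{t-1} + w_t, with b_t = log(sigma_t^2)
    started at its stationary mean gamma / (1 - phi)."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(T)
    w = sigma_w * rng.standard_normal(T)
    b = np.empty(T)
    prev = gamma / (1.0 - phi)
    for t in range(T):
        prev = gamma + phi * prev + w[t]
        b[t] = prev
    sigma = np.exp(b / 2.0)
    return mu + sigma * eps, sigma

y, sigma = simulate_arsv(50_000)
acf1_returns = np.corrcoef(y[:-1], y[1:])[0, 1]            # close to zero
acf1_squares = np.corrcoef(y[:-1] ** 2, y[1:] ** 2)[0, 1]  # clearly positive
```

Note that σt here is never reconstructed from past returns, in contrast with a GARCH recursion; it is an unobserved state that a filtering method (or a reservoir) must infer.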

SLIDE 58

Application examples

ARSV: estimation and filtering techniques

References on the model: Taylor [Tay86, Tay05].

1 Bayesian approach: Jacquier et al. [JPR94], among many others.
2 Non-Bayesian approaches:
  • Harvey et al. [HRS94] and Ruiz [Rui94] suggested a QML estimator based on the Kalman filter
  • Meyer et al. [MFB03] and Shimada and Tsukuda [ST05] use approximate linear filtering methods based on the Laplace approximation to produce an MLE
  • the h-likelihood estimation approach of del Castillo and Lee [dCL08], [LWLdC11], based on treating ARSV models as a GLM with varying random effects

SLIDE 59

Application examples

Performance comparison

SLIDE 60

Application examples

  • Kalman testing error: 100.63%
  • h-likelihood testing error: 82.50%
  • Reservoir testing error (5 nodes, Ikeda kernel, optimized parameters): 73.88%

No restrictions are imposed on the model prescription or on the nature of the innovations.

SLIDE 61

References

References I

  • A. F. Atiya and A. G. Parlos.

New results on recurrent network training: unifying the algorithms and accelerating convergence. IEEE Transactions on Neural Networks, 11(3):697–709, jan 2000.

  • V. I. Arnold.

On functions of three variables. Proceedings of the USSR Academy of Sciences, 114:679–681, 1957.

  • L. Appeltant, M. C. Soriano, G. Van der Sande, J. Danckaert, S. Massar, J. Dambre, B. Schrauwen, C. R. Mirasso, and
  • I. Fischer.

Information processing using a single dynamical node as complex system. Nature Communications, 2:468, jan 2011.

  • S. Boyd and L. Chua.

Fading memory and the problem of approximating nonlinear operators with Volterra series. IEEE Transactions on Circuits and Systems, 32(11):1150–1161, nov 1985. Tim Bollerslev. Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 31(3):307–327, 1986. Michael S. Branicky. Universal computation and other capabilities of hybrid and continuous dynamical systems. Theoretical Computer Science, 138(1):67–100, 1995.

  • D. Brunner, M. C. Soriano, C. R. Mirasso, and I. Fischer.

Parallel photonic information processing at gigabyte per second data rates using transient states. Nature Communications, 4(1364), 2013.

SLIDE 62

References

References II

Filippo Maria Bianchi, Simone Scardapane, Aurelio Uncini, Antonello Rizzi, and Alireza Sadeghian. Prediction of telephone calls load using Echo State Network with exogenous variables. Neural Networks, 71(November):204–213, 2015. Pieter Buteneers, Benjamin Schrauwen, David Verstraeten, and Dirk Stroobandt. Real-Time Epileptic Seizure Detection on Intra-cranial Rat Data Using Reservoir Computing. In Advances in Neuro-Information Processing, pages 56–63. Springer Berlin Heidelberg, 2009. Pieter Buteneers, David Verstraeten, Bregt Van Nieuwenhuyse, Dirk Stroobandt, Robrecht Raedt, Kristl Vonck, Paul Boon, and Benjamin Schrauwen. Real-time detection of epileptic seizures in animal models using reservoir computing. Epilepsy Research, 103(2):124–134, 2013. Pieter Buteneers, David Verstraeten, Pieter van Mierlo, Tine Wyckhuys, Dirk Stroobandt, Robrecht Raedt, Hans Hallez, and Benjamin Schrauwen. Automatic detection of epileptic seizures on the intra-cranial electroencephalogram of rats using reservoir computing. Artificial Intelligence in Medicine, 53(3):215–223, 2011. Neil E. Cotter and Thierry J. Guillerm. The CMAC and a theorem of Kolmogorov. Neural Networks, 5(2):221–228, 1992. Paulin Coulibaly. Reservoir Computing approach to Great Lakes water level forecasting. Journal of Hydrology, 381(1-2):76–88, 2010.

SLIDE 63

References

References III

  • G. Cybenko.

Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals, and Systems, 2(4):303–314, dec 1989. Joan del Castillo and Youngjo Lee. GLM-methods for volatility models.

  • Stat. Model., 8(3):263–283, 2008.

Ali Deihimi, Omid Orang, and Hemen Showkati. Short-term electric load and temperature forecasting using wavelet echo state networks with neural reconstruction. Energy, 57:382–401, 2013. Ali Deihimi and Hemen Showkati. Application of echo state networks in short-term electric load forecasting. Energy, 39(1):327–340, 2012. Robert F. Engle. Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica, 50(4):987–1007, 1982. Aida A. Ferreira, Teresa B. Ludermir, Ronaldo R. B. de Aquino, Milde M. S. Lira, and Otoni N. Neto. Investigating the use of Reservoir Computing for forecasting the hourly wind speed in short-term. In 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), pages 1649–1656. IEEE, jun 2008. Jürgen Franke and Michael H. Neumann. Bootstrapping neural networks. Neural Computation, 12(8):1929–1949, aug 2000.

SLIDE 64

References

References IV

Lyudmila Grigoryeva, Julie Henriques, Laurent Larger, and Juan-Pablo Ortega. Stochastic time series forecasting using time-delay reservoir computers: performance and universality. Neural Networks, 55:59–71, 2014.

  • A. Ronald Gallant and Halbert White.

On learning the derivatives of an unknown mapping with multilayer feedforward networks. Neural Networks, 5(1):129–138, 1992.

  • D. Hilbert.

Über die Gleichung neunten Grades. Mathematische Annalen, 97(1):243–250, dec 1927. Hecht-Nielsen. Theory of the backpropagation neural network. In International Joint Conference on Neural Networks, pages 593–605 vol.1. IEEE, 1989.

  • A. C. Harvey, E Ruiz, and Neil Shephard.

Multivariate stochastic variance models. Review of Economic Studies, 61:247–264, 1994. Kurt Hornik, Maxwell Stinchcombe, and Halbert White. Multilayer feedforward networks are universal approximators. Neural Networks, 2(5):359–366, 1989. Kurt Hornik, Maxwell Stinchcombe, and Halbert White. Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks. Neural Networks, 3(5):551–560, 1990.

SLIDE 65

References

References V

Kensuke Ikeda. Multiple-valued stationary state and its instability of the transmitted light by a ring cavity system. Optics Communications, 30(2):257–261, aug 1979. Herbert Jaeger. The ’echo state’ approach to analysing and training recurrent neural networks. Technical report, German National Research Center for Information Technology, 2001. Azarakhsh Jalalvand, Kris Demuynck, Wesley De Neve, Rik Van de Walle, and Jean-Pierre Martens. Design of reservoir computing systems for noise-robust speech and handwriting recognition. In 28th Conference on Graphics, Patterns and Images, Workshop of Theses and Dissertations (WTD) Proceedings, 2015. Herbert Jaeger and Harald Haas. Harnessing Nonlinearity: Predicting Chaotic Systems and Saving Energy in Wireless Communication. Science, 304(5667):78–80, 2004. Herbert Jaeger, Mantas Lukoševičius, Dan Popovici, and Udo Siewert. Optimization and applications of echo state networks with leaky-integrator neurons. Neural Networks, 20(3):335–352, 2007. E Jacquier, N G Polson, and P E Rossi. Bayesian analysis of stochastic volatility models. Journal of Business and Economic Statistics, 12:371–417, 1994.

SLIDE 66

References

References VI

  • P. Koprinkova-Hristova, L. Bozhkov, and P. Georgieva.

Echo state networks for feature selection in affective computing. In Practical Applications of Agents, Multi-Agent Systems, and Sustainability: The PAAMS Collection, pages 131–141. Springer Verlag, 2015.

  • A. N. Kolmogorov.

On the representation of continuous functions of several variables as superpositions of functions of smaller number of variables. Soviet Math. Dokl, 108:179–182, 1956.

  • J. F. Kaashoek and H. K. van Dijk.

Neural networks: an econometric tool. In Computer-Aided Econometrics, chapter 12. CRC Press, 2003.

  • M. Lukoševičius and H. Jaeger.

Reservoir computing approaches to recurrent neural network training. Computer Science Review, 3(3):127–149, 2009. Claudia Lainscsek and Terrence J Sejnowski. Electrocardiogram classification using delay differential equations. Chaos (Woodbury, N.Y.), 23(2):023132, jun 2013.

  • L. Larger, M. C. Soriano, D. Brunner, L. Appeltant, J. M. Gutierrez, L. Pesquera, C. R. Mirasso, and I. Fischer.

Photonic information processing beyond Turing: an optoelectronic implementation of reservoir computing. Optics Express, 20(3):3241, jan 2012.

SLIDE 67

References

References VII

Helmut Lütkepohl. New Introduction to Multiple Time Series Analysis. Springer-Verlag, Berlin, 2005. Johan Lim, Woojoo Lee, Youngjo Lee, and Joan del Castillo. The hierarchical-likelihood approach to autoregressive stochastic volatility models. Computational Statistics and Data Analysis, 55(55):248–260, 2011. Xiaowei Lin, Zehong Yang, and Yixu Song. Short-term stock price prediction based on echo state networks. Expert Systems with Applications, 36(3):7313–7317, 2009. Xiaowei Lin, Zehong Yang, and Yixu Song. Intelligent stock trading system based on improved technical analysis and Echo State Network. Expert Systems with Applications, 38(9):11347–11354, 2011. Wolfgang Maass. Liquid state machines: motivation, theory, and applications. In S. S. Barry Cooper and Andrea Sorbi, editors, Computability In Context: Computation and Logic in the Real World, chapter 8, pages 275–296. 2011. Renate Meyer, David A. Fournier, and Andreas Berg. Stochastic volatility: Bayesian computation using automatic differentiation and the extended Kalman filter. Econometrics Journal, 6(2):408–420, dec 2003.

  • M. C. Mackey and L. Glass.

Oscillation and chaos in physiological control systems. Science, 197:287–289, 1977.

SLIDE 68

References

References VIII

Wolfgang Maass, Prashant Joshi, and Eduardo D. Sontag. Computational aspects of feedback in neural circuits. PLoS Computational Biology, 3(1):e165, 2007. Wolfgang Maass and Henry Markram. On the computational power of circuits of spiking neurons. Journal of Computer and System Sciences, 69(4):593–616, 2004.

  • W. Maass, T. Natschläger, and H. Markram.

Real-time computing without stable states: a new framework for neural computation based on perturbations. Neural Computation, 14:2531–2560, 2002. Wolfgang Maass and Eduardo D. Sontag. Neural Systems as Nonlinear Filters. Neural Computation, 12(8):1743–1772, aug 2000. Nuttapod Nuntalid, Kshitij Dhoble, and Nikola Kasabov. EEG classification with BSA spike encoding algorithm and evolving probabilistic spiking neural network. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7062 LNCS(PART 1):451–460, 2011.

  • P. Orponen.

A survey of continuous-time computation theory. In Advances in Algorithms, Languages, and Complexity, pages 209–224. Springer US, 1997.

  • Y. Paquot, F. Duport, A. Smerieri, J. Dambre, B. Schrauwen, M. Haelterman, and S. Massar.

Optoelectronic reservoir computing. Scientific reports, 2:287, jan 2012.

SLIDE 69

References

References IX

Ali Rodan and Peter Tino. Minimum complexity echo state network. IEEE Transactions on Neural Networks, 22(1):131–144, jan 2011. Esther Ruiz. Quasi-maximum likelihood estimation of stochastic volatility models. Journal of Econometrics, 63:284–306, 1994. Simone Scardapane, Danilo Comminiello, Amir Hussain, and Aurelio Uncini. Group Sparse Regularization for Deep Neural Networks. jul 2016. Rodrigo Sacchi, Mustafa C. Ozturk, Jose C. Principe, Adriano A. F. M. Carneiro, and Ivan N. da Silva. Water inflow forecasting using the echo state network: a Brazilian case study. In 2007 International Joint Conference on Neural Networks, pages 2403–2408. IEEE, aug 2007. David A. Sprecher. A representation theorem for continuous functions of several variables. Proceedings of the American Mathematical Society, 16(2):200, apr 1965. David A. Sprecher. A numerical implementation of Kolmogorov’s superpositions. Neural Networks, 9(5):765–772, 1996. David A. Sprecher. A numerical implementation of Kolmogorov’s superpositions II. Neural Networks, 10(3):447–457, 1997.


slide-70
SLIDE 70

References

References X

  • H. T. Siegelmann and E. D. Sontag.

On the computational power of neural nets. In COLT '92: Proceedings of the Fifth Annual Workshop on Computational Learning Theory, pages 440–449, 1992.

  • H. T. Siegelmann and E. D. Sontag.

Analog computation via neural networks. Theoretical Computer Science, 131:331–360, 1994.

  • J. Shimada and Y. Tsukuda.

Estimation of stochastic volatility models: an approximation to the nonlinear state space representation. Communications in Statistics – Simulation and Computation, 34(2):429–450, April 2005.

  • X. Sun.

The Lasso and its implementation for neural networks. 1999.

  • S. J. Taylor.

Modelling Financial Time Series. John Wiley & Sons, Chichester, 1986.

  • S. J. Taylor.

Asset Price Dynamics, Volatility, and Prediction. Princeton University Press, Princeton, 2005.

  • A. N. Tikhonov.

On the stability of inverse problems. Dokl. Akad. Nauk SSSR, 39(5):195–198, 1943.

slide-71
SLIDE 71

References

References XI

  • F. Triefenbach, A. Jalalvand, B. Schrauwen, and J.-P. Martens.

Phoneme recognition with large hierarchical reservoirs. Advances in Neural Information Processing Systems 23, 23:1–9, 2010.

  • M. Tsodyks, K. Pawelzik, and H. Markram.

Neural networks with dynamic synapses. Neural Computation, 10(4):821–835, May 1998.

  • D. Verstraeten, B. Schrauwen, M. D’Haene, and D. Stroobandt.

An experimental unification of reservoir computing methods. Neural Networks, 20:391–403, 2007.

  • F. Wyffels and B. Schrauwen.

A comparative study of Reservoir Computing strategies for monthly time series prediction. Neurocomputing, 73(10):1958–1964, 2010.

  • F. Wyffels, B. Schrauwen, and D. Stroobandt.

Using reservoir computing in a decomposition approach for time series prediction, 2008.
