 
              On Ridge Functions Allan Pinkus Technion September 23, 2013 Allan Pinkus (Technion) Ridge Function September 23, 2013 1 / 27
Foreword

In this lecture we will survey a few problems and properties associated with Ridge Functions. I hope to convince you that this is a subject worthy of further consideration, especially as regards Multivariate Approximation and Interpolation with Applications.

What is a Ridge Function?

• A Ridge Function, in its simplest form, is any multivariate function $F : \mathbb{R}^n \to \mathbb{R}$ of the form
$$F(\mathbf{x}) = f(a_1 x_1 + \cdots + a_n x_n) = f(\mathbf{a} \cdot \mathbf{x}),$$
where $f : \mathbb{R} \to \mathbb{R}$, $\mathbf{x} = (x_1, \ldots, x_n)$, and $\mathbf{a} = (a_1, \ldots, a_n) \in \mathbb{R}^n \setminus \{\mathbf{0}\}$.
• The vector $\mathbf{a} \in \mathbb{R}^n \setminus \{\mathbf{0}\}$ is generally called the direction.
• It is a multivariate function, constant on the hyperplanes $\mathbf{a} \cdot \mathbf{x} = c$, $c \in \mathbb{R}$.
• It is one of the simpler multivariate functions: a superposition of a univariate function with one of the simplest multivariate functions, the inner product.

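The definition can be illustrated with a minimal Python sketch. The helper name `ridge`, the profile `math.sin`, and the direction `(1, 2)` are illustrative assumptions, not part of the lecture. The sketch builds $F(\mathbf{x}) = f(\mathbf{a} \cdot \mathbf{x})$ and evaluates it at two points on the same hyperplane $\mathbf{a} \cdot \mathbf{x} = c$.

```python
import math

def ridge(f, a):
    """Return F(x) = f(a . x) for a univariate f and a nonzero direction a."""
    if not any(a):
        raise ValueError("the direction a must be nonzero")

    def F(x):
        return f(sum(ai * xi for ai, xi in zip(a, x)))

    return F

# Hypothetical example: profile f = sin, direction a = (1, 2) in R^2.
F = ridge(math.sin, (1.0, 2.0))

# (2, -1) is orthogonal to a, so x and y lie on the same hyperplane a . x = c.
x = (0.3, 0.7)
y = (x[0] + 5.0 * 2.0, x[1] - 5.0 * 1.0)

value_at_x = F(x)
value_at_y = F(y)
```

Up to rounding error, `value_at_x` and `value_at_y` agree, which is the statement that a ridge function is constant on hyperplanes orthogonal to its direction.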
Where do we find Ridge Functions?

We see specific Ridge Functions in numerous multivariate settings without considering them as of interest in and of themselves.
• In multivariate Fourier series, where the basic functions are of the form $e^{i(\mathbf{n} \cdot \mathbf{x})}$ for $\mathbf{n} \in \mathbb{Z}^n$, in the Fourier transform $e^{i(\mathbf{w} \cdot \mathbf{x})}$, and in the Radon transform.
• In PDE where, for example, if $P$ is a constant coefficient polynomial in $n$ variables, then
$$P\left(\frac{\partial}{\partial x_1}, \ldots, \frac{\partial}{\partial x_n}\right) f = 0$$
has a solution of the form $f(\mathbf{x}) = e^{\mathbf{a} \cdot \mathbf{x}}$ if and only if $P(\mathbf{a}) = 0$.
• The polynomials $(\mathbf{a} \cdot \mathbf{x})^k$ are used in many settings.

Where do we use Ridge Functions?

• Approximation Theory: Ridge Functions should be of interest to researchers and students of approximation theory. The basic concept is straightforward and simple: approximate complicated functions by simpler functions. Among the class of multivariate functions, linear combinations of ridge functions are a class of simpler functions. The questions one asks are the basic questions of approximation theory. Can one approximate arbitrarily well (density)? How well can one approximate (degree of approximation)? How does one approximate (algorithms)? Etc.

Where do we use Ridge Functions?

• Partial Differential Equations: Ridge Functions used to be called Plane Waves. For example, we see them in the book Plane Waves and Spherical Means Applied to Partial Differential Equations by Fritz John. In general, linear combinations of ridge functions also appear in the study of hyperbolic constant coefficient PDEs. As an example, assume the $(a_i, b_i)$ are pairwise linearly independent vectors in $\mathbb{R}^2$. Then the general "solution" of the PDE
$$\prod_{i=1}^{r} \left( b_i \frac{\partial}{\partial x} - a_i \frac{\partial}{\partial y} \right) F = 0$$
consists of all functions of the form
$$F(x, y) = \sum_{i=1}^{r} f_i(a_i x + b_i y),$$
for arbitrary $f_i$.

Where do we use Ridge Functions?

• Projection Pursuit: This is a topic in Statistics. Projection pursuit algorithms approximate a function of $n$ variables by functions of the form
$$\sum_{i=1}^{r} g_i(\mathbf{a}^i \cdot \mathbf{x}),$$
where both the functions $g_i$ and the directions $\mathbf{a}^i$ are variables. The idea here is to "reduce dimension" and thus bypass the "curse of dimensionality".

Where do we use Ridge Functions?

• Neural Networks: One of the popular neuron models is that of a multilayer feedforward neural net with input, hidden and output layers. In its simplest case, and without the terminology used, one is interested in functions of the form
$$\sum_{i=1}^{r} \alpha_i \, \sigma\left( \sum_{j=1}^{n} w_{ij} x_j + \theta_i \right),$$
where $\sigma : \mathbb{R} \to \mathbb{R}$ is some given fixed univariate function. In this model, which is just one of many, we vary the $w_{ij}$, $\theta_i$ and $\alpha_i$. For each $\theta$ and $\mathbf{w} \in \mathbb{R}^n$ we are considering linear combinations of
$$\sigma(\mathbf{w} \cdot \mathbf{x} + \theta),$$
each of which is a ridge function with direction $\mathbf{w}$. Thus, a lower bound on the degree of approximation by such functions is given by the degree of approximation by ridge functions.

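The single hidden layer model above can be written out directly. The following is a minimal sketch, assuming the logistic function as the activation $\sigma$. The name `shallow_net` and the sample weights are hypothetical and chosen only for illustration.

```python
import math

def sigmoid(t):
    """Logistic activation, used here as one possible choice of sigma."""
    return 1.0 / (1.0 + math.exp(-t))

def shallow_net(x, W, theta, alpha, sigma=sigmoid):
    """Evaluate sum_i alpha_i * sigma(w_i . x + theta_i).

    W is a list of r weight vectors w_i, theta a list of r thresholds,
    alpha a list of r output coefficients.
    """
    total = 0.0
    for w_i, theta_i, alpha_i in zip(W, theta, alpha):
        t = sum(w_ij * x_j for w_ij, x_j in zip(w_i, x)) + theta_i
        total += alpha_i * sigma(t)
    return total

# Hypothetical parameters: r = 2 hidden units, n = 2 inputs.
W = [(1.0, 2.0), (-1.0, 0.5)]
theta = [0.5, -0.25]
alpha = [2.0, -1.0]

value = shallow_net((0.3, 0.7), W, theta, alpha)
```

Each summand depends on `x` only through the inner product with `w_i`, so the network output is a linear combination of `r` ridge functions.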
Where do we use Ridge Functions?

• Computerized Tomography: The term Ridge Function was coined in a 1975 paper by Logan and Shepp, a seminal paper in computerized tomography. They considered ridge functions in the unit disk in $\mathbb{R}^2$ with equally spaced directions. We will consider some nice domain $K$ in $\mathbb{R}^n$, and a function $G$ belonging to $L^2(K)$.

Problem: For some fixed directions $\{\mathbf{a}^i\}_{i=1}^{r}$ we are given
$$\int_{K \cap \{\mathbf{a}^i \cdot \mathbf{x} = \lambda\}} G(\mathbf{x}) \, d\mathbf{x}$$
for each $\lambda$ and $i = 1, \ldots, r$. That is, we see the "projections" of $G$ along the hyperplanes $K \cap \{\mathbf{a}^i \cdot \mathbf{x} = \lambda\}$, $\lambda$ a.e., $i = 1, \ldots, r$. What is a good method of reconstructing $G$ based only on this information?

Answer: The unique best $L^2(K)$ approximation
$$f^*(\mathbf{x}) = \sum_{i=1}^{r} f_i^*(\mathbf{a}^i \cdot \mathbf{x})$$
to $G$ from
$$\mathcal{M}(\mathbf{a}^1, \ldots, \mathbf{a}^r) = \left\{ \sum_{i=1}^{r} f_i(\mathbf{a}^i \cdot \mathbf{x}) : f_i \text{ vary} \right\},$$
if such exists, necessarily satisfies
$$\int_{K \cap \{\mathbf{a}^i \cdot \mathbf{x} = \lambda\}} G(\mathbf{x}) \, d\mathbf{x} = \int_{K \cap \{\mathbf{a}^i \cdot \mathbf{x} = \lambda\}} f^*(\mathbf{x}) \, d\mathbf{x}$$
for each $\lambda$ and $i = 1, \ldots, r$, and among all such functions with the same data as $G$ is the one of minimal $L^2(K)$ norm.

Properties of Ridge Functions

In the remaining part of this lecture I want to consider various properties of linear combinations of Ridge Functions. Namely,
• Density
• Representation
• Smoothness
• Uniqueness
• Interpolation

Density - Fixed Directions

• Ridge functions are dense in $C(K)$ for every compact $K \subset \mathbb{R}^n$. E.g., $\operatorname{span}\{ e^{\mathbf{n} \cdot \mathbf{x}} : \mathbf{n} \in \mathbb{Z}_+^n \}$ is dense (Stone-Weierstrass).
• Let $\Omega$ be any set of vectors in $\mathbb{R}^n$, and
$$\mathcal{M}(\Omega) = \operatorname{span}\{ f(\mathbf{a} \cdot \mathbf{x}) : \mathbf{a} \in \Omega, \text{ all } f \}.$$

Theorem (Vostrecov, Kreines). $\mathcal{M}(\Omega)$ is dense in $C(\mathbb{R}^n)$ in the topology of uniform convergence on compact subsets if and only if no non-trivial homogeneous polynomial vanishes on $\Omega$.

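The vanishing condition in the theorem is easy to test for a finite set of directions in $\mathbb{R}^2$. For $\Omega = \{(a_i, b_i)\}_{i=1}^{r}$ the product $p(x, y) = \prod_{i=1}^{r} (b_i x - a_i y)$ is a non-trivial homogeneous polynomial of degree $r$ that vanishes on $\Omega$, so by the theorem $\mathcal{M}(\Omega)$ is not dense. The sketch below checks this numerically. The function name and the sample directions are assumptions made for illustration.

```python
def vanishing_polynomial(directions):
    """Return p(x, y) = prod_i (b_i * x - a_i * y) for directions (a_i, b_i)."""
    def p(x, y):
        value = 1.0
        for a, b in directions:
            value *= b * x - a * y
        return value
    return p

# Hypothetical set of r = 3 pairwise linearly independent directions in R^2.
omega = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
p = vanishing_polynomial(omega)

# p vanishes at every direction in omega ...
values_on_omega = [p(a, b) for a, b in omega]

# ... but is not the zero polynomial, and is homogeneous of degree 3.
value_off_omega = p(1.0, -1.0)
value_scaled = p(2.0, -2.0)
```

Here every entry of `values_on_omega` is zero, while `value_scaled` equals $2^3$ times `value_off_omega`, as expected for a homogeneous polynomial of degree 3.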
Density - Variable Directions

• Let $\Omega_j$, $j \in J$, be sets of vectors in $\mathbb{R}^n$, and $\mathcal{M}(\Omega_j)$ be as above. We ask when, for each given $G \in C(\mathbb{R}^n)$, compact $K \subset \mathbb{R}^n$ and $\varepsilon > 0$, there exists an $F \in \mathcal{M}(\Omega_j)$, for some $j \in J$, such that
$$\| G - F \|_{L^\infty(K)} < \varepsilon.$$
(If the $\Omega_j$ are the totality of all sets of $k$ directions, then this is the problem of approximating with $k$ arbitrary directions.)
• To each $\Omega_j$, let $r(\Omega_j)$ be the minimal degree of a non-trivial homogeneous polynomial that vanishes on $\Omega_j$. Then (Kroó)
$$\bigcup_{j \in J} \mathcal{M}(\Omega_j)$$
is dense in $C(\mathbb{R}^n)$, as explained above, if and only if
$$\sup_{j \in J} r(\Omega_j) = \infty.$$

Representation

• As previously, let $\Omega$ be any set of vectors in $\mathbb{R}^n$, and
$$\mathcal{M}(\Omega) = \operatorname{span}\{ f(\mathbf{a} \cdot \mathbf{x}) : \mathbf{a} \in \Omega, \text{ all } f \}.$$
The question we now ask is: What is $\mathcal{M}(\Omega)$ when it is not all of $C(\mathbb{R}^n)$?
• Let $\mathcal{P}(\Omega)$ be the set of all homogeneous polynomials that vanish on $\Omega$. Let $\mathcal{C}(\Omega)$ be the set of all polynomials $q$ such that
$$p(D) q = 0, \quad \text{all } p \in \mathcal{P}(\Omega),$$
where
$$p(D) := p\left( \frac{\partial}{\partial x_1}, \ldots, \frac{\partial}{\partial x_n} \right).$$

Representation

Theorem. On $C(\mathbb{R}^n)$, in the topology of uniform convergence on compact subsets, we have
$$\overline{\mathcal{M}(\Omega)} = \overline{\mathcal{C}(\Omega)}.$$

• Thus, for example, $g(\mathbf{b} \cdot \mathbf{x}) \in \overline{\mathcal{M}(\Omega)}$ for some $\mathbf{b}$ and all continuous $g$ if and only if all homogeneous polynomials vanishing on $\Omega$ also vanish at $\mathbf{b}$.
• For $n = 2$, $\Omega = \{(a_i, b_i)\}_{i=1}^{r}$ this gives us
$$F(x, y) = \sum_{i=1}^{r} f_i(a_i x + b_i y),$$
for arbitrary smooth $f_i$, if and only if
$$\prod_{i=1}^{r} \left( b_i \frac{\partial}{\partial x} - a_i \frac{\partial}{\partial y} \right) F = 0.$$

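The case $n = 2$, $r = 2$ can be checked numerically. The operator $b_i \, \partial/\partial x - a_i \, \partial/\partial y$ is the directional derivative along $(b_i, -a_i)$, so the product of two such operators can be approximated by a mixed central difference. The following is a rough sketch. The directions, the profiles, the step size `h`, and the comparison function `G` are all illustrative assumptions.

```python
import math

def mixed_difference(F, point, v1, v2, h=0.1):
    """Central second difference approximating D_{v1} D_{v2} F at point."""
    x, y = point

    def at(s1, s2):
        return F(x + h * (s1 * v1[0] + s2 * v2[0]),
                 y + h * (s1 * v1[1] + s2 * v2[1]))

    return (at(1, 1) - at(1, -1) - at(-1, 1) + at(-1, -1)) / (4.0 * h * h)

# Hypothetical pairwise linearly independent directions (a_i, b_i).
a1, b1 = 1.0, 2.0
a2, b2 = 3.0, -1.0

def F(x, y):
    # A sum of two ridge functions with the directions above.
    return math.sin(a1 * x + b1 * y) + math.exp(0.1 * (a2 * x + b2 * y))

def G(x, y):
    # A function that is not of this form, for comparison.
    return x * x * y

# b_i d/dx - a_i d/dy differentiates along (b_i, -a_i).
v1 = (b1, -a1)
v2 = (b2, -a2)

residual_F = mixed_difference(F, (0.4, -0.2), v1, v2)
residual_G = mixed_difference(G, (0.4, -0.2), v1, v2)
```

`residual_F` vanishes up to rounding error, because each summand of `F` is unchanged by shifts along the corresponding `v_i`, while `residual_G` is clearly nonzero.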
Smoothness

Assume
$$G(\mathbf{x}) = \sum_{i=1}^{r} f_i(\mathbf{a}^i \cdot \mathbf{x}),$$
where $r$ is finite, and the $\mathbf{a}^i$ are pairwise linearly independent fixed vectors in $\mathbb{R}^n$. If $G$ is of a certain smoothness class, what can we say about the smoothness of the $f_i$?

Smoothness - r = 1, r = 2

• Assume $G \in C^k(\mathbb{R}^n)$. If $r = 1$ there is nothing to prove. That is, assume
$$G(\mathbf{x}) = f_1(\mathbf{a}^1 \cdot \mathbf{x})$$
is in $C^k(\mathbb{R}^n)$ for some $\mathbf{a}^1 \neq \mathbf{0}$; then obviously $f_1 \in C^k(\mathbb{R})$.
• Let $r = 2$. As $\mathbf{a}^1$ and $\mathbf{a}^2$ are linearly independent, there exists a vector $\mathbf{c} \in \mathbb{R}^n$ satisfying $\mathbf{a}^1 \cdot \mathbf{c} = 0$ and $\mathbf{a}^2 \cdot \mathbf{c} = 1$. Thus
$$G(t\mathbf{c}) = f_1(\mathbf{a}^1 \cdot t\mathbf{c}) + f_2(\mathbf{a}^2 \cdot t\mathbf{c}) = f_1(0) + f_2(t).$$
As $G(t\mathbf{c})$ is in $C^k(\mathbb{R})$, as a function of $t$, so is $f_2$. The same result holds for $f_1$.

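The vector $\mathbf{c}$ in the case $r = 2$ can be constructed explicitly. One possible construction, used here as an assumption rather than taken from the lecture, removes from $\mathbf{a}^2$ its component along $\mathbf{a}^1$ and rescales. The sketch below does this in $\mathbb{R}^3$ with hypothetical profiles and checks the identity $G(t\mathbf{c}) = f_1(0) + f_2(t)$.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def separating_vector(a1, a2):
    """Return c with a1 . c = 0 and a2 . c = 1 for linearly independent a1, a2."""
    # Remove from a2 its component along a1, so that u is orthogonal to a1.
    coeff = dot(a2, a1) / dot(a1, a1)
    u = [a2i - coeff * a1i for a1i, a2i in zip(a1, a2)]
    scale = dot(a2, u)  # strictly positive when a1, a2 are linearly independent
    if abs(scale) < 1e-12:
        raise ValueError("a1 and a2 must be linearly independent")
    return [ui / scale for ui in u]

# Hypothetical directions in R^3 and hypothetical profiles f1, f2.
a1 = (1.0, 2.0, 0.0)
a2 = (0.0, 1.0, 1.0)
f1, f2 = math.cos, math.atan

def G(x):
    return f1(dot(a1, x)) + f2(dot(a2, x))

c = separating_vector(a1, a2)
t = 0.75
lhs = G([t * ci for ci in c])   # G(t c)
rhs = f1(0.0) + f2(t)           # f1(0) + f2(t)
```

Up to rounding error `lhs` equals `rhs`, so restricting `G` to the line through `c` recovers `f2` up to an additive constant.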