
Why LASSO, Ridge Regression, and EN: Explanation Based on Soft Computing



  1. Why LASSO, Ridge Regression, and EN: Explanation Based on Soft Computing
     Woraphon Yamaka (1), Hamza Alkhatib (2), Ingo Neumann (2), and Vladik Kreinovich (3)
     (1) Faculty of Economics, Chiang Mai University, Chiang Mai, Thailand, woraphon.econ@gmail.com
     (2) Geodesic Institute, Leibniz University of Hannover, Hannover, Germany, alkhatib@gih.uni-hannover.de, neumann@gih.uni-hannover.de
     (3) Department of Computer Science, University of Texas at El Paso, El Paso, Texas 79968, USA, vladik@utep.edu

  2. Need for Regularization
     • In practice, in addition to measurement results, we often use imprecise expert knowledge.
     • For example, physicists usually believe that:
       – when the value of a physical quantity $x$ is small,
       – we expand the dependence $y = f(x)$ of some other quantity $y$ on $x$ in Taylor series, and
       – ignore quadratic and higher order terms in this expansion.
     • The usual argument is that:
       – when $x$ is small,
       – its square $x^2$ is so much smaller than $x$ that it can safely be ignored.

  3. Need for Regularization (cont-d)
     • This is indeed true:
       – if $x = 10\% = 0.1$, then $x^2 = 0.01 \ll 0.1$;
       – if $x = 1\% = 0.01$, then we can say that $x^2 = 0.0001 \ll x = 0.01$ with even higher confidence.
     • However, from the purely mathematical viewpoint, this argument is not fully convincing.
     • Indeed, the quadratic term in the Taylor expansion is not $x^2$, but $a_2 \cdot x^2$ for some coefficient $a_2$.
     • From the purely mathematical viewpoint, this coefficient $a_2$ can be huge.
     • In this case the product $a_2 \cdot x^2$ will also be big, and we will not be able to ignore it.
     • From the physicist’s viewpoint, however, this argument is valid.
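The arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch: the helper name `quadratic_to_linear_ratio` and the coefficient values `a1 = 1`, `a2 = 2`, and `a2 = 1e6` are hypothetical choices made for illustration and do not come from the slides.

```python
def quadratic_to_linear_ratio(a1: float, a2: float, x: float) -> float:
    """Size of the quadratic Taylor term a2*x^2 relative to the linear term a1*x."""
    return abs(a2 * x**2) / abs(a1 * x)


x = 0.01  # a "small" value of the quantity, as on the slide

# Hypothetical coefficient values, chosen only for illustration.
ratio_modest = quadratic_to_linear_ratio(a1=1.0, a2=2.0, x=x)
ratio_huge = quadratic_to_linear_ratio(a1=1.0, a2=1e6, x=x)

print(f"a2 = 2:   quadratic/linear = {ratio_modest:.4g}")
print(f"a2 = 1e6: quadratic/linear = {ratio_huge:.4g}")
```

With a moderate coefficient the quadratic term is a small fraction of the linear one, while with a huge coefficient it dominates even though $x$ itself is small. This is why the argument needs the extra assumption that coefficients are reasonably small.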

  4. Need for Regularization (cont-d)
     • Indeed, physicists usually assume that the coefficients cannot be too large; they must be reasonably small.
     • This imprecise additional assumption underlies many successes of physics.
     • It can also be used as a supplement to measurements when we estimate the values of physical quantities.
     • This is common sense.
     • Sometimes, after applying some mathematical techniques, we get too large values of some parameters.
     • This usually means that something is not right:
       – either with our method
       – or with some measurement results (they may be outliers).

  5. Need for Regularization (cont-d)
     • In simple cases, it is clear that if we have a record of temperature in some area,
       – and we see 17, 18, 19, 18, 17, and then suddenly 42 degrees,
       – we should get very suspicious, especially if the next day we again have a high of 19.
     • Physicists’ intuition is great, but we cannot always rely on this intuition.
     • There are many problems that need solving.
     • It is not realistic to expect to have a skilled physicist for each such problem.
     • How to deal with situations when a professional physicist is not available?

  6. Need for Regularization (cont-d)
     • We need to have a precise description of:
       – what we mean
       – when we say that the coefficients $a_0, \ldots, a_n$ describing a model must be reasonably small.
     • Such descriptions are known as regularization.

  7. Which Regularizations Are Currently Used
     • Out of many possible regularizations, the following three techniques have been most empirically successful:
       – the LASSO technique, in which we limit the sum of the absolute values $\sum_{i=1}^{n} |a_i|$;
       – the ridge regression method, in which we limit the sum of the squares $\sum_{i=0}^{n} a_i^2$; and
       – the Elastic Net (EN) method, in which we limit a linear combination of the above two sums.
     • Why?
     • In this paper, we show that:
       – a natural formalization of commonsense intuition
       – indeed leads to these three regularization techniques.
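The three quantities being limited can be written out directly. The following is a minimal sketch, not the authors' code: the function names, the weights `c1` and `c2` of the linear combination, and the sample coefficient values are all hypothetical. Each function sums over whatever coefficients it is given, so which coefficients enter the sum is left to the caller.

```python
from typing import Sequence


def lasso_penalty(coeffs: Sequence[float]) -> float:
    """Sum of absolute values of the coefficients (the quantity LASSO limits)."""
    return sum(abs(a) for a in coeffs)


def ridge_penalty(coeffs: Sequence[float]) -> float:
    """Sum of squares of the coefficients (the quantity ridge regression limits)."""
    return sum(a * a for a in coeffs)


def elastic_net_penalty(coeffs: Sequence[float], c1: float, c2: float) -> float:
    """Linear combination of the two sums; c1 and c2 are hypothetical weights."""
    return c1 * lasso_penalty(coeffs) + c2 * ridge_penalty(coeffs)


# Hypothetical coefficient values, for illustration only.
coeffs = [0.5, -2.0, 0.0, 1.5]

print("LASSO sum:", lasso_penalty(coeffs))
print("Ridge sum:", ridge_penalty(coeffs))
print("EN combination:", elastic_net_penalty(coeffs, c1=1.0, c2=0.5))
```

Setting `c2 = 0` recovers the LASSO quantity and setting `c1 = 0` recovers the ridge quantity, which is one way to see EN as covering both.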

  8. Need for Degrees of Confidence
     • Precise statements like “$x$ is larger than 5” are either true or false.
     • In contrast, imprecise statements like “$x$ is reasonably small” are not well-defined.
     • For some values $x$, for example, for $x = 0.0001$, the expert is absolutely sure that $x$ is small.
     • For other values like $x = 10^7$, the expert is usually absolutely sure that this value is not reasonably small.
     • However, for intermediate values $x$:
       – the expert is usually not 100% sure whether this value is indeed reasonably small;
       – he or she is only sure to some degree.

  9. Need for Degrees of Confidence (cont-d)
     • It is therefore reasonable to ask the expert to assign:
       – to each value $x$,
       – a degree $\mu(x)$ to which this expert believes that $x$ is reasonably small.
     • We can use different scales for such degrees.
     • In the computer, “absolutely true” is usually described as 1, and “absolutely false” as 0.
     • So, it is convenient to use a scale from 0 to 1 for such degrees.
     • This assignment is one of the main ideas behind fuzzy logic.
     • This technique was specifically developed to deal with such imprecision.

  10. Need for Degrees of Confidence (cont-d)
     • This way, we can assign:
       – to each imprecise statement,
       – a function $\mu(x)$ that describes to what degree this statement is satisfied for each value $x$.
     • This function is known as a membership function or a fuzzy set.
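A membership function for “$x$ is reasonably small” can be sketched as follows. The slides do not specify any particular shape, so the bell-shaped formula and the `scale` parameter below are purely illustrative assumptions. The only requirements taken from the slides are that degrees lie between 0 and 1, are close to 1 for $x = 0.0001$, and are close to 0 for $x = 10^7$.

```python
def mu_reasonably_small(x: float, scale: float = 1.0) -> float:
    """Degree in [0, 1] to which x is 'reasonably small'.

    The bell shape 1 / (1 + (|x| / scale)^2) and the 'scale' parameter
    are illustrative assumptions, not taken from the slides.
    """
    return 1.0 / (1.0 + (abs(x) / scale) ** 2)


# Degrees for a clearly small value, some intermediate ones, and a clearly large one.
for x in (0.0001, 0.5, 1.0, 5.0, 1e7):
    print(f"mu({x:g}) = {mu_reasonably_small(x):.6f}")
```

In practice the shape would come from asking the expert for degrees at several values of $x$, not from a formula chosen in advance.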

  11. Need for “And”- and “Or”-Operations
     • Often, experts make complex statements.
     • For example, they may say that $x$ is reasonably small, but not very small.
     • This statement is obtained:
       – from the basic statements “$x$ is reasonably small” and “$x$ is very small”
       – by applying connectives “not” and “but” (which here means the same as “and”).
     • In general:
       – we can use connectives “and”, “or”, and “not”
       – to combine elementary statements into a composite one.
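The composite statement “$x$ is reasonably small, but not very small” can be sketched in code. This slide does not say which “and”, “or”, and “not” operations to use, so the choices below (minimum, maximum, and $1 - d$, which are common in fuzzy logic) are assumptions. The bell-shaped membership functions and their scales are also hypothetical.

```python
def bell(x: float, scale: float) -> float:
    """Illustrative membership shape: 1 near x = 0, decreasing to 0 (an assumption)."""
    return 1.0 / (1.0 + (abs(x) / scale) ** 2)


def mu_small(x: float) -> float:
    return bell(x, scale=1.0)   # hypothetical scale for "reasonably small"


def mu_very_small(x: float) -> float:
    return bell(x, scale=0.01)  # hypothetical scale for "very small"


# Assumed connectives: min for "and", max for "or", 1 - d for "not".
def f_and(d1: float, d2: float) -> float:
    return min(d1, d2)


def f_or(d1: float, d2: float) -> float:
    return max(d1, d2)


def f_not(d: float) -> float:
    return 1.0 - d


def small_but_not_very_small(x: float) -> float:
    """Degree of 'x is reasonably small AND NOT (x is very small)'."""
    return f_and(mu_small(x), f_not(mu_very_small(x)))


for x in (0.0001, 0.1, 100.0):
    print(f"x = {x:g}: degree = {small_but_not_very_small(x):.4f}")
```

Under these assumptions the degree is high only for intermediate values such as $x = 0.1$: very small values fail the “not very small” part, and large values fail the “reasonably small” part.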
