  1. Scientific Computing Maastricht Science Program Week 4 Frans Oliehoek <frans.oliehoek@maastrichtuniversity.nl>

  2. Recap Last Week ● Approximation of Data and Functions: find a function f mapping x → y ● Interpolation: f goes through the data points (piecewise or not) ● Linear regression: lossy fit, minimizes SSE ● Linear Algebra: solving systems of linear equations (GEM, LU factorization)

  3. Recap Least-Squares Method ● number of data points: $N = n + 1$ ● the function is unknown; it is only known at certain points $(x_0, y_0), (x_1, y_1), \dots, (x_n, y_n)$ ● we want to predict $y$ given $x$ ● Least-Squares Regression: find a function that minimizes the prediction error ● better for noisy data.

  4. Recap Least-Squares Method ● Minimize the sum of the squares of the errors: $\tilde{y} = \tilde{f}(x) = a_0 + a_1 x$, $\mathrm{SSE}(\tilde{f}) = \sum_{i=0}^{n} \bigl[\tilde{f}(x_i) - y_i\bigr]^2$ ● pick the $\tilde{f}$ with minimum SSE (that means: pick $a_0, a_1$)
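To make the recap concrete, here is a minimal MATLAB/Octave sketch of a straight-line least-squares fit; it is not part of the slides, and the sample data, variable names, and the use of polyfit are my own choices:

% Fit y ~ a0 + a1*x by least squares (illustrative sample data)
x = [0 1 2 3 4 5]';              % sample points x_i
y = [0.1 0.9 2.1 2.9 4.2 4.8]';  % noisy observations y_i
p = polyfit(x, y, 1);            % degree-1 fit; returns [a1 a0]
a1 = p(1);  a0 = p(2);
yhat = a0 + a1*x;                % predictions of the fitted line
SSE  = sum((yhat - y).^2);       % sum of squared errors being minimized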

  5. This Lecture ● Last week: labeled data (also 'supervised learning'); data: (x, y)-pairs ● This week: unlabeled data (also 'unsupervised learning'); data: just x ● Finding structure in data ● 2 main methods: Clustering and Principal Component Analysis (PCA)

  6. Part 1: Clustering

  7. Clustering ● data set $\{(x^{(0)}, y^{(0)}), \dots, (x^{(n)}, y^{(n)})\}$ ● but now: unlabeled: $\{(x_1^{(0)}, x_2^{(0)}), \dots, (x_1^{(n)}, x_2^{(n)})\}$ ● now what? structure? summarize this data?

  8. Clustering ● data set $\{(x^{(0)}, y^{(0)}), \dots, (x^{(n)}, y^{(n)})\}$ ● but now: unlabeled: $\{(x_1^{(0)}, x_2^{(0)}), \dots, (x_1^{(n)}, x_2^{(n)})\}$ ● now what? structure? summarize this data?

  9. Clustering ● data set $\{(x_1^{(0)}, x_2^{(0)}), \dots, (x_1^{(n)}, x_2^{(n)})\}$ ● try to find the different clusters! ● How?

  10. Clustering ● data set $\{(x_1^{(0)}, x_2^{(0)}), \dots, (x_1^{(n)}, x_2^{(n)})\}$ ● try to find the different clusters! ● One way: find centroids

  11. Clustering – Applications  Clustering or Cluster Analysis has many applications ● Understanding: Astronomy (new types of stars); Biology (create taxonomies of living things, clustering based on genetic information); Climate (find patterns in atmospheric pressure); etc. ● Data (pre)processing: summarization of the data set, compression

  12. Cluster Methods ● Many types of clustering! ● We will treat one method: k-Means clustering (the standard textbook method; not necessarily the best, but the simplest) ● You will implement k-Means and use it to compress an image

  13. k-Means Clustering ● The main idea: clusters are represented by 'centroids' ● start with random centroids ● then repeatedly: find all data points that are nearest to each centroid, and update each centroid based on its data points

  14.–19. k-Means Clustering: Example (six figure-only slides showing successive iterations of the algorithm: data points are assigned to their nearest centroid, then each centroid is updated based on its assigned points)

  20. k-Means Algorithm

%% k-means PSEUDO CODE
%
% X         - the data
% centroids - initial centroids
%             (given by random initialization on data points)
iterations = 1;
done = 0;
while (~done && iterations < max_iters)
    labels    = NearestCentroids(X, centroids);
    centroids = UpdateCentroids(X, labels);
    iterations = iterations + 1;
    if centroids did not change
        done = 1;
    end
end
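The slide gives only pseudocode. A minimal runnable sketch of the two helper routines in MATLAB/Octave follows; the function names come from the pseudocode above, but their bodies, and the idea of testing convergence with isequal on the old and new centroids, are my own assumptions:

function labels = NearestCentroids(X, centroids)
% For each row (data point) of X, return the index of the closest
% centroid, using squared Euclidean distance.
K = size(centroids, 1);
N = size(X, 1);
dists = zeros(N, K);
for k = 1:K
    d = bsxfun(@minus, X, centroids(k, :));   % differences to centroid k
    dists(:, k) = sum(d.^2, 2);               % squared distances
end
[~, labels] = min(dists, [], 2);              % nearest centroid per point
end

function centroids = UpdateCentroids(X, labels)
% Move each centroid to the mean of the points assigned to it.
% (A centroid that has lost all its points would need special handling.)
K = max(labels);
centroids = zeros(K, size(X, 2));
for k = 1:K
    centroids(k, :) = mean(X(labels == k, :), 1);
end
end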

  21. Part 2: Principal Component Analysis

  22. Dimension Reduction ● Clustering allows us to summarize data using centroids: the summary of a point is which cluster it belongs to. ● Different idea: $(x_1, x_2, \dots, x_D) \rightarrow (z_1, z_2, \dots, z_d)$ ● reduce the number of variables, i.e., reduce the number of dimensions from $D$ to $d$, with $d < D$

  23. Dimension Reduction ● Clustering allows us to summarize data using centroids: the summary of a point is which cluster it belongs to. ● Different idea: $(x_1, x_2, \dots, x_D) \rightarrow (z_1, z_2, \dots, z_d)$ ● reduce the number of variables, i.e., reduce the number of dimensions from $D$ to $d$, with $d < D$ ● This is what Principal Component Analysis (PCA) does.

  24. PCA – Goals ● $N = n + 1$ ● Given a data set X of N data points of D variables → convert it to a data set Z of N data points of d variables:
  $(x_1^{(0)}, x_2^{(0)}, \dots, x_D^{(0)}) \rightarrow (z_1^{(0)}, z_2^{(0)}, \dots, z_d^{(0)})$
  $(x_1^{(1)}, x_2^{(1)}, \dots, x_D^{(1)}) \rightarrow (z_1^{(1)}, z_2^{(1)}, \dots, z_d^{(1)})$
  ...
  $(x_1^{(n)}, x_2^{(n)}, \dots, x_D^{(n)}) \rightarrow (z_1^{(n)}, z_2^{(n)}, \dots, z_d^{(n)})$

  25. PCA – Goals ● Given a data set X of N data points of D variables → convert it to a data set Z of N data points of d variables:
  $(x_1^{(0)}, x_2^{(0)}, \dots, x_D^{(0)}) \rightarrow (z_1^{(0)}, z_2^{(0)}, \dots, z_d^{(0)})$
  $(x_1^{(1)}, x_2^{(1)}, \dots, x_D^{(1)}) \rightarrow (z_1^{(1)}, z_2^{(1)}, \dots, z_d^{(1)})$
  ...
  $(x_1^{(n)}, x_2^{(n)}, \dots, x_D^{(n)}) \rightarrow (z_1^{(n)}, z_2^{(n)}, \dots, z_d^{(n)})$
  The vector $(z_i^{(0)}, z_i^{(1)}, \dots, z_i^{(n)})$ is called the $i$-th principal component (of the data set).

  26. PCA – Goals ● Given a data set X of N data points of D variables → convert it to a data set Z of N data points of d variables:
  $(x_1^{(0)}, x_2^{(0)}, \dots, x_D^{(0)}) \rightarrow (z_1^{(0)}, z_2^{(0)}, \dots, z_d^{(0)})$
  $(x_1^{(1)}, x_2^{(1)}, \dots, x_D^{(1)}) \rightarrow (z_1^{(1)}, z_2^{(1)}, \dots, z_d^{(1)})$
  ...
  $(x_1^{(n)}, x_2^{(n)}, \dots, x_D^{(n)}) \rightarrow (z_1^{(n)}, z_2^{(n)}, \dots, z_d^{(n)})$
  The vector $(z_i^{(0)}, z_i^{(1)}, \dots, z_i^{(n)})$ is called the $i$-th principal component (of the data set).
  ● PCA performs a linear transformation: the variables $z_i$ are linear combinations of $x_1, \dots, x_D$
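To spell out 'linear combinations' in one formula (my own addition, using standard PCA notation rather than the slides'): if the d chosen directions are collected as the columns of a matrix $U = [u_1, \dots, u_d]$ of size $D \times d$, with each $u_i$ a unit vector, then the transform of the $k$-th point (typically after mean-centering the data) is

$z^{(k)} = U^{\top} x^{(k)}, \qquad \text{i.e.} \qquad z_i^{(k)} = (u_i, x^{(k)}) = \sum_{j=1}^{D} u_{ij}\, x_j^{(k)},$

which matches the scalar-projection formula used later for the 2-D case.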

  27. PCA Goals – 2 ● Of course, many transformations are possible... ● Reducing the number of variables means a loss of information ● PCA makes this loss minimal ● PCA is very useful: exploratory analysis of the data, visualization of high-dimensional data, data preprocessing, data compression

  28. PCA – Intuition ● How would you summarize this data using 1 dimension? (which variable contains the most information?)  [figure: data scattered in the $(x_1, x_2)$ plane]

  29. PCA – Intuition ● How would you summarize this data using 1 dimension? (which variable contains the most information?) ● Very important idea: the most information is contained by the variable with the largest spread, i.e., the highest variance (Information Theory)

  30. PCA – Intuition ● How would you summarize this data using 1 dimension? (which variable contains the most information?) ● Very important idea: the most information is contained by the variable with the largest spread, i.e., the highest variance (Information Theory) ● So if we have to choose between $x_1$ and $x_2$ → remember $x_2$ ● Transform of the $k$-th point: $(x_1^{(k)}, x_2^{(k)}) \rightarrow (z_1^{(k)})$, where $z_1^{(k)} = x_2^{(k)}$

  31. PCA – Intuition ● How would you summarize this data using 1 dimension? (which variable contains the most information?) ● So if we have to choose between $x_1$ and $x_2$ → remember $x_2$ ● Transform of the $k$-th point: $(x_1^{(k)}, x_2^{(k)}) \rightarrow (z_1^{(k)})$, where $z_1^{(k)} = x_2^{(k)}$ ● Example: $z_1^{(k)} = 1.5$
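As a tiny MATLAB/Octave illustration of 'keep the variable with the largest variance' (my own sketch, not from the slides; it assumes the data is stored as a matrix X with one point per row):

v = var(X);          % variance of each variable (each column of X)
[~, j] = max(v);     % variable with the largest spread
z1 = X(:, j);        % 1-D summary: z_1^(k) = x_j^(k) for every point k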

  32. PCA – Intuition ● Reconstruction based on $x_2$ → we only need to remember the mean of $x_1$

  33. PCA – Intuition ● How would you summarize this data using 1 dimension?  [figure: data in the $(x_1, x_2)$ plane]

  34. PCA – Intuition ● How would you summarize this data using 1 dimension? ● This is a projection on the $x_1$ axis.  [figure: the data projected onto the $x_1$ axis]

  35. Question ● Suppose the data is now 3-dimensional: $x = (x_1, x_2, x_3)$ ● Can you think of an example where we could project it to 2 dimensions: $(x_1, x_2, x_3) \rightarrow (z_1, z_2)$?

  36. PCA – Intuition ● How would you summarize this data using 1 dimension?  [figure: data in the $(x_1, x_2)$ plane]

  37. PCA – Intuition ● How would you summarize this data using 1 dimension? ● More difficult... projection onto either axis does not give nice results. ● Idea of PCA: find a new direction to project on!

  38. PCA – Intuition ● How would you summarize this data using 1 dimension? ● More difficult... projection onto either axis does not give nice results. ● Idea of PCA: find a new direction to project on!

  39. PCA – Intuition ● How would you summarize this data using 1 dimension? ● $u$ is the direction of highest variance, e.g., $u = (1, 1)$ ● we will assume it is a unit vector (length = 1): $u = (0.71, 0.71)$

  40. PCA – Intuition ● How would you summarize this data using 1 dimension? ● Transform of the $k$-th point: $(x_1^{(k)}, x_2^{(k)}) \rightarrow (z_1^{(k)})$, where $z_1$ is the orthogonal scalar projection onto $u$: $z_1^{(k)} = u_1 x_1^{(k)} + u_2 x_2^{(k)} = (u, x^{(k)})$

  41. PCA – Intuition ● How would you summarize this data using 1 dimension? ● Transform of the $k$-th point: $(x_1^{(k)}, x_2^{(k)}) \rightarrow (z_1^{(k)})$, where $z_1$ is the orthogonal scalar projection onto $u$: $z_1^{(k)} = u_1 x_1^{(k)} + u_2 x_2^{(k)} = (u, x^{(k)})$ ● Note: the general formula for the projection onto $u$ is $(u, x^{(k)}) / (u, u)$; however, when $u$ is a unit vector, $(u, u) = 1$, so we can use the simplified formula above.
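Putting the intuition together, here is a short MATLAB/Octave sketch (my own, not from the slides) that finds the direction u of highest variance as the leading eigenvector of the sample covariance matrix and projects the data onto it; the variable names and the use of cov and eig are my own choices:

% X: N-by-2 data matrix, one data point per row (assumed layout)
mu = mean(X, 1);
Xc = X - repmat(mu, size(X, 1), 1);   % center the data

C = cov(Xc);                % 2-by-2 sample covariance matrix
[V, E] = eig(C);            % columns of V are unit-length eigenvectors
[~, i] = max(diag(E));      % pick the largest eigenvalue
u = V(:, i);                % direction of highest variance (unit vector)

z1 = Xc * u;                % scalar projections z_1^(k) = (u, x^(k))
Xrec = repmat(mu, size(X, 1), 1) + z1 * u';   % 1-D reconstruction in the original space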
