
Welcome back. Project comments are available on Glookup! Turn in homework! I am away April 15–20.


  1. Gaussians

Population 1: Gaussian with mean µ₁ ∈ Rᵈ, std deviation σ in each dim.
Population 2: Gaussian with mean µ₂ ∈ Rᵈ, std deviation σ in each dim.
Difference between humans: σ per SNP. Difference between populations: ε per SNP.
How many SNPs must we collect to determine the population of an individual x?
Say x is in population 1. Then E[‖x − µ₁‖²] = dσ² and E[‖x − µ₂‖²] ≥ (d − 1)σ² + ‖µ₁ − µ₂‖².
If ‖µ₁ − µ₂‖² = dε² ≫ σ², the expectations differ → take d ≫ σ²/ε².
Variance of the estimator? Roughly dσ⁴, so the noise (std deviation) is roughly √d σ².
The signal is the difference between the expectations, roughly dε².
Signal ≫ Noise ↔ dε² ≫ √d σ². Need d ≫ σ⁴/ε⁴.
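To make the calculation concrete, here is a minimal numpy sketch (not from the slides) of the squared-distance test with known means; the names mu1, mu2, sigma, eps mirror the slide's parameters, and the constant 20 standing in for "≫" is arbitrary.

```python
import numpy as np

# Squared-distance test with known means: classify x by which mean is closer.
rng = np.random.default_rng(0)
sigma, eps = 1.0, 0.1
d = 20 * round(sigma**4 / eps**4)   # d >> sigma^4/eps^4, the slide's requirement

mu1 = np.zeros(d)
mu2 = mu1 + eps                      # so ||mu1 - mu2||^2 = d * eps^2

x = mu1 + sigma * rng.standard_normal(d)   # individual drawn from population 1

d1 = np.sum((x - mu1) ** 2)          # expectation d*sigma^2
d2 = np.sum((x - mu2) ** 2)          # expectation d*sigma^2 + d*eps^2
print("guess population:", 1 if d1 < d2 else 2)
```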

  2. Projection

Population 1: Gaussian with mean µ₁ ∈ Rᵈ, std deviation σ in each dim.
Population 2: Gaussian with mean µ₂ ∈ Rᵈ, std deviation σ in each dim.
Difference between humans: σ per SNP. Difference between populations: ε per SNP.
Project x onto the unit vector v in the direction µ₂ − µ₁.
E[(x − µ₁) · v] = 0 if x is from population 1.
E[((x − µ₁) · v)²] ≥ ‖µ₁ − µ₂‖² if x is from population 2.
The noise (std deviation) is now σ², versus √d σ² before. No loss in signal!
So dε² ≫ σ² suffices → d ≫ σ²/ε², versus d ≫ σ⁴/ε⁴. A quadratic difference in the amount of data!
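A matching sketch of the projection test, again with hypothetical constants: note d is now only a multiple of σ²/ε², and the threshold halfway between the two projected means is my choice, not the slides'.

```python
import numpy as np

# Projection test: one-dimensional statistic along the direction mu2 - mu1.
rng = np.random.default_rng(1)
sigma, eps = 1.0, 0.1
d = 20 * round(sigma**2 / eps**2)    # d >> sigma^2/eps^2 now suffices

mu1 = np.zeros(d)
mu2 = mu1 + eps
v = (mu2 - mu1) / np.linalg.norm(mu2 - mu1)   # unit vector along mu2 - mu1

x = mu1 + sigma * rng.standard_normal(d)      # drawn from population 1

# The projection is ~N(0, sigma^2) for population 1 and
# ~N(||mu2 - mu1||, sigma^2) = N(sqrt(d)*eps, sigma^2) for population 2.
t = (x - mu1) @ v
print("guess population:", 1 if t < np.sqrt(d) * eps / 2 else 2)
```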

  3. Don’t know much about...

Don’t know µ₁ or µ₂?

  4. Without the means?

Sample of n people. Some (say half) from population 1, some from population 2. Which are which?
Near-neighbors approach: compute squared Euclidean distances and cluster using a threshold.
Signal: the gap E[d(x₁, y₁)] − E[d(x₁, x₂)] should be larger than the noise in d(x, y), where the x’s are from one population and the y’s from the other.
Signal ∝ dε². Noise ∝ √d σ².
d ≫ σ⁴/ε⁴ → people of the same type are closer to each other.
d ≫ (σ⁴/ε⁴) log n suffices for threshold clustering; the log n factor comes from a union bound over the (n choose 2) pairs.
Best one can do?
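A minimal sketch of the threshold clustering, under assumptions of my own: the generous constant 200, the specific eps, and a threshold placed halfway inside the expected gap are all illustrative choices.

```python
import numpy as np

# Threshold clustering on pairwise squared distances, means unknown.
rng = np.random.default_rng(2)
sigma, eps, n = 1.0, 0.25, 20
d = 200 * int((sigma**4 / eps**4) * np.log(n))   # d >> (sigma^4/eps^4) log n

mu1 = np.zeros(d)
mu2 = mu1 + eps
X = np.vstack([mu1 + sigma * rng.standard_normal((n // 2, d)),
               mu2 + sigma * rng.standard_normal((n // 2, d))])

# Pairwise squared distances: expectation 2*d*sigma^2 within a population,
# 2*d*sigma^2 + d*eps^2 across populations; noise is O(sqrt(d)*sigma^2).
sq = np.sum(X**2, axis=1)
D = sq[:, None] + sq[None, :] - 2 * X @ X.T

# Threshold halfway inside the gap; cluster = person 0's side vs. the rest.
same_as_0 = D[0] < 2 * d * sigma**2 + d * eps**2 / 2
print(np.where(same_as_0, 1, 2))    # expect ten 1's then ten 2's
```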

  5. Principal components analysis

Remember projection! Don’t know µ₁ or µ₂?
Principal component analysis: find the direction v of maximum variance.
Maximize ∑ (x · v)² over unit vectors v (zero-center the points first).
Recall: (x · v)² could determine the population.
Along a typical direction the variance is ≈ nσ²; along the direction µ₁ − µ₂ it is ∝ n‖µ₁ − µ₂‖² ∝ ndε².
Need d ≫ σ²/ε² at least.
When will PCA pick the correct direction with good probability? Union bound over directions. How many directions? Infinity… and beyond!
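A sketch of PCA recovering the direction, assuming sample sizes of my choosing; the top right-singular vector of the centered data matrix is the direction of maximum variance.

```python
import numpy as np

# The top principal component should align with mu2 - mu1.
rng = np.random.default_rng(3)
sigma, eps, n = 1.0, 0.2, 1000
d = 40 * round(sigma**2 / eps**2)       # d >> sigma^2/eps^2

mu1 = np.zeros(d)
mu2 = mu1 + eps
X = np.vstack([mu1 + sigma * rng.standard_normal((n // 2, d)),
               mu2 + sigma * rng.standard_normal((n // 2, d))])

A = X - X.mean(axis=0)                  # zero-center the points
_, _, Vt = np.linalg.svd(A, full_matrices=False)
v = Vt[0]                               # direction of maximum variance

true_dir = (mu2 - mu1) / np.linalg.norm(mu2 - mu1)
print("alignment |v . true_dir|:", abs(v @ true_dir))   # expect close to 1
```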

  6. Nets

“δ-net”: a set D of directions such that every direction v is close to some x ∈ D, i.e., x · v ≥ 1 − δ.
δ-net construction: vectors with coordinates iδ/d for integers i ∈ [−d/δ, d/δ].
Total of N ∝ (d/δ)^O(d) vectors in the net.
Signal ≫ Noise × log N, with log N = O(d log(d/δ)), isolates the correct direction; the log N is due to the union bound over the vectors in the net.
Signal (expected projection): ∝ ndε². Noise (std dev): ∝ √n σ².
So nd ≫ (σ⁴/ε⁴) log d together with d ≫ σ²/ε² works.
Nearest neighbor needs very high dimension, d ≫ σ⁴/ε⁴.
PCA can reduce to the “knowing the centers” case with a reasonable number of sample points.
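A toy sketch of the grid net in a small dimension; the slide's exact index range is ambiguous in the transcription, so the grid spacing, the near-zero cutoff, and the coverage check below are my assumptions, and the enumeration is exponential in d by design.

```python
import itertools
import numpy as np

# Build a candidate delta-net: grid vectors with coordinates i*delta/d,
# normalized to the unit sphere, then check coverage of a random direction.
d, delta = 3, 0.3
steps = np.arange(-d / delta, d / delta + 1) * (delta / d)   # i*delta/d

net = []
for coords in itertools.product(steps, repeat=d):
    x = np.array(coords)
    norm = np.linalg.norm(x)
    if norm > 0.5:              # skip near-zero vectors before normalizing
        net.append(x / norm)
net = np.array(net)
print("net size:", len(net))    # grows like (d/delta)^O(d)

rng = np.random.default_rng(4)
v = rng.standard_normal(d)
v /= np.linalg.norm(v)
best = np.max(net @ v)
print("best alignment:", best, ">= 1 - delta?", best >= 1 - delta)
```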

  7. PCA calculation

Matrix A whose rows are the data points.
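The deck is cut off here, but one standard way to carry out this calculation is power iteration on AᵀA; a minimal sketch (my choice of method, with made-up data) follows.

```python
import numpy as np

# Top principal component of A (one centered point per row) via power
# iteration on A^T A, without ever forming A^T A explicitly.
def top_component(A, iters=100):
    rng = np.random.default_rng(5)
    v = rng.standard_normal(A.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        v = A.T @ (A @ v)       # multiply by A^T A as two matvecs
        v /= np.linalg.norm(v)
    return v

A = np.array([[2.0, 0.1], [-1.9, 0.0], [1.8, -0.1], [-2.1, 0.2]])
A = A - A.mean(axis=0)          # zero-center, as in the PCA slide
print(top_component(A))         # roughly (+-1, ~0): the long axis
```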
