Empirical Comparisons of Fast Methods
Dustin Lang and Mike Klaas, University of British Columbia


1. Empirical Comparisons of Fast Methods (title slide). Dustin Lang and Mike Klaas, {dalang, klaas}@cs.ubc.ca, University of British Columbia. December 17, 2004.

2. A Map of Fast Methods.
   Sum-Kernel methods: Fast Multipole Method; Dual-Tree (KD-tree or Anchors); Gaussian-kernel methods: Fast Gauss Transform, Improved FGT; regular grid: Box Filter.
   Max-Kernel methods: Dual-Tree (KD-tree or Anchors); regular grid: Distance Transform.

3. The Role of Fast Methods. We claim that to be useful for other researchers, fast methods need:
   • guaranteed, adjustable error bounds: users can set the error bound low during development, then experiment once they know their code works;
   • no parameters that users must adjust (other than the error tolerance);
   • documented error behaviour: we must explain the properties of our approximation errors.
   (A sketch of an interface reflecting these requirements follows this list.)
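The requirements above suggest an API along the following lines. This is a minimal sketch in C; the struct layout and names are ours for illustration, not taken from the authors' code:

    /* Hypothetical interface sketch. The point it encodes is the slide's
     * requirement: the only user-tunable knob is the error tolerance. */
    typedef struct {
        const double *x;   /* N source points, D-dimensional, row-major */
        const double *y;   /* M target points, same layout              */
        const double *w;   /* N non-negative source weights             */
        int N, M, D;
        double h;          /* fixed Gaussian bandwidth                  */
    } nbody_problem;

    /* Must guarantee |f[j] - f_true[j]| <= epsilon for every target j;
     * no other parameters are exposed to the user. */
    int sum_kernel_approx(const nbody_problem *p, double epsilon, double *f);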

4. Testing Framework. We tested:

   Sum-Kernel: $f_j = \sum_{i=1}^{N} w_i \exp\!\left(\frac{-\lVert x_i - y_j \rVert_2^2}{h^2}\right)$

   Max-Kernel: $x_j^{*} = \operatorname*{argmax}_{i=1,\dots,N} \, w_i \exp\!\left(\frac{-\lVert x_i - y_j \rVert_2^2}{h^2}\right)$

   Gaussian kernel, fixed bandwidth $h$, non-negative weights $w_i$, $j = 1, \dots, N$.
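For reference, a minimal naive O(N²) C implementation of both problems as defined above (a sketch; function names are ours, the kernel and conventions follow the slide):

    #include <math.h>
    #include <float.h>

    /* Sum-Kernel: f[j] = sum_i w[i] * exp(-||x_i - y_j||^2 / h^2) */
    void sum_kernel_naive(const double *x, const double *y, const double *w,
                          int N, int D, double h, double *f)
    {
        for (int j = 0; j < N; j++) {
            double fj = 0.0;
            for (int i = 0; i < N; i++) {
                double d2 = 0.0;
                for (int k = 0; k < D; k++) {
                    double diff = x[i * D + k] - y[j * D + k];
                    d2 += diff * diff;
                }
                fj += w[i] * exp(-d2 / (h * h));
            }
            f[j] = fj;
        }
    }

    /* Max-Kernel: x*[j] = argmax_i w[i] * exp(-||x_i - y_j||^2 / h^2) */
    void max_kernel_naive(const double *x, const double *y, const double *w,
                          int N, int D, double h, int *argmax)
    {
        for (int j = 0; j < N; j++) {
            double best = -DBL_MAX;
            int best_i = 0;
            for (int i = 0; i < N; i++) {
                double d2 = 0.0;
                for (int k = 0; k < D; k++) {
                    double diff = x[i * D + k] - y[j * D + k];
                    d2 += diff * diff;
                }
                double v = w[i] * exp(-d2 / (h * h));
                if (v > best) { best = v; best_i = i; }
            }
            argmax[j] = best_i;
        }
    }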

5. Testing Framework (2). For the Sum-Kernel problem, we allow a given error tolerance ε: $|f_j - f_j^{\mathrm{true}}| \le \epsilon$ for each $j$ (a sketch of this check appears below). We tested:
   • Fast Gauss Transform (FGT)
   • Improved Fast Gauss Transform (IFGT)
   • Dual-Tree with kd-tree (KDtree)
   • Dual-Tree with a ball-tree constructed via the Anchors Hierarchy (Anchors)
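The per-target guarantee is easy to verify against the naive sum above; a minimal sketch (the function name is ours):

    #include <math.h>
    #include <stdio.h>

    /* Checks the slide's per-target guarantee |f[j] - f_true[j]| <= eps,
     * with f_true computed by the naive O(N^2) sum. Returns 1 on success. */
    int check_tolerance(const double *f_approx, const double *f_true,
                        int N, double eps)
    {
        double worst = 0.0;
        for (int j = 0; j < N; j++) {
            double err = fabs(f_approx[j] - f_true[j]);
            if (err > worst) worst = err;
        }
        printf("max |f_j - f_true|: %g (allowed %g)\n", worst, eps);
        return worst <= eps;
    }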

6. Methods Tested. The Fast Gauss Transform (FGT) code is by Firas Hamze of UBC. The KDtree and Anchors Dual-Tree code is by Dustin; the same Dual-Tree code was used for both KDtree and Anchors.

7. Methods Tested (2). Ramani Duraiswami and Changjiang Yang generously provided their code for the Improved Fast Gauss Transform (IFGT). To fit the IFGT into our testing framework, we had to devise a method for choosing its parameters; our method seems reasonable but is probably not optimal. All methods are implemented in C with Matlab bindings.

8. Results (1): A Worst-Case Scenario. Uniformly distributed points, uniformly distributed weights, 3 dimensions, large bandwidth h = 0.1, ε = 10⁻⁶: CPU time.
   [Figure: log-log plot of CPU time (s) versus N for Naive, FGT, IFGT, Anchors, and KDtree.]
   • Naive is usually fastest.
   • Only FGT is faster, and only by about 3×.
   • IFGT may eventually become faster, but only after 1.5 hours of compute time.
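The slides describe the data only as uniformly distributed; a sketch of such a test-problem generator, assuming the unit cube as the domain (the slides do not state it):

    #include <stdlib.h>

    /* Generates the worst-case configuration from the slide: N uniform
     * source and target points in D dimensions (assumed domain [0,1]^D)
     * and N uniform weights. */
    void make_uniform_problem(int N, int D, double *x, double *y, double *w)
    {
        for (int i = 0; i < N * D; i++) {
            x[i] = (double)rand() / RAND_MAX;   /* uniform sources */
            y[i] = (double)rand() / RAND_MAX;   /* uniform targets */
        }
        for (int i = 0; i < N; i++)
            w[i] = (double)rand() / RAND_MAX;   /* uniform weights */
    }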

9. Results (1): A Worst-Case Scenario. Same setting (uniform points and weights, 3 dimensions, h = 0.1, ε = 10⁻⁶): memory.
   [Figure: memory usage (bytes) versus N for FGT, IFGT, Anchors, and KDtree.]
   • Dual-Tree memory requirements are an issue.

10. Results (2). Uniformly distributed points, uniformly distributed weights, 3 dimensions, smaller bandwidth h = 0.01, ε = 10⁻⁶: CPU time.
    [Figure: CPU time (s) versus N for Naive, FGT, Anchors, and KDtree, with order-N and order-N√N reference lines.]
    • IFGT cannot be run: more than 10¹⁰ expansion terms would be required even for N = 100 points.
    • Dual-Tree and FGT are fast, but not O(N).

11. Results (2). Same setting: memory.
    [Figure: memory usage (bytes) versus N for FGT, Anchors, and KDtree.]
    • Memory requirements are still an issue.

12. Results (3). Uniform data and weights, N = 10,000, ε = 10⁻³, h = 0.01, varying dimension: CPU time.
    [Figure: CPU time (s) versus dimension for Naive, FGT, IFGT, Anchors, and KDtree.]
    • IFGT is very fast in 1D, infeasible beyond 2D.
    • KDtree and Anchors show (unexpected?) optimal behaviour around 3 or 4 dimensions.

13. Results (3). Uniform data and weights, N = 10,000, ε = 10⁻³, h = 0.01, varying dimension: memory usage.
    [Figure: memory usage (bytes) versus dimension for Naive, FGT, IFGT, Anchors, and KDtree.]

14. Results (4). Uniform sources, uniform targets, N = 10,000, h = 0.01, D = 3, varying error tolerance ε: CPU time.
    [Figure: CPU time versus ε, from 10⁻¹ down to 10⁻¹¹, for Naive, FGT, Anchors, and KDtree.]
    • The cost of the Dual-Tree methods increases slowly with accuracy.
    • The FGT cost rises more quickly.

15. Results (4). Uniform sources, uniform targets, N = 10,000, h = 0.01, D = 3, varying error tolerance ε: actual error achieved.
    [Figure: real error versus ε for FGT, Anchors, and KDtree.]
    • The error of the Dual-Tree methods is almost exactly as large as allowed (ε).
    • FGT (and presumably IFGT) overestimates the error, and thus does more work than required.

16-22. Clumpy Data. Uniform data is a worst-case scenario for these methods. Next: clumpy data! (A sequence of slides shows sample point sets at clumpiness 1.0, 1.1, 1.2, 1.3, 1.5, 2.0, and 3.0; one possible generator is sketched below.)
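The slides do not specify how clumpy data is generated, so the following is a purely hypothetical illustration: a mixture generator in which each point is either uniform or perturbed from an earlier point, with the clump probability derived from the clumpiness parameter C so that C = 1.0 reproduces uniform data:

    #include <stdlib.h>

    /* Hypothetical clumpy-data generator (ours, not the authors' method).
     * clumpiness C >= 1; sigma is an assumed clump-width parameter. */
    void make_clumpy(int N, int D, double clumpiness, double sigma, double *x)
    {
        double p_clump = 1.0 - 1.0 / clumpiness;  /* 0 at C = 1, grows with C */
        for (int i = 0; i < N; i++) {
            if (i == 0 || (double)rand() / RAND_MAX >= p_clump) {
                for (int k = 0; k < D; k++)       /* fresh uniform point */
                    x[i * D + k] = (double)rand() / RAND_MAX;
            } else {
                int parent = rand() % i;          /* attach near an earlier point */
                for (int k = 0; k < D; k++)
                    x[i * D + k] = x[parent * D + k]
                                 + sigma * (2.0 * rand() / RAND_MAX - 1.0);
            }
        }
    }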

23. Results (5): clumpy sources. Clumpy sources, uniform targets, N = 10,000, h = 0.01, D = 3, ε = 10⁻⁶, varying clumpiness: CPU time.
    [Figure: CPU time versus data clumpiness for Naive, FGT, Anchors, and KDtree.]
    • As clumpiness increases, the Dual-Tree methods get faster.

24. Results (5): clumpy sources. Same setting: CPU time relative to uniform data.
    [Figure: CPU usage relative to uniform data versus clumpiness for Naive, FGT, Anchors, and KDtree.]
    • The speedup is especially pronounced for Anchors.

25. Results (6): clumpy sources and targets. Clumpy sources, clumpy targets, N = 10,000, h = 0.01, D = 3, ε = 10⁻⁶, varying clumpiness: CPU time.
    [Figure: CPU time versus data clumpiness for Naive, FGT, Anchors, and KDtree.]
    • Even bigger improvements!

26. Results (6): clumpy sources and targets. Same setting: CPU time relative to uniform data.
    [Figure: CPU usage relative to uniform data versus clumpiness for Naive, FGT, Anchors, and KDtree.]
    • Large variance; perhaps due to the details of the particular clumpy data sets?

27. Results (7): clumpy data, varying dimensionality. Clumpy sources and targets (C = 2), N = 10,000, h = 0.01, ε = 10⁻³, varying dimension: CPU time.
    [Figure: CPU time (s) versus dimension for Naive, IFGT, Anchors, and KDtree.]
    • Not qualitatively different from uniform data!

28. Results (7): clumpy data, varying dimensionality. For reference: the non-clumpy results.
    [Figure: CPU time (s) versus dimension for Naive, FGT, IFGT, Anchors, and KDtree.]

29. Summary (1).
    • Synthetic-data tests; each algorithm is required to guarantee results within a given error tolerance.
    • IFGT: we devised a method of choosing its parameters; a different method might work better. Its error bounds seem to be very loose, so it does much more work than necessary.
