  1. Deterministic Distributed and Streaming Algorithms for Linear Algebra Problems
     Charlie Dickens. Joint work with Graham Cormode and David P. Woodruff.
     University of Warwick, Department of Computer Science. WPCCS, 30th June 2017

  2. Motivation
     ◮ Large data can be abstracted as a matrix, A ∈ R^{n×d}
     ◮ Matrices = Linear Algebra!
     ◮ But there are also some problems...
     ◮ Storage: the data may be too large to store
     ◮ Time complexity: 'efficient' polynomial-time algorithms might be too slow
     ◮ Instead of solving exactly, can we find 'efficient' algorithms which are more suitable for large-scale data analysis, perhaps allowing for some approximate solution?
     ◮ Randomised methods have been proposed, but are they necessary?

  3. Computation Models

     Streaming Model
     ◮ See data one item at a time
     ◮ Cannot store all of the data
     ◮ Want to optimise storage: sublinear in n
     ◮ Need to keep a running 'summary' of the data
     ◮ Use the summary to compute an approximation to the original problem

     Distributed Model
     ◮ Coordinator sends small blocks of input to worker nodes
     ◮ Worker nodes report back a summary of the data to the coordinator
     ◮ Coordinator computes an approximation to the original problem by using the summaries sent back

     Both models hinge on a small summary that can be updated and merged; a sketch of this pattern follows.
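The deck gives no code, but the summary pattern both models rely on can be illustrated for the p = 2 case. Below is a minimal Python sketch (the name CovarianceSketch and its interface are ours for illustration, not from the talk): it keeps only a d × d triangular factor R with RᵀR = AᵀA, updated one block at a time in a stream and merged across workers by the coordinator. The talk's deterministic algorithms for general p are more involved; this shows only the mergeable-summary pattern.

```python
import numpy as np

class CovarianceSketch:
    """Deterministic l2 summary of a tall matrix A.

    Maintains a d x d triangular factor R with R^T R = A^T A, so
    ||A x||_2 = ||R x||_2 exactly. Illustration for p = 2 only.
    """

    def __init__(self, d):
        self.d = d
        self.R = np.zeros((d, d))

    def update(self, block):
        # Streaming step: absorb a block of rows, keeping only the
        # d x d factor. QR of the stacked matrix preserves the Gram
        # matrix: [R; block]^T [R; block] = R^T R + block^T block.
        stacked = np.vstack([self.R, np.atleast_2d(block)])
        self.R = np.linalg.qr(stacked, mode="r")
        return self

    def merge(self, other):
        # Distributed step: the coordinator combines two workers'
        # summaries by the same stacking trick.
        merged = CovarianceSketch(self.d)
        merged.R = np.linalg.qr(np.vstack([self.R, other.R]), mode="r")
        return merged

# Streaming: process A one block at a time.
rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 5))
s = CovarianceSketch(5)
for block in np.array_split(A, 10):
    s.update(block)

# Distributed: two workers summarise halves, the coordinator merges.
s1 = CovarianceSketch(5).update(A[:500])
s2 = CovarianceSketch(5).update(A[500:])
s12 = s1.merge(s2)

# Either summary preserves ||A x||_2 up to floating-point error.
x = rng.standard_normal(5)
assert np.allclose(np.linalg.norm(A @ x), np.linalg.norm(s.R @ x))
assert np.allclose(np.linalg.norm(A @ x), np.linalg.norm(s12.R @ x))
```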

  4. Summary of Results
     Previous results are specific to p = 2. Our results are the first deterministic algorithms which generalise to an arbitrary p-norm (where applicable).

     Problem                        | Solution type      | Time                 | Space
     High Leverage Scores           | 1/poly(d) additive | O(nd^2 + nd^5 log n) | poly(d)
     ℓp-regression (p ≠ ∞)          | poly(d) relative   | poly(nd)^{O(1/γ)}    | n^γ d
     ℓ∞-regression                  | ε‖b‖_p additive    | poly(nd^5)           | d^{O(p)}/ε^{O(1)}
     ℓ1 low-rank (k) approximation  | poly(k) relative   | poly(nd)^{O(1/γ)}    | n^γ poly(d)

  5. Main Algorithmic Techniques: well-conditioned basis
     Much of the work relies on the notion of a well-conditioned basis (wcb). A matrix U is a wcb for the column space of A if:
     ◮ ‖U‖_p ≤ α
     ◮ for all z, ‖z‖_q ≤ β‖Uz‖_p, where q is the dual norm to p
     ◮ α and β are at most poly(d).
     Mahoney et al. show that a change-of-basis matrix R can be computed in polynomial time such that AR is well-conditioned.
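For intuition, the p = 2 case is straightforward: a QR decomposition already yields an orthonormal basis, which is well-conditioned with α = √d and β = 1. The sketch below is our illustration of that special case, not the talk's general-p polynomial-time construction; it computes such an R and checks both wcb conditions.

```python
import numpy as np

def change_of_basis_p2(A):
    """Return R such that U = A @ R has orthonormal columns, giving a
    well-conditioned basis for p = q = 2: the entrywise l2 norm is
    ||U||_2 = sqrt(d) and ||z||_2 = ||U z||_2, so alpha = sqrt(d),
    beta = 1. Assumes A has full column rank so the QR factor inverts."""
    _, R_factor = np.linalg.qr(A)    # A = Q R_factor with Q orthonormal
    return np.linalg.inv(R_factor)   # A @ R_factor^{-1} = Q

rng = np.random.default_rng(1)
A = rng.standard_normal((200, 4))
R = change_of_basis_p2(A)
U = A @ R

alpha = np.linalg.norm(U)            # Frobenius norm = sqrt(4) = 2
z = rng.standard_normal(4)
assert np.linalg.norm(z) <= 1.001 * np.linalg.norm(U @ z)   # beta = 1
print("alpha =", alpha)
```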

  6. Main Algorithmic Techniques: high leverage rows
     Let U = AR for a change-of-basis matrix R. Then the full ℓp-leverage scores are w_i = ‖(AR)_i‖_p^p; local leverage scores are defined in the same way on a block of rows.
     Problem: Can the rows of high leverage be found without reading the whole matrix?
     Theory: Rows with high global leverage scores have high local leverage scores up to poly(d) factors: ŵ_i ≥ w_{i′}/poly(d).
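A hedged sketch of this idea for p = 2 (the function names and the threshold below are ours for illustration; the talk's deterministic algorithm is more careful about the poly(d) slack): compute local leverage scores block by block and keep only rows whose local score is large, so the whole matrix never has to be held at once.

```python
import numpy as np

def leverage_scores(A, p=2):
    """lp leverage scores w_i = ||(A R)_i||_p^p using the p = 2
    change of basis from QR, so that A @ R = Q has orthonormal
    columns. Illustrative only; the general-p construction differs."""
    Q, _ = np.linalg.qr(A)
    return np.sum(np.abs(Q) ** p, axis=1)

def high_leverage_candidates(A, num_blocks, threshold):
    """Stream over row blocks, computing *local* leverage scores and
    keeping rows whose local score clears the threshold. This uses the
    slide's theory: a row with high global leverage also has high local
    leverage (up to poly(d) factors), so it survives the local filter.
    Each block needs at least d rows for the local QR to make sense."""
    candidates, offset = [], 0
    for block in np.array_split(A, num_blocks):
        local = leverage_scores(block)
        candidates.extend(int(offset + j) for j in np.flatnonzero(local >= threshold))
        offset += block.shape[0]
    return candidates

rng = np.random.default_rng(2)
A = rng.standard_normal((1000, 3))
A[123] = [100.0, 0.0, 0.0]     # plant one high-leverage row
print(high_leverage_candidates(A, num_blocks=10, threshold=0.5))
```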
