Algorithms for Big Data (X)
Chihao Zhang, Shanghai Jiao Tong University, Nov. 22, 2019


  1. Algorithms for Big Data (X). Chihao Zhang, Shanghai Jiao Tong University. Nov. 22, 2019. (1/10)

  2. Matrix Multiplication (2/10)
     Given two matrices $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{n \times p}$, we compute $C = AB$. For $m = n = p$, the naive algorithm costs $O(n^3)$ multiplication operations. Strassen's algorithm reduces the cost to $O(n^{2.81})$. The best algorithm so far costs $O(n^{\omega})$ where $\omega < 2.3728639$. Today we will introduce a Monte-Carlo algorithm to approximate $AB$.

  3. Review of Linear Algebra (3/10)
     Assume $A = (a_1, \dots, a_n)$ and $B = \begin{pmatrix} b_1^T \\ \vdots \\ b_n^T \end{pmatrix}$. Then $AB = \sum_{i=1}^{n} a_i b_i^T$, where each $a_i b_i^T$ is of rank $1$. The Frobenius norm of a matrix $A = (a_{ij})_{1 \le i \le m,\, 1 \le j \le n}$ is
     $$\|A\|_F \triangleq \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^2}.$$
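A quick numerical check of these two facts, the rank-1 outer-product decomposition of $AB$ and the Frobenius norm, as a NumPy sketch (not part of the slides; names and dimensions are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, p = 4, 6, 3
A = rng.standard_normal((m, n))
B = rng.standard_normal((n, p))

# AB as a sum of n rank-1 outer products: columns of A times rows of B.
outer_sum = sum(np.outer(A[:, i], B[i, :]) for i in range(n))
assert np.allclose(outer_sum, A @ B)

# Frobenius norm: square root of the sum of squared entries.
assert np.isclose(np.linalg.norm(A, "fro"), np.sqrt((A ** 2).sum()))
```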

  4. The Algorithm (4/10)
     Note that $AB = \sum_{i=1}^{n} a_i b_i^T$. The algorithm randomly picks indices $i \in [n]$ independently $c$ times (with replacement). Let $J : [c] \to [n]$ denote the indices. Output $\sum_{i=1}^{c} w(J(i)) \cdot a_{J(i)} b_{J(i)}^T$, where $w(J(i))$ is some weight to be determined.
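As a concrete sketch of this sampling scheme (in Python with NumPy, not from the slides), using the weight $w(j) = (c p_j)^{-1}$ that the next slide determines; the function name `approx_matmul` is our own:

```python
import numpy as np

def approx_matmul(A, B, c, p, rng):
    """Monte-Carlo approximation of AB: sample c rank-1 terms a_i b_i^T
    i.i.d. with probabilities p_i, reweighting each by w(i) = 1/(c * p_i)."""
    n = A.shape[1]
    J = rng.choice(n, size=c, p=p)          # J : [c] -> [n], with replacement
    est = np.zeros((A.shape[0], B.shape[1]))
    for j in J:
        est += np.outer(A[:, j], B[j, :]) / (c * p[j])
    return est
```

With $n = 1$ the estimator is exact regardless of $c$, which gives a simple sanity check.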

  5. (5/10)
     We fix a distribution on $[n]$ ($p_i$ for $i \in [n]$ satisfying $\sum_{i \in [n]} p_i = 1$). Therefore, the index $j$ is picked $c \cdot p_j$ times in expectation, so we can set $w(j) = (c p_j)^{-1}$. It is convenient to formulate the algorithm using matrices. Define a random sampling matrix $\Pi = (\pi_{ij}) \in \mathbb{R}^{n \times c}$ such that
     $$\pi_{ij} = \begin{cases} (c p_i)^{-1/2} & \text{if } i = J(j), \\ 0 & \text{otherwise.} \end{cases}$$
     Then our algorithm outputs $A'B'$ where $A' = A\Pi$ and $B' = \Pi^T B$.
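The equivalence between the matrix formulation $A'B' = A \Pi \Pi^T B$ and the weighted sum of rank-1 terms can be checked numerically (a NumPy sketch; all variable names, the seed, and the uniform choice of $p$ are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, q, c = 5, 8, 4, 3
A = rng.standard_normal((m, n))
B = rng.standard_normal((n, q))
p = np.full(n, 1.0 / n)                     # uniform distribution on [n]
J = rng.choice(n, size=c, p=p)

# Sampling matrix Pi in R^{n x c}: column j has a single nonzero entry
# (c * p_i)^{-1/2} in row i = J(j).
Pi = np.zeros((n, c))
Pi[J, np.arange(c)] = 1.0 / np.sqrt(c * p[J])

A1, B1 = A @ Pi, Pi.T @ B                   # A' = A Pi, B' = Pi^T B
direct = sum(np.outer(A[:, j], B[j, :]) / (c * p[j]) for j in J)
assert np.allclose(A1 @ B1, direct)
```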

  6. Analysis (6/10)
     We are going to choose some $(p_i)_{i \in [n]}$ so that $A'B' \approx AB$. Fix $i, j$; for any $k \in [c]$, we let
     $$X_k = \left( \frac{a_{J(k)} b_{J(k)}^T}{c\, p_{J(k)}} \right)_{ij}.$$
     Then
     $$\mathbf{E}[X_k] = \sum_{\ell=1}^{n} p_\ell \left( \frac{a_\ell b_\ell^T}{c p_\ell} \right)_{ij} = \frac{1}{c} (AB)_{ij},$$
     $$\mathbf{E}\left[X_k^2\right] = \sum_{\ell=1}^{n} p_\ell \left( \frac{a_\ell b_\ell^T}{c p_\ell} \right)_{ij}^2 = \sum_{\ell=1}^{n} \frac{a_{\ell i}^2 b_{\ell j}^2}{c^2 p_\ell},$$
     $$\mathbf{Var}[X_k] = \sum_{\ell=1}^{n} \frac{a_{\ell i}^2 b_{\ell j}^2}{c^2 p_\ell} - \frac{1}{c^2} (AB)_{ij}^2.$$
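Since $X_k$ takes the value $(a_\ell b_\ell^T)_{ij} / (c p_\ell)$ with probability $p_\ell$, these closed forms can be verified by exact enumeration over $\ell$ (a NumPy sketch, not from the slides; the seed, dimensions, and the fixed entry $(i,j)$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, q, c = 3, 5, 4, 7
A = rng.standard_normal((m, n))
B = rng.standard_normal((n, q))
p = rng.dirichlet(np.ones(n))               # an arbitrary distribution on [n]
i, j = 1, 2                                 # a fixed entry (i, j)

# X_k takes value (a_l b_l^T)_{ij} / (c p_l) with probability p_l.
vals = np.array([A[i, l] * B[l, j] / (c * p[l]) for l in range(n)])
EX = (p * vals).sum()
EX2 = (p * vals ** 2).sum()

assert np.isclose(EX, (A @ B)[i, j] / c)
assert np.isclose(EX2 - EX ** 2,
                  (A[i, :] ** 2 * B[:, j] ** 2 / (c ** 2 * p)).sum()
                  - (A @ B)[i, j] ** 2 / c ** 2)
```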

  7. (7/10)
     We are going to study the concentration of this algorithm. We compute that
     $$\mathbf{E}\left[ (A'B')_{ij} \right] = \sum_{k=1}^{c} \mathbf{E}[X_k] = (AB)_{ij}.$$
     Therefore,
     $$\mathbf{E}\left[ \|AB - A'B'\|_F^2 \right] = \sum_{i=1}^{m} \sum_{j=1}^{p} \mathbf{E}\left[ (AB - A'B')_{ij}^2 \right] = \sum_{i=1}^{m} \sum_{j=1}^{p} \mathbf{Var}\left[ (A'B')_{ij} \right] = \frac{1}{c} \left( \sum_{\ell=1}^{n} \frac{1}{p_\ell} \|a_\ell\|^2 \|b_\ell\|^2 - \|AB\|_F^2 \right).$$

  8. (8/10)
     If we choose $p_\ell \sim \|a_\ell\| \|b_\ell\|$, then
     $$\mathbf{E}\left[ \|AB - A'B'\|_F^2 \right] = \frac{1}{c} \left( \left( \sum_{\ell=1}^{n} \|a_\ell\| \|b_\ell\| \right)^2 - \|AB\|_F^2 \right) \le \frac{1}{c} \left( \sum_{\ell=1}^{n} \|a_\ell\| \|b_\ell\| \right)^2 \le \frac{1}{c} \|A\|_F^2 \|B\|_F^2.$$
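With this choice of $p_\ell$ the expected squared error and the bound can be computed exactly, and the chain of inequalities checked (a NumPy sketch, not from the slides; the seed and dimensions are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, q, c = 20, 200, 15, 500
A = rng.standard_normal((m, n))
B = rng.standard_normal((n, q))

# Optimal sampling: p_l proportional to ||a_l|| * ||b_l||
# (columns of A, rows of B).
w = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
p = w / w.sum()

# Exact expected squared error for this p, and the slide's final bound.
exp_err = (w.sum() ** 2 - np.linalg.norm(A @ B, "fro") ** 2) / c
bound = np.linalg.norm(A, "fro") ** 2 * np.linalg.norm(B, "fro") ** 2 / c
assert 0 <= exp_err <= bound                # the Cauchy-Schwarz step

# One run of the estimator; its squared error is only bounded in expectation.
J = rng.choice(n, size=c, p=p)
est = sum(np.outer(A[:, l], B[l, :]) / (c * p[l]) for l in J)
print(np.linalg.norm(A @ B - est, "fro") ** 2, "vs bound", bound)
```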

  9. (9/10)
     Therefore, by Chebyshev's inequality,
     $$\Pr\left[ \|AB - A'B'\|_F > \varepsilon \|A\|_F \|B\|_F \right] = \Pr\left[ \|AB - A'B'\|_F^2 > \varepsilon^2 \|A\|_F^2 \|B\|_F^2 \right] \le \frac{1}{c \varepsilon^2}.$$
     We can choose $c = O\!\left(\varepsilon^{-2} \log \delta^{-1}\right)$ to achieve a $1 - \delta$ probability of correctness. We can use a variant of the median trick to boost the algorithm.
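One common variant of the median trick for matrix-valued estimates runs $t$ independent copies and returns the copy whose Frobenius distance to the other copies has the smallest median. The slides do not spell out their variant, so the following is only a sketch of that idea (function name and details are ours):

```python
import numpy as np

def boosted_approx(A, B, c, t, rng):
    """Run t independent copies of the sampling estimator and return the
    copy whose Frobenius distance to the others has the smallest median
    (a median-trick variant for matrix-valued estimates)."""
    n = A.shape[1]
    w = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    p = w / w.sum()                          # optimal sampling probabilities
    ests = []
    for _ in range(t):
        J = rng.choice(n, size=c, p=p)
        ests.append(sum(np.outer(A[:, l], B[l, :]) / (c * p[l]) for l in J))
    dists = np.array([[np.linalg.norm(X - Y, "fro") for Y in ests]
                      for X in ests])
    return ests[int(np.argmin(np.median(dists, axis=1)))]
```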
