Algorithms for Big Data (X)
Chihao Zhang
Shanghai Jiao Tong University
Nov. 22, 2019
Algorithms for Big Data (X) 1/10
Given two matrices $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{n \times p}$, we compute $C = AB$. For $m = n = p$, the naive algorithm costs $O(n^3)$ multiplication operations. Strassen's algorithm reduces the cost to $O(n^{2.81})$. The best algorithm so far costs $O(n^{\omega})$ where $\omega < 2.3728639$. Today we will introduce a Monte-Carlo algorithm to approximate $AB$.
Assume $A = (a_1, \dots, a_n)$, where $a_i$ is the $i$-th column of $A$, and $B = \begin{pmatrix} b_1^T \\ \vdots \\ b_n^T \end{pmatrix}$, where $b_i^T$ is the $i$-th row of $B$. Then
$$AB = \sum_{i=1}^n a_i b_i^T,$$
where each $a_i b_i^T$ is of rank 1.

The Frobenius norm of a matrix $A = (a_{ij})_{1 \le i \le m,\, 1 \le j \le n}$ is
$$\|A\|_F \triangleq \sqrt{\sum_{i=1}^m \sum_{j=1}^n a_{ij}^2}.$$
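As a quick numerical sanity check (a sketch in numpy, which the slides themselves do not use), the Frobenius norm is just the square root of the sum of squared entries:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])

# Frobenius norm: square root of the sum of squared entries
fro = np.sqrt((A ** 2).sum())

# numpy's built-in norm with ord='fro' computes the same quantity
assert np.isclose(fro, np.linalg.norm(A, 'fro'))
print(fro)  # sqrt(1 + 4 + 9 + 16) = sqrt(30)
```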
Note that $AB = \sum_{i=1}^n a_i b_i^T$.

The algorithm picks indices $i \in [n]$ independently $c$ times (with replacement). Let $J : [c] \to [n]$ denote the sampled indices. Output
$$\sum_{i=1}^c w(J(i)) \cdot a_{J(i)} b_{J(i)}^T,$$
where $w(J(i))$ is some weight to be determined.
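A minimal numpy sketch of this sampling scheme, anticipating the weight choice $w(j) = (c p_j)^{-1}$ derived on the next slide; the function name and the uniform test distribution are mine:

```python
import numpy as np

rng = np.random.default_rng(0)

def approx_matmul(A, B, p, c):
    """Sample c rank-1 terms a_i b_i^T with probabilities p and
    reweight each term by 1 / (c * p_i)."""
    n = A.shape[1]
    J = rng.choice(n, size=c, p=p)          # J : [c] -> [n], with replacement
    C = np.zeros((A.shape[0], B.shape[1]))
    for j in J:
        C += np.outer(A[:, j], B[j, :]) / (c * p[j])
    return C

A = rng.standard_normal((20, 50))
B = rng.standard_normal((50, 30))
p = np.full(50, 1 / 50)                      # uniform distribution on [n]
C = approx_matmul(A, B, p, c=2000)
# relative Frobenius error of the approximation
print(np.linalg.norm(C - A @ B, 'fro') / np.linalg.norm(A @ B, 'fro'))
```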
We fix a distribution on $[n]$: $p_i$ for $i \in [n]$ satisfying $\sum_{i \in [n]} p_i = 1$.

The index $j$ is picked $c \cdot p_j$ times in expectation, so we can set $w(j) = (c p_j)^{-1}$.

It is convenient to formulate the algorithm using matrices. Define a random sampling matrix $\Pi = (\pi_{ij}) \in \mathbb{R}^{n \times c}$ such that
$$\pi_{ij} = \begin{cases} (c p_i)^{-1/2} & \text{if } i = J(j), \\ 0 & \text{otherwise.} \end{cases}$$
Then our algorithm outputs $A'B'$ where $A' = A\Pi$ and $B' = \Pi^T B$.
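The matrix formulation can be checked against the direct weighted sum (a numpy sketch; note $\Pi$ is $n \times c$, as the dimensions of $A' = A\Pi$ require):

```python
import numpy as np

rng = np.random.default_rng(1)

n, c = 6, 4
p = np.full(n, 1 / n)                 # any fixed distribution on [n]
J = rng.choice(n, size=c, p=p)        # sampled indices J : [c] -> [n]

# Sampling matrix Pi in R^{n x c}: column j has (c * p_{J(j)})^{-1/2}
# in row J(j) and zeros elsewhere.
Pi = np.zeros((n, c))
for j in range(c):
    Pi[J[j], j] = (c * p[J[j]]) ** -0.5

A = rng.standard_normal((3, n))
B = rng.standard_normal((n, 5))

# A'B' = (A Pi)(Pi^T B) equals the reweighted sum of sampled rank-1 terms
direct = sum(np.outer(A[:, J[j]], B[J[j], :]) / (c * p[J[j]]) for j in range(c))
assert np.allclose(A @ Pi @ Pi.T @ B, direct)
```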
We are going to choose some $(p_i)_{i \in [n]}$ so that $A'B' \approx AB$. Fix $i, j$. For any $k \in [c]$, we let
$$X_k = \frac{\left(a_{J(k)} b_{J(k)}^T\right)_{ij}}{c\, p_{J(k)}}.$$
Then
$$\mathbf{E}[X_k] = \sum_{\ell=1}^n p_\ell \cdot \frac{\left(a_\ell b_\ell^T\right)_{ij}}{c\, p_\ell} = \frac{1}{c} (AB)_{ij},$$
$$\mathbf{E}\left[X_k^2\right] = \sum_{\ell=1}^n p_\ell \cdot \frac{\left(a_\ell b_\ell^T\right)_{ij}^2}{(c\, p_\ell)^2} = \sum_{\ell=1}^n \frac{a_{\ell i}^2 b_{\ell j}^2}{c^2 p_\ell},$$
$$\mathbf{Var}[X_k] = \sum_{\ell=1}^n \frac{a_{\ell i}^2 b_{\ell j}^2}{c^2 p_\ell} - \frac{1}{c^2} (AB)_{ij}^2.$$
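The two moments can be verified exactly by enumerating the distribution of $J(k)$, since $X_k$ takes only $n$ possible values (a numerical sanity check, not part of the slides; all names are mine):

```python
import numpy as np

rng = np.random.default_rng(2)
n, c = 5, 7
A = rng.standard_normal((4, n))
B = rng.standard_normal((n, 3))
p = rng.random(n)
p /= p.sum()                           # an arbitrary distribution on [n]
i, j = 1, 2                            # a fixed entry (i, j)

# X_k takes value (a_l b_l^T)_{ij} / (c p_l) with probability p_l
vals = np.array([A[i, l] * B[l, j] / (c * p[l]) for l in range(n)])
EX = (p * vals).sum()                  # E[X_k]
EX2 = (p * vals ** 2).sum()            # E[X_k^2]

assert np.isclose(EX, (A @ B)[i, j] / c)
assert np.isclose(EX2, sum(A[i, l]**2 * B[l, j]**2 / (c**2 * p[l])
                           for l in range(n)))
```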
Therefore,
$$\mathbf{E}\left[\sum_{k=1}^c X_k\right] = c \cdot \mathbf{E}[X_k] = (AB)_{ij}.$$
We are going to study the concentration of this algorithm. We compute that
$$\mathbf{E}\left[\left\|A'B' - AB\right\|_F^2\right] = \sum_{i=1}^m \sum_{j=1}^p \mathbf{Var}\left[\sum_{k=1}^c X_k\right] = \frac{1}{c} \left( \sum_{\ell=1}^n \frac{1}{p_\ell} \|a_\ell\|^2 \|b_\ell\|^2 - \|AB\|_F^2 \right).$$
If we choose $p_\ell \propto \|a_\ell\| \|b_\ell\|$, i.e., $p_\ell = \frac{\|a_\ell\| \|b_\ell\|}{\sum_{k=1}^n \|a_k\| \|b_k\|}$, then
$$\mathbf{E}\left[\left\|A'B' - AB\right\|_F^2\right] = \frac{1}{c} \left( \left( \sum_{\ell=1}^n \|a_\ell\| \|b_\ell\| \right)^2 - \|AB\|_F^2 \right) \le \frac{1}{c} \left( \sum_{\ell=1}^n \|a_\ell\| \|b_\ell\| \right)^2 \le \frac{1}{c} \|A\|_F^2 \|B\|_F^2,$$
where the last inequality is Cauchy–Schwarz.
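The normalized choice of $(p_\ell)$ and the Cauchy–Schwarz step can be checked numerically (a sketch; variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((8, 12))
B = rng.standard_normal((12, 9))

col = np.linalg.norm(A, axis=0)        # ||a_l|| for each column of A
row = np.linalg.norm(B, axis=1)        # ||b_l|| for each row of B
p = col * row / (col * row).sum()      # p_l proportional to ||a_l|| ||b_l||

assert np.isclose(p.sum(), 1.0)        # a valid distribution on [n]

# Cauchy-Schwarz: (sum_l ||a_l|| ||b_l||)^2 <= ||A||_F^2 ||B||_F^2
lhs = (col * row).sum() ** 2
rhs = np.linalg.norm(A, 'fro') ** 2 * np.linalg.norm(B, 'fro') ** 2
assert lhs <= rhs + 1e-9
```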
Therefore, by Chebyshev's inequality,
$$\Pr\left[\left\|A'B' - AB\right\|_F^2 > \varepsilon^2 \|A\|_F^2 \|B\|_F^2\right] \le \frac{1}{c \varepsilon^2}.$$
We can use a variant of the median trick to boost the algorithm: we can choose $c = O\!\left(\frac{1}{\varepsilon^2} \log \frac{1}{\delta}\right)$ to achieve failure probability $\delta$.
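One standard variant of the median trick for matrix-valued estimates (an assumption on my part; the slides do not spell out the variant) runs several independent copies and returns the copy whose median Frobenius distance to the others is smallest:

```python
import numpy as np

rng = np.random.default_rng(4)

def sample_estimate(A, B, p, c):
    """One run of the sampling algorithm with c samples."""
    J = rng.choice(A.shape[1], size=c, p=p)
    return sum(np.outer(A[:, J[k]], B[J[k], :]) / (c * p[J[k]])
               for k in range(c))

def boosted_estimate(A, B, p, c, t):
    """Median trick for matrices: among t independent estimates, return
    the one with the smallest median distance to the other estimates."""
    ests = [sample_estimate(A, B, p, c) for _ in range(t)]
    dists = [np.median([np.linalg.norm(E - F, 'fro') for F in ests])
             for E in ests]
    return ests[int(np.argmin(dists))]

A = rng.standard_normal((10, 30))
B = rng.standard_normal((30, 10))
col, row = np.linalg.norm(A, axis=0), np.linalg.norm(B, axis=1)
p = col * row / (col * row).sum()      # the optimal probabilities
C = boosted_estimate(A, B, p, c=500, t=9)
print(np.linalg.norm(C - A @ B, 'fro') / np.linalg.norm(A @ B, 'fro'))
```

The intuition is that each copy lands within the Chebyshev radius with constant probability, so with $t = O(\log \frac{1}{\delta})$ copies a majority does, and the selected copy is close to that majority.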