


Algorithms for Big Data (III)

Chihao Zhang

Shanghai Jiao Tong University

  • Sept. 29, 2019

Algorithms for Big Data (III) 1/16


Review of the Last Lecture

Last time, we proved a few useful concentration inequalities, introduced the notion of universal families of Hash functions, and constructed a 2-universal family of Hash functions.

Algorithms for Big Data (III) 2/16


Review: the construction

Let m, n be two integers and p ≥ m be a prime. The family H = {ha,b : 1 ≤ a ≤ p − 1, 0 ≤ b ≤ p − 1}, where each ha,b : [m] → [n] is defined as ha,b(x) = ((ax + b) mod p) mod n. We proved that for every x ≠ y, Prh∈H [h(x) = h(y)] ≤ 1/n. So H is a 2-universal Hash function family.

Algorithms for Big Data (III) 3/16
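This bound is easy to check empirically. The sketch below is an illustration, not part of the lecture: the prime p = 101, range n = 10, and the pair x = 3, y = 7 are arbitrary choices. It enumerates every (a, b) in the family and verifies that the fraction of colliding functions is at most 1/n.

```python
def collision_prob(x, y, n, p):
    """Fraction of functions h_{a,b} in the family that collide on x, y.

    h_{a,b}(z) = ((a*z + b) mod p) mod n, with 1 <= a <= p-1, 0 <= b <= p-1.
    """
    hits = 0
    for a in range(1, p):
        for b in range(p):
            if ((a * x + b) % p) % n == ((a * y + b) % p) % n:
                hits += 1
    return hits / ((p - 1) * p)

# exhaustive check of the 2-universal bound Pr[h(x) = h(y)] <= 1/n
print(collision_prob(3, 7, 10, 101))
```

For p = 101 and n = 10 the exhaustive count stays at or below 1/10, matching the bound proved above.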


Strongly 2-Universal Hash Family

Recall that if we further require that for any u, v, Prh∈H [h(x) = u ∧ h(y) = v] = 1/n², then H is called a strongly 2-universal family of Hash functions. When m = n = p is a prime, we can modify the previously constructed H to get a strongly 2-universal family. In this case, we have H = {ha,b(x) = (ax + b) mod p : 0 ≤ a, b ≤ p − 1}.

Algorithms for Big Data (III) 4/16
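Strong 2-universality can be verified exhaustively for a small prime. The snippet below is illustrative only; p = 7 and the points x, y, u, v are arbitrary choices. It counts, over all p² functions ha,b(x) = (ax + b) mod p, how many map x to u and y to v at the same time; exactly one does, so the probability is 1/p².

```python
p = 7
x, y, u, v = 2, 5, 3, 6   # any x != y and any target values u, v

# count functions h_{a,b}(z) = (a*z + b) mod p hitting both targets
count = sum(
    1
    for a in range(p)
    for b in range(p)
    if (a * x + b) % p == u and (a * y + b) % p == v
)
print(count)   # exactly one of the p**2 = 49 pairs works
```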


Proof

Lemma

The equation ax + b = 0 (mod p) has a unique solution (in Fp) if a ≠ 0 and p is a prime. The equations ha,b(x1) = y1 and ha,b(x2) = y2 are equivalent to ax1 + b = y1 (mod p), ax2 + b = y2 (mod p). For x1 ≠ x2 they have a unique solution a = (y2 − y1)(x2 − x1)⁻¹ mod p, b = y1 − ax1 mod p. Therefore, Prha,b∈H [ha,b(x1) = y1 ∧ ha,b(x2) = y2] = 1/p².

Algorithms for Big Data (III) 5/16
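The unique solution can be computed directly with a modular inverse (Python's three-argument pow). A small sketch; the prime and the two constraint points are arbitrary choices of mine:

```python
p = 101
x1, y1 = 3, 17   # first constraint:  h(x1) = y1
x2, y2 = 8, 42   # second constraint: h(x2) = y2, with x2 != x1

# a = (y2 - y1) * (x2 - x1)^{-1} mod p,  b = y1 - a*x1 mod p
a = (y2 - y1) * pow(x2 - x1, -1, p) % p
b = (y1 - a * x1) % p

# the recovered (a, b) satisfies both constraints
assert (a * x1 + b) % p == y1
assert (a * x2 + b) % p == y2
```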


The General Case

The Hash family we just constructed has the restriction that m = n. We can naturally generalize m = p to m = p^k. Write every number x in base p: x = x0 + x1 · p + x2 · p² + . . . + xk−1 · p^(k−1). For every ā = (a0, a1, . . . , ak−1), with 0 ≤ ai ≤ p − 1 and 0 ≤ b ≤ p − 1, define

hā,b(x) = ( ∑_{i=0}^{k−1} ai xi + b ) mod p.

Then H = { hā,b : 0 ≤ ai ≤ p − 1, 0 ≤ b ≤ p − 1 }.

Algorithms for Big Data (III) 6/16
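A sketch of the generalized family in Python, following the slide's definition; the digit-extraction helper and the parameter choices (p = 5, k = 3) are my own illustration:

```python
import random

def base_p_digits(x, p, k):
    """Base-p digits of x, least significant first: x = sum(x_i * p**i)."""
    digits = []
    for _ in range(k):
        digits.append(x % p)
        x //= p
    return digits

def make_vector_hash(p, k, rng=random):
    """Draw h_{a,b}(x) = (sum_i a_i * x_i + b) mod p from the family."""
    a = [rng.randrange(p) for _ in range(k)]
    b = rng.randrange(p)
    def h(x):
        xs = base_p_digits(x, p, k)
        return (sum(ai * xi for ai, xi in zip(a, xs)) + b) % p
    return h

h = make_vector_hash(5, 3)   # domain [5**3] = [125], range [5]
assert all(0 <= h(x) < 5 for x in range(125))
```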


Proof

Assume x ≠ y and that they differ at position i (xi ≠ yi). For every u, v ∈ {0, 1, . . . , p − 1}, we have equations

ai xi + b = ( u − ∑_{j≠i} aj xj ) mod p,
ai yi + b = ( v − ∑_{j≠i} aj yj ) mod p.

For fixed x, y, u, v and { aj }_{j≠i}, a unique pair (ai, b) (out of p² pairs) is determined. Therefore, Pr_{hā,b∈H} [ hā,b(x) = u ∧ hā,b(y) = v ] = 1/p².

Algorithms for Big Data (III) 7/16


Counting Distinct Elements

Back to the streaming model, we are given a sequence of numbers ⟨a1, . . . , am⟩ where each ai ∈ [n]. It defines a frequency vector f = (f1, . . . , fn) where fi = |{k ∈ [m] : ak = i}|. We want to compute the number d = |{i ∈ [n] : fi > 0}|. The value d is the number of distinct elements in the stream.

Algorithms for Big Data (III) 8/16


The AMS Algorithm

The algorithm is named after Alon, Matias and Szegedy. For every integer p > 0, we use zero(p) to denote the number of trailing zeros of p in binary: zero(p) ≜ max { i : 2^i divides p }.

Algorithms for Big Data (III) 9/16
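On binary integers, zero(p) is the position of the lowest set bit; a common bit trick (my addition, not from the slides) computes it in constant time:

```python
def zero(p):
    """Number of trailing zeros of p > 0 in binary: max{i : 2**i divides p}."""
    # p & -p isolates the lowest set bit, e.g. 12 = 0b1100 -> 0b100
    return (p & -p).bit_length() - 1

assert zero(1) == 0 and zero(8) == 3 and zero(12) == 2
```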


Algorithm AMS Algorithm for Counting Distinct Elements Init: A random Hash function h : [n] → [n] from a 2-universal family; Z ← 0 On Input y: if zero(h(y)) > Z then Z ← zero(h(y)) end if Output: d̂ = 2^(Z+1/2).

Algorithms for Big Data (III) 10/16
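A minimal Python sketch of the pseudocode above. The 2-universal hash is the ((ax + b) mod p) mod n construction from the earlier slides; the particular prime p and the convention zero(0) = 0 are my choices, not fixed by the lecture.

```python
import random

class AMS:
    """Single-pass estimate of the number of distinct elements in a stream."""

    def __init__(self, n, seed=None):
        rng = random.Random(seed)
        self.p = 2_147_483_647            # a prime >= n (assumes n <= 2**31 - 1)
        self.a = rng.randint(1, self.p - 1)
        self.b = rng.randint(0, self.p - 1)
        self.n = n
        self.Z = 0

    def _zero(self, v):
        # trailing zeros of v in binary; zero(0) treated as 0 by convention
        return (v & -v).bit_length() - 1 if v > 0 else 0

    def update(self, y):
        z = self._zero(((self.a * y + self.b) % self.p) % self.n)
        if z > self.Z:
            self.Z = z

    def estimate(self):
        return 2 ** (self.Z + 0.5)

sketch = AMS(n=1024, seed=1)
for y in [1, 5, 9, 5, 1, 7, 9, 3]:       # stream with 5 distinct elements
    sketch.update(y)
print(sketch.estimate())
```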


Intuition

After applying the Hash function, h(y) is uniform in [n]. The probability that it has at least t trailing zeros is at most 2^(−t). Therefore, at least in expectation, if we have d distinct numbers, one of them may have log2 d trailing zeros. We now turn this intuition into a rigorous proof.

Algorithms for Big Data (III) 11/16


For every 0 ≤ r ≤ n, we use a random variable Yr to denote the number of h(ai) with at least r trailing zeros. The sequence of variables {Yr}0≤r≤n determines the variable Z since Z = max {r : Yr > 0}. This motivates us to understand the behavior of Yr. For every k ∈ [n], we denote by Xk,r the indicator that h(k) has at least r trailing zeros; then Yr = ∑_{k∈[n]: fk>0} Xk,r. Using this decomposition, it is not hard to see that E [Yr] = d/2^r and Var [Yr] ≤ d/2^r.

Algorithms for Big Data (III) 12/16
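The two moment bounds follow from linearity of expectation and, for the variance, from pairwise independence of the indicators; the worked step below assumes n is a power of two (so that Pr[2^r | h(k)] is exactly 2^(−r)) and pairwise-independent hash values, as a strongly 2-universal family provides.

```latex
\Pr[X_{k,r} = 1] = \Pr\!\left[2^r \mid h(k)\right] = 2^{-r}
\;\Longrightarrow\;
\mathbb{E}[Y_r] = \sum_{k:\,f_k>0} \mathbb{E}[X_{k,r}] = \frac{d}{2^r},
\qquad
\operatorname{Var}[Y_r] = \sum_{k:\,f_k>0} \operatorname{Var}[X_{k,r}]
\le \sum_{k:\,f_k>0} \mathbb{E}\!\left[X_{k,r}^2\right]
= \sum_{k:\,f_k>0} \mathbb{E}[X_{k,r}] = \frac{d}{2^r}.
```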


Applying Markov’s inequality, we obtain Pr [Yr > 0] = Pr [Yr ≥ 1] ≤ E [Yr] = d/2^r. Applying Chebyshev’s inequality, we obtain Pr [Yr = 0] ≤ Pr [ |Yr − E [Yr]| ≥ d/2^r ] ≤ Var [Yr] / (d/2^r)² ≤ 2^r/d. We know that Yr > 0 for all r ≤ Z and Yr = 0 for all r > Z. Therefore, Z cannot be too far from log2 d:

▶ if Z ≪ log2 d, we can find a small r with Yr = 0, which happens with small probability;

▶ if Z ≫ log2 d, we can find a big r with Yr > 0, which happens with small probability.

Algorithms for Big Data (III) 13/16


If d̂ ≤ d/3, let r be the largest integer with 2^(r+1/2) ≤ d/3. Then

Pr [ d̂ ≤ d/3 ] = Pr [Z ≤ r] = Pr [Yr+1 = 0] ≤ 2^(r+1)/d ≤ √2/3.

If d̂ ≥ 3d, let r be the smallest integer with 2^(r+1/2) ≥ 3d. Then

Pr [ d̂ ≥ 3d ] = Pr [Z ≥ r] = Pr [Yr > 0] ≤ d/2^r ≤ √2/3.

The algorithm costs O(log n) bits of memory.

Algorithms for Big Data (III) 14/16


Median

We can apply the standard Median trick to the AMS algorithm. (Exercise) Using O(log(1/δ) · log n) bits of memory, we can obtain

Pr [ d/3 ≤ d̂ ≤ 3d ] ≥ 1 − δ.

Algorithms for Big Data (III) 15/16
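A sketch of the median trick from the exercise above: run t = O(log(1/δ)) independent copies of AMS and return the median of their estimates. The single-run helper re-implements AMS inline so the block is self-contained; all parameter choices (the prime, the stream, t = 21) are illustrative.

```python
import random
from statistics import median

def ams_once(stream, n, rng):
    """One AMS run: returns 2**(Z + 1/2) for a freshly drawn hash function."""
    p = 2_147_483_647                     # a prime >= n
    a, b = rng.randint(1, p - 1), rng.randint(0, p - 1)
    Z = 0
    for y in stream:
        v = ((a * y + b) % p) % n
        z = (v & -v).bit_length() - 1 if v > 0 else 0   # trailing zeros
        Z = max(Z, z)
    return 2 ** (Z + 0.5)

def ams_median(stream, n, t, seed=0):
    """Median of t independent AMS estimates (the median trick)."""
    rng = random.Random(seed)
    return median(ams_once(stream, n, rng) for _ in range(t))

stream = list(range(100)) * 3             # 100 distinct elements, repeated
print(ams_median(stream, n=1024, t=21))
```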


The BJKST Algorithm

The following improvement of AMS is due to Bar-Yossef, Jayram, Kumar, Sivakumar and Trevisan. Algorithm BJKST Algorithm for Counting Distinct Elements Init: Random Hash functions h : [n] → [n], g : [n] → [b · ε^(−4) · log² n], both from 2-universal families; Z ← 0, B ← ∅ On Input y: if zero(h(y)) ≥ Z then B ← B ∪ {(g(y), zero(h(y)))} while |B| ≥ c/ε² do Z ← Z + 1; Remove all (α, β) with β < Z from B end while end if Output: d̂ = |B| · 2^Z

Algorithms for Big Data (III) 16/16
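A compact Python sketch of BJKST as stated above. The slide leaves the constants unspecified, so the prime, c = 64, and taking the factor b in g's range to be 1 are placeholders of mine, not tuned recommendations.

```python
import math
import random

def zero(v):
    """Trailing zeros of v in binary; zero(0) treated as 0 by convention."""
    return (v & -v).bit_length() - 1 if v > 0 else 0

def bjkst(stream, n, eps, c=64, seed=0):
    rng = random.Random(seed)
    p = 2_147_483_647                                     # a prime >= n
    ah, bh = rng.randint(1, p - 1), rng.randint(0, p - 1)
    ag, bg = rng.randint(1, p - 1), rng.randint(0, p - 1)
    g_range = max(1, int(eps ** -4 * math.log2(n) ** 2))  # slide's factor b taken as 1

    h = lambda y: ((ah * y + bh) % p) % n
    g = lambda y: ((ag * y + bg) % p) % g_range

    Z, B = 0, set()
    for y in stream:
        if zero(h(y)) >= Z:
            B.add((g(y), zero(h(y))))
            while len(B) >= c / eps ** 2:                 # bucket too large: raise Z
                Z += 1
                B = {(alpha, beta) for (alpha, beta) in B if beta >= Z}
    return len(B) * 2 ** Z

print(bjkst(list(range(1, 101)) * 3, n=1024, eps=0.5))
```

Storing the compressed pairs (g(y), zero(h(y))) instead of the elements themselves is what brings the space below AMS-with-median for the same accuracy.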