
Compressed Counting: Ping Li, Department of Statistical Science



  1. Ping Li, Compressed Counting, Data Streams, March 2009, DIMACS 2009.
     Title slide: Compressed Counting. Ping Li, Department of Statistical Science, Faculty of Computing and Information Science, Cornell University, Ithaca, NY 14850. March 2009.

  2. What is Counting in This Talk?
     Assume a very long vector of $D$ items: $x_1, x_2, \ldots, x_D$.
     This talk is about counting $\sum_{i=1}^{D} x_i^{\alpha}$, where $0 < \alpha \le 2$.
     [Figure: the entries of $x$ plotted against the index, from 1 to $D$.]
     The case $\alpha \to 1$ is particularly interesting and important.

  3. Related Summary Statistics
     • The sum, $\sum_{i=1}^{D} x_i$, and the number of non-zeros, $\sum_{i=1}^{D} 1_{x_i \neq 0}$.
     • The $\alpha$th moment, $F_{(\alpha)} = \sum_{i=1}^{D} x_i^{\alpha}$: $F_{(1)}$ = the sum, $F_{(2)}$ = the power/energy, $F_{(0)}$ = the number of non-zeros.
     • The future fortune, $\sum_{i=1}^{D} x_i^{1 \pm \Delta}$, where $\Delta$ = interest/decay rate (usually small).
     • The entropy moment, $\sum_{i=1}^{D} x_i \log x_i$, and the entropy, $-\sum_{i=1}^{D} \frac{x_i}{F_{(1)}} \log \frac{x_i}{F_{(1)}}$.
     • The Rényi entropy, $\frac{1}{1-\alpha} \log \frac{F_{(\alpha)}}{F_{(1)}^{\alpha}}$, and the Tsallis entropy, $\frac{1}{\alpha-1} \left( 1 - \frac{F_{(\alpha)}}{F_{(1)}^{\alpha}} \right)$.

  4. Isn't Counting a Simple (Trivial) Task?
     Partially true, if the data are static. However, real-world data are in general massive and dynamic: data streams.
     • Databases at Amazon, eBay, Walmart, and search engines
     • Internet/telephone traffic, highway traffic
     • Finance (stock) data
     • ...
     • Answers may be needed in real time, e.g., anomaly detection (using entropy).

  5. For example, the Turnstile data stream model for an online bookstore

     Time   Arriving stream   Event                       IP 1   IP 2   IP 3   IP 4   ...   IP D
     t=0    (none)            initial state               0      0      0      0      ...   0
     t=1    (3, 10)           user 3 ordered 10 books     0      0      10     0      ...   0
     t=2    (1, 5)            user 1 ordered 5 books      5      0      10     0      ...   0
     t=3    (3, −8)           user 3 cancelled 8 books    5      0      2      0      ...   0

  6. Turnstile Data Stream Model
     At time $t$, an incoming element: $a_t = (i_t, I_t)$, where $i_t \in [1, D]$ is the index and $I_t$ is the increment/decrement.
     Updating rule: $A_t[i_t] = A_{t-1}[i_t] + I_t$
     Goal: Count $F_{(\alpha)} = \sum_{i=1}^{D} A_t[i]^{\alpha}$
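
The updating rule can be sketched in a few lines of Python, replayed here on the bookstore stream from the previous slide. This is only an illustration of the data model, not the method of the talk: the names `apply_update` and `exact_moment` are hypothetical, and the exact moment is computed by brute force with one counter per touched index.

```python
from collections import defaultdict


def apply_update(state, index, increment):
    """Apply one turnstile update: A_t[i_t] = A_{t-1}[i_t] + I_t."""
    state[index] += increment


def exact_moment(state, alpha):
    """Brute-force F(alpha) = sum_i A_t[i]^alpha over the non-zero entries.

    Assumes all entries are non-negative, as in the bookstore example.
    """
    return sum(value ** alpha for value in state.values() if value > 0)


# Stream elements are (index, increment) pairs, as on the bookstore slide.
stream = [(3, 10), (1, 5), (3, -8)]

state = defaultdict(int)  # every A_0[i] starts at 0
for index, increment in stream:
    apply_update(state, index, increment)

first_moment = exact_moment(state, 1)   # the sum of all entries
second_moment = exact_moment(state, 2)  # the power/energy
```

The dictionary only stores indices that have been touched, but in the worst case it still grows to $D$ entries, which is the cost the talk sets out to avoid.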

  7. Counting: Trivial if $\alpha = 1$, but Non-trivial in General
     Goal: Count $F_{(\alpha)} = \sum_{i=1}^{D} A_t[i]^{\alpha}$, where $A_t[i_t] = A_{t-1}[i_t] + I_t$.
     When $\alpha \neq 1$, counting $F_{(\alpha)}$ exactly requires $D$ counters (but $D$ can be $2^{64}$).
     When $\alpha = 1$, however, counting the sum is trivial, using a simple counter:
     $$F_{(1)} = \sum_{i=1}^{D} A_t[i] = \sum_{s=1}^{t} I_s.$$
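
A small Python sketch of the identity for $\alpha = 1$: a single running counter over the increments gives the same total as summing one counter per index. The function names are hypothetical and the stream is the bookstore example used earlier in the talk.

```python
def running_sum(stream):
    """Track F(1) with a single counter: add every increment I_s as it arrives."""
    total = 0
    for _index, increment in stream:
        total += increment
    return total


def sum_with_per_index_counters(stream):
    """Reference computation: keep one counter per index, then sum them."""
    counters = {}
    for index, increment in stream:
        counters[index] = counters.get(index, 0) + increment
    return sum(counters.values()), len(counters)


stream = [(3, 10), (1, 5), (3, -8)]
single_counter = running_sum(stream)
per_index_total, counters_used = sum_with_per_index_counters(stream)
```

Both computations agree, but the first ignores the index entirely, which is why it needs only one counter regardless of $D$.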

  8. The Intuition for $\alpha \approx 1$
     There might exist an intelligent counting system which works like a simple counter when $\alpha$ is close to 1, and whose complexity is a function of how close $\alpha$ is to 1.
     Our answer: Yes!
     Two caveats:
     (1) What if the data are negative? Shouldn't we define $F_{(\alpha)} = \sum_{i=1}^{D} |A_t[i]|^{\alpha}$?
     (2) Why is the case $\alpha \approx 1$ important?

  9. The Non-Negativity Constraint
     "God created the natural numbers; all the rest is the work of man." (Leopold Kronecker, German mathematician, 1823-1891)
     Turnstile model: $a_t = (i_t, I_t)$, $A_t[i_t] = A_{t-1}[i_t] + I_t$.
     • $I_t > 0$: increment, insertion, e.g., place orders
     • $I_t < 0$: decrement, deletion, e.g., cancel orders
     This talk: the Strict Turnstile model, in which $A_t[i] \ge 0$ always. One can only cancel an order that she/he actually placed.
     This suffices for almost all applications.

  10. Sample Applications of $\alpha$th Moments (Especially $\alpha \approx 1$)
      1. $F_{(\alpha)} = \sum_{i=1}^{D} A_t[i]^{\alpha}$ itself is a useful summary statistic; e.g., the Rényi entropy and the Tsallis entropy are functions of $F_{(\alpha)}$.
      2. Statistical modeling and inference of parameters using the method of moments. Some moments may be much easier to compute than others.
      3. $F_{(\alpha)} = \sum_{i=1}^{D} A_t[i]^{\alpha}$ is a fundamental building element for other algorithms, e.g., estimating the Shannon entropy of data streams.

  11. Shannon Entropy of Data Streams
      Definition of Shannon entropy:
      $$H = -\sum_{i=1}^{D} \frac{A_t[i]}{F_{(1)}} \log \frac{A_t[i]}{F_{(1)}}, \qquad F_{(1)} = \sum_{i=1}^{D} A_t[i].$$
      Shannon entropy can be approximated by the Rényi entropy or the Tsallis entropy.
      Rényi entropy:
      $$H_{\alpha} = \frac{1}{1-\alpha} \log \frac{F_{(\alpha)}}{F_{(1)}^{\alpha}} \to H, \quad \text{as } \alpha \to 1.$$
      Tsallis entropy:
      $$T_{\alpha} = \frac{1}{\alpha-1} \left( 1 - \frac{F_{(\alpha)}}{F_{(1)}^{\alpha}} \right) \to H, \quad \text{as } \alpha \to 1.$$
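
The two limits can be checked numerically. The sketch below computes all three entropies exactly from a small, made-up vector of counts (the values in `counts` and the choice $\alpha = 1.001$ are assumptions for illustration); in a streaming setting $F_{(\alpha)}$ would be estimated rather than computed exactly.

```python
import math


def moment(counts, alpha):
    """Exact F(alpha) = sum_i counts[i]^alpha over the positive entries."""
    return sum(c ** alpha for c in counts if c > 0)


def shannon_entropy(counts):
    """H = -sum_i p_i log p_i with p_i = counts[i] / F(1)."""
    f1 = moment(counts, 1.0)
    return -sum((c / f1) * math.log(c / f1) for c in counts if c > 0)


def renyi_entropy(counts, alpha):
    """H_alpha = log(F(alpha) / F(1)^alpha) / (1 - alpha), for alpha != 1."""
    f1 = moment(counts, 1.0)
    return math.log(moment(counts, alpha) / f1 ** alpha) / (1.0 - alpha)


def tsallis_entropy(counts, alpha):
    """T_alpha = (1 - F(alpha) / F(1)^alpha) / (alpha - 1), for alpha != 1."""
    f1 = moment(counts, 1.0)
    return (1.0 - moment(counts, alpha) / f1 ** alpha) / (alpha - 1.0)


counts = [5, 2, 10, 1, 7]  # hypothetical non-negative counts
alpha = 1.001              # close to, but not equal to, 1

shannon = shannon_entropy(counts)
renyi = renyi_entropy(counts, alpha)
tsallis = tsallis_entropy(counts, alpha)
```

Both approximations depend on the data only through $F_{(\alpha)}$ and $F_{(1)}$, which is why a good estimator of $F_{(\alpha)}$ near $\alpha = 1$ translates into an entropy estimator.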

  12. Algorithms on Estimating Shannon Entropy
      • Many algorithms in theoretical CS and databases address estimating entropy.
      • A recent trend: using $\alpha$th moments to approximate Shannon entropy.
        – Zhao et al. (IMC07) used symmetric stable random projections (Indyk JACM06, Li SODA08) to approximate moments and Shannon entropy.
        – Harvey et al. (ITW08): a theoretical paper that proposed a criterion on how close $\alpha$ should be to 1, with symmetric stable random projections as the underlying algorithm.
        – Harvey et al. (FOCS08) proposed refined criteria on how to choose $\alpha$ and cited both symmetric stable random projections and Compressed Counting as underlying algorithms.

  13. Anomaly Detection in Large Networks Using Entropy of Traffic
      Example: Laura Feinstein, Dan Schnackenberg, Ravindra Balupari, and Darrell Kindred. Statistical approaches to DDoS attack detection and response. In DARPA Information Survivability Conference and Exposition, 2003.
      General idea: Anomalous events (such as failure of service or distributed denial of service (DDoS) attacks) change the distribution of the traffic data. The change of distribution can be characterized by the change of entropy.

  14. Previous Methods for Estimating $F_{(\alpha)}$
      • The pioneering work, [AMS STOC'96]
      • A popular algorithm, symmetric stable random projections [Indyk JACM'06], [Li SODA'08]
        – Basic idea: Let $X = A_t \times R$, where entries of $R \in \mathbb{R}^{D \times k}$ are sampled from a symmetric $\alpha$-stable distribution. Entries of $X \in \mathbb{R}^{k}$ are also samples from a symmetric $\alpha$-stable distribution with the scale $= F_{(\alpha)}$.
        – $k = O\left(1/\epsilon^{2}\right)$, the large-deviation bound. $k$ may be too large for real applications [GC RANDOM'07].

  15. Compressed Counting: Skewed Stable Random Projections
      Original data stream signal: $A_t[i]$, $i = 1$ to $D$, e.g., $D = 2^{64}$
      Projected signal: $X_t = A_t \times R \in \mathbb{R}^{k}$, where $k$ is small (e.g., $k = 20 \sim 100$)
      Projection matrix: $R \in \mathbb{R}^{D \times k}$, with entries sampled i.i.d. from a skewed $\alpha$-stable distribution.

  16. The Standard Data Stream Technique: Incremental Projection
      Linear projection: $X_t = A_t \times R$
      + Linear data model: $A_t[i_t] = A_{t-1}[i_t] + I_t$
      $\Longrightarrow$ Conduct $X_t = A_t \times R$ incrementally. Generate entries of $R$ on demand.
      Our method differs from previous algorithms in the choice of the distribution of $R$.
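
A minimal Python sketch of incremental projection, under stated assumptions. Row $i$ of $R$ is regenerated on demand from a random generator seeded by the index, which is one hypothetical way to avoid storing $R$; the entries here are standard normal purely as a stand-in, whereas the talk samples them from a skewed $\alpha$-stable distribution. The names `row_of_R`, `update_sketch`, `K`, and `MASTER_SEED` are illustrative.

```python
import random

K = 4                # sketch size k (kept tiny for the example)
MASTER_SEED = 2009   # hypothetical seed shared by all rows of R


def row_of_R(index, k=K, seed=MASTER_SEED):
    """Regenerate row `index` of R on demand; the same index always gives the same row.

    Stand-in distribution: standard normal. The talk uses skewed stable entries.
    """
    rng = random.Random(seed * 1_000_003 + index)
    return [rng.gauss(0.0, 1.0) for _ in range(k)]


def update_sketch(sketch, index, increment):
    """Incremental projection: X_t[j] = X_{t-1}[j] + I_t * r_{i_t, j}."""
    for j, r_ij in enumerate(row_of_R(index, len(sketch))):
        sketch[j] += increment * r_ij


# Process the bookstore stream one element at a time.
stream = [(3, 10), (1, 5), (3, -8)]
sketch = [0.0] * K
for index, increment in stream:
    update_sketch(sketch, index, increment)

# Batch reference: project the final state A_t = {1: 5, 3: 2} in one go.
final_state = {1: 5, 3: 2}
batch = [
    sum(value * row_of_R(index)[j] for index, value in final_state.items())
    for j in range(K)
]
```

Up to floating-point rounding, `sketch` and `batch` agree, because the projection is linear in the updates. Only the $k$ numbers in `sketch` are stored between updates.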

  17. Recover $F_{(\alpha)}$ from Projected Data
      $X_t = (x_1, x_2, \ldots, x_k) = A_t \times R$, where $R = \{r_{ij}\} \in \mathbb{R}^{D \times k}$ and $r_{ij} \sim S(\alpha, \beta, 1)$.
      $S(\alpha, \beta, \gamma)$: $\alpha$-stable, $\beta$-skewed distribution with scale $\gamma$.
      Then, by stability, at any $t$, the $x_j$'s are i.i.d. stable samples:
      $$x_j \sim S\left( \alpha, \beta, F_{(\alpha)} = \sum_{i=1}^{D} A_t[i]^{\alpha} \right)$$
      $\Longrightarrow$ A statistical estimation problem.

  18. Review of Skewed Stable Distributions
      $Z$ follows a $\beta$-skewed $\alpha$-stable distribution if the Fourier transform of its density is
      $$\mathscr{F}_Z(t) = E \exp\left( \sqrt{-1}\, Z t \right) = \exp\left( -F |t|^{\alpha} \left( 1 - \sqrt{-1}\, \beta\, \mathrm{sign}(t) \tan\left( \frac{\pi \alpha}{2} \right) \right) \right), \quad \alpha \neq 1,$$
      with $0 < \alpha \le 2$, $-1 \le \beta \le 1$, and the scale $F > 0$. We write $Z \sim S(\alpha, \beta, F)$.
      If $Z_1, Z_2 \sim S(\alpha, \beta, 1)$ are independent, then for any $C_1 \ge 0$, $C_2 \ge 0$,
      $$Z = C_1 Z_1 + C_2 Z_2 \sim S\left( \alpha, \beta, F = C_1^{\alpha} + C_2^{\alpha} \right).$$
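
The stability property can be checked by simulation. The sketch below is not from the talk: it draws $S(\alpha, \beta, 1)$ samples with the Chambers-Mallows-Stuck method (in the form given by Weron, 1996, for $\alpha \neq 1$), then compares the empirical characteristic function of $C_1 Z_1 + C_2 Z_2$ with the formula above at scale $C_1^{\alpha} + C_2^{\alpha}$. The parameter values, sample size, and all function names are assumptions chosen for illustration.

```python
import cmath
import math
import random


def sample_skewed_stable(alpha, beta, rng):
    """Draw one S(alpha, beta, 1) sample (Chambers-Mallows-Stuck, alpha != 1)."""
    if alpha == 1.0 or not 0.0 < alpha <= 2.0 or not -1.0 <= beta <= 1.0:
        raise ValueError("need 0 < alpha <= 2, alpha != 1, and -1 <= beta <= 1")
    v = math.pi * (rng.random() - 0.5)  # uniform on (-pi/2, pi/2)
    w = rng.expovariate(1.0)            # exponential with mean 1
    skew = beta * math.tan(math.pi * alpha / 2.0)
    b = math.atan(skew) / alpha
    s = (1.0 + skew * skew) ** (1.0 / (2.0 * alpha))
    return (
        s
        * math.sin(alpha * (v + b))
        / math.cos(v) ** (1.0 / alpha)
        * (math.cos(v - alpha * (v + b)) / w) ** ((1.0 - alpha) / alpha)
    )


def theoretical_cf(t, alpha, beta, scale):
    """exp(-F |t|^alpha (1 - sqrt(-1) beta sign(t) tan(pi alpha / 2)))."""
    sign = (t > 0) - (t < 0)
    tangent = math.tan(math.pi * alpha / 2.0)
    return cmath.exp(-scale * abs(t) ** alpha * (1.0 - 1j * beta * sign * tangent))


def empirical_cf(samples, t):
    """Sample average of exp(sqrt(-1) t Z)."""
    return sum(cmath.exp(1j * t * z) for z in samples) / len(samples)


alpha, beta = 0.5, 1.0  # illustrative choice; alpha < 1 with beta = 1 gives positive samples
c1, c2 = 2.0, 3.0       # non-negative coefficients
n = 20000
rng = random.Random(2009)

z1_samples = [sample_skewed_stable(alpha, beta, rng) for _ in range(n)]
z2_samples = [sample_skewed_stable(alpha, beta, rng) for _ in range(n)]
combined = [c1 * z1 + c2 * z2 for z1, z2 in zip(z1_samples, z2_samples)]

scale = c1 ** alpha + c2 ** alpha  # F = C1^alpha + C2^alpha
t = 0.1
empirical = empirical_cf(combined, t)
theoretical = theoretical_cf(t, alpha, beta, scale)
```

The characteristic function is used for the comparison because stable samples with $\alpha < 2$ have heavy tails, so sample means and variances are not reliable checks.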

  19. If $C_1$ and $C_2$ do not have the same signs, the "stability" does not hold.
      Let $Z = C_1 Z_1 - C_2 Z_2$, with $C_1 \ge 0$ and $C_2 \ge 0$. Because $\mathscr{F}_{-Z_2}(t) = \mathscr{F}_{Z_2}(-t)$,
      $$\mathscr{F}_Z(t) = \exp\left( -|C_1 t|^{\alpha} \left( 1 - \sqrt{-1}\, \beta\, \mathrm{sign}(t) \tan\left( \frac{\pi \alpha}{2} \right) \right) \right) \times \exp\left( -|C_2 t|^{\alpha} \left( 1 + \sqrt{-1}\, \beta\, \mathrm{sign}(t) \tan\left( \frac{\pi \alpha}{2} \right) \right) \right),$$
      which does NOT represent an $S(\alpha, \beta, \cdot)$ law (the skewness $\beta$ is not preserved), unless $\beta = 0$ or $\alpha = 2, 0+$.
      Symmetric ($\beta = 0$) projections work for any data, but if data are non-negative, the benefits of skewed projections are enormous.
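
Combining the two exponentials into a single exponent makes the failure explicit. The following is a sketch of the algebra, using $|C_j t|^{\alpha} = C_j^{\alpha} |t|^{\alpha}$ for $C_j \ge 0$; the symbol $\beta'$ is introduced here for illustration and does not appear in the talk.

```latex
\begin{align*}
\mathscr{F}_Z(t)
&= \exp\!\Big( -|t|^{\alpha} \Big[
     C_1^{\alpha} \big( 1 - \sqrt{-1}\, \beta\, \mathrm{sign}(t) \tan\tfrac{\pi\alpha}{2} \big)
   + C_2^{\alpha} \big( 1 + \sqrt{-1}\, \beta\, \mathrm{sign}(t) \tan\tfrac{\pi\alpha}{2} \big)
   \Big] \Big) \\
&= \exp\!\Big( -\big( C_1^{\alpha} + C_2^{\alpha} \big) |t|^{\alpha}
   \Big[ 1 - \sqrt{-1}\, \beta'\, \mathrm{sign}(t) \tan\tfrac{\pi\alpha}{2} \Big] \Big),
\qquad
\beta' = \beta\, \frac{C_1^{\alpha} - C_2^{\alpha}}{C_1^{\alpha} + C_2^{\alpha}}.
\end{align*}
```

The effective skewness $\beta'$ depends on $C_1$ and $C_2$, so for $C_2 > 0$ it differs from $\beta$ unless $\beta = 0$ or the tangent term vanishes. With mixed signs, the projected samples therefore no longer follow $S(\alpha, \beta, C_1^{\alpha} + C_2^{\alpha})$.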
