

SLIDE 1

Matrix Sketching over Sliding Windows

Zhewei Wei¹, Xuancheng Liu¹, Feifei Li², Shuo Shang¹, Xiaoyong Du¹, Ji-Rong Wen¹

¹ School of Information, Renmin University of China
² School of Computing, The University of Utah

SLIDE 2

Matrix data

  • Modern data sets are modeled as large matrices.
  • Think of $A \in \mathbb{R}^{n \times d}$ as $n$ rows in $\mathbb{R}^d$.

Data               Rows            Columns          d              n
Textual            Documents       Words            10^5 – 10^7    >10^10
Actions            Users           Types            10^1 – 10^4    >10^7
Visual             Images          Pixels, SIFT     10^5 – 10^6    >10^8
Audio              Songs, tracks   Frequencies      10^5 – 10^6    >10^8
Machine Learning   Examples        Features         10^2 – 10^4    >10^6
Financial          Prices          Items, Stocks    10^3 – 10^5    >10^6

SLIDE 3

Singular Value Decomposition (SVD)

$A = U \times \Sigma \times V^T$, where $A$ is the $n \times d$ data matrix, $U$ is $n \times n$, and $\Sigma$ carries the singular values $\sigma_1, \sigma_2, \ldots, \sigma_d$ on its diagonal.

  • Principal component analysis (PCA)
  • K-means clustering
  • Latent semantic indexing (LSI)
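
As a small illustration of the decomposition on this slide, the following NumPy sketch factors a random matrix and checks the reconstruction. The matrix sizes, the seed, and the rank `k` are assumptions chosen for the example; the rank-`k` truncation is included only because it is the step that applications such as PCA and LSI build on.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, chosen arbitrarily
A = rng.standard_normal((100, 5))       # n = 100 rows in R^5 (toy sizes)

# Thin SVD: A = U @ diag(s) @ Vt, with singular values s sorted descending.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
reconstruction_ok = np.allclose(A, U @ np.diag(s) @ Vt)

# Rank-k truncation: keep the k largest singular values and their vectors.
k = 2
A_k = (U[:, :k] * s[:k]) @ Vt[:k]

# Eckart-Young: the spectral-norm error of the best rank-k approximation
# equals the (k+1)-th singular value, i.e. s[k] with zero-based indexing.
rank_k_error = np.linalg.norm(A - A_k, 2)
```

The thin SVD (`full_matrices=False`) is used here because only the first `d` columns of `U` contribute to the product when `n > d`.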


SLIDE 4

SVD & Eigenvalue decomposition

Covariance matrix $A^T A$:

$A^T \times A = V \times \Sigma^2 \times V^T$, where $\Sigma^2$ is diagonal with entries $\sigma_1^2, \sigma_2^2, \ldots, \sigma_d^2$.
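
The link between the SVD of $A$ and the eigenvalue decomposition of $A^T A$ can be spelled out in one line. The derivation below is a standard identity, written with $\Sigma^2$ as shorthand for $\Sigma^T \Sigma$ and $\sigma_i$ for the singular values (notation assumed here, since the slide's symbols did not survive extraction).

```latex
\begin{aligned}
A^T A &= (U \Sigma V^T)^T (U \Sigma V^T) \\
      &= V \Sigma^T U^T U \Sigma V^T \\
      &= V \Sigma^T \Sigma V^T && \text{since } U^T U = I \\
      &= V \Sigma^2 V^T, && \Sigma^2 = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_d^2).
\end{aligned}
```

So the eigenvalues of $A^T A$ are the squared singular values of $A$, and its eigenvectors are the columns of $V$.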

SLIDE 5

Matrix Sketching

  • Computing SVD is slow (and offline).
  • Matrix sketching: approximate large matrix $A \in \mathbb{R}^{n \times d}$ with $B \in \mathbb{R}^{\ell \times d}$, $\ell \ll n$, in an online fashion.
  • Row-update stream: each update receives a row.
  • Covariance error [Liberty2013, Ghashami2014, Woodruff2016]: $\|A^T A - B^T B\| / \|A\|_F^2 \le \varepsilon$.
  • Feature hashing [Weinberger2009], random projection [Papadimitriou2011], …
  • Frequent Directions (FD) [Liberty2013]:
    • $B \in \mathbb{R}^{\ell \times d}$, $\ell = \frac{1}{\varepsilon}$, s.t. covariance error $\le \varepsilon$.

[Figure: the $n \times d$ matrix $A$ with row $a_i$, and the sketch $B$ with $\ell$ rows]
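
To make the Frequent Directions idea concrete, here is a minimal sketch of one common formulation, not necessarily the exact variant used by the authors. It assumes a working buffer of `2 * ell` rows that is shrunk back to at most `ell` non-zero rows whenever it fills up, and it measures the covariance error in the spectral norm. The function names, matrix sizes, and seed are illustrative assumptions.

```python
import numpy as np

def frequent_directions(A, ell):
    """Sketch the rows of A one at a time using a buffer of 2*ell rows."""
    A = np.asarray(A, dtype=float)
    d = A.shape[1]
    B = np.zeros((2 * ell, d))
    next_free = 0
    for a in A:
        if next_free == 2 * ell:
            # Buffer full: rotate onto singular directions, then shrink.
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            delta = s[ell] ** 2 if len(s) > ell else 0.0
            shrunk = np.sqrt(np.maximum(s ** 2 - delta, 0.0))
            B = np.zeros((2 * ell, d))
            B[: len(s)] = shrunk[:, None] * Vt
            next_free = ell      # rows ell, ell+1, ... are all zero again
        B[next_free] = a
        next_free += 1
    return B

def covariance_error(A, B):
    """||A^T A - B^T B||_2 / ||A||_F^2 (spectral norm assumed)."""
    return np.linalg.norm(A.T @ A - B.T @ B, 2) / np.linalg.norm(A, "fro") ** 2

rng = np.random.default_rng(1)
A = rng.standard_normal((2000, 30))
ell = 10
B = frequent_directions(A, ell)
err = covariance_error(A, B)     # for this variant, at most 1 / ell
```

Each shrink step subtracts the same amount `delta` from every squared singular value, which removes at least `ell * delta` of Frobenius mass while changing the covariance by at most `delta` in spectral norm; that trade-off is what yields the `1 / ell` bound for this variant.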

SLIDE 6

Matrix Sketching over Sliding Windows

  • Each row is associated with a timestamp.
  • Maintain $B_W$ for $A_W$: rows in sliding window $W$.

Covariance error: $\|A_W^T A_W - B_W^T B_W\| / \|A_W\|_F^2 \le \varepsilon$

  • Sequence-based window: past N rows.
  • Time-based window: rows in a past time period Δ.

[Figure: sequence-based window, $A_W$: $N$ rows; time-based window, $A_W$: rows in $\Delta$ time units]
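
The two window types can be sketched as simple row selections. This is only an illustration of the definitions: the function names are hypothetical, and treating the time-based window as the half-open interval `(now - delta, now]` is an assumption, since the slide does not say whether the boundary is inclusive.

```python
import numpy as np

def sequence_window(rows, N):
    """Rows of the sequence-based window: the most recent N rows."""
    return np.asarray(rows, dtype=float)[-N:]

def time_window(rows, timestamps, now, delta):
    """Rows whose timestamp lies in (now - delta, now] (boundary assumed)."""
    rows = np.asarray(rows, dtype=float)
    t = np.asarray(timestamps, dtype=float)
    return rows[(t > now - delta) & (t <= now)]

rows = [[1.0, 0.0], [0.0, 2.0], [3.0, 0.0], [0.0, 4.0]]
timestamps = [0, 1, 5, 6]

A_seq = sequence_window(rows, N=2)                       # last two rows
A_time = time_window(rows, timestamps, now=6, delta=2)   # timestamps 5 and 6
window_cov = A_seq.T @ A_seq                             # exact covariance of the window
```

In this toy input both windows happen to contain the same two rows; in general a time-based window can hold any number of rows, which is part of what makes it harder to handle than a sequence-based one.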

SLIDE 7

Motivation 1: Sliding windows vs. unbounded streams

  • Sliding window model is a more appropriate model in many real-world applications.
  • Particularly so in the areas of data analysis wherein matrix sketching techniques are widely used.
  • Applications:
    • Analyzing tweets for the past 24 hours.
    • Sliding window PCA for detecting changes and anomalies [Papadimitriou2006, Qahtan2015].

SLIDE 8

Motivation 2: Lower bound

  • Unbounded stream solution: use $O(d^2)$ space to store $A^T A$.
  • Update: $A^T A \leftarrow A^T A + a_i^T a_i$

Theorem 4.1 An algorithm that returns $A^T A$ for any sequence-based sliding window must use $\Omega(Nd)$ bits of space.

  • Matrix sketching is necessary for sliding windows, even when the dimension $d$ is small.
  • Matrix sketching over sliding windows requires new techniques.
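
The contrast on this slide can be illustrated with two tiny accumulators. The first applies the unbounded-stream update and keeps only a $d \times d$ matrix. The second is a naive exact baseline for a sequence-based window (not one of the paper's algorithms): to subtract an expired row it has to remember that row, so it stores all $N$ rows. Class names, sizes, and the seed are assumptions made for the example.

```python
import numpy as np
from collections import deque

class StreamCovariance:
    """Unbounded stream: keeps d*d numbers and no rows."""
    def __init__(self, d):
        self.cov = np.zeros((d, d))

    def update(self, a):
        a = np.asarray(a, dtype=float)
        self.cov += np.outer(a, a)          # a_i^T a_i for a row vector a_i

class ExactWindowCovariance:
    """Naive exact baseline for the last N rows: must store the rows."""
    def __init__(self, d, N):
        self.N = N
        self.rows = deque()
        self.cov = np.zeros((d, d))

    def update(self, a):
        a = np.asarray(a, dtype=float)
        self.rows.append(a)
        self.cov += np.outer(a, a)
        if len(self.rows) > self.N:
            expired = self.rows.popleft()   # oldest row leaves the window
            self.cov -= np.outer(expired, expired)

rng = np.random.default_rng(2)
A = rng.standard_normal((200, 4))
N = 50
stream, window = StreamCovariance(4), ExactWindowCovariance(4, N)
for a in A:
    stream.update(a)
    window.update(a)
```

The baseline holds on the order of $Nd$ values, which is the kind of cost the lower bound says an exact sliding-window answer cannot avoid.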

SLIDE 9

Three algorithms

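Sketches   Update                                               Space                                                Window            Interpretable?
Sampling   $\frac{d}{\varepsilon^2} \log\log NR$                $\frac{d}{\varepsilon^2} \log NR$                    Sequence & time   Yes
LM-FD      $d \log \varepsilon NR$                              $\frac{1}{\varepsilon^2} \log \varepsilon NR$        Sequence & time   No
DI-FD      $\frac{d}{\varepsilon} \log \frac{R}{\varepsilon}$   $\frac{R}{\varepsilon} \log \frac{R}{\varepsilon}$   Sequence          No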

  • Sampling:
    • Sample $a_i$ w.p. proportional to $\|a_i\|^2$ [Frieze2004].
    • Priority sampling [Efraimidis2006] + Sliding window top-k.
  • LM-FD: Exponential Histogram (Logarithmic method) [Datar2002] + Frequent Directions.
  • DI-FD: Dyadic interval techniques [Arasu2004] + Frequent Directions.
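
The sampling ingredient can be illustrated offline, without any window. The sketch below draws `ell` rows independently with probability proportional to their squared norms and rescales each sampled row so that $B^T B$ is an unbiased estimate of $A^T A$. It does not implement the priority-sampling or sliding-window top-k parts; the function name, sizes, and seed are assumptions.

```python
import numpy as np

def norm_sampling_sketch(A, ell, rng):
    """Sample ell rows of A with probability proportional to ||a_i||^2."""
    A = np.asarray(A, dtype=float)
    sq_norms = np.einsum("ij,ij->i", A, A)        # squared norm of each row
    p = sq_norms / sq_norms.sum()
    idx = rng.choice(len(A), size=ell, replace=True, p=p)
    # Rescale by 1 / sqrt(ell * p_i) so that E[B^T B] = A^T A.
    B = A[idx] / np.sqrt(ell * p[idx])[:, None]
    return B, idx

rng = np.random.default_rng(3)
A = rng.standard_normal((1000, 8))
ell = 200
B, idx = norm_sampling_sketch(A, ell, rng)
```

Every row of `B` is a rescaled copy of a row of `A`, which is the sense in which a sampling sketch is interpretable. With this rescaling each sampled row has squared norm $\|A\|_F^2 / \ell$, so $\|B\|_F^2 = \|A\|_F^2$ exactly.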

SLIDE 10

Three algorithms

Sketches   Update   Space                Window            Interpretable?
Sampling   Slow     Large                Sequence & time   Yes
LM-FD      Fast     Small                Sequence & time   No
DI-FD      Slow     Best for small $R$   Sequence          No

  • Interpretable: rows of the sketch $B$ come from $A$.
  • $R$: ratio between the maximum and the minimum squared norm of the rows.
  • Sampling:
    • Sample $a_i$ w.p. proportional to $\|a_i\|^2$ [Frieze2004].
    • Priority sampling [Efraimidis2006] + Sliding window top-k.
  • LM-FD: Exponential Histogram (Logarithmic method) [Datar2002] + Frequent Directions.
  • DI-FD: Dyadic interval techniques [Arasu2004] + Frequent Directions.
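
The norm ratio that drives the DI-FD bounds is easy to compute for a given matrix. The helper below is a hypothetical illustration that assumes every row is non-zero; the name `norm_ratio` and the toy inputs are not from the slides.

```python
import numpy as np

def norm_ratio(A):
    """Ratio of the largest to the smallest squared row norm (rows non-zero)."""
    A = np.asarray(A, dtype=float)
    sq_norms = np.einsum("ij,ij->i", A, A)
    return sq_norms.max() / sq_norms.min()

R_skewed = norm_ratio([[1.0, 0.0], [0.0, 3.0]])   # squared norms 1 and 9
R_unit = norm_ratio([[1.0, 0.0], [0.0, 1.0]])     # all rows have unit norm
```

Data whose rows are normalized to unit length gives a ratio of 1, the most favorable case for a bound that grows with the ratio.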

SLIDE 11

Experiments: space vs. error

[Figure: space vs. error, three panels with $R = 8.35$, $R = 1$, and $R = 90089$]


SLIDE 12

Experiments: time vs. space

[Figure: time vs. space, three panels with $R = 8.35$, $R = 1$, and $R = 90089$]


SLIDE 13

Conclusions

  • First attempt to tackle the sliding window matrix sketching problem.
  • Lower bounds show that the sliding window model is different from the unbounded streaming model for the matrix sketching problem.
  • Propose algorithms for both time-based and sequence-based windows with theoretical guarantees and experimental evaluation.

SLIDE 14

Thanks!