

SLIDE 1

CS345a: Data Mining
Jure Leskovec and Anand Rajaraman
Stanford University

SLIDE 2

Homework 2 is out:

- Due Monday 15th at midnight!
- Submit PDFs

Talk:

- http://rain.stanford.edu
- Wed at 12:30 in Terman 453
- Yehuda Koren – winner of the Netflix challenge!

SLIDE 3

- Text - LSI: find 'concepts'

SLIDE 4

- Compress / reduce dimensionality:
  • 10^6 rows; 10^3 columns; no updates
  • random access to any cell(s); small error: OK

SLIDE 5


SLIDE 6


SLIDE 7

- A[n x m] = U[n x r] Σ[r x r] (V[m x r])^T

- A: n x m matrix
  (e.g., n documents, m terms)
- U: n x r matrix
  (n documents, r concepts)
- Σ: r x r diagonal matrix
  (strength of each 'concept') (r: rank of the matrix)
- V: m x r matrix
  (m terms, r concepts)
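A minimal numpy sketch of these shapes (numpy is an assumption; the slides name no library):

```python
import numpy as np

n, m = 7, 5                      # n documents, m terms
A = np.random.rand(n, m)         # stand-in document-term matrix

# full_matrices=False gives the "economy" SVD: U is n x r and Vt is r x m,
# with r = min(n, m); s holds the singular values in decreasing order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

print(U.shape, s.shape, Vt.shape)            # (7, 5) (5,) (5, 5)
assert np.allclose(A, U @ np.diag(s) @ Vt)   # A = U Sigma V^T
```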

SLIDE 8

[Diagram: the n x m matrix A factored as U (n x r) times Σ (r x r) times V^T (r x m).]

SLIDE 9

[Diagram: the n x m matrix A approximated as σ1 u1 v1^T + σ2 u2 v2^T + ...]

SLIDE 10

THEOREM [Press+92]: it is always possible to decompose a matrix A into A = U Σ V^T, where

- U, Σ, V: unique
- U, V: column orthonormal:
  • U^T U = I; V^T V = I (I: identity matrix)
  • (columns are orthogonal unit vectors)
- Σ: diagonal
  • entries (singular values) are positive, and sorted in decreasing order
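The theorem's claims are easy to check numerically (a sketch, numpy assumed):

```python
import numpy as np

A = np.random.rand(7, 5)
U, s, Vt = np.linalg.svd(A, full_matrices=False)

I = np.eye(5)
assert np.allclose(U.T @ U, I)    # columns of U are orthonormal
assert np.allclose(Vt @ Vt.T, I)  # columns of V are orthonormal
assert np.all(s >= 0)             # singular values are non-negative
assert np.all(np.diff(s) <= 0)    # and sorted in decreasing order
```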

SLIDE 11

- A = U Σ V^T - example
  (columns of A: data, inf., retrieval, brain, lung; rows 1-4: CS documents, rows 5-7: MD documents)

        A                   U                Σ                  V^T
    1 1 1 0 0           0.18 0
    2 2 2 0 0           0.36 0
    1 1 1 0 0           0.18 0          9.64 0           0.58 0.58 0.58 0    0
    5 5 5 0 0     =     0.90 0      x   0    5.29    x   0    0    0    0.71 0.71
    0 0 0 2 2           0    0.53
    0 0 0 3 3           0    0.80
    0 0 0 1 1           0    0.27
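The same toy example can be reproduced in a few lines (a sketch, numpy assumed; SVD is unique only up to signs, so columns may come back flipped):

```python
import numpy as np

#             data inf. retr. brain lung
A = np.array([[1, 1, 1, 0, 0],   # CS documents
              [2, 2, 2, 0, 0],
              [1, 1, 1, 0, 0],
              [5, 5, 5, 0, 0],
              [0, 0, 0, 2, 2],   # MD documents
              [0, 0, 0, 3, 3],
              [0, 0, 0, 1, 1]], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(np.round(s[:2], 2))      # [9.64 5.29]
print(np.round(U[:, :2], 2))   # col 1: CS-concept, col 2: MD-concept
print(np.round(Vt[:2], 2))     # rows: concept weights over the 5 terms
```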

SLIDE 12

- A = U Σ V^T - example: the two columns of U (and the two rows of V^T) correspond to a 'CS-concept' and an 'MD-concept'.

SLIDE 13

- A = U Σ V^T - example: U is the doc-to-concept similarity matrix (rows: documents, columns: the CS- and MD-concepts).

SLIDE 14

- A = U Σ V^T - example: the diagonal entry 9.64 of Σ is the 'strength' of the CS-concept.

SLIDE 15

- A = U Σ V^T - example: V is the term-to-concept similarity matrix; the first row of V^T is the CS-concept.


SLIDE 17

‘d t ’ ‘t ’ d ‘ t ’ ‘documents’, ‘terms’ and ‘concepts’:

 U: document‐to‐concept similarity matrix  V: term‐to‐concept sim. matrix

 it di l l t

 : its diagonal elements:

‘strength’ of each concept

2/9/2010 Jure Leskovec & Anand Rajaraman, Stanford CS345a: Data Mining 17

SLIDE 18

SVD: gives the best axis to project on

- best axis to project on: the first singular vector v1
  ('best' = minimum sum of squares of projection errors, i.e., minimum RMS error)

SLIDE 19

- A = U Σ V^T - example: v1 = (0.58, 0.58, 0.58, 0, 0), the first row of V^T, is the best axis to project the documents on.

SLIDE 20

- A = U Σ V^T - example: σ1 = 9.64 measures the variance ('spread') on the v1 axis.

SLIDE 21

- A = U Σ V^T - example: U Σ gives the coordinates of the points (documents) along the projection axes.
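As a sketch (numpy assumed; signs may flip, as before), the coordinates are just U scaled by the singular values:

```python
import numpy as np

A = np.array([[1, 1, 1, 0, 0], [2, 2, 2, 0, 0], [1, 1, 1, 0, 0],
              [5, 5, 5, 0, 0], [0, 0, 0, 2, 2], [0, 0, 0, 3, 3],
              [0, 0, 0, 1, 1]], dtype=float)
U, s, Vt = np.linalg.svd(A, full_matrices=False)

coords = U[:, :2] * s[:2]      # U Sigma: document coordinates on (v1, v2)
print(np.round(coords, 2))     # doc 4 lands at ~(8.66, 0): far out on v1
```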

SLIDE 22

More details

- Q: how exactly is dim. reduction done?

SLIDE 23

More details

- Q: how exactly is dim. reduction done?
- A: set the smallest singular values to zero

SLIDE 24

- Setting the smaller singular value to zero:

    1 1 1 0 0           0.18 0
    2 2 2 0 0           0.36 0
    1 1 1 0 0           0.18 0          9.64 0           0.58 0.58 0.58 0    0
    5 5 5 0 0     ~     0.90 0      x   0    0       x   0    0    0    0.71 0.71
    0 0 0 2 2           0    0.53
    0 0 0 3 3           0    0.80
    0 0 0 1 1           0    0.27


SLIDE 26

- Dropping the zeroed concept shrinks the factors:

    1 1 1 0 0           0.18
    2 2 2 0 0           0.36
    1 1 1 0 0           0.18
    5 5 5 0 0     ~     0.90       x   9.64   x   0.58 0.58 0.58 0 0
    0 0 0 2 2           0
    0 0 0 3 3           0
    0 0 0 1 1           0

SLIDE 27

- Multiplying back gives the rank-1 approximation of A:

    1 1 1 0 0           1 1 1 0 0
    2 2 2 0 0           2 2 2 0 0
    1 1 1 0 0           1 1 1 0 0
    5 5 5 0 0     ~     5 5 5 0 0
    0 0 0 2 2           0 0 0 0 0
    0 0 0 3 3           0 0 0 0 0
    0 0 0 1 1           0 0 0 0 0
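The truncation of the last few slides, as a sketch (numpy assumed):

```python
import numpy as np

A = np.array([[1, 1, 1, 0, 0], [2, 2, 2, 0, 0], [1, 1, 1, 0, 0],
              [5, 5, 5, 0, 0], [0, 0, 0, 2, 2], [0, 0, 0, 3, 3],
              [0, 0, 0, 1, 1]], dtype=float)
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 1
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]   # keep only the largest sigma
print(np.round(A_k, 2))   # CS block survives intact; MD block collapses to 0
```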

SLIDE 28

- Equivalent: 'spectral decomposition' of the matrix (shown on the running A = U Σ V^T example).

SLIDE 29

- Equivalent: 'spectral decomposition' of the matrix:

  A = [u1 u2] x diag(σ1, σ2) x [v1 v2]^T

  (u1, u2: columns of U; v1, v2: columns of V)

SLIDE 30

- Equivalent: 'spectral decomposition' of the matrix:

  A = σ1 u1 v1^T + σ2 u2 v2^T + ...

  (A is n x m)

SLIDE 31

- Equivalent: 'spectral decomposition' of the matrix:

  A = σ1 u1 v1^T + σ2 u2 v2^T + ...   (r terms)

  (A is n x m; each ui is n x 1, each vi^T is 1 x m)
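The rank-1 terms really do add back up to A (a sketch, numpy assumed):

```python
import numpy as np

A = np.array([[1, 1, 1, 0, 0], [2, 2, 2, 0, 0], [1, 1, 1, 0, 0],
              [5, 5, 5, 0, 0], [0, 0, 0, 2, 2], [0, 0, 0, 3, 3],
              [0, 0, 0, 1, 1]], dtype=float)
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# one n x m rank-1 matrix sigma_i * u_i * v_i^T per singular value
terms = [s[i] * np.outer(U[:, i], Vt[i]) for i in range(len(s))]
assert np.allclose(A, sum(terms))   # the terms sum back to A
```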

SLIDE 32

- Approximation / dim. reduction: keep the first few terms (Q: how many?)

  A ~ σ1 u1 v1^T + σ2 u2 v2^T + ...

  assume: σ1 >= σ2 >= ...

SLIDE 33

- Answer (heuristic [Fukunaga]): keep 80-90% of the 'energy' (= Σi σi^2)

  A ~ σ1 u1 v1^T + σ2 u2 v2^T + ...

  assume: σ1 >= σ2 >= ...
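The heuristic as a small helper (a sketch, numpy assumed; choose_k is a hypothetical name, not from the slides):

```python
import numpy as np

def choose_k(s, energy=0.9):
    """Smallest k whose singular values keep >= `energy` of sum(sigma_i^2)."""
    frac = np.cumsum(s**2) / np.sum(s**2)
    return int(np.searchsorted(frac, energy) + 1)

s = np.array([9.64, 5.29])   # the running example
print(choose_k(s))           # 2: sigma_1 alone holds only ~77% of the energy
```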

SLIDE 34

- O(nm^2) or O(n^2 m) (whichever is less)
- But:
  • less work if we just want the singular values
  • or if we want the first k singular vectors
  • or if the matrix is sparse
- Implemented:
  • in linear algebra packages like LINPACK, Matlab, SPlus, Mathematica ...
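For scale, a modern counterpart of the packages above is a truncated sparse SVD (a sketch; scipy is an assumption and postdates the slide's list):

```python
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds

# a sparse 10^4 x 10^3 matrix; only k singular triplets get computed
A = sparse_random(10_000, 1_000, density=0.01, format='csr', random_state=0)
U, s, Vt = svds(A, k=10)      # the 10 largest singular triplets
print(np.sort(s)[::-1][:3])   # note: svds returns values in ascending order
```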

SLIDE 35

- SVD: A = U Σ V^T: unique (*)
- U: document-to-concept similarities
- V: term-to-concept similarities
- Σ: strength of each concept
- Dim. reduction:
  • keep the few largest singular values (80-90% of 'energy')
  • SVD: picks up linear correlations

SLIDE 36

- SVD gives us:
  • A = U Σ V^T
- Eigen-decomposition:
  • A = X Λ X^T
  • U, V, X are orthonormal (U^T U = I)
  • Λ, Σ are diagonal
- What is:
  • A A^T = ?
  • A^T A = ?

SLIDE 37

- A A^T = U Σ^2 U^T   (since A A^T = U Σ V^T V Σ U^T and V^T V = I)
- A^T A = V Σ^2 V^T
- (A^T A)^k = V Σ^2k V^T
- (A^T A)^k ~ σ1^2k v1 v1^T   for k >> 1
- (A^T A)^k x ~ (constant) v1   for (almost) any x
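The last line is exactly power iteration; a sketch (numpy assumed) on the running example:

```python
import numpy as np

A = np.array([[1, 1, 1, 0, 0], [2, 2, 2, 0, 0], [1, 1, 1, 0, 0],
              [5, 5, 5, 0, 0], [0, 0, 0, 2, 2], [0, 0, 0, 3, 3],
              [0, 0, 0, 1, 1]], dtype=float)

x = np.random.default_rng(0).random(A.shape[1])  # (almost) any start works
for _ in range(50):
    x = A.T @ (A @ x)        # multiply by A^T A ...
    x /= np.linalg.norm(x)   # ... and renormalize
print(np.round(x, 2))        # ~ +/-[0.58 0.58 0.58 0 0] = v1
```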

SLIDE 38

Q: How to do queries with LSI?

Problem: e.g., find documents with 'data'
(in the running example A = U Σ V^T with terms data, inf., retrieval, brain, lung)

SLIDE 39

Q: How to do queries with LSI?

A: map query vectors into 'concept space' - how?

SLIDE 40

Q: How to do queries with LSI?

A: map query vectors into 'concept space' - how?

  q = [1 0 0 0 0]   (terms: data, inf., retrieval, brain, lung)

[Figure: q drawn in the (term1, term2) plane together with the concept axes v1 and v2.]

SLIDE 41

Q: How to do queries with LSI?

A: map query vectors into 'concept space':
   inner product (cosine similarity) with each 'concept' vector vi

SLIDE 42

Q: How to do queries with LSI?

A: inner product (cosine similarity) with each 'concept' vector vi;
   e.g., q ∘ v1 is q's component along the first concept.

SLIDE 43

Compactly, we have: qconcept = q V

E.g., for the query 'data' (terms: data, inf., retrieval, brain, lung):

                            0.58 0
                            0.58 0
    q = [1 0 0 0 0]    x    0.58 0      =    [0.58 0]
                            0    0.71
                            0    0.71

    (V: term-to-concept similarities; the query scores 0.58 on the CS-concept)
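The same computation as a sketch (numpy assumed):

```python
import numpy as np

# V: term-to-concept similarities (terms: data, inf., retrieval, brain, lung)
V = np.array([[0.58, 0.00],
              [0.58, 0.00],
              [0.58, 0.00],
              [0.00, 0.71],
              [0.00, 0.71]])

q = np.array([1, 0, 0, 0, 0])   # query: 'data'
print(q @ V)                    # [0.58 0.]: the query lands on the CS-concept
```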

SLIDE 44

How would the document ('information', 'retrieval') be handled by LSI?
The same way: dconcept = d V

E.g.:

                            0.58 0
                            0.58 0
    d = [0 1 1 0 0]    x    0.58 0      =    [1.16 0]
                            0    0.71
                            0    0.71

    (V: term-to-concept similarities; the document scores 1.16 on the CS-concept)

SLIDE 45

Observation: the document ('information', 'retrieval') will be retrieved by the query ('data'), although it does not contain 'data'!

  dconcept = [1.16 0]
  qconcept = [0.58 0]
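Numerically (a sketch, numpy assumed), the two concept vectors point the same way, so their cosine similarity is maximal:

```python
import numpy as np

V = np.array([[0.58, 0.00], [0.58, 0.00], [0.58, 0.00],
              [0.00, 0.71], [0.00, 0.71]])

q = np.array([1, 0, 0, 0, 0]) @ V   # query 'data'               -> [0.58 0.]
d = np.array([0, 1, 1, 0, 0]) @ V   # doc ('inf.', 'retrieval')  -> [1.16 0.]

cos = (q @ d) / (np.linalg.norm(q) * np.linalg.norm(d))
print(cos)   # 1.0: perfect match in concept space, despite no shared term
```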

SLIDE 46

+ Optimal low-rank approximation (in L2 norm)

− Interpretability problem:
  • a singular vector specifies a linear combination of all input columns or rows
− Lack of sparsity:
  • singular vectors are dense

[Diagram: A = U Σ V^T, with dense U and V^T.]

SLIDE 47

- Goal: make ||A - CUR|| small

SLIDE 48

- Goal: make ||A - CUR|| small
- U: the pseudo-inverse of the intersection of C and R

SLIDE 49

- Let Ak be the "best" rank-k approximation to A (i.e., the one from SVD)

Theorem [Drineas et al.]: CUR in O(mn) time achieves

  ||A - CUR|| <= ||A - Ak|| + ε ||A||

with probability at least 1-δ, by picking

  • O(k log(1/δ) / ε^2) columns, and
  • O(k^2 log^3(1/δ) / ε^6) rows

SLIDE 50

- Sample columns (similarly for rows): see the sketch below
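The slide does not reproduce the sampling probabilities, but Slide 52's remark that large-norm columns get sampled many times matches the usual norm-squared scheme of Drineas et al.; a sketch under that assumption (numpy assumed, sample_columns is a hypothetical helper):

```python
import numpy as np

def sample_columns(A, c, rng=None):
    """Pick c columns of A with probability proportional to squared norm."""
    if rng is None:
        rng = np.random.default_rng(0)
    p = (A**2).sum(axis=0)
    p = p / p.sum()                            # P(j) ~ |A_j|^2   (assumption)
    idx = rng.choice(A.shape[1], size=c, p=p)  # duplicates are possible
    return A[:, idx] / np.sqrt(c * p[idx])     # standard rescaling (assumption)
```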

SLIDE 51

- Let W be the "intersection" of sampled columns C and rows R
- Then: U = W+, the Moore-Penrose pseudoinverse of W:
  • take the SVD W = X Σ Y^T; then W+ = Y Σ+ X^T
  • Σ+: reciprocals of the non-zero singular values (σ+ii = 1/σii)

[Diagram: W is where the sampled columns C and rows R cross inside A; U = W+.]
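A sketch of that pseudoinverse (numpy assumed; np.linalg.pinv computes the same thing):

```python
import numpy as np

def pinv_via_svd(W, tol=1e-10):
    """Moore-Penrose pseudoinverse: invert only the non-zero singular values."""
    X, sig, Yt = np.linalg.svd(W, full_matrices=False)  # W = X Sigma Y^T
    sig_plus = np.zeros_like(sig)
    nz = sig > tol
    sig_plus[nz] = 1.0 / sig[nz]               # sigma+_ii = 1/sigma_ii
    return Yt.T @ np.diag(sig_plus) @ X.T      # W+ = Y Sigma+ X^T

W = np.random.default_rng(0).random((4, 3))
assert np.allclose(pinv_via_svd(W), np.linalg.pinv(W))
```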

SLIDE 52

+ Easy interpretation
  • since the basis vectors are actual columns and rows
+ Sparse basis
  • since the basis vectors are actual columns and rows
  (an actual column stays sparse where a singular vector is dense)

− Duplicate columns and rows
  • columns of large norms will be sampled many times

SLIDE 53

- If we want to get rid of the duplicates:
  • throw them away
  • scale the columns/rows by the square root of the number of duplicates

[Diagram: A with duplicated samples Cd, Rd reduced to scaled duplicate-free Cs, Rs; construct a small U.]

SLIDE 54

SVD: A = U Σ V^T

  • A: huge but sparse
  • U, V^T: big and dense
  • Σ: sparse and small

CUR: A = C U R

  • A: huge but sparse
  • C, R: big but sparse
  • U: dense but small

SLIDE 55

- Drineas et al., Fast Monte Carlo Algorithms for Matrices III: Computing a Compressed Approximate Matrix Decomposition, SIAM Journal on Computing, 2006.
- J. Sun, Y. Xie, H. Zhang, and C. Faloutsos, Less is More: Compact Matrix Decomposition for Large Sparse Graphs, SDM 2007.
- P. Paschou, M. W. Mahoney, A. Javed, J. R. Kidd, A. J. Pakstis, S. Gu, K. K. Kidd, and P. Drineas, Intra- and Interpopulation Genotype Reconstruction from Tagging SNPs, Genome Research, 17(1), 96-107 (2007).
- M. W. Mahoney, M. Maggioni, and P. Drineas, Tensor-CUR Decompositions for Tensor-Based Data, Proc. 12th Annual SIGKDD, 327-336 (2006).

SLIDE 56

- Slides borrowed from Jimeng Sun, Christos Faloutsos, Michael Mahoney, Hanghang Tong, and Petros Drineas