Generalized Matrix Factorizations as a Unifying Framework for Pattern Set Mining
Complexity Beyond Blocks
Pauli Miettinen, 10 September 2015
Community detection
[Figure: graph on nodes 1, 2, 3 and communities A, B, C, drawn as binary matrices]
Rank-1 matrices
graph using its cliques
a sum of rank-1 matrices
simple patterns
$AB = a_1 b_1^T + a_2 b_2^T + \cdots + a_k b_k^T$
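The identity above can be checked numerically. The following is a minimal sketch in plain Python; the matrices `A` and `B` are made-up illustrative values (three nodes, two overlapping groups), not data from the talk.

```python
def outer(a, b):
    """Outer product a b^T of two vectors, returned as a list of rows."""
    return [[ai * bj for bj in b] for ai in a]

def matmul(A, B):
    """Standard matrix product of A (n x k) and B (k x m)."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(X, Y):
    """Elementwise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

# Hypothetical factors: columns of A are a_1, a_2; rows of B are b_1^T, b_2^T.
A = [[1, 0], [1, 1], [0, 1]]
B = [[1, 1, 0], [0, 1, 1]]

columns_of_A = [list(col) for col in zip(*A)]
rank1_sum = [[0] * len(B[0]) for _ in A]
for a_l, b_l in zip(columns_of_A, B):
    rank1_sum = add(rank1_sum, outer(a_l, b_l))

product = matmul(A, B)
print(rank1_sum == product)  # the two constructions agree
```

Each term `outer(a_l, b_l)` is one rank-1 block, so the product is literally the sum of the blocks.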
Beyond blocks
Limitations of matrix factorization
matrices
nested matrices as a matrix factorization
Generalized outer products
Vectors Parameters
Example: biclique core
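$$o\left(\begin{bmatrix}1\\1\\1\\1\\1\end{bmatrix},\ \begin{bmatrix}1 & 1 & 1 & 1 & 1\end{bmatrix},\ \{1, 2\}\right) = \begin{bmatrix}0&0&1&1&1\\0&0&1&1&1\\1&1&0&0&0\\1&1&0&0&0\\1&1&0&0&0\end{bmatrix}$$
First argument: rows that belong to the pattern
Second argument: columns that belong to the pattern
Third argument: the core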
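A minimal Python sketch of a biclique-core outer product that reproduces the example above. The rule used here (an entry is 1 when its row and column both belong to the pattern and exactly one of the two indices lies in the core) is an assumption inferred from the example matrix, and the name `biclique_core` is hypothetical.

```python
def biclique_core(x, y, core):
    """Assumed rule: entry (i, j) is 1 iff x_i = y_j = 1 and exactly one
    of i, j is in the core (indices are 1-based, as on the slide)."""
    return [[int(x[i] == 1 and y[j] == 1
                 and ((i + 1 in core) != (j + 1 in core)))
             for j in range(len(y))] for i in range(len(x))]

x = [1, 1, 1, 1, 1]   # rows that belong to the pattern
y = [1, 1, 1, 1, 1]   # columns that belong to the pattern
F = biclique_core(x, y, {1, 2})

for row in F:
    print(row)
```

With core {1, 2}, nodes 1 and 2 connect to nodes 3, 4, and 5, and there are no edges inside the core or among the non-core nodes.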
Example: nested matrix
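$$o\left(\begin{bmatrix}1\\1\\1\\1\\1\end{bmatrix},\ \begin{bmatrix}1 & 1 & 1 & 1 & 1\end{bmatrix},\ \begin{bmatrix}1 & 2 & 2 & 5 & 6\end{bmatrix}\right) = \begin{bmatrix}1&1&1&1&1\\0&1&1&1&1\\0&1&1&1&1\\0&0&0&0&1\\0&0&0&0&0\end{bmatrix}$$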
Step function
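One possible reading of the nested-matrix example, sketched in Python. The rule is an assumption: row i is a step function with ones exactly in the columns j ≥ θ_i. This reading yields the 14 ones that appear in the example, but the exact layout is not confirmed by the slide text, and the name `nested` is hypothetical.

```python
def nested(x, y, theta):
    """Assumed step rule: entry (i, j) is 1 iff x_i = y_j = 1 and
    j >= theta_i, with 1-based column index j."""
    return [[int(x[i] == 1 and y[j] == 1 and j + 1 >= theta[i])
             for j in range(len(y))] for i in range(len(x))]

ones = [1, 1, 1, 1, 1]
N = nested(ones, ones, [1, 2, 2, 5, 6])

for row in N:
    print(row)
print(sum(map(sum, N)))  # number of ones in the pattern
```

Under this rule every row's set of ones contains the next row's, which is what makes the matrix nested.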
Generalized decompositions
is a decomposition of X
$X \approx AB = a_1 b_1^T + a_2 b_2^T + \cdots + a_k b_k^T$
$X \approx F_1 \boxplus F_2 \boxplus \cdots \boxplus F_k, \quad F_l = o(x_l, y_l, \theta_l)$
(exactly) with that kind of outer products
has exactly one nonzero at an arbitrary position, its induced rank is always bounded
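To illustrate how the choice of the addition ⊞ changes the result, here is a minimal Python sketch that combines the same two simple patterns under ordinary sum, OR, and XOR. The factor values are made up, and `combine` is a hypothetical helper name.

```python
from functools import reduce
import operator

def outer(a, b):
    """Ordinary outer product, standing in for o(x, y, theta)."""
    return [[ai * bj for bj in b] for ai in a]

def combine(factors, op):
    """Fold a list of equally sized matrices with an elementwise addition op."""
    def pairwise(X, Y):
        return [[op(u, v) for u, v in zip(rx, ry)] for rx, ry in zip(X, Y)]
    return reduce(pairwise, factors)

# Two overlapping 2x2 blocks in a 3x3 matrix (illustrative values).
F1 = outer([1, 1, 0], [1, 1, 0])
F2 = outer([0, 1, 1], [0, 1, 1])

x_sum = combine([F1, F2], operator.add)  # overlap counted twice
x_or = combine([F1, F2], operator.or_)   # overlap stays 1
x_xor = combine([F1, F2], operator.xor)  # overlap cancels to 0
```

The three results differ only at the centre entry, where the two blocks overlap.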
Decomposability
for some f,
as in standard matrix multiplication
$X_{ij} = \boxplus_{l=1}^{k} f\big((x_l)_i,\ (y_l)_j,\ i,\ j,\ \theta_l\big)$
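A decomposable outer product can be built entry by entry from a single function f. The Python sketch below shows two such functions: the ordinary block product and the biclique-core rule. The core rule is an assumption inferred from the earlier example, and all function names are hypothetical.

```python
def from_entrywise(f, x, y, theta):
    """Matrix whose (i, j) entry is f(x_i, y_j, i, j, theta), 1-based i and j."""
    return [[f(x[i], y[j], i + 1, j + 1, theta)
             for j in range(len(y))] for i in range(len(x))]

def f_block(xi, yj, i, j, theta):
    # Ordinary outer product: ignores the position and the parameters.
    return xi * yj

def f_core(xi, yj, i, j, core):
    # Assumed biclique-core rule: exactly one of i, j lies in the core.
    return int(xi == 1 and yj == 1 and ((i in core) != (j in core)))

ones = [1] * 5
block = from_entrywise(f_block, [1, 1, 0, 0, 0], [0, 0, 1, 1, 1], None)
core = from_entrywise(f_core, ones, ones, {1, 2})
```

Both patterns use the same construction and differ only in f, which is the sense in which the entry-wise form mirrors standard matrix multiplication.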
Nice work, but … why?
some weird functions
how some results (and techniques) can be generalized as well
How hard can it be…
maximize |x| + |y|
many distinct rows and columns, NP-hard
columns, the problem is in P
How hard can it be…
summarization?
⊞_{F∈S} F = X, find the smallest C ⊆ S s.t. ⊞_{F∈C} F = X
within superpolylogarithmic for XOR
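The selection problem can be stated as an exhaustive search. The Python sketch below is not an algorithm from the talk: it tries subsets in order of increasing size, so its running time is exponential in |S|, and it is meant only to make the problem statement concrete. The patterns in `S` are made-up values.

```python
from itertools import combinations
from operator import or_, xor

def outer(a, b):
    return [[ai * bj for bj in b] for ai in a]

def combine(factors, op, shape):
    """Elementwise fold of the factors with op, starting from the zero matrix."""
    n, m = shape
    total = [[0] * m for _ in range(n)]
    for F in factors:
        total = [[op(t, f) for t, f in zip(rt, rf)] for rt, rf in zip(total, F)]
    return total

def smallest_subset(S, X, op):
    """Indices of a smallest C ⊆ S whose combination equals X, or None."""
    shape = (len(X), len(X[0]))
    for size in range(len(S) + 1):
        for idx in combinations(range(len(S)), size):
            if combine([S[i] for i in idx], op, shape) == X:
                return idx
    return None

S = [outer([1, 1, 0], [1, 1, 0]),   # top-left block
     outer([0, 1, 1], [0, 1, 1]),   # bottom-right block
     outer([0, 1, 0], [0, 1, 0])]   # single centre entry
X = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]

print(smallest_subset(S, X, or_))
print(smallest_subset(S, X, xor))
```

Under OR the centre entry is redundant, so two patterns suffice. Under XOR the two blocks cancel at the centre, so the third pattern is needed to restore it.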
How hard can it be…
algebra)
but in P for XOR
How hard can it be…
minimizes the error?
superpolylogarithmic factors for OR and XOR
Conclusions
simpler parts
more than just cliques as "rank-1" matrices
cliques
Future
correct level of generality for the outer products
Thank You
Questions?