SLIDE 1

New variants of Nonnegative Matrix Factorization for sparsity improvement and maximum biclique finding

Nicolas Gillis
nicolas.gillis@uclouvain.be

In collaboration with François Glineur
UCL/CORE (Center for Operations Research and Econometrics)
UCL/INMA (Department of Mathematical Engineering)

March 3, 2009, Seminar at CESAME

CESAME Nonnegative Matrix Factorization

SLIDE 2

Outline

1. Introduction to Nonnegative Matrix Factorization
   ◮ Motivations and applications
   ◮ Some algorithms

2. Rank-one update and Nonnegative Factorization
   ◮ Nonnegative Factorization
   ◮ Complexity and the maximum edge biclique problem

3. Greedy with Underapproximations
   ◮ For sparse approximations
   ◮ Algorithm based on Lagrangian relaxation

SLIDE 3

Why low-rank matrix approximations?

Given a matrix M ∈ R^{m×n} and a factorization rank r, we would like to find U ∈ R^{m×r} and V ∈ R^{r×n} such that

    M ≈ UV.

M is approximated by a rank-r matrix → dimensionality reduction for noise filtering, compression, interpretation, classification, . . .


SLIDE 5

Matrix approximation and optimization

If we want to minimize the sum of squares of the error, i.e.

    min_{U,V} ||M − UV||_F² = Σ_{ij} (M − UV)_{ij}²,

the matrix factorization problem is an unconstrained optimization problem. This is a well-known problem with nice properties that can be solved efficiently. It corresponds to finding the principal components of the data matrix (PCA), and can be solved by truncating the singular value decomposition (SVD).
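The SVD truncation mentioned above can be sketched in a few lines of NumPy; the helper name `best_rank_r` is ours, not from the talk:

```python
import numpy as np

def best_rank_r(M, r):
    """Best rank-r approximation of M in the Frobenius norm,
    obtained by truncating the SVD (Eckart-Young theorem)."""
    W, s, Vt = np.linalg.svd(M, full_matrices=False)
    U = W[:, :r] * s[:r]   # m x r factor (left singular vectors, scaled)
    V = Vt[:r, :]          # r x n factor (right singular vectors)
    return U, V

rng = np.random.default_rng(0)
M = rng.random((10, 8))
U, V = best_rank_r(M, 2)
err = np.linalg.norm(M - U @ V)  # minimal over all rank-2 approximations
```

By Eckart-Young, `err` equals the root of the sum of the squared discarded singular values.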


SLIDE 7

Matrix factorization, a linear model

If each column of M is an element of a dataset,

    M:j  ≈  Σ_{k=1}^{r} U:k Vkj,
    (data elements ≈ basis elements × weights)

the columns of M are decomposed into linear combinations of the columns of U, which then form a basis for these elements.

Example.

    M = [2 3 2; 2 1 1; 1 5 0]
      ≈ [1.5 −0.8; 0.7 −0.9; 1.9 1] [1 2.3 0.6; −1 0.7 −1] = UV
      = [2.3 2.9 1.7; 1.6 1 1.3; 0.9 5.1 0.1]
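As a sanity check, the example can be reproduced numerically (the zeros lost in the slide extraction are restored in the matrices below):

```python
import numpy as np

# Data matrix and the rank-2 factors from the example above
M = np.array([[2.0, 3, 2], [2, 1, 1], [1, 5, 0]])
U = np.array([[1.5, -0.8], [0.7, -0.9], [1.9, 1.0]])
V = np.array([[1.0, 2.3, 0.6], [-1.0, 0.7, -1.0]])

UV = U @ V                    # rank-2 approximation of M
err = np.linalg.norm(M - UV)  # Frobenius error, about 0.7
```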


SLIDE 9

Nonnegativity

In many applications, data are nonnegative, often due to physical considerations, e.g.
⋄ images are described by pixel intensities;
⋄ texts are represented by vectors of word counts;
⋄ spectra correspond to power intensities.
For interpretation purposes, one can think of imposing nonnegativity constraints on the factor U so that basis elements belong to the same space as the original data. Moreover, in order to force the reconstruction of the basis elements to be additive, one can impose the weights V to be nonnegative as well.


SLIDE 12

Image Processing

Each column of M represents a face using pixel intensities.

M is a nonnegative matrix.

SLIDE 13

Image Processing

For an unconstrained decomposition:

Figure: Gray: positive entries; Red: negative entries.

Basis elements are not nonnegative and cannot easily be interpreted as facial features.

SLIDE 14

Image Processing

U ≥ 0 constrains the basis elements to be nonnegative. Moreover, V ≥ 0 imposes an additive reconstruction. The basis elements extract facial features such as eyes, nose and lips.


SLIDE 17

Image Processing

NMF allows a part-based representation of the data.

SLIDE 18

Text Mining

M(i, j) is the frequency of word i in text j, i.e. each column of M represents the word frequencies of one text.

SLIDE 19

Text Mining

⋄ Basis elements allow the recovery of the different topics;
⋄ Weights allow the assignment of each text to its corresponding classes.


SLIDE 23

Text Mining

⋄ Basis elements allow the recovery of the different topics:
   ◮ Basis element 1: profit, company, bank, . . . → Economy
   ◮ Basis element 2: run, jump, score, . . . → Sport
⋄ Weights allow the assignment of each text to its corresponding class.


SLIDE 28

Spectral Data Analysis

More than 15000 objects of various types are in orbit (military/commercial satellites, debris, . . . ), hence the need for space object database mining, object identification, clustering, classification, . . .

SLIDE 29

Spectral Data Analysis

The reflectance of a space object results from the additive reflectance of its constitutive elements.


SLIDE 31

Spectral Data Analysis

Example: the Hubble Telescope. Comparing the NMF basis elements with the known reflectances of materials allows the recovery of the constitutive elements of the space objects. Moreover, the weights give the proportions of these elements in these objects.

[PPP06] Pauca, Piper and Plemmons, Nonnegative matrix factorization for spectral data analysis, Linear Algebra and its Applications, 416(1), pp. 29–47, 2006.
[ZWPP08] Zhang, Wang, Plemmons and Pauca, Tensor Methods for Hyperspectral Data Analysis of Space Objects, to appear in J. Optical Soc. Amer. A, 2008.


SLIDE 33

Nonnegative Matrix Factorization

Given a matrix M ∈ R^{m×n}_+ and a factorization rank r ∈ N, find U ∈ R^{m×r} and V ∈ R^{r×n} such that

    min_{U≥0, V≥0} ||M − UV||_F

NMF is an additive linear model for nonnegative data:

    M:j  ≈  Σ_{k=1}^{r} U:k Vkj,   with M:j ≥ 0, U:k ≥ 0 and Vkj ≥ 0.

• Advantages: interpretability, sparsity.
• Disadvantages: higher error, NP-hard problem.

[Vav07] S. Vavasis, On the Complexity of Nonnegative Matrix Factorization, 2007.


SLIDE 37

Example.

Unconstrained factorization:

    M = [2 3 2; 2 1 1; 1 5 0]
      ≈ [1.5 −0.8; 0.7 −0.9; 1.9 1] [1 2.3 0.6; −1 0.7 −1] = UV
      = [2.3 2.9 1.7; 1.6 1 1.3; 0.9 5.1 0.1]

Error: ||M − UV||_F = 0.7.

Nonnegative factorization:

    M = [2 3 2; 2 1 1; 1 5 0]
      ≈ [0 1.9; 0 1; 2.3 0] [0.4 2.2 0; 1.2 1.4 1] = UV
      ≈ [2.3 2.7 1.9; 1.2 1.4 1; 0.9 5 0]

Error: ||M − UV||_F = 1.


SLIDE 39

Remarks

⋄ A crucial point of NMF is to achieve sparse solutions:
   → to get a better part-based representation;
   → to improve interpretability.

[Hoy04] P.O. Hoyer, Nonnegative Matrix Factorization with Sparseness Constraints, J. Machine Learning Research, vol. 5, pp. 1457–1469, 2004.

⋄ Other applications: pattern recognition, email surveillance, bioinformatics, graph clustering, air emission control, . . .

[BBLPP07] M. Berry, M. Browne, A. Langville, P. Pauca, and R.J. Plemmons, Algorithms and Applications for Approximate Nonnegative Matrix Factorization, Computational Statistics and Data Analysis, vol. 52, issue 1, pp. 155–173, 2007.


SLIDE 41

Algorithms for NMF

Given M ≥ 0 and r > 0, the NMF optimization problem is

    min_{U ∈ R^{m×r}, V ∈ R^{r×n}} ||M − UV||_F²   s.t. U ≥ 0, V ≥ 0.

Most algorithms aim at converging to locally optimal solutions, e.g.
⋄ alternating nonnegative least squares;
⋄ projected gradient methods;
⋄ multiplicative updates.


SLIDE 43

Alternating Nonnegative Least Squares

When V is fixed, NMF is a convex optimization problem in U:

    U* = argmin_{U≥0} ||M − UV||_F².

This is a linearly constrained convex quadratic problem, called a nonnegative least squares (NNLS) problem. One can then use the alternating scheme

    U_{k+1} ← argmin_{U≥0} ||M − U V_k||_F
    V_{k+1} ← argmin_{V≥0} ||M − U_{k+1} V||_F

However, NNLS problems are relatively expensive to solve, hence the need for cheaper methods when dealing with large-scale problems.
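The alternating scheme above can be sketched by solving each NNLS subproblem column by column with SciPy's `nnls`; the helper names, iteration count and random initialization are our choices, not from the talk:

```python
import numpy as np
from scipy.optimize import nnls

def nnls_matrix(A, B):
    """Solve min_{X >= 0} ||B - A X||_F, one NNLS per column of B."""
    return np.column_stack([nnls(A, B[:, j])[0] for j in range(B.shape[1])])

def anls_nmf(M, r, iters=20, seed=0):
    """Alternating nonnegative least squares for M ~ U V."""
    rng = np.random.default_rng(seed)
    U = rng.random((M.shape[0], r))
    for _ in range(iters):
        V = nnls_matrix(U, M)        # V-step: min_{V>=0} ||M - U V||_F
        U = nnls_matrix(V.T, M.T).T  # U-step: min_{U>=0} ||M - U V||_F
    return U, V
```

Each half-step is solved exactly, which is precisely why ANLS iterations are comparatively expensive.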


SLIDE 45

First-order methods

    min_{(U,V)≥0} F(U, V) = ½ ||M − UV||_F²   (NMF)

    ∇_U F(U, V) = ∇_U ½ ||M − UV||_F² = −(M − UV) Vᵀ.

Projected Gradient method:

    U_{k+1} ← P_{U≥0}[ U_k − λ_k ∇_U F(U_k, V_k) ]
            = max( 0, U_k + λ_k (M − U_k V_k) V_kᵀ ),   λ_k ≥ 0.

[Lin07] Lin, Projected Gradient Methods for Nonnegative Matrix Factorization, Neural Computation, 19, pp. 2756–2779, 2007.
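A single projected-gradient step on U, as in the update above. The fixed stepsize 1/L is our simplification; [Lin07] chooses λ_k by an Armijo-type line search:

```python
import numpy as np

def pg_step_U(M, U, V, lam):
    """U <- max(0, U - lam * grad_U F), with grad_U F = -(M - U V) V^T."""
    return np.maximum(0.0, U + lam * (M - U @ V) @ V.T)

rng = np.random.default_rng(1)
M = rng.random((6, 5))
U, V = rng.random((6, 2)), rng.random((2, 5))
lam = 1.0 / np.linalg.norm(V @ V.T, 2)  # 1/L for the quadratic in U
U1 = pg_step_U(M, U, V, lam)
```

With step 1/L (L the largest eigenvalue of V Vᵀ), each step keeps U nonnegative and does not increase the error.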


SLIDE 48

Multiplicative Updates

Rescaled Projected Gradient method:

    U_{k+1} ← max( 0, U_k − S_k ◦ ∇_U F(U_k, V_k) ),   S_k ≥ 0,

i.e. the gradient is rescaled entrywise by a nonnegative matrix. With S_k = [U_k] / [U_k V_k V_kᵀ], we obtain

    U_{k+1} ← max( 0, U_k − [U_k] / [U_k V_k V_kᵀ] ◦ (U_k V_k − M) V_kᵀ )
            = U_k ◦ [M V_kᵀ] / [U_k V_k V_kᵀ],

which is nonincreasing for ||M − UV||_F.

⋄ No stepsize to tune.
⋄ But. . . slow convergence.

[LS99] D.D. Lee and H.S. Seung, Learning the Parts of Objects by Nonnegative Matrix Factorization, Nature 401, pp. 788–791, 1999.
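The multiplicative updates above translate directly to NumPy; the small `eps` guarding the divisions and the random initialization are our additions:

```python
import numpy as np

def mu_nmf(M, r, iters=100, eps=1e-9, seed=0):
    """Lee-Seung updates: U <- U * (M V^T)/(U V V^T), symmetrically for V."""
    rng = np.random.default_rng(seed)
    U = rng.random((M.shape[0], r))
    V = rng.random((r, M.shape[1]))
    errs = []
    for _ in range(iters):
        V *= (U.T @ M) / (U.T @ U @ V + eps)
        U *= (M @ V.T) / (U @ V @ V.T + eps)
        errs.append(np.linalg.norm(M - U @ V))
    return U, V, errs
```

Since each update is multiplicative, nonnegative factors stay nonnegative, and the recorded error sequence is (up to the eps guard) nonincreasing, matching the monotonicity claim on the slide.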


SLIDE 51

Outline

1. Introduction to Nonnegative Matrix Factorization
   ◮ Motivations and applications
   ◮ Some algorithms

2. Rank-one update and Nonnegative Factorization
   ◮ Nonnegative Factorization
   ◮ Complexity and the maximum edge biclique problem

3. Greedy with Underapproximations
   ◮ For sparse approximations
   ◮ Algorithm based on Lagrangian relaxation

SLIDE 52

Rank-one updates

The approximation can be expressed as the sum of r rank-one factors:

    M ≈ Σ_{i=1}^{r} U:i Vi:   ⇒   U:k Vk: ≈ R_k = M − Σ_{i≠k} U:i Vi:,

where R_k is called the kth residual matrix. To approximate U:k Vk:, one can optimize U:k and Vk: alternately, each with an exact closed-form solution:

    U*:k = argmin_{U:k ≥ 0} ||M − UV||_F² = max(0, R_k Vk:ᵀ) / ||Vk:||².

This corresponds to a block coordinate descent method.


SLIDE 54

Hierarchical Alternating Least Squares

Initialization: (U, V) ≥ 0.
for k = 1 : r
    U:k = max(0, R_k Vk:ᵀ) / ||Vk:||²
    Vk: = max(0, U:kᵀ R_k) / ||U:k||²
end

with R_k = M − Σ_{i≠k} U:i Vi:.

This algorithm outperforms, in most cases, the other algorithms for NMF.

[CZA07] A. Cichocki, R. Zdunek, and S. Amari, Hierarchical ALS Algorithms for Nonnegative Matrix and 3D Tensor Factorization, in: ICA07, London, Lecture Notes in Computer Science, vol. 4666, Springer, pp. 169–176, 2007.
[HDB08] N.-D. Ho, P. Van Dooren and V. Blondel, Descent Type Algorithms for NMF, in: Numerical Linear Algebra in Signals, Systems and Control, Springer Verlag, 2008.
[GG08] Nonnegative Matrix Factorization and Underapproximation, 9th International Symposium on Iterative Methods in Scientific Computing, France, 2008.
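The HALS loop above, repeated for several outer sweeps, can be sketched as follows; the sweep count, `eps` guard and random initialization are our choices:

```python
import numpy as np

def hals_nmf(M, r, sweeps=100, eps=1e-9, seed=0):
    """Hierarchical ALS: cycle over the r rank-one factors,
    updating U[:, k] and V[k, :] with their closed-form solutions."""
    rng = np.random.default_rng(seed)
    U = rng.random((M.shape[0], r))
    V = rng.random((r, M.shape[1]))
    for _ in range(sweeps):
        for k in range(r):
            # k-th residual R_k = M - sum_{i != k} U[:, i] V[i, :]
            Rk = M - U @ V + np.outer(U[:, k], V[k, :])
            U[:, k] = np.maximum(0, Rk @ V[k, :]) / (V[k, :] @ V[k, :] + eps)
            V[k, :] = np.maximum(0, U[:, k] @ Rk) / (U[:, k] @ U[:, k] + eps)
    return U, V
```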


SLIDE 56

Typical Result

Figure: Hierarchical ALS vs. Multiplicative Updates.

Hierarchical ALS outperforms the multiplicative updates: more accurate and sparser solutions are obtained faster. This behavior can be explained theoretically with a local analysis.

[GG08] Nonnegative Factorization and the maximum edge biclique problem, CORE DP 2008/64.

SLIDE 57

Is it possible to improve HALS?

Back to the subproblem of Hierarchical ALS:

    U:k Vk: ≈ R_k = M − Σ_{i≠k} U:i Vi:  ≱ 0.

Instead of optimizing U:k and Vk: alternately, the question is: is it possible to solve the problem simultaneously for (U:k, Vk:), i.e. to find the best nonnegative rank-one approximation of a not necessarily nonnegative matrix? We define a new problem we call rank-one Nonnegative Factorization (NF): given any real matrix R,

    min_{u≥0, v≥0} ||R − uv||_F²

Is this problem difficult?
⋄ If R ≥ 0: no [Perron–Frobenius and Eckart–Young theorems].
⋄ If R ≱ 0: surprisingly, yes.


SLIDE 61

Complexity of Nonnegative Factorization

In order to prove its NP-hardness, we reduced the maximum edge biclique problem, which is NP-hard, to Nonnegative Factorization.

Given a bipartite graph Gb = (V1 ∪ V2, E ⊆ V1 × V2), find the complete bipartite subgraph (biclique) with the maximum number of edges.

[Peet03] R. Peeters, The maximum edge biclique problem is NP-complete, Discrete Applied Mathematics, 131(3), pp. 651–654, 2003.


SLIDE 64

Reduction to the Biclique Problem

Let M ∈ {0, 1}^{m×n} be the biadjacency matrix of the graph Gb. With binary variables (u, v) indicating which vertices belong to the solution, e.g.

    u = [1 1]ᵀ,   v = [1 1],   uv = [1 1; 1 1],

bicliques of Gb can be represented as binary rank-one matrices.


SLIDE 67

Reduction to the Biclique Problem

Let M ∈ {0, 1}^{m×n} be the biadjacency matrix of the graph Gb. Then

    min_{u,v} ||M − uv||_F
    s.t. uv ≤ M,  u ∈ {0, 1}^m,  v ∈ {0, 1}^n

is an exact formulation, with binary variables, of the biclique problem.
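For small graphs, the exact binary formulation above can be checked by brute force over all (u, v) ∈ {0,1}^m × {0,1}^n; this exponential-time sketch (helper name is ours) only illustrates the formulation:

```python
import itertools
import numpy as np

def max_edge_biclique(M):
    """Maximum edge biclique of the bipartite graph with biadjacency M.
    Since uv <= M forces all selected pairs to be edges, minimizing
    ||M - uv||_F is the same as maximizing sum(u) * sum(v)."""
    m, n = M.shape
    best, best_uv = -1, None
    for u in itertools.product([0, 1], repeat=m):
        u = np.array(u)
        for v in itertools.product([0, 1], repeat=n):
            v = np.array(v)
            if np.all(np.outer(u, v) <= M):       # complete bipartite subgraph
                edges = int(u.sum() * v.sum())    # edges in the biclique
                if edges > best:
                    best, best_uv = edges, (u, v)
    return best, best_uv
```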


slide-69
SLIDE 69

Reduction to the Biclique Problem

M =   1 1 1 1 1 1   and Md =   1 −d 1 −d 1 1 1 −d 1   Rank-one nonnegative factorization problem: min

u≥0,v≥0 ||Md − uv||F

Theorem. For d ≥

  • |E|, optimal solutions of rank-one

Nonnegative Factorization are binary. Therefore, they correspond to the optimal solutions of the biclique problem.

  • Corollary. Nonnegative Factorization is NP-hard.

[GG08] Nonnegative Factorization and the maximum edge biclique problem, CORE DP 2008/64.

CESAME Nonnegative Matrix Factorization 32


slide-73
SLIDE 73

Algorithms for Nonnegative Factorization

We can generalize the NMF algorithms to the NF problem, e.g.:

Multiplicative Updates. For all P, N ≥ 0 such that R = P − N, ||R − UV||_F is nonincreasing under

U ← U ∘ [PVᵀ] / [UVVᵀ + NVᵀ],   V ← V ∘ [UᵀP] / [UᵀUV + UᵀN]

Special case: N = 0 recovers the standard NMF updates.

Hierarchical Alternating Least Squares. Already applicable to matrices that are not necessarily nonnegative.
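As a sketch of how these generalized updates look in code (not the authors' implementation; the splitting P = max(R, 0), N = max(−R, 0) and the small eps safeguard in the denominators are assumptions, not part of the slide):

```python
import numpy as np

def nf_multiplicative_updates(R, r, maxiter=500, eps=1e-9, seed=0):
    """Multiplicative updates for Nonnegative Factorization of R = P - N,
    where R may contain negative entries; U, V stay nonnegative throughout."""
    rng = np.random.default_rng(seed)
    m, n = R.shape
    P, N = np.maximum(R, 0), np.maximum(-R, 0)   # R = P - N, with P, N >= 0
    U = rng.random((m, r))
    V = rng.random((r, n))
    for _ in range(maxiter):
        U *= (P @ V.T) / (U @ (V @ V.T) + N @ V.T + eps)
        V *= (U.T @ P) / ((U.T @ U) @ V + U.T @ N + eps)
    return U, V
```

When R is nonnegative, N = 0 and the loop reduces to the Lee–Seung NMF updates.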

CESAME Nonnegative Matrix Factorization 33

slide-74
SLIDE 74

Algorithms for the biclique problem

NF algorithms can be used as Biclique Finding Algorithms, solving min

u≥0,v≥0 ||Md − uv||F

For example, using the multiplicative updates:

Initialization: (u, v) ≥ 0, d > 0, α > 1.
for i = 1 : maxiter
    u ← u ∘ [Pvᵀ] / [Nvᵀ + uvvᵀ]
    v ← v ∘ [uᵀP] / [uᵀN + uᵀuv]
    d ← αd
end
where P = M and N = d(1 − M), so that Md = P − N.
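A minimal NumPy sketch of this biclique-finding loop (the final thresholding, the starting value d0 and the growth factor alpha are assumptions chosen for illustration, not from the slide):

```python
import numpy as np

def biclique_mu(M, d0=1.0, alpha=1.05, maxiter=300, eps=1e-9, seed=0):
    """Rank-one NF multiplicative updates on Md = M - d(1 - M),
    with the penalty d multiplied by alpha > 1 at every iteration."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    u, v = rng.random(m), rng.random(n)
    P = M.astype(float)
    d = d0
    for _ in range(maxiter):
        N = d * (1.0 - P)                          # so that Md = P - N
        u *= (P @ v) / (N @ v + u * (v @ v) + eps)
        v *= (u @ P) / (u @ N + (u @ u) * v + eps)
        d *= alpha
    # read the biclique off the (near-binary, up to scaling) factors
    return u > 0.5 * u.max(), v > 0.5 * v.max()
```

The increasing d progressively forbids covering any zero of M, so the surviving support of (u, v) indicates a complete bipartite subgraph.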

CESAME Nonnegative Matrix Factorization 34

slide-75
SLIDE 75

Detecting large bicliques with NF

Graph        size           Heuristic M.-S.*      NF with Mult.
                            mean      best        mean      best
ham6-2       64     304     157       225         269       320
ham6-4       64     42      21        36          37        42
ham8-2       256    4672    2839      3920        4569      4770
ham8-4       256    440     226       506         830       1015
john8-2-4    28     36      26        36          28        36
john8-4-4    70     182     132       225         220       225
john16-2-4   120    784     457       700         514       675
john32-2-4   496    14400   6294      9129        8722      9108
MANN-a9      45     289     272       342         342       342
MANN-a27     378    28728   15946     28875       30800     30800

Table: Solutions for DIMACS data: number of edges in the bicliques.

  • Applications. Biclustering gene expression data, recommender systems, community detection, . . .

*[DZLH06] Ding, Zhang, Li and Holbrook, Biclustering Protein Complex Interactions with a Biclique Finding Algorithm, Sixth IEEE International Conference on Data Mining, 2006.

CESAME Nonnegative Matrix Factorization 35

slide-76
SLIDE 76

Outline

  • 1. Introduction to Nonnegative Matrix Factorization

◮ Motivations and applications ◮ Some algorithms

  • 2. Rank-one update and Nonnegative Factorization

◮ Nonnegative Factorization ◮ Complexity and the maximum edge biclique problem

  • 3. Greedy with Underapproximations

◮ For sparse approximations ◮ Algorithm based on Lagrangian relaxation

CESAME Nonnegative Matrix Factorization 36

slide-77
SLIDE 77

Singular Value Decomposition (SVD)

For a nonnegative matrix, the dominant rank-one factor is nonnegative [Perron-Frobenius]. Hence, SVD is optimal for r = 1 with the Frobenius norm. Example:

M = [2 3 2; 2 1 1; 1 5 0] ≈ [1.5; 0.7; 1.9] [1 2.3 0.6] = UV ≈ [1.5 3.5 0.9; 0.7 1.6 0.4; 1.9 4.4 1.1]
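This is easy to check numerically. A quick sketch using the slide's matrix (with the bottom-right entry taken as 0, consistent with the residual shown two slides later; the sign flip handles the SVD's sign ambiguity):

```python
import numpy as np

M = np.array([[2., 3., 2.],
              [2., 1., 1.],
              [1., 5., 0.]])
U, s, Vt = np.linalg.svd(M)
u, v = U[:, 0], Vt[0, :]
if u.sum() < 0:                    # SVD pairs are defined up to a sign
    u, v = -u, -v
A1 = s[0] * np.outer(u, v)         # best rank-one approximation of M
```

By Perron-Frobenius, u and v come out (entrywise) nonnegative, so A1 is a valid nonnegative rank-one factor.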

CESAME Nonnegative Matrix Factorization 37

slide-78
SLIDE 78

Singular Value Decomposition (SVD)

For a nonnegative matrix, the dominant rank-one factor is nonnegative [Perron-Frobenius]. Hence, SVD is optimal for r = 1 with the Frobenius norm. Example:

M = [2 3 2; 2 1 1; 1 5 0] ≈ [1.5 −0.8; 0.7 −0.9; 1.9 1] [1 2.3 0.6; −1 0.7 −1] = UV ≈ [2.3 2.9 1.7; 1.6 1 1.3; 0.9 5.1 0.1]

The second rank-one factor is also optimal but not nonnegative.

CESAME Nonnegative Matrix Factorization 37

slide-79
SLIDE 79

Singular Value Decomposition (SVD)

For a nonnegative matrix, the dominant rank-one factor is nonnegative [Perron-Frobenius]. Hence, SVD is optimal for r = 1 with the Frobenius norm. Example:

M − UV = [2 3 2; 2 1 1; 1 5 0] − [1.5; 0.7; 1.9] [1 2.3 0.6] = [0.5 −0.5 1.1; 1.3 −0.6 0.6; −0.9 0.6 −1.1]

The reason is the presence of negative entries in the residual.

CESAME Nonnegative Matrix Factorization 37

slide-80
SLIDE 80

Underapproximation

With the additional constraint UV ≤ M, any rank-one solution satisfies 0 ≤ u1v1 ≤ M. The difference can in turn be underapproximated with a nonnegative rank-one factor, 0 ≤ u2v2 ≤ M − u1v1, and so on:

0 ≤ [u1 u2 . . . ur][v1; v2; . . . ; vr] = UV ≤ M.

Objective: solve the rank-one underapproximation problem.
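The greedy scheme above can be sketched as follows; `rank_one_solver` is a hypothetical placeholder for any routine returning a rank-one underapproximation of its argument (e.g. the Lagrangian approach of the later slides):

```python
import numpy as np

def greedy_underapproximation(M, r, rank_one_solver):
    """Peel off r rank-one underapproximations: each step computes
    0 <= u v <= R for the current residual R, so R stays nonnegative."""
    R = M.astype(float).copy()
    m, n = M.shape
    U, V = np.zeros((m, r)), np.zeros((r, n))
    for k in range(r):
        u, v = rank_one_solver(R)       # assumed: np.outer(u, v) <= R
        U[:, k], V[k, :] = u, v
        R -= np.outer(u, v)             # residual remains >= 0
    return U, V
```

Because every step underapproximates the current residual, the accumulated UV never exceeds M entrywise.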

CESAME Nonnegative Matrix Factorization 38

slide-84
SLIDE 84

Nonnegative Matrix Underapproximation

min_{U,V≥0} ||M − UV||²_F   s.t.   UV ≤ M

This is NP-hard, even for r = 1. However, the additional constraint UV ≤ M leads to sparser factors. In particular,

Mij = 0 ⇒ (UV)ij = 0 ⇒ Uik = 0 or Vkj = 0, ∀k

[GG09] Using Underapproximations for Sparse Nonnegative Matrix Factorization, CORE DP 2009/6.

CESAME Nonnegative Matrix Factorization 39

slide-86
SLIDE 86

Solving NMU with Lagrangian Dual

Nonnegative Matrix Underapproximation (NMU):

min_{U,V≥0} ||M − UV||²_F   s.t.   UV ≤ M.

Lagrangian Dual. Drop the m × n underapproximation constraints (UV)ij ≤ Mij, move them into the objective function with Lagrange multipliers (dual variables) Λij ≥ 0, and write the standard Lagrangian dual:

max_{Λ≥0} min_{U,V≥0} (1/2)||M − UV||²_F + Σij Λij (UV − M)ij ≡ max_{Λ≥0} min_{U,V≥0} (1/2)||(M − Λ) − UV||²_F

where the two inner problems coincide up to a constant independent of (U, V): for fixed Λ, the inner problem is a Nonnegative Factorization with data matrix M − Λ.
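Completing the square makes the equivalence of the two inner problems explicit, with ⟨·,·⟩ the Frobenius inner product:

```latex
\tfrac12\|M-UV\|_F^2 \;+\; \langle \Lambda,\; UV-M\rangle
\;=\; \tfrac12\|(M-\Lambda)-UV\|_F^2 \;-\; \tfrac12\|\Lambda\|_F^2 .
```

The term −(1/2)||Λ||²_F does not depend on (U, V), so minimizing either side over U, V ≥ 0 gives the same minimizers: a Nonnegative Factorization of M − Λ.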

CESAME Nonnegative Matrix Factorization 40

slide-90
SLIDE 90

Solving the Lagrangian Dual

max_{Λ≥0} L(Λ), where L(Λ) = min_{U,V≥0} (1/2)||M − UV||²_F + Σij Λij (UV − M)ij

  • 1. Initialize Λ ≥ 0.
  • 2. Find (U, V) ≥ 0, i.e., approximate L(Λ), using NF algorithms.
  • 3. Update Λ, e.g. with a projected subgradient step:

Λ ← max(0, Λ + α(UV − M)), with step size α → 0.

Remark. Convergence is not guaranteed, but the scheme is efficient in practice.
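A compact NumPy sketch of this dual scheme; the choice of multiplicative updates for the inner NF solve and the step size α = 1/k are illustrative assumptions, not prescribed by the slide:

```python
import numpy as np

def nmu_lagrangian(M, r, outer=60, inner=40, eps=1e-9, seed=0):
    """Approximate L(Lam) with NF multiplicative updates, then take a
    projected (sub)gradient step on Lam >= 0 with decreasing step size.
    Convergence is not guaranteed, per the slide's remark."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    U, V = rng.random((m, r)), rng.random((r, n))
    Lam = np.zeros((m, n))
    for k in range(1, outer + 1):
        R = M - Lam                                  # NF data matrix
        P, N = np.maximum(R, 0), np.maximum(-R, 0)   # split R = P - N
        for _ in range(inner):                       # step 2: approximate L(Lam)
            U *= (P @ V.T) / (U @ (V @ V.T) + N @ V.T + eps)
            V *= (U.T @ P) / ((U.T @ U) @ V + U.T @ N + eps)
        Lam = np.maximum(0, Lam + (U @ V - M) / k)   # step 3, alpha = 1/k
    return U, V
```

The multipliers grow exactly where UV overshoots M, pushing the next NF solve back under the data matrix.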

CESAME Nonnegative Matrix Factorization 41

slide-92
SLIDE 92

Visual Result for Face Feature Extraction

CESAME Nonnegative Matrix Factorization 42

slide-93
SLIDE 93

NMU achieves sparse solutions

[Hoy04] P.O. Hoyer, Nonnegative Matrix Factorization with Sparseness Constraints, J. Machine Learning Research, vol. 5, pp. 1457-1469, 2004.

CESAME Nonnegative Matrix Factorization 43

slide-94
SLIDE 94

NMF and NMU for Document Classification

Anomaly   DNAA Event Type                  NMF      NMU
1         Airworthiness Issue              .8621    .9655
2         Noncompliance (policy/proc.)     .3971    .5502
5         Incursion (collision hazard)     .6173    .7037
6         Departure Problem                .5566    .4615
7         Altitude Deviation               .5600    .4000
8         Course Deviation                 .3580    .7531
10        Uncommanded (loss of control)    .6071    .6071
12        Traffic Proximity Event          .5650    .5750
13        Weather Issue                    .6964    .7321
14        Airspace Deviation               .7778    .4815
18        Aircraft Damage/Encounter        .4286    .6249
19        Aircraft Malfunction Event       .5556    .3086
21        Illness/Injury Event             .8571    .8750
22        Security Concern/Threat          .2759    .3103

Table: ROC areas versus DNAA event types for selected anomalies of Aviation Safety Reporting System (ASRS) documents obtained from the Distributed National ASAP Archive (DNAA).

[BGG09] with M. W. Berry, Document Classification Using Nonnegative Matrix Factorization and Underapproximation, IEEE International Symposium on Circuits and Systems, 2009.

CESAME Nonnegative Matrix Factorization 44

slide-95
SLIDE 95

Conclusion

  • 1. Nonnegative Factorization

◮ Rank-one updates are efficient for solving NMF ◮ The rank-one subproblems are NP-hard ◮ Nevertheless, NF algorithms can be used to find large bicliques

  • 2. Nonnegative Matrix Underapproximation

◮ Additional underapproximation constraints ◮ Subproblems NP-hard as well ◮ Algorithm based on Lagrangian relaxation ◮ NMU is able to find sparse approximations

  • 3. Further work

◮ Accelerating NMF algorithms with a multilevel approach ◮ Design an exact algorithm for NMF ◮ Consider other cost functions, e.g. the l1 norm: ||M − UV||1.

CESAME Nonnegative Matrix Factorization 45

slide-98
SLIDE 98

Thank you for your attention!

CESAME Nonnegative Matrix Factorization 46