

SLIDE 1

Sparse Separable Nonnegative Matrix Factorization

Extending Separable NMF with ℓ0 sparsity constraints

Nicolas Nadisic, Arnaud Vandaele, Jeremy Cohen, Nicolas Gillis

9 October 2020 — GdR MIA Thematic Day

Université de Mons, Belgium

SLIDE 2

Nonnegative Matrix Factorization

Given a data matrix M ∈ R^{m×n}_+ and a rank r ≪ min(m, n), find W ∈ R^{m×r}_+ and H ∈ R^{r×n}_+ such that M ≈ WH. In optimization terms, standard NMF is equivalent to:

min_{W ≥ 0, H ≥ 0} ‖M − WH‖²_F
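This objective is typically attacked with alternating updates. A minimal sketch of the classic multiplicative-update heuristic (a standard baseline, not one of the algorithms presented in this talk; `nmf_mu` and all parameter values are illustrative):

```python
import numpy as np

def nmf_mu(M, r, n_iter=1000, seed=0, eps=1e-9):
    """Multiplicative updates for min_{W,H >= 0} ||M - WH||_F^2."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    W = rng.random((m, r))
    H = rng.random((r, n))
    for _ in range(n_iter):
        # Each update keeps the factors nonnegative and does not
        # increase the Frobenius error.
        H *= (W.T @ M) / (W.T @ W @ H + eps)
        W *= (M @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Exactly rank-3 nonnegative data is recovered almost perfectly.
rng = np.random.default_rng(1)
M = rng.random((20, 3)) @ rng.random((3, 30))
W, H = nmf_mu(M, r=3)
rel_err = np.linalg.norm(M - W @ H) / np.linalg.norm(M)
```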

SLIDE 3

Nonnegative Matrix Factorization

Why nonnegativity?

  • More interpretable factors (part-based representation)
  • Naturally favors sparsity
  • Makes sense in many applications (image processing, hyperspectral unmixing, text mining, ...)


SLIDE 4

NMF Geometry (M ≈ WH)

[Figure: data points M(:, j)]


SLIDE 5

NMF Geometry (M ≈ WH)

[Figure: data points M(:, j) and vertices W(:, p)]


SLIDE 6

Application – hyperspectral unmixing

  • M(:, j): spectral signature of the j-th pixel
  • W(:, p): spectral signature of the p-th material
  • H(p, j): abundance of the p-th material in the j-th pixel

Images from Bioucas Dias and Nicolas Gillis.


SLIDE 7

Application – hyperspectral unmixing

[Figure: pixels M(:, j) decomposed into materials W(:, p): Grass, Rooftop, Trees]


SLIDE 8

Starting point 1/2 – Separable NMF

  • NMF is NP-hard [Vavasis, 2010].
  • Under the separability assumption, it is solvable in polynomial time [Arora et al., 2012].


SLIDE 9

Starting point 1/2 – Separable NMF

Separability:

  • The vertices are selected among the data points
  • In hyperspectral unmixing, this is equivalent to the pure-pixel assumption

Standard NMF model: M = WH
Separable NMF: M = M(:, J)H
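The separable model M = M(:, J)H can be made concrete by generating data that satisfies it by construction (a toy construction, not from the slides; all sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
m, r, n = 4, 3, 10

# Vertices, placed as the first r data points (so J = {0, ..., r-1}).
W = rng.random((m, r))
# Every remaining point is a nonnegative combination of the vertices.
H_rest = rng.random((r, n - r))
M = np.hstack([W, W @ H_rest])

# Separability holds by construction: M = M(:, J) H with J = [0, 1, 2].
J = list(range(r))
assert np.allclose(M[:, J], W)
```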


SLIDE 10

Separable NMF – Geometry

[Figure: data points M(:, j), selected vertices W(:, j), and the unit simplex]


SLIDE 11

Algorithm for Separable NMF – SNPA

SNPA = Successive Nonnegative Projection Algorithm [Gillis, 2014]

  • Start with an empty W and residual R = M
  • Alternate between:
    • greedy selection of one column of R to be added to W
    • projection of R onto the convex hull of the origin and the columns of W
  • Stop when the reconstruction error is 0 (or < ε)
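The loop above can be sketched as follows. This is a simplification, not the real algorithm of [Gillis, 2014]: the projection uses plain NNLS per column (omitting SNPA's simplex constraint sum(h) ≤ 1), and the greedy step simply takes the column with the largest residual norm.

```python
import numpy as np
from scipy.optimize import nnls

def snpa_sketch(M, r):
    """Greedy SNPA-style column selection (simplified)."""
    n = M.shape[1]
    J = []          # indices of selected columns (candidate vertices)
    R = M.copy()    # residual
    for _ in range(r):
        # Selection: column of R with the largest norm.
        j = int(np.argmax(np.linalg.norm(R, axis=0)))
        J.append(j)
        W = M[:, J]
        # Projection: re-express every data point on the selected columns.
        H = np.column_stack([nnls(W, M[:, i])[0] for i in range(n)])
        R = M - W @ H
    return J

# On separable data, the pure columns are recovered.
W_true = np.eye(3)
mix = np.array([[0.5, 0.2], [0.3, 0.3], [0.2, 0.5]])  # convex combinations
M = np.hstack([W_true, W_true @ mix])
J = snpa_sketch(M, 3)
```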


SLIDE 12

SNPA

[Figure: successive SNPA selection and projection steps on the example dataset]



SLIDE 17

Limitations of Separable NMF

What if one column of W is a combination of other columns of W? → It is an interior vertex. SNPA cannot identify it, because it belongs to the convex hull of the other vertices.


SLIDE 18

Limitations of Separable NMF

[Figure: data points M(:, j), exterior vertices, one interior vertex, and the unit simplex]


SLIDE 19

Limitations of Separable NMF

SNPA is unable to handle this case: the interior vertex is not identifiable. However, if the columns of H are sparse (a data point is a combination of only k < r vertices), this interior vertex may be identifiable.


SLIDE 20

Starting point 2/2 — k-Sparse NMF

M ≈ WH s.t. H is column-wise k-sparse (for all i, ‖H(:, i)‖₀ ≤ k)

  • Motivation → better interpretability
  • Motivation → improve results using prior sparsity knowledge
  • Ex: a pixel expressed as a combination of at most k materials
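The column-wise constraint is easy to state in code; a tiny helper (our own, purely illustrative):

```python
import numpy as np

def is_columnwise_ksparse(H, k):
    """True iff every column of H has at most k nonzero entries,
    i.e. ||H(:, i)||_0 <= k for all i."""
    return bool(np.all(np.count_nonzero(H, axis=0) <= k))
```

For example, a matrix whose columns have 2 and 1 nonzeros is column-wise 2-sparse but not 1-sparse.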

[Figure: block decomposition M ≈ WH with a column-wise k-sparse H]


SLIDE 21

k-Sparse NMF – Geometry

[Figure: geometry of k-sparse NMF on the example dataset]


SLIDE 22

k-Sparse NMF

k-Sparse NMF is combinatorial, with (r choose k) possible combinations per column of H.

Previous work: a branch-and-bound algorithm for Exact k-Sparse NNLS [Nadisic et al., 2020].

[Figure: branch-and-bound search tree for a 5-dimensional x with k = 2; the root X = [x1 x2 x3 x4 x5] is unconstrained (k′ ≤ 5), each child fixes one more entry of X to zero, and a branch stops once k′ ≤ k]
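The combinatorics can be made explicit with a naive baseline that enumerates all (r choose k) supports and solves a restricted NNLS on each; this is the search space that the branch-and-bound solver of [Nadisic et al., 2020] prunes rather than exploring exhaustively (our own illustrative sketch, not the paper's code):

```python
import itertools
import numpy as np
from scipy.optimize import nnls

def ksparse_nnls_bruteforce(W, x, k):
    """Exact solution of min_{h >= 0, ||h||_0 <= k} ||x - W h||_2
    by trying every support of size k; only viable for small r."""
    r = W.shape[1]
    best_h, best_err = np.zeros(r), np.linalg.norm(x)
    for S in itertools.combinations(range(r), k):
        hS, err = nnls(W[:, list(S)], x)   # NNLS restricted to support S
        if err < best_err:
            best_err = err
            best_h = np.zeros(r)
            best_h[list(S)] = hS
    return best_h, best_err
```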


SLIDE 23

Sparse Separable NMF

Standard NMF: M = WH
Separable NMF: M = M(:, J)H
SSNMF: M = M(:, J)H s.t. for all i, ‖H(:, i)‖₀ ≤ k


SLIDE 24

Our approach for SSNMF

Replace the projection step of SNPA: instead of projecting onto the convex hull, project onto the k-sparse hull, computed with our BnB solver ⇒ kSSNPA.

kSSNPA:

  • Identifies all interior vertices (non-selected points are never vertices)
  • May also identify wrong vertices (explanation to come!)

⇒ kSSNPA can be seen as a screening technique to reduce the number of points to check.


SLIDE 25

Our approach for SSNMF

In a nutshell, 3 steps:

  1. Identify exterior vertices with SNPA
  2. Identify candidate interior vertices with kSSNPA
  3. Discard bad candidates, i.e. those that are k-sparse combinations of other selected points (they cannot be vertices)

Our algorithm: BRASSENS (Relies on Assumptions of Sparsity and Separability for Elegant NMF Solving).
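Step 3 can be sketched as follows (our own illustrative code, not the actual BRASSENS implementation; the real version reuses the BnB solver rather than enumerating supports):

```python
import itertools
import numpy as np
from scipy.optimize import nnls

def filter_candidates(P, k, tol=1e-8):
    """Keep only the columns of P that are NOT k-sparse nonnegative
    combinations of the other columns (such points cannot be vertices)."""
    n = P.shape[1]
    kept = []
    for j in range(n):
        others = P[:, [i for i in range(n) if i != j]]
        x = P[:, j]
        best = np.linalg.norm(x)
        for S in itertools.combinations(range(others.shape[1]),
                                        min(k, others.shape[1])):
            _, err = nnls(others[:, list(S)], x)
            best = min(best, err)
        if best > tol:          # not reproducible from the others: keep it
            kept.append(j)
    return kept

# Three vertices plus one 2-sparse mixture: the mixture is discarded.
P = np.array([[1.0, 0.0, 0.0, 0.5],
              [0.0, 1.0, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0]])
kept = filter_candidates(P, k=2)
```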


SLIDE 26

BRASSENS with sparsity k = 2

[Figure: successive BRASSENS steps on the example dataset with k = 2]



SLIDE 33

Complexity

  • As opposed to Separable NMF, SSNMF is NP-hard (Arnaud proved it, see the paper)
  • Hardness comes from the k-sparse projection
  • Not too bad when r is small, with our BnB solver


SLIDE 34

Correctness

Assumption 1: No column of W is a nonnegative linear combination of k other columns of W.
⇒ necessary condition for recovery by BRASSENS

Assumption 2: No column of W is a nonnegative linear combination of k other columns of M.
⇒ sufficient condition for recovery by BRASSENS

If data points are k-sparse and generated at random, Assumption 2 holds with probability one.


SLIDE 35

Related work

Only one similar work: [Sun and Xin, 2011]

  • Handles only one interior vertex
  • Non-optimal, brute-force-like method


SLIDE 36

Experiments

  • Experiments on synthetic datasets with interior vertices
  • Experiment on underdetermined multispectral unmixing (Urban image, 309 × 309 pixels, limited to m = 3 spectral bands, and we search for r = 5 materials)
  • No other algorithm can tackle SSNMF, so comparisons are limited


SLIDE 37

XP Synthetic: 3 exterior and 2 interior vertices, n grows

[Figure: number of candidate interior vertices and run time (in seconds) as the number of data points n grows]


SLIDE 38

XP Synthetic 2: dimensions grow

m   n   r   k   Number of candidates   Run time (s)
3   25  5   2   5.5                    0.26
4   30  6   3   8.5                    3.30
5   35  7   4   9.5                    38.71
6   40  8   5   13                     395.88

Conclusion from experiments:

  • kSSNPA is efficient at selecting only a few candidates
  • Still, BRASSENS does not scale well :(


SLIDE 39

XP on 3-bands Urban dataset with r = 5

SNPA: Grass+Trees+Rooftops, Rooftops 1, Dirt+Road+Rooftops, Dirt+Grass, Rooftops 1+Dirt+Road

BRASSENS (finds 1 interior point): Grass+Trees, Rooftops 1, Road, Rooftops+Road, Dirt+Grass


SLIDE 40

Future work

  • Theoretical analysis of robustness to noise
  • New real-life applications


SLIDE 41

Take-home messages

Sparse Separable NMF:

  • Combines the constraints of separability and k-sparsity
  • A new way to regularize NMF
  • Can handle some cases that Separable NMF cannot:
    • the underdetermined case
    • interior vertices
  • Is NP-hard (unlike Separable NMF), but actually “not so hard” for small r
  • Is provably solved by our approach
  • Does not scale well


SLIDE 42

References i

Arora, S., Ge, R., Kannan, R., and Moitra, A. (2012). Computing a nonnegative matrix factorization – provably. In STOC ’12.

Gillis, N. (2014). Successive Nonnegative Projection Algorithm for Robust Nonnegative Blind Source Separation. SIAM Journal on Imaging Sciences, 7(2):1420–1450.

Nadisic, N., Vandaele, A., Gillis, N., and Cohen, J. E. (2020). Exact Sparse Nonnegative Least Squares. In ICASSP 2020, pages 5395–5399.


SLIDE 43

References ii

Sun, Y. and Xin, J. (2011). Underdetermined Sparse Blind Source Separation of Nonnegative and Partially Overlapped Data. SIAM Journal on Scientific Computing, 33(4):2063–2094.

Vavasis, S. A. (2010). On the Complexity of Nonnegative Matrix Factorization. SIAM Journal on Optimization.


SLIDE 44

Contact: nicolas.nadisic@umons.ac.be Code and exp.: https://gitlab.com/nnadisic/ssnmf Slides and paper: http://nicolasnadisic.xyz