Parallel Nonnegative Matrix Factorization Algorithms for - PowerPoint PPT Presentation

Parallel Nonnegative Matrix Factorization Algorithms for Hyperspectral Images A Masters Thesis Lukasz Grzegorz Maciak Montclair State University Department of Computer Science 1

Objectives 1. Background 2. Investigate Novel Feature Extraction methods (NMF) 3. Design, implement and test parallel NMF algorithms 4. Initiate development of Java based hyperspectral imaging toolkit 2

What is Remote Sensing? The study of Earth’s surface and atmospheric features from a distance. • direct observation • reconnaissance • aerial and/or satellite photography/video • Applications in: • agriculture and forestry • meteorology • military surveilance • 3

Hyperspectral Images Visible Light: 0 . 4 µm to 0 . 7 µm Hyperspectral Lens: 0 . 4 µm and 2 . 4 µm 4

Hyperspectral Images • spectra • mixed pixels • 5

Feature Extraction Reduce dimensionality of the data sample without data loss Source Separation Spectral Unmixing 6

Linear Mixing Model Each pixel p is composed of d endmembers Each endmember has a natural intensity e Each endbember contributes fractional amount a to the total pixel intensity   a 1   a 2   � � p = × e 1 e 2 ... e d   (1)   ...     a d 7

Linear Mixing Model m � X = a i s i + w = Sa + w (2) i =1 m � where a i ≥ 0 , i = 1 , ..., m and a i = 1 (3) i =1 8

Unsupervised Feature Extraction No training data available Extraction must be performed algorithmically Nonnegative Matrix Factorization 9

Nonnegative Matrix Factorization Relatively Fast Straightforward Algorithm Good Source Separation 11

The NMF Problem Given a nonnegative matrices Y ∈ R mxn , W ∈ R mxk , H ∈ R kxn and a positive integer k ≤ min { m, n } find W and H which minimize the function: f ( W, H ) := 1 2 � Y − WH � 2 (4) F 12

Applying NMF to Hyperspectral Data Unrolling the hyperspectral image cube into an array of bandvectors 13

How does NMF work? Start with random W and H and repeatedly update them using following rules: ( Y H T ) W = W (5) ( WHH T ) + ǫ ( W T Y ) H = H (6) ( W t HWH ) + ǫ where ǫ is a very small positive quantity ( ǫ ≤ 10 − 9 ) 14

NMF Algorithm 1. Given Y ∈ R mxn ≥ 0 , k > 0 k ≪ min ( m, n ) and (7) randomly initialize matrices W ∈ R mxk and H ∈ R kxn with nonnegative values 2. Scale the columns of W to sum up to one 3. Create temporary variables ¯ W and ¯ H . Sent their contents to be equal to H and W respectively. 4. Repeatedly apply the following steps until convergence criteria are met: (a) Update ¯ W and ¯ H by using: ( W T Y ) cj ¯ (1 ≤ c ≤ k ) (1 ≤ j ≤ n ) H ij ⇐ H cj (8) ( W T W H ) cj + ǫ ( Y H T ) ic ¯ 1 ≤ i ≤ m ) (1 ≤ c ≤ k ) W ic ⇐ W ic (9) ( W HH T ) ic + ǫ (b) Set W = ¯ W and H = ¯ H (c) Scale the columns of W to sum up to one 15

Projected Gradient NMF (PG-NMF) Alternatively fix one matrix and update another. or find W k +1 such that f ( W k +1 , H k ) ≤ f ( W k , H k ) and find H k +1 such that f ( W k +1 , H k +1 ) ≤ f ( W k +1 , H k ) 16

PG-NMF Update H and W using following rules: H k +1 = H k − αW T k ( W k H k − Y ) (10) k − Y T ) W k +1 = W T k − αH k +1 ( H T k +1 W T (11) 17

PG-NMF: Finding α Substitute H or W for X as necessary: f ( X k +1 ) − f ( X k ) ≤ α �∇ f ( X k ) T ( X k +1 − X k ) � (12) where ∇ f ( H ) = W T ( WH − Y ) and ∇ f ( W ) = H ( H T W T − Y T ) 18

PG-NMF: α Updates if α k satisfies 12 then repeatedly increase α α k ← α k /β as long as it still satisfies 12. else repeatedly decrease α α k ← α k ∗ β until it satisfies 12. in our case we used β = 0 . 1 19

PPG-NMF: α Updates rewritten by Chih-Jen Lin k ( W k H k − Y ) , H k +1 − H k � + 1 (1 − σ ) � W T 2 � H k +1 − H k , ( W T k W k )( H k +1 − H k ) � ≤ 0 (13) k � + 1 k − Y T ) , W T (1 − σ ) � H k +1 ( H T k +1 W T k +1 − W T 2 � W k +1 − W k , ( H k +1 H T k +1 )( W T k +1 − W T k ) � ≤ 0 (14) where � , � denotes the sum of component wise products of two matrices 20

PG-NMF Algorithm 1. Given Y ∈ R mxn ≥ 0 , k > 0 , k ≪ min ( m, n ) , α = 1 , β = 0 . 1 , σ = 0 . 01 (15) randomly initialize matrices W ∈ R mxk and H ∈ R kxn with nonnegative values. Find Hk +1 using Equation 10 2. 3. Evaluate Equation 10 and: (a) if Equation 13 is satisfied then: if at the last iteration Equation 13 was not satisfied set Hk +1 ← ¯ Hk +1 and goto 4 i. else save the value of Hk +1 in a temporary buffer ¯ Hk +1 ii. iii. save the outcome of Equation 13 update α ← α/β iv. v. go back to 2. (b) if Equation 13 is not satisfied then: if at the last iteration Equation 13 was satisfied set Hk +1 ← ¯ Hk +1 and goto 4 i. else save the value of Hk +1 in a temporary buffer ¯ Hk +1 ii. iii. save the outcome of Equation 13 update α ← α ∗ β iv. v. go back to 2. Find Wk +1 using Equation 11 4. 5. Evaluate Equation 10 and perform steps analogous to 3a and 3b for W. Set H = Hk +1 and W = Wk +1 6. 7. Go back to 2 until convergence criteria are met. 21

Convergence Criteria 1. Change in f ( W, H ) 450 P-NMF 400 2. Desired Value of f ( W, H ) 350 300 Average Accuracy 3. Number of Iterations 250 200 4. Execution Time 150 100 50 0 100 200 300 400 500 600 700 Iterations 22

Need For Parallelization 1. Large Image Resolution 2. Many Bands 3. Large Image Size 4. Computational Complexity 24

Shared Memory: SMP SMP - Symmetric Multiprocessor 25

SMP Features 1. CPU Scheduling transparent to the user 2. Java Threads leverage SMP architecture 3. No communication overhead 4. Not very scalable 26

Distributed Memory 27

Distributed Memory Features 1. Communication between nodes is slow 2. Data must be distributed manually 3. Unlimited scalability 4. Require 3rd party libraries There also exist hybrid systems - clusters of SMP’s 28

Data Partitioning 1. Spectral Data Partitioning 2. Spatial Data Partitioning 3. Other Approaches 29

Speedup S p = T p (16) T s T s time of sequential execution T p time of parallel execution with p processors. An ideal speedup is linear: S p = p 30

Parallel NMF (P-NMF) using spectral data partitioning 31

P-NMF Parallel Execution 32

Data Distribution int start, end; NMFThread[] tmp = new NMFThread[number_of_threads]; for(int i=0; i<number_of_threads; i++) { start = i*(k/number_of_threads); end = (i+1)*(k/number_of_threads); if(tt == (number_of_threads -1)) end = k; tmp[tt] = new NMFThread(start, end, times); tmp[tt].start(); } 33

Parallel Projected Gradient NMF (PPG-NMF) Spatial and Spectral Partitioning Strategies NOT APPROPRIATE instead each CPU evaluates one α value 34

PPG-NMF Execution 35

PPG-NMF Distribution 1. If we do not have a previous sum (ie. this is the first run): Initialize a single thread with α = 1 and evaluate it 2. if we do have a previous sum S then: Let i be the number of threads such that i ∈ { 1 , 2 , ..., n } where n is the total number of threads; Let e be an integer such that e = 1 if S < = 0 or e = − 1 if S > 0 . Initialize all the threads with α = β ie 36

Experimental Data a) Hyperspectral Digital Imagery Collection Experiment (HYDICE) b) Photo taken using SOC 700 hyperspectral sensor 37

Experimental Data 1. HYDICE (a) 85x185 pixels and 40 bands 2. SOC 700 (a) 160x160 pixels and 120 (b) only 40 bands used 38

HYDICE Data Set 39

SOC 700 Data Set 40

Testing Platform 1. SunFire v880 2. 4 UltraSparc9 750MHz CPU’s 3. 8 GB of RAM 4. Solaris 8 5. Publicly Accessible Server 41

Test Procedure 5 P-NMF and 5 PPG-NMF tests per sample Each test: run 1-8 threads until converges Convergence Criteria: change in f ( W, H ) ≤ 0 . 001 All runs in a test initialized with same seed value 42

P-NMF Accuracy 450 30 P-NMF SOC 700 P-NMF 400 25 350 20 300 Average Accuracy Accuracy 250 15 200 10 150 5 100 50 0 0 100 200 300 400 500 600 700 0 500 1000 1500 2000 Iterations Iterations 43

P-NMF Execution Time 6e+07 HYDICE PPG-NMF SOC 700 P-NMF 2.8e+06 HYDICE P-NMF 5.5e+07 2.6e+06 Average Execution Time (Milliseconds) 5e+07 Avg. Execution Time (Milliseconds) 2.4e+06 2.2e+06 4.5e+07 2e+06 4e+07 1.8e+06 3.5e+07 1.6e+06 3e+07 1.4e+06 1.2e+06 2.5e+07 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 Number of Threads Number of Threads 44

P-NMF Average Time Per Iteration 8000 HYDICE P-NMF SOC 700 P-NMF 35000 7500 30000 7000 Average Time per Iteration (Milliseconds) Average Time per Iteration (Milliseconds) 25000 6500 6000 20000 5500 15000 5000 10000 4500 5000 4000 3500 0 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 Number of Threads Number of Threads 45

Parallel Nonnegative Matrix Factorization Algorithms for - PowerPoint PPT Presentation

Parallel Nonnegative Matrix Factorization Algorithms for Hyperspectral Images A Masters Thesis Lukasz Grzegorz Maciak Montclair State University Department of Computer Science 1 Objectives 1. Background 2. Investigate Novel Feature Extraction

Nonnegative matrix factorization and applications in audio signal processing C edric F

L101: Matrix Factorization In a nutshell Matrix factorization/completion you know? In NLP?

+ Design of Parallel Algorithms Parallel Dense Matrix Algorithms + Topic Overview n

Data Sciences CentraleSupelec Advance Machine Learning Course VI - Nonnegative matrix

Online-Updating Regularized Kernel Matrix Factorization Models for Large-Scale Recommender

Tensor Factorization via Matrix Factorization Volodymyr Kuleshov Arun Tejasvi Chaganty Percy

Automatic relevance determination in nonnegative matrix factorization with the -divergence

Automated Gene Classification using Nonnegative Matrix Factorization on Biomedical Literature

New variants of Nonnegative Matrix Factorization for sparsity improvement and maximum biclique

Nonnegative Matrix Factorization and Applications Christine De Mol (joint work with Michel

Adversarial Nonnegative Matrix Factorization Lei Luo, Yanfu Zhang, Heng Huang Electrical and

Some Recent Advances in Nonnegative Matrix Factorization and their Applications to Hyperspectral

Multi-View Clustering via Joint Nonnegative Matrix Factorization Jialu Liu 1 Chi Wang 1 Jing Gao 2

Neural Nonnegative Matrix Factorization for Hierarchical Multilayer Topic Modeling Jamie Haddock

Sparse Separable Nonnegative Matrix Factorization Extending Separable NMF with 0 sparsity

A Model For Mixed Linear-Tropical Matrix Factorization James Hook, Sanjar Karaev, Pauli Miettinen

Lecture 23 Reminders: Programming Project 5 (reassigned Homework Problem 5.48) due on

Exploring vulnerabilities in Android 6.0 fingerprint authentication Thom Does & Mike Maarse

IP Flow Information eXport IP Flow Information eXport (IPFIX) (IPFIX)

I Control Your Code Attack Vectors through the Eyes of Software-based Fault Isolation Mathias

Hacking Consumer Devices for Fun and Profit An Insider's View of the NSLU2-Linux Open-Source

308-435 Socket Programming Juan de Lara, Hans Vangheluwe Edited for Fall 2001 by Clark Verbrugge

Data visualization on Cell Broadband Engine VUT FEL v Praze 36SPA 21. kvtna 2007 Martin

Internet publishing recommended software, encoding HTML, publishing on webserver Petr Zmostn

Parallel Nonnegative Matrix Factorization Algorithms for - PowerPoint PPT Presentation

Parallel Nonnegative Matrix Factorization Algorithms for Hyperspectral Images A Masters Thesis Lukasz Grzegorz Maciak Montclair State University Department of Computer Science 1 Objectives 1. Background 2. Investigate Novel Feature Extraction

Nonnegative matrix factorization and applications in audio signal processing C edric F

L101: Matrix Factorization In a nutshell Matrix factorization/completion you know? In NLP?

+ Design of Parallel Algorithms Parallel Dense Matrix Algorithms + Topic Overview n

Data Sciences CentraleSupelec Advance Machine Learning Course VI - Nonnegative matrix

Online-Updating Regularized Kernel Matrix Factorization Models for Large-Scale Recommender

Tensor Factorization via Matrix Factorization Volodymyr Kuleshov Arun Tejasvi Chaganty Percy

Automatic relevance determination in nonnegative matrix factorization with the -divergence

Automated Gene Classification using Nonnegative Matrix Factorization on Biomedical Literature

New variants of Nonnegative Matrix Factorization for sparsity improvement and maximum biclique

Nonnegative Matrix Factorization and Applications Christine De Mol (joint work with Michel

Adversarial Nonnegative Matrix Factorization Lei Luo, Yanfu Zhang, Heng Huang Electrical and

Some Recent Advances in Nonnegative Matrix Factorization and their Applications to Hyperspectral

Multi-View Clustering via Joint Nonnegative Matrix Factorization Jialu Liu 1 Chi Wang 1 Jing Gao 2

Neural Nonnegative Matrix Factorization for Hierarchical Multilayer Topic Modeling Jamie Haddock

Sparse Separable Nonnegative Matrix Factorization Extending Separable NMF with 0 sparsity

A Model For Mixed Linear-Tropical Matrix Factorization James Hook, Sanjar Karaev, Pauli Miettinen

Lecture 23 Reminders: Programming Project 5 (reassigned Homework Problem 5.48) due on

Exploring vulnerabilities in Android 6.0 fingerprint authentication Thom Does &amp; Mike Maarse

IP Flow Information eXport IP Flow Information eXport (IPFIX) (IPFIX)

I Control Your Code Attack Vectors through the Eyes of Software-based Fault Isolation Mathias

Hacking Consumer Devices for Fun and Profit An Insider's View of the NSLU2-Linux Open-Source

308-435 Socket Programming Juan de Lara, Hans Vangheluwe Edited for Fall 2001 by Clark Verbrugge

Data visualization on Cell Broadband Engine VUT FEL v Praze 36SPA 21. kvtna 2007 Martin

Internet publishing recommended software, encoding HTML, publishing on webserver Petr Zmostn

Exploring vulnerabilities in Android 6.0 fingerprint authentication Thom Does & Mike Maarse