SLIDE 1

Dictionary learning in geoscience

Michael Bianco

UCSD Noise Lab, Scripps Institution of Oceanography noiselab.ucsd.edu

5/9/18

SLIDE 2

Dictionary learning

  • Means of estimating sparse causes for given classes of signals, e.g. natural images, audio
  • Originated in neuroscience to estimate the structure of V1 visual cortex cells from natural images
  • Useful for regularizing the general image denoising inverse problem, but with only recent applications in the geosciences:
  • Seismic survey image denoising
  • Dictionary learning of ocean sound speed profiles (SSPs)

Olshausen 2009; Beckouche 2014

[Figure: learned SSP dictionary atoms, amplitude vs. depth (m); Bianco and Gerstoft 2017]

SLIDE 3


Background: sparse modeling of arbitrary signal y

  • Measurement vector y is expressed as a sparse linear combination of columns, or "atoms", from dictionary D: $y \approx Dx$
  • y could be (for example) segments of speech or vectorized 2D image patches
  • Dictionary atoms represent elemental patterns that generate y, e.g. wavelets, or atoms learned from the data using dictionary learning
  • x is estimated using a sparsity-inducing constraint, for example $\ell_0$-"norm" regularization: $\hat{x} = \arg\min_x \|y - Dx\|_2^2 \;\text{s.t.}\; \|x\|_0 \le T$
  • The $\ell_0$ "norm" $\|x\|_0$ "counts" the number of non-zero coefficients (a minimal coding sketch follows below)
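As a concrete illustration, a minimal sparse-coding sketch (assuming Python with NumPy and scikit-learn; the random Gaussian dictionary here is purely illustrative, not learned):

```python
# Sketch: recover sparse coefficients x for a signal y = Dx via
# orthogonal matching pursuit (OMP). Sizes are illustrative.
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(0)
n, K, T = 64, 128, 5                       # signal length, # atoms, sparsity

D = rng.standard_normal((n, K))
D /= np.linalg.norm(D, axis=0)             # unit-norm atoms, by convention

x_true = np.zeros(K)
x_true[rng.choice(K, T, replace=False)] = rng.standard_normal(T)
y = D @ x_true                             # y: T-sparse combination of atoms

omp = OrthogonalMatchingPursuit(n_nonzero_coefs=T, fit_intercept=False)
omp.fit(D, y)
print("true support:     ", np.sort(np.flatnonzero(x_true)))
print("recovered support:", np.sort(np.flatnonzero(omp.coef_)))
```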
SLIDE 5

Background: sparsity and dictionary learning

[Figure: "natural" images with patches shown in magenta; learn dictionary D describing them]

  • Dictionary learning obtains "optimal" sparse modeling dictionaries directly from data
  • Dictionary learning was developed in neuroscience (a.k.a. sparse coding) to help understand mammalian visual cortex structure
  • Assumes (1) redundancy in the data: image patches are repetitions of a smaller set of elemental shapes; and (2) sparsity: each patch is represented with few atoms from the dictionary

Olshausen 2009

Each patch is a signal; the set of all patches forms the training data. Sparse model: each patch is composed of a few atoms from D (a patch-extraction sketch follows below).
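For concreteness, a minimal sketch of building the training set of vectorized patches (assuming Python with scikit-learn; the random stand-in image and patch size are illustrative):

```python
# Sketch: turn an image into a matrix Y whose columns are vectorized
# 10x10 patches, i.e. the training signals for dictionary learning.
import numpy as np
from sklearn.feature_extraction.image import extract_patches_2d

image = np.random.default_rng(2).random((512, 512))   # stand-in "natural" image
patches = extract_patches_2d(image, (10, 10), max_patches=5000, random_state=0)
Y = patches.reshape(len(patches), -1).T               # (100, 5000): columns are patches
print(Y.shape)
```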

SLIDE 7

Olshausen and Field 1997: image model with sparse prior

Assume each image patch is described by the linear system $I = \Phi a + \nu$ (bases $\Phi$, coefficients $a$, noise $\nu$). Goal: estimate the bases from observations.

[Figure: sample natural image patches]

The probability of an image patch arising from bases $\Phi$ follows from the posterior $p(a\,|\,I,\Phi) \propto p(I\,|\,a,\Phi)\, p(a)$, with a Gaussian likelihood and an independent, sparse prior on the coefficients.

SLIDE 8

Olshausen and Field 1997: sparse prior induces sparse coefficients

Sparsity-inducing prior: the "Cauchy distribution", with penalty $S(a) = \log(1 + (a/\sigma)^2)$ (the negative log of the Cauchy density). The derivative of the prior induces sparsity in the solution, as we'll see…

SLIDE 9

Olshausen and Field 1997: derivation of the error function

Learn basis functions by minimizing the Kullback-Leibler (KL) divergence between the distribution of true images $p^*(I)$ and the distribution of images reproduced by the model $p(I\,|\,\Phi)$. Since $p^*(I)$ is fixed, the KL divergence is minimized by maximizing the log-likelihood (or minimizing the negative log-likelihood) of image patches generated from the model.

Given: $D_{KL} = \int p^*(I) \log \frac{p^*(I)}{p(I\,|\,\Phi)}\, dI$

Obtain: the error function $E = -\langle \log p(I\,|\,\Phi) \rangle$, the average negative log-likelihood under $p^*(I)$.


SLIDE 12

Olshausen and Field 1997: gradients for network model

Rewriting the error function, take derivatives to find the gradient. The coefficients $a$ are updated with the network dynamics (inner loop); the bases $\Phi$ are updated with gradient descent (outer loop).

"Hebbian" update: $\Delta\Phi \propto (I - \Phi a)\, a^T$ (a compact sketch of the two loops follows below)

SLIDE 13

From Olshausen ’97 method, obtain dictionary atoms that resemble cells from mammalian visual cortex

[Figure: natural image patches (left) and learned dictionary elements (right)]

SLIDE 15

Nice to have atoms like cells, but what else is dictionary learning useful for?

Image restoration tasks: denoising and inpainting (a.k.a. matrix completion)

Elad 2006; Mairal 2009

SLIDE 16

Olshausen and Field 1997: gradients for network model

The model can be rephrased with a Laplacian prior. The coefficients are calculated using gradient descent, then the dictionary is updated from the reconstruction error.

This idea of iterative refinement is familiar: solve for the coefficients, then update the basis functions.

SLIDE 17

Vector Quantization and K-means

Vector quantization (VQ): a means of compressing a set of data observations by assigning each to its nearest codebook entry (2D example). K-means: finds the optimal codebook for VQ (a toy sketch follows below).
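A toy sketch (assuming Python with scikit-learn; data and codebook size are illustrative):

```python
# Sketch: vector quantization with a K-means codebook on 2D toy data.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
Y = rng.standard_normal((500, 2))        # 500 two-dimensional observations

km = KMeans(n_clusters=8, n_init=10, random_state=1).fit(Y)
codebook = km.cluster_centers_           # the VQ codebook (8 codewords)

labels = km.predict(Y)                   # nearest-codeword assignment
Y_vq = codebook[labels]                  # each point replaced by its codeword
print("mean quantization error:", np.mean(np.linalg.norm(Y - Y_vq, axis=1)))
```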

SLIDE 18

Relationship to sparse coding

  • Operators: the sparse processor generalizes VQ; coding with a single atom and a scalar gain is gain-shape VQ
  • Objective: dictionary learning generalizes K-means (and gain-shape VQ) to the sparse model

SLIDE 19

Background: a basic dictionary learning framework

[Figure: image with patches shown in magenta]

Given a set of patches, learn a dictionary D describing them.

Dictionary learning objective (standard $\ell_0$-constrained form): $\min_{D,X} \|Y - DX\|_F^2 \;\text{s.t.}\; \|x_i\|_0 \le T \;\forall i$

The objective is solved as a simple alternating optimization problem:
  • 1. Solve for the sparse coefficients using a sparse solver
  • 2. Solve for the dictionary D using the sparse coefficients from step (1)
… repeat until convergence

SLIDE 20

MOD algorithm: extending K-means to the dictionary learning problem

Method of Optimal Directions (MOD) [Engan 2000]:
  • 1. COEFFICIENTS: Solve for coefficients X = [x_1 … x_i] for fixed Q using orthogonal matching pursuit (OMP)
  • 2. DICTIONARY UPDATE: Solve for dictionary Q = [q_1 … q_i] by least squares, $\hat{Q} = Y X^T (X X^T)^{-1}$, normalizing dictionary entries to unit norm
… repeat until convergence

Simple and flexible, but with a few drawbacks (a minimal sketch follows below):
  • computationally expensive to invert the coefficient matrix product $X X^T$
  • since the coefficients in X are kept fixed during the dictionary update, convergence is slow
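A minimal MOD sketch (assumptions: Python/NumPy with scikit-learn's OMP for the coefficient step; Y holds training signals in columns; a pseudo-inverse guards against a singular $X X^T$):

```python
# Sketch of MOD: alternate OMP sparse coding with the closed-form
# least-squares dictionary update Q = Y X^T (X X^T)^(-1).
import numpy as np
from sklearn.decomposition import SparseCoder

def mod(Y, K, T, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    Q = rng.standard_normal((Y.shape[0], K))
    Q /= np.linalg.norm(Q, axis=0)
    for _ in range(n_iter):
        # (1) COEFFICIENTS: sparse-code every column of Y with OMP
        coder = SparseCoder(dictionary=Q.T, transform_algorithm="omp",
                            transform_n_nonzero_coefs=T)
        X = coder.transform(Y.T).T                  # (K, M) coefficient matrix
        # (2) DICTIONARY UPDATE: least squares, then unit-norm atoms
        Q = Y @ X.T @ np.linalg.pinv(X @ X.T)
        Q /= np.linalg.norm(Q, axis=0) + 1e-12
    return Q, X
```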

SLIDE 21
K-SVD algorithm

K-SVD [Aharon 2006]: learn the optimal dictionary for sparse representation of data.
  • 1. Solve for coefficients X = [x_1 … x_i] for fixed Q using OMP
  • 2. Solve for dictionary Q = [q_1 … q_K], updating both Q and X, one atom at a time, from the SVD of the representation error:

$\|Y - QX\|_F = \Big\| \Big( Y - \sum_{j \neq k} q_j x_j^T \Big) - q_k x_k^T \Big\|_F = \| E_k - q_k x_k^T \|_F$

Update $q_k$, $x_k$ by the rank-1 SVD approximation $E_k = U S V^T$: $q_k = U(:,1)$, $x_k^T = S(1,1)\, V(:,1)^T$
… repeat until convergence (an atom-update sketch follows below)
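A sketch of the K-SVD atom update alone (Python/NumPy; restricting the update to the signals that actually use atom k keeps the sparsity pattern fixed, as in Aharon 2006):

```python
# Sketch: K-SVD update of one atom q_k and its coefficient row x_k.
import numpy as np

def ksvd_atom_update(Y, Q, X, k):
    """Y: (n, M) signals; Q: (n, K) dictionary; X: (K, M) coefficients."""
    users = np.flatnonzero(X[k])              # signals whose code uses atom k
    if users.size == 0:
        return Q, X
    # Representation error without atom k's contribution, on those signals
    E_k = Y[:, users] - Q @ X[:, users] + np.outer(Q[:, k], X[k, users])
    U, S, Vt = np.linalg.svd(E_k, full_matrices=False)
    Q[:, k] = U[:, 0]                         # q_k = U(:,1)
    X[k, users] = S[0] * Vt[0]                # x_k^T = S(1,1) V(:,1)^T
    return Q, X
```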


SLIDE 23

Image restoration tasks. Denoising: the dictionary is learned from the noisy patches of the specific image being restored.

Solved using a block-coordinate descent algorithm (also two steps): (1) sparse-code the noisy patches over the current dictionary; (2) update the dictionary. The denoised image is formed by averaging the overlapping patch reconstructions (a sketch follows below).

Elad 2006
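A denoising sketch in this spirit (assumptions: Python with scikit-learn, whose mini-batch dictionary learner stands in for the K-SVD of Elad 2006; `noisy` is a 2D grayscale array):

```python
# Sketch: patch-based denoising. Learn a dictionary from the noisy
# patches, sparse-code them, and average the overlapping reconstructions.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.feature_extraction.image import (extract_patches_2d,
                                              reconstruct_from_patches_2d)

def denoise(noisy, patch=8, n_atoms=100, T=2):
    P = extract_patches_2d(noisy, (patch, patch)).reshape(-1, patch * patch)
    mean = P.mean(axis=1, keepdims=True)
    P = P - mean                                   # remove per-patch DC level
    dl = MiniBatchDictionaryLearning(n_components=n_atoms,
                                     transform_algorithm="omp",
                                     transform_n_nonzero_coefs=T)
    code = dl.fit(P).transform(P)                  # learn D, sparse-code patches
    P_hat = (code @ dl.components_ + mean).reshape(-1, patch, patch)
    return reconstruct_from_patches_2d(P_hat, noisy.shape)  # average overlaps
```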

SLIDE 24

Why not just use neural networks?

Burger 2012: a multi-layer perceptron (MLP) competes with state-of-the-art denoising algorithms, using 362 million training samples (~one month of GPU time)… at least in geoscience (seismics, ocean acoustics) we rarely have this much training data.

Wipf 2018

[Figure: adaptive image denoising-like vs. MLP-like approaches]

SLIDE 25

Why not just use neural networks? (cont’d)

SLIDE 26

Dictionary learning of ocean sound speed profiles (Bianco and Gerstoft 2017)

  • Acoustic observations from the ocean contain information about the ocean environment
  • The inversion of environment parameters is limited by physics and signal processing assumptions

[Figure: source (active or noise), hydrophones, sound speed profile c(z), and layers ⍴1, c1 and ⍴2, c2]

SLIDE 27

Sound speed profiles

  • Sound speed profiles (SSPs) in the ocean are often highly variable, with fine-scale fluctuations
  • Acoustic inversion of SSPs is ill-posed and traditionally regularized using EOFs (= PCA in this case)
  • Dictionaries obtained via unsupervised learning may provide a better representation of SSP dynamics

[Figure: HF-97 SSP data, sound speed (m/s) vs. depth (m) and UTC hour]

SLIDE 28

Dictionary learning of sound speed profiles

[Figure: HF-97 SSP variability, sound speed (m/s) vs. depth (m) and UTC hour; learned dictionary atoms, amplitude vs. depth (m)]

Each profile $c_m$ has the mean profile $\bar{c}$ removed before learning: $y_m = c_m - \bar{c}$

HF-97 Experiment:
  • 30 hours of SSP data
  • Used 1000 profiles for dictionary learning
  • K = 30 point SSPs (interpolated from 15 measurements)

Bianco and Gerstoft JASA 2017 (published). A learning sketch follows below.
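A sketch of this setup (assumptions: Python with scikit-learn standing in for the paper's K-SVD; the filename `hf97_ssp.npy` and array shape are placeholders):

```python
# Sketch: learn an overcomplete SSP dictionary from mean-removed profiles.
import numpy as np
from sklearn.decomposition import DictionaryLearning

ssp = np.load("hf97_ssp.npy")               # hypothetical (1000, 30) array
y = ssp - ssp.mean(axis=0)                  # y_m = c_m - c_bar
dl = DictionaryLearning(n_components=90,    # N = 90 atoms, as on a later slide
                        transform_algorithm="omp",
                        transform_n_nonzero_coefs=1)   # T = 1 atom per profile
codes = dl.fit_transform(y)                 # sparse coefficients (1000, 90)
atoms = dl.components_                      # learned dictionary entries (90, 30)
```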

SLIDE 29


Example: Denoising alphabet with K-SVD algorithm

[Figure panels: true alphabet; recovered alphabet (no noise, K-SVD); recovered alphabet (noise std = 0.5, K-SVD); recovered alphabet (no noise, PCA)]

SLIDE 30

SSP reconstruction error using Dictionary Learning

[Figure: mean error ME (m/s) vs. number of coefficients T, comparing EOF (OMP), EOF (LS), and LD (N = 90)]

  • One entry from the Learned Dictionary fits the SSP data better than 6 EOFs
  • Learned dictionary (LD) reconstruction error is less than 50% of the EOF error

Based on 1000 profiles from HF-97

LS: least squares; OMP: sparse processor. Mean error: $\mathrm{ME} = \frac{1}{KM} \|Y - \hat{Y}\|_1$

SLIDE 31

SSP reconstruction using Dictionary Learning

HF-97: one coefficient from the Learned Dictionary vs. one EOF coefficient

SLIDE 32

Learning a dictionary from the HF-97 SSP variation

Q is randomly initialized and converges within 15 iterations.

SLIDE 33

LD solution space much smaller than EOFs

[Figure: number of possible solutions S (up to $10^{25}$) vs. T, for $S_{\text{fixed}}$ and $S_{\text{comb}}$]

Inversion for SSP, assuming a potentially non-linear mapping:
  • EOF solution: T leading-order coefficients (fixed indices), $S_{\text{fixed}} = H^T$
  • LD solution: T non-zero coefficients (combinatorial indices), $S_{\text{comb}} = H^T \frac{N!}{T!\,(N-T)!}$

  • Since 6 EOFs or 1 LD entry is required, if the coefficients are discretized into H = 100 values, the number of possible solutions is

EOFs: $S_{\text{fixed}} = 100^6 = 10^{12}$ solutions; LD: $S_{\text{comb}} = 100 \times 90 \approx 10^4$ solutions

SLIDE 34

Dictionary learning in travel time tomography

Bianco and Gerstoft 2018

[Figure: Long Beach seismic array (5200 stations, ~14 million rays), footprint ~10 km x 7 km; 1 Hz Rayleigh wave phase speed map from LST]

  • The Earth contains both smooth and discontinuous variations in slowness (e.g. Moho, faults) at multiple spatial scales
  • Most existing travel time inversion methods are ad hoc: they regularize the inversion assuming exclusively smooth or exclusively discontinuous slownesses
  • We propose a locally-sparse 2D travel time tomography (LST) method with three main ingredients:
  • Sparsity constraint on slowness patches
  • Dictionary learning (unsupervised machine learning)
  • Damped least squares regularization on the overall slowness map
slide-35
SLIDE 35

Consider a simple travel time model

[Figure: 2D slowness map with pixels x1, x2, x3, x4; axes in range (length)]

Slowness is inverse wave speed, $s = 1/c$. For a slowness field, the travel time along ray $i$ is $t_i = \int_{\text{ray}\,i} s\, dl$. For straight rays we get the simple formulation $t = As$, where the "tomography matrix" A holds the length of each ray in each pixel (a construction sketch follows the list below).

  • Propose LST tomography ingredients:
  • Sparsity constraint on slowness patches
  • Dictionary learning (unsupervised machine learning)
  • Damped least squares regularization on overall slowness map
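A minimal sketch of assembling A (Python/NumPy; per-pixel ray lengths are estimated by crude subsampling, purely for illustration; production codes trace exact cell crossings):

```python
# Sketch: build the straight-ray tomography matrix A, where A[i, n]
# approximates the length of ray i inside pixel n; then t = A @ s.
import numpy as np

def tomography_matrix(src, rec, W1, W2, n_sub=200):
    """src, rec: (M, 2) ray endpoints in pixel units; A: (M, W1*W2)."""
    A = np.zeros((len(src), W1 * W2))
    for i in range(len(src)):
        pts = src[i] + np.linspace(0, 1, n_sub)[:, None] * (rec[i] - src[i])
        seg = np.linalg.norm(rec[i] - src[i]) / n_sub    # length per sample
        ix = np.clip(pts.astype(int), [0, 0], [W1 - 1, W2 - 1])
        np.add.at(A[i], ix[:, 0] * W2 + ix[:, 1], seg)   # accumulate lengths
    return A   # travel times: t = A @ s, with s the slowness per pixel
```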
SLIDE 36

Proposed locally-sparse tomography (LST) basics

The LST approach's three ingredients are classified as local and global models:
1. Sparsity constraint on slowness patches ("local" model)
2. Dictionary learning, unsupervised machine learning ("local" model)
3. Damped least squares regularization on the overall slowness map ("global" model)

"Local" model: models small-scale features as patches. "Global" model: models larger-scale features with damped least squares.

[Figure: synthetic "checkerboard" slowness example]


SLIDE 38

Local model: slowness patches related to dictionary entries

[Figure: dictionaries learned from 10x10 pixel patches of a slowness map and of a natural image; Olshausen 2009]

SLIDE 39

LST slowness image and sampling

Slowness map and sampling:
  • Discrete slowness map of N = W1 x W2 pixels, with overlapping pixel "patches"
  • M straight-ray paths (x's are stations)

[Figure: "local" and "global" models; straight-ray tomography matrix; slowness dictionary]

SLIDE 40

Formulation of LST and algorithm

Bayesian MAP objective, solved via block-coordinate descent:
  • Global model: the global slowness is solved by damped least squares (a sketch of this step follows below)
  • Local model: sparse coding and dictionary learning, decoupled from the MAP objective
  • The sparse slowness at each pixel n is then solved from the local patch model

Dictionary learning by the iterative thresholding and signed K-means (ITKM) algorithm, Schnass 2015

SLIDE 41

Synthetic slownesses and dictionaries

[Figure: checkerboard and "fault" slowness profiles; ray sampling (64 stations, 2016 rays); ray density; prescribed and learned dictionaries]

SLIDE 42

LST vs. conventional method: synthetic inversions without noise

Compared with a conventional tomography method (Rodgers 2000).

[Figure: true and estimated slowness (s/km) vs. range (km) for the checkerboard and fault profiles; slowness RMSE (s/km) written on the 2D estimates]

Each example took ~5 min on a MacBook Pro.

SLIDE 43

LST vs. conventional method: synthetic inversions with travel time noise (checkerboard)

Compared with a conventional tomography method (Rodgers 2000).

  • Slowness RMSE (s/km) written on the 2D estimates
  • Noise is Gaussian with standard deviation equal to 2% of the mean travel time
SLIDE 44

LST vs. conventional method: synthetic inversions with travel time noise (fault profile)

Compared with a conventional tomography method (Rodgers 2000).

  • Slowness RMSE (s/km) written on the 2D estimates
  • Noise is Gaussian with standard deviation equal to 2% of the mean travel time
SLIDE 45

Imaging Long Beach, CA using LST: Big Data task

[Figure: Long Beach array footprint, ~10 km x 7 km]

  • In March 2011, 5200 seismic stations were deployed in Long Beach, California over a 70 km² area
  • Ambient seismic noise cross-correlations were obtained for all unique virtual source-receiver pairs (~14 million ray paths) using 3 weeks of data
  • We consider only the 1 Hz vertical component data, corresponding to Rayleigh surface waves (from Lin et al. 2013)
  • After quality control, ~8 million ray paths remained
slide-46
SLIDE 46

1Hz Rayleigh wave phase speed from LST

[Figure: high-resolution LST phase speed map from 8 million cross-correlations; Long Beach array footprint ~10 km x 7 km]

  • For LST we generate a 300x200 pixel slowness map with 8 million rays (tomography matrix A has dimensions M = 8 million, N = 60,000)
  • 10 iterations, using ~2 CPU-hours
  • Since we are not imposing global correlations on pixels, we can treat A as a sparse matrix and get a fast inversion for the global model (which is the bottleneck)
  • The Newport-Inglewood fault network is shown as a black line
SLIDE 47

LST comparison with eikonal tomography (Lin et al. 2013)

[Figure: eikonal tomography vs. LST phase speed maps]

  • We observe the same general trends between eikonal tomography and LST
  • LST gives improved contrast along fault lines, for example near Signal Hill
  • The LST results are preliminary and can likely be improved with more careful preprocessing (future work)