

SLIDE 1

11-755 Machine Learning for Signal Processing

Sparse Overcomplete, Shift- and Transform-Invariant Representations

Class 15. 14 Oct 2009

SLIDE 2

11-755 MLSP: Bhiksha Raj

Recap: Mixture-multinomial model

- The basic model: each frame of the magnitude spectrogram is a histogram drawn from a mixture of multinomials (urns).
  - The probability distribution used to draw the spectrum for the t-th frame is:

    P_t(f) = Σ_z P_t(z) P(f|z)

  - P_t(z): frame (time)-specific mixture weight; P(f|z): source-specific bases; P_t(f): frame-specific spectral distribution.
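The model above can be sketched in a few lines. This is an illustrative NumPy sketch (the function name `frame_distribution` and the matrix layout are my own assumptions, not course code): the bases P(f|z) form the columns of one matrix, the per-frame weights P_t(z) the columns of another, and the mixture is a matrix product.

```python
import numpy as np

def frame_distribution(P_f_given_z, P_z_t):
    """Compute P_t(f) = sum_z P_t(z) P(f|z) for every frame t.

    P_f_given_z: (F, Z) array; column z is the spectral basis P(f|z)
    P_z_t:       (Z, T) array; column t is the frame-specific weight P_t(z)
    Returns an (F, T) array of frame-specific spectral distributions P_t(f).
    """
    return P_f_given_z @ P_z_t
```

Because each basis column and each weight column sums to 1, every output column is itself a valid distribution over frequencies.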

SLIDE 3

Recap: Mixture-multinomial model

- The individual multinomials represent the "spectral bases" that compose all signals generated by the source.
  - E.g., they may be the notes of an instrument.
  - More generally, they may not have such a semantic interpretation.

[Figure: cartoon of urns (multinomials) with example draw counts.]

SLIDE 4

Recap: Learning Bases

- Learn bases from example spectrograms.
- Initialize bases P(f|z) for all z, for all f.
- For each frame, initialize P_t(z).
- Iterate:

  P_t(z|f) = P_t(z) P(f|z) / Σ_{z'} P_t(z') P(f|z')

  P_t(z) = Σ_f S_t(f) P_t(z|f) / Σ_{z'} Σ_f S_t(f) P_t(z'|f)

  P(f|z) = Σ_t S_t(f) P_t(z|f) / Σ_{f'} Σ_t S_t(f') P_t(z|f')
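The iteration above can be written compactly in multiplicative form, since the posterior-weighted counts reduce to elementwise products. A minimal NumPy sketch (illustrative only — `plca_learn` and its interface are my own, not the course's code):

```python
import numpy as np

def plca_learn(S, Z, n_iter=100, seed=0):
    """EM for the mixture-multinomial (PLCA) model.

    S: (F, T) nonnegative magnitude spectrogram
    Z: number of bases
    Returns B with B[:, z] = P(f|z), and W with W[:, t] = P_t(z).
    """
    rng = np.random.default_rng(seed)
    F, T = S.shape
    B = rng.random((F, Z)); B /= B.sum(axis=0)   # P(f|z)
    W = rng.random((Z, T)); W /= W.sum(axis=0)   # P_t(z)
    for _ in range(n_iter):
        R = B @ W                                # current model P_t(f)
        R[R == 0] = 1e-12
        V = S / R                                # S_t(f) / P_t(f)
        # E-step folded into the M-step: posterior-weighted counts
        W_new = W * (B.T @ V)                    # ∝ Σ_f S_t(f) P_t(z|f)
        B_new = B * (V @ W.T)                    # ∝ Σ_t S_t(f) P_t(z|f)
        W = W_new / W_new.sum(axis=0)
        B = B_new / B_new.sum(axis=0)
    return B, W
```

The multiplicative updates are algebraically identical to the three equations on the slide: e.g. W[z,t]·(Bᵀ(S/R))[z,t] = Σ_f S_t(f)·P_t(z)P(f|z)/P_t(f) = Σ_f S_t(f) P_t(z|f).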

SLIDE 5

Bases represent meaningful spectral structures

[Figure: a signal's spectrogram (frequency vs. time) with its learned bases P(f|z), the per-frame weights P_t(z), and the basis-specific spectrograms. Example drawn from Bach's Fugue in G minor.]

SLIDE 6

How about non-speech data

- We can use the same model to represent other data.
- Images:
  - Every face in a collection is a histogram.
  - Each histogram is composed from a mixture of a fixed number of multinomials.
- All faces are composed from the same multinomials, but the manner in which the multinomials are selected differs from face to face.
  - Each component multinomial is also an image,
    - and can be learned from a collection of faces.
    - Component multinomials are observed to be parts of faces.
- 19x19 images = 361-dimensional vectors.

SLIDE 7

How many bases can we learn?

- The number of bases that must be learned is a fundamental question.
  - How do we know how many bases to learn?
  - How many bases can we actually learn computationally?
- A key computational problem in learning bases:
  - The number of bases we can learn correctly is restricted by the dimension of the data.
  - I.e., if the spectrum has F frequencies, we cannot estimate more than F-1 component multinomials reliably.
- Why?

SLIDE 8

Indeterminacy in Learning Bases

- Consider the four histograms to the right.
- All of them are mixtures of the same K component multinomials.
- For K < 3, a single global solution may exist.
  - I.e., there may be a unique set of component multinomials that explains all the histograms.
    - With error -- the model will not be perfect.
- For K = 3 a trivial solution exists.

[Figure: four example histograms, each expressed as a combination of two bases: c*B1+d*B2, e*B1+f*B2, g*B1+h*B2, i*B1+j*B2.]

SLIDE 9

Indeterminacy

- Multiple solutions exist for K = 3.
  - We cannot learn a non-trivial set of "optimal" bases from the histograms.
  - The component multinomials we do learn tell us nothing about the data.
- For K > 3, the problem only gets worse.
  - An infinite set of solutions is possible.
    - E.g. the trivial solution plus a random basis.

[Figure: the trivial solution -- three one-hot bases B1, B2, B3 -- represents each histogram exactly, e.g. 0.5*B1+0.33*B2+0.17*B3, 0.5*B1+0.17*B2+0.33*B3, 0.33*B1+0.5*B2+0.17*B3, 0.4*B1+0.2*B2+0.4*B3.]

SLIDE 10

Indeterminacy in signal representations

- Spectra:
  - If our spectra have D frequencies (the number of unique indices in the DFT), then...
  - We cannot learn D or more meaningful component multinomials to represent them.
    - The trivial solution will give us D components, each of which has probability 1.0 for one frequency and 0 for all others.
    - This does not capture the innate spectral structures of the source.
- Images: it is not possible to learn more than P-1 meaningful component multinomials from a collection of P-pixel images.

SLIDE 11

Overcomplete Representations

- Representations with more bases than dimensions are called overcomplete.
  - E.g. more multinomial components than dimensions.
  - More L2 bases (e.g. Eigenvectors) than dimensions.
  - More non-negative bases than dimensions.
- Overcomplete representations are difficult to compute.
  - Straightforward computation results in indeterminate solutions.
- Overcomplete representations are required to represent the world adequately.
  - The complexity of the world is not restricted by the dimensionality of our representations!
SLIDE 12

How many bases to represent sounds/images?

- In each case, the bases represent "typical unit structures":
  - Notes
  - Phonemes
  - Facial features...
- To model the data well, all of these must be represented.
- How many notes in music?
  - Several octaves.
  - Several instruments.
- The total number of notes required to represent all "typical" sounds in music is in the thousands.
- The typical sounds in speech:
  - Many phonemes, many variations -- can number in the thousands.
- Images:
  - Millions of units can compose an image: trees, dogs, walls, sky, etc. etc...
SLIDE 13

How many can we learn?

- Typical Fourier representation of sound: 513 (or fewer) unique frequencies.
  - I.e. no more than 512 unique bases can be learned reliably.
  - These 512 bases must represent everything,
    - including the units of music, speech, and the other sounds in the world around us,
    - depending on what we're attempting to model.
- Typical "tiny" image: 100x100 pixels.
  - 10000 pixels.
  - I.e. no more than 9999 distinct bases can be learned reliably.
  - But the number of unique entities that can be represented in a 100x100 image is countless!
- We need overcomplete representations to model these data well.

SLIDE 14

Learning Overcomplete Representations

- Learning more multinomial components than dimensions (frequencies or pixels) in the data leads to indeterminate or useless solutions.
- Additional criteria must be imposed in the learning process to learn more components than dimensions.
  - Impose additional constraints that will enable us to obtain meaningful solutions.
- We will require our solutions to be sparse.

SLIDE 15

SPARSE Decompositions

- Allow any arbitrary number of bases (urns).
  - Overcomplete.
- Specify that for any specific frame only a small number of bases may be used.
  - Although there are many spectral structures, any given frame contains only a few of them.
- In other words, the mixture weights with which the bases are combined must be sparse:
  - Have non-zero value for only a small number of bases.
  - Alternately, be of a form in which only a small number of bases contribute significantly.

SLIDE 16

The history of sparsity

- The search for "sparse" decompositions has a long history,
  - even outside the scope of overcomplete representations.
- A landmark paper: "Sparse Coding of Natural Images Produces Localized, Oriented, Bandpass Receptive Fields", by Olshausen and Field:
  - "The images we typically view, or natural scenes, constitute a minuscule fraction of the space of all possible images. It seems reasonable that the visual cortex, which has evolved and developed to effectively cope with these images, has discovered efficient coding strategies for representing their structure. Here, we explore the hypothesis that the coding strategy employed at the earliest stage of the mammalian visual cortex maximizes the sparseness of the representation. We show that a learning algorithm that attempts to find linear sparse codes for natural scenes will develop receptive fields that are localized, oriented, and bandpass, much like those in the visual system."
  - Images can be described in terms of a small number of descriptors from a large set.
    - E.g. a scene is "a grapevine plus grapes plus a fox plus sky".
- Other studies indicate that human perception may be based on sparse compositions of a large number of "icons".
- The number of sensors (rods/cones in the eye, hair cells in the ear) is much smaller than the number of visual/auditory objects in the world around us.
  - The representation is overcomplete.

SLIDE 17

Representation in L2

- Conventional Eigen analysis:
  - Compute Eigen vectors such that ||X - EW||^2 is minimized.
    - The columns of E are orthogonal to one another.
- Eigen analysis is an "L2" decomposition.
  - Minimizes the L2, or Euclidean, error of composition.
- The maximum number of Eigen vectors = the number of dimensions D.
- We could use any set of D linearly independent vectors (e.g. a DxD matrix B), not only the Eigen vectors.
  - The data vector could be expressed in the same manner as above.
  - The only distinction is that, unlike E, the columns of B are no longer orthogonal.
  - The weights with which the bases must be combined are obtained as W = pinv(B)*X.

[Figure: a face image expressed as w1 x basis1 + w2 x basis2 + w3 x basis3.]

SLIDE 18

Overcomplete representations in L2

- Sparse L2 representation:
  - Minimize ||X - BW||^2.
    - Same as before, except the number of bases is much greater than the number of dimensions.
  - The bases are no longer Eigen vectors.
- The weights w_i must now be sparse.
  - I.e. although the number of bases is > D, the number of non-zero weight terms for any data X must be less than D.
- Conventional dot-product / pseudoinverse-based algorithms will not give us the correct solution.
  - They impose no constraint on W.

[Figure: an image expressed as a linear combination of more bases than pixels.]

SLIDE 19

Sparse overcomplete representations in L2

- Problem:
  - Given an overcomplete set of bases B1, B2, ..., BN,
  - estimate the weights w1, w2, ..., wN such that
  - X = w1*B1 + w2*B2 + ... + wN*BN,
    - where X is D-dimensional, D < N,
  - and the set of weights {wi} is sparse.
- Problem formulation:
  - Argmin_W ||X - BW||^2 + Constraint(W)
  - W is the set of weights in vector form.
  - The "constraint" is a sparsity constraint.
  - Given many equivalent unconstrained solutions for W, it forces the selection of the sparsest of these solutions.

SLIDE 20

Sparse L2 Decomposition

- Problem formulation:
  - Argmin_W ||X - BW||^2 + Constraint(W)
- The L0 constraint:
  - Objective to minimize = ||X - BW||^2 + |W|_0
  - Minimizes the error of reconstruction AND the number of non-zero terms in W.
    - The L0 norm |W|_0 = the number of non-zero terms, by definition.
  - Computationally intractable for large basis sets.
    - Needs a combinatorial search.
  - Approximate solutions:
    - CoSaMP
    - L2 solution with flooring
    - Etc.
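Greedy pursuit methods are the usual way to approximate the L0 problem in practice. As a concrete illustration (my own sketch of Orthogonal Matching Pursuit, a close relative of the CoSaMP algorithm the slide mentions -- not course code), the idea is: repeatedly pick the basis most correlated with the residual, then re-fit the selected bases by least squares:

```python
import numpy as np

def omp(B, x, k):
    """Orthogonal Matching Pursuit: greedily approximate
    argmin_w ||x - B w||^2  subject to  |w|_0 <= k.

    B: (D, N) matrix of basis vectors (columns), x: (D,) target vector.
    """
    residual = x.copy()
    support = []
    w = np.zeros(B.shape[1])
    for _ in range(k):
        # pick the basis most correlated with the current residual
        j = int(np.argmax(np.abs(B.T @ residual)))
        if j not in support:
            support.append(j)
        # re-fit the selected bases by least squares
        w_s, *_ = np.linalg.lstsq(B[:, support], x, rcond=None)
        residual = x - B[:, support] @ w_s
    w[support] = w_s
    return w
```

With k iterations the solution has at most k non-zero weights, sidestepping the combinatorial search at the cost of only approximate optimality.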

SLIDE 21

Sparse L2 Decomposition

- Problem formulation:
  - Argmin_W ||X - BW||^2 + |W|_1
    - |W|_1 is the L1 norm of W,
    - i.e. the sum of the magnitudes of all entries in W.
- The L1 constraint:
  - Minimization of L0 is computationally intractable.
  - Under certain generic conditions, it is sufficient to minimize the L1 norm instead.
    - "Restricted Isometry" of B.
    - The optimal L1 solution will then also be the optimal L0 solution.
- L1 minimization is a standard convex optimization problem.
  - Downloadable code is available from Caltech (the L1-magic package):
  - http://www.acm.caltech.edu/l1magic/
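Beyond packaged solvers like L1-magic, the L1-regularized objective can be minimized with a few lines of proximal gradient descent. A minimal sketch (my own illustration of ISTA, iterative shrinkage-thresholding -- one standard method for this objective, not the one the course distributes):

```python
import numpy as np

def ista(B, x, lam=0.1, n_iter=3000):
    """Iterative Shrinkage-Thresholding (ISTA) for
    argmin_w 0.5*||x - B w||^2 + lam*|w|_1.
    """
    L = np.linalg.norm(B, 2) ** 2          # Lipschitz constant of the gradient
    w = np.zeros(B.shape[1])
    for _ in range(n_iter):
        g = w + (B.T @ (x - B @ w)) / L    # gradient step on the L2 term
        # soft-thresholding: the proximal operator of the L1 penalty
        w = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)
    return w
```

The soft-threshold step is exactly what the L1 penalty contributes: it shrinks every weight toward zero and sets small weights exactly to zero, which is where the sparsity comes from.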

SLIDE 22

Overcomplete L2 Representations

- We have seen how to estimate weights given bases.
- How about learning the optimal set of bases?
- Sparse PCA:
  - Learn orthogonal Eigen-like vectors that can be combined sparsely.
  - Cannot be overcomplete.
- Random projections.
- Other techniques for learning "dictionaries" of overcomplete bases.
- Good information on Dave Donoho's Stanford page.

SLIDE 23

Sparsity and Overcompleteness for Multinomial Models

- Histograms are composed from more multinomials than bins:
  - X = w1*B1 + w2*B2 + w3*B3 + w4*B4 + ...
- The mixture weights combining the multinomials are sparse.
  - I.e. {wi} is sparse.
  - A different subset of the weights wi is high for different data.
  - Over a large collection of data vectors, all bases will eventually be used.

[Figure: a face histogram expressed as w1, w2, w3, w4 times component multinomial images.]

SLIDE 24

Estimating Mixture Weights given Multinomials

- Basic estimation: Maximum likelihood
  - Argmax_W log P(X; B, W) = Argmax_W Σ_f X(f) log(Σ_i wi Bi(f))
- Modified estimation: Maximum a posteriori
  - Argmax_W Σ_f X(f) log(Σ_i wi Bi(f)) + β log P(W)
- Sparsity is obtained by enforcing an a priori probability distribution P(W) over the mixture weights that favors sparse mixture weights.
- The algorithm for estimating the weights must be modified to account for the priors.

SLIDE 25

The distribution

- A variety of a priori probability distributions all provide a bias towards "sparse" solutions.
- The Dirichlet prior:
  - P(W) = Z * Π_i wi^(α-1)
- The entropic prior:
  - P(W) = Z * exp(-α H(W))
  - H(W) = entropy of W = -Σ_i wi log(wi)

SLIDE 26

A simplex view of the world

- The mixture weights are a probability distribution:
  - Σ_i wi = 1.0
- They can be viewed as a vector:
  - W = [w0 w1 w2 w3 w4 ...]
  - The vector components are positive and sum to 1.0.
- All probability vectors lie on a simplex:
  - a convex region of a linear subspace in which all vectors sum to 1.0.

[Figure: the 3-D probability simplex with vertices (1,0,0), (0,1,0), (0,0,1).]

SLIDE 27

Probability Simplex

- The sparsest probability vectors lie on the vertices of the simplex.
- The edges and faces of the simplex are progressively less sparse:
  - Points on an edge (spanned by two vertices) have 2 non-zero elements.
  - Points on a face (spanned by three vertices) have 3 non-zero elements.
  - Etc.

[Figure: the simplex with vertices (1,0,0), (0,1,0), (0,0,1).]

SLIDE 28

Sparse Priors: Dirichlet

- For α < 1, sparse probability vectors are more likely than dense ones.

  P(W) = Z * Π_i wi^(α-1)

[Figure: Dirichlet density over the simplex for α = 0.5, peaking at the vertices.]

SLIDE 29

Sparse Priors: The entropic prior

- Vectors (probability distributions) with low entropy are more probable than those with high entropy.
  - Low-entropy distributions are sparse!

  P(W) = Z * exp(-α H(W))

[Figure: entropic prior over the simplex for α = 0.5.]

SLIDE 30

The Entropic Prior

- The entropic prior "controls" the desired level of sparsity in the mixture weights through α.
- Changing the sign of α can bias us towards either higher or lower entropies.

SLIDE 31

Optimization with the entropic prior

- The objective function:

  Argmax_W Σ_f X(f) log(Σ_i wi Bi(f)) - α H(W)

- By estimating W such that the above is maximized, we can derive minimum-entropy solutions.
  - Jointly optimize W to predict the data while minimizing its entropy.

SLIDE 32

The Expectation Maximization Algorithm

- The parameters are actually learned using the Expectation Maximization (EM) algorithm.
- The EM algorithm optimizes the following objective function:
  - Q = Σ_f Σ_z P(z|f) X(f) log(P(z) P(f|z)) - α H(P(z))
    - The second term is derived from the entropic prior.
- Optimization of the above needs a solution to the following stationarity condition (λ is the Lagrange multiplier for Σ_z P_t(z) = 1):

  Σ_f S_t(f) P_t(z|f) / P_t(z) + λ + α (1 + log P_t(z)) = 0

- The solution requires a new function:
  - the Lambert W function.

SLIDE 33

Lambert's W Function

- Lambert's W function is the solution to:

  W exp(W) = X

  - i.e. W = F(X) is the inverse function of X = W exp(W);
  - equivalently, W + log(W) = log(X).
- In general, a multi-valued function.
- If X is real, W is real for X > -1/e.
  - Still multi-valued.
- If we impose the restriction W > -1 and W real, we get the zeroth branch W0 of the W function.
  - Single-valued.
- For W < -1 and W real we get the -1th branch of the W function.
  - Single-valued.

[Figure: plot of the zeroth branch W0(x).]

SLIDE 34

Estimating W0(z)

- An iterative solution:
  - Newton's method
  - Halley iterations
  - Code for Lambert's W function is available on Wikipedia.
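The Newton iteration mentioned above is only a few lines. A minimal sketch (my own illustration, not the Wikipedia code the slide points to): Newton's method applied to f(w) = w*e^w - x, whose update is w ← w - (w*e^w - x) / (e^w*(w+1)).

```python
import math

def lambert_w0(x, n_iter=50):
    """Zeroth branch of Lambert's W via Newton's method on w*e^w - x = 0.
    Valid for x > -1/e, where W0 is real."""
    w = 0.0 if x < 1.0 else math.log(x)   # crude but serviceable start
    for _ in range(n_iter):
        ew = math.exp(w)
        w -= (w * ew - x) / (ew * (w + 1.0))
    return w
```

Newton converges quadratically here; Halley's iteration (also named on the slide) uses a second-derivative correction and converges cubically, but the simpler update is usually enough.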

SLIDE 35

Solutions with entropic prior

- The update rules are the same as before, with one minor modification.
- To estimate the mixture weights, the following two equations must be iterated:
  - to convergence,
  - or just for a few iterations.

  γ_z = Σ_f S_t(f) P_t(z|f)

  P_t(z) = -(γ_z / α) / W0( -(γ_z / α) e^(1 + λ/α) )

  λ = -γ_z / P_t(z) - α (1 + log P_t(z))

- α is the sparsity factor.
- P_t(z) must be initialized randomly.

SLIDE 36

Learning Rules for Overcomplete Basis Set

- Exactly the same as earlier, with the modification that P_t(z) is now estimated to be sparse.
- Initialize P_t(z) for all t, and P(f|z).
- Iterate:

  P_t(z|f) = P_t(z) P(f|z) / Σ_{z'} P_t(z') P(f|z')

  P(f|z) = Σ_t S_t(f) P_t(z|f) / Σ_{f'} Σ_t S_t(f') P_t(z|f')

  γ_z = Σ_f S_t(f) P_t(z|f)

  P_t(z) = -(γ_z / α) / W0( -(γ_z / α) e^(1 + λ/α) )

  λ = -γ_z / P_t(z) - α (1 + log P_t(z))
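The Lambert-W weight update can be verified numerically: for a fixed multiplier λ, the closed form returns exactly the θ = P_t(z) that satisfies the stationarity condition γ/θ + λ + α(1 + log θ) = 0. A minimal sketch (my own code, with an embedded Newton iteration for W0; in the full algorithm λ is itself re-estimated so the weights normalize):

```python
import math

def lambert_w0(x, n_iter=60):
    """Zeroth branch of Lambert's W (Newton's method); valid for x > -1/e."""
    w = 0.0 if x < 1.0 else math.log(x)
    for _ in range(n_iter):
        ew = math.exp(w)
        w -= (w * ew - x) / (ew * (w + 1.0))
    return w

def entropic_weight(gamma, alpha, lam):
    """Solve gamma/theta + lam + alpha*(1 + log theta) = 0 for theta,
    i.e. the sparse mixture-weight update on the slide:
    theta = -(gamma/alpha) / W0( -(gamma/alpha) * e^(1 + lam/alpha) )."""
    c = gamma / alpha
    return -c / lambert_w0(-c * math.exp(1.0 + lam / alpha))
```

Substituting the returned θ back into the stationarity equation gives a residual of zero, which is the whole point of introducing the W function: it turns a transcendental equation in θ into a single closed-form evaluation.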

SLIDE 37

A Simplex Example for Overcompleteness

- Synthetic data: four clusters of data within the probability simplex.
- Regular learning with 3 bases learns an enclosing triangle.
- Overcomplete solutions without sparsity result in meaningless solutions.
- The sparse overcomplete model captures the distribution of the data.

SLIDE 38

Sparsity can be employed without overcompleteness

- Overcompleteness requires sparsity.
- Sparsity does not require overcompleteness.
  - Sparsity only imposes the constraint that the data are composed from a mixture of as few multinomial components as possible.
  - This makes no assumption about overcompleteness.
SLIDE 39

Examples without overcompleteness

- Left panel, regular learning: most bases have significant energy in all frames.
- Right panel, sparse learning: fewer bases are active within any frame.
  - Sparse decompositions result in more localized activation of bases.
  - The bases, too, are better defined in their structure.

SLIDE 40

Face Data: The effect of sparsity

- As solutions get more sparse, bases become more informative.
  - In the limit, each basis is a complete face by itself.
  - The mixture weights simply select a face.
- The solution also allows the mixture weights to have maximum entropy:
  - maximally dense, i.e. minimally sparse.
  - The bases then become much more localized components.
- The sparsity factor allows us to tune the bases we learn.

[Figure: bases learned with high-entropy mixture weights, with no sparsity, and with sparse mixture weights.]

SLIDE 41

Benefit of overcompleteness

- 19x19-pixel images (361 pixels).
- Up to 1000 bases trained from 2000 faces.
- The SNR of reconstruction from the overcomplete basis set is more than 10 dB better than reconstruction from the corresponding "compact" (regular) basis set.

SLIDE 42

Signal Processing: How

- Exactly as before.
- Learn an overcomplete set of bases.
- For each new data vector to be processed, compute the optimal mixture weights,
  - now constraining the mixture weights to be sparse.
- Use the estimated mixture weights and the bases to perform additional processing.

SLIDE 43

Signal Separation with Overcomplete Bases

- Learn overcomplete bases for each source.
- For each frame of the mixed signal:
  - Estimate the prior probability of each source and the mixture weights for each source.
    - Constraint: use sparse learning for the mixture weights.

  P_t(f) = P_t(s1) Σ_z P_t(z|s1) P(f|z,s1) + P_t(s2) Σ_z P_t(z|s2) P(f|z,s2)
         = P_t(s1) P_t(f|s1) + P_t(s2) P_t(f|s2)

- Estimate the separated signals as:

  Ŝ_i(t,f) = S(t,f) Σ_z P_t(s_i, z | f)
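The reconstruction step amounts to posterior masking: each source's share of the model probability at (t,f) scales the observed mixture. A minimal sketch (illustrative only -- `separate` and its interface are my own, and it assumes the weights have already been estimated with the sparse procedure):

```python
import numpy as np

def separate(S, B1, W1, B2, W2, eps=1e-12):
    """Separate a mixture spectrogram by posterior masking.

    S:      (F, T) mixture magnitude spectrogram
    B1, B2: (F, Z) per-source bases P(f|z, s_i)
    W1, W2: (Z, T) per-source weights (already scaled by P_t(s_i))
    Returns the two separated spectrogram estimates.
    """
    R1 = B1 @ W1                      # P_t(s1) P_t(f|s1)
    R2 = B2 @ W2                      # P_t(s2) P_t(f|s2)
    total = R1 + R2 + eps             # P_t(f)
    return S * R1 / total, S * R2 / total
```

By construction the two estimates sum back to the observed mixture, so the mask redistributes energy rather than inventing it.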

SLIDE 44

Sparse Overcomplete Bases: Separation

- 3000 bases for each of the speakers.
  - The speaker-to-speaker ratio typically doubles (in dB) w.r.t. "compact" bases.

[Figure: panels 2 and 3 show regular learning (regular bases); panels 4 and 5 show sparse learning (sparse bases).]

SLIDE 45

The Limits of Overcompleteness

- How many bases can we learn?
- The limit is: as many bases as the number of vectors in the training data.
  - Or rather, the number of distinct histograms in the training data,
    - since we treat each vector as a histogram.
- It is not possible to learn more than this number, regardless of sparsity.
  - The arithmetic supports it, but the results will be meaningless.

SLIDE 46

Working at the limits of overcompleteness: The "Example-Based" Model

- Every training vector is a basis,
  - normalized to be a distribution.
- Let S(t,f) be the t-th training vector.
- Let T be the total number of training vectors.
- The total number of bases is T.
- The k-th basis is given by:

  B(k,f) = S(k,f) / Σ_f S(k,f)

- Learning the bases requires no additional learning steps besides simply collecting (and computing spectra from) the training data.
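The "learning" step above is just an L1 normalization of each training vector. A one-line sketch (function name and array layout are my own, for illustration):

```python
import numpy as np

def example_based_bases(S):
    """Turn every training vector into a basis by L1-normalizing it:
    B(k, f) = S(k, f) / sum_f S(k, f).

    S: (T, F) array of nonnegative training spectra, one per row.
    Returns a (T, F) array whose rows are distributions.
    """
    return S / S.sum(axis=1, keepdims=True)
```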

SLIDE 47

The example-based model: an illustration

- In the above example, all training data lie on the curve shown (left panel).
  - Each of them is a vector that sums to 1.0.
- The learning procedure for bases learns multinomial components that are linear combinations of the data (middle panel).
  - These can lie anywhere within the area enclosed by the data.
  - The layout of the components hides the actual structure of the layout of the data.
- The example-based representation captures the layout of the data perfectly (right panel),
  - since the data are the bases.

SLIDE 48

Signal Processing with the Example-Based Model

- All previously defined operations can be performed using the example-based model exactly as before.
  - For each data vector, estimate the optimal mixture weights to combine the bases.
    - The mixture weights MUST be estimated to be sparse.
- The example-based representation is simply a special case of an overcomplete basis set.

SLIDE 49

Illustrations of separation with example-based representation

- Top panel: separation from learned bases.
- Bottom panel: separation with the example-based representation.

SLIDE 50

Speaker Separation Example

- Speaker-to-interference ratio of separated speakers.
  - State-of-the-art separation results.

SLIDE 51

Example-based model: the training data?

- In principle, there is no need to use all the training data as the model.
  - A well-selected subset will do.
  - E.g., ignore spectral vectors from all pauses and non-speech regions of speech samples.
  - E.g., eliminate spectral vectors that are nearly identical.
- The problem of selecting the optimal set of training examples remains open, however.

SLIDE 52

Summary So Far

- PLCA:
  - The basic mixture-multinomial model for audio (and other data).
- Sparse Decomposition:
  - The notion of sparsity and how it can be imposed on learning.
- Sparse Overcomplete Decomposition:
  - The notion of an overcomplete basis set.
- Example-based representations:
  - Using the training data itself as our representation.

SLIDE 53

Next up: Shift/Transform Invariance

- Sometimes the "typical" structures that compose a sound are wider than one spectral frame.
  - E.g., in the above example we note multiple instances of a pattern that spans several frames.

SLIDE 54

Next up: Shift/Transform Invariance

- Sometimes the "typical" structures that compose a sound are wider than one spectral frame.
  - E.g., in the above example we note multiple instances of a pattern that spans several frames.
- Multiframe patterns may also be local in frequency.
  - E.g., the two green patches are similar only in the region enclosed by the blue box.

SLIDE 55

Patches are more representative than frames

- Four bars from a music example.
- The spectral patterns are actually patches.
  - Not all frequencies fall off in time at the same rate.
- The basic unit is a spectral patch, not a spectrum.

SLIDE 56

Images: Patches often form the image

- A typical image component may be viewed as a patch:
  - The alien invaders.
  - Face-like patches.
  - A car-like patch,
    - overlaid on itself many times...
SLIDE 57

Shift-invariant modelling

- A shift-invariant model permits individual bases to be patches.
- Each patch composes the entire image.
- The data is a sum of the compositions from the individual patches.

SLIDE 58

Shift Invariance in one Dimension

- Our bases are now "patches":
  - typical spectro-temporal structures.
- The urns now represent patches.
  - Each draw results in a (t,f) pair, rather than only f.
  - Also associated with each urn: a shift probability distribution P(T|z).
- The overall drawing process is slightly more complex.
- Repeat the following process:
  - Select an urn Z with probability P(Z).
  - Draw a shift T from P(T|Z).
  - Draw a (t,f) pair from the urn.
  - Add to the histogram at (t+T, f).
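The drawing process above translates directly into a sampler. A minimal sketch (my own illustration of the generative process, not course code; array shapes are assumptions):

```python
import numpy as np

def draw_shift_invariant(P_z, P_T_given_z, P_tf_given_z, n_draws, seed=0):
    """Sample a histogram from the 1-D shift-invariant model:
    pick urn Z ~ P(Z), shift T ~ P(T|Z), patch entry (t,f) ~ P(t,f|Z),
    then increment the histogram at (t+T, f).

    P_z:          (Z,) urn priors
    P_T_given_z:  (Z, Tmax) shift distributions
    P_tf_given_z: (Z, tp, F) patch distributions (tp = patch width)
    Returns a (Tmax + tp - 1, F) histogram of counts.
    """
    rng = np.random.default_rng(seed)
    Z, Tmax = P_T_given_z.shape
    _, tp, F = P_tf_given_z.shape
    H = np.zeros((Tmax + tp - 1, F))
    for _ in range(n_draws):
        z = rng.choice(Z, p=P_z)                         # select an urn
        T = rng.choice(Tmax, p=P_T_given_z[z])           # draw a shift
        flat = rng.choice(tp * F, p=P_tf_given_z[z].ravel())
        t, f = divmod(flat, F)                           # draw (t,f) from urn
        H[t + T, f] += 1                                 # place at (t+T, f)
    return H
```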

SLIDE 59

Shift Invariance in one Dimension

- The process is shift-invariant because the probability of drawing a shift, P(T|Z), does not affect the probability of selecting urn Z.
- Every location in the spectrogram has contributions from every urn patch.

SLIDE 60

Shift Invariance in one Dimension

[Animation step of Slide 59; text repeated.]

SLIDE 61

Shift Invariance in one Dimension

[Animation step of Slide 59; text repeated.]

SLIDE 62

Probability of drawing a particular (t,f) combination

- The parameters of the model:
  - P(t,f|z): the urns
  - P(T|z): the urn-specific shift distribution
  - P(z): the probability of selecting an urn
- The ways in which (t,f) can be drawn:
  - Select any urn z.
  - Draw T from the urn-specific shift distribution.
  - Draw (t-T, f) from the urn.
- The actual probability sums this over all shifts and urns:

  P(t,f) = Σ_z P(z) Σ_T P(T|z) P(t-T, f | z)
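The sum over shifts is a convolution along time: for each urn, the shift distribution is convolved with each frequency slice of the patch, and the results are mixed by P(z). A minimal NumPy sketch (my own illustration; array shapes are assumptions):

```python
import numpy as np

def model_probability(P_z, P_T_given_z, P_tf_given_z):
    """Compute P(t,f) = sum_z P(z) sum_T P(T|z) P(t-T, f|z):
    per urn, a 1-D convolution of the shift distribution with the patch
    along the time axis, weighted by the urn prior P(z).
    """
    Z, Tmax = P_T_given_z.shape
    _, tp, F = P_tf_given_z.shape
    out = np.zeros((Tmax + tp - 1, F))
    for z in range(Z):
        for f in range(F):
            out[:, f] += P_z[z] * np.convolve(P_T_given_z[z],
                                              P_tf_given_z[z, :, f])
    return out
```

Since the convolution of two distributions is again a distribution, the output sums to 1 whenever the inputs are properly normalized.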

SLIDE 63

Learning the Model

- The parameters of the model are learned analogously to the manner in which mixture multinomials are learned.
- Given an observation of (t,f), if we knew which urn it came from and the shift, we could compute all probabilities by counting!
  - If the shift is T and the urn is Z:
    - Count(Z) = Count(Z) + 1
    - For the shift probability: Count(T|Z) = Count(T|Z) + 1
    - For the urn: Count(t-T, f|Z) = Count(t-T, f|Z) + 1
      - Since the value drawn from the urn was (t-T, f).
  - After all observations are counted:
    - Normalize Count(Z) to get P(Z).
    - Normalize Count(T|Z) to get P(T|Z).
    - Normalize Count(t,f|Z) to get P(t,f|Z).
- Problem: when learning the urns and shift distributions from a histogram, the urn (Z) and shift (T) for any draw of (t,f) are not known.
  - These are unseen variables.

SLIDE 64

Learning the Model

- Urn Z and shift T are unknown.
  - So (t,f) contributes partial counts to every value of T and Z.
  - Contributions are proportional to the a posteriori probabilities of Z, and of T given Z.
- Each observation of (t,f) contributes:
  - P(z|t,f) to the count of the total number of draws from the urn:
    - Count(Z) = Count(Z) + P(z|t,f)
  - P(z|t,f) P(T|z,t,f) to the count of shift T for the shift distribution:
    - Count(T|Z) = Count(T|Z) + P(z|t,f) P(T|z,t,f)
  - P(z|t,f) P(T|z,t,f) to the count of (t-T, f) for the urn:
    - Count(t-T, f|Z) = Count(t-T, f|Z) + P(z|t,f) P(T|z,t,f)

  P(t,f,z) = P(z) Σ_T P(T|z) P(t-T, f | z)

  P(z|t,f) = P(t,f,z) / Σ_{z'} P(t,f,z')

  P(T|z,t,f) = P(T|z) P(t-T, f | z) / Σ_{T'} P(T'|z) P(t-T', f | z)

SLIDE 65

Shift invariant model: Update Rules

- Given data (spectrogram) S(t,f):
- Initialize P(Z), P(T|Z), P(t,f|Z).
- Iterate:

  P(t,f,z) = P(z) Σ_T P(T|z) P(t-T, f | z)

  P(z|t,f) = P(t,f,z) / Σ_{z'} P(t,f,z')

  P(T|z,t,f) = P(T|z) P(t-T, f | z) / Σ_{T'} P(T'|z) P(t-T', f | z)

  P(z) = Σ_{t,f} P(z|t,f) S(t,f) / Σ_{z'} Σ_{t,f} P(z'|t,f) S(t,f)

  P(T|z) = Σ_{t,f} P(z|t,f) P(T|z,t,f) S(t,f) / Σ_{T'} Σ_{t,f} P(z|t,f) P(T'|z,t,f) S(t,f)

  P(t,f|z) = Σ_T P(z|t+T,f) P(T|z,t+T,f) S(t+T,f) / Σ_{t',f'} Σ_T P(z|t'+T,f') P(T|z,t'+T,f') S(t'+T,f')
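The update rules above can be implemented with explicit loops over observations, which keeps the correspondence to the partial-count view transparent. A minimal (unoptimized) sketch, my own illustration rather than course code:

```python
import numpy as np

def shift_invariant_em(S, Z, tp, n_iter=10, seed=0):
    """EM for the 1-D shift-invariant model.

    S: (Td, F) nonnegative histogram/spectrogram
    Z: number of urns (patch bases); tp: patch width in frames
    Returns P_z (Z,), P_T (Z, Ts), P_tf (Z, tp, F), Ts = Td - tp + 1.
    """
    rng = np.random.default_rng(seed)
    Td, F = S.shape
    Ts = Td - tp + 1
    P_z = np.full(Z, 1.0 / Z)
    P_T = rng.random((Z, Ts)); P_T /= P_T.sum(axis=1, keepdims=True)
    P_tf = rng.random((Z, tp, F)); P_tf /= P_tf.sum(axis=(1, 2), keepdims=True)
    for _ in range(n_iter):
        Cz = np.zeros(Z); CT = np.zeros((Z, Ts)); Ctf = np.zeros((Z, tp, F))
        for t in range(Td):
            for f in range(F):
                if S[t, f] == 0:
                    continue
                # joint posterior over urn z and shift T for this (t, f)
                post = np.zeros((Z, Ts))
                for T in range(max(0, t - tp + 1), min(Ts, t + 1)):
                    post[:, T] = P_z * P_T[:, T] * P_tf[:, t - T, f]
                tot = post.sum()
                if tot == 0:
                    continue
                post *= S[t, f] / tot
                # accumulate the partial counts from the slide
                Cz += post.sum(axis=1)                   # Count(Z)
                CT += post                               # Count(T|Z)
                for T in range(max(0, t - tp + 1), min(Ts, t + 1)):
                    Ctf[:, t - T, f] += post[:, T]       # Count(t-T, f|Z)
        P_z = Cz / Cz.sum()
        P_T = CT / CT.sum(axis=1, keepdims=True)
        P_tf = Ctf / Ctf.sum(axis=(1, 2), keepdims=True)
    return P_z, P_T, P_tf
```

In practice the inner sums are vectorized as convolutions/correlations, but this loop form matches the counting derivation on the previous slides line for line.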

SLIDE 66

Shift-invariance in time: an example

- An example: two distinct sounds occurring with different repetition rates within a signal.
  - Modelled as being composed from two time-frequency bases.
  - NOTE: the width of the patches must be specified.

[Figure: input spectrogram; discovered time-frequency "patch" bases (urns); contribution of the individual bases to the recording.]

SLIDE 67

Shift Invariance in Two Dimensions

- We now have urn-specific shifts along both T and F
- The drawing process:
  - Select an urn Z with probability P(Z)
  - Draw shift values (T,F) from Ps(T,F|Z)
  - Draw a (t,f) pair from the urn
  - Add to the histogram at (t+T, f+F)
- This is a two-dimensional shift-invariant model
  - We have shifts in both time and frequency
  - Or, more generically, along both axes
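The drawing process above can be simulated directly. A small Python sketch (the helper name and array layout are my own, not from the slides):

```python
import numpy as np

def draw_histogram(PZ, PTF, Purn, n_draws, seed=0):
    """Sample a (t,f) histogram from the 2-D shift-invariant drawing process.

    PZ: (nZ,) urn priors P(Z).
    PTF: (nZ, nT, nF) shift distributions Ps(T,F|Z).
    Purn: (nZ, w, h) urns P(t,f|Z)."""
    rng = np.random.default_rng(seed)
    nZ, nT, nF = PTF.shape
    _, w, h = Purn.shape
    hist = np.zeros((nT + w - 1, nF + h - 1))
    for _ in range(n_draws):
        z = rng.choice(nZ, p=PZ)                                   # select urn Z ~ P(Z)
        T, F = divmod(rng.choice(nT * nF, p=PTF[z].ravel()), nF)   # (T,F) ~ Ps(T,F|Z)
        t, f = divmod(rng.choice(w * h, p=Purn[z].ravel()), h)     # (t,f) from the urn
        hist[t + T, f + F] += 1                                    # add at the shifted position
    return hist
```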

SLIDE 68

Learning the Model

- Learning is analogous to the 1-D case
- Given an observation of (t,f), if we knew which urn it came from and the shift, we could compute all probabilities by counting!
  - If the shift is (T,F) and the urn is Z:
    - Count(Z) = Count(Z) + 1
    - For the shift probability: ShiftCount(T,F|Z) = ShiftCount(T,F|Z) + 1
    - For the urn: Count(t-T,f-F|Z) = Count(t-T,f-F|Z) + 1
      - Since the value drawn from the urn was (t-T, f-F)
  - After all observations are counted:
    - Normalize Count(Z) to get P(Z)
    - Normalize ShiftCount(T,F|Z) to get Ps(T,F|Z)
    - Normalize Count(t,f|Z) to get P(t,f|Z)
- Problem: the shift and the urn are unknown
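If (Z, T, F) really were observed for every draw, the counting recipe above would be just a few lines. A sketch (hypothetical helper; the slides provide no code):

```python
import numpy as np

def ml_from_labeled_draws(draws, nZ, shift_shape, patch_shape):
    """Maximum-likelihood estimates by counting, assuming Z and (T,F) are known.

    draws: iterable of (z, T, F, t, f) tuples, one per observed draw."""
    cZ = np.zeros(nZ)
    cTF = np.zeros((nZ,) + shift_shape)     # ShiftCount(T,F|Z)
    cP = np.zeros((nZ,) + patch_shape)      # Count(t-T, f-F | Z)
    for z, T, F, t, f in draws:
        cZ[z] += 1
        cTF[z, T, F] += 1
        cP[z, t - T, f - F] += 1            # the urn emitted (t-T, f-F)
    # normalize each count table to get the probabilities
    PZ = cZ / cZ.sum()
    PTF = cTF / cTF.sum(axis=(1, 2), keepdims=True)
    Ptf = cP / cP.sum(axis=(1, 2), keepdims=True)
    return PZ, PTF, Ptf
```

When the labels are unknown, these hard counts become the soft, posterior-weighted counts of the next slide.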

SLIDE 69

Learning the Model

- Urn Z and shift (T,F) are unknown
  - So (t,f) contributes partial counts to every value of (T,F) and Z
  - Contributions are proportional to the a posteriori probabilities of Z and T,F|Z
- Each observation of (t,f) contributes:
  - P(Z|t,f) to the count of the total number of draws from the urn
    - Count(Z) = Count(Z) + P(Z|t,f)
  - P(Z|t,f)P(T,F|Z,t,f) to the count of shift (T,F) for the shift distribution
    - ShiftCount(T,F|Z) = ShiftCount(T,F|Z) + P(Z|t,f)P(T,F|Z,t,f)
  - P(Z|t,f)P(T,F|Z,t,f) to the count of (t-T, f-F) for the urn
    - Count(t-T,f-F|Z) = Count(t-T,f-F|Z) + P(Z|t,f)P(T,F|Z,t,f)

P(t,f,Z) = P(Z) Σ_T,F P(T,F|Z) P(t-T, f-F|Z)

P(Z|t,f) = P(t,f,Z) / Σ_Z' P(t,f,Z')

P(T,F|Z,t,f) = P(T,F|Z) P(t-T, f-F|Z) / Σ_T',F' P(T',F'|Z) P(t-T', f-F'|Z)
SLIDE 70

Shift invariant model: Update Rules

- Given data (spectrogram) S(t,f)
- Initialize P(Z), Ps(T,F|Z), P(t,f|Z)
- Iterate:

E-step:

P(t,f,Z) = P(Z) Σ_T,F P(T,F|Z) P(t-T, f-F|Z)

P(Z|t,f) = P(t,f,Z) / Σ_Z' P(t,f,Z')

P(T,F|Z,t,f) = P(T,F|Z) P(t-T, f-F|Z) / Σ_T',F' P(T',F'|Z) P(t-T', f-F'|Z)

M-step:

P(Z) = Σ_t,f P(Z|t,f) S(t,f) / Σ_Z' Σ_t,f P(Z'|t,f) S(t,f)

P(T,F|Z) = Σ_t,f P(Z|t,f) P(T,F|Z,t,f) S(t,f) / Σ_T',F' Σ_t,f P(Z|t,f) P(T',F'|Z,t,f) S(t,f)

P(t,f|Z) = Σ_T,F P(Z|t+T,f+F) P(T,F|Z,t+T,f+F) S(t+T,f+F) / Σ_t',f' Σ_T,F P(Z|t'+T,f'+F) P(T,F|Z,t'+T,f'+F) S(t'+T,f'+F)

SLIDE 71

2D Shift Invariance: The problem of indeterminacy

- P(t,f|Z) and Ps(T,F|Z) are analogous
  - Difficult to specify which will be the "urn" and which the "shift"
- Additional constraints are required to ensure that one of them is clearly the shift and the other the urn
- Typical solution: enforce sparsity on Ps(T,F|Z)
  - The patch represented by the urn occurs only in a few locations in the data
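One simple way to bias Ps(T,F|Z) toward sparsity between EM iterations is to exponentiate it and renormalize. This is a common heuristic stand-in for the more principled entropic-prior formulation in the PLCA literature; the helper below is my own illustration, not the course's method:

```python
import numpy as np

def sparsify(P, alpha=1.5):
    """Concentrate a distribution's mass on its largest entries.

    Raising a distribution to a power alpha > 1 and renormalizing
    lowers its entropy, pushing it toward a sparse, peaky shape."""
    Q = P ** alpha
    return Q / Q.sum()
```

Applied to Ps(T,F|Z) after each M-step, this nudges the shift distribution toward a few isolated peaks, so the urn is forced to absorb the repeated structure.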

SLIDE 72

Example: 2-D shift invariance

- Only one "patch" is used to model the image (i.e. a single urn)
  - The learnt urn is an "average" face; the learned shifts show the locations of faces
SLIDE 73

Example: 2-D shift invariance

- The original figure has multiple handwritten renderings of three characters
  - In different colours
- The algorithm learns the three characters and identifies their locations in the figure

[Figure: input data; discovered patches; patch locations]

SLIDE 74

Shift-Invariant Decomposition – Uses

- Signal separation
  - The arithmetic is the same as before
  - Learn shift-invariant bases for each source
  - Use these to separate signals
- Dereverberation
  - The spectrogram of the reverberant signal is simply the sum of several shifted copies of the spectrogram of the original signal
    - 1-D shift invariance
- Image deblurring
  - The blurred image is the sum of several shifted copies of the clean image
    - 2-D shift invariance
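The "sum of shifted copies" claim is easy to verify numerically. A toy sketch of the 2-D case (the 1-D dereverberation case is identical with a single-column kernel; the function name is mine):

```python
import numpy as np

def blur(img, kernel):
    """Build a blurred image explicitly as a weighted sum of shifted
    copies of the clean image (i.e. a full 2-D convolution, written out)."""
    H, W = img.shape
    kh, kw = kernel.shape
    out = np.zeros((H + kh - 1, W + kw - 1))
    for dy in range(kh):
        for dx in range(kw):
            # each kernel tap contributes one shifted, weighted copy
            out[dy:dy + H, dx:dx + W] += kernel[dy, dx] * img
    return out
```

In the shift-invariant decomposition, the clean image plays the role of the urn and the blur kernel the role of the shift distribution, so learning the model amounts to deblurring.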

SLIDE 75

Beyond shift-invariance: transform invariance

- The draws from the urns may not only be shifted, but also transformed
- The arithmetic remains very similar to the shift-invariant model
  - We must now apply one of an enumerated set of transforms to (t,f), after shifting by (T,F)
  - In the estimation, the precise transform applied is an unseen variable

SLIDE 76

Example: Transform Invariance

- Top left: original figure
- Bottom left: the two bases discovered
- Bottom right:
  - Left panel: positions of "a"
  - Right panel: positions of "l"
- Top right: estimated distribution underlying the original figure

SLIDE 77

Transform Invariance: Uses and Limitations

- Not very useful for analyzing audio
- May be used to analyze images and video
- Main restriction: computational complexity
  - Requires unreasonable amounts of memory and CPU
  - Efficient implementation remains an open issue

SLIDE 78

Example: Higher dimensional data

- Video example

SLIDE 79

Summary

- Shift invariance
  - Multinomial bases can be "patches"
    - Representing time-frequency events in audio, or other larger patterns in images
- Transform invariance
  - The patches may further be transformed to compose an image
    - Not useful for audio