Multi-scale Geometric Summaries for Similarity-based Upstream Sensor Fusion
Christopher Tralie, Paul Bendich, John Harer
Duke University, ECE / Math
3/6/2019

Overall Goals / Design Choices
⊲ Leverage multiple, heterogeneous modalities in identification
⊲ Develop general tools without domain-specific models
⊲ Techniques are unsupervised (no training data required)

OuluVS2 Digits Dataset
⊲ 51 speakers
⊲ 10 sequences, 3 instances per speaker per sequence
⊲ Video from multiple points of view, audio
http://www.ee.oulu.fi/research/imag/OuluVS2/index.html

Why Digits?
⊲ Modalities capture different aspects (“p” versus “b”)
⊲ Variation across speakers and across runs
⊲ Even after uniform scaling, the raw audio signals do not align perfectly in time

Problems And Success Metrics
⊲ Decompose the set of digit strings in various ways:
◮ by digit string, by speaker, by speaker and digit string
⊲ Goal is to come up with a similarity ranking mechanism $\mu$ such that
◮ for each object $s$, $\mu(s, t)$ is larger when $t$ is in the same class as $s$ (Rusinkiewicz and Funkhouser 2009)

Problems And Success Metrics
⊲ Success evaluated by precision-recall curves for each object $s$
⊲ Recall: proportion of the items in the class of $s$ that have been retrieved so far, walking down the list ordered by similarity to $s$
⊲ Precision: proportion of the items retrieved so far that are actually in the class of $s$

Problems And Success Metrics
⊲ Success evaluated by precision-recall curves for each object $s$
⊲ Report average P-R curves
⊲ Area under the average P-R curve is the mean average precision (MAP)
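The evaluation described above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names, the convention that a larger $\mu(s, t)$ means "more similar", and the choice to approximate the area under each P-R curve by averaging precision at every recall step are all assumptions made here.

```python
import numpy as np

def precision_recall(scores, labels, query):
    """P-R points for one query object s.

    scores: (N, N) array, scores[s, t] = mu(s, t), larger = more similar.
    labels: (N,) class ids. The query itself is left out of the ranking.
    Precision and recall are reported at each rank where a same-class item appears.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    others = np.array([t for t in range(len(labels)) if t != query])
    order = others[np.argsort(-scores[query, others], kind="stable")]
    relevant = labels[order] == labels[query]
    hits = np.cumsum(relevant)                # same-class items found so far
    ranks = np.arange(1, len(order) + 1)      # items retrieved so far
    precision = hits[relevant] / ranks[relevant]
    recall = hits[relevant] / max(relevant.sum(), 1)
    return precision, recall

def mean_average_precision(scores, labels):
    """Mean over queries of the average precision (area-under-curve proxy)."""
    aps = []
    for q in range(len(labels)):
        p, _ = precision_recall(scores, labels, q)
        if len(p) > 0:                        # skip queries with no classmates
            aps.append(p.mean())
    return float(np.mean(aps))

# Toy example: two classes of two objects each, perfectly separated.
labels = [0, 0, 1, 1]
scores = np.array([[1.0, 0.9, 0.1, 0.2],
                   [0.9, 1.0, 0.1, 0.2],
                   [0.1, 0.2, 1.0, 0.9],
                   [0.2, 0.1, 0.9, 1.0]])
map_perfect = mean_average_precision(scores, labels)
```

A ranking that places every same-class item first gives a MAP of 1; the lower a classmate falls in the ranked list, the lower the precision at that recall level.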

Other approaches, our pipeline(s)
⊲ Many approaches (including ours) construct $\mu$ by mapping strings into a feature space
⊲ Lots of deep learning approaches (Lopez and Sukno, 2018)
⊲ HMM per class, use canonical correlation analysis to learn good ways to extract fused audio/visual features (Sargin et al., 2007)
⊲ We propose a set of entirely unsupervised pipelines
◮ Labeled examples are used only to evaluate, not to train

Self-Similarity Matrices (SSMs)
$D_{ij} = \| X_i - X_j \|_2$
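As a minimal sketch of the definition above, assuming the time series is stored as a NumPy array with one sample $X_i$ per row (the function name is hypothetical):

```python
import numpy as np

def self_similarity_matrix(X):
    """Return D with D[i, j] = ||X[i] - X[j]||_2 for a (T, dim) time series X."""
    X = np.asarray(X, dtype=float)
    sq = np.sum(X ** 2, axis=1)
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b, computed for all pairs at once
    D2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.maximum(D2, 0.0, out=D2)   # clip tiny negatives caused by rounding
    D = np.sqrt(D2)
    np.fill_diagonal(D, 0.0)      # distance from a sample to itself is exactly 0
    return D

# Two samples in the plane, at distance 5 from each other.
D = self_similarity_matrix([[0.0, 0.0], [3.0, 4.0]])
```

The result is a symmetric $T \times T$ matrix with a zero diagonal, independent of the dimension of the space the samples live in.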

Why SSMs?
Imran N. Junejo et al. “View-independent action recognition from temporal self-similarities”. In: IEEE Transactions on Pattern Analysis and Machine Intelligence 33.1 (2011), pp. 172–185

SSMs on Our Data
Video:
⊲ Extract lip region from each frame and rescale to 25 × 25 grayscale
⊲ Treat as time series in 25 × 25 = 625 dimensional Euclidean space
Audio:
⊲ Break audio signal into overlapping windows
⊲ Summarize each window via 20 MFCC coefficients
⊲ Treat as time series in 20 dimensional Euclidean space
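The two preprocessing steps above that need no external library can be sketched as follows. The function names, window length, and hop size are hypothetical; lip-region extraction and the MFCC computation itself are not shown, since the slides do not say which tools were used for them.

```python
import numpy as np

def frames_to_series(frames):
    """Flatten a (T, 25, 25) stack of grayscale lip crops into a (T, 625) time series."""
    frames = np.asarray(frames, dtype=float)
    return frames.reshape(frames.shape[0], -1)

def overlapping_windows(signal, win, hop):
    """Split a 1-D signal into windows of length `win` starting every `hop` samples.

    Windows overlap whenever hop < win. Assumes len(signal) >= win.
    Each row would then be summarized by MFCCs in a separate step.
    """
    signal = np.asarray(signal, dtype=float)
    n = 1 + (len(signal) - win) // hop
    idx = np.arange(win)[None, :] + hop * np.arange(n)[:, None]
    return signal[idx]

video_series = frames_to_series(np.zeros((7, 25, 25)))   # 7 frames
audio_windows = overlapping_windows(np.arange(10), win=4, hop=2)
```

Either output is a time series with one row per time step, which is the form an SSM computation expects.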

Similarity Network Fusion (SNF)
⊲ Transform several weight matrices $W_1, \ldots, W_m$ into one that (hopefully) has the best qualities of all
⊲ Based on random walks with cross-talk between matrices for probabilities (works best if modalities are complementary)
Bo Wang et al. “Unsupervised metric fusion by cross diffusion”. In: Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE. 2012, pp. 2997–3004
Bo Wang et al. “Similarity network fusion for aggregating data types on a genomic scale”. In: Nature Methods 11.3 (2014), p. 333
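The cross-diffusion idea above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the reference SNF implementation and not necessarily what was used for the results in this talk: the inputs are taken to be symmetric affinity matrices with strictly positive off-diagonal entries, at least two matrices are given, `k` is smaller than the number of points, and the row renormalization after each step is a stabilizing choice made here. All function names are hypothetical.

```python
import numpy as np

def full_kernel(W):
    """Row-normalized transition matrix with half of each row's mass on the diagonal."""
    W = np.array(W, dtype=float)
    np.fill_diagonal(W, 0.0)
    P = W / (2.0 * W.sum(axis=1, keepdims=True))
    np.fill_diagonal(P, 0.5)
    return P

def knn_kernel(W, k):
    """Sparse transition matrix that keeps only each row's k strongest neighbors."""
    W = np.array(W, dtype=float)
    np.fill_diagonal(W, 0.0)
    nbrs = np.argsort(-W, axis=1)[:, :k]
    rows = np.arange(W.shape[0])[:, None]
    S = np.zeros_like(W)
    S[rows, nbrs] = W[rows, nbrs]
    return S / S.sum(axis=1, keepdims=True)

def snf(Ws, k=3, n_iter=20):
    """Fuse m >= 2 affinity matrices by cross diffusion (simplified sketch)."""
    m = len(Ws)
    Ps = [full_kernel(W) for W in Ws]
    Ss = [knn_kernel(W, k) for W in Ws]
    for _ in range(n_iter):
        new_Ps = []
        for v in range(m):
            # Cross-talk: diffuse the average of the OTHER modalities' kernels
            # through this modality's local neighborhood structure.
            others = sum(Ps[u] for u in range(m) if u != v) / (m - 1)
            P = Ss[v] @ others @ Ss[v].T
            new_Ps.append(P / P.sum(axis=1, keepdims=True))
        Ps = new_Ps
    fused = sum(Ps) / m
    return (fused + fused.T) / 2.0

# Toy example: two affinity matrices built from two noisy views of 6 points.
rng = np.random.default_rng(0)
pts = rng.normal(size=(6, 2))
views = [pts + 0.1 * rng.normal(size=pts.shape) for _ in range(2)]
Ws = [np.exp(-np.linalg.norm(V[:, None] - V[None, :], axis=2)) for V in views]
fused = snf(Ws, k=3, n_iter=10)
```

Each modality keeps its own sparse neighborhood graph fixed and only exchanges the dense kernels, which is where the "cross-talk" between matrices comes from.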

SNF for Early Audio-Visual Fusion
⊲ We use SNF to fuse MFCC (audio) and lip pixel (video) SSMs
[Figure: SSM panels labeled $W_A$, $W_v$, $W_F$ for the digit string 9 7 4 4 4 3 5 5 8 7, with annotated blocks a: repeating 4s, b: repeating 5s, c: repeating 7s]

How To Compare (Fused) SSMs?
⊲ Each string $s$ is transformed into SSMs $W_A(s)$, $W_v(s)$, then fused into $W_F(s)$
⊲ How to compare $W_F(s)$ with $W_F(s')$? Could just use $\ell_2$ (matrix Frobenius norm)

Measuring Similarity between SSMs
⊲ Each string $s$ is transformed into SSMs $W_A(s)$, $W_v(s)$, then fused into $W_F(s)$
⊲ How to compare $W_F(s)$ with $W_F(s')$? Could just use $\ell_2$ (matrix Frobenius norm)
⊲ Local delays (time warps) induce local perturbations in SSMs
⊲ The $\ell_2$ norm is unstable to these perturbations

The Scattering Transform
⊲ Instead of $\ell_2$, use the scattering transform on SSMs
◮ Has nice theoretical stability properties
Laurent Sifre and Stéphane Mallat. “Rotation, scaling and deformation invariant

The Scattering Transform: A Few Details
⊲ Given an $N \times N$ image $I(u, v)$, choose a lowpass filter $\phi(u, v)$
⊲ Level 0: $S^0(u, v) = I * \phi(u, v)$
⊲ There are $d \times d$ coefficients in total at this level: $d = N / 2^{J-1}$, where $J$ is the max scale

The Scattering Transform: A Few Details
⊲ Now choose a mother wavelet $\psi(u, v)$, a set of $L$ directions $\gamma_i$, and a set of $J$ scales $j \in \{0, 1, \ldots, J-1\}$
⊲ Level 1: $S^1_{i,j}(u, v) = | I * 2^{-2j} \psi_{\gamma_i}(u/2^j, v/2^j) | * \phi(u, v)$
Using complex Gabor wavelets: $\psi_\gamma = e^{i \gamma \cdot (u, v)} e^{-(u^2 + v^2)/\sigma^2}$

The Scattering Transform: A Few Details
⊲ Now choose a mother wavelet $\psi(u, v)$, a set of $L$ directions $\gamma_i$, and a set of $J$ scales $j \in \{0, 1, \ldots, J-1\}$
⊲ Level 1: $S^1_{i,j}(u, v) = | I * 2^{-2j} \psi_{\gamma_i}(u/2^j, v/2^j) | * \phi(u, v)$
There are $d^2 L J$ level 1 coefficients
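Levels 0 and 1 can be sketched with plain FFT-based circular convolution. This is an illustrative sketch only, not the filter bank used for the results in this talk: the carrier frequency `xi`, the envelope width `sigma`, the Gaussian lowpass $\phi$ with width $2^{J-1}$, the subsampling by $2^{J-1}$, and all function names are assumptions made here.

```python
import numpy as np

def centered_grid(N):
    """Integer coordinates (u, v) centered on the middle of an N x N image."""
    c = np.arange(N) - N // 2
    return np.meshgrid(c, c, indexing="ij")

def gabor(N, theta, j, xi=3 * np.pi / 4, sigma=0.8):
    """Dilated complex Gabor wavelet 2^{-2j} psi(u / 2^j, v / 2^j), direction theta."""
    u, v = centered_grid(N)
    us, vs = u / 2.0 ** j, v / 2.0 ** j
    carrier = np.exp(1j * xi * (us * np.cos(theta) + vs * np.sin(theta)))
    envelope = np.exp(-(us ** 2 + vs ** 2) / sigma ** 2)
    return 2.0 ** (-2 * j) * carrier * envelope

def lowpass(N, J):
    """Gaussian lowpass phi, normalized to sum to 1 (assumed form)."""
    u, v = centered_grid(N)
    s = 2.0 ** (J - 1)
    phi = np.exp(-(u ** 2 + v ** 2) / (2.0 * s ** 2))
    return phi / phi.sum()

def circ_conv(a, kernel):
    """Circular convolution; `kernel` is centered, so shift its center to index 0."""
    return np.fft.ifft2(np.fft.fft2(a) * np.fft.fft2(np.fft.ifftshift(kernel)))

def scattering_levels_0_1(I, J=4, L=8):
    """Return S0 with shape (d, d) and S1 with shape (L, J, d, d), d = N / 2^(J-1)."""
    N = I.shape[0]
    step = 2 ** (J - 1)
    d = N // step
    phi = lowpass(N, J)
    S0 = circ_conv(I, phi).real[::step, ::step]
    S1 = np.empty((L, J, d, d))
    for i in range(L):
        theta = np.pi * i / L                         # equally spaced in [0, pi)
        for j in range(J):
            U = np.abs(circ_conv(I, gabor(N, theta, j)))   # wavelet, then modulus
            S1[i, j] = circ_conv(U, phi).real[::step, ::step]
    return S0, S1

# Small random image so the example runs quickly.
rng = np.random.default_rng(0)
S0, S1 = scattering_levels_0_1(rng.random((64, 64)), J=3, L=4)
```

With $N = 64$, $J = 3$, $L = 4$ this gives $d = 16$, so `S1` holds $d^2 L J = 3072$ values, matching the count above.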

The Scattering Transform: A Few Details
⊲ Level 2: $S^2_{i,j,k,l}(u, v) = \big| | I * 2^{-2j} \psi_{\gamma_i}(u/2^j, v/2^j) | * 2^{-2l} \psi_{\gamma_k}(u/2^l, v/2^l) \big| * \phi(u, v)$ (1)
⊲ There are $d^2 L^2 J (J-1)/2$ level 2 coefficients

The Scattering Transform: A Few Details
⊲ One can continue past level 2, but we stop there
⊲ Repeating the same steps (convolve with a wavelet, take the complex modulus, low-pass filter) gives a CNN-style architecture, but unsupervised
⊲ Each choice of wavelets in sequence is called a path

Scattering Transform As Feature Extractor
⊲ Resize each SSM to 256 × 256 resolution
⊲ Take $L = 8$ equally spaced directions between 0 and $\pi$
⊲ Take $J = 4$ scales, so that each path is 32 × 32
⊲ Results in $32^2 (1 + 4 \times 8 + 8^2 \times 4 \times 3/2) = 427{,}008$ scattering coefficients extracted from each SSM (about 6.5x the size of the 256 × 256 SSM, but stable)
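The count above follows directly from the per-level formulas $d^2$, $d^2 L J$, and $d^2 L^2 J(J-1)/2$ with $d = N / 2^{J-1}$. A short check (the function name is hypothetical):

```python
def scattering_counts(N=256, J=4, L=8):
    """Number of scattering coefficients at levels 0, 1, 2 for an N x N input."""
    d = N // 2 ** (J - 1)                       # side length of each path
    level0 = d * d
    level1 = d * d * L * J
    level2 = d * d * L * L * J * (J - 1) // 2
    return d, level0, level1, level2

d, n0, n1, n2 = scattering_counts()
total = n0 + n1 + n2
ratio = total / 256 ** 2                        # size relative to the input SSM
print(d, n0, n1, n2, total, ratio)
```

For these settings the level 2 paths account for 393,216 of the 427,008 coefficients, so almost all of the feature vector comes from second-order interactions.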

Scattering Transform As Feature Extractor
⊲ Example scattering SSM

SNF for Late Audio-Visual Fusion
⊲ Everything so far has happened upstream: before ranking decisions are made
⊲ Can also apply SNF downstream
⊲ Given object-level metrics $\mu_1, \ldots, \mu_k$ on a set of $N$ objects (strings)
⊲ Each one produces an object-level SSM, and these can themselves be fused into a new SSM
⊲ We apply that here with $k = 3$ (audio, visual, early fused)

Results: Digit String Identification

Results: Digit String Identification, Simulated Noise
[Figure: results at simulated noise levels, PSNR (dB): ∞, 26, 20, 16.5, 14, 12, 10.5]
