

SLIDE 1

Correspondence retrieval

Alexandr Andoni†, Daniel Hsu†, Kevin Shi†, Xiaorui Sun♯

†Columbia University, ♯Simons Institute for the Theory of Computing

July 8th, 2017


SLIDE 2

Problem setup

Correspondence retrieval

◮ The universe has unknown vectors $x_1, \dots, x_k \in \mathbb{R}^d$
◮ Sample measurement vectors $w_1, \dots, w_n$
◮ For each $w_i$, observe the unordered set $\{w_i^T x_1, \dots, w_i^T x_k\}$ (see the sketch below)

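As a concrete illustration, here is a minimal numpy sketch of this observation model; the dimensions and the hidden vectors are arbitrary choices for the demo, not values from the talk.

```python
import numpy as np

# Minimal sketch of the correspondence retrieval observation model.
d, k, n = 5, 3, 6
rng = np.random.default_rng(0)

X = rng.standard_normal((k, d))        # unknown vectors x_1, ..., x_k
W = rng.standard_normal((n, d))        # measurement vectors w_1, ..., w_n

# For each w_i we observe the k inner products as an *unordered* set;
# shuffling destroys the correspondence between values and the x_j's.
observations = []
for i in range(n):
    values = W[i] @ X.T                # (w_i^T x_1, ..., w_i^T x_k)
    rng.shuffle(values)                # forget which value came from which x_j
    observations.append(values)
```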

SLIDE 3

Problem setup

Special case - phase retrieval (real-valued)

◮ The universe has a single unknown vector $x$
◮ Sample measurement vectors $w_1, \dots, w_n$
◮ For each $w_i$, observe $|w_i^T x|$

This is obtained by setting $k = 2$ and $x = \frac{1}{2}(x_1 - x_2)$, i.e. $x_1 = x$ and $x_2 = -x$.

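A small sketch of this embedding, checking that the unordered pair for $x_1 = x$, $x_2 = -x$ carries the same information as $|w^T x|$ (all concrete values are illustrative):

```python
import numpy as np

# With x_1 = x and x_2 = -x, the unordered pair {w^T x_1, w^T x_2}
# equals {w^T x, -w^T x}, which determines exactly |w^T x|.
rng = np.random.default_rng(1)
d = 4
x = rng.standard_normal(d)
w = rng.standard_normal(d)

x1, x2 = x, -x                          # so that x = (x1 - x2) / 2
pair = {w @ x1, w @ x2}                 # the unordered set we observe
assert max(pair) == abs(w @ x)          # recovers the phase retrieval value
```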

SLIDE 4

Related work

Mixture of linear regressions [YCS14] [YCS16]

◮ The universe has $k$ hidden model parameters $x_1, \dots, x_k$
◮ For each $i = 1, \dots, n$, sample a multinomial random variable $z_i$ and a measurement vector $w_i$
◮ Observe response-covariate pairs $\{(y_i, w_i)\}_{i=1}^{n}$ (as sketched below) such that

$$y_i = \sum_{j=1}^{k} \langle w_i, x_j \rangle \cdot \mathbb{1}(z_i = j)$$

Algorithms

◮ [YCS16] give an efficient inference algorithm with sample complexity $\tilde{O}(k^{10} d)$
◮ Uses tensor decomposition for mixture models and alternating minimization

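For comparison with the correspondence retrieval sketch above, here is the mixed-regression observation process in the same style; the uniform choice of $z_i$ is an assumption made only for the demo.

```python
import numpy as np

# Sketch of the mixed linear regression model: each response y_i comes
# from one of k hidden regressors, chosen by a latent label z_i.
d, k, n = 5, 3, 200
rng = np.random.default_rng(2)

X = rng.standard_normal((k, d))        # hidden model parameters x_1, ..., x_k
W = rng.standard_normal((n, d))        # covariates w_1, ..., w_n
z = rng.integers(0, k, size=n)         # latent component labels z_i

y = np.einsum("ij,ij->i", W, X[z])     # y_i = <w_i, x_{z_i}>
pairs = list(zip(y, W))                # observed response-covariate pairs
```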

SLIDE 5

Main result

Theorem

Assume the following conditions:
◮ $n \geq d + 1$
◮ $w_i \overset{\text{i.i.d.}}{\sim} N(0, I_d)$ for $i = 1, \dots, n$
◮ $x_1, \dots, x_k$ are linearly independent with condition number $\lambda(X)$

Then there is an efficient algorithm which solves correspondence retrieval using the $n$ measurement vectors.

The algorithm introduces a tool that is nonstandard in this area - the LLL lattice basis reduction algorithm.


SLIDE 6

Comparison with related work

Mixture of linear regressions

◮ Each sample vector $w_i$ corresponds to $k$ samples in the mixture model
◮ Previous result: $\tilde{O}(k^{10} d)$ samples
◮ Our result: $k(d + 1)$ samples

Real-valued phase retrieval

◮ Previous result: $2d - 1$ measurement vectors can recover all possible hidden $x$ [BCE08]
◮ Our result: $d + 1$ measurement vectors suffice to recover any single hidden $x$ with high probability


SLIDE 7

Main idea - reduction to Subset Sum

Subset sum

Given integers $\{a_i\}_{i=1}^{n}$ and a target sum $M$, determine whether there are $z_i \in \{0, 1\}$ such that

$$\sum_{i=1}^{n} z_i a_i = M$$

Complexity

◮ Subset Sum is NP-hard in the worst case, but easy in the average case where the $a_i$'s are uniformly distributed [LO85]
◮ We extend this to the case where $\sum_{i=1}^{n} z_i a_i$ just needs to satisfy anti-concentration inequalities at every point

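To make the problem statement concrete, here is a sketch of the textbook pseudo-polynomial check for Subset Sum; the talk's point is that average-case instances instead fall to lattice methods.

```python
# Standard pseudo-polynomial dynamic program for Subset Sum, shown only
# to make the definition concrete (all concrete numbers are made up).
def subset_sum(a, M):
    """Return True iff some subset of the integers a sums to M."""
    reachable = {0}
    for ai in a:
        reachable |= {s + ai for s in reachable}
    return M in reachable

assert subset_sum([3, 7, 12, 31], 19) is True    # 7 + 12
assert subset_sum([3, 7, 12, 31], 6) is False
```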

SLIDE 8

Lattices

Definition (Lattice)

Given a collection of linearly independent vectors $b_1, \dots, b_m \in \mathbb{R}^d$, the lattice $\Lambda_B$ over the basis $B = \{b_1, \dots, b_m\}$ is the $\mathbb{Z}$-module generated by $B$, as embedded in $\mathbb{R}^d$:

$$\Lambda_B = \left\{ \sum_{i=1}^{m} z_i b_i \;:\; z_i \in \mathbb{Z} \right\}$$

Shortest vector problem

Given a lattice basis $B \subset \mathbb{R}^d$, find the lattice vector $Bz \in \Lambda_B$ such that

$$z = \arg\min_{z \in \mathbb{Z}^m \setminus \{0\}} \|Bz\|_2^2$$

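A toy sketch of both definitions: for a tiny basis we can find the shortest vector by brute force over a small coefficient box (hypothetical basis, illustration only).

```python
import itertools
import numpy as np

# A lattice is {Bz : z integer}; SVP asks for the nonzero z minimizing
# ||Bz||_2. In tiny dimensions we can simply enumerate coefficients.
def shortest_vector_bruteforce(B, radius=3):
    """Brute-force SVP over z in {-radius, ..., radius}^m.

    Toy illustration only; the search space grows exponentially in m.
    """
    m = B.shape[1]
    best_z, best_norm = None, np.inf
    for z in itertools.product(range(-radius, radius + 1), repeat=m):
        if any(z):                      # skip z = 0
            norm = np.linalg.norm(B @ np.array(z))
            if norm < best_norm:
                best_z, best_norm = np.array(z), norm
    return best_z, best_norm

B = np.array([[1.0, 1.7], [0.0, 0.9]])  # basis vectors as columns
z, norm = shortest_vector_bruteforce(B)
```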

SLIDE 9

Shortest Vector Problem

Hardness of approximation

Shortest vector problem is NP-hard to approximate to within a constant factor.

LLL Lattice Basis Reduction [LLL82]

There is an efficient approximation algorithm for the Shortest Vector Problem.
◮ Approximation factor: $2^{d/2}$
◮ Running time: $\mathrm{poly}(d, \log \lambda(B))$

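A minimal textbook version of LLL, written from the standard description in [LLL82] rather than from the paper; recomputing Gram-Schmidt at every step keeps the sketch short at the cost of efficiency.

```python
import numpy as np

# Textbook LLL basis reduction (delta = 3/4); a pedagogical sketch, not
# the authors' implementation. Rows of B are the basis vectors.
def lll_reduce(B, delta=0.75):
    B = np.array(B, dtype=float)
    n = len(B)

    def gram_schmidt(B):
        # Orthogonalize the rows, tracking projection coefficients mu.
        Bstar = np.zeros_like(B)
        mu = np.zeros((n, n))
        for i in range(n):
            Bstar[i] = B[i]
            for j in range(i):
                mu[i, j] = (B[i] @ Bstar[j]) / (Bstar[j] @ Bstar[j])
                Bstar[i] -= mu[i, j] * Bstar[j]
        return Bstar, mu

    k = 1
    while k < n:
        Bstar, mu = gram_schmidt(B)
        # Size reduction: force |mu[k, j]| <= 1/2 for all j < k.
        for j in range(k - 1, -1, -1):
            q = round(mu[k, j])
            if q != 0:
                B[k] -= q * B[j]
                Bstar, mu = gram_schmidt(B)
        # Lovasz condition: advance if b*_k is long enough, else swap.
        if Bstar[k] @ Bstar[k] >= (delta - mu[k, k - 1] ** 2) * (Bstar[k - 1] @ Bstar[k - 1]):
            k += 1
        else:
            B[[k - 1, k]] = B[[k, k - 1]]
            k = max(k - 1, 1)
    return B

# Example: a skewed 2D basis reduces to short, nearly orthogonal vectors.
print(lll_reduce([[201, 37], [1648, 297]]))
```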

SLIDE 10

Proof Overview

1. Reduce the correspondence retrieval problem to the shortest vector problem in a lattice with basis $B$: $\arg\min_{z \in \mathbb{Z}^{dk+1}} \|Bz\|_2$
2. Show that the coefficient vector $z$ with 1's in the correct correspondences produces a lattice vector of norm $\sqrt{d+1}$
3. Show that for a fixed, incorrect $z$, with high probability $\|Bz\|_2 \geq 2^{(dk+1)/2} \sqrt{d+1}$ over the randomness of the $w_i$'s
4. Under appropriate scaling and a union bound argument, every incorrect $z$ produces a lattice vector with norm at least $2^{(dk+1)/2} \sqrt{d+1}$

Since LLL returns a lattice vector within a factor $2^{(dk+1)/2}$ of the shortest, steps 2-4 force the vector it finds to come from the correct $z$.

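The paper's basis for step 1 is not reproduced here; as a loose stand-in for the same pattern, the classical Lagarias-Odlyzko reduction of Subset Sum [LO85] also plants the correct 0/1 vector as an unusually short lattice vector, with a large scaling constant playing the role of the norm gap in steps 3-4. On most random instances the LLL sketch from the previous slide recovers the planted subset.

```python
import numpy as np

# Hypothetical illustration only: the Lagarias-Odlyzko lattice for Subset
# Sum, not the paper's correspondence retrieval basis. Reuses lll_reduce
# from the previous slide; all concrete numbers are made up for the demo.
a = [366, 295, 127, 531, 808, 990, 54]
M = 295 + 531 + 54                       # plant the subset {a_2, a_4, a_7}
n = len(a)
C = 10 * sum(a)                          # any z with sum(z_i a_i) != M
                                         # costs at least C in norm

# Rows are basis vectors: an identity block tags which a_i are chosen,
# and the scaled last column enforces the target sum.
B = np.zeros((n + 1, n + 1))
B[:n, :n] = np.eye(n)
B[:n, n] = [C * ai for ai in a]
B[n, n] = C * M

# The planted solution is the lattice vector (z_1, ..., z_n, 0) of norm
# sqrt(3) -- far below C, so even an approximate shortest vector must
# have last coordinate 0.
reduced = lll_reduce(B)
for row in reduced:
    if row[n] == 0 and row.any() and set(np.abs(row[:n]).astype(int)) <= {0, 1}:
        print(np.abs(row[:n]).astype(int))   # 0/1 indicator of the subset
```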


SLIDE 14

Recap

◮ We defined a new observation model, loosely inspired by mixture models, which also generalizes phase retrieval
◮ We showed that this observation model admits exact inference with lower sample complexity than either of the above two models
◮ We gave an algorithm based on a completely different technique - the LLL basis reduction algorithm


SLIDE 15

Thanks for listening!
