SLIDE 1

Robust Models in Information Retrieval

Nedim Lipka Benno Stein Bauhaus-Universität Weimar

[www.webis.de]

SLIDE 2

Robust Models in Information Retrieval

Outline · Introduction · Bias and Variance · Robust Models in IR · Summary · Excursus: Bias Types

SLIDE 3

Introduction

© stein TIR’11

SLIDE 6

Introduction

Classification Task

Given:

❑ a set O of real-world objects o
❑ a feature space X with feature vectors x
❑ a classification function c : X → Y (closed form unknown)
❑ a sample S = {(x, y) | x ∈ X, y = c(x)}

Searched:

❑ a hypothesis h ∈ H that minimizes P(h(x) ≠ c(x)) = err(h), the generalization error.

Measuring the effectiveness of h:

❑ err_S(h) = (1/|S|) · Σ_{x∈S} loss_{0/1}(h(x), c(x)); err_S(h) is called the test error if S is not used for the construction of h.

❑ err(h∗) := min_{h∈H} err(h) defines a lower bound for err(h) ➜ restriction bias.
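As a minimal sketch of the definition above, err_S under the 0/1 loss can be computed as follows; the threshold classifier and the five labeled points are hypothetical, not from the experiments:

```python
def err_S(h, S):
    """Empirical 0/1 error of hypothesis h on a labeled sample S = [(x, y), ...]."""
    return sum(1 for x, y in S if h(x) != y) / len(S)

# Toy example: a threshold classifier on 1-D features (hypothetical data).
h = lambda x: 1 if x >= 0.5 else 0
S = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1), (0.45, 1)]
print(err_S(h, S))  # one of five examples is misclassified -> 0.2
```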

SLIDE 11

Introduction

Model Formation Task

The process (the function) α for deriving x from o is called model formation: α : O → X.

Choosing between different model formation functions α1, …, αm
➜ choosing between different feature spaces X_α1, …, X_αm
➜ choosing between different hypothesis spaces H_α1, …, H_αm

[Figure: feature vectors x in the feature spaces X_α1, …, X_αm and hypotheses h in the corresponding hypothesis spaces H_α1, …, H_αm]

We call the model under α1 more robust than the model under α2 ⇔

err_S(h∗_α1) > err_S(h∗_α2)  and  err(h∗_α1) < err(h∗_α2)

SLIDE 14

Introduction

The Whole Picture

[Diagram: real-world object classification maps the objects O to the classes Y; model formation α maps O to the feature space X; feature vector classification c maps X to Y]

Learning means searching for an h ∈ H such that P(h(x) ≠ c(x)) is minimal.

SLIDE 15

Bias and Variance

SLIDE 18

Bias and Variance

Error Decomposition

Consider:

❑ a feature vector x and its predicted class label ŷ = h(x), where
❑ h is characterized by a weight vector θ, and
❑ θ has been estimated from a random sample S = {(x, c(x))}.
➜ θ ≡ θ(S), and hence h ≡ h(θ_S)

Observations:

❑ A series of samples S_i, S_i ⊆ U, entails a series of hypotheses h(θ_i),
❑ giving for a feature vector x a series of class labels ŷ_i = h(θ_i, x).
➜ ŷ is considered a random variable, denoted Z.

Consequences:

❑ σ²(Z) is the variance of Z (= the variance of the prediction)
❑ |θ| : |S| ↑ ➜ σ²(Z) ↑ (more parameters per training example)
❑ |S| : |U| ↓ ➜ σ²(Z) ↑ (less representative sample)
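A small simulation can make the variance of the prediction concrete; the population U, the midpoint-threshold learner, and all constants below are hypothetical illustrations, not part of the slides:

```python
import random
import statistics

random.seed(0)
# Hypothetical population U: 1-D feature x, class label 1 iff x > 0.
U = [(x, int(x > 0)) for x in (random.uniform(-1, 1) for _ in range(10_000))]

def fit_threshold(S):
    """theta(S): midpoint between the class means of the sample."""
    m0 = statistics.mean(x for x, y in S if y == 0)
    m1 = statistics.mean(x for x, y in S if y == 1)
    return (m0 + m1) / 2

x_query = 0.05                  # a fixed feature vector x
preds = []                      # class labels y_i = h(theta_i, x)
for _ in range(200):            # a series of samples S_i drawn from U
    S_i = random.sample(U, 30)
    preds.append(int(x_query > fit_threshold(S_i)))

print(statistics.variance(preds))   # sigma^2(Z), the variance of the prediction
```

Because each sample S_i yields a slightly different θ_i, the label at the fixed x fluctuates, which is exactly why ŷ is treated as a random variable Z.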

SLIDE 26

Bias and Variance

Error Decomposition (continued)

Let Z and Y denote the random variables for ŷ (= h(θ_S, x)) and y (= c(x)), with Z and Y independent.

MSE(Z) = E((Z − Y)²)
       = E(Z² − 2·Z·Y + Y²)
       = E(Z²) − 2·E(Z·Y) + E(Y²)
       = (E(Z))² + σ²(Z) − 2·E(Z·Y) + E(Y²)
       = (E(Z))² + σ²(Z) − 2·E(Z·Y) + (E(Y))² + σ²(Y)
       = (E(Z))² − 2·E(Z·Y) + (E(Y))² + σ²(Y) + σ²(Z)
       = (E(Z) − E(Y))² + σ²(Y) + σ²(Z)        [using E(Z·Y) = E(Z)·E(Y) by independence]
       = (E(Z − Y))² + σ²(Z) + σ²(Y)
       = (bias(Z))² + σ²(Z) + IrreducibleError

If Y is constant:

MSE(Z) = (E(Z) − Y)² + σ²(Z)

When analyzing MSE, bias, and σ² of a classifier h, the average over all examples of the test set is taken.
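The constant-Y case of the decomposition can be checked numerically; the shrunken-mean estimator below is a hypothetical example of a biased predictor, chosen only so that both the bias and the variance term are nonzero:

```python
import random
import statistics

random.seed(1)
Y = 2.0  # constant target (the "Y is constant" case)

def estimate(n=10):
    """Hypothetical biased estimator: shrunken mean of n noisy observations."""
    obs = [Y + random.gauss(0, 1) for _ in range(n)]
    return 0.9 * statistics.mean(obs)

Z = [estimate() for _ in range(100_000)]
mse = statistics.mean((z - Y) ** 2 for z in Z)
bias = statistics.mean(Z) - Y
var = statistics.pvariance(Z)

# For sample moments the identity holds exactly: MSE = (E(Z) - Y)^2 + sigma^2(Z),
# so the two printed values agree up to floating-point error.
print(round(mse, 3), round(bias ** 2 + var, 3))
```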

SLIDE 30

Bias and Variance

Illustration

[Figure: MSE, Bias², and Variance over the number of parameters (hypothesis complexity), from low to high; two parameter vectors θ1 and θ2 are marked, with err_S(h∗_α1) > err_S(h∗_α2).]

Comparing two model-classifier combinations under a sample S.

SLIDE 31

Bias and Variance

Illustration

[Figure: the same MSE, Bias², and Variance curves over hypothesis complexity, now with err(h∗_α1) < err(h∗_α2).]

The same model-classifier combinations under a sample S′ with |S′| ≫ |S|.
➜ The model under α1 is more robust than the model under α2.
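The effect in the two illustrations can be reproduced in miniature; everything below (the sine target, the noise level, the polynomial degrees standing in for |θ|) is an illustrative assumption, not the slides' experiment:

```python
import numpy as np

rng = np.random.default_rng(0)
f = np.sin                                   # hypothetical true target function
x_S = rng.uniform(0, 3, 10)                  # small training sample S
y_S = f(x_S) + rng.normal(0, 0.1, 10)
x_U = np.linspace(0, 3, 1000)                # large "application set" U

results = {}
for deg in (2, 9):                           # simple vs. complex hypothesis
    h = np.polynomial.Polynomial.fit(x_S, y_S, deg)
    err_S = float(np.mean((h(x_S) - y_S) ** 2))   # sample error
    err = float(np.mean((h(x_U) - f(x_U)) ** 2))  # proxy for the true error
    results[deg] = (err_S, err)
    print(f"degree {deg}: err_S={err_S:.4f}  err={err:.4f}")

# Typical outcome: the complex model wins on the sample but loses on U, i.e.
# err_S is smaller for degree 9 while err is smaller for degree 2.
```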

SLIDE 32

Bias and Variance

Preliminary Summary

❑ Even when training and test sets are chosen properly, a model selection decision may not be justified by error minimization.

❑ Rationale: the concept of representativeness gets lost for extreme ratios between the sample size and the application set in the wild (consider working against the web).
➜ The bias of the less complex classifier is over-estimated.
➜ The variance of the more complex classifier is under-estimated.

❑ This behavior is consistent with the bias-variance tradeoff.

SLIDE 33

Robust Models in IR

SLIDE 34

Robust Models in IR

Case Study I: Text Categorization

The model under α1 is more robust than the model under α2 ⇔

err_S(h∗_α1) > err_S(h∗_α2)  and  err(h∗_α1) < err(h∗_α2)

Experiment rationale:

❑ Topic classification for the web is learned on extremely small samples.
❑ The web generalization error of a classifier h cannot be computed.
➜ err(h) is usually unknown.
➜ Study the effect with a large (test) corpus in the role of the web, by comparing err_S(h_α) and err(h_α) for different α.

SLIDE 35

Robust Models in IR

Case Study I: Text Categorization

Experiment setup 1:

❑ Corpus: RCV1
❑ Corpus size: 663 768 documents
❑ Considered classes: corporate (292 348), economics (51 148), government (161 523), market (158 749)
❑ Sample size: 800, drawn i.i.d. from RCV1
❑ Ratio of sample to corpus: 0.0012
❑ Inductive learner: SVM with linear kernel
❑ Model formation functions α: 5 VSM variants
  1. α1: V = {[a-z]5∗}, |V| = 9951
  2. α2: V = {[a-z]4∗}, |V| = 6172
  3. α3: V = {[a-z]3∗}, |V| = 2729
  4. α4: V = {[a-z]2∗}, |V| = 464
  5. α5: V = {[a-z]∗}, |V| = 26
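The α variants above can be sketched as model formation functions that index a text by lowercase prefixes of a fixed length, so that shorter prefixes yield smaller vocabularies; the toy sentence below is hypothetical:

```python
from collections import Counter

def prefix_features(text, k):
    """Model formation: map a text to counts of length-k lowercase prefixes,
    i.e. one VSM variant per prefix length k (cf. V = {[a-z]^k *})."""
    tokens = [w.lower() for w in text.split() if w.isalpha()]
    return Counter(w[:k] for w in tokens)

text = "Markets rallied as corporate earnings beat economic forecasts"
for k in (5, 4, 3, 2, 1):   # alpha_1 ... alpha_5 use shrinking prefix lengths
    feats = prefix_features(text, k)
    print(f"k={k}: {len(feats)} index terms")
```

On a large corpus the number of distinct index terms shrinks sharply as k decreases, mirroring the vocabulary sizes 9951, …, 26 in the table.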

SLIDE 38

Robust Models in IR

Case Study I: Text Categorization

[Figure: error rate [%] (10-60) over the number of VSM index terms (9951, 6172, 2729, 464, 26 for α1, …, α5); the sample error err_S, the error err, the restriction bias, and the optimum model are marked.]

SLIDE 39

Robust Models in IR

Case Study I: Text Categorization

Experiment setup 2:

❑ Corpus: RCV1
❑ Corpus size: 663 768 documents
❑ Considered classes: corporate (292 348), economics (51 148), government (161 523), market (158 749)
❑ Sample size: 800, drawn i.i.d. from RCV1
❑ Ratio of sample to corpus: 0.0012
❑ Inductive learner: SVM with linear kernel
❑ Model formation functions α: 2 VSM variants
  1. α1: tf·idf weighting scheme
  2. α2: Boolean weighting scheme
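The two weighting schemes can be sketched as follows; the three miniature documents are hypothetical, and idf is taken here as log(N / df):

```python
import math
from collections import Counter

docs = [["market", "rallied", "market"],
        ["economic", "policy"],
        ["market", "policy"]]
N = len(docs)
df = Counter(t for d in docs for t in set(d))          # document frequency
idf = {t: math.log(N / df[t]) for t in df}

def tf_idf(doc):       # alpha1: tf*idf weighting
    tf = Counter(doc)
    return {t: tf[t] * idf[t] for t in tf}

def boolean(doc):      # alpha2: Boolean weighting
    return {t: 1 for t in set(doc)}

print(tf_idf(docs[0]))   # terms weighted by frequency and rarity
print(boolean(docs[0]))  # presence/absence only
```

The Boolean model discards all frequency information, so it spans a coarser feature space than tf·idf over the same vocabulary.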

SLIDE 42

Robust Models in IR

Case Study I: Text Categorization

[Figure: error rate [%] (10-60) for the weight types tf·idf (α1) and Boolean (α2); the sample error err_S, the error err, the restriction bias, and the optimum model are marked.]

SLIDE 43

Robust Models in IR

Case Study II: Web Genre Classification

Given a web page, classify it into one of the following 8 classes:

help, article, discussion, shop, non-personal home, personal home, link collection, download

Experiment rationale:

❑ The sizes of existing genre corpora vary between 200 and 2500 documents.
❑ The number of web genres in these corpora is between 3 and 16.
❑ Researchers report very good (too good?) classification results.
➜ The genre corpora are biased, e.g. because
  1. editors collect only their favored documents, and
  2. editors subconsciously introduce correlations between topic and genre.
➜ Classifiers learned with these corpora will not generalize well.
➜ Learn two classifiers hα1, hα2 on corpus A and measure their export accuracy on corpus B.
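The evaluation protocol in the last step can be sketched generically; the majority-class learner and the toy corpora below are hypothetical stand-ins for the SVM and the genre corpora:

```python
def accuracy(h, corpus):
    """Fraction of (x, y) pairs in corpus that h labels correctly."""
    return sum(1 for x, y in corpus if h(x) == y) / len(corpus)

def export_accuracy(learner, corpus_A, corpus_B):
    """Train on corpus A, evaluate on the independent corpus B."""
    h = learner(corpus_A)
    return accuracy(h, corpus_B)

# Hypothetical stand-in for the inductive learner: a majority-class classifier.
def majority_learner(train):
    labels = [y for _, y in train]
    top = max(set(labels), key=labels.count)
    return lambda x: top

A = [(0, "shop"), (1, "shop"), (2, "article")]      # toy training corpus A
B = [(3, "article"), (4, "article"), (5, "shop")]   # toy test corpus B
print(export_accuracy(majority_learner, A, B))      # "shop" matches 1 of 3 in B
```

Because corpus B was collected independently, its accuracy is not inflated by corpus-specific correlations, which is what makes it a proxy for the generalization error.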

SLIDE 44

Robust Models in IR

Case Study II: Web Genre Classification

Experiment setup:

❑ Corpus A: KI-04, 1 200 documents
❑ Considered classes: article, discussion, shop, help, personal home, non-personal home, link collection, download
❑ Corpus B: 7-Web-Genre, 900 documents
❑ Considered classes: listing (KI-04 link collection), eshop (KI-04 shop), home page (KI-04 personal home)
❑ Sample sizes: 50-350, drawn i.i.d. from KI-04
❑ Inductive learner: SVM with linear kernel
❑ Model formation functions α: 2 genre retrieval models
  1. α1: VSM-based model with 3 500 words
  2. α2: special concentration measures plus a core vocabulary (98 features)

SLIDE 45

Robust Models in IR

Case Study II: Web Genre Classification

Within-corpus accuracy:

[Figure: predictive accuracy (45-75%) over the number of training instances (100-300) on corpus A (KI-04), for ⟨α1, h⟩ and ⟨α2, h⟩; err_S(h∗_α1) < err_S(h∗_α2).]

SLIDE 46

Robust Models in IR

Case Study II: Web Genre Classification

Export accuracy:

[Figure: export accuracy (45-75%) over the number of training instances (100-700); training corpus A (KI-04), test corpus B (7-Web-Genre), for ⟨α1, h⟩ and ⟨α2, h⟩; err(h∗_α1) > err(h∗_α2).]

SLIDE 47

Summary

SLIDE 48

Summary

1. Be careful if the ratio between sample size and application set ("test set") becomes extreme: a model selection decision may not be justified by error minimization.

2. Consider . . .
   ❑ a bias over-estimation of the less complex classifier, or
   ❑ a variance under-estimation of the more complex classifier.

3. In web scenarios the true error (generalization error) of a classifier cannot be analyzed:
   ➜ develop a scale-up scenario to assess the impact on the error
   ➜ if in doubt, stick to the less complex classifier

SLIDE 49

Thank you!

SLIDE 50

Excursus: Bias Types

SLIDE 51

Excursus: Bias Types

Bias in Classification Tasks

[Diagram: supervised learning, from the task ⟨O, Y⟩ via model formation α to ⟨X, Y⟩ (restriction bias) and via sample formation to ⟨S, Y⟩ (sample selection bias), leading to the solution ⟨α, h⟩ (preference bias).]
