SLIDE 1

Stay With Me: Lifetime Maximization Through Heteroscedastic Linear Bandits With Reneging

Ping-Chun Hsieh¹, Xi Liu¹, Anirban Bhattacharya², and P. R. Kumar¹

¹Department of ECE, Texas A&M University

²Department of Statistics, Texas A&M University

ICML 2019

Poster @ Pacific Ballroom #124


SLIDE 3

Lifetime Maximization: Continuing The Play

  • A finite game is played for the purpose of winning.

  • An infinite game is for the purpose of continuing the play.

→ Lifetime maximization


SLIDE 4

Why Lifetime Maximization?

Medical treatments · Portfolio selection · Cloud services

Salient features of these applications:

  1. Each participant has a satisfaction level.
  2. A participant drops if the outcomes are not satisfactory.
  3. The outcomes depend heavily on the contextual information of the participant.


SLIDE 5

Model: Linear Bandits With Reneging

  1. {xt,a}a∈A are pairwise participant-action contexts (observed by the platform when participant t arrives).
  2. Outcome rt,a is conditionally independent given the context and has mean θ∗⊤xt,a.
  3. Participant t keeps interacting with the platform as long as rt,a ≥ βt. Otherwise, the participant drops.
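To make the interaction protocol concrete, here is a minimal simulation sketch. It is an illustration, not from the slides: the unit-variance Gaussian noise and every name in it are assumptions. Since each outcome clears the threshold independently, the lifetime of a participant served by a fixed action is geometric.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_lifetime(theta_star, contexts, beta, policy,
                      sigma=1.0, max_rounds=10_000):
    """Play rounds for one participant until an outcome falls below beta.

    contexts: array of shape (num_actions, d), one context per action.
    policy:   callable mapping contexts -> action index.
    Returns the number of rounds survived before reneging (the lifetime).
    """
    for lifetime in range(max_rounds):
        a = policy(contexts)                                  # platform picks an action
        r = theta_star @ contexts[a] + sigma * rng.normal()   # noisy linear outcome
        if r < beta:                                          # participant reneges
            return lifetime
    return max_rounds
```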



SLIDE 7

Heteroscedastic Outcomes

  • Heteroscedasticity: outcome variations can be wildly different across different participants and actions.

  • Example: two actions, 1 (red) and 2 (blue), and a participant satisfaction level β. (The slide's figure is not reproduced here; a numerical version follows below.)

  • Heteroscedasticity is widely studied in econometrics, and is usually captured through regression on variance.
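A small check of the two-action example (the means, variances, and threshold below are made up for illustration): when both actions share the same mean outcome and the satisfaction level β sits above that mean, the higher-variance action is the one less likely to trigger reneging.

```python
from scipy.stats import norm

beta = 1.5                   # satisfaction level (assumed)
mu = 1.0                     # both actions share this mean outcome
sigma1, sigma2 = 0.5, 2.0    # action 1 (red): low variance; action 2 (blue): high variance

# Reneging probability P{r < beta} for each action
p1 = norm.cdf((beta - mu) / sigma1)
p2 = norm.cdf((beta - mu) / sigma2)
print(f"P(renege | action 1) = {p1:.3f}")   # 0.841
print(f"P(renege | action 2) = {p2:.3f}")   # 0.599 -> the high-variance action is safer here
```

So ranking actions by mean outcome alone is not enough: the variance has to be modeled as well, which is exactly what the next slide adds.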


SLIDE 8

Model: Heteroscedastic Bandits With Reneging

  1. {xt,a}a∈A are pairwise participant-action contexts (observed by the platform when participant t arrives).
  2. Outcome rt,a is conditionally independent given the context and satisfies rt,a ∼ N(θ∗⊤xt,a, f(φ∗⊤xt,a)).
  3. Participant t keeps interacting with the platform as long as rt,a ≥ βt. Otherwise, the participant drops.
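A minimal sampler for this outcome model. The slides treat the variance link f as a known abstract function; the choice f = exp below, which keeps variances positive, is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def f(z):
    """Assumed variance link: any known positive function could play this role."""
    return np.exp(z)

def draw_outcome(theta_star, phi_star, x):
    """Sample r ~ N(theta*^T x, f(phi*^T x)): mean AND variance depend on the context."""
    mean = theta_star @ x
    std = np.sqrt(f(phi_star @ x))
    return mean + std * rng.normal()
```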



SLIDE 10

Oracle Policy and Regret

  • The oracle policy πoracle already knows θ∗ and φ∗.

  • For each participant t, πoracle keeps choosing the action that minimizes the reneging probability P{rt,a < βt | xt,a}.

  • Hence, πoracle is a fixed policy.

  • For T participants, define

    Regretπ(T) = (total expected lifetime under πoracle) − (total expected lifetime under π).
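Under the Gaussian model the reneging probability has a closed form, so the oracle takes only a few lines (a sketch, with the same assumed link f = exp as before; minimizing the reneging probability maximizes the expected lifetime because the lifetime is geometric with mean 1/P{renege}):

```python
import numpy as np
from scipy.stats import norm

def renege_prob(theta, phi, x, beta, f=np.exp):
    """P{r < beta | x} when r ~ N(theta^T x, f(phi^T x)); f = exp is an assumed link."""
    return norm.cdf((beta - theta @ x) / np.sqrt(f(phi @ x)))

def oracle_action(theta_star, phi_star, contexts, beta):
    """The oracle knows theta* and phi*: pick the action with the smallest reneging probability."""
    probs = [renege_prob(theta_star, phi_star, x, beta) for x in contexts]
    return int(np.argmin(probs))
```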



SLIDE 14

Proposed Algorithm: HR-UCB

  • When participant t arrives, obtain estimators θ̂, φ̂ with confidence intervals Cθ, Cφ based on past observations.

  • For each action a, construct a UCB index

    QtHR(xt,a) = [Φ((βt − θ̂⊤xt,a) / √(f(φ̂⊤xt,a)))]⁻¹ + Δ(Cθ, Cφ, xt,a),    (1)

    where the first term is the estimated expected lifetime and the second term is a confidence interval for the lifetime.

  • Apply the action arg maxa QtHR(xt,a). (A code sketch follows the list below.)

Main technical challenges:

  1. Design estimators θ̂, φ̂ under heteroscedasticity.
  2. Derive the confidence intervals Cθ, Cφ for θ̂, φ̂.
  3. Convert Cθ, Cφ into a confidence interval for the lifetime.
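A sketch of index (1) and the resulting action choice. Illustrative only: the link f = exp is the same assumption as above, and the bonus Δ is passed in, to be instantiated from the theorems on the last slide.

```python
import numpy as np
from scipy.stats import norm

def hr_ucb_index(theta_hat, phi_hat, x, beta, bonus, f=np.exp):
    """Index (1): estimated expected lifetime [Phi(...)]^{-1} plus an exploration bonus."""
    p = norm.cdf((beta - theta_hat @ x) / np.sqrt(f(phi_hat @ x)))
    return 1.0 / max(p, 1e-12) + bonus          # floor avoids division by zero

def hr_ucb_action(theta_hat, phi_hat, contexts, beta, bonuses):
    """Apply arg max_a QtHR(xt,a)."""
    scores = [hr_ucb_index(theta_hat, phi_hat, x, beta, b)
              for x, b in zip(contexts, bonuses)]
    return int(np.argmax(scores))
```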



SLIDE 16

Estimators of θ∗ and φ∗ (Challenge 1)

  • Generalized least squares estimator (Wooldridge, 2015): with any n outcome observations,

    θ̂n = (Xn⊤Xn + λI)⁻¹ Xn⊤ r,
    φ̂n = (Xn⊤Xn + λI)⁻¹ Xn⊤ f⁻¹(ε̂ ∘ ε̂),

    where
    • Xn is the matrix of the n applied contexts,
    • r is the vector of the n observed outcomes,
    • ε̂(xt,a) = rt,a − θ̂n⊤xt,a is the estimated residual with respect to θ̂n.

  • Nice property (Abbasi-Yadkori et al., 2011): let Vn = Xn⊤Xn + λI. For any δ > 0, with probability at least 1 − δ, for all n ∈ ℕ,

    ‖θ̂n − θ∗‖Vn ≤ Cθ(δ, n) = O(√(log(1/δ) + log n)).
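A sketch of this two-stage estimator (the inverse link f⁻¹ = log matches the exp link assumed earlier, and the floor on the squared residuals is a numerical guard, not part of the slides):

```python
import numpy as np

def gls_estimators(X, r, lam=1.0, f_inv=np.log):
    """Stage 1: ridge-regress outcomes r on contexts X to get theta_hat.
    Stage 2: ridge-regress f^{-1}(squared residuals) on X to get phi_hat."""
    d = X.shape[1]
    V = X.T @ X + lam * np.eye(d)             # V_n = X_n^T X_n + lambda * I
    theta_hat = np.linalg.solve(V, X.T @ r)
    eps = r - X @ theta_hat                   # estimated residuals w.r.t. theta_hat
    z = f_inv(np.maximum(eps**2, 1e-12))      # f^{-1}(eps ∘ eps), floored for the log
    phi_hat = np.linalg.solve(V, X.T @ z)
    return theta_hat, phi_hat, V
```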



SLIDE 19

Main Technical Contributions (Challenges 2 & 3)

Theorem. For any δ > 0, with probability at least 1 − 2δ,

    ‖φ̂n − φ∗‖Vn ≤ Cφ(δ, n) = O(√(log(1/δ) + log n)), ∀n ∈ ℕ.    (2)

  • The proof is more involved since φ̂n depends on the residual ε̂.

Theorem. The quantity

    Δ(Cθ(δ, n), Cφ(δ, n), x) := (k1 Cθ(δ, n) + k2 Cφ(δ, n)) · ‖x‖Vn⁻¹

is a confidence interval with respect to the lifetime, where k1, k2 are constants independent of the past history and of x.

Theorem. Under the HR-UCB policy,

    Regret(T) = O(√(T(log T)³)).
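For completeness, the lifetime confidence width from the second theorem is immediate to compute once Cθ and Cφ are known (a sketch: k1 and k2 exist by the theorem, but their values come from the paper's constants, so they are placeholders here):

```python
import numpy as np

def lifetime_bonus(C_theta, C_phi, x, V, k1=1.0, k2=1.0):
    """Delta(C_theta, C_phi, x) = (k1*C_theta + k2*C_phi) * ||x||_{V^{-1}}."""
    x_norm = np.sqrt(x @ np.linalg.solve(V, x))   # ||x||_{V^{-1}} via a linear solve
    return (k1 * C_theta + k2 * C_phi) * x_norm
```

Plugging this in as the bonus Δ in index (1) connects the estimators, the confidence bounds, and the regret theorem.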
