Correlated bandits or: How to minimize mean-squared error online




  1. Correlated bandits or: How to minimize mean-squared error online
     V. Praneeth Boda¹ and Prashanth L. A.²
     ¹ LinkedIn Corp. ² Indian Institute of Technology Madras.
     A portion of this work was done while the authors were at University of Maryland, College Park.

  2. Centrality among Bandits
     Aim: Find the arm with the highest information about the other arms.
     ▶ Placement of sensors used for measuring temperature in a region.
     ▶ Best set of towers which approximate the whole network.

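  3. Minimum Mean Squared Error Estimation
     ▶ Jointly Gaussian arms $X_{\mathcal{M}} = (X_1, \ldots, X_K)$, with zero mean and covariance matrix $\Sigma \triangleq \mathbb{E}[X_{\mathcal{M}}^T X_{\mathcal{M}}]$.
     MMSE:
     $$\mathcal{E}_i \triangleq \min_g \mathbb{E}\left[(X_{\mathcal{M}} - g(X_i))^T (X_{\mathcal{M}} - g(X_i))\right] = \sum_{j=1}^{K} \mathbb{E}\left[(X_j - \mathbb{E}[X_j \mid X_i])^2\right] = \sum_{j \neq i} \sigma_j^2 (1 - \rho_{ij}^2)$$
     The optimal $g$:
     $$g^*(X_i) = \mathbb{E}[X_{\mathcal{M}} \mid X_i] = \left[\mathbb{E}[X_1 \mid X_i] \;\ldots\; \mathbb{E}[X_K \mid X_i]\right]^T, \quad \text{with } \mathbb{E}[X_j \mid X_i] = \frac{\mathbb{E}[X_j X_i]}{\mathbb{E}[X_i^2]} X_i = \rho_{ij} \frac{\sigma_j}{\sigma_i} X_i.$$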
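The closed form on this slide, $\sum_{j \neq i} \sigma_j^2 (1 - \rho_{ij}^2)$, can be evaluated directly once the covariance matrix is known. The following is a minimal sketch under that assumption; the function name `mmse_per_arm` and the 3-arm covariance matrix are illustrative and not from the slides, and arms are indexed from 0 rather than 1.

```python
def mmse_per_arm(cov):
    """Return the list of per-arm MSE values sum_{j != i} sigma_j^2 * (1 - rho_ij^2).

    cov is a K x K covariance matrix given as nested lists (assumed known here;
    in the bandit setting it has to be estimated from samples).
    """
    k = len(cov)
    mse = []
    for i in range(k):
        total = 0.0
        for j in range(k):
            if j == i:
                continue
            # Squared correlation between arms i and j.
            rho_sq = cov[i][j] ** 2 / (cov[i][i] * cov[j][j])
            # Residual variance of X_j after conditioning on X_i.
            total += cov[j][j] * (1.0 - rho_sq)
        mse.append(total)
    return mse


# Hypothetical 3-arm example with unit variances.
cov = [
    [1.0, 0.8, 0.6],
    [0.8, 1.0, 0.5],
    [0.6, 0.5, 1.0],
]
mse = mmse_per_arm(cov)
# The best arm is the one with the smallest MSE.
best_arm = min(range(len(mse)), key=mse.__getitem__)
```

In this example arm 0 is the most strongly correlated with the other two arms, so it attains the smallest MSE.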

  4. Correlated Bandits
     Input: set of arm-pairs $S \triangleq \{(i, j) \mid i, j = 1, \ldots, K,\ i < j\}$, number of rounds $n$
     For $t = 1, 2, \ldots, n$ do
       Select a pair $(i_t, j_t) \in S$
       Observe a sample from the bivariate distribution corresponding to the arms $i_t, j_t$ (necessary for estimating correlation structure)
     endfor
     Output an arm $\hat{A}_n$ based on sample-based MSE-value estimates so that $P(\hat{A}_n \neq i^*)$ is minimized. Here $i^* = \arg\min_{i \in \mathcal{M}} \mathcal{E}_i$.
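The interaction loop on this slide can be sketched with the simplest possible selection rule: cycling through the arm-pairs in round-robin order, which corresponds to the uniform baseline rather than the adaptive algorithm. Everything here is an assumption made for illustration: the names `sample_pair` and `run_uniform`, the 0-indexed arms, and the simulated environment that draws each pair from a zero-mean bivariate Gaussian with given standard deviations and correlation.

```python
import itertools
import math
import random


def sample_pair(sigma_i, sigma_j, rho, rng):
    """Draw one sample (x_i, x_j) from a zero-mean bivariate Gaussian."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    x_i = sigma_i * z1
    # Mixing z1 and z2 this way gives correlation rho between x_i and x_j.
    x_j = sigma_j * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return x_i, x_j


def run_uniform(sigmas, rhos, n, seed=0):
    """Play n rounds, selecting arm-pairs in round-robin order.

    sigmas[i] is the standard deviation of arm i; rhos[(i, j)] with i < j is
    the correlation of the pair. Returns the observed samples per pair.
    """
    rng = random.Random(seed)
    pairs = list(itertools.combinations(range(len(sigmas)), 2))
    observations = {pair: [] for pair in pairs}
    for t in range(n):
        i, j = pairs[t % len(pairs)]  # select a pair (i_t, j_t)
        observations[(i, j)].append(
            sample_pair(sigmas[i], sigmas[j], rhos[(i, j)], rng)
        )
    return observations


# Hypothetical 3-arm instance: 30 rounds spread over 3 pairs.
obs = run_uniform([1.0, 1.0, 1.0], {(0, 1): 0.8, (0, 2): 0.6, (1, 2): 0.5}, n=30)
```

With 30 rounds and 3 pairs, each pair is observed 10 times; an adaptive rule would instead shift rounds toward the pairs that still matter.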

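  5. MSE Estimation and Concentration
     Based on samples of the Gaussian arms, the MSE of arm $i$ is estimated as
     $$\hat{\mathcal{E}}_i \triangleq \sum_{j \neq i} \hat{\sigma}_j^2 \left(1 - \hat{\rho}_{ij}^2\right),$$
     where $\hat{\sigma}_j^2$ is the sample variance and $\hat{\rho}_{ij}$ is the sample correlation.
     MSE Concentration: Assume $\sigma_i^2 \le 1$, $i = 1, \ldots, K$. Then, for any $i = 1, \ldots, K$, and for any $\epsilon \in [0, 2K]$, we have
     $$P\left(\left|\hat{\mathcal{E}}_i - \mathcal{E}_i\right| > \epsilon\right) \le 14 K \exp\left(-\frac{n l^2 \epsilon^2}{cK}\right),$$
     where $c$ is a universal constant, and $0 < l = \min_i \sigma_i^2$.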
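A plug-in version of the MSE estimate on this slide can be sketched as follows. The slides do not say exactly how the sample variance and sample correlation are computed, so this sketch makes two assumptions: it uses the known zero mean (moments are not centered), and it estimates the variance of arm j from the same samples of pair (i, j) that are used for the correlation. The names `pair_term` and `mse_estimate` are hypothetical, and arms are 0-indexed.

```python
def pair_term(samples):
    """Estimate sigma_j^2 * (1 - rho_ij^2) from (x_i, x_j) samples.

    Assumes zero-mean arms, so second moments are used without centering.
    """
    n = len(samples)
    s_ii = sum(x * x for x, _ in samples) / n  # sample variance of arm i
    s_jj = sum(y * y for _, y in samples) / n  # sample variance of arm j
    s_ij = sum(x * y for x, y in samples) / n  # sample covariance
    rho_hat_sq = s_ij ** 2 / (s_ii * s_jj)     # squared sample correlation
    return s_jj * (1.0 - rho_hat_sq)


def mse_estimate(i, pair_samples, k):
    """Sample-based MSE estimate of arm i among k arms.

    pair_samples[(a, b)] with a < b holds the observed (x_a, x_b) tuples.
    """
    total = 0.0
    for j in range(k):
        if j == i:
            continue
        if i < j:
            samples = pair_samples[(i, j)]
        else:
            # Stored as (x_j, x_i); swap so arm i comes first.
            samples = [(x_i, x_j) for x_j, x_i in pair_samples[(j, i)]]
        total += pair_term(samples)
    return total


# Tiny deterministic example: two uncorrelated arms, each with variance 0.5.
demo = {(0, 1): [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]}
estimate_arm0 = mse_estimate(0, demo, k=2)
```

Because the samples in the example are uncorrelated, observing arm 0 explains none of arm 1, and the estimate equals the sample variance of arm 1.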

  6. SR algorithm: Illustration of arm-pair elimination
     Maintain active arms and arm-pairs.
     Arm-pairs: (4,5) (3,5) (3,4) (2,5) (2,4) (2,3) (1,5) (1,4) (1,3) (1,2)
     [Figure: active arm-pairs after arms 4, 5 are eliminated; active arm-pairs after arms 3, 4, 5 are eliminated]

  7. Successive Rejects: An algorithm to find the best arm (arm with lowest MSE)
     Initialization: $A_1$ = all arm pairs, $B_1 = \{1, \ldots, K\}$, $n_k = \left\lceil \frac{n - \binom{K}{2}}{C(K)\,(K + 1 - k)} \right\rceil$, $C(K) \approx K \log K$.
     Phase 1: Pull each pair in $A_1$, $n_1$ times; Set $B_{k+1} = B_k \setminus \ldots$
     Phase 2: Play each arm pair in $A_2$, $n_2 - n_1$ times; Eliminate ...
     ...
     Phase ...: Play the remaining two arm pairs $n_{K-1} - n_{K-2}$ times
     ▶ One arm pair played $n_1$ times, ..., another two played $n_2$ times.
     ▶ $k$ arms played $n_{k+1}$ times
     ▶ $\sum_{k=1}^{K-1} (k - 1)\, n_k + (K - 1)\, n_{K-1} < n$
     ▶ $n_k$ increases with $k$
     ▶ Adaptive exploration: better than uniform (= play each arm-pair $n / \binom{K}{2}$ times)
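To make the phase schedule concrete, the sketch below evaluates the phase lengths read from this slide, n_k = ceil((n - C(K,2)) / (C(K) (K + 1 - k))) for k = 1, ..., K-1, where C(K,2) is the number of arm-pairs. The slide only gives the normalizer as C(K) ≈ K log K, so using exactly K log K (natural logarithm) is an assumption, as are the function name `phase_lengths` and the example budget. The elimination step itself is not reproduced here because it is not fully legible in the slide text.

```python
import math


def phase_lengths(n, k_arms, c_k=None):
    """Return [n_1, ..., n_{K-1}] for a budget of n rounds and k_arms arms.

    c_k is the normalizer C(K); when omitted, K * log(K) is used, which is an
    assumption standing in for the slide's "C(K) is approximately K log K".
    """
    num_pairs = math.comb(k_arms, 2)
    if c_k is None:
        c_k = k_arms * math.log(k_arms)
    return [
        math.ceil((n - num_pairs) / (c_k * (k_arms + 1 - k)))
        for k in range(1, k_arms)
    ]


# Hypothetical budget: 1000 rounds, 5 arms (10 arm-pairs).
lengths = phase_lengths(1000, 5)
```

The denominator shrinks as k grows, so the returned lengths never decrease with k, in line with the bullet stating that n_k increases with k.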

  8. Thanks. Questions?
