MAB Learning in IoT Networks Learning helps even in non-stationary - PowerPoint PPT Presentation

MAB Learning in IoT Networks Learning helps even in non-stationary settings! Rémi Bonnefoi Lilian Besson Émilie Kaufmann Christophe Moy Jacques Palicot PhD Student in France Team SCEE, IETR, CentraleSupélec, Rennes & Team SequeL, CRIStAL, Inria, Lille 20-21 Sept - CROWNCOM 2017

1. Introduction and motivation 1.a. Objective
We want
- Many IoT devices need to access a gateway or base station.
- They must be inserted into an already crowded wireless network,
- with a protocol slotted in time and frequency.
- Each device has a low duty cycle (a few messages per day).
Goal: maintain a good Quality of Service, without centralized supervision!
How? Use learning algorithms: each device learns on which frequency it should transmit.
Lilian Besson (CentraleSupélec & Inria) MAB Learning in IoT Networks CROWNCOM 2017 2 / 18

1. Introduction and motivation 1.b. Outline
Outline
1. Introduction and motivation
2. Model and hypotheses
3. Baseline algorithms: to compare against naive and efficient centralized approaches
4. Two Multi-Armed Bandit algorithms: UCB, Thompson sampling
5. Experimental results
6. Perspectives and future works
7. Conclusion

2. Model and hypotheses 2.a. Model
Model
- Discrete time $t \geq 1$ and $N_c$ radio channels (e.g., 10) (known).
- $D$ dynamic devices try to access the network independently.
- $S = S_1 + \cdots + S_{N_c}$ static devices occupy the network: $S_1, \ldots, S_{N_c}$ in each channel (unknown).
Figure 1: Protocol in time and frequency, with an Acknowledgement.

2. Model and hypotheses 2.b. Hypotheses
Hypotheses I
Emission model
- Each device has the same low emission probability: at each time step, each device sends a packet with probability $p$.
- This gives a duty cycle proportional to $p$, i.e. a mean time of $1/p$ slots between two packets.
Background traffic
- Each static device uses only one channel.
- Their distribution across channels is fixed in time.
⇒ Background traffic, interfering with the dynamic devices!

2. Model and hypotheses 2.b. Hypotheses
Hypotheses II
Dynamic radio reconfiguration
- Each dynamic device decides which channel it uses to send every packet.
- It has enough memory and computational capacity to implement a basic decision algorithm.
Problem
- Goal: minimize the packet loss ratio (i.e. maximize the number of received Ack) in a finite-space discrete-time Decision Making Problem.
- Solution? Multi-Armed Bandit algorithms, decentralized and used independently by each device.

3. Baseline algorithms 3.a. A naive strategy: uniformly random access
A naive strategy: uniformly random access
- Uniformly random access: dynamic devices choose their channel uniformly in the pool of $N_c$ channels.
- Natural strategy, dead simple to implement.
- Simple analysis, in terms of successful transmission probability (for every message from dynamic devices):

$$
P(\text{success} \mid \text{sent}) = \sum_{i=1}^{N_c} \underbrace{(1 - p/N_c)^{D-1}}_{\text{No other dynamic device}} \times \underbrace{(1 - p)^{S_i}}_{\text{No static device}} \times \frac{1}{N_c}.
$$

Works fine only if all channels are similarly occupied, but it cannot learn to exploit the best (least occupied) channels.
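The closed-form success probability of uniformly random access can be evaluated directly. The sketch below is only an illustration: the function name `random_access_success` and the uneven static-device counts in the usage example are assumptions, not values from the slides.

```python
def random_access_success(p, n_dynamic, static_per_channel):
    """P(success | sent) when dynamic devices pick a channel uniformly at random.

    p                  : emission probability of every device in a slot
    n_dynamic          : number D of dynamic devices
    static_per_channel : list (S_1, ..., S_Nc) of static devices per channel
    """
    n_channels = len(static_per_channel)
    # None of the other D - 1 dynamic devices transmits in the same channel.
    no_dynamic = (1 - p / n_channels) ** (n_dynamic - 1)
    # Average over channels of "no static device transmits in channel i".
    no_static = sum((1 - p) ** s for s in static_per_channel) / n_channels
    return no_dynamic * no_static


if __name__ == "__main__":
    # Hypothetical uneven distribution of 9000 static devices over 10 channels.
    static = [100, 200, 400, 600, 800, 1000, 1200, 1400, 1500, 1800]
    print(random_access_success(1e-3, 1000, static))
```

Because the factor for the other dynamic devices does not depend on the channel, it can be pulled out of the sum, which is what the function does.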

3. Baseline algorithms 3.b. Optimal centralized strategy
Optimal centralized strategy I
If an oracle can decide to assign $D_i$ dynamic devices to channel $i$, the successful transmission probability is:

$$
P(\text{success} \mid \text{sent}) = \sum_{i=1}^{N_c} \underbrace{(1 - p)^{D_i - 1}}_{D_i - 1 \text{ others}} \times \underbrace{(1 - p)^{S_i}}_{\text{No static device}} \times \underbrace{D_i / D}_{\text{Sent in channel } i}.
$$

The oracle has to solve this optimization problem:

$$
\arg\max_{D_1, \ldots, D_{N_c}} \sum_{i=1}^{N_c} D_i (1 - p)^{S_i + D_i - 1}
\quad \text{such that} \quad \sum_{i=1}^{N_c} D_i = D \ \text{ and } \ D_i \geq 0, \ \forall\, 1 \leq i \leq N_c.
$$

We solved this quasi-convex optimization problem with Lagrange multipliers, only numerically.
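To make the oracle's objective concrete, here is a minimal sketch that enumerates every integer allocation on a tiny instance. This is not the Lagrange-multiplier method mentioned on the slide: exhaustive search is swapped in purely for illustration and only scales to a handful of devices and channels. The names `oracle_objective` and `brute_force_oracle` are hypothetical.

```python
from itertools import product


def oracle_objective(p, alloc, static_per_channel):
    """Sum over channels of D_i * (1 - p)^(S_i + D_i - 1)."""
    return sum(
        d * (1 - p) ** (s + d - 1)
        for d, s in zip(alloc, static_per_channel)
        if d > 0  # an empty channel contributes nothing
    )


def brute_force_oracle(p, n_dynamic, static_per_channel):
    """Return (best allocation, P(success | sent)) by exhaustive search.

    Assumes n_dynamic >= 1. Cost grows as (D + 1)^(Nc - 1): toy sizes only.
    """
    n_channels = len(static_per_channel)
    best_alloc, best_value = None, float("-inf")
    # Choose D_1..D_{Nc-1} freely; the last channel takes the remainder.
    for head in product(range(n_dynamic + 1), repeat=n_channels - 1):
        rest = n_dynamic - sum(head)
        if rest < 0:
            continue  # violates the constraint sum(D_i) = D
        alloc = head + (rest,)
        value = oracle_objective(p, alloc, static_per_channel)
        if value > best_value:
            best_alloc, best_value = alloc, value
    # Dividing by D turns the objective into the success probability.
    return best_alloc, best_value / n_dynamic


if __name__ == "__main__":
    # Hypothetical toy instance: 3 channels, 6 dynamic devices.
    print(brute_force_oracle(0.1, 6, [0, 5, 20]))
```

Dividing the objective by $D$ recovers the success probability of the previous formula, so the two quantities share the same maximizer.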

3. Baseline algorithms 3.b. Optimal centralized strategy
Optimal centralized strategy II
⇒ Very good performance, maximizing the transmission rate of all the $D$ dynamic devices.
But unrealistic: not achievable in practice, as there is no centralized oracle!
Let us look at realistic decentralized approaches:
→ Machine Learning?
→ Reinforcement Learning?
→ Multi-Armed Bandit!

4. Multi-Armed Bandit algorithm: UCB 4.1. Multi-Armed Bandit formulation
Multi-Armed Bandit formulation
A dynamic device tries to collect rewards when transmitting:
- it transmits following a Bernoulli process (probability $p$ of transmitting at each time step $\tau$),
- chooses a channel $A(\tau) \in \{1, \ldots, N_c\}$,
- if Ack (no collision) ⇒ reward $r_{A(\tau)} = 1$,
- if collision (no Ack) ⇒ reward $r_{A(\tau)} = 0$.
Reinforcement Learning interpretation
Maximize transmission rate ≡ maximize cumulated rewards:

$$
\max_{\text{algorithm } A} \sum_{\tau=1}^{\text{horizon}} r_{A(\tau)}.
$$

4. Multi-Armed Bandit algorithm: UCB 4.2. Upper Confidence Bound algorithm: UCB
Upper Confidence Bound algorithm (UCB1)
A dynamic device keeps $\tau$, the number of sent packets, $T_k(\tau)$, the number of selections of channel $k$, and $X_k(\tau)$, the number of successful transmissions in channel $k$.
1. For the first $N_c$ steps ($\tau = 1, \ldots, N_c$), try each channel once.
2. Then for the next steps $\tau > N_c$:
   - Compute the index

$$
g_k(\tau) := \underbrace{\frac{X_k(\tau)}{T_k(\tau)}}_{\text{Mean } \hat{\mu}_k(\tau)} + \underbrace{\sqrt{\frac{\log(\tau)}{2\, T_k(\tau)}}}_{\text{Upper Confidence Bound}},
$$

   - Choose channel $A(\tau) = \arg\max_k g_k(\tau)$,
   - Update $T_k(\tau + 1)$ and $X_k(\tau + 1)$.
References: [Lai & Robbins, 1985], [Auer et al, 2002], [Bubeck & Cesa-Bianchi, 2012]
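The two steps above can be sketched as a small class that a single device would run. This is a minimal illustration, not the authors' code: the class name `UCB1`, the helper `run_ucb`, and the fixed per-channel Ack probabilities used to drive it are all assumptions (in the real network the Ack probability depends on the other devices).

```python
import math
import random


class UCB1:
    """Index policy: empirical mean + sqrt(log(tau) / (2 * T_k))."""

    def __init__(self, n_channels):
        self.n_channels = n_channels
        self.tau = 0                          # number of packets sent so far
        self.selections = [0] * n_channels    # T_k: times channel k was used
        self.successes = [0] * n_channels     # X_k: Acks received on channel k

    def choose(self):
        # Step 1: try each channel once.
        for k in range(self.n_channels):
            if self.selections[k] == 0:
                return k

        # Step 2: pick the channel with the largest index g_k.
        def index(k):
            mean = self.successes[k] / self.selections[k]
            bonus = math.sqrt(math.log(self.tau) / (2 * self.selections[k]))
            return mean + bonus

        return max(range(self.n_channels), key=index)

    def update(self, channel, ack):
        self.tau += 1
        self.selections[channel] += 1
        self.successes[channel] += int(ack)


def run_ucb(ack_probs, n_packets, seed=None):
    """Drive one device against hypothetical fixed Ack probabilities."""
    rng = random.Random(seed)
    agent = UCB1(len(ack_probs))
    for _ in range(n_packets):
        k = agent.choose()
        agent.update(k, rng.random() < ack_probs[k])
    return agent.selections


if __name__ == "__main__":
    print(run_ucb([0.80, 0.85, 0.95], 1000, seed=1))
```

Only packets actually sent advance $\tau$, so the Bernoulli emission process with probability $p$ would sit outside this class, deciding when `choose` is called.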

5. Experimental results 5.1. Experiment setting
Experimental setting
Simulation parameters
- $N_c = 10$ channels,
- $S + D = 10000$ devices in total,
- $p = 10^{-3}$ probability of emission,
- horizon $= 10^5$ time slots ($\simeq 100$ messages per device),
- the proportion of dynamic devices $D / (S + D)$ varies,
- various settings for the distribution $(S_1, \ldots, S_{N_c})$ of static devices.
What do we show
- After a short learning time, MAB algorithms are almost as efficient as the oracle solution.
- Never worse than the naive solution.
- Thompson sampling is even more efficient than UCB.
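Thompson sampling is named here but its slide is not part of this excerpt. For orientation, the sketch below shows the textbook Beta-Bernoulli version with a uniform prior, which is an assumption about the variant used; the class name `ThompsonSampling`, the helper `run_thompson`, and the fixed Ack probabilities are hypothetical.

```python
import random


class ThompsonSampling:
    """Beta-Bernoulli Thompson sampling with a uniform Beta(1, 1) prior."""

    def __init__(self, n_channels, seed=None):
        self.n_channels = n_channels
        self.rng = random.Random(seed)
        self.successes = [0] * n_channels   # Acks received per channel
        self.failures = [0] * n_channels    # collisions per channel

    def choose(self):
        # Draw one sample per channel from its posterior Beta(X + 1, F + 1),
        # then use the channel whose sample is largest.
        samples = [
            self.rng.betavariate(s + 1, f + 1)
            for s, f in zip(self.successes, self.failures)
        ]
        return max(range(self.n_channels), key=samples.__getitem__)

    def update(self, channel, ack):
        if ack:
            self.successes[channel] += 1
        else:
            self.failures[channel] += 1


def run_thompson(ack_probs, n_packets, seed=None):
    """Drive one device against hypothetical fixed Ack probabilities."""
    env_rng = random.Random(seed)
    agent = ThompsonSampling(len(ack_probs), seed=seed)
    counts = [0] * len(ack_probs)
    for _ in range(n_packets):
        k = agent.choose()
        counts[k] += 1
        agent.update(k, env_rng.random() < ack_probs[k])
    return counts


if __name__ == "__main__":
    print(run_thompson([0.80, 0.85, 0.95], 1000, seed=1))
```

The interface mirrors a choose-then-update loop, so it can be swapped for an index policy such as UCB in the same simulation.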

5. Experimental results 5.2. First result: 10%
10% of dynamic devices
[Plot: successful transmission rate (y-axis, from 0.82 to 0.91) against number of slots (x-axis, ×10^5). Curves: UCB, Thompson-sampling, Optimal, Good sub-optimal, Random.]
Figure 2: 10% of dynamic devices. 7% of gain.
