Data Collection through Device- to-Device Communications for Mobile - - PowerPoint PPT Presentation



Data Collection through Device- to-Device Communications for Mobile Big Data Sensing Hanshang Li, Ting Li, Xinghua Shi and Yu Wang College of Computing and Informatics University of North Carolina at Charlotte May 17 , 2016 @ The First


slide-1
SLIDE 1
slide-2
SLIDE 2


Data Collection through Device- to-Device Communications for Mobile Big Data Sensing

Hanshang Li, Ting Li, Xinghua Shi and Yu Wang


College of Computing and Informatics University of North Carolina at Charlotte

May 17, 2016 @ The First Workshop of Mission-Critical Big Data Analytics (MCBDA 2016)

slide-3
SLIDE 3

OUTLINE

➤ Introduction ➤ Mobile Data Collection ➤ Relay Selection Problem ➤ Our Solutions ➤ Simulations ➤ Conclusions


slide-5
SLIDE 5

MOBILE DEVICES

➤ More and more smart mobile devices are used as primary personal devices, providing computing, sensing, communication, and other functions.

slide-6
SLIDE 6

MOBILE DEVICES AND USERS

Sources:
An Introduction to Mobile Marketing: The Past, Present, and Future, Marketo, 2015
Cisco VNI Global Mobile Data Traffic Forecast, 2015 - 2020, Cisco, 2016

slide-7
SLIDE 7

Source: Cisco VNI Mobile, 2016

MOBILE DATA EXPLOSION

➤ Mobile data traffic keeps growing

grew 74% in 2015, reaching 3.7 exabytes per month, 4,000 times the 2005 level
will surpass 30.6 exabytes per month in 2020

➤ Traffic mainly comes from smart devices

although smart devices represent only 36% of devices/connections, they account for 89% of all mobile traffic

Cisco VNI Global Mobile Data Traffic Forecast, 2015 - 2020, Cisco, 2016

slide-8
SLIDE 8

MOBILE CROWD SENSING — “POWER OF THE CROWD”

➤ Individuals with sensing and computing devices collectively share data and extract information to measure and map phenomena of common interest

➤ Widely used in many applications: humans as sensors

slide-9
SLIDE 9

ADVANTAGES OF MOBILE CROWD SENSING

➤ Leverages existing sensing and communication infrastructures at little additional cost;

➤ Provides unprecedented spatial-temporal coverage, especially for observing unpredictable events;

➤ Integrates human intelligence into sensing and data processing.

slide-10
SLIDE 10

GENERAL FRAMEWORK OF MOBILE CROWD SENSING

[Figure: framework diagram with the labels Sensing Tasks, Selection Mechanism, Participants, Coverage, Cost, Incentive, Reward, Sensing Data, User Traces, and Task Assignment]

➤ A large number of mobile participants

➤ A set of crowd sensing tasks

➤ Participant selection mechanism: the focus of most current work
slide-12
SLIDE 12

CHALLENGE TO CURRENT NETWORK INFRASTRUCTURE

➤ Current cellular networks do not have enough capacity to support the fast-growing mobile big data generated by smart devices and mobile sensing

slide-13
SLIDE 13

OUTLINE

➤ Introduction ➤ Mobile Data Collection ➤ Relay Selection Problem ➤ Our Solutions ➤ Simulations ➤ Conclusions


slide-16
SLIDE 16

DATA COLLECTION IN MOBILE CROWD SENSING

➤ How to transfer sensing data back?

cellular network (piggyback)
WiFi or femtocell offloading
D2D/DTN relays

[Figure: framework diagram with the labels Sensing Tasks, Selection Mechanism, Participants, Coverage, Cost, Incentive, Rewards, Sensing Data, User Traces, and Task Assignment]

+ low cost and easy to deploy
- longer delay and low delivery ratio

D2D: Device-to-Device
DTN: Delay Tolerant Networks

slide-17
SLIDE 17

MOBILE DATA COLLECTION VIA D2D RELAYS

➤ Leverage user mobility to deliver the sensing data from the source to the sink(s)

slide-18
SLIDE 18

RELATED WORKS

➤ Data Collection in Mobile Sensing

Wang et al. [UbiComp 2013] consider Bluetooth/WiFi offloading (one-hop) to reduce the energy consumption and data cost of data-plan users
Karaliopoulos et al. [INFOCOM 2015] consider joint user recruitment with D2D data collection (multi-hop); however, the time complexity of the proposed greedy algorithm is high because it searches over all space-time paths

➤ DTN/D2D Routing

Focus on point-to-point delivery over D2D relays, selecting relay node on ride

➤ Data Offloading

WiFi [Lee et al. 2010, Dimatteo et al. 2011], femtocell [Chandrasekhar et al. 2008]
D2D [Han et al. 2012, Li et al. 2014, Zhu et al. 2013], broadcasting or point-to-point

slide-19
SLIDE 19

OUTLINE

➤ Introduction ➤ Mobile Data Collection ➤ Relay Selection Problem ➤ Our Solutions ➤ Simulations ➤ Conclusions


slide-20
SLIDE 20

MODEL AND ASSUMPTIONS

➤ n mobile users, User = {u1, u2, …, un}

➤ m locations, Location = {l1, l2, …, lm}

➤ T, time period for delivery

➤ Known probability p(i,j,t) that mobile user ui visits location lj at time t (learned from historical data)

➤ Two devices can transfer sensing data if they visit the same location within the same time slot

➤ Collection task: send the data from a source node s to a sink node d (a mobile device or a location)

➤ Restricted flooding (epidemic routing) is used within the selected relay nodes U(s,d)
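
The slides say only that p(i,j,t) is learned from historical data. As a rough illustration, the sketch below uses a simple frequency estimate: the fraction of observed days on which a user was seen at a location in a given time slot. The record layout `(user, location, day, slot)` and the function name are assumptions made for this example, not the authors' method.

```python
from collections import defaultdict

def estimate_visit_prob(records, num_days):
    """Frequency estimate of p(user, location, slot).

    records: iterable of (user, location, day, slot) tuples (hypothetical layout).
    num_days: number of days covered by the history.
    """
    days_seen = defaultdict(set)
    for user, location, day, slot in records:
        # A set counts each day once, even if several records fall in the slot.
        days_seen[(user, location, slot)].add(day)
    return {key: len(days) / num_days for key, days in days_seen.items()}

# Toy history: u1 is seen at l1 on two of four days and at l2 on one day.
history = [
    ("u1", "l1", 0, 1),
    ("u1", "l1", 1, 1),
    ("u1", "l1", 1, 1),  # duplicate record on the same day
    ("u1", "l2", 2, 1),
]
p = estimate_visit_prob(history, num_days=4)
```

Triples that never occur in the history are simply absent from the result, so callers can treat a missing key as probability 0.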

slide-21
SLIDE 21

RELAY SELECTION PROBLEM

➤ Goal: minimize the number of relay nodes U(s,d) while maximizing the data delivery

➤ Two versions of the optimization problem

Minimum Relay Problem
K Relay Problem
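
One way to write the two versions formally, assuming γ is a target delivery probability and K is a relay budget (both symbols appear in the relay selection algorithm), is sketched here. The exact formulation in the underlying paper may differ.

```latex
\text{Minimum Relay Problem:}\quad
  \min_{U(s,d) \subseteq User} |U(s,d)|
  \quad \text{s.t.} \quad p(U(s,d), s, d) \ge \gamma

\text{K Relay Problem:}\quad
  \max_{U(s,d) \subseteq User} p(U(s,d), s, d)
  \quad \text{s.t.} \quad |U(s,d)| \le K
```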

slide-22
SLIDE 22

TWO CHALLENGES

➤ How to model the time-evolving D2D network and estimate the delivery probability?

weighted space-time graph and reliability calculation

➤ How to identify a small set of relay nodes from a huge candidate pool to guarantee a certain level of data delivery?

greedy algorithm

slide-23
SLIDE 23

OUTLINE

➤ Introduction ➤ Mobile Data Collection ➤ Relay Selection Problem ➤ Our Solutions ➤ Simulations ➤ Conclusions


slide-24
SLIDE 24

SPACE-TIME GRAPH

➤ The space-time graph describes all characteristics among the selected relay nodes in both the spatial and temporal dimensions

[Figure: space-time graph with one layer of user nodes per time slot, t = 1 to t = 5, with the source s and the sink d marked]

slide-25
SLIDE 25

DELIVERY PROBABILITY OVER SPACE-TIME GRAPH

➤ Each spatial link has a delivery probability

$$p\left(\overrightarrow{u_j^{t-1} u_k^{t}}\right) = \left(1 - \prod_{i=1}^{m} \bigl(1 - p(j,i,t)\, p(k,i,t)\bigr)\right) \cdot r\left(\overrightarrow{u_j^{t-1} u_k^{t}}\right)$$

➤ With flooding, the delivery probability can be calculated via dynamic programming over the weighted space-time graph G. Thus,

$$p(U(s,d), s, d) = p_G(s^0, d^T)$$

[Figure: space-time graph with one layer of user nodes per time slot, t = 1 to t = 5, with the source s and the sink d marked]
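
The link probability and a layer-by-layer estimate of the delivery probability can be sketched as follows. This is a minimal illustration under stated assumptions, not the recurrence from the slides: `r` is taken to be a constant link reliability, a node that already holds the data keeps it, events are treated as independent, and the sink `d` is treated as a node in the relay list (a location sink could be modeled as a pseudo-user that is always at that location). The names `link_prob` and `delivery_prob` are hypothetical.

```python
from math import prod

def link_prob(p, j, k, t, locations, r=1.0):
    """(1 - prod_i (1 - p(j,i,t) * p(k,i,t))) * r for the link u_j^{t-1} -> u_k^t."""
    miss = prod(1.0 - p.get((j, i, t), 0.0) * p.get((k, i, t), 0.0)
                for i in locations)
    return (1.0 - miss) * r

def delivery_prob(p, relays, locations, s, d, T, r=1.0):
    """Estimate the probability that data flooded from s reaches d by slot T.

    reach[u] approximates the probability that u holds the data after the
    current slot (assumption: independence between holders and links).
    """
    reach = {u: 0.0 for u in relays}
    reach[s] = 1.0
    for t in range(1, T + 1):
        nxt = {}
        for k in relays:
            miss = 1.0 - reach[k]  # k may already hold a copy
            for j in relays:
                if j != k:
                    miss *= 1.0 - reach[j] * link_prob(p, j, k, t, locations, r)
            nxt[k] = 1.0 - miss
        reach = nxt
    return reach[d]

# Toy case: s is always at location l, d is there with probability 0.5.
toy_p = {("s", "l", 1): 1.0, ("d", "l", 1): 0.5,
         ("s", "l", 2): 1.0, ("d", "l", 2): 0.5}
one_slot = delivery_prob(toy_p, ["s", "d"], ["l"], "s", "d", T=1, r=0.5)
two_slots = delivery_prob(toy_p, ["s", "d"], ["l"], "s", "d", T=2, r=0.5)
```

In the toy case each slot gives a 0.5 × 0.5 = 0.25 chance of a successful transfer, so the estimate rises from 0.25 after one slot to 1 − 0.75² = 0.4375 after two.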

slide-27
SLIDE 27

RELAY SELECTION ALGORITHM

➤ Greedy Algorithm

in each step, greedily select the user u that leads to the maximal improvement of p(U(s, d), s, d) and add it to U(s, d)

➤ Cold Start Problem

initially, the space-time graph is not connected at all, and adding a single user cannot solve this
solution: simply pick the most active user

Algorithm 1 Relay Selection Algorithm
Input: potential user set User, call probability p(i, j, t) for each user in User, the source s and the sink d.
Output: selected relay nodes U(s, d).

U(s, d) = ∅
while G_U(s,d) is not connected do
    Choose the most active user and add it into U(s, d)
while |U(s, d)| < K or p(U(s, d), s, d) < γ (for K relay problem or minimum relay problem, respectively) do
    for all ui ∈ User and ui ∉ U(s, d) do
        Calculate the improvement of p(U(s, d), s, d) by adding ui into U(s, d)
    Select the user ui with the largest reliability improvement and add it into U(s, d)
return U(s, d)
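
A compact Python sketch of the selection loop is given below. It is an illustration under assumptions rather than the authors' implementation: the delivery estimate and the activity score are passed in as callables (`estimate` and `activity`, both hypothetical names), and "the space-time graph is not connected" is approximated by "the estimated delivery probability is still zero".

```python
from math import prod

def greedy_relays(candidates, estimate, activity, K=None, gamma=None):
    """Select relays for the K relay problem (set K) or the minimum relay
    problem (set gamma). estimate(U) returns the estimated delivery
    probability for relay set U; activity(u) returns an activity score."""
    if (K is None) == (gamma is None):
        raise ValueError("set exactly one of K or gamma")
    candidates = list(candidates)
    selected = set()

    def done():
        if K is not None:
            return len(selected) >= K
        return estimate(selected) >= gamma

    # Cold start: add the most active users while the estimate is still zero.
    by_activity = sorted(candidates, key=activity, reverse=True)
    while by_activity and not done() and estimate(selected) == 0.0:
        selected.add(by_activity.pop(0))

    # Greedy phase: add the user giving the largest estimated delivery.
    while not done():
        remaining = [u for u in candidates if u not in selected]
        if not remaining:
            break  # target not reachable with the available candidates
        best = max(remaining, key=lambda u: estimate(selected | {u}))
        selected.add(best)
    return selected

# Toy stand-ins for the estimator and the activity score.
gain = {"a": 0.5, "b": 0.3, "c": 0.1}
visits = {"a": 1, "b": 2, "c": 3}

def toy_estimate(U):
    return 1.0 - prod(1.0 - gain[u] for u in U)

k_result = greedy_relays(gain, toy_estimate, visits.get, K=2)
min_result = greedy_relays(gain, toy_estimate, visits.get, gamma=0.6)
```

In the toy run the cold start picks `c` (the most active user), after which the greedy phase adds `a`, and for γ = 0.6 also `b`. Maximizing `estimate(selected | {u})` is equivalent to maximizing the improvement, since the current estimate is the same for every candidate.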

slide-29
SLIDE 29

OUTLINE

➤ Introduction ➤ Mobile Data Collection ➤ Relay Selection Problem ➤ Our Solutions ➤ Simulations ➤ Conclusions


slide-30
SLIDE 30

D4D DATASET

➤ Cellular tracing data (anonymized call records) from Orange

50,000 mobile users in Ivory Coast over half a year
contains access records of each mobile user over every two-week period
46,254 active users and 1,097 cellular towers
released for the Data for Development (D4D) Challenge in 2013

slide-31
SLIDE 31

EXPERIMENT SETTING

➤ The 20 most popular towers, i.e., those with the most associated records

➤ Relay nodes are chosen from a candidate set of 100 users

➤ For simplicity, link reliability is set to 0.5, i.e., a transfer between a pair of nodes succeeds with probability 50% during an encounter

➤ For each data collection task, we randomly select a mobile user as the data source and one location as the sink

➤ For each set of experiments, we test 15 tasks with 100 rounds per task. The average performance over 1,500 rounds is reported.
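
The slides do not show how a round is simulated. The sketch below is one plausible Monte Carlo version, under assumptions: in each slot every relay is at no more than one location, drawn from its visit probabilities (which are assumed to sum to at most 1 per slot); data moves one hop per slot from users that already hold it; and each encounter succeeds with link reliability `r`. The function names and the seed parameter are hypothetical.

```python
import random

def simulate_round(p, relays, locations, s, d, T, r, rng):
    """Simulate one round of flooding; return True if d receives the data."""
    holders = {s}
    for t in range(1, T + 1):
        # Sample each relay's location for this slot (or nowhere).
        where = {}
        for u in relays:
            x, acc = rng.random(), 0.0
            for loc in locations:
                acc += p.get((u, loc, t), 0.0)
                if x < acc:
                    where[u] = loc
                    break
        # Holders pass the data to co-located relays with probability r.
        newly = set()
        for u in relays:
            if u in holders or u not in where:
                continue
            for h in holders:
                if where.get(h) == where[u] and rng.random() < r:
                    newly.add(u)
                    break
        holders |= newly
    return d in holders

def delivery_ratio(p, relays, locations, s, d, T, r=0.5, rounds=100, seed=0):
    """Fraction of simulated rounds in which the data reaches d."""
    rng = random.Random(seed)
    hits = sum(simulate_round(p, relays, locations, s, d, T, r, rng)
               for _ in range(rounds))
    return hits / rounds

# Degenerate toy case: s and d always meet at location l in slot 1.
meet_p = {("s", "l", 1): 1.0, ("d", "l", 1): 1.0}
always = delivery_ratio(meet_p, ["s", "d"], ["l"], "s", "d", T=1, r=1.0)
never = delivery_ratio(meet_p, ["s", "d"], ["l"], "s", "d", T=1, r=0.0)
```

With `r=0.5` and `rounds=100`, averaging this ratio over 15 tasks would mirror the 1,500-round setup described above.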

slide-32
SLIDE 32

TESTED ALGORITHMS

➤ Three algorithms

Our Method: greedily choose the user with the largest improvement in delivery ratio
Active: choose the most active user (the one visiting the most locations)
Random: randomly choose a user at each step until K users are selected or delivery ratio >= γ

slide-35
SLIDE 35

RESULTS

[Figure: K relay problem where K = 10, 15 or 20. One panel plots delivery ratio against K for Random, Activity, and Our Method; the other plots estimated delivery ratio (Est. DR) and delivery ratio (DR) against K.]

[Figure: minimum relay problem where γ = 0.6, 0.75 or 0.9. One panel plots delivery ratio against γ and the other plots |U(s,d)| against γ, each for Random, Activity, and Our Method.]

slide-36
SLIDE 36

OUTLINE

➤ Introduction ➤ Mobile Data Collection ➤ Relay Selection Problem ➤ Our Solutions ➤ Simulations ➤ Conclusion


slide-37
SLIDE 37

CONCLUSION

➤ Big data from mobile sensing brings new challenges to mobile data collection

➤ We consider a relay selection problem for mobile data collection via D2D relays

aim to use a small relay set to guarantee a certain level of data delivery via D2D flooding
formulate the problem as two optimization problems on relay set selection (K relay selection or minimum relay selection)
propose a greedy solution that uses historical records and the space-time graph to estimate the expected delivery ratio
tested on the real-life D4D dataset

➤ Future work

hybrid data collection scheme

slide-38
SLIDE 38

ACKNOWLEDGEMENT

Contact: yu.wang@uncc.edu
PhD Students: Hanshang Li, Ting Li
Collaborators: Xinghua Shi (UNCC)
Tracing data provided by:
Funded by:
Joint works with:

slide-39
SLIDE 39

TIME COMPLEXITY

➤ Given the space-time graph G defined by r relay nodes, starting from a source node, the dynamic programming algorithm can compute the delivery ratio of all other nodes in O(rT(log(rT) + r)) time

➤ The greedy algorithm runs at most K or n times (for the K relay problem or the minimum relay problem, respectively), so the total time complexity is at most O(KrT(log(rT) + r)) or O(nrT(log(rT) + r)) in the worst case