

SLIDE 1

Information Centric Networking (ICN) for Delivering Big Data with Persistent Identifiers (PID)

Andreas Karakannas

Research Project 2
Supervised by: Zhiming Zhao

SLIDE 2

Background

PIDs in IP Network

[Figure: a user at a web browser resolves a PID through www.resolver.org in four numbered steps (1-4), using http://www.resolver.org/<PID>]

PID: ark:12345/CIA/DNS_1.pdf
URL: https://www.os3.nl/_media/2013-2014/courses/cia/dns_1.pdf

Information Centric Networking

• A new network concept.
• Based on the idea that users are interested in accessing Digital Objects regardless of their location.
• No end-to-end communication.
• Digital Objects are uniquely identified.
• Requests for Objects are routed based on the Digital Object's unique name (NO IP ROUTING!).
• Objects are cached along the path from source to destination (In-Network Caching).
• In-Network Caching aims to achieve efficient and reliable distribution of content across the network infrastructure.

SLIDE 3

Research Questions

• How can PID types be mapped/resolved to ICNs' Object Identifiers?
• What is the efficiency of ICNs' caching algorithms for delivering Big Data?

SLIDE 4

Approach

• Theoretical studies of the latest ICN projects and PID standards.
• Propose a mapping architecture design based on the theoretical study.
• Evaluate in-network caching performance for Big Data objects.

SLIDES 5-6

ICN approaches

  • A Survey of Information-Centric Networking Research

Theoretical Studies ICN Approaches

SLIDE 7

Named Data Networking (NDN)

• The most mature ICN approach.
• The only approach with a published specification (Packet Format 0.1a2, published on March 27, 2014).
• Most research on caching algorithms in ICN is based on NDN.
• The only one with available open-source simulators (ndnSIM, ccnSim) for evaluating caching performance under different scenarios.

Theoretical Studies ICN Approaches NDN

SLIDE 8

Named Data Networking (NDN)

• Names in NDN
  • Based on URI syntax.
  • Have a hierarchical structure (e.g. /NL/Amsterdam/UVA/ComputerScience/OS3/CIA/DNS.pdf).
  • Names can refer to anything: a PDF file, a video, an endpoint, a command to turn on some lights.
  • Names are used in the routing procedure.
• 2 types of packets
  • INTEREST (request) packets: contain the name of the request, e.g. INTEREST(/NL/Amsterdam/UVA/ComputerScience/OS3/CIA/DNS.pdf)
  • DATA (answer) packets: contain the name of the request and the data, e.g. DATA(/NL/Amsterdam/UVA/ComputerScience/OS3/CIA/DNS.pdf, <DATA>)

Theoretical Studies ICN Approaches NDN
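
The naming and packet ideas on this slide can be sketched in a few lines of Python. This is only an illustration, not the NDN packet format: the `Interest` and `Data` classes, the `parse_name` and `longest_prefix_match` helpers, and the face labels are all hypothetical, and longest-prefix matching is assumed here as one plausible way that hierarchical names can drive routing.

```python
from dataclasses import dataclass


def parse_name(uri: str) -> tuple[str, ...]:
    """Split a hierarchical name such as /NL/Amsterdam/UVA into components."""
    if not uri.startswith("/"):
        raise ValueError(f"name must start with '/': {uri!r}")
    return tuple(part for part in uri.split("/") if part)


@dataclass(frozen=True)
class Interest:
    """Request packet: carries only the name of the requested object."""
    name: tuple[str, ...]


@dataclass(frozen=True)
class Data:
    """Answer packet: carries the requested name together with the content."""
    name: tuple[str, ...]
    content: bytes


def longest_prefix_match(table: dict[tuple[str, ...], str], name: tuple[str, ...]):
    """Return the next hop registered for the longest prefix of `name`, or None."""
    for length in range(len(name), -1, -1):
        next_hop = table.get(name[:length])
        if next_hop is not None:
            return next_hop
    return None


# Hypothetical routing entries: name prefix -> outgoing face label.
routes = {
    parse_name("/NL/Amsterdam/UVA"): "face-uva",
    parse_name("/NL/Amsterdam"): "face-amsterdam",
}

interest = Interest(parse_name("/NL/Amsterdam/UVA/ComputerScience/OS3/CIA/DNS.pdf"))
data = Data(interest.name, b"<DATA>")
print(longest_prefix_match(routes, interest.name))  # prints face-uva
```

The more specific prefix wins, so an INTEREST for anything under /NL/Amsterdam/UVA leaves through the UVA face while other Amsterdam names fall back to the shorter entry.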

SLIDES 9-12

Named Data Networking (NDN)

[Figure sequence: populating the name prefix]

Theoretical Studies NDN Populating the Name Prefix

SLIDES 13-15

Named Data Networking (NDN)

[Figure sequence: routing the INTEREST packet]

Theoretical Studies NDN Routing the INTEREST packet

SLIDES 16-18

Named Data Networking (NDN)

[Figure sequence: routing the DATA packet]

Theoretical Studies NDN Routing the DATA packet

SLIDES 19-20

Named Data Networking (NDN)

[Figure sequence: cache HIT]

Theoretical Studies NDN Cache HIT

SLIDE 21

Persistent Identifiers (PIDs)

• A name with a specific syntax that uniquely identifies an object for a long-lasting period, regardless of its location and lifespan.
• Different PID types are available for naming digital objects.
• Each PID has three parts:
  • A unique identifier of the PID type (e.g. urn:, ark:)
  • A unique identifier of the authority (e.g. isbn, ietf)
  • A unique identifier of the digital object (e.g. 0-7645-2641-3)
• Further delegation to sub-authorities is possible.

Example: urn:isbn:0-7645-2641-3

PID Type: urn    Authority: isbn    Name of Digital Object: 0-7645-2641-3

Theoretical Studies PID

SLIDE 22

Persistent Identifiers (PIDs)

Most well-known PID types

PID Type   Identifier   Authority                       Name
URL        url:         <protocol><host>:<port>         [/<path>[?<searchpart>]]
URN        urn:         <NID>:                          <NSS>
ARK        ark:         <NAAN>                          /<Name>[<Qualifier>]
HANDLE     handle:      <Handle Naming Authority>       /<Handle Local Name>
PURL       purl:        <protocol><resolver address>    /<name>
DOI        doi:         10.<Naming Authority>           /<doi name syntax>

Theoretical Studies PID Standards
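
As a rough illustration of the three-part structure (PID type, authority, object name), the sketch below splits a PID string using the separators listed for URN, ARK, HANDLE and DOI. The `PID` tuple and `parse_pid` function are hypothetical names introduced here. URL and PURL are left out on purpose, because their authority part itself contains ":" and "/" and would need a real URL parser.

```python
from typing import NamedTuple


class PID(NamedTuple):
    pid_type: str    # e.g. "urn", "ark"
    authority: str   # e.g. "isbn", "12345"
    name: str        # name of the digital object


# Separator between the authority and the object name, per PID type.
AUTHORITY_SEPARATOR = {"urn": ":", "ark": "/", "handle": "/", "doi": "/"}


def parse_pid(pid: str) -> PID:
    """Split a PID into (type, authority, object name)."""
    pid_type, colon, rest = pid.partition(":")
    pid_type = pid_type.lower()
    if not colon or pid_type not in AUTHORITY_SEPARATOR:
        raise ValueError(f"unsupported PID: {pid!r}")
    authority, sep, name = rest.partition(AUTHORITY_SEPARATOR[pid_type])
    if not sep or not authority or not name:
        raise ValueError(f"malformed PID: {pid!r}")
    return PID(pid_type, authority, name)


print(parse_pid("urn:isbn:0-7645-2641-3"))
print(parse_pid("ark:12345/CIA/DNS_1.pdf"))
```

Only the first separator after the authority is consumed, so any further "/" or ":" stays inside the object name, which leaves room for delegation to sub-authorities.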

SLIDE 23

Mapping Architecture Design Goals

• Generic
• Extensible
• Scalable
• Easy to implement, manage and administer

Mapping Architecture Design

SLIDE 24

Mapping Architecture Name-Space Implementation

Root PID Layer:        Root PID Server <Root PID NDN Name>
PID Type Layer:        URN, Handle, DOI, ARK, ...
Authority PID Layer:   ISBN, IETF, ...; 12345, 56789, ... (further delegation is possible)

Each Authority PID Server stores the mapping from PID to NDN name:

PID                       NDN-Name
urn:isbn:0-7645-2641-3    /UvA/NaturalScience/CS/CIA/DNS.pdf
...                       ...

SLIDE 25

Iterative Resolution of PIDs to NDN names

Participants: Client 1 (User Interface), the Client's PID Resolver Server <PID Resolver NDN Name>, Root PID Server <Root PID Server NDN Name>, PID Type Server <PID Type Server NDN Name>, Authority PID Server <Authority PID Server NDN Name>, NDN Content Router.

1. Client -> PID Resolver: INTEREST(<PID Resolver NDN Name><PID>)
2. PID Resolver -> Root PID Server: INTEREST(<Root PID Server NDN Name><PID>)
3. Root PID Server -> PID Resolver: DATA(<Root PID Server NDN Name><PID>, <Answer>)
4. PID Resolver -> PID Type Server: INTEREST(<PID Type Server NDN Name><PID>)
5. PID Type Server -> PID Resolver: DATA(<PID Type Server NDN Name><PID>, <Answer>)
6. PID Resolver -> Authority PID Server: INTEREST(<Authority PID Server NDN Name><PID>)
7. Authority PID Server -> PID Resolver: DATA(<Authority PID Server NDN Name><PID>, <Answer>)
8. PID Resolver -> Client: DATA(<PID Resolver NDN Name><PID>, <Answer>)
9. Client -> NDN Content Router: INTEREST(<PID's NDN Name>)
10. NDN Content Router -> Client: DATA(<PID's NDN Name>, <Data>)
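
The iterative resolution can be mimicked with three in-memory lookup tables standing in for the Root, PID Type and Authority PID servers. Everything concrete in this sketch is an assumption made for illustration: the NDN names under `/pid`, the table contents, and the `resolve` helper. The example mapping reuses the PID and NDN name shown for the name-space implementation. The trace records the INTEREST/DATA exchanges the PID Resolver performs (steps 2 to 7).

```python
import re

# Hypothetical NDN names and table contents, for illustration only.
ROOT_SERVER_NAME = "/pid"
ROOT_SERVER = {"urn": "/pid/urn", "ark": "/pid/ark"}        # PID type -> PID Type Server
PID_TYPE_SERVERS = {"/pid/urn": {"isbn": "/pid/urn/isbn"}}  # authority -> Authority PID Server
AUTHORITY_SERVERS = {
    "/pid/urn/isbn": {"urn:isbn:0-7645-2641-3": "/UvA/NaturalScience/CS/CIA/DNS.pdf"},
}


def resolve(pid: str) -> tuple[str, list[str]]:
    """Resolve a PID to an NDN name; return the name and the message trace.

    Raises KeyError when any layer has no entry for the PID.
    """
    pid_type, _, rest = pid.partition(":")
    authority = re.split(r"[:/]", rest, maxsplit=1)[0]
    trace: list[str] = []

    def ask(server_name: str, table: dict[str, str], key: str) -> str:
        # One INTEREST/DATA round trip between the resolver and a server.
        trace.append(f"INTEREST({server_name}/{pid})")
        answer = table[key]
        trace.append(f"DATA({server_name}/{pid}, {answer})")
        return answer

    type_server = ask(ROOT_SERVER_NAME, ROOT_SERVER, pid_type)
    authority_server = ask(type_server, PID_TYPE_SERVERS[type_server], authority)
    ndn_name = ask(authority_server, AUTHORITY_SERVERS[authority_server], pid)
    return ndn_name, trace


ndn_name, trace = resolve("urn:isbn:0-7645-2641-3")
print(ndn_name)
for message in trace:
    print(message)
```

Because each layer only returns a referral to the next one, a new PID type or authority can be added by registering one entry in the layer above it, which is in line with the generic and extensible design goals.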

SLIDE 26

Caching Strategies

• Decision Algorithms (DA): which Content Router caches what?
  LCE, LCD, FIX(P), ProbCache
• Replacement Algorithms (RA): how do Content Routers replace objects in the Content Store?
  FIFO, RANDOM, LRU, LFU

Evaluate In-Network Caching Performance
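
A minimal sketch of how a decision algorithm and a replacement algorithm interact on a single path of Content Routers is given below. It uses the commonly cited definitions: LCE (Leave Copy Everywhere) caches at every router the DATA passes, LCD (Leave Copy Down) caches only at the router one hop below where the object was found, and FIX(P) caches at each router with fixed probability P. ProbCache is omitted. The `LRUStore` class, the `request` function and the hop-counting convention (a hit at the router nearest the client costs 1 hop, the repository sits one hop beyond the last router) are assumptions for illustration, not the simulator used in the project.

```python
import random
from collections import OrderedDict


class LRUStore:
    """Content Store with Least Recently Used replacement."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items: OrderedDict[str, bool] = OrderedDict()

    def get(self, name: str) -> bool:
        if name not in self.items:
            return False
        self.items.move_to_end(name)  # mark as most recently used
        return True

    def put(self, name: str) -> None:
        if self.capacity <= 0:
            return
        self.items[name] = True
        self.items.move_to_end(name)
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used


def request(path: list[LRUStore], name: str, decision: str,
            p: float = 0.5, rng: random.Random = random.Random(0)) -> int:
    """Send one request along `path` (index 0 is nearest the client).

    Returns the number of hops travelled before the object was found.
    """
    hit_at = len(path)  # len(path) stands for the repository itself
    for i, store in enumerate(path):
        if store.get(name):
            hit_at = i
            break
    # The DATA travels back through the routers below the hit location.
    for i in range(hit_at - 1, -1, -1):
        if decision == "LCE":
            path[i].put(name)
        elif decision == "LCD":
            if i == hit_at - 1:
                path[i].put(name)
        elif decision == "FIX":
            if rng.random() < p:
                path[i].put(name)
        else:
            raise ValueError(f"unknown decision algorithm: {decision}")
    return hit_at + 1


lce_path = [LRUStore(2) for _ in range(3)]
print([request(lce_path, "obj", "LCE") for _ in range(3)])  # prints [4, 1, 1]

lcd_path = [LRUStore(2) for _ in range(3)]
print([request(lcd_path, "obj", "LCD") for _ in range(5)])  # prints [4, 3, 2, 1, 1]
```

With LCD the copy moves one router closer to the client on every repeated request, so only objects that are requested repeatedly reach the edge, while LCE fills every cache on the first request.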

SLIDE 27

Simulation Parameters

Big Data Repository

Parameter   Description                                            Values
R           Big Data Repository size                               51.2 TBytes
|R|         Number of Big Data Objects in R                        150
B           Size of a Big Data Object                              350 GBytes
c           Number of sub-objects a Big Data Object consists of    [1,2,4,6..20]
a           Exponent of the Zipf popularity distribution           1

Popularity of Big Data sets is based on a Zipf distribution: $P(x=i) = \dfrac{1/i^{a}}{C}$, with $C = \sum_{j=1}^{|R|} \dfrac{1}{j^{a}}$.

Content Router

Parameter   Description                                                                                   Values
C           Content Store size in each Content Router, expressed as a multiple of the Big Data Object size   [0.5B,1B,2B,4B,8B,16B]
CA          Caching Algorithm                                                                             [LCE, LCD, FIX(0.5), FIX(0.25), ProbCache]
RA          Replacement Algorithm                                                                         LRU

CLIENT

Parameter   Description
T           Number of requests for a Big Data Object the client has sent so far

Evaluate In-Network Caching Performance
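
To make the parameters concrete, the sketch below computes the Zipf probabilities P(x=i) = (1/i^a)/C with C equal to the sum of 1/j^a over all |R| objects, draws a synthetic request sequence, and checks the repository and cache sizes. The function names, the random seed, the exponent a = 1 and the convention 1 TByte = 1024 GBytes are assumptions made for this example.

```python
import random


def zipf_pmf(num_objects: int, a: float) -> list[float]:
    """Return P(x=i) for i = 1..num_objects under a Zipf distribution."""
    weights = [1.0 / (i ** a) for i in range(1, num_objects + 1)]
    c = sum(weights)  # normalising constant C
    return [w / c for w in weights]


def sample_requests(num_objects: int, a: float, count: int, seed: int = 0) -> list[int]:
    """Draw `count` object ranks (1 = most popular) from the Zipf distribution."""
    rng = random.Random(seed)
    pmf = zipf_pmf(num_objects, a)
    return rng.choices(range(1, num_objects + 1), weights=pmf, k=count)


NUM_OBJECTS = 150      # |R|
OBJECT_SIZE_GB = 350   # B

pmf = zipf_pmf(NUM_OBJECTS, a=1.0)
requests = sample_requests(NUM_OBJECTS, a=1.0, count=1000)

# Repository size: 150 objects of 350 GBytes each, about 51.27 TBytes.
repository_tb = NUM_OBJECTS * OBJECT_SIZE_GB / 1024

# Content Store sizes for C = 0.5B .. 16B, in GBytes.
cache_sizes_gb = [ratio * OBJECT_SIZE_GB for ratio in (0.5, 1, 2, 4, 8, 16)]

print(round(pmf[0], 3), round(repository_tb, 2), cache_sizes_gb)
```

With a = 1 the most popular object is requested twice as often as the second and three times as often as the third, so a cache holding only a few whole objects can already absorb a large share of the requests.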

SLIDE 28

Network Topologies

Two topologies are used: Binary Tree and String.

In both network topologies the distance between the client and the Big Data Repository is 4 hops (Content Routers).

[Figure: Binary Tree and String topologies, with the Content Routers numbered by level]

Evaluate In-Network Caching Performance

SLIDE 29

Performance Metrics

In ICN, in-network caching aims to:

  • From the customer's point of view: reduce the average time required to download the requested content.
  • From the publisher's point of view: reduce the number of requests the publisher needs to serve.
  • From the network's point of view: reduce the network traffic.

The average number of hops per simulation captures all of the above benefits.

Evaluate In-Network Caching Performance
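
A small sketch shows why one list of per-request hop counts reflects all three viewpoints. The `summarize` function, its field names and the sample hop counts are hypothetical. It assumes that a request which travels the full distance (4 hops here) was served by the publisher, and that each hop corresponds to one link traversal.

```python
def summarize(hops: list[int], repository_distance: int = 4) -> dict[str, float]:
    """Derive customer, publisher and network indicators from hop counts."""
    if not hops:
        raise ValueError("no requests recorded")
    total = sum(hops)
    served_by_publisher = sum(1 for h in hops if h >= repository_distance)
    return {
        "average_hops": total / len(hops),                    # customer view
        "publisher_load": served_by_publisher / len(hops),    # publisher view
        "link_traversals": total,                             # network view
    }


print(summarize([4, 1, 1, 2]))
# prints {'average_hops': 2.0, 'publisher_load': 0.25, 'link_traversals': 8}
```

All three indicators fall when more requests are answered by caches close to the client, which is why a lower average hop count stands in for shorter downloads, fewer publisher requests and less traffic at the same time.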

SLIDE 30

Collection of Measurements

Collection of the average number of hops for each simulation starts once the average number of hops has converged for at least 50T.

Evaluate In-Network Caching Performance

[Figure: Average Number of Hops (y-axis, 0.2 to 4.2) versus T, the number of client requests (x-axis, 1 to 528), for LCE, Fix(0.5), Fix(0.25), ProbCache, LCD and No Cache]
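
One possible reading of this warm-up rule is sketched below: the running average of hops counts as converged once it has changed by no more than a small tolerance for 50 consecutive requests. The `convergence_point` function and the tolerance of 0.01 hops are assumptions, since the slides do not state how convergence was tested.

```python
def convergence_point(hops: list[int], window: int = 50, tolerance: float = 0.01):
    """Return the request index T at which the running average of `hops`
    has stayed within `tolerance` for `window` consecutive requests,
    or None if that never happens."""
    total = 0.0
    stable = 0
    previous = None
    for t, h in enumerate(hops, start=1):
        total += h
        average = total / t
        if previous is not None and abs(average - previous) <= tolerance:
            stable += 1
        else:
            stable = 0  # the average moved too much, restart the window
        previous = average
        if stable >= window:
            return t
    return None


# A constant series is stable from the second request onwards.
print(convergence_point([4] * 60))  # prints 51
```

Measurements would then be collected only for requests after the returned index, so the cold-cache phase at the start of each simulation does not bias the reported average.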

SLIDE 31

Results: String Network (1 Client)

  • The number of sub-objects (c) that a Big Data Object consists of has a negligible impact on the performance of the caching algorithms.
  • C:B ≤ 1: low caching algorithm performance.
  • C:B ≥ 2: significant benefits can be gained from this point onwards.

[Figure: Average Hops (y-axis, 1.4 to 4.2) versus Content Router Cache Size / Big Data Object Size, C:B (x-axis: 0.5, 1, 2, 4, 8, 16), for LCE, Fix(0.5), Fix(0.25), ProbCache and LCD. The plot indicates the standard deviation over the different c values [1,2,4..20], where c is the number of sub-objects a Big Data Object consists of.]

SLIDE 32

Results: Binary Tree Network (8 Clients)

  • The number of sub-objects (c) that a Big Data Object consists of has a negligible impact on the performance of the caching algorithms.
  • C:B ≤ 1: low caching algorithm performance.
  • C:B ≥ 2: significant benefits can be gained from this point onwards.

[Figure: Average Hops (y-axis, 1.4 to 4.2) versus Content Router Cache Size / Big Data Object Size, C:B (x-axis: 0.5, 1, 2, 4, 8, 16), for LCE, Fix(0.5), Fix(0.25), ProbCache and LCD. The plot indicates the standard deviation over the different c values [1,2,4..20], where c is the number of sub-objects a Big Data Object consists of.]

SLIDE 33

Conclusion

• Based on our research into ICN approaches and PID standards, mapping PIDs to ICN names is possible.
  • A decentralized solution is proposed for the NDN approach: generic, extensible and scalable. Administration and management are needed on each layer.
• The evaluation of caching algorithms showed that:
  • The ratio of cache size to Big Data set size (C:B) plays a critical role in the efficiency of current caching algorithms.
    C:B ≤ 1: insignificant gain from caching.
    C:B ≥ 2: significant benefits can be gained from this point onwards.
  • The number of sub-objects into which a Big Data Object is segmented does not significantly affect the efficiency of caching algorithms.

SLIDE 34

Questions?