Heterogeneous Cloud Storage Platforms. Ilja Livenson*, Erwin Laure



SLIDE 1

Towards Transparent Integration of Heterogeneous Cloud Storage Platforms

Ilja Livenson*, Erwin Laure
KTH PDC
livenson@kth.se

* Presenter

SLIDE 2

Outline

• Motivation and problem
• Our approach
  - CDMI-Proxy
• Status and roadmap

SLIDE 3

Background

• Work done within the EU FP7 VENUS-C project:
  - creating a platform that enables user applications to leverage cloud computing principles;
  - creating a sustainable infrastructure with a valid business model.
• Resource providers are MS Azure, Engineering, BSC and KTH.
• User scenarios from biomedicine, civil engineering, civil protection and emergencies, marine biodiversity and more.

SLIDE 4

Problem

• Lacking component: a common storage access mechanism
• Clouds typically expose RESTful interfaces for file access
  - AWS S3 or MS Azure Blob
• DCI and local infrastructures (including laptops) tend to provide a POSIX interface
  - FS or shared FS
• Need to offer a compatibility layer

SLIDE 5

Storage Objects

• There are three objects with generally close semantics
  - Container
  - Blob
  - Message Queue
• Each resource provider offers its own flavour of APIs
  - AWS S3 vs MS Azure Blob vs POSIX
  - AWS SQS vs MS Azure Queue vs AMQP

SLIDE 6

VENUS-C Applications Requirements

• Blob
  - generic data item + metadata
• Message Queue
  - FIFO queue
• Key-value database
  - Aka NoSQL databases
  - Semantics depend on implementation
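
The three abstractions listed above can be sketched in a few lines of Python. This is only an in-memory illustration: the class names `Blob`, `MessageQueue` and `KeyValueStore` and their methods are hypothetical and are not taken from the VENUS-C code base.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Blob:
    """Generic data item plus free-form metadata."""
    content: bytes
    metadata: dict = field(default_factory=dict)


class MessageQueue:
    """FIFO queue: messages come out in the order they were put in."""

    def __init__(self):
        self._messages = deque()

    def enqueue(self, message):
        self._messages.append(message)

    def dequeue(self):
        # popleft() removes the oldest message; raises IndexError if empty.
        return self._messages.popleft()

    def __len__(self):
        return len(self._messages)


class KeyValueStore:
    """Simplest possible key-value semantics; real NoSQL stores differ."""

    def __init__(self):
        self._items = {}

    def put(self, key, value):
        self._items[key] = value

    def get(self, key, default=None):
        return self._items.get(key, default)


blob = Blob(b"hello", {"mimetype": "text/plain"})
queue = MessageQueue()
queue.enqueue("first")
queue.enqueue("second")
```

Calling `queue.dequeue()` at this point returns the oldest message, which is the FIFO behaviour the slide asks for.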

SLIDE 7

Data Access Strategies

SLIDE 8

Motivation for a Proxy Approach

• Easier exposure of local storage through a RESTful API
• Centralized control over resources
• Easier access to resources
  - Integration point with existing identity providers
• Easier release cycle: it is much easier to update a central CDMI-proxy service than a set of deployed libraries
• Optimizing the data of multiple users together can have a larger effect than optimizing each user's data individually

SLIDE 9

CDMI

• SNIA’s Cloud Data Management Interface
  - http://www.snia.org/cloud
  - Standard (1.0.1h) + rising adoption by vendors
• CDMI provides an interface description for performing a set of operations on the data elements from the cloud
• CDMI objects:
  - Data
  - Queue
  - Container
  - Domain
  - Capability
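
To make the interface more concrete, here is a minimal sketch of how a client might assemble a request that creates a CDMI data object. The header names and the `application/cdmi-object` media type follow the CDMI 1.0.x specification as best recalled and should be checked against the published standard. The helper `build_create_data_object`, the path and the metadata values are hypothetical. Nothing is sent over the network.

```python
import json

# Assumption: the version cited on the slide, without the errata letter.
CDMI_VERSION = "1.0.1"


def build_create_data_object(path, value, mimetype="text/plain", metadata=None):
    """Return (method, path, headers, body) for creating a CDMI data object."""
    headers = {
        "X-CDMI-Specification-Version": CDMI_VERSION,
        "Content-Type": "application/cdmi-object",
        "Accept": "application/cdmi-object",
    }
    body = json.dumps({
        "mimetype": mimetype,
        "metadata": metadata or {},
        "value": value,
    })
    return "PUT", path, headers, body


method, path, headers, body = build_create_data_object(
    "/demo-container/hello.txt", "Hello, CDMI", metadata={"owner": "alice"}
)
```

The returned tuple can be handed to any HTTP client; keeping request construction separate from transport makes it easy to inspect what would be sent.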

SLIDE 10

CDMI-Proxy Structure

(Diagram; labels only)
• Frontends: CDMI, FUSE, HTTP, FTP, CIFS
• Core, with AuthZ/AuthN
• Generic Blob: local, Azure, S3, CDMI
• Generic MQ: local, Azure, SQS, CDMI
• Generic Document DB: Azure, CouchDB, SimpleDB
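
The layering in this diagram, one generic abstraction in front of several interchangeable backends, might look roughly like the sketch below. It assumes a plain duck-typed `save`/`load` contract. All class names are hypothetical, and `MemoryBlobBackend` merely stands in for a remote store such as Azure Blob or S3.

```python
import os
import tempfile


class LocalBlobBackend:
    """Stores each blob as a file under a root directory (POSIX-style)."""

    def __init__(self, root):
        self.root = root

    def _path(self, name):
        # Keep only the final path component so a name cannot escape the root.
        return os.path.join(self.root, os.path.basename(name))

    def save(self, name, data):
        with open(self._path(name), "wb") as f:
            f.write(data)

    def load(self, name):
        with open(self._path(name), "rb") as f:
            return f.read()


class MemoryBlobBackend:
    """In-memory stand-in for a remote store such as Azure Blob or S3."""

    def __init__(self):
        self._blobs = {}

    def save(self, name, data):
        self._blobs[name] = data

    def load(self, name):
        return self._blobs[name]


class GenericBlob:
    """Entry point used by the core; delegates to the configured backend."""

    def __init__(self, backend):
        self.backend = backend

    def write(self, name, data):
        self.backend.save(name, data)

    def read(self, name):
        return self.backend.load(name)


local = GenericBlob(LocalBlobBackend(tempfile.mkdtemp()))
remote = GenericBlob(MemoryBlobBackend())
for store in (local, remote):
    store.write("report.txt", b"same call, different backend")
```

The caller uses the same `write` and `read` calls regardless of where the bytes end up, which is the point of the generic layer.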

SLIDE 11

Data Flow

CDMI HTTP request -> CDMI Frontend
  1. Parse CDMI request.
  2. Extract request parameters.
  3. Call generic ADT (e.g. blob) with extracted parameters.

ADT call with extracted data -> Concrete ADT
  1. Divide parameters into data and metadata.
  2. Access metadata in metadata store (e.g. CouchDB).
  3. Access data in data store (e.g. blob/MQ).
  4. Crosscutting: checks and business-logic validation.

Metadata (ACLs, ctime/mtime, size, etc.) -> Metadata Backend
  1. Manage connection with the metadata store.
  2. Search/load/save metadata.

Data (blob content, message value) -> Blob/MQ Backend
  1. Manage connection with the data backend.
  2. Load/save data.
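
The flow above can be traced end to end with a small sketch. It is not the CDMI-Proxy implementation: both backends are in-memory dictionaries, the validation rule is invented for illustration, and the names `handle_put`, `BlobADT`, `MetadataBackend` and `BlobBackend` are hypothetical.

```python
import json
import time


class MetadataBackend:
    """Stand-in for the metadata store (CouchDB on the slide)."""

    def __init__(self):
        self.docs = {}

    def save(self, key, metadata):
        self.docs[key] = metadata

    def load(self, key):
        return self.docs[key]


class BlobBackend:
    """Stand-in for the data store that holds blob content."""

    def __init__(self):
        self.blobs = {}

    def save(self, key, data):
        self.blobs[key] = data

    def load(self, key):
        return self.blobs[key]


class BlobADT:
    """Concrete ADT: divides a request into data and metadata."""

    def __init__(self, meta_store, data_store):
        self.meta_store = meta_store
        self.data_store = data_store

    def write(self, path, value, user_metadata):
        # Crosscutting check (hypothetical rule): reject empty object paths.
        if not path.strip("/"):
            raise ValueError("object path must not be empty")
        data = value.encode("utf-8")
        metadata = dict(user_metadata, size=len(data), mtime=time.time())
        self.meta_store.save(path, metadata)
        self.data_store.save(path, data)

    def read(self, path):
        return self.data_store.load(path), self.meta_store.load(path)


def handle_put(adt, path, raw_body):
    """Frontend: parse the request, extract parameters, call the ADT."""
    request = json.loads(raw_body)
    adt.write(path, request["value"], request.get("metadata", {}))


adt = BlobADT(MetadataBackend(), BlobBackend())
handle_put(adt, "/demo/hello.txt",
           json.dumps({"value": "hi", "metadata": {"owner": "alice"}}))
data, metadata = adt.read("/demo/hello.txt")
```

Keeping metadata and content in separate stores lets each backend be swapped independently, which matches the split between the metadata backend and the blob/MQ backend on the slide.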

SLIDE 12

VENUS-C Deployment Models

• Everything from the laptop
  - Client would need to have a business relationship with a cloud provider
• VENUS-C on-premises
  - E.g. VENUS-C services deployed at a research group
• VENUS-C in the cloud
  - E.g. a commercial offer to a company

SLIDE 13

Demo deployment

(Diagram; labels only)
• KTH OpenNebula: CDMI-Serve (SARA), Local FS
• AWS: CDMI-Proxy, Azure Blob
• AWS: CDMI-Proxy, AWS S3
• Local laptop: CDMI-Proxy, Local FS
• Metadata Store (CouchDB)

Data movement using CDMI:
  1. Get data from 3 sources:
     - local disk via CDMI-proxy
     - AWS via CDMI-proxy
     - local disk via CDMI-Serve (SARA)
  2. Create a new folder in CDMI-proxy (Azure backend).
  3. Upload files to a new container.
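
The three demo steps can be mimicked with in-memory stand-ins. This sketch assumes a tiny endpoint API (`get`, `create_container`, `put`) that is hypothetical. The endpoint names, file names and contents are placeholders and no network calls are made.

```python
class InMemoryEndpoint:
    """Hypothetical stand-in for one CDMI endpoint (proxy or CDMI-Serve)."""

    def __init__(self, name, objects=None):
        self.name = name
        self.containers = {"/": dict(objects or {})}

    def get(self, container, key):
        return self.containers[container][key]

    def create_container(self, container):
        self.containers.setdefault(container, {})

    def put(self, container, key, data):
        self.containers[container][key] = data


# Step 1: fetch one object from each of the three sources.
sources = [
    (InMemoryEndpoint("laptop CDMI-proxy", {"a.txt": b"from local disk"}), "a.txt"),
    (InMemoryEndpoint("AWS CDMI-proxy", {"b.txt": b"from AWS"}), "b.txt"),
    (InMemoryEndpoint("CDMI-Serve", {"c.txt": b"from CDMI-Serve"}), "c.txt"),
]
fetched = {key: endpoint.get("/", key) for endpoint, key in sources}

# Step 2: create a new folder on the proxy that fronts the Azure backend.
target = InMemoryEndpoint("CDMI-proxy (Azure backend)")
target.create_container("/demo/")

# Step 3: upload the fetched files into the new container.
for key, data in fetched.items():
    target.put("/demo/", key, data)
```

Because every endpoint speaks the same interface, the copy loop does not need to know which storage system sits behind each one.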

SLIDE 14

Security

• Crossing of trust domain
• Integration point with in-house
  - Identity providers
  - AuthZ systems

SLIDE 15

Client Side

• Developing CDMI SDKs in .NET, Java and Python, also exported as CLIs
• Integration with EMIC’s Generic Worker and BSC COMP Superscalar
• Community efforts
  - SARA
  - OCCI/CDMI demo from NetApp
  - (More are coming)
• Commercial offerings
  - Mezeo Cloud

SLIDE 16

Status and plans

• Core functionality is getting more mature
  - Supported ADTs: Blobs and Message Queues
  - Extended namespace for 1-level cloud storages (AWS S3, Azure Blob)
• Delivery of the first prototype is due in Autumn 2011
  - Prerelease earlier
• Will not expose document store via CDMI
  - Custom installations at DCIs with a shared security system
  - Will wait for CDMI specification
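
The slides do not say how the namespace is extended for 1-level stores. One common approach, shown here purely as an assumption, is to keep a flat key per object and treat "/" inside the key as a folder separator. The helpers `to_flat` and `list_children` and the sample keys are hypothetical.

```python
def to_flat(path):
    """Map '/container/dir/sub/file' to ('container', 'dir/sub/file')."""
    parts = [p for p in path.split("/") if p]
    if not parts:
        raise ValueError("path must name at least a container")
    return parts[0], "/".join(parts[1:])


def list_children(keys, prefix):
    """Emulate a directory listing over flat keys that share a prefix."""
    if prefix and not prefix.endswith("/"):
        prefix += "/"
    children = set()
    for key in keys:
        if key.startswith(prefix):
            rest = key[len(prefix):]
            if rest:
                head, sep, _ = rest.partition("/")
                # A trailing slash marks an emulated sub-folder.
                children.add(head + sep)
    return sorted(children)


container, key = to_flat("/results/run1/plots/a.png")
keys = ["run1/plots/a.png", "run1/plots/b.png", "run1/log.txt", "run2/log.txt"]
```

With these keys, `list_children(keys, "run1")` yields one file and one emulated folder, so a hierarchical view can be offered on top of a store that only knows containers and flat object names.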

SLIDE 17

Roadmap

• Integration into applications’ workflows
  - Ongoing: bioinformatics, rendering, medical imaging
• Performance and stability testing
• 3rd party transfers with encryption of the content
• Enrichment of data items with (approximate) costs
• Basic accounting + interface to VENUS-C accounting and billing engine
• Dynamic credential passing to allow reuse of personal accounts

SLIDE 18

Technical Details

• CDMI-Proxy core
  - Twisted networking engine (Python)
  - Python 2.5+
• Backends
  - Metadata store: CouchDB (Azure Table, AWS SimpleDB)
  - Blobs: POSIX, Azure Blob, AWS S3, CDMI
  - MQ: AMQP, Azure Queue, AWS SQS, CDMI

SLIDE 19

Thank you!

http://github.com/livenson/vcdm
http://github.com/livenson/libcdmi-java
http://github.com/livenson/libcdmi-python