SLIDE 1

National Data Storage 2

Secure sharing, publishing and exchanging data

Maciej Brzeźniak, Norbert Meyer, Michał Jankowski, Gracjan Jankowski
Supercomputing Department, PSNC

This work is funded under the National Data Storage 2 project (2011-2013), project number NR02-0025-10/2011. http://nds.psnc.pl

Full Polish name of the project: System bezpiecznego przechowywania i współdzielenia danych oraz składowania kopii zapasowych i archiwalnych w Krajowym Magazynie Danych (in English: System for secure storage and sharing of data and for storing backup and archival copies in the National Data Storage)

SLIDE 2

Agenda

  • Context: NDS1 & PLATON Popular Archive Service
  • Why is version 2 of NDS needed?
  • NDS2:
    – New functionality:
      • secure sharing, publishing and exchanging files
      • versioning, point-in-time recovery
    – New features:
      • enhanced security
      • performance scalability
      • multi-user readiness
  • Some observations / open issues

SLIDE 3

Projects – where are we?

[Timeline figure covering 2007-2013; the exact start and end of each phase is not recoverable from the extracted text. Phases shown:]

  • R&D: NDS design & implementation
  • Deployment of NDS in the infrastructure & the service operation
  • POPULAR ARCHIVE SERVICE: tenders, internal tests of the NDS system, tests with users, production
  • R&D: NDS2 design & implementation

SLIDE 4

NDS1 – aims and focus

High-level aim:

To support the scientific and academic community in protecting and archiving data

Detailed aims:

  – Addressing secondary storage applications:
    • Long-term data archival
    • Short-term backup
  – Assumptions:
    • people have their own primary storage
    • people use other tools for data exchange and content management (CM)

[Diagram: a user with local storage reaches the NDS system over the network; the CMS (a black box) and the data exchange tools sit outside NDS]

SLIDE 5

NDS1 Design assumptions

  • Focus on specific system features and functionality:
    – Long-term data durability and consistency:
      • Physical protection of the data
      • Replication + safe storage
      • Keeping the data consistent
    – Confidentiality and safety of the data:
      • To be supported (not able to solve all issues)
    – Easy usage:
      • Standard access methods
      • Possible integration with existing tools
      • Transparent data replication
    – Stable & reliable product and service!
      • HA, trust...
SLIDE 6

NDS1 – Features & challenges

  • HA:
    – Geographically distributed system
    – Synchronous and asynchronous replication (reliability vs performance)
  • Scalability:
    – performance
    – storage capacity
    – number of users
  • Challenges:
    – Fault tolerance
    – Consistency vs high performance = ?

SLIDE 7

NDS1 – Architecture

[Architecture diagram. Components shown:]

  • NDS1 user application
  • Access Node: access methods servers (SSH, HTTPS, WebDAV...), virtual filesystem for data and meta-data (FUSE), NDS system logic
  • Database Node: meta-data DB, users DB, accounting & limits DB; a slave meta-data DB fed by replication
  • Storage Nodes: replica access methods servers, Storage Node file system, HSM system

SLIDE 8

NDS system – Architecture comments

  • Data durability and service availability:
    – Sync & async replication
    – Multiple data access & storage sites
    – Monitoring & fault detection
    – Limits: no data consistency checking inside the system at the moment
  • Meta-data durability and consistency:
    – Multiple meta-data database instances
    – Semi-synchronous replication of meta-data

[Diagram: User, Access Node, Storage Node. The master meta-data DB feeds the slave meta-data DB in two ways: synchronous replication of operation logs (NDS mechanisms) and asynchronous transaction replication]
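A minimal Python sketch of the semi-synchronous idea on this slide. It assumes (as one reading of the diagram, not a description of the actual NDS code) that a write is acknowledged once the slave has received the operation log entry, while the slave applies the entries later. All class and method names are hypothetical.

```python
class SlaveMetaDB:
    """Hypothetical slave: receives log entries at once, applies them later."""

    def __init__(self):
        self.log = []       # operation log, received synchronously
        self.applied = 0    # number of log entries already applied
        self.data = {}

    def receive(self, entry):
        # Cheap step that happens before the master acknowledges the write.
        self.log.append(entry)

    def apply_pending(self):
        # Asynchronous part: a real system would run this in the background.
        for path, value in self.log[self.applied:]:
            self.data[path] = value
        self.applied = len(self.log)


class MasterMetaDB:
    """Hypothetical master that ships every operation to the slave's log."""

    def __init__(self, slave):
        self.slave = slave
        self.data = {}

    def write(self, path, value):
        self.data[path] = value
        self.slave.receive((path, value))   # synchronous log shipping
        return True                         # ack: the entry exists on both sides


slave = SlaveMetaDB()
master = MasterMetaDB(slave)
master.write("/home/maciej/d", {"size": 42})
lag = len(slave.log) - slave.applied   # entries the slave still has to apply
slave.apply_pending()
```

The point of the split is that a master failure right after the acknowledgement loses nothing, because the slave can replay its log, while the master does not wait for the slave to finish applying each change.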

SLIDE 9

NDS – Architecture comments/limits

  • Data confidentiality:
    – Dedicated name spaces
    – Data sharing possible among designated users, limited to a given institution/profile
    – Limit:
      • No support for secure data sharing among institutions
    – NDS1 uses encryption where possible, which means: not everywhere!
      • Data access:
        – SCP/SFTP, HTTPS, WebDAV over HTTPS
      • Storage:
        – Encryption-enabled tapes (in fact external to the system)
    – Encryption outside the system:
      • supported by the client application (details later)
    – System-side encryption & data consistency checks to be considered, to increase the security of data not encrypted by the user

[Diagram: logical or physical separation of name spaces: User1 with meta-data DB1, User2 with meta-data DB2]

SLIDE 10

NDS – Architecture comments/limits

  • Client-side encryption & automation:
    – User-side backup/archive (B/A) application that supports security and automation
    – Limits: user-side encryption is CPU-intensive
    – Some hardware-aided solutions might be necessary for users having a lot of data
    – Additional tools needed:
      • Management features for keys
      • Automation of security-related features

[Diagram: user data → NDS B/A application (on-the-fly encryption and checksums) → user’s data copy in the NDS/PLATON-U4 service]

SLIDE 11

NDS – Architecture comments/limits (3)

  • Scalability:
    – Performance:
      • Many Access Nodes (ANs) and Storage Nodes (SNs)
      • Many storage devices
      • Data access optimisation:
        – Load balancing
        – Monitoring
    – Limits:
      • Metadata handling is... centralised for a given logical name space!
      • Consistency vs performance...
    – Storage capacity:
      • Many SNs
      • Many storage devices
      • Cost-effective approach: HSM as the storage backend

[Diagram: User, Access Node, Storage Node, HSM system]

SLIDE 12

NDS – Architecture comments/limits (4)

  • Scalability:
    – Number of users:
      • We can configure multiple system instances when the single-system limit is reached
      • The architecture is virtualization-ready
    – Limits:
      • The more users, the more metadata and the more complicated the user management
      • No real experience from the production system yet
      • Some level of user management de-centralisation is needed
SLIDE 13

NDS – Architecture comments/limits (5)

  • Ease of integration / usage:
    – Standard user interfaces:
      • We support: SCP, HTTP/WebDAV, GridFTP
      • Integration with existing tools is easy
      • NDS logic details are hidden from the user
    – Limitations:
      • No ‘special features’ for users through standard interfaces (except the meta-data fs)
      • Extra features are to be provided by an additional tool / interface:
        – Client backup/archive application
        – Web/GUI interface
      • E.g. no advanced tools to manage ACLs and sharing
    – Single sign-on:
      • Based on X.509 certificates stored in LDAP
      • Keys and certificates are distributed automatically to the access methods servers (sshd, apache, gridftp) and converted to the appropriate format on-the-fly by the KeyFS solution

[Diagram: user data → user backup/archive software → Access Node (access methods servers: SSH, HTTPS, WebDAV...; virtual filesystem for data and meta-data (FUSE); NDS system logic). X.509 credentials reach the Access Node through KeyFS]

SLIDE 14

NDS2 – summary of issues to address

  • In NDS2 we need to address (functionalities):
    – Advanced features for long-term B/A:
      • Versioning – point-in-time recovery
    – Security and data safety related:
      • Data consistency checks
      • Strong and efficient encryption:
        – on the client side (hardware aid, automated + tools)
    – Sharing:
      • Inside NDS (some trust in users assumed)
      • NDS <-> external world (one side of sharing not trusted)
    – Publishing data using our infrastructure:
      • e.g. for Digital Libraries
      • they already store archives in NDS
  • Extra functionalities to be offered by an extended (non-standard) interface
    – e.g. version management
  • We still keep standard interfaces working
SLIDE 15

NDS2 – summary of issues to address

  • We need to address (features):
    – Scalability:
      • Deal with metadata handling scalability
      • but keep consistency untouched!
      • a common logical view is needed for all users who are going to share data
      • => De-centralise logical name space management
    – De-centralise user management:
      • hierarchical management (not covered in this presentation)
SLIDE 16

NDS1 – Scalability improvements

  • De-centralised logical name space management:
    – Step 1: divide the namespace into parts distributed across multiple metadata DBs (dCache-like approach?)

[Diagram: Database Node (meta-data DB, users DB, accounting & limits DB); the namespace rooted at / is split across meta-data DB A, meta-data DB B and meta-data DB C]

  ++ load distribution
  ++ consistency
  -- single point of failure
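Step 1 can be sketched in a few lines of Python. The mapping of subtrees to databases below is invented for illustration (the slide does not say how the namespace is cut), and `metadata_db_for` is a hypothetical helper, not an NDS function.

```python
# Hypothetical assignment of namespace subtrees to metadata DBs A, B and C.
SHARDS = {
    "/home": "meta-data DB A",
    "/projects": "meta-data DB B",
    "/archive": "meta-data DB C",
}
ROOT_DB = "root meta-data DB"   # holds "/" and everything not delegated


def metadata_db_for(path: str) -> str:
    """Return the metadata DB responsible for `path` (longest prefix wins)."""
    best = ""
    for prefix in SHARDS:
        inside = path == prefix or path.startswith(prefix + "/")
        if inside and len(prefix) > len(best):
            best = prefix
    return SHARDS.get(best, ROOT_DB)
```

Because every path maps to exactly one database, consistency is kept while the load is spread; the database that holds the root mapping remains the single point of failure noted on the slide.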
SLIDE 17

NDS1 – Scalability improvements

  • De-centralised logical name space management:
    – Step 2: combine distribution with replication

[Diagram: Database Node (meta-data DB, users DB, accounting & limits DB); the namespace rooted at / is split across meta-data DBs A, B and C (instance 1 each); second instances (instance 2) are kept up to date through SSR]

  ++ load distribution
  ++ consistency
  ++ no single point of failure

SSR – Semi-Synchronous Replication of meta-data

SLIDE 18

NDS1 – Scalability improvements

  • De-centralised logical name space management:
    – Step 3: combine distribution with replication + provide automated failover

[Diagram: Database Node (meta-data DB, users DB, accounting & limits DB); the namespace rooted at / is split across meta-data DBs A, B and C (instance 1 each); second instances (instance 2) are kept up to date through SSR]

  ++ load distribution
  ++ consistency
  ++ no single point of failure
  ++ automated failover

SSR – Semi-Synchronous Replication of meta-data
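A minimal Python sketch of Step 3, assuming each namespace part is served by two instances and that failover simply promotes the second instance when the first stops answering. The classes are hypothetical and the replication call is only a stand-in for SSR.

```python
class MetaDBInstance:
    """Hypothetical handle to one instance of a metadata DB."""

    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive
        self.rows = {}

    def put(self, key, value):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        self.rows[key] = value


class ReplicatedShard:
    """One namespace part served by instance 1, with instance 2 as standby."""

    def __init__(self, primary, standby):
        self.primary, self.standby = primary, standby

    def put(self, key, value):
        try:
            self.primary.put(key, value)
        except ConnectionError:
            # Automated failover: promote the standby and retry once.
            self.primary, self.standby = self.standby, self.primary
            self.primary.put(key, value)
            return
        if self.standby.alive:
            self.standby.put(key, value)   # stand-in for SSR


shard_b = ReplicatedShard(MetaDBInstance("B/1"), MetaDBInstance("B/2"))
shard_b.put("/projects/x", "v1")
shard_b.primary.alive = False        # simulate a crash of instance 1
shard_b.put("/projects/y", "v2")     # served by the promoted instance 2
```

Callers keep using the same `ReplicatedShard` object before and after the crash, which is what makes the failover automated from their point of view.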

SLIDE 19

NDS2 – New functionalities

  • In NDS2 we need to address (functionalities):
    – Advanced features for long-term B/A:
      • Versioning – point-in-time recovery
    – Security and data safety related:
      • Data consistency checks
      • Strong and efficient encryption:
        – on the client side (hardware aid, automated, + tools)
    – Sharing:
      • Inside NDS (some trust in users assumed)
      • NDS <-> external world (one side of sharing not trusted)
    – Publishing data using our infrastructure:
      • e.g. for Digital Libraries
      • they already store archives in NDS
  • Extra functionalities to be offered by an extended (non-standard) interface
    – e.g. version management
  • We still keep standard interfaces working
SLIDE 20

NDS2 – Versioning + Point-in-time recovery

[Diagram: NDS virtual filesystem structure (directories A-H, files a-h) before and after open(..., ..., O_RDWR) on file d; after the call the tree also contains H’ and d’]

  • Versioning is:
    – Transparent to users:
      • they may access old versions through a special directory or mountpoint, in the same way (protocol, access point) as the current replica
    – Efficient:
      • We can use one of many existing replicas to handle current operations immediately
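Point-in-time recovery can be illustrated with a small Python sketch. It assumes that every write creates a new version stamped with its write time; the class below is hypothetical and says nothing about how NDS2 stores versions internally.

```python
import bisect


class VersionedFile:
    """Keeps every written version together with the time it was written."""

    def __init__(self):
        self._times = []      # write timestamps, in increasing order
        self._contents = []   # _contents[i] was written at _times[i]

    def write(self, timestamp, content):
        # Assumes timestamps arrive in increasing order.
        self._times.append(timestamp)
        self._contents.append(content)

    def read(self, at=None):
        """Return the current version, or the version valid at time `at`."""
        if at is None:
            return self._contents[-1]
        i = bisect.bisect_right(self._times, at)
        if i == 0:
            raise FileNotFoundError("file did not exist at that time")
        return self._contents[i - 1]


d = VersionedFile()
d.write(100, b"first draft")
d.write(200, b"second draft")
```

Here `d.read()` plays the role of the normal access path and `d.read(at=150)` the role of the special directory or mountpoint that exposes old versions.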
SLIDE 21

NDS2 – Security and data safety functions

  • Data consistency checks:
    – Cryptographic digests are calculated for replicas and put into the Metadata DB
    – They can be presented to users (to be compared, manually or automatically)
    – Background checks are made periodically (at least while storing and retrieving the file)
    – Digests are calculated close to the data (on the storage nodes)
    – Redundancy coding to be considered

[Architecture diagram: the same components as in the NDS1 architecture (NDS2 user application, Access Node, Database Node, Storage Nodes with HSM systems, replication), extended with two CC daemons and a CC control component]
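The consistency check described on this slide can be sketched with Python's standard `hashlib`. SHA-256 is an assumption (the slide does not name an algorithm), and the dictionaries below merely stand in for the storage nodes and the Metadata DB.

```python
import hashlib


def digest(data: bytes) -> str:
    """Digest computed close to the data, e.g. on a storage node."""
    return hashlib.sha256(data).hexdigest()


def store(replicas: dict, metadata_db: dict, name: str, data: bytes) -> None:
    """Write the same data to every replica and record its digest."""
    for replica in replicas.values():
        replica[name] = data
    metadata_db[name] = digest(data)


def check(replicas: dict, metadata_db: dict, name: str) -> list:
    """Return the sites whose replica no longer matches the recorded digest."""
    expected = metadata_db[name]
    return [site for site, replica in replicas.items()
            if digest(replica[name]) != expected]


replicas = {"site-1": {}, "site-2": {}}
metadata_db = {}
store(replicas, metadata_db, "d", b"archive payload")
replicas["site-2"]["d"] = b"archive pay1oad"   # simulate silent corruption
damaged = check(replicas, metadata_db, "d")
```

A background job could call `check` periodically and repair a damaged replica from a healthy one; the recorded digest can also be shown to the user for comparison with a locally computed value.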

SLIDE 22

NDS2 – Security and data safety functions

  • Strong and efficient encryption – on the client side:
    – Software-based – application for individual users
    – Hardware-aided – appliance for those having really huge amounts of data
    – Both approaches automated + tools provided

[Diagram, software-based: user data → NDS B/A application (on-the-fly encryption and checksums) → SSH, HTTPS, WebDAV... → user’s encrypted data copy in the NDS2 service]

[Diagram, hardware-aided: user data → data exporting method (say CIFS/NFS) → NDS B/A ‘APPLIANCE’ (on-the-fly encryption and checksums aided by hardware; safe key storage) → SSH, HTTPS, WebDAV... → user’s encrypted data copy in the NDS2 service]

SLIDE 23

NDS2 – Secure sharing and publishing

  • Secure sharing of data inside the NDS2 system

[Diagram: NDS virtual filesystem structure (directories A-H, files a-h). Maciej, a trusted NDS2 user, owns /home/maciej; Jan, a trusted NDS2 user, owns /home/jan. Sharing: copy (or link) file d into a SHARING area and set the ACL]

  + setting the ACLs enables the other user to SEE the file
  + key exchange is needed to really READ its contents
  + again, we can use multiple replicas to handle modifications

SLIDE 24

NDS2 – Secure sharing and publishing

  • Secure sharing of data with the external world (FileSender-like use case)

[Diagram: NDS virtual filesystem structure (directories A-H, files a-h) with /home/maciej. Maciej, a trusted NDS2 user, copies file d to the publishing sandbox, a sandboxed part of the system running simplified NDS logic. Replication and decryption happen on the system side; the user is not involved! A (‘secured’ or NOT) URL is provided to Brian, an UNTRUSTED user, plus some additional info (user/login, key for the file...). Brian can then access the file]

  + sharing effective
  - Maciej discloses the file to the system
  + Maciej’s secret key is still not disclosed
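One possible reading of a ‘secured’ URL is a link that carries an unguessable random token. The Python sketch below illustrates only that reading; the base address, the in-memory table and both function names are hypothetical, and the slide does not say how NDS2 builds its URLs.

```python
import secrets

SANDBOX_BASE = "https://sandbox.example.org/get/"   # hypothetical address
_published = {}   # token -> file name inside the sandbox


def publish(file_name: str) -> str:
    """Register a sandbox copy and return an unguessable URL for it."""
    token = secrets.token_urlsafe(32)
    _published[token] = file_name
    return SANDBOX_BASE + token


def resolve(url: str):
    """Return the file behind a URL, or None if the token is unknown."""
    if not url.startswith(SANDBOX_BASE):
        return None
    return _published.get(url[len(SANDBOX_BASE):])


url_for_brian = publish("d")
```

Only the sandbox copy is reachable through such a link, so the untrusted side never touches the trusted part of the system or Maciej's keys.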

SLIDE 25

NDS2 – Secure sharing and publishing

  • Secure publishing... (a special case of sharing)

[Diagram: NDS virtual filesystem structure (directories A-H, files a-h) with /home/maciej. Maciej, a trusted NDS2 user, copies file d to the publishing sandbox, a sandboxed part of the system running simplified NDS logic. Replication and decryption happen on the system side; the user is not involved! A public URL is generated and published, and anonymous UNTRUSTED users can access the file]

  + publishing effective
  + multiple replicas can be effectively served by multiple servers (e.g. Apache) to many users
  - Maciej discloses the file to the system

SLIDE 26

NDS2 – Secure sharing and publishing

  • Secure publishing... (practical use case for digital libraries)

[Diagram; the exact flows are not recoverable from the extracted text. Elements shown:]

  • User / CMS operator, data source, CMS, end user (publication reader)
  • MASTER data and meta-data
  • NDS2 services: storage/archival of MASTER versions, storage of presentation versions
  • High-res (HR) presentation versions, ‘publication’ (label A)
  • ‘Regular’ low-res (LR) presentation versions (label B)
  • NDS2 sandbox holding HR and LR versions

  + link to the high-res presentation version is put on the library portal

SLIDE 27

NDS2 – Secure sharing and publishing

  • Some observations:
    – Proper key management mechanisms needed:
      • The user (e.g. Maciej) owns the private/public ‘master’ key pair
      • Each stored file is encrypted using a separate, per-file symmetric key
      • Symmetric keys are safely stored in the NDS2 Metadata DB (encrypted with the master key known only to the user)
    – Sharing steps:
      • Sharing the file physically (ACL...) – enables the target user to see the file
      • Retrieving the file’s symmetric key from the system
      • Decrypting the file’s key (on the user side)
      • Passing the decrypted file’s key to the target user
      • Decrypting the file on the target user’s side – enables the target user to understand the file contents
    – Publishing steps:
      • Retrieving the file’s symmetric key from the system
      • Passing the key, decrypted on the user side, back to the system
      • The system copies the file to the sandbox (and decrypts it)
      • The others can see the file and understand it
    – Sharing management and key handling tools should be provided to users
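The key handling above follows an envelope pattern: a per-file key encrypts the data and the owner's master key encrypts the per-file key. The Python sketch below shows only that data flow. Two simplifications are assumptions made to keep it standard-library only: `toy_cipher` is an insecure XOR stand-in for a real symmetric cipher, and a single symmetric secret replaces the private/public master key pair. All names are hypothetical.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with a SHA-256 keystream. Illustration only, NOT secure.
    Applying it twice with the same key restores the input."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


metadata_db = {}   # file name -> wrapped (encrypted) file key
storage = {}       # file name -> encrypted file contents


def store_file(master_key: bytes, name: str, plaintext: bytes) -> None:
    file_key = secrets.token_bytes(32)                    # separate key per file
    storage[name] = toy_cipher(file_key, plaintext)
    metadata_db[name] = toy_cipher(master_key, file_key)  # wrapped file key


def unwrap_file_key(master_key: bytes, name: str) -> bytes:
    """Runs on the owner's side: the master key never leaves the user."""
    return toy_cipher(master_key, metadata_db[name])


maciej_master = secrets.token_bytes(32)
store_file(maciej_master, "d", b"research data")

# Sharing: Maciej unwraps the file key and passes only that key to Jan.
key_for_jan = unwrap_file_key(maciej_master, "d")
jan_view = toy_cipher(key_for_jan, storage["d"])
```

Handing over `key_for_jan` exposes one file only. Publishing follows the same path, except that the unwrapped file key goes to the system, which decrypts the copy placed in the sandbox.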
SLIDE 28

NDS2 – New features vs scalability

  • Further observations / open issues:
    – Improving system scalability is necessary to implement the new features:
      • internal sharing requires a ‘single’ logical view of the shared structure
      • additional metadata processing
    – Two versions of NDS2 logic:
      • Regular – for the trusted part of the system, full-featured
      • Light – for the sandbox – hack-proof, not full-featured, simple!
      • Q: are we going to end up with two completely different products?
    – We have to consider new interfaces:
      • Sandbox’s public interface for downloading/uploading data
      • What about reusing part of the FileSender project findings?
    – New features require going beyond standard interfaces:
      • Web GUI
      • (Java?) client-side application
SLIDE 29

NDS2 – Summary

  • In NDS2 we try to:
    – Address more sophisticated B/A-related needs:
      • Provide transparent versioning
      • Increase security (encryption) and performance (hardware-aided)
    – Improve overall system scalability
    – Provide sharing facilities:
      • Internal sharing – 2 lines of defence – ACLs on file access + keys for contents
      • FileSender-like features for NDS users
    – Support publishing:
      • ‘Private’ publishing use case
      • ‘Massive’ publishing use case
SLIDE 30

THANK YOU!

More information: nds.psnc.pl

Note: most of the pictures used in the presentation come from the sxc.hu service