
National Data Storage - architecture and mechanisms



  1. National Data Storage - architecture and mechanisms. Michał Jankowski, Maciej Brzeźniak, PSNC

  2. Agenda • Introduction • Assumptions • Architecture • Main components • Deployment • Use case

  3. The problem. Data storage: • needs considerable resources (human, software, hardware…) • is complex and expensive • exceeds the abilities of many institutions • is not their core business. Outsourcing the process may be the only, or at least the most reasonable, solution to that problem!

  4. Our projects • KMD (NDS) – National Data Storage – 2007-2009 – R&D project that implemented the software • PLATON-U4 – "Popular backup/archival service" – 2009-2012 – deployment project – target: scientific and academic institutions

  5. Aims • Primary aim: to support the scientific and academic community in protecting and archiving their data • Secondary aims: – physical protection of the data – assuring logical consistency of the data – long-term data archival – tools supporting backup

  6. Our potential customers • Digital libraries • Virtual laboratories • Academic computer centres and network operators • Research institutions • Universities • Clinical hospitals – in total more than 600 organizations, hundreds of TB per year

  7. Design assumptions I • High availability and reliability – geographically distributed storage system with data replication – additional benefit: scalability (performance, storage capacity, number of users) – challenges: consistency, fault tolerance and high performance

  8. Design assumptions II • Focus on the specific system features and functionality pointed out by potential users in a survey: – secondary data storage – data durability and service availability – geographical data replication – no data sharing or exchange capabilities – confidentiality of the data -> dedicated namespaces – automatic replication according to a per-contract policy (sketched below): number of replicas, synchronous/asynchronous mode, allowed physical localizations
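
A minimal sketch of what such a per-contract replication policy could look like as a data structure, following the three parameters listed on the slide; the class and field names are illustrative assumptions, not the NDS code base.

```python
from dataclasses import dataclass, field

@dataclass
class ReplicationPolicy:
    replica_count: int = 2                  # required number of replicas
    synchronous: bool = False               # synchronous vs. asynchronous replication
    allowed_sites: list[str] = field(default_factory=list)  # allowed physical localizations

# Example: a contract requiring two synchronous replicas in two named sites.
policy = ReplicationPolicy(replica_count=2, synchronous=True,
                           allowed_sites=["poznan", "krakow"])
assert policy.replica_count <= len(policy.allowed_sites)
```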

  9. Design assumptions III • Realism about what we are actually able to provide – a stable, production-level service – budget and time limitations

  10. Overall architecture (diagram): users reach the Access Node through access method servers (SSH, HTTPS, WebDAV...); the Access Node exposes virtual file systems for data and metadata and runs the NDS system logic, including replication; the Database Node hosts the metacatalog, users, and accounting & limits databases; Storage Nodes run replica access method servers on top of their local file systems.

  11. Metacatalog • Logical structure of the virtual file system • Attributes and other metadata of files • Mapping of logical files to replicas • History of operations (a sketch of such an entry follows)
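
An illustrative sketch of what one metacatalog entry could hold, following the bullets above; the actual NDS schema is not shown on the slides, so all names here are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class LogicalFile:
    path: str                               # position in the virtual file system tree
    size: int                               # file attributes and other metadata
    owner: str
    replicas: list[str] = field(default_factory=list)                  # locations of replicas
    history: list[tuple[datetime, str]] = field(default_factory=list)  # history of operations

entry = LogicalFile(path="/projects/survey/data.tar", size=10**9, owner="contract-42")
entry.replicas.append("sn-poznan:/pool3/0001")
entry.history.append((datetime.now(), "replica created on sn-poznan"))
```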

  12. Logical separation of namespaces • Each customer's contract is connected with a separate virtual file system (namespace) • Data sharing is not expected by the users • Confidentiality is improved • Logically separate namespaces mean physically separated metacatalogs – improved performance and scalability

  13. Distribution of metadata • Each metacatalog is replicated asynchronously in master-slaves mode (Slony-I) • The number of MC replicas corresponds to the number of replicas of user files • In case of failure of the master MC, one of the slaves is (manually) selected as the new master

  14. Semi-synchronous metadata replication • Used in the synchronous mode of replication of user data • All operations on metadata are synchronously logged to a number of distributed logs • In case of failure, all operations logged between the last update of the slave MC and the failure of the master MC are replayed on the "new" master • As safe as synchronous database replication, but much lighter (sketched below)
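
A minimal, self-contained sketch of the scheme described above: every metadata operation is appended synchronously to several logs, and on master failure a slave replays the operations it has not yet seen. Class and method names are illustrative assumptions, not the NDS implementation.

```python
class MetaCatalog:
    def __init__(self):
        self.ops = []                       # applied metadata operations
    def apply(self, seq, op):
        self.ops.append((seq, op))
    def last_seq(self):
        return self.ops[-1][0] if self.ops else -1

def log_operation(seq, op, logs):
    for log in logs:                        # synchronous part: all logs must accept the entry
        log.append((seq, op))

def promote_slave(slave, logs):
    """Replay operations the asynchronous slave missed, then use it as master."""
    for seq, op in logs[0]:
        if seq > slave.last_seq():
            slave.apply(seq, op)
    return slave

# Usage: the master fails after operation 2; the slave only replicated operation 0.
logs = [[], []]
slave = MetaCatalog()
for seq, op in enumerate(["mkdir /a", "create /a/f", "setattr /a/f"]):
    log_operation(seq, op, logs)
slave.apply(0, "mkdir /a")                  # asynchronous replication lagged behind
new_master = promote_slave(slave, logs)
assert new_master.last_seq() == 2
```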

  15. Users database • Institutions – customers • Contracts and profiles (parameters of the services): – required number and localization of replicas – mode of replication (synchronous, asynchronous) – … • Users (certificates)

  16. Accounting database • Resource usage – statistics – billing • Limits (quota) – a simple quota-check sketch follows
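
A hypothetical sketch of how accounted usage and contract limits could be combined before accepting a write; the lookup structures and names are assumptions for illustration only.

```python
def can_store(contract_id, new_bytes, usage_db, limits_db):
    """Reject a write that would push the contract over its quota."""
    used = usage_db.get(contract_id, 0)     # bytes already accounted for this contract
    quota = limits_db.get(contract_id)      # bytes allowed by the contract (None = unlimited)
    return quota is None or used + new_bytes <= quota

usage_db = {"contract-42": 9 * 10**11}      # 0.9 TB already stored
limits_db = {"contract-42": 10**12}         # 1 TB quota
assert can_store("contract-42", 5 * 10**10, usage_db, limits_db)
assert not can_store("contract-42", 2 * 10**11, usage_db, limits_db)
```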

  17. Data Daemon • Emulates a virtual file system with logical files and directories on the AN • Enforces security and replication policies • Takes into account the output of the monitoring and prediction modules • Produces accounting data • The virtual FS can be accessed in a standard way or via a portal (universal interface!) • Based on FUSE (a simplified sketch follows)
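
The slide only says the Data Daemon is FUSE-based; below is a much-simplified sketch of such a daemon using the fusepy binding (an assumption here), exposing a single read-only logical file backed by an in-memory "replica". Paths and names are hypothetical.

```python
import errno, stat
from fuse import FUSE, Operations, FuseOSError   # pip install fusepy

class DataDaemonSketch(Operations):
    def __init__(self):
        self.files = {"/report.txt": b"replica contents\n"}   # logical file -> data

    def getattr(self, path, fh=None):
        if path == "/":
            return {"st_mode": stat.S_IFDIR | 0o755, "st_nlink": 2}
        if path in self.files:
            return {"st_mode": stat.S_IFREG | 0o444, "st_nlink": 1,
                    "st_size": len(self.files[path])}
        raise FuseOSError(errno.ENOENT)

    def readdir(self, path, fh):
        return [".", ".."] + [name.lstrip("/") for name in self.files]

    def read(self, path, size, offset, fh):
        # A real Data Daemon would select a replica here using monitoring data.
        return self.files[path][offset:offset + size]

if __name__ == "__main__":
    FUSE(DataDaemonSketch(), "/mnt/nds", foreground=True)    # hypothetical mountpoint
```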

  18. Metadata Daemon • Emulates a virtual file system with metadata on the AN • Metadata is placed in special files located in directories corresponding to logical files and directories • The virtual FS can be accessed in a standard way or via a portal • Based on FUSE

  19. System interfaces • Low level: virtual file systems for data and metadata • High level: standard protocols: SSH, HTTPS, WebDAV, GridFTP – limitation: authorization – keyFS

  20. Data access • Typical client software • Specialized portal

  21. Monitoring and prediction • Monitoring of all important elements lets the Data Daemon avoid inaccessible SNs and allows quick reaction by the administrators • Prediction helps with the optimal selection of a replica to read, or of a node on which to write a new replica (sketched below)
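
An illustrative sketch of replica selection driven by monitoring state and a simple load prediction, as described above; the scoring rule and data shapes are assumptions.

```python
def pick_replica(replicas, monitoring, predicted_load):
    """Choose a replica on a reachable, lightly loaded storage node."""
    reachable = [r for r in replicas if monitoring.get(r["node"]) == "up"]
    if not reachable:
        raise RuntimeError("no accessible storage node holds a replica")
    return min(reachable, key=lambda r: predicted_load.get(r["node"], 0.0))

replicas = [{"node": "sn-poznan", "path": "/pool1/0001"},
            {"node": "sn-krakow", "path": "/pool7/0001"}]
monitoring = {"sn-poznan": "down", "sn-krakow": "up"}
predicted_load = {"sn-krakow": 0.3}
assert pick_replica(replicas, monitoring, predicted_load)["node"] == "sn-krakow"
```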

  22. Data storage • HSM (Hierarchical Storage Management)

  23. From the user's point of view…

  24. Scalability • Storage Nodes – access performance – system capacity – data transmission throughput (distributed data traffic) • Access Nodes – responsiveness to I/O requests – data transmission throughput • Metacatalogs – potential bottleneck – maximum number of files/directories – perform file-system-level operations – operation time depends on the database size – transactions -> limited parallel access – sensitive to metadata-intensive operations (backup/archive applications are rather throughput-intensive)

  25. System instantiation • Separate metacatalogs allow for an easy division of the system into many instances • Pools of access nodes and storage nodes may be assigned to the instances • The instances and their elements (metacatalogs, virtual file systems…) may be located on dedicated physical or virtualized servers, or may coexist • The configuration depends on the users' requirements for data and metadata processing efficiency

  26. System deployment infrastructure for PLATON • 12.5 PB of tape storage in 5 localizations • 2 PB of disk storage in 10 localizations • 70 servers, SANs and 10 Gbit Ethernet

  27. Use case – storage of PIONIER network traffic • Store protocol headers for legal requirements • 168 TB/year -> 5.5 MB/s (checked below) • Data collected in >20 geographically distributed PIONIER nodes • 5-year durability (replication required) • Frequent data writes, rare metadata operations • Fits the PLATON environment well: – many data sources -> multiple virtualized ANs – replicas -> multiple SNs – little metadata processing -> may use a shared DBMS
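
A quick sanity check of the slide's throughput figure: 168 TB ingested per year works out to roughly 5.3 MB/s of sustained writes, close to the 5.5 MB/s quoted (the slide may round or use slightly different units).

```python
bytes_per_year = 168 * 10**12               # 168 TB/year of protocol headers
seconds_per_year = 365 * 24 * 3600
throughput_mb_s = bytes_per_year / seconds_per_year / 10**6
print(f"{throughput_mb_s:.1f} MB/s")        # ~5.3 MB/s
```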
