GlobalFS: A Strongly Consistent Multi-Site Filesystem

SLIDE 1

GlobalFS: A Strongly Consistent Multi-Site Filesystem

Leandro Pacheco, Raluca Halalai, Valerio Schiavoni, Fernando Pedone, Etienne Rivière, Pascal Felber
RainbowFS Workshop, May 3rd, 2017

SLIDE 2

Distributed applications

GlobalFS: A Strongly Consistent Multi-Site Filesystem - Leandro Pacheco 2

SLIDE 5

Distributed applications

?

SLIDE 8

Distributed applications

Distributed Storage

  • SQL Databases
  • Key-value storage
  • NoSQL Databases
  • Coordination Systems
  • Caches
  • File Systems

File Systems: easy interoperability for existing applications

SLIDE 9

Global infrastructure

Amazon’s AWS global infrastructure

SLIDE 10

CAP theorem

Weak Consistency
  • Lower latency
  • Higher availability
  • Possibly incorrect/unexpected results

Strong Consistency
  • Clear semantics and guarantees
  • Easier to reason about
  • Block instead of providing incorrect results

SLIDE 11

What is GlobalFS?

  • Geographically distributed filesystem
  • Familiar interface (POSIX)
  • Strong consistency
  • Fault-tolerance through replication
  • Flexible performance through locality

SLIDE 12

Overall design

  • Separate data and metadata
  • Partial replication
  • Metadata protocol exploiting atomic multicast
  • Causal reads

SLIDE 13

Separate data and metadata

Immutable data
  • Variable-sized blobs

Metadata
  • Controls file contents, properties and filesystem structure
  • Refers to data blobs

1 | 2 | 3 | 4 | …
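
The split between immutable blobs and mutable metadata can be illustrated with a small sketch. This is not GlobalFS code: the `BlobStore` and `FileMetadata` names, and the use of SHA-256 content hashes as blob ids, are assumptions made only for illustration.

```python
import hashlib
from dataclasses import dataclass, field


class BlobStore:
    """Hypothetical store for immutable, variable-sized blobs."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # Assumption: the blob id is a content hash, so a stored blob
        # is never modified after it has been written.
        blob_id = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(blob_id, data)
        return blob_id

    def get(self, blob_id: str) -> bytes:
        return self._blobs[blob_id]


@dataclass
class FileMetadata:
    """Hypothetical metadata record: properties plus an ordered list of blob ids."""
    mode: int = 0o644
    blob_ids: list = field(default_factory=list)


def write_file(store, meta, chunks):
    # Data is written first; the metadata is then pointed at the new blobs.
    meta.blob_ids = [store.put(c) for c in chunks]


def read_file(store, meta) -> bytes:
    return b"".join(store.get(b) for b in meta.blob_ids)


store, meta = BlobStore(), FileMetadata()
write_file(store, meta, [b"hello ", b"world"])
old_ids = list(meta.blob_ids)
write_file(store, meta, [b"hello ", b"there"])  # the update only swaps references
```

In this sketch an update never rewrites a blob in place: the old blobs stay readable and only the metadata list changes, which is why only the metadata needs an ordering protocol.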

SLIDE 14

Partial replication

  • Immutable data is simple to replicate consistently
  • Metadata is partitioned between replica groups (i.e., partitions)

SLIDE 20

Partial replication

[Figure: a directory tree (/, bin, etc, home, www, mark, bob, alice) mapped onto three regional partitions (US, SA, EU) and a globally replicated partition]

Local multicast

  • fast updates
  • local or remote reads

Global multicast (global replication)

  • costly updates
  • fast local reads
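
The trade-off between local and global multicast can be sketched as a lookup from a path to the replica groups that must order an update. The partition table below is hypothetical: the slides name the directories and the regions but not which directory lives where, so the assignment is an assumption chosen for illustration.

```python
REGIONS = {"US", "SA", "EU"}

# Hypothetical partition table: the longest matching prefix wins.
PARTITIONS = {
    "/": REGIONS,             # assumed to be globally replicated
    "/home/mark": {"US"},     # assumed local placement
    "/home/bob": {"SA"},
    "/home/alice": {"EU"},
}


def destination_groups(path: str) -> set:
    """Return the replica groups that must order an update to `path`."""
    prefix = path.rstrip("/") or "/"
    while prefix not in PARTITIONS:
        # Drop the last path component and try the parent directory.
        prefix = prefix.rsplit("/", 1)[0] or "/"
    return PARTITIONS[prefix]


def multicast_kind(path: str) -> str:
    """'local' if a single region orders the update, 'global' if all regions do."""
    return "global" if destination_groups(path) == REGIONS else "local"
```

Under these assumptions, `multicast_kind("/home/bob/notes.txt")` is `"local"` (fast updates, possibly remote reads), while `multicast_kind("/etc/passwd")` is `"global"` (costly updates, fast local reads everywhere).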

SLIDE 21

Partial ordering

  • GlobalFS exploits atomic multicast
  • Atomic delivery to groups of processes
  • Partial ordering: messages for different groups don’t have to be ordered with respect to each other
  • Partial ordering is critical for scalability
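
One way to make the partial-ordering property concrete is a checker over per-group delivery logs. This is only a sketch of the property, not of how atomic multicast is implemented, and it tests just the pairwise condition; the function name and log format are assumptions.

```python
from itertools import combinations


def consistent_partial_order(logs: dict) -> bool:
    """Check that any two groups deliver their common messages in the same order.

    `logs` maps a group name to the list of message ids it delivered.
    Messages delivered by only one group are not constrained.
    """
    for a, b in combinations(logs, 2):
        common = set(logs[a]) & set(logs[b])
        order_a = [m for m in logs[a] if m in common]
        order_b = [m for m in logs[b] if m in common]
        if order_a != order_b:
            return False
    return True


# m1 and m2 are addressed to both groups; x is local to US and y is local to EU.
ok = {"US": ["x", "m1", "m2"], "EU": ["m1", "y", "m2"]}
bad = {"US": ["m1", "m2"], "EU": ["m2", "m1"]}
```

Here `ok` passes even though `x` and `y` are interleaved differently at each group, because they are addressed to different groups; `bad` fails because the two groups disagree on the order of messages they both deliver.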

SLIDE 22

Architecture

[Figure: architecture diagram. Components: Application, Client (FUSE), Data store, Metadata replicas, Atomic multicast. Arrow labels: "Send read or update commands", "Insert or fetch immutable data".]

SLIDE 24

Consistent update operations

Step 1: Write data blobs to data store
Step 2: Issue a metadata update

Three kinds of metadata update (request/reply diagrams over groups G1 and G2):

  • Single-partition, e.g., write to file in G1
  • Uncoordinated multi-partition, e.g., write to file in {G1,G2}
  • Coordinated multi-partition, e.g., move file from G1 to G2

[Figure legend: Atomic Multicast, Execution]
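
The three kinds of update can be told apart with a small classifier. This is a sketch under an assumption: the slides give only examples, so the `needs_exchange` flag (whether the groups must exchange state before executing) is a hypothetical way to separate the coordinated from the uncoordinated case, and the `Update` type is invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    name: str
    groups: frozenset        # replica groups the command is multicast to
    needs_exchange: bool     # assumption: groups must exchange state before executing


def classify(update: Update) -> str:
    if len(update.groups) == 1:
        return "single-partition"
    if update.needs_exchange:
        return "coordinated multi-partition"
    return "uncoordinated multi-partition"


# The three examples from the slide, expressed with the hypothetical type above.
write_local = Update("write to file in G1", frozenset({"G1"}), False)
write_shared = Update("write to file in {G1,G2}", frozenset({"G1", "G2"}), False)
move_file = Update("move file from G1 to G2", frozenset({"G1", "G2"}), True)
```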

SLIDE 27

Causal read operations

Causally related updates are seen in the same order, e.g., operations done by the same client

Client A
  • Creates an image cat.jpg
  • Modifies a page pets.html to include the image cat.jpg

Client B
  • Opens the pets.html page and finds a broken image reference
  • Where is the cat?

SLIDE 28

Causal read operations

Step 1: Contact a metadata replica for a list of blob ids
Step 2: Get the data from the data store

  • Approach inspired by vector clocks
  • Vector is composed of one counter per replica group
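
A minimal sketch of a vector-based causal read check follows. It assumes the textbook rule (a replica may serve a read only if its vector dominates the client's), which may differ from the actual GlobalFS protocol; the group names, function names and counter values are hypothetical.

```python
GROUPS = ("US", "SA", "EU")   # one counter per replica group


def dominates(replica: dict, client: dict) -> bool:
    """True if the replica has applied everything the client has observed."""
    return all(replica.get(g, 0) >= client.get(g, 0) for g in GROUPS)


def merge(a: dict, b: dict) -> dict:
    """Component-wise maximum of two vectors."""
    return {g: max(a.get(g, 0), b.get(g, 0)) for g in GROUPS}


def causal_read(replica_vector: dict, client_vector: dict):
    """Return the client's new vector, or None if the replica is still behind."""
    if not dominates(replica_vector, client_vector):
        return None   # the caller would wait, or retry at another replica
    return merge(client_vector, replica_vector)


# The client has already observed update 4 from the EU group.
client_b = {"US": 2, "SA": 0, "EU": 4}
stale_replica = {"US": 2, "SA": 0, "EU": 3}
fresh_replica = {"US": 2, "SA": 1, "EU": 4}
```

With these values the stale replica is rejected, so the client cannot read a state older than one it has already seen, while the fresh replica serves the read and the client's vector advances to the component-wise maximum.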

SLIDE 29

Evaluation

  • Complete prototype in Java: https://github.com/pacheco/GlobalFS
  • Filesystem in Userspace (FUSE)
  • URingPaxos for atomic multicast
  • Global deployment using Amazon EC2

SLIDE 30

Maximum throughput by operation

[Figure: "GlobalFS throughput" bar charts in operations/sec. One panel shows read 1KB, local create 1KB and local write 1KB on an axis up to 60000; the other shows glob. create 1KB and glob. write 1KB on an axis up to 1800. Legend: GlobalFS, CalvinFS. Label: Locality.]

3 region deployment: US west, US east and Europe

SLIDE 31

Geographical scalability

[Figure: "Geographical Scalability" bar chart. Normalized throughput (0 to 1) for read 1KB, create and write 1KB with 1, 3, 6 and 9 regions, shown against an "Ideal" reference. Bar annotations: 1681 ops, 6882 ops, 372 ops.]

  • Normalized throughput per region as more regions are added
  • 9 regions uses all EC2 regions available at the time

SLIDE 33

GlobalFS: Summary

  • Strong consistency at global scale
  • Simple and familiar API (POSIX)
  • Flexible performance through partial replication and locality
  • Cheap causal read operations

Thank you!

Leandro Pacheco pachecol@usi.ch
