A P2P Dropbox, a presentation by @mafintosh



slide-1
SLIDE 1

A P2P Dropbox

slide-2
SLIDE 2

@mafintosh

slide-3
SLIDE 3
slide-4
SLIDE 4
slide-5
SLIDE 5

8 person team

slide-6
SLIDE 6

Based in 5 countries

slide-7
SLIDE 7

>1500 npm modules

slide-8
SLIDE 8

>1500 npm modules

(~0.5% of npm)

slide-9
SLIDE 9

We make tools that help scientists
 share data

slide-10
SLIDE 10

We make tools that help scientists
 share data

(and other people as well)

slide-11
SLIDE 11

Data === Files

slide-12
SLIDE 12

Existing great file sharing tools

slide-13
SLIDE 13
Dropbox:
  • Extremely easy to use
  • Centralised / High cost
  • Who owns the data?
  • Sustainable?
slide-14
SLIDE 14
BitTorrent:
  • Decentralised / P2P
  • Massively adopted / Simple protocol
  • Only works for static files
  • Scales poorly on really big data sets
  • No diffs
slide-15
SLIDE 15

We can do better

slide-16
SLIDE 16
  • Easy to use, but not centralised like Dropbox
  • Decentralised / P2P but not for piracy like BitTorrent
  • Built for modern use cases
slide-17
SLIDE 17
  • Easy to use, but not centralised like Dropbox
  • Decentralised / P2P but not for piracy like BitTorrent
  • Built for modern (scientific) use cases
slide-18
SLIDE 18

A next generation file sharing tool

slide-19
SLIDE 19

Real time / Live data

(get only the data you need and get updates when it changes)

slide-20
SLIDE 20

Decentralised

(no servers / data centers needed, actually serverless)

slide-21
SLIDE 21

Diffable

(sharing two similar data sets should only share the diff)

slide-22
SLIDE 22

npm install -g dat

slide-23
SLIDE 23
slide-24
SLIDE 24

Append only logs

slide-25
SLIDE 25

Append only logs

(a list of data you only ever append to, get it?)

slide-26
SLIDE 26

Append only lists (usually called "logs")

(a list of data you only ever append to, get it?)

slide-27
SLIDE 27

Data item #0 (Append item to list)

slide-28
SLIDE 28

[Data item #0] [Data item #1] (Append item to list)

slide-29
SLIDE 29

[Data item #0] [Data item #1] [Data item #2] (Append item to list)

slide-30
SLIDE 30

Why “Append Only Logs”?

slide-31
SLIDE 31
  • A simple data structure
  • Immutable
  • Logical ordering
  • Easy to digest / index
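
The properties above can be sketched with a few lines of JavaScript. This is a minimal in-memory illustration, not the hypercore API; the class name AppendOnlyLog and its methods are made up for this example.

```javascript
// Minimal in-memory append-only log (illustrative sketch only).
class AppendOnlyLog {
  constructor () {
    this.items = []
  }

  // Append an item and return its index (its place in the logical ordering).
  append (data) {
    // Entries are frozen so they cannot be modified after insertion.
    this.items.push(Object.freeze({ index: this.items.length, data }))
    return this.items.length - 1
  }

  // Read an existing entry. There is deliberately no update or delete.
  get (index) {
    return this.items[index]
  }

  get length () {
    return this.items.length
  }
}

const log = new AppendOnlyLog()
log.append('Data item #0')
log.append('Data item #1')
const last = log.append('Data item #2')
```

Because entries never move or change, an index built over the log only ever has to process the items appended since it last looked.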
slide-32
SLIDE 32

How can we share append only logs?

slide-33
SLIDE 33

How can we share append only logs?

(over a p2p network where we don’t trust other people)

slide-34
SLIDE 34

Merkle Trees

slide-35
SLIDE 35

Merkle Trees

(a tree structure that verifies data)

slide-36
SLIDE 36

Merkle Trees

(a tree structure that verifies data) (unrelated to Angela Merkel)

slide-37
SLIDE 37

Merkle Trees

(a tree structure that verifies data) (unrelated to Angela Merkel)

slide-38
SLIDE 38

Data #0

slide-39
SLIDE 39

Hash #0 (Data #0) = Root hash #0

slide-40
SLIDE 40

Hash #1 = Root hash #1
  Hash #0 (Data #0)
  Hash #2 (Data #1)

slide-41
SLIDE 41

Root hash #2 (covers Hash #1 and Hash #4)
  Hash #1
    Hash #0 (Data #0)
    Hash #2 (Data #1)
  Hash #4 (Data #2)

slide-42
SLIDE 42

Hash #3 = Root hash #3
  Hash #1
    Hash #0 (Data #0)
    Hash #2 (Data #1)
  Hash #5
    Hash #4 (Data #2)
    Hash #6 (Data #3)

slide-43
SLIDE 43

Root hash #3 verifies all the data
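
A rough sketch of how one root hash can cover all the data, using Node's built-in crypto module. SHA-256 and the 'leaf:' / 'parent:' prefixes are assumptions made for this example; the slides do not say which hash function or encoding hypercore uses.

```javascript
const crypto = require('crypto')

// Hash of a data chunk (a leaf of the tree).
function leafHash (data) {
  return crypto.createHash('sha256').update('leaf:').update(data).digest('hex')
}

// Hash of two child hashes (an inner node of the tree).
function parentHash (left, right) {
  return crypto.createHash('sha256').update('parent:').update(left).update(right).digest('hex')
}

// Compute the root for a list of chunks whose length is a power of two.
function merkleRoot (chunks) {
  let level = chunks.map(leafHash)
  while (level.length > 1) {
    const next = []
    for (let i = 0; i < level.length; i += 2) {
      next.push(parentHash(level[i], level[i + 1]))
    }
    level = next
  }
  return level[0]
}

const root = merkleRoot(['Data #0', 'Data #1', 'Data #2', 'Data #3'])
const tampered = merkleRoot(['Data #0', 'Data #1', 'Data #2 (tampered)', 'Data #3'])

// Any change to any chunk produces a different root.
console.log(root === tampered) // false
```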

slide-44
SLIDE 44

👪 wants to share Data #2 with a receiver

slide-45
SLIDE 45

(Full tree as on slide 42.) The receiver trusts Root hash #3. 👪 wants to share Data #2.

slide-46
SLIDE 46

The receiver trusts Root hash #3. 👪 needs to share Data #2, Hash #1 and Hash #6.

slide-47
SLIDE 47

The receiver has Root hash #3, Hash #1, Hash #6 and Data #2, and hashes Data #2 to get Hash #4.

slide-48
SLIDE 48

The receiver combines Hash #4 and Hash #6 to get Hash #5.

slide-49
SLIDE 49

The receiver combines Hash #1 and Hash #5 to get Hash #3.

slide-50
SLIDE 50

The receiver checks that Hash #3 and Root hash #3 match

slide-51
SLIDE 51

👪 only needs to send O(log(n)) hashes to the receiver

slide-52
SLIDE 52

👪 only needs to send O(log(n)) hashes to the receiver

slide-53
SLIDE 53

👪 only needs to send O(log(n)) hashes to the receiver

(can easily be optimised to never send the same hash twice)

slide-54
SLIDE 54

👪 only needs to send O(log(n)) hashes to the receiver

(can easily be optimised to never send the same hash twice) (come ask me later, I’m fun at parties)
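
The verification walk from the previous slides (Data #2 plus Hash #1 and Hash #6, checked against the trusted root) might look roughly like this. The node numbers follow the slides; SHA-256, the prefixes and the function name verifyData2 are assumptions of this sketch.

```javascript
const crypto = require('crypto')

function leafHash (data) {
  return crypto.createHash('sha256').update('leaf:').update(data).digest('hex')
}

function parentHash (left, right) {
  return crypto.createHash('sha256').update('parent:').update(left).update(right).digest('hex')
}

// Sender side: the full tree for four chunks, numbered as on the slides.
const h0 = leafHash('Data #0')
const h2 = leafHash('Data #1')
const h4 = leafHash('Data #2')
const h6 = leafHash('Data #3')
const h1 = parentHash(h0, h2)
const h5 = parentHash(h4, h6)
const h3 = parentHash(h1, h5) // the root

// Receiver side: trusts only the root, gets the chunk and two extra hashes.
function verifyData2 (trustedRoot, chunk, proof) {
  const hash4 = leafHash(chunk)                 // Data #2 -> Hash #4
  const hash5 = parentHash(hash4, proof.hash6)  // Hash #4 + Hash #6 -> Hash #5
  const hash3 = parentHash(proof.hash1, hash5)  // Hash #1 + Hash #5 -> Hash #3
  return hash3 === trustedRoot
}

const ok = verifyData2(h3, 'Data #2', { hash1: h1, hash6: h6 })
const forged = verifyData2(h3, 'Data #2 (forged)', { hash1: h1, hash6: h6 })
```

For four chunks the proof holds two hashes; doubling the number of chunks adds one more hash, which is where O(log(n)) comes from.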

slide-55
SLIDE 55

Real time

slide-56
SLIDE 56

Every time we append data, the root hash changes

Root hash

slide-57
SLIDE 57

Crypto to the rescue

slide-58
SLIDE 58

Generate a key pair

Secret Key + Public Key

slide-59
SLIDE 59

 trusts …….

Public Key

slide-60
SLIDE 60

👪 signs the root (Root hash #2, covering Data #0 to Data #2) with the Secret Key

slide-61
SLIDE 61

👪 signs the new root (Root hash #3, covering Data #0 to Data #3) with the Secret Key

slide-62
SLIDE 62

 uses to verify signatures

Public Key Root hash
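
Signing each new root and verifying it with the public key can be sketched with Node's built-in crypto module. Ed25519 is an assumption here (the slides only say "key pair"), and the two root values are placeholders rather than real tree roots.

```javascript
const crypto = require('crypto')

// Generate a key pair. The secret key stays with the writer;
// the public key is what the receiver trusts.
const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519')

// Writer: sign a root hash (hex string) with the secret key.
function signRoot (rootHash) {
  return crypto.sign(null, Buffer.from(rootHash, 'hex'), privateKey)
}

// Receiver: check a root hash against a signature using only the public key.
function verifyRoot (rootHash, signature) {
  return crypto.verify(null, Buffer.from(rootHash, 'hex'), publicKey, signature)
}

// Placeholder roots standing in for Root hash #2 and Root hash #3.
const root2 = crypto.createHash('sha256').update('root after Data #2').digest('hex')
const root3 = crypto.createHash('sha256').update('root after Data #3').digest('hex')

const sig2 = signRoot(root2)
const sig3 = signRoot(root3)

const accepted = verifyRoot(root3, sig3) // signature matches the new root
const rejected = verifyRoot(root3, sig2) // old signature does not cover the new root
```

With this in place the receiver no longer needs to learn each new root hash through a trusted channel; trusting the public key once is enough.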

slide-63
SLIDE 63

npm install hypercore

slide-64
SLIDE 64

(demo)

slide-65
SLIDE 65

How do we turn append only logs into a file sharing tool?

slide-66
SLIDE 66

Take a file

~/cool.data

slide-67
SLIDE 67

Cut it into pieces

~/cool.data

slide-68
SLIDE 68

Insert each piece into the log

~/cool.data -> Data #0, Data #1, Data #2, Data #3, Data #4
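
A minimal sketch of the "cut it into pieces, insert each piece into the log" step. The log is a plain array here, the file content is a made-up stand-in for ~/cool.data, and the fixed chunk size of 10 bytes is an arbitrary choice for illustration.

```javascript
// Cut a buffer into fixed-size pieces (the last piece may be shorter).
function cutIntoPieces (buffer, chunkSize) {
  const pieces = []
  for (let offset = 0; offset < buffer.length; offset += chunkSize) {
    pieces.push(buffer.subarray(offset, offset + chunkSize))
  }
  return pieces
}

// Stand-in for the bytes of ~/cool.data.
const file = Buffer.from('pretend this is the content of ~/cool.data')

// Insert each piece into the (append-only) log, in order.
const log = []
for (const piece of cutIntoPieces(file, 10)) {
  log.push(piece)
}

// Reading the log back in order restores the original file.
const restored = Buffer.concat(log)
```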

slide-69
SLIDE 69

Diffable

slide-70
SLIDE 70

Divide a file into chunks that are unlikely to change when the file is updated

slide-71
SLIDE 71

Example: git

slide-72
SLIDE 72

function hello () {
  var world = 'world'
  console.log('hello', world)
}

slide-73
SLIDE 73

(One line per chunk)

function hello () {
  var world = 'world'
  console.log('hello', world)
}

slide-74
SLIDE 74

(Edit one line)

function hello () {
  var world = 'universe'
  console.log('hello', world)
}

slide-75
SLIDE 75

(3/4 chunks unchanged)

function hello () {
  var world = 'universe'
  console.log('hello', world)
}
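
The one-line-per-chunk idea can be checked with a short sketch: hash every line of both versions and count how many hashes of the edited version were already known. SHA-256 is an assumption of this example.

```javascript
const crypto = require('crypto')

const hash = (chunk) => crypto.createHash('sha256').update(chunk).digest('hex')

// One line per chunk, before and after the edit.
const before = [
  'function hello () {',
  "  var world = 'world'",
  "  console.log('hello', world)",
  '}'
]
const after = [
  'function hello () {',
  "  var world = 'universe'",
  "  console.log('hello', world)",
  '}'
]

// Chunks whose hash is already known do not need to be shared again.
const known = new Set(before.map(hash))
const unchanged = after.filter((line) => known.has(hash(line))).length

console.log(`${unchanged}/${after.length} chunks unchanged`) // 3/4 chunks unchanged
```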

slide-76
SLIDE 76

Only works for text files

slide-77
SLIDE 77

Rabin fingerprinting

(Content defined chunking)

slide-78
SLIDE 78

Scans through the file and creates chunks based on the actual file content

slide-79
SLIDE 79

(A new part is inserted in the middle of the file)

slide-80
SLIDE 80

(Only the neighbouring chunks are changed)
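
To make the idea concrete, here is a rough content-defined chunker. It is not Rabin fingerprinting and not the API of the rabin module: it swaps in a much simpler rolling sum over the last 16 bytes and cuts wherever that sum is divisible by 32. The constants, the function name chunkByContent and the test data are all assumptions of this sketch.

```javascript
const crypto = require('crypto')

const WINDOW = 16  // bytes covered by the rolling sum
const MODULUS = 32 // cut when the window sum is divisible by this

// Cut points depend only on the last WINDOW bytes, not on file offsets.
function chunkByContent (buffer) {
  const chunks = []
  let start = 0
  let sum = 0
  for (let i = 0; i < buffer.length; i++) {
    sum += buffer[i]
    if (i >= WINDOW) sum -= buffer[i - WINDOW] // slide the window forward
    if (i >= WINDOW - 1 && sum % MODULUS === 0) {
      chunks.push(buffer.subarray(start, i + 1))
      start = i + 1
    }
  }
  if (start < buffer.length) chunks.push(buffer.subarray(start))
  return chunks
}

const hash = (chunk) => crypto.createHash('sha256').update(chunk).digest('hex')

// Deterministic pseudo-random file: 64 SHA-256 digests, 2048 bytes in total.
const parts = []
for (let i = 0; i < 64; i++) {
  parts.push(crypto.createHash('sha256').update(String(i)).digest())
}
const original = Buffer.concat(parts)

// A new part is inserted in the middle of the file.
const inserted = Buffer.from('NEW PART!!')
const edited = Buffer.concat([original.subarray(0, 1024), inserted, original.subarray(1024)])

const known = new Set(chunkByContent(original).map(hash))
const editedChunks = chunkByContent(edited)
const changed = editedChunks.filter((c) => !known.has(hash(c))).length

console.log(`${changed} of ${editedChunks.length} chunks changed`)
```

With fixed-size chunks the same insertion would shift every later chunk boundary. Here the cut points after the insertion land on the same content as before, so only chunks near the new part can differ.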

slide-81
SLIDE 81

npm install rabin

slide-82
SLIDE 82

Each Rabin chunk is an entry in our append only log

slide-83
SLIDE 83

Data #0, Data #1, Data #2, …

slide-84
SLIDE 84

Merkle trees + Rabin = ❤

slide-85
SLIDE 85

Hash #3
  Hash #1
    Hash #0 (Data #0)
    Hash #2 (Data #1)
  Hash #5
    Hash #4 (Data #2)
    Hash #6 (Data #3)

slide-86
SLIDE 86

Change some data: Data #2 is modified (marked *)

slide-87
SLIDE 87

Change some data: Data #2 is modified (marked *). Rabin makes sure the other entries (Data #0, Data #1, Data #3) do not change.

slide-88
SLIDE 88

Change some data: Data #2 is modified (marked *). Only a few hashes change: Hash #4 *, Hash #5 * and Hash #3 *.
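
A small sketch to check which tree nodes change when one entry changes. Node numbers follow the slides; SHA-256, the prefixes and the helper name buildTree are assumptions of this example.

```javascript
const crypto = require('crypto')

function leafHash (data) {
  return crypto.createHash('sha256').update('leaf:').update(data).digest('hex')
}

function parentHash (left, right) {
  return crypto.createHash('sha256').update('parent:').update(left).update(right).digest('hex')
}

// Build the seven tree nodes for four chunks, indexed as on the slides:
// leaves are 0, 2, 4, 6; parents are 1 and 5; the root is 3.
function buildTree (chunks) {
  const nodes = []
  nodes[0] = leafHash(chunks[0])
  nodes[2] = leafHash(chunks[1])
  nodes[4] = leafHash(chunks[2])
  nodes[6] = leafHash(chunks[3])
  nodes[1] = parentHash(nodes[0], nodes[2])
  nodes[5] = parentHash(nodes[4], nodes[6])
  nodes[3] = parentHash(nodes[1], nodes[5])
  return nodes
}

const before = buildTree(['Data #0', 'Data #1', 'Data #2', 'Data #3'])
const after = buildTree(['Data #0', 'Data #1', 'Data #2 *', 'Data #3'])

// Collect the node numbers whose hash differs between the two trees.
const changed = []
for (let i = 0; i < before.length; i++) {
  if (before[i] !== after[i]) changed.push(i)
}

console.log(changed) // [ 3, 4, 5 ]
```

Only the nodes on the path from the changed entry up to the root differ; Hash #0, Hash #1, Hash #2 and Hash #6 can be reused as they are.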

slide-89
SLIDE 89

Keep an index

Data Data … Hash Data

slide-90
SLIDE 90

See the same hash twice, just copy the data

Hash Data

slide-91
SLIDE 91

See the same hash twice, just copy the data

Hash Data (no need to re-download it)

slide-92
SLIDE 92

See the same hash twice, just copy the data

Hash Data (no need to re-download it) (can be … easily … optimised for space)
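
The index idea can be sketched as a map from chunk hash to chunk data. The remote peer is simulated with a plain Map, and the names fetchChunk and download are hypothetical, not part of hyperdrive.

```javascript
const crypto = require('crypto')

const hash = (chunk) => crypto.createHash('sha256').update(chunk).digest('hex')

// Local index: chunk hash -> chunk data we already have.
const index = new Map()
let downloads = 0

// Hypothetical stand-in for fetching a chunk from a peer.
function download (remote, chunkHash) {
  downloads++
  return remote.get(chunkHash)
}

function fetchChunk (remote, chunkHash) {
  // Seen the same hash before: just copy the data we already have.
  if (index.has(chunkHash)) return index.get(chunkHash)
  const data = download(remote, chunkHash)
  index.set(chunkHash, data)
  return data
}

// A log where some chunks repeat.
const chunks = ['aaa', 'bbb', 'aaa', 'ccc', 'bbb']
const remote = new Map(chunks.map((c) => [hash(c), c]))

const received = chunks.map((c) => fetchChunk(remote, hash(c)))

console.log(`${downloads} downloads for ${received.length} chunks`) // 3 downloads for 5 chunks
```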

slide-93
SLIDE 93

npm install hyperdrive

slide-94
SLIDE 94
slide-95
SLIDE 95

(demo)

slide-96
SLIDE 96

Dat is a CLI tool and desktop app that manages hyperdrives

slide-97
SLIDE 97

(demo)

slide-98
SLIDE 98

Great apps built on Dat

slide-99
SLIDE 99

Beaker browser

https://github.com/beakerbrowser/beaker

slide-100
SLIDE 100

Science Fair

https://github.com/codeforscience/sciencefair

slide-101
SLIDE 101

Read our paper

https://github.com/datproject/docs/blob/master/papers/dat-paper.pdf

slide-102
SLIDE 102

Thank you!

https://github.com/mafintosh/hypercore
https://github.com/maxogden/rabin
https://github.com/mafintosh/hyperdrive
https://github.com/datproject/dat