Go London User Group - 21st November 2018: Rclone, “rsync for cloud storage”

SLIDE 1

rclone.org 1

Nick Craig-Wood

Go London User Group - 21st November 2018

  • Rclone “rsync for cloud storage”

– https://rclone.org
– https://github.com/ncw/rclone

  • Talk by

– Nick Craig-Wood
– Twitter: @njcw
– Email: nick@craig-wood.com

SLIDE 2

About me

  • Nick Craig-Wood

– CTO of Memset Ltd by day
– Open Source coder by night
– Keen interest in storage, data integrity
– Reformed data hoarder (ha!)

SLIDE 3

Contents

  • About Me
  • What Rclone Is
  • History
  • How it works
  • Some code
  • Testing
  • Libraries

SLIDE 4

Rclone - “rsync for cloud storage”

  • Rclone is a command line program to sync files and directories to and from cloud providers
  • MD5/SHA1 hashes checked at all times for file integrity
  • Timestamps preserved on files
  • Copy mode to just copy new/changed files
  • Sync (one way) mode to make a directory identical
  • Check mode to check for file hash equality
  • Can sync to and from network, e.g. two different cloud accounts
  • Encryption backend
  • Cache backend
  • Optional FUSE mount (rclone mount)
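
The hash checking mentioned above can be illustrated with a minimal sketch. It is not rclone's code: `Hashes`, `hashReader` and `hashString` are hypothetical names, and the sketch only shows how MD5 and SHA1 can be computed in one pass over the data and then compared.

```go
package main

import (
	"crypto/md5"
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

// Hashes holds the hex-encoded digests of one stream of data.
// (Hypothetical type, for illustration only.)
type Hashes struct {
	MD5  string
	SHA1 string
}

// hashReader reads r once and computes both digests in a single pass.
func hashReader(r io.Reader) (Hashes, error) {
	m, s := md5.New(), sha1.New()
	if _, err := io.Copy(io.MultiWriter(m, s), r); err != nil {
		return Hashes{}, err
	}
	return Hashes{
		MD5:  hex.EncodeToString(m.Sum(nil)),
		SHA1: hex.EncodeToString(s.Sum(nil)),
	}, nil
}

// hashString is a convenience wrapper for in-memory data.
func hashString(data string) Hashes {
	h, _ := hashReader(strings.NewReader(data)) // reading a string cannot fail
	return h
}

func main() {
	src := hashString("hello")
	dst := hashString("hello")
	fmt.Println("md5: ", src.MD5)
	fmt.Println("sha1:", src.SHA1)
	fmt.Println("match:", src == dst)
}
```

In practice the destination hash would usually come from the cloud provider's metadata rather than from re-reading the data.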

SLIDE 5

Rclone vs Rsync

  • rsync is a utility for efficiently transferring and synchronizing files across computer systems, by checking the timestamp and size of files. ✓
  • It is commonly found on Unix-like systems and functions as both a file synchronization and file transfer program. ✓
  • The rsync algorithm is a type of delta encoding, and is used for minimizing network usage. ✗

From Wikipedia

SLIDE 6

Cloud providers supported by rclone

  • Amazon Drive
  • Amazon S3
  • Backblaze B2
  • Box
  • Ceph
  • DigitalOcean Spaces
  • Dreamhost
  • Dropbox
  • FTP
  • Google Cloud Storage
  • Google Drive
  • HTTP
  • Hubic
  • Jottacloud
  • IBM COS S3
  • Memset Memstore
  • Mega
  • Microsoft Azure Blob Storage
  • Microsoft OneDrive
  • Minio
  • Nextcloud
  • OVH
  • OpenDrive
  • Openstack Swift
  • Oracle Cloud Storage
  • ownCloud
  • pCloud
  • put.io
  • QingStor
  • Rackspace Cloud Files
  • SFTP
  • Wasabi
  • WebDAV
  • Yandex Disk
  • The local filesystem

SLIDE 7

Rclone platforms

I ♥ Cross Compilation

[Figure: supported platforms, by CPU and OS]

SLIDE 8

How rclone came to be

  • Started as a tool to exercise
– github.com/ncw/swift
– originally was “swiftsync”
  • First version in 2012
– Go 1.0
– 3 backends
  • Somewhat outgrew its original design!

SLIDE 9

Why Go?

  • Single binary deploy
  • Excellent concurrency
  • Great cross platform
  • Fast!
  • Standard library
  • New challenge for me
  • Easy for contributors to pick up

SLIDE 10

One tool to rule them all

  • What started as a tiny exercise

– 11,000 stars on GitHub
– 200 contributors
– 500 pull requests
– 1,500 issues
– 250,000 downloads a month
– Packaged in Ubuntu, Arch, Debian, Homebrew, Chocolatey and more

  • ...is now an enormous project.

SLIDE 11

Visualising Rclone’s History

SLIDE 12

Rclone becomes popular and breaks Amazon Cloud Drive


SLIDE 13

Rclone verbs – bigger = more popular

SLIDE 14

rclone config - Config Wizard

  • Old School Config Wizard
– Text based
– Easy to use
– Not pretty
– Calls your browser to do oauth

SLIDE 15

rclone copy - demo

  • rclone copy
– Copy new files to destination
– Don’t delete files from destination
– Your go to rclone command!

SLIDE 16

rclone sync - demo

  • rclone sync
– Copy new files to destination
– Delete destination files not in source
– Running with --dry-run first is recommended

SLIDE 17

rclone copy "Source Dir" "Dest Dir"

Source Before: Source Dir contains File 1, File 2, File 3
Destination Before: Dest Dir contains File 2, Old File 3, File 4

Actions
– File 1: Copied
– File 2: Not Touched
– File 3: Overwritten
– File 4: Not Touched

Source After: Source Dir contains File 1, File 2, File 3
Destination After: Dest Dir contains File 1, File 2, File 3, File 4

Destination includes Source

SLIDE 18

rclone sync "Source Dir" "Dest Dir"

Source Before: Source Dir contains File 1, File 2, File 3
Destination Before: Dest Dir contains File 2, Old File 3, File 4

Actions
– File 1: Copied
– File 2: Not Touched
– File 3: Overwritten
– File 4: Deleted

Source After: Source Dir contains File 1, File 2, File 3
Destination After: Dest Dir contains File 1, File 2, File 3

Destination identical to Source
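
The difference between copy and sync can be sketched as a planning step over two listings. This is a simplified illustration, not rclone's implementation: the `plan` function is hypothetical, and each listing is assumed to be a map from file name to content hash.

```go
package main

import (
	"fmt"
	"sort"
)

// plan lists the actions needed to bring dst in line with src.
// Both maps go from file name to a content hash (an assumption made
// for this sketch). With deleteExtra false it behaves like "copy";
// with deleteExtra true it behaves like "sync".
func plan(src, dst map[string]string, deleteExtra bool) []string {
	var actions []string
	for name, hash := range src {
		old, ok := dst[name]
		switch {
		case !ok:
			actions = append(actions, "copy "+name)
		case old != hash:
			actions = append(actions, "overwrite "+name)
		}
		// Same hash on both sides: the file is not touched.
	}
	if deleteExtra {
		for name := range dst {
			if _, ok := src[name]; !ok {
				actions = append(actions, "delete "+name)
			}
		}
	}
	sort.Strings(actions) // map iteration order is random, so sort for stable output
	return actions
}

func main() {
	src := map[string]string{"File 1": "h1", "File 2": "h2", "File 3": "h3"}
	dst := map[string]string{"File 2": "h2", "File 3": "old", "File 4": "h4"}
	fmt.Println("copy:", plan(src, dst, false))
	fmt.Println("sync:", plan(src, dst, true))
}
```

With the files from the two diagrams, the copy plan copies File 1 and overwrites File 3, and the sync plan additionally deletes File 4.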

SLIDE 19

rclone mount remote:path /mount/point

  • FUSE Filesystem

– Linux, macOS, FreeBSD
– Windows via WinFsp

  • Optional caching layer

– Needed as can’t write to middle of object
– Or read and write together

  • Can run as daemon

SLIDE 20

rclone ncdu

This displays a text based user interface allowing the navigation of a Remote. It is most useful for answering the question: What is using all my disk space?

SLIDE 21

Backend interface

SLIDE 22

Object interface

SLIDE 23

Optional interfaces for Fs

SLIDE 24

Using an optional interface

Do a type assertion for the interface to see if it exists.

But what if this is a wrapper backend wrapping a backend that doesn’t support Purge?

And if we need to know in advance?...
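
The slide's own code is not reproduced in this transcript. As a minimal sketch with hypothetical names (`Fs`, `Purger`, `wrapFs` and so on are cut-down stand-ins, not rclone's real definitions), the type assertion and the wrapper problem might look like this:

```go
package main

import (
	"errors"
	"fmt"
)

// Fs is a hypothetical, cut-down backend interface.
type Fs interface {
	Name() string
}

// Purger is a hypothetical optional interface a backend may implement.
type Purger interface {
	Purge() error
}

var errCantPurge = errors.New("backend can't purge")

// purge uses the optional Purge method if the backend has one.
func purge(f Fs) error {
	if p, ok := f.(Purger); ok {
		return p.Purge()
	}
	return errCantPurge
}

// fastFs supports Purge.
type fastFs struct{ purged bool }

func (f *fastFs) Name() string { return "fast" }
func (f *fastFs) Purge() error { f.purged = true; return nil }

// plainFs does not support Purge.
type plainFs struct{}

func (plainFs) Name() string { return "plain" }

// wrapFs wraps another Fs. It has to define Purge so it can pass the
// call on, which means the type assertion succeeds even when the
// wrapped backend has no Purge at all.
type wrapFs struct{ inner Fs }

func (w wrapFs) Name() string { return "wrap(" + w.inner.Name() + ")" }
func (w wrapFs) Purge() error { return purge(w.inner) }

func main() {
	fmt.Println(purge(&fastFs{}))
	fmt.Println(purge(plainFs{}))

	var w Fs = wrapFs{inner: plainFs{}}
	_, ok := w.(Purger)
	fmt.Println("wrapper looks like a Purger:", ok) // true
	fmt.Println(purge(w))                            // but the call still fails
}
```

The wrapper satisfies the assertion regardless of what it wraps, so the caller only finds out at call time, which is the problem the two questions above point at.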

SLIDE 25

The solution
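
The code from this slide is missing from the transcript. One way to answer both questions, sketched here under assumptions, is for every backend to publish its optional operations as function fields in a struct, where a nil field means "not supported". rclone itself exposes a Features struct from each backend; the types, fields and signatures below are simplified and hypothetical.

```go
package main

import "fmt"

// Features lists optional operations as function fields.
// A nil field means the backend does not support the operation.
// (Hypothetical, simplified shape.)
type Features struct {
	Purge func() error
}

// Fs is a hypothetical, cut-down backend interface.
type Fs interface {
	Name() string
	Features() *Features
}

// baseFs is a tiny backend used for all three examples below.
type baseFs struct {
	name     string
	features *Features
}

func (f *baseFs) Name() string        { return f.name }
func (f *baseFs) Features() *Features { return f.features }

// newFast returns a backend that supports Purge.
func newFast() *baseFs {
	return &baseFs{name: "fast", features: &Features{
		Purge: func() error { return nil },
	}}
}

// newPlain returns a backend without Purge.
func newPlain() *baseFs {
	return &baseFs{name: "plain", features: &Features{}}
}

// newWrapper wraps inner and only advertises Purge if inner has it.
func newWrapper(inner Fs) *baseFs {
	w := &baseFs{name: "wrap(" + inner.Name() + ")", features: &Features{}}
	if do := inner.Features().Purge; do != nil {
		w.features.Purge = func() error { return do() }
	}
	return w
}

// canPurge answers "do we support Purge?" in advance, without calling it.
func canPurge(f Fs) bool { return f.Features().Purge != nil }

func main() {
	for _, f := range []Fs{newFast(), newPlain(), newWrapper(newFast()), newWrapper(newPlain())} {
		fmt.Println(f.Name(), canPurge(f))
	}
}
```

Because the wrapper copies a field only when the wrapped backend has it, the answer is known before any operation is attempted.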

SLIDE 26

Testing

  • How to test
– 27 backends
– x 50 commands
– x 8 OSes
– x 6 CPU Architectures
– x 4 Go versions?
  • 69k lines of code
  • 26k lines of test code
  • Unit test what we can
– Some things are easy
– Who wants to write mocks for 27 different cloud providers?
  • Integration test
– Integration tests use go test framework
– Run daily

SLIDE 27

CI – Unit testing and build

  • CI Pipeline

– Runs all non integration tests
– Tests mount
– Builds for all
– Makes binaries
– Uploads to beta release

[Diagram: the CI pipelines run on each Push and Pull Request]

SLIDE 28

Integration testing

  • Integration test

– Run daily
– Too expensive to run on every push
    • Cost ~ 30p
    • Time ~ 1 Hour
– Creates fancy report
– Not integrated with GitHub (yet)

[Diagram: Daily Pull to the Integration Test Server, which tests a subset of cloud providers (at least one per backend) plus FTP, SFTP, HTTP and Crypt]

SLIDE 29

Integration tests

  • Problems

– Cloud providers aren’t perfectly reliable
– Eventual consistency
– Networking

  • Solution

– Retries, Retries, Retries
– Lots of work getting it right
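
A generic retry helper gives a feel for the "Retries" approach. This is a minimal sketch, not rclone's code: `retry` and `errNotYet` are hypothetical, and the doubling delay is an assumption rather than something stated in the talk.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotYet simulates an eventually consistent listing that has not
// caught up with a recent upload.
var errNotYet = errors.New("object not visible yet")

// retry calls fn until it succeeds or attempts run out, sleeping
// between tries and doubling the delay each time.
func retry(attempts int, delay time.Duration, fn func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	for i := 1; i <= attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if i < attempts {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("failed after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	err := retry(5, time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errNotYet // first two tries fail
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```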

SLIDE 30

Retrying integration tests

  • test_all framework

– Runs standard go tests
– Runs lots of tests in parallel
– Provides flags as specified in a config file
– Parses the output of the tests
– Retries just the failing tests
– Should probably become an open source package in its own right!

Attempt 1/5
./operations.test -test.v -test.timeout 30m0s -remote TestAzureBlob:

Attempt 2/5
./operations.test -test.v -test.timeout 30m0s -remote TestAzureBlob: -test.run '^(TestPurge|TestRmdirsNoLeaveRoot)$'
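
The "parse the output, retry just the failures" step can be sketched as follows. This is not the test_all source: `failedTests` and `runPattern` are hypothetical helpers, and the sketch assumes the usual `--- FAIL: TestName (0.10s)` lines printed by verbose go tests.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// failLine matches top-level failures only; failed subtests are
// printed indented, so they do not match the ^ anchor.
var failLine = regexp.MustCompile(`^--- FAIL: (\S+)`)

// failedTests extracts the names of failed top-level tests from the
// verbose output of a test binary.
func failedTests(output string) []string {
	var names []string
	seen := map[string]bool{}
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		m := failLine.FindStringSubmatch(sc.Text())
		if m == nil {
			continue
		}
		if name := m[1]; !seen[name] {
			seen[name] = true
			names = append(names, name)
		}
	}
	return names
}

// runPattern builds a -test.run regexp selecting exactly the named
// tests. Callers should skip the retry when names is empty.
func runPattern(names []string) string {
	quoted := make([]string, len(names))
	for i, n := range names {
		quoted[i] = regexp.QuoteMeta(n)
	}
	return "^(" + strings.Join(quoted, "|") + ")$"
}

func main() {
	output := strings.Join([]string{
		"=== RUN   TestPurge",
		"--- FAIL: TestPurge (0.10s)",
		"=== RUN   TestCopy",
		"--- PASS: TestCopy (0.01s)",
		"=== RUN   TestRmdirsNoLeaveRoot",
		"--- FAIL: TestRmdirsNoLeaveRoot (1.20s)",
	}, "\n")
	failed := failedTests(output)
	fmt.Println(failed)
	fmt.Println("-test.run", runPattern(failed))
}
```

The resulting pattern has the same shape as the `-test.run` argument shown in the second attempt above.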

SLIDE 31

Integration tests for backends

  • Backend integration tests
– Easy to add thanks to go1.7 nested tests (subtests)
– Give a recipe to follow when making a new backend
– Just make the integration tests pass
– Originally done with code gen before nested tests existed

SLIDE 32

Integration tests elsewhere

  • You can add flags to tests
– Rclone uses this with a “-remote” flag to signal that the test should be done remotely
– There are other flags for debugging and more in depth tests
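
A test flag such as “-remote” can be defined with the standard flag package. In a real `_test.go` file the flag would normally be registered at package level on the default flag set, which the test binary parses for you. To keep this sketch runnable on its own it uses a separate flag set instead; `testOptions`, `parseTestFlags` and the `-verbose` flag are hypothetical.

```go
package main

import (
	"flag"
	"fmt"
)

// testOptions holds the extra flags a test run understands.
// Only -remote comes from the talk; -verbose is made up.
type testOptions struct {
	Remote  string
	Verbose bool
}

// parseTestFlags parses args the way a test binary would parse its
// own command line.
func parseTestFlags(args []string) (testOptions, error) {
	var opt testOptions
	fs := flag.NewFlagSet("test", flag.ContinueOnError)
	fs.StringVar(&opt.Remote, "remote", "", "remote to test against, e.g. TestAzureBlob:")
	fs.BoolVar(&opt.Verbose, "verbose", false, "hypothetical extra debugging flag")
	err := fs.Parse(args)
	return opt, err
}

func main() {
	opt, err := parseTestFlags([]string{"-remote", "TestAzureBlob:"})
	if err != nil {
		fmt.Println("bad flags:", err)
		return
	}
	if opt.Remote == "" {
		fmt.Println("no -remote given: run against the local filesystem")
		return
	}
	fmt.Println("running integration tests against", opt.Remote)
}
```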

SLIDE 33

Standing on the shoulders of giants

  • Rclone

– 95,000 lines of code
– 450 source files
– Not including “vendor”

  • Rclone’s libraries

– 520,000 lines of code
– 1,100 files
– All stored in “vendor”

All built on top of the excellent standard library

SLIDE 34

Favourite libraries and tools: golang.org/x/tools/cmd/goimports

– Get it in your editor: never type an import statement again
– Run it as a save hook: it will `go fmt` your code too

SLIDE 35

github.com/spf13/cobra

  • Make commands with subcommands
  • Very flexible / extensible
  • Used by Kubernetes / Hugo / Docker
  • POSIX flags `--flag` with spf13/pflag
  • Creates bash completion scripts
  • Creates docs
  • Makes coffee and cleans the kitchen.

SLIDE 36

Documentation with github.com/spf13/cobra

Go code defines help… …becomes -h output… …and markdown for web.

SLIDE 37

github.com/pkg/errors

  • Turns an error like this
– “unexpected EOF”
  • Into
– “NewFs creating backend: couldn’t connect SSH: unexpected EOF”
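
The wrapping shown above can be sketched with the standard library alone. Note the substitution: this uses `fmt.Errorf` with `%w` (Go 1.13 and later) rather than github.com/pkg/errors, so that it runs without third-party packages. The functions `connectSSH`, `newFs` and `causedByEOF` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// connectSSH stands in for a low-level call that fails.
func connectSSH() error {
	return fmt.Errorf("couldn't connect SSH: %w", io.ErrUnexpectedEOF)
}

// newFs adds its own context on the way up the call stack.
func newFs() error {
	if err := connectSSH(); err != nil {
		return fmt.Errorf("NewFs creating backend: %w", err)
	}
	return nil
}

// causedByEOF shows that the original error is still reachable
// underneath the added context.
func causedByEOF(err error) bool {
	return errors.Is(err, io.ErrUnexpectedEOF)
}

func main() {
	err := newFs()
	fmt.Println(err)
	fmt.Println("caused by unexpected EOF:", causedByEOF(err))
}
```

Each layer adds a short prefix describing what it was doing, so the final message reads as a trail from the top-level operation down to the root cause.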

SLIDE 38

What to do if your open source project takes off...

  • Don’t Panic!
  • Open a forum (Discourse is good)
  • Ask everyone who makes an issue for help
  • Recruit pull requesters as contributors
  • Make good contributing docs
  • Get octobox.io

[Chart: Rclone Star History, with the Front Page of Hacker News marked]

SLIDE 39

Thank you for listening

  • Rclone “rsync for cloud storage”

– https://rclone.org
– https://github.com/ncw/rclone

  • Talk by

– Nick Craig-Wood
– Twitter: @njcw
– Email: nick@craig-wood.com

  • Special effects by

– Gource: source code history visualisation
– Asciinema and asciicast2gif: terminal GIFs