
slide-1
SLIDE 1

functional thinking

applying the philosophy of functional programming to system design & architecture
Jed Wesley-Smith @jedws

slide-2
SLIDE 2

please, ask questions

slide-3
SLIDE 3

  • functional programming has many benefits: better program reasonability, composition, refactorability and performance
  • yet the dominant models & paradigms for software architecture and building software systems today remain rooted in mutation and side-effects
  • many of the ideas and principles of functional programming have been applied to solve design problems including security, concurrency, auditing and robustness
  • it is possible and desirable to apply them to all of the systems we build, and gain practical advantage from doing so

slide-4
SLIDE 4

Q. is the universe mutable?

slide-5
SLIDE 5

what is change? what about the past? what is now?

slide-6
SLIDE 6

x = x + 1

the fundamental absurdity at the heart of programming

slide-7
SLIDE 7

x = x + 1

  • is a low-level execution plan for specific hardware, not a fundamentally important strategy for how we should write programs
  • it only makes sense as an execution strategy when associated with the most common Von Neumann style hardware architectures
  • it makes less sense at any higher level of abstraction, but we learn fairly early on that this is how programming works!
  • it makes even less sense as a way we should design software in the large

slide-8
SLIDE 8

x = x + 1

  • the biggest problem is that we can only know one value for x: what it is, or what it was
  • the value of x is ephemeral, we forget what it was!

slide-9
SLIDE 9

x = x + 1

  • we all know global mutable variables are to be avoided
  • unfortunately, many of our common storage systems use exactly the same paradigm

UPDATE person
 SET phone_no = '+61 2 9876 5432'
 WHERE person_id = 123

  • same goes for writing to a file
  • same goes for most REST interfaces…
  • pure functional programming, in the large, rejects this approach to programming!

slide-10
SLIDE 10

what is functional programming?

slide-11
SLIDE 11

programming, with functions!

slide-12
SLIDE 12

a function

f : A -> B

  • relates one value from its domain, A, to exactly one value from its codomain, B
  • always the same – or equivalent – value
  • and nothing else!
  • this is also known as a pure function, because programming defines impure ones too
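a minimal sketch of the distinction in Python. The names `add_tax`, `add_tax_impure` and `calls` are hypothetical and chosen only for illustration; they are not from the talk.

```python
# Pure: the result depends only on the arguments, and nothing else is touched.
def add_tax(price_cents: int, rate_percent: int) -> int:
    return price_cents + price_cents * rate_percent // 100


# Impure: reads and mutates state outside its arguments,
# so the same call can give different answers.
calls = 0


def add_tax_impure(price_cents: int) -> int:
    global calls
    calls += 1
    return price_cents + calls  # depends on how often it has been called


# Same input, same output, every time.
assert add_tax(1000, 10) == add_tax(1000, 10)
# Same input, different outputs: not a function in the mathematical sense.
assert add_tax_impure(1000) != add_tax_impure(1000)
```

because `add_tax` is pure, any call to it can be replaced by its result without changing the program.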

slide-13
SLIDE 13

programming with values!

slide-14
SLIDE 14

values

  • immutable: values cannot change
  • easily shareable without concern for concurrent modification
  • referentially transparent: expressions can be replaced with their computed value
  • the state of a thing at an instant in time is a value
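one way to get values like this in Python is a frozen dataclass. This is only a sketch: the `Person` type and the second phone number are made up for illustration.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Person:
    person_id: int
    phone_no: str


before = Person(123, "+61 2 9876 5432")

# "Changing" a value builds a new value; the old one is untouched
# and can still be shared freely between threads.
after = replace(before, phone_no="+61 2 5550 0000")

try:
    before.phone_no = "overwritten"  # in-place mutation is rejected
    mutated = True
except FrozenInstanceError:
    mutated = False
```

both `before` and `after` remain available, so nothing is forgotten by the update.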

slide-15
SLIDE 15

–Rich Hickey, Simple Made Easy

“we invented mutable values, we must uninvent them”

slide-16
SLIDE 16

what about things?

slide-17
SLIDE 17

identity

  • what we think of as the things around us (you, me, the plants and animals, rivers and mountains) are identities
  • identities are things we name
  • we are used to thinking of the world in terms of identities, they are the objects in our world

slide-18
SLIDE 18

the philosophy of functional programming

slide-19
SLIDE 19

“since the time of Plato and Aristotle, philosophers have posited true reality as timeless, based on permanent substances, while processes are denied or subordinated to timeless substances. if Socrates changes, becoming sick, Socrates is still the same, and change (his sickness) only glides over his substance: change is accidental, whereas the substance is essential.”

source: http://en.wikipedia.org/wiki/Process_philosophy

slide-20
SLIDE 20

– Heraclitus

no one ever steps in the same river twice, for it's not the same river and it's not the same person

slide-21
SLIDE 21

an identity is a series of values over time

slide-22
SLIDE 22

reifying time

slide-23
SLIDE 23

time

  • requesting the current time is not a function, it always gives a different answer!
  • as we are functional programmers, we recognise now is a side-effect
  • we usually model side-effects as explicit things, commonly via a type such as IO

now :: IO Time
(java) public IO<Time> now()

  • IO is a value describing how to perform a side-effect, which we can run later
  • now is a pure function as it returns a value
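the same idea can be sketched in Python. The `IO` class below is a hypothetical, hand-rolled wrapper (not a standard library type): it holds a description of an effect and performs nothing until `run()` is called.

```python
import time
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

A = TypeVar("A")
B = TypeVar("B")


@dataclass(frozen=True)
class IO(Generic[A]):
    """A value describing a side-effect; nothing happens until run() is called."""

    effect: Callable[[], A]

    def map(self, f: Callable[[A], B]) -> "IO[B]":
        # Build a new description; still nothing is executed.
        return IO(lambda: f(self.effect()))

    def run(self) -> A:
        return self.effect()


def now() -> IO[float]:
    # Pure: every call returns an equivalent description of "read the clock".
    return IO(time.time)


in_millis = now().map(lambda seconds: int(seconds * 1000))  # no clock read yet
millis = in_millis.run()  # the clock is read here, at the edge of the program
```

the side-effect is pushed to wherever `run()` is called, so everything built on top of `now()` stays a composition of values.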

slide-24
SLIDE 24

f : A -> B

slide-25
SLIDE 25

f : A -> B

slide-26
SLIDE 26

f : A -> B
f : A -> T -> B

slide-27
SLIDE 27

f : A -> B
f : A -> T -> B

slide-28
SLIDE 28

a -> t1 -> X

slide-29
SLIDE 29

a -> t1 -> X
a -> t2 -> X'
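read this way, an identity is a lookup from time to value. A minimal sketch in Python, assuming integer timestamps and a hypothetical `value_at` helper:

```python
from bisect import bisect_right

# An identity as a series of (time, value) pairs, ordered by time.
history = [(1, "X"), (5, "X'"), (9, "X''")]


def value_at(history, t):
    """Return the value the identity had at time t, or None before its first value."""
    times = [time for time, _ in history]
    i = bisect_right(times, t)  # number of entries recorded at or before t
    return history[i - 1][1] if i > 0 else None
```

asking for the value at `t1` or `t2` is now an ordinary pure function call, and the past stays queryable.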

slide-30
SLIDE 30

change

slide-31
SLIDE 31

X + Δ = X'
X' - X = Δ
X' - Δ = X

  • we can store entire versions, or we can store deltas, or patches*
  • they are equivalent
  • being in possession of any two allows us to traverse time

* http://liamoc.net/posts/2015-11-10-patch-theory.html
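the three equations can be tried out on plain dictionaries. This is a sketch under a simplifying assumption: a delta records the old and new value for every changed key. The names `diff`, `apply`, `revert` and `MISSING` are hypothetical.

```python
MISSING = object()  # marks "key not present" on one side of a change


def diff(old: dict, new: dict) -> dict:
    """X' - X = Δ: record (old, new) for every key whose value differs."""
    keys = old.keys() | new.keys()
    return {
        k: (old.get(k, MISSING), new.get(k, MISSING))
        for k in keys
        if old.get(k, MISSING) != new.get(k, MISSING)
    }


def apply(x: dict, delta: dict) -> dict:
    """X + Δ = X': returns a new dict, the input is not modified."""
    out = dict(x)
    for k, (_, new) in delta.items():
        if new is MISSING:
            out.pop(k, None)
        else:
            out[k] = new
    return out


def revert(x2: dict, delta: dict) -> dict:
    """X' - Δ = X: apply the delta with old and new swapped."""
    return apply(x2, {k: (new, old) for k, (old, new) in delta.items()})
```

holding any two of `X`, `X'` and `Δ` is enough to recover the third, which is what lets us move forwards and backwards in time.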

slide-32
SLIDE 32

architecture in the Real World™

slide-33
SLIDE 33

architecture in the New World

slide-34
SLIDE 34

new world

  • it wasn’t that long ago that computation was expensive, disk storage was expensive, DRAM was expensive, but coordination with latches/locks was cheap
  • now all of these have changed: we have cheap computation (with many-core), cheap commodity disks, and cheap DRAM and SSD
  • coordination with latches/locks gets harder because latch latency loses lots of instruction opportunities; with branch prediction and deep CPU pipelines, accessing main memory is now much more expensive
  • increasingly, applications are distributed, often globally, however we still use paradigms invented in the old world with old-world assumptions

slide-35
SLIDE 35

new world

the new world is increasingly distributed

distribution brings enormous problems, including increased latency and unreliability

conventional consensus techniques (locks and transactions) impose intolerable constraints:

  • locks do not compose
  • distributed transaction protocols are bespoke and not widely supported
  • latency costs are enormous, huge performance hit
  • distributed transactions have fundamental problems
slide-36
SLIDE 36

new world, latency

https://people.eecs.berkeley.edu/~rcs/research/interactive_latency.html

slide-37
SLIDE 37

new world, latency

https://people.eecs.berkeley.edu/~rcs/research/interactive_latency.html

slide-38
SLIDE 38

– C. A. R. Hoare

“the unavoidable price of reliability is simplicity.”

slide-39
SLIDE 39

accountants do not use erasers

slide-40
SLIDE 40

how do we change without mutating?

slide-41
SLIDE 41

if an identity is a series of values over time, can we model things so we keep old and new values?
slide-42
SLIDE 42

this is known as append-only computing

slide-43
SLIDE 43

double-entry bookkeeping

  • the first functional architecture, codified in 1494 by Luca Pacioli
  • central to bookkeeping, then and now, are its three books: the memorandum, the journal and the ledger
  • the memorandum records all transactions as they happen
  • the journal records the detailed transcription of these transactions, involving debiting and crediting specific accounts
  • the ledger is generated by posting the journal entries to the individual accounts
  • balance is fundamental and is key to the correctness protocol
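a small sketch of the journal-to-ledger step in Python. The account names and amounts are invented for illustration, and a real journal entry carries more detail than the three fields assumed here.

```python
from collections import defaultdict

# Journal: each entry debits one account and credits another by the same amount (in cents).
# Entries are only ever appended; a mistake is corrected by a further entry, not an eraser.
journal = [
    {"debit": "cash", "credit": "sales", "amount": 5000},
    {"debit": "inventory", "credit": "cash", "amount": 2000},
]


def post(journal):
    """Generate the ledger (balance per account) by posting every journal entry."""
    ledger = defaultdict(int)
    for entry in journal:
        ledger[entry["debit"]] += entry["amount"]
        ledger[entry["credit"]] -= entry["amount"]
    return dict(ledger)


ledger = post(journal)

# The correctness check: debits and credits must balance.
assert sum(ledger.values()) == 0
```

the ledger is a derived view: it can be thrown away and regenerated from the journal at any time.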

slide-44
SLIDE 44

event sourcing

  • event sourcing is a name given to the practice of storing a journal, or stream, of changes
  • the changes can be deltas or full versions, depending on efficiency and other factors
  • it is possible to reconstruct the state of entities in the stream at any point in time
  • the stream serves as a complete history or audit log of changes
  • a single event stream usually serves as a unit of consistency, or shard, and may have one or more entities contained within it

slide-45
SLIDE 45

event sourcing

  • given a source of events and a function to fold (or reduce) over them, we can produce a “current” version of a value, or construct a view at any previous time
  • given the ability to save event values, we can continuously feed new events into our fold function, using the old persisted value as the seed, and produce our new “mutated” value, which we can also persist
  • this persistence strategy is now decoupled from the source of the data (the events)
  • we can have multiple views of our data, each of which can be tailored for specific purposes, such as query-optimised storage
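the fold described above can be sketched with `functools.reduce`. The event shapes (`type`, `at`, and so on), the sample data and the helper names `apply_event` and `view_at` are assumptions made for this example only.

```python
from functools import reduce


def apply_event(state: dict, event: dict) -> dict:
    """Fold step: return a new state, leaving the previous one untouched."""
    if event["type"] == "phone_changed":
        return {**state, "phone_no": event["phone_no"]}
    if event["type"] == "name_changed":
        return {**state, "name": event["name"]}
    return state  # unknown events are ignored


events = [
    {"at": 1, "type": "name_changed", "name": "Alice"},
    {"at": 2, "type": "phone_changed", "phone_no": "+61 2 9876 5432"},
    {"at": 3, "type": "phone_changed", "phone_no": "+61 2 5550 0000"},
]


def view_at(events, t, seed=None):
    """Fold every event up to and including time t over the seed."""
    return reduce(apply_event, (e for e in events if e["at"] <= t), seed or {})


current = view_at(events, t=3)  # the "current" version
earlier = view_at(events, t=2)  # a view at a previous time

# Continue from a persisted value: fold only the newer events over the saved seed.
resumed = reduce(apply_event, events[2:], earlier)
```

`resumed` equals `current`, which is why a persisted view can be kept up to date without replaying the whole stream.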

slide-46
SLIDE 46

event sourcing

  • we can store events that patch, or “mutate”, the previous value
  • additionally, we can store derived facts, such as the complete value after a number of updates – sometimes known as an epoch
  • event sourcing fits very well with Command Query Responsibility Segregation (CQRS) as an architectural practice – allowing separate deployment of specialised query services, and tailored load-balancing strategies for different access patterns
  • an event stream is easily distributable, with various strategies available for consensus on write, depending on requirements

slide-47
SLIDE 47

content-addressable storage

  • files are stored at an address computed from their content: a content hash
  • names are associated with a hash
  • retrieval looks up the current hash for a name, then accesses the content stored at that address
  • update adds new content, then a new (name, hash) pair
  • caches only cache content at a hash, not at a name, avoiding concurrency issues
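a minimal in-memory sketch of these rules in Python. `ContentStore` is a hypothetical class, and SHA-256 is an arbitrary choice of hash for the example.

```python
import hashlib


class ContentStore:
    def __init__(self):
        self.blobs = {}  # hash -> content; an address always maps to the same bytes
        self.names = {}  # name -> current hash; the only mutable part

    def put(self, name: str, content: bytes) -> str:
        digest = hashlib.sha256(content).hexdigest()
        self.blobs.setdefault(digest, content)  # add new content first
        self.names[name] = digest  # then a new (name, hash) pair
        return digest

    def get(self, name: str) -> bytes:
        return self.blobs[self.names[name]]  # name -> hash -> content
```

because the bytes behind a hash never change, anything cached by hash can never be stale; only the small name-to-hash mapping needs coordination.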

slide-48
SLIDE 48
slide-49
SLIDE 49

git: version control system

  • non-linear development: branching/merging
  • distributed development: changes must be shareable between repositories that are not necessarily connected
  • cryptographic authentication of history: the ability to uniquely identify the complete development history of any change to the resources in a repository

slide-50
SLIDE 50

git: design

  • content is stored as a directed acyclic graph (DAG), where some of the content is the repository file content, and some is meta-data, including file-trees and commits
  • all content is stored using a secure hash of the file
  • trees store lists of file and directory names and links to their content in the form of other trees (for sub-directories) or file blobs
  • commits are stored using a hash of the contained meta-data, including tree hash, author, date, parent commits & additional optional data such as signatures verifying authenticity
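as a concrete illustration of hashing content, the sketch below computes the object id git assigns to a file blob, assuming the default SHA-1 object format (repositories can also be created with SHA-256). The function name `git_blob_id` is made up for this example.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    # git hashes a small header ("blob", a space, the size in bytes, a NUL byte)
    # followed by the file content itself.
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# The id depends only on the bytes, never on the file name or location.
empty = git_blob_id(b"")
```

two files with identical bytes therefore share one stored object, whatever they are called.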

slide-51
SLIDE 51
slide-52
SLIDE 52

git: file format

  • updates add new deltas, or a full version known as a pack
  • all old versions are reconstructable
  • the same content produces the same hash, equivalent updates commute
  • data-structure is (mostly) immutable
  • mutable pointer to head of a branch

slide-53
SLIDE 53

git: benefits

  • presents a mutable file-system “view” of an immutable structure
  • a commit includes author, date and parent commits (via their hash), providing a cryptographically secure signature of content and history
  • all content, both commits and file content, consists of immutable shareable values, enabling simple distribution between multiple repositories
  • unreferenced data is easily garbage-collectible via simple tree-walk from content roots (the branch heads)
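the garbage-collection point can be sketched as a reachability walk. The object graph below is invented, and `reachable` is a hypothetical helper rather than anything git exposes.

```python
def reachable(objects: dict, roots: list) -> set:
    """Walk from the roots (branch heads) and collect every object id reached."""
    seen = set()
    stack = list(roots)
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        stack.extend(objects[oid])  # ids this object links to
    return seen


# Hypothetical object graph: id -> ids it references
# (commit -> parent commit and tree, tree -> blobs).
objects = {
    "commit2": ["commit1", "tree2"],
    "commit1": ["tree1"],
    "tree2": ["blob_a", "blob_c"],
    "tree1": ["blob_a"],
    "blob_a": [],
    "blob_b": [],
    "blob_c": [],
}

# Anything not reachable from a branch head can be collected.
garbage = set(objects) - reachable(objects, roots=["commit2"])
```

since objects are immutable, the walk needs no locking against writers that might change an object underneath it.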

slide-54
SLIDE 54
slide-55
SLIDE 55

lucene

  • full-text indexing and search
  • needs to maintain a stable searchable “view” of an index in the face of concurrent updates

slide-56
SLIDE 56

lucene: index

  • an index is a collection of Documents
  • a document is a collection of Fields and has an ID
  • an index is updated by deleting and re-adding documents
  • searching is done via a Searcher – for its lifetime, a searcher will see the state of the index as it was when it was opened

slide-57
SLIDE 57

lucene: file-format

  • an index is made of Segment files
  • segments contain documents
  • deleting a document adds the document ID to a per-segment “.del” file – i.e. it doesn’t modify the segment file directly
  • when no searchers reference a segment with many deleted documents, it may be merged with others into a new segment containing the remaining documents – i.e. garbage collection
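a toy model of this scheme in Python. It is not Lucene's API: the `Index` class, its methods and the substring "search" are all hypothetical, kept only to show segments that are never edited, per-segment deletion sets, and searchers that hold a snapshot. Segment merging is left out.

```python
class Index:
    def __init__(self):
        # Tuple of (segment, deleted ids) pairs; the tuple is replaced, never edited.
        self.state = ()

    def add_segment(self, docs: dict) -> None:
        # Update = delete + re-add: mark older copies of these ids as deleted,
        # then append a new segment. Existing segment dicts are not touched.
        marked = tuple(
            (seg, deleted | (seg.keys() & docs.keys())) for seg, deleted in self.state
        )
        self.state = marked + ((dict(docs), frozenset()),)

    def open_searcher(self):
        snapshot = self.state  # the searcher keeps this view for its lifetime

        def search(word: str) -> list:
            return sorted(
                doc_id
                for seg, deleted in snapshot
                for doc_id, text in seg.items()
                if doc_id not in deleted and word in text
            )

        return search


index = Index()
index.add_segment({1: "old river", 2: "mountain"})
searcher1 = index.open_searcher()

index.add_segment({1: "new river"})  # re-adds document 1 in a new segment
searcher2 = index.open_searcher()
```

`searcher1` still finds the old text of document 1, while `searcher2` sees only the new one; neither blocks the writer.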

slide-58
SLIDE 58

diagram: segment 1 contains documents 0 to 9; segment 2 contains documents 10 to 19

slide-59
SLIDE 59

diagram: segment 1 (documents 0 to 9) and segment 2 (documents 10 to 19), with searcher1

slide-60
SLIDE 60

diagram: segment 1 (documents 0 to 9), segment 2 (documents 10 to 19) and segment 3 (documents 20 to 22, plus documents 3, 8 and 11); labels: searcher, searcher1

slide-61
SLIDE 61

diagram: segment 1 (documents 0 to 9), segment 2 (documents 10 to 19) and segment 3 (documents 20 to 22, plus documents 3, 8 and 11); labels: searcher, searcher1

slide-62
SLIDE 62

diagram: segment 1 (documents 0 to 9), segment 2 (documents 10 to 19) and segment 3 (documents 3, 8 and 11, plus documents 20 to 22); labels: searcher1, searcher2

slide-63
SLIDE 63
slide-64
SLIDE 64

netflix

immutable everything, including servers:

  • servers are values, not modified
  • new versions are printed and deployed
  • old versions are replaced

idempotent updates

ReactiveJava/RX (JavaScript) programming model

slide-65
SLIDE 65

immutability changes everything

source: Pat Helland http://cidrdb.org/cidr2015/Papers/CIDR15_Paper16.pdf

slide-66
SLIDE 66

conclusions

  • avoid mutation of your core data, don’t use an eraser!
  • values replace – or occlude – previous values
  • store changes
  • apply changes to construct a mutable temporal view
  • apply these ideas to your entire system architecture
  • profit!

slide-67
SLIDE 67

thanks