I Hate Your Database I Hate Your Database Andrew Godwin Andrew - - PowerPoint PPT Presentation

i hate your database i hate your database
SMART_READER_LITE
LIVE PREVIEW

I Hate Your Database I Hate Your Database Andrew Godwin Andrew - - PowerPoint PPT Presentation

flickr.com/96dpi flickr.com/96dpi I Hate Your Database I Hate Your Database Andrew Godwin Andrew Godwin @andrewgodwin @andrewgodwin flickr.com/parkerblohm flickr.com/parkerblohm Hate? Databases? Hate? Databases? Misuse Misuse


slide-1
SLIDE 1

I Hate Your Database I Hate Your Database

Andrew Godwin Andrew Godwin @andrewgodwin @andrewgodwin

flickr.com/96dpi flickr.com/96dpi

slide-2
SLIDE 2

Hate? Databases? Hate? Databases?

flickr.com/parkerblohm flickr.com/parkerblohm

slide-3
SLIDE 3

· Misuse · Misuse · Ignorance · Ignorance · Lies · Lies

slide-4
SLIDE 4

Different databases, Different databases, different occasions different occasions

flickr.com/maistora flickr.com/maistora

slide-5
SLIDE 5

Relational Relational Document Document Key-Value Key-Value

Graph Graph Object / Hierarchial Object / Hierarchial Spatial Spatial Time-series / RRD Time-series / RRD Search Search

slide-6
SLIDE 6

Relational Relational

PostgreSQL PostgreSQL MySQL MySQL SQLite SQLite PostgreSQL PostgreSQL MySQL MySQL SQLite SQLite MongoDB MongoDB CouchDB CouchDB

Document Document

Redis Redis Cassandra Cassandra Riak Riak

Key-value Key-value

slide-7
SLIDE 7

Some Quick Theory Some Quick Theory

flickr.com/iamdabe flickr.com/iamdabe

slide-8
SLIDE 8

A Atomicity tomicity C Consistency

  • nsistency

I Isolation solation D Durability urability

slide-9
SLIDE 9

C Consistency

  • nsistency

A Availability vailability P Partition Tolerance artition Tolerance

slide-10
SLIDE 10

Relational Databases Relational Databases

flickr.com/maistora flickr.com/maistora

slide-11
SLIDE 11

Common Pitfalls Common Pitfalls

SELECT item1, item2, item3 FROM basket; SELECT item1, item2, item3 FROM basket; INSERT INTO groups (name, people) INSERT INTO groups (name, people) VALUES ('friends', 'aaron,betty,chris,damien') VALUES ('friends', 'aaron,betty,chris,damien') SELECT file_bytes FROM pictures WHERE filename = SELECT file_bytes FROM pictures WHERE filename = 'foo.jpg' 'foo.jpg'

slide-12
SLIDE 12

names = set() names = set() for book in Book.objects.filter(year=2012): for book in Book.objects.filter(year=2012): names.add(book.author.name) names.add(book.author.name)

slide-13
SLIDE 13

names = Author.objects.filter( names = Author.objects.filter( books__year=2012 books__year=2012 ).values_list('name', flat=True).distinct() ).values_list('name', flat=True).distinct()

slide-14
SLIDE 14

No transactional DDL No transactional DDL Poor query optimiser Poor query optimiser MyISAM: Full-table locking, no transactions MyISAM: Full-table locking, no transactions Oracle Oracle Very fast for some operations Very fast for some operations

MySQL MySQL

slide-15
SLIDE 15

Little integrity checking (slowly being fixed) Little integrity checking (slowly being fixed) Impossible to do some table alterations Impossible to do some table alterations No concurrent access No concurrent access Very low overhead Very low overhead Very portable Very portable

SQLite SQLite

slide-16
SLIDE 16

Slow default configuration Slow default configuration Can be a little harder to learn / less familiar Can be a little harder to learn / less familiar Almost too many features Almost too many features Incredibly reliable Incredibly reliable

PostgreSQL PostgreSQL

slide-17
SLIDE 17

Document Databases Document Databases

flickr.com/babyowls flickr.com/babyowls

slide-18
SLIDE 18

db.insert({ db.insert({ 'name': 'Sally', 'name': 'Sally', 'tags': ['django', 'python', 'search'], 'tags': ['django', 'python', 'search'], 'addresses': [ 'addresses': [ {'type': 'jabber', 'jid': 'sally@eg.com'}, {'type': 'jabber', 'jid': 'sally@eg.com'}, {'type': 'phone', 'number': '011899981199'}, {'type': 'phone', 'number': '011899981199'}, ], ], }) }) db.find({ db.find({ 'tags': 'python', 'tags': 'python', 'addresses.type': 'jabber', 'addresses.type': 'jabber', }) })

slide-19
SLIDE 19

No fixed schema No fixed schema Low barrier to entry Low barrier to entry Closer to Python datatypes Closer to Python datatypes

Advantages Advantages

slide-20
SLIDE 20

Immature (but improving) Immature (but improving) No transactions No transactions No integrity checking No integrity checking

Problems Problems

slide-21
SLIDE 21

Key-value stores Key-value stores

flickr.com/zebble flickr.com/zebble

slide-22
SLIDE 22

db.set('foo', 'bar) db.set('foo', 'bar) x = db.get('foo') x = db.get('foo') db.sadd('names', 'andrew') db.sadd('names', 'andrew') db.sadd('names', 'brian') db.sadd('names', 'brian') y = db.scard('names') y = db.scard('names')

slide-23
SLIDE 23

Horizontal scaling (but with drawbacks) Horizontal scaling (but with drawbacks) Extremely fast Extremely fast Can only fetch objects by key Can only fetch objects by key Batch/map-reduce queries Batch/map-reduce queries Transactions not possible Transactions not possible

Traits Traits

slide-24
SLIDE 24

Other database types Other database types

flickr.com/tusnelda flickr.com/tusnelda

slide-25
SLIDE 25

Knowledge of projections useful Knowledge of projections useful Spatial indexes really speed up some problems Spatial indexes really speed up some problems Generally add-on to existing DB Generally add-on to existing DB

Spatial Spatial

slide-26
SLIDE 26

Hierarchial key-value store Hierarchial key-value store Allows multiple writers for appends Allows multiple writers for appends Supports very large files Supports very large files

Filesystems Filesystems

slide-27
SLIDE 27

Allow efficient neighbour queries Allow efficient neighbour queries Generally not much use for anything else Generally not much use for anything else

Graph Databases Graph Databases

slide-28
SLIDE 28

Deliberately loses old data Deliberately loses old data Useful for logging or statistics Useful for logging or statistics

Round-Robin Database Round-Robin Database

slide-29
SLIDE 29

Final Thoughts Final Thoughts

flickr.com/pagedooley flickr.com/pagedooley

slide-30
SLIDE 30

It's unlikely your data all It's unlikely your data all fits in one paradigm. fits in one paradigm.

slide-31
SLIDE 31

Just buying bigger servers Just buying bigger servers goes a long way goes a long way

slide-32
SLIDE 32

If it sounds too good to be true, If it sounds too good to be true, it probably is. it probably is.

slide-33
SLIDE 33

Fin. Fin.

Andrew Godwin / @andrewgodwin Andrew Godwin / @andrewgodwin

flickr.com/oimax flickr.com/oimax