Designing for Distributed, Unstructured Data
Matt Brender
Developer Advocate at Basho
tweet me @mjbrender
=> curl $RIAK/props
{ "Matt Brender" :
'developer advocate', 'ops > dev', 'mbrender@basho.com', '@mjbrender', 'neckbeardinfluence.com', 'geek-whisperers.com', 'indoor enthusiast'
}
Not “react,” as in react.js
{
  "text": "Woot! #qconnewyork",
  "entities": {
    "hashtags": ["#qconnewyork"],
    "symbols": [],
    "urls": [],
    "user_mentions": [
      {
        "screen_name": "mjbrender",
        "name": "Matt Brender",
        "id": 4948123,
        "id_str": "42424242",
        "indices": [81, 92]
      },
      {
        "screen_name": "mjbrender",
        "name": "Matt Brender",
        "id": 376825877,
        "id_str": "376825877",
        "indices": [121, 132]
      }
    ]
  }
}
Just Hoarding?
Our Problem(s)
This or That
Reduce
What Qualifies as NoSQL?
NoSQL Community
Persistence
Other Queries
Understanding how you get your data back
Query Languages
Query Interfaces
Apache Solr Integration
Write it like Riak. Query it like Solr.
Distributed Full-Text Search
Standard full-text Solr queries automatically expand into distributed search queries for a complete result set across instances.
Ad-Hoc Query Support
Broad support for Solr query parameters, e.g., exact match, range queries, and/or/not, sorting, pagination, scoring, ranking, etc.
Index Synchronization
Data is automatically synchronized between Riak KV and Solr: intelligent monitoring detects changes and propagates them to Solr indexes.
Solr API Support
Query data in Riak KV using existing Solr APIs
Auto-Restart
Monitor Solr OS processes continuously and automatically start or restart them whenever failures are detected.
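"Query it like Solr" can be illustrated by how a search request might be assembled. The sketch below only builds a URL and sends nothing. It assumes the Riak 2.x HTTP search path /search/query/<index> on the default HTTP port 8098; the index name "tweets" and the field "hashtags" are hypothetical, so check the path and names against your own cluster.

```python
from urllib.parse import urlencode


def build_search_url(base_url: str, index: str, query: str, **params) -> str:
    """Build a Solr-style query URL for a Riak Search index.

    Assumption: the search endpoint is <base_url>/search/query/<index>.
    Extra keyword arguments are passed through as ordinary Solr query
    parameters (sort, rows, start, and so on).
    """
    query_params = {"wt": "json", "q": query, **params}
    return f"{base_url.rstrip('/')}/search/query/{index}?{urlencode(query_params)}"


# Hypothetical index and field names, for illustration only.
url = build_search_url(
    "http://localhost:8098",
    "tweets",
    "hashtags:qconnewyork",
    sort="score desc",
    rows=10,
)
print(url)
```

The resulting URL can be passed to curl or any HTTP client; sorting and pagination are ordinary Solr parameters rather than anything Riak-specific.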
A diverse group of client libraries for Riak supports both the HTTP and Protocol Buffers APIs:
Basho Supported Libraries:
Community Libraries:
Polylingual Querying
Sharding Strategies
Master Slave Slave Slave
OR
Node 1 Node 2 Node 3
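For the masterless option, one common way to decide which node owns a key is consistent hashing, the approach popularized by Dynamo-style systems. What follows is a minimal sketch of that general technique, not Riak's actual ring implementation; the class name, node names, and the number of points per node are all assumptions chosen for illustration.

```python
import bisect
import hashlib


def ring_position(value: str) -> int:
    """Map a string to a position on a 160-bit ring using SHA-1."""
    return int(hashlib.sha1(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring (hypothetical helper, for illustration)."""

    def __init__(self, nodes, points_per_node=8):
        # Each node claims several points on the ring to even out the load.
        self._points = sorted(
            (ring_position(f"{node}#{i}"), node)
            for node in nodes
            for i in range(points_per_node)
        )
        self._positions = [position for position, _ in self._points]

    def owner(self, key: str) -> str:
        # The owner is the first point clockwise from the key's position,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_right(self._positions, ring_position(key))
        return self._points[idx % len(self._points)][1]


ring = HashRing(["node1", "node2", "node3"])
owners = {key: ring.owner(key) for key in ["user:1", "user:2", "user:3", "user:4"]}
print(owners)
```

No node is special here: any node (or client) holding the same ring can compute the owner of a key, and adding a node only moves the keys that the new node takes over.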
CAP Theorem
AP: Riak, Cassandra, Couchbase, Voldemort
CP: MongoDB, BigTable, Redis, HBase
CA: RDBMS, MySQL, Postgres
What Are You Sacrificing?
CA: if a partition develops between nodes, data will be out of sync (and won't re-sync)
CP: you keep partition tolerance (preventing data de-sync) by becoming unavailable when a node goes down
AP: you aren't guaranteed that all nodes will have the same data (either during or after the partition)
The Dynamo Paper
set conflict resolution
{ ["Beth" : "Tom"], ["Beth" : "Jim"], ["Beth" : "George"] } 2015:05:27
{ ["George" : "Tom"], ["Beth" : "Jim"], ["George" : "Jim"] } 2015:05:26
{ ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"] } 2015:05:25
set conflict resolution
Riak
Client Client Client
set conflict resolution
Riak
Client Client Client
{ ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"] }
{ ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"] }
set conflict resolution
Riak
Client Client Client
{ ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"] }
{ ["Jane" : "Tom"], ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"] }
set conflict resolution
Riak
Client Client Client
{ ["Jane" : "Tom"], ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"] }
{ ["Jane" : "Tom"], ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"] }
set conflict resolution
Riak
Client Client Client
{ ["Jane" : "Tom"], ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"] }
{ ["Jane" : "Tom"], ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"] }
{ ["Jane" : "Tom"], ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"] }
set conflict resolution
Riak
Client Client Client
{ ["Jane" : "Tom"], ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"] }
{ ["Jane" : "Tom"], ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"] }
{ ["Jane" : "Tom"], ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"] }
set conflict resolution
Riak
Client Client Client
{ ["Jane" : "Tom"], ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"], ["Tom" : "Jane"] }
{ ["Jane" : "Tom"], ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"] }
{ ["Jane" : "Tom"], ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"] }
set conflict resolution
Riak
Client Client Client
{ ["Jane" : "Tom"], ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"], ["Tom" : "Jane"] }
{ ["Jane" : "Tom"], ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"], ["Beth" : "Jane"] }
{ ["Jane" : "Tom"], ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"] }
set conflict resolution
Riak
Client Client Client
{ ["Jane" : "Tom"], ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"], ["Tom" : "Jane"] }
{ ["Jane" : "Tom"], ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"], ["Beth" : "Jane"] }
{ ["Jane" : "Tom"], ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"], ["Beth" : "Jane"] }
set conflict resolution
{ ["Jane" : "Tom"], ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"], ["Tom" : "Jane"], ["Beth" : "Jane"] }
{ ["Jane" : "Tom"], ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"], ["Tom" : "Jane"] }
{ ["Jane" : "Tom"], ["Tom" : "Beth"], ["Beth" : "Tom"], ["George" : "Jim"], ["Beth" : "Jane"] }
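In the last step of this walkthrough, the resolved value is simply the union of the two diverging copies: each side keeps its own addition and gains the other's. Below is a minimal sketch of that union merge, using the pairs from the slides as tuples. The function name is hypothetical, and this is an add-only illustration rather than Riak's own set data type; in particular it does not handle removals.

```python
def merge_siblings(siblings):
    """Resolve conflicting copies of a set by taking their union.

    Add-only sketch: every element seen in any sibling survives.
    """
    merged = set()
    for sibling in siblings:
        merged |= sibling
    return merged


# The state both replicas agreed on before the concurrent writes.
base = {("Jane", "Tom"), ("Tom", "Beth"), ("Beth", "Tom"), ("George", "Jim")}

# Two clients add different pairs at the same time, producing siblings.
sibling_a = base | {("Tom", "Jane")}
sibling_b = base | {("Beth", "Jane")}

resolved = merge_siblings([sibling_a, sibling_b])
print(sorted(resolved))
```

Because set union is commutative and idempotent, the siblings can be merged in any order, any number of times, and every replica arrives at the same six pairs.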
What do I need (most) from my database?
Where do I need the most from my data?
riak-dev cluster
https://github.com/basho-labs
Matt Brender @mjbrender