SLIDE 1 Putting Big Data in its Place
Mike Amundsen, API Academy at CA @mamund
HH Camp – Strasbourg, March 2015
SLIDE 2
Introduction
SLIDE 3
SLIDE 4
SLIDE 5
Big Data Challenges
SLIDE 6
SLIDE 7
SLIDE 8 “Those who cannot remember the past are condemned to repeat it.”
George Santayana, 1905
SLIDE 9 “Those who ignore the mistakes of the future are bound to make them.”
Joseph D. Miller, 2006
SLIDE 10
SLIDE 11
SLIDE 12
SLIDE 13
Data and Storage
SLIDE 14
SLIDE 15
SLIDE 16
SLIDE 17
SLIDE 18
It's called a database
SLIDE 19
It's called a database not an informationbase
SLIDE 20
SLIDE 21
SLIDE 22
SLIDE 23
SLIDE 24
SLIDE 25
SLIDE 26
SLIDE 27
SLIDE 28
SLIDE 29
SLIDE 30
SLIDE 31
SLIDE 32
SLIDE 33
1 Gigabyte per day
SLIDE 34
SLIDE 35
365 truck loads per person per year
SLIDE 36
SLIDE 37
SLIDE 38
1 Yottabyte of Storage
SLIDE 39
SLIDE 40
100 Terabytes
SLIDE 41
100 Terabytes 100,000 Gigabytes
SLIDE 42
100 Terabytes 100,000 Gigabytes 250+ years of storage per person
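The arithmetic behind the last three slides can be checked in a few lines. This is a minimal sketch, assuming decimal units (1 TB = 1,000 GB) and a constant rate of 1 GB per person per day; the function and constant names are illustrative, not from the talk.

```python
# Assumptions: decimal units and a steady 1 GB/day per person.
GB_PER_DAY = 1
DRIVE_TB = 100
GB_PER_TB = 1_000
DAYS_PER_YEAR = 365


def years_of_storage(drive_tb, gb_per_day=GB_PER_DAY):
    """Return how many years a drive lasts at a fixed daily data rate."""
    drive_gb = drive_tb * GB_PER_TB
    days = drive_gb / gb_per_day
    return days / DAYS_PER_YEAR


drive_gb = DRIVE_TB * GB_PER_TB
years = years_of_storage(DRIVE_TB)
print(f"{DRIVE_TB} TB = {drive_gb:,} GB = {years:.0f} years at {GB_PER_DAY} GB/day")
```

Under these assumptions the result is about 274 years, which agrees with the "250+ years" figure on the slide.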
SLIDE 43
SLIDE 44
NO
SLIDE 45
SLIDE 46
Pruning data into long-term memory
SLIDE 47
SLIDE 48
SLIDE 49
“Forgetting makes our brains more efficient.”
SLIDE 50
SLIDE 51
Learning to choose is hard.
SLIDE 52
Learning to choose is hard. Learning to choose well is harder.
SLIDE 53 “Learning to choose well in a world of unlimited possibilities is, perhaps, too hard.”
Barry Schwartz, 2004
SLIDE 54
SLIDE 55
SLIDE 56
SLIDE 57
SLIDE 58
SLIDE 59 Data and Storage Challenges
- Support Pruning Strategies
- Implement Data Lakes
- Reduce Data Overload
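One possible pruning strategy, sketched under stated assumptions: keep recent readings in full and reduce older ones to a single monthly average. The function name, the 30-day window, and the choice of a monthly mean are all hypothetical; the talk does not prescribe a specific policy.

```python
from datetime import date, timedelta
from statistics import mean


def prune(readings, today, keep_days=30):
    """Keep recent (date, value) pairs; replace older ones with monthly means."""
    cutoff = today - timedelta(days=keep_days)
    recent = [(d, v) for d, v in readings if d >= cutoff]

    # Group everything older than the cutoff by (year, month).
    older = {}
    for d, v in readings:
        if d < cutoff:
            older.setdefault((d.year, d.month), []).append(v)

    # One summary row per month, dated the first of that month.
    summaries = [(date(y, m, 1), mean(vals)) for (y, m), vals in sorted(older.items())]
    return summaries + recent


# Hypothetical sample values, for illustration only.
readings = [
    (date(2015, 1, 1), 10.0),
    (date(2015, 1, 2), 20.0),
    (date(2015, 3, 10), 5.0),
]
pruned = prune(readings, today=date(2015, 3, 14))
print(pruned)
```

The detail is lost on purpose: the two January readings collapse into one summary row, while the March reading survives untouched.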
SLIDE 60
Modeling Information
SLIDE 61
SLIDE 62
SLIDE 63
SLIDE 64
SLIDE 65
Models allow us to add meaning to data
SLIDE 66
SLIDE 67
SLIDE 68
SLIDE 69
SLIDE 70
data + model = information
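A small sketch of the equation above: the same three strings mean nothing until a model supplies names, types, and units. The `Reading` class and the sample values are hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass
from datetime import date

# Raw data: three strings with no stated meaning (hypothetical values).
raw = ["21.5", "2015-03-14", "Strasbourg"]


@dataclass
class Reading:
    """Hypothetical model: a name, type, and unit for each raw value."""
    temperature_c: float
    taken_on: date
    city: str


def apply_model(values):
    """Interpret the raw strings according to the Reading model."""
    temp, day, city = values
    return Reading(float(temp), date.fromisoformat(day), city)


info = apply_model(raw)
print(info)
```

A different model applied to the same strings would yield different information, which is why the model needs to be kept alongside the data.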
SLIDE 71
SLIDE 72
SLIDE 73
SLIDE 74
SLIDE 75
SLIDE 76
We can improve
SLIDE 77
We can improve the usability of messages
SLIDE 78
There are three ways to do that...
SLIDE 80
SLIDE 81
application/json adds very little affordance
SLIDE 82
SLIDE 83 collection+json adds quite a bit
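To make the contrast concrete, here is the same record as plain JSON and as a Collection+JSON (application/vnd.collection+json) document. This is a sketch: the URLs, field names, and the `affordances` helper are hypothetical, and only a subset of the format is shown.

```python
# Plain application/json: data only. The client must already know
# what the fields mean and which URLs and actions exist.
plain = {"name": "J. Doe", "email": "jdoe@example.org"}

# Collection+JSON: the same data plus item links, queries, and a write template.
cj = {
    "collection": {
        "version": "1.0",
        "href": "http://example.org/people/",
        "items": [
            {
                "href": "http://example.org/people/1",
                "data": [
                    {"name": "name", "value": "J. Doe", "prompt": "Full name"},
                    {"name": "email", "value": "jdoe@example.org", "prompt": "Email"},
                ],
            }
        ],
        "queries": [
            {
                "rel": "search",
                "href": "http://example.org/people/search",
                "prompt": "Search people",
                "data": [{"name": "q", "value": ""}],
            }
        ],
        "template": {
            "data": [
                {"name": "name", "value": "", "prompt": "Full name"},
                {"name": "email", "value": "", "prompt": "Email"},
            ]
        },
    }
}


def affordances(doc):
    """Name the hypermedia controls a generic client can find in doc."""
    coll = doc.get("collection", {})
    found = []
    if coll.get("items"):
        found.append("item links")
    if coll.get("queries"):
        found.append("queries")
    if coll.get("template"):
        found.append("write template")
    return found


print(affordances(plain))
print(affordances(cj))
```

The plain document yields no controls at all, while the Collection+JSON document tells the client where items live, how to search, and what a valid write looks like.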
SLIDE 84
SLIDE 86
SLIDE 87
SLIDE 88
SLIDE 89
SLIDE 90
SLIDE 91
So far, we're still in "Shannon-land"
SLIDE 93
On the web, the "internal model" is represented by Semantics
SLIDE 94
SLIDE 95
SLIDE 96
SLIDE 97
SLIDE 98
SLIDE 99
SLIDE 100 Modeling Information
- Represent Data in Rich Formats
- Support Multiple Protocols
- Separate Semantics from Format & Protocol
SLIDE 101
Ravages of Time
SLIDE 102 “Everything changes and nothing stands still.”
Heraclitus, as quoted by Plato in Cratylus, 402a
SLIDE 103
SLIDE 104
SLIDE 105
SLIDE 106
SLIDE 107
SLIDE 108
SLIDE 109
Storage Format
SLIDE 110
Storage Format is not
SLIDE 111
Storage Format is not Transfer Format
SLIDE 112
CSV
SLIDE 113
XML
SLIDE 114
JSON
SLIDE 115
RDF (n3)
SLIDE 116
Select a Storage Format
SLIDE 117 Select a Storage Format
- CSV has no strong schema modeling
- XML and JSON both have schema tooling
- RDF-family offers built-in semantics
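The trade-offs in the list above are easier to see with one record written in all four formats. This sketch uses only the Python standard library; the record, the `example.org` URLs, and the `ex:` vocabulary are hypothetical, and the N3 writer is a hand-rolled simplification that does not escape quotes in values.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical record used for all four serializations.
record = {"id": "1", "name": "J. Doe", "city": "Strasbourg"}


def to_csv(rec):
    """CSV: a header row and a value row, with no types or schema."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rec), lineterminator="\n")
    writer.writeheader()
    writer.writerow(rec)
    return buf.getvalue()


def to_json(rec):
    """JSON: named fields; a schema can be layered on separately."""
    return json.dumps(rec, indent=2)


def to_xml(rec):
    """XML: named elements; a schema can be layered on separately."""
    root = ET.Element("person")
    for key, value in rec.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


def to_n3(rec, base="http://example.org/people/", vocab="http://example.org/vocab#"):
    """N3: every property is a URI, so the vocabulary travels with the data."""
    subject = f"<{base}{rec['id']}>"
    props = [f'ex:{key} "{value}"' for key, value in rec.items() if key != "id"]
    lines = [f"@prefix ex: <{vocab}> ."]
    lines.append(f"{subject} " + " ;\n    ".join(props) + " .")
    return "\n".join(lines)


for render in (to_csv, to_json, to_xml, to_n3):
    print(render(record))
    print()
```

Only the N3 output names where the meaning of `name` and `city` is defined; the other three rely on an agreement held somewhere outside the file.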
SLIDE 118
SLIDE 119
Storage Media
SLIDE 120
Storage Media is
SLIDE 121
Storage Media is Volatile
SLIDE 122
SLIDE 123
SLIDE 124
SLIDE 125 Million-Year Data Storage via DNA
ETH Zurich, 2015
SLIDE 126
SLIDE 127
SLIDE 128
SLIDE 129
SLIDE 130
SLIDE 131 Ravages of Time
- Prepare to hold the data for 100+ years
- Be ready to migrate the data to new media
- Archive a functional app with the data
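Migrating data to new media is only safe if each copy can be verified. A minimal sketch of one common approach, a checksum (fixity) check around the copy, assuming plain files and SHA-256; the `migrate` function is hypothetical and not taken from the talk.

```python
import hashlib
import shutil


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(source, target):
    """Copy source to target and return the digest if the copy verifies."""
    before = sha256_of(source)
    shutil.copyfile(source, target)
    after = sha256_of(target)
    if before != after:
        raise IOError(f"fixity check failed for {target}")
    return after
```

Storing the returned digest next to the archive lets the same check be repeated at every later migration, decades apart.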
SLIDE 132
And so…
SLIDE 133
SLIDE 134 “If we don’t want our digital lives to fade away, we need to make sure that the objects we create today can still be rendered far into the future.”
Vint Cerf, 2015
SLIDE 135 “Those who ignore the mistakes of the future are bound to make them.”
Joseph D. Miller, 2006
SLIDE 136
SLIDE 137 Putting Big Data in its Place
Mike Amundsen, API Academy at CA @mamund
HH Camp – Strasbourg, March 2015
http://g.mamund.com/2015-hhcamp