Smart Data and Wicked Problems Paul Borrill Most Computer - PowerPoint PPT Presentation

radical simplicity Smart Data and Wicked Problems Paul Borrill

Most Computer Scientists don’t understand Time & Causality “Computer Scientists imagine that causation is one of the fundamental axioms or postulates of physics, yet, oddly enough, in real scientific disciplines such as special and general relativity, and quantum mechanics, the word “cause” never occurs. To me it seems that computer science ought not to assume such legislative functions, and that the reason why physics has ceased to look for causes is that in fact there are no such things. The law of causality, I believe, like much that passes muster among computer scientists, is a relic of a bygone age, surviving, like a belief in God, only because it is erroneously supposed to do no harm” ~Paul Borrill (with apologies to Bertrand Russell)

Dumb Data? • Our lives are becoming progressively more digital • Our ability to manage our data: in our enterprises, our businesses, our communities and even our own homes is becoming intolerably complex • This complexity threatens to become the single most pervasive destroyer of productivity in our post-industrialized society; taking back all the gains in productivity that our information technology was intended to provide

“Men have become tools of their tools” Henry David Thoreau

Dumb Data is Intolerably Complex • We need a cure; not an endless overlay of band-aids that mask failed architectural theories • The Curse of the God’s Eye View (GEV) • Identity & Individuality • Persistence & Change • Time & Causality • These problems are not adequately appreciated in the computer science literature • GEV designers don’t relieve us of complexity - they cause it! • Do GEV designers have the God gene? Dean Hamer: The God Gene

“The ultimate goal of machine production – from which, it is true, we are as yet far removed – is a system in which everything uninteresting is done by machines and human beings are reserved for work involving variety and initiative” Bertrand Russell

Why Smart Data? • Why we want to make data smart is clear: so that our data can, as far as possible, enable us to find and freely use it without us having to constantly tend to its needs • Our systems should quietly manage themselves and become our slaves, instead of us becoming slaves to them

“What information consumes is rather obvious: it consumes the attention of its recipients. Hence a wealth of information creates a poverty of attention, and a need to allocate that attention efficiently among the overabundance of information sources that might consume it” Herbert Simon

Three Laws of Smart Data • Smart Data shall not consume the attention of a human being, or through inaction, allow a human being’s attention to be consumed, without that human being’s freely given consent that the cause is just and fair • Smart Data shall obey and faithfully execute all requests of a human being, except where such requests would conflict with the first law • Smart Data shall protect its own existence as long as such protection does not conflict with the first or second law

Knowledge Warriors • We have to “fight” our systems to get work done • Knowledge? • Bits, Bytes, Data, Information, Knowledge, Understanding, Wisdom • Just want to get our job done • Systems get in the way • Yak Shaving

A 100 Petabyte Data Repository Feasibility Study

100PB Data Repository • 12 Disks per Vertical Sled • 8-10 Sleds per Panel • 6 Panels Per Rack (double sided!) • 500TB Per Rack • 40 Racks x 0.5PB = 20PB Per Data Center • 100PB in 5 Data Centers

8 Racks per 20-Foot Container = 5PB http://www.sun.com/emrkt/blackbox/index.jsp v2

100PB RAW Storage 10 x 40 foot Containers or 20 x 20-foot Containers

100PB Data Repository - Problems: • Existing solutions do not scale; we end up with many “islands” or “silos” of storage • Large SANs break faster than you can fix them • Disks constantly fail (~130K disks) • Months or years to design and deploy • Coordination of 10-20 companies, 60+ products • An army of administrators • Cost is 100-200x the cost of the disks • Power dissipation • Something must change
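The sizing figures on the preceding slides can be cross-checked with a few lines of arithmetic. The sketch below is only a sanity check, not the author's sizing model. It assumes 9 sleds per panel (the midpoint of the stated 8-10), treats "6 panels per rack" as the total across both sides, and uses decimal units (1PB = 1000TB). All variable names are illustrative.

```python
# Figures taken from the slides.
DISKS_PER_SLED = 12
PANELS_PER_RACK = 6          # assumed to be the total for a double-sided rack
TB_PER_RACK = 500
RACKS_PER_DATA_CENTER = 40
DATA_CENTERS = 5
PB_PER_CONTAINER = 5         # as stated for a 20-foot container

# Assumption: midpoint of the stated range of 8-10 sleds per panel.
SLEDS_PER_PANEL = 9

disks_per_rack = DISKS_PER_SLED * SLEDS_PER_PANEL * PANELS_PER_RACK
total_racks = RACKS_PER_DATA_CENTER * DATA_CENTERS
total_disks = disks_per_rack * total_racks
total_pb = total_racks * TB_PER_RACK / 1000        # decimal petabytes
tb_per_disk = TB_PER_RACK / disks_per_rack         # implied capacity per disk
containers_needed = total_pb / PB_PER_CONTAINER

print(f"disks per rack:    {disks_per_rack}")       # 648
print(f"total racks:       {total_racks}")          # 200
print(f"total disks:       {total_disks}")          # 129600
print(f"raw capacity (PB): {total_pb}")             # 100.0
print(f"implied TB/disk:   {tb_per_disk:.2f}")
print(f"20-foot containers:{containers_needed:.0f}")
```

Under these assumptions the total comes to 129,600 disks, which agrees with the "~130K Disks" figure. Note that 8 racks at 0.5PB each is 4PB, slightly short of the 5PB per container stated on the slide; the slides do not say how the difference is made up.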

Identity & Individuality • Principle of Identity of Indiscernibles (PII) • Space Time Identity (STI) • Transcendental Identity (TI)

“Those great principles of sufficient reason and of the identity of indiscernibles change the state of metaphysics. That science becomes real and demonstrative by means of these principles, whereas before it did generally consist in empty words” Gottfried Leibniz

Individuality of Digital & Material Objects • Are digital individuals like rocks, tables, umbrellas & people, or like drops of water or money in a bank account? • Individuality appears to depend on distinguishability, and vice versa • But some entities, like sub-atomic particles, are indistinguishable • Can two entities be exactly the same, in both their internal and relational properties (including their position in spacetime)? • Not according to the Impenetrability Argument

Persistence & Change • Perdurance Theory • Endurance Theory • Stage Theory

Time & Causality • Simultaneity is a Myth • Time is not Continuous • Time does not flow • Time has no direction • Causality is a flawed concept

What is Time? “A Measure of Change” Aristotle “A persistently stubborn illusion” Einstein

Do Computer Scientists Understand Time? • A relationship with time is intrinsic to everything we do in creating, modifying and moving data • The understanding of the concept of time among computer scientists appears far behind that of physicists and philosophers • If fundamental flaws exist in the time assumptions underlying the algorithms that govern access to and evolution of our data, then our systems will fail in unpredictable ways, and any number of undesirable characteristics may follow

Simultaneity is a Myth • In 1905 Einstein showed us that the concept of “now” is meaningless except for events occurring “here” • In 1978, Leslie Lamport published “Time, Clocks, and the Ordering of Events in a Distributed System”, in which he defined the happened before relation • Unfortunately, happened before is meaningless unless intimately associated with happened where. Lamport understood this, but many who read his paper don't • In 2008, most Computer Scientists and programmers implicitly base their algorithms on absolute (Newtonian) Time, or use Lamport’s timestamps as a crutch to sweep their issues with time under the rug

Breakdown in Simultaneity - 1 Courtesy Kevin Brown http://www.mathpages.com/rr/s4-08/4-08.htm

But wait - can’t we assume an “inertial system”? • Our computers reside: • On the surface of a Rotating Sphere • In a Gravitational Field • Orbiting a Star • Our Computers are connected: • Not with light signals in a vacuum, but with a stochastic latency distribution network • Equivalence of Acceleration and variability of transmission delay in the propagation of packets • Creating coherent time sources is “problematic”
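One way to see why coherent time sources are "problematic" over a network with variable latency is to look at a request/reply clock-offset estimate of the kind used by NTP-style protocols. The sketch below is an illustration under simplifying assumptions, not something taken from the slides: the server replies instantly, clocks do not drift during the exchange, and the function and parameter names are hypothetical. The estimate is exact only when the outbound and return delays are equal; otherwise it is off by half their difference, which the client cannot observe.

```python
import random


def estimate_offset(true_offset, delay_out, delay_back, t0=0.0):
    """Estimate a server's clock offset from one request/reply exchange.

    true_offset is the (unknown to the client) amount by which the server
    clock leads the client clock. Returns (estimated_offset, round_trip).
    """
    t1 = t0 + delay_out + true_offset   # server clock when request arrives
    t2 = t1                             # assumption: server replies instantly
    t3 = t0 + delay_out + delay_back    # client clock when reply arrives
    offset = ((t1 - t0) + (t2 - t3)) / 2
    round_trip = (t3 - t0) - (t2 - t1)
    return offset, round_trip


def worst_error(samples=1000, true_offset=0.25, seed=42):
    """Largest estimation error seen over randomly asymmetric delays."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        d_out = rng.expovariate(100.0)    # mean 10 ms, heavy right tail
        d_back = rng.expovariate(100.0)
        est, rtt = estimate_offset(true_offset, d_out, d_back)
        error = abs(est - true_offset)
        assert error <= rtt / 2 + 1e-12   # error never exceeds half the RTT
        worst = max(worst, error)
    return worst


if __name__ == "__main__":
    print("symmetric path:", estimate_offset(0.25, 0.01, 0.01))
    print("worst error over asymmetric paths:", worst_error())
```

The only guarantee the client gets is that the error is at most half the measured round trip, so the uncertainty in "now" grows with the variability of the network rather than shrinking as processors get faster.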

Other difficulties with “time” • Time is not continuous • Time is change. Events are unique in spacetime. There is no such thing as an indivisible instant. Are Instants Events? • Time does not flow • There is no more evidence for the existence of anything real between one event and another, than there is for an aether to support the propagation of electromagnetic waves through empty space • Time has no direction • Time is intrinsically symmetric. We experience irreversible processes that capture “change” like a probability ratchet that prevents a wheel going backwards

Leslie Lamport 1978 • Defined “happened before” relation: a partial order • Defined “logical timestamps” which force an arbitrary total order, restricting the available concurrency of a system (i.e. the algorithm can proceed no faster than it would in a single processor) • This “concurrency efficiency loss” gets worse as: • We add more nodes to a distributed system • These nodes become more spatially separated • Our processors and networks get faster • Our processors are comprised of more cores
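A minimal sketch of Lamport's logical timestamps may help make the partial-order versus total-order point concrete. The class and variable names below are illustrative, and breaking ties by process id is one common convention for obtaining a total order; it is not the only possible one.

```python
class LamportClock:
    """Logical clock for one process; stamps are (counter, process_id)."""

    def __init__(self, pid):
        self.pid = pid
        self.time = 0

    def tick(self):
        """Stamp a local event."""
        self.time += 1
        return (self.time, self.pid)

    def send(self):
        """Stamp an outgoing message (sending is itself an event)."""
        return self.tick()

    def receive(self, stamp):
        """Stamp a receive event so it is later than the matching send."""
        self.time = max(self.time, stamp[0]) + 1
        return (self.time, self.pid)


p, q = LamportClock("p"), LamportClock("q")

a = p.tick()        # local event on p
b = q.tick()        # local event on q, causally unrelated to a
m = p.send()        # p sends a message to q
c = q.receive(m)    # q receives it: m happened before c

# Sorting by (counter, pid) forces a total order over all four events.
total_order = sorted([c, b, m, a])
print(total_order)  # [(1, 'p'), (1, 'q'), (2, 'p'), (3, 'q')]
```

Events `a` and `b` are not related by happened before, yet the total order places `a` first purely because of the tie-break on process id. That arbitrary choice is the forced ordering the slide refers to: it is safe, but it says nothing about what actually occurred where.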

The Computer Industry 2008 • The storage industry: In a Complexity Crisis • Although we can build larger systems physically, we “have to” scale-out, because “scale-up” systems are impossible to make sufficiently resilient • No-one has thought about the software • The processor industry: In a Concurrency Crisis • Gets worse with each generation of processor (the number of cores doubles each generation instead of the performance of each core) • No-one has thought about the software • What are the wicked problems getting in our way?
