Creating Data: A Boat Filled With Sauerkraut, Lukas Vermeer (@lukasvermeer)

SLIDE 1

Creating Data

A Boat Filled With Sauerkraut

@lukasvermeer
SLIDE 2
SLIDE 3

Lukas Vermeer

Experiments at Booking.com

SLIDE 4

SAUERKRAUT

Introduction

SLIDE 5

HMS Endeavour

Our voyage starts when a ship sails from Plymouth on 26 August 1768.

SLIDE 6
  • Wikipedia

“Provisions loaded at the outset of the voyage included 6,000 pieces of pork and 4,000 of beef, nine tons of bread, five tons of flour, three tons of sauerkraut, one ton of raisins and sundry quantities of cheese, salt, peas, oil, sugar and oatmeal.”

SLIDE 7
SLIDE 8

REVOLUTIONS

Chapter 1

SLIDE 9
  • Thomas Kuhn

“Progress in science is not a simple line leading to the truth.”

SLIDE 10

Aristarchus of Samos

c. 310 BC – c. 230 BC

SLIDE 11
  • Archimedes

“[Aristarchus’] hypotheses are that the fixed stars and the Sun remain unmoved, that the Earth revolves about the Sun on the circumference of a circle, […]”

SLIDE 12
  • George Pólya

“His fame rests on his heliocentric theory. […] Perhaps ‘theory’ is too strong a word, for his proofs were weak; yet it was a great idea.”

SLIDE 13

Claudius Ptolemy

c. 100 AD – c. 170 AD

SLIDE 14
  • Claudius Ptolemy

“But it has escaped [heliocentric proponents’] notice in the light of what happens around us in the air that such a notion would seem altogether absurd.”

SLIDE 15
  • Claudius Ptolemy

“For the earth would always outstrip them in its eastward motion, so that all other bodies would seem to be left behind and to move towards the west.”

SLIDE 16
  • Claudius Ptolemy

No westward motion. No stellar parallax. Geocentric math works. QED

SLIDE 17
  • Claudius Ptolemy

No observed westward motion. No observed stellar parallax. Geocentric math works to explain what we have observed. GED

SLIDE 18

Nicolaus Copernicus

1473 – 1543

SLIDE 19

Dē Revolutionibus Orbium Coelestium

Copernicus's vision of the universe, published in 1543, the year of his death, though he had formulated the theory several decades earlier.

SLIDE 20
  • Andreas Osiander

“These hypotheses need not be true nor even probable. On the contrary, if they provide a calculus consistent with the observations, that alone is enough.”

SLIDE 21

Galileo Galilei

1564 – 1642

SLIDE 22
  • Michael Fowler

“The real breakthrough that ultimately led to the acceptance of Copernicus’ theory was due to Galileo, but was actually a technological rather than a conceptual breakthrough.”

SLIDE 23

Galileo did not invent the idea. He built a better telescope.

SLIDE 24

Galileo first observed the moons of Jupiter

This observation upset the notion that all celestial bodies revolve around the Earth. Galileo published a full description in March 1610.

SLIDE 25

Multiple models could probably explain the data you already have. Determining which one is closer to the truth requires a directed effort to collect new data, including data that could contradict them.

SLIDE 26
SLIDE 27
SLIDE 28

Data You Have

Data You Need

SLIDE 29

Data You Have

Data You Need

Sauerkraut Science

SLIDE 30

TRANSIT

Chapter 2

SLIDE 31

“On the sizes and distances”

Aristarchus's 3rd-century BC calculations on the relative sizes of (from left) the Sun, Earth and Moon, from a 10th-century AD Greek copy.

SLIDE 32

Aristarchus (3rd century BC)

Distance to the sun

380 to 1,520 Earth radii

SLIDE 33

Diagram from Edmund Halley's 1716 paper

Addressed to the Royal Society showing how the Venus transit could be used to calculate the distance between the Earth and the Sun.

SLIDE 34

Route of the first voyage of James Cook

An expedition to the south Pacific Ocean aboard HMS Endeavour, from 1768 to 1771. It was the first of three Pacific voyages of which Cook was the commander.

SLIDE 35

Three years of travel. For two timestamps.

SLIDE 36
  • James Cook

“Not a Clowd was to be seen the Whole day and the Air was perfectly clear, so that we had every advantage we could desire in Observing the whole of the passage of the Planet Venus over the Suns disk.”

SLIDE 37

The "black drop effect"

As recorded during the 1769 transit by James Cook.

SLIDE 38

Right place, right time, right idea. Insufficiently accurate telescope.

SLIDE 39

Jérôme Lalande (1771)

Distance to the sun

24,000 Earth radii
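
To put the two estimates on these slides side by side, the following is a minimal sketch that converts them from Earth radii to kilometres and compares them with the modern Earth-Sun distance. The Earth radius and astronomical unit used here are modern reference values and are assumptions of this sketch; they do not appear on the slides.

```python
# Reference values that are NOT from the slides (modern figures).
EARTH_RADIUS_KM = 6_371.0   # mean Earth radius, rounded
AU_KM = 149_597_870.7       # astronomical unit in kilometres

# Estimates quoted on the slides, in Earth radii.
ESTIMATES = {
    "Aristarchus (low)": 380,
    "Aristarchus (high)": 1_520,
    "Lalande (1771)": 24_000,
}


def earth_radii_to_km(radii: float) -> float:
    """Convert a distance in Earth radii to kilometres."""
    return radii * EARTH_RADIUS_KM


def relative_error(radii: float) -> float:
    """Signed error versus the modern value, as a fraction of it."""
    return (earth_radii_to_km(radii) - AU_KM) / AU_KM


for name, radii in ESTIMATES.items():
    km = earth_radii_to_km(radii)
    print(f"{name}: {km:,.0f} km ({relative_error(radii):+.1%})")
```

Under these assumed reference values, the 1771 figure lands within a few percent of the modern distance, while both ends of the Aristarchus range fall short by more than 90 percent.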

SLIDE 40

Science is limited by data. Data is limited by engineering.

SLIDE 41
  • Elon Musk

“In the absence of the engineering, you do not have the data. You just hit a limit. You can be real smart within the context of the limit of the data you have, but unless you have a way to get more data, you can’t make progress.”

SLIDE 42

Data You Have

Data You Need

SLIDE 43

Data You Have

Data You Need

Data You COULD CREATE

SLIDE 44

TRANSMUTATION

Chapter 3

SLIDE 45
  • Michael Palmer

“Data is just like crude. It’s valuable, but if unrefined it cannot really be used. It has to be changed into gas, plastic, chemicals, etc to create a valuable entity that drives profitable activity; so must data be broken down, analyzed for it to have value.”

SLIDE 46
  • Wikipedia

“The philosopher's stone is a legendary alchemical substance capable of turning base metals such as mercury into gold or silver.”

SLIDE 47

Data Science versus Data Alchemy.

SLIDE 48

Kaggle.com

“Your home for data science”.

SLIDE 49

Something good, something bad

Hotel reviews on Booking.com.

SLIDE 50

Sentiment Analysis

Excerpt from “Entity Based Sentiment Analysis on Twitter” by Siddharth Batra and Deepak Rao (Stanford University).

SLIDE 51

Kaggle is to real-life machine learning as chess is to war

Intellectually challenging and great mental exercise, but YOU DON'T KNOW, MAN! YOU WEREN'T THERE!

SLIDE 52

Sentiment Analysis

At Booking.com, we solve the sentiment analysis challenge at data collection time.
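
One way to read "at data collection time" is that the review form itself asks for the good and the bad in separate fields, so each piece of text arrives already labeled and no classifier is needed. The sketch below illustrates that idea only; `ReviewSubmission`, its field names, and `labeled_snippets` are hypothetical and are not Booking.com's actual schema or code.

```python
from dataclasses import dataclass


@dataclass
class ReviewSubmission:
    """Hypothetical review form with one free-text box per sentiment."""
    positive_text: str
    negative_text: str


def labeled_snippets(review: ReviewSubmission) -> list[tuple[str, str]]:
    """Return (label, text) pairs for the non-empty fields.

    The label comes from which form field the guest typed into,
    not from any model prediction.
    """
    pairs = [
        ("positive", review.positive_text),
        ("negative", review.negative_text),
    ]
    return [(label, text.strip()) for label, text in pairs if text.strip()]


review = ReviewSubmission(
    positive_text="Friendly staff and a great breakfast.",
    negative_text="  ",  # guest left the negative box blank
)
print(labeled_snippets(review))
```

The design choice is to spend effort on the collection interface rather than on inferring sentiment from mixed free text afterwards.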

SLIDE 53

Data You Have

Data You Need

Data You COULD CREATE

SLIDE 54

Data You Have

Data You Need

Data You COULD CREATE

Delete or archive?

SLIDE 55

Data You Have

Data You Need

Data You COULD CREATE

Keep, or recreate?

SLIDE 56

Data You Have

Data You Need

Data You COULD CREATE

Proxies?

SLIDE 57
SLIDE 58
SLIDE 59
SLIDE 60
SLIDE 61
SLIDE 62

Deciding which data to collect, and how, is a fundamental step in the scientific method. Limited both by available theories and engineering.

SLIDE 63

Some of us are philosophers. Some of us build telescopes.

SLIDE 64
  • Voltaire

“Judge a man by his questions rather than by his answers.”

SLIDE 65
SLIDE 66

EXPERIMENTING WITH SCURVY

Appendix A

SLIDE 67

COOK’S SECRET SECOND OBJECTIVE

Appendix B

SLIDE 68

Terra Australis Nondum Cognita

1570 map by Abraham Ortelius depicting a large continent on the bottom of the map and also an Arctic continent.

SLIDE 69

Route of the first voyage of James Cook

An expedition to the south Pacific Ocean aboard HMS Endeavour, from 1768 to 1771. It was the first of three Pacific voyages of which Cook was the commander.

SLIDE 70

A new map of the world

With Captain Cook's tracks, his discoveries and those of the other circumnavigators. Published in 1800 by W. Palmer.

SLIDE 71
  • Matthew Flinders (1814)

“There is no probability, that any other detached body of land, of nearly equal extent, will ever be found in a more southern latitude”