
2017-04-03 1

Information Retrieval vs. The Real World

Rolf Michelsen

Industrial software engineering

Creating a platform product costs an order of magnitude more than creating a program. — Frederick P. Brooks, Jr.


Background

Large Scale Web Search


Web search — algorithm

Front End

Web search — system

  • Content Processing
  • Crawler
  • Search
  • Q&R Processing
  • Index
  • Internet
  • Global Analysis


Web search — application

Intent Outcome

Web search — business model

  • Users
  • Advertisers
  • Publishers


New Possibilities


Computing is free

  • 3.3 GHz multi-core CPU
  • 4 GB memory
  • 1 TB disk
  • 150 W power supply
  • Price $899

Raspberry Pi 3

  • 1.2 GHz multi-core CPU
  • 1 GB RAM
  • Price $50

The data center is free

$ aws ec2 run-instances --image-id ami-xxxxxxxx --count 100 --instance-type t1.micro --key-name MyKeyPair


“Magic” is free

<input id="text">
<button onclick="talk()">Talk It!</button>
<button onclick="listen()">Voice</button>
<script src="http://jyunming-chen.github.io/webspeech/platform/platform.js"></script>
<script src="http://jyunming-chen.github.io/webspeech/webspeech/src/webspeech.js"></script>
<script>
  // Speech synthesis and speech recognition objects from the webspeech library loaded above.
  var speaker = new RobotSpeaker();
  var listener = new AudioListener();

  // Speak the contents of the text box in English.
  function talk() {
    speaker.speak("en", document.getElementById("text").value);
  }

  // Listen for English speech and write the recognized text into the text box.
  function listen() {
    listener.listen("en", function (text) {
      document.getElementById("text").value = text;
    });
  }
</script>

More “Magic” is free


Technical Challenges

Engineering complex systems

«Anyone can build a fast CPU. The trick is to build a fast system.»

— Seymour Cray (1926-1996)


Scaling strategy — scaling up or out

Scalable architecture

Dispatch Search
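
The Dispatch and Search labels suggest a scatter-gather layout: a dispatcher fans each query out to search nodes that each hold one partition of the index, then merges the partial results. The following is a minimal sketch of that idea, not the system described in the talk. The partition data, the names `PARTITIONS`, `search_partition`, and `dispatch`, and the term-overlap scoring are all assumptions made for illustration.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "search nodes": each holds one partition of the index
# as a mapping from document id to the set of terms in that document.
PARTITIONS = [
    {"d1": {"web", "search"}, "d2": {"search", "engine", "index"}},
    {"d3": {"web", "crawler"}, "d4": {"web", "search", "index"}},
]


def search_partition(partition, query_terms, k):
    """Score each document in one partition by its number of matching query terms."""
    hits = []
    for doc_id, terms in partition.items():
        score = len(terms & query_terms)
        if score > 0:
            hits.append((score, doc_id))
    # Each node returns only its local top k, which keeps the merge step small.
    return heapq.nlargest(k, hits)


def dispatch(query, k=3):
    """Fan the query out to all partitions in parallel and merge the partial results."""
    query_terms = set(query.lower().split())
    with ThreadPoolExecutor(max_workers=len(PARTITIONS)) as pool:
        partials = pool.map(lambda p: search_partition(p, query_terms, k), PARTITIONS)
        merged = [hit for partial in partials for hit in partial]
    # Global top k over the union of the local top-k lists.
    return heapq.nlargest(k, merged)


results = dispatch("web search")
print(results)
```

Scaling out in this sketch means adding partitions: each node searches a smaller share of the documents, while the dispatcher's merge cost grows only with the number of partitions times k.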


Large scale systems


Failure


Engineering for failure

  • Repair & recovery
  • Fault tolerance
  • Fault detection
  • Bug or failure
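
Fault detection, fault tolerance, and recovery can be illustrated with a client that fails over between replicas. This is a hedged sketch under simple assumptions: replicas fail by raising an exception, and the classes `Replica` and `FailoverClient` and the `probe` method are hypothetical names, not part of any system mentioned in the talk.

```python
class ReplicaError(Exception):
    """Raised by a replica that cannot serve a request."""


class Replica:
    """A toy replica whose health can be toggled to simulate failure and repair."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def query(self, q):
        if not self.healthy:
            raise ReplicaError(f"{self.name} is down")
        return f"{self.name} answered {q!r}"


class FailoverClient:
    def __init__(self, replicas):
        self.replicas = list(replicas)
        self.suspected = set()  # names of replicas that failed recently

    def query(self, q):
        # Fault tolerance: try replicas that are not suspected first.
        ordered = sorted(self.replicas, key=lambda r: r.name in self.suspected)
        for replica in ordered:
            try:
                answer = replica.query(q)
            except ReplicaError:
                # Fault detection: remember the failure and move on.
                self.suspected.add(replica.name)
                continue
            self.suspected.discard(replica.name)
            return answer
        raise ReplicaError("all replicas failed")

    def probe(self):
        """Recovery: re-test suspected replicas and readmit those that respond."""
        for replica in self.replicas:
            if replica.name in self.suspected:
                try:
                    replica.query("ping")
                except ReplicaError:
                    continue
                self.suspected.discard(replica.name)


a, b = Replica("a", healthy=False), Replica("b")
client = FailoverClient([a, b])
first = client.query("cap theorem")  # "a" fails, "b" answers
a.healthy = True                     # "a" is repaired
client.probe()                       # "a" is readmitted
second = client.query("cap theorem")
```

The sketch does not distinguish a bug from a failure: a replica that returns a wrong answer without raising would go undetected here.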

Synchronization


The CAP Theorem

  • Consistency
  • Availability
  • Partition tolerance




Quality

What is quality?


What is quality?

Measuring quality

«When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely, in your thoughts, advanced to the stage of science, whatever the matter may be.» — Lord Kelvin (1824 – 1907)


Performance measurements

  • Statistical models.
  • Lab setup.
  • Load generation tools and data.
  • Measurement tools and reporting.
  • Diagnostics to identify bottlenecks.
  • System skills.
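
The measurement and reporting items above can be sketched with the standard library alone. This is a minimal illustration, assuming a single-threaded load generator and wall-clock timing; the function names `measure_latencies` and `summarize` and the sorting workload are hypothetical. A real lab setup would also need concurrent load, warm-up runs, and realistic query data.

```python
import statistics
import time


def measure_latencies(operation, requests):
    """Run the operation once per request and record wall-clock latency in milliseconds."""
    latencies = []
    for request in requests:
        start = time.perf_counter()
        operation(request)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies_ms):
    """Report the mean and selected percentiles of a list of latencies."""
    # quantiles(n=100) returns 99 cut points; index i - 1 is the i-th percentile.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {
        "count": len(latencies_ms),
        "mean": statistics.fmean(latencies_ms),
        "p50": cuts[49],
        "p95": cuts[94],
        "p99": cuts[98],
    }


# Stand-in workload: sort a list 200 times and summarize the timings.
samples = measure_latencies(lambda q: sorted(q), [list(range(1000))] * 200)
report = summarize(samples)
print(report)
```

Reporting percentiles alongside the mean is a deliberate choice in this sketch: a few slow requests can leave the mean almost unchanged while the 99th percentile degrades noticeably.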

Metrics

  • Hard metrics
  • Soft metrics


User studies

The future

  • Task completion
  • Intent detection
  • Personalization

Operations

The Environment


Organizational Challenges


How do we win this war?


Lean Startup

Build → Measure → Learn

  1. Business hypothesis
  2. Iterative development
  3. Minimum viable product
  4. Validated learning
  5. Pivot

No silver bullet

There is no single development, in either technology or management technique, which by itself promises even one order-of-magnitude improvement within a decade in productivity, in reliability, in simplicity. — Frederick P. Brooks, Jr.


Technical debt

«Here is Edward Bear, coming downstairs now, bump, bump, bump on the back of his head, behind Christopher Robin. It is, as far as he knows, the only way of coming downstairs, but sometimes he feels that there really is another way, if only he could stop bumping for a moment and think of it. And then he feels that perhaps there isn’t.»

Rolf Michelsen
Rolf.michelsen@cxense.com