Inspector Gadget: A Framework for Custom Monitoring and Debugging of Distributed Dataflows



slide-1
SLIDE 1

Christopher Olston and Benjamin Reed Yahoo! Research

Inspector Gadget: A Framework for Custom Monitoring and Debugging of Distributed Dataflows

slide-2
SLIDE 2

Web Scale problems

  • Lots of servers, users, and data
  • Fun to have power at your fingertips
  • Sucks when things go wrong
slide-3
SLIDE 3

Map/Reduce

[Diagram: Input Dataset → Map tasks (per-record processing & partitioning) → Reduce tasks (per-partition processing) → Output Dataset]
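The two phases in the diagram can be illustrated with a small in-memory sketch. This is a toy model written for this transcript, not Hadoop code: `map_reduce`, `map_fn`, `reduce_fn`, and `num_partitions` are hypothetical names, and the click data is made up.

```python
from collections import defaultdict

def map_reduce(records, map_fn, reduce_fn, num_partitions=3):
    """Toy single-process model of the Map and Reduce phases."""
    # Map phase: per-record processing emits (key, value) pairs, and each
    # pair is routed to a partition by hashing its key.
    partitions = [defaultdict(list) for _ in range(num_partitions)]
    for record in records:
        for key, value in map_fn(record):
            partitions[hash(key) % num_partitions][key].append(value)
    # Reduce phase: per-partition processing, one reduce call per key.
    output = {}
    for partition in partitions:
        for key, values in partition.items():
            output[key] = reduce_fn(key, values)
    return output

# Count clicks per user (made-up data).
clicks = ["alice", "bob", "alice", "carol", "alice"]
counts = map_reduce(
    clicks,
    map_fn=lambda user: [(user, 1)],
    reduce_fn=lambda user, ones: sum(ones),
)
```

All values for a given key land in the same partition, which is what lets each reduce call see the complete group for its key.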

slide-4
SLIDE 4

Pig on Map/Reduce

[Diagram: script → Parser → flow → Optimizer/Compiler → MR job(s) → Map/Reduce Cluster]

slide-5
SLIDE 5

Example Pig Workflow

[Diagram: load, load → filter → join → group → count → store]

Pages = load 'webpages'
UserViews = load 'userclicks'
NerdPages = filter Pages by NerdFilter(content)
NerdPageViews = join NerdPages, UserViews by url
NerdUsers = group NerdPageViews by user
Counts = foreach NerdUsers generate user, COUNT(NerdPageViews)
store Counts into 'nerdviewcounts'

slide-6
SLIDE 6

Motivated by User Interviews

  • Interviewed 10 Yahoo dataflow programmers (mostly Pig users; some users of other dataflow environments)
  • Asked them how they (wish they could) debug

slide-7
SLIDE 7

Summary of User Interviews

# of requests   feature
7               crash culprit determination
5               row-level integrity alerts
4               table-level integrity alerts
4               data samples
3               data summaries
3               memory use monitoring
3               backward tracing (provenance)
2               forward tracing
2               golden data/logic testing
2               step-through debugging
2               latency alerts
1               latency profiling
1               overhead profiling
1               trial runs

slide-8
SLIDE 8

Running Pig

Pig

slide-9
SLIDE 9

Running Pig

Error! Pig

slide-10
SLIDE 10

Running Pig

Detective

Pig

slide-11
SLIDE 11

Running Pig

Detective Pig Error!

slide-12
SLIDE 12

Running Pig

Detective Pig Error!

Explanation

slide-13
SLIDE 13

Our Approach

  • Goal: a programming framework for adding debugging features to Pig
  • Precept: avoid modifying Pig or tampering with data flowing through Pig
  • Approach: rewrite the Pig script, inserting special UDFs (User Defined Functions) that look like no-ops to Pig
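The idea of an inserted step that "looks like a no-op" can be sketched as a pass-through wrapper. This is a Python stand-in written for this transcript, not the framework's actual UDF code: `with_agent` and `observations` are hypothetical names, and the page data is made up.

```python
def with_agent(step_name, records, observations):
    """Yield every record unchanged while noting it on a side channel."""
    for record in records:
        observations.append((step_name, record))  # side channel only
        yield record  # the dataflow sees the same record, untouched

observations = []
pages = [("a.com", "nerd stuff"), ("b.com", "sports")]

# Original flow: load, then filter. Rewritten flow: an agent after each step.
loaded = with_agent("load", pages, observations)
filtered = (page for page in loaded if "nerd" in page[1])
result = list(with_agent("filter", filtered, observations))
```

The rewritten flow produces exactly the output the original flow would, which is the precept stated above: the monitoring code observes the data without altering it.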

slide-14
SLIDE 14

Pig w/ Inspector Gadget

[Diagram: the example workflow (load, load, filter, join, group, count, store) with six IG agents placed along the dataflow and one IG coordinator]

slide-15
SLIDE 15

Row Integrity

[Diagram: an IG agent in the example workflow sends bad records to the IG coordinator]
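A row-level integrity check of this kind might look like the sketch below. It is an illustration only: `check_rows`, `is_valid`, and `report_bad` are hypothetical names, the validity rule (a non-empty URL) is an assumed example, and a plain list stands in for the coordinator.

```python
def check_rows(records, is_valid, report_bad):
    """Pass every record through unchanged; report the ones that fail the check."""
    for record in records:
        if not is_valid(record):
            report_bad(record)  # stands in for a message to the coordinator
        yield record            # bad records still flow on; nothing is dropped

bad_records = []  # what the coordinator would collect
views = [("alice", "a.com"), ("bob", ""), ("carol", "c.com")]

passed_through = list(
    check_rows(views, is_valid=lambda view: view[1] != "", report_bad=bad_records.append)
)
```

Bad records are reported rather than removed, so the dataflow's own output stays the same whether or not the check is attached.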

slide-16
SLIDE 16

Example: Forward Tracing

[Diagram: the user gives tracing instructions to the IG coordinator; four IG agents in the example workflow send traced records to the coordinator, which reports the traced records to the user]

slide-17
SLIDE 17

Example: Crash Culprit Determination

[Diagram: the example workflow with six IG agents and one IG coordinator]

slide-18
SLIDE 18

Crash Culprit Sending every 5th

IG coordinator

slide-19
SLIDE 19

Crash Culprit Sending every 5th

IG coordinator

slide-20
SLIDE 20

Crash Culprit Sending every 5th

IG coordinator

slide-21
SLIDE 21

Crash Culprit Sending 5th

IG coordinator

slide-22
SLIDE 22

Crash Culprit Sending every 2nd

IG coordinator

slide-23
SLIDE 23

Crash Culprit Sending every 2nd

IG coordinator

slide-24
SLIDE 24

Crash Culprit Sending every tuple

IG coordinator

slide-25
SLIDE 25

Crash Culprit Sending every tuple

IG coordinator
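The slides above show the sending rate going from every 5th record to every 2nd to every tuple, but they do not spell out the protocol. The sketch below is one plausible reading, stated as an assumption: each run reports a sampled record position before processing it, and after a crash the next run restarts from the last reported position with a finer rate. `run_with_sampling`, `find_culprit`, and `rates` are hypothetical names.

```python
def run_with_sampling(records, process, every, start=0):
    """Process records from `start`, reporting every `every`-th position.

    Returns (last_reported_index, crashed).
    """
    last_reported = None
    for i in range(start, len(records)):
        if (i - start) % every == 0:
            last_reported = i  # stands in for a message to the coordinator
        try:
            process(records[i])
        except Exception:
            return last_reported, True
    return last_reported, False

def find_culprit(records, process, rates=(5, 2, 1)):
    """Narrow down the crashing record by re-running with finer sampling."""
    start = 0
    for every in rates:
        last_reported, crashed = run_with_sampling(records, process, every, start)
        if not crashed:
            return None  # no crash observed in this run
        start = last_reported  # the culprit lies at or after this position
    return start  # at rate 1, the last reported record is the culprit

def process(record):
    if record == "bad":
        raise ValueError("cannot process record")

records = ["ok"] * 8 + ["bad"] + ["ok"] * 3
culprit_index = find_culprit(records, process)
```

Coarse sampling keeps the messaging overhead low on the first run; the finer rates are only paid for on the short stretch of input where the crash is known to occur.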

slide-26
SLIDE 26

Agent & Coordinator APIs

Agent Class
  • init(args)
  • tags = observeRecord(record, tags)
  • receiveMessage(source, message)
  • finish()

Coordinator Class
  • init(args)
  • receiveMessage(source, message)
  • output = finish()

Agent Messaging
  • sendToCoordinator(message)
  • sendToAgent(agentId, message)
  • sendDownstream(message)
  • sendUpstream(message)

Coordinator Messaging
  • sendToAgent(agentId, message)
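To show how the two classes fit together, here is a minimal single-process sketch that mirrors the method names above in Python style. It is not the framework's Java API: the in-memory message delivery, the constructor arguments, and the `CountingAgent` / `CountingCoordinator` example are assumptions made for illustration, and `sendDownstream` / `sendUpstream` are left out.

```python
class Coordinator:
    """Base coordinator: receives agent messages and produces a final output."""
    def __init__(self, args=None):
        self.agents = {}  # agent_id -> Agent (assumed in-memory registry)

    def receive_message(self, source, message):
        pass

    def finish(self):
        return None

    def send_to_agent(self, agent_id, message):
        self.agents[agent_id].receive_message("coordinator", message)


class Agent:
    """Base agent: observes records at one point in the dataflow."""
    def __init__(self, agent_id, coordinator, args=None):
        self.agent_id = agent_id
        self.coordinator = coordinator
        coordinator.agents[agent_id] = self

    def observe_record(self, record, tags):
        return tags  # default: leave the record's tags as they are

    def receive_message(self, source, message):
        pass

    def finish(self):
        pass

    def send_to_coordinator(self, message):
        self.coordinator.receive_message(self.agent_id, message)


class CountingAgent(Agent):
    """Example agent: counts the records it sees and reports the total."""
    def __init__(self, agent_id, coordinator):
        super().__init__(agent_id, coordinator)
        self.count = 0

    def observe_record(self, record, tags):
        self.count += 1
        return tags

    def finish(self):
        self.send_to_coordinator(self.count)


class CountingCoordinator(Coordinator):
    """Example coordinator: collects one record count per agent."""
    def __init__(self):
        super().__init__()
        self.counts = {}

    def receive_message(self, source, message):
        self.counts[source] = message

    def finish(self):
        return self.counts


coordinator = CountingCoordinator()
agent = CountingAgent("after_filter", coordinator)
for record in ["r1", "r2", "r3"]:
    agent.observe_record(record, set())
agent.finish()
output = coordinator.finish()
```

A per-agent record count like this is the kind of information a table-level integrity alert could be built on, for example by comparing counts before and after a step.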

slide-27
SLIDE 27

Applications Developed Using IG

# of requests   feature                          lines of code (Java)
7               crash culprit determination      141
5               row-level integrity alerts       89
4               table-level integrity alerts     99
4               data samples                     97
3               data summaries                   130
3               memory use monitoring            N/A
3               backward tracing (provenance)    237
2               forward tracing                  114
2               golden data/logic testing        200
2               step-through debugging           N/A
2               latency alerts                   168
1               latency profiling                136
1               overhead profiling               124
1               trial runs                       93

slide-28
SLIDE 28

In Paper

  • Semantics under parallel/distributed execution
  • Messaging & tagging implementation
  • Limitations
  • Performance experiments
  • Related work

slide-29
SLIDE 29

Performance Experiments

  • 15-machine Pig/Hadoop cluster (1G network)
  • Four dataflows over a small web crawl sample (10M URLs):

Dataflow Program      Early Projection   Early Aggregation   Number of
                      Optimization?      Optimization?       Map-Reduce Jobs
Distinct Inlinks      N                  N                   1
Frequent Anchortext   Y                  N                   1
Big Site Count        Y                  Y                   1
Linked By Large       N                  Y                   2

slide-30
SLIDE 30

Dataflow Running Times

slide-31
SLIDE 31

Related Work

  • XTrace, etc.
  • Taint tracking
  • Aspect-oriented programming

slide-32
SLIDE 32

Summary / Status

  • Users have a long wish-list for “debuggability”
  • Make a general framework rather than a tool for each feature
  • Addressed most features with few lines of code
  • Rather than implement them as separate features in the Pig core, we built a layer on top
  • IG (called Penny) is open source. Accepted into the Apache Pig v0.9 release (http://pig.apache.org)

slide-33
SLIDE 33

The End