CLOUD PROGRAMMING
Andrew Harris & Long Kai
MOTIVATION
Research problem: How to write distributed data-parallel programs for a compute cluster?
Drawback of Parallel Databases (SQL): Too limited for many applications.
Very restrictive type system.
The declarative query is unnatural.
Drawback of Map Reduce: Too low-level and rigid, and leads to a great deal of custom user code that is hard to maintain and reuse.
LAYERS (top to bottom)
Applications: Image Processing, Machine Learning, Graph Analysis, Data Mining, Other Applications
Pig Latin / DryadLINQ, Other Languages
Hadoop Map-Reduce / Dryad
Cluster Services
Server, Server, Server
PIG LATIN: A Not-So-Foreign Language for Data Processing
DATAFLOW LANGUAGE
In Pig Latin, the user specifies a sequence of steps where each step specifies only a single, high-level data transformation. This style is procedural, which is desirable for programmers.
With SQL, the user specifies a set of declarative constraints. This style is desirable for non-programmers.
A SAMPLE OF PIG LATIN CODE
SQL:
    SELECT category, AVG(pagerank)
    FROM urls
    WHERE pagerank > 0.2
    GROUP BY category
    HAVING COUNT(*) > 10^6
Pig Latin:
    good_urls = FILTER urls BY pagerank > 0.2;
    groups = GROUP good_urls BY category;
    big_groups = FILTER groups BY COUNT(good_urls) > 10^6;
    output = FOREACH big_groups GENERATE category, AVG(good_urls.pagerank);
Pig Latin program is a sequence of steps, each of which carries out a single data transformation.
DATA MODEL
Atom: Contains a simple atomic value such as a
string or a number, e.g., ‘Joe’.
Tuple: Sequence of fields, each of which might be any
data type, e.g., (‘Joe’, ‘lakers’)
Bag: A collection of tuples with possible duplicates.
Schema of a bag is flexible.
Map: A collection of data items, where each item has
an associated key through which it can be looked up. Keys must be data atoms.
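The four types map loosely onto familiar Python structures. The sketch below is an analogy only, not how Pig stores data; every sample value other than ('Joe', 'lakers') is made up for illustration.

```python
# Rough Python analogy for Pig Latin's data model (illustrative, not Pig itself).

atom = "Joe"                      # Atom: a simple atomic value

tup = ("Joe", "lakers")           # Tuple: a sequence of fields of any type

# Bag: a collection of tuples. Duplicates are allowed and tuples need not
# share a schema, so a plain list is used here rather than a set.
bag = [("Joe", "lakers"), ("Joe", "lakers"), ("Joe", ("iPod", "apple"))]

# Map: items looked up by key. Keys must be atoms; values may be any type,
# including a nested bag.
mapping = {"fan of": [("lakers",), ("iPod",)], "age": 20}

# Nesting is allowed: a tuple may hold a bag and a map as fields.
nested = ("Joe", bag, mapping)
```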
A COMPARISON WITH RELATIONAL ALGEBRA
Pig Latin: Everything is a bag. Dataflow language. FILTER is the same as the Select operator.
Relational Algebra: Everything is a table. Dataflow language. The Select operator is the same as the FILTER command.
Pig Latin includes only a small set of carefully chosen primitives that can be easily parallelized.
SPECIFYING INPUT DATA: LOAD
queries = LOAD 'query_log.txt' USING myLoad() AS (userId, queryString, timestamp);
The input file is “query_log.txt”.
The input file should be converted into tuples by using the custom myLoad deserializer.
The loaded tuples have three fields named userId, queryString, and timestamp.
Note that the LOAD command does not imply database-style loading into tables. The load is only logical.
PER-TUPLE PROCESSING: FOREACH
expanded_queries = FOREACH queries GENERATE userId, expandQuery(queryString);
expandQuery is a User Defined Function.
Nesting can be eliminated by the use of the FLATTEN keyword in the GENERATE clause.
expanded_queries = FOREACH queries GENERATE userId, FLATTEN(expandQuery(queryString));
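The difference between the nested and the flattened FOREACH can be sketched in Python. This is a minimal illustration, assuming a hypothetical expand_query function that stands in for the expandQuery UDF; the sample rows are invented.

```python
def expand_query(query):
    """Hypothetical stand-in for the expandQuery UDF: returns a bag of expansions."""
    return [query, query + " news"]

def foreach_nested(queries):
    # Without FLATTEN: one output tuple per input tuple, holding a nested bag.
    return [(user, expand_query(q)) for user, q, _ts in queries]

def foreach_flatten(queries):
    # With FLATTEN: the nested bag is unnested into one tuple per expansion.
    return [(user, e) for user, q, _ts in queries for e in expand_query(q)]

queries = [("alice", "lakers", 1), ("bob", "iPod", 2)]
nested_out = foreach_nested(queries)
flat_out = foreach_flatten(queries)
```

Here nested_out has two tuples, each carrying a bag of two expansions, while flat_out has four flat (userId, expansion) tuples.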
DISCARDING UNWANTED DATA: FILTER
real_queries = FILTER queries BY userId neq 'bot';
real_queries = FILTER queries BY NOT isBot(userId);
Again, isBot is a User Defined Function.
Operations might be ==, eq, !=, neq, <, >, <=, >=
A comparison operation may utilize Boolean operators (AND, OR, NOT).
GETTING RELATED DATA TOGETHER: COGROUP
grouped_data = COGROUP results BY queryString, revenue BY queryString;
COGROUP groups together tuples from one or more data sets that are related in some way, so that they can subsequently be processed together.
In general, the output of a COGROUP contains one tuple for each group.
The first field of the tuple (named group) is the group identifier. Each of the next fields is a bag, one for each input being cogrouped.
MORE ABOUT COGROUP
COGROUP + FLATTEN = JOIN
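A minimal Python sketch of both ideas, assuming the group key is the first field of each tuple. The cogroup and join helpers and the sample results/revenue rows are hypothetical and only illustrate the semantics.

```python
from collections import defaultdict
from itertools import product

def cogroup(left, right, key=lambda t: t[0]):
    """One output tuple per group: (group, bag_from_left, bag_from_right)."""
    groups = defaultdict(lambda: ([], []))
    for t in left:
        groups[key(t)][0].append(t)
    for t in right:
        groups[key(t)][1].append(t)
    return [(k, l, r) for k, (l, r) in groups.items()]

def join(left, right, key=lambda t: t[0]):
    """Flattening both bags of each group gives their cross product: an equi-join."""
    return [l + r
            for _k, ls, rs in cogroup(left, right, key)
            for l, r in product(ls, rs)]

results = [("lakers", "nba.com", 1), ("lakers", "espn.com", 2), ("kings", "nhl.com", 1)]
revenue = [("lakers", "top", 50), ("lakers", "side", 20), ("iPod", "top", 40)]

grouped_data = cogroup(results, revenue)   # 3 groups: lakers, kings, iPod
joined = join(results, revenue)            # only lakers has tuples on both sides
```

Groups whose bag is empty on one side contribute nothing to the join, because the cross product with an empty bag is empty.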
EXAMPLE: MAP-REDUCE IN PIG LATIN
map_result = FOREACH input GENERATE FLATTEN(map(*));
key_groups = GROUP map_result BY $0;
output = FOREACH key_groups GENERATE reduce(*);
A map function operates on one input tuple at a time, and outputs a bag of key-value pairs.
The reduce function operates on all values for a key at a time to produce the final results.
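The same three steps can be mimicked in plain Python. This is a sketch under assumptions: map_fn and reduce_fn are hypothetical word-count UDFs chosen for illustration, and everything runs in one process.

```python
from collections import defaultdict

def map_fn(record):
    """Hypothetical map UDF: one input tuple in, a bag of (key, value) pairs out."""
    (line,) = record
    return [(word, 1) for word in line.split()]

def reduce_fn(key, values):
    """Hypothetical reduce UDF: all values for one key in, one result tuple out."""
    return (key, sum(values))

def run(records):
    # FOREACH ... GENERATE FLATTEN(map(*)): apply map per tuple and unnest.
    map_result = [pair for rec in records for pair in map_fn(rec)]
    # GROUP map_result BY $0: collect the values that share a key.
    key_groups = defaultdict(list)
    for key, value in map_result:
        key_groups[key].append(value)
    # Apply reduce once per group.
    return [reduce_fn(k, vs) for k, vs in key_groups.items()]

output = run([("a b a",), ("b c",)])
```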
IMPLEMENTATION
Building a logical plan: Pig builds a logical plan for every bag that the user defines.
No processing is carried out when the logical plans are constructed. Processing is triggered only when the user invokes a STORE command on a bag.
Compilation of the logical plan into a physical plan.
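Deferred execution of this kind can be sketched as a bag handle that only records its plan. The Bag class below is hypothetical and far simpler than Pig's planner; it exists only to show that nothing runs before store().

```python
class Bag:
    """A bag handle that only records its logical plan (illustrative class)."""

    executed = 0  # counts how many plan steps have actually run

    def __init__(self, plan):
        self.plan = plan  # list of (operator name, function) steps

    @classmethod
    def load(cls, rows):
        return cls([("LOAD", lambda _: list(rows))])

    def filter(self, pred):
        step = ("FILTER", lambda data: [t for t in data if pred(t)])
        return Bag(self.plan + [step])

    def foreach(self, fn):
        step = ("FOREACH", lambda data: [fn(t) for t in data])
        return Bag(self.plan + [step])

    def store(self):
        # Only STORE walks the plan and triggers processing.
        data = None
        for _name, step in self.plan:
            data = step(data)
            Bag.executed += 1
        return data

queries = Bag.load([("alice", "lakers"), ("bot", "spam")])
real = queries.filter(lambda t: t[0] != "bot").foreach(lambda t: t[1])
steps_before_store = Bag.executed   # nothing has run yet
result = real.store()
```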
MAP-REDUCE PLAN COMPILATION
The map-reduce primitive essentially provides the ability to do a large-scale group by, where the map tasks assign keys for grouping, and the reduce tasks process a group at a time.
Pig converts each (CO)GROUP command in the logical plan into a distinct map-reduce job with its own map and reduce functions.
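A rough sketch of how a linear plan might be cut into jobs at each (CO)GROUP. The placement of the remaining operators is an assumption made for this example: operators before the first (CO)GROUP go into that job's map phase, and operators after a (CO)GROUP go into its reduce phase. The compile_to_jobs helper is hypothetical, and plans without any (CO)GROUP are out of scope here.

```python
def compile_to_jobs(ops):
    """Split a linear list of operator names into map-reduce jobs.

    Each GROUP/COGROUP opens a new job. Assumption for this sketch: operators
    before the first (CO)GROUP run in that job's map phase, and operators
    after a (CO)GROUP run in its reduce phase until the next (CO)GROUP.
    """
    jobs, pending_map = [], []
    for op in ops:
        if op in ("GROUP", "COGROUP"):
            jobs.append({"map": pending_map, "group": op, "reduce": []})
            pending_map = []
        elif jobs:
            jobs[-1]["reduce"].append(op)
        else:
            pending_map.append(op)
    return jobs

plan = ["LOAD", "FILTER", "GROUP", "FOREACH", "COGROUP", "FOREACH", "STORE"]
jobs = compile_to_jobs(plan)
```

For this plan the sketch yields two jobs, one per (CO)GROUP.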
OTHER FEATURES
Fully nested data model.
Extensive support for user-defined functions.
Manages plain input files without any schema information.
A novel debugging environment.
DISCUSSION: PIG LATIN MEETS MAP-REDUCE
Is it necessary to run Pig Latin on a Map-Reduce platform?
Is Map-Reduce a perfect platform for Pig Latin?
Any drawbacks?
Data must be materialized and replicated on the distributed file system between successive map-reduce jobs.
Not flexible enough.
Still, it works well: it provides parallelism, load-balancing, and fault-tolerance.
DRYAD EXECUTION PLATFORM
The job execution plan is a dataflow graph.
A Dryad application combines computational “vertices” with communication “channels” to form a dataflow graph.
MAP-REDUCE IN DRYADLINQ
IMPLEMENTATION - OPTIMIZATIONS
Static Optimizations
Pipelining: Multiple operators may be executed in a single process.
Removing redundancy: DryadLINQ removes unnecessary partitioning steps.
Eager Aggregation: Aggregations are moved in front of partitioning operators where possible.
I/O reduction: Where possible, uses TCP-pipe and in-memory FIFO channels instead of persisting temporary data to files.
Dynamic Optimizations
Dynamically sets the number of vertices in each stage at run time based on the size of its input data.
Dynamically mutates the execution graph as information from the running job becomes available.
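The effect of eager aggregation can be shown with a toy word count. This is a sketch, not DryadLINQ code: the shuffle helper, its length-based partitioner, and the sample partitions are all invented for illustration.

```python
from collections import Counter

def shuffle(pairs_per_partition, n_targets):
    """Route (word, count) pairs to target buckets.

    Returns the merged totals and the number of pairs that were sent.
    """
    buckets = [Counter() for _ in range(n_targets)]
    sent = 0
    for pairs in pairs_per_partition:
        for word, count in pairs:
            buckets[len(word) % n_targets][word] += count  # toy partitioner
            sent += 1
    total = Counter()
    for b in buckets:
        total.update(b)
    return total, sent

partitions = [["a", "b", "a", "a"], ["b", "b", "cc"]]

# Aggregate after partitioning: every raw record is sent.
late_total, late_sent = shuffle([[(w, 1) for w in p] for p in partitions], 2)

# Eager aggregation: pre-aggregate inside each partition, then partition.
eager_total, eager_sent = shuffle([Counter(p).items() for p in partitions], 2)
```

Both variants produce the same totals because counting is associative and commutative, but the eager variant sends fewer pairs across the partitioning step.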
MAP-REDUCE IN DRYADLINQ
Step (1) is static; steps (2) and (3) are dynamic, based on the volume and location of the data in the inputs.
Long Kai and Andrew Harris
data
(data pieces do not depend on other data pieces) or only marginally dependent
search platform - driving app: Google’s indexer
updates to petabyte-scale data sets
allowing for “fresher” search results
documents (billions per day)
compile web index for these documents
spent 2-3 days being indexed
Figure labels: App with Percolator Library; Bigtable Tabletserver; Chunkserver
database
documents
All communication handled via RPCs
Single lines of code in observer
Google indexing system uses ~10 observers
handled as an ACID transaction
deadlock resolution
via coordinated timestamp oracle
Result of dropping 33% of tablet servers in use
connection with Bigtable
location
previously failed transaction
connections are unidirectional)
Bigtable
column is changed one
NOTIFY Observer
new update transaction
receives most recent column data
Observer
Key Value Notify
Search Thread, Search Thread, Search Thread
Percolator workers spawn threads which search randomly, report changed cells to
(sequential search) (transactions)
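The notify-and-observe pattern suggested by these fragments can be sketched in a few lines. Everything here is an assumption for illustration: the Table class, the worker_scan function, and the sample rows are hypothetical, the scan is single-threaded, and no transactions are modeled.

```python
import random

class Table:
    """Toy stand-in for a table with Key, Value and Notify columns."""

    def __init__(self):
        self.rows = {}  # key -> {"value": ..., "notify": bool}

    def write(self, key, value):
        # Changing a column also sets its notify flag.
        self.rows[key] = {"value": value, "notify": True}

def worker_scan(table, observer, rng):
    """Visit rows in random order and run the observer on changed cells."""
    keys = list(table.rows)
    rng.shuffle(keys)
    for key in keys:
        row = table.rows[key]
        if row["notify"]:
            row["notify"] = False        # clear so the observer runs once per change
            observer(key, row["value"])  # observer sees the most recent value

calls = []

def observer(key, value):
    calls.append((key, value))

table = Table()
table.write("doc1", "v1")
table.write("doc1", "v2")   # overwritten before any scan happens
table.write("doc2", "v1")
worker_scan(table, observer, random.Random(0))
worker_scan(table, observer, random.Random(1))  # finds nothing new
```

In this sketch the observer fires once per changed row and only sees the latest value, even though doc1 was written twice before the scan.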
against comparison DBMS (TPC-E, a stock market trading backend)
previous!
when does scaling break down?
dependent or rapidly mutating data sets
linked children may report updates before their parents - implications?