Roomy: A New Approach to Parallel Disk-based Computation - Dan Kunkle

SLIDE 1

Roomy: A New Approach to Parallel Disk-based Computation

Dan Kunkle

Thesis Proposal
College of Computer and Information Science
Northeastern University
Committee: Gene Cooperman (Advisor), Panagiotis Manolios, Mirek Riedewald, Fan Yang (Google)

November 9, 2009

Dan Kunkle Roomy, disk-based computation November 9, 2009 1 / 25

SLIDE 2

Outline

1. Overview of Parallel Disk-based Computation
2. Roomy: Programming Model, Goals, and Design
3. Related Work
4. Research Goals and Applications
5. Example Application: Pancake Sorting Problem

SLIDE 4

Problem Statement

Goal: to solve space-limited problems without significantly increasing hardware costs or radically altering existing algorithms and data structures.

• A space-limited problem is one where existing solutions quickly exceed available memory.
• This could be solved by significantly increasing available RAM, but that is expensive.
• New algorithmic techniques that reduce space usage (e.g., Bloom filters) may help in certain cases, but not always.
• Our approach is to use parallel disk-based computation.

SLIDE 5

Definition: Parallel Disk-based Computation

Parallel disk-based computation: using disks, instead of RAM, as the main working memory of a computation. This provides several orders of magnitude more space for the same price.

Performance Issues and Solutions

• Bandwidth: the bandwidth of a disk is roughly 50 times lower than that of RAM (100 MB/s versus 5 GB/s). Solution: use many disks in parallel.
• Latency: even worse, the latency of disk is many orders of magnitude higher than that of RAM. Solution: avoid latency penalties by using streaming access.
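To make the two bottlenecks concrete, here is a rough cost model in C. It is only a back-of-the-envelope sketch: the 100 MB/s per-disk bandwidth comes from the slide, while the 10 ms cost per random access and the function names stream_seconds and random_seconds are assumptions chosen for illustration.

```c
#include <assert.h>

/* Seconds to stream `bytes` bytes across `disks` disks working in parallel,
 * each sustaining `mb_per_s` megabytes per second (1 MB = 10^6 bytes). */
double stream_seconds(double bytes, double mb_per_s, int disks) {
    assert(mb_per_s > 0.0 && disks > 0);
    return bytes / (mb_per_s * 1e6 * disks);
}

/* Seconds to perform `ops` independent random accesses on one disk when
 * each access pays `latency_s` seconds (ASSUMED value, e.g. 0.01 s). */
double random_seconds(double ops, double latency_s) {
    assert(latency_s >= 0.0);
    return ops * latency_s;
}
```

Under these assumptions, streaming 1 TB from a single disk takes about 10,000 seconds, 50 disks in parallel cut that to about 200 seconds, and one million random accesses at 10 ms each cost as much as streaming the whole terabyte from one disk.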

SLIDE 6

Implications of Disk-based Computation

By replacing RAM with disks

A cluster of 50 computers, each with 8 cores and 1 TB of disk space, can substitute for a shared memory computer with 400 cores and a single 50 TB memory subsystem.

Algorithm and Software Engineering Issues

• Unfortunately, writing programs that use many disks in parallel and avoid random access is often a difficult task.
• Our group has five years of case histories applying this to computational group theory, but each case requires months of development and debugging.

Rubik’s Cube in 26 moves (2007), using 8 TB of aggregate storage (CACM Viewpoint, April 2008).

SLIDE 8

Roomy

Roomy is:

• A new programming model that extends a programming language with transparent disk-based computing support.
• An open source library for C/C++ implementing this new programming language extension.

The primary goals of Roomy are:

• Minimally invasive: common data structures in sequential user code are replaced by Roomy data structures (lists, arrays, and hash tables).
• Performance: the interface biases programmers toward approaches with high performance parallel disk-based implementations.
• Choice of architectures: can use shared or distributed memory; locally attached disks or storage area networks (SAN).
• Fault tolerance: can be combined with our group’s distributed checkpointing tool DMTCP.

SLIDE 9

Roomy Programming Model

The Roomy programming model:

• Provides basic data structures (arrays, lists, and hash tables).
• Transparently distributes data structures across many disks and performs operations on that data in parallel.
• Immediately processes streaming access operators.
• Delays processing random access operators until they can be performed efficiently in batch (e.g., collecting and sorting updates to an array).
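The "collecting and sorting updates to an array" idea can be sketched in plain C. This is an in-memory illustration of the batching technique, not Roomy's implementation: the Update struct, cmp_update, and apply_delayed_updates are hypothetical names, and an ordinary int array stands in for a disk-resident array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One delayed update: set array[index] = value. */
typedef struct {
    size_t index;
    int value;
    size_t seq;   /* issue order, so later updates to the same index win */
} Update;

/* Order updates by target index, breaking ties by issue order. */
static int cmp_update(const void *a, const void *b) {
    const Update *x = a, *y = b;
    if (x->index != y->index) return x->index < y->index ? -1 : 1;
    if (x->seq != y->seq) return x->seq < y->seq ? -1 : 1;
    return 0;
}

/* Sort the buffered updates, then apply them in one ascending pass, so the
 * array is touched in streaming order instead of at random positions. */
void apply_delayed_updates(int *array, size_t len, Update *buf, size_t n) {
    qsort(buf, n, sizeof *buf, cmp_update);
    for (size_t i = 0; i < n; i++) {
        assert(buf[i].index < len);
        array[buf[i].index] = buf[i].value;
    }
}
```

Sorting costs extra work up front, but it turns many scattered writes into a single sequential sweep, which is the trade the slide describes.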

SLIDE 10

Example: Delayed Processing of Hash Table Insertions
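The figure for this slide is not preserved in the transcript. As one plausible reading of delayed hash table insertion, the C sketch below buffers each insertion in the bucket of the partition that owns the key and hands over a whole bucket at once. Everything here is an assumption for illustration (the partition count, buffer size, hash function, and the names InsertBuffers, partition_of, delayed_insert, and flush_partition); it is not taken from Roomy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_PARTS 4      /* ASSUMED number of disks/partitions */
#define BUF_CAP   1024   /* ASSUMED per-partition buffer capacity */

/* Pending insertions, grouped by the partition that owns each key. */
typedef struct {
    uint64_t keys[NUM_PARTS][BUF_CAP];
    size_t   count[NUM_PARTS];
} InsertBuffers;

/* Map a key to its owning partition with a multiplicative hash. */
unsigned partition_of(uint64_t key) {
    return (unsigned)(((key * 0x9E3779B97F4A7C15ULL) >> 32) % NUM_PARTS);
}

/* Record an insertion without touching the table. Returns the partition,
 * or -1 if that partition's buffer is full and must be flushed first. */
int delayed_insert(InsertBuffers *b, uint64_t key) {
    unsigned p = partition_of(key);
    if (b->count[p] == BUF_CAP) return -1;
    b->keys[p][b->count[p]++] = key;
    return (int)p;
}

/* Copy every buffered key for partition p into `out` (room for BUF_CAP
 * keys) as one batch, empty the buffer, and return the batch size. */
size_t flush_partition(InsertBuffers *b, unsigned p, uint64_t *out) {
    assert(p < NUM_PARTS);
    size_t n = b->count[p];
    for (size_t i = 0; i < n; i++) out[i] = b->keys[p][i];
    b->count[p] = 0;
    return n;
}
```

In a disk-based setting each flushed batch would be written to, or merged into, the partition's file in one streaming pass rather than one seek per key.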

SLIDE 11

Design of Roomy

Applications: A.I. search (pancake sorting, Rubik’s Cube); SAT solver; binary decision diagrams; explicit state model checking

Algorithm Library: breadth-first search; parallel depth-first search; dynamic programming

API:
• RoomyArray: update, predicates, delayed read, map, reduce
• RoomyList: add, remove, addAll, removeAll, removeDupes, map, reduce

Foundation: file management; remote I/O; external sorting; synchronization and barriers

SLIDE 13

Types of Disk-based Computing Systems

Current approaches to disk-based computing can be classified into a few broad categories:

• Large scale data processing: primarily motivated by a need to process very large data sets, such as in web search. Focus on scalability and fault tolerance.
  → MapReduce (Google), Hadoop (open source MapReduce), Dryad (Microsoft Research)
• Libraries of theoretically optimal algorithms: motivated by the development of external memory complexity models and algorithms.
  → TPIE, STXXL
• Roomy

SLIDE 14

Delayed Random Operations

Three ways to handle random access operations:

• Eliminate random access operations (e.g., MapReduce)
  → limits the range of algorithms that can be used
• Process random access operations immediately (e.g., STXXL)
  → may suffer large latency penalties
• Delay processing until they can be performed efficiently

The delayed processing of random access operations is one of the features that differentiates Roomy from other approaches to disk-based computation.

SLIDE 16

Research Goals

Using Roomy as a development platform, the two central research questions we seek to answer are:

• What is the class of applications for which parallel disk-based computing is practical?
• How can existing sequential algorithms and software be adapted to take advantage of parallel disk-based computing?

We will answer these questions by using Roomy to extend existing algorithms and software.

SLIDE 17

Applications of Parallel Disk-based Computation

Previous disk-based computing projects:

• 26 moves suffice for Rubik’s cube.
• Search and enumeration problems from computational group theory.
• General breadth-first search (e.g., pancake sorting problem).

Target applications of Roomy from formal verification:

• Bounded model checking using SAT solvers.
• Binary decision diagrams (BDDs).
• Explicit state model checking.

We will implement one or more of the above applications by integrating existing open source software with the Roomy library.

SLIDE 18

Potential Applications of Roomy

Discipline                Example Application
A.I. Search               Rubik’s Cube
Group Theory              Search and Enumeration in Mathematical Structures
Verification              SAT Solvers (as used in Bounded Model Checking)
Verification              Symbolic Computation using BDDs
Verification              Explicit State Verification
Coding Theory             Search for New Codes
Security                  Exhaustive Search for Passwords
Semantic Web              RDF query language; OWL Web Ontology Language
Artificial Intelligence   Planning
Proteomics                Protein folding via a kinetic network model
Operations Research       Branch-and-Bound
Operations Research       Integer Programming (applic. of Branch-and-Bound)
Economics                 Dynamic Programming
Numerical Analysis        ATLAS, PHiPAC, FFTW, and other adaptive software
Engineering               Sensor Data

SLIDE 20

Pancake Sorting Problem

Pancake sorting: sort using prefix reversal. The goal is to minimize the number of reversals used.

Example: 3142 → 1342 → 4312 → 2134 → 1234

Question: what is the maximum number of reversals needed to sort N elements?
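A prefix reversal is simple to express in C. The sketch below is illustrative only; flip_prefix and is_sorted are hypothetical helper names, not part of Roomy.

```c
#include <assert.h>
#include <stddef.h>

/* Reverse the first k elements of stack[0..n-1] in place (one flip). */
void flip_prefix(int *stack, size_t n, size_t k) {
    assert(k <= n);
    if (k < 2) return;  /* flipping 0 or 1 elements changes nothing */
    for (size_t i = 0, j = k - 1; i < j; i++, j--) {
        int tmp = stack[i];
        stack[i] = stack[j];
        stack[j] = tmp;
    }
}

/* Return 1 if stack[0..n-1] is in ascending order, else 0. */
int is_sorted(const int *stack, size_t n) {
    for (size_t i = 1; i < n; i++)
        if (stack[i - 1] > stack[i]) return 0;
    return 1;
}
```

Applying flips of length 2, 3, 4, and 2 to the stack 3142 reproduces the sequence in the example: 1342, 4312, 2134, 1234.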

SLIDE 21

Pancake Sorting Graph

[Figure: the pancake sorting graph for N = 4. The vertices are the 24 permutations of 1234; edges connect permutations that differ by one prefix reversal.]

SLIDE 22

Pancake Sorting Graph: Breadth-first Search Levels

[Figure: the pancake sorting graph for N = 4 with its breadth-first search levels marked, starting from 1234.]

SLIDE 23

Roomy Breadth-first Search Implementation

// eltSize, startElt, numNbrs, nbrs, and levName are supplied by user code

// Init lists for duplicates, current level, and next level
RoomyList* allLevList = RoomyList_make("allLev", eltSize);
RoomyList* curLevList = RoomyList_make("lev0", eltSize);
RoomyList* nextLevList = RoomyList_make("lev1", eltSize);

// Function to be mapped over current level to produce next level
void genNextLev(void* val) {
    /* User defined code to generate nbrs array inserted here. */
    for (int i = 0; i < numNbrs; i++) {
        RoomyList_add(nextLevList, nbrs[i]);
    }
}

// Add start element
RoomyList_add(allLevList, startElt);
RoomyList_add(curLevList, startElt);

// Generate levels until no new states are found
while (RoomyList_size(curLevList)) {
    // generate next level from current
    RoomyList_map(curLevList, genNextLev);

    // detect duplicates
    RoomyList_removeDupes(nextLevList);
    RoomyList_removeAll(nextLevList, allLevList);
    RoomyList_addAll(allLevList, nextLevList);

    // rotate levels
    RoomyList_destroy(curLevList);
    curLevList = nextLevList;
    nextLevList = RoomyList_make(levName, eltSize);  // levName: name for the new level's list
}
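The same generate, deduplicate, rotate loop can be tried without Roomy on a tiny instance. The following C sketch is an in-memory analog for N = 4, with sorted arrays standing in for Roomy lists. It is not Roomy code: the State type and the helpers remove_dupes, remove_all, and pancake_bfs are hypothetical names that only mirror the roles of the Roomy calls above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define N 4            /* pancakes; 4! = 24 states fit easily in RAM */
#define MAX_STATES 24  /* N! */
#define MAX_LEVELS 16

typedef struct { unsigned char p[N]; } State;

static int cmp_state(const void *a, const void *b) {
    return memcmp(a, b, sizeof(State));
}

/* Sort the list and drop adjacent duplicates (role of removeDupes). */
static size_t remove_dupes(State *list, size_t n) {
    if (n == 0) return 0;
    qsort(list, n, sizeof *list, cmp_state);
    size_t out = 1;
    for (size_t i = 1; i < n; i++)
        if (cmp_state(&list[i], &list[out - 1]) != 0) list[out++] = list[i];
    return out;
}

/* Drop from sorted `list` every state found in sorted `seen`
 * (role of removeAll), using one merge-style pass over both. */
static size_t remove_all(State *list, size_t n, const State *seen, size_t m) {
    size_t out = 0, j = 0;
    for (size_t i = 0; i < n; i++) {
        while (j < m && cmp_state(&seen[j], &list[i]) < 0) j++;
        if (j == m || cmp_state(&seen[j], &list[i]) != 0) list[out++] = list[i];
    }
    return out;
}

/* Breadth-first search from the sorted stack. Writes the size of each
 * level into level_size and returns the number of levels. */
size_t pancake_bfs(size_t level_size[MAX_LEVELS]) {
    static State all[MAX_STATES], cur[MAX_STATES], next[MAX_STATES * (N - 1)];
    size_t n_all = 1, n_cur = 1, levels = 0;
    for (int i = 0; i < N; i++) cur[0].p[i] = (unsigned char)(i + 1);
    all[0] = cur[0];

    while (n_cur > 0) {
        assert(levels < MAX_LEVELS);
        level_size[levels++] = n_cur;

        /* generate next level: every prefix reversal of length 2..N */
        size_t n_next = 0;
        for (size_t s = 0; s < n_cur; s++) {
            for (int k = 2; k <= N; k++) {
                State t = cur[s];
                for (int i = 0, j = k - 1; i < j; i++, j--) {
                    unsigned char tmp = t.p[i];
                    t.p[i] = t.p[j];
                    t.p[j] = tmp;
                }
                next[n_next++] = t;
            }
        }

        /* detect duplicates */
        n_next = remove_dupes(next, n_next);
        n_next = remove_all(next, n_next, all, n_all);

        /* add new states to the visited set and keep it sorted (addAll) */
        memcpy(all + n_all, next, n_next * sizeof *next);
        n_all += n_next;
        qsort(all, n_all, sizeof *all, cmp_state);

        /* rotate levels */
        memcpy(cur, next, n_next * sizeof *next);
        n_cur = n_next;
    }
    return levels;
}
```

For N = 4 the search visits all 24 permutations over five levels of sizes 1, 3, 6, 11, and 3, so the hardest stacks need 4 flips. Every step works by sorting and sequential merging rather than by per-element lookups, which is what lets the same structure carry over to disk.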

SLIDE 24

Experimental Results for Pancake Sorting

We performed a breadth-first search of the pancake graph for N = 13.

• The graph has approximately 6.2 billion vertices and 74 billion edges.
• The computation completed in 5 hours on a cluster with 32 nodes.
• This replicated the best result as of 2006.
• Writing the Roomy program took less than a day.

The distribution of elements in the breadth-first search is:

Level   Number of Elements
0       1
1       12
2       132
3       1451
4       14556
5       130096
6       1030505
7       7046318
8       40309555
9       184992275
10      639768688
11      1525115582
12      2183056185
13      1458670200
14      186883243
15      2001

SLIDE 25

Q & A
