

SLIDE 1

Green-Marl: A DSL for Easy and Efficient Graph Analysis

Sungpack Hong*, Hassan Chafi*+, Eric Sedlar+, and Kunle Olukotun*
*Pervasive Parallelism Lab, Stanford University
+Oracle Labs

SLIDE 2

Graph Analysis

Classic graphs; new applications: artificial intelligence, computational biology, …; SNS apps: LinkedIn, Facebook, …

Example: Movie Database

Graph analysis: the process of drawing further information out of a given graph dataset.

(Movie graph: James Cameron, Sigourney Weaver, Sam Worthington, Linda Hamilton, …)

Kevin Bacon: "Is he a central figure in the movie network? How much?"

Ben Stiller, Jack Black, Owen Wilson: "Do these actors work together more frequently than others?"

"What would be the avg. hop-distance between any two (Australian) actors?"

SLIDE 3

More formally,

Graph dataset:
G = (V, E): relationships (E) between data entities (V)
P: any extra data associated with each vertex or edge of graph G
Your dataset = (G, Π) = (G, P1, P2, …)

Graph analysis on (G, Π):
• Compute a scalar value, e.g. average distance, conductance, eigenvalue, …
• Compute a (new) property, e.g. (max) flow, betweenness centrality, PageRank, …
• Identify a specific subset of G, e.g. minimum spanning tree, connected components, community structure detection, …
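As an illustrative sketch (Python, not Green-Marl; all names here are invented for this example), a dataset (G, Π) can be held as adjacency lists plus per-vertex property maps, and each kind of analysis above is then a computation over them:

```python
# A tiny movie-style graph: G = (V, E) as out-adjacency lists.
G = {
    "Cameron":  ["Weaver", "Hamilton"],
    "Weaver":   ["Cameron"],
    "Hamilton": ["Cameron"],
}

# Pi = (P1, P2, ...): extra data attached to each vertex.
P1_role = {"Cameron": "director", "Weaver": "actor", "Hamilton": "actor"}

# A scalar analysis on (G, Pi): average out-degree.
avg_degree = sum(len(nbrs) for nbrs in G.values()) / len(G)

# A (new) property computed on (G, Pi): number of actor neighbors per vertex.
P2_actor_nbrs = {
    v: sum(1 for w in G[v] if P1_role[w] == "actor") for v in G
}
```

A "subset of G" analysis would similarly return a set of vertices or edges selected by a predicate.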

SLIDE 4

The Performance Issue

• Traditional single-core machines showed limited performance for graph analysis problems: a lot of random memory accesses, and the data does not fit in cache, so performance is bound to memory latency. Conventional hardware (e.g. floating-point units) does not help much.
• Use parallelism to accelerate graph analysis: there is plenty of data-parallelism in large graph instances, and performance then depends on memory bandwidth, not latency.
• Exploit modern parallel computers: multi-core CPU, GPU, Cray XMT, cluster, …

SLIDE 5

New Issue: Implementation Overhead

It is challenging to implement a graph algorithm
• correctly
• and efficiently
• while applying parallelism
• differently for each execution environment

SLIDE 6

Our Approach: DSL

• We design a domain-specific language (DSL) for graph analysis.
• The user writes his/her algorithm concisely in our DSL: an intuitive description of the graph algorithm, e.g.:

  Foreach (t: G.Nodes) t.sigma += …

• The compiler translates it into an efficient (parallel) implementation in the target language (e.g. parallel C++ or CUDA), e.g.:

  for (i = 0; i < G.numNodes(); i++) {
    __fetch_and_add(G.nodes[i], …);
  }

• The DSL compiler exploits (1) inherent data-parallelism, (2) good implementation templates (e.g. Edgeset Foreach, BFS), and (3) high-level optimization.

SLIDE 7

Example: Betweenness Centrality

Betweenness centrality (BC): a measure that tells how 'central' a node is in the graph. Used in social network analysis.

Definition: how many shortest paths between any two nodes go through this node.

(High BC: e.g. Kevin Bacon. Low BC: e.g. Ayush K. Kehdekar.) [Image source: Wikipedia]

SLIDE 8

Example: Betweenness Centrality [Brandes 2001]

The reference pseudocode uses queues, lists, and a stack; it looks complex. Is this parallelizable?

• Init BC for every node and begin the outer loop (over sources s)
• In BFS order: compute sigma from parents
• In reverse BFS order: compute delta from children
• Accumulate delta into BC
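As a reference point, a sequential Python sketch of Brandes' algorithm (not the Green-Marl code from the slides) makes the annotated steps concrete:

```python
from collections import deque

def betweenness_centrality(graph):
    """Sequential sketch of Brandes' algorithm [Brandes 2001].
    graph maps each vertex to a list of its neighbors."""
    bc = {v: 0.0 for v in graph}             # init BC for every node
    for s in graph:                          # outer loop over sources s
        sigma = {v: 0 for v in graph}        # number of shortest s-v paths
        dist = {v: -1 for v in graph}
        delta = {v: 0.0 for v in graph}
        sigma[s], dist[s] = 1, 0
        order, queue = [], deque([s])
        while queue:                         # forward traversal in BFS order:
            v = queue.popleft()              # compute sigma from parents
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:   # v is a BFS parent of w
                    sigma[w] += sigma[v]
        for w in reversed(order):            # reverse BFS order:
            for c in graph[w]:               # compute delta from children
                if dist[c] == dist[w] + 1:
                    delta[w] += sigma[w] / sigma[c] * (1.0 + delta[c])
            if w != s:
                bc[w] += delta[w]            # accumulate delta into BC
    return bc
```

The outer loop and the per-level BFS steps are where the parallelism discussed on the next slides comes from.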

SLIDE 9

Example: Betweenness Centrality [Brandes 2001]

(Same figure as the previous slide: BFS order, compute sigma from parents; reverse BFS order, compute delta from children.)

SLIDE 10

Example: Betweenness Centrality [Brandes 2001]

The same structure maps onto parallel constructs:
• Parallel iteration and parallel assignment (init BC, outer loop)
• Parallel BFS: compute sigma from parents
• Reverse BFS order, with reduction: compute delta from children

SLIDE 11

DSL Approach: Benefits

Three benefits:
• Productivity
• Portability
• Performance

SLIDE 12

Productivity Benefits

A common limiting resource in software development is your brain power (i.e. how long can you …?).

A C++ implementation of BC from SNAP (a parallel graph library from GT): ≈ 400 lines of code (with OpenMP). Vs. Green-Marl LOC: 24.

*Green-Marl (그린 말) means … in Korean.

SLIDE 13

Productivity Benefits

Algorithm      Original LOC   Green-Marl LOC   Original source
BC             ~400           24               SNAP (C++, OpenMP)
Vertex Cover   71             21               SNAP (C++, OpenMP)
Conductance    42             10               SNAP (C++, OpenMP)
Page Rank      75             15               http://… (C++, single thread)
SCC            65             15               http://… (Java, single thread)

It is more than LOC:
• Focusing on the algorithm, not its implementation
• More intuitive, less error-prone
• Rapidly explore many different algorithms

SLIDE 14

Portability Benefits (Ongoing Work)

Multiple compiler targets: the same DSL description is compiled (back-end selected by command-line argument) to parallelized C++, CUDA for GPU, or code for a cluster, each with its own library (& runtime).

• SMP back-end: (parallelized) C++
• GPU back-end (*): for small instances; we know some tricks [Hong et al. PPOPP 2011]
• Cluster back-end (*): for large instances; we generate code that works on the Pregel API [Malewicz et al. SIGMOD 2010]

SLIDE 15

Performance Benefits

The compiler uses high-level semantic information through its pipeline:
• Parsing & checking
• Architecture-independent optimization
• Architecture-dependent optimization (target arch.: SMP? GPU? distributed?)
• Code generation: optimized data structures & code templates, threading library (e.g. OpenMP), graph data structure, back-end-specific optimization

SLIDE 16

Arch-Indep Opt: Loop Fusion

Two consecutive loops over the same "set" of nodes (whose elements are unique) can be fused into one.

This optimization is enabled by high-level (semantic) information: a C++ compiler cannot merge such loops, since independence is not guaranteed.
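A minimal sketch of the idea (illustrative Python; the property names are invented): two passes over the same unique node set become one, which is legal here because each iteration only touches its own node's data:

```python
nodes = ["a", "b", "c"]              # a "set" of nodes: elements are unique
deg = {"a": 2, "b": 1, "c": 3}

# Before fusion: two separate loops over the same node set.
p, q = {}, {}
for n in nodes:
    p[n] = deg[n] + 1
for n in nodes:
    q[n] = p[n] * 2

# After fusion: one loop. The DSL compiler can prove each iteration is
# independent from the language's semantics; a C++ compiler cannot in general.
p2, q2 = {}, {}
for n in nodes:
    p2[n] = deg[n] + 1
    q2[n] = p2[n] * 2

assert (p2, q2) == (p, q)            # same result, one pass over memory
```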

SLIDE 17

Arch-Indep Opt: Flipping Edges

A graph-specific optimization. Two equivalent formulations of the same computation:
• Push: adding 1 to … for all outgoing neighbors, if my B value is positive
• Pull: counting the number of incoming neighbors whose B value is positive

Why? Reverse edges may not be available, or may be expensive to compute. This is an optimization using a domain-specific property.
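A sketch of the equivalence (illustrative Python; the property names A and B and the toy graph are invented): the push form writes through outgoing edges, the pull form reads through incoming edges, and the compiler may rewrite one into the other, e.g. when reverse edges are unavailable:

```python
out_nbrs = {"s": ["t", "u"], "t": ["u"], "u": []}   # toy directed graph
B = {"s": 1, "t": -1, "u": 2}

# Push form: each node with positive B adds 1 to all outgoing neighbors.
A_push = {n: 0 for n in out_nbrs}
for n in out_nbrs:
    if B[n] > 0:
        for m in out_nbrs[n]:
            A_push[m] += 1

# Pull form: each node counts incoming neighbors with positive B.
# It needs reverse edges, which are built explicitly here.
in_nbrs = {n: [] for n in out_nbrs}
for n, ms in out_nbrs.items():
    for m in ms:
        in_nbrs[m].append(n)
A_pull = {n: sum(1 for m in in_nbrs[n] if B[m] > 0) for n in out_nbrs}

assert A_push == A_pull              # the two formulations agree
```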

SLIDE 18

Arch-Dep Opt: Selective Parallelization

The compiler flattens nested parallelism with a heuristic. The example has three levels of nested parallelism plus reductions; the compiler chooses which region to run in parallel, heuristically, and the reductions in the regions that become sequential turn into normal reads & writes.

Why? The graph is large, the number of cores is small, and there is overhead for parallelization.

This optimization is enabled by both architectural and domain knowledge.
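An illustrative Python sketch of the shape of this transformation (not compiler output; the toy ring graph and counts are invented): only the outermost level runs in parallel, so the inner loop's reduction becomes a plain read-modify-write on task-local data:

```python
from concurrent.futures import ThreadPoolExecutor

nodes = list(range(6))
nbrs = {n: [(n + 1) % 6] for n in nodes}  # toy ring graph

def work(s):
    # The inner loops run sequentially inside one task, so their
    # reduction is a plain update on a task-local dict: no atomics.
    local = {n: 0 for n in nodes}
    for v in nodes:
        for w in nbrs[v]:
            local[w] += 1
    return local

# Only the outermost level is parallelized (one task per source node).
with ThreadPoolExecutor() as pool:
    partials = list(pool.map(work, nodes))

# A single remaining reduction combines the task-local results.
total = {n: sum(p[n] for p in partials) for n in nodes}
```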

SLIDE 19

CodeGen: Saving Down-Nbrs in BFS

Prepare the data structure for reverse BFS traversal during the forward traversal. The compiler detects that down-neighbors (BFS children) are used in the reverse traversal, so the generated code saves them during the forward traversal; the reverse traversal can then iterate over the saved children only.

Optimization enabled by code analysis (i.e. no BFS library could do this automatically).
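A sketch of the generated behavior (illustrative Python; the function name and toy graph are invented): the forward BFS records each node's down-neighbors, so a reverse-order pass can iterate over the saved children only instead of re-filtering all neighbors:

```python
from collections import deque

def bfs_with_down_nbrs(nbrs, root):
    """Forward BFS that also records each node's down-neighbors
    (children one level further from the root)."""
    dist = {root: 0}
    down = {v: [] for v in nbrs}
    order, queue = [], deque([root])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in nbrs[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
            if dist[w] == dist[v] + 1:
                down[v].append(w)    # saved during the forward pass
    return order, down

# The reverse-BFS-order pass now touches only the saved children,
# e.g. counting the vertices reachable below each node:
nbrs = {"s": ["a", "b"], "a": ["t"], "b": ["t"], "t": []}
order, down = bfs_with_down_nbrs(nbrs, "s")
count = {v: 1 for v in nbrs}
for v in reversed(order):
    for c in down[v]:
        count[v] += count[c]
```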

SLIDE 20

CodeGen: Code Templates

Data structures:
• Graph: similar to a conventional graph library
• Collections: custom implementations

Code generation templates:
• BFS: Hong et al. PACT 2011 (for CPU and GPU); better implementations are coming and can be adapted transparently
• DFS: inherently sequential

The compiler takes any benefit that a (template) library would give, as well.

SLIDE 21

Experimental Results

Betweenness centrality implementations:
(1) [Bader and Madduri ICPP 2006]
(2) [Madduri et al. IPDPS 2009]: applies some new optimizations; performance improved over (1) by ~2.3x on a Cray XMT
The parallel implementation available in the SNAP library is based on (1), not (2) (for x86).

Our experiment: start from the DSL description (as shown previously) and let the compiler apply the optimizations in (2).

SLIDE 22

Experimental Results (two different synthetic graphs)

Effects of other optimizations:
• Flipping edges
• Saving BFS children

Parallel performance difference: Nehalem (8 cores x 2 HT), 32M nodes, 256M edges; speedup shown over the baseline, SNAP (single thread). Better single-thread performance comes from (1) efficient BFS code and (2) no unnecessary locks.

SLIDE 23

Other Results

• Performance similar to the manual implementation
• Loop fusion
• Privatization
• The original code has a data race; the naïve correction (omp_critical) causes serialization. Alternatives: test-and-test-set, privatization.
SLIDE 24

Other Results

• Compared against sequential implementations
• Automatic parallelization goes as far as the exposed data parallelism allows (i.e. there is no black magic)
• DFS + BFS: max speedup is 2 (Amdahl's Law)

SLIDE 25

Conclusion

Green-Marl: a DSL designed for graph analysis.

Three benefits:
• Productivity
• Portability
• Performance

Project page: ppl.stanford.edu/main/green_marl.html
GitHub repository: github.com/stanford-ppl/Green-marl