GMaVis: A Domain-Specific Language for Large-Scale Geospatial Data

SLIDE 1

Introduction Related Work GMaVis GMaVis’ Compiler Evaluation and Discussion Conclusion

GMaVis: A Domain-Specific Language for Large-Scale Geospatial Data Visualization Supporting Multi-core Parallelism

Cleverson Ledur
Advisor: Ph.D. Luiz Gustavo Fernandes
Co-Advisor: Ph.D. Isabel Manssour

Pontifical Catholic University of Rio Grande do Sul - PUCRS
Computer Science Graduate Program - PPGCC
Grupo de Modelagem de Aplicações Paralelas - GMAP

November 2015

SLIDE 2

Outline

1 Introduction
2 Related Work
3 GMaVis: A DSL for Geospatial Data Visualization
4 GMaVis’ Compiler
5 Evaluation and Discussion
6 Conclusion

SLIDE 4

Data Visualization

Data visualization is the representation of data using graphic elements.
It provides a quick understanding of data.
The visualization creation pipeline has three steps: data pre-processing, data-to-visual mappings, and view transformations.

Figure: Visualization pipeline (Adapted from [1]).
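
The three pipeline steps can be sketched as three composed functions. This is only an illustration, not GMaVis code: the record layout (latitude;longitude;label), the function names, and the marker dictionary are all assumptions made for the example.

```python
from typing import Iterable

# Hypothetical record type: (latitude, longitude, label)
Record = tuple[float, float, str]

def preprocess(raw_rows: Iterable[str]) -> list[Record]:
    """Step 1, data pre-processing: parse rows and drop malformed ones."""
    records = []
    for row in raw_rows:
        parts = row.strip().split(";")
        if len(parts) != 3:
            continue
        try:
            records.append((float(parts[0]), float(parts[1]), parts[2]))
        except ValueError:
            continue  # skip rows whose coordinates are not numeric
    return records

def map_to_visual(records: list[Record]) -> list[dict]:
    """Step 2, data-to-visual mapping: turn each record into a marker."""
    return [{"lat": lat, "lng": lng, "title": label} for lat, lng, label in records]

def view_transform(markers: list[dict], zoom: int = 10) -> dict:
    """Step 3, view transformation: choose a center and zoom for the view."""
    if not markers:
        return {"center": None, "zoom": zoom, "markers": []}
    lat = sum(m["lat"] for m in markers) / len(markers)
    lng = sum(m["lng"] for m in markers) / len(markers)
    return {"center": (lat, lng), "zoom": zoom, "markers": markers}

rows = ["-30.0;-51.0;A", "bad row", "-30.2;-51.2;B"]
view = view_transform(map_to_visual(preprocess(rows)))
```

Keeping the steps separate makes it clear which one dominates the cost on large inputs: only the first step touches the raw data.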

SLIDE 5

Data Visualization

Users with little programming knowledge who need to create a geospatial data visualization may have a hard time.
For example, to create a simple web map visualization, they need to know at least JavaScript and HTML.
With a huge volume of data it becomes harder still, since most libraries and tools do not provide big data preprocessing.
Parallel processing can be used to speed up visualization creation.

SLIDE 6

Parallel Programming

Developers must worry about:

Architecture System?
Parallel Interface?
Strategy: Shared Memory, Message Passing, Hybrid.
Decomposition of the problem: Data Parallelism, Task Parallelism, Stream Parallelism.
Steps: Study the problem or code, look for parallelism opportunities, try to keep all cores busy.

Developers must also care about synchronization (locks/semaphores, synchronization through communication, barriers), data dependencies, granularity, I/O, ...

SLIDE 8

Introduction

We propose an external domain-specific language for creating large-scale data visualizations, focused on web map applications.
The main goal is to provide a high-level interface that supports the specification of visualization details and the automatic manipulation of raw data, using a parallel data pre-processor.

SLIDE 9

Introduction

Figure: Research scenario framework (Extracted from [4]).

SLIDE 11

Related Work

DSL | Domain | Focus | Parallelism | Interface
Vivaldi | Sci/Vol Vis. | Vol Rend. | Dist. CPU/GPU | High-level (H: Python)
ViSlang | Sci/Vol Vis. | Vol Rend. | CPU/GPU | High-level (H: C++)
Diderot | Img Analysis & Med. Vis. | Img Rend. & Analysis | CPU/GPU | High-level (H: C)
Shadie | Med Vis. | Vol Rend. | CPU/GPU | High-level (H: Python)
Superconductor | General | Interactive Vis. Rend. | CPU/GPU | High-level (Ext.)
GMaVis [6] | Geo. Data Vis. | Maps, Data Preprocessing | CPU | High-level (Ext.)

Vivaldi, ViSlang, Diderot, and Shadie focus on generating volumetric data visualizations.
Superconductor allows the user to create maps because it is more expressive, but it requires programming skills.
Google Maps API, Leaflet, and OpenLayers are map visualization libraries with high-level abstractions. However, they still require users to learn a programming language and to pre-process and insert the data themselves.

SLIDE 12

Related Work

Figure: Data Visualization Creation Comparison

SLIDE 13

Related Work

Table: Abstraction of complexities in each visualization creation phase.

DSL | Data Pre-processing | Data to Visual Mappings | View Transformations
Vivaldi [3] | - | X | X
ViSlang [8] | - | X | X
Diderot [2] | - | X | X
Shadie [5] | - | X | X
Superconductor [7] | - | X | X
GMaVis [6] | X | X | X

Table: Parallel processing in each visualization creation phase.

DSL | Data Pre-processing | Data to Visual Mappings | View Transformations
Vivaldi [3] | - | X | X
ViSlang [8] | - | X | X
Diderot [2] | - | X | X
Shadie [5] | - | X | X
Superconductor [7] | - | X | X
GMaVis [6] | X | - | -

SLIDE 15

Proposed DSL

Aims to facilitate the creation of visualizations.
Stays as close as possible to the domain vocabulary, with a suitable and friendly language syntax.
Users do not have to know programming concepts such as functions, variables, and methods, or any web development issue.
Users get automatic data processing that handles data filtering, cleaning, and classification.
Optimized file loading allows users to open files bigger than the RAM available in the system.
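
One common way to open a file larger than the available RAM is to read it in fixed-size chunks and emit complete records as they appear. The sketch below illustrates that general technique only; the slides do not show how GMaVis implements its loader, and the function name and chunk size are assumptions.

```python
import io
from typing import BinaryIO, Iterator

def read_records(stream: BinaryIO, chunk_size: int = 1 << 20) -> Iterator[bytes]:
    """Yield newline-separated records while holding at most one chunk
    (plus one partial record) in memory."""
    leftover = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        lines = (leftover + chunk).split(b"\n")
        leftover = lines.pop()  # the last piece may be an incomplete record
        for line in lines:
            yield line
    if leftover:
        yield leftover  # final record without a trailing newline

# A tiny in-memory stream stands in for a large file on disk.
data = io.BytesIO(b"a;1\nb;2\nc;3")
records = list(read_records(data, chunk_size=4))
```

Because the function is a generator, downstream filtering can start before the whole file has been read.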

SLIDE 16

Architecture Overview

Figure: DSL Environment
Figure: Data Pre-processor Workflow

SLIDE 17

Interface - Elements

High-level interface
Few lines of code
External DSL
Manipulation and preprocessing of data

The rule set of this DSL consists of blocks and declarations.

Figure: Example of Block and Declaration

SLIDE 18

Interface - Logical Operators

Figure: Logical Operators for Filtering and Classifying

SLIDE 19

Interface Example - Traffic Accidents in Porto Alegre (Brazil)

Figure: Traffic accidents in Porto Alegre.

SLIDE 20

Interface Example - Flickr Pictures Classified by Brand of Used Camera

Figure: Flickr pictures by brand of camera used.

SLIDE 22

Compiler Environment

Flex and Bison

The compiler receives the source code and performs lexical and syntax analysis.
It generates tokens and joins them according to the specified grammar rules.
Each rule has different actions, such as saving information, calling functions to process values, concatenating strings, and flagging.

Figure: GMaVis compiler environment.

SLIDE 24

Data Preprocessor Generator

Figure: Example of logical expression transformation.
Figure: Data preprocessor generation.

SLIDE 25

Data Preprocessor Generator

Table: Data preprocessor functions for each logical operator.

GMaVis Logical Operator | Input Data Type | Generated Function in Data Preprocessor
Is equal to | string | string_is_equal(counter,data,i,field,value)
Is equal to | date | date_is_equal(counter,data,i,field,date2)
Is equal to | integer | int_is_equal(counter,data,i,field,value)
Is equal to | float | float_is_equal(counter,data,i,field,value)
Is different than | string | string_is_different(counter,data,i,field,value)
Is different than | date | date_is_different(counter,data,i,field,date2)
Is different than | integer | int_is_different(counter,data,i,field,value)
Is different than | float | float_is_different(counter,data,i,field,value)
Is greater than | date | date_is_greater(counter,data,i,field,date2)
Is greater than | integer | int_is_greater(counter,data,i,field,value)
Is greater than | float | float_is_greater(counter,data,i,field,value)
Is less than | date | date_is_less(counter,data,i,field,date2)
Is less than | integer | int_is_less(counter,data,i,field,value)
Is less than | float | float_is_less(counter,data,i,field,value)
Is between and | date | date_is_between(counter,data,i,field,date2,date3)
Is between and | integer | int_is_between(counter,data,i,field,value1,value2)
Is between and | float | float_is_between(counter,data,i,field,value1,value2)
contains | string | string_contains(counter,data,i,field,value)
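
The idea of one comparison function per operator and input type can be sketched in Python as follows. These are hypothetical analogues, not the generated C/C++ functions: the real signature (counter, data, i, field, value) is not modeled, the inclusive bounds of the "between" test and the ISO date format are assumptions, and all names are invented for the example.

```python
from datetime import date
from typing import Callable

# Hypothetical Python analogues of the generated comparison functions.
def is_equal(field_value, value) -> bool:
    return field_value == value

def is_different(field_value, value) -> bool:
    return field_value != value

def is_greater(field_value, value) -> bool:
    return field_value > value

def is_less(field_value, value) -> bool:
    return field_value < value

def is_between(field_value, low, high) -> bool:
    return low <= field_value <= high  # inclusive bounds are an assumption

def contains(field_value: str, value: str) -> bool:
    return value in field_value

# Converters for the four input data types listed in the table.
PARSERS: dict[str, Callable[[str], object]] = {
    "string": str,
    "integer": int,
    "float": float,
    "date": date.fromisoformat,  # assumes ISO dates such as 2015-11-30
}

def keep(record: dict[str, str], field: str, dtype: str, predicate, *operands: str) -> bool:
    """Parse the field and the operands with the same type, then compare."""
    parse = PARSERS[dtype]
    return predicate(parse(record[field]), *(parse(op) for op in operands))

rows = [{"year": "2012", "type": "collision"}, {"year": "2015", "type": "rollover"}]
selected = [r for r in rows if keep(r, "year", "integer", is_between, "2013", "2016")]
```

Parsing the field before comparing matters: as strings, "9" sorts after "10", while as integers it does not.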

SLIDE 26

Visualization Generator

Figure: Data visualization generation.

SLIDE 28

Parallel Data Preprocessor

The data preprocessor module has a high computational cost [6].
We implemented a parallel version for multi-core architectures.
The code used to parallelize the data preprocessor with SPar is almost the same as the sequential version.
The GMaVis compiler generates the SPar annotations.
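
A stream-parallel pipeline of the kind SPar annotations describe can be illustrated with threads and queues. The sketch below is not SPar and not the generated code: it uses Python's standard library only to show the structure (stages connected by queues), and the record layout and stage functions are assumptions. Python threads will not give the multi-core speedup the C++ version targets.

```python
import queue
import threading

SENTINEL = None  # marks the end of the stream

def stage(worker, inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Apply worker to each item from inbox and forward the result."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)
            break
        result = worker(item)
        if result is not None:  # None means the item was filtered out
            outbox.put(result)

def parse(line: str):
    parts = line.split(";")
    return parts if len(parts) == 3 else None

def keep_recent(fields):
    return fields if int(fields[2]) >= 2014 else None

def run_pipeline(lines):
    q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=stage, args=(parse, q1, q2)),
        threading.Thread(target=stage, args=(keep_recent, q2, q3)),
    ]
    for t in threads:
        t.start()
    for line in lines:  # first stage: emit the input stream
        q1.put(line)
    q1.put(SENTINEL)
    results = []
    while (item := q3.get()) is not SENTINEL:  # last stage: collect output
        results.append(item)
    for t in threads:
        t.join()
    return results

output = run_pipeline(["-30.0;-51.2;2013", "-30.1;-51.1;2015", "broken"])
```

With one worker per stage the output order matches the input order; replicating a stage would require explicit reordering if order matters.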

SLIDE 29

Sequential Data Preprocessor

Figure: Example of sequential processing in data preprocessor.

SLIDE 30

Parallel Data Preprocessor

Figure: Parallel preprocessor using pipeline strategy.

SLIDE 31

Data Preprocessor Generator

Figure: Illustration of sequential process function code.

SLIDE 32

Data Preprocessor Generator

Figure: Illustration of process function code with SPar annotations.

SLIDE 39

Research Questions

Q1 - Is it possible to create a DSL (GMaVis) to reduce the programming effort for creating geospatial visualizations?

H1 - GMaVis requires less programming effort than visualization libraries (Google Maps API, Leaflet, and OpenLayers).
H2 - The inclusion of automatic data pre-processing in GMaVis reduces programming effort.

Q2 - Can the parallel code generated by this DSL speed up the processing of raw geospatial data?

H3 - GMaVis can generate parallel code annotations using SPar to speed up performance.
H4 - Code generation for SPar is simpler than using TBB (Threading Building Blocks) to speed up performance.

SLIDE 40

Applications - Datasets

YFCC100M: Provided by Yahoo Labs. It has about 54 GB of data, divided into ten files. It is a public multimedia dataset with 99.3 million images and 0.7 million videos, all from Flickr and under Creative Commons licensing.

Traffic Accidents: A dataset from DataPoa with traffic accident data for Porto Alegre, Brazil, containing the type of accident, vehicles, date and time, level, and location. It has 39 fields, including latitude and longitude, and about 20,937 records.

Airports: A dataset obtained from OpenFlights that contains all the airports in the world, with latitude and longitude, city, country, and airport code. It has 8,107 records with 12 fields.

Six visualization applications were created using these datasets: two for each type of visualization provided in GMaVis (markedmap, heatmap, and clusteredmap).
Each application was implemented with GMaVis, Google Maps API, Leaflet, and OpenLayers.
C/C++ data preprocessors were created for Google Maps API, Leaflet, and OpenLayers.

SLIDE 41

Evaluation

Q1 - Programming effort evaluation
  COCOMO model (SLOCCount)

Q2 - Performance evaluation
  Execution time
  Throughput

SLIDE 42

Programming Effort Evaluation

This validates the effectiveness of GMaVis in facilitating visualization creation.
The evaluation compares the effort to create a visualization using GMaVis and traditional tools.
Traditional tools: Google Maps API, Leaflet, and OpenLayers.
Six applications (two for each visualization type).
To estimate the data preprocessing effort, we measured C++ code that loads and processes the data and outputs it in the format required by the application.

SLIDE 43

SLOCCount

COCOMO model

A cost estimation model for measuring code and estimating metrics.
It estimates development time and effort based on the physical source lines of code (SLOC/KSLOC).
It covers the entire development cycle for generating a visualization, including the initial planning, coding, testing, documenting, and deploying it for users.

SLOCCount tool

A software measurement tool that counts the physical source lines of code (SLOC), ignoring empty lines and comments.
It also estimates development time, cost, and effort based on the original Basic COCOMO model.
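
As a rough illustration of how a Basic COCOMO estimate is derived from a line count, the sketch below uses the organic-mode coefficients (effort = 2.4 * KSLOC^1.05 person-months, schedule = 2.5 * effort^0.38 months). Whether the evaluation used exactly these default coefficients is an assumption, and the two line counts are illustrative inputs only.

```python
def basic_cocomo(sloc: int, a: float = 2.4, b: float = 1.05,
                 c: float = 2.5, d: float = 0.38) -> tuple[float, float]:
    """Return (effort in person-months, schedule in months).

    Defaults are the organic-mode Basic COCOMO coefficients; the
    coefficients actually used in the evaluation are not stated here.
    """
    ksloc = sloc / 1000.0          # the model works in thousands of lines
    effort = a * ksloc ** b        # person-months
    schedule = c * effort ** d     # calendar months
    return effort, schedule

# Illustrative sizes: a short DSL program versus a longer library-based one.
effort_small, months_small = basic_cocomo(15)
effort_large, months_large = basic_cocomo(46)
```

Since the exponent is greater than one, effort grows slightly faster than linearly with the line count, so fewer lines translate directly into a lower estimate.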

SLIDE 44

Performance Evaluation

It evaluates whether the parallel data preprocessor achieves better performance than the sequential version.
Since TBB was also considered for parallelizing the data preprocessor module, because it offers stream parallelism support, we compared its application to SPar.
It verifies whether a TBB-parallelized version presents better results than the automatically generated SPar version of the data preprocessor.

SLIDE 45

Performance Evaluation - Hardware/Software

Multi-core computer: Dell PowerEdge M610 blade with the following specification:
  Two Intel Xeon six-core E5645 2.4 GHz processors with Hyper-Threading
  24 GB of memory

Software:
  Operating system: Ubuntu 14.04 LTS
  C/C++ compiler: G++ 5.2
  SPar compiler: CINCLE

SLIDE 47

Effort Evaluation - Programming Effort (Entire development cycle).

Figure: Programming effort (to develop, test, document, and deploy), shown as development time in hours for GMaVis, Google Maps, OpenLayers, and Leaflet. Panels: Heatmap - Airports in World (Hm-Airp); Heatmap - Traffic Accidents in Porto Alegre (Hm-Accid); Clusteredmap - Airports in World (Ctr-Airp); Clusteredmap - Flickr Photos with "Computer" Tag (Ctr-Comp). Legend: Main, Structuring, Filtering, Formatting.

SLIDE 48

Code productivity results

Figure: Programming effort per application (disregarding data processing), shown as cost in dollars for Ctr-Airp, Ctr-Comp, Hm-Airp, Hm-Accid, Mm-Dev, and Mm-Accid. Series: GMaVis, Google Maps, OpenLayers.

Table: SLOC (physical source lines of code) for each application.

Application | GMaVis | Google Maps | OpenLayers | Leaflet
Ctr-Airp | 15 | 34 | 46 | 46
Ctr-Comp | 17 | 34 | 28 | 39
Hm-Airp | 15 | 42 | 29 | 24
Hm-Accid | 17 | 41 | 34 | 24
Mm-Dev | 25 | 34 | 22 | 27
Mm-Accid | 21 | 43 | 24 | 27

SLIDE 49

Lines of code to parallelize data preprocessor using SPar and TBB

Application | Sequential SLOC | SPar SLOC | SPar Code Increase (%) | TBB SLOC | TBB Code Increase (%)
CM CP | 210 | 216 | 2.857 | 274 | 30.476
CM AIR | 199 | 205 | 3.015 | 263 | 32.160
HM AC | 210 | 216 | 2.857 | 274 | 30.476
HM AIR | 199 | 205 | 3.015 | 263 | 32.160
MM DEV | 224 | 230 | 2.678 | 288 | 28.571
MM ACID | 216 | 222 | 2.777 | 280 | 29.629
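
The code increase column is the relative growth over the sequential line count. A minimal check of that arithmetic, using the first row of the table (the function name is made up for the example):

```python
def code_increase(sequential_sloc: int, parallel_sloc: int) -> float:
    """Percentage of extra lines relative to the sequential version."""
    return (parallel_sloc - sequential_sloc) / sequential_sloc * 100.0

# First table row: 210 sequential lines, 216 with SPar, 274 with TBB.
spar = code_increase(210, 216)   # 6 extra lines
tbb = code_increase(210, 274)    # 64 extra lines
```

In every row SPar adds 6 lines and TBB adds 64, so the percentages differ between applications only because the sequential baselines differ.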

SLIDE 51

Sequential GMaVis Execution Times

Table: Completion times (seconds) [6].

Size | Data Pre-processing | Data to Visual Mappings - Google Maps API
10 GB | 110.4948 (Std. 0.9763) | 2.910 (Std. 1.6084)
50 GB | 544.0506 (Std. 9.4225) | 3.2738 (Std. 2.0663)
100 GB | 1098.9284 (Std. 19.0383) | 3.8536 (Std. 2.7584)
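
The pre-processing times can be turned into an approximate sequential throughput by dividing the input size by the completion time. The sketch assumes 1 GB = 1000 MB, which the slides do not specify; with that assumption the three sizes all come out at roughly 90 to 92 MB/s, which suggests the sequential time grows about linearly with the input size.

```python
def throughput_mb_per_s(size_gb: float, seconds: float, mb_per_gb: float = 1000.0) -> float:
    """Input size divided by completion time (unit convention is an assumption)."""
    return size_gb * mb_per_gb / seconds

# Data pre-processing completion times (seconds) from the table, keyed by size in GB.
times = {10: 110.4948, 50: 544.0506, 100: 1098.9284}
rates = {size: throughput_mb_per_s(size, t) for size, t in times.items()}
```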

SLIDE 52

TBB and SPar SLOC

Figure: Performance results (Airports in World - Clusteredmap). Panels: execution time (seconds) and throughput (MB/s) versus number of replicas, for the big, medium, and small datasets. Series: TBB, SPar.

Figure: Performance results (Flickr Photos with "Computer" Word as Tag - Clusteredmap). Panels: execution time (seconds) and throughput (MB/s) versus number of replicas, for the big, medium, and small datasets. Series: TBB, SPar.

SLIDE 53

TBB and SPar SLOC

Figure: Performance results (Airports in World - Heatmap). Panels: execution time (seconds) and throughput (MB/s) versus number of replicas, for the big, medium, and small datasets. Series: TBB, SPar.

Figure: Performance results (Traffic Accidents in Porto Alegre - Heatmap). Panels: execution time (seconds) and throughput (MB/s) versus number of replicas, for the big, medium, and small datasets. Series: TBB, SPar.

SLIDE 54

TBB and SPar SLOC

Figure: Performance results (Traffic Accidents in Porto Alegre - Markedmap). Panels: execution time (seconds) and throughput (MB/s) versus number of replicas, for the big, medium, and small datasets. Series: TBB, SPar.

Figure: Performance results (Flickr Photos by Device - Markedmap). Panels: execution time (seconds) and throughput (MB/s) versus number of replicas, for the big, medium, and small datasets. Series: TBB, SPar.

SLIDE 56

Final Remarks

We provided a new domain-specific language for geospatial data with a simpler and friendlier interface.
It offers support for big raw datasets.
We used real-world datasets to evaluate our DSL, comparing it across different visualization types.

SLIDE 57

Final Remarks - Research Questions

Q1 - Is it possible to create a DSL (GMaVis) to reduce the programming effort for creating geospatial visualizations?

Results demonstrate that, even while abstracting the complexities of the data pre-processing phase and fully abstracting parallel programming, GMaVis reduces the effort and cost required to implement the three supported data visualizations (clusteredmap, heatmap, and markedmap) (validates H1).
GMaVis reduces not only the programming effort but also the development cost of creating geospatial data visualizations (i.e., markedmaps, clusteredmaps, and heatmaps) (validates H2).

SLIDE 58

Final Remarks - Research Questions

Q2 - Can the parallel code generated by this DSL speed up the processing of raw geospatial data?

The GMaVis compiler implementation demonstrates that it is possible to generate SPar annotations automatically to speed up the data processing (validates H3).
Results demonstrate that both versions (SPar and TBB) reduced the execution time compared to the sequential version, but SPar required less generated code (validates H4).

53 / 61

slide-62
SLIDE 62


Limitations

Shared memory architectures have a bottleneck when reading data from disk. The first version of GMaVis was implemented targeting this type of architecture.
Low expressiveness. To offer a higher level of abstraction, GMaVis hides from users some features that would allow changing visualization details such as colors, icons, and specific sizes.
This DSL focuses on geospatial data visualizations. However, data analysis may require different visualization types or techniques to achieve its objective.
The data preprocessor in GMaVis supports only non-hierarchical data files.
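One way to see why a disk-read bottleneck limits multi-core gains is Amdahl's law, which is not part of the original slides and is used here only as an illustration. The sketch below assumes that disk reading behaves as a serial fraction of the run time; the function name and the 20% figure are hypothetical and are not measurements from GMaVis.

```python
def amdahl_speedup(serial_fraction, workers):
    """Upper bound on speedup when `serial_fraction` of the run time stays serial."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be >= 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


# Hypothetical scenario: 20% of the run time is spent in serial disk reads.
# The values below are illustrative only, not results from the evaluation.
for workers in (1, 2, 4, 8, 16):
    bound = amdahl_speedup(0.2, workers)
    print(f"{workers:2d} workers -> speedup bound {bound:.2f}")
```

Under this assumption the bound approaches 1 / 0.2 = 5 no matter how many cores are added, which is consistent with the slide's point that reading from disk constrains shared memory execution.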

54 / 61

slide-66
SLIDE 66


Future Work

Improve this DSL to run on distributed memory architectures.
Support for other visualization types that use geospatial data is planned.
A pre-parser for hierarchical files is planned. Alternatively, a hierarchical data file can be converted into a non-hierarchical one.
Data declarations in this interface can be extended to offer smart classification and data selection through the use of machine learning and data mining algorithms.
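The conversion of a hierarchical file into a non-hierarchical one can be sketched as follows. This is a minimal illustration, not the planned GMaVis pre-parser: it assumes the hierarchical input is a JSON array of nested objects and the flat output is CSV, and the helper names (`flatten`, `hierarchical_to_csv`) and the sample field names are hypothetical.

```python
import csv
import io
import json


def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into a single-level dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def hierarchical_to_csv(json_text):
    """Convert a JSON array of nested objects into CSV text."""
    rows = [flatten(item) for item in json.loads(json_text)]
    # Collect columns in first-seen order so the header is stable.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # missing keys are written as empty cells
    return buffer.getvalue()


# Hypothetical input: the field names are invented for illustration only.
sample = '[{"id": 1, "location": {"lat": -30.03, "lon": -51.23}}]'
csv_text = hierarchical_to_csv(sample)
print(csv_text)
```

Nested lists are left as single values in this sketch; a real pre-parser would need a policy for repeated elements, for example one output row per list item.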

55 / 61

slide-67
SLIDE 67


Published Papers

Related to this work

Ledur, C.; Griebler, D.; Fernandes, L. G.; Manssour, I. “Uma Linguagem Específica de Domínio com Geração de Código Paralelo para Visualização de Grandes Volumes de Dados”. In: Escola Regional de Alto Desempenho (ERAD), 2015, pp. 2.
Ledur, C.; Griebler, D.; Manssour, I.; Fernandes, L. G. “Towards a Domain-Specific Language for Geospatial Data Visualization Maps with Big Data Sets”. In: ACS/IEEE International Conference on Computer Systems and Applications, 2015, pp. 8.
(SUBMISSION) Ledur, C.; Griebler, D.; Fernandes, L. G.; Manssour, I. “GMaVis: A Domain-Specific Language for Geospatial Data Visualizations”. In: IEEE Information Visualization (InfoVis), IEEE VIS, 2016.

56 / 61

slide-68
SLIDE 68


Published Papers

Contributions

Adornes, D.; Griebler, D.; Ledur, C.; Fernandes, L. G. “A Unified MapReduce Domain-Specific Language for Distributed and Shared Memory Architectures”. In: The 27th International Conference on Software Engineering & Knowledge Engineering, 2015, pp. 6.
Adornes, D.; Griebler, D.; Ledur, C.; Fernandes, L. G. “Coding Productivity in MapReduce Applications for Distributed and Shared Memory Architectures”. International Journal of Software Engineering and Knowledge Engineering, 2015, pp. 4.

57 / 61

slide-69
SLIDE 69


References I

S. K. Card, J. D. Mackinlay, and B. Shneiderman. Readings in Information Visualization: Using Vision to Think. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1999.

C. Chiw, G. Kindlmann, J. Reppy, L. Samuels, and N. Seltzer. Diderot: A Parallel DSL for Image Analysis and Visualization. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, volume 47 of PLDI’12, pages 111–120, New York, USA, June 2012. ACM.

H. Choi, W. Choi, T. Quan, D. G. Hildebrand, H. Pfister, and W.-K. Jeong. Vivaldi: A Domain-Specific Language for Volume Processing and Visualization on Distributed Heterogeneous Systems. IEEE Transactions on Visualization and Computer Graphics, 20(12):2407–2416, December 2014.

D. Griebler. A New Compiler-Based Framework Perspective for High-Level Stream Parallelism. PhD thesis, Faculdade de Informática - PPGCC - PUCRS, Porto Alegre, Brazil, March 2016.

M. Hasan, J. Wolfgang, G. Chen, and H. Pfister. Shadie: A Domain-Specific Language for Volume Visualization, 2010 (accessed December 15, 2015).

C. Ledur, D. Griebler, I. Manssour, and L. G. Fernandes. Towards a Domain-Specific Language for Geospatial Data Visualization Maps with Big Data Sets. In ACS/IEEE International Conference on Computer Systems and Applications, AICCSA’15, page 8, Marrakech, Morocco, November 2015. IEEE.

58 / 61

slide-70
SLIDE 70


References II

L. A. Meyerovich, M. E. Torok, E. Atkinson, and R. Bodík. Superconductor: A Language for Big Data Visualization, February 2013 (accessed December 17, 2015).

P. Rautek, S. Bruckner, M. Groller, and M. Hadwiger. ViSlang: A System for Interpreted Domain-Specific Languages for Scientific Visualization. IEEE Transactions on Visualization and Computer Graphics, 20(12):2388–2396, December 2014.

59 / 61

slide-71
SLIDE 71


Questions Questions & Answers

60 / 61

slide-72
SLIDE 72


Thank you

Thank you!
More questions:
cleverson.ledur@acad.pucrs.br
cleversonledur@gmail.com
http://gmap.pucrs.br/cleversonledur/

61 / 61