High-Performance Online Spatial and Temporal Aggregations on Multi-core CPUs and Many-Core GPUs
Jianting Zhang (1,2), Simin You (2), Le Gruenwald (3)
(1) Department of Computer Science, CUNY City College (City College of New York)
(2) Department of Computer Science, CUNY Graduate Center
(3) School of Computer Science, the University of Oklahoma

Outline
• Introduction
• Background and Motivation
• Spatial, Temporal and Spatiotemporal Aggregations of Taxi Trips
• Implementation Details
• Experiments and Results
• Conclusion and Future Work

Introduction
• Spatial, temporal and spatiotemporal aggregations are commonly used OLAP operations: SOLAP, TOLAP, STOLAP
• Several existing OLAP systems are built on top of GIS and spatial databases and suffer from low performance when handling large-scale datasets on traditional hardware (disk-resident data + serial CPU)
• This research investigates the feasibility and efficiency of spatial, temporal and spatiotemporal aggregations on new hardware (large main memory + massively data-parallel GPUs) using a domain-specific case study (taxi trip records)

Background and Motivation
Taxi trip records
• ~300 million trips in about two years
• ~170 million trips (300 million passengers) in 2009
• About 1/5 of subway ridership and 1/3 of bus ridership in NYC
• 13,000 Medallion taxi cabs
• Only taxis with a Medallion license may be hailed (the rule could be changing outside Manhattan...)

Background and Motivation
Overall distributions of trip distance, time, speed and fare:
• The majority of taxi trips are within 3 miles and cost less than $10: affordable
• But the median speed is about 10 miles per hour: significant traffic
[Figure: four histograms of trip counts (y-axis: Count): Count-Distance Distribution (x-axis: Trip Distance (mile)), Count-Speed Distribution (x-axis: Speed (MPH)), Count-Time Distribution (x-axis: TripTime (Minute)), Count-Fare Distribution (x-axis: Fare ($))]

Background and Motivation
• How to manage taxi trip data?
– Geographical Information Systems (GIS), e.g., ESRI ArcGIS
– Spatial Databases (SDB), e.g., PostgreSQL/PostGIS
– Moving Object Databases (MOD), e.g., Secondo
• How good are they?
– Pretty good for small amounts of data
– But rather poor for large-scale data

Background and Motivation
• Example 1:
– Creating a geometry column from lat/long columns, which is necessary for subsequent indexing and query processing in PostgreSQL/PostGIS
– 170 million taxi pickup locations in 2009
– UPDATE t SET PUGeo = ST_SetSRID(ST_Point("PULong","PULat"),4326);
– 105.8 hours!
• Example 2:
– Finding the nearest tax blocks for 170 million taxi pickup locations (to aggregate based on tax block types)
– Using open source libspatialindex + GDAL (to avoid database overhead)
– 30.5 hours!
Can we get interactive responses?

Background and Motivation
• Multicore CPUs
• Cloud computing + MapReduce + Hadoop
• GPGPU computing: from Fermi to Kepler

Background and Motivation

Feature                  Intel Xeon E7-8870             Nvidia Tesla K10
Price                    $4,616                         $2,500
Processing cores         10                             3,072 (in 15 multiprocessors)
Hardware threads         10*2                           15*2048
Frequency                2400 MHz                       745 MHz
L1/L2/L3 cache           (32K+32K)/256K/30M per core    48K per SM
RAM                      variable                       8 GB
Memory bandwidth         25.6 GB/s                      320 GB/s
Number of transistors    2.6 billion                    7.0 billion
Power consumption        130 W                          225 W

Aggregations on Taxi Trip Records
[Figure: attributes of a taxi trip record, arranged in groups numbered 1 to 11]
Attributes shown: Medallion#, Shift#, Trip#, Trip_Pickup_DateTime, Trip_Dropoff_DateTime, Trip_Pickup_Location, Trip_Dropoff_Location, Start_Lon, Start_Lat, End_Lon, End_Lat, start_x, start_y, end_x, end_y (local projection), Start_Zip_Code, End_Zip_Code, time_between_service, distance_between_service, Passenger_Count, Trip_Time, Trip_Distance, Fare_Amt, Tolls_Amt, Tip_Amt, Surcharge, Total_Amt, Payment_Type, Rate_Code, vendor_name, date_loaded, store_and_forward

Aggregations on Taxi Trip Records
[Figure: aggregation hierarchies built on top of NYC taxi trip records]
• Temporal levels (from pickup/drop-off timestamps): 15/30-minutes, Hour, Peak/off-peak, Day, Day of the Week, Day of the Year, Week of the Year, Month, Year
• Spatial levels (from pickup/drop-off locations): Street Segment, Tax Block, Tax Lot, Census Block, Census Tract, Police Precinct, Community District, Borough, City
• Grid levels: Level 0 grid, Level k grid, Top level grid
• Auxiliary data (weather, events…)
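The temporal hierarchy above suggests that counts at a coarser level can be derived from counts at a finer one. As a minimal sketch in plain C++ (not taken from the slides), rolling up 15-minute counts to hourly counts for one day could look like this. The names hour_of_bin and rollup_to_hours and the fixed 96-bin day are illustrative assumptions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hour (0..23) that a 15-minute bin (0..95) of a single day belongs to.
inline std::size_t hour_of_bin(std::size_t bin) {
    assert(bin < 96);
    return bin / 4;  // four 15-minute bins per hour
}

// Roll up per-day counts from 15-minute bins to hourly bins.
// Assumption: one day is represented as exactly 96 consecutive bins.
inline std::array<std::uint64_t, 24>
rollup_to_hours(const std::array<std::uint64_t, 96>& quarter_hour_counts) {
    std::array<std::uint64_t, 24> hourly{};  // zero-initialized
    for (std::size_t bin = 0; bin < quarter_hour_counts.size(); ++bin) {
        hourly[hour_of_bin(bin)] += quarter_hour_counts[bin];
    }
    return hourly;
}
```

Counts are additive, so the same pattern applies to any pair of levels where each fine unit belongs to exactly one coarse unit.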

Implementation Details
Mapping a point to its nearest street segment
• Grouping points into quadrants
• Single-level grid-file based spatial filtering on GPUs
[Figure labels: points; vertices of street segments]
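The two geometric building blocks behind this slide can be sketched in sequential C++. This is only an illustration, not the GPU implementation: a uniform grid cell stands in for the point quadrants, and the grid origin, cell size and column count are hypothetical parameters that the slides do not specify.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

struct Point { double x, y; };

// Map a point to the row-major id of its cell in a uniform grid.
// Assumption: (x0, y0), cell_size and cols are illustrative parameters.
inline std::uint32_t cell_id(Point p, double x0, double y0,
                             double cell_size, std::uint32_t cols) {
    assert(cell_size > 0.0 && p.x >= x0 && p.y >= y0);
    auto col = static_cast<std::uint32_t>((p.x - x0) / cell_size);
    auto row = static_cast<std::uint32_t>((p.y - y0) / cell_size);
    assert(col < cols);
    return row * cols + col;
}

// Euclidean distance from point p to the line segment a-b.
inline double point_segment_distance(Point p, Point a, Point b) {
    double dx = b.x - a.x, dy = b.y - a.y;
    double len2 = dx * dx + dy * dy;
    double t = 0.0;  // parameter of the closest point along a-b
    if (len2 > 0.0) {
        t = ((p.x - a.x) * dx + (p.y - a.y) * dy) / len2;
        t = std::max(0.0, std::min(1.0, t));  // clamp onto the segment
    }
    double cx = a.x + t * dx, cy = a.y + t * dy;
    return std::hypot(p.x - cx, p.y - cy);
}
```

In a filter-and-refine scheme, the cell id restricts each point to the segments whose bounding boxes overlap nearby cells, and the distance function is then evaluated only on those candidates.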

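Implementation Details
Parallel Counting on GPUs using parallel primitives
• Transform to generate keys (spatial entity identifiers, temporal units or their combinations):

    struct make_key
    {
        __host__ __device__
        uint operator()(thrust::tuple<uint, uint> v)
        {
            // lower 27 bits of the first element: street segment identifier
            uint segid = thrust::get<0>(v) & 0x07FFFFFF;
            // 5-bit hour field taken from bits 12-16 of the second element
            uint hour = (thrust::get<1>(v) >> 12) & 0x0000001F;
            // pack segment identifier and hour into a single key
            return (segid << 5) | hour;
        }
    };

• Sort the keys: 3 1 2 1 3 → 1 1 2 3 3
• Reduce to (key, count) pairs: keys 1 2 3 with counts 2 1 2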
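The transform, sort and reduce steps on this slide can be mimicked on a CPU with the standard library. The following is a sequential sketch, not the Thrust code itself. It assumes, as in the functor above, that the hour occupies bits 12 to 16 of the encoded time, and count_by_key is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Pack a 27-bit segment id and a 5-bit hour into one 32-bit key,
// mirroring the bit layout of the make_key functor on the slide.
inline std::uint32_t make_key(std::uint32_t seg, std::uint32_t encoded_time) {
    std::uint32_t segid = seg & 0x07FFFFFFu;
    std::uint32_t hour = (encoded_time >> 12) & 0x0000001Fu;
    return (segid << 5) | hour;
}

// Sort the keys, then collapse each run of equal keys into (key, count).
// This is the sequential analogue of a sort followed by a reduce-by-key.
inline std::vector<std::pair<std::uint32_t, std::uint32_t>>
count_by_key(std::vector<std::uint32_t> keys) {
    std::sort(keys.begin(), keys.end());
    std::vector<std::pair<std::uint32_t, std::uint32_t>> out;
    for (std::uint32_t k : keys) {
        if (!out.empty() && out.back().first == k) {
            ++out.back().second;      // same run: bump the count
        } else {
            out.emplace_back(k, 1u);  // new run starts here
        }
    }
    std::size_t total = 0;
    for (const auto& kc : out) total += kc.second;
    assert(total == keys.size());     // every input key counted once
    return out;
}
```

Calling count_by_key on the slide's example keys 3 1 2 1 3 yields the pairs (1, 2), (2, 1), (3, 2), matching the key and count rows shown above.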

Experiment and Results
• Data
– Taxi trip records: 300 million in two years (2008-2010), ~170 million in 2009
– NYC DCP LION street network data: 147,011 street segments
• Hardware
– Dell T5400 with dual quad-core CPUs and 16 GB memory
– Nvidia Quadro 6000 with 448 cores and 6 GB memory

Experiment and Results
Table 1. Results on Spatial Associations on GPUs

# of Months       1        2        3        4        6        9        12
N1 (*10^6)        13.84    27.00    41.17    55.23    83.81    124.64   168.38
N2 (*10^6)        0.155    0.306    0.496    0.676    0.982    1.358    1.747
t1 (second)       0.955    1.876    2.908    3.915    5.986    9.001    12.233
t2 (second)       2.059    1.615    1.472    1.495    1.123    1.176    1.221
t3 (second)       0.200    0.343    0.519    0.677    0.941    1.270    1.601
T=t1+t2+t3        3.214    3.834    4.899    6.087    8.050    11.447   15.055

N1: # of point locations; N2: # of point quadrants
t1: time to generate point quadrants
t2: time to filter bounding boxes (point quadrants/street segments)
t3: time to compute distances and assign identifiers

Experiment and Results
Table 1. Performance comparison on spatial association

                    GPU-Time    CPU-Time    Speedup
t1 (s)              12.233      162.004     13X
t2 (s)              1.221       /
t3 (s)              1.601       35.338
T=t1+t2+t3 (s)      15.055      197.342     22X

t1: time to generate point quadrants
t2: time to filter bounding boxes (point quadrants/street segments)
t3: time to compute distances and assign identifiers
Dividing the listed CPU times by the GPU times gives about 13.2X for t1, 22.1X for t3 and 13.1X for T.

Experiment and Results
Table 2. Experiment Results for Different Aggregations on Multi-Core CPUs (in Seconds)

   Aggregation                             Serial    1T        2T        4T       8T       16T
1  Pickup Segment (spatial)                12.519    19.776    9.768     4.992    2.513    1.721
2  Pickup Hour (temporal)                  7.043     6.089     4.347     2.121    1.186    0.907
3  Pickup Segment+Hour (spatiotemporal)    17.128    24.238    12.522    6.707    3.803    3.781

Experiment and Results
Performance comparison on counting (times in seconds)

Aggregation       CPU-Serial    CPU-Best    GPU      CPU-Serial/GPU    CPU-Best/GPU
Spatial           12.519        1.721       0.188    66.6              9.2
Temporal          7.043         0.907       0.257    27.4              3.5
Spatiotemporal    17.128        3.781       0.274    62.5              13.8

Conclusion and Future Work
• We report our designs, implementations and experiments on spatial, temporal and spatiotemporal aggregations of hundreds of millions of taxi trip records in an OLAP setting
• By utilizing massively data-parallel GPU processing power, we were able to spatially associate nearly 170 million taxi pickup locations with their nearest street segments among 147,011 candidates in about 15 seconds, a 13X speedup over an optimized serial CPU implementation
• Spatial, temporal and spatiotemporal aggregations can be processed in a fraction of a second on GPUs
• The experiment results support the feasibility of building a high-performance OLAP system for processing large-scale taxi trip data for real-time, interactive data exploration on GPUs
