
High Performance Data Intensive Computing, Dongfang Zhao



  1. High Performance Data Intensive Computing
     Dongfang Zhao, Assistant Professor
     Department of Computer Science & Engineering
     University of Nevada, Reno

  2. Who am I
     • 2017–present, Assistant Professor, University of Nevada, Reno
     • 2016, Postdoctoral Fellow, University of Washington, Seattle
     • 2015, PhD in Computer Science, Illinois Institute of Technology, Chicago
     • 2015, Summer Intern, IBM Research – Almaden, San Jose, CA
     • 2009-2011, Software Engineer, Epic Systems, Madison, WI
     • 2008, MS in Computer Science, Emory University, Atlanta, GA
     • 2005, MS in Statistics, Katholieke Universiteit Leuven, Belgium

  3. Outline
     • Past Work
       – 2005-2008: Machine Intelligence, Computer Vision
       – 2012-2015: High Performance Computing, Distributed Systems
       – 2015-2016: Big Data Systems, Database Systems
     • Current Status
       – Personnel
       – Facilities
     • Future Research Directions
       – Distributed Memory Management for Big Data Systems
       – Locality-aware Resource Management in Virtualized Computing
       – High Performance Database Systems

  4. Past Work: 2005-2008
     • Incremental Dimensionality Reduction
     • E.g., published in IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)

  5. Past Work: 2012-2015
     • High Performance Computing
     • E.g., published in IEEE Transactions on Parallel and Distributed Systems (TPDS)

  6. Past Work: 2015-2016
     • Big Data Systems
     • E.g., published at Very Large Data Bases (VLDB)

  7. Current Status: Personnel
     • Currently at Nevada:
       – 1 PhD student starting Fall 2017
       – 1 master's student starting Fall 2017
       – Collaborating closely with Prof. Dr. Feng Yan, who works on data mining and performance modelling. He has published at KDD, SIGMETRICS, Supercomputing, CLOUD, NOMS, etc.
     • Plan:
       – By Fall 2018, the lab will recruit two more PhD students and two more master's students

  8. Current Status: Facilities
     • Nevada’s HPC cluster
       – 56 compute nodes: PowerEdge C6320
         • 1792 cores
         • 128 (or 192?) GB RAM per node
       – 11 GPU nodes: PowerEdge C4130, each with 4 x P100 with NVLink
         • 352 cores
         • 44 P100 GPUs
     • Our lab’s 10-node GPU cluster; each node has
       – 12 CPU cores
       – 4 GeForce GTX 1080 cards
       – 64 GB RAM

  9. Future Directions
     • Distributed Memory Management for Big Data Systems
       – Motivation: Modern big data systems have no coordinated way to manage memory
         • Users are asked to specify the memory allocation themselves
         • The local OS on each node is left to manage memory on its own
       – Objective
         • A middleware that manages memory automatically for big data systems
         • The middleware oversees the overall (cluster-wide) memory status rather than optimizing local usage
         • Users should be able to plug in ad hoc strategies for the underlying memory management
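The objective on this slide (a middleware with a cluster-wide view and user-pluggable strategies) can be illustrated with a minimal sketch. Everything below is hypothetical and is not taken from the slides: the names `JobDemand`, `MemoryCoordinator`, `proportional_share`, and `first_come_first_served` are invented for illustration, and memory is simplified to one pooled budget in GB.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class JobDemand:
    """Hypothetical record: how much memory a job asks for."""
    name: str
    requested_gb: float


# A strategy sees the cluster-wide budget and all demands at once,
# and returns a grant (in GB) per job name.
Strategy = Callable[[float, List[JobDemand]], Dict[str, float]]


def proportional_share(total_gb: float, demands: List[JobDemand]) -> Dict[str, float]:
    """Grant every request in full, or scale all requests down uniformly
    when the cluster is oversubscribed."""
    requested = sum(d.requested_gb for d in demands)
    scale = 1.0 if requested <= total_gb else total_gb / requested
    return {d.name: d.requested_gb * scale for d in demands}


def first_come_first_served(total_gb: float, demands: List[JobDemand]) -> Dict[str, float]:
    """Alternative plug-in: serve requests in order until memory runs out."""
    grants: Dict[str, float] = {}
    remaining = total_gb
    for d in demands:
        grant = min(d.requested_gb, remaining)
        grants[d.name] = grant
        remaining -= grant
    return grants


class MemoryCoordinator:
    """Toy middleware: tracks per-node capacity and delegates the
    allocation decision to whichever strategy the user plugged in."""

    def __init__(self, node_capacity_gb: Dict[str, float],
                 strategy: Strategy = proportional_share) -> None:
        self.node_capacity_gb = dict(node_capacity_gb)
        self.strategy = strategy

    def total_capacity_gb(self) -> float:
        return float(sum(self.node_capacity_gb.values()))

    def allocate(self, demands: List[JobDemand]) -> Dict[str, float]:
        return self.strategy(self.total_capacity_gb(), demands)
```

Swapping the strategy is a one-line change (`coordinator.strategy = first_come_first_served`), which is the kind of plug-in point the slide asks for. A real system would also have to track where memory physically lives and enforce the grants, which this sketch leaves out.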

  10. Future Directions
     • Locality-aware Resource Management in Virtualized Computing
       – Extension of my internship work in Summer 2015
       – Motivation: Load balancing is sometimes overemphasized
       – Objective: Improve data locality for virtualized computation
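The trade-off between load balancing and data locality can be sketched with a toy placement example. This is an illustrative assumption, not the method from the internship work: the function names, the `replicas` map (task to the set of nodes holding its input), and the `max_load` cap are all hypothetical.

```python
from typing import Dict, List, Set


def place_least_loaded(tasks: List[str], nodes: List[str]) -> Dict[str, str]:
    """Pure load balancing: each task goes to the currently least-loaded node
    (ties broken by node order), ignoring where its data lives."""
    load = {n: 0 for n in nodes}
    placement: Dict[str, str] = {}
    for task in tasks:
        target = min(nodes, key=lambda n: load[n])
        placement[task] = target
        load[target] += 1
    return placement


def place_locality_aware(tasks: List[str], nodes: List[str],
                         replicas: Dict[str, Set[str]],
                         max_load: int) -> Dict[str, str]:
    """Prefer a node that already holds the task's input, as long as that
    node is below max_load; otherwise fall back to the least-loaded node."""
    load = {n: 0 for n in nodes}
    placement: Dict[str, str] = {}
    for task in tasks:
        holders = replicas.get(task, set())
        local = [n for n in nodes if n in holders and load[n] < max_load]
        candidates = local or nodes
        target = min(candidates, key=lambda n: load[n])
        placement[task] = target
        load[target] += 1
    return placement


def local_fraction(placement: Dict[str, str],
                   replicas: Dict[str, Set[str]]) -> float:
    """Fraction of tasks that run on a node holding their input data."""
    hits = sum(placement[t] in replicas.get(t, set()) for t in placement)
    return hits / len(placement)
```

In this sketch, `max_load` is the knob that says how much imbalance is acceptable in exchange for locality: a larger cap lets more tasks run next to their data, a smaller cap pushes the behaviour back toward plain load balancing.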

  11. Future Directions
     • High Performance Distributed Databases
       – Motivation: For some reason, HPC’s dominant storage solution is the file system
       – Objective: Build a high-performance distributed database system atop existing parallel/distributed file systems, with efficient support for:
         • Queries expressed in SQL
         • Data load, transform, extract, etc.
       – Challenges
         • Performance bottleneck: from network to what?
         • How to leverage GPUs, InfiniBand, MPI, etc. for database workloads?
         • …

  12. Thanks!
     Dongfang Zhao
     dzhao@unr.edu
