IN-MEMORY COMPUTING AT SCALE? LOOK BEYOND PHYSICAL DRAM!


  1. IN-MEMORY COMPUTING AT SCALE? LOOK BEYOND PHYSICAL DRAM! Iacovos G. Kolokasis, Anastasios Papagiannis, Polyvios Pratikakis, and Angelos Bilas. October 25, 2019. Institute of Computer Science (ICS), Foundation for Research and Technology Hellas (FORTH) & Computer Science Department, University of Crete

  2. ANNUAL SIZE OF THE GLOBAL DATASPHERE / DRAM SCALING TREND [Two plots. Left: annual size of the global datasphere, in zettabytes, by year (2010 to 2025). Right: DRAM scaling trend, in megabits/chip on a log scale, by year (1985 to 2015), annotated "2X/1.5 YEARS" and "2X/3 YEARS".] Data is growing faster while DRAM scaling is getting difficult

  3. ANNUAL SIZE OF THE GLOBAL DATASPHERE / NAND FLASH SCALING TREND [Two plots. Left: annual size of the global datasphere, in zettabytes, by year. Right: NAND Flash scaling trend, density in TB, by year.] NAND Flash capacity continues to scale

  4. DATA-INTENSIVE APPLICATIONS • DNA/PROTEIN SYNTHESIS • VIRTUAL REALITY • IMAGE ANALYSIS • IN-MEMORY FRAMEWORKS More demand for memory

  5. APACHE SPARK IN-MEMORY COMPUTING [Diagram: a chain RDD -> Operation 1 -> RDD -> ... -> Operation n -> RDD, with stages labeled DISK, RAM, RAM, DISK.]

  6. INTRODUCTION TO SPARK IN-MEMORY COMPUTING [Diagram comparing the MEMORY_AND_DISK and MEMORY_ONLY storage levels. Labels: SERIALIZE, RDD partition, DISK, MEMORY.]
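
The difference between the two storage levels named on this slide can be illustrated with a toy model. The sketch below is not Spark code: `ToyBlockStore`, its byte budget, and the use of `pickle` as the serializer are all assumptions made for illustration. It models only the documented behavior that a partition which does not fit in memory is left uncached under MEMORY_ONLY (and recomputed on access), while under MEMORY_AND_DISK it is written to disk and read back.

```python
import os
import pickle
import tempfile


class ToyBlockStore:
    """Toy partition cache with a fixed memory budget (hypothetical, not Spark's code)."""

    def __init__(self, level, memory_budget_bytes, spill_dir):
        assert level in ("MEMORY_ONLY", "MEMORY_AND_DISK")
        self.level = level
        self.budget = memory_budget_bytes
        self.used = 0
        self.in_memory = {}  # partition id -> live object
        self.on_disk = {}    # partition id -> path of the serialized partition
        self.spill_dir = spill_dir

    def put(self, pid, partition):
        size = len(pickle.dumps(partition))  # rough size estimate (assumption)
        if self.used + size <= self.budget:
            self.in_memory[pid] = partition
            self.used += size
            return "memory"
        if self.level == "MEMORY_AND_DISK":
            path = os.path.join(self.spill_dir, f"part-{pid}.bin")
            with open(path, "wb") as f:
                pickle.dump(partition, f)  # serialize the partition to disk
            self.on_disk[pid] = path
            return "disk"
        return "dropped"  # MEMORY_ONLY: not cached, recomputed when needed

    def get(self, pid, recompute):
        if pid in self.in_memory:
            return self.in_memory[pid]
        if pid in self.on_disk:
            with open(self.on_disk[pid], "rb") as f:
                return pickle.load(f)  # deserialize from disk
        return recompute(pid)


def make_partition(pid):
    return list(range(pid * 1000, (pid + 1) * 1000))


def demo(level):
    # Budget sized so that exactly the first partition fits in memory.
    budget = len(pickle.dumps(make_partition(0)))
    with tempfile.TemporaryDirectory() as d:
        store = ToyBlockStore(level, budget, d)
        placements = [store.put(pid, make_partition(pid)) for pid in range(3)]
        ok = all(store.get(pid, make_partition) == make_partition(pid)
                 for pid in range(3))
        return placements, ok
```

Calling `demo("MEMORY_ONLY")` and `demo("MEMORY_AND_DISK")` returns the placement of each of the three partitions plus a flag confirming that every partition can still be read back, either from memory, from disk, or by recomputation.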

  7. LET’S EXPLOIT THE CAPACITY OF STORAGE DEVICES JVM-based Analytics Frameworks • Serialization / Deserialization • Memory-Mapped file I/O We explore both approaches
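
To make the two approaches concrete, here is a minimal sketch that reads one record from a file in both ways. It is written in Python rather than on the JVM, and the fixed 8-byte record layout, the file names, and the helper names are assumptions chosen for illustration. The first path deserializes the whole dataset onto the heap before indexing it; the second maps the file and decodes only the requested record in place.

```python
import mmap
import os
import pickle
import struct
import tempfile

RECORD = struct.Struct("<q")  # one little-endian 64-bit integer per record (assumed layout)


def read_via_deserialization(path, index):
    """Deserialize the entire list into heap objects, then index it."""
    with open(path, "rb") as f:
        values = pickle.load(f)
    return values[index]


def read_via_mmap(path, index):
    """Map the file and decode only the requested record, without copying the rest."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return RECORD.unpack_from(m, index * RECORD.size)[0]


def demo(n=10_000, index=1234):
    values = [i * i for i in range(n)]
    with tempfile.TemporaryDirectory() as d:
        pickled = os.path.join(d, "values.pkl")
        raw = os.path.join(d, "values.bin")
        with open(pickled, "wb") as f:
            pickle.dump(values, f)
        with open(raw, "wb") as f:
            for v in values:
                f.write(RECORD.pack(v))
        return read_via_deserialization(pickled, index), read_via_mmap(raw, index)
```

Both paths return the same value; the difference is that the memory-mapped read touches only the pages backing the requested record, while the deserializing read materializes every element first.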

  8. SERIALIZATION / DESERIALIZATION (LIMITATIONS) • Out-of-memory errors due to small heap sizes • Large computing results are generated while processing a record • Serialization / deserialization affects CPU performance • GC overhead to reclaim long-lived accumulated objects • Iterative applications
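
The point about iterative applications can be sketched as follows: when a cached dataset is kept only in serialized form, every pass over it pays the deserialization cost again. `SerializedCache` and `iterate` are hypothetical names, and `pickle` stands in for whatever serializer the framework uses; this is an illustration of the cost pattern, not a measurement.

```python
import pickle


class SerializedCache:
    """Holds a dataset only as serialized bytes and counts how often it is decoded."""

    def __init__(self, records):
        self.blob = pickle.dumps(records)
        self.deserializations = 0

    def load(self):
        self.deserializations += 1
        return pickle.loads(self.blob)  # rebuilds every object on each call


def iterate(cache, iterations):
    """Toy iterative job: each iteration re-reads the full cached dataset."""
    total = 0
    for _ in range(iterations):
        total += sum(cache.load())
    return total
```

With `cache = SerializedCache(list(range(100)))`, running `iterate(cache, 5)` decodes the dataset five times, once per iteration, and each decode also creates fresh objects for the garbage collector to reclaim.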

  9. ON-GOING WORK [Diagram: a JVM containing a DRAM Heap, a Non-DRAM Device Heap, and an Other Heap; fmap; DRAM; Storage Device.]

  10. ON-GOING WORK • Data placement policy inside the JVM to manipulate objects • Short-lived data objects on the DRAM Heap • Long-lived data objects on the Storage Device Heap • Add an extra storage level in Apache Spark to support caching RDDs on the Storage Heap • Thorough evaluation on SSD, NVMe, and Optane devices
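
The placement idea in the bullets above can be sketched as a toy model. The slides do not describe how the policy decides object lifetime or how the heaps are implemented, so everything here is an assumption: `ToyHeap`, `TwoHeapPlacer`, and the explicit `long_lived` hint are hypothetical, and the heaps are plain dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class ToyHeap:
    """Stand-in for a heap region; a real heap would manage raw memory."""
    name: str
    objects: dict = field(default_factory=dict)


@dataclass
class TwoHeapPlacer:
    """Hypothetical policy: short-lived objects go to DRAM, long-lived ones to storage."""
    dram: ToyHeap = field(default_factory=lambda: ToyHeap("DRAM Heap"))
    storage: ToyHeap = field(default_factory=lambda: ToyHeap("Storage Device Heap"))

    def allocate(self, key, value, long_lived):
        # The lifetime hint is supplied by the caller here (assumption); for
        # example, a cached RDD partition would be marked long-lived.
        heap = self.storage if long_lived else self.dram
        heap.objects[key] = value
        return heap.name
```

For example, a temporary buffer allocated with `long_lived=False` lands on the DRAM Heap, while a cached partition allocated with `long_lived=True` lands on the Storage Device Heap.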

  11. CONTACT INFORMATION Iacovos G. Kolokasis, MSc Student, Computer Science Department, University of Crete, kolokasis@ics.forth.gr. Institute of Computer Science (ICS), Foundation for Research and Technology Hellas (FORTH), www.ics.forth.gr
