Solros: A Data-Centric Operating System Architecture for Heterogeneous Computing

Changwoo Min, Woonhak Kang, Mohan Kumar, Sanidhya Kashyap, Steffen Maass, Heeseung Jo, Taesoo Kim
Virginia Tech, eBay, Georgia Tech, Chonbuk National University


  1. Solros: A Data-Centric Operating System Architecture for Heterogeneous Computing
     - Changwoo Min, Woonhak Kang, Mohan Kumar, Sanidhya Kashyap, Steffen Maass, Heeseung Jo, Taesoo Kim
     - Virginia Tech, eBay, Georgia Tech, Chonbuk National University
     - April 26, 2018

  4. Cambrian Explosion of Processor Architecture
     - Specialization of general-purpose processors
     - Generalization of co-processors
     - Specialization of co-processors

  7. Blazingly fast IO Devices
     - Blazingly fast storage/memory
     - Blazingly fast network
     - How to exploit the full potential of such hardware devices without pain?
       - System-wide performance
       - Ease of programming

  8. Outline
     1. Heterogeneous Computing Architectures
     2. Solros: Split-Kernel Approach
        - Solros Architecture
        - Operating System Services
     3. Evaluation

  13. Host-Centric Approach
      - Host OS controls co-processors and IO devices
      - Examples: OpenCL, CUDA
      - [Figure: host processor (application, OS, memory, cores), co-processor (application, memory, cores), and I/O device (SSD / NIC), with control and data paths marked as steps ① to ③]
      - Problem
        - Redundant data communication
        - Complex to program and hard to optimize

  17. Coprocessor-Centric Architecture
      - Co-processors control IO devices
      - Examples: Xeon Phi (Linux), GPUfs [ASPLOS13], GPUNet [OSDI14]
      - [Figure: host processor (application, OS, memory, cores), co-processor (application, OS, memory, cores), and I/O device (SSD / NIC), with control and data paths marked as steps ① and ②]
      - Problem
        - Significant effort required for porting IO stack to co-processor
        - Not completely exploiting powerful host processors

  20. Solros Goal
      - Ease of programming
      - Best use of processor architecture
      - System-wide optimization
      - Challenge
        - Co-processor needs IO abstraction
        - IO stack is branch-divergent and difficult to parallelize
        - System-wide optimization needs system-wide information

  21. Solros Architecture: Split-Kernel Architecture
      - Data-plane OS
        - Runs on a co-processor
        - Provides IO abstraction
        - Delegates actual IO operations to a control-plane OS
      - Control-plane OS
        - Runs on a host processor
        - Runs actual IO stack
        - Performs system-wide coordination

  26. Solros Architecture
      - Control-plane OS: actual OS service + system-wide coordination
      - Data-plane OS: thin communication layer to host processor
      - [Figure: co-processor (application, OS stub, cores, memory), host processor (application, OS proxy, policy, memory, cores), and I/O device (SSD / NIC), with control and data paths marked as steps ① to ③]
      - Co-processor has OS abstraction with minimal effort
      - Best use of each of the fat and lean processors
      - Efficient global coordination among devices (policy)
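The stub/proxy split described on this slide can be pictured with a minimal sketch. Everything named here (`IORequest`, `ControlPlaneProxy`, `DataPlaneStub`, `demo`) is hypothetical and only illustrates the delegation idea; the direct method call stands in for whatever transport the real system uses between co-processor and host, and none of this is taken from the Solros code.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass
class IORequest:
    """Hypothetical message a data-plane stub sends to the control plane."""
    op: str
    path: str
    offset: int = 0
    length: int = 0


class ControlPlaneProxy:
    """Stands in for the host-side OS proxy, which runs the actual I/O stack."""

    def handle(self, req: IORequest) -> bytes:
        if req.op == "read":
            with open(req.path, "rb") as f:
                f.seek(req.offset)
                return f.read(req.length)
        raise ValueError(f"unsupported op: {req.op}")


class DataPlaneStub:
    """Stands in for the co-processor-side OS stub: it offers an I/O
    abstraction but has no I/O stack of its own."""

    def __init__(self, proxy: ControlPlaneProxy):
        # Assumption: a direct call models the host/co-processor channel.
        self.proxy = proxy

    def read(self, path: str, offset: int, length: int) -> bytes:
        # Package the call and delegate it to the control plane.
        return self.proxy.handle(IORequest("read", path, offset, length))


def demo() -> bytes:
    # Create a scratch file, then read part of it through the stub.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"hello solros")
        name = f.name
    try:
        stub = DataPlaneStub(ControlPlaneProxy())
        return stub.read(name, offset=6, length=6)
    finally:
        os.unlink(name)
```

The application-facing call (`stub.read`) looks like an ordinary file read, while the only component that touches the file system is the proxy, which mirrors the "thin communication layer" role the slide gives to the data-plane OS.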

  28. Operating System Services
      1. Transport service
      2. Filesystem service
      3. Network service

  31. Transport Service
      - High-performance data transfer among devices is challenging:
        - Uniform data transfer among devices
        - High contention in massively-parallel co-processor
        - Asymmetric performance between host processor and co-processor
      - Our approach
        - Uniform data transfer ⇒ system-mapped PCIe window
        - High contention ⇒ combining, replication, interleaving, etc.
        - Asymmetric performance ⇒ flexibly configurable (host DMA engine vs. co-processor DMA engine)
      - See details in the paper
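The slide names combining as one answer to contention but defers the details to the paper. As a rough, generic illustration of the combining idea (not the Solros transport itself), the sketch below lets whichever thread wins a lock apply every pending request in one batch, so the shared structure is touched by one thread at a time. `CombiningQueue` and `run` are hypothetical names, and the sketch relies on CPython's atomic `deque.append` and `deque.popleft`.

```python
import threading
import time
from collections import deque


class CombiningQueue:
    """Toy combining queue: the thread holding the combiner lock applies
    all pending requests in one batch on behalf of the other threads."""

    def __init__(self):
        self._pending = deque()            # append/popleft are atomic in CPython
        self._combiner = threading.Lock()  # held only by the current combiner
        self.shared = []                   # the contended structure
        self.batches = 0                   # number of non-empty batches applied

    def enqueue(self, item):
        self._pending.append(item)         # publish the request
        # Loop until the pending list is seen empty after our append, which
        # means some combiner (possibly this thread) has picked up our item.
        while self._pending:
            if not self._combiner.acquire(blocking=False):
                time.sleep(0)              # another thread is combining; yield
                continue
            try:
                batch = []
                while self._pending:       # only the combiner ever pops
                    batch.append(self._pending.popleft())
                if batch:
                    self.shared.extend(batch)
                    self.batches += 1
            finally:
                self._combiner.release()


def run(num_threads=8, per_thread=100):
    q = CombiningQueue()

    def worker(tid):
        for i in range(per_thread):
            q.enqueue((tid, i))

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return q
```

After `run()` returns, `q.shared` holds every submitted item exactly once, and `q.batches` is at most the number of items; how much batching actually happens depends on thread scheduling.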

  32. Filesystem Service
      - Peer-to-peer operation
      - Buffered operation
      - [Figure: co-processor (application, file system stub), host processor (file system proxy, file system), PCIe, DMA engine, and SSD, with control and data paths; step ① marked]
      - Zero-copy of data between co-processor memory and SSD
      - Minimal data transfer
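To make "minimal data transfer" concrete, here is a small accounting sketch that compares the two operation modes named on the slide. The peer-to-peer path (SSD directly to co-processor memory) follows the slide's zero-copy statement; the buffered path staging data through host memory is an assumption, since the slide does not spell it out. `Hop`, `plan_read`, and `bytes_moved` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Hop:
    """One data movement between two locations."""
    src: str
    dst: str
    nbytes: int


def plan_read(nbytes: int, mode: str) -> List[Hop]:
    """Return the data movements needed to read nbytes into co-processor memory."""
    if mode == "peer_to_peer":
        # Data goes straight from the SSD to co-processor memory.
        return [Hop("SSD", "co-processor memory", nbytes)]
    if mode == "buffered":
        # Assumption: data is staged in host memory, then forwarded.
        return [
            Hop("SSD", "host memory", nbytes),
            Hop("host memory", "co-processor memory", nbytes),
        ]
    raise ValueError(f"unknown mode: {mode}")


def bytes_moved(plan: List[Hop]) -> int:
    """Total bytes transferred across all hops of a plan."""
    return sum(h.nbytes for h in plan)
```

Under these assumptions a 4096-byte read moves 4096 bytes in peer-to-peer mode and 8192 bytes in buffered mode; the control messages between stub and proxy are not counted.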
