Visual Data Management System


  1. Visual Data Management System
     Vishakha Gupta-Cledat, Luis Remis, Christina Strong, Ragaad Altarawneh, Scott Hahn
     {vishakha.s.gupta, luis.remis, christina.r.strong, ragaad.altarawneh, scott.hahn}@intel.com
     Intel Labs
     [Title graphic text: "Find me cats"; "Images data and metadata"]

  6. What is VDMS?
     A novel Visual Data Management System
     • For storing, accessing and transforming visual data
     • Primarily geared towards visual analytics pipelines and data science queries
     • With a goal of efficiently achieving cloud scale while maintaining ease-of-use
     Also aims to
     • Exploit Intel’s heterogeneous memory and storage hierarchy
     • Be general purpose, e.g. a common core for medical imaging, sports, retail

  9. Visual Data: Scale and Applications
     Images, Videos, Feature Vectors / Descriptors
     • Billions of sources
     • Large in size (an individual object could range in size from KB to GB)
     • Increasingly being used for visual understanding in a range of machine learning applications

  12. The Unsustainable Current Solutions
      Resolve visual computing challenges and frameworks first
      • Improving accuracy of algorithms on more and more complex data
      • Storage has not become a bottleneck yet!
      Application-specific solutions, if data does become a problem
      • Organize media files
      • Manually gather and normalize relevant metadata
      • Build custom scripts to tie together many stages of complex processing
      Visual data management for scale and reuse is still an open problem.

  15. VDMS Storage Architecture
      Exploding amount of visual data
      • For any request, access only the required subset of data by exploiting metadata
      Even individual objects could be large
      • Speed up access to the desired data
      • Preprocess while reading where possible, e.g. crop or detect edges before transferring
      High performance as well as ease-of-use
      • Suitable design choices for metadata and data, at scale
      • Intel hardware optimizations, e.g. 3D XPoint, media hardware, disk offload
      • Simple API and client libraries
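The two access ideas on this slide (select by metadata first, then preprocess while reading) can be illustrated with a small in-memory sketch. Everything here is hypothetical and chosen for illustration: `StoredImage`, `fetch`, and `crop` are not VDMS names, and images are plain nested lists rather than real image data.

```python
from dataclasses import dataclass


@dataclass
class StoredImage:
    metadata: dict   # small and cheap to scan
    pixels: list     # large payload: rows of pixel values


def crop(pixels, x, y, width, height):
    """Return the width x height region whose top-left corner is (x, y)."""
    return [row[x:x + width] for row in pixels[y:y + height]]


def fetch(store, constraints, region=None):
    """Select objects by metadata first, then crop before returning pixel data."""
    results = []
    for item in store:
        # Step 1: metadata filter; pixels of non-matching objects are never read.
        if all(item.metadata.get(k) == v for k, v in constraints.items()):
            # Step 2: preprocess while reading, so less data is transferred.
            data = crop(item.pixels, *region) if region else item.pixels
            results.append(data)
    return results


# Toy store: an 8x8 "cat" image with distinct pixel values and an all-zero "dog" image.
store = [
    StoredImage({"label": "cat"}, [[r * 10 + c for c in range(8)] for r in range(8)]),
    StoredImage({"label": "dog"}, [[0] * 8 for _ in range(8)]),
]

# Ask for a 3x2 region starting at column 2, row 1 of every "cat" image.
cats = fetch(store, {"label": "cat"}, region=(2, 1, 3, 2))
print(cats)
```

Only six of the 64 pixels of the matching image are returned here, and the non-matching image is skipped on metadata alone.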

  19. VDMS Implementation
      Efficient metadata access via Persistent Memory Graph Database (PMGD) for visual data
      • Optimized for metadata storage and access patterns
      • Easy to evolve schema with new vision research
      Efficient data access via Visual Compute Library
      • Enable alternate image/video analysis friendly storage formats as compared to viewer friendly ones
      • Process data while accessing it
      Ease-of-use via Request Server
      • Implement a unified and simple client API
      • Route query (or parts) to the right components for a coherent user response
      [Diagram, top to bottom: User; VDMS (Request Server, PMGD (Metadata Database), Visual Compute Library); Visual Data Storage]
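A minimal sketch of how these three components might cooperate, assuming a toy property graph for metadata and a toy visual layer. The class names, the `has_scan` edge tag, the query fields, and the `threshold` operation are assumptions made for this example; they do not describe the actual PMGD, Visual Compute Library, or Request Server interfaces.

```python
class MetadataGraph:
    """Stand-in for a graph metadata store: nodes with free-form properties plus edges."""

    def __init__(self):
        self.nodes = {}   # node id -> {"tag": str, "props": dict}
        self.edges = []   # (source id, edge tag, destination id)

    def add_node(self, node_id, tag, **props):
        # No fixed schema: a new property is just another key in props.
        self.nodes[node_id] = {"tag": tag, "props": props}

    def add_edge(self, src, tag, dst):
        self.edges.append((src, tag, dst))

    def neighbors(self, src, tag):
        return [d for s, t, d in self.edges if s == src and t == tag]

    def find(self, tag, **constraints):
        return [nid for nid, node in self.nodes.items()
                if node["tag"] == tag
                and all(node["props"].get(k) == v for k, v in constraints.items())]


class VisualLibrary:
    """Stand-in for the visual data layer: loads a blob and applies operations on read."""

    def __init__(self, blobs):
        self.blobs = blobs

    def read(self, blob_id, operations):
        data = self.blobs[blob_id]
        for op in operations:
            if op["type"] == "threshold":
                # Zero out every value below the threshold.
                data = [[v if v >= op["value"] else 0 for v in row] for row in data]
        return data


class RequestServer:
    """Routes the metadata part and the data part of a query, then merges the results."""

    def __init__(self, graph, visual):
        self.graph, self.visual = graph, visual

    def handle(self, query):
        response = []
        for patient in self.graph.find("patient", **query["constraints"]):
            for scan in self.graph.neighbors(patient, "has_scan"):
                data = self.visual.read(scan, query.get("operations", []))
                response.append({"patient": patient, "scan": scan, "data": data})
        return response


graph = MetadataGraph()
graph.add_node("p1", "patient", drug="A")
graph.add_node("p2", "patient", drug="B")
graph.add_node("s1", "scan")
graph.add_edge("p1", "has_scan", "s1")

server = RequestServer(graph, VisualLibrary({"s1": [[1, 5], [7, 2]]}))
response = server.handle({
    "constraints": {"drug": "A"},
    "operations": [{"type": "threshold", "value": 5}],
})
print(response)
```

The client sends one query and receives one merged response; which component answers which part stays hidden behind the request server.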

  22. Where We Are Now
      User API v1.0 defined with internal feedback
      Functional one-node server and client libraries
      Three interesting proofs of concept at various stages of development with input from product groups
      • Real data and concrete use case: medical imaging application
      • Large-scale, real-time, intensive use case: FreeD sports storage architecture
      • Integration with a larger analytic framework: retail shopper insights application

  25. Medical Imaging Proof of Concept on VDMS
      The Cancer Imaging Archive: http://www.cancerimagingarchive.net/
      • 60TB of medical images (volumetric data)
      • Metadata for ~1000 patients (very sparse)
      For our PoC:
      • Metadata for 457 patients, including drug and radiation treatments
      • Scans for 384 patients (60K images)
      • Replicated metadata x10 and x100, keeping the original distribution
      Segmentation pipeline for demo
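The slides do not say how the metadata was replicated. One simple way to scale a record set by x10 or x100 while keeping the distribution of attribute values is to clone every record the same number of times and give each clone a fresh identifier. The sketch below assumes that approach; the field names and the toy records are invented for illustration.

```python
from collections import Counter


def replicate(records, factor):
    """Return `factor` copies of every record, each copy with a unique patient_id."""
    replicated = []
    for copy_index in range(factor):
        for record in records:
            clone = dict(record)
            clone["patient_id"] = f"{record['patient_id']}-copy{copy_index}"
            replicated.append(clone)
    return replicated


def distribution(records, field):
    """Fraction of records holding each value of `field`."""
    counts = Counter(record[field] for record in records)
    return {value: count / len(records) for value, count in counts.items()}


# Toy records invented for this sketch; they are not taken from the archive.
patients = [
    {"patient_id": "P1", "treatment": "drug"},
    {"patient_id": "P2", "treatment": "drug"},
    {"patient_id": "P3", "treatment": "drug"},
    {"patient_id": "P4", "treatment": "radiation"},
]

x10 = replicate(patients, 10)
x100 = replicate(patients, 100)
print(len(x10), len(x100))
print(distribution(patients, "treatment") == distribution(x100, "treatment"))
```

Because every record is cloned equally often, each attribute value keeps the same share of the total at any replication factor.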

  32. Segmentation Pipeline
      [Diagram labels: PyClient; VDMS Server; VDMS Client Python Module; Segmentation Algorithm for Brain Tumors]
      Steps, in the order they appear on the slide:
      • Constructed JSON Query
      • Query - Pull Data
      • Return Data
      • Constructed JSON Query + Image Blob
      • Query - Push Data
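The pull, segment, push loop on this slide can be sketched end to end against an in-process stand-in for the server. The command names (`FindImage`, `AddImage`), the JSON field names, the `FakeServer` class, and the byte-threshold "segmentation" are placeholders assumed for this example; the slides only state that the client sends a constructed JSON query, optionally with an image blob.

```python
import json


class FakeServer:
    """In-process stand-in for the server; it understands two placeholder commands."""

    def __init__(self):
        self.images = []   # list of (properties dict, blob bytes)

    def query(self, json_query, blobs=()):
        blobs = list(blobs)
        responses, returned = [], []
        for command in json.loads(json_query):
            (name, body), = command.items()   # each command is a one-key object
            if name == "AddImage":
                # Push: store the next blob together with its properties.
                self.images.append((body["properties"], blobs.pop(0)))
                responses.append({name: {"status": 0}})
            elif name == "FindImage":
                # Pull: return every blob whose properties match the constraints.
                wanted = body["constraints"]
                hits = [blob for props, blob in self.images
                        if all(props.get(k) == v for k, v in wanted.items())]
                returned.extend(hits)
                responses.append({name: {"status": 0, "returned": len(hits)}})
        return responses, returned


def segment(blob, threshold=128):
    """Toy segmentation: bytes at or above the threshold become 255, others 0."""
    return bytes(255 if b >= threshold else 0 for b in blob)


server = FakeServer()
# Seed the server with one fake four-byte "scan".
seed = json.dumps([{"AddImage": {"properties": {"patient": "p1", "kind": "scan"}}}])
server.query(seed, [bytes([10, 200, 130, 90])])

# 1. Construct a JSON query and pull the data.
find = json.dumps([{"FindImage": {"constraints": {"patient": "p1", "kind": "scan"}}}])
_, scans = server.query(find)

# 2. Run the segmentation algorithm on the client.
masks = [segment(scan) for scan in scans]

# 3. Construct a JSON query plus image blob and push the result back.
add = json.dumps([{"AddImage": {"properties": {"patient": "p1", "kind": "mask"}}}])
responses, _ = server.query(add, masks)
print(responses, len(server.images))
```

Keeping the query as JSON and the pixels as a separate blob means the same call shape serves both directions: a pull sends JSON and receives blobs, a push sends JSON together with blobs.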
