THE GREAT SYNERGY OF BIG DATA TECHNOLOGIES
Louis Capps, NVIDIA Solutions Architect, lcapps@nvidia.com
2
AGENDA
Background
Big Data vs Fast Data
HPC and Hyperscale
Evolving data technologies
Future research
5
ACCELERATED COMPUTING REVOLUTION
Tesla Accelerated Computing Platform
GPU Accelerators CUDA Servers & Interconnects Developer Tools
Accelerated Data Center Accelerated Algorithms
AMBER · GROMACS · NAMD
HPC
MRI · Tomography · Ultrasound
Medical Imaging
Signal · Image · Video
Defense
RTM · FWI · Elastic
Oil & Gas
person dog chair
Image & Voice Recognition
Deep Learning
Optimized Applications in the Data Center
Founded in 1993
Jen-Hsun Huang is co-founder/CEO
Joined NASDAQ as NVDA in 1999
FY14: $4.13 billion in revenue
>9,000 employees worldwide Headquarters: Santa Clara, CA
Created a revolution with the first GPU in 1999; has shipped > 1 billion
Leader in parallel simulation, visualization, and deep learning
Innovation with > 7,000 patents, research investment, bleeding-edge technologies
Who is NVIDIA?
6
THE WORLD LEADER IN VISUAL COMPUTING
GAMING ENTERPRISE OEM & IP HPC & CLOUD AUTO
7
https://research.nvidia.com/publication/online-detection-and-classification-dynamic-hand-gestures-recurrent-3d-convolutional
https://research.nvidia.com/publication/parallel-spectral-graph-partitioning
https://research.nvidia.com/publication/robust-model-based-3d-head-pose-estimation
8
BIG DATA VS FAST DATA
9
FAST DATA IS THE NEW NORM
“...4,300 percent increase in annual data production by 2020.”
– Forbes Magazine and CSC, April 2016
“'Big Data' Is No Longer Enough: It's Now All About 'Fast Data'”
– Entrepreneur Media, June 2016
10
OPEN DATA SCIENCE
“Open source software is fundamental to big data, says Roman Shaposhnik, who runs the Apache Incubator project ... 'In a way, open source has won in the enterprise,' says Shaposhnik, whose day job is director of open source at Pivotal.” – Datanami, Feb 2016
And just in the past month:
1. The embrace of stream processing and real-time data access is driving enterprise adoption of Apache Kafka
2. Google open-sources SyntaxNet, a natural-language understanding library for TensorFlow
3. IBM is now letting anyone play with its quantum computer
4. Amazon open-sources its own deep learning software, DSSTNE
5. Facebook details its company-wide machine learning platform, FBLearner Flow
6. Google gives TensorFlow distributed computing support
7. OpenAI launches Gym, a toolkit for testing and comparing reinforcement learning algorithms
11
SYNERGY OF BIG DATA TECHNOLOGIES
Growing Research Enterprise Embrace of Open Tech
Data Storage
- SSD, FLASH
- Huge DRAM
Database Scalability / Velocity
- Hadoop
- Spark
- Kafka
- Multi-system interconnect
Cloud
- Broad acceptance
- Production
- Large shared storage
Machine Intelligence
- Deep Learning
- Image, text, speech, sensor
Data Capabilities
- Unstructured
- Graphs
- Frameworks
Compute Engines
- Large clusters
- Acceleration
- Extreme bandwidth
Visualization
- Real-time point clouds
- Interactive
- Precise
- Remote
Insatiable desire for insight · Geometric data growth · Reducing insight latency · Rise of the data scientist · Intelligent insight
12
REBIRTH OF THE DATA SCIENTIST
- Computerworld 2016
13
HPC AND HYPERSCALE
14
BIG DATA – FROM HPC TO HYPERSCALE TO ...
HPC: discovery; huge compute, huge storage; small data in, huge data out; creates data
Hyperscale: insight and autonomy; big compute, big storage; big data in, small data out; prediction; processes data
Similar engine?
15
REMOTE VISUALIZATION ON BLUE WATERS
Faster Time to Results
*Mark D. Klein and John E. Stone, “Unlocking the full potential of the Cray XK7 accelerator”, Cray Users Group, Lugano, Switzerland, May 2014
Local viz cluster (6 GPUs): 48 days, dominated by data transfer
HPC center (128 GPUs): rendering done in 1 day
Stellar combustion visualized on Blue Waters (26 TB dataset): 48x acceleration with Tesla GPUs in the HPC center
Local viz cluster limitations:
- Limited GPUs and other hardware resources
- Long data transfer times
HPC cluster advantages:
- Scales to 100s of GPUs in the cluster
- Eliminates data transfers
Paul Woodward, U. Minnesota: HVR w/ OpenGL on Blue Waters*
16
PHOTO REALISTIC VR RENDERING
17
DATA TECHNOLOGY
18
AI RACE IS ON
IBM Watson Achieves Breakthrough in Natural Language Processing · Facebook Launches Big Sur · Baidu Deep Speech 2 Beats Humans · Google Launches TensorFlow · Microsoft & U. Science & Tech, China Beat Humans on IQ · Toyota Invests $1B in AI Labs
[Chart: ImageNet accuracy rate by year, 2009–2016, traditional CV vs deep learning]
19
20
DEEP LEARNING FOR IMAGE ANALYTICS
[Detected object labels: person, car, helmet, motorcycle, bird, frog, dog, chair, hammer, flower pot, power drill]
21
2016: AN AMAZING YEAR FOR SELF-DRIVING CARS
Uber Enters the Race · Toyota Invests $1B in AI Lab · Volvo Drive Me on Public Roads in 2017 · NHTSA: Computer Counts as Driver · Tesla Model 3: 300K Pre-orders
Audi, BMW, Daimler Buy HERE · Tesla Model S Autopilot · Baidu Enters the Race · Honda, Nissan, Toyota Team Up · GM Buys Cruise
22
DEEP LEARNING AND KITTI
23
ACCELERATING SIGNAL & VIDEO ANALYTICS
- Real-time HD video enhancement and analytics: made possible only with GPUs
- Video surveillance with faster real-time analytics: 12x faster with GPUs
- Unmanned submarine with accelerated sonar processing: 50-100x speedup over CPU
- Faster satellite image processing for actionable intelligence: 12x faster with GPUs
24
MISSION PLANNING WITH REAL-TIME LINE OF SIGHT
http://www.luciad.com/
Video, image, and signal data
CPU: 1 computation/second, delayed response
GPU: 100 computations/second, real-time response
World Leader in Geospatial Situational Awareness
25
DEEP LEARNING REVOLUTIONIZING MEDICAL RESEARCH
Detecting mitosis in breast cancer cells – IDSIA
Molecular activity prediction for drug discovery – Merck
Predicting the toxicity of new drugs – Johannes Kepler University
Understanding gene mutation to prevent disease – University of Toronto
26
ACCELERATED DATABASE TECHNOLOGY
Big data ISVs moving to the accelerated model
SQL No SQL Graph
Trend toward accelerated computing: a variety of firms offer accelerated databases for big data today, from well-funded start-ups to large, well-known players.
27
MAPD
Lightning-fast analytic SQL database and visualization
- MapD processing: > 40k cores vs. traditional processing: 20 cores
- 100-1000x faster queries
- Visualization of billions of data points
- http://www.mapd.com/demos/tweetmap/
28
NETFLOW GRAPH ANALYTICS USE CASE
Initial graph filtered by PageRank / SecureRank using DASL.
- 140M netflows in real time
- Analytics in Scala from Spark
- Blazegraph DASL
- Run PageRank
- Quantity shown by color
29
NETFLOW GRAPH ANALYTICS USE CASE
Interactive visual query session.
A suspicious node is communicating with our internal network, but also with one outside. Looking at the internal nodes, many are communicating with a single outside node.
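The pattern on this slide (many internal nodes all talking to one outside node) can be flagged with a simple fan-in count even before heavyweight graph analytics. A minimal Python sketch, with hypothetical flow records and a hypothetical 10.x internal prefix, not the DASL/Blazegraph pipeline:

```python
from collections import defaultdict

def external_fan_in(flows, internal_prefix="10."):
    """Count distinct internal sources talking to each external destination.

    flows: iterable of (src_ip, dst_ip) pairs.
    """
    fan_in = defaultdict(set)
    for src, dst in flows:
        if src.startswith(internal_prefix) and not dst.startswith(internal_prefix):
            fan_in[dst].add(src)
    return {dst: len(srcs) for dst, srcs in fan_in.items()}

# Toy flow records: three internal hosts all contact the same external IP.
flows = [
    ("10.0.0.1", "203.0.113.9"),
    ("10.0.0.2", "203.0.113.9"),
    ("10.0.0.3", "203.0.113.9"),
    ("10.0.0.1", "198.51.100.7"),
]
scores = external_fan_in(flows)
# 203.0.113.9 is contacted by 3 distinct internal hosts, a candidate for the
# interactive visual query session described above
```

An analyst would sort these scores and drill into the top external destinations, which is exactly the visual exploration the slide describes.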
30
NETFLOW GRAPH ANALYTICS USE CASE
Identification of exfiltrated data traffic.
- Clicking and exploring attributes
- The node appears to be scanning the network from outside
- Possibly pulling out data
31
NVIDIA Research
32
33
GPU GRAPH RESEARCH
NVGRAPH
- Turn graph analytics into linear algebra
- What Blazegraph and DASL do today
- PageRank, SSSP; some acceleration on GPU
- Standardizing with GraphBLAS
Working on glue for GraphX: Spark + GPUs
- Offload core ops to GPU
- Compiler evaluates data flows and fuses ops together into one kernel (Project Tungsten)
References:
- Project Tungsten: https://spark-summit.org/2015/events/deep-dive-into-project-tungsten-bringing-spark-closer-to-bare-metal/
- GraphBLAS: www.graphblas.org
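The "analytics as linear algebra" idea is easiest to see with PageRank: one iteration is just a matrix-vector multiply, which is what GraphBLAS-style libraries accelerate. A small NumPy sketch (dense matrix for clarity; a real nvGRAPH/GraphBLAS implementation would use sparse formats on the GPU):

```python
import numpy as np

def pagerank(adj, damping=0.85, iters=50):
    """Power-iteration PageRank: each step is a matrix-vector product.

    adj[i, j] = 1 means an edge from node j to node i.
    """
    n = adj.shape[0]
    # Column-stochastic transition matrix: divide each column by its out-degree.
    out_deg = adj.sum(axis=0)
    M = adj / np.where(out_deg == 0, 1, out_deg)
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = damping * (M @ r) + (1 - damping) / n
    return r

# Tiny graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
adj = np.array([[0, 0, 1],
                [1, 0, 0],
                [1, 1, 0]], dtype=float)
ranks = pagerank(adj)
# Node 2 collects links from both 0 and 1, so it ranks highest
```

On a GPU the `M @ r` step becomes a sparse matrix-vector product (SpMV), which is exactly the operation GraphBLAS standardizes.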
34
NVIDIA DIGITS
Interactive Deep Learning GPU Training System
Process Data · Configure DNN · Monitor Progress · Visualize Layers · Test Image
developer.nvidia.com/digits github.com/NVIDIA/DIGITS
35
DIGITS FUTURE
Object detection workflows for automotive and defense, targeted at autonomous vehicles and remote sensing
developer.nvidia.com/digits github.com/NVIDIA/DIGITS
36
NEED FOR SPEED
Progress in DNN research depends on compute
Lavin & Gray, “Fast Algorithms for Convolutional Neural Networks”, 2015
37
GOALS OF ACCELERATION
Progress in DNN research depends on compute
Faster: performance (inferences/sec)
More efficient: cost (inferences/$), energy (inferences/J)
Inference: run an example forward through the network
Training: run the network forward, back-propagate the gradient, update the parameters
38
THREE KINDS OF NETWORKS
DNN – all fully connected layers (filters, recommendation)
CNN – some convolutional layers (image, vision, text)
RNN – recurrent neural network, LSTM (semantics, intent)
http://scikit-learn.org/stable/tutorial/machine_learning_map/
39
DATA PARALLEL EXAMPLE (CPU)
Linear speed-up.
NVIDIA Whitepaper “GPU based deep learning inference: A performance and power analysis.”
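Data parallelism gives each worker a full model copy and a shard of the batch; averaging the shard gradients reproduces the full-batch gradient, which is where the linear speed-up comes from. A toy NumPy sketch (linear regression standing in for a network):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = X @ np.array([1.0, -2.0, 0.5])
w = np.zeros(3)

def grad(w, Xb, yb):
    """Mean-squared-error gradient for a linear model."""
    return 2 * Xb.T @ (Xb @ w - yb) / len(yb)

# Data parallel: each of 4 workers gets an equal shard of the batch,
# computes its local gradient, then the gradients are averaged (all-reduce).
shards = np.array_split(np.arange(8), 4)
local_grads = [grad(w, X[s], y[s]) for s in shards]
g_parallel = np.mean(local_grads, axis=0)

# Identical to the single-worker gradient over the full batch.
g_serial = grad(w, X, y)
```

Because the shards are equal-sized, the mean of per-shard mean gradients equals the full-batch mean gradient exactly; unequal shards would need a weighted average.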
40
MODEL PARALLEL EXAMPLE (CPU)
Results vary.
Dean et al, Google, “Large scale distributed deep networks.” NIPS 2012
41
NVIDIA DGX-1
WORLD’S FIRST DEEP LEARNING SUPERCOMPUTER
170 TFLOPS FP16
8x Tesla P100 16GB, NVLink Hybrid Cube Mesh
Accelerates major AI frameworks
Dual Xeon, 7 TB SSD deep learning cache
Dual 10GbE, quad IB 100Gb
3RU – 3200W
42
P100 NVLINK SYSTEM
nvidia.com/nvlink github.com/NVIDIA/nccl images.nvidia.com/events/sc15/pdfs/NCCL-Woolley.pdf
NVLINK – 40GB/s
- low overhead
- Global memory
43
44
REDUCED PRECISION FOR DNN
Weight updates and sums are the main operations.
The menu of options.
45
MIXED PRECISION MODEL
Batch normalization important to “center” dynamic range.
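Batch normalization re-centers each feature to roughly zero mean and unit variance, which keeps activations inside the narrow dynamic range that FP16 represents well. A NumPy sketch of the inference-time computation:

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature (column) over the batch, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

rng = np.random.default_rng(1)
# Activations with a large offset and spread: awkward for FP16's dynamic range.
x = rng.normal(loc=1000.0, scale=50.0, size=(256, 8))
y = batch_norm(x)
# After normalization, values are centered near 0 with unit variance.
```

In FP16 terms: raw values near 1000 leave few mantissa bits for the ±50 signal, while the normalized values sit where half precision is densest.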
46
REDUCED PRECISION FOR INFERENCE
16 bits is a sweet spot
47
REDUCED PRECISION FOR TRAINING
Training diverges without stochastic rounding
- S. Gupta et al “Deep Learning with Limited Numerical Precision” ICML 2015
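The divergence Gupta et al. describe has a simple cause: updates smaller than half a precision step always round to zero deterministically, but rounding up with probability equal to the fractional remainder preserves the update in expectation. A sketch on a coarse fixed-point grid:

```python
import numpy as np

rng = np.random.default_rng(42)

def round_nearest(x, step):
    return np.round(x / step) * step

def round_stochastic(x, step):
    """Round to a multiple of `step`, up with probability = fractional part."""
    scaled = np.asarray(x) / step
    floor = np.floor(scaled)
    return (floor + (rng.random(np.shape(scaled)) < (scaled - floor))) * step

step = 0.01            # coarse precision grid
update = 0.004         # gradient update smaller than half a step
w_nearest = 0.0
w_stoch = np.zeros(10000)  # many replicas, to show the average behavior
for _ in range(100):
    w_nearest = round_nearest(w_nearest + update, step)
    w_stoch = round_stochastic(w_stoch + update, step)
# Nearest rounding loses every update: w_nearest stays 0.0.
# Stochastic rounding accumulates ~0.4 on average, the true sum of updates.
```

With nearest rounding the weight never moves, which is exactly how low-precision training stalls or diverges; stochastic rounding keeps the expected update unbiased.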
48
DNN RESEARCH
Compressing deep neural nets with pruning
Deep compression has 3 main stages:
1. Learn important connections (prune): 9x to 13x reduction
2. Quantize weights for weight sharing (32 bits down to 5), then retrain
3. Apply Huffman coding
Total compression: 35x (AlexNet) to 49x (VGG-16)
Ability to fit the model into on-chip SRAM; greatly reduced size for use in mobile
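Stages 1 and 2 can be sketched in a few lines: magnitude pruning zeroes the smallest weights, and weight sharing replaces the survivors with a 32-entry (5-bit) codebook. This is a toy illustration of the idea in Han et al. (using quantile bins rather than their k-means clustering, and no retraining):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))

# Stage 1: magnitude pruning - drop the 90% of weights with smallest |w|.
threshold = np.quantile(np.abs(w), 0.9)
mask = np.abs(w) >= threshold
w_pruned = w * mask

# Stage 2: weight sharing - map surviving weights onto 32 shared values
# (5 bits), so each weight is stored as a small codebook index.
survivors = w_pruned[mask]
codebook = np.quantile(survivors, np.linspace(0, 1, 32))
indices = np.abs(survivors[:, None] - codebook[None, :]).argmin(axis=1)
w_shared = w_pruned.copy()
w_shared[mask] = codebook[indices]

sparsity = 1 - mask.mean()   # ~0.9: roughly a 10x reduction in stored weights
```

Storage drops twice: only ~10% of positions are kept, and each kept weight needs a 5-bit index instead of a 32-bit float; Huffman coding (stage 3) then squeezes the index stream further.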
49
PRUNING DNNS
Reducing size of network reduces work and storage
Han et al “Learning both Weights and Connections for Efficient Neural Networks” NIPS 2015
50
PRUNING DNNS
Retrain to recover accuracy
Han et al “Learning both Weights and Connections for Efficient Neural Networks” NIPS 2015
51
PRUNING ALEXNET
52
PRUNING SPEEDUP ON CPU/GPU
Inference
53
GIE (GPU Inference Engine)
Workflow: MANAGE → TRAIN → DEPLOY
DIGITS: manage/augment, train, test
GPU Inference Engine: deploy to data center, automotive, embedded
developer.nvidia.com/gpu-inference-engine
54
GPU INFERENCE ENGINE
Optimizations
- Fuse network layers
- Eliminate concatenation layers
- Kernel specialization
- Auto-tuning for target platform
- Select optimal tensor layout
- Batch size tuning
TRAINED NEURAL NETWORK
OPTIMIZED INFERENCE RUNTIME
developer.nvidia.com/gpu-inference-engine
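One concrete instance of "fuse network layers": a batch-norm scale and shift following a linear or convolution layer can be folded into that layer's weights and bias ahead of time, eliminating a whole kernel at inference. A sketch of the math with a linear layer (illustrating the technique, not GIE's actual code):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))      # layer weights
b = rng.normal(size=4)           # layer bias
gamma = rng.normal(size=4)       # batch-norm scale (frozen at inference)
beta = rng.normal(size=4)        # batch-norm shift
mean = rng.normal(size=4)        # running mean
var = rng.random(4) + 0.5        # running variance
eps = 1e-5

def unfused(x):
    """Two kernels: linear layer, then batch-norm over its output."""
    y = W @ x + b
    return gamma * (y - mean) / np.sqrt(var + eps) + beta

# Fused: pre-scale W and b once; inference is a single linear kernel.
s = gamma / np.sqrt(var + eps)
W_f = s[:, None] * W
b_f = s * (b - mean) + beta

x = rng.normal(size=8)
# unfused(x) and W_f @ x + b_f agree to floating-point precision
```

The same algebra applies per-channel to convolutions, which is why inference runtimes routinely fold batch norm away before deployment.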
55
GPU INFERENCE ENGINE
Performance
Platform | Batch size | Performance | Power efficiency
Tesla M4 | 128 | 1153 images/s | 20 images/s/W
Jetson TX1 | 2 | 133 images/s | 24 images/s/W
developer.nvidia.com/gpu-inference-engine
56
GPU REST ENGINE (GRE)
GRE is a REST server for low-latency image classification (inference) using NVIDIA GPUs. It lets you build your own accelerated microservices, and makes use of several technologies:
- Docker: for bundling all the dependencies of our program and for easier deployment.
- Go: for its efficient builtin HTTP server.
- Caffe: because it has good performance and a simple C++ API.
- cuDNN: for accelerating common deep learning primitives on the GPU.
- OpenCV: to have a simple C++ API for GPU image processing.
GPU REST Engine
github.com/NVIDIA/gpu-rest-engine
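The same microservice shape can be sketched in pure Python: an HTTP endpoint that accepts an image payload and returns a label. The classifier below is a stub standing in for the Caffe/cuDNN model, and the `/api/classify` route is illustrative, not GRE's actual API:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def classify(image_bytes):
    """Classifier stub: a real service runs the bytes through a GPU-backed model."""
    label = "cat" if len(image_bytes) % 2 == 0 else "dog"
    return {"class": label, "confidence": 0.9}

class ClassifyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/api/classify":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        payload = json.dumps(classify(body)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

# To serve: HTTPServer(("localhost", 8000), ClassifyHandler).serve_forever()
```

GRE gets its performance from Go's HTTP server and per-GPU worker pools; the point here is only the service shape: stateless handler in front of a shared inference backend.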
57
DESIGN FOR PRODUCTION DATA
Drinking data from a firehose:
- Scalable stream rate, generator -> analytic
- Extended runs: resource limitations, memory fragmentation, windowed vs. continuous processing
Firehose Research
http://firehose.sandia.gov http://wsga.sandia.gov/docs/Anderson.firehose_benchmark.pdf
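The windowed-vs-continuous distinction above comes down to bounding state over an unbounded stream. A minimal sliding-window key counter whose memory never grows beyond the window (a sketch of the pattern, unrelated to the Firehose benchmark's actual analytics):

```python
from collections import Counter, deque

class WindowedKeyCounter:
    """Count key frequencies over the last `window` events, with bounded memory."""
    def __init__(self, window):
        self.window = window
        self.events = deque()
        self.counts = Counter()

    def add(self, key):
        self.events.append(key)
        self.counts[key] += 1
        if len(self.events) > self.window:      # evict the oldest event
            old = self.events.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]            # keep state from fragmenting

    def hot_keys(self, threshold):
        return {k for k, c in self.counts.items() if c >= threshold}

stream = ["a", "b", "a", "c", "a", "a", "b", "a"]
w = WindowedKeyCounter(window=4)
for k in stream:
    w.add(k)
# Only the last 4 events ("a", "a", "b", "a") are counted.
```

A continuous (unwindowed) counter would instead grow with the number of distinct keys, which is exactly the long-run memory pressure the Firehose benchmarks stress.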
58
NVIDIA END-TO-END AUTONOMOUS DRIVING PLATFORM
NVIDIA DRIVE PX 2 NVIDIA DGX-1 NVIDIA DRIVENET
Localization Planning Visualization Perception
DRIVEWORKS
59
DAVE NET
End to End Learning for Self Driving Cars
60
PX-2 INTERFACES
Sensor Fusion Interfaces:
GMSL Camera, CAN, GbE, BroadR-Reach, FlexRay, LIN, GPIO
Displays and Cockpit Computer Interfaces
HDMI, FPDLink III and GMSL
Development and Debug Interfaces
HDMI, GbE, 10GbE, USB3, USB 2 (UART/debug), JTAG
70 Gigabits per second of I/O
Auto-grade connectors · Debug/lab interfaces
61
CALL TO ACTION
Review research.nvidia.com and developer.nvidia.com to see emerging technologies
Join the LinkedIn big data page to follow real-time trends
Try tutorials on Spark, Kafka, MapD, GPUdb, and Blazegraph
Visit the OpenAI Gym playground for ideas
Try Kaggle competitions for ideas
- Louis Capps, lcapps@nvidia.com
62