Real-Time Image Recognition Nikita Shamgunov, CEO, MemSQL - - PowerPoint PPT Presentation



slide-1
SLIDE 1

1

Real-Time Image Recognition

Nikita Shamgunov, CEO, MemSQL In-Memory Computing Summit 2017

slide-2
SLIDE 2

2

The future of computing is visual

slide-3
SLIDE 3

3

and also numerical :)

slide-4
SLIDE 4

4

slide-5
SLIDE 5

5

slide-6
SLIDE 6

6

slide-7
SLIDE 7

7

slide-8
SLIDE 8

Putting image recognition to work today

slide-9
SLIDE 9
slide-10
SLIDE 10

10

How It Works

slide-11
SLIDE 11

11

Real-Time Image Recognition Workflow

▪ Train the model with Spark, TensorFlow, and Gluon

▪ Use the model to extract feature vectors from images

  • Model + Image => FV

▪ You can store every feature vector in a MemSQL table

CREATE TABLE features (
  id bigint(11) NOT NULL AUTO_INCREMENT,
  image binary(4096) DEFAULT NULL,
  KEY id (id) USING CLUSTERED COLUMNSTORE
);
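The "Model + Image => FV" step ends with a vector that has to fit the binary(4096) column. The following is a minimal sketch of one way to do that, not the presenter's code. It assumes the vector has 1024 dimensions and is stored as little-endian 32-bit floats (1024 × 4 bytes = 4096 bytes); the names `DIM`, `normalize`, `pack_feature_vector`, and `unpack_feature_vector` are hypothetical, and `raw` stands in for real model output.

```python
import math
import struct

# Assumption: 1024 float32 values * 4 bytes each = 4096 bytes, matching binary(4096).
DIM = 1024


def normalize(vec):
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def pack_feature_vector(vec):
    """Pack DIM floats as little-endian float32 values into a bytes object."""
    if len(vec) != DIM:
        raise ValueError(f"expected {DIM} values, got {len(vec)}")
    return struct.pack(f"<{DIM}f", *vec)


def unpack_feature_vector(blob):
    """Inverse of pack_feature_vector: bytes back to a list of floats."""
    return list(struct.unpack(f"<{DIM}f", blob))


# Hypothetical stand-in for the feature vector a trained model would emit.
raw = [float(i % 7 + 1) for i in range(DIM)]
blob = pack_feature_vector(normalize(raw))
```

The resulting `blob` is the value that would be inserted into the `image` column. Normalizing before packing is what lets the dot product be read later as a similarity score.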

slide-12
SLIDE 12

12

Working with Feature Vectors

For every image, we store an ID and a normalized feature vector in a MemSQL table called features.

ID | Feature Vector
x | 4KB

To find similar images, we use this SQL query:

SELECT id FROM features WHERE DOT_PRODUCT(image, <input>) > 0.9

slide-13
SLIDE 13

13

Understanding Dot Product

▪ Dot Product is an algebraic operation

  • SUM(Xi*Yi), that is, $X \cdot Y = \sum_{i} X_i Y_i$

▪ With the specific model and normalized feature vectors, Dot Product results in a similarity score

  • The closer the score is to 1, the more similar the images
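For unit-length vectors, the dot product equals the cosine of the angle between them, which is why it can be read as a similarity score with a maximum of 1. The sketch below illustrates this in plain Python on tiny 3-dimensional vectors. The helper names and the sample data are hypothetical, and the 0.9 threshold is taken from the query on the previous slide.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products fall in [-1, 1]."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot_product(x, y):
    """SUM(Xi * Yi) over two vectors of equal length."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))


def find_similar(features, query, threshold=0.9):
    """Return the ids whose stored vector scores above the threshold."""
    return [
        image_id
        for image_id, vec in features.items()
        if dot_product(vec, query) > threshold
    ]


# Hypothetical toy data: ids mapped to normalized feature vectors.
features = {
    1: normalize([1.0, 0.0, 0.0]),
    2: normalize([0.9, 0.1, 0.0]),
    3: normalize([0.0, 1.0, 0.0]),
}
query = normalize([1.0, 0.05, 0.0])
matches = find_similar(features, query)
```

Here `matches` contains ids 1 and 2, whose vectors point in nearly the same direction as the query, while id 3 is almost orthogonal to it and scores close to 0.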
slide-14
SLIDE 14

14

Performance Enhancing Techniques

Achieving best-in-class Dot Product implementation

▪ SIMD-powered

▪ Data compression

▪ Query parallelism

▪ Scale out

▪ Result: Processing at Memory Bandwidth Speed

slide-15
SLIDE 15

15

Performance Numbers

▪ Memory speed: 50 GB/sec

▪ Each vector: 4 KB

▪ 12.5 million images a second per node

▪ 1 billion images a second on a 100-node cluster
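The per-node figure follows from dividing memory bandwidth by vector size. The short calculation below reproduces it under two assumptions that are not stated on the slide: "4K" is read as 4,000 bytes (decimal units, which matches the 12.5 million figure exactly), and throughput scales linearly with node count.

```python
# Figures from the slide; units are assumed to be decimal (1 GB = 1e9 bytes).
MEMORY_BANDWIDTH_BYTES_PER_SEC = 50e9
VECTOR_SIZE_BYTES = 4_000  # assumption: "4K" rounded; 4096 bytes gives about 12.2 million
NODES = 100

# A scan bounded by memory bandwidth reads one vector per comparison.
vectors_per_sec_per_node = MEMORY_BANDWIDTH_BYTES_PER_SEC / VECTOR_SIZE_BYTES

# Assumption: perfect linear scale-out across the cluster.
vectors_per_sec_cluster = vectors_per_sec_per_node * NODES

print(f"per node: {vectors_per_sec_per_node:,.0f} vectors/sec")
print(f"cluster:  {vectors_per_sec_cluster:,.0f} vectors/sec")
```

Under these assumptions the cluster figure comes to 1.25 billion vectors a second, so the slide's "1 billion" reads as a rounded-down number.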

slide-16
SLIDE 16

Demo

slide-17
SLIDE 17

17

Demo Architecture

[Architecture diagram. Labels: Images; ML Framework; Model; ML Framework; Persistent, Queryable Format; Real-time image recognition]

slide-18
SLIDE 18

18

SELECT id FROM features WHERE DOT_PRODUCT(image, 0xa334efa…)
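The second argument in the query above is the input image's feature vector written as a hex literal. As a rough sketch of how such a literal could be produced, the code below packs a vector as little-endian 32-bit floats and renders it in hex. The byte layout, the function name `to_hex_literal`, and the 2-dimensional example vector are all assumptions for illustration; the 0.9 threshold is borrowed from the "Working with Feature Vectors" slide.

```python
import struct


def to_hex_literal(vec):
    """Render a vector as a 0x... literal of little-endian float32 values."""
    return "0x" + struct.pack(f"<{len(vec)}f", *vec).hex()


# Hypothetical 2-D query vector, already unit length (0.6^2 + 0.8^2 = 1).
query_vector = [0.6, 0.8]
literal = to_hex_literal(query_vector)

# Illustration only: string formatting keeps the example short.
sql = f"SELECT id FROM features WHERE DOT_PRODUCT(image, {literal}) > 0.9"
```

A real 4 KB vector would produce a literal of 8,192 hex digits, so an application would more likely pass the packed bytes as a bound parameter than embed them in the query text.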

slide-19
SLIDE 19

About MemSQL

slide-20
SLIDE 20

▪ Scalable

  • Petabyte scale
  • High concurrency
  • System of record

▪ Real-time

  • Operational

▪ Compatible

  • ETL
  • Business Intelligence
  • Kafka
  • Spark

MemSQL: The Real-Time Data Warehouse

▪ Deployment

  • MemSQL Cloud
  • Any public cloud
  • On-premises

▪ Developer Edition

  • Unlimited scale
  • Limited high availability and security features

20

slide-21
SLIDE 21

21

2017 Magic Quadrant for Data Management Solutions for Analytics

slide-22
SLIDE 22

About ML Training

slide-23
SLIDE 23

23

ML training is available through a variety of frameworks, including Spark MLlib, TensorFlow, Gluon, and Caffe.

slide-24
SLIDE 24

24

slide-25
SLIDE 25

25

Understanding ML Frameworks and MemSQL

ML Frameworks | MemSQL
Fast, large scale | Fast, large scale
General processing engines | Real-time data warehouse
Great for training | Great for real-time scoring

slide-26
SLIDE 26

Highly parallel, high throughput, bi-directional

26

Example: MemSQL Spark Connector

slide-27
SLIDE 27

Thank you! @NikitaShamgunov www.memsql.com