Recording the Visual Mind Appu Shaji Head of Research & - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Appu Shaji Head of Research & Development EyeEm

Recording the Visual Mind

@panteanaghavi

slide-2
SLIDE 2

“The Boulevard du Temple”, (1837) by Louis Daguerre (public domain).

The Evolution of Photography

slide-3
SLIDE 3

“A Harvest of Death, Gettysburg, Pennsylvania”, (1863) by Timothy H. O'Sullivan (public domain).

slide-4
SLIDE 4

“Migrant Mother”, (1936) by Dorothea Lange (public domain).

slide-5
SLIDE 5

“Earthrise”, (1968) by William Anders (public domain).

slide-6
SLIDE 6

“Revenge of the Goldfish”, (1981) by Sandy Skoglund (fair use).

slide-7
SLIDE 7

“Paris, Montparnasse”, (1993) by Andreas Gursky (fair use).

slide-8
SLIDE 8
slide-9
SLIDE 9

AT THE CORE

Storytelling

A story behind every image.

slide-10
SLIDE 10

EYEEM

Vision

Towards uncovering the stories within an image.

https://www.eyeem.com/tech

slide-11
SLIDE 11

One person Young woman Front view Adult

Human body

portrait of young woman 89%

Contemplation Head shot

Tags & Caption Aesthetics

EyeEm Vision will organize your visual content

Identify all relevant concepts

See the story in a headline

The aesthetic score ranks the quality of images

Trained on visual trends and feedback from the EyeEm community

slide-12
SLIDE 12

WE ARE A

Community

@BettinaDarger

20 million

PHOTOGRAPHERS

Marketplace

AUTHENTIC PHOTOGRAPHY

Technology

SEARCH & DISCOVERY

150

COUNTRIES

slide-13
SLIDE 13

IN SHORT

EyeEm is

@LAX2NRT

EyeEm is a photography company. We build the world's leading computer vision technology to connect our global creative community with iconic brands.

slide-14
SLIDE 14

AT THE CORE

Storytelling

A story behind every image.

slide-15
SLIDE 15

EYEEM

Vision

Towards uncovering the stories within an image.

slide-16
SLIDE 16

http://www.eyeem.com/tech

slide-17
SLIDE 17

http://www.eyeem.com/tech

slide-18
SLIDE 18

Understanding Aesthetics.

@Aadnan

slide-19
SLIDE 19

Can we learn from the masters?

slide-20
SLIDE 20

Bombay Churchgate Station, (1994) by Raghu Rai (fair use).

slide-21
SLIDE 21

Bombay Churchgate Station, (1995) by Sebastião Salgado (fair use).

slide-22
SLIDE 22

Bombay Churchgate Station, (2011) by Randy Olson (fair use).

slide-23
SLIDE 23

Bombay Churchgate Station, (unknown) from Google Image Search

slide-24
SLIDE 24

Steve McCurry

David Uzochukwu

What is common among them?

Me :-)

What differentiates them?

slide-25
SLIDE 25

Photographs by EyeEm users @nicanorgarcia, @cocu_liu, and me.

slide-26
SLIDE 26
slide-27
SLIDE 27

4 million

Photographs curated by Experts

BAD GOOD

3 years

and ongoing

Data

  • Aesthetics are hard to express.
  • Aesthetics differ from person to person.
  • However, experts often find a common language to define and communicate about aesthetics, after considerable conversation and debate.

Crowdsourced Expertly Curated Community Social Data

slide-28
SLIDE 28

PERSONALIZED

Aesthetics

Isn't Aesthetics Subjective?

slide-29
SLIDE 29

@jackyczj2010 @KatePhellini @ken_kou @itchban @idjphotography @ArifNurhakim @sayjor @svanteberg

slide-30
SLIDE 30

Image → Convolutional Layers + Non-Linearity → Feature Representations → Dense Layer → Predictions
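The pipeline on this slide can be sketched in a few lines of plain Python. This is only a minimal illustration, not EyeEm's network: the function names, the two hand-written 2x2 kernels, the global average pooling step, and the dense-layer weights are all assumptions made for the example. A real model would learn its filters from data and stack many such layers.

```python
import math

def conv2d(image, kernel):
    """Valid 2-D cross-correlation of a grayscale image with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def relu(feature_map):
    """Element-wise non-linearity: negative responses are clipped to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]

def global_average_pool(feature_map):
    """Collapse one feature map into a single number (assumed pooling step)."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)

def extract_features(image, kernels):
    """Convolution + non-linearity, one pooled feature per kernel."""
    return [global_average_pool(relu(conv2d(image, k))) for k in kernels]

def dense_sigmoid(features, weights, bias):
    """Dense layer with a sigmoid output, read here as a score in (0, 1)."""
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical 4x4 image: dark left half, bright right half.
image = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
# Hypothetical hand-written kernels (a trained network learns these).
kernels = [
    [[-1.0, 1.0], [-1.0, 1.0]],  # responds where brightness rises left to right
    [[1.0, -1.0], [1.0, -1.0]],  # responds where brightness falls left to right
]

features = extract_features(image, kernels)
score = dense_sigmoid(features, weights=[1.0, -1.0], bias=0.0)
print(features, score)
```

The split matters for what follows in the deck: everything up to `extract_features` is generic, while only the small dense layer at the end turns features into a prediction.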

slide-31
SLIDE 31

Feature

Representations

BCG Kayak All Images

slide-32
SLIDE 32

Image → Convolutional Layers + Non-Linearity → Feature Representations → Dense Layer → Predictions

Non-linear Ranker (Multi-layer Perceptron) → Personalized Rank
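One way to read this slide is that the convolutional features stay fixed and a small multi-layer perceptron on top of them produces a per-user ordering. The sketch below shows only the scoring and sorting step under that assumption. The photo ids, the two-dimensional feature vectors, and the weights in `params` are hypothetical; in practice the weights would be learned from one user's selections.

```python
import math

def mlp_score(features, hidden_weights, hidden_bias, out_weights, out_bias):
    """One hidden tanh layer followed by a linear output: a small non-linear ranker."""
    hidden = [
        math.tanh(sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(hidden_weights, hidden_bias)
    ]
    return sum(w * h for w, h in zip(out_weights, hidden)) + out_bias

def personalized_rank(photo_features, params):
    """Return photo ids sorted from highest to lowest personalized score."""
    scored = {pid: mlp_score(f, *params) for pid, f in photo_features.items()}
    return sorted(scored, key=scored.get, reverse=True)

# Hypothetical precomputed feature representations, one vector per photo.
photo_features = {
    "photo_a": [0.9, 0.1],
    "photo_b": [0.1, 0.8],
    "photo_c": [0.6, 0.2],
}

# Hypothetical ranker weights for a single user.
params = (
    [[2.0, -1.0], [-1.0, 2.0]],  # hidden-layer weights (2 units x 2 features)
    [0.0, 0.0],                  # hidden-layer biases
    [1.0, -1.0],                 # output weights
    0.0,                         # output bias
)

ranking = personalized_rank(photo_features, params)
print(ranking)
```

Because the ranker only sees feature vectors, swapping in a different user's `params` reorders the same photos without touching the convolutional layers.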

slide-33
SLIDE 33

Energy

Surface

BCG Kayak All Images

slide-34
SLIDE 34

200,000

Photos can be scored in a second.

3 to 4 seconds

To train a new personalized layer (on a Titan GPU)
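Training can plausibly be this fast because only a small layer is fitted on features that have already been computed. The sketch below illustrates the idea with a deliberate simplification: it fits a single logistic layer by full-batch gradient descent rather than the multi-layer perceptron named on the earlier slide, and it runs on the CPU. The cached feature vectors, the like/dislike labels, and all function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_personal_layer(features, labels, steps=500, lr=0.5):
    """Fit one logistic layer on cached features with full-batch gradient descent."""
    dim = len(features[0])
    n = len(features)
    weights, bias = [0.0] * dim, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            # Gradient of the mean binary cross-entropy for this example.
            error = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            for k in range(dim):
                grad_w[k] += error * x[k] / n
            grad_b += error / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias

def score(x, weights, bias):
    """Personalized score in (0, 1) for one cached feature vector."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)

# Hypothetical cached features and one user's selections (1 = picked, 0 = skipped).
cached = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.7], [0.1, 0.9]]
picked = [1, 1, 0, 0]

weights, bias = train_personal_layer(cached, picked)
scores = [score(x, weights, bias) for x in cached]
print(scores)
```

Scoring a new photo afterwards is a single dot product per feature vector, which fits the slide's point that scoring is cheap once the features exist.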

slide-35
SLIDE 35

https://www.youtube.com/watch?v=lYFKqoekP_0

slide-36
SLIDE 36

Data

  • What kind of data should we collect?
  • How do you understand the data?

Human thinking is non-linear, and any given selection is prone to biases.

  • How do you sample the data?
  • How do you evaluate & analyze the results?
  • How much data is enough?
  • How do you fill in data gaps?

Algorithms

  • How do you pose the problem?
  • What do you look for in good feature representations?
  • Will an architecture engineered for classification tasks be the best for aesthetics?
  • What is the network learning?

Efficiency

  • How do you make your networks faster?
  • How do you make your networks smaller?
  • Can we effectively transfer what we have learned in the past?

(Hint: It is about answering questions with no easy answers.)

COMPUTER VISION AND MACHINE LEARNING RESEARCH

At EyeEm

slide-37
SLIDE 37

Research

Directions

  • CNN
  • RNN/LSTM
  • Attention Mechanisms
  • Zero-shot
  • Many-shot
  • Transfer Learning
  • GANs
  • Fill-in-the-blanks style model
  • Semi/Weakly Supervised
  • Reinforcement Learning
  • Unsupervised Learning
  • Personalization
  • Understanding Models
  • Reasoning
  • In production system

slide-38
SLIDE 38

Product

ASK LORENZ

Technology

slide-39
SLIDE 39

Conversations Community Technology Photos

slide-40
SLIDE 40

Joint Work With

  • Dr. Gökhan Yildirim
  • Harsimrat Sandhawalia
  • Ludwig Schmidt-Hackenberg
  • Dr. Hicham Badri
  • Dr. Praveen Kulkarni
  • Nour Karessli
  • Dr. Josip Krapac
  • Fulya Neubert

Thanks

  • Photography Team @ EyeEm
  • Core Engineering @ EyeEm
  • Clients @ EyeEm

And yes, we are hiring (https://www.eyeem.com/jobs).

slide-41
SLIDE 41

A photograph exists in past, present and future; and magic can happen in any of these times!

appu@eyeem.com

@evanscsmith