Computer Vision 3: Detection, Segmentation and Tracking



SLIDE 1

Computer Vision 3: Detection, Segmentation and Tracking

CV3DST | Prof. Leal-Taixé

SLIDE 2

The Team

Lecturers

  • Prof. Dr. Laura Leal-Taixé
  • Dr. Aljosa Osep

SLIDE 3

What this course is:

  • A course on Computer Vision

    – Object detection
    – Instance and semantic segmentation
    – Multiple object tracking in 2D and 3D

  • Other CV courses:

    – Computer Vision 2: Multiple View Geometry (WS)

SLIDE 4

What this course is NOT:

  • An Introduction to Deep Learning

    – Take “Introduction to Deep Learning” if you are not familiar with basic DL concepts

  • A practical project course

    – Take “Advanced Deep Learning for Computer Vision”

  • A theoretical introduction into 3D Vision

    – Take “Computer Vision 2: Multiple View Geometry (WS)”

SLIDE 5

What is Computer Vision?

  • First defined in the 60s in artificial intelligence groups
  • “Mimic the human visual system”
  • A central building block of robotic intelligence

SLIDE 6

SLIDE 7

Computer Vision

Give eyes to a computer

SLIDE 8

Computer Vision

Understand every pixel of an image

SLIDE 9

Computer Vision

Understand every pixel of an image

Semantic segmentation (figure labels: road, tree, person, car)

SLIDE 10

Computer Vision

Understand every pixel of an image

Semantic segmentation and instance-based segmentation (figure labels: road, tree, car, person 1, person 2, person 3)
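The difference between the two label types on slides 9 and 10 can be sketched on a toy example. This is only an illustration, not material from the lecture: the 2x6 "image", the instance ids, and the helper names `semantic_classes` and `instances` are all made up here.

```python
# Toy 2x6 "image": each entry is the class label of one pixel (semantic segmentation).
semantic = [
    ["person", "person", "road", "person", "person", "car"],
    ["person", "person", "road", "person", "person", "car"],
]

# Hypothetical per-class instance ids for the same pixels.
# 0 means "no instance" (background regions such as road).
instance_ids = [
    [1, 1, 0, 2, 2, 1],
    [1, 1, 0, 2, 2, 1],
]


def semantic_classes(sem):
    """Classes present in the image; all people share the single label 'person'."""
    return sorted({label for row in sem for label in row})


def instances(sem, ids):
    """Distinct (class, instance id) pairs; each person becomes a separate object."""
    return sorted(
        {
            (label, i)
            for sem_row, id_row in zip(sem, ids)
            for label, i in zip(sem_row, id_row)
            if i != 0
        }
    )


print(semantic_classes(semantic))         # ['car', 'person', 'road']
print(instances(semantic, instance_ids))  # [('car', 1), ('person', 1), ('person', 2)]
```

The semantic map alone cannot tell how many people are in the image; the instance ids add exactly that information.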

SLIDE 11

Computer Vision

Understand every pixel of a video

  • Semantic segmentation
  • Instance-based segmentation
  • Multiple object tracking

SLIDE 12

Dynamic Scene Understanding

Understand every pixel of a video

  • Semantic segmentation
  • Instance-based segmentation
  • Multiple object tracking

SLIDE 13

Autonomous driving

SLIDE 14

Autonomous driving

SLIDE 15

Understanding an image

Credit: Li/Karpathy/Johnson

SLIDE 16

Understanding an image

  • K. He, G. Gkioxari, P. Dollár, R. Girshick. Mask R-CNN. ICCV 2017.

SLIDE 17

Understanding an image

SLIDE 18

Understanding an image

SLIDE 19

Understanding an image

SLIDE 20

Understanding an image

  • Different representations depending on the granularity

    – Detections (coarse)
    – Segmentations (precise)
    – Semantic with/without instances (person 1, person 2)

  • Goes well with Deep Learning
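To make "coarse" versus "precise" concrete, a detection can be pictured as the tight bounding box around a segmentation mask. The sketch below is an illustration under that assumption; the toy mask and the helpers `mask_to_box` and `box_area` are hypothetical and not taken from the lecture.

```python
def mask_to_box(mask):
    """Tight box (x_min, y_min, x_max, y_max), inclusive, around the nonzero pixels.

    Returns None for an empty mask.
    """
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, value in enumerate(row) if value]
    if not ys:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


def box_area(box):
    """Number of pixels covered by an inclusive box."""
    x_min, y_min, x_max, y_max = box
    return (x_max - x_min + 1) * (y_max - y_min + 1)


# Toy binary mask of one object (1 = object pixel).
mask = [
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
]

box = mask_to_box(mask)
mask_area = sum(sum(row) for row in mask)

print(box)            # (0, 0, 4, 3)
print(box_area(box))  # 20
print(mask_area)      # 10
```

In this toy case the box covers 20 pixels while the object occupies only 10, so half of the detection is background. That gap is what the pixel-level representation removes.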

SLIDE 21

Understanding a video

  • The temporal domain brings us advantages

    – A lot of redundancy
    – A smoothness assumption: things do not change much from one frame to another

  • … but also disadvantages

    – At 30 FPS, imagine the computation one has to do to process a video…
    – Occlusions, multiple objects moving and interacting…
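A rough back-of-the-envelope sketch of both points follows. The 1920x1080 resolution, the one-minute duration, and the toy pixel values are assumptions chosen for illustration; only the 30 FPS figure comes from the slide.

```python
def frames_in(duration_s, fps=30):
    """Number of frames in a clip of the given length."""
    return int(duration_s * fps)


def pixels_in(duration_s, width, height, fps=30):
    """Total number of pixels a per-pixel method has to label for the clip."""
    return frames_in(duration_s, fps) * width * height


def mean_abs_diff(frame_a, frame_b):
    """Mean absolute intensity change between two frames given as flat lists."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


# Disadvantage: the sheer amount of data in one minute of (assumed) Full HD video.
print(frames_in(60))              # 1800
print(pixels_in(60, 1920, 1080))  # 3732480000

# Advantage: consecutive frames are usually almost identical (toy 4-pixel frames).
frame_t = [10, 12, 200, 11]
frame_t_plus_1 = [11, 12, 198, 11]
print(mean_abs_diff(frame_t, frame_t_plus_1))  # 0.75
```

The small frame-to-frame difference is the redundancy that video methods can exploit, for example by reusing results from the previous frame instead of recomputing everything.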

SLIDE 22

Understanding a video: then

SLIDE 23

Understanding a video: now

SLIDE 24

Understanding a video

  • Where is every object going?
  • How are objects interacting?
  • Get consistent results in the temporal dimension
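One very simple way to picture "consistent results in the temporal dimension" is to carry object identities from one frame to the next by box overlap. The sketch below uses greedy intersection-over-union (IoU) matching as a stand-in baseline; it is not presented in the slides as the course's method, and the function names, the 0.5 threshold, and the toy boxes are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def propagate_ids(prev_tracks, detections, min_iou=0.5):
    """Greedily assign existing track ids to new detections by descending IoU.

    prev_tracks: dict mapping track id -> box in the previous frame.
    detections:  list of boxes in the current frame.
    Unmatched detections start new tracks with fresh ids.
    """
    pairs = sorted(
        (
            (iou(box, det), track_id, j)
            for track_id, box in prev_tracks.items()
            for j, det in enumerate(detections)
        ),
        reverse=True,
    )
    assigned, used = {}, set()
    for score, track_id, j in pairs:
        if score < min_iou:
            break  # remaining pairs overlap too little to be the same object
        if track_id in assigned or j in used:
            continue
        assigned[track_id] = detections[j]
        used.add(j)

    next_id = max(prev_tracks, default=0) + 1
    for j, det in enumerate(detections):
        if j not in used:
            assigned[next_id] = det
            next_id += 1
    return assigned


prev_tracks = {1: (0, 0, 10, 10), 2: (20, 0, 30, 10)}
detections = [(21, 0, 31, 10), (1, 0, 11, 10), (50, 50, 60, 60)]
tracks = propagate_ids(prev_tracks, detections)
print(tracks)
```

Here tracks 1 and 2 keep their ids because their boxes moved only slightly, while the third detection overlaps nothing and starts a new track. Occlusions and interacting objects are exactly the cases where such a naive rule breaks down.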

SLIDE 25

Rough schedule/content

  • 1. Introduction
  • 2. Object Detection 1
  • 3. Object Detection 2
  • 4. Single/Multiple object tracking
  • 5. Multiple object tracking
  • 6. Trajectory prediction
  • 7. Semantic segmentation
  • 8. Instance Segmentation
  • 9. Video object segmentation
  • 10. Going towards 3D tracking and segmentation


SLIDE 26

Rough schedule/content

  • RCNN, Fast RCNN and Faster RCNN
  • YOLO, SSD, RetinaNet
  • Siamese networks – Person Re-Identification
  • Message Passing Networks
  • Network (non-neural) flow for tracking
  • Generative Adversarial Networks – trajectory prediction
  • Mask-RCNN, UPSNet (panoptic segmentation)
  • Deformable/atrous convolutions
  • 3D – data, algorithms.


SLIDE 27

Our Research Lab

Dynamic Vision and Learning Group https://dvl.in.tum.de/

SLIDE 28

About the lecture

  • Theory: 10-11 lectures
  • Every Wednesday 16:00-18:00 (MI HS 2)
  • Lectures will be recorded this year!

    – Due to the virus situation, we will make all slides and video recordings available every Wednesday

https://dvl.in.tum.de/teaching/cv3dst-ss20/

SLIDE 29

Grading system

  • Exam: tbd
  • There will be a retake exam as this course will be moved permanently to Summer Semester
  • Completing the practical part successfully gives a bonus of 0.3

SLIDE 30

Practical part

  • Internal Kaggle competition
  • We will have a tracking challenge.
  • You will all start from the same point (code)
  • After that, it will be an open competition.
  • Check out the video of the presentation of the challenge!

SLIDE 31

Moodle

  • Announcements via Moodle - IMPORTANT!

    – Sign up in TUM online for access: https://www.moodle.tum.de/
    – We will share common information (e.g., regarding exam)
    – Ask content questions online so others benefit
    – Don’t post solutions

SLIDE 32

Emails & Slides

  • All material will be uploaded on Moodle and the web
  • For questions regarding the syllabus, exercises or contents of the lecture, use Moodle!
  • Questions regarding organization of the course: dst@dvl.in.tum.de
  • Emails to the individual addresses will not be answered.

SLIDE 33

Computer Vision 3: Detection, Segmentation and Tracking