Ultra-Low Power Render-Based Collision Detection for CPU/GPU Systems

SLIDE 1

48th International Symposium on Microarchitecture

Ultra-Low Power Render-Based Collision Detection for CPU/GPU Systems

Enrique de Lucas, Pedro Marcuello, Joan-Manuel Parcerisa, Antonio González

SLIDE 2

Market

  • Mobile devices market is growing fast

[Figure: millions of units shipped per year, 2008 to 2014 (y-axis 200 to 1400), PC vs. Smartphone]

SLIDE 3

Mobile Systems

  • Users demand realistic and complex graphics like in laptops and desktops
    – Battery life is about 4 hours for GFXBench 3.0! (1)
    – Heat dissipation
  • “CPU, GPU and screen are the dominant energy consumers on a smartphone” (2)
  • Graphics animation applications are quite popular
    – Collision Detection is an important task

  • 1. With an ARM Mali 400MP GPU, www.gfxbench.com
  • 2. Mittal et al., Empowering Developers to Estimate App Energy Consumption, Aug. 2012.
SLIDE 4

Outline

  • 1. Motivation
  • 2. Collision Detection (CD)
  • 3. Render-Based CD in the GPU
  • 4. Results
  • 5. Conclusions
SLIDE 5

Collision Detection (CD)

  • CD identifies the contact points between objects

[Figure: objects and their Bounding Volume at Frame i and Frame i+2; callout: Contact!]

SLIDE 6

Bounding Volumes

  • false area = false collisions

[Figure: bounding volumes around two objects; callouts: Collision, False Collisions!]

Bounding Volume   Cuboid   Convex Hull
Accuracy          Low      Medium
Computing Cost    Low      High
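To make the "false area = false collisions" point concrete, here is a minimal Python sketch. It assumes 2D circles wrapped in axis-aligned boxes as a stand-in for the cuboid bounding volume. The shapes, coordinates and function names are illustrative and do not come from the talk.

```python
import math

def aabb_of_circle(cx, cy, r):
    """Axis-aligned bounding box (xmin, ymin, xmax, ymax) of a circle."""
    return (cx - r, cy - r, cx + r, cy + r)

def aabb_overlap(a, b):
    """True when two axis-aligned boxes overlap (cheap test)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def circles_collide(c1, c2):
    """Exact test: circles touch when centre distance <= sum of radii."""
    (x1, y1, r1), (x2, y2, r2) = c1, c2
    return math.hypot(x2 - x1, y2 - y1) <= r1 + r2

# Two unit circles placed diagonally (hypothetical data): their boxes
# overlap in the empty corner area, but the circles do not touch.
c1 = (0.0, 0.0, 1.0)
c2 = (1.8, 1.8, 1.0)
box_hit = aabb_overlap(aabb_of_circle(*c1), aabb_of_circle(*c2))
exact_hit = circles_collide(c1, c2)
false_collision = box_hit and not exact_hit
```

The box test reports a contact that the exact test rejects. That is the kind of false collision a tighter (and costlier) volume such as a convex hull reduces.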

SLIDE 7

Image-Based CD (IBCD)

  • CD performed at pixel granularity
  • No Bounding Volumes
  • Higher Accuracy
  • Computing/Energy Cost
    – CPU: huge
    – Our technique (GPU): tiny
SLIDES 8-32

How does IBCD work?

Steps of Image-Based CD

  1) Project objects onto a plane (screen)
  2) Rasterize surface of the objects and store depths in a list
  3) Sort values by depth
  4) Detect overlapping depth-ranges

[Figure, built up step by step over slides 8-32: objects are projected along the Depth axis onto the Screen Pixels. The surface depths seen at pixel P1 are stored in the List for P1 in arrival order (1 7 5 2 6 8 3 4), sorted into 1 2 3 4 5 6 7 8, and grouped into Depth-ranges. Callouts on the overlapping depth-ranges: Overlap! and Collision!. Slide 32 adds the callout: Already done for image rendering!]
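The IBCD steps (store the rasterized depths per pixel, sort them, detect overlapping depth-ranges) can be sketched in a few lines of Python. This is a minimal software illustration, not the authors' implementation. It assumes every object contributes exactly one entry and one exit surface per pixel, and the fragment tuples, object names and function names are hypothetical.

```python
from collections import defaultdict

def build_depth_lists(fragments):
    """Step 2: store each rasterized fragment's depth in its pixel's list."""
    lists = defaultdict(list)
    for pixel, depth, obj in fragments:
        lists[pixel].append((depth, obj))
    return lists

def overlapping_pairs(depth_list):
    """Steps 3 and 4 for one pixel: sort by depth, then sweep front to back.

    Assumption: the first fragment of an object opens its depth-range and
    the second one closes it (one entry and one exit surface per object).
    """
    open_objs, pairs = [], []
    for _depth, obj in sorted(depth_list):      # step 3: sort by depth
        if obj in open_objs:
            open_objs.remove(obj)               # exit surface closes the range
        else:
            # Entering while other ranges are still open means overlap.
            pairs.extend((other, obj) for other in open_objs)
            open_objs.append(obj)
    return pairs

# Hypothetical fragments for one pixel, in arbitrary arrival order:
# (pixel, depth, object). Red spans 1..5, blue 3..7, green 8..9.
fragments = [
    ((0, 0), 5.0, "red"), ((0, 0), 1.0, "red"),
    ((0, 0), 7.0, "blue"), ((0, 0), 3.0, "blue"),
    ((0, 0), 9.0, "green"), ((0, 0), 8.0, "green"),
]
collisions = {pixel: overlapping_pairs(lst)
              for pixel, lst in build_depth_lists(fragments).items()}
```

With this data only red and blue overlap in depth at the pixel, so `collisions` maps pixel `(0, 0)` to the single pair `("red", "blue")`.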

SLIDE 33

Outline

  • 1. Motivation
  • 2. Collision Detection (CD)
  • 3. Render-Based CD in the GPU
  • 4. Results
  • 5. Conclusions
SLIDE 34

CD Integration into GPU pipeline

[Figure: "Graphics Pipeline from 10,000 feet". Blocks: Vertex processor (Geometry Stage), Rasterizer (Raster Stage), four Fragment processors, Color Buffer, Memory controller, Memory. Legend: Programmable vs. Fixed-Function. Data-flow labels: GPU Commands, Vertices, Triangles, Fragments, Image. New Hardware: the RBCD unit, which takes Collisionable Fragments as input and produces Collision Points.]

SLIDES 35-41

RBCD unit

  • Insertion Sort: 1) Stores fragments sorted by depth
  • Z-Overlap Test: 2) Detects collision points
  • ZEB buffer (on-chip array)
    – 1 list of fragments per pixel
    – 16x16 lists, 8 entries per list
    – 8 KB (16 x 16 lists x 8 entries x 32 bits); each list fills one 32 B line (List 1 / Line 1 to List 256 / Line 256)

[Figure, built up over slides 35-41: Surface Points (Fragments) enter the Insertion Sort block, which fills the ZEB buffer; the Z-Overlap Test block reads the ZEB buffer and outputs Collision Points.]
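The ZEB capacity can be checked with a short calculation. The sketch below reads the last factor of the size formula as 32 bits per entry. That is an assumption, but it is the reading that agrees with both the 32 B line per list and the 8 KB total. The constant names are illustrative.

```python
LISTS_PER_ROW = 16        # 16x16 lists, one list per pixel
LISTS_PER_COLUMN = 16
ENTRIES_PER_LIST = 8
BITS_PER_ENTRY = 32       # assumption: 32-bit entries, so 8 entries fill a 32 B line

bytes_per_list = ENTRIES_PER_LIST * BITS_PER_ENTRY // 8   # one ZEB line
num_lists = LISTS_PER_ROW * LISTS_PER_COLUMN
zeb_bytes = num_lists * bytes_per_list
zeb_kib = zeb_bytes / 1024
```

Under that assumption each list occupies 32 bytes, there are 256 lists, and the buffer totals 8192 bytes (8 KB).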

SLIDES 42-45

Insertion Sort

  1) Read one list
  2) Compare
  3) Shift
  4) Write new list

[Figure, built up over slides 42-45: a list holding depths 2 and 6 is read from the ZEB into the List-Register. The New element (depth 3) is compared against every entry by a row of "<" Comparators, Muxes shift the entries, and the new list is written with 3 placed between 2 and 6.]
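The read, compare, shift, write sequence can be mimicked in software. The Python sketch below is a behavioural illustration under assumptions, not a description of the actual circuit: the function name is hypothetical, the list is kept in ascending depth order, and a full list simply drops its farthest entry (the slides do not say what happens on overflow).

```python
def insert_sorted(list_register, new_depth, capacity=8):
    """Insert new_depth into an ascending list, one compare per entry."""
    # Compare: one comparator per entry; in hardware all fire in parallel.
    greater = [entry > new_depth for entry in list_register]
    # The insertion slot is the count of entries that are not greater.
    slot = len(list_register) - sum(greater)
    # Shift: entries flagged by a comparator move one position back (muxes).
    new_list = list_register[:slot] + [new_depth] + list_register[slot:]
    # Assumption: on overflow the farthest entry is dropped.
    return new_list[:capacity]

# The example from the slides: list holds 2 and 6, new element is 3.
after_insert = insert_sorted([2, 6], 3)

# Feeding depths one fragment at a time, in the arrival order used in
# the IBCD example, keeps the list sorted at every step.
zeb_list = []
for depth in [1, 7, 5, 2, 6, 8, 3, 4]:
    zeb_list = insert_sorted(zeb_list, depth)
```

Because the stored list is always sorted, the comparator outputs form a run of "not greater" followed by a run of "greater", so a single count is enough to locate the slot.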

SLIDES 46-75

Z-Overlap Test

[Figure, built up step by step over slides 46-75: datapath and flowchart of the Z-Overlap Test. A list is read from the ZEB into the List-Register and processed one Current Element at a time against a Stack. A row of "=" comparators checks the current element against the stack entries and raises Hit0 to Hit3. A Collision Pair Generator emits the pairs, e.g. <red, blue>.]

Flowchart:

  • Is Front Face?
    – Yes: Push
    – No: Search Front-face
  • Is any fragment above it? (Fragments Above Front-Face?)
    – Yes: Collision!!! Generate Collision Pairs
    – No: no collision for this element
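The flowchart of the Z-Overlap Test can be read as a stack walk over one depth-sorted list. The Python sketch below is one plausible reading, offered under stated assumptions rather than as the unit's actual logic. It assumes that a back-face fragment is matched to its front face by object identifier, and that the matched front face is removed from the stack afterwards. The function name and the `(object_id, is_front_face)` tuples are hypothetical.

```python
def z_overlap_test(sorted_list):
    """Walk one depth-sorted list of (object_id, is_front_face) entries."""
    stack, pairs = [], []
    for obj, is_front in sorted_list:
        if is_front:
            stack.append(obj)                 # Is Front Face? Yes -> Push
            continue
        # Search Front-face: one "=" comparator per stack entry.
        hits = [entry == obj for entry in stack]
        if True not in hits:
            continue                          # assumption: skip unmatched back faces
        pos = hits.index(True)
        above = stack[pos + 1:]               # Is any fragment above it?
        # Generate Collision Pairs with every object pushed after this one.
        pairs.extend((obj, other) for other in above)
        del stack[pos]                        # assumption: retire the matched front face
    return pairs

# Red enters, blue enters before red exits: their depth-ranges overlap.
overlapping = z_overlap_test(
    [("red", True), ("blue", True), ("red", False), ("blue", False)])

# Red exits before blue enters: no overlap, no pair.
separate = z_overlap_test(
    [("red", True), ("red", False), ("blue", True), ("blue", False)])
```

The first list yields the pair `("red", "blue")`, matching the `<red, blue>` output shown on the slides, and the second yields no pairs. Note that in this sketch a depth-range fully nested inside another at a given pixel produces no pair, because nothing sits above the inner front face when its back face arrives.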

SLIDE 76

Outline

  • 1. Motivation
  • 2. Collision Detection (CD)
  • 3. Render-Based CD in the GPU
  • 4. Results
  • 5. Conclusions
SLIDE 77

Evaluation Methodology

  • TEAPOT simulation infrastructure
    – Android and OpenGL ES
    – GPU timing simulator models:
      • Tile-Based Rendering architecture (ARM Mali 400MP-like)
    – GPU power model based on McPAT
  • Extended with RBCD unit
  • Marss cycle-accurate full-system simulator
  • Workloads: 4 unmodified Android commercial games

Benchmark         Type               Downloads
Captain America   beat'em up         10-50 M
Crazy Snowboard   snowboard arcade   5-10 M
Sleepy Jack       action             100-500 K
Temple Run        adventure arcade   100-500 M

SLIDE 78

Performance and Energy of CD

RBCD vs CD on a CPU

  • 3400x speedup and 2875x energy reduction on average
    – RBCD reuses rendering results

[Figure: two bar charts, Speedup and Energy Reduction (y-axis 1000 to 6000), for cap, crazy, sleepy, temple and geo.mean]

SLIDE 79

GPU Overhead

[Figure: two bar charts, Normalized Energy and Normalized Time (y-axis 0.2 to 1), for cap, crazy, sleepy, temple and geo.mean]

  • Energy overhead: 3.5%
  • Time overhead: 3%
  • Area overhead: < 1%
SLIDE 80

GPU Overhead: Face Culling

  • Rasterize 18% more triangles, 6% more fragments

[Figure: bar chart of Normalized Triangles and Fragments for cap, crazy, sleepy, temple and geo.mean]

[Figure: two pipeline diagrams.
  Common Face Culling: Face Culling, Rasterization, Fragments, Fragment Processing.
  Deferred Face Culling: Face Culling, Rasterization, Fragments, Fragment Processing, plus a second Fragments stream feeding CD.]

SLIDE 81

Conclusions

  • Energy budget limits the quality and realism of CD in mobile graphics animation applications
  • Most of the computation required by Image-Based CD is already done in image rendering
  • RBCD provides
    – 2875x energy reduction
    – 3400x speedup
    – Pixel-level accuracy
    – Small overheads: time (3%), energy (3.5%) and area (< 1%)
  • RBCD is a low-energy yet high-fidelity CD solution