

slide-1
SLIDE 1

CanvoX: High-Resolution VR Painting for Large Volumetric Canvas

Yeojin Kim1, Byungmoon Kim2, Jiyang Kim1 and Young J. Kim1

1Ewha Womans University, 2Adobe Research

S7698:

http://graphics.ewha.ac.kr/canvox/

slide-2
SLIDE 2
slide-3
SLIDE 3

Vector or Pixel?

Vector: Tilt Brush, Quill, …

Pixel: Our System

slide-4
SLIDE 4

Voxels?

  • Voxel = Volume + Pixel
  • Easy to manipulate and traverse
  • Allows recoloring, erasing strokes, and mixing colors
  • Allows semi-transparent strokes
slide-5
SLIDE 5

2D Painting Represents Large Space

2D Painting In Canvas: 1142 × 1600

slide-6
SLIDE 6

1000 × 1000 × 1000 = 1G Voxel

A 3D voxel canvas will be very limited

slide-7
SLIDE 7

Deep Octree For Large Canvas With High Details

40 km

$2^{26} \times 2^{26} \times 2^{26} = 302,231,454,903,657,293,676,544$

$(0.3\,\mathrm{mm})^3$
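The voxel counts quoted on these two slides can be checked with a few lines of arithmetic. This is only a sketch: the 4-bytes-per-voxel figure (one RGBA8 color) and the helper name `dense_bytes` are assumptions for illustration, not numbers from the talk.

```python
# Dense voxel counts for the two canvas sizes mentioned on the slides.
BYTES_PER_VOXEL = 4  # assumption: one RGBA8 color per voxel

dense_1k = 1000 ** 3                 # the "1G Voxel" canvas
deep_octree_leaves = (2 ** 26) ** 3  # finest-level cells of a depth-26 octree


def dense_bytes(voxels, bytes_per_voxel=BYTES_PER_VOXEL):
    """Memory needed if every voxel were stored explicitly."""
    return voxels * bytes_per_voxel


print(f"{dense_1k:,} voxels")
print(f"{deep_octree_leaves:,} finest-level cells")
print(dense_bytes(dense_1k) / 1e9, "GB for the dense 1000^3 grid")
```

A dense grid at the finest level is clearly out of reach, which is why only the painted regions are refined.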

slide-8
SLIDE 8
slide-9
SLIDE 9
slide-10
SLIDE 10

Challenges

  • Painting in Large Canvas with High Detail
  • Deep Level Octree
  • Dynamic Tree on GPU
  • Expensive Refinement and Coarsening Cost
  • 10,000 to 100,000 nodes are generated or deleted per stroke
  • Volume Rendering in VR Environment
  • Real-time Ray Casting
  • e.g. HTC Vive: resolution 1680 × 1512 × 2, 90 fps~

  • Accumulated error along the ray
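To make the real-time constraint concrete, the resolution and frame rate quoted above translate into a ray budget as follows. This is a back-of-the-envelope sketch that assumes one primary ray per pixel; the variable names are illustrative.

```python
# Ray budget implied by the quoted HMD numbers (one ray per pixel assumed).
WIDTH, HEIGHT, EYES, FPS = 1680, 1512, 2, 90

rays_per_frame = WIDTH * HEIGHT * EYES
rays_per_second = rays_per_frame * FPS
frame_budget_ms = 1000.0 / FPS

print(f"{rays_per_frame:,} rays per frame")
print(f"{rays_per_second:,} rays per second")
print(f"{frame_budget_ms:.2f} ms per frame")
```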
slide-11
SLIDE 11

Dynamic Octree Structure

slide-12
SLIDE 12

Octree data must be kept on the GPU side

GPU-Only Octree vs. CPU & GPU Octree

GPU-Only Octree

  • No CPU-GPU transfer
  • Scatter limited in GLSL
  • Atomic for allocation
  • Balancing is not trivial

CPU & GPU Octree

  • Needs CPU-GPU transfer
  • CPU: memory management; GPU: data for rendering

slide-13
SLIDE 13

[Diagram: tree cells; CPU-side octree, update, GPU-side octree]

[Kim15] Byungmoon Kim, Panagiotis Tsiotras, Jeong-Mo Hong, and Oh-young Song. Interpolation and parallel adjustment of center-sampled trees with new balancing constraints.

  • Strong 2-to-1 Balanced Tree [Kim15]
  • Simple Primal-only Tree
  • 1D array fields
  • Maximum Depth Level : 26
  • Physical Unit: $(0.3\,\mathrm{mm})^3$ ~ $(40\,\mathrm{km})^3$

Octree Outline
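The "1D array fields" idea can be sketched as a structure-of-arrays octree in which a cell is just an index into parallel arrays. This is a minimal illustration, not the Grizzly implementation: the field names (`parent`, `first_child`, `depth`, `rgba`) are hypothetical, children are assumed to be allocated as 8 contiguous entries, and coarsening and the 2-to-1 balance constraint are omitted.

```python
class ArrayOctree:
    """Octree stored as parallel 1D arrays; a cell is an integer index."""

    NO_ID = -1
    MAX_DEPTH = 26  # maximum depth level quoted on the slide

    def __init__(self):
        # Index 0 is the root cell.
        self.parent = [self.NO_ID]
        self.first_child = [self.NO_ID]
        self.depth = [0]
        self.rgba = [(0, 0, 0, 0)]

    def is_leaf(self, cell):
        return self.first_child[cell] == self.NO_ID

    def refine(self, cell):
        """Split a leaf into 8 children and return the first child's index."""
        if not self.is_leaf(cell):
            return self.first_child[cell]
        if self.depth[cell] >= self.MAX_DEPTH:
            raise ValueError("cell is already at the maximum depth")
        first = len(self.parent)
        self.first_child[cell] = first
        for _ in range(8):
            self.parent.append(cell)
            self.first_child.append(self.NO_ID)
            self.depth.append(self.depth[cell] + 1)
            self.rgba.append(self.rgba[cell])  # children inherit the color
        return first


tree = ArrayOctree()
first = tree.refine(0)         # root gets children 1..8
grandchild = tree.refine(first)  # child 1 gets children 9..16
```

Because every field is a flat array, a whole field can be copied into a texture without pointer fix-ups, which is what makes this layout convenient for a GPU-side copy.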

slide-14
SLIDE 14

[Diagram: tree cells; CPU-side octree, update, GPU-side octree]

  • GPU Side Octree
  • 1D Array Fields(CPU) → 2D Texture(GPU)
  • Size of texture : 64M
  • Allow Both Up & Down Traversal

GPU textures:

  • IDs & Flag: 32-bit × 3 INT
  • RGBA: 8-bit × 4 UBYTE
  • Interpolation Table Index: 16-bit × 4 INT
  • Interpolation Weight Table: 16-bit × 2 FLOAT

Octree Outline
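Mapping a 1D array field onto a 2D texture amounts to a row-major index conversion. The sketch below assumes that "64M" means 64 × 2^20 texels laid out as a square 8192 × 8192 texture; that layout and the function names are assumptions, not details given in the talk.

```python
TEX_WIDTH = 8192   # assumption: square texture, 8192 * 8192 = 64M texels
TEX_HEIGHT = 8192


def cell_to_texel(cell_index):
    """Row-major mapping from a 1D cell index to (x, y) texel coordinates."""
    return cell_index % TEX_WIDTH, cell_index // TEX_WIDTH


def texel_to_cell(x, y):
    """Inverse mapping from texel coordinates back to the 1D cell index."""
    return y * TEX_WIDTH + x


x, y = cell_to_texel(100_000)
print(x, y, texel_to_cell(x, y))
```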

slide-15
SLIDE 15
slide-16
SLIDE 16

Tree Synchronization: CPU – GPU Transfer

Synchronize? (Refine, Coarsen)

  • Drawing a stroke causes local changes in space

Block: M × N Texels

slide-17
SLIDE 17

One-level Refinement and Coarsening

0: outside, 1: boundary, 2: inside

[Figure: cell labels at Frame $t_0$]

slide-18
SLIDE 18

One-level Refinement and Coarsening

0: outside, 1: boundary, 2: inside

[Figure: cell labels at Frame $t_1$]

slide-19
SLIDE 19

One-level Refinement and Coarsening

Frame $t_n$, $n \le$ # Max Depth
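The one-level strategy on these slides can be illustrated with a small 2D analogue: each frame, every leaf is labeled outside (0), boundary (1), or inside (2) relative to the brush, and only boundary leaves are split by a single level. This is a simplified sketch under stated assumptions: it uses a quadtree and a circular brush instead of an octree and a real stroke, it does no coarsening, and it ignores the 2-to-1 balance constraint. All names are hypothetical.

```python
from dataclasses import dataclass

OUTSIDE, BOUNDARY, INSIDE = 0, 1, 2


@dataclass(frozen=True)
class Cell:
    x: float      # lower-left corner
    y: float
    size: float
    depth: int


def classify(cell, cx, cy, r):
    """Label a square cell against a circular brush centered at (cx, cy)."""
    # Closest point of the square to the brush center.
    nx = min(max(cx, cell.x), cell.x + cell.size)
    ny = min(max(cy, cell.y), cell.y + cell.size)
    if (nx - cx) ** 2 + (ny - cy) ** 2 > r * r:
        return OUTSIDE
    # Farthest corner of the square from the brush center.
    fx = max(abs(cx - cell.x), abs(cx - (cell.x + cell.size)))
    fy = max(abs(cy - cell.y), abs(cy - (cell.y + cell.size)))
    if fx * fx + fy * fy <= r * r:
        return INSIDE
    return BOUNDARY


def refine_one_level(leaves, cx, cy, r, max_depth):
    """One frame of work: split each boundary leaf exactly once."""
    out = []
    for c in leaves:
        if c.depth < max_depth and classify(c, cx, cy, r) == BOUNDARY:
            h = c.size / 2
            out.extend(Cell(c.x + i * h, c.y + j * h, h, c.depth + 1)
                       for j in (0, 1) for i in (0, 1))
        else:
            out.append(c)
    return out


def run_frames(frames, max_depth):
    """Apply the one-level step once per frame, starting from a single cell."""
    leaves = [Cell(0.0, 0.0, 1.0, 0)]
    for _ in range(frames):
        leaves = refine_one_level(leaves, 0.5, 0.5, 0.3, max_depth)
    return leaves
```

Spreading the refinement over frames caps the per-frame cost at one level of splitting, and the finest level is reached after at most as many frames as the maximum depth.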

slide-20
SLIDE 20

Update Tree to GPU

0: outside, 1: boundary, 2: inside

[Figure: cell labels at Frame $t_0$, grouped into blocks with Block IDs 1 to 6]

slide-21
SLIDE 21

Update Tree to GPU

0: outside, 1: boundary, 2: inside

[Figure: cell labels at Frame $t_1$, grouped into blocks with Block IDs 1 to 6]

slide-22
SLIDE 22

Update Tree on GPU

Update Block Ordered Set

Block IDs at Frame $t_n$: 0 1 2 7 3 11

Main Thread: update one block in every frame

[Figure: cell labels at Frame $t_0$ and Frame $t_1$, with the cells to be updated highlighted]

Push block ID: 0, push block ID: 1, pop block ID
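The push and pop behavior described on this slide can be sketched as a small insertion-ordered set of dirty block IDs. This is an assumption-laden illustration: the class and function names are hypothetical, insertion order is assumed (the talk does not say whether the set is sorted), and `texels_per_block` stands in for the M × N block size.

```python
class UpdateBlockSet:
    """Insertion-ordered set of block IDs waiting to be uploaded to the GPU."""

    def __init__(self):
        self._blocks = {}  # dict keys keep insertion order and stay unique

    def push(self, block_id):
        """Mark a block dirty; pushing the same block twice has no effect."""
        self._blocks.setdefault(block_id, None)

    def pop(self):
        """Return the oldest dirty block, or None when nothing is pending."""
        if not self._blocks:
            return None
        block_id = next(iter(self._blocks))
        del self._blocks[block_id]
        return block_id

    def __len__(self):
        return len(self._blocks)


def block_id_of(cell_index, texels_per_block):
    """Block that holds a given cell when cells are packed block by block."""
    return cell_index // texels_per_block


pending = UpdateBlockSet()
for cell in (3, 70, 5, 130):          # cells touched by the paint thread
    pending.push(block_id_of(cell, 64))
uploaded = [pending.pop() for _ in range(4)]  # one pop per frame
```

Deduplicating at push time means a block that is modified many times between frames is still uploaded only once.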

slide-23
SLIDE 23
slide-24
SLIDE 24

Real-time Drawing

slide-25
SLIDE 25

Volume Rendering

slide-26
SLIDE 26

Volume Rendering?

  • Triangle Mesh Generation
  • Performance, Accuracy, Transparency Problem
  • Slicing
  • We tested octree texture interpolation : slow
  • Splatting
  • Splat should be bigger than cell : loss of resolution
  • Performance
  • Ray Casting
slide-27
SLIDE 27

Ray Casting with Large Canvas

Problem 1

Tree traversal from root to leaf at every sample point

Problem 2

Useless sample points in empty space

Problem 3

Error increases along the ray

Problem 4

Rendering can be slow

→ Traverse up when cell is empty

slide-28
SLIDE 28

From Root to Leaf?

[Figure: visiting the cell at P0, then visiting the cell at P1]

  • (6~24 neighbors) x (# cells) = ….?
slide-29
SLIDE 29

From Root to Leaf?

[Figure labels: cell C with neighbors N1, N2, N3, N4]

  • Thanks to 2-to-1 balance tree,
  • A Cell always has 6 Neighbors
  • 3 Neighbors share the parent

( = Their ID can be computed using offset)

  • 3 Neighbors have different parent

→ If we precompute only 3 neighbors, we can move to the next neighbor directly

[Figure legend: given cell; neighbors that share the parent; neighbors that have a different parent]
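The split between neighbors that share the parent and neighbors that do not can be sketched as follows. The code assumes that the 8 children of a cell are stored contiguously and that a child's position within its parent is encoded as three bits (x, y, z) of its offset; the talk only states that sibling IDs "can be computed using offset", so this encoding, the function names, and the `outer_table` lookup are hypothetical.

```python
def sibling_neighbors(first_child, k):
    """IDs of the three face neighbors that share the parent.

    first_child: ID of the parent's first child (children are contiguous).
    k: offset 0..7 of the given cell; bit `axis` is its position on that axis.
    """
    return [first_child + (k ^ (1 << axis)) for axis in (0, 1, 2)]


def neighbor(cell, first_child, axis, positive, outer_table):
    """Face neighbor of `cell` along `axis`, in the + or - direction.

    Stepping toward the middle of the parent lands on a sibling, found by
    flipping one bit. Stepping away leaves the parent, so the ID is read from
    a precomputed table holding one entry per axis.
    """
    k = cell - first_child
    bit = (k >> axis) & 1
    if (bit == 0) == positive:
        return first_child + (k ^ (1 << axis))
    return outer_table[cell][axis]


# Hypothetical data: cell 13 is child 5 (binary 101) of a parent whose
# children start at ID 8; its precomputed outer neighbors are 100, 101, 102.
outer = {13: [100, 101, 102]}
print(sibling_neighbors(8, 5))
print(neighbor(13, 8, axis=0, positive=False, outer_table=outer))
print(neighbor(13, 8, axis=0, positive=True, outer_table=outer))
```

Only the three outer neighbors need storage per cell, which is the saving the slide points at.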

slide-30
SLIDE 30

Foveal Region

slide-31
SLIDE 31

QuadTree Render Target

$\frac{W}{4} \times \frac{H}{4}$, $\frac{W}{2} \times \frac{H}{2}$, $W \times H$
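A quadtree render target lets the ray caster spend full resolution only where it matters. The sketch below picks one of three levels (full, half, quarter resolution) per screen tile from its distance to the gaze point and counts the rays that result. The distance thresholds `r_full` and `r_half`, the 4-pixel tile size, and the function names are assumptions for illustration; the talk's own selection rule is not reproduced here.

```python
import math


def level_for_tile(tx, ty, tile, gaze, r_full, r_half):
    """0 = full, 1 = half, 2 = quarter resolution for the tile at (tx, ty)."""
    cx, cy = tx + tile / 2, ty + tile / 2
    d = math.hypot(cx - gaze[0], cy - gaze[1])
    if d <= r_full:
        return 0
    if d <= r_half:
        return 1
    return 2


def ray_count(width, height, gaze, r_full, r_half, tile=4):
    """Total rays cast when each tile is rendered at its chosen level."""
    rays = 0
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            level = level_for_tile(tx, ty, tile, gaze, r_full, r_half)
            rays += (tile >> level) ** 2  # 16, 4, or 1 rays per 4x4 tile
    return rays


full = ray_count(64, 32, gaze=(32, 16), r_full=1e9, r_half=1e9)
foveated = ray_count(64, 32, gaze=(32, 16), r_full=8, r_half=16)
print(full, foveated)
```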

slide-32
SLIDE 32
slide-33
SLIDE 33

[System diagram, reconstructed from labels]

Input: HMD, Controller / Haptic Device

  • View Position
  • Brush Position

CPU, Paint Thread

  • Color and mark cell
  • One-level refine/coarsen marked cell

CPU, Main Thread

  • Initialize octree
  • Update view matrix & controller pos.
  • Store stroke data
  • Update texture block to GPU octree
  • Render

GPU

  • Update 3-neighbor texture
  • Render scene (ray casting)
  • Render heat map
  • Update scene quadtree
  • Interpolate scene quadtree

Data

  • Stroke Data
  • Octree
  • Fields shown: Child ID, Parent ID, flag, RGBA, Temp0, Segment ID, Stroke ID
  • Update Block Ordered Set (block ID)

CanvoX Model

slide-34
SLIDE 34

Implementation Detail

  • HMD: HTC Vive
  • CPU: Intel(R) Core(TM) i7-4790 CPU @ 3.60 GHz
  • RAM: 16 GB
  • GPU: NVIDIA GeForce 980Ti
  • OS: Windows 10 64-bit
  • Libraries: OpenGL 4.3, OpenVR, Grizzly [Kim15]

slide-35
SLIDE 35
slide-36
SLIDE 36

Summary

  • Dynamic and Simple Octree both on CPU and GPU
  • Shadow octree on GPU maintained by local changes
  • One-level refine/coarsen strategy
  • Real-time Ray Casting in Large Canvas
  • Fast tree traversal at samples using tree connectivity
  • Fast rendering using Quadtree-based Foveated Rendering
  • Minimize floating point error using local coordinates
slide-37
SLIDE 37

Future Work

  • Performance Optimization
  • Adaptive 3D Interpolation
  • GPU-only Octree
  • Isosurface Rendering
  • More artistic tools
slide-38
SLIDE 38

Low Precision Computation

slide-39
SLIDE 39

Numbers

slide-40
SLIDE 40

Numbers

  • New algebra with finite numbers someday?
  • This will be a breakthrough in math
  • Until then, we should live with floating points
  • Approximation to real field
  • (a+b)+c ≈ a+(b+c)
  • Extension to real field
  • Inf, NaN
  • We could probably establish an extension
  • Nonstandard Analysis
  • Finite Extended Ordered Field?

[Figure: 10 10 10 10 10 10 10 10 10 10]
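Both points about floating point, that it only approximates the real field and that it extends it with Inf and NaN, can be seen directly. The snippet uses Python's double-precision floats purely as an illustration.

```python
import math

# Approximation to the real field: addition is not associative.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right, left, right)

# Extension of the real field: overflow gives Inf, and Inf - Inf gives NaN.
inf = 1e308 * 10
nan = inf - inf
print(inf, nan, math.isinf(inf), math.isnan(nan))
```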

slide-41
SLIDE 41

Decreasing Precision

  • x86 CPUs (x87 FPU) had 80-bit registers to hold intermediate results in extended precision
  • Do you use long double?
  • Single precision
  • SSE, AVX, …
  • 24bit float is found in GPU
  • half-float is common in GPU
  • Mobile GPUs and highest end GP100
  • 8bit representation is also common in GPU
  • UNORM/SNORM
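The precision of these formats can be compared by rounding a value through each of them. This sketch uses Python's `struct` module, whose `e` and `f` format codes are IEEE half and single precision; the UNORM8 helpers assume the usual mapping of [0, 1] onto the integers 0 to 255, and all function names are illustrative.

```python
import struct

EPS_SINGLE = 2.0 ** -23  # about 1.19e-7
EPS_HALF = 2.0 ** -10    # about 1e-3


def round_to(fmt, x):
    """Round a Python float to the precision of a struct format code."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def to_unorm8(x):
    """Encode a value in [0, 1] as an unsigned normalized 8-bit integer."""
    return round(min(max(x, 0.0), 1.0) * 255)


def from_unorm8(b):
    """Decode an unsigned normalized 8-bit integer back to [0, 1]."""
    return b / 255.0


value = 0.1
print("single:", round_to("<f", value))
print("half:  ", round_to("<e", value))
print("unorm8:", from_unorm8(to_unorm8(value)))
```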
slide-42
SLIDE 42

Knowing Numerical Error

  • Given $x$, its floating-point representation has an error proportional to $x$
  • $\mathrm{fl}(x) = x(1+e)$, $|e| \le 1.19 \times 10^{-7}$
  • Numerical error:
  • $\mathrm{fl}(x+y) = (x+y)(1+e_1)$, $|e_1| \le 1.19 \times 10^{-7}$
  • $\mathrm{fl}(\mathrm{fl}(x+y)+z) = \big((x+y)(1+e_1)+z\big)(1+e_2) = (x+y)(1+e_1+e_2+e_1e_2) + z(1+e_2) \approx (x+y+z)(1+2e_3)$, $|e_3| \le 1.19 \times 10^{-7}$
  • $\mathrm{fl}\big(a / \mathrm{fl}(\mathrm{fl}(x+y)+z)\big) \approx \dfrac{a}{(x+y+z)(1+2e_3)}\,(1+e_4) \approx \dfrac{a}{(x+y+z)(1+3e_5)}$, $|e_5| \le 1.19 \times 10^{-7}$
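The bounds above can be checked numerically by emulating single-precision rounding after every operation and comparing against exact rational arithmetic. This is only a spot check for one set of positive inputs, not a proof; `fl` and `rel_err` are helper names introduced for the example.

```python
import struct
from fractions import Fraction

EPS = 2.0 ** -23  # single-precision epsilon, about 1.19e-7


def fl(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def rel_err(approx, exact):
    """Exact relative error of a float against a rational reference."""
    return abs(Fraction(approx) - exact) / abs(exact)


# Inputs are rounded to single precision first, so they are exact operands.
x, y, z, a = fl(0.1), fl(0.2), fl(0.3), fl(1.0)

s = fl(fl(x + y) + z)                      # two rounded additions
exact_s = Fraction(x) + Fraction(y) + Fraction(z)

q = fl(a / s)                              # one more rounded division
exact_q = Fraction(a) / exact_s

print(float(rel_err(s, exact_s)) / EPS, "eps after two additions")
print(float(rel_err(q, exact_q)) / EPS, "eps after the division")
```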

slide-43
SLIDE 43
slide-44
SLIDE 44

Numerical Error In Volume Ray Casting

  • Cells are (much) smaller than floating point precision
  • We can use cell local coordinates, but we have to cross many cells
  • How much error do we accumulate along the ray?
  • If the error is large, we may have entered the wrong cell. This is hard to correct.
  • We should advance the ray such that the numerical error does not accumulate.
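Why accumulation matters can be seen by marching along a ray in single precision in two ways: adding the step to a running position at every sample, or recomputing the position from the sample index each time. This is a generic illustration of error accumulation, not the scheme used in CanvoX; the step size, sample count, and function names are arbitrary choices for the example.

```python
import struct


def fl32(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def march_accumulate(step, n):
    """Advance by adding the step n times; every addition rounds."""
    t = 0.0
    for _ in range(n):
        t = fl32(t + step)
    return t


def march_direct(step, n):
    """Compute the n-th sample position with a single rounded multiply."""
    return fl32(n * step)


STEP = fl32(0.1)
N = 10_000
exact = N * STEP  # exact in double precision for these operands

err_accumulate = abs(march_accumulate(STEP, N) - exact)
err_direct = abs(march_direct(STEP, N) - exact)
print(err_accumulate, err_direct)
```

The accumulated position drifts with the number of samples, while the directly computed one stays within a single rounding error.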
slide-45
SLIDE 45

Numerical Error In Ray Casting

[Figure labels: $p_{\text{left-eye}}$, $p_{\text{right-eye}}$, $p_{\text{eye-center}}$, $p_1$, $p_i$, $\alpha$, $t$]

slide-46
SLIDE 46

Error Does Not Increase By The Ray Length L

slide-47
SLIDE 47
slide-48
SLIDE 48

Single Precision: $\varepsilon = 1.19 \times 10^{-7}$

Half Precision: $\varepsilon = 10^{-3}$

slide-49
SLIDE 49

Conclusion

  • We have chosen a scheme with a small enough error
slide-50
SLIDE 50

Thank you

Project Webpage: http://graphics.ewha.ac.kr/canvox/

  • Yeojin Kim, yeojinkim@ewhain.net
  • Byungmoon Kim, bmkim@adobe.com
  • Jiyang Kim, soarmin11@ewhain.net
  • Young J. Kim, kimy@ewha.ac.kr

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. 2017R1A2B3012701)

slide-51
SLIDE 51

Interpolation: Node-Sampled vs Center-Sampled

Node = Cell Corner vs. Cell Center

  • Parent-Child: Complex (node) vs. Simple (center)
  • Covered Area: Complex (node) vs. Simple (center)
  • Book-keeping: Face / Edge Shared Duplicate Samples (node) vs. Primal Tree Only (center)
  • Parallel Adjustment: Unknown (node) vs. 2014 / Grizzly (center)
  • Interpolation: Easy (node) vs. Hard (center)

slide-52
SLIDE 52

Interpolation: Node-Sampled vs Center-Sampled

Glift (2006) vs. Grizzly (2014)

  • Constrained: Uniform (Glift) vs. 2-to-1 balanced (Grizzly)
  • Interpolation technique: Texture Unit (Glift) vs. Samples with stencil and table (Grizzly)

slide-53
SLIDE 53

Non-Uniform, Scale Variant Resolution