Visual Media Processing II 2020, No.2: Imaging Pipeline. Instructors: Yasuhiro Mukaigawa, Takuya Funatomi, Kenichiro Tanaka.


SLIDE 1

No.2 Imaging Pipeline

Instructors: Yasuhiro Mukaigawa, Takuya Funatomi, Kenichiro Tanaka

2020 Visual Media Processing II Slide credit: Ioannis Gkioulekas, Tomokazu Sato Today’s mini-report: http://bit.ly/vmp2-2020-2

SLIDE 2

Visual Media Processing II (4098), 2020 Fall

Today’s mini-report and Changes

  • Today’s mini-report

http://bit.ly/vmp2-2020-2

  • I’ve put the questions on the website.

http://omilab.naist.jp/class/VMP2/2020/ The Google Form also contains the full questions.

  • We strongly recommend completing the mini-report during class.

SLIDE 3

Photometric analysis pipeline

(pipeline: Optics (#1) → Sensor and in-camera processing (Today) → Image processing / image analysis (#3–#8))

SLIDE 4

Digital Image Consists of Pixels


SLIDE 5

How many pixels?

  • iPhone 11: 12 MP (megapixels), approx. 4000 × 3000
  • A virtual 45-gigapixel camera:

http://360gigapixels.com/tokyo-tower-panorama-photo/

SLIDE 6

Each pixel has RGB values

  • Red, Green, Blue

(example pixel, image from ja.wikipedia.org: R: 189, G: 207, B: 207)

SLIDE 7

Bit depth of image

  • 8-bit (or 24-bit) images
  • Each color channel value ranges from 0 to 255.
  • Typical images (e.g., JPEG) have 8-bit depth.
  • Sufficient for normal use, insufficient for editing.
  • 16-bit images
  • Each channel ranges from 0 to 65535.
  • Contains much richer information.
  • Typical RAW data have 16-bit depth.
  • Higher bit depths / float values
  • For internal or scientific purposes.
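The "insufficient for editing" point can be made concrete with a small numerical sketch. The darken-then-brighten edit and the 1000-sample grey ramp below are invented for illustration, not from the slides:

```python
import numpy as np

# Sketch: why 8 bits is "insufficient for editing". We quantize an
# ideal grey ramp, darken it 4x, store it, brighten it back, and count
# how many distinct tones survive at each bit depth.
def roundtrip(x, levels):
    q = np.round(x * (levels - 1)) / (levels - 1)             # quantize
    dark = np.round(q / 4 * (levels - 1)) / (levels - 1)      # darken 4x, store
    return np.round(dark * 4 * (levels - 1)) / (levels - 1)   # brighten back

ramp = np.linspace(0.0, 1.0, 1000)
unique_8 = len(np.unique(roundtrip(ramp, 2 ** 8)))
unique_16 = len(np.unique(roundtrip(ramp, 2 ** 16)))
# 8-bit keeps only ~65 distinct tones (visible banding);
# 16-bit keeps roughly all ~1000 ramp samples.
```

This is exactly the posterization you see when pushing shadows in an 8-bit JPEG, and why RAW editing survives the same edit.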

(image: an 18-bit camera)

SLIDE 8

Format of digital image

  • RAW image
  • The camera’s sensor data.
  • 12 MP × 3 channels × 2 bytes = 72 MB
  • Compressed image
  • Lossy: JPEG, GIF, HEIC, etc.
  • Values are grouped and approximated.
  • Lossless: PNG, TIF, etc.
  • Preserves exactly the same values. Some formats support 16 bit.
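The slide's storage arithmetic, spelled out. This is a sketch of the slide's own calculation; real RAW files often store one Bayer sample per pixel plus metadata, so actual sizes vary:

```python
# Uncompressed size of a 12 MP, 16-bit, 3-channel image (slide 8's numbers).
megapixels = 12_000_000
channels = 3
bytes_per_sample = 2                         # 16-bit depth
raw_bytes = megapixels * channels * bytes_per_sample
print(raw_bytes / 1_000_000)                 # 72.0 MB, as on the slide

# Rough sizes at the compression ratios shown in the figure:
for ratio in (10, 100, 1000):
    print(f"{ratio}x compression: {raw_bytes / ratio / 1_000_000:.2f} MB")
```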

(figure: the original image vs. 10×, 100×, and 1000× compression; lossy vs. lossless)

SLIDE 9

Mini-report

  • Q1. Check whether each explanation is true.

A) For consumer purposes, lossy compression is widely used.
B) For photometric analysis, lossless formats are preferred.
C) The storage sizes of lossless and RAW files are the same.
D) A digital image has continuous values.
E) A digital image is an array of integers spanning width, height, and channel.
F) If you plan to edit a photo, the camera should save an 8-bit image rather than 16-bit.


http://omilab.naist.jp/class/VMP2/2020/

SLIDE 10

Sensor

  • An array of buckets.
  • The buckets collect photons.
  • A bare sensor does not distinguish color.

(images: a Canon 6D sensor; close-up view of the photon buckets; photons falling into an array of photon buckets, converted to intensities such as 101, 122, 80)

SLIDE 11

Pixel design

  • microlens (also called a lenslet): helps the photodiode collect more light
  • color filter
  • photodiode: made of silicon; converts photons into electrons
  • potential well: stores the emitted electrons
  • silicon layer for read-out circuitry, etc.

SLIDE 12

How to capture image?

  • Let’s say we have a sensor and an object to be photographed.

(diagram: a digital sensor and a real-world object)
SLIDE 13

How to capture image?

  • A lens maps bundles of rays from a scene point to a sensor pixel.
to the sensor pixel.

(diagram: a lens focusing rays from a real-world object onto a digital sensor)

SLIDE 14

Mini-report

  • Q2. Check whether each explanation is true.

A) A sensor has a bunch of photodiodes that collect photons.
B) A sensor distinguishes spectral differences of light.
C) A sensor distinguishes angular differences of rays.
D) A camera lens maps a scene point to a sensor pixel.
E) A scene point contributes to all pixels if there is no lens.


http://omilab.naist.jp/class/VMP2/2020/

SLIDE 15

Focal length

  • The distance at which parallel rays intersect.
  • Comparison at fixed 𝑏.
  • The focal length determines the field of view.

15

1 𝑔 = 1 𝑏 + 1 𝑐 𝑔 𝑏 𝑐 sensor sensor sensor Short focal length Moderate focal length Long focal length Short focal length Moderate focal length Long focal length 𝑔 𝑔 𝑔 焦点距離 視野角 𝑏 Wide angle telephoto
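The lens formula can be solved for the sensor distance. A minimal sketch, taking g as the focal length, b as the lens-to-sensor distance, and c as the lens-to-object distance; the 50 mm / 2 m example numbers are invented for illustration:

```python
# Solve 1/g = 1/b + 1/c for b (all distances in the same unit).
def sensor_distance(g, c):
    """Lens-to-sensor distance b for focal length g and object distance c."""
    return 1.0 / (1.0 / g - 1.0 / c)

# A 50 mm lens focused on an object 2 m (2000 mm) away:
b = sensor_distance(50.0, 2000.0)      # slightly more than 50 mm

# For a very distant object, b approaches the focal length g:
b_far = sensor_distance(50.0, 1e9)
```

This is why focusing closer racks the lens away from the sensor, and why "focused at infinity" puts the sensor exactly one focal length behind the lens.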

SLIDE 16

Focal length and compression effect

photo-studio9.com/compression-effect/, psy.ritsumei.ac.jp/

(table: focal length small / moderate / large ↔ field of view wide / normal / narrow; with a large focal length the perspective effect is lost: the “compression effect”, where far and near objects appear similar in size)

SLIDE 17

Perspective distortion

  • The appearance differs with focal length.

(photos: the same object at a short vs. a long focal length)

SLIDE 18

Focus and Depth of Field

  • Focus is not focal length.
  • An effect of two (or more) lenses.
  • Depth of field (DOF): how deep the in-focus region is.

(figures: shallow DOF vs. wide DOF; focusing; the in-focus region and the depth of field)

SLIDE 19

Mini-report

  • Q3. Check whether each explanation is true.

A) A far object in an image is blurred because the focal length was small.
B) A far object in an image becomes small if the focal length is small.
C) A wide-angle lens has a small focal length.
D) A wide-DOF lens has a small focal length.
E) The focal length changes the appearance of the object.


http://omilab.naist.jp/class/VMP2/2020/

SLIDE 20

Over/under exposure

In shadows we are limited by noise; in highlights we are limited by clipping.

SLIDE 21

It’s very hard to capture all range at once

  • A digital image (or sensor) uses a linear scale.
  • The world’s brightness spans a log scale.

(brightness scale: 1; 1,500; 25,000; 400,000; 2,000,000,000)

SLIDE 22

The world has a high dynamic range

(figure: luminance from 10⁻⁶ to 10⁶, the adaptation range of our eyes, covering common real-world scenes; this span is the dynamic range)

SLIDE 23

Digital sensors have a low dynamic range

(figure: the sensor covers a much narrower slice of the 10⁻⁶ to 10⁶ range than the adaptation range of our eyes spanning common real-world scenes)

SLIDE 24

(Digital) images have an even lower dynamic range

(figure: at low exposure, the image covers an even narrower slice of the 10⁻⁶ to 10⁶ range)

SLIDE 25

(Digital) images have an even lower dynamic range

(figure: at high exposure, the image covers a different narrow slice of the 10⁻⁶ to 10⁶ range)

SLIDES 26–29 (image-only slides)
SLIDE 30

High Dynamic Range (HDR) Fusion

  • Restore log-scale luminance from multiple LDR images and map it back to LDR values.

(diagram: ordinary (LDR) images → HDR data → log-scale luminance → fused image; HDR mode in the iPhone camera)
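The fusion step can be sketched for linear (RAW-like) images with known exposure times. The hat-shaped weighting and the toy two-exposure scene below are common-practice assumptions for illustration, not taken from the slides:

```python
import numpy as np

# Minimal HDR merge: each exposure votes for log(radiance) = log(pixel/t),
# weighted so that well-exposed mid-tones count most.
def merge_hdr(images, exposure_times):
    """images: list of linear float arrays in [0, 1]; returns log radiance."""
    log_radiance = np.zeros_like(images[0])
    weight_sum = np.zeros_like(images[0])
    for img, t in zip(images, exposure_times):
        w = 1.0 - np.abs(2.0 * img - 1.0)    # trust mid-tones most
        w = np.maximum(w, 1e-4)              # keep clipped pixels from dividing by 0
        log_radiance += w * (np.log(np.maximum(img, 1e-6)) - np.log(t))
        weight_sum += w
    return log_radiance / weight_sum

# Toy scene: a linear sensor with gain 100, shot at 1/100 s and 1/10 s.
scene = np.array([0.02, 0.2, 0.9])                       # true radiance
imgs = [np.clip(scene * t * 100, 0, 1) for t in (0.01, 0.1)]
log_L = merge_hdr(imgs, [0.01, 0.1])
# The bright pixel saturates in the long exposure, but the short
# exposure still recovers its radiance.
```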

SLIDE 31

Mini-report

  • Q4. Check whether each explanation is true.

A) A digital sensor’s dynamic range is wider than that of human eyes.
B) A saturated region is black due to a lack of photons.
C) A sunny outdoor scene is approximately 10 times brighter than room light.
D) HDR mode is suitable for capturing very dark and very bright regions at the same time.


http://omilab.naist.jp/class/VMP2/2020/

SLIDE 32

Three parameters to control brightness

  • Manual Mode
SLIDE 33

Aperture (Iris)

  • Controls how many light rays are gathered.
  • A large aperture creates large bokeh.
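"How many light rays are gathered" can be quantified with the standard f-number relation from general photography (an assumption, not stated on the slide): gathered light scales with aperture area, i.e. with 1/N² for f-number N, so each full stop halves the light:

```python
# Relative light gathered at the standard full-stop f-numbers.
f_stops = [1.4, 2.0, 2.8, 4.0, 5.6, 8.0, 11.0, 16.0]
relative_light = [1.0 / (n * n) for n in f_stops]

# Consecutive stops differ by roughly a factor of 2
# (the marked f-numbers are rounded, so the ratios are approximate):
ratios = [a / b for a, b in zip(relative_light, relative_light[1:])]
```

This is why the slide-34 examples trade "1 s at f/14" against "1/30 s at f/2.2": closing the aperture must be paid for with a longer exposure.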

(diagram: with a small aperture, rays from a point land on a single pixel (in focus, dark); with a large aperture, rays reach neighboring pixels (out of focus, bright); sensor plane, lens, aperture (iris), bokeh)

SLIDE 34

Aperture and DOF

(photos: 1 s at f/14 = wide DOF; 1/30 s at f/2.2 = shallow DOF with nice bokeh)

SLIDE 35

Shutter (Exposure time)

  • How long the sensor accumulates photons.
  • Long exposures suffer from motion blur.

(photos from nikon-image.com: a fast shutter freezes fast motion; with a long exposure the splash is blurred; motion blur and camera shake)

SLIDE 36

ISO (Sensitivity, Amplification ratio)

  • How electron levels are mapped to the digital signal.
  • Noise is also amplified.

(figure: ×10 amplification; a high ISO is equivalent to a longer shutter; ISO sensitivity / amplification ratio)

SLIDE 37

Best parameters depend on scene

  • Which effect do you want?
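The three parameters trade off against each other at constant brightness. A sketch using the standard exposure-value relation (an assumption from general photography, not stated on the slides); the two example settings are invented:

```python
import math

# EV100 = log2(N^2 / t) - log2(ISO / 100) for f-number N, shutter time t.
# Settings with (roughly) equal EV give equally bright images but
# different motion blur, depth of field, and noise.
def exposure_value(f_number, shutter_s, iso):
    return math.log2(f_number ** 2 / shutter_s) - math.log2(iso / 100)

ev_a = exposure_value(2.8, 1 / 250, 100)   # fast shutter, large aperture
ev_b = exposure_value(8.0, 1 / 30, 100)    # slow shutter, small aperture
# ev_a ≈ ev_b: same brightness, different creative effect.
```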
SLIDE 38

Mini-report

  • Q5. Do the parameters explain the photo?

A) slow shutter, small aperture, low ISO
B) fast shutter, large aperture, low ISO
C) fast shutter, large aperture, high ISO
D) fast shutter, small aperture, high ISO

http://omilab.naist.jp/class/VMP2/2020/

SLIDE 39

Human perception of tones.

  • Perceived brightness is non-linear.
  • More sensitive to dark tones.
  • Approximately a gamma function.
  • 𝛾 = 2.2 is a good approximation.

perceived brightness = (true brightness)^(1/𝛾)

(figure: perceived luminance vs. true luminance; tone / gradation)
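The power law can be sketched directly. A minimal example with the pure γ = 2.2 curve; note that real sRGB uses a slightly different piecewise transfer function, so this is an approximation:

```python
import numpy as np

gamma = 2.2

def encode(linear):           # linear luminance -> perceptually spaced value
    return np.power(linear, 1.0 / gamma)

def decode(encoded):          # stored value -> linear luminance
    return np.power(encoded, gamma)

# 18% grey encodes to roughly mid-scale, which is why a linear image
# looks dark: most scene tones sit in the bottom of the linear range.
mid = encode(0.18)            # ~0.46
roundtrip = decode(encode(0.5))
```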

SLIDE 40

Tone reproduction

  • Also known as gamma correction.
  • Without tone reproduction, images look very dark.

(images: without gamma correction (RAW, linear image) vs. after gamma correction)

SLIDE 41

Camera sensitivity

  • The camera’s sensitivity is linear.
  • Why does the image look dark?

(diagram: the scene’s true luminance enters the camera, which has a linear input/output response, producing the captured RAW image; what we perceive is different)

SLIDE 42

Display’s tone response

  • The display’s response is the inverse of human perception.
  • The tone pipeline without tone reproduction (very dark).

(diagram: input/output response curves of human perception and of the display; pipeline: camera → display → human)

SLIDE 43

Tone reproduction

  • Fit to human perception.
  • Add an inverse tone mapping before image compression to cancel the display response.

(pipeline diagram: camera → RAW luminance → tone reproduction → saved image (e.g., JPEG) → display → human perception → perceived image)

Warning: our values are no longer linear relative to scene radiance!

SLIDE 44

JPEG vs. RAW (camera setting)

  • If you plan to manipulate pixel values, use RAW.
  • Apply JPEG conversion just before presentation.

                JPEG          RAW
Brightness      non-linear    linear
Dynamic range   low (8 bit)   high (16 bit)
Color           distorted     Bayer mosaic
Information     lost          lossless
File size       small         large
Appearance      natural       dark

Convert to JPEG only just before display.

SLIDE 45

Mini-report

  • Q6. Check whether each explanation is true.

A) Human perception is more sensitive in dark regions.
B) A digital image after gamma correction is no longer linear.
C) The best gamma depends on your display device.
D) Without gamma, the image looks dark.
E) A JPEG image is already tone-reproduced.
F) RAW is recommended for photometric analysis.


http://omilab.naist.jp/class/VMP2/2020/

SLIDE 46

Remember: Model for Color Observation

  • Multiplication of all spectral effects.

𝑊_𝑗 = ∫_{𝜆=380}^{780} 𝐹(𝜆) 𝑆(𝜆) 𝑇_𝑗(𝜆) d𝜆,   𝑗 ∈ {R, G, B}

where 𝐹(𝜆) is the incident spectrum, 𝑆(𝜆) the spectral reflectance, and 𝑇_𝑗(𝜆) the sensitivity of the color sensor.
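The model can be evaluated numerically. A sketch with synthetic spectra; the flat illuminant, the ramp reflectance, and the Gaussian filter sensitivities are assumptions for illustration, not real measurement data:

```python
import numpy as np

wavelengths = np.arange(380.0, 781.0, 5.0)   # nm, sampled every 5 nm
step = 5.0

F = np.ones_like(wavelengths)                # flat illuminant spectrum F(λ)
S = (wavelengths - 380.0) / 400.0            # reflectance S(λ): a reddish ramp

def filter_sensitivity(center, width=40.0):
    """Synthetic Gaussian color-filter sensitivity T_j(λ)."""
    return np.exp(-0.5 * ((wavelengths - center) / width) ** 2)

T = {"R": filter_sensitivity(600.0),
     "G": filter_sensitivity(540.0),
     "B": filter_sensitivity(450.0)}

# W_j = ∫ F(λ) S(λ) T_j(λ) dλ, approximated by a Riemann sum:
W = {j: float(np.sum(F * S * T[j]) * step) for j in "RGB"}
# The reddish surface responds most strongly in the R channel.
```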

SLIDE 47

Color filter arrays (CFA)

  • To measure color with a digital sensor, pixels are

covered by different color filters, each with its own spectral sensitivity function.

(diagram: each pixel has a microlens, a color filter, a photodiode, and a potential well)

SLIDE 48

What color filters to use?

  • Spectral sensitivity of the filters?
  • Spatial arrangement (mosaic) of the filters?

(figures: the Bayer mosaic; spectral sensitivities of Canon cameras) Why more green pixels? The filters generally do not match the human cone response.

SLIDE 49

Many different CFAs

Finding the “best” CFA mosaic is an active research area.

  • CYGM: Canon IXUS, PowerShot
  • RGBE: Sony Cyber-shot

How would you go about designing your own CFA? What criteria would you consider?
SLIDE 50

Many different spectral sensitivity functions

Each camera has its own more or less unique, and most of the time secret, spectral sensitivity.

  • This makes it very difficult to correctly reproduce the colors of sensor measurements.

(images of the same scene captured using 3 different cameras with identical sRGB settings)

SLIDE 51

What does RAW (Bayer) image look like?

(image showing mosaicking artifacts)

  • Kind of disappointing.
  • We call this the RAW image.
SLIDE 52

CFA demosaicing

Produce a full RGB image from the mosaiced sensor output. Interpolate from neighbors:

  • Bilinear interpolation (needs 4 neighbors).
  • Bicubic interpolation (needs more neighbors, may over-blur).
  • Edge-aware interpolation.

A large area of research.

SLIDE 53

Demosaicing by bilinear interpolation

Bilinear interpolation: simply average your 4 neighbors: G? = (G1 + G2 + G3 + G4) / 4. The neighborhood changes for the different channels.
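The green-channel case above can be sketched directly. A minimal example assuming an RGGB Bayer layout (green sampled where row + column is odd); edge pixels simply average whichever neighbors exist:

```python
import numpy as np

def interpolate_green(bayer):
    """bayer: 2-D array of raw samples, RGGB layout (green where the
    row+column parity is odd). Returns a full green channel."""
    h, w = bayer.shape
    green = bayer.astype(float).copy()
    for y in range(h):
        for x in range(w):
            if (y + x) % 2 == 0:          # green was not sampled here
                # Up/down/left/right neighbors all land on green sites.
                neighbors = [bayer[ny, nx]
                             for ny, nx in ((y - 1, x), (y + 1, x),
                                            (y, x - 1), (y, x + 1))
                             if 0 <= ny < h and 0 <= nx < w]
                green[y, x] = sum(neighbors) / len(neighbors)
    return green

# Sanity check: a uniform scene should survive interpolation exactly.
mosaic = np.full((4, 4), 80.0)
out = interpolate_green(mosaic)
```

The R and B channels work the same way but with different neighborhoods (diagonal or 2-pixel-spaced), which is exactly the "neighborhood changes for the different channels" remark above.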

SLIDE 54

Aside: can you think of other ways to capture color?

[Slide credit: Gordon Wetzstein]

SLIDE 55

Mini-report

  • Q7. Check whether each explanation is true.

A) A pixel of a mosaic-pattern sensor captures only a single color.
B) A color Full HD image’s dimensions are 1920 × 1080 × 3, while a RAW (Bayer) image’s dimensions are 1920 × 1080.
C) Even if the scene is exactly the same, different cameras record slightly different colors due to differences in spectral sensitivity.
D) Demosaicing always recovers the correct color.


http://omilab.naist.jp/class/VMP2/2020/

SLIDE 56

Remember: Color constancy

Human visual system has chromatic adaptation:

  • We can perceive white (and other colors) correctly under different light sources.

Retinal vs perceived color.

SLIDE 57

White?

  • Most types of light “contain” more than one wavelength.
  • We call our sensation of all of these distributions “white”.
  • How should the digital image represent this?

“White” lamps.

SLIDE 58

White balancing

Human visual system has chromatic adaptation:

  • We can perceive white (and other colors) correctly under different light sources.
  • Cameras cannot do that (there is no “camera perception”).

White balancing: the process of removing color casts so that colors we would perceive as white are rendered as white in the final image.

(images: different whites; an image captured under fluorescent light; the image white-balanced to daylight)

SLIDE 59

Color temperature / White balancing presets

Cameras nowadays come with a large number of presets: You can select which light you are taking images under, and the appropriate white balancing is applied.

Emission of magma (≈ 1200 ℃).

White colors are described by the temperature of ideal black-body radiation.

(scale: candle flame, indoor light, sunrise/sunset, fluorescent light, flash, average outdoor light, noon sun, clouds, blue sky)

SLIDE 60

Manual white balancing

  • 1. Select a white object.
  • 2. Normalize so that the selected pixel becomes white.

Possible approaches:

  • Select a camera preset based on the lighting.
  • Manually select an object in the photograph that is color-neutral and use it to normalize.

SLIDE 61

Automatic white balancing

  • 1. Grey world assumption: force the average color of the scene to be grey.
  • Compute the per-channel average.
  • Normalize each channel by its average.
  • Normalize by the green-channel average.
  • 2. White world assumption: force the brightest object in the scene to be white.
  • Compute the per-channel maximum.
  • Normalize each channel by its maximum.
  • Normalize by the green-channel maximum.
  • 3. Sophisticated histogram-based algorithms (what most modern cameras do).

(diagrams: sensor RGB → white-balanced RGB)
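The two simple assumptions above can be sketched for a float RGB array of shape (H, W, 3); the tiny two-pixel warm-cast test image is invented for illustration:

```python
import numpy as np

def grey_world(img):
    """Force the average scene color to grey: scale each channel by
    (green mean / channel mean)."""
    means = img.reshape(-1, 3).mean(axis=0)
    return img * (means[1] / means)

def white_world(img):
    """Force the per-channel maxima to agree: scale each channel by
    (green max / channel max)."""
    maxes = img.reshape(-1, 3).max(axis=0)
    return img * (maxes[1] / maxes)

# A scene with a warm (reddish) color cast:
img = np.array([[[0.6, 0.4, 0.3],
                 [0.9, 0.6, 0.45]]])
gw = grey_world(img)    # channel means become equal
ww = white_world(img)   # channel maxima become equal
```

Both fail when their assumption fails: grey world shifts colors in a scene dominated by one hue, and white world latches onto clipped highlights, which is why modern cameras use more sophisticated statistics.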
SLIDE 62

Automatic white balancing example

(images: input image; grey-world result; white-world result)

SLIDE 63

Mini-report

  • Q8. Check whether each explanation is true.

A) White is the color of around 6000 K.
B) White is an effect of human sensation.
C) A certain single wavelength is white.
D) The color temperature of fluorescent light is 4000–5000 K. This means the temperature of the fluorescent light itself is 3727–4727 ℃ (abs. zero = −273 ℃).
E) The primary purpose of white balancing is to render a white object as R = G = B.


http://omilab.naist.jp/class/VMP2/2020/