Natural Scene Perception
PSY3280 - Week 10 Lecture (01 Oct 2018)
Rafik Hadfi Zhao Hui Koh
Learning Objective - Natural Scene
Images retrieved from https://www.planetminecraft.com/project/minecraft-poem-scene/ and https://www.expedia.com/pictures/usa/washington-state.d249/
Human? Machine?
“gist”
Stage 1: Free Recall
Appearance, Spatial relations between objects
(taxonomy) of object categories
(Fei-Fei et al., 2007) Easy Complex
○ Preference of outdoor (vs indoor) if visual information is scarce (small PT)
vehicle) as well as basic category levels (e.g. train, plane, car)
activities
information (object identification, object/scene categorisation)
(Fei-Fei et al., 2007)
phenomenal vision
Images retrieved from http://psychologyexamrevision.blogspot.com/2012/01/sperling-1960.html
(Haun et al., 2017)
experience (Haun et al., 2017)
○ Controlled experiments - what a participant can report on (high-level categorical response, binary choice)
exposures to the stimulus.
○ Information Theory - quantify bits of information (reduction of uncertainty)
○ Yes/no question from an image (presented for 1 second) - 1 bit of information
○ Past research - We can perceive up to a maximum of 44 bits/second (Pierce, 1980)
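The "bits of information" idea above can be sketched in a few lines of Python. This is only an illustration under simple assumptions (equally likely alternatives, or a yes/no answer with a known probability); the function names are hypothetical and are not taken from the lecture or from Pierce (1980).

```python
import math


def bits_from_alternatives(n_alternatives: int) -> float:
    """Bits gained by singling out one of n equally likely alternatives."""
    return math.log2(n_alternatives)


def binary_entropy(p_yes: float) -> float:
    """Expected bits carried by a yes/no answer when P(yes) = p_yes."""
    if p_yes in (0.0, 1.0):
        return 0.0  # a certain answer removes no uncertainty
    p_no = 1.0 - p_yes
    return -(p_yes * math.log2(p_yes) + p_no * math.log2(p_no))


def bits_per_second(bits: float, duration_s: float) -> float:
    """Information rate for a stimulus shown for duration_s seconds."""
    return bits / duration_s


# A yes/no question with both answers equally likely carries 1 bit;
# answered from an image shown for 1 second, that is 1 bit/second.
one_question = binary_entropy(0.5)
rate = bits_per_second(one_question, 1.0)
```

A yes/no answer is worth a full bit only when both answers are equally likely beforehand; a question whose answer is nearly certain carries much less.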
Informative” (Loeffler, Alon, 2017)
a natural scene
(Loeffler, Alon, 2017)
whether a word (descriptor) could describe the image (present and absent)
(SOA - time between image
choices)
+ confidence rating
(Loeffler, Alon, 2017)
○ Shorter SOA - bottom-up processing (features)
○ Longer SOA - top-down processing (semantic)
Exp 2 (10 questions/image): 52 bits/sec
Exp 3 (20 questions/image): 100 bits/sec
SOA: 133 ms (Loeffler, Alon, 2017)
Pixel matrices with RGB values
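As a rough sketch of what "pixel matrices with RGB values" means for a machine, the tiny image below is stored as rows of (R, G, B) triples. The pixel values and the helper names `shape` and `to_grayscale` are made up for illustration and do not come from the lecture.

```python
# A 2 x 3 image: each pixel is an (R, G, B) triple with values in 0..255.
# The values are illustrative only.
image = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],        # red, green, blue
    [(0, 0, 0), (128, 128, 128), (255, 255, 255)],  # black, grey, white
]


def shape(img):
    """Return (height, width, channels) of a nested-list image."""
    return (len(img), len(img[0]), len(img[0][0]))


def to_grayscale(img):
    """Average the three channels of each pixel (a simple, unweighted choice)."""
    return [[sum(pixel) / 3 for pixel in row] for row in img]


gray = to_grayscale(image)
```

To the machine the scene is nothing more than this grid of numbers; any notion of "object" or "gist" has to be computed from it.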
Images retrieved from Nishimoto (2015) and https://www.ini.uzh.ch/~ppyk/BasicsOfInstrumentation/matlab_help/visualize/coloring-mesh-and-surface-plots.html
recognition & classifications, object detection, face recognition, cameras, robots)
Images retrieved from https://support.apple.com/en-us/HT208109 and Van Essen, & Gallant (1994)
Apple Face ID
(Week 9 Lecture)
Retrieved from https://medium.com/@RaghavPrabhu/understanding-of-convolutional-neural-network-cnn-deep-learning-99760835f148
Images retrieved from https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/
Image Filter (Activation map/Feature Map)
Images retrieved from https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/
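The image, filter and feature map relationship can be sketched as follows. This is a minimal, unoptimised version assuming a single-channel image, stride 1 and no padding; the 5 x 5 image and 3 x 3 filter values are illustrative assumptions, and a real CNN would learn its filter weights rather than use fixed ones.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and return
    the feature map of summed element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Illustrative binary image and filter (values are assumptions).
image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

feature_map = convolve2d(image, kernel)
# feature_map is [[4, 3, 4], [2, 4, 3], [2, 3, 4]]
```

Each entry of the feature map is large where the local image patch resembles the filter pattern, which is why the output is also called an activation map.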
(Vinyals et al., 2015; LeCun et al., 2015)
Encoder Decoder
Captions?
(feedback/recurrent)
○ English -> French
○ Image -> Caption
(LeCun et al., 2015)
Inputs Outputs
(Xu et al., 2016; LeCun et al., 2015)
visual illusion, change blindness, binocular rivalry?
○ PredNet (Watanabe et al., 2018) Rotating Snake Illusion
Representation of Primate IT Cortex for Core Visual Object Recognition. PLOS Computational Biology, 10(12), e1003963–18. http://doi.org/10.1371/journal.pcbi.1003963
7(1), 10–29. http://doi.org/10.1167/7.1.10
817–4. http://doi.org/10.1093/nc/niw023
Australia.
Publications.
Networks Trained for Prediction. Frontiers in Psychology, 9, 1143–12. http://doi.org/10.3389/fpsyg.2018.00345
Neuron, 13(1), 1-10.
Presented at the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE. http://doi.org/10.1109/CVPR.2015.7298935
Neural image caption generation with visual attention. Jmlr.org