SLIDE 1
Visual disability
Low vision 2015
Estimated blind people 2020
Visually impaired: 285 M
Blind: 54 M
Blind: 39 M
Global data source: WHO, IBU
SLIDE 2
SLIDE 3
See the world through the eyes of a visually impaired person
SLIDE 4
Normal vision
SLIDE 5
Cataract
SLIDE 6
Glaucoma
SLIDE 7
Macular degeneration
SLIDE 8
Diabetic Retinopathy
SLIDE 9
Complete blindness
SLIDE 10
One goal: independence
SLIDE 11
Text recognition
SLIDE 12
Object recognition
SLIDE 13
Mobility assistance
SLIDE 14
Scene and photos description
SLIDE 15
Face recognition
SLIDE 16
Day-lasting battery
OTA upgrades
Smartphone app
Real-time performance (offline)
Obstacle perception
Possible approaches
Single Camera
Stereo Camera CPU
Stereo Camera FPGA
Stereo Camera GPU
SLIDE 17
SLIDE 18
How Horus works
1 Cameras acquire images
2 Images are transferred to the computing unit
3 External input identification
4 Information extraction
5 Audio is transferred back to the headset
6 Sound output
SLIDE 19
What Horus does
Horus can help the user with:
Scene description
Face recognition
Text reading
Object recognition
Mobility assistance
SLIDE 20
User interaction
ON/OFF button
Navigation buttons
After powering on Horus, the user chooses the desired functionality by navigating a vocal menu with the navigation buttons.
Navigation menu
Scene description
Text reading
Face recognition
Mobility assistance
Object recognition
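The vocal menu described on this slide can be pictured as a circular list that the navigation buttons step through. The sketch below is only an illustration under assumptions: the class name `VocalMenu`, its `next` and `previous` methods, and the `speak` callback are hypothetical and not taken from Horus. Only the five menu entries come from the slide.

```python
from typing import Callable, List

# Menu entries as listed on the slide; everything else here is hypothetical.
FUNCTIONS = [
    "Scene description",
    "Text reading",
    "Face recognition",
    "Mobility assistance",
    "Object recognition",
]


class VocalMenu:
    """Circular menu stepped through with two navigation buttons (assumed design)."""

    def __init__(self, items: List[str], speak: Callable[[str], None]) -> None:
        if not items:
            raise ValueError("menu needs at least one item")
        self._items = list(items)
        self._speak = speak  # stand-in for a text-to-speech call
        self._index = 0

    @property
    def current(self) -> str:
        return self._items[self._index]

    def next(self) -> str:
        """Move forward one entry, wrapping around, and announce it."""
        self._index = (self._index + 1) % len(self._items)
        self._speak(self.current)
        return self.current

    def previous(self) -> str:
        """Move back one entry, wrapping around, and announce it."""
        self._index = (self._index - 1) % len(self._items)
        self._speak(self.current)
        return self.current


spoken: List[str] = []
menu = VocalMenu(FUNCTIONS, speak=spoken.append)
menu.next()      # announces "Text reading"
menu.previous()  # announces "Scene description"
menu.previous()  # wraps around to "Object recognition"
```

Passing the speech output in as a callback keeps the menu logic testable without audio hardware. A real device would presumably also need a select action, which is left out here.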
SLIDE 21
Image description with Deep Learning
Convolutional network
Language model
Example caption: A sunset over the mountains
The whole process runs on the NVIDIA TK1
SLIDE 22
Results on TK1 CPU vs GPU (CNN + LSTM)
Processing time on CPU
CNN: 595 ms LSTM: 1200 ms
TOTAL: 1795 ms
Processing time on GPU
CNN: 22 ms LSTM: 498 ms
TOTAL: 520 ms
~3.5X faster on GPU
Memory footprint: 300 MB
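The "~3.5X" figure can be checked from the per-stage timings on this slide. The short calculation below assumes that the slower set (1795 ms total) is the CPU run and the faster set (520 ms total) is the GPU run, which is what the stated speedup implies. The variable names are illustrative.

```python
# Timings in milliseconds, copied from the slide.
# Assumption: the 1795 ms set is the CPU run, the 520 ms set is the GPU run.
cpu_ms = {"CNN": 595, "LSTM": 1200}
gpu_ms = {"CNN": 22, "LSTM": 498}

cpu_total = sum(cpu_ms.values())  # 1795
gpu_total = sum(gpu_ms.values())  # 520

overall_speedup = cpu_total / gpu_total
stage_speedup = {stage: cpu_ms[stage] / gpu_ms[stage] for stage in cpu_ms}

print(f"overall: {overall_speedup:.2f}x")   # overall: 3.45x
for stage, s in stage_speedup.items():
    print(f"{stage}: {s:.1f}x")             # CNN: 27.0x, LSTM: 2.4x
```

Broken down this way, the CNN gains far more from the GPU than the LSTM does. The LSTM accounts for 498 of the 520 ms on the GPU, which is why the overall gain stays near 3.5X.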
SLIDE 23
Reporting obstacles
Horus uses 3D sound to report the presence of obstacles during movement. The space in front of the user is divided into sectors: lateral obstacles generate high-pitched sounds in one of the two speakers, while central obstacles generate low-pitched centered sounds. These sounds are repetitive, and their repetition frequency increases as the obstacle gets closer.
User
High-pitched sound left
Low-pitched sound center
High-pitched sound right
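The obstacle-reporting scheme on this slide amounts to a mapping from an obstacle's position to a sound cue. The sketch below shows one way such a mapping could look. Every name and number in it (the `Cue` type, `obstacle_cue`, the sector width, the pitches, the range and the intervals) is an assumption chosen for illustration, not a Horus parameter.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    channel: str       # "left", "right", or "center"
    pitch_hz: float    # tone frequency
    interval_s: float  # pause between repeated beeps


# All numeric values below are illustrative assumptions, not Horus parameters.
CENTER_HALF_WIDTH_DEG = 15.0
HIGH_PITCH_HZ = 1200.0
LOW_PITCH_HZ = 400.0
MAX_RANGE_M = 4.0
MIN_INTERVAL_S = 0.1
MAX_INTERVAL_S = 1.0


def obstacle_cue(bearing_deg: float, distance_m: float) -> Cue:
    """Map an obstacle's bearing (negative = left) and distance to a sound cue."""
    if abs(bearing_deg) <= CENTER_HALF_WIDTH_DEG:
        channel, pitch = "center", LOW_PITCH_HZ
    elif bearing_deg < 0:
        channel, pitch = "left", HIGH_PITCH_HZ
    else:
        channel, pitch = "right", HIGH_PITCH_HZ

    # Closer obstacle -> shorter interval -> higher repetition frequency.
    clamped = min(max(distance_m, 0.0), MAX_RANGE_M)
    closeness = 1.0 - clamped / MAX_RANGE_M
    interval = MAX_INTERVAL_S - closeness * (MAX_INTERVAL_S - MIN_INTERVAL_S)
    return Cue(channel, pitch, interval)


print(obstacle_cue(-40.0, 3.0))  # lateral obstacle on the left, far away
print(obstacle_cue(0.0, 1.0))    # central obstacle, close
```

The linear relation between distance and beep interval is a simplification. The slide only states that the repetition frequency increases as the obstacle gets closer.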
SLIDE 24
Results on TK1 CPU vs GPU (SGBM @480p)
Processing time on CPU: 5650 ms
Processing time on GPU (VisionWorks): 116 ms
~48X faster on GPU
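To relate these timings to the real-time requirement, the latencies can be converted into frame rates. The calculation below assumes that 5650 ms is the CPU time and 116 ms the GPU time, as the "~48X faster on GPU" figure implies, and that frames are processed one at a time. The helper `frames_per_second` is a hypothetical name.

```python
def frames_per_second(latency_ms: float) -> float:
    """Frame rate if each frame takes latency_ms and frames are processed serially."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


# SGBM @480p timings from the slide (assumed: 5650 ms on CPU, 116 ms on GPU).
cpu_ms, gpu_ms = 5650.0, 116.0

speedup = cpu_ms / gpu_ms
print(f"speedup: {speedup:.1f}x")                    # speedup: 48.7x
print(f"CPU: {frames_per_second(cpu_ms):.2f} fps")   # CPU: 0.18 fps
print(f"GPU: {frames_per_second(gpu_ms):.1f} fps")   # GPU: 8.6 fps
```

Under these assumptions the CPU path produces one disparity map roughly every 5.6 seconds, while the GPU path reaches several maps per second.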
SLIDE 25
Example of audio feedback
If the text is located in the upper part of the field of view, Horus emits a high-pitched sound to tell the user to lower the text.
If the text is located in the lower part of the field of view, Horus emits a low-pitched sound to tell the user to raise the text.
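The feedback rule on this slide can be sketched as a function of where the detected text sits in the frame. The code below is a minimal illustration under assumptions: the function name `text_position_cue`, the `(x, y, width, height)` box format, and the one-third and two-thirds thresholds are all hypothetical. How Horus actually locates the text is not described in the slides.

```python
from typing import Optional, Tuple

# Fractions of the frame height; hypothetical thresholds for illustration only.
UPPER_LIMIT = 1 / 3
LOWER_LIMIT = 2 / 3


def text_position_cue(box: Tuple[int, int, int, int], frame_height: int) -> Optional[str]:
    """Return "high", "low", or None for a text box given as (x, y, width, height).

    y is measured from the top of the frame, as in most image libraries.
    """
    if frame_height <= 0:
        raise ValueError("frame_height must be positive")
    _, y, _, h = box
    center = (y + h / 2) / frame_height
    if center < UPPER_LIMIT:
        return "high"  # text in the upper part: tell the user to lower it
    if center > LOWER_LIMIT:
        return "low"   # text in the lower part: tell the user to raise it
    return None        # roughly centered: no corrective sound


print(text_position_cue((0, 10, 100, 40), 480))   # high
print(text_position_cue((0, 400, 100, 40), 480))  # low
print(text_position_cue((0, 220, 100, 40), 480))  # None
```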
SLIDE 26
Face recognition
Source frame
Face detection
Tracking
CNN classification
SLIDE 27
Future improvements
3D reconstruction of faces
3D undistortion of sheets
Object recognition
Multi-language LSTM models
SLIDE 28