

SLIDE 1

Obstacle Avoidance for Monocular Drones

SLIDE 2

Table of Contents

  • I. Technological context
  • II. Neural Networks for obstacle avoidance
  • i. Deep Reinforcement Learning
  • ii. The two axes of development
  • III. DepthNet
  • i. FOE extraction pretraining
  • ii. From FlowNet to DepthNet
  • iii. Multiple Shift DepthNet
  • iv. BatLib + DepthNet
  • v. Auto supervision for finetuning
  • IV. Future work
  • V. Conclusion
SLIDE 3

  • I. Technological context
SLIDE 4

Technological context

  • Today's trends for consumer drones:
    – Follow Me
    – Flight Plan
  • Parrot is working on these, but they need some safety features
SLIDE 5

Technological context

  • We need to make sure the drone avoids obstacles when in these modes.
  • We aim to avoid "jerk", so that the footage remains stable and the customer stays happy.
  • We cannot use expensive sensors: we only have one camera and an IMU.

SLIDE 6

Technological context

  • Parrot already has a strong piloting library that accepts high-level commands.
  • They also have a good video stabilization algorithm, so we can assume the input video is purely translated.

[Diagram: video → video stabilization → ? → piloting → motors; the "?" blocks are the system to design]

SLIDE 7

Analytical solutions

  • Structure from motion:
    – SLAM
    – Obstacle detection
    – Path finding
  • SLAM is either very expensive or very sparse, which is a problem for consumer products.
  • These solutions also imply rendering a whole 3D environment, which is a waste of computation.
  • We want an end-to-end system that directly applies a policy to get smooth obstacle avoidance.

SLIDE 8

  • II. Neural Networks for obstacle avoidance

SLIDE 9

Deep Learning and Autonomous Vehicles

  • Very fashionable for autonomous cars since 2005, not so much for drones.
  • DAVE (2005) was already trying to avoid obstacles:
    – It learned to drive by mimicking human piloting: it was not reinforcement learning.
  • LAGR: Learning Applied to Ground Robotics
    – Classified the incoming road.

Hadsell, R., Sermanet, P., Ben, J., Erkan, A., Scoffier, M., Kavukcuoglu, K., Muller, U., & LeCun, Y. (2009). Learning long-range vision for autonomous off-road driving. Journal of Field Robotics.
Muller, U., Ben, J., Cosatto, E., Flepp, B., & LeCun, Y. (2006). Off-road obstacle avoidance through end-to-end learning. In Advances in Neural Information Processing Systems (pp. 739-746).

SLIDE 10

Deep Learning and Autonomous Vehicles

  • So what's the problem with drones?
    – It's impossible to relate pixel depth to the horizon: in a cluttered environment, every pixel is an obstacle; what we don't know is its distance.
  • The learning task is more difficult because it needs context, and it may be impossible to have a ground truth to work with.
    – Human piloting resources are expensive, and not necessarily perfect.
    – Still, some interesting results were obtained with imitation training.

Ross, S., Melik-Barkhudarov, N., Shankar, K. S., Wendel, A., Dey, D., Bagnell, J. A., & Hebert, M. (2013, May). Learning monocular reactive UAV control in cluttered natural environments. In Robotics and Automation (ICRA), 2013 IEEE International Conference on (pp. 1765-1772). IEEE.

SLIDE 11

Reinforcement learning for robots

  • Presumably the best way to get a well-condensed system, as it only deals with inputs and outputs.
  • As it is, it's not reasonable to apply it on a real drone: too many crashes! The real world is very complex, and we need an efficient way of learning.

SLIDE 12

Introducing Deep Reinforcement Learning

  • Deep Q-Learning solves complex Atari games just by looking at the picture output and simple rewards.
  • Building on DQN, Deep DPG (DDPG) can solve continuous-action problems such as driving games.
  • Why not drones? Let's try it out.

Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., & Riedmiller, M. (2013). Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602.
Lillicrap, T. P., Hunt, J. J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., ... & Wierstra, D. (2015). Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971.

SLIDE 13

Deep Q-Learning

[Diagram: Environment → (state s_t, reward r_{t-1}) → Policy → action a_t]

  • We approximate the optimal Q* with a neural net Q(s, a, W) and change our policy to maximize it, which is possible for a limited set of possible actions.
  • How do we learn Q? We need a differentiable loss function; we try to approach the Bellman equation.

$$R_t = \sum_{t'=t}^{T} \gamma^{t'-t} r_{t'}$$
$$a_t = \arg\max_a Q(s_t, a, W)$$
$$L(W) = \left[y - Q(s_t, a_t, W_t)\right]^2, \qquad y = r_t + \gamma \max_{a'} Q(s_{t+1}, a', W_{t-1})$$
$$\nabla_W L(W) = -2\left(y - Q(s_t, a_t, W)\right)\nabla_W Q(s_t, a_t, W)$$
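The update above can be sketched numerically. A minimal illustrative sketch, not Parrot's code: a linear Q function stands in for the deep network, `dqn_update` applies one gradient step on the squared TD error, and all names are hypothetical.

```python
import numpy as np

def q_values(state, W):
    # Linear stand-in for the deep Q network: one weight row per action.
    return W @ state

def dqn_update(W, W_target, s_t, a_t, r_t, s_next, gamma=0.99, lr=0.1):
    """One Deep Q-Learning step on the slide's loss:
    y = r_t + gamma * max_a' Q(s_{t+1}, a', W_target)
    L(W) = [y - Q(s_t, a_t, W)]^2"""
    y = r_t + gamma * np.max(q_values(s_next, W_target))  # Bellman target
    td_error = y - q_values(s_t, W)[a_t]
    W = W.copy()
    W[a_t] += lr * td_error * s_t  # for linear Q, grad_W Q(s, a, W) = s
    return W

def greedy_action(state, W):
    # Policy from the slide: a_t = argmax_a Q(s_t, a, W)
    return int(np.argmax(q_values(state, W)))
```

Freezing `W_target` at the weights of a previous iteration (the W_{t-1} in the target y) is what stabilizes training.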

SLIDE 14

Deep DPG (DDPG)

[Diagram: Environment → (s_t, r_{t-1}) → Actor (µ) → a_t; Critic (Q) takes (s_t, a_t)]

$$a_t = \arg\max_a Q(s_t, a, W)$$

  • This is easy to find if the set of actions is finite. What about continuous actions? We need a better method to find the best possible action: alongside Q, we design an actor µ that maps states directly to actions, maximizing Q with respect to the action.

$$L(W') = \left|\max_a Q(s_t, a) - Q(s_t, \mu(s_t, W'))\right|$$
$$\nabla_{W'} L = \nabla_a Q(s, a)\, \nabla_{W'} \mu(s, W')$$

During inference, only the actor is called. We now have a direct mapping from previous states to any continuous action.
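The actor update can be illustrated with a toy example (a sketch, not the real networks): a linear actor and a critic peaked at a known best action, so we can watch the chain rule push the actor toward the critic's argmax.

```python
import numpy as np

# Illustrative stand-ins: a linear actor mu(s, W') and a toy critic
# Q(s, a) peaked at the best action a* = Wc @ s.
def actor(s, Wa):
    return float(Wa @ s)

def critic(s, a, Wc):
    return -0.5 * (a - float(Wc @ s)) ** 2

def ddpg_actor_update(s, Wa, Wc, lr=0.05):
    """Slide's chain rule: grad_W' L = grad_a Q(s, a) * grad_W' mu(s, W').
    For a linear actor, grad_W' mu = s; we ascend Q wrt the action."""
    a = actor(s, Wa)
    dQ_da = -(a - float(Wc @ s))  # derivative of the toy critic wrt action
    return Wa + lr * dQ_da * s
```

After repeated updates, the actor's action approaches the critic's argmax without ever enumerating actions; at inference only `actor` is called.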

SLIDE 15

Deep Reinforcement Learning applied to autonomous drone flying

  • Parrot already has a simulator, based on Gazebo, which behaves (hopefully) exactly like the real drone.
  • We can use it to test our algorithms; it's much faster and safer than real training.
  • The Gazebo simulator, although realistic for physical interactions, is not very similar to real video footage. Our system won't work if it's only trained in the simulator.

SLIDE 16

The domain adaptation problem

  • Suppose we have something working great in the simulator: how do we make it work in real life?
  • How can we use already-trained networks as a pretrained net?
  • Can we train the network differently in the simulator?
SLIDE 17

The pretraining problem

  • The necessary dataset is going to be huge if we want to avoid context sensitivity, even for a simulation.
  • We can condense the information into a higher-level representation, such as a depth map, via supervised training, and then train a simpler network with reinforcement learning.

SLIDE 18

The two axes of development

[Diagram: two development axes, network complexity (supervised to reinforcement) and environment complexity (simulator to real conditions), with milestones: depth map extraction, depth map for real footage, obstacle avoidance in simulator, and the final goal]

SLIDE 19

  • III. DepthNet
SLIDE 20

Definitions

  • Focus Of Expansion (FOE, Φ)
  • Disparity (δ)

We assume a rigid scene and rotation-less movement only.

SLIDE 21

Introducing FlowNet

  • Flow + Focus of Expansion = Depth Map multiscale reconstruction

Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., ... & Brox, T. (2015). Flownet: Learning optical flow with convolutional networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).

SLIDE 22

Introducing FlowNet

  • Simple architecture, with only convolutions, LeakyReLU, and concatenation.
  • Earlier feature maps are used for flow reconstruction to retain spatial information.
  • However, poor results compared to more complicated architectures such as FlowNetC, FlowNet2, and GC-Net.

Ilg, E., Mayer, N., Saikia, T., Keuper, M., Dosovitskiy, A., & Brox, T. (2016). FlowNet 2.0: Evolution of optical flow estimation with deep networks. arXiv preprint arXiv:1612.01925.
Kendall, A., Martirosyan, H., Dasgupta, S., Henry, P., Kennedy, R., Bachrach, A., & Bry, A. (2017). End-to-end learning of geometry and context for deep stereo regression. arXiv preprint arXiv:1703.04309.

SLIDE 23

FlowNet → DepthNet

  • Similar problem to solve: we want the network to output depth directly for rigid scenes and rotation-less cameras.
  • FlowNet is trained on all possible flows and can thus be considered overkill for the rigid-scene problem, which allows us to consider smaller variations of FlowNet.
  • Problem: FlowNet is fully convolutional. This makes sense for optical flow, but not intuitively for depth.

SLIDE 24

Why not use depth from FlowNet?

  • For a rotation-less camera we have this equation: ζ(P) = D · ‖P − Φ‖ / δ(P), relating depth to the displacement D, the distance to the FOE, and the disparity.
  • This implies an indeterminate 0/0 form when approaching the FOE (both ‖P − Φ‖ and δ vanish), which will make errors diverge dramatically.
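This divergence can be checked numerically. A small sketch under the assumed rotation-less relation ζ(P) = D · ‖P − Φ‖ / δ(P): a fixed flow-estimation noise produces exploding depth error as the pixel approaches the FOE.

```python
import numpy as np

def depth_from_flow(P, foe, flow_mag, D=0.3):
    # Assumed rotation-less relation: zeta(P) = D * ||P - FOE|| / delta(P)
    return D * np.linalg.norm(P - foe) / flow_mag

def relative_depth_error(dist_to_foe, true_depth=10.0, D=0.3, eps=0.05):
    """Relative depth error at a pixel `dist_to_foe` away from the FOE,
    when the flow estimate carries a fixed noise eps (in pixels)."""
    true_flow = D * dist_to_foe / true_depth          # delta -> 0 near the FOE
    P = np.array([dist_to_foe, 0.0])
    noisy_depth = depth_from_flow(P, np.zeros(2), true_flow + eps, D)
    return abs(noisy_depth - true_depth) / true_depth
```

The same noise that barely perturbs depth far from the FOE dominates the vanishing flow near it, which is why dividing FlowNet's output is not a reliable depth estimator there.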

SLIDE 25

The Fully Convolutional problem

  • The operation needed to compute depth depends on the pixel's location relative to the FOE and to the optical center (P0): for each area, you have to locate the FOE and the position relative to P0.
  • Intuitively, you would need an FC layer at some point to solve the problem: the connection from feature maps to an FC vector gives away the FOE and P0.

SLIDE 26

Still Box Dataset

  • Blender-powered synthetic dataset, with known depth maps and displacements.
  • Textures scraped randomly from Flickr.
  • Randomly placed mesh primitives.
  • Not realistic on purpose: depth and colors are completely decorrelated.

SLIDE 27

DepthNet training with Still Box

  • Same architecture as FlowNetS, but thinner, and the output is a 1×h×w layer instead of 2×h×w.
  • 10-image scenes with constant displacement.
  • The training process gives image pairs with the corresponding depth map for a nominal displacement of 30 cm.
  • The network then always assumes the same camera speed and outputs depth values within [0, 100].
  • A simple mean L1 error is used.
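The training objective above is just a mean L1 regression on clipped depth. A minimal sketch, with hypothetical helper names:

```python
import numpy as np

def mean_l1_loss(pred_depth, gt_depth):
    # The slide's "simple mean L1 error" over a 1 x h x w depth map.
    return float(np.mean(np.abs(pred_depth - gt_depth)))

def make_training_pair(scene_frames, depth_at_30cm, i, j):
    """A Still-Box-style sample: two frames of a rigid scene whose ground
    truth is the depth map equivalent to the nominal 30 cm displacement,
    clipped to the network's output range [0, 100]."""
    return (scene_frames[i], scene_frames[j]), np.clip(depth_at_30cm, 0.0, 100.0)
```

Because every pair is generated at the same nominal displacement, the network can regress absolute depth values without ever being told the camera speed.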

SLIDE 28

DepthNet training with Still Box

  • Lowest feature map size for 64×64 input: convolutions on 1×1 feature maps are then FC layers, solving the FOE and P0 problem.
  • When retraining with bigger sizes, the lowest feature maps are no longer 1×1, but it still improves!
  • Direct training with big images does not converge as well as with pretraining.

SLIDE 29

DepthNet training with Still Box

DepthNet64: 16×16 output. DepthNet512: 128×128 output.

SLIDE 30

DepthNet for the drone use case

  • Only the depth equivalent to a displacement D0 of 30 cm is computed. For another displacement D, we can use this simple equation: depth scales linearly, β = β0 · D / D0.
  • Optimal contrast of the DepthNet output is reached for central values, within [15, 60]. An optimal shift can be computed from the last depth map to center the next DepthNet output.
  • ∆t is then computed from the previous frame to be as close to D_optimal as possible.
  • This algorithm can easily be used with multiple shifts, for wide-range scenes where you need good contrast for close objects (<15 m) AND far objects (>60 m). Multiple D_optimal values can be computed with clustering algorithms such as K-means.
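The rescaling and optimal-shift logic can be sketched as follows (a hypothetical rendering of the slide's algorithm; function names are illustrative, `D0` is the 30 cm nominal displacement):

```python
import numpy as np

D0 = 0.30  # nominal training displacement (30 cm)

def rescale_depth(beta, displacement):
    # The network output beta assumes displacement D0; depth scales
    # linearly with the actual displacement between the two frames.
    return beta * displacement / D0

def optimal_shift(mean_scene_depth, speed, lo=15.0, hi=60.0):
    """Pick dt so the next DepthNet output lands in the best-contrast
    range [15, 60]: aim the output at the range's geometric center."""
    target_beta = np.sqrt(lo * hi)                 # = 30, center of [15, 60]
    displacement = D0 * mean_scene_depth / target_beta
    return displacement / speed                    # dt with speed * dt = displacement
```

For wide-range scenes, the same routine can run per depth cluster (e.g. K-means over the previous map) to get several optimal displacements.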

SLIDE 31

Multiple Shift DepthNet

[Diagram: final workflow for multiple-shift depth extraction and fusion, combining β_min, β_max, and β_mean]

SLIDE 32

Results Video

SLIDE 33

First POC using a depth-based obstacle avoidance algorithm

  • BatLib is a library from Parrot that takes a depth map as input (from any provider, such as ToF, stereo, or DepthNet).
  • Video is streamed to a gaming laptop, which sends the DepthNet output back to the drone.
  • Real demo after the presentation!
SLIDE 34

  • IV. Future Work
SLIDE 35

Future work

[Diagram: two-axes roadmap from slide 18, with DepthNet achieved; "we want to go here" marks the next step, DepthNet in real conditions]

SLIDE 36

Future work: Auto supervision for finetuning

Photometric error (from SfM-Net): reprojection error E(f(I_{t-dt}), I_t), where f is the warp function from depth + displacement.

Vijayanarasimhan, S., Ricco, S., Schmid, C., Sukthankar, R., & Fragkiadaki, K. (2017). SfM-Net: Learning of structure and motion from video. arXiv preprint arXiv:1704.07804.
Zhou, T., Brown, M., Snavely, N., & Lowe, D. G. (2017). Unsupervised learning of depth and ego-motion from video. arXiv preprint arXiv:1704.07813.
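A 1-D toy version of this photometric loss (illustrative only; real SfM-Net-style warping is 2-D and bilinear): the previous frame is warped with the flow implied by depth and displacement, then compared to the current frame.

```python
import numpy as np

def warp(image, flow_px):
    # Toy nearest-neighbour inverse warp f(.) of a 1-D "image".
    idx = np.clip(np.arange(image.size) + int(round(flow_px)), 0, image.size - 1)
    return image[idx]

def photometric_error(img_t, img_prev, depth, displacement, focal=1.0):
    """E(f(I_{t-dt}), I_t) with a simplified flow model for forward
    translation: flow ~ focal * displacement / depth (assumed model)."""
    flow = focal * displacement / depth
    return float(np.mean(np.abs(warp(img_prev, flow) - img_t)))
```

On a textured scene the error is minimized by the true depth, but on a constant (textureless) image it is 0 for every depth, which is exactly the degenerate case discussed on the next slide.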

SLIDE 37

Auto supervision

  • Problem with low-textured areas: the photometric error will be 0.
  • Instead of warping images, we can also warp depth maps to get a new error E(f(β_t), β_{t-dt}) + E(f(β_{t-dt}), β_t).
  • We can work with completely different shifts: (I_t1, I_t2), (I_t3, I_t4).
  • This error maximizes consistency but may converge to degenerate states not representative of the depth map.
  • We can mix the two as a compromise.
SLIDE 38

Auto supervision

  • We need very precise information on displacements.
  • This information can be obtained for smooth flights.
  • The training dataset can easily be constructed by anyone and does not require depth-map ground truth.

SLIDE 39

Future work

[Diagram: two-axes roadmap from slide 18; "we want to go there" marks the next step, obstacle avoidance in simulator]

SLIDE 40

Future work

Obstacle avoidance on Parrot Simulator

  • Reminder: the simulator is really good for physics but not for rendering.
  • Train a (not fully) convolutional neural network, e.g. with DDPG, to infer safe commands from the depth map in Gazebo.
  • Higher-level goals have to be determined clearly:
    – Smoothness
    – Deviation from the wanted trajectory

[Diagram: depth map → Conv + FC networks (DDPG actor, DQN critic) → command; reward built from the constraints above; maximize; potential weight sharing between the two networks]

SLIDE 41

Future work

[Diagram: two-axes roadmap from slide 18; "we want to go there" marks the combined final goal]

SLIDE 42

Future work

  • The domain-sensitive (DepthNet) and task-sensitive (DDPG) networks can now be concatenated for a final finetuning with human-supervised flight.
  • Online rewards can be based on human interception, as with DAgger, for stability, along with GPS and speed-constraint rewards.

Ross, S., Gordon, G. J., & Bagnell, D. (2011). A reduction of imitation learning and structured prediction to no-regret online learning. In International Conference on Artificial Intelligence and Statistics (pp. 627-635).
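The DAgger scheme referenced here can be sketched in a few lines (a toy 1-D version, all names hypothetical): roll out the current learner, have the expert (here, the intercepting human pilot) label the visited states, aggregate, and retrain.

```python
import random

def dagger(expert_policy, fit, n_iters=3, rollouts_per_iter=5):
    """Minimal DAgger loop (Ross et al., 2011): the dataset grows with
    expert labels on states visited during the learner's own rollouts,
    then the policy is retrained on everything aggregated so far."""
    dataset = []
    policy = fit(dataset)                   # initial (blind) policy
    for _ in range(n_iters):
        for _ in range(rollouts_per_iter):
            state = random.random()         # stand-in for a visited flight state
            dataset.append((state, expert_policy(state)))
        policy = fit(dataset)               # retrain on all aggregated data
    return policy, dataset
```

Labeling the learner's own state distribution, rather than the expert's, is what gives DAgger its no-regret guarantee and makes it suitable for online human interception.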

SLIDE 43

DepthNet + DQN + DDPG for real-life online finetuning

[Diagram: (img_t, img_{t-dt}) → DepthNet → depth map (mean, std, optimal dt) → DDPG / DQN → command; maximize a hand-crafted reward]

SLIDE 44

Future work

  • Our network will hopefully be functioning, but it will be very heavy and impossible to deploy for users.
  • We must find a way to make an end-to-end network that does not waste time constructing an explicit depth map.
  • We can use our overkill network for imitation learning and self supervision! It can be online or offline.
  • We also must train for the optimal shift for the next frame.

SLIDE 45

DepthNet + DQN + DDPG → one network to rule them all

[Diagram: (img_t, img_{t-dt}) → DepthNet → depth map (optimal dt) → DDPG → command; the S&A network is trained by imitation learning; DQN maximizes a hand-crafted reward]

SLIDE 46

S&A DAgger Learning

[Diagram: (img_t, img_{t-dt}) → DepthNet → depth map (optimal dt) → DDPG → command; the S&A network is finetuned with a DDPG-based reward]

SLIDE 47

Conclusion

  • DepthNet is working in the simulator, and almost working in real conditions.
  • The depth-based avoidance network is yet to be trained and evaluated.
  • A mixture of the two will be used for a first fully differentiable obstacle-avoidance POC in real conditions.
  • An end-to-end condensed S&A network will be trained to mimic the former mixture and then to explore by itself.

SLIDE 48

Thank You!