SLIDE 17
Disjoint Two-Stream Convolutional Networks
System Description
◮ Two streams (spatial and temporal) with the same structure.
◮ The last fully connected (FC) layer has 3 outputs: left lane-change (LLC), right lane-change (RLC) and no lane-change (NLC).
◮ Dense optical flow (OF) is computed using polynomial expansion.
◮ The spatial stream is pre-trained on ImageNet.
◮ The temporal stream is pre-trained on UCF-101 and HMDB-51.
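As a rough illustration of the 3-output layer described above, the sketch below turns one stream's three raw outputs into class probabilities and a label. The class index order and the example logits are assumptions made for illustration; the slide names the three classes but does not give their order or any values.

```python
import math

# Assumed index order; the slide lists the classes but not their positions.
CLASSES = ("LLC", "RLC", "NLC")


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits):
    """Return (label, probabilities) for one stream's three raw outputs."""
    if len(logits) != len(CLASSES):
        raise ValueError("expected one logit per class")
    probs = softmax(logits)
    best = max(range(len(CLASSES)), key=lambda i: probs[i])
    return CLASSES[best], probs


# Hypothetical logits, not taken from the presented system.
label, probs = classify([0.2, 2.1, -0.5])
```

With these made-up logits the largest value sits at index 1, so the predicted label is RLC under the assumed ordering.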
Spatial Stream ConvNet (input: single frame)
conv1: 7x7x96, stride 2, pool 2x2
conv2: 5x5x256, stride 2, pool 2x2
conv3: 3x3x512, stride 1
conv4: 3x3x512, stride 1
conv5: 3x3x512, stride 1
full6: 4096, dropout
full7: 2048, dropout
full8: 3, softmax

Temporal Stream ConvNet
conv1: 7x7x96, stride 2, pool 2x2
conv2: 5x5x256, stride 2, pool 2x2
conv3: 3x3x512, stride 1
conv4: 3x3x512, stride 1
conv5: 3x3x512, stride 1
full6: 4096, dropout
full7: 2048, dropout
full8: 3, softmax

Output: LC score fusion of the two streams
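The slide shows a lane-change (LC) score fusion step but does not say how the two streams' scores are combined. The sketch below assumes a simple weighted average of the two softmax outputs, which is one common choice for two-stream networks; the function name `fuse_scores`, the `temporal_weight` parameter, the class order and the example scores are all hypothetical.

```python
# Assumed index order; the slide lists the classes but not their positions.
CLASSES = ("LLC", "RLC", "NLC")


def fuse_scores(spatial, temporal, temporal_weight=0.5):
    """Weighted average of two per-class score vectors (assumed fusion rule).

    spatial, temporal: softmax outputs of each stream, one value per class.
    temporal_weight:   share given to the temporal stream, between 0 and 1.
    """
    if len(spatial) != len(CLASSES) or len(temporal) != len(CLASSES):
        raise ValueError("expected one score per class from each stream")
    if not 0.0 <= temporal_weight <= 1.0:
        raise ValueError("temporal_weight must lie in [0, 1]")
    w = temporal_weight
    fused = [(1.0 - w) * s + w * t for s, t in zip(spatial, temporal)]
    best = max(range(len(fused)), key=lambda i: fused[i])
    return CLASSES[best], fused


# Hypothetical softmax outputs, not results from the presented system.
label, fused = fuse_scores([0.6, 0.1, 0.3], [0.2, 0.1, 0.7])
```

With equal weights the fused scores here favor NLC, even though the spatial stream alone would have picked LLC, which illustrates why the fusion rule matters when the streams disagree.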
- D. F. Llorca, M. Biparva, R. Izquierdo, J. K. Tsotsos | IEEE ITSC 2020 (20-23 September 2020)