Understand Basketball Games
2018.6.15 吴浩贤 朱文韬
Sports Videos
Large quantity, high quality
Practical utility
Stereotypical
Stereotypical:
[Pass, Score(2-Pointer)]
[Pass, Pass, Score(3-Pointer)]
[Pass, Pass, Score(3-Pointer)]
[Pass, Score(2-Pointer)]
Recognition
Dataset: http://basketballattention.appspot.com/dataset_browser.html
Detecting events and key actors in multi-person videos, Vignesh Ramanathan, Jonathan Huang, Sami Abu-El-Haija, Alexander Gorban, Kevin Murphy, Li Fei-Fei, CVPR 2016, https://arxiv.org/abs/1511.02917
Challenges:
Event Label
Long-term Recurrent Convolutional Networks
Long-term Recurrent Convolutional Networks for Visual Recognition and Description, Jeff Donahue, Lisa Anne Hendricks, Marcus Rohrbach, Subhashini Venugopalan, Sergio Guadarrama, Kate Saenko, Trevor Darrell, CVPR 2015
Detection and/or Tracking
In your imagination…
In the dataset…
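The ball's color and shape are relatively fixed, so a traditional method is worth considering: Canny edge detection plus a Hough transform applied to curve detection.
Circle model: $(x - c_1)^2 + (y - c_2)^2 = r^2$
Binarize the image and run Canny edge detection. At each edge pixel $(x, y)$, enumerate $c_1$ and $c_2$ and accumulate a vote for the resulting triple $(c_1, c_2, r)$; triples with many votes are detected as circles.
In fast motion the ball deforms and changes color; in complex backgrounds there are multiple candidate circular targets.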
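The Hough voting step described above can be sketched in a few lines. This is a minimal illustration, not the code behind the slides: it assumes binarization and Canny edge detection have already produced a set of edge pixels, and it substitutes a synthetic circle outline for a real frame. The helper names (`circle_edge_pixels`, `hough_circle_votes`) and the radius range are hypothetical.

```python
import math
from collections import Counter


def circle_edge_pixels(c1, c2, r, samples=360):
    """Rasterize a circle outline into a set of integer (x, y) edge pixels.

    Stands in for the output of binarization + Canny on a real frame.
    """
    pixels = set()
    for k in range(samples):
        t = 2 * math.pi * k / samples
        pixels.add((round(c1 + r * math.cos(t)), round(c2 + r * math.sin(t))))
    return pixels


def hough_circle_votes(edge_pixels, width, height, r_min, r_max):
    """For every edge pixel, enumerate candidate centers (c1, c2) and vote
    for the triple (c1, c2, r) that satisfies (x-c1)^2 + (y-c2)^2 = r^2."""
    votes = Counter()
    for x, y in edge_pixels:
        for c1 in range(width):
            for c2 in range(height):
                r = round(math.hypot(x - c1, y - c2))
                if r_min <= r <= r_max:
                    votes[(c1, c2, r)] += 1
    return votes


# Synthetic example: one clean circle centered at (20, 20) with radius 8.
edges = circle_edge_pixels(20, 20, 8)
votes = hough_circle_votes(edges, width=40, height=40, r_min=5, r_max=12)
(best_c1, best_c2, best_r), best_votes = votes.most_common(1)[0]
print(best_c1, best_c2, best_r, best_votes)
```

On a clean outline the strongest triple sits at or next to the true circle. With deformation, color change, or a cluttered background, the accumulator can hold several comparable peaks, which matches the failure case noted above.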
Frame Feature: CNN ResNet (no top), (2048,)
Player Feature: CNN VGG19 (no top), (512,) for player
Spatial histogram with pyramid: (1365,) for player
Concat: (512,) + (1365,) = (1877,) for player
Weighted: (1877,) player feature
(1877,) player feature, (2048,) frame feature … LSTM
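The shapes above add up: 512 + 1365 = 1877. The (1365,) size also matches a six-level spatial pyramid, since 1 + 4 + 16 + 64 + 256 + 1024 = 1365. That reading is an inference from the number, not something the slides state. Under that assumption, a rough sketch of the per-player feature could look like this; `spatial_pyramid_histogram` is a hypothetical helper, and the zero vector is a placeholder for the real VGG19 output.

```python
def spatial_pyramid_histogram(cx, cy, levels=6):
    """One-hot occupancy of a point in [0, 1) x [0, 1) over grids of
    1x1, 2x2, 4x4, ... cells, concatenated level by level.

    With levels=6 the length is 1 + 4 + 16 + 64 + 256 + 1024 = 1365.
    (Assumed layout; the slides only give the final dimension.)
    """
    feature = []
    for level in range(levels):
        n = 2 ** level                      # grid is n x n at this level
        col = min(int(cx * n), n - 1)
        row = min(int(cy * n), n - 1)
        cells = [0.0] * (n * n)
        cells[row * n + col] = 1.0          # mark the cell holding the player
        feature.extend(cells)
    return feature


# Placeholder for the (512,) VGG19 appearance feature of one player.
appearance = [0.0] * 512
# Normalized player position (e.g. bounding-box center); values are made up.
spatial = spatial_pyramid_histogram(0.3, 0.7)
# Concat: (512,) + (1365,) -> (1877,)
player_feature = appearance + spatial
print(len(spatial), len(player_feature))
```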
In a clip: trajectory; extra constant vector as context of sequence
Bidirectional LSTM: compute a global (clip-level) context feature for each frame.
Next, a unidirectional LSTM with an extra input represents the state of the event at time t.
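The two-stage wiring above can be made concrete with a toy forward pass in pure Python. This is only a sketch under assumptions: the weights are random and untrained, the sizes are tiny, and the "extra input" is taken to be a per-frame vector (for example a player feature), which the slides do not specify. `LSTMCell`, `run_lstm`, and `event_states` are illustrative names, not the project's code.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class LSTMCell:
    """Minimal LSTM cell; the weight rows cover gates i, f, o, g in that order."""

    def __init__(self, n_in, n_hid, rng):
        self.n_hid = n_hid
        self.W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in + n_hid + 1)]
                  for _ in range(4 * n_hid)]

    def step(self, x, h, c):
        z = list(x) + list(h) + [1.0]       # input, previous state, bias term
        a = [sum(w * v for w, v in zip(row, z)) for row in self.W]
        n = self.n_hid
        i = [sigmoid(v) for v in a[:n]]
        f = [sigmoid(v) for v in a[n:2 * n]]
        o = [sigmoid(v) for v in a[2 * n:3 * n]]
        g = [math.tanh(v) for v in a[3 * n:]]
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new


def run_lstm(cell, inputs):
    """Run a cell over a sequence and return the hidden state at every step."""
    h, c = [0.0] * cell.n_hid, [0.0] * cell.n_hid
    outputs = []
    for x in inputs:
        h, c = cell.step(x, h, c)
        outputs.append(h)
    return outputs


def event_states(frames, extras, fwd, bwd, uni):
    # Bidirectional pass: each frame gets a context built from the whole clip.
    h_fwd = run_lstm(fwd, frames)
    h_bwd = run_lstm(bwd, frames[::-1])[::-1]
    context = [a + b for a, b in zip(h_fwd, h_bwd)]
    # Unidirectional pass: context plus the extra input at each time step.
    uni_inputs = [ctx + list(extra) for ctx, extra in zip(context, extras)]
    return run_lstm(uni, uni_inputs)


rng = random.Random(0)
T, n_frame, n_extra, n_hid = 5, 4, 3, 2     # toy sizes, not (2048,) / (1877,)
frames = [[rng.random() for _ in range(n_frame)] for _ in range(T)]
extras = [[rng.random() for _ in range(n_extra)] for _ in range(T)]
fwd = LSTMCell(n_frame, n_hid, rng)
bwd = LSTMCell(n_frame, n_hid, rng)
uni = LSTMCell(2 * n_hid + n_extra, n_hid, rng)
states = event_states(frames, extras, fwd, bwd, uni)
print(len(states), len(states[0]))
```

Each entry of `states` plays the role of the event state at time t; a classifier on top of it would produce the event label.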
Gradient Clipping
Exploding gradients ❌ Clip the gradient before the parameter update
Figure: the cliff (I. Goodfellow, Y. Bengio, Deep Learning)
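A small sketch of clipping before the update. The slides do not say which variant was used; this assumes norm-based clipping, where the whole gradient is rescaled when its global L2 norm exceeds a threshold (element-wise clipping is the other common variant). The function name and threshold are illustrative.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale a list of gradient vectors so their joint L2 norm is at most
    max_norm. Returns the (possibly rescaled) gradients and the original norm."""
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    if total_norm <= max_norm:
        return [list(vec) for vec in grads], total_norm
    scale = max_norm / total_norm
    return [[g * scale for g in vec] for vec in grads], total_norm


# Toy gradient with norm 5; clipping to norm 1 keeps the direction.
grads = [[3.0, 4.0]]
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
norm_after = math.sqrt(sum(g * g for vec in clipped for g in vec))
print(norm_before, norm_after, clipped)
```

Rescaling keeps the update direction and only bounds the step size, which is what prevents a single step from overshooting at a cliff in the loss surface.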
Metric      Spatial Only   LRCN   Combined (Context)   Combined (Concat)
Top_1_acc   0.44           0.35   0.47                 0.41
Top_2_acc   0.69           0.59   0.70                 0.62
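For reference, Top_1_acc and Top_2_acc are read here as standard top-k accuracy: a clip counts as correct if its true event label is among the k highest-scoring classes. A minimal sketch, with made-up scores and labels that are not taken from the experiments:

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k top-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        hits += label in ranked[:k]
    return hits / len(labels)


# Illustrative class scores for four clips over three event classes.
scores = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.4, 0.5],
    [0.7, 0.2, 0.1],
]
labels = [0, 2, 1, 1]
top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(top1, top2)
```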