event-based motion segmentation by motion compensation

  1. event-based motion segmentation by motion compensation. Timo Stoffregen, Guillermo Gallego, Tom Drummond, Lindsay Kleeman, Davide Scaramuzza, ICCV 2019. Presented by Ondrej Holesovsky (ondrej.holesovsky@cvut.cz), 5 November 2019, Czech Technical University in Prague, CIIRC.

  2. outline 1. Background: event camera intro. 2. Addressed problem: motion segmentation. 3. Related work. 4. Proposed method. 5. Experimental findings, discussion.

  3. event camera intro

  4. event camera sensor principle Patrick Lichtsteiner et al., A 128x128 120 dB 15 µs Latency Asynchronous Temporal Contrast Vision Sensor, IEEE Journal of Solid-State Circuits, 2008. • Each pixel is independent, no global or rolling shutter. • A pixel responds by events to changes in log light intensity. • Level-crossing sampling.
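A minimal sketch of the level-crossing principle for a single pixel, under stated assumptions: the function name and the contrast threshold value are illustrative and not taken from the sensor paper. An event fires each time the log intensity moves by more than a threshold C away from the level recorded at the previous event.

```python
import numpy as np

def simulate_pixel_events(log_intensity, timestamps, C=0.2):
    """Toy level-crossing sampler for one pixel.

    log_intensity : 1D array of log-intensity samples over time
    timestamps    : matching 1D array of sample times (seconds)
    C             : contrast threshold (illustrative value)

    Returns a list of (t, polarity) events: +1 when the log intensity
    has risen by C since the last event, -1 when it has fallen by C.
    """
    events = []
    ref = log_intensity[0]                      # level at the last event
    for L, t in zip(log_intensity[1:], timestamps[1:]):
        while L - ref >= C:                     # crossed one (or more) ON levels
            ref += C
            events.append((t, +1))
        while ref - L >= C:                     # crossed one (or more) OFF levels
            ref -= C
            events.append((t, -1))
    return events

# Example: a pixel watching a brightening-then-dimming signal.
t = np.linspace(0.0, 1.0, 1000)
L = np.log(1.0 + 0.8 * np.sin(2 * np.pi * t) ** 2)
print(len(simulate_pixel_events(L, t)))
```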

  5. an event A contrast change detection event k: e_k = (x_k, t_k, s_k). • x_k - pixel coordinates. • t_k - timestamp in seconds, microsecond resolution. • s_k - polarity, −1 or +1. This sensory representation usually requires 're-inventing' computer vision approaches. Alternative way: render videos from events.
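One convenient way to hold a packet of such events in memory is a structured NumPy array with one record per event; this layout is a hypothetical illustration, not the authors' code.

```python
import numpy as np

# One record per event: pixel coordinates, timestamp, polarity.
event_dtype = np.dtype([('x', np.uint16), ('y', np.uint16),
                        ('t', np.float64), ('s', np.int8)])

# A tiny hand-made packet of three events.
events = np.array([(10, 20, 0.000015, +1),
                   (11, 20, 0.000031, +1),
                   (40,  5, 0.000040, -1)], dtype=event_dtype)

print(events['t'])    # microsecond-resolution timestamps in seconds
print(events['s'])    # polarities in {-1, +1}
```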

  6. a sample event sequence A ball rolling on the floor. 10 ms of events shown in an image plane view.

  7. a sample event sequence A rolling ball captured in an XYT view, 10 ms of events.

  8. a sample event sequence A rolling ball captured in an XYT view, 300 ms of events.

  9. the problem: motion segmentation

  10. event motion segmentation • Classify events into N_l clusters, each representing a coherent motion with parameters θ_j. • Clusters are three-dimensional (space-time coordinates). • Two objects sharing the same motion are segmented together. • Assume motion constancy: events processed in temporally short packets. • Chicken-and-egg: estimate motion of clusters, cluster events by motion.

  11. related work

  12. traditional cameras - sparse method Xun Xu et al., Motion Segmentation by Exploiting Complementary Geometric Models, CVPR 2018. • Assuming known keypoint correspondences (SIFT, corners...). • Geometric models: affine, homography, fundamental. • Spectral clustering at the core: similarly moving tracked points should belong to the same partition of an affinity graph (motion hypothesis - feature affinities).

  13. traditional cameras - dense method Brox and Malik, Object Segmentation by Long Term Analysis of Point Trajectories, ECCV 2010. • Intensity constancy assumption. • Sparse translational point trajectories (3% of pixels): optical flow -> point tracking -> trajectory affinities -> spectral clustering. • Dense segmentation: variational label approximation (Potts model optimisation) on sparse trajectories and pixel colour. VGA at 1 FPS on a GPU.

  14. event-based vs. traditional cameras and approaches • The presented approach is semi-dense - denser than keypoints, sparser than full frames. • Assumptions: constant contrast vs. constant intensity. (Both invalid in general.) • Event-based could benefit from higher data efficiency. • High-speed, high dynamic range (HDR), low power. • Real-time: still difficult for both.

  15. event-driven ball detection and gaze fixation in clutter A. Glover and C. Bartolozzi, IROS 2016. • Detecting and tracking a ball from a moving event camera. • Locally estimate normal flow directions by fitting planes to events. • Flow direction points to or from the circle centre, which directs the Hough transform. • Any motion but only circular objects.

  16. independent motion detection with event-driven cameras V. Vasco et al., ICAR 2017. • An iCub robot head and camera move. • Detect and track corners among events and estimate their velocity. • Learn a model relating head joint velocities (from encoders) to corner velocities. • Independent corners are inconsistent (Mahalanobis distance) with the head joint velocities. • Any objects but need to know egomotion.

  17. iwe - image of warped events Or motion-compensated event image. G. Gallego and D. Scaramuzza, Accurate Angular Velocity Estimation With an Event Camera, RAL 2016. • A rotating camera. Look at 2D event cloud projections. • Project along the motion trajectories -> edge structure revealed. • Events of a trajectory: same edge, same polarity*. • Consider the sum of polarities along a trajectory. • Number of trajectories = number of pixels...

  18. iwe - method description 1a (simplified) An event image sums polarities along a trajectory. Discrete image coordinates x, continuous event coordinates x_k: I(x) = ∑_{k=0}^{N−1} s_k f(x, x_k). • N - number of events in the cloud, within a small time interval. • f - bilinear interpolation function, (x, x_k) ↦ [0, 1]. • Each event contributes to its four neighbouring pixels. • I(x) - sum of neighbourhood-weighted polarities of events firing at pixel location x.
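A possible NumPy sketch of this accumulation (function and variable names are assumptions, not the authors' code): each event spreads its polarity onto the four pixels surrounding its sub-pixel location with bilinear weights.

```python
import numpy as np

def event_image(xs, ys, ss, height, width):
    """Sum event polarities into an image with bilinear interpolation.

    xs, ys : float arrays of (possibly warped, sub-pixel) event coordinates
    ss     : array of polarities in {-1, +1}
    """
    I = np.zeros((height, width), dtype=np.float64)
    x0, y0 = np.floor(xs).astype(int), np.floor(ys).astype(int)
    wx, wy = xs - x0, ys - y0                  # fractional offsets
    # Each event contributes to its four neighbouring pixels.
    for dx, dy, w in [(0, 0, (1 - wx) * (1 - wy)),
                      (1, 0, wx * (1 - wy)),
                      (0, 1, (1 - wx) * wy),
                      (1, 1, wx * wy)]:
        xi, yi = x0 + dx, y0 + dy
        valid = (xi >= 0) & (xi < width) & (yi >= 0) & (yi < height)
        np.add.at(I, (yi[valid], xi[valid]), ss[valid] * w[valid])
    return I
```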

  19. iwe - method description 1b (paper notation) An event image sums polarities along a trajectory, continuous image coordinates x, continuous event coordinates x_k: I(x) = ∑_{k=0}^{N−1} s_k δ(x − x_k). • δ - Dirac delta. • Need to integrate the image for meaningful values. • Naive pixelwise sums along the time axis -> motion blur!

  20. iwe - method description 2 Idea: Maximise IWE sharpness by transforming the events to compensate for the motion. Iterative motion parameter optimisation: • Sharpness metric: variance of the IWE pixel values. • Compute the IWE variance and its derivatives w.r.t. the motion parameters. • Update the motion parameters and transform the event cloud. • Build a new IWE from the transformed event cloud. Repeat.

  21. iwe - translational motion model example Event cloud transform equations with motion parameters v_x, v_y: x'_k = x_k + t_k v_x, y'_k = y_k + t_k v_y. The warp transforms all events to their spatial location at the reference time t = 0.
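A minimal sketch of this warp together with the contrast objective, reusing the event_image function from the earlier sketch. The black-box optimiser is only for illustration (the paper uses analytic derivatives of the variance), and all names are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def warp_translational(xs, ys, ts, v):
    """Move each event along a constant-velocity trajectory to the reference time."""
    vx, vy = v
    return xs + ts * vx, ys + ts * vy   # sign convention as on the slide

def contrast(v, xs, ys, ts, ss, height, width):
    """Variance of the image of warped events for candidate velocity v."""
    wx, wy = warp_translational(xs, ys, ts, v)
    I = event_image(wx, wy, ss, height, width)   # sketch from the previous slide
    return I.var()

def fit_velocity(xs, ys, ts, ss, height, width, v0=(0.0, 0.0)):
    """Maximise contrast = minimise its negative (illustrative black-box search)."""
    res = minimize(lambda v: -contrast(v, xs, ys, ts, ss, height, width),
                   np.asarray(v0), method='Nelder-Mead')
    return res.x
```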

  22. simultaneous optical flow and segmentation (sofas) using a Dynamic Vision Sensor, by Timo Stoffregen and Lindsay Kleeman, ACRA 2017. • Not easy to read. Rough method description: greedy sequential model fitting. • The number of local maxima of the contrast objective ideally matches the number of structures with distinct optical flow velocities.

  23. the proposed method

  24. solution summary • One IWE per motion cluster. Each with a different motion model. • Table of event-cluster associations. • Sharpness of the IWEs guides event segmentation. • Joint identification of motion models and associations.

  25. event clusters • Event-cluster association p_kj = P(e_k ∈ ℓ_j) of event k being in cluster j. • P ≡ (p_kj) is an N_e × N_l matrix with all event-cluster associations. Non-negative, rows add up to one. • Association-weighted IWE for cluster j: I_j(x) = ∑_{k=1}^{N_e} p_kj δ(x − x'_kj), where x'_kj is the warped event location. Note: ignoring polarity.
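A hedged NumPy sketch of building one association-weighted IWE per cluster; the names, the array shapes, and the nearest-neighbour accumulation are my simplifications, not the paper's exact implementation.

```python
import numpy as np

def cluster_iwes(warped_x, warped_y, P, height, width):
    """One association-weighted IWE per cluster (polarity ignored).

    warped_x, warped_y : (N_e, N_l) warped event coordinates, one column
                         per cluster's motion model
    P                  : (N_e, N_l) association matrix, rows sum to one
    """
    n_clusters = P.shape[1]
    iwes = np.zeros((n_clusters, height, width))
    for j in range(n_clusters):
        xi = np.clip(np.round(warped_x[:, j]).astype(int), 0, width - 1)
        yi = np.clip(np.round(warped_y[:, j]).astype(int), 0, height - 1)
        # Nearest-neighbour accumulation for brevity; the paper's delta /
        # bilinear weighting could be used instead.
        np.add.at(iwes[j], (yi, xi), P[:, j])
    return iwes
```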

  26. single objective to optimise Event alignment within cluster j measured by image contrast, such as the variance, Var(I_j) = (1/|Ω|) ∫_Ω (I_j(x) − µ_{I_j})² dx, where µ_{I_j} is the mean of the IWE for cluster j over the image plane Ω. Find the motion parameters θ and the event-cluster associations P such that the total contrast of all cluster IWEs is maximised: (θ*, P*) = argmax_{(θ, P)} ∑_{j=1}^{N_l} Var(I_j).

  27. the solution - alternating optimisation Update the motion parameters of each event cluster while the associations are fixed: θ ← θ + µ ∇_θ (∑_{j=1}^{N_l} Var(I_j)), where µ ≥ 0 is the step size (a single gradient ascent step). Recompute the event-cluster associations while the motion parameters are fixed: p_kj = c_j(x'_k(θ_j)) / ∑_{i=1}^{N_l} c_i(x'_k(θ_i)), where c_j(x) ≠ 0 is the local sharpness of cluster j at pixel x, c_j(x) := I_j(x).
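A minimal sketch of the two quantities used in this alternation, assuming the per-cluster warped event coordinates are already available. The θ update would use the paper's analytic gradient, which is not reproduced here, so only the objective and the association update are shown; names are illustrative.

```python
import numpy as np

def total_contrast(iwes):
    """Objective from the previous slide: sum of per-cluster IWE variances."""
    return sum(I.var() for I in iwes)

def update_associations(iwes, warped_x, warped_y, eps=1e-9):
    """Recompute p_kj from the local sharpness c_j(x'_k) = I_j(x'_k).

    iwes               : (N_l, H, W) stack of cluster IWEs
    warped_x, warped_y : (N_e, N_l) event locations warped by each cluster's model
    """
    n_clusters, height, width = iwes.shape
    n_events = warped_x.shape[0]
    c = np.zeros((n_events, n_clusters))
    for j in range(n_clusters):
        xi = np.clip(np.round(warped_x[:, j]).astype(int), 0, width - 1)
        yi = np.clip(np.round(warped_y[:, j]).astype(int), 0, height - 1)
        c[:, j] = iwes[j, yi, xi]          # sample cluster j's IWE at the warped event
    return c / (c.sum(axis=1, keepdims=True) + eps)   # normalise rows to sum to one
```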

  28. initialisation Greedy. Not crystal clear. • Start with equal associations. • Optimise the first cluster motion parameters. • Compute the gradient g_kj of the local contrast of each event w.r.t. the motion parameters. • If g_kj is negative, the event k is likely in cluster j: set p_kj high, low for the other clusters. • Such an event becomes blurred when moving away from the optimised parameters. • Repeat for the remaining clusters.

  29. experimental findings

  30. occlusion Mitrokhin's 2018 Extreme Event Dataset (EED), ball behind a net.

  31. low light, strobe light EED, lighting variation.

  32. accuracy - bounding boxes Mitrokhin's dataset.

  33. accuracy - per event Using a photorealistic simulator. Textured pebbles, different relative velocities. Roughly 4 pixels of relative displacement are needed to achieve 90% per-event accuracy (holds for any velocity).

  34. throughput Complexity linear in the number of clusters N_l, events N_e, IWE pixels N_p, and iterations N_it: O((N_e + N_p) N_l N_it). Optical flow warps, CPU 2.4 GHz, GPU GeForce 1080: • Fast moving drone sequence: ca. 370 kevents/s. • Ball behind net: ca. 1000 kevents/s.

  35. different motion models Fan blades spinning at 1800 rpm and a falling coin.

  36. street, facing the sun

  37. non-rigid objects

  38. number of clusters If the number of clusters is set too large, the unneeded clusters end up empty. (Figure panels: 5× optical flow; 10× optical flow; 5× optical flow + 5× rotation.)
