

  1. Social Interactions: A First-Person Perspective
     A. Fathi, J. Hodgins, J. Rehg
     Presented by Jacob Menashe, November 16, 2012

  5. Social Interaction Detection
     Objective: Detect social interactions from video footage.
     ◮ Consider faces and attention
     ◮ Account for temporal context
     ◮ Analyze first-person movement cues

  6. Introduction Overview Features Temporal Context Experiments

  7. Video Example
     Red: Dialogue
     Yellow: Walking Dialogue
     Green: Discussion
     Light Blue: Walking Discussion
     Dark Blue: Monologue
     None: Background

  15. Features
      Features are constructed based on first- and third-person information.
      1. Dense optical flow (first-person movement).
      2. Face locations (relative to first person).
      3. Attention and roles. For each person x:
         ◮ Faces looking at x
         ◮ Whether first person looks at x
         ◮ Mutual attention between x and first person
         ◮ Number of faces looking at where x is looking
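
The four attention features listed for each person x can be sketched in a few lines. This is only an illustration, not the authors' implementation: it assumes gaze has already been resolved to a person id per face (the slides describe attention in terms of where a face is looking, which is richer than this), and the names `FIRST_PERSON`, `looks_at`, and `attention_features` are hypothetical.

```python
FIRST_PERSON = "me"  # hypothetical id for the camera wearer


def attention_features(looks_at, person):
    """Compute the four attention features for `person`.

    `looks_at` maps each person id (including FIRST_PERSON) to the id of
    whoever they are looking at, or None when the gaze target is unknown.
    Assumption: gaze targets are person ids rather than image locations.
    """
    target = looks_at.get(person)
    # Faces looking at x; the camera wearer is not a detected face.
    faces_on_x = sum(
        1 for p, t in looks_at.items() if t == person and p != FIRST_PERSON
    )
    # Whether the first person looks at x.
    first_looks = looks_at.get(FIRST_PERSON) == person
    # Mutual attention between x and the first person.
    mutual = first_looks and target == FIRST_PERSON
    # Other faces that share x's gaze target.
    shared = 0
    if target is not None:
        shared = sum(
            1
            for p, t in looks_at.items()
            if t == target and p not in (person, FIRST_PERSON)
        )
    return {
        "faces_looking_at_x": faces_on_x,
        "first_person_looks_at_x": first_looks,
        "mutual_attention": mutual,
        "faces_sharing_gaze_target": shared,
    }


# Toy scene: the wearer and Alice look at each other; Bob and Carol watch Alice.
looks_at = {"me": "alice", "alice": "me", "bob": "alice", "carol": "alice"}
for name in ("alice", "bob"):
    print(name, attention_features(looks_at, name))
```

Keeping the features in a dict per person makes it easy to concatenate them into a per-frame feature vector later, alongside the optical-flow and face-location features.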

  16. Feature Example

  21. Conditional Random Fields
      CRFs are described in Lafferty et al. [2001].
      ◮ Observations and labels form a Markov chain.
      ◮ Nodes depend on neighbors.
      Diagram: labels $y_1, y_2, y_3$ over observations $x_1, x_2, x_3$. Conditionals highlighted in turn:
      ◮ $p(y_1 \mid x_1, y_2)$
      ◮ $p(y_2 \mid y_1, y_3, x_2)$
      ◮ $p(y_3 \mid y_2, x_3)$
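
The per-node conditionals on the CRF slides can be illustrated with a toy chain. The sketch below assumes a log-linear model in which each label has one score against its own observation and one score against each neighboring label; the label set, `node_score`, and `edge_score` are made up for illustration and do not come from the slides or from Lafferty et al. [2001].

```python
import math

LABELS = ["talk", "walk"]  # hypothetical label set


def node_score(y, x):
    """Toy compatibility between label y and observation x (a motion magnitude)."""
    return x if y == "walk" else -x


def edge_score(y_a, y_b):
    """Toy smoothness term: neighboring labels prefer to agree."""
    return 1.0 if y_a == y_b else 0.0


def conditional(x_i, left=None, right=None):
    """Return p(y_i | neighboring labels, x_i) as a dict over LABELS.

    `left` and `right` are the neighboring labels, or None at a chain end,
    so the same function covers the first, middle, and last node.
    """
    weights = {}
    for y in LABELS:
        s = node_score(y, x_i)
        if left is not None:
            s += edge_score(left, y)
        if right is not None:
            s += edge_score(y, right)
        weights[y] = math.exp(s)
    z = sum(weights.values())  # normalize over the candidate labels only
    return {y: w / z for y, w in weights.items()}


# Middle node: an uninformative observation, both neighbors labeled "walk".
print(conditional(0.0, left="walk", right="walk"))
# First node: only a right-hand neighbor exists.
print(conditional(2.0, right="talk"))
```

Only the factors that touch $y_i$ appear in the score, which is why each conditional depends on at most two neighboring labels and one observation.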

  25. Hidden Conditional Random Fields
      A micro view of the HCRF model as described in Quattoni et al. [2007].
      ◮ $Y$ is a label for the whole sequence.
      ◮ $x_i$ is a single observation in the sequence.
      ◮ Each $h_i$ is a possible hidden state.
      Diagram: $Y$ above $h_1, h_2, h_3$, above a single observation $x_i$.

  33. Hidden Conditional Random Fields (cont.)
      A macro view of the HCRF model as described in Quattoni et al. [2007].
      ◮ $Y$ is a label for the whole sequence.
      ◮ Each $x_i$ is a single observation in the sequence.
      ◮ Each $h_i$ is the hidden state label assigned to $x_i$.
      Diagram: $Y$ above $h_1, h_2, h_3$, above $x_1, x_2, x_3$. Conditionals highlighted in turn:
      ◮ $p(h_1 \mid Y, h_2, x_1)$
      ◮ $p(h_2 \mid Y, h_1, h_3, x_2)$
      ◮ $p(h_3 \mid Y, h_2, x_3)$
      ◮ $p(Y \mid \{h_i\}) = p(Y \mid \{x_i\})$
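
For reference, the sequence-level posterior in Quattoni et al. [2007] sums the hidden states out. The potential $\Psi$ and parameters $\theta$ follow that paper's notation and do not appear on the slides; $m$ is the sequence length (3 in the diagrams).

```latex
P(Y \mid x_1, \dots, x_m; \theta)
  = \frac{\sum_{h} \exp \Psi(Y, h, x; \theta)}
         {\sum_{Y'} \sum_{h} \exp \Psi(Y', h, x; \theta)},
\qquad h = (h_1, \dots, h_m)
```

The hidden states are never observed, so they only contribute through these sums.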

  42. HCRF Example
      Suppose we want to find the likelihood of “walking dialogue” (WDlg) vs “walking discussion” (WDisc).
      ◮ Each $x_i$ is now a feature extracted from video frames.
      ◮ Each $h_i$ is determined from training:
         ◮ $h_1$: John wants to hear about my weekend. $p(h_1 \mid Y, h_2, x_1)$
         ◮ $h_2$: I’m feeling talkative. $p(h_2 \mid Y, h_1, h_3, x_2)$
         ◮ $h_3$: Mary wants to listen to her iPod. $p(h_3 \mid Y, h_2, x_3)$
      Diagram: WDlg above $h_1, h_2, h_3$, above $x_1, x_2, x_3$.
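
The WDlg vs WDisc comparison can be sketched by brute force on a three-frame sequence. This is a toy under stated assumptions, not the presented system: the observations are scalars, there are two hidden states, and every weight (`w_obs`, `w_lab`, `w_edge`) is invented for illustration. In practice the weights are learned and inference uses belief propagation rather than enumeration.

```python
import itertools
import math

LABELS = ["WDlg", "WDisc"]
HIDDEN = [0, 1]  # two hypothetical hidden states


def potential(label, hidden, xs, w_obs, w_lab, w_edge):
    """Score one (label, hidden sequence) pair along the chain."""
    s = 0.0
    for i, (h, x) in enumerate(zip(hidden, xs)):
        s += w_obs[h] * x       # how well x_i fits hidden state h_i
        s += w_lab[label][h]    # compatibility of h_i with the sequence label
        if i > 0:               # transition between consecutive hidden states
            s += w_edge[label][(hidden[i - 1], h)]
    return s


def posterior(xs, w_obs, w_lab, w_edge):
    """Return p(Y | x), summing hidden states out by enumeration."""
    totals = {}
    for label in LABELS:
        totals[label] = sum(
            math.exp(potential(label, h, xs, w_obs, w_lab, w_edge))
            for h in itertools.product(HIDDEN, repeat=len(xs))
        )
    z = sum(totals.values())
    return {label: t / z for label, t in totals.items()}


# Invented weights: hidden state 1 fits large observations and favors WDisc.
w_obs = {0: -1.0, 1: 1.0}
w_lab = {"WDlg": {0: 0.5, 1: 0.0}, "WDisc": {0: 0.0, 1: 0.5}}
w_edge = {
    label: {(a, b): (0.3 if a == b else 0.0) for a in HIDDEN for b in HIDDEN}
    for label in LABELS
}

print(posterior([0.9, 1.1, 0.8], w_obs, w_lab, w_edge))
```

Enumeration costs `len(HIDDEN) ** len(xs)` terms per label, which is fine for three frames and is the reason real sequences need dynamic programming.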
