
Taking Synchrony Seriously: A Perceptual-Level Model of Infant Synchrony Detection



  1. Taking Synchrony Seriously: A Perceptual-Level Model of Infant Synchrony Detection
     Christopher G. Prince, George J. Hollich, Nathan A. Helder, Eric J. Mislivec, Anoop Reddy, Sampanna Salunke, & Naveed Memon
     Department of Computer Science, University of Minnesota Duluth, Duluth, MN USA
     Department of Psychological Sciences, Purdue University, West Lafayette, IN USA
     chris@cprince.com
     25 August 2004
     http://www.cprince.com/PubRes/EpiRob04

  2. Outline of Talk
     - Types of Synchrony Detection
     - A Model of Synchrony Detection
     - Comparison to Infant Behavior
     - Conclusions

  3. Acknowledgements
     - Collaborators
       - Lakshmi Gogate
     - Students
       - Soleh Dib, Tyrel Pollak
       - Tim Colburn's CS 4531 software engineering class
     - Colleagues
       - Rocio Alba-Flores, Kang James
     - Supported in part by UROP grants and by a donation from Digi-Key

  4. Types of Audio-Visual Synchrony Detection
     - Punctuate speech-object synchrony
       - Two-month-olds can detect it (Gogate et al., 2004)
     - Face-voice synchrony
       - 10- to 16-week-old infants (Dodd, 1979)
     - Talker with distractor
       - E.g., cocktail party (Hollich et al., in press)
     - Multiple visual events
       - E.g., multiple talkers (Pickens et al., 1994; Hollich & Prince, in progress)

  5. Research Question
     - Can a single general-purpose synchrony detection mechanism, estimating audio-visual synchrony from low-level signal features, account for infant synchrony detection across a broad range of audio-visual speech integration tasks?

  6. Hershey & Movellan (2000)
     - Computes mutual information between two sensory channels over a time window (length S)
     - Assumes Gaussian-distributed sensory signals
     - Synchrony defined as mutual information between sensory channels

     $$M(x, y, t_k) = \frac{1}{2} \log_2 \left( \frac{|\Sigma_A(t_k)| \, |\Sigma_V(x, y, t_k)|}{|\Sigma_{A,V}(x, y, t_k)|} \right) \cdot \mathrm{audioDampening}$$

     For other approaches see: http://www.cprince.com/PubRes/Zurich04

  7. Synchrony Detection with HM
     - HM algorithm
       - Generates mixelgrams
       - Each pixel of the mixelgram is a mixel, a mutual information pixel
     - SenseStream program: mixels computed from mutual information between audio and visual channels (Mislivec, 2004)
     - Perceptually relevant mixelgrams typically indicate synchrony between the two input channels (Vuppla, 2004)

  8. Calculation for Each Mixel
     - Frames from one channel consist of single vectors of n elements, here processed audio features
     - Frames from the other channel consist of h x w m-element vectors, here visual image features
     [Figure: over S frames, audio vectors with elements 1…n and an h x w grid of visual vectors with elements 1…m]
     - Audio covariance matrix (n x n): $\Sigma_A(t_k)$
     - Visual covariance matrix (m x m): $\Sigma_V(x, y, t_k)$
     - Joint covariance matrix ((n+m) x (n+m)): $\Sigma_{A,V}(x, y, t_k)$

     $$M(x, y, t_k) = \frac{1}{2} \log_2 \left( \frac{|\Sigma_A(t_k)| \, |\Sigma_V(x, y, t_k)|}{|\Sigma_{A,V}(x, y, t_k)|} \right)$$
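
The per-mixel calculation can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the SenseStream implementation: the function names (`covariance`, `determinant`, `mixel_value`) are hypothetical, the unbiased sample covariance is an assumed estimator, a degenerate window is assumed to yield 0, and audio dampening is left out.

```python
import math

def covariance(rows):
    """Sample covariance matrix of S row vectors (one row per frame)."""
    s = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / s for j in range(d)]
    return [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (s - 1)
             for b in range(d)] for a in range(d)]

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def mixel_value(audio_frames, visual_frames):
    """Gaussian mutual information (in bits) between S audio vectors (n elements)
    and the S visual vectors (m elements) observed at one pixel location."""
    joint = [a + v for a, v in zip(audio_frames, visual_frames)]
    det_a = determinant(covariance(audio_frames))
    det_v = determinant(covariance(visual_frames))
    det_av = determinant(covariance(joint))
    if det_a <= 0 or det_v <= 0 or det_av <= 0:
        return 0.0  # assumption: a degenerate window counts as no synchrony
    return 0.5 * math.log2(det_a * det_v / det_av)

# Toy data (invented for illustration): S = 4 frames, n = m = 1.
audio = [[1.0], [2.0], [3.0], [4.0]]
in_sync = [[1.1], [1.9], [3.2], [3.8]]       # tracks the audio closely
unrelated = [[1.0], [-1.0], [-1.0], [1.0]]   # uncorrelated with the audio
print(mixel_value(audio, in_sync), mixel_value(audio, unrelated))
```

Calling `mixel_value` once per (x, y) location over the same S-frame window would fill in one mixelgram; a pixel whose features co-vary with the audio gets a larger value than one whose features do not.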

  9. Audio Dampening
     - We use an additional term on the HM equation to dampen mutual information outputs when audio is "sub-audible"

     $$M(x, y, t_k) = \frac{1}{2} \log_2 \left( \frac{|\Sigma_A(t_k)| \, |\Sigma_V(x, y, t_k)|}{|\Sigma_{A,V}(x, y, t_k)|} \right) \cdot \mathrm{audioDampening}$$

     - audioDampening appears to have the form (1 - 1/2^e), where the exponent e combines r and the threshold (the operator joining them, and the threshold's symbol, are illegible in this transcript)
     - r = max RMS audio value over the S interval
     - The threshold is fixed at 50

  10. SenseStream Program Running
      [Videos: "Original Video" and "SenseStream Running On Video"]

  11. Quantitative Analysis of Synchrony
      - HM algorithm outputs mixelgrams
        - Qualitative
        - Depict synchrony graphically
      - Also useful to reduce mixelgrams to scalars
        - Quantitative synchrony analysis

  12. Idea: Connected Regions
      [Videos: "Original Video" and "SenseStream Running On Video"]

  13. Connected Region Analysis
      - Compute the variance in sizes of connected regions per mixelgram. Nonzero mixels i and j are said to be connected when j is one of the eight-neighbors of i (edge mixels have fewer neighbors), and

        $$\max\left( \frac{M(i)}{M(j)}, \frac{M(j)}{M(i)} \right) < \mathrm{Threshold}$$

        applies, where M(mixel) is the value of the mixel and Threshold = 1.125.
      - Connected regions are the spatial extent of pairs of mixels that are connected.
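
The connected-region measure can be sketched in plain Python as follows. This is an illustrative reading of the slide rather than the authors' code. The function names are hypothetical, and three details are assumptions: connections are followed transitively (breadth-first search), an isolated nonzero mixel counts as a region of size 1, and the variance is the population variance.

```python
from collections import deque
from statistics import pvariance

THRESHOLD = 1.125  # value given on the slide

def connected(a, b, threshold=THRESHOLD):
    """Two nonzero mixel values are connected when their ratio is close to 1."""
    if a == 0 or b == 0:
        return False
    return max(a / b, b / a) < threshold

def region_sizes(mixelgram, threshold=THRESHOLD):
    """Sizes of connected regions, grown by breadth-first search over eight-neighbors."""
    h, w = len(mixelgram), len(mixelgram[0])
    seen = [[False] * w for _ in range(h)]
    sizes = []
    for y in range(h):
        for x in range(w):
            if seen[y][x] or mixelgram[y][x] == 0:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            size = 0
            while queue:
                cy, cx = queue.popleft()
                size += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (dy or dx) and 0 <= ny < h and 0 <= nx < w \
                                and not seen[ny][nx] \
                                and connected(mixelgram[cy][cx], mixelgram[ny][nx], threshold):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            sizes.append(size)
    return sizes

def region_size_variance(mixelgram, threshold=THRESHOLD):
    """Scalar summary of one mixelgram: variance of its connected-region sizes."""
    sizes = region_sizes(mixelgram, threshold)
    return pvariance(sizes) if sizes else 0.0

# Toy mixelgram (invented values): one three-mixel region and one isolated mixel.
toy = [
    [1.0, 1.05, 0.0, 0.0],
    [1.1, 0.0,  0.0, 5.0],
    [0.0, 0.0,  0.0, 0.0],
]
print(region_sizes(toy), region_size_variance(toy))
```

Mixel values are assumed to be non-negative here, as mutual information estimates are; negative inputs would need separate handling before the ratio test.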

  14. Edge Detection Method
      - Another synchrony estimation method uses general-purpose image processing
      - Relies on a similar observation to that of connected region analysis
      - With mixelgram M,

        $$\sum_{i=1}^{h \times w} \mathrm{Sobel}_{3 \times 3}(\mathrm{Gaussian}_{15 \times 15}(M))$$

      - Generally better results than with connected region analysis
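
The edge-based score can be sketched in plain Python as follows. The slide gives only the filter sizes, so several details here are assumptions: the Gaussian sigma (2.5), zero padding at the borders, and combining the horizontal and vertical Sobel responses into a gradient magnitude before summing. All function names are hypothetical.

```python
import math

def gaussian_kernel(size=15, sigma=2.5):
    """Normalized size x size Gaussian kernel; sigma is an assumed value."""
    half = size // 2
    k = [[math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2 * sigma ** 2))
          for x in range(size)] for y in range(size)]
    total = sum(map(sum, k))
    return [[v / total for v in row] for row in k]

def filter2d(image, kernel):
    """2-D correlation with zero padding outside the image (assumed border rule)."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oy, ox = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for j in range(kh):
                yy = y + j - oy
                if not 0 <= yy < h:
                    continue
                for i in range(kw):
                    xx = x + i - ox
                    if 0 <= xx < w:
                        acc += image[yy][xx] * kernel[j][i]
            out[y][x] = acc
    return out

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def edge_score(mixelgram, sigma=2.5):
    """Sum over all h*w positions of the Sobel gradient magnitude
    of the Gaussian-smoothed mixelgram."""
    smooth = filter2d(mixelgram, gaussian_kernel(15, sigma))
    gx = filter2d(smooth, SOBEL_X)
    gy = filter2d(smooth, SOBEL_Y)
    return sum(math.hypot(a, b) for ra, rb in zip(gx, gy) for a, b in zip(ra, rb))

# Toy mixelgrams (invented): an empty one and one with a 4 x 4 block of activity.
flat = [[0.0] * 12 for _ in range(12)]
blob = [row[:] for row in flat]
for y in range(4, 8):
    for x in range(4, 8):
        blob[y][x] = 1.0
print(edge_score(flat), edge_score(blob))
```

An image-processing library would normally replace `filter2d`; the hand-written loop only keeps the sketch dependency-free.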
