SLIDE 1

Multimodal Corpus for Integrated Language and Action

Rishabh Nigam, 10598, Cognitive Sciences

SLIDES 2–6

Multimodal Corpus for Integrated Language and Action

◮ Abstract: Data were collected from audio, video, Kinect, and RFID tags, and the raw data were augmented with annotations for the actions performed. The action in this case is making a cup of tea.

◮ Goal: Cognitive assistance for everyday tasks.

◮ Related work: The CMU Multi-Modal Activity Database (2009) is a corpus of recorded and annotated video, audio, and motion-capture data of subjects cooking recipes in a kitchen. [1]

◮ Difference: Here we also include 3-D data from a Kinect, the subject verbally describes what he or she is doing, and annotations are attached to each action performed.

SLIDES 7–11

Equipment used

◮ Audio – three microphones – to capture the subject's spoken description of the task he/she is performing.

◮ Video – HD video recordings.

◮ Kinect – RGB + depth data.

◮ RFID tags: The subject wore an RFID-sensing iBracelet, which records the RFID tag closest to the wrist at any time. Sensors attached to kitchen appliances give better data on which instrument is being used.

◮ Power consumption: An electric kettle is used, and its power consumption tells us whether the kettle is on or off (a sketch of this follows below).
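
A minimal sketch of how the kettle's on/off state could be inferred from a power-consumption trace. The 1500 W threshold and the (timestamp, watts) sample format are illustrative assumptions; the slides only state that power consumption indicates whether the kettle is on.

```python
# Hypothetical sketch: infer kettle on/off intervals from a power-consumption trace.
# The 1500 W threshold and the (timestamp_seconds, watts) input format are assumptions;
# the corpus description only says power consumption indicates whether the kettle is on.

ON_THRESHOLD_WATTS = 1500.0  # electric kettles draw well above this while heating

def kettle_intervals(samples, threshold=ON_THRESHOLD_WATTS):
    """Return (start, end) intervals during which the kettle is considered ON."""
    intervals, start = [], None
    for t, watts in samples:
        if watts >= threshold and start is None:
            start = t                      # rising edge: kettle switched on
        elif watts < threshold and start is not None:
            intervals.append((start, t))   # falling edge: kettle switched off
            start = None
    if start is not None:                  # still on at the end of the trace
        intervals.append((start, samples[-1][0]))
    return intervals

# Toy trace with two heating episodes
trace = [(0, 2.0), (5, 1800.0), (40, 1795.0), (90, 3.0), (120, 1802.0), (180, 2.5)]
print(kettle_intervals(trace))  # [(5, 90), (120, 180)]
```
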

SLIDES 12–14

Annotations

◮ The audio data were transcribed and the transcription was segmented into utterances. Pauses in the speech were used to mark the ends of sentences; when the utterances were not complete sentences, longer pauses were used instead (see the segmentation sketch after this slide group).

◮ A parser using semantic lexicons then produces the logical form, the semantic representation of the language.

◮ The Interpretation Manager (IM) was used to extract a concise event description from each clause, derived from the main verb and its arguments, e.g. "Place tea bag in the cup" => PUT THE TEA BAG INTO THE CUP.
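
A minimal sketch of pause-based utterance segmentation as described above, assuming word-level (word, start, end) timings are available; the 0.5 s pause threshold is an illustrative assumption, not a value reported in the paper.

```python
# Hypothetical sketch of pause-based utterance segmentation.
# Assumes word-level timings (word, start_sec, end_sec), e.g. from a forced alignment;
# the 0.5 s threshold is illustrative. The paper additionally falls back to longer
# pauses when an utterance is not a complete sentence.

def segment_utterances(words, pause=0.5):
    """Split (word, start, end) timings into utterances at pauses >= `pause` seconds."""
    utterances, current = [], []
    for i, (word, start, end) in enumerate(words):
        current.append(word)
        next_start = words[i + 1][1] if i + 1 < len(words) else None
        if next_start is None or next_start - end >= pause:
            utterances.append(" ".join(current))  # pause (or end of data) closes the utterance
            current = []
    return utterances

words = [("place", 0.0, 0.3), ("the", 0.35, 0.45), ("tea", 0.5, 0.7), ("bag", 0.75, 1.0),
         ("in", 1.8, 1.9), ("the", 1.95, 2.05), ("cup", 2.1, 2.4)]
print(segment_utterances(words))  # ['place the tea bag', 'in the cup']
```
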

SLIDES 15–16

Annotations (continued)

◮ To learn names for the detected RFID IDs from the audio description, we gather the nouns mentioned by the subject, convert them into ontological concepts using the parse data, and choose the concept with the highest probability of being mentioned when that ID is detected (a sketch of this mapping follows below).
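
A minimal sketch of the co-occurrence counting this step implies: for each RFID tag ID, count which ontological concepts are mentioned while that tag is detected and keep the most frequent one. The data layouts and names are assumptions for illustration; the paper describes the idea but not an implementation.

```python
# Hypothetical sketch: map each RFID tag ID to the concept most often mentioned
# while that tag is detected. The input formats are assumed for illustration.

from collections import Counter, defaultdict

def label_tags(detections, mentions):
    """
    detections: (tag_id, start_sec, end_sec) intervals reported by the iBracelet.
    mentions:   (concept, time_sec) pairs from nouns mapped to ontological concepts.
    Returns {tag_id: most frequently co-occurring concept}.
    """
    counts = defaultdict(Counter)
    for tag_id, start, end in detections:
        for concept, t in mentions:
            if start <= t <= end:          # concept mentioned while this tag is detected
                counts[tag_id][concept] += 1
    return {tag_id: c.most_common(1)[0][0] for tag_id, c in counts.items() if c}

detections = [("tag_17", 3.0, 9.0), ("tag_42", 12.0, 20.0)]
mentions = [("TEABAG", 4.2), ("CUP", 5.0), ("TEABAG", 7.5), ("KETTLE", 13.1), ("KETTLE", 18.0)]
print(label_tags(detections, mentions))  # {'tag_17': 'TEABAG', 'tag_42': 'KETTLE'}
```
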

SLIDE 17

Results

◮ Although only a small amount of data was collected, the labels generated by the algorithm agreed with a human annotator, who used the video to determine the mappings, for six of the eight tags.

SLIDE 18

References

[1] The CMU Multi-Modal Activity Database, http://kitchen.cs.cmu.edu/

[2] Mary Swift, George Ferguson, Lucian Galescu, Yi Chu, Craig Harman, Hyuckchul Jung, Ian Perera, Young Chol Song, James Allen, and Henry Kautz, "A multimodal corpus for integrated language and action", Department of Computer Science, University of Rochester, Rochester, NY 14627.