In-Home Daily-Life Captioning Using Radio Signals



SLIDE 1

In-Home Daily-Life Captioning Using Radio Signals

Lijie Fan* Tianhong Li* Yuan Yuan Dina Katabi MIT CSAIL

* denotes equal contribution

SLIDE 2

How can I make sure grandma is fine?

SLIDE 3

How can I make sure grandma is fine? Daily Life Captioning

08:30am: Grandma wakes up and leaves bedroom
10:30am: Grandma takes medicine and eats breakfast
02:00pm: Grandma is watching TV

SLIDE 4

Camera is not acceptable

Camera

SLIDE 5

How to do Daily Life Captioning?

SLIDE 6

What about Radio-Frequency (RF) Signals?

RF Device

SLIDE 7

RF signals are privacy-preserving …

RGB Video RF Signals

SLIDE 8

RGB Video RF Signals

but are capable of capturing people’s movements and activities

SLIDE 9

Challenge I. Object Information

SLIDE 12

Solution I. Skeleton + Floormap

RF Signal → Skeleton Generation Network → Skeleton

SLIDE 13

Solution I. Skeleton + Floormap

Floormap Illustration (axes: X, Y)

Labeled items: Bed, Stove, Sink, TV, RF Device, Fridge, Wardrobe, Shelf, Window, Dish Washer, Sofa, Table
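
One way to picture the floormap is as a list of labeled items with X/Y positions, so that a person's location can be related to nearby objects. The sketch below is illustrative only: the item names come from the slide, but every coordinate, the `FloormapItem` type, and both helper functions are hypothetical, and the person's position is assumed to be a floor-plane point derived from the skeleton.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class FloormapItem:
    name: str
    x: float  # position along the X axis (hypothetical units and value)
    y: float  # position along the Y axis (hypothetical units and value)


# Labels are taken from the slide; all coordinates are invented for illustration.
FLOORMAP = [
    FloormapItem("Bed", 1.0, 4.5),
    FloormapItem("Wardrobe", 0.5, 2.5),
    FloormapItem("Shelf", 2.5, 0.5),
    FloormapItem("Window", 3.0, 5.0),
    FloormapItem("Sofa", 4.0, 4.0),
    FloormapItem("TV", 4.0, 1.0),
    FloormapItem("RF Device", 5.5, 0.0),
    FloormapItem("Table", 6.5, 3.0),
    FloormapItem("Fridge", 8.0, 0.5),
    FloormapItem("Stove", 9.0, 2.0),
    FloormapItem("Sink", 9.0, 3.5),
    FloormapItem("Dish Washer", 9.0, 4.5),
]


def nearest_item(px, py, floormap=FLOORMAP):
    """Return (item, distance) for the floormap item closest to (px, py)."""
    item = min(floormap, key=lambda it: hypot(it.x - px, it.y - py))
    return item, hypot(item.x - px, item.y - py)


def relative_offsets(px, py, floormap=FLOORMAP):
    """Offsets from a person's position to every item, keyed by item name."""
    return {it.name: (it.x - px, it.y - py) for it in floormap}
```

For example, `nearest_item(8.8, 2.1)` picks the stove in this made-up layout, which is the kind of object cue the RF signal alone does not provide.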

SLIDE 14

Challenge II. No Existing RF Captioning Dataset!

Can We Leverage Existing RGB Captioning Dataset?

SLIDE 15

Solution II. Multi-modal Feature Alignment

RF+Floormap Feature Extraction: RF Signal + Floormap → Feature Extraction Network → $\mathbf{u}^P$

SLIDE 16

Solution II. Multi-modal Feature Alignment

RF+Floormap Feature Extraction: RF Signal + Floormap → Feature Extraction Network → $\mathbf{u}^P$

Video Feature Extraction: Paired Video $\mathbf{X}^P$ → Video Encoder → $\mathbf{v}_m^P$, $\mathbf{v}_n^P$ (Spatial Pooling)

SLIDE 17

Solution II. Multi-modal Feature Alignment

RF+Floormap Feature Extraction: RF Signal + Floormap → Feature Extraction Network → $\mathbf{u}^P$

Video Feature Extraction: Paired Video $\mathbf{X}^P$ → Video Encoder → $\mathbf{v}_m^P$, $\mathbf{v}_n^P$ (Spatial Pooling)

Paired Data Alignment Loss: $\mathcal{L}_{\text{pair}}$ ($L_2$)
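
The paired alignment loss on this slide is labeled as an L2 loss between RF+floormap features and features of the synchronized video. A minimal sketch of such a loss follows, assuming both branches emit feature vectors of equal length; the function name, the squared form, and the averaging over pairs are assumptions, since the slide gives only the label.

```python
def l2_alignment_loss(rf_feats, video_feats):
    """Mean squared L2 distance between paired feature vectors.

    rf_feats and video_feats are equal-length lists of equal-length
    float vectors; entry i of each list comes from the same clip.
    """
    if not rf_feats or len(rf_feats) != len(video_feats):
        raise ValueError("need the same non-zero number of RF and video features")
    total = 0.0
    for u, v in zip(rf_feats, video_feats):
        if len(u) != len(v):
            raise ValueError("paired features must have the same dimension")
        # Squared Euclidean distance for one RF/video pair.
        total += sum((a - b) ** 2 for a, b in zip(u, v))
    return total / len(rf_feats)
```

Minimizing this value pulls each RF+floormap feature toward the feature of its paired video, which is what would let a captioning model trained on video features be reused.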

SLIDE 18

Solution II. Multi-modal Feature Alignment

RF+Floormap Feature Extraction: RF Signal + Floormap → Feature Extraction Network → $\mathbf{u}^P$

Video Feature Extraction:
Paired Video $\mathbf{X}^P$ → Video Encoder → $\mathbf{v}_m^P$, $\mathbf{v}_n^P$ (Spatial Pooling)
Unpaired Video $\mathbf{X}^U$ → Video Encoder → $\mathbf{v}_m^U$, $\mathbf{v}_n^U$ (Spatial Pooling)

Paired Data Alignment Loss: $\mathcal{L}_{\text{pair}}$ ($L_2$)

SLIDE 19

Solution II. Multi-modal Feature Alignment

RF+Floormap Feature Extraction: RF Signal + Floormap → Feature Extraction Network → $\mathbf{u}^P$

Video Feature Extraction:
Paired Video $\mathbf{X}^P$ → Video Encoder → $\mathbf{v}_m^P$, $\mathbf{v}_n^P$ (Spatial Pooling)
Unpaired Video $\mathbf{X}^U$ → Video Encoder → $\mathbf{v}_m^U$, $\mathbf{v}_n^U$ (Spatial Pooling)

Paired Data Alignment Loss: $\mathcal{L}_{\text{pair}}$ ($L_2$)

Unpaired Data Alignment Loss: $\mathcal{L}_{\text{unpair}}$ ($D_n$, $D_m$)
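
The slide does not spell out how the unpaired loss is computed; it shows only two blocks whose labels appear to be D with subscripts. If those blocks are read as discriminators (an assumption), a generic adversarial alignment objective could look like the sketch below. The linear discriminator, its weights, and all function names are hypothetical and are not taken from the presentation.

```python
from math import exp, log

EPS = 1e-12  # guards against log(0)


def discriminate(feature, weights, bias):
    """Hypothetical linear discriminator: probability that a feature came from video."""
    z = sum(w * f for w, f in zip(weights, feature)) + bias
    return 1.0 / (1.0 + exp(-z))


def discriminator_loss(video_feats, rf_feats, weights, bias):
    """Binary cross-entropy with video features labeled 1 and RF features labeled 0."""
    loss_video = -sum(log(discriminate(v, weights, bias) + EPS)
                      for v in video_feats) / len(video_feats)
    loss_rf = -sum(log(1.0 - discriminate(u, weights, bias) + EPS)
                   for u in rf_feats) / len(rf_feats)
    return loss_video + loss_rf


def rf_alignment_loss(rf_feats, weights, bias):
    """Loss for the RF branch: low when RF features are scored as video-like."""
    return -sum(log(discriminate(u, weights, bias) + EPS)
                for u in rf_feats) / len(rf_feats)
```

In this reading, the discriminator is trained to tell the two feature sources apart while the RF branch is trained to fool it, so no clip-level pairing between the unpaired videos and the RF data is needed.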

SLIDE 20

RF-Diary System Structure

SLIDE 21

RF-Diary can caption people’s daily life in home …

Panels: RF Signals, Floormap, RGB Video, RF-Caption

RF-Caption: "A person enters the kitchen. He takes off his clothes, sits at table and starts playing laptop."

SLIDE 22

Even when the light is off …

Panels: RF Signals, Floormap, RGB Video, RF-Caption

RF-Caption: "A person walks to the kitchen. He then pours water into a cup and drinks from it."

Not Applicable

SLIDE 23

Quantitative Results

SLIDE 24

Summary

  • RF-Diary enables captioning people’s daily life in their home.
  • RF-Diary uses radio signals as input to address the privacy issues of cameras.
  • RF-Diary achieves results comparable to camera-based captioning and keeps working under poor lighting or occlusion.

SLIDE 25

For more information, please visit our webpage:

http://rf-diary.csail.mit.edu