

SLIDE 1

Yue Zheng1, Yi Zhang1, Kun Qian1, Guidong Zhang1 , Yunhao Liu1,3 , Chenshu Wu2, Zheng Yang1

1Tsinghua University 2University of Maryland, College Park 3Michigan State University

Widar3.0

Zero-Effort Cross-Domain Gesture Recognition with Wi-Fi

SLIDE 2

Motivation

  • Human gesture recognition is a core enabler for a wide range of applications.
  • RF radios vs. cameras, wearable devices, and ultrasound:
    – Fewer privacy concerns.
    – No on-body sensors required.
    – More ubiquitous deployment and a larger sensing range.

Applications: Smart Home, Virtual Reality, Security Surveillance.

Wi-Fi is currently widely deployed!

SLIDE 3

State-of-the-Art Works

  • E-eyes (Wang et al., MobiCom’14)
    – a pioneering work that uses the strength distribution of commercial Wi-Fi signals and KNN to recognize human activities.
  • CARM (Wang et al., MobiCom’15)
    – calculates the power distribution of Doppler Frequency Shift (DFS) components as learning features for an HMM model.
  • WIMU (Venkatnarayan et al., MobiSys’18)
    – segments DFS power profiles for multi-person activity recognition.

These works use primitive signal features that usually carry environment information unrelated to gestures.

SLIDE 4

State-of-the-Art Works

  • Explore the cross-domain generalization ability of the recognition model.
    – CrossSense (Zhang et al., MobiCom’18)
    – EI (Jiang et al., MobiCom’18)
  • Generate signal features of the target domain for model re-training.
    – WiAG (Virmani et al., MobiSys’17)

Cross-domain Gesture Recognition. Domain: location, orientation, environment.

All of these require extra training effort each time a new target domain is added to the recognition model.

SLIDE 5

Key Idea

  • Can we avoid extra data collection or model re-training for cross-domain recognition?
    – Yes! We push the generalization ability down to the lower signal level, rather than the upper model level.
    – Extract domain-independent features.
    – Train once, use anywhere.

SLIDE 6

System Overview

C1: How can a domain-independent feature be defined in theory?
C2: How can the feature be estimated in practice from collected Wi-Fi measurements?
C3: How can the recognition model be devised to fully capture the characteristics of the new feature?

SLIDE 7

Our Prior Efforts

  • Widar (MobiHoc’17)
    – models the relation among a person’s walking velocity, location, and DFS, and pinpoints the person passively.
    – achieves decimeter-level accuracy with only one commercial Wi-Fi sender and two receivers.

SLIDE 8

Our Prior Efforts

[Figure: Tx-Rx link geometry showing the LoS path, array baseline, AoA, ToF, DFS, and the reflection ellipse]

  • Widar2.0 (MobiSys’18)
    – proposes a unified model of ToF, AoA, and DFS, and devises an efficient algorithm for their joint estimation.
    – with the fine-grained range and AoA provided by a single link, it directly localizes the moving person at decimeter level.

Prior works regard a person as a single point, which is infeasible for recognizing complex gestures that involve multiple body parts. We need to define a new feature!

SLIDE 9

Anticipated Properties of Signal Features for Finer-Grained Tasks

  • Domain-independent
    – captures only human actions, not domain factors (location, orientation, environment, etc.).
  • Zero-effort
    – requires no model re-training for a new domain.
  • Finer-grained
    – contains multiple signal components that correspond to different body parts.

SLIDE 10

Our Solution

  • BVP: Body-coordinate Velocity Profile
    – The same gesture may exhibit different velocity distributions in the global coordinate system.
    – Transformation into the body coordinate system can be achieved given the locations of the devices and the location and orientation of the user.
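The transformation amounts to a 2-D rotation by the user's orientation. A minimal numpy sketch, assuming the orientation angle is known (function names and the body-frame convention of "forward = +x" are illustrative, not the paper's exact formulation):

```python
import numpy as np

def global_to_body(v_global, orientation_rad):
    """Rotate a 2-D velocity vector from the global coordinate
    system into the user's body coordinate system.

    orientation_rad: the user's facing direction in the global
    frame (assumed known; Widar3.0 obtains the user's location
    and orientation as auxiliary input).
    """
    c, s = np.cos(orientation_rad), np.sin(orientation_rad)
    # Rotating the frame by +theta rotates coordinates by -theta.
    rot = np.array([[c, s],
                    [-s, c]])
    return rot @ np.asarray(v_global, dtype=float)

# A user facing +y (90 degrees) pushing forward (+y in the global
# frame) sees that motion as "forward" (+x) in the body frame:
v_body = global_to_body([0.0, 1.0], np.pi / 2)   # -> approx [1.0, 0.0]
```

Applying this rotation to every velocity component is what makes the resulting profile independent of where the user stands and which way they face.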

SLIDE 11
One-Link DFS and BVP

  • The relation between the DFS profile of the j-th link, E(j), and the vectorized BVP W, which includes multiple velocity components, can be modeled as:

    E(j) = d(j) B(j) W

    – d(j): scaling factor due to propagation loss.
    – B(j): assignment matrix, with B(j)[k, l] = 1 if velocity component v_l projects onto frequency bin f_k of link j, and 0 otherwise.
    – E(j): G × 1, where G is the number of sampling points in the frequency domain.
    – W: O² × 1, where O is the number of sampling points along each axis of the velocity domain.
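The model above can be made concrete with a toy numerical sketch (the sizes and the bin mapping `f_bin` are made up for illustration; the real mapping is determined by the link geometry):

```python
import numpy as np

# Toy sketch of E(j) = d(j) B(j) W for one link.
G = 4          # frequency-domain sampling points
O = 3          # velocity sampling points per axis -> W is O^2 x 1
d_j = 0.5      # scaling factor from propagation loss

# f_bin[l]: frequency bin that velocity component l projects to
# on this link (illustrative values, length O^2 = 9).
f_bin = np.array([0, 2, 2, 1, 3, 0, 1, 2, 3])

# Assignment matrix: B[k, l] = 1 iff velocity bin l maps to
# frequency bin k; exactly one 1 per column.
B = np.zeros((G, O * O))
B[f_bin, np.arange(O * O)] = 1.0

W = np.random.rand(O * O)      # vectorized BVP
E = d_j * B @ W                # predicted DFS profile for this link

# Since each column of B sums to 1, total power is preserved
# up to the scaling factor:
assert np.isclose(E.sum(), d_j * W.sum())
```

Each frequency bin of E thus accumulates the power of every velocity component that happens to project onto it, which is why a single link cannot be inverted on its own.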

SLIDE 12

From Multiple DFS to BVP

  • DFS from one link only depicts radial velocity components. [1]
  • DFS from multiple links is utilized to fully recover the BVP.

[1] Widar, MobiHoc ’17
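The radial-only nature of a single link follows from the standard bistatic Doppler model: the frequency shift depends on the velocity's projection onto the sum of the unit vectors toward Tx and Rx. A sketch under that model (the geometry and the 5.825 GHz carrier are illustrative assumptions):

```python
import numpy as np

def doppler_hz(p, v, tx, rx, f_c=5.825e9, c=3e8):
    """Bistatic Doppler of a reflector at position p (m) moving
    with velocity v (m/s), for a given Tx-Rx link. Standard
    geometric model: DFS is proportional to the rate of change
    of the total reflection path length."""
    p, v, tx, rx = (np.asarray(a, float) for a in (p, v, tx, rx))
    u_tx = (p - tx) / np.linalg.norm(p - tx)
    u_rx = (p - rx) / np.linalg.norm(p - rx)
    dL_dt = v @ (u_tx + u_rx)        # path-length change rate
    return -f_c / c * dL_dt          # shortening path -> positive DFS

# Motion along the line through Tx and Rx keeps the total path
# length constant, so this velocity component is invisible to
# the link -- one link alone cannot recover the full 2-D velocity:
f = doppler_hz(p=[0, 1], v=[1, 0], tx=[-1, 1], rx=[1, 1])   # -> 0.0
```

Links placed at different positions project the same velocity onto different directions, which is what lets multiple links jointly pin down the full BVP.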

SLIDE 13

Problems of BVP Estimation

  • The equation system E(j) = d(j) B(j) W is severely under-determined.
    – DFS profiles from multiple links provide far fewer constraints than the number of variables to be estimated in the BVP.
  • Only a few dominant velocity components exist in each BVP snapshot.

SLIDE 14

Optimization of BVP Estimation

  • We adopt sparse recovery to estimate the BVP.
  • We formulate BVP estimation as an ℓ0 optimization problem:

    min_W  Σ_{j=1..N} EMD(B(j) W, E(j)) + θ‖W‖₀

    – The term θ‖W‖₀ enforces sparsity in the number of velocity components.
    – EMD (Earth Mover’s Distance) resolves the unknown scaling factor caused by the propagation loss of the reflected signal and relieves quantization error in the BVP.
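A toy sketch of evaluating this objective, using the closed-form 1-D EMD (sum of absolute CDF differences) and a nonzero count for the ℓ0 term; the paper's actual solver is not reproduced here, and all sizes and values are illustrative:

```python
import numpy as np

def emd_1d(p, q):
    """EMD between two 1-D distributions on the same grid: the sum
    of absolute CDF differences. Normalizing first cancels the
    unknown scaling factor d(j)."""
    p = p / p.sum()
    q = q / q.sum()
    return np.abs(np.cumsum(p) - np.cumsum(q)).sum()

def objective(W, B_list, E_list, theta=0.1):
    """Sum over links of EMD(B(j) W, E(j)) plus the sparsity
    penalty theta * ||W||_0."""
    fit = sum(emd_1d(B @ W, E) for B, E in zip(B_list, E_list))
    return fit + theta * np.count_nonzero(W)

# One link, G = 3 frequency bins, 4 velocity bins:
B = np.array([[1., 0., 0., 1.],
              [0., 1., 0., 0.],
              [0., 0., 1., 0.]])
E = np.array([0.5, 0.25, 0.25])
W_sparse = np.array([0.5, 0.25, 0.25, 0.0])   # reproduces E exactly
loss = objective(W_sparse, [B], [E])          # 0 fit error + 0.1 * 3
```

Because EMD compares normalized distributions, a candidate W is judged only on the shape of its predicted DFS profile, not its absolute magnitude.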

SLIDE 15

Comparison of Signal Features

  • Investigate raw CSI, DFS, and BVP.
    – Example gesture: Pushing and Pulling.
    – Two domains.

SLIDE 16

Comparison of Signal Features

[Figure: CSI, DFS, and BVP of the same gesture in Domain 1 (orientation #1, location #1, environment #1) and in Domain 2 (orientation #2, location #2, environment #2)]

CSI and DFS of the same gesture are likely to vary across different domains, but the BVP stays consistent!

SLIDE 17

BVP Examples

[Figure: BVP examples for Pushing & Pulling, Clapping, and Sliding]

SLIDE 18

Gesture Recognition Model

  • A hybrid CNN+RNN model is designed to fully capture the characteristics of the BVP.
    – The CNN extracts spatial features from each single BVP snapshot.
    – The GRU captures temporal dependencies among BVP snapshots, and is easier to train with less data.

With the help of the BVP, this simple recognition model is effective.
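The recurrent half can be sketched as a single GRU cell stepped over per-snapshot feature vectors (e.g. flattened CNN outputs). A minimal numpy sketch; the weight shapes, dimensions, and naming are illustrative, not the paper's exact architecture:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, params):
    """One GRU step over a per-snapshot feature vector x and the
    previous hidden state h."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    z = sigmoid(Wz @ x + Uz @ h)               # update gate
    r = sigmoid(Wr @ x + Ur @ h)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))   # candidate state
    return (1 - z) * h + z * h_tilde           # blend old and new

rng = np.random.default_rng(0)
d_in, d_h = 8, 4                               # toy dimensions
# Even indices: input->hidden weights; odd: hidden->hidden.
params = [rng.standard_normal((d_h, d_in)) if i % 2 == 0
          else rng.standard_normal((d_h, d_h)) for i in range(6)]

# Run the GRU across a sequence of T = 5 snapshot features:
h = np.zeros(d_h)
for x in rng.standard_normal((5, d_in)):
    h = gru_step(x, h, params)
# h now summarizes the temporal dynamics of the whole gesture
# and would feed a final classification layer.
```

Compared with an LSTM, the GRU keeps a single state vector and three weight pairs, which is part of why it trains well on small gesture datasets.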

SLIDE 19

Experiment

  • Implementation
    – Mini-desktops with Intel 5300 NICs.
  • Setup
    – 3 scenarios: classroom, hall, office.

[Figure: sensing-area layouts for (a) Classroom, (b) Hall, and (c) Office, each with Tx and Rx devices around the sensing area, five locations (1–5), five orientations (A–E), and grid spacings of 0.5 m, 0.9 m, and 2 m]

SLIDE 20

Overall Accuracy

  • Dataset: 12,000 gesture samples (16 users × 5 positions × 5 orientations × 6 gestures × 5 instances).
  • Gestures: pushing and pulling, sweeping, clapping, sliding, drawing circle, and drawing zigzag.
  • Widar3.0 achieves consistently high accuracy across different domains.

SLIDE 21

Method Comparison

  • Widar3.0 outperforms the state-of-the-art cross-domain learning methodologies.
    – It requires neither extra data from a new domain nor model re-training.
  • BVP outperforms both denoised CSI and DFS.
  • The proposed recognition model is simple but effective with BVP as input.

Comparisons cover different approaches, different inputs, and different learning models.

SLIDE 22

Parameter Study

  • The accuracy increases from 74% to 89% as the number of training users varies from 1 to 7.
    – More data to train the learning model.
    – More likely to reduce behavioral differences between testing and training persons.

Impact of training-set diversity.

SLIDE 23

Conclusion

  • From Widar and Widar2.0 to Widar3.0
    – Widar3.0 aims at recognizing complex gestures that involve multiple body parts, rather than regarding a person as a single point.
  • Zero-effort cross-domain gesture recognition system
    – We propose the domain-independent feature, BVP.
    – With BVP as input, the recognition model requires no extra data collection or model re-training when a new domain is added.
    – With the spatial-temporal characteristics of BVP fully captured, the system achieves high recognition accuracy across different domain factors: specifically, 89.7%, 82.6%, and 92.4% across users' locations, orientations, and environments, respectively.
    – The dataset is publicly available.

SLIDE 24

Data Availability

  • We collected a hand gesture dataset consisting of raw Wi-Fi readings (CSI) and other sophisticated features (e.g., DFS and BVP): 258K instances, 8,620 minutes in total, from 75 domains.
  • The dataset and the Widar series of works can be found at http://tns.thss.tsinghua.edu.cn/widar3.0/index.html

SLIDE 25


Yue Zheng Tsinghua University

cczhengy@gmail.com