Yanjie Fu
Toward Automated Pattern Discovery: Deep Representation Learning with Spatial-Temporal-Networked Data
—Collective, Dynamic, and Structured Analysis
Outline
5
¨ Background and Motivation
¨ Collective Representation Learning
¨ Dynamic Representation Learning
¨ Structured Representation Learning
¨ Conclusions and Future Work
6
¨ Spatial, Temporal, and Networked (STN) data can be collected from a variety of sources: IoT, GPS, wireless sensors, mobile apps, grid net load, tagged photos (Flickr), and check-ins (Foursquare, Yelp)
[Diagram: sensing bridges the Cyber World and the Physical World]
7
[Images: taxicab GPS traces, phone traces, mobile check-ins, bus traces]
These data represent the spatial, temporal, social, and semantic contexts of dynamic human/system behaviors within and across regions
8
¨ User Profiling & Recommendation Systems
¨ Intelligent Transportation Systems
¨ Personalized and Intelligent Education
¨ Smart Health Care
¨ City Governance and Emergency Management
¨ Solar Analytics for Energy Saving
¨ Spatiotemporally non-i.i.d.
n Spatial autocorrelation
n Spatial heterogeneity
n Sequential asymmetric patterns
n Temporal periodicity and dependency
9
[Figure: spatial autocorrelations, temporal periodic patterns, sequential asymmetric transitions, and spatial heterogeneity]
10
¨ Networked over time
¨ Collectively-related
¨ Heterogeneous
n Multi-source
n Multi-view
n Multi-modality
¨ Semantically-rich
n Trajectory semantics
n User semantics
n Event semantics
n Region semantics
¨ Feature identification and quantification
11
[Figure: classic machine learning pipeline: input → pattern/feature extraction → classification/clustering → output (car vs. not car)]
¨ Multi-source unbalanced data fusion: can a simple weighted combination handle unbalanced data?
12
¨ Field data/real-world systems usually lack labels (e.g., consumed electricity and solar-generated electricity are unknown)
13
14
Task-specific (end-to-end) deep learning: input → feature extraction + classification/clustering → output (car vs. not car)
Generic deep learning: input → unsupervised pattern (feature/representation) learning → classification/clustering → output
Generic deep learning addresses: automated feature learning, feature learning from multi-source data, and the lack of labels
¨ Classic algorithms are not directly applicable; they must be combined with networked data regularities
n Regression + spatial properties = spatial autoregression method
n Clustering + spatial properties = spatial co-location method
¨ How should representation learning be combined with the regularities of spatiotemporal networked data?
15
Data Regularity-aware Unsupervised Representation Learning
16
Human and system behaviors have spatiotemporal and social regularities → data regularity-aware representation learning
Regularities of spatiotemporal networked data
[Figure: generic deep learning pipeline (car vs. not car) extended with data regularities: automated feature learning, feature learning from multi-source data, lack of labels, data regularities]
17
¨ Collective representation learning with multi-view data
¨ Dynamic representation learning with stream data
¨ Structured representation learning with global and sub-structure preservation
Automated Feature Learning from Spatial-Temporal-Networked Data
18
¨ Background and Motivation
¨ Deep Collective Representation Learning
¨ Deep Dynamic Representation Learning
¨ Deep Structured Representation Learning
¨ Conclusion and Future Work
19
¨ Consumer City Theory: Edward L. Glaeser (2001), Harvard University
¨ More by Nathan Schiff (2014), University of British Columbia; Victor Couture (2014), UC Berkeley; Yan Song (2014), UNC Chapel Hill
¨ Spatial characters: walkable, dense, compact, diverse, accessible, connected, mixed-use, etc.
¨ Socio-economic characters: willingness to pay, intensive social interactions, attracting talented workers and cutting-edge firms, etc.
What are the underlying driving forces of a vibrant community?
Supported by NSF CISE pre-CAREER award (III-1755946)
20
¨ Mobile check-in data
¨ Frequency and diversity of mobile check-ins
Diversity is the entropy of the activity-type distribution:
$$\mathrm{Div} = -\sum_{type} \frac{\#(\mathrm{checkin},\, type)}{\#(\mathrm{checkin})} \log \frac{\#(\mathrm{checkin},\, type)}{\#(\mathrm{checkin})},$$
where type denotes the activity type of mobile users
¨ Fused scoring:
$$\mathrm{Vibrancy} = \frac{\mathrm{Fre} \cdot \mathrm{Div}}{\alpha \cdot \mathrm{Fre} + \mathrm{Div}}$$
[Table: community rankings by vibrancy score across activity categories: shopping, transport, dining, travel, lodging]
Urban vibrancy is reflected by the frequency and diversity of user activities.
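As a rough illustration of this scoring, here is a minimal Python sketch: the frequency is the check-in count, and the diversity is the entropy of the activity-type distribution from the formula above. The harmonic-style fusion and the `alpha` parameter are assumptions reconstructed from the slide, not necessarily the authors' exact formulation.

```python
from collections import Counter
from math import log

def vibrancy_score(checkin_types, alpha=1.0):
    """checkin_types: one activity-type label per check-in in a community."""
    frequency = len(checkin_types)                     # Fre: total check-ins
    if frequency == 0:
        return 0.0
    counts = Counter(checkin_types)
    # Div: entropy of the activity-type distribution, -sum p * log p
    diversity = -sum((c / frequency) * log(c / frequency) for c in counts.values())
    # Assumed fused form: Fre * Div / (alpha * Fre + Div)
    denom = alpha * frequency + diversity
    return frequency * diversity / denom if denom > 0 else 0.0

# Example: a community with check-ins spread across several activity types
print(vibrancy_score(["Shopping", "Transport", "Dining", "Shopping", "Travel"]))
```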
Spatial Unbalance of Urban Community Vibrancy
21
Motivation Application: How to Quantify Spatial Configurations and Social Interactions
22
Urban Community = Spatial Configuration (static element) + Social Interactions (dynamic element)
¨ POIs → nodes
¨ Human mobility connectivity between two POIs → edge weights
¨ Edge weights are asymmetric
23
¨ Different days/hours → different periodic mobility patterns → different graph structures
24
Collective Representation Learning with Multi-view Graphs
25
[Diagram: multiple graphs → feature vector representations of spatial objects (e.g., regions)]
Constraint: the multi-view graphs are collaboratively related
¨ The encoding-decoding representation learning paradigm: encode the input graphs into an embedding, then decode the embedding into reconstructed graphs
26
[Diagram: autoencoder over an input matrix D = (d1, d2, ..., dN)]
Solving Multi-graph Inputs: An Ensemble-Encoding Dissemble-Decoding Method
27
[Architecture: one NN as an input unit per graph view and one NN as an output unit of the decoder; signal ensemble (multi-perceptron summation) and signal dissemble (multi-perceptron filtering); minimize reconstruction loss]
28
Ensemble encoding:
$$
\begin{cases}
y^{(k),1}_{i,t} = \sigma(W^{(k),1}_{i,t}\, p^{(k)}_{i,t} + b^{(k),1}_{i,t}), & \forall t \in \{1, 2, \dots, 7\},\\
y^{(k),r}_{i,t} = \sigma(W^{(k),r}_{i,t}\, y^{(k),r-1}_{i,t} + b^{(k),r}_{i,t}), & \forall r \in \{2, 3, \dots, o\},\\
y^{(k),o+1}_{i} = \sigma\big(\sum_t W^{(k),o+1}_{t}\, y^{(k),o}_{i,t} + b^{(k),o+1}\big),\\
z^{(k)}_{i} = \sigma(W^{(k),o+2}\, y^{(k),o+1}_{i} + b^{(k),o+2}).
\end{cases}
$$

Dissemble decoding:
$$
\begin{cases}
\hat{y}^{(k),o+1}_{i} = \sigma(\hat{W}^{(k),o+2}\, z^{(k)}_{i} + \hat{b}^{(k),o+2}),\\
\hat{y}^{(k),o}_{i,t} = \sigma(\hat{W}^{(k),o+1}_{t}\, \hat{y}^{(k),o+1}_{i} + \hat{b}^{(k),o+1}_{t}),\\
\hat{y}^{(k),r-1}_{i,t} = \sigma(\hat{W}^{(k),r}_{i,t}\, \hat{y}^{(k),r}_{i,t} + \hat{b}^{(k),r}_{i,t}), & \forall r \in \{2, 3, \dots, o\},\\
\hat{p}^{(k)}_{i,t} = \sigma(\hat{W}^{(k),1}_{i,t}\, \hat{y}^{(k),1}_{i,t} + \hat{b}^{(k),1}_{i,t}).
\end{cases}
$$

Reconstruction loss:
$$
L^{(k)} = \sum_{t \in \{1, 2, \dots, 7\}} \sum_i \big\| (p^{(k)}_{i,t} - \hat{p}^{(k)}_{i,t}) \odot v^{(k)}_{i,t} \big\|_2^2
$$
¨ Ensemble encoding: ensemble the multi-graph inputs via multi-perceptron summation into one embedding
¨ Dissemble decoding: dissemble the embedding via multi-perceptron filtering into per-graph reconstructions
¨ Sparsity regularization via the weight vector v: if mobility connectivity = 0, weight = 1; if mobility connectivity > 0, weight > 1, to penalize reconstruction errors on observed connections
¨ Minimize the reconstruction loss
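A compact PyTorch sketch of this ensemble-encoding dissemble-decoding design, under assumed dimensions and a single perceptron layer per view; `EnsembleDissembleAE`, `beta`, and all sizes are illustrative names and values, not the authors' implementation.

```python
import torch
import torch.nn as nn

class EnsembleDissembleAE(nn.Module):
    def __init__(self, n_views=7, in_dim=100, hid_dim=64, emb_dim=32):
        super().__init__()
        # One perceptron stack per view ("NN as an input/output unit")
        self.view_enc = nn.ModuleList(
            [nn.Sequential(nn.Linear(in_dim, hid_dim), nn.Sigmoid()) for _ in range(n_views)])
        self.ensemble = nn.ModuleList(  # W_t^{(k),o+1}: summed across views
            [nn.Linear(hid_dim, hid_dim, bias=False) for _ in range(n_views)])
        self.to_emb = nn.Sequential(nn.Linear(hid_dim, emb_dim), nn.Sigmoid())
        self.from_emb = nn.Sequential(nn.Linear(emb_dim, hid_dim), nn.Sigmoid())
        self.dissemble = nn.ModuleList(  # \hat{W}_t^{(k),o+1}: per-view filtering
            [nn.Linear(hid_dim, hid_dim) for _ in range(n_views)])
        self.view_dec = nn.ModuleList(
            [nn.Sequential(nn.Linear(hid_dim, in_dim), nn.Sigmoid()) for _ in range(n_views)])

    def forward(self, views):  # views: list of (batch, in_dim) tensors, one per day
        hidden = [enc(p) for enc, p in zip(self.view_enc, views)]
        ensembled = torch.sigmoid(sum(w(y) for w, y in zip(self.ensemble, hidden)))
        z = self.to_emb(ensembled)                 # joint region embedding z^{(k)}
        shared = self.from_emb(z)
        recons = [dec(torch.sigmoid(f(shared)))    # per-view reconstructions \hat{p}
                  for f, dec in zip(self.dissemble, self.view_dec)]
        return z, recons

def weighted_recon_loss(views, recons, beta=5.0):
    # Sparsity weights v: 1 where connectivity is 0, 1 + beta where it is > 0
    loss = 0.0
    for p, p_hat in zip(views, recons):
        v = torch.where(p > 0, torch.full_like(p, 1.0 + beta), torch.ones_like(p))
        loss = loss + (((p - p_hat) * v) ** 2).sum()
    return loss
```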
Comparisons with Features Generated By Different Methods
29
¨ Data
¨ Ranking Models: including a boosting method that trains multiple weak rankers and combines their outputs as the final ranking, and a neural ranker that optimizes an underlying probabilistic cost function
¨ Feature Sets (variants): distance graphs instead of mobility graphs; averaged instead of collective; non-weighted instead of unsupervised weighted
¨ Evaluation Criteria
NDCG@N (N = 5, 10, 15, 20)
[Figure: NDCG@N comparisons of feature sets (ELF, LF, EF, V-1, V-2, V-3) under MART (-MART), RankNet (-RN), and RankBoost (-RB)]
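For reference, NDCG@N as used in this evaluation can be computed as follows (standard formulation; the variable names and toy data are illustrative):

```python
import numpy as np

def ndcg_at_n(predicted_scores, relevance, n):
    """NDCG@N: discounted cumulative gain of the model's top-N ranking,
    normalized by the ideal (ground-truth-sorted) ranking."""
    order = np.argsort(predicted_scores)[::-1][:n]          # model's top-N items
    discounts = np.log2(np.arange(2, len(order) + 2))
    dcg = ((2.0 ** relevance[order] - 1) / discounts).sum()
    ideal = np.sort(relevance)[::-1][:n]                    # best possible ordering
    idcg = ((2.0 ** ideal - 1) / np.log2(np.arange(2, len(ideal) + 2))).sum()
    return dcg / idcg if idcg > 0 else 0.0

scores = np.array([0.9, 0.2, 0.5, 0.7])   # model-predicted vibrancy scores
truth = np.array([3.0, 0.0, 1.0, 2.0])    # ground-truth relevance grades
print(ndcg_at_n(scores, truth, n=5))
```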
Comparison with Baseline Representation Learning Algorithms
30
[Figure: NDCG@N comparisons of Our Model vs. NMF, RBM, and Skip-gram over LambdaMART, ListNet, MART, and RankBoost]
¨ Ranking Models: LambdaMART, ListNet, MART, RankBoost
¨ Baseline Methods: nonnegative matrix factorization (NMF), restricted Boltzmann machine (RBM), Skip-gram
¨ Evaluation Criteria: ranking performance at top N (NDCG@N)
¨ Task: collective representation learning for urban communities from multi-view graphs
¨ Modeling: an ensemble-encoding dissemble-decoding approach
¨ Application: ranking urban community vibrancy
31
32
¨ Background and Motivation
¨ Collective Representation Learning
¨ Dynamic Representation Learning
¨ Structured Representation Learning
¨ Conclusion and Future Work
33
[Diagram: a GPS trace ⟨t1, lat1, lon1⟩, ⟨t2, lat2, lon2⟩, ⟨t3, lat3, lon3⟩, ⟨t4, lat4, lon4⟩, ⟨t5, lat5, lon5⟩ annotated with driving operations: turn left, turn right, accelerate]
Motivation Application: Machine-Learning Based Driving Behavior Analysis
34
Driving behavior analysis for insurance companies
35
¨ Driving Operations
n Speed operations: acceleration, deceleration, constant speed
n Direction operations: turning right, turning left, moving straight
¨ Driving States: speed operation + direction operation (e.g., turning right + acceleration)
Quantifying Driving Habits with Driving State Transition Graphs
36
[Diagram: driving states such as left turn + deceleration, right turn + deceleration, and straight + acceleration as nodes, with two graph views: a transition frequency view (edges carry transition probabilities) and a transition duration view (edges carry transition durations)]
Driving style & habit patterns
37
[Diagram: a stream of driving-state transition graphs at t = 1, 2, ..., T; an edge between two driving states carries, e.g., transition frequency 0.4 and transition duration 1 minute]
¨ Transition frequency: how often a driver transitions from one state to another (unusually high frequency: drunk driving?)
¨ Transition duration: how quickly a driver transitions from one state to another (unusually fast: uncomfortable driving habits)
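A small sketch of how the two graph views could be built from a driver's state sequence; the `(state, dwell_seconds)` data layout is an assumption for illustration:

```python
from collections import defaultdict

def build_transition_views(state_sequence):
    """state_sequence: chronological list of (driving_state, dwell_seconds),
    e.g. [("left_turn+decel", 4.0), ("straight+accel", 10.0), ...]."""
    counts = defaultdict(float)
    durations = defaultdict(list)
    for (s1, dwell), (s2, _) in zip(state_sequence, state_sequence[1:]):
        counts[(s1, s2)] += 1
        durations[(s1, s2)].append(dwell)   # time spent in s1 before moving to s2
    total = sum(counts.values()) or 1.0
    freq_view = {edge: c / total for edge, c in counts.items()}       # probabilities
    duration_view = {edge: sum(d) / len(d) for edge, d in durations.items()}  # mean seconds
    return freq_view, duration_view

seq = [("left_turn+decel", 4.0), ("straight+accel", 10.0), ("left_turn+decel", 3.0)]
freq_view, duration_view = build_transition_views(seq)
print(freq_view, duration_view)
```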
Dynamic Representation Learning with Graph Stream
38
Map the graph stream into a sequence of time-varying yet relational feature vectors
¨ Structural Reservation: the representations should preserve the graph structure, so that similar graphs get similar feature vectors
¨ Temporal Dependency: current representations depend on past representations
¨ Peer Dependency: similar drivers should have similar feature vectors
39
¨ Structural Reservation: Minimizing reconstruction loss
40
Encode:
$$
\begin{cases}
y^1_i = \sigma(W^1 x_i + b^1),\\
y^k_i = \sigma(W^k y^{k-1}_i + b^k), & \forall k \in \{2, 3, \dots, o\},\\
z_i = \sigma(W^{o+1} y^o_i + b^{o+1}).
\end{cases}
$$

Decode:
$$
\begin{cases}
\hat{y}^o_i = \sigma(\hat{W}^{o+1} z_i + \hat{b}^{o+1}),\\
\hat{y}^{k-1}_i = \sigma(\hat{W}^k \hat{y}^k_i + \hat{b}^k), & \forall k \in \{2, 3, \dots, o\},\\
\hat{x}_i = \sigma(\hat{W}^1 \hat{y}^1_i + \hat{b}^1).
\end{cases}
$$
[Diagram: autoencoder over an input matrix D = (d1, d2, ..., dN): input vector x → encoded vector y1 → embedding z → decoded vector ŷ1 → reconstructed input x̂]
The encoding phase encodes the input vector into an embedding; the decoding phase decodes the embedding to recover the input.
Learned Representation
¨ Temporal Dependency: current driving operations are related to previous driving operations
41
Sequential encode step:
$$
\begin{cases}
(y^1_i)^\tau = \sigma(W^1 x^\tau_i + b^1),\\
(y^k_i)^\tau = \sigma(W^k (y^{k-1}_i)^\tau + b^k), & \forall k \in \{2, 3, \dots, o\},\\
z^\tau_i = (1 - c^\tau)\, z^{\tau-1}_i + c^\tau\, \tilde{z}^\tau_i.
\end{cases}
$$

Sequential decode step:
$$
\begin{cases}
(\hat{y}^o_i)^\tau = \sigma(\hat{W}^{o+1} z^\tau_i + \hat{b}^{o+1}),\\
(\hat{y}^{k-1}_i)^\tau = \sigma(\hat{W}^k (\hat{y}^k_i)^\tau + \hat{b}^k), & \forall k \in \{2, 3, \dots, o\},\\
\hat{x}^\tau_i = \sigma(\hat{W}^1 (\hat{y}^1_i)^\tau + \hat{b}^1).
\end{cases}
$$
Feed the output of the autoencoder's hidden layer into a Gated Recurrent Unit (GRU): the current hidden state z^τ depends on the previous hidden state z^{τ-1} through the update gate c^τ.
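A minimal PyTorch sketch of this temporal update: the autoencoder embedding z̃^τ for each time step is blended with the previous embedding z^{τ-1} through a GRU-style update gate c^τ. The gate's parameterization here is an assumption; only the convex update rule comes from the equations above.

```python
import torch
import torch.nn as nn

class TemporalEmbedding(nn.Module):
    def __init__(self, in_dim, emb_dim):
        super().__init__()
        self.encode = nn.Sequential(nn.Linear(in_dim, emb_dim), nn.Sigmoid())
        self.gate = nn.Linear(in_dim + emb_dim, emb_dim)   # produces the gate c^tau

    def forward(self, x_seq):  # x_seq: (T, batch, in_dim) stream of graph vectors
        z = torch.zeros(x_seq.size(1), self.gate.out_features)
        outputs = []
        for x in x_seq:
            z_tilde = self.encode(x)                        # candidate embedding z~^tau
            c = torch.sigmoid(self.gate(torch.cat([x, z], dim=-1)))
            z = (1 - c) * z + c * z_tilde                   # z^tau = (1-c) z^{tau-1} + c z~^tau
            outputs.append(z)
        return torch.stack(outputs)                         # embedding per time step
```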
¨ Peer Dependency: drivers with similar driving behaviors should share similar latent representations
42
[Diagram: similar transition graphs → similar learned representations]
$$
H_c(G^\tau) = \sum_{u_i \in U} \sum_{u_j \in U, u_i \neq u_j} s^\tau_{i,j} \cdot \| z^\tau_i - z^\tau_j \|_2^2
$$
The similarity of driving behavior between drivers $u_i$ and $u_j$ at time slot $\tau$ is $s^\tau_{i,j} = \cos(x^\tau_i, x^\tau_j)$, computed using descriptive statistics of various historical driving operations.
Graphical regularization: if drivers i and j are similar at time τ, the representations $z^\tau_i$ and $z^\tau_j$ are pushed to be similar; large embedding distances between similar drivers are penalized.
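A sketch of this peer regularizer in PyTorch, assuming the embeddings and per-driver behavior statistics are already computed (the function and argument names are illustrative):

```python
import torch
import torch.nn.functional as F

def peer_regularizer(z, behavior_stats):
    """z: (n_drivers, emb_dim) embeddings at time tau;
    behavior_stats: (n_drivers, d) descriptive statistics of past operations."""
    # s[i, j]: cosine similarity of drivers' behavior statistics
    s = F.cosine_similarity(behavior_stats.unsqueeze(1),
                            behavior_stats.unsqueeze(0), dim=-1)
    dist = torch.cdist(z, z) ** 2            # ||z_i - z_j||_2^2 for all pairs
    mask = 1 - torch.eye(z.size(0))          # exclude i == j terms
    return (s * dist * mask).sum()
```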
43
[Diagram: a graph series G1, G2, G3, ..., Gm with peer dependence across drivers and temporal dependence across time, mapped to embedding vectors X2, X3, ..., Xm]
$$
\min \frac{1}{2} \sum_{\tau \in T} \Big\{ \sum_{u_i \in U^{(n)}} \| x^\tau_i - \hat{x}^\tau_i \|_2^2 + \alpha \cdot H_c(G^\tau) \Big\}
$$
¨ Structural reservation: the representation encoded from the input can be decoded to recover the input
¨ Temporal dependency: the current embedding is related to past embeddings
¨ Peer dependency: similar graph streams from two similar drivers share similar representations
Model Structure
44
Learn representations from drivers' transition graphs to score driving performance and detect risky areas
45
[Figure: comparisons of Auto-Encoder, DeepWalk, CNN, LINE, DSV, and PTARL on squared error, NDCG@N, R², and Kendall's Tau]
Apply the learned representations of driving behavior to predict driving scores
¨ Data: driving records collected from volunteer drivers
¨ Evaluation Metrics: squared error, R², Kendall's Tau, and ranking performance (NDCG@N)
¨ Baselines: Auto-Encoder; DeepWalk (random walks to learn latent representations); LINE (preserves network structures with an edge-sampling algorithm); CNN; DSV (a transportation approach)
[Figure: driving score distribution (frequency vs. driving score)]
46
[Figure: ablation comparisons of Auto-Encoder, PTARL-peer, PTARL-temporal, and PTARL on squared error, NDCG@N, Kendall's Tau, and R²]
¨ PTARL: our full model
¨ Two variants of our model: PTARL-peer, which considers the peer dependency, and PTARL-temporal, which considers the temporal dependency
¨ The temporal dependency contributes more than the peer dependency
47
[Figure: driving scores over time for a "Riskier Driver" and a "Safer Driver"]
A "Safer Driver" is not always safe, and a "Riskier Driver" is not always risky; still, the scores of the "Safer Driver" are relatively higher most of the time, while the scores of the "Riskier Driver" are relatively lower most of the time
48
Dynamic evolution of the driving score distribution
¨ Task: dynamic representation learning with graph streams
¨ Modeling: a representation learning approach that jointly preserves structural reservation, temporal dependency, and peer dependency
¨ Application: driving performance scoring and risk area detection
49
50
¨ Background and Motivation
¨ Deep Collective Representation Learning
¨ Deep Dynamic Representation Learning
¨ Deep Structured Representation Learning
¨ Conclusion and Future Work
Fewer Matches Between Humans and Technologies
51
Non-personalized news feeds and non-personalized education
Motivation Application: Precision User Profiling
52
Webpage = Contents + Structure
User = Explicit Activities + Latent Behavioral Structure?
53
User Activity Graph: spatial-temporal asymmetric transition patterns
[Diagram: a user activity graph over POIs: Office, Hospital, Costco, Gas Station, McDonald's, Walmart, PreSchool, Auto Service, Home, Zoo, Six Flags, Pizza Hut, Regal Cinemas, Lab, Library, IT Department]
Problem Reformulation: Representation Learning with Activity Graphs
54
Learn an embedding z that maps the user's activity graph to a profile vector
[Diagram: user activity graph → user profile vector]
¨ Global structures: how a user's activities globally interact with each other (strong link, weak link, no link)
55
56
¨ Substructures: the topology of subgraphs that capture the unique behavioral patterns of a user's activities
n Substructure 1: high-frequency discrete nodes
n Substructure 2: high-frequency circle
Representation Learning with Behavioral Global and Substructure Preservation
¨ Traditional solution: global structure (encoding-decoding) + substructure (loss regularization)
57
[Diagram: encode the user activity graph and reconstruct both the graph and its substructures]
58
[Diagram: two users' activity graphs over different POIs]
Substructures have different topology and contents, and are dynamically distributed in different locations of the graphs
¨ Translate substructure-aware representation learning into an adversarial substructured learning problem
59
[Architecture: encoder → decoder; a substructure detector applied to both the original and the reconstructed graph; a discriminator labels original substructures as real (1) and reconstructed substructures as fake]
¨ An encoder-decoder network learns the representation of a graph
¨ A substructure detector detects substructure patterns
¨ A discriminator classifies original vs. reconstructed substructures
¨ Adversarial training matches original substructures with reconstructed substructures
60
Problem: depth-first-search-based subgraph detection is not differentiable
How to Approximate the Substructure Detector?
61
[Diagram: replace the non-differentiable substructure detection algorithm with a differentiable CNN that maps the latent embedding to a substructure representation]
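A sketch of such a differentiable stand-in: a small 1-D CNN reads the latent embedding and emits a substructure feature vector. All layer shapes and sizes here are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class CNNSubstructureDetector(nn.Module):
    """Differentiable approximation: latent embedding -> substructure features."""
    def __init__(self, feat_dim=16):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveMaxPool1d(8))                  # fixed-length feature map
        self.head = nn.Linear(8 * 8, feat_dim)        # substructure feature vector

    def forward(self, z):                              # z: (batch, emb_dim)
        h = self.conv(z.unsqueeze(1))                  # add a channel dimension
        return self.head(h.flatten(1))
```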
Approximated Adversarial Substructured Learning
62
The Mini-Max Game in Optimization
¨ Generator (encoder-decoder): match original and generated substructures to fool the discriminator
¨ Discriminator: correctly classify generated substructures as fake and original substructures as real
¨ CNN-based detector: detect and output a substructure feature vector
63
Objective Function
¨ Train the generator G to minimize the reconstruction loss and to confuse the discriminator, i.e., to minimize D's accuracy on generated substructures
¨ Train the discriminator D to maximize accuracy: classify ground-truth substructures as 1 (maximize likelihood) and generated substructures as 0 (minimize likelihood)
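Schematically, one training step of this mini-max game might look as follows in PyTorch. The helper names (`encoder_decoder`, `detector`, `discriminator`) are illustrative assumptions; only the real-as-1 / generated-as-0 targets and the reconstruction term come from the objective above.

```python
import torch
import torch.nn.functional as F

def train_step(encoder_decoder, detector, discriminator, opt_g, opt_d, graph):
    # Discriminator step: ground-truth substructures -> 1, generated -> 0
    # (discriminator is assumed to output probabilities via a sigmoid)
    real_pred = discriminator(detector(graph))
    fake_pred = discriminator(detector(encoder_decoder(graph).detach()))
    d_loss = F.binary_cross_entropy(real_pred, torch.ones_like(real_pred)) + \
             F.binary_cross_entropy(fake_pred, torch.zeros_like(fake_pred))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: reconstruction loss + confuse the discriminator
    recon = encoder_decoder(graph)
    gen_pred = discriminator(detector(recon))
    g_loss = F.mse_loss(recon, graph) + \
             F.binary_cross_entropy(gen_pred, torch.ones_like(gen_pred))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```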
64
65
[Pipeline: user → user activity graph → adversarial substructured learning → user representation → user profiling and next activity recommendation]
For each user, build the corresponding user activity graph and predict the next activity category.
¨ Data: activity check-in data from New York and Tokyo
¨ Evaluation Metrics: next activity category prediction and next activity recommendation
¨ Baselines: DeepWalk (random walks to learn latent representations); LINE (preserves global network structures with an edge-sampling algorithm)
Performance Comparisons on New York and Tokyo Activity Checkin Data
66
Learn user representations that capture behavior patterns for user profiling; apply the learned representations to predict the next activity type (next POI category)
67
¨ Substructures: nodes and circles
n One variant considers node-based substructures
n One variant considers circle-based substructures
¨ Task: representation learning with global structure and substructure preservation
¨ Modeling: adversarial substructured learning via a mini-max game
¨ Application: personalization and recommender systems
68