Distant-supervised Heterogeneous multitask learning for social event forecasting with multilingual indicators




slide-1
SLIDE 1

Distant-supervised Heterogeneous multitask learning for social event forecasting with multilingual indicators

Liang Zhao, George Mason University

slide-2
SLIDE 2

What are Spatiotemporal Events?

Protests: civil unrest events on Mar 17, 2013 in Brazil

Influenza: epidemic outbreak in Week 47, ending Nov 22, 2014, in the southern region

[Figure: maps for Week 45, Week 46, and Week 47]

slide-3
SLIDE 3

Open Source Indicators as the Social Sensor

[Figure: Protests on July 25, 2012, Mexico. Legend: tweet volume less than 10; tweet volume larger than 10; a civil unrest event reported after July 25]

slide-4
SLIDE 4

Open Source Indicators as the Social Sensor

Flu tweets geographical distribution (reported on Week 46): 1256 flu tweets

2013-14 Influenza Season Week 46 CDC flu activity map (reported on Week 47)

slide-5
SLIDE 5

Challenge 1: Multilingual features

  • 1. Multiple languages must be considered, because:
  • Omitting a language omits a group of people
  • No language can be omitted, even a small one

Social events can be triggered by any group of people

  • Moreover, some countries have hundreds of languages
  • 2. The dimension is too large and the feature vector too sparse

Imagine the feature vector of a tweet: 10 nonzeros among 1M zeros.

  • 3. Little data is available for small languages
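
The sparsity problem in point 2 can be made concrete with a minimal sketch. It assumes a vocabulary of one million terms shared across languages and a hash-based term index; both are illustrative choices, not taken from the slides. Only the nonzero entries of a tweet's bag-of-words vector are stored.

```python
from collections import Counter
import zlib

VOCAB_SIZE = 1_000_000  # assumed multilingual vocabulary size (illustrative)

def term_index(term: str) -> int:
    """Map a term to a column index with a stable hash (hypothetical scheme)."""
    return zlib.crc32(term.encode("utf-8")) % VOCAB_SIZE

def sparse_features(tweet: str) -> dict:
    """Return {column index: count}, keeping only the nonzero entries."""
    return dict(Counter(term_index(tok) for tok in tweet.lower().split()))

def density(features: dict) -> float:
    """Fraction of the vocabulary that is nonzero for this tweet."""
    return len(features) / VOCAB_SIZE

# A made-up ten-token tweet mixing Spanish and English terms.
tweet = "protesta hoy en la plaza protest today in the square"
features = sparse_features(tweet)
```

With ten tokens against a million columns, at most ten entries are nonzero, so a dense representation would spend almost all of its storage on zeros.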
slide-6
SLIDE 6

Challenge 2: Cross-lingual semantic correlation

  • 1. Features are highly semantically redundant
  • 2. Features are correlated via a multipartite relationship

One feature

(http://www.writeopinions.com/complete-multipartite-graph)

(https://www.profluentplus.com/blog/)

slide-7
SLIDE 7

Challenge 3: Lack of language-wise supervision

(http://blogs.discovermagazine.com/science-sushi/2016/01/31/genetically-modified-mosquitoes-didnt-start-zika-ourbreak/)

(http://www.foxnews.com/world/2017/12/18/mass-occupation-underscores-brazils-poverty-creates-angst.html)

Zika outbreaks in Brazil

No label on how much each group of language speakers contributes

slide-8
SLIDE 8

Heterogeneous Multitask learning under distant supervision

[Diagram: Task 1 (Language 1), Task 2 (Language 2), and Task 3 (Language 3) each map word features to latent topics; the tasks have a shared sparsity pattern and are trained under distant supervision]
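
One common way to encourage a shared sparsity pattern across tasks is a row-wise L2,1 penalty on a topic-by-task weight matrix. The slides do not say which penalty is used, so the sketch below is an assumption for illustration only. The names `l21_norm` and `prox_l21` and the example weights are hypothetical.

```python
import math

def l21_norm(W):
    """Sum over rows (latent topics) of the L2 norm across columns (tasks)."""
    return sum(math.sqrt(sum(w * w for w in row)) for row in W)

def prox_l21(W, tau):
    """Row-wise group soft-thresholding.

    Each row is shrunk toward zero as a unit, and a row whose L2 norm is
    at most tau becomes zero for every task at once.
    """
    out = []
    for row in W:
        norm = math.sqrt(sum(w * w for w in row))
        scale = max(0.0, 1.0 - tau / norm) if norm > 0 else 0.0
        out.append([scale * w for w in row])
    return out

# Rows are latent topics, columns are tasks (languages); values are made up.
W = [[3.0, 4.0, 0.0],
     [0.1, 0.0, 0.1],
     [0.0, 0.0, 0.0]]
W_shrunk = prox_l21(W, tau=1.0)
```

Because the threshold acts on whole rows, a weak topic is dropped from all languages together, while a strong topic stays available to every task.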

slide-9
SLIDE 9

Objective function

  • Distant supervision: if any language triggers, the whole triggers; if no language triggers, the whole does not trigger
  • Higher-level topic representation and transition matrix
  • Orthogonal constraint
  • Shared sparsity patterns of latent topics in different tasks
  • Upper-bounded generalization error:
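
The distant-supervision rule on this slide (an event occurs if any language triggers, and does not occur if no language triggers) behaves like a logical OR over the per-language tasks. The sketch below shows the hard rule and a noisy-OR soft version. The noisy-OR form is a common relaxation chosen here as an assumption; it is not confirmed to be the formulation in the objective function. Function names are hypothetical.

```python
def event_label(language_triggers):
    """Hard rule: the event is positive iff at least one language triggers."""
    return any(language_triggers)

def event_probability(language_probs):
    """Noisy-OR: probability that at least one language group triggers,
    treating the per-language trigger probabilities as independent."""
    p_none = 1.0
    for p in language_probs:
        p_none *= (1.0 - p)
    return 1.0 - p_none

# Only the overall label is observed; per-language labels are not.
overall = event_label([False, True, False])
soft = event_probability([0.5, 0.5])
```

Only the overall outcome is supervised, which matches the earlier challenge that no label says how much each group of language speakers contributes.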

slide-10
SLIDE 10

Optimization

Equivalent problem, solved with the Alternating Direction Method of Multipliers (ADMM):

  • Solve Q: dynamic programming
  • Solve Θ and U: non-monotone spectral projected gradient descent
  • Solve Z: second-order methods
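
To illustrate the general ADMM pattern of splitting a problem and alternating between subproblems, here is a minimal sketch on a toy problem that is swapped in for illustration: minimize 0.5·(x − a)² + λ·|z| subject to x = z, elementwise. It does not reproduce the Q, Θ, U, and Z subproblem solvers named on the slide. The function names and the choice ρ = 1 are assumptions.

```python
import math

def soft_threshold(v, t):
    """Proximal operator of t * |.|: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)

def admm_soft_threshold(a, lam, rho=1.0, iters=200):
    """Scaled-form ADMM for min 0.5*(x - a)^2 + lam*|z| s.t. x = z."""
    n = len(a)
    z = [0.0] * n
    u = [0.0] * n  # scaled dual variable
    for _ in range(iters):
        # x-update: closed-form minimizer of the quadratic subproblem
        x = [(a[i] + rho * (z[i] - u[i])) / (1.0 + rho) for i in range(n)]
        # z-update: proximal step on the L1 term
        z = [soft_threshold(x[i] + u[i], lam / rho) for i in range(n)]
        # dual update: accumulate the constraint residual x - z
        u = [u[i] + x[i] - z[i] for i in range(n)]
    return z

z = admm_soft_threshold([3.0, -0.5, -2.0], lam=1.0)
```

For this toy problem the iterates approach the soft-thresholded input, which gives a simple check that the three updates are wired together correctly.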

slide-11
SLIDE 11

Experiments