Gime a Lever & a Fulcrum for ATA: How to automatically analyze the dynamics of images on the Web 2.0


1 Thursday 10th September 2014 ( TLSE ) – Lever & Fulcrum for ATA

Gime a Lever & a Fulcrum for ATA

SFR Agorantic : LIA UAPV

Marc El-Bèze 2014.IX.10


How to automatically analyze the dynamics of images on the Web 2.0

Project ImagiWeb funded by the ANR

PhD: Jean-Valère COSSU (Last year)

Co-direction: E Sanjuan, JM Torres-Moreno


‘Images’ on the Web 2.0

Multiple sides of an ‘Image’:

  • What the entity (politician or company) emits
  • How (s)he/it is perceived by members of online social networks (OSN)

In case of a competition (elections, for instance):

– each entity is not to be seen as if it were alone
– its image is exposed to interaction
– and … other actions or events

Once spread, is the control of the image lost?


Aims, Goals and Objectives

  • How is a component of an entity perceived?
  • How wide is the gap between the perception and what was expected?
  • What can be done to reduce it?

Our goal is not to predict who will be elected … but to produce dashboards (personal summaries) giving an overview of the Images.

How to do that?


Workflow

From all the messages:

  • Probabilistic extraction of discriminant terms (n-grams with variable N: MWE or chunks)

(Well-Argued Recommendation: Adaptive Models Based on Words in Recommender Systems, Gaillard et al., EMNLP'13)

  • For each message, automatically identify:

– who is emitting this opinion?
– which entity is targeted?
– which (sub)targets are concerned?

3Wh + 2P (parity & polarity)
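
A minimal sketch of what a probabilistic extraction of discriminant n-grams could look like. The scoring rule (relative frequency of the dominant label among messages containing the term), the thresholds, and all names below are assumptions made for illustration; the talk does not specify its method here.

```python
from collections import Counter, defaultdict

def ngrams(tokens, max_n=2):
    """Yield all n-grams of length 1..max_n as space-joined strings."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])

def discriminant_terms(labeled_messages, max_n=2, min_count=2, min_purity=0.8):
    """Return {term: (dominant_label, P(dominant_label | term))}.

    A term is kept when it occurs in at least `min_count` messages and its
    dominant label covers at least `min_purity` of them (assumed criterion).
    """
    term_label_counts = defaultdict(Counter)
    for text, label in labeled_messages:
        # Count each term once per message (document frequency).
        for term in set(ngrams(text.lower().split(), max_n)):
            term_label_counts[term][label] += 1
    selected = {}
    for term, counts in term_label_counts.items():
        total = sum(counts.values())
        label, top = counts.most_common(1)[0]
        if total >= min_count and top / total >= min_purity:
            selected[term] = (label, top / total)
    return selected

# Toy labeled messages (invented for the example).
messages = [
    ("tax reform is unfair", "economy"),
    ("the tax reform helps nobody", "economy"),
    ("new school program announced", "education"),
    ("school program is underfunded", "education"),
]
terms = discriminant_terms(messages)
```

Here "tax reform" and "school program" are kept, while "is" is rejected because it is split evenly between the two labels.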


Segmentation

3 Populations to be segmented:

  • People sharing the same opinion on one entity
  • Topics & subTopics = components of an Image
  • Messages will be segmented in order to

– take into account nuanced discourses
– improve the granularity of each partition

Sender’s identity and entity taken into account


Assumptions we have discarded for the 2 Tasks : Pol & Topic

Entities X, Y: opponents supported by B & C

  • Assumption D1: B has a positive opinion on X, whatever the topic
  • D2: if B has a positive opinion on X about a topic s, B's opinion on Y about s must be negative
  • D3: if B's opinion on X about s is positive, C's opinion on X must be negative


Assumptions for the Polar Task

In the politics domain:

Assumption A1

  • During a given short period, the global opinion of one person on one entity remains quite stable.

Assumption A2

  • Variations of the opinion can be

– predicted thanks to time series modeling;
– estimated by comparing 2 successive histograms.
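
As an illustration of the second option in A2, the sketch below compares the polarity histograms of two successive time windows. The three polarity classes and the choice of total variation distance are assumptions; the talk does not say which distance is used.

```python
from collections import Counter

LABELS = ("negative", "neutral", "positive")  # assumed polarity classes

def polarity_histogram(labels):
    """Normalized histogram of polarity labels for one time window."""
    counts = Counter(labels)
    total = sum(counts[label] for label in LABELS)
    if total == 0:
        raise ValueError("empty window")
    return {label: counts[label] / total for label in LABELS}

def total_variation(h1, h2):
    """Total variation distance: 0 for identical histograms, 1 for disjoint ones."""
    return 0.5 * sum(abs(h1[label] - h2[label]) for label in LABELS)

# Toy opinions of one group on one entity over two successive periods.
week_1 = ["positive", "positive", "neutral", "negative"]
week_2 = ["positive", "negative", "negative", "negative"]
shift = total_variation(polarity_histogram(week_1), polarity_histogram(week_2))
```

A value of `shift` close to 0 is consistent with A1 (stable opinion); a large value signals a variation worth inspecting.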


Assumptions for the Polar Task

  • To what extent do these assumptions hold?
  • How can we use them to enrich the annotated corpus through an incremental process?

In practice:

  • For a classification method such as the vector model using BOW and cosine, the user id or its membership can be introduced into each vector as an additional component
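
The idea of adding the user id as one more vector component can be sketched as follows. The feature prefix `__user__`, the weight, and the 1-nearest-neighbour decision rule are hypothetical choices made for the example, not details given in the talk.

```python
import math
from collections import Counter

def vectorize(text, user_id=None, user_weight=1.0):
    """Bag-of-words vector, optionally extended with one user-id component."""
    vec = Counter(text.lower().split())
    if user_id is not None:
        # Hypothetical feature name; the prefix avoids clashes with real words.
        vec["__user__" + user_id] = user_weight
    return vec

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def classify(vec, training):
    """Return the label of the most similar training example (1-NN, cosine)."""
    return max(training, key=lambda example: cosine(vec, example[0]))[1]

# Toy annotated messages: (vector, polarity).
training = [
    (vectorize("great speech", "alice"), "positive"),
    (vectorize("what a speech", "bob"), "negative"),
]
without_user = classify(vectorize("speech"), training)
with_user = classify(vectorize("speech", "bob"), training)
```

With the text alone the message is closest to the first example; once the sender "bob" is added as a component, it moves toward bob's earlier (negative) message, which is how assumption A1 acts as a lever.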


Assumptions for the Topic Task

In the politics domain:

  • Let us call T a set of predefined topics
  • As for the polar task, classical classification approaches can be used
  • In this case, assumption A1 does not apply: we have to look for another lever
  • If the text comes from Twitter, hashtags are good candidates for a fulcrum


Proposal P for the Topic Task

Whatever the domain:

  • Collect (in addition to the training corpus) a huge amount of non-annotated tweets or texts
  • Keep the S closest tweets (according to time)
  • For each tag t in T, extract the n_t most similar tweets (n_t corresponding to what is expected for t)
  • Classify all the remaining S - Σ n_t tweets, which may be related (or not) to new topics
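
The steps of Proposal P can be sketched as below. Several points are assumptions: similarity is taken as BOW cosine against a per-topic word profile (for instance built from hashtags or training messages), a tweet assigned to one topic is removed from the pool, and the function and parameter names are invented for the example.

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words vector of a text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def proposal_p(tweets, t_ref, S, topic_profiles, quotas):
    """tweets: list of (timestamp, text); quotas: {topic: n_t}.

    Returns (assigned, remaining): tweets attached to each topic, and the
    S - sum(n_t) tweets left over for a later classification step.
    """
    # Keep the S tweets closest in time to the reference date.
    pool = sorted(tweets, key=lambda tw: abs(tw[0] - t_ref))[:S]
    assigned = {}
    for topic, profile in topic_profiles.items():
        # Take the n_t tweets most similar to the topic profile.
        ranked = sorted(pool, key=lambda tw: cosine(bow(tw[1]), profile), reverse=True)
        assigned[topic] = ranked[:quotas[topic]]
        pool = [tw for tw in pool if tw not in assigned[topic]]
    return assigned, pool

# Toy non-annotated tweets (invented for the example).
tweets = [
    (100, "budget vote on taxes"),
    (101, "new school opens"),
    (102, "taxes rise again"),
    (103, "football final tonight"),
    (500, "old tweet about taxes"),
]
profiles = {"economy": bow("taxes budget"), "education": bow("school teachers")}
assigned, remaining = proposal_p(tweets, t_ref=101, S=4,
                                 topic_profiles=profiles,
                                 quotas={"economy": 2, "education": 1})
```

In this toy run the football tweet is what remains: it matches no predefined topic and may point to a new one.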


Adaptation of Recommender Systems

Each new rating (and the associated comment) can be directly used to adapt the model

  • Flash reactivity: adaptive models in recommender systems, J Gaillard et al., DMIN'13

This can be used to prevent the system from making the same error twice

This is a good way to reduce the difficulties due to the so-called "cold start" problem
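
To make the predict-then-adapt loop concrete, here is a deliberately simple sketch: a running-mean predictor that is updated after every observed rating. It is not the model of Gaillard et al.; the class name, the default rating used for unseen items, and the toy stream are all assumptions.

```python
from collections import defaultdict

class AdaptiveMeanPredictor:
    """Predicts an item's rating as its running mean, updated after each rating."""

    def __init__(self, default=3.0):
        self.default = default  # assumed fallback for unseen items (cold start)
        self.sums = defaultdict(float)
        self.counts = defaultdict(int)

    def predict(self, item):
        if self.counts[item] == 0:
            return self.default
        return self.sums[item] / self.counts[item]

    def adapt(self, item, rating):
        """Fold one new reference rating into the model."""
        self.sums[item] += rating
        self.counts[item] += 1

model = AdaptiveMeanPredictor()
stream = [("film_a", 5), ("film_a", 4), ("film_b", 1), ("film_a", 3)]
errors = []
for item, rating in stream:
    errors.append(abs(model.predict(item) - rating))  # 1. predict
    model.adapt(item, rating)                         # 2. adapt with the reference
```

The error on "film_a" drops right after its first rating is seen, which is the kind of immediate reactivity the slide refers to.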


What cannot be done in our case

When adapting with the prediction instead of the reference, performance is not much improved

In the case of topic and opinion detection on Twitter, there is

– no star
– no rating
– no thematic tag


(Figure: timeline with phases TRAINING, ADAPT, ADAPT, TEST)

  • 1. Predict rating R_t
  • 2. Adapt with rating R_{t-k}


Some peculiarities of our problem

In the case of topic and opinion detection on blogs:

  • No star … no rating … no tag … no hashtag
  • Texts are longer than a tweet
  • In the same document, authors may give positive, neutral and negative opinions on several (sub-)topics
  • It becomes a problem of multilabel categorization
  • More difficult than a monolabel categorization


A simple way to reduce the complexity

  • Divide texts into segments (phrase, sentence, §)
  • Apply Proposal P to this set of segments
  • On each unannotated segment, apply as a principle: confidence measure = intersystem agreement

Advantages: no need to align, less complex

Drawbacks: context is lost, subject to oscillation
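
A small sketch of segmentation followed by an intersystem-agreement confidence. The regex-based sentence splitter and the three keyword "systems" are hypothetical stand-ins for real segmenters and classifiers; the confidence is taken here as the fraction of systems voting for the majority label, which is one possible reading of the slide.

```python
import re
from collections import Counter

def split_segments(text):
    """Split on sentence-final punctuation; a crude stand-in for real segmentation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def agreement(segment, systems):
    """Return (majority_label, fraction of systems that voted for it)."""
    votes = Counter(system(segment) for system in systems)
    label, count = votes.most_common(1)[0]
    return label, count / len(systems)

# Three toy systems (hypothetical keyword rules standing in for classifiers).
def sys_a(s):
    return "positive" if "good" in s.lower() else "negative"

def sys_b(s):
    return "positive" if "good" in s.lower() or "great" in s.lower() else "negative"

def sys_c(s):
    return "negative" if "bad" in s.lower() else "positive"

text = "The programme is good. The speech was bad! Great turnout though."
results = [(seg, *agreement(seg, [sys_a, sys_b, sys_c]))
           for seg in split_segments(text)]
```

Segments on which all systems agree could be added to the annotated corpus first; the third segment, where only two of three systems agree, would get a lower confidence.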


In Summary

(Diagram. Labels: TRAINING CORPUS; PSEUDOS, METADATA, EVENTS, DATES, …; SET OF TOPICS; categorization; clustering; NON-ANNOTATED CORPUS FOR A GIVEN PERIOD t; SETS OF MWE / TOPIC; FLIP FLOP; SETS OF MWE / POLARITY; a picture at time t)


Questions of interest

Feasibility of auto-generating dashboards showing, for an entity, its good/bad points

  • Image evolution

– Follow trends for some particular group

  • List points to focus on

– How to act on these points?
– What to do, above all, to attract ‘swing people’?
