THE MOOD OF TWITTERERS IN MEXICO (EL ESTADO DE ÁNIMO DE LOS TUITEROS EN MÉXICO) Gerardo Leyva - PowerPoint PPT Presentation

THE MOOD OF TWITTERERS IN MEXICO (EL ESTADO DE ÁNIMO DE LOS TUITEROS EN MÉXICO). Gerardo Leyva, October 2018.


  1. THE MOOD OF TWITTERERS IN MEXICO (EL ESTADO DE ÁNIMO DE LOS TUITEROS EN MÉXICO) Gerardo Leyva, October 2018

  2. The three pillars of official statistics: SURVEYS, CENSUSES, ADMINISTRATIVE REGISTERS

  3. The four (no longer three) pillars of official statistics: SURVEYS, CENSUSES, ADMINISTRATIVE REGISTERS, BIG DATA

  4. The Big Data definition evolves. Initially, it was about the V's: Volume, Velocity, Variety (the 3 V's), Veracity, Value. Instead, Big Data is a flexible approach to use and re-use the totality of a data set, structured or not, for a diversity of possible purposes, normally different from those that originated the information set in the first place.

  5. BIG DATA “Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it...” Dan Ariely

  6. Big Data (Google trends) https://www.google.com.mx/trends/ @abxda

  7. PARADIGMS: Small data, Big data

  8. Convergence of two agendas: Big data and Subjective Well-Being (Martin Seligman).

  9. General idea. Goal: automatically measure and report the mood of twitterers in México. Method: supervised learning. Humans tag a training set of tweets. The system learns to automatically tag (classify) tweets as close as possible to the way humans would have done it.

  10. Collecting tweets since February 2014

  11. More than 300 million tweets

  12. Set of tagged tweets: 9 330 people from Universidad Tecmilenio and INEGI manually tagged 54 131 tweets, with multiple tagging of each tweet. Classification system: https://cienciadedatos.inegi.org.mx/pioanalisis/

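  13. Example tweet: "Estar enamorada es como ir en un Ferrari a 240 kms/h. Se siente CHINGON pero sabes que en cualquier momento viene el putazo (:" (Being in love is like riding in a Ferrari at 240 km/h. It feels AWESOME, but you know the crash can come at any moment.)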

  14. Final solution (diagram labels: Normalized tweets; Training set; Validation set; SVM, SVM, SVM)

  15. Optimal results (Ensemble of SVMs) (diagram labels: Normalized tweets; Training set; Validation set; SVM, SVM, SVM)

  16. Goal: automatically classifying tweets. Unclassified tweets → normalization → vector representation → classification → hundreds of millions of tagged tweets

  17. The process for sentiment classification: cleaning; text normalization; vector representation of text; training of the Machine Learning algorithm; text classification on the fly

  18. Cleaning of the tagged set: tagged tweets → cleaning (contradictions and repetitions; entropy) → "clean" tweets

  19. Cleaning of the tagged set (cleaning): tagged tweets → cleaning (contradictions and repetitions; entropy) → "clean" tweets
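
The slides name the cleaning criteria (contradictions, repetitions, entropy) but not the exact rule. A minimal sketch of one plausible reading, assuming each tweet's human tags are pooled, the Shannon entropy of those tags measures how contradictory they are, and tweets above a cutoff are discarded. The names `label_entropy` and `clean_tagged_set` and the `max_entropy` value are hypothetical, not taken from the presentation.

```python
import math
from collections import Counter


def label_entropy(labels):
    """Shannon entropy (in bits) of the tags given to one tweet."""
    counts = Counter(labels)
    total = len(labels)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def clean_tagged_set(tagged, max_entropy=0.9):
    """Collapse repeated tweets and drop those with contradictory tags.

    `tagged` is a list of (text, label) pairs; the same text may appear
    several times. `max_entropy` is an assumed cutoff, not from the slides.
    """
    by_text = {}
    for text, label in tagged:
        by_text.setdefault(text, []).append(label)  # repetitions are merged

    clean = {}
    for text, labels in by_text.items():
        if label_entropy(labels) <= max_entropy:  # taggers mostly agree
            clean[text] = Counter(labels).most_common(1)[0][0]
    return clean
```

With this cutoff a tweet tagged positive three times and negative once is kept as positive, while a one-to-one split (entropy of 1 bit) is dropped.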

  20. Text normalization: q-grams (3, 4, 5, 7; example with q=4); polarity of emoticons (polarity tag); others

  21. Example of text normalization. ORIGINAL TEXT: pésiiiimo auto :( @autoX fallan frenos y sistema de entretenimiento; no lo compren. NORMALIZED TEXT: pesiiiimo auto _negativo _user fallan frenos y sistema de entretenimiento ; lo no_compren
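
The normalization rules themselves are not listed in the deck, so the following is only a sketch that reproduces the slide's example under stated assumptions: text is lowercased, diacritics are stripped, emoticons become polarity tags, mentions become `_user`, punctuation is split off, and "no" is attached to the next word after skipping clitic pronouns. The `EMOTICONS` and `CLITICS` tables and the function names are hypothetical.

```python
import re
import unicodedata

# Assumed lookup tables; the real system's lists are not given in the slides.
EMOTICONS = {":(": "_negativo", "):": "_negativo", ":)": "_positivo", "(:": "_positivo"}
CLITICS = {"lo", "la", "los", "las", "le", "les", "me", "te", "se", "nos"}


def strip_accents(text):
    """Remove combining marks (note: this also turns 'ñ' into 'n')."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def normalize(text):
    tokens = []
    for raw in strip_accents(text.lower()).split():
        if raw in EMOTICONS:
            tokens.append(EMOTICONS[raw])      # emoticon -> polarity tag
        elif raw.startswith("@"):
            tokens.append("_user")             # mention -> generic user tag
        else:
            tokens.extend(re.findall(r"\w+|[^\w\s]", raw))  # split punctuation

    out, i = [], 0
    while i < len(tokens):
        if tokens[i] == "no":
            j = i + 1
            while j < len(tokens) and tokens[j] in CLITICS:
                out.append(tokens[j])          # "no lo compren" -> "lo ..."
                j += 1
            if j < len(tokens) and tokens[j].isalnum():
                out.append("no_" + tokens[j])  # "... no_compren"
                j += 1
            else:
                out.append("no")               # nothing to attach the negation to
            i = j
        else:
            out.append(tokens[i])
            i += 1
    return " ".join(out)
```

Applied to the original text on the slide, `normalize` returns the normalized text shown there.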

  22. Example of text normalization with q-grams (q=4). Text: _pesiiiimo_auto__negativo__user_fallan_frenos_y_sistema_de_entretenimiento_;_lo_no_compren_ Q-grams: { _pes, pesi, esii, siii, iiii, iiim, iimo, imo_, mo_a, o_au, _aut, auto, uto_, to__, o__n, __ne, _neg, nega, egat, gati, ativ, tivo, ivo_, vo__, o__u, __us, _use, user, ser_, er_f, r_fa, _fal, fall, alla, llan, lan_, an_f, n_fr, _fre, fren, reno, enos, nos_, os_y, s_y_, _y_s, y_si, _sis, sist, iste, stem, tema, ema_, ma_d, a_de, _de_, de_e, e_en, _ent, entr, ntre, tret, rete, eten, teni, enim, nimi, imie, mien, ient, ento, nto_, to_;, o_;_, _;_l, ;_lo, _lo_, lo_n, o_no, _no_, no_c, o_co, _com, comp, ompr, mpre, pren, ren_ }
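
The q-gram set on this slide is consistent with joining the normalized tokens with underscores, padding both ends with an underscore, and sliding a window of width q one character at a time. A minimal sketch under that assumption (the function name `qgrams` is illustrative):

```python
def qgrams(text, q=4):
    """Character q-grams of a normalized text.

    Tokens are joined with '_' and the string is padded with '_' on both
    sides, which matches the example set on the slide.
    """
    joined = "_" + "_".join(text.split()) + "_"
    return [joined[i:i + q] for i in range(len(joined) - q + 1)]


grams = qgrams("pesiiiimo auto _negativo _user", q=4)
print(grams[:8])
```

Because tags such as `_negativo` already start with an underscore, joining produces the double underscores seen in q-grams like `to__` and `o__u`.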

  26. Vector representation of the text
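
The deck does not say how the q-grams are weighted. As an illustration only, the sketch below assumes a plain bag of q-grams with raw counts scaled to unit length, stored as a sparse dictionary; the names `vectorize` and `dot` and the weighting scheme are assumptions, not the presenters' method.

```python
import math
from collections import Counter


def qgrams(text, q=4):
    """Character q-grams over the underscore-joined, underscore-padded text."""
    joined = "_" + "_".join(text.split()) + "_"
    return [joined[i:i + q] for i in range(len(joined) - q + 1)]


def vectorize(text, q=4):
    """Sparse unit-length vector: q-gram -> normalized count (assumed weighting)."""
    counts = Counter(qgrams(text, q))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {gram: c / norm for gram, c in counts.items()}


def dot(u, v):
    """Dot product of two sparse vectors (cosine similarity for unit vectors)."""
    if len(u) > len(v):
        u, v = v, u
    return sum(weight * v.get(gram, 0.0) for gram, weight in u.items())
```

One reason q-grams suit tweets is that misspellings still share most of their q-grams with the standard spelling, so `dot(vectorize("pesimo auto"), vectorize("pesiiimo auto"))` stays well above zero.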

  27. Machine learning algorithm SVM

  28. Training the SVM algorithm: positive tweets, negative tweets

  29. The task of text classification…in a nutshell. Training: tagged tweets → normalization and vector representation → training → decision rule. Production: new tweet → normalization and vector representation → decision rule → the mood of twitterers
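
The slides do not specify the kernel or solver. To make the idea concrete, here is a small sketch of a linear SVM trained by sub-gradient descent on the hinge loss, using only the standard library. The hyperparameters (`epochs`, `lr`, `lam`), the function names, and the toy feature vectors are all assumptions for illustration.

```python
def train_linear_svm(samples, epochs=50, lr=0.1, lam=0.01):
    """Fit weights w and bias b on (sparse_vector, label) pairs, label in {+1, -1}."""
    w, b = {}, 0.0
    for _ in range(epochs):
        for x, y in samples:
            margin = y * (sum(w.get(f, 0.0) * v for f, v in x.items()) + b)
            for f in w:                      # L2 regularization shrinks every weight
                w[f] *= 1.0 - lr * lam
            if margin < 1.0:                 # inside the margin: hinge-loss step
                for f, v in x.items():
                    w[f] = w.get(f, 0.0) + lr * y * v
                b += lr * y
    return w, b


def predict(model, x):
    """Decision rule: +1 (positive tweet) or -1 (negative tweet)."""
    w, b = model
    score = sum(w.get(f, 0.0) * v for f, v in x.items()) + b
    return 1 if score >= 0 else -1


# Toy training set: features stand in for q-gram weights.
samples = [
    ({"feliz": 1.0, "amor": 1.0}, 1),
    ({"feliz": 1.0}, 1),
    ({"triste": 1.0}, -1),
    ({"triste": 1.0, "odio": 1.0}, -1),
]
model = train_linear_svm(samples)
print(predict(model, {"feliz": 1.0}), predict(model, {"triste": 1.0, "odio": 1.0}))
```

In practice an established SVM library would replace this loop; the sketch only shows what "training" and "decision rule" mean here.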

  30. Positivity quotient = POSITIVES / NEGATIVES
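
Reading the slide as a fraction, the indicator is the number of tweets classified as positive divided by the number classified as negative in a given period. A minimal sketch under that reading; the function name and the choice to return `None` when there are no negative tweets are assumptions.

```python
def positivity_quotient(labels):
    """Positive tweets divided by negative tweets for one period.

    `labels` holds classifier outputs: +1 for positive, -1 for negative.
    Returns None when there are no negative tweets (assumed convention).
    """
    positives = sum(1 for label in labels if label == 1)
    negatives = sum(1 for label in labels if label == -1)
    if negatives == 0:
        return None
    return positives / negatives


# Three positive and two negative tweets in a day give a quotient of 1.5.
print(positivity_quotient([1, 1, 1, -1, -1]))
```

A value above 1 means more positive than negative tweets in the period.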

  31. The mood of tweeters in Mexico. Showing 2016/Nov-2018/Sep (daily). [Chart: daily index from 11/01/16 to 09/01/18, annotated with events including New Year (12/31 & 01/01), February 14th, Children's Day (04/30), Christmas (12/25), US elections (11/08 & 09), "Journalist's day" (01/04), "Gasolinazo" (01/04 & 05), earthquakes (17/09/08 & 19), Oscars 2018, "The Shape of Water" (03/04), Debates 2018 (04/22, 05/20 & 06/12), Germany vs Mexico (06/17), South Korea vs Mexico (06/23), Mexico vs Sweden (06/27), Vote 2018 (07/01), Mexico vs Brazil, MTV Awards]

  32. Link: http://www.inegi.org.mx/ http://www.beta.inegi.org.mx/app/animotuitero/#/app/multiline

  33. Help us to classify tweets: leads people wanting to help to another page. Mood: visualization of the Positivity Quotient, according to the selection of the period, the state and temporality. Help. Methodology. Reference period.

  34. Calendar: shows periods for selection. Selection of states: shows the national level and a selecting bar for the state of interest. Indicator: shows, at the upper right corner, the temporality of the indicator (daily, weekly, monthly, quarterly or annual).

  35. Gathering: shows the number of tweets gathered.

  36. Map: shows, on the map, the states coloured according to the positivity quotient. All: shows the tweets of all people in the state or the country. Residents: shows the tweets of people residing and present in the state. Visitors: shows the tweets of people visiting the state.

  37. Other INEGI projects with Twitter: Domestic tourism. Mental health. Mobility in Mexico City. New agglomerations. Consumer confidence. Insecurity.

  38. Other INEGI projects with big data: CFE electricity consumption for nowcasting of industrial activity. Use of satellite images for diverse purposes including land cover, agricultural activity and new settlements. Cooperation with Telefonica and BBVA-Bancomer to generate a rapid response system to face natural disasters. Web scraping and scanner data for prices.

  39. Thank you!

  40. Conociendo México 01 800 111 46 34 www.inegi.org.mx atencion.usuarios@inegi.org.mx @INEGI_INFORMA INEGI Informa
