Learning to Predict the Global Risks Interconnections from the Web



slide-1
SLIDE 1

Learning to Predict the Global Risks Interconnections from the Web [ Minerva: AI/ML for News ]

Dr. Ernesto Diaz-Aviles
Co-Founder, CEO and Chief Scientist at Libre AI and Adjunct Assistant Professor at UCD
<ernesto@libreai.com>
https://libreai.com

Supported by Google through the Digital News Initiative

2018-08-27

slide-2
SLIDE 2

2

Dr. Ernesto Diaz-Aviles
Co-Founder and CEO

Scientist and engineer with 15 years of experience deploying AI, ML, and data-driven solutions at scale. Former Chief Data Scientist and VP at Citi’s Innovation Lab. Research Scientist at IBM Research. Research Fellow at the Web Science Lab, L3S Research Center, Germany.

Claudia Orellana-Rodriguez, M.Sc.
Co-Founder and Chief Scientific Officer

Claudia is a scientist and engineer whose work leverages machine learning, natural language processing, social network analysis, and opinion mining to unveil patterns of engagement, attention, and influence in the digital era. Claudia is also a researcher at the Insight Centre for Data Analytics in UCD and a collaborator with the MIT Center for Civic Media.

slide-3
SLIDE 3

Our mission is to widely disseminate the benefits of Artificial Intelligence and Machine Learning and make them accessible to the world

3

slide-4
SLIDE 4

AI and News

We envision a future where journalists will no longer be limited to reporting past or current affairs, but will be empowered by Artificial Intelligence to write about future events with a fair degree of certainty

4

slide-5
SLIDE 5

AI and News

  • Everything is connected, and there are clear historical signs and cycles that produce very similar consequences. Understanding such interconnections and causality is fundamental for comprehensive news coverage
  • However, connecting the dots and discovering the multiple relationships among events, entities, and global risks are not trivial tasks for journalists

5

slide-6
SLIDE 6

World Economic Forum: The Global Risks Interconnections

WEF: https://www.weforum.org/reports/the-global-risks-report-2018

6

A "global risk" is defined as an uncertain event or condition that, if it occurs, can cause significant negative impact for several countries or industries within the next 10 years.

Global risks fall into 5 broad classes:

(1) Economic Risks
(2) Environmental Risks
(3) Geopolitical Risks
(4) Societal Risks
(5) Technological Risks

slide-7
SLIDE 7

Minerva: automatically generate a Global Risks Interconnections Map from large news datasets and web sources

7

slide-8
SLIDE 8

Project: Minerva

Learning to Predict the Global Risks Interconnections from the Web

Prototype based on Artificial Intelligence and Machine Learning that mines the Web and predicts the (non-obvious) interconnections of global risks that will be at the core of tomorrow's news

8

slide-9
SLIDE 9

Libre AI for News

minerva

  • Classification of news articles into Global Risks
  • Detection of key entities: persons, organizations, locations
  • Unveil existing relationships: graph of interconnections
  • Predict: infer future connections

<<discover>> <<enhance>>

Minerva: Learn to Predict the Global Risks Interconnections from Data

9

slide-10
SLIDE 10

Minerva: AI/ML Pipeline

10

Pipeline stages: Stream of Documents, Global Risk Classification, Entity Extraction, Relation Extraction, Nowcasting, Global Risk Graph Creation, Visualization

Data source: Common Crawl News
  • Daily: ~4 GB - 5 GB
  • In 2018: ~1.5 T

http://commoncrawl.org/2016/10/news-dataset-available/

slide-11
SLIDE 11

Minerva: AI/ML Pipeline

11

Diagram (Global Risk Classification stage): Stream of Documents; Document Embedding; Risk Classifier (Risk / No-Risk: yes / no); Global Risk Prediction

slide-12
SLIDE 12

Minerva: AI/ML Pipeline

12

Diagram (Global Risk Classification stage): Stream of Documents; Document Embedding; Risk Classifier (Risk / No-Risk: yes / no); Global Risk Prediction

Article Extraction

  • News Please: https://github.com/fhamborg/news-please
  • Unicode, Dammit: https://www.crummy.com/software/BeautifulSoup/bs4/doc/

Embedding

  • GloVe: https://nlp.stanford.edu/projects/glove/

slide-13
SLIDE 13

Minerva: AI/ML Pipeline

13

Diagram (Entity Extraction stage): Stream of Documents; Entity Extractor; output pairs (entity, global risk)

slide-14
SLIDE 14

Minerva: AI/ML Pipeline

14

Diagram (Entity Extraction stage): Stream of Documents; Entity Extractor; output pairs (entity, global risk)

NLP – NER: spaCy (embed, encode, attend, predict; CNN + GloVe) https://spacy.io/
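The (entity, global risk) pairs emitted by the entity extractor can be collected into one entity set per risk, which is the input the later relation-extraction step needs. The sketch below is a minimal illustration in plain Python, not the Minerva implementation: the function name `entities_by_risk`, the lowercase/strip normalization, and the sample pairs are all assumptions made for the example.

```python
from collections import defaultdict


def entities_by_risk(pairs):
    """Group (entity, global_risk) pairs into one entity set per risk."""
    grouped = defaultdict(set)
    for entity, risk in pairs:
        # Assumed normalization: trim and lowercase so that surface
        # variants of the same name collapse into a single entity.
        grouped[risk].add(entity.strip().lower())
    return dict(grouped)


# Hypothetical pairs, invented for illustration only.
pairs = [
    ("European Central Bank", "Economic"),
    ("Dublin", "Economic"),
    ("Dublin", "Environmental"),
    ("dublin ", "Environmental"),
]
risk_entities = entities_by_risk(pairs)
```

A real pipeline would likely need stronger entity normalization (aliases, entity linking) than the simple lowercasing assumed here.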

slide-15
SLIDE 15

Minerva: AI/ML Pipeline

15

Diagram (Relation Extraction stage): Stream of Documents; (entity, global risk) pairs; Relation Extraction; Global Risks Links

slide-16
SLIDE 16

Minerva: AI/ML Pipeline

16

Diagram (Relation Extraction stage): Stream of Documents; (entity, global risk) pairs; Relation Extraction; Global Risks Links

Link between two risks: risk_i --- w_ij --- risk_j

Strategy 1: Jaccard Similarity (faster)

$$w_{ij} = \frac{|\mathrm{entities}_i \cap \mathrm{entities}_j|}{|\mathrm{entities}_i \cup \mathrm{entities}_j|}$$

Strategy 2: Semantic Similarity

$$w_{ij} = \mathrm{sim}\big(\mathrm{embedding}(\mathrm{entities}_i),\ \mathrm{embedding}(\mathrm{entities}_j)\big)$$
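Both link-weight strategies can be sketched in a few lines of plain Python. The Jaccard function follows the formula on this slide directly. For the semantic strategy the slides do not say how `embedding(entities)` or `sim` are defined, so the version below assumes mean-pooled entity vectors compared with cosine similarity; the `vectors` lookup table and all function names are hypothetical.

```python
import math


def jaccard(entities_i, entities_j):
    """Strategy 1: |intersection| / |union| of two entity sets."""
    union = entities_i | entities_j
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(entities_i & entities_j) / len(union)


def mean_vector(entities, vectors):
    """Assumed pooling: average the vectors of all entities we can look up."""
    known = [vectors[e] for e in entities if e in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[d] for v in known) / len(known) for d in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def semantic_similarity(entities_i, entities_j, vectors):
    """Strategy 2 under the mean-pooling + cosine assumption."""
    u = mean_vector(entities_i, vectors)
    v = mean_vector(entities_j, vectors)
    if u is None or v is None:
        return 0.0
    return cosine(u, v)


# Toy entity sets: 2 shared entities out of 4 distinct ones.
w_ij = jaccard({"a", "b", "c"}, {"b", "c", "d"})
```

Jaccard needs only set operations, which is consistent with the slide labelling it the faster strategy; the semantic variant additionally requires an embedding lookup for every entity.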

slide-17
SLIDE 17

Minerva: AI/ML Pipeline

17

Nowcasting timeline: extract at t_1, t_2, ..., t_n; predict at t_{n+1}

slide-18
SLIDE 18

Minerva: AI/ML Pipeline

18

Link between two risks: risk_i --- w_ij --- risk_j

Nowcasting timeline: extract at t_1, t_2, ..., t_n; predict at t_{n+1}

$$\underbrace{w_{ij}^{(t_1)},\ w_{ij}^{(t_2)},\ \ldots,\ w_{ij}^{(t_n)}}_{\text{extract}} \;\rightarrow\; \underbrace{w_{ij}^{(t_{n+1})}}_{\text{predict}}$$
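Turning a sequence of extracted link weights into training examples for a forecaster can be sketched as a sliding window: each run of past weights becomes an input and the weight that follows it becomes the target. This is an illustrative assumption about data preparation, not something the slides specify; `make_windows`, the `history` parameter, and the sample values are hypothetical.

```python
def make_windows(series, history):
    """Split one link-weight series into (past window, next value) pairs."""
    if history < 1:
        raise ValueError("history must be at least 1")
    return [
        (series[start:start + history], series[start + history])
        for start in range(len(series) - history)
    ]


# Hypothetical weights w_ij observed at t_1 .. t_5 (illustration only).
weights = [0.10, 0.12, 0.15, 0.20, 0.18]
samples = make_windows(weights, history=3)
# Each sample pairs three consecutive weights with the weight that follows.
```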

slide-19
SLIDE 19

Minerva: AI/ML Pipeline

19

Link between two risks: risk_i --- w_ij --- risk_j

Nowcasting timeline: extract at t_1, t_2, ..., t_n; predict at t_{n+1}

$$\underbrace{w_{ij}^{(t_1)},\ w_{ij}^{(t_2)},\ \ldots,\ w_{ij}^{(t_n)}}_{\text{extract}} \;\rightarrow\; \underbrace{w_{ij}^{(t_{n+1})}}_{\text{predict}}$$

Time Series Forecasting with CNN

  • Conv1D with dilation and "causal" padding
  • ~ WaveNet
  • All link weights predicted simultaneously
  • Keras / TensorFlow

https://keras.io/layers/convolutional/
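To make "dilation" and "causal" padding concrete, the sketch below implements a single causal dilated 1D convolution in plain Python. It is not the Keras model from the talk: it only mirrors what a `Conv1D` layer with causal padding computes for one channel, so that the output at time t depends on the input at t and at earlier steps spaced `dilation` apart, never on the future. The function names and sample values are assumptions for illustration.

```python
def causal_dilated_conv1d(x, kernel, dilation=1):
    """Single-channel causal convolution: out[t] uses only x[t], x[t-d], ..."""
    k = len(kernel)
    out = []
    for t in range(len(x)):
        total = 0.0
        for i, w in enumerate(kernel):
            # The last tap reads x[t]; earlier taps step back by `dilation`.
            src = t - (k - 1 - i) * dilation
            if src >= 0:  # positions before the start act as zero padding
                total += w * x[src]
        out.append(total)
    return out


def receptive_field(kernel_size, dilations):
    """Number of past steps seen by a stack of causal dilated layers."""
    return 1 + (kernel_size - 1) * sum(dilations)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
# With kernel [1, 1] and dilation 2, out[t] = x[t - 2] + x[t].
y = causal_dilated_conv1d(x, kernel=[1.0, 1.0], dilation=2)
```

Stacking such layers with dilations 1, 2, 4, ... grows the receptive field quickly while keeping each layer small, which is the WaveNet-style idea the slide refers to.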

slide-20
SLIDE 20

Minerva: Interactive Graph

20

Interactive Visualization: D3.js https://d3js.org/

slide-21
SLIDE 21

War without Rules

Offensive cyber capabilities are developing more rapidly than our ability to deal with hostile incidents. This creates a fog of uncertainty in which potential miscalculations could trigger a spiral of retaliatory responses. Imagine that a country’s critical infrastructure systems are compromised by a cyberattack, leading to disruption of essential services and loss of life; the pressure to retaliate would build rapidly, potentially setting off an escalatory chain reaction. [...]

21

WEF: https://www.weforum.org/reports/the-global-risks-report-2018

slide-22
SLIDE 22

Experimental Evaluation

  • Are the main connections between Global Risks predicted?
  • Ground truth: World Economic Forum reports
  • Metric (averaged over all risks):

$$\mathrm{Precision@}n = \frac{|\{\text{relevant links}\} \cap \{\text{top-}n\ \text{predicted links}\}|}{n}$$

  • Dataset: Common Crawl News 2018 sample (all articles are in English), articles from Irish news media outlets, major press agencies, and major newspapers around the world
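The Precision@n metric above can be computed as follows. This is a minimal sketch under stated assumptions: links for one risk are represented by the name of the risk at the other end, the predicted list is ranked by descending weight and contains no duplicates, and the risk names and data are hypothetical placeholders rather than results from the evaluation.

```python
def precision_at_n(relevant, predicted_ranked, n):
    """Fraction of the top-n predicted links that are in the relevant set."""
    if n <= 0:
        raise ValueError("n must be positive")
    top_n = predicted_ranked[:n]  # assumes no duplicate links in the ranking
    hits = sum(1 for link in top_n if link in relevant)
    return hits / n


def mean_precision_at_n(relevant_by_risk, predicted_by_risk, n):
    """Average Precision@n over all risks that have ground-truth links."""
    scores = [
        precision_at_n(relevant, predicted_by_risk.get(risk, []), n)
        for risk, relevant in relevant_by_risk.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical ground truth and rankings (placeholders, not real results).
relevant_by_risk = {"risk_A": {"risk_B", "risk_C"}, "risk_B": {"risk_A", "risk_D"}}
predicted_by_risk = {
    "risk_A": ["risk_B", "risk_E", "risk_C"],
    "risk_B": ["risk_A", "risk_D", "risk_E"],
}
score = mean_precision_at_n(relevant_by_risk, predicted_by_risk, n=2)
```

Dividing by n rather than by the number of returned links means a risk with fewer than n predictions is penalized, which matches the formula as written on the slide.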

22

slide-23
SLIDE 23

Experimental Evaluation

23

[Figure: evaluation results; only the label "2018" survives from the slide image]

slide-24
SLIDE 24

Conclusion

  • “It is difficult to make predictions, especially about the future” (Danish proverb)
  • Predicted Global Risk Interconnections are more accurate for short horizons (nowcasting), e.g., months or quarters rather than the 10 years in the WEF definition
  • Entity-based relations are a promising proxy for risk interconnections
  • The computationally cheaper Jaccard similarity leads to better precision than the embedding-based strategy
  • Next: continue the evaluation and run a user study
  • Initial rollout with partners. If interested, let me know

24

slide-25
SLIDE 25

AI and News

We envision a future where journalists will no longer be limited to reporting past or current affairs, but will be empowered by Artificial Intelligence to write about future events with a fair degree of certainty

25

Thank you!

Dr. Ernesto Diaz-Aviles
Co-Founder, CEO and Chief Scientist at Libre AI and Adjunct Assistant Professor at UCD
<ernesto@libreai.com>
https://libreai.com

slide-26
SLIDE 26

26

LibreAI.com