SLIDE 1

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 740688

Real-time Early Detection and Alert System for Online Terrorist Content based on Natural Language Processing, Social Network Analysis, Artificial Intelligence and Complex Event Processing

Research & Innovation for Secure Societies

Monica Florea, Head of Unit EU Projects, SIVECO Romania
Monica.Florea@siveco.ro

SLIDE 2

Project Fact Sheet

  • Project ID: 740688
  • Start: 01-06-2017
  • End: 31-05-2020
  • Budget: 5,064,437.5 Euros
  • Project Coordinator: Monica Florea - SIVECO Romania
  • Type of action: Research and Innovation Action (RIA)

SLIDE 3

Objective

  • Provide a complete toolkit for LEAs (law enforcement agencies) to collect, process, visualize and store online data related to terrorist groups, covering propaganda, fundraising, recruitment and mobilization, networking, information sharing, planning/coordination, data manipulation and misinformation.
  • Cover a wide range of social media channels, in particular new targeted channels, which are increasingly used by terrorist groups to disseminate their content.
  • Allow LEAs to take coordinated action in real time while preserving the privacy of citizens.

SLIDE 4

Fighting Terrorist Cyber Propaganda (1)

Social media providers are determined to fight terrorist propaganda on their platforms. There is no specific tool for identifying terrorist content on the Internet and social media tailored to LEAs’ needs. LEAs must rely on proprietary spam-fighting tools, user reports and human analysis in order to detect accounts promoting terrorism.

SLIDE 5

Fighting Terrorist Cyber Propaganda (2)

  • 1 terror attack attempted every 9 days in Europe in 2017
  • Perpetrators were radicalized individuals recruited via online communication channels and social media

[Figure: Terror attacks in Europe and Turkey. Source: AFP (not including the London June 3rd attack)]

SLIDE 6

Social media, friend or foe?

Extremist and terrorist groups use the Internet for a myriad of purposes, including psychological warfare, propaganda, fundraising, recruitment and mobilization, networking, information sharing, planning/coordination, data manipulation and misinformation. All active terrorist groups have established at least one form of presence on the Internet, and most of them are using several formats of online platforms.

Therefore, online content monitoring and analysis is a critical part of almost every national security investigation.

SLIDE 7

Mission

  • You cannot wage a traditional war against terrorists, but a key to fighting terrorism is good intelligence based on big data analysis
  • The only way to protect the citizens and apprehend terrorists before they execute their plans is to know what they are planning in advance
  • It is also essential to detect cyber propaganda in order to fight radicalization
  • The only way to protect vulnerable individuals is to identify, monitor and counteract online media channels used in terrorist cyber propaganda

SLIDE 8

Innovation: RIA

RIA (Research and Innovation Action): research projects tackling clearly defined challenges, which can lead to the development of new knowledge or a new technology.

RED-Alert combines AI methods with Social Network Analysis (SNA) and Natural Language Processing (NLP) technologies to detect anomalies in content production, content nature and content spread, in order to provide early detection of terrorist activities. The input from the AI, SNA, Semantic Multimedia Analysis (SMA) and NLP technologies will be fed into a Complex Event Processing (CEP) engine to predict potential threat areas based on content production patterns, allowing the LEAs to analyse, monitor or take action on online terrorist content.

SLIDE 9

Meet the partners

15 partners from 7 countries

Law enforcement agencies (5)

  • Ministerio del Interior - Guardia Civil, Spain (GUCI)
  • Ministry of Public Security - Israel National Police (MOPS-INP)
  • Metropolitan Police Service, UK (SO15)
  • Protection and Guard Service, Romania (SPP)
  • Protection and Guard Service, Republic of Moldova (SPPS)

SME innovation champions (4)

  • Intu-View Ltd (INT)
  • Usatges Bcn 21 Sl (INSKT)
  • Maven Seven Solution Technology (MAV)
  • Information Catalyst for Enterprise Ltd (ICE)

Research/academic organizations (4)

  • Interdisciplinary Center Herzliya (ICT)
  • Eotvos Lorand Tudomanyegyetem (ELTE)
  • City University of London (CITY)
  • Birmingham City University (BCU)

Industrial partner (1)

  • SIVECO Romania (SIV)

Regulatory association (1)

  • Malta Information Technology Law Association (MITLA)

SLIDE 10

Project phases

SLIDE 11

Pilots and their location

GUCI, Spain

The pilot will deploy the solution in the Intelligence Service of the Guardia Civil Headquarters. GUCI will be able to apply the RED-Alert pilot to the analysis of the propaganda, funding and recruitment impact of terrorist elements. The pilot will encompass several teams from different GUCI units, whose analysts will have access to the RED-Alert system software in order to improve our fight against crime and terrorism. The pilot will seek to use the RED-Alert software to improve our investigations in real time.

SO15, UK

The RED-Alert solution will be used in accordance with RIPA on real social intelligence, but during the trials we will not be targeting known subjects of interest. The analysts, under the guidance of the research & development manager, will configure the software with specific keywords and languages that will assist in identifying key individuals and associated networks in real time.

SPP, Romania

The pilot will deploy the solution in the main SPP facility. To test the full capacity, efficiency, usability and accuracy of the RED-Alert tool, intelligence analysts will use it in parallel with existing tools.

SPPS, Republic of Moldova

After the implementation, the solution will be tested in a real environment in SPPS daily missions. One of the workstations will handle the existing classified intelligence system and the other one will process the RED-Alert information, so the solution does not jeopardize the SPPS classified network.

MOPS-INP, Israel

One of the workstations will handle the existing classified intelligence system and the other one will process the RED-Alert information, so the solution does not jeopardize the INP classified network. The outputs from one system will be used as inputs for the other system. The RED-Alert pilot will start gradually to process the information stored in INP existing databases, related to terrorist activities, groups or persons.

SLIDE 12

RAD (Rapid Application Development) Methodology

  • The initial phase of Requirements Analysis, together with the Technical Specifications and Architecture, was performed during WP1 “System Specifications & Architecture”.
  • WP6 “Solution Integration” also covers the User-Centric Design and System Construction phases of RAD, involving several iterations where end-users interact with developers to design models and build prototypes. It also involves performing system integration and testing activities to ensure that components work well together, as designed.
  • The final release is planned after solution deployment, pilot execution and user feedback, which will be performed in WP7 “LEA Pilots”, covering the Implementation phase of RAD.

SLIDE 13

Main use cases

  • Analyst: main user of the RED-Alert system, involved in data gathering and various types of analyses; performs case work and produces reports.
  • Coordinator: distributes tasks to analysts, monitors and reviews their work.
  • Super-user: inputs keywords and accesses harder-to-use or high-responsibility functionalities.

SLIDE 14

RED-Alert central solution Interfaces

SLIDE 15

Social Network Analysis (SNA)

  • Algorithms are being implemented to:

– Quantitatively describe topological structures of social connections between individuals and measure graph patterns in topic maps from NLP results.
– Identify groups of people from time series data, where the group members can change during the observation.
– Predict missing links for partial datasets or give a probability score for existing relationships in noisy data.
– Reveal hierarchical structures from flat datasets.

The resulting solutions of Social Network Analysis construct new networks from input data: either from co-occurrence statistics or from directed networks containing loops.
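As an illustration of two of the SNA tasks listed for this slide (building a network from co-occurrence statistics and scoring possibly missing links), here is a minimal Python sketch. It is not the RED-Alert implementation: the input format, the function names and the choice of the Jaccard index as link score are assumptions made only for this example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_network(posts):
    """Build an undirected weighted edge list.

    Edge weight = number of posts in which two accounts appear together.
    """
    weights = Counter()
    for accounts in posts:
        for a, b in combinations(sorted(set(accounts)), 2):
            weights[(a, b)] += 1
    return weights


def neighbours(weights):
    """Derive adjacency sets from the weighted edge list."""
    adj = {}
    for a, b in weights:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def jaccard_link_scores(adj):
    """Score every unconnected pair by the overlap of their neighbourhoods."""
    scores = {}
    for a, b in combinations(sorted(adj), 2):
        if b in adj[a]:
            continue  # already linked, nothing to predict
        union = adj[a] | adj[b]
        if union:
            scores[(a, b)] = len(adj[a] & adj[b]) / len(union)
    return scores


# Hypothetical toy input: each post lists the accounts that appear in it.
posts = [["ana", "bo", "cy"], ["ana", "bo"], ["cy", "dee"], ["bo", "dee"]]
weights = cooccurrence_network(posts)
scores = jaccard_link_scores(neighbours(weights))
print(weights)
print(scores)
```

In this toy data "ana" and "dee" never co-occur but share all their neighbours, so they receive the highest possible score. A real system would likely calibrate such scores into probabilities on labelled data.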

SLIDE 16

SNA: Implement network dynamics and temporal network structure models

Methodology: identify topological structures and clusters in evolving networks.

  • By finding dense subgraphs with many common nodes and links, the module is able to follow a time track of changing social groups, where new members appear, old members disappear or some individuals change their group membership.
  • The hierarchy of a growing group evolves by developing new levels and accumulating members at different positions.
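Following a group over time can be illustrated with a small Python sketch. It assumes that groups have already been detected in each time snapshot (for example as dense subgraphs) and only shows the matching step, using member overlap. The snapshot format, the function names and the 0.3 threshold are hypothetical and not taken from the project.

```python
def jaccard(a, b):
    """Overlap of two member sets, between 0 and 1."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def track_groups(before, after, min_overlap=0.3):
    """Match each group in `before` to its most similar group in `after`.

    Returns {old_name: (new_name or None, members_joined, members_left)}.
    """
    result = {}
    for old_name, old_members in before.items():
        best_name, best_score = None, 0.0
        for new_name, new_members in after.items():
            score = jaccard(old_members, new_members)
            if score > best_score:
                best_name, best_score = new_name, score
        if best_name is None or best_score < min_overlap:
            # No sufficiently similar successor: treat the group as dissolved.
            result[old_name] = (None, set(), set(old_members))
        else:
            new_members = after[best_name]
            result[old_name] = (
                best_name,
                new_members - old_members,  # joined
                old_members - new_members,  # left
            )
    return result


# Hypothetical snapshots at two points in time.
t0 = {"g1": {"a", "b", "c"}, "g2": {"d", "e"}}
t1 = {"h1": {"a", "b", "f"}, "h2": {"c", "d", "e"}}
changes = track_groups(t0, t1)
print(changes)
```

Here member "c" leaves the first group and joins the second, while "f" appears as a new member, which mirrors the three kinds of change named on the slide.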

SLIDE 17

SNA: quantitative topological features

  • The core-periphery analysis reveals central members and highlights weakly connected individuals in large social databases.
  • The NLP module provides detailed ontological features of human conversations. The SNA module creates a holistic overview from the details.
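One common way to separate a core from a periphery is k-core decomposition. The slides do not say which method RED-Alert uses, so the Python sketch below is only an assumed stand-in: it peels off the least connected node repeatedly, and nodes that survive longest get the highest core number.

```python
def core_numbers(adj):
    """Return the k-core number of every node.

    The node with the smallest remaining degree is removed repeatedly;
    its core number is the largest such minimum degree seen so far.
    """
    degree = {node: len(nbrs) for node, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=lambda n: (degree[n], n))
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for nbr in adj[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


# Hypothetical network: a triangle (a, b, c) with a chain d, e hanging off it.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a", "e"},
    "e": {"d"},
}
core = core_numbers(adj)
top = max(core.values())
central = {n for n, k in core.items() if k == top}
peripheral = set(core) - central
print(central, peripheral)
```

The triangle forms the core and the chain forms the periphery. This simple version rescans all nodes on every step, so it is meant for small examples rather than large social databases.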

SLIDE 18

Complex Event Processing (CEP)

  • Specific CEP applications are being implemented for the RED-Alert scenarios / use cases.
  • Event patterns are being developed by two methods:

– By domain experts;
– By ML techniques.

  • The implemented mini-CEPs are able to query past events and to handle the query results, so they are able to compare current and historical states and to reason over time and space, two capabilities that existing semantic CEP tools currently lack.
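The idea of comparing the current state with historical events can be sketched in a few lines of Python. This is a toy under stated assumptions, not the RED-Alert mini-CEP: the `Event` fields, the `MiniCEP` class, the window sizes and the alert rule (current count well above the historical rate for the same region) are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since an arbitrary start
    region: str       # where the content was observed (hypothetical field)
    kind: str         # e.g. "flagged_post" (hypothetical label)


class MiniCEP:
    """Stores past events and compares the current window with history."""

    def __init__(self, window=3600.0, history=86400.0, factor=3.0, min_count=3):
        if not 0 < window < history:
            raise ValueError("window must be positive and shorter than history")
        self.window = window
        self.history = history
        self.factor = factor
        self.min_count = min_count
        self.events = deque()

    def process(self, event):
        """Store the event and return an alert string or None.

        Events are assumed to arrive in timestamp order.
        """
        self.events.append(event)
        # Forget events older than the history horizon.
        while self.events[0].timestamp < event.timestamp - self.history:
            self.events.popleft()
        window_start = event.timestamp - self.window
        same = [e for e in self.events
                if e.region == event.region and e.kind == event.kind]
        current = sum(1 for e in same if e.timestamp > window_start)
        past = len(same) - current
        # Average number of matching events per window in the older history.
        baseline = past / ((self.history - self.window) / self.window)
        if current >= self.min_count and current > self.factor * baseline:
            return (f"ALERT {event.region}: {current} '{event.kind}' events "
                    f"in window (baseline {baseline:.2f})")
        return None


cep = MiniCEP(window=10, history=100, factor=3, min_count=3)
alerts = [cep.process(Event(t, "X", "flagged_post")) for t in (0, 50, 95, 96, 97)]
print(alerts[-1])
```

The pattern here is hand-written, which corresponds to the "by domain experts" route; a learned pattern would replace the fixed `factor` and `min_count` values.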

SLIDE 19

Privacy, Visualization and Meta-Learning

  • The anonymization tool takes all incoming data as input and removes the possibility of an individual being identified from the anonymized data by using a combination of well-known privacy definitions such as:

– k-anonymity;
– t-closeness;
– l-diversity; and
– differential privacy.

  • The visualization tool provides a platform for a graphical representation of a social network.
  • In order to keep the tool adaptable to newly identified words and network dynamics, the meta-learning tool developed under this WP triggers regular updates, thus improving the efficiency of the RED-Alert solution.
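Two of the privacy definitions named on this slide can be checked with very little code. The Python sketch below measures k-anonymity and (distinct) l-diversity of an already generalized table. The column names and records are invented for illustration, and the sketch only measures these properties; it does not perform the anonymization itself.

```python
from collections import defaultdict


def equivalence_classes(rows, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[q] for q in quasi_identifiers)
        groups[key].append(row)
    return groups


def k_anonymity(rows, quasi_identifiers):
    """Largest k for which the table is k-anonymous (smallest class size)."""
    classes = equivalence_classes(rows, quasi_identifiers)
    return min(len(group) for group in classes.values())


def l_diversity(rows, quasi_identifiers, sensitive):
    """Smallest number of distinct sensitive values within any class."""
    classes = equivalence_classes(rows, quasi_identifiers)
    return min(len({row[sensitive] for row in group}) for group in classes.values())


# Hypothetical, already generalized records.
rows = [
    {"age": "20-29", "region": "North", "topic": "sports"},
    {"age": "20-29", "region": "North", "topic": "politics"},
    {"age": "30-39", "region": "South", "topic": "music"},
    {"age": "30-39", "region": "South", "topic": "music"},
    {"age": "30-39", "region": "South", "topic": "sports"},
]
k = k_anonymity(rows, ["age", "region"])
l = l_diversity(rows, ["age", "region"], "topic")
print(k, l)
```

Every combination of age range and region is shared by at least two records, and every such group contains at least two different topics, so this table is 2-anonymous and 2-diverse.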

SLIDE 20

Social Language Processing

  • NLP features to process the texts and output categorization models based on:

– Linguistic features: a wide range of features that are automatically learned from analysis of the training corpora;
– Ontological features: the disambiguated ontological instances that are linked to lexical features and determine the precise meaning of the lexical feature.

  • Automatic classifier feature to identify dangerous messages
  • SMA tool that covers the following features:

– Separation of audio elements into speech, music and events (such as gunfire, explosions, crowd noises);
– Extraction of speech audio for input into speech-to-text engines; and
– Extraction and identification of image and video scene elements such as logos, flags, weapons, faces.

SLIDE 21

Social Language Processing

NLP features to process the texts & output categorization models

SLIDE 22

Social Language Processing

NLP features to process the texts & output categorization models

  • Topic: converts the post into the word vector domain and calculates the distances from this point to the defined topics. If a distance is shorter than a defined threshold, the topic is assigned.
  • Key ideas: extracts particular patterns of words from the text that are identified as key ideas.

– The method works by performing POS tagging on the text and then extracting the words that match specific language-dependent patterns.
– These methods include tokenizer and stopword remover functions.
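The topic assignment step can be sketched as follows. The slide does not say how a post is mapped to a point or which distance is used, so this Python example assumes the simplest choices: the post is the average of its word vectors and the distance is cosine distance. The tiny 3-dimensional vectors, the topic centres and the 0.3 threshold are made up for the example.

```python
import math


def mean_vector(tokens, vectors):
    """Average the vectors of the known tokens; None if no token is known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine_distance(u, v):
    """1 - cosine similarity; 1.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def assign_topics(post, vectors, topics, threshold=0.3):
    """Return every topic whose centre is closer than the threshold."""
    point = mean_vector(post.lower().split(), vectors)
    if point is None:
        return []
    return [name for name, centre in topics.items()
            if cosine_distance(point, centre) < threshold]


# Hypothetical word vectors and topic centres.
vectors = {
    "weapon": [1.0, 0.0, 0.0],
    "attack": [0.9, 0.1, 0.0],
    "donate": [0.0, 1.0, 0.0],
    "money": [0.1, 0.9, 0.0],
}
topics = {"violence": [1.0, 0.0, 0.0], "fundraising": [0.0, 1.0, 0.0]}
assigned = assign_topics("Donate money now", vectors, topics)
print(assigned)
```

A post can receive several topics, or none, depending on the threshold. Unknown words such as "now" are simply ignored in this sketch.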

SLIDE 23

Social Language Processing

NLP features to process the texts & output categorization models

  • Entities: automatic extraction of entities

– Locations, organizations, public celebrities, etc.

  • Ontological features are the disambiguated ontological instances that are linked to lexical features and determine the precise meaning of the lexical feature.

SLIDE 24

Social Language Processing

Artificial Intelligence applied to automatic text classification

  • Supervised learning is applied to calculate the probability that a post has content related to specific domains

– Such as Jihadism, Extreme Right, etc.

  • Text classifiers are trained from an annotated corpus and extrapolated to other languages using aligned word vectors.
  • Word vectors are calculated with 300 dimensions.
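The cross-lingual idea can be illustrated with a toy Python example: a classifier is trained on labelled English text only and then applied to Spanish text, which works because both languages share one aligned vector space. Everything concrete here is an assumption for illustration: the 2-dimensional vectors (the slide states 300 dimensions), the vocabulary, the labels and the use of plain logistic regression, since the slides do not name the model used in RED-Alert.

```python
import math

# Hypothetical aligned vectors: translations have nearly the same coordinates.
VECTORS = {
    "en": {"hate": [1.0, 0.0], "enemy": [0.9, 0.1],
           "sport": [0.0, 1.0], "recipe": [0.1, 0.9]},
    "es": {"odio": [0.95, 0.05], "enemigo": [0.9, 0.1],
           "deporte": [0.05, 0.95], "receta": [0.1, 0.9]},
}
DIM = 2


def embed(text, lang):
    """Represent a text as the average of its known word vectors."""
    words = [VECTORS[lang][w] for w in text.lower().split() if w in VECTORS[lang]]
    if not words:
        return [0.0] * DIM
    return [sum(v[i] for v in words) / len(words) for i in range(DIM)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, epochs=500, lr=0.5):
    """Fit logistic regression by batch gradient descent."""
    w, b = [0.0] * DIM, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * DIM, 0.0
        for x, y in samples:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for i in range(DIM):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [w[i] - lr * grad_w[i] / n for i in range(DIM)]
        b -= lr * grad_b / n
    return w, b


def predict(text, lang, model):
    """Probability that the text belongs to the positive class."""
    w, b = model
    x = embed(text, lang)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Train on English only (label 1 = domain of interest, 0 = other).
english = [("hate enemy", 1), ("enemy", 1), ("sport recipe", 0), ("recipe", 0)]
model = train([(embed(text, "en"), label) for text, label in english])

p_pos = predict("odio enemigo", "es", model)
p_neg = predict("deporte receta", "es", model)
print(round(p_pos, 2), round(p_neg, 2))
```

The Spanish texts were never seen during training, yet they fall on the expected side of the decision boundary because their vectors lie close to those of their English translations.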

SLIDE 25

Social Language Processing

Artificial Intelligence applied to automatic text classification

[Figure: classification example with the classes “Extreme Right” and “Not Extreme Right”]

SLIDE 26

Semantic Multimedia Analysis (SMA)

  • Multimedia is extensively used in social networks nowadays and is gaining popularity among users with the increasing growth in network capacity, connectivity and speed.
  • Moreover, affordable prices of data plans, especially mobile data packages, have considerably increased the use of multimedia by different users.

– This includes terrorists who use social media platforms to promote their ideology and intimidate their adversaries. It is therefore very important to develop automated solutions to semantically analyse given multimedia contents.

SLIDE 27

SMA Tool

  • The SMA Tool is developed to provide the following functionalities to the RED-Alert architecture:

1. Separation of audio elements into speech, music and events (such as gunfire, explosions, crowd noises);
2. Extraction of speech audio for input into speech-to-text engines; and
3. Extraction and identification of image and video scene elements such as logos, flags, weapons, faces.

  • Speech is processed in three stages:

– Speaker Diarisation: segmentation of audio into phrases spoken by individual speakers. At this stage the gender of each speaker is also determined.
– Language Identification: identifying the language of each of the phrases segmented during diarisation.
– Transcription: transcription of the phrases into text.

SLIDE 28

SMA Tool – Supported Languages

  • Arabic
  • Chinese
  • Dutch
  • English
  • Finnish
  • French
  • German
  • Greek
  • Hungarian
  • Italian
  • Latvian
  • Lithuanian
  • Polish
  • Portuguese
  • Brazilian Portuguese
  • Romanian
  • Russian
  • Spanish
  • Swedish
  • Turkish

Support for Hebrew and Ukrainian will be available shortly.

SLIDE 29

SMA Tool – object detection

  • The SMA tool’s object detection utility uses the Faster R-CNN structure, which outputs bounding boxes for each identified object.

– Faster R-CNN is constructed primarily of two separate networks: a Region Proposal Network (RPN), which produces suggestions of regions of an image which might contain objects, and a typical Convolutional Neural Network (CNN), which generates a feature map and classifies the objects in the proposed regions.
– The processing applied by Faster R-CNN can be split into three main stages:

1. Apply a CNN to the input image to create a feature map.
2. Pass the feature map to the RPN to identify areas which are likely to contain objects.
3. Pass the feature map and proposed regions through several more fully connected layers to classify the object present in each region and refine the coordinates of its bounding box.
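The last part of stage 3, refining the coordinates of a bounding box, can be shown in isolation. In the standard Faster R-CNN parameterization the network predicts four offsets per region: a shift of the box centre relative to the box size and a log-scale change of width and height. The Python sketch below only applies such offsets to a proposal; the function name and the example numbers are hypothetical, and the networks that would produce the offsets are not shown.

```python
import math


def refine_box(proposal, deltas):
    """Apply predicted offsets to a proposed region.

    proposal: (x1, y1, x2, y2) corner coordinates in pixels.
    deltas:   (tx, ty, tw, th) where tx, ty shift the centre as a fraction
              of the box size and tw, th scale width and height in log space.
    """
    x1, y1, x2, y2 = proposal
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + w / 2, y1 + h / 2
    tx, ty, tw, th = deltas
    new_cx = cx + tx * w
    new_cy = cy + ty * h
    new_w = w * math.exp(tw)
    new_h = h * math.exp(th)
    return (new_cx - new_w / 2, new_cy - new_h / 2,
            new_cx + new_w / 2, new_cy + new_h / 2)


# Hypothetical proposal and offsets.
unchanged = refine_box((10, 20, 50, 60), (0.0, 0.0, 0.0, 0.0))
shifted = refine_box((10, 20, 50, 60), (0.25, 0.0, 0.0, 0.0))
widened = refine_box((10, 20, 50, 60), (0.0, 0.0, math.log(2), 0.0))
print(unchanged, shifted, widened)
```

Zero offsets leave the proposal unchanged, a `tx` of 0.25 moves the box right by a quarter of its width, and a `tw` of log 2 doubles its width around the same centre.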

SLIDE 30

Integration component

  • Integrates all the SLP, CEP, Data Visualization, Data Privacy and Machine Learning components and includes:

– Main System User Interface;
– User Identification and Access Management;
– Collaborative Workflow/Case Management, offering process management features and tools for both business users and developers;
– Application Integration Services;
– System Interoperability Services;
– Centralized Audit and Logging.

SLIDE 31

Data Privacy

  • Processing of personal data within a law enforcement context brings with it a number of regulatory challenges.
  • RED-Alert has brought in MITLA (an IT law association) as a consortium partner, as well as the Electronic Frontier Foundation (a leading data privacy advocate) as an advisory board member.

SLIDE 32

Dissemination

http://redalertproject.eu/.

SLIDE 33

Thank you for your attention!

Monica.Florea@siveco.ro