SLIDE 1

Unified Benchmarking of Big Data Platforms The HOBBIT Platform

Axel-Cyrille Ngonga Ngomo

Horizon 2020 GA No 688227 01/12/2016–30/11/2018

Apache Big Data Sevilla, Spain November 11, 2016

Ngonga Ngomo (InfAI) Benchmarking Big Data November 15th, 2016 1 / 42

SLIDE 2

A Lot of Data

Source: http://www.ibmbigdatahub.com/infographic/four-vs-big-data

SLIDE 3

A Lot of Tools

Source: https://cloudramblings.me/

SLIDE 4

A Lot ... of Tools

SLIDE 5

A Lot of Views

Source: https://steemit.com/philosophy/@l0k1/subjectivity-and-truth-how-blockchains-model-consensus-building

SLIDE 6

Core Questions

Developers: How good is my tool?
Vendors: Who is my tool good for?
Users: Which tool(s) should I use for my application?

SLIDE 7

Many Questions

Where are the current bottlenecks?
Which steps of the data lifecycle are critical?
Which solutions are available?
Which key performance indicators are relevant?
How well do or should tools perform?
How do existing solutions perform w.r.t. relevant indicators?

SLIDE 8

Solution

Benchmark

SLIDE 9

Solution

Benchmark

Components

Dataset(s), e.g., Twitter stream, sensor data
Task(s), e.g., NER, NEL, ingestion
Key Performance Indicators, e.g., precision, recall

SLIDE 10

Challenges

Dataset Mismatch

[Table: evaluation datasets used per annotation system. Columns: Year; ACE; Wiki; AQUAINT; MSNBC; IITB; Meij; AIDA/CoNLL; N3 collection; KORE50; Wiki-Disamb30; Wiki-Annot30; Spotlight Corpus; SemEval-2013 task 12; SemEval-2007 task 7; SemEval-2007 task 17; Senseval-3; NIF-based corpus; Microposts2014; software available?; web service available? Rows: Cucerzan (2007), Wikipedia Miner (2008), Illinois Wikifier (2011), Spotlight (2011), AIDA (2011), TagMe 2 (2012), Dexter (2013), KEA (2013), WAT (2013), AGDISTIS (2014), Babelfy (2014), NERD-ML (2014), BAT-Framework (2013), NERD (2014), GERBIL (2014). The check marks show that each individual system was evaluated on a different small subset of the datasets; only the unified frameworks (BAT-Framework, GERBIL) cover most of them.]

SLIDE 11

Challenges

Unclear KPI Semantics

SLIDE 12

Challenges

Unclear KPI Semantics

Example: Which time do we measure?

First or last result? With or without network delay?
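The ambiguity can be made concrete with a small sketch (the `streaming_results` generator is a hypothetical system under test, not part of the platform): depending on whether the clock stops at the first or the last result, the reported "query time" differs substantially.

```python
import time

def streaming_results():
    """Hypothetical system under test: yields results one by one."""
    for i in range(3):
        time.sleep(0.01)  # simulated per-result latency (network delay not modeled)
        yield i

def measure(run):
    """Return (time to first result, time to last result) in seconds."""
    start = time.perf_counter()
    first = last = None
    for _ in run():
        now = time.perf_counter()
        if first is None:
            first = now - start
        last = now - start
    return first, last

first, last = measure(streaming_results)
print(f"first result: {first:.3f}s, last result: {last:.3f}s")
```

A benchmark has to fix one of these definitions (and state whether network delay is included) before numbers from different systems can be compared.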

SLIDE 13

Challenges

Unclear KPI Semantics

SLIDE 14

Challenges

Unclear KPI Semantics

Example: When is an annotation correct?

Weak or strong annotation match? Semantically equivalent or exact URI?
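A minimal sketch of the distinction (the `(start, end, uri)` span format and the `same_as` table are illustrative assumptions, not GERBIL's actual API): a strong match requires exact boundaries and the identical URI, while a weak match accepts any overlapping span with an equivalent URI.

```python
def strong_match(gold, pred):
    """Exact span boundaries and identical URI."""
    return gold == pred

def weak_match(gold, pred, same_as=frozenset()):
    """Overlapping spans and semantically equivalent URIs."""
    (gs, ge, gu), (ps, pe, pu) = gold, pred
    overlap = ps < ge and gs < pe
    equivalent = pu == gu or (gu, pu) in same_as or (pu, gu) in same_as
    return overlap and equivalent

gold = (0, 6, "dbr:Berlin")
pred = (0, 11, "dbr:Berlin")     # wider span, same entity
print(strong_match(gold, pred))  # False
print(weak_match(gold, pred))    # True
```

The same system can thus score very differently under the two definitions, which is exactly why the KPI semantics must be pinned down.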

SLIDE 15

Solution

Unified Benchmarking Framework

[Architecture diagram: the GERBIL Core Controller sits between annotators and datasets. Annotators (reference systems behind an Annotator Wrapper, or your own annotator) and datasets (open datasets, e.g. from DataHub.io, behind a Dataset Wrapper, or your own dataset) are accessed via web service calls and interface views; a configuration model drives the benchmark core, and results are written to a Persistent Experiment Database (Model).]

SLIDE 16

GERBIL

Overview

SLIDE 17

GERBIL

Overview

Evaluation platform for NER/NEL
18 reference annotation systems
32 reference datasets
Benchmarking 10× faster

SLIDE 18

GERBIL

Overview

Evaluation platform for NER/NEL
18 reference annotation systems
32 reference datasets
Benchmarking 10× faster
Archiving of results
Citable URIs
Additional analysis

SLIDE 19

GERBIL

Overview

Evaluation platform for NER/NEL
18 reference annotation systems
32 reference datasets
Benchmarking 10× faster
Archiving of results
Citable URIs
Additional analysis
Open-source project
Local deployment
Normalized implementation of KPIs
Online instance
Feedback for developers and users

SLIDE 20

GERBIL

Annotator                 Tasks
NIF-based Annotators       2519
Babelfy                     958
DBpedia Spotlight           922
TagMe 2                     811
WAT                         787
Kea                         763
Wikipedia Miner             714
NERD-ML                     639
Dexter                      587
AGDISTIS                    443
Entityclassifier.eu NER     410
FOX                         352
Cetus                         1

Overall: 24.3K experiments, 50+ papers

SLIDE 21

HOBBIT

Rationale A community-driven benchmarking framework for the community

SLIDE 22

HOBBIT

Rationale: A community-driven benchmarking framework for the community

Focus on Big (Linked) Data
Build upon the 24.3K experiments performed with GERBIL
Cover all steps of the Linked Data lifecycle
Used by a growing number of companies
Mature and maturing technologies
Open benchmarks based on industrial data and use cases

SLIDE 23

Aims

1. Gather real requirements
   Performance indicators
   Performance thresholds
2. Develop benchmarks based on real data
3. Provide a universal benchmarking platform
   Standardized hardware
   Comparable results
4. Periodic benchmarking challenges
5. Periodic reporting
6. Found an independent HOBBIT association

SLIDE 24

Overview

[Overview diagram: industry data feeds data collection and measure collection; these drive benchmark creation (Benchmark 1 … Benchmark n, each with its own KPIs and tasks); the benchmarks run on the HOBBIT Platform against Solution 1 … Solution k; challenges and reports feed results back to the participants/community.]

SLIDE 25

Survey

Questions

In what areas are organizations active?
What do people expect from benchmarks?
How are benchmarks being used?

Profile                 Count
Solution providers         56
Technology users           67
Scientific community       65

SLIDE 26

Survey

Can your solution be benchmarked?

SLIDE 27

Survey

Do you benchmark your solution?

Own datasets and settings in many cases
Own implementations of measures
Results not comparable

SLIDE 28

Survey

Application Areas

SLIDE 29

HOBBIT Platform

Features

Uses established deployment technologies (Docker)
Decoupled components: benchmarks and systems can be written in different languages
Uses scalable message queues for communication
Open-source implementation
Supports distributed benchmarks and systems
Online instance on a server cluster
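The decoupling idea can be sketched in a few lines. The platform itself communicates over a real message broker; here `queue.Queue` merely stands in for that broker, and the component names are invented for illustration: neither side calls the other directly, so either could be replaced by an implementation in another language that speaks to the same queue.

```python
import queue
import threading

# Stand-in for the platform's message bus; in the real platform this
# would be a scalable message broker, not an in-process queue.
bus = queue.Queue()

def benchmark_component():
    """Publishes tasks to the bus, then an end-of-stream marker."""
    for task in ["task-1", "task-2"]:
        bus.put(task)
    bus.put(None)

def system_under_test(results):
    """Consumes tasks from the bus until the end-of-stream marker."""
    while (task := bus.get()) is not None:
        results.append(f"answer({task})")

results = []
t1 = threading.Thread(target=benchmark_component)
t2 = threading.Thread(target=system_under_test, args=(results,))
t1.start(); t2.start(); t1.join(); t2.join()
print(results)  # ['answer(task-1)', 'answer(task-2)']
```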

SLIDE 30

HOBBIT Benchmarks

Features

Addresses all steps of the Linked Data lifecycle
Benchmarks derived from industry use cases
Real data underlying the benchmarks
Scalable benchmark sizes
Open-source implementation

SLIDE 31

HOBBIT Platform

Benchmarks

Streaming and static deterministic benchmarks
Realistic benchmarks
Controlled volume and velocity

Generation and Acquisition: conversion of XML into RDF; entity recognition and linking; relation extraction
Analysis and Processing: link discovery; machine learning (supervised and unsupervised)
Storage and Curation: triple stores; versioning (incl. updates)
Visualization and Services: question answering; faceted browsing; usage-based benchmarks

SLIDE 32

HOBBIT Platform

Architecture

[Architecture diagram: the Platform Controller (reached via the Front End) creates the Benchmark Controller, which in turn creates the Data Generators, Task Generators, Evaluation Module and Eval. Storage; data flows from the generators through the Benchmarked System into evaluation; Storage, Analysis and Logging are shared platform services.]

SLIDE 33

HOBBIT Platform

Benchmark Initialization

[Diagram: during initialization, the Platform Controller creates the Benchmark Controller and the Benchmarked System; the Benchmark Controller then creates its Data Generators, Task Generators and the Eval. Storage.]

SLIDE 34

HOBBIT Platform

Benchmark Execution

[Diagram: the Data Generators stream generated data, e.g. `ex:Entity rdf:type ex:Class`, into the Benchmarked System and to the Task Generators.]

SLIDE 35

HOBBIT Platform

Benchmark Execution

[Diagram: the Task Generators send tasks such as `SELECT ?v WHERE { ?v a ex:Class }` to the Benchmarked System, and the expected answer, e.g. `v = ex:Entity`, to the Eval. Storage.]
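The exchange can be mimicked in a few lines (a toy in-memory store for illustration, not the platform's actual storage): the task is the slide's SPARQL query, and the expected answer is the entity typed with `ex:Class`.

```python
# Toy in-memory triple store holding (subject, predicate, object) triples.
triples = {
    ("ex:Entity", "rdf:type", "ex:Class"),
    ("ex:Other",  "rdf:type", "ex:Thing"),
}

def select_v_where_a(cls):
    """Evaluates SELECT ?v WHERE { ?v a <cls> } — in SPARQL, 'a' abbreviates rdf:type."""
    return sorted(s for (s, p, o) in triples if p == "rdf:type" and o == cls)

print(select_v_where_a("ex:Class"))  # ['ex:Entity']
```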

SLIDE 36

HOBBIT Platform

Benchmark Execution

[Diagram: the Benchmarked System returns its answers, e.g. `v = ex:Entity`, to the Eval. Storage, where they are stored alongside the expected results.]

SLIDE 37

HOBBIT Platform

Benchmark Evaluation

[Diagram: after the run, the Benchmark Controller creates the Evaluation Module, which compares the expected and received results held in the Eval. Storage and writes the KPIs (precision=…, recall=…, F1-score=…) together with the benchmark parameters to the Storage.]
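The KPIs on the slide can be computed from the stored pairs of expected and received answers. A sketch of micro-averaged precision, recall and F1 follows; the `(expected, received)` set-pair format is an assumption for illustration, not the Evaluation Module's actual interface.

```python
def micro_prf(pairs):
    """Micro-averaged precision/recall/F1 over (expected, received) answer-set pairs."""
    tp = fp = fn = 0
    for expected, received in pairs:
        tp += len(expected & received)   # correct answers
        fp += len(received - expected)   # spurious answers
        fn += len(expected - received)   # missed answers
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Two stored task results: (expected answers, received answers)
p, r, f = micro_prf([
    ({"ex:Entity"}, {"ex:Entity", "ex:Other"}),
    ({"ex:A", "ex:B"}, {"ex:A"}),
])
print(f"precision={p:.3f} recall={r:.3f} F1={f:.3f}")  # precision=0.667 recall=0.667 F1=0.667
```

Micro-averaging pools the counts over all tasks before dividing, so large tasks weigh more than small ones; this is one of the choices a normalized KPI implementation has to fix.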

SLIDE 38

Datasets

TWIG

Goal: Simulate the real Twitter Firehose
Relies on 476 million tweets as training data
Mimicking algorithm based on:
Distribution of character frequencies
Distribution of transportation frequency
Network topology
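As a toy illustration of mimicking text from learned character distributions (this is not the actual TWIG algorithm, which models much more than bigrams): learn a character-bigram distribution from sample text, then sample synthetic text from it.

```python
import random
from collections import Counter, defaultdict

def learn_bigrams(corpus):
    """Character-bigram model: counts of next-char given current-char."""
    model = defaultdict(Counter)
    for text in corpus:
        for a, b in zip(text, text[1:]):
            model[a][b] += 1
    return model

def mimic(model, start, length, seed=0):
    """Sample up to `length` characters, following the learned distribution."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = model.get(out[-1])
        if not followers:          # dead end: no observed successor
            break
        chars, weights = zip(*followers.items())
        out.append(rng.choices(chars, weights=weights)[0])
    return "".join(out)

model = learn_bigrams(["hello world", "hello there"])
print(mimic(model, "h", 10))
```

Real mimicking generators fit many such distributions at once (here: character frequencies, posting frequency, network topology) so that the synthetic stream is statistically close to the original while being scalable to arbitrary volume.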

SLIDE 39

Datasets

LinkedConnections

Goal: Simulate a real transport network
Real transportation data from Belgium for training
Mimicking algorithm based on:
Observed correlation between population density and transportation
Distribution of transportation frequency
Network topology

SLIDE 40

Datasets

Printing Machinery

Goal: Simulate events from printing machinery
Mimicking algorithm using event correlations and distributions

[Figure: event timeline from May 01 00:00 to May 02 00:00 with event types Changing plate, Double sheet, Early sheet, Finish job, Misaligned sheet, Missing sheet, Operation partially completed, Performance, Printing interval, Production good sheet, Side guide warning, Start job, Washing blanket, Washing impression cylinder, Washing ink rollers, Washing ink fountain roller, Washing plates.]

SLIDE 41

Datasets

Weidmüller

Goal: Simulate events from injection molding machinery
Mimicking algorithm using event correlations and distributions

SLIDE 42

Datasets

Semantic Publishing

Goal: Simulate data from the BBC
Generator based on a manually configurable set of correlations

SLIDE 43

Join HOBBIT

Join the HOBBIT community
Provide KPIs
Provide datasets
Join the platform development
Follow us on Twitter: https://twitter.com/hobbit_project

SLIDE 44

HOBBIT Benchmarks

Streaming and static deterministic benchmarks
Realistic benchmarks
Controlled volume and velocity

Generation and Acquisition: conversion of XML into RDF; entity recognition and linking; relation extraction
Analysis and Processing: link discovery; machine learning (supervised and unsupervised)
Storage and Curation: triple stores; versioning (incl. updates)
Visualization and Services: question answering; faceted browsing; usage-based benchmarks

SLIDE 45

HOBBIT Run

Runtimes

SLIDE 46

HOBBIT Run

Effectiveness

System     Precision  Recall  F1-measure
FOX            0.515   0.310       0.351
Balie          0.369   0.230       0.249
Illinois       0.500   0.288       0.327
OpenNLP        0.442   0.241       0.285
Stanford       0.486   0.303       0.335

SLIDE 47

HOBBIT Run

Example results

A2KB, weak annotation match, Micro F1-measure

System             AIDA/CoNLL-Compl.   IITB  KORE50  MSNBC  Microp.2014-Train  N3-Reuters-128
AIDA                           0.668  0.141   0.625  0.622              0.363           0.391
Babelfy                        0.448  0.129   0.564  0.423              0.311           0.289
DBpedia Spotlight              0.545  0.262   0.341  0.457              0.448           0.320
FOX                            0.512  0.100   0.268  0.127              0.309           0.518
FREME NER                      0.358  0.074   0.160  0.208              0.254           0.263
WAT                            0.673  0.137   0.543  0.631              0.403           0.480
xLisa                          0.363  0.233   0.352  0.365              0.322           0.274

SLIDE 48

Thank You

http://project-hobbit.eu/get-involved/
http://goo.gl/forms/1iRIoG4Xpb
https://twitter.com/hobbit_project

SLIDE 49

Acknowledgment

This work was supported by grants from the EU H2020 Framework Programme provided for the project HOBBIT (GA no. 688227).
