Optimised Framework based on Rough Set Theory for Big Data

Recent Trends in Knowledge Compilation – Dagstuhl Seminar 17381
Marie Sklodowska-Curie Actions - Individual Fellowship
Optimised Framework based on Rough Set Theory for Big Data Pre-processing in Certain and Imprecise Contexts
Presented by Dr. Zaineb Chelly Dagdia

Outline
1 Introduction
2 Background
3 RoSTBiDFramework
4 Conclusion

The MSC project
Proposal title: “Optimised Framework based on Rough Set Theory for Big Data Pre-processing in Certain and Imprecise Contexts”
✎ Duration in months: 24
✎ Panel: Information Science and Engineering (ENG)
✎ Descriptor: Machine learning, statistical data processing and applications
✔ Project started on the 1st of March 2017

Partner organisations
✓ Host: Aberystwyth University, UK
✓ Partner organisations:
➺ University of Birmingham, UK
➺ University of Paris 13, France
➺ University of Granada, Spain
➺ *Non-academic partner, France

Outline
1 Introduction
2 Background
   Big Data
   Rough Set Theory
3 RoSTBiDFramework
4 Conclusion

Specification
“Datasets which could not be captured, managed, and processed by general computers within an acceptable scope.” [Apache Hadoop, 2010]
➜ Having bigger data requires different approaches:
✔ Techniques;
✔ Tools;
✔ Architecture;

Distributed processing: Apache Spark
✍ Apache Spark is a cluster computing engine designed for fast computation.
✍ It builds on and extends the MapReduce model.
✍ It performs in-memory cluster computing, which increases the processing speed of an application.
✍ It is based on Resilient Distributed Datasets (RDDs), which support in-memory computation.

MapReduce
☞ MapReduce divides the workload into multiple independent tasks and schedules them across cluster nodes.
Data are distributed to all the nodes of the cluster as they are loaded.
Data are split into chunks, which are managed by different nodes in the cluster.
➠ Even though the file chunks are distributed across several machines, they form a single namespace.
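
The split, map, and reduce stages described on this slide can be illustrated with a minimal single-machine sketch. This is not code from the presentation: the word-count task, the function names (`split_into_chunks`, `map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the use of plain Python lists in place of cluster nodes are all assumptions chosen for illustration.

```python
from collections import defaultdict


def split_into_chunks(records, n_chunks):
    """Split the input into roughly equal chunks, one per simulated node."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division, at least 1
    return [records[i:i + size] for i in range(0, len(records), size)]


def map_phase(chunk):
    """Map task: emit a (word, 1) pair for every word in one chunk."""
    return [(word, 1) for line in chunk for word in line.split()]


def shuffle(mapped_chunks):
    """Group the values emitted by all map tasks by key."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce task: combine the values of each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines, n_chunks=2):
    chunks = split_into_chunks(lines, n_chunks)
    # Each map task only sees its own chunk; on a real cluster these
    # calls would run independently on different nodes.
    mapped = [map_phase(chunk) for chunk in chunks]
    return reduce_phase(shuffle(mapped))


counts = word_count(["big data", "rough set", "big rough data"])
```

Because each map task depends only on its own chunk, the tasks can be scheduled on different nodes without coordination; only the shuffle step needs to see the output of all of them.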

Rough Set Theory: Basic Concepts
The indiscernibility relation
The lower approximation
The upper approximation
The boundary region
The positive region
The dependency of attributes
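
The slide lists these concepts without their definitions. For reference, the usual textbook formulations are given here; the notation is an assumption rather than the presenter's own: $U$ is the universe of objects, $P$ a set of condition attributes, $Q$ a set of decision attributes, $X \subseteq U$ a target set, and $[x]_P$ the equivalence class of $x$ under $\mathrm{IND}(P)$.

```latex
\begin{align*}
\mathrm{IND}(P) &= \{(x, y) \in U \times U : a(x) = a(y) \text{ for all } a \in P\} \\
\underline{P}X &= \{x \in U : [x]_P \subseteq X\} \\
\overline{P}X &= \{x \in U : [x]_P \cap X \neq \emptyset\} \\
\mathrm{BND}_P(X) &= \overline{P}X \setminus \underline{P}X \\
\mathrm{POS}_P(Q) &= \bigcup_{X \in U/\mathrm{IND}(Q)} \underline{P}X \\
\gamma_P(Q) &= \frac{|\mathrm{POS}_P(Q)|}{|U|}
\end{align*}
```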

RST for Feature Selection
1 Calculate the IND of the classes;
2 Generate all the possible combinations of features;
3 For each combination:
   Calculate the IND;
   Calculate the lower approximation;
   Calculate the positive region;
   Calculate the dependency;
4 Select the reduct(s) where:
   The feature set is composed of minimal features;
   The DEP of the feature set equals the DEP of the data set (all the features);
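
The four steps above can be sketched as a small exhaustive search. This is an illustrative sketch, not the presenter's implementation: the function names (`partition`, `dependency`, `reducts`), the dictionary-based decision table, and the toy data are all assumptions, and "minimal" is interpreted here as smallest cardinality.

```python
from collections import defaultdict
from itertools import combinations


def partition(rows, attrs):
    """Equivalence classes of IND(attrs): row indices grouped by their values on attrs."""
    blocks = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].add(i)
    return list(blocks.values())


def dependency(rows, cond_attrs, dec_attr):
    """DEP = |positive region| / |U| for cond_attrs with respect to dec_attr."""
    decision_classes = partition(rows, [dec_attr])  # step 1: IND of the classes
    positive = set()
    for block in partition(rows, cond_attrs):
        # A block lies in the lower approximation of a decision class
        # when it is entirely contained in that class.
        if any(block <= d for d in decision_classes):
            positive |= block
    return len(positive) / len(rows)


def reducts(rows, cond_attrs, dec_attr):
    """Smallest feature subsets whose DEP equals the DEP of all features."""
    full = dependency(rows, cond_attrs, dec_attr)
    for size in range(1, len(cond_attrs) + 1):  # steps 2 and 3, smallest subsets first
        found = [subset for subset in combinations(cond_attrs, size)
                 if dependency(rows, subset, dec_attr) == full]
        if found:  # step 4: stop at the first size that preserves DEP
            return found
    return []


# Hypothetical toy decision table: the decision d is fully determined by b.
table = [
    {"a": 0, "b": 0, "c": 1, "d": "no"},
    {"a": 0, "b": 1, "c": 1, "d": "yes"},
    {"a": 1, "b": 0, "c": 0, "d": "no"},
    {"a": 1, "b": 1, "c": 0, "d": "yes"},
]
result = reducts(table, ["a", "b", "c"], "d")
```

The search enumerates up to 2^n − 1 subsets for n features, so it is only practical for small feature sets.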

Outline
1 Introduction
2 Background
3 RoSTBiDFramework
   Challenges
   Research methodology
   Proposed solution
4 Conclusion

Current state
It has become difficult to quickly acquire the most useful information from the huge amount of data at hand.
➽ It is necessary to perform data (pre-)processing as a first step!

State-of-the-art
Sequential and MapReduce-based dimensionality reduction techniques:
Require the user to set parameters;
Are not able to deal with the veracity aspect;
Are not able to meet the computational requirements of the data;
