CASD-TeraLab
Secure Remote Access to Confidential Big Data
Alexandre Marty [ alexandre.marty@casd.eu ]
Outline
CASD-TeraLab
Use Cases
Live Demo
The Secure Data Access Centre
Insertions Extractions
Data Insertions/extractions are
Internet access from their workspace.
Hermetic Bubble
A group of tightly-sealed secured servers
Hadoop cluster is available for handling Big Data.
SD-Boxes are the only means of access to the Bubble.
Access occurs via the Internet by encrypted channels.
User applications and processing are executed strictly within the Bubble.
Sensitive data is hosted
Servers & Applications Sensitive Data
Publicly funded Big Data & Data Science platform
Open to:
R&D and teaching projects, proofs of concept
Public and private sectors
Everything for Big Data:
Powerful and scalable infrastructure
Hadoop-based with all Hadoop tools
Extensive tools for scientists (R, SAS, machine learning…)
Turnkey solution with full support and maintenance
Electricity transmission network data with RTE
Impressive variety of data sources
Development of innovative apps
Health data
Requires high confidentiality
About 250 TB generated each year
Mobile telecommunications data for tourism statistics
European data
Involvement in European projects: DwB, Eurostat Big Data Task Force
Work in collaboration with the Consumer Price Index team
One goal is to improve the CPI calculation
Find new opportunities to use the data and develop new methodologies
Daily sales data from 4 major French retail distribution companies
Very detailed data: products, stores…
5.7 billion rows, 1 TB
Randomly generated dataset used for this demonstration
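To make the idea of a randomly generated sales dataset concrete, here is a minimal Python sketch. It is not the dataset or the method used in the demonstration: the column layout (day, store, product, unit price, quantity), the price model, and the function names are all hypothetical. It computes a Jevons index (geometric mean of price ratios), a standard elementary price index formula, chosen here only as an illustration of a price index built from sales records.

```python
import math
import random
from collections import defaultdict


def generate_sales(n_products=5, n_stores=3, n_days=60, seed=0):
    """Return synthetic rows: (day, store_id, product_id, unit_price, quantity).

    Hypothetical layout, not the schema of the real sales data.
    """
    rng = random.Random(seed)
    base_price = {p: rng.uniform(1.0, 20.0) for p in range(n_products)}
    rows = []
    for day in range(n_days):
        for store in range(n_stores):
            for product in range(n_products):
                # Slowly rising price with a small random fluctuation.
                price = base_price[product] * (1 + 0.001 * day) * rng.uniform(0.95, 1.05)
                rows.append((day, store, product, round(price, 2), rng.randint(1, 50)))
    return rows


def mean_unit_values(rows, first_day, last_day):
    """Quantity-weighted average price per product for first_day <= day < last_day."""
    revenue = defaultdict(float)
    quantity = defaultdict(int)
    for day, _store, product, price, qty in rows:
        if first_day <= day < last_day:
            revenue[product] += price * qty
            quantity[product] += qty
    return {p: revenue[p] / quantity[p] for p in revenue}


def jevons_index(rows, base_period, current_period):
    """Geometric mean of per-product price ratios, scaled so the base period is 100."""
    base = mean_unit_values(rows, *base_period)
    current = mean_unit_values(rows, *current_period)
    common = sorted(set(base) & set(current))
    log_sum = sum(math.log(current[p] / base[p]) for p in common)
    return 100.0 * math.exp(log_sum / len(common))


rows = generate_sales()
index = jevons_index(rows, base_period=(0, 30), current_period=(30, 60))
print(f"{len(rows)} rows, Jevons index = {index:.2f}")
```

At the scale quoted above (billions of rows), the same per-product aggregation would be run as a distributed job on the Hadoop cluster, not in a single Python process.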
www.teralab-datascience.fr
casd.eu
alexandre.marty@casd.eu