Counting Types for Massive JSON Datasets
BDA 2017, Nancy
(presented at DBPL 2017)
Mohamed-Amine Baazizi, Dario Colazzo, Giorgio Ghelli, Carlo Sartiani
Counting types
- Can types count? Type theory perspective
- Should they?
- How to
- How to capture correlation information?
- { addr:T^300; aff:T^300; r:T^800 }^800
- { addr:T^300; aff:T^300; r:T^300 }^300 + { r:T^500 }^500
- { addr:T^300; r:T^300 }^300 + { aff:T^300; r:T^500 }^500
- { addr:T^300; r:T^500 }^500 + { aff:T^300; r:T^300 }^300
- { addr:T^300; r:T^300 }^300 + { aff:T^300; r:T^300 }^300 + { r:T^200 }^200
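To make the difference between these alternatives concrete, the following sketch counts, for a synthetic collection, how often each field occurs and how often each exact combination of fields occurs. It is illustrative only: the dataset, the helper names `count_fields` and `count_by_field_set`, and the idea of grouping records by their field set are assumptions made here, not the inference algorithm of the talk.

```python
from collections import Counter

def count_fields(records):
    """Count how many records carry each field, ignoring co-occurrence."""
    return Counter(name for rec in records for name in rec)

def count_by_field_set(records):
    """Count how many records carry each exact set of field names."""
    return Counter(frozenset(rec) for rec in records)

# Hypothetical dataset: 300 records with addr and r,
# 300 records with aff and r, and 200 records with r only.
records = (
    [{"addr": "a", "r": 1}] * 300
    + [{"aff": "b", "r": 1}] * 300
    + [{"r": 1}] * 200
)

per_field = count_fields(records)        # what a single record type can say
per_shape = count_by_field_set(records)  # what a sum of record types can say

print(dict(per_field))
for shape, n in per_shape.items():
    print(sorted(shape), n)
```

The per-field counts alone (addr 300, aff 300, r 800) cannot tell whether addr and aff ever occur together; the per-shape counts show that, in this dataset, they never do.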
Concision Precision
- Num^2 captures any multiset of two numbers
- [Num^4]^3 is a possible type for the multiset { [1], [1], [1,2] }M
- [1], [1], [1,2] :M [Num^4]^3
- [1], [1], [1,2] :M [Num^2]^2 + [Num^2]^1
- [1], [1], [1,2] :M [Num^1]^1 + [Num^1]^1 + [Num^2]^1
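The three typings above can be checked mechanically under one assumed reading of the notation: an array type with inner count n and outer count m describes m arrays that hold n numbers in total. The sketch below encodes that reading; the function names and the representation of a sum as an explicit partition of the multiset are hypothetical choices made for illustration, not definitions taken from the talk.

```python
def has_array_type(arrays, total_numbers, array_count):
    """Check a list of arrays against [Num^total_numbers]^array_count."""
    all_numbers = all(
        isinstance(x, (int, float)) and not isinstance(x, bool)
        for arr in arrays
        for x in arr
    )
    return (
        all_numbers
        and len(arrays) == array_count
        and sum(len(arr) for arr in arrays) == total_numbers
    )

def has_sum_type(parts):
    """Check an explicit partition of the multiset against a sum of array types.

    `parts` pairs each sub-multiset with the (total_numbers, array_count)
    of the summand that is supposed to describe it.
    """
    return all(has_array_type(arrs, n, m) for arrs, (n, m) in parts)

data = [[1], [1], [1, 2]]

concise = has_array_type(data, 4, 3)
middle = has_sum_type([([[1], [1]], (2, 2)), ([[1, 2]], (2, 1))])
precise = has_sum_type(
    [([[1]], (1, 1)), ([[1]], (1, 1)), ([[1, 2]], (2, 1))]
)
wrong = has_array_type(data, 2, 3)  # only 2 numbers claimed, 4 present

print(concise, middle, precise, wrong)
```

The partition is supplied by hand here; a full checker would have to search for one, which this sketch deliberately leaves out.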
Reduce
{ addr:T^300; aff:T^300; r:T^800 }^800
{ addr:T^300; r:T^300 }^300 + { aff:T^300; r:T^300 }^300 + { r:T^200 }^200
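The counts in the two types above are consistent with a reduction that fuses the summands of the precise type by adding counts field by field. The following sketch shows that arithmetic only. It assumes the field type T is the same everywhere, so it is omitted, and the `fuse` helper and its tuple representation are hypothetical rather than the reduction operator defined by the authors.

```python
from collections import Counter

def fuse(summands):
    """Fuse counted record types into one record type by adding counts.

    Each summand is (field_counts, record_count); for example
    ({"r": 200}, 200) stands for a record type with field r counted
    200 times over 200 records.
    """
    fields = Counter()
    total = 0
    for field_counts, record_count in summands:
        fields.update(field_counts)  # adds counts for fields already seen
        total += record_count
    return dict(fields), total

# The precise sum type: three record shapes with their counts.
precise_sum = [
    ({"addr": 300, "r": 300}, 300),
    ({"aff": 300, "r": 300}, 300),
    ({"r": 200}, 200),
]

fused_fields, fused_total = fuse(precise_sum)
print(fused_fields, fused_total)
```

Fusing is lossy in one direction: the single record type can be computed from the sum, but the sum cannot be recovered from the single record type.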
Kind reduction
[Klettke et al. 2016] Schema Extraction and Structural Outlier Detection for JSON-based NoSQL Data Stores. Technologie und Web (BTW).
[Schmidt 2017] mongodb-schema. https://github.com/mongodb-js/mongodb-schema