Benchmarking Database Ingestion Ability with Real-Time Big Astronomical Data
Qing Tang
Qing Tang,Chen Yang, Xiaofeng Meng, Zhihui Du
RUC 15/11/2019
Bench'19
Outline
- Background
- Benchmark Methodology
- Experiments and Results Analysis
- Conclusion
Real-time discovery of transients
Gamma-ray bursts, supernovae, evolution of Sun-class stars
AstroServer
Catalog stream: microlensing, supernovae, gamma-ray bursts
Accelerating scientific discovery
Mining long-term regular patterns
Sky survey field: 5000 square degrees
Sampling frequency: every 15 s
Generated data: 6.8 million, 2.5 TB/day
Service life: 10 years
Total data: 3 PB to 6 PB
a) Quick response
b) Massive data storage
c) Timely data analysis
d) High cost-effectiveness
Benchmark Methodology
The specific method is as follows:
(1) Analyze in depth the workloads that correspond to the characteristics of the data sets, and extract the frequently used basic operation units;
(2) Determine the benchmark test specifications;
(3) Provide workloads based on various software stacks.
Experiments and Results Analysis
Configuration: performance test environment

Node    Hardware                                                       Software
Master  Memory: 96GB; Hard disk: 3.5TB; CPU: E5-2603 v3 @ 1.60GHz      Ubuntu 14.04.5, Redis_3.2.5, HBase_1.2.4, MySQL_5.6.33, Kafka
Slave   Memory: 96GB; Hard disk: 30TB; CPU: E5-2603 v3 @ 1.60GHz       Ubuntu 14.04.5, Redis_3.2.5, HBase_1.2.4, MySQL_5.6.33, Kafka
[Figure: cluster topology with one master node and multiple slave nodes]
Attribute      Type      Attribute          Type
redis_key      string    magcalibe          double
jd_str         string    sigma_base         double
ccdNum         string    sigma_ext          double
zone           string    tag_valid          int
starId         long      magdiff            double
alpha          float     lastCMtempname     string
delta          float     starBelong         string
pixx           double    abSignal           string
pixy           double    abVal              double
mag            double    abQuality          double
mage           double    abRank             double
thetaimage     long      sigma_ext_median   double
flags          float     mag_interval_num   int
ellipticity    float     sigmedthreshold    double
classstar      float     data11             double
background     float     data12             double
fwhm           float     data13             double
vignet         float     data14             double
magnorm        double    data15             double
magcalib       double
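The attribute table can be turned into a typed row parser. The following is only a minimal sketch under stated assumptions: the column order (left column of the table first, then the right column), the mapping of `long`/`int` to Python `int` and `float`/`double` to Python `float`, and the helper name `parse_row` are all hypothetical and do not come from the slides.

```python
# Column name -> declared type, copied from the attribute table.
# ASSUMPTION: the ordering (left column first, then right column) is a guess.
SCHEMA = [
    ("redis_key", "string"), ("jd_str", "string"), ("ccdNum", "string"),
    ("zone", "string"), ("starId", "long"), ("alpha", "float"),
    ("delta", "float"), ("pixx", "double"), ("pixy", "double"),
    ("mag", "double"), ("mage", "double"), ("thetaimage", "long"),
    ("flags", "float"), ("ellipticity", "float"), ("classstar", "float"),
    ("background", "float"), ("fwhm", "float"), ("vignet", "float"),
    ("magnorm", "double"), ("magcalib", "double"),
    ("magcalibe", "double"), ("sigma_base", "double"), ("sigma_ext", "double"),
    ("tag_valid", "int"), ("magdiff", "double"), ("lastCMtempname", "string"),
    ("starBelong", "string"), ("abSignal", "string"), ("abVal", "double"),
    ("abQuality", "double"), ("abRank", "double"),
    ("sigma_ext_median", "double"), ("mag_interval_num", "int"),
    ("sigmedthreshold", "double"), ("data11", "double"), ("data12", "double"),
    ("data13", "double"), ("data14", "double"), ("data15", "double"),
]

# ASSUMPTION: how the declared types map onto Python types.
CASTS = {"string": str, "int": int, "long": int, "float": float, "double": float}


def parse_row(fields):
    """Convert a list of raw string fields into a dict of typed values."""
    if len(fields) != len(SCHEMA):
        raise ValueError(f"expected {len(SCHEMA)} fields, got {len(fields)}")
    return {name: CASTS[kind](raw) for (name, kind), raw in zip(SCHEMA, fields)}


# Usage: a synthetic row in which every field is the string "1".
sample = parse_row(["1"] * len(SCHEMA))
```

The schema holds 39 entries, which agrees with the 39 columns per row stated for the data generator.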
Data generator
1920 files, 2.8TB
- Per run: 1920 files
- Per file: 170,000 rows
- Per row: 39 columns
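The generator figures imply a total row and value count for one pass of the data generator. This is a small worked calculation using only the three numbers stated above; the variable names are illustrative.

```python
FILES_PER_RUN = 1920        # files produced in one pass of the generator
ROWS_PER_FILE = 170_000     # rows in each file
COLUMNS_PER_ROW = 39        # columns in each row

# Total rows and individual column values in one pass.
rows_per_run = FILES_PER_RUN * ROWS_PER_FILE
values_per_run = rows_per_run * COLUMNS_PER_ROW

print(f"{rows_per_run:,} rows, {values_per_run:,} values")
```

One pass therefore covers 326,400,000 rows, or about 12.7 billion column values.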
Database       Persistence time   Compression rate   Input anomaly rate
Redis+HBase    4.8h               40%                2.50%
Redis/HBase    6h                 40%                4.60%
Redis+MySQL    201h               100%               1.00%
Redis/MySQL    202h               100%               1.00%
Kafka+HBase    10.9               100%               2.50%
Database        Average storage time   Compared with 15s   Selection
HBase           340s                   > 15s               No
MySQL-cluster   1700s                  > 15s               No
Oracle          50.7s                  > 15s               No
Redis-cluster   6.4s                   < 15s               Yes
Kafka           20.5s                  > 15s
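The comparison column checks each average storage time against the 15 s sampling period. A minimal sketch of that check follows, assuming a strict less-than comparison; the function name `within_sampling_interval` is hypothetical. It reproduces only the comparison column and does not assign a selection where the table gives none.

```python
SAMPLING_INTERVAL_S = 15.0  # sampling period taken from the survey parameters

# Average storage times in seconds, copied from the table.
avg_storage_time_s = {
    "HBase": 340.0,
    "MySQL-cluster": 1700.0,
    "Oracle": 50.7,
    "Redis-cluster": 6.4,
    "Kafka": 20.5,
}


def within_sampling_interval(storage_time_s, interval_s=SAMPLING_INTERVAL_S):
    """Return True when one batch is stored before the next batch arrives."""
    return storage_time_s < interval_s


within = {name: within_sampling_interval(t) for name, t in avg_storage_time_s.items()}

for name, ok in within.items():
    print(f"{name}: {'< 15s' if ok else '> 15s'}")
```

Only a store whose average storage time is below the sampling period can absorb each batch before the next one arrives, which is why the comparison is made against 15 s.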
Conclusion
[Figure: system components: data generator, cross matcher, cache manager (Redis cluster), data persister (HBase), query engine]
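The cache-then-persist part of this component list can be illustrated with in-memory stand-ins. This is a rough sketch only: the classes `CacheManager` and `DataPersister`, the function `generate_rows`, and the batch size are hypothetical. A `deque` stands in for the Redis cluster and a `dict` for HBase, so no real client API is used. The cross matcher and query engine are left out.

```python
from collections import deque


class CacheManager:
    """In-memory stand-in for the Redis-cluster cache (hypothetical)."""

    def __init__(self):
        self._buffer = deque()

    def put(self, row):
        self._buffer.append(row)

    def drain(self, max_rows):
        """Remove and return up to max_rows of the oldest cached rows."""
        batch = []
        while self._buffer and len(batch) < max_rows:
            batch.append(self._buffer.popleft())
        return batch

    def __len__(self):
        return len(self._buffer)


class DataPersister:
    """In-memory stand-in for the HBase store (hypothetical)."""

    def __init__(self):
        self.store = {}

    def persist(self, batch):
        for row in batch:
            self.store[row["redis_key"]] = row
        return len(batch)


def generate_rows(count):
    """Toy data generator yielding rows with two of the schema's columns."""
    for i in range(count):
        yield {"redis_key": f"star-{i}", "mag": 10.0 + i}


cache = CacheManager()
persister = DataPersister()

# Ingest quickly into the cache, then flush to the persistent store in batches.
for row in generate_rows(5):
    cache.put(row)
while len(cache) > 0:
    persister.persist(cache.drain(max_rows=2))
```

Writing to the cache first separates the fast ingestion path from the slower persistence path, so the persister can flush in batches at its own pace.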
AstroServer
Catalog stream: microlensing, supernovae, gamma-ray bursts
http://idke.ruc.edu.cn
Email: tangqing@ruc.edu.cn