The Distributed Database Based on Kudu
Shunda Lin
Outline
Motivation
Introduction of Kudu
Deployment and Configuration
Query Test
Conclusion
monitoring systems
when data in the past has been filed
scanning and random access
slave2 (192.168.0.134) slave3 (192.168.0.100)
Sqoop: a command-line interface application for transferring data between relational databases and Hadoop
an open-source cluster-computing framework
sqoop import --connect jdbc:mysql://202.120.36.137:6033/mag-new-160205 --username=data --password=data --table AuthorFieldCount -m 1 --target-dir /user/hadoop/AuthorFieldCount --as-parquetfile
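The import above can also be assembled programmatically, which helps when several tables need the same treatment. The following is only a sketch: the helper name `build_sqoop_import` is hypothetical, the flags are exactly those on the slide, and the code only builds and prints the command rather than running Sqoop.

```python
import shlex


def build_sqoop_import(jdbc_url, username, password, table, target_dir, mappers=1):
    """Return the argv list for a `sqoop import` that writes Parquet files.

    Hypothetical helper: it only mirrors the flags shown on the slide.
    """
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password", password,
        "--table", table,
        "-m", str(mappers),          # number of parallel map tasks
        "--target-dir", target_dir,  # HDFS directory for the imported files
        "--as-parquetfile",
    ]


argv = build_sqoop_import(
    jdbc_url="jdbc:mysql://202.120.36.137:6033/mag-new-160205",
    username="data",
    password="data",
    table="AuthorFieldCount",
    target_dir="/user/hadoop/AuthorFieldCount",
)
command = shlex.join(argv)  # shell-safe string form of the argv list
print(command)
```

Passing `argv` to `subprocess.run` on a host with Sqoop installed would execute the import; changing `table` and `target_dir` reuses the same settings for other tables.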
select FOSID as Source,
       FOSReferencesCount.FOSReference as Target,
       Similarity/10000000 as Weight
from (select FOSReference from `FOSReferencesCount` where `FOSID` = '0271BC14' limit 1000) e1,
     (select FOSReference from `FOSReferencesCount` where `FOSID` = '0271BC14' limit 1000) e2,
     FOSReferencesCount
where e1.`FOSReference` = `FOSReferencesCount`.FOSID
  and e2.`FOSReference` = `FOSReferencesCount`.FOSReference;
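To see what this query returns, it helps to run the same join on a toy table. The sketch below uses Python's built-in `sqlite3` as a stand-in for MySQL or Kudu. The rows, the `ROOT` identifier, and the helper `neighbour_edges` are made up for illustration. The divisor is written as `10000000.0` because SQLite would otherwise perform integer division.

```python
import sqlite3

# In-memory stand-in for the FOSReferencesCount table (illustrative rows only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FOSReferencesCount (FOSID TEXT, FOSReference TEXT, Similarity INTEGER)"
)
rows = [
    ("ROOT", "A", 50000000),  # A is a neighbour of ROOT
    ("ROOT", "B", 30000000),  # B is a neighbour of ROOT
    ("A", "B", 20000000),     # edge between two neighbours: kept
    ("A", "C", 90000000),     # C is not a neighbour of ROOT: dropped
    ("B", "A", 10000000),     # edge between two neighbours: kept
]
conn.executemany("INSERT INTO FOSReferencesCount VALUES (?, ?, ?)", rows)

QUERY = """
SELECT f.FOSID AS Source, f.FOSReference AS Target,
       f.Similarity / 10000000.0 AS Weight
FROM (SELECT FOSReference FROM FOSReferencesCount WHERE FOSID = ? LIMIT 1000) e1,
     (SELECT FOSReference FROM FOSReferencesCount WHERE FOSID = ? LIMIT 1000) e2,
     FOSReferencesCount f
WHERE e1.FOSReference = f.FOSID
  AND e2.FOSReference = f.FOSReference
"""


def neighbour_edges(connection, fos_id):
    """Return the edges whose source and target are both neighbours of fos_id."""
    return sorted(connection.execute(QUERY, (fos_id, fos_id)).fetchall())


edges = neighbour_edges(conn, "ROOT")
print(edges)
```

The two subqueries select the same neighbour set, so the result is the subgraph induced by the neighbours of the chosen field, with `Similarity` scaled down to an edge weight.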
[Figure: MySQL vs. Kudu for Case1, Case2, and Case3; axis label: Query]
FOSID    Computer Science (0271BC14)    Ethnic studies (03D2C4FF)    Data Structure (09ACCB7D)
MySQL    82.4 s                         65.4 s                       55.7 s
Kudu     8.23 s                         9.175 s                      7.821 s
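The speedup implied by these timings is simple arithmetic: the MySQL time divided by the Kudu time for each FOSID. The short sketch below uses only the figures reported above; the variable names are illustrative.

```python
# (MySQL seconds, Kudu seconds) per FOSID, copied from the reported timings.
timings = {
    "Computer Science (0271BC14)": (82.4, 8.23),
    "Ethnic studies (03D2C4FF)": (65.4, 9.175),
    "Data Structure (09ACCB7D)": (55.7, 7.821),
}

# Speedup = MySQL time / Kudu time.
speedups = {name: mysql / kudu for name, (mysql, kudu) in timings.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x faster on Kudu")
```

On these three queries Kudu is roughly 7 to 10 times faster than MySQL.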
[Figure: Query Test, MySQL vs. Kudu; tick labels 1 to 13]