Opensource Column Store Databases: MariaDB ColumnStore vs. ClickHouse
Alexander Rubin VirtualHealth
Opensource Column Store Databases: MariaDB ColumnStore vs. - - PowerPoint PPT Presentation
Opensource Column Store Databases: MariaDB ColumnStore vs. ClickHouse Alexander Rubin VirtualHealth About me Working with MySQL for 10-15 years Started at MySQL AB 2006 - Sun Microsystems, Oracle (MySQL Consulting) - Percona since
Alexander Rubin VirtualHealth
2
3
4
MySQL = { } }
mysql> SELECT
+-----------+-------------+------------+--------------+--------+-----------+ | client_id | first_visit | last_visit | days_visited | visits | locations | +-----------+-------------+------------+--------------+--------+-----------+ | ......... | 2017-08-07 | 2019-05-24 | .. | ... | .. |
5
6
7
select_type: SIMPLE table: cv partitions: NULL type: index possible_keys: FK_client_visit key: FK_client_visit key_len: 5 ref: NULL rows: 10483873 filtered: 100.00 Extra: Using temporary; Using filesort 1 row in set, 1 warning (0.00 sec) PRIMARY KEY (`id`), KEY `FK_client_visit_author_id` (`client_id`)
8
mysql> alter table client_visit add key comb(client_id, date, service_location_name); Query OK, 0 rows affected (38.48 sec) Records: 0 Duplicates: 0 Warnings: 0 table: cv partitions: NULL type: index possible_keys: FK_client_id,comb key: comb key_len: 776 ref: NULL rows: 10483873 filtered: 100.00 Extra: Using index; Using temporary; Using filesort
9
mysql> show table status like 'client_visit'\G *************************** 1. row *************************** Name: client_visit Engine: InnoDB Version: 10 Row_format: Dynamic Rows: 10483873 Avg_row_length: 233 Data_length: 2 452 602 880 Index_length: 1 644 773 376
24 columns, including `notes` text `description` text etc...
10
11
12
SELECT cv.client_id as client_id, min(date) as first_visit, max(date) as last_visit, count(distinct date) as days_visited, count(distinct cv.id) as visits, count(distinct cp.cpt_code) as procedures, count(distinct cv.service_location_name) as locations, sum(billed_amount) as total_billed, max(billed_amount) as max_price, avg(billed_amount) as avg_price FROM client_visit cv join client_procedure cp on cp.encounter_id = cv.encounter_id join client_procedure_claim cpc on cp.id = cpc.client_procedure_id join client_claim cc on cc.id = cpc.client_claim_id GROUP BY client_id ORDER BY total_billed desc LIMIT 10 Highly normalized schema
13
+-----------+-------------+------------+--------------+--------+------------+-----------+--------------+-----------+-------------+ | client_id | first_visit | last_visit | days_visited | visits | procedures | locations | total_billed | max_price | avg_price | +-----------+-------------+------------+--------------+--------+------------+-----------+--------------+-----------+-------------+ | ....... | 2018-02-14 | 2019-04-22 | 64 | 64 | .. | .. | 200K | 11K | 449.34 | ...
14
15
https://clickhouse.yandex/docs/en/
16
Wednesday 5:10 PM - 5:35 PM@ Texas
17
Wednesday 5:40 PM - 6:05 PM
Thursday 11:55 AM - 12:45 PM
18
19
AWS ec2 instance, c5d.4xlarge
20
AWS ec2 instance, c5d.4xlarge
21
Compression
22
23
24
https://github.com/Altinity/clickhouse-mysql-data-reader
ENGINE = MergeTree() PARTITION BY toYYYYMM(created_date) ORDER BY client_id
25
INSERT INTO client_procedure_claim SELECT * FROM mysql('host', 'db', 'client_procedure_claim', 'export', 'xxxxx') Ok. 0 rows in set. Elapsed: 17.821 sec. Processed 37.40 million rows, 299.18 MB (2.10 million rows/s., 16.79 MB/s.)
26
https://github.com/sysown/proxysql/wiki/ClickHouse-Support
27
https://github.com/sysown/proxysql/wiki/ClickHouse-Support
28
29
30
31
...
...
32
33
34
35
36
37
38
39
40
https://clickhouse.yandex/docs/en/query_language/alter/#alter-mutations
:) ALTER TABLE client UPDATE is_active = 0 WHERE user_id = 3216031 Ok. 0 rows in set. Elapsed: 0.006 sec.
41
42
ERROR 1815 (HY000): Internal error: IDB-2015: Sorting length
variable max_length_for_sor t_data needs to be set higher. Index scan
43
44
45
46
47
48
49
50