Klustron 1.3 Performance Comparison Test Report

Version: v1.3.1

Cluster Topology and Configuration:

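| Cluster Topology | Compute Nodes | Storage Nodes | Management Nodes | haproxy | sysbench | benchmarksql |
|---|---|---|---|---|---|---|
| 192.168.0.20 | | | | | | |
| 192.168.0.21 | | | | | | |
| 192.168.0.22 | | | | | | |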

Cluster Description:

Compute Nodes: Each of the 3 machines runs one compute node.

Storage Nodes: There are 3 shards, each with a single master; the three masters are distributed across the three machines.

Management Nodes: The three machines form a 3-node management cluster with one primary and two backups.

Machine Configuration: CentOS 8.5, 32 cores, 128GB RAM, 1.9TB NVMe SSD, 10Gbps NIC.

Load Balancer: HAProxy 2.5.0

Sysbench: 1.0.20

BenchmarkSQL: 5.0

Preparation Before Benchmarking:

Create a 3-shard, 3-compute node cluster.

Modify system variables on compute nodes before benchmarking:

alter system set statement_timeout=6000000;
alter system set mysql_read_timeout=1200;
alter system set mysql_write_timeout=1200;
alter system set lock_timeout=1200000;
alter system set log_min_duration_statement=1200000;
alter system set effective_cache_size = '8GB';
alter system set work_mem  = '128MB';
alter system set wal_buffers='64MB';
alter system set autovacuum=false;

Note: Restart each node for changes to take effect.
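The timeout values above are easier to sanity-check in minutes. The following is a minimal sketch, assuming the compute nodes follow standard PostgreSQL behaviour, where statement_timeout, lock_timeout and log_min_duration_statement are read as milliseconds when no unit is given. The Klustron-specific mysql_read_timeout and mysql_write_timeout are left out because their unit is not stated here; the helper name ms_to_minutes is purely illustrative.

```python
# Millisecond-valued settings copied from the ALTER SYSTEM list above.
# Assumption: values without a unit are milliseconds, as in PostgreSQL.
TIMEOUTS_MS = {
    "statement_timeout": 6_000_000,
    "lock_timeout": 1_200_000,
    "log_min_duration_statement": 1_200_000,
}


def ms_to_minutes(ms: int) -> float:
    """Convert a millisecond setting to minutes."""
    return ms / 1000 / 60


for name, ms in TIMEOUTS_MS.items():
    print(f"{name}: {ms} ms = {ms_to_minutes(ms):g} min")
```

Under that assumption, statements may run for up to 100 minutes, and lock waits and slow-statement logging only trigger after 20 minutes, so neither should interfere with a 5 or 10 minute benchmark run.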

Modify system variables on storage nodes before benchmarking:

mysql -h xxx -P xxx -upgx -ppgx_pwd  # Login to each shard master to modify
set global innodb_buffer_pool_size=32*1024*1024*1024;
set global lock_wait_timeout=1200;
set global innodb_lock_wait_timeout=1200;
set global fullsync_timeout=1200000;
set global enable_fullsync=false;
set global innodb_flush_log_at_trx_commit=2;
set global sync_binlog=0;
set global max_binlog_size=1*1024*1024*1024;
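The size expressions in the storage-node settings are written as products of 1024. As a quick check, the sketch below evaluates them and relates the buffer pool to the machine's memory. It assumes the "128GB RAM" in the machine configuration means 128 GiB; the variable names are illustrative only.

```python
GIB = 1024 ** 3

# Expressions copied from the storage-node settings above.
buffer_pool_bytes = 32 * 1024 * 1024 * 1024  # innodb_buffer_pool_size
max_binlog_bytes = 1 * 1024 * 1024 * 1024    # max_binlog_size

# Assumption: "128GB RAM" is interpreted as 128 GiB.
ram_bytes = 128 * GIB

buffer_pool_share = buffer_pool_bytes / ram_bytes

print(f"innodb_buffer_pool_size = {buffer_pool_bytes} bytes ({buffer_pool_bytes // GIB} GiB)")
print(f"max_binlog_size         = {max_binlog_bytes} bytes ({max_binlog_bytes // GIB} GiB)")
print(f"buffer pool share of RAM = {buffer_pool_share:.0%}")
```

With these figures each shard master's buffer pool takes a quarter of the machine's memory, which leaves room for the compute node and management node that share the same host.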

Disable failover for each shard via XPanel: Cluster Management -> Failover Settings.

Remove replicas from each shard.

Sysbench

oltp_point_select

| Benchmark Duration | 5min | 5min | 5min | 5min |
|---|---|---|---|---|
| Concurrency | 100 | 300 | 600 | 900 |
| 95% Latency (ms) | 0.81 | 2.86 | 64.47 | 70.55 |
| TPS | 113007.37 | 95306.52 | 73943.31 | 66162.5 |
| QPS | 113007.37 | 95306.52 | 73943.31 | 66162.5 |
| CPU (32vC) | 20:29% 21:27% 22:27% | 20:28% 21:26% 22:27% | 20:27% 21:26% 22:26% | 20:27% 21:25% 22:26% |
| Memory (128GB) | 20:33% 21:33% 22:33% | 20:33% 21:33% 22:33% | 20:33% 21:33% 22:33% | 20:34% 21:34% 22:34% |
| IO Usage | 20:7% 21:7% 22:7% | 20:7% 21:5% 22:4% | 20:5% 21:3% 22:3% | 20:6% 21:7% 22:4% |
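One way to read the oltp_point_select figures is through Little's Law: in a closed-loop test, the number of requests in flight equals throughput times mean latency. The sketch below applies it to the concurrency and TPS values above, assuming every sysbench thread issues requests back to back with no think time. The result is an implied mean latency, which is a different quantity from the 95% latency in the report; the function name is illustrative.

```python
# (concurrency, TPS) pairs from the oltp_point_select results above.
POINT_SELECT_RUNS = [
    (100, 113007.37),
    (300, 95306.52),
    (600, 73943.31),
    (900, 66162.5),
]


def implied_mean_latency_ms(concurrency: int, tps: float) -> float:
    """Little's Law: mean latency = requests in flight / throughput."""
    return concurrency / tps * 1000.0


for concurrency, tps in POINT_SELECT_RUNS:
    latency = implied_mean_latency_ms(concurrency, tps)
    print(f"{concurrency:>4} threads: {tps:>10.2f} TPS -> ~{latency:.2f} ms mean latency")
```

Because throughput falls as concurrency rises in this test, the implied mean latency grows faster than the thread count, which suggests the extra threads mostly add queueing time.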

oltp_update_non_index

| Benchmark Duration | 5min | 5min | 5min | 5min |
|---|---|---|---|---|
| Concurrency | 100 | 300 | 600 | 900 |
| 95% Latency (ms) | 1.44 | 12.3 | 51.02 | 58.92 |
| TPS | 66057.79 | 63286.77 | 54899.43 | 51132.19 |
| QPS | 66057.79 | 63286.77 | 54899.43 | 51132.19 |
| CPU (32vC) | 20:34% 21:32% 22:36% | 20:31% 21:33% 22:36% | 20:33% 21:30% 22:35% | 20:31% 21:32% 22:33% |
| Memory (128GB) | 20:34% 21:34% 22:34% | 20:34% 21:34% 22:34% | 20:34% 21:34% 22:34% | 20:35% 21:35% 22:35% |
| IO Usage | 20:27% 21:18% 22:39% | 20:99% 21:43% 22:95% | 20:95% 21:99% 22:95% | 20:94% 21:91% 22:96% |

oltp_update_index

| Benchmark Duration | 5min | 5min | 5min | 5min |
|---|---|---|---|---|
| Concurrency | 100 | 300 | 600 | 900 |
| 95% Latency (ms) | 2.43 | 11.24 | 46.63 | 55.82 |
| TPS | 64748.63 | 54121.36 | 46875.16 | 46347.41 |
| QPS | 64748.63 | 54121.36 | 46875.16 | 46347.41 |
| CPU (32vC) | 20:40% 21:42% 22:40% | 20:33% 21:32% 22:29% | 20:33% 21:28% 22:28% | 20:32% 21:26% 22:34% |
| Memory (128GB) | 20:20% 21:21% 22:18% | 20:20% 21:22% 22:19% | 20:21% 21:23% 22:23% | 20:21% 21:23% 22:21% |
| IO Usage | 20:92% 21:97% 22:98% | 20:99% 21:91% 22:94% | 20:96% 21:94% 22:96% | 20:93% 21:92% 22:97% |

oltp_read_write

| Benchmark Duration | 5min | 5min | 5min | 5min |
|---|---|---|---|---|
| Concurrency | 100 | 300 | 600 | 900 |
| 95% Latency (ms) | 186.54 | 411.96 | 612.21 | 427.07 |
| TPS | 642.61 | 1940.16 | 3095.1 | 3218.29 |
| QPS | 2570.43 | 7760.64 | 12380.38 | 12869.15 |
| CPU (32vC) | 20:11% 21:10% 22:12% | 20:20% 21:16% 22:21% | 20:23% 21:22% 22:25% | 20:25% 21:24% 22:26% |
| Memory (128GB) | 20:35% 21:35% 22:35% | 20:36% 21:36% 22:36% | 20:37% 21:37% 22:37% | 20:38% 21:38% 22:38% |
| IO Usage | 20:93% 21:98% 22:98% | 20:60% 21:13% 22:51% | 20:52% 21:54% 22:51% | 20:63% 21:57% 22:61% |

oltp_read_only

| Benchmark Duration | 5min | 5min | 5min | 5min |
|---|---|---|---|---|
| Concurrency | 100 | 300 | 600 | 900 |
| 95% Latency (ms) | 183.21 | 502.2 | 383.33 | 427.07 |
| TPS | 644.99 | 865.42 | 3086 | 3450.44 |
| QPS | 2579.96 | 3461.67 | 12334.18 | 13783.17 |
| CPU (32vC) | 20:11% 21:11% 22:12% | 20:29% 21:27% 22:27% | 20:28% 21:27% 22:26% | 20:28% 21:27% 22:26% |
| Memory (128GB) | 20:34% 21:34% 22:34% | 20:33% 21:33% 22:33% | 20:33% 21:33% 22:33% | 20:33% 21:33% 22:33% |
| IO Usage | 20:100% 21:100% 22:100% | 20:55% 21:60% 22:58% | 20:65% 21:70% 22:68% | 20:75% 21:71% 22:68% |

oltp_write_only

| Benchmark Duration | 5min | 5min | 5min | 5min |
|---|---|---|---|---|
| Concurrency | 100 | 300 | 600 | 900 |
| 95% Latency (ms) | 183.21 | 260.72 | 459.18 | 637.08 |
| TPS | 651.22 | 433.68 | 264.01 | 198.25 |
| QPS | 2604.9 | 1812.3 | 896.45 | 503.76 |
| CPU (32vC) | 20:4% 21:4% 22:14% | 20:5% 21:3% 22:10% | 20:6% 21:8% 22:9% | 20:6% 21:7% 22:8% |
| Memory (128GB) | 20:34% 21:34% 22:34% | 20:35% 21:34% 22:34% | 20:35% 21:34% 22:34% | 20:36% 21:34% 22:34% |
| IO Usage | 20:100% 21:99% 22:100% | 20:100% 21:100% 22:100% | 20:100% 21:100% 22:100% | 20:100% 21:100% 22:100% |

oltp_insert

| Benchmark Duration | 5min | 5min | 5min | 5min |
|---|---|---|---|---|
| Concurrency | 100 | 300 | 600 | 900 |
| 95% Latency (ms) | 0.87 | 7.84 | 27.66 | 43.39 |
| TPS | 110055.32 | 98261.53 | 75309.96 | 77354.33 |
| QPS | 110055.32 | 98261.53 | 75309.96 | 77354.33 |
| CPU (32vC) | 20:34% 21:26% 22:27% | 20:33% 21:22% 22:29% | 20:29% 21:27% 22:24% | 20:25% 21:23% 22:38% |
| Memory (128GB) | 20:34% 21:34% 22:34% | 20:34% 21:34% 22:34% | 20:34% 21:34% 22:34% | 20:35% 21:35% 22:35% |
| IO Usage | 20:56% 21:58% 22:64% | 20:94% 21:47% 22:93% | 20:91% 21:85% 22:93% | 20:94% 21:96% 22:94% |

TPC-C

| Benchmark Duration | warehouse | Concurrency | tpmC (Orders per Minute) | tpmTotal | Notes | CPU (32vC) | Memory (128GB) | IO Usage |
|---|---|---|---|---|---|---|---|---|
| 10min | 500 | 50 | 86851.53 | 193198.13 | node:18,19,20 | 18:40% 19:39% 20:36% | 18:25% 19:20% 20:20% | 18:70% 19:75% 20:72% |
| 10min | 500 | 50 | 86653.63 | 192866.59 | node:20,21,22 | 20:35% 21:33% 22:33% | 20:23% 21:20% 22:21% | 20:65% 21:67% 22:62% |
| 10min | 500 | 60 | 84991.98 | 188799.55 | node:20,21,22 | 20:38% 21:35% 22:36% | 20:23% 21:22% 22:22% | 20:73% 21:67% 22:72% |
| 10min | 500 | 70 | 84124.6 | 186880.09 | node:20,21,22 | 20:37% 21:32% 22:36% | 20:24% 21:22% 22:23% | 20:78% 21:75% 22:71% |
| 10min | 500 | 80 | 81586.18 | 181188.01 | node:20,21,22 | 20:37% 21:34% 22:36% | 20:25% 21:22% 22:23% | 20:62% 21:65% 22:66% |
| 10min | 500 | 90 | 83623.86 | 185844.57 | node:20,21,22 | 20:36% 21:29% 22:35% | 20:26% 21:22% 22:23% | 20:82% 21:83% 22:85% |
| 10min | 500 | 100 | 46545.82 | 103319.45 | node:20,21,22 | 20:32% 21:30% 22:33% | 20:26% 21:23% 22:24% | 20:81% 21:89% 22:89% |
| 10min | 500 | 150 | 32384.51 | 71928.44 | node:20,21,22 | 20:11% 21:31% 22:29% | 20:27% 21:24% 22:25% | 20:30% 21:35% 22:55% |
| 10min | 500 | 200 | 21039.78 | 46754.49 | node:20,21,22 | 20:26% 21:9% 22:8% | 20:27% 21:24% 22:25% | 20:28% 21:36% 22:44% |
| 10min | 500 | 300 | 21356.62 | 47422.62 | node:20,21,22 | 20:27% 21:8% 22:8% | 20:27% 21:24% 22:26% | 20:22% 21:32% 22:21% |
| 10min | 500 | 400 | 21970.13 | 48836.39 | node:20,21,22 | 20:25% 21:7% 22:8% | 20:28% 21:24% 22:26% | 20:25% 21:24% 22:23% |
| 10min | 500 | 500 | 22568.62 | 50230.18 | node:20,21,22 | 20:27% 21:19% 22:45% | 20:28% 21:24% 22:26% | 20:25% 21:24% 22:31% |
| 10min | 500 | 600 | 24438.24 | 54333.98 | node:20,21,22 | 20:27% 21:19% 22:45% | 20:28% 21:24% 22:26% | 20:32% 21:25% 22:31% |
| 10min | 500 | 700 | 23091.88 | 51256.41 | node:20,21,22 | 20:28% 21:11% 22:7% | 20:29% 21:25% 22:27% | 20:34% 21:25% 22:17% |
| 10min | 1000 | 50 | 87922.21 | 195284.6 | node:20,21,22 | 20:36% 21:28% 22:39% | 20:34% 21:34% 22:34% | 20:78% 21:82% 22:81% |
| 10min | 1000 | 90 | 83814.54 | 186223.04 | node:20,21,22 | 20:33% 21:35% 22:37% | 20:34% 21:34% 22:34% | 20:81% 21:85% 22:85% |
| 10min | 1000 | 100 | 81742.56 | 181779.72 | node:20,21,22 | 20:36% 21:34% 22:38% | 20:34% 21:34% 22:34% | 20:81% 21:82% 22:87% |
| 10min | 1000 | 200 | 21620 | 48044.17 | node:20,21,22 | 20:25% 21:7% 22:10% | 20:35% 21:34% 22:34% | 20:29% 21:38% 22:31% |
| 10min | 1000 | 300 | 21763.25 | 48319.84 | node:20,21,22 | 20:26% 21:8% 22:11% | 20:35% 21:34% 22:34% | 20:31% 21:33% 22:32% |
| 10min | 1000 | 400 | 30612.03 | 68080.4 | node:20,21,22 | 20:11% 21:10% 22:30% | 20:35% 21:35% 22:35% | 20:34% 21:32% 22:35% |
| 10min | 1000 | 500 | 28126.18 | 62555.91 | node:20,21,22 | 20:11% 21:27% 22:11% | 20:35% 21:35% 22:35% | 20:33% 21:32% 22:29% |
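The two throughput rows in the TPC-C results are related: tpmC counts only New-Order transactions, while the total counts every transaction type. The sketch below computes the New-Order share for three of the runs above. It assumes the total row is BenchmarkSQL's overall transactions-per-minute figure; the tuple layout and function name are illustrative.

```python
# (warehouses, concurrency, tpmC, total tpm) for three runs from the table above.
TPCC_RUNS = [
    (500, 50, 86851.53, 193198.13),
    (500, 100, 46545.82, 103319.45),
    (1000, 50, 87922.21, 195284.6),
]


def new_order_share(tpmc: float, tpm_total: float) -> float:
    """Fraction of all transactions that are New-Order transactions."""
    return tpmc / tpm_total


for warehouses, concurrency, tpmc, tpm_total in TPCC_RUNS:
    share = new_order_share(tpmc, tpm_total)
    print(f"{warehouses} warehouses, {concurrency} terminals: New-Order share = {share:.1%}")
```

A share close to 45% in every run is what the standard TPC-C transaction mix would produce, so the two rows move together and either can be used to compare runs.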

TPC-H

Zetab has developed a new execution engine for Klustron, nextgen, which combines vectorized execution with pipelined execution. In our tests it speeds up TPC-H queries by a factor of ten to several hundred compared with the previous version, with an average speedup of several dozen times. The test data below gives the details. In the table, the new-cost columns were measured with the nextgen engine.

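| queries | cost(seconds) 1G | new-cost(seconds) 1G | new-cost(seconds) 10G | new-cost(seconds) 100G | new-cost(seconds) 200G | new-cost(seconds) 500G |
|---|---|---|---|---|---|---|
| Q1 | 15.8 | 1.6 | 0.59 | 4.48 | 44.39 | 88.79 |
| Q2 | 1.36 | 0.79 | 0.94 | 8.05 | 26.95 | 167.71 |
| Q3 | 1598.1 | 0.62 | 1.12 | 10.09 | 55.93 | 424.11 |
| Q4 | 3.12 | 0.33 | 0.63 | 5.06 | 56.83 | 162.17 |
| Q5 | 30.23 | 1.53 | 2.15 | 22.33 | 114.12 | 465.63 |
| Q6 | 2.6 | 0.39 | 0.38 | 2.95 | 45.47 | 132.41 |
| Q7 | 2262.64 | 0.45 | 1.15 | 12.36 | 70.03 | 209.25 |
| Q8 | 5.3 | 0.46 | 1.45 | 31.36 | 112.17 | 467.98 |
| Q9 | 14.33 | 15.45 | 2.67 | 29.43 | 111.07 | 717.72 |
| Q10 | 5.15 | 0.04 | 1.19 | 10.61 | 63.72 | 266.68 |
| Q11 | 0.88 | 0.04 | 0.31 | 2.6 | 14.99 | 135.14 |
| Q12 | 3.77 | 0.29 | 0.79 | 6.99 | 64.27 | 128.81 |
| Q13 | 2.54 | 2.45 | 1.75 | 15.73 | 33.62 | 178.35 |
| Q14 | 2.79 | 0.43 | 0.55 | 4.04 | 53.79 | 174.23 |
| Q15 | 5.36 | 0.06 | 0.78 | 7.57 | 102.97 | 330.5 |
| Q16 | 0.88 | 0.87 | 0.27 | 3.76 | 17.49 | 641.42 |
| Q17 | 10.97 | 0.94 | 1.92 | 15.41 | 96.64 | 445.27 |
| Q18 | 13.9 | 0.04 | 4.02 | 39.05 | 177.66 | 4191.07 |
| Q19 | 3.14 | 3.58 | 0.96 | 8.37 | 52.49 | 163.6 |
| Q20 | 4.28 | 0.52 | 1.42 | 13.57 | 78.37 | 610.63 |
| Q21 | 9.64 | 1.05 | 6.69 | 47.8 | 218.36 | |
| Q22 | 0.71 | 0.48 | 0.76 | 6.29 | 20.24 | 95.79 |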
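A per-query speedup can be derived from the TPC-H figures by dividing the old cost by the new cost at the same scale. The sketch below does this for four of the 1G rows above, taking the column labels at face value, and summarizes them with a geometric mean, a common choice for averaging ratios (the report does not say how its own average was computed). The dictionary layout and function name are illustrative.

```python
from statistics import geometric_mean

# query -> (old cost in seconds at 1G, nextgen cost in seconds at 1G),
# a subset of rows taken from the TPC-H table above.
COSTS_1G = {
    "Q1": (15.8, 1.6),
    "Q3": (1598.1, 0.62),
    "Q6": (2.6, 0.39),
    "Q9": (14.33, 15.45),
}


def speedup(old_cost: float, new_cost: float) -> float:
    """How many times faster the new engine is; values below 1 mean slower."""
    return old_cost / new_cost


SPEEDUPS = {q: speedup(old, new) for q, (old, new) in COSTS_1G.items()}

for query, factor in SPEEDUPS.items():
    print(f"{query}: {factor:.2f}x")

print(f"geometric mean over {len(SPEEDUPS)} queries: {geometric_mean(SPEEDUPS.values()):.2f}x")
```

The geometric mean keeps one very large ratio from dominating the summary, which matters here because individual speedups span several orders of magnitude.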

TPC-DS

totalCost: 2986.81s

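| query | cost(seconds) | new-cost(seconds) 1G | new-cost(seconds) 10G |
|---|---|---|---|
| Q1 | 0.24 | 0.09 | 0.96 |
| Q2 | 4.84 | 3.84 | 37.64 |
| Q3 | 1.51 | 0.1 | 0.66 |
| Q4 | 30.63 | 1.52 | 7.23 |
| Q5 | 5.18 | 0.39 | 2.56 |
| Q6 | 141.67 | 0.19 | 0.69 |
| Q7 | 5.73 | 0.27 | 2.5 |
| Q8 | 2.1 | 1.07 | 4.82 |
| Q9 | 11.49 | 0.69 | 7.1 |
| Q10 | 6.03 | 2.36 | 16.2 |
| Q11 | 20.22 | 0.76 | 3.7 |
| Q12 | 0.52 | 0.09 | 0.32 |
| Q13 | 2.21 | 0.45 | 3.04 |
| Q14 | 9.95 | 2.07 | 16.86 |
| Q15 | 1.08 | 0.18 | 1 |
| Q16 | 0.75 | 0.15 | 2.08 |
| Q17 | 6.73 | 0.27 | 2.4 |
| Q18 | 4.26 | 0.49 | 2.14 |
| Q19 | 1.77 | 0.18 | 0.89 |
| Q20 | 1.03 | 0.12 | 0.51 |
| Q21 | 5.88 | 0.4 | 1.7 |
| Q22 | 13.06 | 8.6 | 109.56 |
| Q23 | 20.25 | 0.58 | 4.13 |
| Q24 | 4.44 | 0.21 | 3.61 |
| Q25 | 1292.55 | 0.36 | 2.54 |
| Q26 | 3.52 | 0.19 | 0.99 |
| Q27 | 3.47 | 0.23 | 2.29 |
| Q28 | 7.56 | 1 | 9.84 |
| Q29 | 2.66 | 0.31 | 2.3 |
| Q30 | 0.35 | 0.13 | 0.42 |
| Q31 | 19.19 | 0.31 | 1.26 |
| Q32 | 2.14 | 0.12 | 0.79 |
| Q33 | 3.07 | 0.59 | 3.04 |
| Q34 | 0.09 | 0.1 | 0.75 |
| Q35 | 5.05 | 2.21 | 18.27 |
| Q36 | 0.07 | 0.15 | 6.86 |
| Q37 | 0.04 | 0.3 | 1.67 |
| Q38 | 4.56 | 1.05 | 9.74 |
| Q39 | 15.45 | 9.35 | 107.29 |
| Q40 | 1.38 | 0.13 | 0.73 |
| Q41 | 0.05 | 0.1 | 0.23 |
| Q42 | 1.6 | 0.1 | 0.67 |
| Q43 | 0.06 | 0.13 | 0.75 |
| Q44 | 1.1 | 0.15 | 2.71 |
| Q45 | 1029.89 | 0.19 | 0.89 |
| Q46 | 0.07 | 0.17 | 2.46 |
| Q47 | 6.47 | 0.76 | 1.23 |
| Q48 | 2.05 | 0.45 | 3.21 |
| Q49 | 3.12 | 0.38 | 3.21 |
| Q50 | 4.58 | 0.81 | 2.09 |
| Q51 | 4.21 | 1.95 | 19 |
| Q52 | 1.52 | 0.11 | 0.69 |
| Q53 | 1.61 | 0.2 | 1.35 |
| Q54 | 0.85 | 0.25 | 0.78 |
| Q55 | 1.59 | 0.13 | 0.69 |
| Q56 | 3.08 | 0.43 | 2.29 |
| Q57 | 2.86 | 0.68 | 1.94 |
| Q58 | 9.27 | 0.3 | 1.51 |
| Q59 | 6.34 | 0.27 | 1.69 |
| Q60 | 3.1 | 0.36 | 2.69 |
| Q61 | 0.14 | 0.27 | 1.68 |
| Q62 | 1 | 0.07 | 0.25 |
| Q63 | 1.63 | 0.19 | 1.43 |
| Q64 | 11.11 | 0.83 | 30 |
| Q65 | 3.68 | 0.23 | 3.08 |
| Q66 | 1.37 | 0.17 | 0.89 |
| Q67 | 10.05 | 8.43 | 110.91 |
| Q68 | 0.09 | 0.19 | 2.66 |
| Q69 | 5.24 | 0.18 | 1.01 |
| Q70 | 5.04 | 1 | 10.55 |
| Q71 | 1.62 | 0.35 | 1.55 |
| Q72 | 28.57 | 0.79 | 9.98 |
| Q73 | 0.09 | 0.12 | 0.66 |
| Q74 | 7.49 | 0.54 | 1.18 |
| Q75 | 5.74 | 2.32 | 26.54 |
| Q76 | 1.54 | 0.13 | 0.78 |
| Q77 | 4.75 | 0.25 | 2.57 |
| Q78 | 25.25 | 4.27 | 169.14 |
| Q79 | 2.49 | 0.17 | 2.3 |
| Q80 | 6.69 | 0.4 | 2.86 |
| Q81 | 0.33 | 0.13 | 0.45 |
| Q82 | 5.95 | 0.29 | 1.84 |
| Q83 | 1.2 | 0.16 | 0.39 |
| Q84 | 19.2 | 0.11 | 0.33 |
| Q85 | 2.63 | 0.38 | 1.25 |
| Q86 | 0.73 | 0.34 | 2.58 |
| Q87 | 4.54 | 1.1 | 10.23 |
| Q88 | 10.27 | 0.47 | 7 |
| Q89 | 1.85 | 0.31 | 1.01 |
| Q90 | 0.79 | 0.07 | 0.28 |
| Q91 | 1.12 | 0.15 | 0.28 |
| Q92 | 1.1 | 0.11 | 0.39 |
| Q93 | 3.59 | 0.36 | 2.06 |
| Q94 | 0.52 | 0.11 | 1.39 |
| Q95 | 32.88 | 7.95 | 88.65 |
| Q96 | 1.25 | 0.08 | 0.91 |
| Q97 | 3.23 | 1.12 | 12.65 |
| Q98 | 1.83 | 0.2 | 1.69 |
| Q99 | 2.03 | 0.17 | 0.98 |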

END