Klustron RCR User Manual
01 Background
This guide aims to assist users in leveraging the Klustron database more effectively. Klustron supports Remote Cluster Replication (RCR) to synchronize data between clusters.
To improve availability across cities, a Klustron distributed database deployment sets up a replica cluster in another city. This replica cluster is in an RCR relationship with the primary city's cluster and synchronizes data in real time. Should any issue arise in the primary city, the system can quickly switch to the replica cluster in the other city to keep services uninterrupted.
02 Implementation Mechanics
2.1 Mechanics
Klustron's cross-city data synchronization consists of two parts:
- Business Data Synchronization: This runs between the corresponding shards of the two clusters and is built on MySQL's binlog primary-replica replication (see the sketch after this list).
- Klustron Metadata Synchronization: This syncs the database and table distribution details that the compute nodes need. The `binlog_sync` tool performs this synchronization: connecting to the metadata cluster in the primary city via the `binlog_dump` method, it captures binlog changes, maps the shard ID information from each binlog record, and writes the result into the replica cluster's metadata tables.
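Since shard-level synchronization is ordinary MySQL binlog replication, standard replication status commands can be used on a replica shard to observe it. A minimal sketch; the host, port, and credentials below are placeholders rather than values from this deployment:

```bash
# Inspect binlog replication state on a replica shard's MySQL instance.
# Host, port, and credentials are placeholders; substitute your own.
mysql -uroot -proot -h172.16.1.18 -P57001 -e "SHOW SLAVE STATUS\G"
```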
2.2 Considerations
1. The number of primary and replica shards must be consistent.
2. The network between the primary and replica clusters' cluster_mgr nodes must be intercommunicable (a quick reachability check is sketched after this list).
3. Xpanel should maintain uninterrupted connectivity with the metadata cluster.
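For consideration 2, an ordinary TCP probe is enough to confirm that the two cluster_mgr nodes can reach each other. A minimal sketch; port 58000 is an assumed placeholder for cluster_mgr's listening port:

```bash
# From a primary-side machine, probe the replica side's cluster_mgr port.
# 58000 is an assumed placeholder; use your deployment's actual port.
nc -vz 172.16.1.18 58000
```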
03 Klustron Version & Cluster Machine Configuration
Klustron Version | 1.2.1
---|---

Cluster Name | Cluster_A |
---|---|---
Machine IP | Machine Specs | Components
172.16.0.15 | CentOS 7.9, 32C 128G, 2 × 1 TB NVMe SSD | Meta Node, Data Node, Compute Node, Xpanel
172.16.0.16 | CentOS 7.9, 32C 128G, 2 × 1 TB NVMe SSD | Meta Node, Data Node, Compute Node
172.16.0.17 | CentOS 7.9, 32C 128G, 2 × 1 TB NVMe SSD | Meta Node, Data Node, Compute Node

Cluster Name | Cluster_B |
---|---|---
Machine IP | Machine Specs | Components
172.16.1.18 | CentOS 7.9, 32C 128G, 2 × 1 TB NVMe SSD | Meta Node, Data Node, Compute Node, Xpanel
172.16.1.19 | CentOS 7.9, 32C 128G, 2 × 1 TB NVMe SSD | Meta Node, Data Node, Compute Node
172.16.1.20 | CentOS 7.9, 32C 128G, 2 × 1 TB NVMe SSD | Meta Node, Data Node, Compute Node
04 Data Synchronization Between Clusters
4.1 Data Synchronization Among Clusters Managed Under Local Xpanel
4.1.1 Add RCR
Initially, create two clusters named Cluster_A and Cluster_B as illustrated:
Once the clusters are successfully created, the cluster information is displayed in the cluster list as illustrated:
In the cluster management list, click on “RCR Service”, then select the “+ Add RCR” button as depicted:
Proceed to add metadata node information for both Cluster_A and Cluster_B.
Once the metadata information is successfully added, check the values of the various properties in the RCR service list. Here, the status value “Running” indicates that data synchronization between the new pair of clusters was set up successfully.
4.1.2 Data Synchronization Verification
To test data synchronization, write data on the compute node of Cluster_A:
PGPASSWORD=abc psql -h 172.16.0.16 -U abc -p 47001 -d postgres
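For instance, a table can be created and a row written once logged in. A minimal sketch; the table name rcr_test is purely illustrative:

```bash
# Hypothetical test table and row; rcr_test is an illustrative name.
PGPASSWORD=abc psql -h 172.16.0.16 -U abc -p 47001 -d postgres \
  -c "CREATE TABLE rcr_test(id int PRIMARY KEY, note text);" \
  -c "INSERT INTO rcr_test VALUES (1, 'written on Cluster_A');"
```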
On the compute node of Cluster_B, inspect the data. The data has been successfully synchronized. (Note: at this point, Cluster_B acts as a replica cluster; it is read-only, but its data can still be queried.)
PGPASSWORD=abc psql -h 172.16.1.20 -U abc -p 47001 -d postgres
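Assuming the illustrative rcr_test table from above, the synchronized row can be read back like this:

```bash
# Read back the row written on Cluster_A; rcr_test is the illustrative table above.
PGPASSWORD=abc psql -h 172.16.1.20 -U abc -p 47001 -d postgres -c "SELECT * FROM rcr_test;"
```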
4.1.3 Data Consistency Verification
To further ensure data accuracy and consistency, use the MySQL protocol to access the compute node and verify the MD5 values of the data in the tables:
mysql -uabc -pabc -h172.16.0.16 -P47002
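One possible checksum query, assuming the illustrative rcr_test table and MySQL-compatible aggregate functions on the compute node:

```bash
# Illustrative per-table checksum; adapt the column list to your table.
mysql -uabc -pabc -h172.16.0.16 -P47002 \
  -e "SELECT MD5(GROUP_CONCAT(id, note ORDER BY id)) FROM rcr_test;"
```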
Similarly, use the MySQL protocol to log into the compute node of Cluster_B and verify the MD5 values:
mysql -uabc -pabc -h172.16.1.20 -P47002
The resulting MD5 values should be identical to those of the primary Cluster_A.
4.1.4 Delayed Replication Setting
This function mimics MySQL's delayed replication: it sets how many seconds the replica cluster (Cluster_B) waits before applying the data synchronized from the primary cluster (Cluster_A).
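For context, this mirrors what a plain MySQL replica configures with the statement below; the RCR delay itself is set in Xpanel, not with this command:

```bash
# Plain-MySQL equivalent of a 300-second apply delay, shown for comparison only.
mysql -uroot -p -e "STOP SLAVE SQL_THREAD; CHANGE MASTER TO MASTER_DELAY = 300; START SLAVE SQL_THREAD;"
```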
4.1.5 Delete RCR
When an RCR relationship is established between clusters, the clusters can't be deleted. They can only be removed once the relationship is unlinked. Note: Be cautious when deleting a cluster in a production environment.
4.1.6 Switch RCR
In instances where the primary cluster (Cluster_A) experiences anomalies or needs to transition business operations to the replica cluster (Cluster_B) due to specific changes, a manual cluster switch is necessary. The specific steps are as follows:
Before making the switch, click on “Details” to review the replication information between primary and replica clusters.
After the switch, verify the data to ensure accuracy.
Modify data on the primary cluster, Cluster_B (the replica before the switch, now the primary). Log into the compute node of the primary cluster:
PGPASSWORD=abc psql -h 172.16.1.20 -U abc -p 47001 -d postgres
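Continuing with the illustrative rcr_test table, a modification on the new primary might look like:

```bash
# Hypothetical modification on the new primary (Cluster_B).
PGPASSWORD=abc psql -h 172.16.1.20 -U abc -p 47001 -d postgres \
  -c "UPDATE rcr_test SET note = 'updated on Cluster_B after switch' WHERE id = 1;"
```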
Check the modified data on the replica cluster, Cluster_A (the primary before the switch, now the replica). Log into the compute node of the replica cluster:
PGPASSWORD=abc psql -h 172.16.0.15 -U abc -p 47001 -d postgres
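The change can then be read back on the new replica, again using the illustrative table:

```bash
# Confirm the modification has replicated to the new replica (Cluster_A).
PGPASSWORD=abc psql -h 172.16.0.15 -U abc -p 47001 -d postgres -c "SELECT * FROM rcr_test;"
```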
Data queries are consistent.
To further verify data consistency, use the MySQL protocol to separately log into the compute nodes of the primary and replica clusters:
mysql -uabc -pabc -h172.16.1.20 -P47002
mysql -uabc -pabc -h172.16.0.15 -P47002
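The same illustrative checksum query from section 4.1.3 can be run on both sides:

```bash
# Compare per-table checksums on the new primary and the new replica.
mysql -uabc -pabc -h172.16.1.20 -P47002 -e "SELECT MD5(GROUP_CONCAT(id, note ORDER BY id)) FROM rcr_test;"
mysql -uabc -pabc -h172.16.0.15 -P47002 -e "SELECT MD5(GROUP_CONCAT(id, note ORDER BY id)) FROM rcr_test;"
```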
The results show that the MD5 values of the queried data are perfectly consistent.
4.1.7 Start/Stop RCR
Stop RCR.
Start RCR.
The status in the attribute list will change accordingly during stopping and starting operations.
4.1.8 Add/Delete Shard in RCR
When establishing an RCR relationship, the primary and replica clusters must have the same number of shards. If a shard is added or deleted in either cluster after the RCR relationship is set up, the system automatically adds or deletes the corresponding shard in the other cluster based on the RCR relationship.
When adding a shard in the primary cluster, it is displayed as illustrated:
After successfully adding the shard in the primary cluster, the replica cluster also adds the corresponding shard node.
The deletion operation works in a similar way, and will not be demonstrated here.
4.2 Data Synchronization Between Clusters Managed by Non-local Xpanel
The previous section covered data synchronization between clusters managed by the same local Xpanel (i.e., clusters that share one metadata cluster). This section describes data synchronization between clusters managed by different Xpanels (i.e., clusters that do not share a metadata cluster). Before synchronization is set up, there is no logical relationship between them.
4.2.1 Data Preparation
http://192.168.0.125:18080/KunlunXPanel/ manages one cluster, namely Cluster_A.
http://192.168.0.128:18080/KunlunXPanel/ manages another cluster, namely Cluster_B.
4.2.2 Add RCR
In the cluster management list, click on "RCR Service", then click on the "Metadata Management" button as illustrated:
In the new metadata list, add the metadata information of the replica cluster.
Once added successfully, the metadata list will display the information of the recently added replica cluster as depicted in the provided image.
Back in the cluster management list, click on "RCR Service", then click on the "+ Add RCR" button as illustrated:
Enter the metadata node information for Cluster_A and Cluster_B respectively.
After successfully adding RCR, the list information is displayed as shown:
Clicking on "Details" will reveal relevant synchronization information.
The other functionalities are the same as for synchronization between clusters managed by the local Xpanel, and are not reiterated here.