Klustron RCR User Manual
01 Background
This guide aims to assist users in leveraging the Klustron database more effectively. Klustron supports Remote Cluster Replication (RCR) to synchronize data between clusters.
To improve availability across cities, a Klustron distributed database deployment sets up a replica cluster in another city. This replica cluster is in an RCR relationship with the primary city's cluster and synchronizes data in real time. Should any issue arise in the primary city, the system can quickly switch to the replica cluster in the other city to keep services uninterrupted.
02 Implementation Mechanics
2.1 Mechanics
Klustron's cross-city data synchronization consists of two parts:
- Business Data Synchronization: This runs between the corresponding shards of the two clusters and is built on MySQL's binlog primary-replica replication (see the sketch after this list).
- Klustron Metadata Synchronization: This syncs the database and table distribution details that the compute nodes need. The `binlog_sync` tool performs this synchronization: connecting to the metadata cluster in the primary city via the `binlog_dump` method, it captures binlog changes, maps the shard ID information from each binlog record, and writes the result into the replica cluster's metadata tables.
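Since shard-level synchronization is ordinary MySQL binlog replication, standard replication status commands can be used on a replica shard to observe it. A minimal sketch; the host, port, and credentials below are placeholders rather than values from this deployment:

```bash
# Inspect binlog replication state on a replica shard's MySQL instance.
# Host, port, and credentials are placeholders; substitute your own.
mysql -uroot -proot -h172.16.1.18 -P57001 -e "SHOW SLAVE STATUS\G"
```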
2.2 Considerations
1. The number of primary and replica shards must be consistent.
2. The network between the primary and replica clusters' cluster_mgr nodes must be intercommunicable (a quick reachability check is sketched after this list).
3. Xpanel should maintain uninterrupted connectivity with the metadata cluster.
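For consideration 2, an ordinary TCP probe is enough to confirm that the two cluster_mgr nodes can reach each other. A minimal sketch; port 58000 is an assumed placeholder for cluster_mgr's listening port:

```bash
# From a primary-side machine, probe the replica side's cluster_mgr port.
# 58000 is an assumed placeholder; use your deployment's actual port.
nc -vz 172.16.1.18 58000
```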
03 Klustron Version & Cluster Machine Configuration
Klustron Version | 1.2.1
---|---

Cluster Name | Cluster_A |
---|---|---
Machine IP | Machine Specs | Components
172.16.0.15 | CentOS 7.9, 32C 128G, 2 × 1 TB NVMe SSD | Meta Node, Data Node, Compute Node, Xpanel
172.16.0.16 | CentOS 7.9, 32C 128G, 2 × 1 TB NVMe SSD | Meta Node, Data Node, Compute Node
172.16.0.17 | CentOS 7.9, 32C 128G, 2 × 1 TB NVMe SSD | Meta Node, Data Node, Compute Node

Cluster Name | Cluster_B |
---|---|---
Machine IP | Machine Specs | Components
172.16.1.18 | CentOS 7.9, 32C 128G, 2 × 1 TB NVMe SSD | Meta Node, Data Node, Compute Node, Xpanel
172.16.1.19 | CentOS 7.9, 32C 128G, 2 × 1 TB NVMe SSD | Meta Node, Data Node, Compute Node
172.16.1.20 | CentOS 7.9, 32C 128G, 2 × 1 TB NVMe SSD | Meta Node, Data Node, Compute Node
04 Data Synchronization Between Clusters
4.1 Data Synchronization Among Clusters Managed Under Local Xpanel
4.1.1 Add RCR
Initially, create two clusters named Cluster_A and Cluster_B as illustrated:
Once the clusters are successfully created, the cluster information is displayed in the cluster list as illustrated:
In the cluster management list, click on “RCR Service”, then select the “+ Add RCR” button as depicted:
Proceed to add metadata node information for both Cluster_A and Cluster_B.
Once the metadata information is successfully added, check the values of the various properties in the RCR service list. Here, the status value “Running” indicates that data synchronization between the new pair of clusters was set up successfully.
4.1.2 Data Synchronization Verification
To test data synchronization, write data on the compute node of Cluster_A:
PGPASSWORD=abc psql -h 172.16.0.16 -U abc -p 47001 -d postgres
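For instance, a table can be created and a row written once logged in. A minimal sketch; the table name rcr_test is purely illustrative:

```bash
# Hypothetical test table and row; rcr_test is an illustrative name.
PGPASSWORD=abc psql -h 172.16.0.16 -U abc -p 47001 -d postgres \
  -c "CREATE TABLE rcr_test(id int PRIMARY KEY, note text);" \
  -c "INSERT INTO rcr_test VALUES (1, 'written on Cluster_A');"
```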
On the compute node of Cluster_B, inspect the data. The data has been successfully synchronized. (Note: at this point, Cluster_B acts as a replica cluster; it is read-only, but its data can still be queried.)
PGPASSWORD=abc psql -h 172.16.1.20 -U abc -p 47001 -d postgres
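Assuming the illustrative rcr_test table from above, the synchronized row can be read back like this:

```bash
# Read back the row written on Cluster_A; rcr_test is the illustrative table above.
PGPASSWORD=abc psql -h 172.16.1.20 -U abc -p 47001 -d postgres -c "SELECT * FROM rcr_test;"
```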
4.1.3 Data Consistency Verification
To further ensure data accuracy and consistency, use the MySQL protocol to access the compute node and verify the MD5 values of the data in the tables:
mysql -uabc -pabc -h172.16.0.16 -P47002
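One possible checksum query, assuming the illustrative rcr_test table and MySQL-compatible aggregate functions on the compute node:

```bash
# Illustrative per-table checksum; adapt the column list to your table.
mysql -uabc -pabc -h172.16.0.16 -P47002 \
  -e "SELECT MD5(GROUP_CONCAT(id, note ORDER BY id)) FROM rcr_test;"
```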
Similarly, use the MySQL protocol to log into the compute node of Cluster_B and verify the MD5 values:
mysql -uabc -pabc -h172.16.1.20 -P47002
The resulting MD5 values should be identical to those of the primary Cluster_A.
4.1.4 Delayed Replication Setting
This function mimics MySQL's delayed replication: it sets how many seconds the replica cluster (Cluster_B) waits before applying the data synchronized from the primary cluster (Cluster_A).
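For context, this mirrors what a plain MySQL replica configures with the statement below; the RCR delay itself is set in Xpanel, not with this command:

```bash
# Plain-MySQL equivalent of a 300-second apply delay, shown for comparison only.
mysql -uroot -p -e "STOP SLAVE SQL_THREAD; CHANGE MASTER TO MASTER_DELAY = 300; START SLAVE SQL_THREAD;"
```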
4.1.5 Delete RCR
When an RCR relationship is established between clusters, the clusters can't be deleted. They can only be removed once the relationship is unlinked. Note: Be cautious when deleting a cluster in a production environment.
4.1.6 Switch RCR
In instances where the primary cluster (Cluster_A) experiences anomalies or needs to transition business operations to the replica cluster (Cluster_B) due to specific changes, a manual cluster switch is necessary. The specific steps are as follows:
Before making the switch, click on “Details” to review the replication information between primary and replica clusters.
After the switch, verify the data to ensure accuracy.
Modify data on the primary cluster, Cluster_B (the replica before the switch, now the primary). Log into the compute node of the primary cluster:
PGPASSWORD=abc psql -h 172.16.1.20 -U abc -p 47001 -d postgres
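Continuing with the illustrative rcr_test table, a modification on the new primary might look like:

```bash
# Hypothetical modification on the new primary (Cluster_B).
PGPASSWORD=abc psql -h 172.16.1.20 -U abc -p 47001 -d postgres \
  -c "UPDATE rcr_test SET note = 'updated on Cluster_B after switch' WHERE id = 1;"
```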
Check the modified data on the replica cluster, Cluster_A (the primary before the switch, now the replica). Log into the compute node of the replica cluster:
PGPASSWORD=abc psql -h 172.16.0.15 -U abc -p 47001 -d postgres
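The change can then be read back on the new replica, again using the illustrative table:

```bash
# Confirm the modification has replicated to the new replica (Cluster_A).
PGPASSWORD=abc psql -h 172.16.0.15 -U abc -p 47001 -d postgres -c "SELECT * FROM rcr_test;"
```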
Data queries are consistent.
To further verify data consistency, use the MySQL protocol to separately log into the compute nodes of the primary and replica clusters:
mysql -uabc -pabc -h172.16.1.20 -P47002
mysql -uabc -pabc -h172.16.0.15 -P47002
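The same illustrative checksum query from section 4.1.3 can be run on both sides:

```bash
# Compare per-table checksums on the new primary and the new replica.
mysql -uabc -pabc -h172.16.1.20 -P47002 -e "SELECT MD5(GROUP_CONCAT(id, note ORDER BY id)) FROM rcr_test;"
mysql -uabc -pabc -h172.16.0.15 -P47002 -e "SELECT MD5(GROUP_CONCAT(id, note ORDER BY id)) FROM rcr_test;"
```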
The results show that the MD5 values of the queried data are perfectly consistent.
4.1.7 Start/Stop RCR
Stop RCR.
Start RCR.
The status in the attribute list will change accordingly during stopping and starting operations.
4.1.8 Add/Delete Shard in RCR
When establishing an RCR relationship, the primary and replica clusters must have the same number of shards. If a shard is added or deleted in either cluster after the RCR relationship is set up, the system automatically adds or deletes the corresponding shard in the other cluster based on the RCR relationship.
When adding a shard in the primary cluster, it is displayed as illustrated:
After successfully adding the shard in the primary cluster, the replica cluster also adds the corresponding shard node.
The deletion operation works in a similar way, and will not be demonstrated here.
4.2 Data Synchronization Between Clusters Managed by Non-local Xpanel
The previous section covered data synchronization between clusters managed by the same local Xpanel (i.e., clusters that share one metadata cluster). This section describes data synchronization between clusters managed by different Xpanels (i.e., clusters that do not share a metadata cluster). Before synchronization is set up, there is no logical relationship between them.
4.2.1 Data Preparation
http://192.168.0.125:18080/KunlunXPanel/ manages one cluster, namely Cluster_A.
http://192.168.0.128:18080/KunlunXPanel/ manages another cluster, namely Cluster_B.
4.2.2 Add RCR
In the cluster management list, click on "RCR Service", then click on the "Metadata Management" button as illustrated:
In the new metadata list, add the metadata information of the replica cluster.
Once added successfully, the metadata list will display the information of the recently added replica cluster as depicted in the provided image.
Back in the cluster management list, click on "RCR Service", then click on the "+ Add RCR" button as illustrated:
Enter the metadata node information for Cluster_A and Cluster_B respectively.
After successfully adding RCR, the list information is displayed as shown:
Clicking on "Details" will reveal relevant synchronization information.
The other functionalities are the same as for synchronization between clusters managed by the local Xpanel, and are not reiterated here.