
Will Serverless Databases Render the Role of DBAs Redundant?

Klustron, about 6 min read


As technologies like 5G and AI advance, database technology, the foundation of most IT systems, keeps evolving as well, moving from complexity toward simplicity.

In recent years, the Serverless concept has gained significant traction, drawing attention from well-known consultancies such as Gartner and Forrester. Major cloud providers like AWS, Alibaba Cloud, and Tencent Cloud keep expanding their Serverless product lines. Combining databases with Serverless has, in turn, given database evolution new momentum.

A Serverless database is a database service built on the Serverless architecture, combining the strengths of cloud databases and Serverless computing. Serverless databases are particularly suited to scenarios such as IoT edge computing, development and testing, and unpredictable workloads. These scenarios often involve relatively low average loads, with a significant portion of resources sitting idle most of the time. In such cases, adopting Serverless can cut costs substantially, by up to 90%.

In comparison to traditional cloud databases, Serverless databases possess the following characteristics:

  • Automatic resource matching: Resources are automatically matched based on the user's business workload, eliminating the need for users to estimate business scale, thereby saving considerable effort.
  • Pay-as-you-go pricing: Users pay only for the resources they actually use and need not concern themselves with the underlying infrastructure services.
  • Reduced database selection complexity: Users need not worry about database selection; they can focus solely on their business needs.
  • Reduced DBA workload: Serverless databases scale resources automatically in response to traffic spikes, supporting business operations and significantly reducing the operational burden on DBAs.

As the Serverless concept surges in popularity, various vendors are introducing their own Serverless database offerings. Recently, InfoQ interviewed Zhao Wei, Founder & CEO of Zetuo Technology (KunlunBase), about the philosophy behind the Serverless version of the KunlunBase database and how it aligns with the future trends of Serverless.

InfoQ: Could you please introduce the design philosophy behind the Serverless version of the KunlunBase database?

Zhao Wei: The design goal of Klustron (formerly known as KunlunBase) is a distributed database that is horizontally elastic and scalable while ensuring financial-grade high reliability. We built on the strengths of two open-source standalone database systems, PostgreSQL and MySQL, and added a substantial number of self-developed kernel modules. These modules cover distributed transaction processing, distributed parallel query processing, distributed DDL transaction processing and replication, global deadlock handling, global multi-version concurrency control, fullsync & fullsync HA, automated fault recovery, cluster-level physical and logical data backup and recovery, cluster dual-active and multi-IDC high availability, and a series of other features specific to distributed databases. We also modified certain existing modules so that they work together with the new ones. This combination let us achieve our design goals, with an overall effect of 1+1>>2.

InfoQ: Which application scenarios are more suitable for the Serverless version? What is the technological path from 0 to 1?

Zhao Wei: The KunlunBase Serverless version is built upon KunlunBase and introduces functionalities such as tenant management, data isolation, and usage statistics for billing purposes. Some administrative functions related to cluster management in multi-tenant scenarios are restricted to ensure they are accessible only to us as service providers, not to tenants. The Serverless version is suitable for:

  • SaaS scenarios: Similar business logic but varying user scales, with different users experiencing distinct rates of business growth.
  • Unified data platforms: For large companies, the data platform department provides DBaaS similar to private clouds for various departments, products, and services.
  • Public cloud DBaaS providers.

InfoQ: Cost is a critical consideration when customers choose a database. What are the billing principles for the Serverless version?

Zhao Wei: KunlunBase Serverless bills each tenant according to the amount of data the tenant stores and the quantity of computational resources it uses. Because the service builds on the existing capabilities of the Klustron distributed database, the complexity of building a DBaaS in Serverless mode remains relatively manageable.
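To make the billing principle concrete, here is a minimal sketch of a two-component charge (storage plus compute). The rates, the units (GB-month and CPU-hour), and all names are hypothetical; the interview states only that billing depends on stored data and consumed compute, not how either is priced.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical rates for illustration only; the article does not publish prices.
STORAGE_RATE_PER_GB_MONTH = Decimal("0.10")
COMPUTE_RATE_PER_CPU_HOUR = Decimal("0.05")


@dataclass(frozen=True)
class TenantUsage:
    tenant: str
    stored_gb: Decimal  # average data stored during the billing month
    cpu_hours: Decimal  # compute consumed during the billing month


def monthly_charge(usage: TenantUsage) -> Decimal:
    """Charge = storage component + compute component, rounded to cents."""
    storage = usage.stored_gb * STORAGE_RATE_PER_GB_MONTH
    compute = usage.cpu_hours * COMPUTE_RATE_PER_CPU_HOUR
    return (storage + compute).quantize(Decimal("0.01"))


busy = TenantUsage("tenant_a", Decimal("50"), Decimal("120"))
idle = TenantUsage("tenant_b", Decimal("50"), Decimal("0"))
print(monthly_charge(busy))  # 5.00 storage + 6.00 compute = 11.00
print(monthly_charge(idle))  # storage only = 5.00
```

Under this kind of model, a tenant that stores data but issues no queries pays only the storage component, which is where the savings for mostly idle workloads come from.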

01 Overview of KunlunBase Serverless

The DBaaS service launched by Klustron (formerly known as KunlunBase) on AWS is currently operated in Serverless mode.

A KunlunBase cluster is deployed on AWS EC2 instances and EBS storage and provides KunlunBase Serverless services to multiple tenants. Zetuo Technology maintains the KunlunBase cluster on AWS, so users do not need to install or manage it. At runtime, additional EC2 instances and EBS storage space can be added on demand to provide more storage and computational capacity to both current and new tenants.

Each tenant connects to KunlunBase Serverless with a private account and password and can then read and write its own data. No tenant can access data belonging to other tenants, and no tenant can see which other tenants are using the cluster.

So, what are the technological practices behind KunlunBase Serverless? What technical challenges were encountered during the process? In our interview, Zhao Wei provided detailed answers to these questions.

02 KunlunBase Serverless Technical Practices

2.1 Data Isolation

Data isolation is crucial for a multi-tenant DBaaS: no tenant may access another tenant's data, and even the existence and names of another tenant's database objects, such as schemas and tables, must remain hidden.

The KunlunBase database team utilizes Klustron's database isolation capabilities to achieve data isolation for different tenants. Each KunlunBase Serverless tenant can connect to its dedicated database and execute DDL and DML statements. Tenants can create schemas within their databases to achieve logical data partitioning. However, tenants cannot use DML statements to read or write metadata tables in the system catalog.

Unlike in MySQL, a client connected to one database cannot switch to another database using the USE command or mysql_select_db(). Tenants also cannot connect to databases belonging to other tenants; KunlunBase's permission settings enforce this. For each KunlunBase Serverless tenant, the service logic creates a dedicated account in the KunlunBase cluster with appropriate permissions, as described in section 2.2.
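The per-tenant database and account setup can be sketched as a small helper that emits provisioning statements. This assumes standard PostgreSQL syntax (CREATE ROLE, CREATE DATABASE, REVOKE/GRANT CONNECT), on the grounds that Klustron builds on PostgreSQL; the statements KunlunBase Serverless actually runs are not given in the article, and the function and naming scheme below are hypothetical.

```python
import re

# Accept only simple lowercase identifiers so names can be embedded in SQL safely.
_IDENT = re.compile(r"[a-z_][a-z0-9_]*")


def _ident(name: str) -> str:
    if not _IDENT.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def tenant_provisioning_sql(database: str, role: str) -> list[str]:
    """Return PostgreSQL-style statements that give one tenant one private database.

    Hypothetical sketch: password handling and Klustron-specific options are omitted.
    """
    db, r = _ident(database), _ident(role)
    return [
        # A login role that cannot create further databases.
        f"CREATE ROLE {r} LOGIN NOSUPERUSER NOCREATEDB",
        # The tenant's dedicated database, owned by that role.
        f"CREATE DATABASE {db} OWNER {r}",
        # By default PUBLIC may connect to any database; remove that.
        f"REVOKE CONNECT ON DATABASE {db} FROM PUBLIC",
        # Only the tenant's own role may connect.
        f"GRANT CONNECT ON DATABASE {db} TO {r}",
    ]


for statement in tenant_provisioning_sql("acme", "acme_admin"):
    print(statement + ";")
```

Revoking CONNECT from PUBLIC is the step that prevents other tenants' accounts from opening a session on this database at all, which matches the behavior the section describes.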

2.2 User Accounts

Each tenant requires a dedicated user account to use KunlunBase Serverless, enabling access control and advanced governance features.

When users purchase KunlunBase Serverless through AWS Marketplace, the built-in logic in the purchasing process uses the provided database connection username and password to create an account in the KunlunBase cluster. Each account's permissions prohibit it from connecting to or accessing other tenants' databases, creating accounts or databases, inheriting or modifying permissions, and more. This primary account can be used to create additional sub-accounts for internal permission control. In addition, different schemas can be created within the database for different business purposes and assigned to sub-accounts for usage. All these sub-accounts can only connect to this tenant's database, and KunlunBase's control module consolidates their resource usage for unified billing by AWS.
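As an illustration of consolidating sub-account usage for unified billing, the sketch below sums per-account usage samples into one total per tenant. The account names, the account-to-tenant mapping, and the CPU-seconds metric are hypothetical; the article does not describe how the control module records usage.

```python
from collections import defaultdict

# Hypothetical mapping of login account -> owning tenant.
ACCOUNT_TENANT = {
    "acme_admin": "acme",      # primary account
    "acme_reports": "acme",    # sub-account of the same tenant
    "globex_admin": "globex",
}


def consolidate_usage(samples):
    """Sum (account, cpu_seconds) samples into one total per tenant."""
    totals = defaultdict(float)
    for account, cpu_seconds in samples:
        totals[ACCOUNT_TENANT[account]] += cpu_seconds
    return dict(totals)


samples = [
    ("acme_admin", 12.0),
    ("acme_reports", 30.5),
    ("globex_admin", 4.0),
    ("acme_admin", 1.5),
]
print(consolidate_usage(samples))  # {'acme': 44.0, 'globex': 4.0}
```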

2.3 Tenant Cluster Control

Zetuo Technology expanded KunlunBase's XPanel cluster control system into XPanel Serverless, providing independent and limited control functionality for each KunlunBase Serverless tenant. Unlike an on-premise KunlunBase cluster, many cluster control functions are not applicable to KunlunBase Serverless tenants. These include scaling, adding/deleting cluster nodes and storage shards, cluster physical backup and recovery, full-cluster logical backup and recovery, multi-availability zone (multi-data center) high availability, and dual active-active cluster functionality. Only functions like logical backup and recovery at the database, schema, and table level, online DDL & repartition, CDC, etc., continue to be effective. Tenants using the CDC feature can only export data update event streams from their own databases.

2.4 Backend Cluster Control

As the technical service provider for KunlunBase Serverless, Zetuo Technology is responsible for cluster control, including scaling, adding/deleting cluster nodes and storage shards, cluster physical backup and recovery, full-cluster logical backup and recovery, multi-availability zone (multi-data center) high availability, and dual active-active cluster functionality. These functions are accomplished by logging into the cluster using an administrator account through XPanel.
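The split between tenant-level and provider-level control functions in sections 2.3 and 2.4 amounts to a role-based allow list. The sketch below is one way to express it; the role names and function identifiers are hypothetical labels for the features listed in the text, not XPanel's actual API.

```python
# Functions that remain available to tenants (section 2.3).
TENANT_FUNCTIONS = frozenset({
    "logical_backup",   # at database, schema, and table level
    "logical_restore",
    "online_ddl",
    "repartition",
    "cdc",
})

# Functions reserved for the service provider's administrator (section 2.4).
CLUSTER_FUNCTIONS = frozenset({
    "scale_cluster",
    "add_node",
    "delete_node",
    "add_shard",
    "delete_shard",
    "physical_backup",
    "physical_restore",
    "full_cluster_logical_backup",
    "multi_az_ha",
    "dual_active",
})

# Whether the administrator can also invoke tenant-level functions is not
# stated in the article, so this sketch leaves that out.
ROLE_FUNCTIONS = {
    "tenant": TENANT_FUNCTIONS,
    "provider_admin": CLUSTER_FUNCTIONS,
}


def is_allowed(role: str, function: str) -> bool:
    """Return True only if the role's allow list contains the function."""
    return function in ROLE_FUNCTIONS.get(role, frozenset())


print(is_allowed("tenant", "cdc"))        # True
print(is_allowed("tenant", "add_shard"))  # False
```

Using an allow list rather than a deny list means any control function added later is unavailable to tenants until it is explicitly granted.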

2.5 Log Access Control

KunlunBase supports collecting logs from all cluster nodes using ElasticSearch. Due to data security considerations, only our technical support personnel can access all logs generated by operations in the backend cluster control interface. Tenants can only access interface SQL logs corresponding to their databases (SQL statements sent from computing nodes to storage nodes), storage node slow query logs, and computing node slow query logs and SQL logs.
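The tenant-side log rule can be sketched as a filter over log records: a record is visible only if it is one of the tenant-visible log types and belongs to the tenant's own database. The record structure and the type labels are hypothetical, and the sketch filters an in-memory list rather than querying ElasticSearch.

```python
from dataclasses import dataclass

# Log types the article says tenants may read, under hypothetical labels.
TENANT_VISIBLE_KINDS = frozenset({
    "interface_sql",       # SQL sent from computing nodes to storage nodes
    "storage_slow_query",
    "compute_slow_query",
    "compute_sql",
})


@dataclass(frozen=True)
class LogRecord:
    kind: str
    database: str
    message: str


def visible_to_tenant(records, tenant_database):
    """Keep only tenant-visible log types from the tenant's own database."""
    return [
        r for r in records
        if r.kind in TENANT_VISIBLE_KINDS and r.database == tenant_database
    ]


records = [
    LogRecord("compute_sql", "acme", "SELECT 1"),
    LogRecord("compute_sql", "globex", "SELECT 2"),
    LogRecord("cluster_control", "acme", "shard added"),
    LogRecord("storage_slow_query", "acme", "slow scan on orders"),
]
for record in visible_to_tenant(records, "acme"):
    print(record.kind, record.message)
```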

2.6 Resource Isolation

Currently, KunlunBase Serverless executes each SQL statement from every connected client with all available computational resources of the cluster; there is no resource isolation between tenants, which helps keep the service cost-effective. The only restriction on users is the connection count, a parameter specified when purchasing KunlunBase Serverless services that is also used in the billing rules.
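A per-tenant connection cap of this kind can be sketched as a simple counter guarded by a lock. This is only an illustration of the idea; the class, its methods, and the limits are hypothetical, and the article does not say where in KunlunBase the limit is enforced.

```python
import threading


class ConnectionLimiter:
    """Track open connections per tenant and refuse any beyond the purchased limit."""

    def __init__(self, limits):
        self._limits = dict(limits)            # tenant -> purchased connection count
        self._open = {t: 0 for t in limits}    # tenant -> currently open connections
        self._lock = threading.Lock()

    def try_connect(self, tenant: str) -> bool:
        """Reserve a connection slot; return False if the tenant is at its limit."""
        with self._lock:
            if self._open[tenant] >= self._limits[tenant]:
                return False
            self._open[tenant] += 1
            return True

    def disconnect(self, tenant: str) -> None:
        """Release a previously reserved slot, never dropping below zero."""
        with self._lock:
            if self._open[tenant] > 0:
                self._open[tenant] -= 1


limiter = ConnectionLimiter({"acme": 2})
print(limiter.try_connect("acme"))  # True
print(limiter.try_connect("acme"))  # True
print(limiter.try_connect("acme"))  # False, limit of 2 reached
```

A counter like this is cheap to maintain, in contrast to the per-tenant CPU and memory tracking that full resource isolation would require.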

In Serverless mode, the traditional use of cgroups for resource isolation is not suitable, because there is no 1-to-1 correspondence between any storage node's processes or threads and tenants. Implementing resource isolation for tenants would therefore require tracking each tenant's resource consumption and scheduling resources accordingly, which would itself consume substantial CPU and memory. Zetuo Technology has not yet implemented this, and Zhao Wei mentioned that it might be done in the future as needed.

InfoQ: What challenges do you think current Serverless databases still face? And what strategies are there to address them?

Zhao Wei: Precise resource isolation and usage control are still lacking, and providing them requires a significant amount of system-level development work. Currently, our approach is not to implement resource isolation but to let users fully utilize computing resources to serve their requests. At the same time, as a DBaaS service provider, we can quickly add computing resources to serve more user requests when resources run short. Since we follow a pay-as-you-go billing model, this approach is actually more favorable for the providers.

InfoQ: How do you view the future development trends of Serverless databases? What opportunities might arise?

Zhao Wei: In the future, Serverless database services will continue to be attractive to small and medium-sized users. The Serverless model greatly simplifies database operation and maintenance tasks, significantly reducing the workload of users' DBAs. This allows them to focus on tasks like query performance optimization, Serverless service state monitoring, and data access control management. The cost of database usage for users will also be substantially reduced. From this perspective, Serverless can be seen as a way to share DBA expertise.