Apache ShardingSphere 5.0.0-beta will be released soon, after six months of preparation. This article previews the features the new version will bring.
About the author
Pan Juan | Trista
Co-founder of SphereEx
SphereEx co-founder, Apache Member, Apache ShardingSphere PMC member, Apache BRPC mentor, and Release Manager for this release.
Formerly a senior DBA at Jingdong Technology, responsible for the design and development of the intelligent database platform of the Jingdong Data Department. Now focuses on the distributed database and middleware ecosystem and on open source. Named a “2020 Chinese Open Source Pioneer” and a frequent invited speaker at database and architecture conferences.
As an Apache top-level project, ShardingSphere must go through community verification, voting, and publication for every release. These steps ensure that the released version complies with the license and the Apache release specifications, and that its functionality and quality meet expectations as far as possible. This protects both the project and its users. The build of the current version has been completed, and the release is expected this week.
This Release will bring the following important features:
1. Highlights
DistSQL, a newly defined distributed database operation language
SQL is a query and programming language for accessing, querying, updating, and managing data in relational database systems. In October 1986, the American National Standards Institute adopted SQL as the standard language for relational database management systems. In practice, most database systems have rewritten or extended parts of the SQL specification to gain more flexibility and richer functionality suited to their own system.
DistSQL (Distributed SQL) is a built-in SQL dialect specific to Apache ShardingSphere that provides additional operational capabilities beyond standard SQL. DistSQL lets users operate ShardingSphere as if it were a database, transforming it from a developer-oriented framework and middleware into an operations-oriented infrastructure product.
In ShardingSphere, DistSQL is currently divided into RDL, RQL, and SCTL:
- Resource & Rule Definition Language (RDL): creates, modifies, and deletes resources and rules.
- Resource & Rule Query Language (RQL): queries and displays resources and rules.
- ShardingSphere Control Language (SCTL): performs other incremental operations, such as hints, switching the transaction type, and querying the sharding execution plan.
ShardingSphere introduced the concept of Database Plus: empower traditional databases, build a distributed, secure, and controllable enhancement layer on top of them, and so create an open source distributed database system that keeps the strengths of the underlying databases while meeting actual business needs. Combined with this distributed database system, DistSQL transforms ShardingSphere-Proxy from a traditional distributed database proxy driven by configuration files into a distributed database that is truly driven by SQL.
In 5.0.0-beta, users can start ShardingSphere-Proxy in one step and then use DistSQL to dynamically create, modify, and delete sharded tables and encrypted tables, inject database instance resources, create primary/replica round-robin rules, display global configuration information, enable distributed transactions, and start migration jobs for sharded databases and tables.
DistSQL lets users operate and manage all database resources and metadata in the ShardingSphere distributed database ecosystem in the most standard and familiar way: through SQL. In the future, DistSQL will break down the boundary between middleware and database, allowing developers to use ShardingSphere as naturally as they use a database.
Fully connected to the PostgreSQL ecosystem
PostgreSQL is one of the world’s leading open source databases and is widely regarded as the most powerful open source database for enterprise use. It currently ranks fourth in the DB-Engines ranking and was named DBMS of the Year in 2017 and 2018.
ShardingSphere-JDBC and ShardingSphere-Proxy make up ShardingSphere’s access layer, and ShardingSphere-Proxy supports both the MySQL and the PostgreSQL protocol. With MySQL protocol support now mature and widely adopted, the ShardingSphere team has turned its attention to the PostgreSQL protocol. In this release, PostgreSQL support has been substantially improved in the SQL parsing layer, the SQL compatibility layer, the protocol access layer, and the permission control layer. As the flagship of this release, the PostgreSQL version of ShardingSphere-Proxy marks a real step into the open source PostgreSQL ecosystem, with further improvements to follow.
PostgreSQL is a star product of the open source database world. ShardingSphere’s integration with PostgreSQL gives users who need distributed deployment, horizontal scaling, data encryption, or fine-grained permission control for PostgreSQL a more complete and continuously maintained solution.
ShardingSphere pluggable architecture
A pluggable architecture keeps modules independent and unaware of one another, and combines their functions by layering them on a highly flexible, pluggable, and extensible kernel.
In ShardingSphere, many implementation classes are loaded through SPI (Service Provider Interface). An SPI is an API intended to be implemented or extended by third parties, which makes it suitable for extending a framework or replacing its components.
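The SPI idea can be sketched in a few lines of Python. This is only an analogy under stated assumptions: ShardingSphere itself is written in Java, and the names `Encryptor`, `register`, and `load` below are hypothetical, not ShardingSphere APIs. The kernel defines a contract, plug-ins register themselves under a type name, and the kernel looks them up by that name without knowing the concrete classes.

```python
import hashlib
from abc import ABC, abstractmethod
from typing import Dict, Type


class Encryptor(ABC):
    """Hypothetical extension point: the contract every plug-in must fulfil."""

    # Short name used to look the plug-in up, for example from configuration.
    type_name: str = ""

    @abstractmethod
    def encrypt(self, plaintext: str) -> str:
        """Return the stored form of the given plaintext."""


# Registry filled when plug-in classes are defined. A Java SPI would discover
# providers through ServiceLoader instead; this dictionary only imitates that.
_REGISTRY: Dict[str, Type[Encryptor]] = {}


def register(cls: Type[Encryptor]) -> Type[Encryptor]:
    """Class decorator that makes an implementation discoverable by name."""
    if not cls.type_name:
        raise ValueError("a plug-in must declare a type_name")
    _REGISTRY[cls.type_name.upper()] = cls
    return cls


def load(type_name: str) -> Encryptor:
    """Instantiate the plug-in registered under type_name (case-insensitive)."""
    try:
        return _REGISTRY[type_name.upper()]()
    except KeyError:
        raise LookupError(f"no plug-in registered for {type_name!r}") from None


@register
class Md5Encryptor(Encryptor):
    """Built-in style plug-in that stores the MD5 digest of the value."""
    type_name = "MD5"

    def encrypt(self, plaintext: str) -> str:
        return hashlib.md5(plaintext.encode("utf-8")).hexdigest()


@register
class ReverseEncryptor(Encryptor):
    """Toy third-party plug-in (not secure), added without touching the kernel."""
    type_name = "REVERSE"

    def encrypt(self, plaintext: str) -> str:
        return plaintext[::-1]
```

With this in place, `load("md5").encrypt("abc")` picks the built-in plug-in, while `load("reverse")` picks the third-party one. Replacing a component means registering a different class under the same contract.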
At present, data sharding, read/write splitting, data encryption, shadow databases, database discovery, and other features, as well as SQL and protocol support for MySQL, PostgreSQL, SQLServer, Oracle, and others, can be plugged into ShardingSphere. ShardingSphere now offers dozens of SPIs as extension points, and more are being added. As the pluggable architecture matures, ShardingSphere is evolving from sharding middleware into a distributed database ecosystem.
Thanks to this pluggable and extensible architecture, users can assemble customized database solutions like building blocks. For example, they can add horizontal scaling and data encryption to a traditional relational database at the same time, or build a distributed database solution on their own.
2. New features
New open observability capabilities
To separate observability cleanly from the main functionality, ShardingSphere provides automated probes that can be customized and extended for tracing, performance metrics, and log instrumentation. ShardingSphere ships with tracing probes based on OpenTracing, Jaeger, and Zipkin, a metrics probe based on Prometheus, and a default logging implementation.
3. Enhancements
Enhanced distributed query capabilities
Joins and subqueries across database instances have always been a headache. When several database instances are used at the same time, the databases limit what the business layer can do, so developers have to pay close attention to which query SQL is supported.
This release enhances distributed queries. It supports joins and subqueries across database instances, and improvements and bug fixes in SQL parsing, routing, and execution greatly increase SQL compatibility for databases such as MySQL, PostgreSQL, and Oracle in distributed scenarios. Users can therefore introduce ShardingSphere on top of an existing cluster of database instances and move smoothly from a traditional database cluster to a horizontally scalable distributed one, with low risk, high efficiency, and no changes to the application.
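To make the problem concrete, here is a minimal sketch of what a join across instances involves once the rows of each table have been fetched from different database instances: the join has to be computed outside any single database. The sketch uses a plain in-memory hash join. The function name and the sample rows are assumptions for illustration, and this is not a description of how ShardingSphere’s executor is implemented.

```python
from collections import defaultdict
from typing import Dict, Hashable, List

Row = Dict[str, Hashable]


def hash_join(left_rows: List[Row], right_rows: List[Row], key: str) -> List[Row]:
    """Inner-join two row sets that came from different database instances."""
    # Build phase: index the right-hand rows by the join key.
    index = defaultdict(list)
    for row in right_rows:
        index[row[key]].append(row)

    # Probe phase: look up each left-hand row and merge matching pairs.
    joined = []
    for left in left_rows:
        for right in index.get(left[key], []):
            joined.append({**right, **left})
    return joined


# Rows as they might come back from two separate instances (made-up data).
orders_from_instance_a = [
    {"order_id": 1, "user_id": 10},
    {"order_id": 2, "user_id": 11},
    {"order_id": 3, "user_id": 99},  # no matching user, dropped by the join
]
users_from_instance_b = [
    {"user_id": 10, "name": "alice"},
    {"user_id": 11, "name": "bob"},
]

result = hash_join(orders_from_instance_a, users_from_instance_b, "user_id")
```

Doing this work in the middleware costs memory and network transfer, which is one reason performance matters so much for distributed queries.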
The enhanced distributed query capability is still at the PoC stage, and there is considerable room to improve its performance. Contributions from the community are welcome.
Enhanced distributed user and permission control
User security and permission control are among the most important features in the database field. The earlier 5.0.0-alpha release offered simple user and password configuration and coarse-grained permission control at the database level, and the upcoming beta builds on this. Users and passwords could previously be configured only through the configuration file; now distributed user names, host names, and passwords can be modified and managed online with standard SQL. Permission control has also been refined from the database level to the database and table level.
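The difference between database-level and table-level control can be sketched as follows. The `Grantee` and `PrivilegeStore` names are hypothetical and the model is deliberately simplified (no privilege types, no host wildcard matching); it only illustrates how a grant on a whole database differs from a grant on a single table.

```python
from collections import defaultdict
from typing import NamedTuple


class Grantee(NamedTuple):
    """A user is identified by name plus the host it connects from."""
    username: str
    hostname: str = "%"  # placeholder host; this sketch does no wildcard matching


class PrivilegeStore:
    """Hypothetical in-memory store for database- and table-level grants."""

    ALL_TABLES = "*"

    def __init__(self):
        # grantee -> database -> set of table names (or ALL_TABLES)
        self._grants = defaultdict(lambda: defaultdict(set))

    def grant(self, grantee, database, table=ALL_TABLES):
        """Grant access to one table, or to the whole database by default."""
        self._grants[grantee][database].add(table)

    def is_allowed(self, grantee, database, table):
        """Check a table access against database-level and table-level grants."""
        tables = self._grants.get(grantee, {}).get(database, set())
        return self.ALL_TABLES in tables or table in tables


store = PrivilegeStore()
alice = Grantee("alice", "127.0.0.1")
bob = Grantee("bob")
store.grant(alice, "sharding_db")           # coarse: whole database
store.grant(bob, "sharding_db", "t_order")  # fine: a single table
```

Here `alice` may read any table in `sharding_db`, while `bob` is limited to `t_order`. The same user name connecting from another host is a different grantee and has no access.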
Whether a business uses MySQL or PostgreSQL (openGauss will be supported in the future), it can use the native SQL dialect of its database to control and manage freely combined permissions on user names, host names, passwords, databases, and tables within ShardingSphere’s distributed system. Because ShardingSphere-Proxy is accessed like a database, users can migrate their existing database permissions and user systems as seamlessly as possible.
In future releases, ShardingSphere will provide permission control at the column level and view level, and even permission constraints on individual rows. For third-party business systems or user-specific security systems, ShardingSphere provides the ability to integrate with them, so that ShardingSphere-Proxy can work with third-party security controls and offer the most standard database permission management model.
The permission module is still under development, and the next version will offer more complete features.
API simplification and refactoring
ShardingSphere’s pluggable architecture gives users rich extension capabilities, while commonly used functions are built in for ease of use. For example, hash sharding, time range sharding, modulo sharding, and other strategies are preconfigured for database and table sharding. For encrypted data storage, AES, RC4, and MD5 algorithms are preconfigured. To simplify operations further, the new DistSQL capability lets users dynamically create a sharded or encrypted table online with a single SQL statement.
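The three sharding strategies named above can be sketched as small routing functions. All function names here are hypothetical, and ShardingSphere’s built-in algorithms may differ in detail (for example in the hash function they use). The sketch only shows the idea that a sharding value is mapped to a shard index, which then selects the actual table.

```python
import zlib
from datetime import date


def mod_shard(value: int, shard_count: int) -> int:
    """Modulo sharding: the remainder of an integer key picks the shard."""
    return value % shard_count


def hash_mod_shard(value: str, shard_count: int) -> int:
    """Hash sharding: hash a non-numeric key first, then take the remainder.

    zlib.crc32 is stable across processes, unlike Python's built-in hash().
    """
    return zlib.crc32(value.encode("utf-8")) % shard_count


def month_range_shard(day: date, first_month: date) -> int:
    """Time range sharding: one shard per month, counted from first_month."""
    offset = (day.year - first_month.year) * 12 + (day.month - first_month.month)
    if offset < 0:
        raise ValueError("date lies before the first shard")
    return offset


def actual_table(logic_table: str, shard_index: int) -> str:
    """Map a logical table name and a shard index to a physical table name."""
    return f"{logic_table}_{shard_index}"
```

For example, `actual_table("t_order", mod_shard(10, 4))` routes order 10 to `t_order_2` when the logical table is split into four physical tables.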
Beyond the preset functions, ShardingSphere also exposes the relevant algorithm and strategy interfaces, so that users can inject more complex logic that fits their actual business scenarios. Combining simplicity with openness has always been ShardingSphere’s architectural design philosophy.
4. Other functions
Performance improvement: Metadata loading optimization
Because ShardingSphere shields users from, and manages on their behalf, all database instances and their metadata, loading that metadata at application startup can take a long time. The problem is most visible with thousands of server instances. This release includes many performance optimizations and architectural adjustments aimed specifically at this widely reported metadata loading issue. Instead of loading through the native JDBC driver, ShardingSphere now runs dialect-specific SQL queries in parallel to fetch all metadata at once, which greatly improves startup performance.
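The approach of one bulk query per data source, with all data sources queried in parallel, can be sketched as follows. The loader and the `FAKE_CATALOGS` data are stand-ins invented for this example. A real loader would run a dialect-specific query against each instance’s system catalog (for example `information_schema.columns`) instead of reading a dictionary.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterable, List

# Stand-in for real database instances: data source -> {table: column names}.
FAKE_CATALOGS = {
    "ds_0": {"t_order_0": ["order_id", "user_id"], "t_order_1": ["order_id", "user_id"]},
    "ds_1": {"t_user_0": ["user_id", "name"]},
}


def load_schema(data_source: str) -> Dict[str, List[str]]:
    """Hypothetical loader: fetch all table metadata of one source in one go."""
    return dict(FAKE_CATALOGS[data_source])


def load_all_metadata(data_sources: Iterable[str],
                      max_workers: int = 8) -> Dict[str, Dict[str, List[str]]]:
    """Load the metadata of every data source concurrently."""
    names = list(data_sources)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so zip pairs them correctly.
        schemas = pool.map(load_schema, names)
        return dict(zip(names, schemas))


metadata = load_all_metadata(["ds_0", "ds_1"])
```

Because each loader call mostly waits on the database, a thread pool lets the waits overlap, so total startup time approaches that of the slowest data source rather than the sum over all of them.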
Ease of use: new built-in performance test system
While existing functions were being improved and new ones developed, ShardingSphere long lacked a complete and comprehensive integration and performance test system. The new system ensures that every commit compiles without affecting other modules and makes upward and downward performance trends visible. It also runs integration tests for data sharding, data encryption, read/write splitting, distributed governance, permission control, SQL support, and other functions. This provides the foundation for monitoring and tuning performance across different databases, sharding or encryption strategies, and versions.
Along with this beta release, performance test reports and trend charts will be made available to the community so that users can follow ShardingSphere’s performance changes. The source code of the entire test system will also be made available so that users can easily deploy their own tests. Thanks to SphereEx (sphere-ex.com) for contributing the entire performance test system to the community.
In addition to the functions listed above, this release includes other enhancements, performance optimizations, and bug fixes. We will continue to cover the official release of Apache ShardingSphere 5.0.0-beta and publish in-depth technical articles on individual features. Stay tuned for updates in this series!
🔗ShardingSphere GitHub
Github.com/apache/shar…
- If you run into problems or have new ideas and suggestions while using ShardingSphere, you are welcome to take part in building the ShardingSphere community through the Apache mailing list.