This article has been included github.com/lkxiaolou/l… Welcome to star.
background
Cobar
Cobar is an open source database middleware of Alibaba. The introduction of Cobar is not described here. Please refer to the previous article “Design and Implementation of Cobar SQL Audit”.
SQL
SQL is a domain language (programming language) commonly used in relational databases to facilitate the management of structured data. Abstract Syntax Tree (AST) is generated by lexical analysis, Syntax analysis and semantic analysis when SQL is executed by the database, and then the execution plan is generated by the optimizer and executed by the execution engine.
SQL Parser
The Parser that parses SQL into an AST is called SQL Parser. It is usually developed in two ways:
- Automatically generated by tools
- Advantages: Simple and easy to implement
- Disadvantages: poor performance, difficult secondary development
- Writing by hand
- Advantages: Good performance, code is clear and easy to expand
- Disadvantages: high requirements for developers, need to understand the principles of compilation
SQL Parser is also implemented in Cobar, and its location in Cobar can be seen in its architecture diagramSQL Parser is followed by SQL Router. It can be inferred that THE AST is parsed by SQL Parser for the routing function of library and table.
Cobar’s SQL Parser also went through three iterations, essentially for performance reasons:
- The first version: Generated SQL Parser based on JavaCC, poor performance, inconvenient optimization
- The second version: Handwritten based on the Parser structure generated by ANTLR, with too many intermediate objects
- 3rd edition: Handwriting based on LL(2) recognizer
This article will not give too much introduction to SQL Parser. If you are interested in it, you can refer to the article “Self-developed SQL Parser Design and Practice 30 times Faster than Open Source”. I have read this article carefully for several times and attached a summary of the brain map:
Github.com/lkxiaolou/r…
Cobar AST
SQL Parser from Cobar parses SQL to AST. To get a sense of it, here’s an example:
select id,type from goods as g where type in (select type from type_config where status = 0)
After Cobar SQL Parser, the following AST objects are generated:
The root node of the AST is the SELECT statement, and each attribute is a leaf node, and the attribute of the leaf node branches off from the leaf node. It may be a little convoluted, but you need to feel it at the code level.
The Node of the AST is defined as follows, and there is only an Accept method, which is used to traverse the tree.
public interface ASTNode {
void accept(SQLASTVisitor visitor);
}
Copy the code
The ASTNode implementation consists of the following:
- SQLStatement: SQL statements, such as SELECT, UPDATE, and INSERT statements, are displayed in the DMLSelectStatement in the figure above
- Expression: Expression, such as and, or, comparative statements, such as reflected in InExpression, ComparisionEqualsExpression, LiteralNumber, Identifier
- TableReference: table statements, which are TableReferences and TableRefFactor
In the realization of the ComparisionEqualsExpression, for example
Where 1 is the left and right expression for comparison, 2 is the judge, where is “=”, and 3 is to evaluate the expression.
How is evaluationInternal implemented? In fact, this problem is made easier by structuring and enumerating expressions, such as taking the left and right values and comparing them for equality.
AST operation
With that in mind, the basic operation on the AST is traversal. Using ASTNode accept, we need to implement the SQLASTVisitor interface. This SQLASTVisitor is defined as follows:
In fact, Java polymorphism is used to define a visit method for each ASTNode, traversing different objects corresponding to different methods.
For example, MySQLOutputASTVisitor can iterate over the AST and restore the AST to SQL output like this:
SQLStatement stmt = SQLParserDelegate.parse(sql);
StringBuilder s = new StringBuilder();
stmt.accept(new MySQLOutputASTVisitor(s));
System.out.println(s.toString());
Copy the code
This execution will output
SELECT id, type FROM goods AS G WHERE type IN (SELECT type FROM type_config WHERE status = 0)
Sqlparserdelegate.parse (SQL) is parsed as a DMLSelectStatement object, which implements the visit method as follows:
@Override
public void accept(SQLASTVisitor visitor) {
visitor.visit(this);
}
Copy the code
MySQLOutputASTVisitor visit(DMLSelectStatement node) The code is a bit long, so I won’t post it here, but the general idea is that if you see a leaf node, you format it directly into StringBuilder. If you don’t, you continue to iterate by calling accept on the corresponding node.
We can refer to MySQLOutputASTVisitor to write an traverser that meets our needs.
The application of the AST
Depots table
In Cobar, AST can be used to obtain table names, column names and comparison values to divide libraries into tables, which is also the most important function of Cobar.
SQL Feature Generation
In addition, the AST I know can also generate SQL characteristics for raw SQL, such as raw SQL:
select id, name, age from user as u where age >= 20
Or is it
select id, name, age from user as u where age >= 30
Can be normalized
select id, name, age from user as u where age >= ?
This is useful for SQL slow queries or other statistics, and for limiting traffic against SQL.
Dangerous SQL interception
When an UPDATE or delete is written on a line without a WHERE condition, the AST can be used to evaluate expressions and intercept SQL with no WHERE condition and where condition constant true.
The last
This article introduces SQL AST from the source, structure, traversal principle, application and other aspects. I believe that after reading this article, YOU will have a preliminary understanding of SQL AST. If you want to further understand, you can refer to the unit test in the Cobar project for actual demonstration experience.
Search attention wechat public number “bug catching master”, back-end technology sharing, architecture design, performance optimization, source code reading, problem solving, practice.