preface

In JDBC, two types of statements are used: prepareStatements that support parameterization and precompilation, which support native Sql and placeholders to parameterize input parameters to prevent Sql injection; statements that support native Sql, which risk Sql injection.

While developing with Mybatis, the details of the underlying statement are hidden. We use # and $to tell Mybatis what we are actually doing and whether we need to parameterize the statement or just keep it in its native state.

Today we will mainly look at the performance of the system to deal with Sql injection when using two symbols and Mybatis internally how to deal with their source code analysis.

The difference between # and $in dealing with Sql injection

Using an existing application, the ability to inject (malicious) SQL commands into the back-end database engine for execution can be used to retrieve a database on a vulnerable website by typing (malicious) SQL statements into a Web form, rather than executing the SQL statements intended by the designer.

For example, if you want to query student information by student name, you will pass in a parameter called name

SELECT id,name,age FROM student WHERE name = 'all';Copy the code

Without Sql injection prevention, our Sql statement might be written like this

<select id="fetchStudentByName" parameterType="String" resultType="entity.StudentEntity"> SELECT id,name,age FROM student WHERE name = '${value}' </select>Copy the code

Under normal circumstances find out the name of the square student information.

But if we make some changes to the name parameter passed in, such as anything’ OR ‘x’=’x, then the concatenated Sql becomes

SELECT id,name,age FROM student WHERE name = 'anything' OR 'x'='x'Copy the code

All the student information in the library is pulled out, isn’t it scary. If anything’ OR ‘x’=’x’ OR ‘anything’ =’x’ OR ‘anything’ =’x’ OR ‘anything’ =’x’ OR ‘anything’ =’x’ OR ‘anything’ =’x’ OR ‘anything’ =’x’ OR ‘anything’ =’x’ OR ‘anything’ =’x’ OR ‘anything’ =’x’ OR ‘anything’ =’x’

To prevent Sql injection, use the entire single quotation mark in anything’ OR ‘x’=’x ‘as part of the argument, rather than concatenating the single quotation mark in Sql

Use # to escape arguments in Mybatis

<select id="fetchStudentByName" parameterType="String" resultType="entity.StudentEntity"> SELECT id,name,age FROM student WHERE name = #{name} </select>Copy the code

Let’s take a look at what the Sql statement sent to the database looks like.

SELECT id,name,age FROM student WHERE name = 'anything\' OR \'x\'= \'x'Copy the code

As you can see from the above code, all single quotes in parameters are transferred, thanks to PrepareStatement in JDBC. If precompilation is enabled on the database server, the server does this.

Check out my previous post about JDBC and Mysql, which explains why a PrepareStatement can do this.

The source code

SqlSession provides an Api for the user to perform operations on the database. Executor also delegates to StatementHandler (StatementHandler).

The Handler’s implementation class is to represent the JDBC operations in the statements, CallableStatementHandler, PrepareStatementHandler and SimpleStatementHandler will represent CallableS of JDBC Tatement, PrepareStatement, and Statement. These handlers will internally call the relevant Statement in JDBC.

The analogy between the Mybatis execution flow and the original JDBC method we use is.

Mybatis: Sqlsession -> Executor -> StatementHandler -> ResultHandler

JDBC: Connection -> Statement -> Result

So we know that all operations on JDBC statements are inside StatementHandler.

In PrepareStatementHandler, the Statement is parameterized using Paramterize, where it delegates to DefualtParameterHandler for operation. Let’s see if we can Debug this code with two different statements.

The first is to use the $symbol, which is directly concatenated inSql. As you can see from the figure below, the Sql statement is already concatenated when parameterizing. See originSql.

ParameterMappings for boundSql does not exist, so you do not need to set the corresponding placeholder in the second red box.

Then, let’s see what happens when the same code is used with #. As you can see from the picture below, when # is used, the original #{value} is replaced with? Number, also known as a placeholder in JDBC.

ParameterMappings = value -> anything’ OR ‘x’=’x’;} / / TypeMappings = value -> anything’ OR ‘x’=’x’;

When used, {value} is replaced by the corresponding value. There is no parameter mapping, and no placeholder operation is performed. When used #, #{} is replaced by? Number, which has parameter mapping, and the placeholder operation is set in DefaultParameterHandler.

The problem

1 Why is the default statement PrepareStatementHandler used

When 2 and # are replaced, and why the corresponding BoundSql, $, is not mapped, # is mapped.

With these two questions, let’s take a look at the initialization stage of Mybatis. To save space, only the general path and key codes are listed.


Mybatis is built by SqlSessionFactory build, which will parse the mapping file

The SqlSessionFactoryBuilder is – > XmlConfigBuilder – > XMLMapperBuilder – > XMLStatementBuilder.

The parseStatementNode in the XMLStatementBuilder is responsible for generating the MappedStatement, which answers the first question first. When you do not specify statementType, Mybatis defaults to PrepareStatementHandler, where statementType, Use the RoutingStatementHandler in the subsequent process to select which StatementHandler to use.

Then we move on to the second problem, how the $and # are replaced.

As we mentioned earlier, BoundSql contains the Sql body, and the parameter mapping in BoundSql determines whether or not to parameterize later. $and # behave differently.

BoudSql comes from MappedStatement, where the task of getting BoundSql is delegated to the SqlSource interface. So let’s focus on how the SqlSource is generated.

XMLLandDriver can be understood as a statement used to parse Mybatis custom XML symbols. He passes the responsibility of parsing symbols to the parseScriptNode method of XMLScriptBuilder.

ParseDynamicTags wrap the statement in TextSql and then use the isDynamic method, which uses GerenericTokenParser to determine if it is a dynamic statement. If it includes $, is dynamic, if it is # is not dynamic, use the Handler is DrynamicCheckerTokenParser.

After entering the Parse method, focus on the following paragraph.

Here will use TokenHandler different implementation class, to further processing of expression, here is the Sql after the perfect, in the judgment isDynamic, use DrynamicCheckerTokenParser, one of the most simple implementation.

When parse is complete, it is a dynamic statement if isDynamic is true, using DynamicSqlSource.

If it is non-dynamic, it usually means a statement that uses #, using RawSqlSource, where further parsing is done.

Can see from the chart, the TokenParser this time use # {}, and using the ParameterMappingTokenHandler.

ParameterMappingTokenHandler handlerToken method, completed the add parameter mapping and replace # {value} for? The responsibility of the.

From the above we can see that using # will be replaced by? In the initialization phase, there is nothing special about using $, just making a judgment about whether or not a dynamic statement is made.


After initialization, let’s go to the getBoundSql method and see what DynamicSqlSource and StaticSource are doing at this point, starting with DynamicSqlSource.

Bindings (‘ _parameter ‘->’ anything’ OR ‘x’=’x ‘); bindings (‘ _databaseId ‘->’ null ‘); bindings (‘ _parameter ‘->’ anything’ OR ‘x’=’x ‘); bindings (‘ _databaseId ‘->’ null ‘)

Then I use the apply method, and I understand that I’m going to make a substitution here. ${} is the same as before, but BindingTokenParser is used.

Look at the HandleToken method of BindingTokenParser.

The effect of the above code is to use Ognl, use value Bindings, look for the corresponding value, and then return it to the Sql, which is why there is a risk of Sql injection. Value is used because Ognl uses the default value when it looks for bindings, so it needs to add an extra key value in bindings.

Next, generate the SqlSource using the PARSE method of SqlSourceBuilder.

In the above described, in this parse method, # {} is used to judge, so couldn’t get to ParameterMappingTokenHandler handlerToken method, also can’t add parameter mapping, this returns a StaticSqlSource directly, This also explains why the parameter map is empty when $is used.

The next step is to get BoundSql, using the StaticSqlSource, which instantiates one of the parameters directly, with the parameter mapping null.

When you use #, you use the StaticSqlSource, which is instantiated directly. Since the parameter mapping was generated during the initialization phase, it’s a simple process.

The subsequent process is consistent with the normal process of Mybatis.

conclusion

This paper mainly analyzes the difference between $and # in Mybatis, and the difference between the source process when using these two symbols. It is recommended that everyone use # to avoid the risk of Sql injection at the ORM level.