If you need high performance from the MySQL server, the best way to do this is to spend time with the MySQL tuning and query execution mechanisms. Once this is understood, much of the query optimization is grounded, making the entire query optimization process more logical. The following figure shows how MySQL executes a query:
- The client sends the SQL statement to the server.
- The server checks the query cache. If there is already data in the cache, the cache result is returned directly. Otherwise, the SQL statement is passed to the next segment.
- After the server parses, preprocesses, and optimizes the SQL statement, it is passed to the query optimizer to form the query plan.
- The query execution engine executes the query plan by invoking the storage engine interface.
- The server returns the query result to the client.
Each of these steps has its own complexity, and the next few articles will cover each step in detail. The query optimization process is particularly complex, and it is important to understand it.
MySQL client/server protocol
While you don’t need to know the insides of the MySQL client/server protocol, you do need to understand how it works at a high application level. This protocol is half-duplex, which means that the MySQL server does not send and receive messages at the same time and cannot split messages into multiple short messages. On the one hand, this mechanism makes MySQL communication simple and fast, on the other hand, it also adds some restrictions. For example, this means that there is no flow control, and once one party sends a message, the other party must receive the entire message before responding. It is like playing ping-pong back and forth. Only one side has the ball at the same time, and only when it is received can it be returned.
The client sends queries to the server in a single packet, so it is important to configure max_allowed_packet when large queries are present. Once the client has sent the query, it can only wait for the result to return.
Instead, the server response is typically composed of multiple packets. Once the server responds, the client must retrieve the entire result set. The client cannot simply fetch a few rows and tell the server not to send the rest of the data. If the client only needs to return the first few rows of data, it can either wait for the server to return all the data and then discard the unwanted data from it, or simply disconnect. Either way, it’s not a good choice, so a proper LIMIT clause is important.
Most MySQL connection libraries support fetching entire result sets and caching them in memory, or fetching rows of data as needed. The default behavior is usually to fetch the entire result set and cache it in memory. This is important to know because the MySQL server will not release the locks and resources for this query until all requested rows are returned. Most client-side libraries give you the impression that data is being fetched from the server, when in fact it is just being fetched from the cache. This is fine most of the time, but not for queries with large data volumes that take a long time or take up a lot of memory. If you specify not to cache the query results, the memory footprint is smaller and the results can be processed faster. The disadvantage is that this approach can lead to server-side locking and resource usage during queries.
Using PHP as an example, here is common PHP query code:
$link = mysql_connect('localhost'.'user'.'password');
$result = mysql_query('SELECT * FROM huge_table'.$link);
while ($row = mysql_fetch_array($result)) {
// Process the data result
}
? >
Copy the code
This code looks like it just fetches the required rows. However, this query actually puts all the results into memory with a call to mysql_query. The while loop actually iterates over the data in memory. In contrast, if you use mysql_unbuffered_query instead of mysql_query, the results will not be cached.
$link = mysql_connect('localhost'.'user'.'password');
$result = mysql_unbuffered_query('SELECT * FROM huge_table'.$link);
while ($row = mysql_fetch_array($result)) {
// Process the data result
}
? >
Copy the code
Different programming languages handle cache overlays differently. For example, Perl’s DBD::mysql driver needs to specify the C voice client library (default mysql_buffer_result) via the mysql_use_result attribute, as shown in the following example:
#! /usr/bin/perl
use DBI;
my $dbn = DBI->connect('DBI:mysql:; host=localhost'.'user'.'password');
my $sth = $dbn->prepare('SELECT * FROM huge_table', {mysql_use_result => 1});
$sth->execute();
while (my $row = $sth->fetchrow_array()) {
# Process data results
}
Copy the code
Notice that prepare specifies to use the results instead of caching them. It can also be specified at connect time, which makes each query not cached.
my $dbn = DBI->connect('DBI:mysql:; mysql_use_result=1; host=localhost'.'user'.'password');
Copy the code