First, performance tuning is an ongoing process, and installing MySQL is usually the first step in adjusting the operating system and database configuration. A database is a dynamic system, and it’s a never-ending story. Your MySQL database may initially be CPU-bound because you have enough memory and very little data. Over time, it may change and disk access may become more frequent. As you can imagine, I/O is the primary concern for a server whose configuration looks different from that of a server where all data fits into memory. In addition, your query mix can change over time, so access patterns or the use of features available in MySQL, such as adaptive hash indexes, can change accordingly.
OS Optimization
Operating system Settings related to memory and file system cache handling need to be checked. In general, we want to keep both vm.dirty_ratio and vM. dirty_background_ratio low.
Vm.dirty_background_ratio is the percentage of system memory that can be used to cache modified (” dirty “) pages before the background refresh process begins. More means more work to clean up the cache.
On the other hand, vm.dirty_ratio is a hard limit on the amount of memory available to cache dirty pages. This can be done if the background process is unable to quickly refresh the data to keep up with new changes due to high write activity. Once vm.dirty_ratio is reached, all I/O activity is locked until dirty pages are written to disk. The default setting here is usually 40% (it may vary in your distribution), which is quite high for a host with a lot of memory. Let’s say for a 128GB instance, even if you’re using a fast SSD, it’s equivalent to about 51GB, which can lock up your I/O for quite some time.
In general, we would like to see both variables set to low numbers, 5-10%.
Another important system variable is vm.swappiness. When using MySQL, we don’t want to use swaps. Swapping InnoDB buffer pools to disk will remove the memory buffer pool unless it is urgently needed. On the other hand, if the alternative is to start OOM and kill MySQL, we’d rather not do that. This behavior can be achieved by setting vm.swappiness to 0. Therefore, it is recommended to set vm.swappiness to 1 to allow some exchange to occur if it is the only option to keep MySQL. Of course, this slows the system down, but OOM on MySQL is very demanding.
Personal advice is to set it around 5-10
Another memory-related setting ensures that you turn NUMA off entirely. You can do this by modifying the startup script to start MySQL: numactl –interleave=all $command This setting balances memory allocation between NUMA nodes and minimizes the chance that one of them will run out of memory.
Memory allocators can also have a significant impact on MySQL performance. This is a larger topic, and we’re just scratching the surface here. You can choose different memory allocators to use MySQL. Their performance varies between versions and workloads, so it is only after you perform detailed testing that you can determine which version works best in your environment. The most common choices are the default glibc malloc, TCMALloc, and Jemalloc. You can add a new allocator by installing a new package (for Jemalloc and TCmalloc), Then use LD_PRELOAD (that is, LD_PRELOAD = “/usr/lib/libtcmalloc_minimal.so.4.1.2”) or the malloc-lib variable part my.cnf in [mysqLD_safe].
Next, you’ll want to look at the disk scheduler. CFQ (which is usually the default) is tuned for desktop workloads. This does not work well for database workloads. In most cases, you’ll see better results if you change it to a NOOP or deadline. There is little difference between the two schedulers, and we found that SAN-based storage Noop is superior (SAN is generally better at handling the workload because it knows more about the underlying hardware and what is actually stored in the cache compared to the operating system).
Now that we’re talking about disks, generally the best choice for file systems is EXT4 or XFS – this has changed a few times in the past, and if you want to take full advantage of the I/O subsystem, then you may have to do some testing of your setup. Noatime and nodiratime for MySQL volumes should be disabled regardless of which file system is used – the less metadata is written, the lower the overall overhead.
MySQL > alter database
Tweaking MySQL configuration is a topic for an entire book that cannot be covered in a single blog post. We will try to mention some of the more important variables here.
InnoDB buffer pool
What is a buffer pool and why is it important? The buffer pool is the memory InnoDB uses to cache data. It is used for cache reads and writes every page that has been modified must first be loaded into the buffer pool. Then it becomes a dirty page a page that has been modified and hasn’t been flushed into the tablespace yet. As you can imagine, such buffers are very important for proper database execution. The worse the memory/disk ratio, the more your workload will be tied to I/O, which tends to be slow.
You’ve probably heard the rule of thumb for setting the InnoDB buffer pool to 80% of your system’s total memory. It worked when 8GB was huge, but not now. When calculating the InnoDB buffer pool size, you need to consider the memory requirements of the rest of MySQL (assuming MySQL is the only application running on the server). For example, we are talking here about all those maximum sizes per join or even per query buffer such as join buffer or temporary table in memory. You also need to consider the maximum number of connections allowed – more connections means more memory usage.
For a MySQL database server with 24 to 32 cores and 128GB of memory, handling up to 20-30 concurrent connections and up to hundreds of concurrent clients, we can say that 10-15GB of memory should be sufficient. If you want to stay safe, 20GB should be enough. In general, establishing the ideal buffer pool size is a trial and error process unless you know the behavior of the database. InnoDB buffer pool is not a dynamic variable at the time of writing, so it needs to be restarted. So it’s safe to err on the side of being too small. With the introduction of dynamically allocated buffer pools by Oracle, which will change with MySQL 5.7, this will make tuning easier.
MySQL uses more buffers than InnoDB buffer pools – they are controlled by variables: join_buffer_SIZE, sort_buffer_size value, read_buffer_size, read_rnd_buffer_size. These buffers are allocated on a session basis (except for JOIN buffers, which are allocated one per JOIN). We’ve seen MySQL set these buffers to hundreds of megabytes – by increasing join_buffer_size, you expect your JOIN to perform faster, which is somewhat natural.
By default, these variables have fairly small values, which actually makes sense – we’ve seen that Settings as low as 256K can be much faster than larger values like 4M. It’s hard to pinpoint the exact reasons for this behavior, and it’s likely that there are many. One sure thing is that Linux has changed the way memory is allocated. Malloc () is used for up to 256KB. For larger memory blocks – mmap (). It is important to remember that when it comes to these variables, any changes must be backed up against the baseline to confirm that the new Settings are indeed correct. Otherwise, you may reduce your performance, not increase it.
InnoDB persistence
Another variable that has a significant impact on MySQL performance is innodb_flush_LOG_at_trx_COMMIT. This is an extension of InnoDB persistence. Default (1) ensures that data is secure even if the database server is killed – data is not lost under any circumstances. The other Settings (2 and 0) indicate that up to 1 transaction may be lost if the entire database server crashes (2), and up to 1 transaction may be lost if mysqld is terminated.
Complete persistence is obviously a good thing, but it comes at a very high price – THE I/O load is much higher because the refresh operation has to happen after every commit. Therefore, in some cases, reduced durability and the risk of data loss under certain conditions are very welcome. This is true for main-multiple slave Settings, and it is often the case that having one slave in the rebuild process is usually perfectly correct after a crash, as the rest can easily handle the workload. The same is true for Galera clusters – the entire cluster works as a single instance, even if a node crashes and loses data
I/O-related Settings
Other variables that can have a significant impact on some workloads are innodb_IO_capacity, innodb_IO_CAPACity_max, and innodb_lru_scan_depth. These variables define the number of disk operations that InnoDB’s background threads can perform, such as refreshing dirty pages from the InnoDB buffer pool. The default is conservative, which is good in most cases. If your workload is very heavy, you may need to adjust these Settings to see if you are preventing InnoDB from fully using your I/O subsystem. This is especially true if you have fast storage: SSD or PCIe SSD cards.
Speaking of disks, Innodb_flush_method is another setting you might want to look at. By switching this setting from the default fdatasync to O_DIRECT, we see a visible performance improvement. This gain is clearly visible in the Settings of hardware RAID controllers that use Bbus backup. On the other hand, when it comes to EBS volumes, we’ve seen better results using ‘O_DSYNC. Marking up here is important to understand which Settings will work better in a particular situation.
InnoDB redo log
The size of InnoDB’s redo log is also something you might want to look at. It is managed by innodb_log_FILe_size and innodb_log_FILES_IN_group. By default, we have two logs in a group, each about 50MB in size. These logs are used to store write transactions and write sequentially. The main problem here is that MySQL cannot run out of space in the log, and if the log is nearly full, it must stop the entire activity and focus on flushing data to the table space. Of course, this is very bad for the application because there are no writes during this time. This is one of the reasons why the InnoDB I/O setup we discussed above is so important. We can also increase the redo log size by changing innodb_log_file_size. The rule of thumb is to set them large enough to cover at least an hour of writes.
The query cache
The MySQL query cache is also frequently “tuned”. This cache stores the hashes of SELECT statements and their results. There are two problems – first, the cache can be flushed frequently. If any DML is performed on a given table, all results associated with that table are removed from the query cache. This seriously affects the utility of MySQL query caching. The second problem is that the query cache is protected by mutex and access is serialized. This is a significant disadvantage and limitation for any workload with high concurrency. It is therefore highly recommended to “tune” the MySQL cache by completely disabling the MySQL query cache. You can do this by setting ‘query_cache_type’. Yes, in some cases it may be useful, but in most cases it is not. Instead of relying on MySQL query caching, you can use any other external system, such as Memcached or Redis, to cache data.
Internal contention processing
Another set of Settings you might want to look at are variables that control how many instances/partitions MySQL should create for a given structure. We discuss variables here: Innodb_buffer_pool_instances table_open_cache_instances, Metadata_locks_hash_instances and innodb_adaptive_hash_index_partitions. These options were introduced when it became clear, for example, that a single buffer pool or a single adaptive hash index could be a contention point for a workload with high concurrency. Once you find that one of these structures becomes a pain point (we discussed how to capture these situations in an earlier blog post) you’ll want to tweak the variables. Unfortunately, there is no rule of thumb. It is recommended that the size of a single buffer pool instance be at least 2GB, so you may want to stick to this limit for smaller buffer pools. In the case of other variables, if we are talking about contention, then you might increase the number of instances/partitions of these data structures, but there are no rules on how to do that – you need to watch the workload and decide that contention is no longer an issue at this point.
Other Settings
There are a few other Settings that may need to be considered, some that can be applied in the most efficient way when setting up. Some can change dynamically. These Settings do not have a significant (and sometimes negative) impact on performance, but they are still important to keep in mind.
Max_connections on the one hand you want to keep it high enough to handle any incoming connections. On the other hand, since most servers can’t handle hundreds or more connections at once, you don’t want to get too high. One way to solve this problem is to implement connection pooling on the application side, or to use a load balancer like HAProxy to limit the load.
Log_bin If you are using MySQL replication, you need to enable binary logging. Even if you don’t use them, it’s very convenient to keep them because they can be used for point-in-time recovery. The row format is strongly recommended
Skip_name_resolve This variable determines whether a DNS lookup is performed on the host that is the source of the incoming connection. If enabled, the FQDN can be used as a host in MySQL authorization. If not, only users defined with IP addresses can work. The problem with enabling DNS lookup is that it may introduce additional latency. The DNS server can also stop responding (due to a crash or network problem), in which case MySQL will not be able to accept any new connections.
The innodb_file_per_TABLE variable determines whether an InnoDB table is created in a separate tablespace (set to 1) or in a shared tablespace (set to 0). Managing MySQL is much easier when each InnoDB table has a separate table space. For example, for individual table Spaces, disk space can be easily reclaimed by dropping tables or partitions. Shared table Spaces don’t work – the only way to reclaim disk space is to dump the data, clean up the MySQL data directory, and then reload the data. Obviously, this is inconvenient.
That’s the way it is now. As we mentioned at the beginning, tweaking these Settings probably won’t make the MySQL database faster – by tweaking queries, you’re more likely to speed it up. But they should still have a noticeable impact on overall performance. Good luck adjusting your work!