Tomcat's network threading model
Tomcat is the servlet container most of us use in web development and the default embedded container in Spring Boot.
So we need to understand its network threading model.
BIO + synchronous Servlet
Before Tomcat 8, the default was BIO + synchronous servlets.
The execution steps are:
- User request
- Nginx is responsible for receiving and forwarding to Tomcat
- Tomcat's BIO connector hands the request to a specific Servlet
- Servlets handle business logic
This is the typical BIO model; if you are familiar with NIO and Netty, you will recognize the threading model straight away.
One worker thread per request means very low CPU utilization, which is why newer versions no longer use this approach; a minimal sketch of the idea follows.
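The sketch below shows the one-thread-per-connection style a BIO connector is built on. It is plain java.net code for illustration only, not Tomcat's actual connector, and the port and response are made up.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

public class BioEchoServer {
    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(8080)) {
            while (true) {
                // accept() blocks until a client connects
                Socket socket = server.accept();
                // one worker thread per connection: the thread is tied up
                // for the whole request, even while it waits on I/O
                new Thread(() -> handle(socket)).start();
            }
        }
    }

    private static void handle(Socket socket) {
        try {
            socket.getOutputStream().write(
                    "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok".getBytes());
            socket.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
```

Every connection pins a thread for its whole lifetime, including the time spent blocked on I/O, which is exactly why CPU utilization stays low under load.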
NIO processing has been the default since Tomcat 8.
APR + asynchronous servlets
This mode is not used very often, but it is worth knowing about.
APR (Apache Portable Runtime) is a support library for the Apache HTTP Server.
Tomcat calls the Apache HTTP Server's core native library through JNI to handle file reading and network transfer operations.
Tomcat checks the library path by default and enables APR automatically if the library is installed.
Because it works at the lower JNI level it can deliver higher performance, but it is more cumbersome to use in practice.
You also have to maintain a native dynamic link library, so most deployments stick with NIO.
NIO + asynchronous servlets
Starting with Tomcat 8, NIO is the default mode: request data is read without blocking, the next request can be handled without waiting, and the whole flow is asynchronous.
NIO process
- The Acceptor accepts the socket
- The Acceptor retrieves a NioChannel object from the cache
- The Poller thread registers the NioChannel's IO events with its Selector
- The Poller assigns the NioChannel to a worker thread to handle the request
- The SocketProcessor completes the processing of the request
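As a rough illustration of that division of labour, here is a minimal java.nio sketch that plays the Acceptor, Poller and worker roles with one selector loop plus a thread pool. It is a simplification for illustration, not Tomcat's NioEndpoint, and the port, buffer size and pool size are arbitrary.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MiniNioServer {
    public static void main(String[] args) throws IOException {
        Selector poller = Selector.open();                          // the Poller's selector
        ExecutorService workers = Executors.newFixedThreadPool(8);  // worker thread pool

        ServerSocketChannel server = ServerSocketChannel.open();    // Acceptor role
        server.bind(new InetSocketAddress(8080));
        server.configureBlocking(false);
        server.register(poller, SelectionKey.OP_ACCEPT);

        while (true) {
            poller.select();                                         // wait for IO events
            Iterator<SelectionKey> keys = poller.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                if (key.isAcceptable()) {
                    // accept the socket and register it for read events
                    SocketChannel channel = server.accept();
                    channel.configureBlocking(false);
                    channel.register(poller, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    // hand the ready channel to a worker (the "SocketProcessor")
                    key.interestOps(0);                              // stop polling while a worker owns it
                    SocketChannel channel = (SocketChannel) key.channel();
                    workers.submit(() -> handle(channel));
                }
            }
        }
    }

    private static void handle(SocketChannel channel) {
        try {
            ByteBuffer buffer = ByteBuffer.allocate(4096);
            channel.read(buffer);                                    // read the request (simplified)
            channel.write(ByteBuffer.wrap(
                    "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok".getBytes()));
            channel.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
```

The key difference from the BIO sketch is that a single selector watches many connections, and a worker thread is borrowed only while a socket actually has data ready.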
Tuning Tomcat parameters (theoretical)
Tomcat parameter tuning centers on four main parameters, which are also the main directions for tuning.
connectionTimeout
This is a timeout for request processing and can be understood as Tomcat's self-protection mechanism.
If a request has not been processed within this time, Tomcat considers it to have timed out.
It is generally adjusted according to the project's business metrics.
Increase it if you need to allow requests more time to complete.
If the business dictates a maximum processing time for user requests, set it to match.
maxThreads
This determines Tomcat's processing power and can roughly be understood as its throughput.
Increase this parameter if you want to process more requests per second.
But bigger is not always better.
acceptCount
This is the length of the connection backlog at the operating system level: requests that arrive after Tomcat has reached its own connection limit (maxConnections) pile up in the operating system's queue (Windows and Linux differ slightly).
The operating system keeps this queue for connections the application has not yet had time to accept.
The operating system also has its own backlog setting, and the effective value is the smaller of the two.
For example, if Tomcat is set to 100 and the operating system to 90, the operating system uses 90 as the backlog size.
maxConnections
This is the maximum number of connections for Tomcat: the parameter determines how many connections Tomcat will take on at once.
Accepting that many connections does not mean it can process them all immediately.
How many can be processed, and how fast, depends on the business code.
The maximum number of connections a Tomcat instance can take on = acceptCount + maxConnections.
maxThreads, the maximum number of worker threads, is what you adjust to change Tomcat's actual processing power.
Tuning Tomcat parameters cannot be guesswork; it takes repeated testing and adjustment to find the configuration that fits each application.
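For reference, the same four parameters can be set programmatically in a Spring Boot application instead of on the command line. The sketch below uses a Tomcat connector customizer; the values are illustrative placeholders rather than recommendations and should come from your own load tests.

```java
import org.springframework.boot.web.embedded.tomcat.TomcatServletWebServerFactory;
import org.springframework.boot.web.server.WebServerFactoryCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class TomcatTuningConfig {

    @Bean
    public WebServerFactoryCustomizer<TomcatServletWebServerFactory> tomcatCustomizer() {
        return factory -> factory.addConnectorCustomizers(connector -> {
            // illustrative values only; tune them against your own load tests
            connector.setProperty("maxThreads", "200");
            connector.setProperty("maxConnections", "8192");
            connector.setProperty("acceptCount", "100");
            connector.setProperty("connectionTimeout", "20000"); // milliseconds
        });
    }
}
```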
Tuning: a practical demonstration
Preparing the environment
Next we will run the tests with JMeter, the tool we normally reach for when testing interface performance.
Adjusting the number of connections
Let’s look at the complete process of processing a request
1. A request arrives from a user.
2. On Windows it first enters an Accept queue, which holds connections from the TCP three-way handshake. On Linux it first enters the SYN queue during the handshake and then moves into the Accept queue.
3. Once a connection is in the Accept queue, Tomcat's Selector listens for event notifications from the operating system and accepts client connections up to the maxConnections limit. When Tomcat has reached its maximum number of connections, further connections accumulate at the operating system level: Windows rejects the connection once its queue is full, while Linux also has the SYN queue in addition to the Accept queue, so requests stack up there (the SYN queue belongs to the kernel and is not normally tuned) and are rejected only when the SYN queue is full.
4. After accepting the connection, Tomcat reads the request data and parses the HTTP packet.
5. After parsing, the worker thread pool invokes the specific Servlet to execute the business code. When all maxThreads are busy, requests pile up in front of the worker thread pool.
6. The response is returned to the client once the business code has finished.
So when do you need to adjust connections, and how?
You need to increase connections when it is smaller than maxThreads; a good starting point is about 20% above the maximum expected concurrency.
With connections somewhat larger than maxThreads, the excess requests queue up in front of Tomcat's worker thread pool instead of being rejected.
When should acceptCount be adjusted? How to adjust?
If you want to take on more user requests without piling them up inside Tomcat, you can let them queue efficiently in the operating system's backlog; in that case set it to roughly the maximum expected concurrency minus connections.
In practice this parameter rarely needs adjusting: Tomcat defaults to 100 and Linux defaults to 128, and it is usually better to leave connection control to the application, where it is easier to manage.
We use the Spring Boot framework to quickly build a web application on the embedded Tomcat container and prepare three test endpoints. The controller code is as follows:
package com.controller;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import java.util.Random;
import java.util.concurrent.Callable;

@RequestMapping("/demo/")
@RestController
public class DemoController {

    @GetMapping("hello")
    public String hello() throws InterruptedException {
        // fixed 3-second delay, used to test thread/connection count limits
        Thread.sleep(3000);
        return "hello!!";
    }

    @GetMapping("test")
    public String benchmark() throws InterruptedException {
        System.out.println("Access test:" + Thread.currentThread().getName());
        // this loop keeps the CPU busy for the whole request
        for (int i = 0; i < 200000; i++) {
            new Random().nextInt();
        }
        // 50 ms simulated database wait; the thread is idle during this time
        Thread.sleep(50L);
        return "Success";
    }

    @GetMapping("testAsync")
    public Callable<String> benchmarkAsync() {
        return new Callable<String>() {
            @Override
            public String call() throws Exception {
                System.out.println("Access testAsync:" + Thread.currentThread().getName());
                // this loop keeps the CPU busy for the whole request
                for (int i = 0; i < 200000; i++) {
                    new Random().nextInt();
                }
                // 50 ms simulated database wait; the thread is idle during this time
                Thread.sleep(50L);
                return "Success";
            }
        };
    }
}
The code is then packaged as a JAR and run with the following settings:
- connections is set to 1
- maxThreads is set to 1
- acceptCount is set to 1
The command is as follows:
java -jar tomcatDemo.jar --server.tomcat.max-connections=1 --server.tomcat.max-threads=1 --server.tomcat.accept-count=1
In theory, this program can accept a maximum of two requests (connections+acceptCount). Let’s see if this is the case
Open the JMeter test tool and configure the test case
Configure it to send 10 requests within one second.
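If JMeter is not to hand, a rough stand-in for this test case can be sketched with Java's built-in HttpClient. The URL, request count and pool size below simply mirror the scenario described above and are illustrative only.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SimpleLoadTest {
    public static void main(String[] args) {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("http://localhost:8080/demo/hello")).build();

        // fire 10 requests at roughly the same time
        ExecutorService pool = Executors.newFixedThreadPool(10);
        for (int i = 0; i < 10; i++) {
            pool.submit(() -> {
                try {
                    HttpResponse<String> response =
                            client.send(request, HttpResponse.BodyHandlers.ofString());
                    System.out.println("status=" + response.statusCode());
                } catch (Exception e) {
                    // rejected or timed-out connections end up here
                    System.out.println("failed: " + e.getMessage());
                }
            });
        }
        pool.shutdown();
    }
}
```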
Windows Test Results
As you can see, on the Windows operating system, only 2 out of 10 requests were successful and 8 connections were rejected.
Because connections and acceptCount are both set to 1, only two requests can be processed
Linux Test Results
Linux behaves differently: besides the Accept queue it also has a SYN queue that holds three-way-handshake requests, so in theory it should take on more requests (connections + acceptCount + SYN queue capacity).
Is that really the case? Let's run the test and see.
Start the same command on Linux:
java -jar tomcatDemo.jar --server.tomcat.max-connections=1 --server.tomcat.max-threads=1 --server.tomcat.accept-count=1 --server.port=9090
Change the HTTP request address in the JMeter test case to the Linux server's address.
On Linux, 9 out of 10 requests were accepted and only 1 failed.
This shows that on Linux the number of requests taken on is not limited to the total (connections + acceptCount) described above.
The SYN queue also holds three-way-handshake requests, so in theory Linux will accept more requests than Windows.
Adjusting the number of concurrent threads
There are two points to note when adjusting the maximum number of threads in Tomcat
- Too few threads: CPU utilization stays low, throughput is small, resources are wasted, and requests pile up easily
- Too many threads: context switching becomes excessive and performance deteriorates
So how should we adjust this quantity?
Scenario
The server has a single core; after receiving a request, executing the Java code takes 50 ms and waiting for the data to come back takes another 50 ms (memory is not a concern here), so the total time is 100 ms.
Ideal number of threads = (1 + blocking time / execution time) × number of CPU cores
Where does this formula come from? Let me give you an example.
A residential community in Shenzhen is hiring security guards. The guard booth only seats one person, so only one guard can be in the booth at a time. Each guard is expected to go out on patrol every 30 minutes, and each patrol lasts 30 minutes. How many guards should be hired? At least two, because someone must always be in the booth to open the gate for residents, yet that same guard cannot be out on patrol at the same time, so another guard has to be hired to take over. The theoretical number is therefore (1 + time on patrol / time sitting in the booth) × number of guard booths.
Our test server has 1 core, so the ideal number of threads is (1 + 50/50) × 1 = 2.
In reality you run the code in a test environment and keep adjusting the thread count until CPU utilization sits around 80–90%.
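As a quick sanity check, the formula can be dropped into a few lines of Java; the 50 ms figures are the ones assumed in the scenario above.

```java
public class IdealThreadCount {
    public static void main(String[] args) {
        double blockingMs = 50;   // time the thread waits, e.g. on the database
        double computeMs = 50;    // time the thread actually uses the CPU
        int cores = Runtime.getRuntime().availableProcessors();

        // ideal threads = (1 + blocking time / execution time) * number of CPU cores
        int idealThreads = (int) Math.round((1 + blockingMs / computeMs) * cores);
        System.out.println("cores=" + cores + ", ideal threads=" + idealThreads);
    }
}
```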
Linux startup command
Start with the maximum number of threads set to 100:
java -jar tomcatDemo.jar --server.tomcat.max-threads=100 --server.port=9090
Configure the JMeter test case
The HTTP request interface is 116.7.249.34:9090/demo/test.
The thread group is 100 threads, each sending one request within one second.
The throughput is about 5 requests per second, with 51 failed requests.
Is that because the thread count is too small? Should we increase it?
Let’s adjust it to 500 and let’s see
java -jar tomcatDemo.jar --server.tomcat.max-threads=500 --server.port=9090
Configure the JMeter test case
The HTTP request interface is again 116.7.249.34:9090/demo/test.
The thread group is 100 threads, each sending one request within one second.
Click Run to see the summary report
The throughput is still about 5 requests per second, with 45 failed requests.
Increasing the number of threads reduced the number of failures, but the throughput stayed the same. Why? We clearly set the maximum number of threads to 500, so why can't it keep up?
Let's look at the result tree to see what caused the errors.
The default connection timeout is 20 seconds; how can this interface possibly take 20 seconds to execute?
Scrolling further down to the results chart, the failed requests turn out to be connections that timed out after 20 seconds.
Why would a request take longer than 20 seconds? Compare it with a normal request:
the first request takes 85 milliseconds, so why do subsequent requests take longer and longer to execute?
This is the classic symptom of too many threads: excessive context switching degrades performance, and piling on more threads only leaves the CPU further behind.
So let's reduce the number of threads so the CPU spends less time switching between them.
Let's set it to 10 and see:
java -jar tomcatDemo.jar --server.tomcat.max-threads=10 --server.port=9090
Click Run to see the summary report
The throughput is still about 5 requests per second, and failed requests rise to 60.
So what can we do? Turning the thread count up doesn't help, turning it down doesn't help, and the results are almost the same.
Let's take a closer look at the code behind the interface being tested:
@GetMapping("test")
public String benchmark() throws InterruptedException {
System.out.println("Access test:" + Thread.currentThread().getName());
// This code is always running
for (int i = 0; i < 200000; i++) {
new Random().nextInt();
}
// 50 ms database wait, the thread is not working
Thread.sleep(50L);
return "Success";
}
Copy the code
A loop of 200,000 iterations generating random numbers keeps the CPU busy for the entire request, so the server simply cannot handle any more; even a beefy configuration cannot save wasteful code.
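As a tiny example of "optimize the code first", the hot loop above allocates a brand-new Random on every iteration. Reusing a single generator, for instance ThreadLocalRandom as sketched below, removes that avoidable allocation, although the loop is of course still CPU-bound.

```java
import java.util.concurrent.ThreadLocalRandom;

public class RandomLoop {
    // same workload as the test endpoint, but without creating 200,000 Random objects
    static void spin() {
        ThreadLocalRandom random = ThreadLocalRandom.current();
        for (int i = 0; i < 200000; i++) {
            random.nextInt();
        }
    }
}
```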
Conclusion
Don't obsess over or blindly chase high concurrency, high processing power and high throughput.
A better approach is to improve the response speed of the interface first, and then let the server configuration take care of the rest.
Taking an 8-core 16GB server as an example, what do you do if you want to process 1000 requests a second?
1000 requests spread across 8 CPUs: how many requests does each CPU handle on average? 1000 / 8 = 125
1 CPU must process 125 requests within 1000 milliseconds: what is the maximum allowed interface response time? 1000 / 125 = 8 ms
So to process 1000 requests per second, each interface needs to respond in 8 milliseconds or less.
This accounts only for normal CPU processing time; it ignores network latency and the time GC spends stopping user threads.
If you can accept slower responses, increase the number of threads and the connection timeout; otherwise, scale out with a cluster and load balancing.
Be realistic: optimizing the code is king, and configuration is only the icing on the cake. But you still need to know how to add that icing, what these parameters mean, and how to adjust them.
Source: juejin.im/post/685041…