To solve the problem of distributed tracing, we introduced Jaeger as our OpenTracing implementation. We then wrote a starter for the SpringBoot framework so that users could join the full-link tracing system with almost no code changes. Because the company has an internal framework that wraps SpringBoot, our starter was developed against the SpringBoot version used by the latest release of that framework. A business system therefore has to upgrade the framework first and then add our starter in order to join the full-link system seamlessly.

Fault description

One business system followed these steps: it upgraded the framework and added the starter to join the full-link system, and both the functional tests and the stress tests passed. We went online with confidence. As a result, Nginx in production reported a large number of HTTP 400 errors.

Troubleshooting

After the failure, the developers of the business system checked all the logs, including the ELK logs and the logs on the machines, and found no obvious error logs. This is…

After several rounds of digging, the online logs still showed nothing, which was rather discouraging. Even stranger, everything had worked normally in the test environment.

We then wondered whether our stress testing had been insufficient, and decided to run it again in the stress-test environment to see whether the problem would reproduce. The business system was about to run a stress test anyway, so we quickly asked operations to set up a stress-test environment. As soon as it was ready, the 400 errors obligingly reproduced.

The operations colleagues then tried all sorts of things, and eventually, almost by magic, fixed it by adding one line under the nginx location block.

proxy_set_header HOST $host;

Then I started looking up what this configuration meant.

The main point of this configuration is that nginx passes on the actual Host header of the client request when forwarding HTTP requests. If the HTTP request is abc.com/hello, then nginx… “upstream”.

We then tried the modified configuration in the stress-test environment and found that everything worked normally. Our nginx configuration is roughly as follows

So to summarize the current phenomenon:

  • Without proxy_set_header HOST $host in nginx, the old version works normally but the new version returns 400

  • With proxy_set_header HOST $host configured in nginx, both versions work normally

So what exactly did we change?

  • Upgraded the SpringBoot version

  • Introduced the full-link starter

Then we tried removing the dependency on the full-link starter and still got the same 400 error. Next we rolled back the SpringBoot version and found that everything was normal again.

To sum up: the problem is caused by the SpringBoot version upgrade and is related to the HTTP Host header, so we can boldly speculate that it comes from the Tomcat version upgrade that the SpringBoot upgrade brought with it.

The Tomcat version was upgraded from 8.5.11 to 8.5.31.

Reproducing the fault locally

When nginx is not configured with proxy_set_header HOST $host, it forwards HTTP requests with the upstream name as the Host header by default.

This means that the new version of Tomcat returns a 400 error when it receives an HTTP request whose Host is sc_java (which contains an underscore).
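To make the two cases concrete, here is a minimal sketch in Python that models which Host value the backend ends up seeing. It is not nginx code: the function names are hypothetical, and it only assumes the documented nginx default of sending `Host: $proxy_host` (the host part of `proxy_pass`, which is the upstream name here) over HTTP/1.0.

```python
def forwarded_host(client_host: str, upstream_name: str, pass_client_host: bool) -> str:
    """Model the Host header value nginx sends to the backend.

    pass_client_host=True  models `proxy_set_header HOST $host;`
    pass_client_host=False models the nginx default, where the Host header
    is taken from proxy_pass, i.e. the upstream name in our setup.
    """
    return client_host if pass_client_host else upstream_name


def forwarded_request(path: str, client_host: str, upstream_name: str,
                      pass_client_host: bool) -> str:
    """Build the (simplified) request text that reaches Tomcat."""
    host = forwarded_host(client_host, upstream_name, pass_client_host)
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\nConnection: close\r\n\r\n"


# Default behaviour: Tomcat sees the upstream name, underscore included.
print(forwarded_request("/hello", "abc.com", "sc_java", pass_client_host=False))
# With proxy_set_header HOST $host: Tomcat sees the client's host.
print(forwarded_request("/hello", "abc.com", "sc_java", pass_client_host=True))
```

The first request carries `Host: sc_java`, which is the value the new Tomcat rejects; the second carries `Host: abc.com`.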

Locally, deploy two backend services running the new version of Tomcat, on ports 8083 and 8084

The nginx configuration is as follows. The upstream name contains an underscore

Then send a request to nginx with Postman, and the 400 error is reproduced

Adjust the nginx configuration: change the upstream name to one without an underscore

Then send the request again and find that it works normally

Solutions

  • Roll back the Tomcat version. This is the more costly option

  • Add proxy_set_header HOST $host to the nginx location, or change the upstream name to one without an underscore

Looking back at the cause

We knew what triggered the problem and how to fix it, but not why the new version of Tomcat behaved this way. With this question in mind, our colleagues searched the issues of the SpringBoot project for 400 and found that there was indeed a related issue

[tomcat] Spring boot web always return 400 when use a domain name

Although that issue looks the same as ours, since both involve 400 errors, the specific cause is different. In that issue, a 400 occurs if the extension of the domain name contains a number, such as “domain.sf1m”. That issue has been fixed in newer versions of Tomcat.

But even with the latest Tomcat 8.5.x, we still get 400 errors when sending Tomcat an HTTP request whose Host contains an underscore.

That is, Tomcat considers HTTP requests whose Host contains an underscore to be invalid

So why was Tomcat normal in previous versions? With this in mind let’s take a look at the Tomcat source code.

Because I had never read Tomcat’s source code before, working out directly which line was responsible would have been difficult, so I looked up the related Tomcat bug: Improve logging in AbstractProcessor.parseHost()

Here is the error stack trace from the bug report

The corresponding code change is as follows

So here we know that the class that handles the Host header is the HttpParser class.

Compare the HttpParser and AbstractProcessor classes in Tomcat 8.5.31 and 8.5.11. The comparison results are as follows:

[Figures: the AbstractProcessor and HttpParser classes in Tomcat 8.5.11 and 8.5.31]

At this point we can see why Tomcat 8.5.11 worked normally: Tomcat 8.5.11 did not validate the Host header, while Tomcat 8.5.31 added this validation.

Let’s take a look at the Tomcat source code commit history

We found that host/port validation was added on 4/6, 2018.

Digging into the reason

So why did Tomcat add validation of the Host, and why does it not allow hosts with underscores? There is actually a specification for this, which can be found at the following address

www.ietf.org/rfc/rfc1034…
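As a rough illustration of the rule, the following Python sketch checks a host name against the label grammar in RFC 1034 section 3.5 (a label starts with a letter, ends with a letter or digit, and contains only letters, digits and hyphens in between). It is not Tomcat’s HttpParser code, the function name is hypothetical, and real servers may be more lenient in places, for example about a leading digit.

```python
import re

# RFC 1034 section 3.5: <label> ::= <letter> [ [ <ldh-str> ] <let-dig> ]
LABEL = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def is_rfc1034_hostname(host: str) -> bool:
    """Return True if every dot-separated label matches the RFC 1034 grammar."""
    if not host or len(host) > 255:
        return False
    # Each label is limited to 63 characters; an empty label never matches.
    return all(len(label) <= 63 and LABEL.fullmatch(label) is not None
               for label in host.split("."))


for name in ["sc_java", "sc-java", "abc.com", "domain.sf1m"]:
    print(name, is_rfc1034_hostname(name))
```

Under this grammar “sc_java” is rejected because the underscore is not a letter, digit or hyphen, while “sc-java”, “abc.com” and “domain.sf1m” are all accepted.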

Lessons learned

For hosts with underscores, Tomcat follows the RFC 1034 specification, so Tomcat’s handling is correct. Tomcat has had bugs in handling some other valid hosts, but its rejection of underscores is correct.

As a result, nginx upstream names should not contain underscores, and it is best to add proxy_set_header HOST $host to the location block.
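One way to catch this before going online is a small check over the nginx configuration. The sketch below is only an assumption of how such a check might look: the function name and the sample configuration are hypothetical (the 127.0.0.1 addresses are made up; the ports and the sc_java name come from the reproduction above), and the regular expression only handles simple `upstream NAME {` lines.

```python
import re

# Matches lines such as "upstream sc_java {" and captures the upstream name.
UPSTREAM = re.compile(r"^\s*upstream\s+(\S+)\s*\{", re.MULTILINE)


def upstreams_with_underscore(nginx_conf: str) -> list[str]:
    """Return the upstream names in the config text that contain an underscore."""
    return [name for name in UPSTREAM.findall(nginx_conf) if "_" in name]


sample = """
upstream sc_java {
    server 127.0.0.1:8083;
    server 127.0.0.1:8084;
}
upstream sc-java {
    server 127.0.0.1:8083;
}
"""

print(upstreams_with_underscore(sample))  # ['sc_java']
```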

Reprinted from Jianshu: Elaine Elaine

The original link: www.jianshu.com/p/d50bc43f5…