“Site-wide HTTPS” has become a hot topic, and many sites are gearing up to roll it out. As it happens, we at Hujiang are doing the same.

At first, everyone thought it would be simple: buy the certificate, configure it, change the relevant paths, and you are done. Indeed, moving a single stand-alone site to HTTPS is easy. Once you go site-wide, though, things turn out to be far more complex than imagined. Site-wide means all resources served to all kinds of clients, which involves many factors, and there is not much information about it online, so you have to explore on your own. Below I briefly describe a few problems we ran into, in the hope that the experience is useful to you.

HSTS

If a site provides both HTTP and HTTPS services (as is often the case during the transition period), how do you steer users to the HTTPS site? This is where HSTS (HTTP Strict Transport Security) comes in. With the right settings on the web server, an HTTP request is redirected to HTTPS, and the HTTPS response carries the Strict-Transport-Security header field, which tells the browser that it must use HTTPS for all later visits to the site.
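
As a rough illustration of what the header looks like, here is a minimal Python sketch that builds a Strict-Transport-Security value. The function name and the one-year default are assumptions made for this example; max-age, includeSubDomains and preload are the directive names browsers recognize. Note that browsers only honor the header when it arrives over HTTPS.

```python
def build_hsts_header(max_age_seconds=31536000, include_subdomains=True, preload=False):
    """Return the (name, value) pair for an HSTS response header.

    The defaults are illustrative assumptions, not values from the article.
    """
    if max_age_seconds < 0:
        raise ValueError("max-age must be non-negative")
    directives = [f"max-age={max_age_seconds}"]
    if include_subdomains:
        directives.append("includeSubDomains")
    if preload:
        directives.append("preload")
    return "Strict-Transport-Security", "; ".join(directives)


name, value = build_hsts_header()
print(f"{name}: {value}")
```

During a transition it is common to start with a small max-age and raise it once everything is known to work, since browsers remember the policy for that long.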

However, HSTS cannot prevent hijacking on the very first visit, before the browser has ever seen the header. To solve this once and for all, apply to join the Preload List.

The Preload List is a list of HTTPS sites maintained by Google Chrome and also used by Firefox, Safari, Edge, and IE 11. When the browser finds that the site being visited is on the Preload List, it opens an HTTPS connection by default. This way, even the first visit cannot be hijacked.
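
Before applying, it helps to check that the header you actually send looks eligible. The sketch below parses a header value and tests it against the submission requirements as I understand them (a max-age of at least one year, plus the includeSubDomains and preload directives). Treat the threshold as an assumption and confirm the current rules on the preload submission site; the function names are made up for this example.

```python
def parse_hsts(value):
    """Parse an HSTS header value into a dict keyed by lowercase directive name."""
    directives = {}
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"')
    return directives


def looks_preload_ready(value, min_max_age=31536000):
    """Heuristic check; min_max_age (one year) is an assumed threshold."""
    directives = parse_hsts(value)
    try:
        max_age = int(directives.get("max-age", ""))
    except ValueError:
        return False
    return (
        max_age >= min_max_age
        and "includesubdomains" in directives
        and "preload" in directives
    )


print(looks_preload_ready("max-age=63072000; includeSubDomains; preload"))
print(looks_preload_ready("max-age=300"))
```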

SSL offloading

In general, HTTPS is used only to encrypt traffic on the public network, between the client and the edge of your network. Traffic on the intranet is still carried over plain HTTP. This process of converting encrypted traffic to unencrypted traffic is usually called SSL/TLS offloading (“SSL offloading” for short below).

Some companies use F5 for load balancing. F5 handles plain L4 and L7 traffic without trouble, but performance often drops sharply once it also does SSL offloading. F5 offers dedicated acceleration cards to address this, but they are not cheap. A separate tier dedicated to SSL offloading is therefore needed, and the widely used Nginx and HAProxy can both do the job.

Since Intel introduced the Westmere series of processors in 2010, CPUs have supported the AES-NI (Advanced Encryption Standard New Instructions) instruction set, which can greatly speed up SSL encryption and decryption (the commonly cited figure is about 5x). CPU support alone is not enough, however; OpenSSL must support it as well. To find out whether your OpenSSL is taking advantage of AES-NI, run OpenSSL’s command-line speed test with and without the -evp parameter and see whether the speed changes significantly.

    # without EVP API
    openssl speed aes-256-cbc
    Doing aes-256 cbc for 3s on 16 size blocks: 14388425 aes-256 cbc's in 3.00s

    # with EVP API
    openssl speed -evp aes-256-cbc
    Doing aes-256-cbc for 3s on 16 size blocks: 71299827 aes-256-cbc's in 3.00s
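
To turn the two result lines above into a single number, the following Python sketch extracts the operation counts and computes the ratio. The regular expression is an assumption based only on the line format shown here, and other OpenSSL versions may print something different.

```python
import re

# Matches lines such as "... on 16 size blocks: 14388425 aes-256 cbc's in 3.00s"
LINE_RE = re.compile(r"on (\d+) size blocks: (\d+) .* in ([\d.]+)s")


def ops_per_second(line):
    """Return operations per second from one 'openssl speed' result line."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    return int(match.group(2)) / float(match.group(3))


without_evp = "Doing aes-256 cbc for 3s on 16 size blocks: 14388425 aes-256 cbc's in 3.00s"
with_evp = "Doing aes-256-cbc for 3s on 16 size blocks: 71299827 aes-256-cbc's in 3.00s"

speedup = ops_per_second(with_evp) / ops_per_second(without_evp)
print(f"speedup with EVP: {speedup:.2f}x")
```

With the figures quoted in this article the ratio comes out just under 5, in line with the commonly cited number.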

Client-side certificates

HTTP services are easy to verify. In general, the same action produces the same result whether the client is a browser, some other tool, or program code, so once the service works from one client you can basically assume it is fine. HTTPS sites are different.

Unlike HTTP, an HTTPS connection requires certificate verification: a complete chain of trust must be established, starting from a root certificate. However, browsers, common tools, and application libraries each manage their trusted root certificates independently. For example, C# trusts the root certificates in the local machine store or the current user store, while Java trusts those in the cacerts file under the JDK installation directory, and neither necessarily matches the browser.
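
Python is one more example of a runtime with its own idea of which roots to trust. The sketch below, which uses only the standard ssl module, reports where the default context looks for CA certificates; the function name and the shape of the returned dictionary are choices made for this example.

```python
import ssl


def describe_default_trust():
    """Summarize where Python's default TLS context gets its trusted roots."""
    paths = ssl.get_default_verify_paths()
    ctx = ssl.create_default_context()
    return {
        "cafile": paths.cafile,    # bundle file, or None if not found
        "capath": paths.capath,    # directory of CA files, or None
        "loaded_ca_certs": ctx.cert_store_stats()["x509_ca"],
        "verifies_hostname": ctx.check_hostname,
    }


for key, value in describe_default_trust().items():
    print(f"{key}: {value}")
```

The loaded count can be low or even zero on systems that load certificates lazily from a directory, so read it as a hint rather than a full inventory.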

Because the browser’s certificates are updated continuously during normal use, usually without the user noticing, problems arise if you take it for granted that whatever the browser accepts will be accepted everywhere. For example, many websites in China use certificates issued by WoSign, but older JDKs do not trust WoSign’s root certificate. The site opens fine in a browser, yet a program reports an error unless you import the certificate manually. WoSign has also been distrusted by Firefox, Chrome, Safari, and other major browsers because of its misconduct, which means WoSign certificates are a security risk. Programs that have the WoSign root certificate built in or imported should therefore be reconfigured accordingly.

Server side certificate

Here is the same phenomenon again: an HTTPS site that checks out fine in the browser runs into problems when accessed from your application. Why is that?

I have seen several cases where the certificate was misconfigured on the server: only the end-entity (leaf) certificate was installed, and the intermediate certificate was missing. The leaf certificate usually contains information pointing to the intermediate certificate, so when the intermediate is missing, the client has to do a time-consuming extra fetch to obtain it and build the certificate chain, all before the HTTPS connection is actually established. To make matters worse, many browsers find ways to work around the problem in order to “perform better.” As a result, the missing intermediate certificate goes unnoticed indefinitely, while program calls stay slow and even fail with some probability (I have run into this odd problem myself).

Even when the chain is fully configured, pay attention to its size. Some sites have unusually large certificate chains, several kilobytes or even tens of kilobytes, which means that much data must be transferred before a connection can be established. For pure desktop web browsing this is probably not a problem, but for mobile clients and program calls it is a disaster. The best way to check is to use OpenSSL together with a tool such as Wireshark. If you know TCP well, you can often optimize further.
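
Both checks, whether the intermediate is present and how big the chain is, can be approximated offline from the PEM bundle configured on the server. The following is a minimal sketch that assumes the bundle uses standard BEGIN/END CERTIFICATE markers; the function name is invented for this example.

```python
import base64
import re

PEM_RE = re.compile(
    r"-----BEGIN CERTIFICATE-----(.*?)-----END CERTIFICATE-----", re.DOTALL
)


def chain_sizes(pem_text):
    """Return the DER size in bytes of each certificate found in a PEM bundle."""
    sizes = []
    for body in PEM_RE.findall(pem_text):
        # Remove line breaks and spaces before decoding the base64 payload.
        der = base64.b64decode("".join(body.split()))
        sizes.append(len(der))
    return sizes


def report(pem_text):
    sizes = chain_sizes(pem_text)
    note = "no intermediate present?" if len(sizes) < 2 else "ok"
    return f"{len(sizes)} certificate(s), {sum(sizes)} bytes total ({note})"
```

A bundle with a single certificate is a hint that the intermediate may be missing, and the byte total gives a rough idea of how much the server sends during each full handshake.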

OCSP and CRL cannot be ignored either. Both mechanisms are used to check whether a certificate has been revoked (made invalid). If you look carefully, you will find that the certificate specifies the corresponding OCSP or CRL URL for this check. In principle, the OCSP or CRL check is performed every time an HTTPS connection is established. The response is usually around 1 KB, which also needs to be taken into account if requests are very frequent.
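
To see which revocation endpoints a certificate names, you can read them from the dictionary that Python’s ssl.SSLSocket.getpeercert() returns for a verified peer, which includes the keys OCSP and crlDistributionPoints when the certificate carries them. The sketch below works on such a dictionary; the sample data and its URLs are hypothetical.

```python
def revocation_endpoints(peer_cert):
    """Extract OCSP and CRL URLs from a dict shaped like getpeercert() output."""
    return {
        "ocsp": list(peer_cert.get("OCSP", ())),
        "crl": list(peer_cert.get("crlDistributionPoints", ())),
    }


# Hypothetical sample, trimmed to the fields used here.
sample_cert = {
    "subject": ((("commonName", "www.example.com"),),),
    "OCSP": ("http://ocsp.example-ca.test",),
    "crlDistributionPoints": ("http://crl.example-ca.test/ca.crl",),
}

print(revocation_endpoints(sample_cert))
```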

SNI

You are probably familiar with the concept of a “virtual host”: multiple domain names map to the same IP address, so that one server can serve multiple sites. In the HTTP era, the client named the domain it wanted through the Host header, and everything seemed perfect.

In the HTTPS era, things are not so easy. With traditional HTTPS, it is hard to map multiple domain names to the same IP address. Before any HTTP traffic flows, the TLS handshake, including certificate verification, must complete. If multiple domain names are bound to one IP address, the server has no way of knowing at this stage which domain name the request is for, and therefore which certificate to present, which is a serious inconvenience.

To solve this problem, SNI (Server Name Indication) came into being. It can be thought of simply as “the Host header for establishing SSL/TLS connections,” and it removes the restriction of one certificate per IP address.
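
To make the idea concrete, the sketch below starts a TLS handshake entirely in memory with Python’s ssl module and captures the first message the client would send. No network connection is made. The host name is a placeholder, and the function name is invented for this example.

```python
import ssl


def client_hello_bytes(server_name):
    """Begin a TLS handshake in memory and return the client's first flight."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=server_name)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: the client now waits for a server reply that never comes
    return outgoing.read()


hello = client_hello_bytes("www.example.com")
print(b"www.example.com" in hello)
```

The requested name travels in the very first handshake message, which is what lets the server pick the right certificate before any HTTP is exchanged. A client that omits server_hostname sends no such hint.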

However, SNI has a short history, and many clients support it in odd ways. For example, JDK 7 supports SNI, but the support in JDK 8 is buggy. Using it also often requires calling native APIs, which libraries such as Resteasy do not support. If SNI support is broken, the connection may fail even when the configuration is correct, because the server cannot tell which certificate the request needs.

CDN

CDNs are already a standard technology in the industry, and it is almost inconceivable for a site of any size to do without one. CDN solutions from the HTTP era are quite mature, but the same cannot be said for HTTPS.

To use an HTTPS CDN service, you first have to decide whether to hand your certificate over to the CDN provider. Given the current level of commercial trustworthiness in China, the risk of handing certificates to a CDN provider cannot be ignored. To assume the worst: once someone with ulterior motives holds the certificate, it is easy to launch a man-in-the-middle attack through DNS hijacking without being detected.

In addition, in the HTTP era people tended to spread resources across multiple domains, because connections were cheap to establish and the number of concurrent connections to a single domain was limited. Under HTTPS, establishing a connection costs far more, and repeatedly downloading and verifying certificates hits mobile devices and mobile networks hard (especially if the certificates are large). The more fragmented the domain names, the more pronounced the impact. So in the HTTPS era, it is better to consolidate resources onto fewer domain names.
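
A quick way to see how fragmented a page is: list the distinct HTTPS hosts among its resource URLs, since each one needs at least one handshake of its own. This is a minimal sketch; the URLs are placeholders and the function name is invented for this example.

```python
from urllib.parse import urlsplit


def distinct_https_hosts(urls):
    """Return the sorted set of host names referenced over HTTPS."""
    hosts = set()
    for url in urls:
        parts = urlsplit(url)
        if parts.scheme == "https" and parts.hostname:
            hosts.add(parts.hostname)
    return sorted(hosts)


page_resources = [
    "https://static1.example.com/app.js",
    "https://static2.example.com/app.css",
    "https://img.example.com/logo.png",
    "https://static1.example.com/vendor.js",
]

hosts = distinct_https_hosts(page_resources)
print(len(hosts), hosts)
```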

Content and other issues

Because HTTPS pages cannot reference HTTP resources, you need to make sure that the links to resource files on your pages are HTTPS as well. Many legacy systems paid no attention to this and reference resources by absolute address, so changing them can be a lot of work. If you do change them, it is best to switch to protocol-relative URLs.

Absolute address URL: http://www.a.com/b.css

Protocol-relative URL: //www.a.com/b.css

This way, the browser generates the absolute address of the resource from the protocol of the current page, so you can switch freely between HTTP and HTTPS.
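
The resolution rule can be tried out with Python’s urllib.parse, whose urljoin follows the same relative-reference logic that browsers use. The sketch also includes a small helper that strips the scheme from an absolute URL. The helper name and the page URLs are made up for this example; www.a.com/b.css is the address used above.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def to_protocol_relative(url):
    """Drop the http/https scheme from an absolute URL; leave other input alone."""
    parts = urlsplit(url)
    if parts.scheme in ("http", "https") and parts.netloc:
        return urlunsplit(("", parts.netloc, parts.path, parts.query, parts.fragment))
    return url


ref = to_protocol_relative("http://www.a.com/b.css")
print(ref)
print(urljoin("https://www.a.com/index.html", ref))
print(urljoin("http://www.a.com/index.html", ref))
```

The same reference resolves to an https address on an HTTPS page and to an http address on an HTTP page, which is exactly the switching behavior described above.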

If the resources are your own, switching to HTTPS is relatively easy. If there are external resources, especially UGC resources, switching is cumbersome. For hyperlinks, there is usually a dedicated jump service, which looks like this:

https://link.my.com/target=www.you.com
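
As a sketch of how such links might be generated and read back, the code below wraps an external address in a jump link. It assumes the target is passed as a URL-encoded query parameter named target, which differs slightly from the shape shown above, and the function names are invented for this example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

JUMP_BASE = "https://link.my.com/"  # host taken from the example above


def make_jump_link(target):
    """Wrap an external address in a jump-service link (assumed ?target= form)."""
    return JUMP_BASE + "?" + urlencode({"target": target})


def read_jump_target(link):
    """Recover the wrapped address, or None if the parameter is missing."""
    values = parse_qs(urlsplit(link).query).get("target", [])
    return values[0] if values else None


link = make_jump_link("http://www.you.com/a?b=1")
print(link)
print(read_jump_target(link))
```

Encoding the target keeps characters such as ? and & inside the wrapped address from being mistaken for part of the jump link itself.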

For resources such as images, you can write a dedicated program that fetches them and stores them on your own servers, and then replace the addresses. However, doing so means you take on the risk of extra traffic and malicious attacks yourself.

For rich text and complex interactive resources such as videos and games, if the source site does not provide HTTPS, you will mostly have to give up embedding them.

Two final lessons:

  1. If you do decide to go HTTPS, you had better have someone familiar with OpenSSL on hand; otherwise you will run into many baffling problems. OpenSSL is a great debugging tool for tracking them down.

  2. HTTPS can solve content hijacking by carriers. DNS hijacking is another matter: think carefully about whether to take a “we go down together” stance and insist on HTTPS. I know that many websites can switch between HTTP and HTTPS at any time, which gives them room to both attack and defend.