1. An overview of Nginx

Many people know Nginx to some degree, and even those who have never used it may have built a simple web server with Apache, or written some simple dynamic pages with Tomcat. In fact, all of these functions can be implemented with Nginx.

The three most important usage scenarios for Nginx are, in my opinion, static resource services, reverse proxy services, and API services.

Web requests enter through Nginx, pass on to the application service, and then reach Redis or MySQL, which provide the underlying data.

There is a problem here: application services prioritize development efficiency, so their runtime efficiency is low, and their QPS and TPS concurrency is limited. Many application services therefore need to be clustered to provide users with high availability.

When many services are clustered, Nginx needs reverse-proxy functionality to forward dynamic requests to the appropriate application services. Clustering inevitably brings two requirements: dynamic capacity expansion and disaster recovery.

The reverse proxy must therefore support load balancing. Second, on the network path, Nginx is the edge node of the enterprise intranet, and as network links grow longer, the latency experienced by users increases.

By caching in Nginx dynamic content that is identical for all users, or stable for a period of time, and serving it directly, user latency can be greatly reduced.

Caching, which speeds up access, is thus a function derived from the reverse proxy. In many cases, access to CSS or JS files, or small images, does not need to involve the application service at all; Nginx can serve them directly.

The application service itself has serious performance limitations, while the database service performs much better, because its business scenarios are relatively simple and its concurrency and TPS are far higher than the application service's. Having Nginx access the database or Redis directly is therefore also a good choice.

Nginx's powerful concurrency can also be used to implement business features such as a web application firewall, which requires Nginx to have strong business-processing capabilities. OpenResty integrates Nginx with a set of tool libraries to make this possible.

2. Historical context

Globalization and the rapid development of the Internet of Things have led to a rapid rise in the number of people and devices connected to the Internet. The resulting explosion of data has placed high demands on hardware performance.

In the era when Moore's Law still applied to clock speeds, a service running on a 1 GHz CPU could expect roughly double the performance after an upgrade to a 2 GHz CPU.

By the early 2000s, however, Moore's Law had broken down for single-core speeds, and CPUs began to move toward multiple cores. When a server running on an 8-core CPU moves to a 16-core CPU a year and a half later, performance usually does not double.

This performance is lost mainly because operating systems and a great deal of software were not prepared for multi-core architectures. Apache, for example, is inefficient because of its architectural model, in which a process handles only one connection and one request at a time; only after that request is processed will the next be handled.

Apache relies on the operating system's process switching: at any instant each CPU core runs only one process, but the operating system is designed to serve hundreds or even thousands of processes simultaneously by switching between them.

Since Apache can only service one connection per process, handling hundreds of thousands or millions of connections would require running a corresponding number of processes, which it cannot do; the cost of switching between that many processes is far too high.

The higher the number of concurrent connections, the greater the performance cost of this unnecessary inter-process switching.

Nginx was designed specifically for this type of scenario: it can handle hundreds of thousands or even millions of concurrent connections. Nginx is currently number two in web-server market share, having grown tremendously in the past few years, and in the near future it may well be used far more widely on the web than any other server.

3. Advantages of Nginx

For most applications and servers, RPS drops sharply as the number of concurrent connections rises. The reason, as mentioned earlier, is that the architecture is wrong.

The first advantage of Nginx is that it combines high concurrency with high performance, achieved partly by using as little memory per connection as possible.

High concurrency together with high performance requires very good design. So what numbers can Nginx reach?

For example, a mainstream server with 32 cores and 64 GB of memory can easily handle tens of millions of concurrent connections, or one million RPS for simple static-resource requests.

Second, Nginx is highly extensible, thanks to its very stable modular design and the very rich ecosystem of third-party Nginx modules. There are even derivative products such as Tengine and OpenResty. This rich ecosystem underpins Nginx's rich functionality.

The third advantage is high reliability: Nginx can run on a server for years without interruption, whereas many web servers need to be restarted after weeks or months of running.

As a high-concurrency, high-performance reverse proxy, Nginx usually runs on the edge nodes of the enterprise intranet. If the enterprise wants to provide four-nines or five-nines availability, or even higher, the time Nginx itself may be down in a whole year can only be measured in seconds. In this role, Nginx's high reliability provides a very strong guarantee.
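To make those availability targets concrete, the yearly downtime budget can be computed as minutes_per_year × (1 − availability). A quick worked example (not from the original text):

```shell
# Yearly downtime budget for four-nines and five-nines availability.
# 525600 = minutes in a (non-leap) year.
for a in 0.9999 0.99999; do
    awk -v a="$a" 'BEGIN { printf "%s availability -> %.1f minutes of downtime per year\n", a, 525600 * (1 - a) }'
done
```

Four nines allows roughly 53 minutes of downtime a year and five nines roughly 5 minutes; only at even higher targets does the budget shrink to seconds, which is why the reliability of the edge node matters so much.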

The fourth advantage is hot deployment: the ability to upgrade Nginx without stopping service. This is particularly important for Nginx, which may be carrying millions of concurrent connections.

A normal service can be upgraded by killing the process and restarting it. With Nginx, however, killing the process causes the operating system to send a TCP RST to every client with an established connection, and many clients do not handle that well.

In a high-concurrency scenario, such random events inevitably lead to bad outcomes, so hot deployment is necessary.

The fifth advantage is the BSD license. It means that Nginx is not only open source and free, but that you may also modify its source code in custom scenarios, and running the modified version commercially is legal.

These advantages are the core features of Nginx.

4. Major components

The first is the nginx executable file, built from Nginx's own framework, the official modules, and any third-party modules. It constitutes the whole system; all functionality is provided by it.

The second part is the nginx.conf configuration file, which acts like the driver at the wheel. Although the executable already provides many functions, it is nginx.conf that determines whether each is enabled, and what behavior is applied to requests when it is.

The third component is the access log, access.log, which records information about every HTTP request and response that Nginx processes.

The fourth component is the error log, error.log, which can be used to locate unexpected problems when they occur.

These four parts are mutually reinforcing.

The nginx executable and nginx.conf define how requests are handled. For operations or business analysis of the web service, look closely at access.log; for unknown errors, or behavior that does not match expectations, use error.log to locate the underlying problem.

5. Version rules

Each Nginx release notes three kinds of changes: "feature", "bugfix", and "change".

Nginx is maintained in two lines: a mainline (trunk) version and a stable version.

Click "download" in the lower right corner of the official Nginx website to see the list of version numbers. Odd minor version numbers indicate mainline versions, which add many features but may not be stable; even minor version numbers indicate stable versions.

The CHANGES file shows what new features were added, what bugs were fixed, and what minor refactorings were made for each version.

The number of Nginx bugfixes has decreased significantly since about 2009, so Nginx has been relatively stable.

Nginx's development began in 2002; its first version was released on October 4, 2004, and it underwent a major refactoring in 2005.

Because of Nginx's excellent design, its ecosystem is extremely rich, and its module design and architecture have not needed much change since.

In 2009 Nginx began to support the Windows operating system; in 2011 the official 1.0 version was released, and in the same year the commercial company behind Nginx Plus was founded. In 2015 Nginx released several important features.

Most notably stream, a layer-4 reverse proxy, which in terms of function can completely replace the traditional LVS while offering richer features.

6. Select a version

Free and open source: nginx.org

Commercial version: nginx.com

The development of the open source free Nginx began in 2002, and the first version was released in 2004. In 2011, the open source version of Nginx released a 1.0 stable version. In the same year, the authors of Nginx established a commercial company and began to release the commercial version of Nginx Plus.

The commercial version of Nginx has many advantages in integrating third-party modules, operations monitoring, and technical support, but its biggest disadvantage is that it is not open source, so in China the open source version from nginx.org is usually used.

Alibaba also maintains its own version, Tengine. Tengine's advantage is that it has been through very severe testing in the Alibaba ecosystem; the reason it exists is that many of its features were ahead of the official Nginx releases.

To achieve that, Tengine actually modified the trunk code of official Nginx. Having changed the framework, Tengine runs into an obvious problem: it cannot stay in sync with official Nginx upgrades. Tengine can, however, still use Nginx's third-party modules.

Zhang Yichun, the author of OpenResty, developed it while at Alibaba. Because developing third-party Nginx modules is considerably difficult, he exposed Nginx's non-blocking event framework to developers at large through the Lua language.

OpenResty offers both high performance and high development efficiency, and comes in open source and commercial versions. The open source version is available at openresty.org; the main advantage of the commercial version is much better technical support.

If you do not have demanding business requirements, the open source version of Nginx is sufficient. If you need to develop an API server or a web firewall, OpenResty is a good choice.

7. Preparing to compile

There are two ways to install nginx: besides compiling from source, you can install it directly with the operating system's package tools, such as yum or apt-get.

The problem with installing nginx this way is that the prebuilt binary comes with a fixed set of modules; after all, not every official nginx module is enabled by default.

If you want to add third-party nginx modules, you must compile nginx.

To compile Nginx, you need to download nginx from nginx.org.

Go to nginx.org, right-click the "download" link in the lower right corner of the page, and copy the link address. Then, on the Linux system, use wget to download the file:

cd /home/nginx
wget http://nginx.org/download/nginx-1.18.0.tar.gz

After downloading the nginx package, decompress it first.

tar -xzf nginx-1.18.0.tar.gz

Then go to the decompressed directory and run the ll command to view all the files.

cd nginx-1.18.0
ll

The first directory is called the Auto directory.

cd auto

The auto directory has four subdirectories: cc is used during compilation, lib for library detection, and os for operating-system checks; all the other files assist the configure script in determining which modules nginx supports and which features of the current operating system are available to Nginx.

The CHANGES file marks which features and bugfixes are available in each version of Nginx.

cat ../CHANGES

It contains feature, bugfix, and change entries.

The CHANGES.ru file is the Russian-language CHANGES file, presumably because the author is Russian.

The conf directory holds example configuration files. After nginx is installed, these example files are copied into the installation directory to make operations and configuration easier.

The configure script performs the pre-compile configuration: it generates intermediate files recording the configuration choices, which are then used at compile time.

The contrib directory provides two scripts plus vim support files that enable syntax highlighting when vim opens nginx configuration files.

Copy all the vim files from the contrib directory into your own vim directory:

cp -r contrib/vim/* ~/.vim/

You can highlight the syntax of the Nginx language in Vim.

The html directory provides two standard HTML files: one that can be served when a 500-class error occurs, and index.html, the default nginx welcome page.

The man directory contains the Linux man page for nginx, documenting its most basic usage and configuration.

The src directory contains the core source code of Nginx.

8. Start compiling

See what parameters configure supports before compiling.

./configure --help | more

The first group of parameters determines the directories in which the running nginx will look for auxiliary files. For example, --modules-path comes into play when using dynamic modules, and --lock-path determines where to put the nginx.lock file.

If nothing else needs to change, just specify --prefix=PATH to set an installation directory.

The second type of parameter determines which modules are compiled in and which are not; their prefixes are usually --with and --without.

Modules such as --with-http_ssl_module or --with-http_v2_module must be actively enabled with --with, which means they are not compiled into Nginx by default.

Conversely, modules listed with --without, for example --without-http_charset_module, are compiled into nginx by default, and adding the parameter removes them from the default module set.

The third type of parameter specifies special options for the compilation itself, such as optimization flags for the C compiler, debug-level logging (--with-debug), and options such as --with-zlib-asm=CPU.

The nginx installation directory specified here is in the /home/nginx directory.

./configure --prefix=/home/nginx/nginx/

If configure runs without any errors, all the enabled configuration features and the nginx runtime directories are listed at the end of its output.

After configure executes, you will see that some intermediate files have been generated in the objs folder. Most important is a file called ngx_modules.c, which determines which modules will be compiled into Nginx during the subsequent build. Open it and you can see that every module to be compiled in is listed there; they end up in an array called ngx_modules.

Then run make to compile.

make

When the compilation is complete without any errors, you can see that a large number of intermediate files have been generated, as well as the final Nginx binary.

cd objs/
ll

Finally, make install.

make install

After the installation is complete, you can see many directories in the –prefix installation directory. The nginx execution file is in the sbin directory.

The nginx configuration files are in the conf folder, and access.log and error.log are written to the logs folder.

All the files in the conf directory are copied from the conf directory of the source code; their contents are identical.

9. Configuration syntax

Which modules the nginx executable contains is fixed at compile time, and each module provides its own configuration directives.

All of these directives, however, follow the same general syntax rules.

The nginx configuration file is an ASCII text file consisting of two main parts: directives and directive blocks.

http {
    include mime.types;
    upstream THWP {
        server 127.0.0.1:8000;
    }
    server {
        listen 443 http2;
        limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s;
        location ~* \.(gif|jpg|jpeg)$ {
            proxy_cache my_cache;
            expires 3m;
        }
    }
}

For example, include mime.types; is a directive.

Each directive ends with a semicolon, and the directive name and its parameters are separated by spaces. In include mime.types;, include is the directive name and mime.types is a parameter. Parameters can be separated by one or more spaces, and a directive can take several: limit_req_zone below has three parameters, separated by spaces.

Since the semicolon is the separator between directives, two directives can be written on a single line, but readability suffers.

Third, blocks are formed with {} and group multiple directives together: in upstream, for example, the server directive is placed inside the THWP block.

The server block likewise contains directives such as listen and limit_req_zone, as well as nested blocks such as location.

Some blocks are named: upstream is followed by THWP, which is its name.

It is up to the nginx module providing a block to decide whether the block has a name, and whether it takes one, several, or no parameters.

The include directive allows multiple configuration files to be brought in, improving maintainability. In this example, the mime.types file maps file extensions to HTTP MIME types.

include, in other words, imports other configuration files.
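As a sketch of how include keeps a main configuration short (the conf.d layout and file names are hypothetical, for illustration only):

```nginx
# nginx.conf - hypothetical layout
http {
    include mime.types;       # extension-to-MIME-type map shipped with nginx
    include conf.d/*.conf;    # e.g. one file per virtual host, such as conf.d/yindong.conf
}
```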

The # symbol adds comments to improve readability; for example, a comment next to listen can describe the configuration expressions that follow.

limit_req_zone takes a parameter called $binary_remote_addr, a variable holding the remote client's address in binary form.

Some directives accept regular expressions in their parameters, such as location. As you will see later, these can be very complex, and the contents of parenthesized groups can be extracted as $1, $2, and $3.
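For instance, a small sketch (the paths are invented for illustration) of a regex location whose captured group is reused as $1:

```nginx
# Serve /img/foo.png from /data/images/foo.png by capturing the file name.
location ~ ^/img/(.+\.(?:png|gif|jpg))$ {
    alias /data/images/$1;
}
```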

There are many ways to express time in the nginx configuration file, such as the following:

ms -> milliseconds
s -> seconds
m -> minutes
h -> hours
d -> days
w -> weeks
M -> months
y -> years

For example, expires 3m; in a location means the cached response should expire after 3 minutes.

Sizes also have units: with no suffix the value is in bytes; k or K means kilobytes, m means megabytes, and g means gigabytes.
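A small illustrative fragment combining both kinds of units (the values are arbitrary examples, not recommendations):

```nginx
server {
    client_max_body_size 8m;     # size unit: 8 megabytes
    client_body_timeout 10s;     # time unit: 10 seconds
    location /static/ {
        expires 12h;             # time unit: cache for 12 hours
    }
}
```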

All directives inside the http braces are parsed and executed by the HTTP modules; non-HTTP modules, such as the stream modules, cannot parse them.

upstream refers to an upstream service; an upstream can be defined when Nginx needs to interact with other intranet services such as Tomcat.

server corresponds to a domain name or a group of domain names; location is a URL expression.

10. Reloading, hot deployment, log rotation

If you need help, use -? or -h to get usage information:

nginx -?
nginx -h

By default, the compiled nginx looks for the configuration file specified when configure was executed; a different configuration file can be specified on the command line with -c PATH.

You can also specify individual configuration directives with -g.

Operating a running nginx generally means sending signals to it, either with the Linux kill command or via the nginx -s subcommand, where -s can be followed by stop, quit, reload, or reopen.

nginx -s stop    # stop the service immediately
nginx -s quit    # stop the service gracefully
nginx -s reload  # reload the configuration file
nginx -s reopen  # reopen the log files

-t tests whether the configuration file is valid.

-V prints the version along with all the parameters that were passed to configure at compile time.

1. Reload the configuration file

Change some values in the nginx configuration file; for example, turn on tcp_nopush in conf/nginx.conf.

After modifying the configuration file, run nginx -s reload directly; nginx will use the new tcp_nopush setting without interrupting service to clients.
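A cautious way to script this pairs -t with reload, so a broken configuration never gets loaded. This wrapper is a sketch, not part of nginx itself; NGINX_BIN is an assumption defaulting to nginx on PATH:

```shell
# Hypothetical helper: only reload when the configuration passes the syntax test.
NGINX_BIN=${NGINX_BIN:-nginx}

safe_reload() {
    # `nginx -t` only parses the config; it does not touch the running master.
    if "$NGINX_BIN" -t; then
        "$NGINX_BIN" -s reload
    else
        echo "config test failed; reload skipped" >&2
        return 1
    fi
}
```

Run safe_reload after editing nginx.conf; a typo then produces an error message instead of an outage.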

2. Hot deployment

If nginx is running and you want to switch to the latest version, first download and compile the new version as described in the build section above.

Copy the new nginx executable into the installation directory, replacing the running nginx binary. When the copy is done, send a signal to the nginx master process to tell it to start a hot deployment:

kill -USR2 13195    # 13195 is the ID of the running master process

Nginx will start a new master process using the freshly copied binary.

The old workers keep running while the new master spawns new workers; requests are smoothly shifted to the new processes.

New connections go to the new nginx processes. Then send the WINCH signal to the old nginx master, telling it to gracefully shut down all its worker processes:

kill -WINCH 13195

The old worker processes then exit gracefully; the old master process remains, but now has no workers.

At this point all requests have been switched to the new nginx. If you need to roll back from the new version to the old one, send the reload signal to the old master to have it spawn workers again, then shut down the new version. The old master is kept around precisely so that a version rollback remains possible.

3. Log rotation

Suppose the current log has grown very large and the old entries need to be backed up to another file while nginx keeps serving normally.

This is done with the reopen command. First move the log file currently in use to another location:

mv access.log bak.log

Then run:

nginx -s reopen

A new access.log is generated, and the original log has become bak.log, completing the log rotation.

Of course, doing this by hand is impractical; in practice logs are rotated every day or every week, so it is usually written as a bash script first.

In the bash script, first move the file, then execute nginx -s reopen; finally put the script into crontab.
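A minimal rotation script along those lines; the paths are assumptions for illustration, and the nginx signal is tolerated failing so the file logic stays safe when nginx is absent or stopped:

```shell
#!/bin/sh
# Hypothetical daily rotation sketch: move the live log aside, then ask the
# running nginx to reopen fresh log files. LOG_DIR is an assumed install path.
LOG_DIR=${LOG_DIR:-/home/nginx/nginx/logs}

rotate_access_log() {
    stamp=$(date +%Y%m%d)
    # nginx keeps writing to the moved file until told to reopen.
    mv "$LOG_DIR/access.log" "$LOG_DIR/access-$stamp.log"
    # Same effect as sending USR1 to the master process; skipped/ignored when
    # nginx is not installed or not running, so the script is safe in cron.
    if command -v nginx >/dev/null 2>&1; then
        nginx -s reopen 2>/dev/null || true
    fi
}
```

A crontab entry such as `0 0 * * * /path/to/rotate.sh` (path hypothetical) would run it nightly.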

11. Static resource Web server

In the conf/nginx.conf file, find the server block. Configure listen on port 8080, then configure a location that uses / to send all requests to the www folder.

URL paths need to be mapped to file paths. There are two directives for this, root and alias: with root, the full request URI is appended to the specified path, while alias replaces the location prefix with the specified path, so alias is usually used here. A relative path is resolved against nginx's installation directory.
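The difference between the two directives can be sketched with two mappings (the paths are illustrative):

```nginx
# With root, the full URI is appended: /static/a.png maps to /www/static/a.png.
location /static/ {
    root /www;
}

# With alias, the location prefix is replaced: /static/a.png maps to /www/files/a.png.
location /static/ {
    alias /www/files/;
}
```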

server {
    listen 8080;
    ...
    location / {
        alias www/;
        ...
    }
    ...
}

After configuring, reload Nginx and access localhost:8080 in your browser.

nginx -s reload

1. Enable gzip

With gzip compression the number of bytes transferred is greatly reduced, so gzip is usually enabled.

gzip on enables compression. gzip_min_length sets the minimum response size worth compressing: very small files gain little, and compression costs CPU. gzip_comp_level sets the compression level, and gzip_types limits compression to certain file types.

http {
    ...
    gzip on;
    gzip_min_length 1;
    gzip_comp_level 2;
    gzip_types text/plain application/x-javascript text/css image/png;
    ...
}

After restarting Nginx, the browser shows that far fewer bytes are transferred, and the response carries the header Content-Encoding: gzip. Using gzip makes the whole web service transfer much more efficient.

2. Enable directory listing

Nginx provides an official module called autoindex that displays a directory listing when a URL ending in / is accessed. Using it is simple: just add the autoindex on directive.

location / {
    autoindex on;
}

It lists all the files in the accessed folder; opening a subdirectory continues to show the files inside it. This is a handy helper feature for static resource services.

3. Network speed limit

For example, the bandwidth of the public network is limited. When a large number of concurrent users use the bandwidth, they will form a scrambling relationship. You can allow users to limit the speed of accessing some large files to save enough bandwidth for users to access some necessary small files.

You can do this using the set command, along with some built-in variables, such as set $limit_rate 1k, which limits the speed at which Nginx can send a response to the client browser. It means how much data is transferred to the browser per second.

location / {
    set $limit_rate 1k;
}

4. Log

First, set the access-log format. The log_format directive defines the format, and variables can be used in it.

http {
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';
}

$remote_addr is the IP address of the remote client browser, $time_local is the time, and $status is the returned status code. Once the format is defined, it needs a name, here main.

Different named formats can be used to log different domain names or different URLs in different formats.

Once log_format is configured, logging is enabled with the access_log directive. The block containing access_log determines which requests it covers: placed inside this server block, it means all requests to this address and port are recorded in the logs/yindong.log file, in the main format.

server {
    listen 8080;
    access_log logs/yindong.log main;
    location / {
        alias dlib;
    }
}

After requests complete, logs/yindong.log shows each request in the configured format.

12. Reverse proxy service

Because upstream services need to handle very complex business logic and emphasize development efficiency, their performance is not very good. With Nginx as a reverse proxy, one Nginx can distribute requests to multiple upstream servers according to a load-balancing algorithm.

This makes horizontal scaling possible: without users noticing, more upstream servers can be added to improve processing capacity, and when an upstream server fails, Nginx can automatically shift requests from the faulty server to a healthy one.

Here the upstream server listens on 127.0.0.1:8080; if there are many upstream services, more server entries can be added.

upstream configures a group of services named local, and a proxy_pass directive proxies all matching requests to local.

upstream local {
    server 127.0.0.1:8080;
}
server {
    server_name yindong.com;
    listen 80;
    location / {
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_pass http://local;
    }
}

Because of the reverse proxy, the information received by the real (upstream) server has been forwarded by the Nginx proxy, so much of it is no longer original: for example, the peer IP the upstream sees belongs to the proxy server. Some configuration in the location is therefore needed.

proxy_set_header sends an additional header upstream with a chosen value, such as X-Real-IP set to the client's address taken from the TCP connection.

$host is passed on for the same reason: the user typed the domain name into the browser, and forwarding it allows the domain name to be processed either by the upstream server or by the reverse proxy.

All of these configuration options are documented under the HTTP proxy module (ngx_http_proxy_module) on the official site.

1. Caching

An important feature is proxy_cache. When Nginx acts as a reverse proxy, only dynamic requests, where different users accessing the same URL see different content, truly have to be handled by the upstream service.

Some content, however, may not change for a period of time. To ease the pressure on the upstream server, Nginx can cache upstream responses for a while, say one day; during that day, even if the upstream's response changes, Nginx will still return the cached content to browsers.

Because Nginx's performance is far ahead of the upstream servers', this one feature can bring a huge performance boost to smaller sites.

To configure the cache server, first set the directory where the cache file is written via the proxy_cache_path directive.

For example /tmp/nginxcache; how the cache files are named, and their keys, are kept in shared memory. Here 10 MB of shared memory is allocated and named my_cache.

proxy_cache_path /tmp/nginxcache levels=1:2 keys_zone=my_cache:10m max_size=10g inactive=60m use_temp_path=off;
Copy the code

To use the cache in a location, reference the shared-memory zone with the proxy_cache directive.

The cache key is set with proxy_cache_key; here $host$uri$is_args$args together form the key.

location / {
    proxy_cache my_cache;
    proxy_cache_key $host$uri$is_args$args;
    proxy_cache_valid 200 304 302 1d;
}

After adding these parameters, try stopping the upstream service and then accessing the site: it is still reachable, because the content is cached.

13. Monitoring access logs

The access log records important information about Nginx. You can use it to analyze problems or user behavior, but analyzing access.log in real time is relatively difficult.

A tool called GoAccess graphically reflects access.log changes in the browser in real time, over the WebSocket protocol, making problem analysis easy.

The GoAccess site is https://goaccess.io, and its display is graphical and very friendly.

GoAccess uses the -o parameter to generate an HTML file that renders the current contents of access.log as charts. When access.log changes, GoAccess pushes the new entries to the client over a WebSocket connection.

goaccess access.log -o report.html --log-format=COMBINED

First specify the log file for GoAccess to read (here yindong.log) and output it to the ../html/report.html file. --real-time-html keeps the page updated in real time; --time-format='%H:%M:%S' sets the time format, --date-format='%d/%b/%Y' the date format, and --log-format=COMBINED the log format.

cd logs
goaccess yindong.log -o ../html/report.html --real-time-html --time-format='%H:%M:%S' --date-format='%d/%b/%Y' --log-format=COMBINED

GoAccess can be installed using yum or wget, or you can download the source code for compilation.

Once started, GoAccess prints "WebSocket server ready to accept new client connections"; the report page opens a connection to this process, and the process pushes the latest log changes to it.

Next, add a location to nginx.conf so that requests for /report.html are served via alias:

server {
    ...
    location /report.html {
        alias /usr/local/openresty/nginx/html/report.html;
    }
    ...
}

Open localhost:8080/report.html to see the effect.

Using GoAccess is a very intuitive way to watch the access.log statistics change, and it is very helpful for analyzing how the site operates. You can see the proportion and distribution of visitors at each point in time and each day of the week, and even which countries and regions they come from and which browsers and operating systems they use.

14. SSL security protocol

SSL stands for Secure Sockets Layer, and a lot of the time you’re using TLS which is Transport Layer Security. Think of TLS as an updated version of SSL.

SSL was introduced by Netscape in 1995. Later, when Microsoft bundled its Internet Explorer browser with Windows, Netscape ran into serious difficulties and handed the SSL protocol over to the IETF.

In 1999, at Microsoft’s request, the IETF renamed SSL to TLS1.0, and in 2006, 2008, and 2018 TLS protocols 1.1, 1.2, and 1.3 were released.

So how does the TLS protocol ensure that HTTP plaintext messages are encrypted?

In the ISO/OSI seven-layer model, HTTP sits at the application layer. TLS works just below it, at the presentation layer: through handshakes, key exchange, alerts, and symmetric encryption, it encrypts data without the HTTP layer being aware of it.

When capturing packets or looking at the server configuration, you can see the cipher suite configuration, which determines how the TLS protocol keeps the plaintext encrypted. It has roughly four components.

The first component is key exchange, such as ECDHE, which is based on elliptic-curve cryptography. Key exchange is about how the browser and the server each independently derive the key they will use to encrypt data in transit. Since each side needs material from the other to derive it, public values must be exchanged.

During key exchange, the browser and server need to authenticate each other’s identity. Authentication requires an algorithm called RSA.

The communication itself is encrypted and decrypted with the symmetric algorithm AES_128-GCM. The first part, AES, names the algorithm; 128 means that of the three key lengths AES supports, the 128-bit one is used. AES has many block cipher modes; GCM is a relatively new mode that improves encryption and decryption performance on multi-core CPUs.

SHA_256 is a digest algorithm: it generates a shorter, fixed-length digest from a string of indefinite length.
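As an illustration of the fixed-length property (this is not part of the Nginx configuration), Python's standard hashlib can compute SHA-256 digests; the inputs below are arbitrary examples:

```python
import hashlib

def sha256_hex(message: bytes) -> str:
    """Return the SHA-256 digest of an arbitrary-length message as hex."""
    return hashlib.sha256(message).hexdigest()

# Inputs of very different lengths both produce a 64-hex-character
# (256-bit) digest.
print(len(sha256_hex(b"a")))          # 64
print(len(sha256_hex(b"x" * 10000)))  # 64
```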

15. Symmetric encryption, asymmetric encryption

In a symmetric-encryption scenario, two people who want to communicate, Zhang San and Li Si, share the same key. Zhang San can encrypt the original plaintext document with the key to produce a ciphertext document; when Li Si receives it, he can use the same key to restore the original plaintext. Anyone in the middle who does not hold the key cannot restore the ciphertext to the original document, even if he knows the symmetric encryption algorithm.

How symmetric encryption works can be illustrated with a stream cipher such as RC4.

It relies on XOR, a bit operation: 1 XOR 0 and 0 XOR 1 both yield 1, while 1 XOR 1 and 0 XOR 0 both yield 0.

In a scenario where 1010 is the jointly held key and 0110 is the plaintext, XORing them produces the ciphertext 1100.

  1 0 1 0   # key
^ 0 1 1 0   # plaintext (XOR, exclusive-or)
---------
  1 1 0 0   # ciphertext

XOR has a symmetric property: XORing the ciphertext with the same key yields the plaintext again.

  1 0 1 0   # key
^ 1 1 0 0   # ciphertext (XOR, exclusive-or)
---------
  0 1 1 0   # plaintext

Since the ciphertext can be fully restored to plaintext with the same key, the biggest advantage of symmetric encryption is performance: a single pass over the data produces the final ciphertext, and decryption works the same way. Asymmetric encryption performs much worse.
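The same property extends from 4 bits to whole messages. A minimal Python sketch of an XOR cipher follows (illustrative only; a real stream cipher such as RC4 derives a keystream rather than repeating a short key):

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the repeating key.

    Because XOR is symmetric, applying the same function twice with the
    same key restores the original: xor_cipher(xor_cipher(m, k), k) == m.
    """
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

plaintext = b"hello nginx"
key = b"\x0a\x06"                       # toy key, not a secure one
ciphertext = xor_cipher(plaintext, key)
restored = xor_cipher(ciphertext, key)  # the very same call decrypts
print(restored == plaintext)            # True
```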

Asymmetric encryption is based on a mathematical principle that produces a pair of keys. If one of the keys is called a public key, the other is called a private key.

The public key and the private key form a matched pair: a document encrypted with the public key can only be decrypted with the corresponding private key, and likewise, a document encrypted with the private key can only be decrypted with the public key.

Let's say Li Si has a pair of keys, public and private. He can publish his public key to everyone, and Zhang San is one of the people who obtains it. How does encryption then work?

If Zhang San wants to pass an original document to Li Si, he encrypts it with Li Si's public key and sends the ciphertext to Li Si, who decrypts it with his own private key. Even if others intercept the document, they cannot decrypt it.

original document --(encrypt with Li Si's public key)--> encrypted document --(decrypt with Li Si's private key)--> original document

Public and private keys also have a second use: authentication. For example, if Li Si encrypts a message with his private key and sends the ciphertext to Zhang San, then as long as Zhang San can decrypt it with Li Si's public key, it proves the ciphertext really came from Li Si, because only Li Si holds his private key. If Wang Wu had encrypted the document instead, Zhang San could not open it with Li Si's public key; only something encrypted with Li Si's private key can be unlocked with Li Si's public key.

16. Credibility of SSL certificates

Here's another question: how does Zhang San know that the public key he holds really belongs to Li Si? This involves a new concept, public trust. In multi-party communication there must be a publicly trusted authority, a CA, responsible for issuing and revoking certificates.

As the maintainer of the site, you are the subscriber to the certificate. You must first apply for one, which may require registering who you are, what organization you belong to, and what you intend to do.

The registration authority forwards the CSR to the CA. The CA issues a public-key certificate, which the subscriber deploys on the web server together with the private key. When a browser visits the site, the server sends the certificate to the browser, and the browser checks with the CA that the certificate is legal, valid, and has not been tampered with.

Expired and revoked certificates used to be published on CRL servers, which chain all revoked certificates together, so lookup performance is very poor. OCSP was later introduced so that the status of a single certificate can be queried directly; the browser can ask an OCSP responder, but OCSP responders are not high-performance either.

Nginx has an OCSP stapling switch. When it is turned on, Nginx actively queries the OCSP responder and caches the result, so a large number of clients can obtain the certificate's validity directly from Nginx.

There are three types of certificates.

The first is the domain-validated DV certificate. It only verifies ownership of the domain name: as long as the server the domain name points to is the server applying for the certificate, the application succeeds.

The second is the organization-validated OV certificate. Organization validation checks, at application time, whether the institution or enterprise name filled in is correct. Applying for an OV certificate often takes a few days, unlike a DV certificate, which is essentially issued in real time. OV certificates also cost much more than DV certificates, many of which are free.

Even more stringent than the OV certificate is the EV certificate. Most browsers are very friendly to the EV certificate, which will display the name of the organization that was filled in when the certificate was applied in the browser address bar.

From a security perspective, the browser treats DV, OV, and EV certificates the same; the only thing it verifies is the certificate chain.

If you click the lock icon in the browser address bar and open the certificate chain, you will find three levels: today a site's certificate chain generally consists of the root certificate, a second-level (intermediate) certificate, and the site's own certificate.

A three-level hierarchy is needed because root certificates are handled very cautiously. For example, the root certificate stores of the Windows and Android operating systems are updated perhaps once a year, so a new root CA cannot be added to operating systems and browsers quickly.

Most browsers use the operating system's certificate store; only Firefox maintains its own root certificate store. Therefore, when verifying a certificate's validity, besides checking whether it has expired, the browser mainly verifies that the root certificate is present and trusted in the store.

Nginx needs to send two certificates to the browser; the root certificate is built into the operating system or browser and does not need to be sent. The site's own certificate is sent first, followed by the level 2 certificate, and the browser then automatically validates the issuer of the level 2 certificate against the root certificates it trusts.

When communicating between a browser and a server, confirming that they are trusted means verifying that the issuer who issued the root certificate to the site is valid.

17. Nginx performance bottleneck during SSL handshake

The TLS communication process serves four main purposes.

1. Verify the identity of the peer

The browser sends a Client Hello message to the server. Browsers are very diverse and their versions change constantly, so different browsers support different security suites and encryption algorithms. This step essentially tells the server which algorithms the browser supports.

2. Reach a consensus on the security suite

Nginx has a list of the encryption algorithms it supports and an order of preference. It selects the suite it prefers most and sends the choice to the client.

If sessions are to be reused, that is, if Nginx has the session cache enabled, a client that reconnects within a day can reuse the previous key without renegotiating it.

The server Hello message mainly sends which security suite to select.

3. Pass and generate the key

Nginx sends its public key certificate to the browser. The public key certificate contains a certificate chain. The browser can find its own root certificate library to verify whether the certificate is valid.

4. Encrypt data for communication

The server sends Server Hello Done. If the negotiated security suite uses an elliptic-curve algorithm, the server sends the elliptic-curve parameters to the client. The client generates its own private key from the curve's public parameters and then sends its public key to the server.

The server likewise generates its own key pair and sends its public key to the client. The server can then derive the shared encryption key from its own private key and the client's public key.

The client can also generate a key based on the public key sent by the server and its own private key.

The key generated by the server and client is the same, which is guaranteed by the asymmetric encryption algorithm. The generated key can then be used to encrypt the data and communicate.
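This "both sides independently compute the same key" property can be sketched with classic finite-field Diffie-Hellman; ECDHE is the elliptic-curve variant of the same idea. The parameters below are toy values chosen for the demo and are far too small for real security:

```python
import secrets

p = 0xFFFFFFFB  # a small prime modulus (toy value for the demo)
g = 5           # generator (toy value)

# Each side picks a random private key and derives a public key from it.
client_priv = secrets.randbelow(p - 2) + 1
server_priv = secrets.randbelow(p - 2) + 1
client_pub = pow(g, client_priv, p)
server_pub = pow(g, server_priv, p)

# Each side combines its own private key with the peer's public key;
# since (g^a)^b == (g^b)^a mod p, both arrive at the same secret.
client_secret = pow(server_pub, client_priv, p)
server_secret = pow(client_pub, server_priv, p)
print(client_secret == server_secret)   # True
```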

TLS communication mainly does two things, the first is to exchange keys and the second is to encrypt data, and these are the main performance costs.

When optimizing Nginx's TLS performance, the algorithmic cost is what matters. For small files the handshake dominates, so QPS is the main metric; for large files the throughput of the symmetric cipher, such as AES, is what counts. Although symmetric algorithms perform very well, for very large files AES throughput is still the thing to measure.

When dealing mainly with small files, the main cost is Nginx's asymmetric-encryption performance, such as RSA; when dealing mainly with large files, the main cost is symmetric-encryption performance, such as AES.

When the workload is mostly small files, focus on the asymmetric side: consider whether the elliptic-curve algorithm's strength can be reduced. When it is mostly large files, consider whether AES can be replaced with a more efficient algorithm, or used at a smaller key length.

18. Implement an HTTPS site with a free SSL certificate

First you need a domain name such as yindong.zhiqianduan.com which is an HTTP url.

Then install the necessary tools.

On CentOS you can install it with yum, or download it with wget.

yum install python2-certbot-nginx

Once installed, the certbot command is available. With the --nginx flag, certbot automatically modifies the nginx conf accordingly. By default it looks for the nginx configuration under /usr/local/; you can use --nginx-server-root to specify the path where nginx.conf is located.

Use -d to specify the domain name for which you want to apply for a certificate, for example, yindong.zhiqianduan.com.

certbot --nginx --nginx-server-root=/usr/local/nginx/conf/ -d yindong.zhiqianduan.com

First it obtains a certificate, then waits for validation, and then deploys the certificate into nginx.conf. Finally there are two choices: either do no redirection, or redirect HTTP access to HTTPS with a 302, disabling insecure HTTP access.

You can use HTTPS to access the domain name yindong.zhiqianduan.com. https://yindong.zhiqianduan.com

Certbot adds a listen on port 443 to the server block, deploys the public-key certificate and the private key, and includes some common parameter files in the configuration.

Since the handshake is the most performance-consuming part of SSL, ssl_session_cache is added to reduce handshakes. Set to 1m, it can hold roughly 4000 sessions; that is, after a client completes its first handshake, reconnecting within the session_timeout period does not require a new handshake, and the previous key can be reused. session_timeout is set to 1440m, which is one day.

ssl_protocols indicates which TLS versions the HTTPS site supports. ssl_prefer_server_ciphers means that Nginx, not the browser, decides which suite from ssl_ciphers is used to communicate; the list is ordered, and earlier entries are preferred.

Finally, ssl_dhparam in the server block supplies the parameters used for key exchange; these parameters affect the encryption strength.
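Putting the directives described above together, a server block like the one certbot typically writes might look as follows; the certificate paths, cache zone name, and protocol list are assumptions and will vary by installation:

```nginx
server {
    listen 443 ssl;
    server_name yindong.zhiqianduan.com;

    # Certificate chain and private key deployed by certbot
    # (paths are assumptions).
    ssl_certificate     /etc/letsencrypt/live/yindong.zhiqianduan.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/yindong.zhiqianduan.com/privkey.pem;

    # 1m of shared cache holds roughly 4000 sessions; 1440m is one day.
    ssl_session_cache   shared:le_nginx_SSL:1m;
    ssl_session_timeout 1440m;

    ssl_protocols TLSv1.2 TLSv1.3;   # supported TLS versions (assumption)
    ssl_prefer_server_ciphers on;    # the server's cipher order wins
    ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem;
}
```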

19. Implement simple service in Lua language based on OpenResty

Go to the OpenResty site (openresty.org), find the latest version among the source releases, and copy its download link to download it.

wget http://openresty.org/download/openresty-1.13.6.2.tar.gz

After the download is complete, unzip the package and go to the source directory. You can see that the Openresty directory is much smaller than the source directory of nginx.

There are many modules in the Bundle directory, the core of which is the nginx source code, which means that the current OpenResty is a secondary development based on the corresponding nginx version.

Any feature that is not available in that nginx version is unlikely to appear in the corresponding OpenResty version.

The other directories fall into two categories. The first is nginx third-party C modules, whose names usually begin with ngx_. The second is Lua modules, written in Lua code, which rely on the functions those C modules provide. Compilation mainly compiles the C modules.

./configure --help | more

The help output shows that OpenResty is configured essentially the same way as nginx, but OpenResty integrates many third-party modules, such as http_echo and http_xss, which are not in the official nginx distribution. Many of these modules were written by the OpenResty authors.

The core lua_module usually cannot be removed, and if removed, the entire Lua will not work. Other configuration items are essentially the same as in official Nginx.

./configure
make install

To add Lua code to OpenResty, first open OpenResty's conf file. You can embed Lua in the file, but you cannot write Lua syntax directly into the conf, because nginx's configuration parser is not a Lua interpreter.

There are several instructions in the Nginx_lua_module of OpenResty, one of which is called content_by_lua, which is handled with Lua code during the content generation phase of HTTP request processing.

Add a location that runs Lua code when /lua is requested. So that the browser renders the output as text, add default_type text/html. Then add some simple commands inside content_by_lua to demonstrate how Lua works.

The Lua module of OpenResty provides APIs, such as ngx.say, which writes to the HTTP response; its output goes into the response body, not the header.

You can add content to the response body with ngx.say. ngx.req.get_headers retrieves the HTTP headers of the user's request; here we read the User-Agent and return its value to the browser.

server {
    server_name yindong.com;
    listen 80;

    location /lua {
        default_type text/html;
        content_by_lua 'ngx.say("User-Agent: ", ngx.req.get_headers()["User-Agent"])';
    }

    location / {
        alias html/yindong/;
    }
}

Visit /lua to see the effect.

With OpenResty's lua-nginx-module you can do a lot through the API it provides, and you can bring Lua's own libraries into the response process as well.

You can directly access services such as Redis, mysql, or Tomcat using Lua and the corresponding provided tool libraries, and then combine the different responses with program logic and return them to the user.
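As a sketch of that idea, the lua-resty-redis library bundled with OpenResty can query Redis directly from a location; the Redis address, key name, and location path below are illustrative assumptions:

```nginx
location /redis-demo {
    default_type text/html;
    content_by_lua_block {
        local redis = require "resty.redis"   -- bundled with OpenResty
        local red = redis:new()
        red:set_timeout(1000)                 -- 1s connect/read timeout

        local ok, err = red:connect("127.0.0.1", 6379)
        if not ok then
            ngx.say("failed to connect to redis: ", err)
            return
        end

        local value, err = red:get("greeting")
        if value == ngx.null then
            value = "no greeting set"
        end
        ngx.say("greeting: ", value)

        -- put the connection back into the cosocket pool
        red:set_keepalive(10000, 100)
    }
}
```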