Busy – Nginx basic data structures

Before reading the nginx source code, it is important to understand the basic data structures it defines. Otherwise, it will be difficult and confusing to read.

Basic data types

The integer

To see whether nginx's integers are 64-bit, let’s look at the code.

The basic integer typedefs are defined in src/core/ngx_config.h:

typedef intptr_t        ngx_int_t;
typedef uintptr_t       ngx_uint_t;
typedef intptr_t        ngx_flag_t;

intptr_t and uintptr_t come from the system headers. intptr_t is defined in /Library/Developer/CommandLineTools/SDKs/MacOSX12.0.sdk/usr/include/sys/_types/_intptr_t.h:

typedef __darwin_intptr_t       intptr_t;

Note: these headers were read on macOS 12.

Following __darwin_intptr_t one step further, we find:

typedef long                    __darwin_intptr_t;

uintptr_t is defined in the same way:

typedef unsigned long           uintptr_t;

So on this 64-bit macOS system, the basic integer types defined by nginx, ngx_int_t and ngx_uint_t, resolve to long and unsigned long, which are 64 bits wide.
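
To check this on your own machine, a minimal sketch like the following can help. It only re-declares the typedefs locally (the nginx headers are not included), and the helper name int_width_bits is hypothetical.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>   /* intptr_t, uintptr_t */

/* Local copies of the typedefs from src/core/ngx_config.h. */
typedef intptr_t   ngx_int_t;
typedef uintptr_t  ngx_uint_t;

/* intptr_t must be able to hold a pointer, so it is at least pointer-sized. */
static_assert(sizeof(ngx_int_t) >= sizeof(void *),
              "ngx_int_t is expected to be at least pointer-sized");

/* Hypothetical helper: width of ngx_int_t in bits on this platform. */
int
int_width_bits(void)
{
    return (int) (sizeof(ngx_int_t) * CHAR_BIT);
}
```

On a typical 64-bit macOS or Linux build, int_width_bits() should return 64. On a 32-bit build it would usually return 32, which is why the result depends on the platform rather than on nginx itself.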

String type

The string types are defined in src/core/ngx_string.h:

typedef struct {
    size_t      len;
    u_char     *data;
} ngx_str_t;

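The ngx_str_t structure is just a pointer to the string data (data) together with its length (len, a size_t). Because the length is stored explicitly, the data does not have to end with a NUL byte. Next, let’s look at ngx_keyval_t: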

typedef struct {
    ngx_str_t   key;
    ngx_str_t   value;
} ngx_keyval_t;

The ngx_keyval_t structure contains a key and a value, both of which are of the basic string type ngx_str_t.
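
As a rough illustration of how a length-plus-pointer string is used, here is a small sketch. The type and macro definitions are local copies of the ones shown in this article, and str_equal is a hypothetical helper, not an nginx function.

```c
#include <assert.h>
#include <stddef.h>   /* size_t */
#include <string.h>   /* memcmp */

typedef unsigned char u_char;   /* nginx gets u_char from system headers */

/* Local copies of the definitions from src/core/ngx_string.h. */
typedef struct {
    size_t   len;
    u_char  *data;
} ngx_str_t;

typedef struct {
    ngx_str_t  key;
    ngx_str_t  value;
} ngx_keyval_t;

#define ngx_string(str)  { sizeof(str) - 1, (u_char *) str }

/*
 * Hypothetical helper: returns 1 if both strings hold the same bytes.
 * The comparison uses len, never a terminating NUL.
 */
int
str_equal(const ngx_str_t *a, const ngx_str_t *b)
{
    assert(a != NULL && b != NULL);

    return a->len == b->len
           && (a->len == 0 || memcmp(a->data, b->data, a->len) == 0);
}
```

A key/value pair can then be initialized as ngx_keyval_t kv = { ngx_string("Host"), ngx_string("example.com") }; and its key compared with str_equal(&kv.key, &other).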

typedef struct {
    unsigned    len:29;

    unsigned    valid:1;
    unsigned    no_cacheable:1;
    unsigned    not_found:1;

    u_char     *data;
} ngx_variable_value_t;

The ngx_variable_value_t structure looks more complex and contains more fields.

I’m not very familiar with C, but the names and the layout make it possible to guess what most of the fields mean. Don’t be afraid to guess when reading source code: make a guess, then check it against a book or the Internet. This builds confidence for a first reading of a large, well-written codebase, and it also improves your ability to learn on your own.

Let’s look at the fields one by one:

  • len: the length of the value. The ":29" makes it a 29-bit bit-field of type unsigned, so it is not a full-width integer like long
  • valid: a 1-bit flag that says whether the value is valid, that is, whether it can be used
  • no_cacheable: a 1-bit flag that says the value must not be cached
  • not_found: a 1-bit flag that says the variable was not found. Like the two fields above it, it can be regarded as a flag bit
  • *data: points to the content of the value
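
To make the bit-field idea concrete, here is a small sketch. The structure is a local copy of the one above, while VV_MAX_LEN and vv_set are hypothetical names used only for this illustration.

```c
#include <assert.h>

typedef unsigned char u_char;

/* Local copy of the structure from src/core/ngx_string.h. */
typedef struct {
    unsigned    len:29;

    unsigned    valid:1;
    unsigned    no_cacheable:1;
    unsigned    not_found:1;

    u_char     *data;
} ngx_variable_value_t;

/* Largest length that fits in the 29-bit len field (hypothetical name). */
#define VV_MAX_LEN  ((1u << 29) - 1)

/* Hypothetical helper: fill in a value that was found and is valid. */
void
vv_set(ngx_variable_value_t *vv, u_char *data, unsigned len)
{
    assert(len <= VV_MAX_LEN);   /* a larger length would be truncated */

    vv->len = len;
    vv->valid = 1;
    vv->no_cacheable = 0;
    vv->not_found = 0;
    vv->data = data;
}
```

The 29-bit length and the three 1-bit flags add up to 32 bits, so a compiler will typically pack them into a single unsigned int, although the exact layout is implementation-defined. Storing a value larger than VV_MAX_LEN in len keeps only the low 29 bits.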

Let’s look at some methods for initializing or assigning strings:

#define ngx_string(str)     { sizeof(str) - 1, (u_char *) str }
#define ngx_null_string     { 0, NULL }

#define ngx_str_set(str, text)                                               \
    (str)->len = sizeof(text) - 1; (str)->data = (u_char *) text
#define ngx_str_null(str)   (str)->len = 0; (str)->data = NULL

ngx_str_set and ngx_str_null are only available from version 0.8 onward, but that doesn’t stop us from learning about them.

If an nginx string variable has already been declared and is assigned afterwards, the ngx_str_set and ngx_str_null macros must be used, for example:

/* correct: initialize at declaration */
ngx_str_t str1 = ngx_string("hello nginx");
ngx_str_t str2 = ngx_null_string;

/* error: assign after declaration */
ngx_str_t str1, str2;
str1 = ngx_string("hello nginx");   /* compile error */
str2 = ngx_null_string;             /* compile error */

/* to assign after declaration, use the following macros instead */
ngx_str_t str1, str2;
ngx_str_set(&str1, "hello nginx");
ngx_str_null(&str2);
/* Note: the argument of ngx_string and ngx_str_set must be a string literal, not a string variable, because the length is computed with sizeof */
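
The reason for the string-literal restriction is that sizeof("hello nginx") is the size of the character array (12 bytes including the NUL), whereas sizeof applied to a char * variable is only the size of the pointer. A sketch of the difference, where str_set_runtime is a hypothetical helper and not part of nginx:

```c
#include <assert.h>
#include <stddef.h>   /* size_t */
#include <string.h>   /* strlen */

typedef unsigned char u_char;

typedef struct {
    size_t   len;
    u_char  *data;
} ngx_str_t;

/* Same definition as in src/core/ngx_string.h. */
#define ngx_str_set(str, text)                                               \
    (str)->len = sizeof(text) - 1; (str)->data = (u_char *) text

/*
 * Hypothetical helper (not part of nginx): set an ngx_str_t from a
 * NUL-terminated string whose length is only known at run time.
 */
void
str_set_runtime(ngx_str_t *str, char *text)
{
    assert(str != NULL && text != NULL);

    str->len = strlen(text);      /* measured at run time, not by sizeof */
    str->data = (u_char *) text;
}
```

With char *p = "hello nginx";, calling ngx_str_set(&s, p) would store sizeof(p) - 1 as the length (7 on a typical 64-bit system) instead of 11, while str_set_runtime(&s, p) stores 11. Note also that ngx_str_set expands to two statements, so it needs braces when used as the body of an if.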

Memory pool type

The ngx_pool_t type is also very common in the nginx source code.

First, the ngx_pool_t and ngx_chain_t types are declared in the file src/core/ngx_core.h:

typedef struct ngx_pool_s   ngx_pool_t;
typedef struct ngx_chain_s  ngx_chain_t;

Then the structure itself is defined in src/core/ngx_palloc.h:

struct ngx_pool_s {
    u_char               *last;
    u_char               *end;
    ngx_pool_t           *current;
    ngx_chain_t          *chain;
    ngx_pool_t           *next;
    ngx_pool_large_t     *large;
    ngx_pool_cleanup_t   *cleanup;
    ngx_log_t            *log;
};
  • *last: the end of the memory allocated so far, that is, the start of the next available memory
  • *end: the end of the memory pool
  • *current: points to the current memory pool
  • *next: points to the next memory pool
  • *chain: points to an ngx_chain_t structure
  • *large: a linked list of large blocks, that is, allocations bigger than the pool's maximum size
  • *cleanup: a list of cleanup handlers that run when the memory pool is destroyed
  • *log: the log used for messages about memory allocation

The memory pool structure looks fairly complicated, but it is not hard to understand: it is a block of memory plus some linked-list pointers, information about large blocks and cleanup, and a log.

Before looking at the structure of ngx_chain_t, let’s look at the buffer type ngx_buf_t.
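
The role of last and end is easiest to see in a toy version. The following is a heavily simplified sketch, not nginx's allocator: it has a single block and no alignment, large list, or cleanup handlers, and all mini_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>   /* size_t */
#include <stdlib.h>   /* malloc, free */

typedef unsigned char u_char;

/* Hypothetical single-block pool. */
typedef struct {
    u_char  *start;   /* beginning of the block, kept for free() */
    u_char  *last;    /* start of the next free byte */
    u_char  *end;     /* one past the last byte of the block */
} mini_pool_t;

/* Create a pool with size usable bytes; returns 0 on success, -1 on failure. */
int
mini_pool_init(mini_pool_t *pool, size_t size)
{
    assert(pool != NULL);

    pool->start = malloc(size);
    if (pool->start == NULL) {
        return -1;
    }

    pool->last = pool->start;
    pool->end = pool->start + size;
    return 0;
}

/* Hand out size bytes by advancing last; NULL if the block is exhausted. */
void *
mini_palloc(mini_pool_t *pool, size_t size)
{
    u_char  *p;

    if ((size_t) (pool->end - pool->last) < size) {
        return NULL;   /* this sketch simply fails when the block is full */
    }

    p = pool->last;
    pool->last += size;
    return p;
}

/* Release the whole pool at once; single allocations are never freed. */
void
mini_pool_destroy(mini_pool_t *pool)
{
    free(pool->start);
    pool->start = pool->last = pool->end = NULL;
}
```

Allocation is just a pointer bump, which is why a pool is cheap to use: after mini_pool_init(&pool, 16), two calls to mini_palloc for 10 and 6 bytes return adjacent addresses and leave last equal to end.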

The buffer

First, a few words on the concept of a buffer:

A buffer is a part of memory space. In other words, a certain amount of storage space is reserved in the memory space, which is used to buffer the input or output data. This part of the reserved space is called the buffer, and obviously the buffer has a certain size.

Why introduce it?

High-speed and low-speed devices are mismatched in speed, which inevitably makes the high-speed device spend time waiting for the low-speed one. A buffer can be placed between them to absorb this mismatch.

Buffer functions:

  1. Data can be sent directly to the buffer, so the high-speed device no longer needs to wait for the low-speed device, which improves the efficiency of the computer.

For example, when we print a document on a printer (an I/O device), printing is relatively slow. We can first output the document to a buffer, which frees the CPU, and let the printer take the data from the buffer and print it gradually.

  2. The number of read and write operations can be reduced. If only a little data is transferred at a time, many transfers are needed, and time is wasted because starting and terminating each I/O operation takes a long time. If the data is first collected in a buffer and transferred only once the buffer is full, the number of reads and writes drops sharply, which saves time.

For example, when we want to write data to disk, we do not write it immediately. We write it to a buffer first, and only when the buffer is full do we write it to disk. This reduces the number of disk writes. When does this kind of data get written to disk? Logs are the obvious case, and almost every framework writes logs.

So, in simple terms, a buffer is a block of memory between an I/O device and the CPU that is used to store data. It lets a low-speed I/O device work smoothly with a high-speed CPU, preventing the I/O device from tying up the CPU so that the CPU can work efficiently.
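
The second function, fewer writes, can be sketched with a tiny buffered writer. Everything here is hypothetical illustration code: the "device" is simulated by a counter, and WBUF_SIZE is deliberately small so that the effect is visible.

```c
#include <assert.h>
#include <stddef.h>   /* size_t */
#include <string.h>   /* memcpy */

#define WBUF_SIZE  8   /* tiny on purpose, so the example flushes often */

/* Hypothetical buffered writer; flushes counts simulated slow writes. */
typedef struct {
    unsigned char  data[WBUF_SIZE];
    size_t         used;      /* bytes currently held in the buffer */
    size_t         flushes;   /* number of times the buffer was written out */
} wbuf_t;

/* Stand-in for a slow device write: here it only empties the buffer. */
void
wbuf_flush(wbuf_t *wb)
{
    if (wb->used > 0) {
        wb->used = 0;
        wb->flushes++;
    }
}

/* Append len bytes, flushing only when the buffer becomes full. */
void
wbuf_write(wbuf_t *wb, const void *src, size_t len)
{
    const unsigned char  *p = src;
    size_t                n;

    assert(wb != NULL && (src != NULL || len == 0));

    while (len > 0) {
        n = WBUF_SIZE - wb->used;   /* free space left in the buffer */
        if (n > len) {
            n = len;
        }

        memcpy(wb->data + wb->used, p, n);
        wb->used += n;
        p += n;
        len -= n;

        if (wb->used == WBUF_SIZE) {
            wbuf_flush(wb);
        }
    }
}
```

With this sketch, twenty one-byte calls to wbuf_write cause only two flushes, plus one final wbuf_flush for the remaining four bytes, instead of twenty separate device writes.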

As you have no doubt guessed, the buffer type is defined in the src/core/ngx_buf.h file:

typedef struct ngx_buf_s  ngx_buf_t;

struct ngx_buf_s {
    u_char          *pos;
    u_char          *last;
    off_t            file_pos;
    off_t            file_last;

    u_char          *start;         /* start of buffer */
    u_char          *end;           /* end of buffer */
    ngx_buf_tag_t    tag;
    ngx_file_t      *file;
    ngx_buf_t       *shadow;


    /* the buf's content could be changed */
    unsigned         temporary:1;

    /*
     * the buf's content is in a memory cache or in a read only memory
     * and must not be changed
     */
    unsigned         memory:1;

    /* the buf's content is mmap()ed and must not be changed */
    unsigned         mmap:1;

    /* recyclable, i.e. these bufs can be released */
    unsigned         recycled:1;
    unsigned         in_file:1;
    unsigned         flush:1;
    unsigned         sync:1;
    unsigned         last_buf:1;
    unsigned         last_in_chain:1;

    unsigned         last_shadow:1;
    unsigned         temp_file:1;

    unsigned         zerocopy_busy:1;

    /* STUB */ int   num;
};

At first glance the buffer structure looks complicated, so let’s go through the main fields:

  • *pos: the start of the buffer's data in memory
  • *last: the end of the buffer's data in memory
  • file_pos and file_last: the same as *pos and *last, but for data that is in a file
  • *start and *end: since the actual data may be spread over multiple buffers, start and end point to the beginning and end of the buffer's memory block, while pos and last point to the beginning and end of the actual data held in the buffer
  • *file: points to the file object corresponding to the buffer
  • *shadow: a shadow buffer of the current buffer. When one buffer copies data from another buffer, shadow pointers that point to each other are set up
  • temporary: if 1, the buf's content is in a user-created memory block and can be changed by filters
  • memory: if 1, the buf's content is in memory and cannot be changed by filters
  • mmap: if 1, the buf's content is in memory that was mapped from a file with mmap (described later) and cannot be changed by filters
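
The relationship between the four memory pointers can be sketched as follows. The structure is a hypothetical reduced version that keeps only pos, last, start and end, and the helper names are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>   /* size_t */

typedef unsigned char u_char;

/* Hypothetical reduced buffer: only the four memory pointers of ngx_buf_t. */
typedef struct {
    u_char  *pos;     /* start of the data */
    u_char  *last;    /* end of the data */
    u_char  *start;   /* start of the memory block */
    u_char  *end;     /* end of the memory block */
} mini_buf_t;

/* Bytes of data currently held in the buffer. */
size_t
mini_buf_size(const mini_buf_t *b)
{
    /* the data window always lies inside the memory block */
    assert(b->start <= b->pos && b->pos <= b->last && b->last <= b->end);

    return (size_t) (b->last - b->pos);
}

/* Bytes still free after the data, up to the end of the block. */
size_t
mini_buf_free(const mini_buf_t *b)
{
    return (size_t) (b->end - b->last);
}
```

For a 16-byte block with pos at offset 2 and last at offset 10, the buffer holds 8 bytes of data and has 6 bytes free. Consuming data means moving pos forward, and appending data means moving last forward, while start and end stay fixed.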

Next, ngx_chain_t:

The ngx_chain_t data type is a linked list structure related to the buffer type ngx_buf_t and is defined as follows:

struct ngx_chain_s {
    ngx_buf_t    *buf;
    ngx_chain_t  *next;
};
  • *buf: points to the current buffer
  • *next: points to the next chain link, forming a linked list
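
Walking such a list is a plain linked-list loop. Here is a minimal sketch that adds up the data held by every buffer in a chain. The cbuf_t and chain_link_t types are hypothetical reduced stand-ins for ngx_buf_t and ngx_chain_t, and chain_size is not an nginx function.

```c
#include <assert.h>
#include <stddef.h>   /* size_t, NULL */

typedef unsigned char u_char;

/* Hypothetical reduced types: only the fields needed for traversal. */
typedef struct cbuf_s        cbuf_t;
typedef struct chain_link_s  chain_link_t;

struct cbuf_s {
    u_char  *pos;    /* start of the data */
    u_char  *last;   /* end of the data */
};

struct chain_link_s {
    cbuf_t        *buf;
    chain_link_t  *next;
};

/* Total number of data bytes held by all buffers in the chain. */
size_t
chain_size(const chain_link_t *cl)
{
    size_t  total = 0;

    for ( /* void */ ; cl != NULL; cl = cl->next) {
        assert(cl->buf != NULL);
        total += (size_t) (cl->buf->last - cl->buf->pos);
    }

    return total;
}
```

Because each link only stores two pointers, data that is spread over several buffers can be treated as one logical stream without copying it into a single block.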

The schematic diagram of the structure is as follows:

Summary

This article described the basic data structures of nginx. Knowing them makes the business logic that follows much easier to understand.
