Preface

In web development, cookies and sessions are commonly used to store user information for authentication. For simple use cases, though, developers rarely learn the internal parameters and principles of cookies and sessions. In my experience, cookies and sessions come up frequently both at work and in interviews, and because I did not understand them in depth I stumbled over them many times. So I decided to summarize cookies and sessions in one article. Since I am a PHP developer, the code below uses PHP as the example.

Why they exist

In the early days of the web, most sites were portals that only served static pages with no user interaction. As the Internet grew, interactive web systems such as Weibo appeared, with login, search, like, comment, follow and repost functions. Because HTTP is a stateless transport protocol, the server cannot tell which user is behind a request, cannot record the user's actions, and cannot link one action to the next. Authentication mechanisms such as cookies, sessions and tokens exist to solve this problem. A token is a unique random string generated by the application code. It is usually created after the user logs in, stored in the user table so that it is bound to the user, and returned to the client. The client stores the token and carries it on every subsequent request, so the server knows who the user is. The original form of token had many problems, for example the token never changed and had no expiration time. JWT later appeared to solve these problems. Since tokens are not the focus of this article, I will not go into detail.

Cookie

What is a Cookie

A common misconception is that a cookie is a cache. A cookie is really a short piece of text: a dictionary-like set of key/value pairs that lives on the user's client and is the file the browser uses to manage state. It does not relieve load on the server; it actually increases bandwidth, because it is sent along with every matching request.

Classification of cookies

1. Session cookie: stored in memory and removed automatically when the browser session ends.

2. Persistent cookie: stored on disk; it disappears automatically only when its expiration time is reached.

Cookie attributes

Take Baidu as an example. As the browser's cookie view shows (screenshot not reproduced here), visiting Baidu automatically generates many cookies, each with the fields: {Name: name, Value: value, Domain: domain name, Path: path, Expires/Max-Age: expiration time, Size: size, HttpOnly, Secure, SameSite, SameParty, Priority}

1. Name: the cookie's name. Within the same domain the name cannot be repeated.

2. Value: the cookie's value. Note that the Value must not contain semicolons, commas, spaces or other special characters. When you can guarantee the value contains none of these, it can be stored directly; when you cannot, it is best to encode the Value first.

3. Domain: the domain the cookie belongs to. If no Domain is specified when the cookie is generated, it defaults to the current domain name. If my domain is www.cookie.com, then after the following code runs, Domain is www.cookie.com:

<?php setcookie('zhugeliang', 'www.cookie.com');

Domain can be specified when cookies are generated, for example:

setcookie('zhugeliang', 'www.cookie.com', 0, '', '.cookie.com');

Note that Domain can only be the current domain name or its parent (level-1) domain. Setting Domain to any other domain name has no effect. If the current domain is www.cookie.com and Domain is set to m.cookie.com, the cookie is not set:

setcookie('zhugeliang', 'san', 0, '', 'm.cookie.com');

Difference between setting Domain to the level-1 domain and to a level-2 domain:

For example, there are two level-2 domain projects, www.cookie.com and m.cookie.com. If Domain is not specified, each project's cookies get that project's own domain, and a request to either project carries only the cookies of its own domain.

When, on www.cookie.com, Domain is set to the level-1 domain, the cookie is also available to m.cookie.com:

setcookie('domain1', 'test', 0, '', '.cookie.com');

This is also the way to implement cookie sharing across multiple domains.

4. Path: the path for which the cookie is valid. The default is '/'. Matching is done on the web route prefix. For example, a cookie generated under http://www.cookie.com/index/index with Path set to '/index':

setcookie('pathTest', 'test', 0, '/index');

is not available under http://www.cookie.com/detail/index.

Because the valid path is matched as a route prefix, the cookie is available under http://www.cookie.com/index/detail.

5. Expires/Max-Age: the cookie's expiration. By default the browser keeps a cookie only for the current session. If you want a cookie to remain for a certain amount of time, set Expires to a point in the future. Expires is an absolute time; Max-Age is a lifetime in seconds counted from the current time. A cookie with either attribute is a persistent cookie:

# Max-Age
setcookie('name', 'value', 1356663605);
// Old header:
// Set-Cookie: name=value; expires=Fri, 28-Dec-2022 03:00:05 GMT
// New header:
// Set-Cookie: name=value; Expires=Fri, 28-Dec-2022 03:00:05 GMT; Max-Age=5

The value shown for Expires/Max-Age also tells you the cookie's type. If it shows Session, it is a session cookie, which is deleted automatically when the browser closes. If it shows a time, it is a persistent cookie. Session cookie:

setcookie('zhugeliang', 'www.cookie.com');

Persistent cookie:

setcookie('zhugeliang', 'www.cookie.com', time() + 1800);

6. Size: the size of the cookie.

7. HttpOnly: if the cookie is set with HttpOnly=true, JavaScript cannot read it. This mitigates cookie theft through XSS and improves the cookie's security. This attribute is also a frequent interview question.

8. Secure: whether the cookie is transmitted only over a secure protocol. The default is false. Secure protocols such as HTTPS (SSL/TLS) encrypt data before it crosses the network. If the site is served over plain HTTP and this attribute is set to true, the cookie is not set.

9. SameSite: designed to prevent cross-site request forgery (CSRF) and protect user privacy. It has three values:

  • Strict: third-party use is completely disabled; the cookie is never sent cross-site under any circumstances.
  • Lax: some third-party requests (top-level navigations such as following a link) are allowed to carry the cookie.
  • None: the cookie is sent cross-site without restriction (modern browsers additionally require the Secure attribute for this value).

Chrome’s SameSite defaults to Lax, while Safari defaults to Strict.
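The attributes above, set deliberately on a login cookie: an added example, not code from the original post. The weak version relies on every browser default; the hardened version pins each attribute. The cookie name, the token and the domain cookie.com are assumptions.

```php
<?php
// Added example: the same login cookie issued with default attributes
// and with the hardened attributes discussed above.

// Weak: readable by JavaScript, sent over plain HTTP, and sent on
// cross-site requests unless the browser's own SameSite default steps in.
function set_login_cookie_weak(string $token): bool
{
    return setcookie('login_token', $token);
}

// Hardened: every attribute from the list above is set explicitly
// (PHP 7.3+ options-array form of setcookie).
function set_login_cookie_hardened(string $token): bool
{
    return setcookie('login_token', $token, [
        'expires'  => time() + 1800, // persistent for 30 minutes (Expires/Max-Age)
        'path'     => '/',           // valid for the whole site
        'domain'   => '.cookie.com', // shared by www.cookie.com and m.cookie.com
        'secure'   => true,          // HTTPS only; on a plain-HTTP site the cookie is silently not set
        'httponly' => true,          // invisible to document.cookie, so a script cannot read it
        'samesite' => 'Lax',         // not sent on cross-site POSTs; use 'Strict' if cross-site links need no login
    ]);
}

// The token itself must be unguessable: never derive it from the user id or the time.
$token = bin2hex(random_bytes(32));
set_login_cookie_hardened($token);
```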

Reference: Cookie SameSite parsing

10. SameParty

See "The SameParty attribute added to Cookie" for details.

11. Priority

Priority is a Chrome proposal that defines three levels: Low, Medium and High. When the number of cookies exceeds the browser's limit, lower-priority cookies are evicted first.

Reference: Cookie knowledge two

Session

Because cookies live on the client and can be tampered with, a relatively more secure authentication mechanism, the session, appeared.

What is a session

A session is the state the server keeps for a user's series of operations. It is managed by the web container and stored in a hash-table structure.

Session ID

When a browser request reaches the server, the server first checks whether the request carries a session ID. If it does, a session was already created for this client earlier. If it does not, the server creates a new session ID: a string that is unique and not easily forged. This session ID is returned to the browser as a cookie in the response.

The browser sends a request, the server generates the session and returns the session ID to the browser as a cookie named PHPSESSID. Each client corresponds to one session ID on the server. If you change session.name while the session is being generated, an additional cookie with the new name is issued.

The relationship between Session and Cookie

The session interaction between browser and server is based on the session ID. Storing the session ID depends on the cookie, so the cookie attributes described above are also supported for the session cookie.

Although sessions depend on cookies, that does not mean sessions cannot be used when cookies are disabled; you only need to agree with the client on how the session ID is exchanged. I once built a graphical verification code feature for an app: the server generated the code, the app displayed it, the user filled it in, and the app called the interface to validate it. The client was developed by imitating the browser's way of receiving and sending, so the server could read and write the session normally. Of course there are many other ways, for example returning the session ID in the interface response parameters and having the client bring it back as a parameter on its next request.

Session sharing in a cluster architecture

On a single machine, session sharing is not an issue, but most companies deploy each project as a cluster. In a distributed system without session sharing, requests land on different machines and the user is asked to log in again and again. The diagram below illustrates this:

When the user first accesses server 1, the session is generated on server 1. When the user's second request reaches server 2, no session is found there and the user is sent back to the login page. To solve this, session consistency between the servers in the cluster must be guaranteed, which is where session sharing comes in. The main approaches are the following.

1. Session replication

After a session is generated when the user accesses server 1, server 1 automatically synchronizes the session data to server 2 to keep the sessions consistent. This works while the number of users and servers is small. Once users and servers multiply, the drawbacks become obvious: synchronization consumes intranet bandwidth, synchronization performance degrades, and the memory occupied by sessions cannot be scaled horizontally.

2. Nginx load balancing:


1. Round-robin (default): each request is assigned to a different backend server in turn, in order of arrival. A backend that goes down is removed automatically.

2. Weight: specifies the round-robin probability; the weight is proportional to the share of requests. Used when backend servers have uneven performance.

3. ip_hash: each request is assigned to a server based on the hash of the client IP address, so a given visitor always lands on the same server.

4. fair (third-party): requests are assigned according to the backend servers' response times, with preference for the shortest.

5. url_hash (third-party): requests are assigned based on the hash of the URL, so each URL is directed to the same backend server. This is most effective when the backend servers act as caches.

Clearly ip_hash is the option that pins each visitor to the same server. But it obviously limits the scalability of the project: when a server fails and is removed, or when servers are added, the ip_hash algorithm redistributes visitors across servers, which brings new problems. For a project that demands high data accuracy, such as a financial system, this plainly violates the high-availability principle of the architecture, and no architect would dare adopt it.

3. Shared session storage for the server cluster

As we all know, sessions live on the server. If every server in the cluster reads and writes sessions in the same place, expanding the cluster becomes much easier: we only need to point each server's session storage configuration at that shared location. The following uses PHP to illustrate several ways of sharing session storage.

PHP already provides shared storage for sessions. In php.ini, you change how sessions are stored by changing the value of session.save_handler.

1. Files (default). Before the configuration is changed, sessions are stored in files:

[Session]
; Handler used to store/retrieve data.
; http://php.net/session.save-handler
session.save_handler = files
; The path can be defined as:
;   session.save_path = "N;/path"

session.save_handler = files means session data is saved by reading and writing files. The directory is specified by session.save_path, and each file name is the prefix sess_ followed by the session ID, for example sess_q9ga90aoc8p2i6e3fuda8ft5bf. The file contains the serialized session data.

With heavy traffic a large number of session files is generated. In that case you can set a tiered directory layout for the session files to improve efficiency, in the form "N;/save_path", where N is the number of tiers and save_path is the starting directory. When writing session data, PHP takes the client's session ID, locates the corresponding session file in the configured directory according to that ID, creates it if it does not exist, and finally serializes the data and writes it to the file. Reading session data is the reverse process: the data read from the file is deserialized to produce the session variables. When setting save_path, note that in practice PHP reports an error, because the directories for storing sessions are not created automatically:

session.save_path = "3;/var/session"

I set 3 tiers and changed the directory's user permissions, but an error was still reported.

Later I found that the ext/session/mod_files.sh file in the PHP source tree can help generate the directories. I have not studied the implementation in depth; those interested can try it themselves.

With file storage we can share sessions simply by changing the storage path: provide one server dedicated to storing sessions, and mount its storage directory on every server in the cluster.

2. MySQL database storage

Personally, I do not find saving sessions to a database meaningful.

3. Memcache

When a site receives heavy traffic, session access inevitably affects its speed, because reading files is slow. Memcache, an in-memory cache server, reads data as key->value pairs using a hash algorithm, which is much faster than reading files. The configuration is as follows:

session.save_handler = memcache
session.save_path = "tcp://MEM_SERVER_1:PORT,tcp://MEM_SERVER_2:PORT,..."

The saved key is session_id. Connect to the memcache server and get

4. Redis

Redis is, in my personal opinion, the ideal shared session store: fast, simple to configure, and highly scalable.

session.save_handler = redis   ; storage mode
session.save_path = "tcp://127.0.0.1:6379"
; if Redis requires a password:
; session.save_path = "tcp://127.0.0.1:6379?auth=password"
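Putting the session ID handling from the "Session ID" section and the shared Redis store together in application code: an added example, not code from the original post. It assumes the phpredis extension is installed and a Redis instance listens on 127.0.0.1:6379; the password is a placeholder, and login() and current_user() are hypothetical helpers.

```php
<?php
// Added example: PHP sessions for a cluster, with Redis as the shared store
// and safe handling of the session ID.

// Shared storage: every server in the cluster points at the same Redis,
// so a request landing on server 2 finds the session created on server 1.
ini_set('session.save_handler', 'redis');
ini_set('session.save_path', 'tcp://127.0.0.1:6379?auth=REDIS_PASSWORD_PLACEHOLDER');

// Only accept session IDs the server itself issued: an unknown PHPSESSID
// supplied by the client is discarded and a fresh one generated instead.
ini_set('session.use_strict_mode', '1');
// Accept the session ID from the cookie only, never from the URL.
ini_set('session.use_only_cookies', '1');

// The session cookie supports the same attributes as any other cookie.
session_set_cookie_params([
    'lifetime' => 0,     // session cookie: removed when the browser closes
    'path'     => '/',
    'domain'   => '',    // current host only; '.cookie.com' would share it across subdomains
    'secure'   => true,  // HTTPS only
    'httponly' => true,  // not readable by JavaScript
    'samesite' => 'Lax',
]);

// Reads PHPSESSID from the request, or issues a new unguessable ID.
session_start();

function login(string $username): void
{
    // Issue a fresh ID once the user is authenticated, so an ID that was
    // known before login no longer maps to the authenticated session.
    session_regenerate_id(true);
    $_SESSION['user']     = $username;
    $_SESSION['login_at'] = time();
}

function current_user(): ?string
{
    return $_SESSION['user'] ?? null;
}
```

Because the store is shared, session_regenerate_id(true) also removes the old ID from Redis for every server in the cluster, not just the one that handled the login.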