
Last week our team adopted JWT (JSON Web Token), a session-less approach, for the first time to authenticate user accounts. Along the way we found that many articles on the Internet get tokens wrong. This article therefore sorts out cookies, sessions, and tokens (JWT tokens) and how they relate to each other.

Cookie

HTTP/0.9 appeared in 1991. It existed only for browsing web documents, so it supported only GET requests: the client fetched a page and the connection was closed, with nothing linking one connection to the next. That is why HTTP is stateless: there was no such requirement to begin with.

But with the rise of the interactive Web (where you can not only browse but also log in, post comments, shop, and so on), simply browsing pages no longer met people's needs. Online shopping, for example, has to remember the user's shopping cart, which requires a mechanism that links one connection to the next so that we know whose cart an item was added to. That is how cookies were born.

A cookie (often used in the plural, cookies) is a small piece of text data. Some websites store it (sometimes encrypted) on the user's local terminal to identify the user and track the session, and the client keeps it temporarily or permanently.

The working mechanism is as follows

Take adding to the shopping cart as an example. On each request, the server stores the product ID in a cookie and returns it to the client. The client saves the cookie locally and sends it back to the server on the next request. In this way the cookie always holds every product ID the user has added, and the cart contents are not lost.
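The round trip described above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not how any particular shop implements it: the cookie name `cart`, the "-" separator, and the function `add_to_cart` are all assumptions made for the example.

```python
from http.cookies import SimpleCookie


def add_to_cart(cookie_header: str, product_id: str) -> str:
    """Server side: read the cart from the request's Cookie header,
    append one product ID, and return the cookie to send back."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    # "cart" is a hypothetical cookie holding product IDs joined by "-".
    items = jar["cart"].value.split("-") if "cart" in jar else []
    items.append(product_id)
    out = SimpleCookie()
    out["cart"] = "-".join(items)
    return out["cart"].OutputString()


# The browser echoes back whatever the server set on the previous response.
cookie = ""
for pid in ["101", "205", "333"]:
    cookie = add_to_cart(cookie, pid)
    print(cookie)
```

Each printed line is longer than the one before it, because the whole cart travels with every request.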

Look closely at this flow and you will find that as the number of items in the cart grows, the cookie sent with each request gets bigger and bigger, which is a heavy burden on every request. I only want to add one item to the cart, so why should I send the whole history of products back to the server? The server can perfectly well keep the shopping cart information itself. How can we improve this?

Session

Think about it: since the user's shopping cart is stored on the server, the cookie only needs to carry enough information to identify the user, so the server knows who is adding to the cart. Each request then carries only the user's identity in the cookie, and the request body carries only the ID of the item being added this time, which greatly reduces the cookie size. We call this mechanism, which identifies which user sent which request, the session mechanism, and the generated string that identifies the user is called the sessionId. It works as follows:

  1. First, the user logs in. The server creates a session for the user and assigns it a unique sessionId. The sessionId is bound to that user: given the sessionId (say ABC), the server can find out which user it belongs to. The sessionId is then passed to the browser in a cookie.
  2. After that, every time the browser adds an item to the cart, it only needs to send the key-value pair sessionId=ABC in the cookie. The server finds the corresponding user by sessionId and saves the product ID to that user's cart on the server.
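The two steps above can be sketched as follows. This is a simplified, single-process illustration: the dictionaries `SESSIONS` and `CARTS` stand in for whatever storage a real server uses, and the function names are hypothetical.

```python
import secrets
from typing import Dict, List

# In-memory stand-ins for server-side storage (hypothetical names).
SESSIONS: Dict[str, str] = {}      # sessionId -> userId
CARTS: Dict[str, List[str]] = {}   # userId -> product IDs


def login(user_id: str) -> str:
    """Step 1: create a session and return the cookie for the browser."""
    session_id = secrets.token_hex(16)
    SESSIONS[session_id] = user_id
    return f"sessionId={session_id}"


def add_to_cart(cookie: str, product_id: str) -> List[str]:
    """Step 2: resolve the user from the sessionId, then update the server-side cart."""
    session_id = cookie.split("=", 1)[1]
    user_id = SESSIONS.get(session_id)
    if user_id is None:
        raise PermissionError("unknown session, please log in again")
    CARTS.setdefault(user_id, []).append(product_id)
    return CARTS[user_id]


cookie = login("alice")
add_to_cart(cookie, "101")
add_to_cart(cookie, "205")
```

However many items the cart holds, the cookie stays the same fixed size, because only the sessionId travels with the request.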

As you can see, the cookie no longer needs to carry every product ID in the cart, which greatly reduces the request burden!

It is also easy to see from the above that cookies are stored on the client while sessions are stored on the server, and that the sessionId reaches the server only because a cookie carries it.

Session pain points

Cookie + session seems to solve the problem, but we have overlooked something: the scheme above works because we assumed a single server machine. In production, to ensure high availability, there are usually at least two machines, with a load balancer deciding which machine each request goes to.

After the client sends a request, the load balancer (such as Nginx) decides which machine handles it.

Suppose the login request goes to machine A. Machine A creates a session, puts the sessionId in a cookie, and returns it to the browser. The problem: if the next add-to-cart request lands on B or C, those machines cannot find the session, because it was created on A. Adding to the cart fails and the user has to log in again. What should we do? There are three main solutions.

1. Session replication

Machine A creates a session and copies it to B and C, so every machine has a copy. Whichever machine the add-to-cart request reaches, the session can be found.

While this approach is feasible, the disadvantages are obvious:

  1. The same session is stored in multiple copies, so the data is redundant.
  2. With a few nodes this is fine, but with many nodes it is not. Companies such as Alibaba and WeChat, with DAU on the order of 100 million, may need to deploy tens of thousands of machines, and the replication overhead grows with the number of nodes.

2. Session stickiness (sticky sessions)

That is, once a browser's login request has gone to machine A, all of its subsequent add-to-cart requests also go to machine A. Nginx supports this kind of stickiness, with binding by IP address (ip_hash) or by cookie (a sticky module). For example, binding by IP address looks like this:

upstream tomcats {
    ip_hash;
    server 10.1.1.107:88;
    server 10.1.1.132:80;
}

Each time a client request arrives at Nginx, as long as the client's IP address has not changed, the IP hash yields the same value and the request goes to the same machine, so the session can always be found.
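The idea behind IP-based stickiness can be sketched like this. It is a simplified stand-in and not Nginx's actual hash function: the use of SHA-256 and the name `pick_server` are assumptions made for illustration, and the server list simply repeats the addresses from the config above.

```python
import hashlib

SERVERS = ["10.1.1.107:88", "10.1.1.132:80"]  # same upstream list as the config above


def pick_server(client_ip: str) -> str:
    """Map a client IP to one backend; the same IP always gets the same answer.
    Simplified illustration only, not Nginx's real ip_hash algorithm."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(SERVERS)
    return SERVERS[index]


first = pick_server("203.0.113.7")
second = pick_server("203.0.113.7")
```

The mapping holds only while both inputs stay fixed: if the client's IP changes, or a machine is added to or removed from the list, the request may land on a machine that has never seen the session.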

3. Session sharing

This is also the approach most widely adopted by large companies today. Sessions are stored in middleware such as Redis or memcached, and when a request arrives, any machine can fetch the session from that middleware.

The disadvantages are not hard to spot: every request has to fetch the session from Redis, which adds an internal network round trip and costs a little performance. In addition, Redis itself has to be clustered to be highly available. Large companies generally run Redis clusters already, so this is their first choice.
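A rough sketch of session sharing follows. To keep it self-contained, a plain in-process class `SharedStore` stands in for Redis or memcached, and the `Machine` class and `session:` key prefix are hypothetical names chosen for the example.

```python
import secrets
from typing import Dict, Optional


class SharedStore:
    """Stand-in for Redis or memcached: one store that every machine can reach."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def set(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


class Machine:
    """One application server behind the load balancer."""

    def __init__(self, name: str, store: SharedStore) -> None:
        self.name = name
        self.store = store

    def login(self, user_id: str) -> str:
        session_id = secrets.token_hex(16)
        self.store.set(f"session:{session_id}", user_id)
        return session_id

    def whoami(self, session_id: str) -> Optional[str]:
        # One extra lookup against the shared store on every request.
        return self.store.get(f"session:{session_id}")


store = SharedStore()
machine_a, machine_b = Machine("A", store), Machine("B", store)
session_id = machine_a.login("alice")   # login handled by A
user = machine_b.whoami(session_id)     # later request handled by B
```

Because both machines read from the same store, it no longer matters which one the load balancer picks. The price is the lookup inside `whoami`, which in a real deployment is a network call.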

Token: no session!

From the analysis above, we know that sharing sessions on the server side lets any machine identify the user. But there is a small flaw: do I really need to build a Redis cluster just for an authentication mechanism? Large companies commonly run Redis, but a small company's traffic may not justify it. Is there a way to authenticate users without storing sessions on the server at all? That is today's protagonist: the token.

First, the requester submits a username and password. Once they check out, the server generates a token and returns it. The client saves the token locally and puts it in the request header on every subsequent request to the server.

You have probably spotted two problems:

1. The token is stored only in the browser, not on the server. Couldn't I just forge a token and send it to the server?

A: The server has a verification mechanism to verify that the token is valid.

2. With a session, the server can look up the userId by sessionId. How does it find the userId from a token?

A: The token itself can carry the UID, which the server reads after decoding the token.

First question: how is a token verified? We can borrow the signature mechanism used in HTTPS. Let's start with the components of a JWT token.

A token consists of three main parts:

  1. Header: specifies the signature algorithm
  2. Payload: non-sensitive data, such as the user ID and expiration time
  3. Signature: the server reads the signature algorithm from the header, then uses its key to sign header + payload

When the server receives a token from the browser, it extracts the header and payload, computes a signature over them with its key, and compares the result with the signature in the token. If the two match, the token is valid, and the server can read the userId straight from the payload instead of fetching it from a session in Redis.

Voiceover: the header and payload are actually transmitted in base64-encoded form. This step is omitted for ease of description.

As long as the server keeps the key secret, the tokens it issues are safe: a forged token will not pass signature verification and will be rejected as invalid.
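The signing and verification flow can be sketched with the standard library alone. This is an illustration under stated assumptions, not a replacement for a maintained JWT library: it assumes HMAC-SHA256 (the "HS256" algorithm), the key `SECRET`, the claim names `uid` and `exp`, and all function names are chosen for the example. For simplicity the verifier always uses HS256 rather than reading the algorithm from the header.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"server-side-secret"  # hypothetical key; it must never leave the server


def b64url(data: bytes) -> str:
    """base64url-encode without padding, as JWT does."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(header_b64: str, payload_b64: str) -> str:
    message = f"{header_b64}.{payload_b64}".encode()
    return b64url(hmac.new(SECRET, message, hashlib.sha256).digest())


def issue_token(user_id: str, ttl_seconds: int = 3600) -> str:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    claims = {"uid": user_id, "exp": int(time.time()) + ttl_seconds}
    payload = b64url(json.dumps(claims).encode())
    return f"{header}.{payload}.{sign(header, payload)}"


def verify_token(token: str) -> Optional[dict]:
    """Return the payload if the signature matches and the token has not expired."""
    try:
        header, payload, signature = token.split(".")
    except ValueError:
        return None
    # Recompute the signature with the server's key and compare in constant time.
    if not hmac.compare_digest(signature.encode(), sign(header, payload).encode()):
        return None
    claims = json.loads(b64url_decode(payload))
    if claims["exp"] < time.time():
        return None
    return claims


token = issue_token("42")
claims = verify_token(token)  # the uid comes straight out of the payload
```

Changing even one character of the payload changes the expected signature, so a tampered token fails the comparison unless the attacker also knows `SECRET`. Note that the payload is only encoded, not encrypted: anyone holding the token can read it.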

This avoids having to store anything on the server, so the scheme works across any number of machines without shared storage. Note, however, that once the server issues a token, the token stays valid until it expires; there is no built-in way to revoke it. The server could keep a blacklist and treat any token on it as invalid, but then the blacklist has to be stored on the server, which takes us back to the session model, so why not just use sessions? The usual practice is therefore for the client to delete its local copy of the token on logout, so that it is no longer sent, and to obtain a new token at the next login.

Also note that tokens are generally sent in the Authorization request header rather than in cookies, mainly to get around the fact that cookies cannot be shared across domains (detailed below).

A simple summary of cookies and tokens

What are the limitations of cookies?

1. Cookies cannot be shared across sites, so implementing single sign-on (SSO) across multiple applications (systems) with cookies is very difficult (it takes fairly complicated tricks; if you are interested, see the link at the end of the article).

Voiceover: Single sign-on (SSO) is a system in which users log in once to access all trusted applications.

With tokens, however, implementing SSO is very simple:

Put the token in the Authorization header (or a custom header) of each request, and every site, on whatever domain, can authenticate it.
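As a small sketch of what this looks like in practice, the code below attaches the same token to requests for two different domains and shows how a server might pull it back out. The hostnames are made-up examples, the `Bearer` prefix is a widespread convention that the article itself does not specify, and no request is actually sent.

```python
import urllib.request
from typing import Dict, Optional


def build_request(url: str, token: str) -> urllib.request.Request:
    """Client side: attach the token explicitly; nothing is added automatically."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def extract_token(headers: Dict[str, str]) -> Optional[str]:
    """Server side: read the token back out of the Authorization header."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    return token if scheme == "Bearer" and token else None


token = "header.payload.signature"  # placeholder for a real JWT
shop = build_request("https://shop.example.com/cart", token)
pay = build_request("https://pay.example.org/orders", token)
```

Unlike a cookie, the header is not tied to a domain: the client decides which requests carry it, which is why the same token can authenticate against sites on different domains.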

2. Native requests on mobile do not handle cookies the way browsers do, and the sessionId depends on cookies, so the sessionId cannot be passed by cookie there. A token has no such problem because it travels in the Authorization header. In other words, tokens naturally support mobile platforms, and indeed all platforms, and scale well.

To sum up, tokens are simple to store and scale well.

What are the disadvantages of tokens?

So are tokens all upside? Not quite. Tokens have the following two disadvantages:

1. Token is too long

A token contains the encoded header and payload plus the signature, so it is usually much longer than a sessionId and may well exceed the cookie size limit (commonly about 4 KB per cookie). The more information you put in the token, the longer it gets, and since the token is sent with every request, that adds to the burden on each request.

2. Not very safe

Many articles on the Internet say tokens are more secure. In fact they are not. We said the token is stored in the browser, but where exactly? Since it may be too long for a cookie, it usually ends up in local storage, which is a serious security problem, because local storage can be read directly by JavaScript. In addition, as mentioned above, a token cannot be revoked once issued; it stays valid until it expires, so even if the server detects a security threat, it cannot invalidate the token.

Therefore, tokens are better suited to one-time command authentication with a short validity period.

A common misconception: cookies are less secure than tokens because of attacks such as CSRF

First we need to explain how CSRF attacks work

An attacker tricks the user's browser into visiting a website the user has already authenticated with and performing operations there (such as sending email or messages, or even financial operations such as transferring money or buying goods). Since the browser is authenticated (its cookies carry authentication information such as the sessionId), the site treats the request as a genuine user operation and executes it.

For example, suppose a user logs in to a bank's website (say www.examplebank.com), whose withdrawal address is www.examplebank.com/withdraw?am… After login, the cookie contains the logged-in user's sessionId, and an attacker can place the following code on another website:

<img src="http://www.examplebank.com/withdraw?account=Alice&amount=1000&for=Badman">

If a logged-in user then opens the attacker's page, or clicks the image by mistake, the browser sends the request to the bank's domain and automatically attaches the cookie, which contains the sessionId of the logged-in user. The server carries out the transfer above, which is a serious security risk.

The root cause of CSRF is that the browser automatically attaches a domain's cookies to every request sent to that domain. This is browser behavior, and it is why many people believe cookies are insecure.

Using tokens does avoid the CSRF problem, but as mentioned above, tokens kept in local storage can be read by JavaScript, so they are not secure from a storage perspective (in fact, the standard defense against CSRF is a CSRF token).
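For readers unfamiliar with CSRF tokens, here is a minimal sketch of the idea. The in-memory dictionary `CSRF_TOKENS` and both function names are hypothetical; real frameworks ship their own implementations.

```python
import hmac
import secrets
from typing import Dict

CSRF_TOKENS: Dict[str, str] = {}  # sessionId -> CSRF token (hypothetical in-memory store)


def issue_csrf_token(session_id: str) -> str:
    """Generate a random token and embed it in the pages the site itself serves."""
    token = secrets.token_urlsafe(32)
    CSRF_TOKENS[session_id] = token
    return token


def is_request_allowed(session_id: str, submitted_token: str) -> bool:
    """Accept a state-changing request only if it carries the matching token."""
    expected = CSRF_TOKENS.get(session_id)
    if expected is None:
        return False
    return hmac.compare_digest(expected.encode(), submitted_token.encode())


page_token = issue_csrf_token("ABC")
genuine = is_request_allowed("ABC", page_token)  # form submitted from the real site
forged = is_request_allowed("ABC", "")           # request triggered from another site
```

The browser still attaches the session cookie to the forged request, but the attacker's page cannot read the token embedded in the real site's pages, so the forged request arrives without it and is rejected.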

So neither cookies nor tokens are secure from the perspective of storage. When we talk about security, we mostly mean security in transit: sending requests over HTTPS encrypts the request headers and protects the data in transit.

In fact, comparing cookies with tokens is not quite right: one is a storage method, the other an authentication method. The proper comparison is session versus token.

Conclusion

Sessions and tokens are essentially the same thing: both are mechanisms for authenticating a user's identity. They differ in how verification works: session state is stored on the server (for example in Redis) and looked up on each request, while a token is stored on the client and verified by checking its signature. Sessions are the more reasonable choice in most scenarios, while tokens fit single sign-on and one-time commands better. Choose the mechanism that suits each business scenario to get twice the result with half the effort.

Shoulders of giants

  • Cookie Session cross-site cannot share issue (single sign-on solution) : blog.csdn.net/wtopps/arti…
  • Stop using JWT for sessions: cryto.net/~joepie91/b…
