Cookies
When designing web applications (especially the traditional HTML kind), you will at some point have to figure out how to log users in and keep them logged in between requests.
The core mechanism we use for this is cookies. Cookies are small strings sent by the server to the client. The client stores the string and repeats it in subsequent requests. We could store a “user ID” in the cookie, and for any future request we would know which user the client is.
But this is very insecure. The information lives in the browser, which means that the user can change the user ID and be identified as a different user.
Sessions
The traditional way to solve this problem is what’s called “sessions”. I don’t know what the earliest use of sessions was, but they are in every web framework and have been for as long as web frameworks have existed.
Sessions and cookies are often described as two different things, but they are not. A session needs a cookie to work.
Cookie: MY_SESSION_ID=WW91IGdvdCBtZS4gRE0gbWUgb24gdHdpdHRlciBmb3IgYSBmcmVlIGNvb2tpZQ
Instead of sending the client a predictable user ID, we send the client a completely random session ID that is hard to guess. This ID has no further meaning and does not decode into anything. This is sometimes called an opaque token.
When the client repeats the session ID back to the server, the server looks the ID up in (for example) a database to find the associated user ID. When the user wants to log out, the session ID is removed from the data store, which means the cookie is no longer associated with a user.
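As a rough sketch of the flow just described: the in-memory dictionary and the function names `log_in`, `current_user` and `log_out` are assumptions made for illustration, not the API of any particular framework.

```python
import secrets
from typing import Dict, Optional

# Hypothetical session store: opaque session ID -> user ID.
# A real deployment would use a database, Redis or Memcached instead.
sessions: Dict[str, int] = {}


def log_in(user_id: int) -> str:
    """Create a session and return the opaque ID to put in the cookie."""
    session_id = secrets.token_urlsafe(32)  # random, carries no meaning
    sessions[session_id] = user_id
    return session_id


def current_user(session_id: str) -> Optional[int]:
    """Look up the user for the session ID the client repeated back."""
    return sessions.get(session_id)


def log_out(session_id: str) -> None:
    """Remove the session; the cookie value no longer maps to a user."""
    sessions.pop(session_id, None)
```

Because the ID is random rather than derived from the user, a client that edits its cookie simply ends up with an ID the store has never seen.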
Where is session data stored?
Languages like PHP have a built-in storage system that, by default, stores session data on the local file system. In the Node.js ecosystem, this data is kept in “memory” by default and disappears when the server restarts.
These approaches make sense on developer machines, or when websites are hosted on long-lived bare-metal servers, but today a deployment often means spinning up a whole new “system”, so this information needs to be stored somewhere that outlives the server. One easy option is a database, but websites commonly use systems such as Redis and Memcached, which work for small sites as well as at massive scale.
Encrypted token
More than 10 years ago, I started working more with OAuth V1 and similar authentication systems, and wondered whether we could just store all of our information in the cookie and sign it cryptographically.
Despite getting some good answers, I didn’t go through with it, because I didn’t feel confident enough that I could keep it secure, and I felt it required more knowledge of cryptography than I had.
A few years later came JWT, and it was a hit. JWT itself is a standard for encrypting/signing JSON objects, and it is used heavily for authentication. Instead of an opaque token, we once again embed the user_id in the cookie, but this time we include a signature. The signature can only be generated by the server, and is calculated from a “secret” and the actual data in the cookie.
This means that if the data is tampered with (for example, the user_id is changed), the signature no longer matches.
So why is this useful? My best answer is that you don’t need a system for storing session data, such as Redis or a database. All the information is contained in the JWT, which means your infrastructure is, in theory, simpler. You also potentially make fewer calls to a data store per request.
Disadvantages
There are some major disadvantages to using JWT.
First, it’s a complex standard, and users can easily get the settings wrong. If it is configured incorrectly, in the worst case this can mean that anyone can generate a valid JWT and impersonate someone else. This isn’t a beginner-level problem either: Auth0 had exactly this kind of problem last year.
Auth0 is (or was? They were just acquired) a major vendor of security products and, ironically, sponsors the jwt.io website. If they can’t get it right, what chance does the general [developer] public have?
However, this problem is part of a larger reason many security experts don’t like JWT: it has a lot of features and a very large scope, which gives library authors and the users of those libraries a lot of surface area for mistakes. (Alternative stateless token formats to JWT exist, and some of them do solve this problem.)
The second problem is “invalidation”. With a traditional session, you simply remove the session token from the session store, which is enough to “invalidate” the session.
This is not possible for JWT and other stateless tokens. We cannot remove tokens because they are self-contained and there is no central authority to invalidate them.
This is usually solved in three ways.
- Tokens have a short lifespan, for example 5 minutes. Before the 5 minutes are up, we generate a new one. (A separate refresh token is usually used for this.)
- Maintain a system with a list of recently revoked tokens.
- There is no server-driven logout, and it is assumed that clients can remove their own tokens.
Good systems usually use the first two. It should be noted that in order to support logout, you will probably still need a centralized storage mechanism (for refresh tokens, revocation lists, or both), which is exactly what JWT was supposed to “solve”.
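The short-lifespan option can be sketched as follows. Signing is left out to keep the example small, and the 5-minute `TOKEN_LIFETIME` and the function names are assumptions for illustration; `exp` is the registered JWT claim name for the expiration time.

```python
import time
from typing import Optional

TOKEN_LIFETIME = 5 * 60  # seconds; the 5-minute example from the list above


def new_claims(user_id: int, now: Optional[float] = None) -> dict:
    """Build the claims for a short-lived token (signing not shown)."""
    now = time.time() if now is None else now
    return {"user_id": user_id, "exp": now + TOKEN_LIFETIME}


def is_expired(claims: dict, now: Optional[float] = None) -> bool:
    """A token must be rejected at or after its expiration time."""
    now = time.time() if now is None else now
    return now >= claims["exp"]
```

With a lifespan this short, a stolen or logged-out token stops working within minutes, at the cost of issuing new tokens frequently.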
As an aside: Some people prefer JWT because there are fewer systems involved per request, but this contradicts being able to revoke tokens before they expire.
My favorite solution is to keep a global list of JWTs that have been revoked before their expiration (tokens are removed from the list once they expire). Rather than having web servers hit a central server to fetch this list, a pub/sub mechanism pushes the list to each server.
Revoking tokens is important for security, but it is rare. In practice, the list is small enough to fit easily into memory. This largely solves the invalidation problem.
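A minimal sketch of the in-memory side of such a list is shown below. The pub/sub transport itself is not shown: the assumption is that each web server runs a subscriber that calls `revoke` for every message it receives. The class and method names are hypothetical.

```python
import time
from typing import Dict, Optional


class RevocationList:
    """In-memory set of revoked token IDs, kept only until each token expires."""

    def __init__(self) -> None:
        self._revoked: Dict[str, float] = {}  # token ID -> expiry (Unix time)

    def revoke(self, token_id: str, expires_at: float) -> None:
        # Assumed to be called by a pub/sub subscriber on each web server.
        self._revoked[token_id] = expires_at

    def is_revoked(self, token_id: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        self._prune(now)
        return token_id in self._revoked

    def _prune(self, now: float) -> None:
        # Expired tokens are rejected by the expiry check anyway, so drop them.
        expired = [t for t, exp in self._revoked.items() if exp <= now]
        for token_id in expired:
            del self._revoked[token_id]
```

Since entries are dropped as soon as the underlying token would have expired, the list only ever holds tokens that were revoked within one token lifetime.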
The final problem with JWTs is that they are relatively large, and can add a lot of per-request overhead when used in cookies.
All in all, that is a lot of drawbacks just to avoid a central session store. I don’t think JWT is inherently a bad idea or useless, but there is a lot to consider.
Why are they popular?
One of the things that strikes me while reading tech blogs is how much discussion there is around JWT. Especially on Medium and on subreddits like /r/node, I see references to JWT extremely often.
I realize that this does not mean that “JWT is more popular than session tokens”, just as GraphQL is not more popular than REST and NoSQL is not more popular than relational databases: it is simply not that interesting to write about technologies that have been tried and tested for more than a decade (see: appeal to novelty). In addition, the subject matter experts writing about new solutions likely face different problems and scales than most of their readers.
However, these new technologies create more buzz than the simple ones, and if enough people keep talking about the hot new thing, this can eventually translate into actual adoption, even though it is suboptimal for most simple use cases.
This is similar to how many new developers learn to build a SPA with React before they learn server-rendered HTML. Experienced developers may feel that server-rendered HTML should probably be your default choice, with a SPA built only when needed, but this is not what new developers are usually taught.
Adopting complex systems before considering the simple options is something I see often, but it still surprised me in the case of JWT.
As an exercise, I looked up the most popular posts (by votes) on /r/node that mentioned JWT. I wanted to look at the top 100, but I got bored after the first 12.
Of these 12 articles and GitHub repositories:
- One mentions using revocation lists and three mention refresh tokens. The remaining articles and GitHub repositories offer no means of revocation.
- 1 article mentioned that using standard session storage might be better.
- 1 article used standard session storage and JWT at the same time, making the JWT unnecessary.
- One GitHub repository ships with a pre-generated private key. (yup)
- Most articles used expiration times of weeks or months, and 3 articles never expired their JWTs.
With the exception of 1, these highly voted posts are of such low quality that their authors probably aren’t qualified to write them, and they could cause real-world harm.
All of this at least confirms my bias that using JWT for security tokens is hard to get right.
About JWT and scale
Going through numerous Reddit posts and comments also gave me a more nuanced understanding of why people think JWT is better. The top reason everywhere was “it’s more scalable”, but it is not obvious at what scale people think the problems will start. I believe the point at which problems start is probably much higher than people assume.
Most of us are not Facebook, but even with “millions of active sessions”, a distributed key -> value store is unlikely to buckle.
Statistically, most of us are building apps that a Raspberry Pi could easily handle.
Conclusion
Using JWTs as tokens adds some neat properties, and in some cases makes it possible for your service to be stateless, which might be desirable in some architectures.
There are drawbacks to using JWT. Either you give up revocation, or you need to set up infrastructure that is much more complicated than simply adopting a session store and opaque tokens.
The point is not to discourage the use of JWT, but to encourage using it with caution. Be aware of the security and functional trade-offs and pitfalls. Keep it out of your “templates”, and don’t make it the default.
Thanks
Thanks to Nick Chang-Fong and Dominik Zogg for their feedback and suggestions for this article.
Original link: evertpot.com/jwt-is-a-ba…