Liao Yikang, Happy Every Day front-end development engineer!
Preface
As front-end developers, we deal with cookies all the time. We use them for authentication, for behavior tracking, and to give the stateless HTTP protocol a "state". This article focuses on the humble cookie: we will break it apart, knead it, and talk through how it gets used.
This article uses Chrome 96.0.4664.55 as the client environment, and all subsequent descriptions are based on that version. It covers the following four points:
- The existing attributes of cookies
- How cookies can be used to track us
- Front-end cookie management practices
- The future of cookies
Without further ado, let's go!
1. The attributes of cookies
Before going into the details of the attributes, let's get reacquainted with cookies.
Official definition (of the HTTP cookie, not the magic cookie):
An HTTP cookie (web cookie, browser cookie) is a small piece of data that a server sends to a user's web browser. The browser may store the cookie and send it back to the same server with later requests. Typically, an HTTP cookie is used to tell if two requests come from the same browser, keeping a user logged in, for example. It remembers stateful information for the stateless HTTP protocol.
By the official definition, a cookie is a small piece of data stored on the client device. The definition refers to cookies created by the Set-Cookie header of an HTTP response. Today we can also use document.cookie to read and manually create first-party cookies.
1.1 Why cookies
As we all know, markets exist because needs exist, and the same goes for technology; that is how cookies came about. Cookies originated at the famous Netscape: while it was developing an e-commerce application for a customer, the customer required that the server not store transaction state. The server would not keep the state, so the client had to do the work, and the cookie was born. HTTP is a stateless protocol, which is one of the reasons it is so fast. But very often we need to know who sent a request and keep track of the user's state.
So the reason for cookie’s birth:
- The server does not want to save state
- We need to store state
Some people might say: I'll just use sessionStorage or localStorage, or put a custom parameter in the request header.
That is certainly true. As browsers have evolved, many alternatives to cookies have appeared, but cookies still have unique strengths. For example, the server can set…… So the right solution depends on your actual scenario.
1.2 Attribute details
Open the DevTools panel of your browser.
As you can see, it lists all the attributes a cookie currently has. So what do these attributes mean?
Cookie attribute description table
Attribute | Description |
---|---|
name | The name of the cookie. |
domain | The domain the cookie belongs to. domain is the key attribute that determines which domain names can use the cookie. |
path | The path the cookie applies to. path specifies which paths under the host receive the cookie. |
Max-Age | The cookie's lifetime in seconds. If the value is a positive integer, the cookie expires after Max-Age seconds. |
Expires | The cookie's expiration time. If no expiration time is set, the cookie lives only for the current session: closing the browser ends the session and the cookie becomes invalid. Expires has largely been superseded by Max-Age, and its value must be a date. |
HttpOnly | Blocks access to the cookie from JavaScript. The content read from document.cookie does not include cookies set with HttpOnly, which deters XSS attacks to some extent. |
Secure | A Boolean flag that specifies how the cookie is transmitted over the network. It is off by default, in which case the cookie is also sent over plain HTTP; when it is set, the cookie is sent to the server only with HTTPS requests. |
SameSite | Restricts whether third-party (cross-site) requests can carry the cookie. There are three values: Strict / Lax (default) / None. |
SameParty | Chrome has introduced the First-Party Sets policy, which allows different domain names owned by the same entity to be treated as first-party. Previously the boundary was the site; now it can be a party. SameParty exists to work with that policy. (Currently only Chrome has this attribute.) |
Priority | Priority, a Chrome proposal (not supported by Firefox). It defines three levels, Low / Medium / High. When cookies exceed the browser's limit, low-priority cookies are cleared first. (Currently only Chrome has this attribute.) |
Some people might say: is that it? Yes, that's all... just kidding. Since we promised to break cookies apart and knead them, each attribute of course deserves a detailed description. The points below are mainly the common pitfalls to watch for when using cookies.
Without further ado, let's go through them one by one. (Attributes marked in green are the ones front-end JS can manipulate directly.)
name
Note that cookie names can repeat: the same name can be used under different domains and paths. If the name, domain and path are all the same, the cookie set later overwrites the one set earlier.
Also note that when manipulating cookies with JS, the first item always corresponds to name=value. For example, document.cookie = "path=/; name=test" does not create a cookie with the value test under the / path, as it might appear. It creates a cookie whose name is path and whose value is /.
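The "first item is always name=value" rule can be illustrated with a small parser. This is only a sketch of how such a string is read, not the browser's actual implementation; the function name splitCookieAssignment is made up for this example.

```javascript
// Split a string assigned to document.cookie into the cookie pair and its
// attributes. The first segment is always treated as name=value; every later
// segment is treated as an attribute, whatever it is called.
function splitCookieAssignment(assignment) {
  const [first, ...rest] = assignment.split(";").map((part) => part.trim());
  const eq = first.indexOf("=");
  const name = eq === -1 ? "" : first.slice(0, eq).trim();
  const value = eq === -1 ? first : first.slice(eq + 1).trim();
  const attributes = {};
  for (const part of rest) {
    if (part === "") continue;
    const i = part.indexOf("=");
    const key = (i === -1 ? part : part.slice(0, i)).trim().toLowerCase();
    attributes[key] = i === -1 ? true : part.slice(i + 1).trim();
  }
  return { name, value, attributes };
}

const wrong = splitCookieAssignment("path=/; name=test");
const right = splitCookieAssignment("name=test; path=/");
console.log(wrong); // the cookie is named "path" and its value is "/"
console.log(right); // the cookie is named "name", its value is "test", path is "/"
```

Putting the name=value pair first, as in the second call, is what produces the cookie the author of the first string probably intended.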
value
The value of the cookie. Only strings are supported; a value of any other type is converted by calling its toString() method.
domain
domain has many points to pay attention to. We can use it for front-end cross-domain access, single sign-on (shared across second-level domains), user tracking, and so on. The main points are as follows:
- When JS sets a cookie and specifies domain, only a first-party domain can be set. For example, JS running on example.com that executes document.cookie = "token=123; domain=test.com" will not succeed.
- When JS sets a cookie and specifies domain manually, a leading "." is added automatically whether or not the value was written with one. For example, domain=test.com is stored as domain=.test.com, a wildcard form that covers the domain and its subdomains.
- When JS reads cookies, the only universally available way is still the document.cookie API, so we cannot tell which domain a cookie belongs to when we read it.
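The first-party restriction above amounts to a domain-matching check. Below is a minimal sketch of that check in the style of RFC 6265; domainMatches is a hypothetical helper, and it deliberately ignores details a real browser handles, such as IP addresses and the Public Suffix List.

```javascript
// A host matches a cookie domain when it is identical to it or is a
// subdomain of it. A leading dot on the cookie domain is ignored.
function domainMatches(host, cookieDomain) {
  const h = host.toLowerCase();
  const d = cookieDomain.toLowerCase().replace(/^\./, "");
  return h === d || h.endsWith("." + d);
}

// A page may only set a domain attribute that matches its own host.
console.log(domainMatches("example.com", "test.com"));    // false
console.log(domainMatches("shop.test.com", "test.com"));  // true
console.log(domainMatches("shop.test.com", ".test.com")); // true
console.log(domainMatches("test.com", "shop.test.com"));  // false
```

The last call shows that the check only goes upward: a page can set a cookie for its parent domain, but not for one of its own subdomains.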
path
- The path you set should match the pathname of the current URL, similar to domain. A cookie whose path does not match is not visible to the current page through document.cookie, so it looks as if it could not be set.
- If path is given as a relative path (one that does not start with "/"), the value is ignored and the browser falls back to a default path derived from the current pathname. For example, if the current URL is https://test.com/ab/cd and we run document.cookie = "token=123; path=ab", the resulting path is not ab. Under RFC 6265 the default is the current pathname up to, but not including, its last "/", which is /ab here.
- As everybody knows, cookies are added to the request headers automatically when an HTTP request is sent. If a cookie has path set, the request path is checked against it and the cookie is sent only when it matches. Note that path is a prefix match on "/" boundaries: with path set to /test, requests to /test and /test/list match, but /testtest does not.
- path also affects the order of cookies. Normally cookies set earlier come first and cookies set later come after. But a cookie that has a path set (a longer, more specific path) is promoted to the front in Chrome (other browsers were not tested).
Note also that the most common Ajax requests often go to a different path than the page itself, so they do not carry a cookie scoped to the page's path. A cookie is not sent with every request made from the current page, only with requests whose address matches its path.
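Path handling is easier to reason about with the two rules from RFC 6265, section 5.1.4, written out. The sketch below follows that specification; the function names are made up for this example, and it is not code taken from any browser.

```javascript
// A request path matches a cookie path when the cookie path is a prefix of it
// and the prefix ends at a "/" boundary.
function pathMatches(requestPath, cookiePath) {
  if (requestPath === cookiePath) return true;
  if (!requestPath.startsWith(cookiePath)) return false;
  return cookiePath.endsWith("/") || requestPath[cookiePath.length] === "/";
}

// Default path, used when the path attribute is missing or does not start
// with "/": the pathname up to, but not including, its last "/".
function defaultPath(pathname) {
  if (!pathname.startsWith("/")) return "/";
  const lastSlash = pathname.lastIndexOf("/");
  return lastSlash === 0 ? "/" : pathname.slice(0, lastSlash);
}

console.log(pathMatches("/test/list", "/test")); // true
console.log(pathMatches("/testtest", "/test"));  // false
console.log(defaultPath("/ab/cd"));              // "/ab"
```

Under these rules a cookie with path /test is sent to /test and /test/list, but not to /testtest.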
Max-Age
Cookies are deleted through their expiration time. Setting Max-Age to an integer <= 0 deletes the cookie.
- If the value of Max-Age is not an integer, the cookie falls back to a session lifetime, that is, it becomes invalid once the page (the browser session) is closed.
- Max-Age is in seconds. If it is set to 10, the cookie expires after 10 seconds, which matches our expectations. Note the difference from Expires below.
Expires
Expires is also an expiration time, but it appeared earlier than Max-Age (so it is better supported by IE8 and below). Max-Age is now preferred, because setting an expiration time with Expires can sometimes be confusing.
- First, Expires takes a date rather than a number of seconds. Second, and more importantly, the expiration time of a cookie is interpreted as UTC. China Standard Time is eight hours ahead of UTC (UTC+8). When we set the expiration time through this attribute as in expires=${new Date()}, it looks as if we are setting the expiration to right now, so that the cookie expires immediately. In fact the local wall-clock time is read as a UTC time, so the cookie still has eight hours to live. When clearing a cookie through Expires does not work, this may be why.
- When both Expires and Max-Age are set, Max-Age takes priority.
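One way to sidestep the time zone pitfall is to always write Expires with Date.prototype.toUTCString(), or to use Max-Age. The helper below is a hypothetical sketch of that idea; serializeCookie and deletionString are names invented here, not part of any browser API.

```javascript
// Build a string suitable for assignment to document.cookie.
// Expires is written with toUTCString() so the date is an unambiguous UTC time.
function serializeCookie(name, value, { maxAge, expires, path } = {}) {
  const parts = [`${name}=${encodeURIComponent(value)}`];
  if (typeof maxAge === "number") parts.push(`Max-Age=${Math.trunc(maxAge)}`);
  if (expires instanceof Date) parts.push(`Expires=${expires.toUTCString()}`);
  if (path) parts.push(`Path=${path}`);
  return parts.join("; ");
}

// Deleting a cookie: same name and path, Max-Age of 0.
function deletionString(name, path = "/") {
  return serializeCookie(name, "", { maxAge: 0, path });
}

const tenSeconds = serializeCookie("token", "123", { maxAge: 10, path: "/" });
const fixedDate = serializeCookie("token", "123", {
  expires: new Date(Date.UTC(2030, 0, 1)),
});
console.log(tenSeconds);              // token=123; Max-Age=10; Path=/
console.log(fixedDate);               // token=123; Expires=Tue, 01 Jan 2030 00:00:00 GMT
console.log(deletionString("token")); // token=; Max-Age=0; Path=/
```

In a browser the returned string would be assigned to document.cookie; it is only printed here so the sketch also runs outside a browser.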
HttpOnly
This attribute is mainly used to defend against basic XSS attacks. JS cannot set this attribute when creating a cookie; it can only be set by the server through the Set-Cookie response header. It is only one layer of defense, so don't rely on it too much.
Secure
Secure, like HttpOnly, is part of the one-two punch for cookie security. Unlike HttpOnly, it can be set through JS.
Here is something I ran into in practice. Sometimes we deploy applications on an intranet and reach cloud services through requests forwarded by nginx on a front-end machine. The front end is then usually accessed directly by IP, which is certainly not over HTTPS. If the Set-Cookie response header from the cloud service carries the Secure attribute, then (1) although the header is passed back through nginx, the cookie cannot be written on the intranet machine, and (2) our usual fix is to handle the Secure attribute of the response header at the nginx layer. Note that the 360 browser detects this behavior (it is not clear how) and still refuses to write the cookie. The current version of Chrome writes it successfully, but a future Chrome update may block this.
SameSite
This attribute is an important addition that arrived in Chrome 51. Its three values mean the following:
- Strict: cookies are sent only with requests that originate from the same site.
- Lax: some third-party requests may carry cookies, namely GET requests that navigate to the target URL. Only three forms send cookies: hyperlinks, preloads, and GET forms. This is the default value in Chrome 80+.
- None: cookies are sent with any request. When SameSite=None is set, Secure must be set as well, which means the website must use HTTPS.
Cross-site and cross-origin are not the same thing. Cross-origin means that any difference in protocol, host, or port makes two URLs different origins. Cross-site is more lenient: two URLs are same-site as long as they share the same second-level domain (the level directly below a top-level domain such as .com, for example test.com).
With the default of Lax, cookies are no longer sent with cross-site POST forms, iframes, Ajax requests, and images.
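The difference between cross-origin and cross-site can be sketched in a few lines. Note the loudly stated assumption: the "site" here is approximated as the last two labels of the host name, whereas real browsers consult the Public Suffix List (and newer versions also compare the scheme), so this is an illustration rather than a faithful implementation.

```javascript
// Same-origin: scheme, host and port must all be equal.
function isSameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

// ASSUMPTION: the registrable domain is taken to be the last two host labels.
// This is wrong for suffixes such as "co.uk" or "github.io".
function naiveSite(url) {
  return new URL(url).hostname.split(".").slice(-2).join(".");
}

function isSameSite(a, b) {
  return naiveSite(a) === naiveSite(b);
}

console.log(isSameOrigin("https://a.test.com/x", "https://b.test.com/y")); // false
console.log(isSameSite("https://a.test.com/x", "https://b.test.com/y"));   // true
console.log(isSameSite("https://a.test.com/x", "https://other.com/"));     // false
```

So a request from a.test.com to b.test.com is cross-origin but same-site, which is why SameSite restrictions do not apply to it.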
SameParty
SameParty is Chrome's third punch in the cookie-security combination. It is mainly used together with the First-Party Sets policy. Every cookie that needs to be shared among the domain names in a First-Party Set must carry the SameParty attribute, for example Set-Cookie: name=test; Secure; SameSite=Lax; SameParty. SameParty takes no value, but it requires Secure to be set, and SameSite must not be Strict.
The First-Party Sets policy is explained in more detail later, in "How does Google do it".
Priority
This article may be a bit wordy elsewhere, but about this attribute there is really nothing more to say. See the table above.
2. How cookies are used to track us
After reading the section above, you should have a fairly complete understanding of cookies. So how can cookies be exploited?
As we all know, the most widespread and common use of cookies is to carry authentication information and implement single sign-on. But cookies have also been a tool for advertisers to snoop on user privacy almost since their inception. For example, I search for some cup on Tmall today, and the next day my Weibo feed is full of ads for that very thing. It is as if the whole world knew I had looked at it (and they did). So how do they do it?
2.1 Third-party Cookies
Before we get to the common practice, let's distinguish first-party cookies from third-party cookies. Cookies whose domain is the current domain name or a parent of it are generally called first-party cookies; all others are third-party cookies, although in many cases the "third-party" cookies are also injected by the site itself.
The following picture takes Taobao as an example:
Cookies under taobao.com are first-party cookies, while those under .mmstat.com are obviously third-party cookies.
2.2 Advertising and privacy
Third-party cookies are behind all kinds of online advertising, and they are the easiest and cheapest means to it. Let's continue with the example above, where everybody can find the .mmstat.com domain. So what does this website do?
I first searched Baidu for this domain name. The domain itself cannot be visited directly, but the Baidu results below it turned up some interesting entries 😅.
In fact, it is not an unsavory website; it is an advertising and marketing platform under Alibaba.
This is not an advertisement... just letting people know it is in the advertising and marketing business.
How does it work? We can find out from its requests.
Tag the user
The first request to the mmstat.com domain is the following:
It loads a barely visible GIF image and writes a third-party cookie via Set-Cookie.
The GIF is then requested.
With this first exchange the user has been tagged, and Alimama now knows who you are. When you open other sites that integrate Alimama, such as the video site Youku, you will see the same tag ID there. Alimama knows it was you who just opened Taobao and then went to Youku to watch TV.
Get user footprint
We know that advertisers want a high conversion rate, so how is conversion improved? The best way, of course, is to recommend products to people who need them. How do you know whether a user needs a product? Naturally, by recording the user's search and browsing history on Taobao. Labelling this data yields the target audience for a product.
Again taking Taobao as the example, when I clicked on an item, five requests like these were sent:
There is not much to say about the last request: it is a redirect to Tmall, because the item I clicked is sold on Tmall rather than Taobao. Let's focus on the first four.
From the basic behavior of cookies, we know that all four requests carry the cookie written when we first opened Taobao, that is, the information that marks who we are. The four requests also pass parameters that describe our behavior, which we will pull out and analyze one by one.
As the figure above shows, the first request is almost identical to the second, and the third is almost identical to the fourth. Look at the first and second requests first.
Their parameters are almost the same; the only difference is that the gokey of the first request has an extra aws=1 parameter. We do not know what every parameter does, but we can infer it from the information it contains. The _p_URL parameter makes this obvious.
Analyzing the value of this parameter shows that it records how we found the product. It states clearly that we reached this page by searching q=%E8%83%8C%E5%8C%85 on s.taobao.com, and the value of q is the name of the product I searched for, "backpack". The link also carries other parameters, such as sourceId. From these parameters and the third-party cookie planted at the beginning, Alimama knows that "you entered the product page by searching for a backpack on the Taobao home page". You are then labelled as "a user who needs a backpack". Search is an active behavior, so it must carry a relatively high weight, which makes this information very valuable.
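Decoding such a parameter takes only a few lines. In the sketch below, only the host s.taobao.com and the q value come from the analysis above; the /search path and the sourceId value are placeholders invented for illustration.

```javascript
// ASSUMPTION: the path and the sourceId value are hypothetical placeholders.
const referrer =
  "https://s.taobao.com/search?q=%E8%83%8C%E5%8C%85&sourceId=hypothetical";

// URLSearchParams percent-decodes values when they are read.
const params = new URL(referrer).searchParams;
const keyword = params.get("q");

console.log(keyword); // "背包" (backpack)
```

Anyone who receives the URL, including a third party that is handed it as a request parameter, can recover the search keyword this way.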
The third and fourth requests are similar to the first and second, but contain more detailed information. They have the same address and almost the same parameters; the only difference is the aws value.
Let's look at the third one:
As you can see, a lot of information is packed into the gokey.
After decoding, you can see some of the information that is sent, for example:
- serach_radio_all
- GongYingLianDIsts
- list_model
- isp4p
- item_click_form
The meaning of the fields above can be roughly guessed from their names. Alimama uses this information to record your behavior:
- You opened Taobao
- You searched for a backpack on the Taobao home page
- You clicked on a backpack on the first page
- You only looked at it and did not buy it
- ...
With this information in hand, when we open a website that integrates Alimama's ads, such as Youku, we find that it also carries the third-party cookie of mmstat.com, with the same ID as when we visited Taobao. At this point the advertising platform knows: this fellow just wanted to buy a bag and has now gone to Youku to watch a show, so let's push some bag ads at him. Hey hey hey!
And the basis of all this is tagging us with a little third-party cookie, building a user profile, and then combining that information to push products at us. It feels like a robbery: you have only just thought about buying a bag and cannot make up your mind, and then they keep telling you to buy a bag, buy a bag, until you spend your money.
Conclusion
To summarize, ad tracking basically takes three steps:
- On your first visit to a site, a third-party cookie is planted to tag you.
- While you browse the site, the planted cookie is used to continuously send your browsing footprint to the marketing platform (almost every step, every click, every pause is recorded).
- When you visit other sites, such as social networks, that integrate the same third party, the cookie identifies you and relevant ads are pushed to you.
The actual implementation is far more complex than this; I hope the analysis above helps you understand the idea.
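The three steps can be mimicked with a toy model. Nothing below talks to a real network or reflects how any actual advertising platform is built: the "browser" is a plain object acting as a cookie jar, the "tracker" is an in-memory log, and every name is hypothetical.

```javascript
class ToyTracker {
  constructor() {
    this.nextId = 1;
    this.profiles = new Map(); // tracker id -> list of recorded events
  }

  // Called whenever a page embeds the tracker's pixel.
  handlePixel(cookieJar, site, event) {
    // Step 1: tag the browser on first contact (the Set-Cookie moment).
    if (!cookieJar.trackerId) {
      cookieJar.trackerId = `u${this.nextId++}`;
      this.profiles.set(cookieJar.trackerId, []);
    }
    // Step 2: the cookie comes back with every request, so events accumulate.
    this.profiles.get(cookieJar.trackerId).push({ site, event });
    return cookieJar.trackerId;
  }

  // Step 3: on any site that embeds the tracker, look up what this browser did.
  interests(cookieJar) {
    const events = this.profiles.get(cookieJar.trackerId) || [];
    const prefix = "search:";
    return events
      .filter((e) => e.event.startsWith(prefix))
      .map((e) => e.event.slice(prefix.length));
  }
}

const tracker = new ToyTracker();
const browserJar = {}; // third-party cookies this browser holds for the tracker

const idOnShop = tracker.handlePixel(browserJar, "shop.example", "search:backpack");
const idOnVideo = tracker.handlePixel(browserJar, "video.example", "pageview");

console.log(idOnShop === idOnVideo);        // true: the same tag on both sites
console.log(tracker.interests(browserJar)); // [ 'backpack' ]
```

The key point the model captures is that the cookie jar belongs to the tracker's domain, not to the site being visited, so the same ID shows up everywhere the tracker is embedded.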
You may feel like you are being watched. The above is only a basic analysis of a shopping site. Baidu and Weibo, which we use most in daily life, are connected to similar advertising and marketing platforms. With advertising alliances on top of that, it is fair to say that every step we take on the Internet is recorded. Still, such data is scattered across different vendors, and with national regulation in place, the major vendors cannot act entirely without scruple. But the feeling of being monitored is always uncomfortable, which is why Google has received more than $10 billion in privacy fines. And it all started with a little cookie.
Note: The above is my own analysis based on the advertising requests and cookies I observed. The analysis of Taobao is just an example to show how an advertising platform operates.
3. Cookie front-end management practice
Having covered the basic attributes of cookies and their impact on our daily life, I will now describe how I manage cookies in practice, for your reference.
3.1 Common Front-end Cookie Problems
Before we talk about how to do this, here are some of the problems I have run into with cookies:
- When sub-projects are embedded under the main domain name, cookie behavior is hard to keep consistent across projects.
- Cookie authentication logic is risky to modify when it is not fully understood.
- When different projects share the company's common second-level domain, identical cookie names may cause confusion.
- When a cookie attribute has to be adjusted or a project is extended with subsystems, some cookies in the project may be missed.
- If the company's domain name changes, the cookie-related configuration must change as well, and some of it may be overlooked.
The problems above boil down to three points: (1) cookie management is chaotic; (2) cookies are hard to maintain later; (3) cookies conflict between projects.
3.2 Cookie management
Since these problems exist, we should try to solve them. My approach is to manage cookies at data-initialization time and have the front end intercept operations on undeclared cookies.
The idea is to define every cookie the project will use when the project initializes, so that all cookies in the project are managed in one place, even across many projects that share a second-level domain. Every later cookie operation then deals only with these declared cookies.
First we create an object that declares all the cookies used in our project:
// cookie-schema.js
const cookieSchema = {
  cookie1: {
    name: 'cookie1', // the name of the cookie
    domain: 'root', // root: root domain; sub: current domain and its subdomains; current: current domain only (default)
    path: '/', // the path to match
    expires: 30 * 24 * 3600 * 1000, // expiration time: 30 days, in milliseconds
    secure: false // whether to restrict the cookie to HTTPS transport
  },
  cookie2: {
    name: 'cookie2',
    domain: 'sub',
    expires: 30 * 24 * 3600 * 1000, // 30 days
    path: '/cookie-test',
    secure: false
  }
}
cookie-schema.js above lists all the cookies we will use in the project. It can be declared directly by the front end. Alternatively, to handle conflicts between projects, you can set up a data center: all cookies available to each project are configured in the back end, and the cookies available to a project are loaded when it initializes.
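To make "intercept operations on undeclared cookies" concrete, here is a minimal sketch of a guard built on such a schema. It is not the cookie-util tool itself: buildCookie and resolveDomain are hypothetical helpers, and the "root" domain is approximated as the last two labels of the host name, which is an assumption that fails for suffixes like co.uk.

```javascript
const cookieSchema = {
  cookie1: { name: "cookie1", domain: "root", path: "/", expires: 30 * 24 * 3600 * 1000, secure: false },
};

// ASSUMPTION: "root" is approximated as the last two host labels.
function resolveDomain(kind, hostname) {
  if (kind === "root") return hostname.split(".").slice(-2).join(".");
  if (kind === "sub") return hostname; // current domain and its subdomains
  return null; // "current": omit the domain attribute so the cookie is host-only
}

// Build the string for document.cookie, refusing cookies missing from the schema.
function buildCookie(schema, key, value, hostname, now = Date.now()) {
  if (!Object.prototype.hasOwnProperty.call(schema, key)) {
    throw new Error(`Cookie "${key}" is not declared in the schema`);
  }
  const rule = schema[key];
  const parts = [`${rule.name}=${encodeURIComponent(value)}`];
  const domain = resolveDomain(rule.domain || "current", hostname);
  if (domain) parts.push(`Domain=${domain}`);
  parts.push(`Path=${rule.path || "/"}`);
  if (rule.expires) parts.push(`Expires=${new Date(now + rule.expires).toUTCString()}`);
  if (rule.secure) parts.push("Secure");
  return parts.join("; ");
}

const fixedNow = Date.UTC(2022, 0, 1); // fixed clock so the output is stable
console.log(buildCookie(cookieSchema, "cookie1", "abc", "app.test.com", fixedNow));
// cookie1=abc; Domain=test.com; Path=/; Expires=Mon, 31 Jan 2022 00:00:00 GMT
```

Calling buildCookie with a key that is not in the schema throws, which is the interception point: undeclared cookies never reach document.cookie.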
Interested readers can use this together with my small tool, which comes with detailed instructions:
cookie-util: github.com/Mrlyk/cooki…
4. The future of cookies
Because of the privacy problems cookies cause, the major browser vendors have begun to restrict third-party cookies. Safari has already banned third-party cookies completely, but Chrome has the largest market share, so people have not felt the change very strongly yet.
4.1 How does Google do it
Google is already facing more than $10 billion in fines for privacy violations caused by cookies. So in early 2020 Google announced that Chrome would completely disable third-party cookies within the next two to three years. Of course the resistance is strong, and the advertising industry is the loudest voice, which is understandable 😄
It has been almost two years now, and you can see that Google has been making moves on cookies all along. From SameSite in Chrome 51 to First-Party Sets in Chrome 89 and Trust Tokens (which I haven't used yet), it has been tightening restrictions on third-party cookies step by step while offering solutions to legacy problems.
First-Party Sets
In the description of how ads track us above, we saw clearly that mmstat.com and taobao.com are not the same domain, yet they belong to the same company. Such third-party cookies can be regarded as ones the site integrated on purpose, and we do not want the browser to kill them. Google therefore introduced the First-Party Sets policy. Together with Chrome's SameParty attribute, it lets the domain names in a First-Party Set access each other's cookies.
Of course, to prevent abuse of this policy, Google also imposes the following restrictions:
- Domains in a First-Party Set must be owned and operated by the same organization.
- All domain names should be recognizable to users as a group.
- All domain names should share a common privacy policy.
At the same time, this policy does not allow user information to be exchanged between unrelated sites, so single sign-on still requires another scheme.
To use this scheme, a domain declares that it is a member (or owner) of a set by serving a manifest file that defines its relationship with the other domain names. The file must be a JSON file located at the /.well-known/first-party-set route.
Taking the official Google example, suppose that a.example, b.example, and c.example want to form a First-Party Set owned by a.example. They declare the following JSON files:
// https://a.example/.well-known/first-party-set
{
  "owner": "a.example",
  "members": ["b.example", "c.example"],
  ...
}
// https://b.example/.well-known/first-party-set
{
  "owner": "a.example"
}
// https://c.example/.well-known/first-party-set
{
  "owner": "a.example"
}
Note: This solution is still in the trial phase; it can be tried in Chrome 89 to 93.
Readers interested in Google's privacy policy can take a look at this public documentation: developer.chrome.com/docs/privac…
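The relationship the First-Party Set manifests must express (the owner lists every member, and every member names the owner) can be checked mechanically. The function below is a hypothetical consistency check written for this article's example, not Chrome's actual validation logic; it takes the manifests as an object keyed by the domain that serves them.

```javascript
// Return a list of inconsistencies; an empty list means the manifests agree.
function validateFirstPartySet(manifests) {
  const errors = [];
  const owners = Object.keys(manifests).filter((d) => manifests[d].owner === d);
  if (owners.length !== 1) return ["expected exactly one owner manifest"];
  const owner = owners[0];
  const members = manifests[owner].members || [];
  for (const member of members) {
    const m = manifests[member];
    if (!m) errors.push(`${member} serves no manifest`);
    else if (m.owner !== owner) errors.push(`${member} names ${m.owner} as owner`);
  }
  for (const domain of Object.keys(manifests)) {
    if (domain !== owner && !members.includes(domain)) {
      errors.push(`${domain} is not listed by ${owner}`);
    }
  }
  return errors;
}

const manifests = {
  "a.example": { owner: "a.example", members: ["b.example", "c.example"] },
  "b.example": { owner: "a.example" },
  "c.example": { owner: "a.example" },
};
console.log(validateFirstPartySet(manifests)); // []
```

A check like this could run in a deployment pipeline so that a domain is never served with a manifest that contradicts the owner's.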
4.2 The Cookie Store API
The Cookie Store API provides an asynchronous API for managing cookies, while also exposing cookies to service workers.
An official API for cookie operations has also been added, which is more convenient than the tedious handling of document.cookie. However, it is only available under HTTPS and its browser compatibility is poor. It is used in much the same way as my cookie-util tool above. If you are interested, see the official documentation below.
Official documentation: developer.mozilla.org/en-US/docs/…
4.3 Where do third-party cookies go from here
From the analysis above, we can see that browser vendors will mainly crack down on third-party cookies in the future, and Chrome will soon disable them entirely as well. So how does losing third-party cookies affect our business?
- Single sign-on: readers who previously relied on third-party cookies for single sign-on should pay special attention and find a new scheme.
- Exception-tracking tools: these usually use third-party cookies to tag users. Once that is prohibited, UV counts may become inflated.
- User behavior analysis tools: without this tag they may stop working.
- ...
Of course, we can go back to the essence of what cookies do, which is to tag the user. So can we tag users in other ways to solve some of these problems? Of course we can, for example with the browser fingerprinting that many people already use.
As ordinary users, we certainly want our privacy to be better protected. But the wall of privacy also helps the giant companies build ever higher barriers of their own. For example, when Google launches models such as FLoC to analyze the behavior of user groups, then once third-party cookies are blocked, advertisers have to go to these companies and use their solutions in order to advertise effectively.
From a technical point of view, we should pay close attention to changes made by the bodies that set these industry standards, stay alert to any specification change, and deal promptly with the risks to our own projects.
End of article.