Liao Yikang, Happy Every Day front-end development engineer!

Preface

As front-end developers, we deal with cookies a lot. We use them for authentication, for behavior tracking, and to add state to the stateless HTTP protocol. This article focuses on the humble cookie: we will break it apart, pick through the pieces, and talk about how we use it.

This article uses Chrome 96.0.4664.55 as the client environment, and all subsequent code descriptions are based on that version. It covers four topics:

  • The existing attributes of cookies
  • How cookies can be used to track us
  • Front-end cookie management practices
  • The future of cookies

Enough talk, let's go!

First, the attributes of cookies

Before going into the details of the attributes, let's get reacquainted with cookies.

Official definition (of the HTTP cookie, not the magic cookie):

An HTTP cookie (web cookie, browser cookie) is a small piece of data that a server sends to a user's web browser. The browser may store the cookie and send it back to the same server with later requests. Typically, an HTTP cookie is used to tell if two requests come from the same browser, keeping a user logged in, for example. It remembers stateful information for the stateless HTTP protocol.

According to the official definition, a cookie is a small piece of data stored on the client's device. The definition refers to cookies created by the Set-Cookie header of an HTTP response. Today we can also use document.cookie to read and manually create first-party cookies.

1.1 Why cookies

Where there is demand there is a market, and technology is no different; that is how cookies came about. Cookies originated at Netscape: while the company was developing an e-commerce application for a customer, the customer required that the server not store transaction state. If the server would not keep the state, the client had to, and so the cookie was born. HTTP is a stateless protocol, which is one of the reasons it is so fast. But very often we need to know who sent a request, and we need to keep track of the user's state.

So cookies were born for two reasons:

  • The server does not want to save state
  • We need to store state

Some people might say: I can use sessionStorage, I can use localStorage, or I can put a custom parameter in the request header to carry the state.

That is true, of course. As browsers have evolved, many alternatives to cookies have appeared, but cookies are still unique. For example, the server can set them… So the right solution depends on your actual scenario.

1.2 Attribute details

Open your browser's DevTools panel.

There you can see every attribute a cookie currently has. So what does each of them mean?

Cookie attribute description table

Attribute    Description
name         The name of the cookie.
domain       The domain the cookie belongs to. domain is the key attribute that determines which hosts may use the cookie.
path         The path the cookie applies to. path specifies which paths under the host receive the cookie.
Max-Age      The cookie's lifetime, in seconds. If the value is a positive integer, the cookie expires after Max-Age seconds.
Expires      The cookie's expiration time. If no expiration is set, the cookie lives only for the current session; closing the browser ends the session and the cookie is discarded. Expires has largely been superseded by Max-Age, and its value must be a date.
HttpOnly     Blocks access to the cookie from JavaScript: the content read from document.cookie does not include cookies marked HttpOnly. This deters XSS attacks to some extent.
Secure       A Boolean flag that controls how the cookie is transmitted over the network. It is off by default, so the cookie is also sent over plain HTTP connections; a cookie marked Secure is sent to the server only over HTTPS.
SameSite     Restricts whether the cookie is sent with cross-site (third-party) requests. There are three values: Strict, Lax (the default), and None.
SameParty    Chrome introduced the First-Party Sets policy, which lets different domains owned by the same entity be treated as first-party. Previously the boundary was the site; now it can be a party. SameParty exists to work with that policy. (Currently Chrome only.)
Priority     A Chrome proposal (not supported by Firefox) that defines three priorities: Low, Medium, and High. When cookies exceed the browser's limits, low-priority cookies are evicted first. (Currently Chrome only.)

Some people might say: that's it? Yes, that's all… Of course not. Since we promised to break cookies apart and pick through the pieces, each attribute deserves a detailed look. The points below are mainly the common pitfalls to watch for when using cookies.

Without further ado, let’s begin one by one. (The attributes marked green indicate the front-end attributes that can be manipulated directly by JS)

name

Watch out for duplicate names. The same name can be used under different domains and paths. If the name, domain, and path are all the same, the cookie set later overwrites the cookie set earlier.

Also note that when manipulating cookies with JS, the first item is always treated as name=value. document.cookie = "path=/; name=test" does not create a cookie named name with the value test on the / path, as it might appear to. It creates a cookie whose name is path and whose value is /.
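
To make the name=value pitfall concrete, here is a minimal sketch of how such an assignment string gets split up. parseCookieAssignment is a hypothetical helper written for this article, not a browser API, and it only approximates what the browser does (it skips validation and attribute handling entirely).

```javascript
// Hypothetical helper: split a document.cookie-style assignment into
// its name/value pair and the attributes that follow it.
function parseCookieAssignment(str) {
  const [first = '', ...rest] = str
    .split(';')
    .map((part) => part.trim())
    .filter(Boolean);

  // The first segment is always the name=value pair, whatever it looks like.
  const eq = first.indexOf('=');
  const name = eq === -1 ? '' : first.slice(0, eq).trim();
  const value = eq === -1 ? first : first.slice(eq + 1).trim();

  // Everything after the first segment is treated as an attribute.
  const attributes = {};
  for (const part of rest) {
    const i = part.indexOf('=');
    const key = (i === -1 ? part : part.slice(0, i)).trim().toLowerCase();
    attributes[key] = i === -1 ? true : part.slice(i + 1).trim();
  }
  return { name, value, attributes };
}

// The cookie is named "path"; "name=test" ends up as an unknown attribute.
console.log(parseCookieAssignment('path=/; name=test'));
// The intended form: the pair comes first, attributes afterwards.
console.log(parseCookieAssignment('name=test; path=/'));
```

Putting the pair first and the attributes after it is the only ordering that does what you mean.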

value

The value of the cookie. Only strings are supported; a value of any other type is converted to a string, as if by calling toString().

domain

domain has many points worth attention. We can use it for front-end cross-domain access, single sign-on (across subdomains that share a second-level domain), user tracking, and so on. The main points are as follows:

  • When setting a cookie from JS, domain can only be a first-party domain. For example, JS running on example.com that executes document.cookie = "token=123; domain=test.com" will not set anything.
  • When setting a cookie from JS with an explicit domain, a leading "." is added automatically whether or not you wrote one. For instance, domain=test.com is stored as domain=.test.com, the wildcard form that also covers subdomains.
  • When reading cookies from JS, the universal way is still the document.cookie API, which returns only name=value pairs, so we cannot tell which domain a cookie belongs to.

path

  • The path you set should be a prefix of the current URL's pathname, much like domain. Otherwise the cookie is not visible to the current page through document.cookie, so it looks as if the set failed.
  • If the path value is relative (it does not start with "/"), the browser ignores it and falls back to a default path derived from the current pathname. For example, with the current URL https://test.com/ab/cd, document.cookie = "token=123; path=ab" does not produce the path ab; per RFC 6265 the default path is the directory of the current pathname, /ab.
  • As everybody knows, cookies are attached to the HTTP request headers automatically. If path is set, the request path is matched against it, and the cookie is sent only if it matches. Note that path is a matching rule, not an exact value: path=/test matches /test and /test/anything, but not /testtest, because a match has to end on a "/" boundary.
  • path also affects ordering. Normally cookies are listed in the order they were set, earliest first. But cookies with a longer path are listed before those with a shorter one, so in Chrome a cookie with a more specific path is promoted to the front (other browsers were not tested).

Note also that the most common Ajax requests usually do not go to the current page's path, so a cookie scoped to that path is not sent with them. A cookie is not carried by every request made from the current page, only by requests whose address matches its path.
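
The matching rule is easier to see in code. Below is a minimal sketch of the path-match and default-path algorithms as I read them in RFC 6265 (section 5.1.4). The function names pathMatches and defaultPath are my own, and real browsers may differ in edge cases, so treat it as an illustration rather than Chrome's implementation.

```javascript
// Does a request path match a cookie's path? (after RFC 6265, section 5.1.4)
function pathMatches(requestPath, cookiePath) {
  if (requestPath === cookiePath) return true;
  if (!requestPath.startsWith(cookiePath)) return false;
  // A prefix only counts if it ends on a "/" boundary.
  return cookiePath.endsWith('/') || requestPath[cookiePath.length] === '/';
}

// The path used when the path attribute is missing or does not start with "/".
function defaultPath(pathname) {
  if (!pathname.startsWith('/')) return '/';
  const lastSlash = pathname.lastIndexOf('/');
  return lastSlash === 0 ? '/' : pathname.slice(0, lastSlash);
}

console.log(pathMatches('/test', '/test'));        // true
console.log(pathMatches('/test/detail', '/test')); // true
console.log(pathMatches('/testtest', '/test'));    // false
console.log(defaultPath('/ab/cd'));                // /ab
```

This is also why an Ajax call to /api/user never carries a cookie scoped to /cookie-test: the request path, not the page path, is what gets matched.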

Max-Age

Cookies are deleted by expiring them: setting Max-Age to an integer less than or equal to 0 deletes the cookie.

  • If Max-Age is not an integer, the attribute is ignored and the cookie becomes a session cookie, which is discarded when the session ends.
  • Max-Age is in seconds. A value of 10 means the cookie expires 10 seconds from now, which is what we would expect. Note how this differs from Expires below.

Expires

Expires is also an expiration time, and it predates Max-Age (so it is better supported in IE8 and below). Max-Age is now preferred, and setting an expiration time with Expires can sometimes be confusing.

  • Expires takes a date and, importantly, the cookie's expiration time is interpreted as UTC. China Standard Time is eight hours ahead of UTC (UTC+8). Suppose we set the expiration with expires=${new Date()}. It looks as if the expiration is set to the current moment and the cookie should expire immediately, but the local wall-clock time is actually taken as a UTC time, so the cookie still has eight hours to live. This may be why an attempt to clear a cookie through expires sometimes fails. Formatting the date with toUTCString() avoids the problem.
  • When both Expires and Max-Age are set, Max-Age takes precedence.
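
Here is a small sketch of how I would build the expiration part of a cookie string so that the time zone cannot bite. The helper names expiryAttributes and deletionCookie are made up for this example; the only real APIs involved are Date and its toUTCString() method.

```javascript
// Build "max-age" and "expires" for a cookie that should live N more seconds.
// toUTCString() always produces a GMT date, whatever the local time zone is.
function expiryAttributes(secondsFromNow, now = new Date()) {
  const expiresAt = new Date(now.getTime() + secondsFromNow * 1000);
  return `max-age=${secondsFromNow}; expires=${expiresAt.toUTCString()}`;
}

// Build a cookie string that deletes the named cookie.
function deletionCookie(name) {
  return `${name}=; max-age=0; expires=${new Date(0).toUTCString()}`;
}

// A fixed "now" keeps the output reproducible.
const fixedNow = new Date(Date.UTC(2021, 11, 6, 2, 0, 0));
console.log(`token=123; ${expiryAttributes(10, fixedNow)}`);
// token=123; max-age=10; expires=Mon, 06 Dec 2021 02:00:10 GMT
console.log(deletionCookie('token'));
// token=; max-age=0; expires=Thu, 01 Jan 1970 00:00:00 GMT
```

In the browser the resulting string would be assigned to document.cookie. Sending both attributes is optional; where Max-Age is supported it wins anyway.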

HttpOnly

This attribute is mainly a defense against basic XSS attacks. JS cannot set it when creating a cookie; only the server can, through the Set-Cookie response header. But it is just one layer of defense, so do not rely on it too much.

Secure

Secure, together with HttpOnly, forms a one-two punch for cookie security. Unlike HttpOnly, it can be set from JS.

Here is something I ran into in practice. Sometimes we deploy applications on an intranet that reach cloud services through requests forwarded by Nginx on a front-end proxy machine. The front end is usually accessed directly by IP, which is definitely not HTTPS. If the cloud service's Set-Cookie response header includes the Secure attribute, then (1) although the response is passed back through Nginx, the cookie cannot be written on the intranet machine, and (2) our usual fix is to strip Secure from the response header at the Nginx layer. Note that the 360 browser detects this behavior (it is not clear how) and still refuses to write the cookie. The current version of Chrome writes it successfully, though a future Chrome update may well block it.

SameSite

This important attribute was introduced in Chrome 51. Its three values mean the following:

  • Strict: the cookie is sent only with requests that originate from the same site.
  • Lax: the cookie is also sent with some cross-site requests, namely GET requests that navigate to the target URL. These take three forms: hyperlinks, preload requests, and GET forms. This is the default in Chrome 80+.
  • None: the cookie is sent with every request. SameSite=None must be accompanied by Secure, which means the site must use HTTPS.

Cross-site and cross-origin are not the same thing. Cross-origin means that any of the protocol, host, or port differs. Cross-site is more lenient: two URLs are same-site as long as they share the same second-level domain (the level directly below a top-level domain such as .com, for example test.com).

With the default Lax, cross-site POST forms, iframes, Ajax requests, and images no longer carry the cookie.
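
The cross-origin versus cross-site distinction above can be sketched in a few lines. This is a deliberately naive version: naiveSite assumes the site is simply the last two labels of the host name, which is wrong for suffixes such as .com.cn or .co.uk (browsers consult the Public Suffix List instead), and it ignores the scheme. The function names are my own.

```javascript
// Same origin: protocol, host, and port must all be identical.
function isSameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

// Naive assumption: the "site" is the last two labels of the host name.
function naiveSite(url) {
  return new URL(url).hostname.split('.').slice(-2).join('.');
}

function isSameSite(a, b) {
  return naiveSite(a) === naiveSite(b);
}

const page = 'https://a.test.com/index.html';
const api = 'https://b.test.com/api/user';

console.log(isSameOrigin(page, api)); // false: the hosts differ
console.log(isSameSite(page, api));   // true: both belong to test.com
```

So a request from a.test.com to b.test.com is cross-origin (CORS applies) yet same-site (SameSite cookies are still sent).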

SameParty

SameParty is Chrome's third punch for cookie security. It is used together with the First-Party Sets policy: every cookie that needs to be shared among the domains of a first-party set must carry the SameParty attribute, for example Set-Cookie: name=test; Secure; SameSite=Lax; SameParty. SameParty itself takes no value, but it requires Secure, and SameSite must not be Strict.

The First-Party Sets policy is explained in more detail later, in "How does Google do it".

Priority

At the risk of being long-winded, there is really nothing more to say about this attribute. See the table above.

How can cookies be used to track us

After reading the section above, you should have a fairly complete picture of cookies. So how can cookies be exploited?

The most widespread use of cookies, as we all know, is passing authentication information and implementing single sign-on. But cookies have also been a tool for advertisers to pry into users' privacy since their inception. For example, I search for some XX cup on Tmall today, and the next day my Weibo feed is full of ads for it. It is as if the whole world knew what I had looked at (and it did). So how do they do it?

2.1 Third-party Cookies

Before getting to the common practice, let's distinguish first-party cookies from third-party cookies. Generally, a cookie whose domain is the current domain or a parent of the current domain is a first-party cookie; otherwise it is a third-party cookie. In many cases, though, the "third-party" cookies are also put there by ourselves.

The following picture takes Taobao as an example:

Cookies under taobao.com are first-party, while those under .mmstat.com are clearly third-party.

2.2 Advertising and privacy

Third-party cookies are the workhorse behind all kinds of online advertising, and they are the easiest and cheapest means. Let's continue with the example above, where everybody can find the .mmstat.com domain. So what does this site do?

I first searched Baidu for this domain name. The domain itself cannot be visited directly, but the search results turned up some interesting entries 😅.

In fact, it is not a shady website; it belongs to an advertising and marketing platform under Alibaba.

This is not an advertisement… It just shows that the platform is in the advertising and marketing business.

How does it work? We can look at its requests.

Tag the user

The first request to the mmstat.com domain name is the following:

It loads a nearly invisible GIF image and, at the same time, writes a third-party cookie via Set-Cookie.

Then this GIF is requested.

With the information from that first request, the user has been tagged, and Alimama now knows who you are. When you open another site that integrates Alimama, such as the video site Youku, you will see the same tag ID there. Alimama knows that the person who just opened Taobao has now come over to watch TV on Youku.

Get user footprint

Advertisers want a high conversion rate, so how do you improve it? The best way, of course, is to recommend products to the people who need them. And how do you know whether a user needs a product? Naturally, by recording the user's search and browsing history on Taobao. Tagging this data yields the target audience for a product.

Again taking Taobao as an example: when I clicked on an item, five requests like these were sent:

There is nothing to say about the last request: it is a redirect to Tmall, because the item I clicked is sold on Tmall rather than Taobao. Let's focus on the first four requests.

From the basic behavior of cookies, we know that all four requests carry the cookie written when we first opened Taobao, that is, the information that marks who we are. At the same time, the requests pass parameters that record our behavior, which we will pull out and analyze one by one.

As we can see from the figure above, the first request is almost identical to the second, and the third is almost identical to the fourth. Let's look at the first and second requests first.

The two requests have almost identical parameters; the only difference is that the gokey of the first request has an extra aws=1 parameter. We do not know what each parameter does, of course, but we can infer it from the information it contains. The _p_URL parameter makes this obvious.

Analyzing the value of this parameter shows that it records how we found the product. It states clearly that we reached this page by searching q=%E8%83%8C%E5%8C%85 on s.taobao.com, and the value of q is the URL-encoded name of the product I searched for, "backpack" (背包). The link also carries other parameters, such as sourceId. With these parameters and the third-party cookie planted at the start, Alimama knows that "you entered the product page by searching for backpack on the Taobao home page", and you are tagged as "a user who needs a backpack". Search is an active behavior, so it presumably carries a high weight, which makes this information very useful.
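
Decoding the search keyword takes one call. The URL below is a made-up stand-in for the real referrer (only the host s.taobao.com and the q value come from the request I captured; the /search path is an assumption), and URL and searchParams are standard browser and Node.js APIs.

```javascript
// Hypothetical referrer URL: host and q value are from the captured request,
// the "/search" path is assumed for illustration.
const referrer = new URL('https://s.taobao.com/search?q=%E8%83%8C%E5%8C%85');

// searchParams.get() percent-decodes the value for us.
const keyword = referrer.searchParams.get('q');

console.log(referrer.hostname); // s.taobao.com
console.log(keyword);           // 背包 ("backpack")
```

Anyone who receives this URL as a parameter can recover the search term just as easily.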

The third and fourth requests are similar to the first and second, but contain more detailed information. They have the same address and almost the same parameters; the only difference is the aws value.

Let’s look at the third one:

As you can see, a lot of information is packed into the gokey.

After decoding, you can see some of the information sent, for example:

  • serach_radio_all
  • GongYingLianDIsts
  • list_model
  • isp4p
  • item_click_form

The meaning of these fields can be roughly guessed from their names. Alimama uses them to record your behavior:

  • You opened Taobao
  • You searched for a backpack on the Taobao home page
  • You clicked on a backpack on the first page of results
  • You only looked at it and did not buy it
  • ...

With this information in hand, when we open another site that integrates Alimama ads, such as Youku, we find that it also carries the third-party cookie from mmstat.com, with the same ID as on Taobao. At that point the advertising platform knows: this fellow just wanted to buy a bag and has now run off to Youku to watch a show, so let's push him some bag ads, hehe!

And the basis of all this is tagging us with a small third-party cookie, building a user profile, and then combining that information to push products at us. It is practically a robbery: you have only a vague idea of buying a bag and cannot make up your mind, they keep telling you to buy a bag, buy a bag, and your money is gone.

Conclusion

To summarize, ad tracking basically takes three steps:

  1. On your first visit to a site, a third-party cookie is planted to tag you.
  2. As you browse, the planted cookie is sent along with your browsing footprint to the marketing platform (almost every step, every click, every pause is recorded).
  3. When you visit other sites, such as social networks, the same third-party cookie identifies you, and relevant ads are served.
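
The three steps above can be sketched from the tracker's side. This is purely my own illustration of the general tracking-pixel technique, not Alimama's actual code: the cookie name uid, the function handlePixelRequest, the ID generator, and the example URLs are all hypothetical.

```javascript
// Hypothetical handler for a tracking-pixel request on the tracker's domain.
// Input: the Cookie and Referer headers the browser sent with the request.
function handlePixelRequest(
  { cookieHeader = '', referer = '' },
  newId = () => Math.random().toString(36).slice(2) // illustrative ID generator
) {
  // Step 1: look for an existing tag; plant one if the browser is new to us.
  const match = /(?:^|;\s*)uid=([^;]+)/.exec(cookieHeader);
  const uid = match ? match[1] : newId();

  const headers = { 'Content-Type': 'image/gif' };
  if (!match) {
    // Third-party cookies need SameSite=None, which in turn requires Secure.
    headers['Set-Cookie'] = `uid=${uid}; Max-Age=31536000; Path=/; Secure; SameSite=None`;
  }

  // Steps 2 and 3: every later request ties the same uid to the page visited.
  return { headers, logEntry: { uid, page: referer } };
}

// First visit, on a shopping site: no cookie yet, so a tag is planted.
const firstVisit = handlePixelRequest({ referer: 'https://shop.example/item/1' }, () => 'abc123');
console.log(firstVisit.headers['Set-Cookie']);

// Later, on a video site: the browser sends the tag back automatically.
const laterVisit = handlePixelRequest({ cookieHeader: 'uid=abc123', referer: 'https://video.example/' });
console.log(laterVisit.logEntry); // same uid, different page
```

Joining the log entries by uid is all it takes to see that the shopper and the viewer are the same browser.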

The real implementation is much more complex than this, but I hope the analysis above conveys the idea.

You might feel like you are being watched. The above is only a basic analysis of a shopping site. Baidu and Weibo, which we use the most in daily life, are connected to similar advertising and marketing platforms. In addition, advertising alliances mean that practically every step we take on the Internet is recorded. Still, the data may be scattered among different vendors, and with government regulation in place the major vendors cannot act without restraint. But the feeling of being monitored is always uncomfortable, which is why Google has received more than $10 billion in privacy fines. And it all started with a little cookie.

Note: the above is my analysis of cookie-related advertising behavior based on the requests I observed. Taobao is only an example; the point is to understand how advertising platforms operate.

Cookie front-end management practice

Having covered the basic attributes of cookies and their impact on our daily life, I will now describe how I manage cookies in practice, for your reference.

3.1 Common Front-end Cookie Problems

Before getting to the how, here are some of the problems I have run into with cookies:

  • The main domain hosts embedded sub-projects, so cookie behavior is hard to keep consistent across projects

  • Cookie authentication logic is risky to modify without fully understanding it

  • Different projects share the company's second-level domain, so identically named cookies can get mixed up

  • When an attribute has to be adjusted or the project grows a new subsystem, some cookies may be missed

  • If the company's domain name changes, the related cookie settings must change too, and some may be overlooked

These problems boil down to three points: (1) cookie management is chaotic; (2) cookies are hard to maintain afterwards; (3) cookies conflict between projects.

3.2 Cookie management

Since these problems exist, we should try to solve them. My approach is to declare cookies at initialization time and have the front end intercept operations on undeclared cookies.

I want to manage all the cookies in a project, and even across multiple projects that share a second-level domain, by defining every cookie to be used when the project is initialized. All subsequent cookie operations then deal only with these defined cookies.

First we create an object that declares all the cookies used in the project:

// cookie-schema.js (the variable name cookieSchema is illustrative)
const cookieSchema = {
  cookie1: {
    name: 'cookie1', // the name of the cookie
    domain: 'root', // root: root domain; sub: current domain and its subdomains; current: current domain only (default)
    path: '/', // the path to match
    expires: 30 * 24 * 3600 * 1000, // expires after 30 days (in milliseconds)
    secure: false // whether to restrict the cookie to HTTPS
  },
  cookie2: {
    name: 'cookie2',
    domain: 'sub',
    expires: 30 * 24 * 3600 * 1000, // 30 days
    path: '/cookie-test',
    secure: false
  }
}

The cookie-schema.js above lists all the cookies we will use in the project. It can be declared directly by the front end. Alternatively, to handle conflicts between projects, you can set up a central data service: all cookies available to a project are configured in the back end and loaded when the project initializes.
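
To show what "intercepting undeclared cookie operations" might look like, here is a minimal sketch built on a schema of that shape. It is not the implementation of my tool: createCookieManager and its write callback are names invented for this example, and the domain keywords (root, sub, current) are left unresolved to keep it short.

```javascript
// A cut-down schema in the same shape as cookie-schema.js.
const schema = {
  cookie1: { name: 'cookie1', path: '/', expires: 30 * 24 * 3600 * 1000, secure: false },
};

// Hypothetical manager: only cookies declared in the schema can be written.
// `write` receives the final cookie string; in the browser it would be
// (cookie) => { document.cookie = cookie; }
function createCookieManager(cookieSchema, write) {
  return {
    set(key, value) {
      const def = cookieSchema[key];
      if (!def) {
        throw new Error(`Cookie "${key}" is not declared in the schema`);
      }
      const parts = [`${def.name}=${encodeURIComponent(value)}`];
      if (def.path) parts.push(`path=${def.path}`);
      // The schema stores milliseconds; max-age expects seconds.
      if (def.expires) parts.push(`max-age=${Math.floor(def.expires / 1000)}`);
      if (def.secure) parts.push('secure');
      const cookie = parts.join('; ');
      write(cookie);
      return cookie;
    },
  };
}

// Collect writes in an array here so the sketch also runs outside a browser.
const written = [];
const manager = createCookieManager(schema, (cookie) => written.push(cookie));

console.log(manager.set('cookie1', 'hello world'));
// cookie1=hello%20world; path=/; max-age=2592000

try {
  manager.set('undeclared', 'x');
} catch (err) {
  console.log(err.message); // Cookie "undeclared" is not declared in the schema
}
```

Because every write goes through one function, changing an attribute or a domain later means editing the schema in one place.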

If you are interested, you can pair this with my small tool, which comes with detailed instructions:

cookie-util: github.com/Mrlyk/cooki…

The future of cookies

Because of the privacy problems cookies cause, the major browser vendors have begun restricting third-party cookies. Safari already blocks third-party cookies completely, but since Chrome has the largest market share, most people have not felt much impact yet.

4.1 How does Google do it

Google is already facing more than $10 billion in fines over privacy violations related to cookies. So in early 2020 Google announced that Chrome would completely disable third-party cookies within the next two to three years. Naturally, resistance is strong, and the advertising industry is the loudest voice, which is understandable 😄

Almost two years have passed, and you can see Google following through on cookies. From SameSite in Chrome 51 to First-Party Sets in Chrome 89 and Trust Tokens (which I have not used yet), Chrome keeps tightening restrictions on third-party cookies while offering solutions to legacy problems.

First-Party Sets

In the description of ad tracking above, we saw that mmstat.com and taobao.com are different domains but belong to the same company. Such third-party cookies are effectively ones we integrated ourselves, and we do not want the browser to kill them. For this, Google introduced the First-Party Sets policy. Together with Chrome's SameParty attribute, it lets the domains within one first-party set access each other's cookies.

Of course, to prevent abuse of this policy, Google also imposes the following restrictions:

  • Domains in a First-Party Set must be owned and operated by the same organization.
  • All the domains should be recognizable to users as a group.
  • All the domains should share a common privacy policy.

At the same time, this policy does not allow user information to be exchanged between unrelated sites, so single sign-on across them still needs another solution.

To use this scheme, each domain declares that it is a member (or the owner) of a set by serving a manifest file that defines its relationship to the other domains. The file must be JSON, served at the /.well-known/first-party-set path.

Taking Google's official example, suppose a.example, b.example, and c.example want to form a first-party set owned by a.example. They declare the following JSON files:

// https://a.example/.well-known/first-party-set
{
  "owner": "a.example",
  "members": ["b.example", "c.example"],
  ...
}

// https://b.example/.well-known/first-party-set
{
  "owner": "a.example"
}

// https://c.example/.well-known/first-party-set
{
  "owner": "a.example"
}

Note: this scheme is still in the trial phase and can be tried in Chrome 89 to 93.

If you are interested in Google's privacy work, have a look at this public documentation: developer.chrome.com/docs/privac…

4.2 The Cookie Store API

The Cookie Store API provides an asynchronous API for managing cookies, while also exposing cookies to service workers.

A new official API for working with cookies has also been added, and it is far more convenient than fiddling with document.cookie. However, it works only over HTTPS and its browser compatibility is poor. Usage is similar to my cookie-util tool above. If you are interested, see the official documentation below.

Official documentation: developer.mozilla.org/en-US/docs/…
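
As a taste of the API, here is a minimal sketch using the set, get, and delete methods of the global cookieStore object. The function saveAndReadToken and the cookie name token are made up for the example; the store is passed in as a parameter only so the same code can be tried with a stand-in where cookieStore is unavailable.

```javascript
// Write a cookie, read it back, then delete it, all asynchronously.
// In a supporting browser (HTTPS page), call saveAndReadToken() with no
// argument to use the real global cookieStore.
async function saveAndReadToken(store = globalThis.cookieStore) {
  await store.set({
    name: 'token',
    value: '123',
    path: '/',
    expires: Date.now() + 24 * 60 * 60 * 1000, // one day, as a millisecond timestamp
    sameSite: 'lax',
  });

  // get() resolves to an object describing the cookie, or null if absent.
  const cookie = await store.get('token');

  await store.delete('token');
  return cookie ? cookie.value : null;
}
```

Compared with document.cookie, there is no string to assemble or parse, and each attribute is a named field.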

4.3 Where do third-party cookies go from here

From the analysis above, we can see that browser vendors will mainly crack down on third-party cookies from here on, and Chrome will soon disable them entirely. So how will losing third-party cookies affect our business?

  • Single sign-on: if you previously relied on third-party cookies for single sign-on, pay special attention and find a new scheme
  • Exception tracking tools: these usually tag users with a third-party cookie. Once that is blocked, UV counts may be inflated
  • User behavior analysis tools: without the tag they may stop working
  • ...

Going back to the essence of what cookies do, which is tagging the user: can we tag users some other way to solve some of these problems? Of course, for example with the browser fingerprinting that many people already use.

From the perspective of ordinary users, we certainly want our privacy to be better protected. But the wall of privacy will also help giant companies build ever higher barriers of their own. For example, once Google rolls out models such as FLoC to analyze the behavior of user groups and third-party cookies are blocked, advertisers will have to turn to these companies and use their solutions in order to advertise effectively.

From a technical point of view, we should keep a close eye on the organizations that set these industry standards, stay alert to any change in the specifications, and deal promptly with the risks to our own projects.

End of the article.

References

HTTP cookies: developer.mozilla.org/zh-CN/docs/…

Wikipedia, HTTP cookie: en.wikipedia.org/wiki/HTTP_c…

When browsers fully disable third-party cookies: mp.weixin.qq.com/s?__biz=Mzk…

New SameParty attribute for Cookie: juejin.cn/post/700201…