Let’s start with two scenarios you might encounter on a daily basis:

Scenario 1: You browse a product on a website and read its details, but you do not place an order or even log in. Two days later, you visit other websites on the same computer and find many ads for similar products.

Scenario 2: On a blog or forum, you run several sock-puppet accounts (a “water army”) to inflate a post’s popularity, steer public opinion, or simply trade traffic. When switching accounts you clear the local cache and cookies, restart the router, and go through a VPN. You think you are careful enough and try to make each account look as authentic as possible, yet the moderators may still work out that the same person is behind all of them and ask you to leave.

If you encounter a scenario like the ones above, it’s time to consider whether browser fingerprints are at work.

What is a browser fingerprint

A browser fingerprint is a way of tracking a web browser through the configuration and settings information that is visible to a website. Like a fingerprint on your hand, it can identify an individual, but at this stage what it identifies is the browser.

Fingerprints on the human hand are unique because each one has a distinct pattern, formed by ridges in the skin.

The same goes for a browser fingerprint: you take information that can identify the browser, do some math on it, and get a value, and that value is the browser fingerprint. The identifying information can be the user agent (UA), time zone, geographic location, the language you use, and so on. The information you select determines how accurate the fingerprint is.

A browser fingerprint has little value to a website on its own; what is really valuable is the user information it maps to. For a webmaster, collecting a user’s browser fingerprint and recording that user’s actions is worthwhile, especially where no user identity is available. For example, on a content distribution site, user A likes to browse anime-related content, and this interest is recorded against the browser fingerprint. The next time user A visits, the site can push anime-related content without user A logging in. At a time when PCs are so ubiquitous, this is one more way of delivering content.

For users, linking online behavior to a browser fingerprint is something of a privacy violation, especially when the fingerprint is tied to real user information. Fortunately, the privacy violation this allows is relatively limited, and a website that abuses user behavior data will use up its users’ goodwill.

Browser Fingerprint Background

Browser fingerprint tracking technology has reached generation 2.5.

  • The first generation is stateful. It relies mainly on cookies and evercookies, and needs the user to log in before it yields useful information.
  • The second generation introduced the concept of the browser fingerprint. Collecting more browser feature values, such as the UA and plug-in information, makes users easier to tell apart.
  • The third generation focuses on the person: it collects users’ behavior and habits to build feature values or even models for each user, which makes true tracking possible. This part is relatively complicated to implement and is still being explored.

It is currently at generation 2.5 because the problem still to be solved is cross-browser fingerprint recognition. We will talk about the progress in this area later.

Fingerprint acquisition

Entropy is the average amount of information contained in each received message. The higher the entropy, the more information can be transmitted, and the lower the entropy, the less information can be transmitted.

A browser fingerprint combines many browser features, and each feature value carries a different amount of entropy.
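To make the entropy definition concrete, here is a minimal sketch that computes the Shannon entropy of a single feature from how often each of its values was observed. The function name and the sample counts are made up for illustration; they are not taken from any real data set.

```javascript
// Shannon entropy (in bits) of one feature, given how many browsers
// reported each distinct value. `counts` is a hypothetical input.
function shannonEntropy(counts) {
  const total = counts.reduce((sum, c) => sum + c, 0);
  let bits = 0;
  for (const c of counts) {
    if (c === 0) continue; // a value nobody reported adds nothing
    const p = c / total;
    bits -= p * Math.log2(p);
  }
  return bits;
}

// Four equally common values carry more information than
// one dominant value plus a rare one.
console.log(shannonEntropy([25, 25, 25, 25]));
console.log(shannonEntropy([99, 1]));
```

A feature whose values are spread evenly over many browsers scores high, while a feature that nearly everyone shares scores close to zero.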

Click here to view your browser fingerprint ID and basic information.

Browser fingerprints can also be divided into ordinary fingerprints and advanced fingerprints. Ordinary fingerprints can be understood as parts that are easy to discover and modify, such as HTTP headers:

{
  "headers": {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
    "Host": "httpbin.org",
    "Sec-Fetch-Mode": "navigate",
    "Sec-Fetch-Site": "none",
    "Sec-Fetch-User": "?1",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36"
  }
}

Click here to view your HTTP header information.

Of these headers, Accept-Language and User-Agent are the most telling. Accept-Language reports the browser’s preferred languages. Its value may come from the language of your operating system or from the browser’s own language setting. It is not always relied on, as some sites simply ignore it and choose the page language based on the user’s IP address.

User-Agent contains browser and operating system information; for example, I am currently using macOS and Chrome version 77. If only the UA header is forged, a script in the page may still be able to read the real UA through navigator.userAgent.

Other basic information, such as the IP address, physical address, and geographic location, can also be obtained.


Browser feature information can also be obtained by means other than HTTP headers. Some possible feature values are:

  • User agent string for each browser
  • HTTP ACCEPT header sent by the browser
  • Screen resolution and color depth
  • The time zone the system is set to
  • Browser extensions/plug-ins installed in the browser, such as Quicktime, Flash, Java or Acrobat, and versions of these plug-ins
  • Fonts installed on the computer, as reported by Flash or Java
  • Whether the browser executes JavaScript
  • Whether the browser accepts cookies and “super cookies”
  • The hash of the image generated by the Canvas fingerprint
  • The hash of the image generated by the WebGL fingerprint
  • Whether the browser is set to Do Not Track
  • System platform (e.g. Win32, Linux x86)
  • System language (e.g. zh-CN, en-US)
  • Whether the browser supports a touch screen

Once you have these values, you can compute the entropy of each feature and derive a UUID for the browser.

The following table shows the information entropy of several feature values:

Variable          Entropy (bits)
user agent        10.0
plugins           15.4
fonts             13.9
video             4.83
supercookies      6.09
timezone          3.04
cookies enabled   0.353
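As a rough way to read these numbers, a feature with H bits of entropy narrows the field to, on average, about one in 2^H browsers. The sketch below applies that rule of thumb to the values in the table. The function names are made up for illustration, and simply adding the bits together gives only an upper bound, since features such as fonts and plug-ins are correlated rather than independent.

```javascript
// Entropy values copied from the table above.
const entropyBits = {
  "user agent": 10.0,
  plugins: 15.4,
  fonts: 13.9,
  video: 4.83,
  supercookies: 6.09,
  timezone: 3.04,
  "cookies enabled": 0.353,
};

// Rule of thumb: H bits means roughly one in 2^H browsers shares the value.
function oneInHowMany(bits) {
  return Math.pow(2, bits);
}

// Upper bound only: assumes every feature is independent of the others.
function upperBoundBits(table) {
  return Object.values(table).reduce((sum, bits) => sum + bits, 0);
}

for (const [name, bits] of Object.entries(entropyBits)) {
  console.log(name + ": about 1 in " + Math.round(oneInHowMany(bits)));
}
console.log("upper bound (bits): " + upperBoundBits(entropyBits).toFixed(3));
```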

The information captured by ordinary fingerprints is still not unique enough; after all, there are plenty of macOS users in Shenzhen. Advanced fingerprints narrow the field much further, to the point of almost identifying a single browser.

Canvas fingerprint

Canvas is a dynamic drawing tag in HTML5 that can also be used to generate or manipulate images. Even when the same element is drawn, systems differ in font rendering engine, anti-aliasing, sub-pixel rendering and other algorithms, so the canvas produces a different result when converting the same text into an image. The process is as follows:

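function getCanvasFingerprint() {
    // Assumes the page contains <canvas id="anchor-uuid">
    var canvas = document.getElementById("anchor-uuid");
    var context = canvas.getContext("2d");
    // Font rendering and anti-aliasing differ between systems
    context.font = "18pt Arial";
    context.textBaseline = "top";
    context.fillText("Hello, user.", 2, 2);
    // Serialize the rendered pixels; this string is the raw fingerprint
    return canvas.toDataURL("image/jpeg");
}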

The function renders some text on the canvas and converts it with toDataURL. The resulting value stays the same even when private browsing mode is enabled.
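The data URL is long, so a site would typically reduce it to a short hash before storing or comparing it. The sketch below uses 32-bit FNV-1a purely as a stand-in, since the article does not say which hash is used; the function names are illustrative.

```javascript
// 32-bit FNV-1a over a string, returned as 8 hex digits.
// A stand-in for whatever hash a real site would choose.
function fnv1a32(text) {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return (hash >>> 0).toString(16).padStart(8, "0");
}

// Turn the string returned by canvas.toDataURL() into a short ID.
function canvasFingerprintId(dataUrl) {
  return fnv1a32(dataUrl);
}

// In a browser this would be: canvasFingerprintId(getCanvasFingerprint())
console.log(canvasFingerprintId("data:image/jpeg;base64,AAAA"));
```

Two machines that render the text even slightly differently produce different data URLs and therefore, almost always, different IDs.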


AudioContext fingerprint

The AudioContext fingerprint is similar to the canvas fingerprint: hardware and software differences produce different audio output, which is then hashed to mark the browser. The audio is not actually played in the browser; only the processed data from before playback is needed. audiofingerprint.openwpm.com/

WebRTC

WebRTC (Web Real-Time Communication) lets browsers exchange audio and video in real time. It provides three major APIs through which JS can obtain and exchange audio and video data in real time: MediaStream, RTCPeerConnection, and RTCDataChannel. To establish communication over WebRTC, the user’s real IP has to be exposed (for NAT traversal), so RTCPeerConnection provides an API through which JS can read the user’s IP address directly.
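The addresses show up in the ICE candidate strings that RTCPeerConnection hands to the page. As a minimal sketch, the helper below pulls the address out of one such string. The helper name, the sample candidate, and the STUN host in the comments are all made up for illustration, and current browsers may replace local addresses with obfuscated hostnames, so the field is not always a raw IP.

```javascript
// Extract address, port and type from an ICE candidate string of the form
// "candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ..."
function addressFromIceCandidate(candidate) {
  const fields = candidate.trim().split(/\s+/);
  if (fields.length < 8 || !fields[0].startsWith("candidate:") || fields[6] !== "typ") {
    return null; // not a candidate line we understand
  }
  return { address: fields[4], port: Number(fields[5]), type: fields[7] };
}

// In a browser (not run here), candidate strings would arrive like this:
//   const pc = new RTCPeerConnection({ iceServers: [{ urls: "stun:stun.example.org" }] });
//   pc.createDataChannel("probe");
//   pc.onicecandidate = (e) => {
//     if (e.candidate) console.log(addressFromIceCandidate(e.candidate.candidate));
//   };
//   pc.createOffer().then((offer) => pc.setLocalDescription(offer));

const sample = "candidate:842163049 1 udp 1677729535 203.0.113.7 46154 typ srflx raddr 0.0.0.0 rport 0";
console.log(addressFromIceCandidate(sample));
```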

Cross-browser fingerprint

All the browser fingerprints mentioned above were obtained within a single browser. Many feature values are unstable across browsers, however: the UA and canvas fingerprints, for example, are completely different when the same page is opened in different browsers on the same device. The same fingerprint algorithm therefore does not carry over between browsers (meaning different browsers on the same device).

A cross-browser fingerprint is built from stable browser features that yield the same or nearly the same value across browsers.

Cross-browser fingerprints have also been studied, and the paper on the subject includes a table of candidate features.

Conventional feature values struggle to stay highly stable across browsers while still carrying enough information.

The features that hold up are Task (a) to Task (r), List of fonts (JS), TimeZone, and CPU virtual cores.

Task (a) to Task (r) are graphics card rendering tasks. Task (a), Texture, for example, tests the texture functionality of a regular fragment shader by rendering a texture whose pixels take random values in the three primary colors. To map the texture onto every point of the model, the fragment shader has to interpolate between points in the texture, and this interpolation algorithm is inconsistent across graphics cards. The more the texture varies, the more obvious the difference, and recording this difference makes it possible to tell graphics cards apart.

List of fonts (JS) collects the fonts supported on the page. There used to be two ways to do this, Flash and JS, and Flash is now out of the picture. The JS approach does not read a font list directly. It measures the rendered size of HTML text elements set in different fonts, and uses the result to distinguish one device from another.
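The measuring idea can be sketched as follows: render a sample string in a fallback font, then again with each candidate font placed in front of the fallback, and treat any change in width as a sign that the candidate is installed. This is a simplified illustration rather than the paper’s method. The measuring function is passed in so the logic can run outside a browser, and the fake measurer and its widths are invented for the example.

```javascript
// Return the candidate fonts whose rendered width differs from the fallback.
// measureWidth(text, fontFamily) is supplied by the caller.
function detectFonts(candidates, measureWidth, baseFont = "monospace") {
  const sample = "mmmmmmmmmmlli";
  const baseline = measureWidth(sample, baseFont);
  return candidates.filter(
    (font) => measureWidth(sample, "'" + font + "', " + baseFont) !== baseline
  );
}

// In a browser, measureWidth could be built on a canvas (not run here):
//   const ctx = document.createElement("canvas").getContext("2d");
//   const measure = (text, family) => {
//     ctx.font = "72px " + family;
//     return ctx.measureText(text).width;
//   };

// Fake measurer standing in for a machine that has Arial and Helvetica Neue.
function fakeMeasure(text, fontFamily) {
  if (fontFamily.startsWith("'Arial'")) return 120;
  if (fontFamily.startsWith("'Helvetica Neue'")) return 130;
  return 100; // everything else falls back to the base font
}

console.log(detectFonts(["Arial", "NoSuchFont", "Helvetica Neue"], fakeMeasure));
```

A font that happens to render at exactly the fallback width would be missed, which is why a more careful detector would compare against several fallback fonts.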

The TimeZone should be the same on the same device.

CPU virtual cores is the number of logical CPU cores. The simplest way to read it is through navigator.hardwareConcurrency.
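Putting the stable features together, a cross-browser ID could be derived roughly as in the sketch below. The field names, the sample values, and the choice of FNV-1a as the hash are assumptions made for illustration and do not come from the paper.

```javascript
// 32-bit FNV-1a, used here only as a stand-in hash.
function hashString(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, "0");
}

// Combine features expected to be stable across browsers on one device.
function crossBrowserId(features) {
  const canonical = [
    features.timeZone,
    String(features.cpuCores),
    [...features.fonts].sort().join(","), // order must not matter
    features.renderTaskHashes.join(","),  // one hash per rendering task
  ].join("|");
  return hashString(canonical);
}

// Hypothetical values. In a browser, timeZone could come from
// Intl.DateTimeFormat().resolvedOptions().timeZone and cpuCores from
// navigator.hardwareConcurrency.
const device = {
  timeZone: "Asia/Shanghai",
  cpuCores: 8,
  fonts: ["Arial", "PingFang SC"],
  renderTaskHashes: ["1a2b", "3c4d"],
};
console.log(crossBrowserId(device));
```

Sorting the font list before hashing keeps the ID stable when two browsers report the same fonts in a different order.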

How to prevent

Unless you have enough expertise or change your browser information very frequently, a browser fingerprint can locate you with almost 100% certainty, but that is not all bad.

  • The privacy exposure is very one-sided; it reveals only part of the user’s browsing behavior.
  • The value is limited, because the user behavior is not tied to an actual account or a specific person.
  • It has beneficial uses: browser fingerprints can isolate some black-market users and help prevent vote rigging and other malicious behavior.

Even so, there are a few things you can do to limit browser fingerprinting.

Do Not Track

A browser can send a DNT (“Do Not Track”) flag in the HTTP header. A value of 1 means “do not track my web behavior”, and 0 means tracking is allowed. Even without a cookie, this flag tells the server that I do not want to be tracked or have my behavior recorded.
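On the server side, honoring the flag comes down to reading one header. A minimal sketch, assuming the request headers are available as a plain object (the function name and return labels are made up for illustration):

```javascript
// Interpret the DNT request header.
// Returns "do-not-track", "tracking-allowed", or "unspecified".
function dntPreference(headers) {
  // Header names are case-insensitive, so match on the lowercased name.
  const entry = Object.entries(headers).find(
    ([name]) => name.toLowerCase() === "dnt"
  );
  if (!entry) return "unspecified";
  const value = String(entry[1]).trim();
  if (value === "1") return "do-not-track";
  if (value === "0") return "tracking-allowed";
  return "unspecified"; // any other value is treated as no preference
}

console.log(dntPreference({ DNT: "1" }));
console.log(dntPreference({ "User-Agent": "Mozilla/5.0" }));
```

A site that respects the convention would skip its behavior logging whenever this returns "do-not-track".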

The bad news is that most sites currently don’t follow this convention and completely ignore the “Do Not Track” signal.

EFF offers a tool called Privacy Badger, a browser add-on ad blocker that whitelists the ads of companies that adhere to this convention, giving more companies an incentive to comply with “Do Not Track” so that their ads are fully displayed.

Personally, I think this is a good approach. If users adopt this tool, websites will have to weigh the interests of both sides before collecting users’ behavior, which reduces the risk of privacy disclosure for users.

More information about Privacy Badger can be viewed here.

Tor Browser

From what we’ve learned about browser fingerprints, it’s not hard to see that the more features your browser exposes, the easier it is to track. On the other hand, if you intentionally hide or alter certain browser features, congratulations: your browser may end up with a unique fingerprint that sets you apart from other users without any calculation at all.

The effective approach, therefore, is to make your feature values as common as possible. For example, the most popular combination on the market is Windows 10 + Chrome, so changing the UA to this combination is effective. At the same time, try to keep websites from obtaining feature values with very high information entropy, such as the canvas fingerprint.

The Tor browser has done a lot of work here to prevent these features from being used to track Tor users. In response to Panopticlick and other fingerprinting experiments, it now includes patches that prevent font fingerprinting (by limiting the fonts that websites can use) and canvas fingerprinting (by detecting reads on HTML5 canvas objects and requiring user approval). Running the canvas fingerprint code above in Tor, for example, brings up a warning. You can also configure the Tor browser to actively block JavaScript.

Disable JS

This is a more drastic approach. Disabling JavaScript outright is a strong defense against browser fingerprint tracking, but it will make a large portion of pages unusable.

Unfortunately, even with JS disabled, a site can still use CSS to collect browser information, for example:

@media (device-width: 1080px) {
  body {
    /* Fetched only when the screen is 1080px wide */
    background: url("https://example.org/1080.png");
  }
}

You can look at the server’s request logs for 1080.png and see which users have 1080px-wide screens. Mozilla Firefox even used to support a CSS query that could directly detect the Windows version and Windows theme. This has since been fixed.

Resources

(Cross-)Browser Fingerprinting via OS and Hardware Level Features

2.5 Fingerprint tracking technology – Cross-browser fingerprint recognition

navigator.hardwareConcurrency

Panopticlick

