Author: green plum main code
Background
With the rapid development of the Internet, Web services have grown quickly, and the people who use them face growing network security risks. According to the White Paper on China’s Cybersecurity Industry released by the China Academy of Information and Communications Technology, China’s cybersecurity industry is growing rapidly, and its scale is expected to reach 170.2 billion yuan in 2020.
Web security accounts for a share of that market, and websites have become an important battlefield for network attack and defense.
Browser fingerprints play an increasingly important role in active defense for Web security. This article introduces browser fingerprints from three angles: an overview, the principle of fingerprint identification, and application scenarios.
Overview of Browser Fingerprints
Proposed in 2010 by Peter Eckersley of the Electronic Frontier Foundation (EFF), a browser fingerprint combines properties that the browser transmits freely into a character string that serves as an identifier, much like a human fingerprint. Which properties are commonly used? The table below lists them, where:
- Fingerprint factor: a publicly readable attribute of the browser. For example, userAgent can be read from navigator.userAgent, and colorDepth from screen.colorDepth.
- Stability: the value of the fingerprint factor does not change after the browser is refreshed. For example, colorDepth is the color depth of the screen; if it is 24 in Chrome, it is still 24 after a refresh, so it is a stable fingerprint factor.
- Independence: the value of the fingerprint factor does not change when different browsers are used on the same device. For example, devicePixelRatio is the ratio of physical pixels to CSS pixels on the current display device. Its value is the same in Chrome, Firefox, Edge, Internet Explorer, Opera, and Safari on the same device, so devicePixelRatio is an independent fingerprint factor.
Fingerprint factor | Stability | Independence |
---|---|---|
userAgent | Stable | No |
language | Stable | Yes (most of the time) |
colorDepth | Stable | Yes |
deviceMemory | Stable | Yes |
pixelRatio | Stable | Yes |
hardwareConcurrency | Stable (but not supported by IE) | Yes (but not supported by IE) |
screenResolution | Stable | Yes |
availableScreenResolution | Stable | Yes |
timezoneOffset | Stable | Yes |
timezone | Stable | Yes |
sessionStorage | Stable | No |
localStorage | Stable | No |
indexedDb | Stable | No |
addBehavior | Stable | No |
openDatabase | Stable | No |
cpuClass | Stable | Yes |
platform | Stable | Yes (most of the time) |
doNotTrack | Stable | No |
plugins | To be determined | No |
canvas | Stable (most of the time) | No (verified) |
webgl | Stable (most of the time) | No (verified) |
adBlock | Stable (depends on when it is used) | No |
hasLiedLanguages | Stable | Yes |
hasLiedResolution | Stable | Yes |
hasLiedOs | Stable | Yes |
hasLiedBrowser | Stable | Yes |
touchSupport | Stable | Yes |
fonts | Stable (most of the time) | No (most of the time) |
audio | Unknown | No |
Data source for the fingerprint factor table: Browser independent Components & Stable Components
We can read the values of these fingerprint factors with JavaScript, and combining the stable and independent factors into a browser fingerprint makes device identification possible. In other words, the browser fingerprint gives us a fingerprint of the hardware device behind it, such as a personal device or PC.
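As a rough illustration of reading factors and combining them, the JavaScript sketch below collects a few of the stable factors from the table and condenses them into one identifier string. The `env` wrapper, the function names, and the choice of a 32-bit FNV-1a-style hash are assumptions made for this example; real fingerprinting libraries collect many more factors.

```javascript
// Read a few stable fingerprint factors from browser-like objects.
// `env` is a hypothetical wrapper ({ navigator, screen, devicePixelRatio })
// so that the sketch can also run outside a browser.
function collectFactors(env) {
  const { navigator, screen } = env;
  return {
    userAgent: navigator.userAgent,
    language: navigator.language,
    colorDepth: screen.colorDepth,
    pixelRatio: env.devicePixelRatio,
    hardwareConcurrency: navigator.hardwareConcurrency,
    screenResolution: [screen.width, screen.height],
    timezoneOffset: new Date().getTimezoneOffset(),
  };
}

// 32-bit FNV-1a-style hash over UTF-16 code units
// (an assumption for this sketch; any stable hash function would do).
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, "0");
}

// Serialize the factors in a fixed key order and hash them into one identifier.
function fingerprint(factors) {
  return fnv1a(JSON.stringify(factors));
}
```

In a browser, this could be called as `fingerprint(collectFactors({ navigator, screen, devicePixelRatio: window.devicePixelRatio }))`.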
You may wonder what these fingerprint factors can be used for, and how a fingerprint is associated with a device. Let’s look at the principle behind browser fingerprint recognition.
Recognition principle
Information entropy
We have covered the common browser fingerprint factors, but how do we measure how much identifying information each one carries? Information theory gives us a tool: information entropy. The higher the entropy, the more information is transmitted; the lower the entropy, the less. Entropy can therefore serve as the measure of how identifying a browser fingerprint is. For a discrete random variable X that takes the value x_i with probability P(x_i), the entropy H(X) is:
$$H(X) = -\sum_{i} P(x_i) \log_2 P(x_i)$$
Entropy value of single fingerprint factor
In the formula above, the logarithm is taken to base 2, so entropy is measured in bits. Simply put, it quantifies the information carried by a fingerprint factor. Take the fingerprint factor doNotTrack as an example. Its value falls into two cases:
- Enabled can be marked as 1 and disabled as 0.
- Suppose that, among the users who visit the website in the statistics, 10% have doNotTrack set and 90% have it unset.
Then the entropy of the doNotTrack fingerprint factor is:
$$H(\text{doNotTrack}) = -(0.1 \log_2 0.1 + 0.9 \log_2 0.9) \approx 0.469 \text{ bits}$$
Note: for a refresher, see Fundamentals of Information Theory in the reference links.
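The entropy of a single factor can be checked with a few lines of JavaScript. This is a minimal sketch; the function name `entropyBits` is chosen for this example, and the input is the 10% / 90% split from the doNotTrack example.

```javascript
// Shannon entropy, in bits, of a discrete distribution given as probabilities.
function entropyBits(probabilities) {
  return probabilities
    .filter((p) => p > 0) // a zero-probability outcome contributes nothing
    .reduce((sum, p) => sum - p * Math.log2(p), 0);
}

// Two-valued factor with a 10% / 90% split, as in the doNotTrack example.
const doNotTrackEntropy = entropyBits([0.1, 0.9]);
console.log(doNotTrackEntropy.toFixed(3)); // 0.469
```

A factor split evenly between two values would carry a full bit, so the more lopsided the distribution, the less a factor helps to tell browsers apart.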
Multiple fingerprint factor probability
Knowing how to calculate the information entropy of a single fingerprint factor, we can go on to the information carried by several fingerprint factors combined. Peter Eckersley’s paper How Unique Is Your Web Browser? generated browser fingerprints from eight fingerprint factors, including userAgent, fonts, screenResolution and plugins.
The statistical results show that the eight fingerprint factors together contain at least 18.1 bits of information entropy. The self-information of a fingerprint x that occurs with probability P(x) is:
$$I(x) = -\log_2 P(x)$$
From this we can deduce the probability that the same fingerprint occurs:
$$P(x) = 2^{-I(x)} \approx 2^{-18.1}$$
According to the paper, a randomly chosen browser is expected to share its fingerprint with at most one in 286,777 other browsers (2^18.1 is about 281,000; 286,777 corresponds to about 18.13 bits, so 18.1 is a rounded value). Browser fingerprints are therefore highly identifying, and as more fingerprint factors are added, the probability of two browsers sharing a fingerprint falls and identification accuracy rises.
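The relationship between bits of self-information and the chance of a matching fingerprint can be sketched in JavaScript as follows. The function names are chosen for this example only.

```javascript
// Self-information, in bits, of an outcome that occurs with probability p.
function selfInformationBits(p) {
  return -Math.log2(p);
}

// Expected "one in N" rarity of a fingerprint carrying `bits` of information.
function oneIn(bits) {
  return Math.pow(2, bits);
}

console.log(Math.round(oneIn(18.1))); // rarity implied by 18.1 bits
console.log(selfInformationBits(1 / 286777).toFixed(2)); // 18.13
```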
Fingerprint duplication
Cloud servers and virtual hosts are now common, and identical virtual systems and devices can be cloned at any time, much like Ghost system images. In their initial state they are likely to produce the same fingerprint, so how do we tell apart devices with identical “factory settings”?
The fingerprint identification described above is based on static rules, that is, identification happens only once. In reality, a device’s fingerprint changes over time and depends entirely on the user’s habits. For example, if you install a plug-in in your browser today, the browser’s fingerprint changes. An association mechanism that combines static rules with dynamic tracking can therefore follow the evolution of a fingerprint over the long term. For those interested, see the study FP-Stalker: Tracking Browser Fingerprint Evolutions, which reports that a browser can be tracked for 54.48 days on average. During tracking, the fingerprint data set can be reclassified to avoid the duplication problem described above.
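One way to picture tracking a fingerprint as it evolves is a simple linking rule: a new fingerprint is attributed to a known browser when only a few factors have changed. The sketch below is a minimal illustration with hypothetical names (`countChanged`, `linkFingerprint`, `maxChanged`) and an arbitrary threshold; it is not the algorithm from the FP-Stalker study.

```javascript
// Count how many factors differ between two fingerprints (plain objects).
function countChanged(a, b) {
  const keys = new Set([...Object.keys(a), ...Object.keys(b)]);
  let changed = 0;
  for (const key of keys) {
    if (JSON.stringify(a[key]) !== JSON.stringify(b[key])) changed++;
  }
  return changed;
}

// Link a new fingerprint to the closest known browser, or return null
// when every known fingerprint differs in more than `maxChanged` factors.
function linkFingerprint(known, fresh, maxChanged = 1) {
  let best = null;
  for (const [id, factors] of Object.entries(known)) {
    const changed = countChanged(factors, fresh);
    if (changed <= maxChanged && (best === null || changed < best.changed)) {
      best = { id, changed };
    }
  }
  return best ? best.id : null;
}
```

Identical clones still tie under such a rule, so it would need further signals collected over time to separate them.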
Summary
To sum up, we can calculate the entropy of each browser fingerprint factor and, using self-information, the probability that a given fingerprint occurs. From the fingerprint value and that probability we can infer the device, which is the principle behind browser fingerprint generation. Next, let’s look at where browser fingerprints are applied.
Application scenarios
Common application scenarios of browser fingerprint are as follows:
Active defense against malicious crawlers
For websites, many kinds of automated crawlers collect data, for purposes such as ticket grabbing, comment spamming, and scraping private data. The main defense measures today include passive detection and defense, IP address detection and blocking, and browser fingerprinting combined with active defense.
Scanner recognition and interception
Besides Web crawler tools, there are professional Web vulnerability scanners such as AWVS, AppScan, and Xray. While scanners help identify Web vulnerabilities, they also pose potential security problems for online Web services. Because a scanner’s browser fingerprint is distinctive, the scanner can be identified and intercepted when it scans a specific site.
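As a minimal sketch of fingerprint-based interception, assuming the server already holds a list of fingerprints previously attributed to scanners or crawlers, a request filter could look like the following. The names `blockedFingerprints` and `decide`, the sample fingerprint values, and the "challenge" outcome for a missing fingerprint are all hypothetical.

```javascript
// Fingerprints previously attributed to scanners or crawlers (hypothetical values).
const blockedFingerprints = new Set(["3f9a1c07", "b2e4d5aa"]);

// Decide what to do with a request that carries a browser fingerprint.
function decide(fingerprint) {
  // Assumption: a missing fingerprint means the collection script did not run.
  if (!fingerprint) return "challenge";
  return blockedFingerprints.has(fingerprint) ? "block" : "allow";
}
```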
Tracing the source of Web attacks
All kinds of scripts run in the browser, which makes browser scripts well suited to tracing the source of Web attacks. By extracting and correlating browser fingerprint features, the attacker can be traced.
Personalized ad targeting
When you browse or search for products online, do you often see related advertisements afterwards? Chances are you are being tracked, because your screen resolution, time zone, and emoji set are all exposed online. You can even be tracked in private browsing mode; see the reference links.
These are some of the application scenarios for browser fingerprints. As we can see, a browser fingerprint can be used by advertisers for ad targeting, and it can also serve as an identification criterion for defense.
Conclusion
This article introduced browser fingerprints through an overview, the recognition principle, and application scenarios. It covered the common fingerprint factors, analyzed how browser fingerprints are associated with devices, and briefly described some application scenarios. In the next article, we’ll design a simple fingerprint-tracking model to see what a browser fingerprint can do in practice. Stay tuned.
Reference links
- xprize
- fingerprintjs
- Fundamentals of Information Theory
- How do trackers work?
- White Paper on China’s cybersecurity Industry
- New findings based on half a million browser fingerprints
- How Unique Is Your Web Browser?
- Learn how identifiable you are on the Internet
- FP-STALKER: Tracking Browser Fingerprint Evolutions
- Beauty and the Beast: Diverting modern web browsers to build unique browser fingerprints