  • Original article: We collected 500,000 browser fingerprints. Here is what we found.
  • Peter Hraška
  • Translated by: The Nuggets Translation Project
  • Permanent link to this article: github.com/xitu/gold-m…
  • Translator: Usualminds
  • Proofread: PassionPenguin, Chorer, lsvih

Remember the last time you looked for a product online and were then flooded with ads for it? Chances are you were being tracked, because your screen resolution, time zone, and emoji set are all exposed online.

And yes, you can be tracked even in private browsing mode (also known as incognito mode).

At Slido, we conducted the largest public study of browser fingerprinting accuracy and the world's first comprehensive study of how fingerprinting performs on smartphones.

So let’s take a look at what browser fingerprints are, how they can be used to track you, and how “good” they are at it.

What is a browser fingerprint?

I'll use a simple analogy with cameras and typewriters to explain what a browser fingerprint is.

Both cameras and typewriters can be easily identified and distinguished by their output.

Each camera leaves a unique noise pattern on its pictures, and each typewriter leaves ink marks around the letters in its own characteristic way.

By looking at two images and comparing their noise patterns, we can accurately determine if they were taken by the same camera.

The same principle applies to browsers. Most browsers have JavaScript enabled, which exposes a lot of information about your browser to the outside world.

This includes your screen size, emoji set, installed fonts, language, time zone, and graphics card model. All of it can be read from your browser without you ever noticing.

On its own, each of these attributes is trivial. Combined, they make it possible for anyone to identify a particular browser with high accuracy.
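To make the idea of combining attributes concrete, here is a minimal sketch in Python. It assumes the attributes have already been collected in the browser and sent to a server. The attribute names and values are hypothetical, and hashing a canonical JSON string with SHA-256 is just one plausible way to combine them, not necessarily what any real fingerprinting script does.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Combine individual browser attributes into one identifier."""
    # Sort keys so the same attributes always serialize to the same string.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical attribute values, for illustration only.
browser_a = {
    "screen": "1920x1080",
    "timezone": "Europe/Bratislava",
    "language": "sk-SK",
    "fonts": ["Arial", "Helvetica", "Times New Roman"],
}
# The same browser after one small change: a newly installed font.
browser_b = dict(browser_a, fonts=browser_a["fonts"] + ["Comic Sans MS"])

same = fingerprint(browser_a) == fingerprint(dict(browser_a))
changed = fingerprint(browser_a) != fingerprint(browser_b)
print(same, changed)
```

Note that a single changed attribute produces a completely different hash, which is one reason fingerprints are fragile over time.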

If you are interested, you can view some of your browser fingerprint information on my website.

How are browser fingerprints used?

One might think that browser fingerprints are inherently bad, but that’s not the case.

Fraud prevention is one example. If you have any kind of account that you log in to online, such as a bank account or a social network account, you generally need only an email address and a password to log in.

If one day a thief steals your login credentials and tries to log in from their own device, the bank or social network can detect this unusual behavior because the browser fingerprint has changed. To prevent fraud, the platform may then ask for further authorization, such as an SMS verification code.
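The check described above can be sketched in a few lines of Python. This is only an illustration: the function name, the return values, and the idea of storing a set of previously seen fingerprints per account are assumptions, not a description of how any particular bank or social network works.

```python
def login_decision(known_fingerprints: set, current_fingerprint: str) -> str:
    """Decide what to do after the password has already been verified."""
    if current_fingerprint in known_fingerprints:
        # The browser has been seen on this account before.
        return "allow"
    # Unknown browser: ask for an extra factor such as an SMS code.
    return "require_second_factor"


# Hypothetical fingerprints previously seen for one account.
seen_before = {"3f9a", "b21c"}
print(login_decision(seen_before, "b21c"))
print(login_decision(seen_before, "77e0"))
```

Because fingerprints change on their own, a real system would treat a mismatch as a signal for extra verification rather than as proof of fraud.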

By far the most widespread use of browser fingerprints, however, is ad personalization. "Like" and "share" buttons appear on almost every website, and they often include a script that collects your browser fingerprint and can thereby reconstruct your browsing history.

A single fingerprint from one device is not worth much. Neither are many fingerprints collected on a single website.

But because social buttons, and with them the fingerprinting scripts, are almost everywhere, social networks can tell how you move around the web.

That is how the tech giants can show you ads for something you searched for half an hour ago.

Can my browser be identified everywhere, all the time?

Of course not.

There are several sites, such as AmIUnique and Panopticlick, that determine if your browser’s fingerprint is unique, based on the roughly 1 million fingerprints in their database.

Your browser fingerprint will most likely be marked as unique. That sounds scary, but bear with me: once we understand how fingerprinting works, it becomes less worrying.

These sites compare your fingerprint to their entire database, which contains data collected over at least two to three years (or 45 days in Panopticlick’s case).

However, both 45 days and two years are plenty of time for your browser fingerprint to change without you doing anything. My own browser fingerprint, for example, changed six times in 60 days.

Such changes can be caused by automatic browser updates, window resizing, newly installed fonts, or even daylight saving time. All of this makes it very hard to identify your browser uniquely over a long period.

The data that we analyzed

Panopticlick and AmIUnique have both published excellent scientific papers on browser fingerprints, analyzing hundreds of thousands of browser fingerprints.

Our data differ in several key respects. We believe this will help us reveal more about browser fingerprints.

  • We analyzed 566,704 browser fingerprints, roughly twice as many as the largest previous study.
  • 65 percent of the devices in our dataset were smartphones.
  • The dataset contains 31 different fingerprint features for each device, collected with state-of-the-art fingerprinting scripts.

We take online privacy very seriously, so all data was analyzed anonymously. We took great care to ensure that the data cannot be used for anything other than this study.

Results

Probably the most intuitive chart we can derive from our data is one that shows anonymity set sizes.

An anonymity set describes how many different devices share exactly the same browser fingerprint.

For example, an anonymity set of size 1 means that the browser fingerprint is unique. An anonymity set of size 5 means that five different devices have exactly the same fingerprint, so you can't tell them apart based on fingerprints alone.
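Computing anonymity set sizes is a simple counting exercise. The sketch below assumes one fingerprint per device; the device identifiers and fingerprint strings are made up for illustration.

```python
from collections import Counter


def anonymity_set_sizes(device_fingerprints: dict) -> dict:
    """Map each device to the number of devices sharing its fingerprint."""
    counts = Counter(device_fingerprints.values())
    return {dev: counts[fp] for dev, fp in device_fingerprints.items()}


def unique_share(device_fingerprints: dict) -> float:
    """Fraction of devices whose anonymity set has size 1."""
    sizes = anonymity_set_sizes(device_fingerprints)
    return sum(1 for size in sizes.values() if size == 1) / len(sizes)


# Hypothetical toy data: device id -> fingerprint.
toy = {"d1": "aaa", "d2": "bbb", "d3": "bbb", "d4": "ccc", "d5": "bbb"}
print(anonymity_set_sizes(toy))
print(unique_share(toy))
```

In this toy data, d1 and d4 are uniquely identifiable, while d2, d3, and d5 hide in an anonymity set of size 3.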

The anonymity set sizes for the most common device types in our dataset are as follows:

After looking at this chart, we can draw a few clear conclusions:

  • 74 percent of desktop devices were uniquely identifiable, compared with only 45 percent of mobile devices.
  • Only 33 percent of the browser fingerprints collected from iPhones were unique.
  • Another 33 percent of iPhones were almost impossible to track, because more than 20 iPhones shared exactly the same browser fingerprint.

Rate of fingerprint change

Another interesting observation came when we looked at how often the browser fingerprints on individual devices changed.

The following chart shows the number of days between a device's first visit and the first change of its browser fingerprint:

We can see that within 24 hours, nearly 10 percent of the devices we saw more than once had already changed their fingerprint.
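The quantity plotted here can be computed per device roughly as follows. This is a sketch under the assumption that each visit is stored as a (date, fingerprint) pair; the data layout and function name are hypothetical.

```python
from datetime import date


def days_until_first_change(visits):
    """Days from the first visit to the first visit with a different
    fingerprint, or None if the fingerprint never changed."""
    ordered = sorted(visits, key=lambda visit: visit[0])
    if not ordered:
        return None
    first_day, first_fp = ordered[0]
    for day, fp in ordered[1:]:
        if fp != first_fp:
            return (day - first_day).days
    return None


# Hypothetical visit history of one device.
history = [
    (date(2019, 1, 1), "aaa"),
    (date(2019, 1, 3), "aaa"),
    (date(2019, 1, 8), "bbb"),
]
print(days_until_first_change(history))
```

A change can only be observed when the device comes back, so this value is an upper bound on how long the fingerprint actually stayed the same.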

Let's look at each device type separately:

The chart shows that 19 percent of iPhones changed their fingerprint within a week, compared with just 3 percent of Android phones over the same period. Our data suggests that iPhones are harder to track over time than Android phones.

Minimum number of fingerprint features

Finally, we looked at how many fingerprint features need to be collected from a browser to identify it reliably.

To do this, we used Shannon entropy to measure how precisely a set of features identifies a browser. The higher the entropy, the more accurate the identification.

For example, an entropy of 14.2 means that roughly one in every 19,000 browser fingerprints is exactly the same as mine. Raising the entropy to 16.5 means that roughly one in every 92,500 devices has the same fingerprint as me.
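The "one in N" figures follow from reading the entropy H as bits: if every fingerprint were equally likely, a given fingerprint would be shared by one in 2^H browsers. The sketch below shows both the entropy calculation and that conversion. The function names are illustrative, and the uniform-distribution reading is a simplifying assumption.

```python
import math
from collections import Counter


def shannon_entropy(fingerprints) -> float:
    """Shannon entropy, in bits, of the observed fingerprint distribution."""
    counts = Counter(fingerprints)
    total = len(fingerprints)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def expected_crowd_size(entropy_bits: float) -> float:
    """'One in N' reading of an entropy value, assuming a uniform spread."""
    return 2 ** entropy_bits


print(shannon_entropy(["a", "b", "c", "d"]))  # four equally likely values
print(round(expected_crowd_size(14.2)))       # roughly 18,800
print(round(expected_crowd_size(16.5)))       # roughly 92,700
```

These values match the rounded figures quoted above.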

In our experiment, we searched for the most accurate subsets of fingerprint features.

The entropy of our entire dataset is 16.55, so we decided to start with subsets of 3 features and grow the subset until its entropy exceeded 16.5. The results are as follows:

The experiment shows that three basic browser features, namely date format, user agent string, and available screen size (screen size minus the dock, window bars, etc.), already yield an entropy of 14.2, which in some cases is enough to identify the browser (and the user).

If we extend the subset with features that are harder to obtain (canvas, list of installed fonts, etc.), we can reach the target entropy of 16.5.
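One way to search for such a subset is greedy forward selection: keep adding the feature that raises the entropy the most until the target is reached. The article does not say which search procedure was actually used, so treat the sketch below as one possible approach; the feature names and toy rows are hypothetical.

```python
import math
from collections import Counter


def entropy(rows, features) -> float:
    """Entropy in bits of the fingerprints built from the given features."""
    counts = Counter(tuple(row[f] for f in features) for row in rows)
    total = len(rows)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def greedy_subset(rows, candidates, target_bits):
    """Add the most informative feature until the target entropy is met."""
    chosen, remaining = [], list(candidates)
    while remaining and entropy(rows, chosen) < target_bits:
        best = max(remaining, key=lambda f: entropy(rows, chosen + [f]))
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical toy dataset: one row of feature values per device.
rows = [
    {"date_format": "A", "screen": "X", "user_agent": "U"},
    {"date_format": "A", "screen": "Y", "user_agent": "U"},
    {"date_format": "B", "screen": "X", "user_agent": "U"},
    {"date_format": "B", "screen": "Y", "user_agent": "U"},
]
print(greedy_subset(rows, ["date_format", "screen", "user_agent"], 1.9))
```

Greedy selection is cheap but not guaranteed to find the smallest possible subset. An exhaustive search over all subsets of a given size would be, at a much higher cost.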

This means that websites and companies don’t have to make much effort to identify you.

Conclusion

So what can we take away from this?

  • The tech giants can track your online activity, but not with complete accuracy yet.
  • Smartphones, especially iPhones, are harder to track than PCs.
  • The browser fingerprint of a device changes quite frequently.
  • Browser fingerprints are easy to obtain.

However, if you're worried about your data privacy, I have some good news for you. First, Apple announced defenses against browser fingerprinting in macOS Mojave, the latest version of its desktop operating system. Second, GDPR treats browser fingerprints as personal data that must be handled accordingly. Finally, there are many plug-ins and browser extensions that obfuscate what fingerprinting scripts can see.

So the future of browser privacy is not as bleak as it may seem. Granted, your browser can sometimes be uniquely identified, but often other devices have exactly the same fingerprint as yours, which makes your browser harder to track.

Study motivation

At Slido, we try to make the user experience of our web application as simple as possible. When you use our app, you usually don't need to log in, and we want to keep it that way.

The question behind this study was whether authentication based on browser fingerprints could protect users from malicious scripts without hurting the user experience.

Fingerprints from smartphones mattered particularly to us, since most of our app traffic comes from smartphones.

The conclusion of our study is: no.

Browser fingerprints alone are not sufficient for us to authenticate users. However, they are accurate enough to place you in a group of people with similar interests, such as cats or cars.


This means that browser fingerprints are well suited to certain scenarios, such as personalized advertising, where accuracy is not critical, or bank fraud prevention, where deterring scammers through fingerprint tracking is a good thing after all.

If you want to learn more about browser fingerprinting, I wrote a 60-plus page paper related to my own research that you can read.

It covers how each fingerprint feature is extracted, how to defend against browser fingerprinting, a more detailed explanation of the charts and results in this article, and more.


Many thanks to my supervisor, Dr. RNDr. Michal Forišek, who helped me greatly with this study.

Related links:

  • My 60-plus page report on browser fingerprints
  • Fp.virpo.sk – Find out what your browser fingerprint looks like
  • Panopticlick.eff.org – Panopticlick
  • Amiunique.org – AmIUnique
  • Audiofingerprint.openwpm.com – Fingerprinting based on the browser's audio features
  • www.nothingprivate.ml/ – Private browsing is not private
