The following article is "To Captcha or Not To Captcha?" by Akif Khan, Senior Analytics Director at Gartner. He has long focused on digital fraud detection and authentication, and knows CAPTCHA well. He recently published an article on the Gartner website about whether to keep or abandon CAPTCHA. The first half reviews the history of CAPTCHA, in which Geetest, a CAPTCHA service provider founded in China that has successfully built brand influence in overseas markets, is the most typical representative of the fifth stage. In the second half, he shares his views on the question, weighing the arguments for and against.

I often talk to clients about a variety of interesting topics in the area of digital fraud detection and authentication. One is machine (bot) traffic detection and mitigation. Whenever we talk about this, we always end up in a heated discussion about CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart). I’ve noticed a pretty sharp divide between those who support CAPTCHA and those who are firmly against it. The opponents' objection is not just visceral; it is categorical.

Since I’m very interested in CAPTCHA myself, I’ll start with a brief introduction to the development of CAPTCHA (if you’re not too interested in going back to the roots, you can skip this section).

I. The history of CAPTCHA

CAPTCHA was first put into use in the 1990s. AltaVista, a search engine of the time, was plagued by spam and introduced character-based CAPTCHAs to alleviate the problem.

The second stage was reCAPTCHA. Its verification interface presented a pair of words and was initially used to help train artificial intelligence, specifically systems for scanning and digitizing old books. During a scan, the system might recognize the first word but not the second. Once enough users had typed the same answer for the second word, the book-scanning system had its unreadable word deciphered. Google later acquired reCAPTCHA.

The third stage was reCAPTCHA 2.0, which added a checkbox to the verification. Google's CAPTCHA evolved from checking only user behavior, to a secondary challenge based on the user’s risk level, to today’s much-maligned image-selection challenge. This is another case of Google using CAPTCHA to train machine vision: put simply, users' recognition of objects in traffic scenes serves as training data. It is widely speculated that this will be applied to its driverless car project. At the same time, academic attention to machine vision and its growing commercialization give criminals an opening: they draw on academic research and commercial applications to build systems that, in turn, defeat machine-vision CAPTCHAs. In fact, Google Cloud itself sells machine learning systems, so you have a situation where one part of Google is making CAPTCHAs and another part of Google is breaking them.

The fourth stage is Google's reCAPTCHA 3.0, again in the form of checkbox verification, using user behavior analysis that appears to be no different from what other major machine traffic detection vendors do.

In the fifth stage, Google is no longer the only player, and CAPTCHA continues to evolve as many vendors increase investment in their products. Some vendors claim very high pass rates for humans and very low pass rates for machine traffic. Other vendors focus on machine-vision challenges that are easy for humans to pass and, unlike Google’s, have no commercial application yet, so criminals cannot draw on academic research, commercial examples, or open-source work to build tools that crack them.

II. The case for and against

Back to the discussion at the beginning: why do some companies want to use CAPTCHA? The logic is very simple. If the machine traffic detection system determines that a user is machine traffic, it can block this potentially threatening user. However, machine traffic detection systems are not perfect. What if the user is not a machine? Then the system has blocked a legitimate human user. A CAPTCHA therefore offers a second chance: if the user is indeed a real human, he or she can prove it by solving the CAPTCHA and continue as normal.

Why do some businesses object to CAPTCHA? Because the solve rate for real human users can be low: some cannot pass at all, and those who do pass often have a bad experience. Meanwhile, the solve rate for criminals can sometimes be extremely high. Sometimes the cracking is automated, but the distinction can be subtle, because in some cases criminals also hire humans to solve CAPTCHAs. A booming industry sits behind this manual solving: in economically underdeveloped regions, underground operators pay small wages to groups of people to break CAPTCHAs, and each person can solve thousands per day.

III. Should we use CAPTCHA?

My answer is yes, just avoid Google’s “please select all pictures with traffic lights” CAPTCHA. There are a number of more advanced CAPTCHA vendors to choose from, such as Arkose Labs, Geetest, and PerimeterX, each with a unique approach and a few nuances, but all better than the dreaded traffic grid. Digital commerce businesses generally have strict requirements for reducing false positives, so giving users the opportunity to prove themselves, rather than blocking them outright, is a direction worth exploring. Even though criminals hire human solvers in large numbers to support their machine traffic, a smart CAPTCHA can still detect anomalies when solvers pass challenges too quickly (a sign that they are breaking them day in and day out). When anomalies are detected, the CAPTCHA should automatically become more difficult, to deter the solvers and raise the economic cost of manual solving. The purpose of a CAPTCHA is to increase user interaction in a controlled manner, so as to gather more monitoring data about the user and how human they appear.
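The idea of making challenges harder when solves come in suspiciously fast can be sketched as follows. This is a minimal illustration, not any vendor's actual method: the names `ClientHistory` and `record_solve` and all threshold values are hypothetical assumptions.

```python
from dataclasses import dataclass, field
from statistics import median

# Hypothetical thresholds; real values would have to come from your own traffic data.
FAST_SOLVE_SECONDS = 2.0  # a median solve time below this looks like practiced solving
MIN_SAMPLES = 5           # do not judge a client on too few solves
MAX_DIFFICULTY = 5


@dataclass
class ClientHistory:
    """Solve times and current challenge difficulty for one client (hypothetical)."""
    solve_times: list = field(default_factory=list)
    difficulty: int = 1


def record_solve(history: ClientHistory, seconds: float) -> int:
    """Record one successful solve and return the difficulty of the next challenge."""
    history.solve_times.append(seconds)
    recent = history.solve_times[-MIN_SAMPLES:]
    if len(recent) >= MIN_SAMPLES and median(recent) < FAST_SOLVE_SECONDS:
        # Consistently fast solves: raise the cost of every further solve.
        history.difficulty = min(history.difficulty + 1, MAX_DIFFICULTY)
    return history.difficulty
```

A client that keeps solving in about a second sees the difficulty climb one step per solve once enough samples exist, while a client with ordinary solve times stays at the lowest level.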

Avoid using CAPTCHA as the default defense for all sessions. You still need to rely on machine traffic detection products to detect and block most machine traffic, and deploy CAPTCHA only in the gray area where it is hard to tell whether the behavior is human. I don’t believe that gray area amounts to more than 5 percent.
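One way to picture this split is a three-way routing on a risk score produced by the detection product. The sketch below assumes a score between 0.0 (clearly human) and 1.0 (clearly machine); the cut-off values and the names `route` and `challenge_share` are hypothetical.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    CHALLENGE = "challenge"  # show a CAPTCHA
    BLOCK = "block"


# Hypothetical cut-offs; tune them so that only the gray area is challenged.
CHALLENGE_FROM = 0.60
BLOCK_FROM = 0.85


def route(risk_score: float) -> Action:
    """Map a detection risk score to allow, challenge, or block."""
    if risk_score >= BLOCK_FROM:
        return Action.BLOCK
    if risk_score >= CHALLENGE_FROM:
        return Action.CHALLENGE
    return Action.ALLOW


def challenge_share(scores) -> float:
    """Fraction of sessions that would be sent to a CAPTCHA."""
    scores = list(scores)
    if not scores:
        return 0.0
    challenged = sum(1 for s in scores if route(s) is Action.CHALLENGE)
    return challenged / len(scores)
```

Running `challenge_share` over a sample of recent session scores gives a quick check of whether the challenged share stays within the small gray area described above.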

Try CAPTCHA on one segment of the traffic you need to monitor. Compare the results, and decide whether to use CAPTCHA based on real data and metrics. Don’t give up on CAPTCHA because you are prejudiced by outdated verification methods (such as the traffic grid).

I am very interested in the experience of using various CAPTCHA products, and I welcome readers to share their experiences.

Finally, as an interesting side note, Amazon applied for a patent in 2017 for a new CAPTCHA that is easy for machines to solve but presents a very difficult visual challenge for humans. Predictably failing the challenge would indicate a human, suggesting that human fallibility may be the key to the future battle against machines.

Translation: one frost. Proofreading: operating unit in overseas markets. Original link: https://blogs.gartner.com/aki…