• Threats of Using Regular Expressions in JavaScript
  • Dulanka Karunasena
  • The Nuggets translation Project
  • Permanent link to this article: github.com/xitu/gold-m…
  • Translator: jaredliw
  • Proofreader: KimYangOfCat, Greycodee

Pitfalls of using regular expressions in JavaScript

Regular expressions (RegEx) are widely used in web development for pattern matching and validation. In practice, however, they can introduce security and performance risks that attackers can exploit.

So, in this article, I’ll discuss two basic things to be aware of before using regular expressions.

Catastrophic backtracking

There are two main algorithms behind regular expression engines:

  • Deterministic finite automaton (DFA): each character of the input string is checked only once.
  • Nondeterministic finite automaton (NFA): the same character may be checked multiple times, with the engine backtracking until the best match is found.

JavaScript’s RegEx engine uses the NFA algorithm, which can lead to catastrophic backtracking.

To better understand this, let’s consider the following RegEx:

/(g|i+)+t/

This RegEx looks simple. But don’t underestimate what it will cost you 😯. First, let’s understand the meaning behind this RegEx:

  • (g|i+): this group matches either a single g or one or more i characters.
  • The + that follows matches the preceding group one or more times.
  • The match must end with the letter t.

According to the RegEx above, the following strings are all matches:

git
giit
gggt
gigiggt
igggt

Now, let’s test the RegEx above with a matching string as input, timing it with the console.time() method.
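A minimal sketch of such a timing test. The input string here (a g, thirty i characters, and a closing t) is an assumption chosen for illustration, and the measured time will vary by machine:

```javascript
const regex = /(g|i+)+t/;

// Assumed input: "g", 30 "i" characters, then the required "t".
const matchingInput = 'g' + 'i'.repeat(30) + 't';

console.time('matching input');
const matched = regex.test(matchingInput);
console.timeEnd('matching input'); // prints the elapsed time; value varies by machine

console.log(matched); // true
```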

We can see that the execution is very fast, even if the string is a bit long.

However, you’ll be surprised when you see how much time it takes to validate mismatched text.

If the string ends in v instead, it no longer matches the RegEx. In my test, however, validating it took about 429 milliseconds, almost 400 times longer than validating the matching string.
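The slowdown can be reproduced with a sketch like the following. The helper name timeTest and the input lengths are assumptions for illustration; the lengths are kept small so the script finishes quickly, and the absolute timings will differ from the 429 milliseconds quoted above:

```javascript
const regex = /(g|i+)+t/;

// Hypothetical helper: time one regex.test() call and return the result
// together with the elapsed time in milliseconds.
function timeTest(input) {
  const start = process.hrtime.bigint();
  const matched = regex.test(input);
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  return { matched, ms };
}

// Assumed inputs: "g", n "i" characters, then a trailing "v" that can never match.
for (const n of [8, 12, 16, 20]) {
  const { matched, ms } = timeTest('g' + 'i'.repeat(n) + 'v');
  console.log(`n=${n} matched=${matched} time=${ms.toFixed(2)} ms`);
}
```

The time should climb sharply as n grows, even though each input is only a few characters longer than the last.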

This performance difference comes from the NFA algorithm used by JavaScript.

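After the first successful validation, JavaScript’s RegEx engine tries to continue. When it fails at a particular position, it goes back to a previous position and tries an alternative path. With nested quantifiers such as (g|i+)+, a run of i characters can be divided between the inner and outer + in exponentially many ways, and on a mismatch the engine tries all of them.

When backtracking becomes this complex, the algorithm consumes more and more computing power, resulting in catastrophic backtracking.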

Note: To understand the complexity of backtracking, visit regex101.com and test your RegEx. Regex101.com shows that it takes only 10 steps to validate giiiit using the aforementioned RegEx, compared to 189 steps to validate giiiiv.
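To make the growth visible without a real engine, here is a toy backtracking matcher written only for the pattern (g|i+)+t, anchored at the start of the input. It is a sketch under those assumptions, not how JavaScript’s engine is implemented, and its step counts will not equal the regex101.com figures; the function name countSteps is hypothetical:

```javascript
// Toy backtracking matcher for (g|i+)+t, anchored at position 0.
// Counts how many times a matching routine is entered.
function countSteps(input) {
  let steps = 0;

  // Called after at least one iteration of the group has matched.
  function afterGroup(pos) {
    steps++;
    // Greedy: try another iteration of the group before trying the final "t".
    if (tryGroup(pos)) return true;
    return input[pos] === 't';
  }

  // Try one iteration of (g|i+) starting at pos.
  function tryGroup(pos) {
    steps++;
    if (input[pos] === 'g' && afterGroup(pos + 1)) return true;
    // i+ : take as many "i" characters as possible, then give them back one by one.
    let end = pos;
    while (input[end] === 'i') end++;
    for (let e = end; e > pos; e--) {
      if (afterGroup(e)) return true;
    }
    return false;
  }

  const matched = tryGroup(0);
  return { matched, steps };
}

console.log(countSteps('giiiit')); // matches after a handful of steps
for (const n of [4, 8, 12, 16]) {
  const { steps } = countSteps('g' + 'i'.repeat(n) + 'v');
  console.log(`n=${n} steps=${steps}`);
}
```

In this toy model, each additional i in a non-matching input roughly doubles the number of steps.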


ReDoS attacks on Node.js environments

An attacker can exploit catastrophic backtracking to attack Node.js servers.

Because JavaScript is single-threaded, a ReDoS attack can block the event loop, leaving the server unresponsive to every other request until the malicious one completes.

I’ll use the moment.js library to demonstrate this, as there is a well-known ReDoS vulnerability in versions of moment.js lower than 2.15.2.

var moment = require('moment');
moment.locale("be");
// "D", 31 spaces, then "MMN MMMM": 40 characters in total
moment().format("D" + " ".repeat(31) + "MMN MMMM");

In this example, the date format has 40 characters, including 31 additional spaces. Because of catastrophic backtracking, each additional space roughly doubles the running time. In my local environment, it took more than 4 minutes.

Overuse of the + operator in /D[oD]?(\[[^\[\]]*\]|\s+)+MMMM?/ causes this vulnerability. Fortunately, the issue was flagged by Snyk, a vulnerability tracking tool, and fixed in a later version.
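The effect can be observed without installing an old moment.js release by running the pattern directly. This is a sketch: the pattern below is taken from the article and is assumed to correspond to the vulnerable code, the helper name timeFormatCheck is hypothetical, and the space counts are kept far below 31 so the script finishes quickly:

```javascript
// Pattern as quoted in the article; assumed to correspond to the vulnerable check.
const formatRegex = /D[oD]?(\[[^\[\]]*\]|\s+)+MMMM?/;

// Hypothetical helper: build "D", the given number of spaces, then "MMN MMMM",
// and time a single test against the pattern.
function timeFormatCheck(spaces) {
  const input = 'D' + ' '.repeat(spaces) + 'MMN MMMM';
  const start = process.hrtime.bigint();
  const matched = formatRegex.test(input);
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  return { matched, ms };
}

for (const spaces of [8, 12, 16, 20]) {
  const { matched, ms } = timeFormatCheck(spaces);
  console.log(`spaces=${spaces} matched=${matched} time=${ms.toFixed(2)} ms`);
}
```

The input never matches because "MMN" is not "MMM", so the engine retries every way of dividing the spaces among repetitions of the group before giving up.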

How to avoid RegEx vulnerabilities

1. Write a simple RegEx

Catastrophic backtracking can occur when a RegEx is at least three characters long and contains two or more occurrences of *, +, or } close to each other.

So, if you can simplify your RegEx and avoid the patterns above, you can avoid catastrophic backtracking.
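As an illustration of this advice, the RegEx from earlier can be rewritten with a single quantifier. Since (g|i+)+ accepts any non-empty run of g and i characters, a character class with one + describes the same strings. The names nested, flat, and samples are chosen for this sketch:

```javascript
const nested = /(g|i+)+t/; // nested quantifiers: prone to catastrophic backtracking
const flat = /[gi]+t/;     // same set of matches, only one quantifier

const samples = ['git', 'giit', 'gggt', 'gigiggt', 'igggt', 'giiiiv', 't'];
for (const s of samples) {
  // Both patterns should report the same result for every sample.
  console.log(s, nested.test(s), flat.test(s));
}

const allAgree = samples.every((s) => nested.test(s) === flat.test(s));
console.log(allAgree); // true

// A long non-matching input no longer triggers exponential backtracking.
console.log(flat.test('g' + 'i'.repeat(1000) + 'v')); // false
```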

2. Use a validation library

For common validation tasks, we can use third-party libraries such as validator.js or express-validator.

We can rely on these libraries because they have a large community behind them.

3. Use a RegEx analyzer

Tools such as safe-regex and rxxr2 can help you avoid vulnerable RegExes. They check your RegEx for vulnerabilities and report whether it is safe.

var safe = require('safe-regex');

var regex = /(g|i+)+t/;
console.log(safe(regex)); // false

This prints false because the regular expression is susceptible to catastrophic backtracking.
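To give a feel for what such a check involves, here is a naive sketch that estimates how deeply repetition operators are nested in a pattern (its "star height") and flags anything above 1. This is an illustration under simplifying assumptions (a syntactically valid pattern, no special handling of lookarounds or backreferences), not the implementation of safe-regex or rxxr2; the function name starHeight is hypothetical:

```javascript
// Naive star-height estimate: how deeply *, + and {n,m} are nested.
function starHeight(source) {
  const groupMax = [0]; // highest nesting seen inside each currently open group
  let atomHeight = 0;   // nesting height of the atom just read
  let inClass = false;  // true while inside [...]

  for (let i = 0; i < source.length; i++) {
    const ch = source[i];
    if (ch === '\\') { i++; atomHeight = 0; continue; } // skip the escaped character
    if (inClass) {
      if (ch === ']') { inClass = false; atomHeight = 0; }
      continue;
    }
    if (ch === '[') { inClass = true; continue; }
    if (ch === '(') { groupMax.push(0); continue; }
    if (ch === ')') {
      // The closed group becomes the current atom; pass its height up to the parent.
      atomHeight = groupMax.pop();
      const top = groupMax.length - 1;
      groupMax[top] = Math.max(groupMax[top], atomHeight);
      continue;
    }
    if (ch === '*' || ch === '+' || ch === '{') {
      // A repetition applied to the current atom adds one level of nesting.
      atomHeight += 1;
      const top = groupMax.length - 1;
      groupMax[top] = Math.max(groupMax[top], atomHeight);
      continue;
    }
    atomHeight = 0; // any other character starts a fresh atom
  }
  return groupMax[0];
}

console.log(starHeight(/(g|i+)+t/.source)); // 2, a repetition inside a repetition
console.log(starHeight(/[gi]+t/.source));   // 1
```

A height above 1 means a repetition is applied to something that already repeats, which is the shape behind the examples in this article.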

4. Avoid using the default Node.js RegEx engine

Since the default Node.js RegEx engine is vulnerable to ReDoS attacks, we can avoid it and use another engine instead, such as Google’s RE2 engine. It ensures that the RegEx is safe against ReDoS attacks, and its usage is similar to that of the default Node.js RegEx engine.

var RE2 = require('re2');

var re = new RE2(/(g|i+)+t/);
var result = 'giiiiiiiiiiiiiiiiiiit'.search(re);
console.log(result); // 0

Key takeaways

Catastrophic backtracking is the most common problem in regular expressions. Not only does it hurt application performance, it also opens the door to ReDoS attacks on Node.js servers.

In this article, we discussed the principles of catastrophic backtracking and ReDoS, as well as ways to circumvent these problems.

I hope this article helps you protect your applications from such attacks. Don’t forget to share your thoughts in the comments section.

Thank you for reading!
