1. Introduction
This week’s close reading covers the article regexp-features-regular-expressions.
The article introduces several important regex features added in ES2018:
- Lookbehind assertions
- Named capture groups
- s (dotAll) flag: . matches any character
- Unicode property escapes
2. Overview
Are you still using numeric indexes to read captured content? Still matching any character with [\w\W]? Regular expressions have become easier to write and easier to use, so it’s time to update your regex knowledge.
2.1. Lookbehind assertions
A complete set of assertions is the Cartesian product of positive/negative and lookahead/lookbehind. Before ES2018 only lookahead assertions were supported; now lookbehind assertions are supported as well.
The four assertions are explained below.
Positive lookahead (?=...) asserts that the text that follows matches the pattern.
const re = /Item(?= 10)/;
console.log(re.exec("Item"));
// → null
console.log(re.exec("Item5"));
// → null
console.log(re.exec("Item 5"));
// → null
console.log(re.exec("Item 10"));
// → ["Item", index: 0, input: "Item 10", groups: undefined]
Negative lookahead (?!...) asserts that the text that follows does not match the pattern.
const re = /Red(?!head)/;
console.log(re.exec("Redhead"));
// → null
console.log(re.exec("Redberry"));
// → ["Red", index: 0, input: "Redberry", groups: undefined]
console.log(re.exec("Redjay"));
// → ["Red", index: 0, input: "Redjay", groups: undefined]
console.log(re.exec("Red"));
// → ["Red", index: 0, input: "Red", groups: undefined]
ES2018 adds support for two new assertions:
Positive lookbehind (?<=...) asserts that the text that precedes matches the pattern.
With lookahead, the matched string comes first and the asserted pattern comes after it; with lookbehind, the asserted pattern comes first and the matched string comes after it. The first constrains what follows the end of the match; the second constrains what precedes its start.
const re = /(?<=€)\d+(\.\d*)?/;
console.log(re.exec("199"));
// → null
console.log(re.exec("$199"));
// → null
console.log(re.exec("€199"));
// → ["199", undefined, index: 1, input: "€199", groups: undefined]
Negative lookbehind (?<!...) asserts that the text that precedes does not match the pattern.
Note: in the following example, " meters" must not be preceded by three digits.
const re = /(?<!\d{3}) meters/;
console.log(re.exec("10 meters"));
// → [" meters", index: 2, input: "10 meters", groups: undefined]
console.log(re.exec("100 meters"));
// → null
The article also gives a slightly more complex example that combines positive and negative lookbehind:
Note: in the following example, " meters" must be preceded by two digits, and those two digits must not be 35.
const re = /(?<=\d{2})(?<!35) meters/;
console.log(re.exec("35 meters"));
// → null
console.log(re.exec("meters"));
// → null
console.log(re.exec("4 meters"));
// → null
console.log(re.exec("14 meters"));
// → [" meters", index: 2, input: "14 meters", groups: undefined]
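The four assertion types described in this section can be put side by side in one short sketch. The sample string and variable names are illustrative assumptions, not taken from the article, and the lookbehind patterns need an engine with ES2018 support.

```javascript
// Hypothetical sample input, chosen only to exercise each assertion.
const text = "$10 €20 30USD 40";

// Positive lookahead: digits that are followed by "USD".
const followedByUsd = text.match(/\d+(?=USD)/g);

// Negative lookahead: digits not followed by another digit or by "USD".
const notFollowedByUsd = text.match(/\d+(?!\d|USD)/g);

// Positive lookbehind: digits preceded by "€".
const afterEuro = text.match(/(?<=€)\d+/g);

// Negative lookbehind: digits not preceded by a currency sign or a digit.
const noCurrencySign = text.match(/(?<![$€\d])\d+/g);

console.log(followedByUsd); // → ["30"]
console.log(notFollowedByUsd); // → ["10", "20", "40"]
console.log(afterEuro); // → ["20"]
console.log(noCurrencySign); // → ["30", "40"]
```

The extra \d inside the negative assertions keeps the engine from backtracking into a shorter run of digits that would otherwise satisfy the assertion.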
2.2. Named capture groups
Named capture groups give names to the content a regex captures, which is more readable than numeric indexes.
The syntax is (?<name>...):
const re = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const [match, year, month, day] = re.exec("2020-03-04");
console.log(match); // → 2020-03-04
console.log(year); // → 2020
console.log(month); // → 03
console.log(day); // → 04
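The readability benefit shows up when the captures are read by name rather than by position. The following is a minimal sketch, with dateRe and reordered as illustrative names, of the groups property on the match result and the $<name> replacement syntax.

```javascript
const dateRe = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;

// Named captures are exposed on the `groups` property of the match result.
const { year, month, day } = dateRe.exec("2020-03-04").groups;
console.log(year, month, day); // → 2020 03 04

// In String.prototype.replace, $<name> refers to a named group.
const reordered = "2020-03-04".replace(dateRe, "$<day>/$<month>/$<year>");
console.log(reordered); // → 04/03/2020
```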
Inside a regular expression, you can refer back to an earlier capture group by its number, for example \1:
Note: \1 stands for the text that (\w\w) matched, not for the pattern (\w\w) itself, so when (\w\w) matches ‘ab’, \1 matches only ‘ab’.
console.log(/(\w\w)\1/.test("abab")); // → true
// if the last two letters are not the same
// as the first two, the match will fail
console.log(/(\w\w)\1/.test("abcd")); // → false
For named capture groups, the syntax \k<name> is used instead of the numeric \1:
Note: numbered and named references can be used together.
const re = /\b(?<dup>\w+)\s+\k<dup>\b/;
const match = re.exec("I'm not lazy, I'm on on energy saving mode");
console.log(match.index); // → 18
console.log(match[0]); // → on on
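To illustrate that numbered and named references can be mixed, here is a small sketch. The pattern and the name abba are made up for illustration; it matches four-character strings of the form "abba".

```javascript
// (?<first>\w) is group 1 and is also named "first"; (\w) is group 2.
const abba = /^(?<first>\w)(\w)\2\k<first>$/;

console.log(abba.test("abba")); // → true
console.log(abba.test("abab")); // → false

// A named group still has a number, so \1 works for it as well.
console.log(/(?<ch>\w)\1/.test("aa")); // → true
```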
2.3. s (dotAll) flag
Although . matches any character in a regex, it does not match line terminators, so developers worked around the limitation with [\w\W].
That workaround papers over what is ultimately a design flaw. ES2018 adds the /s flag, under which . is equivalent to [\w\W]:
console.log(/./s.test("\n")); // → true
console.log(/./s.test("\r")); // → true
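The difference is easiest to see by running the same pattern with and without the flag. The sketch below uses illustrative variable names and also reads the dotAll property, which reports whether the s flag is set.

```javascript
const withoutS = /foo.bar/;
const withS = /foo.bar/s;

// Without s, the dot refuses to match the line feed.
console.log(withoutS.test("foo\nbar")); // → false
console.log(withS.test("foo\nbar")); // → true

// The dotAll property reflects the s flag.
console.log(withoutS.dotAll); // → false
console.log(withS.dotAll); // → true
```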
2.4. Unicode property escapes
Regular expressions now support more powerful Unicode matching. In /u mode, \p{Number} matches all numeric characters:
Note: the u modifier is what lets a regex recognize Unicode characters greater than 0xFFFF.
const regex = /^\p{Number}+$/u;
regex.test("²³¹¼½¾"); // true
regex.test("㉛㉜㉝"); // true
regex.test("ⅠⅡⅢⅣⅤⅥⅦⅧⅨⅩⅪⅫ"); // true
\p{Alphabetic} matches all alphabetic characters:
const str = "漢";
console.log(/\p{Alphabetic}/u.test(str)); // → true
// \w cannot match it
console.log(/\w/u.test(str)); // → false
Finally, there is an easy way to match Chinese characters.
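\p{Alphabetic} also matches Latin and other letters, so it is broader than "Chinese characters". If only Han characters should match, one option (not shown in the article) is the Script property escape, sketched here with an illustrative variable name:

```javascript
// \p{Script=Han} restricts the match to characters in the Han script.
const han = /^\p{Script=Han}+$/u;

console.log(han.test("漢字")); // → true
console.log(han.test("abc")); // → false

// By contrast, \p{Alphabetic} accepts Latin letters too.
console.log(/^\p{Alphabetic}+$/u.test("abc")); // → true
```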
2.5. Compatibility table
Overall, only Chrome and Safari support these features; Firefox and Edge do not. It will therefore be a few more years before large projects can use them.
3. Close reading
The four features listed in the article are what ES2018 adds to regular expressions. But as the compatibility table shows, none of them can be used widely yet, so let’s revisit the regex improvements in ES6 and see how they connect with the ES2018 changes.
3.1. RegExp constructor optimization
When the first argument to the RegExp constructor is a regular expression, a second argument specifying the modifiers may now be given (ES5 throws an error):
new RegExp(/book(?=s)/giu, "iu");
This is a harmless improvement, since the constructor is rarely used this way.
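When both a regex and a flags string are passed, the flags string replaces the original modifiers rather than being merged with them. A minimal sketch, with original and copy as illustrative names:

```javascript
const original = /book(?=s)/giu;

// The second argument overrides the flags of the first argument.
const copy = new RegExp(original, "iu");

console.log(original.flags); // → "giu"
console.log(copy.flags); // → "iu"
console.log(copy.source); // → "book(?=s)"
console.log(copy.global); // → false
```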
3.2. Regex methods on strings
The string methods match(), replace(), search(), and split() now internally call the corresponding RegExp instance methods; for example, String.prototype.match calls RegExp.prototype[Symbol.match].
In other words, regex operations belong to RegExp instances. Strings still expose them directly for convenience, but the call is actually delegated to the regex instance, which makes the logic more uniform.
Here’s an example:
"abc".match(/abc/g);
// Internally equivalent to
/abc/g[Symbol.match]("abc");
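Because the string method only looks up Symbol.match on its argument, any object that defines that method can take the place of a regex. The containsDigit object below is a hypothetical matcher written purely to illustrate the delegation.

```javascript
// Hypothetical matcher: any object with a Symbol.match method will do.
const containsDigit = {
  [Symbol.match](str) {
    return /\d/.test(str) ? ["has digit"] : null;
  },
};

console.log("abc1".match(containsDigit)); // → ["has digit"]
console.log("abc".match(containsDigit)); // → null

// For a real regex, the two calls below give the same result.
const viaString = "abc".match(/abc/g);
const viaSymbol = /abc/g[Symbol.match]("abc");
console.log(viaString, viaSymbol); // → ["abc"] ["abc"]
```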
3.3. u modifier
The Unicode property escapes covered in the overview are an enhancement to the u modifier, which was added in ES6.
The u modifier means “Unicode mode” and is used to correctly handle Unicode characters greater than \uFFFF.
The u modifier also changes the behavior of the following regex features:
- Dot character: . normally matches any single character, but it matches Unicode characters greater than 0xFFFF only in u mode.
- \u{61}: without u this means 61 consecutive u characters; in u mode it means the Unicode code point U+0061, the letter a.
- Quantifiers: in u mode they correctly treat a Unicode character greater than 0xFFFF as a single character.
- \S: in u mode it correctly matches Unicode characters greater than 0xFFFF.
- Case-insensitive matching: in u mode with the i flag, [a-z] also matches letters that look alike but have a different code point, such as \u212A, the Kelvin sign K.
In short, u mode reads all Unicode characters correctly, and ES2018 adds new property escapes for u mode that match common character sets; for example, \p{Number} matches characters such as ¼.
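Each behavior change listed above can be checked by running the same pattern with and without u. The sketch below uses 𠮷 (U+20BB7) as a sample character above 0xFFFF; the choice of character and the variable name are assumptions made for illustration.

```javascript
const astral = "𠮷"; // U+20BB7, stored as two UTF-16 code units

// Dot: matches the whole character only in u mode.
console.log(/^.$/.test(astral)); // → false
console.log(/^.$/u.test(astral)); // → true

// \u{61}: sixty-one "u" characters without u, the letter "a" with u.
console.log(/\u{61}/.test("a")); // → false
console.log(/\u{61}/u.test("a")); // → true

// Quantifier: {2} applies to the whole character only in u mode.
console.log(/^𠮷{2}$/.test("𠮷𠮷")); // → false
console.log(/^𠮷{2}$/u.test("𠮷𠮷")); // → true

// \S: matches the whole character only in u mode.
console.log(/^\S$/.test(astral)); // → false
console.log(/^\S$/u.test(astral)); // → true

// With i and u together, [a-z] also matches the Kelvin sign U+212A.
console.log(/[a-z]/i.test("\u212A")); // → false
console.log(/[a-z]/iu.test("\u212A")); // → true
```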
3.4. y modifier
The y modifier is the “sticky” modifier.
Like the g modifier, y matches globally: each match starts where the previous successful match ended. The difference is that with y the match must begin exactly at that position, whereas with g it may begin anywhere after it.
For example:
/a+/g.exec("aaa_aa_a"); // ["aaa"]
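The difference between g and y only appears on the second call, once lastIndex has moved past the first match. A minimal sketch comparing the two on the same input, with illustrative variable names:

```javascript
const s = "aaa_aa_a";
const globalRe = /a+/g;
const stickyRe = /a+/y;

// First call: both start at index 0 and match "aaa".
const firstG = globalRe.exec(s);
const firstY = stickyRe.exec(s);
console.log(firstG[0], firstY[0]); // → aaa aaa

// Second call: both resume at lastIndex 3, where the next character is "_".
const secondG = globalRe.exec(s); // g may skip ahead and match at index 4
const secondY = stickyRe.exec(s); // y must match exactly at index 3
console.log(secondG[0]); // → aa
console.log(secondY); // → null

// After a failed match, lastIndex is reset to 0.
console.log(stickyRe.lastIndex); // → 0
```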
3.5. flags
A regex’s modifiers can be read from its flags property:
const regex = /[a-z]*/gu;
regex.flags; // 'gu'
4. Summary
This week’s close reading of regexp-features-regular-expressions looked at the new regex features added in ES2018 and, along the way, at the regex enhancements made in ES6.
If this branching-out style of learning suits you, take a closer look at all the new features ES6 introduced. I highly recommend Ruan Yifeng’s Introduction to ECMAScript 6.
The features introduced in ES2018 are still very new, but by now you should be as comfortable with ES6 features as you are with ES3.
If you know someone who is still surprised by the ES6 features, please share this article with them before they degenerate into “JS beginners with only project experience”.
The discussion address is: Close reading regular ES2018 · Issue #127 · dt-fe/weekly
If you’d like to join the discussion, please click here. There is a new topic every week, published on weekends or Mondays. Front-end Close Reading: helping you filter reliable content.