Talk about ES6 regular expressions`u`character

ES6 re added u modifier, can make JS re use more powerful, accurate.

You can use the RegExp. Prototype. unicode attribute to determine whether the re uses the U modifier.

We look at the role of the Kangkang U modifier from the following scenario:

Match any character
1. In the old way, it can be used[\d\D],[\s\S],[\w\W],(^)Match any character
```
/[\d\D]/.test("\r");
//true
Copy the code
```
2. Of course, it can also be given.The meaning of, i.e[^\n\r\u2028\u2029], and then use.|[\n\r\u2028\u2029]Matches any character;
```
    /^(.|[\n\r\u2028\u2029])*$/.test("1 a \ t \ n \ n \ r \ f - _ (the");
    //true
Copy the code
```
3. Another approach is to use new additions to ES6sThe modifier. It will make special characters.Ability to match any single Unicode Basic Multilingual Plane (BMP) character:
```
"\n".match(/ /.);
//null
"\n".match(/ /.s);
//[" Address ", index: 0, INPUT: "Access ", Groups: undefined]
Copy the code
```
  sWhether the modifier is used or not can be passedRegExp.prototype.dotAllAttribute judgment:
```
/./.dotAll;
//false
/ /.s.dotAll;
//true
Copy the code
```
However, this is not the end of the story. When you try to match the following example directly with the above re, you will find that the character is not recognized properly:
```
"𠮷".match(/[\d\D]/);
//["�", index: 0, input: "𠮷", groups: undefined]
Copy the code
```
To correctly identify such Unicode characters with code points greater than \uFFFF, the U modifier comes in handy. It will correctly handle four bytes of UTF-16 encoding:
```
"𠮷".match(/[\d\D]/u);
/ / / "𠮷", index: 0, input: "𠮷 groups: undefined]
Copy the code
```
Match any position

We know that \b matches the boundary between words and non-words, and \b matches the boundary between words. For those who know the u modifier, write /[\b\ b]/ug. Although it seems more reasonable, it is still not the correct solution.

The correct method should be: / \ | b \ b/ug. Not only is the u modifier used, but also the meaning of /[\b]/, which matches the escape character \b itself and the backspace key \u0008.

In addition, for the word boundary \b, both javascript and Java re refer to the string \w is [a-za-z0-9_]. In.NET, regular expression words are defined as strings of [A-zA-z0-9] and Unicode characters (Chinese characters, full-corner characters, etc.).
Correct identification of quantifiers

With the U modifier, all quantifiers correctly recognize Unicode characters with code points greater than 0xFFFF.
Correct recognition of predefined patterns

In re, \d, \w, \s, etc. are predefined classes. Similarly, Unicode characters with code points greater than 0xFFFF can be matched correctly only with the u modifier.
Standardize the writing of escape characters

This will help you understand what regular metacharacters are, because an error will be reported when an invalid escape character is escaped:
```
    /\a/
    // /\a/
    /\a/u
    // Uncaught SyntaxError: Invalid regular expression: /\,a/: Invalid escape
Copy the code
```

Identifies non-canonical characters with the I modifier

/[a-z]/iu.test('\u212A')
// true
/[a-z]/iu.test('\u004B')
// true
Copy the code

Unicode’s own attribute class \p{… } and \ {P… }

This new attribute class matches all characters that match a Unicode attribute. There are two ways to write it:

Specify the attribute name and value

\p{UnicodePropertyName=UnicodePropertyValue}

/ / example:
/\p{Script=Greek}/u  // Specify a Greek character to match
Copy the code

Write only the attribute name or attribute value

\p{UnicodePropertyName}
\p{UnicodePropertyValue}

/ / example:
/\p{White_Space}/u // Matches all Spaces/\p{Emoji_Modifier_Base}\p{Emoji_Modifier}? |\p{Emoji_Presentation}|\p{Emoji}\uFE0F/gu/ / match Emoji
Copy the code

mo4tech.com (Moment For Technology) is a global community with thousands techies from across the global hang out!Passionate technologists, be it gadget freaks, tech enthusiasts, coders, technopreneurs, or CIOs, you would find them all here.

Talk about the ‘U’ character in ES6 regular expressions

Talk about ES6 regular expressions`u`character

Talk about the ‘U’ character in ES6 regular expressions

Talk about ES6 regular expressionsucharacter

Related Posts

TypeScript | Chapter 1: Environment Setup and basic data types

How fast is Vue 3.0? 🚀

I also tried to make a vUE component library 👏

Talk about ES6 regular expressions`u`character