Regular expression

Learning to write regular expressions is easier alongside a graphical regular expression website. Seeing a pattern drawn as a diagram helps you understand what it means.


Modifiers

Flags, also known as modifiers, specify additional matching strategies for a regular expression.

Flags are not written inside the pattern. They follow the closing slash, in the form /pattern/flags, and have the following meanings:

character meaning
i Ignore case. Makes matching case insensitive: there is no difference between A and a.
g Global. Searches for all matches instead of stopping at the first one.
m Multiline. Makes the boundary characters ^ and $ match the beginning and end of each line, not only the beginning and end of the entire string.
s dotAll. Lets the special character dot (.) match the newline character \n. By default, the dot matches any character other than the newline character \n; with the s modifier, it matches \n as well.
'BAnANa'.replace(/a+/g, '*') // "BAnAN*"
'BAnANa'.replace(/a+/gi, '*') // "B*n*N*"
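
The m and s modifiers from the table above are easier to see on a string that contains a line break. The following is a minimal sketch, assuming an engine that supports the s flag (ES2018 or later); the variable names are only illustrative.

```javascript
const text = 'first line\nsecond line';

// Without m, ^ only matches at the very start of the string.
const withoutM = text.match(/^\w+/g);   // ["first"]
// With m, ^ also matches right after each line break.
const withM = text.match(/^\w+/gm);     // ["first", "second"]

// Without s, the dot cannot cross \n, so the pattern fails.
const dotDefault = /first.+second/.test(text);   // false
// With s, the dot also matches \n.
const dotAll = /first.+second/s.test(text);      // true

console.log(withoutM, withM, dotDefault, dotAll);
```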

Operators

The meaning of \

character meaning
\ Marks the next character as a special character, or a literal character, or a backreference, or an octal escape.
// Example 1
'1wa2b3e'.replace(/w/, '*') // "1*a2b3e"
// Example 2
'1wa2b3e'.replace(/\w/, '*') // "*wa2b3e"
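
The backslash also works in the other direction: it turns a special character back into a literal one. The sketch below illustrates this with the + character. Note that escapeRegExp is a hypothetical helper name chosen for this example, not a built-in function.

```javascript
// Unescaped, + is a quantifier: /1+1/ means "one or more 1, then 1".
const unescapedPlus = /1+1/.test('1+1');   // false
// Escaped, \+ is a literal plus sign.
const escapedPlus = /1\+1/.test('1+1');    // true

// escapeRegExp is a hypothetical helper name, not a built-in.
// It puts a backslash in front of every character that is special in a pattern.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Useful when a pattern is built from arbitrary text at run time.
const fromUserInput = new RegExp(escapeRegExp('1+1'));
const dynamicMatch = fromUserInput.test('2 = 1+1');   // true

console.log(unescapedPlus, escapedPlus, dynamicMatch);
```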

^ and $

character meaning
^ Matches the start of the input string. If the multiline property of the RegExp object is set, ^ also matches the position after '\n' or '\r'.
$ Matches the end of the input string. If the multiline property of the RegExp object is set, $ also matches the position before '\n' or '\r'.
/^a/.test('a2sfade3') // true
/^a/.test('234weras') // false
/a$/.test('12sfadea') // true
/a$/.test('12sfadew') // false

The first string starts with a, so the test returns true; the second does not start with a, so it returns false. The third string ends with a and returns true; the fourth does not end with a and returns false.

Returning to the two examples under "The meaning of \": in example 1 the w is not escaped, so the pattern matches the literal character w. In example 2, \w (described below) matches the first letter, digit, or underscore in the string.

Or relationship

character meaning
x|y Matches x or y. For example, 'z|food' matches "z" or "food"; '(z|f)ood' matches "zood" or "food".
// Inside brackets | is a literal character, so alternation is written without them.
'aabbcacaabc'.replace(/a|b/, '*') // "*abbcacaabc"
'aabbcacaabc'.replace(/a|b/g, '*') // "****c*c***c"
// Replace "LiLei" or "Lilei" with "Li lei"
'My name is Lilei'.replace(/Li(L|l)ei/g, "Li lei"); // "My name is Li lei"
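
One detail that the x|y row implies but does not spell out: | splits everything on its left from everything on its right, so parentheses are needed to limit its reach. A small sketch, with illustrative variable names:

```javascript
// | has the lowest precedence, so it splits the whole pattern in two.
const loose = /^cat|dog$/;      // "starts with cat" OR "ends with dog"
const strict = /^(cat|dog)$/;   // the whole string is "cat" or "dog"

const looseResult = loose.test('catalog');    // true: it starts with "cat"
const strictResult = strict.test('catalog');  // false
const strictDog = strict.test('dog');         // true

console.log(looseResult, strictResult, strictDog);
```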

Quantifiers that limit the number of occurrences: ?, *, +, and so on

character meaning
? Matches the preceding subexpression zero times or once. Equivalent to {0,1}.
* Matches the preceding subexpression zero or more times. Equivalent to {0,}.
+ Matches the preceding subexpression one or more times. Equivalent to {1,}.
{n} n is a non-negative integer. Matches exactly n times.
{n,} n is a non-negative integer. Matches at least n times.
{n,m} Both m and n are non-negative integers, where n <= m. Matches at least n times and at most m times.
? When this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), matching becomes non-greedy. Non-greedy mode matches as little of the string as possible, while the default greedy mode matches as much as possible. For example, for the string "oooo", 'o+?' matches a single 'o', while 'o+' matches all four.
'aabbbaaaab'.replace(/a{3}/, '*') // "aabbb*ab"
'aabbbaaaab'.replace(/a{3,}/, '*') // "aabbb*b"
'aabbbaaaab'.replace(/a{2,3}/g, '*') // "*bbb*ab"
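
The difference between greedy and non-greedy matching described in the table can be checked directly. This is a small sketch; the HTML string is made-up sample data.

```javascript
// The "oooo" example from the table.
const greedyO = 'oooo'.match(/o+/)[0];   // "oooo"
const lazyO = 'oooo'.match(/o+?/)[0];    // "o"

// A case where the difference matters in practice.
const html = '<b>bold</b> and <i>italic</i>';
// Greedy: runs from the first < to the last >.
const greedyTag = html.match(/<.+>/)[0];   // "<b>bold</b> and <i>italic</i>"
// Non-greedy: stops at the first > it can.
const lazyTag = html.match(/<.+?>/)[0];    // "<b>"

console.log(greedyO, lazyO, greedyTag, lazyTag);
```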

Metacharacters

Regular expressions are built from two kinds of characters: literal characters and metacharacters. Metacharacters are characters that carry a special meaning in a regular expression.

Let's look at the metacharacters in turn.

Metacharacters formed by a backslash plus a letter

character meaning
\d Matches a digit character. Equivalent to [0-9].
\D Matches a non-digit character. Equivalent to [^0-9].
\w Matches a letter, digit, or underscore. Equivalent to '[A-Za-z0-9_]'.
\W Matches any character that is not a letter, digit, or underscore. Equivalent to '[^A-Za-z0-9_]'.
/\d/.test('a') // false
/\d/.test('1') // true
/\D/.test('a') // true
/\D/.test('1') // false
/\w/.test('a_1') // true
/\w/.test('%') // false
/\W/.test('a_1') // false
/\W/.test('%') // true
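
Because \d and \w are defined by the ASCII ranges listed in the table, they can be combined with ^ and $ for simple validation, and they do not treat accented letters as word characters. A brief sketch, with illustrative names:

```javascript
// Anchored with ^ and $, \d+ accepts only strings made entirely of digits.
const digitsOnly = /^\d+$/;
const isDigits = digitsOnly.test('2024');    // true
const hasLetter = digitsOnly.test('20a4');   // false

// \w covers only A-Z, a-z, 0-9 and _, so an accented letter is not a word character.
const accented = /\w/.test('\u00e9');        // false (\u00e9 is "é")

// \W picks out everything outside that set.
const nonWord = 'a_1 %\u00e9'.match(/\W/g);  // [" ", "%", "é"]

console.log(isDigits, hasLetter, accented, nonWord);
```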

Boundary metacharacter

character meaning
\b Matches a word boundary, that is, the position between a word and a space. For example, 'er\b' matches the 'er' in "never" but not the 'er' in "verb".
\B Matches a non-word boundary. 'er\B' matches the 'er' in "verb" but not the 'er' in "never".
(pattern) Matches pattern and captures the match. The captured matches can be retrieved from the resulting Matches collection: the SubMatches collection in VBScript and the $0…$9 properties in JScript. To match a literal parenthesis, use '\(' or '\)'.
(?:pattern) Matches pattern but does not capture the result; it is a non-capturing match that is not stored for later use. This is useful when combining parts of a pattern with the "or" character (|).
(?=pattern) Positive lookahead. Matches the search string at any position where the text that follows matches pattern. This is a non-capturing match: the looked-ahead text is not stored for later use. For example, 'Windows(?=95)' matches "Windows" only when it is followed by "95".
(?!pattern) Negative lookahead. Matches the search string at any position where the text that follows does not match pattern. This is a non-capturing match. For example, 'Windows(?!95)' matches "Windows" only when it is not followed by "95".
(?<=pattern) Positive lookbehind. Similar to positive lookahead, but in the opposite direction. For example, '(?<=95)Windows' matches "Windows" only when it is preceded by "95".
(?<!pattern) Negative lookbehind. Similar to negative lookahead, but in the opposite direction. For example, '(?<!95)Windows' matches "Windows" only when it is not preceded by "95".
// \b is the boundary between a \w character and a \W character (or the edge of the string)
"is this a book?".match(/\bis\b/g); // ["is"]
// Positive lookahead
let str = "iphone3iphone4iphone11iphoneNumber";
// Replace "iphone" with "apple" only when it is followed by one or two digits
let reg = /iphone(?=\d{1,2})/g;
let res = str.replace(reg, "apple");
console.log(res); // "apple3apple4apple11iphoneNumber"
// Negative lookahead
let str = "iphone3iphone4iphone11iphoneNumber";
// Replace "iphone" with "apple" only when it is not followed by a digit
let reg = /iphone(?!\d{1,2})/g;
let res = str.replace(reg, "apple");
console.log(res); // "iphone3iphone4iphone11appleNumber"
// Positive lookbehind
let str = "10px20px30pxipx";
// px --> pixel, only when preceded by two digits
let reg = /(?<=\d{2})px/g;
let res = str.replace(reg, "pixel");
console.log(res); // "10pixel20pixel30pixelipx"
// Negative lookbehind
let str = "10px20px30pxipx";
// px --> pixel, only when not preceded by two digits
let reg = /(?<!\d{2})px/g;
let res = str.replace(reg, "pixel");
console.log(res); // "10px20px30pxipixel"
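
Lookarounds match a position rather than characters, which makes them handy for inserting text or for picking out a value by its surroundings. The sketch below shows two common uses; the sample strings are invented, and lookbehind assumes an ES2018-capable engine.

```javascript
// Insert a comma at every position that is inside the number (\B)
// and is followed by one or more complete groups of three digits up to the end.
const grouped = '1234567'.replace(/\B(?=(\d{3})+$)/g, ',');   // "1,234,567"

// Pick the digits that come right after a dollar sign, without capturing the sign.
const amounts = 'cost: $42, qty: 7'.match(/(?<=\$)\d+/g);     // ["42"]

console.log(grouped, amounts);
```

Since the lookahead consumes nothing, the replacement inserts the comma without removing any digit.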

Common metacharacters such as tabs

character meaning
\f Matches a form feed character. Equivalent to \x0c and \cL.
\n Matches a newline character. Equivalent to \x0a and \cJ.
\r Matches a carriage return. Equivalent to \x0d and \cM.
\s Matches any whitespace character, including spaces, tabs, form feeds, and so on. Equivalent to [ \f\n\r\t\v].
\S Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
\t Matches a tab character. Equivalent to \x09 and \cI.
\v Matches a vertical tab character. Equivalent to \x0b and \cK.
/\f/.test('\x0c') // true

Character set

character meaning
[xyz] Character set. Matches any one of the contained characters. For example, '[abc]' matches the 'a' in "plain".
[^xyz] Negated character set. Matches any character that is not contained. For example, '[^abc]' matches the 'p', 'l', 'i', 'n' in "plain".
[a-z] Character range. Matches any character in the specified range. For example, ‘[a-z]’ can match any lowercase character in the range ‘a’ through ‘z’.
[^a-z] The range of negative characters. Matches any character that is not in the specified range. For example, ‘[^a-z]’ can match any character that is not in the range ‘a’ through ‘z’.
'plain'.replace(/[abc]/g, '*') // "pl*in"
'plain'.replace(/[^abc]/g, '*') // "**a**"
/[a-z]/.test('aglkshjeroihskjngbhlskfgh') // true
/[^a-z]/.test('aglkshjeroihskjngbhlskfgh') // false
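
Character ranges combine naturally with quantifiers and anchors. As a sketch, the following checks a six-digit hex color and also shows that most metacharacters become literal inside brackets; the pattern and variable names are only an illustration.

```javascript
// Several ranges can share one set: digits plus a-f in either case.
const hexColor = /^#[0-9a-fA-F]{6}$/;
const validColor = hexColor.test('#1a2B3c');     // true
const invalidColor = hexColor.test('#12345g');   // false: g is outside a-f

// Inside brackets, ., + and * are plain characters, not metacharacters.
const literalPunct = 'a.b+c*d'.replace(/[.+*]/g, '-');   // "a-b-c-d"

console.log(validColor, invalidColor, literalPunct);
```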

Other metacharacters

character meaning
\cx Matches the control character specified by x. For example, \cM matches a Control-M or carriage return character. The value of x must be in A-Z or a-z. Otherwise, c is treated as a literal 'c' character.
\xn Matches n, where n is a hexadecimal escape value. The hexadecimal escape value must be exactly two digits long. For example, '\x41' matches "A". '\x041' is equivalent to '\x04' followed by "1". This allows ASCII codes to be used in regular expressions.
\num Matches num, where num is a positive integer. It is a backreference to a captured match. For example, '(.)\1' matches two consecutive identical characters.
\n Identifies an octal escape value or a backreference. If \n is preceded by at least n captured subexpressions, n is a backreference. Otherwise, if n is an octal digit (0-7), n is an octal escape value.
\nm Identifies an octal escape value or a backreference. If \nm is preceded by at least nm captured subexpressions, nm is a backreference. If \nm is preceded by at least n captured subexpressions, n is a backreference followed by the literal m. If neither condition holds and both n and m are octal digits (0-7), \nm matches the octal escape value nm.
\nml If n is an octal digit (0-3) and both m and l are octal digits (0-7), matches the octal escape value nml.
\un Matches n, where n is a Unicode character represented by four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©).
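
A few of the escapes in this table can be tried out directly. The sketch below uses the \x41, \u00A9, and '(.)\1' examples from the table; the test strings are invented.

```javascript
// \1 refers back to whatever group 1 captured, so (.)\1 finds doubled characters.
const doubled = 'bookkeeper'.match(/(.)\1/g);   // ["oo", "kk", "ee"]

// \x41 is the two-digit hexadecimal escape for "A".
const hexA = /\x41/.test('A');                  // true

// \u00A9 is the four-digit Unicode escape for the copyright symbol.
const copyright = /\u00A9/.test('\u00A9 2020'); // true

console.log(doubled, hexA, copyright);
```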

grouping

What is grouping

Repeating a single character is simple: add a quantifier after the character. For example, a+ matches one or more a, and a? matches zero or one a. But what if we want to repeat several characters together? We can use parentheses () to mark the subexpression and then apply the quantifier to it. For example, (abc)? matches zero or one abc. Each pair of parentheses forms a group.

'abcdababcdababcdabacdbab'.replace(/(ab){2}/g, "*"); // "abcd*cd*cdabacdbab"

Grouping can be divided into two forms, capturing groups and non-capturing groups.

In the expression (A)(B(C)), there are four such groups:

index group
0 (A)(B(C))
1 (A)
2 (B(C))
3 (C)
// Convert the date format: 2020-12-09 ----> 2020/12/09
"2020-12-09".replace(/(\d{4})-(\d{1,2})-(\d{1,2})/g, "$1/$2/$3"); // "2020/12/09"

backreferences

// Convert the date format: 2020-12-09 ----> 09/12/2020
"2020-12-09".replace(/(\d{4})-(\d{1,2})-(\d{1,2})/g, "$3/$2/$1"); // "09/12/2020"
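
When the captured groups need more processing than $1, $2, $3 can express, replace() also accepts a function whose arguments are the full match followed by each group. A minimal sketch, assuming we want to zero-pad the day and month (the input value is made up):

```javascript
const isoDate = '2020-12-9';

// The callback receives the whole match, then groups 1, 2 and 3.
const padded = isoDate.replace(
  /(\d{4})-(\d{1,2})-(\d{1,2})/,
  (match, y, m, d) => `${d.padStart(2, '0')}/${m.padStart(2, '0')}/${y}`
);

console.log(padded); // "09/12/2020"
```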

Named groups

// (?<name>...) is a new feature in ES2018
let res = "$name=zhangsan&age=20".match(/\$(?<str>\w+)/);
console.log(res.groups.str); // "name"
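
Named groups can also be referenced from a replacement string with $<name>, which keeps a longer pattern readable. The following sketch redoes the date conversion that way; the group names year, month, and day are chosen for this example.

```javascript
const datePattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;

// The groups object can be destructured by name.
const { year, month, day } = '2020-12-09'.match(datePattern).groups;

// In a replacement string, $<name> inserts the text captured by that group.
const swapped = '2020-12-09'.replace(datePattern, '$<day>/$<month>/$<year>');

console.log(year, month, day); // 2020 12 09
console.log(swapped);          // "09/12/2020"
```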

RegExp methods

test

The test() method searches a string for a match against the pattern and returns true or false depending on the result.

/abc/.test('abc') // true
/abd/.test('abc') // false
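
One thing to watch for with test(): when the pattern carries the g modifier, the RegExp object remembers where the last match ended in its lastIndex property, so repeated calls on the same object can alternate between true and false. A short sketch of that behavior:

```javascript
const globalRe = /abc/g;

const first = globalRe.test('abc');    // true, lastIndex is now 3
const second = globalRe.test('abc');   // false: the search started at index 3, lastIndex resets to 0
const third = globalRe.test('abc');    // true again

// Without g, every call starts from the beginning of the string.
const plainRe = /abc/;
const stable = plainRe.test('abc') && plainRe.test('abc');   // true

console.log(first, second, third, stable);
```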

exec

The exec() method searches a string for a match. It returns an array containing the matched text, with index and input properties, or null if no match is found.

/abc/.exec('qabcz') // ["abc", index: 1, input: "qabcz", groups: undefined]
/abd/.exec('qwerr') // null
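
With the g modifier, each exec() call continues from where the previous match ended, so it can be called in a loop to visit every match together with its position. A minimal sketch (the found array is just an illustrative way to collect results):

```javascript
const numberRe = /\d+/g;
const input = 'fdsa1231dfaf232fda';
const found = [];

let m;
// exec() returns null once no further match exists, which ends the loop.
while ((m = numberRe.exec(input)) !== null) {
  found.push({ value: m[0], index: m.index });
}

console.log(found); // [{ value: "1231", index: 4 }, { value: "232", index: 12 }]
```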

String methods that use regular expressions

replace

The replace() method replaces some characters in a string with other characters, or replaces a substring that matches a regular expression. It returns a new string.

It is well suited to masking sensitive words:

"Buy an apple phone".replace(/apple/, '*') // "Buy an * phone"
"Buy an apple phone".replace(/apple/, function (v) {
  // v is the matched text; return one * per matched character
  return '*'.repeat(v.length)
}) // "Buy an ***** phone"

match

The match() method searches a string for matches of a regular expression. With the g modifier it returns an array of all matches; without it, it returns the first match only.

"fdsa1231dfaf232fda".match(/\d+/g) // ["1231", "232"]
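
The return value of match() changes shape depending on the g modifier, which is easy to trip over. The sketch below compares the two forms and also shows matchAll(), which assumes an ES2020-capable engine; the sample string is invented.

```javascript
const source = 'x=1, y=22';

// Without g: the first match only, plus its capture groups and index.
const single = source.match(/(\w)=(\d+)/);
// single[0] is "x=1", single[1] is "x", single[2] is "1", single.index is 0

// With g: every full match, but no capture groups.
const all = source.match(/(\w)=(\d+)/g);   // ["x=1", "y=22"]

// No match at all gives null, not an empty array.
const none = source.match(/z=\d+/g);       // null

// matchAll() yields every match together with its groups.
const pairs = [...source.matchAll(/(\w)=(\d+)/g)].map((hit) => [hit[1], hit[2]]);

console.log(single[1], single[2], all, none, pairs);
```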

search

The search() method returns the index of the first substring that matches a regular expression, or -1 if nothing matches.

"fdsa1231dfaf232fda".search(/\d+/g) // 4
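
As a small sketch of the two outcomes, note that search() always reports the first match, so the g modifier makes no difference to the result:

```javascript
const firstDigit = 'fdsa1231dfaf232fda'.search(/\d+/);    // 4
const withGlobal = 'fdsa1231dfaf232fda'.search(/\d+/g);   // 4, same as without g
const notFound = 'fdsa'.search(/\d+/);                    // -1

console.log(firstDigit, withGlobal, notFound);
```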