start
How do newcomers learn regular expressions
Sites for testing regular expressions:
- regex101.com/
- jex.im/regulex/
A regular expression (also called a "regexp" or "reg") consists of a pattern and optional modifiers. There are two syntaxes for creating a regular expression object. The longer syntax:
regexp = new RegExp("pattern", "flags");
The shorter syntax uses slashes "/":
regexp = /pattern/; // no modifiers
regexp = /pattern/gmi; // with modifiers g, m, and i (more on that later)
The slashes "/" tell JavaScript that we are creating a regular expression. They play the same role as quotes do for a string. new RegExp creates a regular expression dynamically from the parameters passed in.
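As a minimal sketch of the dynamic case, the pattern below is assembled from a string at run time. The variable names and the sample text are illustrative assumptions, not part of the original article.

```javascript
// Hypothetical input: in practice this might come from a form field or a config value.
const word = "script";

// Build the pattern from a string; the second argument holds the modifiers.
const dynamicRegexp = new RegExp("java" + word, "gi");

const dynamicMatches = "JavaScript and javascript".match(dynamicRegexp);
console.log(dynamicMatches); // ["JavaScript", "javascript"]

// Inside a string, backslashes must be doubled: "\\d" in the string is \d in the pattern.
const digitRegexp = new RegExp("\\d+");
console.log(digitRegexp.test("ES6")); // true
```

The slash syntax cannot insert a variable into the pattern, which is why new RegExp is the usual choice when the pattern is not known in advance.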
Modifiers
- i: case-insensitive matching.
- g: global search; finds all matches instead of only the first.
- u: full Unicode support, which enables matching by Unicode property with \p{...}.
let regexp = /\p{sc=Han}/gu; // matches Chinese characters
let str = 'Hello Привет 你好 123_456';
alert( str.match(regexp) ); // 你,好
- m: multiline mode; ^ and $ match at the start and end of every line, not only of the whole string.
let str = `1st place: Winnie
2nd place: Piglet
33rd place: Eeyore`;
alert( str.match(/^\d+/gm) ); // 1,2,33
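To see what the m modifier changes, this small sketch runs the same pattern with and without it. The variable names are illustrative.

```javascript
const places = `1st place: Winnie
2nd place: Piglet
33rd place: Eeyore`;

// Without m, ^ matches only at the very start of the string.
const withoutM = places.match(/^\d+/g);
// With m, ^ matches at the start of every line.
const withM = places.match(/^\d+/gm);

console.log(withoutM); // ["1"]
console.log(withM);    // ["1", "2", "33"]
```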
Character classes
Common character classes:
- \d: a digit.
- \s: a whitespace character (space, tab, newline).
- \w: a single "word" character: a letter, a digit, or an underscore.
They can be combined, for example:
let str = "test ES 6 AA"
let reg = /e\w\s\d/i
str.match(reg) // ["ES 6", index: 5, input: "test ES 6 AA", groups: undefined]
Each character class has an inverse class that matches everything the original does not:
- \D: any character that is not a digit.
- \S: any character that is not whitespace.
- \W: any character that is not a word character.
let str = "+7(903)-123-45-67";
alert( str.replace(/\D/g, "") ); // 79031234567
- .: matches any character except a newline, as in /ES./.
- \b: tests for a word boundary. For example, /\bjava\b/ matches the java in !java! but does not match inside javac.
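The dot and the word boundary can be checked with a short sketch like the following. The sample strings are illustrative.

```javascript
// The dot matches any single character except a newline.
const dotMatch = "ES6".match(/ES./)[0];
console.log(dotMatch);               // "ES6"
console.log(/ES./.test("ES\n"));     // false: the dot does not match a newline

// \b matches at a word boundary, so only the standalone word is found.
const wordRegexp = /\bjava\b/;
console.log(wordRegexp.test("!java!")); // true
console.log(wordRegexp.test("javac"));  // false
```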
Anchors ^ and $
- ^xx: the string starts with xx.
- xx$: the string ends with xx.
Combining the two gives an exact match of the whole string:
const time = "12:02"
let reg = /^\d\d:\d\d$/
// .test checks whether a match is found
reg.test(time) // true
The empty string '' can be matched with /^$/.
Characters that need to be escaped
[ \ ^ $ . | ? * + ( )
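A rough sketch of why escaping matters: an unescaped dot matches any character, while an escaped one matches only a literal dot. The escapeRegExp helper is a hypothetical utility written for this example, not a built-in; it also escapes a few extra characters ({ } ]) to be safe.

```javascript
// Unescaped, the dot matches any character; escaped, it matches only a literal dot.
console.log(/1.5/.test("135"));   // true
console.log(/1\.5/.test("135"));  // false
console.log(/1\.5/.test("1.5"));  // true

// Hypothetical helper: escape special characters before building a pattern from plain text.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const literalRegexp = new RegExp(escapeRegExp("(1+1)"));
console.log(literalRegexp.test("(1+1)=2")); // true
```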
Sets and ranges [...]
- [abc]: matches any one of 'a', 'b', or 'c'.
- [a-z], [1-5]: denote ranges. [0-9A-F] matches a digit 0-9 or a letter A-F; [\w-] matches a word character or a hyphen.
- [^abcd]: matches any character except a, b, c, and d. This form is used for exclusion.
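The three forms above can be tried out with a sketch like this one. The sample strings and variable names are illustrative.

```javascript
// A set: "t", then "a" or "o", then "p".
const setMatches = "top tap tip".match(/t[ao]p/g);
console.log(setMatches); // ["top", "tap"]

// Ranges: "x" followed by two hexadecimal digits.
const hexMatches = "Exception 0xAF".match(/x[0-9A-F][0-9A-F]/g);
console.log(hexMatches); // ["xAF"]

// A class inside a set: word characters or hyphens.
const nameMatches = "my-var_1 = 5".match(/[\w-]+/g);
console.log(nameMatches); // ["my-var_1", "5"]

// Exclusion: everything except a, b, c, d.
const excluded = "abcdef".match(/[^abcd]/g);
console.log(excluded); // ["e", "f"]
```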
Or |
For single characters, a|b is equivalent to [ab]. We can use it like this:
- gr(a|e)y is strictly equivalent to gr[ae]y.
- gra|ey matches "gra" or "ey".
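The difference between alternation inside and outside parentheses shows up in a short sketch. The sample strings are illustrative.

```javascript
const colors = "gray grey griy";

const withGroup = colors.match(/gr(a|e)y/g);
const withSet = colors.match(/gr[ae]y/g);
console.log(withGroup); // ["gray", "grey"]
console.log(withSet);   // ["gray", "grey"]

// Without parentheses, | splits the whole pattern: "gra" OR "ey".
const withoutGroup = "gray grey".match(/gra|ey/g);
console.log(withoutGroup); // ["gra", "ey"]
```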
Quantifiers
- *: matches zero or more; /\d*/ matches any number of digits.
- +: matches one or more.
- ?: matches zero or one, equivalent to {0,1}.
- {n}: matches exactly n; \d{3} matches three consecutive digits, equivalent to \d\d\d.
- {2,5}: \d{2,5} matches 2 to 5 digits.
- {3,}: \d{3,} matches 3 or more digits.
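A quick sketch applying each quantifier to the same input makes the differences visible. The phone number is reused from the earlier example; the other strings are illustrative.

```javascript
const phone = "+7(903)-123-45-67";

const oneOrMore = phone.match(/\d+/g);
const exactlyThree = phone.match(/\d{3}/g);
const twoToThree = phone.match(/\d{2,3}/g);
const threeOrMore = phone.match(/\d{3,}/g);

console.log(oneOrMore);    // ["7", "903", "123", "45", "67"]
console.log(exactlyThree); // ["903", "123"]
console.log(twoToThree);   // ["903", "123", "45", "67"]
console.log(threeOrMore);  // ["903", "123"]

// ? makes the preceding character optional; * allows it zero or more times.
const optionalU = "color colour".match(/colou?r/g);
const zeroOrMore = "100 10 1".match(/10*/g);
console.log(optionalU);  // ["color", "colour"]
console.log(zeroOrMore); // ["100", "10", "1"]
```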
Greedy versus lazy
Let's look at an example:
let str = `"hi" some word "ok" aa`
let reg = /".+"/g
str.match(reg) // ['"hi" some word "ok"']
We were trying to match ["hi", "ok"], but we got the whole phrase, because quantifiers are greedy by default. The engine works through the pattern ".+" in order:
- " matches the first quotation mark, so the match so far is ".
- . matches the next character, so the match so far is "h.
- + repeats the dot as many times as possible, so the match grows to "hi" some word "ok" aa, because every remaining character satisfies . (any character except a newline).
- The final " now has nothing left to match, since the end of the string has been reached, so the engine backtracks one character at a time until the match becomes "hi" some word "ok".
This is greedy mode.
Here's another example:
let str = `123 456`
let reg1 = /\d+ \d+?/
let reg2 = /\d+ \d+/
str.match(reg1) // 123 4
str.match(reg2) // 123 456
Adding ? after a quantifier, as in *?, +?, and ??, switches it to lazy mode: instead of matching as much as possible, it stops as soon as the condition is first satisfied.
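Applied to the quoted-string example, lazy mode gives the result we originally wanted. This is a minimal sketch; the exclusion-set variant is a common alternative and is included here only for comparison.

```javascript
const quoted = `"hi" some word "ok" aa`;

// Greedy: .+ runs to the last quotation mark.
const greedy = quoted.match(/".+"/g);
// Lazy: .+? stops at the first closing quotation mark.
const lazy = quoted.match(/".+?"/g);
// Alternative without lazy mode: match anything that is not a quotation mark.
const excludingQuotes = quoted.match(/"[^"]+"/g);

console.log(greedy);          // ['"hi" some word "ok"']
console.log(lazy);            // ['"hi"', '"ok"']
console.log(excludingQuotes); // ['"hi"', '"ok"']
```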
Capture groups (...)
Grouping
Here's an example:
let str = "gogogoaa"
let reg = /(go)+/
str.match(reg) // gogogo
Parentheses make several characters count as a single unit, so the quantifier applies to the whole group.
Let's look at a couple of examples:
- Domain name matching
/([\w-]+\.)+\w+/g // matches formats such as aaa.aaa.aa and aa-aa.aaa.aa
- Email matching
/[-.\w]+@([\w-]+\.)+[\w-]+/g
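The two patterns can be exercised as follows. The sample strings are made up for illustration, and the patterns are a rough sketch rather than a complete validator for domains or email addresses.

```javascript
const domainRegexp = /([\w-]+\.)+\w+/g;
const domains = "site.com my.site.com aa-aa.aaa.aa".match(domainRegexp);
console.log(domains); // ["site.com", "my.site.com", "aa-aa.aaa.aa"]

const emailRegexp = /[-.\w]+@([\w-]+\.)+[\w-]+/g;
const emails = "my@mail.com @ his@site.com.uk".match(emailRegexp);
console.log(emails); // ["my@mail.com", "his@site.com.uk"]
```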
(xx) is called a group. The contents of the parentheses are not only matched as a whole; the text matched inside the group is also returned separately:
let str = '<h1>Hello, world!</h1>';
let tag = str.match(/<(.*?)>/);
alert( tag[0] ); // <h1>
alert( tag[1] ); // h1
Nested groups
match returns an array of results, where index [0] holds the full match and index [1] holds the text matched inside the parentheses. Groups can also be nested; they are numbered by their opening parenthesis, from left to right:
let str = `<group1 group2>`
let arr = str.match(/<((\w+)\s(\w+))>/)
console.log(arr[0]) //<group1 group2>
console.log(arr[1]) //group1 group2
console.log(arr[2]) //group1
console.log(arr[3]) //group2
A group that is optional and does not take part in the match still keeps its slot in the result, with the value undefined:
let match = 'ac'.match(/a(z)?(c)?/)
alert( match.length ); // 3
alert( match[0] ); // ac (the full match)
alert( match[1] ); // undefined, because (z)? matched nothing
alert( match[2] ); // c
matchAll with the g modifier
With the g modifier, str.match returns only the full matches and leaves out the groups. To get every match together with its groups, use matchAll:
let str = `<group1> <group2>`
let arr = Array.from(str.matchAll(/<(group\d)>/g))
arr[0][0] // <group1>
arr[0][1] // group1
arr[1][0] // <group2>
arr[1][1] // group2
Note that matchAll does not return an array, but rather an iterable object.
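Because the result is iterable, it can be consumed directly with for...of instead of being converted with Array.from first. A minimal sketch, with illustrative variable names:

```javascript
const tags = `<group1> <group2>`;
const iterator = tags.matchAll(/<(group\d)>/g);

console.log(Array.isArray(iterator)); // false

// Each item has the same shape as a result of str.match without the g modifier.
const found = [];
for (const m of iterator) {
  found.push(`${m[1]} at index ${m.index}`);
}
console.log(found); // ["group1 at index 0", "group2 at index 9"]
```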
Named groups ?<name>
Modify the above example slightly:
let str = `<group1 group2>`
let arr = str.match(/<(?<g0>(?<g1>\w+)\s(?<g2>\w+))>/)
let groups = arr.groups
console.log(arr[0]) //<group1 group2>
console.log(groups.g0) //group1 group2
console.log(groups.g1) //group1
console.log(groups.g2) //group2
Putting ?<name> immediately after the opening parenthesis sets the group's name; the named groups are then returned as an object in the groups property of the result array.
Capture groups in replacements
str.replace(regexp, replacement) replaces the matches of regexp in str with replacement (all of them when regexp has the g modifier, otherwise only the first). The replacement string can refer to capture groups with $n, where n is the group number. For example:
let str = "John Bull";
let regexp = /(\w+) (\w+)/;
alert( str.replace(regexp, '$2, $1') ); // Bull, John
For named groups, the reference is $<name>. For example, let's change the date format from "year-month-day" to "day.month.year":
let regexp = /(?<year>[0-9]{4})-(?<month>[0-9]{2})-(?<day>[0-9]{2})/g;
let str = "2019-10-30, 2020-01-01";
alert( str.replace(regexp, '$<day>.$<month>.$<year>') ); // 30.10.2019, 01.01.2020
Backreferences
We need to find quoted strings: either single-quoted '...' or double-quoted "..."; both variants should match. Take the phrase "She's the one!": the pattern /['"](.*?)['"]/g matches only "She', because the closing quote is allowed to differ from the opening one.
So how do we make the pattern remember what one of its groups captured? This is where a backreference comes in:
let str = `He said: "She's the one!".`;
let regexp = /(['"])(.*?)\1/g;
alert( str.match(regexp) ); // "She's the one!"
Here \1 refers to whatever the first group (['"]) matched, in this case ", so the pattern effectively becomes /(['"])(.*?)"/g.
We can also refer to a named group with \k<name>:
let str = `He said: "She's the one!".`;
let regexp = /(?<quote>['"])(.*?)\k<quote>/g;
alert( str.match(regexp) ); // "She's the one!"
Assertions
Lookahead assertions
Usage:
x(?=y) matches x only if it is followed by y.
let str = "1 turkey costs 30 euro";
alert( str.match(/\d+(?= euro)/) ); // 30 (correctly skipping the single digit 1)
x(?!y) matches x only if it is not followed by y.
Lookbehind assertions
- (?<=y)x: matches x only if it is preceded by y.
- (?<!y)x: matches x only if it is not preceded by y.
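The four kinds of assertion can be compared side by side in a short sketch. The sample strings are illustrative, and lookbehind requires a reasonably modern JavaScript engine.

```javascript
const price = "1 turkey costs $30";

// Lookbehind: digits preceded by "$".
const afterDollar = price.match(/(?<=\$)\d+/)[0];
// Negative lookbehind: a number not preceded by "$".
const notAfterDollar = price.match(/(?<!\$)\b\d+/)[0];
console.log(afterDollar);    // "30"
console.log(notAfterDollar); // "1"

const order = "2 turkeys cost 60 euro";

// Lookahead: digits followed by " euro".
const beforeEuro = order.match(/\d+(?= euro)/)[0];
// Negative lookahead: a whole number not followed by " euro".
const notBeforeEuro = order.match(/\b\d+\b(?! euro)/)[0];
console.log(beforeEuro);    // "60"
console.log(notBeforeEuro); // "2"
```

In every case the asserted text is only checked, not consumed, so it never appears in the match itself.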
Capture groups in assertions
If we want to capture the whole lookaround expression or a part of it, we can: just wrap it in additional parentheses. Here, for example, the currency (euro|kr) is captured along with the amount:
let str = "1 turkey costs 30 euro";
let reg = /\d+(?= (euro|kr))/; // additional parentheses around euro|kr
alert( str.match(reg) ); // 30,euro
String and regexp methods
- str.match(regexp): finds matches of regexp in the string str.
- str.matchAll(regexp): mainly used to find all matches together with all of their groups.
- str.split(regexp|substr, limit): splits the string using a regular expression (or a substring) as the delimiter.
- str.search(regexp): returns the position of the first match, or -1 if none is found.
- str.replace(str|regexp, str|func): the general-purpose method for search and replace.
- regexp.exec(str): returns a match for regexp in the string str.
- regexp.test(str): looks for a match and returns true/false to indicate whether one exists.
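The methods not demonstrated earlier can be sketched on one sample string. The variable names are illustrative; the replace call uses a function as its second argument, which receives the full match followed by the capture groups.

```javascript
const text = "2019-10-30, 2020-01-01";

const parts = text.split(/,\s*/);
const position = text.search(/2020/);
console.log(parts);               // ["2019-10-30", "2020-01-01"]
console.log(position);            // 12
console.log(/\d{4}/.test(text));  // true

// With the g modifier, each exec call continues from regexp.lastIndex.
const yearRegexp = /(\d{4})-/g;
const years = [];
let result;
while ((result = yearRegexp.exec(text)) !== null) {
  years.push(result[1]);
}
console.log(years); // ["2019", "2020"]

// replace with a function: rebuild each date as day.month.year.
const swapped = text.replace(/(\d{4})-(\d{2})-(\d{2})/g, (match, y, m, d) => `${d}.${m}.${y}`);
console.log(swapped); // "30.10.2019, 01.01.2020"
```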