Regular expression
Regular expressions are patterns used to match combinations of characters in strings. In JavaScript, regular expressions are also objects. These patterns are used with the exec and test methods of RegExp, and with the match, matchAll, replace, search, and split methods of String.
Create an expression
Literal
There are two ways to create a regular expression. The first is a literal, written directly between two slashes that mark the beginning and end of the pattern.
var reg = /ab/g;
A regular expression literal is compiled when the script loads. If the pattern will not change, this form gives better performance.
RegExp constructor
var reg = new RegExp("ab", "g");
// Equivalent to: var reg = /ab/g;
With a literal, the modifiers follow the closing slash. With the constructor, the second argument specifies the modifiers.
Both forms create a new regular expression object. The difference is that a literal is compiled when the engine parses the code, whereas the constructor builds the expression at run time, so literals are more efficient. Literals are also more convenient and easier to read, so they are the usual way to define regular expressions.
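The constructor is still the natural choice when the pattern is only known at run time. A minimal sketch, assuming the text to search for arrives in a variable; escapeRegExp and countOccurrences are hypothetical helper names, not built-in functions:

```javascript
// Hypothetical helper: put a backslash in front of every character
// that has a special meaning inside a pattern.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: count how often `needle` occurs literally in `haystack`.
function countOccurrences(haystack, needle) {
  var reg = new RegExp(escapeRegExp(needle), "g");
  var matches = haystack.match(reg);
  return matches === null ? 0 : matches.length;
}

console.log(countOccurrences("a.b.c", ".")); // 2
console.log(countOccurrences("1+1=2", "1+1")); // 1
```

Without the escaping step, "." would be read as "any character" and "+" as a quantifier, so the counts would differ.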
Instance properties
Instance properties related to modifiers (read-only):
ignoreCase
: Returns a Boolean value indicating whether the i modifier is set
global
: Returns a Boolean value indicating whether the g modifier is set
multiline
: Returns a Boolean value indicating whether the m modifier is set
flags
: Returns a string containing all modifiers that are set
Instance properties independent of modifiers:
lastIndex
: Returns an integer indicating the position where the next search will begin
source
: Returns the pattern text of the regular expression as a string, read-only
var reg = /abc/gim;
// Properties related to modifiers
reg.ignoreCase; //true
reg.global; //true
reg.multiline; //true
reg.flags; //gim
// Properties independent of modifiers
reg.lastIndex; // 0
reg.source; //abc
Instance methods
RegExp instance methods
test()
Tests for a match in the string, returning true or false
var reg = /av/g;
var s = "avbabc";
reg.test(s); //true
reg.lastIndex = 2;
reg.test(s); //false
When the regular expression has the g modifier, each call to test searches forward from where the previous match ended, and you can use lastIndex to see the current position
var reg = /av/g;
var s = "avbavabc";
reg.lastIndex; // 0
reg.test(s); //true
reg.lastIndex; // 2
reg.test(s); //true
reg.lastIndex; // 5
reg.test(s); //false
If the pattern is empty, it matches every string, so test always returns true
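One consequence of the lastIndex behaviour is that reusing a single regular expression with the g modifier for several test calls can give alternating results. A small sketch of the effect (the variable names are only illustrative):

```javascript
var globalReg = /av/g;
// A successful call moves lastIndex forward; a failed call resets it to 0.
var globalResults = [
  globalReg.test("av"), // true, lastIndex is now 2
  globalReg.test("av"), // false, the search started at index 2
  globalReg.test("av"), // true, the search started again at index 0
];

var plainReg = /av/;
// Without the g modifier, lastIndex is ignored and every call starts at 0.
var plainResults = [plainReg.test("av"), plainReg.test("av")];

console.log(globalResults); // [ true, false, true ]
console.log(plainResults); // [ true, true ]
```

If a regular expression is only used for yes/no checks, leaving out the g modifier avoids this surprise.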
exec()
Executes a search for a match in the string and returns the result as an array, or null if nothing matches. The returned array has two extra properties:
input
: The entire original string
index
: The index at which the successful match starts
var reg = /av/g;
var s = "avbavabc";
reg.exec(s); //["av", index: 0, input: "avbavabc", groups: undefined]
reg.exec(s); //["av", index: 3, input: "avbavabc", groups: undefined]
reg.exec(s); //null
As with test, when the regular expression has the g modifier, each exec call searches forward from where the previous match ended, and you can use lastIndex to see the current position
When the regular expression contains capture groups written with (), the returned array has several elements. The first is the text matched by the whole regular expression, the second is the text matched by the first pair of parentheses, the third is the text matched by the second pair of parentheses if there is one, and so on.
var reg = /a(v)/g;
var s = "avbavabc";
reg.exec(s); //[ 'av', 'v', index: 0, input: 'avbavabc', groups: undefined ]
reg.exec(s); //[ 'av', 'v', index: 3, input: 'avbavabc', groups: undefined ]
reg.exec(s); //null
With more than one ():
var reg = /a(v)(b)/g;
var s = "avbavabc";
reg.exec(s); // [ 'avb', 'v', 'b', index: 0, input: 'avbavabc', groups: undefined ]
reg.exec(s); //null
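Because exec with the g modifier continues from lastIndex, it is commonly called in a loop until it returns null. A minimal sketch, assuming the pattern has the g modifier and never matches an empty string; collectMatches is a hypothetical helper name:

```javascript
// Hypothetical helper: collect every match together with its first group.
function collectMatches(reg, text) {
  if (!reg.global) {
    // Without g, lastIndex never advances and the loop would not end.
    throw new Error("collectMatches expects a regular expression with the g modifier");
  }
  var found = [];
  var m;
  reg.lastIndex = 0; // always start from the beginning of the string
  while ((m = reg.exec(text)) !== null) {
    found.push({ match: m[0], group: m[1], index: m.index });
  }
  return found;
}

console.log(collectMatches(/a(v)/g, "avbavabc"));
// [ { match: 'av', group: 'v', index: 0 }, { match: 'av', group: 'v', index: 3 } ]
```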
String instance methods
match()
Searches a string for a match and returns an array, or null if nothing matches. If the regular expression does not have the g modifier, the returned array has index and input properties
var reg = /ac/;
var s = "acbacvabc";
var s1 = "aabaavabc";
s.match(reg); //[ 'ac', index: 0, input: 'acbacvabc', groups: undefined ]
s1.match(reg); //null
When the regular expression has the g modifier, this method returns an array of all successful matches at once. The array no longer has the index and input properties
var reg = /ac/g;
var s = "acbacvabc";
s.match(reg); //[ 'ac', 'ac' ]
Note: Setting the lastIndex property of a regular expression does not apply to the match method, which always matches from the beginning of the string.
matchAll()
Searches a string for all matches and returns an iterator. Note that when matchAll is used, the regular expression must have the g modifier, otherwise a TypeError is thrown.
var reg = /a/g;
var s = "acbacvabc";
var arr = [...s.matchAll(reg)];
console.log(arr);
// Output:
/** [ [ 'a', index: 0, input: 'acbacvabc', groups: undefined ], [ 'a', index: 3, input: 'acbacvabc', groups: undefined ], [ 'a', index: 6, input: 'acbacvabc', groups: undefined ] ] **/
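Unlike match with the g modifier, each result produced by matchAll keeps its capture groups. A minimal sketch, assuming a simple "key=value" input format chosen only for illustration; parsePairs is a hypothetical helper name:

```javascript
// Hypothetical helper: turn "a=1;b=2" into { a: "1", b: "2" }.
function parsePairs(text) {
  var result = {};
  for (var m of text.matchAll(/(\w+)=(\w+)/g)) {
    // m[1] is the first capture group (key), m[2] the second (value).
    result[m[1]] = m[2];
  }
  return result;
}

console.log(parsePairs("a=1;b=2")); // { a: '1', b: '2' }
```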
search()
Performs a lookup of a matching character in a string, returning the position of the first matching character, or -1 if not matched
var reg = /en/g;
var reg1 = /yo/g;
var s = "yuwenbo";
s.search(reg); // 3
s.search(reg1); // -1
replace()
Searches a string for a match and replaces the matched substring with a replacement. The two parameters are the regular expression and the replacement content.
If the regular expression does not have the g modifier, only the first match is replaced. With the g modifier, all matches are replaced.
var s = "i love you";
console.log(s.replace(/\s/, "❤")); // i❤love you
console.log(s.replace(/\s/g, "❤")); // i❤love❤you
The second argument of replace can use the $ sign to refer to parts of the match:
$&
: The matched substring
$`
: The text before the match
$'
: The text after the match
$n
: The content of the nth capture group, where n is a natural number starting from 1
$$
: A literal dollar sign $
console.log("he llo".replace(/(\w+)\s(\w+)/, "$2 $1")); //llo he
console.log("hello".replace(/e/, "-$`-$&-$'-")); //h-h-e-llo-llo
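As a slightly larger illustration of the $ placeholders, the sketch below reorders a date and inserts a literal dollar sign. The input strings are made up for the example:

```javascript
// $1, $2, $3 refer to the three capture groups (year, month, day).
var reordered = "2024-01-31".replace(/(\d{4})-(\d{2})-(\d{2})/, "$3/$2/$1");
console.log(reordered); // 31/01/2024

// "$$" produces one literal "$", and "$&" inserts the matched text.
var priced = "price: 5".replace(/\d+/, "$$$&");
console.log(priced); // price: $5

// $` is the text before the match and $' is the text after it.
var context = "hello".replace(/l+/, "[$`|$&|$']");
console.log(context); // he[he|ll|o]o
```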
The second argument to replace can also be a function, in which case each match is replaced with the function's return value
The function can take several arguments. The first is the matched text, followed by one argument per capture group (there can be several). After the groups come the position of the match in the string and then the original string.
console.log(
  "hello".replace(/e/, function (match, index, str) {
    console.log(match, index, str); // e 1 hello
    return "❤";
  })
); // h❤llo
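When the pattern contains a capture group, the group is passed to the function before the position. A minimal sketch that converts a hyphenated name to camel case; toCamelCase and seenArguments are hypothetical names used only for this example:

```javascript
var seenArguments = []; // records what the callback receives

// Hypothetical helper: "background-color" becomes "backgroundColor".
function toCamelCase(name) {
  return name.replace(/-(\w)/g, function (match, letter, offset, original) {
    seenArguments.push([match, letter, offset, original]);
    return letter.toUpperCase(); // replaces "-c" with "C"
  });
}

console.log(toCamelCase("background-color")); // backgroundColor
console.log(seenArguments); // [ [ '-c', 'c', 10, 'background-color' ] ]
```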
split()
Splits a string using a regular expression or a fixed string and stores the substrings in an array. This method takes two parameters. The first is the regular expression that describes the separator, and the second sets the maximum number of elements in the returned array
var str = "ni hao ya.hei hei hei";
str.split(/ |\./, 5); //[ 'ni', 'hao', 'ya', 'hei', 'hei' ]
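A regular expression separator is most useful when the separator varies. The sketch below uses made-up input. The last line relies on a behaviour not covered above: when the separator contains capturing parentheses, the captured separators are included in the result.

```javascript
// Split on a comma surrounded by any amount of whitespace.
var fields = "a,  b ,c".split(/\s*,\s*/);
console.log(fields); // [ 'a', 'b', 'c' ]

// The second parameter limits the number of elements returned.
var firstTwo = "a,  b ,c".split(/\s*,\s*/, 2);
console.log(firstTwo); // [ 'a', 'b' ]

// With a capture group, the separators themselves are kept.
var withSeparators = "1+2-3".split(/([+-])/);
console.log(withSeparators); // [ '1', '+', '2', '-', '3' ]
```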
Conclusion:
To check whether a string matches, use the test or search method. To get more information about the match, use the exec or match method.
Modifiers (identifiers)
Modifiers represent additional rules and are placed at the end of the regular pattern. You can use them individually or together.
// A single modifier
"abAbab".match(/a/g); //["a","a"]
// Use multiple modifiers together
"abAbab".match(/a/gi); //["a", "A", "a"]
g
modifier
Global search. By default, matching stops after the first match. With the g modifier, the search continues through the whole string
i
modifier
Case-insensitive search. By default, matching is case-sensitive. With the i modifier, case is ignored
m
modifier
Multi-line search. By default, ^ and $ match only at the beginning and end of the whole string. With the m modifier, ^ and $ also match at the beginning and end of each line, that is, ^ and $ recognize the newline character \n
For example:
/yuwen$/m.test('hi yuwen\n') is true
/yuwen$/.test('hi yuwen\n') is false
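The m modifier is typically combined with g to pick out whole lines. A small sketch with a made-up three-line string:

```javascript
var text = "# title\nbody\n# notes";

// With m, ^ and $ match at every line boundary, so both "#" lines are found.
var withM = text.match(/^#.*$/gm);
console.log(withM); // [ '# title', '# notes' ]

// Without m, $ only matches at the very end of the string, and "." cannot
// cross the newline after "# title", so nothing matches.
var withoutM = text.match(/^#.*$/g);
console.log(withoutM); // null
```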
s
modifier
dotAll mode: allows . to match newline characters as well
u
modifier
Unicode mode: treats the pattern and the string as sequences of Unicode code points
y
modifier
Sticky search: matches only at the position in the target string given by the lastIndex property
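The effect of the s, u, and y modifiers is easiest to see side by side. The following sketch uses small made-up strings. It assumes an engine that supports the s modifier (ES2018 or later):

```javascript
// s: "." also matches a newline.
var dotPlain = /a.b/.test("a\nb"); // false
var dotAll = /a.b/s.test("a\nb"); // true

// u: a character outside the Basic Multilingual Plane counts as one unit.
var smiley = "\u{1F600}"; // stored as two UTF-16 code units
var unitMode = /^.$/.test(smiley); // false
var unicodeMode = /^.$/u.test(smiley); // true

// y: the match must begin exactly at lastIndex.
var sticky = /foo/y;
sticky.lastIndex = 3;
var atThree = sticky.test("barfoo"); // true, "foo" starts at index 3
sticky.lastIndex = 0;
var atZero = sticky.test("barfoo"); // false, index 0 holds "bar"

console.log(dotPlain, dotAll, unitMode, unicodeMode, atThree, atZero);
// false true false true true false
```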
Special characters
\
character
Escape character. To match a special character literally in a regular expression, put a backslash \ in front of it. The characters that need a backslash escape are: ^ . [ $ ( ) | * + ? { \
^
character
Matches the start position. If the multi-line flag is set, it also matches the position right after a newline character
For example: /^A/ will match the A in "Ant", but not the A in "ntA"
$
character
Matches the end position. If the multi-line flag is set, it also matches the position right before a newline character
For example: /A$/ will match the A in "ntA", but not the A in "Ant"
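Using ^ and $ together is a common way to require that the whole string matches rather than just a part of it. A minimal sketch; isAllDigits is a hypothetical helper name:

```javascript
// Hypothetical helper: true only if the string consists entirely of digits.
function isAllDigits(text) {
  return /^[0-9]+$/.test(text);
}

console.log(isAllDigits("2024")); // true
console.log(isAllDigits("20a24")); // false

// Without the anchors, a partial match is enough.
console.log(/[0-9]+/.test("20a24")); // true
```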
*
character
Matches the preceding expression 0 or more times. Equivalent to {0,}
For example: /yueno*/ will match "yuenooo" and "yuen" in "yuenoooyuen"
+
character
Matches the preceding expression 1 or more times. Equivalent to {1,}
For example: /yueno+/ will only match "yuenooo" in "yuenoooyuen"
?
character
Matches the preceding expression 0 or 1 times. Equivalent to {0,1}
For example: /yueno?/ will match "yueno" at the start of "yuenoooyuen"
Note: if ? immediately follows any of the quantifiers *, +, ?, or {}, it makes that quantifier non-greedy (matching as few characters as possible)
For example: /yueno??/ will match "yuen" at the start of "yuenoooyuen"
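The difference between greedy and non-greedy quantifiers shows most clearly when a pattern could stop at more than one place. A small sketch with a made-up string:

```javascript
var html = "<b>bold</b>";

// Greedy: ".+" takes as much as it can, so the match runs to the last ">".
var greedy = html.match(/<.+>/)[0];
console.log(greedy); // <b>bold</b>

// Non-greedy: ".+?" takes as little as it can, so the match stops at the first ">".
var lazy = html.match(/<.+?>/)[0];
console.log(lazy); // <b>
```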
.
character
Matches any single character except a newline by default
For example: /.y/ will only match "oy" in "yuenoooyuen"
For example: /..y/ will only match "ooy" in "yuenoooyuen"
(x)
character
Capturing parentheses. Parentheses group part of the pattern and remember what it matched. The captured content can be referenced later in the same regular expression with \n, and in a replacement string with $1, $2, and so on
For example: /(wenbo)+/.test('wenbowenbo') is true, which means wenbo is matched one or more times as a whole
For example: "wenbo,zhijian".replace(/(wenbo),(zhijian)/, '$2,$1')
Output: zhijian,wenbo
(?:x)
character
Non-capturing parentheses. Matches x but does not remember the match. They let you group a subexpression so that regular expression operators apply to it as a whole, but the matched content cannot be referenced with \n or $n
x(?=y)
character
Lookahead assertion. Matches x only if x is followed by y
For example: 'wenbo'.match(/wen(?=bo)/)
Output: [ 'wen', index: 0, input: 'wenbo', groups: undefined ]
For example: 'wenyu'.match(/wen(?=bo)/)
Output: null
(?<=y)x
character
Lookbehind assertion. Matches x only if x is preceded by y
For example: 'wenbo'.match(/(?<=wen)bo/)
Output: [ 'bo', index: 3, input: 'wenbo', groups: undefined ]
For example: 'yubo'.match(/(?<=wen)bo/)
Output: null
x(?!y)
character
Negative lookahead assertion. Matches x only if x is not followed by y
(?<!y)x
character
Negative lookbehind assertion. Matches x only if x is not preceded by y
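The two negative assertions, written x(?!y) and (?<!y)x, can be tried with the same kind of strings as the positive ones. The inputs below are made up for illustration. Lookbehind assumes an engine with ES2018 support:

```javascript
// Negative lookahead: "wen" must not be followed by "bo".
var notFollowedByBo = "wenbo wenyu".match(/wen(?!bo)\w+/g);
console.log(notFollowedByBo); // [ 'wenyu' ]

// Negative lookbehind: "bo" must not be preceded by "wen".
var notPrecededByWen = "wenbo yubo".match(/(?<!wen)bo/);
console.log(notPrecededByWen[0], notPrecededByWen.index); // bo 8
```

In the second call the "bo" at index 3 is skipped because "wen" comes right before it, so the first accepted match is the one inside "yubo".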
x|y
character
Matches x or y. Several alternatives can be chained together
For example: 'wenyu'.match(/w|e|n/g)
Output: [ 'w', 'e', 'n' ]
{n}
character
Matches the preceding character exactly n times, where n is a non-negative integer
For example: 'hello'.match(/l{2}/g)
Output: [ 'll' ]
{n,}
character
Matches the preceding character at least n times, where n is a non-negative integer
{n,m}
character
Matches the preceding character at least n times and at most m times, where n and m are non-negative integers and n <= m
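The {n,} and {n,m} forms can be compared on one string. In this sketch the \b word boundaries are added so that each run of letters is judged as a whole. The input is made up:

```javascript
var runs = "a aa aaa aaaa";

// {2,3}: runs of exactly two or three "a".
var twoToThree = runs.match(/\ba{2,3}\b/g);
console.log(twoToThree); // [ 'aa', 'aaa' ]

// {2,}: runs of two or more "a".
var twoOrMore = runs.match(/\ba{2,}\b/g);
console.log(twoOrMore); // [ 'aa', 'aaa', 'aaaa' ]
```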
[xyz]
character
Character set. Matches any one of the characters in the square brackets, including escape sequences. A range of characters can be specified with a hyphen (-), for example [a-zA-Z1-9]
For example: 'hello 123'.match(/[a-h1-2]/g)
Output: [ 'h', 'e', '1', '2' ]
[^xyz]
character
Negated character set. Matches any character that is not in the square brackets
For example: 'hello 123'.match(/[^a-h1-2]/g)
Output: [ 'l', 'l', 'o', ' ', '3' ]
[\b]
character
Matches a backspace (U+0008). Not to be confused with \b
\b
character
Matches a word boundary
For example:
/\bworld/.test('hello world') // true
/\bworld/.test('hello-world') // true
/\bworld/.test('helloworld') // false
\B
character
Matches a non-word boundary
For example:
/\Bworld/.test('hello world') // false
/\Bworld/.test('hello-world') // false
/\Bworld/.test('helloworld') // true
\cX
character
Matches a control character in the string, where X is a letter from A to Z
\d
character
Matches a digit. Equivalent to [0-9]
\D
character
Matches a non-digit character. Equivalent to [^0-9]
\f
character
Matches a form feed character (U+000C)
\n
character
Match a newline character (U+000A)
\r
character
Matches a carriage return
\s
character
Matches a whitespace character, including a space, tab, form feed, and line feed. Equivalent to
[ \f\n\r\t\v\u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff]
\S
character
Matches a non-whitespace character
\t
character
Matches a horizontal TAB character
\v
character
Matches a vertical TAB character
\w
character
Matches a single word character (letter, digit, or underscore). Equivalent to [A-Za-z0-9_]
\W
character
Matches a single non-word character. Equivalent to [^A-Za-z0-9_]
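The shorthand classes \d, \w, \s and \W are usually combined with a quantifier. A short sketch with made-up strings:

```javascript
// \d+ : runs of digits
var numbers = "Order 42 shipped in 2024".match(/\d+/g);
console.log(numbers); // [ '42', '2024' ]

// \w+ : runs of letters, digits, or underscores
var words = "a_b c-d".match(/\w+/g);
console.log(words); // [ 'a_b', 'c', 'd' ]

// \s+ : any run of whitespace as a separator
var parts = "a \t b\nc".split(/\s+/);
console.log(parts); // [ 'a', 'b', 'c' ]

// \W : remove everything that is not a word character
var stripped = "a-1 b_2".replace(/\W/g, "");
console.log(stripped); // a1b_2
```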
\n
character
Where n is a positive integer, a backreference to the substring matched by the nth capture group, counting groups by their opening parentheses
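A backreference such as \1 matches the same text that the first capture group has just matched. A small sketch with made-up strings:

```javascript
// (\w)\1 : any word character immediately followed by itself.
var hasDoubleLetter = /(\w)\1/.test("hello"); // true, because of "ll"
var noDoubleLetter = /(\w)\1/.test("world"); // false

// \b(\w+) \1\b : a whole word repeated after a single space.
var repeated = "this is is fine".match(/\b(\w+) \1\b/);
console.log(hasDoubleLetter, noDoubleLetter); // true false
console.log(repeated[0], repeated.index); // is is 5
```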
\0
character
Matches the NULL character (U+0000)
\xhh
character
Matches a two-digit hexadecimal character (\x00-\xFF)
\uhhhh
character
Matches a UTF-16 code unit represented by a four-digit hexadecimal number
\u{hhhh} or \u{hhhhh}
character
Matches the character with the given hexadecimal Unicode code point (only when the u flag is set)