Regular expression

Regular expressions are patterns used to match combinations of characters in strings. In JavaScript, regular expressions are also objects. These patterns are used in the exec and test methods of RegExp, and in the match, matchAll, replace, search, and split methods of String.

Create an expression

literal

Create a regular expression directly as a literal, with two slashes marking the beginning and end of the pattern.

var reg = /ab/g;

The regular expression literal is compiled when the script loads. When the pattern stays unchanged, this approach gives better performance.

RegExp constructor

var reg = new RegExp("ab", "g");
// Equivalent to: var reg = /ab/g;

In a literal, the modifiers follow the closing slash. In the constructor, the second argument specifies the modifiers.

Both forms are equivalent, and each creates a new regular expression object. The difference is that with a literal, the engine creates the regular expression when it compiles the code, whereas the constructor creates it at run time, so literals are more efficient. Literals are also more convenient and intuitive, so in practice they are the usual way to define regular expressions.
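As a small illustration of the two forms, the sketch below builds the same pattern both ways and compares the results. The variable names, and the run-time value `word`, are assumptions made up for this example.

```javascript
// Two ways to create the same regular expression.
const fromLiteral = /ab/g;
const fromConstructor = new RegExp("ab", "g");

console.log(fromLiteral.source === fromConstructor.source); // true
console.log(fromLiteral.flags === fromConstructor.flags); // true

// The constructor accepts a string, so the pattern can be built at run time.
const word = "ab"; // hypothetical value that is only known at run time
const dynamic = new RegExp(word, "g");
console.log("abcab".match(dynamic)); // [ 'ab', 'ab' ]
```

The constructor form is mainly worth its run-time cost when the pattern cannot be written down in advance.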

Instance attributes

Instance properties related to modifiers (read-only):

  • ignoreCase: returns a Boolean indicating whether the i modifier is set
  • global: returns a Boolean indicating whether the g modifier is set
  • multiline: returns a Boolean indicating whether the m modifier is set
  • flags: returns a string containing all the modifiers that are set

Instance properties independent of modifiers:

  • lastIndex: returns an integer indicating the position where the next search will begin; this property is writable
  • source: returns the pattern of the regular expression as a string, read-only
var reg = /abc/gim;
// Properties related to modifiers
reg.ignoreCase; // true
reg.global; // true
reg.multiline; // true
reg.flags; // "gim"
// Properties independent of modifiers
reg.lastIndex; // 0
reg.source; // "abc"

Instance methods

RegExp instance methods

test()

Tests for a match in the string, returning true or false

var reg = /av/g;
var s = "avbabc";
reg.test(s); //true

reg.lastIndex = 2;
reg.test(s); //false

When the regular expression has the g modifier, each call to test starts searching from where the previous match ended. You can read lastIndex to see the current position.

var reg = /av/g;
var s = "avbavabc";

reg.lastIndex; // 0
reg.test(s); // true

reg.lastIndex; // 2
reg.test(s); // true

reg.lastIndex; // 5
reg.test(s); // false

If the pattern is empty, as in new RegExp(""), it matches every string, so test always returns true.

exec()

The exec method returns the match result as an array, or null if there is no match. The returned array has two additional properties:

  • input: the entire original string
  • index: the index at which the successful match starts
var reg = /av/g;
var s = "avbavabc";

reg.exec(s); //["av", index: 0, input: "avbavabc", groups: undefined]
reg.exec(s); //["av", index: 3, input: "avbavabc", groups: undefined]
reg.exec(s); //null

As with test, when the regular expression has the g modifier, each call to exec starts searching from where the previous match ended, and you can read lastIndex to see the current position.
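Because each call resumes from lastIndex, exec can be called in a loop to visit every match. A minimal sketch, reusing the pattern and string from the example above; the names `re`, `text`, and `positions` are only for illustration.

```javascript
const re = /av/g;
const text = "avbavabc";
const positions = [];
let m;

// exec returns null after the last match, which ends the loop
// and resets lastIndex to 0.
while ((m = re.exec(text)) !== null) {
  positions.push(m.index);
}

console.log(positions); // [ 0, 3 ]
console.log(re.lastIndex); // 0
```

Note that the loop only terminates because the g modifier is set; without it, exec would keep returning the first match.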

When the regular expression contains () groups, the returned array has multiple members. The first is the text matched by the entire regular expression, the second is the text matched by the first pair of parentheses, the third is the text matched by the second pair of parentheses if there is one, and so on.

var reg = /a(v)/g;
var s = "avbavabc";

reg.exec(s); //[ 'av', 'v', index: 0, input: 'avbavabc', groups: undefined ]
reg.exec(s); //[ 'av', 'v', index: 3, input: 'avbavabc', groups: undefined ]
reg.exec(s); //null

With more than one ():

var reg = /a(v)(b)/g;
var s = "avbavabc";

reg.exec(s); // [ 'avb', 'v', 'b', index: 0, input: 'avbavabc', groups: undefined ]
reg.exec(s); //null

String instance methods

match()

Searches a string for matches and returns an array, or null if nothing matches. If the regular expression does not have the g modifier, the returned array has index and input properties.

var reg = /ac/;
var s = "acbacvabc";
var s1 = "aabaavabc";

s.match(reg); //[ 'ac', index: 0, input: 'acbacvabc', groups: undefined ]
s1.match(reg); //null

When the regular expression has the g modifier, this method returns an array of all successful matches at once, without the index and input properties.

var reg = /ac/g;
var s = "acbacvabc";

s.match(reg); //[ 'ac', 'ac' ]

Note: Setting the lastIndex property of a regular expression does not apply to the match method, which always matches from the beginning of the string.
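The note can be checked with a short sketch. The names `pattern` and `subject` are made up for this example, and the behavior shown is for a regular expression with the g modifier.

```javascript
const pattern = /ac/g;
const subject = "acbacvabc";

// With test or exec, starting at index 5 would skip both matches.
pattern.lastIndex = 5;

// match ignores that value and scans from the beginning of the string.
const found = subject.match(pattern);
console.log(found); // [ 'ac', 'ac' ]
console.log(pattern.lastIndex); // 0
```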

matchAll()

Searches a string for all matches and returns an iterator. Note that the regular expression passed to matchAll must have the g modifier; otherwise a TypeError is thrown.

var reg = /a/g;
var s = "acbacvabc";

var arr = [...s.matchAll(reg)];
console.log(arr);
// Output:
// [
//   [ 'a', index: 0, input: 'acbacvabc', groups: undefined ],
//   [ 'a', index: 3, input: 'acbacvabc', groups: undefined ],
//   [ 'a', index: 6, input: 'acbacvabc', groups: undefined ]
// ]
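The requirement for the g modifier can be seen in a small sketch. The names `input`, `indexes`, and `errorName` are assumptions for this example.

```javascript
const input = "acbacvabc";

// With g: spread the iterator to collect the position of every match.
const indexes = [...input.matchAll(/a/g)].map((match) => match.index);
console.log(indexes); // [ 0, 3, 6 ]

// Without g: matchAll throws a TypeError.
let errorName = null;
try {
  input.matchAll(/a/);
} catch (err) {
  errorName = err.name;
}
console.log(errorName); // TypeError
```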

search()

Performs a lookup of a matching character in a string, returning the position of the first matching character, or -1 if not matched

var reg = /en/g;
var reg1 = /yo/g;
var s = "yuwenbo";

s.search(reg); // 3
s.search(reg1); // -1

replace()

Performs a look-up match in a string and replaces the matched substring with a replacement string. The two parameters are the regular expression and the content to be replaced.

If the regular expression does not have the g modifier, only the first match is replaced. With the g modifier, all matches are replaced.

var s = "i love you";
console.log(s.replace(/\s/, "❤")); // i❤love you
console.log(s.replace(/\s/g, "❤")); // i❤love❤you

The second argument of replace can use $ sequences to refer to parts of the match:

  • $&: the matched substring
  • $`: the text before the match
  • $': the text after the match
  • $n: the content of the nth group that matched, where n is a natural number starting from 1
  • $$: a literal dollar sign $
console.log("he llo".replace(/(\w+)\s(\w+)/, "$2 $1")); // llo he
console.log("hello".replace(/e/, "-$`-$&-$'-")); // h-h-e-llo-llo

The second argument of replace can also be a function, in which case each match is replaced with the return value of the function.

The function can take multiple arguments. The first is the matched text, followed by the group matches (there can be several), then the position of the match in the string as the second-to-last argument, and the original string as the last argument.

console.log(
  "hello".replace(/e/, function (match, index, str) {
    console.log(match, index, str); // e 1 hello
    return "❤";
  })
); // h❤llo
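When the pattern contains groups, the group matches are passed between the matched text and the position. The sketch below is one possible use, converting snake_case to camelCase; the input string and parameter names are made up for illustration.

```javascript
// Arguments: full match, group 1, position of the match, original string.
const camel = "regular_expression_object".replace(
  /_(\w)/g,
  function (match, letter, index, str) {
    // Drop the underscore and upper-case the letter that followed it.
    return letter.toUpperCase();
  }
);

console.log(camel); // regularExpressionObject
```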

split()

Splits a string using a regular expression or a fixed string and stores the substrings in an array. This method takes two parameters. The first is the regular expression that defines the split rule, and the second sets the maximum number of members in the returned array.

var str = "ni hao ya.hei hei hei";
str.split(/ |\./, 5); // [ 'ni', 'hao', 'ya', 'hei', 'hei' ]

Conclusion:

To check whether a string matches, use the test or search methods. To get more information about the match, use the exec or match methods.

Modifiers (identifiers)

Modifiers represent additional rules and are placed at the end of the regular pattern. You can use them individually or together.

// A single modifier
"abAbab".match(/a/g); //["a","a"]

// Use multiple modifiers together
"abAbab".match(/a/gi); //["a", "A", "a"]

g modifier

Global search. By default, matching stops after the first match; with the g modifier, the search continues through the rest of the string.

i modifier

Ignore case. By default, matching is case-sensitive; with the i modifier, case is ignored.

m modifier

Multi-line mode. By default, ^ and $ match only at the beginning and end of the string. With the m modifier, ^ and $ also match at the beginning and end of each line; that is, ^ and $ recognize the newline character \n.

For example:

  • /yuwen$/m.test('hi yuwen\n') // true
  • /yuwen$/.test('hi yuwen\n') // false
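The same effect applies to ^. A minimal sketch, with a made-up three-line string, that collects the first word of each line:

```javascript
const lines = "first line\nsecond line\nthird line";

// Without m, ^ matches only at the start of the whole string.
const withoutM = lines.match(/^\w+/g);
console.log(withoutM); // [ 'first' ]

// With m, ^ also matches right after each \n.
const withM = lines.match(/^\w+/gm);
console.log(withM); // [ 'first', 'second', 'third' ]
```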

s modifier

Allows . to match newline characters as well (dotAll mode).
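A short sketch of the difference, assuming an engine that supports the s modifier (ES2018 or later); the variable names are only for illustration.

```javascript
// By default, . does not match a newline.
const dotDefault = /a.b/.test("a\nb");
console.log(dotDefault); // false

// With the s modifier, . matches the newline too.
const dotAll = /a.b/s.test("a\nb");
console.log(dotAll); // true

// The read-only dotAll property reports whether s is set.
console.log(/a.b/s.dotAll); // true
```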

u modifier

Unicode mode: the pattern is treated as a sequence of Unicode code points.
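One visible effect concerns characters outside the Basic Multilingual Plane, which a JavaScript string stores as two UTF-16 code units. A minimal sketch, using U+1F600 as an arbitrary example character:

```javascript
const emoji = "\u{1F600}"; // one character, two UTF-16 code units
console.log(emoji.length); // 2

// Without u, . matches a single code unit, so the anchored pattern fails.
const withoutU = /^.$/.test(emoji);
console.log(withoutU); // false

// With u, . matches the whole code point.
const withU = /^.$/u.test(emoji);
console.log(withU); // true
```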

y modifier

Sticky search: a match is attempted only at the current position (lastIndex) in the target string.
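A sticky match does not scan ahead for a later occurrence. The sketch below records what happens at three positions; the names `sticky`, `target`, and `results` are assumptions for this example.

```javascript
const sticky = /av/y;
const target = "avbav";
const results = [];

results.push(sticky.test(target)); // true: "av" starts at index 0
results.push(sticky.lastIndex); // 2
results.push(sticky.test(target)); // false: index 2 holds "b"
results.push(sticky.lastIndex); // 0: reset after the failed match

sticky.lastIndex = 3;
results.push(sticky.test(target)); // true: "av" starts exactly at index 3

console.log(results); // [ true, 2, false, 0, true ]
```

With the g modifier instead of y, the second call would have found the later "av" at index 3.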

Special characters

\ character

Escape character. To match a special character itself in a regular expression, put a backslash \ in front of it. The characters that need a backslash escape are: ^ . [ $ ( ) | * + ? { \
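When a pattern is built from arbitrary text, the escaping can be done programmatically. The helper below, `escapeRegExp`, is a hypothetical function written for this example and not a built-in; it is a sketch that prefixes each special character with a backslash.

```javascript
// Hypothetical helper: escape every character that has a special
// meaning in a pattern so that the text is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const price = "$9.99";
const literalPrice = new RegExp(escapeRegExp(price));

console.log(escapeRegExp(price)); // \$9\.99
console.log(literalPrice.test("total: $9.99")); // true
console.log(literalPrice.test("total: $9x99")); // false
```

Without the escaping, the . in the price would match any character and the $ would act as an end anchor.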

^ character

Matches the start position. If the multi-line flag is set, it also matches the position right after a newline character.

For example: /^A/ will match A in “Ant”, but not A in “ntA”

$ character

Matches the end position. If the multi-line flag is set, it also matches the position right before a newline character.

For example, /A$/ will match A in “ntA”, but not A in “Ant”

* character

Matches the preceding expression 0 or more times. Equivalent to {0,}.

For example: /yueno*/ will match both yuenooo and yuen in “yuenoooyuen”

+ character

Matches the preceding expression 1 or more times. Equivalent to {1,}.

For example: /yueno+/ will only match yuenooo in “yuenoooyuen”

? character

Matches the preceding expression 0 or 1 times. Equivalent to {0,1}.

  • For example, /yueno?/ first matches yueno in "yuenoooyuen"
  • Note: if ? immediately follows any of the quantifiers *, +, ?, or {}, it makes that quantifier non-greedy (matching as few characters as possible)
  • For example, /yueno??/ matches only yuen in "yuenoooyuen"

. character

By default, matches any single character other than a newline.

  • For example, /.y/ matches oy in "yuenoooyuen"
  • For example, /..y/ matches ooy in "yuenoooyuen"

(x) character

Capturing parentheses. Parentheses in a regular expression form a group, and the pattern inside the parentheses is matched as a whole. The content matched by a group can be referenced inside the pattern with \n, and in a replacement string with the $1, $2 syntax.

  • For example, /(wenbo)+/.test('wenbowenbo') is true, matching wenbo one or more times as a whole
  • For example, "wenbo,zhijian".replace(/(wenbo),(zhijian)/, '$2,$1')
  • Output: zhijian,wenbo

(?:x) character

Non-capturing parentheses. Matches x but does not remember the match. They let you define a subexpression for regular expression operators to work on; the group is matched, but it cannot be referenced with \n or $n.
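The difference between remembering and not remembering a match shows up in the result array. A minimal sketch comparing capturing and non-capturing parentheses, (?:x), on the same input; the variable names are only for illustration.

```javascript
const capturing = "wenbowenbo".match(/(wenbo)+/);
const nonCapturing = "wenbowenbo".match(/(?:wenbo)+/);

console.log(capturing.length); // 2: the full match plus one group
console.log(capturing[1]); // wenbo

console.log(nonCapturing.length); // 1: the full match only
console.log(nonCapturing[0]); // wenbowenbo
```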

x(?=y) character

Lookahead assertion. Matches x only if x is followed by y.

  • For example, 'wenbo'.match(/wen(?=bo)/)
  • Output: [ 'wen', index: 0, input: 'wenbo', groups: undefined ]
  • For example, 'wenyu'.match(/wen(?=bo)/)
  • Output: null

(?<=y)x character

Lookbehind assertion. Matches x only if x is preceded by y.

  • For example, 'wenbo'.match(/(?<=wen)bo/)
  • Output: [ 'bo', index: 3, input: 'wenbo', groups: undefined ]
  • For example, 'yubo'.match(/(?<=wen)bo/)
  • Output: null

x(?!y) character

Negative lookahead assertion. Matches x only if x is not followed by y.

(?<!y)x character

Negative lookbehind assertion. Matches x only if x is not preceded by y.
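The two negative assertions, x(?!y) and (?<!y)x, can be illustrated with the same strings used for the positive ones. This is a sketch that assumes an engine with lookbehind support (ES2018 or later); the variable names are made up.

```javascript
// Negative lookahead: "wen" that is not followed by "bo".
const aheadHit = "wenyu".match(/wen(?!bo)/);
const aheadMiss = "wenbo".match(/wen(?!bo)/);
console.log(aheadHit[0]); // wen
console.log(aheadMiss); // null

// Negative lookbehind: "bo" that is not preceded by "wen".
const behindHit = "yubo".match(/(?<!wen)bo/);
const behindMiss = "wenbo".match(/(?<!wen)bo/);
console.log(behindHit[0]); // bo
console.log(behindMiss); // null
```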

x|y character

Matches x or y. Several alternatives can be used together.

  • For example, 'wenyu'.match(/w|e|n/g)
  • Output: [ 'w', 'e', 'n' ]

{n} character

Matches the preceding character exactly n times, where n is a non-negative integer.

  • For example, 'hello'.match(/l{2}/g)
  • Output: [ 'll' ]

{n,} character

Matches the preceding character at least n times, where n is a non-negative integer.

{n,m} character

Matches the preceding character at least n times and at most m times, where n and m are non-negative integers and n is not greater than m.
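The three counted quantifiers can be compared side by side. In this sketch the sample string `digits` is made up, and \b is used so that each run of digits is matched as a whole.

```javascript
const digits = "1 12 123 1234 12345";

// Exactly 3 digits.
const exactly3 = digits.match(/\b\d{3}\b/g);
console.log(exactly3); // [ '123' ]

// At least 3 digits.
const atLeast3 = digits.match(/\b\d{3,}\b/g);
console.log(atLeast3); // [ '123', '1234', '12345' ]

// Between 2 and 4 digits.
const from2to4 = digits.match(/\b\d{2,4}\b/g);
console.log(from2to4); // [ '12', '123', '1234' ]
```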

[xyz] character

Character set. Matches any one of the characters in the square brackets, including escape sequences. A range of characters can be specified with a hyphen (-), for example [a-zA-Z1-9].

  • For example, 'hello 123'.match(/[a-h1-2]/g)
  • Output: [ 'h', 'e', '1', '2' ]

[^xyz] character

Negated character set. Matches any character that is not in the square brackets.

  • For example, 'hello 123'.match(/[^a-h1-2]/g)
  • Output: [ 'l', 'l', 'o', ' ', '3' ]

[\b] character

Matches a backspace (U+0008). Do not confuse it with \b.

\b character

Matches a word boundary

Such as:

  • /\bworld/.test('hello world') // true
  • /\bworld/.test('hello-world') // true
  • /\bworld/.test('helloworld') // false

\B character

Matches a non-word boundary

Such as:

  • /\Bworld/.test('hello world') // false
  • /\Bworld/.test('hello-world') // false
  • /\Bworld/.test('helloworld') // true

\cX character

Matches a control character in the string, where X is a letter from A to Z.

\d character

Matches a digit. Equivalent to [0-9].

\D character

Matches a non-digit character. Equivalent to [^0-9].

\f character

Matches a form feed character (U+000C)

\n character

Matches a newline character (U+000A)

\r character

Matches a carriage return (U+000D)

\s character

Matches a whitespace character, including space, tab, form feed, and line feed. Equivalent to:

[\f\n\r\t\v\u0020\u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff]

\S character

Matches a non-whitespace character

\t character

Matches a horizontal tab character

\v character

Matches a vertical tab character

\w character

Matches a word character (letter, digit, or underscore). Equivalent to [A-Za-z0-9_].

\W character

Matches a non-word character. Equivalent to [^A-Za-z0-9_].
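The shorthand classes \d, \w, \W, and \s can be compared on one input. A minimal sketch with a made-up sample string:

```javascript
const sample = "id_42: ok!";

const digitChars = sample.match(/\d/g);
console.log(digitChars); // [ '4', '2' ]

const words = sample.match(/\w+/g);
console.log(words); // [ 'id_42', 'ok' ]

const nonWordChars = sample.match(/\W/g);
console.log(nonWordChars); // [ ':', ' ', '!' ]

const spaces = sample.match(/\s/g);
console.log(spaces); // [ ' ' ]
```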

\n character (n is a positive integer)

A back reference to the substring matched by the nth capturing group, where groups are numbered by counting their opening parentheses.
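A back reference such as \1 or \2 matches the same text that the numbered group matched earlier in the pattern. A short sketch; the patterns and test strings are chosen only for illustration.

```javascript
// \1 refers back to whatever the first group captured,
// so this pattern finds a doubled word character.
const doubled = /(\w)\1/;
const helloResult = doubled.exec("hello");
console.log(helloResult[0]); // ll
console.log(doubled.test("world")); // false

// With two groups, \2 refers to the second one and \1 to the first.
const mirrored = /(a)(b)\2\1/;
console.log(mirrored.test("abba")); // true
console.log(mirrored.test("abab")); // false
```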

\0 character

Matches the NULL character (U+0000)

\xhh character

Matches the character with the two-digit hexadecimal code hh (\x00-\xFF)

\uhhhh character

Matches a UTF-16 code unit represented by a four-digit hexadecimal number

\u{hhhh} or \u{hhhhh} character

Matches the character with the given hexadecimal Unicode code point (only when the u flag is set)
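The dependence on the u flag is easy to overlook, because the same pattern is still valid without it but means something else. A minimal sketch, using U+0061 (the letter a) and U+1F600 as arbitrary examples:

```javascript
// With u, \u{61} is the code point U+0061, the letter "a".
const withFlag = /\u{61}/u.test("a");
console.log(withFlag); // true

// Without u, \u is just the letter "u" and {61} is a quantifier,
// so the pattern means "u repeated 61 times".
const withoutFlag = /\u{61}/.test("a");
const repeatedU = /\u{61}/.test("u".repeat(61));
console.log(withoutFlag); // false
console.log(repeatedU); // true

// Code points above U+FFFF work the same way with u.
const astral = /^\u{1F600}$/u.test("\u{1F600}");
console.log(astral); // true
```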