Regular expression

Basic syntax

Characters with no special meaning

/ace/ // matches ace
/123/ // matches 123
/-_-/ // matches -_-
/Yu/ // matches Yu

Escape characters

Character sets

[abc] // matches a, b, or c
[^abc] // matches any character other than a, b, or c
[0-9] // shorthand for [0123456789]
[a-z] // matches any character from a to z

The correspondence between some of the escape sequences in the figure above and character sets

. = [^\n] // matches any single character except a newline (\n); in JavaScript, . also excludes \r, \u2028, and \u2029
\w = [0-9a-zA-Z_]
\W = [^0-9a-zA-Z_]
\s = [ \t\n\v] // approximately; \s also covers \f, \r, and other Unicode whitespace
\S = [^ \t\n\v] // approximately; the complement of \s
\d = [0-9]
\D = [^0-9]
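
The shorthand classes can be checked against their explicit sets directly. The following is a minimal sketch, not part of the original article, that compares each pair on every ASCII character; the variable names are made up for the illustration.

```javascript
// Each pair: a shorthand class and the explicit set it is compared against.
const pairs = [
  [/^\d$/, /^[0-9]$/],
  [/^\D$/, /^[^0-9]$/],
  [/^\w$/, /^[0-9a-zA-Z_]$/],
  [/^\W$/, /^[^0-9a-zA-Z_]$/],
];

// Compare both forms on every ASCII character (code points 0 to 127).
const mismatches = [];
for (let code = 0; code < 128; code++) {
  const ch = String.fromCharCode(code);
  for (const [shorthand, explicit] of pairs) {
    if (shorthand.test(ch) !== explicit.test(ch)) {
      mismatches.push([shorthand.source, code]);
    }
  }
}

console.log(mismatches.length); // 0
```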

Quantifiers

{n} matches exactly n times; for example, a{2} matches aa

{m,n} matches between m and n times, preferring n; for example, a{1,3} can match aaa, aa, or a

{m,} matches m or more times, preferring as many as possible; for example, a{1,} can match a, aa, aaa, …

? matches 0 or 1 times, preferring 1; equivalent to {0,1}

+ matches 1 or more times, preferring as many as possible; equivalent to {1,}

* matches 0 or more times, preferring as many as possible; equivalent to {0,}

Regular expressions are greedy by default: for any quantifier that represents a range, the engine prefers the upper limit over the lower limit

a{1,3} // against the string 'aaa', matches aaa rather than a

Non-greedy mode

This is one use of ?: placed directly after a quantifier, it makes that quantifier non-greedy
a{1,3}? // against the string 'aaa', matches a rather than aaa
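
The difference is easy to observe with String#match. This is a small illustrative sketch; the sample strings are assumptions chosen for the demonstration, not taken from the article.

```javascript
const text = 'aaa';

// Greedy: the quantifier takes as many characters as it can.
const greedy = text.match(/a{1,3}/)[0];
// Non-greedy: the trailing ? makes it take as few as it can.
const lazy = text.match(/a{1,3}?/)[0];

console.log(greedy); // 'aaa'
console.log(lazy);   // 'a'

// A more typical case: stopping at the first closing quote.
const html = '<a href="x" title="y">';
const greedyQuote = html.match(/".*"/)[0];
const lazyQuote = html.match(/".*?"/)[0];

console.log(greedyQuote); // '"x" title="y"'
console.log(lazyQuote);   // '"x"'
```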

Character boundaries

Sometimes a match has to respect a boundary, such as starting with xxx or ending with xxx

Outside [], ^ matches the start of the input

^abc // can match abc, but not aabc

$ matches the end of the input

abc$ // can match abc, but not abcc

\b matches a word boundary

abc\b // can match abc, but not abcc
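
The three anchors can be tried out with RegExp#test. The sketch below is only an illustration; the `results` object and the sample sentence are assumptions made for the example.

```javascript
const results = {
  startHit: /^abc/.test('abc'),    // true
  startMiss: /^abc/.test('aabc'),  // false: abc is not at the start
  endHit: /abc$/.test('abc'),      // true
  endMiss: /abc$/.test('abcc'),    // false: abc is not at the end
  wordHit: /abc\b/.test('abc def'), // true: a space follows abc
  wordMiss: /abc\b/.test('abcc'),   // false: a word character follows abc
};
console.log(results);

// \b on both sides matches whole words only, so "concat" is skipped.
const wholeWords = 'cat concat cat'.match(/\bcat\b/g);
console.log(wholeWords.length); // 2
```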

Alternation

Regular expressions use | to express alternatives ("or")

a|b // matches a or b
123|456|789 // matches 123 or 456 or 789
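
One detail worth checking is that | has the lowest precedence, so anchors bind to a single alternative unless parentheses are used. A short sketch with made-up patterns:

```javascript
// Without parentheses: "starts with abc" OR "ends with def".
const loose = /^abc|def$/;
// With parentheses: the whole string is abc or def.
const strict = /^(abc|def)$/;

const looseHit = loose.test('abcxyz');   // true: it starts with abc
const strictMiss = strict.test('abcxyz'); // false: extra characters follow
const strictHit = strict.test('def');     // true

console.log(looseHit, strictMiss, strictHit); // true false true
```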

Grouping and references

(abc){2} // matches abcabc
(123|456){2} // matches 123123, 456456, 123456, 456123

Capturing and non-capturing groups

Groups capture by default. Adding ?: right after the opening ( makes a group non-capturing, which can improve performance and simplify logic

This is another use of ?
'123'.match(/(?:123)/) // returns ['123']
'123'.match(/(123)/) // returns ['123', '123']

References

For example, when matching an HTML tag, you usually want the name in the closing tag to be the same as the name in the opening tag

The syntax for a reference is \number, where the number is the index of the earlier capturing group being referenced. Note that non-capturing groups cannot be referenced

<([a-z]+)><\/\1> // can match `<span></span>` or `<div></div>`, etc.
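
A quick way to see the reference at work is to test a matching and a mismatched tag pair. This is a sketch for illustration; the named-group variant (`(?<tag>...)` with `\k<tag>`) is standard ES2018 syntax that the article itself does not cover.

```javascript
// \1 must repeat whatever group 1 captured.
const pairedTag = /<([a-z]+)><\/\1>/;

const sameName = pairedTag.test('<span></span>'); // true
const otherName = pairedTag.test('<span></div>');  // false

// The same idea with a named group instead of a number.
const namedTag = /<(?<tag>[a-z]+)><\/\k<tag>>/;
const tagName = '<div></div>'.match(namedTag).groups.tag;

console.log(sameName, otherName, tagName); // true false 'div'
```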

Lookahead and lookbehind

This is another use of ?
// positive lookahead ?=
exp1(?=exp2) // matches exp1 that is followed by exp2
// negative lookahead ?!
exp1(?!exp2) // matches exp1 that is not followed by exp2
// positive lookbehind ?<=
(?<=exp2)exp1 // matches exp1 that is preceded by exp2
// negative lookbehind ?<!
(?<!exp2)exp1 // matches exp1 that is not preceded by exp2
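
The four forms can be compared side by side. The sketch below uses made-up sample strings and assumes an engine with lookbehind support (ES2018 or later); in every case the lookaround text is checked but is not part of the match.

```javascript
const css = 'width: 100px; opacity: 50%';

// Positive lookahead: digits followed by px.
const pixels = css.match(/\d+(?=px)/)[0];
// Negative lookahead: digits not followed by px (or by another digit,
// which stops the engine from settling for a shorter run such as "10").
const percent = css.match(/\d+(?!px|\d)/)[0];

const order = '$42 and 17';

// Positive lookbehind: digits preceded by a dollar sign.
const price = order.match(/(?<=\$)\d+/)[0];
// Negative lookbehind: digits not preceded by a dollar sign or a digit.
const count = order.match(/(?<![$\d])\d+/)[0];

console.log(pixels, percent, price, count); // '100' '50' '42' '17'
```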

Example:

"cream chocolate".replace(/(?<=cream )chocolate/, "cake") // "cream cake"
"hazelnut chocolate".replace(/(?<=cream )chocolate/, "cake") // "hazelnut chocolate", no match

About ?=, ?!, ?<=, ?<! and ?: in regular expressions

() is a capturing group; it saves the text matched by each group, which can then be used as $n (n is a number, and $n is the content of the nth capturing group)
(?:) is a non-capturing group; the only difference is that the text matched by a non-capturing group is not saved

Example:

// number formatting: 1,234,567,890
"1234567890".replace(/\B(?=(?:\d{3})+$)/g, ",") // finds every \B (non-word-boundary) position that is followed by a multiple of three digits up to the end, and inserts "," there
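
Wrapped in a function, the same pattern can be tried on inputs of different lengths. This is a minimal sketch; `formatThousands` is a hypothetical helper name, and it assumes the input is a string consisting only of digits.

```javascript
// Hypothetical helper: inserts "," at every position that is followed by
// a multiple of three digits up to the end of the string.
function formatThousands(digits) {
  return digits.replace(/\B(?=(?:\d{3})+$)/g, ',');
}

console.log(formatThousands('1234567890')); // '1,234,567,890'
console.log(formatThousands('1234'));       // '1,234'
console.log(formatThousands('123'));        // '123'
```

The g flag matters here: without it only the first position would be replaced.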

Modifiers

By default, regular expressions are case sensitive

/xxx/gi // the trailing g and i are two modifiers

g: by default a regular expression stops at the first match; the global modifier g makes it continue matching through the end of the input

i: regular expressions are case sensitive by default; i makes matching ignore case

m: by default ^ and $ match only at the start and end of the whole input; m makes them also match at the start and end of each line, so multi-line text can be matched line by line
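
The effect of each modifier shows up clearly with String#match. The sample strings below are assumptions for the sake of the example.

```javascript
const sample = 'aAbA';

const firstOnly = sample.match(/a/);   // one match, with index and input
const allLower = sample.match(/a/g);   // ['a']
const anyCase = sample.match(/a/gi);   // ['a', 'A', 'A']

const lines = 'one\ntwo\nthree';

// Without m, ^ and $ only match at the very start and end of the input.
const wholeInput = lines.match(/^\w+$/g);  // null
// With m, ^ and $ also match at the start and end of every line.
const perLine = lines.match(/^\w+$/gm);    // ['one', 'two', 'three']

console.log(firstOnly.index, allLower, anyCase, wholeInput, perLine);
```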

Example

/^[a-z]*[^\d]{1,}?(aaa|bbb)(?:ccc)$/
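
To see how the pieces of this pattern interact, it can be run against a few inputs. The test strings below are made up for illustration.

```javascript
const example = /^[a-z]*[^\d]{1,}?(aaa|bbb)(?:ccc)$/;

// Only (aaa|bbb) captures; (?:ccc) groups without capturing.
const hit = 'xyz-aaaccc'.match(example);
console.log(hit[0]);     // 'xyz-aaaccc'
console.log(hit[1]);     // 'aaa'
console.log(hit.length); // 2

// [^\d]{1,}? needs at least one non-digit before aaa or bbb.
const tooShort = example.test('aaaccc');     // false
const minimal = example.test('xbbbccc');     // true
const hasDigit = example.test('x1bbbccc');   // false

console.log(tooShort, minimal, hasDigit); // false true false
```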

RegExp graphical presentation tool

Using regular expressions

Creating a regular expression

There are two ways: a literal and new

var reg = /abc/g // literal form
var reg = new RegExp('abc', 'g') // new form
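
The constructor form is mainly useful when the pattern is only known at run time. The sketch below is an assumption-laden illustration: `escapeRegExp` and `countOccurrences` are hypothetical helpers, not built-in functions. Escaping matters because characters such as . would otherwise keep their special meaning.

```javascript
// Hypothetical helper: escapes characters that are special in a pattern.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: counts literal occurrences of needle in haystack.
function countOccurrences(haystack, needle) {
  const pattern = new RegExp(escapeRegExp(needle), 'g');
  const found = haystack.match(pattern);
  return found === null ? 0 : found.length;
}

console.log(countOccurrences('a.b.c', '.'));    // 2
console.log(countOccurrences('abcabc', 'abc')); // 2

// In a string, the backslash itself has to be escaped.
const digits = new RegExp('\\d+');
console.log(digits.source); // '\d+'
```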

API

RegExp#test is shorthand for RegExp.prototype.test

RegExp#test

RegExp#exec

String#search

String#match

String#split

String#replace

RegExp#test

Every regular expression instance has a test method, which takes a string and returns a Boolean indicating whether the regular expression matches that string

/abc/.test('abc') // true
/abc/.test('abd') // false
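
One behavior of test that is easy to miss: with the g flag, the instance remembers where the last match ended (its lastIndex), so repeated calls on the same string can alternate between true and false. A small sketch:

```javascript
const plain = /abc/;
const plainResults = [plain.test('abc'), plain.test('abc')];
console.log(plainResults); // [true, true]

const withGlobal = /abc/g;
const globalResults = [
  withGlobal.test('abc'), // true, lastIndex is now 3
  withGlobal.test('abc'), // false, the search started at index 3; lastIndex resets to 0
  withGlobal.test('abc'), // true again
];
console.log(globalResults); // [true, false, true]
```

For a plain yes-or-no check, a regular expression without the g flag avoids this statefulness.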

RegExp#exec

exec is called the same way as test, but instead of a Boolean it returns the match result

A successful match returns an array: the first item is the matched text, followed by the captured groups; the index property is the position of the match in the input string, and the input property is the input string itself

/abc(d)/.exec('abcd') // ["abcd", "d", index: 0, input: "abcd"]

If the regular expression has the global flag (g), each subsequent call continues from where the previous match ended

var r1 = /ab/
r1.exec('ababab') // ['ab', index: 0]
r1.exec('ababab') // ['ab', index: 0]

var r2 = /ab/g
r2.exec('ababab') // ['ab', index: 0]
r2.exec('ababab') // ['ab', index: 2]
r2.exec('ababab') // ['ab', index: 4]

exec returns null if the match fails

/abc(d)/.exec('abc') // null

This behavior can be used to match in a loop, for example to count the occurrences of abc in a string

var reg = /abc/g
var str = 'abcabcabcabcabc'
var num = 0
var match = null
// exec returns null once no further match is found
while ((match = reg.exec(str)) !== null) {
    num++
}
console.log(num) // 5
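
The same loop can also collect where each match starts. As an alternative that the article does not mention, String#matchAll (ES2020) yields the same match objects without a manual loop; the sketch below shows both, with variable names chosen for the example.

```javascript
const source = 'abcabcabc';

// exec loop: each result carries the index of the match.
const pattern = /abc/g;
const viaExec = [];
let found;
while ((found = pattern.exec(source)) !== null) {
  viaExec.push(found.index);
}

// matchAll requires the g flag and returns an iterator of match objects.
const viaMatchAll = Array.from(source.matchAll(/abc/g), (m) => m.index);

console.log(viaExec);     // [0, 3, 6]
console.log(viaMatchAll); // [0, 3, 6]
```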

String#match

The match method also returns a match result, much like exec

'abc'.match(/abc/) // ['abc', index: 0, input: 'abc']
'abc'.match(/abd/) // null

If the regular expression has the global flag (g), match returns all matches, without the index and input properties

'abcabcabc'.match(/abc/g) // ['abc', 'abc', 'abc']
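
The g flag also changes what happens to capturing groups: without it, match reports the groups of the first match; with it, only the full matches are returned. A sketch with a made-up date string:

```javascript
const text = '2024-01 and 2025-02';

// Without g: first match only, including its captured groups.
const first = text.match(/(\d{4})-(\d{2})/);
console.log(first[0], first[1], first[2], first.index); // '2024-01' '2024' '01' 0

// With g: every full match, but no groups and no index.
const all = text.match(/(\d{4})-(\d{2})/g);
console.log(all); // ['2024-01', '2025-02']
```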

String#search

The search method takes a string or a regular expression and returns the index of the first match

'abc'.search(/abc/) // 0
'abc'.search(/c/) // 2

Returns -1 if the match fails

'abc'.search(/d/) // -1

String#split

The split method splits a string on the specified separator and returns an array

'a,b,c'.split(',') // ['a', 'b', 'c']

The argument can also be a regular expression, which is needed when there is more than one delimiter

'a,b.c'.split(/,|\./) // ['a', 'b', 'c']
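
Two further variations, sketched with made-up inputs: a character set with + treats any run of delimiters as one separator, and a capturing group in the pattern keeps the delimiters in the result.

```javascript
// Any run of commas, semicolons, or whitespace counts as one separator.
const words = 'a, b;c  d'.split(/[,;\s]+/);
console.log(words); // ['a', 'b', 'c', 'd']

// With a capturing group, the matched separators are kept in the array.
const tokens = '1+2-3'.split(/([+-])/);
console.log(tokens); // ['1', '+', '2', '-', '3']
```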

String#replace

The replace method replaces the matched part of a string with another specified string

'abc'.replace('a', 'b') // 'bbc'

The first argument can be a regular expression; for a global replacement, add the g flag

'abc'.replace(/[abc]/, 'y') // 'ybc'
'abc'.replace(/[abc]/g, 'y') // 'yyy', global replacement

In the second argument, you can also reference parts of the match

'abc'.replace(/a/, '$&b') // 'abbc', $& references the matched text
'abc'.replace(/(a)b/, '$1a') // 'aac', $n references the nth capturing group
'abc'.replace(/b/, '$`') // 'aac', $` references the text before the match
'abc'.replace(/b/, "$'") // 'acc', $' references the text after the match

The second argument to replace can also be a function; its first parameter is the matched text, and the following parameters are the captured groups

'abc'.replace(/\w/g, function (match, $1, $2) {
    return match + '-'
})
// a-b-c-
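
A function replacer becomes more useful once it reads the captured groups. The sketch below is illustrative only; `toCamelCase` is a hypothetical helper name. It also shows that, after the groups, the function receives the offset of the match.

```javascript
// Hypothetical helper: turns "background-color" into "backgroundColor".
function toCamelCase(name) {
  return name.replace(/-([a-z])/g, function (match, letter) {
    // letter is the content of the first capturing group
    return letter.toUpperCase();
  });
}

console.log(toCamelCase('background-color'));        // 'backgroundColor'
console.log(toCamelCase('border-top-left-radius'));  // 'borderTopLeftRadius'

// With no capturing groups, the second parameter is the match offset.
const offsets = [];
'abcabc'.replace(/b/g, function (match, offset) {
  offsets.push(offset);
  return match;
});
console.log(offsets); // [1, 4]
```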

Example: removing whitespace from both ends of a string

str = str.replace(/^\s*|\s*$/g, '')

RegExp

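RegExp is a global constructor function that can be used to create regular expressions dynamically. It also has some static properties of its own; most of them are legacy features, so new code should prefer the values returned by exec or match

  • $_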
  • $n
  • input
  • length
  • lastMatch
/a(b)/.exec('abc') // ["ab", "b", index: 0, input: "abc"]
RegExp.$_ // 'abc', the input string of the last match
RegExp.$1 // 'b', the first capturing group of the last match
RegExp.input // 'abc', the input string of the last match (same as $_)
RegExp.lastMatch // 'ab', the text of the last match
RegExp.length // 2, the number of parameters of the RegExp constructor (not related to the last match)

Instance properties

Properties of a regular expression instance

  • flags
  • ignoreCase
  • global
  • multiline
  • source
  • lastIndex: the index just after the last match, where the next match will start

Let’s look at examples

var r = /abc/igm
r.flags // 'gim' (flags are always reported in a fixed order)
r.ignoreCase // true
r.global // true
r.multiline // true
r.source // 'abc'
r.exec('abcabcabc') // ["abc", index: 0]
r.lastIndex // 3
r.exec('abcabcabc') // ["abc", index: 3]
r.lastIndex // 6
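
lastIndex is writable as well as readable, so a global search can be started from a chosen position. A small sketch; the starting offset 4 is an arbitrary choice for the example.

```javascript
const r = /abc/g;
const input = 'abcabcabc';

// Start searching at index 4; the next abc begins at index 6.
r.lastIndex = 4;
const found = r.exec(input);
const afterMatch = r.lastIndex;
console.log(found.index, afterMatch); // 6 9

// No match is left after index 9, so exec returns null and lastIndex resets.
const exhausted = r.exec(input);
const afterFailure = r.lastIndex;
console.log(exhausted, afterFailure); // null 0

const flags = /abc/igm.flags;
console.log(flags); // 'gim'
```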

Examples

/(?:0\d{2,3}-)?\d{7}/ // landline phone number, such as 010-xxxxxxx
/^1[378]\d{9}$/ // mobile phone number starting with 13, 17, or 18
/^[0-9a-zA-Z_]+@[0-9a-zA-Z]+\.[a-zA-Z]+$/ // email
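
These patterns can be exercised with RegExp#test. The sample inputs below are invented for the illustration, and the results also show the limits of the patterns: the landline pattern is not anchored, and the email pattern accepts only a single dot after the @.

```javascript
const landline = /(?:0\d{2,3}-)?\d{7}/;
const mobile = /^1[378]\d{9}$/;
const email = /^[0-9a-zA-Z_]+@[0-9a-zA-Z]+\.[a-zA-Z]+$/;

const checks = {
  landlineWithArea: landline.test('010-1234567'),   // true
  landlineInText: landline.test('id 12345678 end'), // true: no ^ or $ anchors
  mobileOk: mobile.test('13812345678'),             // true
  mobileBadPrefix: mobile.test('12812345678'),      // false: 2 is not in [378]
  emailOk: email.test('user_1@example.com'),        // true
  emailSubdomain: email.test('user@mail.example.com'), // false: two dots after @
};

console.log(checks);
```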