Familiarize yourself with regular expressions

A regular expression is an object that describes a pattern of characters. It can be written as a literal between a pair of slashes:

    const pattern = /s$/;

This simple regular expression matches any string that ends in "s".

We can also define the same pattern with the RegExp constructor:

    const pattern = new RegExp("s$");
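The two forms can be compared side by side. The following is a minimal sketch; the variable names and sample strings are made up for illustration. It also shows one practical difference: in a string passed to the constructor, backslashes have to be doubled.

```javascript
// Two equivalent ways to build the same pattern.
const literalPattern = /s$/;
const constructedPattern = new RegExp("s$");

// test() returns true when the pattern matches somewhere in the string.
const endsWithS = literalPattern.test("regular expressions");      // true
const noTrailingS = constructedPattern.test("regular expression"); // false

// Inside a string literal, a backslash must itself be escaped.
const digitsLiteral = /\d+/;
const digitsConstructed = new RegExp("\\d+");
const sameSource = digitsLiteral.source === digitsConstructed.source; // true

console.log(endsWithS, noTrailingS, sameSource);
```

The constructor form is mainly useful when the pattern is only known at run time, for example when it is assembled from user input.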

Characters

Letters and numbers in a regular expression match themselves literally. Regular expressions also support matching certain non-alphabetic characters through escape sequences. The following tables list them:

Special character    Regular expression    Mnemonic
Newline              \n                    new line
Form feed            \f                    form feed
Carriage return      \r                    return
Whitespace           \s                    space
Tab                  \t                    tab
Vertical tab         \v                    vertical tab
Backspace            [\b]                  backspace; the [] is needed to distinguish it from \b (word boundary)

Character set                                                   Regular expression    Mnemonic
Any character other than a newline                              .
A single digit, [0-9]                                           \d                    digit
Any character other than a digit, [^0-9]                        \D                    not digit
A word character, including the underscore, [A-Za-z0-9_]        \w                    word
A non-word character                                            \W                    not word
A whitespace character: space, tab, form feed, or line feed     \s                    space
A non-whitespace character                                      \S                    not space
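To make the tables above concrete, here is a small sketch that applies several of these escapes to one sample string. The string and the variable names are invented for illustration, and the g flag (find all matches) is used so that match() returns every hit.

```javascript
// A made-up sample string containing a tab and a trailing newline.
const sample = "Order_42 shipped\ton 2024-01-15\n";

// \d matches one digit at a time.
const digits = sample.match(/\d/g).join("");   // "4220240115"

// \w+ matches runs of word characters, [A-Za-z0-9_].
const wordRuns = sample.match(/\w+/g);
// ["Order_42", "shipped", "on", "2024", "01", "15"]

// \t matches the tab character inside the string.
const hasTab = /\t/.test(sample);              // true

// . matches any character except a newline.
const dotMatchesNewline = /./.test("\n");      // false

// [\b] matches a backspace character (U+0008), not a word boundary.
const matchesBackspace = /[\b]/.test("\u0008"); // true

console.log(digits, wordRuns, hasTab, dotMatchesNewline, matchesBackspace);
```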

In regular expressions, many symbols have special meanings. For example:

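    ^ $ . * + ? = ! : | \ / ( ) [ ] { }

To match one of these characters literally, prefix it with a backslash. A few of them in use:

  • […] matches any one character inside the square brackets
  • [^…] matches any one character that is not inside the square brackets
  • $ matches the end of the string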
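The bracket forms and the need to escape special characters can be sketched as follows. The patterns and inputs here are illustrative choices, not taken from the original text.

```javascript
// [aeiou] matches any one of the listed characters.
const hasVowel = /[aeiou]/.test("sky");            // false: no vowel in "sky"

// [^aeiou] matches any one character that is NOT listed.
const hasNonVowel = /[^aeiou]/.test("aeiou");      // false: every character is a vowel

// An unescaped dot matches any character; an escaped dot matches only ".".
const looseDot = /1.5/.test("1x5");                // true
const literalDot = /1\.5/.test("1x5");             // false

// $ and . both need a backslash to be matched literally.
const price = /\$\d+\.\d\d/;
const foundPrice = "Total: $19.99".match(price)[0]; // "$19.99"

console.log(hasVowel, hasNonVowel, looseDot, literalDot, foundPrice);
```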

Repetition

The following table lists the repetition syntax for regular expressions:

Character    Meaning
{m,n}        Matches the previous item at least m times and at most n times
{n,}         Matches the previous item at least n times
{n}          Matches the previous item exactly n times
?            Matches the previous item 0 or 1 times, equivalent to {0,1}
+            Matches the previous item 1 or more times, equivalent to {1,}
*            Matches the previous item 0 or more times, equivalent to {0,}
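Each row of the table can be checked with a short test. This is a sketch with made-up patterns; ^ and $ are added so that the whole string has to satisfy the quantifier rather than just a part of it.

```javascript
// {m,n}: between 2 and 4 digits.
const twoToFour = /^\d{2,4}$/;
const rangeResults = ["1", "123", "12345"].map((s) => twoToFour.test(s));
// [false, true, false]

// {n,}: at least 3 digits.
const atLeastThree = /^\d{3,}$/.test("123456");   // true

// {n}: exactly 3 digits.
const exactlyThree = /^\d{3}$/.test("1234");      // false

// ?: the "u" may appear zero or one time.
const optionalU = /^colou?r$/;
const spellings = ["color", "colour", "colouur"].map((s) => optionalU.test(s));
// [true, true, false]

// + needs at least one "a"; * also accepts the empty string.
const plusOnEmpty = /^a+$/.test("");              // false
const starOnEmpty = /^a*$/.test("");              // true

console.log(rangeResults, atLeastThree, exactlyThree, spellings, plusOnEmpty, starOnEmpty);
```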

Select, group, and reference

Regular expression syntax also includes special characters for specifying alternatives, grouping subexpressions, and referring back to a previous subexpression. The "|" character expresses an "or" relationship between alternatives, and parentheses () express grouping. Here are some examples:

    /ab|cd|ef/  // Matches "ab", "cd", or "ef"

Note that alternatives are tried from left to right until a match is found. If the left alternative matches, the alternatives to its right are ignored, even if one of them would produce a better match.

    /a|ab/ // Against the string "ab", only "a" is matched
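The left-to-right behaviour is easy to observe by swapping the order of the alternatives. A minimal sketch, with illustrative variable names:

```javascript
// The left alternative "a" succeeds first, so "ab" is never tried.
const leftWins = "ab".match(/a|ab/)[0];        // "a"

// Putting the longer alternative first changes the result.
const longerFirst = "ab".match(/ab|a/)[0];     // "ab"

// With three alternatives, whichever one matches at the earliest position is used.
const anyOfThree = "xxcdxx".match(/ab|cd|ef/)[0]; // "cd"

console.log(leftWins, longerFirst, anyOfThree);
```

When alternatives share a prefix, listing the longest one first is a common way to avoid the short match.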

Parentheses in a regular expression have several uses. One is to combine individual items into a single subexpression, so that a quantifier or "|" applies to the whole group:

    /java(script)?/ // Matches "java" or "javascript"

Another is to let a later part of the expression refer back to an earlier subexpression:

    /([Jj]ava([Ss]cript)?)\sis\s(fun\w*)/ // The nested subexpression ([Ss]cript) is referred to as \2
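Group numbers follow the order of the opening parentheses, which can be seen in the array that match() returns. This sketch uses a made-up input sentence:

```javascript
const groupedPattern = /([Jj]ava([Ss]cript)?)\sis\s(fun\w*)/;

const groupedResult = "JavaScript is funny".match(groupedPattern);
// groupedResult[0] is the whole match:  "JavaScript is funny"
// groupedResult[1] is the outer group:  "JavaScript"
// groupedResult[2] is the nested group: "Script"
// groupedResult[3] is the last group:   "funny"

// When the optional nested group does not take part, its slot is undefined.
const withoutScript = "java is fun".match(groupedPattern);
// withoutScript[1] is "java", withoutScript[2] is undefined

console.log(groupedResult.slice(0, 4), withoutScript[2]);
```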

The syntax (?:…) performs grouping only and does not create a numbered reference:

    /([Jj]ava(?:[Ss]cript)?)\sis\s(fun\w*)/ // (?:[Ss]cript) is not counted, so \2 refers to (fun\w*)
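The effect on numbering shows up directly in the match result. A short sketch, again with an invented input sentence:

```javascript
const nonCapturingPattern = /([Jj]ava(?:[Ss]cript)?)\sis\s(fun\w*)/;

const nonCapturingResult = "JavaScript is funny".match(nonCapturingPattern);
// Only two groups are numbered, so the array holds the whole match plus two entries.
// nonCapturingResult[1] is "JavaScript"
// nonCapturingResult[2] is "funny", because (?:[Ss]cript) was skipped in the count

console.log(nonCapturingResult.length, nonCapturingResult[1], nonCapturingResult[2]);
```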

The table below is a summary of selection, grouping, and referencing

Character    Meaning
|            Alternation: matches either the expression to its left or the expression to its right
(…)          Grouping: combines items into a single unit that can be quantified and referenced later
(?:…)        Grouping only: combines items into a unit without creating a numbered reference
\n           Matches the same text that group number n matched. Groups are numbered by counting left parentheses from left to right; (?:…) groups are not counted
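A classic use of \n is requiring that a closing quote be the same kind as the opening quote. The patterns below are a sketch chosen for illustration; they are not from the original text.

```javascript
// Without a back-reference, the opening and closing quotes may differ.
const anyQuotes = /['"][^'"]*['"]/;

// \1 must match exactly the text that group 1 matched (the opening quote).
const sameQuotes = /(['"])[^'"]*\1/;

const looseAcceptsMixed = anyQuotes.test(`'hello"`);   // true
const strictRejectsMixed = sameQuotes.test(`'hello"`); // false
const strictAcceptsPair = sameQuotes.test(`"hello"`);  // true

console.log(looseAcceptsMixed, strictRejectsMixed, strictAcceptsPair);
```

Note that \1 refers to the text that was matched, not to the subpattern, so it does not mean "match ['"] again".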

Specify the matching position

Anchor characters and modifiers for regular expressions

Character    Meaning
^            Matches the beginning of the string. In multi-line mode, also matches the beginning of a line
$            Matches the end of the string. In multi-line mode, also matches the end of a line
\b           Matches a word boundary: the position between a \w character and a \W character, or between a \w character and the beginning or end of the string. Note that [\b] matches a backspace
\B           Matches a position that is not a word boundary
(?=p)        Positive lookahead assertion: requires that the following characters match p, but the match result does not include them
(?!p)        Negative lookahead assertion: requires that the following characters do not match p
i            Case-insensitive matching (ignore case)
m            Multi-line matching: ^ matches the beginning of the string or of a line, and $ matches the end of the string or of a line
g            Global matching: finds all matches instead of stopping after the first one
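The three flags in the table can be combined freely. The following sketch runs a few flag combinations against a made-up three-line string to show what each one changes.

```javascript
// A made-up three-line sample.
const text = "Java is fun\njava is popular\nI like JAVA";

// g alone: all matches, but case still matters.
const caseSensitive = text.match(/java/g);   // ["java"]

// g + i: all matches, ignoring case.
const ignoreCase = text.match(/java/gi);     // ["Java", "java", "JAVA"]

// i without g: only the first match is reported.
const firstOnly = text.match(/java/i)[0];    // "Java"

// Without m, ^ only matches at the very start of the string.
const stringStart = text.match(/^java/gi);   // ["Java"]

// With m, ^ also matches at the start of every line.
const lineStarts = text.match(/^java/gim);   // ["Java", "java"]

// With m, $ also matches at the end of every line.
const lineEnds = text.match(/\w+$/gm);       // ["fun", "popular", "JAVA"]

console.log(caseSensitive, ignoreCase, firstOnly, stringStart, lineStarts, lineEnds);
```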

On the use of \b

    /\sjava\s/ // Matches "java" only with whitespace before and after; the whitespace is part of the match
    /\bjava\b/ // Matches "java" as a whole word; the match contains no surrounding whitespace
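The difference between the two patterns shows up at the edges of a string and next to punctuation. A small sketch with invented inputs:

```javascript
const spaced = /\sjava\s/;
const bounded = /\bjava\b/;

// \s needs real whitespace characters, so a bare "java" does not match.
const spacedAlone = spaced.test("java");                        // false
const spacedMatch = "I like java a lot".match(spaced)[0];       // " java "

// \b matches a position, so string edges and punctuation are fine.
const boundedAlone = bounded.test("java");                      // true
const boundedMatch = "I like java.".match(bounded)[0];          // "java"

// "java" inside "javascript" is not followed by a word boundary.
const boundedInWord = bounded.test("javascript");               // false

// \B is the opposite: "script" must NOT start at a word boundary.
const midWord = /\Bscript/.test("javascript");                  // true
const atStart = /\Bscript/.test("script");                      // false

console.log(spacedAlone, spacedMatch, boundedAlone, boundedMatch, boundedInWord, midWord, atStart);
```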

On the use of lookahead assertions

    /java(?=script)/ // Matches the "java" in "javascript"; the match result is "java"
    /java(?!script)/ // Matches "java" only when it is not followed by "script", so it does not match "javascript"
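Both kinds of lookahead can be tried out as follows. This is a sketch that assumes "script" as the lookahead pattern p and uses made-up test strings.

```javascript
// Positive lookahead: "java" must be followed by "script".
const followedByScript = /java(?=script)/;
const positiveMatch = "javascript".match(followedByScript)[0];   // "java"
const positiveMiss = followedByScript.test("java beans");        // false

// Negative lookahead: "java" must NOT be followed by "script".
const notFollowedByScript = /java(?!script)/;
const negativeMiss = notFollowedByScript.test("javascript");     // false
const negativeMatch = "javabeans".match(notFollowedByScript)[0]; // "java"

console.log(positiveMatch, positiveMiss, negativeMiss, negativeMatch);
```

In both cases the text inspected by the lookahead is not consumed, which is why the match result is only "java".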

The modifier