Familiarize yourself with regular expressions
A regular expression is an object that describes a pattern of characters. It can be written as a literal between a pair of slashes:
const pattern = /s$/;
This simple regular expression matches any string that ends with "s".
The same pattern can also be defined with the RegExp constructor:
const pattern = new RegExp("s$");
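As a small sketch of the two forms side by side (the variable names and sample strings are illustrative, not from the original), both patterns can be checked with `test()`. Note that when the pattern is passed to the constructor as a string, backslashes have to be doubled:

```javascript
// Two equivalent ways to build the same pattern.
const literalPattern = /s$/;
const constructedPattern = new RegExp("s$");

// test() returns true when the pattern matches somewhere in the string.
const literalResult = literalPattern.test("regular expressions");       // ends with "s"
const constructedResult = constructedPattern.test("regular expression"); // ends with "n"

// Inside a string literal the backslash must be escaped, so \d becomes "\\d".
const digitPattern = new RegExp("\\d+");
const digitResult = digitPattern.test("abc123");

console.log(literalResult, constructedResult, digitResult); // true false true
```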
Characters
Letters and digits in a regular expression match themselves literally. Regular expressions also support certain non-alphabetic characters through escape sequences. The table below lists them:
Special character | Regular expression | Mnemonic |
---|---|---|
Newline | \n | new line |
Form feed | \f | form feed |
Carriage return | \r | return |
Whitespace character | \s | space |
Tab | \t | tab |
Vertical tab | \v | vertical tab |
Backspace | [\b] | backspace; the [] is needed to distinguish it from \b, the word boundary |
Character class | Regular expression | Mnemonic |
---|---|---|
Any character other than a newline | . | |
A single digit, same as [0-9] | \d | digit |
Any character other than a digit, same as [^0-9] | \D | not digit |
A single letter, digit, or underscore | [A-Za-z0-9_] | |
Word character, same as [A-Za-z0-9_] | \w | word |
Non-word character, same as [^A-Za-z0-9_] | \W | not word |
Whitespace character, including spaces, tabs, form feeds, and line feeds | \s | space |
Non-whitespace character | \S | not space |
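The character classes in the table can be tried out with a short sketch like the following. The sample string and variable names are made up for illustration:

```javascript
// A sample string containing letters, digits, a space, a comma, a tab, and an underscore.
const sample = "Room 42,\tfloor_3";

// With the g flag, match() returns every match as an array.
const digits = sample.match(/\d/g);      // every digit
const whitespace = sample.match(/\s/g);  // the space and the tab
const nonWord = sample.match(/\W/g);     // everything that is not a letter, digit, or underscore

// The dot matches any character except a newline.
const dotMatchesDash = /a.b/.test("a-b");
const dotMatchesNewline = /a.b/.test("a\nb");

console.log(digits);            // ["4", "2", "3"]
console.log(whitespace);        // [" ", "\t"]
console.log(nonWord);           // [" ", ",", "\t"]
console.log(dotMatchesDash);    // true
console.log(dotMatchesNewline); // false
```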
In regular expressions, many symbols have special meanings. For example:
^ $ . * + ? = ! : | \ / ( ) [ ] { }
- […] matches any one character inside the square brackets
- [^…] matches any one character that is not inside the square brackets
- $ matches the end of the string
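The bracket forms and the need to escape special characters can be sketched as follows. The sample strings are illustrative only:

```javascript
// [...] matches any one character listed inside the brackets.
const listed = "regular".match(/[aeu]/g);

// [^...] matches any one character NOT listed inside the brackets.
const notListed = "regular".match(/[^aeu]/g);

// To match a special character literally, escape it with a backslash.
const escapedPlus = /1\+1/.test("1+1");   // "+" is matched as a literal plus sign
const unescapedPlus = /1+1/.test("1+1");  // "+" acts as a quantifier, so this needs "11"

console.log(listed);        // ["e", "u", "a"]
console.log(notListed);     // ["r", "g", "l", "r"]
console.log(escapedPlus);   // true
console.log(unescapedPlus); // false
```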
Repetition
The following table lists the repetition syntax for regular expressions
Character | Meaning |
---|---|
{m,n} | Matches the previous item at least m times and at most n times |
{n,} | Matches the previous item at least n times |
{n} | Matches the previous item exactly n times |
? | Matches the previous item 0 or 1 times, equivalent to {0,1} |
+ | Matches the previous item 1 or more times, equivalent to {1,} |
* | Matches the previous item 0 or more times, equivalent to {0,} |
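A minimal sketch of the quantifiers in the table, using made-up patterns anchored with ^ and $ so that the whole string has to match:

```javascript
// {m,n}: between two and four digits.
const twoToFourDigits = /^\d{2,4}$/;
const rangeOk = twoToFourDigits.test("123");
const rangeTooLong = twoToFourDigits.test("12345");

// {n}: exactly four digits.
const exactlyFour = /^\d{4}$/.test("2024");

// ?: the "u" is optional.
const optionalU = /^colou?r$/;
const american = optionalU.test("color");
const british = optionalU.test("colour");

// + needs at least one occurrence, * also accepts none.
const plusOnEmpty = /^a+$/.test("");
const starOnEmpty = /^a*$/.test("");

console.log(rangeOk, rangeTooLong, exactlyFour); // true false true
console.log(american, british);                  // true true
console.log(plusOnEmpty, starOnEmpty);           // false true
```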
Alternation, grouping, and references
Regular expression syntax also includes special characters for specifying alternatives, grouping subexpressions, and referring back to earlier subexpressions. The | character expresses an "or" relationship, and parentheses () express grouping. Here are some examples:
/ab|cd|ef/ // Matches "ab", "cd", or "ef"
Note that alternatives are tried from left to right until a match is found. Once a left-hand alternative matches, the alternatives to its right are ignored, even if one of them would produce a better match.
/a|ab/ // Applied to the string "ab", this matches only "a"
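The left-to-right behavior can be observed directly. In this sketch (the variable names are illustrative), reordering the alternatives changes what is matched:

```javascript
// The first alternative that succeeds wins, even if a later one would match more.
const shortFirst = "ab".match(/a|ab/)[0];

// Putting the longer alternative first lets it match the whole string.
const longFirst = "ab".match(/ab|a/)[0];

// With several alternatives, the match is whichever one is found in the text.
const anyOfThree = /ab|cd|ef/.exec("xxcdxx")[0];

console.log(shortFirst); // "a"
console.log(longFirst);  // "ab"
console.log(anyOfThree); // "cd"
```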
Parentheses have several uses in regular expressions. One is to combine individual items into a single subexpression:
/java(script)?/ // Matches "java" or "javascript"
Another is to let a later part of the expression refer back to an earlier subexpression:
/([Jj]ava([Ss]cript)?)\sis\s(fun\w*)/ // The nested subexpression ([Ss]cript) is referenced as \2
The syntax (?:…) groups items without creating a numbered reference:
/([Jj]ava(?:[Ss]cript)?)\sis\s(fun\w*)/ // (?:[Ss]cript) is not counted, so \2 refers to (fun\w*)
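The effect on group numbering can be inspected through the array returned by `exec()`, where index 0 is the whole match and index n is group n. This is a sketch with an illustrative input string:

```javascript
const text = "JavaScript is funny";

// Three capturing groups: group 2 is the nested ([Ss]cript).
const capturing = /([Jj]ava([Ss]cript)?)\sis\s(fun\w*)/;
const withCapture = capturing.exec(text);

// Same pattern, but the nested group is non-capturing, so (fun\w*) becomes group 2.
const nonCapturing = /([Jj]ava(?:[Ss]cript)?)\sis\s(fun\w*)/;
const withoutCapture = nonCapturing.exec(text);

console.log(withCapture[1], withCapture[2], withCapture[3]); // JavaScript Script funny
console.log(withoutCapture[1], withoutCapture[2]);           // JavaScript funny
console.log(withCapture.length, withoutCapture.length);      // 4 3
```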
The table below summarizes alternation, grouping, and references
Character | Meaning |
---|---|
\| | Alternation: matches either the subexpression to its left or the subexpression to its right |
(…) | Grouping: combines items into a single unit that a quantifier such as ? can apply to, and remembers the matched text for later references |
(?:…) | Grouping only: combines items into a single unit but does not provide a reference to the matched text |
\n | Matches the same characters that group number n matched. Group numbers are assigned by counting left parentheses from left to right; (?:…) groups are not counted |
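A common use of a \n reference is to require that two parts of a string be identical. The sketch below uses two illustrative patterns that are not from the original text: one for matching quotation marks and one for finding a doubled word.

```javascript
// \1 must match the same quote character that group 1 matched.
const quoted = /(['"])[^'"]*\1/;
const singleQuotes = quoted.test("'hello'");
const doubleQuotes = quoted.test('"hello"');
const mismatched = quoted.test("'hello\"");

// \1 here repeats whatever word group 1 captured.
const doubledWord = /\b(\w+)\s\1\b/;
const found = doubledWord.exec("this is is a test");

console.log(singleQuotes, doubleQuotes, mismatched); // true true false
console.log(found[0]); // "is is"
console.log(found[1]); // "is"
```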
Specify the matching position
Anchor characters and modifiers for regular expressions
Character | Meaning |
---|---|
^ | Matches the beginning of the string. In multiline mode, also matches the beginning of a line |
$ | Matches the end of the string. In multiline mode, also matches the end of a line |
\b | Matches a word boundary, that is, the position between a \w character and a \W character, or between a \w character and the beginning or end of the string. Note that [\b] matches a backspace |
\B | Matches a position that is not a word boundary |
(?=p) | Positive lookahead assertion: requires that the following characters match p, but does not include those characters in the match |
(?!p) | Negative lookahead assertion: requires that the following characters do not match p |
i | Case-insensitive matching (ignore case) |
m | Multiline matching: ^ matches the beginning of the string or of a line, and $ matches the end of the string or of a line |
g | Global matching: finds all matches instead of stopping after the first one |
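The three modifiers can be compared with a short sketch. The two-line input string is made up for illustration:

```javascript
const lines = "Java\njava";

// Without m, ^ and $ only match at the start and end of the whole string.
const wholeStringOnly = /^java$/i.test(lines);

// With m, ^ and $ also match at line breaks; i ignores case; g collects every match.
const perLine = lines.match(/^java$/gim);

// Without g, match() stops at the first match.
const firstDigit = "a1b2".match(/\d/)[0];
const allDigits = "a1b2".match(/\d/g);

console.log(wholeStringOnly); // false
console.log(perLine);         // ["Java", "java"]
console.log(firstDigit);      // "1"
console.log(allDigits);       // ["1", "2"]
```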
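On the use of \b
/\sjava\s/ // Matches "java" with a whitespace character before and after it; the whitespace is part of the match
/\bjava\b/ // Matches "java" as a whole word; no whitespace is required or included in the match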
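The difference between the two patterns shows up both in what they accept and in what the match contains. A sketch with illustrative sentences:

```javascript
const spaced = /\sjava\s/;
const bounded = /\bjava\b/;

// \s consumes the surrounding whitespace; \b matches a position and consumes nothing.
const spacedMatch = spaced.exec("I like java a lot")[0];
const boundedMatch = bounded.exec("I like java a lot")[0];

// "java." has no whitespace after the word, so only the \b version matches.
const spacedBeforeDot = spaced.test("I like java.");
const boundedBeforeDot = bounded.test("I like java.");

// Inside "javascript" there is no boundary after "java".
const boundedInLongerWord = bounded.test("javascript");

console.log(JSON.stringify(spacedMatch));  // " java "
console.log(JSON.stringify(boundedMatch)); // "java"
console.log(spacedBeforeDot, boundedBeforeDot, boundedInLongerWord); // false true false
```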
On the use of lookahead assertions
/java(?=script)/ // Matches the "java" in "javascript"; the match result is "java"
/java(?!script)/ // Matches "java" only when it is not followed by "script", so it does not match "javascript"
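A brief sketch of positive and negative lookahead, using patterns built from the (?=p) and (?!p) forms in the table above (the input strings are illustrative):

```javascript
// Positive lookahead: "script" must follow, but it is not part of the match.
const positive = /java(?=script)/;
const positiveMatch = positive.exec("javascript")[0];
const positiveOnOther = positive.test("javabeans");

// Negative lookahead: "script" must NOT follow.
const negative = /java(?!script)/;
const negativeOnScript = negative.test("javascript");
const negativeOnOther = negative.test("javabeans");

console.log(positiveMatch);    // "java"
console.log(positiveOnOther);  // false
console.log(negativeOnScript); // false
console.log(negativeOnOther);  // true
```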
The modifier