JavaScript regular expressions

1. The RegExp object

JavaScript supports regular expressions through the built-in RegExp object. There are two ways to create a RegExp object:

Literal:

var reg = /\bis\b/g;

Constructor:

var reg = new RegExp('\\bis\\b', 'g'); // Two backslashes, because \ must itself be escaped inside a string

Modifiers (flags):

g: global search over the full text. Without it, the search stops at the first match

i: ignore case. The default is case sensitive

m: multi-line search
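
As a minimal sketch of how the three flags change matching (the sample strings and variable names below are made up for illustration):

```javascript
// Sample text, chosen only for illustration
const flagText = 'He is a boy. She is a girl. IS it so?';

// Without g, replace() changes only the first match
const firstOnly = flagText.replace(/\bis\b/, 'X');
// With g, every match is replaced (still case sensitive, so "IS" stays)
const everyMatch = flagText.replace(/\bis\b/g, 'X');
// With g and i, case is ignored as well
const anyCase = flagText.replace(/\bis\b/gi, 'X');

// With m, ^ matches at the start of every line, not only the start of the text
const flagLines = 'one\ntwo\nthree';
const firstLineOnly = flagLines.replace(/^\w/g, 'X');
const everyLine = flagLines.replace(/^\w/gm, 'X');

console.log(firstOnly);     // He X a boy. She is a girl. IS it so?
console.log(everyMatch);    // He X a boy. She X a girl. IS it so?
console.log(anyCase);       // He X a boy. She X a girl. X it so?
console.log(JSON.stringify(firstLineOnly)); // "Xne\ntwo\nthree"
console.log(JSON.stringify(everyLine));     // "Xne\nXwo\nXhree"
```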

If a string contains the newline \n, document.write() outputs HTML, where a newline is treated as ordinary whitespace (a space), so the only way to see the effect of newlines is console.log. The Enter key on Windows produces \r\n; \r is a carriage return (back to the start of the line), and \t stands for Tab.

2. Regular expression composition

1. Metacharacters

Literal text characters: characters that stand for themselves, such as a, b, and c matching a, b, and c
Metacharacters: non-alphabetic characters that have special meanings in regular expressions

* + ? $ ^ . | \ ( ) { } [ ]

character	meaning
\t	Horizontal tab
\v	Vertical tab
\n	Newline
\r	Carriage return
\0	Null character
\f	Form feed
\cX	The control character corresponding to X (Ctrl + X)

2. Character classes

Normally, one regular expression character corresponds to one character of the string.

The metacharacters [] can be used to create a simple class
A class is a set of things that share some property; it refers to a category, not to one particular character
The expression [abc] puts the characters a, b, and c into one class, and the expression matches any one character of that class

3. Negated character classes

Use the metacharacter ^ to create a negated (reverse) class
A negated class matches anything that does not belong to the class
The expression [^abc] means any character that is not a, b, or c
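
A small sketch of a character class next to its negated form; the sample string and variable names are assumptions chosen for illustration:

```javascript
// Illustrative sample string
const classSample = 'a1b2c3d4';

// [abc] matches any single character that is a, b, or c
const classReplaced = classSample.replace(/[abc]/g, 'X');

// [^abc] matches any single character that is NOT a, b, or c
const negatedReplaced = classSample.replace(/[^abc]/g, 'X');

console.log(classReplaced);   // X1X2X3d4
console.log(negatedReplaced); // aXbXcXXX
```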

4. Range classes

You can write [a-z], joining two characters with a hyphen, to mean any character from a to z
The interval is closed: it includes a and z themselves
Inside [], ranges can be written one after another: [a-zA-Z] matches both uppercase and lowercase letters
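
The range syntax can be tried out with a short sketch like the following. The sample string is invented, and the last case (placing a literal hyphen at the end of the class) is an addition that the text above does not cover:

```javascript
// Illustrative sample string
const rangeSample = 'a1B2-c3';

// [a-z] matches lowercase letters only
const lowerOnly = rangeSample.replace(/[a-z]/g, 'Q');

// [a-zA-Z] chains two ranges inside one class
const anyLetter = rangeSample.replace(/[a-zA-Z]/g, 'Q');

// A hyphen placed last in the class is a literal hyphen, not a range operator
const digitsAndHyphen = rangeSample.replace(/[0-9-]/g, 'Q');

console.log(lowerOnly);       // Q1B2-Q3
console.log(anyLetter);       // Q1Q2-Q3
console.log(digitsAndHyphen); // aQBQQcQ
```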

5. Predefined classes

Predefined classes match commonly used character classes:

Character	Equivalent class	Meaning
.	[^\r\n]	Any character except carriage return and newline
\d	[0-9]	Digit
\D	[^0-9]	Non-digit character
\s	[ \t\n\x0B\f\r]	Whitespace character
\S	[^ \t\n\x0B\f\r]	Non-whitespace character
\w	[a-zA-Z_0-9]	Word character (letters, digits, underscore)
\W	[^a-zA-Z_0-9]	Non-word character

Example: ab\d. matches ab followed by one digit and then any one character.
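
To see the predefined classes at work, here is a short sketch; all sample strings and variable names are assumptions for illustration:

```javascript
// ab, then one digit, then any single character
const abMatches = 'ab12 ab3x abcd'.match(/ab\d./g);

// \D removes everything that is not a digit
const digitsOnly = '2024-01-15'.replace(/\D/g, '');

// \s splits on whitespace
const parts = 'ab12 ab3x abcd'.split(/\s/);

// \w+ collects runs of word characters
const words = 'hello_world 42!'.match(/\w+/g);

console.log(abMatches);  // [ 'ab12', 'ab3x' ]
console.log(digitsOnly); // 20240115
console.log(parts);      // [ 'ab12', 'ab3x', 'abcd' ]
console.log(words);      // [ 'hello_world', '42' ]
```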

6. Boundaries

Common boundary matching characters:

Character	Meaning
^	Starts with ...
$	Ends with ...
\b	Word boundary
\B	Non-word boundary

Note: some characters have different meanings in different contexts. For example, ^ inside [] means negation; outside [] it means "starts with".
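
The boundary characters can be compared side by side in a sketch like this one (sample strings are invented for illustration):

```javascript
const boundarySample = 'This is a test';

// No boundary: every "is" is replaced, including the one inside "This"
const noBoundary = boundarySample.replace(/is/g, '0');
// \b on both sides: only the standalone word "is"
const wholeWord = boundarySample.replace(/\bis\b/g, '0');
// \B before, \b after: only an "is" that ends a longer word
const wordEnding = boundarySample.replace(/\Bis\b/g, '0');

const anchorSample = '@123@abc@';
// ^ anchors the match to the start of the text
const atStart = anchorSample.replace(/^@./, 'Q');
// $ anchors the match to the end of the text
const atEnd = anchorSample.replace(/.@$/, 'Q');

console.log(noBoundary); // Th0 0 a test
console.log(wholeWord);  // This 0 a test
console.log(wordEnding); // Th0 is a test
console.log(atStart);    // Q23@abc@
console.log(atEnd);      // @123@abQ
```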

7. Quantifiers

Character	Meaning
?	Zero or one occurrence (at most once)
+	One or more occurrences (at least once)
*	Zero or more occurrences (any number of times)
{n}	Exactly n occurrences
{n,m}	n to m occurrences
{n,}	At least n occurrences
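
Each quantifier can be checked with an anchored test, as in this sketch (the patterns and inputs are illustrative assumptions):

```javascript
// Collect [description, actual result] pairs
const quantifierChecks = [
  ['colou?r accepts color',      /^colou?r$/.test('color')],    // ? allows zero u
  ['colou?r accepts colour',     /^colou?r$/.test('colour')],   // ? allows one u
  ['colou?r rejects colouur',    !/^colou?r$/.test('colouur')], // but not two
  ['a+ rejects empty string',    !/^a+$/.test('')],             // + needs at least one
  ['a* accepts empty string',    /^a*$/.test('')],              // * allows zero
  ['\\d{4} accepts 2024',        /^\d{4}$/.test('2024')],       // exactly four
  ['\\d{4} rejects 202',         !/^\d{4}$/.test('202')],
  ['\\d{2,3} accepts 12',        /^\d{2,3}$/.test('12')],       // two to three
  ['\\d{2,3} rejects 1234',      !/^\d{2,3}$/.test('1234')],
  ['\\d{2,} accepts 12345',      /^\d{2,}$/.test('12345')],     // two or more
];

for (const [label, ok] of quantifierChecks) {
  console.log(label, ok); // every line prints true
}
```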

8. Greedy versus non-greedy

Greedy mode: by default, \d{3,6} matches as many digits as possible, i.e., it takes them 6 at a time (until there are not enough left)
Non-greedy mode: make the regular expression match as few characters as possible, that is, stop as soon as a match succeeds, taking the smallest allowed unit

\d{3,6}?

Adding ? after the quantifier switches it to non-greedy mode
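
The difference is easy to observe by replacing each match with a marker, as in this sketch (the digit string is an invented sample):

```javascript
const digits = '12345678';

// Greedy: takes six digits at once, leaving "78" (too short to match again)
const greedy = digits.replace(/\d{3,6}/g, 'X');

// Non-greedy: takes three digits at a time, so it matches twice
const lazy = digits.replace(/\d{3,6}?/g, 'X');

console.log(greedy); // X78
console.log(lazy);   // XX78
```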

9. Grouping

A quantifier normally applies only to the character immediately before it; for example, beyond{3} matches the letter d three times
Use () to group, so that the quantifier applies to the whole group, for example (beyond){3}
Use | to express alternation ("or")
Backreferences: use $1, $2, ... to refer to the text captured by each group
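
A short sketch of grouping, alternation, and backreferences in a replacement string; the sample strings are assumptions chosen for illustration:

```javascript
// Without a group, {3} applies only to the final d
const lastLetterOnly = /^beyond{3}$/.test('beyonddd');
// With a group, {3} applies to the whole word
const wholeGroup = /^(beyond){3}$/.test('beyondbeyondbeyond');

// | chooses between alternatives; a group limits its scope
const alternation = 'ByronsperByrCasper'.replace(/Byr(on|Ca)sper/g, 'X');

// $1, $2, $3 in the replacement refer to the captured groups
const reordered = '2015-12-25'.replace(/(\d{4})-(\d{2})-(\d{2})/, '$2/$3/$1');

console.log(lastLetterOnly); // true
console.log(wholeGroup);     // true
console.log(alternation);    // XX
console.log(reordered);      // 12/25/2015
```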

10. Lookahead

A regular expression is parsed from the start of the text toward the end, and the direction toward the end of the text is called "forward". A lookahead means that when the regular expression matches a rule, it also checks ahead whether the text that follows satisfies the assertion. A lookbehind checks in the opposite direction.

Name	Regular expression
Positive lookahead	exp(?=assert)
Negative lookahead	exp(?!assert)

The assert section is also a regular expression.
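
As a sketch of both forms, the following replaces word characters depending on whether a digit follows them. The sample string is invented; note that the asserted text itself is never consumed or replaced:

```javascript
const lookSample = 'a2*34v8';

// Positive lookahead: a word character that IS followed by a digit
const beforeDigit = lookSample.replace(/\w(?=\d)/g, 'X');

// Negative lookahead: a word character that is NOT followed by a digit
const notBeforeDigit = lookSample.replace(/\w(?!\d)/g, 'X');

console.log(beforeDigit);    // X2*X4X8
console.log(notBeforeDigit); // aX*3XvX
```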

3. Regular expression methods

1. Properties of the RegExp object

global: whether the search covers the full text. The default value is false
ignoreCase: whether case is ignored. The default value is false
multiline: whether the search is multi-line. The default value is false
lastIndex: the position immediately after the last character of the current match. Pay special attention to this property, because it is an easy source of bugs in global mode (g)
source: the text of the regular expression
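
These properties can be inspected directly, as in this minimal sketch (the two patterns are arbitrary examples):

```javascript
const plainReg = /\w/;
const flaggedReg = /\w/gim;

// The three flag properties default to false and reflect the flags given
console.log(plainReg.global, plainReg.ignoreCase, plainReg.multiline);       // false false false
console.log(flaggedReg.global, flaggedReg.ignoreCase, flaggedReg.multiline); // true true true

// source holds the pattern text without the slashes and flags
console.log(flaggedReg.source);    // \w

// lastIndex starts at 0 and only moves after a global-mode match
console.log(flaggedReg.lastIndex); // 0
```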

2. RegExp object methods

RegExp.prototype.test(str)

Tests whether the string argument contains text that matches the regular expression pattern, and returns true or false
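
One consequence worth seeing in code: in global mode, test advances lastIndex, so repeated calls on the same string do not always return the same answer. A small sketch with an invented two-character input:

```javascript
const wordReg = /\w/g;
const testResults = [];

// Call test() four times on the same string with the same global regex
for (let i = 0; i < 4; i++) {
  testResults.push(wordReg.test('ab'));
}

// Matches "a", then "b", then runs off the end (false, lastIndex resets to 0), then "a" again
console.log(testResults); // [ true, true, false, true ]

// Without g, lastIndex is not used and the result is stable
const stableResults = [/\w/.test('ab'), /\w/.test('ab'), /\w/.test('ab')];
console.log(stableResults); // [ true, true, true ]
```

When only a yes/no answer is needed, leaving out the g flag avoids this surprise.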

RegExp.prototype.exec(str)

Searches the string using the regular expression pattern and updates the properties of the RegExp object (lastIndex, in global mode) to reflect the match.

Returns null if there is no matching text
Otherwise returns a result array with extra properties: index is the position of the first character of the matched text, and input is the string that was searched
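
A common way to use exec in global mode is a loop that runs until null is returned. The following is a sketch with an invented pattern and input; the field names in the collected objects are hypothetical:

```javascript
const execReg = /\d(\w)(\w)\d/g;
const execText = '$1az2bb3cy4dd5ee';

const found = [];
let hit;
// Each call continues from lastIndex; null means there are no more matches
while ((hit = execReg.exec(execText)) !== null) {
  found.push({
    text: hit[0],              // the full match
    index: hit.index,          // where the match starts
    groups: [hit[1], hit[2]],  // the two captured groups
    next: execReg.lastIndex,   // where the next search will start
  });
}

console.log(found);
// [ { text: '1az2', index: 1, groups: [ 'a', 'z' ], next: 5 },
//   { text: '3cy4', index: 7, groups: [ 'c', 'y' ], next: 11 } ]
console.log(execReg.lastIndex); // 0 (reset after the failed final search)
```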

3. String object methods

Regular expressions can be used in string methods

(1) String.prototype.search()

Used to look for a specified substring in a string, or for a substring that matches a regular expression. str.search(reg) returns the index of the first match, or -1 if no match is found
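
A brief sketch of search on an invented sample string. The last case shows that a string argument is converted to a regular expression, so '.' means "any character" rather than a literal dot:

```javascript
const searchSample = 'a1b2c3d1';

const byRegex = searchSample.search(/1/);    // index of the first match
const byString = searchSample.search('1');   // a string argument also works
const notFound = searchSample.search(/x/);   // no match
const withGlobal = searchSample.search(/1/g); // the g flag is ignored: still the first match
const dotString = searchSample.search('.');   // '.' is treated as a regex and matches at index 0

console.log(byRegex, byString, notFound, withGlobal, dotString); // 1 1 -1 1 0
```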

(2) String.prototype.match()

Searches the string and finds one or more pieces of text that match the regexp. Whether the g flag is present makes a big difference.

Without g, match performs only one match in the string and returns an array describing the first match (with index and input properties), or null if nothing matches
With g, match returns an array containing all of the matched texts, without index information

(3) String.prototype.replace(reg, replacement)

The replace method replaces each corresponding match with replacement and returns the new string.

replacement can be:

A string
A function. It receives the matched string as its first argument, and its return value is used as the replacement. An arrow function works too, as long as it returns a value: (word) => { word + ',' } has a block body without return, so it yields undefined.
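
As a sketch of the function form: the callback receives the matched text first, then one argument per capture group, then the offset of the match and the whole input string. The parameter names below are arbitrary, and the sample string is invented:

```javascript
const replaceSample = 'a1b2c3d4e5';

// No groups: the arguments are (match, offset, input)
const incremented = replaceSample.replace(/\d/g, function (match, offset, input) {
  return String(Number(match) + 1);
});

// Three groups: the arguments are (match, g1, g2, g3, offset, input)
const digitsJoined = replaceSample.replace(/(\d)(\w)(\d)/g, function (match, g1, g2, g3) {
  return g1 + g3; // drop the character between the two digits
});

console.log(incremented);   // a2b3c4d5e6
console.log(digitsJoined);  // a12c34e5
console.log(replaceSample); // a1b2c3d4e5 (the original string is unchanged)
```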

Here are a few examples of replace:

// 1. Convert a hyphenated name such as get-element to camelCase
function commal(str) {
  // With no quantifier, [a-z] matches exactly one letter after the hyphen
  // Be sure to add the global modifier
  let reg = new RegExp('-[a-z]', 'g')
  // $0 is the string that replace matched
  // The replace method does not change the original string
  return str.replace(reg, function($0) {
    return $0.slice(1).toUpperCase()
  })
}

console.log(commal('ab-gi-du'))    // abGiDu

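// 2. Separate digits into groups of three with commas
function slicestr(str) {
  // /\d{3}/ alone only replaces the first three digits: "123456" becomes "123,456" by luck,
  // but "1234567" becomes "123,4567"
  // This version is correct: it matches each digit that is followed by
  // a multiple of three digits up to the end of the string
  let reg = /(\d)(?=(\d{3})+$)/g

  // An arrow function with a block body and no return yields undefined
  /* return str.replace(reg, (word) => { word + ',' }) */
  return str.replace(reg, function(word) {
    return word + ','
  })
}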

(4) String.prototype.split()
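
The split method also accepts a regular expression as the separator. A minimal sketch, with sample strings invented for illustration:

```javascript
// A plain string separator
const byComma = 'a,b,c,d'.split(',');

// A regular expression separator: split on any digit
const byDigit = 'a1b2c3d'.split(/\d/);

// Split on a comma or semicolon with optional surrounding whitespace
const byPunctuation = 'a, b ;c'.split(/\s*[,;]\s*/);

console.log(byComma);       // [ 'a', 'b', 'c', 'd' ]
console.log(byDigit);       // [ 'a', 'b', 'c', 'd' ]
console.log(byPunctuation); // [ 'a', 'b', 'c' ]
```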

4. Comparison of several methods

1. RegExp.prototype.test(str) and String.prototype.search(reg)

Both methods can be used to find out whether the string str contains text that matches the regular expression, with the following differences:

test is a method of the RegExp object, and search is a method of the String object
In global mode, test works together with RegExp.lastIndex: each call starts from where the previous match ended
test returns a Boolean indicating whether a match exists, while search returns the position of the first match, or -1 if there is none

let str = "k is so k"
// Global mode
let reg = new RegExp("k", "g")
console.log(reg.test(str))    // true
console.log(reg.lastIndex)    // 1
console.log(reg.test(str))    // true
console.log(reg.lastIndex)    // 9

// Global mode
let str = "k is so k"
let reg = new RegExp("k", "g")
console.log(str.search(reg))  // 0
console.log(reg.lastIndex)    // 0
console.log(str.search(reg))  // 0
console.log(reg.lastIndex)    // 0

2. RegExp.prototype.exec(str) and String.prototype.match(reg)

The exec and match methods can both return result objects that carry the index position, but their behavior depends on whether the regular expression is in global mode.

In non-global mode, exec returns the first match every time. In global mode, it works together with the lastIndex property: each call continues from the end of the previous match and returns the next match.

// Non-global mode
let str = "k is so k"
let reg = new RegExp("k")
console.log(reg.exec(str))    // [ "k",
                              //   index: 0,
                              //   input: "k is so k",
                              //   groups: undefined ]
console.log(reg.lastIndex)    // 0
console.log(reg.exec(str))    // same as above
console.log(reg.lastIndex)    // 0

// Global mode (same str, now with the g flag)
reg = new RegExp("k", "g")
console.log(reg.exec(str))    // [ "k",
                              //   index: 0,
                              //   input: "k is so k",
                              //   groups: undefined ]
console.log(reg.lastIndex)    // 1
console.log(reg.exec(str))    // [ "k",
                              //   index: 8,
                              //   input: "k is so k",
                              //   groups: undefined ]
console.log(reg.lastIndex)    // 9

In global mode, match returns an array of all the matched strings, without extra information such as the index. In non-global mode, match returns the first match together with information such as index.

// Global mode
let str = "k is so k"
let reg = new RegExp("k", "g")
console.log(str.match(reg))   // ["k", "k"]
console.log(reg.lastIndex)    // 0
console.log(str.match(reg))   // ["k", "k"]
console.log(reg.lastIndex)    // 0

// Non-global mode (same str, without the g flag)
reg = new RegExp("k")
console.log(str.match(reg))   // [ "k",
                              //   index: 0,
                              //   input: "k is so k",
                              //   groups: undefined ]
console.log(reg.lastIndex)    // 0
console.log(str.match(reg))   // same as above
console.log(reg.lastIndex)    // 0
