
RegExp

Preface

Regular expressions matter: almost every language supports them, and ECMAScript does so through the RegExp type. They show up everywhere in practice, for example when matching a route to its page or extracting a piece of text from a paragraph.

Regular match

  • The pattern can be any simple or complex regular expression, including character classes, quantifiers, grouping, lookahead, and backreferences.

  • A regular expression can carry zero or more flags that control its behavior.

Flags for matching patterns

  • g: global mode. The whole string is searched for matches instead of stopping at the first one.
  • i: case-insensitive mode. The case of both the pattern and the string is ignored when matching.
  • m: multiline mode. ^ and $ also match at the start and end of each line, so the search carries on past the end of a line of text.
  • y: sticky mode. A match is attempted only at the position given by lastIndex.
  • u: Unicode mode. Enables Unicode matching.
  • s: dotAll mode. The metacharacter . matches any character, including \n and \r.

Flags can also be combined:

let pattern = /at/gi
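As a rough illustration of how the flags change matching, the sketch below runs one sample string against patterns that differ only in their flags. The sample string and the variable names are made up for this example.

```javascript
// Sample text (an assumption for illustration): two lines, mixed case
const sample = "Cat\nbat";

// i: case is ignored, so "cat" matches "Cat"
const withI = /cat/i.test(sample);        // true
const withoutI = /cat/.test(sample);      // false

// m: ^ and $ also match at line boundaries
const withM = /^bat$/m.test(sample);      // true
const withoutM = /^bat$/.test(sample);    // false

// s: . also matches the newline between the two words
const withS = /Cat.bat/s.test(sample);    // true
const withoutS = /Cat.bat/.test(sample);  // false

// g: String.prototype.match returns every match instead of only the first
const allMatches = sample.match(/.at/gi); // ["Cat", "bat"]
console.log(withI, withM, withS, allMatches);
```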

RegExp objects can be created either with a literal or with the RegExp constructor.

Literal form

This is the form used above:

let pattern = /at/gi

Metacharacters

  • \ : marks the next character as a special character, a literal character, a backreference, or an octal escape. For example, \n matches a newline.
  • ^ : matches the beginning of the input string.
  • $ : matches the end of the input string.
  • * : matches the preceding subexpression 0 or more times.
  • + : matches the preceding subexpression 1 or more times.
  • ? : matches the preceding subexpression 0 or 1 times.
  • . : matches any single character except a newline.
  • x|y : matches x or y.
  • [xyz] : matches any one of the enclosed characters.
  • [^xyz] : matches any character that is not enclosed.
  • [a-z] : matches any lowercase letter in the range a to z.
  • \d : matches a digit; equivalent to [0-9].
  • \D : matches a non-digit character; equivalent to [^0-9].


// Match the first "bat" or "cat", ignoring case
let pa1 = /[bc]at/i
// Match the first "[bc]at", ignoring case
let pa2 = /\[bc\]at/i
// Match every three-character combination ending in "at", ignoring case
let pa3 = /.at/gi
// Match every ".at", ignoring case
let pa4 = /\.at/gi

To match a metacharacter literally, you must escape it with a backslash.
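When the text to match comes from a variable, every metacharacter in it has to be escaped before it is placed in a pattern. The following is a minimal sketch, assuming a hypothetical helper named escapeRegExp that is not part of the original article.

```javascript
// Hypothetical helper: prefix every regex metacharacter with a backslash
function escapeRegExp(text) {
  // "$&" in the replacement string stands for the matched character
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const literalText = "[bc]at";
const escaped = escapeRegExp(literalText);       // the string \[bc\]at
const pattern = new RegExp(escaped, "i");

const matchesLiteral = pattern.test("x[BC]at");  // true: the brackets are matched literally
const matchesCat = pattern.test("cat");          // false: no longer a character class
console.log(escaped, matchesLiteral, matchesCat);
```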

RegExp constructor

With the constructor, both the pattern and the flags are passed as strings:

let pat = new RegExp("[bc]at", "i")

Because the pattern is passed as a string, every backslash must be escaped a second time: \ becomes \\, so \n becomes \\n.

Literal pattern              Equivalent string
/\[bc\]at/                   "\\[bc\\]at"
/\.at/                       "\\.at"
/name\/age/                  "name\\/age"
/\d.\d{1,2}/                 "\\d.\\d{1,2}"
/\w\\hello\\123/             "\\w\\\\hello\\\\123"
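The double escaping can be checked directly by comparing the source of a literal with that of its constructor equivalent. This is only a sketch; the run-time value `word` and the sample sentence are assumptions made for the example.

```javascript
// A literal and its constructor equivalent compile to the same pattern
const fromLiteral = /\d.\d{1,2}/;
const fromString = new RegExp("\\d.\\d{1,2}");
const sameSource = fromLiteral.source === fromString.source; // true

// Forgetting the second backslash changes the pattern:
// inside a string, "\d" is simply the letter "d"
const broken = new RegExp("\d.\d{1,2}");
const brokenSource = broken.source;              // "d.d{1,2}"

// The constructor helps when part of the pattern is only known at run time
const word = "bat";                              // hypothetical run-time value
const dynamic = new RegExp(word + "s?", "gi");
const found = "Bat bats cat".match(dynamic);     // ["Bat", "bats"]
console.log(sameSource, brokenSource, found);
```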

5.2.1 RegExp Instance Attributes

These properties describe the regular expression, but they are rarely needed in day-to-day development. For example, there is seldom a reason to check whether a pattern was created with the g flag.

  • global: a Boolean value indicating whether the g flag is set.
  • ignoreCase: a Boolean value indicating whether the i flag is set.
  • unicode: a Boolean value indicating whether the u flag is set.
  • sticky: a Boolean value indicating whether the y flag is set.
  • lastIndex: an integer giving the position in the source string where the next search starts. It always begins at 0.
  • multiline: a Boolean value indicating whether the m flag is set.
  • dotAll: a Boolean value indicating whether the s flag is set.
  • source: the pattern in literal form (not the pattern string passed to the constructor), without the opening and closing slashes.
  • flags: the flag string of the regular expression. It is always returned in literal form, without slashes, rather than as the string passed to the constructor.
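The properties listed above can be read straight off any RegExp instance. The sketch below collects them into a plain object for one sample pattern; the object name `info` is only for illustration.

```javascript
const pattern = /\[bc\]at/gi;

const info = {
  global: pattern.global,         // true
  ignoreCase: pattern.ignoreCase, // true
  multiline: pattern.multiline,   // false
  sticky: pattern.sticky,         // false
  unicode: pattern.unicode,       // false
  dotAll: pattern.dotAll,         // false
  lastIndex: pattern.lastIndex,   // 0
  source: pattern.source,         // the string \[bc\]at
  flags: pattern.flags,           // "gi"
};

// The constructor form reports the same source and flags
const viaConstructor = new RegExp("\\[bc\\]at", "gi");
const sameSource = viaConstructor.source === pattern.source; // true
console.log(info, sameSource);
```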

5.2.2 RegExp instance method

Here we introduce two methods exec() and test()

exec()

Parameter: the string to apply the pattern to.

The pattern below contains two capture groups: the inner group matches " and baby", and the outer group matches " and dad" or " and dad and baby".

let txt = "mom and dad and baby"
// Note the leading space inside each group
let pattern = /mom( and dad( and baby)?)?/gi
let match = pattern.exec(txt)
// ['mom and dad and baby', ' and dad and baby', ' and baby', index: 0, input: 'mom and dad and baby', groups: undefined]
  • match[0]: the whole string matched by the pattern
  • match[1]: the text matched by the first capture group
  • match[2]: the text matched by the second capture group (present when the pattern has more than one group)
  • match["input"]: the string that was searched
  • match["index"]: the position in the string where the match starts
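The groups entry in the result is undefined unless the pattern uses named capture groups. A small sketch, assuming an engine with ES2018 named groups; the group names parents and child are invented for this example.

```javascript
const txt = "mom and dad and baby";
// Same shape as the pattern above, with names on the two capture groups
const named = /mom(?<parents> and dad(?<child> and baby)?)?/i;
const match = named.exec(txt);

const whole = match[0];              // "mom and dad and baby"
const first = match[1];              // " and dad and baby"
const second = match[2];             // " and baby"
const byName = match.groups.child;   // " and baby", same as match[2]
const start = match.index;           // 0
console.log(whole, first, second, byName, start);
```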

Think about it 🤔

Why does this array contain both plain strings and key-value pairs? Is that a mistake? Is it still an array? After all, writing it as a literal fails:

let arr = [1, 2, "test": 11]  // Uncaught SyntaxError: Unexpected token ':'

Yes, it is an ordinary array that has had some extra properties assigned to it. Arrays are objects, so they can hold arbitrary key-value pairs alongside their numeric indexes, although you will almost never see this in clean code (regular expression match results are about the only common case). You can build a similar array like this:

let arr = [1, 2]
arr.input = "test"  // [1, 2, input: 'test']
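To confirm that the result of exec() really is an array, it can be inspected with Array.isArray. The sample pattern and string below are assumptions chosen only for illustration.

```javascript
const match = /(.)at/.exec("a cat");

const isRealArray = Array.isArray(match); // true
const length = match.length;              // 2: only the indexed elements count
const index = match.index;                // 2
const input = match.input;                // "a cat"

// The same shape built by hand
const arr = [1, 2];
arr.input = "test";
const handLength = arr.length;            // still 2, extra properties are not elements
const handInput = arr.input;              // "test"
console.log(isRealArray, length, index, input, handLength, handInput);
```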

Global flag g

What does global matching actually mean? A single call to exec() returns the same result with or without the g flag:

let text = "cat, bat, sat, fat"

let nogpattern = /.at/
nogpattern.exec(text)
// ['cat', index: 0, input: 'cat, bat, sat, fat', groups: undefined]

let havegpattern = /.at/g
havegpattern.exec(text)
// ['cat', index: 0, input: 'cat, bat, sat, fat', groups: undefined]

The difference only shows when exec() is called again: with g, each call continues searching from where the previous one stopped.

Without g

let text = "cat, bat, sat, fat"
let nog = /.at/
nog.exec(text) // ['cat', index: 0, input: 'cat, bat, sat, fat', groups: undefined]
console.log(nog.lastIndex)   // 0, lastIndex is never advanced
nog.exec(text) // ['cat', index: 0, input: 'cat, bat, sat, fat', groups: undefined]
console.log(nog.lastIndex)   // 0

With g

let text = "cat, bat, sat, fat"
let haveg = /.at/g
haveg.exec(text)  // ['cat', index: 0, input: 'cat, bat, sat, fat', groups: undefined]
console.log(haveg.lastIndex)   // 3
haveg.exec(text)  // ['bat', index: 5, input: 'cat, bat, sat, fat', groups: undefined]
console.log(haveg.lastIndex)   // 8
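Because each call to exec() with the g flag picks up at lastIndex, calling it in a loop walks through every match. A minimal sketch of that loop follows; the results array and its field names are made up for this example.

```javascript
const text = "cat, bat, sat, fat";
const pattern = /.at/g;
const results = [];

let match;
// exec() returns null once no match is left, and resets lastIndex to 0
while ((match = pattern.exec(text)) !== null) {
  results.push({ value: match[0], index: match.index, next: pattern.lastIndex });
}

// results holds cat (index 0), bat (5), sat (10) and fat (15)
const finalLastIndex = pattern.lastIndex; // 0 after the failed last call
console.log(results, finalLastIndex);
```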

Sticky flag y

With y, a match is attempted only at the position given by lastIndex; the search does not scan forward from there.

let text = "_aa_a"
let pattern = /_a+/y
pattern.exec(text)  // ['_aa', index: 0, input: '_aa_a', groups: undefined]

y

If we instead try to match "aa" or "a", exec() returns null, because matching must start exactly at lastIndex (0). Only substrings that begin at index 0 are candidates: ["_", "_a", "_aa", "_aa_", "_aa_a"].

let text = "_aa_a"
let pattern = /a+/y
pattern.exec(text)             // null
console.log(pattern.lastIndex) // 0

The match can be found by setting the pattern's lastIndex first:

pattern.lastIndex = 1
pattern.exec(text)           // ['aa', index: 1, input: '_aa_a', groups: undefined]
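One place where matching at an exact position is useful is a simple tokenizer. The sketch below is only an illustration: the token types, the tokenize function, and the input are all hypothetical and not from the original article.

```javascript
// Hypothetical token definitions, each with the sticky flag
const tokenTypes = [
  ["number", /\d+/y],
  ["operator", /[+\-*\/]/y],
  ["space", /\s+/y],
];

function tokenize(input) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    let matched = false;
    for (const [type, re] of tokenTypes) {
      re.lastIndex = pos;            // try to match exactly at pos
      const m = re.exec(input);
      if (m) {
        if (type !== "space") tokens.push({ type, value: m[0] });
        pos = re.lastIndex;          // continue right after the token
        matched = true;
        break;
      }
    }
    if (!matched) throw new SyntaxError("Unexpected character at " + pos);
  }
  return tokens;
}

const tokens = tokenize("12 + 3");
console.log(tokens); // number 12, operator +, number 3
```

Without y, each pattern would scan ahead and could skip over characters that no token type accepts, so invalid input would go unnoticed.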

g

With g, the search scans forward from lastIndex, so every substring is a candidate: ["_", "_a", "_aa", "_aa_", "_aa_a", "a", "aa", "aa_", "aa_a", "a", "a_", "a_a", "_", "_a", "a"].

test()

test() checks whether the pattern matches the string and returns true or false.

let a = "_aa_a"
let pattern = /_a+/g
pattern.test(a)             // true
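Like exec(), test() advances lastIndex when the pattern has the g flag, so repeated calls on the same string do not always return the same value. A short sketch of that behavior, using the same string and pattern as above:

```javascript
const a = "_aa_a";
const pattern = /_a+/g;

const first = pattern.test(a);         // true, matched "_aa"
const afterFirst = pattern.lastIndex;  // 3
const second = pattern.test(a);        // true, matched "_a"
const afterSecond = pattern.lastIndex; // 5
const third = pattern.test(a);         // false, nothing left after index 5
const afterThird = pattern.lastIndex;  // 0, reset after the failed match

// Without g (or y), test() always starts from the beginning
const stateless = /_a+/;
const repeat = [stateless.test(a), stateless.test(a), stateless.test(a)];
console.log(first, second, third, repeat);
```

If the only goal is a yes/no answer, leaving out the g flag avoids this state.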

toLocaleString() and toString()

Both methods return the literal form of the regular expression, regardless of how it was created.

let pattern = new RegExp("\\[bc\\]at", "gi");
pattern.toString()          // '/\[bc\]at/gi'
pattern.toLocaleString()    // '/\[bc\]at/gi'

valueOf()

Returns the regular expression itself.

5.2.3 RegExp constructor properties

The following properties of the RegExp constructor expose information about the most recent operation performed by exec() or test():

  • input (short name $_): the string that was last searched
  • lastMatch (short name $&): the text that was last matched
  • lastParen (short name $+): the last matched capture group
  • leftContext (short name $`): the text in the input string before lastMatch
  • rightContext (short name $'): the text in the input string after lastMatch

let text = "this has been a short summer";
let pattern = /(.)hort/g
if (pattern.test(text)) {
    console.log(RegExp.input);          // this has been a short summer
    console.log(RegExp.$_);             // same as above
    console.log(RegExp.leftContext);    // this has been a
    console.log(RegExp["$`"]);          // same as above
    console.log(RegExp.rightContext);   // summer
    console.log(RegExp.lastMatch);      // short
    console.log(RegExp.lastParen);      // s
}
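The constructor properties are shared global state, so any later match overwrites them. The same information can also be derived from the array returned by exec(). This is a sketch under that assumption, with local variable names that simply mirror the property names.

```javascript
const text = "this has been a short summer";
const pattern = /(.)hort/g;
const match = pattern.exec(text);

const input = match.input;                                       // the whole text
const lastMatch = match[0];                                      // "short"
const lastParen = match[match.length - 1];                       // "s", the last capture group
const leftContext = text.slice(0, match.index);                  // "this has been a "
const rightContext = text.slice(match.index + match[0].length);  // " summer"
console.log(input, lastMatch, lastParen, leftContext, rightContext);
```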