“This is the 8th day of my participation in the Gwen Challenge in November. Check out the details: The Last Gwen Challenge in 2021”
RegExp
Preface
Regular expressions matter: almost every language supports them, and ECMAScript does so through the RegExp type. They show up everywhere in practice, for example when routing a URL to the matching page or when extracting a piece of text from a paragraph.
Regular expression matching
- The pattern can be any simple or complex regular expression (character classes, quantifiers, grouping, lookahead, and backreferences).
- A regular expression can have zero or more flags that control its behavior.
Flags
- g: global mode, which searches the whole string instead of stopping at the first match
- i: case-insensitive mode, which ignores the case of both the pattern and the string when matching
- m: multiline mode, in which the search continues past the end of a line and ^ and $ also match at line boundaries
- y: sticky mode, which only looks for a match that begins exactly at lastIndex
- u: Unicode mode, which enables Unicode matching
- s: dotAll mode, in which the metacharacter . matches any character (including \n and \r)
Flags can also be combined:
let pattern = /at/gi
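As a rough sketch of what the flags change in practice, the snippet below runs a few patterns with and without a flag. The sample string and variable names are made up for illustration.

```javascript
// Sample input with a line break in the middle (made up for this sketch).
const text = "Cat\nbat";

// i: case is ignored, so "cat" finds "Cat".
const ignoreCase = /cat/i.test(text);

// m: ^ and $ also match at line boundaries, so the second line can match.
const withoutM = /^bat$/.test(text);
const withM = /^bat$/m.test(text);

// s: the dot also matches the line break between the two words.
const withoutS = /Cat.bat/.test(text);
const withS = /Cat.bat/s.test(text);

// g: every match is returned, not just the first one.
const allMatches = text.match(/at/g);

console.log(ignoreCase, withoutM, withM, withoutS, withS); // true false true false true
console.log(allMatches); // [ 'at', 'at' ]
```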
A RegExp object can be created either with a literal or with the RegExp constructor.
Literal form
This is the form used above:
let pattern = /at/gi
Metacharacters
- \ marks the next character as a special character, a literal character, a backreference, or an octal escape. For example, \n matches a newline
- ^ matches the beginning of the input string
- $ matches the end of the input string
- * matches the preceding subexpression zero or more times
- + matches the preceding subexpression one or more times
- ? matches the preceding subexpression zero or one time
- . matches any single character except a newline
- x|y matches x or y
- [xyz] matches any one of the enclosed characters
- [^xyz] matches any character that is not enclosed
- [a-z] matches any lowercase letter from a to z
- \d matches a digit, equivalent to [0-9]
- \D matches a non-digit character, equivalent to [^0-9]
// Match the first "bat" or "cat", ignoring case
let pa1 = /[bc]at/i
// Match the first "[bc]at", ignoring case
let pa2 = /\[bc\]at/i
// Match every three-character combination ending in "at", ignoring case
let pa3 = /.at/gi
// Match every ".at", ignoring case
let pa4 = /\.at/gi
To match a metacharacter literally, you must escape it with a backslash.
RegExp constructor
With the constructor, both the pattern and the flags are passed as strings:
let pat = new RegExp("[bc]at", "i")
Because the pattern is a string, metacharacters need a second level of escaping: \ becomes \\, so \n becomes \\n.
Literal pattern        Equivalent string
/\[bc\]at/             "\\[bc\\]at"
/\.at/                 "\\.at"
/name\/age/            "name\\/age"
/\d.\d{1,2}/           "\\d.\\d{1,2}"
/\w\\hello\\123/       "\\w\\\\hello\\\\123"
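One way to convince yourself that the literal and string forms are equivalent is to run both against the same input. The sketch below does that for each pair; the sample strings and the names pairs and checks are assumptions added for illustration.

```javascript
// Each entry: [literal, equivalent constructor string, sample input (made up)].
const pairs = [
  [/\[bc\]at/, "\\[bc\\]at", "[bc]at"],
  [/\.at/, "\\.at", "x.at"],
  [/name\/age/, "name\\/age", "name/age"],
  [/\d.\d{1,2}/, "\\d.\\d{1,2}", "1x23"],
  [/\w\\hello\\123/, "\\w\\\\hello\\\\123", "a\\hello\\123"],
];

// Build each pattern from its string and compare the matched text.
const checks = pairs.map(([literal, patternString, sample]) => {
  const built = new RegExp(patternString);
  const fromLiteral = literal.exec(sample);
  const fromString = built.exec(sample);
  return fromLiteral !== null && fromString !== null && fromLiteral[0] === fromString[0];
});

console.log(checks); // [ true, true, true, true, true ]
```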
5.2.1 RegExp Instance Properties
These properties give a full picture of a regular expression, but they are rarely needed in day-to-day development. For example, I seldom need to ask a pattern whether it was created with the g flag.
- global: Boolean value indicating whether the g flag is set.
- ignoreCase: Boolean value indicating whether the i flag is set.
- unicode: Boolean value indicating whether the u flag is set.
- sticky: Boolean value indicating whether the y flag is set.
- lastIndex: integer giving the position in the source string where the next search starts; it always begins at 0.
- multiline: Boolean value indicating whether the m flag is set.
- dotAll: Boolean value indicating whether the s flag is set.
- source: the literal string of the regular expression (not the pattern string passed to the constructor), without the leading and trailing slashes.
- flags: the flag string of the regular expression, always returned in literal form rather than as the string passed to the constructor (without the leading and trailing slashes).
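To see these properties on a concrete object, here is a small sketch that builds one pattern with the constructor and reads each property back. The variable name is arbitrary.

```javascript
// Built from a string, but the properties report the literal form.
const pattern = new RegExp("\\[bc\\]at", "gi");

console.log(pattern.global);     // true
console.log(pattern.ignoreCase); // true
console.log(pattern.multiline);  // false
console.log(pattern.sticky);     // false
console.log(pattern.lastIndex);  // 0
console.log(pattern.source);     // \[bc\]at
console.log(pattern.flags);      // gi
```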
5.2.2 RegExp Instance Methods
Here we introduce two methods, exec() and test().
exec()
Parameter: the string to apply the pattern to. It returns an array describing the first match, or null if nothing matches.
The pattern below has two capture groups: the inner one matches " and baby", and the outer one matches " and dad" or " and dad and baby".
let txt = "mom and dad and baby"
// Note the leading space inside each group
let pattern = /mom( and dad( and baby)?)?/gi
let match = pattern.exec(txt)
// ['mom and dad and baby', ' and dad and baby', ' and baby', index: 0, input: 'mom and dad and baby', groups: undefined]
- match[0]: the whole matched string
- match[1]: the string matched by the first capture group
- match[2]: the string matched by the second capture group (when the pattern has more than one)
- match.input: the string that was searched
- match.index: the index in the string where the match starts
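The sketch below shows one convenient way to read the result: destructuring the array part, and using named capture groups so that the groups property is filled in. The group names parent and child are made up for this example.

```javascript
const txt = "mom and dad and baby";

// Destructure the whole match and the two capture groups.
const match = /mom( and dad( and baby)?)?/i.exec(txt);
const [whole, first, second] = match;
console.log(whole);  // mom and dad and baby
console.log(first);  // " and dad and baby"
console.log(second); // " and baby"

// With named groups (names are hypothetical), groups is an object instead of undefined.
const named = /mom(?<parent> and dad)(?<child> and baby)?/i.exec(txt);
console.log(named.groups.parent); // " and dad"
console.log(named.groups.child);  // " and baby"

// When nothing matches, exec() returns null.
const miss = /uncle/.exec(txt);
console.log(miss); // null
```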
Think about it 🤔
Why does this array contain both strings and key-value pairs? Did I read that wrong? Is it still an array? After all, an array literal does not accept key-value pairs:
let arr = [1, 2, "test": 11] // Uncaught SyntaxError: Unexpected token ':'
Yes, it is a normal array that has had some extra properties assigned to it. Arrays are objects, so besides their numeric indexes they can carry any key-value pair, although you will almost never see this in clean code (the result of a regular expression match is about the only common case of an array with non-index properties). You can build a similar array like this:
let arr = [1, 2]
arr.input = "test" // [1, 2, input: 'test']
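A short sketch of how such an array behaves: the extra property is reachable by name but does not count as an element. The names arr and match here are only for illustration.

```javascript
// An ordinary array with one extra, non-index property.
const arr = [1, 2];
arr.input = "test";

console.log(Array.isArray(arr));  // true
console.log(arr.length);          // 2, the extra property is not an element
console.log(arr.input);           // test
console.log(JSON.stringify(arr)); // [1,2], only indexed elements are serialized

// The result of exec() is the same kind of object.
const match = /.at/.exec("cat, bat");
console.log(Array.isArray(match)); // true
console.log(match.length);         // 1
console.log("index" in match);     // true
```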
Global flag g
What exactly does global matching mean? On the first call, exec() returns the same result whether or not the pattern is global:
let text = "cat, bat, sat, fat"
let nogpattern = /.at/
nogpattern.exec(text)
// ['cat', index: 0, input: 'cat, bat, sat, fat', groups: undefined]
let havegpattern = /.at/g
havegpattern.exec(text)
// ['cat', index: 0, input: 'cat, bat, sat, fat', groups: undefined]
That is because exec() only returns one match per call; it has to be called again to continue searching. That is where the effect of g shows up.
Without g
let text = "cat, bat, sat, fat"
let nog = /.at/
nog.exec(text) // ['cat', index: 0, input: 'cat, bat, sat, fat', groups: undefined]
console.log(nog.lastIndex) // 0, lastIndex does not move
nog.exec(text) // ['cat', index: 0, input: 'cat, bat, sat, fat', groups: undefined]
console.log(nog.lastIndex) // 0
With g
let text = "cat, bat, sat, fat"
let haveg = /.at/g
haveg.exec(text) // ['cat', index: 0, input: 'cat, bat, sat, fat', groups: undefined]
console.log(haveg.lastIndex) // 3
haveg.exec(text) // ['bat', index: 5, input: 'cat, bat, sat, fat', groups: undefined]
console.log(haveg.lastIndex) // 8
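Since each call to exec() on a global pattern continues from lastIndex, a common way to collect every match is to loop until it returns null. The following is a minimal sketch of that loop; the found array and its field names are assumptions for illustration.

```javascript
const text = "cat, bat, sat, fat";
const pattern = /.at/g;

// Collect each match together with where it started and where the next search begins.
const found = [];
let match;
while ((match = pattern.exec(text)) !== null) {
  found.push({ text: match[0], index: match.index, next: pattern.lastIndex });
}

console.log(found.map((m) => m.text));  // [ 'cat', 'bat', 'sat', 'fat' ]
console.log(found.map((m) => m.index)); // [ 0, 5, 10, 15 ]
console.log(found.map((m) => m.next));  // [ 3, 8, 13, 18 ]

// After the failed final call, lastIndex is reset to 0.
console.log(pattern.lastIndex); // 0
```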
Sticky flag y
With y, exec() only looks for a match that begins exactly at lastIndex.
let text = "_aa_a"
let pattern = /_a+/y
pattern.exec(text) // ['_aa', index: 0, input: '_aa_a', groups: undefined]
y
But if we try to match "aa" or "a" instead, we get null, because the match has to start at lastIndex (0). The only candidates are the substrings that begin at index 0: ["_", "_a", "_aa", "_aa_", "_aa_a"].
let text = "_aa_a"
let pattern = /a+/y
pattern.exec(text) // null
console.log(pattern.lastIndex) // 0
The match can be found by setting the pattern's lastIndex first:
pattern.lastIndex = 1
pattern.exec(text) // ['aa', index: 1, input: '_aa_a', groups: undefined]
g
With g, the search may begin at lastIndex or at any later position, so the candidates are all substrings of the text: ["_", "_a", "_aa", "_aa_", "_aa_a", "a", "aa", "aa_", "aa_a", "a", "a_", "a_a", "_", "_a", "a"].
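To compare the two flags side by side, the sketch below runs the same pattern as sticky and as global from the same starting positions. The variable names are made up for illustration.

```javascript
const text = "_aa_a";
const stickyPattern = /a+/y;
const globalPattern = /a+/g;

// Both start at index 0, where the character is "_".
const stickyAt0 = stickyPattern.exec(text); // null: y refuses to skip ahead
const globalAt0 = globalPattern.exec(text); // finds "aa" at index 1

// Both start at index 3, which is "_" again.
stickyPattern.lastIndex = 3;
globalPattern.lastIndex = 3;
const stickyAt3 = stickyPattern.exec(text); // null
const globalAt3 = globalPattern.exec(text); // finds "a" at index 4

// y only succeeds when lastIndex points right at the match.
stickyPattern.lastIndex = 4;
const stickyAt4 = stickyPattern.exec(text); // finds "a" at index 4

console.log(stickyAt0, globalAt0[0], globalAt0.index); // null aa 1
console.log(stickyAt3, globalAt3[0], globalAt3.index); // null a 4
console.log(stickyAt4[0], stickyAt4.index);            // a 4
```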
test()
This method checks whether the pattern matches the string. It returns true or false.
let a = "_aa_a"
let pattern = /_a+/g
pattern.test(a) // true
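One thing worth knowing when test() is used with the g flag: like exec(), it advances lastIndex, so repeated calls on the same string do not always return the same value. A small sketch, with made-up variable names:

```javascript
const text = "_aa_a";

// With g, each call continues from where the previous match ended.
const globalPattern = /_a+/g;
const globalResults = [
  globalPattern.test(text), // matches "_aa", lastIndex becomes 3
  globalPattern.test(text), // matches "_a", lastIndex becomes 5
  globalPattern.test(text), // nothing left, lastIndex resets to 0
];
console.log(globalResults); // [ true, true, false ]

// Without g, every call starts from the beginning.
const plainPattern = /_a+/;
const plainResults = [
  plainPattern.test(text),
  plainPattern.test(text),
  plainPattern.test(text),
];
console.log(plainResults); // [ true, true, true ]
```

If only a yes-or-no answer is needed, leaving out g avoids this surprise.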
toLocaleString() and toString()
Both return the literal form of the regular expression, no matter how it was created.
let pattern = new RegExp("\\[bc\\]at", "gi");
pattern.toString() // '/\[bc\]at/gi'
pattern.toLocaleString() // '/\[bc\]at/gi'
valueOf()
Returns the regular expression itself.
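A tiny sketch to confirm this: valueOf() hands back the same object, while converting to a string goes through toString().

```javascript
const pattern = new RegExp("\\[bc\\]at", "gi");

// valueOf() returns the very same RegExp object, not a copy or a string.
const sameObject = pattern.valueOf() === pattern;
console.log(sameObject);               // true
console.log(typeof pattern.valueOf()); // object

// String conversion uses toString(), which gives the literal form.
const asString = String(pattern);
console.log(asString); // /\[bc\]at/gi
```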
5.2.3 RegExp Constructor Properties
The following properties let you extract information about the last operation performed by exec() or test().
- input (short name $_): the string that was searched last
- lastMatch (short name $&): the text that was matched last
- lastParen (short name $+): the capture group that was matched last
- leftContext (short name $`): the text that appears before lastMatch in the input string
- rightContext (short name $'): the text that appears after lastMatch in the input string
let text = "this has been a short summer";
let pattern = /(.)hort/g
if (pattern.test(text)) {
  console.log(RegExp.input); // this has been a short summer
  console.log(RegExp.$_); // same as above
  console.log(RegExp.leftContext); // this has been a
  console.log(RegExp["$`"]) // same as above
  console.log(RegExp.rightContext); // summer
  console.log(RegExp.lastMatch); // short
  console.log(RegExp.lastParen); // s
}
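The same pieces of information can also be worked out from the array that exec() returns, without touching the constructor. This is only a sketch of that alternative, not something the section above prescribes; the local variable names simply mirror the property names.

```javascript
const text = "this has been a short summer";
const match = /(.)hort/.exec(text);

// Rebuild each value from the match array and the original string.
const input = match.input;
const lastMatch = match[0];
const lastParen = match[match.length - 1];
const leftContext = text.slice(0, match.index);
const rightContext = text.slice(match.index + lastMatch.length);

console.log(input);                      // this has been a short summer
console.log(lastMatch);                  // short
console.log(lastParen);                  // s
console.log(JSON.stringify(leftContext));  // "this has been a "
console.log(JSON.stringify(rightContext)); // " summer"
```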