“If you had a problem you thought you could solve with regular expressions, you now have two problems.” 🤷‍♀️

Bronze – Regex fundamentals

Regular expressions are patterns used to match combinations of characters in strings.

Creating regular expressions

  • Using a regular expression literal: /ab+c/g
  • By calling the constructor of the RegExp object: new RegExp("ab+c", "g")
    • It accepts two arguments: the first is a string or a regular expression, the second is the flags string
    • If the first argument is a regular expression and a second argument is given, only the flags from the second argument are used and the flags of the original regular expression are ignored (ES6 extension)

The RegExp object

1. Instance properties and methods

  • RegExp.prototype.exec(str)
  • RegExp.prototype.test(str)
  • RegExp.prototype.flags (ES6) returns the flags of the regular expression
  • RegExp.prototype.sticky (ES6) indicates whether the y flag is set…

2. Per-instance property

  • regexp.lastIndex (an own property of each RegExp instance, not a static property of the RegExp constructor)

3. String methods

Six String methods accept regular expressions

  • str.search(regexp)

  • str.match(regexp) returns an array

  • str.replace(regexp|substr, newSubStr|function)

  • str.split(separator), where the separator is a string or a regexp

  • str.matchAll(regexp) – new in ES2020

  • str.replaceAll() – new in ES2021

4. RegExp.prototype.test(str) and String.prototype.search(regexp)

  • test() checks whether the regular expression matches the given string and returns true or false.
  • The search() method of String is similar, but returns the index of the first match, or -1 if there is none
let str = "hello world!";
/world/.test(str); // true
str.search(/world/); // 6

To get more detailed information about a match (at the cost of slower execution), use the exec() method ⬇️

5. RegExp.prototype.exec(str) and String.prototype.match(regexp)

  • exec() searches the given string for a match. It returns an array on success and null on failure, and (when the g or y flag is set) updates the lastIndex property of the regular expression object
    • The array consists of: the matched text as the first item, the contents of the capture groups from the second item on, and extra properties (index: the index of the match, input: the original string, groups: the named capture groups)
  • match() also returns an array containing the first full match and its capture groups (the same result as exec())
  • When match() is used with the g flag, all matches are returned (without capture groups)
let str = "hello world world!";
// Both return the same result
/world/.exec(str); // ["world", index: 6, input: "hello world world!", groups: undefined]
str.match(/world/);

// With the g flag, all matching results are returned
str.match(/world/g); // ["world", "world"]
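
As a rough illustration of the array described above, the sketch below reads each part of an exec() result. The pattern and input string are made up for this example.

```javascript
// Hypothetical pattern: word characters, an "@", then more word characters.
const emailLike = /(\w+)@(\w+)/;
const result = emailLike.exec("mail: bob@example now");

const fullMatch = result[0];   // the matched text
const firstGroup = result[1];  // contents of capture group 1
const secondGroup = result[2]; // contents of capture group 2
const position = result.index; // where the match starts in the input
const source = result.input;   // the original string

console.log(fullMatch, firstGroup, secondGroup, position, source);

// exec() returns null when nothing matches, so check before indexing.
const noMatch = emailLike.exec("no at sign here");
console.log(noMatch);
```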

Writing regular expressions

A regular expression consists of simple characters and special characters.

1. Six optional flags

Regular expressions have six optional flags that enable global search, case-insensitive search, and so on

  • g: global search
  • i: case-insensitive search (ignoreCase)
  • m: multiline search
  • s: allows . to match newline characters (ES2018)
  • u: Unicode pattern matching (ES6)
  • y: performs a “sticky” search, matching from the current position in the target string (ES6)

Syntax:

  • let regExp = /pattern/flags;
  • let regExp = new RegExp("pattern", "flags");
let str = "Hello World!";
/world/i.test(str); // true
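
The m flag is the only one in the list without an example elsewhere, so here is a minimal sketch of its effect on the ^ anchor. The sample text is invented.

```javascript
const text = "first line\nsecond line";

// Without m, ^ only matches at the very start of the input.
const withoutM = text.match(/^\w+/g);

// With m, ^ also matches right after each line break.
const withM = text.match(/^\w+/gm);

console.log(withoutM); // ["first"]
console.log(withM);    // ["first", "second"]
```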

2. Special characters

Characters with special meaning in regular expressions fall into these categories: special single characters, quantifiers, ranges/groups, assertions, and Unicode property escapes.

1. Special single characters

  • . matches any character (except line terminators)
  • \d matches a digit => [0-9]
  • \D matches a non-digit => [^0-9]
  • \w matches a word character (letters, digits, and underscore) => [A-Za-z0-9_]
  • \W matches a non-word character => [^A-Za-z0-9_]
  • \s matches a whitespace character, including spaces, tabs, form feeds, and line feeds
  • \S matches a non-whitespace character
  • \b matches a word boundary
  • \r matches a carriage return
  • \n matches a newline character
  • \uhhhh matches the Unicode character given by four hexadecimal digits
  • \u{hhhh} matches the Unicode character given by its hexadecimal code point (new in ES6, requires the u flag)
let str = "He played the King in a8 and she moved her Queen in c2.";
str.match(/\w\d/g); // ["a8", "c2"]

// Match Unicode characters
str = "happy 🙂, confused 😕, sad 😢";
let reg = /[\u{1F600}-\u{1F64F}]/gu;
str.match(reg); // ['🙂', '😕', '😢']

// Match Chinese characters with [\u4e00-\u9fa5]
str = "123我是456中文";
reg = /[\u4e00-\u9fa5]/g;
str.match(reg); // ["我", "是", "中", "文"]

2. Quantifiers

  • * matches zero or more times => {0,}

  • + matches one or more times => {1,}

  • ? matches 0 or 1 times (optional, somewhat like an optional property in TypeScript) => {0,1}

  • {n} matches exactly n times

  • {n,m} matches at least n times and at most m times

  • {n,} matches at least n times

// Rule: one or more word characters followed by a whitespace character, global, case-insensitive
let re = /\w+\s/gi;
"fee fi fo fum".match(re); // ["fee ", "fi ", "fo "]
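
The counted quantifiers can be compared side by side with a small sketch. The input strings are invented, and \b is used so that each run of digits is tested as a whole.

```javascript
const digits = "1 22 333 4444 55555";

const exactlyThree = digits.match(/\b\d{3}\b/g);   // {n}
const twoToFour = digits.match(/\b\d{2,4}\b/g);    // {n,m}
const fourOrMore = digits.match(/\b\d{4,}\b/g);    // {n,}

// ? makes the preceding character optional.
const spellings = "color colour".match(/colou?r/g);

console.log(exactlyThree); // ["333"]
console.log(twoToFour);    // ["22", "333", "4444"]
console.log(fourOrMore);   // ["4444", "55555"]
console.log(spellings);    // ["color", "colour"]
```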

3. Ranges and groups

  • [xyz] character set: matches any one character inside the square brackets; a dash (-) specifies a range
  • [^xyz] negated character set: matches any character not inside the square brackets
  • x|y matches x or y
let str = "The Caterpillar and Alice looked at each other";
let reg = /\b[a-df-z]+\b/gi;
str.match(reg);  // ["and", "at"]
  • (x) 1. groups 2. captures: matches x and remembers the match. Use \n inside the pattern to reference the nth captured group, and $n in a replacement string.
  • (?:x) non-capturing parentheses: the matched substring is not remembered, which saves a little overhead
let reg = /(apple) (banana) \1 \2/;
"apple banana apple banana apple banana".match(reg); // ["apple banana apple banana", "apple", "banana", index: 0, ...]

let nameReg = /(\w+)\s(\w+)/;
let str = "John Smith";
str.replace(nameReg, "$2, $1"); // "Smith, John"
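
To make the difference between (x) and (?:x) concrete, this small sketch compares the result arrays. The date-like input is just an example.

```javascript
// Two capturing groups: the result holds the full match plus two groups.
const withCapture = "2024-05".match(/(\d{4})-(\d{2})/);

// The first group is non-capturing, so only one group is remembered.
const nonCapturing = "2024-05".match(/(?:\d{4})-(\d{2})/);

console.log(withCapture.length, withCapture[1], withCapture[2]); // 3 "2024" "05"
console.log(nonCapturing.length, nonCapturing[1]);               // 2 "05"
```

Note that a non-capturing group also shifts the numbering: what was $2 becomes $1.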

4. Assertions – mainly boundary checks

  • ^ matches the beginning of the input (note: inside a character set, as in [^xyz], it means negation)

  • $ matches the end of the input

  • \b matches a word boundary

  • x(?=y) lookahead assertion: matches x only if it is followed by y, e.g. /Jack(?=Sparrow)/ matches "Jack" only when "Sparrow" follows

  • (?<=y)x lookbehind assertion (ES2018): matches x only if it is preceded by y, e.g. /(?<=Jack)Sparrow/ matches "Sparrow" only when "Jack" precedes it

let str = "https://xxx.xx.com/#/index?type=xx&value=xxx";
let reg = /(?<=\?).+/g;
str.match(reg); // ['type=xx&value=xxx']

// Conditional filtering
let oranges = ["ripe orange A", "green orange B", "ripe orange C"];
oranges.filter((item) => item.match(/(?<=ripe )orange/)); // ["ripe orange A", "ripe orange C"]
  • x(?!y) negative lookahead: matches x only if it is not followed by y, e.g. /Jack(?!Sparrow)/
  • (?<!y)x negative lookbehind (ES2018): matches x only if it is not preceded by y, e.g. /(?<!Jack)Sparrow/
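
The two negative assertions have no example above, so here is a small sketch using replace() to show which occurrences they select. The sentences are invented, and the lookbehind requires an ES2018-capable runtime.

```javascript
// Negative lookahead: uppercase "Jack" only when " Sparrow" does NOT follow.
const lookahead = "Jack Sparrow and Jack Frost".replace(/Jack(?! Sparrow)/g, "JACK");

// Negative lookbehind: uppercase "Sparrow" only when "Jack " does NOT precede it.
const lookbehind = "Jack Sparrow and Captain Sparrow".replace(/(?<!Jack )Sparrow/g, "SPARROW");

console.log(lookahead);  // "Jack Sparrow and JACK Frost"
console.log(lookbehind); // "Jack Sparrow and Captain SPARROW"
```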

Silver – Intermediate regex

Here are some of the flags (several of them new in ES6) and their corresponding properties

The g flag and the lastIndex property

  • lastIndex specifies the index at which the next match starts. It only takes effect when the g (or y) flag is set
  • With the g flag set, the RegExp object is stateful: it records the position just after the last successful match in its lastIndex property
  • exec() and test() update the lastIndex property after a successful match and reset it to 0 when the match fails
let regExp = /ab*/g;
regExp.exec("abbcdefabh"); // ['abb', index: 0]
regExp.lastIndex; // 3
// Continue matching
regExp.exec("abbcdefabh"); // ['ab', index: 7]
regExp.lastIndex; // 9
// Continue matching
regExp.exec("abbcdefabh"); // null
regExp.lastIndex; // 0

Because of this behavior, exec() or test() can be called in a loop to walk through a string and find every match

let reg = /ab*/g;
let str = "abbcdefabh";
let arr = [];
while ((arr = reg.exec(str)) !== null) {
  console.log(arr, reg.lastIndex);
}
// match() returns only the matched text
str.match(reg); // ['abb', 'ab']
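
One practical consequence of this statefulness is that calling test() repeatedly on the same g-flagged regex can give alternating results. A minimal sketch, with an illustrative variable name:

```javascript
const stateful = /a/g;

const first = stateful.test("a");  // matches at index 0, lastIndex becomes 1
const second = stateful.test("a"); // search starts at index 1, fails, lastIndex resets to 0
const third = stateful.test("a");  // starts from 0 again and matches

console.log(first, second, third); // true false true
```

If only a yes/no answer is needed, dropping the g flag (or resetting lastIndex to 0 before each call) avoids this surprise.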

The y flag and the sticky property (ES6)

  • y, also known as the “sticky” flag, behaves like g in that each match starts where the previous successful match ended
  • The difference: the g flag succeeds as long as there is a match anywhere in the remaining text, while the y flag requires the match to begin exactly at the first remaining position, hence “sticky”.
let regExp = /ab*/y;
regExp.exec("abbcdefabh"); // ['abb', index: 0]
regExp.lastIndex; // 3
// Continue matching
regExp.exec("abbcdefabh"); // null
regExp.lastIndex; // 0

regExp.sticky; // true, the y flag is set

In other words, the y flag implies a start anchor. It was designed so that the start anchor ^ effectively applies at every step of a global match.
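
The implied anchor can be seen by setting lastIndex by hand. The sketch below uses an invented input and compares y with g.

```javascript
const input = "barfoo";

const stickyRe = /foo/y;
const atStart = stickyRe.test(input); // false: there is no "foo" exactly at index 0

stickyRe.lastIndex = 3;
const atThree = stickyRe.test(input); // true: "foo" begins exactly at index 3
const lastIndexAfter = stickyRe.lastIndex; // 6, just past the match

const globalRe = /foo/g;
const anywhere = globalRe.test(input); // true: g may skip ahead to find a match

console.log(atStart, atThree, lastIndexAfter, anywhere);
```

This is why y suits tokenizers, where each token must start exactly where the previous one ended.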

The u flag and the unicode property (ES6)

  • The u flag enables correct handling of Unicode characters above \uFFFF (without it, \uhhhh only matches a single code unit given by four hexadecimal digits)
  • The unicode property indicates whether the u flag is set

/^\uD83D/.test('\uD83D\uDC2A') // true: without u, the surrogate pair "\uD83D\uDC2A" is treated as two characters
/^\uD83D/u.test('\uD83D\uDC2A') // false: with u, "\uD83D\uDC2A" is treated as a single character

let r = /hello/u;
r.unicode; // true
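
The same effect shows up with the dot: without u, a character above \uFFFF counts as two code units. A minimal sketch, using an emoji as the sample character:

```javascript
const smiley = "🙂"; // U+1F642, stored as a surrogate pair

const codeUnits = smiley.length;        // number of UTF-16 code units
const codePoints = [...smiley].length;  // number of code points

// /^.$/ asks for exactly one character between start and end.
const withoutU = /^.$/.test(smiley);
const withU = /^.$/u.test(smiley);

console.log(codeUnits, codePoints, withoutU, withU); // 2 1 false true
```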

The s flag and the dotAll property (ES2018)

  • In ES5, . matches any character except line terminators
  • ES2018 added the s flag so that . can match any single character, including line terminators. This is called dotAll mode.
/foo.bar/.test("foo\nbar"); // false
// ES2018
/foo.bar/s.test("foo\nbar"); // true
/foo.bar/s.dotAll; // true

Gold – Regex in depth

Named capture groups

1. Group matching

// exec() returns an array whose first item is the matched text; from the second item on, each item is the text matched by the corresponding capture group
let regex = /(\d{4})-(\d{2})-(\d{2})/;
regex.exec("1999-12-31"); // ["1999-12-31", "1999", "12", "31", index: 0, groups: undefined]

It is hard to tell what each group means, and groups can only be referenced by number (\n)

2. Named capture groups (ES2018)

Named groups let you give each group a name, which makes the code easier to read and the groups easier to reference.

Syntax: /(?<groupName>x)/

let regex = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
regex.exec("1999-12-31");
// ["1999-12-31", "1999", "12", "31", index: 0, groups: {year: "1999", month: "12", day: "31"}]

3. Destructuring assignment

Destructure the result returned by exec() directly

let {
  groups: { one, two },
} = /^(?<one>.*):(?<two>.*)$/u.exec("foo:bar");
one; // foo
two; // bar

4. Replacement

When replacing, reference a named group with $<groupName>

let re = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/u;

"2015-01-02".replace(re, "$<day>/$<month>/$<year>");
// '02/01/2015'

(New) String methods

String.prototype.matchAll(regexp) (ES2020)

  • matchAll() retrieves all matches at once, including their capture groups. It returns an iterator.
  • The regular expression must have the g flag, otherwise a TypeError is thrown

Before matchAll(), collecting every match with its capture groups meant calling regexp.exec() in a loop. With matchAll(), the while loop around exec() is no longer needed


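let regexp = /t(e)(st(\d?))/g;
let str = "test1test2";

// Using match(): full matches only
str.match(regexp); // ['test1', 'test2']

// Using exec(): one match per call, with capture groups
regexp.exec(str); // ["test1", "e", "st1", "1", index: 0]

// matchAll() starts from regexp.lastIndex, so reset it after the exec() call above
regexp.lastIndex = 0;
// Using matchAll(): every match, each with its capture groups
[...str.matchAll(regexp)]; // [Array(4), Array(4)]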
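
Since matchAll() returns an iterator, it also works directly with for...of. The sketch below collects the parts of each match into plain objects; the property names text, digit and index are chosen for this example only.

```javascript
const pattern = /t(e)(st(\d?))/g;
const found = [];

// Each iteration yields an exec()-style array with index and capture groups.
for (const m of "test1test2".matchAll(pattern)) {
  found.push({ text: m[0], digit: m[3], index: m.index });
}

console.log(found);
// [{ text: "test1", digit: "1", index: 0 }, { text: "test2", digit: "2", index: 5 }]
```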

String.prototype.replace(regexp|substr, newSubStr|function)

When the first argument is a regular expression and the second argument is a function:

  • str.replace(regexp, function)
  • The value returned by the function is used as the replacement for each regexp match
let re = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/u;
"2015-01-02".replace(
  re,
  (
    matched, // the full match
    capture1, // capture group 1 (one parameter per capture group, in order)
    capture2, // capture group 2
    capture3, // capture group 3
    index, // offset of the match in the string
    input, // the original string
    groups // object holding the named groups
  ) => {
    console.log(matched, capture1, capture2, capture3, index, input, groups);
    let { day, month, year } = groups;
    return `${day}/${month}/${year}`;
  }
); // "02/01/2015"

String.prototype.replaceAll(regexp|substr, newSubstr|function) (ES2021)

  • Replaces all matches in a single call
  • When the first argument is a regular expression, it must have the g flag. A function passed as the second argument works the same way as in replace()
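
A short sketch of how replaceAll() compares with replace(), assuming a runtime that supports ES2021. The sample string is invented.

```javascript
const dashed = "a-b-c";

// With a string pattern, replace() changes only the first occurrence.
const firstOnly = dashed.replace("-", "+");

// replaceAll() changes every occurrence.
const all = dashed.replaceAll("-", "+");

// A regex argument must carry the g flag.
const viaRegex = dashed.replaceAll(/-/g, "+");

// Without g, replaceAll() throws a TypeError.
let threwTypeError = false;
try {
  dashed.replaceAll(/-/, "+");
} catch (err) {
  threwTypeError = err instanceof TypeError;
}

console.log(firstOnly, all, viaRegex, threwTypeError);
```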
