Regular expressions: the must-know essentials

Single character matching

  • Global matching: the g flag

  • Ignore case: the i flag

  • . matches any single character

  • \ is the escape character: special characters in a pattern have special meanings, so to match one of them literally you need to escape it with \.

  • - The hyphen is special only inside a [] range; a hyphen outside of [] simply matches itself, so no escape is required

const reg = /.\./ig;
const text = 'n.1.';

// Any character followed by a literal dot; exec returns the first match, "n."
console.log(reg.exec(text));
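To see the difference between . and \. side by side, here is a minimal sketch. The sample string and the variable names are made up for illustration and are not taken from the example above.

```javascript
// Hypothetical sample text, chosen so that each pattern behaves differently.
const sample = 'a.b A.C axb';

// Unescaped dot: "a", then any single character, then "b".
const anyChar = sample.match(/a.b/g);

// Escaped dot: only a literal "." is accepted between "a" and "b".
const literalDot = sample.match(/a\.b/g);

// i ignores case, g returns every match instead of only the first one.
const ignoreCase = sample.match(/a\.[bc]/gi);

console.log(anyChar);    // [ 'a.b', 'axb' ]
console.log(literalDot); // [ 'a.b' ]
console.log(ignoreCase); // [ 'a.b', 'A.C' ]
```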

Backreferences

  • A backreference lets a regular expression reuse the text matched by an earlier subexpression
// \1 refers to the text matched by the first subexpression
const reg = /[ ]+(\w+)[ ]+\1/
const text = ` off off of
`
console.log(text.match(reg));

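A common use of \1 is finding words that were typed twice in a row. The sketch below assumes a made-up sentence and uses \b so that only whole words are compared; the names sentence, repeated, found and fixed are illustrative.

```javascript
// Hypothetical input with two accidental repetitions.
const sentence = 'this is is a test of of backreferences';

// (\w+) captures a word, \s+ skips the gap, \1 demands the same word again.
const repeated = /\b(\w+)\s+\1\b/g;

const found = sentence.match(repeated);
const fixed = sentence.replace(repeated, '$1');

console.log(found); // [ 'is is', 'of of' ]
console.log(fixed); // this is a test of backreferences
```

In the replacement string, $1 plays the same role that \1 plays inside the pattern.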
// Match a well-formed HTML heading: the closing tag must repeat the opening tag name
const reg = /<([hH][1-6])>.*<\/\1>/g
const text = `
<h1>hahha</h1>
<h2>hahha</h2>
<h3>haha</h4>
`
console.log(text.match(reg));
  • \1 refers to the match of the first subexpression, \2 to the match of the second subexpression, and so on
const reg = /<([hH][1-6])>(.*?)<\/\1>\2/g
const text = `
<h1>aa</h1>aa
<h2>hahha</h2>
<h3>haha</h4>
`
console.log(text.match(reg));
  • Backreferences in the replacement string
// Wrap each line in an h1 tag
const reg = /(\w+)(\n)/g
const text = `haha
xixi
`
console.log(text.replace(reg,`<h1>$1</h1>$2$2`));

// Convert the telephone number format
const telReg = /(\d{3})-(\d{3})-(\d{3})/g
const telText = '313-555-234'
console.log(telText.replace(telReg,'($1) $2 - $3'));

// Capitalize the first letter
const reg = /(^[a-zA-Z])([a-zA-Z]+)/g
const text = `babel`
// args[1] and args[2] hold the first and second captured groups
console.log(text.replace(reg, (...args) => {
    return args[1].toUpperCase() + args[2];
}));

Matching a set of characters

  • The metacharacters [] define a character set; the set matches any one of its members in the text.
const reg = /[Rr]eg/;
const text = 'Reg'

console.log(reg.exec(text));
  • Inside a character set, +, . and / are treated as ordinary characters, so no escape is required
/[+]/ // matches a literal +
  • Character ranges
  1. /[A-z]/ : the characters that sit between Z and a in the ASCII table, such as [ and ^, are matched as well
/[0-9]/ // 0123456789
/[A-z]/ // rarely used
/[A-Z]/ // A B C D .... Z
/[a-zA-Z0-9]/ // matches any character in a-z, A-Z or 0-9
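The [A-z] pitfall is easy to check directly. In this small sketch the candidate characters and the names loose and strict are chosen for illustration only.

```javascript
// A few characters to test; [ ^ and _ sit between Z (90) and a (97) in ASCII.
const candidates = ['a', 'Z', '[', '^', '_', '1'];

const loose = /[A-z]/;     // one range spanning code points 65 to 122
const strict = /[A-Za-z]/; // two ranges, letters only

const looseHits = candidates.filter((ch) => loose.test(ch));
const strictHits = candidates.filter((ch) => strict.test(ch));

console.log(looseHits);  // [ 'a', 'Z', '[', '^', '_' ]
console.log(strictHits); // [ 'a', 'Z' ]
```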
  • The metacharacter ^ negates a character set
  1. /[^a-z]/ : matches any character that is not in the range a-z

  2. /[^a-z0-9]/ : ^ applies to the entire character set, not just to the first range after it

const reg = /[^A-z]/;
const text = '1'
console.log(reg.exec(text)); // match

const reg2 = /[^a-z0-9]/;
const text2 = '9'

console.log(reg2.exec(text2)); // null
  • Demo: matching RGB values

RGB: a color value written as a six-digit hexadecimal number

const reg = /#[A-Fa-f0-9][A-Fa-f0-9][A-Fa-f0-9][A-Fa-f0-9][A-Fa-f0-9][A-Fa-f0-9]/;
const text = '#FFF333 #CCCCCC'
console.log(reg.exec(text));

Matching a number of repetitions

  • {n}: matches the preceding character exactly n times
const reg = /#[0-9A-Fa-f]{6}/
const text = '#ffffff'
console.log(reg.exec(text)); 
  • {min,max}: matches the preceding character at least min and at most max times

  • {min,}: matches the preceding character at least min times

  • * and + are greedy metacharacters; appending ? gives the lazy version

const reg = /[a]+?/
const text = 'aaa'
console.log(reg.exec(text)); // only one a is matched
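The interval forms {min,max} and {min,} have no example above, so here is a minimal sketch. The digit string and the variable names are assumptions made for this illustration; \b is added only so that each run of digits is treated as a whole.

```javascript
// Hypothetical input: runs of one to five digits.
const digits = '1 12 123 1234 12345';

// Between two and four digits, bounded on both sides.
const twoToFour = digits.match(/\b\d{2,4}\b/g);

// Three digits or more.
const atLeastThree = digits.match(/\b\d{3,}\b/g);

// Intervals are greedy too; a trailing ? makes them lazy.
const greedyInterval = '12345'.match(/\d{2,4}/)[0];
const lazyInterval = '12345'.match(/\d{2,4}?/)[0];

console.log(twoToFour);      // [ '12', '123', '1234' ]
console.log(atLeastThree);   // [ '123', '1234', '12345' ]
console.log(greedyInterval); // 1234
console.log(lazyInterval);   // 12
```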

Takeaways

  1. To match a literal /, escape it as \/

  2. Greedy matching takes as much as possible rather than just enough: a greedy quantifier runs from the start of the match to the last possible end in the text, while the lazy version stops at the first possible end

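The difference between greedy and lazy matching shows up most clearly when a closing delimiter occurs more than once. A minimal sketch, assuming a made-up HTML fragment:

```javascript
// Hypothetical fragment with two <b> elements.
const html = '<b>one</b> and <b>two</b>';

// Greedy: .* runs to the last </b> in the text.
const greedyMatch = html.match(/<b>.*<\/b>/)[0];

// Lazy: .*? stops at the first </b> it can reach.
const lazyMatches = html.match(/<b>.*?<\/b>/g);

console.log(greedyMatch); // <b>one</b> and <b>two</b>
console.log(lazyMatches); // [ '<b>one</b>', '<b>two</b>' ]
```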

Lookahead and lookbehind

  • Lookahead: a subexpression that begins with ?=; the text to match must be followed by the pattern after the =
// (?=:) requires a : to follow, but the match result does not consume the :
const reg = /.+(?=:)/g // with /.+(:)/g the match result would consume the :
const text = `
https://www.baidu.com
http://www.tencent.com
`

console.log(text.match(reg));
  • Lookbehind: a subexpression that begins with ?<=; the text to match must be preceded by the pattern after the =
// Match every number that follows a $
const reg = /(?<=\$)[\d\.]+/g
const text = `
apple: $10
pear: $16
banana: $xxx
`
console.log(text.match(reg));
  • Matching the content of an HTML tag directly
// The lookaround subexpressions are not part of the match result
const reg = /(?<=<(\w+)>).*(?=<\/\1>)/g
const text = `<h1>tag</h1><h3>not match</h2><span>XXXX</span>`
console.log(text.match(reg));

// console.log(text.replace(reg,'$1'));
  • Negative lookahead and negative lookbehind: replace = with !
const reg = /\b(? 
      
const text = 'China City'
console.log(text.match(reg));// city
// Positive lookbehind, for comparison: match the word that follows "China "
const reg = /\b(?<=China\s)[a-zA-Z]+\b/g
const text = 'China City'
console.log(text.match(reg)); // [ 'City' ]
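For the negative forms, a minimal sketch under assumed data may help: the sentence below and all variable names are invented for this illustration. It separates prices (numbers preceded by $) from quantities (numbers not preceded by $).

```javascript
// Hypothetical sentence mixing prices and quantities.
const prices = 'I paid $30 for 100 apples and $5 for 8 pears';

// Positive lookbehind: digits that come right after a $.
const amounts = prices.match(/(?<=\$)\d+/g);

// Negative lookbehind: whole numbers that do NOT come right after a $.
// \b keeps the match from starting in the middle of "30".
const quantities = prices.match(/\b(?<!\$)\d+\b/g);

// Negative lookahead: whole numbers that are NOT followed by " apples".
const notApples = prices.match(/\b\d+\b(?! apples)/g);

console.log(amounts);    // [ '30', '5' ]
console.log(quantities); // [ '100', '8' ]
console.log(notApples);  // [ '30', '5', '8' ]
```

Lookbehind needs an engine that implements ES2018 or later.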

Embedded conditions

JavaScript does not support embedded conditions

  • (?(backreference)true-regex|false-regex)

  • (?(backreference)true-regex)

// Syntax from other regex flavors, shown for illustration only: JavaScript rejects it with a SyntaxError
const reg = /(<a>)?<li>.*<\/li>(?(1)<\/a>|<\/b>)/
const text = `
<li>haha</li>
< a>
<li>whoop</li>
`
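Since JavaScript has no (?(1)yes|no), one workaround is to spell out both branches with an ordinary alternation. The sketch below is an assumption-laden illustration rather than a general recipe: it uses a made-up phone number format in which an opening parenthesis must be closed by ")" and a missing one must be followed by "-", which other flavors could write as (\()?\d{3}(?(1)\)|-)\d{3}-\d{4}.

```javascript
// Each branch of the alternation is one outcome of the condition:
//   "(" was present  -> require ")"
//   "(" was absent   -> require "-"
const phone = /^(?:\(\d{3}\)|\d{3}-)\d{3}-\d{4}$/;

// Hypothetical inputs: two well-formed, two with mismatched punctuation.
const samples = ['(123)456-7890', '123-456-7890', '(123-456-7890', '123)456-7890'];

const valid = samples.filter((s) => phone.test(s));

console.log(valid); // [ '(123)456-7890', '123-456-7890' ]
```

The price of this approach is duplication: the part shared by both branches has to be repeated or factored out by hand.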

Using metacharacters

  • \ Escape character: in a complete regular expression, \ is always followed by another character

  • Whitespace characters: \f, \n, \r…

  • Character classes: match any character that belongs to a category

/\d/ // same as /[0-9]/
/\D/ // same as /[^0-9]/

/\w/ // same as /[a-zA-Z0-9_]/
/\W/ // same as /[^a-zA-Z0-9_]/

/\s/ // any whitespace character
/\S/ // any non-whitespace character
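The shorthand classes can be compared against their spelled-out sets on a small input. The sample string and the variable names below are illustrative assumptions.

```javascript
// Hypothetical input containing letters, digits, punctuation and whitespace.
const sample = 'id_42: 3.5 kg\n';

const digits = sample.match(/\d/g);
const sameDigits = sample.match(/[0-9]/g);

// \w covers letters, digits and the underscore.
const words = sample.match(/\w+/g);

// Two spaces plus the trailing newline.
const whitespaceCount = sample.match(/\s/g).length;

console.log(digits);          // [ '4', '2', '3', '5' ]
console.log(sameDigits);      // [ '4', '2', '3', '5' ]
console.log(words);           // [ 'id_42', '3', '5', 'kg' ]
console.log(whitespaceCount); // 3
```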

Takeaways

  1. On Linux, a line of text ends with \n
const reg = /\w/g
const text =`123w`
console.log(reg.exec(text));

Using subexpressions

Subexpressions group parts of an expression so that they can be treated as a single unit

  • (exp): a subexpression must be wrapped in ()
const reg = /(age){2,}/
const text = 'ageage' // match
console.log(reg.exec(text));

// Match an IP address
const reg = /(((\d{1,2})|(1\d{2})|(2[0-4]\d)|(25[0-5]))\.){3}((\d{1,2})|(1\d{2})|(2[0-4]\d)|(25[0-5]))/
const text = '11.255.11.11'
console.log(reg.exec(text));
  • |: the or operator in a regular expression
// Without grouping, | splits the whole pattern: this matches 19 or 20\d{2}
const looseReg = /19|20\d{2}/

// Match the year: group the alternatives so that \d{2} applies to both
const reg = /(19|20)\d{2}/
const text = '2021'
console.log(reg.exec(text));
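When the goal is to validate a whole string rather than find an address inside a longer text, the pattern can be anchored with ^ and $. The following is a sketch under my own assumptions: the names octet and ipv4 and the input list are hypothetical, and the alternatives are ordered longest first so that a three-digit octet is never cut short.

```javascript
// One octet, 0 to 255, with the three-digit alternatives tried first.
const octet = '(?:25[0-5]|2[0-4]\\d|1\\d{2}|\\d{1,2})';

// Three "octet." groups followed by a final octet, anchored at both ends.
const ipv4 = new RegExp(`^(?:${octet}\\.){3}${octet}$`);

// Hypothetical inputs: two valid, one out of range, one too short.
const inputs = ['11.255.11.11', '256.1.1.1', '1.2.3', '192.168.0.255'];

const validIps = inputs.filter((ip) => ipv4.test(ip));

console.log(validIps); // [ '11.255.11.11', '192.168.0.255' ]
```

Without the anchors, a pattern that tries \d{1,2} first can stop after two digits of a three-digit final octet.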

Position matching

Boundary qualifiers restrict where in the text a match may occur

  • Word boundaries: \b matches at a word boundary, \B matches at a position that is not a word boundary
// Match all English words
const reg = /\b[a-zA-Z]{1,}\b/g;
const text = 'Cat ha ha dog'
console.dir(text.match(reg));

// Match a hyphen that has no word boundary on either side
const reg = /\B-\B/g;
const text = 'a - a'
console.dir(text.match(reg));
  • String boundaries: ^ anchors the match to the start of the string, $ anchors it to the end
const reg = /<\w+>(.*)<\/\w+>/g;
const text = `
<p> hello </p>
<p> world </p>
`
console.dir(text.match(reg));

const reg = /<\w+\s*[\w='"]*>(.*)<\/\w+>/g;
const text = `
<p name="123123"> hello </p>
<p> world </p>
`
console.dir(text.match(reg));
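The two snippets above do not use ^ or $ themselves, so here is a minimal sketch of the anchors. The two-line sample string is invented for this example, and the m (multiline) flag it introduces is not covered elsewhere in this article: with m, ^ and $ also match at the start and end of each line.

```javascript
// Hypothetical two-line input.
const lines = 'hello world\nworld hello';

// Without m, ^ and $ refer to the whole string.
const startsWithHello = /^hello/.test(lines);
const endsWithHello = /hello$/.test(lines);
const startsWithWorld = /^world/.test(lines); // "world" only starts the second line

// With m, ^ matches at the start of every line.
const lineStarts = lines.match(/^\w+/gm);

console.log(startsWithHello); // true
console.log(endsWithHello);   // true
console.log(startsWithWorld); // false
console.log(lineStarts);      // [ 'hello', 'world' ]
```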

Repetition matching

  • +: matches one or more occurrences of the preceding character
const reg = /a+/;
const text = 'aa';
console.log(reg.exec(text)); // matches both a's

// Match an email address
const reg = /\w+@\w+\.com/;
const text = '[email protected]';
console.log(reg.exec(text)); 

// Also match addresses whose domain has subdomains
const reg = /[\w.]+@[\w.]+\.\w+/;
const text = '[email protected]';
console.log(reg.exec(text));
  • *: matches zero or more occurrences of the preceding character, that is, the character is optional
// Require the address to start with a word character
const reg = /\w+[\w.]*@[\w.]+\.\w+/;
const text = '[email protected]';
console.log(reg.exec(text));
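To compare the three email patterns above on concrete input, the sketch below runs them against placeholder addresses. The addresses under example.com and example.org are hypothetical stand-ins, and the names simple, withDots, strict and firstMatch are mine.

```javascript
// The three patterns from this section.
const simple = /\w+@\w+\.com/;
const withDots = /[\w.]+@[\w.]+\.\w+/;
const strict = /\w+[\w.]*@[\w.]+\.\w+/;

// Helper: return the matched text, or null when nothing matches.
const firstMatch = (re, s) => {
  const m = re.exec(s);
  return m ? m[0] : null;
};

// Placeholder addresses, not real mailboxes.
const plain = firstMatch(simple, 'user@example.com');
const subdomainSimple = firstMatch(simple, 'first.last@mail.example.org');
const subdomainDots = firstMatch(withDots, 'first.last@mail.example.org');
const leadingDotLoose = firstMatch(withDots, '.user@example.com');
const leadingDotStrict = firstMatch(strict, '.user@example.com');

console.log(plain);            // user@example.com
console.log(subdomainSimple);  // null
console.log(subdomainDots);    // first.last@mail.example.org
console.log(leadingDotLoose);  // .user@example.com
console.log(leadingDotStrict); // user@example.com
```

None of these patterns is a complete email validator; they only illustrate how + and * change what is accepted.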
  • ?: matches zero or one occurrence of the preceding character
// Matches both https and http; wrapping the s in [] improves readability
const reg = /http[s]?:\/\/[\w./]+/;
const text = 'https://baidu.com/';
console.log(reg.exec(text)); 

// Collapse blank lines, for both Windows and Linux line endings
const reg = /[\r]?\n[\r]?\n/g;
const text = `asda

asd

asdasd

asdasd
`;
console.log(text.replace(reg,'\n'));