Learning regex is simple: carefully organized front-end notes on regular expressions

Preface

These regular expression notes are adapted from the Bilibili uploader Houdunren; I have only added comments to the parts of the code that are harder to understand.

Houdunren's video is at: www.bilibili.com/video/BV12J…

The original site for these notes is doc.houdunren.com/

1. Experience the power of regular expressions

Example: extract the digits from a string

1. Using ordinary functions, without a regular expression

    let hd = "houdunren2200hdcms9988";

    let nums = [...hd].filter(a => !Number.isNaN(parseInt(a)));

    console.log(nums.join(""));     // 22009988

2. Using a regular expression to match the string

    console.log(hd.match(/\d/g).join(""));      // 22009988

You can see that using regular expressions greatly simplifies your code.

2. Creating a regular expression

JS provides two ways to create a regular expression: literals and objects.

Create regular expressions using literals

Wrapping the pattern in /.../ is the recommended form, but variables cannot be used inside it

    let hd = 'houdunren.com'
    console.log(/u/.test(hd));  //true

When a is a variable, the literal /a/ looks for the letter a, not for the variable's value

    let a = 'u';
    console.log(/a/.test(hd));  //false

eval can be used to parse the variable into a regex literal, but this is cumbersome, so the object form described next is recommended whenever variables are involved

If its argument is an expression, eval() evaluates the expression; if the argument is one or more JavaScript statements, eval() executes them.

    console.log(eval(`/${a}/`));  // /u/
    console.log(eval(`/${a}/`).test(hd));  //true

Create regular expressions using objects

Use the object form when a regex needs to be created dynamically

The test method returns a Boolean, true or false

The match method returns an array of matched results

    let hd = "houdunren.com";
    // Create a RegExp object
    let reg = new RegExp("u", "g");
    console.log(reg);     // /u/g
    console.log(reg.test(hd));  // true
    console.log(hd.match(reg))  // ['u', 'u']

Highlight content based on user input, with regular expressions supported

The first argument of replace() is the substring or RegExp object describing the pattern to be replaced.

The second argument is either a string giving the replacement text or a function that generates it. (If it is a function, it is called for each match and its return value is used as the replacement.)

    const con = prompt("Please enter what you want to search for (regular expressions are supported)");
    const reg = new RegExp(con, "g");
    let div = document.querySelector('div');
    div.innerHTML = div.innerHTML.replace(reg, search => {
        console.log(search)
        return `<span style="color:red">${search}</span>`;
    })
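When the user's input should be searched for literally rather than interpreted as a pattern, the special characters in it have to be escaped before it is passed to new RegExp. The following is a minimal sketch of that idea; escapeRegExp and highlight are hypothetical helper names, not part of the original notes, and the sample strings are made up for illustration.

```javascript
// Hypothetical helper: escape the characters that are special in a regex
// so that the text is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: wrap every literal occurrence of `keyword` in a red <span>.
function highlight(content, keyword) {
  const reg = new RegExp(escapeRegExp(keyword), "g");
  return content.replace(reg, (search) => `<span style="color:red">${search}</span>`);
}

const highlighted = highlight("price: 1+1 and 1+1", "1+1");
console.log(highlighted);
```

Without the escaping step, the + in "1+1" would be read as a quantifier and the literal text would not be found.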

3. Alternation

| denotes alternation: the pattern matches if the expression on either its left or its right side matches

Check whether a phone number is a landline number from Shanghai or Beijing

    let tel = "010-12345677";
    // Wrong: | splits the whole pattern, so matching either side is enough; a tel of just "010" would also match
    console.log(tel.match(/010|020\-\d{7,8}/));
    // Correct: the alternatives need to be wrapped in an atom group
    console.log(tel.match(/(010|020)\-\d{7,8}/));

Check whether the string contains houdunren or hdcms

    const hd = "houdunren";
    console.log(/houdunren|hdcms/.test(hd));  //true

4. Escape

Escaping changes the meaning of a character; it is how a character with more than one meaning is disambiguated.

Suppose you create a regex with a literal to look for the / symbol. Because / delimits the literal, writing /// causes a parsing error, so use the escaped form /\// to match it.

Match the slashes in a URL

    const url = "https://www.houdunren.com";
    console.log(/https:\/\//.test(url));

Escaping differs depending on how the regex is built. The following shows the difference between the object form and the literal form

    let price = 12.23;
    // (1) Literal
    // "." has two meanings: unescaped, any character except a newline; escaped (\.), a literal dot
    // "d" has two meanings: unescaped, the letter d; escaped (\d), a digit 0-9
    console.log(/\d+\.\d+/.test(price));

    // (2) Object
    // Key point: in a string, "\d" and "d" are the same, so new RegExp("\d") is really /d/
    console.log("\d" == "d");

    // When defining a regex with an object, print the string first: it should read exactly like the literal
    console.log("\\d+\\.\\d+");
    let reg = new RegExp("\\d+\\.\\d+");
    console.log(reg.test(price));
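As an alternative to doubling every backslash, String.raw keeps backslashes exactly as typed, so the pattern string reads like the literal form. This is a small sketch of that option, not something the original notes use; the variable names are illustrative.

```javascript
const price = "12.23";

// String.raw leaves backslashes untouched, so no doubling is needed.
const source = String.raw`\d+\.\d+`;
const reg = new RegExp(source);

console.log(source === "\\d+\\.\\d+"); // the two spellings produce the same string
console.log(reg.test(price));
```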

5. Character boundaries

Boundary operator	Description
^	Matches the beginning of the string
$	Matches the end of the string

Check that the content starts with www

    const hd = "www.houdunren.com";
    console.log(/^www/.test(hd));

Check that the content ends with .com

    const hd = "www.houdunren.com";
    console.log(/\.com$/.test(hd));

Check that a user name consists of 3 to 6 letters. Without the ^ and $ boundaries, the result will be wrong

<body>
    <input type="text" name="username">
</body>
<script>
    document.querySelector(`[name="username"]`)
            .addEventListener("keyup", function() {
                // Without ^, the value would not have to start with a letter;
                // without $, it would not have to end after 3-6 letters, and the last character would not have to be a letter
                let res = this.value.match(/^[a-z]{3,6}$/i);
                console.log(res);
                console.log(res ? "Right" : "Error");
            })
</script>
</html>
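The effect of the boundaries can also be checked without a page. The sketch below compares the same pattern with and without ^ and $; isValidUsername is a hypothetical helper name used only for this illustration.

```javascript
const loose = /[a-z]{3,6}/i;    // succeeds if 3-6 letters appear anywhere in the value
const strict = /^[a-z]{3,6}$/i; // the whole value must be 3-6 letters

// Hypothetical helper wrapping the anchored pattern.
function isValidUsername(name) {
  return strict.test(name);
}

console.log(loose.test("abc123"));       // the unanchored pattern accepts trailing digits
console.log(isValidUsername("abc123"));  // the anchored pattern rejects them
console.log(isValidUsername("abcdef"));
```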

6. Metacharacters

A metacharacter is the smallest element of a regular expression and stands for exactly one character

Metacharacter list

Metacharacter	Description	Example
\d	Matches any digit	[0-9]
\D	Matches any character other than a digit	[^0-9]
\w	Matches any letter, digit, or underscore	[a-zA-Z0-9_]
\W	Matches any character except letters, digits, and underscores	[^a-zA-Z0-9_]
\s	Matches any whitespace character, such as a space, tab `\t`, or newline `\n`	[ \n\f\r\t\v]
\S	Matches any character except whitespace	[^ \n\f\r\t\v]
.	Matches any character except a newline

Match any digit

    let hd = "houdunren 2010";
    console.log(hd.match(/\d/g)); // ['2', '0', '1', '0']

Match all phone numbers

    let hd = `Zhang San:010-99999999,Li Si:020-88888888`;
    let res = hd.match(/\d{3}-\d{7,8}/g);
    console.log(res);

Get all user names

    let hd = `Zhang San:010-99999999,Li Si:020-88888888`;
    let res = hd.match(/[^:\d\-,]+/g);
    console.log(res);

Matches any non-number

console.log(/\D/.test(2029)); //false

Matches alphanumeric underscores

let hd = "hdcms@";
console.log(hd.match(/\w/g)); //["h", "d", "c", "m", "s"]

Match any character except letters, digits, and underscores

console.log(/\W/.test("@")); //true

Match any whitespace character

console.log(/\s/.test(" ")); //true
console.log(/\s/.test("\n")); //true

Match any character except whitespace

let hd = "hdcms@";
console.log(hd.match(/\S/g)); // ['h', 'd', 'c', 'm', 's', '@']

To match a literal dot, it must be escaped

let hd = `houdunren@com`;
console.log(/houdunren.com/i.test(hd)); //true
console.log(/houdunren\.com/i.test(hd)); //false

. matches any character except a newline. Below, hdcms.com is not matched because a newline comes before it

const url = `
  https://www.houdunren.com
  hdcms.com
`;
console.log(url.match(/.+/)[0]);

With the s flag (single-line mode, in which newlines are treated as ordinary characters), . matches every character

let hd = `
  <span>
    houdunren
    hdcms
  </span>
`;
let res = hd.match(/<span>.*<\/span>/s);
console.log(res[0]);

Spaces in a regex are treated as ordinary characters

let tel = `010 - 999999`;
console.log(/\d+-\d+/.test(tel)); //false
console.log(/\d+ - \d+/.test(tel)); //true

To match every character, including newlines, use [\s\S] or [\d\D]

let hd = `
  <span>
    houdunren
    hdcms
  </span>
`;
let res = hd.match(/<span>[\s\S]+<\/span>/);
console.log(res[0]);

7. Pattern modifiers

Modifier	Description
i	Case-insensitive matching
g	Search globally for all matches
m	Treat the content as multiple lines
s	Treat the content as a single line, ignoring newlines, so that `.` matches every character
y	Match starting from `regexp.lastIndex`
u	Correctly handle characters that take four bytes in UTF-16

i

Convert every houdunren.com to lowercase

    let hd = "houdunren.com HOUDUNREN.COM"

    hd = hd.replace(/houdunren\.com/gi, "houdunren.com");

    console.log(hd);

g

Use the g modifier to operate on the content globally

let hd = "houdunren";
hd = hd.replace(/u/, "@");
console.log(hd); // ho@dunren: without the g modifier only the first match is replaced

hd = "houdunren";
hd = hd.replace(/u/g, "@");
console.log(hd); // ho@d@nren: with the g modifier every u is replaced

m

Treats the content as multiple lines, so that ^ and $ match at the start and end of every line; it is mostly used together with ^ and $

The following parses the lessons that start with # and a number into an object structure. Once atom groups have been covered, the code can be made simpler

    let hd = `
      #1 js,200 yuan #
      #2 php,300 yuan #
      #9 houdunren.com # Houdunren
      #3 node.js,180 yuan #
    `;
    // Convert to [{name:'js',price:'200 yuan'}, ...]
    let lessons = hd.match(/^\s*#\d+\s*.+\s+#$/gm).map(v => {
        v = v.replace(/\s*#\d+\s*/, "").replace(/\s+#/, "");
        const [name, price] = v.split(",");
        // Shorthand object literal: returns the key-value pairs { name: name, price: price }
        return { name, price };
    })
    console.log(JSON.stringify(lessons, null, 2))
    /* Result:
    [{"name":"js","price":"200 yuan"},{"name":"php","price":"300 yuan"},{"name":"node.js","price":"180 yuan"}]
    */
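A smaller comparison may make the effect of m easier to see. In this sketch (the sample text is made up for illustration), the same pattern is run with and without the m modifier:

```javascript
const text = "#1 js\n#2 php\nnote #3";

// Without m, ^ matches only at the very start of the string.
const withoutM = text.match(/^#\d+/g);

// With m, ^ also matches at the start of every line.
const withM = text.match(/^#\d+/gm);

console.log(withoutM);
console.log(withM);
```

The "#3" on the last line is never matched, because it is not at the start of a line.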

u

Every character has Unicode properties. For example, L means the character is a letter and P means it is a punctuation mark. The other property abbreviations are listed on the property alias website.

// Match letters with the \p{L} property
let hd = "Houdunren2010. Keep publishing tutorials, come on!";
console.log(hd.match(/\p{L}+/u));

// Match punctuation with the \p{P} property
console.log(hd.match(/\p{P}+/gu));

Characters also have a Unicode script property (Script, abbreviated sc). The following uses \p{sc=Han} to match Chinese characters; for other scripts, see the script table

let hd = `张三:010-99999999,李四:020-88888888`;
let res = hd.match(/\p{sc=Han}+/gu);
console.log(res); // ["张三", "李四"]

Characters that take four bytes in UTF-16 are handled correctly in u mode

let str = "𝒳𝒴";
console.table(str.match(/[𝒳𝒴]/)); // Result is the garbled "�"

console.table(str.match(/[𝒳𝒴]/u)); // Result is the correct "𝒳"
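The reason for the garbled result is that such a character occupies two UTF-16 code units, and without u the regex works on code units one at a time. A brief sketch of that difference, using the same characters:

```javascript
const str = "𝒳𝒴";

// Each of these characters is stored as two UTF-16 code units.
const codeUnits = str.length;
const codePoints = [...str].length;

// Without u, "." matches a single code unit, so one whole character is not "one character".
const dotWithoutU = /^.$/.test("𝒳");
const dotWithU = /^.$/u.test("𝒳");

console.log(codeUnits, codePoints, dotWithoutU, dotWithU);
```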

lastIndex

The lastIndex property of a RegExp object gets or sets the position at which the next match starts

- it only takes effect together with the g modifier (or the y modifier covered below)

- it applies to the exec method

When no further match is found, lastIndex is reset to 0

    let hd = "houdunren keeps sharing video tutorials; houdunren's site is houdunren.com";
    let reg = /houdunren(.{2})/g;
    console.log(reg.exec(hd));  // first match: "houdunren k"
    reg.lastIndex = 11; // start the next search at index 11
    console.log(reg.exec(hd)); // next match: "houdunren's" (lastIndex matters for exec, not for match)

    let hd = "后盾人 houdunren.com"; // 后盾人 is Houdunren written in Chinese characters
    let reg = /\p{sc=Han}/gu;
    let res;
    // exec records lastIndex after each search, so the loop visits every Chinese character
    // This only works with the g modifier
    while ((res = reg.exec(hd))) {
        console.log(res[0]);
    }
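Because lastIndex is stored on the regex object, reusing one global regex on different strings can give surprising results. The following sketch uses test, which honours lastIndex in the same way as exec; the sample strings are made up for illustration.

```javascript
const reg = /u/g;

const first = reg.test("houdunren");   // finds the u at index 2
const indexAfterFirst = reg.lastIndex; // the next search would start here

// The next search starts at indexAfterFirst, which is already past the end of "hu",
// so it fails and lastIndex is reset to 0.
const second = reg.test("hu");

// Starting again from 0, the same call now succeeds.
const third = reg.test("hu");

console.log(first, indexAfterFirst, second, third);
```

Setting reg.lastIndex = 0 before each new string, or creating a fresh regex, avoids the problem.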

y

Let's compare y mode with g mode. In g mode the search keeps moving forward through the string until a match is found

let hd = "udunren";
let reg = /u/g;
console.log(reg.exec(hd));
console.log(reg.lastIndex); // 1
console.log(reg.exec(hd));
console.log(reg.lastIndex); // 3
console.log(reg.exec(hd)); //null
console.log(reg.lastIndex); // 0

In y mode, however, if the match at lastIndex fails, the search does not continue further along the string

let hd = "udunren";
let reg = /u/y;
console.log(reg.exec(hd));
console.log(reg.lastIndex); // 1
console.log(reg.exec(hd)); //null
console.log(reg.lastIndex); // 0, lastIndex returns to 0

Because y mode stops as soon as no match is available at the current position, it is more efficient for tasks such as extracting the QQ numbers from the following text

let hd = `QQ IDs:11111111,999999999,88888888
houdunren keeps sharing video tutorials, site: houdunren.com`;

let reg = /(\d+),?/y;
reg.lastIndex = 7; // index of the first digit, right after "QQ IDs:"
let res;
while ((res = reg.exec(hd))) console.log(res[1]);
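A common use of y mode is a tokenizer, where each rule must match exactly at the current position and nowhere else. The following is a rough sketch under that assumption; tokenize and its rule table are hypothetical and not part of the original notes.

```javascript
// Hypothetical tokenizer: every rule is tried at the current position only.
function tokenize(input) {
  const rules = [
    ["number", /\d+/y],
    ["ident", /[a-z]+/y],
    ["space", /\s+/y],
    ["op", /[+\-*\/=]/y],
  ];
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    let matched = false;
    for (const [type, reg] of rules) {
      reg.lastIndex = pos;          // y mode matches at lastIndex or not at all
      const m = reg.exec(input);
      if (m) {
        if (type !== "space") tokens.push({ type, value: m[0] });
        pos = reg.lastIndex;        // continue right after the match
        matched = true;
        break;
      }
    }
    if (!matched) throw new Error(`Unexpected character at index ${pos}`);
  }
  return tokens;
}

const tokens = tokenize("hd = 20 + 22");
console.log(tokens);
```

With g instead of y, a rule could skip ahead and match later in the input, which would silently ignore unexpected characters.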

8. Atom tables

To match any one character out of a set of characters, use an atom table (character class), which is written inside [] (square brackets).

Syntax

Atom table	Description
[]	Matches any one of the atoms listed
[^]	Matches any one atom other than those listed
[0-9]	Matches any digit from 0 to 9
[a-z]	Matches any lowercase letter from a to z
[A-Z]	Matches any uppercase letter from A to Z

Examples of operation

With [], matching any one of the listed characters succeeds. In the following example, either u or e matches; the two are not treated as a unit

const url = "houdunren.com";
console.log(/ue/.test(url)); //false
console.log(/[ue]/.test(url)); //true

Date matching

let tel = "2022-02-23";
// () and \1 belong to atom groups, which are covered later
console.log(tel.match(/\d{4}([-/])\d{2}\1\d{2}/));

Match any digit from 0 to 3

const num = "2";
console.log(/[0-3]/.test(num)); //true

Match any letter from a to f

const hd = "e";
console.log(/[a-f]/.test(hd)); //true

A range must be in ascending order, otherwise an error is thrown

const num = "2";
console.log(/[3-0]/.test(num)); //SyntaxError

Letter ranges must also be in ascending order, otherwise an error is thrown

const hd = "houdunren.com";
console.log(/[f-a]/.test(hd)); //SyntaxError

Get all user names

Inside [], "-" denotes a range, so it must be escaped to match a literal hyphen

    let hd = `Zhang San:010-99999999,Li Si:020-88888888`;
    // "-" inside [] denotes a range, so escape it
    let res = hd.match(/[^:\d\-,]+/g);
    console.log(res);

Inside an atom table, some regex special characters lose their special meaning and do not need to be escaped, although escaping them does no harm. A dot, for example, is just a literal dot inside an atom table

let str = "(houdunren.com)+";
console.table(str.match(/[().+]/g));

// Escaping them works just as well
console.table(str.match(/[\(\)\.\+]/g));

Use [\s\S] or [\d\D] to match every character, including newlines

...
const reg = /[\s\S]+/g;
...

The following uses atom tables to delete all headings

<body>
  <p>Houdunren</p>
  <h1>houdunren.com</h1>
  <h2>hdcms.com</h2>
</body>
<script>
  const body = document.body;
  const reg = /<(h[1-6])>[\s\S]*?<\/\1>/g;
  let content = body.innerHTML.replace(reg, "");
  document.body.innerHTML = content;
</script>

9. Atom groups

To match several atoms as a unit, use an atom group (a capture group)

The difference between an atom group and an atom table is that a group matches all of its atoms together, in order, whereas a table matches any one of its characters

An atom group is wrapped in ()

Basic usage

Without the g modifier, only the first match is returned, and the result contains the following data

Variable	Description
0	The full content of the match
1, 2, ...	The content matched by each atom group
index	Position of the match in the original string
input	The original string
groups	The named groups

The following matches an h tag with atom groups.

    let hd = `<h1>houdunren</h1>
<h2>hdcms</h2>
`;
    let reg = /<(h[1-6])>([\s\S]*)<\/\1>/i;
    console.log(hd.match(reg));

Email address matching

The following uses an atom group to match email addresses (an address may contain "-")

let hd = "[email protected]";
let reg = /^[\w\-]+@[\w\-]+\.(com|org|cn|cc|net)$/i;
console.dir(hd.match(reg));

If the address has a multi-part domain, as in the format [email protected], the rule above no longer matches and the pattern must be defined as follows

let hd = `[email protected]`;
let reg = /^[\w\-]+@([\w\-]+\.)+(org|com|cc|cn)$/;
console.log(hd.match(reg));

Referencing groups

\n (n is 1, 2, 3...) references an atom group within the pattern while matching; $n (n is 1, 2, 3...) references the content matched by a group in the replacement. The following replaces the h tags with p tags

let hd = `
  <h1>houdunren</h1>
  <span>Houdunren</span>
  <h2>hdcms</h2>
`;
// Approach one (recommended)
    let reg = /<(h[1-6])>([\s\S]*)<\/\1>/gi;
    console.log(hd.replace(reg, `<p>$2</p>`));   // $2 is the content matched by the second ()

// Approach two (the values in the comments are for the first match)
    let res = hd.replace(reg, (p0, p1, p2, p3) => {
        console.log('p0:', p0);  // p0: <h1>houdunren</h1> (the full match)
        console.log('p1:', p1);  // p1: h1 (the first ())
        console.log('p2:', p2);  // p2: houdunren (the second ())
        console.log('p3:', p3);  // p3: 3 (index where the match starts)
        return `<p>${p2}</p>`;
    });
    console.log(res);

URL matching

    let hd = `https://www.houdunren.com http://houdunwang.com https://hdcms.com`;

Match a complete URL that includes a subdomain

let reg = /https:\/\/\w+\.\w+\.(com|org|cn)/gi;

To record the domain name, wrap it in ()

let reg = /https:\/\/(\w+\.\w+\.(com|org|cn))/gi;

To keep a group from being recorded, so that it cannot be referenced with \2 or $2, start it with ?:

let reg = /https:\/\/(\w+\.\w+\.(?:com|org|cn))/gi;

To match both https and http, add ? after the s

let reg = /https?:\/\/(\w+\.\w+\.(?:com|org|cn))/gi;

www. may or may not be present, so make it optional; ?: keeps it from being recorded

let reg = /https?:\/\/((?:\w+\.)?\w+\.(?:com|org|cn))/gi;
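To see what the final pattern records, it can be run over the sample string with matchAll. In this sketch the only recorded group is the domain name, because every other group starts with ?:

```javascript
const hd = `https://www.houdunren.com http://houdunwang.com https://hdcms.com`;
const reg = /https?:\/\/((?:\w+\.)?\w+\.(?:com|org|cn))/gi;

// Each match array holds the full match at [0] and the single recorded group at [1].
const matches = [...hd.matchAll(reg)];
const domains = matches.map((m) => m[1]);

console.log(domains);
```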

10. Repeat matching

Basic usage

Quantifiers are used when something needs to be matched repeatedly. They are listed below.

Symbol	Description
*	Repeat zero or more times
+	Repeat one or more times
?	Repeat zero times or once
{n}	Repeat n times
{n,}	Repeat n or more times
{n,m}	Repeat n to m times

Because the smallest unit of a regex is a single metacharacter, and we rarely want to match just one character such as a or b, almost every regex uses repetition.

By default, a quantifier repeats only the single character before it, not the whole preceding sequence

let hd = "hdddd";
console.log(hd.match(/hd+/i)); //hdddd

With an atom group, the whole group is repeated

let hd = "hdddd";
console.log(hd.match(/(hd)+/i)); //hd

The following regex validates a landline number

let hd = "010-12345678";
console.log(/0\d{2,3}-\d{7,8}/.exec(hd));

The user name must be 3 to 8 letters or digits and start with a letter

<body>
    <input type="text" name="username">
</body>
<script>
    let input = document.querySelector(`[name="username"]`);
    input.addEventListener("keyup", e => {
        const value = e.target.value;
        let state = /^[a-z][\w]{2,7}$/i.test(value);
        console.log(
            state ? "Correct!" : "User names can only be 3-8 letters or digits and must start with a letter."
        );
    })
</script>

The password must contain an uppercase letter and be 5 to 10 characters long

    let input = document.querySelector(`[name="password"]`);
    input.addEventListener("keyup", e => {
        const value = e.target.value.trim();
        const regs = [/^[a-zA-Z0-9]{5,10}$/, /[A-Z]/];
        // every returns true only if every regex passes, otherwise false
        let state = regs.every(v => v.test(value));
        console.log(state ? "Right" : "Passwords must contain an uppercase letter and be 5 to 10 characters long.");
    });

Disabling greediness

Repetition is greedy by default, meaning it matches as much as possible. When that is not wanted, add the ? modifier after the quantifier to disable greedy matching

Usage	Description
*?	Repeat any number of times, but as few as possible
+?	Repeat one or more times, but as few as possible
??	Repeat zero times or once, but as few as possible
{n,m}?	Repeat n to m times, but as few as possible
{n,}?	Repeat n or more times, but as few as possible

The following example shows the non-greedy syntax

    let str = "aaa";
    console.log(str.match(/a+/));  //aaa
    console.log(str.match(/a+?/));  //a
    console.log(str.match(/a{2,3}?/));  //aa
    console.log(str.match(/a{2,}?/));   //aa
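The difference matters most when the same closing delimiter occurs more than once. A short sketch on a made-up HTML string, comparing the greedy and non-greedy forms of the same pattern:

```javascript
const html = "<span>houdunwang</span><span>hdcms.com</span>";

// Greedy: [\s\S]+ runs to the last </span>, so everything is one match.
const greedy = html.match(/<span>[\s\S]+<\/span>/g);

// Non-greedy: [\s\S]+? stops at the nearest </span>, so each span is its own match.
const lazy = html.match(/<span>[\s\S]+?<\/span>/g);

console.log(greedy.length, lazy.length);
```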

Change every span to an h4 highlighted in red, and prefix its content with Houdunren-

<body>
  <main>
    <span>houdunwang</span>
    <span>hdcms.com</span>
    <span>houdunren.com</span>
  </main>
</body>
<script>
  const main = document.querySelector("main");
  // Match as little content as possible between <span> and </span>.
  // [\s\S] matches everything, so a greedy + would pair the first <span> with the last </span>;
  // +? prevents that.
  const reg = /<span>([\s\S]+?)<\/span>/gi;
  main.innerHTML = main.innerHTML.replace(reg, (v, p1) => {
    console.log(p1);
    return `<h4 style="color:red">Houdunren-${p1}</h4>`;
  });
</script>

The following uses non-greedy matching to find the heading elements in the page

<body>
    <h1>
      houdunren.com
    </h1>
    <h2>hdcms.com</h2>
    <h3></H3>
    <H1></H1>
</body>
<script>
    let body = document.body.innerHTML;
    // Without non-greedy matching, the first <h1> would pair with the last </H1>
    // and only one heading element would be found
    let reg = /<(h[1-6])>[\s\S]*?<\/\1>/gi;
    console.table(body.match(reg));
</script>

11. String methods

search

The search() method looks for a given substring in a string; it also accepts a regular expression. It returns the index of the first match

let str = "houdunren.com";
console.log(str.search(".com"));

Searching with a regular expression

console.log(str.search(/\.com/i));

match

Searching with a plain string returns an array of match details

    let str = "houdunren.com";
    console.dir(str.match(".com"));

Searching with a regex also returns an array of match details. Here is a simple search

    let hd = "houdunren";
    let res = hd.match(/u/);
    console.log(res);
    console.log(res[0]); // The matched text

With the g modifier, the result contains no match details (use exec for those). The following gets all h1-h6 heading elements

<body>
    <p>Houdunren</p>
    <h1>houdunren.com</h1>
    <h2>hdcms.com</h2>
</body>
<script>
    let body = document.body.innerHTML;
    let result = body.match(/<(h[1-6])>[\s\S]+?<\/\1>/g);
    console.dir(result);
</script>

The result above contains only the matched text, not the details

matchAll

Newer browsers support matchAll, which returns an iterator

    let str = "houdunren";
    let reg = /[a-z]/ig;
    // matchAll returns an iterator
    let res = str.matchAll(reg);
    console.log(res);

Iterate over the result with for...of

    let str = "houdunren";
    let reg = /[a-z]/ig;
    // Each item is the match array for one match
    for (const iterator of str.matchAll(reg)) {
        console.log(iterator);
    }
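Unlike match with the g modifier, each item produced by matchAll keeps the groups and the index of the match. The sketch below collects those details from a made-up string, and also shows that the built-in matchAll throws a TypeError when the regex has no g modifier.

```javascript
const str = "2023-02-12 and 2024-11-30";
const reg = /(\d{4})-(\d{2})-(\d{2})/g;

// Array.from consumes the iterator and maps each match array to a plain object.
const found = Array.from(str.matchAll(reg), (m) => ({
  text: m[0],
  year: m[1],
  index: m.index,
}));
console.log(found);

// The built-in matchAll requires the g modifier.
let threw = false;
try {
  str.matchAll(/\d/);
} catch (e) {
  threw = e instanceof TypeError;
}
console.log(threw);
```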

split

split breaks a string apart using a string or regular expression as the separator. The following splits a date using a string

    let str = "2023-02-12";
    console.log(str.split("-")); // ["2023", "02", "12"]

If the separator in the date is not known in advance, use a regex

    let str = "2023/02-12";
    console.log(str.split(/-|\//)); // ['2023', '02', '12']

replace

The replace method performs both plain string and regex replacements. The following replaces the date separator

    let str = "2023/02/12";
    console.log(str.replace(/\//g, "-")); // 2023-02-12

The replacement string can contain the following special variables:

Variable	Description
`$$`	Inserts a "$".
`$&`	Inserts the matched substring.
$`	Inserts the content to the left of the matched substring.
`$'`	Inserts the content to the right of the matched substring.
`$n`	If the first argument is a `RegExp` object, inserts the string matched by the nth parenthesized group (0 < n < 100). Note that the index starts at 1

Add three = before and after houdunren

    let hd = "=houdunren=";
    console.log(hd.replace(/houdunren/g, "$`$`$&$'$'")); // ===houdunren===

Join the area code and the number with -

    let hd = "(010)99999999 (020)8888888";
    console.log(hd.replace(/\((\d{3,4})\)(\d{7,8})/g, "$1-$2")); // 010-99999999 020-8888888
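The remaining variables from the table can be tried in the same way. The strings in this sketch are made up for illustration:

```javascript
// "$$" inserts a literal dollar sign and "$&" inserts the matched text.
const withDollar = "price: 100".replace(/\d+/, "$$$&");

// "$1" and "$2" insert the first and second groups, here swapped.
const swappedNames = "Li Si".replace(/(\w+) (\w+)/, "$2 $1");

// "$`" is the text left of the match and "$'" the text right of it.
const leftAndRight = "a-b".replace(/-/, "[$`|$']");

console.log(withDollar, swappedNames, leftAndRight);
```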

Add a link to https://www.houdunren.com to every occurrence of the word education

<body>
    Online education is an efficient way to learn, and education is a lifelong career.
</body>
<script>
    const body = document.body;
    body.innerHTML = body.innerHTML.replace(
        /education/g,
        `<a href="https://www.houdunren.com">$&</a>`
    );
</script>

Change the links to https and add the missing www.

<body>
    <main>
      <a style="color:red" href="http://www.hdcms.com">Open source system</a>
      <a id="l1" href="http://houdunren.com">Houdunren</a>
      <a href="http://yahoo.com">Yahoo!</a>
      <h4>http://www.hdcms.com</h4>
    </main>
</body>
<script>
    const main = document.querySelector("body main");
    const reg = /(<a.*href=['"])(http)(:\/\/)(www\.)?(hdcms|houdunren)/gi;
    main.innerHTML = main.innerHTML.replace(reg, (v, ...args) => {
        // args[0] is <a...href=['"], args[1] is http
        args[1] += "s";
        // args[3] is www. when present; if it is undefined, add www.
        args[3] = args[3] || "www.";
        console.log(args)
        // splice removes the first five elements (the five groups) and returns them; join concatenates them
        return args.splice(0, 5).join("");
    });
</script>

Replace all heading tags with p tags

<body>
    <h1>houdunren.com</h1>
    <h2>hdcms.com</h2>
    <h1>Houdunren</h1>
</body>
<script>
    const reg = /<(h[1-6])>(.*?)<\/\1>/g;
    const body = document.body.innerHTML;
    const html = body.replace(reg, function(str, tag, content) {
        return `<p>${content}</p>`;
    });
    document.body.innerHTML = html;
</script>

Delete the h1 to h6 tags from the page

<body>
    <h1>houdunren.com</h1>
    <h2>hdcms.com</h2>
    <h1>Houdunren</h1>
</body>
<script>
    const reg = /<(h[1-6])>(.*?)<\/\1>/g;
    const body = document.body.innerHTML;
    const html = body.replace(reg, "");
    document.body.innerHTML = html;
</script>

Naming atom groups

Use ?<name> to give an atom group an alias

Use an alias to replace the h tags with p tags

    let hd = `
      <h1>houdunren.com</h1>
      <h2>hdcms.com</h2>
      <h1>Houdunren</h1>
    `;
    const reg = /<(h[1-6])>(?<con>.*?)<\/\1>/gi;
    console.log(hd.replace(reg, "<p>$<con></p>"));
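Aliases are also available on the match result through its groups property. The following sketch uses a made-up date string; the alias names year, month, and day are chosen only for this illustration.

```javascript
const reg = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;

// The aliases appear as properties of match.groups.
const m = "2023-02-12".match(reg);
const { year, month, day } = m.groups;

// In a replacement string, $<alias> inserts the content of the named group.
const reordered = "2023-02-12".replace(reg, "$<day>/$<month>/$<year>");

console.log(year, month, day, reordered);
```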

Get link content using aliases

<body>
    <main>
          <a href="https://www.hdcms.com">Open source system</a>
          <a id="l1" href="https://www.houdunren.com">Houdunren</a>
          <a href="https://www.yahoo.com">Yahoo!</a>
    </main>
</body>
<script>
    const main = document.querySelector("body main");
    const reg = /<a.*?href=(['"])(?<link>.*?)\1>(?<title>.*?)<\/a>/gi;
    const links = [];
    for (const iterator of main.innerHTML.matchAll(reg)) {
        links.push(iterator["groups"]);
    }
    console.log(links);
</script>

12. RegExp methods

The following are the methods provided by the RegExp object

test

Check whether the entered email address is valid

<body>
    <input type="text" name="email" />
</body>
<script>
    let email = document.querySelector(`[name="email"]`);
    email.addEventListener("keyup", e => {
        console.log(/^\w+@\w+\.\w+$/.test(e.target.value));
    });
</script>

exec

Without the g modifier, exec behaves like the match method. With the g modifier, it can be called in a loop until every match has been found.

With the g modifier, the same regex object must be used for every call, so define the regex in a variable

With the g modifier, exec returns null once no more matches are found

Count how many times houdunren appears in the content

<body>
    <div class="content">Video tutorials are shared by houdunren.com</div>
</body>
<script>
    let content = document.querySelector(".content");
    let reg = /(?<tag>houdun)ren/g;
    let num = 0;
    let result;
    while ((result = reg.exec(content.innerHTML))) {
        num++;
    }
    console.log(`houdunren appears ${num} time(s)`);
</script>

13. Global matching

Problem analysis

The following uses match to get tag content from the page globally, but the result contains no match details

<body>
  <h1>houdunren.com</h1>
  <h2>hdcms.com</h2>
  <h1>Houdunren</h1>
</body>

<script>
  function elem(tag) {
    const reg = new RegExp("<(" + tag + ")>.+?</\\1>", "g");
    return document.body.innerHTML.match(reg);
  }
  console.table(elem("h1"));
</script>

matchAll

Newer browsers support matchAll, which returns an iterator

The g modifier is required

let str = "houdunren";
let reg = /[a-z]/ig;
for (const iterator of str.matchAll(reg)) {
  console.log(iterator);
}

In newer browsers, matchAll can be used to match the content

<body>
    <h1>houdunren.com</h1>
    <h2>hdcms.com</h2>
    <h1>Houdunren</h1>
</body>
<script>
    let reg = /<(h[1-6])>([\s\S]+?)<\/\1>/gi;
    const body = document.body;
    const hd = body.innerHTML.matchAll(reg);
    let contents = [];
    for (const iterator of hd) {
        contents.push(iterator[2]);
    }
    console.table(contents);
</script>

The following defines matchAll on the String prototype so that it also works in older browsers; this version runs without the g modifier

String.prototype.matchAll = function(reg) {
  let res = this.match(reg);
  if (res) {
    // Mask the matched text so that the recursive call finds the next match
    let str = this.replace(res[0], "^".repeat(res[0].length));
    let match = str.matchAll(reg) || [];
    return [res, ...match];
  }
};
let str = "houdunren";
console.dir(str.matchAll(/(U)/i));

exec

Using the g modifier together with exec in a loop yields both the results and the match details

<body>
  <h1>houdunren.com</h1>
  <h2>hdcms.com</h2>
  <h1>Houdunren</h1>
</body>
<script>
  function search(string, reg) {
    const matches = [];
    let data;
    while ((data = reg.exec(string))) {
      matches.push(data);
    }
    return matches;
  }
  console.log(search(document.body.innerHTML, /<(h[1-6])>[\s\S]+?<\/\1>/gi));
</script>

Use the function defined above to retrieve the URLs in a string

let hd = `https://hdcms.com https://www.sina.com.cn https://www.houdunren.com`;

let res = search(hd, /https?:\/\/(\w+\.)?(\w+\.)+(com|cn)/gi);
console.dir(res);

14. Assertion match

An assertion, although written in an extension, is not a group, so it is not saved in the result of a match. You can think of an assertion as a condition in the re.

(? =exp)

Zero width first assertion? =exp matches whatever follows exp

Add a link to the kanji behind the tutorial

<body>
    <main>houdunren keeps sharing video tutorials. Learn from houdunren tutorials to improve your programming skills.</main>
</body>
<script>
    const main = document.querySelector("main");
    // An assertion is a condition in the regex: match "houdunren" only when " tutorials" follows
    const reg = /houdunren(?= tutorials)/gi;
    main.innerHTML = main.innerHTML.replace(
        reg,
        v => `<a href="https://houdunren.com">${v}</a>`
    );
</script>

The following appends .00 to every price that lacks it

<script>
    let lessons = `
      js,200 yuan,300 hits
      php,300.00 yuan,100 hits
      node.js,180 yuan,260 hits
    `;
    // (?= yuan) is a condition and is not captured as a group
    let reg = /(\d+)(\.00)?(?= yuan)/gi;
    lessons = lessons.replace(reg, (v, ...args) => {
      // args[0] is (\d+); args[1] is (\.00), or undefined when the price lacks it
      args[1] = args[1] || ".00";
      return args.splice(0, 2).join("");
    });
    });
    console.log(lessons);
</script>

Use an assertion to validate that the user name is exactly five letters. The regex below also shows that an assertion is not a group and is not recorded in the match result

<body>
    <input type="text" name="username" />
</body>
<script>
    document
        .querySelector(`[name="username"]`)
        .addEventListener("keyup", function() {
        let reg = /^(?=[a-z]{5}$)/i;
        console.log(reg.test(this.value));
        });
</script>
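The same check can be tried without a form. This is a small sketch with assumed sample inputs; it also shows that the match itself is empty, because the assertion consumes no characters.

```javascript
const fiveLetters = /^(?=[a-z]{5}$)/i;

console.log(fiveLetters.test("hdcms"));     // true
console.log(fiveLetters.test("houdunren")); // false
console.log(fiveLetters.test("hd123"));     // false

// The assertion is zero-width: the match is an empty string with no groups.
const result = fiveLetters.exec("hdcms");
console.log(result[0] === "", result.length); // true 1
```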

(?<=exp)

Zero-width positive lookbehind assertion: ?<=exp matches content that is preceded by exp

Match the digits that are preceded by houdunren

let hd = "houdunren789hdcms666";
let reg = /(?<=houdunren)\d+/i;
console.log(hd.match(reg)); // 789

Match content that has digits both before and after it

    let hd = "houdunren789hdcms666";
    let reg = /(?<=\d)[a-z]+(?=\d{3})/i;
    console.log(hd.match(reg));//hdcms
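Another small sketch of a lookbehind, assuming a made-up price list: only numbers directly preceded by "$" are returned, and the "$" itself stays out of the result. Lookbehind assertions need an engine that supports ES2018.

```javascript
const priceList = "js $200, php $300, 100 hits";

// Lookbehind: only digits directly preceded by "$" match; the "$" is not consumed.
const prices = priceList.match(/(?<=\$)\d+/g);
console.log(prices); // ["200", "300"]
```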

Replace the href of every hyperlink with https://houdunren.com

    const body = document.body;
    // Match the href value of <a> tags. The lookbehind parentheses are a condition, not a group, so \1 refers to (['"])
    let reg = /(?<=<a.*href=(['"])).+?(?=\1)/gi;
    console.log(body.innerHTML.match(reg));
    body.innerHTML = body.innerHTML.replace(reg, "https://houdunren.com");

In the following example, add a link to the word "video" that directly follows houdunren

<body>
    <h1>houdunren video: continuously recorded video tutorials with rich examples</h1>
</body>
<script>
    let h1 = document.querySelector("h1");
    let reg = /(?<=houdunren )video/;
    h1.innerHTML = h1.innerHTML.replace(reg, str => {
        return `<a href="https://www.houdunren.com">${str}</a>`;
    });
</script>

Mask the last four digits of each phone number

    let users = 'Xiang Jun phone: 12345678901 houdunren phone: 98745675603';
    // Match the digits (and any trailing whitespace) that come after the first seven digits
    let reg = /(?<=\d{7})\d+\s*/g;
    users = users.replace(reg, str => {
        return "*".repeat(4);
    });
    console.log(users); // Xiang Jun phone: 1234567****houdunren phone: 9874567****

Get the content inside the title tag

    let hd = '<h1>houdunren video: continuously recorded video tutorials with rich examples</h1>';
    let reg = /(?<=<h1>).*(?=<\/h1>)/g;
    console.log(hd.match(reg));

(?!exp)

Zero-width negative lookahead assertion: ?!exp matches content that is not followed by exp

Use (?!exp) to require that the letters are not followed by two digits

    let hd = "houdunren12";
    let reg = /[a-z]+(?!\d{2})$/i;
    console.table(reg.exec(hd));//null

In the following example, the user name must not contain 向军 (Xiang Jun)

<body>
    <main>
      <input type="text" name="username" />
    </main>
</body>
<script>
    const input = document.querySelector(`[name="username"]`);
    input.addEventListener("keyup", function() {
        // The user name must not contain 向军 (Xiang Jun)
        const reg = /^(?!.*向军.*)[a-z]{5,6}$/i;
        console.log(this.value.match(reg));
    });
</script>
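The effect of a negative lookahead is easiest to see when the forbidden word could otherwise pass the character check. A minimal sketch, assuming a hypothetical forbidden word admin and a limit of 5 to 10 letters:

```javascript
// Hypothetical rule: 5 to 10 letters, and "admin" must not appear anywhere.
const usernameReg = /^(?!.*admin)[a-z]{5,10}$/i;

console.log(usernameReg.test("hdcms"));   // true
console.log(usernameReg.test("myadmin")); // false: contains the forbidden word
console.log(usernameReg.test("hd"));      // false: too short
```

Placing the lookahead right after ^ means the whole input is scanned for the forbidden word before any character is consumed.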

(?<!exp)

Zero-width negative lookbehind assertion: ?<!exp matches content that is not preceded by exp

Get the letters that are not preceded by a digit

    let hd = "hdcms99houdunren";
    let reg = /(?<!\d+)[a-z]+/i;
    console.log(reg.exec(hd)); //hdcms
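A further sketch of a negative lookbehind, with an assumed sample string: it collects the numbers that are not prices, meaning those not preceded by "$".

```javascript
const amounts = "$100 200 $300 400";

// Skip digits preceded by "$"; also skip digits preceded by another digit,
// otherwise the tail of "$100" ("00") would match.
const plainNumbers = amounts.match(/(?<![$\d])\d+/g);
console.log(plainNumbers); // ["200", "400"]
```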

Replace the address of every static resource that does not begin with https://oss.houdunren.com with the new site address

<body>
    <main>
      <a href="https://www.houdunren.com/1.jpg">1.jpg</a>
      <a href="https://oss.houdunren.com/2.jpg">2.jpg</a>
      <a href="https://cdn.houdunren.com/2.jpg">3.jpg</a>
      <a href="https://houdunren.com/2.jpg">3.jpg</a>
    </main>
</body>
<script>
    const main = document.querySelector("main");
    // Match origins whose first host label is not oss
    const reg = /https:\/\/(\w+)?(?<!oss)\..+?(?=\/)/gi;
    main.innerHTML = main.innerHTML.replace(reg, v => {
        console.log(v);
        return "https://oss.houdunren.com";
    });
</script>
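The same pattern can be checked on plain strings. The sketch below assumes the four URLs from the markup above are held in an array; the names urls, hostReg and rewritten are illustrative.

```javascript
const urls = [
  "https://www.houdunren.com/1.jpg",
  "https://oss.houdunren.com/2.jpg",
  "https://cdn.houdunren.com/2.jpg",
  "https://houdunren.com/2.jpg",
];

// (?<!oss) rejects a host whose first label is "oss"; (?=\/) stops before the path.
const hostReg = /https:\/\/(\w+)?(?<!oss)\..+?(?=\/)/i;
const rewritten = urls.map((u) => u.replace(hostReg, "https://oss.houdunren.com"));

console.log(rewritten);
```

The URL that already points at oss.houdunren.com is left unchanged because the lookbehind fails right after "oss".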
