Regular Expressions: This One Article Is Enough! [All substance, worth bookmarking]

Preface

“Only when drunk do you know how strong the wine is 🍷; only after loving do you know how deep it runs ❤️”

Regular expressions inspire both love and hate: love, because a few short lines of code can do powerful things; hate, because they are hard to truly master.

Every time we need a more complex regular expression, we end up searching the web, and the results differ every time, which costs time and effort. So we did some sorting out. The next time you need a regex, just come here and look it up.

🍔 By reading this article you will 🍔:

Learn how to check that the regex you wrote is correct (reducing or avoiding potential bugs)!
Be able to quickly search for the regex you need!
Pick up a few regexes that let you install fewer plugins
Get a comprehensive view of regex instance methods, advanced mode, and basic usage (for lookup)
Find, in each module, a link to more detailed usage so you can quickly dig deeper

⭐️ (Ctrl + D or Command + D) ⭐️

Let’s start with the practical material, then get to the basics.

🍞 Verify that your regex is correct!

Let’s start with an online graphical tool for regular expressions. With it, you can see at a glance which characters your regular expression can match. Here is how the phone-number regex /^1[34578]\d{9}$/ is displayed: click to try
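
Alongside the graphical tool, a small table of sample inputs can serve as a quick sanity check. The following is a minimal sketch; the sample numbers are invented for illustration and are not from the original article.

```javascript
// Phone-number regex from the article: 1, then one of 3/4/5/7/8, then 9 more digits.
const phoneRegex = /^1[34578]\d{9}$/;

// Hypothetical sample inputs, each paired with the result we expect.
const samples = [
  ['13812345678', true],   // 11 digits, second digit is 3
  ['12812345678', false],  // second digit 2 is not in [34578]
  ['1381234567', false],   // only 10 digits
  ['138123456789', false], // 12 digits
  ['a13812345678', false], // a leading character breaks the ^ anchor
];

// Collect every sample whose actual result differs from the expected one.
const failures = samples.filter(
  ([input, expected]) => phoneRegex.test(input) !== expected
);

console.log(failures.length === 0 ? 'all samples behave as expected' : failures);
```

Keeping a list like this next to the regex makes it easy to re-check after every change.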

🍞 Search for the regex you need!

This search tool supports both searching on the web and searching directly in VS Code. Most everyday regexes can be found here.

Search for regular expressions on the web
GitHub instructions for the web / VS Code / IDEA / Alfred Workflow versions

🍞 Regexes that let you install fewer plugins!

1. Match specified tags in an HTML string

Example: match all img tags and replace the src attribute (common when editing rich-text articles)

Using a replacement function

// The string to match
let str = '<div><img src=\'123\' /><div><img src="456"/></div></div>'

// A regular expression that matches every img tag and captures its src attribute
let reg = /<img [^>]*src=['"]([^'"]+)[^>]*>/gi

// Reassign after substitution
str = str.replace(reg, function (match, ...args) {
  // args[0] is the first capture group, i.e. the src value
  console.log(`Matched img tag: ${match}`, `Matched src value: ${args[0]}`)
  return match
})

// Output:
// Matched img tag: <img src='123' /> Matched src value: 123
// Matched img tag: <img src="456"/> Matched src value: 456
Substitution with the $ sign ($1 to $99)

// $1 to $99 stand for the text matched by the 1st through 99th subexpressions in the regexp

var name = "Doe, John";
name.replace(/(\w+)\s*,\s*(\w+)/, "$2 $1");
// returns 'John Doe'

var name = '"a", "b"';
name.replace(/"([^"]*)"/g, "'$1'");
// returns "'a', 'b'" (a and b are now wrapped in single quotes)

Important: when using function substitution
1. If nothing matches, the replacement function is not executed.
2. If there are several matches, the function is executed once per match, and its return value replaces the currently matched text.
3. The escape character ‘\’ inside the string literal is processed by the JavaScript engine automatically, so no additional handling is needed.
4. The args parameter receives each submatch (that is, the match of each pair of parentheses), followed by the offset of the match and, last of all, the full original string.
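
To make the callback's parameters concrete, here is a small sketch that records what each call receives and rewrites the src value. The HTML snippet and the CDN prefix are hypothetical and exist only for illustration.

```javascript
// Hypothetical input with two img tags.
const html = '<p><img src="a.png"><img src="b.png"></p>';
const imgRegex = /<img [^>]*src=['"]([^'"]+)[^>]*>/gi;

const calls = [];
const rewritten = html.replace(imgRegex, (match, src, offset, whole) => {
  // match: the full tag, src: capture group 1,
  // offset: where the match starts, whole: the original string
  calls.push({ match, src, offset });
  // Assumed CDN prefix, not part of the original article.
  return match.replace(src, 'https://cdn.example.com/' + src);
});

console.log(calls);
// [ { match: '<img src="a.png">', src: 'a.png', offset: 3 },
//   { match: '<img src="b.png">', src: 'b.png', offset: 20 } ]
console.log(rewritten);
```

The callback runs once per match, and whatever it returns replaces that match only.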

The ‘$1-$99’ substitution is very powerful and convenient

See MDN for more details

2. Match a URL string and get each individual part

Regexes that let you install one less plugin!

Convert an object to URL parameters in a single statement (no need to pull in an extra plugin for this)

// queryObject is the object to be converted
Object.entries(queryObject).map(([key, value]) => key + '=' + value).join('&')
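
The one-liner above joins keys and values as they are, so a value containing characters such as & or a space would produce a broken query string. The sketch below, using a hypothetical queryObject, adds encodeURIComponent and compares the result with the built-in URLSearchParams, which is available in browsers and Node.js.

```javascript
// Hypothetical object; the value of "name" contains characters that need encoding.
const queryObject = { name: 'Tom & Jerry', page: 2 };

// Same one-liner, with both key and value percent-encoded.
const toQuery = (obj) =>
  Object.entries(obj)
    .map(([key, value]) => encodeURIComponent(key) + '=' + encodeURIComponent(value))
    .join('&');

const encoded = toQuery(queryObject);
console.log(encoded); // name=Tom%20%26%20Jerry&page=2

// Built-in alternative; it encodes spaces as "+" instead of "%20".
const viaBuiltin = new URLSearchParams(queryObject).toString();
console.log(viaBuiltin); // name=Tom+%26+Jerry&page=2
```

Both forms are decoded to the same values by a standard query-string parser; which one to use is mostly a matter of taste.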

Match each part of a URL string (with just one regex)

var reg = /^(https?:)\/\/([0-9a-z.]+)(:[0-9]+)?([/0-9a-z.]+)?(\?[0-9a-z&=]+)?(#[0-9-a-z]+)?/i

var stringUrl = 'https://juejin.cn/user/4353721776234743/posts?id=123#test'

let [href, protocol, hostname, port, pathname, search, hash] = reg.exec(stringUrl)

console.log({href, protocol, hostname, port, pathname, search, hash})

The output is consistent with the corresponding values on location (except that port is undefined here, where location.port would be an empty string). If you are interested, copy it and try it in the console.
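
One way to check that claim outside a browser page is to compare the regex result with the built-in URL class, which exposes the same property names as location. This is only a sketch for comparison, not part of the original article.

```javascript
const urlRegex = /^(https?:)\/\/([0-9a-z.]+)(:[0-9]+)?([/0-9a-z.]+)?(\?[0-9a-z&=]+)?(#[0-9-a-z]+)?/i;
const stringUrl = 'https://juejin.cn/user/4353721776234743/posts?id=123#test';

// Skip element 0 (the full match) and name the capture groups.
const [, protocol, hostname, port, pathname, search, hash] = urlRegex.exec(stringUrl);

// The standard parser, for comparison.
const parsed = new URL(stringUrl);

const sameAsBuiltin =
  protocol === parsed.protocol &&
  hostname === parsed.hostname &&
  pathname === parsed.pathname &&
  search === parsed.search &&
  hash === parsed.hash;

console.log(sameAsBuiltin); // true
// The one difference: an absent port is undefined in the regex result and '' in URL.
console.log(port, JSON.stringify(parsed.port)); // undefined ""
```

The regex only accepts the characters listed in its character classes, so URLs with percent-encoded paths, credentials, or other characters would need a broader pattern or the URL class itself.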

3. Shorten if conditions

I suspect even fewer people use this one. Used correctly, it lets us write less code. See:

// Before
if (
  userName === 'Dilieba' ||
  userName === 'Zhang Liying' ||
  userName === 'White Baihe' ||
  userName === 'Guan Xiaotong' ||
  userName === 'Liu Yifei') {
  // ...
}

// After
if (/^(?:Dilieba|Zhang Liying|White Baihe|Guan Xiaotong|Liu Yifei)$/.test(userName)) {
  // ...
}

Use the ‘^’ and ‘$’ symbols to force an exact match, and group the alternatives so that the anchors apply to every name rather than only the first and the last. You can also put the values in an array and use includes for exact matching.
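
The sketch below puts the regex form and the array form side by side, and shows what goes wrong when the alternatives are not grouped. The helper names and the test string are made up for illustration.

```javascript
const allowedNames = ['Dilieba', 'Zhang Liying', 'White Baihe', 'Guan Xiaotong', 'Liu Yifei'];

// Regex form: the non-capturing group makes ^ and $ apply to every alternative.
const byRegex = (userName) =>
  /^(?:Dilieba|Zhang Liying|White Baihe|Guan Xiaotong|Liu Yifei)$/.test(userName);

// Array form: includes compares whole strings, so it is exact by construction.
const byIncludes = (userName) => allowedNames.includes(userName);

// Without the group, ^ binds only to the first name and $ only to the last one,
// so the middle alternatives match anywhere in the string.
const ungrouped = /^Dilieba|Zhang Liying|White Baihe|Guan Xiaotong|Liu Yifei$/;

console.log(byRegex('Liu Yifei'), byIncludes('Liu Yifei'));  // true true
console.log(byRegex('Mr. Zhang Liying Jr.'));                // false
console.log(ungrouped.test('Mr. Zhang Liying Jr.'));         // true (unintended)
```

For a fixed list of plain strings, the array form is usually easier to maintain; the regex form pays off once the alternatives become patterns.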

4. Identify the browser kernel and access terminal

    // Real-world userAgent matching involves many more cases;
    // only a simple example is provided here
    function parseUA(userAgent) {
      const u = userAgent || navigator.userAgent;
      return {
        isIOS: /iOS|iPad|iPhone/i.test(u),
        isAndroid: /Android/i.test(u),
        isMobile: /iOS|iPad|iPhone|Android|windows Phone/i.test(u),
        isQQ: /qq/i.test(u), // Prone to false matches
        isWeixin: /micromessenger/i.test(u),
        isWeibo: /weibo/i.test(u),
        isMac: /mac/i.test(u),
        isWindows: /Windows NT/i.test(u)
      }
    }

🍞 Regex instance methods (introduction, commonly used)

test

A familiar method that checks whether a regular expression matches a given string. Returns true or false.

let str = 'hello world! ';
let result = /^hello/.test(str);
console.log(result);
// true

match

Data type returned after execution:

Returns an array if there is a match, null otherwise
In global mode: the array contains every matched item (and has no extra array properties)
In non-global mode: element 0 of the array holds the matched text, and the remaining elements hold the text matched by each subexpression of the regular expression. The array also carries extra properties: index is the position where the match starts, and input is a reference to the original string.

'abc'.match(/e/)
// null

'abc'.match(/b/)
// ['b', index: 1, input: 'abc', groups: undefined]

'a1b2'.match(/\d/g)
// ['1', '2']
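
The difference between the two modes is easiest to see with a regex that has subexpressions. The sketch below also uses matchAll (ES2020), which the article does not cover, as one way to get both every match and its subexpressions.

```javascript
const text = 'a1b2';

// Non-global: first match only, with capture groups and index.
const first = text.match(/([a-z])(\d)/);
console.log(first[0], first[1], first[2], first.index); // a1 a 1 0

// Global: every full match, but no capture groups and no index.
const all = text.match(/([a-z])(\d)/g);
console.log(all); // [ 'a1', 'b2' ]

// matchAll: every match, each with its own groups and index.
const detailed = [...text.matchAll(/([a-z])(\d)/g)].map((m) => ({
  full: m[0],
  letter: m[1],
  digit: m[2],
  index: m.index,
}));
console.log(detailed);
// [ { full: 'a1', letter: 'a', digit: '1', index: 0 },
//   { full: 'b2', letter: 'b', digit: '2', index: 2 } ]
```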

exec

Main use cases:
1. Text retrieval
2. When several subexpressions are matched, returning them all at once as an array (for example, the URL string matching shown earlier)
3. Returning more complete information about the match

Matching modes and results:

If no match is found, the return value is null.
In non-global mode, it behaves the same as match (see the match method).
In global mode it is more complex, as follows:

// Global matching example
var str = 'ab1ab2'
var reg = /b\d/g
console.log(reg.exec(str))
// ['b1', index: 1, input: 'ab1ab2', groups: undefined]
console.log(reg.lastIndex)
// Output: 3

console.log(reg.exec(str))
// ['b2', index: 4, input: 'ab1ab2', groups: undefined]
console.log(reg.lastIndex)
// Output: 6

console.log(reg.exec(str))
// Output: null
console.log(reg.lastIndex)
// Output: 0

Note: in global mode, each match returns the position of the first character of the matched text (index), while reg.lastIndex holds the position right after the last character of the matched text. The start and end indices therefore live on different objects: index on the result array, lastIndex on the regex. You can set lastIndex manually to choose where a global match starts; it defaults to 0. Once no further match is found, exec returns null and resets lastIndex to 0, so the next call starts over from the first match and the cycle repeats (null is the signal to stop).
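
The usual way to apply this is a loop that calls exec until it returns null. The sketch below collects the start and end position of every match and then sets lastIndex by hand; the variable names are illustrative.

```javascript
const input = 'ab1ab2';
const pattern = /b\d/g;

// Call exec repeatedly; null signals that there are no more matches.
const found = [];
let m;
while ((m = pattern.exec(input)) !== null) {
  // m.index comes from the result array, lastIndex from the regex object.
  found.push({ text: m[0], start: m.index, end: pattern.lastIndex });
}
console.log(found);
// [ { text: 'b1', start: 1, end: 3 }, { text: 'b2', start: 4, end: 6 } ]

// The failed final call has reset lastIndex.
const lastIndexAfterLoop = pattern.lastIndex;
console.log(lastIndexAfterLoop); // 0

// Setting lastIndex by hand skips everything before that position.
pattern.lastIndex = 3;
const fromThree = pattern.exec(input);
console.log(fromThree[0], fromThree.index); // b2 4
```

Because this state lives on the regex object, sharing one global regex between unrelated calls can produce surprising results; creating a fresh regex or resetting lastIndex avoids that.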

When matching subexpressions (see the URL string matching shown earlier)

// Simple example

var str = 'abc123'
var reg = /c(\d{3})/
console.log(reg.exec(str))
// ['c123', '123', index: 2, input: 'abc123', groups: undefined]
// where 'c123' is the full match and '123' is the subexpression in parentheses

If there is a match, the first entry in the array is the complete match, followed by the match for each subexpression. In global mode, as in the global matching example above, each call returns the result of one match as an array.

search

Used to find the position of a match within a string

You can pass in a string to search for

var str = 'abcdefg'
var result = str.search('cd')
console.log(result) // 2

You can also pass in a regular expression

var str = 'abcdefg'
var result = str.search(/cd/)
console.log(result) // 2

replace/replaceAll

For the basics of replace, see “Match specified tags in an HTML string” above; this part looks at how replaceAll differs. Syntax: str.replaceAll(regexp|substr, replacement)

When searching with a regex, replaceAll must use the g (global) flag; otherwise an error is thrown (replaceAll called with a non-global RegExp argument)
When searching with a string, replaceAll replaces every occurrence of that string
If replacement is a callback function, it is executed for each match. Pay attention to the callback’s parameters, though, as the example shows:

// 1. No subexpression
var str = 'abcb'
// When using a regex, the global flag is required
var result = str.replaceAll(/b/g, function (...args) {
    console.log(args)
    return args[0]
})
// The values of args, printed in order:
// ['b', 1, 'abcb']
// ['b', 3, 'abcb']

// 2. With a subexpression
var str = 'ab1cb2'
// When using a regex, the global flag is required
var result = str.replaceAll(/b(\d)/g, function (...args) {
    console.log(args)
    return args[0]
})
// The values of args, printed in order:
// ['b1', '1', 1, 'ab1cb2']
// ['b2', '2', 4, 'ab1cb2']

Conclusion:

When replacement is a callback function, its arguments differ depending on whether the regex has subexpressions
Without subexpressions, the parameters are (the current match, the position of the first character of the current match, the full string).
With subexpressions, the parameters are (the current match, the match of each subexpression, the position of the first character of the current match, the full string).
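
Since the number of arguments varies, a callback that must work with any regex can read them from the end. The sketch below assumes a regex without named groups (with named groups, an extra groups object is appended as the final argument) and an environment that supports String.prototype.replaceAll; describeArgs is a hypothetical helper.

```javascript
// Split the callback arguments into named parts.
// Without named groups the order is: match, ...captures, offset, whole string.
function describeArgs(args) {
  return {
    match: args[0],
    captures: args.slice(1, -2),
    offset: args[args.length - 2],
    whole: args[args.length - 1],
  };
}

const seen = [];
'ab1cb2'.replaceAll(/b(\d)/g, (...args) => {
  seen.push(describeArgs(args));
  return args[0]; // leave the text unchanged
});
console.log(seen);
// [ { match: 'b1', captures: [ '1' ], offset: 1, whole: 'ab1cb2' },
//   { match: 'b2', captures: [ '2' ], offset: 4, whole: 'ab1cb2' } ]

// A regex without the g flag makes replaceAll throw a TypeError.
let errorName = null;
try {
  'abcb'.replaceAll(/b/, 'x');
} catch (err) {
  errorName = err.name;
}
console.log(errorName); // TypeError
```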

split

Used to split a string into an array of strings. Syntax: split(separator, howmany). separator can be a string or a regex; howmany is a number specifying the maximum length of the returned array.

"hello".split("") // returns ["h", "e", "l", "l", "o"]
"hello".split("", 3) // returns ["h", "e", "l"]

See MDN for more details on the re instance method
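
The split examples above only use a string separator. As a sketch of the regex form of separator, the following splits on commas or semicolons with optional surrounding whitespace; the input strings are invented for illustration.

```javascript
// Hypothetical input with inconsistent separators and spacing.
const csvLine = 'apple, banana;cherry  ,  date';

// The separator is a comma or semicolon plus any whitespace around it.
const parts = csvLine.split(/\s*[,;]\s*/);
console.log(parts); // [ 'apple', 'banana', 'cherry', 'date' ]

// The second argument caps the length of the returned array.
const firstTwo = csvLine.split(/\s*[,;]\s*/, 2);
console.log(firstTwo); // [ 'apple', 'banana' ]

// A capture group in the separator keeps the separators in the result.
const withSeparators = '2024-01-15'.split(/(-)/);
console.log(withSeparators); // [ '2024', '-', '01', '-', '15' ]
```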

🍞 Advanced Mode

Positive lookahead (?=pattern)

var reg = /Windows(?=95|98|NT|2000)/
reg.test('Windows95') // true
reg.test('Windows') // false
reg.test('Windows3.1') // false

Matches only when the text that follows matches the pattern after ?=; the looked-ahead text itself is not part of the match

Negative lookahead (?!pattern)

var reg = /Windows(?!95|98|NT|2000)/
reg.test('Windows95') // false

Matches only when the text that follows does not match the pattern after ?!

Positive lookbehind (?<=pattern) (see the example)

var reg = /(?<=95|98|NT|2000)Windows/
reg.test('95Windows') // true
reg.test('66Windows') // false

Negative lookbehind (?<!pattern) (see the example)

var reg = /(?<!95|98|NT|2000)Windows/
reg.test('95Windows') // false
reg.test('66Windows') // true
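
Lookarounds are most useful when the match position matters but the surrounding text must stay untouched. Two small sketches follow: inserting thousands separators with a lookahead, and extracting amounts that follow a dollar sign with a lookbehind. Both examples are illustrative additions, and lookbehind assumes a reasonably modern JavaScript engine.

```javascript
// Lookahead: put a comma at every position that is followed by
// one or more groups of three digits and then no further digit.
const addThousands = (n) => String(n).replace(/\B(?=(\d{3})+(?!\d))/g, ',');

console.log(addThousands(1234567)); // 1,234,567
console.log(addThousands(123));     // 123

// Lookbehind: match digits only when they come right after "$".
// The "$" itself is not part of the match.
const prices = 'cost: $42, code 99, fee $7'.match(/(?<=\$)\d+/g);
console.log(prices); // [ '42', '7' ]
```

In both cases the lookaround only checks the neighboring text; nothing it inspects is consumed or replaced.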

Greedy mode

Match as many characters as possible, as long as the overall match still succeeds

// 1. Match as much as possible
var str = 'axxxa'
var reg = /a(\w+)/
reg.test(str) // (\w+) matches 'xxxa', returns true

// 2. Give back only what the rest of the pattern needs
var str = 'axxxa'
var reg = /a(\w+)a/
reg.test(str) // (\w+) matches 'xxx' (giving up the last a), returns true

Lazy mode (non-greedy mode)

Also known as non-greedy mode: match as few characters as possible, as long as the overall match still succeeds

// 1. Match as little as possible
var str = 'axxxa'
var reg = /a(\w+?)/
reg.test(str) // matches 'ax', returns true

// 2. Match only as much as the rest of the pattern needs
var str = 'axxxa'
var reg = /a(\w+?)a/
reg.test(str) // matches 'axxxa', returns true

Description:

Adding ‘?’ after a quantifier makes an expression with a variable number of repetitions match as few times as possible
For an expression that may either match or not match (such as one followed by ? or *), it prefers not to match.
When matching too little would make the whole expression fail, non-greedy mode matches the minimum needed for the whole expression to succeed.
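
Since test only returns true or false, the difference between the two modes is easier to observe with match. The sketch below prints what is actually captured; the markup string is a made-up example.

```javascript
// Hypothetical markup with two pairs of tags.
const markup = '<b>bold</b> and <i>italic</i>';

// Greedy: .+ runs to the last ">" in the string.
const greedy = markup.match(/<.+>/)[0];
console.log(greedy); // <b>bold</b> and <i>italic</i>

// Lazy: .+? stops at the first ">" it can.
const lazy = markup.match(/<.+?>/)[0];
console.log(lazy); // <b>

const allTags = markup.match(/<.+?>/g);
console.log(allTags); // [ '<b>', '</b>', '<i>', '</i>' ]

// The article's 'axxxa' examples, showing the captured group.
const capGreedy = 'axxxa'.match(/a(\w+)/)[1];        // 'xxxa'
const capLazy = 'axxxa'.match(/a(\w+?)/)[1];         // 'x'
const capLazyAnchored = 'axxxa'.match(/a(\w+?)a/)[1]; // 'xxx'
console.log(capGreedy, capLazy, capLazyAnchored);
```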

🍞 Common regular expressions

Chinese character matching

let regex = /[\u4e00-\u9fa5]/

11-digit mobile phone number matching

let regex = /^1[34578]\d{9}$/;

Date validation (YYYY-MM-DD)

let regex = /^[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$/;

ID card number matching (15-digit format)

let regex = /^[1-9]\d{7}(?:0\d|10|11|12)(?:0[1-9]|[1-2][\d]|30|31)\d{3}$/
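
These patterns check the shape of the input, not its meaning. The sketch below tries two of them on sample values and adds a hypothetical isRealDate helper that rejects dates the pattern accepts but the calendar does not; the helper is an illustrative addition, not part of the original list.

```javascript
const chinese = /[\u4e00-\u9fa5]/;
const isoDate = /^[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$/;

console.log(chinese.test('hello 世界')); // true
console.log(chinese.test('hello'));      // false

console.log(isoDate.test('2024-02-29')); // true
console.log(isoDate.test('2024-13-01')); // false (month 13)
console.log(isoDate.test('2024-02-31')); // true  (shape is fine, date does not exist)

// Hypothetical helper: shape check first, then a calendar round-trip.
// Years 0000-0099 are rejected, because Date.UTC maps them to 1900-1999.
function isRealDate(s) {
  if (!isoDate.test(s)) return false;
  const [y, mo, d] = s.split('-').map(Number);
  const dt = new Date(Date.UTC(y, mo - 1, d));
  return dt.getUTCFullYear() === y && dt.getUTCMonth() === mo - 1 && dt.getUTCDate() === d;
}

console.log(isRealDate('2024-02-29')); // true  (leap year)
console.log(isRealDate('2024-02-31')); // false
```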

Search for more regular expressions

Here is the full list of regular expressions to search for. Click here to find more

🍞 Basic use

Metacharacters that must be escaped

( [ { \ ^ $ | ) ? * + . ] }
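
When a regex is built from arbitrary text, such as user input, these characters need to be escaped before the text is passed to new RegExp. A commonly used helper looks like the sketch below; the name escapeRegExp and the sample input are assumptions for illustration.

```javascript
// Prefix every regex metacharacter with a backslash.
// "$&" in the replacement stands for the matched character itself.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical user input containing several metacharacters.
const userInput = '1+1=2 (really?)';

const escaped = escapeRegExp(userInput);
console.log(escaped); // 1\+1=2 \(really\?\)

// Escaped: the input is matched literally.
const literal = new RegExp(escaped);
console.log(literal.test('I know 1+1=2 (really?)')); // true

// Unescaped: "1+" means "one or more 1s", so the literal text no longer matches.
console.log(new RegExp('1+1=2').test('1+1=2')); // false
```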

Commonly used symbols

Metacharacter	Description
.	Matches any single character (except newlines and other line terminators)
\w	Matches letters, digits, and underscores. Equivalent to [A-Za-z0-9_]
\W	Matches a non-word character (the opposite of \w)
\d	Matches a digit
\D	Matches a non-digit character
\s	Matches a whitespace character
\S	Matches a non-whitespace character
\b	Matches a word boundary
\B	Matches a non-word boundary
\0	Matches the NUL character, such as `/\0/.test('\x00') // true`
\n	Matches a newline character
\f	Matches a form feed character
\r	Matches a carriage return
\t	Matches a tab
\v	Matches a vertical tab
\xxx	Matches the character specified by the octal number xxx
\xdd	Matches the character specified by the hexadecimal number dd
\uxxxx	Matches the UTF-16 code unit specified by the four-digit hexadecimal number xxxx

Quantifiers and anchors

Symbol	Description
^	Matches the beginning of the input; in multiline mode, also matches the beginning of each line
$	Matches the end of the input; in multiline mode, also matches the end of each line
*	Matches the preceding subexpression zero or more times.
+	Matches the preceding subexpression one or more times.
?	Matches the preceding subexpression zero or one time.
{n}	n is a non-negative integer. Matches exactly n times.
{n,}	n is a non-negative integer. Matches at least n times.
{n,m}	Both m and n are non-negative integers. Matches at least n times and at most m times.
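
A few one-line checks can make the table concrete. The sketch below exercises the anchors with and without the m (multiline) flag and then each quantifier form; all sample strings are invented.

```javascript
const lines = 'first line\nsecond line';

// Without the m flag, ^ and $ only match at the start and end of the whole input.
console.log(/^second/.test(lines));          // false
console.log(/^second/m.test(lines));         // true
console.log(lines.match(/line$/g).length);   // 1
console.log(lines.match(/line$/gm).length);  // 2

// ? makes the preceding item optional (zero or one time).
console.log('color colour colouur'.match(/colou?r/g)); // [ 'color', 'colour' ]

// {n}: exactly n times; {n,}: at least n times; {n,m}: between n and m times.
console.log('aaaa'.match(/a{2}/g));    // [ 'aa', 'aa' ]
console.log('aaaa'.match(/a{3,}/g));   // [ 'aaaa' ]
console.log('aaaaa'.match(/a{1,2}/g)); // [ 'aa', 'aa', 'a' ]
```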

Refer to MDN for more basic details

🍞 Related links for reference

Online graphical tool for regular expressions
Search for more regular expressions
Regular expression manual [its UI is plain 🤗, but its descriptions and examples are short, to the point, and quick to understand]
MDN, for details
Regular expressions without memorizing

Author: Tager Address: juejin.cn/user/435372… Copyright belongs to the author. Commercial reprint please contact the author for authorization, non-commercial reprint please indicate the source.