This article was published exclusively on guolin_blog, an official WeChat account.
When reprinting, please credit the source: juejin.cn/post/692093…
This article is from Rong Hua Xie Hou’s blog.
Previous articles in this series:
Learning Regular Expressions Together (1) Those Dizzying Metacharacters
Learning Regular Expressions Together (2) Quantifiers and Greed
Learning Regular Expressions Together (3) Grouping and Referencing
Learning Regular Expressions Together (4) Four Common Matching Patterns
Learning Regular Expressions Together (5) Predicate Matching
Learning Regular Expressions Together (6) Principle of Regular Matching
Learning Regular Expressions Together (7) Backtracking Traps
0. Preface
In development, regular expressions are often used to validate email addresses and mobile phone numbers, and to search and replace text in bulk.
When a requirement like this comes in, most of us first open a browser and search for "how to write an email regular expression", then copy and paste the result, test a few cases, and submit it once nothing breaks. When something does go wrong, we have no idea how to fix the expression and can only ask helpful netizens for assistance.
This article walks you through the basic usage of regular expressions, so that you gain a first understanding and are no longer baffled when you see one.
For example, here is a regular expression for IPv4 addresses:
^([1-9][0-9]?|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(\.(0|[1-9][0-9]?|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3}$
If you have not read this article yet, the expression may look confusing at first glance. Don't worry: by the end you will find that even a seemingly complicated expression is nothing more than this.
[Figure: mind map of the main content of this article, for quick reference later]
1. Special single characters
In regular expressions, ordinary characters keep their literal meaning. For example, the expression 1 matches the digit 1, and the expression A matches the letter A.
However, if we want to match many different characters, listing them all one by one would waste a lot of time. Is there a better way? This is where metacharacters come into play.
.
The dot wildcard matches any single character except a newline.
\d
The digit wildcard matches any digit 0-9.
\D
With a capital D, it matches any non-digit; it is the negation of \d.
\w
The word-character wildcard matches any letter, digit, or underscore.
\W
With a capital W, it matches any character that is not a letter, digit, or underscore.
\s
The whitespace wildcard matches any whitespace character, including carriage return, line feed, form feed, and tab.
\S
With a capital S, it matches any non-whitespace character.
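That covers the special single characters. In summary: . matches any character except a newline, \d and \D match digits and non-digits, \w and \W match word and non-word characters, and \s and \S match whitespace and non-whitespace.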
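The wildcards above can be tried out with a short script. This is only a minimal sketch using Python's standard re module; the sample strings and variable names are made up for illustration. Note one assumption: in Python, \d and \w also match non-ASCII digits and letters by default, so the re.ASCII flag is passed here to stay with the 0-9 and A-Z/a-z behavior described in this article.

```python
import re

# Hypothetical sample text, chosen only for illustration.
sample = "Room_42 opens\tat 9am\n"

# . matches any character except a newline, so the trailing "\n" is skipped.
dot_matches = re.findall(r".", sample)

# re.ASCII restricts \d and \w to ASCII, as described in the article.
digits = re.findall(r"\d", sample, re.ASCII)            # digits 0-9
non_digits = re.findall(r"\D", "a1b2", re.ASCII)        # anything but a digit
word_chars = re.findall(r"\w", "a_1-!", re.ASCII)       # letters, digits, underscore
non_word_chars = re.findall(r"\W", "a_1-!", re.ASCII)   # everything else

print(len(sample), len(dot_matches))  # the dot skips exactly one character
print(digits)                         # ['4', '2', '9']
print(non_digits)                     # ['a', 'b']
print(word_chars)                     # ['a', '_', '1']
print(non_word_chars)                 # ['-', '!']
```

Each capitalized wildcard returns exactly the characters its lowercase counterpart leaves out.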
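2. Whitespace
Whitespace characters fall into several categories, such as carriage return (\r), line feed (\n), form feed (\f), and tab (\t), as well as the ordinary space. They are usually matched together with \s.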
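As a small illustration of \s, the sketch below (Python's standard re module, with a made-up input string) collects the whitespace characters in a string and then splits the string on them.

```python
import re

# Hypothetical input containing a space, tab, line feed,
# carriage return, and form feed between single letters.
text = "a b\tc\nd\re\ff"

# \s matches each kind of whitespace character.
whitespace = re.findall(r"\s", text)

# Splitting on \s leaves only the non-whitespace pieces.
pieces = re.split(r"\s", text)

print(whitespace)  # [' ', '\t', '\n', '\r', '\x0c']
print(pieces)      # ['a', 'b', 'c', 'd', 'e', 'f']
```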
3. Ranges
- | means "or": ab|bc matches either ab or bc
- […] matches any one of the characters listed inside the brackets. For example, [abc] matches the letter a, b, or c
- [a-z] matches any single character between a and z; the wildcard \w can be written as [A-Za-z0-9_]
- [^…] is the negation: it matches any single character that is not listed inside the brackets
Note: a bracket expression matches only one character at a time
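The range constructs above can be checked with a few calls to Python's standard re module. This is a rough sketch with made-up input strings; the last check compares \w (restricted to ASCII with re.ASCII, an assumption made to match the article's description) against the explicit class [A-Za-z0-9_].

```python
import re

# | picks between alternatives.
alternation = re.findall(r"ab|bc", "ab bc cd")

# [abc] matches exactly one of the listed characters.
one_of = re.findall(r"[abc]", "cabbage")

# [a-z] matches one character in the range; [^a-z] matches one outside it.
in_range = re.findall(r"[a-z]", "aZ9_z")
out_of_range = re.findall(r"[^a-z]", "aZ9_z")

# For ASCII input, \w and [A-Za-z0-9_] accept the same characters.
same_as_word = all(
    bool(re.fullmatch(r"\w", chr(code), re.ASCII))
    == bool(re.fullmatch(r"[A-Za-z0-9_]", chr(code)))
    for code in range(128)
)

print(alternation)   # ['ab', 'bc']
print(one_of)        # ['c', 'a', 'b', 'b', 'a']
print(in_range)      # ['a', 'z']
print(out_of_range)  # ['Z', '9', '_']
print(same_as_word)  # True
```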
4. Quantifiers
- * The asterisk means zero or more occurrences: the item may be absent, and if present it may repeat any number of times
- + The plus sign means one or more occurrences, that is, at least once
- ? The question mark means zero or one occurrence. For example, both http and https can be matched with https?
- {m} means exactly m occurrences. For example, a{1} means the letter a must appear exactly once
- {m,} means at least m occurrences; {0,} is equivalent to the asterisk and {1,} to the plus sign
- {m,n} means m to n occurrences; {0,1} is equivalent to the question mark
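To see the quantifiers in action, here is a minimal sketch in Python. The helper name matches is hypothetical; it simply wraps re.fullmatch so that a pattern has to cover the whole string.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Return True when the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None


# * : zero or more occurrences of b
print(matches(r"ab*", "a"), matches(r"ab*", "abbb"))              # True True

# + : at least one occurrence of b
print(matches(r"ab+", "a"), matches(r"ab+", "ab"))                # False True

# ? : the trailing s is optional
print(matches(r"https?", "http"), matches(r"https?", "https"))    # True True

# {m} : exactly m occurrences
print(matches(r"a{2}", "aa"), matches(r"a{2}", "aaa"))            # True False

# {m,} : at least m occurrences
print(matches(r"a{2,}", "a"), matches(r"a{2,}", "aaaa"))          # False True

# {m,n} : between m and n occurrences
print(matches(r"a{1,2}", "aa"), matches(r"a{1,2}", "aaa"))        # True False
```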
5. Hands-on practice
Now let's go back to the regular expression from the beginning of the article:
^([1-9][0-9]?|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(\.(0|[1-9][0-9]?|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3}$
Does it start to make sense now? Let's build it together. First, the rules:
- We take the range of IPv4 addresses to be 1.0.0.0-255.255.255.255. Stricter definitions of IPv4 addresses certainly exist, but we will not dwell on them here
- From this range we get the basic rule [1-255].[0-255].[0-255].[0-255], where [1-255] is shorthand for a numeric range, not a regex character class
- The last three parts all share the range [0-255], so we only need to write the rule for [0-255] first, and the rest is simple
- ^ and $ mark the beginning and end of a line; we will cover them in the next article
Let's start:
1. How to express a two-digit range
From the above, we know that a single digit can be represented by \d or [0-9]. What if we want to express a number with several digits, such as 0-99?
In the range 0 to 99, a number has at least one digit and at most two, so two wildcards are enough. For clarity, we use [0-9] rather than \d.
We can write it like this:
0|[1-9][0-9]?
Here, 0 stands for the number 0 on its own; it is listed separately, instead of writing [0-9][0-9]?, so that cases like 00 are excluded. The | in the middle means "or". The [1-9][0-9]? after it covers 1-99; remember, ? means zero or one occurrence.
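The two-digit rule can be checked exhaustively. The sketch below uses Python's re.fullmatch so that the whole string must fit the pattern; the names TWO_DIGITS and is_0_to_99 are made up for this example.

```python
import re

# The 0-99 pattern from the text: a lone 0, or 1-9 followed by an optional digit.
TWO_DIGITS = re.compile(r"0|[1-9][0-9]?")


def is_0_to_99(text: str) -> bool:
    """Return True when the whole string is a number from 0 to 99 without leading zeros."""
    return TWO_DIGITS.fullmatch(text) is not None


# Every number from 0 to 99 is accepted.
print(all(is_0_to_99(str(n)) for n in range(100)))  # True

# Leading zeros, three digits, and the empty string are rejected.
rejected = [s for s in ["00", "07", "100", ""] if not is_0_to_99(s)]
print(rejected)  # ['00', '07', '100', '']
```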
2. How to express a three-digit range
With two digits done, three digits are easy, so let's write the range from 0 to 255.
Note the following:
- With three digits, the hundreds digit can only be 1 or 2
- When the hundreds digit is 2, the tens digit can only be 0-5
- When the hundreds digit is 2 and the tens digit is 5, the ones digit can only be 0-5
Putting that together:
0|[1-9][0-9]?|1[0-9][0-9]|2[0-4][0-9]|25[0-5]
Dropping the leading 0 alternative turns the range 0 to 255 into 1 to 255:
[1-9][0-9]?|1[0-9][0-9]|2[0-4][0-9]|25[0-5]
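Both three-digit patterns can be verified by brute force. This is a sketch under the assumption that testing every integer from 0 to 999 is a sufficient check; the names OCTET_0_255 and OCTET_1_255 are hypothetical.

```python
import re

# 0-255: a lone 0, 1-99, 100-199, 200-249, 250-255.
OCTET_0_255 = re.compile(r"0|[1-9][0-9]?|1[0-9][0-9]|2[0-4][0-9]|25[0-5]")

# 1-255: the same pattern without the lone 0 alternative.
OCTET_1_255 = re.compile(r"[1-9][0-9]?|1[0-9][0-9]|2[0-4][0-9]|25[0-5]")

# Collect every number from 0 to 999 that each pattern accepts in full.
accepted_0_255 = [n for n in range(1000) if OCTET_0_255.fullmatch(str(n))]
accepted_1_255 = [n for n in range(1000) if OCTET_1_255.fullmatch(str(n))]

print(accepted_0_255 == list(range(0, 256)))  # True
print(accepted_1_255 == list(range(1, 256)))  # True

# Leading zeros are rejected as well.
print(OCTET_0_255.fullmatch("001") is None)   # True
```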
3. Combination
Finally, we combine the pieces. Remember the meaning of {3}: the preceding character or group is repeated exactly three times.
Note: don't forget to escape the dot with \, because an unescaped . matches any character.
^([1-9][0-9]?|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(\.(0|[1-9][0-9]?|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3}$
Done. Does everything suddenly click? All that remains is to verify the expression.
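One way to verify the finished expression is a short Python script. This is a minimal sketch, not the author's own test: the name is_ipv4 and the sample addresses are made up, and re.fullmatch is used instead of re.match because in Python $ also matches just before a trailing newline.

```python
import re

# The full expression from the article, split across two lines for readability.
IPV4 = re.compile(
    r"^([1-9][0-9]?|1[0-9][0-9]|2[0-4][0-9]|25[0-5])"
    r"(\.(0|[1-9][0-9]?|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3}$"
)


def is_ipv4(text: str) -> bool:
    """Return True when text is an address in the range 1.0.0.0-255.255.255.255."""
    return IPV4.fullmatch(text) is not None


# Hypothetical sample addresses.
valid = ["1.0.0.0", "192.168.1.1", "255.255.255.255"]
invalid = ["0.1.2.3", "256.1.1.1", "1.2.3", "1.2.3.04", "1.2.3.4.5", "1.2.3.4\n"]

print(all(is_ipv4(address) for address in valid))        # True
print(any(is_ipv4(address) for address in invalid))      # False
```

The first part must be 1-255, so 0.1.2.3 is rejected, which follows the range chosen in this article rather than any stricter definition of IPv4 addresses.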
6. Closing words
That wraps up the basic usage of regular expressions. If you have any questions, feel free to leave a comment. Thank you.
An online tool for testing regular expressions: regex101.com/
In the next article, we will learn about the assertion mechanism of regular expressions. Stay tuned!