This is the 7th day of my participation in Gwen Challenge
Regular expressions let developers write matching patterns, which can then be used to check whether input, such as text typed by a user, conforms to those patterns.
Create regular expressions and rules for regular expressions
Creating regular expressions
A. Literal syntax
/pattern/attributes
Here pattern is a regular expression rather than a string, and attributes is an optional set of flags: "g", "i", and "m" specify global matching, case-insensitive matching, and multi-line matching, respectively.
var regExp = /^[a-z]+$/i;
regExp.test('Asadf'); //true
In a regex literal, metacharacters such as \d (a digit) are written directly:
var regExp = /^[a-z]+\d[a-z]$/i;
This pattern matches a string that starts with one or more letters, followed by a single digit, and ends with a single letter (the i flag makes the match case-insensitive).
When the pattern is given as a string, be careful with every \ it contains, because it needs to be escaped:
var regExp1 = new RegExp('^[a-z]+\\d[a-z]$', 'i');
Since \ is itself the escape character inside a string literal, we need two \ to pass a single \ to the regular expression.
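As a small sketch of the escaping rule above (the variable names are illustrative only), the following compares a literal with constructor strings written with two backslashes and with one:

```javascript
// Regex literal: \d is written directly.
const literal = /^[a-z]+\d[a-z]$/i;

// String pattern: the backslash is doubled so the regex engine receives \d.
const fromString = new RegExp('^[a-z]+\\d[a-z]$', 'i');

// With a single backslash, the string literal '\d' is just 'd',
// so this pattern looks for the letter d instead of a digit.
const wrong = new RegExp('^[a-z]+\d[a-z]$', 'i');

console.log(literal.source === fromString.source); // true
console.log(fromString.test('ab3c')); // true
console.log(wrong.test('ab3c'));      // false
console.log(wrong.test('abdc'));      // true
```

Comparing the source property is a quick way to check that a string-built pattern is the one you intended.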
B. Create a RegExp object
new RegExp(pattern, attributes);
var regExp1 = new RegExp('^[a-z]+$', 'i'); // /^[a-z]+$/i
var regExp2 = new RegExp(/^[a-z]+$/, 'i'); // /^[a-z]+$/i
var regExp3 = new RegExp(/^[a-z]+$/i, 'g'); // /^[a-z]+$/g
- The pattern argument is a string specifying the pattern of the regular expression, or another regular expression.
- The attributes argument is an optional string containing the flags "g", "i", and "m", which specify global matching, case-insensitive matching, and multi-line matching, respectively. The m flag was not supported before ECMAScript standardization. In ES5 and earlier, this argument had to be omitted when pattern was a regular expression rather than a string; since ES2015 it is allowed and replaces the flags of the original expression, as regExp3 above shows.
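To illustrate how the two arguments interact, here is a minimal sketch, assuming an ES2015 or later engine (the flags property and the ability to pass flags alongside a regex pattern both date from ES2015):

```javascript
const base = /^[a-z]+$/i;

// Copying a regex without a flags argument keeps its flags.
const copy = new RegExp(base);

// Supplying flags replaces the original flags entirely.
const globalOnly = new RegExp(base, 'g');

console.log(copy.source, copy.flags);  // ^[a-z]+$ i
console.log(globalOnly.flags);         // g
console.log(globalOnly.ignoreCase);    // false
console.log(copy.test('ABC'));         // true
console.log(globalOnly.test('ABC'));   // false, the i flag was dropped
```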
C. Regular expression objects and regular expression strings
A regular expression string only becomes usable once it is passed to the RegExp constructor, which turns it into a regular expression object. The difference between the two forms is that a literal is delimited by forward slashes, while the other is an ordinary string.
Either way, what we actually work with is a regular expression object.
Rules for regular expressions
A. Beginning & End
^ indicates that the match must begin at the start of the string, and $ indicates that it must end at the end of the string.
// One or more letters at the start, then a single digit, then a single letter at the end (case-insensitive).
var regExp = /^[a-z]+\d[a-z]$/i;
var regExp1 = new RegExp('^[a-z]+\\d[a-z]$', 'i');
regExp.test('a3c'); //true
regExp1.test('a3cd'); //false
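To see what the anchors contribute, the following sketch runs the same pattern with and without ^ and $ (the variable names are illustrative):

```javascript
const anchored = /^[a-z]+\d[a-z]$/i;
const unanchored = /[a-z]+\d[a-z]/i;

console.log(anchored.test('a3c'));       // true
console.log(anchored.test('--a3c--'));   // false, extra characters at both ends
console.log(unanchored.test('--a3c--')); // true, a3c is found somewhere inside
```

Anchored patterns are the usual choice for validating a whole input, while unanchored ones are suited to searching inside longer text.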
B. Metacharacters
. Matches any single character except line terminators such as a newline
var str="That's hot!";
var patt1=/h.t/g;
console.log(patt1.test(str)); //true
console.log(str.match(patt1)); //['hat', 'hot']
\s Finds whitespace characters
Whitespace can be:
- Space character
- TAB character
- Carriage Return character
- New Line character
- Vertical TAB character
- Form Feed Character
var str="Th t s hot!";
var patt1=/h\st/g;
console.log(patt1.test(str)); //true
console.log(str.match(patt1)); //['h t']
\S Finds non-whitespace characters
var str="Th t s hot!";
var patt1=/h\St/g;
console.log(patt1.test(str)); //true
console.log(str.match(patt1)); //['hot']
\b Matches word boundaries
A word boundary is a position where a word character is not directly preceded or followed by another word character. The following matches a boundary at the start of a word, here a word whose first letter is m:
var str="moon";
var patt1=/\bm/;
console.log(patt1.test(str)); //true
console.log(str.match(patt1)); //["m", index: 0, input: "moon", groups: undefined]
var str="emoon";
var patt1=/\bm/;
console.log(patt1.test(str)); //false
console.log(str.match(patt1)); //null
Matching a boundary at the end of a word is shown in the following example:
var str="moonm";
var patt1=/m\b/;
console.log(patt1.test(str)); //true
console.log(str.match(patt1)); //["m", index: 4, input: "moonm", groups: undefined]
var str="emoomn";
var patt1=/m\b/;
console.log(patt1.test(str)); //false
console.log(str.match(patt1)); //null
\B Matches non-word boundaries
To get a feel for word boundaries and non-word boundaries, remember that a boundary is a position, not a particular character. A position is any gap between two adjacent characters, plus the start and the end of the string. Marking every position with | gives:
|e|x|a|m|p|l|e|:|a|+|b|=|3|
Showing only the word boundaries gives:
|example|:|a|+|b|=|3|
Showing only the non-word boundaries gives:
e|x|a|m|p|l|e:a+b=3
var str="Visit Schoolr";
var patt1=/\BSchool/g;
console.log(str.match(patt1)); //null
var str="Visit fSchoolr";
var patt1=/\BSchool/g;
console.log(str.match(patt1)); //["School"]
var str="Visit fSchool";
var patt1=/\BSchool/g;
console.log(str.match(patt1)); //["School"]
var str="Visit School";
var patt1=/\BSchool/g;
console.log(str.match(patt1)); //null
The examples above test for a non-boundary on the left side of School. In the first example the left side of School is a word boundary, so the result is null; in the second it is a non-boundary, so School matches. The right side of School may or may not be a boundary, but since the pattern only constrains the left side, that makes no difference. Matching a non-boundary on the right-hand side works the same way.
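For the right-hand side, a minimal sketch along the same lines might look like this (the test strings are made up for illustration):

```javascript
const rightNonBoundary = /School\B/;

// "r" follows "School", so the position after "l" is not a word boundary.
console.log('Visit Schoolr'.match(rightNonBoundary)[0]); // School

// End of string after "School" is a word boundary, so \B fails.
console.log('Visit School'.match(rightNonBoundary));     // null

// What comes before "School" does not matter for this pattern.
console.log('Visit fSchool'.match(rightNonBoundary));    // null
console.log('Visit fSchoolr'.match(rightNonBoundary)[0]); // School
```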
Simple character lookup
\w finds word characters; \W finds non-word characters. Word characters are a-z, A-Z, 0-9, and the underscore.
\d finds a digit
\D finds a non-digit character
These are relatively simple.
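Simple as they are, a short sketch may still help to show how the four classes split the same string (the sample string is invented for illustration):

```javascript
const sample = 'user_01: 42%';

console.log(sample.match(/\w+/g)); // ["user_01", "42"]
console.log(sample.match(/\W+/g)); // [": ", "%"]
console.log(sample.match(/\d+/g)); // ["01", "42"]
console.log(sample.match(/\D+/g)); // ["user_", ": ", "%"]
```

Note that the underscore counts as a word character, so user_01 is matched as a single run by \w+.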
Other metacharacters
- \0 Finds the NUL character.
- \n Finds a newline character.
- \f Finds a form feed character.
- \r Finds a carriage return.
- \t Finds a tab character.
- \v Finds a vertical tab.
- \xxx Finds the character specified by the octal number xxx.
- \xdd Finds the character specified by the hexadecimal number dd.
- \uxxxx Finds the Unicode character specified by the hexadecimal number xxxx.
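A few of these escapes in action, as a rough sketch. The octal form is left out here, since it is a legacy feature; the sample string is made up for illustration.

```javascript
// "A", a tab, "B", a newline, "C"
const text = 'A\tB\nC';

console.log(/\t/.test(text));      // true
console.log(text.search(/\n/));    // 3
console.log(/\x41/.test(text));    // true, 0x41 is "A"
console.log(/\u0042/.test(text));  // true, U+0042 is "B"
console.log(text.split(/[\t\n]/)); // ["A", "B", "C"]
```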
C. Quantifiers
So far we have covered creating regular expressions, anchoring them to the beginning and end of the input, and the metacharacters, all of which are important building blocks. Next come quantifiers, which specify how many times a character may repeat and so limit the length of what is matched.
The n+ quantifier matches any string containing at least one n
var str="Hellooo World! Hello W3School!";
var patt1=/o+/g;
console.log(str.match(patt1)); //["ooo", "o", "o", "oo"]
Because we’re matching one or more o characters, there’s going to be four matches here.
The n* quantifier matches any string containing zero or more n’s
var str="Hellooo World! Hello W3School!";
var patt1=/o*/g;
console.log(str.match(patt1)); //["", "", "", "", "ooo", "", "", "o", "", "", "", "", "", "", "", "", "", "o", "", "", "", "", "", "", "oo", "", "", ""]
This result arises because at every position where the next character is not an o, the pattern matches zero o's and yields "". For the first word, Hellooo, the matching proceeds like this:
At H we get "", at e we get "", at l we get "", at the second l we get "", then the three o's are matched together as "ooo", and at the following space we get "" again. Using | to mark every place where "" is matched gives:
|H|e|l|looo| |Wo|r|l|d|!| |H|e|l|lo| |W|3|S|c|hoo|l|!|
Other
- n? matches any string containing zero or one n
- n{X} matches any string containing a sequence of X n's
- n{X,Y} matches any string containing a sequence of X to Y n's
- n{X,} matches any string containing a sequence of at least X n's
- n$ matches any string ending in n
- ^n matches any string starting with n
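The counted quantifiers in the list above can be sketched as follows. The sample strings are invented, and \b is added so that each run of a's is tested as a whole word:

```javascript
const runs = 'a aa aaa aaaa';

console.log(runs.match(/\ba{2}\b/g));   // ["aa"]
console.log(runs.match(/\ba{2,3}\b/g)); // ["aa", "aaa"]
console.log(runs.match(/\ba{3,}\b/g));  // ["aaa", "aaaa"]

// n? makes the preceding character optional.
console.log('color colour'.match(/colou?r/g)); // ["color", "colour"]
```

Without the \b anchors, a{2} would also match inside the longer runs, which is why the sketch pins both ends of each word.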
?=n matches any string immediately followed by the specified string n
/regexp(?=n)/ or new RegExp("regexp(?=n)")
?!n matches any string that is not immediately followed by the specified string n
/regexp(?!n)/ or new RegExp("regexp(?!n)")
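A brief sketch of both lookaheads, using a made-up sentence. The point to notice is that the text inside (?=...) or (?!...) is checked but never becomes part of the match:

```javascript
const menu = 'apple pie, apple tart';

// Positive lookahead: "apple" only where " pie" follows.
console.log(menu.search(/apple(?= pie)/));   // 0
console.log(menu.match(/apple(?= pie)/)[0]); // apple

// Negative lookahead: "apple" only where " pie" does not follow.
console.log(menu.search(/apple(?! pie)/));   // 11

// Only the matched part is replaced, the lookahead text stays.
console.log(menu.replace(/apple(?= tart)/, 'pear')); // apple pie, pear tart
```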
D. Brackets
All of our examples above match one fixed character per position, and several characters across several positions. We can also allow any one of several characters at a single position.
The [abc] expression finds any character between the square brackets
var str="Helloo";
var patt1=/[lo]/g;
console.log(str.match(patt1)); //["l", "l", "o", "o"]
[^abc] finds any character that is not between the square brackets
var str="Helloo";
var patt1=/[^lo]/g;
console.log(str.match(patt1)); //["H", "e"]
The characters inside square brackets can be individual characters or a range of characters
- [0-9] finds any digit from 0 to 9
- [a-z] finds any character from lowercase a to lowercase z
- [A-Z] finds any character from uppercase A to uppercase Z
- [A-z] finds any character from uppercase A to lowercase z
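One detail about the last range is worth a quick sketch: ranges follow character codes, so a range from uppercase A to lowercase z also covers the six punctuation characters that sit between Z and a in ASCII. The sample strings below are illustrative only.

```javascript
// Letters only: the underscore is rejected.
console.log(/^[A-Za-z]+$/.test('Hello_')); // false

// A-z spans codes 0x41 to 0x7A, which includes [ \ ] ^ _ and the backtick.
console.log(/^[A-z]+$/.test('Hello_'));    // true
console.log('a[b]^c'.match(/[A-z]/g));     // ["a", "[", "b", "]", "^", "c"]
```

When only letters are wanted, combining two ranges as in [A-Za-z] is usually the safer spelling.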
To match several consecutive characters as a single unit, we need to use ().
() Finds any specified option
var str="Helloo";
var patt1=/(lo)/g;
console.log(str.match(patt1)); //["lo"]
The content of () can also offer several alternatives, separated by |
var str="Helloo";
var patt1=/(lo|He)/g;
console.log(str.match(patt1)); //["He", "lo"]
Using regular expression methods
RegExp object methods
The exec() method is used to retrieve a match for a regular expression in a string
The exec() method is powerful and general-purpose, but more complex to use than test() and the String methods that support regular expressions.
If exec() finds matching text, it returns a result array; otherwise it returns null. Element 0 of the array is the text that matched the whole regular expression, element 1 is the text that matched the first subexpression of RegExpObject (if any), element 2 is the text that matched the second subexpression (if any), and so on. In addition to the array elements and the length property, the returned array has two more properties: index holds the position of the first character of the matched text, and input holds the string that was searched. So when exec() is called on a non-global RegExp object, it returns the same array as String.match().
However, when RegExpObject is a global regular expression, the behavior of exec() is a little more complicated. It starts searching the string at the position given by the lastIndex property of RegExpObject. When exec() finds text that matches the expression, it sets lastIndex to the position just after the last character of the matched text. That is, you can iterate over all the matches in a string by calling exec() repeatedly. When exec() finds no more matching text, it returns null and resets lastIndex to 0.
Important: If you want to start searching a new string after a successful match in another string, you must manually reset the lastIndex property to 0.
Tip: exec() adds the full details to the array it returns whether or not RegExpObject is in global mode. This is where exec() differs from String.match(), which returns much less information in global mode. Calling exec() repeatedly in a loop is therefore the classic way to get complete match information for a global pattern.
var str = "Hello world o";
var patt = new RegExp("o", "g");
var result;
while ((result = patt.exec(str)) !== null) {
console.log(result);
console.log(patt.lastIndex);
}
console.log(str.match(patt))
In the original figure (not reproduced here), the red part is the output of exec and the green part is the output of match.
A special case: if I remove the global flag g, the whole program loops forever. As mentioned above, lastIndex is only advanced for a global regular expression; without g, every call to exec() starts again from the beginning of the string and returns the same first match, so it never returns null and the loop never ends.
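One way to sidestep both pitfalls described above (the endless loop without g, and a stale lastIndex) is to wrap the loop in a small helper. The following is only a sketch: findAll is a hypothetical name, not a built-in, and it assumes an engine that supports the ES2015 flags property.

```javascript
// Hypothetical helper: collects every match with its index.
function findAll(regexp, str) {
  // Work on a copy that always has the g flag, so lastIndex advances
  // and the caller's regex object is left untouched.
  const flags = regexp.flags.indexOf('g') === -1 ? regexp.flags + 'g' : regexp.flags;
  const re = new RegExp(regexp.source, flags);
  const found = [];
  let m;
  while ((m = re.exec(str)) !== null) {
    found.push({ text: m[0], index: m.index });
    // An empty match does not move lastIndex, so step forward manually.
    if (m[0] === '') re.lastIndex++;
  }
  return found;
}

console.log(findAll(/o/, 'Hello world o'));
// [{ text: "o", index: 4 }, { text: "o", index: 7 }, { text: "o", index: 12 }]
```

Copying the regex keeps the helper free of side effects, at the cost of constructing a new object on each call.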
The test() method is used to test whether a string matches a pattern
Returns true if the string contains text matching RegExpObject, false otherwise.
var str = "Visit W3School";
var patt1 = new RegExp("W3School");
console.log(patt1.test(str)); //true
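Because test() relies on the same matching machinery as exec(), the lastIndex rule described earlier applies to it too when the regex is global. A minimal sketch of the effect, with made-up strings:

```javascript
const globalO = /o/g;

const first = globalO.test('foo');           // true, and lastIndex moves to 2
const indexAfterFirst = globalO.lastIndex;   // 2
const second = globalO.test('oh');           // false: the search began at index 2 of "oh"
const indexAfterSecond = globalO.lastIndex;  // 0, reset after the failed match

globalO.test('foo');                         // lastIndex is 2 again
globalO.lastIndex = 0;                       // manual reset before switching strings
const third = globalO.test('oh');            // true

console.log(first, indexAfterFirst, second, indexAfterSecond, third);
// true 2 false 0 true
```

If the g flag is not needed for a plain yes-or-no check, leaving it off avoids the issue altogether.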
String object methods
The search() method is used to retrieve a specified substring in a string, or to retrieve a substring that matches a regular expression
Return value: the starting position of the first substring in stringObject that matches regexp. Note: If no matching substring is found, -1 is returned. The search() method does not perform a global match; it ignores the flag g. It also ignores the lastIndex property of regexp and always searches from the beginning of the string, which means it always returns the position of the first match in stringObject.
var str="Visit W3School!"
console.log((str.search(/W3School/))); // 6
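The two "ignores" in the description above can be checked with a small sketch (the string and the starting lastIndex value are arbitrary choices for illustration):

```javascript
const haystack = 'Visit W3School, W3School!';
const needle = /W3School/g;

// Pretend an earlier match left lastIndex somewhere in the middle.
needle.lastIndex = 10;

console.log(haystack.search(needle)); // 6, the first match, despite g and lastIndex
console.log(needle.lastIndex);        // 10, search() leaves it as it was
console.log(haystack.search(/Java/)); // -1
```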
The match() method searches a string for a specified value, or finds one or more matches of a regular expression
stringObject.match(searchValue) // search for a string
var str="Visit W3School sW3School!"
console.log((str.match('W3School'))); //["W3School", index: 6, input: "Visit W3School sW3School!", groups: undefined]
stringObject.match(regexp) // search with a RegExp object
var str="Visit W3School hW3School!"
console.log((str.match(/W3School/))); // ["W3School", index: 6, input: "Visit W3School hW3School!", groups: undefined]
var str="Visit W3School hW3School!"
console.log((str.match(/W3School/g))); // ["W3School", "W3School"]
The two examples above show that, when a RegExp object is passed, the result of match() depends on whether it has the global flag g.
If regexp does not have the flag g, match() performs only a single match in stringObject.
If no matching text is found, match() returns null. Otherwise, it returns an array containing information about the matching text it found. Element 0 of the array holds the matching text, while the remaining elements hold the text that matched the subexpressions of the regular expression. In addition to these regular array elements, the returned array also has two properties: index holds the position in stringObject where the matching text starts, and input holds a reference to stringObject.
If regexp has the flag g, match() performs a global search to find all matching substrings in stringObject. If no matching substring is found, null is returned. If one or more matching substrings are found, an array is returned. However, the contents of the array returned by a global match are quite different: its elements are all the matching substrings in stringObject, and it has no index or input property.
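The difference between the two return shapes is easiest to see with subexpressions. The following sketch uses an invented date string:

```javascript
const dates = '2021-06-07 and 2022-01-15';

// Without g: one match, plus the text of each subexpression and the index.
const single = dates.match(/(\d{4})-(\d{2})-(\d{2})/);
console.log(single[0], single[1], single[2], single[3]); // 2021-06-07 2021 06 07
console.log(single.index);                               // 0

// With g: every full match, but no subexpressions and no index.
const every = dates.match(/(\d{4})-(\d{2})-(\d{2})/g);
console.log(every);       // ["2021-06-07", "2022-01-15"]
console.log(every.index); // undefined

// No match at all gives null in both modes.
console.log(dates.match(/xyz/)); // null
```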
The replace() method is used to replace some characters in a string with other characters, or to replace a substring that matches a regular expression
stringObject.replace(regexp/substr, replacement)
- regexp/substr: required. A RegExp object or substring specifying the pattern to replace. Note that if the value is a string, it is searched for as literal text rather than being converted to a RegExp object first.
- replacement: required. A string value, or a function, that supplies the replacement text.
The replace() method of the string stringObject performs a search-and-replace operation. It looks in stringObject for the substrings that match regexp, and then replaces them with replacement. If regexp has the global flag g, the replace() method replaces all matching substrings. Otherwise, it replaces only the first matching substring.
Replacement can be either a string or a function. If it is a string, every match is replaced by that string. However, the $ character in replacement has special meaning: as listed below, text derived from the pattern match is used in the replacement.
- $1, $2, ..., $99: the text matched by the 1st to 99th subexpression of regexp
- $&: the substring that matched regexp
- $`: the text to the left of the matched substring
- $': the text to the right of the matched substring
- $$: a literal dollar sign
The list above uses the term subexpression: a part of the regular expression enclosed in ().
var name = "Doe, John";
var name1 = name.replace(/(\w+)\s*, \s*(\w+)/, "$2 $1");
console.log(name1); //John Doe
$1 refers to the text matched by the first () in the regular expression, which is Doe in the example above, and $2 refers to the text matched by the second (), which is John.
Here are examples of each use of $:
var name = "1Doe, John3";
var name1 = name.replace(/([a-zA-Z]+)\s*, \s*([a-zA-Z]+)/g, "$2 $1");
console.log(name1); //1John Doe3
// The match here is "Doe, John", so the 1 and 3 stay in their original positions and the matched text is replaced by "$2 $1", giving 1John Doe3.
var name1 = name.replace(/([a-zA-Z]+)\s*, \s*([a-zA-Z]+)/g, "$&");
console.log(name1); //1Doe, John3
// The output is the same as the input, because the matched substring is replaced by itself.
var name1 = name.replace(/([a-zA-Z]+)\s*, \s*([a-zA-Z]+)/g, "$`");
console.log(name1); //113
// The match is "Doe, John", so the 1 and 3 stay in place and the match is replaced by the text to its left, which is 1.
var name1 = name.replace(/([a-zA-Z]+)\s*, \s*([a-zA-Z]+)/g, "$'");
console.log(name1); //133
// Same idea as the previous one, but using the text to the right of the match, which is 3.
var name1 = name.replace(/([a-zA-Z]+)\s*, \s*([a-zA-Z]+)/g, "$$");
console.log(name1); //1$3
// The match "Doe, John" is replaced by a literal $, giving 1$3.
ECMAScript v3 specifies that the replacement argument of the replace() method can be a function rather than a string
var name = "1Doe, John3";
var name1 = name.replace(/([a-zA-Z]+)\s*, \s*([a-zA-Z]+)/g, function() {
console.log(arguments); //["Doe, John", "Doe", "John", 1, "1Doe, John3"]
});
The callback above receives five arguments:
- The first argument is the string that matched the pattern.
- The second through the third-to-last arguments are the subexpression matches, that is, $1 and $2 here.
- The second-to-last argument is the index in the original string at which the match starts.
- The last argument is the original string itself.
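As a sketch of how those arguments might be put to use, the callback below names them and returns the replacement text. The parameter names (match, last, first, offset, whole) are illustrative choices, not required names.

```javascript
const fullName = '1Doe, John3';

const swapped = fullName.replace(
  /([a-zA-Z]+)\s*, \s*([a-zA-Z]+)/g,
  function (match, last, first, offset, whole) {
    // Whatever the function returns is inserted in place of the match.
    return first.toUpperCase() + ' ' + last;
  }
);

console.log(swapped); // 1JOHN Doe3
```

Note that a callback which returns nothing, like a logging-only one, causes each match to be replaced by the string "undefined".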
Modifiers
Here is an example for the multi-line modifier. I won't show one for case-insensitivity, as it is easy.
var str = "first second\nthird fourth\nfifth sixth";
var patt = /(\w+)$/;
console.log(str.match(patt)); // ["sixth", "sixth", index: 32, input: "first second\nthird fourth\nfifth sixth", groups: undefined]
var patt = /(\w+)$/g;
console.log(str.match(patt)); // ["sixth"]
var patt = /(\w+)$/m;
console.log(str.match(patt)); // ["second", "second", index: 6, input: "first second\nthird fourth\nfifth sixth", groups: undefined]
var patt = /(\w+)$/gm;
console.log(str.match(patt)); // ["second", "fourth", "sixth"]
- The first example has no modifiers, so the whole text is treated as a single string, which ends with sixth.
- The second example adds the global modifier, which only changes the format of the value returned by match(), as discussed earlier.
- The third example adds only the multi-line modifier. The regular expression now also treats line breaks (\n and \r) as boundaries for ^ and $. The first match is second, and since there is no global modifier, only that one result is returned.
- The fourth example adds both the global and multi-line modifiers, so it outputs all three values.
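For completeness, here is a short sketch of the case-insensitive modifier next to the multi-line one, this time using ^ instead of $. The sample text is made up for illustration.

```javascript
const lines = 'First line\nsecond Line\nTHIRD line';

// i: letter case is ignored.
console.log(lines.match(/line/g));  // ["line", "line"]
console.log(lines.match(/line/gi)); // ["line", "Line", "line"]

// m: ^ matches at the start of every line, not just the start of the string.
console.log(lines.match(/^\w+/g));  // ["First"]
console.log(lines.match(/^\w+/gm)); // ["First", "second", "THIRD"]
```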
reference
- www.cnblogs.com/fuhai/p/614…
- www.w3school.com.cn/js/jsref_ob…