This is the 9th day of my participation in the August Wen Challenge.
1. The RegExp constructor
In ES5, there are two cases for arguments to the RegExp constructor.
1. The first case: the argument is a string, in which case the second argument specifies the regular expression modifiers (flags).
const regex = new RegExp('xyz', 'i');
// equivalent to
const regex = /xyz/i;
2. The second case: the argument is a regular expression, in which case a copy of the original regular expression is returned.
const regex = new RegExp(/xyz/i);
// equivalent to
const regex = /xyz/i;
However, in this case ES5 does not allow modifiers to be added through a second argument; doing so throws an error.
const regex = new RegExp(/xyz/, 'i');
// Uncaught TypeError: Cannot supply flags when constructing one RegExp from another
ES6 changes this behavior. If the first argument to the RegExp constructor is a regular expression object, the second argument can be used to specify modifiers.
Furthermore, the returned regular expression ignores the modifiers of the original regular expression and uses only the newly specified ones.
new RegExp(/abc/ig, 'i').flags
// "i"
In the code above, the modifiers of the original regular expression object are ig, which are overridden by the second argument, i.
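As a minimal sketch of the ES6 behavior described above (the variable names are illustrative and not from the original text), the following compares copying a regular expression with and without a second argument:

```javascript
// Original regular expression with two modifiers.
const original = /abc/ig;

// One argument: a new object with the same source and modifiers.
const copy = new RegExp(original);

// Two arguments (ES6): the original modifiers are discarded
// and only the newly specified ones are used.
const overridden = new RegExp(original, 'i');

console.log(copy === original);  // false (a copy, not the same object)
console.log(copy.flags);         // "gi"
console.log(overridden.flags);   // "i"
console.log(overridden.global);  // false
```

Note that the source (the pattern body) is carried over in both cases; only the modifiers differ.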
2. String methods that use regular expressions
There are four methods for string objects that can use regular expressions: match(), replace(), search(), and split().
In ES6, these four methods internally call RegExp instance methods, so that all regex-related methods are defined on the RegExp object.
String.prototype.match calls RegExp.prototype[Symbol.match]
String.prototype.replace calls RegExp.prototype[Symbol.replace]
String.prototype.search calls RegExp.prototype[Symbol.search]
String.prototype.split calls RegExp.prototype[Symbol.split]
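One way to observe this delegation is to pass a string method an object that is not a RegExp but implements the corresponding symbol method. The sketch below is only an illustration: findFirstDigit is a hypothetical object invented for this example.

```javascript
// Hypothetical object (not a RegExp) that only implements Symbol.search.
const findFirstDigit = {
  [Symbol.search](str) {
    for (let i = 0; i < str.length; i++) {
      if (str[i] >= '0' && str[i] <= '9') return i;
    }
    return -1;
  },
};

// String.prototype.search hands the work over to the Symbol.search method.
const position = 'abc7def'.search(findFirstDigit);

// A real regular expression exposes the same method on its prototype.
const sameViaRegExp = /\d/[Symbol.search]('abc7def');

console.log(position);      // 3
console.log(sameViaRegExp); // 3
```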
3. The y modifier
ES6 adds a y modifier to regular expressions, called the “sticky” modifier.
The y modifier is similar to the g modifier in that it also matches globally, and each subsequent match starts at the position following the last successful match.
The difference is that the g modifier succeeds as long as there is a match anywhere in the remaining text, while the y modifier requires the match to start at the first remaining position, which is what “sticky” means.
const s = 'aaa_aa_a';
const r1 = /a+/g;
const r2 = /a+/y;
r1.exec(s) // ["aaa"]
r2.exec(s) // ["aaa"]
r1.exec(s) // ["aa"]
r2.exec(s) // null
The code above has two regular expressions, one using the g modifier and the other using the y modifier.
Each regular expression is executed twice. On the first execution they behave the same, and the remaining string is _aa_a in both cases. Since the g modifier has no position requirement, the second execution returns a result, while the y modifier requires the match to start at the beginning of the remaining string, so it returns null.
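The “position following the last successful match” is tracked in the lastIndex property of the regular expression. The following sketch (variable names are illustrative) shows how lastIndex moves for a sticky regular expression, and that setting it by hand makes the match succeed where it previously failed:

```javascript
const s = 'aaa_aa_a';
const sticky = /a+/y;

const first = sticky.exec(s);               // matches "aaa" at position 0
const indexAfterFirst = sticky.lastIndex;   // 3

const second = sticky.exec(s);              // position 3 is "_", so the match fails
const indexAfterFailure = sticky.lastIndex; // reset to 0 after a failed match

sticky.lastIndex = 4;                       // point at the start of "aa"
const third = sticky.exec(s);               // matches "aa" exactly at position 4

console.log(first[0], indexAfterFirst);     // aaa 3
console.log(second, indexAfterFailure);     // null 0
console.log(third[0], third.index);         // aa 4
```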
In effect, the y modifier implies the start anchor ^ at the position where matching begins.
/b/y.exec('aba')
// null
In the code above, the match cannot start at the beginning of the string, so null is returned. The y modifier was designed so that the start anchor ^ is also effective in global matching.
Here is an example of the replace method on a string object.
const REGEX = /a/gy;
'aaxa'.replace(REGEX, '-') // '--xa'
In the code above, the last a is not replaced because it does not appear at the start of the next match.
With the match method, the y modifier on its own returns only the first match; it must be combined with the g modifier to return all matches.
'a1a2a3'.match(/a\d/y) // ["a1"]
'a1a2a3'.match(/a\d/gy) // ["a1", "a2", "a3"]
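A common use of the y modifier is extracting tokens one after another while making sure no character is silently skipped. The sketch below is an illustration under assumptions: the TOKEN pattern and the tokenize function are hypothetical and only recognize integers and a few arithmetic symbols.

```javascript
// Hypothetical token pattern: optional whitespace, then an integer or one
// of + - * / ( ), then optional whitespace. The y modifier forces every
// match to start exactly where the previous one ended.
const TOKEN = /\s*(\d+|[+\-*\/()])\s*/y;

function tokenize(source) {
  const tokens = [];
  TOKEN.lastIndex = 0;
  while (TOKEN.lastIndex < source.length) {
    const start = TOKEN.lastIndex; // remember the position; a failure resets lastIndex
    const match = TOKEN.exec(source);
    if (match === null) {
      throw new SyntaxError(`Unexpected character at position ${start}`);
    }
    tokens.push(match[1]);
  }
  return tokens;
}

console.log(tokenize('3 + 4'));    // [ '3', '+', '4' ]
console.log(tokenize('12*(3+4)')); // [ '12', '*', '(', '3', '+', '4', ')' ]
```

With the g modifier instead of y, an input such as '3 x 4' would quietly skip over the x; with y, the tokenizer stops and reports it.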
4. RegExp.prototype.sticky
To go with the y modifier, regular expression instances in ES6 have a sticky property that indicates whether the y modifier is set.
const r = /hello\d/y;
r.sticky // true
5. RegExp.prototype.flags
ES6 adds a flags property to regular expressions, which returns the modifiers of the regular expression.
// ES5 source property
// returns the body of the regular expression
/abc/ig.source
// "abc"
// ES6 flags property
// returns the modifiers of the regular expression
/abc/ig.flags
// 'gi'
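The source and flags properties together are enough to rebuild a regular expression with a different set of modifiers. As a small sketch (withGlobalFlag is a hypothetical helper, not part of the language):

```javascript
// Hypothetical helper: returns a copy of `re` that has the g modifier,
// keeping every modifier the original already had.
function withGlobalFlag(re) {
  const flags = re.flags.includes('g') ? re.flags : re.flags + 'g';
  return new RegExp(re.source, flags);
}

const globalCopy = withGlobalFlag(/abc/i);

console.log(globalCopy.source); // "abc"
console.log(globalCopy.flags);  // "gi"
console.log(globalCopy.sticky); // false
```

The flags property reports the modifiers in a fixed order, which is why 'ig' passed to the constructor reads back as "gi".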
6. Named group matching
1. The problem:
Regular expressions use parentheses for group matching.
const RE_DATE = /(\d{4})-(\d{2})-(\d{2})/;
In the code above, there are three sets of parentheses inside the regular expression. These three sets of matching results can be extracted using the exec method.
const RE_DATE = /(\d{4})-(\d{2})-(\d{2})/;
const matchObj = RE_DATE.exec('1999-12-31');
const year = matchObj[1]; // 1999
const month = matchObj[2]; // 12
const day = matchObj[3]; // 31
One problem with group matching is that the meaning of each group is not obvious, and groups can only be referenced by numeric index (such as matchObj[1]), so the references must be updated whenever the order of the groups changes.
2. The solution:
ES2018 introduced named capture groups, which let you assign a name to each group, making the code easier to read and the groups easier to reference.
const RE_DATE = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const matchObj = RE_DATE.exec('1999-12-31');
const year = matchObj.groups.year; // 1999
const month = matchObj.groups.month; // 12
const day = matchObj.groups.day; // 31
In the code above, each named group is written inside the parentheses, with the pattern prefixed by “question mark + angle brackets + group name” (?<year>). The group can then be referenced on the groups property of the result returned by the exec method. Meanwhile, the numeric index (matchObj[1]) is still valid.
Named group matching amounts to giving each group an ID that describes its purpose. If the order of the groups changes, the code that processes the match does not need to change.
If a named group does not match, the corresponding property of the groups object is undefined.
const RE_OPT_A = /^(?<as>a+)?$/;
const matchObj = RE_OPT_A.exec('');
matchObj.groups.as // undefined
'as' in matchObj.groups // true
In the code above, the named group as finds no match, so the value of matchObj.groups.as is undefined, but the key as is always present in groups.
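Because an unmatched named group is always present with the value undefined, a fallback value can be applied with a plain undefined check. The sketch below assumes a hypothetical "host:port" input format; RE_HOST, parseHost and the default port of 80 are made up for illustration.

```javascript
// Hypothetical pattern: a host name followed by an optional ":port" part.
const RE_HOST = /^(?<host>[^:]+)(?::(?<port>\d+))?$/;

function parseHost(input) {
  const match = RE_HOST.exec(input);
  if (match === null) return null;
  const { host, port } = match.groups;
  // `port` is undefined when the optional group did not take part in the match.
  return { host, port: port === undefined ? 80 : Number(port) };
}

console.log(parseHost('example.com'));      // { host: 'example.com', port: 80 }
console.log(parseHost('example.com:8080')); // { host: 'example.com', port: 8080 }
```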
3. Application scenario 1: destructuring assignment and replacement
With named group matching, you can use destructuring assignment to assign variables directly from the match result.
let {groups: {one, two}} = /^(?<one>.*):(?<two>.*)$/u.exec('foo:bar');
one // foo
two // bar
For string replacement, use $<group name> to refer to a named group.
let re = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/u;
'2015-01-02'.replace(re, '$<day>/$<month>/$<year>')
// '02/01/2015'
In the code above, the second argument to the replace method is a string, not a regular expression.
The second argument to the replace method can also be a function, which has the following sequence of arguments.
'2015-01-02'.replace(re, (
  matched, // the entire match: 2015-01-02
  capture1, // the first group: 2015
  capture2, // the second group: 01
  capture3, // the third group: 02
  position, // the position where the match starts: 0
  input, // the original string: 2015-01-02
  groups // an object made up of the named groups: {year, month, day}
) => {
  let {day, month, year} = groups;
  return `${day}/${month}/${year}`;
});
Named group matching adds one final argument to this function: an object made up of the named groups. This object can be destructured directly inside the function.
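Since the groups object is always the last argument, a replacement function can pick it up with rest parameters instead of listing every positional capture. The following is a sketch under assumptions: humanizeDates, RE_ISO_DATE and MONTH_NAMES are hypothetical names chosen for this example.

```javascript
const RE_ISO_DATE = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/g;
const MONTH_NAMES = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                     'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'];

// Hypothetical helper: rewrites every YYYY-MM-DD date as "D Mon YYYY".
function humanizeDates(text) {
  return text.replace(RE_ISO_DATE, (...args) => {
    // The named-groups object is the final argument of the callback.
    const { year, month, day } = args[args.length - 1];
    return `${Number(day)} ${MONTH_NAMES[Number(month) - 1]} ${year}`;
  });
}

console.log(humanizeDates('Released 2015-01-02, patched 2015-11-30.'));
// Released 2 Jan 2015, patched 30 Nov 2015.
```

Reading the last argument keeps the callback valid even if more capture groups are later added to the pattern.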
4. Application scenario 2: backreferences
To reference a named group inside the regular expression itself, write \k<group name>.
const RE_TWICE = /^(?<word>[a-z]+)!\k<word>$/;
RE_TWICE.test('abc!abc') // true
RE_TWICE.test('abc!ab') // false
The numeric reference (\1) is still valid.
const RE_TWICE = /^(?<word>[a-z]+)!\1$/;
RE_TWICE.test('abc!abc') // true
RE_TWICE.test('abc!ab') // false
The two reference syntaxes can also be used together.
const RE_TWICE = /^(?<word>[a-z]+)!\k<word>!\1$/;
RE_TWICE.test('abc!abc!abc') // true
RE_TWICE.test('abc!abc!ab') // false
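A typical use of a named backreference is spotting accidentally repeated words. The sketch below is only an illustration: RE_REPEATED and findRepeatedWords are hypothetical names, and the pattern only handles ASCII letters.

```javascript
// Hypothetical pattern: a word, some whitespace, then the same word again.
const RE_REPEATED = /\b(?<word>[a-z]+)\s+\k<word>\b/gi;

function findRepeatedWords(text) {
  const repeated = [];
  for (const match of text.matchAll(RE_REPEATED)) {
    repeated.push(match.groups.word);
  }
  return repeated;
}

console.log(findRepeatedWords('This is is a test of the the parser'));
// [ 'is', 'the' ]
```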
7. Regular expression match indices
Getting the start and end positions of a regular expression match is not easy. The result returned by the exec() method of a regular expression instance has an index property that gives the start position of the entire match, but if the regular expression contains groups, the start position of each group is hard to obtain.
A proposal to add an indices property to the result of the exec() method, giving the start and end positions of each match, reached stage 3 and was later standardized in ES2022. In the standardized version, the indices property is only present when the regular expression has the d flag.
const text = 'zabbcdef';
const re = /ab/d;
const result = re.exec(text);
result.index // 1
result.indices // [[1, 3]]
In the example above, the exec() method returns result, whose index property is the start position of the entire match (ab), and whose indices property is an array holding the start and end positions of each match. Since the regular expression in this example has no groups, the indices array has only one member, indicating that the entire match starts at position 1 and ends at position 3.
Note that the start position is included in the match, but the end position is not. For example, the match ab occupies positions 1 and 2 of the original string, so the end position is 3.
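Because the start is inclusive and the end is exclusive, each pair in indices can be passed straight to String.prototype.slice(). The sketch below assumes an engine that implements the standardized (ES2022) form of this feature, in which the regular expression must carry the d flag for indices to be populated:

```javascript
const text = 'zabbcdef';
const re = /ab+(cd)/d; // the d flag asks exec() to record match indices
const result = re.exec(text);

// Start and end of the whole match, and of the first group.
const [matchStart, matchEnd] = result.indices[0];
const [groupStart, groupEnd] = result.indices[1];

const matchText = text.slice(matchStart, matchEnd);
const groupText = text.slice(groupStart, groupEnd);

console.log(matchText === result[0]); // true
console.log(groupText === result[1]); // true
```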
If the regular expression contains groups, the indices array has multiple members, providing the start and end positions of each group.
const text = 'zabbcdef';
const re = /ab+(cd)/d;
const result = re.exec(text);
result.indices // [[1, 6], [4, 6]]
In the example above, the regular expression contains one group, so the indices array has two members. The first member is the start and end positions of the entire match (abbcd), and the second member is the start and end positions of the group (cd).
Here is an example of multiple group matches.
const text = 'zabbcdef';
const re = /ab+(cd(ef))/d;
const result = re.exec(text);
result.indices // [[1, 8], [4, 8], [6, 8]]
In the example above, the regular expression contains two groups, so the indices array has three members.
If the regular expression contains named groups, the indices array also has a groups property. This property is an object from which you can get the start and end positions of each named group.
const text = 'zabbcdef';
const re = /ab+(?<Z>cd)/d;
const result = re.exec(text);
result.indices.groups // { Z: [ 4, 6 ] }
In the example above, the indices.groups property of the result returned by the exec() method is an object that provides the start and end positions of the named group Z.
If a group does not match, the corresponding member of the indices array is undefined, and so is the corresponding property of the indices.groups object.
const text = 'zabbcdef';
const re = /ab+(?<Z>ce)?/d;
const result = re.exec(text);
result.indices[1] // undefined
result.indices.groups['Z'] // undefined
In the example above, the group does not match, so the corresponding members of the indices array and the indices.groups object are both undefined.
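Code that consumes indices.groups therefore needs to handle the undefined case. As a rough sketch (underlineGroup is a hypothetical helper, and the d flag of the standardized ES2022 feature is assumed), the following draws carets under a named group, or returns null when the group did not take part in the match:

```javascript
// Hypothetical helper: returns a line of spaces and carets marking where
// the named group matched in `text`, or null if there is nothing to mark.
function underlineGroup(text, re, groupName) {
  const result = re.exec(text);
  if (result === null || !result.indices || !result.indices.groups) return null;
  const span = result.indices.groups[groupName];
  if (span === undefined) return null; // the group did not match
  const [start, end] = span;
  return ' '.repeat(start) + '^'.repeat(end - start);
}

const marked = underlineGroup('zabbcdef', /ab+(?<Z>cd)/d, 'Z');
const unmarked = underlineGroup('zabbcdef', /ab+(?<Z>ce)?/d, 'Z');

console.log('zabbcdef');
console.log(marked);   // four spaces followed by ^^
console.log(unmarked); // null
```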
8. String.prototype.matchAll()
There are two ways to retrieve all the matches of a regular expression in a string.
1. The old way
Use the g modifier or the y modifier and retrieve the matches one by one in a loop.
const regex = /t(e)(st(\d?))/g;
const string = 'test1test2test3';
const matches = [];
let match;
while (match = regex.exec(string)) {
  matches.push(match);
}
matches
// [
//   ["test1", "e", "st1", "1", index: 0, input: "test1test2test3"],
//   ["test2", "e", "st2", "2", index: 5, input: "test1test2test3"],
//   ["test3", "e", "st3", "3", index: 10, input: "test1test2test3"]
// ]
In the code above, the while loop retrieves one match per round, three rounds in total.
2. The new way
ES2020 added the String.prototype.matchAll() method, which retrieves all matches at once. However, it returns an iterator, not an array.
const string = 'test1test2test3';
const regex = /t(e)(st(\d?))/g;
for (const match of string.matchAll(regex)) {
  console.log(match);
}
// ["test1", "e", "st1", "1", index: 0, input: "test1test2test3"]
// ["test2", "e", "st2", "2", index: 5, input: "test1test2test3"]
// ["test3", "e", "st3", "3", index: 10, input: "test1test2test3"]
Since string.matchAll(regex) returns an iterator, the matches can be retrieved with a for...of loop. The advantage of returning an iterator rather than an array is that an iterator is more resource-efficient when the match result would be a large array.
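The iterator can also be advanced by hand, which makes it possible to stop after the first few matches. A brief sketch (the variable names are illustrative):

```javascript
const iterator = 'test1test2test3'.matchAll(/t(e)(st(\d?))/g);

// Each call to next() produces one more match.
const first = iterator.next();
const second = iterator.next();

// Spreading the same iterator consumes whatever is left.
const rest = [...iterator];

console.log(first.value[0], first.value.index);   // test1 0
console.log(second.value[0], second.value.index); // test2 5
console.log(rest.length, rest[0][0]);             // 1 test3
```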
Converting the iterator to an array is simple: either the ... operator or the Array.from() method does the job.
// Convert to an array
[...string.matchAll(regex)]
// Convert to an array
Array.from(string.matchAll(regex))
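Array.from() also accepts a mapping function, so the conversion and the extraction of the interesting parts of each match can happen in one step. The sketch below combines matchAll() with named groups; RE_PAIR and parsePairs are hypothetical names for a made-up "key=value" format.

```javascript
// Hypothetical pattern for key=value pairs; matchAll() requires the g modifier.
const RE_PAIR = /(?<key>\w+)=(?<value>\w+)/g;

function parsePairs(input) {
  return Array.from(input.matchAll(RE_PAIR), (match) => ({
    key: match.groups.key,
    value: match.groups.value,
    index: match.index, // where this pair starts in the input
  }));
}

console.log(parsePairs('a=1;b=2'));
// [ { key: 'a', value: '1', index: 0 }, { key: 'b', value: '2', index: 4 } ]
```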