This is the 10th day of my participation in the August More Text Challenge. For details, see: August More Text Challenge
Java regular expressions
Regular expressions define the pattern of strings.
Regular expressions can be used to search, edit, or process text.
Regular expressions are not limited to a single language, but the syntax differs subtly from one language to another.
Regular expression examples
A plain string is itself a simple regular expression: for example, the regular expression Hello World matches the string “Hello World”.
The dot (.) is also a regular expression; it matches any single character, such as “a” or “1”.
The following table lists some examples and descriptions of regular expressions:
Regular expression | Description |
---|---|
`this is text` | Matches the string “this is text”. |
`this\s+is\s+text` | Note the `\s+` in the pattern. The `\s+` after the word “this” matches one or more whitespace characters, followed by the string “is”, then another `\s+` matching one or more whitespace characters, followed by the string “text”. Matches, for example: this is text |
`^\d+(\.\d+)?` | `^` anchors the match at the start of the input. `\d+` matches one or more digits. `?` makes the group in parentheses optional. `\.` matches a literal “.”. Matches, for example: “5”, “1.5”, and “2.21”. |
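The rows of the table can be checked with a short sketch. The class name `RegexTableDemo` and the helper `matchesWhole` are hypothetical names introduced for illustration; the backslashes are doubled because the patterns are written as Java string literals.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the name is not from the original article.
class RegexTableDemo {
    // Returns true only when the entire input matches the regular expression.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String decimal = "^\\d+(\\.\\d+)?"; // the regex ^\d+(\.\d+)?

        System.out.println(matchesWhole("this is text", "this is text"));           // true
        System.out.println(matchesWhole("this\\s+is\\s+text", "this   is  text"));  // true
        System.out.println(matchesWhole(decimal, "5"));     // true
        System.out.println(matchesWhole(decimal, "1.5"));   // true
        System.out.println(matchesWhole(decimal, "2.21"));  // true
        System.out.println(matchesWhole(decimal, "2."));    // false: a dot must be followed by digits
    }
}
```

`Pattern.matches` is used here because it tests the whole input, which makes the optional fractional part easy to observe.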
Java regular expressions are the most similar to Perl’s.
The java.util.regex package consists mainly of the following three classes:
- The Pattern class: A Pattern object is a compiled representation of a regular expression. The Pattern class has no public constructor. To create a Pattern object, call its public static compile method, which takes a regular expression as its first argument and returns a Pattern object.
- The Matcher class: A Matcher object is the engine that interprets a pattern and performs match operations on an input string. Like the Pattern class, Matcher has no public constructor. To get a Matcher object, call the matcher method on a Pattern object.
- PatternSyntaxException: PatternSyntaxException is an unchecked exception class that indicates a syntax error in a regular expression pattern.
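A minimal sketch of how the three classes fit together follows. The class `RegexClassesDemo` and its helpers `firstMatch` and `isValid` are hypothetical names chosen for this illustration; only `Pattern.compile`, `Pattern.matcher`, `Matcher.find`, `Matcher.group`, and `PatternSyntaxException` come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class; the name is not from the original article.
class RegexClassesDemo {
    // Returns the first substring of input that matches regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Pattern p = Pattern.compile(regex); // no public constructor: use the static factory
        Matcher m = p.matcher(input);       // a Matcher is obtained from the Pattern
        return m.find() ? m.group() : null;
    }

    // Returns true if regex compiles, false if it contains a syntax error.
    static boolean isValid(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // Unchecked exception: catching it is optional, done here only to report validity.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("\\d+", "order 42 shipped")); // 42
        System.out.println(isValid("(unclosed"));                   // false
    }
}
```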
Capture group
A capture group is a method of treating multiple characters as a single unit. It is created by grouping characters in parentheses.
For example, a regular expression (dog) creates a single group containing “d”, “o”, and “g”.
Capture groups are numbered by counting their opening parentheses from left to right. For example, in the expression ((A)(B(C))), there are four such groups:
- ((A)(B(C)))
- (A)
- (B(C))
- (C)
You can find out how many groups an expression has by calling the groupCount method on a Matcher object. The groupCount method returns an int indicating how many capture groups the Matcher object’s pattern contains.
There is also a special group, group 0 (retrieved with group(0)), which always represents the entire expression. This group is not included in the value returned by groupCount.
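The numbering can be observed directly with the expression ((A)(B(C))) from the list above. This is only an illustrative sketch; the class `CaptureGroupDemo` and the helper `groups` are hypothetical names, not part of the original article.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name is not from the original article.
class CaptureGroupDemo {
    // Returns group 0 through group N for a whole-input match, or an empty array if no match.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        // groupCount() excludes group 0, so the array needs one extra slot.
        String[] result = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            result[i] = m.group(i);
        }
        return result;
    }

    public static void main(String[] args) {
        Matcher m = Pattern.compile("((A)(B(C)))").matcher("ABC");
        System.out.println(m.groupCount());                                  // 4
        System.out.println(Arrays.toString(groups("((A)(B(C)))", "ABC")));   // [ABC, ABC, A, BC, C]
    }
}
```

Index 0 of the printed array is the entire match; indexes 1 to 4 follow the left-to-right order of the opening parentheses.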
Regular expression syntax
In other languages, \\ means: I want to insert an ordinary (literal) backslash into the regular expression; please don’t give it any special meaning.
In Java, \\ means: I want to insert a regular-expression backslash, so the character after it has special meaning.
So, in other languages (such as Perl), a single backslash \ is enough to escape, whereas in a Java string literal two backslashes are needed to produce that one escaping backslash. Put simply, in a Java regular expression written as a string literal, \\ stands for a single \ in other languages, which is why the regular expression for a digit is written \\d and a literal backslash is written \\\\.
    System.out.print("\\");   // prints \
    System.out.print("\\\\"); // prints \\
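The same doubling applies when a string literal is used as a pattern. The sketch below, with the hypothetical class name `BackslashDemo` and helper `matchesWhole`, shows that the literal "\\d" holds the two-character regex \d, and that the literal "\\\\" holds the regex \\, which matches one backslash.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the name is not from the original article.
class BackslashDemo {
    // Returns true only when the entire input matches the regular expression.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String digit = "\\d";      // the string contains the two characters \d
        String backslash = "\\\\"; // the string contains the two characters \\

        System.out.println(digit.length());      // 2
        System.out.println(backslash.length());  // 2
        System.out.println(matchesWhole(digit, "7"));       // true
        System.out.println(matchesWhole(backslash, "\\"));  // true: the input is one backslash
    }
}
```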
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class learn_15 {
        public static void main(String[] args) {
            String content = "this is a dog ,I like dog 300";

            // Pattern.matches requires the whole input to match the pattern.
            String pattern1 = ".*is.*";
            boolean isMatch = Pattern.matches(pattern1, content);
            System.out.println(isMatch); // true

            // Two capture groups: word characters followed by digits.
            String pattern2 = "(\\w+)(\\d+)";
            Pattern r = Pattern.compile(pattern2);
            Matcher ms = r.matcher(content);
            if (ms.find()) {
                System.out.println(ms.group());      // entire match: 300
                System.out.println(ms.groupCount()); // number of capture groups: 2
            } else {
                System.out.println("NO MATCH");
            }
        }
    }