This article is participating in the “Java Theme Month – Java Brush questions punch card”, see the activity link for details

Topic describes

This is 10. Regular expression matching on LeetCode with difficulty.

Tag: “Dynamic programming”

Given a string s and a character rule P, implement a regular expression match that supports ‘.’ and ‘*’.

  • ‘.’ matches any single character
  • The ‘*’ matches zero or more of the preceding elements

By matching, you want to cover the entire string S, not parts of the string.

Example 1:

Input: s = "aa" p = "a" Output: false Description: "A" cannot match the entire string "aa".Copy the code

Example 2:

Input: s = "aa" p = "a*" Output: true Thus, the string "aa" can be treated as 'a' repeated once.Copy the code

Example 3:

Input: s = "ab" p = ".*" Output: true Description: ".*" can match zero or more ('*') characters ('.').Copy the code

Example 4:

Input: s = "aab" p = "c*a*b" Output: true Explanation: Since '*' means zero or more, where 'c' is zero, 'a' is repeated once. So it matches the string "aab".Copy the code

Example 5:

Input: s = "Mississippi" p = "mis*is*p*." output: falseCopy the code

Tip:

  • 0 <= s.length <= 20
  • 0 <= p.length <= 30
  • S may be empty and contain only lowercase letters from A to Z.
  • P may be empty and contains only lowercase letters from A to Z, as well as the characters. And *.
  • Ensure that each occurrence of the character * is preceded by a valid character

Dynamic programming

For the string p, there are three types of characters:

  • Common characters: and is requiredsCharacters in the same position match exactly
  • '. ': Can matchsAny character at the same position in
  • The '*': Cannot be used aloneThe '*'Must be used simultaneously with the preceding character, the data is guaranteedThe '*'Can find the preceding character. To be able to matchsAny number of characters in the same position.

If a character such as a* is present, do you want to match 0 a, 1 A, or 2 A?

This problem can be solved using dynamic programming:

  • State definition: f(I,j) stands for considering whether the substring ending in I in S matches the substring ending in j in P. That is, the final result we want is f[n][m].

  • State transfer: that is, we need to consider how f(I,j) is obtained. Previously, p has three kinds of characters, so the state transfer here also needs to be discussed in three cases:

    1. P [j] is a common character: the matching condition is that the preceding character matches, and the ith character in S is the same as the JTH bit in P. That is, f(I,j) = F (i-1, J-1) &&s [I] == P [j].

    2. P [j] is ‘.’ : the matching condition is that the preceding character matches. The ith character in s can be any character. That is, f(I,j) = f(I – 1, j-1) &&p [j] == ‘.’.

    3. P [j] is ‘*’ : reads the character p[j-1], for example, character A. The number of a’s in s is 0, 1, 2…

      3.1. When the match is 0: f(I,j) = f(I, j-2)

      3.2. When the match for a: f (I, j) = f (j – I – 1, 2) && (s = = p [I] [1] | | p [1] = = ‘. ‘)

      3.3. When matching for 2: f (I, j) = f (j, I – 2-2) && ((s [I] = = p [1] && s] [I – 1 = = p] [j – 1) | | p [j] = = ‘. ‘)

      .

We know that by enumeration*How many matchesaIn this way, the algorithm complexity is very high.

We need to explore some “properties” to simplify the process.

Code:

class Solution {
    public boolean isMatch(String ss, String pp) {
        // Trick: Insert a space in the header of the character to get a char array that starts at 1 and makes f[0][0] = true, which can be rolled down
        int n = ss.length(), m = pp.length();
        ss = "" + ss;
        pp = "" + pp;
        char[] s = ss.toCharArray();
        char[] p = pp.toCharArray();
        // f(I,j) indicates whether characters 1 to I in s and 1 to j in p match
        boolean[][] f = new boolean[n + 1][m + 1];
        f[0] [0] = true;
        for (int i = 0; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                // If the next character is '*', the current character cannot be used alone, skip
                if (j + 1 <= m && p[j + 1] = =The '*') continue;
                
                // p[j] is a normal character
                if (i - 1> =0&& p[j] ! =The '*') {
                    f[i][j] = f[i - 1][j - 1] && (s[i] == p[j] || p[j] == '. ');
                } 
                
                P [j] = '*'
                else if (p[j] == The '*') {
                    f[i][j] = (j - 2> =0 && f[i][j - 2]) || (i - 1> =0 && f[i - 1][j] && (s[i] == p[j - 1] || p[j - 1] = ='. ')); }}}returnf[n][m]; }}Copy the code
  • Time complexity: n represents the length of S, m represents the length of P, and a total of N ∗mn * mn∗m states. The complexity is O(n∗m)O(n * m)O(n∗m)
  • Spatial complexity: Two dimensional arrays are used to record results. The complexity is O(n∗m)O(n * m)O(n∗m)

Dynamic programming is essentially enumeration (non-repeating violence enumeration), so its complexity is easy to analyze, as many states are computed as many times.


The last

This is the 10th article in our “Brush through LeetCode” series, which started on 2021/01/01. There are 1916 topics in LeetCode as of the start date, some of which have locks. We will finish all the topics without locks first.

In this series of articles, in addition to explaining how to solve the problem, I will also present the most concise code possible. If a general solution is involved, the corresponding code template will also be used.

In order to facilitate the students to debug and submit the code on the computer, I set up the relevant warehouse: github.com/SharingSour…

In the repository, you’ll see links to the series of articles, the corresponding code for the series of articles, the LeetCode source links, and other preferred solutions.