“This is my 35th day of participating in the First Challenge 2022. For more details: First Challenge 2022.”


Hi, everybody. I’m handsome.

Today’s solution to repeated substrings is a variant of the KMP algorithm.

I was stuck for two hours on this one, and then I had an Epiphany while I was pushing my chest at the gym…

It has been a long time since we have to submit twice to AC.

Let’s take a look at this problem.

LeetCode 459: Repeated substrings

The question

Given a non-empty substring, determine whether it can be formed by repeating one of its substrings more than once.

The sample

Input: “abab” Output: True

prompt

  • The given string contains only lowercase English letters and cannot exceed 10000 characters.

title

The official difficulty rating, in this case, is simple, I suspect LeetCode is in the mood…

If this is an easy question, I’m live… Ahem…

The previous KMP algorithm we said is to solve the pattern string position in the main string, repeated substring this problem is not to find the substring position in the main string, but to determine whether a string is composed of multiple repeated substrings.

Officially, this is a periodic string problem, and KMP is the classic solution to this kind of problem.

But even though it is KMP, it only uses the next piece, and the rest depends on the formula…

Let me just start with the formula:

len(s) % (len(s) -  maxLen) = 0
Copy the code

Where len(s) is the length of the string s and maxLen is the length of the longest common prefix and suffix.

So if s is a periodic string, then the length of s is a multiple of the length of s minus the length of the longest common prefix and suffix, then the string s is a periodic string.

As for why this formula, we have to start from Pangu…

Interested to carefully study the reasoning of this article, it is really very good, illustrated, to ensure that you will understand after carefully reading…

Writings. Sh/post/algori…

The longest common prefix is next.

I have detailed how to ask in the previous KMP introduction article:

ACM players take you to play KMP algorithm!

The next array starts at 0 and stores locations, so the longest common suffix length should be next[len(s) -1] + 1.

But knowing the formula is not enough, this problem will still get stuck, yes, it’s me…

if maxLen == 0 or s[n - 1] ! = s[n - 1 - maxLen]: return FalseCopy the code

MaxLen == 0 next[len(s) -1] == -1

Now let’s focus on the or behind this, I’m missing this line of code.

This is a special case, according to the next method I described in [KMP algorithm introduction] article:

The next value of “abab” is [-1,0,0,1], and the next value of “abac” is [-1,0,0,1]. The former is a periodic string, and the latter is not, but the latter’s next value is counted as a periodic string as the former.

So what’s the solution? Is to add a constraint:

s[n - 1] ! = s[n - 1 - maxLen]Copy the code

If it’s a periodic string, the letters in the corresponding positions should be the same, if the letters in the corresponding positions are different, it’s definitely not a periodic string.

You must pay attention.

Code implementation

Python code implementation

class Solution: def getNext(self, s: str): Next = [-1] * len(s) # next[0] = -1, If j == -1 or s[I] == s[j]: if j == -1 or s[I] == s[j]: if j == -1 or s[I] == s[j]: I += 1 j += 1 next[I] = j # next[I] = j # j = next[j] return next def repeatedSubstringPattern(self, s: str) -> bool: if len(s) == 0: Return False next = self.getNext(s) n = len(s) # maxLen = next[n-1] + 1 if maxLen == 0 or s[n-1]! = s[n - 1 - maxLen]: return False return n % (n - maxLen) == 0Copy the code

Java code implementation

class Solution {
    
    public int[] getNext(String s){
        int i = 0;
        int j = -1;
        int n = s.length();
        
        int[] next = new int[n];
        next[0] = -1;
​
        while(i < n - 1){
            if(j == -1 || s.charAt(i) == s.charAt(j)){
                i += 1;
                j += 1;
                next[i] = j;
            }else{
                j = next[j];
            }
        }
        return next;
    }
​
​
    public boolean repeatedSubstringPattern(String s) {
        int n = s.length();
​
        if (n == 0){
            return false;
        }
​
        int[] next = getNext(s);
        int maxLen = next[n - 1] + 1;
​
        if(maxLen == 0 || s.charAt(n - 1) != s.charAt(n - 1 - maxLen)){
            return false;
        }
​
        return n % (n - maxLen) == 0;
    }
}
Copy the code

That’s where the repeated substrings end up. Well, KMP is good for this type of notation in the future.

It would be nice if you could understand the formula and why the formula is pushed the way it is.

If you don’t understand it, it’s ok. It’s just for you to practice using KMP and not get stuck.

Come on, everybody, remember to like me!

I am handsome egg, we will see you next time ~