This is the 10th day of my participation in Gwen Challenge
The KMP algorithm is used for string matching. It ensures that the time complexity of the algorithm is O(N) by ensuring that the target string is scanned only once.
The KMP algorithm uses the information stored in the storage (usually placed in dp, next array) to string the matches in the correct position. (Also a dynamic programming algorithm)
So, the key is to match the string to determine next[]. The key to determine next[] is to match the common prefix of the string.
In the detailed explanation of KMP algorithm written by Labuladong, referring to Algorithm 4, the process of jump is regarded as the state transformation of the state machine. This idea is very consistent with the thinking of the computer, but if you have not learned the principle of compilation, I am afraid that it is a little difficult to understand the expression of using the state machine.
public class KMP {
private int[][] dp;
private String pat;
public KMP(String pat) {
this.pat = pat;
int M = pat.length(a);// dp[status][character] = next status
dp = new int[M][256];
// base case
dp[0][pat.charAt(0)] = 1;
// The shadow state X starts with 0
int X = 0;
// The current state j starts at 1
for (int j = 1; j < M; j++) {
for (int c = 0; c < 256; c++) {
if (pat.charAt(j) == c)
dp[j][c] = j + 1;
else
dp[j][c] = dp[X][c];
}
// Update the shadow status
X = dp[X][pat.charAt(j)]; }}public int search(String txt) {...}
}
Copy the code
In ordinary use, it is common to use a one-dimensional array to represent pattern strings. Whenever a mismatch occurs, find the mismatched character PI and go to the preceding string F=p1… Pi-1, move the pattern string back, place F first where the front (FL) and back (Fr) want to coincide, and is the position where the pattern string should move back. In other words, whenever a mismatch occurs, the place j points to is exactly the length of the F string that it wants to overlap with, plus one. So, we usually define a next[j] array. J is 1~m, where m is the length of the pattern string, indicating that when the JTH character is mismatched, the match should start again from the character at next[j].
To summarize the key points
The key to KMP is how to find the next[] array, and I didn’t do a good job of explaining that above. Next [0] is next[0], and we know the value of next[I] when we solve for next[I +1]. In other words, we only need to compare the value Pi of P next[I]. Next [I] = next[I]+1; P next[I] = next[I]+1; P next[I] = next[I]+1; We can find the process of matching before and after strings, also known as pattern matching problem, our purpose is to find the common matching before and after strings, in this case we find P next[next[j]], which has the same matching in the following matches. So it goes on and on and on and on. Let’s talk about the KMP improvement, which actually takes advantage of the match failure. If the failed match is the same as the next P [next[I] match, then you can actually go one more step, right? You are not sure which character in the target string will be matched, but you are clear about the step in the pattern string that failed to match. That is, if you are sure that P next[I] is equal to Pi, then you will fail to match directly, but if you fail to match, Then you need to match P next[next[I]] so you can actually go ahead and wait until you reach the top of the string or hit the one that matches.