This paper does not make too much professional explanation, only for their own learning do not understand the record of the place

First, BF algorithm

Also known as naive pattern matching algorithm, and our thinking pattern is basically the same. If the match is successful, both Pointers are increded by one; if the match fails, Pointers to the primary string are backtracked to the next position, and Pointers to substrings are backtracked to zero. In order to make string easier to use, we use subscripts to start from zero.

BF implementation code is as follows

int indexOneDF(string s,string t){
    int i=0,j=0;// I points to s, j points to t
    while(i<s.size() && j<t.size()) {if(s[i] == t[j]){ i++; j++;// If they are equal, move to the next one
        }else{
            // Based on the value of j, determine how much I has moved forward to the next character back to the previous position
            i = i-j+1; 
            j=0; }}if(j>=t.size()) {return i-t.size(a);// Returns the first position of the matched main string
    }else{
        return - 1; }}Copy the code

Second, KMP algorithm

Also known as pattern matching algorithm, unlike the naive algorithm above, KMP algorithm does not mindlessly align substrings to the next position when moving to the next position, but instead uses a Next array. Next [I] indicates the length of the longest common prefix between next[0, I], which in practice refers to the next digit of the longest prefix.

Why is it possible to calculate the longest prefix and suffix by reverting j to next[J-1]?

This paragraph is partially referenced from juejin.cn/post/694547… This paper shows the process of KMP in the way of dynamic graph, and directly shows the realization process of KMP algorithm.

Suppose that the prefix pointer j and suffix pointer are calculated as follows: mismatches occur when j=7 and I =14. The previous longest prefix was ABCDABC, so we need to find what the new longest prefix between next[0, I] is.

Next [J-1] represents the length of the longest prefixes between next[0, j-1] (that is, pointing to the next bit of the longest prefixes), and the sub-longest prefixes of the longest prefix X0 are x1 and X2 regions respectively. Similarly, the sublongest prefixes of the longest suffixes Y0 are y1 and y2, and since X0 and Y0 match, X1 and y2 match, so we can compare the values of j and I to know the length of the longest prefixes between next[0, I] at this time. If they are equal, the length of the longest common prefix and suffix is j + 1. If they are not equal, go back to j and repeat the process.

If I go back again, j and I still don’t want to wait, j goes back to 0, still don’t want to wait. So next[I] = 0.

The KMP implementation code is as follows

In order to use string, subscripts starting from 0 are not recommended.

/* * @Author: IndexYang * @Date: 2021-11-28 12:40:01 * @Last Modified by: IndexYang * @Last Modified time: The 2021-11-28 15:39:35 * /
 
#include <iostream>
#include <cstring>
using namespace std;
const int N = 100010;

void KMPfromZero(string s,string t){
    int m=s.length(),n=t.length(a);//m is the length of the main string and n is the length of the string to be matched
    int ne[N];// Next array, which is how far back the current position needs to be moved
    ne[0] = - 1;// Important!!
    for(int i=1,j=- 1; i<n; i++){// If j has moved, and the next j is not equal to I, move backward
        while(j>=0&& t[i] ! = t[j+1]) j=ne[j];
        // if the next of j is equal to I, then j++
        if(t[i] == t[j+1]) j++;
        / / j values
        ne[i] = j;
    }
    for(int i=0,j=- 1; i<m ; i++){while(j! =- 1&& s[i]! =t[j+1]) j=ne[j];
        if(s[i] == t[j+1]) j++;
        if(j == n- 1) {// If success makes it to the end
            printf("%d ",i-n+1); j = ne[j]; }}}int main(a){
    string s,t; //s is the primary string, and t is the string to be matched
    cin>>s>>t;
    // The naive algorithm
    //cout<<indexDF(s,t)<<endl;
    / / KMP algorithm
    KMPfromZero(s,t);
    return 0;
}
Copy the code

Here is the use of string array, string storage from the array subscript 1, easy to use.

// KMPfromOne
#include <iostream>
using namespace std;
const int N = 100010, M = 1000010;

int n, m;
int ne[N];
char s[M], p[N];

int main(a){
    cin >> n >> p + 1 >> m >> s + 1;
    for (int i = 2, j = 0; i <= n; i ++ ){
        while(j && p[i] ! = p[j +1]) j = ne[j];
        if (p[i] == p[j + 1]) j ++ ;
        ne[i] = j;
    }
    for (int i = 1, j = 0; i <= m; i ++ ){
        while(j && s[i] ! = p[j +1]) j = ne[j];
        if (s[i] == p[j + 1]) j ++ ;
        if (j == n){
            printf("%d ", i - n); j = ne[j]; }}return 0;
}
Copy the code