Front knowledge

  • A prefix is a special substring that starts at the beginning of a string and ends somewhere. The i-ending Prefix of the string S is denoted as Prefix(S, I), i.e., Prefix(S, I).

    For example:

  • True prefix refers to the prefix of S except S itself.

    For example, all prefixes of the string abcabcd are {a, ab, ABC, abca, abcab, abcabc, abcabcd}, while its true prefix is {a, ab, ABC, abca, abcab, abcabc}.

  • substring

    Substring of string S S [j] I… I < j said string from I to j, this passage is gradate S [I], S/I + 1,… String formed by S[j].

    Sometimes S[I…j], I >j are used to represent empty strings.

  • Definition of a prefix function

    Given a string of length, the prefix function is defined as an array Xn of length N. The definition is:

    1. If the substring s[0… I] has an equal pair of true prefixes and true suffixes :s[0…k-1] and S [i-k+1… I], then the length of the equal true prefix (or true suffixes, because they are equal) is Xn=[k];
    2. If more than one pair is equal, then it is the length of the longest pair;
    3. If there are no equal, then.

    In simple terms, the length of the longest equal true prefix and true suffix of the substring.

    It is described in mathematical language as follows:

    In particular, specify Xn[0] = 0(empty strings are ignored here; they are avoided in the string definition above).

    For example, for the string abcabcd,

    Because a has no true prefix or suffix, it is zero by convention

    Because ab has no equal true prefix and true suffix

    Because ABC has no equal true prefix and true suffix

    Because abCA has only one equal pair of true prefixes and true suffixes: A, length 1

    Because abCAB equals true prefixes and true suffixes are only ab, of length 2

    Because true prefixes and true suffixes equal to abcabc are only ABC, of length 3

    Since abcabcd has no equal true prefix and true suffix,

    Similarly, the string aabaaab can be prefixed to [0,1,0,1,2,2,3].

Evaluates the prefix function for a string

  • A naive algorithm for finding prefix functions
package com.algorithms.other;

import java.util.Arrays;

// Find the prefix function
// Complexity is O(n^3)
public class PrefixFun {
    public static void main(String[] args) {
        int[] res =  PrefixFun.prefixFun("aabaaab");
        Arrays.stream(res).forEach(System.out::println);
// 0 1 0 1 2 2 3
    }
    /** * find the prefix function *@param str
     */
    static int[] prefixFun(String str){

       int len = str.length();
       int[] result = new int[len];
       // The outermost loop is used to traverse the string
       / / string s [1]
       / / string s [1, 2]
       / / string s [1, 2, 3]
       / /...
        result[0] = 0;
       for(int i = 1; i<len; i++){// The inner loop is used to find the longest common string for the string
           for(intj = i; j>=0; j--){if(str.substring(0,j).equals(str.substring(i-j+1,i+1))){
                   result[i] = j;
                   break; }}}returnresult; }}Copy the code

This algorithm is order n^3.

  • Optimization Direction 1

    Decrement: The values of two adjacent prefix functions either increase by 1 or remain the same or decrease

-  for(intj = i; j>=0; j--){ +for(int j = result[i-1] +1; j>=0; j--){Copy the code

KMP algorithm

The method of finding the prefix function is

package com.algorithms.string;

import java.util.Arrays;

// Find the prefix function
// Complexity is O(n^3)
public class PrefixFun {
    public static void main(String[] args) {
        int[] res =  PrefixFun.prefixFunOn("ababcabcacbab");

        Arrays.stream(res).forEach(System.out::println);
// 0 1 0 1 2 2 3
    }
    /** * the prefix function (On3, On2 after modification) *@param str
     */
    static int[] prefixFunOn3(String str){
       int len = str.length();
       int[] result = new int[len];
       // The outermost loop is used to traverse the string
       / / string s [1]
       / / string s [1, 2]
       / / string s [1, 2, 3]
       / /...
        result[0] = 0;
       for(int i = 1; i<len; i++){// The inner loop is used to find the longest common string for the string
    // for(int j = i; j>=0; j--){
               for(int j = result[i-1] +1; j>=0; j--){if(str.substring(0,j).equals(str.substring(i-j+1,i+1))){
                   result[i] = j;
                   break; }}}return result;
    }

    /** * o(n) prefix function algorithm // *@param str
     * @return* /
    static int[] prefixFunOn(String str){
        int len = str.length();
        //pr
        int[] prefixArr = new int[len];
        // Let result[0] = 0 by convention
        prefixArr[0] = 0;
        for(int i = 1; i<len; i++){// Get the transfer equation is no

            int j = prefixArr[i-1];

            // If prefix[i-1] matches the added adjacent string, the value of prefix[I] will be prefix[I]+1
            // Otherwise we do a second match through the prefix array
            // We keep looking for this condition

            while(j>0&&str.charAt(j)! =str.charAt(i)){ j = prefixArr[j-1];
            }
            // if not found, j must be 0
            // If found, increment the result by 1
            if(str.charAt(i)==str.charAt(j)){
                j++;
            }
            prefixArr[i] = j;
        }


        returnprefixArr; }}Copy the code

Using the two-pointer method to search for strings,

The whole process is:

The specific code is

package com.algorithms.string;

public class Kmp {

    public static void main(String[] args) {
        System.out.println( Kmp.KmpSolution("adsdbbabb"."dbba"));
    }

    /** * String matching problem *@paramSTR string *@paramPattern matches the string */
    static int KmpSolution(String str,String pattern){
        int[] prefix = PrefixFun.prefixFunOn(str);
        int i=0,j=0;
        while (i<str.length()&&j<pattern.length()){
            // The string is the same as the matching string and both Pointers move forward
            if(str.charAt(i)==pattern.charAt(j)){
                // return if j reaches the last index
                if(j==pattern.length()-1) {return i-j;
                }
                i++;
                j++;
            }

            // let j jump back to see the minimum prefix function at the previous position and jump back to the next position to continue the comparison
           else if(j==0||prefix[j-1] = =0){
               i++;
            }
           else {
               j = prefix[j-1]; }}return -1; }}Copy the code

reference

Prefix functions and KMP algorithms – OI Wiki (oi-wiki.org)