directory
A naive pattern matching algorithm
Ii. KMP algorithm (improved pattern matching algorithm)
Hello! Hello, everyone, I am gray little ape, a super bug writing sand carving program ape!
Today to share with you about a string comparison of pattern matching algorithm, in the data structure of the related operations, of the string for strings of positioning operation is often referred to as string pattern matching, also he is one of the most important operation in all sorts of string processing, substring also called pattern string at the same time, about the main series and the mode string matching algorithms commonly used there are two main types: Simple pattern matching algorithm and KMP algorithm (improved pattern matching algorithm), the next two algorithms will be analyzed respectively.
A naive pattern matching algorithm
The naive pattern matching algorithm is also known as the Brutt-Foss algorithm, and its basic idea is: From the main list of the first character and pattern in comparison with the first character of string, if equal, is one of the character after the comparison, or from the main list of the second character and pattern string of the first character to compare, until the pattern on each of the characters, in turn, and the main chain of continuous character sequence matching, is called a match at this moment, If a match for the pattern string cannot be found in the main string, then the match is said to have failed.
Here is an example of a simple pattern matching algorithm for storing strings in character arrays.
// Pass in the primary string s and the mode string t, and set pos to match from the primary string
int index(char s[],char t[],int pos) {
int i,j,slen,tlen;
i = pos;
j = 0;
slen = s.length; // Get the length of the main string
tlen = t.length; // Get the length of the pattern string
while (i<slen && j<tlen) {
if (s[i] == t[j]) {
i++;
j++;
}
else {
i = i-j+1;
j=0; }}if (j>=tlen) {
return i-tlen;
}
return 1;
}
Copy the code
Ii. KMP algorithm (improved pattern matching algorithm)
KMP algorithm is an algorithm of improved, compared with the simple pattern matching algorithm, KMP algorithm in the main series and the mode of string matching process, when comparing appeared in the process of matching the character are not equal, don’t need to back up the string of character position of the pointer, but using part “match” result has been the pattern string “sliding” to the right as far as possible, Let’s go ahead and compare.
Set the mode string to “P0… P(m-1) “, the idea of KMP matching algorithm is: when the character Pj in the pattern string is not equal to the corresponding character Si in the main string, because the first j characters (” P0… P(j-1) “) has been successfully matched, so if “P0… P (k – 1) “and” P (j-k)… P(j-1) “is the same, then P0 can be compared with Si so that I does not need to fall back.
In KMP algorithm, substring sliding can be realized according to the value of next function of pattern string. If next[j]=k, then next[j] means that when Pj in pattern string is not equal to corresponding characters in main string, Pnext[j] of pattern string is compared with corresponding characters of main string.
The next function is defined as follows:
The following is the procedure for the next function of the pattern string:
// Find the next function of the pattern string p and store it in the next array
void Get_next(char *p,int next[])
{
int i,j,slen;
slen = strlen(p); // Get the length of the pattern string
i=0;
while (i<slen) {
if (j==-1||p[i]==p[j]) {
++i;
++j;
next[i] = j;
} else{ j = next[j]; }}}Copy the code
After obtaining the next function,
In the KMP pattern matching algorithm, the subscript of the first character of the pattern string is 0, then the KMP algorithm is as follows:
/* Use the next function of the pattern string p to find the position of p in the main string s starting from the pos character */
/* If the match is successful, return the subscript of the mode string in the main string, otherwise return -1 */
int Index_KMP(char *s,char *p,int pos,int next[])
{
int i,j,slen,plen;
i=pos-1;
j=-1;
slen = strlen(s); // Find the length of the main string
plen = strlen(p); // Find the length of the pattern string
while (i<slen && j<plen) {
if (j==-1||s[i]==p[j]) {
++i;
++j;
} else{ j=next[j]; }}if (j>=plen) {
return i-plen;
}
else {
return -1}}Copy the code
On the string pattern matching algorithm is shared here, there are shortcomings also hope you together correct,
Feel good remember to like attention yo!
Big bad Wolf accompany you to progress together!