Analysis of the
Description:
- The following solution “violence algorithm” is the foundation, “dynamic programming” must master, “center diffusion” method to write;
- “Manacher’s algorithm” is only used to broaden the field of vision, and most algorithmic interviews won’t require it (unless the candidate is a competition candidate).
Brute Force
- According to the definition of the substring, enumerate all the substrings whose length is greater than or equal to 22, and judge whether they are palindromes.
- In the implementation, palindrome validation can only be performed for substrings greater than the “longest currently obtained substring length”;
- When recording the longest substring, you can only record “current substring start position” and “substring length” without interception. This step will be implemented in the later methods.
Description: The violent solution has high time complexity, but it is clear and simple to write. Since the probability of correctness is high, we can use the brute force matching algorithm to verify that the other algorithms we write are correct. In many cases, the optimized solution is based on the “violent solution”, which is obtained in space for time. Therefore, thinking clearly about the violent solution and analyzing its shortcomings can open our minds for us in many cases.
Reference code 1: Java code works, C++ code exceeds memory limits, Python code times out.
public class Solution {
public String longestPalindrome(String s) {
int len = s.length();
if (len < 2) {
return s;
}
int maxLen = 1;
int begin = 0;
// s.char (I) checks for out-of-bounds subscripts every time, so it is converted to a character array first
char[] charArray = s.toCharArray();
// Enumerate all substrings with length greater than 1 charArray[I..j]
for (int i = 0; i < len - 1; i++) {
for (int j = i + 1; j < len; j++) {
if (j - i + 1 > maxLen && validPalindromic(charArray, i, j)) {
maxLen = j - i + 1; begin = i; }}}return s.substring(begin, begin + maxLen);
}
/** * verify that the substring s[left..right] is a palindrome */
private boolean validPalindromic(char[] charArray, int left, int right) {
while (left < right) {
if(charArray[left] ! = charArray[right]) {return false;
}
left++;
right--;
}
return true; }}Copy the code
Complexity analysis:
- Time complexity: O(N3) O(N^3) O(N3), where N is the length of the string, enumerates the left and right bounds of the string, and then continues to verify whether the substring is a callback substring. All three operations are related to N;
- Space complexity: O(1), using only constant temporary variables, independent of string length.
Method two: dynamic programming
The following is the “dynamic programming” problem thinking path, for your reference.
Special note:
- The following “dynamic programming” explanation only helps you understand the basic idea of the “dynamic programming” problem;
- “Dynamic programming” problems can be very difficult, and it is recommended not to drill into particularly difficult problems when learning;
- Master the solution of classical dynamic programming problems, understand the origin of the definition of state, can list the state transition equation;
- Then practice with questions of appropriate difficulty;
- If you have time and interest, you can do some less common types of questions to broaden your horizons;
- The classic book on dynamic programming is Introduction to Algorithms.
Tip: right click “Open picture in New Note sheet” to view the larger picture.
1. Thinking state (key point)
- To define the state, first try “whatever the question asks, set it as the state”;
- Then think about “how to transfer the state”. If the “state transition equation” is not easy to get, try to modify the definition. The purpose is still to get the “state transition equation” conveniently.
The state transition equation is the relation of subproblems of different sizes of the original problem. That is how the optimal solution of the big problem is derived from the optimal solution of the small problem.
2. Thinking about the state transition equation (core and difficult points)
- State transition equation is very important, is the core of dynamic programming, but also difficult;
- A common derivation technique is: classification discussion. Namely, classifying the state space;
- It is a very flexible thing to generalize the “state transition equation”, which is usually analyzed on a case-by-case basis.
- In addition to mastering the classic dynamic programming problems, you need to do more problems;
- If it’s an interview, make it easy for yourself. Grasp the dynamic programming solution of common problems, understand the dynamic programming to solve problems, start from a small-scale problem, gradually get the solution of the big problem, and record the intermediate process;
- The “dynamic programming” method is still the embodiment of the idea of “space for time”, and the common process of solving problems is like “filling in a form”.
3. Think about initialization
Initialization is very important, one wrong step, one wrong step. The initialization state must be set correctly to get the correct result.
- Angle 1: Start directly from the semantics of the state;
- Angle 2: If the semantics of the state are difficult to think about, consider what initialization conditions are required for the boundary of the “state transition equation”;
- Angle 3: From the subscript of the “state transition equation” equation, see whether it is necessary to set one more line and one more column to represent “Sentinel”, so as to avoid the discussion of some special cases.
4. Think about output
Sometimes it’s the last state, and sometimes it’s a combination of all the states that we’ve computed before.
5. Think optimization space (also known as table reuse)
- “Optimization space” can make code difficult to understand and cause “state” to lose its original semantics, which can be difficult to learn at first. It’s more important to get the code right first;
- “Optimization space” is necessary when the state space is very large (processing massive data) and there is not enough space, you must “optimize space”;
- Very classic examples of “optimized space” problems are the “0-1 knapsack” problem and the “complete knapsack” problem.
(The following is an analysis of the “dynamic programming” approach to this problem.)
The annoying part of this problem is deciding on the string of the text. Hence the need for a way to quickly determine whether all substrings of the original string are textual substrings, hence the idea of “dynamic programming”.
A key step in “dynamic planning” is to figure out “how state transitions”. In fact, “palindrome” has the nature of “state transfer”.
- When both ends of a palindrome are removed, the remaining part is still a palindrome (the boundary case is not discussed here);
Again from the definition of palindrome string discussion:
-
If the first and last characters of a string are not equal, the string must not be a palindrome;
-
If the first and last characters of a string are equal, it is necessary to proceed.
- If the substring inside is a palindrome, the whole thing is a palindrome;
- If the substring inside is not a palindrome string, the whole thing is not a palindrome string.
That is, when the first and last characters are equal, the palindrome property of the inner string determines the palindrome property of the whole substring, which is state transition. Thus, “state” can be defined as whether a substring of the original string is a subroutine.
Step 1: Define the state
Dp [I][j] indicates whether the substring S [I..j] is a returnee substring, where the substring S [I..j] is defined as the left closed and right closed interval, and s[I] and S [j] can be obtained.
Step 2: Consider the state transition equation
In this step, classification is discussed (according to whether the beginning and end characters are equal or not). According to the above analysis, :
dp[i][j] = (s[i] == s[j]) and dp[i + 1][j - 1]
Copy the code
Description:
- Dynamic programming is actually filling out a two-dimensional table, because it’s a substring, so
i
和j
Is the relationship betweeni <= j
So just fill out the part of the form above the diagonal. - see
dp[i + 1][j - 1]
You have to consider the boundary case.
The boundary condition is that the expression [I + 1, j-1] does not constitute an interval, that is, the length is strictly less than 2, that is, j-1 – (I + 1) + 1 < 2, and the sorted j-i < 3.
J-i < 3 is equivalent to j-i + 1 < 4. When the length of s[I..j] is 2 or 3, we can draw a conclusion directly.
- If the substring
s[i + 1..j - 1]
Only 1 character, that is, remove the two ends, leaving only 11 characters in the middle part, obviously palindrome; - If the substring
s[i + 1..j - 1]
Is an empty string, then the substrings[i, j]
It must be a subroutine.
Therefore, if s[I] == s[j] and j-i < 3, we can directly conclude that dp[I][j] = true, otherwise state transition will be performed.
Step 3: Consider initialization
When initializing, a single character must be a palindrome string, so the diagonal is initialized to true, that is, dp[I][I] = true.
In fact, the initialization part can be omitted. Since only one character is always a palindrome, dp[I][I] is not referenced by any other state value at all.
Step 4: Consider the output
As soon as dp[I][j] = true is obtained, the substring length and the start position are recorded. Interception is not necessary because interception also costs performance.
Step 5: Consider optimizing space
Because in the process of filling out the form, only the lower left values are referred to. It can actually be optimized, but it makes the code more difficult to write and understand, and it loses its readability and interpretability. I’m not optimizing space here.
Note: always get the palindrome judgment of the small string first, and then the large string can refer to the judgment result of the small string, that is, the order of filling the table is very important.
If you can do it yourself, draw a table, you’ll get a better idea of dynamic programming as a table method.
Reference Code 2:
public class Solution {
public String longestPalindrome(String s) {
/ / sentence
int len = s.length();
if (len < 2) {
return s;
}
int maxLen = 1;
int begin = 0;
// dp[I][j] indicates whether s[I, j] is a palindrome string
boolean[][] dp = new boolean[len][len];
char[] charArray = s.toCharArray();
for (int i = 0; i < len; i++) {
dp[i][i] = true;
}
for (int j = 1; j < len; j++) {
for (int i = 0; i < j; i++) {
if(charArray[i] ! = charArray[j]) { dp[i][j] =false;
} else {
if (j - i < 3) {
dp[i][j] = true;
} else {
dp[i][j] = dp[i + 1][j - 1]; }}// if dp[I][j] == true, s[I..j] is a palindrome
if (dp[i][j] && j - i + 1 > maxLen) {
maxLen = j - i + 1; begin = i; }}}returns.substring(begin, begin + maxLen); }}Copy the code
Complexity analysis:
- Time complexity: O(N2) O(N^{2}) O(N2).
- Space complexity: O(N2) O(N^{2}) O(N2), two-dimensional DP problem, a state has to be represented by a two-dimensional ordered number pair, so the space complexity is O(N2) O(N^{2}) O(N2).
After writing the code, you can write down the flow of the code on a piece of paper, using the string ‘babad’ as an example:
It can be found:
1. When the difference between I and j (L and R in the figure) is less than 3, the dp value can be directly judged without reference to the previous DP value;
2. In other cases, whenever calculating the new DP value, we must refer to the DP value in the “lower left corner”, that is, DP [I + 1][J-1] (I + 1 represents the lower part and J-1 represents the left side).
So it’s ok to write from the top down, or from the bottom up.
The wrong order and the correct order are shown below to understand what “no aftereffect” means for dynamic programming.
Note: The numbers in the table represent the “order of filling”, starting from 1. The arrows and numbers on the outside of the table also indicate the order of filling the form, which is consistent with the meanings of the numbers in the table.
Reference code 3: The following code differs only from “Reference Code 2” for the inner loop and is noted in the comment.
import java.util.Arrays;
public class Solution {
public String longestPalindrome(String s) {
int len = s.length();
if (len < 2) {
return s;
}
boolean[][] dp = new boolean[len][len];
for (int i = 0; i < len; i++) {
Arrays.fill(dp[i], false);
}
/ / initialization
for (int i = 0; i < len; i++) {
dp[i][i] = true;
}
char[] charArray = s.toCharArray();
int maxLen = 1;
int start = 0;
for (int j = 1; j < len; j++) {
// The only difference between the following line and "reference code 2" is that I can be written either upside down or forward, because the substring has a reference value
for (int i = j - 1; i >= 0; i--) {
if (charArray[i] == charArray[j]) {
if (j - i < 3) {
dp[i][j] = true;
} else {
dp[i][j] = dp[i + 1][j - 1]; }}else {
dp[i][j] = false;
}
// if dp[I][j] == true, then s[I, j] is palindrome
if (dp[i][j]) {
int curLen = j - i + 1;
if(curLen > maxLen) { maxLen = curLen; start = i; }}}}returns.substring(start, start + maxLen); }}Copy the code
Complexity analysis: (ibid.)
The state transition equation can be written as follows: If j-i < 3 is true, the state transition equation can be written as follows:
dp[i][j] = (s[i] == s[j]) and (j - i < 3 or dp[i + 1][j - 1])
Copy the code
The following code, though shorter, loses readability, mixes logical operators, and is hard to read without a preamble of parentheses to indicate precedence. I don’t really recommend it.
Reference Code 4:
import java.util.Arrays;
public class Solution {
public String longestPalindrome(String s) {
int len = s.length();
if (len < 2) {
return s;
}
boolean[][] dp = new boolean[len][len];
for (int i = 0; i < len; i++) {
Arrays.fill(dp[i], false);
}
/ / initialization
for (int i = 0; i < len; i++) {
dp[i][i] = true;
}
char[] charArray = s.toCharArray();
int maxLen = 1;
int start = 0;
for (int j = 1; j < len; j++) {
for (int i = 0; i < j; i++) {
dp[i][j] = charArray[i] == charArray[j] && (j - i < 3 || dp[i + 1][j - 1]);
// if dp[I][j] == true, then s[I, j] is palindrome
if (dp[i][j]) {
int curLen = j - i + 1;
if(curLen > maxLen) { maxLen = curLen; start = i; }}}}returns.substring(start, start + maxLen); }}Copy the code
Complexity analysis: (ibid.)
Conclusion:
- We see that the “dynamic programming” approach to problem solving is sometimes not directly problem oriented.
- “Dynamic planning” is still the embodiment of the idea of “space for time”, and itself “dynamic planning”, as a tabular method, is to use “space” for “time”.
Note on the slow execution time of the “dynamic programming” method:
- Dynamic programming is essentially a brute force solution, because you need to enumerate left and right boundaries, O(N2) O(N^2) O(N2);
- The “center spread method” provided below enumerates the centers of all possible subroutines, O(2N), not at one level.
The expression of time complexity estimation is used above, O(N2) O(N^2) O(N2) indicates that the highest term is aN2 aN^2 aN2 polynomial.
From this problem we can see, although the time complexity, but the actual execution time may be a gap, and the time complexity is an estimate, it is not necessary to calculate very clearly that this is because the real running time is related to many factors, such as the running environment of the machine, test cases, etc.
Method three: center diffusion method
The brute force method uses a double pointer sandwiched on both sides to verify whether it is a subroutine.
In addition to the left and right boundaries of the enumeration string, it is easy to think of enumerating the “center position” of the possible substring of a text, from which you try to diffuse as far as possible to get a palindrome string.
So the idea of the center diffusion method is to iterate over each index, take this index as the center, take advantage of the center symmetry of the “palindrome string” and spread out to see how far it can spread.
The time complexity of enumerating “central location” is O(N), and the time complexity of proliferating from “central location” to “LSVS” is O(N), so the time complexity can be reduced to O(N2) O(N^2) O(N2).
One detail to note here is that the palindrome center has a different form when the length of the palindrome string is odd and even.
- The “center” of an odd palindrome string is a concrete character, for example, a palindrome string
"aba"
The center is the character"b"
; - The “center” of an even palindrome string is the “gap” between the two characters in the middle, e.g., palindrome string
"abba"
The center of the two"b"
The “gap” in the middle.
Let’s see where is the center of a possible substring of a string?
We can design a method that is compatible with both cases:
1. If the overlapping index code is passed in and center diffusion is carried out, the length of the obtained substring is odd;
2. If adjacent index codes are passed in and center diffusion is performed, the resulting substring length is even.
The coding details are in the comments below.
Reference Code 5:
public class Solution {
public String longestPalindrome(String s) {
int len = s.length();
if (len < 2) {
return s;
}
int maxLen = 1;
String res = s.substring(0.1);
// Enumerate to len-2
for (int i = 0; i < len - 1; i++) {
String oddStr = centerSpread(s, i, i);
String evenStr = centerSpread(s, i, i + 1);
String maxLenStr = oddStr.length() > evenStr.length() ? oddStr : evenStr;
if(maxLenStr.length() > maxLen) { maxLen = maxLenStr.length(); res = maxLenStr; }}return res;
}
private String centerSpread(String s, int left, int right) {
// left = right; // left = right
// Right = left + 1; // Right = left + 1; // Right = left + 1
int len = s.length();
int i = left;
int j = right;
while (i >= 0 && j < len) {
if (s.charAt(i) == s.charAt(j)) {
i--;
j++;
} else {
break; }}// Be careful here, when you break out of the while loop, s.char (I)! = s.char (j), so you can't pick I, you can't pick j
return s.substring(i + 1, j); }}Copy the code
Complexity analysis:
- Time complexity: O(N2) O(N^{2}) O(N2), for reasons already stated.
- Space complexity: O(1), using only constant temporary variables, independent of string length.
In fact, there are even more time-intensive algorithms, developed by computer scientist Manacher, which are described below.
Method four: Manacher algorithm (do not need to master, the vast majority of the interview will not be required to write this algorithm, understand the idea can)
Note: there are some details in the following answer key when I was doing, without close scrutiny, there has been a friend pointed out that there is an error, please according to own understanding to think scientists of practice and thought, I have no time to modify the contents of this part temporarily, answer the following questions, you can also refer to other friends give you inconvenience, I’m sorry.
Personally, I know that Manacher algorithm is based on “center diffusion method” and adopts the similar idea with KMP algorithm, which is still “space for time”.
Manacher algorithm, Chinese programmers jokingly called the “horse and cart” algorithm. It is specifically used to solve the “longest substring of a loop” problem with a time complexity of O(N)O(N).
Wikipedia describes Manacher’s algorithm as follows:
[Manacher(1975)] discovered a linear time algorithm that could list all palindromes in a given string starting at the head of the string. In addition, Apostolico, Breslauer & Galil (1995) found that the same algorithm could also find all the maximum substrings of a text at any position, and the time complexity was linear. Therefore, they provide a substring solution of the longest loop with linear time complexity. Alternative linear time solutions are provided by Jeuring (1994), Gusfield (1997), and are based on suffix trees. There are also known efficient parallel algorithms.
Manacher algorithm is still the center diffusion method in essence, but it uses techniques similar to KMP algorithm to fully explore the characteristics of the substring that has been palindromic judgment. In the traversal process, it records the information of the substring that has been traversed, which is also a typical reflection of the idea of changing space for time.
The following describes the specific flow of Manacher algorithm.
Step 1: Preprocess the original string (add delimiters)
Insert a delimiter at the beginning and end of the string. For example, “babad” adds the delimiter “#” to produce “#b#a#b#a#d#”.
This is explained as follows:
1. The delimiter is a character of only one kind, and it must not be a character that appeared in the original string.
2, after joined the delimiter, makes the “gap” is the specific location, convenient and subsequent discussion, and any one of the new string back to the text string in the original string must be able to find only a text string and the matching back, so the study of new string back to the text string can get the original string back to the text string;
The length of the new string must be odd.
4. The substring of a new string must be bounded by a delimiter, so the delimiter acts as a sentinel.
Step 2: Compute the auxiliary array P
The auxiliary array P records the information for each character-centered substring of the new string.
The manual method is still “center spread”. In this case, record the maximum number of steps that can spread from the current character to the left and right at the same time.
Using the string “abbabb” as an example, to show how to manually compute the auxiliary array p, we will fill in the following table.
char | # | a | # | b | # | b | # | a | # | b | # | b | # |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
p |
Line 1 Array CHAR: each character after the delimiter of the original string.
Line 2 Array index: This array is the index array of the new string, and its value is the index number starting from 00.
- Let’s first fill in
p[0]
.
Char [0] = ‘#’; p[0] = 0; p[0] = 0;
char | # | a | # | b | # | b | # | a | # | b | # | b | # |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
p | 0 |
- Fill in the below
p[1]
。
Take char[1] = ‘a’ as the center, spread to the left and right at the same time, take 11 steps, both left and right are “#”, form a loop, then continue to spread to the left and right at the same time, the left reaches the boundary, the maximum number of steps that can spread “is 11, so p[1] = 1;
char | # | a | # | b | # | b | # | a | # | b | # | b | # |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
p | 0 | 1 |
- Fill in the below
p[2]
。
Char [2] = ‘#’, char[2] = ‘#’, p[2] = 0;
char | # | a | # | b | # | b | # | a | # | b | # | b | # |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
p | 0 | 1 | 0 |
- Fill in the below
p[3]
.
Take char[3] = ‘b’ as the center, spread to the left and right at the same time, take 11 steps, both sides of the left and right are “#”, form a loop, continue to spread to the left and right at the same time, the left is “A”, the right is “B”, do not match, the maximum number of steps can spread is 11, so p[3] = 1;
char | # | a | # | b | # | b | # | a | # | b | # | b | # |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
p | 0 | 1 | 0 | 1 |
- Fill in the below
p[4]
.
Char [4] = ‘#’ as the center, simultaneously spreading left to right, can take up to 44 steps, left to reach the left boundary, so p[4] = 4.
char | # | a | # | b | # | b | # | a | # | b | # | b | # |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
p | 0 | 1 | 0 | 1 | 4 |
- Go ahead and fill out the rest of the P array.
At this point, the following figures are not difficult to fill out and are finally written in the following table:
char | # | a | # | b | # | b | # | a | # | b | # | b | # |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
p | 0 | 1 | 0 | 1 | 4 | 1 | 0 | 5 | 0 | 1 | 2 | 1 | 0 |
Note: Some data define the auxiliary array P as the palindrome radius array, that is, P [I] records the radius of the palindrome string centered on the ith character of the new string (including the ith character), which is one character different from the auxiliary array P defined here, essentially the same.
The auxiliary array p has a maximum value of 55, which corresponds to the “longest substring” of the original string “abbabb” : “bbabb”. This conclusion is general, that is:
The maximum value of the auxiliary array P is the length of the longest loop substring.
Therefore, we can record this maximum value as we evaluate the auxiliary array P, and record the longest loop substring.
A quick explanation of why:
- If the center of the new substring is a character, then the center of the original substring is also a character. In the new substring, the character spreading out is “delimiter first, character later”, and the number of steps spreading out is also due to the delimiter
#
For every two steps in the new string, only one valid character is scanned, but two characters are evaluated in the original string.Because you have to end up with a delimiter, you have to compute one, which is exactly what counts the center of the original callback string;
- If the center of the new callback string is
#
, then the center of the primitive phereme substring is a “gap”. In a new callback substring, the characteristic of spreading out to the sides is “character first, delimiter later”, the number of steps spreading out because of the delimiter#
For every two steps in the new string, only one valid character is scanned, but two characters are evaluated in the original string.
Therefore, it is true that the maximum value of the auxiliary array P is the length of the longest substring.
At this point, we can actually write a version of the code, and it is acceptable to submit this version of the code to LeetCode, which also verifies the correctness of our conclusion above.
Reference Code 6:
public class Solution {
public String longestPalindrome(String s) {
int len = s.length();
if (len < 2) {
return s;
}
String str = addBoundaries(s, The '#');
int sLen = 2 * len + 1;
int maxLen = 1;
int start = 0;
for (int i = 0; i < sLen; i++) {
int curLen = centerSpread(str, i);
if (curLen > maxLen) {
maxLen = curLen;
start = (i - maxLen) / 2; }}return s.substring(start, start + maxLen);
}
private int centerSpread(String s, int center) {
int len = s.length();
int i = center - 1;
int j = center + 1;
int step = 0;
while (i >= 0 && j < len && s.charAt(i) == s.charAt(j)) {
i--;
j++;
step++;
}
return step;
}
/** * creates a preprocessed string **@paramS Original string *@paramDivide Separates characters *@returnThe resulting string */ is processed using delimiters
private String addBoundaries(String s, char divide) {
int len = s.length();
if (len == 0) {
return "";
}
if(s.indexOf(divide) ! = -1) {
throw new IllegalArgumentException("Parameter error, the split character you passed exists in the input string!");
}
StringBuilder stringBuilder = new StringBuilder();
for (int i = 0; i < len; i++) {
stringBuilder.append(divide);
stringBuilder.append(s.charAt(i));
}
stringBuilder.append(divide);
returnstringBuilder.toString(); }}Copy the code
Complexity analysis:
- Time complexity: O(N2) O(N^2) O(N2), where NNIs the length of the original string. The length of the new string is $2 \times N+ 1, not counting the number and constant terms, because the complexity of the time is still, not counting the coefficients and constants, so the time complexity is still O(N^2)$.
- Space complexity: O(N).
The scientist’s job: Take full advantage of the palindromic nature of the new string and compute auxiliary arrays
p
.
The code above is not too smart, location for new string every center spread, will cause the original string visit several times, each character is an extreme condition is: # # # # # # # # #. In fact, computer scientist Manacher modified the algorithm so that when filling in the values of the new auxiliary array P, he could refer to the values of the already filled auxiliary array P, so that the new string was accessed only once per character. The overall time complexity is improved from O(N2) O(N^2) O(N2) to O(N).
In the process of traversal, in addition to loop variable I, we also need to record two variables, maxRight and center, which have the following meanings respectively:
maxRight
: Records the farthest edge of the current expansion to the right, that is, from the beginning to the current right-most position of the loop string that can be extended using the center diffusion method. formaxRight
Let’s make three points:
- “Farthest to the right” is evaluating auxiliary arrays
p
In the process of diffusion to the right can go to the largest index position, note: get onemaxRight
The corresponding substring is not necessarily the “longest substring” currently obtained. It is possible that one of the substrings is shorter, but it happens to be further down the string. maxRight
The next position of may be seen by the program. The reason for stopping is as follows :(1) the left boundary cannot be diffused, so the right boundary is restricted and cannot be diffused.maxRight
The next position of is not visible; (2) Because I saw itmaxRight
The next position that leads tomaxRight
It can’t spread any further.- why
maxRight
Very important? Because the scan is going from left to right,maxRight
Can provide the most information, it is an important classification discussion criteria, so we need a variable to record it.
center
:center
Is with themaxRight
The relevant variable is the one mentioned abovemaxRight
Index value of palindrome center of. forcenter
The description is as follows:
center
Formal definition of:
center = argmax {x + p[x] ; |; 0 \leq x< I}cente**r=argma**x{x+p[x]∣0≤x< I}
The maximum value of x + p[x] is our definition of maxRight, I is a circular variable, 0<= x< I is the maximum value of maxRight obtained in all indexes before I, its corresponding palindrome center index is the above formula.
maxRight
与center
It’s one to one, one to onecenter
The value of theta only corresponds to onemaxRight
The value of the; somaxRight
与center
It must be updated at the same time.
The following discussion discusses the relationship between the loop variable I and maxRight:
Case 1: when I >= maxRight, this is the situation at the beginning and just after scanning a substring of a text, at this time, the maxRight can be gradually expanded by scanning one by one according to the “center diffusion method”.
Case 2: When I < maxRight, the p-value of the loop variable with respect to the index (called mirror) of center symmetry is important, depending on the nature of the new character’s substring.
(mirror + I) / 2 = center, so mirror = 2 * center – I.
According to the value of P [mirror] from small to large, it can be divided into the following three situations:
Case 2 (1) : the value of p[mirror] is small and does not exceed maxright-i.
Maxright-i is the number of left steps from I’s mirror point about center (excluding itself) to maxRight’s mirror point about center
As can be seen from the figure, due to the symmetry of “centered gyrus substring”, “I centered gyrus substring” and “center centered gyrus substring” also have symmetry, and “I centered gyrus substring” and “center centered gyrus substring” can no longer diffuse. P [I] = p[mirror]
Case 2 (2) : the value of p[mirror] is exactly equal to maxRight -i.
Note: It is still based on the symmetry of “center centered subroutine”, which leads to the symmetry of “I centered subroutine” and “center centered subroutine”.
- Because it’s on the left
f
And the one on the rightg
As a result ofcenter
Is the center of the substring “cannot continue to spread; - But”
i
Is the center of the subroutine “can continue to spread.
Therefore, we can copy the value of p[mirror] first, and then continue “center diffusion”, continue to increase maxRight.
Case 2 (3) : the value of p[mirror] is greater than maxRight -i.
Note: It is still based on the symmetry of “center centered subroutine”, which leads to the symmetry of “I centered subroutine” and “center centered subroutine”. Here we prove that p[I] = maxright-i, again using the symmetries of three pheromones.
(1) Due to the symmetry of the “center centered text substring”, the characters C and E corresponding to the yellow arrow must not be equal;
(2) Due to the symmetry of “mirror-centered substring”, the characters C and C corresponding to the green arrow must be equal;
Thirdly, due to the symmetry of “centering on center”, the character C and C corresponding to the blue arrow must be equal;
Deduce the symmetry of the “i-centered substring”, the characters C and E corresponding to the red arrow must not be equal.
Therefore, p[I] = maxright-i, which cannot be greater. Above is because I draw the picture, may see the friend will feel for granted. In fact, it can be proved by contradiction:
If the two characters c and E are equal to each other, it follows that the values of the eight variables indicated by the yellow, green, blue, and red arrows are all equal, and then the “center centered subroutine” can spread to the left and right at the same time by 11 Spaces. Maximum sexual conflict with maxRight.
When I < maxRight, p[I] can refer to the information of p[mirror]. With maxright-i as the reference standard, p[I] should be conservative, i.e., the smaller of the two values:
p[i] = min(maxRight - i, p[mirror]);
Copy the code
Reference Code 7:
public class Solution {
public String longestPalindrome(String s) {
/ / sentence
int len = s.length();
if (len < 2) {
return s;
}
// Get preprocessed strings
String str = addBoundaries(s, The '#');
// The length of the new string
int sLen = 2 * len + 1;
// Array p records the scanned substring
int[] p = new int[sLen];
// Double Pointers, which are one-to-one, must be updated simultaneously
int maxRight = 0;
int center = 0;
// The maximum number of central spread steps currently traversed, whose value is equal to the length of the longest substring of the original string
int maxLen = 1;
// The starting position of the longest substring of the original string, which must be updated at the same time as maxLen
int start = 0;
for (int i = 0; i < sLen; i++) {
if (i < maxRight) {
int mirror = 2 * center - i;
// This line of code is the key to Manacher's algorithm
p[i] = Math.min(maxRight - i, p[mirror]);
}
// The left and right starting point of the next attempt at diffusion. The number of steps that can be diffused is added directly to p[I]
int left = i - (1 + p[i]);
int right = i + (1 + p[i]);
// left >= 0 && right < sLen
// str.charat (left) == str.charat (right) indicates 1 diffusion
while (left >= 0 && right < sLen && str.charAt(left) == str.charAt(right)) {
p[i]++;
left--;
right++;
}
// According to maxRight's definition, it is the largest of the traversed I + p[I]
// The greater the value of maxRight, the more likely it is to enter the judgment above I < maxRight, so that the palindrome information can be reused
if (i + p[i] > maxRight) {
// maxRight and Center need to be updated at the same time
maxRight = i + p[i];
center = i;
}
if (p[i] > maxLen) {
// Records the length of the longest substring and its corresponding starting point in the original string
maxLen = p[i];
start = (i - maxLen) / 2; }}return s.substring(start, start + maxLen);
}
/** * creates a preprocessed string **@paramS Original string *@paramDivide Separates characters *@returnThe resulting string */ is processed using delimiters
private String addBoundaries(String s, char divide) {
int len = s.length();
if (len == 0) {
return "";
}
if(s.indexOf(divide) ! = -1) {
throw new IllegalArgumentException("Parameter error, the split character you passed exists in the input string!");
}
StringBuilder stringBuilder = new StringBuilder();
for (int i = 0; i < len; i++) {
stringBuilder.append(divide);
stringBuilder.append(s.charAt(i));
}
stringBuilder.append(divide);
returnstringBuilder.toString(); }}Copy the code
Complexity analysis:
- Time complexity: O(N), Manacher algorithm matches only when it encounters positions that have not been matched. The positions that have been matched are no longer matched, so for strings
S
For each position of, it only matches once, and the algorithm is O(N).O(N). - Space complexity: O(N).