Yesterday, a student in another group asked a question: given a 4×4 grid where each cell is either black or white, clicking a cell flips it and all of its adjacent cells at the same time; how do you turn the whole grid white? Back in college I solved this kind of problem by enumerating the strategy for the first row, deducing the remaining rows, and verifying the last row. Pruning cuts down some of the work, but not dramatically. Later I spent some time studying matrices (xor matrices in particular), but never ran into this problem again. Since I was asked yesterday, I dug it up and found three POJ problems that differ only in details, with difficulty from low to high:
- 1753: 4×4 grid, flip to all black or all white; output the minimum number of operations, and only that one optimal answer is required.
- 1222: 5×6 grid, flip to all 0 only, no optimization required (but also no Special Judge; could there be multiple solutions? This is explained below).
- 3279: W and H are at most 15, flip to all 0 only; find the lexicographically smallest optimal solution.
An optimal solution is defined as the one with the fewest operations (the fewest 1s in the solution matrix).
All of these can be solved by enumerating the first row, but since we are writing an article, let's not phone it in like back then; today we'll talk about solving this problem with mathematical methods.
Basics of the flipping problem
The description of 1222 already covers some basic facts about this problem, so let's briefly review them:
- The order of operations does not affect the result (ultimately only the parity of the number of times each cell is flipped matters)
- There is never a need to operate on the same cell twice (for the same reason)
From the output format of 1222 and 3279 we can also see that what we are solving for is a matrix, where each element indicates whether the corresponding cell needs to be operated once.
Linear congruence equation
Consider what we are solving for: whether each cell needs to be operated once. And the corresponding result: whether the state of each cell changes once all operations are done. What connects these two matrices? It is "which cells are affected by operating on each cell". This relationship can be written as a system of linear congruences (using a 2×2 grid as the example here):
```
x0*a00 + x1*a01 + x2*a02 + x3*a03 = r0 (mod 2)
x0*a10 + x1*a11 + x2*a12 + x3*a13 = r1 (mod 2)
x0*a20 + x1*a21 + x2*a22 + x3*a23 = r2 (mod 2)
x0*a30 + x1*a31 + x2*a32 + x3*a33 = r3 (mod 2)
```
Here x0 to x3 indicate whether each of the four cells of the 2×2 grid needs to be operated once, r0 to r3 indicate whether each of the four cells needs its state changed, and aij indicates whether pressing the j-th cell changes the state of the i-th cell, so the aij matrix is known:
```
2x2 grid matrix
1 1 1 0
1 1 0 1
1 0 1 1
0 1 1 1

3x3 grid matrix
1 1 0 1 0 0 0 0 0
1 1 1 0 1 0 0 0 0
0 1 1 0 0 1 0 0 0
1 0 0 1 1 0 1 0 0
0 1 0 1 1 1 0 1 0
0 0 1 0 1 1 0 0 1
0 0 0 1 0 0 1 1 0
0 0 0 0 1 0 1 1 1
0 0 0 0 0 1 0 1 1
```
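As an aside, these matrices need not be typed in by hand. Here is a small sketch (my own illustration; `buildFlipMatrix` is a made-up name, and it assumes w*h fits in 64 bits) that generates the aij rows as bitmasks for any w×h grid:

```cpp
#include <cassert>
#include <vector>

// Build the a_ij matrix for a w*h grid: bit j of row i is set iff
// pressing cell j flips cell i. Each row is stored as one bitmask,
// so this sketch assumes w*h <= 64.
std::vector<unsigned long long> buildFlipMatrix(int w, int h) {
    std::vector<unsigned long long> A(w * h, 0);
    for (int r = 0; r < h; r++) {
        for (int c = 0; c < w; c++) {
            int j = r * w + c;                               // pressed cell
            A[j] |= 1ULL << j;                               // flips itself
            if (r > 0)     A[(r - 1) * w + c] |= 1ULL << j;  // cell above
            if (r < h - 1) A[(r + 1) * w + c] |= 1ULL << j;  // cell below
            if (c > 0)     A[r * w + c - 1]   |= 1ULL << j;  // cell to the left
            if (c < w - 1) A[r * w + c + 1]   |= 1ULL << j;  // cell to the right
        }
    }
    return A;
}
```

For the 2×2 grid this produces exactly the four rows shown above, as the bitmasks 7, 11, 13, 14 (reading bit j as column j).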
The system above can be written with vectors and matrices (brush up on your linear algebra if you need to!):
```
A * X = R (mod 2)  // if X and R are column vectors, A has the same layout as above
X * A = R (mod 2)  // if X and R are row vectors; the same thing, but A must be transposed
```
This kind of representation is called a congruence matrix (a vector is just a special kind of matrix). In particular, the mod-2 case can also be called an xor matrix, because a xor b = (a + b) mod 2.
Multiplication of congruence matrices satisfies the associative law (though, like ordinary matrix multiplication, not the commutative law), and congruence matrices behave like ordinary matrices in many other respects, which we can put to good use.
Matrix rank, and testing for the existence of solutions
In linear algebra, the column rank of a matrix A is the maximum number of linearly independent columns of A; similarly, the row rank is the maximum number of linearly independent rows. If you view the matrix as a set of row vectors or column vectors, the row/column rank is simply the rank of that set of vectors: the size of a maximal linearly independent subset.
In fact, the row rank and the column rank of any matrix are always equal, and this common value is called the rank of the matrix.
How rank relates to the existence of solutions of a linear system:
- rank(coefficient matrix) < rank(augmented matrix): no solution exists
- rank(coefficient matrix) = rank(augmented matrix): solutions exist:
  - rank(coefficient matrix) = number of unknowns: the solution is unique
    - if the number of unknowns also equals the number of rows, the coefficient matrix is called "full rank" and has an inverse matrix
  - rank(coefficient matrix) < number of unknowns: there are infinitely many solutions
    - every solution can be expressed as a particular solution plus a linear combination of K linearly independent "free radicals" (basis vectors of the null space), where K = number of unknowns − rank(coefficient matrix)

The augmented matrix is the coefficient matrix with the right-hand side of each equation appended as its last column.
Note that for general congruence matrices the conclusions above do not necessarily hold, and full rank does not necessarily imply the existence of an inverse matrix: multiplication in modular arithmetic is not always invertible, so a usable division cannot be defined directly. For example, the 1×1 matrix [2] mod 4 has no inverse, because no integer from 0 to 3 multiplied by 2 equals 1 mod 4. Fortunately this problem does not exist for xor matrices: a full-rank xor matrix always has an inverse, and the rank and solution-existence relations above all hold for xor matrices.
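The mod-4 counterexample is easy to check by brute force. This tiny helper (illustrative only; `hasInverseMod` is a made-up name) searches for a multiplicative inverse under an arbitrary modulus:

```cpp
#include <cassert>

// Check the claim above: does `a` have a multiplicative inverse mod m,
// i.e. is there an x in 0..m-1 with a*x % m == 1?
bool hasInverseMod(int a, int m) {
    for (int x = 0; x < m; x++)
        if (a * x % m == 1) return true;
    return false;
}
```

hasInverseMod(2, 4) is false, while every non-zero value mod 2 has an inverse, which is why xor matrices escape the problem.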
The description above is abstract, so let's look at concrete examples (using ordinary linear equations here).
```
System 1:
x0*1 + x1*2 = 4
x0*4 + x1*4 = 8

System 2:
x0*1 + x1*2 = 4
x0*2 + x1*4 = 6

System 3:
x0*1 + x1*2 = 4
x0*2 + x1*4 = 8
```
What does rank mean here? The row rank (of the coefficient matrix) is the number of rows whose coefficients cannot be derived as a linear combination of the preceding rows. Look at the three systems above:
- In system 1, the coefficients of the second row cannot be obtained as a linear combination of the first row, so the rank of the coefficient matrix is 2. There are two unknowns, so the solution is unique.
- In system 2, the second row's coefficients can be obtained from the first row ×2, so the rank of the coefficient matrix is only 1. But in the augmented matrix, the second row cannot be obtained from the first, so the rank of the augmented matrix is still 2, and the system has no solution.
- In system 3, the second row can be obtained from the first row whether we look at the coefficient matrix or the augmented matrix. So the ranks of the coefficient matrix and the augmented matrix are both 1, less than the number of unknowns, and the system has infinitely many solutions.
Why is that? Because:
```
System 1:
x0*1 + x1*2 = 4
x0*4 + x1*4 = 8   # coefficients cannot be derived from the previous equation:
                  # an independent equation, so the system can be solved

System 2:
x0*1 + x1*2 = 4
x0*2 + x1*4 = 6   # coefficients are the first row * 2, but the right-hand
                  # side is not: the two equations contradict each other,
                  # so there is no solution

System 3:
x0*1 + x1*2 = 4
x0*2 + x1*4 = 8   # both coefficients and right-hand side are the first row * 2:
                  # this equation adds no information; removing it does not
                  # affect the solutions, so effectively only one equation
                  # constrains two unknowns
```
In the examples above, the number of equations equals the number of unknowns. What if there are more or fewer equations than unknowns? The conclusions above still hold (remember that row rank always equals column rank).
Gaussian elimination
The definition of rank and the test for solutions are described above, but not how to compute the rank of a matrix, nor how to find the solutions. In fact both can be done with the same algorithm: Gaussian elimination.
First, recall the elimination method we used in school to solve systems of linear equations (yes, that one):
```
x0*1 + x1*2 = 4
x0*4 + x1*4 = 8

Multiply the first row by 2:          x0*2 + x1*4 = 8
Subtract that from the second row:    x0*2 + x1*0 = 0
So x0 = 0, and we can continue to solve for x1.
```
Gaussian elimination is essentially a normalized version of this process. The first step is to eliminate the coefficient matrix into a triangular matrix:
```
x0*1 + x1*2 = 4
x0*4 + x1*4 = 8      <-- subtract the first row * 4

x0*1 + x1*2    = 4
       x1*(-4) = -8
```
The second step is to turn the coefficients into a diagonal matrix
```
x0*1 + x1*2    = 4   <-- subtract the second row * (-0.5)
       x1*(-4) = -8

x0*1 + x1*0    = 0
       x1*(-4) = -8
```
The third step is to divide out the diagonal coefficients:
```
x0 = 0
x1 = 2
```
So we can solve the equation.
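The three steps can be transcribed almost literally into code. Here is a minimal sketch hard-coded for the two-unknown example above (`solve2x2` is a made-up name; it assumes both pivots end up non-zero, the awkward case discussed next):

```cpp
#include <cassert>
#include <cmath>

// The three steps above, hard-coded for a 2x2 system:
// a[i][0..1] are the coefficients of row i, a[i][2] is its right-hand side.
void solve2x2(double a[2][3], double* x0, double* x1) {
    // Step 1: triangular form -- row1 -= row0 * (a[1][0] / a[0][0])
    double f = a[1][0] / a[0][0];
    for (int k = 0; k < 3; k++) a[1][k] -= a[0][k] * f;
    // Step 2: diagonal form -- row0 -= row1 * (a[0][1] / a[1][1])
    f = a[0][1] / a[1][1];
    for (int k = 0; k < 3; k++) a[0][k] -= a[1][k] * f;
    // Step 3: divide out the diagonal coefficients
    *x0 = a[0][2] / a[0][0];
    *x1 = a[1][2] / a[1][1];
}
```

Running it on system 1 reproduces the hand calculation: x0 = 0, x1 = 2.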
Note that in the first step, we may encounter some strange situations, such as:
```
       x1*2 = 4      <-- the x0 coefficient of this row is 0. What do we do?
x0*4 + x1*4 = 8
```
As long as some following row still has a non-zero coefficient for x0, there are several ways around this; the two most common:
```
x0*4 + x1*6 = 12     <-- we can add the second row to this one
x0*4 + x1*4 = 8

x0*4 + x1*4 = 8      <-- or we can simply swap the two rows
       x1*2 = 4
```
What if the coefficient is zero in all of the following rows as well? Unfortunately, this means that the unknown is a "free variable", which implies:
- The rank of the coefficient matrix is smaller than the number of unknowns, and the value of this free variable does not affect whether a solution exists
- After step 2, there will be a row whose coefficients are all 0
  - If the constant term of such an all-zero row is not zero, the system has no solution
  - If the constant terms of all the "all-zero rows" are zero, the system has infinitely many solutions
- If a solution exists:
  - There must be a particular solution in which all the free variables are 0
  - A free variable, together with its coefficients in the preceding equations multiplied by −1, forms a "free radical": adding any multiple of a free radical to a solution of the system yields another solution
  - In particular, since each free radical contains exactly one free variable, the free radicals are linearly independent, and the number of free radicals = the number of free variables = the number of unknowns − the rank of the matrix
Note that when there are free variables, elimination cannot reach a true diagonal matrix: there may be non-zero coefficients above the free variable's column. Those non-zero coefficients are exactly what make up the free radicals.
Let's look at the Gaussian elimination of systems 2 and 3 in detail:
```
System 2:
x0*1 + x1*2 = 4
x0*2 + x1*4 = 6    <-- subtract the first row * 2

x0*1 + x1*2 = 4
x0*0 + x1*0 = -2   <-- all coefficients are 0; step 2 skips this column
# The constant term is not 0, so the system has no solution.

System 3:
x0*1 + x1*2 = 4
x0*2 + x1*4 = 8    <-- subtract the first row * 2

x0*1 + x1*2 = 4
x0*0 + x1*0 = 0    <-- all coefficients are 0; step 2 skips this column
# The constant term is 0: the system has infinitely many solutions, and x1
# can be the free variable. There must be a particular solution with x1 = 0,
# namely x0 = 4, x1 = 0.
# The free variable's column coefficients * -1, with a 1 in the free
# variable's own position, form a free radical: [-2, 1]. That is, if (x0, x1)
# solves the system, then for any c, (x0 + (-2)*c, x1 + 1*c) is also a
# solution (the general solution).
# The system has only this one free radical, so every solution has the form
# (4 + (-2)*c, 0 + 1*c).
```
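The general solution claimed for system 3 is easy to verify mechanically. This little check (illustrative only, with `c` as the free parameter) plugs x0 = 4 − 2c, x1 = c back into both equations:

```cpp
#include <cassert>

// Verify the general solution of system 3: x0 = 4 - 2c, x1 = c should
// satisfy x0*1 + x1*2 = 4 and x0*2 + x1*4 = 8 for any integer c.
bool checkGeneralSolution(int c) {
    int x0 = 4 - 2 * c, x1 = c;
    return x0 * 1 + x1 * 2 == 4 && x0 * 2 + x1 * 4 == 8;
}
```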
Note that some descriptions of Gaussian elimination delete such all-zero rows instead, so the resulting matrix is not square. That is more convenient for hand calculation but less so for a computer, so here we keep the row, with all of its coefficients 0. The two approaches are equivalent.
Gaussian elimination can also be used to find the inverse of a full-rank square matrix. First append an identity matrix to the right of the coefficient matrix. Then note that every operation of Gaussian elimination (whether swapping, linear superposition, or scaling a row) is equivalent to multiplying by some matrix Pi on the left, and Gaussian elimination of a full-rank matrix produces the identity matrix (diagonal coefficients 1, all other elements 0). The whole process is therefore equivalent to:
```
Pn * ( ... (P2 * (P1 * [A, I])) ) = Pn * ... * P1 * [A, I] = [I, B]

Let V = Pn * ... * P1, so:
V * [A, I] = [I, B]
which gives:
V * A = I
V * I = B  =>  V = B
so B * A = I, and B and A are inverses of each other.
```
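For xor matrices this inversion procedure fits in a few lines, since rows are bitmasks and every elimination step is a single xor. A sketch (`xorInverse` is a name of my own choosing; it assumes 2*n fits in 32 bits), storing [A | I] with A in the low n bits of each row:

```cpp
#include <cassert>

// Invert an n x n xor matrix by augmenting with the identity, as derived
// above. Rows are bitmasks: the low n bits hold A, bits n..2n-1 hold I.
// Returns false if A is not full rank. Assumes 2*n <= 32.
bool xorInverse(const unsigned a[], unsigned inv[], int n) {
    unsigned rows[32];
    for (int i = 0; i < n; i++) rows[i] = a[i] | (1U << (n + i));  // [A | I]
    for (int i = 0; i < n; i++) {
        unsigned mask = 1U << i;
        // make sure row i has a pivot in column i
        if (!(rows[i] & mask)) {
            for (int j = i + 1; j < n; j++)
                if (rows[j] & mask) { rows[i] ^= rows[j]; break; }
        }
        if (!(rows[i] & mask)) return false;  // not full rank
        // eliminate column i from every other row
        for (int j = 0; j < n; j++)
            if (j != i && (rows[j] & mask)) rows[j] ^= rows[i];
    }
    for (int i = 0; i < n; i++) inv[i] = rows[i] >> n;  // the B part
    return true;
}
```

Amusingly, the 2×2 grid's flip matrix {7, 11, 13, 14} turns out to be its own inverse, so for that grid the solution is simply this same matrix times the input (bitwise AND each row with the input and take the parity).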
Even when the matrix is not full rank, i.e. no inverse exists, the elimination matrix obtained during elimination can still be reused to reduce a new instance of the problem quickly, lowering the per-query cost through precomputation. Unfortunately, these POJ problems offer no opportunity for that.
The procedure and conclusions above carry over to xor matrices essentially unchanged (but not to linear congruence matrices with a non-prime modulus: without a usable division, elimination may be impossible there, so Gaussian elimination cannot be applied).
For an xor matrix, though, the coefficient of each free radical in a solution can only be 0 or 1, so the solution set is not infinite: there are pow(2, number of free variables) solutions.
Solving the problems
When representing an xor matrix in code, we can usually store a whole row in one int (several ints if one is not wide enough; std::bitset from the STL works too). Then bitwise AND gives element-wise multiplication and bitwise XOR gives element-wise addition mod 2, so an entire row operation is a couple of machine instructions. This makes programming with xor matrices extremely simple.
Also, since xor-matrix coefficients can only be 0 or 1, the third step of Gaussian elimination (dividing out the diagonal) is unnecessary.
```cpp
// Note: this assumes an unsigned int is wide enough to hold one row.
// If not, represent a row as several unsigned ints and xor them one by one.

// Step 1 of Gaussian elimination: triangular form
for (int i = 0; i < N; i++) {
    unsigned int mask = (1U << i);
    if ((A[i] & mask) == 0) {
        for (int j = i + 1; j < N; j++) {
            if (A[j] & mask) {
                // Linear superposition of a non-zero row is used here,
                // since swapping seems more expensive
                A[i] ^= A[j];
                B[i] ^= B[j];
                break;
            }
        }
    }
    if ((A[i] & mask) != 0) {
        for (int j = i + 1; j < N; j++) {
            if (A[j] & mask) {
                // Eliminate this column from the rows below
                A[j] ^= A[i];
                B[j] ^= B[i];
            }
        }
    }
}

// Step 2: eliminate the columns above the diagonal
for (int i = N - 1; i > 0; i--) {
    unsigned int mask = (1U << i);
    // A column is eliminated from the rows above only if it is not free
    if ((A[i] & mask) == 0) continue;
    for (int j = 0; j < i; j++) {
        if (A[j] & mask) {
            A[j] ^= A[i];
            B[j] ^= B[i];
        }
    }
}
```
After elimination we get one of two results: either an identity matrix (all-ones diagonal), meaning the matrix is full rank and the inverse exists, or some rows whose coefficients are all zero, meaning the matrix is not full rank (there may be no solution, or multiple solutions).
Specific to the topic:
- 1753: There are 4 free radicals, meaning there are either no solutions or exactly 16. Obtain the particular solution with all free variables zero from the input (there may be none), then walk through all 16 solutions generated by the free radicals and pick the best one. Since the target can be all white or all black, solve for both targets (equivalently, solve again with the input xor 0xFFFF) and take the better result.
- 1222: The matrix happens to be full rank, so the inverse can be computed directly with Gaussian elimination, and the solution is simply the inverse times the input. The inverse is independent of the input and can be precomputed.
- 3279: The problem size varies, so both full-rank and rank-deficient matrices may appear, and nothing can be precomputed. Run Gaussian elimination directly on each instance, collect all the free radicals, walk through all the solutions, and pick the best one.
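Putting all of it together for the 1753-style 4×4 grid: build the 16×16 xor matrix, eliminate, bail out on inconsistency, then enumerate the pow(2, k) free-variable assignments and keep the solution with the fewest presses. This is a from-scratch sketch of the approach described above, not the actual POJ submission; `minPresses4x4` and its helpers are names I made up.

```cpp
#include <cassert>

static int popcount16(unsigned v) {
    int c = 0;
    while (v) { c += v & 1; v >>= 1; }
    return c;
}

// Minimum number of presses turning `state` (bit i set = cell i must flip)
// into all zeros on a 4x4 grid, or -1 if impossible.
int minPresses4x4(unsigned state) {
    const int N = 16;
    unsigned A[16] = {0};
    unsigned b[16];
    // Build the flip matrix: bit j of A[i] means pressing cell j flips cell i.
    for (int r = 0; r < 4; r++)
        for (int c = 0; c < 4; c++) {
            int j = r * 4 + c;
            A[j] |= 1U << j;
            if (r > 0) A[j - 4] |= 1U << j;
            if (r < 3) A[j + 4] |= 1U << j;
            if (c > 0) A[j - 1] |= 1U << j;
            if (c < 3) A[j + 1] |= 1U << j;
        }
    for (int i = 0; i < N; i++) b[i] = (state >> i) & 1U;

    // Gauss-Jordan elimination over GF(2), tracking the pivot row per column.
    int pivotRow[16];
    for (int i = 0; i < N; i++) pivotRow[i] = -1;
    int row = 0;
    for (int col = 0; col < N && row < N; col++) {
        int sel = -1;
        for (int j = row; j < N; j++)
            if (A[j] & (1U << col)) { sel = j; break; }
        if (sel < 0) continue;  // free column
        unsigned tA = A[sel]; A[sel] = A[row]; A[row] = tA;
        unsigned tb = b[sel]; b[sel] = b[row]; b[row] = tb;
        for (int j = 0; j < N; j++)
            if (j != row && (A[j] & (1U << col))) { A[j] ^= A[row]; b[j] ^= b[row]; }
        pivotRow[col] = row++;
    }
    // Any leftover all-zero row with a non-zero constant term means no solution.
    for (int j = row; j < N; j++)
        if (b[j]) return -1;

    // Enumerate all assignments of the free variables; keep the best solution.
    int freeCols[16], k = 0;
    for (int col = 0; col < N; col++)
        if (pivotRow[col] < 0) freeCols[k++] = col;
    int best = -1;
    for (int m = 0; m < (1 << k); m++) {
        unsigned x = 0;
        for (int t = 0; t < k; t++)
            if (m & (1 << t)) x |= 1U << freeCols[t];
        for (int col = 0; col < N; col++) {
            if (pivotRow[col] < 0) continue;
            int pr = pivotRow[col];
            // pivot variable = constant xor parity(free coefficients AND x)
            if ((b[pr] ^ (unsigned)popcount16(A[pr] & x)) & 1U)
                x |= 1U << col;
        }
        int presses = popcount16(x);
        if (best < 0 || presses < best) best = presses;
    }
    return best;
}
```

An all-zero input needs 0 presses, and an input equal to the flip pattern of a single press needs exactly 1.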