Chapters 41 ~ 42: The Dutch flag problem and Strassen's algorithm for matrix multiplication


Preface

The two problems in this chapter, the Dutch flag problem and Strassen's algorithm for matrix multiplication, are both related to the divide-and-conquer method, so they are treated together. Divide and conquer means exactly that: divide the problem into parts, then conquer each part. It is just like the wartime strategy of avoiding the main force of the enemy and destroying its units one by one.

If you find any problems with the text, please feel free to point them out.


Chapter 41: The Dutch flag

Problem description

There are balls of three colors, red, white and blue, arranged in no particular order. Please rearrange them so that balls of the same color end up together, in the order red, white, blue. The problem is called the Dutch flag because, if we imagine the red, white and blue balls as stripes, arranging them in order produces the flag of the Netherlands. As shown in the figure below:

(figure: the Dutch national flag, three horizontal stripes of red, white and blue)

\

Analysis

At first glance, we don't seem to have any approach to this problem other than brute force. But what about quicksort? As we know, quicksort is based on the divide-and-conquer paradigm. Sorting a typical subarray A[p..r] by divide and conquer takes three steps:

  1. Divide: partition A[p..r] into two (possibly empty) subarrays A[p..q-1] and A[q+1..r], such that every element of A[p..q-1] is <= A[q], which in turn is <= every element of A[q+1..r].
  2. Conquer: sort the subarrays A[p..q-1] and A[q+1..r] by recursive calls to quicksort.
  3. Combine: the subarrays are sorted in place, so no work is needed to merge them.

In other words, the main idea of quicksort is this divide-and-conquer partition step: each pass picks a pivot element and rearranges the array into a part no larger than the pivot and a part no smaller than it, and then the two parts are sorted recursively.

As shown in the pseudocode below, the key to the quicksort algorithm is the PARTITION procedure, which rearranges A[p..r] in place:

PARTITION(A, p, r)
    x ← A[r]
    i ← p - 1
    for j ← p to r - 1
        do if A[j] ≤ x
            then i ← i + 1
                 exchange A[i] <-> A[j]
    exchange A[i + 1] <-> A[r]
    return i + 1

The sorting process is then recursively completed:

QUICKSORT(A, p, r)
    if p < r
        then q ← PARTITION(A, p, r)
             QUICKSORT(A, p, q - 1)
             QUICKSORT(A, q + 1, r)
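For readers who want to run it, here is a minimal C++ rendering of the pseudocode above (a sketch of mine; the pseudocode is 1-based while C++ arrays are 0-based, but the code works for any valid p and r):

#include <iostream>
#include <utility>  // std::swap

// Rearranges A[p..r] around the pivot A[r]; returns the pivot's final index.
int Partition(int A[], int p, int r)
{
    int x = A[r];   // pivot
    int i = p - 1;  // boundary of the "<= pivot" region
    for (int j = p; j <= r - 1; ++j)
    {
        if (A[j] <= x)
        {
            ++i;
            std::swap(A[i], A[j]);
        }
    }
    std::swap(A[i + 1], A[r]);  // drop the pivot between the two regions
    return i + 1;
}

void QuickSort(int A[], int p, int r)
{
    if (p < r)
    {
        int q = Partition(A, p, r);
        QuickSort(A, p, q - 1);
        QuickSort(A, q + 1, r);
    }
}

int main()
{
    int A[] = { 2, 8, 7, 1, 3, 5, 6, 4 };
    QuickSort(A, 0, 7);
    for (int x : A)
        std::cout << x << ' ';  // prints: 1 2 3 4 5 6 7 8
    return 0;
}

The first call to Partition on this input is exactly the pass walked through next.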

Concretely, i starts at the position just before the subarray and j at its first element; j runs ahead while i trails behind, and both move from left to right:

Take the array 2 8 7 1 3 5 6 4 with pivot x = A[r] = 4:

2 8 7 1 3 5 6 4    (j at 2: 2 ≤ 4, so i advances and A[i] swaps with A[j], i.e. 2 swaps with itself)
2 8 7 1 3 5 6 4    (j at 8, then 7: both > 4, i stays put)
2 1 7 8 3 5 6 4    (j at 1: 1 ≤ 4, i advances, 8 swaps with 1)
2 1 3 8 7 5 6 4    (j at 3: 3 ≤ 4, i advances, 7 swaps with 3)
2 1 3 8 7 5 6 4    (j at 5, then 6: both > 4, i stays put)
2 1 3 4 7 5 6 8    (finally A[i + 1] swaps with A[r], putting the pivot 4 in place)

With that, the first partition pass is complete: the pivot 4 divides the whole array into two parts, 2, 1, 3 and 7, 5, 6, 8, and those two parts are then sorted recursively in the same way.

The whole process can be seen in the quicksort articles listed in the references, or in a picture I drew back in school (figure omitted).

Back to the question we face: to rearrange the balls so that they form three color groups, can we set up three pointers and reuse this partition idea?

Solution 1: a partition-style pass

According to the previous analysis, this problem resembles the partition step of quicksort. We just need three pointers: begin at the front, current for scanning, and end at the back, swapping elements between them as the scan proceeds:

  1. current traverses the whole array; when array[current] is 1, nothing moves and current just advances (current++).
  2. When array[current] is 0, swap it with array[begin], then advance both: current++, begin++.
  3. When array[current] is 2, swap it with array[end], then end--, but current does not move.

Why does current stay put in step 3? Because after the swap with end, the element now sitting at current came from the back of the array and has not been examined yet.

The reader can picture it: the ultimate goal is to get the 0s, 1s and 2s in order. Suppose the element at end was a 0 before the swap. After the swap, current points at a 0, which still has to be swapped with begin to reach the front; if current advanced anyway, that 0 would be stranded in the middle.
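A tiny trace makes this concrete (an example of my own, with the array 1 2 0): begin = current = 0, end = 2. The 1 at position 0 lets current advance to 1. There array[current] is 2, so it is swapped with array[end], giving 1 0 2 with end = 1. current is still 1 and now sees the 0 that arrived from the back, swaps it with array[begin], and the array becomes 0 1 2. Had current advanced after the first swap, the loop would have ended with the 0 stuck in the middle.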

If this is still hard to follow, gnuhpc's diagrams illustrate it step by step:

(figures: gnuhpc's step-by-step diagrams of the three-pointer partition)

The reference code is as follows:

// Reference from gnuhpc
while (current <= end)
{
    if (array[current] == 0)
    {
        swap(array[current], array[begin]);
        current++;
        begin++;
    }
    else if (array[current] == 1)
    {
        current++;
    }
    else  // array[current] == 2
    {
        swap(array[current], array[end]);
        end--;  // current stays put: the element swapped in from the back is unexamined
    }
}
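Putting it together into a runnable program (a sketch; the name DutchFlagSort and the 0/1/2 encoding of red/white/blue are my own):

#include <algorithm>  // std::swap
#include <iostream>

// One-pass three-way partition of an array of 0s, 1s and 2s.
void DutchFlagSort(int array[], int n)
{
    int begin = 0, current = 0, end = n - 1;
    while (current <= end)
    {
        if (array[current] == 0)
        {
            std::swap(array[current], array[begin]);
            ++current;
            ++begin;
        }
        else if (array[current] == 1)
        {
            ++current;
        }
        else  // array[current] == 2; current stays put
        {
            std::swap(array[current], array[end]);
            --end;
        }
    }
}

int main()
{
    int balls[] = { 2, 1, 0, 2, 0, 1, 1, 2, 0 };
    DutchFlagSort(balls, sizeof(balls) / sizeof(balls[0]));
    for (int b : balls)
        std::cout << b << ' ';  // prints: 0 0 0 1 1 1 2 2 2
    return 0;
}

Each element is looked at most once from the left and once from the right, so the whole pass is O(n) time and O(1) extra space.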

End of this chapter.


Chapter 42: Strassen’s algorithm for matrix multiplication

Problem description

Implement matrix multiplication, and consider how to optimize it when the matrices are large.

Analysis

According to Wikipedia: the product of two matrices is defined only when the number of columns of the first matrix A equals the number of rows of the second matrix B. If A is an m x n matrix and B is an n x p matrix, their product AB is an m x p matrix whose entries are

$$(AB)_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}, \qquad 1 \le i \le m, \; 1 \le j \le p.$$

It is worth mentioning that matrix multiplication satisfies the associative law and the distributive law, but not the commutative law: in general, swapping the order of the two factors changes the product.
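For instance (a standard pair of 2 x 2 matrices, chosen here for illustration since the original figure did not survive):

$$\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}.$$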

So let’s actually solve this matrix multiplication problem.

Solution 1: brute force

In fact, it is clear from the definition above that multiplying two n x n matrices entry by entry takes O(n^3) time. The reference code is as follows:

// Matrix multiplication: three nested for loops, O(n^3)
void Mul(int** matrixA, int** matrixB, int** matrixC, int n)
{
    for (int i = 0; i < n; ++i)
    {
        for (int j = 0; j < n; ++j)
        {
            matrixC[i][j] = 0;
            for (int k = 0; k < n; ++k)
                matrixC[i][j] += matrixA[i][k] * matrixB[k][j];
        }
    }
}
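A quick sanity check of my own, assuming the Mul above that takes the dimension n (the raw new[] allocation is just one way to build an int**):

#include <iostream>

int main()
{
    const int n = 2;
    int a[2][2] = { {1, 2}, {3, 4} };
    int b[2][2] = { {5, 6}, {7, 8} };

    // Build int** views, since Mul takes pointer-to-pointer matrices.
    int** A = new int*[n];
    int** B = new int*[n];
    int** C = new int*[n];
    for (int i = 0; i < n; ++i)
    {
        A[i] = a[i];  // rows of the stack arrays decay to int*
        B[i] = b[i];
        C[i] = new int[n];
    }

    Mul(A, B, C, n);

    for (int i = 0; i < n; ++i)
    {
        for (int j = 0; j < n; ++j)
            std::cout << C[i][j] << ' ';
        std::cout << '\n';  // prints: 19 22 / 43 50
    }

    for (int i = 0; i < n; ++i) delete[] C[i];
    delete[] A; delete[] B; delete[] C;
    return 0;
}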

Solution 2: Strassen's algorithm

In solution 1 we implemented matrix multiplication with three for loops, but when the dimensions of the two matrices are very large, O(n^3) becomes prohibitively expensive, so we need a better method.

Generally speaking, when the amount of data is large, we tend to split it into smaller pieces and handle each piece separately. With that in mind, given two very large matrices, could we apply divide and conquer and multiply smaller matrices piece by piece? After all, a matrix can always be partitioned into smaller submatrices.

As shown below, partition two matrices A and B into 2 x 2 blocks:

$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}, \qquad B = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}$$

When we multiply A and B blockwise,

$$C_{11} = A_{11}B_{11} + A_{12}B_{21}, \quad C_{12} = A_{11}B_{12} + A_{12}B_{22}, \quad C_{21} = A_{21}B_{11} + A_{22}B_{21}, \quad C_{22} = A_{21}B_{12} + A_{22}B_{22},$$

we find that the process costs 8 block multiplications and 4 block additions.

The cost of matrix multiplication is dominated by the multiplications; one or two extra additions barely increase it. So we ask: can the number of multiplications be reduced, and with it the overall complexity of matrix multiplication? The answer is yes.

In 1969, the German mathematician Volker Strassen proved that the O(n^3) method is not optimal for matrix multiplication. Through a series of manipulations he brought the time complexity down to O(n^2.807).
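Where does that exponent come from? Reducing one multiplication of n x n matrices to 7 multiplications of (n/2) x (n/2) blocks plus a constant number of block additions gives a recurrence that the master theorem solves directly (a standard derivation, filled in here for completeness):

$$T(n) = 7\,T(n/2) + \Theta(n^2) \;\Longrightarrow\; T(n) = \Theta\!\left(n^{\log_2 7}\right) = \Theta(n^{2.807\ldots}),$$

whereas the naive blockwise scheme with 8 multiplications gives T(n) = 8T(n/2) + Θ(n^2) = Θ(n^3).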

How did he do it? Using the block matrices A and B from the previous example, he defined seven intermediate matrix products.

Thus, the process of Strassen's algorithm is as follows:

  • To multiply two matrices A and B, divide A, B and the product C into equally sized square blocks:

$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}, \quad B = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}, \quad C = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix}$$

  • We can see that C comes from:

$$C_{11} = A_{11}B_{11} + A_{12}B_{21}, \quad C_{12} = A_{11}B_{12} + A_{12}B_{22}, \quad C_{21} = A_{21}B_{11} + A_{22}B_{21}, \quad C_{22} = A_{21}B_{12} + A_{22}B_{22}$$

  • Now define 7 new matrices (readers can think about how one might come up with them):

$$\begin{aligned} M_1 &= (A_{11} + A_{22})(B_{11} + B_{22}) \\ M_2 &= (A_{21} + A_{22})B_{11} \\ M_3 &= A_{11}(B_{12} - B_{22}) \\ M_4 &= A_{22}(B_{21} - B_{11}) \\ M_5 &= (A_{11} + A_{12})B_{22} \\ M_6 &= (A_{21} - A_{11})(B_{11} + B_{12}) \\ M_7 &= (A_{12} - A_{22})(B_{21} + B_{22}) \end{aligned}$$

  • Only 7 block multiplications are used, and the blocks of the result matrix C are obtained by combining these 7 matrices:

$$C_{11} = M_1 + M_4 - M_5 + M_7, \quad C_{12} = M_3 + M_5, \quad C_{21} = M_2 + M_4, \quad C_{22} = M_1 - M_2 + M_3 + M_6$$
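As a concrete illustration, here is one level of the scheme written out in C++ for 2 x 2 scalar matrices (a sketch of mine; in the real algorithm each entry is itself a submatrix and each of the 7 products is a recursive call):

#include <iostream>

// One level of Strassen's scheme on 2x2 matrices: 7 multiplications instead of 8.
void Strassen2x2(const int A[2][2], const int B[2][2], int C[2][2])
{
    int m1 = (A[0][0] + A[1][1]) * (B[0][0] + B[1][1]);
    int m2 = (A[1][0] + A[1][1]) * B[0][0];
    int m3 = A[0][0] * (B[0][1] - B[1][1]);
    int m4 = A[1][1] * (B[1][0] - B[0][0]);
    int m5 = (A[0][0] + A[0][1]) * B[1][1];
    int m6 = (A[1][0] - A[0][0]) * (B[0][0] + B[0][1]);
    int m7 = (A[0][1] - A[1][1]) * (B[1][0] + B[1][1]);

    C[0][0] = m1 + m4 - m5 + m7;
    C[0][1] = m3 + m5;
    C[1][0] = m2 + m4;
    C[1][1] = m1 - m2 + m3 + m6;
}

int main()
{
    int A[2][2] = { {1, 2}, {3, 4} }, B[2][2] = { {5, 6}, {7, 8} }, C[2][2];
    Strassen2x2(A, B, C);
    std::cout << C[0][0] << ' ' << C[0][1] << '\n'
              << C[1][0] << ' ' << C[1][1] << '\n';  // prints: 19 22 / 43 50
    return 0;
}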

On the face of it, Strassen's algorithm is only slightly better than the generic matrix multiplication algorithm: the generic algorithm runs in O(n^3) time, while Strassen's runs in O(n^2.807). But as n grows large, say n >> 100, Strassen's algorithm pulls clearly ahead of ordinary matrix multiplication.

(figure: running time of the generic algorithm versus Strassen's algorithm as n grows)

Solution 3: further optimization

Later, according to Wikipedia, the Coppersmith-Winograd algorithm reduced the time complexity of n x n matrix multiplication to O(n^2.376). In 2010, Andrew Stothers lowered the complexity to O(n^2.3737), and a year later, in 2011, Virginia Vassilevska Williams brought it down further, to O(n^2.3727).


References

  1. Quick sort algorithm: blog.csdn.net/v_july_v/ar…

  2. An in-depth analysis of the quicksort algorithm: blog.csdn.net/v_july_v/ar…

  3. gnuhpc: blog.csdn.net/gnuhpc/arti…

  4. Wikipedia on Strassen's algorithm: zh.wikipedia.org/wiki/%E6%96…

  5. Some figures in chapter 42 are from the article "Computer Algorithms: Strassen's Matrix Multiplication": www.stoimen.com/blog/2012/1…

  6. A translated version of the above, from the Turing community: www.ituring.com.cn/article/179…

  7. The Coppersmith-Winograd algorithm: en.wikipedia.org/wiki/Copper…


Afterword

The Art of Programming was originally planned to run to chapter 50, and now only the last eight chapters remain. Thank you for your attention. I wish all readers of this blog a happy New Year, and may everything you wish for come true in the Year of the Horse. Thanks.

July, 28th, 2014