As programmers, we all know that data structures and algorithms are important. At the same time, learning them is genuinely hard for most of us, and I have to admit I am one of those strugglers 🙍. But if you want your programming career to go further, you have to gnaw on this tough bone 🦴. So I am ready to start studying seriously, and I hope I can stick with it. If this helps you, a like 👍 would be much appreciated.

Complexity analysis is the essence of learning algorithms: once you master it, you have mastered half of data structures and algorithms. This article is therefore mainly about how to analyze the complexity of an algorithm. It draws heavily on the data structures and algorithms course by Wang Zheng on Geek Time, plus my own summary and abstraction. If you are interested, you can buy the course to study further.

Some knowledge of data structures and algorithms

A data structure is a storage structure for a group of data. An algorithm is a set of methods for manipulating data.

Data structures and algorithms are complementary. Data structures serve algorithms, and algorithms operate on specific data structures. So we can’t talk about algorithms in isolation, and we can’t talk about data structures in isolation. For example, the common binary search algorithm requires an array to store the data because of the array’s random-access property. If we chose a data structure like a linked list instead, binary search wouldn’t work, because linked lists don’t support random access.
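To make this concrete, here is a minimal binary search sketch in JavaScript (my own illustration, not code from the course). Note that arr[mid] is an O(1) random access, which arrays provide but linked lists do not:

// Binary search over a sorted array; arr[mid] is an O(1) random access.
function binarySearch(arr, target) {
    let low = 0;
    let high = arr.length - 1;
    while (low <= high) {
        const mid = Math.floor((low + high) / 2);
        if (arr[mid] === target) return mid;
        if (arr[mid] < target) low = mid + 1;
        else high = mid - 1;
    }
    return -1; // not found
}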

Data structures are static; they are just a way of organizing data. Data structures that exist in isolation are useless if you don’t operate and build algorithms on top of them.

Why complexity analysis

Data structures and algorithms themselves solve the problem of “fast” and “frugal” — how to make code run faster, how to make code save storage. Therefore, the execution efficiency is a very important indicator of the algorithm.

We can evaluate an algorithm’s execution efficiency with the after-the-fact statistical method (run the code once and measure its running time and memory usage through statistics and monitoring), but this method has big limitations.

  1. Test results are highly dependent on the test environment.
  2. Test results are greatly influenced by the size of the data. For example, on small inputs, insertion sort might actually be faster than quicksort.
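For concreteness, here is what the after-the-fact method might look like in JavaScript (a minimal sketch of my own; console.time/console.timeEnd are standard, everything else is illustrative):

// Measure one concrete run — the result depends on this machine and this input.
function sum(n) {
    let total = 0;
    for (let i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}

console.time('sum');
sum(1000000);
console.timeEnd('sum'); // e.g. "sum: 3.2ms" — varies by environment and data size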

So we need a way to roughly estimate an algorithm’s efficiency without testing it on specific data. That is asymptotic time and space complexity analysis, or simply complexity analysis.

Asymptotic time and space complexity analysis gives us a solid theoretical basis. It is independent of platform and host program, it gives us a general understanding of an algorithm — for example, its execution efficiency in the worst case — and it builds a bridge for communication. We can say that the time complexity of algorithm 1 is O(n) and that of algorithm 2 is O(logN), and immediately have an intuitive sense of how “efficient” the two algorithms are relative to each other.

Of course, asymptotic time and space complexity analysis is only a theoretical model and can only give a rough estimate. We cannot directly conclude that an O(logN) algorithm must be better than an O(n) one: with different host environments, different data sets, and different data sizes, real-world performance may differ. So for specific situations we can run concrete performance benchmarks — for example, horizontal benchmarks on a uniform batch of mobile phones (same hardware, same system, etc.) — to select the most suitable algorithm for a specific application scenario.

So asymptotic time and space complexity analysis does not conflict with the statistical method; rather, they complement each other. A program with lower time complexity is very likely to beat one with higher-order time complexity, so in actual programming, caring about the theoretical time and space models helps us produce efficient programs. At the same time, because asymptotic analysis only provides a rough model, it does not cost much effort. The key is to keep this complexity-analysis mindset while programming; it constantly pushes us to raise our programming level and improve code quality.

Big O complexity notation

The execution efficiency of an algorithm is, roughly speaking, the execution time of its code. So how can we “see” the efficiency of a piece of code directly? Let me give you some practical examples.

Example 1:

Assume that each line of code takes the same amount of time to execute, unit_time. What, then, is the time complexity of the following code?


function sum(n){
    let sum = 0;
    let i = 1;
    for(i; i <= n; i++){
        sum += i;
    }
    return sum;
}

Counting line executions: the two initializations run once each, while the loop condition, the body sum += i, and the increment i++ each run n times, so T(n) = (3n + 2) * unit_time. As you can see, the execution time T(n) of the code is proportional to the number of times each line of code executes.
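Annotating the code with per-line execution counts makes the tally concrete (my annotations, standing in for the original figure):

function sum(n){
    let sum = 0;            // 1 * unit_time
    let i = 1;              // 1 * unit_time
    for(i; i <= n; i++){    // condition and increment: 2n * unit_time
        sum += i;           // n * unit_time
    }
    return sum;             // total: T(n) = (3n + 2) * unit_time
}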

Example 2:

Let’s continue to use the above method to analyze the following code

function sum(n){
    let sum = 0;
    let i = 1;
    for(i; i <= n; i++){
        console.log('test');
        for(let j = 1; j <= n; j++){
            sum = sum + i * j;
        }
    }
    return sum;
}

Counting line executions again: the two initializations and the return run 3 times in total, the outer loop line and the console.log each run n times, and the inner loop line and its body each run n² times, so T(n) = (2n² + 2n + 3) * unit_time.

To sum up: although we do not know the exact value of unit_time, deriving the execution time of these two pieces of code reveals a very important rule — the total execution time T(n) is proportional to the total number of times the lines of code execute, which we call f(n).

We can capture this rule in a formula: T(n) = O(f(n)). This is where big O appears.

So in the first example T(n) = O(3n + 2), and in the second T(n) = O(2n² + 2n + 3). That is big O time complexity notation. Big O time complexity does not specify the actual execution time of the code; rather, it represents the trend of the execution time as the data size grows. Hence it is also called asymptotic time complexity, hereinafter simply time complexity.

When n is large — think 10,000 or 100,000 — the low-order terms, constants, and coefficients in the formula do not determine the growth trend, so they can be ignored; we only need to keep the highest-order term. Written in big O notation, the time complexities of the two pieces of code above become T(n) = O(n) and T(n) = O(n²).

Three practical methods of time complexity analysis

Focus only on the piece of code that loops the most times

Big O notation only reflects how code execution time grows with data size; low-order terms, constants, and coefficients are ignored, and only the highest order is kept.

Therefore, when analyzing the time complexity of an algorithm or a piece of code, we only need to focus on the piece of code that executes the most times.

By this rule, the sum code above is O(n).

Rule of addition: The total complexity is equal to the complexity of the code with the highest order of magnitude

We can look at the following code

function sum(n){
    let sum_1 = 0;
    for(let i = 1; i < 100; i++){
        sum_1 += i;
    }

    let sum_2 = 0;
    for(let j = 0; j < n; j++){
        sum_2 += j;
    }

    let sum_3 = 0;
    for(let i = 1; i <= n; i++){
        console.log(i);
        for(let j = 1; j <= n; j++){
            sum_3 = sum_3 + i * j;
        }
    }
    return sum_1 + sum_2 + sum_3;
}


Analyzing the three segments separately: the first loop runs a constant number of times (100), i.e., O(1); the second runs n times, i.e., O(n); the third is a nested loop that runs n² times, i.e., O(n²). Taking the largest magnitude, the total complexity is O(n²).

A note on the first segment: even if that loop ran 10,000 or 100,000 times, as long as the count is a known constant independent of n, it is still constant execution time. A large constant certainly affects actual running time, but recall the concept of time complexity: it describes the trend of execution efficiency as data size grows, so constant execution time can be ignored no matter how large it is, because it has no effect on the growth trend.

To summarize: The total time complexity is equal to the time complexity of the code with the highest magnitude. It can be abstracted into the following formula

If T1(n) = O(f(n)) and T2(n) = O(g(n)), then T(n) = T1(n) + T2(n) = max(O(f(n)), O(g(n))) = O(max(f(n), g(n))).

Product rule: The complexity of nested code is equal to the product of the complexity of both inside and outside the nested code

Now that you understand the addition rule, the product rule should be easy to understand, and the product rule formula can be abstracted into the following formula

If T1(n) = O(f(n)) and T2(n) = O(g(n)), then T(n) = T1(n) * T2(n) = O(f(n)) * O(g(n)) = O(f(n) * g(n)).

That is to say, if T1(n) = O(n) and T2(n) = O(n²), then T1(n) * T2(n) = O(n³). Take a look at this sample code:

function sum(n){
    let sum = 0;
    // The loop runs n times, and each pass calls fSum, which is O(n) in its
    // argument, so by the product rule the total is O(n) * O(n) = O(n^2).
    for(let i = 1; i <= n; i++){
        sum = sum + fSum(i);
    }
    return sum;
}

function fSum(n){
    let total = 0;
    for(let i = 0; i < n; i++){
        total += i;
    }
    return total;
}

Specific analysis: treating the loop in sum() alone as T1(n) = O(n) and the call to fSum() as T2(n) = O(n), the total is T(n) = T1(n) * T2(n) = O(n * n) = O(n²).

In other words, we can view the product rule as a nested loop

Analysis of several common time complexity examples

The common complexity classes, roughly from low to high order, are O(1), O(logn), O(n), O(nlogn), O(n²), O(n³), ..., O(2ⁿ), and O(n!). They can be broadly divided into two categories:

  1. Polynomial magnitude: O(1), O(logn), O(n), O(nlogn), O(n²), O(n³), and so on.
  2. Non-polynomial magnitude: there are only two, O(2ⁿ) and O(n!). As the data size n grows larger and larger, the execution time of a non-polynomial algorithm increases sharply, and the time needed to solve the problem grows without bound. Algorithms with non-polynomial time complexity are therefore very inefficient (see the sketch below for a taste). The common polynomial-magnitude complexities are explained in the sections that follow.
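As a quick taste of non-polynomial growth first (my own example, not from the course): the classic naive recursive Fibonacci runs in roughly O(2ⁿ) time, because each call spawns two more calls:

// Naive recursion: T(n) = T(n-1) + T(n-2) + O(1), which grows roughly as O(2^n).
function fib(n) {
    if (n < 2) return n;
    return fib(n - 1) + fib(n - 2);
}
// fib(40) already takes noticeable time; fib(100) would effectively never finish.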

O(1)

O(1) is just a way of saying constant time complexity, not that only one line of code has been executed. For example, the following code, even though it is only three lines long, is O(1), not O(3).

 let i = 8;
 let j = 6;
 let sum = i + j;

To summarize: as long as the execution time of a piece of code does not grow with n, we call its time complexity O(1). In other words, even an algorithm thousands of lines long has O(1) time complexity, as long as it contains no loop or recursive statements whose counts depend on n.

O(logn), O(nlogn)

Logarithmic time complexity is very common, but it is also one of the trickiest to analyze. The following example illustrates it:

 let i=1;
 while (i <= n)  {
   i = i * 2;
 }

According to the complexity analysis method described earlier, the third line of code (i = i * 2) is the one that executes the most times. So if we can work out how many times it executes, we know the time complexity of the whole piece of code.

As you can see from the code, the variable i starts at 1 and is multiplied by 2 on each pass through the loop; once it exceeds n, the loop ends. Remember the geometric sequences we learned in high school? The successive values of i form one. Listed one by one, they look like this:

2^0, 2^1, 2^2, ..., 2^x (the loop stops once 2^x grows past n)

So we just need to know the value of x to know how many times this line executes. Solving 2^x = n — which we probably learned in high school — gives x = log2n, so the time complexity of this code is O(log2n).

Similarly, what is the time complexity of this code?

 let i=1;
 while (i <= n)  {
   i = i * 3;
 }

According to the above we can conclude that the time complexity is O(log3n).

But in fact, whether it’s base two, base three, or base ten, we can write all logarithmic time as order logn. Why is that?

To answer that, we need one property of logarithms: logarithms of different bases can be converted into one another:


log₃n = log₃2 × log₂n

Because log₃2 is a constant, and based on our earlier rule that coefficients can be ignored in big O notation — O(Cf(n)) = O(f(n)) — we get O(log₂n) = O(log₃n). So we uniformly write logarithmic time complexity as O(logn), regardless of the base.
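A quick empirical check (my own sketch, not from the course): counting loop passes for different bases shows both grow like log(n), differing only by a constant factor:

// Count how many passes the while loop from the examples above makes.
function countLoops(n, base) {
    let i = 1;
    let count = 0;
    while (i <= n) {
        i = i * base;
        count++;
    }
    return count;
}

console.log(countLoops(1000000, 2)); // 20 ≈ log2(1e6)
console.log(countLoops(1000000, 3)); // 13 ≈ log3(1e6)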

Once you understand O(logn), O(nlogn) is easy. By the product rule, if a piece of code is O(logn) and we execute it n times in a loop, the total is O(nlogn). And O(nlogn) is a very common algorithmic time complexity: merge sort and quicksort, for example, have O(nlogn) time complexity.
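For instance, combining the two loop patterns above gives the O(nlogn) shape (an illustrative sketch of my own):

// The outer loop runs n times; the inner doubling loop runs about log2(n)
// times per pass, so by the product rule the total is O(n * logn).
function nLogN(n) {
    let count = 0;
    for (let k = 0; k < n; k++) {
        let i = 1;
        while (i <= n) {
            i = i * 2;
            count++;
        }
    }
    return count; // roughly n * log2(n)
}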

O(m+n), O(m*n)

Take a look at the code below

function cal(m, n) {
  let sum_1 = 0;
  let i = 1;
  for (; i < m; ++i) {
    sum_1 = sum_1 + i;
  }
 
  let sum_2 = 0;
  let j = 1;
  for (; j < n; ++j) {
    sum_2 = sum_2 + j;
  }
 
  return sum_1 + sum_2;
}

The complexity of this code depends on the sizes of two pieces of data, m and n. We cannot know in advance which of m or n is larger, so when writing the complexity we cannot simply apply the addition rule and drop one of them. The time complexity of the code above is therefore O(m+n).

For this situation, the original addition rule no longer applies, and we need to generalize it: T1(m) + T2(n) = O(f(m) + g(n)). The product rule remains valid as-is: T1(m) * T2(n) = O(f(m) * g(n)).
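The heading also mentions O(m*n); the typical shape is a loop over one data size nested inside a loop over the other (a sketch of my own, not the course's code):

// The outer loop runs about m times and the inner loop about n times per pass,
// so by the product rule the total is O(m * n).
function cal2(m, n) {
    let sum = 0;
    for (let i = 1; i < m; ++i) {
        for (let j = 1; j < n; ++j) {
            sum = sum + i * j;
        }
    }
    return sum;
}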

Space complexity analysis

The full name of time complexity is asymptotic time complexity, which describes the growth relationship between an algorithm’s execution time and the data size. By analogy, the full name of space complexity is asymptotic space complexity, which describes the growth relationship between an algorithm’s storage space and the data size.

Let me give you an example in C:

#include <stdio.h>
#include <stdlib.h>

void print(int n) {
  int i = 0;
  int *a = (int *)malloc(n * sizeof(int)); /* the only allocation that grows with n: O(n) */
  for (i = 0; i < n; ++i) {
    a[i] = i * i;
  }

  for (i = n - 1; i >= 0; --i) {
    printf("%d\n", a[i]);
  }
  free(a);
}

As with time complexity, we look at what grows with n: the variable i takes constant space, while the array a stores n integers; nothing else in the function depends on n.

So the space complexity of the whole code is O(n).

The most common space complexities are O(1), O(n), and O(n²); logarithmic ones such as O(logn) or O(nlogn) rarely come up. Moreover, space complexity analysis is much simpler than time complexity analysis.
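To contrast with the C example above, here is an O(1)-space version of a similar task (my own sketch, in JavaScript): it prints the same values without an array whose size grows with n:

// Only a fixed number of variables regardless of n, so space is O(1)
// (time is still O(n) — time and space complexity are independent measures).
function print(n) {
    for (let i = n - 1; i >= 0; i--) {
        console.log(i * i);
    }
}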

Complexity of different data structures and algorithms

If you liked this, please give it a like and bookmark it. Thank you!

Algorithm complexity analysis: best, worst, average, amortized time complexity