These are the notes I took while studying the MOOC algorithm course by Professor Bobo. The original video is here.

In algorithm interviews, one question is almost inescapable: what is the time complexity of your algorithm? Sometimes interviewers will simply ask us to design an algorithm with a given complexity. So complexity estimation is a must when learning algorithms (and when going through algorithm interviews). Many students get a headache at the mere thought of complexity analysis and immediately picture the heavy mathematical derivations in “Introduction to Algorithms”. In fact, the bar for complexity analysis in a typical corporate interview is not that high, but it is hard to get around.

What is the Big O

When we calculate complexity, we usually write something like O(n) or O(nlogn). So what exactly is Big O?

  • n indicates the data scale (the size of the input)

  • O(f(n)) means the number of instructions required to run the algorithm is proportional to f(n)

An example 🌰

  • Binary search: O(logn) — the number of instructions executed is a*logn

  • Finding the maximum/minimum value in an array: O(n) — b*n

  • Merge sort: O(nlogn) — c*nlogn

  • Selection sort: O(n^2) — d*n^2

Here a, b, c, and d are constants: they are fixed and do not change. In other words, as the data size n grows, the growth of the algorithm's cost has little to do with the constants a, b, c, and d; the gap between logn, n, nlogn, and n^2 is what really matters. That is why we usually omit the constant factors.

For example, in the table below, algorithm A is O(n) and requires 10000*n instructions, while algorithm B is O(n^2) and requires 10*n^2 instructions.

n | A: 10000n instructions | B: 10n^2 instructions | A / B
10 | 10^5 | 10^3 | 100
100 | 10^6 | 10^5 | 10
1000 | 10^7 | 10^7 | 1
10^4 | 10^8 | 10^9 | 0.1
10^5 | 10^9 | 10^11 | 0.01
10^6 | 10^10 | 10^13 | 0.001

As the data volume grows, the algorithm with the higher time complexity takes dramatically longer than the one with lower complexity. If we were dealing with tens of millions of entries, algorithm A might finish in a day while algorithm B would take years. That is why we try to minimize the complexity of our algorithms.

Plotting these functions also makes the point clear: as the amount of input data increases, their growth rates are completely different.

A strict definition of O(n)

Now that you have a good idea of what O(n) is, let’s look at a more rigorous definition of O(n). In academia, strictly speaking, O(f(n)) represents an upper bound on algorithm execution. What do we mean by “upper bound of algorithm execution”?
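To state this more precisely (the standard textbook definition, which these notes only gesture at): saying an algorithm runs in O(f(n)) time means there exist a constant c > 0 and a threshold n0 such that, for every input size n ≥ n0, the running time T(n) satisfies T(n) ≤ c * f(n).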

🌰 The time complexity of the merge sort algorithm is O(nlogn), and it is also O(n^2).

In interviews, however, we use O to denote the tightest upper bound on the algorithm's running time. Strictly speaking, saying that merge sort is O(n^2) is true, but in an interview we don't usually say that.

Time complexity addition

What if an algorithm has multiple parts with different complexities? For example, suppose we design an algorithm that is the sum of several steps and its time is O(nlogn + n). Then the whole algorithm is dominated by the term with the higher complexity. In other words, if an algorithm's time is O(nlogn + n), we can treat it as O(nlogn): as the data size increases, the O(nlogn) term dominates and the O(n) term becomes insignificant.
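As a small sketch of this (my own example, not from the course), here is a function whose first step is an O(nlogn) sort and whose second step is an O(n) scan; the sort dominates, so the whole function is O(nlogn):

function maxGapAfterSort(arr) {
  // Step 1: sort the numbers, O(nlogn)
  arr.sort((a, b) => a - b);
  // Step 2: one linear scan to find the largest gap between neighbors, O(n)
  let maxGap = 0;
  for (let i = 1; i < arr.length; i++) {
    maxGap = Math.max(maxGap, arr[i] - arr[i - 1]);
  }
  return maxGap; // total: O(nlogn + n) = O(nlogn)
}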

Similarly, if we design an algorithm whose time is O(nlogn + n^2), we say its time is O(n^2). But the premise of this simplification is that both parts operate on data of the same size, that is, the two n's are the same.

In practice you also run into complexities like O(AlogA + B), where A and B are unrelated data sizes. In that case we cannot merge the two terms.
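A quick sketch of that situation (hypothetical names, my own example): one step works on one input and another step works on a completely unrelated input, so neither term can absorb the other:

function process(tasks, users) {
  // A = tasks.length, B = users.length — unrelated sizes
  tasks.sort((x, y) => x - y);  // O(A*logA)
  for (const user of users) {   // O(B)
    console.log(user);
  }
  // total: O(A*logA + B), which cannot be simplified further
}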

A thought question

You have an array of strings. First you sort the characters of each string alphabetically; then you sort the entire string array lexicographically. What is the time complexity of the whole operation?

You may be thinking: isn't this easy? 😎

1. Sorting each string alphabetically is nlogn; there are n strings, so this part is n * nlogn.
2. Sorting the whole string array lexicographically is nlogn.

Add them up: n * nlogn + nlogn = O(n^2 * logn).

Unfortunately, this idea is wrong ❌.

First, this reasoning mixes up the length of a single string with the length of the whole string array; the two are unrelated, so we need two separate parameters. Assume the longest string has length s and there are n strings in the array. Sorting one string takes O(slogs), so sorting the characters of every string in the array takes O(n*slogs).

Second, the lexicographic sort of the string array is not simply O(nlogn). For sorting algorithms, O(nlogn) counts the number of comparisons. Sorting an integer array needs only O(nlogn) comparisons because comparing two integers costs O(1). Comparing two strings is different: a lexicographic comparison costs O(s), proportional to the string length. So the lexicographic sort of our string array is O(s*nlogn).

In summary, the time complexity of the whole two-part operation is: O(n*slogs) + O(s*nlogn) = O(n*s*(logs + logn)).
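A compact sketch of the two steps in code (my own illustration, using the built-in sort):

function sortStrings(strs) {
  // n = strs.length, s = length of the longest string
  // Step 1: sort the characters of each string, n * O(s*logs)
  const sorted = strs.map(str => str.split("").sort().join(""));
  // Step 2: sort the array lexicographically:
  // O(nlogn) comparisons, each costing up to O(s)
  sorted.sort();
  return sorted;
}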

Complexity is about use cases

Test cases also affect an algorithm's running time. Insertion sort, for example, is O(n^2) in the worst case but O(n) in the best case. We still call it an O(n^2) sorting algorithm, because on average it is O(n^2).

Quicksort is O(n^2) in the worst case and O(nlogn) in the best case. We call it an O(nlogn) sorting algorithm, because on average it is O(nlogn).
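For reference, here is a minimal insertion sort sketch (my own version, not taken from the course) that shows why the best case is O(n): on an already-sorted array the inner while loop never runs, while on a reverse-sorted array it does the full O(n^2) amount of shifting.

function insertionSort(arr) {
  for (let i = 1; i < arr.length; i++) {
    const current = arr[i];
    let j = i - 1;
    // shift larger elements to the right; on sorted input this loop exits immediately
    while (j >= 0 && arr[j] > current) {
      arr[j + 1] = arr[j];
      j--;
    }
    arr[j + 1] = current;
  }
  return arr;
}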

The data size

To get a feel for data scale, let's do a small experiment: run an O(n) program with n ranging from 10^0 up to 10^9.

// time an O(n) loop for n = 10^0 ... 10^9
for (let i = 0; i <= 9; i++) {
  const n = Math.pow(10, i);
  console.time("someFunction");
  let sum = 0;
  for (let j = 0; j < n; j++) {
    sum += j;
  }
  console.log("10^" + i);           // print the current data scale
  console.timeEnd("someFunction");  // print the elapsed time
}

/*
10^0 someFunction: 7.882ms
10^1 someFunction: 0.052ms
10^2 someFunction: 0.044ms
...
10^7 someFunction: 13.859ms
10^8 someFunction: 125.905ms
10^9 someFunction: 1.026s
*/

When the amount of data is small, the timings don't tell us much. But once the data scale gets large, every 10-fold increase in n increases the running time by roughly a factor of 10.

If you want the program to finish in about 1 second:

An O(n^2) algorithm can handle roughly n = 10^4; an O(nlogn) algorithm roughly n = 10^7 (since lg(10^7) is only about 7, the log factor barely hurts); and an O(n) algorithm roughly n = 10^8.

So in an interview we can estimate the required complexity from the data size. If the interviewer says the data scale is 10^8, we know the algorithm basically has to be O(n). (Sometimes O(nlogn) is also acceptable, because logn is such a small factor that the processing is still fast.)

If the interviewer says we only need to handle 10^3 numbers, that leaves enough room for an O(n^2) algorithm.

Space complexity

Compared with time complexity, space complexity is easy to calculate: in general, however much auxiliary space an algorithm allocates is its space complexity. Say the data size is n. If we allocate an auxiliary array of length n to help run the algorithm, the space complexity is O(n); if we allocate an auxiliary two-dimensional array of size n by n, the space complexity is O(n^2). Of course, many algorithms need no extra array space at all, such as in-place sorting; we only open a few temporary variables, and in that case we say the space complexity is O(1).

Finally, note that recursion has a space cost: every recursive call pushes a stack frame onto the system stack, which takes space. As an example, let's write an algorithm that computes the sum from 0 to n.

// iterative version: only a constant number of extra variables
const sum = (n) => {
  let result = 0;
  for (let i = 0; i <= n; i++) {
    result += i;
  }
  return result;
};

If we write it as a loop, the space complexity is O(1). What if we write it recursively?

// recursive version: each call adds a frame to the system stack
const sum = (n) => {
  if (n === 0) {
    return 0;
  }
  return n + sum(n - 1);
};

Space complexity: O(n). The depth of the recursive calls is n, so we push n stack frames onto the system stack. In other words, the space complexity of a recursive algorithm equals the depth of its recursion.

Common algorithm complexity analysis

O(1)

function swapTwoNumbers(a, b) {
        let temp = a;
        a = b;
        b = temp;
}

The number of operations in this algorithm does not depend on the size of the input; it is constant, so the complexity is O(1).

O(n)

function sum(n) {
  let result = 0;
  for (let i = 0; i <= n; i++) {
    result += i;
  }
  return result;
}

This function computes the sum of the integers from 0 to n. It has a single loop, and the number of iterations depends on n, so it is a very typical O(n) algorithm. In other words, the loop body executes c*n times, where c is a constant; c does not even have to be greater than 1. Take this algorithm:

// s is an array of characters (JavaScript strings are immutable)
function reverse(s) {
  let n = s.length;
  for (let i = 0; i < n / 2; i++) {
    [s[i], s[n - 1 - i]] = [s[n - 1 - i], s[i]];
  }
}

It performs (1/2)*n swaps, and its time complexity is still O(n): the time consumed grows linearly with the length of the input.

O(n^2)

function selectionSort(arr, n) {
  for (let i = 0; i < n; i++) {
    let minIndex = i;
    for (let j = i+1; j < n; j++) {
      if(arr[j] < arr[minIndex]) {
        minIndex = j
      }
    }
   [arr[i], arr[minIndex]] = [arr[minIndex], arr[i]]
  }
}

This algorithm involves a double loop, so it’s usually O(n^2) time.

Strictly speaking, the number of instructions this algorithm performs is proportional to n^2, where n is the size of the data.

Let's count how many times the inner comparison arr[j] < arr[minIndex] is executed.

  • When i = 0, j goes from 1 to n-1: n-1 times;

  • When i = 1, j goes from 2 to n-1: n-2 times;

  • When i = n-1, the inner loop does not run at all: 0 times;

  • (n-1) + (n-2) + (n-3) + ... + 0 = (0 + n-1) * n / 2 = n*(n-1)/2 = O(n^2)

Of course, we can’t just look at a double loop and think of it as O(n^2). Here’s an example:

function printInformation(n) {
  for (let i = 1; i <= n; i++) {
    for (let j = 1; j <= 30; j++) {
      console.log(i, j);
    }
  }
}

Although this is a double loop, the console.log operation is performed 30n times, so the algorithm complexity is O(n).

O(logn)

For example, we are familiar with binary search:

function binarySearch(arr, n, target) {
  let l = 0, r = n - 1;
  while (l <= r) {
    // round down so mid stays an integer index
    let mid = l + Math.floor((r - l) / 2);
    if (arr[mid] === target) return mid;
    if (arr[mid] > target) r = mid - 1;
    else l = mid + 1;
  }
  return -1;
}

Let's review the idea of binary search on a sorted array. We first find the middle element and compare it with the target. If it matches, we return it; otherwise we continue searching in the left or right half, depending on the comparison. The first time we search among n elements; if the target is not there, we search among n/2 elements, then n/4, ..., down to 1. How many steps is that? It is the same as asking: how many times can n be divided by 2 before it reaches 1? The answer is log2(n), so the time complexity is O(logn).

Another example: converting a number to a string.

function numberToString(num) {
  let s = "";
  while (num) {
    s += num % 10;               // append the lowest digit
    num = Math.floor(num / 10);  // drop the lowest digit
  }
  // the digits were appended in reverse order, so reverse the string
  return s.split("").reverse().join("");
}

This algorithm boils down to the question: how many times can n be divided by 10 before it reaches 0? The answer is log10(n), so the complexity is O(logn).

As we said, when you see a double loop you can usually assume it is an O(n^2) algorithm. But there are exceptions, like this one:

function hello(n) {
  for (let sz = 1; sz < n; sz += sz) {
    for (let i = 1; i < n; i++) {
      console.log("hello");
    }
  }
}

The outer loop doubles sz each time (sz += sz) until it exceeds n, so it runs O(logn) times; the inner loop runs from 1 to n, which is O(n); so the total time is O(nlogn).

Complexity experiment

In the previous sections, we looked at the relationship between programs and their complexity. Even so, when we actually write a program, we can't always be completely sure that its time complexity is what we intended.

To check, we can run experiments and watch how the program's running time changes as the data size grows. For example, we can double the data size each time and observe how the time changes.

In JavaScript you can use console.time("someFunction") and console.timeEnd("someFunction") to measure elapsed time, which I won't cover in detail here.

Complexity analysis of recursive algorithms

Both merge sort and quicksort use recursion.

Since both of these algorithms are O(nlogn), when you first start studying algorithms you might assume that every recursive algorithm is O(nlogn).

Of course, that conclusion is not correct; we must analyze each case on its own.

The first case is simple: the function makes only one recursive call.

Binary search

The most typical example is binary search:

function binarySearch(arr, l, r, target) {
  if (l > r) {
    return -1;
  }
  // round down so mid stays an integer index
  let mid = l + Math.floor((r - l) / 2);
  if (arr[mid] === target) {
    return mid;
  } else if (arr[mid] > target) {
    return binarySearch(arr, l, mid - 1, target);  // pass target along
  } else {
    return binarySearch(arr, mid + 1, r, target);
  }
}

This function makes only one recursive call each time, so we just need to work out how deep the recursion goes. Each call works on an array half the size of the previous one, so the depth of the recursion is logn, and the work inside each call is O(1); the time complexity is therefore O(logn). In general, if a recursive function makes only one recursive call, its recursion depth is depth, and the work inside each call is T, then the overall time complexity is O(T * depth).

Computing the sum from 0 to n

Another example: compute the sum from 0 to n.

function sum(n) {
  if (n === 0) {
    return 0;
  }
  return n + sum(n - 1);
}

The recursion depth of this function is n, the work inside each call is O(1), and so the overall time complexity is O(n).

But what about a recursive function that makes more than one recursive call? At that point we need to count the total number of calls.

Fibonacci numbers

function f(n) {
  if (n === 0) return 0;
  if (n === 1) return 1;  // base cases of the Fibonacci sequence
  return f(n - 1) + f(n - 2);
}

In general, we can draw the recursion tree and count its nodes: each call spawns two more recursive calls, so in the worst case the tree is a full binary tree.

So, to figure out how many times f is called, we just count the nodes in that tree. For a tree of depth 3 that is:

1 + 2 + 4 + 8 = 15

and in general:

2^0 + 2^1 + 2^2 + 2^3 + ... + 2^n = 2^(n+1) - 1

which gives a complexity of O(2^n).

This is just the sum formula for a geometric series. In other words, because this function calls itself twice on every invocation, we end up with an exponential algorithm.

Note that exponential algorithms are extremely slow. If you find yourself designing one, be careful: with n around 20 that is already on the order of a million operations, and with n around 30 an ordinary computer is already struggling. You can often optimize such algorithms, for example with pruning, memoization, or dynamic programming, turning an exponential problem into a polynomial one.
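A minimal sketch of the memoization idea (my own example, not from the course): caching previously computed values means each value of n is computed only once, turning the O(2^n) recursion into O(n).

function fib(n, memo = new Map()) {
  if (n === 0) return 0;
  if (n === 1) return 1;
  if (memo.has(n)) return memo.get(n);  // reuse a cached result
  const result = fib(n - 1, memo) + fib(n - 2, memo);
  memo.set(n, result);                  // cache before returning
  return result;
}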

Essentially, as the recursion runs, the computer builds up a search tree, and the size of that search tree is exactly the time complexity we need to calculate. This is a very important idea.

Merge sort, quicksort: O(nlogn)

But you might be asking: the advanced sorting algorithms we keep mentioning, merge sort and quicksort, are recursive too, and yet they are not O(2^n) but O(nlogn).

function mergeSort(arr, l, r) {
  if (l >= r) {
    return;
  }
  let mid = l + Math.floor((r - l) / 2);  // round down to an integer index
  mergeSort(arr, l, mid);
  mergeSort(arr, mid + 1, r);
  merge(arr, l, mid, r);  // merge the two sorted halves, an O(n) step
}

That is because in the Fibonacci example the recursion tree had depth n, while for these sorting algorithms the recursion tree only has depth logn.

So how do you calculate the complexity of merge sort and quicksort?

In these sorting algorithms, the amount of data processed at each node shrinks as we go down the tree, whereas in the previous example every node did the same (constant) amount of work. So what does the recursion tree of merge sort look like when we sort eight numbers?

When the array length is 8, the recursion tree looks like the sketch below: there are log(8) = 3 levels of splitting, and the work done across each level is O(n). So the total complexity is O(nlogn).
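A rough sketch of that recursion tree for 8 elements (the root plus log2(8) = 3 levels of splitting; the total work on every row is 8, i.e. O(n)):

[ 8 elements ]                                    -> 1 node  x 8 = 8
[ 4 elements ] [ 4 elements ]                     -> 2 nodes x 4 = 8
[ 2 ] [ 2 ] [ 2 ] [ 2 ]                           -> 4 nodes x 2 = 8
[ 1 ] [ 1 ] [ 1 ] [ 1 ] [ 1 ] [ 1 ] [ 1 ] [ 1 ]   -> 8 nodes x 1 = 8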

At the root we deal with eight elements; then we split in two, four elements per node; then two elements per node; then one element per node. Once a node holds a single element there is nothing left to do, because a single element needs no sorting.

So with eight elements, the tree does not have eight levels; it has log2(8) = 3 levels of splitting, and the amount of data handled by each node keeps shrinking.

So where does the nlogn come from? We have logn levels:

  • At the first level, there is only one node, and it holds all eight elements.

  • At the second level, the work is split across two nodes, but each node holds four elements, so the total is still eight.

  • At the third level, there are four nodes of two elements each; again, the total work across the level is O(n).

So at every level we do O(n) work in total, and there are logn levels; multiplying them gives O(nlogn).

In fact, a recursion tree like this is exactly what a divide-and-conquer algorithm produces, and that is why divide-and-conquer algorithms usually come out to O(nlogn) time complexity.

The recursive cases analyzed here cover most of the techniques used in analyzing the time complexity of recursive algorithms. If you are interested in more complicated cases, you can look at the master theorem. The master theorem gives a general formula for the complexity of a recursive function that splits its data into several pieces, based on how the data is divided and how much work is done at each level. (The master theorem is rarely asked about in interviews, though.)
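For reference, a simplified statement of the master theorem (a standard textbook result, included here only as a pointer): for recurrences of the form T(n) = a*T(n/b) + O(n^d), with a >= 1, b > 1 and d >= 0:

  • if d > log_b(a), then T(n) = O(n^d);

  • if d = log_b(a), then T(n) = O(n^d * logn);

  • if d < log_b(a), then T(n) = O(n^(log_b(a))).

Merge sort, for example, has a = 2, b = 2, d = 1, so d = log_2(2) and T(n) = O(nlogn), matching the analysis above.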