Preface

As programmers, we have all heard the saying: programs = data structures + algorithms.

That phrase was coined by the Swiss computer scientist Niklaus Wirth, who won the Turing Award in 1984 and wrote a book with that title, Algorithms + Data Structures = Programs. It has been a well-known saying ever since.

Whether or not the statement is perfectly accurate today, it at least shows that data structures and algorithms are a core foundation of programming. If we want to write excellent, elegant code, we must master them.

Why learn algorithms

A lot of people might think: I don't know algorithms, yet I code just fine, so algorithms don't seem very useful. Nowadays, with the development of the Internet, almost everything we need can be found ready-made online, and frameworks are so powerful that it seems possible to write “good code” without knowing any algorithms. But if we don't understand algorithms such as sorting, how can we evaluate the efficiency of our code? How do we choose between the common ArrayList and LinkedList, and how do we write good code when we need to find a particular number in a collection?

How do you decide whether code is good? Readability, extensibility, and robustness can all serve as criteria, but they do not directly reflect whether code is good for the user: to the user, code that responds quickly is good code, while an interface that takes several seconds to respond is not, no matter how readable and robust it may be.

So the most direct standard for judging a piece of code is performance, and to write high-performance code you must understand algorithms. Even setting that aside, if you don't want to write CRUD code for the rest of your life, you still need algorithms: many of the frameworks and pieces of middleware we rely on are built on data structures and algorithms, and learning them is also a great help when we read source code and try to understand its design ideas.

And to state the utilitarian purpose: interviews. Algorithm questions are now standard in interviews at large companies, so if you want to get into one, you also have to learn algorithms.

Are algorithms hard to learn

When algorithms are mentioned, many people's first reaction is that they are too hard to learn, impossible to master, or quickly forgotten. In fact, for ordinary developers like us, nobody is asking us to invent algorithms; we only need to apply existing algorithms flexibly. So we don't need a very solid mathematical foundation, just some basic common sense.

Inventing a new algorithm requires deriving and proving its feasibility, which demands a very solid mathematical foundation that most people simply don't have. But the algorithms ordinary programmers talk about are nothing more than binary search, hashing, and so on, with backtracking, greedy algorithms, and dynamic programming at the slightly more advanced end. These so-called algorithms are ready-made formulas; all we need to do is understand them and then apply them flexibly. It is just like learning mathematical formulas at school: you are given a formula and then you do exercises, and doing the exercises is the process of applying the formula flexibly.

Algorithms are the same: there are concrete methods and concrete ideas, and we do not need to derive why they work. So there is no trick to learning algorithms other than to understand the idea first and then practice a lot; once you are skilled, you can naturally apply them flexibly, and you will not forget them right after learning. Forgetting what you learn comes down to two causes: either you did not understand it, or you did not practice enough to consolidate it.

Complexity analysis

Data structures and algorithms are often mentioned together, and the two are not independent: an algorithm is a way to achieve a certain goal, and a data structure is its carrier. That is to say, an algorithm must rely on a data structure as its carrier, otherwise it is empty talk. In other words: data structures serve algorithms, and algorithms operate on specific data structures.

How do we evaluate whether an algorithm is good? As mentioned before, the most intuitive measure of good code is response speed. The same holds for algorithms: given two algorithms that achieve the same goal (sorting, for example), the faster one can be considered better. If the two run at about the same speed, we can then compare how much space each occupies, and the one that takes less space is the better one. These are the foundations of algorithm analysis: time complexity and space complexity.

Before learning algorithms, we must learn how to analyze time complexity and space complexity (that is, how “fast” and how “space-saving” an algorithm is); otherwise we will have no idea how efficient the algorithms we write actually are.

Big O notation for time complexity

As you may know, the time complexity of an algorithm is expressed with a capital “O”: O(1), O(n), O(logn), O(nlogn), O(n²), and so on.

Time complexity is short for asymptotic time complexity, which describes the relationship between an algorithm's execution time and the size of its input as the input grows. The big O notation above does not reflect the real execution time of an algorithm; it reflects a trend. So when analyzing time complexity, we focus on the “variables” and ignore the “constants”.

“Variable” here means that the execution time of a piece of code changes as the input variable changes, while “constant” means that no matter how the variable changes, the execution time stays the same.

Let's practice by analyzing some common time complexities with examples.

O(1) constant order

An O(1) algorithm is also called a constant-order algorithm. The 1 here is a constant: it means the efficiency of the algorithm is fixed no matter how much data there is, and this is also the best complexity an algorithm can have.

public static void print(int n){
    int a = 1;
    int b = 2;
    int c = 3;
    int sum = a + b + c;
    System.out.println(sum);
}

No matter how many lines of code the above example has, its time complexity is constant. In other words: as long as the code contains no loops or recursive calls, its complexity is constant no matter how many lines it has.

O(n) linear order

Algorithms with O(n) complexity are also called linear order. Take the following example: how should we analyze its complexity?

public static void print1(int n){
    int a = 0;
    for (int i=0; i<n; i++){
        System.out.println(i);
    }
}

We skipped the analysis of the constant order because it is easy to understand; instead, let's take the linear order as an example and see how the result is derived.

Let's assume every line of code takes execution time T. What, then, is the execution cost of the code above?

The obvious answer is T + n·T, that is, (n+1)T. A rule of complexity analysis is that constants can be ignored, so this becomes nT, which in big O notation is O(n).

This is just a rough calculation; you don't need to worry that each line of code may actually take a different amount of time, or about whether the for statement and the curly braces each count as a line. If you do worry about such things, I suggest you first think about why 1+1 equals 2.

The complexity of an algorithm only reflects a trend. In this example, O(n) reflects the trend that as n grows, the execution time of the algorithm grows with it.

O(n²) quadratic order

Given the linear order above, the quadratic order is easy to understand: a double loop is quadratic, a triple loop is cubic, and a k-level nested loop is of the k-th power.
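For example, a double loop like the following is O(n²), because the inner statement executes n·n times (a minimal sketch; the method name print3 is just for illustration):

public static void print3(int n){
    for (int i=0; i<n; i++){
        for (int j=0; j<n; j++){
            // this line runs n * n times in total
            System.out.println(i + j);
        }
    }
}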

O(logn) logarithmic order

O(logn) is also called logarithmic order, and it is very common; you will run into logarithmic complexity in problems like binary search and binary trees. It is, however, one of the slightly harder complexities to understand.

Here’s an example:

public static void print2(int n){
    int i=1;
    while (i <= n) {
        i = i * 2;
    }
}

How do we analyze the complexity of this code? The key is to work out how many times the while loop runs. Looking at the loop, we find that i does not increase by 1 but keeps doubling: 1->2->4->8->16->32->64... until it exceeds n. So, if the loop runs x times, we get the formula: 2^x = n.

In other words, as long as we work out x we know the number of iterations, and from high-school mathematics we get x = log₂n (the 2 is the base). Then, following the same analysis as for the linear order, we omit the constants and get the complexity of this example: O(log₂n).

In the same way, we can quickly determine that the complexity of the following example is O(log₃n):

int i=1;
while (i <= n) {
    i = i * 3;
}

By the change-of-base formula, log₃n = log₃2 · log₂n, and log₃2 is a constant that can be omitted, so O(log₃n) = O(log₂n). By the same token, whatever the base is, the complexity is equivalent to O(log₂n); for that reason, and for convenience, we usually omit the base and simply write O(logn).

It does not matter if you have forgotten or never understood the formula above; just remember that whatever the base of the logarithm is, we count it as O(logn). There is also a simple rule of thumb for spotting logarithmic order: when the loop variable grows or decays by a fixed multiple each iteration, the complexity is logarithmic.
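As a quick sanity check (an illustrative sketch, not part of the original analysis; the method name countSteps is made up), we can count the iterations for base 2 and base 3 and observe that they differ only by a constant factor:

public static void countSteps(int n){
    // hypothetical helper: count how many times each loop runs
    int steps2 = 0, steps3 = 0;
    for (long i=1; i<=n; i*=2) steps2++;   // about log2(n) iterations
    for (long i=1; i<=n; i*=3) steps3++;   // about log3(n) iterations
    // steps2 / steps3 ≈ log2(3) ≈ 1.58 for large n, a constant ratio
    System.out.println(steps2 + " vs " + steps3);
}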

O(nlogn) linear logarithmic order

If you understood the logarithmic order above, the linear logarithmic order is very easy: just wrap one more loop around a logarithmic-order algorithm and you get linear logarithmic order:

for (int j=1; j<=n; j++){
    int i=1;    // note: i must be reset on each outer iteration,
                // otherwise the inner loop does its work only once overall
    while (i <= n) {
        i = i * 2;
    }
}

Having analyzed the most commonly used time complexities, we can actually extract the following rules:

  • As long as a piece of code runs a constant number of times, no matter how large the constant, its efficiency is the same (see the constant-order example).
  • To analyze the time complexity of a piece of code, you only need to analyze the part that executes the most times (in the linear-order example, that is why we only counted the executions of the loop body).
  • The complexity of nested code equals the product of the complexities of the outer and inner code (see the linear logarithmic example).

Other complexities

In addition to the commonly used complexities above, there are also exponential order, factorial order, square-root order, and so on. These are relatively rare, so we will not analyze them here; if you are interested, you can look into them yourself.

Combinatorial complexity analysis

Everything we analyzed before had only one piece of code dominating the complexity. What if several pieces of code in one algorithm each contribute their own complexity? How do we analyze that?

Take the maximum complexity as the complexity of the whole algorithm

Let’s start with this example:

public static void print1(int n){
    for (int i=0; i<1000; i++){
        System.out.println(i);
    }
    for (int j=0; j<n; j++){
        System.out.println(j);
    }
    for (int p=0; p<n; p++){
        for (int q=0; q<n; q++){
            System.out.println(p+q);
        }
    }
}

In this example there are three loops. The first runs a constant number of times, and by the earlier conclusion a constant is still a constant no matter how large, so the first loop is O(1); the second and third loops are O(n) and O(n²) respectively.

That is, this method contains three pieces of code producing three different complexities, and their size relationship is clearly O(1) < O(n) < O(n²). In this case we take the largest as the complexity of the whole algorithm, so the complexity of this example is O(n²).

Take the sum of the complexity as the complexity of the whole algorithm

Let’s look at another example:

public static void print2(int m,int n){
    for (int i=0; i<1000; i++){
        System.out.println(i);
    }
    for (int j=0; j<m; j++){
        System.out.println(j);
    }
    for (int k=0; k<n; k++){
        System.out.println(k);
    }
}

In this example we can likewise analyze the three loops separately and get the complexities O(1), O(m), and O(n). This time we only know that the smallest, O(1), can be ignored; the relative size of the other two cannot be determined. So we must take the sum of the complexities of those two loops as the complexity of the algorithm, which gives O(m+n).

Types of time complexity

The time complexities analyzed above are relatively simple, and real algorithms may be more complicated than these examples. Moreover, the loops in our samples all run unconditionally from start to finish, whereas in practice a loop may terminate midway. Taking the actual situation into account, time complexity can be analyzed further from the following four angles:

  • Best time complexity
  • Worst time complexity
  • Average time complexity
  • Amortized time complexity

Of these four types, only the first three are introduced here, because the fourth is more complicated and has very limited usage scenarios. You can treat the analysis of all four as optional background, and readers who are not interested can skip this small part, because in most cases we only need to analyze the worst-case complexity, i.e. the time complexity under the assumption that every loop runs to completion.

Best time complexity

Let’s use an example to understand the best time complexity:

public static int findEle(int[] arr,int val){
    if (null == arr || arr.length == 0){
        return -1;
    }
    for (int i=0; i<arr.length; i++){
        if (arr[i] == val){
            return i;
        }
    }
    return -1;
}

This method finds the index of a specified element in a given array and returns -1 if the element is not found.

Notice the loop body of this method: if the element is found, it returns immediately. As a result, the number of times the loop body runs is uncertain; it may be 1, or it may be n (assuming n is the length of the array). So if the element we are looking for happens to sit in the first position of the array, it is found in a single iteration, and the complexity of the algorithm is O(1). That is the best-case time complexity.

Worst time complexity

If you understood the best case, the worst case is easy to grasp: either the element is not in the array at all, or the last value is the one we are looking for. Either way we have to loop through the entire array, so the time complexity is O(n). That is the worst-case time complexity.
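For instance (an illustrative call to the findEle method above, with made-up values):

int[] arr = {7, 3, 5};
findEle(arr, 7);  // target sits at index 0: one iteration, the best case O(1)
findEle(arr, 9);  // target is absent: the whole array is scanned, the worst case O(n)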

Average time complexity

After all, the best-case and worst-case complexities occur only in special situations whose probability is relatively small, so it is natural to think that we also need an average time complexity.

Let's do a simple analysis. For convenience, assume the probability that the element is in the array and the probability that it is not are each 1/2; and if it is in the array, assume it is equally likely to be at each position, so the probability of it being at any particular position is 1/n.

The average time we end up with should then be the weighted sum of the two cases: the element is in the array, and the element is not.

  • The element is in the array

Because the probability of the element being in the array is 1/2, and the probability of it being at each position is 1/n, the combined probability of it being at any given position is 1/(2n). If the element is at the first position, the cost is 1·(1/(2n)); if it is at the second position, the cost is 2·(1/(2n)); and so on. The expected cost in this scenario is therefore: 1·(1/(2n)) + 2·(1/(2n)) + … + n·(1/(2n)) = (n+1)/4.

  • The element is not in the array

We already assumed that the probability of the element not being in the array is 1/2, and since the element is not there, the algorithm must run the whole loop, i.e. n iterations. The expected cost in this scenario is therefore n·(1/2).

Finally, adding the two cases gives the average cost: (n+1)/4 + n/2 = (3n+1)/4. Ignoring the constant coefficients, the average time complexity is O(n).
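If the algebra feels abstract, a small numerical check like the following (illustrative only, not part of the original analysis; the method name averageCost is made up) reproduces the same weighted sum:

public static void averageCost(int n){
    double total = 0;
    // the element at position i costs i+1 iterations, each with probability 1/(2n)
    for (int i=0; i<n; i++){
        total += (i + 1) / (2.0 * n);
    }
    // the element is absent: n iterations with probability 1/2
    total += n / 2.0;
    System.out.println(total);                // ≈ (3n+1)/4
    System.out.println((3.0 * n + 1) / 4);    // the closed form, for comparison
}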

Amortized time complexity

Computing amortized time complexity requires the amortized analysis method, which is relatively complicated and has limited usage scenarios, so this article will not go into it.

Space complexity

The full name of space complexity is asymptotic space complexity, which expresses the relationship between the storage an algorithm needs and the size of its input as the input grows. Like time complexity, space complexity is expressed in big O notation.

In fact, once we can analyze time complexity, analyzing space complexity is easy. It mainly comes down to whether the algorithm uses extra space to store data, and whether the size of that extra space changes as n changes; from this we obtain the space complexity.

Let's look at an example that fills an array. Treating it as an algorithm, we can analyze its space complexity:

public static void init(int n){
    int a = 0;
    int[] arr = new int[n];
    for (int i=0; i<n; i++){
        arr[i]=n;
    }
}

The method first defines a variable, which needs space, but it is constant-level (it does not change with n). It then defines an array of length n, which also occupies space, and the array's space does change as n changes. The rest of the code takes no extra space, so we can consider the space complexity of the example above to be O(n).

The space complexity of an algorithm can also be summarized with the following rules:

  • If you only allocate a finite (constant) number of variables, the space complexity is O(1).
  • If you allocate a one-dimensional array, queue, or linked list of size n, the space complexity is O(n).
  • If you allocate a two-dimensional n×n array, the space complexity is O(n²) (a sketch follows this list).
  • If arrays are allocated inside a loop body, you may need to take the nested product as the space complexity, which calls for case-by-case analysis.
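As a minimal sketch of the two-dimensional case above (the method name init2 is just for illustration, it is not from the original text): allocating an n×n matrix needs storage that grows with n², hence O(n²) space.

public static void init2(int n){
    // an n x n matrix allocates n * n ints,
    // so the extra space grows quadratically with n
    int[][] matrix = new int[n][n];
    for (int i=0; i<n; i++){
        for (int j=0; j<n; j++){
            matrix[i][j] = i + j;
        }
    }
}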

Conclusion

This article mainly discussed why we should learn algorithms, briefly described the relationship between data structures and algorithms, and then introduced the entry-level knowledge of the subject: time complexity and space complexity. To learn algorithms, we must first be able to analyze the time and space complexity of an algorithm; only when we can analyze these two measures of performance ourselves can we write algorithms that perform well. We also covered the best-case, worst-case, average, and amortized time complexities, but you can treat the calculation methods for these four as background knowledge for now; it will not be too late to review them when you actually need them.