1. The mathematical basis of complexity

The analysis required to estimate algorithm resource consumption is generally a theoretical problem and therefore requires a formal framework. Let's start with some mathematical definitions.

Definition 1: T(N) = O(f(N)) if there are positive constants c and n₀ such that T(N) ≤ cf(N) when N ≥ n₀.

Definition 2: T(N) = Ω(g(N)) if there are positive constants c and n₀ such that T(N) ≥ cg(N) when N ≥ n₀.

Definition 3: T(N) = Θ(h(N)) if and only if T(N) = O(h(N)) and T(N) = Ω(h(N)).

Definition 4: T(N) = o(p(N)) if T(N) = O(p(N)) and T(N) ≠ Θ(p(N)).

The main purpose of these definitions is to establish a relative order between functions. Given two functions, there are usually points at which one function is smaller than the other, so a statement such as f(N) < g(N) does not say much. Instead, we compare their relative rates of growth; that is the important measure.

Even though 1000N is bigger than N² for small N, N² grows faster, so N² will eventually be the larger function. In this case, N = 1000 is the turning point.

The first definition says that eventually there is some point n₀ past which cf(N) is always at least as large as T(N), so that if constant factors are ignored, f(N) is at least as big as T(N). In our example, T(N) = 1000N and f(N) = N²; we can take n₀ = 1000 and c = 1, or just as well n₀ = 10 and c = 100. Therefore, we can say that 1000N = O(N²) (order N squared). This notation is known as Big-O notation.

If we translate these definitions into traditional inequalities between growth rates, the first definition says that the growth rate of T(N) is less than or equal to the growth rate of f(N). The second definition says that the growth rate of T(N) is greater than or equal to the growth rate of g(N). The third definition says that the growth rate of T(N) is equal to the growth rate of h(N). The last definition says that the growth rate of T(N) is strictly less than the growth rate of p(N).

With the above definitions in mind, we should grasp the following conclusions:

Rule 1: If T₁(N) = O(f(N)) and T₂(N) = O(g(N)), then

  1. T₁(N) + T₂(N) = max(O(f(N)), O(g(N))),
  2. T₁(N) * T₂(N) = O(f(N) * g(N)).

Rule 2: If T(N) is a polynomial of degree k, then T(N) = Θ(N^k). Rule 3: logᵏN = O(N) for any constant k. This tells us that logarithms grow very slowly.
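As a quick check of Rule 2 (a worked example added for illustration, not part of the original text), take the degree-3 polynomial $T(N) = 3N^3 + 5N^2 + 7$:

$$3N^3 + 5N^2 + 7 \le 3N^3 + 5N^3 + 7N^3 = 15N^3 \quad (N \ge 1) \;\Rightarrow\; T(N) = O(N^3)$$
$$3N^3 + 5N^2 + 7 \ge 3N^3 \quad (N \ge 1) \;\Rightarrow\; T(N) = \Omega(N^3)$$

so by Definition 3, $T(N) = \Theta(N^3)$.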

Note that:

  1. It is very bad style to include constants or low-order terms inside a Big-O. Don't write T(N) = O(2N²) or T(N) = O(N² + N). In both cases, the correct form is T(N) = O(N²).
  2. We can always determine the relative growth rates of two functions f(N) and g(N) by computing the limit $\lim_{N\to\infty} f(N)/g(N)$, using L'Hôpital's rule if necessary (a worked example follows this list). This limit can have four possible outcomes:
  • 1. The limit is 0: this means that f(N) = o(g(N)).
  • 2. The limit is c ≠ 0: this means that f(N) = Θ(g(N)).
  • 3. The limit is infinity: this means that g(N) = o(f(N)).
  • 4. The limit oscillates: there is no relation between the two.
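For instance (a worked example added for illustration, taking log to base 2; the base only changes a constant factor), comparing $f(N) = N\log N$ with $g(N) = N^2$:

$$\lim_{N\to\infty}\frac{N\log N}{N^2} = \lim_{N\to\infty}\frac{\log N}{N} = \lim_{N\to\infty}\frac{1/(N\ln 2)}{1} = 0$$

so the first case applies and $N\log N = o(N^2)$: N log N grows strictly more slowly than N².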

2. Calculation of running time

The following is a simple program fragment that computes $\sum_{i=1}^{N} i^3$:

int
Sum( int N )
{
    int i, PartialSum;

    PartialSum = 0;                  /* 1 */
    for( i = 1; i <= N; i++ )        /* 2 */
        PartialSum += i * i * i;     /* 3 */
    return PartialSum;               /* 4 */
}

Let's analyze the above program. The declarations take no time. Lines 1 and 4 each count for one time unit. Line 3 counts for four time units per execution (two multiplications, one addition, and one assignment), and it is executed N times for a total of 4N time units. Line 2 has hidden costs in initializing i, testing i ≤ N, and incrementing i. The total cost of line 2 is 1 unit for the initialization, N + 1 units for all the tests, and N units for all the increments, for a total of 2N + 2. Ignoring the cost of calling the function and returning, we get a total of 6N + 4. Therefore, we say that this function is O(N).

If we had to demonstrate all this work every time we analyzed a program, the task would quickly become infeasible. Fortunately, since we are giving the answer in terms of Big-O, there are many shortcuts that can be taken without affecting the final result. For example, line 3 (each time it is executed) is clearly an O(1) statement, so it is silly to count precisely whether it takes two, three, or four time units. Line 1 is obviously insignificant compared with the for loop, so it is unwise to spend time there. This leads us to some general rules.

In other words, when we analyze complexity, we consider only the highest-order (dominant) operations.

2.1 For loops

The running time of a for loop is at most the running time of the statements inside the for loop (including the tests) multiplied by the number of iterations.
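For example (a small added sketch; the variable names are illustrative), the loop below executes an O(1) statement exactly N times, so the fragment is O(N):

int i, sum = 0;
for( i = 0; i < N; i++ )   /* N iterations */
    sum += i;              /* O(1) work per iteration, so O(N) total */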

2.2 Nested for loops

Analyze these loops from the inside out. The total running time of a statement inside a group of nested loops is the running time of the statement multiplied by the product of the sizes of all the loops. The following program fragment is O(N²):

for(i = 0; i < n; i++)
    for (j = 0; j< n; j++)
        k++;

2.3 Sequential Statements

You can sum the running times of the consecutive statements (which means that the maximum one determines the resulting running time). The following program fragment first costs O(N) and then O(N²), so the total cost is also O(N²):

for(i = 0; i < n; i++)
    A[i] = 0;
for(i = 0; i < n; i++)
    for (j = 0; j< n; j++)
        A[i] += A[j] + i + j;

2.4 If/else statements

For program fragments

if (Condition) 
    S1
else
    S2

the running time of an if/else statement is never more than the running time of the test plus the larger of the running times of S1 and S2. Clearly this can be an overestimate in some cases, but it is never an underestimate.
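As a small illustration (an added sketch with made-up variable names), the test below is O(1), the first branch is O(N), and the second is O(1), so the whole statement is bounded by O(N):

if( flag )                      /* O(1) test */
    for( i = 0; i < N; i++ )    /* S1: O(N) */
        total += A[i];
else
    total = 0;                  /* S2: O(1) */
/* Bound: O(1) + max( O(N), O(1) ) = O(N) */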

2.5 Recursion

If the recursion is really just a thinly veiled for loop, the analysis is usually simple. For example, the following function is really just a simple loop and runs in O(N):

long int
Factorial ( int N )
{
    if( N <=1 )
        return 1; 
    else
        return N * Factorial(N - 1);
}

The running time of the following program, however, grows exponentially; this is about as bad as it gets. The reason the program is so slow is that there is a huge amount of redundant work: the call Fib(N - 1) actually computes Fib(N - 2) at some point, and that information is then thrown away and recomputed by the second recursive call. The thrown-away work compounds recursively and results in a huge running time.

long int
Fib ( int N )
{
    if( N <=1 )
        return 1; 
    else
        return Fib(N - 1) + Fib(N - 2);
}
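By contrast (an added sketch, not part of the original text), replacing the recursion with iteration and keeping the last two values in local variables avoids the redundant recomputation and brings the running time down to O(N); FibIter is a hypothetical name:

long int
FibIter( int N )
{
    long int prev = 1, curr = 1, next;
    int i;

    if( N <= 1 )
        return 1;
    for( i = 2; i <= N; i++ )   /* N - 1 iterations, O(1) work each */
    {
        next = prev + curr;
        prev = curr;
        curr = next;
    }
    return curr;
}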

2.6 Logarithms in running time

An algorithm is O(log N) if it takes constant time, O(1), to cut the problem size by a fraction (usually one half). On the other hand, if constant time is only enough to reduce the problem size by a constant amount (such as making the problem smaller by 1), then the algorithm is O(N). Obviously, only some special kinds of problems can be O(log N): for example, if N numbers are read as input, an algorithm must take Ω(N) time just to read them in. So, when we talk about O(log N) algorithms for such problems, we usually presume that the input data has already been read in beforehand.
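As a tiny illustration of the halving idea (an added sketch, not from the original text), the loop below cuts n in half on every iteration, so it executes about log N times and is O(log N):

int count = 0;
while( n > 1 )      /* each pass does O(1) work */
{
    n /= 2;         /* problem size is halved */
    count++;        /* count ends up roughly log2(N) */
}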

Consider the following search problem:

Given an integer X and integers A₀, A₁, ..., A_{N−1}, which are already sorted and in memory, find the index i such that A_i = X, or return i = -1 if X is not in the input.

The obvious solution is to scan the data from left to right, which takes linear time to run.

A better strategy is to check whether X is the middle element. If so, the answer has been found. If X is smaller than the middle element, we can apply the same strategy to the sorted subarray to the left of the middle element; likewise, if X is larger than the middle element, we look at the right half of the data.

int
BinarySearch( const ElementType A[ ], ElementType X, int N )
{
    int Low, Mid, High;

    Low = 0; High = N - 1;
    while( Low <= High )
    {
        Mid = ( Low + High ) / 2;
        if( A[ Mid ] < X )
            Low = Mid + 1;
        else if( A[ Mid ] > X )
            High = Mid - 1;
        else
            return Mid;      /* Found */
    }
    return NotFound;         /* NotFound is defined as -1 */
}

All the work inside the loop costs O(1) per iteration, so the analysis requires determining the number of times around the loop. The loop starts with High - Low = N - 1 and finishes with High - Low ≥ -1. Every time through the loop, the value of High - Low is at least halved from its previous value, so the number of iterations is at most ⌈log(N - 1)⌉ + 2. For example, if High - Low = 128, then the maximum values of High - Low after each iteration are 64, 32, 16, 8, 4, 2, 1, 0, -1. Thus, the running time is O(log N).

Recommended article: How to Understand time complexity

3. Common time complexity comparisons

The growth rates of the common complexities are compared graphically below.

[Figure: growth-rate comparison of common time complexities]

4. Time complexity of common data structures

O(1): constant complexity
O(log N): logarithmic complexity
O(N): linear complexity
O(N²): quadratic complexity (N squared)
O(N³): cubic complexity (N cubed)
O(2^N): exponential complexity
O(N!): factorial complexity

4.1 Common data structures

4.2 Graphs

4.3 Sorting Algorithm

4.4 Search Algorithm

Common algorithm complexity analysis: www.bigocheatsheet.com/

5. Checking the analysis

One way to check an analysis is to code up the program and compare the actual observed running time with the running time predicted by the analysis. When N doubles, the running time of a linear program is multiplied by a factor of 2, the running time of a quadratic program is multiplied by a factor of 4, and the running time of a cubic program is multiplied by a factor of 8. A program running in logarithmic time only adds a constant to its running time when N doubles, while a program running in O(N log N) takes slightly more than twice as long under the same circumstances.

Another common technique for verifying that a program is O(f(N)) is to compute the ratio T(N)/f(N) for a range of N (usually spaced by factors of 2), where T(N) is the empirically observed running time. If f(N) is a tight approximation of the running time, the computed values converge to a positive constant. If f(N) is an overestimate, the values converge to zero. If f(N) is an underestimate (and hence the program is not O(f(N))), the values diverge.
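A minimal sketch of this check follows (added for illustration; the deliberately quadratic Work function and the choice of sizes are assumptions, not from the original text). If Work is O(N²), the printed ratios T(N)/N² should settle near a positive constant:

#include <stdio.h>
#include <time.h>

/* A deliberately quadratic routine to measure. */
static long Work( int N )
{
    long sum = 0;
    int i, j;
    for( i = 0; i < N; i++ )
        for( j = 0; j < N; j++ )
            sum += i + j;
    return sum;
}

int main( void )
{
    int N;
    for( N = 1000; N <= 16000; N *= 2 )   /* sizes spaced by factors of 2 */
    {
        clock_t start = clock( );
        volatile long r = Work( N );       /* volatile: keep the call from being optimized away */
        double t = ( double )( clock( ) - start ) / CLOCKS_PER_SEC;
        printf( "N = %6d  T(N) = %.4fs  T(N)/N^2 = %.3e\n",
                N, t, t / ( ( double )N * N ) );
        ( void )r;
    }
    return 0;
}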