What exactly is the Big O
- n denotes the data size.
- O(f(n)) means the number of instructions the algorithm executes is proportional to f(n), where f is a function of n.
- Constant coefficients (a, b, c, d, ...) don't matter; what matters is the order of growth.
Algorithm A: O(n), instructions required: 10000n. Algorithm B: O(n^2), instructions required: 10n^2.
- As n grows, the instruction counts of algorithms A and B diverge.
- Once n exceeds a critical point, B's count exceeds A's and keeps pulling away — a difference in order of magnitude.
- A higher-complexity algorithm can still have the advantage when the data size is small.
- This is why advanced sorting algorithms switch to insertion sort once the subproblem is small enough — a detail optimization worth roughly 10%–15%.
Different time complexities grow at very different rates as the data size increases.
convention
- In academia, strictly speaking, O(f(n)) denotes an upper bound on the algorithm's running time.
- By that definition, merge sort is O(nlogn), and it is also O(n^2), since
- c*nlogn < a*n^2 for large enough n.
- In industry, we use O to mean the tightest (lowest) upper bound on the running time.
- So we don't usually say merge sort is O(n^2).
example
- The dominant term wins:
O( nlogn + n ) = O( nlogn )
O( nlogn + n^2 ) = O( n^2 )
- These simplifications require both terms to depend on the same n. With two independent sizes, as in O(AlogA + B) or O(AlogA + B^2),
- the lower-order term cannot be dropped.
- Traversing a graph implemented with adjacency lists:
- Time complexity: O(V + E), where V is the number of vertices and E the number of edges.
- For a dense graph — in the extreme, a complete graph — E approaches V^2, so neither term can be dropped.
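A minimal sketch of why the traversal is O(V + E), assuming the graph is stored as a vector of adjacency lists (the representation and function name here are illustrative, not from the original):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
using namespace std;

// Traversing every adjacency list touches each vertex once (the outer
// loop, O(V)) and each edge entry once (all inner iterations combined,
// O(E)), hence O(V + E) in total.
size_t countEdgeEntries(const vector<vector<int>>& adj){
    size_t entries = 0;
    for(size_t v = 0; v < adj.size(); v++)        // one pass over V vertices
        for(size_t k = 0; k < adj[v].size(); k++) // each edge entry visited once
            entries++;
    return entries;
}
```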
A time complexity problem
You have an array of strings. First sort the characters of each string alphabetically; then sort the whole array lexicographically. What is the time complexity of the entire operation?
- Common wrong answer: sorting each string costs n*nlogn; sorting the whole array costs nlogn.
- The error: the string length is confused with the array length.
Let s be the length of the longest string and n the number of strings in the array. Sorting one string: O(slogs). Sorting every string in the array alphabetically: O(n*slogs).
Sorting the entire string array lexicographically: O(s*nlogn).
- Explanation — what the nlogn of a sorting algorithm really counts:
- nlogn is the number of comparisons. Sorting an integer array takes nlogn comparisons, and comparing two integers is O(1).
- Comparing two strings, however, is O(s).
O(n*slogs) + O(s*nlogn) = O( nslogs + snlogn ) = O( ns(logs + logn) )
- Lexicographic sorting of the array makes nlogn comparisons, and each comparison takes O(s) time.
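The two phases can be sketched with the standard library sort (a sketch, not the original's code; `sortStringArray` is a hypothetical helper name):

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>
using namespace std;

// Phase 1: sort the characters of each string, O(s log s) per string,
//          O(n*s*log s) for all n strings.
// Phase 2: sort the array lexicographically: n log n comparisons at
//          O(s) each, i.e. O(s*n*log n).
// Total: O(n*s*(log s + log n)).
void sortStringArray(vector<string>& arr){
    for(string& str : arr)
        sort(str.begin(), str.end());
    sort(arr.begin(), arr.end());
}
```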
Algorithm complexity is, in some cases, input-dependent
Insertion sort: O(n^2)
- Worst case: O(n^2)
- Best case: O(n), on nearly sorted input
- Average case: O(n^2)
Quicksort: O(nlogn)
- Worst case: O(n^2), with a non-randomized pivot on already sorted input
- Best case: O(nlogn), with a randomized pivot
- Average case: O(nlogn)
A rigorous analysis distinguishes best, worst, and average cases. In practice we usually focus on the common case, but you should know the extremes.
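A minimal insertion-sort sketch illustrating why the best case is O(n): on nearly sorted input the inner loop exits almost immediately (a hypothetical implementation, not code from the original):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
using namespace std;

// Insertion sort: O(n^2) worst/average case. When the input is nearly
// sorted, the inner while loop barely runs, giving the O(n) best case.
void insertionSort(vector<int>& arr){
    for(size_t i = 1; i < arr.size(); i++){
        int e = arr[i];
        size_t j = i;
        while(j > 0 && arr[j-1] > e){  // shift larger elements right
            arr[j] = arr[j-1];
            j--;
        }
        arr[j] = e;
    }
}
```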
The concept of data size
Run a selection sort on 10^5 elements — why does the computer seem to freeze?
#include <iostream>
#include <cmath>
#include <ctime>
using namespace std;
int main() {
// The data size is increased 10 times each time to test
// If you are interested, you can also try to increase the data size by 2 times each time
for( int x = 1 ; x <= 9 ; x ++ ){
int n = pow(10, x);
clock_t startTime = clock();
long long sum = 0;
for( int i = 0 ; i < n ; i ++ )
sum += i;
clock_t endTime = clock();
cout << "sum = " << sum << endl;
cout << "10 ^" << x << ":"
<< double(endTime - startTime)/CLOCKS_PER_SEC
<< " s" << endl << endl;
}
return 0;
}
The results
sum = 45
10^1 : 2e-06 s

sum = 4950
10^2 : 1e-06 s

sum = 499500
10^3 : 4e-06 s

sum = 49995000
10^4 : 2.9e-05 s

sum = 4999950000
10^5 : 0.000305 s

sum = 499999500000
10^6 : 0.003049 s

sum = 49999995000000
10^7 : 0.029234 s

sum = 4999999950000000
10^8 : 0.308056 s

sum = 499999999500000000
10^9 : 2.98528 s
If you want to solve the problem in 1s
- The O(n^2) algorithm can handle about 10^4 levels of data;
- The O(n) algorithm can handle data of about 10^8 levels;
- O(nlogn) algorithms can handle data at about 10^7 levels
Because the work here was trivial (a single addition per iteration), these estimates are optimistic; in practice, be conservative and divide them by about 10.
Spatial complexity
- Allocating an auxiliary array: O(n)
- Allocating an auxiliary two-dimensional array: O(n^2)
- Only constant extra space: O(1), e.g. sorting an array in place
- Recursive calls have a space cost:
- each call's stack frame is pushed onto the system stack, so the extra space is proportional to the recursion depth.
Common complexity analysis
O(1): the work does not depend on the data size
// O(1)
void swapTwoInts( int &a , int &b ){
int temp = a;
a = b;
b = temp;
return;
}
O(n): the loop runs c*n times, where c is a constant (not necessarily greater than 1)
// O(n) Time Complexity
int sum( int n ){
int ret = 0;
for( int i = 0 ; i <= n ; i ++ )
ret += i;
return ret;
}
O(n), even though the loop runs only n/2 times
String reversal ("abc" -> "cba"): swap the first and last characters, the second and second-to-last, and so on. We scan half the string and do n/2 swaps: O(n).
void reverse( string &s ){
int n = s.size();
for( int i = 0 ; i < n/2 ; i ++ )
swap( s[i] , s[n-1-i] );
return;
}
O(n^2): selection sort. A double loop — the outer index i runs up to n, the inner index j runs up to n, both incrementing by 1 — so the number of instructions executed is proportional to n^2.
When i = 0, the inner loop runs n-1 times; summing the arithmetic series:
(n-1) + (n-2) + (n-3) + ... + 0
= (0 + n-1) * n / 2
= (1/2)n(n-1)
= (1/2)n^2 - (1/2)n
= O(n^2)
// O(n^2) Time Complexity
void selectionSort(int arr[], int n){
for(int i = 0 ; i < n ; i ++){
int minIndex = i;
for( int j = i + 1 ; j < n ; j ++ )
if( arr[j] < arr[minIndex] )
minIndex = j;
swap( arr[i] , arr[minIndex] );
}
}
30n basic operations: O(n), because the inner loop is a fixed 30 iterations, independent of n.
// O(n) Time Complexity
void printInformation(int n){
for( int i = 1 ; i <= n ; i ++ )
for( int j = 1 ; j <= 30 ; j ++ )
cout<<"Class "<<i<<" - "<<"No. "<<j<<endl;
return;
}
O(logn): look at the middle element of a sorted array and compare it with the target; if it doesn't match, discard half of the array and repeat.
Binary search method
// O(logn) Time Complexity
int binarySearch(int arr[], int n, int target){
int l = 0, r = n-1;
while( l <= r ){
int mid = l + (r-l)/2;
if( arr[mid] == target ) return mid;
if( arr[mid] > target ) r = mid - 1;
else l = mid + 1;
}
return -1;
}
O(logn)
How many times can you divide n by 10 before it reaches 0? log10(n) times, and log10(n) = O(logn), since we keep dividing by 10 until we hit 0. The reverse(s) call costs len/2 swaps, where len is the number of digits in the string — consistent with the O(logn) bound.
string intToString( int num ){
string s = "";
string sign = "+";
if( num < 0 ){
num = -num;
sign = "-";
}
while( num ){
s += '0' + num%10;
num /= 10;
}
if( s == "" )
s = "0";
reverse(s);
if( sign == "-" )
return sign + s;
else
return s;
}
O(nlogn)
The inner loop runs n times; the outer loop doubles sz each iteration (sz += sz), so it runs log2(n) times.
// O(nlogn)
void hello(int n){
    for( int sz = 1 ; sz < n ; sz += sz )
        for( int i = 1 ; i < n ; i ++ )
            cout<<"Hello, Algorithm!"<<endl;
}
O(sqrt(n)): x runs from 2 up to sqrt(n).
// O(sqrt(n)) Time Complexity
bool isPrime( int num ){
for( int x = 2 ; x*x <= num ; x ++ )
if( num%x == 0 )
return false;
return true;
}
bool isPrime2( int num ){
if( num <= 1 ) return false;
if( num == 2 ) return true;
if( num%2 == 0 ) return false;
for( int x = 3 ; x*x <= num ; x += 2 )
if( num%x == 0 )
return false;
return true;
}
Complexity experiments
We thought we wrote an O(nlogn) algorithm — what if it is actually O(n^2)? Two ways to check:
1. Estimate from the data size. If you want to solve the problem within 1s:
- The O(n^2) algorithm can handle data of about 10^4 levels;
- The O(n) algorithm can handle data of about 10^8 levels;
- The O(nlogn) algorithm can handle data of about 10^7 levels.
The constant factors in front can differ a lot, so treat these as rough guides.
2. Experiment and observe the trend:
double the data size each time and watch how the running time changes.
Below are four algorithms of different complexity:
namespace MyAlgorithmTester{

    // O(logN)
    int binarySearch(int arr[], int n, int target){
        int l = 0, r = n-1;
        while( l <= r ){
            int mid = l + (r-l)/2;
            if( arr[mid] == target ) return mid;
            if( arr[mid] > target ) r = mid - 1;
            else l = mid + 1;
        }
        return -1;
    }

    // O(N)
    int findMax( int arr[], int n ){
        assert( n > 0 );
        int res = arr[0];
        for( int i = 1 ; i < n ; i ++ )
            if( arr[i] > res ) res = arr[i];
        return res;
    }

    // O(NlogN), bottom-up merge sort
    void __merge(int arr[], int l, int mid, int r, int aux[]){
        for( int i = l ; i <= r ; i ++ )
            aux[i] = arr[i];
        int i = l, j = mid+1;
        for( int k = l ; k <= r ; k ++ ){
            if( i > mid ) { arr[k] = aux[j]; j ++; }
            else if( j > r ){ arr[k] = aux[i]; i ++; }
            else if( aux[i] < aux[j] ){ arr[k] = aux[i]; i ++; }
            else { arr[k] = aux[j]; j ++; }
        }
    }

    void mergeSort( int arr[], int n ){
        int *aux = new int[n];
        for( int i = 0 ; i < n ; i ++ )
            aux[i] = arr[i];
        for( int sz = 1 ; sz < n ; sz += sz )
            for( int i = 0 ; i < n ; i += sz+sz )
                __merge(arr, i, i+sz-1, min(i+sz+sz-1, n-1), aux);
        delete[] aux;
        return;
    }

    // O(N^2) selection sort
    void selectionSort(int arr[], int n){
        for( int i = 0 ; i < n ; i ++ ){
            int minIndex = i;
            for( int j = i + 1 ; j < n ; j ++ )
                if( arr[j] < arr[minIndex] )
                    minIndex = j;
            swap( arr[i] , arr[minIndex] );
        }
        return;
    }
}
Code to generate the test case:
namespace MyUtil {

    int *generateRandomArray(int n, int rangeL, int rangeR) {
        assert( n > 0 && rangeL <= rangeR );
        int *arr = new int[n];
        srand(time(NULL));
        for (int i = 0; i < n; i++)
            arr[i] = rand() % (rangeR - rangeL + 1) + rangeL;
        return arr;
    }

    int *generateOrderedArray(int n) {
        assert( n > 0 );
        int *arr = new int[n];
        for (int i = 0; i < n; i++)
            arr[i] = i;
        return arr;
    }
}
Testing the O(N) algorithm:
int main() {
    // Test findMax, O(n)
    cout<<"Test for findMax:"<<endl;
    for( int i = 10 ; i <= 26 ; i ++ ){
        int n = pow(2,i);
        int *arr = MyUtil::generateRandomArray(n, 0, 100000000);
        clock_t startTime = clock();
        MyAlgorithmTester::findMax(arr, n);
        clock_t endTime = clock();
        cout<<"data size 2^"<<i<<" = "<<n<<"\t";
        cout<<"Time cost: "<<double(endTime - startTime)/CLOCKS_PER_SEC<<endl;
        delete[] arr;
    }
    return 0;
}
Running results:
Test for findMax:
data size 2^10 = 1024        Time cost: 5e-06 s
data size 2^11 = 2048        Time cost: 7e-06 s
data size 2^12 = 4096        Time cost: 1.2e-05 s
data size 2^13 = 8192        Time cost: 2.5e-05 s
data size 2^14 = 16384       Time cost: 2.7e-05 s
data size 2^15 = 32768       Time cost: 9.2e-05 s
data size 2^16 = 65536       Time cost: 0.000169 s
data size 2^17 = 131072      Time cost: 0.000431 s
data size 2^18 = 262144      Time cost: 0.003037 s
data size 2^19 = 524288      Time cost: 0.001325 s
data size 2^20 = 1048576     Time cost: 0.002489 s
data size 2^21 = 2097152     Time cost: 0.005739 s
data size 2^22 = 4194304     Time cost: 0.011373 s
data size 2^23 = 8388608     Time cost: 0.019566 s
data size 2^24 = 16777216    Time cost: 0.040289 s
data size 2^25 = 33554432    Time cost: 0.095169 s
data size 2^26 = 67108864    Time cost: 0.201682 s
data size 2^27 = 134217728   Time cost: 0.330673 s
data size 2^28 = 268435456   Time cost: 0.750136 s
Each time n doubles, the time roughly doubles as well, so the algorithm is O(n).
Testing O(n^2):
int main() {
    // Test selectionSort, O(n^2)
    cout<<"Test for selectionSort:"<<endl;
    for( int i = 10 ; i <= 15 ; i ++ ){
        int n = pow(2,i);
        int *arr = MyUtil::generateRandomArray(n, 0, 100000000);
        clock_t startTime = clock();
        MyAlgorithmTester::selectionSort(arr,n);
        clock_t endTime = clock();
        cout<<"data size 2^"<<i<<" = "<<n<<"\t";
        cout<<"Time cost: "<<double(endTime - startTime)/CLOCKS_PER_SEC<<endl;
        delete[] arr;
    }
    return 0;
}
Run result: the time grows roughly 4x at each step.
Test for Selection Sort:
data size 2^10 = 1024     Time cost: 0.001581 s
data size 2^11 = 2048     Time cost: 0.006221 s
data size 2^12 = 4096     Time cost: 0.02139 s
data size 2^13 = 8192     Time cost: 0.081103 s
data size 2^14 = 16384    Time cost: 0.323263 s
data size 2^15 = 32768    Time cost: 1.32474 s
data size 2^16 = 65536    Time cost: 5.19642 s
When the data size n doubles, the time roughly quadruples — consistent with O(n^2).
Testing O(logN):
int main() {
    // Test binarySearch, O(logn)
    cout<<"Test for binarySearch:"<<endl;
    for( int i = 10 ; i <= 28 ; i ++ ){
        int n = pow(2,i);
        int *arr = MyUtil::generateOrderedArray(n);
        clock_t startTime = clock();
        MyAlgorithmTester::binarySearch(arr,n,0);
        clock_t endTime = clock();
        cout<<"data size 2^"<<i<<" = "<<n<<"\t";
        cout<<"Time cost: "<<double(endTime - startTime)/CLOCKS_PER_SEC<<endl;
        delete[] arr;
    }
    return 0;
}
Complexity check:
log(2N) / logN
= (log2 + logN) / logN
= 1 + log2/logN
When the data size doubles, the running time grows only by the vanishing additive term log2/logN — barely measurable.
Running results:
Test for Binary Search:
data size 2^10 = 1024 Time cost: 1e-06 s
data size 2^11 = 2048 Time cost: 0 s
data size 2^12 = 4096 Time cost: 0 s
data size 2^13 = 8192 Time cost: 2e-06 s
data size 2^14 = 16384 Time cost: 1e-06 s
data size 2^15 = 32768 Time cost: 1e-06 s
data size 2^16 = 65536 Time cost: 1e-06 s
data size 2^17 = 131072 Time cost: 2e-06 s
data size 2^18 = 262144 Time cost: 3e-06 s
data size 2^19 = 524288 Time cost: 1e-06 s
data size 2^20 = 1048576 Time cost: 4e-06 s
data size 2^21 = 2097152 Time cost: 3e-06 s
data size 2^22 = 4194304 Time cost: 3e-06 s
data size 2^23 = 8388608 Time cost: 4e-06 s
data size 2^24 = 16777216 Time cost: 4e-06 s
data size 2^25 = 33554432 Time cost: 1.2e-05 s
data size 2^26 = 67108864 Time cost: 9e-06 s
data size 2^27 = 134217728 Time cost: 1.1e-05 s
data size 2^28 = 268435456 Time cost: 2.4e-05 s
The results barely change as n grows.
Converting a sequential search into a binary search improves efficiency enormously.
Testing O(NlogN):
NlogN grows only slightly faster than O(N).
int main() {
    // Test mergeSort, O(nlogn)
    cout<<"Test for mergeSort:"<<endl;
    for( int i = 10 ; i <= 24 ; i ++ ){
        int n = pow(2,i);
        int *arr = MyUtil::generateRandomArray(n, 0, 1<<30);
        clock_t startTime = clock();
        MyAlgorithmTester::mergeSort(arr,n);
        clock_t endTime = clock();
        cout<<"data size 2^"<<i<<" = "<<n<<"\t";
        cout<<"Time cost: "<<double(endTime - startTime)/CLOCKS_PER_SEC<<endl;
        delete[] arr;
    }
    return 0;
}
Running results:
Test for Merge Sort:
data size 2^10 = 1024     Time cost: 0.000143 s
data size 2^11 = 2048     Time cost: 0.000325 s
data size 2^12 = 4096     Time cost: 0.000977 s
data size 2^13 = 8192     Time cost: 0.001918 s
data size 2^14 = 16384    Time cost: 0.003678 s
data size 2^15 = 32768    Time cost: 0.007635 s
data size 2^17 = 131072   Time cost: 0.034462 s
data size 2^18 = 262144   Time cost: 0.069586 s
data size 2^19 = 524288   Time cost: 0.136214 s
data size 2^20 = 1048576  Time cost: 0.294626 s
data size 2^21 = 2097152  Time cost: 0.619943 s
data size 2^22 = 4194304  Time cost: 1.37317 s
data size 2^23 = 8388608  Time cost: 2.73054 s
data size 2^24 = 16777216 Time cost: 5.60827 s
Each doubling of n slightly more than doubles the time — consistent with O(nlogn).
Complexity analysis of recursive algorithms
- A function that recurses is not necessarily O(nlogn); it depends on the structure of the calls.
Recursive implementation of binary search:
Each call recurses into either the left half or the right half — only one recursive call either way — and the range halves each time. The recursion depth is logn and the work per call is O(1), so the total is O(logn).
// binarySearch
int binarySearch(int arr[], int l, int r, int target){
if( l > r )
return -1;
int mid = l + (r-l)/2;
if( arr[mid] == target )
return mid;
else if( arr[mid] > target )
return binarySearch(arr, l, mid-1, target);
else
return binarySearch(arr, mid+1, r, target);
}
If a recursive function makes only one recursive call, with recursion depth depth and per-call time complexity T, then the total time complexity is O(T * depth).
Summation recursive implementation
Recursion depth: n Time complexity: O(n)
// sum
int sum( int n ){
assert( n >= 0 );
if( n == 0 )
return 0;
return n + sum(n-1);
}
Compute x to the nth power
// pow2
double pow( double x, int n ){
    assert( n >= 0 );
    if( n == 0 )
        return 1.0;
    double t = pow(x, n/2);
    if( n%2 )
        return x*t*t;
    return t*t;
}
Recursion depth: logn Time complexity: O(logn)
Recursion with multiple recursive calls
The recursion tree has depth n; level i has 2^i nodes, so the total node count is
2^0 + 2^1 + 2^2 + ... + 2^n = 2^(n+1) - 1 = O(2^n)
Exponential algorithms are very slow: n around 20 to 30 is already the practical limit unless pruning is applied. Typical contexts: dynamic programming, AI search trees.
// f
int f(int n){
assert( n >= 0 );
if( n == 0 )
return 1;
return f(n-1) + f(n-1);
}
Merge sort, n = 8
The recursion tree has depth logn (for n = 8, 3 levels of splitting). Each level processes smaller and smaller pieces, but the total data across any one level is still n — so each of the logn levels does O(n) work, giving O(nlogn) overall.
// mergeSort
void mergeSort(int arr[], int l, int r){
if( l >= r )
return;
int mid = (l+r)/2;
mergeSort(arr, l, mid);
mergeSort(arr, mid+1, r);
merge(arr, l, mid, r);
}
Amortized complexity analysis (Amortized Time Analysis)
One operation is relatively expensive, but it makes many subsequent operations cheap; spreading (amortizing) its cost over the whole sequence evens everything out.
Dynamic array (Vector)
template <typename T>
class MyVector{
private:
    T* data;
    int size;       // number of elements stored in the array
    int capacity;   // maximum number of elements the array can hold

    // O(n): a single loop copying all elements
    void resize(int newCapacity){
        assert( newCapacity >= size );
        T *newData = new T[newCapacity];
        for( int i = 0 ; i < size ; i ++ )
            newData[i] = data[i];
        delete[] data;
        data = newData;
        capacity = newCapacity;
    }

public:
    MyVector(){
        data = new T[100];
        size = 0;
        capacity = 100;
    }

    ~MyVector(){
        delete[] data;
    }

    // Amortized: O(1)
    void push_back(T e){
        if( size == capacity )
            resize( 2 * capacity );
        data[size++] = e;
    }

    // O(1)
    T pop_back(){
        assert( size > 0 );
        T ret = data[size-1];
        size --;
        return ret;
    }
};
Amortization
The (n+1)-th push triggers an O(n) resize, but that cost is distributed over the previous n O(1) pushes, so each push is amortized O(2) = O(1). resize is conditional and is not called on every push.
int main() {
for( int i = 10 ; i <= 26 ; i ++ ){
int n = pow(2,i);
clock_t startTime = clock();
MyVector<int> vec;
for( int i = 0 ; i < n ; i ++ ){
vec.push_back(i);
}
clock_t endTime = clock();
cout<<n<<" operations: \t";
cout<<double(endTime - startTime)/CLOCKS_PER_SEC<<" s"<<endl;
}
return 0;
}
1024 operations:      2.5e-05 s
2048 operations:      2.9e-05 s
4096 operations:      7.4e-05 s
8192 operations:      0.000154 s
16384 operations:     0.000265 s
32768 operations:     0.000391 s
65536 operations:     0.001008 s
131072 operations:    0.002006 s
262144 operations:    0.003863 s
524288 operations:    0.005842 s
1048576 operations:   0.014672 s
2097152 operations:   0.029367 s
4194304 operations:   0.06675 s
8388608 operations:   0.124446 s
16777216 operations:  0.240025 s
33554432 operations:  0.486061 s
67108864 operations:  0.960224 s
The time roughly doubles as the operation count doubles — amortized O(1) per push.
Deleting elements should release space as well.
An ordinary deletion is O(1).
But beware complexity oscillation: suppose the array has just doubled to capacity 2n with n elements. If pop_back shrinks the array the moment size drops to capacity/2, then alternating push_back/pop_back right at that boundary triggers an O(n) resize on every single operation.
Repeating that process, the cost cannot be amortized — each operation becomes O(n).
The fix is lazy shrinking: only shrink (to capacity/2) when the element count falls to 1/4 of the capacity.
template <typename T>
class MyVector{
private:
    T* data;
    int size;       // number of elements stored in the array
    int capacity;   // maximum number of elements the array can hold

    // O(n)
    void resize(int newCapacity){
        assert( newCapacity >= size );
        T *newData = new T[newCapacity];
        for( int i = 0 ; i < size ; i ++ )
            newData[i] = data[i];
        delete[] data;
        data = newData;
        capacity = newCapacity;
    }

public:
    MyVector(){
        data = new T[100];
        size = 0;
        capacity = 100;
    }

    ~MyVector(){
        delete[] data;
    }

    // Amortized: O(1)
    void push_back(T e){
        if( size == capacity )
            resize( 2 * capacity );
        data[size++] = e;
    }

    // Amortized: O(1), with lazy shrinking
    T pop_back(){
        assert( size > 0 );
        T ret = data[size-1];
        size --;
        if( size == capacity/4 )
            resize( capacity/2 );
        return ret;
    }
};
The results
2048 operations:      4.3e-05 s
4096 operations:      6.3e-05 s
8192 operations:      0.000107 s
16384 operations:     0.000316 s
32768 operations:     0.000573 s
65536 operations:     0.001344 s
131072 operations:    0.001995 s
262144 operations:    0.004102 s
524288 operations:    0.008599 s
1048576 operations:   0.0014714 s
2097152 operations:   0.027181 s
4194304 operations:   0.063136 s
8388608 operations:   0.126046 s
16777216 operations:  0.242574 s
33554432 operations:  0.456381 s
67108864 operations:  0.96618 s
134217728 operations: 1.76422 s
Amortized complexity
- The dynamic array
- Dynamic stack
- Dynamic queue
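The same amortized-O(1) idea carries over directly to a dynamic stack; a minimal sketch backed by std::vector, which already implements the doubling strategy (the class name is illustrative, not from the original):

```cpp
#include <cassert>
#include <vector>
using namespace std;

// A dynamic stack: push and pop are amortized O(1) because the
// underlying array doubles when full and shrinks lazily.
class DynamicStack {
    vector<int> data;   // std::vector handles the resizing strategy
public:
    void push(int e){ data.push_back(e); }   // amortized O(1)
    int pop(){
        assert(!data.empty());
        int ret = data.back();
        data.pop_back();                     // amortized O(1)
        return ret;
    }
    bool empty() const { return data.empty(); }
};
```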