This article introduces ten essential algorithms every machine learning practitioner should know. Please take notes.

Algorithm 1: Quicksort

Quicksort is a sorting algorithm developed by Tony Hoare. On average, O(n log n) comparisons are required to sort n items. In the worst case, O(n²) comparisons are required, but this behavior is rare.

In fact, quicksort is generally significantly faster than other O(n log n) algorithms because its inner loop can be implemented efficiently on most architectures. Quicksort uses a divide-and-conquer strategy to divide a list into two sub-lists.

Algorithm steps

1. Pick an element from the sequence, called the pivot.

2. Reorder the array so that all elements smaller than the pivot are placed before it and all elements larger than the pivot are placed after it (elements equal to the pivot can go to either side). After this partition completes, the pivot is in its final position. This is called the partition operation.

3. Recursively sort the subsequence of elements smaller than the pivot and the subsequence of elements larger than the pivot.

The base case of the recursion is a sequence of size zero or one, which is always sorted. Although recursive, the algorithm always terminates, because each iteration puts at least one element into its final position.
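To make the steps concrete, here is a minimal Python sketch of quicksort using the Lomuto partition scheme with the last element as the pivot (the function and variable names are illustrative, not from any particular library):

```python
def quicksort(arr, lo=0, hi=None):
    """Sort arr[lo..hi] in place; a sketch, not a tuned implementation."""
    if hi is None:
        hi = len(arr) - 1
    if lo >= hi:                           # base case: size 0 or 1 is already sorted
        return
    pivot = arr[hi]                        # step 1: pick a pivot (here, the last element)
    i = lo
    for j in range(lo, hi):                # step 2: partition around the pivot
        if arr[j] < pivot:                 # smaller elements move to the front
            arr[i], arr[j] = arr[j], arr[i]
            i += 1
    arr[i], arr[hi] = arr[hi], arr[i]      # the pivot lands in its final position
    quicksort(arr, lo, i - 1)              # step 3: recurse on the two sub-lists
    quicksort(arr, i + 1, hi)

data = [5, 2, 9, 1, 5, 6]
quicksort(data)                            # data is now [1, 2, 5, 5, 6, 9]
```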

Algorithm 2: Heapsort

Heapsort is a sorting algorithm designed around the heap data structure. A heap is a nearly complete binary tree that satisfies the heap property: the key of each child node is always smaller than (or always larger than) that of its parent. The average time complexity of heapsort is O(n log n).

Algorithm steps

1. Create a heap H[0..n-1]

2. Swap the top of the heap (the maximum) with the last element of the heap

3. Reduce the size of the heap by 1 and call shift_down(0) to move the top of the new array into position

4. Repeat steps 2 and 3 until the heap size is 1
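A minimal Python sketch of these steps, assuming a max-heap stored in a plain list (the names are illustrative):

```python
def heapsort(arr):
    """Sort arr in place using a max-heap; a sketch of the steps above."""
    n = len(arr)

    def shift_down(root, size):
        # Sift arr[root] down until the heap property holds within arr[:size].
        while True:
            child = 2 * root + 1                  # left child index
            if child >= size:
                return
            if child + 1 < size and arr[child + 1] > arr[child]:
                child += 1                        # pick the larger child
            if arr[root] >= arr[child]:
                return
            arr[root], arr[child] = arr[child], arr[root]
            root = child

    for i in range(n // 2 - 1, -1, -1):           # step 1: build the heap H[0..n-1]
        shift_down(i, n)
    for end in range(n - 1, 0, -1):               # steps 2-4: shrink the heap
        arr[0], arr[end] = arr[end], arr[0]       # swap the top (maximum) with the last slot
        shift_down(0, end)                        # move the new top into position
```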

Algorithm 3: Merge sort

Merge sort is an efficient sorting algorithm built on the merge operation. It is a classic application of divide and conquer.

Algorithm steps

1. Allocate a space whose size is the sum of the lengths of the two sorted sequences, to hold the merged sequence

2. Set two pointers to the starting positions of the two sorted sequences

3. Compare the elements the two pointers reference, copy the smaller one into the merge space, and advance that pointer to the next position

4. Repeat step 3 until one pointer reaches the end of its sequence

5. Copy all remaining elements of the other sequence directly to the end of the merged sequence
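A short Python sketch of these steps, written top-down and returning a new list (an in-place variant is also possible; this version favors clarity):

```python
def merge_sort(arr):
    """Return a new sorted list; a sketch of the merge steps above."""
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])

    merged = []                                   # step 1: space for the combined sequence
    i = j = 0                                     # step 2: pointers into the two halves
    while i < len(left) and j < len(right):       # steps 3-4: pick the smaller element
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])                       # step 5: copy the remaining elements
    merged.extend(right[j:])
    return merged
```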

Algorithm 4: Binary search

Binary search is an algorithm that locates a particular element in a sorted array. The search starts at the middle element of the array and ends if the middle element is exactly the element being searched for.

If the target element is greater or smaller than the middle element, the search continues in the half of the array that is greater or smaller than the middle element, again starting from that half's middle element. If the array is empty at any step, the element cannot be found. Each comparison halves the search area, which guarantees a time complexity of O(log n).
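A minimal Python sketch of binary search over a sorted list (iterative, returning -1 when the element is absent):

```python
def binary_search(arr, target):
    """Return the index of target in the sorted list arr, or -1 if absent."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:                  # an empty range means the element is not there
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid               # the middle element is exactly the target
        elif arr[mid] < target:
            lo = mid + 1             # continue in the upper half
        else:
            hi = mid - 1             # continue in the lower half
    return -1
```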

Algorithm 5: BFPRT (linear selection)

The BFPRT algorithm solves a classic problem: selecting the k-th largest (or k-th smallest) element from a sequence of n elements. Through clever analysis, BFPRT guarantees linear time complexity even in the worst case. The idea is similar to that of quicksort; of course, to keep the algorithm at O(n) in the worst case, the algorithm's five authors added a subtle refinement.

Algorithm steps

1. Divide the n elements into groups of 5, yielding ⌈n/5⌉ groups.

2. Find the median of each group, using any sorting method (such as insertion sort).

3. Recursively call the selection algorithm to find the median of all the medians from the previous step; call it x. If the number of medians is even, take the smaller of the two middle values.

4. Partition the array around x; let k be the number of elements less than or equal to x, so n-k elements are greater than x.

5. If i == k, return x; if i < k, recursively find the i-th smallest element among the elements less than x; if i > k, recursively find the (i-k)-th smallest element among the elements greater than x.
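A Python sketch of the median-of-medians idea behind these steps. It uses a three-way partition to handle duplicate values, which differs slightly from the two-way split in step 4, and it is written for clarity rather than performance:

```python
def select(arr, i):
    """Return the i-th smallest element of arr (1-indexed), worst-case O(n)."""
    if len(arr) <= 5:
        return sorted(arr)[i - 1]

    # Steps 1-2: split into groups of 5 and take each group's (lower) median.
    groups = [sorted(arr[j:j + 5]) for j in range(0, len(arr), 5)]
    medians = [g[(len(g) - 1) // 2] for g in groups]

    # Step 3: recursively find the median of the medians, x.
    x = select(medians, (len(medians) + 1) // 2)

    # Step 4: three-way partition around x (separating duplicates explicitly).
    less = [v for v in arr if v < x]
    equal = [v for v in arr if v == x]
    greater = [v for v in arr if v > x]
    k = len(less)

    # Step 5: recurse only into the side that contains the i-th smallest element.
    if i <= k:
        return select(less, i)
    elif i <= k + len(equal):
        return x
    else:
        return select(greater, i - k - len(equal))
```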

Algorithm 6: DFS (Depth-first Search)

The depth-first search algorithm is a kind of search algorithm. It traverses the nodes of a tree along its depth, searching each branch as deep as possible. When all edges of node v have been explored, the search backtracks to the node from which v was discovered.

This process continues until all nodes reachable from the source node have been discovered. If undiscovered nodes remain, one of them is selected as a new source node and the process repeats, until all nodes have been visited. DFS is a blind (uninformed) search.

Depth-first search is a classical algorithm in graph theory. It can generate a topological ordering of the target graph, and that ordering conveniently solves many related graph theory problems, such as the maximum path problem. A stack is generally used to implement the DFS algorithm.

Algorithm steps:

1. Visit vertex v;

2. In turn, start a depth-first traversal of the graph from each unvisited vertex adjacent to v, until every vertex in the graph that shares a path with v has been visited;

3. If there are still vertices in the graph that have not been visited, the depth-first traversal will be performed again starting from one of the unvisited vertices until all vertices in the graph have been visited.

The above description may seem abstract, but here’s an example:

After visiting a starting vertex v in the graph, DFS proceeds from v to any adjacent vertex w1. From w1, it visits an unvisited vertex w2 adjacent to w1. From w2 it makes a similar visit, and so on, until it reaches a vertex u whose adjacent vertices have all been visited.

Next, step back to the most recently visited vertex and check whether it has any adjacent vertices that have not been visited. If so, visit one of them and then proceed from that vertex with a similar visit; if not, step back again and continue searching. Repeat this process until all vertices in the connected graph have been visited.
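A minimal recursive Python sketch of this traversal over an adjacency-list graph (the dict-of-lists representation is an assumption for illustration):

```python
def dfs(graph, v, visited=None):
    """Recursive depth-first traversal; graph maps each vertex to its neighbours."""
    if visited is None:
        visited = []
    visited.append(v)                        # step 1: visit vertex v
    for w in graph.get(v, []):               # step 2: each unvisited neighbour in turn
        if w not in visited:
            dfs(graph, w, visited)           # go as deep as possible before backtracking
    return visited

graph = {'v': ['w1', 'w2'], 'w1': ['w2'], 'w2': []}
print(dfs(graph, 'v'))                       # ['v', 'w1', 'w2']
```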

Algorithm 7: BFS (Breadth-first Search)

The breadth-first search algorithm is a graph search algorithm. Simply put, BFS starts at the root node and traverses the nodes of the tree (graph) along its width, level by level. The algorithm terminates when all nodes have been visited. BFS is also a blind search. A queue is generally used to implement the BFS algorithm.

Algorithm steps

1. Put the root node into the queue first.

2. Take the first node from the queue and check whether it is the target. If the target is found, end the search and return the result.

Otherwise, enqueue all of its immediate children that have not yet been checked.

3. If the queue is empty, the entire graph has been checked, meaning the target is not in the graph. End the search and return "target not found".

4. Repeat Step 2.
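A minimal Python sketch of these steps over an adjacency-list graph, using collections.deque as the queue (the graph representation is an assumption for illustration):

```python
from collections import deque

def bfs(graph, root, target):
    """Return the target vertex if reachable from root, otherwise None."""
    queue = deque([root])                    # step 1: enqueue the root
    seen = {root}
    while queue:
        node = queue.popleft()               # step 2: dequeue the first node
        if node == target:
            return node                      # target found: end the search
        for child in graph.get(node, []):
            if child not in seen:            # enqueue unchecked immediate children
                seen.add(child)
                queue.append(child)
    return None                              # step 3: queue empty, target not found
```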

Algorithm 8: Dijkstra

Dijkstra’s algorithm was proposed by the Dutch computer scientist Edsger Dijkstra. It uses a breadth-first-style search to solve the single-source shortest path problem on a non-negatively weighted directed graph, ultimately producing a shortest-path tree. The algorithm is often used in routing, or as a submodule of other graph algorithms.

The input of the algorithm consists of a weighted directed graph G and a source vertex s in G. Let V denote the set of all vertices in G. Each edge of the graph is an ordered pair of vertices (u, v), meaning there is a path from vertex u to vertex v. We denote the set of all edges in G by E, and the edge weights are given by a weight function w: E → [0, ∞).

So w(u, v) is the non-negative weight of the edge from vertex u to vertex v. The weight of an edge can be thought of as the distance between two vertices. The weight of a path between any two points is the sum of the weights of all edges on that path. Given vertices s and t in V, Dijkstra's algorithm finds the least-weight path from s to t (that is, the shortest path).

The algorithm can also find the shortest paths from one vertex s to all other vertices in the graph. For directed graphs without negative weights, Dijkstra's algorithm is the fastest known single-source shortest path algorithm.

Algorithm steps

1. Initially, S = {V0} and T = {all other vertices}. For each vertex Vi in T, set d(V0, Vi) to the weight of the arc (V0, Vi) if one exists, and to ∞ otherwise.

2. Select from T the vertex W with the smallest distance value that is not in S, and add it to S.

3. Update the distance values of the remaining vertices in T: if going through W as an intermediate vertex shortens the distance from V0 to Vi, update that distance value. Repeat steps 2 and 3 until S contains all vertices.
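A Python sketch of these steps using a priority queue (heapq) instead of a linear scan over T; the adjacency-list representation is an assumption for illustration:

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from source on a non-negatively weighted digraph.
    graph maps each vertex to a list of (neighbour, weight) pairs."""
    dist = {source: 0}
    pq = [(0, source)]                       # priority queue of (distance, vertex)
    while pq:
        d, u = heapq.heappop(pq)             # step 2: the closest unsettled vertex
        if d > dist.get(u, float('inf')):
            continue                         # stale entry; u was already settled
        for v, w in graph.get(u, []):        # step 3: relax the edges through u
            if d + w < dist.get(v, float('inf')):
                dist[v] = d + w              # going through u shortens the path to v
                heapq.heappush(pq, (dist[v], v))
    return dist
```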

Algorithm 9: Dynamic programming

Dynamic programming is a method used in mathematics, computer science, and economics to solve complex problems by breaking down original problems into relatively simple subproblems. Dynamic programming is often applied to problems with overlapping subproblems and optimal substructures, and the time consumed by dynamic programming is often much less than that of the naive solution.

The basic idea behind dynamic programming is very simple. Roughly speaking, to solve a given problem, we need to solve its different parts (i.e., subproblems) and combine the solutions of the subproblems to get the solution of the original problem. In general, many subproblems are very similar, so dynamic programming methods try to solve each subproblem only once, thus reducing computational effort:

Once the solution to a given subproblem has been computed, it is memoized and stored, so that the next time the same subproblem's solution is needed it can simply be looked up in a table. This approach is especially useful when the number of repeated subproblems grows exponentially with the size of the input. The most classic dynamic programming problem is the knapsack problem.

Algorithm steps

1. Optimal substructure. A problem has the optimal substructure property (that is, it satisfies the principle of optimality) if an optimal solution to the problem contains optimal solutions to its subproblems. Optimal substructure provides an important clue that a dynamic programming algorithm may solve the problem.

2. Overlapping subproblems. The overlapping subproblem property means that when a problem is solved top-down with a recursive algorithm, the subproblems generated are not always new; some are solved repeatedly. The dynamic programming algorithm exploits this overlap by computing each subproblem only once and saving its result in a table. When the subproblem is needed again, the result is simply looked up in the table, yielding much higher efficiency.
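As an illustration, here is a bottom-up Python sketch of the 0/1 knapsack problem mentioned above, using a one-dimensional table (the names and example values are illustrative):

```python
def knapsack(weights, values, capacity):
    """0/1 knapsack via bottom-up dynamic programming.
    dp[c] holds the best value achievable with capacity c using the items seen so far."""
    dp = [0] * (capacity + 1)
    for w, v in zip(weights, values):
        # Iterate capacities downward so each item is used at most once.
        for c in range(capacity, w - 1, -1):
            dp[c] = max(dp[c], dp[c - w] + v)    # skip the item vs. take the item
    return dp[capacity]

print(knapsack([2, 3, 4], [3, 4, 5], 5))         # 7: take the items of weight 2 and 3
```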

Algorithm 10: Naive Bayes classification algorithm

The naive Bayes classification algorithm is a simple probabilistic classifier based on Bayes' theorem. Bayesian classification rests on probabilistic reasoning: completing reasoning and decision-making tasks when the presence of various conditions is uncertain and only their probabilities of occurrence are known.

Probabilistic reasoning is the counterpart of deterministic reasoning. The naive Bayes classifier is built on an independence assumption: each feature of a sample is assumed to be uncorrelated with every other feature. When its underlying probability model is accurate, the naive Bayes classifier can achieve very good classification results on supervised learning datasets.

In many practical applications, the parameters of a naive Bayes model are estimated by maximum likelihood; in other words, a naive Bayes model can be used without adopting Bayesian probability or any other Bayesian method.
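A minimal Python sketch of a naive Bayes classifier for discrete features, fit by maximum likelihood as described above (no smoothing is applied, so unseen feature values get zero probability; the names are illustrative):

```python
from collections import Counter, defaultdict

def fit(samples, labels):
    """Estimate class priors and per-feature value counts by maximum likelihood.
    samples is a list of tuples of discrete feature values."""
    priors = {c: n / len(labels) for c, n in Counter(labels).items()}
    counts = defaultdict(Counter)                # (class, feature index) -> value counts
    for x, c in zip(samples, labels):
        for idx, v in enumerate(x):
            counts[(c, idx)][v] += 1
    return priors, counts

def predict(priors, counts, x):
    """Pick the class maximizing P(c) * prod_i P(x_i | c), relying on the
    feature-independence assumption."""
    best, best_p = None, -1.0
    for c, prior in priors.items():
        p = prior
        for idx, v in enumerate(x):
            total = sum(counts[(c, idx)].values())
            p *= counts[(c, idx)][v] / total if total else 0.0
        if p > best_p:
            best, best_p = c, p
    return best
```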


Source: GitChat