This paper was written as a major assignment for a graduate course. It covers only the first layer of understanding of the support vector machine algorithm. For a deeper treatment, see the three layers of understanding of the support vector machine.

1: Support vector machine algorithm introduction

In recent years, AI and machine learning have substantially changed how network security is practiced. Machine learning helps deploy security technology more effectively: by analyzing relevant security data and learning from it, an existing security system can recognize and block attacks similar to those it has already seen. It also lets practitioners act proactively against threats and devise better defenses. As computer hardware has been continuously upgraded and computing power has grown, the field of machine learning has developed rapidly. In network security, machine learning lets security personnel organize and manage security log data more efficiently, because such data typically calls for classification, prediction, association rule learning and similar tasks, which machine learning handles well. This paper focuses on the application of one classification algorithm, the support vector machine, to network security. It describes the main framework of the support vector machine and gives the analysis and mathematical derivation for linear classification. Because of my limited ability, it does not analyze or introduce support vector machine classification in higher-dimensional settings.

1.1: What is a support vector machine

To understand support vector machines, we should start with classification, one of the most important tasks in machine learning. Put simply, classification means grouping things that share the same attributes into one category. The support vector machine is a classification algorithm. As the name implies, it has two parts: the support vectors and the machine. In machine learning an algorithm is often regarded as a kind of machine; a classifier, for example, is also called a classification machine. What a support vector is will be explained in detail below. The main job of the support vector machine algorithm, as shown below, is to separate things that belong to different categories.

1.2: How strong are support vector machines

The support vector machine (SVM) algorithm is derived from statistical learning theory and is one of the most closely studied algorithms in machine learning. It is widely used in handwriting recognition, face detection, spam classification, gene classification and intrusion detection. Support vector machines can find accurate relationships in complex data and often give more accurate results than other algorithms. The following paragraphs introduce the strengths of support vector machines from several perspectives.

(1) High classification accuracy and good classification results. The following figure, taken from scikit-learn, shows the classification results and accuracy of different machine learning algorithms on given data sets. It shows that, for the same input data, SVM compares favorably in accuracy and classification results with decision tree, random forest, AdaBoost (adaptive boosting), naive Bayes and other classification algorithms.



(2) Extensive practical application. Support vector machines are widely used in text classification, face recognition, handwritten digit recognition, image classification, protein classification and other fields. Experimental results show that, after only three or four rounds of relevance feedback during model training, the support vector machine algorithm can achieve much higher accuracy than traditional classification methods.

(3) High citation rate in academic papers and good prospects for academic application. The support vector machine is often regarded as the machine learning algorithm closest to deep learning, and because of its high classification accuracy and good classification results it has been widely used across disciplines. The original support vector machine algorithm was invented in 1963. After continuous improvement, its theory and mathematical derivation are now relatively complete, and the mathematical foundation behind the algorithm can be regarded as a major achievement of mathematics. From an academic point of view, the support vector machine algorithm appears to some degree in software engineering, communication engineering, artificial intelligence, natural language processing, network security and other computer-related fields. The following figure shows the annual publication trend of papers related to the support vector machine algorithm in CNKI from 2000 to 2019, which illustrates more intuitively how widely the algorithm has been used in papers in recent years.

1.3: The working mode of support vector machines

Leaving the mathematical derivation aside, what a support vector machine does is actually very simple. A simple linear SVM classifier draws a straight line between two different classes: after classification, all data samples on one side of the line belong to one category, and all samples on the other side belong to the other. Such a line is not unique. The support vector machine is not the only classification algorithm of this kind; the k-nearest neighbor algorithm is another example. What sets the support vector machine apart is that it selects the optimal line for classification. The following diagram helps us understand how support vector machines work. Suppose we have some data points on the graph, and we want to put the yellow dots into one category and the black dots into another. The core of the support vector machine algorithm is to find the optimal boundary, which we also call the decision boundary.



The decision boundary is not necessarily a straight line; it can also take other forms. The decision boundary is also called a hyperplane, a name that reflects the fact that the boundary can be found for data with more than two features, where it is no longer a line. For example, the decision boundary of a nonlinear SVM with an RBF kernel is shown below. All subsequent steps of the support vector machine algorithm serve to solve for the optimal decision boundary, which is the final goal of the whole algorithm.

1.4: Types of support vector machines

There are two different types of support vector machines, and each type of SVM has its own purpose.

  • Simple SVM: Usually used to solve linear classification and regression problems.
  • Kernel SVM: More flexible, and suited to nonlinear data. Its hyperplane is not limited to two-dimensional space and can also separate data in higher-dimensional space.
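
The difference between the two types can be sketched on a toy data set. The example below assumes scikit-learn is installed and uses its `make_circles` generator (an illustrative choice, not data from this paper) to create two concentric rings that no straight line can separate, then fits a linear SVM and an RBF-kernel SVM to the same points.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Toy data (an assumption for illustration): two concentric rings,
# which are not linearly separable in the original 2-D space.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# Simple (linear) SVM: the decision boundary is a straight line.
linear_svm = SVC(kernel="linear").fit(X, y)

# Kernel SVM: the RBF kernel lets the boundary curve around the inner ring.
rbf_svm = SVC(kernel="rbf").fit(X, y)

linear_acc = linear_svm.score(X, y)  # training accuracy of the linear model
rbf_acc = rbf_svm.score(X, y)        # training accuracy of the RBF model

print(f"linear kernel accuracy: {linear_acc:.2f}")
print(f"RBF kernel accuracy:    {rbf_acc:.2f}")
```

On data like this the RBF model should separate the rings far better than the linear one, which is the kind of flexibility the kernel SVM is chosen for.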

2: Support vector machine algorithm principle

2.1: Understand the concept of hyperplane

In geometry, a hyperplane is a subspace whose dimension is one less than that of the space it lies in. If the data space is three-dimensional, the hyperplane is a two-dimensional plane; if the data space is two-dimensional, the hyperplane is a one-dimensional straight line. In a binary classification problem, if a hyperplane divides the data set into two sets, each containing a single category, then this hyperplane is a decision boundary. As mentioned above, the core of the support vector machine is to find the optimal decision boundary. As shown in the figure below, for a given data set there can be countless decision boundaries that make the training error zero, and the support vector machine algorithm helps us find the optimal one. The whole algorithm revolves around this, so understanding the hyperplane is the key to understanding the support vector machine algorithm.
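
To make the idea concrete, here is a minimal sketch in plain Python. It assumes the usual convention of describing a hyperplane by a normal vector `w` and an offset `b`, so that the hyperplane is the set of points where `w . x + b = 0`; the function names and the example line are hypothetical and chosen only for illustration.

```python
def decision_value(w, b, x):
    """Return w . x + b for a hyperplane with normal vector w and offset b."""
    return sum(w_i * x_i for w_i, x_i in zip(w, x)) + b


def side(w, b, x):
    """Return +1 or -1 for the side of the hyperplane x lies on, 0 if on it."""
    value = decision_value(w, b, x)
    return (value > 0) - (value < 0)


# Hypothetical 2-D example: the line x1 + x2 - 1 = 0 is a hyperplane in the plane.
w, b = (1.0, 1.0), -1.0

print(side(w, b, (2.0, 2.0)))   # a point on the positive side of the line
print(side(w, b, (0.0, 0.0)))   # a point on the negative side of the line
print(side(w, b, (0.5, 0.5)))   # a point exactly on the line
```

The same two functions work unchanged in three or more dimensions, because only the length of `w` and `x` changes.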

2.2: When a decision boundary is optimal

As mentioned above, the core goal of the support vector machine algorithm is to find the optimal decision boundary, so the question we care about is what conditions a decision boundary must meet to be optimal. Any of the many decision boundaries for the data set above can serve as a decision boundary if we pick one at random, but we cannot guarantee that it will remain accurate on unseen data. For the existing data set, take two possible decision boundaries, B1 and B2. We shift B1 in parallel toward both sides until it touches the nearest sample of each class, namely the □ and ○ shown in the figure below, and then stop. This forms two new hyperplanes, b11 and b12. We then place the decision boundary midway between b11 and b12, so that it is equidistant from both. The distance between b11 and b12 is called the margin of the decision boundary B1, usually denoted d. The same operation is performed for the decision boundary B2, and we then compare the two decision boundaries. The training error of both B1 and B2 on the existing data set is 0. So which of these two decision boundaries is better?



Next, we introduce test samples that follow the same distribution as the original data set, shown in red in the figure below. The figure shows that, on the test set, decision boundary B1 still misclassifies no samples, so its generalization error is 0, whereas decision boundary B2 misclassifies some samples, so its generalization error is greater than that of B1. Decision boundary B1 is therefore better on the test set. What does this experiment tell us? A decision boundary with a larger margin has higher accuracy and lower generalization error, where the margin is the d mentioned above. This conclusion has a rigorous mathematical proof and follows from the principle of structural risk minimization. This paper does not derive that principle and gives only a brief introduction.

If the margin d is very small, any slight perturbation has a large effect on how the decision boundary classifies samples. A boundary with a very small margin performs well on the training set but poorly on the test set; that is, it overfits. So at this point in the article we have a new goal: to find a decision boundary with a larger margin, and the larger the margin, the better.
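
The margin comparison described above can be sketched in plain Python. The sample points and the two candidate boundaries below are hypothetical (they are not the data in the figure), and each boundary is assumed to be written as `w . x + b = 0`. The function measures the distance between the two parallel hyperplanes that touch the nearest sample of each class.

```python
import math


def margin(w, b, positives, negatives):
    """Distance between the two parallel hyperplanes that touch the nearest
    sample of each class, for the candidate boundary w . x + b = 0."""
    norm = math.sqrt(sum(w_i * w_i for w_i in w))

    def signed_distance(x):
        return (sum(w_i * x_i for w_i, x_i in zip(w, x)) + b) / norm

    nearest_pos = min(signed_distance(x) for x in positives)
    nearest_neg = max(signed_distance(x) for x in negatives)
    if nearest_pos <= 0 or nearest_neg >= 0:
        raise ValueError("boundary does not separate the two classes")
    return nearest_pos - nearest_neg


# Hypothetical toy samples for the two classes.
squares = [(2.0, 2.0), (3.0, 3.0), (3.0, 1.0)]
circles = [(0.0, 0.0), (-1.0, 0.0), (0.0, -1.0)]

d_b1 = margin((1.0, 1.0), -2.0, squares, circles)  # candidate B1: x1 + x2 = 2
d_b2 = margin((1.0, 0.0), -1.0, squares, circles)  # candidate B2: x1 = 1

print(f"margin of B1: {d_b1:.3f}")
print(f"margin of B2: {d_b2:.3f}")
```

Both candidates separate the toy samples with zero training error, but B1 has the wider margin, so by the argument above it is the preferred boundary.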



A support vector machine (SVM) is a classifier that classifies data by finding the maximum-margin decision boundary, so it is also called the maximum-margin classifier.

2.3: Mathematical derivation of decision boundary

After the above analysis, our goal becomes finding the maximum-margin decision boundary. In essence this is an optimization problem. In machine learning, optimization problems are usually tied to a loss function, and the support vector machine algorithm is no exception: it finds the maximum-margin decision boundary by minimizing a loss function.

2.3.1: Define decision boundaries
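
A common way to write a linear decision boundary is given below. It is the standard textbook convention and is assumed here; the symbols $\mathbf{w}$ (normal vector), $b$ (offset), $\mathbf{x}_i$ (the $i$-th sample) and $y_i \in \{+1, -1\}$ (its class label) are notation introduced for this sketch.

```latex
\mathbf{w}^{\top}\mathbf{x} + b = 0,
\qquad
y_i =
\begin{cases}
+1 & \text{if } \mathbf{w}^{\top}\mathbf{x}_i + b > 0 \\
-1 & \text{if } \mathbf{w}^{\top}\mathbf{x}_i + b < 0
\end{cases}
```

Under this convention, a sample is assigned to a class according to the sign of $\mathbf{w}^{\top}\mathbf{x}_i + b$, that is, according to which side of the hyperplane it falls on.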







2.3.2: Define the margin of the decision boundary
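
The margin d can be expressed in terms of the boundary parameters under the usual textbook assumptions: the boundary is written as $\mathbf{w}^{\top}\mathbf{x} + b = 0$, and $\mathbf{w}$ and $b$ are scaled so that the nearest samples of the two classes lie on the parallel hyperplanes $\mathbf{w}^{\top}\mathbf{x} + b = +1$ and $\mathbf{w}^{\top}\mathbf{x} + b = -1$. Taking one point $\mathbf{x}_{+}$ on the first and one point $\mathbf{x}_{-}$ on the second (names introduced for this sketch) and subtracting the two equations gives:

```latex
\mathbf{w}^{\top}\mathbf{x}_{+} + b = +1, \qquad
\mathbf{w}^{\top}\mathbf{x}_{-} + b = -1
\;\Longrightarrow\;
\mathbf{w}^{\top}(\mathbf{x}_{+} - \mathbf{x}_{-}) = 2
\;\Longrightarrow\;
d = \frac{\mathbf{w}^{\top}(\mathbf{x}_{+} - \mathbf{x}_{-})}{\lVert \mathbf{w} \rVert}
  = \frac{2}{\lVert \mathbf{w} \rVert}
```

The last step projects the difference $\mathbf{x}_{+} - \mathbf{x}_{-}$ onto the unit normal $\mathbf{w} / \lVert \mathbf{w} \rVert$, which is the perpendicular distance between the two hyperplanes.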











2.3.3: Solving the optimal decision boundary
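
In the standard hard-margin formulation, assumed here with the boundary written as $\mathbf{w}^{\top}\mathbf{x} + b = 0$, labels $y_i \in \{+1, -1\}$, $N$ training samples and margin $d = 2 / \lVert \mathbf{w} \rVert$, maximizing the margin is equivalent to a constrained minimization:

```latex
\max_{\mathbf{w},\, b} \; \frac{2}{\lVert \mathbf{w} \rVert}
\quad\Longleftrightarrow\quad
\min_{\mathbf{w},\, b} \; \frac{1}{2}\lVert \mathbf{w} \rVert^{2}
\qquad \text{subject to} \quad
y_i\left(\mathbf{w}^{\top}\mathbf{x}_i + b\right) \ge 1, \quad i = 1, \dots, N
```

The constraint requires every training sample to lie on the correct side of the boundary and outside the margin. The samples for which it holds with equality lie exactly on the margin hyperplanes; these are the support vectors.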



3: Python implementation of support vector machine algorithm

The previous chapters introduced in detail what the support vector machine algorithm is, its characteristics and principles, and the mathematical derivation used to solve it. The following sections visualize a linear support vector machine in Python.

3.1: Import the required modules of SVM
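
A minimal set of imports for this kind of visualization might look as follows. It assumes NumPy, Matplotlib and scikit-learn are installed; the choice of `make_blobs` as the data generator is an assumption made for illustration.

```python
import numpy as np                        # array handling and grid construction
import matplotlib.pyplot as plt           # plotting the samples and the boundary
from sklearn.datasets import make_blobs   # synthetic two-class data (assumed choice)
from sklearn.svm import SVC               # support vector classifier
```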

3.2: Construct and visualize data sets
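
One way to build and plot a small two-class data set is sketched below. It assumes scikit-learn's `make_blobs` generator; the sample count, cluster spread and random seed are illustrative values, not parameters taken from this paper.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs

# Two clusters of 2-D points; all parameter values here are assumptions.
X, y = make_blobs(n_samples=50, centers=2, cluster_std=0.6, random_state=0)

fig, ax = plt.subplots()
# Color each sample by its class label.
ax.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap="rainbow")
ax.set_xlabel("feature 1")
ax.set_ylabel("feature 2")
# Call plt.show() to display the figure in an interactive session.
```

`X` holds one row per sample with two features, and `y` holds the class label (0 or 1) of each sample.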



3.3: Make grid map
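
A grid of evaluation points can be built with NumPy's `meshgrid`, as in the sketch below. The axis ranges and the resolution of 30 points per axis are assumed values; in practice the ranges would usually be read from the plot axes.

```python
import numpy as np

# Assumed plotting range; in practice take it from ax.get_xlim() / ax.get_ylim().
x_min, x_max = -1.0, 4.0
y_min, y_max = -1.0, 6.0

# 30 evenly spaced coordinates along each axis.
axis_x = np.linspace(x_min, x_max, 30)
axis_y = np.linspace(y_min, y_max, 30)

# meshgrid turns the two 1-D axes into two 30 x 30 coordinate matrices.
grid_x, grid_y = np.meshgrid(axis_x, axis_y)

# Stack into a (900, 2) array: one row per grid point, ready for a classifier.
grid_points = np.column_stack([grid_x.ravel(), grid_y.ravel()])
```

Evaluating a classifier on every row of `grid_points` and reshaping the result back to 30 x 30 gives a surface that a contour plot can draw.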



3.4: Calculation of decision boundaries
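
The sketch below fits a linear SVM and draws its decision boundary and margin. It is self-contained and assumes scikit-learn's `SVC` with a linear kernel and `make_blobs` toy data; the data parameters and grid resolution are illustrative assumptions rather than values from this paper.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Toy two-cluster data (parameter values are assumptions).
X, y = make_blobs(n_samples=50, centers=2, cluster_std=0.6, random_state=0)

# Fit a linear SVM; clf.coef_ and clf.intercept_ hold w and b of the boundary.
clf = SVC(kernel="linear").fit(X, y)

fig, ax = plt.subplots()
ax.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap="rainbow")

# Build a grid over the visible area and evaluate the decision function on it.
x_min, x_max = ax.get_xlim()
y_min, y_max = ax.get_ylim()
grid_x, grid_y = np.meshgrid(np.linspace(x_min, x_max, 30),
                             np.linspace(y_min, y_max, 30))
grid_points = np.column_stack([grid_x.ravel(), grid_y.ravel()])
values = clf.decision_function(grid_points).reshape(grid_x.shape)

# Level 0 is the decision boundary; levels -1 and +1 are the margin edges.
ax.contour(grid_x, grid_y, values, levels=[-1, 0, 1],
           colors="k", linestyles=["--", "-", "--"])

# Circle the support vectors, the samples that determine the boundary.
ax.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1],
           s=150, facecolors="none", edgecolors="k")

# Width of the margin, 2 / ||w||, for the fitted linear model.
margin_width = 2.0 / np.linalg.norm(clf.coef_)
# Call plt.show() to display the figure in an interactive session.
```

The solid contour is the decision boundary, the dashed contours mark the margin, and the circled points are the support vectors returned in `clf.support_vectors_`.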



4: Application of support vector machine algorithm in network security

Before studying the application of the support vector machine algorithm in network security, we first look at the main tasks of network security, which are commonly summarized as prediction, prevention, detection, response and monitoring. The support vector machine algorithm has broad application prospects in each of these tasks. The following sections introduce the role of support vector machines in network, terminal, application and user behavior security.

4.1: SVM for network protection

Network protection, as the name implies, means protecting our network from intrusion, so intrusion detection systems receive particular attention in the field of network security. Traditional intrusion detection mostly relies on signature matching. Machine learning can instead help us analyze network traffic. Here the support vector machine offers fast detection and high classification accuracy, which can help security personnel identify different types of network attacks, such as scanning and spoofing.

4.2: SVM for terminal protection

In recent years, as mobile terminals have become ubiquitous, people no longer access the Internet mainly from PCs as they once did. More and more mobile devices have become people's first choice for going online, and the network security of mobile terminals is becoming more and more important. The latest approach to securing mobile devices is endpoint detection. Different mobile terminals store different information, but this information follows certain patterns, and a support vector machine can classify the different types of software on these devices accurately.

4.3: SVM for application protection

Applications have become inseparable from everyday life. Each of us uses many kinds of applications, such as desktop, web and mobile applications and microservices. Most cyberattacks on applications target exploitable vulnerabilities in the application layer, a problem compounded by the growing number and complexity of applications. Support vector machine algorithms can help security personnel classify known types of attacks, such as SQL injection.
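
As a rough illustration of how such a classifier could be set up, the sketch below trains a linear SVM on character n-grams of input strings. It assumes scikit-learn; the eight sample strings and their labels are hand-written, hypothetical toy inputs, far too few for a real detector, and the feature choice is one plausible option rather than a method described in this paper.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hand-written toy inputs (hypothetical, for illustration only).
samples = [
    "alice", "john.smith@example.com", "blue running shoes", "42",
    "' OR '1'='1", "1; DROP TABLE users --",
    "' UNION SELECT password FROM users --", "admin' --",
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 0 = benign input, 1 = SQL injection

# Character n-grams capture quotes, comment markers and SQL keywords.
model = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(1, 3)),
    LinearSVC(),
)
model.fit(samples, labels)

# Predicted label for each training string (0 or 1).
predictions = model.predict(samples)
```

A usable detector would need a large labeled corpus of real requests and a held-out test set to estimate its generalization error.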