1. Derivation of the least squares support vector machine
Unlike the standard Support Vector Machine (SVM), the Least Squares Support Vector Machine (LSSVM) proposed by Suykens et al. builds on SVM with three modifications (a side-by-side comparison with the standard SVM primal follows this list):
- An error term $e_i$ is used instead of the slack variable $\xi_i$
- Equality constraints are used instead of inequality constraints
- A quadratic loss function is used instead of the hinge loss
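For reference, recall the soft-margin SVM primal (the standard textbook form, supplied here for comparison; it is not part of the original derivation):

$$ \min_{w,b,\xi}\; \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{N}\xi_i \quad \text{s.t.}\quad y_i\left(w^T\varphi(x_i)+b\right) \ge 1-\xi_i,\quad \xi_i \ge 0 $$

Replacing the inequality constraints with equalities and the linear penalty on $\xi_i$ with a squared penalty on $e_i$ gives the LSSVM objective.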
The objective function of LSSVM is equation (1):

$$ \min_{w,b,e}\; J(w,b,e) = \frac{1}{2}\|w\|^2 + \frac{\gamma}{2}\sum_{i=1}^{N} e_i^2 \quad \text{s.t.}\quad y_i\left(w^T\varphi(x_i)+b\right) = 1-e_i,\quad i=1,\dots,N \tag{1} $$

where $\varphi(x_i)$ is sample $x_i$ projected into a Hilbert space; $w$ and $b$ are the classifier parameters, namely the normal vector (slope) and offset of the classification hyperplane; $e_i$ is the error term; and $\gamma$ is the regularization parameter, which controls how heavily the error terms are penalized.
Similar to the SVM case, the Lagrange multiplier method can be used to solve for the two classifier parameters. Introducing Lagrange multipliers $\alpha_i$, the Lagrange function is obtained as equation (2):

$$ L(w,b,e;\alpha) = J(w,b,e) - \sum_{i=1}^{N}\alpha_i\left[y_i\left(w^T\varphi(x_i)+b\right) - 1 + e_i\right] \tag{2} $$
Taking partial derivatives of the above with respect to $w$, $b$, $e_i$, and $\alpha_i$ and setting them to zero gives:

$$ \frac{\partial L}{\partial w}=0 \;\Rightarrow\; w=\sum_{i=1}^{N}\alpha_i y_i\varphi(x_i), \qquad \frac{\partial L}{\partial b}=0 \;\Rightarrow\; \sum_{i=1}^{N}\alpha_i y_i=0, $$

$$ \frac{\partial L}{\partial e_i}=0 \;\Rightarrow\; \alpha_i=\gamma e_i, \qquad \frac{\partial L}{\partial \alpha_i}=0 \;\Rightarrow\; y_i\left(w^T\varphi(x_i)+b\right)-1+e_i=0. $$
Unlike SVM, which requires solving a quadratic programming (QP) problem, LSSVM obtains its model parameters by solving a system of linear equations. The problem is convex and satisfies Slater's condition, so the optimal solution satisfies the Karush-Kuhn-Tucker (KKT) conditions. Eliminating the variables $w$ and $e$ from the KKT conditions yields:

$$ \begin{bmatrix} 0 & y^T \\ y & \Omega + I/\gamma \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ 1_N \end{bmatrix} \tag{3} $$
where $\Omega = (yy^T)\circ K$ is the elementwise product of the kernel matrix and the label outer-product matrix, $I$ is the identity matrix, and $1_N$ is a vector of ones. Solving this linear system yields the Lagrange multipliers $\alpha$ and the offset $b$. As with SVM, the decision function on an unknown sample $x$ is:

$$ f(x) = \operatorname{sign}\left(\sum_{i=1}^{N}\alpha_i y_i K(x, x_i) + b\right) \tag{4} $$
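Before the full implementation, here is a minimal MATLAB sketch of equations (3) and (4) (the names K, y, gamma, and Ktest are illustrative, not from the code below):

% minimal sketch: solve equation (3) for [b; alpha], then apply equation (4)
% assumes: K is the N x N train kernel, y the N x 1 label vector (+/-1),
% gamma the regularization parameter, Ktest an N x M train-vs-test kernel
N = length(y);
Omega = (y * y') .* K;                    % elementwise label-weighted kernel
A = [0, y'; y, Omega + eye(N) / gamma];   % block matrix of equation (3)
sol = A \ [0; ones(N, 1)];                % solve the linear system
b = sol(1);
alpha = sol(2:end);
pred = sign(Ktest' * (alpha .* y) + b);   % decision function, equation (4)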
2. MATLAB implementation with a pre-computed kernel
Since only a linear system needs to be solved, implementing LSSVM is very simple. A so-called pre-computed kernel is a user-defined kernel: unlike classical built-in kernels such as the linear kernel and the RBF kernel, here we define the function that generates the kernel matrix ourselves.
The pre-computed-kernel implementation of LSSVM is straightforward: the custom kernel matrices are passed directly into the training and prediction functions. The function that generates the kernel matrix is therefore factored out:
createKernelMatrix.m
function outputKernel = createKernelMatrix(data1, data2, param)
% input:
%   data1: train data matrix
%   data2: train or test data matrix
%   param: a structure containing all the parameters of the kernel
%     .kernelType: kernel function type ('lin' or 'rbf')
%     .gap : width of the RBF kernel
% output:
%   outputKernel: kernel matrix computed from data1 and data2

% kernel parameters extraction
type = param.kernelType;
gap = param.gap;

% kernel creation
switch type
    case 'lin'
        outputKernel = data1 * data2';
    case 'rbf'
        gap = 2 * gap * gap;
        N1sq = sum(data1 .^ 2, 2);
        N1 = size(data1, 1);
        N2sq = sum(data2 .^ 2, 2);
        N2 = size(data2, 1);
        % pairwise squared Euclidean distances between rows of data1 and data2
        D = (ones(N2, 1) * N1sq')' + ones(N1, 1) * N2sq' - 2 * data1 * data2';
        outputKernel = exp(-D / gap);
end
end
Only two classical kernels are defined here; you can add others yourself following the same pattern.
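A quick usage example (the random data is purely illustrative):

X = randn(5, 3);                        % 5 hypothetical samples, 3 features
kparam.kernelType = 'rbf';
kparam.gap = 1;
K = createKernelMatrix(X, X, kparam);   % 5 x 5 matrix, K(i,j) = exp(-||xi-xj||^2 / (2*gap^2))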
Training function implementation:
lssvmtrain.m
function model = lssvmtrain(kernel, label, param)
% input:
%   kernel: kernel matrix generated from the train data, eg: createKernelMatrix(trainData, trainData, param)
%   label : train label
%   param : a structure containing all the parameters of the model
% output:
%   model: trained model containing all information

% model parameters extraction
cost = param.cost;

% TrainKernelMatrix creation: Omega + I/gamma from equation (3)
nb_data = length(label);
TrainKernelMatrix = (label * label') .* kernel;
UnitMatrix = eye(nb_data, nb_data);
TrainKernelMatrix = TrainKernelMatrix + UnitMatrix ./ cost;

% construct the block system of equation (3)
rightPart = [0; ones(nb_data, 1)];
leftPart = [0, label'; label, TrainKernelMatrix];

% solve the linear system
results = linsolve(leftPart, rightPart);
bias = results(1);          % b
coef = results(2 : end);    % alpha
yalphas = label .* coef;    % y .* alphas

% save to model
model.coef = coef;
model.bias = bias;
model.oriKernel = kernel;
model.trainLabel = label;
model.yalphas = yalphas;
end
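As a quick sanity check (a sketch, not part of the original code; trainK, trainLabel, and modelParam are assumed to be defined as in the test script of section 3), the KKT conditions imply $\alpha_i = \gamma e_i$, so the training residuals follow directly from the multipliers:

% verify the equality constraint y_i * f(x_i) = 1 - e_i, with e_i = alpha_i / gamma
model = lssvmtrain(trainK, trainLabel, modelParam);
e = model.coef / modelParam.cost;               % e_i = alpha_i / gamma
fTrain = trainK * model.yalphas + model.bias;   % f(x_i) on the train set
max(abs(trainLabel .* fTrain - (1 - e)))        % should be close to zero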
Prediction function implementation:
lssvmpredict.m
function [predictLabel, accuracy, decisionValue] = lssvmpredict(model, kernel, testLabel)
% input:
% model: trained model
% kernel: kernel matrix generated by trainData and testData, eg: createKernelMatrix(trainData, testData, param)
% testLabel: label of test data
% output:
% predictLabel: vector
% accuracy: number
% decisionValue: decision values
nb_test = length(testLabel);
testKernel = kernel;                                          % nb_train x nb_test
yalphasMat = model.yalphas * ones(1, nb_test);                % replicate y .* alphas across test columns
decisionValue = sum(testKernel .* yalphasMat) + model.bias;   % equation (4): 1 x nb_test decision values
predictLabel = sign(decisionValue);
accuracy = sum(predictLabel' == testLabel) / nb_test;
end
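Putting the two functions together on toy data (a sketch; the random data and labels are illustrative only):

% toy end-to-end run with hypothetical data
Xtr = randn(40, 3); ytr = sign(randn(40, 1));   % 40 train samples
Xte = randn(10, 3); yte = sign(randn(10, 1));   % 10 test samples
kp.kernelType = 'rbf'; kp.gap = 1;
mp.cost = 1;
m = lssvmtrain(createKernelMatrix(Xtr, Xtr, kp), ytr, mp);
[pred, acc] = lssvmpredict(m, createKernelMatrix(Xtr, Xte, kp), yte);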
3. Accuracy comparison with SVM on the classic Ionosphere dataset
Below, the accuracy of the LSSVM just written is compared against LibSVM on the classic Ionosphere dataset.
test.m
clear; clc;
% ================================================ %
% ========= prepare train and test data ========== %
% ================================================ %
data = load('ionosphere.mat');
label = data.y;
data = data.x;
nbData = length(label);
trainRatio = 0.7;
nbTrain = fix(nbData * trainRatio);
% ===== select front trainRatio as train data ===== %
% trainIndice = 1: nbTrain;
% testIndice = nbTrain + 1 : nbData;
% ===== select train data randomly ===== %
randIndice = randperm(nbData);
trainIndice = randIndice(1: nbTrain);
testIndice = randIndice(nbTrain + 1 : nbData);
trainData = data(trainIndice, :);
trainLabel = label(trainIndice);
testData = data(testIndice, :);
testLabel = label(testIndice);
% ================================================ %
% ============ accuracy comparison =============== %
% ================================================ %
% == LibSVM
model = libsvmtrain(trainLabel, trainData, '-t 2 -c 1 -g 1');
[libsvm_label, libsvm_acc, libsvm_decv] = libsvmpredict(testLabel, testData, model);
% == LSSVM
kernelParam.kernelType = 'rbf';
kernelParam.gap = 1;
modelParam.cost = 1;
trainK = createKernelMatrix(trainData, trainData, kernelParam);
testK = createKernelMatrix(trainData, testData, kernelParam);
lssvm_model = lssvmtrain(trainK, trainLabel, modelParam);
[lssvm_label, lssvm_acc, lssvm_decv] = lssvmpredict(lssvm_model, testK, testLabel);
fprintf('\n\n\nTest accuracy of SVM is: %f,\nTest accuracy of LSSVM is: %f \n', libsvm_acc(1) /100, lssvm_acc);
Results obtained by randomly splitting the data into training and test sets and trying several kernel functions and parameter settings:
C:1, RBF, gap:1
Test accuracy of SVM is: 0.924528,
Test accuracy of LSSVM is: 0.896226
Test accuracy of SVM is: 0.943396,
Test accuracy of LSSVM is: 0.952830
Test accuracy of SVM is: 0.971698,
Test accuracy of LSSVM is: 0.971698
Test accuracy of SVM is: 0.933962,
Test accuracy of LSSVM is: 0.952830
C:1, RBF, gap:100
Test accuracy of SVM is: 0.688679,
Test accuracy of LSSVM is: 0.688679
Test accuracy of SVM is: 0.613208,
Test accuracy of LSSVM is: 0.603774
C:1, linear
Test accuracy of SVM is: 0.877358,
Test accuracy of LSSVM is: 0.839623
Test accuracy of SVM is: 0.886792,
Test accuracy of LSSVM is: 0.877358
Test accuracy of SVM is: 0.877358,
Test accuracy of LSSVM is: 0.905660
C:100, linear
Test accuracy of SVM is: 0.915094,
Test accuracy of LSSVM is: 0.905660
Test accuracy of SVM is: 0.858491,
Test accuracy of LSSVM is: 0.820755
Test accuracy of SVM is: 0.905660,
Test accuracy of LSSVM is: 0.849057
Reference
Suykens J A K. Least squares support vector machines[J]. International Journal of Circuit Theory and Applications, 2002, 27(6): 605-615.