1. Derivation of the least squares support vector machine

The Least Squares Support Vector Machine (LSSVM), proposed by Suykens et al., builds on the standard Support Vector Machine (SVM) with three modifications (the standard SVM primal is recalled below for contrast):

  1. An error term $e_i$ replaces the slack variable $\xi_i$
  2. Equality constraints replace the inequality constraints
  3. A squared (quadratic) loss replaces the hinge loss
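
For contrast, the standard soft-margin SVM primal that these changes are applied to can be written in its usual textbook form as

$$
\min_{w,b,\xi}\ \frac{1}{2}\|w\|^{2}+C\sum_{i=1}^{N}\xi_{i}
\quad \text{s.t.}\quad y_{i}\left(w^{T}\phi(x_{i})+b\right)\ge 1-\xi_{i},\quad \xi_{i}\ge 0,\quad i=1,\dots,N.
$$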

The objective function of LSSVM is given in equation (1):

$$
\min_{w,b,e}\ J(w,b,e)=\frac{1}{2}\|w\|^{2}+\frac{\gamma}{2}\sum_{i=1}^{N}e_{i}^{2}
\quad \text{s.t.}\quad y_{i}\left(w^{T}\phi(x_{i})+b\right)=1-e_{i},\quad i=1,\dots,N \tag{1}
$$

where $\phi(x_i)$ is the sample projected into the Hilbert space; $w$ and $b$ are the classifier parameters, respectively the normal vector (slope) and the offset of the classification hyperplane; $e_i$ is the error term; and $\gamma$ is the regularization parameter that controls how strongly the error term is penalized.

As with SVM, the Lagrange multiplier method can be used to solve for the two classifier parameters. Introducing Lagrange multipliers $\alpha_i$, the Lagrangian is equation (2):

$$
L(w,b,e;\alpha)=J(w,b,e)-\sum_{i=1}^{N}\alpha_{i}\left[y_{i}\left(w^{T}\phi(x_{i})+b\right)-1+e_{i}\right] \tag{2}
$$
Taking the partial derivatives of $L$ with respect to $w$, $b$, $e_i$ and $\alpha_i$ and setting them to zero gives the optimality conditions (3):

$$
\frac{\partial L}{\partial w}=0 \Rightarrow w=\sum_{i=1}^{N}\alpha_{i}y_{i}\phi(x_{i}), \qquad
\frac{\partial L}{\partial b}=0 \Rightarrow \sum_{i=1}^{N}\alpha_{i}y_{i}=0,
$$
$$
\frac{\partial L}{\partial e_{i}}=0 \Rightarrow \alpha_{i}=\gamma e_{i}, \qquad
\frac{\partial L}{\partial \alpha_{i}}=0 \Rightarrow y_{i}\left(w^{T}\phi(x_{i})+b\right)-1+e_{i}=0 \tag{3}
$$
Unlike the quadratic programming (QP) problem required by SVM, LSSVM obtains its model parameters by solving a system of linear equations. The problem is convex and satisfies Slater's constraint qualification, so the optimum satisfies the Karush-Kuhn-Tucker (KKT) conditions. Substituting $w=\sum_{j}\alpha_{j}y_{j}\phi(x_{j})$ and $e_{i}=\alpha_{i}/\gamma$ into the last condition of (3) eliminates the variables $w$ and $e$, leaving

$$
\sum_{j=1}^{N}\alpha_{j}y_{i}y_{j}K(x_{i},x_{j})+y_{i}b+\frac{\alpha_{i}}{\gamma}=1,\qquad \sum_{i=1}^{N}\alpha_{i}y_{i}=0,
$$

where $K(x_{i},x_{j})=\phi(x_{i})^{T}\phi(x_{j})$ is the kernel function. In matrix form this is equation (4):

$$
\begin{bmatrix} 0 & y^{T} \\ y & \Omega+\gamma^{-1}I \end{bmatrix}
\begin{bmatrix} b \\ \alpha \end{bmatrix}=
\begin{bmatrix} 0 \\ 1_{N} \end{bmatrix} \tag{4}
$$

where $\Omega_{ij}=y_{i}y_{j}K(x_{i},x_{j})$, i.e. the kernel matrix multiplied elementwise by the label outer product $yy^{T}$; $I$ is the identity matrix and $1_{N}$ is an all-ones vector. Solving this linear system gives the Lagrange multipliers $\alpha$ and the offset $b$. As with SVM, the decision function for an unknown sample $x$ is equation (5):

$$
f(x)=\operatorname{sign}\!\left(\sum_{i=1}^{N}\alpha_{i}y_{i}K(x,x_{i})+b\right) \tag{5}
$$
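To make the link with the Matlab code below explicit, here is a minimal, self-contained sketch of equations (4) and (5); the random data, the plain linear kernel and all variable names are illustrative only:

rng(1);                                       % illustrative random data
N = 20; X = randn(N, 2); y = sign(randn(N, 1));
gamma = 1;                                    % regularization parameter
K = X * X';                                   % linear kernel, just for this sketch
Omega = (y * y') .* K;                        % Omega(i,j) = y_i * y_j * K(x_i, x_j)
A   = [0, y'; y, Omega + eye(N) / gamma];     % left-hand side of equation (4)
sol = linsolve(A, [0; ones(N, 1)]);           % right-hand side [0; 1_N]
b = sol(1); alpha = sol(2 : end);
f = sign(K * (alpha .* y) + b);               % equation (5), evaluated on the training samples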


2. Matlab implementation with a pre-computed kernel

Since only a linear system has to be solved, implementing LSSVM is very simple. A so-called pre-computed kernel is a user-defined kernel: instead of relying on built-in classical kernels such as the linear or RBF kernel, we write our own function that generates the kernel (Gram) matrix and pass that matrix to the training and prediction routines.
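For reference, each entry of the RBF kernel matrix built below is $K(i,j)=\exp\!\left(-\frac{\|x_{i}-x_{j}\|^{2}}{2\sigma^{2}}\right)$ with $\sigma=$ param.gap, and the pairwise squared distances are computed in one vectorized step using the identity $\|x_{i}-x_{j}\|^{2}=\|x_{i}\|^{2}+\|x_{j}\|^{2}-2x_{i}^{T}x_{j}$.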

The pre-computed kernel implementation of LSSVM is therefore straightforward: the custom kernel matrices are passed directly to the training and prediction functions, so the code that generates the kernel matrix is extracted into its own function:

createKernelMatrix.m

function outputKernel = createKernelMatrix(data1, data2, param)
% input:
%   data1: train data matrix
%   data2: train or test data matrix
%   param: a structure containing all the parameters of the kernel
%          .kernelType: kernel function type ('lin' or 'rbf')
%          .gap       : width of the RBF kernel
% output:
%   outputKernel: kernel matrix computed from data1 and data2

    % kernel parameter extraction
    type = param.kernelType;
    gap  = param.gap;

    % kernel creation
    switch type
        case 'lin'
            outputKernel = data1 * data2';
        case 'rbf'
            gap  = 2 * gap * gap;
            N1sq = sum(data1 .^ 2, 2);
            N1   = size(data1, 1);
            N2sq = sum(data2 .^ 2, 2);
            N2   = size(data2, 1);
            % squared Euclidean distances between all rows of data1 and data2
            D = (ones(N2, 1) * N1sq')' + ones(N1, 1) * N2sq' - 2 * (data1 * data2');
            outputKernel = exp(-D / gap);
    end
end
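A quick sanity check of the function (the values are illustrative):

X = rand(6, 4);                       % 6 samples, 4 features
kp.kernelType = 'rbf'; kp.gap = 1;
K = createKernelMatrix(X, X, kp);     % 6 x 6 symmetric Gram matrix with ones on the diagonal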

Only two classical kernels are defined here; other kernels can be added in the same way, as sketched below.
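For example, a polynomial kernel could be added as one more case in the switch; a sketch, where the degree is read from an extra field param.degree that is not used anywhere else in this post:

        case 'poly'
            % polynomial kernel (x1' * x2 + 1)^d; param.degree is an assumed extra field
            outputKernel = (data1 * data2' + 1) .^ param.degree;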

Training function implementation:

lssvmtrain.m

function model = lssvmtrain(kernel, label, param)
% input: 
% kernel: input kernel matrix generated by train data, eg: createKernelMatrix(trainData, trainData, param)
% label : train label
% param : a structure containing all the parameters of the model
% output: 
% model: trained model containing all information needed for prediction

    % model parameters extraction
    cost = param.cost;

    % TrainKernelMatrix creation: Omega + I/gamma from equation (4), with Omega = (y*y') .* K
    nb_data = length(label);

    TrainKernelMatrix = (label*label') .* kernel;
    UnitMatrix = eye(nb_data,nb_data);
    TrainKernelMatrix = TrainKernelMatrix + UnitMatrix ./ cost;
    
    % construct the linear system of equation (4)
    rightPart = [0; ones(nb_data, 1)];
    leftPart  = [0, label'; label, TrainKernelMatrix];

    % solve the linear system
    results = linsolve(leftPart, rightPart);
    bias = results(1);          % b
    coef = results(2 : end);    % alpha
    yalphas = label .* coef;    % y .* alpha

    % save to model
    model.coef = coef;
    model.bias = bias;
    model.oriKernel = kernel;
    model.trainLabel = label;
    model.yalphas = yalphas;
end

Prediction function implementation:

lssvmpredict.m

function [predictLabel, accuracy, decisionValue] = lssvmpredict(model, kernel, testLabel)
% input:
% model: trained model
% kernel: kernel matrix generated by trainData and testData, eg: createKernelMatrix(trainData, testData, param)
% testLabel: label of test data
% output:
% predictLabel: vector
% accuracy: number
% decisionValue: decision values

nb_test = length(testLabel);

testKernel = kernel;
yalphasMat = model.yalphas * ones(1, nb_test);

decisionValue = sum(testKernel .* yalphasMat) + model.bias;   % per test sample: sum_i alpha_i * y_i * K(x_i, x_test) + b
predictLabel = sign(decisionValue);

accuracy = sum(predictLabel' == testLabel) / nb_test;

end

3. Accuracy comparison with SVM on the classic ionosphere dataset

Below, we compare the accuracy of the LSSVM just written against LibSVM on the classic ionosphere dataset.

test.m

clear; clc;

% ================================================== %
% ========= prepare train and test data =========== %
% ================================================== %

data = load('ionosphere.mat');
label = data.y;
data = data.x;

nbData = length(label);
trainRatio = 0.7;
nbTrain = fix(nbData * trainRatio);

% ===== select front trainRatio as train data ===== %
% trainIndice = 1: nbTrain;
% testIndice = nbTrain + 1 : nbData;

% ===== select train data randomly ===== %
randIndice = randperm(nbData);
trainIndice = randIndice(1: nbTrain);
testIndice  = randIndice(nbTrain + 1 : nbData);

trainData = data(trainIndice, :);
trainLabel = label(trainIndice);
testData = data(testIndice, :);
testLabel = label(testIndice);

% ================================================== %
% ============= comparison process ================= %
% ================================================== %

% == LibSVM
model = libsvmtrain(trainLabel, trainData, '-t 2 -c 1 -g 1');
[libsvm_label, libsvm_acc, libsvm_decv] = libsvmpredict(testLabel, testData, model);


% == LSSVM
kernelParam.kernelType = 'rbf';
kernelParam.gap  = 1;
modelParam.cost = 1;

trainK = createKernelMatrix(trainData, trainData, kernelParam);
testK  = createKernelMatrix(trainData, testData, kernelParam);

lssvm_model = lssvmtrain(trainK, trainLabel, modelParam);
[lssvm_label, lssvm_acc, lssvm_decv] = lssvmpredict(lssvm_model, testK, testLabel);


fprintf('\n\n\nTest accuracy of SVM is: %f,\nTest accuracy of LSSVM is: %f \n', libsvm_acc(1) /100, lssvm_acc);

Results obtained by randomly selecting the training/test split and trying several kernel functions and parameter settings:

C:1, RBF, gap:1

Test accuracy of SVM is:   0.924528,
Test accuracy of LSSVM is: 0.896226 

Test accuracy of SVM is:   0.943396,
Test accuracy of LSSVM is: 0.952830 

Test accuracy of SVM is:   0.971698,
Test accuracy of LSSVM is: 0.971698 

Test accuracy of SVM is:   0.933962,
Test accuracy of LSSVM is: 0.952830 

C:1, RBF, gap:100

Test accuracy of SVM is:   0.688679,
Test accuracy of LSSVM is: 0.688679 

Test accuracy of SVM is:   0.613208,
Test accuracy of LSSVM is: 0.603774 

C:1, linear

Test accuracy of SVM is:   0.877358,
Test accuracy of LSSVM is: 0.839623 

Test accuracy of SVM is:   0.886792,
Test accuracy of LSSVM is: 0.877358 


Test accuracy of SVM is:   0.877358,
Test accuracy of LSSVM is: 0.905660 

C:100, linear

Test accuracy of SVM is:   0.915094,
Test accuracy of LSSVM is: 0.905660 

Test accuracy of SVM is:   0.858491,
Test accuracy of LSSVM is: 0.820755 

Test accuracy of SVM is:   0.905660,
Test accuracy of LSSVM is: 0.849057 

Reference

Suykens J A K. Least squares support vector machines[J]. International Journal of Circuit Theory & Applications, 2002, 27(6): 605-615.