I. PCA
1 PCA
Principal Component Analysis (PCA) is a common data analysis method. PCA uses a linear transformation to re-express the original data in a set of linearly independent dimensions. It can be used to extract the main feature components of the data and is often applied to dimensionality reduction of high-dimensional data.
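As a quick illustration of how this is typically done in practice, here is a minimal sketch using MATLAB's built-in pca function (this assumes the Statistics and Machine Learning Toolbox; the data is random demo data, not from this project):

X = randn(100, 5);              % 100 samples with 5 features (random demo data)
[coeff, score] = pca(X);        % columns of coeff are the principal directions
Xreduced = score(:, 1:2);       % keep only the first two principal components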
1.1 Dimension reduction
In data mining and machine learning, data is represented as vectors. For example, the annual traffic and transaction data of a Taobao store in 2012 can be regarded as a set of records, in which the data of each day is a record in the following format:
(date, page views, number of visitors, number of orders, number of transactions, transaction amount)
Here, “date” is a record label rather than a metric, and data mining is mostly concerned with metrics. Therefore, ignoring the date field, each record can be represented as a five-dimensional vector; one sample might look as follows:
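For illustration, a hypothetical record could be $(500, 240, 25, 13, 2312.15)^\mathsf{T}$.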
It is customary to use column vectors to represent a record, and this convention will be followed later in this article.
The complexity of many machine learning algorithms is closely related to, or even grows exponentially with, the dimension of the data. With only 5 dimensions this may not matter, but in practice it is not uncommon to deal with thousands or even hundreds of thousands of dimensions. In such cases the resource consumption of machine learning is unacceptable, so dimensionality reduction is applied to the data. Dimensionality reduction implies a loss of information, but since real data are often correlated across dimensions, the loss can be kept small.
For example, in the Taobao store data above, we know from experience that there is a strong correlation between “page views” and “number of visitors”, and between “number of orders” and “number of transactions”. When page views are high (or low) on a particular day, we can largely assume that the number of visitors is also high (or low) that day. Therefore, if we delete either the page-views or the visitors field, we do not lose much information; this reduces the dimension of the data and is an example of a dimensionality reduction operation. PCA is the mathematical formalization of this idea: a dimensionality reduction method with a rigorous mathematical basis that has been widely adopted.
1.2 Vectors and basis transformations
1.2.1 Inner product and projection
The inner product of two vectors of the same size is defined as follows:
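For vectors $A = (a_1, a_2, \ldots, a_n)^\mathsf{T}$ and $B = (b_1, b_2, \ldots, b_n)^\mathsf{T}$:

$$A \cdot B = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$$

Geometrically, $A \cdot B = |A|\,|B|\cos\theta$, where $\theta$ is the angle between the two vectors; in particular, if $|B| = 1$, the inner product equals the length of the projection of $A$ onto the direction of $B$.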
1.2.2 Basis
In algebra, a vector is often represented by the coordinates of the end point of a directed line segment. Say some vector has coordinates (3, 2); the 3 here really means that the projection of the vector on the x axis is 3, and the 2 means its projection on the y axis is 2. That is, we implicitly introduce two basis vectors of length 1 in the positive directions of the x and y axes. So the vector (3, 2) actually means a projection of 3 onto the x axis and 2 onto the y axis. Note that a projection value can be negative. The vector (x, y) actually represents a linear combination:
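$$(x, y) \;\Longleftrightarrow\; x\,(1,0)^\mathsf{T} + y\,(0,1)^\mathsf{T},$$ so in particular $(3,2)$ stands for $3\,(1,0)^\mathsf{T} + 2\,(0,1)^\mathsf{T}$.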
From this representation it follows that every two-dimensional vector can be written as such a linear combination. Here (1,0) and (0,1) are called a basis of the two-dimensional space.
The default basis (1,0), (0,1) is chosen for convenience, of course, because these are unit vectors in the positive directions of the x and y axes respectively, which makes point coordinates and vectors in the two-dimensional plane correspond one to one. But in fact any two linearly independent two-dimensional vectors can form a basis; intuitively, two linearly independent vectors in the plane are two vectors that do not lie on the same straight line.
Note that the basis in this example is orthogonal. The only requirement for a set of vectors to form a basis, however, is that they be linearly independent; a non-orthogonal basis is also allowed. But because orthogonal bases have nicer properties, the bases used in practice are generally orthogonal.
1.2.3 Matrix of basis transformation
A basis transformation like the one just described can be represented by matrix multiplication, i.e.:
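For instance, taking $(1/\sqrt{2}, 1/\sqrt{2})$ and $(-1/\sqrt{2}, 1/\sqrt{2})$ as the new basis (an assumed example; the original figures are not reproduced here), transforming the vector $(3, 2)$ reads:

$$\begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix} \begin{pmatrix} 3 \\ 2 \end{pmatrix} = \begin{pmatrix} 5/\sqrt{2} \\ -1/\sqrt{2} \end{pmatrix}$$

Each row of the matrix is one new basis vector, and the result is the coordinates of $(3,2)$ in the new basis.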
More generally, suppose we have M N-dimensional vectors and want to transform them into a new space spanned by R N-dimensional basis vectors. We first arrange the R basis vectors as the rows of a matrix A, then arrange the original vectors as the columns of a matrix B; the product AB is the transformation result, where the m-th column of AB is the transformed result of the m-th column of B. Expressed as matrix multiplication:
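$$\begin{pmatrix} p_1 \\ p_2 \\ \vdots \\ p_R \end{pmatrix} \begin{pmatrix} a_1 & a_2 & \cdots & a_M \end{pmatrix} = \begin{pmatrix} p_1 a_1 & p_1 a_2 & \cdots & p_1 a_M \\ p_2 a_1 & p_2 a_2 & \cdots & p_2 a_M \\ \vdots & \vdots & \ddots & \vdots \\ p_R a_1 & p_R a_2 & \cdots & p_R a_M \end{pmatrix}$$

where each $p_i$ is a basis vector written as a row and each $a_j$ is an original vector written as a column.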
1.3 Covariance matrix and optimization objectives
In data dimensionality reduction, the key problem is how to determine whether the selected basis is optimal, that is, which basis best preserves the characteristics of the original data. Let us say we have five pieces of data.
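For concreteness, suppose (hypothetical values) the five records, written as columns of a matrix, are

$$\begin{pmatrix} 1 & 1 & 2 & 4 & 2 \\ 1 & 3 & 3 & 4 & 4 \end{pmatrix}$$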
Take the average of each row, subtract that average from every entry of the row, and the result is:
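Continuing the hypothetical example, the row means are 2 and 3, so the centered matrix is

$$\begin{pmatrix} -1 & -1 & 0 & 2 & 0 \\ -2 & 0 & 0 & 1 & 1 \end{pmatrix}$$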
Expressed as coordinates, the columns of this matrix correspond to points in the plane (the original scatter plot is not reproduced here).
The question now is: how do we represent these records with one-dimensional values while retaining as much of the original information as possible? This amounts to choosing a direction in the two-dimensional plane, projecting all data points onto the line in that direction, and representing each original record by its projection value; that is, reducing two dimensions to one. How should this direction (or basis) be chosen so that as much original information as possible is retained? An intuitive answer is that the projected values should be as spread out as possible.
1.3.1 Variance
The problem above is that we want the projected values to be as spread out as possible along the chosen direction, and this degree of dispersion can be expressed mathematically by the variance, namely:
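For a field $a$ with $m$ samples,

$$\mathrm{Var}(a) = \frac{1}{m}\sum_{i=1}^{m}(a_i - \mu)^2,$$

and since every field has already been centered to mean zero, this simplifies to $\mathrm{Var}(a) = \frac{1}{m}\sum_{i=1}^{m} a_i^2$.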
Thus, the problem is formalized as: find a one-dimensional basis such that the variance of the data, after being transformed into coordinates on this basis, is maximized.
1.3.2 Covariance
Mathematically, the correlation between two fields can be expressed by their covariance, namely:
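For two centered fields $a$ and $b$ with $m$ samples,

$$\mathrm{Cov}(a, b) = \frac{1}{m}\sum_{i=1}^{m} a_i b_i.$$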
When the covariance is 0, the two fields are uncorrelated. In order for the covariance to be 0, the second basis vector can only be chosen in a direction orthogonal to the first. Therefore, the two directions finally chosen must be orthogonal.
At this point we obtain the optimization goal of the dimensionality reduction problem: to reduce a set of N-dimensional vectors to K dimensions (0 < K < N), choose K unit-length (norm 1) orthogonal basis vectors such that, after the original data are transformed onto this basis, the covariance between every pair of fields is 0, while the variance of each field is as large as possible (under the orthogonality constraint, take the K largest variances).
1.3.3 Covariance matrix
Assume there are only two fields, a and b, and form them into a matrix X by rows, where X is the centered matrix, i.e., each field has had its mean subtracted:
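$$X = \begin{pmatrix} a_1 & a_2 & \cdots & a_m \\ b_1 & b_2 & \cdots & b_m \end{pmatrix}, \qquad \frac{1}{m} X X^\mathsf{T} = \begin{pmatrix} \frac{1}{m}\sum_i a_i^2 & \frac{1}{m}\sum_i a_i b_i \\ \frac{1}{m}\sum_i a_i b_i & \frac{1}{m}\sum_i b_i^2 \end{pmatrix}$$

The diagonal entries are the variances of the two fields and the off-diagonal entries are their covariance, so both quantities in the optimization goal are unified in one matrix. More generally, for $m$ centered $n$-dimensional records arranged as the columns of $X$, $C = \frac{1}{m} X X^\mathsf{T}$ is the $n \times n$ covariance matrix.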
1.3.4 Diagonalization of the covariance matrix
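In brief: let $Y = PX$ be the data expressed in a new basis whose vectors form the rows of $P$. The covariance matrix of $Y$ is

$$D = \frac{1}{m} Y Y^\mathsf{T} = P \left(\frac{1}{m} X X^\mathsf{T}\right) P^\mathsf{T} = P C P^\mathsf{T}.$$

The goal is therefore to find an orthonormal $P$ that makes $D$ diagonal. Since $C$ is a real symmetric matrix, it has $n$ orthonormal eigenvectors; arranging them as rows, ordered by eigenvalue from largest to smallest, gives such a $P$, and its first $K$ rows form the basis sought for reducing the data to $K$ dimensions.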
1.4 Algorithm and examples
1.4.1 PCA algorithm
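Summarizing, to reduce $m$ records of $n$-dimensional data to $k$ dimensions: (1) arrange the original data as an $n \times m$ matrix $X$ by columns; (2) subtract the mean of each row (field); (3) compute the covariance matrix $C = \frac{1}{m} X X^\mathsf{T}$; (4) compute the eigenvalues and eigenvectors of $C$; (5) arrange the eigenvectors as rows, ordered by eigenvalue from largest to smallest, and take the first $k$ rows to form $P$; (6) $Y = P X$ is the data reduced to $k$ dimensions. A minimal MATLAB sketch of these steps (variable names are illustrative, not part of the source code below):

% Minimal PCA sketch: each column of X is one record
X = [1 1 2 4 2; 1 3 3 4 4];             % n x m data matrix (the hypothetical example above)
m = size(X, 2);
Xc = bsxfun(@minus, X, mean(X, 2));     % step 2: center each row (field)
C = (1/m) * (Xc * Xc');                 % step 3: covariance matrix
[V, D] = eig(C);                        % step 4: eigenvectors and eigenvalues
[~, idx] = sort(diag(D), 'descend');    % order components by decreasing eigenvalue
k = 1;                                  % target dimension
P = V(:, idx(1:k))';                    % step 5: top-k eigenvectors as rows
Y = P * Xc;                             % step 6: k x m reduced representation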
1.4.2 Example
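Continuing the hypothetical five-record example, the covariance matrix of the centered data is

$$C = \frac{1}{5}\begin{pmatrix} -1 & -1 & 0 & 2 & 0 \\ -2 & 0 & 0 & 1 & 1 \end{pmatrix}\begin{pmatrix} -1 & -2 \\ -1 & 0 \\ 0 & 0 \\ 2 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 6/5 & 4/5 \\ 4/5 & 6/5 \end{pmatrix},$$

whose eigenvalues are $\lambda_1 = 2$ and $\lambda_2 = 2/5$, with unit eigenvectors $(1/\sqrt{2}, 1/\sqrt{2})^\mathsf{T}$ and $(-1/\sqrt{2}, 1/\sqrt{2})^\mathsf{T}$. Taking $P = (1/\sqrt{2}\;\; 1/\sqrt{2})$ and projecting gives the one-dimensional representation

$$Y = P X = \begin{pmatrix} -3/\sqrt{2} & -1/\sqrt{2} & 0 & 3/\sqrt{2} & 1/\sqrt{2} \end{pmatrix}.$$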
1.5 Discussion
Based on the above explanation of the mathematical principles of PCA, we can understand some of its capabilities and limitations. In essence, PCA takes the directions of largest variance as the main features and "de-correlates" the data, that is, makes the different orthogonal directions uncorrelated.
Therefore, PCA also has limitations. For example, it removes linear correlation well, but it has no way to deal with higher-order dependence. For data with higher-order dependence, Kernel PCA can be considered, which converts non-linear dependence into linear correlation through a kernel function. In addition, PCA assumes that the main features of the data lie along orthogonal directions; if there are several directions of large variance that are not orthogonal, the effectiveness of PCA is greatly reduced.
Finally, it should be noted that PCA is a parameter-free technique: given the same data, if preprocessing is not considered, the result is always the same, with no subjective parameters to tune. This makes PCA convenient as a general-purpose method, but it also means it cannot be tuned for a particular problem.
II. Source code
function varargout = face(varargin)
% FACE MATLAB code for face.fig
% FACE, by itself, creates a new FACE or raises the existing
% singleton*.
%
% H = FACE returns the handle to a new FACE or the handle to
% the existing singleton*.
%
% FACE('CALLBACK',hObject,eventData,handles,...) calls the local
% function named CALLBACK in FACE.M with the given input arguments.
%
% FACE('Property','Value',...) creates a new FACE or raises the
% existing singleton*. Starting from the left, property value pairs are
% applied to the GUI before face_OpeningFcn gets called. An
% unrecognized property name or invalid value makes property application
% stop. All inputs are passed to face_OpeningFcn via varargin.
%
% *See GUI Options on GUIDE's Tools menu. Choose "GUI allows only one
% instance to run (singleton)".
%
% See also: GUIDE, GUIDATA, GUIHANDLES
% Edit the above text to modify the response to help face
% Last Modified by GUIDE v2.5 18-Dec-2014 12:02:18
% Begin initialization code - DO NOT EDIT
gui_Singleton = 1;
gui_State = struct('gui_Name',       mfilename, ...
                   'gui_Singleton',  gui_Singleton, ...
                   'gui_OpeningFcn', @face_OpeningFcn, ...
                   'gui_OutputFcn',  @face_OutputFcn, ...
                   'gui_LayoutFcn',  [], ...
                   'gui_Callback',   []);
if nargin && ischar(varargin{1})
gui_State.gui_Callback = str2func(varargin{1});
end
if nargout
[varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});
else
gui_mainfcn(gui_State, varargin{:});
end
% End initialization code - DO NOT EDIT
% --- Executes just before face is made visible.
function face_OpeningFcn(hObject, eventdata, handles, varargin)
% This function has no output args, see OutputFcn.
% hObject handle to figure
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
% varargin command line arguments to face (see VARARGIN)
% Choose default command line output for face
handles.output = hObject;
% Update handles structure
guidata(hObject, handles);
% UIWAIT makes face wait for user response (see UIRESUME)
% uiwait(handles.figure1);
% --- Outputs from this function are returned to the command line.
function varargout = face_OutputFcn(hObject, eventdata, handles)
% varargout cell array for returning output args (see VARARGOUT);
% hObject handle to figure
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
% Get default command line output from handles structure
varargout{1} = handles.output;
% --- Executes on button press in pushbutton1.
function pushbutton1_Callback(hObject, eventdata, handles)
% hObject handle to pushbutton1 (see GCBO)
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
% read image to be recognized
global im;
[filename, pathname] = uigetfile({'*.bmp'},'choose photo');
str = [pathname, filename];
im = imread(str);
axes( handles.axes1);
imshow(im);
% --- Executes on button press in pushbutton2.
function pushbutton2_Callback(hObject, eventdata, handles)
% hObject handle to pushbutton2 (see GCBO)
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
global im;
global reference;
global W;
global imgmean;
global col_of_data;
global pathname;
global img_path_list;
% Preprocess the image to be recognized
im = double(im(:));
objectone = W'*(im - imgmean);
distance = 100000000;
% minimum distance method: find the training picture closest to the image to be recognized
for k = 1:col_of_data
    temp = norm(objectone - reference(:,k));
    if (distance > temp)
        aimone = k;
        distance = temp;
        aimpath = strcat(pathname, '/', img_path_list(aimone).name);
        axes(handles.axes2);
        imshow(aimpath);
    end
end
% show test result
% aimpath = strcat(pathname, '/', img_path_list(aimone).name);
% axes( handles.axes2 )
% imshow(aimpath)
% --- Executes on button press in pushbutton3.
function pushbutton3_Callback(hObject, eventdata, handles)
% hObject handle to pushbutton3 (see GCBO)
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
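The recognition callback above relies on the globals W, reference, imgmean, col_of_data, pathname and img_path_list, which are presumably filled in by a training step not shown in this excerpt. Below is a minimal sketch of what such a training step might look like; the folder name, image format, number of eigenfaces and all variable roles are assumptions, not part of the original source.

% Hypothetical training step: build the PCA (eigenface) model used by pushbutton2_Callback.
% Assumes a folder of same-sized grayscale BMP training faces; names below are illustrative.
global W reference imgmean col_of_data pathname img_path_list;
pathname = 'train_faces';                          % assumed training folder
img_path_list = dir(fullfile(pathname, '*.bmp'));  % list of training images
col_of_data = length(img_path_list);
sample = double(imread(fullfile(pathname, img_path_list(1).name)));
data = zeros(numel(sample), col_of_data);          % each column is one vectorized face
for k = 1:col_of_data
    img = double(imread(fullfile(pathname, img_path_list(k).name)));
    data(:, k) = img(:);
end
imgmean = mean(data, 2);                           % mean face
datacentered = bsxfun(@minus, data, imgmean);      % center the training data
% Eigen-decomposition of the small (col_of_data x col_of_data) matrix for efficiency
[V, D] = eig(datacentered' * datacentered);
[~, idx] = sort(diag(D), 'descend');
numeig = min(20, col_of_data);                     % number of eigenfaces kept (assumed)
W = datacentered * V(:, idx(1:numeig));            % eigenfaces (projection basis)
W = bsxfun(@rdivide, W, sqrt(sum(W.^2, 1)));       % normalize each eigenface
reference = W' * datacentered;                     % projections of the training faces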
III. Operation results
IV. Note
Version: MATLAB 2014a