A list,

SIFT (scale invariant feature Transform) is a description used in image processing field. This description is a local feature descriptor with scale invariance and key points can be detected in the image. 1 SIFT algorithm features: (1) has good stability and invariance, can adapt to rotation, scale scaling, brightness changes, to a certain extent, not by the change of perspective, affine transformation, noise interference. (2) the distinction between sex good, can fast and exact distinction in mass characteristic database information matching (3) the quantity more, even if only a single object, can also produce a large number of feature vector (4) recommend suite, quickly eigenvector matching (5) the scalability, can be combined with other forms of feature vector

2 SIFT algorithm essence

Key points are searched in different scale Spaces and the directions of key points are calculated.



3 SIFT algorithm to achieve feature matching mainly has the following three processes:

(1) Extraction of key points: Key points are some very prominent points that will not disappear due to lighting, scale, rotation and other factors, such as corner points, edge points, bright spots in dark areas and dark spots in bright areas. This step is to search for image positions on all scale Spaces. Potential points of interest with invariant scale and rotation are identified by gaussian differential functions.

(2) Location of key points and determination of feature direction: In each candidate location, a fine fitting model is used to determine the location and scale. Key points are chosen based on their stability. Then, one or more directions are assigned to each key point based on the local gradient direction of the image. All subsequent operations on the image data transform relative to the direction, scale, and position of the key points, thus providing invariance to these transformations.

(3) Through the feature vector of each key point, pairwise comparison is made to find several pairs of feature points matching each other and establish the corresponding relationship between scenes.

4-scale space

(1) Concept

Scale space is a concept and method that attempts to simulate human eyes observing objects in the image field. For example, when looking at a tree, the key is whether we want to look at the leaves or the whole tree: if we look at the whole tree (equivalent to looking at a large scale), we should strip out the details of the image. If it’s a leaf (at a small scale), look for local details.

SIFT algorithm adopts Gaussian kernel function to filter in the construction of scale space, so that the original image saves the most detail features. After Gaussian filtering, the detail features are gradually reduced to simulate the feature representation at large scale.

There are two main reasons for using gaussian kernel function for filtering:

A Gaussian kernel is the only scale-invariant kernel.

B DoG kernel function can be approximated as LoG function, which makes feature extraction easier. Meanwhile, David Lowe proposed in his paper that filtering the original image after two-fold up-sampling can retain more information for subsequent feature extraction and matching. In fact, the image generation of scale space is the image generated after the convolution operation of the current image with different scale kernel parameter σ.

(2) represents

L(x, y, σ) is defined as the convolution operation of the original image I(x, y) with a variable scale 2-dimensional Gaussian function G(x, y, σ).



The construction of the Gauss Pyramid

(1) Concept

The scale space is represented by gauss pyramid in implementation, and the construction of Gauss pyramid is divided into two steps:

A Perform Gaussian smoothing on the image;

B Downsampling the image.



The pyramid model of image is a tower model composed of a series of images of different sizes, from large to small, from bottom to top, by continuously reducing the sampling of the original image. The original image is the first layer of the golden tower, and the new image obtained by each descending sampling is one layer of the pyramid (one image for each layer), and each pyramid has N layers. In order to show the continuity of scale, gaussian filtering is applied to gaussian pyramid based on simple downsampling. As shown in the figure above, an image of each layer of the image pyramid is gaussian blurred with different parameters, Octave represents the number of image groups that can be produced by an image, and Interval represents the number of image layers included in a group of images. In addition, when downsampling, the initial image (bottom image) of a set of images on the Gaussian pyramid is obtained by intersampling the third to last image of the previous set of images.

(2) represents

Gaussian image pyramid with O group, S layer, then



6 DOG space extremum detection

(1) DOG function



(2) DoG Gaussian difference pyramid

A corresponds to the DOG operator, and the DOG pyramid needs to be constructed.

The change of pixel value on the image can be seen through gaussian difference image. (If there is no change, there is no character. Features must vary as many points as possible. The DOG image depicts the outline of the target.



B DOG local extremum detection

The characteristic points are made up of local extreme points in the DOG space. To find the extreme point of the DoG function, each pixel is compared with all its neighbors to see if it is larger or smaller than its neighbors in the image and scale domains. The characteristic points are made up of local extreme points in the DOG space. To find the extreme point of the DoG function, each pixel is compared with all its neighbors to see if it is larger or smaller than its neighbors in the image and scale domains. As shown in the figure below, the middle detection point is compared with its 8 adjacent points at the same scale and 9×2 corresponding points at the upper and lower adjacent scales, a total of 26 points, to ensure that extreme points are detected in both scale space and two-dimensional image space.



B removing edge effect

In the direction of edge gradient, the main curvature value is larger, while in the direction of edge, the main curvature value is smaller. The principal curvature of the DoG function D(x) of the candidate feature points is proportional to the eigenvalues of the 2×2Hessian matrix H.





7 Key points are assigned

(1) To find the extreme point through scale invariance, it is necessary to use the local features of the image to assign a reference direction to each key point, so that the descriptor has invariance to the image rotation. For the key points detected in the DOG pyramid, the gradient and direction distribution characteristics of the pixels in the 3σ neighborhood window of the Gaussian pyramid image are collected. The modulus and direction of the gradient are as follows:



(2) This algorithm adopts the gradient histogram statistics method, which takes the key points as the origin and determines the direction of the key points by the image pixels in a certain area. After the gradient calculation of key points is completed, the histogram is used to calculate the gradient and direction of pixels in the neighborhood. The gradient histogram divides the orientation range from 0 to 360 degrees into 36 columns of 10 degrees each. As shown in the figure below, the peak direction of the histogram represents the main direction of the key point, and the peak value of the direction histogram represents the direction of the neighborhood gradient at the feature point. The maximum value in the histogram is taken as the main direction of the key point. In order to enhance the robustness of matching, only the direction whose peak value is more than 80% of the main direction is reserved as the secondary direction of this key point.



8 Key Points

For each key point, there is location, scale, and direction. Create a descriptor for each key point, and use a set of vectors to describe the key point, so that it does not change with various changes, such as lighting changes, Angle of view changes, etc. This descriptor not only includes the key points, but also the pixels around the key points that contribute to it, and the descriptor should have a high uniqueness in order to improve the probability of correct matching of feature points.



Lowe’s experiment results showed that the descriptor was characterized by 4×4×8 = 128 dimension vector, and the comprehensive effect was the best (invariance and uniqueness).

9 Key Point Matching

(1) Template image (reference image, reference image) and real-time image (observation image,

Observation image) to establish a key point description sub-set. The target recognition is accomplished by comparing the key point descriptors in the two-point set. Euclidean distance is used to measure the similarity of key point descriptors with 128 dimensions.

(3) The matching can be completed by exhaustive method, but it takes too much time. Therefore, the data structure of KD tree is generally used to complete the search. The content of the search is based on the key points of the target image to search the nearest feature point of the original image and the next adjacent feature point of the original image.

The Kd tree, as shown below, is a balanced binary tree



10 summary

SIFT features with stability and invariability, in the field of image processing and computer vision has a very important role, its itself is very complex, because the contact SIFT is not for a long time, the relevant knowledge of which is also very inadequate, after multiple reference, write this article, the content is not detailed, hope more forgive me. The following is a rough summary of the SIFT algorithm.

(1) Extreme value detection of DoG scale space.

(2) Delete unstable extremum points.

(3) Determine the main direction of feature points

(4) Generate descriptors of feature points for key point matching.

Ii. Source code

function varargout = ImageRegistration(varargin)
% IMAGEREGISTRATION MATLAB code for ImageRegistration.fig
%      IMAGEREGISTRATION, by itself, creates a new IMAGEREGISTRATION or raises the existing
%      singleton*.
%
%      H = IMAGEREGISTRATION returns the handle to a new IMAGEREGISTRATION or the handle to
%      the existing singleton*.
%
%      IMAGEREGISTRATION('CALLBACK',hObject,eventData,handles,...) calls the local
%      function named CALLBACK in IMAGEREGISTRATION.M with the given input arguments.
%
%      IMAGEREGISTRATION('Property'.'Value',...). creates anew IMAGEREGISTRATION or raises the
%      existing singleton*.  Starting from the left, property value pairs are
%      applied to the GUI before ImageRegistration_OpeningFcn gets called.  An
%      unrecognized property name orinvalid value makes property application % stop. All inputs are passed to ImageRegistration_OpeningFcn via varargin. % %  *See GUI Options on GUIDE's Tools menu.  Choose "GUI allows only one % instance to run (singleton)". % % % Edit the above text to modify the response to help ImageRegistration % % Begin initialization code - DO NOT EDIT  gui_Singleton =1;
gui_State = struct('gui_Name',       mfilename, ...
                   'gui_Singleton',  gui_Singleton, ...
                   'gui_OpeningFcn', @ImageRegistration_OpeningFcn, ...
                   'gui_OutputFcn',  @ImageRegistration_OutputFcn, ...
                   'gui_LayoutFcn', [],...'gui_Callback'[]);if nargin && ischar(varargin{1})
    gui_State.gui_Callback = str2func(varargin{1});
end

if nargout
    [varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});
else
    gui_mainfcn(gui_State, varargin{:});
end
% End initialization code - DO NOT EDIT

addpath(pwd);
% --- Executes just before ImageRegistration is made visible.
function ImageRegistration_OpeningFcn(hObject, eventdata, handles, varargin)
% This function has no output args, see OutputFcn.
% hObject    handle to figure
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
% varargin   command line arguments to ImageRegistration (see VARARGIN)5

% Choose default command line output for ImageRegistration
handles.output = hObject; % eliminate axes axesset(handles.axes1,'Xtick'[],'Ytick'[]);set(handles.axes1,'Xcolor'[1 1 1].'Ycolor'[1 1 1]);
set(handles.axes2,'Xtick'[],'Ytick'[]);set(handles.axes2,'Xcolor'[1 1 1].'Ycolor'[1 1 1]);
set(handles.axes3,'Xtick'[],'Ytick'[]);set(handles.axes3,'Xcolor'[1 1 1].'Ycolor'[1 1 1]);
set(handles.axes4,'Xtick'[],'Ytick'[]);set(handles.axes4,'Xcolor'[1 1 1].'Ycolor'[1 1 1]);
set(handles.axes5,'Xtick'[],'Ytick'[]);set(handles.axes5,'Xcolor'[1 1 1].'Ycolor'[1 1 1]);

% Update handles structure
guidata(hObject, handles);

% UIWAIT makes ImageRegistration wait for user response (see UIRESUME)
% uiwait(handles.figure1);


% --- Outputs from this function are returned to the command line.
function varargout = ImageRegistration_OutputFcn(hObject, eventdata, handles) 
% varargout  cell array for returning output args (see VARARGOUT);
% hObject    handle to figure
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)

% Get default command line output from handles structure
varargout{1} = handles.output;


% --- Executes on button press in pushbutton1.
function pushbutton1_Callback(hObject, eventdata, handles)
% hObject    handle to pushbutton1 (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)CLC %%%%%%%%%%%%%%% calls openImage. m to read in the reference image and get the filename, image size %%% global Image_I;
   Image_I.FileInformation.IsImage=0;
   

   while Image_I.FileInformation.IsImage==0
       Image_I1=OpenImage;
    if Image_I.flag ==1
%        delete(Image_I.figure1);
       break;
   end
   end

   if Image_I.flag==0
       delete(Image_I.figure1);
       handles.ImsizeI=Image_I.FileInformation.imsize;
       handles.filenameI=Image_I.FileInformation.filename;
       handles.names_dispI=Image_I.FileInformation.names_disp;
       set(handles.text1,'String',handles.names_dispI); guidata(hObject,handles); %%%%%%%%%%%% display reference image axes(handles. Axes1) I=imread(handles. FilenameI); imshow(I) end % --- Executes on button press in pushbutton2.function pushbutton2_Callback(hObject, eventdata, handles)
% hObject    handle to pushbutton2 (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)CLC %%%%%%%% calls openImage. m to read the floating image and get the file name and image size, global Image_I;
  Image_I.FileInformation.IsImage=0;
  while Image_I.FileInformation.IsImage==0
      Image_J=OpenImage;
    if Image_I.flag ==1
%        delete(Image_I.figure1);
       break;
    end
  end
  
  if Image_I.flag==0
      delete(Image_I.figure1);
      handles.ImsizeJ=Image_I.FileInformation.imsize;
      handles.filenameJ=Image_I.FileInformation.filename;
      handles.names_dispJ=Image_I.FileInformation.names_disp;
      set(handles.text2,'String',handles.names_dispJ); guidata(hObject,handles); % % % % % % % % floating images displayed % % % % % % % % % % % % % % % % % % % % axes (handles. Axes2); J=imread(handles.filenameJ); imshow(J) end % --- Executes on button press in pushbutton3.function pushbutton3_Callback(hObject, eventdata, handles)
% hObject    handle to pushbutton3 (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
   clc; % % % % % % % % % % % % % % check has input the reference image and floating axesIbox = get (handles axes1,'box');
  axesJbox=get(handles.axes2,'box');
  if strcmp(axesIbox,'off')||strcmp(axesJbox,'off')
      errordlg('Please select Image for Registration'.'Error');
      error('No Image! '); End %%%%%%%%%%%%% checks whether the reference image is the same size as the floating image handles. IsSameSizeIJ =strcmp(handles.ImsizeI,handles.ImsizeJ);
  if handles.isSameSizeIJ~=1
      errordlg('Please Select the Same Size Image'.'Error');
      error('ImageSize doesn''t match! '); End %%%%%%%%%%%% reads and copies images, one for the registration process and the other for output BaseImage=imread(handles. FilenameI) after registration; RegisterImage=imread(handles.filenameJ); Tic [img0,diff] = imMosaic(BaseImage,RegisterImage,1);
  toc
  ElapsedTime=toc;
  ElapsedTime=sprintf('Elapsed Time=[%.3f]',ElapsedTime); Axes (handles. Axes3); imshow(img0); Axes (handles. Axes4) imshow(diff); % Set a threshold to eliminate points where registration is inaccurateif numel(size(diff))>2
    diff1 = rgb2gray(diff);
else
    diff1 = diff;
end
Copy the code

3. Operation results





Fourth, note

Version: 2014 a