[Path planning] Robot obstacle-avoidance path planning in MATLAB based on an RBF-optimized Q-learning algorithm

I. Introduction to RBF

In 1985, Powell proposed the radial basis function (RBF) method for multivariable interpolation. A radial basis function is a real-valued function whose value depends only on the distance from the origin, i.e., φ(x) = φ(‖x‖), or on the distance to some other point c, called the center, i.e., φ(x, c) = φ(‖x − c‖). Any function φ that satisfies φ(x) = φ(‖x‖) is called a radial basis function. The Euclidean distance is normally used (hence the name Euclidean radial basis function), although other distance functions are also possible. The most commonly used radial basis function is the Gaussian kernel, k(‖x − x_c‖) = exp(−‖x − x_c‖² / (2σ²)), where x_c is the center of the kernel and σ is the width parameter, which controls the radial range over which the function acts.
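As a small illustration of the definition above, the following MATLAB sketch evaluates a Gaussian radial basis function at a few points. The names `gaussRbf`, `c` and `sigma` and all numeric values are assumptions chosen for the example; they do not come from the original source code.

```matlab
% Gaussian radial basis function: the value depends only on the distance
% between the input x and the center c (illustrative names and values).
gaussRbf = @(x, c, sigma) exp(-sum((x - c).^2) / (2 * sigma^2));

c     = [0 0];   % center point
sigma = 1.0;     % width parameter

atCenter = gaussRbf([0 0], c, sigma);   % distance 0 from the center
nearA    = gaussRbf([1 0], c, sigma);   % distance 1
nearB    = gaussRbf([0 1], c, sigma);   % distance 1, different direction
farAway  = gaussRbf([5 0], c, sigma);   % distance 5

fprintf('center: %.4f, near: %.4f / %.4f, far: %.6f\n', ...
        atCenter, nearA, nearB, farAway);
```

The two points at distance 1 give the same value even though they lie in different directions, which is the "radial" property; the point at distance 5 gives a value close to zero.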

1.2 RBF neural network

An RBF network is a three-layer neural network consisting of an input layer, a hidden layer and an output layer. The transformation from the input space to the hidden-layer space is nonlinear, while the transformation from the hidden-layer space to the output-layer space is linear.

[Figure: structure diagram of the RBF network]

The basic idea of an RBF network is to use radial basis functions as the "basis" of the hidden units, which together span the hidden-layer space. The input vector is thus mapped directly into the hidden space, without weighted connections. Once the RBF centers are fixed, this mapping is fixed as well. The mapping from the hidden-layer space to the output space is linear: the network output is a linear weighted sum of the hidden-unit outputs, and these weights are the adjustable parameters of the network. The role of the hidden layer is to map vectors from the low-dimensional space (dimension p) to a higher-dimensional space (dimension h), so that a problem that is linearly inseparable in the low dimension can become linearly separable in the high dimension; this is essentially the kernel-function idea. The overall mapping from input to output is therefore nonlinear, while the output is linear in the adjustable parameters. The weights can then be obtained directly by solving a system of linear equations, which greatly speeds up learning and avoids the local-minimum problem.

The activation function of the RBF neural network can be expressed as:

$$R(x_p - c_i) = \exp\left(-\frac{\|x_p - c_i\|^2}{2\sigma^2}\right)$$

where x_p is the p-th input sample, c_i is the i-th center, h is the number of hidden-layer nodes, and n is the number of output nodes (output variables or classes). From the structure of the network, the j-th output for the input x_p is:

$$y_j = \sum_{i=1}^{h} w_{ij} \exp\left(-\frac{\|x_p - c_i\|^2}{2\sigma^2}\right), \quad j = 1, 2, \ldots, n$$

The error is measured with a least-squares loss function, where d_{pj} and y_{pj} denote the desired and actual values of output j for sample p:

$$E = \frac{1}{2} \sum_{p} \sum_{j=1}^{n} \left(d_{pj} - y_{pj}\right)^2$$
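The input-to-output mapping described above can be sketched in a few lines of MATLAB. This is only a minimal illustration under stated assumptions: a Gaussian activation with one shared width `sigma`, and made-up values for the inputs `X`, centers `C` and weights `W`. None of these names appear in the original source code.

```matlab
% Forward pass of a small RBF network (all numbers are illustration values).
X     = [0 0; 1 1; 2 0];       % P-by-p inputs: 3 samples with 2 features
C     = [0 0; 2 0];            % h-by-p centers: 2 hidden units
sigma = 1.0;                   % shared Gaussian width (assumption)
W     = [1.0 -1.0; 0.5 2.0];   % h-by-n weights: 2 hidden units, 2 outputs

P = size(X, 1);
h = size(C, 1);
Phi = zeros(P, h);             % hidden-layer outputs, one column per unit
for i = 1:h
    delta = bsxfun(@minus, X, C(i, :));                 % x_p - c_i for all samples
    Phi(:, i) = exp(-sum(delta.^2, 2) / (2 * sigma^2)); % Gaussian activation
end

Y = Phi * W;                   % linear output layer: weighted sum of hidden outputs
disp(Y);
```

The hidden layer is the only nonlinear step; once `Phi` is known, the output is a plain matrix product, which is why the weights can be found by solving linear equations.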

1.3 Learning in the RBF neural network

Three kinds of parameters must be determined: the centers of the basis functions, their variances (widths), and the weights from the hidden layer to the output layer.

(1) Self-organizing center selection. Step 1 is an unsupervised learning process that determines the centers and variances of the hidden-layer basis functions. Step 2 is a supervised learning process that determines the weights between the hidden layer and the output layer. First, h centers are chosen by K-means clustering. For a Gaussian radial basis function, the variance is then given by

$$\sigma_i = \frac{c_{\max}}{\sqrt{2h}}, \quad i = 1, 2, \ldots, h$$

where c_max is the maximum distance between the selected centers. The weights between the hidden layer and the output layer can be computed directly by the least-squares method: setting the partial derivative of the loss function with respect to w to zero gives

$$W = \Phi^{+} D = (\Phi^{T}\Phi)^{-1}\Phi^{T} D$$

where Φ is the matrix of hidden-layer outputs for all training samples, D is the matrix of desired outputs, and Φ⁺ is the pseudo-inverse of Φ (the second form applies when ΦᵀΦ is invertible).

(2) Direct calculation. The centers of the hidden-layer neurons are picked at random from the input samples and then kept fixed. Once the centers are fixed, the hidden-layer outputs are known, and the connection weights can be determined by solving a system of linear equations. This method is suitable when the distribution of the sample data is clearly representative.

(3) Supervised learning. The centers and the other weight parameters are all obtained from the training set through an error-correction learning process. The principle is the same as for a BP network, and gradient descent is likewise used. In this sense an RBF network can also be regarded as a kind of BP neural network.
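The two-step procedure of method (1) can be sketched as follows for a one-dimensional fitting problem. This is a hedged illustration, not the author's implementation: the target function `sin(2*pi*x)`, the number of hidden units, the hand-written k-means loop and the use of `pinv` for the least-squares step are all assumptions made for the example.

```matlab
% Two-step RBF training on a 1-D toy problem (illustrative values only).
x = linspace(0, 1, 21)';       % training inputs (P-by-1)
d = sin(2 * pi * x);           % desired outputs (assumed target function)
h = 5;                         % number of hidden units (assumption)

% Step 1 (unsupervised): plain k-means on the inputs to place the centers.
c = x(round(linspace(1, numel(x), h)));   % deterministic initial centers
for iter = 1:20
    dist = abs(bsxfun(@minus, x, c'));    % P-by-h sample-to-center distances
    [~, label] = min(dist, [], 2);        % nearest center for each sample
    for i = 1:h
        if any(label == i)
            c(i) = mean(x(label == i));   % move center to its cluster mean
        end
    end
end
cmax  = max(c) - min(c);       % largest distance between centers (1-D case)
sigma = cmax / sqrt(2 * h);    % width heuristic: c_max / sqrt(2h)

% Step 2 (supervised): least-squares weights from hidden layer to output.
Phi = exp(-bsxfun(@minus, x, c').^2 / (2 * sigma^2));   % P-by-h hidden outputs
w   = pinv(Phi) * d;           % minimizes ||Phi*w - d||^2
trainErr = max(abs(Phi * w - d));
fprintf('sigma = %.4f, max training error = %.4f\n', sigma, trainErr);
```

Because the weights enter linearly, step 2 needs no iteration: a single pseudo-inverse solve gives the least-squares optimum for the centers and width found in step 1.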

1.4 Differences between RBF and BP neural networks

1.4.1 Local approximation versus global approximation

In a BP neural network, the argument of a hidden node's activation function is the inner product of the input pattern and the weight vector, and the activation function is a sigmoid. Every adjustable parameter influences the network output to a comparable degree, so a BP network is a global approximator of the nonlinear mapping. In an RBF neural network, the argument of a hidden node's activation function is the distance (e.g., the Euclidean distance) between the input pattern and the center vector, and the activation function is a radial basis function (e.g., the Gaussian). The farther a neuron's input is from its RBF center, the lower the neuron's activation. The output of an RBF network therefore depends on only some of the adjustable parameters; for example, a given weight w_ij affects only one output y_j. This gives the RBF network its "local mapping" property.

Local approximation means that the target function is approximated using only the data near the query point. An RBF network usually uses the Gaussian radial basis function, whose graph decays on both sides and is radially symmetric. A hidden unit gives a meaningful response only when its center is very close to the query point (i.e., the input). When the center is far from the query point, the Euclidean distance is large and the unit's output tends to zero, so it contributes almost nothing to the result. The points that really matter are those very close to the query point, which is why this is called local approximation. A BP network's approximation of the target function, by contrast, depends on all the data, not only on the data near the query point.
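The local behaviour described above can be seen in a small MATLAB sketch that counts how many Gaussian units respond to one query point. The centers, the width, the query point and the threshold of 0.01 for a "noticeable" response are all assumptions picked for the illustration.

```matlab
% Which hidden units respond noticeably to a single query point?
centers = (0:1:10)';   % 11 Gaussian centers on a line (illustrative)
sigma   = 0.5;         % assumed width
query   = 3.2;         % query point, i.e. the network input

act    = exp(-(query - centers).^2 / (2 * sigma^2));   % activation of every unit
active = find(act > 0.01);                             % assumed threshold

fprintf('responding centers: %s\n', mat2str(centers(active)'));
```

Only the few units whose centers lie near the query point pass the threshold; the weights attached to all other units have almost no effect on the output for this input.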

1.4.2 Difference in hidden layers

A BP neural network can have several hidden layers, whereas an RBF network has only one.

1.4.3 Difference in training speed

An RBF network trains quickly, partly because it has only one hidden layer and partly because local approximation simplifies the computation: for a given input x, only some of the neurons respond, the others output approximately 0, and the corresponding weights w need not be adjusted.

RBF networks have the best-approximation property for continuous functions, whereas BP networks do not (a result due to Girosi and Poggio).

2 Introduction to Q-learning

Algorithm idea

Q-learning is a value-based reinforcement learning algorithm. Q stands for Q(s, a): the expected return obtained by taking action a (a ∈ A) in state s (s ∈ S) at a given moment. The environment feeds back a reward according to the agent's action. The main idea of the algorithm is therefore to arrange states and actions into a Q-table that stores the Q values, and then to choose the action with the largest expected return according to those Q values.
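A minimal tabular Q-learning sketch on a small grid may make the Q-table idea concrete. It uses the standard update Q(s,a) ← Q(s,a) + α[r + γ·max Q(s',a') − Q(s,a)] with ε-greedy action selection. The 6×6 grid, the two interior obstacles, the reward values and the parameters `alpha`, `gamma` and `epsilon` are all assumptions for this example; only the convention "obstacle = 1, free = 0" follows the `ws` matrix used in the source code, and the sketch does not include the RBF part of the method.

```matlab
% Tabular Q-learning on a small grid world (illustrative setup).
rng(1);                                 % fixed seed so the run is repeatable
ws = ones(6);  ws(2:5, 2:5) = 0;        % border cells are obstacles (1), inside is free (0)
ws(3, 4) = 1;  ws(4, 3) = 1;            % two made-up interior obstacles
startPos = [2 2];  goalPos = [5 5];
moves = [-1 0; 1 0; 0 -1; 0 1];         % the four actions as row/column offsets
alpha = 0.5;  gamma = 0.9;  epsilon = 0.2;   % assumed learning parameters
Q = zeros(6, 6, 4);                     % Q-table: one value per (row, col, action)

for episode = 1:1000
    s = startPos;
    for step = 1:100
        if rand < epsilon
            a = randi(4);                       % explore: random action
        else
            [~, a] = max(Q(s(1), s(2), :));     % exploit: best known action
        end
        nxt = s + moves(a, :);
        if ws(nxt(1), nxt(2)) == 1
            r = -5;  nxt = s;                   % hit an obstacle: penalty, stay put
        elseif isequal(nxt, goalPos)
            r = 10;                             % reached the target
        else
            r = -1;                             % ordinary step cost
        end
        best = max(Q(nxt(1), nxt(2), :));       % value of the best next action
        Q(s(1), s(2), a) = Q(s(1), s(2), a) + ...
            alpha * (r + gamma * best - Q(s(1), s(2), a));
        s = nxt;
        if isequal(s, goalPos), break; end
    end
end

% Follow the greedy policy from the start to read off the learned route.
s = startPos;  route = s;
for step = 1:20
    [~, a] = max(Q(s(1), s(2), :));
    nxt = s + moves(a, :);
    if ws(nxt(1), nxt(2)) == 1, break; end      % greedy move blocked: stop
    s = nxt;  route = [route; s];
    if isequal(s, goalPos), break; end
end
disp(route);
```

After training, reading the action with the largest Q value in each visited cell yields the route from the start cell to the target cell.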

II. Partial source code

function varargout = PathPlanning(varargin)
% Mobile robot path planning.
% Use inf=load('inf'), SP=inf.StartPoint, EP=inf.EndPoint, WS=inf.env to get the
% start point, target point and obstacle positions of the robot's working
% environment. The workspace boundary and obstacle cells are set to 1; free
% space is set to 0.
gui_Singleton = 1;
gui_State = struct('gui_Name',       mfilename, ...
                   'gui_Singleton',  gui_Singleton, ...
                   'gui_OpeningFcn', @Simulation_OpeningFcn, ...
                   'gui_OutputFcn',  @Simulation_OutputFcn, ...
                   'gui_LayoutFcn',  [], ...
                   'gui_Callback',   []);
if nargin && ischar(varargin{1})
    gui_State.gui_Callback = str2func(varargin{1});
end

if nargout
    [varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});
else
    gui_mainfcn(gui_State, varargin{:});
end
% End initialization code - DO NOT EDIT


% --- Executes just before GridSimulation is made visible.
function Simulation_OpeningFcn(hObject, eventdata, handles, varargin)
% This function has no output args, see OutputFcn.
% hObject    handle to figure
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
% varargin   command line arguments to GridSimulation (see VARARGIN)

% Choose default command line output for GridSimulation
handles.output = hObject;
% Update handles structure
guidata(hObject, handles);
% UIWAIT makes GridSimulation wait for user response (see UIRESUME)
% uiwait(handles.mainfig);
%cd D:\Simulation\EvolvingPath\path
cla
grid on
xlabel('X'); ylabel('Y');
handles.StartPoint=findobj('tag','StartPoint');   % handle of the "Set start point" button
handles.EndPoint=findobj('tag','EndPoint');       % handle of the "Set target point" button
handles.Obstacle=findobj('tag','Obstacle');       % handle of the "Set obstacles" button
handles.Start=findobj('tag','Start');             % handle of the "Start" button
handles.OldEnv=findobj('tag','OldEnv');           % handle of the "Restore environment" button
handles.MainAxes=findobj('tag','MainAxes');       % handle of the main axes
handles.MainFigure=findobj('tag','MainFigure');   % handle of the main window
% Initialization: set the button states
set(handles.StartPoint,'Enable','on')    % "Set start point" button is enabled
set(handles.EndPoint,'Enable','off')     % "Set target point" button is disabled
set(handles.Obstacle,'Enable','off')     % "Set obstacles" button is disabled
set(handles.Start,'Enable','off')        % "Start" button is disabled
set(handles.OldEnv,'Enable','off')       % "Restore environment" button is disabled
set(handles.MainFigure,'WindowButtonDownFcn','');
set(handles.MainFigure,'WindowButtonUpFcn','');
set(handles.MainAxes,'ButtonDownFcn','');
inf=load('inf');   % open the environment file inf.mat (created with save); it stores the start point, target point and obstacle information
XLim=30;           % maximum value of the X axis
YLim=30;           % maximum value of the Y axis
BreakTask=0;       % initialize the task-termination flag
for i=1:XLim       % mark the border cells as obstacles
    for j=1:YLim
        if ((i==1)|(i==XLim)|(j==1)|(j==YLim))
            ws(i,j)=1;
        end
    end
end
save('inf','ws','-append');
save('inf','BreakTask','-append');

% --- Outputs from this function are returned to the command line.
function varargout = Simulation_OutputFcn(hObject, eventdata, handles) 
% varargout  cell array for returning output args (see VARARGOUT);
% hObject    handle to figure
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)

% Get default command line output from handles structure
varargout{1} = handles.output;
% --- Executes on button press in StartPoint.
function StartPoint_Callback(hObject, eventdata, handles)
% hObject    handle to StartPoint (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
set(handles.StartPoint,'Enable','off')
set(handles.EndPoint,'Enable','on')
set(handles.Obstacle,'Enable','off')
set(handles.Start,'Enable','off')
flag=0;
save('inf','flag','-append');
set(handles.MainFigure,'WindowButtonDownFcn','');
set(handles.MainFigure,'WindowButtonUpFcn','');
set(handles.MainAxes,'ButtonDownFcn','PathPlanning(''MainAxes_ButtonDownFcn'',gcbo,[],guidata(gcbo))');
% --- Executes on button press in EndPoint.
function EndPoint_Callback(hObject, eventdata, handles)
% hObject    handle to EndPoint (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
set(handles.StartPoint,'Enable','off')
set(handles.EndPoint,'Enable','off')
set(handles.Obstacle,'Enable','on')
set(handles.Start,'Enable','on')
flag=1;
save('inf','flag','-append');
%set(handles.MainFigure,'WindowButtonDownFcn','');
%set(handles.MainFigure,'WindowButtonUpFcn','');
set(handles.MainAxes,'ButtonDownFcn','PathPlanning(''MainAxes_ButtonDownFcn'',gcbo,[],guidata(gcbo))');
% --- Executes on mouse press over axes background.
function MainAxes_ButtonDownFcn(hObject, eventdata, handles)
% hObject    handle to MainAxes (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
inf=load('inf');
flag=inf.flag;
start_end=inf.start_end;
p=get(handles.MainAxes,'CurrentPoint');
hold on;
if(flag==0)
    p=round(p);
    start_end(1,1)=p(1,1); start_end(1,2)=p(1,2);     % record the start point
    StartPoint(1,1)=p(1,1); StartPoint(1,2)=p(1,2);   % the current point becomes the start point
    save('inf','StartPoint','-append');
    HRobot=plot(start_end(1,1),start_end(1,2),'pentagram');
    text(start_end(1,1)-.5,start_end(1,2)-.5,'starting point');
    RobotDirection=inf.RobotDirection;
    x=start_end(1,1);
    y=start_end(1,2);
    RobotPosX=x;
    RobotPosY=y;
    save('inf','RobotPosX','-append');
    save('inf','RobotPosY','-append');
else
    p=round(p);
    start_end(2,1)=p(1,1); start_end(2,2)=p(1,2);
    EndPoint(1,1)=p(1,1); EndPoint(1,2)=p(1,2);   % the current point becomes the target point
    EndPoint=round(EndPoint);
    save('inf','EndPoint','-append');
    plot(start_end(2,1),start_end(2,2),'*','color','r')
    text(start_end(2,1)-.5,start_end(2,2)+.5,'Target point');
end
save('inf','start_end','-append');
set(handles.MainAxes,'ButtonDownFcn','');
% --- Executes on button press in Obstacle.
function Obstacle_Callback(hObject, eventdata, handles)
% hObject    handle to Obstacle (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
env=zeros(50);
save('inf','env','-append');
set(handles.StartPoint,'Enable','off')
set(handles.EndPoint,'Enable','off')
set(handles.Obstacle,'Enable','on')
set(handles.Start,'Enable','on')
set(handles.OldEnv,'Enable','on')   % "Restore environment" button is enabled
set(handles.MainFigure,'WindowButtonDownFcn','PathPlanning(''MainFigure_WindowButtonDownFcn'',gcbo,[],guidata(gcbo))');
%set(handles.MainFigure,'WindowButtonUpFcn','PathPlanning(''MainFigure_WindowButtonUpFcn'',gcbo,[],guidata(gcbo))');
function MainFigure_WindowButtonDownFcn(hObject, eventdata, handles)
% hObject    handle to MainFigure (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
    inf=load('inf'); 
    ws=inf.env;
    Pos=get(handles.MainAxes,'CurrentPoint');
    Pos=round(Pos);
    XPos=Pos(1,1); YPos=Pos(1,2);   % grid cell that was clicked
    X=[XPos-.5,XPos-.5,XPos+.5,XPos+.5];
    Y=[YPos-.5,YPos+.5,YPos+.5,YPos-.5];
    fill(X,Y,[0 0 0])   % draw the obstacle
    text(13-.2,12,'B','color',[1 1 1]);
    text(7-.2,8,'A','color',[1 1 1]);
  %  for i=XPos-1:XPos+1
  %      for j=YPos-1:YPos+1
  %          if((i>0)&(i<=XLim))   % keep the subscripts of the environment matrix positive
  %              if((j>0)&(j<=YLim))
                    ws(XPos,YPos)=1;
  %              end
  %          end
  %      end
  %  end
    env=ws;
    save('inf','env','-append');

% --- Executes on button press in SensorChecked.
function SensorChecked_Callback(hObject, eventdata, handles)
% hObject    handle to SensorChecked (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)


% Hint: get(hObject,'Value') returns toggle state of SensorChecked

function RobotVelocity_Callback(hObject, eventdata, handles)   % set the robot speed
% hObject    handle to RobotVelocity (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)

% Hints: get(hObject,'String') returns contents of RobotVelocity as text
%        str2double(get(hObject,'String')) returns contents of RobotVelocity as a double   
 
function RobotRadius_Callback(hObject, eventdata, handles)   % set the robot radius
% hObject    handle to RobotRadius (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)

% Hints: get(hObject,'String') returns contents of RobotRadius as text
%        str2double(get(hObject,'String')) returns contents of RobotRadius as a double
% --- Executes during object creation, after setting all properties.

function SensorMaxValue_Callback(hObject, eventdata, handles)   % set the sensor measurement range
% hObject    handle to SensorMaxValue (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)

% Hints: get(hObject,'String') returns contents of SensorMaxValue as text
%        str2double(get(hObject,'String')) returns contents of SensorMaxValue as a double    
    
function Handbook_Callback(hObject, eventdata, handles)
% hObject    handle to Handbook (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
uiopen('System Intro.txt',1)

function ClearScreen_Callback(hObject, eventdata, handles)   % start over
% hObject    handle to ClearScreen (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)

set(handles.StartPoint,'Enable','on')    % "Set start point" button is enabled
set(handles.EndPoint,'Enable','off')     % "Set target point" button is disabled
set(handles.Obstacle,'Enable','off')     % "Set obstacles" button is disabled
set(handles.Start,'Enable','off')        % "Start" button is disabled
cla
clear all
% --- Executes on button press in OldEnv.
function OldEnv_Callback(hObject, eventdata, handles)
% hObject    handle to OldEnv (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
XLim=30;   % maximum value of the X axis
YLim=30;   % maximum value of the Y axis
inf=load('inf');
ws=inf.env;
SP=inf.StartPoint;   % start point position
EP=inf.EndPoint;     % target point position
set(handles.StartPoint,'Enable','off')
set(handles.EndPoint,'Enable','off')
set(handles.Obstacle,'Enable','off')
set(handles.Start,'Enable','on')
HandleStart=line([SP(1,1) SP(1,1)],[SP(1,2) SP(1,2)]);    % draw the start point
text(SP(1,1)-.5,SP(1,2)-.5,'starting point');
HandleTarget=line([EP(1,1) EP(1,1)],[EP(1,2) EP(1,2)]);   % draw the target point
text(EP(1,1)-.5,EP(1,2)+.5,'Target point');
set(HandleStart,'marker','pentagram')
set(HandleTarget,'marker','*','color','r')
for i=1:XLim   % mark the border cells as obstacles
    for j=1:YLim
        if ((i==1)|(i==XLim)|(j==1)|(j==YLim))
            ws(i,j)=1;
        end
    end
end
for i=2:XLim-1   % restore the obstacle information
    for j=2:YLim-1
        if((ws(i,j)==1))
            X=[i-.5,i-.5,i+.5,i+.5];
            Y=[j-.5,j+.5,j+.5,j-.5];
            fill(X,Y,[0 0 0]);   % draw the obstacle
        end
    end
end
X=[1-.5,1-.5,1+.5,1+.5];
Y=[6-.5,6+.5,6+.5,6-.5];
fill(X,Y,[0 0 0]);   % draw an obstacle
X=[1-.5,1-.5,1+.5,1+.5];
Y=[14-.5,14+.5,14+.5,14-.5];
fill(X,Y,[0 0 0]);   % draw an obstacle
X=[10-.5,10-.5,10+.5,10+.5];
Y=[1-.5,1+.5,1+.5,1-.5];
fill(X,Y,[0 0 0]);   % draw an obstacle
text(13-.2,12,'B','color',[1 1 1]);
text(7-.2,8,'A','color',[1 1 1]);
X=[0,0,.5,.5];
Y=[0,YLim,YLim,0];
fill(X,Y,[0 0 0]);   % draw the left boundary as an obstacle
X=[XLim-.5,XLim-.5,XLim,XLim];
Y=[0,YLim,YLim,0];
fill(X,Y,[0 0 0]);   % draw the right boundary as an obstacle
X=[0,0,XLim,XLim];
Y=[YLim-.5,YLim,YLim,YLim-.5];
fill(X,Y,[0 0 0]);   % draw the top boundary as an obstacle
X=[0,0,XLim,XLim];
Y=[0,.5,.5,0];
fill(X,Y,[0 0 0]);   % draw the bottom boundary as an obstacle
%axis([0 20 0 20])
% Set the robot position to the start position
RobotPosX=SP(1,1);
RobotPosY=SP(1,2);
save('inf','RobotPosX','-append');
save('inf','RobotPosY','-append');

III. Results

IV. MATLAB version and references

1 MATLAB version: 2014a

2 References

[1] Bao Ziyang, Yu Jizhou, Yang Shan. Intelligent Optimization Algorithms and MATLAB Examples (2nd edition) [M]. Publishing House of Electronics Industry, 2016.
[2] Zhang Yan, Wu Shuigen. MATLAB Optimization Algorithm Source Code [M]. Tsinghua University Press, 2017.
