I. Introduction

1 Text-based retrieval

There are two kinds of music retrieval: text-based and content-based. Text-based retrieval is the most common way to find a song: the user enters the song title, the artist's name, or a line of the lyrics. It works by tagging every track in the music library with metadata, so that each track carries its title, singer, and lyrics, and keyword search over these tags is usually served by an inverted index. The premise of text-based retrieval is that the user already knows something about the song, which is enough in most cases, but this assumption can also be a drawback. Often the music a user wants to find is a song overheard while walking down the street: if the snippet is pop music, it is unrealistic to expect the user to have memorized the lyrics, and if it is instrumental music, text-based retrieval is useless altogether. In addition, tagging millions of songs is a time-consuming task.
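As a minimal sketch of the inverted-index lookup described above (Python here purely for illustration; the three-song catalogue is made up):

```python
from collections import defaultdict

# Toy catalogue: song id -> searchable metadata (title, artist, lyrics).
songs = {
    1: "yesterday the beatles all my troubles seemed so far away",
    2: "let it be the beatles speaking words of wisdom",
    3: "imagine john lennon you may say i am a dreamer",
}

# Build the inverted index: each keyword maps to the set of song ids
# whose metadata contains it.
index = defaultdict(set)
for song_id, text in songs.items():
    for word in text.split():
        index[word].add(song_id)

def search(query):
    """Return ids of songs whose metadata contains every query keyword."""
    ids = [index.get(w, set()) for w in query.lower().split()]
    return set.intersection(*ids) if ids else set()

print(search("the beatles"))  # every song tagged with both keywords
```

Because the index is built once, each query only touches the postings for its keywords instead of scanning the whole catalogue, which is what makes this approach scale to millions of tagged songs.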

To meet the demand of users who want to search anytime and anywhere, content-based music retrieval has emerged. It does not require the user to provide keywords; instead, it searches through the raw audio itself. It takes two forms: humming retrieval and recording retrieval.

2 Humming search

Humming retrieval is a hot topic in music retrieval. It works as follows: the user hums a fragment of a melody, usually 10 to 15 seconds long, and uploads it to the server, which returns the most similar songs through similarity matching. Instead of matching against the original audio directly, the server first extracts features from the fragment and retrieves using those features; the most commonly used feature is the fundamental-frequency (pitch) sequence, so the core of humming retrieval is similarity matching between fundamental-frequency sequences. Because the user's humming can never be exactly identical to the actual fragment in the library, this is a fuzzy match. Many fuzzy-matching methods exist, such as string edit distance and the dynamic time warping (DTW) algorithm. For accuracy, DTW is usually preferred in humming retrieval: it is a dynamic-programming algorithm with high complexity, but it is well suited to fuzzy matching between time series and achieves the highest accuracy. Although DTW is computationally expensive, it can be accelerated on GPUs, FPGAs, and other hardware, so retrieval speed is not currently a bottleneck.
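As a toy illustration of the matching step (not the feature-extraction pipeline), the DTW distance between two pitch sequences can be computed with the classic dynamic-programming recurrence; the pitch values below are made up:

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    INF = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])              # local distance
            cost[i][j] = d + min(cost[i - 1][j],      # stretch a
                                 cost[i][j - 1],      # stretch b
                                 cost[i - 1][j - 1])  # one-to-one match
    return cost[n][m]

# A hummed contour that lingers on some notes is a stretched copy of the
# reference; DTW still aligns it perfectly, unlike point-by-point comparison.
reference = [60, 62, 64, 65, 64, 62, 60]
hummed    = [60, 60, 62, 64, 64, 65, 64, 62, 60]
print(dtw_distance(reference, hummed))  # prints 0.0
```

The double loop is why the complexity is high, O(n·m) per comparison, and also why the per-cell independence makes the algorithm a good fit for GPU or FPGA acceleration as mentioned above.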



At present, the biggest problem in humming retrieval is accuracy. DTW handles the matching well, but extracting an accurate fundamental-frequency sequence remains the hard part: for polyphonic (multi-part) music, extraction accuracy is currently only about 75%, and DTW cannot compensate for matching with the wrong features. Extraction from polyphonic music matters because most existing recordings are polyphonic, so extracting features from them is the natural way to build a very large feature library. The difficulty is that polyphonic music carries a large amount of instrumental background sound, which acts as noise when extracting the vocal line. Improving the accuracy of fundamental-frequency extraction from polyphonic music is therefore the key to humming retrieval; once extraction accuracy exceeds 95%, humming retrieval will deliver far greater value. The best-known commercial humming-retrieval product at present is MIDOMI.

3 Recording Retrieval

The name may not be ideal; in the QQ Music app this feature is called "Identify Songs by Listening". As the name suggests, recording retrieval works by recording a clip of the music that is playing and uploading it to the server for retrieval. The difference from humming retrieval is that the user does not hum anything; they simply record the music being played, which makes this method simpler and more convenient to use. Because the recording captures the original track, recording retrieval is not fuzzy matching but exact matching, and it uses a different technique from humming retrieval. The technique has a history of over ten years and its accuracy is very high. The most famous app is probably Shazam, but QQ Music's song identification is also very good.



The basic principle of the algorithm is to select local energy maxima from the music's time-frequency representation as landmarks, combine pairs of landmarks into fingerprints, and store them in a database. Matching works like this: fingerprints extracted from the recorded fragment should reappear in the original track, so retrieval amounts to finding the moment in the original music where the two sets of fingerprints overlap heavily. Figuratively, the fingerprints of the original track are laid out on a long strip and the recorded clip is a short strip; slide the short strip along the long one from the beginning until a large number of fingerprints line up. The implementation will be described in detail in a future post.
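The sliding-strip matching can be sketched as an offset-voting scheme: every shared fingerprint votes for a (track, time offset) pair, and a true match produces many votes at one offset. All hashes and times below are made up; a real system derives each hash from a pair of spectral landmarks:

```python
from collections import Counter, defaultdict

def build_index(tracks):
    """Map fingerprint hash -> list of (track_id, time) occurrences."""
    index = defaultdict(list)
    for track_id, prints in tracks.items():
        for h, t in prints:
            index[h].append((track_id, t))
    return index

def match(index, query_prints):
    """Return the ((track_id, offset), votes) with the most aligned fingerprints.

    Sliding the short strip along each track amounts to counting, per
    candidate track, how many fingerprints agree on the same time offset
    (track_time - query_time).
    """
    votes = Counter()
    for h, qt in query_prints:
        for track_id, t in index.get(h, []):
            votes[(track_id, t - qt)] += 1
    return votes.most_common(1)[0] if votes else None

tracks = {
    "song_a": [(11, 0), (27, 1), (33, 2), (45, 3), (27, 9)],
    "song_b": [(11, 5), (99, 6), (33, 7)],
}
index = build_index(tracks)
# Query recorded starting 1 s into song_a: same hashes, shifted times.
query = [(27, 0), (33, 1), (45, 2)]
print(match(index, query))  # prints (('song_a', 1), 3)
```

Because only exact hash collisions are considered, this is the exact matching described above: a hummed (distorted) query would produce different hashes and gather no votes, which is why recording retrieval and humming retrieval need different techniques.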

II. Source code

function varargout = recognition(varargin)
% RECOGNITION MATLAB code for recognition.fig
%      RECOGNITION, by itself, creates a new RECOGNITION or raises the existing
%      singleton*.
%
%      H = RECOGNITION returns the handle to a new RECOGNITION or the handle to
%      the existing singleton*.
%
%      RECOGNITION('CALLBACK',hObject,eventData,handles,...) calls the local
%      function named CALLBACK in RECOGNITION.M with the given input arguments.
%
%      RECOGNITION('Property','Value',...) creates a new RECOGNITION or raises the
%      existing singleton*.  Starting from the left, property value pairs are
%      applied to the GUI before recognition_OpeningFcn gets called.  An
%      unrecognized property name or invalid value makes property application
%      stop.  All inputs are passed to recognition_OpeningFcn via varargin.
%
%      *See GUI Options on GUIDE's Tools menu.  Choose "GUI allows only one
%      instance to run (singleton)".
%
% See also: GUIDE, GUIDATA, GUIHANDLES

% Edit the above text to modify the response to help recognition

% Last Modified by GUIDE v2.5 08-Jun-2018 15:48:10

% Begin initialization code - DO NOT EDIT
gui_Singleton = 1;
gui_State = struct('gui_Name',       mfilename, ...
                   'gui_Singleton',  gui_Singleton, ...
                   'gui_OpeningFcn', @recognition_OpeningFcn, ...
                   'gui_OutputFcn',  @recognition_OutputFcn, ...
                   'gui_LayoutFcn',  [], ...
                   'gui_Callback',   []);

if nargin && ischar(varargin{1})
    gui_State.gui_Callback = str2func(varargin{1});
end

if nargout
    [varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});
else
    gui_mainfcn(gui_State, varargin{:});
end
% End initialization code - DO NOT EDIT


% --- Executes just before recognition is made visible.
function recognition_OpeningFcn(hObject, eventdata, handles, varargin)
% This function has no output args, see OutputFcn.
% hObject    handle to figure
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
% varargin   command line arguments to recognition (see VARARGIN)
setappdata(handles.pushbutton1,'pathfile',0);
% Choose default command line output for recognition
handles.output = hObject;
ha=axes('units','normalized','pos',[0 0 1 1]);
 uistack(ha,'down');
 ii=imread('beijing.jpg'); % Set the program's background image
 image(ii);
 colormap gray
 set(ha,'handlevisibility','off','visible','off');

% Update handles structure
guidata(hObject, handles);

% UIWAIT makes recognition wait for user response (see UIRESUME)
% uiwait(handles.figure1);


% --- Outputs from this function are returned to the command line.
function varargout = recognition_OutputFcn(hObject, eventdata, handles) 
% varargout  cell array for returning output args (see VARARGOUT);
% hObject    handle to figure
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)

% Get default command line output from handles structure
varargout{1} = handles.output;


% --- Executes on button press in pushbutton1.
function pushbutton1_Callback(hObject, eventdata, handles)
% hObject    handle to pushbutton1 (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
[filename, pathname] = uigetfile('*.wav','Read sound file'); % Select the sound file
if isequal(filename,0)                     % Determine whether a file was selected
    msgbox('No music selected');
else
    pathfile=strcat(pathname, filename);   % Build the full path to the sound file
    [x,fs]=audioread(pathfile);            % Read the sound into a matrix
    sound(x,fs);                           % Play the sound
    set(handles.edit1,'string',filename)
    setappdata(handles.pushbutton1,'pathfile',pathfile);
end


% --- Executes on button press in pushbutton2.
function pushbutton2_Callback(hObject, eventdata, handles)
% hObject    handle to pushbutton2 (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
pathfile=getappdata(handles.pushbutton1,'pathfile');
Fs=40000;                                  % target sample rate for resampling
[memory, FS]=audioread(pathfile);
memorypiece=memory(1:100000,1);            % first 100000 samples of channel 1
fullmemory=memory(1:200000,1);             % first 200000 samples of channel 1
memorypiece=memorypiece';
memorypiece=resample(memorypiece,Fs,FS);
fullmemory=resample(fullmemory,Fs,FS);
fullmemory=fullmemory';
B=mfcc_m(fullmemory,Fs,13,Fs/4,Fs/8);      % extract MFCC features (mfcc_m is a user-supplied helper)
c1=load('data1.mat');                      % load precomputed features of the three library songs
c2=load('data2.mat');
c3=load('data3.mat');
n1=c1.C1;
n2=c2.C2;
n3=c3.C3;
DTW1 = dtw(B,n1);                          % DTW distance to each library song
DTW2 = dtw(B,n2);
DTW3 = dtw(B,n3);


A=[DTW1,DTW2,DTW3];
if min(A)==DTW1                            % nearest library song wins
    set(handles.edit2,'string','Miss Dong - Song Dongye')
elseif min(A)==DTW2
    set(handles.edit2,'string','Dead to Love - G.E.M.')
else
    set(handles.edit2,'string','Mo - Na Ying')
end


function edit1_Callback(hObject, eventdata, handles)
% hObject    handle to edit1 (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)

% Hints: get(hObject,'String') returns contents of edit1 as text
%        str2double(get(hObject,'String')) returns contents of edit1 as a double


% --- Executes during object creation, after setting all properties.
function edit1_CreateFcn(hObject, eventdata, handles)
% hObject    handle to edit1 (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    empty - handles not created until after all CreateFcns called

III. Operation results

IV. Note

Version: MATLAB R2014a