I. Introduction
Spectral subtraction is the most commonly used method for speech denoising. It is one of the earliest speech denoising algorithms to be developed and is by now mature in application. The algorithm relies on the fact that additive noise is uncorrelated with speech. Assuming that the noise is statistically stationary, it takes the noise spectrum measured during speech gaps (non-speech segments) as an estimate of the noise spectrum during speech, and subtracts that estimate from the spectrum of the noisy speech signal to obtain an estimate of the clean speech spectrum. Spectral subtraction is widely used because the algorithm is simple and computationally cheap, which makes fast processing easy to implement, and it can achieve a high output signal-to-noise ratio. The drawback of the classical form of the algorithm is that processing leaves behind “musical noise” with a certain rhythmic fluctuation.
Half-wave rectification of the subtracted spectrum leaves small isolated spectral peaks. When converted back to the time domain, these peaks sound like multiple tones whose frequencies change randomly from frame to frame. This is particularly pronounced in silent segments. This residual noise caused by half-wave rectification is called “musical noise”. Fundamentally, the main causes of musical noise are:
(1) Nonlinear treatment of the negative values produced by the spectral subtraction
(2) Inaccurate estimation of the noise spectrum
(3) Large variability of the suppression function (gain function)
1 Principle
2 Flow chart
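One common way to write the principle down is sketched here. The symbols are assumptions introduced for illustration: y(n) is the noisy speech, s(n) the clean speech, d(n) the additive noise, Y_i(k) the spectrum of frame i of the noisy speech, \bar{D}(k) the noise power spectrum averaged over the N_{IS} leading non-speech frames, a an over-subtraction factor and b a spectral floor. Reading the a and b passed to SpectralSub in the source code as these two parameters is likewise an assumption. Setting a = 1 and b = 0 gives the classical form with half-wave rectification.

```latex
y(n) = s(n) + d(n), \qquad
\bar{D}(k) = \frac{1}{N_{IS}} \sum_{i=1}^{N_{IS}} |Y_i(k)|^2

|\hat{S}_i(k)|^2 =
\begin{cases}
|Y_i(k)|^2 - a\,\bar{D}(k), & |Y_i(k)|^2 - a\,\bar{D}(k) \ge b\,\bar{D}(k) \\
b\,\bar{D}(k), & \text{otherwise}
\end{cases}

\hat{S}_i(k) = |\hat{S}_i(k)| \, e^{\,j \angle Y_i(k)}
```

The last line expresses that the enhanced frame reuses the phase of the noisy frame.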
Disadvantages of spectral subtraction
1) Because negative values are half-wave rectified, small isolated peaks appear at random frequencies in each frame's spectrum. Converted to the time domain, these peaks sound like multiple tones whose frequencies vary randomly from frame to frame, commonly known as “musical noise”.
2) Spectral subtraction also has a minor drawback: it uses the phase of the noisy speech as the phase of the enhanced speech. The resulting speech may therefore sound rough, especially at low signal-to-noise ratios, where the phase distortion can become audible and degrade speech quality.
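Both disadvantages can be seen on a single frame. The following is a minimal sketch, not the source's implementation: the sinusoidal "speech" frame, the noise level 0.3 and all variable names are assumptions chosen for illustration. It subtracts a noise power estimate, clips negative results to zero (half-wave rectification), and rebuilds the frame with the unchanged noisy phase.

```matlab
% One-frame spectral subtraction sketch (all signal values are synthetic assumptions).
wlen  = 200;                             % frame length in samples
n     = (0:wlen-1)';
clean = sin(2*pi*0.05*n);                % hypothetical "speech" frame: one sinusoid
noise = 0.3*randn(wlen, 1);              % hypothetical additive noise
noisy = clean + noise;

Y     = fft(noisy);                      % noisy spectrum
Ymag2 = abs(Y).^2;                       % noisy power spectrum
Yph   = angle(Y);                        % noisy phase, reused unchanged
D2    = abs(fft(0.3*randn(wlen,1))).^2;  % noise power estimate from a separate noise-only frame

diff2 = Ymag2 - D2;                      % raw subtraction; can go negative
S2    = max(diff2, 0);                   % half-wave rectification: negatives clipped to 0

S     = sqrt(S2) .* exp(1i*Yph);         % enhanced magnitude combined with noisy phase
enh   = real(ifft(S));                   % back to the time domain

fprintf('bins clipped to zero: %d of %d\n', sum(diff2 < 0), wlen);
```

The bins that survive the clipping next to zeroed neighbours are the isolated peaks described in 1), and the term exp(1i*Yph) is the phase reuse described in 2).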
To better understand spectral-subtraction speech enhancement, a simple simulation of the algorithm is carried out here, with the simulation parameters set as in the source code below.
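The frame parameters used in the simulation can be checked with a little arithmetic. This is a sketch under one assumption: the sampling rate fs = 8000 Hz is not stated in the text and is inferred from the comments that 200 samples correspond to 25 ms and 80 samples to 10 ms.

```matlab
fs   = 8000;        % assumed sampling rate (Hz); inferred, not stated in the text
IS   = 0.25;        % leading non-speech segment length (s)
wlen = 200;         % frame length in samples
inc  = 80;          % frame shift in samples

frame_ms = 1000*wlen/fs;              % frame length in ms: 25
shift_ms = 1000*inc/fs;               % frame shift in ms: 10
NIS = fix((IS*fs - wlen)/inc + 1);    % leading noise-only frames: fix(23.5) = 23
fprintf('frame = %g ms, shift = %g ms, NIS = %d\n', frame_ms, shift_ms, NIS);
```

Under that assumption, the noise spectrum is estimated from the first 23 frames of the recording.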
II. Source code
clear all; clc; close all;
[xx, fs] = wavread('C5_2_y.wav');            % read data file (audioread in newer MATLAB releases)
xx = xx - mean(xx);                          % remove DC component
x = xx / max(abs(xx));                       % amplitude normalization
IS = 0.25;                                   % length of the leading non-speech segment (s)
wlen = 200;                                  % frame length: 25 ms
inc = 80;                                    % frame shift: 10 ms
SNR = 5;                                     % signal-to-noise ratio (dB)
N = length(x);                               % signal length
time = (0:N-1)/fs;                           % time axis
signal = awgn(x, SNR, 'measured', 'db');     % add white Gaussian noise
snr1 = SNR_Calc(x, signal);                  % initial SNR
NIS = fix((IS*fs - wlen)/inc + 1);           % number of leading non-speech frames
a = 4; b = 0.001;                            % spectral subtraction parameters
output = SpectralSub(signal, wlen, inc, NIS, a, b);   % spectral subtraction
snr2 = SNR_Calc(x, output);                  % SNR after spectral subtraction
snr = snr2 - snr1;                           % SNR improvement
fprintf('snr1=%5.4f   snr2=%5.4f   snr=%5.4f\n', snr1, snr2, snr);

function frameout = enframe(x, win, inc)
nx = length(x(:));                 % number of samples
nwin = length(win);                % window length
if (nwin == 1)                     % is win a scalar?
    len = win;                     % yes: win is the frame length, no window is applied
else
    len = nwin;                    % no: frame length = window length
end
if (nargin < 3)                    % only two arguments: frame shift = frame length
    inc = len;
end
nf = fix((nx-len+inc)/inc);        % number of frames
frameout = zeros(nf, len);         % initialize output
indf = inc*(0:(nf-1)).';           % offset of each frame in x
inds = (1:len);                    % sample indices 1:len within a frame
frameout(:) = x(indf(:,ones(1,len)) + inds(ones(nf,1),:));   % frame the data
if (nwin > 1)                      % a window function was passed in
    w = win(:)';
    frameout = frameout .* w(ones(nf,1),:);   % multiply each frame by the window
end

function frameout = FilpFrame(x, win, inc)
[nf, len] = size(x);
nx = (nf-1)*inc + len;             % length of the reassembled signal
frameout = zeros(nx, 1);
nwin = length(win);                % window length
if (nwin ~= 1)                     % a window was applied when framing
    winx = repmat(win', nf, 1);
    x = x ./ winx;                 % remove the windowing effect
    x(find(isinf(x))) = 0;         % replace Inf produced by division by 0
end
% The source listing ends here; the step that writes the frames into frameout is not shown.
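The script above calls SpectralSub without listing it. The following is a hypothetical sketch of what a function with that argument list could do; it is not the source's SpectralSub. The names spectral_sub_sketch and hamming_sketch, the Hamming window, the reading of a and b as over-subtraction factor and spectral floor, and the normalized overlap-add at the end are all assumptions.

```matlab
function out = spectral_sub_sketch(sig, wlen, inc, NIS, a, b)
% Hypothetical sketch of power spectral subtraction with over-subtraction
% factor a and spectral floor b. Not the SpectralSub used in the text.
sig  = sig(:);
win  = hamming_sketch(wlen);                   % analysis window (assumed Hamming)
nf   = fix((length(sig) - wlen + inc)/inc);    % number of frames
out  = zeros((nf-1)*inc + wlen, 1);            % overlap-add accumulator
wsum = zeros(size(out));                       % accumulated window weight

% Average noise power spectrum over the first NIS (assumed speech-free) frames.
D2 = zeros(wlen, 1);
for k = 1:NIS
    idx = (k-1)*inc + (1:wlen)';
    D2  = D2 + abs(fft(sig(idx) .* win)).^2;
end
D2 = D2 / NIS;

for k = 1:nf
    idx = (k-1)*inc + (1:wlen)';
    Y   = fft(sig(idx) .* win);
    P   = abs(Y).^2 - a*D2;                    % over-subtract the noise estimate
    P   = max(P, b*D2);                        % floor instead of clipping to zero
    S   = sqrt(P) .* exp(1i*angle(Y));         % keep the noisy phase
    out(idx)  = out(idx)  + real(ifft(S));     % overlap-add the enhanced frame
    wsum(idx) = wsum(idx) + win;
end
out = out ./ max(wsum, eps);                   % undo the summed window weight
end

function w = hamming_sketch(n)
% Hamming window written out so that no toolbox is needed.
w = 0.54 - 0.46*cos(2*pi*(0:n-1)'/(n-1));
end
```

A call such as out = spectral_sub_sketch(signal, wlen, inc, NIS, a, b) mirrors the call in the script. Note that the sketch returns (nf-1)*inc + wlen samples, which can be slightly shorter than the input, so the input would need trimming to the same length before comparing the two sample by sample.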
III. Results
IV. Notes
Version: MATLAB 2014a