I. Introduction
Speech synthesis in MATLAB based on linear-prediction formant detection and pitch parameters.
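The synthesizer in this post passes an excitation signal through parallel second-order filters built from each frame's formant frequencies and bandwidths. The following is a minimal sketch of one such filter, assuming the common two-pole resonator mapping (pole radius from the bandwidth, pole angle from the centre frequency). It is not the post's `formant2filter4` helper, whose internals are not shown, and the values of `fs`, `F` and `B` are hypothetical.

```matlab
% Hypothetical example values, not taken from the original post
fs = 8000;      % sampling rate in Hz (assumed)
F  = 500;       % formant centre frequency in Hz (assumed)
B  = 100;       % formant bandwidth in Hz (assumed)

r     = exp(-pi * B / fs);          % pole radius set by the bandwidth
theta = 2 * pi * F / fs;            % pole angle set by the centre frequency
a = [1, -2 * r * cos(theta), r^2];  % denominator of the two-pole resonator
b = sum(a);                         % numerator gain giving unity gain at DC

% Excite the resonator with a unit impulse and locate its spectral peak
h = filter(b, a, [1; zeros(1023, 1)]);
H = abs(fft(h));
[~, kmax] = max(H(1:512));          % search the positive-frequency half only
peakHz = (kmax - 1) * fs / 1024;    % frequency of the strongest bin
```

The spectral peak lands close to `F`, and narrowing `B` moves the poles toward the unit circle, which sharpens the resonance.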
II. Source code
clear all; clc; close all;
[xx,fs]=wavread('C7_3_y.wav');            % read the file (use audioread in newer MATLAB releases)
xx=xx-mean(xx);                           % remove the DC component
x1=xx/max(abs(xx));                       % normalize the amplitude
x=filter([1 -.99],1,x1);                  % pre-emphasis
N=length(x);                              % data length
time=(0:N-1)/fs;                          % time scale of the signal
wlen=240;                                 % frame length
inc=80;                                   % frame shift
overlap=wlen-inc;                         % overlap length
tempr1=(0:overlap-1)'/overlap;            % rising triangular ramp w1
tempr2=(overlap-1:-1:0)'/overlap;         % falling triangular ramp w2
n2=1:wlen/2+1;                            % indices of the positive frequencies
wind=hamming(wlen);                       % window function
X=enframe(x,wlen,inc)';                   % split into frames, one frame per column
fn=size(X,2);                             % number of frames
Etemp=sum(X.*X);                          % energy of each frame
Etemp=Etemp/max(Etemp);                   % normalize the energy
T1=0.1; r2=0.5;                           % endpoint-detection parameters
miniL=10;                                 % minimum number of frames in a speech segment
mnlong=5;                                 % minimum number of frames in a vowel body
ThrC=[10 15];                             % thresholds
p=12;                                     % LPC order
frameTime=FrameTimeC(fn,wlen,inc,fs);     % time scale of each frame
Doption=0;                                % option flag (not used in this excerpt)
[voiceseg,vosl,SF,Ef,period]=pitch_Ceps(x,wlen,inc,T1,fs);  % cepstrum-based pitch detection
Dpitch=pitFilterM1(period,voiceseg,vosl); % smooth the pitch period T0

%% Formant extraction
for i=1:length(SF)
    [Frmt(:,i),Bw(:,i),U(:,i)]=Formant_Root(X(:,i),p,fs,3);
end

%% Speech synthesis
zint=zeros(2,4);                          % initialize the filter states
tal=0;
for i=1:fn
    yf=Frmt(:,i);                         % frequencies of the three formants of frame i
    bw=Bw(:,i);                           % bandwidths of the three formants of frame i
    [an,bn]=formant2filter4(yf,bw,fs);    % convert to four second-order filter coefficient sets
    synt_frame=zeros(wlen,1);

    if SF(i)==0                           % non-speech frame
        excitation=randn(wlen,1);         % white-noise excitation
        for k=1:4                         % feed all four filters in parallel
            An=an(:,k);
            Bn=bn(k);
            [out(:,k),zint(:,k)]=filter(Bn(1),An,excitation,zint(:,k));
            synt_frame=synt_frame+out(:,k);   % sum the four filter outputs
        end
    else                                  % speech frame
        PT=round(Dpitch(i));              % pitch period in samples
        exc_syn1=zeros(wlen+tal,1);       % initialize the pulse buffer
        exc_syn1(mod(1:tal+wlen,PT)==0)=1;    % unit pulses at the pitch-period positions
        exc_syn2=exc_syn1(tal+1:tal+inc); % pulses that fall inside the frame-shift interval
        index=find(exc_syn2==1);
        excitation=exc_syn1(tal+1:tal+wlen);  % pulse excitation for this frame

        if isempty(index)                 % no pulse inside the inc interval
            tal=tal+inc;                  % leading zeros for the next frame
        else                              % at least one pulse inside the inc interval
            eal=length(index);            % number of pulses
            tal=inc-index(eal);           % leading zeros for the next frame
        end
        for k=1:4                         % feed all four filters in parallel
            An=an(:,k);
            Bn=bn(k);
            [out(:,k),zint(:,k)]=filter(Bn(1),An,excitation,zint(:,k));
            synt_frame=synt_frame+out(:,k);   % sum the four filter outputs
        end
    end
    Et=sum(synt_frame.*synt_frame);       % match the synthesized energy to the original frame
    rt=Etemp(i)/Et;
    synt_frame=sqrt(rt)*synt_frame;
    if i==1                               % first frame: nothing to overlap with
        output=synt_frame;
    else
        M=length(output);                 % linear cross-fade overlap-add
        output=[output(1:M-overlap); output(M-overlap+1:M).*tempr2+...
            synt_frame(1:overlap).*tempr1; synt_frame(overlap+1:wlen)];
    end
end
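The last step of the listing joins consecutive synthesized frames by cross-fading the overlapping samples with two linear ramps. The following is a minimal, self-contained sketch of that overlap-add step, using the frame length and shift from the listing. The names `w_in` and `w_out` stand in for `tempr1` and `tempr2`, and the constant `frames` matrix and `nFrames` are hypothetical test data chosen only to make the effect easy to inspect.

```matlab
wlen = 240; inc = 80;                 % frame length and frame shift, as in the listing
overlap = wlen - inc;                 % samples shared by consecutive frames
w_in  = (0:overlap-1)' / overlap;     % rising ramp applied to the new frame
w_out = (overlap-1:-1:0)' / overlap;  % falling ramp applied to the existing tail

nFrames = 5;                          % hypothetical frame count for the demo
frames = ones(wlen, nFrames);         % hypothetical frames, all samples equal to 1

output = frames(:, 1);                % the first frame is copied unchanged
for i = 2:nFrames
    M = length(output);
    f = frames(:, i);
    output = [output(1:M-overlap); ...                              % untouched part
              output(M-overlap+1:M) .* w_out + f(1:overlap) .* w_in; ... % cross-fade
              f(overlap+1:wlen)];                                   % new samples
end
```

Each pass through the loop lengthens `output` by `inc` samples. With these ramp definitions the two weights sum to `(overlap-1)/overlap` rather than exactly 1, so a constant input comes out within roughly 1% of its original level in the cross-faded regions.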
III. Results
IV. Notes
MATLAB version: 2014a