A Brief Introduction to Audio Digital Watermarking Based on the Discrete Wavelet Transform
In recent years, digital watermarking has become increasingly important. Digital watermarking embeds identification information directly into a digital carrier, or expresses it indirectly in the signal carrier, without affecting the usability of the original carrier. From the information hidden in the carrier we can judge whether the content has been tampered with, which enables anti-counterfeiting and traceability, information security protection, and copyright protection. A broadcast relay station is the transfer point for broadcast audio, and the safety and reliability of the signal must be guaranteed before it is sent on to thousands of households. At present, however, most stations rely only on listening by ear and comparison between different sources, which has great limitations. Exploiting the characteristics of digital watermarking can effectively prevent signal interruption, protect signal security, and ensure safe broadcasting.
According to how the watermark is processed in the audio signal, digital watermarking can be divided into time-domain, transform-domain and compression-domain techniques.
1.1 Time-domain Digital Watermarking
In time-domain digital watermarking, the watermark information is embedded directly into the audio signal, usually hidden in perceptually unimportant parts of the signal so that the embedded watermark does not affect listening to the original audio. Time-domain watermarking is relatively easy to implement and computationally cheap, simple and direct, but it is not robust: it is easy to crack and has poor resistance to attacks.
1.2 Transform-domain Digital Watermarking
In transform-domain digital watermarking, the audio signal is first transformed from the time domain into a transform domain, usually by the discrete cosine transform (DCT), the discrete Fourier transform (DFT) or the discrete wavelet transform (DWT). The watermark information is embedded in the transform domain, and the watermarked time-domain audio signal is obtained by the inverse transform. Transform-domain watermarking is more complex than time-domain watermarking, but the embedded watermark is less perceptible, better concealed and more robust. The work in this paper mainly concerns embedding and extracting watermark information in audio signals based on the DWT.
1.3 Compression-domain Digital Watermarking
The time-domain and transform-domain techniques above embed the watermark directly into an uncompressed audio format, but audio signals are usually compression-coded for transmission or storage (e.g. WMA, MP3), so compression-domain watermarking also has great practical value. Compression-domain digital watermarking can be roughly divided into three categories: (1) embed the watermark in the uncompressed domain, then compress the audio signal together with the watermark information; (2) embed the watermark information directly into the compressed audio signal in the compressed domain; (3) decompress the compressed signal, embed the watermark information, and finally re-compress the watermarked audio. In general, the encoding and decoding system of compression-domain watermarking is complex and constrained by the compression coding format, and since the redundancy of the compressed audio signal has already been removed, it is difficult to add a watermark; compression-domain watermarking therefore needs further research.
The audio watermarking algorithm studied in this paper is based on the discrete wavelet transform (DWT). The audio signal is transformed by the DWT, the watermark information is embedded in the transform domain, and the watermarked audio signal is then obtained by the inverse transform (IDWT). The block diagram of the watermark embedding principle is shown in Figure 1.
Figure 1. Schematic diagram of audio signal watermark embedding
2.1 Watermark Embedding
Assume the watermark is a two-dimensional binary image BW of size M1×M2. Since audio signals are usually one-dimensional vectors, the watermark must be reduced from two dimensions to a one-dimensional vector W of length M = M1×M2 before it is embedded into the audio signal. The image can also be scrambled (encrypted) to enhance the concealment of the watermark.
Assume the speech signal is S with length N, i.e. S = {s1, s2, s3, …, sN}. Since the speech signal is long, it is generally processed in segments; with segment length N1, the speech signal is divided into K = fix(N/N1) segments, and the watermark is embedded into each segment.
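A minimal MATLAB sketch of this preparation step is given below. The file names, the scrambling key and the segment length N1 are illustrative assumptions, not values taken from the paper.
% Minimal sketch of watermark preparation and audio segmentation (illustrative values)
I = im2bw(imread('a.png'));            % binary watermark image BW of size M1 x M2
[M1,M2] = size(I);
W = reshape(I.',[],1);                 % reduce to a one-dimensional vector W, length M = M1*M2
key = 12345;                           % hypothetical scrambling key
rng(key); idx = randperm(M1*M2);       % simple scrambling to enhance concealment
Wscrambled = W(idx);
[S,fs] = audioread('open-cc.wav');     % speech signal S of length N
S = S(:,1);                            % use the first channel
N = length(S);
N1 = 16000;                            % assumed segment length N1
K = fix(N/N1);                         % number of segments
segments = reshape(S(1:K*N1),N1,K);    % each column is one segment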
The wavelet transform is an analysis transform proposed to overcome the shortcomings of the Fourier transform. The basis functions of the Fourier transform are sinusoids spread over the whole time axis, which cannot accurately represent abrupt signals or changes in frequency content over time. The wavelet transform is local in both time and frequency, and can therefore represent the frequency-domain characteristics of audio signals more accurately. Commonly used wavelet bases include the Haar wavelet, the Daubechies (dbN) wavelets, the Marr wavelet and so on. The wavelet basis adopted in this paper is the Haar wavelet, a rectangular wave on t ∈ [0, 1] defined as

$$\psi(t)=\begin{cases}1, & 0 \le t < \tfrac{1}{2} \\ -1, & \tfrac{1}{2} \le t < 1 \\ 0, & \text{otherwise.}\end{cases}$$

After the Haar wavelet basis is determined, the speech signal S can be expressed as

$$S(t)=\sum_{j,k} c_{j,k}\,\psi_{j,k}(t),$$

where the $c_{j,k}$ are the discrete wavelet coefficients. The DWT decomposes the audio signal into a low-frequency approximation part and a high-frequency detail part. When embedding the watermark we operate mainly on the low-frequency approximation coefficient vector: the watermark is embedded into the low-frequency approximation part while the high-frequency detail part is left unchanged, so that the speech quality remains essentially unchanged. Since the embedded watermark is a binary image, a watermark bit of 1 increases the corresponding low-frequency coefficient, and a bit of 0 decreases it. After the watermark has been embedded in the DWT domain, the speech signal is transformed back into a time-domain signal by the IDWT.
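As a minimal sketch of this additive embedding rule for a single segment, continuing the illustrative variables segments and Wscrambled from the sketch above (alpha is an assumed embedding strength; note that the source code excerpt in section 2 instead uses a DCT/SVD quantization of the approximation coefficients):
% Minimal sketch: embed the watermark bits into the low-frequency coefficients of one segment
s = segments(:,1);                     % first speech segment
wbits = Wscrambled;                    % the whole watermark is embedded into each segment
[c,l] = wavedec(s,1,'haar');           % 1-level Haar DWT
ca = appcoef(c,l,'haar',1);            % low-frequency approximation coefficients
alpha = 0.01;                          % assumed embedding strength
for i = 1:length(wbits)                % assumes length(wbits) <= length(ca)
    if wbits(i) == 1
        ca(i) = ca(i) + alpha;         % bit 1: increase the coefficient
    else
        ca(i) = ca(i) - alpha;         % bit 0: decrease the coefficient
    end
end
c(1:length(ca)) = ca;                  % put the modified approximation part back (detail unchanged)
s_w = waverec(c,l,'haar');             % IDWT back to the time-domain watermarked segment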
2.2 Watermark Extraction
To ensure information security, the watermarked audio signal is transmitted at the sender; at the receiver, the watermark is usually extracted in order to verify the accuracy of the audio information and confirm the authenticity of the source, so watermark extraction is particularly important. During extraction, the original audio signal and the watermarked audio signal are both decomposed by the DWT, and their coefficients are compared to extract the watermark information. The block diagram of the watermark extraction principle is shown in Figure 2.
Figure 2. Block diagram of audio signal watermark extraction
As described for the embedding process above, the watermark information is embedded into the low-frequency approximation coefficients while the high-frequency detail is left unchanged. During extraction we therefore compare the low-frequency wavelet coefficient vector cA of the original speech signal S with the low-frequency wavelet coefficient vector cA1 of the watermarked audio signal S1: where cA1 > cA the watermark bit is 1, otherwise it is 0. The decisions from the K segments are then averaged to obtain the one-dimensional watermark vector, and finally the binary watermark image is recovered by raising the vector back to two dimensions.
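A minimal MATLAB sketch of this comparison-based extraction, continuing the illustrative variables from the sketches above (segmentsW, holding the watermarked segments, and the receiver's knowledge of M1, M2, K and the scrambling index idx are assumptions, not the author's code):
% Minimal sketch: extract the watermark by comparing approximation coefficients
M = M1*M2;                             % watermark length, known at the receiver
west = zeros(M,1);
for seg = 1:K
    s  = segments(:,seg);              % original segment
    s1 = segmentsW(:,seg);             % watermarked segment (assumed available)
    [c,l]   = wavedec(s,1,'haar');
    [c1,l1] = wavedec(s1,1,'haar');
    ca  = appcoef(c,l,'haar',1);       % low-frequency coefficient vector cA
    ca1 = appcoef(c1,l1,'haar',1);     % low-frequency coefficient vector cA1
    d = double(ca1(1:M) > ca(1:M));    % bit = 1 where cA1 > cA, otherwise 0
    west = west + d(:);
end
Wext = (west/K) > 0.5;                 % average the K per-segment decisions and threshold
Wext(idx) = Wext;                      % undo the scrambling used at the embedder (if any)
BWext = reshape(Wext,M2,M1).';         % raise the dimension: recovered M1 x M2 binary image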
2. Some source code
[audio,fs]=audioread('open-cc.wav');
A=audio(1:160000);
AL=length(A);
% Plot the original audio
subplot(312); plot(A); title('Raw audio signal');
% Read the watermark image and binarize it
I=imread('a.png');
I=im2bw(I);
subplot(311); imshow(I); title('Watermark image');
[m,n]=size(I);
% Reduce the watermark image to a one-dimensional vector
piexnum=1;
for i=1:m
    for j=1:n
        w(piexnum,1)=I(i,j);
        piexnum=piexnum+1;
    end
end
wl=size(w);
% 2-level Haar wavelet decomposition of the original audio
[c,l]=wavedec(A,2,'haar');
% Extract the level-2 low-frequency (high-energy) and high-frequency (low-energy) coefficients
ca2=appcoef(c,l,'haar',2);
cd2=detcoef(c,l,2);
cd1=detcoef(c,l,1);
ca2L=length(ca2);
% DCT of the approximation coefficients
ca2DCT=dct(ca2);
% Segmentation
k=wl(1);        % number of segments (= number of watermark bits)
DL=ca2L/k;      % length of each segment of ca2
j=1;
delta=0.5;
% Embed one watermark bit per segment
for i=1:k
    ca22=ca2DCT(j:j+DL-1);
    Y=ca22(1:DL/4);            % take the first 1/4 of the coefficients
    Y=reshape(Y,10,10);
    [U,S,V]=svd(Y);            % SVD decomposition
    S1=S(1,1);
    S2=S(2,2);
    D=floor(S1/(S2*delta));
    % Embed the watermark bit according to the parity of D
    if(mod(D,2)==0)
        if (w(i)==1)
            S(1,1)=S(2,2)*delta*(D+1);
        else
            S(1,1)=S(2,2)*delta*D;
        end
    else
        if (w(i)==1)
            S(1,1)=S(2,2)*delta*D;
        else
            S(1,1)=S(2,2)*delta*(D+1);
        end
    end
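The excerpt above stops inside the embedding loop. The lines below are a hedged sketch of how such a loop is typically completed: rebuilding each block from the modified singular value, inverting the DCT and the DWT, and writing out the watermarked audio. This is an assumption about the omitted part of the source, not the author's code, and the output file name is also assumed.
    Y1=U*S*V';                         % rebuild the 10x10 block with the modified singular value
    ca22(1:DL/4)=Y1(:);                % put the block back into this segment
    ca2DCT(j:j+DL-1)=ca22;
    j=j+DL;                            % move on to the next segment
end
ca2w=idct(ca2DCT);                     % inverse DCT of the approximation coefficients
cw=c; cw(1:ca2L)=ca2w;                 % replace the approximation part of the DWT coefficient vector
Aw=waverec(cw,l,'haar');               % inverse DWT: watermarked audio signal
subplot(313); plot(Aw); title('Watermarked audio signal');
audiowrite('watermarked.wav',Aw,fs);   % assumed output file name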
3. Operation results
4. Matlab version and references
1. Matlab version: 2014a
2. References:
[1] Han Jiqing, Zhang Lei, Zheng Tieran. Speech Signal Processing (3rd edition) [M]. Tsinghua University Press, 2019.
[2] Liu Ruobian. Deep Learning: Practice of Speech Recognition Technology [M]. Tsinghua University Press, 2019.