A brief introduction to digital audio watermarking based on the discrete wavelet transform

In recent years, digital watermarking has become increasingly important. Digital watermarking embeds identification information directly into a digital carrier, or represents it indirectly in the carrier signal, without reducing the usefulness of the original carrier. The information hidden in the carrier makes it possible to judge whether the content has been tampered with, which supports anti-counterfeiting and traceability, information security, and copyright protection. A broadcast relay station is the transfer point for broadcast audio, so the safety and reliability of the signal must be guaranteed before it is sent on to thousands of households. At present, however, most stations rely only on human listening and on comparing different signal sources, which has significant limitations. Exploiting the properties of digital watermarking can effectively guard against signal interruption, protect signal security, and keep broadcasts safe.

Depending on where the watermark is processed in the audio signal, digital watermarking techniques can be divided into time-domain, transform-domain, and compressed-domain methods.

1.1 Time-domain Digital Watermarking

In time-domain digital watermarking, the watermark information is embedded directly into the audio samples, usually in the perceptually less important parts of the signal, so that the embedded watermark does not affect how the original audio sounds. Time-domain watermarking is relatively easy to implement and computationally cheap, but it is not robust: the watermark is easy to crack and resists attacks poorly.
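As a rough illustration of a time-domain scheme, the sketch below uses least-significant-bit (LSB) substitution on integer PCM samples. The text above does not name a specific time-domain method, so LSB substitution, the function names `embed_lsb` and `extract_lsb`, and the sample values are all assumptions chosen for illustration.

```python
def embed_lsb(samples, bits):
    """Overwrite the least significant bit of the first len(bits) samples."""
    if len(bits) > len(samples):
        raise ValueError("watermark is longer than the audio")
    marked = list(samples)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the watermark bit.
        marked[i] = (marked[i] & ~1) | bit
    return marked


def extract_lsb(samples, n_bits):
    """Read the watermark back from the lowest bit of each sample."""
    return [s & 1 for s in samples[:n_bits]]


# Hypothetical 16-bit PCM sample values and watermark bits.
samples = [1000, -2001, 37, 0, -5, 12345]
bits = [1, 0, 1, 1]
marked = embed_lsb(samples, bits)
recovered = extract_lsb(marked, len(bits))
```

Each sample changes by at most one quantization step, which is why the watermark is hard to hear. It is also why the scheme is fragile: any requantization or lossy compression rewrites the low bits.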

1.2 Transform-domain Digital Watermarking

In transform-domain digital watermarking, the audio signal is first transformed from the time domain into a transform domain, typically using the discrete cosine transform (DCT), the discrete Fourier transform (DFT), or the discrete wavelet transform (DWT). The watermark information is embedded in the transform domain, and the watermarked time-domain audio signal is obtained by the inverse transform. Transform-domain watermarking is more complex than time-domain watermarking, but the embedded watermark is less perceptible, better concealed, and more robust. This paper focuses on embedding and extracting watermark information in audio signals using the DWT.

1.3 Compressed-domain Digital Watermarking

Time-domain and transform-domain techniques both embed the watermark directly into uncompressed audio. In practice, however, audio is usually compressed (for example as WMA or MP3) for transmission or storage, so compressed-domain watermarking also has great practical value. Compressed-domain techniques fall roughly into three categories: (1) embed the watermark in the uncompressed domain, then compress the audio signal and the watermark together; (2) embed the watermark directly into the compressed audio signal; (3) decompress the compressed signal, embed the watermark, and then recompress the watermarked audio. In general, compressed-domain watermarking requires a complex encoding and decoding system and is constrained by the compression format. Compression has also already removed most of the redundancy in the audio, which makes it hard to add a watermark. Compressed-domain watermarking therefore needs further research.

The audio watermarking algorithm studied in this paper is based on the discrete wavelet transform (DWT). The audio signal is transformed with the DWT, the watermark information is embedded in the transform domain, and the inverse transform (IDWT) then yields the watermarked audio signal. The block diagram of the embedding principle is shown in Figure 1.

Assume that the watermark is a two-dimensional binary image BW of size M1×M2. Because an audio signal is a one-dimensional vector, the watermark must be flattened from two dimensions into a one-dimensional vector W of length M = M1×M2 before it is embedded. The image can also be scrambled (encrypted) beforehand to make the watermark harder to detect.
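The flattening and scrambling steps might look like the following minimal sketch. The text does not specify a scrambling method, so the keyed pseudo-random permutation here is an assumption, as are the helper names and the key value. Flattening is row-major, whereas MATLAB's `reshape` is column-major, and `random.Random` is not cryptographically secure, so this shows the idea rather than a real cipher.

```python
import random


def flatten_image(bw):
    """Row-major flattening of an M1 x M2 binary image into M = M1*M2 bits."""
    return [bit for row in bw for bit in row]


def _permutation(n, key):
    """Pseudo-random permutation of range(n), reproducible from the key."""
    order = list(range(n))
    random.Random(key).shuffle(order)
    return order


def scramble(bits, key):
    """Reorder the watermark bits with the keyed permutation."""
    return [bits[i] for i in _permutation(len(bits), key)]


def unscramble(scrambled, key):
    """Invert scramble() using the same key."""
    bits = [0] * len(scrambled)
    for pos, i in enumerate(_permutation(len(scrambled), key)):
        bits[i] = scrambled[pos]
    return bits


def unflatten(bits, m1, m2):
    """Rebuild the M1 x M2 image from the one-dimensional vector."""
    return [bits[r * m2:(r + 1) * m2] for r in range(m1)]


# Hypothetical 2 x 3 binary watermark and key.
bw = [[1, 0, 1],
      [0, 1, 1]]
w = flatten_image(bw)
secret = scramble(w, key=2020)
restored = unflatten(unscramble(secret, key=2020), 2, 3)
```

Only someone who knows the key can put the extracted bits back in image order.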

Let the speech signal be S with length N, so that S = {s1, s2, s3, …, sN}. Because the speech signal is long, it is generally processed in segments of length N1. The signal is therefore divided into K = fix(N/N1) segments, where fix rounds toward zero, and a watermark is embedded in each segment.
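A minimal sketch of the segmentation step is shown below. The function name `split_segments` and the toy signal are illustrative assumptions. Dropping the trailing N − K·N1 samples from the segment list follows from K = fix(N/N1), but the text does not say how those samples are handled afterwards.

```python
def split_segments(signal, n1):
    """Split a signal into K = floor(N / N1) full segments of length N1."""
    if n1 <= 0:
        raise ValueError("segment length must be positive")
    k = len(signal) // n1  # same as fix(N/N1) for non-negative lengths
    return [signal[i * n1:(i + 1) * n1] for i in range(k)]


# Toy signal with N = 10 samples and segment length N1 = 4.
signal = list(range(10))
segments = split_segments(signal, 4)
```

Here K = 2, and the last two samples do not belong to any segment.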

The wavelet transform is an analysis tool proposed to overcome a shortcoming of the Fourier transform. The basis functions of the Fourier transform are sinusoids that extend over the entire time axis, so they cannot accurately represent abrupt changes in a signal or how its frequency content changes. The wavelet transform is localized in both time and frequency and can therefore represent the frequency-domain characteristics of audio signals more accurately. Commonly used wavelet bases include the Haar wavelet, the Daubechies (dbN) wavelets, and the Marr wavelet. This paper uses the Haar wavelet, a rectangular wave on t ∈ [0, 1), defined as follows:

$$\psi(t) = \begin{cases} 1, & 0 \le t < 1/2 \\ -1, & 1/2 \le t < 1 \\ 0, & \text{otherwise} \end{cases}$$

Figure 1. Schematic diagram of audio signal watermark embedding

Figure 2. Block diagram of audio signal watermark extraction

Once the Haar wavelet basis has been chosen, the speech signal S can be expressed as:

$$S(t) = \sum_{j} \sum_{k} c_{j,k}\, \psi_{j,k}(t)$$

where $\psi_{j,k}$ are scaled and shifted copies of $\psi$ and $c_{j,k}$ are the discrete wavelet coefficients. The DWT decomposes the audio signal into a low-frequency approximation part and a high-frequency detail part. When embedding the watermark, we mainly process the low-frequency approximation coefficient vector: the watermark is embedded in the low-frequency approximation and the high-frequency details are left unchanged, so that the speech quality remains essentially the same. Because the watermark is a binary image, a watermark value of 1 increases the corresponding low-frequency coefficient, and a value of 0 decreases it. After the watermark has been embedded in the DWT domain, the IDWT transforms the speech signal back into a time-domain signal.
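The embedding rule described above can be sketched for a single segment as follows. This is an illustration under stated assumptions, not the author's implementation: it uses a one-level orthonormal Haar DWT written by hand, and the step size `delta` is a hypothetical embedding strength, since the text does not say by how much a coefficient is raised or lowered.

```python
import math


def haar_dwt(x):
    """One-level Haar DWT of an even-length sequence: (approximation, detail)."""
    if len(x) % 2:
        raise ValueError("segment length must be even")
    r = math.sqrt(2.0)
    ca = [(x[i] + x[i + 1]) / r for i in range(0, len(x), 2)]
    cd = [(x[i] - x[i + 1]) / r for i in range(0, len(x), 2)]
    return ca, cd


def haar_idwt(ca, cd):
    """Inverse of haar_dwt: rebuild the time-domain samples."""
    r = math.sqrt(2.0)
    x = []
    for a, d in zip(ca, cd):
        x.extend([(a + d) / r, (a - d) / r])
    return x


def embed(segment, bits, delta=0.05):
    """Raise the low-frequency coefficient for bit 1, lower it for bit 0."""
    ca, cd = haar_dwt(segment)
    if len(bits) > len(ca):
        raise ValueError("too many watermark bits for this segment")
    for i, bit in enumerate(bits):
        ca[i] += delta if bit == 1 else -delta  # delta is an assumed strength
    return haar_idwt(ca, cd)  # detail coefficients cd are left untouched


# Hypothetical 8-sample segment carrying 4 watermark bits.
segment = [0.10, 0.12, -0.30, -0.28, 0.05, 0.00, 0.40, 0.42]
bits = [1, 0, 1, 0]
marked = embed(segment, bits)
```

A larger `delta` makes the watermark easier to recover after distortion but also more audible, so in practice the value is a trade-off between robustness and quality.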

2.2 Watermark Extraction

To ensure information security, the sender transmits the audio signal with the watermark embedded, and the receiver extracts the watermark to verify the integrity of the audio and the authenticity of its source. Watermark extraction is therefore particularly important. During extraction, the DWT is applied to both the original audio signal and the watermarked audio signal, and their coefficients are compared to recover the watermark information. The block diagram of the extraction principle is shown in Figure 2.

As described earlier, the watermark is embedded in the low-frequency approximation coefficients while the high-frequency details are left unchanged. During extraction, we therefore compare the low-frequency wavelet coefficient vector cA of the original speech signal S with the low-frequency wavelet coefficient vector cA1 of the watermarked audio signal S1: if cA1 > cA, the watermark bit is 1; otherwise, it is 0. The one-dimensional watermark vector is then obtained by averaging the vectors extracted from the segments, and finally the binary image is recovered by reshaping this vector back into two dimensions.
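The comparison, averaging, and reshaping steps might be sketched as follows. Several details are assumptions: the averaged values are thresholded at 0.5 to get binary pixels, the image is rebuilt in row-major order, and the helper `mark` is only a hypothetical stand-in for the embedder that creates test data. If the watermark was scrambled before embedding, it would also have to be unscrambled before reshaping.

```python
import math


def haar_approx(x):
    """Low-frequency (approximation) coefficients of a one-level Haar DWT."""
    r = math.sqrt(2.0)
    return [(x[i] + x[i + 1]) / r for i in range(0, len(x) - 1, 2)]


def extract_bits(original_seg, marked_seg, n_bits):
    """Bit is 1 where the watermarked coefficient exceeds the original one."""
    ca = haar_approx(original_seg)
    ca1 = haar_approx(marked_seg)
    return [1 if ca1[i] > ca[i] else 0 for i in range(n_bits)]


def average_and_reshape(per_segment_bits, m1, m2):
    """Average the per-segment vectors, threshold at 0.5, rebuild the image."""
    k = len(per_segment_bits)
    w = []
    for i in range(m1 * m2):
        mean = sum(seg[i] for seg in per_segment_bits) / k
        w.append(1 if mean >= 0.5 else 0)  # 0.5 threshold is an assumption
    return [w[r * m2:(r + 1) * m2] for r in range(m1)]


def mark(segment, bits, step=0.03):
    """Hypothetical embedder: shifting both samples of a pair by the same
    amount changes only that pair's approximation coefficient."""
    out = list(segment)
    for i, bit in enumerate(bits):
        shift = step if bit == 1 else -step
        out[2 * i] += shift
        out[2 * i + 1] += shift
    return out


bw = [[1, 0], [0, 1]]          # hypothetical 2 x 2 watermark
w = [1, 0, 0, 1]               # its flattened form
originals = [[0.1, 0.2, -0.3, -0.1, 0.0, 0.05, 0.4, 0.3],
             [0.2, 0.1, 0.0, -0.2, 0.3, 0.25, -0.4, -0.1],
             [-0.1, 0.0, 0.2, 0.2, -0.3, 0.1, 0.05, 0.0]]
received = [mark(seg, w) for seg in originals]
received[1][0] -= 0.2          # simulate distortion that flips one bit
received[1][1] -= 0.2          # in the second segment

per_segment = [extract_bits(o, r, len(w)) for o, r in zip(originals, received)]
recovered = average_and_reshape(per_segment, 2, 2)
```

In this example the second segment yields one wrong bit, but averaging over the three segments still recovers the original image. This shows why embedding the watermark in every segment helps.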

2. Partial source code

function varargout = WatermarkGUI(varargin)
% WATERMARKGUI MATLAB code for WatermarkGUI.fig
%      WATERMARKGUI, by itself, creates a new WATERMARKGUI or raises the existing
%      singleton*.
%
%      H = WATERMARKGUI returns the handle to a new WATERMARKGUI or the handle to
%      the existing singleton*.
%
%      WATERMARKGUI('CALLBACK',hObject,eventData,handles,...) calls the local
%      function named CALLBACK in WATERMARKGUI.M with the given input arguments.
%
%      WATERMARKGUI('Property','Value',...) creates a new WATERMARKGUI or raises the
%      existing singleton*.  Starting from the left, property value pairs are
%      applied to the GUI before WatermarkGUI_OpeningFcn gets called.  An
%      unrecognized property name or invalid value makes property application
%      stop.  All inputs are passed to WatermarkGUI_OpeningFcn via varargin.
%
%      *See GUI Options on GUIDE's Tools menu.  Choose "GUI allows only one
%      instance to run (singleton)".
%
% See also: GUIDE, GUIDATA, GUIHANDLES

% Edit the above text to modify the response to help WatermarkGUI

% Last Modified by GUIDE v2.5 29-Nov-2020 18:47:27

% Begin initialization code - DO NOT EDIT
gui_Singleton = 1;
gui_State = struct('gui_Name',       mfilename, ...
                   'gui_Singleton',  gui_Singleton, ...
                   'gui_OpeningFcn', @WatermarkGUI_OpeningFcn, ...
                   'gui_OutputFcn',  @WatermarkGUI_OutputFcn, ...
                   'gui_LayoutFcn',  [], ...
                   'gui_Callback',   []);
if nargin && ischar(varargin{1})
    gui_State.gui_Callback = str2func(varargin{1});
end

if nargout
    [varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});
else
    gui_mainfcn(gui_State, varargin{:});
end
% End initialization code - DO NOT EDIT


% --- Executes just before WatermarkGUI is made visible.
function WatermarkGUI_OpeningFcn(hObject, eventdata, handles, varargin)
% This function has no output args, see OutputFcn.
% hObject    handle to figure
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
% varargin   command line arguments to WatermarkGUI (see VARARGIN)

% Choose default command line output for WatermarkGUI
handles.output = hObject;

% Update handles structure
guidata(hObject, handles);

% UIWAIT makes WatermarkGUI wait for user response (see UIRESUME)
% uiwait(handles.figure1);


% --- Outputs from this function are returned to the command line.
function varargout = WatermarkGUI_OutputFcn(hObject, eventdata, handles) 
% varargout  cell array for returning output args (see VARARGOUT);
% hObject    handle to figure
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)

% Get default command line output from handles structure
varargout{1} = handles.output;


% --- Executes on button press in pushbutton4.
function pushbutton4_Callback(hObject, eventdata, handles)
% hObject    handle to pushbutton4 (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
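% Let the user pick a WAV file; return if the dialog is cancelled.
[file, folder] = uigetfile('*.wav');
if isequal(file, 0)
    return;
end

% Build the full path so the file need not be in the current folder.
[AWO,fs] = audioread(fullfile(folder, file));

axes(handles.axes10);
plot(AWO,'Color','yellow');

% Share the signal and sample rate with the other callbacks.
setappdata(0,'AWO', AWO);
setappdata(0,'fs', fs);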

% --- Executes on button press in pushbutton5.
function pushbutton5_Callback(hObject, eventdata, handles)
% hObject    handle to pushbutton5 (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)


% --- Executes on button press in pushbutton6.
function pushbutton6_Callback(hObject, eventdata, handles)
% hObject    handle to pushbutton6 (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
A = getappdata(0,'A');

fs = getappdata(0,'fs');
%% Step 2 : DWT Level 1 with wavelet db3

[Aa,Ad] = dwt(A,'db3');

% AI = idwt (Aa,Ad,'db3');

%% Step 3 : SVD decomposition of Ad
% Preparation before the SVD: truncate to a perfect-square length.
% floor (not round) keeps d^2 <= lt, so the indexing below stays in range.
lt = length(Ad);
d = floor(sqrt(lt));
Ad = Ad(1:d^2);
Aa = Aa(1:d^2);

% Reshape Ad into a d-by-d square matrix
Adr = reshape(Ad,d,d);

[U_Ad,S_Ad,V_Ad] = svd(Adr);



%% -------------------------------%%
%% Step 1a: watermark audio file
W = getappdata(0,'W');
% Equalize the dimensions of W and A by zero-padding W
% (assumes W is a column vector no longer than A)
la = length(A);
lw = length(W);

tmbah0 = zeros(la-lw,1);
W = [W ; tmbah0];

%% Step 2a: DWT Level 1 with wavelet db3

[Wa,Wd] = dwt(W,'db3');

% AI = idwt (Aa,Ad,'db3');

%% SVD decomposition of Wd
% Preparation before the SVD: truncate to a perfect-square length
lt = length(Wd);
d = floor(sqrt(lt));
Wd = Wd(1:d^2);
Wa = Wa(1:d^2);

% Reshape Wd into a d-by-d square matrix
Wdr = reshape(Wd,d,d);

[U_Wd,S_Wd,V_Wd] = svd(Wdr);

% AWO
AWO = getappdata(0,'AWO');

%% DWT Level 1 with wavelet db3

[AWOa,AWOd] = dwt(AWO,'db3');

% AI = idwt (Aa,Ad,'db3');

%% SVD decomposition of AWOd
% Preparation before the SVD: truncate to a perfect-square length
lt = length(AWOd);
d = floor(sqrt(lt));
AWOd = AWOd(1:d^2);
AWOa = AWOa(1:d^2);

%% Reshape AWOd into a d-by-d square matrix
AWOdr = reshape(AWOd,d,d);
%% SVD
[U_AWOd,S_AWOd,V_AWOd] = svd(AWOdr);
% Estimate the watermark's singular values from the difference between the
% singular values of AWO and A; 0.01 is presumably the embedding strength.
S_Wd1 = (S_AWOd - S_Ad)/0.01;

Wd1 = U_Wd * S_Wd1 * V_Wd';
%% Reshape into a column vector
Wd1R = reshape(Wd1,d^2,1);

3. Operation results

4. MATLAB version

MATLAB R2014a