1. Introduction
Speech synthesis is increasingly common in everyday life: reading apps, audiobooks, order announcements, smart hardware, and voice navigation have all added voice broadcasting. Built on deep neural networks, speech synthesis services produce highly human-like, smooth, and natural speech. They can imitate different voices, let apps and devices speak, and support training personalized voices.
This article describes how to use the speech synthesis service provided by Huawei Cloud: you call the Huawei Cloud APIs and download the synthesized audio.
2. Enabling the Service
Huawei Cloud provides speech synthesis, a service that converts text into realistic speech. You call the API in real time with the input text and receive the synthesized audio in the response. Timbre selection and custom volume and speed let enterprises and individuals tailor the pronunciation to their needs.
2.1 Speech Interaction Service (SIS)
Address: console.huaweicloud.com/sis/?region…
2.2 Help Documents
Address: support.huaweicloud.com/api-sis/sis…
(1) Request header parameters:
Parameter | Required | Type | Description |
---|---|---|---|
X-Auth-Token | Yes | String | The user token. With token authentication, the token is added to the request header of every API call to prove the caller's identity and obtain permission to use the API. The token is the value of the X-Subject-Token response header returned by the token request. |
How to obtain the X-Auth-Token request header was covered in a previous article: bbs.huaweicloud.com/blogs/31775… (see section 2.3 of that article).
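As a rough illustration of pulling the token out of a response, the sketch below looks up a header by name in a list of name/value pairs. HTTP header names are case-insensitive, so the comparison ignores case. `HeaderList`, `toLowerAscii`, and `findHeader` are hypothetical names introduced for this example; they are not part of any Huawei Cloud SDK.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <utility>
#include <vector>

// Hypothetical representation of response headers as (name, value) pairs.
using HeaderList = std::vector<std::pair<std::string, std::string>>;

// Lower-cases an ASCII string so header names can be compared without regard to case.
std::string toLowerAscii(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Returns the value of the first header whose name matches, or "" if it is absent.
std::string findHeader(const HeaderList& headers, const std::string& name) {
    const std::string wanted = toLowerAscii(name);
    for (const auto& header : headers) {
        if (toLowerAscii(header.first) == wanted) {
            return header.second;
        }
    }
    return "";
}
```

Calling `findHeader(headers, "X-Subject-Token")` on the headers of the token response would then give the value to send back as `X-Auth-Token`.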
(2) Request body parameters:
Parameter | Required | Type | Description |
---|---|---|---|
text | Yes | String | The text to synthesize. It must be shorter than 500 characters. |
config | No | TtsConfig object | Speech synthesis configuration. |
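A client can check the text length before sending the request. The sketch below assumes the limit is counted in Unicode code points and that the text is valid UTF-8; both are assumptions about how the service counts characters, and `countCodePoints` and `isTextLengthValid` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts Unicode code points in a UTF-8 string by skipping continuation
// bytes (bit pattern 10xxxxxx). Assumes the input is valid UTF-8.
std::size_t countCodePoints(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}

// Hypothetical helper: true if the text is non-empty and shorter than
// the 500-character limit given in the request body table.
bool isTextLengthValid(const std::string& utf8) {
    const std::size_t kMaxChars = 500;
    const std::size_t n = countCodePoints(utf8);
    return n > 0 && n < kMaxChars;
}
```

Counting code points rather than bytes matters for Chinese text, where each character usually occupies three bytes in UTF-8.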
(3) TtsConfig parameters:
Parameter | Required | Type | Description |
---|---|---|---|
audio_format | No | String | Audio format: wav, mp3, or pcm. Default: wav. Parent node: config. |
sample_rate | No | String | Sampling rate: 16000 or 8000 Hz. Default: 8000. Parent node: config. |
property | No | String | Voice feature string in the form {language}_{speaker}_{domain}. Speakers are divided into standard and premium speakers, and the price per call is the same for both. For premium speakers, every 50 characters count as one call, and fewer than 50 characters also count as one call. For standard speakers, every 100 characters count as one call, and fewer than 100 characters also count as one call. One Chinese character, one English letter, or one punctuation mark counts as one character. Premium speakers are supported only in the cn-north-4 and cn-east-3 regions and do not currently support pitch adjustment. If error SIS.0411 is returned, check that the request complies with these restrictions. Default: chinese_xiaoyan_common. Parent node: config. |
speed | No | Integer | Speech speed. Value range: -500 to 500. Default: 0. Parent node: config. A value of 0 corresponds to the normal speaking speed of an adult, about 250 characters per minute. There is no absolute mapping between the value and the resulting speed. |
pitch | No | Integer | Pitch. Value range: -500 to 500. Default: 0. Parent node: config. |
volume | No | Integer | Volume. Value range: 0 to 100. Default: 50. Parent node: config. |
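The ranges and defaults in this table can be checked on the client before a request is sent. The following is a minimal sketch: the `TtsConfig` struct and `validateConfig` function are hypothetical names that mirror the table, and the lowercase format names are an assumption based on the request example later in this article.

```cpp
#include <cassert>
#include <string>

// Hypothetical mirror of the TtsConfig fields, initialized with the documented defaults.
struct TtsConfig {
    std::string audio_format = "wav";
    std::string sample_rate = "8000";
    std::string property = "chinese_xiaoyan_common";
    int speed = 0;
    int pitch = 0;
    int volume = 50;
};

// Returns an empty string if every field is within the documented range,
// otherwise a short description of the first problem found.
std::string validateConfig(const TtsConfig& c) {
    const bool formatOk =
        c.audio_format == "wav" || c.audio_format == "mp3" || c.audio_format == "pcm";
    if (!formatOk) return "audio_format must be wav, mp3 or pcm";
    if (c.sample_rate != "8000" && c.sample_rate != "16000")
        return "sample_rate must be 8000 or 16000";
    if (c.property.empty()) return "property must not be empty";
    if (c.speed < -500 || c.speed > 500) return "speed must be in [-500, 500]";
    if (c.pitch < -500 || c.pitch > 500) return "pitch must be in [-500, 500]";
    if (c.volume < 0 || c.volume > 100) return "volume must be in [0, 100]";
    return "";
}
```

Returning a message instead of a bare boolean makes it easy to show the user which field needs correcting.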
(4) Property values for standard speakers:
Property value | Description |
---|---|
chinese_xiaoqi_common | Xiaoqi, standard female voice speaker. |
chinese_xiaoyu_common | Xiaoyu, standard male voice speaker. |
chinese_xiaoyan_common | Xiaoyan, gentle female voice speaker. |
chinese_xiaowang_common | Xiaowang, child voice speaker. |
chinese_xiaowen_common | Xiaowen, soft and sweet female voice speaker. |
chinese_xiaojing_common | Xiaojing, playful female voice speaker. |
chinese_xiaosong_common | Xiaosong, passionate male voice speaker. |
chinese_xiaoxia_common | Xiaoxia, passionate female voice speaker. |
chinese_xiaodai_common | Xiaodai, cute child voice speaker. |
chinese_xiaoqian_common | Xiaoqian, mature female voice speaker. |
english_cameal_common | Cameal, gentle female voice English speaker. |
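The property values listed above follow a three-part, underscore-separated layout: language, speaker, and domain. As a small sketch, the function below splits such a string into its parts. `SpeakerProperty` and `parseProperty` are hypothetical names, and the function only checks the shape of the string, not whether the speaker actually exists.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical result type: the three parts of a property string.
struct SpeakerProperty {
    bool valid = false;
    std::string language;
    std::string speaker;
    std::string domain;
};

// Splits "language_speaker_domain" at its two underscores.
// The result is marked invalid unless there are exactly three non-empty parts.
SpeakerProperty parseProperty(const std::string& property) {
    SpeakerProperty result;
    const std::size_t first = property.find('_');
    if (first == std::string::npos) return result;
    const std::size_t second = property.find('_', first + 1);
    if (second == std::string::npos) return result;
    if (property.find('_', second + 1) != std::string::npos) return result;

    result.language = property.substr(0, first);
    result.speaker = property.substr(first + 1, second - first - 1);
    result.domain = property.substr(second + 1);
    result.valid = !result.language.empty() && !result.speaker.empty() &&
                   !result.domain.empty();
    return result;
}
```

For example, `parseProperty("chinese_xiaoqi_common")` yields the language `chinese`, the speaker `xiaoqi`, and the domain `common`.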
(5) Property values for premium speakers:
Property value | Description |
---|---|
chinese_huaxiaoxia_common | Hua Xiaoxia, passionate female voice speaker. |
chinese_huaxiaogang_common | Hua Xiaogang, agile male voice speaker. |
chinese_huaxiaolu_common | Hua Xiaolu, intellectual female voice speaker. |
chinese_huaxiaoshu_common | Hua Xiaoshu, soothing female voice speaker. |
chinese_huaxiaowei_common | Hua Xiaowei, gentle female voice speaker. |
chinese_huaxiaoliang_common | Hua Xiaoliang, loud and clear female voice speaker. |
chinese_huaxiaodong_common | Hua Xiaodong, mature male voice speaker. |
chinese_huaxiaoyan_common | Hua Xiaoyan, strict female voice speaker. |
chinese_huaxiaoxuan_common | Hua Xiaoxuan, Taiwanese-accented female voice speaker. |
chinese_huaxiaowen_common | Hua Xiaowen, soft and sweet female voice speaker. |
chinese_huaxiaoyang_common | Hua Xiaoyang, vigorous male voice speaker. |
chinese_huaxiaomin_common | Hua Xiaomin, Southern Fujian (Minnan) accented female voice speaker. |
chinese_huanvxia_literature | Hua Nüxia, wuxia-style female voice; supports only the 16 kHz sampling rate. |
chinese_huaxiaoxuan_literature | Hua Xiaoxuan, suspense-style male voice; supports only the 16 kHz sampling rate. |
chinese_huaxiaomei_common | Hua Xiaomei, gentle female voice speaker. |
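The billing rule from the property description (one call per 50 characters for premium speakers, one per 100 for standard speakers, with any remainder counted as a full call) is a ceiling division. The sketch below illustrates it. `billedCalls` and `isPremiumSpeaker` are hypothetical helpers, and treating a speaker name that starts with `hua` as premium is only a heuristic read off the two tables above, not a documented rule.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Heuristic (an assumption based on the tables in this article): every premium
// speaker name starts with "hua", and no standard speaker name does.
bool isPremiumSpeaker(const std::string& property) {
    const std::size_t sep = property.find('_');
    return sep != std::string::npos && property.compare(sep + 1, 3, "hua") == 0;
}

// Number of billed calls for a text of charCount characters:
// ceil(charCount / 50) for premium speakers, ceil(charCount / 100) for standard ones.
// Empty text is not a valid request, so it is reported as zero calls.
std::size_t billedCalls(std::size_t charCount, bool premiumSpeaker) {
    const std::size_t perCall = premiumSpeaker ? 50 : 100;
    return (charCount + perCall - 1) / perCall;
}
```

A 120-character text would therefore count as three calls with a premium speaker and two with a standard one.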
(6) Response body parameters
Status code: 200
Parameter | Required | Type | Description |
---|---|---|---|
trace_id | No | String | Internal service token that can be used to trace the request in the logs. It may be absent in some error cases. |
result | No | CustomResult object | Present when the call succeeds and contains the synthesis result. Absent when the call fails. |
(7) CustomResult parameters
Parameter | Required | Type | Description |
---|---|---|---|
data | No | String | Audio data, Base64-encoded. To produce an audio file, decode the Base64 string into a byte array and save it as a file in the format given by audio_format (wav by default). |
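The decoding step can be sketched in standard C++ as follows. This is a minimal standard-alphabet Base64 decoder written for illustration; in a Qt application `QByteArray::fromBase64` does the same job. `decodeBase64` and `writeBinaryFile` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>

// Maps a Base64 alphabet character to its 6-bit value, or -1 if it is not in the alphabet.
int base64Value(unsigned char c) {
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

// Decodes standard Base64 into raw bytes (held in a std::string).
// Characters outside the alphabet, such as line breaks, are skipped;
// decoding stops at the first '=' padding character.
std::string decodeBase64(const std::string& encoded) {
    std::string out;
    std::uint32_t buffer = 0;
    int bits = 0;
    for (unsigned char c : encoded) {
        if (c == '=') break;
        const int value = base64Value(c);
        if (value < 0) continue;
        buffer = (buffer << 6) | static_cast<std::uint32_t>(value);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out.push_back(static_cast<char>((buffer >> bits) & 0xFF));
        }
    }
    return out;
}

// Writes the bytes to a file in binary mode; returns false if the file cannot be written.
bool writeBinaryFile(const std::string& path, const std::string& bytes) {
    std::ofstream file(path, std::ios::binary | std::ios::trunc);
    if (!file) return false;
    file.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    return file.good();
}
```

A WAV file begins with the four bytes `RIFF`, so a decoded wav response should start with that marker, which makes a quick sanity check before saving.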
2.3 Online Debugging Interface
The online debugging interface lets you quickly try out request parameters, request methods, and returned results.
Address: apiexplorer.developer.huaweicloud.com/apiexplorer…
You can also fill in test parameters online and check the result.
2.4 Request Summary
Request address format and an example address:

    POST /v1/{project_id}/tts
    https://sis-ext.cn-north-4.myhuaweicloud.com/v1/0e5957be8a00f53c2fa7c0045e4d8fbf/tts

Request body:

    {
      "text": "Please mind your sitting posture",
      "config": {
        "audio_format": "wav",
        "sample_rate": "16000",
        "property": "chinese_xiaoqi_common",
        "speed": 0,
        "pitch": 0,
        "volume": 0
      }
    }

Request header:

    {
      "X-Auth-Token": "******",
      "Content-Type": "application/json;charset=UTF-8"
    }

Response body:

    {"result": {"data": "xxxxxxxx"}}

Here xxxxxxxx stands for the returned Base64-encoded audio data, which can be decoded and saved as a file.
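To make the request layout concrete, the sketch below assembles the address and a compact JSON body in standard C++. `buildTtsUrl`, `escapeJson`, and `buildTtsBody` are hypothetical helper names, the default argument values are illustrative choices, and the escaping covers only quotes, backslashes, and control characters. A Qt program would more naturally use `QJsonObject` for this.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Builds the request address from a region (for example cn-north-4) and a project ID.
std::string buildTtsUrl(const std::string& region, const std::string& projectId) {
    return "https://sis-ext." + region + ".myhuaweicloud.com/v1/" + projectId + "/tts";
}

// Escapes a UTF-8 string so it can be placed inside a JSON string literal.
std::string escapeJson(const std::string& s) {
    std::string out;
    for (unsigned char c : s) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:
                if (c < 0x20) {
                    // Remaining control characters use the \u00XX form.
                    char buf[7];
                    std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(c));
                    out += buf;
                } else {
                    out += static_cast<char>(c);
                }
        }
    }
    return out;
}

// Builds a compact request body with the fields used in the example request.
std::string buildTtsBody(const std::string& text, const std::string& property,
                         const std::string& audioFormat = "wav",
                         const std::string& sampleRate = "16000",
                         int speed = 0, int pitch = 0, int volume = 50) {
    std::string body = "{\"text\":\"" + escapeJson(text) + "\",\"config\":{";
    body += "\"audio_format\":\"" + escapeJson(audioFormat) + "\",";
    body += "\"sample_rate\":\"" + escapeJson(sampleRate) + "\",";
    body += "\"property\":\"" + escapeJson(property) + "\",";
    body += "\"speed\":" + std::to_string(speed) + ",";
    body += "\"pitch\":" + std::to_string(pitch) + ",";
    body += "\"volume\":" + std::to_string(volume) + "}}";
    return body;
}
```

Escaping the text matters because user input containing a double quote would otherwise produce invalid JSON.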
3. Source Code Implementation
The application is built with Qt. The core logic consists mainly of HTTP request handling.
3.1 Text-to-Speech Source Code

    // Requires: #include <QJsonDocument> and #include <QJsonObject>

    // Text to speech
    void Widget::TextToAudio(QString text)
    {
        // Mark the pending request as speech synthesis so that
        // replyFinished() knows how to handle the reply
        function_select = 1;

        // Request address
        QString requestUrl = QString("https://sis-ext.%1.myhuaweicloud.com/v1/%2/tts")
                                 .arg(SERVER_ID)
                                 .arg(PROJECT_ID);

        QNetworkRequest request;
        request.setUrl(QUrl(requestUrl));

        // Format of the submitted data
        request.setHeader(QNetworkRequest::ContentTypeHeader, QVariant("application/json"));

        // Token used for authentication
        request.setRawHeader("X-Auth-Token", Token);

        // Build the body with QJsonObject so that quotes, backslashes, and
        // '%' sequences in the user's text cannot corrupt the JSON
        QJsonObject config;
        config["audio_format"] = ui->comboBox_formt->currentText();
        config["sample_rate"] = ui->comboBox_cai_yang_lv->currentText();
        config["property"] = ui->comboBox_fa_yin_ren->currentText();
        config["speed"] = ui->spinBox_audio_speed->value();
        config["pitch"] = 0;
        config["volume"] = ui->spinBox_yin_liang->value();

        QJsonObject body;
        body["text"] = text;
        body["config"] = config;

        // Send the request
        manager->post(request, QJsonDocument(body).toJson(QJsonDocument::Compact));
    }

    // Button handler: read the text from the input box and start synthesis
    void Widget::on_pushButton_to_audio_clicked()
    {
        QString text = ui->lineEdit->text();
        if(text.isEmpty())
        {
            QMessageBox::information(this, "Tip", "Please enter the text.", QMessageBox::Ok, QMessageBox::Ok);
            return;
        }
        qDebug() << "text:" << text;
        TextToAudio(text);
    }
3.2 Obtaining the Token

    /*
    Function: get the token
    */
    void Widget::GetToken()
    {
        // Mark the pending request as a token request
        function_select = 3;

        QString requestUrl;
        QNetworkRequest request;

        // Request address
        QUrl url;

        // Token request address
        requestUrl = QString("https://iam.%1.myhuaweicloud.com/v3/auth/tokens")
                         .arg(SERVER_ID);

        // Self-built TCP server, used for testing
        //requestUrl="http://10.0.0.6:8080";

        // Format of the submitted data
        request.setHeader(QNetworkRequest::ContentTypeHeader, QVariant("application/json;charset=UTF-8"));

        // Construct the request
        url.setUrl(requestUrl);
        request.setUrl(url);

        // Request body: password authentication, scoped to the project
        QString text = QString("{\"auth\":{\"identity\":{\"methods\":[\"password\"],\"password\":"
                               "{\"user\":{\"domain\":{\"name\":\"%1\"},\"name\":\"%2\",\"password\":\"%3\"}}},"
                               "\"scope\":{\"project\":{\"name\":\"%4\"}}}}")
                           .arg(MAIN_USER)
                           .arg(IAM_USER)
                           .arg(IAM_PASSWORD)
                           .arg(SERVER_ID);

        // Send the request
        manager->post(request, text.toUtf8());
    }
3.3 Parsing the Returned Value

    // Parse the reply
    void Widget::replyFinished(QNetworkReply *reply)
    {
        QString displayInfo = "";
        int statusCode = reply->attribute(QNetworkRequest::HttpStatusCodeAttribute).toInt();

        // Read all data, then let Qt delete the reply once control returns to the event loop
        QByteArray replyData = reply->readAll();
        reply->deleteLater();

        qDebug() << "Status code:" << statusCode;
        qDebug() << "Response data:" << QString(replyData);

        // Token update
        if(function_select == 3)
        {
            displayInfo = "Token update failed.";

            // Read the HTTP response headers
            QList<QNetworkReply::RawHeaderPair> RawHeader = reply->rawHeaderPairs();
            qDebug() << "Number of HTTP response headers:" << RawHeader.size();
            for(int i = 0; i < RawHeader.size(); i++)
            {
                QString first = RawHeader.at(i).first;
                QString second = RawHeader.at(i).second;
                // Header names are case-insensitive
                if(first.compare("X-Subject-Token", Qt::CaseInsensitive) == 0)
                {
                    Token = second.toUtf8();
                    displayInfo = "Token update succeeded.";

                    // Save to file
                    SaveDataToFile(Token);
                    break;
                }
            }
            QMessageBox::information(this, "Tip", displayInfo, QMessageBox::Ok, QMessageBox::Ok);
            return;
        }

        // Check the status code
        if(200 != statusCode)
        {
            // Parse the error response
            QJsonParseError json_error;
            QJsonDocument document = QJsonDocument::fromJson(replyData, &json_error);
            if(json_error.error == QJsonParseError::NoError)
            {
                if(document.isObject())
                {
                    QString error_str = "";
                    QJsonObject obj = document.object();
                    QString error_code;
                    // Error code
                    if(obj.contains("error_code"))
                    {
                        error_code = obj.take("error_code").toString();
                        error_str += "Error code:";
                        error_str += error_code;
                        error_str += "\n";
                    }
                    // Error message
                    if(obj.contains("error_msg"))
                    {
                        error_str += "Error message:";
                        error_str += obj.take("error_msg").toString();
                        error_str += "\n";
                    }
                    // Show the error
                    QMessageBox::information(this, "Tip", error_str, QMessageBox::Ok, QMessageBox::Ok);
                }
            }
            return;
        }
        else if(function_select == 1) // Speech synthesis result
        {
            QJsonParseError json_error;
            QJsonDocument document = QJsonDocument::fromJson(replyData, &json_error);
            if(json_error.error == QJsonParseError::NoError)
            {
                if(document.isObject())
                {
                    QJsonObject obj = document.object();
                    if(obj.contains("result"))
                    {
                        QJsonObject obj1 = obj.take("result").toObject();
                        if(obj1.contains("data"))
                        {
                            // Decode the Base64 audio data
                            QString data = obj1.take("data").toString();
                            QByteArray d2 = QByteArray::fromBase64(data.toUtf8());
                            qDebug() << "Data obtained successfully.";

                            QStringList path_list = QStandardPaths::standardLocations(QStandardPaths::DownloadLocation);

                            // Ask where to save the file
                            QString filename = QFileDialog::getSaveFileName(this, "Save the audio file", path_list.at(0), tr("*.wav *.mp3 *.pcm"));
                            if(filename.isEmpty())
                            {
                                // Fall back to a default name (the original used the extension .wmv, which is a video format)
                                filename = path_list.at(0) + "/123.wav";
                            }

                            QFile::remove(filename);
                            QFile file_2(filename);
                            if(file_2.open(QIODevice::WriteOnly))
                            {
                                file_2.write(d2); // Write the audio data
                                file_2.close();
                            }
                        }
                    }
                }
            }
        }
    }