To implement translation (using Chinese-to-English as the example), we can break the work into several steps:
- Read the file
- Find the Chinese text in the file
- Put the Chinese phrases or sentences into an array
- Call a public translation API to translate them asynchronously
- Write the translated English back into the corresponding file
First, set up the environment
Open a terminal and type: node -v
If the terminal prints
-bash: node: command not found
the Node environment is not set up. Download and install Node from the official website.
If a version number such as v10.16.0 appears instead, your Node environment is ready.
Second, read the file
We start by reading a fixed file; translating uploaded files comes later. If that is what you need, you can skip ahead to the later content.
First, create a replace.js file in the project folder to hold our code.
Then create a js/ts file to be translated, for example DesUtils.ts, with the following code (in the real file the comments and log strings are in Chinese; they are shown here in English translation):
const desKeyObj = {
  desKey: 'ztesoftbasemobile20160812..',
  ivKey: '01234567890'
}
export default {
  /**
   * encryption
   * @param dataStr
   */
  encrypt: function (dataStr) {
    try {
      console.log("If this is a paragraph. And there's punctuation.");
    } catch (error) {
      console.log('Encryption error' + JSON.stringify(error));
    }
  },
  /**
   * decrypt
   * @param dataStr
   */
  crypto: function (dataStr) {
    try {
      console.log(' ');
    } catch (error) {
      console.log('Decryption error')
    }
  }
}
Now we start writing replace.js so that it reads the DesUtils.ts file.
First, use require to import the module we need.
var fs = require('fs');
Node reads and writes files through streams.
Node has four stream types (defined in the stream module, which fs builds on):
- Readable – a stream you can read from
- Writable – a stream you can write to
- Duplex – a stream that is both readable and writable
- Transform – a duplex stream whose output is computed from the data written to it
Streams also emit four commonly used events:
- data – triggered when there is data to read
- end – triggered when there is no more data to read
- error – triggered when an error occurs while reading or writing
- finish – triggered when all data has been written to the underlying system
Don't worry if this seems abstract; the following code makes it concrete.
Create a readable stream and handle its events:
// Create a readable stream
let readerStream = fs.createReadStream('DesUtils.ts');
readerStream.setEncoding('UTF8');
let data = '';
// Handle the stream events
readerStream.on('data', (chunk) => {
  data += chunk;
});
readerStream.on('end', () => {
  console.log(data);
});
// Optional: handle errors
readerStream.on('error', (err) => {
  console.log(err.stack);
});
The code to read the file is written, so let's try it:
In the terminal, go to the project folder and run: node replace.js. As expected, the contents of the file are printed to the terminal.
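The readable stream above exercises the data, end and error events. The remaining pieces, a Transform stream, a Writable stream and the finish event, can be sketched in the same way. This is only an illustration: the temp-file name stream-demo.txt and the variables upperCase and demoWriter are made up for this example and are not part of replace.js.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { Transform } = require('stream');

// Hypothetical output path: a temp file, so nothing in the project folder is touched
const outPath = path.join(os.tmpdir(), 'stream-demo.txt');

// Transform: whatever is written in comes out upper-cased
const upperCase = new Transform({
  transform(chunk, encoding, callback) {
    callback(null, chunk.toString().toUpperCase());
  },
});

// Writable: a file stream that receives the transformed data
const demoWriter = fs.createWriteStream(outPath);

// finish: fires once all data has been handed to the underlying system
demoWriter.on('finish', () => {
  console.log('Finished writing', outPath);
});
demoWriter.on('error', (err) => {
  console.log(err.stack);
});

// Connect the two streams, then feed the Transform by hand
upperCase.pipe(demoWriter);
upperCase.write('hello ');
upperCase.end('stream');
```

Because a Transform is also a Duplex stream, it can be written to directly and piped onward, which is why no separate readable source is needed here.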
Third, list all the Chinese text
Next we process the data and pull out the Chinese text with a regular expression. Modify the 'data' handler in the code above:
readerStream.on('data', (chunk) => {
  const reg = /[\u4e00-\u9fa5]/g;
  let list;
  while (list = reg.exec(chunk)) {
    console.log(list[0]); // Print each Chinese character
  }
});
OK, now we have extracted every Chinese character. If the goal were only to remove the Chinese, we could simply delete these characters.
But for translation into English, we can't translate character by character, right?
So we need to group the characters: those that form a phrase stay together as a phrase, those that form a sentence stay together as a sentence, and the groups are stored in an array.
My approach is to look at each character's index: is it adjacent to the index of the previous Chinese character? If it is, the two belong to the same phrase or sentence and are not split. (If that is unclear, reading the code directly may help.)
If you know a better method, feel free to try it.
readerStream.on('data', (chunk) => {
  var reg = /[\u4e00-\u9fa5]/g;
  let index = 0;
  let termList = []; // Array of Chinese phrases
  let term = '';
  let list;
  data = chunk;
  while (list = reg.exec(chunk)) {
    // A gap in the index means the previous phrase has ended
    if ((list.index !== index + 1) && term) {
      termList.push(term);
      term = '';
    }
    term += list[0];
    index = list.index;
  }
  termList.push(term);
  console.log(termList); // Print the Chinese phrases
});
With the code above, the Chinese is now grouped into phrases and sentences. The terminal prints the array (shown here in English translation): ['encrypt', 'if this is a paragraph', 'and punctuation', 'encrypt error', 'decrypt', 'decrypt error'].
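As one possible alternative to tracking indexes by hand, the regular expression itself can do the grouping: adding a + quantifier makes each match a whole run of adjacent Chinese characters. The sketch below assumes the same character range as the article; the function name extractChineseRuns and the sample input are made up for illustration.

```javascript
// Match whole runs of adjacent Chinese characters in one pass.
// Anything outside the range (ASCII, spaces, punctuation such as "。") ends a run.
function extractChineseRuns(text) {
  return text.match(/[\u4e00-\u9fa5]+/g) || [];
}

// Made-up sample input, similar in shape to the lines in DesUtils.ts
console.log(extractChineseRuns("console.log('加密错误'); // 解密"));
// [ '加密错误', '解密' ]
```

The result has the same shape as termList, so it could be dropped into the 'data' handler in place of the while loop.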
Translation itself is covered in the next section. First, we will replace the extracted Chinese with placeholder text.
Fourth, replace the Chinese
Create a writable stream and replace the matched Chinese with placeholder characters.
// Create a writable stream to the file replaceDesUtils.ts
let writerStream = fs.createWriteStream('replaceDesUtils.ts');
readerStream.on('data', (chunk) => {
  var reg = /[\u4e00-\u9fa5]/g;
  let index = 0;
  let termList = []; // Array of Chinese phrases
  let term = '';
  let list;
  data = chunk; // Work on the current chunk, as in the previous step
  while (list = reg.exec(chunk)) {
    // ... omitted
  }
  if (termList && termList.length) {
    termList.forEach((item) => {
      data = data.replace(item, '112233'); // So far, all Chinese has been replaced
    });
  }
  writerStream.end(data); // Write the replaced content to replaceDesUtils.ts
});
You should now see a replaceDesUtils.ts file in the current folder. Open it and check whether the replacement worked.
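One thing to watch for when checking the output: replace() with a string pattern only replaces the first occurrence, so a phrase that appears twice in the file is replaced only once. A minimal sketch of one way around this is to let a global regular expression do the replacing. The function name replaceChinese is hypothetical; the placeholder '112233' is the one used above.

```javascript
// Replace every run of Chinese characters with a placeholder.
// A replacer function is used so that characters like "$" in the
// placeholder are inserted literally rather than treated as patterns.
function replaceChinese(text, placeholder = '112233') {
  return text.replace(/[\u4e00-\u9fa5]+/g, () => placeholder);
}

// Made-up sample input with the same phrase appearing twice
console.log(replaceChinese("log('加密'); log('加密');"));
// log('112233'); log('112233');
```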
Fifth, implement the translation function
To call the translation API, I followed a Baidu tutorial document on implementing a Youdao Dictionary translator in Node.js; you can use it as a reference too.
Here are the steps to set up the demo (a new account comes with 50 yuan of credit, which is enough for our demo testing):
- Open Youdao Wisdom Cloud and register an account
- After logging in, find "Natural language translation - Translation instances" in the tab on the left. Create a new instance and select "Text translation" (you can choose another type; I only tried "Text translation")
- Open "Application management - My applications" in the tab on the left and create a new application. For "Access type" choose "API", then click Next to bind the newly created instance
- Check the application details: the "Application ID" and "Application key" fields correspond to the appKey and secretKey you will need in a moment
Next, create a translator.js file in the same folder and copy into it the code from the tutorial on building a Youdao Dictionary translator with Node.js.
This file depends on some libraries that we don't have yet, so we need to install them manually. The steps are as follows:
- 1. Create a package.json file in the current folder and copy in the following:
{
  "dependencies": {
    "request": "^2.88.0",
    "request-promise": "^4.2.5"
  }
}
- 2. Then run npm i or yarn in the terminal to install the libraries we need.
OK, let's continue writing our code in replace.js:
- Import the translator file and instantiate it
- Set the translation configuration
- Write a shared translation method (if Promise is unfamiliar, it is explained below)
- Translate the Chinese and write it to the file
var Translator = require('./translator');
let translator = new Translator();
translator.config = {
  from: 'zh_CHS',
  to: 'EN',
  appKey: '*********',
  secretKey: '****************'
}
function translateString(str) {
  return new Promise(function (resolve, reject) {
    let resultStr = translator.translate(str);
    resolve(resultStr);
  })
}
A note on translateString(): the translation API we call is asynchronous, so the result arrives after a delay, and without a Promise we would try to use it before it is ready.
A Promise represents the eventual result of an asynchronous operation.
The two arguments passed to its executor, resolve and reject, can be thought of as playing the role of return: pass the data you need through resolve, or an error through reject. Here's a small test:
translateString('come on').then((val) => {
  console.log(val);
})
The result is a long JSON string, so we parse it and pick out the translation field: console.log(JSON.parse(val).translation)
As you can see, we now have the translated text.
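To experiment with resolve and reject without an API key, a stand-in translator can be useful. The sketch below is hypothetical throughout: fakeTranslate is not part of translator.js, and the response shape (a JSON string with a translation array) is only assumed from the way the result is parsed above.

```javascript
// Hypothetical stand-in for the real translator: resolves after a short delay
// with a JSON string shaped like the response parsed in this article.
function fakeTranslate(str) {
  return new Promise((resolve, reject) => {
    if (typeof str !== 'string' || str.length === 0) {
      // reject() hands an error to .catch()
      reject(new Error('Nothing to translate'));
      return;
    }
    setTimeout(() => {
      // resolve() hands the result to .then()
      resolve(JSON.stringify({ translation: ['[EN] ' + str] }));
    }, 10);
  });
}

fakeTranslate('加油')
  .then((val) => console.log(JSON.parse(val).translation))
  .catch((err) => console.log(err.message));

// An empty string takes the reject path instead
fakeTranslate('').catch((err) => console.log(err.message));
```

The code after .then() runs only once the delayed result is available, which is the behavior translateString() relies on.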
Now let’s continue writing the code:
while (list = reg.exec(chunk)) {
  // ... omitted
}
termList.push(term);
console.log(termList); // Print the Chinese phrases
translateString(termList).then((val) => {
  let translation = JSON.parse(val).translation; // Print this to see why the code below is written the way it is
  let transList = []; // Array of translated phrases
  if (translation[0] && translation[0].indexOf(', ') !== -1) {
    transList = translation[0].split(', '); // Note the space after the comma
    transList.map((item, index) => {
      console.log(termList[index], item);
      const res = data.replace(termList[index], item);
      data = res;
    })
    writerStream.write(data, 'UTF8');
    writerStream.end();
    console.log('Write done');
  }
})
OK, now open replaceDesUtils.ts and see the result.
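The code above sends the whole array in one call and splits the answer on ', ', which only lines up if the translator keeps exactly one comma between phrases. A possible alternative, sketched here under assumptions, is to translate each phrase separately and wait for all of them with Promise.all. The names translatePhrases, translateFn, fakeDictionary and lookup are hypothetical; translateFn stands for any function that takes one phrase and returns a Promise of its translation as a plain string.

```javascript
// Translate each phrase on its own, then replace it everywhere in the text.
function translatePhrases(text, termList, translateFn) {
  return Promise.all(termList.map((term) => translateFn(term))).then((results) => {
    // Pair each phrase with its translation; handle longer phrases first so
    // a short phrase cannot break up a longer one that contains it.
    const pairs = termList
      .map((term, i) => ({ term, translated: results[i] }))
      .sort((a, b) => b.term.length - a.term.length);

    let output = text;
    pairs.forEach(({ term, translated }) => {
      // split/join replaces every occurrence, not just the first
      output = output.split(term).join(translated);
    });
    return output;
  });
}

// Usage with a fake translator, so the sketch runs without an API key
const fakeDictionary = { '加密': 'encrypt', '解密': 'decrypt' };
const lookup = (term) => Promise.resolve(fakeDictionary[term] || term);

translatePhrases("log('加密'); log('解密');", ['加密', '解密'], lookup)
  .then((result) => console.log(result));
// log('encrypt'); log('decrypt');
```

The trade-off is one request per phrase instead of one request in total, so with a real API this costs more calls and may run into rate limits.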
A later article will use Express to handle files, so that you can upload your own files for translation.