Building on the groundwork of the previous five articles, it is time to wire in some real business logic.
This article starts on the front-end multilingual feature and shows how to use Puppeteer to control Chrome/Chromium and download files.
I Contents
What’s the difference between a front-end developer who doesn’t tinker and a salted fish?
Contents |
---|
I Contents |
II Preface |
III Puppeteer |
3.1 Capturing a Snapshot |
3.2 Downloading Files |
IV References |
II Preface
Puppeteer is a Node library that provides a high-level API for controlling Chromium or Chrome via the DevTools protocol.
As the GitHub introduction explains, Puppeteer can do most of the things you do manually in your browser:
- Capture page snapshots
- Generate page PDFs
- Automatically manipulate the page DOM
- …
For detailed examples, please refer to the GitHub or Chinese documentation at the bottom of this article (readme.md).
III Puppeteer
- Installation:
npm i puppeteer
jsliang ran into an error during installation:
(node:7584) ExperimentalWarning: The fs.promises API is experimental
My installed Node.js version was too old, so I needed to upgrade Node.js.
There are two ways to upgrade: download the latest version and install it over the old one, or manage versions with nvm/nvmw.
My network connection is good, so I simply downloaded the latest release from the official Node.js website.
Check the version after installation:
node -v
v14.17.1
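The version check can also be done from a script before anything else runs. The following is only a minimal sketch: the helper name isAtLeast and the minimum version 12.0.0 are assumptions made for illustration, not values taken from Puppeteer's documentation, so substitute the minimum your Puppeteer release actually states.

```typescript
// Compare dotted version strings such as "14.17.1" numerically, part by part.
function isAtLeast(current: string, minimum: string): boolean {
  const cur = current.replace(/^v/, '').split('.').map(Number);
  const min = minimum.replace(/^v/, '').split('.').map(Number);
  for (let i = 0; i < Math.max(cur.length, min.length); i++) {
    const a = cur[i] ?? 0;
    const b = min[i] ?? 0;
    if (a !== b) return a > b;
  }
  return true; // identical versions
}

// ASSUMPTION: "12.0.0" is a placeholder minimum, not an official requirement.
const REQUIRED_NODE = '12.0.0';

// process.version looks like "v14.17.1"
if (!isAtLeast(process.version, REQUIRED_NODE)) {
  console.error(`Node.js ${process.version} is older than ${REQUIRED_NODE}; please upgrade.`);
}
```

Comparing the parts as numbers matters here: a plain string comparison would rank 10.9.0 above 10.18.1.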
package.json: "puppeteer": "^10.0.0"
Installing Puppeteer really tests your network speed, because the install step downloads a Chromium build, so expect all kinds of errors along the way.
The installation is complete.
3.1 Capturing a Snapshot
Let’s take a simple example of grabbing a page snapshot:
src/index.ts
import program from 'commander';
import common from './common';
import './base/console';
import puppeteer from 'puppeteer';

program
  .version('0.0.1')
  .description('Library of Tools');

program
  .command('jsliang')
  .description('Jsliang help instruction')
  .action(() => {
    common();
  });

program
  .command('test')
  .description('Test channel')
  .action(async () => {
    // Start the browser
    const browser = await puppeteer.launch({
      headless: false, // Open a visible (non-headless) browser window
    });

    // Create a new tab and open the page
    const page = await browser.newPage();
    await page.goto('https://www.baidu.com/s?wd=jsliang');

    // Take a snapshot and store it locally
    await page.screenshot({
      path: './src/baidu.png',
    });

    // Close the browser
    await browser.close();
  });

program.parse(process.argv);
After npm run test is executed, the src folder contains the image file baidu.png.
This step can be disrupted by proxy/VPN tools or 360 Security Guard, so make sure these apps are turned off before your blood pressure spikes.
That gives us a first glimpse of Puppeteer. It can of course export a page as a PDF and more; see the references at the end of this article to read further.
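As a rough illustration of the PDF export mentioned above, here is a minimal sketch. The function name savePageAsPdf and the PdfCapablePage interface are assumptions introduced so the snippet compiles on its own; the interface only mirrors the goto and pdf methods that a Puppeteer page offers.

```typescript
// Structural stand-in for the parts of a Puppeteer page used here
// (an assumption so this sketch is self-contained).
interface PdfCapablePage {
  goto(url: string): Promise<unknown>;
  pdf(options: { path: string }): Promise<unknown>;
}

// Open `url` in the given page and write the rendered page to `outPath` as a PDF.
async function savePageAsPdf(
  page: PdfCapablePage,
  url: string,
  outPath: string,
): Promise<void> {
  await page.goto(url);
  await page.pdf({ path: outPath });
}
```

Inside the test command it could be called with the page returned by browser.newPage(), for example savePageAsPdf(page, 'https://www.baidu.com/s?wd=jsliang', './src/baidu.pdf'). PDF generation has generally required headless mode, so the browser would probably need to be launched with headless: true for this to work.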
3.2 Downloading Files
Since we can take screenshots, it is no surprise that we can manipulate the DOM too. So let’s download a file!
To work with a concrete document, let’s create an Excel file.
I won’t explain how to create one, just go and play with it. The document site is https://www.kdocs.cn/
Our next step is to download this Excel file (assuming we have already hired someone to do the translation work).
The screenshot of that sheet came from the Internet and was shared for reference only; it will be removed on request.
For this article, let’s build a simple one ourselves.
How multilingual the content is doesn’t matter; our goal is to fetch this Excel file by driving Puppeteer.
OK, we have the file. How do we download it? The situation is as follows:
- A browser launched by Puppeteer behaves much like an incognito window: it starts without any saved login state. With the normal login flow, we would have to log in again, open the link, and then click the download button.
So, here is how to get a login-free share link for the document.
As everyone knows, a login-free link is one that can be opened without logging in. A silly explanation, but I feel it is necessary…
Below is the address of the demo above. You can practice with it, but I can’t guarantee the link won’t be deleted one day, so it is better to follow the steps above and set up your own!
- Excel trial file.xlsx:
https://www.kdocs.cn/l/sdwvJUKBzkK2
OK, enough rambling. Let’s get down to business and download the file:
1. Have the browser open https://www.kdocs.cn/l/sdwvJUKBzkK2
2. Sleep 6.66s (make sure the browser has opened the link and loaded the page)
3. Trigger a click on the “More Menu” button
4. Sleep 2s (make sure the “More Menu” click has taken effect)
5. Set the download path (fix the download location in advance, otherwise the save dialog is hard to handle)
6. Trigger a click on the “Download” button
7. Sleep 10s (make sure the file has finished downloading)
8. Close the window
The only step that needs attention is step 5: on Windows, clicking download opens a save dialog instead of downloading straight to a default location, so the download path has to be set in advance (this is reflected in the code).
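Setting the download path comes down to sending the DevTools protocol command Page.setDownloadBehavior. The sketch below shows one way to do that through a CDP session obtained with page.target().createCDPSession(). The function name allowDownloads and the two interfaces are assumptions introduced so the snippet stands alone; they mirror only the Puppeteer methods used here.

```typescript
import * as fs from 'fs';
import * as path from 'path';

// Structural stand-ins for the Puppeteer objects used below
// (assumptions so this sketch is self-contained).
interface CdpSessionLike {
  send(method: string, params?: Record<string, unknown>): Promise<unknown>;
}
interface DownloadablePage {
  target(): { createCDPSession(): Promise<CdpSessionLike> };
}

// Make sure the folder exists, then ask the browser to save downloads there
// without showing a save dialog. Returns the absolute folder path.
async function allowDownloads(page: DownloadablePage, downloadDir: string): Promise<string> {
  const absoluteDir = path.resolve(downloadDir);
  fs.mkdirSync(absoluteDir, { recursive: true });

  const client = await page.target().createCDPSession();
  await client.send('Page.setDownloadBehavior', {
    behavior: 'allow',
    downloadPath: absoluteDir,
  });
  return absoluteDir;
}
```

The path is resolved to an absolute one first, since a relative download path may not be interpreted the way you expect by the browser process.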
So, code!
src/common/index.ts
import { inquirer } from '../base/inquirer';
import { Result } from '../base/interface';
import { sortCatalog } from './sortCatalog';
import { downLoadExcel } from './downLoadExcel';

const common = (): void => {
  // Question route: see questionList.ts
  const questionList = [
    // q0
    {
      type: 'list',
      message: 'May I help you?',
      choices: ['Public Services', 'File Management'],
    },
    // q1
    {
      type: 'list',
      message: 'Current public services are:',
      choices: ['File sort'],
    },
    // q2
    {
      type: 'input',
      message: 'Which folder do you want to sort? (Absolute path)',
    },
    // q3
    {
      type: 'list',
      message: 'What kind of support do you need?',
      choices: ['multilingual', 'Markdown to Word'],
    },
    // q4
    {
      type: 'list',
      message: 'What kind of support do you need?',
      choices: [
        'Download multilingual Resources',
        'Import multilingual Resources',
        'Export multilingual Resources',
      ],
    },
    // q5
    {
      type: 'input',
      message: 'Resource download address (HTTP)?',
      default: 'https://www.kdocs.cn/l/sdwvJUKBzkK2',
    },
  ];

  const answerList = [
    // q0
    async (result: Result, questions: any) => {
      if (result.answer === 'Public Services') {
        questions[1]();
      } else if (result.answer === 'File Management') {
        questions[3]();
      }
    },
    // q1
    async (result: Result, questions: any) => {
      if (result.answer === 'File sort') {
        questions[2]();
      }
    },
    // q2
    async (result: Result, _questions: any, prompts: any) => {
      const sortResult = await sortCatalog(result.answer);
      if (sortResult) {
        console.log('Sort succeeded!');
        prompts.complete();
      }
    },
    // q3
    async (result: Result, questions: any) => {
      if (result.answer === 'multilingual') {
        questions[4]();
      }
    },
    // q4
    async (result: Result, questions: any) => {
      if (result.answer === 'Download multilingual Resources') {
        questions[5]();
      }
    },
    // q5
    async (result: Result, _questions: any, prompts: any) => {
      if (result.answer) {
        const downloadResult = await downLoadExcel(result.answer);
        if (downloadResult) {
          console.log('Download successful!');
          prompts.complete();
        }
      }
    },
  ];

  inquirer(questionList, answerList);
};

export default common;
Regrettably, inquirer.ts had been reworked so heavily that jsliang had to write a separate file mapping out the order of the questions before the flow could be sorted out:
src/common/questionList.ts
// Question route for the common section
export const questionList = {
  'Public Services': { // q0
    'File sort': { // q1
      'Folders to sort': 'the Work Work', // q2
    },
  },
  'File Management': { // q0
    'multilingual': { // q3
      'Download multilingual Resources': { // q4
        'Download address': 'the Work Work', // q5
      },
      'Import multilingual Resources': { // q4
        'Import address': 'the Work Work',
      },
      'Export multilingual Resources': { // q4
        'Export full resource': 'the Work Work',
        'Export single gate resource': 'the Work Work',
      },
    },
    'Markdown to Word': 'Not currently supported', // q3
  },
};
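One way to read a nested route like this is to flatten it into the full chains of choices a user can walk through. The sketch below is purely illustrative: the RouteNode type, the listRoutes helper, and the trimmed-down sampleRoute object are all assumptions and are not part of the project code.

```typescript
// A route node is either a leaf note (string) or a nested group of choices.
type RouteNode = string | { [choice: string]: RouteNode };

// Walk the nested route and return every full chain of choices, e.g. "A > B > C".
function listRoutes(node: RouteNode, trail: string[] = []): string[] {
  if (typeof node === 'string') {
    return [trail.join(' > ')];
  }
  const paths: string[] = [];
  for (const choice of Object.keys(node)) {
    paths.push(...listRoutes(node[choice], [...trail, choice]));
  }
  return paths;
}

// Hypothetical, shortened route used only for this illustration.
const sampleRoute: RouteNode = {
  'Public Services': { 'File sort': { 'Folders to sort': 'todo' } },
  'File Management': {
    multilingual: { 'Download multilingual Resources': { 'Download address': 'todo' } },
  },
};

console.log(listRoutes(sampleRoute));
```

Each printed chain corresponds to one sequence of questions (q0 to q2, or q0 to q5) in the answer list.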
With the route written down, let’s move on to the function itself:
src/common/downLoadExcel.ts
import puppeteer from 'puppeteer';
import path from 'path';
import fs from 'fs';

export const downLoadExcel = async (link: string): Promise<boolean> => {
  // Start the browser
  const browser = await puppeteer.launch({
    headless: false, // Open a visible (non-headless) browser window
    devtools: true, // Open DevTools
  });

  // 1. Create a new tab and open the link
  const page = await browser.newPage();
  await page.goto(link);

  // 2. Sleep 6.66s - make sure the page opens normally
  await page.waitForTimeout(6666);

  // 3. Trigger the click of the "More Menu" button
  const moreBtn = await page.$('.header-more-btn');
  await moreBtn?.click();

  // 4. Sleep 2s - make sure the button has been clicked
  await page.waitForTimeout(2000);

  // 5. Set the download path (create the folder first if it is missing)
  const dist = path.join(__dirname, './dist');
  if (!fs.existsSync(dist)) {
    fs.mkdirSync(dist);
  }
  await (page as any)._client?.send('Page.setDownloadBehavior', {
    behavior: 'allow',
    downloadPath: dist,
  });

  // 6. Trigger the click of the download button
  const elements = await page.$$('.header-menu-item');
  let downloadBtn;
  if (elements.length) {
    downloadBtn = elements[8];
  }
  if (!downloadBtn) {
    console.error('Download button not found');
    await browser.close();
    return false; // Stop here instead of using the closed browser
  }
  await downloadBtn.click();

  // 7. Sleep 10s - make sure the resource is downloaded
  await page.waitForTimeout(10000);

  // 8. Close the window
  await browser.close();
  return true;
};
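The fixed sleeps in steps 2 and 4 work, but they either waste time or fall short on a slow network. A possible alternative is to wait for the element itself with waitForSelector and click it once it is visible. The sketch below is an assumption-laden illustration: clickWhenVisible and the two interfaces are made up for this example and only mirror the Puppeteer methods being called.

```typescript
// Structural stand-ins for the Puppeteer pieces used here
// (assumptions so this sketch is self-contained).
interface ClickableHandle {
  click(): Promise<void>;
}
interface SelectorPage {
  waitForSelector(
    selector: string,
    options?: { visible?: boolean; timeout?: number },
  ): Promise<ClickableHandle | null>;
}

// Wait until `selector` is visible, then click it.
// Returns false when the element never shows up or the click fails.
async function clickWhenVisible(
  page: SelectorPage,
  selector: string,
  timeoutMs = 10000,
): Promise<boolean> {
  try {
    const handle = await page.waitForSelector(selector, { visible: true, timeout: timeoutMs });
    if (!handle) return false;
    await handle.click();
    return true;
  } catch {
    // waitForSelector rejects on timeout; click errors also land here.
    return false;
  }
}
```

With this helper, step 3 could become a single call such as clickWhenVisible(page, '.header-more-btn'), using the selector from the code above.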
After running this, if the console reports no errors, VS Code shows the downloaded Excel (.xlsx) file in a dist folder under common.
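Instead of always sleeping 10s in step 7, the download folder can be polled until a finished file appears. The following is a minimal sketch under two assumptions: the folder is empty before the download starts, and the browser marks unfinished downloads with a .crdownload suffix, as Chrome and Chromium do. The helper name waitForDownload is made up for this example.

```typescript
import * as fs from 'fs';

// Poll `dir` until it contains a finished file, or give up after `timeoutMs`.
// Files ending in ".crdownload" are downloads still in progress and are skipped.
async function waitForDownload(
  dir: string,
  timeoutMs = 10000,
  intervalMs = 200,
): Promise<string | null> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const finished = fs.existsSync(dir)
      ? fs.readdirSync(dir).filter((name) => !name.endsWith('.crdownload'))
      : [];
    if (finished.length > 0) {
      return finished[0]; // name of the first finished file found
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  return null; // nothing finished in time
}
```

The function returns as soon as the file is complete, so a fast download no longer costs the full 10 seconds, and a null result signals that the download did not finish in time.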
See you next time!
IV References
- GitHub: Puppeteer
- Puppeteer
- Puppeteer, a powerful front-end tool
- An introduction to Puppeteer
jsliang’s document library is licensed by Junrong Liang under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. It is based on the works at github.com/LiangJunron… Permissions beyond the scope of this license may be available at creativecommons.org/licenses/by…