I am participating in the Mid-Autumn Festival Creative Submission Contest. For details, see: Mid-Autumn Festival Creative Submission Contest.
The Mid-Autumn Festival is coming. Do you know which mooncake flavor people like best, or which mooncake is far ahead in sales? Let's look at how Selenium can capture the mooncake sales ranking on JD.com.
Configure the environment
1. Initialize the project
npm init -y
2. Download the corresponding WebDriver
Browser | Component |
---|---|
Chrome | chromedriver(.exe) |
Internet Explorer | IEDriverServer.exe |
Edge | MicrosoftWebDriver.msi |
Firefox | geckodriver(.exe) |
Opera | operadriver(.exe) |
Safari | safaridriver |
Download the driver that matches your environment and place it in the root directory of the project; it does not need to be installed. The driver version must closely match the browser version. Do not blindly pick the latest driver, or you may run into unexpected bugs. It is best to upgrade the browser to the latest stable version and then download the driver that corresponds to it.
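As a rough illustration of the version-matching advice, the sketch below compares the major version of the browser with that of the driver. The helper names `majorVersion` and `versionsMatch` and the sample version strings are made up for this example, and "same major version" is only the usual rule of thumb for Chrome and chromedriver; other drivers use different numbering, so check their own compatibility notes.

```javascript
// Extract the major version number from a string such as "116.0.5845.96".
function majorVersion(version) {
  const match = /^(\d+)\./.exec(String(version).trim());
  if (!match) {
    throw new Error(`Unrecognized version string: ${version}`);
  }
  return Number(match[1]);
}

// True when the browser and the driver share the same major version.
// Assumption: this rule applies to Chrome/chromedriver; other drivers differ.
function versionsMatch(browserVersion, driverVersion) {
  return majorVersion(browserVersion) === majorVersion(driverVersion);
}

// Sample version strings, for illustration only
console.log(versionsMatch("116.0.5845.96", "116.0.5845.187")); // true
console.log(versionsMatch("116.0.5845.96", "114.0.5735.90")); // false
```

You could paste in the version shown on the browser's "About" page and the version printed by the driver binary to check a pairing before running the crawler.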
3. Download dependencies
npm install selenium-webdriver
4. Create an index.js file and write the following logic
Once the project setup is complete, you can run a demo to test it. The code goes in index.js.
First, import what is needed from selenium-webdriver:
const { Builder, By, Key, until } = require("selenium-webdriver");
Next, write the main logic:
// All operations run inside an async IIFE
(async function example() {
  // Instantiate the driver; "chrome" selects the browser to use
  let driver = await new Builder().forBrowser("chrome").build();
  try {
    // Open the target website
    await driver.get("https://cn.bing.com/");
    // Find the search box, type the keyword, and press Enter to search
    await driver.findElement(By.name("q")).sendKeys("webdriver", Key.RETURN);
    // Wait until the page title contains the keyword
    // (the exact title depends on the Bing locale)
    await driver.wait(until.titleContains("webdriver"), 1000);
  } finally {
    // Uncomment to close the browser automatically
    // await driver.quit();
  }
})();
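The demo relies on selenium-webdriver locating the driver binary by itself. If it does not pick up the chromedriver placed in the project root, one option is to point to it explicitly. The following is a minimal sketch, assuming Chrome and selenium-webdriver 4; `chromedriverPath` and `buildChrome` are hypothetical helper names, while `ServiceBuilder` and `setChromeService` come from the library.

```javascript
const path = require("path");

// Hypothetical helper: location of the chromedriver binary in the project root.
// On Windows the binary is named chromedriver.exe.
function chromedriverPath(rootDir, platform = process.platform) {
  const file = platform === "win32" ? "chromedriver.exe" : "chromedriver";
  return path.join(rootDir, file);
}

// Build a Chrome driver that uses that binary explicitly.
// selenium-webdriver is required lazily, only when this function is called.
async function buildChrome(rootDir = __dirname) {
  const { Builder } = require("selenium-webdriver");
  const chrome = require("selenium-webdriver/chrome");
  const service = new chrome.ServiceBuilder(chromedriverPath(rootDir));
  return new Builder().forBrowser("chrome").setChromeService(service).build();
}

// Usage inside the IIFE, in place of the plain Builder call:
//   let driver = await buildChrome();
```

Keeping the path logic in its own small function makes it easy to adapt if the driver lives somewhere other than the project root.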
Grab mooncake sales
Now let's see how Selenium can be used to capture mooncake sales on JD.com.
First, import the required functions and classes, then write a self-executing async function; all of the crawling logic lives inside this IIFE:
const { Builder, By, Key } = require("selenium-webdriver");
// All operations run inside an async IIFE
(async function example() {
  // to do ...
})();
Then open the JD.com home page and search for the mooncake keyword through the search box. Here `driver` is instantiated in the same way as in the demo above:
try {
  // Open the target website
  await driver.get("https://www.jd.com/");
  // Find the search box, type the keyword, and press Enter to search
  await driver.findElement(By.id("key")).sendKeys("Moon cakes", Key.RETURN);
} finally {
  // Automatically close the browser
  await driver.quit();
}
Wait for the search results to render; the page should now show the expected result list.
Once the results are displayed, you can see that they are sorted by the default comprehensive ranking, whereas we want to crawl the data sorted by sales volume.
Next, select the sales button on the page with a CSS selector and click it to get the sales ranking:
await driver
  .findElement(By.css("#J_filter .f-line.top .f-sort a:nth-of-type(2)"))
  .click();
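Because the result list is rendered asynchronously, clicking the sort button immediately can fail if the element is not on the page yet. The sketch below shows one way to wait explicitly before clicking, using `driver.wait` with `until.elementLocated` and `until.elementIsVisible` from selenium-webdriver. The function name `clickSalesSort`, the `SALES_SORT` constant, and the 10-second timeout are assumptions made for this example; the selector is the one used above.

```javascript
// Selector of the sales-sort button, taken from the snippet above
const SALES_SORT = "#J_filter .f-line.top .f-sort a:nth-of-type(2)";

// Wait for the sales-sort button to exist and be visible, then click it.
// `driver` is the WebDriver instance; `By` and `until` come from
// selenium-webdriver and are passed in by the caller.
async function clickSalesSort(driver, { By, until }, timeoutMs = 10000) {
  const locator = By.css(SALES_SORT);
  // Resolves with the element once it is present in the DOM
  const button = await driver.wait(until.elementLocated(locator), timeoutMs);
  // Make sure it is actually displayed before clicking
  await driver.wait(until.elementIsVisible(button), timeoutMs);
  await button.click();
  return button;
}

// Usage inside the IIFE:
//   await clickSalesSort(driver, require("selenium-webdriver"));
```

The same pattern could be applied before reading the product cards, since they are re-rendered after the sort order changes.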
Now that the page has rendered the data we expect, the next step is to analyze the page structure and grab the data:
const getMoonPriceList = async () => {
  // Data store
  const moonList = [];
  // Get every displayed product card
  const _li = await driver.findElements(
    By.css("#J_goodsList .gl-warp .gl-item")
  );
  for (let i = 0, len = _li.length; i < len; i++) {
    // Get each item of data
    const itemInfo = _li[i];
    // Price
    const price = await itemInfo.findElement(By.css(".p-price i")).getText();
    // Title
    const title = await itemInfo.findElement(By.css(".p-name i")).getText();
    // Comment count
    const comment = await itemInfo
      .findElement(By.css(".p-commit a"))
      .getText();
    moonList.push({
      price,
      title,
      comment,
    });
  }
  console.log("moonList", moonList);
};
// Await the crawl so it finishes before driver.quit() runs in the finally block
await getMoonPriceList();
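The crawled `comment` values are plain strings, so they cannot be compared numerically as they are. If you want to rank the items yourself, the sketch below converts the labels to numbers and sorts by them. It assumes labels shaped like "500+" or "2万+" (where 万 means 10,000); the function names and the sample rows are made up for illustration and are not real crawl results. Comment counts are also only a rough proxy for sales.

```javascript
// Convert a comment-count label such as "2万+" or "500+" to a number.
// Assumption: "万" means 10,000 and a trailing "+" means "at least".
function parseCount(label) {
  const match = /(\d+(?:\.\d+)?)\s*(万)?/.exec(String(label));
  if (!match) return 0; // no digits found
  const value = parseFloat(match[1]);
  return Math.round(match[2] ? value * 10000 : value);
}

// Return a new array sorted by comment count, highest first.
function rankByComments(list) {
  return [...list].sort(
    (a, b) => parseCount(b.comment) - parseCount(a.comment)
  );
}

// Made-up sample rows in the same shape as moonList
const sample = [
  { title: "Sample A", price: "99.00", comment: "500+" },
  { title: "Sample B", price: "128.00", comment: "2万+" },
  { title: "Sample C", price: "59.90", comment: "5000+" },
];

console.log(rankByComments(sample).map((item) => item.title));
// [ 'Sample B', 'Sample C', 'Sample A' ]
```

Sorting a copy rather than the original array keeps `moonList` in the order the page displayed it, which is the site's own sales ranking.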
Part of the captured data is shown as follows:
Part of the crawling process: