I am participating in the Mid-Autumn Festival Creative Submission Contest. For details, see: Mid-Autumn Festival Creative Submission Contest

The Mid-Autumn Festival is coming. Do you know which mooncake flavor people like best, or which mooncake is far ahead in sales? Let's look at how Selenium can be used to scrape the mooncake sales ranking from JD.com.

Configure the environment

1. Initialize the project

npm init -y

2. Download the corresponding WebDriver

Browser             Component
Chrome              chromedriver(.exe)
Internet Explorer   IEDriverServer.exe
Edge                MicrosoftWebDriver.msi
Firefox             geckodriver(.exe)
Opera               operadriver(.exe)
Safari              safaridriver

Download the driver that matches your own environment and put the downloaded package in the root directory of the project; it does not need to be installed. The browser version and the driver version must match closely. Do not blindly choose the latest driver, otherwise you will run into unexpected bugs. It is best to upgrade the browser to the latest stable version and then download the corresponding driver package.
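The version-matching advice above can be checked with a small script. This is a minimal sketch, assuming Chrome and chromedriver, where the two major version numbers are expected to agree. The helper names `majorVersion` and `versionsMatch` and the version strings are illustrative only and are not part of selenium-webdriver.

```javascript
// Hypothetical helper: extract the major version from a dotted version string.
function majorVersion(version) {
  const match = /^(\d+)(\.|$)/.exec(String(version).trim());
  if (!match) {
    throw new Error(`Unrecognized version string: ${version}`);
  }
  return Number(match[1]);
}

// Returns true when the browser and the driver share the same major version.
function versionsMatch(browserVersion, driverVersion) {
  return majorVersion(browserVersion) === majorVersion(driverVersion);
}

// Sample values only; substitute the versions reported by your own browser and driver.
console.log(versionsMatch("116.0.5845.96", "116.0.5845.96")); // true
console.log(versionsMatch("116.0.5845.96", "114.0.5735.90")); // false
```

Other browsers use different versioning schemes for their drivers, so for anything other than Chrome, check the driver's own compatibility notes.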

3. Download dependencies

npm install selenium-webdriver

4. Create an index.js file and write the following logic

Once the preparation is complete, you can run a demo to test the setup. The code goes in index.js.

First, import what is needed from selenium-webdriver

const { Builder, By, Key, until } = require("selenium-webdriver");

Next, write the main logic

// All operations are in an IIFE
(async function example() {
  // Instantiate the driver object, where Chrome represents the browser used
  let driver = await new Builder().forBrowser("chrome").build();

  try {
    // Open the target website address
    await driver.get("https://cn.bing.com/");

    // Select the tag, enter the keyword, and search
    await driver.findElement(By.name("q")).sendKeys("webdriver", Key.RETURN);

    // Wait until the page title contains the search keyword
    await driver.wait(until.titleContains("webdriver"), 1000);
  } finally {
    // Uncomment to close the browser automatically when finished
    // await driver.quit();
  }
})();

Grab mooncake sales

Next, let's see how Selenium can be used to scrape mooncake sales data from JD.com.

First, import the required functions and classes and write a self-executing function. All of the crawling logic lives inside this IIFE.

const { Builder, By, Key } = require("selenium-webdriver");

// All operations are in an IIFE
(async function example() {
  // to do ...
})();

Then open the JD.com website and search for the mooncake keyword through the search box

try {
  // Open the target website address
  await driver.get("https://www.jd.com/");

  // Select the tag, enter the keyword, and search
  await driver.findElement(By.id("key")).sendKeys("Moon cakes", Key.RETURN);
} finally {
  // Automatically close the browser
  await driver.quit();
}
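The try/finally structure above guarantees that the browser is closed even when a step in between throws. The following sketch illustrates that control flow without launching a browser. `withDriver` and `createFakeDriver` are hypothetical names introduced for this illustration, and the fake driver only records which calls were made.

```javascript
// Hypothetical helper: run `task` with a driver and always call quit() afterwards.
async function withDriver(createDriver, task) {
  const driver = await createDriver();
  try {
    return await task(driver);
  } finally {
    // Runs whether the task succeeded or threw
    await driver.quit();
  }
}

// Stand-in driver used only to demonstrate the control flow without a browser.
function createFakeDriver(log) {
  return {
    get: async (url) => log.push(`get ${url}`),
    quit: async () => log.push("quit"),
  };
}

const log = [];
const demo = withDriver(
  () => createFakeDriver(log),
  async (driver) => {
    await driver.get("https://www.jd.com/");
    throw new Error("simulated failure");
  }
).catch((err) => {
  log.push(`caught: ${err.message}`);
  return log;
});

// The log shows that quit() ran before the error reached the caller.
demo.then((entries) => console.log(entries));
```

With the real library, the first argument would presumably be something like `() => new Builder().forBrowser("chrome").build()`, and the task would contain the search and scraping steps.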

Wait for the search results to render; the page should then look the way we expect.

Once the results are displayed, you can see that the page currently shows the data sorted by the default comprehensive ranking, whereas what we want to crawl is the data sorted by sales volume.

Next, select the sales button on the page with a CSS selector and click it to get the data ranked by sales
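Waiting for rendering comes down to polling a condition until it holds or a timeout expires. The sketch below shows that idea in plain JavaScript with a stand-in page object, so it runs without a browser. `waitUntil` and `createFakePage` are hypothetical names for this illustration. In a real script the same role is played by `driver.wait`, for example with `until.elementsLocated(By.css("#J_goodsList .gl-item"))` as the condition.

```javascript
// Hypothetical helper: poll `condition` until it returns a truthy value or the timeout elapses.
async function waitUntil(condition, timeoutMs = 5000, intervalMs = 100) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const result = await condition();
    if (result) return result;
    if (Date.now() >= deadline) {
      throw new Error(`Condition not met within ${timeoutMs} ms`);
    }
    // Sleep briefly before checking again
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Stand-in for a page whose result cards appear after a short delay.
function createFakePage(readyAfterMs) {
  const start = Date.now();
  return {
    findCards: async () =>
      Date.now() - start >= readyAfterMs ? ["card-1", "card-2"] : [],
  };
}

const page = createFakePage(300);
const cardsReady = waitUntil(
  async () => {
    const cards = await page.findCards();
    return cards.length > 0 ? cards : null;
  },
  2000,
  50
);

cardsReady.then((cards) => console.log(`${cards.length} cards rendered`));
```

An explicit wait like this is generally more reliable than a fixed sleep, because it continues as soon as the results appear and fails with a clear error if they never do.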

await driver
      .findElement(By.css("#J_filter .f-line.top .f-sort a:nth-of-type(2)"))
      .click();

Now that the page has rendered the data we want, the next step is to analyze the page structure and extract the data


const getMoonPriceList = async () => {
  // Data store
  const moonList = [];

  // Get every displayed card
  const _li = await driver.findElements(
    By.css("#J_goodsList .gl-warp .gl-item")
  );

  for (let i = 0, len = _li.length; i < len; i++) {
    // Get each item of data
    const itemInfo = _li[i];
    // Price
    const price = await itemInfo.findElement(By.css(".p-price i")).getText();
    // Title
    const title = await itemInfo.findElement(By.css(".p-name i")).getText();
    // Comment count
    const comment = await itemInfo
      .findElement(By.css(".p-commit a"))
      .getText();

    moonList.push({
      price,
      title,
      comment,
    });
  }

  console.log("moonList", moonList);
};

// Await the call (inside the async IIFE) so the browser is not closed before scraping finishes
await getMoonPriceList();
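The scraped `comment` values are strings, so they have to be converted to numbers before the items can be compared. The sketch below is one possible post-processing step, under the assumption that the comment text looks like "5000+", "2万+" or "1.5万+" (万 meaning 10,000); the actual format on the page may differ. `parseCommentCount`, `rankByComments`, and the sample records are made up for illustration and are not data captured from the site.

```javascript
// Assumption: comment counts look like "5000+", "2万+" or "1.5万+" (万 = 10,000).
function parseCommentCount(text) {
  const match = /([\d.]+)\s*(万)?/.exec(String(text));
  if (!match) return 0;
  const value = parseFloat(match[1]);
  return Math.round(match[2] ? value * 10000 : value);
}

// Returns a new array sorted by comment count, highest first.
function rankByComments(items) {
  return items
    .map((item) => ({ ...item, commentCount: parseCommentCount(item.comment) }))
    .sort((a, b) => b.commentCount - a.commentCount);
}

// Made-up sample records in the { price, title, comment } shape built above.
const sample = [
  { price: "99.00", title: "Sample mooncake A", comment: "5000+" },
  { price: "168.00", title: "Sample mooncake B", comment: "200万+" },
  { price: "59.90", title: "Sample mooncake C", comment: "1.5万+" },
];

console.log(rankByComments(sample).map((item) => item.title));
```

To use it with the scraper, `getMoonPriceList` would need to return `moonList` so that the result can be passed to `rankByComments`.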

Part of the captured data is shown as follows:

Part of the crawling process: