Puppeteer is an open source Node.js library from the Google Chrome team that provides a high-level API for controlling a headless Chrome browser over the DevTools protocol. It can also be configured to start a full (non-headless) Chrome browser. This article describes how to use Puppeteer to take screenshots in a production environment, and records the problems I ran into and how I solved them.

This article is also published in the Zhihu column: Front-end micro-blog.

A brief introduction to Puppeteer

Many of you may have used PhantomJS; Puppeteer offers many similar features. Since it is an official Chrome product, I expect it to keep pace with Chrome’s new features and become more and more powerful.

You can use Puppeteer to automate your Chrome browser. Here are a few scenarios for Puppeteer:

  1. Webpage snapshot and PDF file generation;

  2. Crawling a single-page application (SPA) and generating pre-rendered content (i.e. SSR);

  3. Scraping page content;

  4. Automatic form submission, UI testing, keyboard input, etc.

  5. Create an up-to-date, automated test environment that uses the latest JavaScript and browser features directly on the latest version of Chrome;

  6. Capture the timeline of your site to help you diagnose performance problems.
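
As an illustration of the first scenario, here is a minimal sketch of generating a PDF. The function name `savePageAsPdf`, the URL, and the output path are placeholders and not part of the original project; `puppeteer` is passed in as a parameter so the sketch makes no assumption about how it is installed.

```javascript
// Minimal sketch: render a page and save it as a PDF.
// `puppeteer` is the object returned by require('puppeteer').
async function savePageAsPdf(puppeteer, url, outPath) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url);
    // page.pdf writes the PDF to `path`; 'A4' is one of the built-in paper formats
    await page.pdf({ path: outPath, format: 'A4' });
    return outPath;
  } finally {
    // Always close the browser, even if navigation or rendering fails
    await browser.close();
  }
}
```

It could be called as `savePageAsPdf(require('puppeteer'), 'https://example.com', 'example.pdf')`.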

That’s all for the introduction to Puppeteer. See the project’s GitHub repository for more information.

Functional analysis

Last week, the product manager came to us with a request: every Monday, take a screenshot of a report page in the system (including charts, tables, etc.) and email it to a designated person.

When I first got this requirement, it looked troublesome, because the project is a single-page application built with Vue. To capture the page with its styles displayed correctly, the 11 charts made with ECharts on the page have to be fully rendered first.

Since the front-end project runs on Nginx in Docker and accesses the Java backend interface through a reverse proxy, the current architecture does not lend itself to taking screenshots. After discussing it with my backend colleagues, we came up with three options:

  1. Use a PhantomJS plug-in on the Java side to take the screenshots;

  2. Create a Node.js service and use PhantomJS to open the system page and take the screenshot;

  3. Create a Node.js service and use Puppeteer to implement the screenshot function.

By comparison, the PhantomJS screenshots were not good: the page styles were not rendered finely enough. Considering that Node.js will be used for other functions in the future, we finally decided on option 3. The Node.js service acts as a separate service layer that only provides the screenshot service and exposes an interface. The Java backend calls that interface and also handles sending the email on a schedule.

The Node.js service uses Express to provide routing, exposing interfaces for external callers.
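
A minimal sketch of how such an interface might look. The route path `/screenshot`, the port, the response shape, and the names `makeScreenshotHandler` and `startServer` are all assumptions for illustration; the original service's routes are not shown in this article. `takeScreenshot` stands for any async function that performs the screenshot and upload and resolves with the image information.

```javascript
// Hypothetical route handler factory. `takeScreenshot` is an async function
// that resolves with information about the uploaded image.
function makeScreenshotHandler(takeScreenshot) {
  return async (req, res) => {
    try {
      const image = await takeScreenshot();
      res.status(200).json({ success: true, image });
    } catch (err) {
      // Report the failure to the caller instead of leaving the request hanging
      res.status(500).json({ success: false, message: err.message });
    }
  };
}

// Wire the handler into Express. The path and default port are assumptions.
function startServer(takeScreenshot, port = 3000) {
  const express = require('express'); // loaded only when the server is started
  const app = express();
  app.get('/screenshot', makeScreenshotHandler(takeScreenshot));
  return app.listen(port);
}
```

Keeping the handler separate from the Express wiring means the Java backend only depends on the HTTP interface, and the handler can be exercised without starting a server.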

Functional process

The Java backend periodically invokes the Node.js interface → Puppeteer takes a screenshot of the page → the image is uploaded to the image server → after a successful upload, the image information is returned to the Java backend → the Java backend retrieves the corresponding screenshot from the image library and sends the email

Screenshot in practice

With that said, here is the code (it requires Node.js v7.6 or later, which supports the async/await syntax; on an older Node.js version, use Promises instead).

// Import dependencies
const puppeteer = require('puppeteer');
const fs = require('fs');
const path = require('path');
const request = require('request');

let theBrowser = null;
// Placeholder addresses; page.goto needs a full URL including the protocol
const websiteUrl = 'https://example.com/dashboard';
const uploadFileUrl = 'https://upload.com/upload';
// Use one absolute path for both writing and uploading the screenshot
const shotPath = path.join(__dirname, './dashboard_shot.png');

// Delete the local screenshot file (fs.unlink requires a callback)
const removeShot = () => fs.unlink(shotPath, () => {});

// Start Puppeteer
puppeteer.launch({
  // The sandbox needs to be disabled when running as root
  args: ['--no-sandbox']
}).then(async browser => {
  theBrowser = browser;
  // Open a new tab in the browser
  const page = await browser.newPage();
  // Puppeteer allows you to size each tab individually
  await page.setViewport({
    width: 1000,
    height: 3480
  });
  // Navigate the tab to the page that needs a screenshot and wait for it to load
  await page.goto(websiteUrl);
  // The page data is loaded asynchronously, so wait 8 seconds for the requests
  // to complete and the page to render (a plain timer is used here because the
  // page.waitFor helper of early Puppeteer releases is gone from current ones)
  await new Promise(resolve => setTimeout(resolve, 8000));
  // After the page is rendered, take the screenshot
  await page.screenshot({
    path: shotPath,
    clip: {
      x: 200,
      y: 60,
      width: 780,
      height: 3405
    }
  });
  // Upload the screenshot to the image server
  request.post({
    url: uploadFileUrl,
    headers: {
      // A simulated token that the server verifies
      userToken: '2361A77FDD432C6B464C57007C062B82'
    },
    formData: {
      file: fs.createReadStream(shotPath)
    }
  }, (err, httpResponse, body) => {
    // Close the browser and delete the local file whether the upload failed or succeeded
    theBrowser.close();
    removeShot();
    if (err) {
      return console.error('upload failed:', err);
    }
  });
}).catch(error => {
  console.error('screenshot failed:', error);
  // launch() itself may have failed, in which case there is no browser to close
  if (theBrowser) {
    theBrowser.close();
  }
});
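
For Node.js versions without async/await, the same launch → new tab → navigate → screenshot flow can be written as a Promise chain. This is a sketch rather than the code used in the project: the function name `captureWithPromises` is hypothetical, the upload step is omitted, and `puppeteer` is passed in as a parameter.

```javascript
// Promise-chain version of the screenshot flow, without async/await.
// `puppeteer` is the object returned by require('puppeteer').
function captureWithPromises(puppeteer, url, outPath) {
  let browser = null;
  return puppeteer.launch({ args: ['--no-sandbox'] })
    .then((launched) => {
      browser = launched;
      return browser.newPage();
    })
    .then((page) => {
      return page.setViewport({ width: 1000, height: 3480 })
        .then(() => page.goto(url))
        .then(() => page.screenshot({ path: outPath }));
    })
    .then(
      // Success: close the browser
      () => browser.close(),
      // Failure: close the browser if it was started, then re-throw the error
      (err) => {
        const closing = browser ? browser.close() : Promise.resolve();
        return closing.then(() => { throw err; });
      }
    )
    .then(() => outPath);
}
```

The returned Promise resolves with the output path once the browser has been closed, so the caller can chain the upload step after it.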

Problems encountered

At first I set the page width to 1700. After a successful deployment to the Docker server running Ubuntu, an error was reported: “Page crashed!”

When I saw this error I was confused, because the code ran fine on my local machine, which runs macOS. Normally, since the Docker container started successfully online, the error should not be related to the environment. I then debugged in various ways and found the following:

  1. When capturing a simple page (no complex charts to render), no error is reported and the screenshot succeeds;

  2. If the page size is not set and the default size is used, no error is reported and the screenshot succeeds;

  3. With the size set but no delay to wait for the data to load, no error is reported and the screenshot succeeds;

  4. With the size set and the waiting delay added, the error is reported and the screenshot fails.
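
Observations 3 and 4 both hinge on the fixed waiting delay. As a side note, a fixed delay can be swapped for waiting on a condition, which is a different technique from the one used in this article. The sketch below assumes a hypothetical CSS selector (`chartSelector`) that only matches once the charts have rendered; the function name and the 30-second timeout are also assumptions.

```javascript
// Sketch: wait for a condition instead of a fixed delay.
// `page` is a Puppeteer Page; `chartSelector` is a hypothetical CSS selector
// that matches an element present only after the charts have rendered.
async function gotoAndWaitForCharts(page, url, chartSelector) {
  // 'networkidle0' treats navigation as finished once there have been
  // no network connections for at least 500 ms
  await page.goto(url, { waitUntil: 'networkidle0' });
  // Then wait until the chart element exists in the DOM (give up after 30 s)
  await page.waitForSelector(chartSelector, { timeout: 30000 });
}
```

This does not change what gets rendered, so it would not by itself avoid the crash described here; it only removes the guesswork about how long to wait.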

Given these observations, my first thought was that the online Docker container had too little memory, but 4 GB is already quite a lot. After upgrading to 8 GB the error was still reported, so insufficient memory was ruled out.

My next guess was that if the browser tab is too wide, Chrome needs too many resources to open and render the page, which causes the tab to crash.

In the spirit of giving it a try, I reduced the page width. With the width set to 1200 and the waiting delay in place, the screenshot succeeded. I was baffled at this point. WTF.

However, there was still some chance of failure, roughly one in ten. When I reduced the width further to 1000, there were basically no errors, and the screenshot came out perfectly every time.
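
One possibility the memory check above does not cover: a Docker container's shared-memory area (`/dev/shm`, 64 MB by default) is separate from its RAM, and the Puppeteer troubleshooting guide names it as a cause of Chrome crashing when rendering large pages. Whether that was the cause in this case was not verified. The sketch below only shows how the corresponding Chrome flag could be added to the launch arguments; `buildLaunchOptions` and its option names are hypothetical.

```javascript
// Sketch: build Puppeteer launch options for a container environment.
// The option names `runAsRoot` and `smallSharedMemory` are made up for this example.
function buildLaunchOptions(options) {
  const args = [];
  if (options.runAsRoot) {
    // The sandbox needs to be disabled when Chrome runs as root
    args.push('--no-sandbox');
  }
  if (options.smallSharedMemory) {
    // Makes Chrome use a temporary directory instead of /dev/shm for shared memory
    args.push('--disable-dev-shm-usage');
  }
  return { args };
}
```

The result would be passed to `puppeteer.launch(buildLaunchOptions({ runAsRoot: true, smallSharedMemory: true }))`. Alternatively, the container can be started with a larger shared-memory size using Docker's `--shm-size` option.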

Conclusion

My experience with Puppeteer has taught me that anything can go wrong. When you are in doubt, consider all the possible causes; that may make the problem much easier to find.

At first I thought the width could not be the problem, so I suspected something was wrong with the way I had written the code. I tried many different approaches, got stuck going down a dead end, and wasted a lot of time.

Overall, Puppeteer is very powerful and is great for automated UI testing as well as for building small tools. I encourage you to explore more use cases for Puppeteer.