This is the fourth day of my participation in the August More Text Challenge. For details, see: August More Text Challenge

The problem

  • How can a front-end developer use Puppeteer to generate a PDF?

  • Can I set a cover, header and footer, CSS, and images when generating the PDF?

  • How do I get the PDF background image to show?

This blog post takes you through these tough issues.

Preface

To follow this blog post, you need basic knowledge of Node.js.

You also need some familiarity with Egg, Alibaba's Node framework.

In addition, the post is fairly long, so please be a little patient and do not skip anything in the middle.

Main text

When I first received a request to use Node for PDF generation, I did some quick research and decided to use Puppeteer. What’s so good about this?

In short, it is simple, with a gentle learning curve. You can start with the official documentation.

If English is not your strong suit, there is also a Chinese version of the documentation.

I'm going to start with a simple demo, which is the simplest official code for generating a PDF:

// create.js
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');
  await page.pdf({path: 'example.pdf'});
  await browser.close();
})();

Then run it in the terminal:

node create.js

It seems simple enough.

However, real requirements are always more complicated, and meeting them one by one takes a little perseverance and a little sweat. I will list the problems I ran into here so that you can avoid the same pitfalls :)


Project Structure:

|- index.html             // template file
|- create.js              // generates the PDF file
|- public
|-------- css
|--------     style.css   // template styles
|-------- img
|--------     avatar.png  // image used by the template
|-------- pdf.html        // the template is first rendered into HTML, and that HTML is then converted to PDF
|-------- pdf.pdf         // the resulting PDF

Let's create the files following the structure above, then fill each file with the contents below.

  • Template file index.html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>title</title>
  <link rel="stylesheet" href="css/style.css">
</head>
<body>
  <div id="cover">
    I am the cover. To make me fill an entire page, set my style to width:794px; height:1124px; page-break-after:always;
    <a href="####">
      <img src="img/avatar.png" alt="">
    </a>
    <div>
      <p>{{date}}</p>
      <p>{{author}}</p>
    </div>
  </div>
  <div class="page">
    I am the content. I may span multiple pages, I have a header and footer, and my style is width:595px; margin:0 auto;
    <p>Here you can paste some long content of your own</p>
  </div>
</body>
</html>

The project uses the Egg framework, which provides a number of APIs and configuration options out of the box. One of those APIs is renderView, which returns the rendered result: an HTML page as a string.

We use ctx.renderView() to get the page markup with the data rendered in, and pass it to Puppeteer to generate the PDF.

  • The create.js code is as follows:
const html_vars = {
  title: 'title',
  date: Date.now(),
  author: 'Sunny Day Student'
}
const html_template = './index.html'

// pdf_string is the rendered HTML string
const pdf_string = await ctx.renderView(html_template, html_vars)

// Try to export the PDF
const browser = await puppeteer.launch({
  args: ['--disable-dev-shm-usage', '--no-sandbox']
});
const page = await browser.newPage();
// setContent() returns a promise, so wait for the HTML to load before printing
await page.setContent(pdf_string)
await page.pdf({
  format: 'A4',
  path: 'public/pdf.pdf'
})

Run node create.js to generate pdf.pdf in the /public directory.

Then we open the PDF file and find the following problems.

Problem 1: The CSS file introduced by link in the page cannot be loaded

Solution:

Page. AddStyleTag () can be used to solve the problem. It can be used to pass the link (URL), the content (), and the path ().

The page. AddStyleTag () parameter is described below:

Problem 2: The exported PDF has no background color and background image

This is because Puppeteer drives headless Chrome, which by default does not print background images and background colors in order to save ink.

Solution:

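// -webkit-print-color-adjust is a CSS property rather than a page.pdf() option, so set it in the page CSS
await page.addStyleTag({
    content: 'html { -webkit-print-color-adjust: exact; }'
})
await page.pdf({
    printBackground: true,
})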

Problem 3: Image path cannot be found

Solution:

Put the images on the server as static resources and reference them by absolute URL. If they are not on a static server yet, serve them from one, for example <img src="http://localhost:3000/img/a.png" />
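
One way to apply this, sketched here, is to rewrite relative image paths to absolute URLs before handing the HTML to Puppeteer. The function name absolutizeImageSrc is hypothetical, the base URL http://localhost:3000/ is taken from the example above, and the regular expression only handles simple double-quoted src attributes, so treat it as an illustration rather than a full HTML parser.

```javascript
// Hypothetical helper: turn relative <img src="..."> paths into absolute URLs.
// Assumes src values are double-quoted; this is not a full HTML parser.
function absolutizeImageSrc(html, baseUrl) {
  return html.replace(/(<img\b[^>]*?\bsrc=")([^"]+)(")/gi, (match, before, src, after) => {
    // new URL() leaves absolute URLs untouched and resolves relative ones against baseUrl.
    const absolute = new URL(src, baseUrl).href;
    return before + absolute + after;
  });
}

const sampleHtml = '<img src="img/avatar.png" alt="">';
console.log(absolutizeImageSrc(sampleHtml, 'http://localhost:3000/'));
```

The rewritten string can then be passed to page.setContent() in place of the original HTML.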

Problem 4: Add a header and footer

None of the three problems above is hard to solve, but handling all three together is tedious. Then I thought: why not write the rendered HTML to a file and give that file to Puppeteer? Since it is then a plain HTML file rather than a string returned by renderView, the three problems above go away, and what remains is adding the header and footer.

Solution:

Change the create.js content to the following code:

const { promises: { readFile, writeFile } } = require('fs');
const path = require('path')
...
const pdf_string = await ctx.renderView(html_template, html_vars)
const pdf_path = path.join(__dirname, '/public/pdf.html')
// Written into public/, so the img and CSS files can be referenced directly
await writeFile(pdf_path, pdf_string, 'utf8');
// Footer
const footerTemplate = `
<div>
  <span style="">I am a footer</span>
  <div><span class="pageNumber"></span> / <span class="totalPages"></span></div>
</div>`;
// Header
const headerTemplate = `...`
// Try to export as PDF
const browser = await puppeteer.launch({
  args: ['--disable-dev-shm-usage', '--no-sandbox']
});
const page = await browser.newPage();
// Open the HTML file written above
await page.goto(`file://${pdf_path}`);
await page.pdf({
  path: 'public/pdf.pdf',
  ...options, // shared page.pdf() settings defined elsewhere, e.g. format and printBackground
  // Header and footer
  displayHeaderFooter: true,
  headerTemplate,
  footerTemplate,
  margin: { top: 80, bottom: 80 }
})
await browser.close();
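
A minimal sketch of building the footer string in code follows. Puppeteer fills in elements that carry the classes pageNumber and totalPages; the helper name buildFooterTemplate, the inline styles, and the explicit font size are assumptions (template text tends to render very small unless a size is set), so adjust them to your own layout.

```javascript
// Hypothetical helper: build a footer template string for page.pdf().
// Puppeteer injects values into elements with the classes "pageNumber" and "totalPages".
function buildFooterTemplate(text) {
  // Escape the caller's text so it cannot break the template markup.
  const safeText = String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
  return [
    '<div style="width: 100%; font-size: 10px; padding: 0 40px; display: flex; justify-content: space-between;">',
    `  <span>${safeText}</span>`,
    '  <div><span class="pageNumber"></span> / <span class="totalPages"></span></div>',
    '</div>',
  ].join('\n');
}

console.log(buildFooterTemplate('I am a footer'));
```

The returned string is passed as footerTemplate together with displayHeaderFooter: true, and the page margin must leave enough room for it to be visible.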

Run node create.js again, then look at pdf.pdf, and a new problem appears.

Problem 5: How to hide the header and footer on the cover

This problem is critical, and there are many questions about it in the issue tracker; if you are interested, have a look at the issue. There are two main solutions.

  • The first method is to set the margin of the cover page to 0:
await page.addStyleTag({
    content: "@page:first {margin-top: 0; } body {margin-top: 1cm; }"
});

This keeps the header and footer off the cover, but it brings another problem: the bottom margin of every page other than the cover is calculated incorrectly.

I ran into this problem too. After checking the issue for a fix, I decided to go with the second method.

  • The second approach is to separate the cover from the rest of the pages, generate the PDF in several passes, and finally merge the parts into one. I'll focus on how this is implemented.

To merge PDFs, you can't write the PDF files directly. Instead, you need to keep the PDF data as buffers.

How do we generate buffers for the cover and the content of the PDF?

The page.pdf(options) method resolves with the PDF as a buffer. If path is passed in options it also writes the file to disk, so leave path out when you only need the buffer.

// create.js
...
// Cover buffer
const cover_buffer = await page.pdf({
    ...options,
    pageRanges: '1'  // Export only the first page, that is, the cover page
})
// Content buffer: hide the cover, then export the remaining pages with header and footer
await page.addStyleTag({
    content: "#cover {display:none}"
})
const content_buffer = await page.pdf({
    ...options,
    displayHeaderFooter: true,
    headerTemplate,
    footerTemplate,
    margin: {
        top: 80,
        bottom: 80
    }
})
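
The two passes can also be wrapped in one function, sketched here under the same assumptions as the code above: the cover is exactly the first page and is wrapped in an element with the id cover. The function name renderCoverAndContent and the shape of its arguments are hypothetical; page is expected to be a Puppeteer Page.

```javascript
// Hypothetical helper: export the cover and the content as two separate PDF buffers.
// Assumes the cover is exactly the first page and lives in an element with id "cover".
async function renderCoverAndContent(page, options, templates) {
  // Pass 1: no `path` is given, so nothing is written to disk; keep the returned buffer.
  const coverBuffer = await page.pdf({ ...options, pageRanges: '1' });

  // Hide the cover so the second pass contains only the content pages.
  await page.addStyleTag({ content: '#cover { display: none; }' });

  // Pass 2: content pages with header and footer enabled.
  const contentBuffer = await page.pdf({
    ...options,
    displayHeaderFooter: true,
    headerTemplate: templates.headerTemplate,
    footerTemplate: templates.footerTemplate,
    margin: { top: 80, bottom: 80 },
  });

  return { coverBuffer, contentBuffer };
}
```

Keeping the cover pass first matters: once the cover is hidden, page 1 of the document is already content.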

OK, now the question is: how do I merge the two buffers into one?

Here, too, there are two options for the merge step:

  1. Use easy-pdf-merge. It depends on system tools, so if you are interested, try it out yourself.
  2. Merge the two buffers first, then write the merged result out as a PDF. Two libraries are recommended here, pdf-lib and node-pdftk. Each has its advantages and disadvantages; let's see how to use them.

The following code merges cover_buffer and content_buffer into one document with pdf-lib and then writes the PDF. The drawback is that links in the table of contents of the generated PDF no longer jump when clicked, so if you have no requirements for the table of contents, you can use it; it is very convenient.

First install the package: npm i pdf-lib -S

Then use it in create.js:

// create.js
const { PDFDocument } = require('pdf-lib')
...
const pdfDoc = await PDFDocument.create()

const coverDoc = await PDFDocument.load(cover_buffer)
const [coverPage] = await pdfDoc.copyPages(coverDoc, [0])
pdfDoc.addPage(coverPage)

const mainDoc = await PDFDocument.load(content_buffer)
for (let i = 0; i < mainDoc.getPageCount(); i++) {
    const [aMainPage] = await pdfDoc.copyPages(mainDoc, [i])
    pdfDoc.addPage(aMainPage)
}

const pdfBytes = await pdfDoc.save()
const pdf_path = 'public/pdf.pdf'
await writeFile(pdf_path, pdfBytes);
await browser.close();

The following code merges cover_buffer and content_buffer into one file with node-pdftk and generates the PDF. The drawback is that node-pdftk has an open issue on macOS.

First install the package: npm i node-pdftk -S

Then use it in create.js:

// create.js
...
const pdftk = require('node-pdftk')
const pdf_buffer = [cover_buffer, content_buffer];
pdftk
    .input(pdf_buffer)
    .output()
    // The callback must be async to use await; buf is the merged PDF buffer
    .then(async buf => {
        const pdf_path = 'public/pdf.pdf'
        await writeFile(pdf_path, buf);
        await browser.close();
    });

Pick one of the buffer merging options above, put it in create.js, then run node create.js and check pdf.pdf. If you have any further questions, please reread this blog post or contact me in the comments.

CSS Page Setting

Finally, a little CSS on how to control paging:
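
As a minimal sketch, the rules below are page-break properties that are commonly used for this kind of layout. The selectors #cover and .page come from the template above; the exact rule set and the constant name PAGING_CSS are assumptions, not necessarily the stylesheet used in the original project.

```javascript
// Assumed paging rules, written as a CSS string that can be passed to page.addStyleTag({ content }).
const PAGING_CSS = `
  #cover { page-break-after: always; }                  /* content starts on a new page after the cover */
  .page table, .page img { page-break-inside: avoid; }  /* try not to split tables or images across pages */
  .page .new-page { page-break-before: always; }        /* hypothetical class: force a new page before an element */
`;

console.log(PAGING_CSS.trim());
```

The same rules can of course live in public/css/style.css instead of being injected from create.js.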

If you made it to the end, please give this post a like :)