Preface

Almost every front-end developer runs into PDF-related requirements sooner or later. Most articles online cover only part of the functionality, and it is hard to find a complete solution that fits your own needs. Drawing on my own work experience, I have put together a complete technical solution covering PDF generation, preview, and printing. If you find it useful, bookmark this article for future reference.

Demo code for this article: github.com/Alansad/pdf…

PDF generation

Scheme comparison

Generally speaking, there are two ways to generate a PDF: on the client side or on the server side. I recommend generating PDFs on the server side.

Client-side generation is usually based on canvas:

  • 1. Use the html2canvas library to render the HTML to a canvas
  • 2. Use the canvas.toDataURL method to convert the canvas into an image
  • 3. Use the jsPDF library to convert the image to a PDF

Although this approach looks simple, it has two serious drawbacks:

  • 1. The generated PDF is blurry
  • 2. The client cannot store the PDF long term
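
One likely cause of the blurriness is that the page is rasterized into a bitmap before it is placed in the PDF, so the text is no longer vector data. The sketch below is illustrative arithmetic only and is not part of the demo: it estimates how many pixels a bitmap needs to cover an A4 page at a given resolution. The helper name canvasSizeForA4 and the 96 and 300 DPI figures are assumptions chosen for the example.

```javascript
// Millimetres per inch, used to convert a physical page size to pixels.
const MM_PER_INCH = 25.4

// A4 paper size in millimetres.
const A4 = { widthMm: 210, heightMm: 297 }

// Hypothetical helper: pixel dimensions a bitmap needs to cover A4 at `dpi`.
function canvasSizeForA4(dpi) {
  return {
    width: Math.round((A4.widthMm / MM_PER_INCH) * dpi),
    height: Math.round((A4.heightMm / MM_PER_INCH) * dpi)
  }
}

const screenSize = canvasSizeForA4(96)  // assumed screen density
const printSize = canvasSizeForA4(300)  // assumed print resolution
console.log(screenSize, printSize)
```

Under these assumptions, a bitmap captured at screen density holds roughly a tenth of the pixels of a 300 DPI page, which would explain why the result looks soft when zoomed or printed.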

So I recommend the second option, generating the PDF on the server side:

  • 1. Generate an HTML string
  • 2. Open the HTML in a headless browser
  • 3. Generate a PDF by taking a snapshot of the page

Some server-side plugins treat the headless browser's open/snapshot process as a black box, so developers never see it. However, whether you use Java, Node.js, Python, or another language, the approach above is generally what happens underneath. PDFs generated this way are sharp and reproduce the page faithfully.

Specific implementation

Next, I will walk through a concrete case to explain the technical details of this approach.

Requirement: provide an interface that generates a PDF, renders different PDFs according to the request parameters, and returns the PDF file as a URL link.

Analysis: based on the requirement above, we need to create an HTML template, fill it in according to the request parameters, use a headless browser to convert the HTML to a PDF, store the PDF in a file service, and finally return the URL to the front end.
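
The flow described in the analysis can be sketched as a single function. This is only an outline under stated assumptions: createPdfLink, renderHtml, htmlToPdf, uploadFile, and the files.example.com address are hypothetical names introduced here, and the three steps are passed in as parameters so that the flow can be read without any real PDF library or storage service.

```javascript
// Sketch of the request flow: params -> HTML -> PDF -> stored file -> URL.
// The three steps are injected, so nothing here depends on a real library.
async function createPdfLink(params, { renderHtml, htmlToPdf, uploadFile }) {
  const html = renderHtml(params)              // 1. fill the HTML template
  const pdfBuffer = await htmlToPdf(html)      // 2. convert the HTML to a PDF
  const fileUrl = await uploadFile(pdfBuffer)  // 3. store the file, get its URL
  return { url: fileUrl }                      // 4. hand the link to the front end
}

// Stand-in implementations, for illustration only.
const fakeSteps = {
  renderHtml: ({ title }) => `<h1>${title}</h1>`,
  htmlToPdf: async (html) => Buffer.from(html),
  uploadFile: async (buffer) => `https://files.example.com/${buffer.length}.pdf`
}

createPdfLink({ title: 'demo' }, fakeSteps).then((result) => console.log(result))
```

Keeping the steps separate like this also makes it easier to swap the storage service or the rendering library later.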

Implementation: the sample code below uses plain Node.js to keep it easy to follow:

1. First, we prepare the HTML string template:

// utils/htmlToPdf.js

// The two sample images used in the template
const IMAGE_A = 'https://gimg2.baidu.com/image_search/src=http%3A%2F%2Fn1-q.mafengwo.net%2Fs6%2FM00%2FFC%2FCC%2FwKgB4lNzI2yAK4tdAAELj6RBVtE37.jpeg%3FimageMogr2%252Fthumbnail%252F%21310x207r%252Fgravity%252FCenter%252Fcrop%252F%21310x207%252Fquality%252F90&refer=http%3A%2F%2Fn1-q.mafengwo.net&app=2002&size=f9999,10000&q=a80&n=0&g=0n&fmt=jpeg?sec=1627634331&t=06efacd9a6480ffc74c5cdfa8f7f261f'
const IMAGE_B = 'https://img1.baidu.com/it/u=1361135963,570304265&fm=26&fmt=auto&gp=0.jpg'

// Build the HTML string from the request parameters
const getHtml = (params) => {
  const { title = '' } = params
  // Seven alternating pairs of the two sample images
  const images = Array.from({ length: 7 }, () => `
        <img src="${IMAGE_A}" alt="">
        <img src="${IMAGE_B}" alt="">`).join('')
  return `
<!DOCTYPE html>
<html lang="zh-CN">
  <head>
    <meta http-equiv="content-type" content="text/html; charset=utf-8">
    <title>demo</title>
  </head>
  <body>
    <div class="wrapper">
      <h1 style="color:red">${title}</h1>
      <div>${images}
      </div>
    </div>
  </body>
</html>
`
}
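
Because the template interpolates request parameters directly into the HTML string, a value containing markup would be rendered as markup. A minimal sketch of escaping such values first is shown below; escapeHtml is a hypothetical helper introduced for this example and is not part of the demo code.

```javascript
// Hypothetical helper: escape characters that are special in HTML
// before interpolating request parameters into the template string.
const HTML_ESCAPES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' }

function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch])
}

console.log(escapeHtml('<b>Report & "summary"</b>'))
```

With a helper like this, the template could interpolate escapeHtml(title) instead of the raw title.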

2. Next, we use the html-pdf npm package to convert the HTML to a PDF (see the html-pdf documentation):

const pdf = require('html-pdf')

// Default PDF options
const optionDefault = {
  format: 'A4',
  header: {
    height: '10mm',
    contents: '' // header HTML (left empty here)
  }
}

// Convert HTML to PDF; resolves with the PDF as a Buffer
const exportPdf = (html, options = optionDefault) => {
  return new Promise((resolve, reject) => {
    pdf.create(html, options).toBuffer((err, res) => {
      if (err) {
        reject(err)
      } else {
        resolve(res)
      }
    })
  })
}

module.exports = { getHtml, exportPdf }

3. Finally, we start an HTTP service and write an interface that returns the PDF:

const http = require('http')
const url = require('url')
const querystring = require('querystring')

const {getHtml, exportPdf} = require('./utils/htmlToPdf')

http.createServer(async (request, response) => {
  const {query, pathname} = url.parse(request.url)
  const {title} = querystring.parse(query)
  if (pathname !== '/') {
    // Answer unknown paths instead of leaving the request hanging
    response.writeHead(404)
    response.end()
    return
  }
  try {
    const html = getHtml({title})
    const pdf = await exportPdf(html)
    // Send the headers only after the PDF has been generated successfully
    response.writeHead(200, {
      'Content-Type': 'application/pdf',
      'Access-Control-Allow-Origin': '*'
    })
    response.end(pdf)
  } catch (err) {
    response.writeHead(500)
    response.end('Failed to generate PDF')
  }
}).listen(8888)
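
Node's documentation marks url.parse and querystring as legacy APIs. If you prefer the WHATWG URL class, the pathname and title can be read as in the sketch below. parseRequest is a hypothetical helper, and the http://localhost base is only a placeholder needed because request.url contains just the path and query.

```javascript
// Hypothetical helper: read the pathname and title from a request URL using
// the WHATWG URL class. The base is a placeholder; it does not affect how
// the pathname or the query string is parsed.
function parseRequest(requestUrl) {
  const parsed = new URL(requestUrl, 'http://localhost')
  return {
    pathname: parsed.pathname,
    title: parsed.searchParams.get('title') || ''
  }
}

console.log(parseRequest('/?title=123'))
```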

4. The example above returns the PDF directly as a buffer. If you need to upload the PDF file to a storage service (Alibaba Cloud's storage service, for example), you can use pdf.create(html, options).toStream to get a stream instead:

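// Same as exportPdf, but resolves with a readable stream of the PDF
const exportPdfStream = (html, options = optionDefault) => {
  return new Promise((resolve, reject) => {
    pdf.create(html, options).toStream((err, res) => {
      if (err) {
        reject(err)
      } else {
        resolve(res)
      }
    })
  })
}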
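
What happens to the stream next depends on the storage SDK you use, which is outside the scope of this article. As a rough stand-in, the sketch below pipes a readable stream into a local temporary file. saveStream and the file name are hypothetical, and a real storage client's upload call would take the place of fs.createWriteStream.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')
const { Readable } = require('stream')
const { pipeline } = require('stream/promises')

// Hypothetical stand-in for a storage upload: pipe the PDF stream into a
// local file in the temp directory and resolve with the file path.
async function saveStream(pdfStream, fileName) {
  const target = path.join(os.tmpdir(), fileName)
  await pipeline(pdfStream, fs.createWriteStream(target))
  return target
}

// Usage with a fake stream in place of the one produced by toStream.
saveStream(Readable.from([Buffer.from('%PDF-1.4 demo')]), 'demo-stream.pdf')
  .then((savedPath) => console.log('saved to', savedPath))
```

Using pipeline rather than a manual pipe means errors on either stream reject the promise, so a failed upload is not silently ignored.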

Once you understand the approach, adding a watermark to the PDF is easy:

Because the approach works by taking a snapshot of the HTML page, we only need to add the watermark to the HTML page itself. There are many watermark libraries available online; include one with a script tag in the HTML to apply the watermark.
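
If you would rather not pull in a library, a plain overlay element can also serve as a watermark. The sketch below is one possible way to do it, not the method used in the demo: addWatermark is a hypothetical helper, the styling values are arbitrary examples, and whether the overlay repeats on every page depends on the rendering engine.

```javascript
// Hypothetical helper: add a text watermark by injecting a fixed-position
// overlay just before </body>. Styling values are examples only.
function addWatermark(html, text) {
  const overlay = `
<div style="position: fixed; top: 40%; left: 0; width: 100%;
  text-align: center; font-size: 60px; color: rgba(0, 0, 0, 0.15);
  -webkit-transform: rotate(-30deg); transform: rotate(-30deg);
  pointer-events: none;">${text}</div>`
  // A function replacer avoids special handling of "$" in the watermark text.
  // When the template has no closing body tag, the overlay is appended.
  return html.includes('</body>')
    ? html.replace('</body>', () => `${overlay}\n</body>`)
    : html + overlay
}

console.log(addWatermark('<html><body><p>Report</p></body></html>', 'CONFIDENTIAL'))
```

The watermark text is inserted as-is, so it should come from trusted input or be escaped first.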

Beyond the functionality above, there are two things to watch out for:

  • Because this approach relies on a headless browser, the speed of PDF generation depends directly on how fast the browser loads the HTML. If generation takes too long, consider making PDF retrieval asynchronous. Also, under high concurrency, loading many HTML pages at once can put pressure on the service's memory, so it is best to deploy the PDF service on a different server from the business code.
  • The browser relies on the system's Chinese fonts when loading the HTML. If Chinese characters do not appear in the PDF, install Chinese fonts on the system. If you would rather not install fonts system-wide, you can specify a Chinese font in CSS instead:
@font-face {
  font-family: pdfZh;
  src: url("http://localhost:3000/pdf_zh.ttf");
}
body {
  font-family: pdfZh;
}
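
For the concurrency concern in the first consideration above, one option is to cap how many conversions run at the same time. The sketch below is a minimal in-process limiter under that assumption. createLimiter and fakeExport are hypothetical names, and the limit of 2 is arbitrary; in practice the wrapped task would be a call such as exportPdf(html).

```javascript
// Hypothetical helper: run at most `max` async tasks at once; additional
// tasks wait in a queue until a running one settles.
function createLimiter(max) {
  let active = 0
  const waiting = []

  const next = () => {
    if (active >= max || waiting.length === 0) return
    active++
    const { task, resolve, reject } = waiting.shift()
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        active--
        next()
      })
  }

  return (task) =>
    new Promise((resolve, reject) => {
      waiting.push({ task, resolve, reject })
      next()
    })
}

// Usage with a fake conversion in place of a real PDF export.
const limit = createLimiter(2)
const fakeExport = (id) => new Promise((resolve) => setTimeout(() => resolve(`pdf-${id}`), 10))
Promise.all([1, 2, 3, 4].map((id) => limit(() => fakeExport(id))))
  .then((results) => console.log(results))
```

A limiter like this only protects a single process; across several instances, a job queue or a dedicated PDF service would be needed.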

PDF preview

PDF preview generally works by rendering the PDF to a canvas, and the most popular library for this is pdf.js. I will use an official example to introduce the library. The result is a PDF preview that you can page through.

The complete code is as follows:

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Title</title>
  <!--<script src="./pdf.js"></script>-->
  <script src="//mozilla.github.io/pdf.js/build/pdf.js"></script>
  <style>
    #the-canvas {
      border: 1px solid black;
      direction: ltr;
    }
  </style>
</head>
<body>
<div>
  <button id="prev">Previous</button>
  <button id="next">Next</button>
  &nbsp; &nbsp;
  <span>Page: <span id="page_num"></span> / <span id="page_count"></span></span>
</div>
<canvas id="the-canvas" style="width: 100%; height: auto"></canvas>
<script>
  // If absolute URL from the remote server is provided, configure the CORS
  // header on that server.
  // const url = 'http://localhost:8888/?title=123'
  var url = 'https://raw.githubusercontent.com/mozilla/pdf.js/ba2edeae/web/compressed.tracemonkey-pldi-09.pdf';

  // Loaded via <script> tag, create shortcut to access PDF.js exports.
  const pdfjsLib = window['pdfjs-dist/build/pdf'];

  // The workerSrc property shall be specified.
  pdfjsLib.GlobalWorkerOptions.workerSrc = '//mozilla.github.io/pdf.js/build/pdf.worker.js';

  var pdfDoc = null,
      pageNum = 1,
      pageRendering = false,
      pageNumPending = null,
      scale = 3,
      canvas = document.getElementById('the-canvas'),
      ctx = canvas.getContext('2d');

  /**
   * Get page info from document, resize canvas accordingly, and render page.
   * @param num Page number.
   */
  function renderPage(num) {
    pageRendering = true;
    // Using promise to fetch the page
    pdfDoc.getPage(num).then(function(page) {
      var viewport = page.getViewport({scale: scale});
      canvas.height = viewport.height;
      canvas.width = viewport.width;

      // Render PDF page into canvas context
      var renderContext = {
        canvasContext: ctx,
        viewport: viewport
      };
      var renderTask = page.render(renderContext);

      // Wait for rendering to finish
      renderTask.promise.then(function() {
        pageRendering = false;
        if (pageNumPending !== null) {
          // New page rendering is pending
          renderPage(pageNumPending);
          pageNumPending = null;
        }
      });
    });

    // Update page counters
    document.getElementById('page_num').textContent = num;
  }

  /**
   * If another page rendering in progress, waits until the rendering is
   * finished. Otherwise, executes rendering immediately.
   */
  function queueRenderPage(num) {
    if (pageRendering) {
      pageNumPending = num;
    } else {
      renderPage(num);
    }
  }

  /**
   * Displays previous page.
   */
  function onPrevPage() {
    if (pageNum <= 1) {
      return;
    }
    pageNum--;
    queueRenderPage(pageNum);
  }
  document.getElementById('prev').addEventListener('click', onPrevPage);

  /**
   * Displays next page.
   */
  function onNextPage() {
    if (pageNum >= pdfDoc.numPages) {
      return;
    }
    pageNum++;
    queueRenderPage(pageNum);
  }
  document.getElementById('next').addEventListener('click', onNextPage);

  /**
   * Asynchronously downloads PDF.
   */
  pdfjsLib.getDocument(url).promise.then(function(pdfDoc_) {
    pdfDoc = pdfDoc_;
    document.getElementById('page_count').textContent = pdfDoc.numPages;

    // Initial/first page rendering
    renderPage(pageNum);
  });
</script>
</body>
</html>

The pdf.js source is at github.com/mozilla/pdf… . Pay attention to the scale value: in theory, the larger the value, the sharper the rendering. However, if the value is too large, rendering may become slow or stall.
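
Rather than hard-coding a large scale, one option is to derive it from the displayed width and the device pixel ratio, with an upper bound. The sketch below illustrates that idea only: chooseScale is a hypothetical helper, the cap of 3 is an assumption, and pageWidth stands for the page width at scale 1 (the width reported by page.getViewport({scale: 1})).

```javascript
// Hypothetical helper: pick a render scale that gives about one canvas
// pixel per device pixel at the displayed width, capped at maxScale.
function chooseScale(pageWidth, displayWidth, devicePixelRatio = 1, maxScale = 3) {
  const wanted = (displayWidth / pageWidth) * devicePixelRatio
  return Math.min(wanted, maxScale)
}

// Example: a 612-unit-wide page shown 918 CSS pixels wide on a 2x display.
console.log(chooseScale(612, 918, 2))
```

In the browser, displayWidth could come from the canvas element's clientWidth and devicePixelRatio from window.devicePixelRatio.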

PDF printing

In the browser, we are not allowed to connect to a printer directly and print a file, because that would be a serious security risk. The usual requirement is therefore to open the browser's print dialog and let users adjust the settings and print for themselves.

The effect is that the print dialog for the PDF opens automatically.

The implementation steps are as follows:

  • 1. Request the PDF and read the response as a blob
  • 2. Convert the blob to an object URL
  • 3. Load the object URL in an iframe
  • 4. Call print on the iframe's window once it has loaded

Converting the PDF to an object URL here is a uniform way around cross-origin problems, since the object URL shares the page's origin.

The specific code is:

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Title</title>
</head>
<body>
<iframe id="frame-result" style="height: 100vh; width: 100vw;"></iframe>
<script>
  const downloadRes = async () => {
    // Fetch the PDF and read it as a blob
    const response = await fetch('http://localhost:8888/?title=123')
    const blob = await response.blob()
    const iframeEle = document.querySelector('#frame-result')
    if (iframeEle) {
      // Open the print dialog once the PDF has loaded in the iframe
      iframeEle.onload = () => {
        iframeEle.contentWindow.print()
      }
      iframeEle.src = URL.createObjectURL(new Blob([blob], {type: 'application/pdf'}))
    }
  }
  downloadRes()
</script>
</body>
</html>

If you need to open the print dialog without showing the PDF, hide the iframe:

<iframe id="frame-result" style="display: none"></iframe>

If you need to open a new window to print, you can use the following code:

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Title</title>
</head>
<body>
<script>
  const downloadRes = async () => {
    // Fetch the PDF and read it as a blob
    const response = await fetch('http://localhost:8888/?title=123')
    const blob = await response.blob()
    // Open the PDF in a new window via an object URL
    const newWindow = window.open(URL.createObjectURL(new Blob([blob], {type: 'application/pdf'})))
    if (newWindow) {
      // Print once the new window has finished loading
      newWindow.onload = () => {
        newWindow.print()
      }
    }
  }
  downloadRes()
</script>
</body>
</html>

Conclusion

This article has presented a complete solution covering PDF generation, PDF preview, and PDF printing. Feedback and corrections are welcome. If you are interested in file streams and headless browsers, feel free to follow me; I will cover their practical use at work in more detail in later articles.