It has been a month and a half since the last update

What to do

As browsers become more capable and CSS more powerful, building polished reports and articles as web pages is a good choice. But how far these pages can spread is limited by several factors: an offline environment breaks them outright, and sometimes the output has to be archived in another system (a cloud drive, WeChat), where sharing a link does not work because the link can break at any time as the business changes. Exporting to an image or a PDF is therefore the mainstream option.

How to implement

You don’t have to use them all, but you should have at least one in your arsenal.

html2canvas

First, we rely on the power of Canvas. A canvas gives us a high degree of freedom to draw rich content, and canvas.toDataURL() then returns a data URL containing the image in the format given by the type parameter (PNG by default). The returned image has a resolution of 96 dpi.

Some background:

The toDataURL API accepts two parameters:

  1. type

The image format. The default is image/png. The format appears at the start of the exported data URL; if it is not the one you asked for, the requested format is not supported and the browser has automatically fallen back to image/png. Don’t worry, you always get an image.

  2. encoderOptions

The image quality, a number from 0 to 1, used when the format is image/jpeg or image/webp. If the value is out of range, the browser’s default is used (0.92 for JPEG). The larger the number, the sharper the image and the larger the file. For moderate compression, 0.6 or above is recommended; below that the result gets hard to look at.
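Since an unsupported format falls back silently, it can be worth checking what actually came out. The following is a minimal sketch, not part of any library; the helper names dataUrlMimeType and formatWasHonored are made up for illustration.

```javascript
// Extract the MIME type from the header of a data URL,
// e.g. "data:image/png;base64,AAAA" gives "image/png".
function dataUrlMimeType(dataUrl) {
  const match = /^data:([^;,]+)[;,]/.exec(dataUrl);
  return match ? match[1].toLowerCase() : null;
}

// True when the exported data URL uses the format that was requested.
function formatWasHonored(dataUrl, requestedType) {
  return dataUrlMimeType(dataUrl) === requestedType.toLowerCase();
}

// Browser usage (assumes `canvas` is an existing HTMLCanvasElement):
// const url = canvas.toDataURL('image/webp', 0.8);
// if (!formatWasHonored(url, 'image/webp')) {
//   console.warn('Fell back to', dataUrlMimeType(url));
// }
```

Checking the prefix is cheap and avoids saving a PNG under a .webp or .jpg file name.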

html2canvas is a library that takes a DOM element and returns a Promise that resolves to a canvas with the element painted on it, which works in most cases. You can then choose the export format and image quality yourself.

But do you know its limitations?

Or, put another way, some details of using it.

1. Cross-origin images

First off, this is not a problem with html2canvas but with canvas itself. If you draw an image onto a canvas and then export it, and the image’s src is a network address, you run into the classic browser security policy you know well: cross-origin restrictions.

Since the pixels in a <canvas> bitmap can come from multiple sources, including images or videos retrieved from other hosts, security issues inevitably arise.

Although images from other origins can be used in a <canvas> without CORS, doing so taints the canvas. A tainted canvas is no longer considered secure, and retrieving data from the <canvas> may throw an exception.

If the content comes from an external HTML <img> or SVG <svg> element and the image source does not comply with the rules, reading data from the <canvas> is blocked.

Calling any of the following on a tainted canvas throws a security error:

  • getImageData() on the <canvas> context
  • toBlob() on the <canvas>
  • toDataURL() on the <canvas>

This mechanism prevents remote sites’ information from being read without permission, which would leak users’ private data.

Quoted from MDN.
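In practice this means an export call can throw. A small wrapper can turn that into a result you can react to. This is only a sketch: tryExport is a hypothetical helper, and it assumes the browser reports a tainted canvas as an exception whose name is SecurityError.

```javascript
// Try to export a canvas; report a tainted canvas instead of crashing.
function tryExport(canvas, type = 'image/png', quality) {
  try {
    return { ok: true, dataUrl: canvas.toDataURL(type, quality) };
  } catch (err) {
    if (err && err.name === 'SecurityError') {
      // A cross-origin image was drawn without CORS approval
      return { ok: false, reason: 'tainted' };
    }
    throw err; // anything else is a genuine bug
  }
}

// Browser usage (assumes `canvas` is an existing HTMLCanvasElement):
// const result = tryExport(canvas, 'image/jpeg', 0.8);
// if (!result.ok) alert('Some images could not be exported');
```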

Normally we draw an image to a canvas like this:

const canvas = document.createElement('canvas');
const context = canvas.getContext('2d'); // Get the canvas 2D context

// Create an Image object. Maybe this is the first time you've seen new Image().
const img = new Image();
// The key step: allow cross-origin loading. Set it before src so the request uses CORS.
img.crossOrigin = '';
img.onload = function () {
    // Size the canvas to the image so nothing is clipped
    canvas.width = this.width;
    canvas.height = this.height;
    context.drawImage(this, 0, 0);
    context.getImageData(0, 0, this.width, this.height);
};
img.src = 'Network address such as ://xxxxx.jpg';

Besides the src attribute you’re familiar with, the image object has a crossOrigin attribute with two valid values: anonymous and use-credentials. In fact, any value other than use-credentials is treated as anonymous, including the empty string. anonymous means the request for the image carries no credentials: just the image resource, no authentication, no cookies, nothing. In addition, it takes time for the server to return the image and for the browser to decode it, so the Image object also provides an onload handler, which fires once the image has loaded; that is the point at which its pixel data can be read.
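The rule for crossOrigin values can be written down in a few lines. This is an illustrative sketch of the behaviour described above, not browser source code; effectiveCrossOrigin is a hypothetical name, and the sketch assumes that an absent attribute (null) means no CORS request is made at all.

```javascript
// Map a crossOrigin value to the mode the browser effectively uses.
function effectiveCrossOrigin(value) {
  if (value === null || value === undefined) {
    return null; // attribute absent: the image is fetched without CORS
  }
  return String(value).toLowerCase() === 'use-credentials'
    ? 'use-credentials'
    : 'anonymous'; // empty string and unknown values both end up here
}
```

So img.crossOrigin = '' in the example above behaves the same as img.crossOrigin = 'anonymous'.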

With html2canvas you don’t have to do any of this by hand: set useCORS: true and it requests images with CORS for you. Note that allowTaint: true only lets cross-origin images be drawn; the canvas is then tainted and can no longer be exported.

That’s everything on the browser side. Don’t forget the classic server-side half of cross-origin access: the image server must add an appropriate Access-Control-Allow-Origin header to its responses, which can usually be set up quickly in the console of your object storage provider.
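If you run the image server yourself, the header logic is small. The following is a hedged sketch rather than a complete server: corsHeadersFor is a hypothetical helper, and the whitelist of allowed origins is an assumption about how you would configure it.

```javascript
// Decide which CORS headers an image response should carry.
// `allowedOrigins` is an assumed whitelist of page origins.
function corsHeadersFor(requestOrigin, allowedOrigins) {
  if (!requestOrigin || !allowedOrigins.includes(requestOrigin)) {
    return {}; // no header: a CORS image request from that origin will fail
  }
  return {
    'Access-Control-Allow-Origin': requestOrigin,
    Vary: 'Origin', // caches must not reuse the response for other origins
  };
}

// Usage inside any Node request handler (sketch):
// const headers = corsHeadersFor(req.headers.origin, ['https://app.example.com']);
// Object.entries(headers).forEach(([name, value]) => res.setHeader(name, value));
```

Echoing a whitelisted origin instead of sending * keeps the door open for credentialed requests later.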

2. High-resolution scaling

Let’s start with some code:

const dom = document.querySelector('.container');
const imgHeight = dom.offsetHeight; // Get the DOM height
const imgWidth = dom.offsetWidth; // Get the DOM width
const scale = window.devicePixelRatio; // Get the device pixel ratio
html2canvas(dom, {
    useCORS: true, // load cross-origin images via CORS so the canvas stays exportable
    scale,
    width: imgWidth,
    height: imgHeight
});

A raster image gets blurry when it is displayed larger than its pixel size. To keep the exported image sharp, the usual idea is to enlarge the element first and then save it as a picture. In practice that means enlarging the canvas width and height and drawing the element onto it at the same magnification.

What is scale for, and why scale at all?

This brings in an important concept: devicePixelRatio. You may have heard why Apple’s Retina screens look sharper: they use 4 physical pixels to render 1 logical pixel, so the 1px you define in code covers 2 physical pixels in each direction. The devicePixelRatio of such a device is 2, which makes it possible to render a 0.5px line and makes the whole picture look finer. You may never have noticed this ratio because on the ordinary monitors at your company it is usually 1, rendering 1:1, which is why you see so many jagged edges. Open the console now and check whether window.devicePixelRatio is 1.

What happens if I don’t scale?

If your valued users have a high-resolution screen, the canvas ends up smaller than the element, and the exported image clearly shows only the upper-left part of it.

Of course, when you enlarge the canvas, remember to enlarge its width and height by the same factor as scale. A magnification that is a power of 2 is recommended, since it fits high-resolution screens better.
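The two rules, a power-of-two scale and a canvas enlarged by the same factor, can be sketched as follows. Both function names are hypothetical, and rounding the ratio up (rather than down) is an assumption made for this example.

```javascript
// Round a device pixel ratio up to the nearest power of two (1, 2, 4, ...).
function pickScale(devicePixelRatio) {
  const ratio = Math.max(1, devicePixelRatio || 1);
  let scale = 1;
  while (scale < ratio) {
    scale *= 2;
  }
  return scale;
}

// Pixel dimensions of the backing canvas for an element measured in CSS pixels.
function canvasSizeFor(cssWidth, cssHeight, scale) {
  return {
    width: Math.round(cssWidth * scale),
    height: Math.round(cssHeight * scale),
  };
}

// Browser usage (sketch):
// const scale = pickScale(window.devicePixelRatio);
// const size = canvasSizeFor(dom.offsetWidth, dom.offsetHeight, scale);
```

For a 300 x 150 element on a device with a ratio of 1.5, this picks a scale of 2 and a 600 x 300 canvas.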

3. Unsupported CSS

You might think that, image resources aside, plain divs and spans are drawn exactly as they appear. Not quite: some CSS properties are not supported either. Here are some common ones:

  • background-clip: text, which clips the background to the text
  • Fortunately, it doesn’t have much effect
  • object-fit, which controls how an image fits inside its box; you can get the same effect with other CSS properties

See the html2canvas documentation for the full list of supported CSS.

Common CSS properties basically work, which is why you may never have noticed that CSS support is not 100%.

4. Visible area

Long screenshots are also a common requirement. Suppose the element A to capture sits inside a container B that is shorter than A, with overflow-y making B’s content scrollable. When you call html2canvas, note that passing B will not scroll through and capture the whole content; you must pass A itself.

jsPDF

The library that turns DOM elements into images does not have to be html2canvas, but the principle is the same everywhere: the DOM is drawn onto a canvas, a data URL is produced, and the data URL is either converted into a downloadable image file (I won’t go into that part; it relies on window.URL.createObjectURL) or rendered directly as the src of an img. So html2canvas serves as the only example of DOM-to-image here.
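For completeness, here is one possible shape of the download step that the text skips. It is a sketch under assumptions: decodeDataUrl and downloadDataUrl are hypothetical helpers, only base64 data URLs are handled, and the second function runs in a browser only.

```javascript
// Decode a base64 data URL into its MIME type and raw bytes.
function decodeDataUrl(dataUrl) {
  const match = /^data:([^;,]+);base64,(.*)$/.exec(dataUrl);
  if (!match) throw new Error('Expected a base64 data URL');
  const binary = atob(match[2]); // atob is global in browsers and in Node 16+
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i += 1) {
    bytes[i] = binary.charCodeAt(i);
  }
  return { mimeType: match[1], bytes };
}

// Browser-only: wrap the bytes in a Blob and trigger a download.
function downloadDataUrl(dataUrl, fileName) {
  const { mimeType, bytes } = decodeDataUrl(dataUrl);
  const objectUrl = window.URL.createObjectURL(new Blob([bytes], { type: mimeType }));
  const link = document.createElement('a');
  link.href = objectUrl;
  link.download = fileName;
  link.click();
  // Release the object URL once the click has been handled
  setTimeout(() => window.URL.revokeObjectURL(objectUrl), 0);
}
```

Going through a Blob avoids putting a very long data URL into the link’s href, which some browsers handle poorly.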

Then your esteemed boss shows up: give us a PDF as well, we use those all the time.

So you start analyzing the requirement. PDF is an excellent document format that solves the problem of “what looks good in Word on my computer looks bad on yours”. It locks the layout and usually cannot be edited. Speaking of not editable, you have seen PDFs whose text can be neither edited nor selected. Such a PDF is really one huge image.

Here’s the idea: put the previously exported image straight into a PDF.

jsPDF is a library that generates PDFs on the front end. Like Canvas, it gives you a page to draw on: you add elements to it and then export it as a PDF to download. There is not much browser-specific knowledge to note here, so let’s go straight to the code:

import html2canvas from 'html2canvas';
// jsPDF 2.x uses a named export; 1.x used a default export
import { jsPDF } from 'jspdf';

// dom is the element to export, as in the earlier example.
// useCORS keeps the canvas exportable when it contains cross-origin images.
html2canvas(dom, { useCORS: true }).then((canvas) => {
  // A4 measures 595.28pt x 841.89pt; in landscape the long side is the width
  const pageWidth = 841.89;
  const pageHeight = 595.28;
  // Width and height of the rendered content
  const contentWidth = canvas.width;
  const contentHeight = canvas.height;
  // Vertical offset of the image on the current page
  let position = 0;
  // Size of the image in the PDF: fit the page width and keep the aspect ratio
  const imgCanvasWidth = pageWidth;
  const imgCanvasHeight = (imgCanvasWidth / contentWidth) * contentHeight;
  // Height of the image that still has to be placed
  let imageHeight = imgCanvasHeight;
  const pageData = canvas.toDataURL('image/jpeg', 1);
  // jsPDF takes three arguments: orientation, unit and paper size
  const PDF = new jsPDF('landscape', 'pt', 'a4');

  if (imageHeight < pageHeight) {
    // The content fits on one A4 page
    PDF.addImage(pageData, 'JPEG', 20, 20, imgCanvasWidth, imgCanvasHeight);
  } else {
    // The content is longer than one page: repeat the image, shifted up by one page each time
    while (imageHeight > 0) {
      PDF.addImage(pageData, 'JPEG', 20, position, imgCanvasWidth, imgCanvasHeight);
      imageHeight -= pageHeight;
      position -= pageHeight;
      // Avoid adding a trailing blank page
      if (imageHeight > 0) {
        PDF.addPage();
      }
    }
  }
  // Saving is simple
  PDF.save('export PDF' + '.pdf');
});

After creating a new PDF object, call addImage to add the image. Again, pay attention to the dimensions: compute the heights, place the image page by page, and finally call save to download the file directly.
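The pagination trick is easier to see in isolation: the same full-height image is added to every page, each time shifted up by one page height. The sketch below only computes those offsets; pageOffsets is a hypothetical helper and does not call jsPDF.

```javascript
// Compute the y-offset at which the full-height image is placed on each page.
// Page n shows the slice of the image that starts n page-heights down.
function pageOffsets(imageHeight, pageHeight) {
  const offsets = [];
  let remaining = imageHeight;
  let position = 0;
  while (remaining > 0) {
    offsets.push(position);
    remaining -= pageHeight;
    position -= pageHeight;
  }
  return offsets;
}

// An image 250 units tall on pages 100 units tall needs three pages:
console.log(pageOffsets(250, 100)); // [ 0, -100, -200 ]
```

Note that this approach cuts the image at arbitrary positions, so a line of text can be split across two pages.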

Besides addImage, jsPDF has text(text, x, y, options) for adding text and path(lines, style) for drawing arbitrary lines. Drawing lines in code uses an SVG-like syntax, where lines is an array describing m, l, c, and h operations.

Unfortunately, a successful npm library needs more than a powerful API; it also has to be easy to use. jsPDF can write text, draw lines, and add images, but would you hand-write code against these APIs to reproduce the DOM exactly? Obviously that would be a huge amount of work. html2canvas is so widely used precisely because it takes on that work. So you finally produce a PDF and send it to your boss. The boss opens the file, satisfied: the pictures are there, the text is there, the layout is intact. But why can’t the text be selected? The boss asks the soul-searching question:

Can you tell me how this differs from a picture?

The question hits the nail on the head, and you already knew the solution was a bit perfunctory. So you open the jsPDF documentation and read on. Scrolling to the end, you find an html() method, a promising name. Sure enough, the API is simple to use: pass in a DOM element and save a PDF file.

Two lines of code later a file is generated, and when you open it, it’s all gibberish.

Others may be baffled by garbled text, but surely not you. The data must be right and only the decoding wrong, so you look for a way to specify the encoding and plug in UTF-8. It is not that simple.

jsPDF was developed abroad, so its built-in fonts support English but not Chinese.

The solution is to load a Chinese font. jsPDF provides for this: you first load the font data with addFileToVFS and then register it with addFont. There are still plenty of pitfalls, though.

  1. First find a Chinese font file yourself. It must be in ttf format, and the file name should be changed to all lowercase;
  2. Convert the ttf to a JS file on the dedicated font conversion website;
  3. Open the JS file and copy the value of the font variable (at this point you won’t want to go on, because the string is so long it may freeze your computer);
  4. Call PDF.addFileToVFS('font name', font) to import the font, where font is that very long string;
  5. Call PDF.addFont('font name', 'hahahaha', 'normal') to register the font as hahahaha;
  6. Call PDF.setFont('hahahaha') to use the font;
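Steps 4 to 6 above can be wrapped in one function. This is a sketch, not the author’s code: useCustomFont is a hypothetical helper, doc is assumed to be a jsPDF instance, and base64Font stands for the long string copied from the converted font file.

```javascript
// Register a TrueType font with a jsPDF document and make it the active font.
function useCustomFont(doc, fileName, base64Font, family) {
  doc.addFileToVFS(fileName, base64Font); // put the file into jsPDF's virtual file system
  doc.addFont(fileName, family, 'normal'); // bind the file to a family name and style
  doc.setFont(family); // later text() calls use this font
  return doc;
}

// Usage (sketch, with the variables from the example above):
// useCustomFont(PDF, 'myfont.ttf', font, 'hahahaha');
// PDF.text('中文内容', 20, 40);
```

Loading the font string on demand, for example with a dynamic import, keeps it out of the main bundle, although it does not make the font itself any smaller.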

But things are far from over, because tables have their own pitfall...

Tables still do not render properly, and what makes it especially painful is that they require yet another specific font. Once you have imported that many fonts, the size of the whole project jumps a hundredfold. You might think downloading tens of megabytes is no problem on a decent network, but the user’s browser will run out of memory, and so may your development environment and your build. So this solution is fine in theory but not workable in practice.

window.print()

You may remember the browser’s print function. It brings up the system print dialog, where instead of a printer you can choose to export a PDF. The code is very simple: a single line, window.print(), does it.

So why isn’t this simple solution the first choice?

  1. It is not pretty: it brings up the browser’s print dialog, which may not match the style of your project’s interface;
  2. Users need to understand that they can export a PDF without actually printing;
  3. Because users’ devices differ, the page may not render as intended in their browser, and the resulting PDFs may not look the same;
  4. By default the whole page is printed, and irrelevant DOM elements cannot be excluded;

The last point is the hardest to accept. The best fix is to open a new page that contains only the elements you want to print, but that adds cost again: your code opens a new page for the user and navigates to a dedicated route, and the user then prints from the browser’s dialog there.
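One lightweight variant of the "new page with only the printable elements" idea builds that page as a string instead of adding a route. This is a swapped-in technique, not what the text describes, and both helpers are hypothetical. It also assumes you pass in the CSS the content needs, since the original page’s stylesheets do not carry over.

```javascript
// Escape text for safe use inside HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build a standalone HTML document that contains only the content to print.
function buildPrintDocument(title, contentHtml, cssText = '') {
  return [
    '<!DOCTYPE html>',
    '<html><head><meta charset="utf-8">',
    `<title>${escapeHtml(title)}</title>`,
    `<style>${cssText}</style>`,
    `</head><body>${contentHtml}</body></html>`,
  ].join('\n');
}

// Browser usage (sketch; a popup blocker may make window.open return null):
// const popup = window.open('', '_blank');
// popup.document.write(buildPrintDocument('Report', element.outerHTML, css));
// popup.document.close();
// popup.print();
```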

puppeteer

The ultimate solution.

window.print reproduces styles best, and it would be easy to use and to code if only it didn’t need a dialog.

It would be nice to move this step to the server. That is what Puppeteer is for: an open-source Node library from the Google Chrome team that drives headless Chrome, a browser without an interface. It provides APIs for opening new tabs, opening URLs, exporting PDFs (based on printing, I think), clicking elements, typing text, and even executing JavaScript, mimicking what a person does in a browser. It runs in a Node environment, of course.

It is very simple to use, just an npm package:

const path = require('path');
const puppeteer = require('puppeteer');
// isDev, sleep, createUUID and formatToDateTime are project helpers not shown here

let browserInstance;

async function start() {
  // Start a browser
  browserInstance = await openBrowser();
}

async function openBrowser() {
  const launchConfig = {
    headless: true,
  };
  // When deployed to Linux, Chrome usually has to be launched with its sandbox disabled
  if (!isDev()) {
    launchConfig.args = ["--no-sandbox", "--disable-setuid-sandbox"];
  }
  // Start a browser
  const browser = await puppeteer.launch(launchConfig);
  return browser;
}

async function openPage(url) {
  // Open a new tab
  const page = await browserInstance.newPage();
  // Set a suitable viewport width and height
  await page.setViewport({ width: 1200, height: 1080 });
  // Visit the address
  await page.goto(url);
  return page;
}

async function page2PDF(page) {
  // Wait for the marker element to appear
  await page.waitForSelector(".finished-pdf", {
    timeout: 90000,
  });
  await sleep(2000);
  // Create a random file name
  const fileName = createUUID(12);
  // Build the output path of the PDF file
  const pathStr = path.join(__dirname, "..", `/pdf/${fileName ?? "Report"}.pdf`);
  console.log("Start generating PDF files", formatToDateTime());
  // Generate the PDF file
  await page.pdf({
    path: pathStr,
    margin: {
      left: "20px",
      right: "20px",
      top: "40px",
      bottom: "40px",
    },
    format: "a4", // A4 paper
    scale: 0.75, // Moderate scaling
    printBackground: true, // Include backgrounds
    timeout: 90000, // Timeout in milliseconds
  });
  console.log("Complete generating PDF file", formatToDateTime());
  return { pathStr, fileName };
}

Overall, the API is very simple to use: start a browser in memory, open a tab, have the tab load a page, wait for the page to finish loading, call the pdf method with some style options, and the file is generated.

Some details to note:

  1. By building a web service with Koa or Express, you can serve multiple PDF generation tasks. Chrome is multi-tabbed, so it makes sense to launch a single browser instance, open a tab for each task, close the tab once the PDF has been generated, and keep the browser instance alive.

  2. Puppeteer can only tell when the page itself has loaded. In React or Vue projects, many DOM nodes are rendered only after their data has been requested, so you need to wait until all the data has arrived. This requires a small change to the front-end project. In your business logic, for example once the list data has come back, add an invisible marker element to the page on the next tick, using whatever React or Vue offers for that. My suggestion is to render an <i> tag with a special class name such as .finished-pdf, append it to the body, and set display to none so it doesn’t affect the page. Puppeteer’s page.waitForSelector method, as the name implies, waits for a DOM element to appear, and generating the PDF after this line of code is safe.

  3. Fonts are another concern. On your Windows or Mac machine they are a non-issue, but Linux usually lacks a rich set of font files, so after deploying there you need to add them manually. Don’t worry, it is a lot easier this time.

  • Prepare the font package

    English and Chinese fonts are usually both required, for example Microsoft YaHei, Segoe UI, and Arial. Because font weights vary, prepare both the bold and the normal font files. Unzip the font package into /usr/share/fonts so that all fonts sit in this directory.

  • Install the fonts by running these three commands; the last one refreshes the font cache.

        mkfontscale

        mkfontdir

        fc-cache

  4. Text stays text. In a PDF exported this way, text is text and images are images, so text can be selected and stays sharp as you zoom in, while PNG images naturally get blurry; that is normal. In particular, charts drawn with ECharts or other Canvas-based tools are also rendered as images. Thankfully ECharts has an SVG renderer, which is great here: the chart is then no longer a canvas but a set of SVG elements. Lines, areas, and so on remain vectors that can be magnified without loss, and even the text in the chart can be selected in the PDF because it is rendered with SVG text nodes.

    By the same token, the conclusion is: whatever can be done with SVG should be done with SVG!
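The first detail in the list, one shared browser and one tab per task, can be sketched as a small wrapper. withPage is a hypothetical helper; it relies only on browser.newPage() and page.close(), and the usage lines assume the openPage-style steps shown earlier.

```javascript
// Run one task in its own tab of a shared browser and always close the tab.
async function withPage(browser, task) {
  const page = await browser.newPage();
  try {
    return await task(page);
  } finally {
    // The tab is closed even if the task throws;
    // the browser instance itself stays alive for the next task.
    await page.close();
  }
}

// Usage inside a request handler (sketch):
// const result = await withPage(browserInstance, async (page) => {
//   await page.goto(url);
//   return page2PDF(page);
// });
```

Closing the tab in a finally block matters on a long-running server, because every leaked tab keeps its memory until the browser is restarted.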

At this point the HTML-to-PDF solution is basically in place. Refine the details according to your own business design.

Remarks

Finally, a bit of preaching. As front-end developers, we should be glad to live in an era when the front end is developing at high speed. Node lets you learn back-end knowledge without switching languages, so don’t be afraid to leave the browser: the client and the server are both environments we need to know. Don’t close this article the moment you see the word “server” and give up on learning. The road ahead is long and has no end in sight; I will search high and low.