Performance Optimization Guide: Chinese Fonts in Practice

Background

Choosing the right font for a Web project can greatly improve the user experience. However, there are so many font files that designers or developers who want to browse fonts have to open them one by one, which badly hurts efficiency. The platform project I was working on therefore needed a feature that previews fonts using both fixed text and user input. Implementing this feature meant solving two main problems:

The large size of Chinese fonts leads to a long loading time
Preview content is not displayed until the font is loaded

This article describes how I solved these problems and summarizes what I learned along the way.

Using custom web fonts

Before we talk about these two issues, let’s briefly describe how to use a Web custom font. To use a custom font, you can rely on the @font-face rule defined by CSS Fonts Module Level 3. A usage method that is basically compatible with all browsers is as follows:

@font-face {
    font-family: "webfontFamily";
    src: url("webfont.eot");
    src: url("webfont.eot?#iefix") format("embedded-opentype"),
         url("webfont.woff2") format("woff2"),
         url("webfont.woff") format("woff"),
         url("webfont.ttf") format("truetype");
    font-style: normal;
    font-weight: normal;
}
.webfont {
    font-family: webfontFamily; /* The name defined in @font-face */
}

Since the WOFF2, WOFF, and TTF formats are already well supported by most browsers, the above code can be shortened to:

@font-face {
    font-family: "webfontFamily"; /* The name is arbitrary */
    src: url("webfont.woff2") format("woff2"),
         url("webfont.woff") format("woff"),
         url("webfont.ttf") format("truetype");
    font-style:normal;
    font-weight:normal;
}

With the @font-face rule, we just need to upload the font source file to the CDN, make the url value of the @font-face rule be the address of the font, and finally apply the rule to the Web text to achieve the preview effect of the font.

However, one obvious problem is that the font file is too large and takes too long to load. Let's open the browser's Network panel to see:

You can see that the font file is 5.5 MB and takes 5.13 s to load. Many Chinese fonts on the Quark platform are between 20 and 40 MB, so loading times can be expected to be even longer. For users on a weak network, this wait is unacceptable.

Problem 1: The large size of Chinese fonts leads to long loading times

1. Analyze the reasons

There are two main reasons why Chinese fonts are so large compared with English fonts:

Chinese fonts contain a very large number of glyphs, while English fonts need only the 26 letters and a few other symbols.
The strokes of Chinese glyphs are far more complex than those of English glyphs, so more points are needed to describe each Chinese glyph and the amount of data is larger.

With the help of opentype.js, we can compare a Chinese font and an English font by the number of glyphs and by the number of bytes their glyphs occupy:

Font name	Number of glyphs	Glyph data size (bytes)
FZQingFSJW_Cu.ttf	8731	4762272
JDZhengHT-Bold.ttf	122	18328

Font preview on the Quark platform has to support two modes: previewing fixed characters and previewing characters typed by the user. Either way, only a small number of the font's characters are used, so loading the full font is unnecessary and we need to slim down the font file.

2. How to reduce font file size

unicode-range

The unicode-range descriptor is used inside the @font-face rule and restricts a font to a particular set of characters. However, it does not reduce the size of the font file itself, so it does not solve our problem; interested readers can still try it out.

CSS Unicode-range uses font-face to customize the font for specific characters

fontmin

Fontmin is a font subsetting tool implemented in pure JavaScript. As mentioned above, Chinese fonts are bigger than English fonts mainly because they contain more glyphs. The idea behind slimming down a font file is to remove the glyphs that are not needed:

// Pseudocode: loadFont and the g.unicode field are hypothetical
const text = 'Font Preview'
const unicodes = text.split('').map(str => str.charCodeAt(0))
const font = loadFont(fontPath)
font.glyf = font.glyf.filter(g => {
  // Keep only the glyphs that correspond to the requested code points
  return unicodes.includes(g.unicode)
})

In practice, subsetting is not so simple, because a font file consists of many interrelated tables, such as the maxp table, which records the number of glyphs, and the loca table, which stores the offsets of the glyphs. Also, the font file starts with the offset table, which describes all the tables in the font, so if we change the glyf table, we have to update the other tables as well.

Before we discuss how Fontmin subsets a font, let's look at the structure of a font file:

The structure above applies only when the font file contains a single font whose outlines are in TrueType format (which determines the value of sfntVersion), so the offset table starts at byte 0 of the font file. If a font file contains multiple fonts, the offset table of each font is specified in the TTCHeader; such files are beyond the scope of this article.

Offset table:

Type	Name	Description
uint32	sfntVersion	0x00010000
uint16	numTables	Number of tables
uint16	searchRange	(Maximum power of 2 <= numTables) x 16
uint16	entrySelector	log2(maximum power of 2 <= numTables)
uint16	rangeShift	numTables x 16 - searchRange
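The three derived fields in the table above can be computed from numTables alone. The following is a minimal sketch of that arithmetic; the function name offsetTableFields is made up for illustration and is not taken from fonteditor-core.

```javascript
// Derive the binary-search helper fields of the offset table from numTables.
// offsetTableFields is a hypothetical helper name used only for this sketch.
function offsetTableFields(numTables) {
  let maxPow2 = 1        // largest power of two that is <= numTables
  let entrySelector = 0  // log2(maxPow2)
  while (maxPow2 * 2 <= numTables) {
    maxPow2 *= 2
    entrySelector += 1
  }
  const searchRange = maxPow2 * 16
  const rangeShift = numTables * 16 - searchRange
  return { searchRange, entrySelector, rangeShift }
}

// A font with the fourteen tables discussed in this article:
console.log(offsetTableFields(14))
// searchRange = 8 x 16 = 128, entrySelector = 3, rangeShift = 224 - 128 = 96
```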

Table record:

Type	Name	Description
uint32	tableTag	Table identifier
uint32	checkSum	CheckSum for this table
uint32	offset	Offset from beginning of TrueType font file
uint32	length	Length of this table
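To make the two structures above concrete, here is a rough sketch of reading them from an ArrayBuffer with a DataView. It relies only on the field sizes listed in the tables (a 12-byte offset table followed by 16-byte table records); readTableDirectory is a hypothetical name, and this is not how fonteditor-core itself is written.

```javascript
// Read the offset table and all table records from a font's ArrayBuffer.
// Font files are big-endian, which is also DataView's default byte order.
function readTableDirectory(arrayBuffer) {
  const view = new DataView(arrayBuffer)
  const sfntVersion = view.getUint32(0)
  const numTables = view.getUint16(4)
  const tables = {}
  for (let i = 0; i < numTables; i++) {
    const p = 12 + i * 16 // skip the 12-byte offset table, then 16 bytes per record
    // The tag is four ASCII characters, e.g. "glyf" or "cvt " (with a trailing space)
    const tag = String.fromCharCode(
      view.getUint8(p), view.getUint8(p + 1), view.getUint8(p + 2), view.getUint8(p + 3)
    )
    tables[tag] = {
      checkSum: view.getUint32(p + 4),
      offset: view.getUint32(p + 8),
      length: view.getUint32(p + 12)
    }
  }
  return { sfntVersion, numTables, tables }
}
```

In Node.js, a file read with fs.readFileSync is a Buffer, so its bytes would first have to be copied into an ArrayBuffer, for example with buf.buffer.slice(buf.byteOffset, buf.byteOffset + buf.byteLength).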

Every font file, whether its outlines are in TrueType format or in the PostScript-based CFF format, must contain the tables cmap, head, hhea, hmtx, maxp, name, OS/2, and post. If the glyphs are in TrueType format, there are also the cvt, fpgm, glyf, loca, prep, and gasp tables. Apart from glyf and loca, these four additional tables are optional.

How Fontmin subsets a font

Fontmin uses fonteditor-core internally, and the core font processing is delegated to this dependency. The main flow of fonteditor-core is as follows:

1. Initialize Reader

Convert the font file to an ArrayBuffer for subsequent reading.

2. Extract the Table Directory

The structure immediately following the offset table is the table record, and the table records together form the table directory. fonteditor-core first reads the table directory of the original font. From the table record structure above, we know that each table record has four fields of 4 bytes each, so it is convenient to read them with a DataView. The resulting table information for a font file is as follows:

3. Read the table data

In this step, the table data is read using the offset and length recorded in the table directory. For subsetting, the contents of the glyf table matter most, but the glyf table record only tells us the length of the glyf table and its offset relative to the start of the font file. How do we know the number, position, and size of the glyphs inside the glyf table? This requires the maxp table and the loca (glyph location) table. The numGlyphs field of the maxp table specifies the number of glyphs, while the loca table records the offset of every glyph relative to the start of the glyf table. It has the following structure:

Glyph Index	Offset	Glyph Length
0	0	100
1	100	150
2	250	0
.	.	.
n-1	1170	120
extra	1290	0

According to the specification, index 0 refers to the missing character glyph, which is displayed when a character cannot be found in the font. It is usually drawn as a blank box or a space. When a glyph has no outline, loca[n] = loca[n+1] by the definition of the loca table. Note the extra entry at the end of the table above: it exists so that the length of the last glyph (index n-1) can be calculated.

The unit of the Offset values in the table above depends on the indexToLocFormat field in the font's head table. When this value is 0 (the short version), an Offset of 100 means 200 bytes, because the stored value is the byte offset divided by 2. When this value is 1 (the long version), an Offset of 100 means 100 bytes.

But knowing the offsets of all glyphs is not enough to tell which ones we want. Suppose the preview needs four glyphs and the font file has 10,000 glyphs. The loca table gives us the offsets of all 10,000 glyphs, but which four of them represent the four characters of the preview? For that we need the cmap table, which maps character codes (such as Unicode code points) to glyph indices. Once we have a glyph index, we can use it to look up the glyph's offset in the glyf table.
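The offset and length rules described above can be written down in a few lines. This is a sketch that assumes the loca table has already been read into a plain array of its stored values; glyphRange is a hypothetical helper name.

```javascript
// Return the byte offset (relative to the start of the glyf table) and the
// byte length of one glyph. `loca` holds the raw stored values and has
// numGlyphs + 1 entries, the last one being the extra entry.
function glyphRange(loca, glyphIndex, indexToLocFormat) {
  // Short version (0) stores the byte offset divided by 2; long version (1) stores it as-is
  const scale = indexToLocFormat === 0 ? 2 : 1
  const start = loca[glyphIndex] * scale
  const end = loca[glyphIndex + 1] * scale
  return { offset: start, length: end - start }
}

const loca = [0, 100, 250, 250]
console.log(glyphRange(loca, 1, 1)) // offset 100, length 150 (long version)
console.log(glyphRange(loca, 1, 0)) // offset 200, length 300 (short version)
console.log(glyphRange(loca, 2, 1)) // offset 250, length 0: a glyph without an outline
```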
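The lookup from characters to glyph indices might look like the sketch below. It assumes the cmap table has already been parsed into a Map from code point to glyph index (parsing the binary cmap subtables is not shown), and glyphIndicesForText is a hypothetical name.

```javascript
// Collect the glyph indices needed to render `text`.
// `cmap` is assumed to be an already-parsed Map: code point -> glyph index.
function glyphIndicesForText(text, cmap) {
  const indices = new Set([0]) // always keep glyph 0, the missing character glyph
  for (const ch of text) {     // for...of iterates by code point, not by UTF-16 unit
    const index = cmap.get(ch.codePointAt(0))
    if (index !== undefined) indices.add(index)
  }
  return [...indices].sort((a, b) => a - b)
}

// Made-up glyph indices for two Chinese characters:
const cmap = new Map([[0x5b57, 120], [0x4f53, 45]])
console.log(glyphIndicesForText('字体字', cmap)) // glyph indices 0, 45 and 120
```

Each returned index can then be turned into a byte range of the glyf table through the loca table.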

Each glyph's data in the glyf table starts with a glyph header:

Type	Name	Description
int16	numberOfContours	the number of contours
int16	xMin	Minimum x for coordinate data
int16	yMin	Minimum y for coordinate data
int16	xMax	Maximum x for coordinate data
int16	yMax	Maximum y for coordinate data

The numberOfContours field specifies the number of contours in this glyph, and the data structure immediately following the glyph header is the glyph description (Glyph Table).

In the definition of a font, an outline is made up of points, and each point has a number, assigned in ascending order starting from 0. So reading a specific glyph means reading its glyph header and then the coordinates of its contour points.

The Glyph Table stores an array containing the number of the last point of each contour. From this array, we can work out how many points the glyph has. For example, if the array is [3, 6, 9, 15], the last point of the fourth contour is number 15, so the glyph has 16 points, and we only need to iterate 16 times over the ArrayBuffer to get the coordinates of every point. In this way we can extract the glyphs we want, and that is how Fontmin cuts out glyphs.
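Reading the glyph header is a direct translation of the table above: five signed 16-bit values in a row. A minimal sketch follows, where `view` is assumed to be a DataView over the glyf table, `offset` the glyph's byte offset inside it, and readGlyphHeader a hypothetical name.

```javascript
// Read the 10-byte glyph header at `offset` inside the glyf table.
function readGlyphHeader(view, offset) {
  return {
    numberOfContours: view.getInt16(offset), // a negative value marks a composite glyph
    xMin: view.getInt16(offset + 2),
    yMin: view.getInt16(offset + 4),
    xMax: view.getInt16(offset + 6),
    yMax: view.getInt16(offset + 8)
  }
}
```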
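The point count described above can be derived as in the following sketch. It assumes a simple (non-composite) glyph whose end-point array directly follows the 10-byte glyph header as unsigned 16-bit values; readPointCount is a hypothetical name.

```javascript
const GLYPH_HEADER_SIZE = 10 // five int16 fields

// Return how many points a simple glyph has, based on the last entry of the
// array that stores the last point number of each contour.
function readPointCount(view, glyphOffset) {
  const numberOfContours = view.getInt16(glyphOffset)
  if (numberOfContours <= 0) return 0 // empty or composite glyph: not handled in this sketch
  const endPoints = []
  for (let i = 0; i < numberOfContours; i++) {
    endPoints.push(view.getUint16(glyphOffset + GLYPH_HEADER_SIZE + i * 2))
  }
  // Point numbers start at 0, so the count is the last point number plus one
  return endPoints[endPoints.length - 1] + 1
}
```

For the array [3, 6, 9, 15] from the example, this returns 16.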

In addition, when extracting coordinate information, except for the first position point, the coordinate values of other position points are not absolute values. For example, the coordinate of the first point is [100, 100], and the value read for the second point is [200, 200], then the coordinate of the point is not [200, 200], but is incremented based on the coordinate of the first point. So the actual coordinate of the second point is [300, 300].
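Turning the relative values back into absolute coordinates is a running sum. The sketch below assumes the per-point values have already been decoded into [dx, dy] pairs (the binary encoding of the coordinates is not shown); toAbsolute is a hypothetical name.

```javascript
// Convert per-point relative values into absolute coordinates.
// The first pair is relative to the origin [0, 0], so it is already absolute.
function toAbsolute(deltas) {
  let x = 0
  let y = 0
  return deltas.map(([dx, dy]) => {
    x += dx
    y += dy
    return [x, y]
  })
}

console.log(toAbsolute([[100, 100], [200, 200]])) // the second point becomes [300, 300]
```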

There are too many tables involved in a font, and the data structure of each table is different. It’s impossible to list how fonteditor-core handles every table.

4. Associate glyf information

In fonts with TrueType outlines, each glyph provides xMin, xMax, yMin, and yMax, which define the bounding box shown in the image below. Besides these four values, a glyph also needs two fields, advanceWidth and leftSideBearing, which are not stored in the glyf table and are therefore not available when the glyph is extracted. In this step, fonteditor-core reads the font's hmtx table to obtain these two fields.

5. Write fonts

This step recalculates the font file size, updates the values in the offset table and the table records, and then writes the offset table, table records, and table data to the file. Note that the table records must be written in order of their table tags. For example, given the four tables prep, hmtx, glyf, and head, the records are written in the order glyf -> head -> hmtx -> prep. The table data itself does not need to follow this order.

Shortcomings of Fontmin

When subsetting a font, fonteditor-core only processes the fourteen tables mentioned above and discards the rest. Fonts usually also contain the vhea and vmtx tables, which control spacing in vertical layout. If a font is subset with Fontmin, this information is lost. You can see the difference when text is displayed vertically (the subset font is on the right):

Using Fontmin

Once you understand how Fontmin works, you can happily put it to use. When the server receives a request from the client, it subsets the font with Fontmin, which returns a Buffer containing the subset font file. Recall that the url in @font-face can also be a base64 data URL. So we just need to convert the Buffer to base64, embed it in an @font-face rule, and return the rule to the client, which then inserts it into the page as CSS.

For fixed preview content, we could also store the subset font files on a CDN, but the drawback is that an unstable CDN causes font loading to fail. With the method above, each subset font exists as a base64 string that can be cached on the server, which avoids this problem. The code for generating a font subset with Fontmin is as follows:
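The ordering rule can be illustrated with a small layout sketch. It assumes the sizes given earlier (a 12-byte offset table and 16-byte table records) and additionally pads each table's data to a 4-byte boundary, as the OpenType specification requires; layoutTables and its input shape are made up for this example and do not mirror fonteditor-core's code.

```javascript
// Compute where each table's data goes and the order of the table records.
// `tables` is [{ tag, length }] in the order the table data will be written.
function layoutTables(tables) {
  let offset = 12 + 16 * tables.length // data starts after the offset table and all records
  const placed = tables.map(({ tag, length }) => {
    const record = { tag, offset, length }
    offset += Math.ceil(length / 4) * 4 // pad the data to a 4-byte boundary
    return record
  })
  // Table records are sorted by tag; the table data keeps its original order
  const records = [...placed].sort((a, b) => (a.tag < b.tag ? -1 : a.tag > b.tag ? 1 : 0))
  return { records, fileSize: offset }
}

const layout = layoutTables([
  { tag: 'prep', length: 5 },
  { tag: 'hmtx', length: 8 },
  { tag: 'glyf', length: 10 },
  { tag: 'head', length: 54 }
])
console.log(layout.records.map(r => r.tag).join(' -> ')) // glyf -> head -> hmtx -> prep
```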
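The server-side conversion could look roughly like the sketch below. It assumes the subset font is a WOFF2 file held in a Node.js Buffer; buildFontFaceRule and the family name are hypothetical, and the exact shape of the rule in the Quark project may differ.

```javascript
// Build an @font-face rule that embeds the subset font as a base64 data URL.
function buildFontFaceRule(family, fontBuffer) {
  const base64 = fontBuffer.toString('base64')
  return `@font-face { font-family: "${family}"; ` +
    `src: url("data:font/woff2;base64,${base64}") format("woff2"); }`
}

// Placeholder bytes stand in for a real subset font here
console.log(buildFontFaceRule('previewFont', Buffer.from('abc')))
```

Because the rule is a plain string, it is easy to cache on the server, keyed for example by font name and preview text.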

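const Fontmin = require('fontmin')
const { promisify } = require('util')

/**
 * Generate a subset of a font that contains only the glyphs needed for `text`.
 * @param {string} fontPath Path to the source TTF file
 * @param {string} text Characters to keep in the subset
 */
async function extractFontData (fontPath, text) {
  const fontmin = new Fontmin()
    .src(fontPath)
    .use(Fontmin.glyph({ text })) // keep only the glyphs used by `text`
    .use(Fontmin.ttf2woff2())     // convert the subset to WOFF2
    .dest('./dist')

  // fontmin.run takes a Node-style callback, so wrap it in a promise
  return promisify(fontmin.run).call(fontmin)
}

extractFontData('./font/senty.ttf', 'Font Preview').catch(console.error)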

For fixed preview content, we can generate the subset font in advance. For dynamic preview content typed by users, we could follow this process:

Get input -> subset font -> upload to CDN -> generate @font-face -> insert into page

With this process, the client has to make two requests (don't forget that the font itself is requested after the @font-face rule is inserted into the page), and the subsetting and CDN upload steps both take a long time. Is there a better way? We know that the outline of a glyph is determined by a series of points, so we can take the point coordinates from the glyf table and draw the glyphs directly as an SVG image.

SVG is a powerful image format that can be manipulated with CSS and JavaScript; here we mainly use its path element.

This can be done with opentype.js. After the client receives the path data for the input text, it just needs to iterate over it to generate the SVG markup.
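The client-side step might be sketched as follows. It assumes the server returns one SVG path data string (the value of a path element's d attribute) per character, which is an assumption about the response format rather than something the original implementation confirms; buildSvg and the viewBox dimensions are hypothetical.

```javascript
// Build SVG markup from an array of path data strings.
function buildSvg(pathData, width, height) {
  const paths = pathData.map(d => `<path d="${d}"/>`).join('')
  return `<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 ${width} ${height}">${paths}</svg>`
}

// Two made-up outlines; real path data would come from the glyph coordinates
console.log(buildSvg(['M0 0L10 0L10 10Z', 'M20 0L30 0L30 10Z'], 40, 10))
```

The resulting string can be assigned to the innerHTML of a container element to display the preview without loading any font file.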

3. Advantages of reducing font file size

The table below compares file size and loading time before and after subsetting. The WOFF2 subset loads about 145 times faster than the full font (0.12 s versus 17.41 s).

Fontmin supports generating WOFF2 files, but its official documentation has not been updated to say so. I initially used WOFF files, but WOFF2 files are smaller and have good browser support.

Font file	Size	Load time
HanyiSentyWoodcut.ttf	48.2 MB	17.41 s
HanyiSentyWoodcut.woff	21.7 KB	0.19 s
HanyiSentyWoodcut.woff2	12.2 KB	0.12 s

Problem 2: Preview content is not displayed until the font is loaded

This is the second problem in implementing the preview feature.

The browser's font display behavior involves two concepts: the block period and the swap period. Taking Chrome as an example, before the font is loaded the text is rendered invisibly for a while; this is the block period. If loading has not finished by the end of the block period, the fallback font is displayed and the swap period begins, during which the text is re-rendered with the custom font as soon as it loads. This causes the text on the page to flicker, which is not what I want. The font-display descriptor controls this behavior. Can we achieve our goal by changing its value?

font-display

	Block Period	Swap Period
block	Short	Infinite
swap	None	Infinite
fallback	Extremely Short	Short
optional	Extremely Short	None

The font display strategy depends on the font-display value. The browser default is auto, which behaves similarly to block.

The first strategy is FOIT (Flash of Invisible Text), which is the browser's default behavior while a font loads.

The second strategy is FOUT (Flash of Unstyled Text), which tells the browser to use the fallback font until the custom font is loaded. The corresponding value is swap.

Examples of the two strategies in use: Google Fonts uses FOIT, Chinese Fonts uses FOUT.

In the Quark project, I wanted the preview content to stay hidden until the font was loaded, and the FOIT strategy comes closest. However, FOIT keeps the text invisible for at most about 3 s. If the user's network is poor, the fallback font is displayed after 3 s and the text still flickers. Therefore, the font-display descriptor does not meet the requirement.

The CSS Font Loading API also provides a solution at the JavaScript level: the FontFace and FontFaceSet interfaces.

Let's take a look at their compatibility:

IE is the exception again; if you have no IE users, you can ignore it.

We can construct a FontFace object using the FontFace constructor:

const fontFace = new FontFace(family, source, descriptors)

family
The font name, used as the value of the CSS font-family property
source
The font source, either a URL or an ArrayBuffer
descriptors (optional)
style: font-style
weight: font-weight
stretch: font-stretch
display: font-display (this value can be set, but does not take effect)
unicodeRange: the unicode-range of the @font-face rule
variant: font-variant
featureSettings: font-feature-settings

Constructing a FontFace does not load the font; you must call its load method. The load method returns a Promise that resolves with the FontFace once the font has loaded successfully. But a successfully loaded font does not take effect by itself. You also need to add the returned FontFace to the FontFaceSet.

The usage method is as follows:

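/**
 * Load a custom font and register it with the document.
 * @param {string} path Font file path
 */
async function loadFont(path) {
  const fontFaceSet = document.fonts
  // load() resolves with the FontFace itself once the font data is ready
  const fontFace = await new FontFace('fontFamily', `url('${path}') format('woff2')`).load()
  // The font only takes effect after it is added to the FontFaceSet
  fontFaceSet.add(fontFace)
}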

Therefore, on the client side, we can first set CSS opacity: 0 on the text content, and then set opacity: 1 once await loadFont(path) has completed. In this way, the content is not displayed before the custom font has finished loading.

Summary

This article describes the problems I encountered while developing the font preview feature and how I solved them. The OpenType specification covers far more than can be presented here, so the section on how Fontmin works only describes the handling of the glyf table; interested readers can explore further.

While reviewing and summarizing this work, I have also been thinking about better implementations. If you have any suggestions, please feel free to get in touch. The content of this article reflects my personal understanding, so mistakes are inevitable; corrections are welcome.
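The hide-then-reveal step can be sketched as below. The loader is passed in as a parameter so the sketch stays independent of the browser; showWhenFontReady is a hypothetical name, and revealing the text even when loading fails is a design choice of this example, not something prescribed by the approach above.

```javascript
// Hide `element` until the font loader has finished, then reveal it.
// `loadFont` is any async function that loads the font at `path`.
async function showWhenFontReady(element, loadFont, path) {
  element.style.opacity = '0'
  try {
    await loadFont(path)
  } finally {
    // Reveal the text even if loading failed, so it falls back to another font
    element.style.opacity = '1'
  }
}

// In the browser (selector and path are placeholders):
// showWhenFontReady(document.querySelector('.webfont'), loadFont, '/fonts/subset.woff2')
```

Because of the finally block, a failed load still rejects the returned promise, so the caller can log or report the error.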

Thanks for reading!

