Sourcemap: the unsung hero

When an error occurs during development, the console points us to the exact place in our code where it happened. This is less obvious than it sounds: bundling compiles and compresses our original code, so the original code no longer exists in the build output. If we open the output, all we see is compiled, minified code.

Why, then, can the console locate errors in the original code? The answer is the subject of this article: the source map.

In front-end engineering, most code goes through bundling and compilation on its way from development to production, in order to:

  1. Translate file types such as JSX, TSX, and TS into JS that the runtime can recognize
  2. Translate JS into ES5, which is more widely supported
  3. Merge multiple JS files into one final bundle, then compress and obfuscate the code to some extent

After these three steps, our code has changed beyond recognition. With the original information lost, the code becomes hard to trace, and if an error occurs at this point, the compressed code gives us nothing to go on. However, if the build output also contains .map files, we can use them to restore the original code. These files are sourcemaps. So how is a sourcemap generated, and how should we use it?

Sourcemap principle

The composition of sourcemap

To describe sourcemap generation more clearly, I compile the simplest possible case and generate a sourcemap for it:

// input 
const example = () => { 
  console.log('example'); 
} 
 
// output 
"use strict"; 
 
var example = function example(){ 
  console.log("example"); 
}; 

While converting the above code to ES5 with Babel, we get a sourcemap:

{
  "version": 3,
  "sources": [
    "src/example.js"
  ],
  "names": [
    "example",
    "console",
    "log"
  ],
  "mappings": ";AAAA,IAAMA,OAAO,GAAG,SAAVA,OAAU,GAAM;AACpBC,EAAAA,OAAO,CAACC,GAAR,CAAY,SAAZ;AACD,CAFD",
  "sourcesContent": [
    "const example = () => {\n console.log(\"example\");\n};\n"
  ]
}

You can see that it has several fields:

  1. version: the source map version number.
  2. sources: the files before conversion. This is an array because several source files may be merged into one output file.
  3. names: the names of all variables and properties before conversion.
  4. mappings: a string that records the position information.
  5. sourcesContent: the original content of each source file.

The most important of these is the mappings field, which records the mapping between the original code and the compiled code.

How mappings are recorded

Take two minutes to think about how you would record a mapping from original code to compiled code if you were designing sourcemap. A simple approach is to record, for each compiled token, its position in the original code. Because several files may be compiled into one, we also need to record the original file name:

| Compiled position (row, column) | Compiled token | Original file | Original position (row, column) | Original token |
| --- | --- | --- | --- | --- |
| 0, 0 | var | src/example.js | 0, 0 | const |
| 0, 4 | example | src/example.js | 0, 6 | example |
| 0, 11 | = | src/example.js | 0, 13 | = |
| 0, 14 | function | src/example.js | 0, 16 | ( |
| 0, 23 | example | src/example.js | 0, 6 | example |
| 0, 30 | ( | src/example.js | 0, 16 | ( |
| 0, 33 | { | src/example.js | 0, 22 | { |

So far, we have recorded the original information for the first line of code. Joining the fields of each entry with | and separating entries with commas, it can be expressed as:

0|0|src/example.js|0|0,0|4|src/example.js|0|6,0|11|src/example.js|0|13,0|14|src/example.js|0|16,0|23|src/example.js|0|6,0|30|src/example.js|0|16,0|33|src/example.js|0|22

The second and third lines can be recorded in the same way. With the mapping recorded, a practical problem appears: we needed more than 150 characters to record the mapping for only 23 characters of original code. Is there a way to use fewer characters?
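As a rough illustration of this naive scheme, the sketch below builds the string for the first line from a list of records. The record shape and the `encodeNaive` helper are hypothetical names introduced here, not part of any sourcemap tool.

```javascript
// Hypothetical record shape: one entry per token on the first output line.
const records = [
  { outLine: 0, outCol: 0, source: "src/example.js", srcLine: 0, srcCol: 0 },
  { outLine: 0, outCol: 4, source: "src/example.js", srcLine: 0, srcCol: 6 },
  { outLine: 0, outCol: 11, source: "src/example.js", srcLine: 0, srcCol: 13 },
  { outLine: 0, outCol: 14, source: "src/example.js", srcLine: 0, srcCol: 16 },
  { outLine: 0, outCol: 23, source: "src/example.js", srcLine: 0, srcCol: 6 },
  { outLine: 0, outCol: 30, source: "src/example.js", srcLine: 0, srcCol: 16 },
  { outLine: 0, outCol: 33, source: "src/example.js", srcLine: 0, srcCol: 22 },
];

// Join the five fields of each record with "|" and the records with ",".
const encodeNaive = (recs) =>
  recs
    .map((r) => [r.outLine, r.outCol, r.source, r.srcLine, r.srcCol].join("|"))
    .join(",");

const naive = encodeNaive(records);
console.log(naive);
console.log(naive.length); // far longer than the 23 characters of source it describes
```

Most of the length comes from repeating the file name and the output line number in every entry, which is exactly what the following steps remove.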

Now, we optimize the above information layer by layer, as sourcemap does:

Optimization

Omit the line number in the output file and use ; to mark line breaks

Using ; to mark the end of each output line, we can shorten the above encoding to:

0|src/example.js|0|0,4|src/example.js|0|6,11|src/example.js|0|13,14|src/example.js|0|16,23|src/example.js|0|6,30|src/example.js|0|16,33|src/example.js|0|22;

Identify variable names with indexes

Earlier we mentioned the names array in the sourcemap. When a token is a variable name, the mapping also records its index in the names array, so the encoding becomes:

0|src/example.js|0|0,4|src/example.js|0|6|0,11|src/example.js|0|13,14|src/example.js|0|16,23|src/example.js|0|6|0,30|src/example.js|0|16,33|src/example.js|0|22;

Use an index instead of the file name

src/example.js has index 0 in sources, so the encoding can be further simplified to:

0|0|0|0,4|0|0|6|0,11|0|0|13,14|0|0|16,23|0|0|6|0,30|0|0|16,33|0|0|22;

Replace absolute positions with relative positions

When the file is large, the numbers in the simplified encoding above can still grow very long. If only the first position in a line is recorded absolutely, every other position can be expressed relative to the previous one. For example:

| Compiled column (relative) | Compiled token | Original file | Original position (row, column; relative) | Original token |
| --- | --- | --- | --- | --- |
| 0 | var | src/example.js | 0, 0 | const |
| 4 (previous position +4) | example | src/example.js | 0, 6 | example |
| 7 (previous position +7) | = | src/example.js | 0, 7 | = |
| 3 (previous position +3) | function | src/example.js | 0, 3 | ( |
| 9 (previous position +9) | example | src/example.js | 0, -10 | example |
| 7 (previous position +7) | ( | src/example.js | 0, 10 | ( |
| 3 (previous position +3) | { | src/example.js | 0, 6 | { |

So our mappings are further simplified to:

0|0|0|0,4|0|0|6|0,7|0|0|7,3|0|0|3,9|0|0|-10|0,7|0|0|10,3|0|0|6;
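The optimization steps so far can be sketched in a few lines. This is a minimal sketch of the article's simplified scheme: the tuple layout and the `toRelativeSegments` helper are hypothetical. In the real format the source and name indexes are stored as deltas too, which makes no difference here because both stay at 0.

```javascript
const sources = ["src/example.js"];
const names = ["example", "console", "log"];

// Tokens of the first output line: [outCol, source, srcLine, srcCol, name?]
const tokens = [
  [0, "src/example.js", 0, 0],
  [4, "src/example.js", 0, 6, "example"],
  [11, "src/example.js", 0, 13],
  [14, "src/example.js", 0, 16],
  [23, "src/example.js", 0, 6, "example"],
  [30, "src/example.js", 0, 16],
  [33, "src/example.js", 0, 22],
];

function toRelativeSegments(lineTokens) {
  let prevOutCol = 0;
  let prevSrcLine = 0;
  let prevSrcCol = 0;
  return lineTokens.map(([outCol, source, srcLine, srcCol, name]) => {
    const segment = [
      outCol - prevOutCol, // output column, relative to the previous token
      sources.indexOf(source), // file name replaced by its index in sources
      srcLine - prevSrcLine, // original line, relative
      srcCol - prevSrcCol, // original column, relative
    ];
    if (name !== undefined) segment.push(names.indexOf(name)); // index in names
    prevOutCol = outCol;
    prevSrcLine = srcLine;
    prevSrcCol = srcCol;
    return segment;
  });
}

// Fields joined with "|", segments with ",", and ";" closing the output line.
const encodedLine =
  toRelativeSegments(tokens).map((s) => s.join("|")).join(",") + ";";
console.log(encodedLine);
```

Negative values such as -10 appear whenever a token maps back to an earlier column in the original line, so the final encoding has to handle signed numbers.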

VLQ encoding

If we could somehow drop the separator between numbers (| in our example), we would save many characters. The catch is that without separators we cannot tell whether 10010 means 10|0|10, 100|10 or 1|0|0|1|0. We can, however, design an encoding that still groups the digits correctly without separators. Sourcemap uses the following method:

In binary, each group of six bits records a number: one bit marks whether the number continues into another group (C below; 1 means more groups follow), one bit marks the sign (S below), and four bits hold the value. We use these six bits to represent the numbers we need.

| B5 | B4 | B3 | B2 | B1 | B0 |
| --- | --- | --- | --- | --- | --- |
| C | Value | Value | Value | Value | S |

The sign of a number is fully determined by its first group, so later groups need no sign bit. The first group therefore has four bits for the value, and each subsequent group has five bits for the value (one bit still marks whether the number continues). Take the second segment of the simplified mappings, 4|0|0|6|0, as an example:

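| Decimal | Binary |
| --- | --- |
| 4 | 100 |
| 0 | 0 |
| 6 | 110 |

So they should be encoded as:

| Number | B5 (C) | B4 | B3 | B2 | B1 | B0 (S) |
| --- | --- | --- | --- | --- | --- | --- |
| 4 | 0 | 0 | 1 | 0 | 0 | 0 |
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 6 | 0 | 0 | 1 | 1 | 0 | 0 |
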


Note: if a number does not fit in one group, the remaining bits go into further groups. For example, 23 is 10111 in binary, which does not fit in the four value bits of the first group. It is split in two: the first group takes the lowest four bits, 0111, and the second group takes the remaining bit, 1. The result is encoded as 101110 000001.
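The grouping rule can be checked with a small sketch. `toVlqGroups` is a hypothetical helper written for this article; it returns each six-bit group as a binary string so the result can be compared with the tables above.

```javascript
// Split a signed integer into six-bit VLQ groups, returned as binary strings.
function toVlqGroups(n) {
  // Move the sign into the lowest bit: |n| shifted left by one, plus 1 if negative.
  let v = n < 0 ? (-n << 1) | 1 : n << 1;
  const groups = [];
  do {
    let group = v & 0b11111; // low five bits: four value bits plus sign in the first group
    v >>>= 5;
    if (v > 0) group |= 0b100000; // set the continuation bit C when more bits remain
    groups.push(group.toString(2).padStart(6, "0"));
  } while (v > 0);
  return groups;
}

console.log(toVlqGroups(4)); // [ '001000' ]
console.log(toVlqGroups(23)); // [ '101110', '000001' ]
```

Shifting the value left by one before splitting is what reserves the lowest bit of the first group for the sign, so every later group can spend all five of its remaining bits on the value.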

So 4|0|0|6|0 is translated into 001000 000000 000000 001100 000000, and mapping each group to its base64 character gives:

001000 000000 000000 001100 000000 
I      A      A      M      A 

A group uses six bits because a single base64 character can represent exactly six bits, so with this encoding 4|0|0|6|0 becomes IAAMA. At this point, we know how sourcemap works and how it is generated.
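Putting the pieces together, here is a minimal sketch of base64 VLQ encoding and decoding for one segment. The function names are hypothetical, and the sketch assumes well-formed input; in practice a maintained package such as Rich-Harris/vlq would be used.

```javascript
const BASE64 =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Encode one signed integer as base64 VLQ characters.
function encodeVlq(n) {
  let v = n < 0 ? (-n << 1) | 1 : n << 1; // sign goes into the lowest bit
  let out = "";
  do {
    let digit = v & 31; // next five bits
    v >>>= 5;
    if (v > 0) digit |= 32; // continuation bit
    out += BASE64[digit];
  } while (v > 0);
  return out;
}

// A segment is its fields encoded one after another, with no separator.
const encodeSegment = (fields) => fields.map(encodeVlq).join("");

// Decode a segment back into its signed integer fields.
function decodeSegment(text) {
  const fields = [];
  let value = 0;
  let shift = 0;
  for (const ch of text) {
    const digit = BASE64.indexOf(ch);
    value |= (digit & 31) << shift;
    if (digit & 32) {
      shift += 5; // more groups follow for this number
    } else {
      const magnitude = value >>> 1;
      fields.push(value & 1 ? -magnitude : magnitude);
      value = 0;
      shift = 0;
    }
  }
  return fields;
}

console.log(encodeSegment([4, 0, 0, 6, 0])); // IAAMA
console.log(decodeSegment("SAAVA")); // [ 9, 0, 0, -10, 0 ]
```

The segments IAAMA and SAAVA both appear in the mappings string Babel generated earlier, and they decode to the same relative values as the second and fifth rows of the relative-position table.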

Different sourcemaps in webpack

We know that a sourcemap records the mapping between compiled code and original code, so in theory every compilation step can produce its own sourcemap: TS(X) to JS, ES6 to ES5, code compression, and so on. After the whole pipeline, the sourcemaps generated at each step must be merged to obtain a single sourcemap from production code back to development code. We can merge sourcemaps manually with an existing community package: www.npmjs.com/package/mer…

However, doing this by hand is cumbersome. Bundlers such as webpack are indispensable in front-end engineering, and webpack generates sourcemaps itself, so we do not need to care about the details of merging them. In the webpack configuration, the devtool option selects the mode used to generate the sourcemap. Several kinds of values can be used in devtool, such as eval, inline, cheap… Take this program as an example:

// index.js 
import a from "./a"; 
import b from "./b"; 
 
b(a); 
 
// a.js 
export default "a"; 
 
// b.js 
export default (str) => { 
  throw new Error(str); 
}; 

source-map

This mode generates a separate .map file, the same kind of file we described earlier. It is the most detailed mode, but also the slowest.

inline (such as inline-source-map)

Instead of generating a separate .map file, the sourcemap is base64-encoded and appended to the end of the compiled code. The disadvantage is that this makes the compiled code bulky; otherwise it is the same as source-map mode.
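To make the inline form concrete, the sketch below builds the trailing comment from a sourcemap object as a base64 data URL and reads it back. The helper names are hypothetical, the map is a shortened stand-in, and `Buffer` assumes a Node.js environment.

```javascript
// A shortened stand-in for a generated sourcemap.
const exampleMap = {
  version: 3,
  sources: ["src/example.js"],
  names: ["example", "console", "log"],
  mappings: "AAAA",
};

const PREFIX = "//# sourceMappingURL=data:application/json;charset=utf-8;base64,";

// Serialize the map to JSON, base64-encode it and wrap it in the trailing comment.
function toInlineComment(map) {
  return PREFIX + Buffer.from(JSON.stringify(map), "utf8").toString("base64");
}

// Reverse the process: strip the prefix, decode base64, parse the JSON.
function fromInlineComment(comment) {
  const base64 = comment.slice(PREFIX.length);
  return JSON.parse(Buffer.from(base64, "base64").toString("utf8"));
}

const comment = toInlineComment(exampleMap);
console.log(comment);
console.log(fromInlineComment(comment).sources);
```

Base64 grows the payload by roughly a third, and the whole map, usually including sourcesContent, travels with every download of the bundle, which is why this form suits development better than production.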

eval

Each module's source code is executed as a string through eval(…). No sourcemap information is generated; only a sourceURL attached to each module records the location of the original file. In the console we can only see the compiled code processed by webpack, so the reported line numbers do not reflect the real ones.

eval-source-map

Each module's source code is executed as a string through eval(…), with the sourcemap information (base64-encoded) and a sourceURL attached to each module. In this mode we can see the original code in the console.

cheap-source-map

The generated sourcemap only has line information and records no column information. However, if you look carefully, you may notice that the sourcemap in cheap-source-map mode still does not map to the real original code (the original arrow function has already been compiled into a function by Babel). cheap-source-map records the mapping to the code as transformed by the loader (in this case, babel-loader).

cheap-module-source-map

Use cheap-module-source-map to record the information from before the loader transformation.

nosources

In this mode, the generated sourcemap has no sourcesContent, so we get an error stack without the source code itself.

hidden

In this mode, the sourcemap is generated, but no sourceMappingURL comment is appended to the compiled code. These modes can also be combined; valid values follow the pattern ^(inline-|hidden-|eval-)?(nosources-)?(cheap-(module-)?)?source-map$
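The pattern above can be used directly to check a candidate value. A small sketch, with `isValidDevtool` as a hypothetical helper name:

```javascript
// The combination pattern quoted above, as a JavaScript regular expression.
const DEVTOOL_PATTERN =
  /^(inline-|hidden-|eval-)?(nosources-)?(cheap-(module-)?)?source-map$/;

const isValidDevtool = (value) => DEVTOOL_PATTERN.test(value);

console.log(isValidDevtool("eval-cheap-module-source-map")); // true
console.log(isValidDevtool("hidden-nosources-source-map")); // true
console.log(isValidDevtool("cheap-eval-source-map")); // false: the prefixes are out of order
```

The order of the prefixes matters: inline-, hidden- or eval- comes first, then nosources-, then cheap- or cheap-module-. Plain eval does not match the pattern; it is a separate value.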

Which pattern should the development environment use? What about the production environment?

Development environments typically use eval-source-map mode, which builds quickly and still shows the original code. In a production environment, hidden-source-map can be used so that the source code is not leaked.
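One way to apply this recommendation is to choose the devtool value from the build mode. The sketch below assumes the config is exported as a function, which webpack calls with (env, argv); `pickDevtool` and `createConfig` are hypothetical names, and the entry path is only a placeholder.

```javascript
// Pick the sourcemap mode recommended above for each environment.
const pickDevtool = (mode) =>
  mode === "production" ? "hidden-source-map" : "eval-source-map";

// webpack passes (env, argv) to a config function; argv.mode comes from --mode.
const createConfig = (env, argv) => ({
  mode: argv.mode,
  entry: "./index.js", // placeholder entry for illustration
  devtool: pickDevtool(argv.mode),
});

// In webpack.config.js this would be exported: module.exports = createConfig;
console.log(createConfig({}, { mode: "development" }).devtool);
console.log(createConfig({}, { mode: "production" }).devtool);
```

With hidden-source-map the .map files are still emitted, so they can be uploaded to an error-tracking service or kept on an internal server instead of being deployed next to the bundle.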

Why is the sourcemap sometimes wrong?

One guess: as covered above, some of webpack's sourcemap modes sacrifice information for better build speed (eval, cheap-source-map). If webpack's dev server is not using a suitable sourcemap mode, errors can be mapped to the wrong location. In that case, switching the mode to cheap-module-source-map, or even source-map, usually fixes it. Also, if any step of the build pipeline loses the original information while transforming the code, the final merged sourcemap may point to the wrong position.

