Originally published in Zhihu column: zhuanlan.zhihu.com/ne-fe

0.1 + 0.2 = 0.30000000000000004 and 1 - 0.9 = 0.09999999999999998. Many people know these are floating-point errors, but not exactly why they happen. This article explains the principle behind them and how to work around them, and also covers the large-number crisis in JS and the pitfalls of the four arithmetic operations.

Storage of floating point numbers

The first step is to understand how JavaScript stores decimals. Unlike languages such as Java and Python, JavaScript has only one type, Number, for all numbers, integers and decimals alike. It is implemented according to the IEEE 754 standard as a fixed-length 64-bit value, that is, the standard double-precision floating-point number (the related float is 32-bit single precision). This is covered in detail in computer organization courses, so don't worry if you no longer remember it.

Note: in most languages, including Java, Ruby, and Python, decimals also default to IEEE 754 floating-point numbers, so the floating-point problems described in this article exist there too.

The advantage of this storage format is that integers and decimals are handled uniformly, which saves storage space.

The 64 bits can be divided into three parts:

  • Sign bit S: the first bit is the sign; 0 represents a positive number and 1 a negative number
  • Exponent bits E: the middle 11 bits store the exponent, i.e. the power of 2
  • Mantissa M: the last 52 bits are the mantissa; any bits beyond these 52 are rounded off

[Figure: 64-bit allocation]

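The actual value can be calculated with the following formula:

V = (-1)^S × 2^E × M

Note that this formula follows scientific notation: in decimal 1 ≤ M < 10, and in binary 1 ≤ M < 2. The integer part of M is therefore always 1, so it can be omitted and only the fractional part is stored. For example, 4.5 is 100.1 in binary, or 1.001 × 2^2 in scientific notation, so after dropping the leading 1, M = 001. E is stored as an unsigned 11-bit integer in the range 0 to 2047; because an exponent can be negative, a bias of 1023 is subtracted from the stored value.

The final formula becomes:

V = (-1)^S × 2^(E - 1023) × (1.M)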

So 4.5 is finally stored as (M = 001, E = 1025):

[Figure: bit layout of 4.5]

(Figure generated with www.binaryconvert.com/convert_dou….)

0.1 in binary is 0.0001100110011001100… (1100 repeating), or 1.100110011001100… × 2^-4, so E = -4 + 1023 = 1019, and M, after dropping the leading 1, is 100110011…. The final result is:

[Figure: bit layout of 0.1]

Converted back to decimal, the stored value is 0.100000000000000005551115123126, not 0.1. This is where the floating-point error comes from.
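To check these layouts yourself, the three fields can be read back from a number's raw bytes. The following is a minimal sketch, assuming a runtime that provides DataView and String.prototype.padStart; decodeDouble is a hypothetical helper name used only for illustration.

```javascript
// Hypothetical helper: split a Number into its IEEE 754 sign, exponent and mantissa fields.
function decodeDouble(value) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, value); // DataView writes big-endian by default
  const bits =
    view.getUint32(0).toString(2).padStart(32, '0') +
    view.getUint32(4).toString(2).padStart(32, '0');
  return {
    sign: Number(bits[0]),                    // 1 bit
    exponent: parseInt(bits.slice(1, 12), 2), // 11 bits, still biased by 1023
    mantissa: bits.slice(12),                 // 52 bits, leading 1 omitted
  };
}

console.log(decodeDouble(4.5).exponent); // 1025
console.log(decodeDouble(0.1).exponent); // 1019
console.log(decodeDouble(0.1).mantissa); // starts with 100110011...
console.log((0.1).toPrecision(30));      // 0.100000000000000005551115123126
```

The exponent is returned in its stored, biased form so that it can be compared directly with the E values quoted above.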

Why does 0.1 + 0.2 = 0.30000000000000004?

The calculation steps are as follows:

// Both 0.1 and 0.2 are converted to binary
0.00011001100110011001100110011001100110011001100110011010 +
0.0011001100110011001100110011001100110011001100110011010 =
0.0100110011001100110011001100110011001100110011001100111

// Converted back to decimal, this prints as 0.30000000000000004
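The effect is easy to reproduce in a console. The sketch below also shows one common way to compare results despite the error; nearlyEqual is a hypothetical helper, and using Number.EPSILON as an absolute tolerance is an assumption that only suits values of roughly magnitude 1.

```javascript
console.log(0.1 + 0.2);         // 0.30000000000000004
console.log(0.1 + 0.2 === 0.3); // false

// Hypothetical helper: treat two numbers as equal when they differ by less than a tolerance.
function nearlyEqual(a, b, tolerance = Number.EPSILON) {
  return Math.abs(a - b) < tolerance;
}

console.log(nearlyEqual(0.1 + 0.2, 0.3)); // true
```

For much larger or much smaller values the tolerance would have to be scaled to the operands.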

Why does x = 0.1 give 0.1?

Congratulations, you have reached the stage where “mountains are no longer mountains”. Because the mantissa has a fixed length of 52 bits, plus the one omitted leading bit, the largest integer it can hold exactly is 2^53 = 9007199254740992. Written in scientific notation its mantissa is 9.007199254740992, which is also the maximum precision JS can represent. It is 16 digits long, so toPrecision(16) approximates the available precision; digits beyond that are rounded off automatically. So we have:

0.10000000000000000555.toPrecision(16)
// Returns "0.1000000000000000", which is exactly 0.1 once the trailing zeros are removed

// But the 0.1 you see is not really 0.1. If you don't believe it, try a higher precision:
0.1.toPrecision(21) // "0.100000000000000005551"

The large-number crisis

You may already have a vague sense of the next question: what happens when an integer is greater than 9007199254740992? Since the maximum exponent is 1023, the largest representable number is just below 2^1024. You cannot compute 2^1024 itself, though, because it overflows to Infinity:

> Math.pow(2, 1023)
8.98846567431158e+307

> Math.pow(2, 1024)
Infinity
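The upper limit can be checked against the built-in constant Number.MAX_VALUE. This is a small sketch of that check: the largest finite double has a mantissa of all ones, which is (2 - 2^-52) × 2^1023, and Number.EPSILON is 2^-52.

```javascript
// Largest finite double: mantissa of all ones times the largest power of two.
const largest = (2 - Number.EPSILON) * Math.pow(2, 1023);

console.log(largest === Number.MAX_VALUE);       // true
console.log(Number.MAX_VALUE);                   // 1.7976931348623157e+308
console.log(Math.pow(2, 1023) * 2);              // Infinity
console.log(Number.isFinite(Math.pow(2, 1024))); // false
```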

So what happens to numbers between 2^53 and 2^63?

  • (2^53, 2^54): only one in every two integers is representable, so only even numbers are exact
  • (2^54, 2^55): only one in every four integers is representable, so only multiples of 4 are exact
  • and so on, skipping ever larger multiples of 2
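The gaps described in this list are easy to observe. The following sketch uses only built-in Number features; the variable names are illustrative.

```javascript
const p53 = Math.pow(2, 53); // 9007199254740992
const p54 = Math.pow(2, 54); // 18014398509481984

console.log(p53 + 1 === p53);     // true: 2^53 + 1 cannot be represented
console.log(p53 + 2);             // 9007199254740994 (even numbers are fine)
console.log(p54 + 3 === p54 + 4); // true: only multiples of 4 survive here

// The built-in helpers draw the line just below 2^53.
console.log(Number.MAX_SAFE_INTEGER === p53 - 1); // true
console.log(Number.isSafeInteger(p53));           // false
```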

The following diagram illustrates how JavaScript floating-point numbers correspond to real numbers. The range we normally use, (-2^53, 2^53), is only a tiny part in the middle; the further out you go, the sparser and less precise the representable numbers become.

[Figure: fig1.jpg]

In Taobao's early order system, order numbers were treated as numbers. Later, as the number of orders surged, order numbers exceeded 9007199254740992, and the eventual fix was to store the order number as a string.

To work with large numbers you can use the third-party library bignumber.js. It works by treating all numbers as strings and re-implementing the arithmetic; the drawback is that performance is much worse than native numbers, so native support for large numbers is very much needed. TC39 already has a Stage 3 proposal, proposal-bigint, which is expected to solve the large-number problem for good. Until browsers support it, Babel 7.0 can be used; it converts to big-integer calls internally, which keeps accuracy at the cost of efficiency.
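Both remedies can be sketched briefly, assuming a runtime in which the BigInt proposal is already implemented (current Node.js and browsers). The id field is a made-up example, not Taobao's actual data format.

```javascript
// A number literal above 2^53 is silently rounded to a representable neighbour.
console.log(9007199254740993); // 9007199254740992

// The same happens when an order number arrives as a JSON number...
console.log(JSON.parse('{"id": 9007199254740993}').id);   // 9007199254740992
// ...but not when it is transported as a string.
console.log(JSON.parse('{"id": "9007199254740993"}').id); // 9007199254740993

// BigInt values keep integer arithmetic exact at any size.
const exact = 2n ** 53n + 1n;
console.log(exact);                           // 9007199254740993n
console.log(BigInt('9007199254740993') + 1n); // 9007199254740994n
```

BigInt and Number cannot be mixed in one expression, so values have to be converted explicitly at the boundary.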

toPrecision vs toFixed

These two functions are easily confused when processing data. What they have in common is that both convert a number to a string for display. Do not use them in the middle of a calculation, only on the final result.

The differences need to be noted:

  • toPrecision sets the number of significant digits, counted left to right from the first non-zero digit.
  • toFixed sets the number of digits kept after the decimal point.
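A few calls side by side make the difference between the two concrete. The input values are arbitrary examples.

```javascript
// toPrecision counts significant digits; toFixed counts digits after the decimal point.
console.log((123.456).toPrecision(4));     // "123.5"
console.log((123.456).toFixed(2));         // "123.46"

console.log((0.000123456).toPrecision(2)); // "0.00012"
console.log((0.000123456).toFixed(2));     // "0.00"

// Both return strings, which is why they belong at the display step only.
console.log(typeof (1.5).toFixed(1));      // "string"
```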

Both round off excess digits, and some people use toFixed for rounding, but be aware that it is buggy.

For example: 1.005.toFixed(2) returns 1.00 instead of 1.01.

Cause: 1.005 is actually stored as 1.00499999999999989, so the excess digits are simply rounded down!

Solution: use the rounding function Math.round(). Note that Math.round(1.005 * 100) / 100 still does not work, because 1.005 * 100 = 100.49999999999999. Math.round should only be applied after the precision errors of multiplication and division have been dealt with. The number-precision#round method described later does this.
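The failure, and one possible workaround, can be sketched as follows. roundTo is a hypothetical helper that shifts the decimal point through the number's string form instead of multiplying; it is a commonly used trick and not necessarily how number-precision implements round. It assumes the input does not already print in scientific notation, and it inherits Math.round's behaviour of rounding negative halves toward zero.

```javascript
console.log((1.005).toFixed(2));            // "1.00"
console.log((1.005).toPrecision(18));       // "1.00499999999999989"
console.log(Math.round(1.005 * 100) / 100); // 1

// Hypothetical helper: shift the decimal point via exponent notation,
// so no floating-point multiplication happens before rounding.
function roundTo(num, digits) {
  const shifted = Math.round(Number(num + 'e' + digits)); // "1.005e2" parses to 100.5
  return Number(shifted + 'e-' + digits);                 // "101e-2" parses to 1.01
}

console.log(roundTo(1.005, 2)); // 1.01
```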

The solution

Back to the main question: how do we deal with floating-point errors? First of all, it is theoretically impossible to store an infinitely long decimal in a finite amount of space, but we can process the result to get what we expect.

Displaying data

When you have a value like 1.4000000000000001 to display, it is recommended to format it with toPrecision and then convert it back to a number with parseFloat, as follows:

parseFloat(1.4000000000000001.toPrecision(12)) === 1.4 // true

The encapsulation method is:

function strip(num, precision = 12) {
  // Format to `precision` significant digits, then parse back to drop trailing zeros
  return parseFloat(num.toPrecision(precision));
}

Why 12 as the default precision? It is a rule of thumb: 12 generally clears up most 0001 and 0009 problems and is sufficient in most cases. Increase it if you need more precision.
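A short usage sketch shows what the default precision buys and what it costs. strip is repeated here so that the snippet runs on its own; the input values are arbitrary.

```javascript
function strip(num, precision = 12) {
  return parseFloat(num.toPrecision(precision));
}

console.log(strip(0.1 + 0.2));          // 0.3
console.log(strip(1 - 0.9));            // 0.1
console.log(strip(1.4000000000000001)); // 1.4

// The trade-off: digits beyond the 12th significant digit are lost.
console.log(strip(0.1234567890123456)); // 0.123456789012
```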

Arithmetic on data

For arithmetic such as + - * /, toPrecision cannot be used. The correct way is to convert the decimals to integers first and then operate. Take addition as an example:

/**
 * Exact addition
 */
function add(num1, num2) {
  // Count the digits after the decimal point of each operand
  const num1Digits = (num1.toString().split('.')[1] || '').length;
  const num2Digits = (num2.toString().split('.')[1] || '').length;
  // Scale both operands up to integers, add them, then scale back down
  const baseNum = Math.pow(10, Math.max(num1Digits, num2Digits));
  return (num1 * baseNum + num2 * baseNum) / baseNum;
}

The above method works for most scenarios. Numbers that toString renders in scientific notation, of the form “2.3e+1”, need special handling (JavaScript switches to scientific notation once a value reaches 1e21, and also for very small values such as 1e-7).
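One way to handle scientific notation is to read the exponent out of the string as well. The sketch below rests on assumptions: digitLength and addScaled are hypothetical names, and the Math.round calls are an addition that guards against the scaling multiplication itself producing a value such as 15.000000000000002.

```javascript
// Hypothetical helper: number of decimal places, including forms like "1.5e-7".
function digitLength(num) {
  const [mantissa, exponent = '0'] = num.toString().split(/[eE]/);
  const fractionLength = (mantissa.split('.')[1] || '').length;
  return Math.max(fractionLength - Number(exponent), 0);
}

// Hypothetical variant of add() that rounds the scaled operands to integers.
function addScaled(num1, num2) {
  const baseNum = Math.pow(10, Math.max(digitLength(num1), digitLength(num2)));
  return (Math.round(num1 * baseNum) + Math.round(num2 * baseNum)) / baseNum;
}

console.log(digitLength(1.5e-7));       // 8
console.log(addScaled(0.1, 0.2));       // 0.3
console.log(addScaled(1.5e-7, 2.5e-7)); // 4e-7
```

This still breaks down once the scaled operands exceed 2^53, so it is a sketch for everyday magnitudes only.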

If you have read this far you are very patient, so here is a bonus: when you run into floating-point errors, you can simply use github.com/dt-fe/numbe…

It fully supports addition, subtraction, multiplication, division, and rounding of floating-point numbers. At about 1K it is much smaller than most similar libraries (math.js, BigDecimal.js), has 100% test coverage and readable code, and is ready to use in your application!

