Preface

JavaScript is known to lose precision when computing certain floating-point numbers. For example, if you type 0.1 + 0.2 in the console, you get 0.30000000000000004 instead of 0.3.
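You can verify this directly in any JavaScript console:

console.log(0.1 + 0.2);         // 0.30000000000000004
console.log(0.1 + 0.2 === 0.3); // false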

There are 10 kinds of people in the world: those who understand binary and those who don’t.

We know that all data in a computer is ultimately stored in binary, and numbers are no exception. So when the computer computes 0.1 + 0.2, it is actually adding the binary values it has stored. What, then, does 0.1 look like in binary? Converting from decimal to binary, 0.1 is 0.0001100110011001100... (the group 1100 repeats forever), and 0.2 is 0.00110011001100... (again, 1100 repeats forever). Both are infinitely repeating binary fractions. Obviously, a computer does not have infinite space to store an infinitely repeating number, so what does it do with this kind of data?
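As an aside, the standard multiply-by-2 method for converting a decimal fraction to binary digits can be sketched in a few lines of JavaScript (the function name toBinaryFraction is made up for illustration; note that the 0.1 we pass in is itself already a rounded double, although its first dozens of binary digits match the ideal expansion):

// Repeatedly double the fractional part; each integer part produced is the next binary digit.
function toBinaryFraction(x, digits) {
  let bits = '0.';
  for (let i = 0; i < digits; i++) {
    x *= 2;
    if (x >= 1) {
      bits += '1';
      x -= 1;
    } else {
      bits += '0';
    }
  }
  return bits;
}

console.log(toBinaryFraction(0.1, 20)); // "0.00011001100110011001"
console.log(toBinaryFraction(0.2, 20)); // "0.00110011001100110011"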

How does JavaScript store an infinitely repeating binary fraction?

Different languages may use different storage standards. JavaScript has only one numeric type, Number, which covers both integers and decimals. Number is implemented according to the IEEE 754 standard as a fixed-length, 64-bit value, i.e. the standard double-precision floating-point format (the related 32-bit format is single precision, float). Exactly how a double is laid out is covered in a later section; for now, all we need to know is that in binary scientific notation (1.xxx... × 2^n), the fractional part of a double can hold at most 52 bits. Together with the leading 1, that makes 53 significant bits; everything beyond that is discarded following the “drop on 0, round up on 1” rule. Rounding 0.1 this way gives:

0.00011001100110011001100110011001100110011001100110011010

Similarly, after rounding, the binary representation of 0.2 is:

0.0011001100110011001100110011001100110011001100110011010

Add the two and you get:

0.00011001100110011001100110011001100110011001100110011010 +
0.0011001100110011001100110011001100110011001100110011010 =
0.0100110011001100110011001100110011001100110011001100111

Note that this sum has 54 significant bits, one more than a double can hold, so it is rounded once more when it is stored: the lowest bit is dropped and the remaining fraction rounds up, giving 1.0011001100110011001100110011001100110011001100110100 × 2^-2. Converting that stored result back to decimal, by hand or with a converter tool, gives 0.3000000000000000444089..., which JavaScript displays as exactly 0.30000000000000004.
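A quick way to see the extra digits hiding behind the displayed values is to ask for more precision:

console.log((0.1 + 0.2).toPrecision(21)); // "0.300000000000000044409"
console.log((0.3).toPrecision(21));       // "0.299999999999999988898"
console.log(0.1 + 0.2 === 0.3);           // false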

Note: In most languages, including Java, Ruby, and Python, decimals default to IEEE 754 floating-point numbers, so the same floating-point problems exist there as well.

How are floating-point numbers stored?

In computers, the representation of a floating-point number is divided into three parts:

  • The first part stores the sign bit, which distinguishes positive from negative numbers; 0 indicates a positive number
  • The second part stores the exponent
  • The third part stores the fraction (mantissa)

Double-precision floating-point numbers occupy 64 bits:

  • The sign bit occupies 1 bit
  • The exponent takes 11 bits
  • The fraction takes up 52 bits

The sign, exponent, and fraction correspond directly to scientific notation. Take 78.375 as an example: in binary scientific notation it is 1.001110011 × 2^6. A floating-point number is a real number expressed as a significand (mantissa) multiplied by an integer power of some base, and in computers that base is usually 2. Following this rule, we can slot 78.375 into the double-precision layout: the sign bit and the fraction bits can be read off directly, and all that remains is to convert the exponent 6 to binary, which is 110. The result is:

0 (sign) 00000000110 (exponent) 00111001 10000000 00000000 00000000 00000000 00000000 0000 (fraction, 52 bits)

(This result is actually wrong; we will see why shortly.)

Now let’s see how the 0.1 mentioned above is stored according to the double-precision specification. We know that its binary representation is:

0.000110011001100110011001100110011001100110011001100110011001...

Translated into scientific notation, this is:

1.1001100110011001100110011001100110011001100110011010 × 2^-4

In other words, for 0.1:

  • The sign bit is: 0
  • The fraction bits are: 1001100110011001100110011001100110011001100110011010
  • The exponent is: -4

The double-precision format does include a sign bit, but it indicates the sign of the number as a whole, not the sign of the exponent. Is there a separate sign bit reserved for the exponent? The answer is no.

How is a negative exponent stored?

To avoid this complication, IEEE 754 specifies a bias (offset) that is added to the exponent every time it is stored, so that even a negative exponent becomes a non-negative stored value. The bias is chosen so that every representable negative exponent, once the bias is added, becomes positive. For double precision, the exponent field is 11 bits, so the stored value can range from 0 to 2047, and IEEE defines the double-precision bias as 1023.

  1. When the exponent bits are neither all zeros nor all ones (a normalized value), IEEE specifies that the actual exponent is e - Bias, where e is the stored exponent field. The minimum e is 1, giving 1 - 1023 = -1022; the maximum e is 2046, giving 2046 - 1023 = 1023. So in this case the exponent ranges from -1022 to 1023 (a quick sketch of this bias arithmetic follows the list).
  2. When the exponent bits are all zeros (a denormalized value), IEEE specifies that the exponent is 1 - Bias, i.e. 1 - 1023 = -1022.
  3. When the exponent bits are all ones (special values), IEEE specifies that the floating-point number represents one of three special values: positive infinity, negative infinity, or NaN (Not a Number). Specifically, if the fraction bits are not all zero, the value is NaN; if the fraction bits are all zero, then sign bit s = 0 means positive infinity and s = 1 means negative infinity.
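Here is a minimal sketch of that bias arithmetic in plain JavaScript (the helper name storedExponent is made up for illustration):

// Stored exponent field = actual exponent + bias (1023 for doubles), written on 11 bits.
const BIAS = 1023;

function storedExponent(e) {
  return (e + BIAS).toString(2).padStart(11, '0');
}

console.log(storedExponent(6));  // "10000000101"  (the exponent of 78.375 = 1.001110011 × 2^6)
console.log(storedExponent(-4)); // "01111111011"  (the exponent of 0.1 ≈ 1.10011001... × 2^-4)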

Back to our example: the stored exponent of 78.375 is 6 + 1023 = 1029, which in binary is 10000000101, so the full double-precision representation is:

0 10000000101 00111001 10000000 00000000 00000000 00000000 00000000 0000

Following the same procedure, can you work out how 0.1 is stored as a double-precision floating-point number?
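If you want to check your answer, here is a small sketch (the helper name doubleBits is made up for illustration) that uses a DataView to read back the actual bit pattern of any Number:

// Reinterpret a number's 64 bits and split them into sign / exponent / fraction.
function doubleBits(x) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, x); // big-endian by default
  let bits = '';
  for (let i = 0; i < 8; i++) {
    bits += view.getUint8(i).toString(2).padStart(8, '0');
  }
  return { sign: bits[0], exponent: bits.slice(1, 12), fraction: bits.slice(12) };
}

console.log(doubleBits(78.375));
// sign: '0', exponent: '10000000101', fraction: '001110011' followed by 43 zeros
console.log(doubleBits(0.1));
// sign: '0', exponent: '01111111011', fraction: '1001100110011001100110011001100110011001100110011010'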

The range of floating point values

If you have read this far carefully, you should be able to work out the range of values JavaScript can represent. For normalized values the maximum exponent is 1023, and the largest significand has 52 ones after the leading 1, so the maximum value is 1.111...(52 ones)...1 × 2^1023. Written out as an ordinary binary number, that is:

1 111...(52 ones)...111 000...(971 zeros)...000

Converting this to decimal gives 1.7976931348623157e+308, which is exactly the value of Number.MAX_VALUE. But this is not quite the overflow boundary: if we add certain (not too large) numbers to this value, the result still does not become Infinity, so there is still a gap between Number.MAX_VALUE and Infinity. From the IEEE specification we know that a double is positive infinity exactly when the exponent field is all ones (an unbiased exponent of Math.pow(2, 11) - 1 - 1023 === 1024) and the fraction is 0, that is:

1.000... × 2^1024

So Math.pow(2, 1024) evaluates to positive infinity, and Number.MAX_VALUE is the largest finite number JavaScript can store. Real values between Number.MAX_VALUE and Math.pow(2, 1024) cannot be represented exactly; they lose precision (and, close enough to 2^1024, round all the way up to Infinity). The same reasoning applies at the other end of the range, for the smallest values.
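A few quick console checks of these boundaries:

console.log(Number.MAX_VALUE);      // 1.7976931348623157e+308
console.log((2 - Math.pow(2, -52)) * Math.pow(2, 1023)); // the same value
console.log(Math.pow(2, 1024));     // Infinity
console.log(Number.MAX_VALUE + 1);  // still 1.7976931348623157e+308, not Infinity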

The maximum safe integer for JavaScript

The “safe” range is the range within which calculations do not lose integer precision. From the definition of double precision, the largest safe integer is:

1.111...(52 ones)...1 × 2^52

Converting to decimal, this is Math.pow(2, 53) - 1, i.e. 9007199254740991.

In JavaScript, Number.MAX_SAFE_INTEGER represents the maximum safe integer, and it is exactly the value we just calculated.
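A quick check, including what happens just past the safe range:

console.log(Number.MAX_SAFE_INTEGER);   // 9007199254740991
console.log(Math.pow(2, 53) - 1);       // 9007199254740991
console.log(Number.MAX_SAFE_INTEGER + 1 === Number.MAX_SAFE_INTEGER + 2); // true: precision is already lost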

How to work around the calculation errors

The number-precision library is recommended; it is less than 1 KB in size.
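Based on the library’s documented API (worth double-checking against the version you install), usage looks roughly like this:

// npm install number-precision
import NP from 'number-precision';

console.log(NP.plus(0.1, 0.2));    // 0.3
console.log(NP.minus(1.0, 0.9));   // 0.1
console.log(NP.times(3, 0.3));     // 0.9
console.log(NP.divide(1.21, 1.1)); // 1.1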
