1. Business problems

Recently, when solving a business requirement, we encountered the following scenario: The product wants to reward a series of prizes, wants the sum of the series to be 100%, and wants accuracy to be maintained at 6 bits (percentage 4 bits), as shown in the following example:

This business scenario is too simple, so I implement it with the following code, which calculates the sum of the probabilities for all items using forEach.

 let sp = 0;
 items.forEach(r= > {
     sp += r.p || 0;
 });
 return sp;
Copy the code

Using the integer probability (10/20/30/40, 50/50) is fine, but when we switch to the decimal (10.2345, 20.3456), we find that the calculated SP has a decimal, producing a scenario similar to 99.9999, 100.0001. This allows for loss of precision, and the tendency is to take the integral method to handle such summation operations.

 let sp = 0;
 items.forEach(r= > {
     sp += Math.floo(r.p * 10000) | 0;
 });
 return sp / 10000;
Copy the code

After a number of small numeric attempts, the performance basically meets the business requirements, according to this implementation test, continue to do the next requirement. (44.2726 + 24.2000 + 28.8889 + 2.1053 + 0.4040 +.0769 + 0.0333 + 0.0095 + 0.0095 = 99.9999) This is why, after some time of positioning, a similar scenario is found, in order to quickly solve the problem, math.floor => Math.round, the calculation becomes normal. 44.2726 * 10000 => 442725.99999999994 => math. floor => 442725

Why? With that in mind, check out your college computer science textbook and relearn our floating-point arithmetic.

2. The Number of JS

JS Number type is what kind of, we can see ECMA Number value, Number type is the 64-bit IEEE 754-2019 standard, understand IEEE 754 floating point Number representation, carry and algorithm, we can better write business code.

2.1 Coding Mode

Although the IEES 754 standard has been changing, its core rules remain the same. Let’s first look at the basic coding rules for floating points. (The following introduction to floating points comes from a deeper understanding of computer systems.) The representation of IEEE floating point number is V=(−1)s∗M∗2EV =(-1) ^s \ast M \ast 2^EV=(−1)s∗M∗2E, and its parameters are defined as:

  • Symbols (sign). SSS determines whether the number is positive or negative.
  • Mantissa (mantissa). M is a binary decimal in the range 1 ~ 2−ε1\sim2-\varepsilon1 ~ 2−ε, or 0 ~ 1−ε0\sim1-\varepsilon0 ~ 1−ε.
  • Exponent (exponent). The function of E is to weigh floating-point numbers with the value 2 to the E.

For IEEE754 64-bit floating point numbers, the structure of symbols, mantissa, and order code is as follows:Based on the value of the order code, floating point numbers can be divided into three categories: normalized, nonnormalized, and specialized.

2.1.1 normalized

The order code of the normalized value, exp is the unsigned number of E10e9e8e7E6E5E4E3e2e1e0e_ {10} e_9e_8e_7e_6E_5e_4E_3e_2e_1e_0e10e9E8E7E6E5E4E3E2E1e0, and its representation ranges from 1 to 2046. However, in this scenario, the actual value of the order code is subtracted from the offset value E= E −biasE = e-biase = E − BIAS. Bias = 2K −1−1=211−1−1= 1023BIAS =2 ^{k-1} -1 =2 ^{11-1} -1 = 1023BIAS = 2K −1−1=211−1−1=1023, k is the number of bits in the code. The IEEE 754 64-bit bias value is 1023. According to the normalized order code calculation method, we can get the following table, and the representation range of real EEE is -1022 – +1023.

The decimal field is expressed as
0. f 51 . . . f 0 0.f_{51}… f_{0}
Small numerical
f f
, including
0 < = f < 1 0 <= f < 1
. The mantissa of a floating-point number is expressed as
M = 1 + f M = 1 + f
Of, implicitly begins with 1, so you can view M as
1. f 51 . . . f 0 1.f_{51}… f_{0}
In the form. Therefore, floating point numbers adjust the order code so that the mantissa M is in range
1 < = M < 2 1 <= M < 2
.

We use the representation of 10.2345 to show you the representation of floating point numbers. As shown in the figure below, it is the true representation of data. We get the representation bit of 10.2345, 10100011…. In the form.

According to the representation method of M, we get 1.0100011… , E = 3, exp = 8 + BIAS = 3 + 1023 = 1026, we get the final floating point representation as follows

Since 10.2345 is an inexpressible precision value, here also carry the trade-off, the analysis to the behind, at this time the real value of floating-point Numbers is: 10.2345000000000005968558980385.

In fact, it involves a shift in the idea that reality is continuous and computers are discrete. We live in a continuous world, but machines live in a discrete world. For us, distance, weight and size are all continuous, but for a computer, no matter how rigorous its memory and representation, it is always discrete, and the number of representations it can make is fixed. IEEE 754 converts continuous real numbers into discrete floating point numbers. The example in the figure below, the IEEE 754 floating-point representation, actually maps real continuous real numbers to discrete floating-point numbers.

2.1.2 Non-normalization

When the order code is all zeros, the number represented is a non-programmed number. In this scenario, the order code is E=1− BIAS =1−1023=−1022E = 1-BIAS = 1-1023 = -1022E=1− BIAS =1−1023=−1022, and the mantail is M=fM =fM =f. The non-normalization can represent the form of 0, and the difference in flag bits can represent +0/-0.

2.1.3 special values

When the code is all 1’s, the number is a special value. When the field of small numbers is zero, the value is infinity, plus infinity when s equals zero, minus infinity when s equals one. When you multiply large numbers or divide by 0, infinity is the result of an overflow. When the order is all 1 and the decimal field is not 0, the resulting value is called “NaN”. Some operations return NaN when the result is not a real number, infinity, or other exceptions. Below are some representations of IEEE754 64-bit floating point numbers, and we can take a look at its representation

2.2 rounding

IEEE floating point number defines four rounding methods, rounding to an even number, rounding to zero, rounding up, rounding down, its basic rounding methods, as shown in the following example, shows the operation form of several rounding methods:

What is the calculation method of IEEE floating-point selection? Rounding to an even number, that is, when rounding the evaluated value, it rounds to the nearest even number. For floating-point Numbers (the content of here from www.pianshen.com/article/697… , define the lowest order 0 as an even digit and 1 as an odd digit. In addition, floating point number also defines three concepts, the Guard bit (Guard bit), the approximation bit (Round bit) and the Sticky bit (Sticky bit). For example, if we want to reserve 2 decimal places, the second to the right of the decimal point is the Guard bit, the third to the right of the decimal point is the Round bit, and all the decimal places starting from the fourth to the right of the decimal point are combined to form the Sticky bit. The following is the CMU courseware, which can better illustrate their concept.

The concept of intermediate values, for rounding operations, we define XXX.YYYY100… 0 is the middle value of rounding, where the right-most Y is the rounding position. For example, 11.11011 has a median value of 11.11100…. when rounded to a decimal position of 2 . Rounding to even numbers has two calculation rules:

  • If the closest value is unique, round directly to the closest value;
  • If the Guard bit is in the middle, the Guard bit is even. If the Guard bit is even, the Guard bit is rounded up without carrying the Guard bit. If the Guard bit is odd, the Guard bit is rounded up after carrying the Guard bit.

Let’s think about the rounding of two decimal places, the rounding of the following data, to give you a better understanding of how rounding to even numbers works,

For 10.2345 in 2.1, to see what its rounding calculation looks like, its actual value is as follows:

1010. 0011 1100 0000 1000 0011 0001 0010 0110 1110 1001 0111 1000 1100 1..Copy the code

According to the calculation rules of mansa, we need to convert it into the following form:

IEEE 754 64-bit mantissae has 51 bits, reserved bit 0, approximate bit 1, viscous bit 1001… . The value 010 0011 1100 0000 1000 0011 0001 0010 0110 1110 1001 0111 1001 is closer and does not comply with the rule of “intermediate value”. So the rounding value is 010 0011 1100 0000 1000 0011 0001 0010 0110 1110 1001 0111 1001. So the actual floating point value is 10.2345000000000005968558980385 10.2345.

2.3 Operation Rules

For the operation rules of addition, subtraction, multiplication and division of floating point numbers, refer to the following documentation on floating point number processing. Let’s talk about the rules for adding and multiplying floating-point numbers. Does the commutative law of addition and subtraction apply to floating point numbers, a+b+c? =a+(b+c)a + b + c ? = a + (b + c)a+b+c? =a+(b+c), from the point of view of floating point coding and rounding, is not applicable. For example, 3.14 + 1e10-1e10 has the value 0, and 3.14 + (1e10-1e10)=3.14. The distributive property of multiplication is a similar problem, because of the rounding problem, the distributive property of multiplication also suffers from a loss of accuracy, resulting in abnormal effects. Therefore, when we code, we should consider the arithmetic rules of floating point number. Some arithmetic rules of addition, subtraction, multiplication and division are invalid.

3. How to solve the business problem

Going back to our business scenario, how the operations under the 6-bit precision rule should be handled, the rounding rule for floating-point numbers to even numbers is bound to result in a loss of precision, which causes us to get the wrong value using math.floor. For our rounding rule, we consider a similar form:

let sp = 0;
// Option 1: add a +0.1 offset value to math.floor to get the correct value
items.forEach(r= > {
    sp += Math.floo(r.p * 10000 + 0.1) | 0;
});
// Option 2: math.round can also get the correct value
items.forEach(r= > {
    sp += Math.round(r.p * 10000) | 0;
});
return sp / 10000;
Copy the code

Therefore, when we encounter floating point arithmetic scenarios, especially when it comes to the exact calculation, we should consider the encoding and rounding rules of floating point numbers, so as to write the code to fight for. Going back to our title, I think you already know why 0.1+0.2+0.3 does not equal 0.6.

4. Refer to the documents

  • www.binaryconvert.com/result_doub…
  • www.pianshen.com/article/697…
  • www.binaryconvert.com/result_doub…
  • www.yuejianzun.xyz/2019/05/28/…