When working with JS, you may have noticed that some floating-point operations give imprecise results, for example 0.1 + 0.2 = 0.30000000000000004 and 7 * 0.8 = 5.6000000000000005.

What exactly causes this? Information inside a computer is represented in binary, that is, as a code of zeros and ones, and some decimal fractions cannot be represented exactly in binary, which leads to a series of precision problems. This is not unique to JS.

Let’s take 0.1 + 0.2 as an example, look more closely at floating-point numbers, and see how to avoid this problem in JS. The topic is basic, but it is well worth understanding, so treat it as a review of a computer organization course.

The following sections cover:

  • The binary representation of floating-point numbers
  • What the IEEE 754 standard is
  • Ways to avoid floating-point precision problems
  • Basic usage of the test framework (Mocha)

First, computer arithmetic

(i) How to convert decimal to binary

① Integer part: divide by 2 and take the remainder; while the quotient is not 0, keep dividing it by 2; once the quotient is 0, read all the remainders in reverse order.

② Fractional part: multiply by 2 and take the integer part; while the fractional part is not 0, keep multiplying it by 2; once the fractional part is 0, read the integer bits taken in forward order. (If the fractional part never becomes 0, keep as many bits as the required number of significant digits and round on the first discarded bit: drop a 0, carry a 1.)

Using the above method, let’s try converting 0.1 to binary:

0.1 * 2 = 0.2 -> take 0

0.2 * 2 = 0.4 -> take 0

0.4 * 2 = 0.8 -> take 0

0.8 * 2 = 1.6 -> take 1

0.6 * 2 = 1.2 -> take 1

0.2 * 2 = 0.4 -> take 0

...

The fractional part never becomes 0 no matter how many times we multiply, so binary cannot represent 0.1 exactly.

So the binary representation of 0.1 is: 0.000110011…0011… (0011 repeating forever)

The binary representation of 0.2 is: 0.00110011…0011… (0011 repeating forever)

How many bits are actually kept depends on the standard in use, which is covered in the next section.
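The two conversion rules above can be sketched in a few lines of JS. This is only an illustration: the function names are made up for this example, and the fraction is passed as an integer numerator and denominator (1/10 instead of 0.1) on the assumption that we want the exact expansion rather than the already rounded value JS stores for 0.1.

```javascript
// Hypothetical helper: integer part, divide by 2 and read the remainders in reverse.
function integerToBinary(n) {
  if (n === 0) return '0';
  let bits = '';
  while (n > 0) {
    bits = (n % 2) + bits;      // prepending gives the reverse order
    n = Math.floor(n / 2);
  }
  return bits;
}

// Hypothetical helper: fractional part, multiply by 2 and take the integer bit.
// The value is numerator / denominator (0 <= value < 1), kept as integers so
// every step is exact. Stops after maxBits bits or when the remainder hits 0.
function fractionToBinary(numerator, denominator, maxBits) {
  let bits = '';
  let remainder = numerator;
  for (let i = 0; i < maxBits && remainder !== 0; i++) {
    remainder *= 2;
    if (remainder >= denominator) {
      bits += '1';
      remainder -= denominator;
    } else {
      bits += '0';
    }
  }
  return '0.' + bits;
}

console.log(integerToBinary(10));          // 1010
console.log(fractionToBinary(1, 10, 12));  // 0.000110011001
console.log(fractionToBinary(1, 5, 12));   // 0.001100110011
console.log(fractionToBinary(1, 2, 12));   // 0.1 (terminates, so 0.5 is exact)
```

For 1/10 and 1/5 the loop only stops because of maxBits, which is the repeating 0011 pattern in action.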

(ii) The IEEE 754 standard

IEEE 754 is the IEEE Standard for Floating-Point Arithmetic. It specifies interchange and arithmetic formats and methods for binary and decimal floating-point arithmetic in computer programming environments.

According to IEEE 754, any binary floating-point number can be expressed in the following form:

V = (-1)^S * M * 2^E

S is the sign bit, which gives the sign of the number (0 for positive, 1 for negative); M is the significand (mantissa); E is the exponent, which is stored in biased form, that is, as the true exponent plus a constant (the bias).

The mantissa M is normally normalized: for any nonzero value its leading bit is always 1. This bit is called the hidden bit because it is omitted when the number is stored. For example, to store 1.0011 only 0011 is saved, and the leading 1 is put back when the value is read. This effectively gains one extra significant bit.

Common floating point formats are:

① Single precision:

Single-precision floating-point format

This is a 32-bit floating-point number: the highest bit is the sign bit S, the next 8 bits are the exponent E, and the remaining 23 bits are the mantissa (significand) M.

For a normalized number, its true value is:

V = (-1)^S * 1.M * 2^(E - 127)

② Double precision:

Double-precision floating-point format

This is a 64-bit floating-point number: the highest bit is the sign bit S, the next 11 bits are the exponent E, and the remaining 52 bits are the mantissa (significand) M.

For a normalized number, its true value is:

V = (-1)^S * 1.M * 2^(E - 1023)

JavaScript has a single type for ordinary numbers, Number, which uses the IEEE 754 double-precision floating-point format. Following these rules, let’s see how JS stores 0.1 and 0.2:

0.1 is positive, so the sign bit is 0.

Its binary form is 0.000110011…0011… (0011 repeating forever), which normalizes to 1.10011001…1001(1) * 2^-4. Applying the drop-a-0, carry-a-1 rule to the first discarded bit, the stored value is

2^-4 * 1.1001100110011001100110011001100110011001100110011010

The stored exponent is E = -4 + 1023 = 1019, which is 01111111011 in binary.

Thus the binary storage format of 0.1 in JS is (a comma follows the sign bit and a semicolon follows the exponent bits):

0,01111111011; 1001100110011001100110011001100110011001100110011010

And 0.2 is:

0,01111111100; 1001100110011001100110011001100110011001100110011010
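The stored bit patterns can be checked directly in JS. The sketch below is not part of the original solution: doubleToFields is a hypothetical helper name, and it assumes a runtime that supports DataView.getBigUint64 (recent Node and browsers).

```javascript
// Hypothetical helper: splits a Number into its three IEEE 754 double-precision fields.
function doubleToFields(value) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, value);                 // big-endian by default
  const bits = view.getBigUint64(0)          // same byte order when reading back
    .toString(2)
    .padStart(64, '0');
  return {
    sign: bits.slice(0, 1),       // 1 bit
    exponent: bits.slice(1, 12),  // 11 bits, biased by 1023
    fraction: bits.slice(12),     // 52 bits, hidden leading 1 not stored
  };
}

const tenth = doubleToFields(0.1);
console.log(tenth.sign + ',' + tenth.exponent + '; ' + tenth.fraction);
console.log(parseInt(tenth.exponent, 2) - 1023);  // true exponent: -4

const fifth = doubleToFields(0.2);
console.log(fifth.sign + ',' + fifth.exponent + '; ' + fifth.fraction);
console.log(parseInt(fifth.exponent, 2) - 1023);  // true exponent: -3
```

The two printed patterns use the same comma and semicolon layout as the hand-derived ones, so they can be compared character by character.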

Q1: Why is the exponent E stored in biased form?

A1: Because it makes comparing magnitudes easier: with the bias added, the stored exponents are all non-negative and can be compared as unsigned integers.

(iii) Floating-point operations

0.1 => 0,01111111011; 1001100110011001100110011001100110011001100110011010

0.2 => 0,01111111100; 1001100110011001100110011001100110011001100110011010

Floating-point addition and subtraction proceed in the following steps:

① Exponent alignment: align the binary points of the two numbers (that is, make their exponents equal).

To do this, first compute the exponent difference, then shift the mantissa of the number with the smaller exponent to the right by that difference (bits may be lost during the shift, which affects precision).

Since the exponents and mantissas of 0.1 and 0.2 are all positive, their sign-magnitude, ones’ complement and two’s complement forms are identical. (The calculation uses two’s complement with two sign bits.)

△ Exponent difference (two’s complement) = 00,01111111011 - 00,01111111100 = 00,01111111011 + 11,10000000100 = 11,11111111111

So the exponent difference △ is -1: the exponent of 0.1 is smaller than that of 0.2. The mantissa of 0.1 is therefore shifted 1 bit to the right and its exponent is increased by 1 (so that it matches the exponent of 0.2).

0.1 => 0,01111111100; 1100110011001100110011001100110011001100110011001101 (0)

Note: the bit shifted out, shown in parentheses, is handled by the drop-a-0, carry-a-1 rounding rule. The mantissa now starts with 1 because the hidden bit, whose value is 1, has been shifted into the stored field (normally it is not stored and is only restored on reading).

② Add the mantissas

   0.1100110011001100110011001100110011001100110011001101

+  1.1001100110011001100110011001100110011001100110011010

----------------------------------------------------------

  10.0110011001100110011001100110011001100110011001100111

③ Normalization

The result of step ② must be normalized to the right (that is, the mantissa is shifted 1 bit to the right and the exponent is increased by 1).

Sum = 0.1 + 0.2 = 0,01111111101; 1.0011001100110011001100110011001100110011001100110011 (1)

Note: right normalization can drop low-order bits, which introduces error and causes precision problems. This is why the rounding in step ④ is needed.

④ Rounding (drop a 0, carry a 1)

Sum = 0,01111111101; 1.0011001100110011001100110011001100110011001100110100

⑤ Overflow check

Whether a floating-point operation overflows is decided by the exponent. Our exponent 01111111101 neither overflows nor underflows.
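The five steps can be replayed in software with BigInt. The following is a rough sketch under stated assumptions, not how any engine actually implements addition: the function names are invented for this example, only positive normal doubles are handled, and instead of shifting the smaller operand right (and dropping bits) it shifts the larger one left so that nothing is lost before the single rounding in step ④. For 0.1 + 0.2 both approaches give the same bits.

```javascript
const FRACTION_BITS = 52n;
const HIDDEN_BIT = 1n << FRACTION_BITS;

// Hypothetical helper: biased exponent and 53-bit significand of a positive normal double.
function decompose(value) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, value);
  const bits = view.getBigUint64(0);
  return {
    exponent: (bits >> FRACTION_BITS) & 0x7ffn,
    significand: (bits & (HIDDEN_BIT - 1n)) | HIDDEN_BIT, // restore the hidden 1
  };
}

// Hypothetical helper: adds two positive normal doubles step by step.
function addPositiveDoubles(a, b) {
  let small = decompose(a);
  let large = decompose(b);
  if (small.exponent > large.exponent) [small, large] = [large, small];

  // Step 1, align: express both significands at the smaller exponent.
  const shift = large.exponent - small.exponent;
  // Step 2, add the significands (exact, BigInt has no fixed width).
  const sum = small.significand + (large.significand << shift);

  // Step 3, normalize: find how many low bits must go to get back to 53 bits.
  let dropped = 0n;
  while ((sum >> dropped) >= (HIDDEN_BIT << 1n)) dropped += 1n;
  let kept = sum >> dropped;
  let exponent = small.exponent + dropped;

  // Step 4, round to nearest, ties to even.
  if (dropped > 0n) {
    const remainder = sum & ((1n << dropped) - 1n);
    const half = 1n << (dropped - 1n);
    if (remainder > half || (remainder === half && (kept & 1n) === 1n)) kept += 1n;
    if (kept === (HIDDEN_BIT << 1n)) { kept >>= 1n; exponent += 1n; }
  }

  // Step 5, overflow check: biased exponent 2047 is reserved for Infinity and NaN.
  if (exponent >= 0x7ffn) return Infinity;

  const view = new DataView(new ArrayBuffer(8));
  view.setBigUint64(0, (exponent << FRACTION_BITS) | (kept & (HIDDEN_BIT - 1n)));
  return view.getFloat64(0);
}

console.log(addPositiveDoubles(0.1, 0.2) === 0.1 + 0.2); // true
```

In the 0.1 + 0.2 case the dropped bits are exactly half of the last kept bit and that bit is 1, so the tie rounds up, which matches the hand calculation in step ④.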

At this point the 0.1 + 0.2 operation is complete. Now let’s look at the decimal value of the result.

<1> First undo the normalization to get the plain binary form:

Sum = 0.010011001100110011001100110011001100110011001100110100

<2> Convert it to decimal:

Sum = 2^-2 + 2^-5 + 2^-6 + … + 2^-52 = 0.30000000000000004440892098500626

Now you should understand where the result 0.30000000000000004 in JS comes from.
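As a quick cross-check, the sum of powers of two can be evaluated in JS itself. This is a small illustrative sketch; the variable names are arbitrary, and the bit string is built with repeat() only to avoid typing 54 digits by hand.

```javascript
// Bits after the binary point of the result: 01, then 0011 twelve times, then 0100.
const fractionBits = '01' + '0011'.repeat(12) + '0100';

// Add the smallest terms first. Every partial sum fits in 53 significant bits,
// so none of these additions has to round.
let sum = 0;
for (let i = fractionBits.length - 1; i >= 0; i--) {
  if (fractionBits[i] === '1') sum += 2 ** -(i + 1);
}

console.log(sum === 0.1 + 0.2);    // true
console.log(sum.toPrecision(17));  // 0.30000000000000004
```

The value rebuilt from the hand-derived bits is identical to what JS computes for 0.1 + 0.2.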

Q2: Why do computers use two’s complement?

A2: It simplifies the hardware, since only an adder is needed. To subtract, the negative number is replaced by an equivalent positive number so that the subtraction becomes an addition, and two’s complement provides exactly this.

Second, solutions to the floating-point precision problem

(i) A simple solution

The idea is to convert the decimals to whole numbers, do the calculation, and then convert the result back to a decimal. The code is fairly simple, so I will just post it.

'use strict';

// Addition: scale both operands to integers, add, then scale back.
var accAdd = function(num1, num2) {
  num1 = Number(num1);
  num2 = Number(num2);
  var dec1, dec2, times;
  try { dec1 = countDecimals(num1) + 1; } catch (e) { dec1 = 0; }
  try { dec2 = countDecimals(num2) + 1; } catch (e) { dec2 = 0; }
  times = Math.pow(10, Math.max(dec1, dec2));
  // var result = (num1 * times + num2 * times) / times;
  var result = (accMul(num1, times) + accMul(num2, times)) / times;
  return getCorrectResult("add", num1, num2, result);
  // return result;
};

// Subtraction: same scaling idea as addition.
var accSub = function(num1, num2) {
  num1 = Number(num1);
  num2 = Number(num2);
  var dec1, dec2, times;
  try { dec1 = countDecimals(num1) + 1; } catch (e) { dec1 = 0; }
  try { dec2 = countDecimals(num2) + 1; } catch (e) { dec2 = 0; }
  times = Math.pow(10, Math.max(dec1, dec2));
  // var result = Number((num1 * times - num2 * times) / times);
  var result = Number((accMul(num1, times) - accMul(num2, times)) / times);
  return getCorrectResult("sub", num1, num2, result);
  // return result;
};

// Division: divide the integer forms, then fix the power of ten.
var accDiv = function(num1, num2) {
  num1 = Number(num1);
  num2 = Number(num2);
  var t1 = 0, t2 = 0, dec1, dec2;
  try { t1 = countDecimals(num1); } catch (e) { }
  try { t2 = countDecimals(num2); } catch (e) { }
  dec1 = convertToInt(num1);
  dec2 = convertToInt(num2);
  var result = accMul((dec1 / dec2), Math.pow(10, t2 - t1));
  return getCorrectResult("div", num1, num2, result);
  // return result;
};

// Multiplication: multiply the integer forms, then divide by the combined power of ten.
var accMul = function(num1, num2) {
  num1 = Number(num1);
  num2 = Number(num2);
  var times = 0, s1 = num1.toString(), s2 = num2.toString();
  try { times += countDecimals(s1); } catch (e) { }
  try { times += countDecimals(s2); } catch (e) { }
  var result = convertToInt(s1) * convertToInt(s2) / Math.pow(10, times);
  return getCorrectResult("mul", num1, num2, result);
  // return result;
};

// Counts the number of decimal places, including numbers printed in scientific notation.
var countDecimals = function(num) {
  var len = 0;
  try {
    num = Number(num);
    var str = num.toString().toUpperCase();
    if (str.split('E').length === 2) { // scientific notation
      var isDecimal = false;
      if (str.split('.').length === 2) {
        str = str.split('.')[1];
        if (parseInt(str.split('E')[0]) !== 0) {
          isDecimal = true;
        }
      }
      let x = str.split('E');
      if (isDecimal) {
        len = x[0].length;
      }
      len -= parseInt(x[1]);
    } else if (str.split('.').length === 2) { // decimal
      if (parseInt(str.split('.')[1]) !== 0) {
        len = str.split('.')[1].length;
      }
    }
  } catch (e) {
    throw e;
  } finally {
    if (isNaN(len) || len < 0) {
      len = 0;
    }
    return len;
  }
};

// Converts a decimal to an integer by dropping the decimal point.
var convertToInt = function(num) {
  num = Number(num);
  var newNum = num;
  var times = countDecimals(num);
  var temp_num = num.toString().toUpperCase();
  if (temp_num.split('E').length === 2) {
    newNum = Math.round(num * Math.pow(10, times));
  } else {
    newNum = Number(temp_num.replace(".", ""));
  }
  return newNum;
};

// Sanity check: if the result is off by more than 1 from the native result, fall back to the native one.
var getCorrectResult = function(type, num1, num2, result) {
  var temp_result = 0;
  switch (type) {
    case "add":
      temp_result = num1 + num2;
      break;
    case "sub":
      temp_result = num1 - num2;
      break;
    case "div":
      temp_result = num1 / num2;
      break;
    case "mul":
      temp_result = num1 * num2;
      break;
  }
  if (Math.abs(result - temp_result) > 1) {
    return temp_result;
  }
  return result;
};

Basic usage:

  • Addition: accAdd(0.1, 0.2) // result: 0.3
  • Subtraction: accSub(1, 0.9) // result: 0.1
  • Division: accDiv(2.2, 100) // result: 0.022
  • Multiplication: accMul(num1, num2)

Helper methods:

  • countDecimals(): counts the number of decimal places
  • convertToInt(): converts a decimal to an integer
  • getCorrectResult(): checks that the computed result is correct, just in case

(ii) Use a dedicated numeric library

If you need very precise results, consider a dedicated high-precision number type, such as the one provided by the bignumber library:

Github.com/MikeMcl/big…

Third, verify the feasibility of the solution through unit tests

Testing safeguards the quality of our code, and the precise-calculation scheme above needs a lot of tests to confirm that it is correct. Writing test code saves time and effort, makes later changes easier, and quickly shows whether the code is broken.

The tests use Mocha, a popular JavaScript testing framework that runs both in the browser and in Node.

Follow these simple steps to install and use it:

① Install the Node.js environment

② Go to the project directory and run the npm init command to initialize package.json

③ Install the required libraries in the project directory (the Mocha test framework and the Chai assertion library)

npm install mocha chai --save

④ Write the test code

Since no user interface is involved, we run the tests in Node.

④-1 First export the functions to be tested so that the test code can call them, roughly as follows:

if (typeof module !== 'undefined' && module.exports) {
  var calc = {};
  calc.countDecimals = countDecimals;
  calc.convertToInt = convertToInt;
  calc.getCorrectResult = getCorrectResult;
  calc.accAdd = accAdd;
  calc.accSub = accSub;
  calc.accDiv = accDiv;
  calc.accMul = accMul;
  module.exports = calc;
}

④-2 Write test code (take the countDecimals method as an example)

var chai = require('chai');
var assert = chai.assert;    // Using Assert style
var expect = chai.expect;    // Using Expect style
var should = chai.should();  // Using Should style
var calc = require('./accurateCalculate');
var countDecimals = calc.countDecimals;

// Registers one test case: countDecimals(num) must equal expected.
function test_countDecimals(info, num, expected) {
  it(info, function(done) {
    expect(countDecimals(num)).to.be.equal(expected);
    done();
  });
}

describe('TEST countDecimals', function() {
  describe('TEST Number', function() {
    test_countDecimals('3', 3, 0);
    test_countDecimals('3.00', 3.00, 0);
    test_countDecimals('3.01', 3.01, 2);
    test_countDecimals('3e0', 3e0, 0);
    test_countDecimals('3e10', 3e10, 0);
    test_countDecimals('3e-10', 3e-10, 10);
    test_countDecimals('3.01e0', 3.01e0, 2);
    test_countDecimals('3.01e10', 3.01e10, 0);
    test_countDecimals('3.01e-10', 3.01e-10, 12);
  });
  describe('TEST String', function() {
    test_countDecimals('3', '3', 0);
    test_countDecimals('3.00', '3.00', 0);
    test_countDecimals('3.01', '3.01', 2);
    test_countDecimals('3.00e0', '3.00e0', 0);
    test_countDecimals('3.00e10', '3.00e10', 0);
    test_countDecimals('3.00e-10', '3.00e-10', 10);
    test_countDecimals('30e-1', '30e-1', 0);
    test_countDecimals('30e-2', '30e-2', 1);
  });
});

The require statements import the libraries and the object exported by the JS file under test.

A describe block is called a “test suite”. It is a function whose first argument describes what is being tested.

An it block is called a “test case”. It represents a single test and is the smallest unit of testing. Its first argument is the name of the test case.

A test script should contain one or more describe blocks, and each describe block should contain one or more it blocks.

The test code above uses the expect assertion style:

expect(functionUnderTest()).to.be.equal(expectedReturnValue);

An “assertion” checks whether the actual result of the function under test matches the expected result.

⑤ Add the test script to package.json

"scripts": {
    "test": "mocha test.js"
  },

⑥ Run the tests

After package.json is configured, simply run the npm test command to see the test results.


For more detailed uses of Mocha, see this blog post:

Test framework Mocha Example Tutorial – Ruan Yifeng’s weblog

Fourth, conclusion

The JS floating-point precision problem arises because some decimals cannot be represented exactly in binary. JS uses the IEEE 754 double-precision floating-point format.

To avoid floating-point precision problems, the following approaches can be used:

  • Call Math.round() to round, or toFixed() to keep a specified number of decimal places (suitable when the precision requirement is not high)
  • Convert the decimals to whole numbers before calculating, which is the simple solution described earlier
  • Use a dedicated numeric type such as the bignumber library mentioned earlier (for high-precision requirements)
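The first approach in the list can look like this in practice. It is a small sketch with arbitrarily chosen numbers of decimal places (2 and 10), not a general recipe.

```javascript
const raw = 0.1 + 0.2; // 0.30000000000000004

// Round to two decimal places with Math.round().
const rounded = Math.round(raw * 100) / 100;
console.log(rounded === 0.3); // true

// toFixed() returns a string; wrap it in Number() when a number is needed.
console.log(raw.toFixed(2));                  // 0.30
console.log(Number(raw.toFixed(10)) === 0.3); // true

// Caveat: toFixed() rounds the stored binary value, not the literal as typed.
// 1.005 is stored as slightly less than 1.005, so it rounds down.
console.log((1.005).toFixed(2));              // 1.00
```

The last line is the reason this approach only suits cases where the precision requirement is not high.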

If you find any problems in this article, please point them out; you are also welcome to contact me with any questions.