The root cause

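The JS calculation accuracy problem comes from how numbers are stored. Integers in JS are only exact up to 2 to the power of 53, namely 9007199254740992 (Number.MAX_SAFE_INTEGER is one less, 9007199254740991).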
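The limit can be checked directly in any modern JavaScript engine. This is a minimal sketch; the variable names are only illustrative.

```javascript
// 2^53 - 1 is the largest integer n for which both n and n + 1 are exact.
const maxSafe = Number.MAX_SAFE_INTEGER;      // 9007199254740991
const limit = 2 ** 53;                        // 9007199254740992

console.log(maxSafe === limit - 1);           // true
console.log(limit + 1 === limit);             // true: 2^53 + 1 cannot be represented
console.log(Number.isSafeInteger(limit));     // false
```

Past the limit, adding 1 can silently return the same value, which is why large-number addition needs special handling.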

Background knowledge

  1. Number bases (decimal, binary, octal, hexadecimal)
  2. Scientific notation
  3. Floating-point storage mechanism

I won’t say much about number bases; the focus here is on the last two points.

Scientific notation

In a floating-point number the position of the point is not fixed, and neither is the number of digits after it, so floating-point numbers are written in a binary form of scientific notation:

value = a * 2^e

a is the binary significand of the floating-point number, normalized so that a single 1 stands before the point

e is the exponent, the number of places the point was moved

For example, 10 converted to binary is 1010, which in scientific notation is 1.010 * 2^3. The following table shows how this form is stored in the computer:
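The normalization step can be sketched in a few lines. `toBinaryScientific` is a hypothetical helper written for this illustration, and it assumes a positive, finite input.

```javascript
// Split a positive number into a and e so that value = a * 2^e with 1 <= a < 2.
// Multiplying or dividing by 2 is exact in binary floating point, so no digits are lost.
function toBinaryScientific(value) {
    if (!(value > 0) || !Number.isFinite(value)) {
        throw new RangeError("expected a positive finite number");
    }
    let a = value;
    let e = 0;
    while (a >= 2) { a /= 2; e += 1; }   // move the point to the left
    while (a < 1)  { a *= 2; e -= 1; }   // move the point to the right
    return { a: a.toString(2), e: e };
}

console.log(toBinaryScientific(10));     // { a: '1.01', e: 3 }
console.log(toBinaryScientific(0.1).e);  // -4
```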

Type                                     Sign bits   Exponent bits   Fraction bits   Exponent bias
Single-precision floating-point number   1           8               23              127
Double-precision floating-point number   1           11              52              1023

Explanation (for the double-precision format that JS uses):

  • 1 bit is used to hold the sign
  • 11 bits are used to hold the exponent
  • 52 bits are used to hold the fraction, the digits after the point

Sign bit: 0 means positive, 1 means negative.

Exponent: e can be positive or negative. It is stored as e plus the exponent bias, converted to binary.

Fraction: the digits after the point in the scientific notation form. There are 52 in total, padded with 0 on the right if the number has fewer. Together with the implicit leading 1 this gives 53 significant bits, which is why integers are only exact up to 2^53.

Exponent bias: bias = 2^(k-1) - 1

k is the number of exponent bits (a double has 11 exponent bits, so bias = 2^(11-1) - 1 = 1023)
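As a quick check of the bias formula, the sketch below computes the bias for both formats and the stored exponent field of a double. The helper names are hypothetical, and `storedExponent` only covers normal numbers (e between -1022 and 1023).

```javascript
// Bias for an exponent field that is k bits wide: 2^(k-1) - 1.
function exponentBias(k) {
    return 2 ** (k - 1) - 1;
}

// Stored exponent field of a double: e + bias, written as an 11-bit binary string.
function storedExponent(e) {
    return (e + exponentBias(11)).toString(2).padStart(11, "0");
}

console.log(exponentBias(8));    // 127
console.log(exponentBias(11));   // 1023
console.log(storedExponent(3));  // "10000000010"
console.log(storedExponent(-4)); // "01111111011"
```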

For example, 10 in binary is 1010, or 1.010 * 2^3 in scientific notation. Its stored form is:

sign bit + exponent + fraction = 0 + 10000000010 + 0100000000000000000000000000000000000000000000000000, 64 bits in total
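The stored form can be inspected directly with a `DataView`. This is a sketch: `doubleToBits` is a hypothetical helper name, and `getBigUint64` requires an ES2020-capable engine.

```javascript
// Return the three fields of the 64-bit representation of a number.
function doubleToBits(value) {
    const view = new DataView(new ArrayBuffer(8));
    view.setFloat64(0, value);                                   // big-endian by default
    const bits = view.getBigUint64(0).toString(2).padStart(64, "0");
    return {
        sign: bits.slice(0, 1),        // 1 bit
        exponent: bits.slice(1, 12),   // 11 bits
        fraction: bits.slice(12)       // 52 bits
    };
}

const ten = doubleToBits(10);
console.log(ten.sign);      // "0", a positive number
console.log(ten.exponent);  // "10000000010", which is 3 + 1023
console.log(ten.fraction);  // "01" followed by 50 zeros
```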

Here’s another example of a floating point number:

0.1 + 0.2 = 0.30000000000000004;

Let’s analyze this problem through the above knowledge points:

Note: to convert a decimal fraction to binary, multiply by 2 repeatedly and take the integer part of each result, in order.
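The multiply-by-2 rule can be sketched as follows. `fractionToBinary` and its `maxBits` parameter are hypothetical names for this illustration; the input is assumed to lie between 0 and 1.

```javascript
// Convert a fraction in [0, 1) to binary digits by repeated doubling.
function fractionToBinary(fraction, maxBits) {
    let bits = "";
    let rest = fraction;
    for (let i = 0; i < maxBits && rest > 0; i++) {
        rest *= 2;                 // shift the next binary digit in front of the point
        if (rest >= 1) {
            bits += "1";           // the integer part is the next digit
            rest -= 1;
        } else {
            bits += "0";
        }
    }
    return bits === "" ? "0" : "0." + bits;
}

console.log(fractionToBinary(0.625, 10)); // "0.101"
console.log(fractionToBinary(0.1, 20));   // "0.00011001100110011001"
```

0.625 terminates after three digits, while 0.1 keeps repeating the pattern 0011, which is why it cannot be stored exactly.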

  1. 0.1

0.1 in binary: 0.0001100110011001100110011001100110011001100110011001101 (note: only 53 significant bits can be kept, 52 of them stored in the fraction, so the infinitely repeating expansion is cut off and the final bits are rounded up)

0.1 in scientific notation: 1.100110011001100110011001100110011001100110011001101 * 2^(-4)

0.1 as actually stored in the computer: 0011111110111001100110011001100110011001100110011001100110011010 (64 bits)

  2. 0.2

0.2 in binary: 0.001100110011001100110011001100110011001100110011001101

0.2 in scientific notation: 1.100110011001100110011001100110011001100110011001101 * 2^(-3)

0.2 as actually stored in the computer: 0011111111001001100110011001100110011001100110011001100110011010 (64 bits)

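  3. Add the binary forms of 0.1 and 0.2

0.0001100110011001100110011001100110011001100110011001101 + 0.0011001100110011001100110011001100110011001100110011010 = 0.0100110011001100110011001100110011001100110011001100111

This exact sum has 54 significant bits, one more than a double can hold, so it is rounded to 0.0100110011001100110011001100110011001100110011001101, which converted to decimal is 0.30000000000000004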
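The analysis can be reproduced in the console. This sketch relies on `Number.prototype.toString(2)`, which prints the binary digits of the value that is actually stored; the variable names are only illustrative.

```javascript
const sum = 0.1 + 0.2;

console.log(sum);          // 0.30000000000000004
console.log(sum === 0.3);  // false

// Binary digits of the stored values, for comparison with the steps above.
const binaryTenth = (0.1).toString(2);
const binaryFifth = (0.2).toString(2);
const binarySum = sum.toString(2);

console.log(binaryTenth);
console.log(binaryFifth);
console.log(binarySum);
```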

With the principle and the background covered, the next topic is how to add large numbers.

Idea 1:

  1. A value of type Number becomes inaccurate once it exceeds the maximum value, which has 16 digits. So the first step is to treat the large numbers as strings and cut them into segments of 14 digits, starting from the right (14 digits is a conservative choice that leaves plenty of headroom for the sum of two segments plus a carry).
  2. Add the segments pair by pair, starting with the lowest 14 digits. After each addition, check whether the sum has 15 or more digits, that is, whether there is a carry. If there is, add the carry of 1 to the sum of the next segment.
// Adds two non-negative integers given as strings (or as safe numbers).
// Always returns the result as a string so that no precision is lost.
function addBigNum(num1, num2) {
    var CHUNK = 14,                 // digits per segment
        s1 = String(num1),          // work on strings, never on the full numbers
        s2 = String(num2),
        l1 = s1.length,             // digits of num1 not yet consumed
        l2 = s2.length,             // digits of num2 not yet consumed
        carry = 0,                  // carry into the next segment (0 or 1)
        part1, part2, partSum,
        sum = "";

    // Both numbers are short enough to be added directly.
    if (l1 < CHUNK && l2 < CHUNK) {
        return String(Number(s1) + Number(s2));
    }

    while (l1 > 0 || l2 > 0) {
        // Take the lowest remaining 14 digits of each number ("" counts as 0).
        part1 = s1.slice(Math.max(l1 - CHUNK, 0), Math.max(l1, 0));
        part2 = s2.slice(Math.max(l2 - CHUNK, 0), Math.max(l2, 0));
        partSum = String(Number(part1) + Number(part2) + carry);

        l1 -= CHUNK;
        l2 -= CHUNK;

        if (l1 > 0 || l2 > 0) {
            // More segments follow: a 15th digit is the carry,
            // and shorter sums are left-padded with zeros to 14 digits.
            carry = partSum.length > CHUNK ? 1 : 0;
            partSum = partSum.slice(-CHUNK);
            while (partSum.length < CHUNK) {
                partSum = "0" + partSum;
            }
        }
        sum = partSum + sum;
    }
    return sum;
}
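For comparison, engines that support ES2020 ship a built-in `BigInt` type that handles integers of any size. The sketch below is an alternative and not the segment approach described above; `addWithBigInt` is a hypothetical name, and the inputs are assumed to be strings of decimal digits.

```javascript
// Add two integers given as decimal strings and return the result as a string.
function addWithBigInt(num1, num2) {
    return (BigInt(num1) + BigInt(num2)).toString();
}

console.log(addWithBigInt("9007199254740992", "9007199254740992")); // "18014398509481984"
console.log(addWithBigInt("99999999999999999999", "1"));            // "100000000000000000000"
```

A result like this is also a convenient reference value when testing a hand-written string addition.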