Take a look at the Go code below. Will 0.7 be returned?

    package main

    import "fmt"

    func main() {
        var num float32
        for i := 0; i < 7; i++ {
            num = num + 0.1
        }
        fmt.Println(num)
    }

The answer, perhaps surprisingly, is 0.70000005:

    0.70000005

Some might ask: is this a Go problem? Let’s try another language, say JavaScript. Again, the result is surprising. You can try adding float values in C, C++, Java, PHP, and other languages and see whether the result is accurate.

Beyond programming languages, you can also try summing float columns in databases such as MySQL and check whether the result is accurate.

I can tell you the answer up front: whenever you do arithmetic (addition, subtraction, multiplication, or division) on float data, the result will be inexact, in any language, any database, any middleware.
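For instance, here is a minimal Go sketch showing that float64 fares no better: the familiar 0.1 + 0.2 case misses 0.3 as well. (The variables are needed because Go evaluates constant expressions like `0.1 + 0.2` with arbitrary precision at compile time.)

    package main

    import "fmt"

    func main() {
        a, b := 0.1, 0.2   // stored as float64 approximations
        fmt.Println(a + b) // prints 0.30000000000000004, not 0.3
    }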

This is the loss of precision of floating-point types (loss of significance).

To understand why, you need to understand how a computer defines and represents the float type. Unlike the representation of integers, the representation of floats in computers is slightly more complex and follows the IEEE 754 standard.

Next, let’s talk about the IEEE 754 standard.

Let’s start by reviewing how integers are represented in computers. We know that computers only understand zeros and ones, so for a positive integer like 6 we do a decimal-to-binary conversion by repeatedly dividing by 2 and collecting the remainders. That is:
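    6 ÷ 2 = 3  remainder 0
    3 ÷ 2 = 1  remainder 1
    1 ÷ 2 = 0  remainder 1
    (read the remainders from bottom to top: 110)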

So, decimal 6 is eventually converted to binary 110.

That makes sense, but what about decimals like 6.1? Someone might suggest using a special symbol for the decimal point to separate the 6 from the 1 in 6.1. That sounds like a good idea, and IEEE 754 essentially does this, though the actual scheme is a bit more involved: the general idea is to borrow from scientific notation.

Let’s review what scientific notation is: a number is written in the form a × 10^n, where 1 ≤ |a| < 10 (a is not written as a fraction) and n is an integer. For example, 13600 is written as 1.36 × 10^4.

Similarly, a binary floating-point number can be expressed in the form 1.0110101 × 2^n. How is this form represented at the bit level? According to IEEE 754, the data is divided into three parts:

From left to right: the sign bit (positive or negative), the exponent bits, and the fraction (mantissa) bits.

Take single precision as an example: a single-precision floating-point number occupies 32 bits:

  • 1 bit for the sign
  • 8 bits for the exponent
  • 23 bits for the fraction (mantissa)

Here’s one thing to watch out for: in the normalized form 1.0110101 × 2^n, the digit before the binary point is always 1, so IEEE 754 does not store it; the 23 fraction bits hold only the digits after the point.

So, how do we actually write the fractional part? What do the digits after the binary point mean? Can 6.1 simply be written as 110.1? And if so, what does the 1 after the point represent: one? Then would appending zeros make it ten, a hundred, a thousand? Clearly not. That reading only works as a “visual effect”; logically it makes no sense.

The key is that each digit after the binary point carries half the weight of the one before it: the first place after the point is 1/2 = 0.5, the second is 1/4 = 0.25, the third is 1/8 = 0.125, and so on:
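    0.1   (binary) → 1/2 = 0.5
    0.01  (binary) → 1/4 = 0.25
    0.001 (binary) → 1/8 = 0.125
    ...

For example, binary 0.101 is 0.5 + 0.125 = 0.625 in decimal.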

Conversely, given a decimal fraction such as 0.1, its binary digits are computed the opposite way from the digits to the left of the point: repeatedly multiply by 2 and record the integer part each time.

    0.1 × 2 = 0.2  → 0
    0.2 × 2 = 0.4  → 0
    0.4 × 2 = 0.8  → 0
    0.8 × 2 = 1.6  → 1  (1.6 - 1 = 0.6)
    0.6 × 2 = 1.2  → 1  (1.2 - 1 = 0.2)
    0.2 × 2 = 0.4  → 0
    0.4 × 2 = 0.8  → 0
    0.8 × 2 = 1.6  → 1  (1.6 - 1 = 0.6)
    0.6 × 2 = 1.2  → 1  (1.2 - 1 = 0.2)
    0.2 × 2 = 0.4  → 0
    0.4 × 2 = 0.8  → 0
    0.8 × 2 = 1.6  → 1
    ...                 // the cycle repeats forever

So 0.1 in binary is 0.000110011001100110011…, and 6.1 in binary is therefore 110.000110011001100110011…. Written in “scientific notation”, that is 1.10000110011001100110011… × 2^2.
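As a quick illustration, here is a small Go sketch of this multiply-by-2 procedure. (The helper binaryFraction is a name made up for this example; and since the input is itself a float64 approximation, the digits are only trustworthy up to float64’s own precision.)

    package main

    import "fmt"

    // binaryFraction returns the first n binary digits of the
    // fractional part of f, using the multiply-by-2 method above.
    func binaryFraction(f float64, n int) string {
        bits := "0."
        for i := 0; i < n; i++ {
            f *= 2
            if f >= 1 {
                bits += "1"
                f -= 1
            } else {
                bits += "0"
            }
        }
        return bits
    }

    func main() {
        fmt.Println(binaryFraction(0.1, 24)) // 0.000110011001100110011001
    }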

The sign bit is 0 for positive numbers and 1 for negative ones, so the sign bit of 6.1 is 0. Now we have the sign bit and the fraction bits; all that remains is the exponent, 2. How should it be represented? Just convert it directly to 00000010 in the 8-bit exponent field?

Obviously not. First, if the exponent were stored in sign-magnitude form, negative exponents would need their own sign bit, and zero would have two representations, 00000000 and 10000000, which makes operations complicated.

Someone might ask: what about two’s complement? With two’s complement, the following situation arises. As an example:

    Compare 1.01 × 2^-1 with 1.11 × 2^3. First compare the exponents, -1 and 3.
    In 3-bit two's complement they are 111 and 011 respectively. With no extra
    logic, 111 reads as 7 and 011 as 3, so -1 would wrongly compare greater than 3.

So two’s complement is not very convenient either, which brings us to another encoding: the biased representation (the “shift code”). The idea is to add a constant bias to every number; for n bits the bias is usually 2^(n-1) or 2^(n-1) - 1.

Continuing with the comparison of 1.01 × 2^-1 and 1.11 × 2^3:

    With a 3-bit exponent field and a bias of 4:
    exponent -1 → -1 + 4 = 3, i.e. 011
    exponent  3 →  3 + 4 = 7, i.e. 111

In this way, comparing the exponents of floating-point numbers becomes a simple unsigned comparison, and the problem of distinct “positive zero” and “negative zero” disappears.

Because:

    With a bias of 4, zero has a single representation: 0 + 4 = 4, i.e. 100
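The same comparison can be sketched in Go (the 3-bit field and the bias of 4 are just this example’s assumptions):

    package main

    import "fmt"

    func main() {
        const bias = 4    // 3-bit exponent field: bias = 2^(3-1) = 4
        expA := -1 + bias // exponent of 1.01 × 2^-1, stored as 3 (011)
        expB := 3 + bias  // exponent of 1.11 × 2^3, stored as 7 (111)
        fmt.Printf("%03b < %03b : %v\n", expA, expB, expA < expB)
        // prints: 011 < 111 : true, so a plain unsigned comparison
        // now agrees with the numeric order of the exponents
    }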

In IEEE 754 single precision, the exponent bias is 2^(n-1) - 1 for n exponent bits, i.e. 2^7 - 1 = 127.

So, back to representing 6.1: the exponent field is 2 + 127 = 129, which in binary is 10000001.

Therefore, under the IEEE 754 single-precision standard, 6.1 is represented as:
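    sign   exponent   fraction (23 bits)
    0      10000001   10000110011001100110011

The leading 1 of 1.10000110011… is implicit and is not stored, and the infinite tail has to be cut off after 23 fraction bits; that cut is exactly where the precision is lost.

We can check this layout in Go with math.Float32bits, which exposes the raw bits of a float32 (a small sketch, with the field extraction written out by hand):

    package main

    import (
        "fmt"
        "math"
    )

    func main() {
        bits := math.Float32bits(6.1)
        fmt.Printf("%032b\n", bits) // 01000000110000110011001100110011
        sign := bits >> 31
        exponent := (bits >> 23) & 0xff
        fraction := bits & 0x7fffff
        fmt.Printf("sign=%b exponent=%08b (%d) fraction=%023b\n",
            sign, exponent, exponent, fraction)
        // sign=0 exponent=10000001 (129) fraction=10000110011001100110011
    }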

Okay, now that you know how floating-point numbers are represented under the IEEE 754 standard, can you see why floating-point addition is not exact?

Because many decimal fractions cannot be represented exactly in binary, they can only be approximated with a finite number of bits. Adding two floats is really adding two approximations, and each operation may round again, so the more operations you perform, the further the accumulated result drifts from the true value.
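The effect compounds with the number of operations; a variation of the opening example makes the drift easier to see (the exact digits printed depend on rounding, so the comment is only approximate):

    package main

    import "fmt"

    func main() {
        var num float32
        for i := 0; i < 7000; i++ {
            num += 0.1 // each addition rounds to the nearest float32
        }
        fmt.Println(num) // prints a value close to, but not exactly, 700
    }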

For more exciting content, please follow my WeChat official account: Internet Technology Nest.