

Counterintuitive fact

Computers are called “calculating” machines because they were invented mainly to calculate. Calculation is naturally their specialty, and most people assume that a computer’s calculations must be exact. In fact, even some very basic decimal operations produce results that are not exact.

For example:

float f = 0.1f * 0.1f;

System.out.println(f);

The result seems obvious: it should be 0.01. But the screen actually shows 0.010000001, with an extra 1 at the end.

How can the computer get such a seemingly simple operation wrong?

The brief answer

In fact, the arithmetic itself is not at fault. The computer simply cannot represent many numbers exactly, and 0.1 is one of them.

Computers store decimals in a binary format that does not represent 0.1 exactly. It can only represent a number that is very close to 0.1 but not equal to it.
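To make this concrete, here is a minimal sketch that prints the exact value a float or double really holds when you write 0.1. The class and method names are only illustrative. It relies on the fact that new BigDecimal(double) converts the stored binary value to decimal without any rounding.

```java
import java.math.BigDecimal;

class ExactValueDemo {
    // Exact decimal expansion of the value actually stored in a float.
    // Widening float to double is lossless, and BigDecimal(double) is exact.
    static String exactValue(float f) {
        return new BigDecimal(f).toPlainString();
    }

    // Exact decimal expansion of the value actually stored in a double.
    static String exactValue(double d) {
        return new BigDecimal(d).toPlainString();
    }

    public static void main(String[] args) {
        // prints 0.100000001490116119384765625
        System.out.println(exactValue(0.1f));
        // prints 0.1000000000000000055511151231257827021181583404541015625
        System.out.println(exactValue(0.1));
    }
}
```

Neither value is exactly 0.1; the double is simply much closer than the float.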

Since the numbers themselves are imprecise, it is no surprise that operations on them are imprecise too.

How can 0.1 not be represented exactly? In decimal it can, but in binary it cannot. Before we talk about binary, let’s look at the familiar decimal system.

In fact, decimal notation can only represent numbers that can be written as a sum of powers of 10. For example, 12.345 actually means 1*10 + 2*1 + 3*0.1 + 4*0.01 + 5*0.001. Just as with integers, each position after the decimal point has a place value: from left to right, 0.1, 0.01, 0.001, ..., that is, 10^(-1), 10^(-2), 10^(-3), and so on.

Many numbers cannot be represented exactly in decimal either, such as 1/3. With three decimal places it is written as 0.333, but no matter how many decimal places you keep, it is never exact. If you calculate with 0.333, for example multiplying it by 3, you expect 1 but actually get 0.999.
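The same effect can be reproduced in code. The following is a small sketch (class and method names are illustrative) that uses BigDecimal to cut 1/3 off after a chosen number of decimal places and then multiplies it by 3 again.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class OneThirdDemo {
    // 1/3 truncated after the given number of decimal places
    static BigDecimal oneThird(int places) {
        return BigDecimal.ONE.divide(new BigDecimal(3), places, RoundingMode.DOWN);
    }

    public static void main(String[] args) {
        BigDecimal three = new BigDecimal(3);
        for (int places : new int[] {3, 6, 9}) {
            BigDecimal third = oneThird(places);
            // e.g. 0.333 * 3 = 0.999, never exactly 1
            System.out.println(third + " * 3 = " + third.multiply(three));
        }
    }
}
```

Keeping more digits shrinks the error, but it never disappears.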

Binary is similar, except that it can only represent numbers that can be written as a sum of powers of 2. Here are some examples of powers of 2:

Power of 2    Decimal value
2^(-1)        0.5
2^(-2)        0.25
2^(-3)        0.125
2^(-4)        0.0625

Numbers that can be written exactly as a sum of powers of 2 can be represented exactly; all others cannot.
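One way to see which fractions are sums of powers of 2 is to write out their binary digits by repeated doubling. The sketch below is only an illustration, and its names are made up for this example. Each doubling step yields one binary digit: 1 if the result reaches 1, otherwise 0.

```java
import java.math.BigDecimal;

class BinaryFractionDemo {
    private static final BigDecimal TWO = new BigDecimal(2);

    // Returns up to maxDigits binary digits after the point
    // for a decimal fraction in the range [0, 1).
    static String binaryDigits(String decimalFraction, int maxDigits) {
        BigDecimal frac = new BigDecimal(decimalFraction);
        StringBuilder digits = new StringBuilder();
        // Stop early when nothing is left: the expansion is finite (exact).
        for (int i = 0; i < maxDigits && frac.signum() != 0; i++) {
            frac = frac.multiply(TWO);
            if (frac.compareTo(BigDecimal.ONE) >= 0) {
                digits.append('1');
                frac = frac.subtract(BigDecimal.ONE);
            } else {
                digits.append('0');
            }
        }
        return digits.toString();
    }

    public static void main(String[] args) {
        // 0.625 = 0.5 + 0.125, so it terminates: prints 0.101
        System.out.println("0." + binaryDigits("0.625", 24));
        // 0.1 never terminates: prints 0.000110011001100110011001...
        System.out.println("0." + binaryDigits("0.1", 24) + "...");
    }
}
```

For 0.1 the pattern 0011 repeats forever, so any finite number of binary digits has to cut it off somewhere.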

Why does it have to be binary?

Why not use the familiar decimal system? At the lowest level, computers use electronic components that represent only two states, usually low and high voltage, corresponding to 0 and 1. Using binary, it is easy to build hardware and perform calculations based on these electronics. If you had to use decimal, the hardware would be much more complex and inefficient.

Which decimal calculations are exact?

If you write a program to experiment, you will find that some calculations appear to be exact. For example, I wrote in Java:

System.out.println(0.1f + 0.1f);

System.out.println(0.1f * 0.1f);

The first line prints 0.2 and the second prints 0.010000001. So the first result is exact, isn’t it?

In fact, this is just an illusion created by the way Java prints floating-point values. The result is still inexact, but because it is close enough to 0.2, Java prints the compact form 0.2 rather than a decimal with a long run of zeros in the middle.

When the error is small enough, the result looks exact, but inexactness is the norm.
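A short sketch can expose the hidden error (names are illustrative). Float.toString gives the compact form that println shows, while new BigDecimal(...) gives the exact stored value.

```java
import java.math.BigDecimal;

class PrintIllusionDemo {
    // The compact form that System.out.println shows for a float
    static String shortForm(float f) {
        return Float.toString(f);
    }

    // The exact value stored in the float, with no rounding
    static String exactForm(float f) {
        return new BigDecimal(f).toPlainString();
    }

    public static void main(String[] args) {
        float sum = 0.1f + 0.1f;
        System.out.println(shortForm(sum)); // prints 0.2
        System.out.println(exactForm(sum)); // prints 0.20000000298023223876953125
    }
}
```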

How do I deal with inexact calculations

The calculation is not exact, so what can we do? Most of the time we don’t need that much precision: we can round the result, or print it with a fixed number of decimal places.
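Both options can be sketched in a few lines. The helper names here are hypothetical, and Locale.ROOT is used only so that the decimal separator is always a dot.

```java
import java.util.Locale;

class RoundingDemo {
    // Formats a value with a fixed number of decimal places.
    static String fixed(double value, int places) {
        return String.format(Locale.ROOT, "%." + places + "f", value);
    }

    // Rounds to two decimal places. The result is still a binary double,
    // just the one nearest to the rounded decimal value.
    static double roundToHundredths(double value) {
        return Math.round(value * 100) / 100.0;
    }

    public static void main(String[] args) {
        float f = 0.1f * 0.1f;
        System.out.println(f);                    // prints 0.010000001
        System.out.println(fixed(f, 2));          // prints 0.01
        System.out.println(roundToHundredths(f)); // prints 0.01
    }
}
```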

If you really need high precision, one approach is to convert the decimals to integers, do the arithmetic on the integers, and convert back to a decimal at the end. Another is to use a decimal data type. There is no universal standard for such types; in Java it is BigDecimal, which is more accurate but less efficient. I won’t go into details in this section.
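As a rough illustration of both approaches (helper names are hypothetical, and the integer helper assumes non-negative amounts counted in hundredths):

```java
import java.math.BigDecimal;
import java.util.Locale;

class ExactDecimalDemo {
    // Integer approach: add amounts held as whole hundredths,
    // and turn the total into a decimal string only for display.
    static String addHundredths(long a, long b) {
        long total = a + b;
        return String.format(Locale.ROOT, "%d.%02d", total / 100, total % 100);
    }

    // Decimal-type approach: BigDecimal built from strings keeps
    // the decimal digits exactly as written.
    static BigDecimal multiplyDecimal(String a, String b) {
        return new BigDecimal(a).multiply(new BigDecimal(b));
    }

    public static void main(String[] args) {
        System.out.println(0.1 + 0.2);                     // prints 0.30000000000000004
        System.out.println(addHundredths(10, 20));         // prints 0.30
        System.out.println(multiplyDecimal("0.1", "0.1")); // prints 0.01
    }
}
```

Note that the BigDecimal values are created from strings. Creating them from a double such as 0.1 would carry the binary error along.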

Binary representation

So far we have used the word “decimal” for float and double. Strictly speaking, that is loose wording: “decimal” is a term from mathematics, while in computing we talk about floating-point numbers. float and double are called floating-point data types, and operations on them are called floating-point operations.

Why “floating point”? Because in the binary representation of a decimal, the position of the point is not fixed; it floats.

Let’s use decimal as an analogy. Decimal has scientific notation. The number 123.45, written this way, is in fixed notation. In scientific notation, with only one digit in front of the decimal point, it is written 1.2345E2, that is, 1.2345*(10^2). In other words, the decimal point has floated two places to the left.

In binary, decimals are represented in a similar scientific notation of the form m*(2^e), where m is called the mantissa and e the exponent. The exponent can be positive or negative; a negative exponent denotes a small number close to 0. In the binary format, the mantissa and the exponent are stored separately, along with a sign bit that indicates positive or negative.
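To see the m*(2^e) form for real values, here is a small sketch using Math.getExponent and Math.scalb from the Java standard library. The class and helper names are illustrative, and the sketch only handles normal, finite, non-zero values.

```java
class MantissaExponentDemo {
    // Binary exponent e such that value = m * 2^e with 1 <= |m| < 2
    static int exponentOf(float value) {
        return Math.getExponent(value);
    }

    // Mantissa m. Dividing by a power of two only changes the exponent,
    // so this step is exact for normal values.
    static float mantissaOf(float value) {
        return value / Math.scalb(1.0f, exponentOf(value));
    }

    public static void main(String[] args) {
        for (float v : new float[] {8.0f, 0.1f, -0.75f}) {
            // prints: 8.0 = 1.0 * 2^3
            //         0.1 = 1.6 * 2^-4
            //         -0.75 = -1.5 * 2^-1
            System.out.println(v + " = " + mantissaOf(v) + " * 2^" + exponentOf(v));
        }
    }
}
```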

The binary format for representing decimals is the same in almost all hardware and programming languages. It comes from a standard called IEEE 754. Its two most widely used formats are a 32-bit format, which corresponds to Java’s float, and a 64-bit format, which corresponds to Java’s double.

In the 32-bit format, 1 bit represents the sign, 23 bits the mantissa, and 8 bits the exponent. In the 64-bit format, 1 bit represents the sign, 52 bits the mantissa, and 11 bits the exponent.
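The three fields of a float can be pulled apart with shifts and masks. This is a sketch with made-up helper names. One detail it relies on, which the article does not cover, is that IEEE 754 stores the 32-bit exponent with a bias of 127.

```java
class FloatFieldsDemo {
    // Top bit: 0 for positive, 1 for negative
    static int signBit(float f) {
        return Float.floatToIntBits(f) >>> 31;
    }

    // Next 8 bits: the exponent as stored (biased by 127)
    static int exponentBits(float f) {
        return (Float.floatToIntBits(f) >>> 23) & 0xFF;
    }

    // Low 23 bits: the stored mantissa (fraction) bits
    static int mantissaBits(float f) {
        return Float.floatToIntBits(f) & 0x7FFFFF;
    }

    public static void main(String[] args) {
        float f = 0.1f;
        System.out.println("sign     = " + signBit(f));                        // 0
        System.out.println("exponent = " + exponentBits(f)
                + " (unbiased " + (exponentBits(f) - 127) + ")");              // 123 (unbiased -4)
        System.out.println("mantissa = 0x" + Integer.toHexString(mantissaBits(f))); // 0x4ccccd
    }
}
```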

In both formats, besides ordinary numbers, the standard specifies special bit patterns for special values such as negative infinity, positive infinity, 0, and NaN (Not a Number, for example the result of 0 times infinity).
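These special values are easy to produce in Java. A brief sketch follows (the helper names are illustrative). Note that dividing a float by zero does not throw an exception, unlike integer division.

```java
class SpecialValuesDemo {
    // Floating-point division by zero yields an infinity, not an exception
    static float divideByZero(float numerator) {
        return numerator / 0.0f;
    }

    // 0 times infinity has no meaningful value, so the result is NaN
    static float zeroTimesInfinity() {
        return 0.0f * Float.POSITIVE_INFINITY;
    }

    public static void main(String[] args) {
        float nan = zeroTimesInfinity();
        System.out.println(divideByZero(1.0f));  // prints Infinity
        System.out.println(divideByZero(-1.0f)); // prints -Infinity
        System.out.println(nan);                 // prints NaN
        System.out.println(nan == nan);          // prints false: NaN equals nothing
        System.out.println(Float.isNaN(nan));    // prints true
    }
}
```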

The IEEE 754 standard has some complex details that may seem difficult to understand at first glance and are not commonly used for everyday use, so I won’t cover them in this article.

If you want to see the actual binary form of a floating point number, in Java, you can use the following code:

Integer.toBinaryString(Float.floatToIntBits(value));

Long.toBinaryString(Double.doubleToLongBits(value));
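One thing to watch for is that toBinaryString drops leading zeros, so the output is often shorter than 32 or 64 characters. A possible wrapper that pads the result back to full width is sketched below; the class and method names are made up for this example.

```java
class BinaryStringDemo {
    // Full 32-bit pattern of a float, including leading zeros
    static String bits(float value) {
        String s = Integer.toBinaryString(Float.floatToIntBits(value));
        return String.format("%32s", s).replace(' ', '0');
    }

    // Full 64-bit pattern of a double, including leading zeros
    static String bits(double value) {
        String s = Long.toBinaryString(Double.doubleToLongBits(value));
        return String.format("%64s", s).replace(' ', '0');
    }

    public static void main(String[] args) {
        // prints 00111101110011001100110011001101
        System.out.println(bits(0.1f));
        // prints the 64-bit pattern of the double 0.1
        System.out.println(bits(0.1));
    }
}
```

Read from the left, the 32-bit output is 1 sign bit, then 8 exponent bits, then 23 mantissa bits.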

Summary

Why do decimal calculations go wrong? Because many decimals cannot be represented exactly in a computer.

Computers fundamentally think in binary, so the result is surprising, yet perfectly reasonable!

In the previous section we talked about binary integers, and in this section we talked about decimals.

What about characters and text? How are they encoded? What causes garbled text?


This is a long series, to be continued. To follow it, search for the official account “Lao Ma said programming” or “Laoma_shuo”.



Original article, all rights reserved. To reprint, please contact the author through the official account.