Java is usually described as an object-oriented programming language, but its primitive ("basic") data types are not objects, so Java is not a 100% pure object-oriented language.

So why did Sun keep the primitive types and add object-oriented "wrapper types" for them, instead of removing the primitive types altogether?

This article will try to analyze the meaning and advantages of “basic data types” as well as the implementation details of “wrapper classes”.

Primitive types vs. object types

Java predefines eight primitive data types: byte, short, int, long, float, double, boolean, and char. The biggest difference between primitive types and object types is that a primitive variable holds a value, while an object variable holds a reference.

A local variable of primitive type stores its value directly in the local variable table of the stack frame, while a local variable of object type stores only a reference to an object on the heap.

An object therefore costs more memory than a primitive value: besides the reference itself, the object on the heap carries an object header in addition to its data.

As mentioned above, primitive types are plain values, so there is no class behind them. A primitive variable can only store a value; it has no methods for manipulating that value. Object types are completely different: the variable refers to an instance of a class, which can have fields and methods.
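The difference between holding a value and holding a reference can be seen in a short sketch. The class and method names are illustrative only, and an int array stands in for "an object on the heap".

```java
class ValueVsReference {
    // Returns {a, x[0]} after the "copies" of both variables were modified.
    static int[] demo() {
        int a = 10;
        int b = a;      // b receives a copy of the value 10
        b = 20;         // changing b leaves a untouched

        int[] x = {10};
        int[] y = x;    // y receives a copy of the reference, not of the array
        y[0] = 20;      // so this change is visible through x as well

        return new int[] {a, x[0]};
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println(result[0]); // 10
        System.out.println(result[1]); // 20
    }
}
```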

Java keeps the primitive types because they occupy relatively little memory and are faster than object types in computation. The disadvantage is just as evident: they are not objects, so they cannot be used where an object is required, for example as elements of a collection.

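So what "wrapper types" does Java provide to make up for the fact that primitive types are not object-oriented?

Each primitive type has a wrapper class in java.lang: Byte, Short, Integer, Long, Float, Double, Boolean, and Character. For six of the eight types the wrapper class name is just the primitive name with its first letter capitalized; the exceptions are int (Integer) and char (Character).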
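As a small illustration, the mapping can be listed in code using the TYPE constant that every wrapper class exposes. The class WrapperTable and its helper methods are made up for this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class WrapperTable {
    // Maps each primitive type to its wrapper class via the wrappers' TYPE constants.
    static Map<Class<?>, Class<?>> wrappers() {
        Map<Class<?>, Class<?>> m = new LinkedHashMap<>();
        m.put(Byte.TYPE, Byte.class);
        m.put(Short.TYPE, Short.class);
        m.put(Integer.TYPE, Integer.class);
        m.put(Long.TYPE, Long.class);
        m.put(Float.TYPE, Float.class);
        m.put(Double.TYPE, Double.class);
        m.put(Boolean.TYPE, Boolean.class);
        m.put(Character.TYPE, Character.class);
        return m;
    }

    // True if the wrapper name is just the primitive name with a capital first letter.
    static boolean isSimplyCapitalized(Class<?> primitive, Class<?> wrapper) {
        String p = primitive.getName();
        String expected = Character.toUpperCase(p.charAt(0)) + p.substring(1);
        return expected.equals(wrapper.getSimpleName());
    }

    public static void main(String[] args) {
        wrappers().forEach((p, w) ->
            System.out.println(p.getName() + " -> " + w.getSimpleName()
                + (isSimplyCapitalized(p, w) ? "" : "  (name differs)")));
    }
}
```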

Let’s use int and Integer as examples to take a quick look at how the wrapper class is implemented.

int and Integer

The first thing to be clear about is that since Integer is a wrapper for int, it must be able to store an int value.

/**
 * The value of the {@code Integer}.
 *
 * @serial
 */
private final int value;

The Integer class internally defines a private final field, value, that holds the int value, and the wrapper class builds its methods around that value.

Now let’s see how to build an instance of the wrapper class:

public Integer(int value) {
    this.value = value;
}
public Integer(String s) throws NumberFormatException {
    this.value = parseInt(s, 10);
}

The Integer class provides two constructors for building and initializing an instance. The first is the more direct one: you pass in an int value. The second is indirect: you pass in a string of digits, which Integer tries to parse as a decimal int value, initializing value if it succeeds and throwing a NumberFormatException otherwise. Note that both constructors are deprecated as of Java 9 in favor of valueOf and parseInt.
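The string-parsing behavior described above can be tried out with Integer.valueOf(String), which parses decimal text in the same way. The helper tryParse is a hypothetical name introduced for this sketch; returning null on failure is a choice made here, not something the JDK does.

```java
class ParseDemo {
    // Parses a decimal string; returns null instead of throwing if it is not a valid int.
    static Integer tryParse(String s) {
        try {
            return Integer.valueOf(s);
        } catch (NumberFormatException e) {
            return null; // not a number, or outside the int range
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParse("22"));          // 22
        System.out.println(tryParse("22a"));         // null
        System.out.println(tryParse("2147483648"));  // null, one past Integer.MAX_VALUE
    }
}
```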

So we can convert a variable of type int to an instance of an Integer wrapper class with the following code:

int age = 22;
Integer iAge = new Integer(age);

Next, we know that when System.out.println is used to print an Integer instance, println converts the instance to text by calling its toString method and prints the result.

So how does Integer internally convert its int value to a string? We rarely think about this, because such details are well encapsulated and everyday development does not need to care about them. But if you had to write it yourself, could you get it right?

public String toString() {
    return toString(value);
}

First, the no-argument toString method simply calls the static toString method that takes an int argument.

public static String toString(int i) {
    if (i == Integer.MIN_VALUE)
        return "-2147483648";
    int size = (i < 0) ? stringSize(-i) + 1 : stringSize(i);
    char[] buf = new char[size];
    getChars(i, size, buf);
    return new String(buf, true);
}

If the value equals Integer.MIN_VALUE, the method simply returns the predefined string, because -Integer.MIN_VALUE overflows and cannot be represented as an int. Otherwise the stringSize method determines how many decimal digits the int i has, that is, how many characters are needed to represent it (plus one for the minus sign if i is negative). The details of stringSize, a very elegantly implemented algorithm, are covered below.
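The reason Integer.MIN_VALUE needs its own branch is easy to check: int has one more negative value than positive values, so negating the smallest value wraps around to itself. A minimal sketch (the class name is illustrative):

```java
class MinValueDemo {
    // True if negating Integer.MIN_VALUE yields Integer.MIN_VALUE again.
    static boolean negationWrapsAround() {
        int min = Integer.MIN_VALUE;
        return -min == min;
    }

    public static void main(String[] args) {
        System.out.println(negationWrapsAround());   // true
        try {
            Math.negateExact(Integer.MIN_VALUE);     // the checked variant refuses instead
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```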

With size determined, the method creates a char array of that length, fills it with the characters of the number using the getChars method, and finally builds a String object to return.

Let’s take a look at the implementation of the stringSize method:

final static int [] sizeTable = { 9, 99, 999, 9999, 99999, 999999, 9999999,
                                99999999, 999999999, Integer.MAX_VALUE };
static int stringSize(int x) {
    for (int i=0; ; i++)
        if (x <= sizeTable[i])
            return i+1;
}

This code is easier to explain with an example than in words.

For example, if x equals 85, the first sizeTable element that is greater than or equal to x is 99 (the largest two-digit number), at index 1, so x is a two-digit number (1 + 1).

If you think about it, it makes sense. Each element of sizeTable is the largest number with a given number of digits: 9 is the largest one-digit number, 99 the largest two-digit number, 999 the largest three-digit number, and so on. The first element that x does not exceed therefore has the same number of digits as x, and that number is the element's index plus one. Note that stringSize requires x to be non-negative, which is why toString passes -i for negative values.
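To experiment with the table lookup outside the JDK, the same idea can be copied into a standalone class. DigitCount and digits are names chosen for this sketch; the table and loop follow the code shown above.

```java
class DigitCount {
    // Largest value with 1, 2, 3, ... digits; the last entry covers all 10-digit ints.
    private static final int[] SIZE_TABLE = {
        9, 99, 999, 9999, 99999, 999999, 9999999,
        99999999, 999999999, Integer.MAX_VALUE
    };

    // Number of decimal digits of x. Requires x >= 0.
    static int digits(int x) {
        for (int i = 0; ; i++) {
            if (x <= SIZE_TABLE[i]) {
                return i + 1;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(digits(85));                 // 2
        System.out.println(digits(100));                // 3
        System.out.println(digits(Integer.MAX_VALUE));  // 10
    }
}
```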

Then let’s look at how the core getChars method is implemented:

static void getChars(int i, int index, char[] buf) {
    int q, r;
    int charPos = index;
    char sign = 0;
    if (i < 0) {
        sign = '-';
        i = -i;
    }
    while (i >= 65536) {
        q = i / 100;
    // really: r = i - (q * 100);
        r = i - ((q << 6) + (q << 5) + (q << 2));
        i = q;
        buf [--charPos] = DigitOnes[r];
        buf [--charPos] = DigitTens[r];
    }

    for (;;) {
        q = (i * 52429) >>> (16+3);
        r = i - ((q << 3) + (q << 1));  // r = i-(q*10) ...
        buf [--charPos] = digits [r];
        i = q;
        if (i == 0) break;
    }
    if (sign != 0) {
        buf [--charPos] = sign;
    }
}

Although this method does not have much code, it does require some familiarity with binary operations. i is the integer value we want to convert to a string, index is the position in buf just past the last character to be written (here, the total length of the string), and buf is the char array that receives the result. The digits are written backwards, starting from the least significant digit, by decrementing charPos.

First, if i is negative, the variable sign is set to '-' to record that fact, and i is negated, since non-negative numbers are easier to manipulate.

This is followed by a loop that executes its body as long as i is greater than or equal to 65536 (2^16).

q = i / 100;

// q * (2^6 + 2^5 + 2^2) = q * 100

r = i - ((q << 6) + (q << 5) + (q << 2));

q gets the value of i with its ones and tens digits removed, while r gets those two removed digits. For example: if i is 1234567, then q is 12345 and r is 67.

Finally, i is set to q for the next iteration, and the ones and tens digits are stored by the following two statements.

buf [--charPos] = DigitOnes[r];

buf [--charPos] = DigitTens[r];

These two assignments are also interesting. r is always in the range 0 to 99, and DigitOnes and DigitTens are lookup tables of 100 characters each that give the ones digit and the tens digit of r. For example: if r equals 56, then DigitOnes[r] is '6' and DigitTens[r] is '5'.
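The two-digit step can be reproduced in isolation. In this sketch the lookup tables are hypothetical stand-ins named TENS and ONES and are filled in a loop for brevity; the tables themselves are not shown in this article, so their exact form in the JDK is not assumed here.

```java
class TwoDigitStep {
    // Stand-ins for the DigitTens / DigitOnes lookup tables (names and construction are assumptions).
    static final char[] TENS = new char[100];
    static final char[] ONES = new char[100];
    static {
        for (int r = 0; r < 100; r++) {
            TENS[r] = (char) ('0' + r / 10);
            ONES[r] = (char) ('0' + r % 10);
        }
    }

    // Splits i into {q, r} with i == q * 100 + r, using shifts for the multiplication.
    static int[] split(int i) {
        int q = i / 100;
        int r = i - ((q << 6) + (q << 5) + (q << 2)); // 64q + 32q + 4q = 100q
        return new int[] {q, r};
    }

    public static void main(String[] args) {
        int[] qr = split(1234567);
        System.out.println(qr[0]);                               // 12345
        System.out.println(qr[1]);                               // 67
        System.out.println("" + TENS[qr[1]] + ONES[qr[1]]);      // 67
    }
}
```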

The design of this code is very clever. Through this loop, the digits are stored into the buf array two at a time, from the end of the array towards the front, until the remaining value drops below 65536.

The next for loop stores the digits of the remaining value, which is less than 65536.

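q = (i * 52429) >>> (16+3);

r = i - ((q << 3) + (q << 1));

2^19 = 524288, so (i * 52429) >>> (16+3) is equivalent to i * 52429 / 524288, which is approximately i * 0.1000003814697. Since q is an integer and the fraction is truncated, the final value of q is exactly i / 10 for every i below 65536.

This makes a simple divide-by-ten look complicated, but it is done for efficiency: a multiplication and a shift are typically cheaper than a division. The bound of 65536 matters, because for i below 65536 the product i * 52429 still fits in 32 bits when treated as unsigned, which is why the unsigned shift >>> is used. In the end q stores i without its ones digit, and r stores that ones digit.

For example: if i equals 1234, then q equals 123 and r equals 4.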
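The claim that the multiply-and-shift form equals i / 10 for all values below 65536 can be checked by brute force. The class and method names below are made up for this sketch.

```java
class DivideByTen {
    // Multiply-and-shift replacement for i / 10; only meant for 0 <= i < 65536.
    static int div10(int i) {
        return (i * 52429) >>> (16 + 3);
    }

    // Compares div10 with ordinary division over the whole supported range.
    static boolean agreesBelow65536() {
        for (int i = 0; i < 65536; i++) {
            if (div10(i) != i / 10) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreesBelow65536()); // true
        System.out.println(div10(1234));        // 123
        // Far enough above the bound, i * 52429 no longer fits in 32 bits
        // and the result stops matching i / 10.
        System.out.println(div10(81920) == 81920 / 10);
    }
}
```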

The digit is then stored in the same way as before, one digit per iteration:

buf [--charPos] = digits [r];

Finally, sign is checked to decide whether a '-' character must be written at the front of the string to mark the value as negative.

To summarize, toString has two steps: determine how many digits the value has, that is, how many characters are needed to represent it, and then convert the value into those characters. The first step is simple. The second proceeds according to the size of the value: while the value is at least 65536, two digits are stored per iteration; once it is below 65536, one digit is stored per iteration.
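The two steps can also be written without any of the bit tricks. The following is a deliberately simplified sketch, not the JDK code: it uses plain / and % and stores one digit per iteration, and the names SimpleToString and toDecimal are invented here.

```java
class SimpleToString {
    static String toDecimal(int i) {
        if (i == Integer.MIN_VALUE) {
            return "-2147483648";            // -i would overflow, so handle it up front
        }
        boolean negative = i < 0;
        int n = negative ? -i : i;

        // Step 1: count the characters needed.
        int size = negative ? 2 : 1;
        for (int t = n; t >= 10; t /= 10) {
            size++;
        }

        // Step 2: fill the buffer from the end towards the front.
        char[] buf = new char[size];
        int pos = size;
        do {
            buf[--pos] = (char) ('0' + n % 10);
            n /= 10;
        } while (n != 0);
        if (negative) {
            buf[--pos] = '-';
        }
        return new String(buf);
    }

    public static void main(String[] args) {
        System.out.println(toDecimal(-1234567)); // -1234567
        System.out.println(toDecimal(0));        // 0
    }
}
```

Comparing its output with Integer.toString for a few values is a quick way to convince yourself that the optimized version computes the same thing.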

The Integer class also contains a family of methods named valueOf. valueOf(int) is an important method: since JDK 1.5 it is what the compiler uses to implement autoboxing.

This method relies on a caching mechanism, IntegerCache, so let's first look at how that cache works in Integer.

Since JDK 1.5, Sun has included this cache class to reuse Integer instances within a specified range, so that equal values share one object and memory overhead is reduced. By default it caches the instances in [-128, 127], and you can raise the upper bound of the cached range by starting the VM with -XX:AutoBoxCacheMax=<size>.

The first step of the cache's initialization is to read the VM startup parameter to see whether a maximum cached value was specified. If integerCacheHighPropValue is null, nothing was specified explicitly, and 127 is used as the upper bound.

Otherwise the upper bound is computed from the parameter. If the parameter is less than 127, 127 is still used as the upper bound. In theory we could cache up to Integer.MAX_VALUE, but in practice we can't: Integer.MAX_VALUE is also the maximum length an Integer[] array can be declared with, and the array must hold the 129 values from -128 through 0 as well.

So IntegerCache can cache up to Integer.MAX_VALUE - 129 at most, and this limit is used as the upper bound if the parameter is larger.

As a result, IntegerCache can cache values between [low,high].
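The bound calculation described above can be sketched as follows. This is a reconstruction from the description, not the JDK source: the class CacheBoundSketch, the helpers resolveHigh and buildCache, passing the setting in as a plain string, and ignoring an unparsable setting are all assumptions of this sketch.

```java
class CacheBoundSketch {
    static final int LOW = -128;

    // Resolves the upper cache bound from a configured value (null means "not set").
    static int resolveHigh(String configured) {
        int high = 127;
        if (configured != null) {
            try {
                int requested = Integer.parseInt(configured);
                requested = Math.max(requested, 127);                        // never below 127
                high = Math.min(requested, Integer.MAX_VALUE - (-LOW) - 1);  // at most MAX_VALUE - 129
            } catch (NumberFormatException e) {
                // Assumption: an unparsable setting is ignored and the default is kept.
            }
        }
        return high;
    }

    // Builds the cache array for [LOW, high]; the value v is stored at index v - LOW.
    static Integer[] buildCache(int high) {
        Integer[] cache = new Integer[high - LOW + 1];
        int value = LOW;
        for (int k = 0; k < cache.length; k++) {
            cache[k] = value++;
        }
        return cache;
    }

    public static void main(String[] args) {
        System.out.println(resolveHigh(null));          // 127
        System.out.println(resolveHigh("100"));         // 127
        System.out.println(resolveHigh("1000"));        // 1000
        System.out.println(buildCache(127).length);     // 256
    }
}
```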

Then let’s look at how valueOf uses IntegerCache:

public static Integer valueOf(int i) {
    if (i >= IntegerCache.low && i <= IntegerCache.high)
        return IntegerCache.cache[i + (-IntegerCache.low)];
    return new Integer(i);
}

If i falls within the cached range, valueOf returns a reference to the existing instance directly from IntegerCache. The way the array index is computed is a little bit interesting.

cache[0] = -128    (index: 128 + -128 = 0)
cache[1] = -127    (index: 128 + -127 = 1)
cache[2] = -126    (index: 128 + -126 = 2)
cache[3] = -125    (index: 128 + -125 = 3)
......
cache[128 + i] = i;

Since the cached values start at -128 and the array indices start at 0, the index of any cached value is the difference between that value and -128, that is, i + 128.

So, once I is in our cached value range, a direct reference will be returned directly from the cache pool, otherwise an Integer instance will actually be created to return.
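The effect of the cache is visible when two results of valueOf are compared by reference. In this sketch (names are illustrative) only the default range [-128, 127] is relied upon; what happens for a value such as 1000 depends on how the VM was started, so no result is claimed for it.

```java
class ValueOfDemo {
    // Reference comparison on purpose: true only if both calls return the same object.
    static boolean sameInstance(int i) {
        return Integer.valueOf(i) == Integer.valueOf(i);
    }

    // Array index of value i in a cache whose lowest value is -128.
    static int cacheIndex(int i) {
        return i + 128;
    }

    public static void main(String[] args) {
        System.out.println(sameInstance(127));   // true: 127 is always cached
        System.out.println(sameInstance(1000));  // depends on the configured cache range
        System.out.println(cacheIndex(-128));    // 0
        System.out.println(cacheIndex(127));     // 255
    }
}
```

This is also why Integer values should be compared with equals rather than ==: the reference comparison only happens to work inside the cached range.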

Here we have analyzed the source code of three or four methods. The Integer class has many more utility methods that we don't have space to cover, so you can explore them yourself.

By now you should have a basic understanding of the relationship between wrapper types and primitive types. The next article will look at autoboxing and unboxing and at some classic interview questions.


All the code, images and files in this article are stored in the cloud on my GitHub:

(https://github.com/SingleYam/overview_java)

You are welcome to follow my WeChat official account, jump on the code of Gorky; all articles are published there as well.