Member variables
@Native public static final int MIN_VALUE = 0x80000000;
@Native public static final int MAX_VALUE = 0x7fffffff;
MIN_VALUE is the minimum value of an int, -2^31, and MAX_VALUE is the maximum value, 2^31-1. Adding 1 to the maximum would give 2^31, which does not fit in 32 bits, so the result overflows and wraps around to -2^31, the minimum value. In 32-bit int arithmetic the two member variables are therefore related by 0x80000000 = 0x7FFFFFFF + 1.
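The wrap-around between the two constants can be checked directly. The following is a minimal sketch; the class name OverflowDemo and the helper increment are made up for illustration.

```java
public class OverflowDemo {
    // Hypothetical helper: plain int addition, which wraps silently on overflow.
    static int increment(int x) {
        return x + 1;
    }

    public static void main(String[] args) {
        System.out.println(Integer.MAX_VALUE);                                 // 2147483647
        System.out.println(increment(Integer.MAX_VALUE));                      // -2147483648
        System.out.println(increment(Integer.MAX_VALUE) == Integer.MIN_VALUE); // true
    }
}
```

When silent wrap-around is not wanted, Math.addExact(int, int) throws an ArithmeticException instead.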
toString source code analysis
toString(int i, int radix)
toString(int i, int radix) first converts i to a negative number, performs the base conversion on that negative value, and finally decides whether to prepend a "-" sign.
How is an integer turned into characters?
Integer keeps a built-in char array that holds every digit character it can emit:
final static char[] digits = {
'0' , '1' , '2' , '3' , '4' , '5' ,
'6' , '7' , '8' , '9' , 'a' , 'b' ,
'c' , 'd' , 'e' , 'f' , 'g' , 'h' ,
'i' , 'j' , 'k' , 'l' , 'm' , 'n' ,
'o' , 'p' , 'q' , 'r' , 's' , 't' ,
'u' , 'v' , 'w' , 'x' , 'y' , 'z'
};
Because the digits live in this built-in array, each character is looked up by its array index, and the characters collected in a char[] are then used to construct the string. Integer computes the indices as follows:
while (i <= -radix) {
buf[charPos--] = digits[-(i % radix)];
i = i / radix;
}
buf[charPos] = digits[-i];
This is the familiar division-remainder base conversion algorithm, so it is easy to follow. Because i is negative at this point, i % radix is zero or negative, which is why the index is negated.
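To see the division-remainder loop in context, here is a minimal sketch of a complete conversion built around it. The class name RadixToString, the method toRadixString, and the DIGITS constant are illustrative names, and the sketch assumes 2 <= radix <= 36 without validating it.

```java
public class RadixToString {
    // Same 36 characters as the digits table shown above.
    private static final char[] DIGITS =
            "0123456789abcdefghijklmnopqrstuvwxyz".toCharArray();

    // Sketch only: assumes 2 <= radix <= 36.
    static String toRadixString(int i, int radix) {
        char[] buf = new char[33];   // up to 32 binary digits plus a sign
        boolean negative = i < 0;
        int charPos = 32;
        if (!negative) {
            i = -i;                  // work on negatives so MIN_VALUE needs no special case
        }
        while (i <= -radix) {
            buf[charPos--] = DIGITS[-(i % radix)];
            i = i / radix;
        }
        buf[charPos] = DIGITS[-i];
        if (negative) {
            buf[--charPos] = '-';
        }
        return new String(buf, charPos, 33 - charPos);
    }

    public static void main(String[] args) {
        System.out.println(toRadixString(255, 16));  // ff
        System.out.println(toRadixString(-10, 2));   // -1010
    }
}
```

Working on the negative value matters because -Integer.MIN_VALUE overflows, while every positive int has a valid negative counterpart.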
toString(int i)
As the figure shows, converting a decimal int to a string is relatively simple. The core of the process is the getChars() method. The part to focus on here is the green part of the figure, which computes the size of the char array.
To convert a base-10 int to a string, we first create a char[] to hold the converted characters. Integer computes the size of that array as follows:
int size = (i < 0) ? stringSize(-i) + 1 : stringSize(i);
// Requires positive x
static int stringSize(int x) {
for (int i=0; ; i++)
if (x <= sizeTable[i])
return i+1;
}
stringSize requires a positive x, so we first check whether i is negative. If it is, we pass its absolute value, stringSize(-i), and add 1 to the result because a negative number needs one extra character for the "-" sign.
So how does stringSize derive the number of digits from x? Each value made up entirely of 9s is the largest number with that many digits, so Integer uses the following table to determine the number of digits of x:
final static int [] sizeTable = { 9, 99, 999, 9999, 99999, 999999, 9999999, 99999999, 999999999, Integer.MAX_VALUE };
If x is 10000, stringSize walks the table until it reaches the first entry that is greater than or equal to x. That entry is 99999 at index 4, and adding 1 to the index gives 5, the number of digits in x.
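The table lookup can be tried in isolation. This is a minimal sketch; the class name StringSizeDemo and the helper charCount are illustrative, and charCount assumes i is not Integer.MIN_VALUE, because -Integer.MIN_VALUE overflows and that value needs separate handling.

```java
public class StringSizeDemo {
    static final int[] SIZE_TABLE = {
        9, 99, 999, 9999, 99999, 999999, 9999999,
        99999999, 999999999, Integer.MAX_VALUE
    };

    // Requires x >= 0: returns the number of decimal digits of x.
    static int stringSize(int x) {
        for (int i = 0; ; i++) {
            if (x <= SIZE_TABLE[i]) {
                return i + 1;
            }
        }
    }

    // Hypothetical helper: characters needed for i, including a '-' sign.
    // Assumes i != Integer.MIN_VALUE.
    static int charCount(int i) {
        return (i < 0) ? stringSize(-i) + 1 : stringSize(i);
    }

    public static void main(String[] args) {
        System.out.println(stringSize(10000)); // 5
        System.out.println(charCount(-123));   // 4
    }
}
```

The loop needs no bounds check because the last table entry is Integer.MAX_VALUE, which every int satisfies.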
getChars source code analysis
The core of the getChars method is the iterative generation of characters (the green nodes in the figure), described below. The first loop produces two characters per iteration:
// Generate two digits per iteration
while (i >= 65536) {
q = i / 100;
// really: r = i - (q * 100);
r = i - ((q << 6) + (q << 5) + (q << 2));
i = q;
buf [--charPos] = DigitOnes[r];
buf [--charPos] = DigitTens[r];
}
The logic of this code is as follows:
While i >= 65536, each iteration takes the last two digits of the number and divides i by 100. Take i = 567235474 as an example:
- The last two digits, 7 and 4, go into the buf array: i = 5672354, buf = {..., '7', '4'};
- The next two digits, 5 and 4, go into the buf array: i = 56723, buf = {..., '5', '4', '7', '4'};
- i is now less than 65536, so the loop exits.
The expression r = i - ((q << 6) + (q << 5) + (q << 2)) is equivalent to r = i - (q * 100). It is written with shifts because shift operations were considered cheaper than multiplication. The derivation is:
- r = i - (q * 100)
- r = i - (q * 64 + q * 32 + q * 4)
- r = i - ((q << 6) + (q << 5) + (q << 2))
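The derivation above can be confirmed with a few lines of code. This is a minimal sketch; ShiftMultiply, times100 and splitLastTwoDigits are illustrative names, not JDK methods.

```java
public class ShiftMultiply {
    // q * 100 written as q * 64 + q * 32 + q * 4.
    static int times100(int q) {
        return (q << 6) + (q << 5) + (q << 2);
    }

    // Returns {quotient, remainder} of i divided by 100, for i >= 0.
    static int[] splitLastTwoDigits(int i) {
        int q = i / 100;
        int r = i - times100(q);
        return new int[] {q, r};
    }

    public static void main(String[] args) {
        int[] qr = splitLastTwoDigits(567235474);
        System.out.println(qr[0]); // 5672354
        System.out.println(qr[1]); // 74
    }
}
```

The remainder 74 is then used as an index into the DigitOnes and DigitTens tables to fetch '4' and '7'.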
After leaving that loop, the code falls through to a second loop that generates one character per iteration:
// Fall thru to fast mode for smaller numbers
// assert(i <= 65536, i);
for (;;) {
q = (i * 52429) >>> (16+3);
r = i - ((q << 3) + (q << 1)); // r = i-(q*10) ...
buf [--charPos] = digits [r];
i = q;
if (i == 0) break;
}
Once i < 65536, each iteration takes one digit and divides i by 10. Continuing the example above with i = 56723:
- The last digit, 3, goes into the buf array: i = 5672, buf = {..., '3', '5', '4', '7', '4'}; the loop repeats until i == 0.
Here q = (i * 52429) >>> (16+3) is equivalent to q = i / 10, because 52429 / 2^19 is approximately 0.1000004, which is close enough to 1/10 for values of i in this range. Likewise, r = i - ((q << 3) + (q << 1)) is equivalent to r = i - (q * 10).
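The claim that the multiply-and-shift expression equals division by 10 can be checked exhaustively over the range the loop uses. This is a minimal sketch; MagicDivide, divideBy10 and matchesDivision are illustrative names.

```java
public class MagicDivide {
    // Multiply by 52429 and shift right by 19 bits (2^19 = 524288).
    // The product may exceed Integer.MAX_VALUE; >>> treats the bits as unsigned.
    static int divideBy10(int i) {
        return (i * 52429) >>> (16 + 3);
    }

    // Compares the trick against real division for every i in [0, 65536].
    static boolean matchesDivision() {
        for (int i = 0; i <= 65536; i++) {
            if (divideBy10(i) != i / 10) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(divideBy10(56723));  // 5672
        System.out.println(matchesDivision());  // true
    }
}
```

The unsigned shift is essential: for larger inputs in this range the 32-bit product is negative when read as a signed int.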
toUnsignedLong source code analysis
toUnsignedLong converts an int to a long by treating its 32 bits as an unsigned value.
public static long toUnsignedLong(int x) {
return ((long) x) & 0xffffffffL;
}
Converting an int to an unsigned long has to consider both non-negative and negative inputs. A non-negative value is unchanged. A negative value becomes the value plus 2^32. Let's verify this with the negative number -2:
(long) -2, low 32 bits (two's complement): 1111 1111 1111 1111 1111 1111 1111 1110
0xffffffffL, low 32 bits: 1111 1111 1111 1111 1111 1111 1111 1111
& result, low 32 bits: 1111 1111 1111 1111 1111 1111 1111 1110
Decimal: 4294967294 = 2^32 - 2
The cast to long sign-extends the value, so the upper 32 bits of (long) -2 are all 1s. The upper 32 bits of the mask 0xffffffffL are all 0s, so the & clears them and keeps only the low 32 bits.
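The same check can be run in code. This is a minimal sketch that copies the one-line method into an illustrative class, UnsignedLongDemo, and compares it with the JDK's Integer.toUnsignedLong.

```java
public class UnsignedLongDemo {
    // Sign-extend to 64 bits, then keep only the low 32 bits.
    static long toUnsignedLong(int x) {
        return ((long) x) & 0xffffffffL;
    }

    public static void main(String[] args) {
        System.out.println(toUnsignedLong(5));   // 5
        System.out.println(toUnsignedLong(-2));  // 4294967294
        System.out.println(toUnsignedLong(-1));  // 4294967295
        System.out.println(toUnsignedLong(-2) == Integer.toUnsignedLong(-2)); // true
    }
}
```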
toUnsignedString0 source code analysis
toUnsignedString0 converts an int to its unsigned string representation in base 2^shift.
/** Convert the integer to an unsigned number. */
private static String toUnsignedString0(int val, int shift) {
// assert shift > 0 && shift <=5 : "Illegal shift value";
int mag = Integer.SIZE - Integer.numberOfLeadingZeros(val);
int chars = Math.max(((mag + (shift - 1)) / shift), 1);
char[] buf = new char[chars];
formatUnsignedInt(val, shift, buf, 0, chars);
// Use special constructor which takes over "buf".
return new String(buf, true);
}
The parameter val of toUnsignedString0 is the value to convert. Integer.numberOfLeadingZeros tells us how many 0 bits precede the highest 1 bit in the binary representation of val (counting from the left).
For example, 10 in binary is 1010, so as a 32-bit int it is 0000 0000 0000 0000 0000 0000 0000 1010, and Integer.numberOfLeadingZeros returns 28. Integer.SIZE - 28 therefore gives a mag of 4.
The size of the char array depends on shift: shift=1 converts val to binary, shift=3 converts val to octal, and shift=4 converts val to hexadecimal.
For example, 10 is 1010 in binary, 12 in octal, and a in hexadecimal. So the char array size should be 4 when shift is 1, 2 when shift is 3, and 1 when shift is 4. That is exactly what Math.max(((mag + (shift - 1)) / shift), 1) computes: it divides mag by shift, rounding up, and never returns less than 1.
Finally, formatUnsignedInt fills the char[] with the digits of val, and new String turns the array into the result.
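The size calculation can be checked against the lengths of the strings the JDK actually produces. This is a minimal sketch; UnsignedWidth and charCount are illustrative names.

```java
public class UnsignedWidth {
    // Number of base-(2^shift) digits needed for val, read as unsigned.
    static int charCount(int val, int shift) {
        int mag = Integer.SIZE - Integer.numberOfLeadingZeros(val);
        return Math.max((mag + (shift - 1)) / shift, 1);
    }

    public static void main(String[] args) {
        System.out.println(charCount(10, 1)); // 4  ("1010")
        System.out.println(charCount(10, 3)); // 2  ("12")
        System.out.println(charCount(10, 4)); // 1  ("a")
        System.out.println(charCount(0, 4));  // 1  ("0")
    }
}
```

The Math.max call covers val == 0, where mag is 0 but one character is still needed for "0".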
numberOfLeadingZeros source code analysis
From the toUnsignedString0 analysis above, we know that passing an int to numberOfLeadingZeros returns the number of leading zero bits of that int. The source code is as follows:
public static int numberOfLeadingZeros(int i) {
// HD, Figure 5-6
if (i == 0)
return 32;
int n = 1;
if (i >>> 16 == 0) { n += 16; i <<= 16; }
if (i >>> 24 == 0) { n += 8; i <<= 8; }
if (i >>> 28 == 0) { n += 4; i <<= 4; }
if (i >>> 30 == 0) { n += 2; i <<= 2; }
n -= i >>> 31;
return n;
}
This algorithm uses shifts to count the zeros. Let's take 10 as an example:
i = 10 (high 16 bits | low 16 bits)
i        0000 0000 0000 0000 | 0000 0000 0000 1010
i >>> 16 0000 0000 0000 0000 0000 0000 0000 0000  satisfies i >>> 16 == 0, n = 17
i <<= 16 0000 0000 0000 1010 | 0000 0000 0000 0000
i >>> 24 0000 0000 0000 0000 0000 0000 0000 0000  satisfies i >>> 24 == 0, n = 25
i <<= 8  0000 1010 | 0000 0000 0000 0000 0000 0000
i >>> 28 0000 0000 0000 0000 0000 0000 0000 0000  satisfies i >>> 28 == 0, n = 29
i <<= 4  1010 | 0000 0000 0000 0000 0000 0000 0000
i >>> 30 0000 0000 0000 0000 0000 0000 0000 0010  does not satisfy i >>> 30 == 0
i >>> 31 0000 0000 0000 0000 0000 0000 0000 0001  i >>> 31 = 1
n = n - (i >>> 31) = 29 - 1 = 28
The example shows that the algorithm is an application of binary search: each step halves the range that can still contain the highest 1 bit, so the bits never have to be checked one by one, and the performance is very high.
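To experiment with the trace above, the method can be copied into a standalone class and compared with the JDK's own Integer.numberOfLeadingZeros. This is a minimal sketch; the class name LeadingZeros is illustrative.

```java
public class LeadingZeros {
    // Binary search for the highest 1 bit, following the source shown above.
    static int numberOfLeadingZeros(int i) {
        if (i == 0) {
            return 32;
        }
        int n = 1;
        if (i >>> 16 == 0) { n += 16; i <<= 16; }
        if (i >>> 24 == 0) { n += 8;  i <<= 8;  }
        if (i >>> 28 == 0) { n += 4;  i <<= 4;  }
        if (i >>> 30 == 0) { n += 2;  i <<= 2;  }
        n -= i >>> 31;   // subtract 1 if the top bit is now set
        return n;
    }

    public static void main(String[] args) {
        System.out.println(numberOfLeadingZeros(10)); // 28
        System.out.println(numberOfLeadingZeros(1));  // 31
        System.out.println(numberOfLeadingZeros(-1)); // 0
    }
}
```

Starting n at 1 and subtracting the top bit at the end saves a fifth comparison.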
formatUnsignedInt source code analysis
From the toString(int i, int radix) analysis, we know that converting to another base there uses the division-remainder method. This section presents another conversion method, which is the main algorithm of formatUnsignedInt. The source code is as follows:
static int formatUnsignedInt(int val, int shift, char[] buf, int offset, int len) {
int charPos = len;
int radix = 1 << shift;
int mask = radix - 1;
// Packet conversion processing
do {
buf[offset + --charPos] = Integer.digits[val & mask];
val >>>= shift;
} while (val != 0 && charPos > 0);
return charPos;
}
formatUnsignedInt writes an int, interpreted as unsigned, into a char array. The radix is 1 << shift; for example, when shift is 1 the radix is 2, which means binary.
Before analyzing the source code, let's walk through an example that converts a decimal number to hexadecimal using shifts, taking 170 (decimal):
1. First convert 170 to binary: 1010 1010
2. Hexadecimal corresponds to 2^4, so split the binary digits into groups of 4 bits, starting from the low 4 bits, and convert each group to hexadecimal:
Binary: 1010 | 1010
Hexadecimal: a | a
3. Joining the groups gives aa, which is 170 in hexadecimal.
This approach converts the value group by group and joins the results at the end. It avoids division entirely, so it performs better than the division-remainder algorithm.
Now let's analyze the source code:
- int radix = 1 << shift; derives the radix from shift. For example, shift = 4 gives a radix of 16.
- int mask = radix - 1; derives the group mask from the radix, matching step 2 of the example. If the radix is 16 (hexadecimal), the mask is 15, which is 1111 in binary: exactly 4 bits, matching the 4 in 2^4, with every bit set to 1, which is what a mask needs.
- With the mask in hand, the group conversion can begin. val & mask extracts the value of the low 4 bits (assuming the mask is 1111). That value is the index into the Integer.digits array, and the character at that index is the digit. Example:
170 in binary: 1010 1010
Mask: 0000 1111
& result: 0000 1010
In decimal: 10, which corresponds to a in Integer.digits
- val >>>= shift; moves the next group down into the low bits, and the previous step repeats, which gives the group-by-group conversion loop.
Using this method, you get the char[] for the corresponding base.
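The mask-and-shift loop can be combined with the size calculation into a small standalone converter. This is a minimal sketch, not the JDK code itself: ShiftFormat, toUnsignedString and DIGITS are illustrative names, and it assumes 1 <= shift <= 5.

```java
public class ShiftFormat {
    private static final char[] DIGITS =
            "0123456789abcdefghijklmnopqrstuvwxyz".toCharArray();

    // Sketch only: assumes 1 <= shift <= 5 (radix 2 to 32).
    static String toUnsignedString(int val, int shift) {
        int mag = Integer.SIZE - Integer.numberOfLeadingZeros(val);
        int chars = Math.max((mag + (shift - 1)) / shift, 1);
        char[] buf = new char[chars];
        int charPos = chars;
        int mask = (1 << shift) - 1;
        do {
            buf[--charPos] = DIGITS[val & mask]; // low group selects the digit
            val >>>= shift;                      // bring the next group down
        } while (val != 0 && charPos > 0);
        return new String(buf);
    }

    public static void main(String[] args) {
        System.out.println(toUnsignedString(170, 4)); // aa
        System.out.println(toUnsignedString(10, 1));  // 1010
        System.out.println(toUnsignedString(-1, 4));  // ffffffff
    }
}
```

The unsigned shift >>>= is what makes negative inputs terminate: a signed shift would keep filling the top bits with 1s.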
parseInt source code analysis
The core of parseInt is its handling of the leading sign character. The source code is as follows:
char firstChar = s.charAt(0);
if (firstChar < '0') { // Possible leading "+" or "-"
if (firstChar == '-') {
negative = true;
limit = Integer.MIN_VALUE;
} else if (firstChar != '+')
throw NumberFormatException.forInputString(s);
if (len == 1) // Cannot have lone "+" or "-"
throw NumberFormatException.forInputString(s);
i++;
}
The most important local variable is limit. Its default value is -Integer.MAX_VALUE.
char firstChar = s.charAt(0); reads the first character. If it is less than '0', it may be something other than a digit, so firstChar == '-' checks for a negative number; if it matches, the negative flag is set and limit is updated to Integer.MIN_VALUE. If the character is neither "-" nor "+", or if the string consists only of the sign (length 1), an exception is thrown.
Finally comes the conversion of the digits themselves. The source code is as follows:
multmin = limit / radix;
while (i < len) {
// Accumulating negatively avoids surprises near MAX_VALUE
digit = Character.digit(s.charAt(i++),radix);
if (digit < 0) {
throw NumberFormatException.forInputString(s);
}
if (result < multmin) {
throw NumberFormatException.forInputString(s);
}
result *= radix;
if (result < limit + digit) {
throw NumberFormatException.forInputString(s);
}
result -= digit;
}
parseInt converts each character with Character.digit(s.charAt(i++), radix) and multiplies the running result by radix, so it needs a bound to tell whether that multiplication would go out of range. multmin = limit / radix is that bound: if result is already less than multmin, then result *= radix would overflow, so the method throws NumberFormatException before performing the multiplication.
That check guards the multiplication. The following step, result -= digit, needs its own check that the new result does not go past limit. The condition result - digit < limit is written as result < limit + digit so that the comparison itself cannot overflow.
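Putting the sign handling and the two overflow checks together gives the following minimal sketch of negative accumulation. NegativeAccumulateParse and parse are illustrative names. The sketch throws a plain NumberFormatException because NumberFormatException.forInputString is not accessible outside the JDK, and it assumes a valid radix.

```java
public class NegativeAccumulateParse {
    // Sketch only: assumes Character.MIN_RADIX <= radix <= Character.MAX_RADIX.
    static int parse(String s, int radix) {
        if (s == null || s.isEmpty()) {
            throw new NumberFormatException("empty input");
        }
        int result = 0;
        boolean negative = false;
        int i = 0;
        int len = s.length();
        int limit = -Integer.MAX_VALUE;
        char firstChar = s.charAt(0);
        if (firstChar < '0') {               // possible leading "+" or "-"
            if (firstChar == '-') {
                negative = true;
                limit = Integer.MIN_VALUE;
            } else if (firstChar != '+') {
                throw new NumberFormatException(s);
            }
            if (len == 1) {                  // lone "+" or "-"
                throw new NumberFormatException(s);
            }
            i++;
        }
        int multmin = limit / radix;
        while (i < len) {
            int digit = Character.digit(s.charAt(i++), radix);
            if (digit < 0 || result < multmin) {
                throw new NumberFormatException(s);
            }
            result *= radix;
            if (result < limit + digit) {    // result - digit would pass limit
                throw new NumberFormatException(s);
            }
            result -= digit;                 // accumulate as a negative number
        }
        return negative ? result : -result;
    }

    public static void main(String[] args) {
        System.out.println(parse("-2147483648", 10)); // -2147483648
        System.out.println(parse("+7f", 16));         // 127
    }
}
```

Accumulating negatively lets the same loop handle "-2147483648", whose absolute value does not fit in a positive int.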