Some tips from the JDK source code

All of them are taken from the JDK source.

1 i++ vs i--

Line 985 of the String source, in equals():

while (n-- != 0) {
    if (v1[i] != v2[i])
        return false;
    i++;
}

This code checks whether two strings are equal, but one detail looks odd: the loop condition is n-- != 0. Do we not usually write i++ and test i < n? Why count down, when the number of iterations is the same? The reason is that counting up costs one extra instruction after compilation:

The i-- operation itself updates the CPSR (current program status register, as on ARM). The common CPSR flags are N (negative result), Z (zero result), C (carry) and V (overflow). Whether i has reached 0 can be read directly from the Z flag. The i++ operation also updates the CPSR, but the only flag it can usefully set is overflow, which is no help in deciding whether i < n. So an extra compare instruction is needed, that is, one more instruction per iteration.

In short, comparing against 0 takes one instruction fewer. So loop with i-- where you can: it looks polished and classy.
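To make the comparison concrete, here is a minimal sketch of the two loop shapes side by side. The class and method names are made up for illustration and are not from the JDK. Both methods return the same result; whether the count-down form really saves an instruction depends on the JIT compiler and the CPU, so treat the saving as something to measure rather than assume.

```java
// Hypothetical demo class; the names are not from the JDK.
class CountdownLoopDemo {
    // Counts up: every iteration compares i against n.
    static boolean equalsCountingUp(char[] v1, char[] v2) {
        if (v1.length != v2.length) return false;
        int n = v1.length;
        for (int i = 0; i < n; i++) {
            if (v1[i] != v2[i]) return false;
        }
        return true;
    }

    // Counts down in the style of String.equals: the loop test compares with 0.
    static boolean equalsCountingDown(char[] v1, char[] v2) {
        if (v1.length != v2.length) return false;
        int n = v1.length;
        int i = 0;
        while (n-- != 0) {
            if (v1[i] != v2[i]) return false;
            i++;
        }
        return true;
    }

    public static void main(String[] args) {
        char[] a = "hello".toCharArray();
        char[] b = "hello".toCharArray();
        char[] c = "help!".toCharArray();
        System.out.println(equalsCountingUp(a, b) + " " + equalsCountingDown(a, b)); // true true
        System.out.println(equalsCountingUp(a, c) + " " + equalsCountingDown(a, c)); // false false
    }
}
```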

2 Member variables vs local variables

In almost every method, the JDK source copies a member variable into a local variable before using it, for example:

public int compareTo(String anotherString) {
    int len1 = value.length;
    int len2 = anotherString.value.length;

Local variables live on the thread's method stack (and often end up in registers), while member variables live in heap memory, so reading a local is generally faster. Inside a method, then, avoid working on a member variable directly: copy it into a local variable once and use the local from there on.
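A minimal sketch of the pattern, using a made-up class (FieldToLocalDemo and countOf are illustrative names, not JDK code). The field is read once into a local, and the loop only touches the local:

```java
// Hypothetical example of the "copy the field into a local" idiom.
class FieldToLocalDemo {
    private final char[] value;

    FieldToLocalDemo(String s) {
        this.value = s.toCharArray();
    }

    // Counts occurrences of target, reading the field exactly once.
    int countOf(char target) {
        final char[] v = value;               // single field read
        int count = 0;
        for (int i = 0, len = v.length; i < len; i++) {
            if (v[i] == target) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(new FieldToLocalDemo("banana").countOf('a')); // 3
    }
}
```

A modern JIT may perform this hoisting itself for final fields, so the gain is not guaranteed; the idiom matters most when the field is not final or is volatile.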

3 Deliberately preload the cache && keep time-consuming operations outside the lock

In ConcurrentHashMap (the segment-based implementation), the way a segment is locked is interesting. put() does not simply block on the lock; it behaves like a spin lock and calls tryLock() repeatedly. While it is trying, it traverses the linked list so that the entries are pulled into the CPU cache ahead of time, which avoids a cold traversal while the lock is held. The new node is also created outside the lock, so that time-consuming work does not happen inside it.

final V put(K key, int hash, V value, boolean onlyIfAbsent) {
    // Take the segment's exclusive lock before writing. If the first
    // tryLock() fails, do not block right away: scan the list while retrying.
    HashEntry<K,V> node = tryLock() ? null :
        scanAndLockForPut(key, hash, value);

The source of scanAndLockForPut():

private HashEntry<K,V> scanAndLockForPut(K key, int hash, V value) {
    HashEntry<K,V> first = entryForHash(this, hash);
    HashEntry<K,V> e = first;
    HashEntry<K,V> node = null;
    int retries = -1; // negative while locating node
    // keep trying to acquire the lock
    while (!tryLock()) {
        HashEntry<K,V> f; // to recheck first below
        if (retries < 0) {
            if (e == null) {
                if (node == null) // speculatively create node
                    node = new HashEntry<K,V>(hash, key, value, null);
                retries = 0;
            }
            else if (key.equals(e.key))
                retries = 0;
            else
                // walk the list; this pulls the entries into the CPU cache
                e = e.next;
        }
        // Once retries >= 0 this is a spin lock. After more than
        // MAX_SCAN_RETRIES attempts (1 on a single core, 64 on multiple
        // cores) stop spinning; lock() blocks until the lock is acquired.
        else if (++retries > MAX_SCAN_RETRIES) {
            lock();
            break;
        }
        // The head changed: a new entry was inserted at the front of the
        // list, so start the scan over.
        else if ((retries & 1) == 0 &&
                 (f = entryForHash(this, hash)) != first) {
            e = first = f; // re-traverse if entry changed
            retries = -1;
        }
    }
    return node;
}
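The same idea can be sketched outside ConcurrentHashMap. The class below is a made-up, much simplified illustration built on java.util.concurrent.locks.ReentrantLock: it spins on tryLock(), uses the waiting time to allocate the node, and falls back to a blocking lock() after a bounded number of attempts. SpinThenLockList, MAX_SPINS and demo are assumed names, and the value 64 only mirrors the multi-core setting mentioned above.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of "spin on tryLock, allocate outside the lock".
class SpinThenLockList<T> {
    private static final int MAX_SPINS = 64; // assumed bound for this sketch

    private static final class Node<T> {
        final T item;
        Node<T> next;
        Node(T item) { this.item = item; }
    }

    private final ReentrantLock lock = new ReentrantLock();
    private Node<T> head;
    private int size;

    void push(T item) {
        Node<T> node = null;
        int spins = 0;
        // Spin on tryLock(); use the waiting time to allocate the node.
        while (!lock.tryLock()) {
            if (node == null) {
                node = new Node<>(item);      // speculative allocation outside the lock
            } else if (++spins > MAX_SPINS) {
                lock.lock();                  // give up spinning and block
                break;
            }
        }
        try {
            if (node == null) node = new Node<>(item); // lock acquired on the first try
            node.next = head;
            head = node;
            size++;
        } finally {
            lock.unlock();
        }
    }

    int size() {
        lock.lock();
        try { return size; } finally { lock.unlock(); }
    }

    // Pushes from several threads at once and returns the final size.
    static int demo(int threads, int perThread) {
        SpinThenLockList<Integer> list = new SpinThenLockList<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) list.push(i);
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(ex);
            }
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 1000)); // 4000
    }
}
```

Only the pointer updates and the counter increment happen under the lock; the allocation happens while waiting whenever the lock is contended.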

4 Use == first when checking whether objects are equal

When checking whether two objects are equal, try == first: it compares the references directly and is very fast, whereas equals() usually compares values and is relatively slow. So where possible, compare objects with a == b || a.equals(b).
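A small sketch of the idiom follows. Note that a == b || a.equals(b) throws a NullPointerException when a is null and b is not, so the helper below adds a null check; java.util.Objects.equals(a, b) behaves the same way. IdentityFirstEquals and sameOrEqual are illustrative names.

```java
import java.util.Objects;

// Hypothetical helper showing the "identity first, then equals" check.
class IdentityFirstEquals {
    static boolean sameOrEqual(Object a, Object b) {
        // Cheap reference comparison first; fall back to equals() only if needed.
        return a == b || (a != null && a.equals(b));
    }

    public static void main(String[] args) {
        String x = new String("jdk");
        String y = new String("jdk");
        System.out.println(x == y);                // false: two distinct objects
        System.out.println(sameOrEqual(x, y));     // true: equal by value
        System.out.println(sameOrEqual(x, x));     // true: short-circuits on ==
        System.out.println(sameOrEqual(null, y));  // false, and no exception
        System.out.println(Objects.equals(x, y));  // true: the library equivalent
    }
}
```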

5 About transient

transient keeps a field out of serialization, yet the internal array in the HashMap source is declared transient:

/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry<K,V>[] table = (Entry<K,V>[]) EMPTY_TABLE;

Does that mean the key-value pairs inside cannot be serialized, so a HashMap cannot be sent over the network?

Effective Java, 2nd edition, Item 75 (Joshua Bloch) explains:

For example, consider the case of a hash table. The physical representation is a sequence of hash buckets containing key-value entries. The bucket that an entry resides in is a function of the hash code of its key, which is not, in general, guaranteed to be the same from JVM implementation to JVM implementation. In fact, it isn’t even guaranteed to be the same from run to run. Therefore, accepting the default serialized form for a hash table would constitute a serious bug. Serializing and deserializing the hash table could yield an object whose invariants were seriously corrupt.

How should we understand this? Look at HashMap.get()/put(): reading or writing the map picks the bucket based on the key's hashCode(). Object.hashCode() is a native method and may return different values on different JVMs.

For example, put an entry into a HashMap whose key is an object named Fang, of a class that inherits Object.hashCode() (String overrides hashCode() with a value computed from its characters, so a String key would not show the effect). In the first Java program, Fang's hashCode() is 1, so the entry is stored in table[1]. In a second JVM, Fang's hashCode() might be 2. With default serialization (Entry[] table not transient), the second program would get back exactly the bucket layout of the first, which is wrong: a lookup there goes to table[2] to fetch the value and finds nothing.

This is why HashMap implements readObject and writeObject itself: it writes and reads the contents entry by entry and rebuilds the table when reading.
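The technique can be sketched with a toy hash set. This is not HashMap's actual code: TinyHashSet, BUCKETS and roundTrip are made-up names, and the set assumes non-null, Serializable keys. The bucket array is transient, writeObject emits only the logical contents, and readObject re-inserts each key so that it lands in the bucket that matches its hashCode() in the receiving JVM.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical toy set illustrating a custom serialized form.
class TinyHashSet implements Serializable {
    private static final long serialVersionUID = 1L;
    private static final int BUCKETS = 16; // power of two, so we can mask

    private static final class Node {
        final Object key;
        final Node next;
        Node(Object key, Node next) { this.key = key; this.next = next; }
    }

    private transient Node[] table = new Node[BUCKETS]; // physical layout: never serialized
    private transient int size;

    private static int indexFor(Object key) {
        return key.hashCode() & (BUCKETS - 1);
    }

    boolean add(Object key) {
        int i = indexFor(key);
        for (Node n = table[i]; n != null; n = n.next) {
            if (n.key.equals(key)) return false;
        }
        table[i] = new Node(key, table[i]);
        size++;
        return true;
    }

    boolean contains(Object key) {
        for (Node n = table[indexFor(key)]; n != null; n = n.next) {
            if (n.key.equals(key)) return true;
        }
        return false;
    }

    int size() { return size; }

    // Write the logical contents only: the count, then each key.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(size);
        for (Node head : table) {
            for (Node n = head; n != null; n = n.next) {
                out.writeObject(n.key);
            }
        }
    }

    // Rebuild the buckets from scratch using the hash codes of this JVM.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        table = new Node[BUCKETS]; // field initializers do not run on deserialization
        int count = in.readInt();
        for (int k = 0; k < count; k++) {
            add(in.readObject());
        }
    }

    // Serializes the set to a byte array and reads it back.
    static TinyHashSet roundTrip(TinyHashSet set) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(set);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (TinyHashSet) in.readObject();
            }
        } catch (IOException | ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        TinyHashSet set = new TinyHashSet();
        set.add("Fang");
        set.add(42);
        TinyHashSet copy = roundTrip(set);
        System.out.println(copy.contains("Fang") + " " + copy.size()); // true 2
    }
}
```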

6 Don't use char

A Java char is a UTF-16 code unit: two bytes, which cannot represent every character. The characters that fit in two bytes form the BMP (Basic Multilingual Plane); every other character is encoded as a high surrogate followed by a low surrogate, four bytes in total. For example, indexOf in the String source:

public int indexOf(int ch, int fromIndex) {
    final int max = value.length;
    if (fromIndex < 0) {
        fromIndex = 0;
    } else if (fromIndex >= max) {
        // Note: fromIndex might be near -1>>>1.
        return -1;
    }

    if (ch < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
        // handle most cases here (ch is a BMP code point or a
        // negative value (invalid code point))
        final char[] value = this.value;
        for (int i = fromIndex; i < max; i++) {
            if (value[i] == ch) {
                return i;
            }
        }
        return -1;
    } else {
        return indexOfSupplementary(ch, fromIndex);
    }
}

So a single Java char can only represent the BMP characters of UTF-16. Part of the CJK (Chinese, Japanese and Korean Unified Ideographs) extended character sets cannot be represented.

For example, among the CJK extension blocks (shown in the original figure), only Extension A can be represented by a single char.
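A short sketch makes the difference visible. It builds one character from CJK Extension A (U+3400, inside the BMP) and one from Extension B (U+20000, outside it) and compares char counts with code point counts. SupplementaryCharDemo and fromCodePoint are illustrative names; the methods called are standard String and Character APIs.

```java
// Hypothetical demo of BMP versus supplementary characters.
class SupplementaryCharDemo {
    static final int EXT_A = 0x3400;  // first code point of CJK Extension A (BMP)
    static final int EXT_B = 0x20000; // first code point of CJK Extension B (supplementary)

    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String extA = fromCodePoint(EXT_A);
        String extB = fromCodePoint(EXT_B);
        System.out.println(extA.length());                             // 1: fits in one char
        System.out.println(extB.length());                             // 2: high + low surrogate
        System.out.println(extB.codePointCount(0, extB.length()));     // 1: still one character
        System.out.println(Character.isHighSurrogate(extB.charAt(0))); // true
        System.out.println(("x" + extB).indexOf(EXT_B));               // 1: found via the int overload
    }
}
```

This is why indexOf takes an int rather than a char: a supplementary code point does not fit in a char, so it has to be passed as an int.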

A related piece of advice is to keep passwords in a char[] instead of a String. A String is immutable, and a password held in one may sit in the constant pool. If another process dumps this process's memory, the password is dumped along with the pool. A char[] can be overwritten with other data after use, so even if memory is dumped the risk of leaking the password is lower.

But I wonder: if someone can dump your memory, can a char[] really stop them? It only helps when the String has not yet been collected from the constant pool and is read straight out of it by another thread, which is probably very rare.
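For reference, the char[] advice usually looks like the sketch below: use the password, then overwrite the array. PasswordWipeDemo and checkAndWipe are made-up names, and comparing against a plain expected array is only for illustration; a real system would verify a salted hash. The standard java.io.Console.readPassword() method returns a char[] for this same reason.

```java
import java.util.Arrays;

// Hypothetical sketch of wiping a password array after use.
class PasswordWipeDemo {
    // Compares the password and overwrites it whatever the outcome.
    static boolean checkAndWipe(char[] password, char[] expected) {
        try {
            return Arrays.equals(password, expected);
        } finally {
            Arrays.fill(password, '\0'); // the caller's array no longer holds the secret
        }
    }

    public static void main(String[] args) {
        char[] typed = {'s', '3', 'c', 'r', 'e', 't'};
        char[] stored = {'s', '3', 'c', 'r', 'e', 't'};
        System.out.println(checkAndWipe(typed, stored)); // true
        System.out.println(typed[0] == '\0');            // true: wiped
    }
}
```

The wipe only shortens the window during which the secret sits in memory; as the paragraph above argues, it is no defence against someone who can dump memory at the right moment.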

