Original: Taste of Little Sister (WeChat official account: XjjDog). Feel free to share; please credit the source.

The phenomenon

If you have read the HashMap source code carefully, you have probably noticed something odd: HashMap declares two private methods.

private void writeObject(java.io.ObjectOutputStream s) throws IOException
private void readObject(java.io.ObjectInputStream s) throws IOException, ClassNotFoundException

These two methods have two things in common:

  1. Both are private
  2. Although they are private, nothing inside HashMap calls them
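Both observations can be checked with reflection. The following is a minimal sketch; the class name PrivateHooks and the helper isPrivateHook are made up for this illustration. It only inspects the method modifiers and does not try to call the methods.

```java
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.HashMap;

public class PrivateHooks {

    // Returns true if HashMap declares a method with the given name and
    // stream parameter type, and that method is private.
    static boolean isPrivateHook(String name, Class<?> streamType)
            throws NoSuchMethodException {
        Method m = HashMap.class.getDeclaredMethod(name, streamType);
        return Modifier.isPrivate(m.getModifiers());
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isPrivateHook("writeObject", ObjectOutputStream.class));
        System.out.println(isPrivateHook("readObject", ObjectInputStream.class));
    }
}
```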

Questions

What do these two methods do?

Why are they private?

Answers

What are the writeObject and readObject methods in HashMap for?

Answer: Both readObject and writeObject exist to support the serialization of HashMap.
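To see where these methods come into play, here is a small sketch that serializes a HashMap to a byte array and reads it back. The class name HashMapRoundTrip and the helper roundTrip are hypothetical names chosen for this example. The code never calls HashMap's private methods directly; the serialization engine does that on our behalf.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Map;

public class HashMapRoundTrip {

    // Writes the map to an in-memory buffer and reads a copy back from it.
    @SuppressWarnings("unchecked")
    static Map<String, Integer> roundTrip(Map<String, Integer> map)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(map);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Map<String, Integer>) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Map<String, Integer> original = new HashMap<>();
        original.put("AAA", 1);
        original.put("BBB", 2);

        Map<String, Integer> copy = roundTrip(original);
        System.out.println(copy.equals(original)); // prints "true"
    }
}
```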

First, HashMap implements the Serializable interface, which means instances of the class can be serialized. The JDK provides ObjectOutputStream for serializing Java objects and ObjectInputStream for deserializing them. Consider serialization with ObjectOutputStream: it provides different write methods for different types, such as writeBoolean, writeInt, and writeLong for primitive values, and writeObject for objects, including custom types. The writeObject method of ObjectOutputStream eventually calls the following method:

private void writeSerialData(Object obj, ObjectStreamClass desc)
    throws IOException
{
    ObjectStreamClass.ClassDataSlot[] slots = desc.getClassDataLayout();
    for (int i = 0; i < slots.length; i++) {
        ObjectStreamClass slotDesc = slots[i].desc;
        if (slotDesc.hasWriteObjectMethod()) { // the class declares its own writeObject method
            PutFieldImpl oldPut = curPut;
            curPut = null;
            SerialCallbackContext oldContext = curContext;

            try {
                curContext = new SerialCallbackContext(obj, slotDesc);
                bout.setBlockDataMode(true);
                slotDesc.invokeWriteObject(obj, this); // call the class's writeObject method
                bout.setBlockDataMode(false);
                bout.writeByte(TC_ENDBLOCKDATA);
            } finally {
                // omitted
            }

            curPut = oldPut;
        } else {
            defaultWriteFields(obj, slotDesc);
        }
    }
}

In other words:

If the class declares its own writeObject method, that method is called. Otherwise, the default serialization method is used.
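This dispatch can be observed from the outside. The sketch below uses two made-up classes, WithHook and WithoutHook, and a demo-only static counter that records how often the custom writeObject runs. It is an illustration of the behavior, not code from the JDK.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class HookDispatch {

    // Declares its own private writeObject, so the stream invokes it.
    static class WithHook implements Serializable {
        private static final long serialVersionUID = 1L;
        static int calls = 0; // demo-only counter; static fields are not serialized
        int value = 1;

        private void writeObject(ObjectOutputStream s) throws IOException {
            calls++;
            s.defaultWriteObject(); // still write the non-transient fields
        }
    }

    // Declares no writeObject, so the default field serialization is used.
    static class WithoutHook implements Serializable {
        private static final long serialVersionUID = 1L;
        int value = 2;
    }

    // Serializes the object into a throwaway in-memory buffer.
    static void serialize(Object o) throws IOException {
        try (ObjectOutputStream out =
                new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
        }
    }

    public static void main(String[] args) throws IOException {
        serialize(new WithoutHook());
        System.out.println(WithHook.calls); // prints "0"
        serialize(new WithHook());
        System.out.println(WithHook.calls); // prints "1"
    }
}
```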

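The call chain is as follows: ObjectOutputStream.writeObject -> writeObject0 -> writeOrdinaryObject -> writeSerialData -> ObjectStreamClass.invokeWriteObject -> HashMap.writeObject (invoked by reflection).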

Why are readObject and writeObject private in HashMap?

The JDK documentation does not explain why these methods are private. A private method cannot be overridden by a subclass. What is the advantage of that? If I write a class that extends HashMap and want my own serialization and deserialization logic, I can declare my own private readObject and writeObject methods without interfering with those of HashMap. The following is from Stack Overflow:

We don’t want these methods to be overridden by subclasses. Instead, each class can have its own writeObject method, and the serialization engine will call all of them one after the other. This is only possible with private methods (these are not overridden). (The same is valid for readObject.)
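The "one after the other" behavior described in the quote can be sketched with a superclass and a subclass that each declare a private writeObject. The names HookChain, Parent, Child, and serializeChild are hypothetical; the shared list exists only to record the order of the calls.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class HookChain {

    // Demo-only log of which writeObject methods ran, in order.
    static final List<String> calls = new ArrayList<>();

    static class Parent implements Serializable {
        private static final long serialVersionUID = 1L;

        private void writeObject(ObjectOutputStream s) throws IOException {
            calls.add("Parent");
            s.defaultWriteObject();
        }
    }

    static class Child extends Parent {
        private static final long serialVersionUID = 1L;

        // Does not override Parent.writeObject: both are private, so the
        // serialization engine calls each one for its own class.
        private void writeObject(ObjectOutputStream s) throws IOException {
            calls.add("Child");
            s.defaultWriteObject();
        }
    }

    // Serializes a Child and returns the recorded call order.
    static List<String> serializeChild() throws IOException {
        calls.clear();
        try (ObjectOutputStream out =
                new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(new Child());
        }
        return new ArrayList<>(calls);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(serializeChild()); // prints "[Parent, Child]"
    }
}
```

If writeObject were public or protected, the method in Child would override the one in Parent, and the superclass would lose control over how its own state is written.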

Why should HashMap implement its own writeObject and readObject methods instead of using the JDK’s uniform default serialization and deserialization operations?

First, we need to be clear about the purpose of serialization. A Java object is serialized at some point and is usually deserialized on a different machine, because the most common scenario for serialization is a call across machines. The most fundamental requirement of serialization and deserialization is that the deserialized object is the same as the object before serialization.

In a HashMap, the array position of an Entry is calculated from the hash code of its key. Different JVM implementations may compute different hash codes for the same key.

The consequence of different hash codes is that a deserialized HashMap may behave differently from the object that was serialized. For example, before serialization the element with Key='AAA' might sit at index 0 of the array, while after deserialization a lookup for Key='AAA' might go to index 2 of the array, so the data retrieved would differ from what was there before serialization.

In Effective Java, Joshua Bloch explains this:

For example, consider the case of a hash table. The physical representation is a sequence of hash buckets containing key-value entries. The bucket that an entry resides in is a function of the hash code of its key, which is not, in general, guaranteed to be the same from JVM implementation to JVM implementation. In fact, it isn’t even guaranteed to be the same from run to run. Therefore, accepting the default serialized form for a hash table would constitute a serious bug. Serializing and deserializing the hash table could yield an object whose invariants were seriously corrupt.
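How the bucket depends on the hash code can be sketched as follows. This is a simplified illustration that assumes a power-of-two table length and the spreading step used by the JDK 8 HashMap; the names BucketIndex, IdentityKey, and indexFor are made up for the example.

```java
public class BucketIndex {

    // A key class that does not override hashCode, so it inherits the
    // identity-based Object.hashCode, which may differ from run to run.
    static class IdentityKey { }

    // Maps a key to a bucket index in a table of the given length.
    // Assumes tableLength is a power of two.
    static int indexFor(Object key, int tableLength) {
        int h = key.hashCode();
        int spread = h ^ (h >>> 16); // mix the high bits into the low bits
        return (tableLength - 1) & spread;
    }

    public static void main(String[] args) {
        // String.hashCode is specified by a formula, so this is stable.
        System.out.println(indexFor("AAA", 16)); // prints "1"
        // The identity-based hash code is not, so this index can vary.
        System.out.println(indexFor(new IdentityKey(), 16));
    }
}
```

A key such as IdentityKey is the problematic case: the copy of the key created by deserialization is a new object with a new identity hash code, so a bucket array copied verbatim could hold it in the wrong slot.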

So to avoid this problem, HashMap takes the following approach:

    1. It marks the fields that could cause data inconsistencies as transient, so that the default JDK serialization skips them. The fields that are not serialized by default include Entry[] table, size, and modCount.
    2. It implements its own writeObject and readObject methods to ensure that serialization and deserialization produce consistent results.

How does HashMap ensure that the serialized and deserialized data stay consistent? First, when a HashMap is serialized, it does not serialize the array that holds the data. Instead, it writes the number of elements, followed by the Key and Value of each element. On deserialization, the position of each Key and Value is recalculated and a new array is populated. This resolves the inconsistency: because the Entry array is not serialized but regenerated at deserialization time, a lookup by Key after deserialization cannot return a different element than it did before serialization.
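The same technique can be sketched on a toy hash table. The TinyMap class below is a hypothetical, heavily simplified example and not the HashMap source: it has a fixed number of buckets, marks its physical layout as transient, writes only the entry count plus each key and value, and rebuilds the buckets on deserialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class TinyMap implements Serializable {
    private static final long serialVersionUID = 1L;
    private static final int BUCKETS = 8; // fixed size keeps the sketch short

    // Physical layout: transient, so default serialization skips it.
    private transient List<Object[]>[] buckets = newBuckets();
    private transient int size;

    @SuppressWarnings("unchecked")
    private static List<Object[]>[] newBuckets() {
        List<Object[]>[] b = new List[BUCKETS];
        for (int i = 0; i < BUCKETS; i++) b[i] = new ArrayList<>();
        return b;
    }

    private static int indexFor(Object key) {
        return (key.hashCode() & 0x7fffffff) % BUCKETS;
    }

    public void put(Object key, Object value) {
        List<Object[]> bucket = buckets[indexFor(key)];
        for (Object[] e : bucket) {
            if (e[0].equals(key)) { e[1] = value; return; }
        }
        bucket.add(new Object[] { key, value });
        size++;
    }

    public Object get(Object key) {
        for (Object[] e : buckets[indexFor(key)]) {
            if (e[0].equals(key)) return e[1];
        }
        return null;
    }

    public int size() { return size; }

    // Logical form: the entry count, then each key and value.
    private void writeObject(ObjectOutputStream s) throws IOException {
        s.defaultWriteObject();
        s.writeInt(size);
        for (List<Object[]> bucket : buckets) {
            for (Object[] e : bucket) {
                s.writeObject(e[0]);
                s.writeObject(e[1]);
            }
        }
    }

    // Rebuilds the buckets by re-inserting every entry, so each position is
    // computed from the hash code as seen by the reading JVM.
    private void readObject(ObjectInputStream s)
            throws IOException, ClassNotFoundException {
        s.defaultReadObject();
        buckets = newBuckets();
        int n = s.readInt();
        for (int i = 0; i < n; i++) {
            Object key = s.readObject();
            Object value = s.readObject();
            put(key, value);
        }
    }

    static TinyMap roundTrip(TinyMap map)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(map);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (TinyMap) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        TinyMap map = new TinyMap();
        map.put("AAA", 1);
        map.put("BBB", 2);
        TinyMap copy = roundTrip(map);
        System.out.println(copy.size());     // prints "2"
        System.out.println(copy.get("AAA")); // prints "1"
    }
}
```

Because field initializers do not run when a Serializable class is deserialized, readObject has to create the bucket array itself before re-inserting the entries.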
