
How does Java obtain the serialVersionUID (SUID)?

Anyone familiar with serialization knows that two SUIDs are involved in deserialization: the SUID of the local class and the SUID read from the serialized byte stream. Our analysis covers both.

1. Start with the exception thrown

To find an entry point for the source code analysis, we can start by triggering an exception.

1.1 Create and define a serializable class

import java.io.Serializable;

public class User implements Serializable {
    public String name;
    private Integer age;

    // constructor, getters, setters & toString omitted
}

1.2 Run the serialization method

static void test1() throws IOException {
    User user = new User("player", 21);
    user.setName("player");
    user.setAge(21);
    System.out.println(user);
    FileSerializeUtil.serialize(user, "user.obj");
}

public static void serialize(Object object, String filepath) throws IOException {
    try (ObjectOutputStream objectOutputStream = new ObjectOutputStream(new FileOutputStream(filepath))) {
        objectOutputStream.writeObject(object);
    } catch (IOException e) {
        logger.warning("serialize: failed to serialize object to " + filepath);
        throw e;
    }
}

1.3 Modify the class

Add a member variable to User

public class User implements Serializable {
    public String name;
    private Integer age;
    private String address;
    ...
}

1.4 Run the deserialization method

static void test2() throws IOException, ClassNotFoundException {
    User user = (User) FileSerializeUtil.deserialize("user.obj");
    System.out.println(user);
}

public static Object deserialize(String filepath) throws IOException, ClassNotFoundException {
    try (ObjectInputStream objectInputStream = new ObjectInputStream(new FileInputStream(filepath))) {
        return objectInputStream.readObject();
    } catch (IOException | ClassNotFoundException e) {
        logger.warning("deserialize: failed to deserialize object from " + filepath);
        throw e;
    }
}

1.5 Trigger the exception and view the exception information

Exception in thread "main" java.io.InvalidClassException: serial.User; local class incompatible: stream classdesc serialVersionUID = 8985745470054656491, local class serialVersionUID = -4967160969146043535
	at java.base/java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:715)
	...

So our story begins at line 715 of ObjectStreamClass.java.

2. Main-line source analysis

2.1 Analysis of initNonProxy()

To determine how the local serialVersionUID and the serialVersionUID from the serialized byte stream are obtained, we locate the place in ObjectStreamClass where the exception is thrown. initNonProxy() contains the following check:

void initNonProxy(ObjectStreamClass model,
                  Class<?> cl,
                  ClassNotFoundException resolveEx,
                  ObjectStreamClass superDesc)
    throws InvalidClassException
{
    // (1) the value of suid is obtained here
    long suid = Long.valueOf(model.getSerialVersionUID());
    ...
    if (cl != null) {
        ...
        // The exception is thrown here
        if (model.serializable == osc.serializable &&
                !cl.isArray() && !isRecord(cl) &&
                // (2) the SUID of osc is also obtained through getSerialVersionUID()
                suid != osc.getSerialVersionUID()) {
            throw new InvalidClassException(osc.name,
                    "local class incompatible: " +
                            "stream classdesc serialVersionUID = " + suid +
                            ", local class serialVersionUID = " +
                            osc.getSerialVersionUID());
        }
        ...
    }
}

The exception is thrown because suid and osc.getSerialVersionUID() are not equal. Both values ultimately come from getSerialVersionUID(), so that method must decide where each SUID comes from. Let's expand getSerialVersionUID().
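The same mismatch can be reproduced inside a single program, without editing the class, by changing the SUID bytes of an already serialized stream. The following is a minimal sketch rather than anything taken from the JDK: the names SuidMismatchDemo and Account are made up for illustration, and the byte offsets assume the usual stream layout for a top-level object of an ordinary class with an ASCII name (4-byte header, two type tags, the class name as a length-prefixed UTF string, then the 8-byte SUID).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SuidMismatchDemo {
    // Hypothetical class used only for this illustration
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        String owner = "player";
    }

    static byte[] serialize(Object object) throws IOException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bout)) {
            out.writeObject(object);
        }
        return bout.toByteArray();
    }

    // Returns the exception message, or "no exception" if deserialization succeeded
    static String deserializeTampered() {
        try {
            byte[] bytes = serialize(new Account());
            // Assumed layout: magic(2) version(2) TC_OBJECT(1) TC_CLASSDESC(1) nameLength(2) name SUID(8)
            int nameLength = ((bytes[6] & 0xFF) << 8) | (bytes[7] & 0xFF);
            int lastSuidByte = 8 + nameLength + 7;
            bytes[lastSuidByte] ^= 0x01; // the stream now carries SUID 0 instead of 1
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
                in.readObject();
                return "no exception";
            }
        } catch (InvalidClassException e) {
            return e.getMessage();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(deserializeTampered());
    }
}
```

The local class still declares 1, the stream now says 0, and the check in initNonProxy() rejects the stream with the same "local class incompatible" message seen in section 1.5.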

2.2 Analysis of getSerialVersionUID()

The first thing we notice is the first line of the method's documentation comment:

Return the serialVersionUID for this class.

This tells us that the method returns the serialVersionUID of the class. When no SUID has been set yet, it calls computeDefaultSUID() to compute the SUID of the local class.

The source code for this method is as follows:

public long getSerialVersionUID() {
    // REMIND: synchronize instead of relying on volatile?
    if (suid == null) {
        if (isRecord)
            return 0L;

        suid = AccessController.doPrivileged(
            new PrivilegedAction<Long>() {
                public Long run() {
                    return computeDefaultSUID(cl);
                }
            }
        );
    }
    return suid.longValue();
}

To make the difference clear, let's step through it in the debugger.

For suid at (1), the value is taken from model, the class descriptor read from the stream.

For osc.getSerialVersionUID() at (2), there are two possibilities:

  • The User class does not declare a custom serialVersionUID
  • The User class declares a custom serialVersionUID

In short, both the SUID of the local class and the SUID from the serialized byte stream are returned by this method. For a local class that does not declare a serialVersionUID, the suid field is still null, so the SUID is computed by computeDefaultSUID() inside the if block. A custom SUID, or the SUID read from the byte stream, has already been set and is returned directly.
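Both local paths can be observed from ordinary code through the public ObjectStreamClass.lookup() API. The sketch below is only an illustration; the class names SuidLookupDemo, Declared and Computed are made up for it.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class SuidLookupDemo {
    // Declares its own SUID: getSerialVersionUID() returns it directly
    static class Declared implements Serializable {
        private static final long serialVersionUID = 42L;
        String name;
    }

    // Declares no SUID: the value is computed from the structure of the class
    static class Computed implements Serializable {
        String name;
    }

    static long suidOf(Class<?> cl) {
        return ObjectStreamClass.lookup(cl).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("declared: " + suidOf(Declared.class));
        System.out.println("computed: " + suidOf(Computed.class));
    }
}
```

The first line prints 42. The second prints a computed value that stays the same from run to run as long as the class itself does not change.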

You can read more about computeDefaultSUID() below; for now, let's continue with the main line.

So we wonder: if computeDefaultSUID() is not called, where do the custom SUID and the SUID in the byte stream come from? Using IDEA's usage search, we can look at where the suid field is assigned:

  • Marked in blue: the compute path for classes without a custom SUID
  • Marked in red, getDeclaredSUID(): gets the custom (declared) SUID
  • Marked in red, in.readLong(): reads the SUID from the file input stream

Let's verify our guess by analyzing how the custom (declared) SUID is obtained.

We don't analyze the in.readLong() part, because following it leads into the Java I/O classes; it simply reads a value of type long from the file through the input stream.
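To see that the SUID really is a plain long stored in the stream, the bytes can be read back by hand. This is a rough sketch under the assumption that the stream starts with the usual header and holds one object of an ordinary class (not a String, array, enum or proxy); StreamSuidDemo and Point are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class StreamSuidDemo {
    // Hypothetical class used only for this illustration
    static class Point implements Serializable {
        private static final long serialVersionUID = 42L;
        int x = 1;
    }

    // Serializes the object in memory and reads the SUID back from the raw bytes
    static long suidInStream(Serializable object) {
        try {
            ByteArrayOutputStream bout = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bout)) {
                out.writeObject(object);
            }
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bout.toByteArray()));
            in.readShort();       // stream magic number
            in.readShort();       // stream version
            in.readByte();        // TC_OBJECT tag
            in.readByte();        // TC_CLASSDESC tag
            in.readUTF();         // class name
            return in.readLong(); // serialVersionUID
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(suidInStream(new Point())); // prints 42
    }
}
```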

2.3 Getting a custom (declared) SUID

2.3.1 Analysis of ObjectStreamClass()

Jumping with IDEA to the place where getDeclaredSUID() is called, we land in the ObjectStreamClass constructor. Its documentation comment reads:

Creates local class descriptor representing given class.

In other words, it creates a descriptor for the given class. We don't need to focus on the other details, only on the if…else condition:

...
// Determine whether the class implements the Serializable interface
serializable = Serializable.class.isAssignableFrom(cl);
...
if (serializable) {
    AccessController.doPrivileged(new PrivilegedAction<>() {
        public Void run() {
            ...
            suid = getDeclaredSUID(cl);
            ...
        }
    });
} else {
    // If Serializable is not implemented, suid = 0
    suid = Long.valueOf(0);
    ...
}

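For classes that do not implement Serializable, the default SUID is 0. For classes that do, getDeclaredSUID() is used to get the declared SUID; it returns null when the class declares none, which leaves the value to be computed later by getSerialVersionUID().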
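The else branch can be checked with ObjectStreamClass.lookupAny(), which returns a descriptor even for a class that is not serializable. A small sketch, with NonSerializableSuidDemo and Plain as made-up names:

```java
import java.io.ObjectStreamClass;

class NonSerializableSuidDemo {
    // Hypothetical class that does not implement Serializable
    static class Plain {
        int value;
    }

    static long suidOfAnyClass(Class<?> cl) {
        // lookupAny() returns a descriptor regardless of whether the class is serializable
        return ObjectStreamClass.lookupAny(cl).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(suidOfAnyClass(Plain.class));           // prints 0
        System.out.println(ObjectStreamClass.lookup(Plain.class)); // prints null
    }
}
```

lookup(), in contrast, returns null for such a class, so lookupAny() is the call to use when the class may not be serializable.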

2.3.2 Analysis of getDeclaredSUID()

private static Long getDeclaredSUID(Class<?> cl) {
    try {
        // Get the serialVersionUID field by reflection
        Field f = cl.getDeclaredField("serialVersionUID");
        // Check whether serialVersionUID is declared static final
        int mask = Modifier.STATIC | Modifier.FINAL;
        if ((f.getModifiers() & mask) == mask) {
            f.setAccessible(true);
            // This is where the local class's custom SUID is actually read
            return Long.valueOf(f.getLong(null));
        }
    } catch (Exception ex) {
    }
    return null;
}

This method doesn't contain much code. It first gets the serialVersionUID field by reflection, checks that it is declared static final, and then reads its value by reflection. With that, we have finished tracing how the custom SUID is obtained.
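A practical consequence is that a serialVersionUID field which is not both static and final is silently ignored. The sketch below mirrors the logic of the method in user code to show this; it is not the JDK implementation, and DeclaredSuidSketch and its nested classes are hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

class DeclaredSuidSketch {
    static class Good {
        private static final long serialVersionUID = 7L;
    }

    static class NotStatic {
        private final long serialVersionUID = 7L; // instance field: ignored
    }

    static class Missing {
    }

    // Mirrors the check shown above; returns null when no usable field exists
    static Long declaredSuid(Class<?> cl) {
        try {
            Field f = cl.getDeclaredField("serialVersionUID");
            int mask = Modifier.STATIC | Modifier.FINAL;
            if ((f.getModifiers() & mask) == mask) {
                f.setAccessible(true);
                return f.getLong(null);
            }
        } catch (ReflectiveOperationException ex) {
            // no such field, or it could not be read
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(declaredSuid(Good.class));      // prints 7
        System.out.println(declaredSuid(NotStatic.class)); // prints null
        System.out.println(declaredSuid(Missing.class));   // prints null
    }
}
```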

3. computeDefaultSUID(): the default method for calculating a class's SUID

This method computes the SUID from the “information” of the class when the class does not declare a custom serialVersionUID.

private static long computeDefaultSUID(Class<?> cl) {
    // Return 0 if the class is not Serializable or is a dynamic proxy class
    if (!Serializable.class.isAssignableFrom(cl) || Proxy.isProxyClass(cl)) {
        return 0L;
    }

    try {
        // Buffer the output in a byte array
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        // dout wraps bout: everything written through dout ends up in bout
        DataOutputStream dout = new DataOutputStream(bout);

        // Write the class name to the byte array
        dout.writeUTF(cl.getName());

        // Get the class modifiers
        int classMods = cl.getModifiers() &
            (Modifier.PUBLIC | Modifier.FINAL |
             Modifier.INTERFACE | Modifier.ABSTRACT);

        // Get the methods declared by the class
        Method[] methods = cl.getDeclaredMethods();

        if ((classMods & Modifier.INTERFACE) != 0) {
            // If the class is an interface: set the ABSTRACT bit when it
            // declares methods, and clear it when it declares none
            classMods = (methods.length > 0) ?
                (classMods | Modifier.ABSTRACT) :
                (classMods & ~Modifier.ABSTRACT);
        }
        // Write the class modifiers to the byte array
        dout.writeInt(classMods);

        if (!cl.isArray()) {
            // Arrays are skipped: getInterfaces() on an array type reports
            // Cloneable and Serializable, so arrays do not enter this branch
            Class<?>[] interfaces = cl.getInterfaces();
            String[] ifaceNames = new String[interfaces.length];
            for (int i = 0; i < interfaces.length; i++) {
                ifaceNames[i] = interfaces[i].getName();
            }
            // Sort the interface names so that declaration order does not change the output
            Arrays.sort(ifaceNames);
            for (int i = 0; i < ifaceNames.length; i++) {
                // Write the interface name to the byte array
                dout.writeUTF(ifaceNames[i]);
            }
        }

        // Get the member variables
        Field[] fields = cl.getDeclaredFields();
        MemberSignature[] fieldSigs = new MemberSignature[fields.length];
        // Get the signature of the member variable
        for (int i = 0; i < fields.length; i++) {
            fieldSigs[i] = new MemberSignature(fields[i]);
        }
        Arrays.sort(fieldSigs, new Comparator<>() {
            public int compare(MemberSignature ms1, MemberSignature ms2) {
                // Sort by member variable name
                return ms1.name.compareTo(ms2.name);
            }
        });
        // Handle the member variables
        for (int i = 0; i < fieldSigs.length; i++) {
            MemberSignature sig = fieldSigs[i];
            // Get the member variable modifiers
            int mods = sig.member.getModifiers() &
                (Modifier.PUBLIC | Modifier.PRIVATE | Modifier.PROTECTED |
                 Modifier.STATIC | Modifier.FINAL | Modifier.VOLATILE |
                 Modifier.TRANSIENT);
            // Write the field if it is non-private, or if it is private but
            // neither static nor transient
            // (private static and private transient fields are skipped)
            if (((mods & Modifier.PRIVATE) == 0) ||
                ((mods & (Modifier.STATIC | Modifier.TRANSIENT)) == 0))
            {
                // Write the signature information to the byte array
                dout.writeUTF(sig.name);
                dout.writeInt(mods);
                dout.writeUTF(sig.signature);
            }
        }

        // Write the static initializer, if the class has one
        if (hasStaticInitializer(cl)) {
            dout.writeUTF("<clinit>");
            dout.writeInt(Modifier.STATIC);
            dout.writeUTF("()V");
        }

        // Get the constructors
        Constructor<?>[] cons = cl.getDeclaredConstructors();
        MemberSignature[] consSigs = new MemberSignature[cons.length];
        // Get the constructor signature
        for (int i = 0; i < cons.length; i++) {
            consSigs[i] = new MemberSignature(cons[i]);
        }
        // Sort the constructor signatures
        Arrays.sort(consSigs, new Comparator<>() {
            public int compare(MemberSignature ms1, MemberSignature ms2) {
                return ms1.signature.compareTo(ms2.signature);
            }
        });
        for (int i = 0; i < consSigs.length; i++) {
            MemberSignature sig = consSigs[i];
            // Get the constructor modifier
            int mods = sig.member.getModifiers() &
                (Modifier.PUBLIC | Modifier.PRIVATE | Modifier.PROTECTED |
                 Modifier.STATIC | Modifier.FINAL |
                 Modifier.SYNCHRONIZED | Modifier.NATIVE |
                 Modifier.ABSTRACT | Modifier.STRICT);
            // Write if the constructor is non-private
            if ((mods & Modifier.PRIVATE) == 0) {
                dout.writeUTF("<init>");
                dout.writeInt(mods);
                dout.writeUTF(sig.signature.replace('/', '.'));
            }
        }

        MemberSignature[] methSigs = new MemberSignature[methods.length];
        // Get the signature of the method
        for (int i = 0; i < methods.length; i++) {
            methSigs[i] = new MemberSignature(methods[i]);
        }
        // Method signature sort
        Arrays.sort(methSigs, new Comparator<>() {
            public int compare(MemberSignature ms1, MemberSignature ms2) {
                int comp = ms1.name.compareTo(ms2.name);
                if (comp == 0) {
                    comp = ms1.signature.compareTo(ms2.signature);
                }
                return comp;
            }
        });
        for (int i = 0; i < methSigs.length; i++) {
            MemberSignature sig = methSigs[i];
            // Get the method modifier
            int mods = sig.member.getModifiers() &
                (Modifier.PUBLIC | Modifier.PRIVATE | Modifier.PROTECTED |
                 Modifier.STATIC | Modifier.FINAL |
                 Modifier.SYNCHRONIZED | Modifier.NATIVE |
                 Modifier.ABSTRACT | Modifier.STRICT);
            // If it is not private, write
            if ((mods & Modifier.PRIVATE) == 0) {
                dout.writeUTF(sig.name);
                dout.writeInt(mods);
                dout.writeUTF(sig.signature.replace('/', '.'));
            }
        }

        // Flush so that all output reaches the byte array
        dout.flush();

        // Compute the SHA digest of the byte array
        MessageDigest md = MessageDigest.getInstance("SHA");
        byte[] hashBytes = md.digest(bout.toByteArray());
        long hash = 0;
        // Fold the first 8 digest bytes into a long, first byte least significant
        for (int i = Math.min(hashBytes.length, 8) - 1; i >= 0; i--) {
            hash = (hash << 8) | (hashBytes[i] & 0xFF);
        }
        return hash;
    } catch (IOException ex) {
        throw new InternalError(ex);
    } catch (NoSuchAlgorithmException ex) {
        throw new SecurityException(ex.getMessage());
    }
}

This looks like a long method, but what it does is repetitive and easy to follow. In a nutshell, it collects information about the class through reflection, writes it into a byte array, and applies a hash function (SHA) to obtain a “digest” that represents the class. The whole method is annotated above, and the reader can read through it to understand the details.

A hash function is commonly used to generate a digest of information (it is not encryption, since the result cannot be decrypted). A cryptographic hash function has the following properties:

  • Arbitrary-length input, fixed-length output
  • Collision resistance and the avalanche effect: it is hard to find two inputs with the same hash value, and changing even a single bit of the input produces a very different result
  • One-way: it is computationally infeasible to recover the original input from the hash value
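These properties, and the way the digest is reduced to a long, can be tried out in isolation. The sketch below hashes only a class name, whereas the real method hashes much more; it also requests "SHA-1" explicitly, on the assumption that the "SHA" name used by the JDK code resolves to SHA-1. DigestFoldSketch and its methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class DigestFoldSketch {
    static byte[] sha1(String text) {
        try {
            return MessageDigest.getInstance("SHA-1").digest(text.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Same folding loop as in computeDefaultSUID(): the first digest byte is least significant
    static long foldToLong(byte[] hashBytes) {
        long hash = 0;
        for (int i = Math.min(hashBytes.length, 8) - 1; i >= 0; i--) {
            hash = (hash << 8) | (hashBytes[i] & 0xFF);
        }
        return hash;
    }

    public static void main(String[] args) {
        byte[] a = sha1("serial.User");
        byte[] b = sha1("serial.Usex"); // one character changed
        System.out.println(a.length + " bytes, folded: " + foldToLong(a));
        System.out.println(b.length + " bytes, folded: " + foldToLong(b));
    }
}
```

Both digests are 20 bytes long whatever the input length, and the two folded values bear no visible relation to each other even though the inputs differ by one character.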

If readers are interested in this topic, the author will open a “Blockchain” column and cover it in “Things about encryption”.

3.1 Factors that determine a class's SUID

We know that the hash function is the key to generating the SUID. To find out what determines the SUID, we first look at what determines the hash, namely the input to the hash function:

MessageDigest md = MessageDigest.getInstance("SHA");
// Input source
byte[] hashBytes = md.digest(bout.toByteArray());
long hash = 0;
for (int i = Math.min(hashBytes.length, 8) - 1; i >= 0; i--) {
    hash = (hash << 8) | (hashBytes[i] & 0xFF);
}

We see that the input comes from the bout variable, which means we need to keep an eye on the operations involving the following two variables:

// Buffer that holds the input to the hash
ByteArrayOutputStream bout = new ByteArrayOutputStream();
// Stream through which the method writes into bout
DataOutputStream dout = new DataOutputStream(bout);

All we need to do is find the operations that write to bout. The entire method is commented above; the following table summarizes the factors, and the specific actions, that cause the hash value to change.

| Factor | Specific action |
| --- | --- |
| Class name | Rename the class |
| Class modifiers | Add, remove, or change class modifiers |
| Implemented interfaces | Add or remove implemented interfaces |
| Member methods and constructors | Add or remove methods; modify a method signature |
| Member variables (including static and constant) | Add or remove variables; modify a variable signature |

We find that almost any change to the class information changes the SUID of the class; the exception is the superclass, which does not affect the SUID.

These factors differ in their details, which are noted in the source comments above; for example, private constructors are not written into the hash input.
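One such detail, the filter applied to member variables, can be isolated as a small predicate. The condition below is copied from the source shown earlier; the wrapper class FieldFilterSketch and the method name contributesToSuid are made up for this sketch.

```java
import java.lang.reflect.Modifier;

class FieldFilterSketch {
    // A field contributes to the hash unless it is private and also static or transient
    static boolean contributesToSuid(int mods) {
        return ((mods & Modifier.PRIVATE) == 0)
            || ((mods & (Modifier.STATIC | Modifier.TRANSIENT)) == 0);
    }

    public static void main(String[] args) {
        System.out.println(contributesToSuid(Modifier.PRIVATE));                      // true
        System.out.println(contributesToSuid(Modifier.PRIVATE | Modifier.STATIC));    // false
        System.out.println(contributesToSuid(Modifier.PRIVATE | Modifier.TRANSIENT)); // false
        System.out.println(contributesToSuid(Modifier.PUBLIC | Modifier.STATIC));     // true
    }
}
```

So adding a private static or private transient field leaves the default SUID unchanged, while adding an ordinary private field, like address in section 1.3, changes it.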

As for why the original authors did not include the superclass, the author is still thinking about it.

3.2 Why is sorting needed?

While reading this method, it is easy to notice that Arrays.sort() is used repeatedly to sort the reflected members, as in this code:

Arrays.sort(fieldSigs, new Comparator<>() {
    public int compare(MemberSignature ms1, MemberSignature ms2) {
        return ms1.name.compareTo(ms2.name);
    }
});

This code sorts the signatures of the reflected member variables. The purpose is easy to understand: reordering member variables should not affect the SUID of a class. For example, if we swap the positions of the two member variables of User, the deserialization exception does not occur.

public class User implements Serializable {
    private Integer age;
    public String name;
    ...
}
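The effect of sorting can be shown with two classes that declare the same fields in a different order. This is only a sketch of the normalization step: UserA and UserB are hypothetical, and because their class names differ their real SUIDs would still differ.

```java
import java.lang.reflect.Field;
import java.util.Arrays;

class FieldOrderSketch {
    static class UserA {
        public String name;
        private Integer age;
    }

    static class UserB {
        private Integer age;
        public String name;
    }

    // Field names in sorted order, independent of declaration order
    static String[] sortedFieldNames(Class<?> cl) {
        return Arrays.stream(cl.getDeclaredFields())
                .filter(f -> !f.isSynthetic())
                .map(Field::getName)
                .sorted()
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sortedFieldNames(UserA.class))); // [age, name]
        System.out.println(Arrays.toString(sortedFieldNames(UserB.class))); // [age, name]
    }
}
```

After sorting, both classes feed the field names into the hash in the same sequence, which is why swapping two declarations in User leaves its SUID unchanged.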

Conclusion

This is the end of our serialization source code analysis. The author spent a lot of time on it, so if you'd like to support a newcomer like me, please give it a thumbs up. If there is something you think I can do better, I'd love to hear from you in the comments!