From the school to A factory all the way sunshine vicissitudes of life

Please go to www.codercc.com


1. Implement the Serializable interface with caution

  • The problem

    Serialization encodes an object as a byte stream; the reverse process is called deserialization. Once an object has been serialized, its encoding can be passed from one virtual machine to another or stored on disk for later deserialization. A long-standing misconception is that making a class serializable takes nothing more than implementing the Serializable interface. In fact, this approach has several drawbacks, and its short-term convenience incurs long-term maintenance costs. What are the caveats of Serializable?

  • The answer

    1. The disadvantage of Serializable

      Directly implementing the Serializable interface has the following disadvantages:

      • Reduced flexibility: Once a class implements Serializable, its byte-stream encoding becomes part of its exported API, and once the class is widely used, that serialized form must be supported forever. With the default serialized form, the class's private and package-private instance fields also become part of the exported API, which defeats the practice of minimizing access to fields. In addition, if the class's internal representation changes, a client that serializes an instance with the old version and deserializes it with the new version will see the program fail.

        If a serializable class does not explicitly declare a serialVersionUID (serial version UID), the system generates one at run time by applying a complex computation to the class. The result depends on the class name, the names of the interfaces it implements, and most of its members. If you change any of these, for example by adding a method, the generated serial version UID changes as well. Therefore, if the UID is not declared explicitly, compatibility is broken and an InvalidClassException is thrown at run time.

      • More prone to bugs and security holes: Objects are normally created by constructors, but deserialization is a second way to create objects. Deserialization is in effect a hidden constructor, so it must uphold every invariant that the real constructors establish and must not let an attacker reach the internals of the object under construction. The default deserialization mechanism guarantees neither, so relying on it makes it easy to break an object's invariants and to expose it to illegal access.

      • Increased testing burden: When a serializable class is revised, you must check that an instance serialized in the new release can be deserialized in the old releases, and that an instance serialized in an old release can be deserialized in the new one. The amount of testing required is therefore proportional to the product of the number of serializable classes and the number of releases.
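
      To make the serial version UID point concrete, here is a minimal sketch. The class names Account and Ticket and the helper uidOf are hypothetical, chosen only for illustration; ObjectStreamClass.lookup and getSerialVersionUID are the standard JDK calls for inspecting the UID that the serialization runtime will use.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerialVersionUidDemo {

    /** Declares its stream version explicitly, so adding a method later does not change it. */
    static final class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String owner;

        Account(String owner) { this.owner = owner; }

        String owner() { return owner; }
    }

    /** Declares no UID: the runtime computes one from the structure of the class. */
    static final class Ticket implements Serializable {
        int seat;
    }

    /** Returns the serial version UID the serialization runtime uses for the given class. */
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        // The declared value is used as-is.
        System.out.println("Account UID: " + uidOf(Account.class));
        // A computed hash; it can change whenever the members of Ticket change.
        System.out.println("Ticket UID:  " + uidOf(Ticket.class));
    }
}
```

      Because the UID of Account is fixed in the source, a later release can add methods without invalidating streams written by an earlier release; Ticket offers no such guarantee.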

    2. Scenarios where Serializable is appropriate

    If a class belongs to a framework that relies on serialization to transmit or persist objects, it must implement Serializable. Likewise, if a class is used as a component of another class that has to implement Serializable, it needs to implement Serializable too. As a rule of thumb, value classes such as Date and BigInteger should implement Serializable, as should most collection classes.

    3. Scenarios where Serializable is not appropriate

    • Classes designed for inheritance should rarely implement Serializable, and interfaces should rarely extend it, because every subclass or implementing class then carries the serialization risks as well. This rule should be followed in most cases, but it can be broken in special circumstances. For example, classes that implement Serializable include Throwable (so exceptions can be passed from server to client), Component (so GUIs can be sent, saved, and restored), and the abstract class HttpServlet (so session state can be cached).
    • Inner classes should not implement Serializable. They need to hold references to enclosing instances and the values of local variables from enclosing scopes, and how those fields correspond to the class definition is unspecified. The default serialized form of an inner class is therefore ill-defined.

  • Conclusion

    In summary, making a class serializable should not be equated with simply implementing the Serializable interface. Weigh the applicable scenarios and the costs described above before deciding.

2. Consider using a custom serialization form

  • The problem

    Designing the serialized form of a class is just as important as designing its API, so do not adopt the default serialization behavior without seriously considering whether the default serialized form is appropriate. Before deciding, evaluate that form from the perspectives of flexibility, performance, and correctness. Generally speaking, you should accept the default serialized form only if it is largely the same as the form you would choose if you designed a custom one. What should you consider when choosing a serialized form?

  • The answer

    1. The default serialized form describes the data contained in the object itself and in every object reachable from it, that is, it fully describes the topology by which all of these objects are linked. The ideal serialized form of an object contains only the logical data that the object represents, independent of its physical representation. In other words, the default serialized form is appropriate when the physical representation of an object is identical to its logical content. Here is an example:

      import java.io.Serializable;

      public class Name implements Serializable {
          private final String lastName;
          private final String firstName;
          private final String middleName;
          // ... (remainder omitted)
      }

      Logically, a name consists of exactly three attributes, lastName, firstName, and middleName, and the instance fields of the Name class mirror that logical content precisely. The default serialized form is therefore appropriate here. Even so, a readObject method is often still needed to enforce invariants and security, that is, to validate the fields it reads and to defensively copy any mutable ones.

    2. With the default serialized form, instance fields marked transient are omitted when the object is serialized. On deserialization they are initialized to the default value of their type: null for object references, 0 for numeric primitive fields, and false for boolean fields. If these values are unacceptable for any transient field, you must provide a readObject method that first calls defaultReadObject and then restores the transient fields to acceptable values.

    3. During serialization and deserialization, the runtime looks for private writeObject() and readObject() methods in the object's class and calls them if they exist, so a class can implement its own serialization logic in these two methods. Even when they write and read every field themselves, they should still call ObjectOutputStream.defaultWriteObject() and ObjectInputStream.defaultReadObject(), which preserves forward and backward compatibility if non-transient fields are added in a later release.

    4. Whichever serialized form you choose, declare an explicit serial version UID in every serializable class you write. This removes the serial version UID as a potential source of incompatibility and also yields a small performance benefit, because the UID does not have to be computed at run time.
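
    Points 2 to 4 can be sketched together as follows. The Circle class and the roundTrip helper are hypothetical names introduced only for this illustration: the radius is the logical state, while the cached area is transient and is recomputed in readObject after defaultReadObject has run.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientFieldDemo {

    static final class Circle implements Serializable {
        private static final long serialVersionUID = 1L; // explicit UID (point 4)

        private final double radius;   // logical state, written to the stream
        private transient double area; // derived cache, omitted from the stream (point 2)

        Circle(double radius) {
            if (Double.isNaN(radius) || radius < 0) {
                throw new IllegalArgumentException("invalid radius: " + radius);
            }
            this.radius = radius;
            this.area = Math.PI * radius * radius;
        }

        double area() { return area; }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject(); // writes the non-transient fields (point 3)
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject(); // restores radius; area still holds its default, 0.0
            if (Double.isNaN(radius) || radius < 0) {
                throw new InvalidObjectException("invalid radius: " + radius);
            }
            area = Math.PI * radius * radius; // restore the transient field
        }
    }

    /** Serializes an object to memory and reads it back. */
    static Object roundTrip(Object original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Circle copy = (Circle) roundTrip(new Circle(2.0));
        System.out.println("area after deserialization: " + copy.area());
    }
}
```

    Without the readObject method, the deserialized copy would report an area of 0.0, because the transient field would keep its default value.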

  • Conclusion

    When you decide to design a class to be serializable, you should think carefully about which serialization form to use. The default serialization form should be used only if it reasonably describes the logical state of the object. Otherwise, design a custom serialization form that properly describes the state of the object.

3. Use the readObject method with caution

  • The problem

    To make a program more secure and reliable, a class with mutable fields should make defensive copies of them in its constructors and accessor methods, as in the following code:

    import java.util.Date;

    public final class Period {
        private final Date start;
        private final Date end;

        public Period(Date start, Date end) {
            // Defensively copy the parameters before validating them
            this.start = new Date(start.getTime());
            this.end = new Date(end.getTime());
            if (this.start.compareTo(this.end) > 0) {
                throw new IllegalArgumentException(this.start + " after " + this.end);
            }
        }

        // The accessors return copies so that callers cannot mutate the internals
        public Date getStart() {
            return new Date(start.getTime());
        }

        public Date getEnd() {
            return new Date(end.getTime());
        }
    }

    If this class is made serializable, a byte stream can describe an instance that violates the constraint that start must not come after end. How can you make sure that the key constraints of the object are still enforced when it is deserialized?

  • The answer

    Besides constructors, deserialization is another way of constructing objects, so it must likewise check the validity of its input and make defensive copies. The readObject method therefore has to ensure that the key constraint of Period still holds and that the instance remains immutable:

    private void readObject(ObjectInputStream s)
            throws IOException, ClassNotFoundException {
        s.defaultReadObject();
        // Defensively copy our mutable components.
        // Note: start and end must not be declared final for these assignments to compile.
        start = new Date(start.getTime());
        end = new Date(end.getTime());
        // Check that our invariants are satisfied
        if (start.compareTo(end) > 0)
            throw new InvalidObjectException(start + " after " + end);
    }

    Note also that the defensive copy must be made before the validity check, and that the clone method must not be used to make the copy, because Date is not final and clone could return an instance of an untrusted subclass.
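
    The effect of such a readObject method can be observed with a small experiment, sketched below under stated assumptions. PeriodDemo, serialize, deserialize, and patchLong are hypothetical helpers written for this illustration. The forged stream is produced by overwriting the end time in a valid stream, which relies on java.util.Date writing its time as an 8-byte big-endian long; the two timestamps are arbitrary values chosen so that their byte pattern is unlikely to occur elsewhere in the stream.

```java
import java.io.*;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Date;

public class PeriodDemo {

    static final class Period implements Serializable {
        private static final long serialVersionUID = 1L;
        private Date start; // not final, so that readObject can reassign it
        private Date end;

        Period(Date start, Date end) {
            this.start = new Date(start.getTime());
            this.end = new Date(end.getTime());
            if (this.start.compareTo(this.end) > 0) {
                throw new IllegalArgumentException(this.start + " after " + this.end);
            }
        }

        Date getStart() { return new Date(start.getTime()); }
        Date getEnd() { return new Date(end.getTime()); }

        private void readObject(ObjectInputStream s) throws IOException, ClassNotFoundException {
            s.defaultReadObject();
            start = new Date(start.getTime()); // defensive copies first
            end = new Date(end.getTime());
            if (start.compareTo(end) > 0) {    // then the invariant check
                throw new InvalidObjectException(start + " after " + end);
            }
        }
    }

    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    static Object deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    /** Overwrites the first big-endian occurrence of oldValue in data; returns false if absent. */
    static boolean patchLong(byte[] data, long oldValue, long newValue) {
        byte[] from = ByteBuffer.allocate(8).putLong(oldValue).array();
        byte[] to = ByteBuffer.allocate(8).putLong(newValue).array();
        for (int i = 0; i + 8 <= data.length; i++) {
            if (Arrays.equals(Arrays.copyOfRange(data, i, i + 8), from)) {
                System.arraycopy(to, 0, data, i, 8);
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        long startMs = 1_000_000_000_123L;
        long endMs = 2_000_000_000_456L;
        byte[] bytes = serialize(new Period(new Date(startMs), new Date(endMs)));

        Period copy = (Period) deserialize(bytes);
        System.out.println("valid stream accepted: " + (copy.getEnd().getTime() == endMs));

        patchLong(bytes, endMs, 5L); // forge the stream: end now precedes start
        try {
            deserialize(bytes);
            System.out.println("forged stream accepted (invariant broken)");
        } catch (InvalidObjectException e) {
            System.out.println("forged stream rejected");
        }
    }
}
```

    If the invariant check is removed from readObject, the forged stream deserializes without complaint into a Period whose end precedes its start.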

  • Conclusion

    In summary, whenever you write a readObject method, think of it this way: you are writing a public constructor that must produce a valid instance no matter what byte stream is passed to it. The following guidelines help you write a more robust readObject method:

    1. For classes with object reference fields that must remain private, defensively copy each object in such a field. Mutable components of immutable classes fall into this category.
    2. Check every constraint and throw an InvalidObjectException if a check fails. These checks should follow all defensive copies.
    3. If the entire object graph must be validated after deserialization, use the ObjectInputValidation interface.
    4. Do not call overridable methods in a readObject method, either directly or indirectly.
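
    Guideline 3 can be sketched as follows. The Range class, its factories, and the roundTrip helper are hypothetical names for this illustration; in particular, the unchecked factory is a demo-only back door that stands in for a forged byte stream. ObjectInputStream.registerValidation and ObjectInputValidation.validateObject are the standard JDK API: the callback runs after the whole object graph has been read and before readObject returns to the caller.

```java
import java.io.*;

public class GraphValidationDemo {

    static final class Range implements Serializable, ObjectInputValidation {
        private static final long serialVersionUID = 1L;
        private final int low;
        private final int high;

        private Range(int low, int high) {
            this.low = low;
            this.high = high;
        }

        /** Normal factory: enforces low <= high. */
        static Range of(int low, int high) {
            if (low > high) {
                throw new IllegalArgumentException(low + " > " + high);
            }
            return new Range(low, high);
        }

        /** Demo-only back door that skips the check, simulating a forged stream. */
        static Range unchecked(int low, int high) {
            return new Range(low, high);
        }

        int low() { return low; }
        int high() { return high; }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            in.registerValidation(this, 0); // validate once the whole graph has been read
        }

        @Override
        public void validateObject() throws InvalidObjectException {
            if (low > high) {
                throw new InvalidObjectException(low + " > " + high);
            }
        }
    }

    /** Serializes an object to memory and reads it back. */
    static Object roundTrip(Object original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Range ok = (Range) roundTrip(Range.of(1, 5));
        System.out.println("restored: " + ok.low() + ".." + ok.high());
        try {
            roundTrip(Range.unchecked(5, 1));
        } catch (InvalidObjectException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

    For a single object such as this one, the check could just as well live directly in readObject; the validation callback becomes useful when the constraint spans several objects in the graph.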

4. Use enumeration to implement singletons

  • The problem

    The simplest way to implement a singleton is:

    public class Elvis {
        public static final Elvis INSTANCE = new Elvis();
        private Elvis() { ... }
        public void leaveTheBuilding() { ... }
    }

    Once this class is made serializable, it is no longer a singleton. It does not matter whether it uses the default or a custom serialized form, or whether it provides an explicit readObject method: every readObject method, explicit or default, returns a newly created instance. So how do we implement a serializable singleton?

  • The answer

    There are two ways to implement a serializable singleton:

    1. Use the readResolve method: The readResolve feature lets you substitute another instance for the one created by readObject. If the class of an object being deserialized defines a readResolve method with the proper declaration, that method is invoked on the newly created object after it has been deserialized. The object reference returned by the method is then returned in place of the newly created object. Therefore, readResolve can return the existing instance on every deserialization, which ensures that only one instance exists no matter how many times the object is deserialized. The sample code is:

      // readResolve for instance control - you can do better!
      private Object readResolve() {
          // Return the one true Elvis and let the garbage collector
          // take care of the Elvis impersonator.
          return INSTANCE;
      }

      This method ignores the deserialized object and returns the distinguished Elvis instance that was created when the class was initialized. Note that if you rely on readResolve for instance control, all instance fields with object reference types must be declared transient. Otherwise, an attacker can obtain a reference to the deserialized object before its readResolve method runs, so a singleton implemented with readResolve is still vulnerable.

    2. Use an enum: A serializable singleton can also be implemented as an enum with a single element. The JVM guarantees that no instance other than the declared constant exists, the code is very concise, and the instance fields do not need to be marked transient:

      import java.util.Arrays;

      // Enum singleton - the preferred approach
      public enum Elvis {
          INSTANCE;
          private String[] favoriteSongs = { "Hound Dog", "Heartbreak Hotel" };
          public void printFavorites() {
              System.out.println(Arrays.toString(favoriteSongs));
          }
      }
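
    The difference between the approaches can be checked with a round trip through serialization. The following is a minimal sketch; NaiveElvis, ResolvedElvis, EnumElvis, and roundTrip are hypothetical names used only for this comparison.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SingletonDemo {

    /** Serializable "singleton" without readResolve: each deserialization yields a new object. */
    static final class NaiveElvis implements Serializable {
        private static final long serialVersionUID = 1L;
        static final NaiveElvis INSTANCE = new NaiveElvis();
        private NaiveElvis() { }
    }

    /** Singleton that replaces every deserialized object with the one true instance. */
    static final class ResolvedElvis implements Serializable {
        private static final long serialVersionUID = 1L;
        static final ResolvedElvis INSTANCE = new ResolvedElvis();
        private ResolvedElvis() { }
        private Object readResolve() {
            return INSTANCE;
        }
    }

    /** Enum singleton: instance control is handled by the serialization mechanism itself. */
    enum EnumElvis { INSTANCE }

    /** Serializes an object to memory and reads it back. */
    static Object roundTrip(Object original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip(NaiveElvis.INSTANCE) == NaiveElvis.INSTANCE);       // false
        System.out.println(roundTrip(ResolvedElvis.INSTANCE) == ResolvedElvis.INSTANCE); // true
        System.out.println(roundTrip(EnumElvis.INSTANCE) == EnumElvis.INSTANCE);         // true
    }
}
```

    The naive version silently produces a second instance, while both the readResolve version and the enum version return the original one; only the enum does so without any extra code.
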
  • Conclusion

    The simplest and safest way to implement a serializable singleton is an enum, which should be used whenever possible. If you have to use readResolve instead, make sure that all of the class's instance fields are either primitive or transient.