The original:
4 More Techniques for Writing Better Java


The author:
Justin Albano


Translation: Vincent

If you were asked to optimize your Java code, what would you do? In this article, the author introduces four ways to improve system performance and code readability. If you’re interested, let’s take a look. The following is a translation.

Our day-to-day programming tasks are little more than applying the same suite of techniques to different projects, and for the most part these techniques do the trick. However, some projects call for special techniques, so engineers have to dig deep to find the simplest yet most effective approach. In the previous article, we discussed four special techniques that can be used when necessary to create better Java software. In this article, we introduce some common design strategies and implementation techniques that help solve common problems:

  1. Do only targeted optimizations
  2. Use enumerations for constants whenever possible
  3. Override equals() within the class
  4. Use polymorphism as often as possible

It is important to note that the techniques described in this article are not suitable for every situation. When and where they should be used requires careful consideration.

1. Only optimize purposefully

Performance is certainly a major concern for large software systems. While we want to write the most efficient code we can, it is often unclear where optimization is actually needed. For example, does the following code affect performance?

public void processIntegers(List<Integer> integers) {
    for (Integer value: integers) {
        for (int i = integers.size() - 1; i >= 0; i--) {
            value += integers.get(i);
        }
    }
}

It really depends. As the code above shows, its processing algorithm is O(n²) in big O notation, where n is the size of the list. If n is only 5, there is no problem: only 25 iterations are performed. But if n is 100,000, performance may suffer. Note that even then we cannot say for sure that there will be a problem. Although this approach requires 10 billion logical iterations, whether that affects overall performance is still open to debate.

For example, if the client executes this code in its own thread and waits asynchronously for the computation to complete, the execution time might be acceptable. Similarly, if the system is deployed in production and no client ever calls this code, there is no need to optimize it, because it consumes none of the system's overall performance. In fact, tuning it would make the system more complex without improving its performance at all.

The bottom line is that there is no such thing as a free lunch. Optimizations such as caching, loop unrolling, or precomputing values increase the complexity of the system and reduce the readability of the code. If an optimization improves the performance of the system, the complexity may be worth it, but two pieces of information must be known before making that decision:

  1. What are the performance requirements
  2. Where are the performance bottlenecks

First, we need to be clear about what the performance requirements are. If the system already performs within the requirements and end users raise no objections, performance tuning is unnecessary. However, when new functionality is added or the amount of data in the system reaches a certain size, optimization may become necessary to avoid problems.

In such cases, we should rely on neither intuition nor inspection, because even experienced developers like Martin Fowler are prone to making bad optimizations, as explained in Refactoring (page 70):

The interesting thing about performance, if you analyze enough programs, is that most of the time is wasted in a small portion of the system’s code. If you optimize all your code the same way, you end up wasting 90% of your optimization because your optimized code doesn’t run very often. Any time spent optimizing for no purpose is a waste of time.

As battle-hardened developers, we should take this point seriously. A first guess at where to optimize not only tends to leave performance unimproved, it also wastes most of the development time spent on it. Instead, we should execute common use cases in a production (or pre-production) environment, profile the system to find out which parts consume resources during execution, and then optimize those parts. If only 10 percent of the code consumes most of the resources, optimizing the other 90 percent is a waste of time.

To put the results of the analysis to use, we should start with the largest consumers it identifies. This ensures that the effort actually improves the performance of the system. The analysis step should be repeated after each optimization. This not only confirms that performance has actually improved, but also shows where the bottlenecks are afterwards (because once one bottleneck is resolved, others may account for a larger share of the system's resources). Note that the percentage of time spent in the remaining bottlenecks is likely to increase, since the overall execution time should decrease as the target bottleneck is removed.
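
As a rough illustration of measuring before optimizing, the sketch below times a workload shaped like the nested loop shown earlier at several input sizes. The class name HotspotTimingSketch and the methods sumAllPairs and elapsedNanos are hypothetical and not from the original article. A single System.nanoTime() measurement is crude (JIT warm-up and garbage collection distort it), so treat this as a first sanity check under those assumptions, not a replacement for a real profiler.

```java
import java.util.ArrayList;
import java.util.List;

final class HotspotTimingSketch {

    // Hypothetical workload with the same nested-loop shape as processIntegers:
    // for every element, walk the whole list again (n * n inner steps).
    static long sumAllPairs(List<Integer> integers) {
        long total = 0;
        for (int value : integers) {
            for (int i = integers.size() - 1; i >= 0; i--) {
                total += value + integers.get(i);
            }
        }
        return total;
    }

    // Wall-clock time, in nanoseconds, taken by a single run of the task.
    static long elapsedNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Measure at a few sizes to see how the cost grows with n.
        for (int n : new int[] {100, 1_000, 5_000}) {
            List<Integer> data = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                data.add(i);
            }
            long nanos = elapsedNanos(() -> sumAllPairs(data));
            System.out.printf("n=%d took %.3f ms%n", n, nanos / 1_000_000.0);
        }
    }
}
```

If the measured time at realistic input sizes is well within the performance requirements, that is a signal to leave the code alone.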

Although a comprehensive profile of a Java system takes considerable effort, some common tools can help spot performance hotspots, including JMeter, AppDynamics, and YourKit. See also DZone’s Performance Monitoring guide for more information on Java performance tuning.

Although performance is a very important component of many large software systems and is part of an automated test suite in the product delivery pipeline, it cannot be optimized blindly and aimlessly. Instead, specific optimizations should be made for performance bottlenecks that are already understood. This not only helps us avoid adding complexity to the system, but also saves us from taking detours and doing time-wasting optimizations.

2. Use enumerations for constants as much as possible

Many scenarios require us to list a set of predefined or constant values, such as the HTTP response codes that might be encountered in a web application. One of the most common implementation techniques is to create a new class that contains a number of static final values. Each value should have a comment describing what it means:

public class HttpResponseCodes {
    public static final int OK = 200;
    public static final int NOT_FOUND = 404;
    public static final int FORBIDDEN = 403;
}
if (getHttpResponse().getStatusCode() == HttpResponseCodes.OK) {
    // Do something if the response code is OK 
}

This is all very well and good, but there are some downsides:

  1. The integer value passed in is not strictly validated
  2. Methods cannot be called on the status code because it is a primitive type

In the first case, a constant is simply created to represent a particular integer value, but nothing restricts a method or variable to the defined constants, so the values used may fall outside the defined range. For example:

public class HttpResponseHandler {
    public static void printMessage(int statusCode) {
        System.out.println("Received status of " + statusCode);
    }
}
HttpResponseHandler.printMessage(15000);

Although 15000 is not a valid HTTP response code, nothing forces the caller to provide a valid integer. In the second case, there is no way to define methods on the status code itself. For example, to check whether a given status code is a success, we must define a separate function:

public class HttpResponseCodes {
    public static final int OK = 200;
    public static final int NOT_FOUND = 404;
    public static final int FORBIDDEN = 403;
    public static boolean isSuccess(int statusCode) {
        return statusCode >= 200 && statusCode < 300; 
    }
}
if (HttpResponseCodes.isSuccess(getHttpResponse().getStatusCode())) {
    // Do something if the response code is a success code 
}

To solve these problems, we need to change the constant type from a primitive type to a custom type that allows only a specific set of objects. This is exactly what Java enumerations (enums) are for. With an enum, we can solve both problems at once:

public enum HttpResponseCodes {
    OK(200),
    FORBIDDEN(403),
    NOT_FOUND(404);
    private final int code; 
    HttpResponseCodes(int code) {
        this.code = code;
    }
    public int getCode() {
        return code;
    }
    public boolean isSuccess() {
        return code >= 200 && code < 300;
    }
}
if (getHttpResponse().getStatusCode().isSuccess()) {
    // Do something if the response code is a success code 
}

It is also now possible to require that callers pass a valid status code when invoking a method:

public class HttpResponseHandler {
    // Only constants of the HttpResponseCodes enum can be passed in
    public static void printMessage(HttpResponseCodes statusCode) {
        System.out.println("Received status of " + statusCode.getCode());
    }
}
HttpResponseHandler.printMessage(HttpResponseCodes.OK);
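
Raw integers still arrive from the outside world (for example, from a parsed response), so somewhere an int has to be turned into an enum constant. The following is one possible sketch of that boundary. The enum name StatusCodeSketch and the helper fromCode are hypothetical and not part of the original article; returning an Optional is just one design choice, and throwing an exception for unknown codes would be equally reasonable.

```java
import java.util.Optional;

enum StatusCodeSketch {
    OK(200),
    FORBIDDEN(403),
    NOT_FOUND(404);

    private final int code;

    StatusCodeSketch(int code) {
        this.code = code;
    }

    public int getCode() {
        return code;
    }

    // Hypothetical helper: map a raw integer to a known constant, if there is one.
    public static Optional<StatusCodeSketch> fromCode(int rawCode) {
        for (StatusCodeSketch candidate : values()) {
            if (candidate.code == rawCode) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}

final class StatusCodeSketchDemo {
    public static void main(String[] args) {
        System.out.println(StatusCodeSketch.fromCode(404));   // Optional[NOT_FOUND]
        System.out.println(StatusCodeSketch.fromCode(15000)); // Optional.empty
    }
}
```

With a conversion like this at the edge of the system, an invalid value such as 15000 is rejected once, and everything past that point can rely on the enum type.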

Note that this example shows that enumerations should be preferred for constants whenever possible, not that they should be used in every case. In some cases, we may want a constant to represent a particular value while other values are still allowed. For example, we all know pi, and we can capture this value (and reuse it) with a constant:

public class NumericConstants {
    public static final double PI = 3.14;
    // Area of a circle with radius 1: PI * 1 * 1
    public static final double UNIT_CIRCLE_AREA = PI;
}
public class Rug {
    private final double area;
    public Rug(double area) {
        this.area = area;
    }
    public double getCost() {
        return area * 2;
    }
}
// Create a carpet that is 4 feet in diameter (radius of 2 feet);
// the area scales with the square of the radius
Rug fourFootRug = new Rug(2 * 2 * NumericConstants.UNIT_CIRCLE_AREA);

Thus, the rules for using enumerations can be summarized as follows:


When all possible discrete values are known in advance, enumerations can be used


Taking the HTTP response codes above as an example, we know all the values of the HTTP status codes in advance (they are listed in RFC 7231, which defines the HTTP 1.1 protocol), hence the enumeration. In the rug example, we do not know all the possible values for the area (any double is valid), but we still want a constant for circular rugs to make the calculation easier to write and to read, so we define a set of constants.

If you don’t know all the possible values in advance but want to attach fields or methods to each value, the easiest way is to create a new class to represent the data. While there is no scenario in which enumerations must never be used, the key to deciding when to use one is whether all the values are known in advance and no others are allowed.
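
As an illustration of the "create a new class" option, here is a minimal sketch of a value class for rug areas, where any non-negative double is allowed but we still want methods on the value. The class RugArea and its methods ofCircle, getSquareFeet, and costAt are hypothetical names chosen for this example and do not appear in the original article.

```java
final class RugArea {
    private final double squareFeet;

    RugArea(double squareFeet) {
        // Unlike an enum, any non-negative value is acceptable here.
        if (squareFeet < 0) {
            throw new IllegalArgumentException("area must not be negative");
        }
        this.squareFeet = squareFeet;
    }

    // Hypothetical factory for circular rugs: area = pi * r * r.
    static RugArea ofCircle(double radiusFeet) {
        return new RugArea(Math.PI * radiusFeet * radiusFeet);
    }

    double getSquareFeet() {
        return squareFeet;
    }

    // A method on the value itself, which a bare double cannot offer.
    double costAt(double pricePerSquareFoot) {
        return squareFeet * pricePerSquareFoot;
    }
}

final class RugAreaDemo {
    public static void main(String[] args) {
        RugArea fourFootRug = RugArea.ofCircle(2.0);
        System.out.println(fourFootRug.costAt(2.0));
    }
}
```

The class gives the open-ended value the same benefits the enum gave the status codes (validation and behavior), without pretending the set of values is closed.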

3. Override equals() within the class

Object identity can be a difficult problem to solve: if two objects occupy the same place in memory, are they the same? If they have the same ID, are they the same? What if all their fields are equal? Although each class has its own identity logic, many places in a system need to check for equality. For example, consider a class that represents a purchase order…

public class Purchase {
    private long id;
    public long getId() {
        return id;
    }
    public void setId(long id) {
        this.id = id;
    }
}

… code like the following is then likely to appear in many places:

Purchase originalPurchase = new Purchase();
Purchase updatedPurchase = new Purchase();
if (originalPurchase.getId() == updatedPurchase.getId()) {
    // Execute some logic for equal purchases 
}

The more this logic is spread around (which, in turn, violates the DRY principle), the more knowledge about the identity of the Purchase class is scattered across the system. If, for some reason, the identity logic of the Purchase class changes (for example, the type of the identifier changes), every one of those places has to be updated.

We should internalize this logic inside the class rather than spreading the identity logic of the Purchase class throughout the system. At first glance, we could create a new method, such as isSame, that takes a Purchase object and compares the IDs of the two objects:

public class Purchase {
    private long id;
    // getId() and setId() as defined above
    public boolean isSame(Purchase other) {
        return getId() == other.getId();
    }
}

While this is a valid solution, it ignores Java’s built-in equals method. Every class in Java inherits from the Object class, albeit implicitly, and therefore also inherits the equals method. By default, this method checks for object identity (the same object in memory), as shown in the following snippet from the Object class definition in the JDK (version 1.8.0_131):

public boolean equals(Object obj) {
    return (this == obj);
}

The equals method acts as a natural place to inject identity logic (by overriding the default equals implementation):

public class Purchase {
    private long id;
    public long getId() {
        return id;
    }
    public void setId(long id) {
        this.id = id;
    }
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        else if (!(other instanceof Purchase)) {
            return false;
        }
        else {
            return ((Purchase) other).getId() == getId();
        }
    }
}

Although this equals method may seem complicated, equals accepts an argument of type Object, so we only need to consider three cases:

  1. The other object is the current object (originalPurchase.equals(originalPurchase)); by definition they are the same object, so we return true

  2. The other object is not a Purchase object, in which case we cannot compare Purchase IDs, so the two objects are not equal

  3. The other object is a different object but is an instance of Purchase, so equality depends on whether the ID of the current Purchase equals the ID of the other Purchase

Now we can refactor our previous condition as follows:

Purchase originalPurchase = new Purchase();
Purchase updatedPurchase = new Purchase();
if (originalPurchase.equals(updatedPurchase)) {
    // Execute some logic for equal purchases 
}

In addition to reducing duplication in the system, overriding the default equals method has other advantages. For example, if we build a list of Purchase objects and check whether the list contains another Purchase object with the same ID (a different object in memory), we get true because the two values are considered equal:

List<Purchase> purchases = new ArrayList<>();
purchases.add(originalPurchase);
purchases.contains(updatedPurchase); // True

In general, wherever we need to determine whether two instances of a class are equal, we only need to use the overridden equals method. If we want the identity-based comparison that the inherited Object implementation of equals performs, we can use the == operator instead:

if (originalPurchase == updatedPurchase) {
    // The two objects are the same objects in memory 
}

Also note that when the equals method is overridden, the hashCode method should also be overridden. For more information about the relationship between the two methods, and how to define hashCode methods properly, see this thread.
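
To make the equals and hashCode pairing concrete, here is a minimal sketch of an ID-based class that overrides both. The class name PurchaseWithHash is hypothetical, and the ID is made final here as an assumption of this sketch (the article's Purchase has a setter): if the ID changed while the object sat in a hash-based collection, lookups could fail.

```java
import java.util.HashSet;
import java.util.Set;

final class PurchaseWithHash {
    // Assumption of this sketch: the identifier never changes after construction.
    private final long id;

    PurchaseWithHash(long id) {
        this.id = id;
    }

    long getId() {
        return id;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof PurchaseWithHash)) {
            return false;
        }
        return ((PurchaseWithHash) other).id == id;
    }

    // Equal objects must produce equal hash codes, so hash the same field
    // that equals compares.
    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }
}

final class PurchaseWithHashDemo {
    public static void main(String[] args) {
        Set<PurchaseWithHash> seen = new HashSet<>();
        seen.add(new PurchaseWithHash(42L));
        System.out.println(seen.contains(new PurchaseWithHash(42L))); // true
        System.out.println(seen.contains(new PurchaseWithHash(7L)));  // false
    }
}
```

Without the hashCode override, the first lookup would usually return false, because HashSet locates candidates by hash code before it ever calls equals.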

As we’ve seen, overriding equals not only internalizes the identity logic inside the class and reduces the spread of this logic throughout the system, it also allows the Java language to make informed decisions about the class.

4. Use polymorphism as often as possible

Conditional statements are a common construct in any programming language, and they exist for a reason: they allow us to change the behavior of the system based on the current state of a given value or object. Suppose we are building a system that calculates the balance of various bank accounts; the following code might be written:

public enum BankAccountType {
    CHECKING,
    SAVINGS,
    CERTIFICATE_OF_DEPOSIT;
}
public class BankAccount {
    private final BankAccountType type;
    public BankAccount(BankAccountType type) {
        this.type = type;
    }
    public double getInterestRate() {
        switch(type) {
            case CHECKING:
                return 0.03; // 3%
            case SAVINGS:
                return 0.04; // 4%
            case CERTIFICATE_OF_DEPOSIT:
                return 0.05; // 5%
            default:
                throw new UnsupportedOperationException();
        }
    }
    public boolean supportsDeposits() {
        switch(type) {
            case CHECKING:
                return true;
            case SAVINGS:
                return true;
            case CERTIFICATE_OF_DEPOSIT:
                return false;
            default:
                throw new UnsupportedOperationException();
        }
    }
}

While the above code meets the basic requirements, there is an obvious flaw: the behavior of the system is decided solely by checking the type of the given account. This not only requires checking the account type every time a decision is made, but also repeats that logic for every decision. For example, in the design above, the type must be checked in both methods. This can get out of hand, especially when a request arrives to add a new account type.

Instead of switching on the account type, we can use polymorphism to make the decisions implicitly. To do this, we convert the BankAccount concrete class into an interface and push the decisions into a series of concrete classes, one for each type of bank account:

public interface BankAccount {
    public double getInterestRate();
    public boolean supportsDeposits();
}
public class CheckingAccount implements BankAccount {
    @Override
    public double getInterestRate() {
        return 0.03;
    }
    @Override
    public boolean supportsDeposits() {
        return true;
    }
}
public class SavingsAccount implements BankAccount {
    @Override
    public double getInterestRate() {
        return 0.04;
    }
    @Override
    public boolean supportsDeposits() {
        return true;
    }
}
public class CertificateOfDepositAccount implements BankAccount {
    @Override
    public double getInterestRate() {
        return 0.05;
    }
    @Override
    public boolean supportsDeposits() {
        return false;
    }
}

This not only encapsulates the information unique to each account in its own class, but also makes the design easier to change in two important ways. First, if we want to add a new bank account type, we simply create a new concrete class that implements the BankAccount interface and provides concrete implementations of the two methods. In the conditional design, we have to add a new value to the enumeration, add a new case statement to both methods, and insert the logic for the new account under each case statement.

Second, if we want to add a new method to the BankAccount interface, we simply add the new method to each concrete class. In the conditional design, we have to copy an existing switch statement into the new method and then add the logic for each account type under each case statement.

Mathematically, when we create a new method or add a new type, we have to make the same number of logical changes in the polymorphic and conditional designs. For example, if we add a new method to the polymorphic design, we have to add it to the concrete classes of all n bank account types; in the conditional design, we have to add n new case statements to the new method. If we add a new account type in the polymorphic design, we have to implement all m methods of the BankAccount interface; in the conditional design, we have to add a new case statement to each of the m existing methods.

While the number of changes we have to make is equal, the nature of the changes is quite different. In the polymorphic design, if we add a new account type and forget to include a method, the compiler reports an error because we did not implement all the methods of the BankAccount interface. In the conditional design, there is no such check to ensure that there is a case statement for each type. If a new type is added, we can simply forget to update every switch statement. The problem becomes more serious the more often we repeat our switch statements. We are human, and we tend to make mistakes, so any time we can rely on the compiler to alert us to errors, we should.
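
The compiler check described above can be seen in a small sketch of adding a new account type to the polymorphic design. All names here (AccountSketch, CheckingAccountSketch, MoneyMarketAccountSketch) and the 4.5% rate are hypothetical and chosen only for illustration; the interface mirrors the two methods used in this article.

```java
interface AccountSketch {
    double getInterestRate();
    boolean supportsDeposits();
}

final class CheckingAccountSketch implements AccountSketch {
    @Override
    public double getInterestRate() {
        return 0.03;
    }

    @Override
    public boolean supportsDeposits() {
        return true;
    }
}

// Hypothetical new account type: adding it touches no existing class.
// Deleting either method below makes this class fail to compile, which is
// the safety net the switch-based design lacks.
final class MoneyMarketAccountSketch implements AccountSketch {
    @Override
    public double getInterestRate() {
        return 0.045; // illustrative rate, not from the article
    }

    @Override
    public boolean supportsDeposits() {
        return true;
    }
}

final class AccountSketchDemo {
    // Callers depend only on the interface, so they need no changes either.
    static String describe(AccountSketch account) {
        return "rate=" + account.getInterestRate()
                + ", deposits=" + account.supportsDeposits();
    }

    public static void main(String[] args) {
        System.out.println(describe(new CheckingAccountSketch()));
        System.out.println(describe(new MoneyMarketAccountSketch()));
    }
}
```

In the switch-based design, the same change would compile cleanly even if one of the switch statements were never updated, and the omission would only surface at runtime through the default branch.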

The second important note about these two designs is that they are externally equivalent. For example, if we wanted to check the interest rate on a checking account, the conditional design would look something like this:

BankAccount checkingAccount = new BankAccount(BankAccountType.CHECKING);
System.out.println(checkingAccount.getInterestRate()); // Output: 0.03

Instead, a polymorphic design would look like this:

BankAccount checkingAccount = new CheckingAccount();
System.out.println(checkingAccount.getInterestRate()); // Output: 0.03

From an external point of view, we simply call getInterestRate() on a BankAccount object. This is even more obvious if we abstract the creation process into a factory class:

public class ConditionalAccountFactory {
    public static BankAccount createCheckingAccount() {
        return new BankAccount(BankAccountType.CHECKING);
    }
}
public class PolymorphicAccountFactory {
    public static BankAccount createCheckingAccount() {
        return new CheckingAccount();
    }
}
// In both cases, we create the accounts using a factory
BankAccount conditionalCheckingAccount = ConditionalAccountFactory.createCheckingAccount();
BankAccount polymorphicCheckingAccount = PolymorphicAccountFactory.createCheckingAccount();
// In both cases, the call to obtain the interest rate is the same
System.out.println(conditionalCheckingAccount.getInterestRate()); // Output: 0.03
System.out.println(polymorphicCheckingAccount.getInterestRate()); // Output: 0.03

It is very common to replace conditional logic with polymorphic classes, so methods for refactoring conditional statements into polymorphic classes have been published. Here’s a simple example. In addition, Martin Fowler’s Refactoring (p. 255) describes a detailed process for performing this refactoring.

As with the other techniques in this article, there are no hard and fast rules for when to convert conditional logic to polymorphic classes. In fact, we do not recommend doing it in every case. For example, in Test-Driven Development: By Example, Kent Beck designs a simple currency system using polymorphic classes, finds that this makes the design too complex, and reworks it into a non-polymorphic style. Experience and sound judgment will determine when it is appropriate to convert conditional code to polymorphic code.

Conclusion

As programmers, although most problems can be solved by the usual techniques we use, sometimes we need to break the mold and demand innovation. After all, expanding the breadth and depth of our knowledge as developers not only enables us to make better decisions, but also makes us smarter.

