An overview of JMH

JMH is a micro-benchmark framework developed by the OpenJDK/Oracle engineers who build the Java compilers. What is a micro benchmark? It is a benchmark at the method level, accurate to the microsecond. In other words, JMH is mainly useful once you have already located a hot-spot function and want to optimize it further: you can use JMH to quantify the effect of each optimization.

Typical usage scenarios include:

• You want to know quantitatively how long a function takes to execute, and how the execution time correlates with the input size n
• A function has two different implementations (for example, implementation A uses FixedThreadPool and implementation B uses ForkJoinPool), and you do not know which one performs better

Although JMH is a fairly good micro-benchmark framework, there is unfortunately little documentation available online, and the official documentation is not very detailed, which makes getting started harder than it should be. The good news is that the official code samples are fairly straightforward, and reading through them is recommended if you need to learn more about JMH. This article covers some of the most typical uses of JMH and some of the common options.

The first example

If you’re using Maven to manage your Java projects, introducing JMH is a simple matter – just add the JMH dependencies to pom.xml

<properties>
    <jmh.version>1.14.1</jmh.version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.openjdk.jmh</groupId>
        <artifactId>jmh-core</artifactId>
        <version>${jmh.version}</version>
    </dependency>
    <dependency>
        <groupId>org.openjdk.jmh</groupId>
        <artifactId>jmh-generator-annprocess</artifactId>
        <version>${jmh.version}</version>
        <scope>provided</scope>
    </dependency>
</dependencies>

Let’s create our first Benchmark

package com.ckj.base.designPatternes.proxy.DynamicProxy;

import org.openjdk.jmh.annotations.*;
import org.springframework.cglib.proxy.Enhancer;
import org.springframework.cglib.proxy.MethodInterceptor;
import org.springframework.cglib.proxy.MethodProxy;

import java.lang.reflect.Method;
import java.util.concurrent.TimeUnit;

/**
 * @author c.kj
 * @Description
 * @Date 2021-03-04
 * @Time 21:55
 **/
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Thread)
@Fork(1)
public class CglibProxy {

    @Benchmark
    @Warmup(iterations = 5, time = 100, timeUnit = TimeUnit.MICROSECONDS)
    @Measurement(iterations = 5, time = 100, timeUnit = TimeUnit.MICROSECONDS)
    public int measureName() throws InterruptedException {
        // sleep 1 millisecond so the benchmark has a known baseline cost
        Thread.sleep(1);
        getProxyInstance();
        return 0;
    }

    public Object getProxyInstance() {
        Enhancer enhancer = new Enhancer();
        enhancer.setSuperclass(CglibTarget.class);
        MethodInterceptor methodInterceptor = new MethodInterceptor() {
            @Override
            public Object intercept(Object o, Method method, Object[] objects, MethodProxy methodProxy)
                    throws Throwable {

                System.out.println("intercept start....");
                Object o1 = methodProxy.invokeSuper(o, objects);
                System.out.println("intercept end....");

                return o1;
            }
        };
        enhancer.setCallback(methodInterceptor);
        Object o = enhancer.create();
        return o;

    }

}


There are quite a few annotations here that you may be seeing for the first time, but don’t worry, I’ll explain what they mean next. Let’s run the benchmark 🙂

# JMH 1.14.1 (Released 1780 days ago, please consider updating!)
# VM version: JDK 1.8.0_201, VM 25.201-b09
# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_201.jdk/Contents/Home/jre/bin/java
# VM options: -Dvisualvm.id=94033462347806 -javaagent:/Applications/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=56278:/Applications/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
# Warmup: 5 iterations, 100 us each
# Measurement: 5 iterations, 100 us each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: com.ckj.base.designPatternes.proxy.DynamicProxy.CglibProxy.measureName

# Run progress: 0.00% complete, ETA 00:00:00
# Fork: 1 of 1
# Warmup Iteration   1: 68584.809 us/op
# Warmup Iteration   2: 1312.436 us/op
# Warmup Iteration   3: 1329.796 us/op
# Warmup Iteration   4: 1329.558 us/op
# Warmup Iteration   5: 1296.309 us/op
Iteration   1: 1296.129 us/op
Iteration   2: 1121.620 us/op
Iteration   3: 1603.035 us/op
Iteration   4: 1306.412 us/op
Iteration   5: 1194.444 us/op

Result "measureName":
  1304.328 ±(99.9%) 706.764 us/op [Average]
  (min, avg, max) = (1121.620, 1304.328, 1603.035), stdev = 183.544
  CI (99.9%): [597.564, 2011.092] (assuming normal distribution)

# Run complete. Total time: 00:00:01

Benchmark                                                  Mode  Cnt     Score     Error  Units
designPatternes.proxy.DynamicProxy.CglibProxy.measureName  avgt    5  1304.328 ± 706.764  us/op

Process finished with exit code 0

The test results show that measureName() takes about 1304.328 microseconds per call on average. Since our benchmark method sleeps for 1 millisecond (1000 microseconds) before creating the proxy, the JMH result is pretty much what we expected.

Ok, now let’s explain the meaning of the code in more detail. But before we do that, we need to take a look at some of the basic concepts of JMH.

The basic concept

Mode

Mode indicates how JMH runs the benchmark: the modes differ in what dimension they measure, or in how they measure it. JMH currently has four modes:

•Throughput: overall throughput, for example "how many calls can be executed in 1 second".
•AverageTime: average invocation time, for example "each invocation takes XXX milliseconds".
•SampleTime: random sampling, outputting the distribution of the sampled results, e.g. "99% of calls finish within XXX milliseconds, 99.99% within XXX milliseconds".
•SingleShotTime: whereas an iteration lasts 1 second by default in the other modes, SingleShotTime runs the method only once. Warmup is often set to 0 at the same time, to test performance on a cold start.

Iteration

Iteration is the smallest unit of a JMH test. In most modes, one iteration lasts one second by default; JMH calls the benchmark method repeatedly during that second and, depending on the mode, samples the calls to compute throughput, average execution time, and so on.

Warmup

Warmup means warming up before the real benchmark runs. Why warm up? Because of the JVM’s JIT mechanism: if a function is called many times, the JVM will compile it into machine code to speed up execution. So, to get benchmark results closer to reality, you need to warm up first.

annotations

Now to explain the annotations used in the example above, many of them are perfectly literal 🙂

@Benchmark

Marks a method as a benchmark target, similar to JUnit’s @Test.

@Mode

Mode, as mentioned earlier, represents the Mode that JMH uses to Benchmark.

@State

@State declares that a class is a “state”, and takes a Scope parameter indicating how that state is shared. Since many benchmarks need classes that represent state, JMH lets you inject them into benchmark methods through dependency injection. Scope mainly comes in two kinds:

•Thread: the state is private to each thread.
•Benchmark: the state is shared across all threads.

As for the use of State, there are good examples in the official code sample.
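As a minimal sketch of my own (the class and field names below are illustrative, not from the official samples), a @State class can be declared as a nested class and injected into the benchmark method as a parameter:

```java
import java.util.Random;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;

public class StateInjectionBench {

    // One instance of this state is created per benchmark thread
    @State(Scope.Thread)
    public static class ThreadState {
        Random random;

        @Setup
        public void setUp() {
            random = new Random(42);
        }
    }

    // JMH injects the state instance as a method argument
    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    public int measureNextInt(ThreadState state) {
        return state.random.nextInt();
    }
}
```

With Scope.Benchmark instead of Scope.Thread, all benchmark threads would share the same ThreadState instance.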

@OutputTimeUnit

The unit of time used for benchmark results.

Startup options

Having explained the annotations, let’s look at the parameters that JMH sets before starting up.

Options opt = new OptionsBuilder()
        .include(FirstBenchmark.class.getSimpleName())
        .forks(1)
        .warmupIterations(5)
        .measurementIterations(5)
        .build();

new Runner(opt).run();

include

The name of the benchmark class to run. Note that it is treated as a regular expression, so it can match multiple classes.

fork

The number of forks. If the fork number is 2, JMH will fork two separate processes to run the test.

warmupIterations

Number of iterations warmed up.

measurementIterations

The number of iterations actually measured.

Second example

After looking at the first purely demonstration example, let’s look at a practical one.

Question:

Calculate the sum from 1 to n, and compare the efficiency of a serial algorithm with that of a parallel algorithm, to see at what n the parallel algorithm starts to surpass the serial one.

Start by defining an interface that represents both implementations

public interface Calculator {
    /**
     * calculate sum of an integer array
     * @param numbers
     * @return
     */
    public long sum(int[] numbers);

    /**
     * shutdown pool or reclaim any related resources
     */
    public void shutdown();
}

Since the implementation of these two algorithms is not the focus of this article and is not inherently difficult, the actual code will not be described. If you’re really interested, look at the appendix at the end. The following is just an illustration of what I mean by serial and parallel algorithms.

• Serial algorithm: use a for-loop to compute the sum of the n positive integers.
• Parallel algorithm: split the n positive integers into m parts, hand them to m threads to sum separately, then add up the partial results.

The code for benchmark is as follows

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Benchmark)
public class SecondBenchmark {

    @Param({"10000", "100000", "1000000"})
    private int length;

    private int[] numbers;
    private Calculator singleThreadCalc;
    private Calculator multiThreadCalc;

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(SecondBenchmark.class.getSimpleName())
                .forks(2)
                .warmupIterations(5)
                .measurementIterations(5)
                .build();
        new Runner(opt).run();
    }

    @Benchmark
    public long singleThreadBench() {
        return singleThreadCalc.sum(numbers);
    }

    @Benchmark
    public long multiThreadBench() {
        return multiThreadCalc.sum(numbers);
    }

    @Setup
    public void prepare() {
        numbers = IntStream.rangeClosed(1, length).toArray();
        singleThreadCalc = new SinglethreadCalculator();
        multiThreadCalc = new MultithreadCalculator(Runtime.getRuntime().availableProcessors());
    }

    @TearDown
    public void shutdown() {
        singleThreadCalc.shutdown();
        multiThreadCalc.shutdown();
    }
}

Notice that there are three annotations that were not used before.

@Param

@Param can be used to specify multiple values for a field. It is especially useful for testing how a function performs with different parameters.

@Setup

@Setup is executed before the benchmark and, as its name suggests, is used for initialization.

@TearDown

@TearDown, the opposite of @Setup, is executed after all benchmark runs have finished and is used to release resources, etc.

And finally, guess what the actual result is. At what problem size does the parallel algorithm start to outperform the serial one?

I ran it on my Mac: at n = 10,000 the parallel algorithm is worse than the serial one; at 100,000 the parallel algorithm starts to approach the serial algorithm; and at 1,000,000 the parallel algorithm takes about half the time of the serial one.

Commonly used options

There are also some common JMH options not covered above, which are briefly described here.

CompilerControl

Controls the compiler’s behavior, such as forcing inline, disallowing compilation, etc.
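For instance, here is a minimal sketch (the method names are my own, for illustration) that asks HotSpot not to inline a method, so that the call overhead itself remains part of what is measured:

```java
import org.openjdk.jmh.annotations.*;

public class CompilerControlBench {

    // Tell the JIT compiler never to inline this method into its callers
    @CompilerControl(CompilerControl.Mode.DONT_INLINE)
    private int target(int x) {
        return x * 31 + 7;
    }

    @Benchmark
    public int measureCall() {
        return target(1);
    }
}
```

Other modes include CompilerControl.Mode.INLINE (force inlining) and CompilerControl.Mode.EXCLUDE (do not compile at all).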

Group

Multiple benchmarks can be defined as the same group, and they are executed simultaneously, primarily to test multiple methods that affect each other.
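A minimal sketch (class and group names are illustrative) of a group in which three writer threads and one reader thread run against the same counter at the same time. Note that this uses a third scope, Scope.Group, in which the state is shared by the threads of one group:

```java
import java.util.concurrent.atomic.AtomicLong;

import org.openjdk.jmh.annotations.*;

@State(Scope.Group)  // one counter instance shared within each group
public class GroupBench {

    private final AtomicLong counter = new AtomicLong();

    // Three threads of the "rw" group execute inc()...
    @Benchmark
    @Group("rw")
    @GroupThreads(3)
    public long inc() {
        return counter.incrementAndGet();
    }

    // ...while one thread of the same group concurrently executes get()
    @Benchmark
    @Group("rw")
    @GroupThreads(1)
    public long get() {
        return counter.get();
    }
}
```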

Level

This controls when @Setup and @TearDown are called. The default, Level.Trial, means before and after the entire benchmark run; Level.Iteration and Level.Invocation run them around each iteration or each method invocation, respectively.
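The idea can be sketched like this (names are illustrative): allocation happens once per trial, while the reset runs before every iteration:

```java
import java.util.Arrays;

import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
public class LevelBench {

    long[] data;

    // Level.Trial: executed once, before the whole benchmark run
    @Setup(Level.Trial)
    public void allocate() {
        data = new long[1024];
    }

    // Level.Iteration: executed again before every iteration
    @Setup(Level.Iteration)
    public void reset() {
        Arrays.fill(data, 0L);
    }

    @Benchmark
    public long sum() {
        long total = 0;
        for (long v : data) {
            total += v;
        }
        return total;
    }
}
```

Level.Invocation (setup before every single call) also exists, but the official documentation warns that it can distort timings and should be used sparingly.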

Profiler

JMH supports profilers that can display wait time to run time ratios, hot spot functions, and so on.
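Profilers are attached through the options builder. A sketch, assuming the SecondBenchmark class from earlier is on the classpath (the include pattern is a regular expression, as noted above):

```java
import org.openjdk.jmh.profile.GCProfiler;
import org.openjdk.jmh.profile.StackProfiler;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

public class ProfiledRun {
    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include("SecondBenchmark")
                .addProfiler(StackProfiler.class)  // samples hot stack traces
                .addProfiler(GCProfiler.class)     // reports allocation rate and GC time
                .build();
        new Runner(opt).run();
    }
}
```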

Further reading

IDE plug-ins

IntelliJ has a JMH plug-in, which provides conveniences such as automatically generating benchmark methods.

JMH tutorial

Jenkov’s JMH tutorial, which is much more detailed than this article, is highly recommended. Jenkov’s other Java tutorials are also worth checking out.

The appendix

Code listing

public class SinglethreadCalculator implements Calculator {

    public long sum(int[] numbers) {
        long total = 0L;
        for (int i : numbers) {
            total += i;
        }
        return total;
    }

    @Override
    public void shutdown() {
        // nothing to do
    }
}

public class MultithreadCalculator implements Calculator {

    private final int nThreads;
    private final ExecutorService pool;

    public MultithreadCalculator(int nThreads) {
        this.nThreads = nThreads;
        this.pool = Executors.newFixedThreadPool(nThreads);
    }

    private class SumTask implements Callable<Long> {
        private int[] numbers;
        private int from;
        private int to;

        public SumTask(int[] numbers, int from, int to) {
            this.numbers = numbers;
            this.from = from;
            this.to = to;
        }

        public Long call() throws Exception {
            long total = 0L;
            for (int i = from; i < to; i++) {
                total += numbers[i];
            }
            return total;
        }
    }

    public long sum(int[] numbers) {
        int chunk = numbers.length / nThreads;
        int from, to;
        List<SumTask> tasks = new ArrayList<SumTask>();
        for (int i = 1; i <= nThreads; i++) {
            if (i == nThreads) {
                from = (i - 1) * chunk;
                to = numbers.length;
            } else {
                from = (i - 1) * chunk;
                to = i * chunk;
            }
            tasks.add(new SumTask(numbers, from, to));
        }
        try {
            List<Future<Long>> futures = pool.invokeAll(tasks);
            long total = 0L;
            for (Future<Long> future : futures) {
                total += future.get();
            }
            return total;
        } catch (Exception e) {
            // ignore
            return 0;
        }
    }

    @Override
    public void shutdown() {
        pool.shutdown();
    }
}