JVM G1 Tuning: A Brief Guide

Preface

The G1 collector is difficult to tune because, unlike the collectors with fixed generation sizes, it does not let you divide the heap however you want. More of that division is handled by the collector itself: the generations are pieced together from regions and grown automatically under G1's control, which is what makes it feel hard to tune.

This article gives a general direction for tuning; the JVM monitoring tools involved will be explained in more detail later.

A practical case

This time, an online education platform is used as an example to explain how G1 is optimized.

Where does the pressure on an online education platform come from? During the day the traffic is light: children are at school and parents are at work, so little of the main business runs on the platform then. In the evening, however, the load on the machines starts to climb. The children attend class online and the number of users explodes, with tens of thousands of people in class at the same time. At that point the pressure clearly comes from live streaming, and the traffic peak within live streaming is the classroom-interaction part. Why the interaction? Children find a plain lecture boring, so the key to keeping the class lively is in-class games, and that is where the core of the system pressure lies: the system has to record all kinds of data, such as points, rewards and time spent, and allocates a large number of objects while doing so. To keep the live class running smoothly, the system also requires a very low response latency.

Therefore, the conclusion is that the online education platform will probably face double the pressure during the interactive part of the live class, and the system requires a very low response latency.

From the above analysis, assume a single machine handles about 600 requests per second and that each request allocates about 10 KB of objects. That is 6,000 KB, or a memory footprint of roughly 6 MB per second. The service is deployed on a 4-core, 8 GB machine.
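As a quick sanity check on those numbers, here is a minimal sketch of the arithmetic, assuming the 600 requests/second and 10 KB/request figures above:

```java
public class AllocationEstimate {
    public static void main(String[] args) {
        int requestsPerSecond = 600;   // assumed request rate on a single machine
        int kbPerRequest = 10;         // assumed allocation per request
        int kbPerSecond = requestsPerSecond * kbPerRequest;   // 6,000 KB
        System.out.printf("~%.1f MB allocated per second%n",
                kbPerSecond / 1000.0);                        // roughly 6 MB/s
    }
}
```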

As before, this case is only a simulation built on assumptions; the real situation is constrained by all sorts of factors, so do not dig too deeply into the exact details.

How to analyze the system

The traditional generational model

We first deploy this system with the traditional generational model. From the requests per second we get about 6 MB of objects per second, and on the 4-core, 8 GB machine the JVM gets a heap of about 4 GB. We know that the points, rewards, calculations and other objects generated during the live interaction are basically small, short-lived objects, so the old generation does not need much space. After setting aside memory for the method area and the thread stacks, we give the young generation about 3 GB and the old generation about 1 GB.
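As a sketch only (the article does not name the exact collector or launch flags), this fixed-generation deployment might be expressed with JVM options like the following; `live-classroom.jar` is a hypothetical application name:

```
java -Xms4g -Xmx4g \
     -Xmn3g \
     -XX:SurvivorRatio=8 \
     -XX:+UseParallelGC \
     -jar live-classroom.jar
```

Here `-Xmn3g` produces the 3 GB young / 1 GB old split described above, `-XX:SurvivorRatio=8` keeps the default 8:1:1 Eden-to-Survivor split, and the Parallel collector merely stands in for whichever traditional generational collector is actually used.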

If about 6 MB of objects are generated per second, that is more than 300 MB per minute. With the default young-generation ratio of 8:1:1, Eden is about 2.4 GB and the two Survivor spaces are about 300 MB each. At roughly 300 MB per minute, the young generation fills up in about 8 minutes and a young GC is triggered. Suppose about 300 MB of objects survive and move into a Survivor space. The Survivor space can just about hold them, but because they occupy more than 50% of it, about 150 MB of those objects will eventually be promoted into the old generation.
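A small sketch of that arithmetic, using the article's rounded figures:

```java
public class YoungGenEstimate {
    public static void main(String[] args) {
        int youngGenMb = 3 * 1024;            // ~3 GB young generation
        int edenMb = youngGenMb * 8 / 10;     // 8:1:1 -> Eden ~2.4 GB
        int survivorMb = youngGenMb / 10;     // each Survivor space ~300 MB
        int allocPerMinuteMb = 300;           // rounded allocation rate from above
        System.out.println("Each Survivor space: ~" + survivorMb + " MB");
        System.out.println("Eden fills in ~" + edenMb / allocPerMinuteMb + " minutes");
        // If ~300 MB survives the young GC, that is more than 50% of one
        // Survivor space, so roughly 150 MB ends up promoted to the old generation.
    }
}
```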

At that rate, with about 150 MB entering the old generation every 8 minutes, the old generation fills up and a long pause occurs after roughly 40 minutes. Even that estimate is optimistic, because the system does not run this one piece of logic alone; the calculation only covers the live-interaction module. Combined with the other modules of the system, **a pause may actually come every 5 or 6 minutes!** That is clearly unacceptable. Imagine a game that freezes every few minutes: the class suddenly stutters, the children lose their scores and burst into tears, and the parents will not hold back their complaints; a scolding is not far away...

Using the G1 collector

The system is deployed on the same 4-core, 8 GB machine. Assume the JVM again gets a 4 GB heap. With G1, the young generation starts at 5% of the heap by default and can grow to at most 60%. Each Java thread stack is 1 MB, so opening a few hundred threads takes roughly 200-300 MB, and the method area takes about 256 MB.
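As a hedged sketch, the G1 deployment above might be written as launch flags like these (the two percentage options are experimental and are listed only to make the 5% / 60% defaults explicit; `live-classroom.jar` is again a hypothetical name):

```
java -XX:+UseG1GC \
     -Xms4g -Xmx4g \
     -Xss1m \
     -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=256m \
     -XX:MaxGCPauseMillis=200 \
     -XX:+UnlockExperimentalVMOptions \
     -XX:G1NewSizePercent=5 -XX:G1MaxNewSizePercent=60 \
     -jar live-classroom.jar
```

In practice the two experimental options can usually be left at their defaults; they are shown here only because the 5% and 60% figures matter for the analysis below.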

In the previous article, G1 was compared to a waiter who comes around to clear the trays, but on a real system it is very hard to predict **when G1 will come back to collect the garbage!** Some auxiliary means are needed here. Monitoring this behaviour requires tools and GC logs to interpret, so it will be covered in a dedicated article later.

How to calculate the size of a Region

With a 4 GB heap, G1 divides the heap into roughly 2,048 regions by default, so each region is about 2 MB. If the young generation starts at 5% of the heap, that is about 100 regions, or a young generation of roughly 200 MB.
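A minimal sketch of that region arithmetic, assuming G1's default target of roughly 2,048 regions:

```java
public class RegionEstimate {
    public static void main(String[] args) {
        long heapMb = 4 * 1024;                  // 4 GB heap
        long targetRegions = 2048;               // G1 aims for about 2,048 regions
        long regionMb = heapMb / targetRegions;  // ~2 MB per region (always a power of two)
        long youngStartMb = heapMb * 5 / 100;    // 5% initial young generation
        System.out.println("Region size: ~" + regionMb + " MB");
        System.out.println("Initial young generation: ~" + youngStartMb
                + " MB (~" + youngStartMb / regionMb + " regions)");
    }
}
```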

Crucial parameters:

-XX:MaxGCPauseMillis: the default value is 200, which means the target maximum pause time is 200 ms.

When the G1 collector is used, this parameter directly affects the behaviour of the entire JVM. If the value is set too large, each garbage collection is allowed to run for too long and the system will noticeably stutter; the young generation may be nearly full before a collection is triggered, and the old generation may not be collected in time.

How often is a young-generation collection triggered

According to the earlier explanation, the young generation can use at most 60% of the heap, and it still uses the copying algorithm. Under the traditional 8:1:1 way of thinking, a collection would be triggered once Eden, i.e. 80% of the young generation, fills up. That clearly does not match how G1 works, with its whole-heap regions and mixed collections, so you cannot reason about G1's collections in terms of a fixed-size threshold.

What actually happens: G1 continuously estimates whether collecting the current young generation still fits within the 200 ms pause target. Roughly speaking, once the estimated time to collect the young generation approaches 200 ms, a young-generation collection is triggered.

You can also fall back on the earlier restaurant analogy, in which the waiter comes around regularly to clear the trays, to get a feel for how often the young generation is collected.

How to optimize:

After the discussion above this looks like a headache: how do we optimize at all? We have to rely on load-testing tools, GC logs and memory-analysis tools to make the decision. The one firm rule is not to set the GC pause target too large, which would make individual pauses too long; the value should be kept reasonable.
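A practical first step is simply to turn on GC logging so the pause times and collection frequency can be measured under load; a sketch using JDK 8-style flags (on JDK 9 and later, `-Xlog:gc*:file=gc.log` replaces the three logging options):

```
java -XX:+UseG1GC -Xms4g -Xmx4g -XX:MaxGCPauseMillis=200 \
     -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
     -Xloggc:gc.log \
     -jar live-classroom.jar
```

The resulting `gc.log` is what the load-testing and memory-analysis tools are read against when deciding whether the pause target is reasonable.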

How is Mixed GC optimized?

If tuning the young generation is already this troublesome, the old generation is even more so. G1 has no separate old-generation GC; old regions are reclaimed as part of a Mixed GC, so it needs to be treated with care. The main goal is to keep objects from flooding into the old generation and forcing Mixed GCs, which again comes down to the pause-time-model parameter.

The final analysis

New generation:

  1. Since the young generation still uses a copying algorithm, keep the objects that survive a collection within 50% of the Survivor space, and try to avoid objects being promoted straight into the old generation after a GC. The 60% upper limit on the young generation usually does not need much adjustment, unless the business creates objects so frequently that the young generation is flooded with them (the relevant flags are sketched after this list).
  2. If this parameter (the pause-time target) is set to a smaller value, each collection may become more efficient, but a large number of short-lived objects may end up entering the old generation before they have had time to die.
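The flags most directly related to the two points above, with their usual defaults (a sketch; `-XX:TargetSurvivorRatio` is a general HotSpot option behind the 50% Survivor-occupancy rule, not something specific to this case):

```
-XX:MaxGCPauseMillis=200     # pause target; smaller values mean more frequent young GCs
-XX:G1MaxNewSizePercent=60   # upper bound on the young generation (experimental option)
-XX:TargetSurvivorRatio=50   # desired Survivor occupancy after a collection
```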

Old generation:

  1. In the final phase of a Mixed GC, G1 alternates between collecting garbage and letting the application run, spreading the collection over up to 8 rounds, and it stops collecting once the reclaimable space falls back to 5% of the heap. That 5% threshold, G1HeapWastePercent, can be set slightly larger.
  2. The old generation occupying 45% of the heap is the mechanism that triggers the collection cycle, and this threshold does not need to be changed drastically, as the JDK will have settled on it after a lot of testing and consideration (both parameters are sketched below).
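The parameters behind these two points, with their common defaults (a sketch; exact defaults can vary slightly between JDK versions):

```
-XX:InitiatingHeapOccupancyPercent=45  # heap occupancy that starts the concurrent marking cycle
-XX:G1MixedGCCountTarget=8             # spread the mixed collection over up to 8 pauses
-XX:G1HeapWastePercent=5               # stop mixed collection once reclaimable space drops to 5%
```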

Conclusion

This article is mainly about tuning ideas. There is no one-size-fits-all solution for JVM tuning, especially with the G1 collector, whose algorithmic details are complex enough that tuning takes extra effort and time to verify the effect.

A final word

The next article will pull together the sections of the series so far, reviewing the old to learn the new; the forgetting curve means knowledge needs repeated consolidation.