[Preliminary Conclusion]

In the Capacity Scheduler, capacity and maximum-capacity are the two most frequently configured parameters. The first is the configured capacity of the current queue, and the second is the maximum capacity the queue can reach. The capacity values of the queues under the same parent add up to 100.

maximum-capacity is easy to understand: it is the upper limit of the resources available to the queue.

With multiple queues, if maximum-capacity is set equal to capacity in every queue, each queue can only ever use a fixed share of resources and cannot borrow idle resources from other queues, which can lead to wasted resources and low utilization.

Therefore, maximum-capacity is usually set higher than capacity. For example, if it is set to 100, each queue can, at most, use all of the cluster's resources.
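To make the two parameters concrete, here is a minimal sketch (illustrative Java, not scheduler code; the two-queue layout and the 12 GB cluster size are hypothetical, matching the example used later in this article) of what the percentages translate to in absolute memory:

```java
// Minimal sketch (illustrative, not scheduler source): what capacity and
// maximum-capacity mean in absolute terms for a hypothetical 12 GB cluster.
// The corresponding capacity-scheduler.xml keys are
// yarn.scheduler.capacity.<queue-path>.capacity and
// yarn.scheduler.capacity.<queue-path>.maximum-capacity.
public class QueueBoundsSketch {
    public static void main(String[] args) {
        int clusterMemoryMb = 12 * 1024;   // 12 GB cluster

        // Two sibling queues: their capacity values add up to 100.
        double queueACapacity = 0.10;      // capacity = 10
        double queueBCapacity = 0.90;      // capacity = 90
        double maximumCapacity = 1.00;     // maximum-capacity = 100 for both

        // capacity expresses each queue's guaranteed share...
        System.out.printf("queue A guaranteed: %.1f MB%n", clusterMemoryMb * queueACapacity);
        System.out.printf("queue B guaranteed: %.1f MB%n", clusterMemoryMb * queueBCapacity);
        // ...while maximum-capacity caps how far a queue can grow into idle resources.
        System.out.printf("ceiling for either queue: %.1f MB%n", clusterMemoryMb * maximumCapacity);
    }
}
```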

But if maximum-capacity already allows a queue to use all of the cluster's resources, what role does the capacity parameter play, and how does it limit users' resource usage?

I checked the official documentation and read many articles online, but none of them fully explained what capacity means, so I went straight to the source code.

After reading the source code and cross-checking it against the logs, I confirmed several key points. Feeling very confident in my conclusions, I immediately emailed them to my teammates:

The capacity parameter of a queue is the upper limit of the resources that a single user can use in the queue.

Multiple users are allowed to submit tasks to the same queue. Therefore, the combined resources of different tasks of multiple users can exceed capacity but cannot exceed maximum-capacity.

[Counter]

However, it wasn’t long before I received an email from a colleague with a counterexample: the queue was configured with 10% of the resources, yet a user had submitted a task that used far more than 10%!

[Response]

When I received the email, I felt like I had been slapped in the face. But I had studied the relevant source code before and concluded that the usage should be limited. Was there some detail I had missed that sent the code down a different branch?

Full of doubt, I went through the relevant code again and ran a series of tests. It turned out that the phenomenon could be explained and the previous conclusion still held.

The current cluster has 12 GB of resources in total, and the queue's capacity is set to 10%, so the theoretical upper limit of resource usage for a single user in the queue is:

```
12 * 1024 * 0.1 = 1228.8 MB
```

Note: the parent of this queue is root. If the parent queue is not root, you also need to multiply by the parent queue's capacity percentage.

The cluster's minimum allocation unit is 1024 MB, so this value is rounded up to 2048 MB. That is, the upper limit of resources a single user can use is 2048 MB.
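As a quick sanity check, the same calculation and rounding can be expressed as a small sketch (illustrative only; the rounding helper is my own, not the scheduler's code):

```java
// Sketch of the per-user limit calculation described above (illustrative only).
public class UserLimitSketch {
    // Round a raw limit up to the next multiple of the minimum allocation.
    static int roundUpToMinAllocation(double rawMb, int minAllocationMb) {
        return (int) Math.ceil(rawMb / minAllocationMb) * minAllocationMb;
    }

    public static void main(String[] args) {
        int clusterMemoryMb = 12 * 1024;   // 12 GB cluster
        double queueCapacity = 0.10;       // queue capacity = 10%, parent queue is root
        int minAllocationMb = 1024;        // yarn.scheduler.minimum-allocation-mb

        double rawLimitMb = clusterMemoryMb * queueCapacity;                   // 1228.8 MB
        int userLimitMb = roundUpToMinAllocation(rawLimitMb, minAllocationMb); // 2048 MB

        System.out.println("raw limit: " + rawLimitMb + " MB, user limit: " + userLimitMb + " MB");
    }
}
```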

When the driver of the Spark task starts (requesting 2048 MB), the amount of resources the user is already using in the queue does not exceed the upper limit, so the resources can be allocated and the driver starts successfully.

After the driver starts, it requests two executors, each needing 2048 MB. During YARN scheduling, the user is found to be using 2048 MB (the resources allocated to the driver), which still does not exceed the upper limit, so resources are allocated to the first executor. When the second executor is scheduled, however, the user's current usage is 4096 MB, which exceeds the upper limit, so no resources are allocated to it.

In other words, although the queue's capacity is configured as 10%, it is not strictly enforced at 10%; overuse is allowed. As long as the user's current usage does not exceed the upper limit, allocation can continue (even if the limit will be exceeded after the allocation). But once the currently used resources exceed the upper limit, no further resources are allocated.
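This rule can be paraphrased as "check the user's current usage before allocating, rather than the usage after the allocation". Here is a minimal simulation of that rule (illustrative only, not the CapacityScheduler's actual logic), replaying the driver-plus-two-executors scenario above:

```java
// Illustrative replay of the driver + two executors scenario described above.
// A simplified model of the observed behavior, not CapacityScheduler code.
public class OverCommitSketch {
    public static void main(String[] args) {
        int userLimitMb = 2048;                 // per-user upper limit derived from capacity
        int usedMb = 0;                         // what the user currently holds in the queue
        int[] requestsMb = {2048, 2048, 2048};  // driver, executor 1, executor 2
        String[] names = {"driver", "executor 1", "executor 2"};

        for (int i = 0; i < requestsMb.length; i++) {
            // The check looks at what is already used, not at what would be
            // used after the allocation, so the limit can be overshot once.
            if (usedMb <= userLimitMb) {
                usedMb += requestsMb[i];
                System.out.println(names[i] + ": allocated " + requestsMb[i]
                        + " MB, user now holds " + usedMb + " MB");
            } else {
                System.out.println(names[i] + ": denied, current usage " + usedMb
                        + " MB already exceeds the limit of " + userLimitMb + " MB");
            }
        }
    }
}
```

Running this prints that the driver and the first executor are granted while the second executor is denied, which matches the scheduling behavior described above.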

To verify this conclusion, I ran the following test: set the queue's capacity to 10%, and adjust maximum-am-resource-percent (the limit on AM resource usage) so that the AM resource restriction does not interfere.

First, submit a Spark task; the result is the same as before. Then submit another task: the second task stays in the ACCEPTED state, and the driver of this Spark task is never allocated resources.

In addition, the task's diagnostics show that the user's resource usage exceeds the upper limit.

On top of this, switch to a different user and submit another Spark task. This task runs normally, as shown in the following figure:

The second task is also in the ACCEPTED state, but the message displayed in the UI is different: it indicates that the user has exceeded the maximum AM resource usage.

At this point, it seemed the previous conclusion was correct.

[Questioning Again]

After summarizing the test process above, together with the relevant screenshots and conclusions, I replied to the email, thinking that was the end of it. However, a while later I received another email with the following reply:

If the capacity of the queue is set to 5%, then theoretically the maximum resource used by a single user in the queue is:

```
12 * 1024 * 0.05 = 614.4 MB
```

Rounded up, that is 1024 MB. According to your conclusion, a submitted task should not be able to get resources and should stay in the ACCEPTED state, yet it was still allocated resources, as shown in the picture below:

[GG]

After seeing this email I was quite calm, because I had already run into this issue during my earlier research, and I replied directly with a piece of code to explain it:

```java
// The AM resources the application would use (amIfStarted) exceed the queue's
// AM limit (amLimit)...
if (!Resources.lessThanOrEqual(resourceCalculator, lastClusterResource,
    amIfStarted, amLimit)) {
  // ...but if no application is active yet (or the queue's AM usage is still
  // zero), enforcement is skipped so that at least one application can start.
  if (getNumActiveApplications() < 1
      || (Resources.lessThanOrEqual(resourceCalculator, lastClusterResource,
          queueUsage.getAMUsed(partitionName), Resources.none()))) {
    LOG.warn("maximum-am-resource-percent is insufficient to start a"
        + " single application in queue, it is likely set too low."
        + " skipping enforcement to allow at least one application"
        + " to start");
  } else {
    // Otherwise the application is not activated and is marked with the
    // "queue AM resource limit exceeded" diagnostic.
    application.updateAMContainerDiagnostics(AMState.INACTIVATED,
        CSAMContainerLaunchDiagnosticsConstants.QUEUE_AM_RESOURCE_LIMIT_EXCEED);
    if (LOG.isDebugEnabled()) {
      LOG.debug("Not activating application " + applicationId
          + " as amIfStarted: " + amIfStarted + " exceeds amLimit: " + amLimit);
    }
    continue;
  }
}
```

In other words, as long as no task is running in the queue, resources are still allocated even if the submitted task exceeds the upper limit, so that at least one task can run.

Also, as you can see from the figure above, the task is allocated only 2048 MB, which is the driver's resources; none of the executors the driver requests are allocated anything, because the resources currently in use already exceed the upper limit.

[Summary]

To sum up: the capacity parameter of a queue determines the upper limit of the resources that a single user can use in that queue; multiple users together may exceed it, but the queue as a whole is still bounded by maximum-capacity.

Of course, other parameters, such as user-limit-factor and minimum-user-limit-percent, also affect the upper limit of a user's resource usage; these will be covered separately in a future article.

Looking back on the whole discussion, I realized once again that the source code does not lie. Still, while reading the source code, it is worth doing plenty of hands-on testing and verification in order to truly understand it.