
Overview

In Metal, MTLStorageMode is used to specify memory locations and access permissions for resources.

MTLStorageMode is an enumerated type and is defined as follows:

public enum MTLStorageMode : UInt {

    case shared = 0

    case managed = 1 // Only available on macOS

    case `private` = 2

    case memoryless = 3

}

  • case shared

The resource is stored in system memory and can be accessed by both the CPU and the GPU.

This is the default storage mode for MTLBuffer objects. This is also the default storage mode for the MTLTexture object in iOS and tvOS. In macOS, the shared storage mode does not work with the MTLTexture object.

When either the CPU or the GPU changes the contents of a shared resource, access by the other side must be synchronized.

If you change the contents of a resource using the CPU, those changes must be complete before you commit the command buffer that accesses the resource.

If you encode commands in a command buffer that change the contents of a resource, code running on the CPU must not read the resource until the command buffer has finished executing (that is, until the status property of the MTLCommandBuffer object is MTLCommandBufferStatus.completed).
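The two ordering rules above can be sketched as follows. This is a minimal illustration, not code from the original article: the function name roundTripThroughGPU is hypothetical, and a blit copy stands in for whatever GPU work actually modifies the resource.

```swift
import Metal

/// Hypothetical helper: the CPU fills one shared buffer, the GPU copies it into
/// a second shared buffer, and the CPU reads the result only after completion.
func roundTripThroughGPU(device: MTLDevice, values: [Float]) -> [Float]? {
    let length = values.count * MemoryLayout<Float>.stride
    guard length > 0,
          let queue = device.makeCommandQueue(),
          let source = device.makeBuffer(length: length, options: .storageModeShared),
          let destination = device.makeBuffer(length: length, options: .storageModeShared),
          let commandBuffer = queue.makeCommandBuffer(),
          let blit = commandBuffer.makeBlitCommandEncoder() else { return nil }

    // Rule 1: finish all CPU writes before committing the command buffer.
    values.withUnsafeBytes { bytes in
        source.contents().copyMemory(from: bytes.baseAddress!, byteCount: length)
    }

    // GPU work that changes the contents of `destination`.
    blit.copy(from: source, sourceOffset: 0,
              to: destination, destinationOffset: 0, size: length)
    blit.endEncoding()
    commandBuffer.commit()

    // Rule 2: do not read on the CPU until the command buffer has completed.
    commandBuffer.waitUntilCompleted()
    guard commandBuffer.status == .completed else { return nil }

    let pointer = destination.contents().bindMemory(to: Float.self, capacity: values.count)
    return Array(UnsafeBufferPointer(start: pointer, count: values.count))
}
```

Blocking with waitUntilCompleted() keeps the sketch short; in a render loop, a completion handler added before commit is usually a better fit.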

  • case managed

This storage mode exists only in macOS, where it is the default storage mode for MTLTexture objects. It is not available in iOS or tvOS.

The CPU and the GPU may maintain separate copies of the resource, and any changes must be explicitly synchronized.

In the unified memory model, resources in this mode reside in system memory accessible to both the CPU and GPU.

In the discrete memory model, resources in this mode exist as a pair of synchronized memory allocations. One copy resides in system memory and is accessible only to the CPU; the other resides in video memory and is accessible only to the GPU. Metal exposes both copies through a single MTLResource object.

In both memory models, Metal optimizes CPU and GPU access to managed resources. However, a managed resource must be explicitly synchronized after the CPU or the GPU modifies its contents.

If you change the contents of a resource using the CPU, you must copy the changes to the GPU using one of the methods provided by the MTLBuffer or MTLTexture protocols.

If you change the contents of a resource using the GPU, you must encode a blit pass that copies the changes back to the CPU. For details, see the MTLBlitCommandEncoder protocol.
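As a rough sketch of both synchronization directions for a managed buffer: didModifyRange(_:) and synchronize(resource:) are the Metal calls involved, while the helper names writeManaged and encodeSynchronize are hypothetical and exist only for this illustration. The code is guarded for macOS because the managed mode does not exist elsewhere.

```swift
#if os(macOS)
import Metal

/// CPU to GPU: write through contents(), then report the modified byte range
/// so Metal can update the GPU's copy.
func writeManaged(_ buffer: MTLBuffer, values: [Float]) {
    let byteCount = values.count * MemoryLayout<Float>.stride
    precondition(byteCount <= buffer.length, "values do not fit in the buffer")
    guard byteCount > 0 else { return }
    values.withUnsafeBytes { bytes in
        buffer.contents().copyMemory(from: bytes.baseAddress!, byteCount: byteCount)
    }
    buffer.didModifyRange(0..<byteCount)
}

/// GPU to CPU: encode a blit pass that synchronizes the resource. The CPU may
/// read the contents only after this command buffer has completed.
func encodeSynchronize(_ resource: MTLResource, in commandBuffer: MTLCommandBuffer) {
    guard let blit = commandBuffer.makeBlitCommandEncoder() else { return }
    blit.synchronize(resource: resource)
    blit.endEncoding()
}
#endif
```

The synchronize pass should be encoded after the GPU commands that modify the resource, in the same command buffer or a later one on the same queue.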

  • case private

This resource can be accessed only by the GPU. Resource consistency between CPU and GPU is not required because the CPU does not have access to the content of the resource.

In the unified memory model, this resource resides in system memory. In the discrete memory model, it resides in video memory. In both memory models, Metal can optimize private resources for GPU access in ways that are not possible for shared or managed resources.

In the discrete memory model, Metal always tries to store private resources in video memory. However, under memory pressure, Metal may move private resources to system memory; when they are used again, Metal tries to copy them back to video memory first.

  • case memoryless

The contents of the resource can be accessed only by the GPU. They live in tile memory and exist only temporarily during a render pass. Tile memory has higher bandwidth, lower latency, and lower power consumption than system memory.

The memoryless storage mode is available on Apple family GPUs.

Memoryless resources can only be used as temporary render targets for a render pass (that is, as textures configured with an MTLTextureDescriptor and used with an MTLRenderPassAttachmentDescriptor). The contents of such a texture cannot be loaded at the start of a render pass (MTLLoadAction.load) or stored at the end of a render pass (MTLStoreAction.store).

Use memoryless resources when the contents of a render target are needed only during the render pass. For example, most render passes do not store depth attachments and multisample attachments in memory, so creating these attachments as memoryless resources can significantly reduce memory usage.

On Metal devices that support tile shading, imageblocks give you more flexibility in managing transient render data. See the Metal Shading Language Specification for details.

Setting the resource storage mode

Choosing an appropriate storage mode for a buffer or texture enables fast memory access and driver-level performance optimizations.

Setting the storage mode of a buffer

Create a new MTLBuffer using the makeBuffer(length:options:) method and set its storage mode in the options parameter of the method.

let bufferOptions = MTLResourceOptions.storageModePrivate

let buffer = device.makeBuffer(length: 256,

                               options: bufferOptions)

Note: The storage mode options in MTLResourceOptions are equivalent to the storage mode values in MTLStorageMode. When creating a new buffer, multiple resource options can be combined, but only one storage mode can be set.
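To illustrate the note, the sketch below combines exactly one storage mode with a second, unrelated resource option. The helper name makeUploadBuffer is hypothetical, and the write-combined cache mode is only an example of an option that can accompany a storage mode; it assumes the CPU writes to the buffer but never reads from it.

```swift
import Metal

/// Hypothetical helper: a shared buffer that the CPU only writes to.
/// The options contain a single storage mode plus a CPU cache mode.
func makeUploadBuffer(device: MTLDevice, length: Int) -> MTLBuffer? {
    let options: MTLResourceOptions = [.storageModeShared, .cpuCacheModeWriteCombined]
    return device.makeBuffer(length: length, options: options)
}
```

The resulting buffer reports the chosen values through its storageMode and cpuCacheMode properties.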

Setting the storage mode of a texture

Create a new MTLTextureDescriptor and set the storage mode through the descriptor's storageMode property. Then create a new MTLTexture using the makeTexture(descriptor:) method.


let textureDescriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .bgra8Unorm,

                                                                 width: 256,

                                                                 height: 256,

                                                                 mipmapped: true)

textureDescriptor.storageMode = .private

let texture = device.makeTexture(descriptor: textureDescriptor)


Select a resource storage mode on iOS or tvOS

All iOS and tvOS devices have a unified memory model.

Select a resource storage mode for buffers or textures

Which mode to choose depends on the resource access requirements:

  • Populated and updated by the CPU

If the resource requires CPU access, select MTLStorageMode.shared.

  • Accessed only by the GPU

Select MTLStorageMode.private if the GPU fills the resource through compute, render, or blit passes. This is common for render targets, intermediate resources, and texture streaming.

  • Filled once by the CPU and accessed frequently by the GPU

Use the CPU to create a resource with MTLStorageMode.shared and fill it with content. Then use the GPU to copy the contents into another resource with MTLStorageMode.private.

  • Accessed only by the GPU, with temporary contents (textures only)

If the texture is a memoryless render target that the GPU fills temporarily, select MTLStorageMode.memoryless. Memoryless render targets exist only in tile memory and are not backed by system memory. Examples are depth or stencil textures that are used only during a render pass and are not needed before or after GPU execution.
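The "filled once by the CPU, accessed frequently by the GPU" case from the list above could look roughly like this. The function name makePrivateBuffer and the UInt32 element type are assumptions made for the example.

```swift
import Metal

/// Hypothetical helper: stage data in a shared buffer on the CPU, then let the
/// GPU copy it into a private buffer that later passes read from.
func makePrivateBuffer(device: MTLDevice,
                       queue: MTLCommandQueue,
                       data: [UInt32]) -> MTLBuffer? {
    let length = data.count * MemoryLayout<UInt32>.stride
    guard length > 0,
          // The shared staging buffer is created and filled by the CPU in one step.
          let staging = device.makeBuffer(bytes: data, length: length,
                                          options: .storageModeShared),
          // The private buffer can only be written by GPU commands.
          let target = device.makeBuffer(length: length, options: .storageModePrivate),
          let commandBuffer = queue.makeCommandBuffer(),
          let blit = commandBuffer.makeBlitCommandEncoder() else { return nil }

    blit.copy(from: staging, sourceOffset: 0,
              to: target, destinationOffset: 0, size: length)
    blit.endEncoding()
    commandBuffer.commit()

    // Command buffers committed later on the same queue run after this copy,
    // so they can use `target` without an explicit wait here.
    return target
}
```

The staging buffer can be released once the copy has been committed, because the command buffer keeps it alive until execution finishes.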

Creating a memoryless render target

To create a memoryless render target, set the storageMode property of an MTLTextureDescriptor to MTLStorageMode.memoryless and use this descriptor to create a new MTLTexture. Then assign the new texture to the texture property of an MTLRenderPassAttachmentDescriptor.

// A depth attachment needs a depth pixel format, and a transient target needs no mipmaps.
let memorylessDescriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .depth32Float,
                                                                    width: 256,
                                                                    height: 256,
                                                                    mipmapped: false)

memorylessDescriptor.storageMode = .memoryless

// The texture is only ever rendered to.
memorylessDescriptor.usage = .renderTarget

let memorylessTexture = device.makeTexture(descriptor: memorylessDescriptor)


let renderPassDescriptor = MTLRenderPassDescriptor()

renderPassDescriptor.depthAttachment.texture = memorylessTexture

Note: MTLStorageMode.memoryless can be used only to create textures, not buffers. A buffer cannot be used as a memoryless render target.
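Because a memoryless texture can be neither loaded nor stored, the attachment's load and store actions have to be chosen accordingly. The sketch below is one way to set this up. The function name makeMemorylessDepthPass is hypothetical, and the supportsFamily(.apple2) check is an assumed, conservative way to detect an Apple family GPU; it is not taken from the original article.

```swift
import Metal

/// Hypothetical helper: a render pass descriptor whose depth attachment lives
/// only in tile memory. Returns nil on GPUs that are not Apple family GPUs.
@available(macOS 11.0, iOS 13.0, tvOS 13.0, *)
func makeMemorylessDepthPass(device: MTLDevice,
                             width: Int,
                             height: Int) -> MTLRenderPassDescriptor? {
    // Assumption: treat Apple family 2 and later as supporting memoryless targets.
    guard device.supportsFamily(.apple2) else { return nil }

    let descriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .depth32Float,
                                                              width: width,
                                                              height: height,
                                                              mipmapped: false)
    descriptor.storageMode = .memoryless
    descriptor.usage = .renderTarget
    guard let depthTexture = device.makeTexture(descriptor: descriptor) else { return nil }

    let pass = MTLRenderPassDescriptor()
    pass.depthAttachment.texture = depthTexture
    // There is nothing to load: start each pass from a cleared depth value.
    pass.depthAttachment.loadAction = .clear
    pass.depthAttachment.clearDepth = 1.0
    // There is nowhere to store: discard the contents when the pass ends.
    pass.depthAttachment.storeAction = .dontCare
    return pass
}
```

Color attachments still need to be configured on the returned descriptor before it is used to create a render command encoder.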

Select the resource storage mode in macOS

macOS devices can have multiple GPUs, each with a unified or discrete memory model. In the unified memory model, the CPU and GPU share system memory, and their access to that memory depends on the storage mode chosen for the resource. In the discrete memory model, system memory is separate from video memory. Both the CPU and the GPU can access system memory, but only the GPU can access video memory.

Although the unified and discrete memory models differ, there is no need to write separate code paths for them: Metal’s resource storage mode API works with both.

Setting the resource storage mode for a buffer

  • Accessed only by the GPU

Select MTLStorageMode.private if the GPU fills the buffer through compute, render, or blit passes. This is common for intermediate buffers used between passes.

  • Filled once by the CPU and accessed frequently by the GPU

Select MTLStorageMode.managed. First fill the buffer with data on the CPU, then synchronize the buffer, and finally access the buffer’s data on the GPU.

  • Changes frequently, is relatively small, and is accessed by both the CPU and the GPU

Select MTLStorageMode.shared.

  • Accessed by both the CPU and the GPU

Select MTLStorageMode.managed. Always synchronize the buffer after modifying its contents with the CPU or the GPU.

For more information, see Synchronizing Managed Resources.

Note: In macOS, there is no difference in GPU performance between managed and private buffers. Therefore, transferring data from a shared buffer to a private buffer has no performance advantage over using a single managed buffer.

Setting the resource storage mode for a texture

Textures can be created with MTLStorageMode.managed or MTLStorageMode.private, but not with MTLStorageMode.shared.

  • Accessed only by the GPU

Select MTLStorageMode.private if the GPU fills the texture through compute, render, or blit passes. This is common for render targets and drawables.

  • Filled once by the CPU and accessed frequently by the GPU

Use the CPU to create a buffer with MTLStorageMode.shared and fill it with the texture data. Then use the GPU to copy the contents of the buffer into a texture with MTLStorageMode.private.

  • Accessed frequently by both the CPU and the GPU

Select MTLStorageMode.managed. Always synchronize the texture after modifying its contents with the CPU or the GPU.
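For the "filled once by the CPU" texture case in the list above, the upload through a shared buffer might be sketched as follows. The function name makePrivateTexture, the RGBA8 pixel format, and the tightly packed row layout are assumptions made for the example.

```swift
import Metal

/// Hypothetical helper: upload tightly packed RGBA8 pixels into a private
/// texture by way of a shared staging buffer and a blit pass.
func makePrivateTexture(device: MTLDevice,
                        queue: MTLCommandQueue,
                        pixels: [UInt8],
                        width: Int,
                        height: Int) -> MTLTexture? {
    let bytesPerRow = width * 4 // 4 bytes per RGBA8 pixel
    guard width > 0, height > 0, pixels.count == bytesPerRow * height else { return nil }

    let descriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .rgba8Unorm,
                                                              width: width,
                                                              height: height,
                                                              mipmapped: false)
    descriptor.storageMode = .private
    descriptor.usage = .shaderRead

    guard let staging = device.makeBuffer(bytes: pixels, length: pixels.count,
                                          options: .storageModeShared),
          let texture = device.makeTexture(descriptor: descriptor),
          let commandBuffer = queue.makeCommandBuffer(),
          let blit = commandBuffer.makeBlitCommandEncoder() else { return nil }

    // The GPU copies the staged bytes into the private texture.
    blit.copy(from: staging,
              sourceOffset: 0,
              sourceBytesPerRow: bytesPerRow,
              sourceBytesPerImage: bytesPerRow * height,
              sourceSize: MTLSize(width: width, height: height, depth: 1),
              to: texture,
              destinationSlice: 0,
              destinationLevel: 0,
              destinationOrigin: MTLOrigin(x: 0, y: 0, z: 0))
    blit.endEncoding()
    commandBuffer.commit()
    return texture
}
```

Work submitted later on the same queue sees the uploaded pixels, since command buffers on one queue execute in the order they were committed.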

Conclusion

This article introduces the resource storage mode, explains and compares the four modes in detail, and finally lists how to choose the corresponding storage mode in iOS and macOS.