I have previously compared LUT filters implemented with CoreImage, OpenGL ES, and other technologies on iOS. In fact, for graphics processing Apple prefers its own Metal, a low-level graphics programming interface similar to OpenGL ES that was first announced at WWDC 2014. It lets the CPU send instructions through the GPU driver to the GPU for massively parallel matrix computation.

Metal provides the following features:

  • Low-overhead interface. Metal is designed to eliminate hidden performance bottlenecks such as state validation, and you control the GPU's asynchronous behavior, enabling efficient multithreaded creation and submission of command buffers in parallel
  • Memory and resource management. The Metal framework provides buffers that represent GPU memory allocations, and texture objects with exact pixel formats that can be used as textured images or render attachments
  • Integrated support for graphics and compute operations. Metal uses the same data structures and resources (buffers, textures, command queues) for both graphics and compute, and Metal's shading language supports both. The framework also supports sharing resources between the runtime interface (the CPU), graphics shaders, and compute functions
  • Precompiled shaders. Metal's shader functions can be compiled at build time along with the rest of the code and loaded at run time, which also makes shaders easier to debug.

The relationships among Metal's objects are shown in the following figure.



Connecting the CPU and GPU is the command queue, which holds multiple command buffers. Each command buffer carries the graphics and compute command encoders defined by Metal, and inside each encoder are the actual commands and resources created by the developer, which are ultimately handed to the GPU for computation and rendering.
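As a minimal sketch of that hierarchy (my own illustration, assuming a device that supports Metal), the objects are created in this order:

    #import <Metal/Metal.h>

    id<MTLDevice> device = MTLCreateSystemDefaultDevice();           // the GPU
    id<MTLCommandQueue> queue = [device newCommandQueue];            // connects the CPU and the GPU
    id<MTLCommandBuffer> buffer = [queue commandBuffer];             // carries the encoded commands
    id<MTLBlitCommandEncoder> encoder = [buffer blitCommandEncoder]; // one kind of encoder; render encoders also need a pass descriptor
    [encoder endEncoding];  // finish recording
    [buffer commit];        // hand the recorded work to the GPU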

The next step is to implement an image LUT filter using Metal.

1. Initialization

Metal initialization mainly generates and holds reusable objects that are expensive to initialize.

First, obtain and hold a handle to the GPU. In Metal this is an object of type MTLDevice:

    self.mtlDevice = MTLCreateSystemDefaultDevice(); // Obtain the GPU interface
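Note that MTLCreateSystemDefaultDevice can return nil when Metal is unavailable (for example on older simulators), so a guard is worth adding; a small sketch:

    if (!self.mtlDevice) {
        // Metal is not supported in this environment; fall back to OpenGL ES / CoreImage
        return;
    }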

Next, we initialize an MTKView, which acts as a canvas for the GPU to render content onto the screen

    [self.view addSubview:self.mtlView];
    [self.mtlView mas_makeConstraints:^(MASConstraintMaker *make) {
        make.left.top.equalTo(self.view);
        make.width.height.equalTo(self.view);
    }];
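The creation of self.mtlView itself is not shown above; presumably it is something like the following sketch, executed before the addSubview call (MTKView is part of MetalKit):

    // #import <MetalKit/MetalKit.h> at the top of the file
    self.mtlView = [[MTKView alloc] initWithFrame:self.view.bounds device:self.mtlDevice];
    self.mtlView.paused = YES;               // draw on demand instead of at a fixed frame rate
    self.mtlView.enableSetNeedsDisplay = NO; // we trigger rendering manually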

Metal’s rendering process is similar to that of OpenGL ES, as shown below


So, as before, you need to pass in vertex data, a vertex shader, and a fragment shader

Vertex data is defined as follows, with the first four components of each row as vertex coordinates and the last two components as texture coordinates (normalized).

    static const float vertexArrayData[] = {
        -1.0, -1.0, 0.0, 1.0,   0, 1,
        -1.0,  1.0, 0.0, 1.0,   0, 0,
         1.0, -1.0, 0.0, 1.0,   1, 1,
        -1.0,  1.0, 0.0, 1.0,   0, 0,
         1.0,  1.0, 0.0, 1.0,   1, 0,
         1.0, -1.0, 0.0, 1.0,   1, 1,
    };
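For reference, each six-float row corresponds one-to-one with the shader-side Vertex struct shown in section 3; a hypothetical C-side equivalent would look like:

    typedef struct {
        float position[4];  // x, y, z, w clip-space coordinates
        float texCoords[2]; // u, v normalized texture coordinates
    } VertexData;           // hypothetical name, for illustration only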

This vertex array is then loaded into a vertex buffer

    // Initialize the vertex buffer from the array. MTLResourceStorageModeShared stores the
    // resource in system memory that both the CPU and the GPU can access
    self.vertexBuffer = [self.mtlDevice newBufferWithBytes:vertexArrayData
                                                    length:sizeof(vertexArrayData)
                                                   options:MTLResourceStorageModeShared];

Metal’s vertex and fragment shaders are scoped by bundle. You can put .metal files with any name in a bundle, and Metal can look up the shader functions in those files and load them into memory.

    // Get the vertex shader and fragment shader from the bundle
    id<MTLLibrary> library = [self.mtlDevice newDefaultLibraryWithBundle:[NSBundle bundleWithPath:[[NSBundle mainBundle] pathForResource:@"XXXXXX" ofType:@"bundle"]]
                                                                   error:nil];
    id<MTLFunction> vertexFunc = [library newFunctionWithName:@"vertex_func"];
    id<MTLFunction> fragFunc = [library newFunctionWithName:@"fragment_func"];
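As an aside, if the .metal files are compiled into the app's main bundle, the standard shortcut does the same job (a sketch using the stock Metal API):

    id<MTLLibrary> library = [self.mtlDevice newDefaultLibrary]; // default library of the main bundle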

Next, the shader functions are assembled into the render pipeline; this requires an MTLRenderPipelineDescriptor object

    MTLRenderPipelineDescriptor *pipelineDescriptor = [MTLRenderPipelineDescriptor new];
    pipelineDescriptor.vertexFunction = vertexFunc;
    pipelineDescriptor.fragmentFunction = fragFunc;
    // Configure the pixel format so that everything passing through the render pipeline uses the same
    // color component order (here Blue, Green, Red, Alpha) and component size (here 8 bits, i.e. 0-255)
    pipelineDescriptor.colorAttachments[0].pixelFormat = MTLPixelFormatBGRA8Unorm;
    // Create the render pipeline state, which is effectively the pipeline between the CPU and the GPU
    self.pipelineState = [self.mtlDevice newRenderPipelineStateWithDescriptor:pipelineDescriptor error:nil];
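Passing error:nil hides pipeline-compilation failures (for example, a mismatched pixel format); a sketch of the same call with error checking:

    NSError *pipelineError = nil;
    self.pipelineState = [self.mtlDevice newRenderPipelineStateWithDescriptor:pipelineDescriptor
                                                                        error:&pipelineError];
    if (!self.pipelineState) {
        NSLog(@"Failed to create render pipeline state: %@", pipelineError);
    }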

Finally, initialize the command queue and create the texture cache

    self.commandQueue = [self.mtlDevice newCommandQueue]; // Get a command queue, which carries the MTLCommandBuffers holding the render commands
    CVMetalTextureCacheCreate(NULL, NULL, self.mtlDevice, NULL, &_textureCache); // Create the texture cache
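The texture cache is not used in the still-image path below; it exists so that a CVPixelBuffer (e.g. a camera frame) can be wrapped in an MTLTexture without copying. A sketch, assuming a BGRA pixel buffer named pixelBuffer:

    CVMetalTextureRef cvTexture = NULL;
    CVMetalTextureCacheCreateTextureFromImage(kCFAllocatorDefault, _textureCache, pixelBuffer, NULL,
                                              MTLPixelFormatBGRA8Unorm,
                                              CVPixelBufferGetWidth(pixelBuffer),
                                              CVPixelBufferGetHeight(pixelBuffer),
                                              0, &cvTexture);
    id<MTLTexture> frameTexture = CVMetalTextureGetTexture(cvTexture);
    // ... encode with frameTexture; CFRelease(cvTexture) only after the GPU has finished with it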

2. Image texture loading

Metal provides a handy class, MTKTextureLoader, for loading image textures; it generates an MTLTexture object according to various options. However, I ran into two problems with this class:

  • Problem 1: BGRA problem

The default colorPixelFormat of a Metal MTKView is MTLPixelFormatBGRA8Unorm. From Apple's documentation:

The color pixel format for the current drawable’s texture. The pixel format for a MetalKit view must be MTLPixelFormatBGRA8Unorm, MTLPixelFormatBGRA8Unorm_sRGB, MTLPixelFormatRGBA16Float, MTLPixelFormatBGRA10_XR, or MTLPixelFormatBGRA10_XR_sRGB.

However, a crash occurred when I tried to set it to other values, so the pixelFormat has to stay in a BGRA format throughout the rendering and command-encoding process. The problem is that for images whose pixels are internally ordered RGBA, the generated texture and the final rendered image take on a blue cast. To guarantee that the incoming image is BGRA, I first draw it into a CGContext configured for BGRA (kCGBitmapByteOrder32Little together with alpha-first stores each ARGB pixel as a little-endian 32-bit word, i.e. B, G, R, A in memory), then read the bitmap back out and pass that in

    - (unsigned char *)bitmapFromImage:(UIImage *)targetImage {
        CGImageRef imageRef = targetImage.CGImage;
        NSUInteger iWidth = CGImageGetWidth(imageRef);
        NSUInteger iHeight = CGImageGetHeight(imageRef);
        NSUInteger iBytesPerPixel = 4;
        NSUInteger iBytesPerRow = iBytesPerPixel * iWidth;
        NSUInteger iBitsPerComponent = 8;
        unsigned char *imageBytes = malloc(iWidth * iHeight * iBytesPerPixel);
        CGColorSpaceRef colorspace = CGColorSpaceCreateDeviceRGB();
        // Draw the image into a BGRA-ordered bitmap context
        CGContextRef context = CGBitmapContextCreate(imageBytes, iWidth, iHeight,
                                                     iBitsPerComponent, iBytesPerRow, colorspace,
                                                     kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedFirst);
        CGRect rect = CGRectMake(0, 0, iWidth, iHeight);
        CGContextDrawImage(context, rect, imageRef);
        CGColorSpaceRelease(colorspace);
        CGContextRelease(context);
        return imageBytes;
    }

    - (NSData *)imageDataFromBitmap:(unsigned char *)imageBytes imageSize:(CGSize)imageSize {
        CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
        CGContextRef context = CGBitmapContextCreate(imageBytes, imageSize.width, imageSize.height,
                                                     8, imageSize.width * 4, colorSpace,
                                                     kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedFirst);
        CGImageRef imageRef = CGBitmapContextCreateImage(context);
        CGContextRelease(context);
        CGColorSpaceRelease(colorSpace);
        UIImage *result = [UIImage imageWithCGImage:imageRef];
        NSData *imageData = UIImagePNGRepresentation(result);
        CGImageRelease(imageRef);
        return imageData;
    }
  • Problem 2: sRGB problem

StackOverflow has many answers to this strange color problem, and all of them mention setting the MTKTextureLoader option MTKTextureLoaderOptionSRGB to NO; its default value is YES.

My understanding is that sRGB is a color encoding whose effect is to increase encoding precision in the dark part of the gamut at the expense of the light part. sRGB-encoded images therefore need to be gamma-decoded to guarantee that the LUT lookup operates on strictly linear RGB. However, the images I pass in are already arranged as linear RGB, so no such correction is needed; if the sRGB option is not turned off, the linear data gets "corrected" anyway, and the result is a darkened image. The same problem appeared during my earlier CoreImage filter research. The effect picture is shown after the sketch below.
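For intuition: with the SRGB option left at YES, the loader treats the bytes as sRGB-encoded and decodes them to linear values using (approximately) the standard sRGB transfer function, so already-linear mid-tones get pushed down. A sketch of that decode in C, from the IEC 61966-2-1 definition:

    #include <math.h>

    // Standard sRGB-to-linear decode; c is a normalized component in [0, 1]
    static float srgbToLinear(float c) {
        return (c <= 0.04045f) ? (c / 12.92f)
                               : powf((c + 0.055f) / 1.055f, 2.4f);
    }
    // e.g. srgbToLinear(0.5) ≈ 0.214, so a mid-gray darkens noticeably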


  • The final code

Code to generate the LUT texture:

    unsigned char *imageBytes = [self bitmapFromImage:lutImage];
    NSData *imageData = [self imageDataFromBitmap:imageBytes imageSize:CGSizeMake(CGImageGetWidth(lutImage.CGImage), CGImageGetHeight(lutImage.CGImage))];
    free(imageBytes);
    // Generate the LUT filter texture
    self.lutTexture = [loader newTextureWithData:imageData options:@{MTKTextureLoaderOptionSRGB: @(NO)} error:&err];

Code to generate the original image texture:

    unsigned char *imageBytes = [self bitmapFromImage:image];
    NSData *imageData = [self imageDataFromBitmap:imageBytes imageSize:CGSizeMake(CGImageGetWidth(image.CGImage), CGImageGetHeight(image.CGImage))];
    free(imageBytes);    
    MTKTextureLoader *loader = [[MTKTextureLoader alloc] initWithDevice:self.mtlDevice];
    NSError* err;
    self.originalTexture = [loader newTextureWithData:imageData options:@{MTKTextureLoaderOptionSRGB:@(NO)} error:&err];

3. Shader code

The Metal shader code is similar to the OpenGL ES version, since the principle is the same

    #include <metal_stdlib>
    using namespace metal;

    struct Vertex {
        packed_float4 position;
        packed_float2 texCoords;
    };

    struct ColoredVertex {
        float4 position [[position]];
        float2 texCoords;
    };

    vertex ColoredVertex vertex_func(constant Vertex *vertices [[buffer(0)]],
                                     uint vid [[vertex_id]]) {
        Vertex inVertex = vertices[vid];
        ColoredVertex outVertex;
        outVertex.position = inVertex.position;
        outVertex.texCoords = inVertex.texCoords;
        return outVertex;
    }

    fragment half4 fragment_func(ColoredVertex vert [[stage_in]],
                                 texture2d<half> originalTexture [[texture(0)]],
                                 texture2d<half> lutTexture [[texture(1)]]) {
        float width = originalTexture.get_width();
        float height = originalTexture.get_height();
        uint2 gridPos = uint2(vert.texCoords.x * width, vert.texCoords.y * height);
        half4 color = originalTexture.read(gridPos);

        // Map the blue channel to one of 64 slices in the 8x8 LUT grid
        float blueColor = color.b * 63.0;

        int2 quad1;
        quad1.y = floor(floor(blueColor) / 8.0);
        quad1.x = floor(blueColor) - (quad1.y * 8.0);

        int2 quad2;
        quad2.y = floor(ceil(blueColor) / 8.0);
        quad2.x = ceil(blueColor) - (quad2.y * 8.0);

        half2 texPos1;
        texPos1.x = (quad1.x * 0.125) + 0.5/512.0 + ((0.125 - 1.0/512.0) * color.r);
        texPos1.y = (quad1.y * 0.125) + 0.5/512.0 + ((0.125 - 1.0/512.0) * color.g);

        half2 texPos2;
        texPos2.x = (quad2.x * 0.125) + 0.5/512.0 + ((0.125 - 1.0/512.0) * color.r);
        texPos2.y = (quad2.y * 0.125) + 0.5/512.0 + ((0.125 - 1.0/512.0) * color.g);

        half4 newColor1 = lutTexture.read(uint2(texPos1.x * 512, texPos1.y * 512));
        half4 newColor2 = lutTexture.read(uint2(texPos2.x * 512, texPos2.y * 512));

        // Interpolate between the two adjacent blue slices
        half4 newColor = mix(newColor1, newColor2, half(fract(blueColor)));
        half4 finalColor = mix(color, half4(newColor.rgb, 1.0), half(1.0));
        half4 realColor = half4(finalColor);
        return realColor;
    }

The LUT lookup logic is identical to the OpenGL ES version explained in the earlier article, so I will not repeat it here.

4. Render to screen

The rendering process begins by fetching the next drawable, the "canvas"

    // Get the next available drawable to render into
    id<CAMetalDrawable> drawable = [(CAMetalLayer *)[self.mtlView layer] nextDrawable];
    if (!drawable) {
        return;
    }

Then obtain the MTLRenderPassDescriptor, which describes the render pass

    MTLRenderPassDescriptor *renderPassDescriptor = [self.mtlView currentRenderPassDescriptor]; // Get the current render pass descriptor
    if (!renderPassDescriptor) {
        return;
    }
    renderPassDescriptor.colorAttachments[0].clearColor = MTLClearColorMake(0.5, 0.5, 0.5, 1.0); // Set the clear color for the color attachment
    renderPassDescriptor.colorAttachments[0].loadAction = MTLLoadActionClear; // Clear on load, so a new frame is not drawn on top of old content

A usable command buffer is then taken from the command queue, and from it a command encoder is built and loaded with the vertices, textures, and other resources the shaders need

    id<MTLCommandBuffer> commandBuffer = [self.commandQueue commandBuffer];
    // Build the encoder from the render pass descriptor
    id<MTLRenderCommandEncoder> commandEncoder = [commandBuffer renderCommandEncoderWithDescriptor:renderPassDescriptor];
    [commandEncoder setCullMode:MTLCullModeBack]; // Cull back-facing primitives
    [commandEncoder setFrontFacingWinding:MTLWindingClockwise]; // Primitives with clockwise vertex order are front-facing
    [commandEncoder setViewport:(MTLViewport){0.0, 0.0, self.mtlView.drawableSize.width, self.mtlView.drawableSize.height, -1.0, 1.0}]; // Set the viewport
    [commandEncoder setRenderPipelineState:self.pipelineState]; // Set the render pipeline state
    [commandEncoder setVertexBuffer:self.vertexBuffer offset:0 atIndex:0]; // Set the vertex buffer
    [commandEncoder setFragmentTexture:self.originalTexture atIndex:0]; // Set texture 0, the original image
    [commandEncoder setFragmentTexture:self.lutTexture atIndex:1]; // Set texture 1, the LUT image
    [commandEncoder drawPrimitives:MTLPrimitiveTypeTriangle vertexStart:0 vertexCount:6]; // Draw the two triangles (6 vertices)

Finally, end encoding and commit the buffer to the queue

    [commandEncoder endEncoding];
    [commandBuffer presentDrawable:drawable];
    [commandBuffer commit];
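If you need to know when the GPU actually finishes the frame, a completion handler can be registered on the command buffer (a sketch; it must be added before commit, and the block runs on a thread owned by Metal):

    [commandBuffer addCompletedHandler:^(id<MTLCommandBuffer> cb) {
        NSLog(@"Frame finished with status %lu", (unsigned long)cb.status);
    }];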

As before, the following original image is selected for testing


together with the following LUT image


lookup.png

Final filter effect