This is a translation of the free sample chapter of Metal by Tutorials, Chapter 3 of the original book. The chapter builds an app that displays a cube and covers the basics of the GPU hardware that turns 3D objects into pixels. Original article: Metal Rendering Pipeline Tutorial


Version Swift 4, iOS 11, Xcode 9

This article is an excerpt from Chapter 3 of our book Metal by Tutorials. The book introduces you to graphics programming in Metal, Apple’s GPU programming framework. You’ll build your own game engine, create 3D scenes, and write your own 3D games in Metal. We hope you enjoy it!

In this tutorial, you will delve into the rendering pipeline and create a Metal app to render a red cube. Along the way, you’ll learn all about the basics of the hardware chips that take 3D objects and turn them into pixels for display on the screen.

The GPU and CPU

All computers have a Central Processing Unit (CPU) that operates and manages the resources on the computer. Computers also have a Graphics Processing Unit (GPU).

A GPU is a specialized piece of hardware that can process images, video, and massive amounts of data very quickly. This is called throughput: the amount of data processed in a given unit of time.

The CPU cannot process large amounts of data very quickly, but it can process many sequential tasks (one after another) very quickly. The amount of time it takes to process a task is called latency.

The optimal configuration is low latency and high throughput. Low latency allows the CPU to execute serial, queued tasks without making the system slow or unresponsive; high throughput lets the GPU render video or games asynchronously without stalling the CPU. Because GPUs have a highly parallel architecture, specialized for repetitive tasks with little or no data transfer, they can handle huge amounts of data.

The chart below shows the main differences between CPUs and GPUs.

The CPU has a large cache and a small number of Arithmetic Logic Unit (ALU) cores. The low-latency cache on the CPU is used for fast access to temporary resources. GPUs don’t have large caches, but they have many more ALU cores, which only perform calculations without needing to store intermediate results in memory.

Meanwhile, a CPU has only a handful of cores, whereas a GPU has hundreds or even thousands. With more cores, the GPU can split a problem into many small parts, each running in parallel on a separate core, which hides latency. When processing is complete, the partial results are combined and the final result is returned to the CPU. However, the number of cores isn’t the only key factor!

Besides being slimmed down, GPU cores also include special circuitry for processing geometry and are often called shader cores. These shader cores are responsible for all the beautiful colors you see on the screen. The GPU writes a whole frame at a time to fill the entire render window, then moves on to the next frame to maintain a reasonable frame rate.

The CPU keeps passing instructions to the GPU to keep it busy, but at times either the CPU will stop sending instructions or the GPU will stop processing the instructions it has received. To avoid stalling, Metal on the CPU side queues up multiple commands in command buffers and issues new commands in order, so that the next frame doesn’t have to wait for the GPU to finish the first one. That way, no matter whether the CPU or the GPU finishes its work first, there will be more work waiting to be done.

The GPU part of the graphics pipeline starts when it receives all the instructions and resources.

The Metal project

You’ve already had a look at Metal in Playgrounds. Playgrounds are great for testing and learning new concepts, but it’s also important to know how to set up a full Metal project. Because the iOS Simulator doesn’t support Metal, you’ll use a macOS app.

Note: The project files for this tutorial also include an iOS target.

Create a new macOS App using the Cocoa App template.

Name it Pipeline and check Use Storyboards. Leave the other options unchecked.

Open Main.storyboard and select the view of the View Controller Scene.

In the Identity inspector on the right, change the view’s class from NSView to MTKView.

This makes the main view a MetalKit view.

Open ViewController.swift. At the top of the file, import the MetalKit framework:

import MetalKit

Then, in viewDidLoad(), add the following code:

guard let metalView = view as? MTKView else {
  fatalError("metal view not set up in storyboard")
}

Now you have a choice. You could subclass MTKView and use that view in your storyboard; in that case, the subclass’s draw(_:) would be called every frame, and you’d put your drawing code in that method. However, in this tutorial, you’ll create a Renderer class that conforms to the MTKViewDelegate protocol and set it as the delegate of the MTKView. MTKView calls a delegate method every frame, and that’s where you’ll write the necessary drawing code.
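For reference, here is a minimal sketch of the subclassing alternative on macOS (it isn’t used in the rest of this tutorial):

import MetalKit

// Sketch only: an MTKView subclass whose draw(_:) is called every frame.
class MyMTKView: MTKView {
  override func draw(_ dirtyRect: NSRect) {
    super.draw(dirtyRect)
    // per-frame drawing code would go here
  }
}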

Note: If you’ve used other graphics APIs before, you might be looking for a game loop construct. You could also choose to extend CAMetalLayer instead of creating an MTKView, and use CADisplayLink for timing; but Apple introduced MetalKit with protocols that make managing the game loop easier.

The Renderer class

Create a new Swift file named Renderer.swift and replace its contents with the following code:

import MetalKit

class Renderer: NSObject {
  init(metalView: MTKView) {
    super.init()
  }
}

extension Renderer: MTKViewDelegate {
  func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) {
  }

  func draw(in view: MTKView) {
    print("draw")
  }
}

Here you create an initializer and make Renderer conform to MTKViewDelegate, implementing MTKView’s two delegate methods:

  • mtkView(_:drawableSizeWillChange:): Called every time the window size changes. This allows you to update the render coordinate system.
  • draw(in:): Called every frame.

In ViewController.swift, add a property to hold the renderer:
var renderer: Renderer?

At the end of viewDidLoad(), initialize the renderer:

renderer = Renderer(metalView: metalView)

Initialization

First, you need to set up the Metal environment. Metal’s big advantage over OpenGL is that you can instantiate objects up front instead of creating them every frame. The diagram below lists the objects you can create at the start of the app.

  • MTLDevice: The software reference to the GPU hardware device.
  • MTLCommandQueue: Responsible for creating and organizing the MTLCommandBuffers required each frame.
  • MTLLibrary: Contains the compiled source of your vertex and fragment shader functions.
  • MTLRenderPipelineState: Sets the drawing information, such as which shader functions to use, what depth and color settings to use, and how to read the vertex data.
  • MTLBuffer: Holds data, such as vertex information, in a form that you can send to the GPU.

Normally, there is only one MTLDevice, one MTLCommandQueue, and one MTLLibrary object in your app. There are usually several MTLRenderPipelineState objects to define different pipeline states and several MTLBuffers to hold data.

Before you can use these objects, you need to initialize them. Add the following properties to the Renderer:

static var device: MTLDevice!
static var commandQueue: MTLCommandQueue!
var mesh: MTKMesh!
var vertexBuffer: MTLBuffer!
var pipelineState: MTLRenderPipelineState!

These properties refer to the different objects. They are implicitly unwrapped optionals for convenience, and you can tidy them up after initialization if you like. You don’t need to keep a reference to the MTLLibrary, but you do need to create it.

Next, add this code to init(metalView:), before super.init():

guard let device = MTLCreateSystemDefaultDevice() else {
  fatalError("GPU not available")
}
metalView.device = device
Renderer.device = device
Renderer.commandQueue = device.makeCommandQueue()!

Here you initialize the GPU and create the command queue. You use class properties to store the device and the command queue so that only one of each exists. In some rare cases you may require more than one, but in most apps one is plenty.

Finally, after super.init(), add the following code:

metalView.clearColor = MTLClearColor(red: 1.0, green: 1.0,
                                     blue: 0.8, alpha: 1.0)
metalView.delegate = self

This sets metalView.clearColor to a cream color. It also sets Renderer as the delegate of metalView so that the view calls the MTKViewDelegate drawing methods.

Build and run the app to make sure everything is set up and working. If all’s well, you should see a gray window. In the debug console, you’ll see the word “draw” repeated over and over. Use this to verify that your app is calling draw(in:) every frame.

You can’t see the cream color of the metalView because you didn’t ask the GPU to do any drawing.

Set up the data

It’s useful to have a class dedicated to creating 3D primitive meshes. In this tutorial, you’ll create a Primitive class for 3D shape primitives and add a cube to it.

Create a new Swift file named Primitive.swift and replace the default code with the following:

import MetalKit

class Primitive {
  class func makeCube(device: MTLDevice, size: Float) -> MDLMesh {
    let allocator = MTKMeshBufferAllocator(device: device)
    let mesh = MDLMesh(boxWithExtent: [size, size, size],
                       segments: [1, 1, 1],
                       inwardNormals: false, geometryType: .triangles,
                       allocator: allocator)
    return mesh
  }
}

This class method returns a cube.

In Renderer.swift, in init(metalView:), create the mesh before calling super.init():

let mdlMesh = Primitive.makeCube(device: device, size: 1)
do {
  mesh = try MTKMesh(mesh: mdlMesh, device: device)
} catch let error {
  print(error.localizedDescription)
}

Then, set up the MTLBuffer that holds the vertex data you’ll send to the GPU:

vertexBuffer = mesh.vertexBuffers[0].buffer

This puts the data in an MTLBuffer. Now you need to establish the pipeline state so that the GPU knows how to render the data.

First, create the MTLLibrary and make sure the vertex and fragment shader functions are available.

Continue adding code before super.init():

let library = device.makeDefaultLibrary()
let vertexFunction = library?.makeFunction(name: "vertex_main")
let fragmentFunction = library?.makeFunction(name: "fragment_main")

You’ll create these shader functions later in this tutorial. Unlike OpenGL shaders, these are compiled when you compile your project, which is far more efficient than compiling them at run time. The result is stored in the library.
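For comparison, Metal can also compile shader source at run time with makeLibrary(source:options:). Here is a minimal sketch; the shader string is a hypothetical placeholder, not this project’s shaders:

// Comparison only: compiling Metal source at run time instead of using
// the default library that Xcode builds at compile time.
let source = """
vertex float4 vertex_passthrough(uint id [[ vertex_id ]]) {
  return float4(0);
}
"""
let runtimeLibrary = try? device.makeLibrary(source: source, options: nil)
let runtimeFunction = runtimeLibrary?.makeFunction(name: "vertex_passthrough")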

Now, create the pipeline state:

let pipelineDescriptor = MTLRenderPipelineDescriptor()
pipelineDescriptor.vertexFunction = vertexFunction
pipelineDescriptor.fragmentFunction = fragmentFunction
pipelineDescriptor.vertexDescriptor = MTKMetalVertexDescriptorFromModelIO(mdlMesh.vertexDescriptor)
pipelineDescriptor.colorAttachments[0].pixelFormat = metalView.colorPixelFormat
do {
  pipelineState = try device.makeRenderPipelineState(descriptor: pipelineDescriptor)
} catch let error {
  fatalError(error.localizedDescription)
}

This sets up a potential state for the GPU. The GPU needs to know its complete state before it can start managing vertices. Here you set the two shader functions the GPU will call and the pixel format of the texture it will write to.

You also set the pipeline’s vertex descriptor. This is how the GPU will know how to interpret the vertex data that you pass in the mesh data MTLBuffer.

If you need to call different vertex or fragment functions, or use different data layouts, then you need multiple pipeline states. Creating pipeline states is quite time consuming, which is why you need to create them early, but switching pipeline states between frames is very fast and efficient.
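As a hypothetical sketch (the fragment_wireframe function and this second pipeline state are not part of this project), creating and switching to another pipeline state could look like this:

// Hypothetical: a second pipeline state that uses a different fragment function.
let secondDescriptor = MTLRenderPipelineDescriptor()
secondDescriptor.vertexFunction = vertexFunction
secondDescriptor.fragmentFunction = library?.makeFunction(name: "fragment_wireframe")
secondDescriptor.vertexDescriptor =
  MTKMetalVertexDescriptorFromModelIO(mdlMesh.vertexDescriptor)
secondDescriptor.colorAttachments[0].pixelFormat = metalView.colorPixelFormat
let secondPipelineState = try? device.makeRenderPipelineState(descriptor: secondDescriptor)

// Later, in draw(in:), switching states between draw calls is cheap:
// renderEncoder.setRenderPipelineState(pipelineState)
// ... draw some geometry ...
// renderEncoder.setRenderPipelineState(secondPipelineState)
// ... draw other geometry ...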

The initialization is complete, and your project will now compile. However, if you try to run it, you’ll get an error because you haven’t yet created the shader functions.

Render frames

In Renderer.swift, replace the print statement in draw(in:) with:

guard let descriptor = view.currentRenderPassDescriptor,
  let commandBuffer = Renderer.commandQueue.makeCommandBuffer(),
  let renderEncoder = 
    commandBuffer.makeRenderCommandEncoder(descriptor: descriptor) else {
    return
}

// drawing code goes here

renderEncoder.endEncoding()
guard let drawable = view.currentDrawable else {
  return
}
commandBuffer.present(drawable)
commandBuffer.commit()

This creates the render command encoder and sends the view’s drawable texture to the GPU.

Drawing

On the CPU side, to prepare the data for the GPU, you hand the GPU the data and the pipeline state. Then you issue a draw call.

Again in draw(in:), replace the comment:

// drawing code goes here

For the following code:

renderEncoder.setRenderPipelineState(pipelineState)
renderEncoder.setVertexBuffer(vertexBuffer, offset: 0, index: 0)
for submesh in mesh.submeshes {
  renderEncoder.drawIndexedPrimitives(type: .triangle,
                     indexCount: submesh.indexCount,
                     indexType: submesh.indexType,
                     indexBuffer: submesh.indexBuffer.buffer,
                     indexBufferOffset: submesh.indexBuffer.offset)
}

When you submit the command buffer at the end of draw(in:), you indicate to the GPU that the data and pipeline are ready for the GPU to take over.

Rendering pipeline

It’s finally time to review the GPU pipeline! In the diagram below, you can see the state of the pipeline.

The graphics pipeline takes the vertices through multiple stages, during which the vertices have their coordinates transformed between various coordinate spaces.

As a Metal programmer, you only need to concern yourself with the Vertex Processing and Fragment Processing stages, since these are the only two that are programmable. Later in the tutorial, you’ll write both a vertex shader and a fragment shader. For the non-programmable pipeline stages, such as Vertex Fetch, Primitive Assembly, and Rasterization, the GPU has specially designed hardware units to perform them.

Next, you’ll look at each of these stages one by one.

1-Vertex Fetch

The name of this stage varies between graphics APIs. For example, DirectX calls it the Input Assembler.

To start rendering 3D content, you first need a scene. A scene consists of models, and each model holds a mesh of vertices. The simplest model is a cube, which has six faces (12 triangles).

You use a vertex descriptor to define the way vertex attributes, such as position, texture coordinates, normals, and color, are read. You can also choose not to use a vertex descriptor and simply send an MTLBuffer of vertices; however, if you do, you must know in advance how the vertex buffer is laid out.
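For illustration, here is a minimal sketch of building a vertex descriptor by hand for a position-only layout; this project instead reuses the descriptor that Model I/O generates for the mesh:

// Sketch: a hand-built descriptor for a single, tightly packed float3
// position attribute stored in buffer 0.
let vertexDescriptor = MTLVertexDescriptor()
vertexDescriptor.attributes[0].format = .float3
vertexDescriptor.attributes[0].offset = 0
vertexDescriptor.attributes[0].bufferIndex = 0
vertexDescriptor.layouts[0].stride = MemoryLayout<Float>.stride * 3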

When the GPU fetches the vertex buffer, the MTLRenderCommandEncoder draw call tells the GPU whether the buffer is indexed. If the buffer is not indexed, the GPU assumes that the buffer is an array and reads one element at a time, in order.

These indexes are important because vertices are cached for reuse. For example, a cube has 12 triangles and 8 vertices. If you don’t use indexes, you have to specify vertices for each triangle and send 36 vertices to the GPU. This may not sound like much, but in a model with thousands of vertices, vertex caching is very important!
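As a small illustration (not project code), one face of a cube drawn as two indexed triangles needs only four unique vertices plus six indices, instead of six duplicated vertices:

// Illustration: one quad as two indexed triangles.
let positions: [SIMD3<Float>] = [
  [-1, -1, 0],   // 0: bottom left
  [ 1, -1, 0],   // 1: bottom right
  [ 1,  1, 0],   // 2: top right
  [-1,  1, 0]    // 3: top left
]
let quadIndices: [UInt16] = [
  0, 1, 2,       // first triangle
  0, 2, 3        // second triangle
]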

There is also a second cache for shaded vertices, so that vertices that are accessed multiple times only need to be shaded once. A shaded vertex is one to which color has already been applied. But that happens in the next stage.

A special unit of hardware called the Scheduler sends vertices and their properties to the Vertex Processing phase.

2-Vertex Processing

At this stage, vertices are processed individually. You write code to calculate per-vertex lighting and color. More importantly, you transform the vertex coordinates through various coordinate spaces to determine their position in the final framebuffer.

Now it’s time to look at what’s going on at the hardware level. Take a look at the modern AMD GPU architecture:

  • 1 Graphics Command Processor: Schedules the entire workflow.
  • Shader Engines (SE): An SE is an organizational unit on the GPU that can serve the whole pipeline. Each SE has a geometry processor, a rasterizer, and compute units.
  • Compute Units (CU): A CU is nothing more than a group of shader cores.
  • Shader Cores: The shader core is the basic building block of the GPU, where all of the shading work is done.

Its 36 CUs have a total of 2,304 shader cores. Compare that to your quad-core CPU: that’s a huge difference!

For mobile devices, things are a little different. The diagram below shows a GPU architecture used in iOS devices over the last few years. Instead of SEs and CUs, PowerVR GPUs use Unified Shading Clusters (USC). This particular GPU has six USCs with 32 cores each, for a total of only 192 cores.

Note: The newest mobile GPU, in the iPhone X, is designed entirely in-house by Apple. Unfortunately, Apple doesn’t publish details of its GPU hardware.

So what can you do with all these cores? Because the cores are specialized in vertex and fragment shading, and they can obviously work in parallel, vertex and fragment processing can be much faster. There are a few rules, though. Within a CU, you can only process either vertices or fragments at one time, but not both. Good thing there are 36 of them! Another rule is that each SE can only process one shader function at a time. Having four SEs allows for more flexible combinations: for example, you could run one fragment shader on one SE and a second fragment shader on another SE at the same time, or you could separate your vertex shader from your fragment shader and have them run in parallel on different SEs.

Now, it’s time to take a look at vertex processing! The vertex shader you are going to write should be minimal and encapsulate most of the necessary vertex shader syntax.

Using the Metal File template, create a new file named Shaders.metal. Then, add the following code at the end of the file:

// 1
struct VertexIn {
  float4 position [[ attribute(0) ]];
};

// 2
vertex float4 vertex_main(const VertexIn vertexIn [[ stage_in ]]) {
  return vertexIn.position;
}

Going through this code:

  1. Create a struct, VertexIn, that describes the vertex attributes and matches the vertex descriptor you set up earlier. In this case, just position.
  2. Implement a vertex shader, vertex_main, that receives a VertexIn struct and returns the vertex position as a float4.

Remember that vertices are indexed in the vertex buffer. The vertex shader gets the current index via the [[stage_in]] attribute and unpacks the VertexIn struct cached for the vertex at that index.

Vertex processing units can process batches of vertices at once, up to the maximum number their shader cores allow. A batch fits entirely in the CU cache, so vertices can be reused as needed. The batch keeps the CU busy until processing is complete, and the remaining CUs become available for the next batch.

Once vertex processing is complete, the cache is cleared for the next batch of vertices. At this point, the vertices are ordered and grouped, ready to be sent to the next stage.

To recap: the CPU sent the GPU a vertex buffer created from the model’s mesh. You configured the vertex buffer with a vertex descriptor that tells the GPU how the vertex data is structured. On the GPU, you created a struct to encapsulate the vertex attributes. The vertex shader takes in this struct as a function argument and, through the [[stage_in]] qualifier, knows that position comes from the CPU via the [[attribute(0)]] position in the vertex buffer. The vertex shader then processes all the vertices and returns their positions as a float4.

A special hardware unit called the Distributer sends the grouped blocks of vertex data to the next stage: Primitive Assembly.

3-Primitive Assembly

The previous stage grouped vertices into blocks of data and sent them to this stage. Note that vertices belonging to the same geometric shape (primitive) are always kept in the same block, so a point (one vertex), a line (two vertices), or a triangle (three vertices) never requires reading a second block.

At the same time, the CPU also sent the vertex connectivity information when it issued the draw call, like this:

renderEncoder.drawIndexedPrimitives(type: .triangle,
                          indexCount: submesh.indexCount,
                          indexType: submesh.indexType,
                          indexBuffer: submesh.indexBuffer.buffer,
                          indexBufferOffset: 0)

The first argument of the draw call contains the most important connectivity information. In this case, it tells the GPU that it should draw triangles from the vertex buffer it was given.

The Metal API provides five primitive types:

  • point: Rasterizes a point for each vertex. You can specify the size of a point with the [[point_size]] attribute in the vertex shader.
  • line: Rasterizes a line between each pair of vertices. If a vertex is already part of one line, it cannot be part of another line. If there is an odd number of vertices, the last vertex is ignored.
  • lineStrip: Like a simple line, except the strip connects all adjacent vertices to form a polyline. Each vertex (except the first) is connected to the previous vertex.
  • triangle: Rasterizes a triangle for every three consecutive vertices. If the last vertices cannot form a triangle, they are ignored.
  • triangleStrip: Like a simple triangle, except a vertex can be combined with the sides of adjacent triangles to form a new triangle.

There is actually one more primitive type, called a patch, but it needs special treatment and cannot be used with indexed draw calls.
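For comparison, a non-indexed draw call selects the primitive type in the same way. This is only a sketch; vertexCount is an assumed value, not something defined in this project:

// Sketch of a non-indexed draw: the GPU reads vertices from the bound
// vertex buffer in order and assembles them as a line strip.
renderEncoder.drawPrimitives(type: .lineStrip,
                             vertexStart: 0,
                             vertexCount: vertexCount)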

The pipeline specifies the winding order of the vertices. If the winding order is counterclockwise, and the triangle’s vertices are listed in counterclockwise order, the triangle is front-facing. Otherwise, the triangle is back-facing and can be culled, since we cannot see its color and lighting anyway.

Primitives are culled when they are completely occluded by other primitives; if they are only partially off screen, they are clipped.

For efficiency, you should specify the winding order and enable back-face culling.
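A short sketch of what that looks like on the render command encoder (these two calls are not part of this tutorial’s project):

// Declare counterclockwise vertex order as front-facing, then cull back faces.
renderEncoder.setFrontFacing(.counterClockwise)
renderEncoder.setCullMode(.back)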

At this point, the primitives have been fully assembled from the vertices and will enter the rasterizer.

4-Rasterization

Currently, there are two different rendering techniques: ray tracing and rasterization, which are sometimes used together. They are very different, each having advantages and disadvantages.

Raytracing works better when the rendered content is static and at a distance; Rasterization works better when the content is very close to the camera and moving.

With raytracing, you emit a ray from every point on the screen into the scene to see if it intersects with objects in the scene. If so, change the color of the pixels on the screen to the color of the object closest to the screen.

Rasterization works the other way around: from each object in the scene, rays are projected toward the screen to determine which pixels the object covers. Depth information is kept, just as in ray tracing, so the pixel color is updated whenever a closer object covers it.

At this point, the connected vertices sent from the previous stage are laid out on a two-dimensional grid using their X and Y coordinates. This step is known as triangle setup.

Here, the rasterizer needs to calculate the slope of a line segment between any two vertices. When we know the three slopes between the three vertices, we can form a triangle with these three sides.

The next step, called scan conversion, scans the screen line by line looking for intersections to determine what is visible and what is not. To draw on the screen, only the vertices and the slopes they determine are needed. The scan algorithm determines whether all the points on a line segment, or all the points inside a triangle, are visible, in which case they are filled with color entirely.
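To make the idea concrete, here is a tiny CPU-side sketch (illustration only, not project code) of the coverage test a rasterizer performs: edge functions decide whether a pixel center lies inside a triangle.

import simd

// Signed test for point p against edge a->b: the sign tells which side of
// the edge p is on.
func edge(_ a: SIMD2<Float>, _ b: SIMD2<Float>, _ p: SIMD2<Float>) -> Float {
  return (p.x - a.x) * (b.y - a.y) - (p.y - a.y) * (b.x - a.x)
}

// A pixel is covered when all three edge functions agree in sign.
func covers(_ a: SIMD2<Float>, _ b: SIMD2<Float>, _ c: SIMD2<Float>,
            pixel p: SIMD2<Float>) -> Bool {
  let e0 = edge(a, b, p), e1 = edge(b, c, p), e2 = edge(c, a, p)
  return (e0 >= 0 && e1 >= 0 && e2 >= 0) || (e0 <= 0 && e1 <= 0 && e2 <= 0)
}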

For mobile devices, rasterization takes advantage of the tiled architecture of PowerVR GPUs, which rasterize a grid of 32×32 tiles in parallel. Here, 32 is the number of screen pixels assigned to a tile, a size that neatly fits the number of USC cores.

What happens when one object hides behind another? How does the rasterizer decide which object to render? This hidden-surface-removal problem is solved by using stored depth information (an early-Z test) to determine whether each point is in front of other points in the scene.
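In Metal, the depth comparison that resolves hidden surfaces is configured through a depth-stencil state. Here’s a minimal sketch, assuming the view has a depth attachment (this tutorial’s project doesn’t set one up):

// Keep the fragment closest to the camera and write its depth.
let depthDescriptor = MTLDepthStencilDescriptor()
depthDescriptor.depthCompareFunction = .less
depthDescriptor.isDepthWriteEnabled = true
let depthState = device.makeDepthStencilState(descriptor: depthDescriptor)
// renderEncoder.setDepthStencilState(depthState)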

After rasterization is complete, three additional hardware units take over the task:

  • A buffer called Hierarchical-Z is responsible for removing fragments that the rasterizer marked for culling.
  • The Z and Stencil Test unit then compares fragments against the depth and stencil buffers, removing those that are not visible.
  • Finally, the Interpolator unit takes the remaining visible fragments and generates the fragment attributes from the assembled triangle attributes.

At this point, the Scheduler unit once again schedules the task to the shader core, but this time, the rasterized fragments are sent to the Fragment Processing phase.

5-Fragment Processing

Time for a quick pipeline refresher.

  • The Vertex Fetch unit grabs vertex data from memory and passes it to the Scheduler unit.
  • The Scheduler unit knows which shader cores are available and assigns work to them.
  • When the work is done, the Distributer unit knows whether the work was vertex processing or fragment processing.
  • In the case of vertex processing, it sends the results to the Primitive Assembly unit. The path continues to the Rasterization unit and then back to the Scheduler unit.
  • If it’s fragment processing, it sends the result to the Color Writing unit.
  • Finally, the colored pixels are sent back into memory.

The primitive processing in the previous stages was sequential, because there is only one Primitive Assembly unit and one Rasterization unit. However, once fragments reach the Scheduler unit, the work can be forked into many tiny parts, and each part is given to an available shader core.

Hundreds or even thousands of cores are now being processed in parallel. When the work is finished, the results are joined and sent back into memory.

The fragment processing phase is another programmable control phase. You will create a fragment shader function to receive the light, texture coordinates, depth and color information output from the vertex function.

The output of the fragment shader is the color of a single fragment. Each fragment contributes to the color of the final pixel in the framebuffer. All of the attributes are interpolated for each fragment.

For example, to render this triangle, the vertex function processes three vertices with the colors red, green, and blue. As the diagram shows, each fragment that makes up the triangle is interpolated from these three colors. Linear interpolation simply averages the colors of the two endpoints based on distance: if one endpoint is red and the other is green, the midpoint of the line between them is yellow, and so on.

The interpolation equation is parametric, where the parameter p is the percentage (or a value in the range 0 to 1) of a color’s component:

newColor = p * oldColor1 + (1 - p) * oldColor2

It’s easy to visualize interpolation with color, but all of the other outputs of the vertex function are interpolated in the same way for each fragment.
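Here is the same formula as a tiny Swift illustration (not project code), applied to RGBA colors:

import simd

// Linear interpolation of two colors; p is the blend percentage (0 to 1).
func lerp(_ color1: SIMD4<Float>, _ color2: SIMD4<Float>, _ p: Float) -> SIMD4<Float> {
  return p * color1 + (1 - p) * color2
}

let red: SIMD4<Float> = [1, 0, 0, 1]
let green: SIMD4<Float> = [0, 1, 0, 1]
let midpoint = lerp(red, green, 0.5)   // [0.5, 0.5, 0, 1], a yellow blend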

Note: If you don’t want a vertex’s output to be interpolated, add the property [[flat]] to its definition.

In Shaders.metal, add the fragment function at the end of the file:

fragment float4 fragment_main() {
  return float4(1, 0, 0, 1);
}

This is probably the simplest fragment function possible. You return the interpolated color red as a float4. All the fragments that make up the cube will be red.

The GPU takes the fragments and runs a series of post-processing tests:

  • alpha-testing determines, based on depth testing, which opaque objects are drawn and which are not.
  • For translucent objects, alpha-blending combines the color of the new object with the color already saved in the color buffer.
  • Scissor testing checks whether a fragment lies inside a specified rectangle; this test is useful for masked rendering.
  • Stencil testing compares the stencil value in the framebuffer where the fragment lies with a specific value that we choose.
  • The early-Z test ran in a previous stage; now a late-Z test runs to resolve more visibility issues. Stencil and depth tests are also useful for ambient occlusion and shadows.
  • Finally, antialiasing is also calculated here so that the final image displayed on the screen does not look serrated.

6-Framebuffer

Once the fragments have been processed into pixels, the Distributer unit sends them to the Color Writing unit. This unit is responsible for writing the final color to a special memory location called the framebuffer. From there, the view gets its colored pixels, refreshed every frame. But does writing a color into the framebuffer mean that it appears on the screen at the same moment?

A technique called double-buffering is used to solve this problem. While the first buffer is displayed on the screen, the second is updated in the background. Then, the two buffers are swapped, the second buffer is displayed on the screen, the first is updated, and so on.
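A conceptual sketch of the idea (illustration only, with an assumed 640×480 pixel buffer):

// The display reads the front buffer while the renderer fills the back
// buffer; presenting a frame simply swaps the two.
var frontBuffer = [UInt32](repeating: 0, count: 640 * 480)  // visible on screen
var backBuffer = [UInt32](repeating: 0, count: 640 * 480)   // being rendered

func presentFrame() {
  swap(&frontBuffer, &backBuffer)
}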

Whew! That was a lot of hardware information. However, the code you wrote here is what every Metal renderer uses, and even though you’re only just getting started, you should begin to recognize the parts of the rendering process when you look at Apple’s sample code.

Build and run your app, and your app will render a red cube.

Normalized Device Coordinates (NDC)

Send data to the GPU

Metal is all about gorgeous graphics and fast, smooth animation. Next, you’re going to make your cube move up and down the screen. To do this, you need a timer that updates every frame, and the cube’s position will depend on that timer. The vertex function is where you update the vertex positions, so you’ll send the timer data to the GPU.

At the top of Renderer, add the timer property:

var timer: Float = 0

In draw(in:), before the following code:

renderEncoder.setRenderPipelineState(pipelineState)

add

// 1
timer += 0.05
var currentTime = sin(timer)
// 2
renderEncoder.setVertexBytes(&currentTime,
                             length: MemoryLayout<Float>.stride,
                             index: 1)
  1. Every frame, you add to the timer. You want your cube to move up and down the screen, so you need a value between -1 and 1. Using sin() is a great way to achieve this.
  2. If you’re only sending a small amount of data (less than 4KB) to the GPU, setVertexBytes(_:length:index:) is an alternative to setting up an MTLBuffer. Here, you set currentTime to be at index 1 in the buffer argument table.

In Shaders.metal, replace the vertex function with the following code:

vertex float4 vertex_main(const VertexIn vertexIn [[ stage_in ]],
                          constant float &timer [[ buffer(1) ]]) {
  float4 position = vertexIn.position;
  position.y += timer;
  return position;
}

Here, the vertex function receives the timer as a float in buffer 1. You add the timer value to the y position and return the new position.

Build and run the app, and now you have the cube moving!

With just a few lines of code, you’ve learned how pipelines work and added a little animation.

What’s next?

If you want to see the project after this tutorial, you can download the tutorial materials and find them in the Final folder.

If you enjoyed what you learned in this tutorial, why not try our new book, Metal by Tutorials?

This book will introduce you to implementing low-level graphics programming in Metal. As you study this book, you will learn a lot about the basics of building a game engine, and gradually assemble it into your own engine.

When your game engine is complete, you will be able to compose 3D scenes and encode your own simple 3D game. Since you’ll be building your 3D game engine from scratch, you’ll be able to customize whatever appears on the screen.

But beyond technical definitions, Metal is an ideal way to use GPU parallelization capabilities to visualize data or solve numerical problems. So it’s also used for machine learning, image/video processing or, as we’ll see in this book, graphic rendering.

This book is the best resource for intermediate Swift developers who want to learn 3D graphics or want to gain a deeper understanding of how the game engine works.

If you have any questions or comments about this tutorial, please comment below!
