preface

It’s often said in software development that “premature optimization is the root of all evil.” Don’t optimize prematurely, and don’t over-optimize either. I think it’s important to be aware of performance implications while coding, but everything has its limits: you can’t delay development for the sake of performance. When time is tight, we often use “quick and dirty” solutions to produce results quickly and then iterate on optimization; this is the essence of agile development, as opposed to the waterfall process of traditional software development.

The causes of UI stutter

In iOS, displaying content on the screen requires both the CPU and the GPU. The CPU is responsible for computing the display content: view creation, layout calculation, image decoding, text drawing, and so on. The CPU then submits the computed content to the GPU, which transforms, composites, and renders it. The GPU submits the rendering result to the frame buffer and waits for the next VSync signal to display it on screen. Because of the VSync mechanism, if the CPU or GPU does not finish its work within one VSync interval, that frame is discarded and displayed at the next opportunity, while the display keeps showing the previous content. This is why the interface stutters.

Therefore, we need to balance the CPU and GPU loads to avoid overloading one side. To do this, we first need to understand what the CPU and GPU are responsible for.

The figure above shows where each module sits in the iOS system. Now let’s look at the operations the CPU and GPU are each responsible for.

CPU-intensive tasks

Layout calculation

Layout calculation is the most common CPU drain in iOS; if the view hierarchy is complex, working out the layout of all the layers takes time. Therefore, we should try to compute layout information in advance and then set the corresponding attributes at the appropriate moment. Also avoid unnecessary updates: update only when a real layout change has occurred.

Object creation

The object creation process involves memory allocation, property setting, and even file reading, which can consume CPU resources. You can optimize performance by replacing heavy objects with lightweight objects. For example, CALayer is much lighter than UIView, so if view elements don’t need to respond to touch events, CALayer is more appropriate.

Creating view objects in Storyboard also involves deserialization of files, which can be much more expensive than creating objects directly in code, making Storyboard not a good technology choice for performance-sensitive interfaces.

For list-style pages, you can also borrow UITableView’s reuse mechanism. Each time you need a view object, first try to fetch one from the cache pool by identifier; if one is available, reuse it, otherwise create a new one. As the screen scrolls, view objects that slide off screen are returned to the cache pool under their identifier, and views entering the visible area follow the same rule to decide whether they really need to be created.
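The reuse flow described above can be sketched as a small cache pool keyed by identifier. The ViewReusePool class below is hypothetical, not a system API; it only illustrates the dequeue/enqueue idea:

```objc
#import <UIKit/UIKit.h>

// Hypothetical reuse pool, modeled on UITableView's cell-reuse mechanism.
@interface ViewReusePool : NSObject
- (UIView *)dequeueViewWithIdentifier:(NSString *)identifier;
- (void)enqueueView:(UIView *)view withIdentifier:(NSString *)identifier;
@end

@implementation ViewReusePool {
    NSMutableDictionary<NSString *, NSMutableArray<UIView *> *> *_pool;
}

- (instancetype)init {
    if (self = [super init]) {
        _pool = [NSMutableDictionary dictionary];
    }
    return self;
}

// Return a cached view for this identifier, or nil so the caller creates one.
- (UIView *)dequeueViewWithIdentifier:(NSString *)identifier {
    NSMutableArray<UIView *> *views = _pool[identifier];
    UIView *view = views.lastObject;
    if (view) {
        [views removeLastObject];
    }
    return view;
}

// Called when a view slides off screen: detach it and keep it for reuse.
- (void)enqueueView:(UIView *)view withIdentifier:(NSString *)identifier {
    [view removeFromSuperview];
    NSMutableArray<UIView *> *views = _pool[identifier];
    if (!views) {
        views = [NSMutableArray array];
        _pool[identifier] = views;
    }
    [views addObject:view];
}
@end
```

The caller’s pattern is then: try dequeue, fall back to alloc/init on a miss, and enqueue on scroll-out, so object creation cost is paid only for cache misses.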

Autolayout

Autolayout is a layout technology Apple introduced in iOS 6. In most cases Autolayout greatly speeds up development, especially when dealing with multiple languages. For example, Arabic is laid out right-to-left; with Autolayout you only need to set leading and trailing constraints.

However, Autolayout can often cause serious performance problems for complex views, and it is recommended to use manual layout for performance-sensitive pages, with a controlled refresh rate and relayout only when you really need to change the layout.

Text calculation

If an interface contains a large amount of text (Weibo or WeChat Moments feeds, for example), calculating text width and height takes a large share of resources and is unavoidable.

A common scenario is UITableView, where the heightForRowAtIndexPath: method is called so frequently that even inexpensive computations add up to a performance cost over time. The optimization is to avoid recomputing the text height on every call: after obtaining the model data, calculate the layout information from the text content and store it as a property on the corresponding model. The UITableView callbacks can then read that property directly, reducing text computation.
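As a sketch of this caching idea (the FeedModel class, content width, and font are my own assumptions, not from the original): measure the text once when it is set, then return the cached value from the delegate.

```objc
#import <UIKit/UIKit.h>

static const CGFloat kContentWidth = 300.0; // assumed cell content width

// Hypothetical model that caches its own layout information.
@interface FeedModel : NSObject
@property (nonatomic, copy) NSString *text;
@property (nonatomic, assign) CGFloat cachedHeight; // computed once, read many times
@end

@implementation FeedModel
- (void)setText:(NSString *)text {
    _text = [text copy];
    // Measure once, off the scrolling hot path.
    CGRect rect = [_text boundingRectWithSize:CGSizeMake(kContentWidth, CGFLOAT_MAX)
                                      options:NSStringDrawingUsesLineFragmentOrigin
                                   attributes:@{NSFontAttributeName: [UIFont systemFontOfSize:15]}
                                      context:nil];
    _cachedHeight = ceil(CGRectGetHeight(rect));
}
@end

// In the table view delegate, scrolling no longer measures text:
- (CGFloat)tableView:(UITableView *)tableView
    heightForRowAtIndexPath:(NSIndexPath *)indexPath {
    FeedModel *model = self.models[indexPath.row];
    return model.cachedHeight;
}
```

If the data can change, invalidate or recompute cachedHeight in the setter, never in the delegate callback.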

Text rendering

All visible text controls on screen, including UIWebView, are ultimately typeset and drawn into bitmaps through CoreText at the bottom layer. For common text controls (UILabel, UITextView, etc.), both typesetting and drawing happen on the main thread; when a large amount of text is displayed, CPU pressure becomes very high.

This part of performance optimization requires abandoning the upper-layer controls the system provides and using CoreText directly to control typesetting and drawing.

Wherever possible, avoid changing the frame of a view that contains text, because any layout change causes the text to be redrawn. For example, if you need to display a static block of text in the corner of a layer that frequently changes size, put the text in its own sublayer instead, so the resizing no longer forces the text to be rerendered.

Image drawing

Drawing an image usually means using the CG-prefixed methods to draw onto a canvas, then creating an image from the canvas and displaying it. As shown in the module diagram earlier, Core Graphics works on the CPU, so calling CG-prefixed methods consumes CPU resources. We can move the drawing to a background thread and then set the result on the layer’s contents on the main thread. The code is as follows:

- (void)display {
    dispatch_async(backgroundQueue, ^{
        CGContextRef ctx = CGBitmapContextCreate(...);
        // draw in context...
        CGImageRef img = CGBitmapContextCreateImage(ctx);
        CFRelease(ctx);
        dispatch_async(mainQueue, ^{
            layer.contents = (__bridge id)img; // contents retains the image
            CGImageRelease(img);               // balance the Create above
        });
    });
}

Image decoding

Once an image file has been loaded, it must be decoded. Decoding can be a computationally complex, time-consuming task, and the decompressed image uses substantially more memory than the original file.

To save memory, iOS defers the decoding: it happens only when the image is assigned to a layer’s contents property or a UIImageView’s image property. Both of those operations occur on the main thread, which can still cause performance problems.

If you want to decode ahead of time, you can use ImageIO or draw the image into a CGContext in advance. For details of this technique, see iOS Core Animation: Advanced Techniques.
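A minimal sketch of the draw-into-a-context approach (the method name and queue choice are my own; the UIGraphics calls are standard UIKit): drawing the image into a bitmap context on a background queue forces decompression, so the main thread only displays an already-decoded image.

```objc
#import <UIKit/UIKit.h>

// Hypothetical helper: load and force-decode an image off the main thread.
- (void)loadDecodedImageAtPath:(NSString *)path
                    completion:(void (^)(UIImage *decoded))completion {
    dispatch_async(dispatch_get_global_queue(QOS_CLASS_USER_INITIATED, 0), ^{
        UIImage *raw = [UIImage imageWithContentsOfFile:path];
        // Drawing into a bitmap context forces the decode to happen here,
        // on the background queue, instead of on first display.
        UIGraphicsBeginImageContextWithOptions(raw.size, NO, raw.scale);
        [raw drawInRect:CGRectMake(0, 0, raw.size.width, raw.size.height)];
        UIImage *decoded = UIGraphicsGetImageFromCurrentImageContext();
        UIGraphicsEndImageContext();
        dispatch_async(dispatch_get_main_queue(), ^{
            completion(decoded); // safe to assign to imageView.image now
        });
    });
}
```

Setting the opaque flag to YES in UIGraphicsBeginImageContextWithOptions is also worthwhile for images without alpha, since it avoids blending later.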

One more note: the common UIImage loading methods are imageNamed: and imageWithContentsOfFile:. imageNamed: decodes the image immediately after loading, and the system caches the decoded result; but the cache policy is not public, so we can’t know when the image will be released. Therefore, on performance-sensitive pages we can also use static variables to hold images loaded by imageNamed: to keep them from being released, improving performance by trading space for time.

GPU-intensive tasks

Compared with the CPU, the GPU does a single kind of work: it takes the submitted textures and vertex descriptions, applies transforms, blends, renders, and outputs the result to the screen. Broadly speaking, most CALayer attributes are drawn with the GPU.

The following can degrade GPU drawing performance:

Too many textures

All bitmaps, including images, text, and rasterized content, are eventually submitted from memory to video memory and bound as GPU textures. Both the submission to video memory and the GPU’s transforming and rendering of textures consume substantial GPU resources. When a large number of images is displayed in a short time (for example, a fast-scrolling UITableView full of images), CPU usage stays low while GPU usage spikes, and the interface still drops frames. The only way to avoid this is to minimize how many images are displayed in a short period, combining multiple images into one for display where possible.

In addition, when an image exceeds the GPU’s maximum texture size, the CPU must preprocess it first, which adds extra load on both the CPU and the GPU.

Mixing of views

When multiple views (or CALayers) are displayed on top of each other, the GPU blends them together first. If the view hierarchy is too complex, this blending can consume a lot of GPU resources. To reduce GPU consumption here, minimize the number and nesting depth of views, and remove unnecessary transparent views.
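For example, marking a view opaque with a solid background color lets the GPU skip blending for it entirely (a sketch; the white background is an assumption about your design, not a requirement):

```objc
// Opaque views with solid backgrounds are cheapest for the GPU to composite.
view.opaque = YES;                            // promise: no transparency underneath
view.backgroundColor = [UIColor whiteColor];  // a nil background would force blending
view.alpha = 1.0;                             // any alpha < 1 also triggers blending
```

Instruments’ Color Blended Layers option (described below) shows exactly which regions still require blending after changes like this.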

Off-screen rendering

Off-screen rendering is when a layer is rendered in a buffer outside the current screen buffer before being displayed.

Off-screen rendering requires multiple context switches: first from the current screen to the off-screen buffer; then, when off-screen rendering finishes, the results are displayed on screen, which requires switching the context back from off-screen to the current screen. These switches are expensive.

Triggers of off-screen rendering include:

  • Shadows (UIView.layer.shadowOffset / shadowRadius / …)
  • Rounded corners (when UIView.layer.cornerRadius and UIView.layer.masksToBounds are used together)
  • Layer masks
  • Rasterization (shouldRasterize = true)

Setting shadowPath when using shadows avoids off-screen rendering and greatly improves performance, as the later demo will show. Off-screen rendering triggered by rounded corners can be avoided by using Core Graphics to round the image itself.
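A sketch of the Core Graphics approach to rounded corners (the helper name is hypothetical): clip to a rounded path and redraw the image once, so no mask and no off-screen pass is needed at display time.

```objc
#import <UIKit/UIKit.h>

// Hypothetical helper: bake rounded corners into the image itself,
// instead of using cornerRadius + masksToBounds on the layer.
- (UIImage *)roundedImageWithImage:(UIImage *)image radius:(CGFloat)radius {
    CGRect rect = CGRectMake(0, 0, image.size.width, image.size.height);
    // NO = keep alpha, since the clipped corners are transparent.
    UIGraphicsBeginImageContextWithOptions(image.size, NO, image.scale);
    [[UIBezierPath bezierPathWithRoundedRect:rect cornerRadius:radius] addClip];
    [image drawInRect:rect];
    UIImage *rounded = UIGraphicsGetImageFromCurrentImageContext();
    UIGraphicsEndImageContext();
    return rounded;
}
```

This does the rounding once on the CPU (ideally on a background queue, as in the drawing section above) rather than every frame on the GPU.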

CALayer has a shouldRasterize property; setting it to true turns on rasterization. Rasterization draws the layer into an off-screen image, which is then cached and reused in place of the layer’s actual contents and sublayers. For layers with many sublayers or complex effects, this can be more efficient than redrawing everything each frame. But rasterizing takes time and consumes extra memory.

Rasterization can also hurt performance, so whether to enable it depends on the actual scenario; it is not recommended when the layer content changes frequently. It’s best to use Instruments to compare FPS before and after to see whether it actually helps.

Note: remember to set rasterizationScale when setting shouldRasterize = true, otherwise the layer is rasterized at 1x scale and looks blurry on Retina screens.

Using Instruments

Instruments is a suite of tools; here we will only demonstrate the Core Animation instrument. In its lower-right panel you will see the following options:

Color Blended Layers

This option highlights blended regions of the screen from green to red according to blending cost; red indicates worse performance and a bigger impact on metrics such as frame rate. Red areas are usually caused by multiple translucent layers stacked on top of each other.

Color Hits Green and Misses Red

When UIView.layer.shouldRasterize = YES, expensive layer drawing is cached as a bitmap and presented as a simple flat image. When other parts of the page (such as reused UITableViewCells) hit this cache directly, they are shown green; misses are shown red. The more red, the worse the performance. Because rasterizing into the cache is expensive, heavy cache hits and reuse reduce the overall cost, whereas frequent misses mean new caches are generated constantly, making the performance problem worse.

Color Copied Images

Images in color formats the GPU does not support can only be converted by the CPU; Instruments marks such images blue. The more blue, the worse the performance.

Color Immediately

Normally, the Core Animation instrument updates layer debug colors once every 10 milliseconds, which is too slow to catch some effects. This option makes it update every frame (this can affect rendering performance and skew frame rate measurements, so don’t leave it on all the time).

Color Misaligned Images

This option checks whether images are scaled and whether their pixels are aligned. Images that have been scaled are marked yellow, and misaligned pixels are marked purple. The more yellow and purple, the worse the performance.

Color Offscreen-Rendered Yellow

This option will turn the off-screen rendered layers yellow. The more yellow, the worse performance. These yellow layers will probably need to be optimized using shadowPath or shouldRasterize.

Color OpenGL Fast Path Blue

This option marks any layer drawn directly with OpenGL in blue. The more blue, the better the performance. If you only use UIKit or Core Animation APIs, this option has no effect.

Flash Updated Regions

This option flashes redrawn content in yellow. The more unexpected yellow, the worse the performance; ideally only the parts that actually changed are marked yellow.

demo

Color Offscreen-Rendered Yellow and Color Hits Green and Misses Red are the most commonly used options. Below I focus on demonstrating the detection of off-screen rendering and rasterization. I wrote a simple demo that sets a shadow effect; the code is as follows:

    view.layer.shadowOffset = CGSizeMake(1, 1);
    view.layer.shadowOpacity = 1.0;
    view.layer.shadowRadius = 2.0;
    view.layer.shadowColor = [UIColor blackColor].CGColor;
// view.layer.shadowPath = CGPathCreateWithRect(CGRectMake(0, 0, 50, 50), NULL);

Without shadowPath, the FPS measured by Instruments stays mostly below 20 (on an iPhone 6). After setting shadowPath, FPS stays around 55, a significant performance improvement.

Now let’s look at rasterization detection; the code is as follows:

    view.layer.shouldRasterize = YES;
    view.layer.rasterizationScale = [UIScreen mainScreen].scale;

When you select Color Hits Green and Misses Red, the following is displayed:

As you can see, the cache is hit while the view is at rest but almost never during fast scrolling. So whether to turn on rasterization depends on the situation; use Instruments to check performance before and after enabling it.

conclusion

This article summarized some of the theory behind performance tuning and then introduced the Core Animation instrument’s options in Instruments. The most important rule of performance optimization is to measure with tools rather than guess: first check for problems such as off-screen rendering, then analyze time-consuming function calls with the Time Profiler. After each change, use the tools again to verify the improvement, proceeding carefully, step by step.

I suggest you actually profile your own application to reinforce these ideas. Enjoy~

The resources

  • Blog.ibireme.com/2015/11/12/…
  • www.samirchen.com/use-instrum…
  • Apprize. Info/apple/ios_5…

Finally, a small promotion: welcome to follow the public account MrPeakTech. I have learned a lot from it and recommend it to everyone. Let’s improve together~