Stutter optimization

Before we get into stutter (lag) optimization, we need to understand the roles of the CPU and the GPU.

CPU (Central Processing Unit)

Object creation and destruction, object property adjustment, layout calculation, text calculation and layout, image format conversion and decoding, and Core Graphics drawing are all done by the CPU.

GPU (Graphics Processing Unit)

Texture rendering is done by the GPU.

The information to be displayed is generally calculated or decoded by the CPU, which then hands the data to the GPU for rendering. The rendered result is written to the frame buffer; the video controller then reads the frame buffer and the image is finally displayed on the screen.

iOS uses a double-buffering mechanism, with a front frame buffer and a back frame buffer, which makes rendering more efficient.
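As a rough illustration of the double-buffering idea, the sketch below models two buffers that swap roles at each refresh. The `DoubleBuffer` type and its method names are hypothetical and only model the concept; they are not an iOS API.

```swift
// A toy model of double buffering (hypothetical type, not an iOS API).
// The renderer draws into the back buffer while the display reads the
// front buffer; at VSync the two buffers swap roles.
struct DoubleBuffer {
    private(set) var front: [UInt8]   // what the display is showing
    private var back: [UInt8]         // what the renderer is drawing into
    private var backIsComplete = false

    init(pixelCount: Int) {
        front = Array(repeating: 0, count: pixelCount)
        back = Array(repeating: 0, count: pixelCount)
    }

    // Render a full frame into the back buffer; the front buffer is untouched.
    mutating func render(fill value: UInt8) {
        for i in back.indices { back[i] = value }
        backIsComplete = true
    }

    // Called at each VSync. Swaps only if a complete frame is waiting;
    // otherwise the previous frame stays on screen.
    // Returns true if a new frame was presented.
    @discardableResult
    mutating func vsync() -> Bool {
        guard backIsComplete else { return false }
        (front, back) = (back, front)
        backIsComplete = false
        return true
    }
}

var frameBuffers = DoubleBuffer(pixelCount: 4)
frameBuffers.render(fill: 255)
let presentedNewFrame = frameBuffers.vsync()   // a finished frame was waiting
let presentedAgain = frameBuffers.vsync()      // nothing new rendered since the last swap
print(presentedNewFrame, presentedAgain, frameBuffers.front)
```

Because the renderer never writes into the buffer being displayed, the screen does not show a half-drawn frame.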

Principles of screen imaging

A moving image on the screen is actually shown frame by frame, just like video. To synchronize the display with the system’s video controller, the display (or other hardware) generates a series of timing signals using a hardware clock. When the electron gun moves to a new line and is ready to scan, the display emits a horizontal synchronization signal, or HSync; when a frame has been drawn and the electron gun returns to its starting position, the display emits a vertical synchronization signal, or VSync, before it starts drawing the next frame. The display usually refreshes at a fixed rate, which is the frequency at which the VSync signal is generated.

Stutter phenomenon

Causes of stutter

As described above, a frame is displayed as follows: the CPU calculates the data -> the GPU renders -> the screen emits the VSync signal -> the image is shown. If VSync arrives before the GPU has finished rendering, the screen can only show the previous frame again, so the frame currently being calculated misses this refresh, which causes stutter. Once the frame is finished, it has to wait for the next VSync before it can be displayed.
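To make the timing concrete, the following sketch computes the time available per frame and how many refresh intervals a frame occupies. It assumes a 60 Hz display and a simplified model in which a frame's work starts exactly at a VSync; the function names are hypothetical.

```swift
// Time between two VSync signals, in milliseconds.
func frameBudgetMs(refreshRateHz: Double) -> Double {
    return 1000.0 / refreshRateHz
}

// Number of refresh intervals a frame occupies if its CPU + GPU work
// takes `workMs` and starts right at a VSync. 1 means the frame is on
// time; 2 or more means it missed at least one VSync and the previous
// frame was shown again.
func intervalsNeeded(workMs: Double, refreshRateHz: Double) -> Int {
    let budget = frameBudgetMs(refreshRateHz: refreshRateHz)
    return max(1, Int((workMs / budget).rounded(.up)))
}

let budgetAt60Hz = frameBudgetMs(refreshRateHz: 60)                 // about 16.67 ms
let onTimeFrame = intervalsNeeded(workMs: 12, refreshRateHz: 60)    // fits in one interval
let lateFrame = intervalsNeeded(workMs: 20, refreshRateHz: 60)      // spills into a second interval
print(budgetAt60Hz, onTimeFrame, lateFrame)
```

In this model, a frame that needs 20 ms of work is only a few milliseconds over budget, yet it still costs a whole extra refresh interval.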

Solutions

  • The main idea is to reduce the consumption of CPU and GPU resources as much as possible.
  • At a refresh rate of 60 Hz, a VSync signal is generated roughly every 16.7 ms (1000 ms / 60), so the CPU and GPU work for each frame has to fit within that time. The CPU and the GPU can each be optimized as follows:

CPU

  • Try to use lightweight objects such as CALayer for UI elements that don’t handle events;
  • Avoid frequently modifying UIView properties such as frame, bounds, and transform;
  • Try to calculate the layout in advance and adjust the corresponding properties once when needed, rather than modifying them many times;
  • Auto Layout consumes more CPU than setting the frame directly;
  • Make the image size match the UIImageView size;
  • Control the maximum number of concurrent threads;
  • Move time-consuming operations to background threads, such as text size calculation and drawing, and image decoding and drawing.
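The last two points can be sketched with Foundation's `OperationQueue`, which runs work off the calling thread and caps concurrency through `maxConcurrentOperationCount`. The cap of 2, the sleep that stands in for real work, and the `ConcurrencyMeter` helper are all assumptions made for this example.

```swift
import Foundation

// Tracks how many tasks run at the same time (helper for this sketch only).
final class ConcurrencyMeter {
    private let lock = NSLock()
    private var running = 0
    private(set) var peak = 0

    func enter() {
        lock.lock(); defer { lock.unlock() }
        running += 1
        peak = max(peak, running)
    }

    func leave() {
        lock.lock(); defer { lock.unlock() }
        running -= 1
    }
}

let meter = ConcurrencyMeter()

// A background queue that never runs more than two operations at once.
let workQueue = OperationQueue()
workQueue.maxConcurrentOperationCount = 2

for _ in 0..<8 {
    workQueue.addOperation {
        meter.enter()
        Thread.sleep(forTimeInterval: 0.01)   // stand-in for decoding or text layout
        meter.leave()
    }
}

// Blocking here is only for the demo; in an app, do not block the main thread.
workQueue.waitUntilAllOperationsAreFinished()
print("peak concurrency:", meter.peak)
```

In an app, the result of each operation would be handed back to the main thread for display instead of waiting on the queue.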

GPU

  • Try to avoid displaying a large number of images in a short time;
  • The maximum texture size the GPU can process is 4096 x 4096. A texture larger than this has to be handled by the CPU first, so keep textures within this size;
  • Minimize the number of views and the depth of the view hierarchy;
  • Reduce transparent views (alpha < 1), and set opaque to YES for opaque views;
  • Try to avoid off-screen rendering.

Off-screen rendering

In OpenGL, the GPU has two rendering methods:

  • On-screen rendering: rendering takes place in the screen buffer currently used for display;
  • Off-screen rendering: a new buffer is created outside the current screen buffer and rendering takes place there.

Why off-screen rendering is expensive:

  • Throughout off-screen rendering, the context has to be switched several times. First it switches from the current screen (on-screen) to off-screen. After rendering finishes, the contents of the off-screen buffer have to be shown on the screen, so the context switches from off-screen back to the current screen. These switches cost performance.

What actions trigger an off-screen rendering?

  • Rasterization: layer.shouldRasterize = YES;
  • Masks: layer.mask;
  • Rounded corners: setting both layer.masksToBounds = YES and layer.cornerRadius > 0 (as an alternative, the clipped rounded corners can be drawn with Core Graphics);
  • Shadows (if layer.shadowPath is set, the shadow does not trigger off-screen rendering).

Stutter detection

  • The main thread’s RunLoop can be used to detect lag: add an observer to the main thread’s RunLoop and measure how long the RunLoop takes between state changes.
  • Detection method: if the RunLoop stays in a working state, typically kCFRunLoopBeforeSources or kCFRunLoopAfterWaiting, for longer than a predetermined threshold, the main thread is considered to be lagging.
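The threshold check itself can be sketched independently of the RunLoop API. In the model below, `LoopActivity` is a hypothetical stand-in for the CFRunLoopActivity values, timestamps are passed in explicitly, and the 50 ms threshold is an assumed value. In a real app the `record` calls would come from a RunLoop observer callback and the check from a watchdog on another thread, with proper synchronization between the two.

```swift
// Hypothetical mirror of the CFRunLoopActivity values used in this sketch.
enum LoopActivity {
    case beforeTimers, beforeSources, beforeWaiting, afterWaiting
}

// Records the most recent RunLoop state change and answers whether the
// loop has been stuck in a working state for too long.
// Not thread-safe as written; a real detector needs a lock or atomics.
struct StallDetector {
    let thresholdMs: Double
    private var lastActivity: LoopActivity = .beforeWaiting
    private var lastChangeMs: Double = 0

    init(thresholdMs: Double) {
        self.thresholdMs = thresholdMs
    }

    // Call on every state change reported by the observer.
    mutating func record(_ activity: LoopActivity, atMs now: Double) {
        lastActivity = activity
        lastChangeMs = now
    }

    // Call periodically from the watchdog.
    func isStalled(atMs now: Double) -> Bool {
        switch lastActivity {
        case .beforeSources, .afterWaiting:
            // The loop is busy handling work; too long here means lag.
            return (now - lastChangeMs) > thresholdMs
        case .beforeTimers, .beforeWaiting:
            // Sleeping in beforeWaiting is idle time, not a stall.
            return false
        }
    }
}

var detector = StallDetector(thresholdMs: 50)
detector.record(.beforeSources, atMs: 0)
let quickCheck = detector.isStalled(atMs: 10)    // still within the threshold
let stuckCheck = detector.isStalled(atMs: 80)    // no state change for 80 ms
detector.record(.beforeWaiting, atMs: 90)
let idleCheck = detector.isStalled(atMs: 500)    // asleep, so not counted
print(quickCheck, stuckCheck, idleCheck)
```

When a stall is detected, a typical next step is to capture the main thread's call stack so the slow code can be located.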

Power consumption optimization

The main sources of power consumption are:

  • CPU processing;
  • Network requests;
  • Location services;
  • Image rendering.

Optimization ideas

  • Reduce CPU and GPU power consumption as much as possible;
  • Use timers sparingly;
  • Optimize I/O operations:
  • Avoid writing small pieces of data frequently; it is better to write them in one batch;
  • When reading or writing large amounts of important data, consider dispatch_io, which provides a GCD-based API for asynchronous file operations and optimizes disk access;
  • When the amount of data is large, use a database to manage it.
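The batching advice can be sketched as a small write buffer. `BatchedWriter`, its 64-byte threshold, and the `sink` closure (which stands in for the actual disk write) are hypothetical names and values chosen for this example.

```swift
import Foundation

// Collects small writes in memory and hands them to `sink` in one batch.
// `sink` stands in for the real disk write (for example, appending to a file).
final class BatchedWriter {
    private var pending = Data()
    private let flushThreshold: Int
    private let sink: (Data) -> Void

    init(flushThreshold: Int, sink: @escaping (Data) -> Void) {
        self.flushThreshold = flushThreshold
        self.sink = sink
    }

    // Buffer a small record; flush only once enough bytes have piled up.
    func write(_ record: Data) {
        pending.append(record)
        if pending.count >= flushThreshold { flush() }
    }

    // Write out whatever is buffered, e.g. before the app goes to the background.
    func flush() {
        guard !pending.isEmpty else { return }
        sink(pending)
        pending.removeAll(keepingCapacity: true)
    }
}

var diskWrites: [Int] = []   // size of each batch that reached the "disk"
let writer = BatchedWriter(flushThreshold: 64) { batch in
    diskWrites.append(batch.count)
}

for _ in 0..<10 {
    writer.write(Data(repeating: 0x2A, count: 16))   // ten 16-byte records
}
writer.flush()
print(diskWrites)   // [64, 64, 32]
```

Ten small records reach the disk as three writes instead of ten. The trade-off is that buffered data is lost if the process dies before a flush, so the threshold should reflect how much loss is acceptable.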

Network optimization

  • Reduce and compress network data (JSON is generally more compact than XML);
  • If multiple network requests return the same result, try to use a cache;
  • Use resumable downloads (resuming from the point of interruption); otherwise the same content may be transmitted many times when the network is unstable;
  • When the network is unavailable, do not attempt network requests;
  • Let users cancel long-running or slow network operations, and set appropriate timeouts;
  • Transfer in batches. For a video download, for example, do not transmit many small packets; download the whole file or large chunks at once, and then display the content gradually.
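For resumable downloads, the client typically asks the server only for the bytes it does not have yet, using the standard HTTP `Range` request header. The helper below is a hypothetical sketch that builds the header value; it assumes the server supports range requests and that the caller tracks how many bytes are already on disk.

```swift
// Builds the value of an HTTP Range header that asks the server for the
// bytes that have not been downloaded yet. Returns nil when nothing is
// left to fetch. `totalBytes` may be nil if the full size is unknown.
func resumeRangeHeader(downloadedBytes: Int, totalBytes: Int?) -> String? {
    if let total = totalBytes, downloadedBytes >= total { return nil }
    return "bytes=\(downloadedBytes)-"
}

let freshDownload = resumeRangeHeader(downloadedBytes: 0, totalBytes: nil)
let resumedDownload = resumeRangeHeader(downloadedBytes: 1_048_576, totalBytes: 5_000_000)
let finishedDownload = resumeRangeHeader(downloadedBytes: 5_000_000, totalBytes: 5_000_000)
print(freshDownload ?? "none", resumedDownload ?? "none", finishedDownload ?? "none")
```

The returned string would be set as the `Range` header of the request. A robust client would also check that the file has not changed on the server since the first part was downloaded.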

Location optimization

  • If you only need to determine the user’s location quickly, use the requestLocation method of CLLocationManager. Once the location has been obtained, the location hardware powers off automatically;
  • Unless the app is a navigation app, try not to update the location in real time, and turn off location services when you are done;
  • Try to reduce location accuracy; for example, avoid the highest accuracy setting, kCLLocationAccuracyBest;
  • If background location is needed, try to set pausesLocationUpdatesAutomatically to YES, so that the system automatically pauses location updates when the user is unlikely to move.

Startup optimization

There are two types of app launch: cold launch and warm launch. A cold launch starts the app from scratch. A warm launch happens when the app is already in memory and still alive in the background, and the user taps the icon again to open it.

Startup optimization mainly targets the cold launch. Adding an environment variable makes the app print an analysis of its startup time: Edit Scheme -> Run -> Arguments -> Environment Variables, then add DYLD_PRINT_STATISTICS with the value 1.

Running the program then prints the startup time statistics.

To print more detailed information, add the environment variable DYLD_PRINT_STATISTICS_DETAILS with the value 1.

App cold start

A cold start can be divided into three stages: the dyld stage, the Runtime stage, and the main stage.

The first stage processes the program’s images, the second loads the program’s classes, categories, and related information, and the last is the call to the main function.

dyld

dyld (the dynamic link editor) is Apple’s dynamic linker, which is used to load Mach-O files (executables, dynamic libraries, etc.).

When the app starts, dyld loads the app’s executable and recursively loads all the dynamic libraries it depends on. Once the executable and the dynamic libraries are loaded, dyld notifies the Runtime to proceed with the next step.

Runtime

When the app starts, the Runtime calls map_images to parse and process the contents of the executable, and load_images calls call_load_methods to invoke the +load methods of all classes and categories. The Objective-C structures are then initialized (registering classes, initializing class objects, and so on). After that, C++ static initializers and functions marked with __attribute__((constructor)) are called. At this point, all symbols (classes, protocols, methods, etc.) from the executable and the dynamic libraries have been loaded into memory in their proper format and are managed by the Runtime.

main

After the Runtime stage completes, dyld calls the main function, which is followed by the UIApplicationMain function and then the AppDelegate’s application:didFinishLaunchingWithOptions: method.

Startup optimization ideas

For different stages, there are different optimization ideas:

dyld

  • Reduce the number of dynamic libraries: merge them and regularly remove the ones that are no longer needed;
  • Reduce the number of classes, categories, and selectors, and periodically remove unused classes and categories;
  • Reduce the number of C++ virtual functions. A virtual function is similar to an abstract method in Java; the difference is that a subclass may or may not override a virtual function defined by its base class, whereas a subclass must implement an abstract method;
  • In Swift, use structs where possible.

Runtime

Replace __attribute__((constructor)) functions, C++ static constructors, and Objective-C +load methods with +initialize and dispatch_once, so that the work is deferred until it is first needed instead of running during launch.
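In Swift, a comparable way to defer work until first use is a static stored property, which the language initializes lazily and only once, even when several threads access it at the same time. The sketch below uses hypothetical type names and a counter that exists only to show when the setup runs.

```swift
// Counts how many times the expensive setup actually runs (demo only).
enum LaunchStats {
    static var setupRuns = 0
}

struct ServiceRegistry {
    let table: [String: Int]

    // A static stored property is initialized lazily, on first access,
    // and only once, so none of this work happens during launch.
    static let shared: ServiceRegistry = {
        LaunchStats.setupRuns += 1
        return ServiceRegistry(table: ["a": 1, "b": 2])   // stand-in for costly setup
    }()
}

let runsBeforeUse = LaunchStats.setupRuns           // setup has not run yet
let firstLookup = ServiceRegistry.shared.table["a"]   // first access triggers setup
let secondLookup = ServiceRegistry.shared.table["b"]  // reuses the same instance
let runsAfterUse = LaunchStats.setupRuns
print(runsBeforeUse, runsAfterUse)
```

The setup cost is still paid, but at the moment the feature is first used rather than before the first screen appears.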

main

  • Defer time-consuming work where possible instead of putting it all in the didFinishLaunching method.

Installation package slimming

The installation package (IPA) consists mainly of the executable and resource files. If it is not managed properly, the package keeps growing. For resources, we can compress them losslessly and remove the ones that are no longer used.

For slimming executable files, we can:

Optimize at the compiler level

  • Set Strip Linked Product, Make Strings Read-Only, and Symbols Hidden by Default to YES;
  • Disable exception support: set Enable C++ Exceptions and Enable Objective-C Exceptions to NO, and add -fno-exceptions to Other C Flags;
  • Use AppCode to detect unused code: menu bar -> Code -> Inspect Code;
  • Write an LLVM plug-in to detect duplicate code and code that is never called;
  • Analyze the generated LinkMap file.

LinkMap

In Build Settings, set LD_MAP_FILE_PATH to the output file path and set LD_GENERATE_MAP_FILE to YES.

Run the program to generate the LinkMap file.

Opening the file shows various information, which we can use to target a specific class for optimization.