1. Background

The Mobile team of Manbang Group began trying React Native at the beginning of 2018. After nearly three years of development, it now carries most of our core business scenarios, involving 16+ business modules, 200+ pages, and tens of millions of page views (PV) per day. Once the core business was also built with React Native, we were no longer bound by the APP release cycle and moved uniformly to dynamic releases. Dynamic versions ship far more often than APP versions: at least two a week, and sometimes as many as five.

We adopted React Native in 2018 with version 0.51, which was relatively new at the time. In subsequent releases, Facebook introduced a number of new features, such as Hooks and the Hermes engine. Because we stayed on 0.51, these features were unavailable to us, as were many community third-party libraries built on newer React Native versions. Therefore, after three years on version 0.51, we decided to upgrade to the newer version 0.62.

2. Improvements in version 0.62

We had been using version 0.51. After nearly two years of iteration, React Native reached version 0.62, which offers significant performance improvements over 0.51.

2.1 Performance Improvement

The biggest improvement of 0.62 over 0.51 is the use of Hermes as the JS execution engine on Android, which significantly improves startup speed, memory footprint, and JS execution efficiency.

2.2 Stability improvement

Between version 0.51 and version 0.62, a number of functional and stability bugs were fixed. For example, the Native part of the SDK is much more robust: in ReactHostView on Android, show() and hide() are now safer to call. Another example is the ViewManager section, where illegal states are now caught and handled as exceptions directly.

2.3 Community Ecology

The React Native ecosystem is mainly divided into two parts:

1. React language features

React Native 0.51 uses React 16.0, while 0.6x uses 16.11+. In between came some exciting new features, such as the new Context API (16.3.0) and Hooks (16.8.0).

2. React Native and React third-party libraries

Community third-party libraries tend to raise the React version they depend on roughly every six months. The well-known navigation library react-navigation is one example: its newer versions bring many useful features, such as the React Native internal routing stack starting to support events when a page becomes active or goes to the background. This is very useful in our daily development.

2.4 Performance on Android

Starting from 0.6x, the Hermes engine is available on Android, bringing a significant performance improvement. The biggest advantage of Hermes over JSC is that it can directly run precompiled bytecode (HBC) generated from the JS code, which significantly improves cold start performance and lowers the memory footprint, at the cost of a larger package size.

To quantify the improvement, we ran a performance comparison of JSC and Hermes on Android. The test device was a VIVO X21 with 6 GB of RAM.

2.4.1 Cold Start Time Data

As can be seen from the figure above, the cold start time of Hermes+HBC is more than 50% lower than that of JSC+JS, so we decided to use the Hermes+HBC solution.

2.4.2 Package size data

As can be seen from the figure above, the HBC binary package compresses far less well than the Jsbundle, and its size is almost twice that of the Jsbundle. However, this can be mitigated later by splitting the bundle and by converting to HBC on the device.

2.4.3 Code instruction processing speed

Under heavy computation and parsing, JSC performance deteriorates severely, while Hermes stays relatively stable. Hermes takes about one sixth of the time that JSC does, and this processing speed greatly improves frame rate and animation smoothness.

2.4.4 Memory Usage

Two conclusions can be drawn from the data measured in the figure above:

1. Memory performance of React Native 0.62 is significantly better than that of React Native 0.51, thanks to the loading mechanism of Hermes, which does not load the entire file into memory for parsing at one time.

2. React Native 0.62 shows much smoother memory fluctuation, because Hermes executes precompiled binary bytecode rather than JS source, so no secondary transcoding is needed.

Over the whole test run, which involved 4 ReactInstanceManager instances and 5 pages, 0.62 saved 56 MB of memory, which is a considerable benefit.

2.5 iOS Performance

After upgrading from 0.51 to 0.62, the only JS engine on iOS is still JSC. In addition to the plain Jsbundle, however, the RAM bundle format is supported, and combining RAM bundles with inline requires can greatly improve cold start speed and memory usage. Because we plan to split out a base bundle later, we do not use the RAM format and still use JSC+Jsbundle on iOS. As a result, there is not much improvement in memory, cold start, or instruction execution speed on iOS. With the recent release of React Native 0.64, though, Hermes is officially available on iOS.

In terms of performance data, the Android side improves greatly, and the latest React features, such as Hooks, also become available, so we decided to upgrade to version 0.62.

3. Performing a seamless upgrade

3.1 Challenges and risks

3.1.1 Multi-department collaboration

As mentioned earlier, React Native hosts most of Manbang's core business scenarios, involving 16+ business modules, 200+ pages, and 50+ developers. Manbang Group's business is growing rapidly, and all kinds of business operation activities run every day. The business is large, the staff is large, the iteration pace is fast, and the stability requirements are high, so the work of multiple testing, development, and release teams has to be coordinated.

3.1.2 SDK upgrade and high frequency release are parallel

To accommodate fast-paced business iteration, we ship dynamic releases at least twice a week (and up to five times a week). We require that technical changes must not affect business iteration (including APP version iteration and dynamic version iteration), and that no business requirement is delayed because of a technical change. Therefore, the ongoing 0.51 release work and the 0.62 upgrade work have to proceed in parallel without interfering with each other.

3.1.3 Reducing upgrade Costs

With such a fast and frequent release pace, the SDK upgrade must not put much extra burden on the development and testing of business requirements, and its impact on both should be kept as small as possible. As a major version upgrade spanning three years, it involves a large number of release notes. We need to absorb as many of these differences as possible in the underlying layers, so that developers have less code to modify and testers have less regression testing to do, reducing costs on all sides.

3.1.4 Ensuring online stability

The two core apps of Manbang Group have about 5 million daily unique visitors (UV) on average, and the requirements for APP experience are very strict. If the abnormal rate increases by 1/10,000, the customer complaint rate rises. Guaranteeing stability is therefore the top priority of the upgrade plan. But no matter how good our plan is, there is no guarantee that nothing unexpected will happen, so we need to detect online anomalies as early as possible, limit their impact, and repair them in time.

Upgrading the React Native SDK is like changing a tire on a heavy truck travelling at 120 km/h.

3.2 Upgrade Scheme Principles

3.2.1 Low risk

There are two main points:

1. Low business risk: it does not affect the iteration of business requirements.

2. Low stability risk: it does not affect online stability, and the abnormal rate is kept at a very low level.

3.2.1.1 Release scheme design

To meet the above two conditions, we decided to go online in batches, with a grayscale release for each batch.

Batching means dividing online users into groups and bringing one group online before the others. Manbang has four apps: the Yunmanman driver app, the Huochebang driver app, the Yunmanman shipper (cargo owner) app, and the Huochebang shipper app. After analyzing the business characteristics, we put the two driver apps in the first batch and the two shipper apps in the second batch.

Grayscale release is now common practice in the industry, so we do not explain it here; the details of our grayscale scheme are described below.

3.2.1.2 Alarm and Rollback Scheme Design

To keep the risk truly low, we also need to nip online problems in the bud, which requires an alarm mechanism. React Native across the whole app already had one before the upgrade, so we only needed to split 0.62 out as a separate statistical dimension with its own calculation. In the early grayscale stage the 0.62 traffic is small; if it shared the original alarm dimension, its problems would be diluted and would rarely trigger the alarm condition.
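To see why a separate statistical dimension matters, here is a minimal sketch. The `Sample` shape, the traffic figures, and the 0.1% threshold are illustrative assumptions rather than real monitoring values.

```typescript
// Hypothetical shape of one aggregated monitoring sample.
interface Sample {
  sdkVersion: "0.51" | "0.62";
  sessions: number; // sessions observed in the window
  abnormal: number; // sessions that hit a JS or SDK exception
}

// Abnormal rate over a set of samples (0 when there is no traffic).
function abnormalRate(samples: Sample[]): number {
  let sessions = 0;
  let abnormal = 0;
  samples.forEach((s) => {
    sessions += s.sessions;
    abnormal += s.abnormal;
  });
  return sessions === 0 ? 0 : abnormal / sessions;
}

// Evaluate the alarm condition separately for each SDK version.
function alarmsByVersion(samples: Sample[], threshold: number): { [version: string]: boolean } {
  const result: { [version: string]: boolean } = {};
  ["0.51", "0.62"].forEach((v) => {
    result[v] = abnormalRate(samples.filter((s) => s.sdkVersion === v)) > threshold;
  });
  return result;
}

// Early grayscale: 0.62 carries only 1% of the traffic.
const samples: Sample[] = [
  { sdkVersion: "0.51", sessions: 990000, abnormal: 50 },
  { sdkVersion: "0.62", sessions: 10000, abnormal: 30 },
];
const threshold = 0.001; // illustrative alarm threshold
const combinedAlarm = abnormalRate(samples) > threshold;
const perVersion = alarmsByVersion(samples, threshold);
```

With these made-up numbers the combined rate stays below the threshold, while the 0.62 dimension on its own exceeds it, which is the dilution effect described above.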

For online problems that cannot be solved quickly, we also need a downgrade plan that can switch online users from 0.62 back to 0.51 within a short time, and then switch them to 0.62 again once the problem is fixed.

3.2.2 Low cost

Low cost here means keeping the impact on business development and testing as small as possible: reduce the amount and difficulty of code changes to save development effort, and narrow the scope of impact so that the regression range and regression intensity shrink, saving testing effort.

3.2.2.1 A set of code

To reduce risk, we go online in multiple batches with grayscale release, so the whole rollout lasts a long time, and during it every business module keeps iterating on new requirements. In other words, both the existing business code and the code for new requirements must be compatible with both versions of the SDK. The simplest solution is to maintain two sets of code, one for each version of the SDK, but that means writing everything twice, which is a heavy burden for business development. To avoid this burden, we came up with a solution in which one set of code accommodates both versions of the SDK.

3.2.2.2 Switching the development environment

If one set of code fits both SDKs, the code naturally lives on a single branch. When developing business requirements, developers need to run the code in both SDK environments. We provide an environment switching script that switches to a different React Native environment with one command. For example, suppose the driver side is already in grayscale online while the shipper side has not started yet. For code that has to run on both the driver side and the shipper side, developers can use the script to switch between environments during development, as shown in the following figure.

3.2.2.3 Code modification scan

To further reduce the cost of adaptation for developers, we have developed a special scripting tool that scans out all the changes that need to be made and shows how to make them.
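A rule-based scan of this kind can be sketched as follows. This is only an illustration in TypeScript, to match the business code in this article, not the actual tool; the two rules, their patterns, and the suggestion texts are assumptions made for the example.

```typescript
// One rule: a pattern that needs adapting, plus the suggested fix.
interface Rule {
  id: string;
  pattern: RegExp;
  suggestion: string;
}

interface Finding {
  ruleId: string;
  line: number; // 1-based line number
  text: string;
  suggestion: string;
}

// Illustrative rules only; a real rule set would be derived from the release notes.
const rules: Rule[] = [
  {
    id: "stack-navigator",
    pattern: /\bStackNavigator\s*\(/,
    suggestion: "Use createStackNavigatorCompat from the protocol layer",
  },
  {
    id: "async-storage",
    pattern: /\bAsyncStorage\b/,
    suggestion: "Use MBBridge.app.storage instead",
  },
];

// Check every line of the source against every rule.
function scanSource(source: string, ruleSet: Rule[]): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, index) => {
    ruleSet.forEach((rule) => {
      if (rule.pattern.test(text)) {
        findings.push({
          ruleId: rule.id,
          line: index + 1,
          text: text.trim(),
          suggestion: rule.suggestion,
        });
      }
    });
  });
  return findings;
}

const sampleSource = [
  "const RootStack = createStackNavigatorCompat({})",
  "const OldStack = StackNavigator({})",
  "AsyncStorage.getItem('k')",
].join("\n");
const findings = scanSource(sampleSource, rules);
```

Each finding carries the line number and a suggested change, so the output tells the developer both where to edit and how.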

With the above scheme, we keep the upgrade risk under control (stability is controlled through multi-batch grayscale release) and minimize the adaptation cost for developers (one set of code adapts to both SDK versions, and a script scans for the required changes).

4. Preliminary preparation

4.1 Reviewing API Changes

Before upgrading, we need to go through the API differences between the two SDK versions and thoroughly understand every change from 0.51 to 0.62. API changes fall into two types:

  • breaking change
  • product change

Our method is to read the release notes of every version from 0.51 to 0.62, sort out all breaking changes, and design a specific adaptation scheme for each one. For example, AsyncStorage 0.51 uses XXX and 0.62 uses YYY, so the code for 0.51 and 0.62 is mutually incompatible. Our adaptation solution is to use our own encapsulated Bridge (MBBridge.app.storage).

// npm install --save @react-native-community/async-storage
// Not recommended:
// import AsyncStorage from '@react-native-async-storage/async-storage';

// Recommended: use the Bridge form instead
// Get the VALUE for a KEY
MBBridge.app.storage.getItem({ key: BootPageModalKey.KEY_IS_SHOW_BOOTPAGEMODAL }).then(res => {
  if (this.isGuidanceSwitch(res?.data?.text)) {
    return null
  }
})
// Store a <KEY, VALUE> pair
MBBridge.app.storage.setItem({ key: Constant.StorageKey.Common.Refername, text: commonStore.refername })

4.2 Code adaptation scheme

With several iterations a week on the React Native stack, keeping two sets of code (0.51 and 0.62) in sync at such a fast development pace would be too expensive. So we decided to have one set of code that is compatible with both 0.51 and 0.62: for every incompatible API, we encapsulate an adaptation layer that hides the underlying differences, as shown below. For example, the navigation library is adapted as follows:

Before modification:

import React, { Component } from "react"
import { StackNavigator } from "react-navigation"

// Business code calls the navigation library directly
const RootStack = StackNavigator(...)
export default class xxxx extends Component<any, any> {
  render() {
    return (
      <RootStack screenProps={this.props} />
    )
  }
}

After modification (rn-lib-protocal is our protocol layer):

import React, { Component } from "react"
import { createStackNavigatorCompat, createAppContainerCompat } from "@ymm/rn-lib-protocal"

const RootStack = createStackNavigatorCompat(...)
// Create the app container once, outside render, so it is not rebuilt on every render
const App = createAppContainerCompat(RootStack)
export default class StickerPageRouter extends Component<any, any> {
  render() {
    return (
      <App screenProps={this.props} />
    )
  }
}

Then, the protocol implementation layer code is as follows.

import { NavigationActions } from 'react-navigation';

// Protocol implementation layer: wraps the navigation actions behind one stable interface
export default class StackActionsCompat {
  static reset(resetAction: any) {
    return NavigationActions.reset(resetAction)
  }
  static push(pushAction: any) {
    return NavigationActions.push(pushAction)
  }
  static pop(popAction: any) {
    return NavigationActions.pop(popAction)
  }
  static popToTop() {
    return NavigationActions.popToTop()
  }
}

In this way, business developers can run one set of code on two React Native versions, saving the cost of maintaining two sets of code.
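The idea of one protocol with two interchangeable implementations can be sketched without any navigation library. Everything below is a stand-in: the `NavAction` shape, the action type strings, and `selectImplementation` are hypothetical. In the scheme described in this article the implementation is swapped at build time by the environment switching tool, so the runtime selector here only stands in for that swap.

```typescript
// Plain stand-in for a navigation action; not the real library type.
interface NavAction {
  type: string;
  payload?: unknown;
}

// The protocol that business code programs against.
interface StackActionsProtocol {
  push(routeName: string): NavAction;
  popToTop(): NavAction;
}

// Hypothetical implementation bound to the 0.51 environment.
const impl051: StackActionsProtocol = {
  push: (routeName) => ({ type: "v051/PUSH", payload: { routeName } }),
  popToTop: () => ({ type: "v051/POP_TO_TOP" }),
};

// Hypothetical implementation bound to the 0.62 environment.
const impl062: StackActionsProtocol = {
  push: (routeName) => ({ type: "v062/PUSH", payload: { routeName } }),
  popToTop: () => ({ type: "v062/POP_TO_TOP" }),
};

// Stand-in for the build-time swap of the implementation layer.
function selectImplementation(sdk: "0.51" | "0.62"): StackActionsProtocol {
  return sdk === "0.62" ? impl062 : impl051;
}

// Business code is identical in both environments.
function goToDetail(actions: StackActionsProtocol): NavAction {
  return actions.push("Detail");
}

const actionOn051 = goToDetail(selectImplementation("0.51"));
const actionOn062 = goToDetail(selectImplementation("0.62"));
```

The business function `goToDetail` never mentions an SDK version; only the implementation behind the protocol changes.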

4.3 Script Tool

There are three tools:

1. API check tool (supports local runs and CI/CD);

2. Project environment switching tool;

3. Runtime environment check tool.

4.3.1 API Check Tool

The API check tool finds APIs that work in 0.51 but are no longer compatible with 0.62. To do this, we abstracted the various changes between the two versions into a set of check rules. The tool is written as a Python script. Developers can run the check locally (by running the Python script directly or through an npm command) or enable it during Jenkins packaging. The check output looks as follows:

4.3.2 Environment Switching Tool

The project environment switching tool lets developers conveniently switch the protocol implementation layer and the configuration files (package.json, metro.config.js, etc.) between 0.51 and 0.62. It can be implemented in Shell or Python.

This tool ensures that business developers can work on a single branch without having to think about the API and configuration differences between 0.51 and 0.62.
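As a rough sketch of what such a switch involves, the function below only computes the list of file copies for a target environment. The `env/<version>/` template directory and the protocol entry file name are hypothetical; only package.json and metro.config.js are named in the text above.

```typescript
type SdkVersion = "0.51" | "0.62";

interface CopyStep {
  from: string;
  to: string;
}

// Files that differ between the two environments.
// The protocol entry path is a made-up example.
const switchedFiles = ["package.json", "metro.config.js", "src/protocol/index.ts"];

// Build the copy plan; per-version templates are assumed to live in env/<version>/.
function planSwitch(target: SdkVersion): CopyStep[] {
  return switchedFiles.map((file) => ({
    from: "env/" + target + "/" + file,
    to: file,
  }));
}

const plan = planSwitch("0.62");
```

A real script would then perform each copy and reinstall dependencies; keeping the plan as data makes it easy to print as a dry run first.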

4.3.3 Environment Check Tool

The runtime environment check tool detects mismatches between the native SDK and the artifact it loads, for example a 0.51 native SDK loading a 0.62 Bundle/HBC, or a 0.62 native SDK loading a 0.51 Bundle. Catching these mismatches early avoids unnecessary hassle and communication costs.
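A minimal sketch of such a check follows. The `ArtifactInfo` metadata shape and the error messages are assumptions for illustration; how the real tool obtains the artifact's target version is not covered here.

```typescript
type SdkVersion = "0.51" | "0.62";

// Hypothetical metadata shipped alongside a built artifact.
interface ArtifactInfo {
  sdkVersion: SdkVersion; // SDK version the artifact was built for
  format: "jsbundle" | "hbc";
}

// Returns a human-readable error, or null when the pair is compatible.
function checkEnvironment(nativeSdk: SdkVersion, artifact: ArtifactInfo): string | null {
  if (artifact.sdkVersion !== nativeSdk) {
    return "Native SDK " + nativeSdk + " cannot load an artifact built for " + artifact.sdkVersion;
  }
  // React Native 0.51 predates Hermes, so it cannot execute Hermes bytecode.
  if (nativeSdk === "0.51" && artifact.format === "hbc") {
    return "Native SDK 0.51 has no Hermes engine and cannot run HBC";
  }
  return null;
}

const mismatch = checkEnvironment("0.51", { sdkVersion: "0.62", format: "hbc" });
const matched = checkEnvironment("0.62", { sdkVersion: "0.62", format: "hbc" });
```

Running a check like this at load time in debug builds turns a confusing crash into an explicit message about which side is wrong.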

5. Rollout plan

Below is a schematic of our upgrade plan. The process is divided into four main lines based on roles: developer, tester, APP version, and dynamic version. The timeline for each main line has detailed actions at key points in time.

For example, business developers (line 1) need to merge the adapted business code into the dynamic-1231 main release branch by 2020-12-18. From then on, they work on the code common to 0.51 and 0.62 until the end of the upgrade process.

5.1 Upgrade in batches

As mentioned above, we upgrade in batches: the driver apps go online in the first batch and the shipper apps in the second batch.

On Android, this upgrade adds a new React Native environment. To minimize risk, the first batch on the Android driver apps was launched as a dynamically released plug-in: the 0.62 SDK and the HBC artifacts were delivered to devices through dynamic upgrade. Dynamic publishing allows very flexible grayscale pacing: to ensure stability, we can stretch the grayscale period as long as needed. Our dynamic upgrade platform also supports real-time online rollback.
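One common way to implement flexible grayscale pacing is to map each user to a stable bucket and raise the rollout percentage over time. The sketch below assumes that approach; the hash function and the user id format are illustrative and are not a description of our dynamic upgrade platform.

```typescript
// Map a user id to a stable bucket in the range [0, 100).
// The hash is a simple illustrative one, chosen to stay within safe integers.
function bucketOf(userId: string): number {
  let h = 7;
  for (let i = 0; i < userId.length; i++) {
    h = (h * 31 + userId.charCodeAt(i)) % 1000003;
  }
  return h % 100;
}

// A user is in the gray group when the bucket falls below the rollout percentage.
function inGray(userId: string, percent: number): boolean {
  return bucketOf(userId) < percent;
}

// Raising the percentage only ever adds users; nobody already upgraded drops out.
const userId = "driver-10086"; // made-up id
const stages = [1, 5, 20, 50, 100].map((p) => inGray(userId, p));
```

Because the bucket is derived from the user id alone, the same user gets the same decision on every launch, and rolling back is just lowering the percentage.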

For stability, we decided to launch on Android through dynamic upgrade. But ensuring stability must not hold up the launch of business requirements. While the 0.62 SDK and HBC artifacts are in grayscale online, business requirements continue to be released in step on the 0.51 SDK and Jsbundle artifacts. In other words, the 0.51 and 0.62 environments have to coexist online for a long time.

One important aspect of the grayscale process is the synchronization of the online environment: 0.51 and 0.62 products are released online at their own pace, without interfering with each other, but at the same time must contain all business requirements.

For example, 0.51 ships a version every two days, while the 0.62 grayscale cycle is 10 days. Users must get the latest functions regardless of whether they are on 0.62 or 0.51. Our strategy is as follows:

As you can see, the 0.51 and 0.62 releases run on parallel lines. The 0.62 version number is designed to always be larger than the 0.51 version number (ensuring that a 0.62 artifact is never overwritten by a 0.51 artifact), and a 0.62 package is released alongside every 0.51 release. This guarantees the following two points:

1. The functions used by online users are always up-to-date;

2. The 0.62 artifact always follows its own grayscale rhythm and is never overwritten by a 0.51 artifact.

Once the 0.62 grayscale is complete, 0.51 business packages are no longer released online, and the online upgrade switch is finished.

5.2 CI/CD

Because the 0.51 and 0.62 business bundles have to coexist online for a long time, and the artifacts of the two environments are incompatible with each other, environment checks in the test phase are not enough. We also insert a series of verification steps into the CI/CD phase:

1. Environment switch;

2. Integrate python scripts that check for API compatibility into the build process;

3. Generate version number rules based on product type:

  • Version number rule for 0.51: 5.91.XXX.YY
  • Version number rule for 0.62: 5.91.1XXX.YYYY

4. Additionally generate source maps for the Android HBC artifacts and upload them to FTP.
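The version number rules in step 3 can be sketched as follows. This is one reading of the rules, assuming XXX and YY/YYYY are zero-padded counters and that versions are compared segment by segment as integers; the counter values are made up.

```typescript
type SdkVersion = "0.51" | "0.62";

// Left-pad a non-negative integer with zeros to a fixed width.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) {
    s = "0" + s;
  }
  return s;
}

// base is the app version prefix, e.g. "5.91"; build and seq are counters.
// 0.51 follows base.XXX.YY, 0.62 follows base.1XXX.YYYY.
function makeVersion(base: string, sdk: SdkVersion, build: number, seq: number): string {
  return sdk === "0.62"
    ? base + ".1" + pad(build, 3) + "." + pad(seq, 4)
    : base + "." + pad(build, 3) + "." + pad(seq, 2);
}

// Compare dotted versions segment by segment as integers.
function compareVersions(a: string, b: string): number {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const d = (pa[i] || 0) - (pb[i] || 0);
    if (d !== 0) {
      return d > 0 ? 1 : -1;
    }
  }
  return 0;
}

const v051 = makeVersion("5.91", "0.51", 12, 3);
const v062 = makeVersion("5.91", "0.62", 12, 3);
const newer062 = compareVersions(v062, v051) > 0;
```

Under this reading, the leading 1 in the third segment keeps every 0.62 version above every 0.51 version with the same prefix, which is what stops a 0.51 artifact from overwriting a 0.62 one.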

5.3 Data preparation

This mainly means adding tracking points (event instrumentation) that distinguish 0.62 data from 0.51 data. We want the online 0.62 data to accurately reflect how the upgrade is really going (access ratio, stability), and we configured a separate alarm policy for the 0.62 data.
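A minimal sketch of tagging tracking events with the SDK dimension is shown below. The `TrackEvent` shape and the parameter names `rn_version` and `hermes` are hypothetical. The Hermes check relies on the `HermesInternal` global that the Hermes runtime exposes.

```typescript
// Hypothetical tracking event shape.
interface TrackEvent {
  name: string;
  params: { [key: string]: string | number | boolean };
}

// Hermes exposes a HermesInternal global; its absence means another engine.
function usesHermes(): boolean {
  return (globalThis as any).HermesInternal != null;
}

// Attach the SDK dimension to an event so 0.62 data can be reported separately.
function tagEvent(event: TrackEvent, rnVersion: string): TrackEvent {
  const params: TrackEvent["params"] = {};
  Object.keys(event.params).forEach((k) => {
    params[k] = event.params[k];
  });
  params.rn_version = rnVersion;
  params.hermes = usesHermes();
  return { name: event.name, params };
}

const tagged = tagEvent({ name: "page_view", params: { page: "home" } }, "0.62");
```

Tagging at the single place where events are sent means business code does not need to know which SDK it is running on.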

6. Online verification

After the launch, what remained was to follow the online data closely, verify the earlier laboratory data against it, watch the monitoring data, and adjust the plan in time.

6.1 Daily Report Output

During the grayscale rollout of the 0.62 upgrade package, a daily report is produced, covering DAU, PV, JS abnormal users, JS abnormal rate, SDK abnormal users, and SDK abnormal rate for each module, so that developers and testers have an overall picture of how things are running online. We also prepared a contingency plan in advance: grayscale stops when the abnormal rate reaches a certain threshold.
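The daily report and the stop condition can be sketched as below. The field names, the module names, the numbers, and the 0.1% threshold are all illustrative assumptions; the rates here are computed per DAU.

```typescript
// Hypothetical raw per-module counters for one day.
interface ModuleStats {
  module: string;
  dau: number;
  jsAbnormalUsers: number;
  sdkAbnormalUsers: number;
}

interface ReportRow extends ModuleStats {
  jsAbnormalRate: number;
  sdkAbnormalRate: number;
}

function rate(part: number, total: number): number {
  return total === 0 ? 0 : part / total;
}

// Turn raw counters into report rows with abnormal rates.
function buildDailyReport(stats: ModuleStats[]): ReportRow[] {
  return stats.map((s) => ({
    module: s.module,
    dau: s.dau,
    jsAbnormalUsers: s.jsAbnormalUsers,
    sdkAbnormalUsers: s.sdkAbnormalUsers,
    jsAbnormalRate: rate(s.jsAbnormalUsers, s.dau),
    sdkAbnormalRate: rate(s.sdkAbnormalUsers, s.dau),
  }));
}

// Stop grayscale as soon as any module reaches the threshold.
function shouldStopGray(report: ReportRow[], threshold: number): boolean {
  return report.some((r) => r.jsAbnormalRate >= threshold || r.sdkAbnormalRate >= threshold);
}

const report = buildDailyReport([
  { module: "order", dau: 20000, jsAbnormalUsers: 2, sdkAbnormalUsers: 0 },
  { module: "wallet", dau: 5000, jsAbnormalUsers: 10, sdkAbnormalUsers: 1 },
]);
const stop = shouldStopGray(report, 0.001);
```

With these made-up numbers the wallet module alone crosses the threshold, so grayscale would stop even though the order module is healthy.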

6.2 Output of Performance Data

Taking the performance data of Android terminal as an example, the performance data collected online is as follows, which is basically consistent with the data measured offline:

1. Package size

On Android, with Hermes+HBC the build output changes from a text-format .jsbundle to a binary .hbc package, which increases the package size by more than 45%. This is a space-for-time optimization (JIT to AOT).

2. Cold start

After adopting the Hermes+HBC solution, instruction execution is much faster and the cold start time drops by about 64%, which makes startup nearly three times as fast (1 / (1 - 0.64) ≈ 2.8). This is basically in line with our previous tests and expectations.

3. Hot start

We built an engine reuse mechanism: once an engine has been created it stays resident in memory, so the second startup is a hot start. Unlike a cold start, a hot start does not need time-consuming operations such as loading the JS code and running its initialization. As a result, the hot start time has barely changed, which is basically in line with our previous tests and expectations.

4. Memory usage

The Hermes engine executes HBC directly, which removes the JS source parsing step and reduces the runtime memory of a single page after cold start by more than 30%.
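The engine reuse mechanism described under hot start above can be sketched as a small cache. The `Engine` type and the key are stand-ins; a real engine would be a native object such as a ReactInstanceManager, and this sketch ignores eviction and thread safety.

```typescript
// Stand-in for an expensive-to-create JS engine instance.
interface Engine {
  id: number;
}

class EnginePool {
  private engines: { [key: string]: Engine } = {};

  // Return the resident engine for this key, creating it only on first use.
  getOrCreate(key: string, create: () => Engine): { engine: Engine; hot: boolean } {
    if (Object.prototype.hasOwnProperty.call(this.engines, key)) {
      return { engine: this.engines[key], hot: true };
    }
    const engine = create(); // cold path: load and initialize the bundle
    this.engines[key] = engine;
    return { engine, hot: false };
  }
}

let created = 0;
const createEngine = (): Engine => ({ id: ++created });

const pool = new EnginePool();
const first = pool.getOrCreate("order", createEngine); // cold start
const second = pool.getOrCreate("order", createEngine); // hot start, reuses the engine
```

The second call skips creation entirely, which is why hot start time does not depend much on which JS engine is underneath.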

6.3 Subsequent Batches

By the time the first batch was essentially complete, we had accumulated many best practices: the rollout plan, the fault tolerance scheme, the test plan, performance analysis, and so on. The second batch, on the shipper apps, only needed slight adjustments on this basis, so its rollout risk was lower and its overall plan ran much more smoothly.

The stability results from the first batch on the driver apps confirmed that the overall risk was under control. Therefore, the second batch on the shipper apps went online directly with the regular APP release.

There is no plug-in mechanism on iOS, so both batches were launched with the regular APP release, using the default 7-day grayscale (phased release) of the App Store.