This article was first published on the WeChat public account “Shopee Technical Team”.

1. Background

React Native (RN) is a popular cross-platform framework for developing hybrid applications, and it is well suited to the fast-changing business of e-commerce. Because RN pages are rendered by the client, they have some advantages in user experience over H5 pages.

As Shopee’s business grew rapidly, the amount of RN code in our App grew very fast, which caused problems such as oversized build artifacts, long deployment times, and dependency conflicts between teams. To address these pain points, we explored a decentralized RN architecture and, based on this model, developed the Code Push Platform (CPP) and its client SDK. Together they cover the whole RN R&D cycle for multiple teams, including development, build, release, and runtime. After nearly three years of iteration, a number of the company’s core apps have adopted the platform.

The Shopee merchant service front-end team has built a number of merchant-side applications. Most of their users are merchant service staff, who expect highly available business systems and timely feedback on problems, which in turn raises our requirements for the React Native architecture.

This article describes, step by step, how we meet the development needs of multiple teams working on complex business, from four angles: development history, architecture model, system design, and migration scheme.

2. Development history

As our business grew, the number of RN bundles increased rapidly, and the number of apps reached nearly ten. The RN project as a whole changed along three dimensions (development model, deployment model, and architecture model): from a single team to multiple teams, from one bundle to multiple bundles, and from a centralized architecture to a decentralized one. In the end, each team’s business code can be developed, deployed, and run independently.

This history can be divided into four stages: the single-bundle centralized development mode, the single-bundle multi-business-group development mode, the multi-bundle centralized release mode, and the multi-bundle decentralized release mode.

2.1 Stage 1: Single-bundle centralized development mode

The initial RN architecture was relatively simple. The business was not yet complex, and development happened in a single code repository, so releases were distributed through a CDN, and a configuration file recorded the version and download address of the RN bundle for resource management. Each release consisted of two artifacts: an RN resource bundle and a JSON configuration file used for resource versioning.

Each time an RN build completes, the two build artifacts are placed in the static resource directory. The App pulls the configuration file at specific moments (such as App restart) to check whether resources have been updated, and then downloads the RN static resources from the CDN. The next time the page is opened, the App loads the latest content.
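The update check described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the manifest field names (version, bundleUrl) and the CDN address are assumptions made for the example.

```typescript
// Hypothetical shape of the JSON configuration file (field names are assumptions).
interface BundleManifest {
  version: number;   // version of the RN resource bundle
  bundleUrl: string; // CDN download address of the bundle
}

// Returns the URL to download when the remote manifest is newer, otherwise null.
function checkForUpdate(localVersion: number, remote: BundleManifest): string | null {
  return remote.version > localVersion ? remote.bundleUrl : null;
}

// Example: on restart, the App pulls the manifest and compares versions.
const remoteManifest: BundleManifest = JSON.parse(
  '{"version": 12, "bundleUrl": "https://cdn.example.com/rn/bundle-12.zip"}'
);
const updateUrl = checkForUpdate(11, remoteManifest); // older client: download
const noUpdate = checkForUpdate(12, remoteManifest);  // up to date: null
```

In this sketch the downloaded bundle would only take effect the next time the page is opened, matching the behavior described above.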

As the business grew, more and more business teams wanted to build with the RN technology stack. This put pressure on the existing architecture, and the idea of “multiple business groups, multiple code repositories” naturally emerged.

2.2 Stage 2: Single-bundle multi-business-group development mode

To solve this problem, we adopted the host-plugin model as the development solution for multiple business groups.

The host manages common dependencies and general logic. React, React Native, the Shopee RN SDK, and so on are managed in an independent repository, which guarantees that these special RN dependencies remain “singletons” and avoids duplicate copies of dependencies that contain client components. RN officially prohibits such duplicate dependencies.

One host corresponds to multiple plugin repositories. Each business code repository is treated as a plugin that is attached to the main application, and each business team can manage its repository according to its own coding conventions. Every plugin repository is an NPM dependency of the host project, and the build is a centralized release process: all code is integrated into the host project, where the build script runs. This mode meets the needs of a super App.

However, the host-plugin mode also brings a problem. As the business grows, the RN build artifact gets steadily larger, and an oversized artifact hurts decompression efficiency on the client and the JS loading time of the RN container.

2.3 Stage 3: Multi-bundle centralized architecture mode

To address the oversized RN artifact, we used the build tool to split the packaged output into multiple bundles. This optimization was necessary, and we call it “subcontracting” (bundle splitting). The host project corresponds to the common package, and each plugin project corresponds to a business package.

The entire build still takes place in the host project, so the model remains “centralized build” and “centralized release”. Multiple bundles are published to the system, and clients pull them as hot updates. The client loads the corresponding bundle on demand, so the resources consumed by a single RN container load drop significantly, which solves the efficiency problem.

But its disadvantages are also obvious. As business teams and business content grow, the multi-bundle centralized release model shows four drawbacks:

  • At runtime, even though subcontracting separates the artifacts, they still run in the same JSContext, which may lead to dependency conflicts and pollution of environment variables.
  • During development and debugging, each plugin depends heavily on the host project. Every code change requires a large amount of content to be reloaded, which makes development and debugging unfriendly.
  • During the build, packaging speed depends on the number of plugins. For a very large application, a single build can take as long as 50 minutes, which seriously hurts release efficiency.
  • During deployment and release, the host project maintainer is responsible for the whole App. Business groups cannot release independently, and their release schedules are bound together. When a live issue occurs, developers incur significant communication costs and can only roll back the entire release.

2.4 Stage 4: Multi-bundle decentralized architecture mode

The decentralized React Native architecture is similar to the “micro front-end” concept for web pages or the “micro application” concept for clients. Multiple business teams can develop and deploy independently, and their modules run independently within the same App. The architecture covers development, build, release, runtime, and more. This model solves the four drawbacks mentioned above and upgrades the whole R&D system. Its advantages include isolated RN runtimes, efficient development and debugging, and independent build and release.

The following sections describe the decentralized RN architecture and system design, and how we strike a balance between flexibility and stability.

3. Decentralized RN architecture model

In simple terms, the decentralized RN release model has four parts: an independent JS runtime, an independent development process, an independent build process, and an independent release process. With these four key links, each team runs its RN R&D process at its own pace.

3.1 Independent JS runtime

Independent runtimes (multiple JSContexts, that is, execution contexts) are a major feature of the decentralized architecture and a solid guarantee for independent release. By isolating running RN code per plugin, they effectively avoid variable conflicts and dependency conflicts between different businesses. In other words, a release of “Plugin A” does not affect “Plugin B”.

Its design mainly includes the following three points:

  • Create JSContexts ahead of time and preload the common package;
  • When the user enters a plugin page, the SDK checks whether the corresponding JSContext has already been instantiated. If so, it is used directly. Otherwise, an independent context is taken from the JSContext pool to load and execute the business package. Plugins are isolated from one another;
  • When the user exits the business page, the JSContext is not destroyed immediately but is put into a cache pool, so that the page opens quickly the next time and the user gets the best experience.

The carrier of a JSContext can be a thread or a process. To avoid creating and reclaiming it frequently, we maintain a cache pool and reuse existing JSContexts as much as possible.

Here, we adopt a Least Frequently Recently Used (LFRU) strategy. When an application that has just been exited is opened again, its JSContext is reused, which saves 85% of the first-screen rendering time. The cache size is configurable, so each business can set a reasonable estimate based on the scale of its application. While an RN page is still in use, its context is not reclaimed even if the configured size is exceeded. This design keeps pages available.
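The pooling behavior above can be sketched as follows. This is only an illustration under stated assumptions: JsContext is a stand-in class rather than a real JS engine context, all names are hypothetical, and eviction uses plain least-recently-used order as a simplification of the strategy named above.

```typescript
// Stand-in for a real JS execution context; it only records what was loaded.
class JsContext {
  loaded: string[] = ["common"]; // the common package is preloaded
  load(plugin: string): void { this.loaded.push(plugin); }
}

interface Entry { plugin: string; ctx: JsContext; }

class JsContextPool {
  private idle: JsContext[] = []; // pre-created contexts not yet bound to a plugin
  private active: Entry[] = [];   // plugins whose pages are currently open
  private cached: Entry[] = [];   // exited plugins, least recently used first

  constructor(private maxCached: number, prewarm: number) {
    for (let i = 0; i < prewarm; i++) this.idle.push(new JsContext());
  }

  // Removes and returns the entry for a plugin from a list, if present.
  private take(list: Entry[], plugin: string): Entry | undefined {
    for (let i = 0; i < list.length; i++) {
      if (list[i].plugin === plugin) return list.splice(i, 1)[0];
    }
    return undefined;
  }

  // Entering a plugin page: reuse an existing context, else bind a fresh one.
  enter(plugin: string): JsContext {
    let entry = this.take(this.active, plugin) ?? this.take(this.cached, plugin);
    if (!entry) {
      const ctx = this.idle.pop() ?? new JsContext();
      ctx.load(plugin);
      entry = { plugin, ctx };
    }
    this.active.push(entry);
    return entry.ctx;
  }

  // Exiting: keep the context cached; only cached (never active) contexts are evicted.
  exit(plugin: string): void {
    const entry = this.take(this.active, plugin);
    if (!entry) return;
    this.cached.push(entry);
    while (this.cached.length > this.maxCached) this.cached.shift();
  }

  isCached(plugin: string): boolean {
    return this.cached.some((e) => e.plugin === plugin);
  }
}

const pool = new JsContextPool(2, 1);
const first = pool.enter("order");
pool.exit("order");
const again = pool.enter("order"); // the cached context is reused
```

Because eviction only looks at the cached list, a context whose page is still open is never reclaimed, even when the configured size is exceeded.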

3.2 Development Process

As mentioned above, the debugging efficiency of an RN project drops as the volume of business code grows, and every developer’s productivity directly affects everyone’s “happiness”. The decentralized RN release model is therefore specifically optimized for the development process.

With independent runtimes, when debugging RN the client only needs to load the one plugin under development into its JSContext, while the other plugins use the built-in cache.

This has two advantages. First, it keeps the scope of the started service to a minimum, which keeps hot reloading efficient. Second, it keeps the development and build processes consistent, so problems such as compilation errors caused by a missing Babel plugin are exposed early in development. This “decentralized” development process improves RN debugging efficiency.

3.3 Build Process

As the business has developed, the number of RN plugins in an App has reached 4. The old build process was affected by the number of plugins, and a centralized build took more than 20 minutes. With the decentralized RN architecture, build time no longer grows with the number of plugins. It depends only on the amount of code in the plugin being built and is stable at about 5 minutes.

The new architecture is still based on the host-plugin model, and isolating each plugin in its own repository gives every team room to grow freely. Because the base native dependencies are unified within the application, the host project is used only to manage the unified common dependencies. The host project first builds the common bundle, and the system records the dependency information contained in it. When a plugin project builds, the build tool removes the dependencies already provided by the common bundle and then completes the business package build. The build artifacts of each business package are stored independently in the system, which supports independent rollback, independent release, and independent gray release.

The advantage is that each build task is as small as possible: building one plugin does not rebuild the entire project, which makes builds truly “packaged on demand”.
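The dependency removal step can be illustrated with a small sketch. It assumes, for the sake of the example, that the recorded dependency information is simply a list of module names; the module names and the function name are hypothetical, and the real build tool is not shown here.

```typescript
// Keep only the modules that the common bundle does not already provide.
function buildBusinessPackage(pluginModules: string[], commonModules: string[]): string[] {
  return pluginModules.filter((name) => commonModules.indexOf(name) === -1);
}

// Dependency information recorded when the common bundle was built (illustrative).
const commonBundle = ["react", "react-native", "shopee-rn-sdk"];

// Everything the order plugin imports, including shared dependencies.
const orderPluginModules = ["react", "react-native", "order/List", "order/Detail"];

const orderBusinessPackage = buildBusinessPackage(orderPluginModules, commonBundle);
```

Since the output depends only on the plugin’s own modules, adding more plugins to the App does not change the work done for this one.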

3.4 Release Process

Building and releasing RN are two separate processes, so the build and release phases of a bundle are completely decoupled, and each business team’s release lead can arrange release timing flexibly. Each business group is responsible for its own code quality and controls its own release pace without affecting other teams’ online business. The release process includes full release, joint release, gray release, rollback, and other operations. The following chapters describe in detail how release stability is ensured.

4. System design

Why do we need a system? For complex, large-scale projects, a simple hot update process can no longer support collaboration among multiple business groups. We need a hot update system that is full-featured, performant, and easy to operate in order to support complex business. The Code Push Platform is written in Node.js and comes with companion command-line tools and a client SDK.

To support multiple business teams, the system can be divided into three functional parts:

  • Multi-team authority control;
  • Bundle lifecycle management;
  • System efficiency improvement.

System efficiency improvement is further divided into:

  • Incremental difference;
  • Package size optimization for multi-scene entries;
  • One-stop multi-environment integration.

4.1 Multi-team Permission Control

In addition to recording every build operation, the system decentralizes the workflow and isolates the permissions of each plugin. Each owner can operate only within their own scope: the owner of plugin 1 can trigger builds and releases for plugin 1 only, not for plugin 2. Strict permission control standardizes all release processes and keeps the project controllable.

The decentralized React Native release model was designed to reduce communication costs between teams. The system scopes each team’s build and release actions so that releases do not interfere with each other.

Permissions are managed in a tree structure. One App corresponds to one project, and by default the project leader is the leader of the App team. System actions such as creating a brand-new plugin require the project leader’s approval. An App contains multiple plugins, and by default the leader of each plugin is the leader of the corresponding business team, who has the authority to assign build and release permissions.
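One way to picture the permission tree is the sketch below. The type names, user names, and the rule that only the plugin leader and explicitly granted members may build or release are assumptions made for illustration; the platform’s real data model is not described in this article.

```typescript
type Action = "build" | "release" | "createPlugin";

interface PluginNode {
  name: string;
  leader: string;                    // business team leader by default
  members: Record<string, Action[]>; // permissions assigned by the plugin leader
}

interface ProjectNode {
  app: string;
  leader: string;                    // App team leader by default
  plugins: PluginNode[];
}

function canPerform(project: ProjectNode, user: string, action: Action, pluginName?: string): boolean {
  // Project-level actions, such as creating a plugin, need the project leader.
  if (action === "createPlugin") return user === project.leader;
  const plugin = project.plugins.filter((p) => p.name === pluginName)[0];
  if (!plugin) return false;
  if (user === plugin.leader) return true;
  // Everyone else needs an explicit grant on this specific plugin.
  const granted = plugin.members[user] || [];
  return granted.indexOf(action) !== -1;
}

const project: ProjectNode = {
  app: "SellerApp",
  leader: "alice",
  plugins: [
    { name: "plugin1", leader: "bob", members: { carol: ["build"] } },
    { name: "plugin2", leader: "dave", members: {} },
  ],
};

const bobOnOwnPlugin = canPerform(project, "bob", "release", "plugin1");   // allowed
const bobOnOtherPlugin = canPerform(project, "bob", "release", "plugin2"); // denied
```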

4.2 Bundle Lifecycle Management

4.2.1 Client Version Control

RN differs from web applications in that it depends tightly on the client. As long as the client’s underlying dependencies do not change, developers can generally update RN code through hot updates. However, a major upgrade, such as moving React Native from 0.59 to 0.63, requires not only JavaScript changes but also a client upgrade, and the result is no longer backward compatible. Technically, this is unavoidable. A situation in which the client is not backward compatible is known as a “fault” (a compatibility break).

The system provides client version control. When a major change occurs, the App owner creates a “fault record” in the system with an App version range, from the lowest compatible App version to the highest App version. Only clients within this range can pull the latest RN resources of this fault.

As shown in the following table, Apps at version 2.5.0 or above pull RN package 105, Apps from 2.0.0 up to 2.5.0 pull RN package 103, and Apps from 1.0.0 up to 2.0.0 pull RN package 100.

App Version       RN Version
2.5.0 ~ latest    105
2.0.0 ~ 2.5.0     103
1.0.0 ~ 2.0.0     100

This measure effectively avoids potential risks. New requirements go only to the latest fault, and old faults receive only fixes for online problems. After all, maintaining two sets of code has a cost, so old faults should be phased out as users update to the latest App version.
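The version lookup in the table above can be sketched as follows. The record shape is hypothetical, and the sketch assumes each range includes its lower bound and excludes its upper bound, which is consistent with “greater than or equal to 2.5.0” but is otherwise an assumption.

```typescript
interface FaultRecord {
  minApp: string;        // lowest compatible App version (inclusive)
  maxApp: string | null; // upper bound (exclusive); null means "latest"
  rnVersion: number;     // RN package version served to this range
}

// Compares dotted versions numerically: negative if a < b, 0 if equal, positive if a > b.
function compareVersions(a: string, b: string): number {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Finds the RN package version for an App version, or null if no fault covers it.
function resolveRnVersion(appVersion: string, faults: FaultRecord[]): number | null {
  for (const fault of faults) {
    const aboveMin = compareVersions(appVersion, fault.minApp) >= 0;
    const belowMax = fault.maxApp === null || compareVersions(appVersion, fault.maxApp) < 0;
    if (aboveMin && belowMax) return fault.rnVersion;
  }
  return null;
}

// The three faults from the table above.
const faults: FaultRecord[] = [
  { minApp: "2.5.0", maxApp: null, rnVersion: 105 },
  { minApp: "2.0.0", maxApp: "2.5.0", rnVersion: 103 },
  { minApp: "1.0.0", maxApp: "2.0.0", rnVersion: 100 },
];

const forNewClient = resolveRnVersion("2.5.0", faults); // 105
const forOldClient = resolveRnVersion("1.3.2", faults); // 100
```

Comparing versions part by part as numbers, rather than as strings, keeps a version such as 2.10.0 above 2.5.0.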

4.2.2 Gray release and rollback

The release process includes full release, gray release, rollback, and other operations. For large requirements, going live to all users at once is risky, so a new version is generally released to a subset of users first. The release lead can run a gray release for specified users and a specific range, then gradually widen the range until it reaches full release. When a major bug is found, the release lead can roll back within seconds using a “zero build” approach.

The decentralized RN architecture lets each plugin release, gray release, and roll back independently, which ensures quality and limits risk at the smallest possible granularity. Gray release and rollback at the plugin level give business teams flexibility: each team can release its own versions, control its own gray release pace, and handle its own online issues.
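As a rough illustration of how a gray release range might be evaluated, the sketch below combines a list of specified users with a percentage range. Both the rule shape and the hash-based bucketing are assumptions for the example; the article does not describe how the platform actually selects users.

```typescript
interface GrayRule {
  allowList: string[]; // specified users who always receive the new version
  percentage: number;  // share of remaining users, from 0 to 100
}

// Deterministic string hash mapped to a bucket in [0, 100).
function bucketOf(userId: string): number {
  let hash = 0;
  for (let i = 0; i < userId.length; i++) {
    hash = (hash * 31 + userId.charCodeAt(i)) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash % 100;
}

function inGrayRelease(userId: string, rule: GrayRule): boolean {
  if (rule.allowList.indexOf(userId) !== -1) return true;
  return bucketOf(userId) < rule.percentage;
}

// Widening the range step by step until it reaches full release.
const stage1: GrayRule = { allowList: ["qa-user-1"], percentage: 0 };
const stage2: GrayRule = { allowList: ["qa-user-1"], percentage: 10 };
const fullRelease: GrayRule = { allowList: [], percentage: 100 };
```

Because a user’s bucket never changes, anyone included at 10% stays included as the percentage grows, so users do not flip between old and new versions during the rollout.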

4.3 System efficiency improvement

4.3.1 Incremental difference

Frequent updates of RN resource packages consume users’ data traffic, and the most effective way to save traffic is incremental updates. An RN resource bundle contains compiled JavaScript artifacts, images, translation files, and other static resources. The difference between two versions consists of the code and other resource files that changed. To apply the difference at fine granularity inside the resource package, the system provides an independent “difference service” that uses binary diffing on the build artifacts.

The diff (difference) operation on RN resource bundles is performed on the server, and the patch (merge) operation is performed in the App. In the decentralized RN architecture, each plugin is diffed independently. Releasing a plugin automatically triggers diffing: the system fetches the plugin’s last five versions, and the Diff Server computes the difference between each of them and the current version in turn. If a calculation succeeds, the result is uploaded to the CDN and reported back to the system; otherwise it is retried. The whole diff operation is asynchronous. Even in extreme cases, such as the difference service going offline, the system automatically falls back to the full package to stay available.
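The fallback from patch to full package can be sketched as a simple selection step. The record shape and URLs below are hypothetical, and the binary diff and patch algorithms themselves are out of scope for this sketch.

```typescript
interface Release {
  version: number;
  fullUrl: string;                   // the full package is always available
  patchUrls: Record<number, string>; // fromVersion -> patch URL, filled in asynchronously
}

interface Download {
  kind: "none" | "patch" | "full";
  url: string | null;
}

function chooseDownload(clientVersion: number, release: Release): Download {
  if (clientVersion >= release.version) return { kind: "none", url: null };
  // A patch exists only if the client is on one of the recent versions that
  // were diffed and the diff succeeded; otherwise fall back to the full package.
  const patchUrl: string | undefined = release.patchUrls[clientVersion];
  if (patchUrl) return { kind: "patch", url: patchUrl };
  return { kind: "full", url: release.fullUrl };
}

const release: Release = {
  version: 20,
  fullUrl: "https://cdn.example.com/order/20/full.zip",
  patchUrls: { 19: "https://cdn.example.com/order/20/from-19.patch" },
};

const fromRecent = chooseDownload(19, release); // patch
const fromOld = chooseDownload(10, release);    // full package
```

An empty patchUrls map models the case where the difference service is offline: every client simply receives the full package.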

4.3.2 Package size optimization for multi-scene entries

React Native’s official bundler, Metro, does not support tree shaking. As business code grows, package size optimization becomes an important issue.

Take ShopeePay as an example. It provides payment for several of the company’s core apps, and the ShopeePay plugin has page-level differences across regions and apps. One repository contains all the code and resources, and the build script packages all of them into a single artifact. As a result, a ShopeePay release contains many redundant resources, which wastes download traffic and hurts code execution efficiency.

We use our self-developed babel-plugin-scene plugin, which sets a scene value through an injected environment variable. Based on the scene value, Babel loads different files and falls back to the default file when no scene-specific file exists. Different scenes correspond to different entry files, which keeps the package size under control.
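The resolution rule can be illustrated with the sketch below. It does not show the real babel-plugin-scene, whose interface is not described here; the file naming convention (name.scene.tsx), the scene values, and the function name are all assumptions made for the example.

```typescript
// Resolves an import to a scene-specific file when one exists, else to the default file.
// In a real build, the scene value would come from the injected environment variable.
function resolveSceneFile(base: string, scene: string | undefined, existingFiles: string[]): string {
  if (scene) {
    const candidate = base + "." + scene + ".tsx"; // hypothetical naming convention
    if (existingFiles.indexOf(candidate) !== -1) return candidate;
  }
  return base + ".tsx"; // default file as the fallback
}

// Files present in the repository (illustrative).
const repoFiles = ["pages/Home.tsx", "pages/Home.sg.tsx", "pages/Wallet.tsx"];

const homeForSg = resolveSceneFile("pages/Home", "sg", repoFiles);     // scene-specific file
const walletForSg = resolveSceneFile("pages/Wallet", "sg", repoFiles); // falls back to default
```

With this kind of rule, a build for one scene pulls in only that scene’s files and the shared defaults, so files written for other scenes never enter the bundle.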

4.3.3 One-stop multi-environment integration

A normal development process moves from the Test environment to the UAT environment and then to the LIVE environment. The Code Push Platform connects to the Test, UAT, and LIVE environments of the App, so RN developers can complete the whole R&D cycle of a requirement in one place.

Moving bundle resources between environments is a highlight of multi-environment integration. An RN bundle built in the UAT environment does not need to be rebuilt: the same bundle is promoted directly to the LIVE environment for distribution. The benefits are “zero build time” and stable resource bundles. Because the bundle is not rebuilt, exactly the content verified in UAT is what gets distributed, which lowers release risk.
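A minimal sketch of this promotion idea follows. The types, field names, and the forward-only rule are assumptions for illustration; the point is only that promotion changes the environment while the artifact itself stays untouched.

```typescript
type Env = "test" | "uat" | "live";

interface Artifact {
  plugin: string;
  version: number;
  checksum: string; // identifies the exact bytes that were verified
  url: string;
}

interface Deployment {
  env: Env;
  artifact: Artifact;
}

// Promotes a deployment to a later environment without rebuilding the artifact.
function promote(current: Deployment, target: Env): Deployment {
  const order: Env[] = ["test", "uat", "live"];
  if (order.indexOf(target) <= order.indexOf(current.env)) {
    throw new Error("can only promote to a later environment");
  }
  return { env: target, artifact: current.artifact }; // same artifact, zero build
}

const uatDeployment: Deployment = {
  env: "uat",
  artifact: { plugin: "order", version: 20, checksum: "abc123", url: "https://cdn.example.com/order/20/full.zip" },
};

const liveDeployment = promote(uatDeployment, "live");
```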

5. Migration of existing services

Migrating the App of an existing business is a serious challenge, especially for a business with heavy historical baggage, which may contain “logical coupling” or “component coupling”. At the same time, many related businesses are in the middle of requirement iterations, and the migration must not block them, so a “gradual migration” scheme for old business is necessary.

5.1 Logical Coupling

If two or more plugins depend on each other logically, the user must load the latest versions of all of them at the same time. Because a hot update can fail, logical coupling is a hidden constraint among multiple plugins. For example, the order business and the purchase business are logically coupled, and for a super App with huge traffic, the release lead cannot safely release the plugins one by one. In an extreme case, a user may load the new version of Plugin A first while still running the old version of Plugin B. If the two are incompatible, the consequences can be serious. In this case, there are two solutions:

  • Scheme 1: Decouple the logic between plugins so that each plugin is independent.
  • Scheme 2: The system provides joint release, which ensures on the Native side that multiple plugins switch to their latest versions at the same time.

Scheme 1 is the ideal state, but when business scenarios are finely divided, it is hard for the project structure to achieve absolute independence.

Scheme 2 can be considered for old business. The system provides the concept of a module: one module corresponds to two or more plugins that are bound together. Within one download task, the client SDK handles them as a “transaction”, which ensures that the plugins are downloaded and put into use together. Joint release thus rules out this class of errors at the system level.
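The “transaction” behavior can be sketched as an all-or-nothing switch. This is a simplified illustration: the function names are hypothetical, and downloadOne stands in for the client SDK’s real download step, returning false on failure.

```typescript
type Versions = Record<string, number>; // plugin name -> active version

// Applies a joint release: either every plugin in the module switches, or none does.
function applyJointRelease(
  active: Versions,
  target: Versions,
  downloadOne: (plugin: string, version: number) => boolean
): Versions {
  const staged: Versions = {};
  for (const plugin of Object.keys(target)) {
    // Any failed download aborts the whole task and keeps all old versions.
    if (!downloadOne(plugin, target[plugin])) return active;
    staged[plugin] = target[plugin];
  }
  // Commit: all plugins of the module switch to the new versions together.
  return { ...active, ...staged };
}

const activeVersions: Versions = { order: 10, purchase: 7 };
const moduleRelease: Versions = { order: 11, purchase: 8 };

const allOk = applyJointRelease(activeVersions, moduleRelease, () => true);
const oneFailed = applyJointRelease(
  activeVersions,
  moduleRelease,
  (plugin) => plugin !== "purchase" // simulate a failed download of one plugin
);
```

In the failure case the user keeps running order 10 with purchase 7, so a new Plugin A never meets an old Plugin B.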

5.2 Component Coupling

Whereas joint release is a compatibility solution for “logical coupling” at the plugin level, “component coupling” is a finer-grained coupling at the component level: one page contains components from different teams, such as a product details page that embeds a review component. This scenario, in which a page nests multiple JSContexts, exists in e-commerce business.

There are two solutions to this “component coupling” situation:

  • Scheme 1: Extract the nested components into a separate repository so that other plugins can depend on them.
  • Scheme 2: Use “same-screen rendering” to achieve “multi-context nesting”.

Scheme 1 is the ideal solution. However, considering the cost of migration, we also provide Scheme 2: a nested “same-screen rendering” component, which behaves much like a Native component. With multiple JSContexts, the plugin name and the page name identify the content to nest into another page.

As shown in the figure below, Plugin A nests Plugin B, and both A and B are rendered on the same screen. From a web perspective, this is a bit like an “iframe”, which supports nesting multiple pages. The concept is easy for RN developers to understand, and the client SDK dynamically loads the target bundle and renders it in place.

5.3 Progressive migration

For an existing App, it is hard to migrate everything at once because business iteration cannot stop. We therefore offer a “progressive migration” solution: instead of migrating all plugins at once, the existing code is gradually split up and migrated into the new release system.

The migration steps are shown below:

  • Step 1: Businesses that are already independent are migrated to the Code Push Platform first, each with its own JSContext;
  • Step 2: All the “code to be split” shares one separate JSContext;
  • Step 3: Keep splitting the “code to be split” into separate plugins, each with its own JSContext, and leave the rest as in Step 2.

Repeat the second and third steps as versions iterate until all historical business has been split. In this way, we reach the goal of truly “independent build” and “independent release”.

6. Summary

The goal of the system is to improve multi-team R&D efficiency across all apps. The decentralized RN release model covers four aspects, “independent runtime”, “independent development”, “independent build”, and “independent release”, to ensure the independence of each plugin. The ultimate goal is to let Shopee’s many RN teams release freely and efficiently on different Apps at their own pace.

The system design covers “multi-team permission control”, “client version control”, “gray release and rollback”, “incremental difference”, “multi-scene package size optimization”, and “one-stop multi-environment integration”, which speeds up the whole R&D process and achieves both “flexibility” and “stability”.

About the author

Weiping, from the Shopee merchant service front-end team.