Background

Page building is a very broad concept in Internet technology. It ranges from the early WordPress personal sites that most people are familiar with, to the text-and-image layout of article CMS systems, to more complex UI building, especially after React and Vue appeared. At the framework level, these libraries provide many capabilities, including view structuring (VDOM) and the association of views with data (data binding), which greatly simplify what used to be a very complex building canvas. At Taobao, the first building system, TMS (Template Management System), was established in 2008, and its design strongly influenced building system design over the following ten years. These designs include:

  1. Separation of front-end development and operations work
  2. Data mining based on templates
  3. Abstraction of page rendering



The page abstraction above still carries many traces of the PC era. As the company adjusted its strategy and moved toward wireless and personalization, the building system changed with it, which produced a variety of building applications for different scenarios and demands and, inevitably, a lot of duplicated work behind them. In 2019, with the merger of the Taobao and Tmall technology teams and the support of the Alibaba front-end committee, a building technology direction was established. Its goals were to reduce redundant construction and improve business circulation through a layered abstraction of building applications and services. Tianma became the unified building service for the Alibaba economy, and each BU or business can build a building system that fits its own needs on top of the unified service and specifications.

In the last fiscal year, Tianma completed onboarding and building support for more than a dozen BUs, with an overall output of tens of thousands of modules and millions of pages, serving more than 30,000 Alibaba operations staff and hundreds of thousands of merchants.

Nouns and concepts of construction

Building is a process in which multiple roles participate together, which is a big difference from other technical directions. When designing a building system, it is therefore necessary to be clear about who the key users are and which roles the process should be designed around. To make the rest easier to follow, let's first align on some terms:

  1. Module: the smallest unit with which non-technical colleagues build pages
  2. Page building: the process of composing modules into a page
  3. Data placement: data changes much more frequently than pages, so data placement is treated as a separate concept
  4. Terminal: the target operating environment

Design of construction

Process design

Whether in the PC era a few years ago or in today's wireless era, business operations are inseparable from large-scale page production. Taobao is an example: from the early pages for different industry categories to today's various shopping-guide and marketing formats, all of them rely on producing a large number of pages.

After abstraction, the whole page production process mainly consists of several steps:



These steps should be completed independently by operations staff, without the intervention of R&D colleagues. We focus on three of them: setting, building, and placing.

  1. Set page: page title, keywords, description, and other settings that directly change the HTML document
  2. Build page: adjust the page structure, add modules, delete modules, swap module positions, and so on
  3. Place data: set data for one or more modules

Behind these capabilities is the way pages, modules, and data are abstracted, which determines how a building material should be designed. Of course, these steps are only a sequence and do not have to be carried out manually. Even when page production and building are automated, the process behind the system is similar.

Build the core material – module

Module is the smallest unit of page construction defined by Tianma. The page is composed of modules, which can be associated with data. Tianma’s building module has several core design principles:

  1. Flat
  2. Cross-terminal
  3. Standards-oriented data development

Flat

In the earlier introduction to TMS, we described how TMS abstracted page rendering. In addition to stacking modules from top to bottom, it also supported arranging modules horizontally, which at the time was known as grid modules.



Tianma simplifies this part and only supports building from top to bottom, which gives a one-dimensional, flat module structure.

Block building

The corresponding page is as follows:



Why simplify this part?

  • Friendly to operations: Operations colleagues are the main users of the building system, and given the characteristics of the phone screen in wireless scenarios, a one-dimensional module list is relatively friendly to them. This design also simplifies the building service itself: the entire page structure is a one-dimensional array, and each operation can be converted into a simple array operation. Of course, one-dimensional storage does not imply one-dimensional display. Developers can still use parent-child relationships to transform the one-dimensional storage structure into a tree structure at presentation time. Whether it is appropriate to leave the complexity to developers and keep operations simple for non-technical colleagues is a trade-off we are still evaluating.
  • Because of the shift to wireless, the company invests much more in wireless than in desktop (mainly on the consumer side). If users can build a wireless page and have the desktop version generated automatically, or the reverse, they save a lot of time. In extreme scenarios, building one wireless page requires additional PC, WEEX, and mini program versions. A one-dimensional module structure makes it easy to establish the correspondence between modules on pages for different terminals.
  • Convenient for relating the server side to modules: as algorithms become widespread, personalization is no longer limited to the inside of a commodity module. Different people visiting the same page may see the order of the whole page adjusted. The back-end algorithm therefore needs to be aware of the page structure and associate it with its model, and a one-dimensional structure is very friendly to this.
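
The flat storage described above can be illustrated with a small sketch. It is not Tianma's actual data model: the field names (`id`, `name`, `parentId`) and the helper functions are assumptions made for illustration. It shows how building operations reduce to array operations, and how a developer could fold the flat list into a tree at presentation time.

```javascript
// Hypothetical page structure: a flat, ordered array of module instances.
// The field names (id, name, parentId) are illustrative assumptions.
const page = [
  { id: "m1", name: "banner" },
  { id: "m2", name: "tabs" },
  { id: "m3", name: "item-list", parentId: "m2" },
  { id: "m4", name: "coupon", parentId: "m2" },
];

// Each building operation is a plain array operation that returns a new page.
function addModule(modules, index, mod) {
  return [...modules.slice(0, index), mod, ...modules.slice(index)];
}

function removeModule(modules, id) {
  return modules.filter((m) => m.id !== id);
}

function moveModule(modules, from, to) {
  const next = [...modules];
  const [moved] = next.splice(from, 1);
  next.splice(to, 0, moved);
  return next;
}

// At presentation time, the flat list can be folded into a tree
// by following the optional parentId relationship.
function toTree(modules) {
  const nodes = new Map(modules.map((m) => [m.id, { ...m, children: [] }]));
  const roots = [];
  for (const m of modules) {
    const node = nodes.get(m.id);
    const parent = m.parentId ? nodes.get(m.parentId) : undefined;
    if (parent) parent.children.push(node);
    else roots.push(node);
  }
  return roots;
}
```

Storage stays one-dimensional in every operation. Only `toTree`, which runs on the rendering side, knows about nesting.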

Cross-terminal

As early as 12 years ago, Tmall already had the concept of cross-terminal. As in the earlier definitions, we do not strictly distinguish between terminal, container, and so on here. We use one relatively simple concept, the terminal, meaning the target operating environment. Terminals include Chrome on the desktop, Safari on mobile, UC Browser on a TV box, the WebView/WEEX container in Mobile Taobao, the mini program container in Alipay, and even the SSR rendering engine on the server side.

Why not use responsive design? Responsive design is just one cross-terminal solution. It does not solve the problem of running code on the server side, and it focuses too much on efficiency rather than addressing the essential differences between terminals.


Take a navigation module as an example: on the wireless terminal it is a tab, while on the desktop terminal it is a floating module that scrolls with the screen. This is a difference in interaction, but in practice there are even more differences in content and business logic, so there is no need to insist on responsive design. Simply write two sets of logic.

Of course, cross-terminal support is a capability of the module. If a module is intended to serve only one terminal, it can be written for that terminal only.

The actual situation is more complicated. The Taobao ecosystem has chosen Rax as a unified DSL. Based on the building design above, plus Rax's own ability to develop once and run on multiple terminals, a developer only needs to write one piece of wireless web code to produce WEEX and mini program versions as well. A module placed in a WebView runs in web mode, a module placed in a mini program runs as a native mini program, and a module placed in WEEX is rendered in WEEX form. For this reason, when a module is published, it is synchronized to both CDN and NPM. The CDN version is used by pure browsers and servers, and the NPM (TNPM) version is used by mini programs, source-code pages, and other scenarios that have page-level build capability.

Standards-oriented data development

This principle may sound confusing, but in practice people already do similar things: validating a form, agreeing on a new data interface with the back end, or defining data formats in TypeScript. We divide this principle into two parts:

  1. Data-oriented research and development
  2. Data standardization

The data format is described in schema.json, a data description that conforms to the JSON Schema specification. It describes the input parameters accepted by the module, that is, the data the module is developed against. There are further conventions inside these inputs, such as how to skin the module (quite different from the skin mechanism of mid- and back-office applications), how the module accepts configuration, and how the core rendering data is passed to the module. Standardizing the data format means defining the data model first and then reusing existing data models as far as possible, which solves the problem of different developers writing the same commodity module with different fields. Otherwise, the back end would have to build a very complex system to feed the same commodity data into different modules while accommodating each module's own field definitions, and back-end colleagues usually do not want to do this either. Behind these designs is the expectation that developers write modules as far away from the business scenario as possible, with as little interaction with specific back-end interfaces as possible, and as pure rendering components, so that modules can be reused across scenarios. Here is an example of schema.json:

{
  "type": "object",
  "properties": {
    "$attr": {
      "type": "object",
      "properties": {
        "hidden": { "type": "boolean" }
      }
    },
    "$theme": {
      "type": "object",
      "properties": {
        "themeColor": { "type": "string" }
      }
    },
    "items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "itemId": { "type": "string" }
        }
      }
    }
  }
}
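
To make the role of schema.json more concrete, here is a minimal sketch of checking mock data against such a description before handing it to a module. It is not a JSON Schema implementation: it only understands the `type`, `properties`, and `items` keywords used in the example, and the `validate` helper is a name invented for this illustration. A real project would more likely use an existing JSON Schema validator.

```javascript
// The same description as the schema.json example, inlined for the sketch.
const schema = {
  type: "object",
  properties: {
    $attr: { type: "object", properties: { hidden: { type: "boolean" } } },
    $theme: { type: "object", properties: { themeColor: { type: "string" } } },
    items: {
      type: "array",
      items: { type: "object", properties: { itemId: { type: "string" } } },
    },
  },
};

// Hypothetical helper: supports only `type`, `properties` and `items`.
// Returns a list of human-readable errors (empty when the data conforms).
function validate(schema, value, path = "data") {
  const errors = [];
  const actual = Array.isArray(value) ? "array" : value === null ? "null" : typeof value;
  if (schema.type && schema.type !== actual) {
    errors.push(`${path}: expected ${schema.type}, got ${actual}`);
    return errors;
  }
  if (actual === "object" && schema.properties) {
    for (const [key, sub] of Object.entries(schema.properties)) {
      // Properties are optional in this sketch; only present ones are checked.
      if (key in value) errors.push(...validate(sub, value[key], `${path}.${key}`));
    }
  }
  if (actual === "array" && schema.items) {
    value.forEach((item, i) => errors.push(...validate(schema.items, item, `${path}[${i}]`)));
  }
  return errors;
}

const mock = { $attr: { hidden: false }, $theme: { themeColor: "#fff" }, items: [{ itemId: "1001" }] };
console.log(validate(schema, mock)); // no errors for conforming mock data
```

Because the module only ever sees data that passed this kind of check, it can stay a pure rendering component and does not need to know which back-end interface produced the data.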

The logic related to the business scenario is handled at the page level, and different pages can share a set of page initialization logic.

Data standardization

Now for standardized data. Because of the particular nature of building, the data behind a module is not just a simple form. Especially now that personalization (a different page for every visitor) is widespread, most modules are backed not only by static data but also by dynamic data services, and these interfaces may come from many different systems across the company. For module developers, which interface should the data description be defined against? Even for the same commodity interface, what should be done when application A returns fields in snake_case and application B returns them in camelCase?

Data standardization solves this problem: modules should be developed against one standard data format. This standard data is the result of applying a unified naming convention to the lowest-level systems, such as the commodity library and the user library. Everyone follows the specification when passing data to a module.

But it is more complicated than that. For example, a module may contain a line of text that in some cases shows the product title and in other cases shows marketing copy written by the merchant, so the UI itself is ambiguous. In this case, we define a field called title in the data description, and whether title maps to itemTitle or itemDescription depends on the actual scene.

Finally, a domain-oriented data interface may be standardized twice before it reaches the front end for rendering. The first is standardization of the domain model, which ensures that fields are unambiguous. The second is standardization of the view object (VO), which maps the domain model, based on the needs of the view, to module display fields that may be ambiguous.

Usually we ask back-end colleagues to standardize the domain model, and the front end standardizes the view model in FaaS or maintains a gateway-like application. The front end can also map directly from the data source to the UI model, but that works at a different level of abstraction.
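
The two standardization steps can be sketched as two small functions. This is only an illustration under assumptions: the function names, the `scene` parameter, and its `"marketing"` value are invented here, while `title`, `itemTitle`, and `itemDescription` come from the example in the text.

```javascript
// Step 1 (domain model): unify field naming so that every source looks the same.
// Here snake_case keys are converted to camelCase.
function toCamelCase(key) {
  return key.replace(/_([a-z])/g, (_, c) => c.toUpperCase());
}

function normalizeDomain(record) {
  return Object.fromEntries(
    Object.entries(record).map(([key, value]) => [toCamelCase(key), value])
  );
}

// Step 2 (view model): map unambiguous domain fields onto the module's
// possibly ambiguous display fields. `scene` is a hypothetical parameter.
function toViewModel(item, scene) {
  return {
    itemId: item.itemId,
    title: scene === "marketing" ? item.itemDescription : item.itemTitle,
  };
}

// Application A returns snake_case, application B returns camelCase;
// after step 1 both have the same shape.
const fromA = normalizeDomain({ item_id: "1", item_title: "Shoes", item_description: "50% off today" });
const fromB = normalizeDomain({ itemId: "1", itemTitle: "Shoes", itemDescription: "50% off today" });

console.log(toViewModel(fromA, "marketing").title); // "50% off today"
console.log(toViewModel(fromB, "default").title); // "Shoes"
```

Keeping the two steps separate means the first one can live with the back end, and the second one in FaaS or a gateway close to the view, as described above.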

How to write a module

As mentioned earlier, we prefer modules that only render and are bound as little as possible to specific scenarios. Simplified, module development comes down to:

  1. Define the data format the module needs
  2. Prepare a piece of mock data
  3. Write a piece of logic that takes the mock data as input and returns the render result

This sounds a lot like the traditional definition of a function. Here is an example of a Rax module:

import { createElement } from 'rax';
import View from 'rax-view';
import Text from 'rax-text';

export default function Mod(props) {
  // Fallbacks used when the page passes no $theme or $attr.
  const defaultTheme = { themeColor: '#fff' };
  const defaultAttr = { hidden: false };
  const {
    items = [],
    $theme: { themeColor } = defaultTheme,
    $attr: { hidden } = defaultAttr,
  } = props.data;

  return (
    <View className="mod" style={{ backgroundColor: themeColor }}>
      {/* hidden is a boolean in schema.json, so test it directly */}
      {!hidden ? <Text>Welcome to tianma module!</Text> : null}
      <View className="keys">
        {items.map(element => (
          <Text>{element.key}</Text>
        ))}
      </View>
    </View>
  );
}

Of course, some modules are more complicated, such as those with liking or following, which cannot be solved by rendering alone. On the one hand, this part can be componentized, so that the module developer does not need to care about this logic or only needs to pass some user information to the component. On the other hand, if a like interface has N implementations, is that not a problem of back-end service design? Would it be better to push for unification?

Module development link design

The module development link is not much different from developing an ordinary NPM package. Based on a number of conventions, we provide convenient scaffolding as well as a pluggable builder that handles the build for several different, but limited, module DSLs.

At the same time, because developers include ISVs, outsourced staff, and internal staff, visual development is still very important. Visual development tries to smooth out the differences from developing an NPM module. We also provide capabilities including native module management, debugging, preview, a schema editor, and release processes such as code scanning and resource storage.

How to run modules

The server side

Because the server side is also a target running environment for modules, modules currently run on the server in two main ways: the old-school pure template rendering (which requires developers to write a separate template file to generate HTML) and the increasingly popular SSR. The former is simple and deterministic, while the latter is future-oriented but must be stable enough.

The client

As mentioned earlier, to keep module development simple and modules reusable, the page level has to take on more work, including data requests and page container initialization. Data request logic is a core part of page logic, and it now involves data-driven UI presentation, in particular interface merging, split-screen paging, and disaster recovery, which are all very important functions. Split-screen paging determines what content is displayed and what data is requested for the first screen. Interface merging reduces the number of requests and speeds up first-screen display. Disaster recovery ensures that content can still be shown to users when network or service problems occur. Page container rendering mainly includes initializing the scroll container and rendering the multi-dimensional module list, for example rendering a one-dimensional list of modules as a multi-tab parent-child structure, and finally initializing each module individually.
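
The three page-level responsibilities named above (split-screen paging, interface merging, disaster recovery) can be sketched as three pure functions. This is a simplified illustration, not Tianma's page framework: the function names, the `needsData` flag, and the idea of a `backup` snapshot are assumptions.

```javascript
// Split-screen paging: cut the flat module list into screens of a fixed size.
function splitScreens(modules, perScreen) {
  const screens = [];
  for (let i = 0; i < modules.length; i += perScreen) {
    screens.push(modules.slice(i, i + perScreen));
  }
  return screens;
}

// Interface merging: describe the data needs of one screen as a single request
// instead of one request per module. `needsData` is a hypothetical flag.
function mergeRequest(screen) {
  return { moduleIds: screen.filter((m) => m.needsData).map((m) => m.id) };
}

// Disaster recovery: prefer the live response, fall back to a backup snapshot,
// and use null when neither has data so the module can render a placeholder.
function resolveData(moduleIds, response, backup) {
  return Object.fromEntries(
    moduleIds.map((id) => [id, response?.[id] ?? backup[id] ?? null])
  );
}

const modules = [
  { id: "banner", needsData: false },
  { id: "items", needsData: true },
  { id: "coupon", needsData: true },
];
const [firstScreen] = splitScreens(modules, 2);
const request = mergeRequest(firstScreen); // only the first screen is requested up front
console.log(resolveData(request.moduleIds, null, { items: [{ itemId: "1001" }] }));
```

In the last line the live response is missing (`null`), so the first screen is still filled from the backup snapshot.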

The core design of the build – dependency deduplication

Most of the above is still fairly similar to building for mid- and back-office applications, with a few extra conventions. What follows is where Tianma's design differs. Mid- and back-office building usually only requires a schema description on top of an NPM package component, which is built into the page bundle at release time. That does not work on the consumer side, where each module currently needs to be built separately. Why?

Background

  1. In one campaign, about 100+ modules were used to build 1000+ pages. If a feature requires a few modules to be upgraded within a short time and every page has to be rebuilt for the change to take effect, rebuilding that many pages in a short period is not feasible, especially since complex Webpack builds take a long time.
  2. With personalization widespread, page display is driven by data. With a traditional build, it is impossible to load exactly the modules of the first screen, because which modules appear on the first screen is determined by data rather than by the bundle.

Data-driven presentation



Because the smallest unit of building is the module and the business has many dynamic requirements (for example, 5 modules across 1000 pages must be upgraded at 10 o'clock on a given day), rebuilding and republishing those 1000 pages is not feasible. Module assembly is therefore done by the online rendering service, which computes the assets combo URI. Once an operator clicks the module upgrade in the operations back end, the 1000 pages pick up the new module version automatically, without going through the build again. This also means that each module needs to be packaged separately into a web version that can run directly in the browser.

But because each module is packaged separately, doing nothing else would cause dependencies to be loaded repeatedly. To keep page size under control despite modules sharing dependencies, a seed is introduced to describe the dependency relationships.

{
  "modules": {
    "@ali/pmod-ark-butian-test/index": {
      "requires": [
        "@ali/rax-pkg-rax/index",
        "@ali/rax-pkg-rax-view/index",
        "@ali/rax-pkg-rax-text/index"
      ]
    }
  },
  "packages": {
    "@ali/rax-pkg-rax": { "path": "//g.alicdn.com/rax-pkg/rax/1.0.15/" },
    "@ali/rax-pkg-rax-view": { "path": "//g.alicdn.com/rax-pkg/rax-view/1.0.1/" },
    "@ali/rax-pkg-rax-text": { "path": "//g.alicdn.com/rax-pkg/rax-text/1.0.2/" },
    "@ali/pmod-module-test": { "path": "//g.alicdn.com/pmod/module-test/0.0.9/" }
  }
}

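As a rough sketch of how a seed like this could drive deduplication, the code below walks the `requires` lists of several page modules, visits each dependency once, and joins the resulting files into one combo URL. Several things here are assumptions rather than Tianma's documented behavior: the `??`-style combo URL format, the rule that a module name `<package>/<file>` resolves to `<path><file>.js`, and the sample package names `@ali/pmod-a` and `@ali/pmod-b`.

```javascript
// Hypothetical seed for a page with two modules that share dependencies.
const seed = {
  modules: {
    "@ali/pmod-a/index": { requires: ["@ali/rax-pkg-rax/index"] },
    "@ali/pmod-b/index": { requires: ["@ali/rax-pkg-rax/index", "@ali/rax-pkg-rax-view/index"] },
  },
  packages: {
    "@ali/rax-pkg-rax": { path: "//g.alicdn.com/rax-pkg/rax/1.0.15/" },
    "@ali/rax-pkg-rax-view": { path: "//g.alicdn.com/rax-pkg/rax-view/1.0.1/" },
    "@ali/pmod-a": { path: "//g.alicdn.com/pmod/a/0.0.1/" },
    "@ali/pmod-b": { path: "//g.alicdn.com/pmod/b/0.0.2/" },
  },
};

// Depth-first walk that lists dependencies before dependents, each only once.
function collect(seed, entries) {
  const seen = new Set();
  const ordered = [];
  const visit = (name) => {
    if (seen.has(name)) return;
    seen.add(name);
    const requires = seed.modules[name]?.requires ?? [];
    requires.forEach(visit);
    ordered.push(name);
  };
  entries.forEach(visit);
  return ordered;
}

// Assumed resolution rule: "<package>/<file>" maps to "<package path><file>.js".
function resolveFile(seed, name) {
  const pkg = Object.keys(seed.packages).find((p) => name.startsWith(p + "/"));
  return pkg ? seed.packages[pkg].path + name.slice(pkg.length + 1) + ".js" : null;
}

// Assumed combo format: "<host>??file1,file2,...".
function comboUrl(seed, entries, host = "//g.alicdn.com/") {
  const files = collect(seed, entries)
    .map((name) => resolveFile(seed, name))
    .filter((file) => file && file.startsWith(host))
    .map((file) => file.slice(host.length));
  return host + "??" + files.join(",");
}

console.log(comboUrl(seed, ["@ali/pmod-a/index", "@ali/pmod-b/index"]));
```

Although both modules require Rax, it appears only once in the resulting URL, which is the point of describing dependencies outside the module bundles.
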
A description alone is not enough. The core question is what strategy to apply when modules depend on different versions of the same NPM package. NPM installs the highest compatible version and installs multiple copies when versions are incompatible or explicitly pinned. The strategy on the Web is similar, but because internal development is more controlled, it has been simplified further (taking version X.Y.Z as an example):

  1. Different major (X) versions can coexist (this can optionally be disallowed)
  2. Changes in the Y and Z positions are treated as forward compatible, and the latest compatible version is used automatically, even if a version is specified.

From the perspective of the Web and the user side, loading many different versions of the same component only bloats the page, wastes bandwidth and traffic, and degrades the user experience. In essence, the dependency management that Webpack used to do for developers inside each bundle is abstracted out and managed uniformly at the page level.
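
The simplified version strategy can be sketched as follows. The sketch assumes plain numeric `X.Y.Z` version strings and picks the highest version among those requested by the modules on a page. Whether the real system instead picks the latest published version is not stated in the text, so treat that detail, and the function names, as assumptions.

```javascript
function parseVersion(version) {
  const [x, y, z] = version.split(".").map(Number);
  return { x, y, z, raw: version };
}

// For one package, keep a single version per major (X):
// the one with the highest Y, then the highest Z.
function resolveVersions(requested) {
  const latestByMajor = new Map();
  for (const v of requested.map(parseVersion)) {
    const current = latestByMajor.get(v.x);
    if (!current || v.y > current.y || (v.y === current.y && v.z > current.z)) {
      latestByMajor.set(v.x, v);
    }
  }
  return [...latestByMajor.values()].sort((a, b) => a.x - b.x).map((v) => v.raw);
}

// Four modules on one page ask for four different versions of the same package.
console.log(resolveVersions(["1.0.15", "1.2.0", "1.0.9", "2.0.1"])); // ["1.2.0", "2.0.1"]
```

Four requested versions collapse to two loaded copies: one per major version, which matches rule 1 (majors may coexist) and rule 2 (within a major, the latest wins even when an older version was specified).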

Seed mechanism and Webpack

Problems with seed.json dependencies

1. The cost of understanding and building proprietary implementations

The core problem with Tianma's current seed configuration is that it is a very proprietary implementation: every component that enters Tianma's architecture and takes part in deduplication has to be rebuilt to generate a seed. The problem becomes apparent when new businesses onboard to Tianma, because page-level deduplication is based on the seed configuration and the original source of the seed is the seed.json inside each component. Dependencies can be collected only after the package.json dependencies of a component have been converted into seed.json. This is why components need to be registered once on Tianma, which is the process of generating the corresponding seed. Currently this is done by manually importing components and submitting a copy to Tianma. In the future, the Tianma Module Center will provide automatic registration. As for why package.json dependencies are not used directly to generate the seed, the main reasons are:

  1. The dependencies declared in package.json are not necessarily used; the actual import references need to be read
  2. The dependencies declared in package.json are not necessarily public dependencies; internal dependencies are simply bundled in, otherwise the seed itself would become very large
  3. A public component needs to be published to a CDN and converted to a web-ready format such as CMD/UMD/AMD, which itself requires a build step.

2. Dependency on a complex loader

Because the seed file contains a lot of relationship descriptions, a relatively complex loader is needed to parse it. We tried extending SystemJS for this, but it was still not as dynamic as the original self-developed loader. Those interested can also refer to the loader implementation of KISSY 3.

Possible problems with Webpack Module Federation

1. Organization of HTML

Based on the seed JSON format, Wormhole integrates the ability to generate HTML from seed.json. Developers do not need to worry about how the HTML is generated, and script order is not that important because the loader guarantees it. With Module Federation, HTML still relies on a page-level build. If HTML needs to be assembled dynamically in a similar building scenario, either all remoteEntry files have to be loaded or a dependency description has to be generated for the server. SSR can of course be used directly, but that requires another configuration to package an SSR version of the code, since on-demand loading matters much less on the server. SSR will also most likely cover only part of the pages, so the remaining HTML still faces the problem of how to organize resource references.

2. Redundant code

Because Module Federation packages dependency handling into the remoteEntry file as code, there is bound to be a lot of repeated code. For example, a number of functions are attached to __webpack_require__, so remoteEntry files carry too much. That said, Tianma's current dependency handling has a similar problem: for historical reasons, Feloader supports various module formats such as KMD/CMD/AMD and is also somewhat too large.

3. CDN combo

The dependencies described in seed.json can be quickly resolved into a combo URL, in which the dependent scripts are merged into a single request. Module Federation currently handles each remoteEntry separately. Compared with SystemJS or other Webpack plugins it can handle deep dependencies (dependencies of dependencies are not loaded again), but it lacks a merge capability, so dependencies may be loaded serially. This problem is not hard to solve, however, by extending the Trunk Loader to merge the handling of the dependent components' promise functions.

The core design of the build – rendering service

The sections above are mostly about building materials. Tianma also provides a general-purpose Node.js online rendering engine as a unified rendering service: any product built by Tianma can be consumed by the rendering service, which renders the final result for the visiting user. Template rendering itself is nothing special, so the following focuses on what is different.

Multi-terminal caching

For Alibaba's high-traffic scenarios, we designed a multi-terminal cache scheme:



Modules are cross-terminal, so pages are cross-terminal as well. This requires a unified terminal identification architecture. At present, pages produced by building are hosted behind a cache plus origin architecture, and each terminal has its own cached copy, which avoids re-rendering on every visit.
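
The idea of one cached copy per terminal can be illustrated with an in-memory sketch. The real system sits behind a CDN cache with an origin, which this example does not model. The user-agent rules in `detectTerminal`, the terminal names, and `createRenderCache` are all assumptions for illustration.

```javascript
// Hypothetical terminal identification based on the user agent string.
function detectTerminal(userAgent) {
  if (/Weex/i.test(userAgent)) return "weex";
  if (/MiniProgram/i.test(userAgent)) return "miniapp";
  if (/Mobile|Android|iPhone/i.test(userAgent)) return "wireless-web";
  return "desktop";
}

// Cache keyed by terminal and path: the same address has one cached copy
// per terminal, and `render` (the origin) is only called on a cache miss.
function createRenderCache(render) {
  const cache = new Map();
  return function get(path, userAgent) {
    const terminal = detectTerminal(userAgent);
    const key = `${terminal}:${path}`;
    if (!cache.has(key)) cache.set(key, render(path, terminal));
    return cache.get(key);
  };
}

let renderCount = 0;
const getPage = createRenderCache((path, terminal) => {
  renderCount += 1;
  return `<!-- ${terminal} --> page for ${path}`;
});

const iphone = "Mozilla/5.0 (iPhone; CPU iPhone OS 13_0 like Mac OS X) Mobile/15E148 Safari/604.1";
const desktop = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15) Chrome/80.0 Safari/537.36";
getPage("/campaign", iphone);
getPage("/campaign", iphone); // served from the wireless-web copy
getPage("/campaign", desktop); // separate copy for the desktop terminal
console.log(renderCount); // 2
```

The same address is rendered once per terminal rather than once per visit, which is the property the multi-terminal cache relies on.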

With this architecture, operations staff only need to distribute a single address or QR code, and it is displayed differently on different terminals.

High performance guarantee

While supporting cross-terminal rendering, the rendering engine also takes on a responsibility similar to Webpack's HtmlWebpackPlugin. What the building system publishes is a description containing the page structure and dependency relationships, and the rendering engine uses this description to render HTML, WEEX bundles, and so on. This process takes some time, mainly in the following two parts:

  1. Pulling module resource files remotely from OSS
  2. Calculating the final dependencies for the entire page

Because the rendering service is online and renders in real time behind the CDN cache architecture (the CDN goes back to the origin to refresh at regular intervals), it would be unacceptable for it to be as slow as a Webpack build. A lot of work has therefore been done here, including caching dependency calculations and caching files, and the CDN cache further improves overall access speed.



To improve the user access experience, the rendering engine is deployed more widely than the building service. In international scenarios in particular, the rendering engine has been deployed to Asia, Europe, and the United States, with OSS file synchronization optimized for these regions.

Future

Seed system and Webpack long-term integration solution

Tianma's unique seed dependency mechanism does not hide dependencies the way Webpack does, so it has a learning cost and is prone to problems (Webpack has its own complexity and learning cost too, of course). For scenarios where the number of pages is small and there are no Taobao-style extreme update demands, Tianma also supports an offline build, because a module is itself a standard NPM package. This approach is closer to developing a React app from source: more build-time optimization is possible, the product leaves the seed system, and the rendering process is simplified. In the long term, as Webpack itself evolves, we expect to eventually converge on community solutions and strike a reasonable balance between page building and dynamic capability.

Dynamic

For the Taobao ecosystem, dynamic capability is always important. My own idea is this: if we could publish source files directly to the CDN without bundling modules, treat the CDN as a directory, and run something like Webpack directly in the browser, we could keep the dynamic capability without additional complexity (the complexity would live entirely in the solution itself, and developers would not need to know much about it).





Why is this only a vision? To get there, we still have to face some problems:

  1. Whether the browser is powerful enough to do this compilation. Compiling with Webpack is already time-consuming, so pushing it to the user side is a bit scary.
  2. Although many caching mechanisms can be designed to offset the latency of a remote file system over the network, the first visit remains a problem. In addition, building a file cache that bypasses the browser's own caching mechanism still has many problems, especially in a wireless app WebView, where space is very limited.
  3. The complexity of package management. The current seed mechanism already does something similar, so this would not change much in the way packages are managed.

Zero development

Looking ahead, if the code developers write can run more naturally and simply, the cost of understanding and maintenance drops a lot. Whether it is better developer tooling or a friendlier module design, the goal is to improve the developer experience. But what one team can do is always limited. Tianma is currently cooperating more with imgcook and iceluna, combining visual building with intelligent code generation, so that developers can focus on maintaining mechanisms and engineering, systematically improving the user experience and enriching business scenarios, rather than sinking into endless page production.

Other

As a building service, Tianma is closely bound to Alibaba's business. The content above covers only part of Tianma, and much of the rest is not yet suitable for external sharing. In the future, we hope to have more channels, such as open source and cloud, to share Tianma's services and the thinking behind them. We also welcome anyone who is interested in building or has ideas to exchange more with us. For those interested in joining the construction of Tianma, the contact channel is below. We also have an exchange group for the building direction.

  1. Resume email: [email protected]