Technology leads innovation, and the “core” builds the ecosystem: the first session of the Loongson Ecosystem Forum opened on Friday, March 12, 2021. As an important window for technical exchange in building the Loongson ecosystem, the forum brings together senior technical experts and leading ecosystem partners, and will continue to share industry results and technology in order to help the Loongson ecosystem prosper and to promote technological innovation.

Application migration to the Loongson architecture is an important part of building ecosystem resources and has attracted much attention. A complete adaptation system has now been built on the Loongson architecture, covering operating systems, middleware and all kinds of application software. Input methods, as everyday software, have become an important part of this migration and adaptation work. The Sogou input method, known as the “national input method” and written in classic C/C++, has completed its migration and adaptation to the Loongson architecture and now serves party and government bodies, enterprises, colleges and universities, and many end users.

For this event we invited two speakers, a senior engineer from Loongson and Wang Yuhang, senior development engineer for the Sogou input method. Their talks covered the technical basics and case studies of migrating C/C++ applications to the Loongson architecture, and the migration of the Sogou input method as an enterprise application. Moving from theory to practice, they shared technical experience and analyzed the technical problems met during migration, giving us a look into application migration technology.

The following is a transcript of the event.

Sharing topic: Adaptation and application of the Sogou input method on the Loongson platform

Sharing guest: Wang Yuhang, senior development engineer at Sogou

He works in Sogou's Input Method Division, where he develops the cross-platform Sogou input method. He is keen on cross-platform technologies and committed to giving input method users the ultimate input experience.

The text of the talk is as follows.

1. Project background

This talk covers the adaptation and application of the Sogou input method on the Loongson platform. It is organized into four parts: project background and introduction, current adaptation results, the migration path, and follow-up plans.

Sogou was founded in 2003. Its number of monthly active users is second only to BAT (Baidu, Alibaba and Tencent), making it the fourth largest Internet company in China. Sogou has gone through six breakthroughs since its founding and has grown continuously, driven by a constant pursuit of breakthrough and innovation. In China, the Sogou input method is the first thing that comes to mind when input methods are mentioned. It captured the market rapidly after its release in 2006. At present, the Sogou input method ranks third in the app rankings with 820 million active users and an 86% market share.

The Sogou input method aims to give users the ultimate input experience, and each update brings something new. The Linux version was released in 2014, and iterations have accelerated since then. In January 2020, Sogou launched the new enterprise version 1.0, which went beyond the traditional x86-only support by also adapting to the ARM and MIPS architectures, and which supports Wubi.

2. Adaptation results

As for adaptation results on the Loongson platform, the Sogou input method has currently reached 100% adaptation for the mainstream software and hardware environments and the core input scenarios.

  • Operating systems: in the Xinchuang (IT application innovation) field, the mainstream UOS and Kylin operating systems are fully supported.
  • CPU architecture: the Loongson architecture is fully adapted.
  • Package formats: the two major installer package formats used by Linux distributions, RPM and DEB, are both fully supported.
  • Basic input: Pinyin, Wubi, English and less widely used languages are 100% supported.
  • Extended input: virtual keyboard input, voice input and the accompanying property settings have all been migrated and are 100% supported.

3. Sogou input method migration path

  • Migration criteria: migrate quickly and with good results.
  • Early stage: review the Sogou input method and design its platform architecture.
  • Middle stage: three steps (compilation, debugging and deployment), ending when the software deploys successfully.
  • Late stage: running successfully does not mean the migration has succeeded, because the Sogou input method aims to give users the ultimate input experience. After deployment, the fluency and performance of the input method must be further verified, debugged, tested and tuned to achieve the best results.

The early, middle and late stages of migration are introduced in turn below.

4. Problems and solutions encountered in the early phase of migration

In the early stage of migration, the input method software was designed in layers: platform-related functions were abstracted so that platform differences are hidden from the upper layers, which maximizes reuse of the input method's original logic.

Previously, the Sogou input method ran only on Windows, where it plugged into the TSF input method framework. The Loongson architecture runs only Linux, so the Sogou input method had to be adapted to Linux and integrated with Fcitx, the mainstream input method framework on Linux. To adapt to Fcitx and meet the migration goal of “fast speed and good effect”, the software architecture was redesigned: a platform-related Platform layer was abstracted out to shield the upper layers from platform differences, providing the technical basis for maximum reuse of the existing stable code.

The new input method software architecture is described below and shown in the diagram.

  1. Platform layer: abstracts and centralizes the platform-related parts, including UI drawing, the message loop, system information and other functions. It shields the differences between platforms and provides a unified interface to the layers above.
  2. Foundation layer: the basic functionality layer, which provides common functions and UI elements to the layers above. It is divided into CommonLib and Uilib. CommonLib, the general-purpose library, implements basic functions such as INI and XML parsing, encryption and decryption, containers, locks and authentication. Uilib, the UI library, provides the basic UI elements, including images, fonts, canvases, timers, windows and various controls.
  3. Input method logic layer: because the lower layers have already dealt with platform differences, this layer reuses a large amount of the Sogou input method's original, stable logic. It is divided into an input logic layer and a UI logic layer. The input logic layer covers voice input, physical keyboard input, handwriting input, virtual keyboard input, and input for less widely used languages and multiple languages. The UI logic layer mainly contains the windows of the Sogou input method UI, including the writing window, the soft keyboard and the status bar.
  4. SDK layer: the bridge between the input method and the system's input method framework. It shields the differences between input method frameworks, provides a stable, unified interface to the input method logic layer, enables heavy reuse of existing logic, and lays a solid foundation for a fast, high-quality migration.
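
The layering idea above can be illustrated with a minimal sketch. It is written in Python for brevity rather than in the C/C++ used by the product, and every name in it (Platform, LinuxPlatform, WindowsPlatform, CandidateWindow, draw_text) is hypothetical and not taken from Sogou's code. It only shows how upper-layer logic can stay unchanged when the platform layer underneath is swapped.

```python
from abc import ABC, abstractmethod
from typing import Sequence


class Platform(ABC):
    """Platform layer: the only code that knows which system it runs on."""

    @abstractmethod
    def draw_text(self, text: str) -> str:
        """Render text and return a description of what was drawn."""


class LinuxPlatform(Platform):
    def draw_text(self, text: str) -> str:
        # Stand-in for a real Linux drawing call.
        return f"[linux draw] {text}"


class WindowsPlatform(Platform):
    def draw_text(self, text: str) -> str:
        # Stand-in for a real Windows drawing call.
        return f"[windows draw] {text}"


class CandidateWindow:
    """UI logic layer: identical on every platform."""

    def __init__(self, platform: Platform) -> None:
        self._platform = platform

    def show(self, candidates: Sequence[str]) -> str:
        # Number the candidates and hand the finished line to the platform layer.
        line = " ".join(f"{i}.{c}" for i, c in enumerate(candidates, start=1))
        return self._platform.draw_text(line)


if __name__ == "__main__":
    for platform in (LinuxPlatform(), WindowsPlatform()):
        print(CandidateWindow(platform).show(["你好", "你"]))
```

In this sketch, porting to a new system means writing one more Platform subclass; CandidateWindow is not touched.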

The following describes how the Sogou input method, the Fcitx input method framework and the individual applications relate to one another as processes on Linux. See the figure below.

The Fcitx input method framework is at the lower right of the figure. It uses a client/server (C/S) architecture and runs as an independent process on the Linux system.

The left half of the figure shows the applications running on Linux, including GTK, Qt and X Window applications. Each application communicates with the Fcitx framework over the XIM protocol, for example to send keyboard key events. On receiving a key message, Fcitx calls the input method plug-in that corresponds to the currently selected input method. On Linux, therefore, the Sogou input method implements an Fcitx plug-in that runs inside the Fcitx framework.

When Fcitx calls the Sogou input method plug-in, the plug-in makes an inter-process call through a message queue (MessageQueue) to the Sogou input method service. The data carried by the message queue is serialized and deserialized with ProtoBuf, which handles packing the data. Once the service is called, the data is passed to the Sogou input method SDK and processed by the input logic the SDK contains. The result is returned to the Fcitx plug-in through the message queue, and Fcitx returns it to the application over the XIM protocol, completing the input process.
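
The round trip described above (plug-in, message queue, service, and back) can be sketched as follows. This is a simplified illustration under loud assumptions: two in-process queue.Queue objects stand in for the inter-process message queue, JSON stands in for ProtoBuf, and the message fields, the function names and the tiny lexicon are all hypothetical.

```python
import json
import queue

# Assumption: in-process queues replace the real inter-process message queue.
requests = queue.Queue()
replies = queue.Queue()

# Hypothetical toy lexicon standing in for the input logic inside the SDK.
LEXICON = {"nihao": ["你好", "你号"], "sogou": ["搜狗"]}


def encode(message: dict) -> bytes:
    """Serialize a message to bytes (JSON here; the talk names ProtoBuf)."""
    return json.dumps(message, ensure_ascii=False).encode("utf-8")


def decode(payload: bytes) -> dict:
    """Deserialize bytes back into a message."""
    return json.loads(payload.decode("utf-8"))


def plugin_send_keys(keys: str) -> None:
    """Plug-in side: forward the typed keys to the service."""
    requests.put(encode({"type": "key_input", "keys": keys}))


def service_handle_one() -> None:
    """Service side: take one request, look up candidates, send the reply."""
    request = decode(requests.get())
    candidates = LEXICON.get(request["keys"], [])
    replies.put(encode({"type": "candidates", "items": candidates}))


def plugin_receive_candidates() -> list:
    """Plug-in side: read the reply that goes back to the framework."""
    return decode(replies.get())["items"]


if __name__ == "__main__":
    plugin_send_keys("nihao")
    service_handle_one()
    print(plugin_receive_candidates())
```

Keeping serialization in one pair of functions means the wire format can be changed without touching the plug-in or the service logic.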

Once the overall framework design was complete, the actual migration work could begin; this is the middle stage of the migration. Different problems arose in each of its three steps (compilation, debugging and deployment), including a development environment that was complex to set up, coding and debugging tools that were inconvenient to use, and missing runtime dependencies.

Let us look at the actual problems met in the middle stage and how they were solved. Because the build has to be cross-platform, we chose CMake as the build tool. On some platforms, however, the CMake and GCC toolchains were missing, so we needed Loongson or the OS vendor to provide them.

For the coding step, our team standardized on a single coding tool. VS Code could not be used because its official website does not provide an installation package for the MIPS architecture, so Qt Creator, a lightweight IDE, was chosen instead. The main reason is that it is friendlier for developing and debugging Qt code: for commonly used Qt data structures such as QString and QRect, it can display their internal contents, which is more convenient than IDEs not made by Qt.

Runtime dependencies may be missing during deployment, so Loongson or the OS vendor needs to provide the corresponding runtime dependency packages.

5. Problems encountered in the migration process & solutions

  1. Path separators and case sensitivity. The path separator in the Windows file system is a backslash (\), while in Linux it is a forward slash (/). File path names are also case-sensitive in Linux file systems but not in Windows. For these reasons, every #include path in the input method's C++ code uses forward slashes, which prevents compilation errors on different systems.
  2. Uniform use of UTF-8 character encoding. UTF-8 is compatible with ASCII encoding for English. Because English characters are so common, UTF-8 saves space compared with UTF-16 and UTF-32. In addition, UTF-8 has no byte-order problem, whereas other multi-byte encodings produce different byte values in big-endian and little-endian form.
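
As an illustration of the first item, the helper below rewrites backslashes in #include paths as forward slashes. It is a hypothetical tooling sketch, not something described in the talk, and the function name and regular expression are assumptions. It does not detect case mismatches between an include path and the file on disk, which would need a separate check against the file system.

```python
import re

# Matches the opening of an #include directive, its path, and the closing
# quote or angle bracket.
INCLUDE_RE = re.compile(r'^(\s*#\s*include\s*["<])([^">]+)([">])')


def normalize_include(line: str) -> str:
    """Return the line with backslashes in an #include path turned into '/'."""
    match = INCLUDE_RE.match(line)
    if match is None:
        return line  # not an #include line: leave it untouched
    prefix, path, closer = match.groups()
    return prefix + path.replace("\\", "/") + closer + line[match.end():]


if __name__ == "__main__":
    print(normalize_include('#include "ui\\Window.h"'))
```

Lines that are not #include directives, such as string literals that legitimately contain backslashes, pass through unchanged.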
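
The UTF-8 properties in the second item can be checked with a few lines of Python. This is only a demonstration of the encoding facts; the helper name encoded_sizes is made up for the example.

```python
def encoded_sizes(text: str) -> dict:
    """Byte length of the same text in three Unicode encodings (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


if __name__ == "__main__":
    # English text: one byte per character in UTF-8.
    print(encoded_sizes("hello"))

    # ASCII compatibility: the UTF-8 bytes equal the ASCII bytes.
    print("A".encode("utf-8") == "A".encode("ascii"))

    # Byte order: UTF-16 has two possible byte sequences, UTF-8 only one.
    print("A".encode("utf-16-le"), "A".encode("utf-16-be"))
```

The space saving applies to ASCII-heavy content such as source code and configuration files. Common Chinese characters take three bytes each in UTF-8 and two in UTF-16, so the benefit depends on the text.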

6. Problems and solutions in the late stage of migration

After the middle stage of migration, the input method works normally, but working normally is far from enough. The Sogou input method aims to provide the ultimate input experience, so in the late stage it has to be tuned and debugged for fluency and user experience. Two late-stage issues are shared here.

  1. Handwriting input stutters: the more strokes are entered, the more obvious the stutter. Analysis of the handwriting input area (the red box in the figure) showed that the cause was the UI refresh logic. The original logic was a full refresh: when writing a phrase such as “Hello everyone”, the whole UI was refreshed for every stroke, so stuttering set in as the number of strokes grew. The solution was incremental refresh, which draws only the parts that have changed. As the figure shows, when the “child” component of the final character “good” in “Hello everyone” is written, only that component is drawn; earlier strokes and other controls are not redrawn. This greatly improves drawing efficiency and makes handwriting much more fluent.

  2. The property settings app starts slowly: the problem was located with a performance analysis tool, the flame graph. The analysis result is shown in the figure. The Y axis of a flame graph is the call stack (calls go from bottom to top), and the X axis records the sampling points. CPUs usually contain a performance monitoring unit (PMU) that records hardware usage while a program runs, and the more samples collected in a function, the longer that function ran. Note that the X axis is not time, only the number of samples. The blue part of the figure accounts for more than 40% of the app's startup time. Judging by its name, this function mainly creates the interface, so it was preliminarily determined that the startup bottleneck of the property settings app is interface creation. After evaluation, delayed (lazy) creation was adopted. The property settings UI is fairly complex: it has seven pages in total and the user can switch freely between them, but only one page is displayed at a time. Based on this usage pattern, the app creates only the page visible to the user at startup and then gradually creates the other pages in the background after it has started. This improves the user experience while keeping the functionality complete. After the improvement, startup efficiency increased by 40%.
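
To make the handwriting fix in the first item concrete, here is a minimal sketch that compares full refresh with incremental refresh by counting how many strokes get drawn. The HandwritingCanvas class and its draw counter are hypothetical; real rendering is replaced by a counter so that the difference in work is visible.

```python
class HandwritingCanvas:
    """Toy canvas that counts drawing work instead of rendering pixels."""

    def __init__(self, incremental: bool) -> None:
        self.incremental = incremental
        self.strokes = []
        self.draw_calls = 0  # total number of strokes drawn so far

    def _draw(self, stroke) -> None:
        self.draw_calls += 1  # stand-in for the real drawing cost

    def add_stroke(self, stroke) -> None:
        self.strokes.append(stroke)
        if self.incremental:
            # Incremental refresh: draw only what changed.
            self._draw(stroke)
        else:
            # Full refresh: redraw every stroke on every update.
            for existing in self.strokes:
                self._draw(existing)


if __name__ == "__main__":
    full = HandwritingCanvas(incremental=False)
    incr = HandwritingCanvas(incremental=True)
    for n in range(10):
        full.add_stroke(n)
        incr.add_stroke(n)
    print(full.draw_calls, incr.draw_calls)
```

With full refresh the total work grows quadratically with the number of strokes, while with incremental refresh it grows linearly, which matches the observation that stuttering worsened as more strokes were written.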
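
The delayed creation used for the property settings app in the second item can be sketched as follows. The class name, the page names and the build_page callback are all hypothetical; the sketch only shows the pattern of building the visible page at startup and the remaining pages later, one at a time.

```python
class SettingsApp:
    """Creates the first page at startup and the rest on demand or when idle."""

    # Assumption: seven placeholder page names, matching the count in the talk.
    PAGE_NAMES = [f"page{i}" for i in range(1, 8)]

    def __init__(self, build_page) -> None:
        self._build_page = build_page  # expensive page factory
        self._pages = {}
        self.show(self.PAGE_NAMES[0])  # only the visible page is built now

    def show(self, name):
        """Return the page, building it first if it does not exist yet."""
        if name not in self._pages:
            self._pages[name] = self._build_page(name)
        return self._pages[name]

    def create_one_in_background(self) -> bool:
        """Build one missing page; meant to be called from an idle handler."""
        for name in self.PAGE_NAMES:
            if name not in self._pages:
                self._pages[name] = self._build_page(name)
                return True
        return False  # nothing left to build


if __name__ == "__main__":
    built = []
    app = SettingsApp(lambda name: built.append(name) or name.upper())
    print(built)                       # pages built at startup
    while app.create_one_in_background():
        pass
    print(len(built))                  # pages built once idle work is done
```

Because show also builds a page on demand, a user who switches pages before the background work finishes still gets the page, at the cost of a short wait for that one page.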

The Sogou input method will not rest on its laurels, and there is more to come. In terms of product features, the company plans to add symbols, speech synthesis, handwriting support for more characters, and other functions. In terms of basic quality, it will keep working to improve input accuracy and the speed of speech and handwriting recognition, so that users enjoy an even better experience.

Q&A session:

Q1: Is a DEB package for the Loongson architecture available now?

A1: It is not available yet. Please stay tuned.

Q2: Does the Sogou input method support Arabic?

A2: At present, dozens of less widely used languages are supported. We would need to clarify whether the user is asking about handwriting input or voice input.

Q3: How is the conflict between the Sogou input method and JetBrains handled?

A3: At present, we are actively communicating with JetBrains about it.

Q4: What are the security aspects of the Sogou input method?

A4: The Sogou input method provides encryption and decryption and an authentication module, and its security is achieved through these technical means. The installation package is signed, and the signature is verified when the package is loaded. In addition, interface-level encryption and authentication are provided.

Q5: Why is it necessary to use CMake, GCC and external dependencies when migrating software from the PC side?

A5: The CMake build system is cross-platform. Visual C++ (VC) is normally used on Windows, but it cannot be used on non-Windows platforms. Given the cross-platform requirement, we adopted CMake, because it can generate Makefiles and other project files on Linux and can also generate VC projects on Windows, which makes the build tooling itself cross-platform.

Loongson technology blog: https://blog.csdn.net/loongnix?spm=1010.2135.3001.5343&type=blog

Loongson technology community: https://loongson.cloud.csdn.net/