GrapeCity official website: GrapeCity provides developers with professional development tools, solutions, and services.

It is a timeless truth that the times make heroes. In the current era, online documents can be called such “heroes”.

Since the outbreak of COVID-19 in 2020, telecommuting has completely overturned the traditional enterprise management model, and online documents, as an important part of telecommuting software, have also witnessed rapid development.

Today, even though the market already offers Tencent Docs, Graphite Docs, Feishu, Yuque, and other online office products, online documents still face tests in functionality, technology, data security, service, and ecosystem, for example in data processing efficiency, collaboration, secondary-development frameworks, system integration, and compatibility.

From a technical point of view, online access, data processing, and multi-person collaboration are the most critical technical indicators when developing an online document system. Mature technical solutions already exist for online access and data processing, so they are not difficult to implement. Multi-person collaboration is therefore the core factor that affects the usability of online document systems.

What is multi-person collaboration?

Multi-person collaboration means that several people edit the same document at the same time, and each user sees the changes made by others without refreshing the page. Google Docs, Tencent Docs, Graphite Docs, Quip, and similar products all have multi-person collaboration capabilities.

So, how does multi-person collaboration work?

For any information to be edited and displayed by multiple people in real time, the following three steps are required:

  • Operationalization
  • Transfer
  • Restoration

These three steps resemble an encode/decode process. Information is first converted into a set of operations, the operations are then transmitted to other terminals over the network, and finally the operations are restored to information on the receiving terminal.
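The three steps can be illustrated with a minimal sketch. It assumes plain-text content and a deliberately naive diff (common prefix and suffix only). The function names `operationalize`, `transmit`, and `restore`, the retain/delete/insert vocabulary, and the JSON wire format are illustrative assumptions, not taken from any particular product.

```python
import json


def operationalize(old: str, new: str) -> list:
    """Step 1: express the change from `old` to `new` as a list of operations."""
    shortest = min(len(old), len(new))
    # Length of the common prefix.
    p = 0
    while p < shortest and old[p] == new[p]:
        p += 1
    # Length of the common suffix that does not overlap the prefix.
    s = 0
    while s < shortest - p and old[len(old) - 1 - s] == new[len(new) - 1 - s]:
        s += 1
    removed = old[p:len(old) - s]
    added = new[p:len(new) - s]
    ops = []
    if p:
        ops.append({"retain": p})
    if removed:
        ops.append({"delete": removed})
    if added:
        ops.append({"insert": added})
    if s:
        ops.append({"retain": s})
    return ops


def transmit(ops: list) -> str:
    """Step 2: serialize the operations (stands in for sending them over a network)."""
    return json.dumps(ops)


def restore(doc: str, payload: str) -> str:
    """Step 3: rebuild the new content by replaying the operations on a local copy."""
    out, pos = [], 0
    for op in json.loads(payload):
        if "retain" in op:
            out.append(doc[pos:pos + op["retain"]])
            pos += op["retain"]
        elif "delete" in op:
            pos += len(op["delete"])
        else:
            out.append(op["insert"])
    out.append(doc[pos:])
    return "".join(out)
```

A receiving terminal that holds the old text only needs the payload, not the full new document, to arrive at the same content.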

These steps may seem simple, but each one requires careful thought. Operationalization alone raises several questions: how to ensure that every change to the information can be decomposed into the operation set, how to make the operations cover all possible changes, and how to choose the granularity when information is split and recombined.

Transfer needs to consider the following points:

  1. Transmission content
     A. Original text: 1) clarity 2) redundancy
     B. Compression technology: 1) logical compression 2) protocol compression 3) manual compression
  2. Network protocols
     A. Socket: 1) TCP 2) UDP
     B. HTTP
     C. WebSocket
  3. Quality of Service (QoS)
     A. Fail fast
     B. Automatic rollback
     C. Automatic reconnection
     D. Automatic recovery
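As a rough illustration of the transmission-content item, the sketch below shrinks a batch of operations in two ways before sending it. It assumes that "logical compression" means merging adjacent operations and that protocol-level compression can be stood in for by `zlib`; both readings are assumptions, and the operation shape (`type`, `pos`, `text`) and the function names are hypothetical.

```python
import json
import zlib


def merge_adjacent_inserts(ops: list) -> list:
    """Logical compression: fold consecutive inserts that continue each other."""
    merged = []
    for op in ops:
        last = merged[-1] if merged else None
        if (
            last is not None
            and last["type"] == "insert"
            and op["type"] == "insert"
            and op["pos"] == last["pos"] + len(last["text"])
        ):
            last["text"] += op["text"]
        else:
            merged.append(dict(op))  # copy so the caller's list is not mutated
    return merged


def encode_batch(ops: list) -> bytes:
    """Serialize compactly, then compress the bytes."""
    raw = json.dumps(ops, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw)


def decode_batch(payload: bytes) -> list:
    """Inverse of encode_batch."""
    return json.loads(zlib.decompress(payload).decode("utf-8"))


# Typing "abc" one character at a time starting at position 5.
typed = [{"type": "insert", "pos": 5 + i, "text": ch} for i, ch in enumerate("abc")]
compact = merge_adjacent_inserts(typed)
payload = encode_batch(compact)
```

Merging first reduces the number of operations the receiver has to replay; byte-level compression then reduces what goes over the wire.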

Restoration mainly involves:

  1. Restoration of absolute operations
     A. Control the volume
     B. Reasonable prompts
  2. Restoration of relative operations
     A. Strict ordering
     B. Ordering remedies
  3. Restoration of local operations
     A. Filter the received operation set
     B. Refine operation granularity at the source
     C. Save locally and execute locally
  4. Non-intrusive restoration
     A. Define intrusion
     B. Exclude intrusion
     C. A different view for each user (“a thousand faces”)
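Two of the restoration concerns, strict ordering of relative operations and filtering operations that were already executed locally, can be sketched with a small reorder buffer. This assumes the server assigns each operation a gap-free sequence number starting at 1; the class name `OrderedApplier` and its parameters are hypothetical.

```python
class OrderedApplier:
    """Applies remote operations strictly in sequence order."""

    def __init__(self, client_id, apply_fn):
        self.client_id = client_id   # identifies operations that originated here
        self.apply_fn = apply_fn     # callback that applies one operation
        self.next_seq = 1            # next sequence number we are allowed to apply
        self.pending = {}            # seq -> (origin, op) that arrived too early

    def receive(self, seq, origin, op):
        # Drop duplicates, for example after a reconnect and resend.
        if seq < self.next_seq or seq in self.pending:
            return
        self.pending[seq] = (origin, op)
        # Drain every operation that is now contiguous with what was applied.
        while self.next_seq in self.pending:
            op_origin, ready = self.pending.pop(self.next_seq)
            # Our own operations were executed locally already, so skip them.
            if op_origin != self.client_id:
                self.apply_fn(ready)
            self.next_seq += 1
```

An operation that arrives early simply waits in `pending` until the missing ones show up, which is one simple remedy for out-of-order delivery.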

Now that we know the basics of multi-person collaboration, let’s look at the technical difficulties.

What are the technical challenges of multi-person collaboration?

The essence of multi-person collaboration is multi-leader replication in a distributed system: any client can be regarded as a data leader, and data synchronization between these leaders inevitably runs into out-of-order delivery and conflicts. This is the main difficulty of multi-person collaboration.

For conflicts in multi-leader replication, there are the following solutions:

  • Avoid conflicts, that is, don’t let multiple users edit the same place at the same time. This solution is simple and blunt, so check whether the product form suits it.
  • Expose conflicts to users and let them resolve them. Most professional version control software currently uses this approach, but it is not suitable for products with a large number of non-professional users, such as online documents.
  • Mark each write operation with a global index, which can be a timestamp or a sequence number. The index must be global and monotonically increasing. In any conflict, the write with the higher index wins. The advantage of this approach is that conflict resolution is fully automatic and needs no user involvement. The disadvantage is that with long synchronization intervals, a lot of user input is lost.
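The third approach, keeping the write with the higher global index, can be sketched as a last-write-wins cell. This is a minimal illustration that assumes indexes are already globally ordered; the names `Write` and `LastWriteWinsCell` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Write:
    index: int   # global, increasing index (timestamp or sequence number)
    value: str


class LastWriteWinsCell:
    """Keeps only the write with the highest index seen so far."""

    def __init__(self):
        self.current: Optional[Write] = None

    def apply(self, write: Write) -> bool:
        """Return True if the write was accepted, False if it was discarded."""
        if self.current is None or write.index > self.current.index:
            self.current = write
            return True
        return False


cell = LastWriteWinsCell()
accepted_late = cell.apply(Write(index=2, value="from Bob"))
accepted_early = cell.apply(Write(index=1, value="from Alice"))  # arrives late
```

The write with index 1 is silently dropped, which shows the drawback noted above: the resolution is automatic, but one user’s input is lost.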

Operational Transformation (OT) is a common way to resolve conflicts in multi-person collaboration when developing online document systems. The technique dates from 1989. To give users an eventually consistent result, it expresses every change to text content as three types of operations:

  • retain(n): keep the next n characters unchanged
  • insert(str): insert the string str
  • delete(str): delete the string str

Once changes are expressed as these operations, the OT algorithm merges and transforms concurrent operations into a new operation stream and applies it to the historical version, achieving lock-free synchronous editing.
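To make the transformation step concrete, here is a minimal sketch restricted to two concurrent insertions. It uses position-based inserts rather than full retain/insert/delete sequences, and it breaks ties with a site id; both simplifications are assumptions for illustration, and the names `Insert`, `apply_insert`, and `transform_insert` are hypothetical. Production OT implementations handle many more cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # position in the document the author saw
    text: str   # text to insert
    site: int   # id of the originating client, used only to break ties


def apply_insert(doc: str, op: Insert) -> str:
    """Apply a single insertion to the document."""
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform_insert(op: Insert, other: Insert) -> Insert:
    """Rewrite `op` so that it can be applied after `other` has been applied."""
    other_goes_first = other.pos < op.pos or (
        other.pos == op.pos and other.site < op.site
    )
    if other_goes_first:
        # `other` added text before our position, so shift right by its length.
        return Insert(op.pos + len(other.text), op.text, op.site)
    return op


doc = "abc"
a = Insert(pos=1, text="X", site=1)   # made on client 1
b = Insert(pos=2, text="Y", site=2)   # made concurrently on client 2

# Client 1 applies its own edit, then the transformed remote edit.
on_client_1 = apply_insert(apply_insert(doc, a), transform_insert(b, a))
# Client 2 applies its own edit, then the transformed remote edit.
on_client_2 = apply_insert(apply_insert(doc, b), transform_insert(a, b))
```

Both clients apply the edits in a different order yet end with the same text, and neither had to lock the document.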

(figure) Operations being transformed by the OT algorithm (image source: en.wikipedia.org/wiki/Operat…).

The idea behind OT is actually simple: under specific conditions, transform each operation against the operations that happened concurrently. OT is mainly used for text, and its implementations are usually complex and hard to extend. For more advanced structures such as rich text editing, OT accepts that complexity in exchange for results that match what users expect, without hurting system performance. As a result, most real-time co-editing logic today is based on OT.

Because of this, OT has become one of the most important solutions for handling collaboration conflicts. However, even though it has been around for more than 30 years, its theory is still not well suited to distributed implementation, and developing a system that supports real-time collaborative editing by multiple people is far more complex than expected.

Where is the breakthrough for achieving multi-person collaboration?

This shows that a complex real-time collaborative editing system cannot rely on algorithmic logic alone. Depending on the business scenario (such as kanban boards, plain text editors, or undo/redo), teams must invest substantial research and development cost and time, and find the balance between product performance and ease of use through trial and error.

So, is there a simpler and quicker solution?

By analyzing the sample code of several online collaborative office products on the market, we found that, in addition to the OT algorithm described above, these products generally rely on third-party table components. With an embedded component, an online document system can support eventual consistency for multi-user collaboration, offer users a more usable and varied experience, and handle more complex calculations while lowering research and development costs, which greatly improves the efficiency of multi-user collaboration.

What capabilities should a table component have for multi-person collaboration?

The first is functional support for tables.

Because tables are far more sensitive to numeric values than other data types, using them as collaborative documents involves finer-grained operations and more complex calculations. The chosen component must therefore offer strong table functionality: not only data entry and data filling, but also statistics, calculated summaries, pivot analysis, and charting.

Second, there needs to be an open API for more customization options.

Such components need to provide rich events and application interfaces for controlling logic such as cell state, sheet protection, and data transfer and, in multi-person collaboration, for restricting users from editing the same content and for inserting timestamps (serialization).
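The kind of control logic described here, restricting simultaneous edits to the same cell and stamping each accepted edit, might look roughly like the sketch below. It is component-agnostic: `CellLockManager` and its methods are hypothetical names, not part of any real table component’s API, and the stamp is a simple in-process sequence number.

```python
import itertools


class CellLockManager:
    """Lets one user at a time edit a cell and stamps accepted edits."""

    def __init__(self):
        self._owners = {}                 # (row, col) -> user currently editing
        self._seq = itertools.count(1)    # increasing sequence numbers

    def acquire(self, user, cell) -> bool:
        """Try to start editing `cell`; False means someone else holds it."""
        owner = self._owners.setdefault(cell, user)
        return owner == user

    def release(self, user, cell) -> None:
        """Stop editing `cell` (only the current owner can release it)."""
        if self._owners.get(cell) == user:
            del self._owners[cell]

    def stamp_edit(self, user, cell, value) -> dict:
        """Return an edit record with a sequence number, if `user` owns `cell`."""
        if self._owners.get(cell) != user:
            raise PermissionError(f"{user} is not editing {cell}")
        return {"seq": next(self._seq), "user": user, "cell": cell, "value": value}
```

In a real integration, `acquire` and `release` would typically be wired to the component’s edit-start and edit-end events, and the stamped record would be what gets sent to other clients.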

 

Out of curiosity, I downloaded and tested several table components available online and found that SpreadJS stands out. It is positioned as an embeddable “online Excel” with a pure front-end architecture, so it can be embedded into a system under development without worrying about compatibility with the native system. It’s worth noting that SpreadJS uses sparse arrays: unlike traditional linked storage or dense array storage, a sparse array stores only non-empty data.

In addition to saving memory, sparse arrays make it easier to build row-indexed data dictionaries for loosely arranged data such as tables, so that a node at any level of the storage structure can be replaced or restored at any time. This lets SpreadJS perform efficient data rollback and recovery (undo/redo) in multi-person scenarios.

(figure) Sparse array of SpreadJS.
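The idea of sparse, row-indexed storage with cheap rollback can be sketched as follows. This is not SpreadJS’s actual implementation; it is a minimal illustration that assumes a dictionary of rows, each holding a dictionary of non-empty cells, and the class name `SparseSheet` is hypothetical.

```python
class SparseSheet:
    """Stores only non-empty cells: row index -> {column index -> value}."""

    def __init__(self):
        self.rows = {}
        self.undo_stack = []   # (row, col, previous value) for each change

    def get(self, row, col):
        """Return the cell value, or None if the cell is empty."""
        return self.rows.get(row, {}).get(col)

    def set(self, row, col, value):
        """Write a value (None clears the cell) and remember the old one."""
        self.undo_stack.append((row, col, self.get(row, col)))
        self._write(row, col, value)

    def undo(self) -> bool:
        """Restore the most recently changed cell; False if nothing to undo."""
        if not self.undo_stack:
            return False
        row, col, previous = self.undo_stack.pop()
        self._write(row, col, previous)
        return True

    def stored_cells(self) -> int:
        """Number of cells that actually occupy memory."""
        return sum(len(cols) for cols in self.rows.values())

    def _write(self, row, col, value):
        if value is None:
            cols = self.rows.get(row)
            if cols is not None:
                cols.pop(col, None)
                if not cols:
                    del self.rows[row]   # drop rows that became empty
        else:
            self.rows.setdefault(row, {})[col] = value
```

Writing to row 1,000,000 stores a single entry rather than allocating a million rows, and undo only has to put one node back in place.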

Conclusion

The pandemic has accelerated the digital transformation of enterprises. In the future, enterprise collaborative office products will evolve toward better usability, integration, and secondary-development capability, fitting closely with existing systems and business processes and matching the habits of end users.

How to break the technical barriers and develop online document products that can meet the needs of users in different scenarios and have market competitiveness and differentiation is the primary consideration of SaaS enterprises and system suppliers.

“With the help of a good wind, I am carried up to the blue clouds.” In today’s highly competitive online document field, besides investing heavily in independent research and development, learning to borrow strength to meet different business scenarios and customer needs may also be a good choice.