Write a rich text editor from zero
In Writing a Rich Text Editor from Zero (Part 1), we implemented a very simple rich text editor. Its obvious problem was that the rich text content was never abstracted into data; in other words, it was not data-driven. An L1 editor is only partially data-driven because it still depends on contenteditable; only an L2 editor is truly data-driven.
This leads to the following problems:
- Development is not friendly. A developer who uses our editor still has to deal with a complex, ever-changing DOM. To change the rich text content, they must generate HTML by hand and insert it into the editor, and the data they save is also a pile of HTML, which is hard to parse.
- Browser compatibility problems. Producing the same rich text content in different browsers may yield different DOM structures, which leads to behavior the developer does not expect (even though it is what the browser intends). With data that describes the rich text, the same DOM structure can be generated from the data everywhere, which avoids this problem, although it solves only some of the compatibility issues.
- Collaboration is difficult. What is collaboration? When user A types 1 in the document and user B types 2 at the same time and in the same place, is the final document 21 or 12? Suppose it is 12: how does user A's document go from 1 to 12, and how does user B's document go from 2 to 12? A collaboration protocol is needed to solve this.
Therefore, we need a developer-friendly document model, backed by a mature collaboration protocol, to describe rich text content and the changes made to it.
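The collaboration question above (does the document end up as 12 on both sides?) can be made concrete with a small sketch. It does not follow any particular protocol: the operation shape (position, text, user) and the tie-breaking rule (the user whose name sorts first goes first) are assumptions chosen only so that both copies converge.

```javascript
// Hypothetical operation shape: insert `text` at `position`, authored by `user`.
function applyInsert(doc, op) {
  return doc.slice(0, op.position) + op.text + doc.slice(op.position);
}

// Rewrite `op` so that it can be applied after the concurrent insert `other`.
// Assumed tie-break: when both insert at the same position, the user whose
// name sorts first keeps the earlier position.
function transformInsert(op, other) {
  const otherGoesFirst =
    other.position < op.position ||
    (other.position === op.position && other.user < op.user);
  return otherGoesFirst
    ? { ...op, position: op.position + other.text.length }
    : op;
}

const opA = { position: 0, text: "1", user: "A" };
const opB = { position: 0, text: "2", user: "B" };

// User A applies its own insert first, then B's insert transformed against it.
const docA = applyInsert(applyInsert("", opA), transformInsert(opB, opA));
// User B applies its own insert first, then A's insert transformed against it.
const docB = applyInsert(applyInsert("", opB), transformInsert(opA, opB));

console.log(docA, docB); // prints "12 12"
```

Each side applies its own edit immediately and adjusts the remote edit before applying it, which is the basic idea that a real collaboration protocol has to generalize to deletes, attributes, and more than two users.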
What do other rich text editors do
Almost every reasonable, usable way of representing rich text already exists, and I am not genius enough to invent a new one, so let's look at what other editors do.
Tencent Docs
Open the browser console on a Tencent Docs page and record a profile in the performance panel.
When some text is typed into a Tencent Docs document, something like the following is executed.
Step into its applyDelta function and print out its arguments. Focusing on the red box, range is the position of the selection before the input and text is the inserted text (I typed a 1 after the existing 121). This is the change, operations.
However, operations is not the final representation of the change.
Going straight to the collaboration logic of Tencent Docs, we find this unintelligible string, which is in fact the representation of the rich text content. It is a linear structure.
The text field in the figure above holds the characters that exist in the document. This raises a question: the whole point of rich text is the word "rich", and text is only a plain string, so how is the "rich" part expressed?
Continuing with the figure above, _numToAttrib is the set of attributes used in the document. Expanded, it looks like the figure below. For example, ["bold", "true"] means that the value of the bold attribute is "true", that is, the text is bold. Likewise, ["font-family", "PT Sans"] means that the font is PT Sans.
Now we have the text and the attribute set, so how are the two related? The key is attribs, a string that is hard to make sense of at first glance. *0 refers to attribute 0 in _numToAttrib, that is, ["author", "p.144115215160803528"]; *1, *2, and *3 are read the same way. The numbers are written in base 36, so *m is attribute 22. The +1 after *m means that one character is affected. In other words, the first character (1) of the document carries attributes 0, 1, 2, 3, and 22 of _numToAttrib, which are, respectively:
["author", "p.144115215160803528"]
["bold", "true"]
["font-family", "PT Sans"]
["font-size", "9pt"]
["italic", "true"]
After the +1 comes *0*1*2*3*m*7*6*4+1, which can be read by the same logic and is not analyzed again here.
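The decoding described above can be sketched in a few lines. This is an illustration rather than Tencent Docs' or Etherpad's own code: the function name decodeAttribs is made up, the pool contains only the five entries listed above, and the sketch handles only the *n and +n markers discussed here (other markers, such as newline counts, are ignored).

```javascript
// Attribute pool keyed by number, in the style of _numToAttrib.
// Only the five entries named in the text are filled in.
const numToAttrib = {
  0: ["author", "p.144115215160803528"],
  1: ["bold", "true"],
  2: ["font-family", "PT Sans"],
  3: ["font-size", "9pt"],
  22: ["italic", "true"],
};

// Split an attribs string such as "*0*1*2*3*m+1" into runs of
// { attribs: [[key, value], ...], chars: n }. All numbers are base 36.
function decodeAttribs(attribs, pool) {
  const runs = [];
  const runPattern = /((?:\*[0-9a-z]+)*)\+([0-9a-z]+)/g;
  for (const [, stars, count] of attribs.matchAll(runPattern)) {
    const indexes = stars
      .split("*")
      .filter(Boolean)
      .map((n) => parseInt(n, 36));
    runs.push({
      attribs: indexes.map((i) => pool[i]),
      chars: parseInt(count, 36),
    });
  }
  return runs;
}

const runs = decodeAttribs("*0*1*2*3*m+1", numToAttrib);
// One run covering one character, with the five attribute pairs listed above.
console.log(runs[0].chars, runs[0].attribs.map(([key]) => key));
```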
That is how Tencent Docs describes rich text. This raises another question: how is the change, the operations mentioned above, applied to this pile of strings?
The answer is that operations is converted into a corresponding changeset (cs below), as shown in the figure below. Z: can be ignored and treated as a marker; y is the length of the original document (in base 36); >1 means that the document grows by 1; =2 means that two characters are kept (skipped); *0*2*3*1*m*4+1 is read as explained above; and $1 means that the inserted character is 1, because all I did above was type a 1.
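To make the layout of a changeset easier to see, here is a minimal parsing sketch. The function name parseChangeset and the shape of its result are made up for illustration. The text above covers only >, = and +; treating < as a shrinking document and - as a delete is my reading of the EasySync format and should be taken as an assumption.

```javascript
// Parse a changeset string such as "Z:y>1=2*0*2*3*1*m*4+1$1".
// Assumed layout:
//   Z:<old length><'>' or '<'><length change><operations>$<inserted text>
function parseChangeset(cs) {
  const header = /^Z:([0-9a-z]+)([><])([0-9a-z]+)/.exec(cs);
  if (!header) throw new Error("not a changeset: " + cs);
  const oldLength = parseInt(header[1], 36);
  const delta = parseInt(header[3], 36) * (header[2] === ">" ? 1 : -1);

  // Everything after the first "$" is the text that the "+" operations insert.
  const bankStart = cs.indexOf("$");
  const body = cs.slice(header[0].length, bankStart === -1 ? cs.length : bankStart);
  const inserted = bankStart === -1 ? "" : cs.slice(bankStart + 1);

  const ops = [];
  const opPattern = /((?:\*[0-9a-z]+)*)([=+-])([0-9a-z]+)/g;
  for (const [, stars, opcode, count] of body.matchAll(opPattern)) {
    ops.push({
      opcode, // "=" keep, "+" insert, "-" delete (assumed)
      chars: parseInt(count, 36),
      attribs: stars.split("*").filter(Boolean).map((n) => parseInt(n, 36)),
    });
  }
  return { oldLength, newLength: oldLength + delta, ops, inserted };
}

const cs = parseChangeset("Z:y>1=2*0*2*3*1*m*4+1$1");
console.log(cs.oldLength, cs.newLength, cs.ops, cs.inserted);
```

For the example string this yields an old length of 34 (y in base 36), a new length of 35, a keep of 2 characters, and an insert of 1 character carrying attributes 0, 2, 3, 1, 22, and 4.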
So why does Tencent Docs represent rich text content in this form (the open-source rich text editor etherpad-lite uses the same data structure)? Because of the collaboration protocol EasySync; detailed documentation is available at the links below.
Github.com/ether/ether… Github.com/ether/ether…
(EasySync is not the only collaboration protocol; there are others such as OT-JSON. In principle there will be a separate article on this topic, including a detailed analysis of EasySync.)
Slate
Setting a breakpoint while Slate is running shows its data structure and the corresponding rich text content, as in the figure below.
It is a linear structure with a finite hierarchy. The type field says what kind of line each node is, and the rich text above contains four values of type:
- paragraph: the line is plain text with no special line attributes
- block-quote: the line is a quotation
- numbered-list: the line is an ordered list
- list-item: the node is a child of the enclosing ordered list
The leaf nodes of the tree, on the other hand, represent text. For example, {text: 'BOL ', bold: true} means that the text is BOL and is bold.
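As a rough illustration of the structure just described, the sketch below builds a small document from the four line types and collects the text of each line. The document content is invented, and the children field name follows Slate's node format as I understand it rather than anything shown in the text above.

```javascript
// A hypothetical document using the four line types named above.
const value = [
  {
    type: "paragraph",
    children: [{ text: "plain and " }, { text: "bold", bold: true }],
  },
  { type: "block-quote", children: [{ text: "a quotation" }] },
  {
    type: "numbered-list",
    children: [
      { type: "list-item", children: [{ text: "first item" }] },
      { type: "list-item", children: [{ text: "second item" }] },
    ],
  },
];

// Collect the text of every leaf under a node, depth first.
function nodeText(node) {
  return "text" in node ? node.text : node.children.map(nodeText).join("");
}

console.log(value.map(nodeText));
```

Note that the ordered list needs one more level of nesting than the other lines, which is where the finite hierarchy comes from.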
So how does Slate represent changes?
In the source code we can find the declaration of Operation, which is divided into NodeOperation, SelectionOperation, and TextOperation.
Taking TextOperation as an example, let's look at how inserting text is expressed. offset is the offset of the insertion position within the current line, text is the inserted text, and path has the form [a, b]; in most cases a is the index of the line and b is 0.
When I type a 1 after the first piece of text on the second line, the resulting operation is as shown in the figure.
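The insert operation described above can be sketched as follows. This is not Slate's own implementation: applyInsertText is a made-up helper that only supports two-level paths of the form [a, b], and the type value "insert_text" and the children field are assumptions based on Slate's format.

```javascript
// Apply an insert-text operation to a document made of lines whose
// children are text leaves. Returns a new document; the input is untouched.
function applyInsertText(doc, op) {
  const [line, leaf] = op.path;
  return doc.map((node, i) =>
    i !== line
      ? node
      : {
          ...node,
          children: node.children.map((child, j) =>
            j !== leaf
              ? child
              : {
                  ...child,
                  text:
                    child.text.slice(0, op.offset) +
                    op.text +
                    child.text.slice(op.offset),
                }
          ),
        }
  );
}

const doc = [
  { type: "paragraph", children: [{ text: "first line" }] },
  { type: "paragraph", children: [{ text: "second" }, { text: " line", bold: true }] },
];

// Type a 1 at the end of the first leaf on the second line.
const op = { type: "insert_text", path: [1, 0], offset: 6, text: "1" };
const next = applyInsertText(doc, op);

console.log(next[1].children[0].text); // prints "second1"
```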
Youdao Cloud Note also uses a similar structure; see mp.weixin.qq.com/s/wIu_8yv69…
Quill
The document model used by Quill is Delta. Details can be found in the official Quill documentation at github.com/quilljs/del… The official documentation is so good that I will not go into detail here and will only give a brief explanation.
The following structure describes a document: insert holds the text and attributes holds its attributes. It says that Gandalf is bold, " the " is plain text, and Grey has the color #ccc.
[
{ insert: 'Gandalf', attributes: { bold: true } },
{ insert: ' the ' },
{ insert: 'Grey', attributes: { color: '#ccc' } }
];
With the same structure, Delta can also describe changes to the content of a document. For example, inserting the character 1 after Gan can be described as:
[
  { retain: 3 },   // keep (skip) the first 3 characters, Gan
  { insert: '1' }  // insert the character 1
]
Suppose I wanted to delete the fourth and fifth characters of the document:
[
  { retain: 3 },  // leave the first 3 characters untouched
  { delete: 2 }   // delete the 4th and 5th characters
]
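To show how such a change acts on the document, here is a hand-rolled sketch that applies retain, insert, and delete to the Gandalf document above. It is deliberately not the quill-delta library: applyChange is a made-up helper that assumes every document op inserts a string, ignores attributes on retain, and does not merge adjacent ops the way the real library does.

```javascript
// Apply a change (retain / insert / delete) to a document whose ops
// are all { insert: string, attributes? }.
function applyChange(doc, change) {
  const rest = doc.map((op) => ({ ...op })); // document ops not yet consumed
  const out = [];

  // Take up to `length` characters from the front of `rest`.
  // Kept characters go to `out`; deleted ones are dropped.
  const consume = (length, keep) => {
    while (length > 0 && rest.length > 0) {
      const head = rest[0];
      const taken = head.insert.slice(0, length);
      if (keep) out.push({ ...head, insert: taken });
      head.insert = head.insert.slice(taken.length);
      if (head.insert.length === 0) rest.shift();
      length -= taken.length;
    }
  };

  for (const op of change) {
    if (op.retain !== undefined) consume(op.retain, true);
    else if (op.delete !== undefined) consume(op.delete, false);
    else out.push({ ...op }); // insert
  }
  return out.concat(rest); // anything the change does not mention is kept
}

const doc = [
  { insert: "Gandalf", attributes: { bold: true } },
  { insert: " the " },
  { insert: "Grey", attributes: { color: "#ccc" } },
];

const plainText = (ops) => ops.map((op) => op.insert).join("");

console.log(plainText(applyChange(doc, [{ retain: 3 }, { insert: "1" }]))); // prints "Gan1dalf the Grey"
console.log(plainText(applyChange(doc, [{ retain: 3 }, { delete: 2 }]))); // prints "Ganlf the Grey"
```

Because a document and a change share one shape, applying a change produces another list of insert ops, which is again a valid document.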
A Delta description takes up more space than the model used by Tencent Docs, but it is friendlier from a developer's point of view.
Delta's structure is also much lighter than Slate's.
The final data structure
After comparing the options, I finally chose Delta for subsequent development, for the following reasons:
- It is developer-friendly: the structure is simple, concise, and easy to understand. (Looking at EasySync for the first time, you might well wonder what on earth it is.)
- A mature collaboration protocol, OT-JSON, can support the later development of collaborative editing.
- It is a linear representation. When the content of the document is modified, a linear structure is cheaper to modify than a tree (although in rich text scenarios a tree may be easier to understand). In addition, if a selection is expressed as line a and offset b, then with a linear structure it is obviously easier to find the position in the document that the selection corresponds to, whereas with a tree the depth also has to be traversed.
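The last point can be illustrated with a short sketch that turns a selection given as (line, offset) into an index into the flat text of a Delta-style document. The helper toLinearIndex and the sample document are made up, and the sketch assumes that every line ends with a newline character.

```javascript
// Convert a selection given as (line, offset) into an index into the
// flat text of a Delta-style document, assuming lines end with "\n".
function toLinearIndex(ops, line, offset) {
  const text = ops.map((op) => op.insert).join("");
  let index = 0;
  for (let i = 0; i < line; i++) {
    const newline = text.indexOf("\n", index);
    if (newline === -1) throw new RangeError("no such line: " + line);
    index = newline + 1; // start of the next line
  }
  return index + offset;
}

const ops = [
  { insert: "Gandalf", attributes: { bold: true } },
  { insert: " the Grey\n" },
  { insert: "second line\n" },
];

console.log(toLinearIndex(ops, 1, 3)); // prints 20
```

Only a single pass over the text is needed; no descent through nested nodes is involved.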
Afterword
Each of these representations is backed by mature products. Only three are covered here, and there are plenty of other excellent editors out there, so if you are interested you can have a look at their code. For example:
- Draft
- etherpad-lite (Dropbox Paper actually started out as a redevelopment of etherpad)
- prose-mirror
- notion
- Google Docs (digging through its source is too laborious; it is a master of obfuscation, and Tencent's obfuscation is nowhere near as extreme)