Introduction

Text underlining is an increasingly popular feature. Whether you run a novel-reading site or a site that sells tutorials, you usually offer some form of notes or comments. The traditional approach is to add a comments section at the bottom of the article. Its advantage is that it is simple and uniform; its drawback is that readers cannot conveniently comment on a specific paragraph or sentence. Hence the need for underlining and annotation. Products I have seen with an underline feature include the WeChat Reading app and Geek Time:

InfoQ Writing Platform:

This feature looks simple, but it hides many difficulties: how to underline arbitrarily complex text structures efficiently, how to store as little data as possible, how to restore underlines accurately, how to handle overlapping underlines, how to cope with later edits to the text, and so on.

As a front-end developer, whenever I see an interesting little feature I want to build it myself. But after reading the few relevant articles I could find, I realized I couldn't 😓. Those articles only sketch a general idea: after reading them you feel you have understood, but once you think it through you run into many problems. The only remaining option is to read source code, which is always time-consuming, and I might not understand it anyway. Building something production-ready is very difficult, so this article settles for the next best thing: a simple demo, just for fun. To see the demo, visit: lxqnsys.com/#/demo/text… .

Overall approach

The general idea is simple: iterate over all the text in the selection, split it into single characters, and wrap each character in an underline element. For overlapping underlines, keep wrapping at the deepest level, and handle events starting from the deepest element.

To store an underline, record the tag name and index of the first non-underline element outside the underlined text, together with the character's total offset among all the characters inside that element.

To restore an underline, find the element that matches the stored data, then iterate over that element's characters and add the underline element.

Implementation

HTML structure

<div class="article" ref="article"></div>

The article text goes inside the div above. I copied and pasted in the HTML structure of a random article from a Juejin booklet:

Show tooltip

The first step is to display an underline button above the selection. This is simple: we listen for the mouseup event, get the Selection object, take its Range, call the range's getBoundingClientRect method to get its position, and then apply that position to our tooltip element:

document.addEventListener('mouseup', this.onMouseup)

onMouseup () {
    // Get the Selection object, which may contain multiple ranges
    let selObj = window.getSelection()
    // With no range at all, getRangeAt(0) would throw
    if (selObj.rangeCount === 0) {
        return
    }
    // There is usually only one Range object
    let range = selObj.getRangeAt(0)
    // If the start and end positions are the same, nothing is selected
    if (range.collapsed) {
        return
    }
    this.range = range.cloneRange()
    this.tipText = 'Underline'
    this.setTip(range)
}

setTip (range) {
    let { left, top, width } = range.getBoundingClientRect()
    this.tipLeft = left + (width - 80) / 2
    this.tipTop = top - 40
    this.showTip = true
}
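The positioning arithmetic in setTip can be checked in isolation. The sketch below is only an illustration: computeTipPosition and its tipWidth and gap parameters are hypothetical names, and the defaults of 80 and 40 are taken from the constants used above.

```javascript
// Hypothetical helper (not part of the demo): centers a tooltip of width
// `tipWidth` horizontally over a rectangle and places its top edge `gap`
// pixels above the rectangle's top.
function computeTipPosition(rect, tipWidth = 80, gap = 40) {
  return {
    left: rect.left + (rect.width - tipWidth) / 2,
    top: rect.top - gap
  }
}

// A plain object stands in for the result of range.getBoundingClientRect()
const pos = computeTipPosition({ left: 100, top: 200, width: 120 })
console.log(JSON.stringify(pos)) // {"left":120,"top":160}
```

Note that getBoundingClientRect returns coordinates relative to the viewport, so these values suit a tooltip with fixed positioning; an absolutely positioned tooltip would also need the scroll offsets added.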

Underlining

Bind a click event to the tooltip. When it is clicked, we need to collect all the text nodes in the selection. First, let's look at the structure of the Range object:

A brief introduction:

The collapsed property indicates whether the start and end positions of the range are the same;

The commonAncestorContainer property returns the deepest common ancestor node that contains both startContainer and endContainer;

The endContainer property returns the node containing the end of the range, usually a text node;

The endOffset property returns the offset of the range's end point within endContainer;

The startContainer property returns the node containing the start of the range, usually a text node;

The startOffset property returns the offset of the range's start point within startContainer;

So the goal is to traverse all the nodes between startContainer and endContainer and collect the text nodes. Lacking algorithm and data-structure knowledge, I took a shortcut: traverse everything under commonAncestorContainer and use the range's isPointInRange() method to check whether the node currently being visited lies inside the selection. Two things need attention. First, IE does not support isPointInRange(). Second, the first and last nodes need to be handled separately: they may be only partially inside the selection, in which case this check can return false for them.

mark () {
  this.textNodes = []
  let { commonAncestorContainer, startContainer, endContainer } = this.range
  this.walk(commonAncestorContainer, (node) => {
    // Start and end nodes, or nodes within the range, are collected if they are text nodes
    if (
      node === startContainer ||
      node === endContainer ||
      this.range.isPointInRange(node, 0)
    ) {
      if (node.nodeType === 3) {
        this.textNodes.push(node)
      }
    }
  })
  this.handleTextNodes()
  this.showTip = false
  this.tipText = ''
}

walk is a depth-first traversal function:

walk (node, callback = () => {}) {
    callback(node)
    if (node && node.childNodes) {
        for (let i = 0; i < node.childNodes.length; i++) {
            this.walk(node.childNodes[i], callback)
        }
    }
}
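For comparison, the same text nodes can be collected without isPointInRange() by tracking position during the depth-first walk: start collecting at the start node and stop after the end node. This is a different technique from the one used in the demo, sketched under the assumption that both boundary containers are text nodes. The text and el helpers are hypothetical stand-ins for DOM nodes so the example runs outside a browser.

```javascript
// Minimal stand-ins for DOM nodes (assumption: only nodeType, nodeValue
// and childNodes matter for this traversal)
const text = (value) => ({ nodeType: 3, nodeValue: value, childNodes: [] })
const el = (...children) => ({ nodeType: 1, childNodes: children })

// Collect text nodes in document order, from `start` through `end` inclusive.
// Assumes `start` and `end` are text nodes and `start` comes first.
function collectTextNodes(root, start, end) {
  const result = []
  let state = 'before' // 'before' -> 'inside' -> 'done'
  const visit = (node) => {
    if (state === 'done') return
    if (node === start) state = 'inside'
    if (state === 'inside' && node.nodeType === 3) result.push(node)
    if (node === end) {
      state = 'done'
      return
    }
    for (const child of node.childNodes) visit(child)
  }
  visit(root)
  return result
}

// <p>Hello <b>bold</b> world!</p>, with "!" as a separate text node
const a = text('Hello ')
const b = text('bold')
const c = text(' world')
const d = text('!')
const root = el(a, el(b), c, d)
console.log(collectTextNodes(root, b, c).map((n) => n.nodeValue)) // [ 'bold', ' world' ]
```

Because the boundary nodes are matched by identity, they need no special case here; the partial offsets inside them still have to be applied afterwards.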

Once we have all the text nodes in the selection, we can split them into characters and replace them with elements:

handleTextNodes () {
    // Generate a unique id for this underline
    let id = ++this.idx
    // Iterate over the text nodes
    this.textNodes.forEach((node) => {
        // The first and last nodes of the range need offsets, which are used to slice the characters
        let startOffset = 0
        let endOffset = node.nodeValue.length
        if (
            node === this.range.startContainer &&
            this.range.startOffset !== 0
        ) {
            startOffset = this.range.startOffset
        }
        if (node === this.range.endContainer && this.range.endOffset !== 0) {
            endOffset = this.range.endOffset
        }
        // Replace the text node
        this.replaceTextNode(node, id, startOffset, endOffset)
    })
    // Serialize for storage: get all the underline elements with the id just generated
    this.serialize(this.$refs.article.querySelectorAll('.mark_id_' + id))
}

If this is the first node and startOffset is not 0, the characters before startOffset do not need to be underlined. If this is the last node and endOffset is not 0, the characters after endOffset do not need to be underlined. All the text in between is split into characters and underlined:

replaceTextNode (node, id, startOffset, endOffset) {
    // Create a document fragment to replace the text node
    let fragment = document.createDocumentFragment()
    let startNode = null
    let endNode = null
    // Slice off the leading text that does not need to be underlined
    if (startOffset !== 0) {
        startNode = document.createTextNode(
            node.nodeValue.slice(0, startOffset)
        )
    }
    // Slice off the trailing text that does not need to be underlined
    if (endOffset !== 0) {
        endNode = document.createTextNode(node.nodeValue.slice(endOffset))
    }
    startNode && fragment.appendChild(startNode)
    // Split all the text in the middle into single characters
    node.nodeValue
        .slice(startOffset, endOffset)
        .split('')
        .forEach((text) => {
            // Create a span to wrap the character as the underline element
            let textNode = document.createElement('span')
            textNode.className = 'markLine mark_id_' + id
            textNode.setAttribute('data-id', id)
            textNode.textContent = text
            fragment.appendChild(textNode)
        })
    endNode && fragment.appendChild(endNode)
    // Replace the text node
    node.parentNode.replaceChild(fragment, node)
}
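The string handling inside replaceTextNode can be tried on its own, without any DOM. The sketch below uses a hypothetical helper, splitForUnderline, that returns the three parts the function above turns into nodes: the untouched leading text, the characters to wrap, and the untouched trailing text.

```javascript
// Hypothetical helper (not part of the demo): the pure string logic behind
// replaceTextNode, returning the parts instead of creating DOM nodes.
function splitForUnderline(text, startOffset, endOffset) {
  return {
    before: text.slice(0, startOffset),                    // stays a plain text node
    chars: text.slice(startOffset, endOffset).split(''),   // one span per character
    after: text.slice(endOffset)                           // stays a plain text node
  }
}

console.log(JSON.stringify(splitForUnderline('Hello world', 6, 9)))
// {"before":"Hello ","chars":["w","o","r"],"after":"ld"}
```

One caveat: split('') cuts by UTF-16 code unit, so a character outside the Basic Multilingual Plane, such as most emoji, would be split in two. Array.from(text) splits by code point instead, but the stored offsets would then have to count code points as well.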

The effect is as follows:

HTML structure:

Serialized storage

If underlines were only temporary, we might as well overlay a canvas element on the article and give the user a free drawing surface. So underlines need to be saved, and shown again the next time the page is opened.

The key to storage is being able to locate the position again later. Following the approach described in other articles, this article stores the tag name of the first non-underline element outside the underline element, the index of that element among all elements of the same type within the specified root node, and the character's total offset within that element. The description may be a little convoluted, so let's look at the code:

serialize (markNodes) {
    // Use the article element as the root, so that other structural changes on the page do not affect how underline elements are located
    let root = this.$refs.article
    // Iterate over all the span nodes just generated for this underline
    markNodes.forEach((markNode) => {
        // Calculate the character's total text offset from the first non-underline element in the outer layer
        let offset = this.getTextOffset(markNode)
        // Find the first non-underline element in the outer layer
        let { tagName, index } = this.getWrapNode(markNode, root)
        // Save the data
        this.serializeData.push({
            tagName,
            index,
            offset,
            id: markNode.getAttribute('data-id')
        })
    })
}

The total text offset from the first non-underline outer element is computed by summing the character counts of the preceding siblings, then moving up to the parent and summing the character counts of its preceding siblings, and so on until the first non-underline outer element is reached:

getTextOffset (node) {
    let offset = 0
    let parNode = node
    // Iterate until the first non-underlined element in the outer layer
    while (parNode && parNode.classList.contains('markLine')) {
        // Get the total number of characters for the preceding sibling element
        offset += this.getPrevSiblingOffset(parNode)
        parNode = parNode.parentNode
    }
    return offset
}

Get the total character count of all preceding siblings:

getPrevSiblingOffset (node) {
    let offset = 0
    let prevNode = node.previousSibling
    while (prevNode) {
        offset +=
            prevNode.nodeType === 3
                ? prevNode.nodeValue.length
                : prevNode.textContent.length
        prevNode = prevNode.previousSibling
    }
    return offset
}
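To see how the two functions above combine for nested underlines, here is a DOM-free sketch. The text and el helpers are hypothetical stand-ins that model only the properties the calculation reads, and an isMark flag replaces the classList check; textOffset folds getTextOffset and getPrevSiblingOffset into one function.

```javascript
// Minimal stand-ins for DOM nodes (assumption: only these properties are needed)
function text(value) {
  return { nodeType: 3, nodeValue: value, textContent: value, childNodes: [] }
}
function el(isMark, ...children) {
  const node = {
    nodeType: 1,
    isMark, // stands in for classList.contains('markLine')
    childNodes: children,
    textContent: children.map((c) => c.textContent).join('')
  }
  children.forEach((child, i) => {
    child.parentNode = node
    child.previousSibling = i > 0 ? children[i - 1] : null
  })
  return node
}

// Sum the text length of all preceding siblings, climbing upward for as
// long as the current node is an underline element.
function textOffset(markNode) {
  let offset = 0
  for (let node = markNode; node && node.isMark; node = node.parentNode) {
    for (let prev = node.previousSibling; prev; prev = prev.previousSibling) {
      offset += prev.textContent.length
    }
  }
  return offset
}

// <p>ab<span mark>xy<span mark>c</span></span>e</p>
const inner = el(true, text('c'))
const outer = el(true, text('xy'), inner)
const p = el(false, text('ab'), outer, text('e'))
console.log(textOffset(inner)) // 4, the index of "c" in "abxyce"
```

The offset is relative to the text of the enclosing non-underline element, which is why it stays valid no matter how many underline wrappers are later added or removed around the character.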

Find the first non-underline element in the outer layer and compute its index:

getWrapNode (node, root) {
    // Find the first non-underline element in the outer layer
    let wrapNode = node.parentNode
    while (wrapNode.classList.contains('markLine')) {
        wrapNode = wrapNode.parentNode
    }
    let wrapNodeTagName = wrapNode.tagName
    // Get all elements with that tag name and filter out the underline elements
    let els = [...root.getElementsByTagName(wrapNodeTagName)].filter((item) => {
        return !item.classList.contains('markLine')
    })
    // Index of the current element among them (-1 if not found)
    let wrapNodeIndex = els.indexOf(wrapNode)
    return {
        tagName: wrapNodeTagName,
        index: wrapNodeIndex
    }
}

An example of the final stored data:

Deserialization display

Restoring means redrawing the underlines from the stored data. We iterate over the stored records, first get the specified element by tagName and index, then traverse all the text nodes under that element and use offset to find the character to underline:

deserialization () {
    let root = this.$refs.article
    // Iterate over the serialized data (markData is the previously stored data)
    markData.forEach((item) => {
        // Get the specified element, filtering out the underline elements
        let els = [...root.getElementsByTagName(item.tagName)].filter((el) => {
            return !el.classList.contains('markLine')
        })
        let wrapNode = els[item.index]
        let len = 0
        let end = false
        // Iterate over all nodes of the element
        this.walk(wrapNode, (node) => {
            if (end) {
                return
            }
            // If it is a text node
            if (node.nodeType === 3) {
                // If the running total plus this node's length exceeds offset, the character is in this text node
                if (len + node.nodeValue.length > item.offset) {
                    // Calculate the offset within this text node
                    let startOffset = item.offset - len
                    // Since we split into single characters, the length is always 1
                    let endOffset = startOffset + 1
                    this.replaceTextNode(node, item.id, startOffset, endOffset)
                    end = true
                }
                // Add to the running character count
                len += node.nodeValue.length
            }
        })
    })
}
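The lookup at the heart of this function is a running sum over text-node lengths. The sketch below isolates it in a hypothetical helper, locateOffset, that takes the lengths of the text nodes in document order instead of real nodes.

```javascript
// Hypothetical helper (not part of the demo): given the lengths of an
// element's text nodes in document order, find which node contains the
// character at `offset` and where it sits inside that node.
function locateOffset(textLengths, offset) {
  let len = 0
  for (let i = 0; i < textLengths.length; i++) {
    if (len + textLengths[i] > offset) {
      return { nodeIndex: i, startOffset: offset - len }
    }
    len += textLengths[i]
  }
  return null // offset lies beyond the available text
}

// Three text nodes of lengths 5, 3 and 4; the character at offset 6 is the
// second character of the second node.
console.log(JSON.stringify(locateOffset([5, 3, 4], 6))) // {"nodeIndex":1,"startOffset":1}
```

Returning null when the offset is out of range makes the failure case explicit, which is what happens when the article text has become shorter since the underline was stored.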

The results are as follows:

Removing an underline

Removing an underline is easy. We listen for the click event; if the target is an underline element, we get all the underline elements with that id, create a range from them, and display the tooltip. Clicking the tooltip then removes the underline elements.

// Show the remove-underline tooltip
showCancelTip (e) {
    let tar = e.target
    if (tar.classList.contains('markLine')) {
        e.stopPropagation()
        e.preventDefault()
        // Get the underline id
        this.clickId = tar.getAttribute('data-id')
        // Get all underline elements with that id
        let markNodes = document.querySelectorAll('.mark_id_' + this.clickId)
        // Use the first and last text nodes as the range boundaries
        let startContainer = markNodes[0].firstChild
        let endContainer = markNodes[markNodes.length - 1].lastChild
        this.range = document.createRange()
        this.range.setStart(startContainer, 0)
        this.range.setEnd(
          endContainer,
          endContainer.nodeValue.length
        )
        this.tipText = 'Remove underline'
        this.setTip(this.range)
    }
}

After the remove button is clicked, iterate over all the underline elements with that id and replace them:

cancelMark () {
    this.showTip = false
    this.tipText = ''
    let markNodes = document.querySelectorAll('.mark_id_' + this.clickId)
    // Iterate over all the underline elements
    for (let i = 0; i < markNodes.length; i++) {
        let item = markNodes[i]
        // If there is a child element, that is, an underline element with another id
        if (item.children[0]) {
            let node = item.children[0].cloneNode(true)
            // Replace the current node with the child node
            item.parentNode.replaceChild(node, item)
        } else {
            // If there is only text, create a text node as the replacement
            let textNode = document.createTextNode(item.textContent)
            item.parentNode.replaceChild(textNode, item)
        }
    }
    // Remove the records with this id from the serialized data
    this.serializeData = this.serializeData.filter((item) => {
        return item.id !== this.clickId
    })
}

Disadvantages

That completes the minimal underline feature. Now let's look at the downsides of this minimal approach.

First, without doubt, if many characters are underlined and the same text is underlined many times, a large number of span tags and nesting levels are generated, and node count is a major factor in page performance.

The second problem is that the data to be stored will also be large, increasing the storage cost and network transmission time:

This can be improved a little by shortening the field names to single letters and by merging consecutive characters into one record, but that only treats the symptoms.
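As an illustration of merging consecutive characters, the sketch below collapses per-character records into runs with a length. It is not part of the demo: mergeRuns is a hypothetical helper, and it assumes the records arrive in document order with the same tagName, index, offset and id fields that serialize produces.

```javascript
// Hypothetical helper: merge per-character records that belong to the same
// underline and the same wrapper element and sit at consecutive offsets.
function mergeRuns(records) {
  const merged = []
  for (const r of records) {
    const last = merged[merged.length - 1]
    if (
      last &&
      last.id === r.id &&
      last.tagName === r.tagName &&
      last.index === r.index &&
      last.offset + last.length === r.offset
    ) {
      last.length += 1 // extends the current run by one character
    } else {
      merged.push({ ...r, length: 1 }) // starts a new run
    }
  }
  return merged
}

// Illustrative data: three consecutive characters in the first <p>,
// then one character in the second <p>
const records = [
  { tagName: 'P', index: 0, offset: 3, id: '1' },
  { tagName: 'P', index: 0, offset: 4, id: '1' },
  { tagName: 'P', index: 0, offset: 5, id: '1' },
  { tagName: 'P', index: 1, offset: 0, id: '1' }
]
const runs = mergeRuns(records)
console.log(JSON.stringify(runs.map((r) => [r.index, r.offset, r.length]))) // [[0,3,3],[1,0,1]]
```

This shrinks the stored data only; the DOM still holds one span per character, so the node-count problem remains.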

The third problem is that, as the name suggests, text underlining works only on text; it cannot underline other content such as images:

The fourth problem is that if an underlined article is later edited and its HTML structure changes, the stored positions no longer match the text.

These problems are hard enough that this can only remain a demo.

A small optimization

An easy optimization is to stop splitting the text into single characters and wrap each piece of text as a whole instead. Here is how:

replaceTextNode (node, id, startOffset, endOffset) {
    // ...
    startNode && fragment.appendChild(startNode)

    // Wrap the entire text directly
    let textNode = document.createElement('span')
    textNode.className = 'markLine mark_id_' + id
    textNode.setAttribute('data-id', id)
    textNode.textContent = node.nodeValue.slice(startOffset, endOffset)
    fragment.appendChild(textNode)
    
    endNode && fragment.appendChild(endNode)
    // ...
}

Serialization now needs an additional length field:

let textLength = markNode.textContent.length
// Skip empty strings of length 0, otherwise there will be unpredictable problems
if (textLength > 0) {
    this.serializeData.push({
        tagName,
        index,
        offset,
        length: textLength, // ++
        id: markNode.getAttribute('data-id')
    })
}

The amount of serialized data is greatly reduced:

Deserialization also needs to change: because the underline length is no longer fixed at one character, an underline may span several text nodes:

deserialization () {
    let root = this.$refs.article
    markData.forEach((item) => {
        // Filter out the underline elements so that index matches the one computed in getWrapNode
        let els = [...root.getElementsByTagName(item.tagName)].filter((el) => {
            return !el.classList.contains('markLine')
        })
        let wrapNode = els[item.index]
        let len = 0
        let end = false
        let first = true
        let _length = item.length
        this.walk(wrapNode, (node) => {
            if (end) {
                return
            }
            if (node.nodeType === 3) {
                let nodeTextLength = node.nodeValue.length
                if (len + nodeTextLength > item.offset) {
                    // Text before startOffset does not need to be underlined
                    let startOffset = (first ? item.offset - len : 0)
                    first = false
                    // If this text node has fewer characters left than the underline needs, the whole remainder is underlined and the rest is handled in the following text nodes
                    let endOffset = startOffset + (nodeTextLength - startOffset >= _length ? _length : nodeTextLength - startOffset)
                    this.replaceTextNode(node, item.id, startOffset, endOffset)
                    // Subtract the length already handled in this node
                    _length = _length - (nodeTextLength - startOffset)
                    // If nothing is left to process, we are done
                    if (_length <= 0) {
                        end = true
                    }
                }
                len += nodeTextLength
            }
        })
    })
}
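The offset bookkeeping for an underline that spans several text nodes is easier to follow without the DOM. In the sketch below, sliceAcrossNodes is a hypothetical helper: it takes the lengths of the text nodes in document order and returns, for each node touched, the start and end offsets to wrap.

```javascript
// Hypothetical helper (not part of the demo): split an (offset, length)
// underline into one piece per text node it touches.
function sliceAcrossNodes(textLengths, offset, length) {
  const pieces = []
  let len = 0            // characters seen before the current node
  let remaining = length // characters still to be underlined
  for (let i = 0; i < textLengths.length && remaining > 0; i++) {
    const nodeLength = textLengths[i]
    if (len + nodeLength > offset) {
      // Only the first piece starts mid-node; later pieces start at 0
      const start = pieces.length === 0 ? offset - len : 0
      const end = Math.min(nodeLength, start + remaining)
      if (end > start) {
        pieces.push({ nodeIndex: i, start, end }) // empty text nodes are skipped
      }
      remaining -= end - start
    }
    len += nodeLength
  }
  return pieces
}

// Text nodes of lengths 5, 3 and 4; underline 6 characters starting at offset 3
console.log(JSON.stringify(sliceAcrossNodes([5, 3, 4], 3, 6)))
// [{"nodeIndex":0,"start":3,"end":5},{"nodeIndex":1,"start":0,"end":3},{"nodeIndex":2,"start":0,"end":1}]
```

Computing all the pieces before touching the document is a deliberate choice in this sketch: it keeps the arithmetic independent of the node replacements that happen afterwards.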

Finally, removing an underline also needs to change: an underline element may now contain more than a single underline element or text node, so all of its child nodes must be traversed:

cancelMark () {
    this.showTip = false
    this.tipText = ''
    let markNodes = document.querySelectorAll('.mark_id_' + this.clickId)
    for (let i = 0; i < markNodes.length; i++) {
        let item = markNodes[i]
        // Move copies of all child nodes into a fragment, then swap it in for the underline element
        let fragment = document.createDocumentFragment()
        for (let j = 0; j < item.childNodes.length; j++) {
            fragment.appendChild(item.childNodes[j].cloneNode(true))
        }
        item.parentNode.replaceChild(fragment, item)
    }
    this.serializeData = this.serializeData.filter((item) => {
        return item.id !== this.clickId
    })
}
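The unwrapping step can be modeled on a plain data structure to check that nested underlines with other ids survive. This is only a sketch under simplifying assumptions: a text node is a string, an underline element is an object with id and children, and unwrap is a hypothetical function that returns a new tree rather than mutating the document.

```javascript
// Hypothetical model: strings are text nodes, { id, children } objects are
// underline elements. Removes every wrapper with the given id and keeps
// everything that was inside it, in order.
function unwrap(nodes, id) {
  const result = []
  for (const node of nodes) {
    if (typeof node === 'string') {
      result.push(node)
      continue
    }
    const children = unwrap(node.children, id)
    if (node.id === id) {
      result.push(...children) // drop the wrapper, keep its contents
    } else {
      result.push({ id: node.id, children })
    }
  }
  return result
}

// ab[1: c [2: d] e]f
const tree = ['ab', { id: '1', children: ['c', { id: '2', children: ['d'] }, 'e'] }, 'f']
console.log(JSON.stringify(unwrap(tree, '1')))
// ["ab","c",{"id":"2","children":["d"]},"e","f"]
```

In the real DOM, unwrapping leaves adjacent text nodes next to each other; calling normalize() on the parent element merges them if a tidier structure is wanted.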

Now look at the effect:

HTML structure:

You can see that both the serialized data and the DOM structure are much cleaner.

However, if the structure of the document is complex or if the underline is repeated many times, the resulting nodes and data can be quite large.

Conclusion

This article introduced a simple implementation of text underlining for the web. The initial idea was to split the text into single characters and wrap each one. That approach is very simple, but its drawbacks are equally obvious. After trying it, I found that wrapping the whole text directly does not cause many problems, while greatly reducing both the data to be stored and the DOM structure. So what we take for granted is often wrong. Finally, data structures and algorithms really are important 😭.

Example code at: github.com/wanglin2/te… .

Reference article:

1. How to use JS to realize the online note-taking function of “underline words”?

2. “Underline” and “Insert notes” — not just the front end