Seewo ENOW front-end team

Company official website: CVTE(Guangzhou Shiyuan Stock)

Team: ENOW team of CVTE software Platform Center for Future Education

The author:

Preface

This article looks at copy and paste from a pure front-end perspective. It does not cover techniques that bypass browser security checks, such as Flash or plug-ins; everything here works within the browser's security restrictions, plus a few tricks, to achieve a reasonably complete copy-and-paste feature. We start with the browser's built-in copy and paste, look at how it is implemented behind the scenes, compare how several rich text documents implement it, and then show how to build a clipboard for rich text based on JSON model data under the browser's various constraints.

The importance of copy and paste

Anyone who programs by Google and NPM knows how good Ctrl+C and Ctrl+V are; copy and paste has long been part of our daily work. For word processing in particular, its importance is hard to overstate. Form elements are common in code, yet we may not know how copy and paste works inside them. For example, why does pasting an image copied from elsewhere into an input field do nothing, while text works? How do you paste text together with its external styles, as from a Word document? How do you copy and paste images in rich text? And how do we customize copy and paste ourselves?

The copy-and-paste troika

Three concepts need to be understood here: MIME, DataTransfer, and ClipboardEvent.

Media Type (MIME)

When we copy and paste in form elements such as input and textarea, we are relying on the browser's default capability. A media type (also known as a Multipurpose Internet Mail Extensions or MIME type) is a standard that indicates the nature and format of a document, file, or byte stream. The structure is simple: a type and a subtype, two strings joined by a '/' to form type/subtype, with no spaces allowed. Common MIME types include text/plain, text/html, image/png, and application/json. For example, when we write a script or style definition:

<style type="text/css"></style>
<script type="text/javascript"></script>

Or when requesting a back-end interface:

If you are familiar with interface specifications or have written back-end services, you know that the Content-Type header is closely tied to how the back end parses a request. When debugging an interface, the Content-Type and the data actually sent are often inconsistent. For example, if the back end expects application/json, it will return an error status code when application/x-www-form-urlencoded data is sent, so the front end has to process the data to match the Content-Type. There are also special cases: Content-Type: multipart/byteranges; boundary=XXXX tells the browser that the data is split into multiple parts, which enables things like segmented loading of audio and video. In other words, MIME is how the browser learns what kind of data it has received and whether to parse it or download it (media and document files are generally handled as resources). Copy and paste likewise relies on MIME to decide how the data is processed.
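To make the type/subtype structure concrete, here is a minimal sketch that splits a MIME string into its parts. The name parseMime and the shape of its return value are assumptions made for this example; they are not a browser API.

```typescript
// Hypothetical helper for illustration only; not part of any browser API.
interface MimeParts {
  type: string;
  subtype: string;
  params: Record<string, string>;
}

function parseMime(input: string): MimeParts | null {
  // Everything before the first ';' is the type/subtype pair, the rest are parameters.
  const [essence, ...rawParams] = input.split(';').map((s) => s.trim());
  // Exactly one '/' and no whitespace inside type or subtype.
  const match = /^([^\s\/]+)\/([^\s\/]+)$/.exec(essence);
  if (!match) return null;

  const params: Record<string, string> = {};
  for (const p of rawParams) {
    const eq = p.indexOf('=');
    if (eq > 0) {
      params[p.slice(0, eq).trim().toLowerCase()] = p.slice(eq + 1).trim();
    }
  }
  return { type: match[1].toLowerCase(), subtype: match[2].toLowerCase(), params };
}
```

With this sketch, 'text/html' parses into type text and subtype html, while a string containing a space inside the pair, such as 'text/ html', is rejected.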

DataTransfer

DataTransfer is available not only on drag events but also on paste, copy, cut, and similar events. I like to think that any “moving of data” within a document can be described with DataTransfer. DataTransfer has several properties and methods, but most of them are populated or usable only during a drag. For example, files is mainly relevant to drag operations, and if the operation does not involve files, the list is empty. So we focus on items, which is a DataTransferItemList, and on a pair of methods:

  • setData(format, data) sets the content;
  • getData(format) gets the content.
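The setData/getData pair can be sketched as follows. Because DataTransfer only exists in browsers, the example uses an in-memory stand-in, MemoryTransfer, which is an assumption made for illustration; in a real copy or paste handler you would use e.clipboardData instead.

```typescript
// Structural subset of the browser's DataTransfer that this article relies on.
interface TransferLike {
  setData(format: string, data: string): void;
  getData(format: string): string;
  readonly types: readonly string[];
}

// In-memory stand-in (hypothetical) so the sketch also runs outside a browser.
class MemoryTransfer implements TransferLike {
  private store = new Map<string, string>();

  setData(format: string, data: string): void {
    this.store.set(format.toLowerCase(), data);
  }

  getData(format: string): string {
    // Like the real API, a format that was never set yields an empty string.
    return this.store.get(format.toLowerCase()) ?? '';
  }

  get types(): readonly string[] {
    return Array.from(this.store.keys());
  }
}

// Lists every format currently held, with the length of its payload.
function describeTransfer(dt: TransferLike): string {
  return dt.types.map((t) => `${t}: ${dt.getData(t).length} chars`).join(', ');
}
```

The same data can be stored under several formats at once, which is what lets a paste target pick the richest format it understands.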

ClipboardEvent

ClipboardEvent is the general clipboard event supported by browsers, covering paste, cut, and copy. For copy and paste we only need two of its properties: type, which describes the type of event that was triggered, and clipboardData, which is a DataTransfer object.

The default implementation of the browser

In the browser, ordinary copy and paste uses the standard MIME formats common to browsers (source: MDN screenshot):

For example, copy and paste in both input and textarea accepts only the text/plain MIME type, probably the default text format that all software supports (I have yet to come across software that does not). Of course, other file types can be selected if the input's type is set to file, which is not discussed here.

Rich text scenarios

In rich text, besides plain text (text/plain), two more MIME types need to be supported: text/html and image/png. Copying text from a document such as Word and pasting it with consistent style and format should be a standard feature of rich text. If we read text/plain, we only get the plain-text version, so this is where text/html is needed. Note that the HTML produced by common document editors (Word, PPT, Kingsoft Docs, and so on) is non-standard or carries a lot of redundant data, so we may need to clean it ourselves and keep only what we need. In general, regular expressions can be used to strip the superfluous data after reading it; common rich text editors such as UEditor and wangEditor contain this kind of processing. image/png is used for copying and pasting an image.
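As a rough idea of what regex-based cleaning might look like, here is a minimal sketch. The function name cleanPastedHtml and the specific rules are assumptions chosen for illustration; they are not what UEditor or wangEditor actually ship, and a regular expression is not a full HTML parser, so treat this as a starting point only.

```typescript
// Hypothetical cleaner for pasted text/html; rules are illustrative, not exhaustive.
function cleanPastedHtml(html: string): string {
  return html
    // Drop comments, including conditional comments and fragment markers.
    .replace(/<!--[\s\S]*?-->/g, '')
    // Drop whole elements that never carry visible content.
    .replace(/<(style|script|xml|title)\b[\s\S]*?<\/\1\s*>/gi, '')
    // Drop document-level wrapper tags but keep what is inside them.
    .replace(/<\/?(meta|link|html|head|body)\b[^>]*>/gi, '')
    // Drop namespaced Office tags such as <o:p>.
    .replace(/<\/?[a-z]+:[a-z0-9]+\b[^>]*>/gi, '')
    // Drop presentational attributes while keeping the tags themselves.
    .replace(/\s(?:class|style|lang)=(?:"[^"]*"|'[^']*'|[^\s>]+)/gi, '')
    .trim();
}
```

Which tags and attributes to keep depends on what your own model can represent; a whitelist of supported tags is usually safer than the blacklist shown here.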

The present scene

In my case, I needed a copy-and-paste feature inside a PPT-like document. Text and images copied from outside must paste into our page, other internally defined elements must be supported, and the copy-and-paste interaction on our pages must stay consistent across tabs and even across browsers. In addition, JSON data and HTML data are not interchangeable. In an ordinary rich text editor, a direct copy basically yields HTML. In our scenario, because of the MVVM framework, all DOM is converted into model data, so copied data needs separate processing and cannot be pasted directly into arbitrary rich text. The following problems arise:

1. Clipboard security restrictions of the browser

The browser places strict security restrictions on the clipboard: its contents cannot be read directly, unless you use the proposed navigator.clipboard.readText / navigator.clipboard.read to request permission, after which, once the user grants it, the clipboard can be read. There are risks, however. First, this proposal is still at the draft stage, although it has a high chance of being adopted. Second, if the user denies the permission, subsequent paste operations still fail. So you have to work within the existing standards.

Google Slides handles this as follows:

  1. It proactively asks users whether to install a plug-in, and uses the plug-in to get past this layer of security restrictions;
  2. Without the plug-in, clicking Paste in Safari makes another small button pop up, because Safari can customize menus.

2. Right-click menu customization dilemma

In most of these scenarios the right-click menu is also customized. Normally the browser's own right-click menu provides copy and paste, but to customize the menu you listen for the contextmenu event and prevent the default behavior, as Tencent Docs and Google Slides do. Some browser behaviors are then hidden and cannot be invoked, copy and paste among them. Some web documents even respond to a click on the menu's Paste button with a message asking the user to paste with a keyboard shortcut, which is decidedly user-hostile. For these two reasons, we need to keep our own copy of the data in memory, serve the clipboard and the right-click menu from it, and actively update the clipboard when necessary, so that what the system pastes and what we hold internally stay the same. One trick is to use document.execCommand('copy') or document.execCommand('cut') to actively update the clipboard. If the clipboard must be updated when the user chooses Copy from the right-click menu, use the clipboardData exposed by the resulting cut/copy event and call setData on it, which unifies internal and external data and keeps the data flowing across tabs.

private bindCopy = (e) => {
    // ...
    console.error('copy');
    e.preventDefault();
    this.duplicate.attemptToCopy(e, false);
  };

// Duplicate class
/**
 * Copy / cut
 * @param e ClipboardEvent
 * @param isCut whether this is a cut
 * 1. Copy and paste triggered by shortcut keys
 * 2. Right-click menu, click Copy (the clipboardData object does not exist)
 * 3. Actively stuff in custom data
 */
public attemptToCopy(e: ClipboardEvent | null, isCut = false) {
    this.isCutCommand = isCut;

    if (e && e?.clipboardData) {
      const clipboardData = this.updateStash();
      clipboardData && this.updateClipboard(e, clipboardData);
    } else {
      this.autoCopy();
    }
}

/**
 * Automatic copy
 * 1. If execCommand is supported, this is equivalent to going through addEventListener('copy') again;
 *    this time we get the e.clipboardData object and can run updateStash as above.
 *    Pros: setData can be called in copy to set a flag. Cons: execCommand risks being removed.
 * 2. If it is not supported, use writeText.
 *    Pros: graceful degradation. Cons: no special MIME can be set; the data is only readable via getData('text/plain').
 */
  public autoCopy() {
    if (!document.execCommand(this.isCutCommand ? "cut" : 'copy')) {
      const clipboardData = this.updateStash();
      clipboardData && navigator.clipboard.writeText(JSON.stringify(clipboardData));
    }
  }

I added a little compatibility handling here because document.execCommand is, after all, deprecated. navigator.clipboard.write would also work.
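The fallback order described here can be isolated into a small sketch. To keep it runnable outside a browser, the two clipboard calls are injected through a hypothetical CopyEnv interface; the names CopyEnv, CopyOutcome, and copyWithFallback are assumptions for this example and are not part of the code above.

```typescript
// Minimal environment the sketch needs. In a browser these would map to
// document.execCommand and navigator.clipboard.writeText.
interface CopyEnv {
  execCommand(command: 'copy' | 'cut'): boolean;
  writeText(text: string): Promise<void>;
}

interface CopyOutcome {
  // Which mechanism ended up being used.
  path: 'execCommand' | 'writeText';
  // Settles when the clipboard write has finished (already resolved for execCommand).
  done: Promise<void>;
}

function copyWithFallback(env: CopyEnv, isCut: boolean, stash: unknown): CopyOutcome {
  // Preferred path: triggers a copy/cut event whose handler can call setData with any MIME type.
  if (env.execCommand(isCut ? 'cut' : 'copy')) {
    return { path: 'execCommand', done: Promise.resolve() };
  }
  // Degraded path: only plain text can be written, so the stash is serialized to JSON.
  return { path: 'writeText', done: env.writeText(JSON.stringify(stash)) };
}
```

In a page, the environment could be built as { execCommand: (c) => document.execCommand(c), writeText: (t) => navigator.clipboard.writeText(t) }. Returning the promise lets the caller react when the user has denied clipboard permission.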

3. Customize MIME types

We can access our internal data directly through clipboardData, that is, the DataTransfer object. For example, we can use a special identifier such as text/copy, which means defining a MIME type ourselves, so that next time we can read it directly with getData('text/copy'). Wouldn't that be nice?

Think of standard MIME types as hard currency: they have been supported for years by every party (browser vendors and system software), each of which has its own way of parsing them. A MIME type we define ourselves is like a digital currency of unknown origin: the market does not recognize it, and it can only be used internally. In general this approach is perfectly feasible, but because it is not a standard MIME type it cannot be read across browsers, so it is unacceptable in certain extreme scenarios.

So why define our own MIME type? First, standard types receive special treatment. With text/html, for example, extra markup is added at the beginning and end of the data, and other standard types may also have data added or be parsed in their own way. Second, across browsers I have so far only seen text/plain and text/html carry data; everything else is filtered out. If we put our internal data into text/plain, all of it is exposed to external paste targets. That is fine for plain text, but pasting internally formatted data will look strange to users. So the initial decision was to use a custom MIME type plus text/html, from which the internal data can be parsed.

It is a nice idea, but there is another problem: some rich text editors read text/html and run their own layer of parsing on it, so your data is exposed to them. For copying to external paste targets there is, unfortunately, no good solution, because you want the data and the behavior to stay consistent across browsers. On paste, data that matches the internal format is treated as internal; anything else is filtered. There may also be special cases, such as an SVG image copied from outside. SVG is really text data in XML format. Small SVG images are fine, but with large ones there is no guarantee the browser will not freeze, so you may need to pre-check text/plain first.

Of course, if you don’t need to work across browsers, you don’t need to bother, just keep a custom MIME type.

In general, when inspecting the clipboard contents, we check internal data first (the custom MIME type), then the image type, and then plain text.
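That priority order can be sketched as a single function. The interfaces below are structural stand-ins for the parts of clipboardData that are used, and CUSTOM_MIME reuses the text/copy example from this section; the function name resolvePaste and the result shape are assumptions for illustration.

```typescript
// Structural subsets of DataTransferItem and DataTransfer, so the sketch runs outside a browser.
interface PasteItemLike {
  kind: string;
  type: string;
  getAsFile(): object | null;
}
interface PasteDataLike {
  getData(format: string): string;
  items: ArrayLike<PasteItemLike>;
}

type PasteResult =
  | { source: 'internal'; model: unknown }
  | { source: 'image'; file: object }
  | { source: 'text'; text: string }
  | { source: 'none' };

// Hypothetical custom MIME type, following the text/copy example in this section.
const CUSTOM_MIME = 'text/copy';

function resolvePaste(data: PasteDataLike): PasteResult {
  // 1. Internal data first: only our own pages write this format.
  const raw = data.getData(CUSTOM_MIME);
  if (raw) {
    try {
      return { source: 'internal', model: JSON.parse(raw) };
    } catch {
      // Not valid JSON after all; fall through to the generic formats.
    }
  }
  // 2. Then the first image file, if any.
  const image = Array.from(data.items).find((i) => i.kind === 'file' && i.type.startsWith('image/'));
  const file = image ? image.getAsFile() : null;
  if (file) return { source: 'image', file };
  // 3. Finally plain text.
  const text = data.getData('text/plain');
  return text ? { source: 'text', text } : { source: 'none' };
}
```

In a paste handler this would be called as resolvePaste(e.clipboardData) after checking that clipboardData is not null.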

import { SPEC_MIME } from "../util/variable";

class PasteHelper {
  // get img data
  // [text/html, image/png]
  // PPT/screenshot tool: [image/png]
  // PPT: [text/plain, text/html, text/rtf, image/png]

  // RTF: Microsoft's cross-application text format
  // https://zh.wikipedia.org/wiki/RTF
  public getImgTransData(e: ClipboardEvent) {
    if (!e.clipboardData?.items?.length) {
      return false;
    }
    const transferDatas = Array.from(e.clipboardData.items);
    const isText = transferDatas.find(c => c.type === 'text/rtf');

    // For video content an img item is returned, but getAsFile is null
    if (!isText) {
      const imgTransData = transferDatas.filter(c => c.kind === 'file' && c.type.indexOf('image') === 0)[0];
      if (!imgTransData) {
        return false;
      } else {
        const imgFile = imgTransData.getAsFile();
        if (imgFile) {
          return imgFile;
        }
      }
    }
    return false;
  }

  /**
   * SPEC_MIME -> htmlData (cross-browser) -> plainText (execCommand not available, falls back to writeText)
   * @param e ClipboardEvent
   */
  public getInnerData(e: ClipboardEvent) {
    const innerData = e.clipboardData?.getData(SPEC_MIME);
    // ...
  }

  // Get plain text, filtering special characters
  public getPlainText(e: ClipboardEvent) {
    // Filter special characters, replacing them with an empty string
    const reg = /[\0-\x08\x0B\f\x0E-\x1F\uFFFE\uFFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?:[^\uD800-\uDBFF]|^)[\uDC00-\uDFFF]/;
    let text = e.clipboardData?.getData('text/plain') || "";
    text = text.replace(reg, "");
    return text ?? false;
  }
}

export default new PasteHelper();

4. Platform compatibility hell

Ideally, copy and paste should interact with Word, PPT, and other native applications in a consistent way. The ideal is rich, but reality is lean: different platforms offer different levels of support and even different implementations. The following table shows what we can obtain when content copied from an external source is pasted into our page:

| External source | Text | Picture | Text box | Audio and video |
| --- | --- | --- | --- | --- |
| Web page (normal) | Supported | Supported | Converted to text | Not supported |
| Google Slides | Supported | Not supported | Converted to text | Not supported |
| Tencent Docs | Supported | Not supported | Converted to text | Not supported |
| Office PPT (web) | Supported | Not supported | Not supported | Not supported |
| Kingsoft Docs (web) | Supported | Not supported | Not supported | Not supported |
| Office (PPT/Excel) | Supported | Supported | Converted to text | Converted to picture |
| WPS | Supported | Not supported | Not supported | Not supported |
| Keynote | Supported | Supported | Converted to text | Converted to picture |
| Numbers | Supported | Supported | Converted to text | Not supported |
| Windows (system) | Supported | Not supported | | Not supported |
| UOS (system) | Supported | Converted to text | | Converted to text |
| macOS (system) | Supported | Supported | | Converted to text |

Web applications such as Google Slides and Tencent Docs do not support copying images directly; the images have to be parsed out of the text/html data separately, which is possible with string parsing. Web apps such as Kingsoft Docs and Office do not support text boxes and similar content because they use internal protocols, which we generally cannot handle through MIME. On Windows, pictures, audio, and video cannot be copied and pasted at the system level; in testing, only plain text could be obtained. In that case all you can do is cling to the product manager's leg and say: I can't do it.
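Parsing images out of the text/html string might look like the following sketch. The function name extractImageSources is hypothetical, and the regular expression only covers simple img tags; in a browser, DOMParser would be a more robust choice.

```typescript
// Hypothetical helper: collect the src of every <img> tag found in an HTML string.
function extractImageSources(html: string): string[] {
  const sources: string[] = [];
  // Matches src="...", src='...' or an unquoted src value inside an <img> tag.
  const imgTag = /<img\b[^>]*?\ssrc\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))/gi;
  let m: RegExpExecArray | null;
  while ((m = imgTag.exec(html)) !== null) {
    const src = m[1] ?? m[2] ?? m[3];
    if (src) sources.push(src); // skip empty src attributes
  }
  return sources;
}
```

Each extracted address still has to be fetched or re-uploaded before it can become an image in your own model, and cross-origin restrictions may apply.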

5. The Achilles' heel of media file handling

Since internal data is trusted, the first idea is simply to copy the data, serialize it into our custom MIME type, and deserialize it on paste. However, our product originally treated multimedia files (audio, video, pictures) specially: when added to the page they were turned into blobs held in memory, and uploaded to the cloud at the next sync. This improves the user experience to some extent, since the user does not have to wait for an upload. The drawback is that if the original tab is closed, the blob link breaks, because the blob's memory, or its reference, lives in that tab; changing this design now, however, would have a much wider impact. So we fall back to a degraded scheme: for media already uploaded to the cloud, use the link address directly; for blob links not yet uploaded, copy the blob URL first, then in the new tab fetch it into the current page, after which it can be handled like a normal file.

Two problems remain. First, if the source page is closed immediately after copying, the blob's memory is freed and the download fails; the only remedies are to upload the blob to the cloud at copy time or to upload silently during idle time. Second, nothing can be done across browsers, where only cloud links work. Google Slides passes blob URLs, Tencent Docs uploads directly, and Yuque first displays the image as base64 and then uploads it. Some might suggest converting the file to base64 first, but we often copy several media files at once, base64 takes time to generate, and the amount of data may exceed what the clipboard can hold. Since there is no way to pass binary data between browsers, we had to settle for this ugly workaround for now.

if (isBlobUrl(model.source)) {
  if (medias[model.hash]) {
    model.source = medias[model.hash];
  } else {
    const blob = await url2blob(model.source);
    if (blob) {
      // Recreate the blob in the current page from the blob URL
      const file = await blob2File(blob, model.mediaName || model.pictureName);
      const blobUrl = await file2BlobUrl(file);
      model.source = blobUrl;
      // Cache media data
      this.storageData.update({
        medias: {
          [model.hash]: blobUrl
        }
      });
    } else {
      console.error('Unsupported blob_URL or cross domain');
      return null;
    }
  }
}
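The decision made by the snippet above can be separated from the actual fetching, which makes it easier to reason about. In this sketch, looksLikeBlobUrl is a guess at what a helper like isBlobUrl might check, and planMediaPaste and MediaPlan are hypothetical names introduced for illustration.

```typescript
type MediaPlan =
  | { action: 'use-as-is'; url: string }   // already a cloud link, usable anywhere
  | { action: 'use-cached'; url: string }  // blob already recreated in this tab
  | { action: 'refetch'; url: string };    // blob URL from another tab, must be fetched first

// Assumed check: blob URLs start with the "blob:" scheme.
function looksLikeBlobUrl(source: string): boolean {
  return source.slice(0, 5).toLowerCase() === 'blob:';
}

// Decide how a pasted media source should be handled, without touching the network.
function planMediaPaste(source: string, hash: string, cache: Map<string, string>): MediaPlan {
  if (!looksLikeBlobUrl(source)) return { action: 'use-as-is', url: source };
  const cached = cache.get(hash);
  if (cached) return { action: 'use-cached', url: cached };
  return { action: 'refetch', url: source };
}
```

Only the refetch branch can fail, namely when the tab that created the blob has been closed or the paste happens in a different browser, which is where the cloud-upload fallback comes in.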

Why not simply create a DataTransfer object and put the data into it? I tested this, and it does not work.

Of course, if you know about clipboardItems, you might think it can be used to put data into the clipboard.

It can, but although clipboardItems looks like an array and is used like an array, only an array of length 1 is supported.

6. Black holes in pictures

In fact, it never occurred to me that images would need this kind of special handling. When copying and pasting an image, you would not expect to copy a “thin” one and end up with a “fat” one you barely recognize. In the browser, we only ever get a single image format, image/png, because the browser converts the image to a bitmap to stay compatible with the various image formats, so all you get from image/png is a bitmap blob. This causes two problems:

  • You cannot get the image in its original format; you only get a PNG, so the original format cannot be determined;
  • The bitmap conversion differs across platforms and can increase the file size: a 20 MB picture read through getAsFile may exceed 30 MB, and testing on macOS and Windows may give two different results.

Sure enough, you still need an input element to get full access to the browser's file capabilities. For details, see: lists.whatwg.org/pipermail/w…

In closing

There are other problems too. Copying and pasting external tables needs a separate layer of parsing, which can be troublesome, and serialization and deserialization need care, since some data risks being altered after formatting.

But that is basically the whole copy-and-paste process.

The process as a whole is not perfect: within the browser's limits there are many consistency problems, so you have to survive in the cracks. If the content you want to copy and paste is already standard HTML, it is easy. In a scenario like mine, where the data is mostly JSON, a lot of effort goes into converting between formats and handling edge cases.

References

  • www.alloyteam.com/2015/04/how…