Project background

While reading the Ant Design documentation, I noticed that the site was using a session recording framework called LogRocket, so I immediately tried LogRocket in my own project to see what it could do.

LogRocket collects data and groups it by user and session. By watching a replay of each user's actions, you can spot operations in the system that are inconvenient and find out who your heavy users are.

However, LogRocket's data is stored on their servers, and a LogRocket replay exposes all kinds of sensitive data in the system. If that data were accessed by someone with bad intentions, the consequences could be serious.

rrweb

If we instead build on an open source framework and store the data on our own servers, restricting who can view it, the problem is solved.

That brings us to today's protagonist, the rrweb framework, whose full name is "record and replay the web". It consists of three libraries:

rrweb-snapshot converts all the DOM elements in the page into serializable document data and assigns each DOM element a unique ID.

rrweb is the API for recording and replaying pages.

rrweb-player is the playback UI. It supports fast forward, full screen, and dragging to any point in the recording.

After the initial snapshot, only the DOM elements that change are serialized. On replay, the data is deserialized and inserted into the page, and each incremental change is applied by ID: attribute and text changes are located by the ID of the modified element, while added or removed child nodes are applied under the ID of their parent element.
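The snapshot-plus-increments idea can be illustrated with a toy model. The sketch below is not rrweb's implementation: plain objects stand in for DOM nodes, and `snapshot`, `indexById`, and `applyMutation` are hypothetical names invented for this illustration.

```javascript
// Toy model only: plain objects stand in for DOM nodes, and none of these
// function names are rrweb APIs.
let nextId = 1;

// Full snapshot: copy the tree and give every node a unique id.
function snapshot(node) {
  return {
    id: nextId++,
    tag: node.tag,
    attributes: { ...node.attributes },
    text: node.text ?? null,
    children: (node.children ?? []).map((child) => snapshot(child)),
  };
}

// Index the snapshot by id so later mutations can find their target.
function indexById(root, index = new Map()) {
  index.set(root.id, root);
  root.children.forEach((child) => indexById(child, index));
  return index;
}

// Replay one incremental change. Attribute and text changes address the
// node itself; added or removed children address the parent.
function applyMutation(index, mutation) {
  const structural = mutation.type === 'add' || mutation.type === 'remove';
  const target = index.get(structural ? mutation.parentId : mutation.id);
  if (!target) throw new Error('unknown node id');
  switch (mutation.type) {
    case 'attribute':
      target.attributes[mutation.name] = mutation.value;
      break;
    case 'text':
      target.text = mutation.value;
      break;
    case 'add':
      target.children.push(mutation.node);
      indexById(mutation.node, index);
      break;
    case 'remove':
      target.children = target.children.filter((c) => c.id !== mutation.childId);
      index.delete(mutation.childId);
      break;
    default:
      throw new Error(`unknown mutation type: ${mutation.type}`);
  }
}

// Usage: snapshot once, then apply a small change by id.
const page = snapshot({
  tag: 'div',
  attributes: {},
  children: [{ tag: 'p', attributes: {}, text: 'hi' }],
});
const index = indexById(page);
applyMutation(index, { type: 'text', id: 2, value: 'hello' }); // the <p> got id 2
```

The point of the design is that the expensive full snapshot happens once, and everything afterwards is a small record keyed by ID.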

The development process

1. Use rrweb directly to record each serialized event and save it in localStorage first. When the amount of data exceeds a threshold or a time limit is reached, sendBeacon sends the data to a Node service, which saves it in MongoDB.

2. The first problem I ran into was data loss with sendBeacon: when the payload exceeded 65536 bytes, the send failed. Because sendBeacon hands the request to the browser to send in the background, the page cannot observe whether delivery failed, so a fallback had to be used.

3. The users of the company's back-office system are spread all over the world, and network latency abroad is high, so the payload size had to be reduced by compression. The lz-string library is used for this. JSON.parse was throwing errors frequently, so the flow was changed to compress the data just before sending it to the back end and decompress it on the Node side.

4. InfluxDB was chosen as the database at first, but it was replaced with MongoDB for reasons beyond our control.

5. After the project went live, I picked a small project for testing and found that both storage and playback worked well. The code is as follows:

import rrweb from 'rrweb';

rrweb.record({
  emit(event) {
    storagePush(event);
  }
});
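The post does not show `storagePush`, so the following is only a sketch of what steps 1 and 2 could look like together. The factory `createRecorderBuffer`, the storage key, the event-count and time thresholds, and the injected `sendBeacon` and `fallbackSend` functions are all assumptions for illustration; only the 65536 limit comes from the text.

```javascript
// Hypothetical sketch: buffer events in storage, flush by count or age,
// and fall back to another sender when sendBeacon cannot be used.
const MAX_BEACON_BYTES = 65536; // limit reported in the text
const FLUSH_AFTER_EVENTS = 50;  // assumed threshold
const FLUSH_AFTER_MS = 10000;   // assumed time limit

function createRecorderBuffer({ storage, sendBeacon, fallbackSend, url, now = Date.now }) {
  const KEY = 'rrweb-events'; // assumed storage key
  let lastFlush = now();

  function read() {
    return JSON.parse(storage.getItem(KEY) || '[]');
  }

  function flush() {
    const events = read();
    if (events.length === 0) return;
    const body = JSON.stringify(events);
    const size = new TextEncoder().encode(body).length; // size in bytes
    // Oversized payloads skip sendBeacon entirely. A false return value
    // (the browser refused to queue the request) also triggers the fallback.
    const queued = size <= MAX_BEACON_BYTES && sendBeacon(url, body);
    if (!queued) fallbackSend(url, body);
    storage.removeItem(KEY);
    lastFlush = now();
  }

  function storagePush(event) {
    const events = read();
    events.push(event);
    storage.setItem(KEY, JSON.stringify(events));
    const tooMany = events.length >= FLUSH_AFTER_EVENTS;
    const tooOld = now() - lastFlush >= FLUSH_AFTER_MS;
    if (tooMany || tooOld) flush();
  }

  return { storagePush, flush };
}
```

In a browser, this might be wired up with `storage: localStorage`, `sendBeacon: (u, b) => navigator.sendBeacon(u, b)`, and a `fallbackSend` built on a plain `fetch` POST. Passing the dependencies in keeps the sketch runnable outside a browser as well.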

The data structure stored in the database is

{
timestamp: 1563418490795,
name:'Ming',
event:...
}

With this structure, recordings can be searched by user and time range.
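A search by user and time range might be expressed as a MongoDB filter like the one below. This is a sketch: `buildEventFilter` is a hypothetical helper, and the field names simply follow the record structure shown above.

```javascript
// Hypothetical helper: build a MongoDB-style filter for one user's events
// inside an inclusive time range (timestamps in milliseconds).
function buildEventFilter(name, from, to) {
  if (from > to) throw new RangeError('from must not be after to');
  return { name, timestamp: { $gte: from, $lte: to } };
}

// Usage: all of Ming's events in the hour after the sample timestamp.
const filter = buildEventFilter('Ming', 1563418490795, 1563418490795 + 60 * 60 * 1000);
```

With the official Node.js driver, a filter like this could be passed to `collection.find(filter).sort({ timestamp: 1 })` so that events come back in playback order.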

The rrweb source code shows that the checkoutEveryNms option splits a recording into sessions by time, so the code looks like this:

rrweb.record({
  emit(event, checkout) {
    if(checkout)rrwebSessionSet();
    storagePush(event);
  },
  checkoutEveryNms: 1000 * 60 * 10
});

Each time the checkoutEveryNms interval elapses, the second parameter of emit, checkout, is true. That signals the start of a new session, so we assign the session a unique value and change the data structure stored in the database to:

{
timestamp: 1563418490795,
name:'Ming',
session:xxxxxxxxxxx,
event:...
}

With the concept of a session, one person's operations on a given day can be selected by session.
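The session bookkeeping can be sketched as follows. `rrwebSessionSet` reuses the name from the code above, but its body here is an assumption, as are `toRecord` and `groupBySession`; the ID scheme (timestamp plus counter) is only unique within one page load, so a real system might prefer a UUID.

```javascript
// Hypothetical helpers; only the name rrwebSessionSet comes from the post.
let currentSession = null;
let sessionCounter = 0;

// Start a new session and remember its id for subsequent events.
function rrwebSessionSet(now = Date.now()) {
  sessionCounter += 1;
  currentSession = `${now}-${sessionCounter}`;
  return currentSession;
}

// Wrap an rrweb event in the stored record shape, tagged with the session.
function toRecord(event, name) {
  return { timestamp: event.timestamp, name, session: currentSession, event };
}

// Group stored records by session so one session can be replayed at a time.
function groupBySession(records) {
  const sessions = new Map();
  for (const record of records) {
    if (!sessions.has(record.session)) sessions.set(record.session, []);
    sessions.get(record.session).push(record);
  }
  return sessions;
}
```

Grouping on the server side works the same way: every record carrying the same session value belongs to one replayable recording.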

To speed up queries, data from different projects was first split into separate collections, with the indexes built in the background. After this change, the playback page returned to normal, but the user list interface was still very slow.

Therefore, each time a record is written to MongoDB, a copy of the user and date information is also stored in Redis. The system now runs normally, and every interface returns its data within 1s.
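One way to picture the Redis copy is a set of user names per day, written alongside every MongoDB insert. The sketch below uses an in-memory stand-in instead of a real Redis client; the key format and the names `createMemorySetStore`, `dayKey`, `saveRecord`, and `listUsersForDay` are assumptions, and in Redis the SADD and SMEMBERS commands would play the roles of `addToSet` and `members`.

```javascript
// In-memory stand-in for Redis sets (hypothetical interface).
function createMemorySetStore() {
  const sets = new Map();
  return {
    async addToSet(key, member) {
      if (!sets.has(key)) sets.set(key, new Set());
      sets.get(key).add(member);
    },
    async members(key) {
      return [...(sets.get(key) ?? [])];
    },
  };
}

// One key per UTC day, e.g. "rrweb:users:2019-07-18" (assumed key format).
function dayKey(timestamp) {
  return `rrweb:users:${new Date(timestamp).toISOString().slice(0, 10)}`;
}

// Write the full record to the database, then note who was active that day.
async function saveRecord(record, { insertIntoMongo, setStore }) {
  await insertIntoMongo(record);
  await setStore.addToSet(dayKey(record.timestamp), record.name);
}

// The user list reads the small set instead of scanning event records.
async function listUsersForDay(timestamp, setStore) {
  return setStore.members(dayKey(timestamp));
}
```

Because the set holds each name only once per day, the user list no longer depends on how many events were recorded.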