Building a data platform generally involves several key stages: data collection, data reporting, data storage, data computation, and data visualization. Collection and reporting are a critical part of this pipeline: the final results are reliable and valuable only if the data produced on the front end is complete, accurate, and timely.

To address the accuracy, timeliness, and development efficiency of front-end event tracking, companies in the industry have proposed a variety of technical solutions from different angles. They fall broadly into three categories:

  1. The first is code-based tracking: tracking data is reported by calling the tracking interface directly at each node that needs to be tracked. Most third-party analytics providers, such as Umeng and Baidu Statistics, take this approach.

  2. The second is visual tracking: the nodes to collect are configured through a visual tool, and the front end parses the configuration and reports tracking data automatically, achieving so-called “traceless tracking”. The open-source Mixpanel is a representative example.

  3. The third is the “no tracking point” approach. It does not mean that nothing is tracked; rather, the front end automatically collects all events and reports them, and the useful data is filtered out on the back end during data computation. GrowingIO in China is a representative example.

Meituan-Dianping has demanding requirements for front-end tracking, which can be summarized as follows:

  1. Accuracy and timeliness: data quality directly affects the back-end strategy services, settlement with partners, and operational reports that depend on tracking data.

  2. Tracking efficiency: tracking complexity usually grows with business requirements, and tracking efficiency affects how quickly versions can be iterated.

  3. Dynamic deployment and repair of tracking points: this is essentially a way to improve tracking efficiency by making tracking points independent of client releases.

The company's original tracking relied mainly on manual code-based tracking. Code-based tracking is flexible, but its development cost is high and it is hard to change once released; a serious data problem can only be fixed with a hotfix. Switching directly to visual tracking would be costly to develop and would not cover every tracking requirement; switching to the “no tracking point” approach would incur traffic and data computation costs that the business cannot accept. We therefore evolved a lightweight, declarative front-end tracking scheme on top of the original code-based scheme, and went on to explore dynamic tracking and traceless tracking.

Code-based tracking

Because the declarative and traceless tracking schemes introduced later still rely on the underlying logic of the original code-based tracking, a brief introduction to it is in order. In implementing code-based tracking, we focused mainly on a standardized data structure, an easy-to-use tracking interface, and a reliable reporting strategy. The overall module division is shown in the figure below.

Developers have to insert tracking code like the following by hand at every node that needs it (for example, click callbacks, display callbacks of list elements, and page lifecycle functions).

EventInfo eventInfo = new EventInfo();
eventInfo.nm = EventName.MGE; // The event type is MGE
eventInfo.val_bid = "XXX";
eventInfo.val_lab = new HashMap<>();
eventInfo.val_lab.put(Constants.Business.xx, "XXX");
Statistics.getChannel("hotel").writeEvent(eventInfo);

As the example shows, code-based tracking is typical imperative programming: tracking code intrudes into specific business logic, which makes it tedious and error-prone. The most direct remedy is to decouple tracking code from business logic, that is, to track declaratively, which makes tracking easier.

Declarative tracking

The idea of declarative tracking is to decouple tracking code from concrete interaction and business logic. Developers only need to care about which controls require tracking and declare the data those controls need, which lowers the cost of tracking.

Android

On Android, we customized common UI controls such as TextView, LinearLayout, ListView, and ViewPager, overriding their event-handling methods so that tracking code is filled in automatically inside them. Overriding controls has the advantages that more events can be intercepted, execution is more efficient, and behavior is more stable. The drawback is just as obvious: the migration cost is high.

To solve this problem, we borrowed an idea from the Android v7 support library, which replaces UI controls automatically through AppCompatDelegate.

public class GAAppCompatDelegateV14 extends AppCompatDelegateImplV14 {
    @Override
    View callActivityOnCreateView(View parent, String name, Context context, AttributeSet attrs) {
        // Swap the stock control for the customized one that carries tracking logic.
        switch (name) {
            case "TextView":
                return new NovaTextView(context, attrs);
        }
        return super.callActivityOnCreateView(parent, name, context, attrs);
    }
}

With this in place, developers only need to override the getDelegate method in their Activity base class and return the customized AppCompatDelegate; UI controls are then replaced automatically.

@Override
public AppCompatDelegate getDelegate() {
    if (mDelegate == null) {
        mDelegate = GAAppCompatUtil.create(this, this);
    }
    return mDelegate;
}

However, a new problem arose.

This approach does not work when a UI control is subclassed inside a referenced third-party library, so we also needed a way to replace the parent class of such a control. We found no viable way to replace a control's parent class at runtime, so we modified the parent class at compile time instead and developed a Gradle plugin for the purpose. This adds no runtime overhead and only sacrifices some compilation speed. Developers simply apply the plugin, and the parent classes of UI controls are automatically replaced with our customized controls.

apply plugin: 'com.meituan.judasplugin'

With declarative tracking, the required tracking data only needs to be declared when the control is initialized. We no longer have to intrude into the program's various event handlers, which makes tracking easier.

GAHelper.bindClick(view, bid, lab);
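To make the declarative idea concrete, here is a minimal plain-Java sketch of what a helper shaped like GAHelper.bindClick could do: the declared data is stored on the control, and the customized control reports it before running the business click handler. FakeView, TrackingHelper, and the in-memory reported list are hypothetical stand-ins chosen for illustration; they are not the Android API and not the actual Meituan-Dianping implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that mirrors the shape of GAHelper.bindClick(view, bid, lab).
final class TrackingHelper {
    // Stands in for the real reporting channel; events are only collected in memory.
    static final List<String> reported = new ArrayList<>();

    // Declares the tracking data once, when the control is initialized.
    static void bindClick(FakeView view, String bid, Map<String, String> lab) {
        view.bid = bid;
        view.lab = new HashMap<>(lab);
    }

    static void report(String bid, Map<String, String> lab) {
        reported.add(bid + " " + lab);
    }

    public static void main(String[] args) {
        FakeView button = new FakeView();
        button.setOnClickListener(() -> System.out.println("business logic runs"));
        bindClick(button, "b_buy_click", Collections.singletonMap("poi_id", "1"));
        button.performClick();
        System.out.println(reported);
    }
}

// Hypothetical stand-in for a customized UI control (not android.view.View).
class FakeView {
    String bid;                                  // declared event identifier, if any
    Map<String, String> lab = new HashMap<>();   // declared business fields
    private Runnable clickListener;              // business click handler

    void setOnClickListener(Runnable listener) {
        this.clickListener = listener;
    }

    // The overridden event method: report the declared event, then run business logic.
    void performClick() {
        if (bid != null) {
            TrackingHelper.report(bid, lab);
        }
        if (clickListener != null) {
            clickListener.run();
        }
    }
}
```

The business handler never mentions tracking; the only tracking-related line at the call site is the single bindClick declaration.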

iOS

On iOS, Objective-C associated objects and categories let us implement declarative tracking without subclassing UI controls. For UIControl, we add a new action when a tracking point is declared, and the tracking code is filled in automatically when the event fires.

- (void)nvja_setAnalyticsParams:(NVJAMGEParameter *)params mgeType:(SAKStatisticsEventMGEType)type
{
    if (self.wmja_clickParams == nil && type == SAKStatisticsEventClick) {
        [self addTarget:self action:@selector(wmja_controlDidTapped:) forControlEvents:UIControlEventTouchUpInside];
    }
    [super nvja_setAnalyticsParams:params mgeType:type];
}

For UITableView, we proxy the UITableViewDelegate and use the message forwarding mechanism to intercept events, filling in tracking code automatically in the event callbacks.

- (void)forwardInvocation:(NSInvocation *)anInvocation
{
    SEL selector = [anInvocation selector];
    if (self.originalDelegate && [self.originalDelegate respondsToSelector:selector]) {
        [anInvocation invokeWithTarget:self.originalDelegate];
    }
    SEL nvjaSelector = [self nvjaSelector:selector];
    if ([super respondsToSelector:nvjaSelector]) {
        [anInvocation setSelector:nvjaSelector];
        [anInvocation invokeWithTarget:self];
    }
}

Here too, declarative tracking simplifies the tracking code.

NVJAMGEParameter *parameter = [[NVJAMGEParameter alloc] init];
parameter.bid = @"bid";
parameter.lab = @{@"poi_id":@"1"};
button.nvja_clickParams = parameter;

Declarative tracking can replace all code-based tracking, and it solves the high migration cost we ran into early on. In essence, though, it is still code-based tracking: there is simply less tracking code, and it no longer intrudes into business logic. To support dynamic deployment and repair of tracking points, the tracking points hard-coded on the front end have to be eliminated.

Traceless tracking

Declarative tracking still needs hard-coded tracking points for two main reasons. First, the unique event identifier of the tracked control, the bid, has to be declared. Second, some tracking points must carry business fields whose values are only available at runtime.

For the first, we can generate event identifiers automatically with the same rules on both the front end and the back end, so that the back end can configure the front end's tracking behavior and tracking becomes automatic. For the second, we can try to associate business data with tracking data automatically, either on the front end or on the back end.

Event identification

To generate an event identifier automatically, we collect characteristic information for each control, such as its ID, class name, and index within its parent, and walk up the hierarchy level by level to the root node. The root node is usually tagged manually; if none is tagged, the top node of the view hierarchy is used by default. The feature information of all nodes on the traversed path is then combined to form the identifier of the event. Because controls may be inserted dynamically into the actual layout, we tolerate some deviation in a control's index within its parent.
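The traversal described above can be sketched in plain Java as follows. The Node class, the markedRoot flag, and the segment format ClassName[index]#id joined with "/" are assumptions made for this illustration; the article does not specify the actual encoding.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: builds an event identifier from the path between a control and its root.
final class ViewPathBuilder {

    // Minimal stand-in for a node in the view hierarchy.
    static final class Node {
        final String className;
        final String id;            // may be null when the control has no ID
        final boolean markedRoot;   // true when the node was manually tagged as root
        Node parent;
        final List<Node> children = new ArrayList<>();

        Node(String className, String id, boolean markedRoot) {
            this.className = className;
            this.id = id;
            this.markedRoot = markedRoot;
        }

        Node add(Node child) {
            child.parent = this;
            children.add(child);
            return child;
        }

        int indexInParent() {
            return parent == null ? 0 : parent.children.indexOf(this);
        }
    }

    // Walks up from the control to the tagged root (or the top of the tree) and joins the segments.
    static String identifierOf(Node node) {
        List<String> segments = new ArrayList<>();
        for (Node current = node; current != null; current = current.parent) {
            String idPart = current.id == null ? "" : "#" + current.id;
            segments.add(current.className + "[" + current.indexInParent() + "]" + idPart);
            if (current.markedRoot) {
                break; // stop at the manually tagged root
            }
        }
        Collections.reverse(segments); // root first, control last
        return String.join("/", segments);
    }

    public static void main(String[] args) {
        Node root = new Node("FrameLayout", "content", true);
        Node container = root.add(new Node("LinearLayout", null, false));
        container.add(new Node("TextView", "title", false));
        Node button = container.add(new Node("Button", "buy", false));
        System.out.println(identifierOf(button));
    }
}
```

Stopping at a tagged root keeps the identifier stable when the layout above that root changes between versions.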

The configuration back end maintains the mapping between automatically generated event identifiers and bids, and can deliver a configuration file to the front end. When an event fires on a front-end control, the identifier is matched against the configuration file to obtain the corresponding bid. Note that maintaining event identifiers in the configuration back end is not easy. The main complexity comes from identifiers changing when the layout changes between versions, which is also why the root node needs to be tagged manually; we generally choose view nodes that are unlikely to change.
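As a rough illustration of the matching step, the sketch below looks up a bid for a runtime identifier while tolerating a small index deviation in each path segment. It assumes identifiers are well-formed paths of ClassName[index]#id segments joined with "/", which is an encoding invented for this example; BidMatcher and its tolerance parameter are likewise hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: maps runtime event identifiers to configured bids with index tolerance.
final class BidMatcher {
    private final Map<String, String> pathToBid = new LinkedHashMap<>();
    private final int indexTolerance;

    BidMatcher(int indexTolerance) {
        this.indexTolerance = indexTolerance;
    }

    // Represents one entry of the configuration file delivered by the back end.
    void register(String configuredPath, String bid) {
        pathToBid.put(configuredPath, bid);
    }

    // Returns the bid of the first configured path that matches, or null if none does.
    String match(String runtimePath) {
        for (Map.Entry<String, String> entry : pathToBid.entrySet()) {
            if (pathsMatch(entry.getKey(), runtimePath)) {
                return entry.getValue();
            }
        }
        return null;
    }

    private boolean pathsMatch(String configured, String runtime) {
        String[] a = configured.split("/");
        String[] b = runtime.split("/");
        if (a.length != b.length) {
            return false;
        }
        for (int i = 0; i < a.length; i++) {
            if (!segmentsMatch(a[i], b[i])) {
                return false;
            }
        }
        return true;
    }

    // Class name and ID must be equal; the index may differ by at most indexTolerance.
    private boolean segmentsMatch(String configured, String runtime) {
        int openA = configured.indexOf('[');
        int closeA = configured.indexOf(']');
        int openB = runtime.indexOf('[');
        int closeB = runtime.indexOf(']');
        if (openA < 0 || closeA < openA || openB < 0 || closeB < openB) {
            return false;
        }
        boolean sameClass = configured.substring(0, openA).equals(runtime.substring(0, openB));
        boolean sameId = configured.substring(closeA + 1).equals(runtime.substring(closeB + 1));
        int indexA = Integer.parseInt(configured.substring(openA + 1, closeA));
        int indexB = Integer.parseInt(runtime.substring(openB + 1, closeB));
        return sameClass && sameId && Math.abs(indexA - indexB) <= indexTolerance;
    }

    public static void main(String[] args) {
        BidMatcher matcher = new BidMatcher(1);
        matcher.register("FrameLayout[0]#content/LinearLayout[0]/Button[1]#buy", "b_buy_click");
        // A banner inserted dynamically before the button shifts its index from 1 to 2.
        System.out.println(matcher.match("FrameLayout[0]#content/LinearLayout[0]/Button[2]#buy"));
    }
}
```

A larger tolerance survives more layout drift but raises the risk of matching a sibling control of the same class, so in this sketch the ID comparison stays strict.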

Data correlation

To associate business data with tracking data automatically, we first tried joining front-end and back-end logs: when the front end calls a back-end API, the back end writes the business data to its log, and the corresponding front-end and back-end logs are merged during data cleaning. The drawbacks are the high cost of modifying the back end and the heavy overhead of data cleaning, so this approach is not widely used. It is still necessary in some special scenarios, for example when business data is known only to the back end and not to the front end.
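The merge during data cleaning can be pictured as a join between the two logs. The sketch below assumes both sides record a shared request_id field to join on; that key, the field names, and the LogJoiner class are illustrative assumptions, since the article does not say how the logs are matched.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: merges front-end events with back-end log records by a shared key.
final class LogJoiner {
    static final String JOIN_KEY = "request_id"; // assumed join key, not from the article

    static List<Map<String, String>> join(List<Map<String, String>> frontEndEvents,
                                          List<Map<String, String>> backEndLogs) {
        // Index back-end records by the join key for constant-time lookup.
        Map<String, Map<String, String>> logsByKey = new HashMap<>();
        for (Map<String, String> log : backEndLogs) {
            logsByKey.put(log.get(JOIN_KEY), log);
        }
        List<Map<String, String>> merged = new ArrayList<>();
        for (Map<String, String> event : frontEndEvents) {
            Map<String, String> row = new LinkedHashMap<>(event);
            Map<String, String> log = logsByKey.get(event.get(JOIN_KEY));
            if (log != null) {
                // Front-end fields win on conflict; back-end fields only fill the gaps.
                log.forEach(row::putIfAbsent);
            }
            merged.add(row);
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> event = new LinkedHashMap<>();
        event.put("request_id", "r1");
        event.put("bid", "b_order_click");
        Map<String, String> log = new LinkedHashMap<>();
        log.put("request_id", "r1");
        log.put("order_id", "42");
        List<Map<String, String>> events = new ArrayList<>();
        events.add(event);
        List<Map<String, String>> logs = new ArrayList<>();
        logs.add(log);
        System.out.println(join(events, logs));
    }
}
```

Even in this toy form the cost is visible: every back-end API has to log the key and the business data, and the cleaning job has to hold or shuffle both logs to join them.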

More commonly, data is associated between front-end events. When one page navigates to another, business data is passed to the next page through the standard navigation URI scheme and filled automatically into that page's PV event. All other events generated within the page carry the same business data as the PV event.

By generating event identifiers automatically and correlating data in this way, we achieve “traceless tracking”, and tracking points can be delivered dynamically through configuration files, which enables dynamic deployment and repair. Note, however, that traceless tracking does not solve every problem: when business fields cannot be obtained through data correlation (which is fairly common), developers still have to specify them through code-based or declarative tracking. In our practice so far, about 70% of the business's tracking requirements can be met by traceless tracking, while the remaining 30% still require declarative or code-based tracking.
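A minimal sketch of this page-level propagation, assuming the business fields travel as query parameters of the jump URI: the page parses them once, attaches them to its PV event, and copies them into every later event. PageTracker, the app:// URI, and the field names nm, page, and bid as map keys are hypothetical choices for the example, and the code assumes Java 10 or later for URLDecoder.decode(String, Charset).

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: business data from the jump URI is attached to PV and all later page events.
final class PageTracker {
    private final String pageName;
    private final Map<String, String> pageData;
    final List<Map<String, String>> events = new ArrayList<>(); // stands in for the report channel

    // Called when the page opens: parse the jump URI and emit the PV event.
    PageTracker(String pageName, String jumpUri) {
        this.pageName = pageName;
        this.pageData = parseQuery(URI.create(jumpUri).getRawQuery());
        track("PV", null);
    }

    // Every event on the page inherits the business fields carried by the PV event.
    void track(String eventType, String bid) {
        Map<String, String> event = new LinkedHashMap<>(pageData);
        event.put("nm", eventType);
        event.put("page", pageName);
        if (bid != null) {
            event.put("bid", bid);
        }
        events.add(event);
    }

    static Map<String, String> parseQuery(String rawQuery) {
        Map<String, String> result = new LinkedHashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return result;
        }
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            result.put(URLDecoder.decode(key, StandardCharsets.UTF_8),
                       URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return result;
    }

    public static void main(String[] args) {
        PageTracker page = new PageTracker("hotel_detail", "app://hotel/detail?poi_id=1&city=beijing");
        page.track("MGE", "b_buy_click");
        for (Map<String, String> event : page.events) {
            System.out.println(event);
        }
    }
}
```

Because the fields are resolved once per page, individual controls need no business data of their own, which is what lets their tracking be driven purely by configuration.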

Conclusion

Front-end data collection and reporting are among the most important stages in building a data platform. The front ends of Meituan-Dianping report as many as 10 billion records every day. To better meet the increasingly complex tracking needs of the various businesses, along with the requirements for accuracy, timeliness, and development efficiency, we evolved a lightweight, declarative front-end tracking scheme on top of the code-based scheme, and explored dynamic and traceless tracking further. Declarative tracking is now fully adopted in some businesses and has delivered the expected benefits in both data quality and developer feedback. Traceless tracking is being validated and continuously optimized in some businesses and will be rolled out further within the company.

In practice, we have come to realize that no single technical solution can solve the tracking problem; different scenarios call for different tracking schemes. For simple user-behavior events, traceless tracking is sufficient; for tracking that must carry a large number of business fields available only at runtime, declarative tracking is needed. Looking at the bigger picture, beyond optimizing front-end tracking techniques, standardizing tracking data, coordinating front-end and back-end tracking, and data cleaning and association are also very important for building a more automated and dynamic tracking system in the future.