“Number” is a Getui product that provides statistical analysis for app developers. Using visual event tracking and big data analysis, it gives a comprehensive statistical picture of an app across dimensions such as user attributes, channel quality, and industry comparison.

“Number” not only counts active and new users in a timely way but also analyzes the composition and destination of users who uninstall. It can also predict key user behaviors such as churn and payment, helping app developers achieve fine-grained operations and full life cycle management of users. Its innovations in visual event tracking and behavior prediction are particularly convenient in day-to-day operations, so the rest of this article examines these two points in detail.

Visual event tracking

Event tracking (often rendered literally from Chinese as “burying points”) is the practice of embedding statistical code at key points in a product flow to track user behavior, measure how key flows are used, and report the data to the server as logs.

At present, the main ways of collecting tracking data are code-based tracking, codeless tracking, and visual tracking.

“Code-based tracking” means adding a basic JS snippet to the monitored page and then adding tracking code wherever it is needed. Its advantage is flexibility: you can customize the settings and choose exactly which data to analyze. For complex websites, however, every page change requires the tracking plan to be reworked, which is costly. Representative products at present include Baidu Statistics, Umeng, Tencent Cloud Analysis, and Google Analytics.
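To make the hand-written approach concrete, here is a minimal Python sketch of what a tracking call placed in business code might look like. The names `track_event`, `on_checkout_clicked`, and `LOG_BUFFER`, as well as the record fields, are hypothetical and do not come from any vendor SDK; a real SDK would upload the log to a server rather than keep it in memory.

```python
import json
import time

# Hypothetical in-memory buffer standing in for an SDK's upload queue.
LOG_BUFFER = []

def track_event(user_id, event_name, properties=None, timestamp=None):
    """Build one tracking record and queue it as a JSON log line."""
    record = {
        "user_id": user_id,
        "event": event_name,
        "properties": properties or {},
        "ts": timestamp if timestamp is not None else time.time(),
    }
    LOG_BUFFER.append(json.dumps(record, sort_keys=True))
    return record

def on_checkout_clicked(user_id, cart_total):
    # The tracking call is written by hand at a key step of the flow.
    track_event(user_id, "checkout_click", {"cart_total": cart_total})
    # ... the checkout business logic would follow here ...
```

Every new data point needs another call like the one in `on_checkout_clicked`, which is why each page change forces the tracking plan to be revisited.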

“Visual tracking” usually refers to the approach in which developers connect a device to the user behavior analysis tool and then, in the data access management interface, directly select the interactive page elements (such as images, buttons, and links) they want to track; the tool then delivers collection code that takes effect for the selected elements. At present, representative visual tracking products include “Number”, Mixpanel, and Sensors Data.

“Codeless tracking” is similar to “full tracking”. Its principle is “collect everything, select on demand”: it captures user behavior on every interactive element of a page, gathering as much as it can detect up front, and leaves the choice of which data to analyze to later configuration in the interface. Because this collection is standardized, any custom collection still requires code-based tracking. Representative products of this approach include GrowingIO, Number Geek, and Baidu Statistics.
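The “collect everything, select on demand” principle can be sketched in a few lines of Python. This is an illustration under assumptions, not how any of the products above are implemented: the functions `capture_all` and `select_on_demand` and the element identifiers are hypothetical.

```python
from collections import Counter

def capture_all(interactions):
    """Collection step: keep every interaction, with no per-element code."""
    return [
        {"page": page, "element": element, "action": action}
        for page, element, action in interactions
    ]

def select_on_demand(events, selected_elements):
    """Analysis step: count only the elements picked in the interface."""
    return Counter(
        e["element"] for e in events if e["element"] in selected_elements
    )

# Hypothetical interactions captured on two pages.
interactions = [
    ("home", "button#login", "click"),
    ("home", "a#banner", "click"),
    ("home", "button#login", "click"),
    ("cart", "button#pay", "click"),
]
events = capture_all(interactions)
login_and_pay = select_on_demand(events, {"button#login", "button#pay"})
```

Because every event is stored in the same standardized shape, anything that needs extra business fields (an order amount, for instance) still has to be added through hand-written tracking code.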

Why does “Number” choose visual event tracking?

The mobile Internet is developing and changing rapidly, so developers need to adjust business functions promptly in response to big data analysis and feedback. In the traditional mode of operation, learning about a different node in the data means modifying the corresponding tracking code, testing it, and then submitting the app to the app store for review and release. The entire cycle can take several weeks, which is clearly too slow for the needs of the business. The visual event tracking used by “Number” is designed to help developers solve this problem.

Visual tracking in “Number” is flexible and convenient: no code has to be added to track a data point. Users only need to connect a device through the management interface and select the elements on the page they want to track, so tracking points can be added at any time and take effect right away. Combined with its data collection model and data analysis capabilities, “Number” can provide developers with accurate and effective data.
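One way to picture how a tracking point can be added without touching app code is to treat the tracking rules as data that the client receives at run time. The sketch below is an assumption-laden illustration, not the actual “Number” SDK: the class `VisualTracker`, its methods, and the rule format are all hypothetical.

```python
class VisualTracker:
    """Client-side sketch: tracking rules arrive as data, not as app code."""

    def __init__(self):
        self.rules = {}      # (page, element_path) -> event name
        self.reported = []   # stand-in for logs sent to the server

    def apply_config(self, rules):
        # Called when the server delivers the elements selected in the UI.
        self.rules = dict(rules)

    def on_interaction(self, page, element_path):
        # Report only interactions on elements that were selected.
        event_name = self.rules.get((page, element_path))
        if event_name is not None:
            self.reported.append({"event": event_name, "page": page})
        return event_name

tracker = VisualTracker()
tracker.on_interaction("cart", "button#pay")   # no rule yet, so ignored
tracker.apply_config({("cart", "button#pay"): "pay_click"})
tracker.on_interaction("cart", "button#pay")   # now reported
```

The second interaction is recorded only because a new rule was delivered; the app binary itself did not change, which is what removes the release cycle from the loop.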



Visual event tracking has the following characteristics:

  • Zero code: no tracking code to write, which saves cost
  • No update required: tracking points are easy to add without upgrading the app
  • Easy to test: select an element, test it, and see the results in real time

In other words, visual event tracking not only saves the enterprise money but also improves the productivity of developers and operations staff.

Behavior prediction

The behavior prediction in “Number” mainly covers churn prediction, uninstall prediction, and payment prediction. Its principle is to build an algorithmic model from the app’s historical behavior data and use it to predict users’ key behaviors, helping developers achieve fine-grained user operations and full life cycle management.

Note that behavior prediction in “Number” differs from the personalized recommendation commonly used on e-commerce platforms. The latter works mainly from an individual user’s recent actions, such as browsing and purchase records, to infer what that user may need. “Number” instead performs a comprehensive analysis of app-level indicators such as channel, uninstall volume, and uninstall trend; it is closer to a cluster analysis of user groups than an analysis of individual behavior alone.

Steps of behavior prediction

According to Zhu Jinxing, a big data scientist at Getui, the behavior prediction of “Number” can be divided into the following steps:

1. Looking for samples, mainly from the historical database;

2. Feature extraction: link users with the database and match them to obtain features;

3. Feature screening to retain highly relevant or valuable features;

4. Model training: the retained features are fed into the model for training. For the model itself, “Number” mainly uses logistic regression, which is simpler than deep learning and other models and relatively easy to work with during feature screening;

5. Parameter optimization: adjust according to the results; if the results are not ideal, go back, adjust the parameters, and run through the process again.

Example analysis

Let’s take payment prediction as an example and walk through the specific implementation process.



The payment prediction process mainly includes the following steps:

1. Target problem decomposition

Identify what needs to be predicted, namely whether a user will pay, and the future time window the prediction covers.

2. Analyze sample data

(1) Extract the historical payment records of all users;

(2) Analyze payment records to understand the composition of paying users, such as age level, gender, purchasing power and product category;

(3) Extract the historical data of non-paying users, which can be extracted conditionally or unconditionally according to the needs of the product, such as extracting active and non-paying users, or extracting directly without conditions;

(4) Analyze the composition of non-paying users.

3. Build model features

(1) Raw data can be used directly as features;

(2) Some data works better after transformation; age, for example, can be converted into categories such as young, middle-aged, and elderly;

(3) Cross features can be generated; for example, “middle-aged” and “female” can be combined into a single feature.
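The three kinds of features in this step can be sketched as follows. The age cut-offs, the field names (`age`, `gender`, `active_days`), and the feature naming scheme are assumptions made for illustration; the article does not specify them.

```python
def age_bucket(age):
    """Map a numeric age to a coarse category (cut-offs are assumptions)."""
    if age < 35:
        return "young"
    if age < 60:
        return "middle_aged"
    return "elderly"

def build_features(user):
    """Turn one raw user record into a dict of model features."""
    bucket = age_bucket(user["age"])
    gender = user["gender"]
    return {
        # (1) Raw data used directly as a feature.
        "active_days": user["active_days"],
        # (2) Transformed feature: age as a category, one-hot encoded.
        f"age={bucket}": 1,
        f"gender={gender}": 1,
        # (3) Cross feature: a single feature for the combination.
        f"age={bucket}&gender={gender}": 1,
    }
```

For example, `build_features({"age": 45, "gender": "female", "active_days": 12})` yields a record whose cross feature is `age=middle_aged&gender=female`, which lets a linear model such as logistic regression learn an effect specific to that combination.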

4. Calculate feature relevance

(1) Calculate feature saturation and filter features by saturation;

(2) Calculate indicators such as information value (IV) and chi-square for each feature and filter features by relevance.
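The screening indicators named in this step can be computed with the standard library alone. The sketch below assumes that “saturation” means the share of users for whom a feature has a value, that features are categorical, and that labels are 0 (non-paying) or 1 (paying) with both classes present. The smoothing constant `eps` is an added assumption to avoid division by zero in empty bins.

```python
import math
from collections import defaultdict

def saturation(values):
    """Share of samples in which the feature is present (not None)."""
    return sum(v is not None for v in values) / len(values)

def _bin_counts(values, labels):
    counts = defaultdict(lambda: [0, 0])  # bin -> [negatives, positives]
    for v, y in zip(values, labels):
        if v is not None:
            counts[v][y] += 1
    return counts

def information_value(values, labels, eps=0.5):
    """IV = sum over bins of (p_pos - p_neg) * ln(p_pos / p_neg)."""
    counts = _bin_counts(values, labels)
    total_neg = sum(c[0] for c in counts.values())
    total_pos = sum(c[1] for c in counts.values())
    k = len(counts)
    iv = 0.0
    for neg, pos in counts.values():
        p_pos = (pos + eps) / (total_pos + eps * k)
        p_neg = (neg + eps) / (total_neg + eps * k)
        iv += (p_pos - p_neg) * math.log(p_pos / p_neg)
    return iv

def chi_square(values, labels):
    """Pearson chi-square statistic of the bin-by-label contingency table."""
    counts = _bin_counts(values, labels)
    total_neg = sum(c[0] for c in counts.values())
    total_pos = sum(c[1] for c in counts.values())
    n = total_neg + total_pos
    stat = 0.0
    for neg, pos in counts.values():
        row = neg + pos
        for observed, col_total in ((neg, total_neg), (pos, total_pos)):
            expected = row * col_total / n
            stat += (observed - expected) ** 2 / expected
    return stat
```

A feature would typically be kept only if its saturation and its IV or chi-square value exceed thresholds chosen for the product; those thresholds are not given in the article.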

5. Use logistic regression for modeling

(1) Select appropriate parameters for modeling;

(2) After training, evaluate the model with statistical indicators such as accuracy, recall, and AUC;

(3) If the model performs acceptably, validate it on the validation set. Once validation passes, the model can be saved and used for prediction.
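The modeling step can be illustrated with a small logistic regression written in plain Python. This is a teaching sketch under assumptions, not the model used in “Number”: it trains with batch gradient descent, the hyperparameters `lr`, `epochs`, and `l2` are arbitrary defaults, and the evaluation assumes both classes appear in the labels. In practice a library implementation would normally be used.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic_regression(X, y, lr=0.5, epochs=500, l2=0.0):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(model, x):
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def evaluate(model, X, y, threshold=0.5):
    """Return accuracy, recall, and AUC on a labeled data set."""
    scores = [predict_proba(model, x) for x in X]
    preds = [int(s >= threshold) for s in scores]
    accuracy = sum(p == t for p, t in zip(preds, y)) / len(y)
    recall = sum(p == 1 and t == 1 for p, t in zip(preds, y)) / sum(y)
    pos = [s for s, t in zip(scores, y) if t == 1]
    neg = [s for s, t in zip(scores, y) if t == 0]
    # AUC: probability that a random positive outranks a random negative.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return {"accuracy": accuracy, "recall": recall,
            "auc": wins / (len(pos) * len(neg))}
```

Following the sub-steps above, `evaluate` would first be called on the training data and then on a held-out validation set; the model is saved only if the validation metrics are also acceptable.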

6. Prediction

Load the model saved above and the data to be predicted, then run the prediction.
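Saving and reloading a trained model can be as simple as persisting its parameters. The sketch below assumes the model is a logistic regression described by a weight list and a bias, and stores it as JSON; the function names and the file layout are hypothetical.

```python
import json
import math

def save_model(weights, bias, path):
    """Persist the model parameters as a small JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"weights": weights, "bias": bias}, f)

def load_model(path):
    """Read the parameters back as a (weights, bias) pair."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    return data["weights"], data["bias"]

def predict_batch(model, rows):
    """Return {user_id: probability} for rows of (user_id, feature_vector)."""
    weights, bias = model
    result = {}
    for user_id, x in rows:
        z = sum(w * v for w, v in zip(weights, x)) + bias
        # Numerically stable logistic function.
        if z >= 0:
            result[user_id] = 1.0 / (1.0 + math.exp(-z))
        else:
            e = math.exp(z)
            result[user_id] = e / (1.0 + e)
    return result
```

The feature vectors passed to `predict_batch` must be built in exactly the same way, and in the same order, as the ones used for training.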

7. Monitoring

Finally, operators also need to monitor the key indicators of each prediction run, so that problems are found and solved in time and unexpected situations that could invalidate or skew the predictions are prevented.
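The article does not say which indicators are monitored, so the following is only one possible sketch: it checks that a prediction batch is non-empty, that every score is a valid probability, and that the share of users predicted to pay has not drifted far from a baseline. The function name, the baseline, and the `tolerance` value are all assumptions.

```python
def monitor_predictions(scores, baseline_rate, threshold=0.5, tolerance=0.1):
    """Run basic sanity checks on one batch of predicted probabilities."""
    if not scores:
        return {"ok": False, "reason": "no predictions produced"}
    # NaN fails this range check too, so it is flagged as invalid.
    invalid = [s for s in scores if not 0.0 <= s <= 1.0]
    if invalid:
        return {"ok": False,
                "reason": f"{len(invalid)} scores outside [0, 1]"}
    rate = sum(s >= threshold for s in scores) / len(scores)
    if abs(rate - baseline_rate) > tolerance:
        return {"ok": False, "rate": rate,
                "reason": "predicted positive rate drifted from baseline"}
    return {"ok": True, "rate": rate}
```

A failed check would prompt the operator to inspect the input data or retrain the model before the predictions are used.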

Other scenarios, such as churn prediction and uninstall prediction, follow a flow similar to payment prediction, so I won’t cover them all here.

With accurate behavior prediction, operators can break down and refine their operational goals for each scenario and each process, and adopt different promotion channels and operational strategies for different users. For example, based on churn prediction, operators can spot likely churn in advance, intervene early, and retain users who are about to leave through personalized content recommendation, push notifications, and other operational measures, thereby reducing the churn rate. In general, with the help of big data behavior prediction, operators can understand users in a more timely and comprehensive way and so achieve refined operations.

About the future

Next, “Number” will explore further in areas such as recommendation, for example by developing precise recommendation technology. It will tap the potential of big data, use feedback data for further optimization, and study training sample data around customers in more depth, in order to provide developers with more comprehensive data services. Stay tuned.