What is an index system? How can the OSM and AARRR models be used to build one? How can an index system be managed in a unified way through defined processes, standards and tools? This article answers these questions by combining construction methodology with the practice of building Didi’s data index system.
#1. What is the indicator system
##1.1 Definition of index system
An index system organizes scattered, isolated but interrelated indicators into a systematic whole, so that the overall situation can be seen from a single point and the problems of a single point can be solved from the overall picture. It consists of two parts: indicators and the system.
An indicator is a quantified measurement defined after the business has been broken down into units. It makes business objectives describable, measurable and decomposable. It is where business and data meet, the basis of statistics, and an important basis for quantifying effects.
Indicators are mainly divided into result indicators and process indicators:
- Result indicators measure the outcome produced after a user takes a certain action. They are usually known only after a delay and are difficult to intervene in. Result indicators are mostly used to monitor data anomalies or to check whether user needs are met in a given scenario.
- Process indicators are generated while users perform an action. Operational strategies can influence them and thereby affect the final result. Process indicators focus on why users’ needs are or are not met.
The system is composed of different dimensions. A dimension is the angle from which users observe, think about and describe something. Dimensions are the core of the index system: without a dimension, an indicator on its own is meaningless.
Dimensions are mainly divided into qualitative and quantitative dimensions. Qualitative dimensions are mostly textual descriptions, such as city, gender or occupation. Quantitative dimensions are numerical descriptions, such as income or age, and need to be grouped into numerical ranges before use.
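The numerical grouping required for quantitative dimensions can be sketched as follows. This is a minimal illustration, and the age boundaries and labels are assumptions rather than values from Didi’s system.

```python
from bisect import bisect_right

# Hypothetical cut points; real boundaries depend on the business question.
AGE_EDGES = [18, 25, 35, 50]
AGE_LABELS = ["<18", "18-24", "25-34", "35-49", "50+"]


def age_group(age: int) -> str:
    """Map a numeric age to a labelled range so it can serve as a dimension."""
    # bisect_right counts how many edges are <= age, which is the label index.
    return AGE_LABELS[bisect_right(AGE_EDGES, age)]


groups = [age_group(a) for a in (16, 18, 30, 49, 50)]
print(groups)
```

Once grouped, the quantitative dimension behaves like a qualitative one: indicators can be sliced by age range in the same way as by city or gender.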
##1.2 Indicator system life cycle
The life cycle mainly includes four stages: definition, production, consumption and retirement. Across the whole life cycle, indicator operation and maintenance and quality assurance must be carried out continuously. In addition, corresponding data operations work is needed to increase the reuse of indicator data and reduce users’ cost of use.
##1.3 Comprehensive usage scenarios
The index system is used in combination with users’ business scenarios. Multiple indicators and dimensions can be combined for comprehensive business analysis, so that users can see overall business changes from indicator changes and quickly discover and locate problems. There are two common scenarios. One is decision analysis, in which data gives a clear view of the business status to support strategic decisions. The other is operational analysis: whether for user operations, product operations or campaign operations, all kinds of indicator data are needed to identify and analyze problems and guide their resolution.
#2. Why build an indicator system
- Measure the quality of business development
The index system reflects the objective facts of the business and shows the current state of business development. Indicators are used to measure business quality, steer business development, focus on solving the problems found, and promote orderly business growth.
- Establish indicator causality
It clarifies the relationship between result indicators and process indicators, so that the core cause of a problem can be found by tracing from result indicators back to process indicators.
- Guide user analysis
It supports building product evaluation systems, campaign effect evaluation systems and intelligent operation analysis systems.
- Guide the construction of basic data
It clarifies the direction of basic data construction, concentrates resources, and avoids omitted or missing data for process and result analysis indicators.
- Guide the construction of content products
Multiple indicators and dimensions can be combined to analyze the business comprehensively, and users can quickly discover and locate problems from indicator changes.
- Unify the indicator consumption caliber
Unify the business caliber (definition) and calculation caliber of key indicators within the enterprise and align business objectives across the enterprise, so that work is driven top-down by goals.
#3. How to build an index system
The common approach is to build the index system scenario by scenario: think from the user’s scenario and work top-down, driven by the business. To complete the index system for a particular scenario, first choose the right indicators, then use a scientific method to build the system.
##3.1 Scientific method to select indicators
The common methods for selecting indicators are the index classification method and the OSM model.
Index classification is vertical thinking about indicator content. Indicators are classified from top to bottom according to the enterprise’s strategic objectives, organization and business processes, and analyzed layer by layer. They are mainly divided into three levels: T1, T2 and T3.
- T1 indicators: company strategic level
These measure the achievement of the company’s overall goals. They are mainly decision-making indicators and usually serve the company’s strategic decision-making level.
- T2 indicators: business strategy level
To achieve the T1 targets, the company breaks the targets down to business lines or business groups and formulates a series of targeted operational strategies. T2 indicators usually reflect the results of those strategies. They are supporting indicators and also the core indicators of a business line or business group. T2 is a vertical decomposition of T1 that makes it easier to locate problems in T1 indicators, and T2 indicators usually serve business lines or business groups.
- T3 indicators: business execution level
T3 indicators decompose T2 indicators and are used to locate problems in them. They are usually the most common indicators in business processes, and the indicators of interest differ by functional department according to its objectives. T3 indicators usually guide front-line operations staff or analysts in their work. Their content is process-oriented, so they can quickly steer front-line personnel toward the corresponding action.
For example: the index classification of turnover rate
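A classification like this can be sketched as a small tree in which each T1 indicator is decomposed into T2 and T3 indicators. The indicator names below are hypothetical placeholders chosen for illustration, not Didi’s actual decomposition.

```python
from dataclasses import dataclass, field


@dataclass
class Indicator:
    name: str
    level: str  # "T1", "T2" or "T3"
    children: list["Indicator"] = field(default_factory=list)


def paths_to_leaves(node, prefix=()):
    """Yield every top-down path, i.e. the route for tracing a T1 change to T3."""
    path = prefix + (f"{node.level}:{node.name}",)
    if not node.children:
        yield path
    for child in node.children:
        yield from paths_to_leaves(child, path)


# Hypothetical decomposition, for illustration only.
tree = Indicator("transaction_rate", "T1", [
    Indicator("express_transaction_rate", "T2", [
        Indicator("driver_acceptance_rate", "T3"),
        Indicator("passenger_cancellation_rate", "T3"),
    ]),
    Indicator("taxi_transaction_rate", "T2", [
        Indicator("driver_response_time", "T3"),
    ]),
])

paths = list(paths_to_leaves(tree))
for p in paths:
    print(" > ".join(p))
```

Walking a path from the root downward mirrors how a change in a T1 indicator is located first in a T2 business line and then in a T3 process indicator.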
The OSM model (Objective, Strategy, Measurement) is an important method for determining the core of an index system during its construction. It covers business objectives, business strategies and business measurement, and represents horizontal thinking about indicator content.
- O
What is the user’s goal in using the product? What needs does the product meet? Objectives are determined mainly from the user and business perspectives, and should be practical, easy to understand, open to intervention and positive.
- S
What strategies are adopted to achieve these objectives?
- M
What changes in data indicators do these strategies bring about?
Take Didi ride-hailing as an example. What are its indicators according to the OSM model?
- O: What are users’ needs and goals in using Didi?
Users want to hail a ride conveniently and quickly, and to arrive at their destination safely.
So how can users be made to feel that their needs are being met?
- S: Didi’s strategies are:
Convenience: a standalone app and a mini program are provided, and rides can also be hailed through multiple channels, for example the ride-hailing entry points in AutoNavi, WeChat and Alipay; intelligent and precise map positioning of the pickup point and destination; optimal route selection; and so on.
Speed: a variety of products, such as express, premium, carpool and taxi, are offered for the different needs of different groups of people; capacity in hotspot areas is increased during the morning and evening peaks to reduce user queuing time.
Safety: driver admission mechanism, driver compliance mechanism, driver profiling.
- M: Indicators need to be defined for these strategies. They are divided into result indicators and process indicators:
**Result indicators:** channel order completion conversion rate, passenger cancellation rate, supply/demand ratio, driver service score
**Process indicators:** number of orders issued per channel, number of orders completed per channel, number of passengers queuing, passenger queuing time, driver positive-rating rate, number of orders accepted by drivers, number of driver cancellations, etc.
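The OSM breakdown above can be written down as a small data structure, which makes it easy to check that every strategy has both result and process indicators. This is only a sketch: the indicator names come from the text, but the identifiers and the assignment of each indicator to a strategy are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measure:
    name: str
    kind: str  # "result" or "process"


@dataclass(frozen=True)
class Strategy:
    name: str
    measures: tuple[Measure, ...]


OBJECTIVE = "Hail a ride conveniently and quickly, and arrive safely"

# Which indicator belongs to which strategy is an assumption made for this sketch.
STRATEGIES = (
    Strategy("convenience", (
        Measure("channel_completion_conversion_rate", "result"),
        Measure("channel_orders_issued", "process"),
        Measure("channel_orders_completed", "process"),
    )),
    Strategy("speed", (
        Measure("supply_demand_ratio", "result"),
        Measure("passenger_cancellation_rate", "result"),
        Measure("passengers_queuing", "process"),
        Measure("passenger_queuing_time", "process"),
    )),
    Strategy("safety", (
        Measure("driver_service_score", "result"),
        Measure("driver_positive_rating_rate", "process"),
        Measure("driver_orders_accepted", "process"),
        Measure("driver_cancellations", "process"),
    )),
)


def measures_of_kind(kind: str) -> list[str]:
    """List all indicator names of one kind across every strategy."""
    return [m.name for s in STRATEGIES for m in s.measures if m.kind == kind]


print(OBJECTIVE)
print("result:", measures_of_kind("result"))
print("process:", measures_of_kind("process"))
```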
Once the indicators are selected, the next and most important step is choosing the analysis dimensions. As stated in the definition of the index system, dimensions are its core, and an indicator without a dimension is meaningless. Dimensions are chosen mainly from the data analysis perspective, combined with the analysis actually done in the business scenario. Examples include the city, business district, channel, time and user tag dimensions.
##3.2 Build index system with analysis model
The book Lean Analytics gives two commonly used methodologies for building an index system. One of them is the well-known Pirate Metrics, the AARRR model that we often hear about. It is a classic model for user analysis and reflects the fact that growth runs systematically through every stage of the user life cycle: Acquisition, Activation, Retention, Revenue and Referral.
AARRR model
- A (Acquisition): target users are acquired through various promotion channels. The effect of each marketing channel is evaluated and the spending strategy is continuously optimized to reduce customer acquisition cost. Key indicators include newly registered users, activation rate, registration conversion rate, new-user retention rate, downloads and installs.
- A (Activation): an activated user is one who has really started to use the value the product provides. User behavior data must be tracked and the health of the product monitored. This stage mainly reflects user behavior after entering the product and is the core of the product experience. Key indicators include DAU/MAU, daily usage duration, app launch time and number of app launches.
- R (Retention): measures user stickiness and quality. Key indicators include retention rate and churn rate.
- R (Revenue): mainly measures the commercial value of the product. Key indicators include lifetime value (LTV), average order value per customer and GMV.
- R (Referral): measures organic spread and word of mouth. Key indicators include invitation rate and viral (fission) coefficient.
According to actual business scenarios, OSM and AARRR models can be combined to systematically select the core data indicators required by different stages.
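As a rough illustration of how AARRR stages turn into numbers, the sketch below computes the conversion from each stage to the next. Treating the five stages as a strict funnel is a simplifying assumption, and the user counts are made up for the example.

```python
STAGES = ["Acquisition", "Activation", "Retention", "Revenue", "Referral"]


def stage_conversion(counts: dict[str, int]) -> dict[str, float]:
    """Return the share of users in each stage that reaches the following stage."""
    rates = {}
    for prev, cur in zip(STAGES, STAGES[1:]):
        # Guard against an empty stage to avoid division by zero.
        rates[f"{prev}->{cur}"] = counts[cur] / counts[prev] if counts[prev] else 0.0
    return rates


# Hypothetical user counts per stage for one period.
counts = {
    "Acquisition": 10_000,
    "Activation": 4_000,
    "Retention": 2_000,
    "Revenue": 500,
    "Referral": 100,
}

rates = stage_conversion(counts)
for step, rate in rates.items():
    print(f"{step}: {rate:.0%}")
```

The step with the lowest rate points to the stage whose process indicators deserve attention first.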
A general abstraction that is currently popular for Internet businesses is **“people, goods and field”**, which corresponds to what we usually call users, products and scenarios. Put simply, it describes who uses which product in which scenario. Different business models combine these differently. Take Didi’s actual scenario as an example: in which scenario (defined here as the terminal, such as the native app, WeChat or Alipay) which people (passengers) use which goods on the platform (platform business lines, such as express or premium), so that the value and effect of user growth can then be evaluated.
###3.3.1 “Human” perspective
From the perspective of “people”, we care about which passengers hailed a ride and when, how long they queued, how long they waited to be picked up, how many rides they took in a period, how much they spent, and whether there were complaints or cancellations. The specific indicators are mainly the number of users who issued orders, the number of users who completed orders, the average order value per customer, and the numbers of completed, cancelled and rated orders within the period.
###3.3.2 “Goods” perspective
From the perspective of “goods”, we care about transaction amount, transaction volume and expenditure. The specific indicators are mainly GMV, transaction rate and cancellation rate, further broken down by city, region, first-level category and second-level category. The effect is analyzed and judged through comparison against targets, horizontal comparison and historical comparison.
###3.3.3 “field” perspective
From the perspective of “field”, we care more about which channel has the most clicks and the highest exposure rate, how many new users it brings, how many orders are completed through it, and what the average order value is. We also care about how well a given campaign acquires or activates users and what its conversion rate is, so that strategies can be developed from the actual scenario data.
The data indicators and analysis dimensions above were refined from the perspectives of “people”, “goods” and “field”. Next, the indicators from the three perspectives are decomposed and linked using the index classification method.
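Combining indicators with dimensions, as described above, amounts to grouping detail records by a dimension and computing each indicator per group. The sketch below assumes a simplified order record with made-up values; the field names are hypothetical.

```python
from collections import defaultdict

# Hypothetical order records; amounts of cancelled orders are zero.
orders = [
    {"city": "Beijing", "channel": "app", "status": "finished", "amount": 30.0},
    {"city": "Beijing", "channel": "wechat", "status": "cancelled", "amount": 0.0},
    {"city": "Beijing", "channel": "app", "status": "finished", "amount": 50.0},
    {"city": "Shanghai", "channel": "alipay", "status": "finished", "amount": 40.0},
    {"city": "Shanghai", "channel": "app", "status": "cancelled", "amount": 0.0},
]


def summarize(rows, dimension):
    """Compute GMV, completion rate and cancellation rate per dimension value."""
    acc = defaultdict(lambda: {"orders": 0, "finished": 0, "cancelled": 0, "gmv": 0.0})
    for row in rows:
        group = acc[row[dimension]]
        group["orders"] += 1
        group[row["status"]] += 1
        group["gmv"] += row["amount"]
    return {
        key: {
            "gmv": g["gmv"],
            "completion_rate": g["finished"] / g["orders"],
            "cancellation_rate": g["cancelled"] / g["orders"],
        }
        for key, g in acc.items()
    }


by_city = summarize(orders, "city")
by_channel = summarize(orders, "channel")
print(by_city)
print(by_channel)
```

The same function serves the “goods” view (by city) and the “field” view (by channel) simply by switching the dimension argument.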
#4. How to manage the index system
##4.1 Pain point analysis
The pain points are analyzed mainly from three perspectives: business, technology and product.
- Business perspective
Indicators and dimensions for business analysis scenarios are unclear;
Requirements change frequently and are iterated repeatedly, data reports are bloated, and data quality is uneven;
It is costly for users to find and verify data when analyzing specific business problems.
- Technical perspective
Indicator definition: naming is confused, indicators are not unique, and maintained calibers are inconsistent;
Indicator production: construction is duplicated and data computation is costly;
Indicator consumption: data outlets are not unified, output is duplicated, and output calibers are inconsistent.
- Product perspective
There is no systematic, product-level support for the data flow from production to consumption.
##4.2 Management Objectives
- Technical objectives
Unified management of indicators and dimensions: unified indicator naming, unified calculation caliber, a single statistical source, standard dimension definitions and consistent dimension values
- Business objectives
Unified data outlets and scenario-based coverage
- Product objectives
Land the index system management tools as products, and productize the content of the index system to support decision-making, analysis and operations, for example North Star decision products and intelligent operation analysis products
##4.3 Model architecture
###4.3.1 Business segments
Definition principle: abstract at the business logic level and subdivide at the physical structure level. The hierarchy can be split and refined according to the actual business situation, and a three-level split is recommended. The first-level split can be defined uniformly at company level, while the second and subsequent levels can be split by each business line according to its actual business. For example, at the business logic level Didi’s two-wheel and four-wheel services both belong to the travel domain, so a travel business segment can be abstracted (level 1). At the physical structure level it is then subdivided into Puhui, online ride-hailing, taxi and hitch (level 2). Further subdivision follows actual business needs: online ride-hailing can be subdivided further, and Puhui can be subdivided into bicycles and enterprise.
###4.3.2 Specification definition
- Data domain
A data domain is a collection of business processes or dimensions abstracted for business analysis. A business process can be summarized as an indivisible behavioral event under which indicators can be defined. A dimension is the environment of the measurement; for example, in the passenger call event, the call type is a dimension. To keep the whole system viable, data domains need to be abstracted, maintained and updated over the long term, and changes must go through a change process.
- Business process
A business activity event of the company, such as placing a call order or making a payment. A business process is indivisible.
- Time period
A time range or point in time, such as the last 30 days, a natural week, or up to the current day.
- Modifier type
An abstract grouping of modifiers. A modifier type belongs to a business domain. For example, the access terminal type in the log domain covers modifiers such as APP and PC.
- Modifier
A modifier is an abstraction that restricts an indicator to a business scenario, apart from the statistical dimensions. A modifier belongs to a modifier type. For example, under the access terminal type in the log domain, the modifiers include APP and PC.
- Metric / atomic indicator
An atomic indicator and a metric mean the same thing: a measurement based on a business event behavior that cannot be split further in the business definition and has a name with clear business meaning, such as payment amount.
- Dimension
A dimension is the environment of a measurement and reflects a class of attributes of the business. The collection of such attributes constitutes a dimension, which can also be called an entity object. A dimension belongs to a data domain. Examples are the geographical dimension (including country, region, province, etc.) and the time dimension (including year, season, month, week, day, etc.).
- Dimension attribute
A dimension attribute belongs to a dimension. For example, country name, country ID and province name are attributes of the geographical dimension.
- Indicator types
Indicators are mainly divided into atomic indicators, derived indicators and derivative (composite) indicators:
- Atomic indicator: a measure based on a business event behavior. It cannot be split further in the business definition and has a name with clear business meaning, such as call volume or transaction amount.
- Derived indicator: one atomic indicator + multiple modifiers (optional) + a time period. It delimits the business statistical scope of an atomic indicator. Derived indicators are of two types:
Transactional indicators measure business processes, for example call volume or order payment amount. For these, atomic indicators and modifiers are maintained first, and derived indicators are created on that basis.
Stock indicators are statistics on the state of entity objects (such as drivers and passengers), for example the total number of registered drivers or registered passengers. These also require maintaining atomic indicators and modifiers before creating derived indicators, and the corresponding time period is generally “from the beginning of history up to the current time”.
- Derivative indicator: a composite of transactional and stock indicators. Types include ratios, proportions and statistical means.
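The rule “derived indicator = atomic indicator + optional modifiers + time period”, and a ratio-type derivative indicator built on top of it, can be sketched with a few dataclasses. The class names, the naming convention and the sample values are assumptions made for illustration, not the schema of Didi’s tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtomicIndicator:
    name: str              # e.g. "call_volume"
    business_process: str  # the indivisible business event it measures


@dataclass(frozen=True)
class DerivedIndicator:
    atomic: AtomicIndicator
    time_period: str                 # e.g. "last_7_days"
    modifiers: tuple[str, ...] = ()  # optional, e.g. ("app",)

    @property
    def name(self) -> str:
        # Hypothetical convention: period, then modifiers, then the atomic name.
        return "_".join((self.time_period, *self.modifiers, self.atomic.name))


@dataclass(frozen=True)
class RatioIndicator:
    """A ratio-type derivative indicator composed of two derived indicators."""
    name: str
    numerator: DerivedIndicator
    denominator: DerivedIndicator

    def value(self, data: dict[str, float]) -> float:
        return data[self.numerator.name] / data[self.denominator.name]


calls = DerivedIndicator(AtomicIndicator("call_volume", "call"), "last_7_days", ("app",))
finished = DerivedIndicator(AtomicIndicator("finished_orders", "completion"), "last_7_days", ("app",))
rate = RatioIndicator("last_7_days_app_completion_rate", finished, calls)

# Made-up values keyed by derived indicator name.
data = {calls.name: 200.0, finished.name: 150.0}
print(calls.name, finished.name, rate.value(data))
```

Because a derived indicator stores a reference to its atomic indicator rather than a copy of its definition, the caliber is defined once and reused by every derived or derivative indicator built on it.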
The model design mainly uses dimensional modeling. The fact tables of basic business details mainly store dimension attribute sets and metrics/atomic indicators. The summary fact tables for business analysis store statistical dimension sets together with atomic or derived indicators, or, in some tables, only the statistical label sets of the analysis entities. At the physical implementation level of the data warehouse, the index system is mainly guided by the layered structure of the data warehouse model. Didi’s indicator data is mainly stored in the DWM layer, which serves as the core management layer for indicators.
Dimension management covers basic information and technical information, which are maintained by different roles.
- Basic information: the business information of a dimension, maintained by business managers, data product staff or BI analysts. It includes the dimension name, business definition and business classification.
- Technical information: the data information of a dimension, maintained by data R&D. It includes whether the dimension has a dimension table (an enumeration dimension or an independent physical dimension table), whether it is a date dimension, and the English and Chinese names of the code and name fields. If the dimension has a physical dimension table, it must be bound to that table and the fields corresponding to code and name must be set. If the dimension is an enumeration dimension, the corresponding code and name values are filled in. Unified dimension management helps standardize data tables and makes user queries easier.
Indicator management covers basic information, technical information and derivative information, which are maintained by different roles.
- Basic information: the business information of an indicator, maintained by business managers, data product staff or BI analysts. It mainly includes ownership information (business segment, data domain, business process), basic information (indicator name, English name, definition, statistical algorithm, and indicator type, i.e. deduplicated or non-deduplicated) and business scenario information (analysis dimensions, scenario description);
- Technical information: the physical model information of an indicator, maintained by data R&D. It mainly includes the corresponding physical table and field information;
- Derivative information: the related derived or derivative indicators and the associated data applications and business scenarios. It lets users look up which other indicators and data applications use an indicator, and provides indicator lineage analysis for tracing data sources.
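The lineage lookup described in the last item can be sketched as a traversal over a dependency graph: one direction traces an indicator back to its data sources, the other finds everything that uses it. The graph contents and table name below are hypothetical, and the sketch assumes the graph has no cycles.

```python
# Each key depends on the listed upstream nodes; an empty list marks a source table.
UPSTREAM = {
    "completion_rate": ["finished_orders", "issued_orders"],
    "finished_orders": ["dwd_order_detail"],
    "issued_orders": ["dwd_order_detail"],
    "dwd_order_detail": [],
}


def trace_sources(node, graph):
    """Follow upstream edges until nodes with no parents (the data sources)."""
    parents = graph.get(node, [])
    if not parents:
        return {node}
    sources = set()
    for parent in parents:
        sources |= trace_sources(parent, graph)
    return sources


def downstream(node, graph):
    """Collect everything that directly or indirectly depends on the node."""
    result = set()
    frontier = [node]
    while frontier:
        current = frontier.pop()
        for child, parents in graph.items():
            if current in parents and child not in result:
                result.add(child)
                frontier.append(child)
    return result


print(trace_sources("completion_rate", UPSTREAM))
print(downstream("dwd_order_detail", UPSTREAM))
```

The downstream view answers the question raised in the text, namely which other indicators and applications are affected when one indicator or table changes.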
Atomic indicator definition: ownership information + basic information + business scenario information
Derived indicator definition: time period + modifier set + atomic indicator
Modifier type definition: type description, statistical algorithm description, and data source (optional)
The modeling process mainly guides engineers, from the business perspective, to abstract and categorize by subject the indicators involved in requirement scenarios, to unify business terms, reduce communication costs and avoid duplicated construction of indicators later on.
The analysis data system is the physical collection of summary fact tables in the model architecture. At the business logic level, it is abstracted and consolidated according to business analysis objects or scenarios. Didi Chuxing mainly abstracts themes from the analysis objects, such as the driver, safety, experience and city themes. Indicator categories are mainly abstracted from actual business processes, such as driver transaction, driver registration and driver growth indicators. The basic data system is the physical collection of detailed fact tables and basic dimension tables in the model architecture. At the business logic level it is abstracted according to actual business scenarios, such as driver compliance and passenger registration, so as to restore the core business processes.
The development process guides engineers, from a technical perspective, in the production, operation and maintenance, and quality control of the index system. It is also a bridge for communication and coordination between data product staff or data analysts and data warehouse R&D.
The index system atlas, also known as the data analysis atlas, is the collection of business classifications, analysis indicators and dimensions obtained by abstracting business analysis entities from actual business scenarios.
Construction method: built mainly through business thinking and from the user’s perspective, keeping business and data closely linked and organizing indicators in a structured, classified way.
Construction purpose:
- For users:
With the index system consolidated by business scenario, users can quickly locate the indicators and dimensions they need, and their data needs can be met quickly.
- For R&D:
It helps define the boundaries of subsequent production model design and of data content, makes iteration in data system construction quantifiable, and supports the implementation of data assets.
###4.6.2 Index system Atlas model
###4.6.3 Example of indicator system atlas
#5. Productizing the index system
The set of products involved in the index system should be built around its life cycle, with product tools connecting the data flow, so that the index system is managed in a unified, automated, standardized and process-based way. Because the essential goal of building an index system is to serve the business and realize data-driven business value, the core principle of construction is **“light on standards, heavy on scenarios, from control to service”**. The convergence of tools, products, technology and organization improves users’ efficiency with data and accelerates business innovation. The product most closely tied to the index system methodology is the indicator dictionary tool. Its positioning and value are:
- Carry indicator management standards from method to tool: standard indicators are generated automatically, which solves confused naming and non-unique indicators and removes ambiguity in the data
- Provide standard indicator calibers and metadata information in a unified way
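How a dictionary tool might generate standard names and enforce uniqueness can be sketched with a small registry. This is a hypothetical illustration of the idea, not the design of Didi’s indicator dictionary; the class, its methods and the naming rule are all assumptions.

```python
class IndicatorDictionary:
    """Hypothetical registry that generates standard names and rejects duplicates."""

    def __init__(self):
        self._entries = {}

    def register(self, atomic, time_period, modifiers=(), caliber=""):
        # Sorting the modifiers makes the generated name independent of input order,
        # so the same definition cannot be registered twice under different names.
        ordered = tuple(sorted(modifiers))
        name = "_".join([time_period, *ordered, atomic])
        if name in self._entries:
            raise ValueError(f"indicator already registered: {name}")
        self._entries[name] = {
            "atomic": atomic,
            "time_period": time_period,
            "modifiers": ordered,
            "caliber": caliber,
        }
        return name

    def lookup(self, name):
        """Return the metadata, including the caliber, stored for a standard name."""
        return self._entries[name]


dictionary = IndicatorDictionary()
name = dictionary.register(
    "call_volume", "last_7_days", ["wechat", "express"],
    caliber="count of call orders",
)
print(name)
print(dictionary.lookup(name)["caliber"])
```

Generating the name from the definition, instead of letting each author type one, is what keeps one definition tied to exactly one name.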
# 6. Conclusion
This article has introduced, as a whole, the construction methodology of Didi’s index system and the building of the supporting tool products. The workflow connecting the indicator dictionary and the development tools has now been completed. Next, data services will be provided through DataAPI to integrate with data consumption products, which is currently being planned and built. The methodology and tools have been promoted within Didi Group, and departments such as online ride-hailing, Puhui and cars have started to adopt them. So far more than 5,000 indicators have been entered into the index system, covering the company’s core business segments, 88 data domains, 385 business processes and 52 business scenarios. The methodology and tools will continue to be iterated and improved.
About the team
The Data Warehouse team of the Data Management Department, under the Basic Platform Department of Didi’s Cloud Platform Business Group, is responsible for the architecture, planning and design of the data warehouse for the company’s online ride-hailing, taxi, hitch and international travel businesses, as well as for building data content products. It supports the data-driven decision analysis of core business functions such as operations, product, analysis, strategy, security and experience, and provides complete, reliable and high-quality data services.
About the author
Focuses on the systematic construction of data warehouses and on promoting and practicing the concept of the productized data warehouse
Content editing | Charlotte