Author: Umeng+ U-APM Application Performance Monitoring team
Why? Why monitor application performance?
First of all, what exactly does application performance monitoring mean, and what is its purpose?
Monitoring here means a complete “monitoring + alarm” system. For app developers like us, performance monitoring is the first hurdle in measuring the quality of an app. A poor-quality app directly harms the user experience. Once an app has launched, developers cannot watch how users are using and experiencing it in real time, 24 hours a day and 7 days a week, so a set of high-quality monitoring tools is needed.
So what exactly do we need to monitor?
For example, on Android we need to monitor Java errors, native errors, ANRs, and so on, while on iOS we need to monitor errors in the Objective-C, Swift, and C++ layers.
When defining error metrics, the most basic one is the number of errors of each type. To relate the number of errors to overall application usage, a ratio can be used; for example, the error rate can be defined as the number of errors divided by a measure of overall application usage.
If you care about how many users are affected as well as how many errors occur, you can derive the number of affected users from the error records.
How do you define a unique user? We can identify users by a device ID such as IMEI, IDFA, or Android ID. If this information is difficult to obtain, we can use a business user ID instead, such as a login account or member name. Device identifiers provided by third-party SDKs are another good option. After deduplicating the error records by this ID, you get the number of users affected by errors.
If we know the number of users affected by an error but not how large a share of the user base that is, we can look at the proportion of affected users, that is, the number of affected users relative to the total number of users.
In conclusion, for each type of error we can compute the number of errors, the error rate, the number of affected users, the proportion of affected users, and other metrics over a given time range. For finer-grained monitoring, we can also break these metrics down by dimensions such as version number.
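The four metrics can be sketched in a few lines of Python. This is only an illustration, not how U-APM computes them: the record fields (`user_id`, `version`, `type`) are hypothetical, and using app launches as the error-rate denominator and active users as the denominator for the affected-user proportion are assumptions made for the example.

```python
from collections import defaultdict


def error_metrics(errors, launches, active_users):
    """Summarize error records per (version, error type).

    errors       -- iterable of dicts with the hypothetical keys
                    "user_id", "version", and "type"
    launches     -- dict: version -> number of app launches (assumed
                    denominator for the error rate)
    active_users -- dict: version -> number of active users (assumed
                    denominator for the affected-user proportion)
    """
    counts = defaultdict(int)
    users = defaultdict(set)
    for e in errors:
        key = (e["version"], e["type"])
        counts[key] += 1
        users[key].add(e["user_id"])  # the set deduplicates users by ID

    summary = {}
    for key, n in counts.items():
        version = key[0]
        affected = len(users[key])
        summary[key] = {
            "errors": n,
            "error_rate": n / launches[version],
            "affected_users": affected,
            "affected_ratio": affected / active_users[version],
        }
    return summary


sample = [
    {"user_id": "u1", "version": "2.0", "type": "java"},
    {"user_id": "u1", "version": "2.0", "type": "java"},
    {"user_id": "u2", "version": "2.0", "type": "java"},
    {"user_id": "u3", "version": "2.0", "type": "anr"},
]
report = error_metrics(sample, launches={"2.0": 100}, active_users={"2.0": 50})
print(report[("2.0", "java")])
```

Keying the summary by (version, error type) gives the version breakdown for free; dropping the version from the key would give app-wide figures instead.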
How? How do you develop a flexible alarm plan?
First, take a short quiz to determine your monitoring alarm type (5 questions, about 1.5 minutes).
The scoring rules are as follows: A is worth 5 points, B 10 points, C 15 points, and D 20 points.
Q1: What is the current stage of your product?
A: It is in a stable state and has low requirements for monitoring alarms
B: It is still in development; some errors need to be caught during testing, and the need for monitoring alarms is moderate
C: It has just gone live and is stable overall. The demand for monitoring alarms is high
D: It has just gone live and its effect is still unknown. It needs real-time attention 24/7 and has a high demand for monitoring alarms
Q2: What is your position in your company/department?
A: Team leader, concerned with the overall quality of the application
B: Operations staff, responsible for watching production issues and monitoring overall application performance
C: Tester, responsible for quality control before the application is released
D: Android/iOS client developer
Q3: How many members of your team focus on and participate in application performance quality?
A: 1 person, a commander without troops who does everything alone
B: 2-5 people, small development team
C: 6 ~ 25 people, working together to improve application quality
D: More than 25 people, a very large development team that, modesty aside, is an industry leader
Q4: Which application performance monitoring indicators do you pay attention to?
A: The basic number of errors is enough
B: To understand how widely users are affected, the number and proportion of affected users should be monitored in addition to the above
C: In addition to the number of errors and the user impact above, the distribution across versions should also be considered
D: Combined alarm rules are needed. For example, alarm when the number of errors is greater than 100 and the error rate is greater than 1% or the number of affected users is 1% higher than one day ago, with the version distribution also taken into account
Q5: Do you have specific requirements for how alarm notifications are delivered?
A: None, as long as I receive them
B: I have some requirements on timing. I do not want to be disturbed in the middle of the night
C: I have some requirements on the channel, such as email or specific office chat software
D: I have requirements for both timing and channel
What? So how do you set up an alarm plan?
Add up the points for your answers (5 points for option A, 10 for option B, 15 for option C, and 20 for option D), then see which of the following monitoring alarm requirement levels your app falls into:
Warm Blood Bronze (25 ~ 50 points): You are at the beginner stage of alarm monitoring and do not need to check every kind of error in fine detail in your daily work. This may be because your application is still at an early stage, or because you hold a senior position and only need to keep an eye on the alarm messages. See Solution 1 below.
Heroic Gold (50 ~ 75 points): You are an intermediate user of monitoring alarms. You or your team are already alarm-aware and keep track of real-time application quality in your daily work. You can set up alarms with some finer-grained rules. Go to Solution 2.
King of Glory (75 ~ 100 points): You are already an expert at monitoring alarms, and with a little guidance you can become a “super ace” in monitoring and alerting.
Based on your quiz score, you can tell how simple or sophisticated your alarm settings need to be. The solutions below are ordered from easy to difficult. If you want to learn the most comprehensive alarm settings, go to Solution 3.
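The scoring can be written down as a small Python sketch. The function names are made up for illustration. Because the published ranges share the endpoints 50 and 75, the sketch assumes a boundary score belongs to the lower level, and it assumes that King of Glory corresponds to the third solution.

```python
POINTS = {"A": 5, "B": 10, "C": 15, "D": 20}


def quiz_score(answers):
    """Add up the points for five answers, each one of "A" to "D"."""
    if len(answers) != 5:
        raise ValueError("the quiz has exactly 5 questions")
    return sum(POINTS[a.upper()] for a in answers)


def alarm_level(score):
    """Map a total score (25 to 100) to a level name and solution number.

    Assumption: the ranges overlap at 50 and 75, so a boundary score is
    assigned to the lower level here.
    """
    if not 25 <= score <= 100:
        raise ValueError("score must be between 25 and 100")
    if score <= 50:
        return "Warm Blood Bronze", 1
    if score <= 75:
        return "Heroic Gold", 2
    return "King of Glory", 3


score = quiz_score(["C", "D", "B", "C", "B"])
print(score, alarm_level(score))
```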
Solution 1: Simple – overall application quality monitoring
As a basic alarm setup, you only need to consider two questions:
A. When should I receive an alarm?
B. How can I receive application alarm messages?
To answer the first question, consider the simplest case: you get an alert whenever there is any error, so simply set the error count condition to >0. If you find that too many alarms interrupt you, you can raise the error count threshold to a value (xx) that suits your application.
To answer the second question, you need a channel that can receive messages; the simplest is an email inbox.
With that, a simple monitoring alarm plan is set up.
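As a rough sketch of this simple plan, the Python below builds an alert email only when the error count exceeds the threshold. The function name, the addresses, and the message wording are placeholders invented for the example; in U-APM itself this is configured in the product rather than coded.

```python
from email.message import EmailMessage


def build_alert_email(app_name, error_count, threshold=0,
                      sender="alerts@example.com",
                      recipient="dev@example.com"):
    """Return an alert email if error_count exceeds threshold, else None."""
    if error_count <= threshold:
        return None
    msg = EmailMessage()
    msg["Subject"] = f"[{app_name}] {error_count} errors (threshold: > {threshold})"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(
        f"{app_name} reported {error_count} errors, "
        f"which is above the configured threshold of {threshold}."
    )
    return msg


alert = build_alert_email("DemoApp", error_count=3)
if alert is not None:
    # Sending is left out on purpose; with a reachable SMTP server it
    # could be done with smtplib.SMTP(host).send_message(alert).
    print(alert["Subject"])
```

Raising `threshold` from 0 to a value that suits the application is the code equivalent of turning down a noisy alarm.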
Solution 2: Advanced – fine-grained application quality monitoring
You can set different alarms for a single application by metric type or by version. For example, suppose the new version should trigger an alarm as soon as more than 10 users are affected, while the overall error rate of the old version should not rise by more than 5% compared with last week. We can then set the rules as follows:
A. Alarm rules of the new version:
B. Alarm rules of earlier versions:
In this solution, we use the threshold type and the contrast type of alarm trigger condition, respectively. The two rule types are defined as follows.
Threshold rules
You can select a metric (number of errors, error rate, number of affected users, or percentage of affected users) and trigger an alarm when it is greater than a given value or percentage
Contrast rules
You can select a metric (number of errors, error rate, number of affected users, or percentage of affected users) and specify a percentage increase relative to a historical time range. The calculation is: (value in the past hour - value in the historical hour) / value in the historical hour. If the result is greater than or equal to the selected percentage, an alarm is generated
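Both rule types reduce to a one-line comparison, as the following Python sketch shows. The function names are illustrative, and the handling of a zero historical value is an assumption, since the definition above does not cover that case.

```python
def threshold_triggered(value, limit):
    """Threshold rule: alarm when the metric is greater than the limit."""
    return value > limit


def contrast_triggered(current, historical, min_increase):
    """Contrast rule: alarm when the relative increase over the historical
    value is greater than or equal to min_increase (0.05 means 5%).
    """
    if historical == 0:
        # Assumption: the rule does not define this case; this sketch
        # alarms whenever errors appear where there were none before.
        return current > 0
    increase = (current - historical) / historical
    return increase >= min_increase


# New version: alarm when more than 10 users are affected.
print(threshold_triggered(value=12, limit=10))
# Old version: alarm when the error rate is up by at least 5% on last week.
print(contrast_triggered(current=0.024, historical=0.020, min_increase=0.05))
```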
Solution 3: King – combined metric monitoring
You are already familiar with setting up and monitoring alarms, so with the following hints you should be able to build an alarm plan flexibly around your daily work needs
A. Flexibly set the alarm validity time.
You can add an alarm validity period, for example from 9:00 to 19:00 Monday to Friday and from 12:00 to 20:00 on weekends. You can set the period flexibly around your working hours so that unwanted alarm messages do not disturb you
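A validity period like this can be checked with Python's standard `datetime` module. The sketch below encodes the example schedule (09:00 to 19:00 on weekdays, 12:00 to 20:00 on weekends); the names `ACTIVE_HOURS` and `alarm_active` are made up for illustration, and treating the end time as exclusive is an assumption.

```python
from datetime import datetime, time

# weekday() numbering: Monday is 0, Sunday is 6.
ACTIVE_HOURS = {
    **{day: (time(9, 0), time(19, 0)) for day in range(0, 5)},  # Mon-Fri
    **{day: (time(12, 0), time(20, 0)) for day in (5, 6)},      # Sat-Sun
}


def alarm_active(moment, schedule=ACTIVE_HOURS):
    """Return True if `moment` falls inside that day's validity period."""
    window = schedule.get(moment.weekday())
    if window is None:
        return False  # no period configured for this day
    start, end = window
    return start <= moment.time() < end  # assumption: end is exclusive


# 8 January 2024 was a Monday; 6 January 2024 was a Saturday.
print(alarm_active(datetime(2024, 1, 8, 10, 30)))
print(alarm_active(datetime(2024, 1, 6, 10, 30)))
```

An alarm raised outside the period could be dropped or held until the next period starts; which of the two is appropriate depends on the team.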
B. Alarms for an error type or for a single error
You can choose the types of errors you want to focus on
Or you can set an alarm for one specific error
C. Trigger conditions of combined alarms
You can build alarm trigger conditions from multiple metrics, each with a threshold or contrast rule, and combine them by intersection (AND) and union (OR)
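One way to picture combined conditions is as small predicate functions joined by AND and OR. The Python sketch below is an illustration only: the helper names and the `_prev` key convention for historical values are invented, and the grouping "errors AND (error rate OR user growth)" is just one possible reading of the example in option D of Q4.

```python
def greater_than(metric, limit):
    """Threshold condition: metrics[metric] is greater than limit."""
    return lambda metrics: metrics[metric] > limit


def increased_by(metric, min_ratio):
    """Contrast condition against the historical value stored under
    metric + "_prev" (a hypothetical naming convention)."""
    def check(metrics):
        prev = metrics[metric + "_prev"]
        if prev == 0:
            return metrics[metric] > 0  # assumption for a zero baseline
        return (metrics[metric] - prev) / prev >= min_ratio
    return check


def all_of(*conditions):
    """Intersection: every condition must hold."""
    return lambda metrics: all(c(metrics) for c in conditions)


def any_of(*conditions):
    """Union: at least one condition must hold."""
    return lambda metrics: any(c(metrics) for c in conditions)


rule = all_of(
    greater_than("errors", 100),
    any_of(
        greater_than("error_rate", 0.01),
        increased_by("affected_users", 0.01),
    ),
)

snapshot = {"errors": 150, "error_rate": 0.004,
            "affected_users": 60, "affected_users_prev": 50}
print(rule(snapshot))
```

Because `all_of` and `any_of` return conditions themselves, they can be nested to any depth.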
D. Multiple channels for receiving alarms
If you also have requirements for the channels through which alarms reach you, consider sending them to a group in your company’s office chat software, so that you and the colleagues in your group can troubleshoot application problems together.
With this solution, you can use U-APM to configure all of these alarm monitoring functions and build an alarm plan in 2 minutes.