This is the 25th day of my participation in the August Genwen Challenge.
The process of data analysis
1. Determining the objective
2. Data acquisition
3. Data cleaning
4. Data collation
5. Descriptive analysis
6. Insights and conclusions
7. Writing reports
Determining the objective
Before anything else, ask: what is the purpose of this data analysis? What problem is it meant to solve, and to what end?
There are two general analytical purposes:
1. Descriptive analysis: describe the existing situation
2. Predictive analysis: predict the future based on the current situation
For example:
For a stock, analyzing its movement, up and down, is descriptive analysis.
Given these ups and downs, backtesting at what point to buy and at what point to sell is predictive analysis.
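The stock example can be sketched in a few lines of Python. This is only an illustration: the prices are made up, and the trading rule (buy after a down day, sell after an up day) is an assumption chosen for simplicity, not a recommended strategy.

```python
# Hypothetical closing prices for one stock; illustrative values only.
closes = [10.0, 10.5, 10.2, 10.8, 11.0, 10.6, 10.9]


def describe_moves(prices):
    """Descriptive analysis: label each day's move against the previous day."""
    moves = []
    for prev, curr in zip(prices, prices[1:]):
        if curr > prev:
            moves.append("up")
        elif curr < prev:
            moves.append("down")
        else:
            moves.append("flat")
    return moves


def backtest(prices, start_cash=100.0):
    """Backtest one assumed rule: buy after a down day, sell after an up day."""
    cash, shares = start_cash, 0.0
    for prev, curr in zip(prices, prices[1:]):
        if curr < prev and shares == 0:
            # Price fell and we hold nothing: buy with all available cash.
            shares, cash = cash / curr, 0.0
        elif curr > prev and shares > 0:
            # Price rose and we hold shares: sell everything.
            cash, shares = shares * curr, 0.0
    # Value any shares still held at the last closing price.
    return cash + shares * prices[-1]


moves = describe_moves(closes)
final_value = backtest(closes)
print(moves)
print(round(final_value, 2))
```

The first function only describes what already happened. The second replays a rule over the same history to judge whether it would have been worth following.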
Data acquisition
Once the objective is set, you need to know what data you want to get. Data acquisition is divided into two parts:
1. Field design
2. Data extraction
For example:
Suppose you need to analyze the sales data of a sales team. The basic indicators include average sales, total sales, and the increase or decrease in sales, but these indicators are not stored in the database. You therefore need to extract the existing fields from the database and compute the indicators from them. This is field design.
Exporting data from sales software, or querying it from a database with SQL, is data extraction.
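Here is a minimal sketch of both parts using Python's built-in sqlite3 module. The table name, columns, and rows are hypothetical. A real sales database will look different, and "change" is assumed here to mean the last month minus the first month.

```python
import sqlite3

# Hypothetical schema and rows standing in for a real sales database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (rep TEXT, month TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Ann", "2021-07", 1200.0),
        ("Ann", "2021-08", 1500.0),
        ("Bob", "2021-07", 900.0),
        ("Bob", "2021-08", 800.0),
    ],
)

# Data extraction: pull the existing fields out with SQL.
rows = conn.execute(
    "SELECT rep, month, amount FROM sales ORDER BY rep, month"
).fetchall()
conn.close()

# Field design: derive indicators that are not stored in the database.
amounts_by_rep = {}
for rep, month, amount in rows:
    amounts_by_rep.setdefault(rep, []).append(amount)

summary = {
    rep: {
        "total": sum(amounts),
        "average": sum(amounts) / len(amounts),
        "change": amounts[-1] - amounts[0],  # last month minus first month
    }
    for rep, amounts in amounts_by_rep.items()
}
print(summary)
```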
Data cleaning
After data acquisition we have the data we need, but it cannot be used directly. It needs further processing, which is data cleaning.
Data cleaning mainly deals with blank values, invalid values, duplicate values, and other anomalies.
The identification and handling of outliers was covered in previous articles on data metrics, so you can look back at them if you haven't read them yet.
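As a rough sketch, the three cases can be handled in one pass. The records and the valid age range below are assumptions for illustration, and dropping bad rows is only one possible policy (filling in a default value is another).

```python
# Hypothetical raw records; None marks a blank value.
raw = [
    {"id": 1, "age": 34},
    {"id": 2, "age": None},  # blank value
    {"id": 3, "age": -5},    # invalid value: an age cannot be negative
    {"id": 1, "age": 34},    # duplicate of the first record
    {"id": 4, "age": 29},
]


def clean(records, min_age=0, max_age=120):
    """Drop blank, invalid, and duplicate records, keeping the first copy."""
    seen = set()
    cleaned = []
    for rec in records:
        age = rec["age"]
        if age is None:
            continue  # blank value
        if not min_age <= age <= max_age:
            continue  # invalid value outside the assumed range
        key = (rec["id"], age)
        if key in seen:
            continue  # duplicate value
        seen.add(key)
        cleaned.append(rec)
    return cleaned


cleaned = clean(raw)
print(cleaned)
```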
Data collation
Cleaned data still cannot be used directly. It first needs collation, which is mainly about formatting the data.
For example:
Date processing: converting the dates in the data into a uniform format
Formatting rows and columns
Basic calculations, such as average, total, mode, etc.
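A small sketch of date processing and the basic calculations, using only the standard library. The list of input date formats is an assumption. In practice it depends on what your sources actually produce.

```python
from datetime import datetime
from statistics import mean, mode

# Assumed input formats; extend this tuple to match your own data sources.
KNOWN_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%d.%m.%Y")


def normalize_date(text):
    """Convert a date string in any known format to YYYY-MM-DD."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue  # not this format, try the next one
    raise ValueError(f"unrecognized date format: {text!r}")


dates = ["2021-08-25", "2021/08/26", "27.08.2021"]
uniform = [normalize_date(d) for d in dates]

# Basic calculations on a hypothetical numeric column.
values = [3, 5, 5, 7]
stats = {"average": mean(values), "total": sum(values), "mode": mode(values)}

print(uniform)
print(stats)
```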
Descriptive analysis
After completing the data manipulation above, you can begin the descriptive analysis.
Descriptive analysis is divided into two parts:
1. Data description: describe the basic information of the data, such as the total number of records, time span, data source, etc.
2. Indicator statistics: analyze the indicators that matter for the actual business. For website traffic, for example, these include PV, IP, retention, bounce rate, conversion rate, and so on
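To make the two parts concrete, here is a sketch over a made-up visit log. The log layout is hypothetical, and the bounce rate is assumed to mean the share of visits that viewed only one page. Your analytics tool may define it differently.

```python
from datetime import date

# Hypothetical visit log: (day, visitor IP, pages viewed, converted?)
visits = [
    (date(2021, 8, 1), "10.0.0.1", 1, False),
    (date(2021, 8, 1), "10.0.0.2", 4, True),
    (date(2021, 8, 2), "10.0.0.1", 3, False),
    (date(2021, 8, 3), "10.0.0.3", 1, False),
]

# Data description: how much data there is and what period it covers.
description = {
    "records": len(visits),
    "first_day": min(v[0] for v in visits),
    "last_day": max(v[0] for v in visits),
}

# Indicator statistics for website traffic.
pv = sum(v[2] for v in visits)                              # total page views
unique_ips = len({v[1] for v in visits})                    # distinct visitor IPs
bounce_rate = sum(v[2] == 1 for v in visits) / len(visits)  # one-page visits
conversion_rate = sum(v[3] for v in visits) / len(visits)   # converted visits

print(description)
print(pv, unique_ips, bounce_rate, conversion_rate)
```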
There are four main description scenarios for indicator statistics:
1) Change: increase or decrease over time
2) Distribution: performance at different levels, such as regional distribution, gender distribution, and population distribution
3) Comparison: comparison between data items
4) Forecast: predict future changes based on the current increase or decrease
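The four scenarios can each be reduced to a one-line calculation. The sketch below uses made-up monthly sales and order regions, and the forecast is a deliberately naive one that assumes the average monthly change simply repeats.

```python
from collections import Counter

# Hypothetical data; illustrative values only.
monthly = {"2021-06": 100, "2021-07": 120, "2021-08": 150}
order_regions = ["north", "south", "north", "east", "north"]

months = sorted(monthly)

# 1) Change: month-over-month difference.
change = {
    curr: monthly[curr] - monthly[prev]
    for prev, curr in zip(months, months[1:])
}

# 2) Distribution: share of orders per region.
counts = Counter(order_regions)
distribution = {region: n / len(order_regions) for region, n in counts.items()}

# 3) Comparison: latest month relative to the first month.
comparison = monthly[months[-1]] / monthly[months[0]]

# 4) Forecast: naive extrapolation that repeats the average monthly change.
avg_change = sum(change.values()) / len(change)
forecast_next = monthly[months[-1]] + avg_change

print(change, distribution, comparison, forecast_next)
```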
Insights and conclusions
This part draws on your data analysis skills and your understanding of the business you are responsible for. It is the core of the data report and best reflects your data analysis ability.
Writing reports
After the analysis, you need to synthesize what you found into a data analysis report.
The data report mainly contains the following:
1. Background: describe the business context of the problem
2. Purpose of the report: what problem do you want to solve
3. Basic information about the data: this mainly shows how reliable the data is, including whether the data source is reliable, the data dimensions, data integrity, and so on
4. Visual charts: make the data easier to understand for the people reading the report
5. Strategy selection: state your conclusions and propose solutions
So that’s the whole process of data analysis.