This article is published by NetEase Cloud.
Author: Wang Wenkai
Cross-view data granularity calculation is a new feature launched by NetEase. Its advantage is that it lets you perform a calculation independently of the dimensions used by the current view. There are three types of cross-view data granularity calculation expressions: FIXED, INCLUDE, and EXCLUDE. To understand when, where, and why we need them, we must first answer two questions:
1. What is data granularity?
2. What in NetEase affects the data granularity of a view?
1. What is data granularity?
Let’s assume that we have a table with the following fields: OrderID, Customer Name, and Sales. It contains 12 rows:
Basic table
In this table, OrderID is the primary key, so the finest data granularity for this table is OrderID, because this field distinguishes each record in the table. So you can think of it this way:
The field that distinguishes each row of data from every other row defines the finest data granularity.
In this example, we average the “Sales” field. At the finest data granularity, we average over all 12 records.
I sum the “Sales” column and divide by the total number of rows: $3,059.00 divided by 12, which gives average sales of $254.92 per record.
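The arithmetic above can be sketched in a few lines of Python. The individual rows below are invented for illustration, since the original table is not reproduced here; only the totals (12 orders, $3,059.00) come from the text.

```python
# Hypothetical rows: (order_id, customer_name, sales).
# Only the row count (12) and the sales total (3,059) match the article;
# the individual values and names are made up.
orders = [
    (1, "Alice", 200), (2, "Bob", 150), (3, "Carol", 300),
    (4, "Alice", 250), (5, "Dave", 100), (6, "Eve", 400),
    (7, "Bob", 350), (8, "Frank", 209), (9, "Grace", 300),
    (10, "Alice", 150), (11, "Heidi", 450), (12, "Carol", 200),
]

# Finest granularity: one group per OrderID, so the denominator is the row count.
total_sales = sum(sales for _, _, sales in orders)
average_per_order = total_sales / len(orders)

print(f"{total_sales} / {len(orders)} = {average_per_order:.2f}")
```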
In NetEase Youshu, if you drag only the “Sales” field into the chart and select “Average” as the aggregation mode without placing any dimension, you can see that NetEase Youshu uses the finest granularity for the calculation by default.
As we can see, this is the same as placing “OrderID” on the Y-axis and then averaging “Sales”.
Next, we consider changing the aggregation granularity of “Sales” by using the “Customer Name” dimension in the raw data. For example, suppose I ask the following question in NetEase Youshu:
What is the average sales per customer?
Now let’s rearrange the data in our table by sorting by the “Customer Name” field.
Sort by customer name
Next, let’s group all our orders by the “Customer Name” field and sum the sales for each customer.
Group by customer
Finally, we sum the entire “Sales” column as before and divide by the new number of rows. In this case, the denominator is 8 customers instead of 12 records: $3,059.00 divided by 8 equals $382.38 per customer.
In other words, when “Sales” is averaged here, the granularity of the calculation changes from the original “OrderID” to “Customer Name”.
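The same change of granularity, sketched in Python. As before, the rows are hypothetical; only the counts (12 orders, 8 customers) and the $3,059.00 total are taken from the text.

```python
from collections import defaultdict

# Hypothetical rows: (order_id, customer_name, sales). Values and names are made up;
# they are chosen only so that there are 12 orders, 8 customers, and a total of 3,059.
orders = [
    (1, "Alice", 200), (2, "Bob", 150), (3, "Carol", 300),
    (4, "Alice", 250), (5, "Dave", 100), (6, "Eve", 400),
    (7, "Bob", 350), (8, "Frank", 209), (9, "Grace", 300),
    (10, "Alice", 150), (11, "Heidi", 450), (12, "Carol", 200),
]

# Group by customer: each customer's orders collapse into one row of summed sales.
sales_by_customer = defaultdict(int)
for _, customer, sales in orders:
    sales_by_customer[customer] += sales

# The numerator is unchanged; the denominator is now the number of customers.
average_per_customer = sum(sales_by_customer.values()) / len(sales_by_customer)

# 3059 / 8 = 382.375, which the article reports rounded as $382.38.
print(len(sales_by_customer), average_per_customer)
```

The total being averaged is the same in both cases; only the number of groups it is divided by changes.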
So, in the abstract:
Whenever we aggregate a measure, the result depends on how the dimensions currently in use partition the data.
2. What affects view granularity in NetEase?
A view can be easily understood as a chart.
In NetEase’s chart data panel, two areas determine the granularity of your chart:
1. The X-axis and Y-axis
2. The properties panel
1. The X and Y axes.
For example, I put “Region” on the Y-axis and “Sales” on the X-axis, and select “Average” as the aggregation for sales.
This produces a chart that is equivalent to averaging sales at the granularity of region.
That is, you sum the sales of the records in each region and divide by the number of rows in that region.
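A minimal sketch of the per-region average described above, in plain Python. The regions and sales values are hypothetical and do not come from the article's data set.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical order rows; region names and sales values are made up.
orders = [
    {"region": "East", "sales": 100},
    {"region": "East", "sales": 300},
    {"region": "West", "sales": 250},
    {"region": "West", "sales": 150},
    {"region": "West", "sales": 500},
]

# The dimension on the axis ("region") partitions the rows into groups.
sales_by_region = defaultdict(list)
for row in orders:
    sales_by_region[row["region"]].append(row["sales"])

# Within each group: sum of sales divided by the number of rows in that group.
region_average = {region: mean(values) for region, values in sales_by_region.items()}

print(region_average)
```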
2. Properties Panel
For example, you can first put “Profit” and “Sales” on the X-axis and the Y-axis respectively, resulting in the following figure. Since the two measures have not been split by any dimension, they represent the total sales and total profit across all rows, so they collapse into a single point.
When you place the “Order ID” in the “Breakdown” column of the properties panel, you add granularity to the chart, which is then broken down to show the sales and profit for each “order”.
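The effect of the breakdown field can be sketched as follows. The `points` helper, the order IDs, and the figures are all hypothetical; the sketch only illustrates how adding a dimension turns one aggregated point into one point per order.

```python
# Hypothetical order rows; IDs and values are made up for illustration.
orders = [
    {"order_id": "A-1", "sales": 200, "profit": 40},
    {"order_id": "A-2", "sales": 150, "profit": 30},
    {"order_id": "A-3", "sales": 300, "profit": -20},
]

def points(rows, breakdown=None):
    """Return {group_key: (total_sales, total_profit)}, one chart point per group."""
    totals = {}
    for row in rows:
        # With no breakdown field, every row falls into the same single group.
        key = row[breakdown] if breakdown else "all rows"
        sales, profit = totals.get(key, (0, 0))
        totals[key] = (sales + row["sales"], profit + row["profit"])
    return totals

single_point = points(orders)                     # no dimension: one point
per_order = points(orders, breakdown="order_id")  # one point per order

print(single_point)
print(per_order)
```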
Whether it’s the X-axis, the Y-axis, or the properties panel, placing a field in any of them changes the chart. Is there a way to let analysts define the computational granularity of a measure freely, independent of the granularity of the current chart view?
This is the purpose of NetEase’s cross-view granularity calculation: it lets you perform a calculation independently of the dimensions used by the current view. Normally, dragging a field onto the X-axis, the Y-axis, or the properties panel affects the entire view. Cross-view granularity calculation lets us set the level of granularity for a measure independently of the current view.
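To make the idea concrete, here is a conceptual sketch in Python. It is not NetEase's expression syntax or implementation, and it does not claim to reproduce the exact semantics of FIXED, INCLUDE, or EXCLUDE; the `total_by` helper and all data are hypothetical. It only shows a measure computed at one granularity (customer) sitting alongside a view built at another (region).

```python
from collections import defaultdict

# Hypothetical order rows; names and values are made up for illustration.
orders = [
    {"order_id": 1, "region": "East", "customer": "Alice", "sales": 200},
    {"order_id": 2, "region": "East", "customer": "Bob", "sales": 150},
    {"order_id": 3, "region": "West", "customer": "Alice", "sales": 250},
    {"order_id": 4, "region": "West", "customer": "Carol", "sales": 300},
]

def total_by(rows, dim, measure):
    """Sum `measure` within each distinct value of `dim`."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dim]] += row[measure]
    return dict(totals)

# Granularity of the view: the chart is split by region.
view_totals = total_by(orders, "region", "sales")

# Granularity chosen independently of the view: total sales per customer.
customer_totals = total_by(orders, "customer", "sales")

# Each row carries its customer-level value regardless of the view's dimensions,
# so Alice's total spans both regions.
annotated = [{**row, "customer_total": customer_totals[row["customer"]]} for row in orders]

print(view_totals)
print(customer_totals)
```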
There are three types of cross-view granularity calculation expressions: FIXED, INCLUDE, and EXCLUDE. I’ll look at these three expressions in more detail in a future article.
NetEase Youshu is an enterprise-class big data visualization and analysis platform that offers comprehensive security, powerful big data computing performance, advanced intelligent analysis, and convenient collaboration and sharing. A free trial is available.
Related reading:
3. EXCLUDE expressions
“Cross-view granularity calculation” — INCLUDE expressions
Learn more about NetEase Cloud:
The official website of NetEase Cloud is www.163yun.com/
New user package: www.163yun.com/gift
NetEase Cloud community: sq.163yun.com/