In data analysis, percentages are an essential way to compare figures effectively, especially when the data sets differ greatly in sample size or total value. With percentages, we can quickly and accurately see how totals change across a dimension (such as time range, geographic region, or product line).

In this blog post, we will walk through several examples of calculating percentages in Kibana with common visualizations such as pie charts, metrics, data tables, and time series, using TSVB (Time Series Visual Builder) where it fits.

In this post, we will use the flights and ecommerce sample datasets provided in Kibana. Using this data, you can learn how to answer three questions:

  • What percentage of flights are on time?
  • What is the ratio of each delay type over time?
  • How much did total sales change week over week?

Before you begin, you need to install the flights and ecommerce sample data. You can either install the sample data on your own cluster (version 6.5 or later) or get a free 14-day trial of our Elasticsearch Service.

What percentage of flights are on time?

Your customer, an airline, wants to highlight a number on their dashboard to show how they are performing against their on-time target. Each document in the kibana_sample_data_flights sample index represents a flight, so to calculate a percentage, you divide the number of documents representing “on-time flights” by the total number of flights.

To apply this formula, use the Metric visualization in TSVB, because it includes the Filter Ratio calculation. Filter Ratio computes the same metric on two sets of documents, divides one result by the other, and returns a numeric value. The only requirement for using TSVB is that the index has a time field, which the kibana_sample_data_flights index does.

In version 7.4 or later, to do this with TSVB, first select the visualization type and data, and then configure the aggregation to display the percentage described above.

To select the visualization type and data:

Using TSVB

1) Go to TSVB and select Metric:

 

2) Select the Panel options tab:

3) Set the Data timerange mode to Entire timerange; do not leave it set to use only the last time interval. Note: This option is available only in version 7.4 and later. In earlier versions, set the date interval to the maximum instead.

4) Enter kibana_sample_data_flights as the index and set the time field to timestamp.

Now that you have selected the index and time range, you can configure the data to be displayed.

1) Go back to the Data tab and use Filter Ratio to calculate the percentage of a particular value (such as FlightDelayType: “No Delay”).

  • Filter Ratio selects both sets of documents using Query String syntax.

2) Go to the Options tab and select the Percent formatter.
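As a rough illustration of what Filter Ratio computes, here is a minimal Python sketch. The `flights` list and the `filter_ratio` helper are hypothetical stand-ins for the index and the TSVB aggregation, not part of Kibana; the two predicates play the role of the numerator and denominator query strings.

```python
# Hypothetical flight documents; only the field used by the filters is shown.
flights = [
    {"FlightDelayType": "No Delay"},
    {"FlightDelayType": "Weather Delay"},
    {"FlightDelayType": "No Delay"},
    {"FlightDelayType": "Late Aircraft Delay"},
]


def filter_ratio(docs, numerator, denominator=lambda doc: True):
    """Count of docs matching `numerator` divided by count matching `denominator`."""
    matched = sum(1 for doc in docs if numerator(doc))
    total = sum(1 for doc in docs if denominator(doc))
    # Avoid dividing by zero when the denominator filter matches nothing.
    return matched / total if total else 0.0


on_time = filter_ratio(flights, lambda doc: doc["FlightDelayType"] == "No Delay")
print(f"{on_time:.0%}")  # prints 50%
```

The default denominator matches every document, which mirrors dividing the on-time flights by the total number of flights.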

If you want to compare multiple values rather than just one, you can use two other visualizations in Kibana. When a bucket aggregation selects the values to compare, a pie chart or data table can convert each value into a percentage.

Using pie charts

Click Options to configure:

Using data tables

Click Options to customize:

 

These visualizations apply a Terms bucket aggregation to the FlightDelayType field, convert the count in each bucket to a percentage of the total, and display it. In the flights sample data, the FlightDelayType field has only 6 unique values, so the percentages are accurate when the Terms size is set to 6 or more. If the data contains more unique values than that, you need to enable the “Other” bucket so that all data is covered.
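To see why the “Other” bucket matters, here is a small Python sketch of a terms-style breakdown. The `terms_percentages` function and the sample values are hypothetical and only approximate the behavior described above; the idea is that every document must land in some bucket for the shares to add up to 100%.

```python
from collections import Counter


def terms_percentages(values, size):
    """Share of the total for the `size` most common values, plus an "Other" bucket."""
    counts = Counter(values)
    total = sum(counts.values())
    top = counts.most_common(size)
    shares = {term: count / total for term, count in top}
    # Everything that did not make it into the top buckets goes into "Other".
    other = total - sum(count for _, count in top)
    if other:
        shares["Other"] = other / total
    return shares


delay_types = ["No Delay"] * 6 + ["Weather Delay"] * 3 + ["Security Delay"]
print(terms_percentages(delay_types, size=2))
# {'No Delay': 0.6, 'Weather Delay': 0.3, 'Other': 0.1}
```

With `size=3` every value gets its own bucket and no “Other” entry appears, which corresponds to the case where the Terms size covers all unique values.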

 

What is the ratio of each delay type over time?

The airline wants to show two visualizations side by side on the dashboard: the summary above and the same comparison broken down over time. This lets them drill down to a specific time range and look at both summary and breakdown data.

Since flight documents have a time field, TSVB is the most powerful tool for creating this visualization. With the same settings as above, you can use Filter Ratio to calculate the number of “No Delay” flights divided by the total number of flights for each interval:

To compare more than one time series, you can either create multiple Filter Ratio series or use an aggregation to select groups. TSVB has a mode for displaying multiple series stacked so that they total 100%.

To configure stacked percentage visualization in TSVB, you first need to select the right data and then configure your aggregation. To select the correct data:

  1. Go to TSVB, select Panel options, and enter kibana_sample_data_flights as the index.
  2. Set Time field to timestamp.

To configure your aggregation:

  1. Return to the Data tab.
  2. Under Group by, select Terms for FlightDelayType.
  3. Go to the Options tab and set Stacked to Percent.
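The stacked percentage mode can be pictured as follows: within each time bucket, the count for each delay type is divided by the bucket's total, so the series always add up to 100%. The Python sketch below uses hypothetical `(day, FlightDelayType)` pairs and a hypothetical `stacked_percent` helper to illustrate the idea; it is not how TSVB is implemented.

```python
from collections import Counter, defaultdict

# Hypothetical (day, FlightDelayType) pairs standing in for flight documents.
flights = [
    ("2019-10-01", "No Delay"),
    ("2019-10-01", "No Delay"),
    ("2019-10-01", "Weather Delay"),
    ("2019-10-01", "Security Delay"),
    ("2019-10-02", "No Delay"),
    ("2019-10-02", "Weather Delay"),
]


def stacked_percent(rows):
    """For each time bucket, convert per-term counts into shares that sum to 1."""
    buckets = defaultdict(Counter)
    for day, delay_type in rows:
        buckets[day][delay_type] += 1
    return {
        day: {term: count / sum(counts.values()) for term, count in counts.items()}
        for day, counts in buckets.items()
    }


for day, shares in stacked_percent(flights).items():
    print(day, {term: f"{share:.0%}" for term, share in shares.items()})
```

Because each bucket is normalized on its own, a day with few flights and a day with many flights are directly comparable.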

 

How much did total sales change week over week?

You have an e-commerce customer that stores each transaction in the kibana_sample_data_ecommerce index. They want to show the week-over-week percentage change in sales, a metric that is critical to their business. Because the problem has a time element, you will use TSVB to create this visualization. You can also use Timelion to create week-over-week charts, but I won’t cover that here.

As you saw above, TSVB uses the same aggregations to create both metric and time series visualizations. Unlike the previous example, where you set the Data timerange mode to Entire timerange, in this visualization you only show the data for the most recent day compared with the data for the same day last week.

To create this visualization, you need to calculate the total sales for each day and then compare each total with the total from seven days earlier. The easiest way to make this comparison is the Serial Difference aggregation, which takes each value and subtracts from it the value from a fixed number of buckets earlier.
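The calculation just described can be sketched in a few lines of Python. The `serial_difference` function and the `daily_sales` figures are hypothetical and only illustrate the arithmetic for daily totals compared with seven days earlier; the first seven positions have no earlier value to subtract, so they stay empty.

```python
def serial_difference(values, lag):
    """Subtract the value `lag` buckets earlier; the first `lag` buckets have no result."""
    return [
        None if i < lag else value - values[i - lag]
        for i, value in enumerate(values)
    ]


# Hypothetical daily sales totals for two weeks.
daily_sales = [100, 120, 90, 110, 130, 150, 140,
               110, 120, 99, 121, 117, 165, 154]

print(serial_difference(daily_sales, lag=7))
# [None, None, None, None, None, None, None, 10, 0, 9, 11, -13, 15, 14]
```

A positive result means the day sold more than the same weekday one week before, and a negative result means it sold less.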

Set up TSVB as in the previous example, making sure to select the kibana_sample_data_ecommerce index and to set the Time field to order_date. Also, set the interval to exactly 1d. The default interval in TSVB varies with the overall time range, but this calculation requires a fixed interval.

Go back to the Data tab and select the Sum aggregation for the field taxful_total_price.

 

Then, add a Serial Difference aggregation on the Sum of taxful_total_price, remembering to set the lag to 7 buckets because you have set the interval to 1 day. Serial Difference subtracts the value from 7 buckets earlier from each daily value to give the week-over-week difference in total sales. Because of the way the Serial Difference metric is calculated, it always leaves the first 7 days blank, so you should choose a larger time range:

Click the + button above:

Finally, you need to use math to convert the week-over-week difference into a percentage. Add the current total and the difference, divide by the current total, subtract 1, and multiply by 100:

Click the + button above:

The Painless script used is (((params.total + params.diff)/params.total) -1)*100.
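For readers who want to check the arithmetic outside Kibana, here is the same expression as a Python sketch. The function name `percent_change` and the sample numbers are hypothetical; `total` and `diff` correspond to `params.total` and `params.diff` in the script.

```python
def percent_change(total, diff):
    """Python equivalent of (((params.total + params.diff)/params.total) -1)*100."""
    return (((total + diff) / total) - 1) * 100


# Hypothetical values: today's total is 110 and it is 10 higher than a week ago.
print(round(percent_change(total=110, diff=10), 2))  # 9.09
```

Algebraically the expression reduces to `diff / total * 100`, so it expresses the difference as a share of the current day's total. Dividing by the total from seven days earlier (`total - diff`) would be a different convention that the post does not use.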

The resulting visualization:

For clarity, you can also add a new series with a static value of 0 to mark the baseline where the change is 0.

You can now toggle between the Time Series and Metric tabs to determine which presentation is best for your use case.

 

Conclusion

These are just a few examples of how to calculate and use percentages effectively in Kibana. If you want to explore more calculation options for percentages beyond what we’ve covered in this blog post, you can try Canvas in Kibana, which lets you control both how the data is queried and how it is displayed.