• Mistakes, we’ve drawn a few
  • By Sarah Leo
  • The Nuggets Translation Project
  • Permanent link to this article: github.com/xitu/gold-m…
  • Translator: ccJia
  • Proofreader: Charlo-o, CYZ980908

Learning from our errors in data visualization

At The Economist, we take data visualization very seriously. Every week we publish around 40 charts across print, our website and our apps. With each one, we try to visualize the numbers accurately and in a way that best supports the story. But sometimes we get it wrong. If we learn from our mistakes, we can do better in the future, and other people may be able to learn from them too.

After digging deep into our archive, I found several instructive examples. I grouped them into three categories: charts that are (1) misleading, (2) confusing, or (3) failing to make a point. For each, I suggest an improved version that takes up a similar amount of space, which is an important consideration when drawing charts for print.

(A brief disclaimer: most of the “original” charts were published before our chart redesign. The improved charts are drawn to conform to our new specification. The data are the same.)


Misleading charts

Let’s start with the worst crime in data visualization: presenting data in a misleading way. We never do this on purpose! But it does happen every now and then. Let’s look at three examples from our archive.

Mistake: Truncating the scale

This chart shows the average number of Facebook likes on posts by pages of the political left. The point of the chart is to show the disparity between Mr Corbyn’s posts and the others.

The original chart not only understates the number of likes on Mr Corbyn’s posts but also exaggerates the number on the others’ posts. In the redesigned version, we show Mr Corbyn’s bar in its entirety, and all the other bars remain visible. (Regular readers of this blog will have seen another example of this mistake.)

Another odd thing is the choice of color. To mimic Labour’s color scheme, we used three shades of orange/red to distinguish between Jeremy Corbyn, other MPs, and parties/groups. We didn’t explain that. The logic behind the colors may be obvious to many readers, but it makes little sense to those less familiar with British politics.

Mistake: Forcing a relationship through the choice of scales

The chart above accompanied a story on the decline in dogs’ weight. At first glance, dogs’ weight and neck size appear to be perfectly correlated. But is that true? Only to some extent.

In the original chart, both scales fall by three units (from 21 to 18 on the left, from 45 to 42 on the right). In percentage terms, the left scale decreases by 14% and the right by 7%. In the redesigned chart, I kept both scales but adjusted their ranges to reflect a comparable proportional change.
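
The arithmetic behind the two scales can be checked with a short sketch. The function names `percent_drop` and `matched_bottom` are illustrative. Holding the left scale fixed and moving the bottom of the right-hand scale is an assumption about one way to get comparable ranges, not a description of how the redesigned chart was actually built.

```python
def percent_drop(top, bottom):
    """Percentage by which a scale falls from its top value to its bottom value."""
    return (top - bottom) / top * 100


def matched_bottom(top, target_percent):
    """Bottom value that makes a scale starting at `top` fall by `target_percent`."""
    return top * (1 - target_percent / 100)


left = percent_drop(21, 18)    # left-hand scale in the original chart
right = percent_drop(45, 42)   # right-hand scale in the original chart
print(f"left scale falls by {left:.0f}%, right scale by {right:.0f}%")

# Hypothetical adjustment: keep the left scale as it is and give the
# right-hand scale the same proportional drop.
new_right_bottom = matched_bottom(45, left)
print(f"right scale would run from 45 down to about {new_right_bottom:.1f}")
```

Equal steps in units are not equal steps in proportion, which is why two scales that each span three units can still exaggerate how closely the series move together.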

Given the playful subject, this mistake may seem minor. After all, the message of the chart is the same in both versions. But the takeaway matters: if two series follow each other too closely, it is a good idea to take a closer look at the scales.

Mistake: Choosing the wrong visualization method

We published this polling chart in Espresso, our daily news app. It uses a line chart to show attitudes to the outcome of the EU referendum. Judging by the chart, respondents’ views of the referendum result look rather erratic, rising and falling over time.

Instead of plotting the individual polls as points with a smoothed curve to show the trend, we connected the actual values of each poll. This happened mainly because our in-house charting tool does not draw smoothed lines. Until fairly recently, we were also less comfortable with statistical software (such as R) that offers more sophisticated visualization tools. Today, all of us are able to plot a polling chart like the redesigned one above.
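
A minimal sketch of the idea follows. It assumes a simple centered moving average as the smoother (the article does not say which smoothing method the redesigned chart uses), and the poll values are made up for illustration.

```python
def moving_average(values, window=3):
    """Centered moving average; the window shrinks at the edges of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical poll results (% of respondents), not the data behind the chart.
polls = [44, 46, 43, 45, 42, 44, 41]
trend = moving_average(polls, window=3)

# Plot `polls` as individual points and `trend` as the line,
# instead of joining the raw poll values directly.
for raw, smooth in zip(polls, trend):
    print(f"poll: {raw}  trend: {smooth:.1f}")
```

The raw values stay visible as points, so nothing is hidden, while the line carries only the trend rather than the poll-to-poll noise.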

Another thing to note in this chart is how the scale was truncated. The original chart spreads the data out more widely than it should. In the redesigned version, I left more space between the start of the scale and the smallest data point. Francis Gagnon’s blog offers a rule of thumb for this: leave at least 33% of the plot area free under a line chart that does not start at zero.
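
One way to read that rule of thumb as arithmetic is sketched below. The function name `axis_floor` and the interpretation are assumptions for illustration, not Francis Gagnon’s exact formulation: the gap under the lowest point takes up the stated share of the plot height, and the top of the axis sits at the largest data point. The poll values are made up.

```python
def axis_floor(values, free_share=0.33):
    """Axis minimum that leaves `free_share` of the plot height empty below the lowest point.

    Assumes the top of the axis sits at the largest value.
    """
    low, high = min(values), max(values)
    data_span = high - low
    # The data fills (1 - free_share) of the axis, so scale up to get the full height.
    axis_height = data_span / (1 - free_share)
    return high - axis_height


# Hypothetical poll values (% of respondents).
polls = [41, 42, 43, 44, 45, 46]
y_min = axis_floor(polls)
print(f"start the y-axis at about {y_min:.1f} rather than at {min(polls)}")
```

Starting the axis exactly at the smallest value would stretch the data over the full height of the plot, which is the effect the paragraph above warns against.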


Confusing charts

A chart that is difficult to read is not as harmful as a misleading one, but it is still a sign of a visualization done badly.

Mistake: Stretching the mind a little too far

At The Economist, we are encouraged to produce “mind-stretching” journalism. But sometimes we go too far. The chart above shows the US trade deficit in goods and the number of people employed in manufacturing.

This chart is incredibly hard to read. It has two main problems. First, the values of one series (the trade deficit) are all negative, while those of the other (manufacturing employment) are all positive. It is hard to combine such different series in one chart without flattening one of them. The seemingly easy fix leads to the second problem: the two series do not share a common baseline. The baseline for the trade deficit is at the top of the chart, marked by a red line drawn halfway across the chart, while the baseline for the right-hand scale is at the bottom.

The redesigned chart shows that there was no need to combine the two series in one chart. The relationship between the trade deficit and manufacturing employment remains clear, and the chart takes up only a little extra space.

Mistake: Confusing use of colors

This chart compares government spending on pension benefits with the share of people aged 65 and over in selected countries, with a particular focus on Brazil. To keep the chart small, only a selection of countries is labeled, and those are highlighted in electric blue. The OECD average is highlighted in light blue.

The visualizer (me!) ignored the fact that a change in color usually implies a change in category. At first glance, that seems to be the case here too: the electric-blue points appear to belong to a different group from the dark-blue ones. But they do not. The only thing they have in common is that they are labeled.

In the redesigned version, the color is the same for all countries. I lowered the opacity of the unlabeled countries to make the labeled ones stand out. Typography does the rest: Brazil, the focus country, is set in bold and the OECD in italics.


Charts that fail to make a point

This last category of mistakes is less obvious. Charts like these are neither misleading nor confusing. They simply fail to justify their existence, often because they were visualized in the wrong way or because we tried to fit too much information into too small a space.

Mistake: Too much detail

How colorful! We published this chart in a piece about Germany’s budget surplus. It shows the budget balance and the current-account balance of ten euro-area countries. With so many colors, some of them hard to tell apart or even to see because the values are so small, the chart’s message is impossible to make out. It almost makes you want to glaze over and move on. More importantly, because we did not plot all the euro-area countries, stacking the data makes no sense.

I went back to the article to see whether the chart could be simplified. The article singles out Germany, Greece, the Netherlands and Spain, among a few others. In the redesigned version, I decided to highlight only these. To solve the problem of stacking only a selection of countries, I added a category that combines all the other euro-area countries (“Others”). (The current-account balance in the redesigned chart is lower than in the original because of a data revision by Eurostat.)
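
The grouping step can be sketched in a few lines. The function name `group_others`, the placeholder country names and every number below are hypothetical. They are not Eurostat figures and not the values behind the chart.

```python
def group_others(balances, featured, label="Others"):
    """Keep the featured countries and sum everything else into one category."""
    grouped = {name: balances[name] for name in featured}
    grouped[label] = sum(
        value for name, value in balances.items() if name not in featured
    )
    return grouped


# Hypothetical balances, chosen only to illustrate the grouping.
balances = {
    "Germany": 8.0,
    "Greece": -1.0,
    "Netherlands": 3.0,
    "Spain": 1.0,
    "Country A": 0.5,
    "Country B": -0.25,
    "Country C": 0.75,
}
featured = ["Germany", "Greece", "Netherlands", "Spain"]

grouped = group_others(balances, featured)
print(grouped)
print("total:", sum(grouped.values()))
```

Because the remainder is kept as a single category, the stack still adds up to the total across all countries, which is what makes stacking meaningful again.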

Mistake: Too much data, too little space

Confined by limited space, we are often tempted to cram all the data we have into a slot that is too small. This saves space on the page, but it has consequences, as this chart from March 2017 shows. The story is about how scientific publishing is dominated by men. All the data points are equally interesting and relevant to the story. But with so much data (four research fields as well as inventors), the information is hard to take in.

After much deliberation, I decided not to redesign this chart. If I kept all the data, the chart would be far too busy to make its point concisely. In a case like this, the best approach is to cut something. For example, we could show a single measure, such as the average of women’s publications across all fields. (If you have ideas on how to visualize this in such a tight space, please let me know! I’d love to hear what you think.)


Best practice in data visualization is evolving rapidly: what works today may not work tomorrow. New techniques are emerging all the time. Have you ever made a mistake that turned out to be easy to fix? Let us know!

Sarah Leo is a visual data journalist at The Economist.
