Thursday, 20 June 2019

Spurious

By Fred Shivvin

My love for graphs has a most unfortunate side-effect. I end up shouting and swearing at most news reports with line graphs, bar graphs or pie charts. Regularly. In fact, most days.

What I see are spurious graphs, and they make me angry. 

Spurious means:
1: of illegitimate birth; bastard
2: outwardly similar or corresponding to something without having its genuine qualities; false
3a: of falsified or erroneously attributed origin; forged
b: of a deceitful nature or quality

Bastard, false, deceitful! Yep, they are some of the words I shout at the TV.

A spurious graph is manipulated to make illegitimate claims with false authority and objectivity. The figures are from real data, right? The plots and lines are mathematics, so how can you argue with them? They have a seductive power to mislead and deceive - their apparent objectivity gives viewers a sense of trust. 

Spurious graphs are a boon for marketing and PR. They turn up in advertising for dubious health products, plausible but disingenuous political claims, and shoddy science interpretation. (Think of the Coalition's misleading 2019 claim that Australia's carbon emissions were actually going down.)

Graphs provide authority for the claims made in research and news reports. They are also easy to view and interpret quickly, so they fit into our fast-paced news cycle.

six upward trending lines on unlabeled graph
Fig 1: Important Graph.
As an example, you can see just by glancing to the left that there has been a massive increase in the number of news articles featuring a graph over the last 20 years. The upward trend is clear and the clean graph layout suggests the information is credible. 

The problem is that I made it up, and I made up the graph too. If you look more carefully, it is essentially meaningless. It has no labels, no scale, a pointless title. It's a spurious graph that actually says nothing. Except if you just glance at it, the lines imply 'Up, ever upward.'

And if you pay attention, you will see spurious graphs everywhere. 

Here are some disturbing examples that might make you laugh with how bad they are, until you remember these are real but spurious graphs produced to manipulate you and me. I also suggest things for you to look out for.

Example 1: The first graph starts at 10 instead of zero on the Y axis, and runs from 10 to 15 only, so it exaggerates the difference between the lengths of the bars. Someone has manipulated the graph to show a bigger change than there actually is. This would matter if they were claiming their funding program has made a big difference. In the right-hand graph, where the axis is corrected to run from zero to 15, it looks very different.

So first thing for any graph, check the scale on the Y axis: does it start at zero? What does it go up to? (A scale up to 10 or 100 can make a very different impression of results, for example.) Is the scale a reasonable sort of way to measure this type of content? 
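For the graph nerds, here is a rough sketch in Python of why the starting point of the Y axis matters so much. The two values are made up by me for illustration (they are not read off the example graph), and drawn_ratio is just a name I have invented. It compares how long the bars are drawn, which is what the eye actually judges.

```python
def drawn_ratio(smaller, larger, axis_start=0.0):
    """How many times longer the larger bar is drawn than the smaller one
    when the Y axis starts at axis_start."""
    return (larger - axis_start) / (smaller - axis_start)

# Invented values for illustration only
before, after = 11.0, 14.0

honest = drawn_ratio(before, after, axis_start=0.0)      # axis starts at zero
truncated = drawn_ratio(before, after, axis_start=10.0)  # axis starts at 10

print(f"Axis from zero: second bar drawn {honest:.2f} times as long")
print(f"Axis from 10:   second bar drawn {truncated:.2f} times as long")
```

With these invented numbers the real increase is about 27 per cent, but on the axis that starts at 10 the second bar is drawn four times as long as the first.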

shows upward trend in selected data which is actually downward trend in full data
Example 2: adapted from Venngage

Example 2: The top graph shows a very short time period on the X axis. In this short 'cherry picked' time period, the financial returns on this particular share appear to have done well. The graph below it shows a longer, more representative time period for the trading pattern, which shows the share is on a downward trend.

The second key thing to check (mainly for line graphs) is the time period on the X axis. Does it include enough information for the claim they are making, or has someone done some cherry picking to make things look better or worse than they are?
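Cherry picking is easy to demonstrate with a few lines of Python. This is only a sketch: the prices are invented by me, not taken from the Venngage example, and slope is a hypothetical helper that fits a plain least-squares trend line through equally spaced points.

```python
def slope(values):
    """Least-squares slope of values plotted against 0, 1, 2, ..."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Invented share prices over twelve periods (illustration only)
prices = [50, 47, 45, 41, 38, 36, 33, 34, 36, 37, 31, 28]

# A hand-picked short window that happens to be rising
window = prices[6:10]

print("Trend in the cherry-picked window:", slope(window))
print("Trend over the full period:", slope(prices))
```

The short window slopes upward while the full series slopes downward, which is exactly the trick in Example 2.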

shows misleading inference because the scale is not even
Example 3: from Noijam 

Example 3: In this graph, the uneven spaces in the points of time suggest something has been omitted. The graph implies there has been a sudden increase in the number of episodes. But why is there so little space on the right hand side? This should be spread out over more space on the X axis (just like the earlier points on the graph), and then I would imagine the trend line would be fairly similar. 

So third check: Do both axes have consistent spacing and sensible labels for the type of content? And has the spacing been manipulated to imply something?
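If you can read the values off the axis, the spacing check can be done mechanically. A minimal sketch in Python, assuming the points are drawn at equal distances on the page; the years are invented and uneven_gaps is a name I made up.

```python
def uneven_gaps(tick_values):
    """Return (is_uneven, gaps) for a list of axis values that are
    drawn at equal distances on the page."""
    gaps = [b - a for a, b in zip(tick_values, tick_values[1:])]
    return len(set(gaps)) > 1, gaps

# Invented axis labels: five-year steps, then suddenly one-year steps
years = [1990, 1995, 2000, 2005, 2010, 2011, 2012]

is_uneven, gaps = uneven_gaps(years)
print("Gaps between labels:", gaps)
print("Spacing is uneven:", is_uneven)
```

If the gaps differ but the points are drawn the same distance apart, the slope of the line on that part of the graph cannot be trusted.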

not just wrong but ludicrous, more people on welfare than with full time job
Example 4: from Data Science Central.

Example 4: The graph compares households with any person on welfare (on the left) with every individual with a full time job (on the right). It is not a sensible comparison, but it works rather well to look outrageous. To be legitimate, they should have compared households for welfare and jobs OR individuals for welfare and jobs. It doesn't start at zero either, so it implies a shockingly greater number of people on welfare than those working. This is information clearly manipulated to mislead and to stoke the outrage machine.

So final check: if the findings look shocking or outrageous, check the type of data used to make the graph. Is it sensible to compare these two things or has someone compared apples with oranges to make you angry at fruit?

In summary, these graphs are spurious because they do not represent the data faithfully or accurately. A valid graph should:
  • have an X and Y scale suitable for the type of content, starting at zero
  • include all the data and enough data over time
  • have clear labels on the axes
  • compare things of the same nature.
Accuracy is essential. But there is a bit more to avoiding being misled by spurious graphs.

Graphs are tools of communication; they tell us things about the world. So when we consider what a graph is telling us, we also need to think about how that relates to our existing knowledge of the world. Graphs communicate how things relate to each other. It's up to us to then think about why this might be, and why it might be relevant and what that knowledge allows us to do. 

Exploring how things relate to each other in our complex world involves more than numbers and objective methods of plotting on a grid. It involves us thinking carefully about the world and the sorts of relationships between things that could possibly exist.

Interpreting graphs requires us to think about the value of the information.

Tyler Vigen has created a wonderful website, Spurious Correlations, to demonstrate just how easily graphs and data can be manipulated to make it appear that two things relate to each other. The lines trend in the same pattern and so appear to show a correlation (cor meaning 'together' + relation).

He has technically accurate examples of graphs that show completely unrelated things as somehow, mysteriously, but so very definitely related to each other - see those lines! Like the per capita consumption of chicken and US crude oil imports! That makes sense. Not. Not if you think about it for three seconds.

Does the consumption of chicken cause the level of crude oil import or the other way around? Um, well, neither. It's a spurious relationship to start with. And it lacks any link to anything we know about the world. 
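You can see how cheap a high correlation is with a small Python sketch. The two series below are invented by me and stand for nothing at all (they are not Vigen's data); the only thing they share is that both drift upward. The pearson function is written out by hand here and computes the standard Pearson correlation coefficient.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Two invented, unrelated series that both happen to rise over time
series_a = [10, 12, 13, 15, 18, 20]
series_b = [200, 230, 240, 290, 330, 360]

r = pearson(series_a, series_b)
print(f"Correlation: {r:.3f}")
```

The coefficient comes out very close to 1, yet nothing connects the two lists except that I typed them both in ascending order. A high number says the lines move together; it says nothing about why.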
  
You can make your own spurious graph at his site on this page. Do try it; it's fun and not just for graph nerds. (Well, I wouldn't exactly know.)

So while graphs are appealing and powerful because they seem objective, they depend on a whole pile of decisions that are subjective. 

Graphs rest on the authors' decisions and selection of data to graph. They should start from a genuine interest in finding out about the world using all the available data, and not some limited figures or time periods to suggest some relationship you want to imply. Graphs are also constructed on assumptions about how things in the world can relate to each other, and what factors to track together to investigate relationship and correlation. To say anything meaningful, they must refer to a nomological network which precisely defines each factor on the graph. Graphs emerge from our underlying understandings of how things relate to each other in our world.

These subjective decisions are value-laden, and they are open to misunderstanding, bias and, as we have seen, deliberate manipulation. 

Vigen's website aims to remind us that interpreting graphs takes some care as well as asking the appropriate questions, firstly about graphs, but also about the world and the sorts of relationships between things that could possibly exist.

Spurious graphs are everywhere so being alert to them can be a demanding task. It's easier said than done. 

Because we trust a trending line on a graph to mean something objective, we may let our guard down when we really shouldn't. Does a graph suggest there is a relationship between a person's political leanings and their tendency to violence? Vigen's site suggests that anyone could make a graph that appears to say this. If you believe your political opponents are not very nice people, that graph would be satisfying. But it might just be spurious.

The use of spurious graphs in the media is part of keeping up sales.

News media outlets aim to attract attention, to get clicks and readers. This drives the news content. Humans are well known to pay more attention to negative than positive news. As a result, the media adopt a negative tone to maintain profit. Fittingly (again), here is a graph that shows just this. This one is real: media relies on negative content, alarming or shocking even better, including the spurious graph. (Read more at source article on humans' natural attraction to negative news.)

So, the upshot for a media worker is that if a negative graph is part of your newsroom editorial decisions, but you don't really understand it, don't worry, it will be good for clicks. If a story involves distributing a bizarrely shocking graph, let's not bother with any fact-checking that might impact profit. 

This is not new, as social psychologist Milton Rokeach wrote in 1968:

“The kinds of data … disseminated in the mass media seem designed more to entertain than to inform. … The quality of the information conveyed seems not much different from that conveyed in the sports pages or, better yet, the daily racing form.”

And as for people who deliberately manipulate graphs as propaganda*, as in some of the examples above, all I can say is protect yourself with a lot of scepticism and a little knowledge.

Spurious graphs are used and reused to entertain, to generate clicks and revenue, but also to confuse and to manipulate. All with the cloak of authority and objectivity.

The misuse of my much loved graphs gets right up my nose. Strangely, shouting at the TV does not seem to improve things.

The least we can all do is think twice about what an upward trending line really says.

Footnote
*RE 'fake news'. I prefer the term propaganda which says more about the purpose than the specific item of 'news'. For those interested, this article explores this issue.

Sources of images of other people's graphs

2 comments:

  1. Thanks for the article Fred. Personally I find shouting at the tv and the radio very soothing. Cari

