Think you’ve stumbled on a revelatory insight in a dataset? Visit Victoria’s head of research and insights, Julian Major, explains why the odds are that something is wrong with your data.
In 2023, Netflix released the documentary Live to 100: Secrets of the Blue Zones. Unlike many speculative “documentaries,” this film received a 100 per cent approval rating on Rotten Tomatoes, won three Daytime Emmys, and garnered positive coverage from several reputable news outlets, all of which offered life advice inspired by its content.
The idea is that certain regions of the world, known as Blue Zones, have populations with a higher likelihood of living beyond 100. By studying the lifestyle and dietary commonalities across these geographically diverse areas, we can gain insights into how to extend our own lifespans.
Sound reasonable?
Perhaps hindsight is king, but surely this should have immediately sparked scepticism. Living to 100 is rare – can it really be as simple as taking more naps, finding religion, or using more olive oil?
In 2024, Dr. Saul Newman won the Ig Nobel Prize, a satirical award given to achievements that “first make people laugh, and then make them think.”
Newman found that fraud and error, not diet and lifestyle, largely explain the Blue Zones. Poor record-keeping, poverty and incentives were what produced them, not sweet potatoes and a positive outlook.
According to his preprint, the introduction of state-specific birth certificates is associated with a significant decline in the number of supercentenarian records, ranging from 69 to 82 per cent. If we want to live longer, perhaps we should abolish the Births, Deaths and Marriages office.
The late media researcher Tony Twyman is best known for Twyman’s Law: “The more unusual or interesting the data, the more likely they are to have been the result of an error of one kind or another”. Twyman, a market and media researcher by trade, no doubt coined this for those with good intentions. Well-run research is often boring: changes in longitudinal research are rare and slow, and the differences that do exist within datasets are often obvious. When we spot something unusual, it is worth investigating further before acting on the data.
While Twyman’s Law was likely intended for well-intentioned data analysts, it’s particularly relevant in industries where there are strong incentives to exaggerate and where scepticism is often undervalued.
Scroll through any marketing press publication and you will find countless examples of internal results being published, whether campaign performance or channel results, often fuelled by MarTech.
At some point, you have to wonder whether there is any figure people would look at and say, ‘Well, that doesn’t sound right’. A 50x increase in click-through rates? 200x ROI? I’m sure the people proactively taking these figures to the press merely want to impart some factual wisdom. Often such claims are made in largely stable markets, where customer penetration, market share and total revenue data are easy to find, and where growth is hard. Brands within a category largely have access to the same, or similar, technology and audience solutions. Perhaps some are better than others, but delivering 20x more growth than everyone else? Unlikely.
Another recent example is the application of synthetic data to market research. As Ujwal Kayande from the Melbourne Business School pointed out in a recent webinar, academic findings on its accuracy are mixed. Boring! That hasn’t stopped large positive claims about synthetic data’s accuracy, usually backed by impressive but fundamentally vague statistics (‘It’s 95 per cent accurate!’). It is not surprising that these claims come from those with direct investments in companies selling synthetic data products.
Sensational claims are not purely the realm of business. Positive-results bias is real in academic publishing: replications are not trendy, and academic fraud, whilst inexcusable, is partly a product of these incentives. We reward flimsy research on ‘groundbreaking results’ over robust research showing that certain things do not work. For synthetic data, whilst current results are mixed, we don’t know how many paper ideas were thrown in the bin for having uninteresting findings, or rejected because of these biases.
In the business world, there is also such a thing as being too sceptical. In the same way that ‘if you’re too open-minded, your brain will fall out’, in an uncertain world we have to make judgment calls about the best way forward without perfect evidence that may never come. But I think there is a clear tendency for our industry to lean too far one way.
So, the next time you read a headline touting sensational findings, remember Tony Twyman.