The single most important rule when reading anything based on statistics is to understand the source and reliability of the information. Never was that more evident than in the much-quoted numbers about how widely Breaking Bad's final episode has been pirated, with Australia apparently leading the piracy charge. Is that really the case? A careful look at the data suggests not.
The source of this information is research by BitTorrent news site TorrentFreak, which regularly tracks download habits for major TV shows and movies. By TorrentFreak's calculations, more than 500,000 copies of the final Breaking Bad episode were shared within 12 hours of its broadcast. TorrentFreak also claims that 18 per cent of the downloads it tracked originated in Australia, a higher share than any other country. Those figures have been very widely reported, including by our sibling sites Gizmodo and Business Insider.
Realistically, we're not going to see any better data than this. There's a commercial incentive for TV networks to have their ratings measured carefully and accurately, so they're prepared to pay to fund such a system and have consistent, comparable data. There's no direct financial benefit to knowing the scale of piracy, so the data will always be piecemeal. But that doesn't mean we shouldn't understand what the figures can tell us, and why we should perhaps be more cautious in interpreting them.
So let's highlight the problems with this data:
It doesn't track all pirate activity
TorrentFreak only measured visible activity on public torrents. That wouldn't include private sharing sites, or people viewing illegal streams. To be fair, the figure is presented as "at least", so that possibility is acknowledged, but we need to bear it in mind before making blanket statements.
The methodology isn't clearly defined
For the headline figure -- more than 500,000 downloads -- all we are told is that "data gathered by TorrentFreak shows that 12 hours after the first copy of the episode appeared online, more than half a million people has grabbed a copy through one of many torrent sites". That's not particularly specific. Which sites? When did the first copy appear?
The sample size is quite small
One big source of confusion in a lot of the reportage: while the numbers say more than 500,000 people have downloaded the episode, the figures on which countries were downloading are based on a subset of that group: 13,945, to be precise. While that's still a large number, it isn't spectacularly large. We're also not told what period was being measured for this group (which is important for a reason I'll get to shortly).
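To put those two numbers side by side, here is a rough sketch that works out what fraction of the headline figure the country sample covers, and how much random sampling error a sample of that size would carry. It assumes the 13,945 downloads were a simple random sample of all downloads, which we have no way of confirming, and the function name `margin_of_error` is just an illustrative choice.

```python
import math


def margin_of_error(share, sample_size, z=1.96):
    """Half-width of a normal-approximation 95% interval for a proportion.

    Assumes a simple random sample, which the published data does not confirm.
    """
    return z * math.sqrt(share * (1 - share) / sample_size)


sample_size = 13_945        # downloads with a country attached (from the article)
total_downloads = 500_000   # headline figure, reported as "more than"
australia_share = 0.18      # share attributed to Australia

sample_fraction = sample_size / total_downloads
moe = margin_of_error(australia_share, sample_size)

print(f"Sample covers about {sample_fraction:.1%} of the headline figure")
print(f"Random sampling error on Australia's share: +/- {moe:.1%}")
```

If that assumption held, random error alone would move the 18 per cent figure by well under one percentage point. That calculation says nothing about when or how the sample was collected, which is where the larger doubts lie.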
Identifying originating countries is tricky
While an IP address is quite a reliable indicator of country of origin, it isn't foolproof. If you're using a VPN, you could appear to have come from somewhere else. Tracking cities is even less reliable.
We can't eliminate possible sources of bias
Given Australia's population size and the fact that Breaking Bad is quite easily accessible legally, our appearance at the top of the country rankings does seem unusual. However, even with the limited data we have, it's possible to conclude either that the figure itself might not be accurate or that the collection methodology biases the results towards Australians.
On the first point: TorrentFreak's list of the top 10 countries downloading accounts for 64 per cent of the total. While it's conceivable that the remaining 36 per cent is spread across much smaller countries, it also seems likely that some of the data couldn't be accurately tracked by country. If you assumed that a quarter of the untracked downloads actually originated from the US -- no big stretch given its population -- then the US would once again be at the head of the pack. Without a clear indication on that point, it's risky (and certainly unscientific) to state unequivocally that Australia downloads more than anyone else. At best, we can say that Australia was the most identifiable source.
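The arithmetic behind that scenario can be sketched in a few lines. The US share of tracked downloads isn't quoted here, so `us_tracked_share` is a hypothetical input and the values in the loop are placeholders rather than reported figures. The sketch also assumes none of the unattributed downloads belong to Australia.

```python
def us_overtakes(us_tracked_share, australia_share=18.0,
                 unattributed=36.0, fraction_to_us=0.25):
    """Return the adjusted US share and whether it exceeds Australia's.

    All shares are percentage points of the tracked sample. Only the 18, 36
    and one-quarter figures come from the article; us_tracked_share is a
    hypothetical input.
    """
    adjusted = us_tracked_share + unattributed * fraction_to_us
    return adjusted, adjusted > australia_share


extra_points = 36.0 * 0.25        # a quarter of the unattributed 36 per cent
break_even = 18.0 - extra_points  # US tracked share needed to draw level

print(f"Extra points credited to the US: {extra_points}")
print(f"US leads if its tracked share is above {break_even} per cent")

for us_share in (8.0, 10.0, 14.0):  # placeholder values, not reported data
    adjusted, leads = us_overtakes(us_share)
    print(f"tracked {us_share}% -> adjusted {adjusted}% -> US leads: {leads}")
```

In other words, under these assumptions the US only needs a tracked share above 9 per cent for the ranking to flip.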
On the second point: while we aren't given details, the 13,945 downloads tracked would have been measured at a specific point in time. The first torrents would have become available in the middle of the day for Australians, but many people wouldn't search for and download them until they returned home from work. Conversely, in the US, pirate viewers could have begun downloading much closer to broadcast time. If the tracking happened six hours after broadcast, you would expect Australian figures to be disproportionately high (people taking their first opportunity when at home), and US figures to be lower (since many viewers would have already downloaded and started viewing).
We don't know that this is what happened, because we don't have enough details of the methodology. But it's not an inconsistent reading of the limited data we have. And if this is the case, we can't claim Australians are the biggest pirates in the world for Breaking Bad: we can only suggest that most people do their torrenting from home.
Finally, consider this. Even if you assume all these figures are reasonable, the 500,000 claimed downloads within 12 hours are a tiny proportion of the 10.3 million people who watched the final episode live on TV in the US. Even on a generous assessment, piracy isn't as popular as sitting down and watching -- at least some of the time.
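For a sense of scale, the comparison works out as follows. This simply divides the two figures quoted above; note that the download count is worldwide while the viewing figure is for the US only, so it is a loose comparison rather than a like-for-like one.

```python
claimed_downloads = 500_000    # TorrentFreak's "at least" figure, first 12 hours
us_live_viewers = 10_300_000   # US live TV audience for the final episode

ratio = claimed_downloads / us_live_viewers
viewers_per_download = us_live_viewers / claimed_downloads

print(f"Downloads as a share of US live viewers: {ratio:.1%}")
print(f"Roughly one download for every {viewers_per_download:.0f} live viewers")
```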
Lifehacker 101 is a weekly feature covering fundamental techniques that Lifehacker constantly refers to, explaining them step-by-step. Hey, we were all newbies once, right?