Benford’s Law (1)
In any given set of large (i.e. spanning several orders of magnitude) random numbers, one might expect that all the digits from 1 to 9 should appear, statistically, at each position in the number sequence around 11% (100/9) of the time[1].
In 1881 Canadian astronomer Simon Newcomb, and again independently in 1938 American physicist Frank Benford, noticed that, in sets of large numbers obtained by measuring natural physical phenomena, digit 1 appears as the first digit of the number on average 30% of the time, 2 appears in this position about 18% of the time and so on through to digit 9, which appears in the first position a mere 5% of the time. Benford called this the “Law of Anomalous Numbers” or the “Law of First Digits”, but it has today come to be known as the Newcomb-Benford Law or, more commonly, just Benford’s Law[2]. Benford’s Law is a statistical effect, so our set of measured numbers must contain many examples before the effect can be conclusively demonstrated.
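The expected first-digit frequencies are usually written as P(d) = log10(1 + 1/d). The short Python sketch below simply evaluates that formula for digits 1 to 9. The function name benford_first_digit is my own label for illustration, not anything standard.

```python
import math


def benford_first_digit(d: int, base: int = 10) -> float:
    """Expected share of numbers whose leading digit is d (illustrative helper)."""
    if not 1 <= d < base:
        raise ValueError("leading digit must lie between 1 and base - 1")
    # Benford's formula: log_base(1 + 1/d)
    return math.log(1 + 1 / d, base)


# Print the expected share for each possible leading digit in base 10.
for d in range(1, 10):
    print(d, f"{benford_first_digit(d):.1%}")
```

The nine shares add up to 100%, with 1 taking roughly 30% and 9 taking a little under 5%.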
The light bulb moment for Newcomb came when he noticed that the first few pages of his book of logarithmic tables (covering numbers 1,2 & 3) were much more thumbed and dog-eared than the final few pages covering numbers 7, 8 & 9 [3].
Counterintuitively, Benford’s Law holds true even for systems which we might expect to be randomly distributed, for example: the lengths of the hundred longest rivers in the world. It holds true no matter what physical characteristic of a system (mass, length, volume, radiation, dollars etc.) is being measured. The Law also holds true irrespective of the units being used (SI v Imperial, meters v kilometers, grams v tonnes, light years v parsecs) provided that the units employed give sufficiently large numbers for the thing being measured. The numbers can be to base 10 (our normal system of counting), base 60 (as used by the ancient Babylonians) or base 2 (as used by computers). It does not, however, hold true for artificially restricted numbers (telephone numbers, for example).
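The claim about units can be tried out with a few lines of Python. This is only a sketch: I have no table of river lengths to hand, so the first thousand powers of 2 stand in as a made-up data set spanning many orders of magnitude, and the variable names lengths_km and lengths_miles are purely illustrative.

```python
from collections import Counter


def first_digit(x: float) -> int:
    """Leading non-zero digit of x, read off its scientific notation."""
    return int(f"{abs(x):.15e}"[0])


def first_digit_shares(values):
    """Share of values starting with each digit from 1 to 9."""
    counts = Counter(first_digit(v) for v in values)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}


# Stand-in data (an assumption, not real measurements): powers of 2.
lengths_km = [2.0 ** n for n in range(1, 1001)]
# Change of units: kilometres to miles.
lengths_miles = [v * 0.621371 for v in lengths_km]

shares_km = first_digit_shares(lengths_km)
shares_miles = first_digit_shares(lengths_miles)
for d in range(1, 10):
    print(d, f"{shares_km[d]:.1%}", f"{shares_miles[d]:.1%}")
```

In both columns the digit 1 should lead about 30% of the time, which is the unit independence described above.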
Why is this so? Clever mathematicians and statisticians have pondered this question and, although I do not fully understand all their explanations, the one answer that I do understand seems to lie in the positional number system that we use to represent large numbers.
In the following, remember that we read large numbers from left to right, but add to them (count) from right to left.
With our numbering system, we unconsciously use logarithms almost every day. The system is a not-so-secret code which specifies that, in a number expressed as a linear sequence of digits, the digits are arranged logarithmically. Each number position (slot) carries ten times the weight of the position immediately to its right. Thus, the five-position number 76543, when we break the code, represents (7×10000) +(6×1000) +(5×100) +(4×10) +3.
When we are counting with this system, each added unit goes into the right-hand position of our growing number. When that position is filled, the units are converted to a 1 which is then transferred to the second position from the right. Adding units steadily to the right-hand slot, it will take ten times as long to fill the second slot as it does to fill the first, and ten times as long again to fill the third slot as it does the second, and so on. When we stop counting, it is therefore much more likely that the slot on the right will contain a 9 than that any position to its left will. Turning the argument around, when counting stops, the left-hand position of our large number – the last slot to be filled – is more likely to contain a 1 than is any slot to its right. The same argument can be made for all the other digits from 2 to 8.
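One way to make the counting argument concrete is to stop counting at different points and ask what share of the numbers counted so far begin with a given digit. The sketch below does this by brute force. The function name leading_digit_share and the stopping points are my own choices for illustration.

```python
def leading_digit_share(limit: int, digit: int) -> float:
    """Share of the integers 1..limit whose first digit equals `digit`."""
    target = str(digit)
    hits = sum(1 for n in range(1, limit + 1) if str(n)[0] == target)
    return hits / limit


# Compare digits 1 and 9 at a few arbitrary stopping points.
for stop in (999, 1999, 4999, 9999):
    share_1 = leading_digit_share(stop, 1)
    share_9 = leading_digit_share(stop, 9)
    print(stop, f"1: {share_1:.1%}", f"9: {share_9:.1%}")
```

Stopping at 1999, more than half of the numbers counted begin with 1, while 9 only catches up when counting happens to stop at 999 or 9999. At none of these stopping points does 9 lead more often than 1.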
Benford’s Law has found many practical applications in the detection of fraud. It cannot prove a fraud, but it can quantify a “that doesn’t look right” moment and so focus further investigation. Crooked accountants inventing false company profits. Bureaucrats working for corrupt Governments producing false GDP figures. Scientists publishing fake data. Fishermen lowering the weight of their annual haul to meet Government catch limits. Taxpayers doing their annual returns. In all such cases, liars are likely to produce numbers that do not conform to Benford’s Law. The reason is that creators of large false numbers will instinctively seek to distribute their digits randomly along the sequence.
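How might such a check be quantified? One common approach, and it is my choice here rather than anything prescribed by the Law itself, is a chi-square goodness-of-fit test on the first digits. The sketch below is plain Python with no statistics library. The function names are illustrative, and the threshold of 15.507 is the standard 5% critical value for 8 degrees of freedom.

```python
import math
from collections import Counter

# Expected first-digit shares under Benford's Law.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
# 5% critical value of chi-square with 8 degrees of freedom.
CRITICAL_5PCT_8DF = 15.507


def first_digit(x: float) -> int:
    """Leading non-zero digit of x, read off its scientific notation."""
    return int(f"{abs(x):.15e}"[0])


def benford_chi_square(values) -> float:
    """Chi-square statistic of observed first digits against Benford."""
    digits = [first_digit(v) for v in values if v != 0]
    n = len(digits)
    observed = Counter(digits)
    return sum((observed[d] - n * p) ** 2 / (n * p) for d, p in BENFORD.items())


# Made-up figures whose first digits are spread evenly, as a liar might do.
invented = [d * 1000 + 7 for d in range(1, 10) for _ in range(100)]
# Stand-in "natural" figures: powers of 2 are known to follow Benford closely.
natural = [2.0 ** n for n in range(1, 1001)]

print("invented:", benford_chi_square(invented) > CRITICAL_5PCT_8DF)
print("natural: ", benford_chi_square(natural) > CRITICAL_5PCT_8DF)
```

A statistic above the threshold only says the digits look unusual. As noted above, it flags a set of numbers for a closer look and proves nothing on its own.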
A good example (I take this from the statology.org website) is the collapse of the giant US energy company Enron in 2001. Forensic accountants, applying Benford’s Law to Enron’s public accounts, showed that they did not satisfy the Law. This could have, and should have, raised suspicions about Enron’s viability if said forensic accountants had carried out their work before the company collapsed, rather than after. Enron’s auditor for many years – Arthur Andersen LLP, one of the world’s largest accounting firms with over 84,000 employees – deservedly followed Enron into receivership in 2003. Note that number 84000 – although large, it has only two significant figures.
Law enforcement relies on being smarter than the average crook. That, unfortunately, is not always the case. More sophisticated liars can research Benford’s Law – as I have done – and make sure that their numbers pass this test.
Can Benford’s Law be applied to the mining industry? Mining and exploration companies can fabricate false accounts like any other. However, when they announce their resource figures these do not normally contain enough significant digits. No company ever presents their metal resource to the nearest ounce of gold or ton of base metal – a large element of rounding is always used, reflecting the inherent uncertainty of such calculations. A knowledgeable person would not need Benford to recognise BS when they saw it.
Take the biggest mining fraud in history – the Busang scandal of the mid to late-1990s (for full details and analysis on the Busang fraud see my previous posts HERE and HERE).
In this fraud, a small group of contract Filipino geologists working for the Canadian junior exploration company Bre-X added purchased alluvial gold to drill samples from the Busang gold project, located in the remote jungle of Central Borneo, in the Indonesian province of Kalimantan. The crushed and mixed samples were then sent to an independent laboratory to be assayed. From thousands of assay results reported regularly to the Toronto stock exchange over a three-year period, expert analysts calculated that the Busang deposit contained over 70 million ounces of gold. Hyperbolic, overexcited financial journalists enthused over a potential for 200 million ounces. At its peak, the penny stock Bre-X soared to CAN$286.50, valuing the company at CAN$6 billion. The fraudsters made cash by selling their company stock.
But 70 million ounces is a round figure. All one can say is that it is a big, big number for an epithermal gold deposit. There is nothing for Benford to work on here. And the Borneo fraudsters were sophisticated; they did not just add a pinch of gold dust to each of around 10,000 drill samples (my estimate). Operating over several years in a secret jungle laboratory, they carefully weighed out calculated amounts of gold and added them to each sample to create precisely the desired assay numbers, compatible with a real hard-rock epithermal gold system.
The deception continued for several years. Exploration geologists who knew the Indonesian exploration scene and fancied themselves knowledgeable in gold exploration (I was one) did not suspect fraud even though there were some puzzling aspects to the steady stream of misinformation provided by Bre-X. The sheer scale and duration of the fraud that would have been necessary was beyond anything we had experienced.
The criminality was finally exposed in 1997 by an expert geological team from the American mining company Freeport-McMoRan. They did this while carrying out on-site due diligence prior to making a partial takeover offer for Bre-X. But the unmasking of the fraud was too late for the thousands of Canadian mum-and-dad and corporate investors who collectively lost 6 billion dollars when Bre-X was delisted. No one was ever charged with the crime [4].
[1] In our positional numbering system, 0 (zero) is not a whole digit but a placeholder indicating an unfilled slot or position in the number sequence.
[2] Confusingly, there is another Benford’s Law, propounded by US physicist Gregory Benford in 1980. It states: “Passion is inversely proportional to the amount of real information available”. (Gregory) Benford’s “Law” is an interesting and humorous aphorism rather than a scientific Law and is much lesser known than the Law of Simon Newcomb and Frank Benford.
[3] When I first read this gem of scientific insight, I dug out my own book of log tables – miraculously-preserved from pre-electronic calculator days – I have it in front of me right now – dated 1957, compiled by Frank M Castle “For the use of students in Scottish High Schools”. And sure enough… the first few pages are indeed more thumbed and dog-eared than the last!
[4] I cannot resist adding this dramatic footnote to the Busang story. It concerns the senior Filipino site geologist Michael de Guzman. If we exclude the possibility that the Calgary-based Bre-X executives were in on the scam, De Guzman fills the role of chief villain. In 1997, while Freeport geologists were carrying out their due diligence, De Guzman, flying to site from the regional capital Samarinda in an Indonesian Army helicopter, supposedly fell to his death. A body was found on the jungle floor a few days later, partially eaten by pigs, and was identified as that of De Guzman by one of his site-based geology colleagues. But a later report by respected New Zealand investigative journalist John McBeth found that a male corpse had mysteriously disappeared from the Samarinda morgue some days before the incident (!). Conspiracy theories abound. There is still time for an aged De Guzman, living under an assumed name, to re-appear from the shadows to tell his side of the story.