Looking after number one
In 1881, Simon Newcomb noticed an unusual pattern in his book of logarithm tables: the early pages were far more worn than the later ones. It would be more than 50 years before this phenomenon was named, not after Newcomb, as is almost traditional in mathematics, but after Frank Benford, who studied it in 1938. Although this observation may seem innocuous, Benford’s Law, as it is now known, is far-reaching and can be invaluable for detecting fraud across a range of situations.
Benford took the initial observation further and tested it on as many different sets of data as he could get his hands on, from the surface areas of rivers to the street addresses of the first 342 people listed in American Men of Science. Each of these conformed broadly to the same pattern.
Put simply, Benford’s Law is the principle that, in many sets of data, numbers starting with lower digits are more common. Benford theorised that in ‘real-world’ statistics the distribution of first digits would not be uniform but skewed, so that numbers starting with lower digits (1, 2, 3) are more likely to arise. Using population as an example, Benford showed that the population of a US city was more likely to begin with 1 than with any other digit. The population itself could be 10, 100, 1,000 or even 1 million; only the leading digit matters.
Formally, the law states that the probability of a number having a certain digit d (from 1 to 9) as its first digit is given by:

$$P(d) = \log_{10}\left(1 + \frac{1}{d}\right)$$
This can also be shown as a graph of the percentage frequency of each first digit, which falls steadily from about 30.1% for a leading 1 to about 4.6% for a leading 9.
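For readers who would like the numbers behind that graph, the frequencies can be worked out directly from the logarithmic formula for the first digit. The short Python sketch below is only an illustration; the function name `benford_first_digit` is our own choice and not part of any library.

```python
import math


def benford_first_digit(d: int) -> float:
    """Benford probability that the first significant digit is d (1 to 9)."""
    return math.log10(1 + 1 / d)


# Probability of each possible first digit, 1 through 9.
probabilities = {d: benford_first_digit(d) for d in range(1, 10)}

# Print the table as percentages, one digit per line.
for d, p in probabilities.items():
    print(f"{d}: {100 * p:.1f}%")
```

The nine probabilities add up to exactly 1, because the sum of the logarithms collapses to the logarithm of 10.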
The distribution shows that numbers beginning with 1 are roughly six times more likely than numbers beginning with 8 or 9. The law can be extended to the second digit of a number, and it even carries over to numbers expressed in other bases, such as hexadecimal. However, although the phenomenon was discovered over a century ago, mathematicians still struggle to explain exactly why it occurs. We can see that it follows naturally from exponentially growing processes, such as the number of bacteria in a sample measured at regular intervals. It also applies to data that is scale-invariant, that is to say, data whose distribution of digits stays the same even if we change our units, from metres to feet for example. Even so, there is no obvious reason why so many data sets should have this property.
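The link with exponential growth and with a change of units is easy to check numerically. The sketch below assumes a hypothetical bacterial population that starts from a single cell and doubles at each of 1,000 regular intervals, then multiplies every value by 3.28084 (the number of feet in a metre, used here simply as a convenient change of scale). The helper names are ours, and the experiment is an illustration rather than a proof.

```python
import math
from collections import Counter


def first_digit(x: float) -> int:
    """First significant digit of a positive number."""
    # Scientific notation always puts the leading digit first.
    return int(f"{float(x):.15e}"[0])


def digit_frequencies(values):
    """Fraction of the values that start with each digit 1 to 9."""
    counts = Counter(first_digit(v) for v in values)
    return {d: counts[d] / len(values) for d in range(1, 10)}


benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# Hypothetical population: one cell that doubles at every interval.
populations = [2 ** k for k in range(1000)]

# The same data after a change of units (illustrative scale factor).
rescaled = [p * 3.28084 for p in populations]

growth = digit_frequencies(populations)
scaled = digit_frequencies(rescaled)

for d in range(1, 10):
    print(f"{d}: Benford {benford[d]:.3f}  "
          f"doubling {growth[d]:.3f}  rescaled {scaled[d]:.3f}")
```

Both columns should sit close to the Benford values, which is what scale invariance predicts: changing the units shuffles the individual numbers but leaves the overall spread of first digits essentially untouched.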
So what has this got to do with us? One important use of the Law is in the detection of fraudulent financial behaviour. It will be no surprise that companies, and individuals within them, are often tempted to ‘cook the books’ to their own ends. However, genuine company financial statements generally follow Benford’s Law, so comparing the first digits in a statement with the expected Benford distribution can help to weed out statements that have been tampered with.
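One common way to make such a comparison is a chi-square goodness-of-fit test on the first digits. The sketch below is a minimal illustration and not a description of how any particular auditor works: both data sets are invented, the function names are our own, and 15.507 is the standard 5% critical value of the chi-square distribution with 8 degrees of freedom.

```python
import math
from collections import Counter

BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# 5% critical value of the chi-square distribution, 8 degrees of freedom.
CHI2_CRITICAL = 15.507


def first_digit(x: float) -> int:
    """First significant digit of a non-zero number."""
    return int(f"{abs(float(x)):.15e}"[0])


def chi_square_vs_benford(values) -> float:
    """Chi-square statistic of the observed first digits against Benford."""
    digits = [first_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    n = len(digits)
    return sum((counts[d] - n * p) ** 2 / (n * p) for d, p in BENFORD.items())


# Invented 'plausible' amounts: steady 7% growth spanning many magnitudes.
plausible = [100 * 1.07 ** k for k in range(500)]

# Invented 'suspicious' amounts: every value from 100 to 999 exactly once,
# so each first digit appears equally often.
suspicious = list(range(100, 1000))

for name, data in [("plausible", plausible), ("suspicious", suspicious)]:
    stat = chi_square_vs_benford(data)
    verdict = "worth a closer look" if stat > CHI2_CRITICAL else "consistent"
    print(f"{name}: chi-square = {stat:.1f} ({verdict})")
```

A large statistic does not prove fraud; it only flags a set of figures whose first digits are distributed unusually and may deserve a closer look.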
Just like companies, countries also have an incentive to fudge their financial statements. In Greece’s case, the incentive came from its membership of the Eurozone. Members are expected to comply with certain economic benchmarks, and face sanctions if they fail to do so. In 2009, a study was launched into the economic information provided by 27 EU nations, and according to Benford’s Law, Greece’s economic data could be described as suspicious, to say the least. Though this was not a surprise (the EU had already launched several investigations into Greece’s economic reports), it is an excellent example of how the Law can be used across a wide range of data sets to provide insight into their validity.
One relatively high-profile use of Benford’s Law followed the 2009 Iranian election. The candidate officially declared the winner was Mahmoud Ahmadinejad, but this result was unexpected and led to outcry and riots across the world. Many have since argued that statistical analysis of the vote count can be used as evidence of tampering. In 2006, Walter Mebane of the University of Michigan studied election results from a variety of countries and found that most election data complied with a Benford’s Law test on the second digits. When he later applied the same test to the vote count of the 2009 election in Iran, he found that it did not match the expected distribution well.
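The second-digit distribution used in tests of this kind follows from the same logarithmic rule: the chance of a given second digit is found by adding up the Benford probabilities of every two-digit opening (10 to 99) that ends in that digit. The sketch below computes those expected frequencies only; it is not a reconstruction of Mebane’s analysis, and the function name is our own.

```python
import math


def benford_second_digit(d2: int) -> float:
    """Benford probability that the second significant digit is d2 (0 to 9)."""
    # Sum over every possible first digit d1, treating the pair as the
    # two-digit opening 10*d1 + d2.
    return sum(math.log10(1 + 1 / (10 * d1 + d2)) for d1 in range(1, 10))


second_digit = {d2: benford_second_digit(d2) for d2 in range(10)}

for d2, p in second_digit.items():
    print(f"{d2}: {100 * p:.2f}%")
```

The second digits are much closer to uniform than the first, ranging from roughly 12% for a 0 down to roughly 8.5% for a 9, which is why a second-digit test needs a reasonably large amount of data to show anything.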
Some 130 years later, what started as an anecdote about a now rarely used tool has been used to detect fraud in financial, economic and election data on a large scale. From a nation’s economic data all the way down to your own bank statement, Benford’s Law describes a surprising amount of how our world really operates.
By Callum Kemp, University of Bath














