Technology in Auditing Using Benford's Law
What started out as a curious observation by an astronomer in 1881 has the potential to have a significant impact on the audit profession 125 years later. In that year, the astronomer "Simon Newcomb noticed that the front pages of his logarithmic tables frayed faster than the rest of the pages" and concluded that "the first digit is oftener 1 than any other digit". Newcomb quantified the probability of each digit occurring in the first position, as well as in the second. For the most part, however, he considered it a curiosity and left it at that. (Caldwell 2004)
In the 1920s, a physicist at the GE Research Laboratories, Frank Benford, thought it more than a curiosity. He conducted extensive testing of naturally occurring data and computed the expected frequencies of the digits. Table 1 shows these expected frequencies for the first four digit positions. Benford also determined that the data could not be constrained to a restricted range of numbers, such as the market values of stock, nor could it be a set of assigned numbers, such as street addresses or Social Security numbers. (Nigrini 1999)
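As a brief illustration (not part of the original sources), the first-digit frequencies Benford tabulated follow the logarithmic formula P(d) = log10(1 + 1/d). The short Python sketch below reproduces the kind of first-digit expectations that appear in Table 1.

    import math

    # Expected proportion of values whose first digit is d under Benford's Law:
    # P(d) = log10(1 + 1/d), for d = 1..9
    for d in range(1, 10):
        p = math.log10(1 + 1 / d)
        print(f"First digit {d}: {p:.1%}")

Running this prints roughly 30.1% for the digit 1 down to about 4.6% for the digit 9, which is why the front pages of Newcomb's logarithm tables wore out first.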
The underlying theory behind why this happens can be illustrated using investments as an example. If you start with an investment of $100 and assume a 5% annual return, it would be the 15th year before the value of the investment reached $200 and therefore changed the first digit to 2. It would take only an additional 8 years to change the first digit to 3, an additional 6 years to change the first digit to 4, and so on. Once the value of the investment grew to $1,000, the time it would take to change the first digit (going from $1,000 to $2,000) would revert back to the same pace as it took to change it from $100 to $200. Unconstrained, naturally occurring numbers follow this pattern with remarkable predictability. (Ettredge and Srivastava 1998)
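To make that arithmetic concrete, the following illustrative sketch (my own example, not from the cited sources) computes the year in which a $100 investment growing at 5% per year first crosses each leading-digit boundary; the gaps between crossings shrink exactly as described above.

    # Years for a $100 investment at 5% annual growth to first reach
    # each new leading-digit boundary ($200, $300, ..., $1,000).
    value, year, prev_year = 100.0, 0, 0
    for boundary in range(200, 1001, 100):
        while value < boundary:
            value *= 1.05
            year += 1
        print(f"${boundary}: year {year} (+{year - prev_year} years)")
        prev_year = year

The output shows the jump from $100 to $200 taking 15 years, $200 to $300 taking 8 more, $300 to $400 taking 6 more, and so on, matching the intuition that a value spends far longer with a leading 1 than with a leading 9.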
In 1961, Roger Pinkham tested and proved that Benford's Law is scale invariant and therefore applies to any unit of measure and any type of currency. In the 1990s, Dr. Mark Nigrini discovered a powerful auditing tool using Benford's Law. He determined that most people assume the first digits of numbers are distributed equally among the digits, and that people who make up numbers tend to use numbers starting with digits in the mid-range (5, 6, 7). Therefore, comparing the distribution of digits in a set of numbers to the frequencies expected under Benford's Law can indicate that the set may have been manipulated, and that manipulation could represent fraud. Dr. Nigrini has continued to focus on ways to use Benford's Law as a fraud detection tool. (Walthoe et al. 1999)
When I first learned of this predictability of numbers and how it could be used to detect fraud, my interest was piqued. I understand the basic logic of how and why this works, but I am most interested in how to apply Benford's Law in the work I do. With the power of computers and easy access to enormous amounts of data, it appears fairly simple to quantify how a set of data compares to the expected results using digital analysis based on Benford's Law. But once you have the results, what do they mean?
The first criterion is to have sufficient data. Dr. Nigrini suggests that a minimum of 1,000 records is needed to expect good conformity to Benford's Law, and a data set with 3,000 or more records should provide excellent conformity. Data sets below 300 records are not practical to test using Benford's Law, and data sets between 300 and 1,000 records should be expected to show higher deviations. The second consideration is the tool used to perform the analysis. Microsoft Excel will perform the analysis, but you are limited to data sets with fewer than 65,536 records. Microsoft Access seems to be a preferred choice, as it is not limited in the number of records and can easily read various file formats through its ODBC capabilities. Access also has strong grouping and data-filtering features that are useful for digital analysis. There are also many specialized data analysis software products, which are usually more expensive than Microsoft Access. (Nigrini 2002)
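Although the paper describes Excel and Access, the same digital analysis can be sketched in a few lines of a scripting language. The following illustrative Python fragment (my own example, with hypothetical file and column names) tallies the leading digit of each transaction amount and compares the observed proportions to Benford's expected frequencies.

    import csv, math
    from collections import Counter

    # Illustrative sketch (hypothetical "transactions.csv" with an "amount" column):
    # count leading digits and compare them to Benford's expected frequencies.
    counts = Counter()
    with open("transactions.csv", newline="") as f:
        for row in csv.DictReader(f):
            amount = abs(float(row["amount"]))
            if amount >= 1:                      # skip zero and sub-unit amounts
                counts[str(amount)[0]] += 1      # leading digit as a character

    total = sum(counts.values())
    for d in "123456789":
        observed = counts[d] / total
        expected = math.log10(1 + 1 / int(d))
        print(f"digit {d}: observed {observed:.1%}  expected {expected:.1%}")

Large gaps between the observed and expected columns are the "abnormal occurrences" discussed next; they do not prove fraud, but they tell the auditor where to drill down.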
Once you have performed the analysis, how do you interpret the results? You are looking for an abnormal occurrence of a set of digits. The most appropriate test examines the occurrence of the first two digits. Once you find an abnormal occurrence, you group and filter the data to isolate the abnormal transactions and evaluate the items to determine what caused the abnormality. In one example, there was a high occurrence of the first