There has already been too much written about the Bernie Madoff $50 billion Ponzi scheme, but forgive me just one more piece where I indulge my inner geek. The current party line is that this was an unsophisticated scam, something that would have been easy to spot, if not from the overly regular returns then from a fraud analysis of the numbers themselves. To that way of thinking, Bernie wasn’t a very sophisticated crook.

Leaving the first point aside, how well does the second one stand up? Could you really have spotted Bernie’s scam from an analysis of the numbers themselves, as, for example, my friend Barry suggests here? Specifically, could you have applied Benford’s Law to the distribution of the most significant digit in the monthly series of Madoff returns, spotted something awry, and turned him in, without knowing anything about "split strike conversion" strategies?

To answer that question I took Madoff’s monthly returns from December of 1990 to May of 2005, as contained in a Fairfield Sentry performance data document. There were 196 months in total, more than enough to credibly expect the share of each digit d, from 1 through 9, among the most significant digits of the performance numbers to track fairly closely the Benford frequency log_{10}((d + 1)/d).
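For anyone who wants to repeat the exercise, here is a minimal sketch of a first-digit test in Python. It is illustrative only: the function names are hypothetical, the sample returns are made-up placeholders rather than the Fairfield Sentry figures, and the chi-square statistic is one common way to summarize the fit, not necessarily the comparison used for the plot.

```python
import math
from collections import Counter


def benford_expected():
    """Expected share of each leading digit d = 1..9 under Benford's Law."""
    return {d: math.log10((d + 1) / d) for d in range(1, 10)}


def leading_digit(x):
    """Most significant nonzero digit of x, or None when x is zero."""
    if x == 0:
        return None
    # Scientific notation puts the leading digit first, e.g. '4.5...e-01'.
    return int(f"{abs(x):.15e}"[0])


def observed_shares(values):
    """Observed share of each leading digit, skipping zero values."""
    digits = [d for d in map(leading_digit, values) if d is not None]
    counts = Counter(digits)
    return {d: counts[d] / len(digits) for d in range(1, 10)}


def chi_square(values):
    """Pearson chi-square statistic of observed counts against Benford."""
    digits = [d for d in map(leading_digit, values) if d is not None]
    counts = Counter(digits)
    n = len(digits)
    expected = benford_expected()
    return sum(
        (counts[d] - n * expected[d]) ** 2 / (n * expected[d])
        for d in range(1, 10)
    )


# Hypothetical monthly returns in percent (placeholders, NOT Madoff's data).
sample_returns = [1.2, 0.45, 2.3, 0.9, 1.7, -0.15, 3.1, 0.07, 1.05, 0.62]

expected = benford_expected()
observed = observed_shares(sample_returns)
for d in range(1, 10):
    print(f"{d}: expected {expected[d]:6.1%}  observed {observed[d]:6.1%}")
print(f"chi-square: {chi_square(sample_returns):.2f}")
```

With nine digit categories the statistic has 8 degrees of freedom, so a value above roughly 15.5 would reject a Benford fit at the 5% level. A handful of data points, as in the placeholder sample, is far too few for the test to mean much.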

Here are the results. I have plotted the expected and actual incidence of each number, in percentage terms, side-by-side.

The upshot is that Bernie’s performance numbers tracked Benford’s Law surprisingly closely. In other words, a straightforward numerical analysis of his performance numbers, without recourse to knowledge about "split strike conversion" option strategies, would not necessarily have shown up the (alleged) fraud here. Matter of fact, had you done this sort of quantitative analysis as an SEC employee, your tendency might have been to be somewhat skeptical about claims of fraud.

Now, this isn’t meant to absolve Madoff. While he has been convicted of nothing, the allegations seem well substantiated, and he has apparently said some awfully incriminating things. Nevertheless, it is interesting to see that any fraud here was sophisticated enough that the proffered performance numbers were credible from a distributional point of view.

Taking it one step further, it almost certainly means Madoff’s numbers would have been generated algorithmically. He didn’t pluck them from thin air at the end of each month. That is, I think, interesting in that it shows that this (alleged) con was at least somewhat more sophisticated than some of the noisier critics out there have been saying.