On Worshipping Non-Parametric Tests

In continuing to ponder the inherent problems in making non-modal distributional assumptions about stocks, I got to wandering through one of my old statistics texts. That, in turn, got me thinking for the first time in ages about non-parametric tests, which remain the wondrous things I remember them to be.

With the preceding in mind, here is the Second Commandment from John Alroy’s excellent Ten Statistical Commandments:

Thou shalt run non-parametric tests! If the parametric and non-parametric tests come out the same, thou hast lost nothing. If they don’t, the data are non-normal, the parametric test is wrong, and thou shalt use the non-parametric result. Spearman, Mann-Whitney, and Kolmogorov-Smirnov are the Holy Trinity (or Quintinity, or whatever). Worship them!

Truer words ne’er said.
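To make the commandment concrete, here is a minimal sketch of my own (not anything from Alroy) that runs a parametric and a non-parametric measure side by side. It compares Pearson correlation with Spearman rank correlation on made-up data, y = exp(x), chosen purely for illustration. The helper names are hypothetical, it uses only the standard library, and it compares coefficients only; it does not compute p-values.

```python
import math

def pearson(xs, ys):
    """Pearson product-moment correlation (the parametric measure)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Illustrative data only: a perfectly monotone but very non-linear relationship.
xs = list(range(1, 11))
ys = [math.exp(x) for x in xs]

r_pearson = pearson(xs, ys)
r_spearman = spearman(xs, ys)
print(f"Pearson:  {r_pearson:.3f}")
print(f"Spearman: {r_spearman:.3f}")
```

On this toy data the two disagree: the relationship is perfectly monotone, so Spearman is 1, while Pearson comes out well below that because the relationship is far from linear. That gap is exactly the signal the commandment says to pay attention to.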


  1. Do you wonder about the statistical coincidence of statistically based algorithmic trading and failure in times of volatility?
    Is using statistics the best method to model an essentially biological system? Perhaps not.
    When a whole industry seems to be based on using the same methods and the same people, it's headed in the direction of disaster. Look at how quants are recruited and you will see a growing monoculture (like the world of financial services in general), and no doubt a great opportunity for smart traders to exploit the inherent weaknesses exposed by statistical thinking.
    I wonder, do any algorithmic models actually model the people who drive markets and hence drive volatility in times like these? If they did not, that would be very odd, but also very predictable.
    Statistics are good at finding patterns, but they are not adaptive and don't react well to change. That would seem like a serious flaw to me, but my knowledge of algorithmic trading is limited, so it would be great to be enlightened.

  2. Josh Stern says:

    I posted in the “Fat tails” comment section about the problem of correlations. I like non-parametrics, but they don’t do anything to solve the problem of making huge multivariate models tractable without independence assumptions. When independence assumptions are correct, then normality follows quickly (the Central Limit Theorem converges fast, so the distribution of the mean of independent trials of any old variable is closely approximated by a normal after about N=12). The assumption of statistical independence between the trials is the weak link here, not the CLT.

  3. @Ian: “behavioral finance”

  4. Josh
    That’s a fair point. A big part of the problem here is the assumption of statistical independence, which blows up distributional assumptions.
    That said, regime shifting is something not well handled by quant models. I’ve been looking at a Lehman factor model, and the transitions in the last four weeks have been remarkable.

  5. Josh Stern says:

    Almost all of statistics, including non-parametrics, is based on modeling i.i.d. (independent, identically distributed) samples. If a set of trials (past or predicted future) is not independent, or if the future predictive situations are not identically distributed to the past samples, then statistical modeling that ignores those problems won’t be valid.
    To put your regime idea in this context, if both regimes were represented in past samples, then, by itself, having two regimes shouldn’t be a problem. Sure, it’s possible, as you suggest, that the right distribution is multi-modal and modelers tried to use, say, a Gaussian distribution that just didn’t fit well. But I’m trying to emphasize that questioning the validity of the i.i.d. assumption is a much more serious and fundamental problem for this type of quant modeling. It couldn’t be handled by non-parametrics or anything else that needs the prediction problem to be close to an i.i.d. setup for training and test.

  6. @Seth: Right, my question is, how prevalent are behavioral approaches, compared to statistical approaches?