Sunday, May 31, 2009 at 5:15 pm by Darryl
Amateur Statisticians Do the Darnedest Things
I value statistical inference as a way of evaluating evidence in an argument. I really do. I usually wince a little when people throw out the “lying with statistics” or “lies, damn lies, and statistics” lines. Such lines dismiss (usually in utter ignorance by the utterer) a rather robust body of norms for the use of statistical inference in addressing scientific questions.
Here’s what brings this up. Someone named Marla Singer over at some blog called Zero Hedge posted an interesting analysis today.
Marla and her team took the records of all Chrysler dealers, both those to be closed and those remaining, and matched each to the owner’s political contributions. Of course they knew whether the dealerships were targeted for closure or not. Next Marla used logistic regression on the dichotomous variable Keep/Dump (i.e. close status) to look for any pattern in how donation history affects the odds of being closed.
Here is what she found:

If you walked up to 100 professional statisticians, put these results in their face and asked for a quick interpretation, what would you get? All of them would point out that none of these parameter estimates were significant at any commonly accepted levels of statistical significance. A p-value of less than 0.05 is taken as “significant” in most disciplines, but a p-value of less than 0.1 is sometimes acceptable for some disciplines within the social sciences.
The statisticians would go on to explain that there is considerable uncertainty in all of the estimates, such that the odds-ratios cannot really be distinguished from the value 1.0 (no effect). They might then explain that either (1) there is no effect of any of the covariates, or, (2) if there is some small effect, a much larger sample would be needed in order to detect it.
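For readers who want to see what “cannot be distinguished from 1.0” looks like in practice, here is a minimal sketch of the usual Wald interval for an odds ratio. The coefficient and standard error are made-up illustrative numbers, not values from Singer's output, and the function name `wald_summary` is my own.

```python
import math

def wald_summary(beta, se, z_crit=1.96):
    """Summarize one logistic-regression coefficient.

    beta   : estimated log-odds coefficient
    se     : its standard error
    z_crit : normal critical value (1.96 gives a 95% interval)

    Returns (odds_ratio, ci_lower, ci_upper, two_sided_p).
    """
    odds_ratio = math.exp(beta)
    ci_lower = math.exp(beta - z_crit * se)
    ci_upper = math.exp(beta + z_crit * se)
    z = beta / se
    # Two-sided p-value under a standard normal reference distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return odds_ratio, ci_lower, ci_upper, p_value

# Hypothetical inputs, chosen only to illustrate a non-significant estimate.
odds_ratio, lo, hi, p = wald_summary(beta=0.5, se=0.33)
print(f"OR = {odds_ratio:.2f}, 95% CI = ({lo:.2f}, {hi:.2f}), p = {p:.3f}")
print("interval includes 1.0 (no effect):", lo < 1.0 < hi)
```

When the interval straddles 1.0, the data are compatible with no effect at all, which is the situation the panel of statisticians would be describing.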
So how did our amateur statistician respond to this analysis? Let’s take it in pieces:
This puzzled us. Why would there be a significant and highly positive correlation between dealer survival and Clinton donors?
The answer is that there is not a “significant and highly positive correlation.” What Singer has shown is that there is a parameter estimate that is not significantly different from zero by any common statistical criteria.
Granted, that P-Value (0.125) isn’t enough to reject the null hypothesis at 95% confidence intervals (our null hypothesis being that the effect is due to random chance), but a 12.5% chance of a Type I error in rejecting a null hypothesis (false rejection of a true hypothesis) is at least eyebrow raising.
In fact, this statement is not true under ordinary statistical inference. Singer has undertaken a statistical fishing trip, hoping to find some regression covariate that yields a parameter significantly different from zero at the 0.05 level. Singer does not show all regressions she ran, and she doesn’t seem to list all the covariates tested. But suppose she ran a total of eight regressions, each with only a single covariate. (I think that is how she did it, but her description of methods lacks sufficient detail to know for sure.) Even if all the covariates had no association whatsoever with the outcome variable, she would have a 1 - (1 - 0.05)^8 = 33.7% probability of finding at least one of the covariates significant. Finding a p-value of something smaller than 0.125 is pretty likely.
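The arithmetic is easy to check. The sketch below assumes the tests are independent and that every null hypothesis is true; the count of eight regressions is my guess from above, not something Singer reports, and `familywise_error_rate` is just a name I picked.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance that at least one of n_tests independent tests yields a
    p-value below alpha when every null hypothesis is actually true."""
    return 1.0 - (1.0 - alpha) ** n_tests

n_regressions = 8  # assumed number of single-covariate regressions

for alpha in (0.05, 0.125):
    rate = familywise_error_rate(alpha, n_regressions)
    print(f"threshold {alpha}: {rate:.1%} chance of at least one 'hit'")
# threshold 0.05: 33.7% chance of at least one 'hit'
# threshold 0.125: 65.6% chance of at least one 'hit'
```

In other words, under these assumptions, seeing at least one p-value below 0.125 somewhere among eight null regressions is more likely than not.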
Most statistians would not call this a “find” as 95% confidence intervals are the gold standard for this sort of work.
No shit…particularly when the analyst has undertaken a p-value fishing expedition.
Nevertheless, it seems clear that something is going on here.
Well…maybe in Singer’s head. The statistical analysis certainly does not offer evidence that “something is going on.”
Specifically, the somewhat low probability that the Clinton data showing higher survivability of Clinton donors could result just from pure chance.
What the fuck?!?
But why not better significance with any of the other variables?
What the fuck, again?!?
Why this stand out?
This stand [sic] out only because Singer wished for it to stand out.
There is one other amusing thing about Singer’s analysis. Many of our panel of statisticians would likely suggest to Ms. Singer that she needs to collect a larger sample size, to which Singer would protest, “but I used all the Chrysler dealerships there are!”
“Oh,” each statistician would state. “You mean you don’t have a statistical sample…you have the entire population?”
“Yes.”
“Well then there is no sampling error,” our statisticians would point out (and point out with geeky glee), “you got what you got, and there are no standard errors, since standard errors quantify uncertainty resulting from accidents of sampling from a population.”
“Oh.”
In fact, there are a number of interesting analyses that Ms. Singer could do with these data, but p-value hunting ain’t one of ‘em.

Sunday, May 31st, 2009 at 5:46 pm
Excellent post. I’m amused that this and FiveThirtyEight’s debunking are both listed in the Trackback section of Singer’s original entry.
Monday, June 1st, 2009 at 7:59 am
She’s up front about all this, which is the priceless part! She says *specifically* that you can’t reject the null hypothesis, and then goes on to discuss it as if it were true.
What’s incredibly funny about this too is that, as Nate Silver noted, she changed her hypothesis. In fact, if I were a betting man, I’d say that the regression using party alone (Democrat/Republican/other/none) yielded absolutely no good statistical results, and seeing that she spent so much time scraping/downloading and coding her data to prove a partisan point, she was struggling to find something else to say.
In the end, she did a great job proving that there was no partisan preference in this case. Thanks for the effort, Marla!
Monday, June 1st, 2009 at 10:00 pm
I got a kick out of Marla’s comment after presenting the results:
Doh! And that’s when she revealed the conspiracy-theory tin foil hat she was wearing.
I could have saved her a lot of time massaging data. According to the motor vehicle industry’s Allpar News from May 21st:
And if that’s not good enough, Chrysler dealers may contest criteria automaker used for list.
Perhaps Marla could do a regression against the, uh… data-driven metric used to determine whether franchise contracts would be renewed?
…Nah, on second thought, that’s too easy. ‘Much better to go looking for the Cigarette-Smoking Man instead. Where’s Fox Mulder when you need him!?
Monday, June 1st, 2009 at 10:29 pm
Does no one remember the 2008 primaries? Wasn’t Rush exhorting people to cross party lines and vote for Hillary…could it be that the ever so slight, not very significant Hillary leaning of the dealers was due to some Repub dealers contributing to Hillary during the long, grueling Dem primary in the hopes of dividing Dems…so while Dem dealers were likely split between Hill and Obama during 2008, Repubs would contribute to Hill…do not know if dealers could only be labeled as being a supporter of one candidate or of any and all they contributed to…if any and all contributions were counted…Repub contributions to Hill seem the simplest explanation….