Comments on Probably Overthinking It: There is only one test!

2012-10-29T15:53:10.029-07:00

Got it, and I agree. Thanks again!

2012-10-29T15:41:39.766-07:00

Hi Allen, sorry for the delay and thanks for the response!

For the second point, here's all I meant to say: A p-value above 0.1 might not mean the apparent effect is *likely* due to chance ("We're confident this is a fair die"), but rather that *you shouldn't rule out* that it might be due to chance ("Our sample was too small to be confident that it's not a fair die"). Sometimes that distinction is important.
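To illustrate the sample-size point, here is a minimal sketch rather than anything from the article. It assumes a hypothetical loaded die whose suspect face comes up with probability 0.25 instead of 1/6, and a one-sided exact binomial test on the count of that face. The function names, the 0.25 bias, and the sample sizes are all assumptions chosen for illustration. The code computes how often such a loaded die would still produce a p-value above 0.1.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(k, n, p):
    """One-sided p-value: probability of k or more successes under p."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


def chance_of_large_p_value(n, true_p, null_p=1 / 6, threshold=0.1):
    """Probability that the p-value exceeds `threshold` when the die
    really is loaded (the suspect face has probability `true_p`)."""
    return sum(
        binom_pmf(k, n, true_p)
        for k in range(n + 1)
        if upper_tail(k, n, null_p) > threshold
    )


# Hypothetical bias: the suspect face appears 25% of the time.
small_sample = chance_of_large_p_value(n=30, true_p=0.25)
large_sample = chance_of_large_p_value(n=300, true_p=0.25)
print(f"P(p-value > 0.1 | loaded die, 30 rolls)  = {small_sample:.3f}")
print(f"P(p-value > 0.1 | loaded die, 300 rolls) = {large_sample:.3f}")
```

Under these assumptions, a large p-value from 30 rolls is quite compatible with a loaded die, while the same result from 300 rolls would be much harder to explain that way.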

But I agree that a tiny p-value generally means your effect is likely not due to chance (after accounting for multiple comparisons, etc.).
And I definitely agree it's best if you can explicitly account for the costs of false positives or false negatives, and Bayesian methods are a good approach for that.

2012-01-26T12:33:19.715-08:00

Excellent comments... thanks!

I completely agree with your first point, and I think we are just beginning to figure out the effect of modern computation on statistics.

I mostly agree with your second point. I think the best approach is to use Bayesian methods to estimate the posterior probability of the hypotheses, and then there is a natural way to take into account the cost of false positives and false negatives.

But if we are restricted to classical hypothesis testing, I still think it's ok, if the p-value is very small, to make the qualitative conclusion that the effect is likely to be real.

2012-01-26T12:20:20.954-08:00

Hi Allen, great article! Two comments:

You say "there is no special reason to choose the exponent 2" ...
I'm sure you're well aware, but you could point out for future readers: one major reason it's traditionally a "chi-squared" statistic instead of "chi-absolute-value" is that squared deviations are differentiable everywhere, while absolute values are not differentiable at zero. So folks 100 years ago could analytically minimize a sum of squares, and end up with a simple plug-in formula; but you can't do that with sums of absolute values -- those require more intensive number-crunching.
(But if today's computers had been widely available when stats was being invented, we might all be using absolute-value versions instead.)
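With simulation, swapping the exponent is a one-line change. The following is a minimal sketch, assuming a fair six-sided die as the null hypothesis and a test statistic equal to the sum of absolute deviations raised to a chosen exponent. The function names and the observed counts are illustrative assumptions, not code from the article.

```python
import random
from collections import Counter


def deviation(counts, expected, exponent):
    """Total deviation from the expected count, with a chosen exponent."""
    return sum(abs(c - expected) ** exponent for c in counts)


def simulated_p_value(observed, exponent, n_sims=2000, seed=0):
    """Fraction of simulated fair-die experiments whose deviation is at
    least as large as the deviation of the observed counts."""
    rng = random.Random(seed)
    n_sides = len(observed)
    n_rolls = sum(observed)
    expected = n_rolls / n_sides
    actual = deviation(observed, expected, exponent)
    hits = 0
    for _ in range(n_sims):
        rolls = Counter(rng.randrange(n_sides) for _ in range(n_rolls))
        counts = [rolls[side] for side in range(n_sides)]
        if deviation(counts, expected, exponent) >= actual:
            hits += 1
    return hits / n_sims


# Illustrative counts for 60 rolls (an assumption for this sketch).
observed = [8, 9, 19, 6, 8, 10]
p_squared = simulated_p_value(observed, exponent=2)
p_absolute = simulated_p_value(observed, exponent=1)
print(f"exponent 2: p is about {p_squared:.3f}")
print(f"exponent 1: p is about {p_absolute:.3f}")
```

The two exponents can give different p-values for the same data, because squaring puts more weight on a single large deviation than the absolute value does.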

Also, you say "if the p-value is smaller than 1/100, the effect is likely to be real; if it is greater than 1/10, probably not" ...
I disagree. If the p-value is, say, 0.20, I might still think the effect is "likely" to be real, but the evidence for it just isn't strong enough to justify an action that might have big consequences.
The important question is: are there serious consequences if I claim a real effect when in fact it's not real?
In the Casino example, if the p-value were 0.2, I'd remain wary of that guy (and personally believe "the effect is likely to be real"), but wouldn't have enough evidence to justify something drastic like putting him in jail.
But in another example, it might be fine to accept the alternative hypothesis with a p-value of 0.2.
(Does my class actually prefer sausage over the default (pepperoni) pizza we usually get for pizza parties? Let's say a small survey suggests they do, with a p-value of 0.2. It's not STRONG evidence for sausage over pepperoni, so I could be wrong; but nobody's really going to suffer if I order sausage even though pepperoni would have been slightly more preferable.)

Cheers,
Jerzy

2011-06-01T12:05:49.464-07:00

Great article! Build on it! It might be nice to point out the differences between parametric and non-parametric bootstrap, MCMC sampling from posterior distributions for Bayesian analyses, permutation tests, etc. Might be useful to consider correlated tests as well. I don't think even simple solutions are available without a good deal of thinking!