Comments on: Publishing False Positives

By: David Colquhoun

David Colquhoun — Sun, 03 May 2015 20:17:21 +0000

Under weak assumptions it’s possible to show that, if you claim to have made a discovery when you observe P = 0.047, you have at least a 30% chance of being wrong (and a lot worse if it’s an implausible hypothesis).

P values do exactly what’s claimed of them. The problem is that what they tell you isn’t what you want to know. What you want to know is the false discovery rate, i.e. the probability that a “significant” result is wrong. A lot of people think that’s what the P value tells you. It isn’t.
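To make the distinction between the P value and the false discovery rate concrete, here is a minimal sketch of the standard "tree" calculation. The inputs (10% of tested hypotheses being true, 80% power) are assumptions chosen for illustration, not figures from this comment. The sketch counts every result with P < 0.05, so it is a simpler calculation than the one behind the "at least 30%" figure quoted above for an observed P = 0.047, and it does not reproduce that number.

```python
def false_discovery_rate(prior, power, alpha):
    """Fraction of 'significant' results that are false positives.

    prior: assumed fraction of tested hypotheses where the effect is real
    power: probability that a real effect produces p < alpha
    alpha: significance threshold
    """
    false_positives = alpha * (1.0 - prior)  # no real effect, but p < alpha
    true_positives = power * prior           # real effect, and p < alpha
    return false_positives / (false_positives + true_positives)


# Illustrative inputs only: an even-odds hypothesis and a less plausible one.
for prior in (0.5, 0.1):
    fdr = false_discovery_rate(prior, power=0.8, alpha=0.05)
    print(f"prior={prior}: false discovery rate = {fdr:.3f}")
```

Under these assumed inputs, the less plausible the hypothesis, the larger the share of "significant" results that are wrong, even though the threshold of 0.05 never changes.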

There are simple explanations on my blog and on YouTube:
http://www.dcscience.net/2014/03/24/on-the-hazards-of-significance-testing-part-2-the-false-discovery-rate-or-how-not-to-make-a-fool-of-yourself-with-p-values/

https://www.youtube.com/watch?v=tRZMD1cYX_c

There is a proper description in Royal Society Open Science
http://rsos.royalsocietypublishing.org/content/1/3/140216

By: jt512

jt512 — Sat, 07 Jan 2012 01:01:11 +0000

banyan wrote:

Reading [the study], I found this: “We used father’s age to control for variation in baseline age across participants.” Can someone explain what this means?

The reason you don’t understand that sentence is that it is utter nonsense. “Baseline” refers to the starting time of a longitudinal study, that is, a study where repeated measures are taken on subjects over time. “Baseline age” would then be the ages of the subjects at the beginning of the study.

However, the study (Study 2) in the paper was not longitudinal; it was cross-sectional: only a single measurement was taken on each subject. Therefore, “baseline,” and hence “baseline age,” have no meaning. Furthermore, even if the study were longitudinal, you could not use the subjects’ fathers’ ages to adjust for differences in the subjects’ ages between experimental groups. If you wanted to make such an adjustment, you’d put the subjects’ own baseline ages in the models, not their fathers’.

What the “researchers” in this “study” (Study 2) appear to have done was to divide subjects into groups who listened to one of two songs. There was no significant difference in the mean ages of the subjects between the groups. The researchers then tried statistically adjusting the subjects’ ages by using a number of nonsensical factors until they found one that produced a statistically significant difference in the subjects’ mean ages between groups. That factor happened to be the subject’s father’s age. They then dreamt up some science-y sounding rationale (“adjusting for baseline age”) to give the procedure the appearance of legitimacy. They then made the ridiculous claim that one of the songs caused a regression in age for the subjects who listened to it.
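A rough sketch of why trying adjustment factors until one "works" is a problem: if each attempt is treated as an independent test at the 0.05 level with no real effect present, the chance that at least one attempt comes out significant grows quickly. Independence is an assumption made here for simplicity (adjustments tried on the same data are correlated), so the numbers are only indicative.

```python
def chance_of_false_positive(n_tries, alpha=0.05):
    """Probability that at least one of n_tries independent tests of a
    true null hypothesis gives p < alpha."""
    return 1.0 - (1.0 - alpha) ** n_tries


# Hypothetical numbers of covariates tried; not taken from the paper.
for n_tries in (1, 5, 10, 20):
    print(f"{n_tries:2d} tries: {chance_of_false_positive(n_tries):.3f}")
```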

It was a silly exercise, because a difference in ages between the groups, whether due to nonsensical statistical modeling or not, does not imply regression in age.

Jay

By: sonic

sonic — Fri, 06 Jan 2012 17:47:40 +0000

jt512 is correct about how the Bayes factor works (you can assume a prior probability and/or probability distribution, but you must make the assumption).

Further, for Bayes to be valid, the information must be coming in randomly; that is, the next piece of information must come randomly from all possible sources of information about the topic. (One can’t look into the bag before picking a ball.)
If a researcher decides to do a study based on his/her understanding of a situation, then it is questionable that Bayes applies. (He looked into the bag and made decisions about how to pick the next ball.)

And I do apologize for the strained analogy.

I’m pretty sure the misuse of statistics is not new. Back in the late 1970s and early 1980s, computer programs were developed for statistical analysis. These were then used by people who had no idea about the limitations of the mathematics.

For example, a statistical analysis is valid for a population that has been randomly sampled.
What is the population that is randomly sampled in the case of a study done on college sophomores who got paid to do the study?
The answer is NONE. It is not a random sample of any population.
So drawing conclusions about any population (other than the actual participants) from subjects recruited this way is not valid according to the math.

A small study of non-randomly selected people can be a means of doing a pilot study, the results of which might be interesting enough to justify a real study (costly, time consuming).
This is one reason replication is important, but who gets paid for that? Heck, it seems the magazine wouldn’t publish attempts (both successful and not) to replicate Bem’s work.
The demand for novel findings seems higher than the demand for careful analysis and testing of said results right now.

By: banyan

banyan — Fri, 06 Jan 2012 15:55:10 +0000

Brilliant idea for a study.

Reading it, I found this: “We used father’s age to control for variation in baseline age across participants.” Can someone explain what this means?

By: jt512

jt512 — Thu, 05 Jan 2012 20:45:45 +0000

Blaisepascal,

Your understanding of the Bayes factor is wrong. The Bayes factor does not appear in the form of Bayes’ theorem that you have presented. It only appears in the odds form of Bayes’ theorem, which I gave in my previous post. From that equation, it is evident that the Bayes factor does not depend on the prior odds; however, it does depend on the statistician’s choice of distribution for the alternative hypothesis, as I explained above.

Jay

By: jt512

jt512 — Thu, 05 Jan 2012 20:33:45 +0000

Steven,

Bayesian hypothesis tests are based on the odds form of Bayes’ theorem:

(posterior odds) = (Bayes factor) * (prior odds), where

(Bayes factor) = P(D|H1) / P(D|H0).

The Bayes factor above is the amount by which the data change your degree of belief in the alternative (versus the null) hypothesis. As is evident from the odds form of Bayes’ theorem, the Bayes factor is independent of the prior odds, that is, your degree of belief in the hypothesis before seeing the data. Different people will have different prior odds of the hypothesis, and the Bayes factor is independent of those subjective judgments.
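As a small illustration of the odds form, the sketch below applies one fixed Bayes factor to several different priors. The Bayes factor of 10 and the prior probabilities are made-up values for illustration only.

```python
def posterior_probability(bayes_factor, prior_probability):
    """Posterior P(H1 | D) from the odds form of Bayes' theorem:
    (posterior odds) = (Bayes factor) * (prior odds)."""
    prior_odds = prior_probability / (1.0 - prior_probability)
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# The same data (same Bayes factor), read by three readers with different priors.
for prior in (0.5, 0.1, 0.001):
    post = posterior_probability(10.0, prior)
    print(f"prior P(H1)={prior}: posterior P(H1|D)={post:.4f}")
```

The Bayes factor is identical in all three rows; only the reader's prior changes, and so does the conclusion.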

However, in order to calculate the Bayes factor itself, the statistician must specify a prior distribution on the alternative hypothesis. This distribution is needed to calculate P(D|H1), the numerator of the Bayes factor. The choice of distribution on the alternative hypothesis will affect the Bayes factor, and so some subjectivity in a Bayesian hypothesis test is unavoidable.

However, that does not mean that anything goes. Some distributions on the alternative are more reasonable than others, and whatever distribution is chosen, it needs to be disclosed and justified. Furthermore, a sensitivity analysis can be conducted to investigate how different choices of reasonable distributions affect the Bayes factor.
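A sensitivity analysis of this kind might look like the following sketch. The setting is an assumption chosen for simplicity, not anything from the studies discussed: k successes in n binary trials, H0 fixing the success probability at 0.5, and H1 placing a Beta(a, b) distribution on it. The data (60 of 100) and the three Beta priors are hypothetical.

```python
from math import exp, lgamma, log


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def bayes_factor(k, n, a, b):
    """P(D|H1) / P(D|H0) for k successes in n trials, where
    H0: theta = 0.5 and H1: theta ~ Beta(a, b).
    The binomial coefficient appears in both terms and cancels."""
    log_p_data_h1 = log_beta(k + a, n - k + b) - log_beta(a, b)
    log_p_data_h0 = n * log(0.5)
    return exp(log_p_data_h1 - log_p_data_h0)


# Same hypothetical data, three different priors on the alternative.
for a, b in ((1, 1), (10, 10), (50, 50)):
    print(f"Beta({a},{b}) prior on H1: Bayes factor = {bayes_factor(60, 100, a, b):.3f}")
```

Reporting the Bayes factor for each candidate prior side by side lets a reader see how much the conclusion depends on that choice.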

Jay

By: mlegower

mlegower — Thu, 05 Jan 2012 20:01:19 +0000

blaisepascal- "If both P(E|H) and P(E|not H) were reported, the reader could calculate their Bayes Factor and posterior probability themselves, without the researcher having to assume any given prior themselves."

[Presented entirely absent of hostility and in the interest of mutual education] But the formula for the Bayes factor is K = P(E|H)/[P(H)P(E|H) + P(not H)P(E|not H)], which means that you have to assume something about P(H) to calculate it, right? Which means that you can report the probabilities of observing the data given each regime (P(E|H) and P(E|~H)), but you can't go on to infer anything about the probabilities of each regime given the data unless you establish a prior over the regimes, correct? But if you are only reporting the probabilities of observing the data given the regime, then you are back to what is essentially a frequentist approach, I would imagine.

Certainly, given the data and the methods, you can establish the posterior for any prior you might have. And maybe the best route is to report simply P(E|H), P(E|~H), and Bayes Rule so that the interested observer can plug in their own prior. You can even test the sensitivity of the posterior to the choice of prior. But it seems like that is the nature of the criticism above.

By: BobbyG

BobbyG — Thu, 05 Jan 2012 19:05:20 +0000

“If a researcher can’t demonstrate that his distribution is Gaussian, he should use Chebychev’s inequality instead. That would kick a lot of results out of ‘statistical significance’.”

Indeed. Color me Chebychev.

@ 2 “sigma”, 1-(1/k^2) = .75; @ 3 “sigma” it’s .89
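For comparison, a short sketch that puts the Chebychev lower bound 1-(1/k^2) next to the coverage you would get if the distribution really were Gaussian. The function names are just labels for this example.

```python
from math import erf, sqrt


def chebyshev_min_coverage(k):
    """Lower bound on P(|X - mean| < k*sigma) for any distribution
    with finite variance (Chebyshev's inequality)."""
    return 1.0 - 1.0 / k**2


def gaussian_coverage(k):
    """P(|X - mean| < k*sigma) when X is exactly Gaussian."""
    return erf(k / sqrt(2.0))


for k in (2, 3):
    print(f"{k} sigma: Chebyshev >= {chebyshev_min_coverage(k):.3f}, "
          f"Gaussian = {gaussian_coverage(k):.4f}")
```

At 2 sigma the guaranteed coverage is only 75% rather than roughly 95%, which is why a result that looks "significant" under a Gaussian assumption may not survive the distribution-free bound.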

LOL. Any time I see the word “Gaussian,” particularly in conjunction with purported assessment of some non-physical attribute, my hand slides reflexively over my wallet.

“Gaussian Copula” – well, now, THAT worked out really, REALLY swell, didn’t it?

By: Steven Novella

Steven Novella — Thu, 05 Jan 2012 18:58:37 +0000

Sorry – didn’t have time to edit until this afternoon. Yes 0.05 – now fixed.

By: wfr

wfr — Thu, 05 Jan 2012 16:59:02 +0000

I really hope he meant “0.5”. I’m just about to publish the killer results of my coin-toss experiment.