Author Topic: p-values and Statistical Significance


Offline Achamian

  • Brand New
  • Posts: 6
p-values and Statistical Significance
« on: Apr 12, 2012, 07:46:54 PM »
Hi all, I have recently discovered SGU and have been devouring the podcast archive. I love the show.

I noticed that the few times statistical metrics have been brought up on the show, the p-value is usually the go-to statistic to cite. I've taken a basic college stats class so I know a little about p-values, correlation vs causation, null hypotheses, etc. Anyway, I was feeling a little skeptical about the use of p-values, so I did some digging around. What I've found has made me very suspicious of the use of p-values as a sole (or even primary) criterion for characterizing scientific results as significant/insignificant:

1) http://www.tc.umn.edu/~alonso/Goodman_Semin_Hemat_2008.pdf
2) http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1119478/?tool=pmcentrez
3) http://www.uv.es/sestio/TechRep/tr14-03.pdf
4) http://www.stats.org.uk/statistical-inference/Johnson1999.pdf

(Bad citations? Screw you, I'm lazy and this isn't school :P)

Some highlights:
Quote
The P value is a measure of statistical evidence that appears in virtually all medical
research papers. Its interpretation is made extraordinarily difficult because it is not part of
any formal system of statistical inference. As a result, the P value’s inferential meaning is
widely and often wildly misconstrued, a fact that has been pointed out in innumerable
papers and books appearing since at least the 1940s.
(1)
Quote
Firstly, table 3 shows that P<0.05 cannot be regarded as providing conclusive, or even strong, evidence against the null hypothesis.
(2)
Quote
Conservatively, therefore,
392, or 90.1%, of empirical articles in a sample of marketing journals report the results of statistical tests
in a manner that is incompatible with Neyman–Pearson orthodoxy
(3)

I am wondering how pervasive the misuse of p-values is within the scientific community. Of course, this is good reason not to trust a single study, but I have to say that this makes me really question some of the scientific conclusions you see coming out of the softer sciences. Perhaps someone within the scientific community has more insight on this question.

Offline Cowtown Cody

  • Well Established
  • *****
  • Posts: 1669
  • Part-time Ferret
    • Cold North
Re: p-values and Statistical Significance
« Reply #1 on: Apr 12, 2012, 08:32:38 PM »
As to how prevalent p-value misunderstandings are, a good answer is in that first paper you cite.  It looks like your research has answered a lot of your own questions.  Which is great.

Appropriate p-values are set by convention, and in my experience, it's pretty well understood that a p-value is not an indicator of some kind of mechanical, scientific certainty - it's just convention.  But that is useful in and of itself; if you see an unusually large p-value, or a p-value that doesn't adhere to the minimum standards set by convention, you should be much more wary about how you regard the claims being made by that study, because the larger the p-value, the easier it is to call weaker evidence "statistically significant," which is a good way to swindle people by science-washing your data.  In warning against pseudoscientific claims that pretend to have rigorous statistical analysis, that's a very useful caution.

You shouldn't use a p-value all by itself for in-depth information, and obviously replication still matters quite a bit.  In point of fact, the latest episode of the Brain Science Podcast features a cognitive neuroscientist making exactly these statements about a variety of Cog Sci research.  Much of it hasn't been replicated, and a lot of the methodology involves intentionally removing information, which creates an over-focus on localization of cognition and brain function.

But all that said, it is still useful to get a quick handle on a study as well, provided you understand what it means.  If you see a study that shows that, assuming the null were true, their results or more extreme ones would only occur 5% or 1% of the time, that's still pretty useful in general terms.
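To make that definition concrete, here is a minimal sketch (not something from the thread's sources) that simulates a lot of studies in which the null really is true and counts how often p < 0.05 turns up. It assumes a two-sided z-test with a known standard deviation of 1; the function names, study count, and sample size are made up for illustration.

```python
import math
import random

def two_sided_p(z):
    """Two-sided p-value P(|Z| >= |z|) for a standard normal Z."""
    return math.erfc(abs(z) / math.sqrt(2))

def simulate_null_studies(n_studies=20000, n_per_study=10, seed=0):
    """Fraction of studies reaching p < 0.05 when the true mean is exactly 0."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_per_study)]
        mean = sum(sample) / n_per_study
        # With sigma = 1 the standard error is 1/sqrt(n), so z = mean * sqrt(n).
        z = mean * math.sqrt(n_per_study)
        if two_sided_p(z) < 0.05:
            hits += 1
    return hits / n_studies

print(simulate_null_studies())
```

The printed fraction should land close to 0.05. Note what that number describes: studies where the null is true. By itself it says nothing about what share of "significant" results out in the literature are false positives.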

Online Chew

  • Poster of Extraordinary Magnitude
  • **********
  • Posts: 10361
  • Let's gut it!
Re: p-values and Statistical Significance
« Reply #2 on: Apr 12, 2012, 08:44:38 PM »
The few times I remember Steve mentioning p-values was in the context of homeopaths/quacks/woo-masters cherry-picking research results, as in "out of 100 studies with a p-value of .05  five of those studies will be false positives just by chance alone".
"It is difficult to say what truth is, but sometimes it is easy to recognize falsehood." -Albert Einstein

Online Beη

  • Well Established
  • *****
  • Posts: 1653
  • Resident density functional theorist
    • electron exchange
Re: p-values and Statistical Significance
« Reply #3 on: Apr 12, 2012, 08:50:06 PM »
There is an SGU episode where they interview a statistician and briefly discuss some misunderstandings about p-values.  I can't think of the person's name.  I think the main topic discussed was Bayesian statistics.  I'll see if I can find it.

Offline Achamian

  • Brand New
  • Posts: 6
Re: p-values and Statistical Significance
« Reply #4 on: Apr 12, 2012, 09:00:00 PM »
Quote
But all that said, it is still useful to get a quick handle on a study as well, provided you understand what it means.  If you see a study that shows that, assuming the null were true, their results or more extreme ones would only occur 5% or 1% of the time, that's still pretty useful in general terms.

Thanks for your thoughts, and most of what you said makes sense. I'm still bothered by just how useless a p-value is by itself and how frequently you hear the words "statistically significant" like it's some sort of seal of approval. In particular, I'm looking at Table 3 from source 2 and just how varied the percentage of false positives can be. Without more pieces of information, it seems fairly useless to be honest.

The other worrying aspect is that reading "P Values are not Error Probabilities" it appears that two schools of thought on statistics (Neyman–Pearson's and Fisher's) were merged together rather arbitrarily.

Offline Achamian

  • Brand New
  • Posts: 6
Re: p-values and Statistical Significance
« Reply #5 on: Apr 12, 2012, 09:12:31 PM »
Quote
The few times I remember Steve mentioning p-values was in the context of homeopaths/quacks/woo-masters cherry-picking research results, as in "out of 100 studies with a p-value of .05  five of those studies will be false positives just by chance alone".
This is exactly the misunderstanding that appears to be so common. The p-value is not a false positive rate. The false positive rate could be much higher (even upwards of 50%) with p-values of 0.05.
From page 10 of P Values are not Error Probabilities (my 3rd source):
Quote
Fisher was insistent that the significance level of a test had no ongoing sampling interpretation. With
respect to the .05 level, for example, he emphasized that this does not indicate that the researcher “allows
himself to be deceived once in every twenty experiments. The test of significance only tells him what to
ignore, namely all experiments in which significant results are not obtained” (Fisher 1929, p. 191).
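
A rough sketch of why the share of false positives among significant results can sit well above 5%. This is plain Bayes-rule arithmetic, not a calculation taken from any of the sources: it assumes every study tests at a fixed threshold alpha, that some fraction of tested hypotheses are real effects (prior_true), and that real effects are detected with a given power. All three inputs and the function name are hypothetical.

```python
def false_positive_share(prior_true, power, alpha=0.05):
    """Share of significant results that are false positives.

    prior_true: assumed fraction of tested hypotheses where the effect is real
    power:      assumed probability a real effect reaches significance
    alpha:      significance threshold applied to every study
    """
    true_pos = prior_true * power
    false_pos = (1.0 - prior_true) * alpha
    return false_pos / (true_pos + false_pos)

# Illustrative scenarios only; none of these numbers come from the papers.
for prior_true, power in [(0.5, 0.8), (0.1, 0.8), (0.1, 0.2)]:
    share = false_positive_share(prior_true, power)
    print(f"prior={prior_true:.1f} power={power:.1f} -> {share:.0%} false positives")
```

Under these made-up inputs the share comes out at roughly 6%, 36%, and 69%. Same alpha every time; the answer moves because of the prior and the power, which the p-value knows nothing about.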

Offline Cowtown Cody

  • Well Established
  • *****
  • Posts: 1669
  • Part-time Ferret
    • Cold North
Re: p-values and Statistical Significance
« Reply #6 on: Apr 12, 2012, 09:28:56 PM »
Yeah, my old stats professor used to say that "statistically significant" is just dweeb-code for "hey, that's kind of interesting," and "insignificant at a given p-value" is dweeb-code for "this is boring."  But it's still important that we're all using the same key to decipher that code.

We do have to at least try to be objective in declaring things interesting or boring.  Think of it as a minimum condition.  Without an acceptable p-value, a study shouldn't even attract your attention.  Once it does, you can look at the sample size, the methodology, and decide how seriously to take that p-value and the study itself.  But if the p-value doesn't meet the minimum standards, it's very fair to say that the study produced nothing of interest, no matter what else happened.

As for the false positive rates, that's a good point that argues well against the conventionally determined p-values in many cases, but it doesn't argue against the p-value per se, and it even tells you when it could be quite credible, and ways that you could make your study's p-value much more credible.  And I like that a lot.

Online Jay_One

  • Seasoned Contributor
  • ****
  • Posts: 515
Re: p-values and Statistical Significance
« Reply #7 on: Apr 13, 2012, 07:19:53 AM »
I think the rogues discussed it during an interview with somebody who works in Bayesian probability. I don't remember the name though.
"I intend to live forever, or die trying." - Groucho Marx

Offline superdave

  • Stopped Going Outside
  • *******
  • Posts: 4122
  • Religious, but not spiritual
Re: p-values and Statistical Significance
« Reply #8 on: Apr 13, 2012, 08:16:50 AM »
A low P-value means your measurement was precise, but it doesn't mean it was accurate.  Say I used a broken scale that never went over 10 pounds to measure all the people who live in my town.  I might conclude that the average male weighs the same as the average female.   However, because the variance would be very low and the total number of subjects very high, I could calculate absurdly low* P-values in this case. 


[pedantry]Actually your P-values would be absurdly high, but in this case you would be more interested in 1-P[/pedantry]

Online Guillermo

  • Frequent Poster
  • ******
  • Posts: 3522
  • (╯°□°)╯︵ ┻━┻
Re: p-values and Statistical Significance
« Reply #9 on: Apr 13, 2012, 09:43:38 AM »
The p-value is a measurement of the probability of committing a Type I error (False positive). It's useful because it tells you how precise the experiment was and gives you standardized measurement of your experiment. You can use this to compare effectiveness of various effects.
However, every experiment should also report its power: the probability of not committing a Type II error (false negative). Power improves as the sample size of the experiment increases.
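
For anyone who wants to see how power grows with sample size, here is a small sketch. It assumes a two-sided one-sample z-test with the effect size given in standard-deviation units; the function name and the example numbers are illustrative, not from any study mentioned in this thread.

```python
import math
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Power of a two-sided one-sample z-test.

    effect_size: assumed true mean shift, in standard-deviation units
    n:           sample size
    alpha:       significance threshold
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = effect_size * math.sqrt(n)
    # Probability the test statistic lands in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

for n in (8, 32, 128):
    print(n, round(z_test_power(0.5, n), 3))
```

With the effect held at half a standard deviation, the power climbs steadily as n goes from 8 to 128. With an effect of exactly zero the function returns alpha, which is the Type I error rate of the test, not the p-value of any particular result.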

After you have established an acceptable p-value, you have to follow up with other types of statistical tests on the experiments.

The way I see it, a large p-value will tell you, "No need to continue, there is nothing here"


Offline Achamian

  • Brand New
  • Posts: 6
Re: p-values and Statistical Significance
« Reply #10 on: Apr 13, 2012, 07:32:19 PM »
Quote
The p-value is a measurement of the probability of committing a Type I error (False positive).

This appears to be incorrect.

Quote
Most applied researchers are unmindful of the historical development of methods of statistical
inference, and of the conflation of Fisherian and Neyman–Pearson ideas. Of critical importance, as
Goodman (1993) has pointed out, is the extensive failure to recognize the incompatibility of Fisher’s
evidential p value with the Type I error rate, α, of Neyman–Pearson statistical orthodoxy.
(pg 1 of my 3rd source)

The rest of your post looks right. I guess as long as you use/interpret p-values correctly, no harm is done. :)

Offline superdave

  • Stopped Going Outside
  • *******
  • Posts: 4122
  • Religious, but not spiritual
Re: p-values and Statistical Significance
« Reply #11 on: Apr 13, 2012, 08:41:54 PM »
Quote
The p-value is a measurement of the probability of committing a Type I error (False positive).

This appears to be incorrect.

Quote
Most applied researchers are unmindful of the historical development of methods of statistical
inference, and of the conflation of Fisherian and Neyman–Pearson ideas. Of critical importance, as
Goodman (1993) has pointed out, is the extensive failure to recognize the incompatibility of Fisher’s
evidential p value with the Type I error rate, α, of Neyman–Pearson statistical orthodoxy.
(pg 1 of my 3rd source)

The rest of your post looks right. I guess as long as you use/interpret p-values correctly, no harm is done. :)
It would be wrong to say a p-value is a measurement of the type 1 error, but type one errors are a very common reason why a p-value will be wrongly applied.  Depending on what exactly you are measuring, they can basically be the same thing.

Offline jt512

  • Seasoned Contributor
  • ****
  • Posts: 742
    • jt512
Re: p-values and Statistical Significance
« Reply #12 on: Apr 14, 2012, 08:57:26 PM »
Quote
A low P-value means your measurement was precise, but it doesn't mean it was accurate.


No.  A p-value does not indicate that the measurement was either precise or accurate.  The precision of a result is quantified by the standard error of the result se (smaller is better), but the p-value is a decreasing function of x̄/se, where x̄ is the result. Therefore, a p-value can be small if x̄ is imprecise but large.  For example, in their study of olive oil use and stroke risk, Samieri et al found that intensive users of olive oil were 41% less likely to suffer a stroke than non-users of olive oil. The p-value for this result was small (p=.03), but the precision of the result was low, as indicated by its 95% confidence interval, 6%–63%, which is an order of magnitude wide!
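
The olive oil numbers can be roughly reproduced in a few lines. This is a back-of-the-envelope sketch, not the study's own analysis: it assumes the result is a risk ratio (41% less likely read as 0.59, and the 6%–63% interval read as 0.94 and 0.37), that the interval is symmetric on the log scale, and that a normal approximation applies. The function name is made up.

```python
import math

def p_from_ratio_ci(estimate, lower, upper, z_crit=1.959964):
    """Recover the standard error, z statistic, and two-sided p-value
    from a ratio estimate and its 95% confidence limits (log scale)."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(estimate) / se
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

se, z, p = p_from_ratio_ci(0.59, 0.37, 0.94)
print(f"se={se:.3f} z={z:.2f} p={p:.3f}")
```

The p-value lands near the reported .03 even though the standard error is large, because the estimate itself is far from the null. A small p-value and a wide interval coexist without any contradiction.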

Quote
The p-value is a measurement of the probability of committing a Type I error (False positive). It's useful because it tells you how precise the experiment was and gives you standardized measurement of your experiment. You can use this to compare effectiveness of various effects.


No, no, and no.  (1) The p-value is not the probability of committing a Type I error.  See any of the references in the OP, especially the one by Goodman.  (2) The p-value is not a measure of precision.  See my comments above. And (3) The p-value is not a standardized measure of much of anything, and if I knew what you meant by comparing the "effectiveness of various effects," I could probably explain why the p-value isn't valid for that, specifically.

Jay


Offline jt512

  • Seasoned Contributor
  • ****
  • Posts: 742
    • jt512
Re: p-values and Statistical Significance
« Reply #13 on: Apr 15, 2012, 12:21:48 AM »
Quote
Hi all, I have recently discovered SGU and have been devouring the podcast archive. I love the show.

I noticed that the few times statistical metrics have been brought up on the show, the p-value is usually the go-to statistic to cite. I've taken a basic college stats class so I know a little about p-values, correlation vs causation, null hypotheses, etc. Anyway, I was feeling a little skeptical about the use of p-values, so I did some digging around. What I've found has made me very suspicious of the use of p-values as a sole (or even primary) criterion for characterizing scientific results as significant/insignificant:

1) http://www.tc.umn.edu/~alonso/Goodman_Semin_Hemat_2008.pdf
2) http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1119478/?tool=pmcentrez
3) http://www.uv.es/sestio/TechRep/tr14-03.pdf
4) http://www.stats.org.uk/statistical-inference/Johnson1999.pdf

(Bad citations? Screw you, I'm lazy and this isn't school :P )



Damn good citations, actually.

Jay

 
