Comments on The 20% Statistician: 98% Capture Percentages and 99.9% Confidence Intervals

Any way you can re-upload your Excel spreadsheet t...

2017-04-10T16:20:09.968+02:00

Any way you can re-upload your Excel spreadsheet to calculate this? Thanks in advance!

A late reply, but anyway: The interpretation of th...

2016-03-14T14:43:17.582+01:00

A late reply, but anyway: The interpretation of the CI is wrong because the % probability is not about your particular CI, but about what proportion of future CIs contain the true effect size. After all, it is frequentism we are talking about. (by Ingo Rohlfing; the website does not allow me to use my profile.)

Am I missing something or is the case for capture ...

2014-12-29T00:29:22.452+01:00

Am I missing something or is the case for capture percentages actually quite weak?

1) 95% of future CIs will contain the true effect size (so, my CI has a 95% chance of including the *true* effect size)

2) 83% of future effect sizes will be in my CI (so, my CI has 83% chance of including any individual future effect size)

Surely, we are interested in the true effect size, not any individual effect size. We are in the game of investigating human psychology, not predicting each other's stats results.
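The two percentages contrasted in this comment can be checked with a small simulation. This is only a sketch under simplifying assumptions that are not in the thread: normally distributed data, a known standard deviation (so a z-interval rather than a t-interval), and a replication of the same size as the original study. The names `simulate` and `analytic_capture` and all parameter values are made up for illustration.

```python
import random
from statistics import NormalDist


def analytic_capture(level=0.95):
    """Chance that a replication mean lands inside the original CI (known sigma)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # m2 - m1 has standard deviation sqrt(2) * SE, while the half-width is z * SE.
    return 2 * NormalDist().cdf(z / 2 ** 0.5) - 1


def simulate(n_sims=20000, n=30, mu=0.5, sigma=1.0, level=0.95, seed=1):
    """Return (coverage, capture) proportions from simulated pairs of studies."""
    rng = random.Random(seed)
    se = sigma / n ** 0.5
    half_width = NormalDist().inv_cdf(0.5 + level / 2) * se
    covered = captured = 0
    for _ in range(n_sims):
        # Sample means are drawn directly from their sampling distribution,
        # which is exact for normal data with known sigma.
        m1 = rng.gauss(mu, se)  # mean of the original study
        m2 = rng.gauss(mu, se)  # mean of an independent replication
        covered += abs(m1 - mu) <= half_width   # CI contains the true mean
        captured += abs(m2 - m1) <= half_width  # CI contains the replication mean
    return covered / n_sims, captured / n_sims


coverage, capture = simulate()
print(f"coverage: {coverage:.3f}, capture: {capture:.3f}, "
      f"analytic capture: {analytic_capture():.3f}")
```

Under these assumptions the coverage should come out near 95% and the capture proportion near 83%, because the difference between two sample means is more variable than a single sample mean by a factor of the square root of 2.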

I don't like the ad-hoc, black-box approaches ...

2014-12-05T13:17:31.187+01:00

I don't like the ad-hoc, black-box approaches to handling outliers/extreme values, like winsorizing, outlier exclusion or t distributions with heavy tails. If you are really interested in the extreme boundary I would model outliers explicitly with some mixture model. The simplest way to do so would be a bimodal distribution. Of course that means you have to make assumptions about how outliers behave, which is often difficult.

The best way, really, is to treat the value at, say, the 99.9% boundary as the target dependent variable. Something like this: extend model M1: y~bx, where you report the 99.9% CI of b, into M2: y~bx; u~getCI(b,99.9), and report the mean and CI of u. Then you would design an experiment that allows you to estimate u precisely (most probably at the expense of b). In a simple study this may just mean that you mostly target a highly skilled and a very poorly skilled population, or that you use very easy and very difficult items.

It may look to you like this creates an infinite regress, but it really is about what your target variable is, which should be determined by your research question. A meta CI (say, a 99.9% CI of the 99.9th percentile of some variable) is of little theoretical importance. As I said, I can barely imagine any use case for a 99.9% CI.

Hi Matus, thanks for your reply - very good points...

2014-12-04T19:48:11.146+01:00

Hi Matus, thanks for your reply - very good points. Could this be resolved by using robust statistics? I have to learn more about this, but perhaps some winsorized variances might improve things?

I report the mean + 95% interval and use the cater...

2014-12-04T16:44:30.977+01:00

I report the mean + 95% interval and use the caterpillar (eh, is that what it is called?) plot with 50% and 95% intervals.
I think reporting a larger interval than 95% is a bad idea. Actually I sometimes feel bad about 95%. Let me put aside the obvious problem that, for an informed interpretation, you would never, ever be interested in something like a 99.9% interval. I just wish to point out that the extreme CIs are simply beyond the resolution of your data and stat method. Your stat tools come with a certain precision. E.g. if you use bootstrap or MCMC to construct the estimated interval, you will need to get more samples. With 1000 samples your 99.9% interval will depend on the position of the 2 most extreme points. Such a boundary estimate will not be very robust and will vary considerably across replications of your analysis. The problem arises because we usually use distributions with a small amount of probability mass in their tails. That is, the discrepancy in the estimated value at the tail (e.g. between 99.9 and 99.8) will be greater compared to more central regions. Similarly, the noise will have a bigger impact at the tails.
There is another reason for this problem, namely that there will be a discrepancy between the model our analysis assumes and how nature really works. This discrepancy will be larger in parts of the population where we have little data to check the model's assumptions. In the worst case, the estimate of an extreme interval boundary such as 99.9 will just reflect the assumptions behind the model. Or the estimate will reflect arbitrary outliers, which should be properly handled by a different model - as is the case in various outlier detection algorithms.
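The point about resolution can be illustrated with a rough sketch: rerun a percentile bootstrap of the mean many times on the same data and compare how much the upper bound of the 95% and the 99.9% interval moves between runs. Everything here is an assumption made for illustration (simulated normal data, 30 observations, 1000 resamples per run, the function names), and the percentile bootstrap is only one of several ways to build such an interval.

```python
import random
from statistics import fmean, stdev


def bootstrap_upper_bounds(data, rng, n_boot=1000, levels=(0.95, 0.999)):
    """Upper percentile-bootstrap bounds for the mean at each confidence level."""
    means = sorted(fmean(rng.choices(data, k=len(data))) for _ in range(n_boot))
    bounds = {}
    for level in levels:
        p = 0.5 + level / 2  # upper tail quantile, e.g. 0.9995 for 99.9%
        # With 1000 resamples the 99.9% bound is simply the largest bootstrap mean.
        bounds[level] = means[min(n_boot - 1, int(p * n_boot))]
    return bounds


def bound_variability(n=30, n_repeats=60, seed=1):
    """SD of each upper bound across repeated bootstrap runs on the SAME data."""
    rng = random.Random(seed)
    data = [rng.gauss(0.0, 1.0) for _ in range(n)]  # hypothetical sample
    runs = [bootstrap_upper_bounds(data, rng) for _ in range(n_repeats)]
    return {level: stdev([run[level] for run in runs]) for level in (0.95, 0.999)}


sds = bound_variability()
for level, sd in sds.items():
    print(f"{level:.1%} interval: SD of upper bound across reruns = {sd:.4f}")
```

Since the data are held fixed, any spread in the bounds comes purely from the resampling noise of the method, which is the kind of instability the comment describes for the most extreme interval.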

I think pleasing readers who like to see p-values ...

2014-12-02T22:29:16.326+01:00

I think pleasing readers who like to see p-values is (for now) a pretty good reason to also report p-values. Give people time to get used to seeing both, then maybe someday we won't need p-values.
Another common mistake I see in interpreting CIs is people forgetting that values at the edges of a CI are less likely than values in the middle. I like the caterpillar plot suggestion for that reason.

Interesting post, I did not know of coverage perce...

2014-12-02T20:51:00.870+01:00

Interesting post, I did not know of coverage percentages before. I am not sure how much it helps to report p-values with CIs because most people might focus more (or only) on the p-value. Besides, I do not see the added value of a p-value when you also present the CI, except for pleasing readers who like to see p-values. If one were to present information on both, one could easily do this in one plot. The plot would contain the CI and one could use the usual *, **, *** as marker labels right next to each CI.

That's indeed an excellent suggestion for plot...

2014-12-02T15:54:24.087+01:00

That's indeed an excellent suggestion for plots, and might even be a good suggestion for tables! I think it will indeed work in moving the attention of readers to estimation instead of statistical differences from 0.

If you really want to use confidence intervals ins...

2014-12-02T15:37:39.629+01:00

If you really want to use confidence intervals instead of probability/credible intervals, there is nothing stopping you from reporting many different coverages at the same time, and this is especially easy if you use a plot. This is sometimes called a caterpillar plot (an example: http://xavier-fim.net/packages/ggmcmc/figure/xcaterpillar.png.pagespeed.ic.bTv6onD83d.png). Reporting many different coverages at the same time (say, 50%, 90%, 99.9%) would, I believe, put more focus on the actual estimate and less focus on whether an interval crosses zero or not.
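The numbers behind such a plot are easy to compute. The sketch below assumes a normal approximation (estimate plus or minus z times the standard error), which is a simplification and not necessarily what a given plotting package does; the estimate of 0.40, the standard error of 0.15 and the name `nested_intervals` are hypothetical.

```python
from statistics import NormalDist


def nested_intervals(estimate, se, levels=(0.50, 0.90, 0.999)):
    """Normal-approximation intervals for one estimate at several coverages."""
    intervals = {}
    for level in levels:
        z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
        intervals[level] = (estimate - z * se, estimate + z * se)
    return intervals


# Hypothetical effect estimate and standard error, for illustration only.
for level, (low, high) in nested_intervals(0.40, 0.15).items():
    print(f"{level:.1%} interval: [{low:.2f}, {high:.2f}]")
```

Drawing the resulting intervals as nested line segments of decreasing thickness around the point estimate gives the caterpillar-style display described above.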