So, ultimately I agree that the sample size is inadequate; I wrote as much in my post above. It's just that the reasoning being provided is incorrect. Why is the sample size too small? Because of our prior beliefs. This issue is being danced around (in Khevar's post as well) but not being made explicit.

Khevar, 99.7% is an arbitrary boundary. In fact, it's common in many areas of science to use 95% as the cut-off for significance. Pointing out that 3 more events would have brought the result within the boundary is irrelevant. The fact is, the chance of getting such a low result given a rate of 20% is extremely small. The whole idea of a confidence interval is that it builds in information from both the size of the discrepancy with the proposed model and the sample size. If a measurement lies outside the 99.7% confidence interval for a given hypothesis, that carries the same weight against the hypothesis regardless of the sample size.
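To put a number on "extremely small", here's a minimal sketch of the exact binomial tail probability, using the 400-trial / 52-success counts from the Bayesian example later in this post (the specific counts are mine to plug in; the calculation itself is standard):

```python
import math

# How likely is a result at least this far below expectation,
# assuming the quoted 20% success rate?
# Counts (400 trials, 52 successes) taken from the example below.
n, k, p = 400, 52, 0.20

# Exact binomial lower-tail probability: P(X <= 52 | n=400, p=0.2)
tail = sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))
print(tail)  # well below 0.0015, the one-sided tail of a 99.7% interval
```

The expected count at 20% is 80 with a standard deviation of 8, so 52 successes sits about 3.5 standard deviations below the mean, which is why the tail probability lands so far outside even the 99.7% boundary.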

I also do not know what you mean by saying that it proves that "HE" isn't getting 20%. Are you claiming that the OP somehow has RE generated at different probabilities? Clearly his results are universally applicable. That they don't support the conclusion is another story, and it's because of what you describe as "practical issues", which is in fact a very good name for it. Those practical issues amount to our beliefs about Bioware and their ability to program something in line with their claims.

Let's get quantitative: in the Bayesian framework, we can estimate the relative likelihood of different success rates. Assume that initially we have no bias about what the success rate is; then we can calculate the relative probability of any given success rate from the data alone. I assumed an experiment with 400 trials and 52 successes. The most likely success rate according to this is 13%, and a rate of 13% turns out to be about 1000 times more likely than a rate of 20%. The question is this: before you saw this experiment, how many times more likely did you think a success rate of 20% (the quoted value) was than 13%? If you think it's more than 1000 times more likely (I do), then this experiment should not change your mind. If instead you think the people at Bioware are incompetent and only had a 50-50 chance of getting the rate right, with the other 50% spread uniformly over every possible success rate, you would have considered 20% only about 100 times more likely than 13%, and you would find this experiment persuasive.
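For anyone who wants to check the arithmetic, here's a minimal sketch of that likelihood-ratio calculation. With a flat prior, the posterior ratio of two success rates is just the ratio of their binomial likelihoods, and the binomial coefficient cancels, so only the p^k (1-p)^(n-k) terms matter:

```python
import math

def binom_loglik(p, successes, trials):
    """Log-likelihood of success rate p given the data.
    The binomial coefficient cancels in any ratio, so it is omitted."""
    return successes * math.log(p) + (trials - successes) * math.log(1 - p)

n, k = 400, 52   # trials and successes from the post
mle = k / n      # most likely rate: 52/400 = 0.13

# How many times more likely is a 13% rate than the quoted 20% rate?
ratio = math.exp(binom_loglik(0.13, k, n) - binom_loglik(0.20, k, n))
print(mle)             # 0.13
print(round(ratio))    # a bit under 1000, matching the rough factor above
```

The exact ratio comes out just under 1000, which is why "about 1000 times more likely" is the right order of magnitude to weigh against your prior odds for Bioware's quoted value.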