Reverse Engineering is not 20%

Quellryloth
03.04.2013 , 08:05 AM | #101
Quote: Originally Posted by DarthTHC View Post
The question is, what is "enough times" to prove the 20% out? It's not going to be measured in 100's or even 1,000's. Any statisticians care to answer that?
It depends on how certain you want to be. The uncertainty depends not on the number of tries, but on the number of successes. With 100 successes, one standard deviation is about 10 (or 10%). With 1000 successes, it is around 31.6 (3%). More generally, it goes as roughly the square root of the number of successes. If you know how many standard deviations away you are from the expected value, you can calculate the probability of the expectation being valid.

This is in fact exactly what Darth Sweets did. He had 400-odd tries (i.e. with a 20% chance, the expectation is 80) and 53 successes, so he is a little over 3 standard deviations away. He did the calculation and it works out to about a 3 in 1000 chance of the RE probability being 20%. Like a lot of people have said earlier in the thread, this is improbable, but hardly impossible.

By the way, I tried to record my RE attempts and what I got at the point when I stopped was 26 successes out of 125 attempts (i.e. it's basically dead-on 20%, even with a relatively large uncertainty).
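
For anyone who wants to check the arithmetic, here is a rough Python sketch. It assumes exactly 400 tries (the thread only says "400-odd"), and the helper name z_score is made up for illustration. It uses the binomial standard deviation sqrt(n*p*(1-p)) rather than the square-root-of-successes rule of thumb, so the figures come out slightly different from the rough estimate above.

```python
import math

def z_score(successes, trials, p):
    """Standard deviations between the observed count and the expected count."""
    expected = trials * p
    sd = math.sqrt(trials * p * (1 - p))  # binomial standard deviation
    return (successes - expected) / sd

# Assumed: exactly 400 tries (the post only says "400-odd").
print(z_score(53, 400, 0.20))   # the 53-success test
print(z_score(26, 125, 0.20))   # the 26-of-125 log
```

A value near zero means the log sits close to the expectation; a value beyond about 3 in either direction is the "improbable, but hardly impossible" territory.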

quickNir
03.04.2013 , 12:45 PM | #102
First off, to anybody reading through this thread who doesn't have a strong background in probability and statistics, please do not assume anything written in this thread is correct. Some (a minority) of the people posting here know what they're talking about (like for instance the post just above mine), but there's a lot of junk here. So your best bet is really not to trust anything but read about this stuff somewhere else if it interests you.

The testing technique in the original post is legitimate. I've seen tons of posts claiming that the sample size just isn't large enough, without actually responding to the numbers the OP put up. Most of those posts are simply incorrect. There is no fundamental sample size that you need. To find a discrepancy between a claimed hypothesis and reality, the size of the sample required depends on the size of the discrepancy and how certain you want to be. Sometimes 100 samples is enough.

Suppose I have a coin that comes up heads 100% of the time. How long will it take you to determine my coin is rigged? Not very long. After it comes up heads even just 20 times in a row, you will be very suspicious. After 50 times it's virtually a certainty. If, on the other hand, I have a coin that comes up heads 51% of the time, it will take a very large number of samples to prove anything.
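
To put rough numbers on the coin example, here is a small Python sketch. The function names and the 3-standard-deviation threshold are assumptions chosen for illustration; the second function is a back-of-the-envelope estimate, not a proper power calculation.

```python
import math

def streak_probability(n_heads, p_heads=0.5):
    """Chance that a coin with the given heads rate lands heads n times in a row."""
    return p_heads ** n_heads

def trials_needed(p_null, p_true, z=3.0):
    """Rough number of flips before the expected gap reaches z standard deviations."""
    sd_per_flip = math.sqrt(p_null * (1 - p_null))
    return math.ceil((z * sd_per_flip / abs(p_true - p_null)) ** 2)

print(streak_probability(20))      # fair coin, 20 heads in a row
print(streak_probability(50))      # fair coin, 50 heads in a row
print(trials_needed(0.50, 0.51))   # flips needed to separate 51% from 50%
```

The estimate comes from setting the expected gap n*|p_true - p_null| equal to z*sqrt(n)*sd_per_flip and solving for n, which is why a small discrepancy needs so many more flips than a large one.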

To work out what these sample sizes are, there is no alternative except to crunch the numbers, which the original poster did. You start with the hypothesis that 20% is the probability of RE. You do the experiment. You see how likely it is that you would get the outcome you got or one more extreme, if the probability really were 20%. If this probability is low, you justify discarding the hypothesis that 20% is the true probability.
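
The "outcome you got or one more extreme" step can be sketched in Python as an exact binomial tail. This assumes 53 successes in exactly 400 tries (the thread only gives "400-odd"), and lower_tail is a name invented here. It is a one-sided tail, so it need not match a two-sided "3 standard deviations" figure.

```python
from math import comb

def lower_tail(successes, trials, p):
    """P(X <= successes) for X ~ Binomial(trials, p)."""
    return sum(comb(trials, k) * p**k * (1 - p)**(trials - k)
               for k in range(successes + 1))

# Assumed: 53 successes in exactly 400 tries, hypothesis p = 0.20.
p_value = lower_tail(53, 400, 0.20)
print(p_value)
```

If the printed probability is below whatever cut-off you picked beforehand, the data count against the 20% hypothesis at that level.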

Here is what (in my opinion) is missing from this discussion: Bayesian statistics. Some of the posters are correct in not criticizing the original methodology, but saying that it simply doesn't support the conclusions adequately. Why not? It seems like the confidence interval is pretty convincing. The reason is that 20% isn't just another number. It's the number given to us by the game. Since it's really easy to generate heads randomly (pseudo-randomly, technically) at a 20% level, I tend to suspect that Bioware did not screw this up. In other words I have prior beliefs about the likelihood that 20% is the RE rate, as opposed to other values.

Suppose I flip a coin I find on the street 100 times. I get 75 heads. This is a wildly improbable result for a fair coin. Yet, I will not conclude that the coin I found on the street is unfair. Why? Because if you find a random coin on the street, it is many many many times more likely to be fair than unfair (note that when I say fair, I mean within a small tolerance of 50%, as real coins are). When you work out the math, it results in the final conclusion still being that the coin is likely fair. Because it is more likely that I found a fair coin and had an unusual sequence of flips, than that I found an unfair coin and had a usual sequence of flips.
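
A minimal Python sketch of that calculation follows. The inputs are hypothetical and not from the post: the "unfair" alternative is taken to be a coin that lands heads 75% of the time, and the prior assumes a million fair coins for every such coin. With a weaker prior the conclusion flips, which is the whole point of the argument.

```python
def likelihood_ratio(heads, flips, p_alt, p_null=0.5):
    """How many times better p_alt explains the flips than p_null does."""
    tails = flips - heads
    return (p_alt / p_null) ** heads * ((1 - p_alt) / (1 - p_null)) ** tails

# Hypothetical inputs, not from the post:
#   the "unfair" alternative is a coin that lands heads 75% of the time,
#   and fair coins are assumed to outnumber such coins a million to one.
lr = likelihood_ratio(75, 100, p_alt=0.75)
prior_odds_fair = 1_000_000
posterior_odds_fair = prior_odds_fair / lr

print(lr)
print(posterior_odds_fair)   # above 1 means "fair" is still the better bet
```

Posterior odds are just prior odds divided by how strongly the data favour the alternative, so a strong enough prior can outweigh even a very lopsided run of flips.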

The same applies here. The evidence would be convincing if there was nothing special about 20%. But I have pretty strong prior beliefs about Bioware programmers being able to do something so simple correctly. In other words, despite the evidence, I think it is more likely that Bioware got this right and that your test was a fluke, than that Bioware screwed this up and your test is representative. So I will require much, much stronger evidence before I believe that 20% is not the true rate.

psandak
03.04.2013 , 02:12 PM | #103
@quickNir

After taking a quick refresher in statistics (using wikipedia) and running the OP's numbers, I want to apologize to the OP. His analysis is spot on. His results fall outside the predictable margin of error. I focused on the sample size of his test rather than the analysis of that sample and for that I apologize.

But as you said, his is only one test. And to conclude that the system is broken based on that one test is irresponsible. More testing is required to come to a viable conclusion. So, at the end of the day, to an extent, we are right back where we started... the "sample size" argument in fact still applies.

Khevar
03.04.2013 , 02:45 PM | #104
@quickNir, Being one of the main posters who keeps harping on the sample size being inadequate, I feel the need to defend my position.

Early on (close to a year ago now) my anecdotal experience was suggesting to me that I was getting too few successes for each RE. So I began logging every attempt at REing a green at 20% and every attempt at REing a blue at 10% (I was mainly focusing on Armormech and Synthweaving at the time).

After 200 tries, I was way off of expected results
After 400 tries, it was still off, but better
After 600 tries, I was starting to reach the 20% and 10% respectively

I stopped tracking shortly thereafter, as the more I did the closer I reached 20%/10%. This was adequate for my purposes.
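
The way a running rate wanders early on and settles down later can be sketched with a quick Python simulation. This is simulated data only, not the actual log; the seed, the checkpoints, and the function name running_rates are arbitrary choices for illustration.

```python
import random

def running_rates(p, checkpoints, seed=0):
    """Simulate tries at success chance p; return the observed rate at each checkpoint."""
    rng = random.Random(seed)
    successes = 0
    rates = {}
    for attempt in range(1, max(checkpoints) + 1):
        if rng.random() < p:
            successes += 1
        if attempt in checkpoints:
            rates[attempt] = successes / attempt
    return rates

# Simulated data only; the seed and checkpoints are arbitrary choices.
for tries, rate in running_rates(0.20, [200, 400, 600]).items():
    print(tries, round(rate, 3))
```

Changing the seed gives a different path each time, but the spread around 20% shrinks roughly as one over the square root of the number of tries.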

Please notice the title of this thread is: "Reverse Engineering is not 20%".

The OP's sample size is adequate to prove that HE wasn't getting 20%. It is NOT adequate to prove that everyone else isn't getting 20% either.

This statement that he made:
Quote: Originally Posted by Darth_Sweets
With the mean sample is out of the 99.7 boundary that mean that it is almost impossible that the 20 percent is the true rate of getting a new plan.
At only 400-odd samples in his raw data, it wouldn't take very many successes in a row to bring his data above the 99.7. Cleet_Xia (post 29) pointed out that 3 more successes would bring the results within the boundary.
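
As a rough check on where that boundary sits, here is a Python sketch. It assumes exactly 400 tries at a claimed 20% rate (the real count is only given as "400-odd"), and three_sigma_lower_bound is a hypothetical helper name.

```python
import math

def three_sigma_lower_bound(trials, p):
    """Success count sitting three binomial standard deviations below the expected count."""
    expected = trials * p
    sd = math.sqrt(trials * p * (1 - p))
    return expected - 3 * sd

# Assumed: exactly 400 tries at a claimed 20% rate.
bound = three_sigma_lower_bound(400, 0.20)
print(bound)        # success count at the edge of the 3-sigma band
print(bound - 53)   # how far 53 successes falls short of that edge
```

Under that assumption the edge of the band comes out at about 56 successes, which lines up with the "3 more successes" remark.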

-------------------------------------------

Now, my most significant objection with the sample size is a purely practical one:

If Bioware has, in fact, incorrectly implemented RNG and it is not actually working at the stated 20%, they will need a lot more data than the OP has provided to look into it.

Does anyone really believe that those 400 points of data are "proof" that the 20% is wrong? Really? And that based on this alone the devs should believe it is broken? Really really? Seriously?

If the OP instead was asking for others to log their RE results and add them to this thread, and collectively a broader picture started to form, THAT would be useful. But that's not what happened. Data was gotten; a conclusion was formed; it was briefly defended; and that was it.

quickNir
03.04.2013 , 04:49 PM | #105
So, ultimately I agree that the sample size is inadequate; I wrote as much in my post above. It's just the reasoning being provided that is incorrect. Why is the sample size too small? Because of our prior beliefs. This issue is being danced around (in Khevar's post as well) but not being made explicit.

Khevar, the 99.7 is an arbitrary boundary. In fact, it's common in many areas of science to use 95% as the cut-off for significance. Pointing out that 3 more events would have brought it within the boundary is irrelevant. The fact is, the chance of getting such a low result given a rate of 20% is extremely small. The whole idea of the confidence interval is that it builds in information both from the size of the discrepancy to the proposed model, and the sample size. If a measurement lies outside the 99.7 percent confidence interval for a given hypothesis, that carries the same weight in disproving our hypothesis regardless of the sample size.

I also do not know what you mean by saying that it proves that "HE" isn't getting 20%. Are you claiming that the OP somehow has RE generated at different probabilities? Clearly his results are universally applicable. That they don't support the conclusion is another story, and it's because of what you describe as "practical issues", which is in fact a very good name for it. Those practical issues amount to our beliefs about Bioware and their ability to program something in line with their claims.

Let's get quantitative: if we use the Bayesian framework, we can estimate the relative likelihood of different success rates. Let's assume that initially we have no bias about what the success rate is. We can calculate the relative probability of any given success rate given the data. I assumed an experiment with 400 trials and 52 successes. The most likely success rate according to this is 13%. It turns out a success rate of 13% is about 1000 times more likely than a success rate of 20%. The question is this: before you saw this experiment, how many times more likely did you think a success rate of 20% (the quoted value) was than 13%? If you think it's more than 1000 times higher (I do), then this experiment should not change your mind. If you think that people at Bioware are incompetent and only had a 50-50 chance to get the correct rate, with the other 50 percent being distributed equally at every possible success rate, you would have thought that 20% is only about 100 times more likely than 13%, so you would find this experiment persuasive.
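
The comparison above can be reproduced with a short Python sketch. The 400 trials and 52 successes are from the post; the reading of the "50-50" prior as half the weight on exactly 20% and the rest spread evenly over 100 one-percent-wide alternatives is an assumption made here to get the "about 100 times" figure.

```python
def likelihood_ratio(successes, trials, p_a, p_b):
    """Likelihood of the data under rate p_a divided by that under rate p_b."""
    failures = trials - successes
    return (p_a / p_b) ** successes * ((1 - p_a) / (1 - p_b)) ** failures

# From the post: 400 trials, 52 successes, comparing 13% against 20%.
lr_13_vs_20 = likelihood_ratio(52, 400, 0.13, 0.20)

# Assumption: the "50-50" prior puts half its weight on exactly 20% and
# spreads the rest evenly over 100 one-percent-wide alternatives.
prior_odds_20_vs_13 = 0.5 / (0.5 / 100)

posterior_odds_13_vs_20 = lr_13_vs_20 / prior_odds_20_vs_13
print(lr_13_vs_20, prior_odds_20_vs_13, posterior_odds_13_vs_20)
```

The binomial coefficient cancels in the ratio, so only the success and failure counts matter. Posterior odds above 1 favour 13%; raising the prior odds for 20% past the likelihood ratio pushes them back below 1.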

Khevar
03.04.2013 , 07:58 PM | #106
Just as a comment, I do appreciate this level of discourse.

I think I should clarify the point I made about HE vs EVERYONE getting the 20%.

When I did my own personal logging of tests, my end result after >600 samples was, as I recall, ~19% for the green-blue and ~11% for the blue-purple.

So my tests came up with a very different result than Darth_Sweets. Here are some of the possible conclusions that can be drawn:

1. Between the time of my test and Darth_Sweets' test, the underlying RNG code changed.
2. My test contained inaccurate raw data.
3. His test contained inaccurate raw data.
4. One test or the other (or both) contained insufficient data to draw a conclusion.

quickNir
03.05.2013 , 12:39 AM | #107
Good, I enjoyed it as well! I think your four options are a little too black and white, but I basically agree that 4 is the solution.

tanktest
03.05.2013 , 03:30 PM | #108
Quote: Originally Posted by criminalheretic View Post
Everybody in this thread - "Math math math math math, I'm smart, math math math, you're wrong, math math"

The only answer that matters:

Whenever you have RNG, people "feel" like they come out on the wrong side of the equation. People being mad at your game all the time = bad. Implement a system where you have a specific # of attempts to achieve a schematic. You have to RE 5 greens to get a blue, and 10 blues to get a purple. I'll gladly trade the "joy" of getting a schem on my 3rd attempt, to rid myself of the frustration of not getting one by the 37th, and then we are still making the investment in creds/mats/time etc.
I don't know, isn't that the fun of it all? Seeing those big green letters saying you know it... even after you spend 6 mill for one schematic, to me it was worth it...

Naej
03.05.2013 , 08:17 PM | #109
I'm trying to dig out the info, but I'm pretty sure the devs have stated that RE chance is modified by your level of crafting skill VS the item's actual level. Which means that you simply wouldn't have enough info to claim anything at all, since we don't know how the initial 20% base rating would be affected.

For all we know, you could be at 20% base - 5% mod. Meaning your 14ish% would be right within target.
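
Just to show what that would look like, a Python sketch under loudly stated assumptions: the 5-point modifier is purely hypothetical (no dev post confirming it has been linked), the test is taken as 53 successes in exactly 400 tries, and z_score is an illustrative helper.

```python
import math

def z_score(successes, trials, p):
    """Standard deviations between the observed count and the expected count."""
    expected = trials * p
    sd = math.sqrt(trials * p * (1 - p))
    return (successes - expected) / sd

# Purely hypothetical: a 20% base chance reduced by a 5-point modifier.
print(z_score(53, 400, 0.15))   # against the hypothetical 15% rate
print(z_score(53, 400, 0.20))   # against the stated 20% rate
```

A result within about one standard deviation of the 15% expectation would be unremarkable, but that says nothing about whether such a modifier actually exists.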

I'll post link of the Dev post ASAP.
Naej - 50 Jedi Guardian / Eljian - 50 Gunslinger [AWAKEN - PotF]
Furlone - 38 Vanguard / Jaen - 45 Jedi Shadow / Florune - 11 Mercenary / Jyang - 11 Operative

Khevar
03.06.2013 , 10:02 AM | #110
Quote: Originally Posted by Naej View Post
I'm trying to dig out the info, but I'm pretty sure the devs have stated that RE chance is modified by your level of crafting skill VS the item's actual level. Which means that you simply wouldn't have enough info to claim anything at all, since we don't know how the initial 20% base rating would be affected.

For all we know, you could be at 20% base - 5% mod. Meaning your 14ish% would be right within target.

I'll post link of the Dev post ASAP.
Hmm. I've been following crafting posts for a long time and have never seen any dev post like that.

Are you sure you aren't thinking of the green/yellow/orange gathering missions? Those have different success percentages.