Reverse Engineering is not 20%


Khevar
02.12.2013 , 09:18 PM | #31
Quote: Originally Posted by NotRonin View Post
You do not need 10,000 data points. 400 is more than enough.

..
400 data points is not enough.

I don't seem to be able to communicate my point, so I've just pulled out my dice roll modeling program to try and help (2 six-sided dice).

In the real world, the odds of rolling a seven are 16.67% (6 in 36). The odds of rolling a twelve are 2.78% (1 in 36). For this test, I ran 4 separate sets of 400 iterations each.

Set 1:
Hit seven 20% of the time
Hit twelve 1.5% of the time

Set 2:
Hit seven 16% of the time
Hit twelve 4.25% of the time

Set 3:
Hit seven 13% of the time
Hit twelve 2.5% of the time

Set 4:
Hit seven 20.5% of the time
Hit twelve 1% of the time

With only 400 data points per set, are you able to tell if my RNG program is written correctly or not?
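
For anyone who wants to try this at home, here is a minimal Python sketch of the same kind of experiment. It is not the program used for the numbers above; the function names are made up for illustration, and the 3-sigma band relies on the usual normal approximation to a binomial count. For a seven (6 in 36) at 400 rolls, that band works out to roughly 11% to 22%, so all four sets above sit inside it.

```python
import math
import random


def roll_2d6_frequencies(n_rolls, rng):
    """Roll two six-sided dice n_rolls times; return the observed share of 7s and 12s."""
    sevens = 0
    twelves = 0
    for _ in range(n_rolls):
        total = rng.randint(1, 6) + rng.randint(1, 6)
        if total == 7:
            sevens += 1
        elif total == 12:
            twelves += 1
    return sevens / n_rolls, twelves / n_rolls


def three_sigma_band(p, n):
    """Normal-approximation band: p plus or minus 3 standard errors for n trials."""
    se = math.sqrt(p * (1 - p) / n)
    return p - 3 * se, p + 3 * se


if __name__ == "__main__":
    rng = random.Random()  # unseeded, so every run gives different sets
    for set_no in range(1, 5):
        seven_rate, twelve_rate = roll_2d6_frequencies(400, rng)
        print(f"Set {set_no}: seven {seven_rate:.2%}, twelve {twelve_rate:.2%}")
    low, high = three_sigma_band(6 / 36, 400)
    print(f"3-sigma band for a seven at n=400: {low:.1%} to {high:.1%}")
```

Raising the 400 to a few thousand shows how slowly the band narrows: the width shrinks with the square root of the sample size.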

AshlaBoga
02.13.2013 , 02:01 AM | #32
Quote: Originally Posted by Khevar View Post
400 data points is not enough.

I don't seem to be able to communicate my point, so I've just pulled out my dice roll modeling program to try and help (2 six-sided dice).

In the real world, the odds of rolling a seven are 16.67% (6 in 36). The odds of rolling a twelve are 2.78% (1 in 36). For this test, I ran 4 separate sets of 400 iterations each.

Set 1:
Hit seven 20% of the time
Hit twelve 1.5% of the time

Set 2:
Hit seven 16% of the time
Hit twelve 4.25% of the time

Set 3:
Hit seven 13% of the time
Hit twelve 2.5% of the time

Set 4:
Hit seven 20.5% of the time
Hit twelve 1% of the time

With only 400 data points per set, are you able to tell if my RNG program is written correctly or not?
^^^^^^^
400 isn't enough. You need 2k+ to get an accurate result.

psandak
02.14.2013 , 11:08 AM | #33
Quote: Originally Posted by NotRonin View Post
You do not need 10,000 data points. 400 is more than enough.

Whenever someone mentions random numbers, you'll get all these people with no understanding of probability coming out and saying 'you need a bigger sample size'. You don't. The bigger issue, however, is that 20% is the mean; it doesn't tell you anything about the "distribution". Suppose the 'random number generator' goes like this: fail for the first 800 tries, and succeed for the next 200. The mean is still 20%, but it's not evenly distributed over time.

The 99.7% confidence interval the OP computed from their data only applies if the distribution is random and even. Most programs that use the default 'random' function suffer from this. This is why, when you run something in Python or Excel, you will get a lot more 'streaks' than the theory indicates.

True randomness is hard to do; most implementations use 'pseudorandom' numbers. For instance, you can start off with 1000 numbers in a box, scramble them up, and then pick them out one at a time until no number is left. What number you get will then depend on which numbers have already been picked out, and you're prone to more 'streaks' because of it.
Your last paragraph is not relevant to the argument at hand. RE is more like putting numbers 1 through 100 into a hat, shaking it up, drawing a number, annotating that result, putting the number back in the hat, shaking it again, drawing another number, annotating that result and repeating again and again and again for X number of attempts. No single attempt affects the result of any other attempt. Every time you draw a number the chance of drawing a given range of numbers (say 1-20 ) remains constant.

That being said, you are correct that no random number generator is ever 100% truly random, but that is only because computers cannot truly randomize numbers. Computerized RNGs are biased but the only way to truly find the bias is to run millions upon millions of attempts...400 attempts does not even come close.
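
The difference between the two hat models can be sketched in a few lines of Python. This is only an illustration under assumed names (`draw_with_replacement`, `draw_without_replacement`) and an assumed win range of 1-20 out of 100; it says nothing about how the game actually rolls RE.

```python
import random


def draw_with_replacement(n_draws, rng, hat_size=100, win_max=20):
    """The number goes back in the hat each time, so every attempt
    has the same win_max / hat_size chance regardless of history."""
    return sum(1 for _ in range(n_draws) if rng.randint(1, hat_size) <= win_max)


def draw_without_replacement(n_draws, rng, hat_size=100, win_max=20):
    """Numbers are not returned, so earlier draws change the odds of
    later ones. n_draws is expected to be at most hat_size."""
    hat = list(range(1, hat_size + 1))
    rng.shuffle(hat)
    return sum(1 for number in hat[:n_draws] if number <= win_max)


if __name__ == "__main__":
    rng = random.Random()
    # Emptying the hat always yields exactly 20 wins out of 100.
    print("without replacement:", draw_without_replacement(100, rng))
    # Independent draws average 20 wins out of 100 but vary run to run.
    print("with replacement:   ", draw_with_replacement(100, rng))
```

Only the second model has memory: once the hat is emptied, the win count is fixed at 20. With replacement, each attempt stands alone, which is the situation described above.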

Linkuramaz
02.14.2013 , 06:03 PM | #34
That doesn't prove it's not 20%. You are just unlucky.

Lodril
02.14.2013 , 06:34 PM | #35
I love how every time someone starts discussing the statistics, people trot out the same arguments about the invalidity of observation.

Personally, I have nothing to say about the statistics (I'm terrible at math), but I love the logic that any reasoning based on observable data cannot be valid, because that data is just a subset of a larger pool of unobservable data. The beauty of it is that it's typically accompanied by the argument that 20% means that each instance is at 20%, and multiple attempts cannot change the outcome because the effects of the attempts are not cumulative... except apparently when they are, across whatever the maximum number of attempts is.

It's a hilarious logical fallacy that doesn't really involve math at all, but comes up every single time as if it were some secret counter-intuitive knowledge of statistics proudly trotted out for the occasion. It's a scientifically framed argument that is utterly anti-science, since it requires faith in the presented prediction because it argues that the data can only be understood based on unknowable quantities whose existence is only framed by the presented prediction, and therefore must be presumed to support it.

Khevar
02.14.2013 , 07:53 PM | #36
@Lodril, Quite the philosophical analysis of RE threads.

Dare I ask, do you hold a viewpoint of:

1. The RE tooltip percentage (20%) is probably accurate.
2. The RE tooltip percentage (20%) is probably incorrect.
3. Don't care; just poking fun at forum posters.

Darth_Sweets
02.14.2013 , 09:21 PM | #37
Quote: Originally Posted by Khevar View Post
400 data points is not enough.

I don't seem to be able to communicate my point, so I've just pulled out my dice roll modeling program to try and help (2 six-sided dice).

In the real world, the odds of rolling a seven are 16.67% (6 in 36). The odds of rolling a twelve are 2.78% (1 in 36). For this test, I ran 4 separate sets of 400 iterations each.

Set 1:
Hit seven 20% of the time
Hit twelve 1.5% of the time

Set 2:
Hit seven 16% of the time
Hit twelve 4.25% of the time

Set 3:
Hit seven 13% of the time
Hit twelve 2.5% of the time

Set 4:
Hit seven 20.5% of the time
Hit twelve 1% of the time

With only 400 data points per set, are you able to tell if my RNG program is written correctly or not?
Once again, I will state that you're dealing with a joint distribution, which is very different from the binomial one we are talking about.

Darth_Sweets
02.14.2013 , 09:25 PM | #38
Quote: Originally Posted by DataBeaver View Post
I generated ten random series of a thousand samples each with a simple Python script, and here's the result:

http://snag.gy/JPFI4.jpg

Apologies for the horrible colors. Notice how one of the lines is slightly below the 99.7% confidence interval for a while, before eventually climbing back up? Exactly like you are seeing in your experiment.

While 99.7% may seem like an impressive number, it only covers 997 of every 1000 cases. The remaining three fall outside it. And there are many thousands of players on each server; maybe even tens of thousands. So it's far from impossible for someone to have such a streak of bad luck.
I am working on coming up with a more stringent test but the statistics expert at work doesn't think there would ever be a need for it, but in theory it is possible.
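
The "three in a thousand across many players" point from the quote can be put into numbers with a short Python sketch. It assumes every player's series is independent and that the 99.7% band covers exactly 99.7% of them; the function names and the player counts are made up for illustration.

```python
def expected_outliers(n_players, coverage=0.997):
    """Average number of players whose results land outside the band."""
    return n_players * (1 - coverage)


def prob_at_least_one_outlier(n_players, coverage=0.997):
    """Chance that at least one of n_players independent series
    falls outside the band."""
    return 1 - coverage ** n_players


if __name__ == "__main__":
    for players in (1, 100, 1000, 10000):
        print(
            players,
            round(expected_outliers(players), 2),
            round(prob_at_least_one_outlier(players), 4),
        )
```

With ten thousand players, the sketch gives about 30 expected outliers, so someone posting an unlucky streak is close to guaranteed even if the tooltip is exactly right.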

Darth_Sweets
02.14.2013 , 09:26 PM | #39
Quote: Originally Posted by Quellryloth View Post
We know exactly what standard deviation is. In fact, we know everything about this distribution (at these sample sizes it is well approximated by what is called a normal distribution) because this is the classic high school binomial distribution problem. But in fact, even if it were not so simple, a lot of other scenarios (almost all that anyone cares about) converge to the same thing. Read up on the central limit theorem if you want to know why.

By the way, here are my results so far: 74 tried, 15 successes.
Thank you. I will add your data to what I have and post an update.
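
As a rough check of the 74 tries / 15 successes sample against a 20% rate, here is a small Python sketch. It assumes independent attempts with a fixed success chance (a plain binomial model), the helper names are hypothetical, and `math.comb` needs Python 3.8 or newer.

```python
import math


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent tries."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_p_value(successes, trials, p):
    """Exact two-sided binomial p-value: the total probability of all
    outcomes that are no more likely than the observed one."""
    observed = binom_pmf(successes, trials, p)
    cutoff = observed * (1 + 1e-9)  # small tolerance for float ties
    return sum(
        binom_pmf(k, trials, p)
        for k in range(trials + 1)
        if binom_pmf(k, trials, p) <= cutoff
    )


def z_score(successes, trials, p):
    """Normal-approximation z-score of the observed rate against p."""
    se = math.sqrt(p * (1 - p) / trials)
    return (successes / trials - p) / se


if __name__ == "__main__":
    print("observed rate:", 15 / 74)
    print("z-score vs 20%:", z_score(15, 74, 0.2))
    print("two-sided p-value:", two_sided_p_value(15, 74, 0.2))
```

The z-score comes out at about 0.06, far inside any reasonable band, so this sample is fully consistent with 20%. On its own it is also too small to rule out nearby rates such as 15% or 25%.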

Timonius
02.14.2013 , 10:10 PM | #40
Regardless of how the math is done, I think the RE numbers need a boost, say to 30% for green to blue and 15% for blue to purple. It's bad enough that time, money, and materials are wasted on green to blue for a presumably negligible market.

Oh well - it's time they did a major patch revision focusing on crafting/RE'ing.