
Reverse Engineering is not 20%


Darth_Sweets


After going through too many materials, I decided to check the reverse engineering rate for going from greens to blues, which the in-game tooltip states is 20%. I have recorded the number of reverse engineering attempts and successes I have had going from green to blue; I have not counted the times where I RE'd an item with no chance of gaining a plan.

 

Reverse Engineering Attempts - Successes

9 - 1
15 - 2
5 - 1
6 - 1
1 - 1
5 - 0
5 - 1
5 - 0
10 - 1
19 - 5
15 - 4
8 - 2
10 - 2
10 - 2
1 - 1
24 - 7
10 - 0
7 - 2
5 - 0
9 - 2
10 - 1
5 - 1
5 - 0
10 - 1
10 - 0
10 - 1
10 - 1
10 - 0
4 - 1
15 - 0
9 - 2
10 - 1
5 - 0
5 - 0
5 - 0
10 - 1
5 - 1
10 - 1
10 - 0
5 - 0
2 - 1
5 - 0
10 - 0
7 - 1
13 - 2
8 - 1
15 - 2

 

This gives a 13.4 percent success rate overall (54 successes in 402 attempts). Using a standard confidence calculation with a 99.7 percent boundary: if 20 percent is the true rate, as stated in the tooltip, the sample rate above should fall between 14.3 and 25.7 percent. Since the observed rate falls outside the 99.7 percent boundary, it is almost impossible that 20 percent is the true rate of getting a new plan.
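A minimal sketch of this three-sigma check on a binomial proportion, using totals tallied from the table above (the 14.3 to 25.7 bounds quoted above appear to assume a slightly different sample count, so the computed band differs a little):

from math import sqrt

p = 0.20        # tooltip rate under test
n = 402         # total attempts in the table
successes = 54  # total successes in the table

# Standard deviation of the observed proportion if the true rate is p
sigma = sqrt(p * (1 - p) / n)

print(f"observed rate: {successes / n:.1%}")                  # 13.4%
print(f"99.7% band: {p - 3*sigma:.1%} to {p + 3*sigma:.1%}")  # ~14.0% to ~26.0%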



That test proves absolutely nothing. It's the same crap as in WoW, where people claimed someone was using roll hacks whenever they rolled 100 twice in a row. The chance of that happening is 1 in 10,000, yet it happened all the time. If you have thousands of people REing stuff, some people will get the short end of the stick, and some people will be extremely lucky.

How did you calculate the standard deviation?

 

Maybe I am misunderstanding you. The three sigma rule says that in a normal distribution approximately 99.7% of the values will fall within three standard deviations (three sigmas) of the mean, so I assume that is what you are talking about.

 

You performed a test involving four-hundred-and-some rolls, and you had a success rate of 13.4%. If you are trying to claim that this is so far outside the parameters of an expected distribution that we should conclude the system is broken, you need to know how much variance we expect there to be in a distribution of trials with four-hundred-and-some rolls to begin with. How did you go about calculating that?

 

By the by, unless I am totally missing the point this kind of confidence interval calculation does not seem to be a very good way to make an argument about probability.


The usual gambler's fallacy...

 

"Again, the fallacy is the belief that the "universe" somehow carries a memory of past results which tend to favor or disfavor future outcomes."

 

It's a 20% chance each time you RE an item; it's not a 20% rate across all of your RE attempts.


It's a 20% chance each time you RE an item; it's not a 20% rate across all of your RE attempts. ...

 

QFT. The same argument came up in LOTRO about percentage chances. It's based on that one attempt, not on the percentage of all your attempts.


How did you calculate the standard deviation? ...

 

Actually, the confidence interval calculation is what I use at work to validate math models against test results for the products we build. At my job, if test data falls outside the 90 percent confidence interval, we say it fails to validate the model. In this case the programmers are telling us 20 percent is the outcome we should see. As for how it is computed, I used what we use at work; it also agrees with my college text, and I see similar things on Wikipedia as well.

 

As for the people complaining that this is just an RNG "thing": the point of a confidence interval test is to define a band of results you can expect to see from a set of sample tests that are all independent of one another.


I am a mathematician, and I wondered about the reverse engineering probability. The probability of not getting a schematic on any single attempt is 0.8, but the probability of a whole streak of failed attempts shrinks with each failure: the more failed attempts in a row, the more unlikely that streak becomes. There are a lot of rare events happening in SWTOR.
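In other words, the chance of a failure streak falls off geometrically with its length; a quick sketch:

# Each attempt independently fails with probability 0.8, so a streak of
# n straight failures has probability 0.8 ** n
for n in (5, 10, 15, 24):
    print(f"{n:2d} straight failures: {0.8 ** n:.1%}")
# 5: 32.8%, 10: 10.7%, 15: 3.5%, 24: 0.5%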

Actually, the confidence interval calculation is what I use at work to validate math models ...

Your sample size is too small.

 

If you've ever applied math to gambling with dice or roulette and tried to calculate optimal gambling strategies (I have), you should know that you need a much larger sample size to have any confidence in your results.

 

Applying standard deviation to "the products you build" is NOT the same as applying standard deviation for gambling results.

Edited by Khevar

Your sample size is too small. ...

 

Actually, the confidence interval test is valid with almost any sample size, since the tightness of the bounds changes with the number of samples used. As for calculating dice rolls, that is standard stuff done in any class (even in high school). As for the sample standard deviation: I am not using it; I am using the standard deviation implied by the rate the developers are telling us we should see.

 

Since it seems to be an issue: I am a flight controls engineer with a master's in engineering, and my specialty is Kalman filtering, which is just a fancy statistical tool. I spend my days making math models of IMUs and then validating those models against test results for autonomous vehicles, missiles, and rockets.


Hmm... there doesn't seem to be anything wrong with what you did, but it is possible that you were just very, very unlucky. I am going to RE a bunch of stuff with two of my characters in the next few days. When I get to 200 items RE'ed, I will come back to this thread and report.

Actually the confidence interval test is valid with almost any sample size ...

This is incorrect. Sample size affects the confidence interval test.

 

Going back to gambling, rolling a hard 10 hop is a one in 36 chance.

 

And yet, it is not hard to sit at a craps table and roll the dice 400 times and only see a single hard 10. It's just too small a sample size.

 

In modeling statistics of purely random events (e.g. a dice roll) it requires a very large sample size for any deviation to have any real value.

 

Your post title claims "Reverse Engineering is not 20%". That conclusion is WAY too broad for the data you present.

Edited by Khevar

This can all be solved via a simple term: RNG. Just because it says 20% doesn't mean you'll RE 5 items and get a blue schematic, and so on. I've gotten the schematic on the first RE. Sometimes I've had to do 15+ to get it. Just unlucky.

 

pretty sure OP knows a lot more about probability and RNG than you


Actually the confidence interval calculation is what I us at work to validate math models with test results for products we build. At my job we say that if a test data falls outside the 90 percent confidence interval we say that it fails to validate the model.
Right, but are you modeling probabilistic events?

 

Confidence intervals are useful when we have a definite hypothesis and definite test results. They allow us to represent things like our level of confidence that our measurements are accurate, and how confident we are that our results point to an actual phenomenon and not just statistical noise.

 

Confidence intervals are also useful when we have definite results from a subset of some larger population and we want to extrapolate from them. We know with complete certainty how people in exit polls voted. Confidence intervals allow us to represent our confidence that these exit polling numbers are an accurate reflection of all the ballots cast.

 

In your case, it sounds like confidence intervals allow you to represent the confidence with which you can say that a rocket landed where your model said it would because the model is right, and not because of measurement error or expected variability or whatever.

In this case the programers are telling us 20 percent is the outcome we should see.
No, they aren't. I think that is precisely the problem here.

 

There is a very important difference between saying that something has a 20% chance to happen and saying that something will happen 20% of the time.

 

For the sake of illustration, let's say that we are going to reverse engineer five items, each with a 20% chance to teach us a new schematic. A simple probability table gives us the following percentage chances for each of the six possible outcomes:

 

0/5 Successfully teach us a new schematic = 32.77%
1/5 Successfully teach us a new schematic = 40.96%
2/5 Successfully teach us a new schematic = 20.48%
3/5 Successfully teach us a new schematic = 5.12%
4/5 Successfully teach us a new schematic = 0.64%
5/5 Successfully teach us a new schematic = 0.032%
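Those percentages are just the binomial distribution with n = 5 and p = 0.2; a quick sketch to reproduce them:

from math import comb

n, p = 5, 0.2
for k in range(n + 1):
    print(f"{k}/{n} successes: {comb(n, k) * p**k * (1 - p)**(n - k):.2%}")
# 0/5: 32.77%, 1/5: 40.96%, 2/5: 20.48%, 3/5: 5.12%, 4/5: 0.64%, 5/5: 0.03%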

 

We don't have a definite hypothesis because our model is probabilistic. Our chance of learning exactly one new schematic from five reverse engineering attempts (a perfect 20% success rate) is less than half. It is the most probable of the six outcomes - it should occur with greater frequency than any other individual outcome - but it is substantially less probable than all the other outcomes put together.

 

If your friend said to you "I am going to reverse engineer five items. I bet you I learn exactly one new schematic, no more and no less," you would be smart to bet against him. Your odds of winning are roughly 3:2; if you and your friend made the same bet over and over, you ought to win about one and a half times as often as you lose.

 

It is not impossible to make an argument about probability by way of confidence intervals, but I think it is kind of a clunky way to do it.

 

Common sense tells us that if we perform four trials, and in each of the four trials we learn five new schematics from five reverse engineering attempts (results that we would expect to occur with a frequency of about 1/3000), we can have a high degree of confidence in our inference that something is probably biasing the results.

 

Things get a lot trickier when the results are less extreme though. How many times do you have to flip heads before you conclude that your coin is not working correctly?

 

Is a 13.4% success rate over 400+ trials evidence enough to conclude that the system is not working as it is supposed to? The answer to that question depends entirely on how much variance normally exists among measurements like this. The 99.7% rule says that 99.7% of all values in a normal distribution fall within three standard deviations of the mean. In other words, any data point which is more than three standard deviations from the mean is an extreme outlier and extremely unlikely to occur by mere chance. But to say that your results are three standard deviations from the mean, you need to know what the standard deviation for tests like this is. I have no idea what that would be, but my intuition suggests that 13.4% is probably within three of them.

 

This is admittedly outside my area of expertise, but if you do not know how much statistical variance normally exists across trials of that size I do not think it is even possible to make a meaningful claim about the significance of your results using a confidence interval calculation.

Edited by Kaskali

Right, but are you modeling probabilistic events? ...

 

Look, I don't understand what your problem here is, so I will try to make it clear. You are trying to define the probability of an outcome across a number of events taken together. If we defined it that way, then different outcomes would have different probabilities of occurring; that is true.

 

The point you're missing is that I am looking at each event individually: either I get a plan or I don't. This is called a binomial sequence, or binomial process. In a binomial process the observed rate of success should converge to the stated probability of each individual event, in this case 20 percent. Now, since we can only do so many tests, we need a way to check whether the test set matches the stated probability of an individual event. To do this you compute the confidence interval for the given sample size (the interval changes based on the number of tests, just as you would expect). For the number of tests I have done, the success rate is outside of the confidence interval. I know this is true. Being outside this interval means the 20 percent rate BioWare is telling us is WRONG; if 20 percent were the true rate, there would be only a 0.3 percent chance of my sample falling outside the confidence interval. So this post was meant to do one of two things.

 

1) Someone else generates their own test set and sees if they get the same results. If they do, that is just more reason that number two needs to happen.

 

2) The developers have said they like it when we back up problem reports with data when we can. I am telling them that the math says there is a problem.

 

I don't understand what you think, but I know what I have done and what it means. I also know how easy it would be for BioWare not to have tested this enough. In fact, if you look at my sample you will see that I hit a 20 percent rate at one point and I thought that life was good, but that rate didn't hold and things dropped off. The link should take you to a plot I have made.

 

http://www.flickr.com/photos/10439526@N08/8463791003/in/photostream

 

In fact, if you look at the plot you will see that the success rate hit 20 percent at one point. The red is the confidence interval and the blue is the running rate at which I got a plan from reverse engineering my greens.

Edited by Darth_Sweets

I generated ten random series of a thousand samples each with a simple Python script, and here's the result:

 

http://snag.gy/JPFI4.jpg

 

Apologies for the horrible colors. Notice how one of the lines is slightly below the 99.7% confidence interval for a while, before eventually climbing back up? Exactly like you are seeing in your experiment.

 

While 99.7% may seem like an impressive number, it only covers 997 of every 1000 cases. The remaining three fall outside it. And there are many thousands of players on each server; maybe even tens of thousands. So it's far from impossible for someone to have such a streak of bad luck.
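For reference, a minimal sketch of the kind of script described above (a reconstruction, not the original; it prints running rates instead of plotting them):

import random

p, n_series, n_samples = 0.2, 10, 1000

for s in range(n_series):
    successes, rates = 0, []
    for i in range(1, n_samples + 1):
        successes += random.random() < p   # one simulated RE attempt
        rates.append(successes / i)        # running success rate so far
    # Early stretches can sit well below 20% before climbing back up
    print(f"series {s}: after 100 = {rates[99]:.1%}, after 1000 = {rates[999]:.1%}")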


But to say that your results are three standard deviations from the mean, you need to know what the standard deviation for tests like this is. I have no idea what that would be, but my intuition suggests that 13.4% is probably within three of them.

We know exactly what the standard deviation is. In fact, we know everything about this distribution, because this is the classic high school binomial distribution problem (and for samples this large it is well approximated by a normal distribution). Even if it were not so simple, a lot of other scenarios (almost all that anyone cares about) converge to the same thing. Read up on the central limit theorem if you want to know why.

 

By the way, here are my results so far: 74 tried, 15 successes.


@Darth_Sweets, I hope that you recognize that I'm NOT trying to defend the 20% tooltip as correct. I'm also not trying to say that your data is incorrect.

 

I'm simply saying that your conclusion is premature because your sample size is TOO DAMN SMALL.

 

You're dealing with a random number generator. As a software developer, I can say with confidence that implementing RNG is very easy. You can pick a simple RNG that uses few calculations, or you can do a more complex calculation with crypto RNG functions. Every language has a toolset that provides for this.

 

Using such an RNG in your code is even easier. Example:

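// Assumption: GenerateValue() returns a uniform random value on the same scale as ChanceForSuccess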
if ( rng.GenerateValue() <= schematic.ChanceForSuccess ) {
   schematic.Success = true;
}

I'm not saying BioWare implemented RNG correctly; I'm just saying that implementing a functioning RNG solution is very easy to do right.

 

Now, if you want to validate any sort of RNG, you need to take a large enough sample size. The larger your sample size, the closer you should be to the expected results, and if there are deviations, you have enough data to present your case.

 

A few hundred tests isn't enough. Even 1,000 tests may not be enough. When modeling craps betting strategies and dice roll patterns, I had to get up to 10,000 or more iterations before I was seeing consistent results.
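To illustrate the point, a small simulation sketch (mine, not the poster's craps code) showing how the spread of observed rates narrows as the trial count grows, assuming a true 20% chance:

import random

def observed_rate(n, p=0.2):
    # Success rate over n simulated attempts with per-attempt probability p
    return sum(random.random() < p for _ in range(n)) / n

for n in (100, 1000, 10000):
    rates = [observed_rate(n) for _ in range(200)]
    print(f"n={n:5d}: min={min(rates):.1%}, max={max(rates):.1%}")
# The min-to-max spread shrinks from tens of percentage points at n=100
# to roughly a single point at n=10000.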


You're dealing with a random number generator. As a software developer, I can say with confidence that implementing RNG is very easy.

 

I can tell you that you have never made a random number generator; if you had, you wouldn't make that statement, because there is no such thing as a true random number generator. They all have a bias; the only question is whether you can live with the inherent bias.


I can tell you that you have never made a random number generator ...

Try to reduce the snark. You misread my statement. I said "implementing RNG is very easy."

 

The reason I said that is that almost every programming language has a number of built-in RNG functions. And if these are not adequate for your needs, there are a number of publicly available crypto RNG functions which are more robust, easily available, and have a much less predictable pattern.

 

I have built a program using computerized RNG to roll 2 six-sided dice and tally up how often different results came up. To do this, I didn't have to write my own RNG; I used one that was already available. It's a safe bet that rather than rewrite an RNG from scratch, the Hero engine uses an existing package. Therefore, it is easily implemented.

 

In my craps program (above) I compared these results to the published and known probabilities of craps rolls. The larger the sample size, the closer the computer RNG came to the published probability.

 

What does this mean? Given a large enough sample size, computerized RNG can match actual probability.
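A minimal sketch of that kind of dice check (my reconstruction using a stock library RNG, not the poster's program):

import random
from collections import Counter

rolls = 100_000
tally = Counter(random.randint(1, 6) + random.randint(1, 6) for _ in range(rolls))

# Exact chance of each total out of the 36 equally likely two-die combinations
ways = {total: 6 - abs(total - 7) for total in range(2, 13)}
for total in range(2, 13):
    print(f"{total:2d}: simulated {tally[total] / rolls:.3f}, exact {ways[total] / 36:.3f}")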

 

This point, which you have yet to acknowledge, is 100% applicable here. Why is it applicable? Because SW:TOR is a computer game, and is using some sort of software-based RNG to calculate reverse engineering success.

 

If you are trying to posit "Reverse Engineering is not 20%" you need a larger sample size. Savvy?

Edited by Khevar

Try to reduce the snark. You misread my statement. I said "implementing RNG is very easy." ...

You are mixing up implementing an RNG with using it. It's easy to use an existing RNG to implement other algorithms that need randomness. However, implementing the RNG itself can be quite difficult.

 

There are some types that seem deceptively easy (LFSR and LCG come to mind), but they need careful selection of parameters or they'll produce very poor quality randomness. There are some others which involve more complex math but produce very high quality of randomness (like Mersenne twister). There are yet others that produce cryptographically secure random numbers but are computationally more expensive and thus unsuitable for high-throughput applications. A small error in the implementation of any RNG can significantly reduce its quality, so in most cases it's indeed best to use an existing implementation.
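As a concrete illustration of the parameter-sensitivity point: the two generators below share the same two lines of code and differ only in their constants, yet one is serviceable and the other is the textbook failure case.

def lcg(seed, a, c, m):
    # Linear congruential generator: x -> (a*x + c) mod m, scaled to [0, 1)
    x = seed
    while True:
        x = (a * x + c) % m
        yield x / m

# Full-period, reasonable-quality constants (from Numerical Recipes)
good = lcg(1, a=1664525, c=1013904223, m=2**32)

# RANDU: identical construction, infamous constants; consecutive triples
# (x[i], x[i+1], x[i+2]) all fall on just 15 planes in the unit cube
bad = lcg(1, a=65539, c=0, m=2**31)

print([round(next(good), 4) for _ in range(5)])
print([round(next(bad), 4) for _ in range(5)])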


You are mixing up implementing an RNG with using it. ...

Implement: v. "To put into practical effect; Carry out."

 

If I were to implement Google Analytics on a website, I surely wouldn't be writing Google Analytics code from scratch, I would be taking existing code and adding it or adapting it to the page.

 

As to the points you're making about RNG, I agree.

 

My point remains, however:

 

The Hero engine may have a poor implementation of RNG. It may even possibly (as the OP is claiming) not be 20%, or be broken in some way. But his current sample size is inadequate to demonstrate that. That's all I'm trying to get him to understand.

Edited by Khevar

pretty sure OP knows a lot more about probability and RNG than you

 

But that does not make his statement invalid.

 

I quote the TV series Numb3rs...

 

Charlie Eppes: Lady luck, chance, randomness... Human beings, truly, have a hard time understanding it. Raindrops fall randomly. Now which of these two images best represents raindrops falling on a sidewalk? Is it image A?

[grouped pattern]

Charlie Eppes: Image B?

[even pattern - the class chooses this one]

Charlie Eppes: Okay. You're wrong. Our brains misperceive evenness as random, and wrongly assume that groupings are deliberate. Because of this people make all sorts of irrational decisions. Like, they won't work in a high-rise building, or they're afraid to live in an earthquake-prone area. And yet mathematical assessment tells us that you are far more likely to suffocate in bed than you are to die in a terrorist attack. You are ten times more likely to die from alcohol than from being in an earthquake. And it is three times more likely that you will be killed while driving to buy a lottery ticket than it is that you will win the lottery.

 

Getting back to the RE RNG: just because ONE player has a bad streak of 10 REs without producing a schematic does not make that cluster deliberate and the system broken. On the flip side, just because one player manages to get three schematics in three REs does not make that cluster deliberate and the system broken either.

 

My point is that doing statistical analysis of RE events from one player's perspective is an inherently flawed experiment. Now, if you could get a thousand players to do 500 REs each... then you are dealing with a reasonable sample size. But even then, "reasonable" is relative. There WILL be a margin of error of several percentage points. The greater the number of participants and the greater the number of attempts, the more likely the final tally of results will match the expected 20%. And BioWare actually has the data from EVERY player, on EVERY character, and EVERY RE ever done. Do you REALLY think that if the system were actually broken, BioWare would not fix it? If so, you'd better get your tinfoil hat out :eek:

Edited by psandak

