G-test Calculator

How many versions?

Success is measured by
count of successes
percentage of trials that succeeded

Use the Yates' continuity correction?

Test Trials Success

Explanation

This form does a G-test calculation to determine the outcome of an A/B test. The G-test statistic measures how far the observed results stray from the ideal prediction you would expect if all versions performed the same. When all versions really are the same and there are many trials, the distribution of this statistic is known. If the statistic we got is very unlikely under that distribution, then we have good evidence that we are seeing real differences of some sort. Our confidence that there is a real difference is 100% minus the probability of seeing a G-test statistic at least as extreme as the one we saw, assuming all versions were the same.
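
In code, the calculation looks roughly like the sketch below. It is a minimal Python version, assuming each version is summarized by (trials, successes) counts; the function name and example numbers are made up for illustration, and SciPy is used only for the chi-squared tail probability.

import math
from scipy.stats import chi2

def g_test_confidence(results):
    """results: list of (trials, successes) pairs, one per version.
    Returns (G statistic, confidence that the versions really differ)."""
    total_trials = sum(n for n, _ in results)
    pooled_rate = sum(s for _, s in results) / total_trials

    g = 0.0
    for trials, successes in results:
        failures = trials - successes
        # "Ideal prediction": every version succeeds at the pooled rate.
        for observed, expected in ((successes, trials * pooled_rate),
                                   (failures, trials * (1 - pooled_rate))):
            if observed > 0:
                g += 2 * observed * math.log(observed / expected)

    degrees_of_freedom = len(results) - 1
    # Probability of a G statistic at least this extreme if all versions
    # were really the same; confidence is 100% minus that probability.
    p_value = chi2.sf(g, degrees_of_freedom)
    return g, 1 - p_value

# Hypothetical example: version A converts 200 of 1000 trials, version B 250 of 1000.
g, confidence = g_test_confidence([(1000, 200), (1000, 250)])
print(f"G = {g:.2f}, confidence = {confidence:.1%}")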

At what confidence do we end the test? There is no hard rule, but a common benchmark is 99% confidence. Note that 99% confidence does not mean you are right 99% of the time. How often you are right depends in complicated ways on information you cannot get, such as the size of the real difference. Instead, it means that if there were no real difference, there was only a 1% chance of mistakenly declaring one at the point where you made your decision.

The fields should be self-explanatory, except for the Yates' continuity correction. That correction adjusts the numbers slightly to account for the fact that the test results are whole counts rather than continuous values. This makes the test slightly more accurate for small sample sizes, but the difference is seldom material. It defaults to on.
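
As a rough illustration, the correction can be thought of as pulling each observed count half a unit toward its expected count before computing G. The helper below is a hypothetical addition to the sketch above, not the form's exact implementation.

def yates_correct(observed, expected):
    # Yates' continuity correction: move the observed count 0.5 toward
    # the expected count, but never past it.
    if observed > expected:
        return max(observed - 0.5, expected)
    return min(observed + 0.5, expected)

# Inside the loop of g_test_confidence above, each cell would use
#     observed = yates_correct(observed, expected)
# before adding its term to G.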