Did you know that conversion optimization can actually reduce your profits?
When you run a test, it is easy to misinterpret the results and draw incorrect conclusions.
The “winner” you pick can cause a long-term loss if you are not careful.
If you’ve never taken an introductory statistics course, don’t worry: I’ve got you covered. We will also look at the ways the statistics behind your split tests can mislead you.
Stats 101: A Basic Crash Course
If you never had to take Stats 101 in college or university, you missed out on something exciting … umm, not really.
However, there are still some things from that course you need to know before split testing will make any real sense.
I am going to quickly go over these concepts so that you do not get confused later on. If you are already a stats pro, just skip to the next section.
What in the world is a confidence interval?
Whether you use a conversion tool like Optimizely, or a simple web app like Iswali, you’ll notice that conversion rates always come with a range.
In this example, the measured conversion rate is currently 4.3%, but its range runs from 0.6% to 8.0%. This means that, given a large enough sample, the true conversion rate could fall anywhere in that range.
This does not mean that the far ends (0.6% or 8.0%) are likely, only that they are possible.
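If you want to see roughly where a range like that comes from, here is a minimal sketch using the Wilson score interval, one common way to put a range around a conversion rate. The function name and the visitor counts are my own made-up assumptions for illustration; your testing tool may use a different method and will not necessarily give the same numbers.

```python
import math

def wilson_interval(conversions, visitors, z=1.96):
    """Return (low, high) for a conversion rate using the Wilson score interval.

    z=1.96 corresponds to a 95% interval. Hypothetical helper, not any tool's API.
    """
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    p = conversions / visitors
    denom = 1 + z**2 / visitors
    centre = (p + z**2 / (2 * visitors)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / visitors + z**2 / (4 * visitors**2)
    ) / denom
    return centre - half_width, centre + half_width

# Made-up data: 43 conversions out of 1,000 visitors (a 4.3% rate).
low, high = wilson_interval(conversions=43, visitors=1000)
print(f"Observed 4.3%, plausible range {low:.1%} to {high:.1%}")
```

Try it with the same 4.3% rate on 10,000 visitors and the range tightens noticeably, which is why sample size matters so much.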
How significant is significance?
Statistically significant: have you heard that term before?
The significance of a test tells us how confident we should be that we have the right result when choosing between two or more options.
When you run a basic A / B test, you will have a confidence interval for each option.
See below for an example. The original could have a conversion rate as high as 5.6%, while the variation (the current winner) could have a conversion rate as low as 0.6%.
Does this mean the current results are worthless? No, not at all.
But it does mean that we need to calculate the significance of the test to determine how confident we can be when we choose the variation as the winner.
According to the tool, the significance is currently 91.1%. This means that 91.1% of the time, the variation really is the best-performing option. However, that leaves 8.9% of the time where the original is actually the best.
In practice, tests are typically run until 95% or higher significance is achieved. Even at 95%, 1 out of 20 tests will leave you with the worse option. While it would be ideal to test everything to a 99%+ significance level, this is not always possible due to traffic or time limitations.
A note on significance: If you can only achieve 95% significance in most tests, that is not ideal, but it is fine. Just understand that not every lesson you learn is going to be correct, and that you should expect a conflicting result once in a while.
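For the curious, here is a rough sketch of how a significance figure can be computed from raw counts, using a pooled two-proportion z-test. This is one textbook approach, not necessarily what your tool does under the hood, and the function names and visitor counts are hypothetical.

```python
import math

def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def significance(conv_a, n_a, conv_b, n_b):
    """One-sided confidence that B's rate beats A's (pooled two-proportion z-test).

    conv_* are conversion counts, n_* are visitor counts.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 0.5  # no conversions (or all conversions): nothing to compare
    z = (p_b - p_a) / std_err
    return normal_cdf(z)

# Made-up data: original converts 30 of 1,000, variation converts 50 of 1,000.
print(f"Significance: {significance(30, 1000, 50, 1000):.1%}")
```

If both options have identical results, this returns 50%, which is just another way of saying the test has told you nothing yet.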
An important variable: sample size
Flip a coin 10 times, and you could easily get a lopsided result like 3 heads (30%) and 7 tails (70%), even though in theory the flips should split 50/50.
See where I am going with this?
The size of your sample is one of the most important factors in determining the significance of a test.
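To put a number on the coin example, here is a small sketch that computes the exact chance of a fair coin producing a split at least as lopsided as a given one. The function name is my own; the math is just the binomial distribution.

```python
from math import comb

def prob_lopsided(flips, minority):
    """Chance that a fair coin shows one side `minority` times or fewer."""
    if 2 * minority >= flips:
        return 1.0  # every outcome is at least this "lopsided"
    one_tail = sum(comb(flips, k) for k in range(minority + 1))
    # Double the tail because either heads or tails could be the rare side.
    return 2 * one_tail / 2**flips

print(f"10 flips, 3/7 split or worse:          {prob_lopsided(10, 3):.1%}")
print(f"1,000 flips, 300/700 split or worse:   {prob_lopsided(1000, 300):.2e}")
```

With 10 flips, a 3/7 split or worse happens about 34% of the time. With 1,000 flips, a 30/70 split is so rare you can treat it as impossible. The same logic applies to your visitors.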
There are many simple sample-size calculators out there that you can use for free.
Take a look at Optimizely’s free web calculator here:
Sample size calculator
In this case, you will need to run the test until there are 10,170 samples (views) for each option.
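If you would rather see the arithmetic than trust a black box, here is a sketch of a standard textbook sample-size formula for comparing two conversion rates. The baseline rate, lift, significance level, and power below are assumptions I picked for illustration; they are not the inputs behind the 10,170 figure, and Optimizely's calculator uses its own method, so do not expect the numbers to match.

```python
import math
from statistics import NormalDist

def sample_size_per_variation(baseline, relative_lift, alpha=0.05, power=0.8):
    """Visitors needed per option to detect a relative lift over the baseline rate.

    Uses the common two-sided, two-proportion formula. Hypothetical helper.
    """
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    if not (0 < p1 < 1 and 0 < p2 < 1) or p1 == p2:
        raise ValueError("rates must be between 0 and 1 and must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 for 95% significance
    z_beta = NormalDist().inv_cdf(power)           # 0.84 for 80% power
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)

# Assumed inputs: 5% baseline conversion rate, hoping to detect a 20% relative lift.
print(sample_size_per_variation(baseline=0.05, relative_lift=0.20))
```

Notice how quickly the requirement grows when you try to detect a smaller lift: halving the lift roughly quadruples the visitors you need.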
So that is Stats 101 in about 5 minutes. Now let's move on to determining whether your split test results are actually valid.
Here’s what most business owners do when doing split testing:
Calculate the required sample size
Run the test for that long
Choose a winner from the results
It doesn’t sound crazy, does it?
But there are some serious flaws that can negatively impact your bottom line.
You must segment your traffic
Segmenting simply means dividing something into distinct groups.
In terms of web traffic, you can segment in three main ways:
By source: Traffic comes from various places: Google, Bing, social media, email links, and more. Visitors from different traffic sources behave differently and convert at different rates.