If you test, whether A/B or multivariate, chances are you’ve seen at least one test fail. We’d all like a 100% success rate, but we test precisely because there are visitor behaviors we can’t predict. Even with 10+ years of experience in behavior-based conversion rate optimization, we here at FutureNow are sometimes wrong. So, the question is: when you see that your test is failing, what do you do?
Just as there are patterns in successful tests, there are patterns in failing tests. But the place to start is always your hypothesis or research question. You’ve heard us preach the importance of having a hypothesis when creating your tests, and it’s just as important when getting to the bottom of why a test is failing.
Tests fail for 3 primary reasons:
Chances are, if you go back to your research question, you’ll gain some insight into where your test went wrong. Maybe this will lead you to rework your research question, or perhaps to a different hypothesis altogether. Either way, understanding WHY your test is failing is just as important as what you do after.
Above are two screenshots of real tests run by clients. Both are failing, but they are very different. The one on the left is being beaten by the original, and has been consistently since early in the test. The one on the right has been going back and forth for over a month, and we’re currently at a dead tie.
Many of you will be surprised to hear that of the two test results, the one I’d prefer to see is the one on the left. This is because I know exactly what to do: end the test. The consistency of the test’s results indicates that this is a failed test and we need to go back to the drawing board on our hypothesis. With the test on the left, we can still draw knowledge about our visitors from the test data, and use that to either refine our hypothesis or draw up a new one.
The test on the right is much more problematic. Not only does it fail to give us clear results, it also leaves us with little knowledge to use in refining our hypothesis. I’d still suggest ending this test as well. The difference is, ending it puts us back at square one: we’re no closer to understanding who our visitors are and what affects their behaviors.
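If you want to put a number on a dead tie like the one on the right before pulling the plug, a standard two-proportion z-test will tell you whether the variations are statistically distinguishable at all. Here’s a minimal sketch, using hypothetical visitor and conversion counts rather than the client data above:

```python
# A quick two-proportion z-test: are two conversion rates distinguishable?
# The counts below are hypothetical, not from the tests shown above.
from math import sqrt
from statistics import NormalDist

def two_proportion_p_value(conv_a, visitors_a, conv_b, visitors_b):
    """Two-sided p-value for the difference between two conversion rates."""
    p_a = conv_a / visitors_a
    p_b = conv_b / visitors_b
    p_pool = (conv_a + conv_b) / (visitors_a + visitors_b)  # pooled rate
    se = sqrt(p_pool * (1 - p_pool) * (1 / visitors_a + 1 / visitors_b))
    z = (p_a - p_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Example: after a month, both variations sit near a 3% conversion rate.
p = two_proportion_p_value(302, 10000, 311, 10000)
print(f"p-value: {p:.2f}")  # ~0.71: no detectable difference between variations
```

A p-value this large after a month of traffic is the statistical version of that flat, back-and-forth graph: the data simply can’t separate the variations.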
Tests should reach completion within 60 days, ideally 30. Not sure how to estimate the duration? Use Google’s free test duration calculator tool to figure it out. If your test is going to take longer than 60 days, it’s not the right test to be running.
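If you’d rather see the arithmetic behind a duration estimate, here’s a minimal sketch using the standard two-proportion sample-size formula at 95% confidence and 80% power; the traffic and conversion figures are hypothetical:

```python
# Rough test-duration estimate from a standard sample-size calculation.
# Assumes a two-proportion z-test at 95% confidence and 80% power;
# all input numbers below are hypothetical placeholders.
from statistics import NormalDist

def estimated_duration_days(baseline_rate, min_detectable_lift,
                            daily_visitors, num_variations=2,
                            alpha=0.05, power=0.80):
    """Days until the test collects enough visitors (normal approximation)."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + min_detectable_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    # Visitors needed per variation to reliably detect the lift
    n_per_variation = ((z_alpha + z_beta) ** 2 *
                       (p1 * (1 - p1) + p2 * (1 - p2))) / (p2 - p1) ** 2
    return n_per_variation * num_variations / daily_visitors

# Example: 3% baseline conversion, hoping to detect a 15% relative lift,
# with 1,000 test-eligible visitors per day split across two variations.
print(f"{estimated_duration_days(0.03, 0.15, 1000):.0f} days")  # ~48 days
```

Shrink the detectable lift or the daily traffic and the estimate balloons quickly; if it lands well past 60 days, that’s your cue to pick a different test.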
But what if your test is trending negatively after only a week? While there’s no hard-and-fast rule, I always encourage my clients to run their tests for at least 2 weeks. This gives us a better impression of visitor behavior, and many times things begin trending in a more positive direction. Having said that, letting a negatively trending test run for an additional week isn’t always the best choice.
If you have a gaggle of tests ready and the one trending in the failing direction doesn’t have as potentially significant an impact as some of the other tests in your pipeline, consider pausing the negatively trending test in favor of launching a new one. Know what you have in your testing queue. At the end of the day, you should always be testing whatever will give you the most significant results first.