Sample Ratio Mismatch - Troubleshooting

Introduction

Have you ever seen a Sample Ratio Mismatch, where one of your variants has significantly less traffic than the other? Or a variation with no traffic going to it, while everything else behaves normally? Or no traffic going into your test at all?

This document looks into the various causes of Sample Ratio Mismatch and, more broadly, "problems with bucketing/potting users".

Before we get started

You should not expect a 1:1 match between control and variation for user counts. Optimize does not currently (as of Oct 2024) artificially adjust sample sizes.

In the same way that flipping a coin 100 times is unlikely to yield exactly 50 heads and 50 tails, the same is true when randomising users into control vs. variation.

Also, note that these issues are amongst the more difficult to debug. This document explains in detail how to think through the investigation, but as the cause is almost always user error, we expect users to investigate and take corrective measures appropriately.

You should also check the % split between your variations to make sure it's 50/50 or as expected. The platform is capable of uneven splits such as 90/10, and there are valid reasons to use them, but an intentionally uneven split will naturally produce unequal counts that can be mistaken for SRM.
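If you want to sanity-check whether an observed split looks like ordinary chance (as with the coin flips above) or a genuine mismatch, a chi-squared test is the usual approach. The sketch below is a minimal, illustrative example you could run outside the platform - the function name and the 10.83 threshold (roughly p = 0.001 for a two-variation test) are our own choices, not part of Optimize.

```js
// Minimal sketch of a chi-squared SRM check, run outside the platform.
// Inputs: observed user counts per variation and the split you configured.
function srmCheck(observedCounts, expectedRatios) {
  const total = observedCounts.reduce((sum, n) => sum + n, 0);

  // Chi-squared statistic: sum of (observed - expected)^2 / expected.
  const chiSq = observedCounts.reduce((sum, observed, i) => {
    const expected = total * expectedRatios[i];
    return sum + ((observed - expected) ** 2) / expected;
  }, 0);

  // For two variations (1 degree of freedom), 10.83 corresponds to p ≈ 0.001,
  // a common threshold for flagging SRM rather than chance variation.
  return { chiSq, likelySRM: chiSq > 10.83 };
}

// Example: a 50/50 test where control has 5,000 users and the variation 4,700.
console.log(srmCheck([5000, 4700], [0.5, 0.5]));
// => { chiSq: ~9.28, likelySRM: false } - noticeable, but plausibly still chance.
```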

Where is the problem?

Follow these steps and route your investigation from top to bottom.

Are users not getting into your test at all, i.e. the counts for both control and variation are low or zero? Jump to My test is getting no traffic.

Are you analysing by session instead of user? Jump to Analysing by session.

Is your control showing lower user numbers than your variation? Jump to Control is lower than variation.

Is your variation showing lower user numbers than the control? Jump to Variation is lower than control.

My test is getting no traffic

This speaks to one of a few problems, each of which is explored below.

Location is incorrect

Does the experiment show up in the Force Experiment Widget? Are you able to get into it in staging mode? Is this the case on all pages you're expecting to test on?

Preview links have their quirks, as their help article describes (it's a forced preview), but the Widget will correctly display experiments on pages where they should run. As it's a bookmarklet, you'll be able to get an answer in one click.

If the location is incorrect, nobody will see your test, so making sure it's correct and validating this in Staging Mode before setting your test live is a key pre-launch check.

Segment logic is flawed

Quite often, users will force their way into a test, which leapfrogs segmentation checks - making it something that often gets missed during pre-launch QA testing.

The Debug Mode in the tag, at verbosity level 3, will show you the segmentation evaluation checks: a highly detailed output of every check we've done (browser, geolocation, etc.), the expected and found values, and the outcome of each comparison.
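As covered in the Code errors section further down, Debug Mode is enabled by adding the _wt.debug parameter to your query string, with the number of Vs controlling verbosity - so three Vs should give you this level of detail. For example, on a placeholder URL:

```
https://www.example.com/landing-page?_wt.debug=vvv
```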

Test is throwing JS code errors

This is explained in the Code errors section below.

Analysing by session

Tests - AB, ABn and MVT - are all "sticky". This means that once a user has been exposed to a particular variation, they will continue to see it across page loads within a session, and across subsequent sessions too.

The split in traffic, and the decision about which variation to allocate a user to, is therefore made at the User level, not the Session level.
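To illustrate the principle (this is a generic sketch, not Optimize's actual implementation), user-level stickiness is typically achieved by deriving the variation deterministically from a stable user ID, so every session for that user re-derives the same answer:

```js
// Generic illustration of sticky, user-level bucketing - not Optimize's actual code.
// A stable hash of the user ID decides the variation once; every session re-derives
// the same answer, so session counts can diverge even when user assignment is fair.
function assignVariation(userId, weights = { control: 0.5, variation: 0.5 }) {
  // Simple string hash mapped to a number in [0, 1). Any stable hash would do.
  let hash = 0;
  for (const char of userId) {
    hash = (hash * 31 + char.charCodeAt(0)) >>> 0; // keep it an unsigned 32-bit int
  }
  const bucket = hash / 2 ** 32;

  // Walk the weights until the bucket value falls inside a variation's share.
  let cumulative = 0;
  for (const [name, weight] of Object.entries(weights)) {
    cumulative += weight;
    if (bucket < cumulative) return name;
  }
  return 'control'; // fallback for floating-point rounding
}

// The same user gets the same answer on every visit, however many sessions they start.
console.log(assignVariation('user-123')); // e.g. 'variation'
console.log(assignVariation('user-123')); // identical result every time
```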

If you run a successful test, you may encourage users who abandon to come back later that same day when they otherwise wouldn't. This is typically a good result. And yet, when you look at session counts, it will look like the variation is being shown to more users, when in reality the same users might just be coming back more often.

Similarly, if your variant causes fewer users to return to the site than they usually would, the variant could appear artificially lower in session counts than the control, despite the underlying user assignment being fair.

SRM is not relevant to analysis by session - you should only investigate it if spotted when analysing by the User Scope. It is, however, useful insight to know that you've encouraged more users to return for a second/third/fourth session!

Control is lower than variation

This less-likely scenario is typically due to:

  • Code you've put in the control JS that's throwing errors. Jump to Code errors below.

  • Global test-code, which was written for the variations but negatively affects the control. Investigate as above.

Variation is lower than control

Redirect tests

Is your test performing a redirect without using the platform's built-in Split functionality?

A common mistake users make is believing that Optimize will track users the instant a variation is assigned. In reality, there is a (very) short delay between your code running and users being counted.

If this process is interrupted, users may not be counted. And as JS redirections are immediate - often sending the user to the new page before in-flight requests have finished, unlike a normal link click - it's easy to see how that tracking request could fail to complete.

Optimize has an events system, through which you can hook into "pageview" events (test entry events). With these event hooks, you can wait for a pageview to be tracked, and then redirect the user afterwards.
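The pattern looks something like the sketch below. Note that the hook name used here (onPageview) is purely illustrative - check the events system documentation for the exact API your tag version exposes.

```js
// Illustrative pattern only - 'optimize.events.onPageview' is a hypothetical hook name;
// consult the events documentation for the real API. The idea is simply: let the
// pageview (test entry) be tracked first, then perform the redirect.
optimize.events.onPageview(function () {
  // Preserve the query string so campaign/tracking parameters survive the redirect.
  window.location.replace('https://www.example.com/variation-b' + window.location.search);
});
```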

Code errors

If there are runtime errors in your test when the "rubber meets the road" - the moment your JS is executed - Optimize will not count users. We do not want to incorrectly count users as having been exposed to your variant when they haven't seen the changes.

Optimize hides execution-time JS syntax errors (not ones wrapped in a timeout/interval/delay) from users, surfacing them inside the Debug Mode instead. To view these errors, add ?_wt.debug=v to your query string. Note that the number of Vs denotes the verbosity of the logs, with one V showing errors and key information only. Add more if needed, up to 5 for trace-level logs.

Note also that code errors might not affect all users and all devices, and so analysing your data by browser might highlight that Chrome and Edge are fine, but there's a problem with Safari on iOS or Internet Explorer. To do this, apply a browser dimension in the reports and then look at sample sizes to steer your investigation.

Note also that code errors can come from any of: the tag and its various components (preinit, preload, postload, etc.), the pre-rendering script, variation code, or the post-rendering script. The Debug Mode will flag all of these to you if found.
