Multi Armed Bandits | Webtrends Optimize Help Center

Multi Armed Bandits can be enabled for ABn or Split experiences that are in a staging, live or paused state.

If these conditions are met, you will see an "MAB" tab in the expanded row panel on the experience screen. Once enabled, the Multi Armed Bandit algorithm updates the variation throttles every 30 minutes.

Thompson Sampling

For Thompson Sampling, select a KPI and click "Apply". Our algorithm will handle the rest.

We use Monte Carlo Thompson sampling, with many samples drawn from a Beta distribution. This gives the probability that each variation is the best, and pure Thompson sampling would set the variation throttles in direct proportion to those probabilities. We add a weighted blend on top: the Thompson throttle is multiplied by 0.3, the existing throttle is multiplied by 0.7, and the two are added together to get the final value. This slows the rate of change slightly. The algorithm still reaches a result when there is one, and evaluating every 30 minutes means it gets there quickly, but it doesn't jump to a winner after a single loop.
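The calculation described above can be sketched roughly as follows. This is an illustration only, not the production implementation: the function names, the uniform Beta(1, 1) prior, the use of conversion counts as the KPI data, the number of draws and the example figures are all assumptions made for the sketch. Only the 0.3 / 0.7 blend comes from the description above.

```python
import random


def thompson_probabilities(stats, draws=10_000, seed=0):
    """Estimate the probability that each variation is the best.

    stats is a list of (conversions, visitors) pairs, one per variation.
    Assumption: a uniform Beta(1, 1) prior on each conversion rate.
    """
    rng = random.Random(seed)
    wins = [0] * len(stats)
    for _ in range(draws):
        # Draw one plausible conversion rate per variation.
        samples = [rng.betavariate(1 + c, 1 + (n - c)) for c, n in stats]
        # Credit the variation with the highest sampled rate.
        wins[samples.index(max(samples))] += 1
    return [w / draws for w in wins]


def blend_throttles(current, thompson, weight=0.3):
    """Blend: weight * Thompson throttle + (1 - weight) * existing throttle."""
    return [weight * t + (1 - weight) * c for c, t in zip(current, thompson)]


# Hypothetical data: (conversions, visitors) for three variations.
stats = [(50, 1000), (70, 1000), (55, 1000)]
current = [1 / 3, 1 / 3, 1 / 3]  # existing throttles as fractions of traffic

probs = thompson_probabilities(stats)
new_throttles = blend_throttles(current, probs)
print([round(t * 100, 1) for t in new_throttles])
```

Because both inputs sum to 1 and the weights sum to 1, the blended throttles also sum to 1. Repeating the blend every loop moves the throttles steadily towards the Thompson values rather than jumping there in one step.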

Epsilon Greedy

Alternatively, select Epsilon Greedy from the "Approach" dropdown and choose an Epsilon value between 1 and 100.

The Epsilon value is split evenly across all variations.

The rest (100 - Epsilon) is allocated to whichever variant is performing best against the chosen KPI.

For example, if Epsilon is set to 30 and there are 3 variants, each one gets 10% of traffic. The winning variant also gets the remaining 70% on top of its 10%, resulting in a throttle allocation of 10%, 10%, 80%.
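The same arithmetic can be written as a short sketch. The function name, the use of a list of KPI rates as input, and the tie-breaking rule (the first variant wins a tie) are assumptions for illustration, not a description of the actual implementation.

```python
def epsilon_greedy_throttles(kpi_rates, epsilon):
    """Return throttle percentages for each variant.

    kpi_rates: observed KPI performance per variant (hypothetical input).
    epsilon: a value from 1 to 100, split evenly across all variants.
    """
    if not 1 <= epsilon <= 100:
        raise ValueError("epsilon must be between 1 and 100")
    n = len(kpi_rates)
    # Every variant receives an equal share of Epsilon.
    throttles = [epsilon / n] * n
    # The best performer also receives the remainder (100 - Epsilon).
    # Assumption: on a tie, the first variant with the top rate wins.
    best = max(range(n), key=lambda i: kpi_rates[i])
    throttles[best] += 100 - epsilon
    return throttles


result = epsilon_greedy_throttles([0.05, 0.04, 0.07], 30)
print(result)  # [10.0, 10.0, 80.0]
```

Here the third variant has the highest rate, so it receives its 10% share plus the remaining 70%.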