A/B Testing

How does A/B testing work?

  1. Define the hypothesis and the Overall Evaluation Criterion (OEC), e.g., checkout Conversion or ROI.
  2. Instrument events and goals (see Event and Goal), and tag traffic consistently with UTM parameters.
  3. Randomize users into A and B and hold all else equal.
  4. Size the sample (power & alpha), run the test, then analyze for statistical significance (frequentist p-value or Bayesian posterior); a sizing sketch follows this list.
  5. Decide: ship, iterate, or archive. Validate results across segments (e.g., by Cohort or channel) and ensure your Attribution Model doesn’t mask the lift.
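
For step 4, here is a rough per-arm sizing sketch under the usual normal-approximation assumptions for a two-sided, two-proportion test. The baseline rate, minimum detectable effect, alpha, and power below are illustrative assumptions, and sample_size_per_arm is just a helper name for this example.

```python
import math
from statistics import NormalDist

def sample_size_per_arm(p_baseline, p_variant, alpha=0.05, power=0.8):
    """Approximate users needed per arm to detect p_baseline -> p_variant
    with a two-sided two-proportion z-test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # e.g. 1.96 for alpha = 0.05
    z_power = NormalDist().inv_cdf(power)           # e.g. 0.84 for 80% power
    variance = p_baseline * (1 - p_baseline) + p_variant * (1 - p_variant)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p_variant - p_baseline) ** 2)

# Detecting a lift from 5.0% to 5.6% at alpha = 0.05 with 80% power:
print(sample_size_per_arm(0.05, 0.056))  # roughly 21,900 users per arm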

Minimal example

Variant        Users    Conversions   Conv. rate
A (control)    10,000   500           5.00%
B (test)       10,000   560           5.60%
Relative uplift = (5.60% − 5.00%) / 5.00% = +12% (evaluate statistical significance before rolling out).
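
As a sanity check on this example, a minimal significance test: a two-sided two-proportion z-test with pooled variance, using the counts from the table above. The function name two_proportion_z_test is just for this sketch.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p = two_proportion_z_test(500, 10_000, 560, 10_000)
print(f"z = {z:.2f}, p = {p:.3f}")  # z ≈ 1.89, p ≈ 0.058: not significant at alpha = 0.05
```

With 10,000 users per arm this +12% relative lift lands just above the 0.05 threshold, which is exactly why the sizing step matters before drawing conclusions.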

Good practice & gotchas

  • Pick one primary metric to avoid metric fishing; pre-register guardrail metrics (latency, churn, etc.).
  • No peeking: frequent looks inflate false positives—use sequential methods or a validated stats engine.
  • A/A tests catch bucketing or instrumentation bugs.
  • Variance reduction and stratification (e.g., CUPED, cohorts) increase sensitivity; a minimal CUPED sketch follows this list.
  • Tooling: beyond GA4, platforms like Optimizely/VWO and privacy-first analytics (Plausible, Matomo, Simple Analytics) can run or analyze experiments.
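
Expanding the variance-reduction bullet, a minimal CUPED sketch: the in-experiment metric is adjusted using a pre-experiment covariate, which shrinks variance without biasing the treatment effect. The data here are synthetic and cuped_adjust is an illustrative helper, not any particular platform's API.

```python
import numpy as np

def cuped_adjust(y, x):
    """Return the CUPED-adjusted metric y - theta * (x - mean(x)),
    where theta = cov(x, y) / var(x) and x is a pre-experiment covariate."""
    theta = np.cov(x, y, bias=True)[0, 1] / np.var(x)
    return y - theta * (x - x.mean())

rng = np.random.default_rng(0)
x = rng.normal(10, 3, size=10_000)            # pre-experiment behaviour per user
y = 0.8 * x + rng.normal(0, 1, size=10_000)   # in-experiment metric, correlated with x

y_adj = cuped_adjust(y, x)
print(f"variance before: {y.var():.2f}, after CUPED: {y_adj.var():.2f}")
```

Because the covariate is measured before randomization, the adjustment leaves the estimated lift unchanged while the lower variance lets smaller effects reach significance with the same traffic.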

SEO note: Also searched as “What is A/B testing?” and “How does A/B testing work?”