How does A/B testing work?
- Define the hypothesis and the Overall Evaluation Criterion (OEC), e.g., checkout Conversion or ROI.
- Instrument events and goals (see Event and Goal), and tag traffic consistently with UTM parameters.
- Randomize users into A and B and hold all else equal.
- Size the sample up front (statistical power and significance level α), run the test, then analyze for statistical significance (frequentist p-value or Bayesian posterior); a sizing sketch follows this list.
- Decide: ship, iterate, or archive. Validate results across segments (e.g., by Cohort or channel) and ensure your Attribution Model doesn’t mask the lift.
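As a rough illustration of the sizing step, here is a minimal sketch in Python using SciPy and the standard two-proportion sample-size approximation. The baseline rate, minimum detectable effect, power, and alpha are assumed example values, not figures from this article:

```python
from scipy.stats import norm

# Assumed example inputs: baseline rate and minimum detectable effect (MDE).
baseline = 0.05                      # control conversion rate
mde_rel = 0.10                       # smallest relative lift worth detecting (+10%)
alpha, power = 0.05, 0.80

p1 = baseline
p2 = baseline * (1 + mde_rel)

# Standard approximation for a two-sided, two-proportion test with equal allocation:
# n per arm = (z_{1-alpha/2} + z_{power})^2 * (p1*q1 + p2*q2) / (p2 - p1)^2
z_alpha = norm.ppf(1 - alpha / 2)
z_power = norm.ppf(power)
n_per_arm = (z_alpha + z_power) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p2 - p1) ** 2

print(f"~{n_per_arm:,.0f} users per variant")
```

Smaller effects or lower baseline rates push the required sample up quickly, which is why the minimum detectable effect should be agreed on before the test starts.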
Minimal example
| Variant | Users | Conversions | Conv. Rate |
|---|---|---|---|
| A (control) | 10,000 | 500 | 5.00% |
| B (test) | 10,000 | 560 | 5.60% |

Uplift = (5.60% − 5.00%) / 5.00% = +12% (evaluate significance before rolling out).
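One way to evaluate that significance is a two-proportion z-test; this is a minimal sketch in Python with SciPy, using the exact counts from the table:

```python
from math import sqrt
from scipy.stats import norm

# Counts from the table above
users_a, conv_a = 10_000, 500    # A (control)
users_b, conv_b = 10_000, 560    # B (test)

p_a, p_b = conv_a / users_a, conv_b / users_b
p_pool = (conv_a + conv_b) / (users_a + users_b)   # pooled rate under H0

# Two-proportion z-test (two-sided)
se = sqrt(p_pool * (1 - p_pool) * (1 / users_a + 1 / users_b))
z = (p_b - p_a) / se
p_value = 2 * (1 - norm.cdf(abs(z)))

print(f"uplift = {(p_b - p_a) / p_a:+.1%}, z = {z:.2f}, p = {p_value:.3f}")
```

With these counts the two-sided p-value lands around 0.06, so despite the +12% relative uplift the result is borderline at α = 0.05; you would want a properly sized run (or more data) before shipping.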
Good practice & gotchas
- Choose one primary metric to avoid fishing; pre-register guardrail metrics (latency, churn, etc.).
- No peeking: repeated significance checks inflate false positives; use sequential methods or a validated stats engine (see the simulation after this list).
- A/A tests catch bucketing or instrumentation bugs.
- Variance reduction and stratification (e.g., CUPED, stratifying by Cohort) increase sensitivity; see the CUPED sketch after this list.
- Tooling: beyond GA4, platforms like Optimizely/VWO and privacy-first analytics (Plausible, Matomo, Simple Analytics) can run or analyze experiments.
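To see why peeking matters, here is a small simulation sketch in Python with NumPy. It runs an A/A test (no true effect) and checks for significance after every batch of traffic; the number of looks, traffic per look, and conversion rate are assumed example values:

```python
import numpy as np

rng = np.random.default_rng(1)

def false_positive_rate(n_looks, users_per_look=1_000, sims=2_000, p=0.05):
    """Simulate an A/A test with a significance check at every look."""
    hits = 0
    for _ in range(sims):
        conv_a = conv_b = n = 0
        for _ in range(n_looks):
            conv_a += rng.binomial(users_per_look, p)
            conv_b += rng.binomial(users_per_look, p)
            n += users_per_look
            pool = (conv_a + conv_b) / (2 * n)
            se = np.sqrt(pool * (1 - pool) * (2 / n))
            if se > 0 and abs(conv_b / n - conv_a / n) / se > 1.96:
                hits += 1          # declared "significant" at this look and stopped
                break
    return hits / sims

print(f"1 look  : {false_positive_rate(1):.1%}")   # close to the nominal 5%
print(f"20 looks: {false_positive_rate(20):.1%}")  # substantially above 5%
```

Stopping at the first "significant" look pushes the false-positive rate well beyond the nominal 5%; sequential designs (e.g., alpha spending) or always-valid inference are the usual fixes.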
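And as a sketch of the variance-reduction point, a minimal CUPED adjustment in Python with NumPy subtracts the part of the in-experiment metric explained by each user's pre-experiment value. The synthetic pre-period covariate here is an assumption of the example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic example: a pre-experiment metric (covariate) correlated with the
# in-experiment metric for the same users.
pre = rng.gamma(2.0, 10.0, size=20_000)               # pre-period spend per user
post = 0.6 * pre + rng.normal(0.0, 5.0, size=20_000)  # in-experiment spend

# CUPED: theta = cov(pre, post) / var(pre), estimated on pooled data, then
# subtract the covariate-explained component from each user's metric.
theta = np.cov(pre, post)[0, 1] / pre.var(ddof=1)
post_adj = post - theta * (pre - pre.mean())

print(f"variance before: {post.var():.1f}, after CUPED: {post_adj.var():.1f}")
```

In a real experiment you would adjust both arms with the same theta and compare the adjusted means: the expected lift is unchanged, while the variance (and hence the sample needed for a given sensitivity) shrinks in proportion to the squared correlation between the pre- and in-experiment metrics.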
SEO note: Also searched as “What is A/B testing?” and “How does A/B testing work?”