A/B Analysis Guide for Data Analysts: Metrics, SQL, and Decision Checklist

What is A/B analysis?

A/B analysis compares a control group and a treatment group to decide whether a product, marketing, pricing, or checkout change improved a target metric. For data analysts, the core workflow is to confirm the experiment setup, calculate group-level metrics, check guardrails, estimate lift, and turn the result into a launch, rerun, or do-not-launch recommendation.

Who this A/B analysis template is for

This guide is written for data analysts, product analysts, growth analysts, and marketing analysts who already have an experiment result table but need a structured way to turn that result into a decision.

The goal is not to explain every statistical theory behind experimentation. The goal is to give you a reusable template: what to check before analysis, what SQL to write, what metrics to report, and how to write a recommendation that stakeholders can act on.

The A/B analysis checklist

Before you calculate lift, confirm these items. Most bad experiment reports fail before the first p-value appears.

Control and treatment definitions are documented.
The randomization unit is clear: user, account, session, store, or device.
Each unit appears in only one experiment group.
The primary metric was chosen before launch.
Guardrail metrics are defined, such as refund rate, churn, support tickets, or latency.
The exposure window and conversion window are fixed.
The experiment was not stopped early because one day looked good.
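
The "each unit appears in only one experiment group" item can be checked mechanically. The following is a minimal Python sketch, assuming the assignment rows have been exported as (unit_id, variant) pairs; the function name find_multi_variant_units and the sample rows are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


def find_multi_variant_units(assignments):
    """Return {unit_id: sorted variants} for units seen in more than one variant.

    `assignments` is an iterable of (unit_id, variant) pairs, for example
    rows exported from an assignment table.
    """
    variants_by_unit = defaultdict(set)
    for unit_id, variant in assignments:
        variants_by_unit[unit_id].add(variant)
    return {
        unit_id: sorted(variants)
        for unit_id, variants in variants_by_unit.items()
        if len(variants) > 1
    }


# Illustrative rows, not real data.
rows = [
    ("u1", "control"),
    ("u2", "treatment"),
    ("u3", "control"),
    ("u3", "treatment"),  # u3 leaked into both groups
    ("u1", "control"),    # duplicate row in the same variant: not a violation
]
contaminated = find_multi_variant_units(rows)
print(contaminated)  # {'u3': ['control', 'treatment']}
```

If the result is not empty, decide before analysis whether to exclude those units or investigate the assignment logic; either way, report how many were affected.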

Recommended metric table

For most product or growth experiments, build one table with one row per experiment group. Include sample size, conversions, conversion rate, revenue, revenue per user, guardrails, and the absolute and relative lift of treatment over control.

Metric table columns:
experiment_id
variant
users
conversions
conversion_rate
revenue
revenue_per_user
refund_rate
absolute_lift
relative_lift
decision_note
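
As a rough illustration of how the derived columns relate to the raw counts, here is a Python sketch that builds one row per group. The VariantTotals class, the build_metric_rows function, the refund_rate definition (refunds divided by conversions), and all numbers are assumptions made for this example. The free-text decision_note column is left out.

```python
from dataclasses import dataclass


@dataclass
class VariantTotals:
    variant: str
    users: int
    conversions: int
    revenue: float
    refunds: int


def safe_div(numerator, denominator):
    """Divide, returning None instead of raising when the denominator is 0."""
    return numerator / denominator if denominator else None


def build_metric_rows(experiment_id, control, treatment):
    """Return one dict per group; lift columns are filled on the treatment row only."""
    control_rate = safe_div(control.conversions, control.users)
    rows = []
    for group in (control, treatment):
        rate = safe_div(group.conversions, group.users)
        absolute_lift = relative_lift = None
        if group is treatment and rate is not None and control_rate is not None:
            # Absolute lift is a difference of rates; relative lift divides
            # that difference by the control rate.
            absolute_lift = rate - control_rate
            relative_lift = safe_div(absolute_lift, control_rate)
        rows.append({
            "experiment_id": experiment_id,
            "variant": group.variant,
            "users": group.users,
            "conversions": group.conversions,
            "conversion_rate": rate,
            "revenue": group.revenue,
            "revenue_per_user": safe_div(group.revenue, group.users),
            # Assumed definition: refunds per converting user.
            "refund_rate": safe_div(group.refunds, group.conversions),
            "absolute_lift": absolute_lift,
            "relative_lift": relative_lift,
        })
    return rows


# Illustrative totals, not real data.
metric_rows = build_metric_rows(
    "checkout_button_test",
    VariantTotals("control", 10_000, 500, 25_000.0, 20),
    VariantTotals("treatment", 10_000, 550, 26_400.0, 23),
)
for row in metric_rows:
    print(row)
```

Keeping the lift columns empty on the control row avoids reporting a meaningless "0% lift" for the baseline.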

SQL example: control versus treatment conversion rate

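This query assumes an assignment table and an event table. Adapt table names and date logic to your warehouse; the INTERVAL '7 days' syntax shown is PostgreSQL-style. The important part is the grain: one row per assigned user, then one conversion flag per user. Only include users whose full 7-day conversion window has elapsed, otherwise recently assigned users are counted as non-converters before they have had time to convert.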

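-- One row per user assigned to the experiment.
-- Assumes experiment_assignments holds a single row per user per experiment.
WITH assigned_users AS (
  SELECT
    experiment_id,
    user_id,
    variant,
    assigned_at
  FROM experiment_assignments
  WHERE experiment_id = 'checkout_button_test'
),
-- One conversion flag per user: 1 if a purchase happened within
-- 7 days after assignment, otherwise 0.
user_conversions AS (
  SELECT
    a.experiment_id,
    a.user_id,
    a.variant,
    MAX(CASE WHEN e.event_name = 'purchase_completed' THEN 1 ELSE 0 END) AS converted
  FROM assigned_users a
  -- LEFT JOIN keeps users with no events so they count as non-converters.
  LEFT JOIN events e
    ON e.user_id = a.user_id
   AND e.event_time >= a.assigned_at
   AND e.event_time < a.assigned_at + INTERVAL '7 days'
  GROUP BY 1, 2, 3
)
-- Aggregate to one row per variant.
SELECT
  variant,
  COUNT(*) AS users,
  SUM(converted) AS conversions,
  1.0 * SUM(converted) / NULLIF(COUNT(*), 0) AS conversion_rate
FROM user_conversions
GROUP BY 1
ORDER BY variant;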
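
The per-variant counts returned by the query above can feed a significance check. One common option, offered here only as a sketch and not as the required method, is a two-proportion z-test with a confidence interval for the absolute lift. It relies on a large-sample normal approximation; the function name two_proportion_ztest and the counts are hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_ztest(conv_c, n_c, conv_t, n_t, confidence=0.95):
    """Two-sided z-test for treatment vs control conversion rates.

    Returns the absolute lift, z statistic, p-value, and a confidence
    interval for the absolute lift (normal approximation).
    """
    p_c = conv_c / n_c
    p_t = conv_t / n_t
    diff = p_t - p_c

    # Pooled standard error: used for the test, which assumes equal rates under H0.
    pooled = (conv_c + conv_t) / (n_c + n_t)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    if se_pooled == 0:
        raise ValueError("No variation in outcomes; the z-test is undefined.")
    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Unpooled standard error: used for the interval around the observed lift.
    se_unpooled = math.sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)

    return {"absolute_lift": diff, "z": z, "p_value": p_value, "ci": ci}


# Illustrative counts, not real data: 5.0% vs 5.5% conversion.
result = two_proportion_ztest(conv_c=500, n_c=10_000, conv_t=550, n_t=10_000)
print(result)
```

Reporting the interval alongside the p-value shows stakeholders the plausible range of the effect, not only whether it cleared a threshold.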

SQL example: check sample ratio mismatch

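A sample ratio mismatch (SRM) means traffic did not split as expected. If the test was supposed to be 50/50 but the data is 70/30, do not trust the result until you understand why. Smaller gaps matter too: with large samples, even a 51/49 split can point to an assignment or logging bug, so compare the observed counts against the planned split with a chi-square goodness-of-fit test instead of judging by eye.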

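-- Share of assigned users per variant; compare with the planned split.
SELECT
  variant,
  COUNT(DISTINCT user_id) AS assigned_users,
  -- The window SUM runs over the grouped rows, so the denominator
  -- is the total number of assigned users across all variants.
  1.0 * COUNT(DISTINCT user_id)
    / SUM(COUNT(DISTINCT user_id)) OVER () AS traffic_share
FROM experiment_assignments
WHERE experiment_id = 'checkout_button_test'
GROUP BY 1
ORDER BY 1;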
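
To turn the traffic shares from the query above into a pass/fail check, one option is a chi-square goodness-of-fit test on the assigned-user counts. The Python sketch below handles the two-variant case only, where the p-value for one degree of freedom can be computed with the standard library. The function name srm_check, the strict 0.001 alert threshold, and the counts are assumptions for illustration.

```python
import math


def srm_check(observed, expected_share, alpha=0.001):
    """Chi-square goodness-of-fit test for a two-variant traffic split.

    observed:       {variant: assigned user count}
    expected_share: {variant: planned share}, must sum to 1
    Returns (chi2, p_value, srm_detected).
    """
    if len(observed) != 2:
        raise ValueError("This sketch only supports exactly two variants.")
    if abs(sum(expected_share.values()) - 1.0) > 1e-9:
        raise ValueError("Expected shares must sum to 1.")

    total = sum(observed.values())
    chi2 = 0.0
    for variant, count in observed.items():
        expected = total * expected_share[variant]
        chi2 += (count - expected) ** 2 / expected

    # With two variants there is 1 degree of freedom, and the chi-square
    # survival function reduces to erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value, p_value < alpha


planned = {"control": 0.5, "treatment": 0.5}

# Illustrative counts, not real data.
print(srm_check({"control": 50_000, "treatment": 50_000}, planned))
print(srm_check({"control": 51_000, "treatment": 49_000}, planned))
```

A deliberately small alpha keeps false alarms rare, since an SRM alert usually blocks the whole analysis until the cause is found.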

How to write the decision

A useful A/B test report should not end with "statistically significant" or "not statistically significant." It should explain what the team should do next.

Decision template:
We recommend [launch / do not launch / extend / rerun].
The primary metric moved from [baseline] to [treatment result],
which is an absolute change of [x percentage points] and a relative lift of [y%].
Guardrail metrics [were stable / changed].
The main caveat is [sample size, tracking issue, segment difference, or external event].
Next step: [rollout plan or follow-up test].
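
The template distinguishes percentage points (absolute change) from percent (relative lift), which is easy to mix up when filling it in by hand. As a small sketch, the Python function below fills the template from two rates; render_decision and its arguments are hypothetical names, and the inputs are illustrative.

```python
def render_decision(recommendation, baseline_rate, treatment_rate,
                    guardrails_stable, caveat, next_step):
    """Fill the decision template from a baseline and a treatment rate."""
    # Absolute change is expressed in percentage points.
    absolute_pp = (treatment_rate - baseline_rate) * 100
    # Relative lift is the change as a percent of the baseline.
    relative_pct = (treatment_rate - baseline_rate) / baseline_rate * 100
    guardrail_text = "were stable" if guardrails_stable else "changed"
    return (
        f"We recommend {recommendation}.\n"
        f"The primary metric moved from {baseline_rate:.2%} to {treatment_rate:.2%},\n"
        f"which is an absolute change of {absolute_pp:+.2f} percentage points "
        f"and a relative lift of {relative_pct:+.1f}%.\n"
        f"Guardrail metrics {guardrail_text}.\n"
        f"The main caveat is {caveat}.\n"
        f"Next step: {next_step}."
    )


# Illustrative inputs, not real results.
decision_text = render_decision(
    recommendation="extend",
    baseline_rate=0.05,
    treatment_rate=0.055,
    guardrails_stable=True,
    caveat="sample size",
    next_step="run for two more weeks",
)
print(decision_text)
```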

A/B analysis vs A/B testing

A/B testing is the experiment design and traffic split. A/B analysis is the post-experiment work that validates the data, calculates the effect, checks risks, and explains the decision. A strong test can still produce a bad recommendation if the analysis ignores sample ratio mismatch, tracking problems, segment imbalance, or guardrail movement.

Common A/B test analysis mistakes

Using sessions as the unit when users can appear in both variants.
Changing the primary metric after looking at results.
Reporting relative lift without baseline rate and sample size.
Ignoring guardrail metrics when the primary metric improves.
Mixing pre-exposure and post-exposure events in the same conversion window.
Calling a test failed when it was underpowered rather than negative.
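
To judge whether a flat result is underpowered rather than negative, compare the actual sample size with the size needed to detect the smallest lift the team cares about. The Python sketch below uses a standard normal-approximation formula for comparing two proportions with equal group sizes; the function name required_users_per_group, the default alpha of 0.05, and the default power of 0.8 are assumptions for illustration.

```python
import math
from statistics import NormalDist


def required_users_per_group(baseline_rate, min_relative_lift,
                             alpha=0.05, power=0.8):
    """Approximate users needed per group for a two-sided two-proportion test."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + min_relative_lift)
    if not (0 < p1 < 1 and 0 < p2 < 1) or p1 == p2:
        raise ValueError("Rates must be distinct and strictly between 0 and 1.")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    p_bar = (p1 + p2) / 2

    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


# Illustrative planning inputs: 5% baseline, 10% minimum relative lift.
needed = required_users_per_group(baseline_rate=0.05, min_relative_lift=0.10)
print(needed)
```

If the experiment collected far fewer users per group than this estimate, a non-significant result says little about whether the change works.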

Download the A/B test checklist

Use the checklist before launch and again before reporting results. It helps catch missing hypothesis definitions, wrong metric windows, sample ratio issues, and unclear decision rules.
