Statistical methodology for count trend experiments

Trends experiments for count-based data use Bayesian statistics with a Gamma-Poisson model to evaluate the win probabilities and credible intervals for an experiment. Read the statistics primer for an overview if you haven't already.

What the heck is a Gamma-Poisson model?

Imagine you run a pizza shop and want to know how many slices a customer typically orders. Some days customers might order 1 slice, others 3 slices, and occasionally someone might order 6 slices! This kind of count data (1, 2, 3, etc.) follows what's called a Poisson distribution.

The Poisson distribution has one key parameter: the average rate. In our pizza example, maybe it's 2.5 slices per customer. But here's the catch: we don't know the true rate for sure. We can only estimate it from our observations.

This is where the Gamma distribution comes in. It helps us model our uncertainty about the true rate:

  • When we have very little data, the Gamma distribution is wide, saying "hey, the true rate could be anywhere in this broad range".
  • As we collect more data, the Gamma distribution gets narrower, saying "we're getting more confident about what the true rate is".

So when we say we're using a Gamma-Poisson model for experiments, we're:

  1. Using the Poisson distribution to model how count data naturally varies.
  2. Using the Gamma distribution to express our uncertainty about the true rate.
  3. Getting more confident in our estimates over time.

Our Gamma-Poisson model uses a minimally informative prior of ALPHA_PRIOR = 1 and BETA_PRIOR = 1.
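To make this concrete, here's a minimal sketch of the conjugate posterior update, written with SciPy (the counts, function name, and use of SciPy are illustrative assumptions, not PostHog's actual implementation). It builds the Gamma posterior from the prior plus observed data and shows how the 95% interval narrows as more data arrives:

```python
# Minimal sketch of the Gamma-Poisson posterior update (illustrative only).
# Assumes SciPy; the counts below are made up.
from scipy import stats

ALPHA_PRIOR = 1  # minimally informative prior (shape)
BETA_PRIOR = 1   # minimally informative prior (rate)

def rate_posterior(total_events, total_users):
    # Conjugate update: the shape grows with observed events, the rate with exposure (users).
    return stats.gamma(a=ALPHA_PRIOR + total_events, scale=1 / (BETA_PRIOR + total_users))

# The 95% interval narrows as more data accumulates at the same underlying rate.
for events, users in [(5, 50), (50, 500), (500, 5000)]:
    low, high = rate_posterior(events, users).interval(0.95)
    print(f"{events} events / {users} users -> rate between {low:.3f} and {high:.3f}")
```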

Win probabilities

The win probability tells you how likely it is that a given variant has the highest rate compared to all other variants in the experiment. It helps you determine whether the difference you observe reflects a real effect or simply random chance.

Let's say you're testing a new feature and have these results:

  • Control: 100 events from 1000 users (rate of 0.1 events per user)
  • Test: 150 events from 1000 users (rate of 0.15 events per user)

To calculate the win probabilities for the experiment, our methodology:

  1. Models each variant's rate using a Gamma distribution (shape = events + ALPHA_PRIOR, rate = users + BETA_PRIOR):

    • Control: Gamma(100 + ALPHA_PRIOR, 1000 + BETA_PRIOR)
    • Test: Gamma(150 + ALPHA_PRIOR, 1000 + BETA_PRIOR)
  2. Takes 10,000 random samples from each distribution.

  3. Checks which variant had the higher rate for each sample.

  4. Calculates the final win probabilities:

    • Control wins in 40 out of 10,000 samples = 0.4% probability
    • Test wins in 9,960 out of 10,000 samples = 99.6% probability

These results tell us we can be 99.6% confident that the test variant performs better than the control.
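Here's a minimal sketch of that sampling procedure, using NumPy and SciPy with the example counts above (the code and names are illustrative, not PostHog's production implementation). The exact win probabilities vary slightly from run to run because they come from random samples:

```python
# Minimal sketch of the Monte Carlo win probability calculation (illustrative only).
import numpy as np
from scipy import stats

ALPHA_PRIOR, BETA_PRIOR = 1, 1
N_SAMPLES = 10_000
rng = np.random.default_rng(42)

def sample_rates(events, users):
    # Draw samples from the variant's Gamma posterior over its rate.
    return stats.gamma.rvs(
        a=ALPHA_PRIOR + events,
        scale=1 / (BETA_PRIOR + users),
        size=N_SAMPLES,
        random_state=rng,
    )

control = sample_rates(events=100, users=1000)
test = sample_rates(events=150, users=1000)

# Win probability: the fraction of samples in which the variant has the higher rate.
test_wins = np.mean(test > control)
print(f"Test win probability:    {test_wins:.1%}")
print(f"Control win probability: {1 - test_wins:.1%}")
```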

Credible intervals

A credible interval tells you the range where the true rate lies with 95% probability. Unlike traditional confidence intervals, credible intervals give you a direct probability statement about the rate.

For example, if you have these results:

  • Control: 100 events from 1000 users (rate of 0.1 events per user)
  • Test: 150 events from 1000 users (rate of 0.15 events per user)

To calculate the credible intervals for the experiment, our methodology:

  1. Create a Gamma distribution for each variant:

    • Control: Gamma(100 + ALPHA_PRIOR, 1000 + BETA_PRIOR)
    • Test: Gamma(150 + ALPHA_PRIOR, 1000 + BETA_PRIOR)
  2. Find the 2.5th and 97.5th percentiles of each distribution:

    • Control: [0.082, 0.122] = "You can be 95% confident the true rate is between 0.082 and 0.122 events per user"
    • Test: [0.128, 0.176] = "You can be 95% confident the true rate is between 0.128 and 0.176 events per user"

Since these intervals don't overlap, you can be quite confident that the test variant performs better than the control. The intervals will become narrower as you collect more data, reflecting your increasing certainty about the true rates.
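As a minimal sketch (again using SciPy and the example counts above, not PostHog's production code), the credible intervals come from the 2.5th and 97.5th percentiles of each variant's posterior:

```python
# Minimal sketch of the 95% credible interval calculation (illustrative only).
from scipy import stats

ALPHA_PRIOR, BETA_PRIOR = 1, 1

def credible_interval(events, users):
    # 95% credible interval for the rate (events per user).
    posterior = stats.gamma(a=ALPHA_PRIOR + events, scale=1 / (BETA_PRIOR + users))
    return posterior.ppf(0.025), posterior.ppf(0.975)

print("Control:", credible_interval(events=100, users=1000))
print("Test:   ", credible_interval(events=150, users=1000))
```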
