What is incrementality testing? The only honest measure of ad performance
Incrementality testing is the discipline of measuring what your ads actually caused vs what would have happened anyway. It's the answer to 'is Meta lying to me about ROAS?' (often, yes). This primer explains what incrementality is, the six components of a working test, and the amateur-vs-elite gap between dashboard-watching and lift-measuring.
Incrementality testing measures lift caused by your ads - not conversions that happened during your ads
Platform-attributed ROAS tells you how many conversions Meta or TikTok credited to your ads. It does NOT tell you how many of those conversions would have happened without your ads. The gap between the two is the difference between attributed performance and incremental performance - and that gap can be huge.
The honest answer to 'are my ads working?' isn't on the dashboard. It's in a controlled experiment: show ads to one group, withhold ads from a comparable group, measure the difference. That's incrementality testing. The math is simple; the discipline of running it well is rare.
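At its core the readout is two conversion rates and a subtraction. A minimal sketch with hypothetical numbers (no real account data):

```python
# Hypothetical test/control readout -- illustrative numbers only.
test_conversions = 1_240      # group that saw ads
test_users = 100_000
control_conversions = 1_050   # comparable group, ads withheld
control_users = 100_000

test_cvr = test_conversions / test_users
control_cvr = control_conversions / control_users

# Incremental lift: conversions caused by the ads, above the no-ads baseline
absolute_lift = test_cvr - control_cvr
relative_lift = absolute_lift / control_cvr

print(f"absolute lift: {absolute_lift:.4%}")   # extra conversions per user
print(f"relative lift: {relative_lift:.1%}")   # % above the control baseline
```

The subtraction is trivial; the hard part is making the control group genuinely comparable, which is what the rest of this primer is about.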
Common Thread Collective's published research found that platform attribution captures only ~86% of actual new customer acquisition at best, and the standard 1-day click window captures only ~47%. Operators making decisions on the dashboard alone are systematically over-funding the formats that capture last clicks and under-funding the formats that drive upstream lift.
Common misidentifications
It's not this. It's that.
The most common confusions, lined up side by side.
Not this
Incrementality = A/B testing creative
This
Incrementality = testing whether the channel/campaign drives lift vs no-ads control
Not this
ROAS measures incrementality
This
ROAS measures attributed conversions during ad exposure - not lift caused by the ads
Not this
Incrementality is for big budgets only
This
Incrementality is for any budget where 20%+ can be held out for 1-2 weeks - usually means $50K+/mo accounts
Not this
One incrementality test is enough
This
Incrementality needs quarterly cadence - channels and creatives drift, last quarter's lift number expires
Anatomy
The 6 components of a working incrementality test
Incrementality tests fail in predictable ways. The six components below have to be in place or the test won't produce a defensible number.
Why it matters
Without a true control, you can't distinguish 'caused by ads' from 'would have happened anyway'.
Concrete example
Meta GeoLift test: ads run in 60% of US metros, paused in the matched 40%. Measure the conversion difference over 14-21 days.
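Flattened to its arithmetic, the geo-holdout readout looks like the sketch below. This is a naive difference-in-averages, not GeoLift's actual synthetic-control estimator, and it assumes the control metros were genuinely matched on baseline volume; the metro counts are hypothetical.

```python
# Avg daily conversions per metro -- hypothetical, for illustration only.
test_metros = {"NYC": 420, "LA": 385, "CHI": 290}   # ads running
control_metros = {"DAL": 310, "PHX": 265}           # matched metros, ads paused

# Compare per-metro averages so the 60/40 split doesn't bias the estimate
test_avg = sum(test_metros.values()) / len(test_metros)
control_avg = sum(control_metros.values()) / len(control_metros)

lift_per_metro = test_avg - control_avg
relative_lift = lift_per_metro / control_avg
print(f"avg lift per metro per day: {lift_per_metro:.1f} ({relative_lift:.1%})")
```

Real GeoLift tooling replaces the raw control average with a synthetic control built from pre-period data, which is what makes the 14-21 day window defensible.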
The gap
The 8 differences between amateur and elite incrementality practice
Most operators talk about incrementality. Few run it well. The gaps below are what separate the practice from the rhetoric.
Pitfalls
The most common mistakes
Each one alone is recoverable. Several stacked together break the practice.
Confusing A/B creative testing with incrementality
A/B creative tests measure which creative wins; incrementality tests measure whether the channel/campaign drives lift. Different jobs, different methods. Most operators conflate them.
Running under-powered tests
A 7-day test on $30K/day spend usually can't detect a 10% lift. The sample math has to come first; the test schedule has to fit the math.
Moving the threshold after seeing the result
Pre-register the threshold before launch. Post-hoc threshold-changing is the #1 way incrementality tests 'prove' channels that didn't actually drive lift.
Ignoring the lift number once measured
Most operators run incrementality, get a result, and don't change their budget allocation. The discipline is letting the number rewrite the math. Otherwise the test is decoration.
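The under-powered-test pitfall above comes down to arithmetic you can run before launch. A sketch of the per-arm sample-size math for a two-proportion z-test (the same math behind any standard power calculator; the inputs are illustrative):

```python
from math import sqrt
from statistics import NormalDist

def users_per_side(baseline_cvr: float, mde_rel: float,
                   alpha: float = 0.05, power: float = 0.80) -> int:
    """Users needed per arm for a two-proportion z-test.
    baseline_cvr: control conversion rate; mde_rel: relative lift to detect."""
    p1 = baseline_cvr
    p2 = baseline_cvr * (1 + mde_rel)
    p_bar = (p1 + p2) / 2
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided significance
    z_b = NormalDist().inv_cdf(power)
    n = ((z_a * sqrt(2 * p_bar * (1 - p_bar))
          + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2) / (p1 - p2) ** 2
    return int(n) + 1

# Detecting a 10% relative lift on a 2% baseline CVR takes roughly
# 80K users per side -- run this before setting the test schedule.
print(users_per_side(0.02, 0.10))
```

If your traffic can't fill that sample inside the planned window, lengthen the test or raise the minimum detectable effect; don't just run it anyway.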
Glossary
Related terms you should know
The vocabulary that surrounds this concept. Bookmark this section.
Incrementality
The conversions caused by an ad/channel/campaign, measured against a no-ad control.
Lift
The difference in outcomes between test and control groups, attributable to the intervention.
Geo-holdout / GeoLift
Incrementality test design where geographic regions are split into test (sees ads) and control (doesn't). Meta GeoLift is the standard tooling.
Conversion Lift
Meta's first-party incrementality study product - randomized within-audience holdout.
MMM (Marketing Mix Modeling)
Statistical model that decomposes total sales into contributions from channels + non-channel factors. Recast and Haus build MMM tools.
MTA (Multi-Touch Attribution)
Attribution methodology that credits multiple touchpoints in a conversion journey. Northbeam, Triple Whale, Hyros build MTA.
Attribution gap
The difference between platform-reported conversions and incrementally-measured conversions. Often 30-50%.
Lift multiplier
The ratio of incremental conversions to platform-attributed conversions. Used to adjust attributed ROAS into incremental ROAS.
Pre-registration
Writing down the hypothesis, threshold, and timeframe before the test starts. The discipline that prevents post-hoc rationalization.
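The lift-multiplier adjustment defined above is one line of arithmetic. A hypothetical example (numbers invented for illustration):

```python
# Hypothetical readout: adjust attributed ROAS with a measured lift multiplier.
attributed_conversions = 1_000      # what the platform dashboard claims
incremental_conversions = 620       # what the holdout test measured
lift_multiplier = incremental_conversions / attributed_conversions  # 0.62

attributed_roas = 3.5
incremental_roas = attributed_roas * lift_multiplier
print(f"incremental ROAS: {incremental_roas:.2f}")  # 2.17
```

This is the number that should rewrite the budget math; a 3.5 dashboard ROAS that is really a 2.17 incremental ROAS changes which channels deserve the next dollar.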
Foundational knowledge in. 25 variants out.
Once you understand the discipline at this level, the bottleneck moves to production. Shuttergen turns one validated concept - anchored to your starting image - into 25 brand-safe variants you can test. The strategist stays in the loop; the production grind goes away.
Try Shuttergen free
Related Shuttergen reading
Where to go next
The connected pages that compound on this one.
Primer · Attribution
What is attribution? The mostly-broken science of crediting conversions to ads
Foundational primer on attribution - 6 models (last-click, first-click, linear, position-based, MTA, MMM), the amateur-vs-elite gap, and why dashboard ROAS is rarely the truth.
Read
Playbook · Testing
How to run creative tests that actually move the business
6-step framework - isolated variables, power calculations, pre-registered thresholds, geo-holdouts, documentation discipline.
Read
Calculator · Sample size
Creative test sample size calculator: how big does your test need to be?
Interactive power-analysis calculator for creative testing. Real two-proportion z-test math. Inputs: baseline CVR, MDE, alpha, power, CPC, traffic. Outputs: conversions per side, spend estimate, days to complete.
Read
Research · Static vs video
Static vs video ads: which converts better, and when?
Honest interactive research with operator citations from Plofker, Shackelford, Hott, Pilothouse, Common Thread, Foxwell, Triple Whale, Recast. The 6 variables that decide + the advertorial-alignment caveat.
Read
Sources
What we read to build this
How to run creative tests (Shuttergen playbook)
Shuttergen
Creative test sample size calculator (Shuttergen)
Shuttergen
Static vs video ads research (Shuttergen)
Shuttergen
Triple Whale - Incrementality
Triple Whale KB
Haus - Meta Incrementality Testing
Haus
CTC × Northbeam - Enterprise Attribution for DTC
Common Thread Collective
Foundational knowledge. Now ship the variants.
Shuttergen turns understanding into output - one validated concept into 25 brand-safe variants in hours, not weeks.
Start free