Foundational · Industry primer · Incrementality · 12 min read

What is incrementality testing? The only honest measure of ad performance

Incrementality testing is the discipline of measuring what your ads actually caused versus what would have happened anyway. It's the answer to 'is Meta lying to me about ROAS?' (often, yes). This primer explains what incrementality is, the six components a working test needs, and the amateur-vs-elite gap between dashboard-watching and lift-measuring.

Start here

Incrementality testing measures the lift caused by your ads - not the conversions that merely happened while your ads were running

Platform-attributed ROAS tells you how many conversions Meta or TikTok credited to your ads. It does NOT tell you how many of those conversions would have happened without your ads. The gap between the two is the difference between attributed performance and incremental performance - and that gap can be huge.

The honest answer to 'are my ads working?' isn't on the dashboard. It's in a controlled experiment: show ads to one group, withhold ads from a comparable group, measure the difference. That's incrementality testing. The math is simple; the discipline of running it well is rare.
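If it helps to see that arithmetic, here is a minimal sketch in Python. Every number below (group sizes, conversion counts, spend, order value) is made up for illustration; the point is that lift is a difference between test and control, not a dashboard-credited count.

```python
# Minimal sketch of incrementality arithmetic - all numbers are illustrative.
test_users = 500_000          # users who saw ads
control_users = 500_000       # comparable users held out from ads
test_conversions = 6_200
control_conversions = 5_000
ad_spend = 150_000            # spend on the test cell, in dollars
avg_order_value = 80          # dollars per conversion

test_rate = test_conversions / test_users
control_rate = control_conversions / control_users

# Lift = what the ads caused, i.e. conversions above the no-ads baseline.
absolute_lift = test_rate - control_rate
relative_lift = absolute_lift / control_rate
incremental_conversions = absolute_lift * test_users
incremental_roas = (incremental_conversions * avg_order_value) / ad_spend

print(f"Relative lift: {relative_lift:.1%}")                        # 24.0%
print(f"Incremental conversions: {incremental_conversions:,.0f}")   # 1,200
print(f"Incremental ROAS: {incremental_roas:.2f}")                  # 0.64
```

Compare that 0.64 incremental ROAS with whatever the platform dashboard claims for the same spend; the difference between the two numbers is the attribution gap discussed below.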

Common Thread Collective's published research found that platform attribution captures at best ~86% of actual new customer acquisition, and the standard 1-day click window captures only ~47%. Operators making decisions on the dashboard alone are systematically over-funding the formats that capture last clicks and under-funding the formats that drive upstream lift.

Common misidentifications

It's not this. It's that.

The most-common confusions, lined up side-by-side.

Not this: Incrementality = A/B testing creative
This: Incrementality = testing whether the channel/campaign drives lift vs a no-ads control

Not this: ROAS measures incrementality
This: ROAS measures attributed conversions during ad exposure - not lift caused by the ads

Not this: Incrementality is for big budgets only
This: Incrementality is for any budget where 20%+ can be held out for 1-2 weeks - usually $50K+/mo accounts

Not this: One incrementality test is enough
This: Incrementality needs a quarterly cadence - channels and creatives drift, and last quarter's lift number expires

Anatomy

The 6 components of a working incrementality test

Incrementality tests fail in predictable ways. The six components below have to be in place or the test won't produce a defensible number.

A true control group

Why it matters

Without a true control, you can't distinguish 'caused by ads' from 'would have happened anyway'.

Concrete example

Meta GeoLift test: ads run in 60% of US metros, paused in matched 40%. Measure conversion difference over 14-21 days.
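As a rough illustration of how a geo read works (Meta's open-source GeoLift package uses a more sophisticated synthetic-control method; this hand calculation is only the intuition), here is a matched-market sketch. The metro pairs and conversion counts are hypothetical.

```python
# Hypothetical matched metro pairs:
# (label, test_pre, test_during, control_pre, control_during).
# Ads ran in the test metros during the test window and were paused in the controls.
geo_pairs = [
    ("Denver vs Kansas City", 1_000, 1_240, 950, 1_010),
    ("Austin vs Nashville",     800,   990, 820,   870),
]

total_actual = 0.0
total_expected = 0.0
for label, test_pre, test_during, ctrl_pre, ctrl_during in geo_pairs:
    # Expected conversions in the test metro had it behaved like its control,
    # scaled by the pre-period baseline ratio between the two markets.
    expected = ctrl_during * (test_pre / ctrl_pre)
    total_actual += test_during
    total_expected += expected
    print(f"{label}: actual {test_during}, expected {expected:.0f}")

geo_lift = (total_actual - total_expected) / total_expected
print(f"Estimated lift over the test window: {geo_lift:.1%}")
```

In practice you also want a confidence interval around that estimate, which is exactly what the purpose-built tools provide.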

The gap

The 8 differences between amateur and elite incrementality practice

Most operators talk about incrementality. Few run it well. The gaps below are what separate the practice from the rhetoric.

Cadence
Amateur: Run once when a stakeholder demands it
Elite: Quarterly schedule, channel by channel

Sample size discipline
Amateur: Run for as long as it takes to 'feel real'
Elite: Power analysis upfront; pre-defined sample

Hypothesis registration
Amateur: Threshold set post-hoc to fit the data
Elite: Pre-registered threshold, no goalpost moving

Test isolation
Amateur: 'Test all of Meta'
Elite: Isolate by channel + campaign type

Length
Amateur: 7 days
Elite: Minimum one full purchase cycle (14-28 days)

Tooling
Amateur: Built ad hoc in spreadsheets
Elite: Meta Conversion Lift, GeoLift, Haus, Recast, or proper MMM

Use of result
Amateur: Result mentioned in deck once, ignored
Elite: Result rewrites budget allocation immediately

Attribution literacy
Amateur: Optimizes on dashboard ROAS only
Elite: Reads dashboard ROAS, MMM, and lift - knows which contradicts which

Pitfalls

The most common mistakes

Each one alone is recoverable. Several stacked together break the practice.

Pitfall 1

Confusing A/B creative testing with incrementality

A/B creative tests measure which creative wins; incrementality tests measure whether the channel/campaign drives lift. Different jobs, different methods. Most operators conflate them.

Pitfall 2

Running under-powered tests

A 7-day test on $30K/day spend usually can't detect a 10% lift. The sample math has to come first; the test schedule has to fit the math.
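One way to do the sample math first, sketched with the Python standard library and the textbook two-proportion formula. The baseline conversion rate and detectable lift below are placeholders, not recommendations.

```python
from statistics import NormalDist

def sample_size_per_group(base_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate users needed per group to detect a relative lift in
    conversion rate with a two-sided z-test at the given alpha and power."""
    p1 = base_rate
    p2 = base_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ((z_alpha + z_power) ** 2 * variance) / (p2 - p1) ** 2

# Illustrative: 1% baseline conversion rate, looking for a 10% relative lift.
n = sample_size_per_group(base_rate=0.01, relative_lift=0.10)
print(f"~{n:,.0f} users per group")   # on the order of 160,000 per group
```

The test should then run until each group has accrued roughly that many users; at inputs like these, that is often longer than a week.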

Pitfall 3

Moving the threshold after seeing the result

Pre-register the threshold before launch. Post-hoc threshold-changing is the #1 way incrementality 'proves' channels that didn't actually lift.

Pitfall 4

Ignoring the lift number once measured

Most operators run incrementality, get a result, and don't change their budget allocation. The discipline is letting the number rewrite the math. Otherwise the test is decoration.

Glossary

Related terms you should know

The vocabulary that surrounds this concept. Bookmark this section.

Incrementality

The conversions caused by an ad/channel/campaign, measured against a no-ad control.

Lift

The difference in outcomes between test and control groups, attributable to the intervention.

Geo-holdout / GeoLift

Incrementality test design where geographic regions are split into test (sees ads) and control (doesn't). Meta GeoLift is the standard tooling.

Conversion Lift

Meta's first-party incrementality study product - randomized within-audience holdout.

MMM (Marketing Mix Modeling)

Statistical model that decomposes total sales into contributions from channels + non-channel factors. Recast and Haus both build MMM tools.

MTA (Multi-Touch Attribution)

Attribution methodology that credits multiple touchpoints in a conversion journey. Northbeam, Triple Whale, and Hyros all build MTA tools.

Attribution gap

The difference between platform-reported conversions and incrementally-measured conversions. Often 30-50%.

Lift multiplier

The ratio of incremental conversions to platform-attributed conversions. Used to adjust attributed ROAS into incremental ROAS - a worked sketch follows this glossary.

Pre-registration

Writing down the hypothesis, threshold, and timeframe before the test starts. The discipline that prevents post-hoc rationalization.
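To make the attribution gap and the lift multiplier concrete, here is the adjustment they imply, with placeholder numbers:

```python
# Placeholder numbers - substitute your own test results.
attributed_conversions = 2_000    # conversions the platform dashboard credits
incremental_conversions = 1_300   # conversions the lift test actually measured
attributed_roas = 4.0             # ROAS as reported on the dashboard

lift_multiplier = incremental_conversions / attributed_conversions   # 0.65
attribution_gap = 1 - lift_multiplier                                # 0.35, i.e. 35%
incremental_roas = attributed_roas * lift_multiplier                 # 2.60

print(f"Lift multiplier: {lift_multiplier:.2f}")
print(f"Attribution gap: {attribution_gap:.0%}")
print(f"Incremental ROAS: {incremental_roas:.2f} vs {attributed_roas:.2f} attributed")
```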

Where Shuttergen fits

Foundational knowledge in. 25 variants out.

Once you understand the discipline at this level, the bottleneck moves to production. Shuttergen turns one validated concept - anchored to your starting image - into 25 brand-safe variants you can test. The strategist stays in the loop; the production grind goes away.

Try Shuttergen free



Foundational knowledge. Now ship the variants.

Shuttergen turns understanding into output - one validated concept into 25 brand-safe variants in hours, not weeks.

Start free