Research · 10 min read

Attribution Modeling is Broken. Here's Why.

Last-touch, first-touch, linear — every attribution model is wrong in the same way. They measure correlation, not causation. Here's why this matters and what to do about it.

Every major marketing platform offers attribution modeling. Google Analytics gives you data-driven attribution. Your DSP offers last-touch, first-touch, linear, time-decay, and position-based models. Your CDP probably has its own version. And they all have the same fundamental problem.

They measure correlation. None of them — not even "data-driven" attribution — answer the question your budget decisions actually depend on: what would have happened if this channel hadn't been there?

How Attribution Models Work (and Why They're Flawed)

All standard attribution models observe conversion journeys and assign credit to touchpoints based on some rule. Last-touch gives 100% credit to the final touchpoint before conversion. First-touch gives it to the first. Linear splits it equally. Time-decay gives more credit to recent touchpoints.
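To make the rule-based nature of these models concrete, here is a minimal sketch (not any vendor's implementation) of how each one splits credit across an ordered conversion path. The function name and path values are illustrative.

```python
def attribute(touchpoints, model="linear", decay=0.5):
    """Split one conversion's credit across its ordered touchpoints.

    Returns {touchpoint: credit}, with credits summing to 1.0.
    """
    n = len(touchpoints)
    if model == "last_touch":
        weights = [0.0] * (n - 1) + [1.0]     # all credit to the final touch
    elif model == "first_touch":
        weights = [1.0] + [0.0] * (n - 1)     # all credit to the first touch
    elif model == "linear":
        weights = [1.0 / n] * n               # equal split
    elif model == "time_decay":
        # More recent touchpoints get exponentially more credit.
        raw = [decay ** (n - 1 - i) for i in range(n)]
        weights = [w / sum(raw) for w in raw]
    else:
        raise ValueError(f"unknown model: {model}")

    credit = {}
    for tp, w in zip(touchpoints, weights):
        credit[tp] = credit.get(tp, 0.0) + w  # aggregate repeated touchpoints
    return credit

path = ["display", "email", "brand_search"]
print(attribute(path, "last_touch"))  # brand_search gets 100%
print(attribute(path, "linear"))      # each touchpoint gets 1/3
```

Note that every branch is just a deterministic weighting rule applied to the observed path; none of them asks whether the conversion would have happened without a given touchpoint.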

"Data-driven" attribution uses machine learning to find patterns in which touchpoint combinations correlate with higher conversion rates. It sounds more sophisticated — and it is — but it's still measuring correlation. A touchpoint that appears often in converting journeys gets more credit, even if those journeys would have converted anyway.

The brand search problem: customers who are about to buy often search your brand name before converting. Brand search gets enormous attribution credit under almost every model. But did the brand search cause the conversion — or did the customer search because they had already decided to buy?

The Confounding Problem

The reason attribution models give misleading answers is confounding. High-intent customers behave differently from low-intent customers. They click on more ads. They search the brand more. They open more emails. So naturally, they show up more often in "successful" conversion paths.

Any model that assigns credit based on presence in a conversion path will systematically over-attribute to channels that high-intent customers happen to use — not because those channels caused the conversions, but because high-intent customers were going to convert regardless.
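A small simulation makes this argument tangible. In the synthetic data below, channel exposure has zero causal effect on conversion; intent drives both exposure and conversion. All probabilities are made up for illustration, yet the exposed group still converts at a far higher rate.

```python
import random

random.seed(0)
rows = []
for _ in range(100_000):
    high_intent = random.random() < 0.2
    # Intent drives exposure to the channel...
    exposed = random.random() < (0.8 if high_intent else 0.1)
    # ...and intent drives conversion. Exposure itself has ZERO effect.
    converted = random.random() < (0.5 if high_intent else 0.02)
    rows.append((high_intent, exposed, converted))

def rate(rs):
    return sum(c for _, _, c in rs) / len(rs)

exposed_rate = rate([r for r in rows if r[1]])
unexposed_rate = rate([r for r in rows if not r[1]])
print(f"exposed:   {exposed_rate:.3f}")    # far higher
print(f"unexposed: {unexposed_rate:.3f}")  # far lower -- yet true lift is zero
```

Any presence-based credit rule applied to this data would reward the channel handsomely, even though removing it would change nothing.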

Simpson's Paradox in Attribution

Simpson's Paradox occurs when a trend that appears in aggregated data disappears or reverses once the data is segmented. It's surprisingly common in marketing analytics.

A real pattern: a retargeting campaign appears to drive a 12% higher conversion rate than the control group in aggregate. But when segmented by prior purchase history, it has zero effect on new customers and a negative effect on loyal customers (who find the retargeting annoying). The aggregate result — the one that drives budget decisions — is an artifact of the mix of customer types, not the campaign itself.
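The pattern is easy to reproduce with hypothetical counts. In the illustrative numbers below, the segment-level effects are zero (new customers) and negative (loyal customers), but because the treated group skews toward the high-converting loyal segment, the aggregate comparison looks strongly positive.

```python
data = {
    # segment: (treated_n, treated_conv, control_n, control_conv)
    "new":   (2000,   40, 8000,  160),  #  2.0% vs  2.0% -> no effect
    "loyal": (8000, 1200, 2000,  360),  # 15.0% vs 18.0% -> negative effect
}

t_n = sum(v[0] for v in data.values())
t_c = sum(v[1] for v in data.values())
c_n = sum(v[2] for v in data.values())
c_c = sum(v[3] for v in data.values())

# Aggregate looks positive only because of the segment mix.
print(f"aggregate: treated {t_c / t_n:.1%} vs control {c_c / c_n:.1%}")
for seg, (tn, tc, cn, cc) in data.items():
    print(f"{seg:>5}: treated {tc / tn:.1%} vs control {cc / cn:.1%}")
```

The aggregate comparison reverses purely because of who was targeted, not what the campaign did.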

Attribution models can't detect or correct for Simpson's Paradox because they don't model the causal structure of the data.

Marketing Mix Modeling (MMM) Is Not the Answer

Marketing Mix Modeling uses time-series regression to estimate channel contributions to revenue. It's better than touchpoint attribution in some ways — it can capture diminishing returns and halo effects — but it has its own fundamental limitation.

MMM estimates correlation between spend and revenue over time. It can't distinguish between channels that cause revenue and channels that are correlated with revenue because of shared seasonality or brand momentum. A channel that you increase spend on during high-demand periods will appear more effective than it is.

  • MMM models typically require 2–3 years of weekly data to produce stable estimates — most companies don't have this
  • They can't produce customer-level estimates — only aggregate spend-to-revenue curves
  • They assume channel response curves are stable across the modeled period, which is rarely true
  • They can't estimate heterogeneous effects — which channels work for which customer segments
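The seasonality confounding problem can be demonstrated in a few lines. In the synthetic series below, spend tracks seasonal demand but has zero true effect on revenue; a naive spend-vs-revenue regression still produces a large, spurious coefficient. The numbers are fabricated for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
weeks = np.arange(104)                          # two years of weekly data
season = 1 + 0.5 * np.sin(2 * np.pi * weeks / 52)

spend = 100 * season + rng.normal(0, 5, weeks.size)      # budget follows demand
revenue = 1000 * season + rng.normal(0, 50, weeks.size)  # demand-driven only:
                                                         # spend has ZERO true effect

# Naive MMM-style regression: revenue ~ intercept + spend
X = np.column_stack([np.ones_like(spend), spend])
coef, *_ = np.linalg.lstsq(X, revenue, rcond=None)
print(f"naive revenue per $ of spend: {coef[1]:.2f}")    # large, spurious

# Controlling for the shared seasonality collapses the estimate toward zero.
X2 = np.column_stack([np.ones_like(spend), spend, season])
coef2, *_ = np.linalg.lstsq(X2, revenue, rcond=None)
print(f"with seasonality control:     {coef2[1]:.2f}")
```

In practice the confounder is rarely a clean sine wave you can just add as a regressor, which is why MMM coefficients on channels that scale with demand should be treated with suspicion.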

What Causal Attribution Actually Measures

Causal attribution uses causal inference methods — specifically, the potential outcomes framework and structural causal models — to estimate what's called the "incremental lift": the change in conversion rate or revenue that is directly attributable to a channel, holding all confounders constant.

The key difference is the counterfactual: "What would the conversion rate have been if this channel hadn't existed, for the customers who were actually exposed to it?" This is not a question attribution models or MMM can answer. It requires causal inference.
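The counterfactual question can be written out directly in potential-outcomes terms. The toy data below (unrealistically) records both potential outcomes per customer: y1 if exposed, y0 if not. In real data you only ever observe one of the two, which is exactly why causal inference methods are needed to estimate the other.

```python
customers = [
    # (exposed, y1, y0)
    (1, 1, 1),  # exposed; would have converted anyway -> no incremental value
    (1, 1, 0),  # exposed; converts only because of the channel -> incremental
    (1, 0, 0),  # exposed; doesn't convert either way
    (0, 1, 0),
    (0, 0, 0),
]

exposed = [c for c in customers if c[0] == 1]

# Observed conversion rate among exposed customers (what attribution sees):
observed = sum(y1 for _, y1, _ in exposed) / len(exposed)

# Counterfactual rate for those same customers, had the channel not existed:
counterfactual = sum(y0 for _, _, y0 in exposed) / len(exposed)

print(f"observed rate:     {observed:.2f}")
print(f"counterfactual:    {counterfactual:.2f}")
print(f"incremental lift:  {observed - counterfactual:.2f}")
```

Attribution models credit the channel for the full observed rate; the incremental lift is only the difference between the two columns.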

The Uplift vs. Attribution Distinction

Attribution asks: "Which channel gets credit for this conversion?" Causal inference asks: "Did this channel cause the conversion — or would it have happened anyway?" The second question is the one that matters for budget decisions. You should only pay for incremental conversions, not for conversions that would have happened regardless.

How to Move From Correlation to Causation

The practical path forward doesn't require abandoning your attribution tools. It requires understanding what they're telling you (which channels correlate with conversion) and what they're not (which channels cause conversion). Then using causal inference to answer the harder question.

  • Identify your key attribution questions: Which channels are you most uncertain about? Brand search is usually the first candidate
  • Build a causal schema: treatment = channel exposure, outcome = conversion, confounders = intent signals, prior behavior, demographics
  • Run incremental lift analysis with a doubly robust estimator (AIPW) on your customer-level data
  • Compare causal estimates to attribution model outputs — the difference tells you how much confounding is distorting your attribution
  • Reallocate budget away from channels with high attribution credit but low causal lift
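As a sketch of the AIPW step above, here is a numpy-only doubly robust estimate on synthetic customer-level data with a single binary confounder ("high intent"), so the propensity and outcome models reduce to cell means. Real pipelines would fit these nuisance models with flexible learners on many confounders; everything here is illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200_000

x = rng.random(n) < 0.3                     # confounder: high-intent customer
t = rng.random(n) < np.where(x, 0.7, 0.2)   # intent drives channel exposure
# True incremental lift of the channel: +5 points for everyone.
y = rng.random(n) < (np.where(x, 0.40, 0.05) + 0.05 * t)

yf, tf = y.astype(float), t.astype(float)

# Nuisance models; with one binary confounder these are just cell means:
#   e(x)    = P(T=1 | X)       (propensity)
#   mu_t(x) = E[Y | T=t, X=x]  (outcome model)
e = np.where(x, tf[x].mean(), tf[~x].mean())
mu1 = np.where(x, yf[t & x].mean(), yf[t & ~x].mean())
mu0 = np.where(x, yf[~t & x].mean(), yf[~t & ~x].mean())

# AIPW / doubly robust estimate of the average treatment effect
ate = np.mean(mu1 - mu0
              + tf * (yf - mu1) / e
              - (1 - tf) * (yf - mu0) / (1 - e))

naive = yf[t].mean() - yf[~t].mean()
print(f"naive exposed-vs-unexposed gap: {naive:.3f}")  # inflated by confounding
print(f"AIPW causal lift estimate:      {ate:.3f}")    # near the true 0.05
```

The gap between the naive difference and the AIPW estimate is exactly the confounding distortion the fourth step above asks you to measure.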

Typical finding: reallocation based on causal estimates reduces wasted spend by 20–40% while maintaining or improving total conversions. The wasted spend was going to channels that were present in conversion paths but not causing them.

The Bottom Line

Attribution modeling is not going away. It's useful for understanding customer journeys and optimizing touchpoint sequencing. But for budget allocation decisions — how much to spend on each channel — you need causal estimates, not correlation-based credit assignment.

The teams that will win in the next five years of marketing analytics are the ones that move from "which channels get credit" to "which channels cause results." That's a methodological shift, and it's available to any team with customer-level data.

Ready to apply causal inference to your data?

CausoAI takes you from CSV to causal insights in minutes — no data science background required.

Start Free Trial