Simpson’s paradox explains why an effect can disappear or reverse when data from different groups are combined without accounting for a confound or confounder: a third variable that influences both independent (or intervention or exposure) and dependent (or outcome) variables.
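A minimal sketch of such a reversal, using illustrative counts that are not from the source text: treatment "A" has the higher success rate within each stratum, yet "B" looks better once the strata are pooled, because the two arms are spread very differently across strata. The names `counts`, `stratum_rates`, and `pooled_rate` are hypothetical.

```python
from fractions import Fraction

# Illustrative (successes, total) counts per treatment arm and stratum.
# These numbers are an assumption for demonstration, not data from the source.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def stratum_rates(arm):
    """Success rate of one arm within each stratum."""
    return {s: Fraction(k, n) for s, (k, n) in counts[arm].items()}

def pooled_rate(arm):
    """Success rate of one arm after summing counts over all strata."""
    successes = sum(k for k, _ in counts[arm].values())
    total = sum(n for _, n in counts[arm].values())
    return Fraction(successes, total)

# A beats B inside every stratum...
a_wins_each_stratum = all(
    stratum_rates("A")[s] > stratum_rates("B")[s] for s in counts["A"]
)
# ...but B beats A when the strata are combined.
b_wins_pooled = pooled_rate("B") > pooled_rate("A")

print(a_wins_each_stratum, b_wins_pooled)
```

Both flags come out true here. Whether the stratified or the pooled comparison is the right one depends on the causal story behind the stratum variable, not on the arithmetic.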
Simpson’s success was prefiguring contemporary statistics reform efforts by emphasizing the need for qualitative knowledge (not just expert knowledge in the traditional sense but also, we might say, common sense) to play a central role in statistical analysis.
So now we have three problems in talking about Simpson’s paradox: (1) it’s not Simpson’s, in the sense that he wasn’t the first to describe confounding, (2) there’s no paradox, in the sense that you should be thinking about causality early and often when doing statistics, and (3) Simpson’s example was inadvertently showing something closer to the numeric phenomenon of noncollapsibility, a failure of subgroups with high proportions to make simple averages in the aggregate, and not the causal phenomenon of confounding that he thought he was talking about.
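To make the distinction in point (3) concrete, here is a hedged sketch of noncollapsibility with hypothetical risks. Two equally sized strata are assumed, with treatment independent of stratum, so there is no confounding. The odds ratio is the same in both strata, yet the marginal odds ratio differs; the risk difference, by contrast, averages cleanly. All names and numbers are assumptions for illustration.

```python
from fractions import Fraction as F

# Hypothetical outcome risks as (treated, untreated) in two equally sized strata.
# Assumption: treatment is independent of stratum, so nothing is confounded.
risks = {
    "stratum_1": (F(4, 5), F(1, 2)),
    "stratum_2": (F(1, 2), F(1, 5)),
}

def odds(p):
    return p / (1 - p)

def odds_ratio(p_treated, p_untreated):
    return odds(p_treated) / odds(p_untreated)

# Stratum-specific measures.
stratum_ors = {name: odds_ratio(*r) for name, r in risks.items()}
stratum_rds = {name: r[0] - r[1] for name, r in risks.items()}

# Marginal risks are simple averages because the strata are equally sized.
n_strata = len(risks)
marginal_treated = sum(r[0] for r in risks.values()) / n_strata
marginal_untreated = sum(r[1] for r in risks.values()) / n_strata

marginal_or = odds_ratio(marginal_treated, marginal_untreated)
marginal_rd = marginal_treated - marginal_untreated

print(stratum_ors, marginal_or)
print(stratum_rds, marginal_rd)
```

Under these assumptions both stratum odds ratios equal 4 while the marginal odds ratio is 169/49 (about 3.45), and the risk difference is 3/10 everywhere. The gap in the odds ratio is purely numeric; no causal bias is involved.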
it’s widely recognized that all models omit variables
This requires understanding a group of related concepts starting with d-separation (d as in dependence; see also “d-separation without tears”), its opposite *d*-connection, and how and why to combine graphs and probabilities to draw conditional independence in a special type of causal logic drawing called Directed Acyclic Graphs
when you hear “Simpson’s paradox,” you should think: “D’OH! Is this the numeric problem of noncollapsibility, the causal problem of simple confounding, or another confound problem, like collider bias?”
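Collider bias, the third possibility named above, can be sketched with a small simulation. Two variables are generated independently, then only the cases where their sum exceeds a threshold are kept, which amounts to conditioning on a common effect. The seed, sample size, threshold, and the helper `pearson` are all assumptions chosen for illustration.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid version-specific helpers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20_000

# x and y are independent by construction.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(n)]

# Selecting on the collider (x + y) keeps only cases with a large sum.
selected = [(a, b) for a, b in zip(x, y) if a + b > 1.0]

corr_all = pearson(x, y)
corr_selected = pearson([a for a, _ in selected], [b for _, b in selected])

print(round(corr_all, 2), round(corr_selected, 2))
```

In the full sample the correlation sits near zero, while in the selected subset it turns clearly negative, even though neither variable influences the other.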
then there’s a beautiful irony in all this: Simpson’s main point, that story drives statistics, is getting lost in dressing up science as “objective” when it’s not.