Correlation does not imply causation: two variables can move together without either one causing the other. This is one of the most important warnings in introductory statistics, because the human mind naturally turns patterns into stories. Once we see two things move together, we immediately want to explain why. Statistics asks you to slow down and separate association from mechanism.
When two variables correlate, at least three broad explanations are possible. First, X may really cause Y. Smoking and lung cancer became the textbook example because the association was strong, persistent, time-ordered, biologically plausible, and supported by multiple lines of evidence. Second, the direction may run the other way: Y could influence X. Depression and social isolation illustrate the difficulty, because each can plausibly worsen the other. Third, a separate variable Z may drive both X and Y. Hot weather creates a classic confounding pattern: it raises ice cream sales and also raises swimming activity, which can increase drownings. Ice cream does not cause drowning, even if the two series correlate.
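The hot-weather pattern can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not real sales or drowning figures: the variable names, coefficients, and noise levels are all assumptions chosen for illustration. Temperature drives both series, and neither series depends on the other. The raw correlation comes out strongly positive, while the correlation between the residuals left after fitting each series on temperature should land near zero.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, zs):
    """What is left of y after subtracting a least-squares line fitted on z."""
    n = len(ys)
    mean_y = sum(ys) / n
    mean_z = sum(zs) / n
    slope = (
        sum((z - mean_z) * (y - mean_y) for y, z in zip(ys, zs))
        / sum((z - mean_z) ** 2 for z in zs)
    )
    return [(y - mean_y) - slope * (z - mean_z) for y, z in zip(ys, zs)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n_days = 2000

# Hypothetical data-generating process: Z (temperature) drives X and Y.
temperature = [rng.gauss(20, 8) for _ in range(n_days)]
ice_cream = [3.0 * t + rng.gauss(0, 10) for t in temperature]  # made-up units
drownings = [0.5 * t + rng.gauss(0, 3) for t in temperature]   # made-up index

raw_r = pearson(ice_cream, drownings)
adjusted_r = pearson(
    residuals(ice_cream, temperature),
    residuals(drownings, temperature),
)

print(f"raw correlation:              {raw_r:.2f}")
print(f"after adjusting for weather:  {adjusted_r:.2f}")
```

Correlating residuals is one simple way to compute a partial correlation. It only removes the confounders you have measured and modeled, so a near-zero adjusted value here reflects how the synthetic data was built, not a general guarantee.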
This is why spurious correlations are so instructive. Tyler Vigen popularized absurd but real examples such as Nicolas Cage film appearances versus pool drownings, per-capita cheese consumption versus deaths caused by bedsheet entanglement, and Maine divorce rates versus margarine consumption. Those correlations are statistically real in the recorded series, but they do not reveal a meaningful causal mechanism. They show that if you compare enough time series, some pairs will line up through chance, seasonal overlap, or shared background trends.
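The point about searching many series can be sketched in code. The example below is an illustration under stated assumptions, not a reconstruction of how Tyler Vigen's examples were found: it generates 100 independent random walks of 30 points each (both numbers are arbitrary) and then looks for the most strongly correlated pair. Because the walks share no cause at all, any strong correlation it reports is purely an artifact of comparing thousands of pairs.

```python
import random
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def random_walk(rng, length):
    """Cumulative sum of independent Gaussian steps."""
    position = 0.0
    walk = []
    for _ in range(length):
        position += rng.gauss(0, 1)
        walk.append(position)
    return walk


rng = random.Random(42)  # fixed seed so the run is repeatable

# 100 unrelated "yearly" series, 30 observations each (arbitrary sizes).
series = [random_walk(rng, 30) for _ in range(100)]

# Scan every pair and keep the one with the largest absolute correlation.
best_r, best_pair = max(
    (abs(pearson(series[i], series[j])), (i, j))
    for i, j in combinations(range(len(series)), 2)
)

n_pairs = len(series) * (len(series) - 1) // 2
print(f"pairs compared: {n_pairs}")
print(f"strongest |r|:  {best_r:.2f} between series {best_pair}")
```

Short, trending series are especially prone to this. Random walks drift, and two drifting series often look related over a small window even though they were generated independently.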
To establish causation, statisticians look for stronger evidence than association alone. Randomized controlled trials are the gold standard because randomization helps break the link between treatment and hidden confounders. Even outside experiments, analysts want clear time ordering, plausible mechanisms, replication, and explicit attempts to rule out alternative explanations. If X supposedly causes Y, X must happen before Y. If a mechanism is impossible or incoherent, the causal story weakens. If the association disappears after adjusting for a third variable, confounding was probably doing the real work.
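A simulation can show why randomization matters. The sketch below rests on invented assumptions: a hidden "health" score, a true treatment effect fixed at 1.0, and a self-selection rule in which healthier people opt into treatment more often. It compares a naive difference in means from that observational setup with the same calculation when treatment is assigned by coin flip.

```python
import random


def mean(values):
    return sum(values) / len(values)


def difference_in_means(treated_flags, outcomes):
    """Average outcome of the treated minus average outcome of the untreated."""
    treated = [y for t, y in zip(treated_flags, outcomes) if t]
    control = [y for t, y in zip(treated_flags, outcomes) if not t]
    return mean(treated) - mean(control)


TRUE_EFFECT = 1.0         # assumed effect of treatment on the outcome
rng = random.Random(7)    # fixed seed so the run is repeatable
n_people = 20000

# Hidden confounder: never observed by the analyst in this scenario.
health = [rng.gauss(0, 1) for _ in range(n_people)]


def outcome(treated, h):
    """Outcome depends on treatment, on hidden health, and on noise."""
    return TRUE_EFFECT * treated + 2.0 * h + rng.gauss(0, 1)


# Observational world: healthier people are more likely to choose treatment.
chose_treatment = [h + rng.gauss(0, 1) > 0 for h in health]
observed_y = [outcome(t, h) for t, h in zip(chose_treatment, health)]

# Randomized world: a coin flip decides, so health cannot influence assignment.
assigned_treatment = [rng.random() < 0.5 for _ in health]
trial_y = [outcome(t, h) for t, h in zip(assigned_treatment, health)]

naive_estimate = difference_in_means(chose_treatment, observed_y)
trial_estimate = difference_in_means(assigned_treatment, trial_y)

print(f"true effect:             {TRUE_EFFECT:.2f}")
print(f"observational estimate:  {naive_estimate:.2f}")
print(f"randomized estimate:     {trial_estimate:.2f}")
```

In the observational run the treated group is healthier to begin with, so the naive estimate mixes the treatment effect with the health gap and overstates it. The coin flip balances health across groups on average, which is why the randomized estimate should sit close to the assumed true effect.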
Correlation is still extremely useful. It is excellent for prediction, pattern detection, exploratory analysis, and generating hypotheses worth testing. It is not enough on its own for clinical decisions, policy claims, or strong intervention advice. The right practical habit is simple: use correlation to discover questions, not to pretend you have already answered them.