
Control for a third variable

Partial Correlation Calculator

Ice cream sales correlate with drowning deaths. But once you control for summer heat, the correlation vanishes. Partial correlation shows how much of a relationship between two variables remains after a third variable is accounted for.

Quick answer

Partial correlation measures the relationship between X and Y after statistically removing the influence of a control variable Z.

Before vs After Controlling for Z

Before controlling for Z

r(XY) = 0.997

very strong

After controlling for Z

r(XY.Z) = 0.594

moderate

The raw correlation between X and Y is 0.997 (very strong). After controlling for Z, the partial correlation is 0.594 (moderate), a 40% drop in absolute strength.

Significance Test

t-statistic

1.649

df = n - 3

p-value

0.16

Decision

Not significant

The test uses t = r(XY.Z) * sqrt(n - 3) / sqrt(1 - r(XY.Z)^2), with df = n - 3.
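The test can be sketched with the Python standard library alone. The function names below are illustrative, and the sample size n = 8 is an assumption: the page does not state n, and 8 is used only because it gives df = 5. The p-value is obtained by numerically integrating the Student t density, which stands in for whatever routine the calculator itself uses.

```python
import math

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # probability mass between 0 and |t|
    return max(0.0, 1 - 2 * area)

def partial_corr_t_test(r_partial, n):
    """t-statistic, df and two-sided p-value for a first-order partial r.

    Undefined when |r_partial| == 1 (division by zero) or n <= 3.
    """
    df = n - 3
    t = r_partial * math.sqrt(df) / math.sqrt(1 - r_partial ** 2)
    return t, df, two_sided_p(t, df)

# n = 8 is a hypothetical sample size, not a value taken from the page.
t, df, p = partial_corr_t_test(0.594, 8)
print(f"t = {t:.3f}, df = {df}, p = {p:.3f}")
```

With these inputs, t comes out near 1.65 and p near 0.16. The small gap from the displayed 1.649 is consistent with the calculator working from an unrounded r.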

Relationship Diagnosis

Partial mediation detected

Raw r = 0.997 -> partial r = 0.594. Absolute strength changed by 40%.

Z explains part of the X-Y relationship, but a remaining direct relationship may still exist.

Partial r squared

0.352

With Z held constant, X accounts for 35.2% of the variance in Y that Z does not already explain.

How partial correlation is calculated

To find the correlation between X and Y that does not run through Z, first remove the influence of Z from both X and Y separately, then correlate what is left.

r_{XY \cdot Z} = \frac{r_{XY} - r_{XZ}r_{YZ}}{\sqrt{1-r_{XZ}^2}\sqrt{1-r_{YZ}^2}}

Step 1

Start with three r values

Use rXY = 0.63, rXZ = 0.57, and rYZ = 0.88.

Step 2

Compute the numerator

0.63 - (0.57 x 0.88) = 0.128

Step 3

Compute the denominator

sqrt(1 - 0.57^2) x sqrt(1 - 0.88^2) = 0.390

Step 4

Divide

r(XY.Z) = 0.128 / 0.390 = 0.329

In the stork example, controlling for land area reduces the stork-birth-rate correlation from 0.63 to about 0.33. Much of the original relationship was explained by the control variable.
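The four steps above can be written as one small function. This is a minimal sketch; the name `partial_corr` is illustrative and not part of the calculator.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation r(XY.Z) from three zero-order r values."""
    numerator = r_xy - r_xz * r_yz                                    # Step 2
    denominator = math.sqrt(1 - r_xz ** 2) * math.sqrt(1 - r_yz ** 2)  # Step 3
    return numerator / denominator                                    # Step 4

# Stork example values from Step 1.
r = partial_corr(0.63, 0.57, 0.88)
print(round(r, 3))  # 0.329
```

The function is undefined when either r_xz or r_yz equals 1 or -1, because the denominator becomes zero.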

What can partial correlation reveal?

Spurious correlation

Before: r(XY) = 0.85

After: r(XY.Z) = 0.02

Ice cream sales and drowning deaths. Control for temperature, and the link vanishes.

Z is the true driver. X and Y have little direct relationship.

Partial mediation

Before: r(XY) = 0.63

After: r(XY.Z) = 0.33

Stork counts and birth rates. Control for land area, and the relationship weakens.

Z explains part of the relationship, but some direct X-Y link remains.

Direct relationship

Before: r(XY) = 0.72

After: r(XY.Z) = 0.70

Study time and exam score. Control for IQ, and the relationship barely changes.

Z is not explaining the X-Y relationship.

Suppressor effect

Before: r(XY) = 0.10

After: r(XY.Z) = 0.55

Control for Z and the relationship becomes stronger or changes direction.

Z was masking the true relationship.

Three types of correlation

Type	Definition	Symbol	Use
Zero-order	Ordinary correlation with no control variable	r(XY)	Initial relationship exploration
Partial	Controls Z for both X and Y	r(XY.Z)	Remove a confounding variable
Semi-partial	Controls Z for only one variable	r(X.Z)Y	Unique contribution in regression

Use partial correlation when you want the relationship between X and Y with the influence of Z removed from both. Use semi-partial correlation when you are building a regression model and want to know how much unique variance X contributes to Y.
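The page gives no formula for the semi-partial case. A standard form, with Z removed from X only, divides the same numerator by sqrt(1 - r_XZ^2) alone. The sketch below puts the two side by side; the function names are illustrative, and reusing the stork values here is only for comparison, not a result reported by the page.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """Z removed from both X and Y."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

def semipartial_corr(r_xy, r_xz, r_yz):
    """Z removed from X only; Y is left as it is."""
    return (r_xy - r_xz * r_yz) / math.sqrt(1 - r_xz ** 2)

pr = partial_corr(0.63, 0.57, 0.88)
sr = semipartial_corr(0.63, 0.57, 0.88)
print(round(pr, 3), round(sr, 3))
```

Because the semi-partial denominator drops the sqrt(1 - r_YZ^2) factor, its absolute value is never larger than that of the partial correlation. Its square is the share of the total variance in Y that X adds beyond Z.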

Controlling for two variables

Second-order partial correlation controls two variables. Conceptually, it repeats the same logic after one control variable has already been removed.

r_{XY \cdot Z_1Z_2} = \frac{r_{XY \cdot Z_1} - r_{XZ_2 \cdot Z_1}r_{YZ_2 \cdot Z_1}}{\sqrt{1-r_{XZ_2 \cdot Z_1}^2}\sqrt{1-r_{YZ_2 \cdot Z_1}^2}}

For controlling more than two variables simultaneously, use multiple regression residuals or dedicated software like R, Python, SPSS, or Stata.
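The recursion in the second-order formula can be sketched by applying the first-order formula three times to build the inputs, then once more to combine them. The function names are illustrative. The second-order function needs all six zero-order correlations, including the one between the two control variables.

```python
import math

def first_order(r_ab, r_ac, r_bc):
    """Partial correlation of A and B controlling for C."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))

def second_order(r_xy, r_xz1, r_yz1, r_xz2, r_yz2, r_z1z2):
    """r(XY.Z1Z2): remove Z1 from every pair, then control for Z2."""
    r_xy_z1 = first_order(r_xy, r_xz1, r_yz1)      # r(XY.Z1)
    r_xz2_z1 = first_order(r_xz2, r_xz1, r_z1z2)   # r(XZ2.Z1)
    r_yz2_z1 = first_order(r_yz2, r_yz1, r_z1z2)   # r(YZ2.Z1)
    return first_order(r_xy_z1, r_xz2_z1, r_yz2_z1)

# Hypothetical correlations, chosen only to exercise the function.
print(second_order(0.5, 0.3, 0.4, 0.2, 0.1, 0.25))
```

A quick sanity check: if Z2 is uncorrelated with X, Y, and Z1, the result reduces to the first-order value r(XY.Z1).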

FAQ

What is partial correlation used for?

Partial correlation measures the relationship between two variables while statistically removing the effect of one or more control variables. It is used to test whether an observed correlation is genuine or driven by a confounding variable.

What is the difference between partial correlation and controlling in regression?

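They are closely related. The partial correlation r(XY.Z) equals the correlation between the residuals of regressing X on Z and the residuals of regressing Y on Z. The main difference is scale: a regression coefficient is expressed in units of Y per unit of X, while the partial correlation is unitless and bounded between -1 and 1. Both have the same sign, and the t-test for r(XY.Z) gives the same result as the t-test for the coefficient of X in a regression of Y on X and Z.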
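The residual equivalence can be checked numerically. The sketch below uses a small hypothetical data set (the numbers are made up for illustration) and compares the residual route with the three-correlation formula. All helper names are illustrative.

```python
import math

def mean(v):
    return sum(v) / len(v)

def pearson(a, b):
    """Ordinary Pearson correlation."""
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    return cov / math.sqrt(sum((p - ma) ** 2 for p in a) * sum((q - mb) ** 2 for q in b))

def residuals(y, z):
    """Residuals from a least-squares line of y on z (with intercept)."""
    mz, my = mean(z), mean(y)
    slope = sum((zi - mz) * (yi - my) for zi, yi in zip(z, y)) / sum((zi - mz) ** 2 for zi in z)
    intercept = my - slope * mz
    return [yi - (intercept + slope * zi) for yi, zi in zip(y, z)]

# Hypothetical data, not taken from the page.
x = [3, 5, 4, 8, 7, 10, 9, 12]
y = [2, 6, 5, 7, 9, 9, 12, 11]
z = [1, 2, 3, 4, 5, 6, 7, 8]

via_residuals = pearson(residuals(x, z), residuals(y, z))

r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
via_formula = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

print(via_residuals, via_formula)  # the two values agree up to floating-point error
```

The residual route generalizes more easily: with several control variables, regress X and Y on all of them and correlate the two sets of residuals.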

Can partial correlation be larger than the original correlation?

Yes. This is called a suppressor effect. When Z suppresses the true X-Y relationship, removing Z's influence can reveal a stronger or even sign-reversed correlation.

How do I interpret a non-significant partial correlation?

It means that after controlling for Z, there is insufficient evidence of a direct relationship between X and Y in your sample. This could indicate a spurious correlation or insufficient power.

Can I use partial correlation with Spearman?

Yes. The same formula can be applied to Spearman rank correlations, giving a rank-based partial correlation that is more robust to outliers and non-normality.
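A rank-based version can be sketched by ranking each variable (ties get the average rank), computing Pearson correlations on the ranks, and plugging them into the same formula. The data are hypothetical and the helper names are illustrative.

```python
import math

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    return cov / math.sqrt(sum((p - ma) ** 2 for p in a) * sum((q - mb) ** 2 for q in b))

def spearman_partial(x, y, z):
    """Partial correlation formula applied to Spearman (rank) correlations."""
    rx, ry, rz = average_ranks(x), average_ranks(y), average_ranks(z)
    r_xy, r_xz, r_yz = pearson(rx, ry), pearson(rx, rz), pearson(ry, rz)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical data, not taken from the page.
x = [3, 5, 4, 8, 7, 10, 9, 12]
y = [2, 6, 5, 7, 9, 9, 12, 11]
z = [1, 2, 3, 4, 5, 6, 7, 8]
print(spearman_partial(x, y, z))
```

Because only ranks enter the calculation, any strictly increasing transformation of a variable (a log or an exponential, for example) leaves the result unchanged.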

Continue your analysis

Partial correlation is one step in a broader workflow: first inspect the zero-order relationship, then test significance, then think carefully about causation.

