
Pearson Correlation Coefficient Calculator

Use this Pearson correlation coefficient calculator to measure the linear relationship between two continuous variables. Enter your raw data, upload a CSV or Excel sheet, and instantly see Pearson r, r², p-value, t-statistic, confidence intervals, scatter plot diagnostics, and a full step-by-step solution that is suitable for teaching, assignments, and quick statistical checks.

No sign-up required, instant results, step-by-step solution, free forever.
Live update window: 300 ms
Manual row limit: 500
Output package: r, p, CI
Active method: Pearson r

Best for continuous variables with a linear relationship.

\[
r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \cdot \sum_{i=1}^{n}(y_i - \bar{y})^2}}
\]
Data input
Enter or import paired values
8 valid pairs
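| Row | X Variable | Y Variable |
| --- | --- | --- |
| 1 | 1 | 52 |
| 2 | 2 | 57 |
| 3 | 3 | 62 |
| 4 | 4 | 67 |
| 5 | 5 | 72 |
| 6 | 6 | 77 |
| 7 | 7 | 83 |
| 8 | 8 | 86 |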
Drag and drop a CSV or Excel file
Column headers are detected automatically so you can choose which variables become X and Y.
Example datasets
Interactive Scatter Plot
[Scatter plot: X Variable on the horizontal axis, Y Variable on the vertical axis]
Correlation Meter
Scale from -1.0 to +1.0, current reading 0.9991
Strength badge: Very strong
Data health check
Sample size: Good
8 valid pairs give a usable first-pass estimate, although the sample is small.
Distribution shape: Good
Neither variable shows strong skewness from a quick sample-skewness check.
Linearity check: Info
Pearson and Spearman are close, which supports a mostly linear trend.
Residual Plot
[Residual plot: X Variable on the horizontal axis, Residual on the vertical axis]
r: 0.9991
r²: 0.9981
p-value: 2.13e-9
t: 56.1931
df: 6
95% CI: 0.995 to 1.000
99% CI: 0.991 to 1.000
Automatic interpretation
This dataset shows a very strong positive linear relationship. It is statistically significant at the 0.05 level.

Pearson r = 0.9991 based on 8 valid pairs, p = 2.13e-9.

Your result
r = 0.9991 (Very strong)
Scale: -1 to +1
Statistically significant at p < 0.05
r² = 0.9981, so a linear fit on X accounts for 99.81% of the variance in Y.
Sample size n=8 is small, so treat the confidence interval with caution.
Step-by-step
How the calculator got this result
Step 1: Compute the means

Average the X values and the Y values before measuring joint movement.

x̄ = (1.0000 + 2.0000 + 3.0000 + 4.0000 + 5.0000 + 6.0000 + 7.0000 + 8.0000) / 8 = 36.0000 / 8 = 4.5000
ȳ = (52.0000 + 57.0000 + 62.0000 + 67.0000 + 72.0000 + 77.0000 + 83.0000 + 86.0000) / 8 = 556.0000 / 8 = 69.5000
Step 2: Measure paired deviations

Subtract the mean from every X and Y value to get centered deviations.

#1: dx = 1.0000 - 4.5000 = -3.5000, dy = 52.0000 - 69.5000 = -17.5000
#2: dx = 2.0000 - 4.5000 = -2.5000, dy = 57.0000 - 69.5000 = -12.5000
#3: dx = 3.0000 - 4.5000 = -1.5000, dy = 62.0000 - 69.5000 = -7.5000
#4: dx = 4.0000 - 4.5000 = -0.5000, dy = 67.0000 - 69.5000 = -2.5000
#5: dx = 5.0000 - 4.5000 = 0.5000, dy = 72.0000 - 69.5000 = 2.5000
#6: dx = 6.0000 - 4.5000 = 1.5000, dy = 77.0000 - 69.5000 = 7.5000
#7: dx = 7.0000 - 4.5000 = 2.5000, dy = 83.0000 - 69.5000 = 13.5000
#8: dx = 8.0000 - 4.5000 = 3.5000, dy = 86.0000 - 69.5000 = 16.5000
Step 3: Sum the covariance numerator

Multiply each pair of deviations and add them up.

#1: (-3.5000) × (-17.5000) = 61.2500
#2: (-2.5000) × (-12.5000) = 31.2500
#3: (-1.5000) × (-7.5000) = 11.2500
#4: (-0.5000) × (-2.5000) = 1.2500
#5: (0.5000) × (2.5000) = 1.2500
#6: (1.5000) × (7.5000) = 11.2500
#7: (2.5000) × (13.5000) = 33.7500
#8: (3.5000) × (16.5000) = 57.7500
Σ(xᵢ - x̄)(yᵢ - ȳ) = 209.0000
Step 4: Sum the squared deviations

Compute the denominator from the independent spread of X and Y.

#1: dx² = 12.2500, dy² = 306.2500
#2: dx² = 6.2500, dy² = 156.2500
#3: dx² = 2.2500, dy² = 56.2500
#4: dx² = 0.2500, dy² = 6.2500
#5: dx² = 0.2500, dy² = 6.2500
#6: dx² = 2.2500, dy² = 56.2500
#7: dx² = 6.2500, dy² = 182.2500
#8: dx² = 12.2500, dy² = 272.2500
Σ(xᵢ - x̄)² = 42.0000
Σ(yᵢ - ȳ)² = 1042.0000
Step 5: Divide numerator by denominator

The covariance term is normalized by both standard-deviation components.

r = 209.0000 / √(42.0000 × 1042.0000)
r = 209.0000 / √43764.0000 = 209.0000 / 209.1985
r = 0.9991
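
The five steps above can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the function name `pearson_steps` and its return layout are illustrative choices, not part of the calculator itself. It assumes neither variable is constant, since a zero spread would make the denominator zero.

```python
import math

# Example dataset from the worked solution above.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [52, 57, 62, 67, 72, 77, 83, 86]

def pearson_steps(x, y):
    """Return (sxy, sxx, syy, r) following the five steps shown above."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    n = len(x)
    mean_x = sum(x) / n                        # Step 1: means
    mean_y = sum(y) / n
    dx = [xi - mean_x for xi in x]             # Step 2: centered deviations
    dy = [yi - mean_y for yi in y]
    sxy = sum(a * b for a, b in zip(dx, dy))   # Step 3: covariance numerator
    sxx = sum(a * a for a in dx)               # Step 4: squared deviations
    syy = sum(b * b for b in dy)
    r = sxy / math.sqrt(sxx * syy)             # Step 5: normalize
    return sxy, sxx, syy, r

sxy, sxx, syy, r = pearson_steps(x, y)
print(f"Sxy={sxy:.4f}  Sxx={sxx:.4f}  Syy={syy:.4f}  r={r:.4f}")
# Sxy=209.0000  Sxx=42.0000  Syy=1042.0000  r=0.9991
```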

How to Use This Calculator

Step 1

Enter paired X and Y values manually, paste directly from Excel, or import a CSV or Excel file.

Step 2

Check that both variables are continuous and that a linear relationship is the right assumption for your data.

Step 3

Read the r value, r², p-value, t-statistic, degrees of freedom, and confidence intervals from the result cards.

Step 4

Review the scatter plot, residual plot, interpretation card, and worked solution before drawing a conclusion.

What Is the Pearson Correlation Coefficient?

The Pearson correlation coefficient is the standard statistic for measuring the strength and direction of a linear relationship between two continuous variables. When people refer to Pearson r, they usually mean the sample estimate of how closely paired observations move together around a straight-line trend. The Pearson correlation coefficient ranges from -1 to +1, where +1 indicates a perfect positive linear relationship, -1 indicates a perfect negative linear relationship, and 0 indicates no linear relationship.

The method is named after Karl Pearson, who formalized the coefficient in modern statistics. A Pearson correlation coefficient is most appropriate when both variables are measured on continuous scales, the pattern is approximately linear, and the data are not being dominated by a few influential outliers. For formal inference, analysts often also check whether the variables are roughly normally distributed, especially in smaller samples.

In practice, Pearson r is useful because it answers several questions at once. It tells you whether the relationship is positive or negative, how strong the relationship is, and whether the observed pattern is statistically distinguishable from zero using a significance test. It also connects naturally to r², which gives the proportion of shared variance explained by the linear model. That is why Pearson correlation appears so often in psychology, business analysis, education, health research, and engineering dashboards.

Pearson Correlation Formula

\[
r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \cdot \sum_{i=1}^{n}(y_i - \bar{y})^2}}
\]
\[
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}, \quad df = n-2
\]
\[
Z = \frac{1}{2}\ln\left(\frac{1+r}{1-r}\right), \quad CI = \tanh\left(Z \pm \frac{z_{\alpha/2}}{\sqrt{n-3}}\right)
\]
| Symbol | Meaning |
| --- | --- |
| r | Pearson correlation coefficient |
| xᵢ, yᵢ | The ith paired observation |
| x̄, ȳ | Sample means of X and Y |
| n | Number of valid paired observations |
| t | Test statistic for significance testing |
| df | Degrees of freedom, equal to n minus 2 |
| Z | Fisher z transformed correlation |
| CI | Confidence interval for the population correlation |

The first formula converts raw paired data into a standardized covariance. The numerator captures whether X and Y tend to move together, while the denominator scales that movement by the variability present in each variable. By the Cauchy-Schwarz inequality, the absolute value of the numerator can never exceed the denominator, which is why Pearson r always stays between -1 and +1.

The t statistic transforms the sample correlation into a significance test under the null hypothesis that the population correlation is zero. Fisher's z transform is then used to build confidence intervals because the sampling distribution of r becomes more symmetric after transformation.
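
As a rough illustration of the t and Fisher z formulas, the sketch below recomputes the test statistic and both confidence intervals for the example dataset (sums 209, 42, and 1042, n = 8). The function names `t_statistic` and `fisher_ci` are hypothetical helpers written for this example. The sketch assumes |r| < 1 and n > 3, because the formulas are undefined otherwise.

```python
import math
from statistics import NormalDist

def t_statistic(r, n):
    """Return (t, df) with t = r * sqrt(n - 2) / sqrt(1 - r^2) and df = n - 2."""
    df = n - 2
    return r * math.sqrt(df) / math.sqrt(1 - r * r), df

def fisher_ci(r, n, level=0.95):
    """Confidence interval for the population correlation via Fisher's z."""
    z = math.atanh(r)  # identical to 0.5 * ln((1 + r) / (1 - r))
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)  # z_{alpha/2}
    half_width = z_crit / math.sqrt(n - 3)
    # Transform the interval endpoints back to the correlation scale.
    return math.tanh(z - half_width), math.tanh(z + half_width)

# r for the example dataset, taken from the sums in the worked solution.
r = 209 / math.sqrt(42 * 1042)
n = 8

t, df = t_statistic(r, n)
lo95, hi95 = fisher_ci(r, n, 0.95)
lo99, hi99 = fisher_ci(r, n, 0.99)
print(f"t = {t:.4f}, df = {df}")
print(f"95% CI: {lo95:.3f} to {hi95:.3f}")
print(f"99% CI: {lo99:.3f} to {hi99:.3f}")
```

The back-transformation with tanh keeps both endpoints inside the -1 to +1 range, which a symmetric interval built directly around r would not guarantee.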

How to Calculate Pearson r Step by Step

Step 1

Compute the mean of X and the mean of Y so each variable has a central reference point.

Step 2

Subtract each mean from every observation to get paired deviations from the center.

Step 3

Multiply the paired deviations and sum them to create the covariance numerator.

Step 4

Square the deviations within X and Y, sum them, and multiply the two totals inside the denominator.

Step 5

Divide the numerator by the denominator to get Pearson r, then test significance with the t formula.

The interactive Step-by-Step module above follows this same logic with your actual numbers. It is useful when you need to verify a homework solution, explain a result in class, or show a client how the coefficient was obtained from the raw data rather than from a black-box tool.

How to Interpret Your Results

Start with the sign of r. Positive values mean the variables rise together, while negative values mean one tends to fall as the other rises. Then look at the magnitude. A larger absolute value indicates a stronger linear relationship. After that, use the p-value and confidence interval to judge uncertainty. Finally, inspect the scatter plot and residual plot to make sure the linear model is a sensible summary of the data.

| r range | Interpretation |
| --- | --- |
| 0.90 to 1.00 | Very Strong Positive |
| 0.70 to 0.89 | Strong Positive |
| 0.50 to 0.69 | Moderate Positive |
| 0.30 to 0.49 | Weak Positive |
| -0.29 to 0.29 | Negligible |
| -0.49 to -0.30 | Weak Negative |
| -0.69 to -0.50 | Moderate Negative |
| -0.89 to -0.70 | Strong Negative |
| -1.00 to -0.90 | Very Strong Negative |
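
The bands above can be expressed as a small lookup function. This is a sketch only: `strength_label` is a hypothetical name, and treating 0.30, 0.50, 0.70, and 0.90 as continuous cut points (so that a value such as 0.295 counts as negligible) is an assumption, since the bands are listed to two decimals.

```python
def strength_label(r):
    """Map a correlation to the interpretation bands listed above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    magnitude = abs(r)
    if magnitude < 0.30:
        return "Negligible"
    if magnitude >= 0.90:
        strength = "Very Strong"
    elif magnitude >= 0.70:
        strength = "Strong"
    elif magnitude >= 0.50:
        strength = "Moderate"
    else:
        strength = "Weak"
    direction = "Positive" if r > 0 else "Negative"
    return f"{strength} {direction}"

print(strength_label(0.9991))  # Very Strong Positive
print(strength_label(-0.55))   # Moderate Negative
```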

Pearson vs Spearman vs Kendall

Pearson is ideal when the relationship is linear and both variables are continuous. Spearman is a safer choice for ordinal or clearly non-normal data because it works on ranks. Kendall is even more robust in small samples or datasets with many tied values. A practical workflow is to calculate Pearson first, then compare against Spearman or Kendall if the scatter plot looks curved, the residuals look unstable, or outliers appear influential.

| | Pearson r | Spearman ρ | Kendall τ |
| --- | --- | --- | --- |
| Data type | Continuous | Ordinal or continuous | Ordinal or continuous |
| Distribution requirement | Approximately normal for formal inference | No strict normality requirement | No strict normality requirement |
| Outlier sensitivity | Sensitive | More robust | Most robust |
| Typical sample size | Usually n ≥ 10 | Usually n ≥ 10 | Small samples handled well |
| Link | Current page | /spearman-correlation/ | /kendall-correlation/ |
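
To see why comparing Pearson with Spearman is informative, the sketch below computes Spearman ρ as Pearson r applied to ranks, with tied values sharing the average of their positions. The helper names and the cubic example data are assumptions made for illustration, not output from the calculator.

```python
import math

def pearson(x, y):
    """Pearson r from centered sums of products and squares."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 to n; tied values get the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rho: Pearson r computed on the ranks of x and y."""
    return pearson(average_ranks(x), average_ranks(y))

# A monotonic but curved relationship (hypothetical data): y = x cubed.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [xi ** 3 for xi in x]
r_pearson = pearson(x, y)
rho_spearman = spearman(x, y)
print(f"Pearson r = {r_pearson:.4f}, Spearman rho = {rho_spearman:.4f}")
```

Because the ranks of a perfectly monotonic series line up exactly, Spearman reaches 1 here while Pearson falls short of it, which is the kind of gap that suggests a curved trend.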

Real-World Examples

Height vs Weight

A classic positive linear relationship with mild natural variation.

This type of dataset is useful when both variables are measured numerically and you want to quantify a straight-line association rather than only a ranked pattern.

Temperature vs Ice Cream Sales

Higher temperatures generally push sales up, with one cooler-day outlier.


Study Hours vs Exam Score

Useful for step-by-step teaching because the pattern is easy to inspect by hand.


Frequently Asked Questions

What is a good Pearson correlation coefficient?

A good Pearson correlation coefficient depends on the field. In social sciences, values above 0.5 are often treated as strong. In physics or engineering, values above 0.9 may be expected. As a practical guide, absolute values below 0.3 are usually negligible, 0.3 to 0.5 weak, 0.5 to 0.7 moderate, and 0.7 or above strong.

How do you calculate the Pearson correlation coefficient by hand?

First find the mean of X and the mean of Y. Then subtract each mean from its values to get deviations. Multiply each pair of deviations and sum the products to get the covariance numerator. For the denominator, square the deviations within each variable, sum them, multiply the two sums, and take the square root. Finally, divide the numerator by the denominator.

What does a Pearson r of 0.7 mean?

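A Pearson r of 0.7 indicates a strong positive linear relationship. Because r² = 0.49, about 49% of the variance in Y is explained by a linear fit on X. Whether it is statistically significant depends on the sample size: with n = 10, for example, t ≈ 2.77 on 8 degrees of freedom, which exceeds the two-tailed 5% critical value of 2.306.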

Is r = 0.5 a strong correlation?

r = 0.5 is usually considered a moderate correlation. It means roughly 25% of the variance in one variable is explained by the other. Whether that is strong enough depends on the research setting and the amount of noise normally present in the data.

What is the difference between Pearson and Spearman correlation?

Pearson correlation measures linear association between two continuous variables and is most appropriate when the relationship is roughly linear. Spearman correlation measures monotonic association using ranked values, so it is more robust to outliers and non-normal data.

How do I find the p-value for Pearson correlation?

Convert r into a t statistic using t = r multiplied by the square root of n minus 2, divided by the square root of 1 minus r squared. Then use the t distribution with n minus 2 degrees of freedom to get the two-tailed p-value. The calculator does this automatically.
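
For readers who want to see the t-distribution step without a statistics library, here is a rough sketch that integrates the Student t density numerically with Simpson's rule. The function names and the choice of 2000 intervals are assumptions for illustration. The approach loses precision when p is extremely small, so a statistics library's t-distribution routine is the better choice in practice.

```python
import math

def t_pdf(u, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + u * u / df) ** (-(df + 1) / 2)

def two_tailed_p(r, n, intervals=2000):
    """Return (t, df, p) for H0: population correlation equals zero."""
    df = n - 2
    t = abs(r) * math.sqrt(df) / math.sqrt(1 - r * r)
    # Simpson's rule for the area under the density between 0 and t
    # (intervals must be even).
    h = t / intervals
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for k in range(1, intervals):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area = total * h / 3
    # The density is symmetric, so both tails together hold 1 - 2 * area.
    return t, df, max(0.0, 1 - 2 * area)

t, df, p = two_tailed_p(0.7, 10)
print(f"t = {t:.3f}, df = {df}, two-tailed p = {p:.4f}")
```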

Can Pearson correlation be negative?

Yes. A negative Pearson correlation means the two variables tend to move in opposite directions. As one variable increases, the other tends to decrease. The closer the value is to negative one, the stronger that inverse linear relationship is.

What sample size do I need for Pearson correlation?

A minimum of about 10 observations is often recommended for a rough estimate, but 30 or more is better for stable inference. Very small samples can produce unstable r values and wide confidence intervals, so significance tests should be interpreted carefully.
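
One way to see the effect of sample size is to hold r fixed and watch the Fisher z confidence interval narrow as n grows. The sketch below does this for a hypothetical r = 0.5; the helper name `fisher_ci` and the chosen sample sizes are illustrative assumptions.

```python
import math
from statistics import NormalDist

def fisher_ci(r, n, level=0.95):
    """Confidence interval for the population correlation via Fisher's z."""
    z = math.atanh(r)
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)

# Same observed correlation, three different sample sizes.
intervals = {n: fisher_ci(0.5, n) for n in (10, 30, 100)}
for n, (lo, hi) in intervals.items():
    print(f"n = {n:>3}: 95% CI {lo:.3f} to {hi:.3f} (width {hi - lo:.3f})")
```

With only 10 pairs the interval for r = 0.5 still includes zero, while at 30 pairs it no longer does, which matches the advice to interpret small-sample significance tests carefully.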