What’s R-Squared?

R-squared (R²) is a statistical measure of the proportion of the variance in a dependent variable that is explained by one or more independent variables in a regression model. While correlation describes the strength of the relationship between an independent variable and a dependent variable, R-squared describes how much of the variance of one variable is explained by the variance of the other. So, if the R² of a model is 0.50, then about half of the observed variation can be explained by the model inputs.

The Formula for R-Squared Is

\[ R^2 = 1 - \frac{\text{Unexplained Variation}}{\text{Total Variation}} \]
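
Written in terms of sums of squares, the same formula can be sketched as follows. The symbols are assumptions introduced here for illustration, not notation from the original: y_i is an observed value, \hat{y}_i is the model's prediction for it, \bar{y} is the mean of the observed values, and n is the number of observations.

```latex
\[
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
\]
```

The numerator is the variation the model leaves unexplained (the squared prediction errors), and the denominator is the total variation of the observations around their mean.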

When investing, R-squared is generally interpreted as the percentage of the movements of a fund or security that can be explained by the movements of a reference index. For example, an R-squared for a fixed income security compared to a bond index identifies the percentage movement in the price of the security that can be expected based on a movement in the index price. The same can be applied to a stock relative to the S&P 500 index or any other relevant index.

R-Squared Calculation

The actual calculation of R-squared requires several steps. First, collect data points (observations) of the dependent and independent variables and find the line of best fit, usually from a regression model. From there, calculate the predicted values, subtract the actual values and square the results. This gives a list of squared errors, which are added up to give the unexplained variance.

To calculate the total variance, subtract the average actual value from each actual value, square the results and add them up. Then divide the first sum (unexplained variance) by the second sum (total variance), subtract the result from one, and you have the R-squared.
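
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names `fit_line` and `r_squared` and the five data points are hypothetical, and the best fit line is assumed to come from ordinary least squares with a single independent variable.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x (assumed model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(actual, predicted):
    """1 minus (sum of squared errors) / (total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    # Squared differences between actual and predicted values, added up.
    unexplained = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Squared differences between actual values and their average, added up.
    total = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - unexplained / total


# Hypothetical observations, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = fit_line(xs, ys)
predicted = [intercept + slope * x for x in xs]
r2 = r_squared(ys, predicted)
print(round(r2, 2))  # 0.6
```

With these made-up numbers the fitted line accounts for about 60% of the variation in the dependent variable.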

What does R-Squared tell you?

R-squared values range from 0 to 1 and are commonly expressed as percentages from 0% to 100%. An R-squared of 100% means that all movements of a security (or other dependent variable) are fully explained by the movements of the index (or the independent variable(s) you are interested in).

When investing, a high R-squared, between 85% and 100%, indicates that the performance of the security or fund moves relatively in line with the index. A low R-squared, at 70% or less, indicates that the security or fund does not generally follow the movements of the index. A higher R-squared value indicates a more useful beta value. For example, if a stock or fund has an R-squared value close to 100% but a beta below 1, it most likely offers higher risk-adjusted returns.

The difference between R-Squared and Adjusted R-Squared

R-squared only works as expected in a simple linear regression model with one explanatory variable. With a multiple regression consisting of several independent variables, R-squared must be adjusted. The adjusted R-squared compares the descriptive power of regression models that include different numbers of predictors. Adding a predictor to a model never decreases R-squared; it either increases it or leaves it unchanged. Thus, a model with more terms may seem to have a better fit just because it has more terms. The adjusted R-squared compensates for the added variables: it increases only if the new term improves the model more than would be expected by chance, and decreases when a predictor improves the model less than would be expected by chance. In an overfitting condition, you get a misleadingly high R-squared value even though the model's predictive ability has declined. This is not the case with adjusted R-squared.
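
A commonly used form of the adjustment, given here as a sketch rather than as this article's own formula, is 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of observations and p is the number of predictors. The function name `adjusted_r_squared` and all the numbers below are hypothetical.

```python
def adjusted_r_squared(r2, n_obs, n_predictors):
    """Adjusted R-squared using the common (n - 1) / (n - p - 1) correction."""
    degrees_of_freedom = n_obs - n_predictors - 1
    if degrees_of_freedom <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n_obs - 1) / degrees_of_freedom


# Hypothetical models fitted to the same 30 observations.
small_model = adjusted_r_squared(0.80, n_obs=30, n_predictors=2)
large_model = adjusted_r_squared(0.81, n_obs=30, n_predictors=10)

print(round(small_model, 3))  # 0.785
print(round(large_model, 3))  # 0.71
```

In this made-up case the larger model has the higher plain R-squared (0.81 against 0.80) but the lower adjusted R-squared, because eight extra predictors bought only a one-point improvement in fit.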

While standard R-squared can be used to compare the goodness of fit of two or more models that have the same number of predictors, adjusted R-squared is the better choice when the models differ in the number of predictors. Neither measure is a good metric for comparing non-linear models.

The difference between R-squared and Beta

Beta and R-squared are two related but different measures. R-squared measures how closely changes in the price of an asset are correlated with a benchmark, while beta measures the magnitude of those price changes relative to the benchmark and so serves as a measure of relative risk. A mutual fund with a high R-squared is highly correlated with its benchmark. If its beta is also high, it can produce higher returns than the benchmark, particularly in bull markets. A beta of exactly 1.0 means that the risk (volatility) of the asset is identical to that of its benchmark. Used together, R-squared and beta give investors a fuller picture of asset managers’ performance. In essence, R-squared indicates how reliable and practically useful a security’s beta is.
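
The distinction can be illustrated with a short Python sketch. It assumes the usual textbook definitions, which the article does not spell out: beta as the covariance of asset and benchmark returns divided by the variance of benchmark returns, and R-squared as the squared correlation between the two return series. The function name and the return figures are hypothetical.

```python
def beta_and_r_squared(asset_returns, benchmark_returns):
    """Return (beta, R-squared) of an asset relative to a benchmark."""
    n = len(asset_returns)
    mean_a = sum(asset_returns) / n
    mean_b = sum(benchmark_returns) / n
    cov = sum((a - mean_a) * (b - mean_b)
              for a, b in zip(asset_returns, benchmark_returns))
    var_a = sum((a - mean_a) ** 2 for a in asset_returns)
    var_b = sum((b - mean_b) ** 2 for b in benchmark_returns)
    beta = cov / var_b                   # size of the moves relative to the benchmark
    r2 = cov ** 2 / (var_a * var_b)      # how closely the moves track the benchmark
    return beta, r2


# Hypothetical periodic returns, for illustration only.
benchmark = [0.01, -0.02, 0.03, 0.00, 0.02]
leveraged = [1.5 * r for r in benchmark]        # amplifies every benchmark move
unrelated = [0.02, 0.01, -0.01, 0.03, 0.00]     # moves largely on its own

beta_lev, r2_lev = beta_and_r_squared(leveraged, benchmark)
beta_unr, r2_unr = beta_and_r_squared(unrelated, benchmark)
print(round(beta_lev, 2), round(r2_lev, 2))  # 1.5 1.0
```

The first hypothetical asset tracks the benchmark perfectly (R-squared of 1) while moving one and a half times as far (beta of 1.5). For the second asset R-squared is low, so its beta says little about how it will behave when the benchmark moves.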

Limitations of R-Squared

R-squared gives you an estimate of how closely the movements of a dependent variable follow the movements of an independent variable. It does not tell you whether the chosen model is good or bad, nor whether the data and predictions are biased. A high or low R-squared is not necessarily good or bad, as it does not convey the reliability of the model or tell you whether you have chosen the right regression. You can get a low R-squared for a good model, or a high R-squared for a poorly fitted model.
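
As one illustration of a high R-squared alongside a poorly chosen model, the sketch below fits a straight line to data that actually follow a curve. The data (y = x² for x from 0 to 10) and all variable names are hypothetical, and the line is assumed to be fitted by ordinary least squares.

```python
# Hypothetical data that follow a curve, not a straight line.
xs = list(range(11))
ys = [x ** 2 for x in xs]

# Ordinary least-squares straight line through the curved data.
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

# Residuals: actual value minus the value predicted by the line.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

unexplained = sum(r ** 2 for r in residuals)
total = sum((y - mean_y) ** 2 for y in ys)
r2 = 1 - unexplained / total

print(round(r2, 3))                    # 0.928
print([round(r) for r in residuals])   # [15, 6, -1, -6, -9, -10, -9, -6, -1, 6, 15]
```

The R-squared is above 0.9, yet the residuals are positive at both ends and negative in the middle. That systematic pattern shows the straight line is the wrong regression for these data, which the R-squared value alone does not reveal.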
