What is an Acceptable Value for VIF? (With References)

Most research papers consider a VIF (Variance Inflation Factor) > 10 as an indicator of multicollinearity, but some choose a more conservative threshold of 5 or even 2.5.

So what threshold should YOU choose?

When choosing a VIF threshold, you should take into account that multicollinearity is a lesser problem when dealing with a large sample size compared to a smaller one. [Source]

That being said, here’s a list of references for different VIF thresholds recommended to detect collinearity in a multivariable (linear or logistic) model:

VIF Threshold	Reference Type	Reference Date	Reference
VIF > 10 is problematic	Book	2005	Vittinghoff E. Regression Methods in Biostatistics: Linear, Logistic, Survival, and Repeated Measures Models. Springer; 2005.
VIF > 5 or VIF > 10 is problematic	Book	2017	James G, Witten D, Hastie T, Tibshirani R. An Introduction to Statistical Learning: With Applications in R. 1st ed. 2013, Corr. 7th printing 2017 edition. Springer; 2013.
VIF > 5 is cause for concern and VIF > 10 indicates a serious collinearity problem	Book	2001	Menard S. Applied Logistic Regression Analysis. 2nd edition. SAGE Publications, Inc; 2001.
VIF ≥ 2.5 indicates considerable collinearity	Research Paper	2018	Johnston R, Jones K, Manley D. Confounding and collinearity in regression analysis: a cautionary tale and an alternative procedure, illustrated by studies of British voting behaviour. Qual Quant. 2018;52(4):1957-1976. doi:10.1007/s11135-017-0584-6

How to interpret a given VIF value?

Consider the following linear regression model:

Y = β₀ + β₁ × X₁ + β₂ × X₂ + β₃ × X₃ + ε

For each of the independent variables X₁, X₂ and X₃ we can calculate the variance inflation factor (VIF) in order to determine if we have a multicollinearity problem.

Here’s the formula for calculating the VIF for X₁:

VIF (variance inflating factor) formula for the first variable in the model

R² in this formula is the coefficient of determination from the linear regression model which has:

X₁ as dependent variable
X₂ and X₃as independent variables

In other words, R² comes from the following linear regression model:

X₁ = β₀ + β₁ × X₂ + β₂ × X₃ + ε

And because R² is a number between 0 and 1:

When R² is close to 1 (i.e. X₂ and X₃ are highly predictive of X₁): the VIF will be very large
When R² is close to 0 (i.e. X₂ and X₃ are not related to X₁): the VIF will be close to 1

Therefore the range of VIF is between 1 and infinity.

Now, let’s discuss how to interpret the following cases where:

VIF = 1
VIF = 2.5
VIF = +∞

Example 1: VIF = 1

A VIF of 1 for a given independent variable (say for X₁ from the model above) indicates the total absence of collinearity between this variable and other predictors in the model (X₂ and X₃).

Example 2: VIF = 2.5

If for example the variable X₃ in our model has a VIF of 2.5, this value can be interpreted in 2 ways:

The variance of β₃ (the regression coefficient of X₃) is 2.5 times greater than it would have been if X₃ had been entirely non-related to other variables in our model
The variance of β₃ is 150% greater than it would be if there were no collinearity effect at all between X₃ and other variables in our model

Why 150%?

This percentage is calculated by subtracting 1 (the value of VIF if there were no collinearity) from the actual value of VIF:

2.5 – 1 = 1.5

In percent:

1.5 * 100 / 100 = 150%.

Example 3: VIF = Infinity

An infinite value of VIF for a given independent variable indicates that it can be perfectly predicted by other variables in the model.

Looking at the equation above, this happens when R² approaches 1.

How to interpret a given VIF value?

Example 1: VIF = 1

Example 2: VIF = 2.5

Example 3: VIF = Infinity

Further reading