Run and Interpret Ordinal Logistic Regression in R

Ordinal logistic regression is a type of regression analysis that models the relationship between one or more predictors (numerical or categorical) and an ordinal outcome. An ordinal outcome is a variable that has more than 2 categories that have a logical order, such as:

Cancer stage: 0, 1, 2, 3, or 4
Income level: low, medium, or high

In this tutorial, we will use ordinal logistic regression on the titanic data set, which can be downloaded from this GitHub link, to predict the variable Pclass (passenger class), which can take on 1 of 3 values (1: first class, 2: second class, and 3: third class), using the predictors: Age, Sex, and Survived.

1. Loading the data

First, we will transform the numeric variable Pclass into an ordinal variable, and then remove all observations with missing values:

library(tidyverse)

titanic <- read_csv("titanic.csv") |> 
  mutate(Pclass = factor(Pclass,
                         levels = c(1, 2, 3),
                         ordered = TRUE)) |> 
  select(Pclass, Age, Sex, Survived) |> 
  drop_na()

titanic
## A tibble: 714 × 4
#   Pclass   Age Sex    Survived
#   <ord>  <dbl> <chr>     <dbl>
# 1 3         22 male          0
# 2 1         38 female        1
# 3 3         26 female        1
# 4 1         35 female        1
# 5 3         35 male          0
# 6 1         54 male          0
# 7 3          2 male          0
# 8 3         27 female        1
# 9 2         14 female        1
#10 3          4 female        1
## ℹ 704 more rows
## ℹ Use `print(n = ...)` to see more rows

2. Fitting an ordinal logistic regression model

We will use the most common type of ordinal logistic regression, which is proportional odds logistic regression, since it is the easiest to interpret.

The model can be run using the clm() function from the ordinal package as follows:

library(ordinal)

model <- clm(Pclass ~ Age + Sex + Survived, data = titanic)

summary(model)
#formula: Pclass ~ Age + Sex + Survived
#data:    titanic
#
# link  threshold nobs logLik  AIC     niter max.grad cond.H 
# logit flexible  714  -627.62 1265.23 4(0)  1.74e-07 4.1e+04
#
#Coefficients:
#          Estimate Std. Error z value Pr(>|z|)    
#Age      -0.063208   0.005857 -10.792   <2e-16 ***
#Sexmale  -0.315159   0.199921  -1.576    0.115    
#Survived -1.969870   0.204830  -9.617   <2e-16 ***
#---
#Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
#Threshold coefficients:
#    Estimate Std. Error z value
#1|2  -4.2024     0.3091  -13.60
#2|3  -2.8120     0.2838   -9.91

Interpreting the relationship between each predictor and the outcome

The general interpretation of an ordinal logistic regression coefficient β₁ associated with a predictor X₁ is:

A 1 unit increase in the predictor X₁ is associated with a β₁ change in the log odds of the outcome Y of being in a higher category.

Since the coefficients of the predictors are the log odds ratios, we will have to exponentiate them to get the odds ratios.

In R, we can call the function tidy() on the model fit to extract the coefficients and p-values, with the arguments exponentiate = TRUE and conf.int = TRUE (to print the 95% confidence intervals). Then, we remove the standard errors and z-scores from the output to make the table smaller:

library(tidymodels)

tidy(model, exponentiate = TRUE, conf.int = TRUE) |> 
  select(-c(std.error, statistic))
## A tibble: 5 × 6
#  term     estimate  p.value conf.low conf.high coef.type
#  <chr>       <dbl>    <dbl>    <dbl>     <dbl> <chr>    
#1 1|2        0.0150 4.17e-42  NA         NA     intercept
#2 2|3        0.0601 3.77e-23  NA         NA     intercept
#3 Age        0.939  3.77e-27   0.928      0.949 location 
#4 Sexmale    0.730  1.15e- 1   0.491      1.08  location 
#5 Survived   0.139  6.77e-22   0.0927     0.207 location

We will only look at the coefficients of the predictors (the last 3 lines), not the intercepts. Here’s how to interpret them:

Interpreting the coefficient of Age:

The exponentiated coefficient (the odds ratio) related to Age is 0.939 which is less than 1: this means that age is negatively related to higher passenger class values. But since a 1^st class ticket is better than a 2^nd class which in turn is better than a 3^rd class ticket, then higher age is positively related to having a higher-class ticket, and the interpretation becomes:

Older passengers tend to have higher-class tickets (p < 0.001). Specifically, a passenger that is 1 year older has 6.1% more odds (0.939 – 1 = -0.061) of having a ticket that is one class higher (i.e. a lower Pclass value).

Interpreting the coefficient of Sex:

The exponentiated coefficient (the odds ratio) related to male Sex is not statistically significant (p = 0.115) which means that gender is not a good predictor of passenger class.

Interpreting the coefficient of Survived:

The exponentiated coefficient (the odds ratio) related to Survived is 0.139 which is less than 1, this means that:

Passengers who survived tend to have lower Pclass values, i.e. tend to be have higher-class tickets (p < 0.001). Specifically, a passenger who survived had 86.1% (0.139 – 1 = 0.861) more odds of having a ticket that is one class higher than a passenger who did not survive.

Checking the model fit

The function nagelkerke() from the rcompanion package will calculate the pseudo-R-squared and run a likelihood ratio test:

library(rcompanion)

nagelkerke(model)
#$Models
#                                                    
#Model: "clm, Pclass ~ Age + Sex + Survived, titanic"
#Null:  "clm, Pclass ~ 1, titanic"                   
#
#$Pseudo.R.squared.for.model.vs.null
#                             Pseudo.R.squared
#McFadden                             0.155862
#Cox and Snell (ML)                   0.277186
#Nagelkerke (Cragg and Uhler)         0.316640
#
#$Likelihood.ratio.test
# Df.diff LogLik.diff  Chisq   p.value
#      -3     -115.88 231.77 5.738e-50
#
#$Number.of.observations
#          
#Model: 714
#Null:  714
#
#$Messages
#[1] "Note: For models fit with REML, these statistics are based on refitting with ML"
#
#$Warnings
#[1] "None"

Of particular importance are:

Nagelkerke’s R-squared: which is a number between 0 and 1 that measures the goodness of fit of a logistic regression model.
The likelihood ratio test: which tests if the full model (the model with all the predictors included) fits the data better than the null model (the model with no variables). In our case, the LogLik.diff is -115.88 with p < 0.001, which means that adding the predictors is better than the null model with no predictors.

We can also run the Lipsitz test to check the goodness of fit:

library(generalhoslem)

lipsitz.test(model)
#	Lipsitz goodness of fit test for ordinal response models
#
#data:  formula:  Pclass ~ Age + Sex + Survived
#LR statistic = 7.0386, df = 9, p-value = 0.6331

Since the null hypothesis is a good model fit, then the p = 0.6331 obtained means that we cannot reject that hypothesis — which is a good thing.

Accuracy of the ordinal logistic regression model

Next, we will calculate the model’s accuracy in 3 steps:

Step #1: Get the fitted values and save them in preds:

preds <- augment(model, type = "class")

preds
## A tibble: 714 × 5
#   Pclass   Age Sex    Survived .fitted
#   <ord>  <dbl> <chr>     <dbl> <fct>  
# 1 3         22 male          0 3      
# 2 1         38 female        1 1      
# 3 3         26 female        1 1      
# 4 1         35 female        1 1      
# 5 3         35 male          0 3      
# 6 1         54 male          0 1      
# 7 3          2 male          0 3      
# 8 3         27 female        1 1      
# 9 2         14 female        1 3      
#10 3          4 female        1 3      
## ℹ 704 more rows
## ℹ Use `print(n = ...)` to see more rows

Step #2: Look at the confusion matrix:

conf_mat(preds, truth = Pclass, estimate = .fitted)
#          Truth
#Prediction   1   2   3
#         1 113  59  49
#         2   0   0   0
#         3  73 114 306

Notice that the model did not classify anyone to have a 2^nd class ticket.

Now we can stop here and manually calculate the accuracy using the confusion matrix, but the function accuracy() from tidymodels can do it faster.

Step #3: Calculate the model accuracy:

accuracy(preds, truth = Pclass, estimate = .fitted)
## A tibble: 1 × 3
#  .metric  .estimator .estimate
#  <chr>    <chr>          <dbl>
#1 accuracy multiclass     0.587

So we can say that, based on Age, Sex, and Survived, the model correctly guesses the passenger’s class 58.7% of the times.

Checking the proportional odds assumption

The specific type of ordinal logistic regression model that we used (the proportional odds logistic regression) assumes that the effect of each predictor on the outcome is the same for all levels of the outcome. For example, “the effect of increasing age by 1 year on the odds of having a second-class ticket vs a first-class ticket” is the same as “the effect of increasing age by 1 year on the odds of having a third-class ticket vs a second-class ticket”.

We can check this assumption by using the Brant test, where the null hypothesis is that the proportional odds assumption holds. The assumption is considered violated if p < 0.05 on the Omnibus test plus at least one of the variables [source: McNulty K. Handbook of Regression Modeling in People Analytics: With Examples in R and Python. 1st edition. Chapman and Hall/CRC; 2021.]

library(gofcat)

brant.test(model)
#Brant Test:
#            chi-sq   df   pr(>chi)
#Omnibus      3.699    3       0.30
#Age          2.027    1       0.15
#Sexmale      1.554    1       0.21
#Survived     0.679    1       0.41
#
#H0: Proportional odds assumption holds

Here, all p-values are above 0.05, so the proportional odds assumption holds.