Using pivot_longer with names_sep and names_pattern in R

In this article, we will explain how to use the arguments names_sep and names_pattern of the function pivot_longer() from the tidyr package.

First, let’s create some data:

df <- data.frame(jan_2023 = c(1,1),
                 feb_2023 = c(2,2),
                 mar_2023 = c(3,3))

df
#  jan_2023 feb_2023 mar_2023
#1        1        2        3
#2        1        2        3

The general syntax of pivot_longer() is:

pivot_longer syntax explanation

1. Using the names_sep argument in pivot_longer()

When calling pivot_longer(), we can create 2 new columns (instead of 1) by splitting the column names of df using the following arguments:

  • First we specify that we want 2 columns by setting names_to = c('month', 'year')
  • Next, we set names_sep = '_' to split jan_2023, feb_2023, and mar_2023 into month and year
  • And names_transform = list(year = as.numeric) to transform the year column to numeric instead of character
df2 <- df |> 
  pivot_longer(cols = everything(),
               names_to = c('month','year'),
               names_sep = '_',
               values_to = 'sales',
               names_transform = list(year = as.numeric))
df2

df2
## A tibble: 6 x 3
#  month  year sales
#  <chr> <dbl> <dbl>
#1 jan    2023     1
#2 feb    2023     2
#3 mar    2023     3
#4 jan    2023     1
#5 feb    2023     2
#6 mar    2023     3

2. Using the names_pattern argument in pivot_longer()

We can split the columns in df by providing a regular expression to the names_pattern argument of pivot_longer():

Since we want to split the column names of df (jan_2023, feb_2023, and mar_2023) into 2 columns (month and year), the regular expression should contain 2 groups:

df2 <- df |> 
  pivot_longer(cols = everything(),
               names_to = c('month','year'),
               names_pattern = '([A-Za-z]+)_([0-9]+)',
               values_to = 'sales',
               names_transform = list(year = as.numeric))
df2
## A tibble: 6 x 3
#  month  year sales
#  <chr> <dbl> <dbl>
#1 jan    2023     1
#2 feb    2023     2
#3 mar    2023     3
#4 jan    2023     1
#5 feb    2023     2
#6 mar    2023     3

Further reading