I analyzed a random sample of 104,161 full-text research papers, uploaded to PubMed Central between the years 2016 and 2021, to learn more about title length.
I used the BioC API to download the data (see the References section below).
Here’s a summary of the key findings:
1. The median title was 14 words long (equivalent to 103 characters), and 90% of titles in the sample were between 6 and 25 words.
2. The 10-year trend shows an increase in title length from an average of 103 characters in 2012 to 111 characters in 2021.
3. Since Google shows only the first 60 characters of titles in its results page, 89.2% of titles in our sample will be truncated when they appear in Google search. And the median title loses 41.7% of its words in this process.
4. On average, review articles (systematic reviews and meta-analyses) had longer titles (16 words) compared to original research articles (14 words).
5. Longer articles are not associated with longer titles.
6. Articles published in high-impact journals tend to have shorter titles than average.
1. Overall title length
In our sample of 104,161 articles, the mean title length was 14.7 words, and the distribution of title word count had an expected right skew:
Here’s a table that describes the title word and character counts in the sample:
|Statistic|Word Count|Character Count|
|---|---|---|
|Minimum|2 words|21 characters|
|25th Percentile|11 words|80 characters|
|50th Percentile (Median)|14 words|103 characters|
|Mean|14.7 words|105.7 characters|
|75th Percentile|18 words|129 characters|
|Maximum|148 words|1,097 characters|
From these data, we can conclude that the middle 50% of titles (those between the 25th and 75th percentiles) were between 11 and 18 words long (80 to 129 characters).
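Summary statistics like the ones in the table can be computed with Python's standard library. The following is only a minimal sketch, not the code I used for the analysis: the sample titles are made up, the helper names (`word_count`, `summarize`) are hypothetical, and I assume that a word is any whitespace-separated token and that percentiles use the "inclusive" interpolation method.

```python
import statistics

def word_count(title):
    # Assumption: a word is any whitespace-separated token
    return len(title.split())

def summarize(values):
    # Quartiles via the standard library (Python 3.8+)
    p25, median, p75 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "p25": p25,
        "median": median,
        "mean": statistics.mean(values),
        "p75": p75,
        "max": max(values),
    }

# Hypothetical titles, for illustration only
titles = [
    "Gut microbiota",
    "Sleep and memory consolidation",
    "Title length trends in biomedical research",
    "A randomized trial of exercise for chronic pain",
    "Effects of vitamin D supplementation on bone density in adults",
]

word_stats = summarize([word_count(t) for t in titles])
char_stats = summarize([len(t) for t in titles])
print(word_stats)
print(char_stats)
```

Different tokenization rules (for example, how hyphenated terms are split) would shift the word counts slightly, so the exact convention matters when comparing numbers across studies.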
The shortest title was:
“Cellular Inheritance”
Length: 2 words (21 characters)
And the longest title was:
“Safety and efficacy of alpha‐amylase from Bacillus amyloliquefaciens DSM 9553, Bacillus amyloliquefaciens NCIMB 30251, Aspergillus oryzae CBS 585.94 and Aspergillus oryzae ATTC SD‐5374, endo‐1,4‐beta‐glucanase from Trichoderma reesei ATCC PTA‐10001, Trichoderma reesei ATCC SD‐6331 and Aspergillus niger CBS 120604, endo‐1,4‐beta‐xylanase from Trichoderma koningii MUCL 39203 and Trichoderma citrinoviride CBS 614.94 and endo‐1,3(4)‐beta‐glucanase from Aspergillus tubingensis MUCL 39199 as silage additives for all animal species”
Length: 148 words (1,097 characters)
2. Title length 10-year trend
The following is a plot of the average title character count each year, for the past 10 years:
The 10-year trend shows an increase in title length from an average of 103 characters in 2012 to 111 characters in 2021.
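A yearly average like this comes from grouping titles by publication year and averaging their character counts. Here is a short sketch of that grouping step; the function name `mean_length_by_year` and the placeholder records are my own assumptions for illustration, not the actual data.

```python
from collections import defaultdict
from statistics import mean

def mean_length_by_year(records):
    """records: iterable of (year, title) pairs."""
    lengths = defaultdict(list)
    for year, title in records:
        lengths[year].append(len(title))  # character count of each title
    # Return the years in chronological order
    return {year: mean(values) for year, values in sorted(lengths.items())}

# Placeholder records: strings of a given length stand in for real titles
records = [
    (2012, "a" * 100),
    (2012, "b" * 106),
    (2021, "c" * 111),
]
trend = mean_length_by_year(records)
print(trend)
```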
3. Titles as they appear in Google search
The results page of Google shows only the first 60 characters of titles and the rest is truncated. So, the first 60 characters constitute the part of a research title that is visible to users.
As an example, let’s try to search on Google for the article that had the longest title in our sample (1,097 characters).
In Google’s search field, I typed: “safety and efficacy of alpha-amylase” pubmed.
Here’s the response:
What happened is that Google chose a part of the title (specifically, 57 characters from the title) and displayed it in its results page.
We can all agree that this is horrible!
Everyone searching online for the safety and efficacy of alpha-amylase sees a title that appears to have nothing to do with their search, and will probably end up not clicking on it.
So how many research titles get truncated by Google search? And what percentage of each title is invisible to users?
Based on our sample data, 89.2% of titles were longer than 60 characters and therefore will be truncated when they appear on the results page of Google. And the median title has 41.7% of its words invisible to online users.
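To make the truncation numbers concrete, here is a sketch of how they could be computed. It rests on assumptions of mine: the cut-off is exactly 60 characters, and a word counts as visible only if it fits entirely within those 60 characters. The function names are hypothetical, and Google's real truncation logic is more complicated than a fixed cut.

```python
import re

LIMIT = 60  # visible characters, per the figure quoted above

def visible_part(title, limit=LIMIT):
    # The portion of the title a searcher would see
    return title[:limit]

def hidden_word_share(title, limit=LIMIT):
    # End position of every whitespace-separated word in the title
    ends = [m.end() for m in re.finditer(r"\S+", title)]
    if not ends:
        return 0.0
    hidden = sum(1 for end in ends if end > limit)
    return hidden / len(ends)

def share_truncated(titles, limit=LIMIT):
    # Fraction of titles longer than the visible limit
    return sum(1 for t in titles if len(t) > limit) / len(titles)

# Made-up title: 20 four-letter words separated by single spaces
example = " ".join(["abcd"] * 20)
print(visible_part(example))
print(hidden_word_share(example))
```

Counting partially cut words as hidden is a conservative choice; counting them as visible would give slightly lower percentages.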
When writing a research title, make sure:
- To keep it as short as possible
- That the visible part in an online search (the first 60 characters) is meaningful. Journalists call this front-loading, i.e. important words should be put close to the beginning.
4. Title length for different article types
In our sample of 104,161 articles, review articles (systematic reviews and meta-analyses) had longer titles (median: 16 words; n=2,851 articles) compared to original research articles (median: 14 words; n=101,310 articles).
5. Influence of article length on title length
To study the influence of article length on title length, I ran a Poisson regression model that predicts the title character count given the whole article word count.
According to the output of that model:
A research article that has 1,000 more words has a title that is 1% longer.
Although this result is statistically significant, it is practically negligible, since an article that has 1,000 words more than the median is associated with a title that is only 1 character longer.
In practice, longer research articles are not associated with longer titles.
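For readers unfamiliar with Poisson regression, the sketch below fits the model log(E[title length]) = b0 + b1 × (article length) using plain Newton-Raphson steps. This is not the code or data behind the result above: the ten data points are invented so that the true slope is about 0.01 per 1,000 words, the function name `fit_poisson` is hypothetical, and in practice one would use a statistics package rather than a hand-written fitter.

```python
import math

def fit_poisson(x, y, iterations=25):
    """Fit log(E[y]) = b0 + b1 * x by Newton-Raphson; returns (b0, b1)."""
    b0 = math.log(sum(y) / len(y))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Gradient of the Poisson log-likelihood
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Fisher information (negated Hessian), a symmetric 2x2 matrix
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 linear system explicitly
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Invented data: article length in thousands of words, title length in characters
article_kwords = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
title_chars = [104, 105, 106, 107, 108, 109, 110, 112, 113, 114]

b0, b1 = fit_poisson(article_kwords, title_chars)
percent_change = (math.exp(b1) - 1) * 100
print(f"Each extra 1,000 words multiplies title length by {math.exp(b1):.4f}")
```

Because the model works on the log scale, the coefficient is read as a multiplicative effect: exponentiating it gives the factor by which the expected title length changes for each additional 1,000 words.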
6. Length of titles in different journals
The following table shows the maximum title length allowed in 10 famous scientific journals according to their “instructions for authors” available from their websites:
|Journal Name|Maximum title length allowed (calculated character count in parentheses)|
|---|---|
|JAMA (Journal of the American Medical Association)|150 characters|
|Reviews of Modern Physics|10 words (76 characters)|
|Physiological Reviews|160 characters|
|American Journal of Respiratory and Critical Care Medicine|100 characters|
|PLOS One|250 characters|
|Thorax|20 words (139 characters)|
According to this table, famous journals recommend keeping titles below 126.6 characters on average. But 27% of the titles in our sample exceed this limit.
So where are these 27% of articles published?
More generally, do higher-quality journals prefer publishing shorter titles?
In order to answer this question, I ran a Poisson regression that models the title word count given the journal impact factor. Here’s the model’s output:
|Journal impact factor|-0.011|<0.001|<0.001|
The model shows that a higher journal impact factor is associated with shorter titles. Specifically, a 1 unit increase in the journal impact factor is associated with a decrease of 1.1% in the title word count. For the median article, this means that a 1 unit increase in the journal impact factor is associated with a decrease of 0.15 words (or approximately 1 character) in the title.
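The arithmetic behind these figures can be checked in a few lines. This assumes the model uses the standard log link of Poisson regression, so that exponentiating the coefficient gives a multiplicative effect; the variable names are mine.

```python
import math

coefficient = -0.011  # impact factor coefficient from the model output
median_words = 14     # median title length in the sample

# With a log link, a 1-unit increase multiplies the expected count by exp(coefficient)
relative_change = math.exp(coefficient) - 1
absolute_change = median_words * relative_change

print(f"{relative_change:.1%}")         # -1.1%
print(f"{absolute_change:.2f} words")   # -0.15 words
```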
On average, higher-quality journals tend to publish slightly shorter titles.
References
- Comeau DC, Wei CH, Islamaj Doğan R, Lu Z. PMC text mining subset in BioC: about 3 million full text articles and growing. Bioinformatics, btz070, 2019.