I analyzed a random sample of 61,429 full-text research papers uploaded to PubMed Central between 2016 and 2021 to answer two questions: what is the typical length of an abstract, and which factors influence it?
I used the BioC API to download the data (see the References section below).
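As a rough illustration of this step, here is a minimal sketch of how the abstract could be pulled out of one article once its BioC JSON has been downloaded. It is not the code used for this analysis. It assumes the BioC JSON layout in which a collection holds `documents`, each document holds `passages`, and each passage carries a `section_type` entry in its `infons`. The function name `extract_abstract` and the sample record are hypothetical.

```python
def extract_abstract(collection: dict) -> str:
    """Join the text of all passages labeled as abstract, one paragraph per line.

    Assumption: passages carry infons["section_type"] == "ABSTRACT".
    """
    parts = []
    for document in collection.get("documents", []):
        for passage in document.get("passages", []):
            infons = passage.get("infons", {})
            if infons.get("section_type", "").upper() == "ABSTRACT":
                text = passage.get("text", "").strip()
                if text:
                    parts.append(text)
    return "\n".join(parts)


# Hypothetical, hand-written record that mimics the assumed layout.
sample_collection = {
    "documents": [
        {
            "id": "PMC0000000",
            "passages": [
                {"infons": {"section_type": "TITLE"}, "text": "A made-up title"},
                {"infons": {"section_type": "ABSTRACT"}, "text": "First abstract paragraph."},
                {"infons": {"section_type": "ABSTRACT"}, "text": "Second abstract paragraph."},
                {"infons": {"section_type": "INTRO"}, "text": "Introduction text."},
            ],
        }
    ]
}

abstract_text = extract_abstract(sample_collection)
print(abstract_text)
```

Keeping one abstract passage per line makes it easy to count paragraphs later.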
Here’s a summary of the key findings:
1. The median abstract was 263 words long (equivalent to 11 sentences, or 2 paragraphs), and 90% of the abstracts in the sample were between 163 and 416 words.
2. Longer research articles have slightly longer abstracts. Specifically, a research article that has 1,000 more words has an abstract that is 1.4% longer.
3. The length of the abstract does not differ between review articles and original research articles.
4. The quality of the journal (measured by its impact factor) has a practically negligible influence on the length of the abstract.
Overall length of the abstract
Here’s a table that describes the length of the abstract in terms of words, sentences, and paragraphs:
|Statistic||Word Count||Sentence Count||Paragraph Count|
|Minimum||19 words||1 sentence||1 paragraph|
|25th Percentile||217 words||9 sentences||1 paragraph|
|50th Percentile (Median)||263 words||11 sentences||2 paragraphs|
|Mean||273.6 words||11.2 sentences||2.7 paragraphs|
|75th Percentile||315 words||13 sentences||4 paragraphs|
|Maximum||1,686 words||73 sentences||40 paragraphs|
From these data, we can conclude that half of the abstracts in the sample (those between the 25th and 75th percentiles) are between 217 and 315 words long (9 to 13 sentences).
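The statistics in the table can be reproduced along the following lines. This is a minimal sketch under simple assumptions, not the exact method used here: words are split on whitespace, sentences end at `.`, `!` or `?`, and paragraphs are separated by line breaks. The helper names and the toy numbers are hypothetical.

```python
import re
import statistics


def abstract_lengths(abstract: str) -> dict:
    """Count words, sentences, and paragraphs with simple heuristics."""
    paragraphs = [p for p in abstract.split("\n") if p.strip()]
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", abstract.strip()) if s]
    words = abstract.split()
    return {
        "words": len(words),
        "sentences": len(sentences),
        "paragraphs": len(paragraphs),
    }


def summarize(counts: list) -> dict:
    """Return the summary statistics reported in the table above."""
    q1, median, q3 = statistics.quantiles(counts, n=4, method="inclusive")
    return {
        "min": min(counts),
        "q1": q1,
        "median": median,
        "mean": statistics.mean(counts),
        "q3": q3,
        "max": max(counts),
    }


sample = (
    "Background: Abstract length varies. We measured it.\n"
    "Results: Most abstracts were short. Some were long!"
)
lengths = abstract_lengths(sample)

# Toy word counts for illustration only; these are not the study data.
toy_word_counts = [120, 200, 260, 310, 450]
summary = summarize(toy_word_counts)

print(lengths)
print(summary)
```

A regex sentence splitter miscounts abbreviations such as "e.g." or "vs.", so a dedicated sentence tokenizer would give more reliable counts on real abstracts.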
The abstract accounts for 7.15% of the article's total word count, and longer research articles have longer abstracts. To quantify this, I ran a Poisson regression model that predicts the abstract word count from the whole article word count. The result: a research article that has 1,000 more words has an abstract that is 1.4% longer. To put that into perspective, an article that has 1,000 words more than the median is associated with an abstract that is 3.5 words longer.
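To make the "1.4% longer" interpretation concrete, here is a small sketch of a one-predictor Poisson regression fitted with Newton's method in plain Python. It is an illustration only: the software used for the analysis is not stated, the function `fit_poisson` is hypothetical, and the data are synthetic, generated so that each extra 1,000 words multiplies the expected abstract length by 1.014. With a log link, a slope b corresponds to a percentage change of (exp(b) - 1) × 100 per unit of the predictor.

```python
import math


def fit_poisson(x, y, iterations=25):
    """Fit log(mu) = b0 + b1 * x by Newton's method on the Poisson log-likelihood."""
    b0 = math.log(sum(y) / len(y))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Gradient of the log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Negative Hessian (2 x 2, symmetric).
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton update: solve the 2 x 2 system in closed form.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Synthetic data: article length in thousands of words, and an abstract length
# that grows by exactly 1.4% per extra thousand words (not the study data).
article_kwords = [2, 3, 4, 5, 6, 8, 10]
abstract_words = [250 * 1.014 ** k for k in article_kwords]

intercept, slope = fit_poisson(article_kwords, abstract_words)
percent_change = (math.exp(slope) - 1) * 100

print(f"slope per 1,000 words: {slope:.5f}")
print(f"percent change per 1,000 words: {percent_change:.2f}%")
```

Measuring article length in thousands of words keeps the slope on a readable scale and makes the percentage refer directly to "1,000 more words".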
Length of the abstract for different article types
The following table shows the median word count of the abstract for different study designs:
|Study design||Number of studies in the sample||Median abstract word count|
|Case series||140 studies||273 words|
|Case-control||443 studies||275 words|
|Pilot study||842 studies||282 words|
|Case report||407 studies||285 words|
|Cross-sectional||1,481 studies||285 words|
|Quasi-experiment||144 studies||286 words|
|Meta-analysis||3,529 studies||287 words|
|Cohort||5,180 studies||288 words|
|Systematic review||686 studies||294 words|
|Randomized controlled trial||689 studies||301 words|
The data show no clear pattern: review articles (systematic reviews and meta-analyses) and original research articles have similar abstract lengths, with medians ranging only from 273 to 301 words across all study designs. So we can conclude that no particular article type requires a longer abstract.
Length of the abstract in different journals
In order to study the influence of the journal quality on the length of the abstract, I ran a Poisson regression that models the abstract word count given the journal impact factor. Here’s the model output:
|Journal impact factor||-0.004||<0.001||<0.001|
The model shows that a higher journal impact factor is associated with a shorter abstract. Although statistically significant, this effect is practically negligible: a 1-unit increase in the journal impact factor is associated with a decrease of only 0.4% in the abstract word count. For the median article, this means that a 1-unit increase in the journal impact factor is associated with a decrease of approximately 1 word in the abstract.
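The arithmetic behind these two numbers can be checked in a few lines. This sketch assumes the usual log link of a Poisson regression, so the coefficient is exponentiated to get a rate ratio. It also assumes that the median abstract length of 263 words is the baseline, which is a simplification.

```python
import math

coefficient = -0.004          # reported coefficient for journal impact factor
median_abstract_words = 263   # median abstract length in the sample

rate_ratio = math.exp(coefficient)             # multiplicative effect per 1-unit increase
percent_change = (rate_ratio - 1) * 100        # about -0.4%
word_change = median_abstract_words * (rate_ratio - 1)  # about -1 word

print(f"rate ratio: {rate_ratio:.4f}")
print(f"percent change: {percent_change:.2f}%")
print(f"change for a 263-word abstract: {word_change:.2f} words")
```

For coefficients this small, the percentage change is almost identical to the coefficient multiplied by 100, which is why -0.004 reads directly as a 0.4% decrease.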
References
- Comeau DC, Wei CH, Islamaj Doğan R, Lu Z. PMC text mining subset in BioC: about 3 million full text articles and growing. Bioinformatics, btz070, 2019.