I analyzed a random sample of 76,147 full-text research papers uploaded to PubMed Central between 2016 and 2021 to check the popularity of programming languages among medical researchers.
I used the BioC API to download the articles (see the References section below). Of these, only 12,086 mentioned the use of at least one programming language.
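A minimal sketch of how an article might be downloaded through the BioC API and flattened to plain text. The endpoint pattern is an assumption based on the public BioC-PMC documentation and should be checked against the current docs. The helper names (`build_bioc_url`, `fetch_article`, `extract_text`) are hypothetical and not part of the original analysis.

```python
import json
import urllib.request

# Assumed endpoint pattern for the BioC-PMC service (verify before use).
BIOC_PMC_URL = (
    "https://www.ncbi.nlm.nih.gov/research/bionlp/RESTful/"
    "pmcoa.cgi/BioC_json/{article_id}/unicode"
)


def build_bioc_url(article_id: str) -> str:
    """Return the request URL for one article ID (for example a PMC ID)."""
    return BIOC_PMC_URL.format(article_id=article_id)


def fetch_article(article_id: str, timeout: float = 30.0):
    """Download one article as parsed BioC JSON (performs a network call)."""
    url = build_bioc_url(article_id)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_text(bioc) -> str:
    """Join the text of every passage in a BioC collection.

    Accepts either a single collection (dict) or a list of collections,
    since the shape of the top-level JSON is an assumption here.
    """
    collections = bioc if isinstance(bioc, list) else [bioc]
    parts = []
    for collection in collections:
        for document in collection.get("documents", []):
            for passage in document.get("passages", []):
                parts.append(passage.get("text", ""))
    return "\n".join(parts)
```

Calling `extract_text(fetch_article("PMC1234567"))` with a real article ID would then return the full text as a single string, ready to be searched for language names.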
R was the most used programming language overall, mentioned in 69.69% of research papers, followed by Matlab (21.31%) and Python (8.98%).
The 6-year trend showed that the popularity of R and Python is increasing (+1.59% and +1.21%, respectively), whereas Matlab declined by 2.01%.
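For illustration, percentages like the ones above could be computed roughly as follows. This is only a sketch: the matching patterns are hypothetical (the rules used in the original analysis are not published here), and the denominator is assumed to be the number of articles that mention at least one language.

```python
import re
from collections import Counter

# Hypothetical patterns. "R" needs context words, because a bare capital R
# would match far too much unrelated text.
LANGUAGE_PATTERNS = {
    "R": re.compile(
        r"\bR (?:software|package|version|statistical|programming|language)\b"
        r"|\bR Core Team\b|\bR Foundation\b"
    ),
    "Matlab": re.compile(r"\bmatlab\b", re.IGNORECASE),
    "Python": re.compile(r"\bpython\b", re.IGNORECASE),
}


def languages_mentioned(text: str) -> set:
    """Return the set of languages whose pattern occurs in the text."""
    return {
        name for name, pattern in LANGUAGE_PATTERNS.items()
        if pattern.search(text)
    }


def mention_percentages(texts) -> dict:
    """Percent of articles mentioning each language.

    Assumption: the denominator is the number of articles that mention
    at least one language. Each article counts once per language.
    """
    counts = Counter()
    n_with_language = 0
    for text in texts:
        found = languages_mentioned(text)
        if found:
            n_with_language += 1
            counts.update(found)
    if n_with_language == 0:
        return {}
    return {name: 100.0 * c / n_with_language for name, c in counts.items()}
```

Because one article can mention several languages, percentages computed this way do not have to sum to 100.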
Here’s a table of the top 14 programming languages used in medical research:
| Ranking | Programming Language | Number of Mentions |
|---|---|---|
⚠ How was the trend calculated?
The 6-year trend is the linear regression coefficient (reported in percent) obtained by regressing the percentage of articles that mention a particular programming language each year on the year. The trend was calculated only for programming languages with more than 100 mentions over the 6 years, because with fewer mentions the coefficient would reflect noise more than any real trend.
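The calculation described above can be sketched as an ordinary least-squares slope. The function names and the toy numbers in the usage note are illustrative assumptions, not data from the analysis. The `min_mentions` default mirrors the "more than 100 mentions" rule.

```python
def linear_trend(years, percents) -> float:
    """Ordinary least-squares slope of percents regressed on years.

    The result is in percentage points per year.
    """
    n = len(years)
    if n < 2 or n != len(percents):
        raise ValueError("need two or more (year, percent) pairs of equal length")
    mean_x = sum(years) / n
    mean_y = sum(percents) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("years must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, percents))
    return sxy / sxx


def yearly_trend(years, mentions, totals, min_mentions: int = 100):
    """Trend for one language, or None if it has too few mentions.

    mentions[i]: articles mentioning the language in years[i]
    totals[i]:   articles counted in years[i] (the denominator is an assumption)
    """
    if sum(mentions) <= min_mentions:
        return None  # too few mentions: the slope would mostly be noise
    percents = [100.0 * m / t for m, t in zip(mentions, totals)]
    return linear_trend(years, percents)
```

For example, with made-up counts of 20, 23, 26, 29, 32, and 35 mentions out of 200 articles per year from 2016 to 2021, the yearly percentages rise by 1.5 points each year, so `yearly_trend` returns a slope of about 1.5.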
References
- Comeau DC, Wei CH, Islamaj Doğan R, Lu Z. PMC text mining subset in BioC: about 3 million full text articles and growing. Bioinformatics, btz070, 2019.