“I” & “We” in Academic Writing: Examples from 9,830 Studies

I analyzed a random sample of 9,830 full-text research papers, uploaded to PubMed Central between 2016 and 2021, to explore whether, and how, first-person pronouns are used in the scientific literature.

I used the BioC API to download the data (see the References section below).
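As a rough illustration of what such a download step might look like, here is a minimal Python sketch. It is not the script used for this analysis. The URL pattern and the `section_type` field reflect my understanding of the BioC JSON format for PubMed Central and should be treated as assumptions, and the function names are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumption: base URL of the BioC API for the PMC open-access subset.
BIOC_BASE = "https://www.ncbi.nlm.nih.gov/research/bionlp/RESTful/pmcoa.cgi"


def bioc_url(pmcid):
    """Build the request URL for one article in BioC JSON format."""
    return f"{BIOC_BASE}/BioC_json/{pmcid}/unicode"


def fetch_bioc(pmcid):
    """Download and decode one article (requires network access)."""
    with urlopen(bioc_url(pmcid)) as response:
        return json.load(response)


def text_by_section(payload):
    """Group passage text by section label.

    Assumption: each passage carries its section label in
    infons["section_type"] (for example "INTRO" or "METHODS").
    The payload may be a single collection or a list of collections.
    """
    collections = payload if isinstance(payload, list) else [payload]
    sections = {}
    for collection in collections:
        for document in collection.get("documents", []):
            for passage in document.get("passages", []):
                label = passage.get("infons", {}).get("section_type", "UNKNOWN")
                sections.setdefault(label, []).append(passage.get("text", ""))
    return {label: " ".join(parts) for label, parts in sections.items()}
```

Calling `text_by_section(fetch_bioc(some_pmcid))` would then give one block of text per section, which is the unit used in the per-section counts in this article.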

Popularity of first-person pronouns in the scientific literature

In our sample of 9,830 articles, 93.8% used the first-person pronouns “I” or “We”. The pronoun “We” was far more prevalent than “I” (used in 93.1% versus 13.9% of articles, respectively).

In fact, even articles written by single authors were more likely to use “We” than “I”. Out of 9,830 articles, 39 were written by single authors: 8 of them used “I” and 19 used “We”.

The following table describes the use of first-person pronouns in each section of the research article:

Article Section | Proportion of sections that used the pronoun “I” | Proportion of sections that used the pronoun “We”
--- | --- | ---
Abstract | 0.01% | 22.71%
Introduction | 1.41% | 64.31%
Methods | 7.29% | 68.29%
Results | 6.28% | 52.36%
Discussion | 2.60% | 85.65%
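The per-section proportions in the table above could be computed along the following lines. This is a minimal sketch, not the code behind the table: the regular expressions are a simplifying assumption (a standalone capital “I” will also match non-pronoun uses such as the Roman numeral in “type I error”), and the function names and input layout are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: "I" is matched case-sensitively as a standalone word,
# "we" is matched as a standalone word in any letter case.
I_PATTERN = re.compile(r"\bI\b")
WE_PATTERN = re.compile(r"\bwe\b", re.IGNORECASE)


def pronoun_flags(text):
    """Return (uses_I, uses_We) for one block of text."""
    return bool(I_PATTERN.search(text)), bool(WE_PATTERN.search(text))


def section_proportions(articles):
    """Compute, per section, the share of sections using each pronoun.

    articles: list of dicts mapping a section name to that section's text.
    Returns a dict mapping section name to (share_with_I, share_with_We).
    """
    totals = defaultdict(int)
    used_i = defaultdict(int)
    used_we = defaultdict(int)
    for article in articles:
        for section, text in article.items():
            has_i, has_we = pronoun_flags(text)
            totals[section] += 1
            used_i[section] += has_i    # True counts as 1
            used_we[section] += has_we
    return {
        section: (used_i[section] / totals[section],
                  used_we[section] / totals[section])
        for section in totals
    }
```

A section counts once no matter how many times the pronoun appears in it, which matches the "proportion of sections" wording of the table.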

Use of the pronoun “I”

The pronoun “I” was mostly used in the Methods section (present in 7.29% of all methods sections in our sample).

For example:

“In general, I assumed a steady-state and a closed-population.”

Link to the article on PubMed

“I” was also prevalent in the Results section (6.28%). But here, all of its uses were to quote participants’ answers, such as:

The respondents stated, “I am scared to get infected and infect my family when I go home”

Link to the article on PubMed

“I” was rarely used in the Discussion section (2.60%). For example:

“Based on this observation, I suggest that future research on this population seek to increase the participation of Indigenous communities.”

Link to the article on PubMed

And only 1 article of 9,830 used the pronoun “I” in the abstract:

“By using the largest publicly available cancer incidence statistics (20 million cases), I show that incidence of 20 most prevalent cancer types in relation to patients’ age closely follows the Erlang probability distribution (R² = 0.9734-0.9999).”

Link to the article on PubMed

Use of the pronoun “We”

The pronoun “We” was most common in the Discussion section (present in 85.65% of all Discussion sections in our sample). For example:

“Although we cannot rule out this potential bias, we expect that missing data in our analysis did not depend on our dependent variable.”

Link to the article on PubMed

The Methods section came next (68.29%). For example:

“Depending on the severity and chronicity of disease, we applied three different time frames”

Link to the article on PubMed

The Introduction followed (64.31%). For example:

“Instead of using time series analysis, we conducted a manipulative field experiment.”

Link to the article on PubMed

“We” was used to a lesser extent in the Results section (52.36%). For example:

“In contrast, we did not find differences in survival when mutants where challenged with acute oxidative stress (Figure S5)”

Link to the article on PubMed

“We” was least common in the Abstract (22.71%). For example:

“In this study, we analyzed and summarized seven RCTs and four meta-analyses.”

Link to the article on PubMed

Article quality and the use of first-person pronouns

The following table compares articles that used first-person pronouns with those that did not use first-person pronouns regarding:

  • The number of citations per year received by these articles
  • The impact factor of the journals in which these articles were published

Measure | Articles that used first-person pronouns | Articles that did NOT use first-person pronouns
--- | --- | ---
Median number of citations per year | 2.2 citations/year | 1.9 citations/year
Median journal impact factor | 2.7 | 2.5

The data show that higher-quality articles (those that receive more citations and those published in higher-impact journals) tend to use first-person pronouns, although the differences in both medians are modest.

In other words, article quality, as measured by these two proxies, is positively correlated with the use of first-person pronouns.
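For readers who want to reproduce this kind of comparison on their own data, here is a minimal sketch of the median calculation. The field names (`uses_first_person`, `citations_per_year`) and the function name are hypothetical, and the numbers in the usage example are made up for illustration, not taken from the sample.

```python
from statistics import median


def compare_medians(articles, key):
    """Return (median for pronoun users, median for non-users) of one measure.

    articles: list of dicts with a boolean "uses_first_person" field
    (hypothetical name) and a numeric field named by `key`.
    """
    with_pronouns = [a[key] for a in articles if a["uses_first_person"]]
    without_pronouns = [a[key] for a in articles if not a["uses_first_person"]]
    return median(with_pronouns), median(without_pronouns)


# Made-up illustrative records, not data from the study.
example = [
    {"uses_first_person": True, "citations_per_year": 1.0},
    {"uses_first_person": True, "citations_per_year": 3.0},
    {"uses_first_person": True, "citations_per_year": 2.0},
    {"uses_first_person": False, "citations_per_year": 0.5},
    {"uses_first_person": False, "citations_per_year": 1.5},
]
medians = compare_medians(example, "citations_per_year")
```

Medians are used rather than means because citation counts are typically skewed by a few highly cited articles.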

References

  • Comeau DC, Wei CH, Islamaj Doğan R, and Lu Z. PMC text mining subset in BioC: about 3 million full text articles and growing, Bioinformatics, btz070, 2019.
