Top 5 mistakes native speakers make when writing for scientific journals

In high school, I once asked a native English-speaking friend what she looked for when proofreading other people’s essays, and she said “smell.” Interestingly, “code smell” was a similar concept I encountered when I started working as an iOS developer in San Francisco many years ago.

Now, as I find myself proofreading scientific papers written by my peers in the heavily scrutinised field of medical research, I wonder how I can describe my innate suspicion of a strangely worded sentence other than by simply saying it just doesn’t smell right.

This led me to parse the most common smelly sentences I found and document the top five mistakes.

1. Datum is singular. Data is plural.

Some may consider this to be a pedantic argument, especially in the computer sciences where data may refer to devices. However, consider the sentence “The data indicate many matches…” and notice how it may sound unnatural. In formal scientific writing, this is the correct usage. Furthermore, datum is a count noun, which means its plural can be counted, e.g., “there are twenty-four data points in this experiment.” (The plural datums does exist, but it is generally reserved for reference points in surveying and engineering.) Compare this to a non-count noun, like water. You would not say, for example, “I am drinking ten waters.”

2. When to use which versus that.

Most people learn about relative pronouns like which and that early in school and then promptly ignore them for the rest of their lives. I’ve noticed some writers only use which, and others prefer to use that. Consider the sentence “The vial, which matches…” versus “The vial that matches…” and you have two very different meanings. The word which in the first sentence introduces extra information about a specific vial that has already been identified, probably in a previous paragraph. The word that in the second sentence picks out the vial by the property itself, i.e., whichever vial matches, not one already singled out. In the context of an experiment, this could mean two very different approaches.

3. Mystery sentences

One of the biggest issues casual writers have with academic papers is sometimes called the mystery sentence. This is best described as a sentence structure where the reader is prevented from understanding what is being said until the end. Consider the sentence “With a p-value of 0.03, the pulse values measured in both cohorts displayed statistically significant tachycardia.” This reads more like a mystery novel than the results of a successful experiment. A better version may read “Tachycardia was found to be statistically significant in both cohorts as demonstrated by a p-value of 0.03.”

4. Phenomenon is singular and implies doubt in its explanation.

Firstly, the word phenomenon is often confused with its plural form, phenomena. For example, “The experiment displayed visual phenomena closely resembling…” would be the correct usage when more than one is being described. Secondly, some writers incorrectly use phenomenon to describe mundane observations. The word is not interchangeable with a plain observation. The sentence “Boiling water produces a gaseous phenomenon known as steam” misuses it, because there is no doubt that boiling water produces steam.

5. Using the words who and whom.

The most common misuse of the word who is replacing it with the word that in a relative clause. For example, “The participant in the study that has achieved the most…” incorrectly uses that in place of who, because the noun it refers to, participant, is a person. Furthermore, the word whom is sometimes used incorrectly, or not at all. For example, “This dosage is given to whom?” correctly uses whom. We can use a heuristic to validate this by replacing whom with him. Does “This dosage is given to him?” make sense? Yes. Therefore, whom is correct. The same heuristic can be used between who and he.

6. (Honourable mention) When to use consists versus contains.

The words consist and contain are often treated as interchangeable. However, the word consist implies a consistency. For example, bread consists of flour, water, and yeast. A class contains students. To say a class consists of students would technically be correct but may also imply that they were put in a blender and poured into a classroom.

I’m curious to hear from people who are surprised by these mistakes. Or perhaps there are other mistakes I’ve missed? Let me know in the comments. Also, I may do a “Top 5 mistakes when determining p-values” in the future, if there is interest. Cheers!
