Writing a paper, materials and methods – day 4: making definitions

Writing definitions

Defining things is a key part of scientific writing. In the course of writing up your research you will often need to define methods, scales, objects and so on, so that it is clear to the reader what research you have done, what you intend to evaluate and how. In academic research, particularly at the forefront of a field, there is often disagreement over many things, including what certain words mean and the best way to measure things. So, defining your methods, approach and interpretations is important.

A definition normally consists of three parts:

The term that is being defined
The class of object, concept or measurement that the word or phrase belongs to
The defining characteristics that make the term what it is and different from others

Here are some examples of definitions:

Carbon monoxide [term] is a gas [class] formed by incomplete combustion of carbon, which is colourless, odourless, toxic and flammable [defining characteristics].
An antibiotic [term] is a medicine [class], such as penicillin, that inhibits the growth of or destroys microorganisms [defining characteristics].
The Richter Scale [term] is a numerical measure [class] for expressing the strength of an earthquake on the basis of seismograph oscillations. It is a logarithmic scale and a difference of one represents an approximately thirtyfold difference in the energy released [defining characteristics].

It is important to make a definition when:

You have invented something new
Your audience may not understand the term
You know there are different interpretations of the term
There are layers of subjectivity or evaluation in the way you choose to define the term

Try to avoid defining by stating [something] “is when” or [something] “is where” – instead define a verb with a verb and a noun with a noun.

Example methods section

Let’s have a look through the example methods section below for its structure, phrases and use of definitions.

4. Methods

While most clustering algorithms produce robust results in low-dimensional spaces, only a few perform adequately in multidimensional spaces where the curse of dimensionality becomes noticeable. In practice, subspace clustering has often been found to yield the best results by identifying clusters that are hidden in specific subspace(s) while presented with noise from other dimensions. Our methods aim to address this challenge and use human input to augment the performance of classical algorithms.

In this section, we describe two novel semi-automated clustering algorithms for multidimensional data leveraging the information embedded in annotations collected from Colony B participants. The first one, human-based CLustering In QUEue (hubCLIQUE), uses a bottom-up approach that generalizes the seminal CLIQUE algorithm. The second one, Clustering Of Crowdsourced networks (CloCworks), applies a community detection strategy to identify groups of answers in agreement to generate the clusters.

4.1. hubCLIQUE

CLIQUE (CLustering In QUEue) [term] is one of the first, and still one of the most popular, algorithms [class] developed for clustering high-dimensional datasets. It uses grid-based and density-based approaches to identify dense areas in lower-dimensional spaces and progressively expands the candidate clusters in higher dimensions [defining characteristics]. This strategy is flexible enough to easily incorporate additional information extracted from a human input. We call this algorithm hubCLIQUE, a bottom-up subspace clustering approach guided by crowdsourced solutions collected by Colony B.

…

4.2. CloCworks

In this section, we propose a different approach to predicting clusters using the annotation collected from Colony B. We call this algorithm [class] CloCworks [term]. In contrast to hubCLIQUE, CloCworks aims to detect the occurrence of groups of consistent answers rather than to exploit the density of data. CloCworks models the data collected from Colony B as a network and uses a community search algorithm to find pseudo-optimal partitions in this network [defining characteristics]. Since the problem is NP-hard, we choose the Louvain community search algorithm because of its performance (modularity score) and speed in comparison with other related algorithms.

4.2.1. Network construction

We build a network for every pair of dimensions used in the dataset analysed by Colony B. Each node of the network represents a data point, whereas the edges model the probability of two data points to be clustered together. Hence, the weight of the edges encodes the observed frequency of occurrence of a pair of points in a cluster.

Formally, let $P^{k}_{i,j}$ be a stage of the game for two dimensions {i, j}. The algorithm analyses all solutions for $P^{k}_{i,j}$ and computes the frequency of co-clustering two points together as the number of times the points are selected together over the number of times this stage was presented to a player. The results for all pairs of points are stored in a similarity matrix and we repeat this procedure for all stages in the {i, j} subspace. Then, we average all the similarity matrices over all pairs of dimensions to produce a summary network that is processed in the next step of CloCworks. An illustration of the full process can be found in the electronic supplementary material.

This extract is taken from: Butyaev A, Drogaris C, Tremblay-Savard O, Waldispühl J. 2022 Human-supervised clustering of multidimensional data using crowdsourcing. R. Soc. Open Sci. 9: 211189. https://doi.org/10.1098/rsos.211189

This paper centres on the use of two novel algorithms for analysing data, so the definition of each algorithm in the methods section is important. The term, class and defining characteristics for each one are highlighted in the text above.

Note that in the second definition the order of the term and class is reversed. This does not matter: the definition is still made clearly.

In this paper the authors write using the pronoun ‘we’, putting themselves at the start of sentences, and use the present tense, for example: ‘We build a network for…’
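
One test of a clear methods definition is whether a reader could implement the procedure from the text alone. As a rough illustration, the network construction step in section 4.2.1 of the extract can be sketched in Python. This is not the authors' code: the function names, the representation of a player's solution as a list of sets of point indices, and the assumption that every presentation of a stage yields exactly one solution are all hypothetical choices made for this example.

```python
from itertools import combinations


def co_clustering_frequency(solutions, n_points):
    """Similarity matrix for one stage (hypothetical helper).

    `solutions` holds one entry per time the stage was presented to a
    player (an assumption); each entry is a list of clusters, and each
    cluster is a set of point indices.
    """
    if not solutions:
        raise ValueError("the stage must have been presented at least once")
    counts = [[0.0] * n_points for _ in range(n_points)]
    for solution in solutions:
        for cluster in solution:
            # Count every pair of points placed in the same cluster.
            for a, b in combinations(sorted(cluster), 2):
                counts[a][b] += 1
                counts[b][a] += 1
    presentations = len(solutions)
    # Frequency = times selected together / times the stage was presented.
    return [[c / presentations for c in row] for row in counts]


def average_matrices(matrices):
    """Element-wise average of equally sized similarity matrices."""
    n = len(matrices)
    size = len(matrices[0])
    return [
        [sum(m[i][j] for m in matrices) / n for j in range(size)]
        for i in range(size)
    ]


# Two invented stages over three data points (0, 1, 2).
stage_a = co_clustering_frequency([[{0, 1}, {2}], [{0, 1, 2}]], 3)
stage_b = co_clustering_frequency([[{0, 2}]], 3)
summary = average_matrices([stage_a, stage_b])
# summary[0][1] is 0.5, summary[0][2] is 0.75, summary[1][2] is 0.25
```

Each entry of `summary` plays the role of an edge weight in the summary network described in the extract. The fact that the paragraph translates into code this directly shows how much work the phrase ‘Formally, let…’ is doing.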

Useful phrases

‘In this section, we describe two novel…’: this phrase introduces what the methods section is about.

‘The first one, …’ and ‘The second one, …’: these linking phrases introduce each item in turn, in this case the algorithms.

‘We build a network for…’: this describes what the authors have done.

Alternatives: ‘we create’, ‘we put together’

‘Formally, let $P^{k}_{i,j}$ be a…’: in this field, ‘formally’ signals that something is being defined according to a precise set of rules.

Some useful stock phrases for research articles are below:

‘The results for all pairs … are stored … and we repeat this procedure for all stages’: in many studies, procedures will need to be repeated.

‘An illustration of the full process can be found in the…’: in many studies you will also need to refer readers elsewhere for further details.

Further study for this week

Try using the advice that comes with the Journal this week to write up a materials and methods section from a recent piece of research you have been involved with.

Take today’s short quiz below.
