Distributional data such as detrital age populations or grain size distributions are common in the geological sciences. As analytical techniques become more sophisticated, increasingly large amounts of distributional data are being gathered. These advances require quantitative and objective methods, such as multidimensional scaling (MDS), to analyse large numbers of samples. Crucial to such methods is choosing a sensible measure of dissimilarity between samples. At present, the Kolmogorov–Smirnov (KS) statistic is the most widely used of these dissimilarity measures. However, the KS statistic has some limitations such as high sensitivity to differences between the modes of two distributions and insensitivity to their tails. Here, we propose the Wasserstein-2 distance (

A distributional dataset is one where the information does not lie in individual observations but in the

Whatever the application, the choice of dissimilarity metric is vital: different metrics give different numerical results and thus different geological interpretations. In general, the most appropriate metric depends on the data being analysed and the scientific question under investigation. The Kolmogorov–Smirnov (KS) distance, calculated as the maximum vertical distance between two empirical cumulative distribution functions (ECDFs), has emerged as a “canonical” distance metric between mineral age distributions
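The definition of the KS distance as the maximum vertical gap between two ECDFs can be sketched in a few lines. This is a minimal illustration rather than the code accompanying this paper; the function name `ks_distance` and the example ages are hypothetical.

```python
import numpy as np

def ks_distance(x, y):
    """Maximum vertical distance between the ECDFs of two samples."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    # The ECDFs are step functions that only change at observed values,
    # so it is enough to compare them at every observation in either sample.
    grid = np.concatenate([x, y])
    ecdf_x = np.searchsorted(x, grid, side="right") / x.size
    ecdf_y = np.searchsorted(y, grid, side="right") / y.size
    return float(np.max(np.abs(ecdf_x - ecdf_y)))

# Hypothetical single-grain samples (ages in Ma): the KS distance is 1.0
# in both cases, regardless of how far apart the ages are.
print(ks_distance([100.0], [200.0]))
print(ks_distance([100.0], [3000.0]))
```

Because only the vertical gap is measured, two samples that do not overlap always sit at the maximum distance of 1, however far apart they lie along the age axis.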

In this paper, we present an alternative to the KS distance that does not suffer from some of these limitations: the Wasserstein distance (also known as the Earth mover's or Kantorovich–Rubinstein distance). To introduce the chief principle behind this measure, let us consider a simple toy example. Table

A toy dataset with a single grain per sample.

As the KS distance is the vertical difference between ECDFs, it is insensitive to the absolute, “horizontal” age differences between individual observations. Thus, the KS distances between A and the other three samples are

In the following sections, we first introduce the Wasserstein distance in a more realistic setting and formally define it. Next, we discuss how it can be decomposed into intuitive terms that accord with how geologists might qualitatively compare distributions. We then compare the Wasserstein distance to the KS distance using a simple yet realistic synthetic example. Finally, we apply both the Wasserstein and KS distances to real datasets in a series of case studies. We thus evaluate the benefits and drawbacks of both metrics, identifying scenarios in which one metric may be preferred to the other. Whilst we focus primarily on detrital age distributions, we emphasize that much of the following discussion applies equally to any form of distributional data.

The Wasserstein distance is a distance metric between two probability measures from a branch of mathematics called “optimal transport”. Optimal transport is often intuited in terms of moving piles of sand from one location to another with no loss or gain of material (e.g.

Intuition of the Wasserstein distance.

We consider two univariate probability distributions,

We focus on these univariate instances as they apply to the most common geological distributional data including detrital age distributions and grain size distributions. However, we note that the Wasserstein distance is, in general, multivariate. As a result, some form of the Wasserstein distance could prove useful for analysing a number of other geological datasets such as the geochemical compositions of detrital minerals, or joint U–Pb and Lu–Hf isotope analysis (see

Like the KS distance, the

A particularly useful property of the
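As an illustration of the decomposition into intuitive terms mentioned in the introduction, one standard identity for univariate distributions splits the squared Wasserstein-2 distance into a location term, a spread term and a shape term. The sketch below assumes this is the form intended here, and further assumes two samples of equal size with equally weighted observations; the function name `w2_squared_terms` is hypothetical.

```python
import numpy as np

def w2_squared_terms(x, y):
    """Split the squared W2 distance between two equal-sized samples
    into location, spread and shape terms (assumed decomposition)."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.size != y.size:
        raise ValueError("this sketch assumes samples of equal size")
    sx, sy = x.std(), y.std()  # population standard deviations
    # Correlation between the two sets of sorted values (quantiles).
    rho = np.mean((x - x.mean()) * (y - y.mean())) / (sx * sy)
    location = (x.mean() - y.mean()) ** 2   # difference in means
    spread = (sx - sy) ** 2                 # difference in standard deviations
    shape = 2.0 * sx * sy * (1.0 - rho)     # mismatch in quantile shape
    total = np.mean((x - y) ** 2)           # squared W2 computed directly
    return location, spread, shape, total
```

The three terms sum to the directly computed squared distance, so each can be read as the share of the dissimilarity due to a shift in mean, a change in spread, or a change in shape.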

Most distributional data in the Earth sciences do not, in raw form, follow continuous probability distributions. Instead, samples may be discrete sets of observations, e.g. lists of individual mineral ages. The above formulations can be easily applied to such cases by describing the probability functions
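For discrete samples, the univariate Wasserstein-2 distance can be computed directly from the sorted observations. The sketch below is not the code released with this paper. It assumes every observation carries equal weight and uses the quantile-function form of the distance; the function name `wasserstein2` is hypothetical.

```python
import numpy as np

def wasserstein2(x, y):
    """W2 distance between two univariate samples of possibly different
    sizes, assuming each observation carries equal weight."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    n, m = x.size, y.size
    # Both empirical quantile functions are piecewise constant; they can
    # only jump at probabilities i/n or j/m.
    edges = np.unique(np.concatenate([np.arange(n + 1) / n,
                                      np.arange(m + 1) / m]))
    widths = np.diff(edges)
    mids = edges[:-1] + widths / 2.0
    # Index of the observation giving each quantile at the interval midpoints.
    ix = np.minimum((mids * n).astype(int), n - 1)
    iy = np.minimum((mids * m).astype(int), m - 1)
    # Integrate the squared quantile difference over probability in [0, 1].
    return float(np.sqrt(np.sum(widths * (x[ix] - y[iy]) ** 2)))

# Hypothetical single-grain samples (ages in Ma): the distance equals the
# age difference between the two grains.
print(wasserstein2([100.0], [200.0]))
```

For samples of equal size this reduces to the root mean square difference between the sorted observations, so the result is expressed in the same units as the data (e.g. Ma).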

To demonstrate the intuition of the

Comparing the Wasserstein distance to the Kolmogorov–Smirnov distance.

We argue that the behaviour of the

We reiterate that at a translation of 0 Ma, the

As stated above, the most appropriate dissimilarity metric to use will depend on the scientific question being answered. In general, the Wasserstein distance is most appropriate when absolute differences along the time axis (or more generally, the

Here, we discuss a variety of realistic scenarios where the KS and

We first consider a scenario where the samples are assumed to be mixtures, in differing proportions, of some known or unknown fixed endmembers. This situation is one where absolute distance along the time axis is not relevant, as the nature of the endmembers is not sought, simply their relative contributions to a set of mixtures. Instead, it is the

Mixing of discrete endmembers.

For example, let us consider three unimodal potential sediment sources, as shown in Fig.

In contrast, scenarios where the shape of sediment source age distributions evolves in space and time are well suited to using the

Temporally evolving source distributions.

Figure

In thermochronology, age distributions shift along the time axis according to thermal signals (e.g. exhumation). In many thermochronological studies, we may seek to characterize how such a signal evolves in space and time. For this question, absolute distance along the time axis is useful information and so the

Analysing thermochronological data using

A final scenario where the

Comparing samples from an interlaboratory calibration study. KDEs

We provide the example code (

Additionally, the

The second Wasserstein distance,

The code and data repository are found at

AL conceived the project; both authors contributed to development, writing, and software production.

At least one of the (co-)authors is a member of the editorial board of

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work benefited from discussions with Malcolm Sambridge and Kerry Gallagher.

This research has been supported by Merton College, University of Oxford, and the Natural Environment Research Council (grant no. NE/T001518/1).

This paper was edited by Michael Dietze and reviewed by Joel Saylor and one anonymous referee.