Received: 26 Nov 2022 – Discussion started: 22 Dec 2022
Abstract. Open data has become the modern science meme, and major funding bodies and publishers support open data. On a daily basis, however, the open data mandate frequently encounters technical obstacles, such as the lack of a suitable data format for data sharing and long-term data preservation. Such issues are often community-specific and best addressed through community-tailored solutions. In Quaternary sciences, luminescence dating is widely used for constraining the timing of event-based processes (e.g., sediment transport). Every luminescence-dating study produces a vast body of primary data that usually remains inaccessible and incompatible with future studies or adjacent scientific disciplines. To facilitate data exchange and long-term data preservation (in short, open data) in luminescence-dating studies, we propose a new XML-based structured data format called XLUM. The format applies a hierarchical data storage concept consisting of a root node (node 0), a sample (node 1), a sequence (node 2), a record (node 3) and a curve (node 4). The curve level holds information on the technical component (e.g., photomultiplier, thermocouple). A finite number of curves represent a record (e.g., an optically stimulated luminescence curve). Records are part of a sequence measured for a particular sample. This design concept allows the user to retain information on a technical component level from the measurement process. The additional storage of related metadata fosters future data mining projects on large datasets. The XML-based format is less memory-efficient than binary formats; however, the focus is on data exchange and preservation, and hence on long-term format stability by design. XLUM is inherently stable to future updates and backwards compatible. We support XLUM through a new R package, 'xlum', facilitating the conversion of different formats into the new XLUM format.
XLUM is licensed under the MIT licence and hence available for free to be used in open and closed-source commercial and non-commercial software and research projects.
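The five-level hierarchy described in the abstract (root, sample, sequence, record, curve) can be illustrated with a minimal Python sketch using only the standard library. The element names, attribute names and values used here are assumptions made for illustration; they are not taken from the XLUM specification, which remains the authoritative reference for the actual schema.

```python
import xml.etree.ElementTree as ET


def build_example() -> ET.Element:
    """Build a tiny nested document mirroring the five levels in the abstract.

    All tag names, attribute names and values are hypothetical placeholders.
    """
    root = ET.Element("xlum")                                      # node 0: root
    sample = ET.SubElement(root, "sample", name="sample-01")       # node 1: sample
    sequence = ET.SubElement(sample, "sequence", position="1")     # node 2: sequence
    record = ET.SubElement(sequence, "record", recordType="OSL")   # node 3: record

    # node 4: one curve per technical component contributing to the record
    pmt = ET.SubElement(record, "curve", component="photomultiplier")
    pmt.text = "120 98 75 60"
    thermo = ET.SubElement(record, "curve", component="thermocouple")
    thermo.text = "125 125 125 125"
    return root


def curve_index(root: ET.Element) -> list:
    """List (sample name, record type, component) for every curve in the tree."""
    index = []
    for sample in root.findall("sample"):
        for sequence in sample.findall("sequence"):
            for record in sequence.findall("record"):
                for curve in record.findall("curve"):
                    index.append(
                        (
                            sample.get("name"),
                            record.get("recordType"),
                            curve.get("component"),
                        )
                    )
    return index


if __name__ == "__main__":
    document = build_example()
    # Serialise to a text string; XML stays human-readable at the cost of size.
    print(ET.tostring(document, encoding="unicode"))
    print(curve_index(document))
```

Walking the tree from the sample down to each curve shows how component-level information stays attached to the record, sequence and sample it belongs to, which is the property the design concept aims to preserve.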
After submitting the manuscript, I realised that the Zenodo DOI with the link to the detailed specification is correct but not very straightforward to read, so here is the link to the specification website, which is similar to what we have on Zenodo but easier to read: https://r-lum.github.io/xlum_specification/specification/
The website also provides an option to download the specifications rendered as a PDF. The link will also become part of the manuscript with the next iteration.
Open data has become the modern science meme. Funding bodies and publishers support open data. However, the open data mandate frequently encounters technical obstacles, such as the lack of a suitable data format for data sharing and long-term data preservation. Such issues are often community-specific and demand community-tailored solutions. We proposed a new human-readable data format, called XLUM, for data exchange and long-term data preservation of luminescence data.