Exploring the advantages and limitations of in situ U-Pb carbonate geochronology using speleothems

The recent development of methods for in situ U-Pb age determination in carbonates has found widespread application, but the benefits and limitations of the method over bulk analysis (isotope dilution, ID) approaches have yet to be fully explored. Here we use speleothems (cave carbonates such as stalagmites and flowstones) to investigate the utility of in situ dating methodologies for 'challenging' matrices with typically low U and Pb contents and predominantly late Cenozoic ages. Using samples for which ID data have already been published, we show that accurate ages can be obtained for many speleothem types by laser ablation inductively coupled plasma mass spectrometry (LA-ICPMS). Consideration of our own and literature data suggests that most carbonates with > 1 ppm U and a few hundred ppb of Pb should be good targets for in situ methodologies, regardless of age. In situ analysis often provides a larger spread in U/Pb ratios, which can be advantageous for isochron construction, but isochron ages rarely achieve the ultimate precision of ID analyses conducted on the same samples simply because signal sizes are dramatically reduced. LA analysis is faster than ID and thus will play a significant role in reconnaissance studies. The major advantage of the in situ methodology appears to be the potential for successful dating outcomes in sample types requiring high spatial resolution analysis or those with a high common Pb component, where LA approaches may facilitate identification of the most radiogenic regions for analysis.


Introduction
The U-Pb decay scheme has played a key role in the chronology of carbonate rocks for more than three decades (e.g. Moorbath et al., 1987; Jahn and Cuvellier, 1994; Rasbury and Cole, 2009), predominantly through isotope dilution (ID; i.e. bulk sample) methods. Recent years, however, have seen a revolution in the field with the emergence of in situ analysis techniques employing laser ablation inductively coupled plasma mass spectrometry (LA-ICPMS) and offering the prospect of direct determination of U-Pb ages on the scale of a few hundred microns. Although still in its infancy, this method has already been applied to the chronology of marine cements (Li et al., 2014), vein calcites associated with faulting (Roberts and Walker, 2016; Hansman et al., 2018; Parrish et al., 2018), and the alteration of oceanic crust (Coogan et al., 2016).
To date, a thorough exploration of the utility of in situ techniques for speleothem (secondary cave calcite such as stalagmites and flowstones) research has not been conducted, although U-Pb dating of speleothems is widely used in studies of climate change (e.g. Vaks et al., 2013; Sniderman et al., 2016), human evolution and migration (e.g. Walker et al., 2006; Pickering et al., 2011, 2019), biodiversity and ecosystem change (e.g. Woodhead et al., 2016), and tectonics and landscape evolution (e.g. Lundberg, 2000; Polyak et al., 2008; Meyer et al., 2011; Woodhead et al., 2019). Speleothems offer a variety of unique analytical challenges for in situ analysis, not least because of their highly variable and often very low levels of radiogenic Pb but also because most samples of interest are relatively young (predominantly Neogene or early Quaternary). As such they form a useful test of the limitations of the in situ carbonate dating methodology more generally. Here we explore the utility of LA-ICPMS techniques as applied to speleothems, not only to highlight important new research avenues but also to explore both the benefits and limitations of the method.
We first compare LA-ICPMS (henceforth "LA") ages for a variety of speleothem samples for which bulk, solution multicollector ICPMS, ID U-Pb age data have already been published as a benchmark against which to judge the reliability of our in situ analyses. We then explore the advantages and limitations of LA methodologies in this context and make recommendations for the optimal use of both technologies.

Overview
Samples for analysis were prepared either as polished slabs or Epofix™ resin mounts. While a polished surface is not an essential prerequisite for LA studies, it significantly enhances the ability to view the sample clearly with the reflected light microscopes widely employed in LA systems. Mounts were cleaned in ultra-pure water in an ultrasonic bath and dried under nitrogen prior to loading into the sample cell.
We used the "freeform" sample holder available in the S155 large-format ablation cell of an Australian Scientific Instruments (now Applied Spectra) RESOlution-LR ablation system, based around a Lambda Physik Compex 110 excimer laser, operating at 193 nm wavelength, and coupled to a Nu Instruments Attom-ES high-resolution magnetic sector ICPMS operating in deflector jump mode.
Laser fluence was typically adjusted to ∼ 2-3 J cm⁻², and we used a laser repetition rate of 5 Hz, allowing the potential for depth resolution if required (see below). Analyses were conducted with either a 154 µm or 228 µm spot; we aimed to achieve maximum 207Pb counts without taking 238U into attenuated mode (the first attenuation mode trip on our instrument was set to 3 million counts per second (cps) for this study). A brief pre-ablation using a larger spot size was conducted prior to every analysis. Baseline measurement for 30 s was followed by 40 s acquisition during each spot ablation. The masses measured and dwell times used are documented in Table 1, together with other instrumental parameters. Laser gas flows and instrument settings were optimized primarily for highest sensitivity: for a 40 µm spot under these conditions on a NIST glass, we see around 25 000-35 000 cps ppm⁻¹. Although oxide levels are generally low (248ThO/232Th < 0.3 %), and Th/U ratios close to unity, we have observed no relationship between variation in these parameters and data quality, and do not tune to optimize these values (as is commonplace, for example, in trace element determination). Table 2 lists the samples used in these experiments and the publications in which original ID data for these materials (reproduced in Table 3) can be found.
Figure 1. Kernel density plot of U and Pb concentration data obtained by ID methods in our laboratory over the past decade, representing over 2000 sample aliquots. All samples were leached briefly in dilute HCl prior to dissolution to remove any blank Pb that may have been introduced during sample handling (e.g. Woodhead et al., 2012). The small hotspot to the right of the main array is dominated by the Corchia site in Italy, from which we have analysed many samples. The vast majority of other speleothems, however, have 0.1-1 ppm U and ∼ 0.001 ppm Pb.

Analytical strategies
Natural speleothems display a remarkably wide range of U and Pb concentrations - Fig. 1 shows data generated for over 2000 speleothem calcite aliquots analysed at the University of Melbourne by ID methods over a 10-year period following rigorous sample cleaning protocols to remove Pb contaminants derived from initial processing (e.g. Woodhead et al., 2012). The majority of samples contain ∼ 0.1-10 ppm U and generally very low Pb concentrations, typically ∼ 1-100 ppb. These traits provide very challenging conditions for LA analysis.
The primary concern for any samples of this type, and particularly when measuring by LA, is the obvious potential for contamination with environmental ("blank") Pb during analysis. Although LA rastering and extraction of age information from the resulting isotopic images have shown great promise for dating limestones with relatively high Pb abundances (Drost et al., 2018), the same approach cannot be easily implemented in speleothems, where Pb contents are often in the low parts per billion range and thus where each spot analysis may be measuring total Pb amounts in the low femtogram range. For this reason, in this study, we have used spot analyses and performed a cleaning pre-ablation with a larger spot size before each analysis. In addition, we discard the first few seconds (and often more; see below) of each analysis to avoid any remaining blank Pb contaminants.
A variety of different calibration strategies for in situ carbonate U-Pb analysis are currently in use. The major problem facing analysts is that, to date, no suitable, homogeneous carbonate reference material has been identified. Most studies, therefore, use the heterogeneous but well-characterized calcite WC-1 (Roberts et al., 2017) and employ a variety of strategies in order to compensate for its heterogeneous nature. For example, Roberts and Walker (2016) use a NIST glass to correct for any bias in 207Pb/206Pb ratios and then take a session mean of values for the WC-1 reference material to correct the 238U/206Pb ratio of unknowns. They do not perform any downhole fractionation corrections but simply use means of each ablation. Conversely, Hansman et al. (2018) use the NIST glass for correction to both the 207Pb/206Pb and 238U/206Pb ratios and then apply "additional offset factors" (essentially multipliers) to account for matrix-induced variation in U/Pb ratios between NIST and calcite and downhole fractionation effects.
Figure 2. Examples of the complex compositional behaviour seen in many carbonates; these are all single ablations of the same sample. In panel (a) a relatively simple structure is observed and, with the exception of a small amount of (surface-contamination) common Pb at the start of the analysis, almost all of the data collected can be used. Panel (b) shows a grain with more complex structure and a zone of intrinsic common Pb (high 206Pb/238U and 208Pb cps) encountered towards the end of the ablation, while in (c) more common Pb is seen in the first half of the ablation. In complex cases such as these the analyst can focus on the most radiogenic parts of the analysis, as indicated, as long as the data are downhole-corrected.
We have confirmed that there are no observable matrix effects on the 207Pb/206Pb ratio when ablating NIST 614 glass relative to calcite and that both return measured / true ratios ≈ 1 within the resolution of our instrument. Therefore, we see no immediate advantage in using a NIST glass in this way and simply use the WC-1 reference material as the primary calibrant. The VizualAge UcomPbine data reduction scheme (DRS; Chew et al., 2014) for Iolite (a popular ICPMS data processing software package; Paton et al., 2011) is designed to allow the use of heterogeneous reference materials (i.e. those with variable amounts of common Pb) such as WC-1 but assumes no 207Pb/206Pb fractionation. The time-resolved reference material data from a single spot analysis exhibit shifts in 238U/206Pb-207Pb/206Pb isochron ("Tera-Wasserburg": Tera and Wasserburg, 1972) space primarily from encountering variable common Pb and/or experiencing downhole Pb/U fractionation. UcomPbine corrects each time slice of background-subtracted data for the reference material based on its known common and radiogenic Pb compositions using a 204Pb-, 207Pb-, or 208Pb-based approach.
This allows the time-resolved radiogenic Pb/U signals to be combined with the ablation depth (or a proxy, such as time since laser on) to correct for downhole Pb/U fractionation as described by Paton et al. (2010). Drift correction is carried out as usual in Iolite by fitting a function (in this case a smoothing spline) to the reference material analyses that bracket unknowns. For this study, the 207Pb-based correction of UcomPbine was employed. Note that the data presented here include only the internal precision for each measurement. The propagated uncertainty of Pb/U ratios can be calculated by UcomPbine using the "pseudo-secondary" approach Iolite uses for its built-in U-Pb geochronology DRS (i.e. Paton et al., 2010), but UcomPbine's 207Pb-based correction precludes calculating excess uncertainty on 207Pb/206Pb in this way. We typically find the propagated Pb/U uncertainty to be about 1.25 times the internal precision and expect that this factor would be smaller for 207Pb/206Pb. To properly assess the 207Pb/206Pb excess uncertainty and to evaluate mass bias effects, which may be more pronounced or resolvable with other instruments, a true secondary reference material with homogeneous 207Pb/206Pb could be employed. As a result of these current limitations, our long-term reproducibility using this methodology is still being evaluated.
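The idea behind a 207Pb-based common-Pb correction can be illustrated with a minimal sketch. This is the generic textbook form of the correction, not the UcomPbine code itself; the function names and all numerical values are hypothetical, and the radiogenic 207Pb/206Pb of the material is assumed to be known (as it would be for a reference material of known age).

```python
def common_206_fraction(r76_meas, r76_common, r76_rad):
    """Fraction of the measured 206Pb that is common Pb (207Pb-based method)."""
    return (r76_meas - r76_rad) / (r76_common - r76_rad)


def radiogenic_206_238(r68_meas, r76_meas, r76_common, r76_rad):
    """Measured 206Pb/238U with the common 206Pb contribution removed."""
    f206 = common_206_fraction(r76_meas, r76_common, r76_rad)
    return (1.0 - f206) * r68_meas


# Illustrative numbers only (not values from the study): three time slices
# of one ablation that encounter different amounts of common Pb.
slices_76 = [0.45, 0.25, 0.10]     # measured 207Pb/206Pb per time slice
slices_68 = [0.060, 0.040, 0.032]  # measured 206Pb/238U per time slice

corrected = [
    radiogenic_206_238(r68, r76, r76_common=0.85, r76_rad=0.05)
    for r68, r76 in zip(slices_68, slices_76)
]
```

In this contrived case the three slices carry very different common-Pb fractions yet reduce to the same radiogenic 206Pb/238U, which is the property that lets a heterogeneous reference material be used to characterize downhole fractionation.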
We prefer to correct for any downhole elemental fractionation effects, which will be exacerbated with smaller spot sizes. The provision of a downhole correction capacity conveys an important advantage for this type of work in so much as it allows the selection of only the most advantageous areas in a single spot ablation for use, knowing that an appropriate correction for downhole effects has been made at each point in the ablation profile: in other words, the downhole correction allows depth resolution within each individual analysis. This can be a significant benefit in avoiding areas dominated by common Pb which are invariably encountered in the analysis of most carbonates (Fig. 2) and thus maximizing data spread in Tera-Wasserburg space. In addition, this methodology allows us to maximize the use of "good" data, e.g. by trimming only those integrations required at the start of an ablation to remove surface contamination rather than employing a blanket crop of several seconds for every spot analysis. Although the downhole correction profile obtained by Iolite from multiple analyses of a heterogeneous reference material is not as robust as that which might be obtained from a homogeneous reference material, in practice the large numbers of reference analyses included in any given run (at least 16 for the propagation of excess uncertainty) usually produce well-characterized downhole U/Pb profiles when using the UcomPbine DRS.

The accuracy of the method
In order to assess the accuracy of the method for relatively low-concentration samples we have analysed a number of speleothems for which we have already produced and published ages by solution ID methods (Tables 2 and 3); these display a wide range of radiogenic : common-Pb ratios but, in all cases, have U in the low parts per million range and Pb in the low parts per billion range, typical of many speleothems. In all of these cases the ID data were obtained using a 233U-205Pb isotopic tracer calibrated against EarthTime (http://www.earthtimetestsite.com, last access: 3 December 2019) reference solutions, and accuracy was constantly monitored by reference to EarthTime synthetic zircon solutions run concurrently. The new LA data, together with the pre-existing ID data for the same speleothem sample, are presented in the familiar 238U/206Pb-207Pb/206Pb isochron Tera-Wasserburg construction (see Fig. 3).
Figure 3. Larger aqua symbols with low error correlations are LA data; smaller red symbols with high levels of error correlation are the isotope dilution data for the same samples. In many cases these are almost invisible at this scale and their locations are, therefore, highlighted with red arrows. Red and black dotted lines represent best-fit isochrons for the LA and ID datasets respectively, derived using Isoplot (Ludwig, 2001).
It is immediately clear that the LA- and ID-generated data are of quite different character: laser data have inherently large uncertainties resulting from the minute quantities of material being analysed and also show little, if any, error correlation. Mean square of weighted deviates (MSWD) values are in the range of 2-4. In contrast, ID data have far smaller individual uncertainties and usually show a high degree of error correlation, which is common to many unradiogenic samples plotted in such diagrams. Because of the larger uncertainties shown by the LA data, many more analyses are required in order to constrain an isochron, a feature explored in later discussion.
In all cases shown in Fig. 3 the LA-derived ages fall within uncertainty of the ID-derived data. In addition, for many samples, the LA data show a wider range in U/Pb ratios than the ID data although not consistently more or less radiogenic.
There also appear to be subtle differences in the slope of the isochrons resulting from the two methods: the LA data often (but not always) seem to have a steeper slope and the common-Pb intercepts are not always within uncertainty of each other. We have explored many explanations for this apparent isochron "rotation", including potential inaccuracies in the assumed common-Pb and radiogenic-Pb endmembers for WC-1 and inaccuracies in the calculated dead time used on our instrument (both evaluated by adjustment in postprocessing of the data) but find no consistent theme in these studies. As a result of these investigations, and the fact that the effect is variable in occurrence, we currently believe that this effect is most likely attributable to a minute (femtogram) blank contribution from the surface of some samples which is not captured in the baseline (gas blank) measurement and thus cannot be adequately corrected for nor indeed readily measured. Additional tests are underway but, until the exact source of this issue is determined, the 207 Pb/ 206 Pb intercepts of our LA-derived isochrons must be regarded as inaccurate at the percent level.
It is also clear from these experiments that the LA-generated isochrons rarely attain the high precision of ID datasets, most likely because many orders of magnitude less material is being analysed, although in some cases they certainly approach those values. As such, ID-generated data can still be considered the benchmark for high-precision speleothem applications when abundant suitable sample material is available.

Number of analyses required and efficiency considerations
Given the relatively large uncertainties associated with each LA analysis, an important consideration in establishing an analytical protocol is the number of analyses required to form a robust age determination. In order to assess the effect of sample size on isochron quality, we collected large numbers of spot analyses (up to 70) for several samples and then randomly subsampled this dataset, determining the age at each step. All samples show very similar behaviour, and so we use, as an example, results for sample P-1 in Fig. 4. A somewhat unexpected observation from this analysis is that relatively high-accuracy and high-precision ages can be obtained with as few as 30 spot analyses and that any subsequently acquired data often do little to improve the analytical precision. There is, however, considerable scope for generating erroneous ages with analysis counts lower than ∼ 30.
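The subsampling experiment can be mimicked on synthetic data with the following sketch. It is not the procedure used for sample P-1: the spot data are simulated, the regression is an unweighted least-squares fit rather than the error-weighted fit of Isoplot, noise is added to 207Pb/206Pb only, and the age, common-Pb ratio, and noise level are all assumed values chosen for illustration.

```python
import math
import random
import statistics

LAMBDA_238 = 1.55125e-10  # 238U decay constant (1/yr)
LAMBDA_235 = 9.8485e-10   # 235U decay constant (1/yr)
U238_U235 = 137.818       # present-day 238U/235U


def concordia_tw(t_yr):
    """Tera-Wasserburg concordia point (238U/206Pb, 207Pb/206Pb) at age t_yr."""
    e8 = math.expm1(LAMBDA_238 * t_yr)
    e5 = math.expm1(LAMBDA_235 * t_yr)
    return 1.0 / e8, e5 / (U238_U235 * e8)


def fit_line(xs, ys):
    """Unweighted least-squares line y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def intercept_age(a, b, t_lo=1.0e3, t_hi=4.5e9):
    """Age (yr) where the line y = a + b*x crosses the concordia (bisection)."""
    def g(t):
        x, y = concordia_tw(t)
        return y - (a + b * x)

    g_lo = g(t_lo)
    if g_lo * g(t_hi) > 0:
        raise ValueError("no concordia intercept in the search range")
    for _ in range(200):
        t_mid = 0.5 * (t_lo + t_hi)
        g_mid = g(t_mid)
        if g_lo * g_mid <= 0:
            t_hi = t_mid
        else:
            t_lo, g_lo = t_mid, g_mid
    return 0.5 * (t_lo + t_hi)


def synthetic_spots(n, age_yr, common_76, sigma_y, rng):
    """Hypothetical spot data: mixtures of common and radiogenic Pb plus noise."""
    x_c, y_c = concordia_tw(age_yr)
    xs, ys = [], []
    for _ in range(n):
        f = rng.uniform(0.05, 0.95)  # fraction of 206Pb that is common
        xs.append((1.0 - f) * x_c)
        ys.append(f * common_76 + (1.0 - f) * y_c + rng.gauss(0.0, sigma_y))
    return xs, ys


def subsample_spread(xs, ys, sizes, n_draws, rng):
    """Standard deviation (yr) of intercept ages over random subsamples."""
    spread = {}
    for n in sizes:
        ages = []
        for _ in range(n_draws):
            pick = rng.sample(range(len(xs)), n)
            a, b = fit_line([xs[i] for i in pick], [ys[i] for i in pick])
            try:
                ages.append(intercept_age(a, b))
            except ValueError:
                continue  # this subsample did not yield a usable isochron
        spread[n] = statistics.pstdev(ages) if len(ages) > 1 else float("nan")
    return spread


if __name__ == "__main__":
    rng = random.Random(0)
    xs, ys = synthetic_spots(70, 5.0e6, 0.85, 0.01, rng)
    for n, sd in subsample_spread(xs, ys, [5, 10, 20, 30, 50], 30, rng).items():
        print(f"n = {n:2d}: spread of intercept ages = {sd / 1e6:.3f} Myr")
```

Thirty random draws per sample size mirrors the number of selections described for Fig. 4; the absolute spreads depend entirely on the assumed noise level and should not be compared with the measured data.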
Figure 4. Sample size vs. isochron quality. Percent difference of LA intercept age from ID age versus sample size when randomly subsampling the P-1 dataset. Thirty different random selections of these data were made for each sample size. The resulting intercept ages for each of the selections are represented individually as dots and collectively as a vertical kernel density estimate ("violin plot"). The ID age with uncertainty is represented by the blue bar centred on ordinate 0. The black and grey dashed lines are the median and extreme uncertainties calculated for each sample size.
Figure 5. Limitations of the method. Plots of U and Pb contents vs. age for our own samples and those literature studies from which concentration data could be extracted, with panel (b) representing a zoom view of the data shown in panel (a). Data points represent the average of concentrations reported for each sample: for our own analyses these are ID analyses but, in the case of literature samples, these are the concentration data reported from LA studies. Samples are colour-coded: those in purple represent successful literature age determinations by LA methodologies, whereas those in green are our own successful LA analyses. Samples shown in red, however, are those for which we have previously successfully determined ID ages but have not been able to produce isochrons using LA. These samples we consider to be beyond the current limits of the LA technique due to a combination of low U and Pb contents and age. A grey plane is drawn to separate successful from unsuccessful analyses. Data sources: this study, Li et al. (2014), Coogan et al. (2016), Ring and Gerdes (2016), Walker (2016), and Hansman et al. (2018).
These data feed into an assessment of the potential time savings available from LA analysis compared to ID studies. If we assume 30 spot analyses "per age", this amounts to around 40 min run time per sample using our analytical protocol.
Once sample preparation time (slabbing and polishing or resin mounting) and other overheads (e.g. digitizing, cell stabilization after sample changes) are taken into account, however, it may be hard to generate more than 10-20 age determinations per week without significant automation such as overnight running. In comparison, we tend to process aliquots for ID studies in batches of 30 which routinely provides approximately five age determinations per week (1 d sample preparation and cleaning, 1 d spiking, 1 d chemistry, and 2 d mass spectrometry), with clear scope for further expansion. There is certainly, therefore, significant time saving when employing LA analysis but not necessarily a dramatic (order of magnitude) one. As such LA methods offer great potential as a rapid reconnaissance tool, largely superseding alternative methods advocated by Woodhead et al. (2012).
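The throughput figures quoted above can be checked with simple arithmetic. In the sketch below the 30 s baseline and 40 s acquisition come from the analytical protocol, whereas the 10 s per-spot overhead (pre-ablation and stage travel) is purely an assumption, included only to show how a run time of around 40 min could arise.

```python
def la_run_minutes(n_spots, baseline_s=30.0, ablation_s=40.0, overhead_s=0.0):
    """Instrument time (min) for one age determination of n_spots analyses."""
    return n_spots * (baseline_s + ablation_s + overhead_s) / 60.0


def weekly_speedup(la_ages_per_week, id_ages_per_week=5.0):
    """Ratio of LA to ID age determinations per week."""
    return la_ages_per_week / id_ages_per_week


measurement_only = la_run_minutes(30)                # baseline + ablation only
with_overhead = la_run_minutes(30, overhead_s=10.0)  # assumed 10 s overhead
speedup_range = (weekly_speedup(10.0), weekly_speedup(20.0))
```

Under these assumptions the measurement time alone is 35 min, the assumed overhead brings it to 40 min, and the weekly throughput advantage of LA over ID is a factor of 2 to 4, consistent with a significant but not order-of-magnitude saving.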

Limitations of the method
In addition to the small number of isochrons shown here for samples with published ID-derived ages, we have throughout the course of this study attempted to reproduce our ID-derived ages for a number of other (currently unpublished) samples of variable age and U and Pb contents. The results of these experiments, combined with comparable data gathered from literature studies of non-spelean carbonates, are plotted in Fig. 5. In this plot, samples are colour-coded: those in purple represent successful literature age determinations by LA methodologies, whereas those in green are our own successful LA analyses. The diagram illustrates the particular challenges of analysing speleothem materials compared with many other carbonates: sub-parts per million levels of U and Pb and generally relatively young ages. Samples shown in red are those for which we have previously successfully determined ID ages but have not been able to produce isochrons using LA. We consider these samples to be beyond the current limits of the LA technique due to a combination of low U and Pb contents and age.
Figure 6. An example of high spatial-resolution geochronology. High spatial-resolution analysis of a straw speleothem from the Nullarbor Plain of SW Australia. Panel (a) shows an active straw stalactite in situ, while panel (b) is an SEM cross-sectional view of the sample studied showing typical dimensions and wall thickness, together with representative laser ablation pits. Panel (c) shows the Tera-Wasserburg isochron, which is within the range of other speleothems analysed from this site via ID methods.
We recognize that the results of such an entirely empirical approach are likely to show some variation between instrumentation and may ultimately change as equipment becomes more sensitive; for the moment, however, this diagram provides a first impression as to the potential limitations of the method as judged by current literature data. A relatively simple plane can be drawn to separate "successful" from "unsuccessful" experiments: the equation of this plane in x (U, ppm), y (Pb, ppm), z (age, Ma) space is 2.25x + 10.5y + 0.42z − 3.15 = 0. If the analyst has an independent assessment of likely U and Pb contents (e.g. from reconnaissance quadrupole ICPMS analyses) and an approximate idea of age, these values can be inserted into the left-hand side of the equation above. Strongly positive values would suggest a high likelihood of dating success with appropriate equipment, whereas negative values would suggest parameters likely to be beyond the current reach of the methodology. In broader terms any samples with > 1 ppm U and a few hundred parts per billion of Pb should be datable regardless of age. The actual range of U/Pb ratios present in a sample also plays a role in the generation of isochrons, but almost all carbonate materials analysed to date show some variation in this ratio at the micron scale, and thus the potential for isochron construction if isotopic measurement of their U and Pb is analytically feasible.
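As a convenience, the empirical plane can be evaluated with a few lines of code. The coefficients are those given in the text; the function name and the two example samples are hypothetical and serve only to show the sign convention.

```python
def datability_score(u_ppm, pb_ppm, age_ma):
    """Left-hand side of the empirical plane 2.25x + 10.5y + 0.42z - 3.15.

    Positive values fall on the 'successful' side of the plane and
    negative values on the 'unsuccessful' side.
    """
    return 2.25 * u_ppm + 10.5 * pb_ppm + 0.42 * age_ma - 3.15


# Hypothetical samples, for illustration only.
rich_sample = datability_score(u_ppm=1.0, pb_ppm=0.3, age_ma=0.0)
lean_sample = datability_score(u_ppm=0.1, pb_ppm=0.001, age_ma=1.0)
```

The first example (1 ppm U, 0.3 ppm Pb) scores positive even at zero age, in line with the statement that such samples should be datable regardless of age, whereas the second (0.1 ppm U, 1 ppb Pb, 1 Ma) scores negative. The score is only as reliable as the empirical compilation behind the plane.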

Advantages of the method
It is clear from the previous discussion that the LA methodology has some limitations in terms of working with small amounts of U or Pb or relatively young samples. In comparison, ID methods can produce useful data from samples with lower U and Pb concentrations simply because of the much larger sample sizes employed (typically 50 mg for a bulk ID analysis compared with 0.005 mg for an LA analysis). As a result, higher-precision data can be obtained and, consequently, younger samples can be dated: samples in the range of a few hundred thousand years are possible by ID (e.g. Richards et al., 1998), providing substantial overlap with the U-Th chronometer in optimal circumstances.
The trade-off, however, is one of spatial resolution: LA methods offer a resolution of a few hundred microns compared to several millimetres (at best) using samples drilled out for ID analysis. Note in this context that it is difficult to produce sample powders for U-Pb analysis without contamination by environmental Pb, and so traditional micro-milling methodologies are not applicable to low-level Pb samples. For this reason, in speleothem studies, complete fragments of crystalline calcite have to be removed by drilling (Woodhead et al., 2012). The power of these high spatial-resolution LA approaches has already been demonstrated by previous studies, for example in the analysis of single calcite fibres in vein structures (Goodfellow et al., 2017). Here we use an example from speleothem studies.
Straw stalactites (hollow calcite cylinders that precipitate from cave drip points) are thin-walled structures which are plentiful in many caves and often form the nucleation point for the eventual growth of stalactites. Because of their fast-growing habit, they have been explored as short-term climate records (e.g. Paul et al., 2013), but they are also rather fragile and frequently broken off. Because of this they readily accumulate in cave sediments and have been shown to be useful in the context of dating relatively young archaeological sequences at cave sites via the U-Th method (St Pierre et al., 2009). Cave straws are in fact often remarkably well preserved and so could be used to date many cave sedimentary sequences of any age, but no previous attempts have been made to obtain U-Pb ages for these due to their minute size.
In Fig. 6 we show a straw speleothem from the roof of a Nullarbor cave which has been shown to contain speleothems in the 3.1-5.6 Ma range. This sample has a total diameter of only ∼ 5 mm and a wall thickness of between 0.25 and 1.2 mm (Fig. 6b). In addition, the outer regions of straws (those areas exposed to the environment) can contain high levels of detrital components derived from dust. Dating of such structures would be impossible by conventional ID methods, but we are able to use the laser to obtain a well-constrained isochron age of 4.91 ± 0.33 Ma (uncorrected for initial disequilibrium). It is impossible in this case to independently confirm the validity of this age, but it is well within the range of nearby cave formations previously dated by ID.
In addition to the ability to date extremely small and/or fragile materials, the increased spatial resolution afforded by the laser provides a further advantage: a greatly increased ability to avoid areas dominated by common Pb and/or open-system behaviour, which may be unavoidable in ID analyses where ∼ 50 mg calcite samples are the norm. As a result, there are likely to be many situations where laser approaches can produce age information for samples which are intractable to ID approaches.

Conclusions
Laser ablation methods are capable of generating accurate U-Pb ages for speleothems (and by inference other carbonates) with moderate U contents (> 1 ppm), regardless of age. At lower U contents, however, the possibility of successful outcomes is also strongly dependent on Pb content and age. Together with our own studies, a compilation of successful literature analyses provides a first-order test for potential dating suitability. The absolute precision obtained by LA methods can approach but rarely surpasses that of the benchmark ID method. The latter, however, requires many orders of magnitude more sample to achieve. LA conveys an advantage in speed and is thus useful as a reconnaissance tool. The overwhelming advantage of LA methods, however, remains one of high spatial resolution, allowing the dating of materials which are beyond the reach of ID methods simply because of their size.
Data availability. The raw Attom data and RESOlution laser log files for the LA analyses and the Iolite-3 experiments showing how they were processed can be obtained from the authors.
Author contributions. JW collected the analytical data. JW and JP both performed the data analysis and wrote the paper.
Competing interests. The authors declare that they have no conflict of interest.

Special issue statement. This article is part of the special issue "In situ carbonate U-Pb geochronology". It is a result of the Goldschmidt conference, Barcelona, Spain, 18-23 August 2019.