Resolving the effects of 2D versus 3D grain measurements on (U-Th)/He age data and reproducibility

(U-Th)/He thermochronometry relies on accurate and precise quantification of individual grain volume and surface area, which are used to calculate mass, alpha ejection (FT) correction, isotope concentrations, equivalent sphere radius (ESR), and ultimately age. The vast majority of studies use 2D or 3D microscope dimension measurements and an idealized grain shape to calculate these parameters, and a long-standing question is how much uncertainty these assumptions contribute to observed intra-sample age dispersion and accuracy. Here we compare the results for volume, surface area, grain mass, ESR, effective uranium (eU) and FT correction derived from 2D microscope and 3D X-ray computed tomography (CT) length and width data for >100 apatite grains. We analyzed apatite grains from two samples that exhibited a variety of crystal habits, some with inclusions. We also present 83 new apatite (U-Th)/He ages to assess the influence of 2D versus 3D FT correction on sample age precision. The data illustrate that the 2D approach systematically overestimates grain volumes and surface areas by 20-25%, impacting the estimates for mass, eU, and ESR, all important parameters used for interpreting age scatter and inverse modeling. FT factors calculated from 2D and 3D measurements differ by ~2%. This variation, however, has effectively no impact on intra-sample age reproducibility. We also present a grain mounting procedure for X-ray CT scanning that can allow hundreds of grains to be scanned in a single session, and new software capabilities for 3D FT and FT-based ESR calculations that are robust for relatively low-resolution CT data, which together enable efficient and cost-effective CT-based characterization.


Introduction
(U-Th)/He thermochronometry of accessory phases, such as apatite and zircon, has been widely applied to study tectonic, volcanic, and surface processes. The method is based on the radiogenic accumulation of He from the alpha decay of U, Th, and Sm isotopes and the diffusive loss of He via thermal processes. In addition, He is lost due to 'long alpha stopping distances' associated with the kinetic energy of alpha decay (~5 MeV), requiring a shape-based alpha ejection correction (FT correction) (Farley et al., 1996). This correction as traditionally applied includes several simplifications and assumptions, such as an idealized grain geometry and homogeneous parent nuclide concentrations (Farley et al., 1996, 2002; Ketcham et [...] uses orthogonal 2D grain photos to model accurate 3D grain shapes. Others have employed X-ray computed tomography (CT) to determine accurate grain shapes in an effort to improve precision and accuracy in FT and (U-Th)/He age determinations. Herman and others (2007) used 3D CT grain dimensions to calculate FT factors and presented a production-diffusion model to extract thermal histories for detrital apatite grains. Evans and others (2008) tested the efficacy of 2D microscope measurements against 3D CT data of zircon and apatite grain shape and size and documented a 1-24% discrepancy in derived FT values between microscope measurements and the CT data for even simple shapes (e.g., sphere).
This new study investigates the effect of 2D versus 3D grain geometry measurement techniques on grain dimension, volume, surface area, ESR, mass, FT and the corrected age, as well as effective uranium (eU) concentrations. In contrast to previous studies, we characterized >100 apatite grains from two granitic samples for a more statistically robust comparison and in an effort to more systematically capture variations in apatite morphologies and sizes and to screen for inclusions. The apatite grains were picked and measured by a single analyst using 2D optical techniques and then CT-scanned. Building on previous work, we present a method for relatively rapid scans of >100 grains at 4-5 µm resolution, enabling affordable and efficient 3D screening. We introduce the capabilities of an updated version of Blob3D (Ketcham, 2005; freely distributed software) that allows efficient batch processing of CT-scanned grains and outputs parameters such as grain volume and 3D FT. We further develop an approach for calculating ESR on the basis of equivalent FT rather than an equivalent surface-to-volume ratio as a more direct and accurate means of approximating the diffusional domain as a sphere. Finally, in contrast to previous studies, we use the results of >80 apatite (U-Th)/He ages to evaluate the reliability of the 2D measurements as well as the impact on the (U-Th)/He age and uncertainty.

Geologic background of the samples
For this study, we selected two plutonic samples from the Cretaceous Cordilleran magmatic arc in the western USA that yielded abundant, high-quality apatite and have been part of previous thermochronometric studies. Sample 97BS-CR8 is from a granodiorite in the Carson Range in the eastern Sierra Nevada along the Nevada-California border. The sample yielded an apatite fission track age of 68 ± 2 Ma (Surpless et al., 2002). The second sample, 95BS-11.3, is from a quartz monzonite exposed in the Wassuk Range in western Nevada, exhumed during Basin and Range normal faulting. The sample has a reported apatite fission track age of 16.3 ± 1.4 Ma and apatite (U-Th)/He age of 9.9 ± 1.9 Ma (Stockli et al., 2002). These samples were chosen for their abundant apatite and relatively simple cooling histories. Their geologic histories are relevant to the present study in that the apatite grains derive from plutonic rocks and did not experience complex metamorphic or magmatic histories, nor natural abrasion during sedimentary transport. Furthermore, both samples derive from plutons that experienced rapid post-magmatic cooling or fault-related exhumation and are expected to have spent little time in the apatite He partial retention zone, and therefore should be less affected by slow cooling, which would amplify kinetic effects on age dispersion.

Grain selection and 2D measurements
From the two samples (97BS-CR8 and BS95-11.3), 62 and 50 apatite grains were picked dry using a Nikon SMZ-U/100 optical microscope at a total magnification of 180x. Apatite grains were selected to include the range of grain morphologies present in the sample (e.g., broken, flat, and prismatic ends). Several grains with visible inclusions were also intentionally selected. All apatite grains were photographed using a Nikon digital ColorView camera connected to the microscope. The short and long axes were measured manually using AnalySIS® imaging software (Figures 1 and 3). For sample BS95-11.3, grains were imaged and measured on double-sided sticky tape (in preparation for the CT mount) (Figure 1). However, we determined that this can cause grains to sit in upright orientations, which is fine for CT scanning but not for 2D measurements. To remedy this issue, for sample 97BS-CR8 each apatite grain was placed on a glass slide for 2D measurements and then transferred to the sticky tape for the CT mount (Figure 1).

Grain mounting procedure for CT
Once the grains were measured optically in 2D, they were mounted for CT scanning by orienting several tens of grains on a plastic disc and stacking multiple discs (Figure 2). The procedure to create a single-layer mount for multi-grain scanning entails covering the flat top of a pushpin with double-sided sticky tape that can be pre-cut using a standard hole punch. Apatite grains are then picked directly onto the tape in a grid-like pattern. The pushpin surface is ~5 mm in diameter, which easily allows for ≥ 50 apatite grains to be mounted in one layer, tightly spaced, without touching. Grains could be packed more densely as long as they can be reliably identified after scanning; they can even be touching, although this leads to a small increase in processing time to separate them using functions in the Blob3D software.
Preprint, https://doi.org/10.5194/gchron-2019-3. Discussion started: 27 May 2019. © Author(s) 2019. CC BY 4.0 License.
To utilize the total scanned volume, at least five multi-grain layers can be stacked for a single scan (up to 5 mm tall). To create stackable layers, sturdy plastic discs are made using a standard hole punch, with one side of each disc covered with double-sided sticky tape and apatite grains mounted following the procedure outlined above. Once all the layers are mounted and all excess tape is trimmed, the discs are stacked on top of the pushpin. The arrangement is secured by a thin wrap of parafilm. The parafilm and sticky tape are critical to ensure the crystals and layers do not move during scanning. This mount can be easily disassembled after scanning to retrieve the grains for further analysis.
[Figure caption fragment: ... stacked to take full advantage of the height of the scan. These layers are held together using parafilm, and a hashmark on the pushpin enables further orientation of the scan in order to retrieve the grains afterwards for further analysis.]

X-ray CT Scanning
The multi-grain mounts were scanned with a Zeiss Xradia MicroXCT scanner at the University of Texas High-Resolution X-ray CT Facility. Optimal scanning parameters will vary with the instrument being used, with top priorities being to minimize scanning artifacts and noise, while also minimizing time and cost. Lower X-ray energies are more sensitive to compositional variations, but more prone to beam-hardening artifacts. We experimented with various settings in this study. The grain mount for sample 97BS-CR8 was scanned with X-rays set at 100 kV and 10 W, with a 1.0 mm SiO2 filter. A total of 1153 views were gathered at 1.5 s per view, for an acquisition time of 28.9 minutes. Source-mount distance was 37.7 mm, and mount-detector distance was 12.8 mm. The 2048x2048 camera data were binned by 2, and the lower-energy X- [...]
The grain mount for sample BS95-11.3 was scanned with X-rays set at 150 kV and 10 W with a 1.6 mm CaF2 beam filter, acquiring 571 views at 1.5 s per view, for an acquisition time of 14.3 minutes, not including calibration. Source-mount distance was 37.7 mm, and mount-detector distance was 17.8 mm. The camera data were binned by 2, and no beam hardening correction was applied during reconstruction. The resulting data had a voxel size of 4.58 µm.
[...] some grains. These subtle artifacts have a negligible effect on measurements, but may be expected to increase in severity with more or higher-density grains. In both cases the 3D shapes are recovered well.
Grain size and shape, FT, and mass calculations

2D measurement calculations
The microscope length and width measurements are used to calculate volume and surface area, which are then used to calculate mass, ESR, and U FT and Th FT for each apatite grain, following methods laid out in Farley et al. (1996), Farley and Stockli (2002), and Farley (2002) (Figure 4). An equidimensional hexagonal prism geometry was assumed, with the length (L) measurement used for the height of the prism and half of the width (W) used for the radius of the prism. All equations used for calculating these parameters are included below or in the Appendix.
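The 2D calculation chain described above can be sketched in a few lines. This is a minimal illustration only, not the equations of the Appendix: it assumes a flat-ended, equal-sided hexagonal prism whose circumradius is half the measured width, an SV-based ESR of 3V/SA, and a nominal apatite density of 3.2 g/cm3 (an assumed value; the function name `hex_prism_2d` and the example dimensions are hypothetical).

```python
import math

APATITE_DENSITY_G_CM3 = 3.2  # assumed nominal density, not taken from the text


def hex_prism_2d(length_um, width_um, density=APATITE_DENSITY_G_CM3):
    """Volume, surface area, SV-based ESR and mass of a flat-ended,
    equal-sided hexagonal prism from 2D length and width (micrometres)."""
    r = width_um / 2.0                        # circumradius = half the measured width
    hex_area = 1.5 * math.sqrt(3.0) * r ** 2  # area of one hexagonal end face
    volume = hex_area * length_um             # um^3
    surface = 2.0 * hex_area + 6.0 * r * length_um  # two ends + six side faces, um^2
    esr = 3.0 * volume / surface              # sphere with the same SA/V ratio, um
    mass_ug = volume * 1e-12 * density * 1e6  # um^3 -> cm^3 -> g -> ug
    return volume, surface, esr, mass_ug


# Hypothetical grain: 200 um long, 100 um wide
volume, surface, esr, mass_ug = hex_prism_2d(200.0, 100.0)
print(f"V = {volume:.0f} um^3, SA = {surface:.0f} um^2, "
      f"ESR = {esr:.1f} um, mass = {mass_ug:.2f} ug")
```

Because volume scales with the square of the width, a small bias in the measured width propagates strongly into the volume and mass obtained this way.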

3D calculations
Our principal 3D calculations were implemented in Blob3D (Ketcham, 2005), a program written in the IDL programming environment for efficient measurement of the dimensions, shape, and orientation of discrete features in volumetric data sets. The typical Blob3D method for calculating volume is to segment the grains based on a threshold set at 50% of the CT number (grayscale) difference between apatite and the surrounding air. If grains are touching, or close enough to touching that their selected regions are connected, the software provides several separation methods, the simplest being an erode/dilate procedure. Volume is calculated as the number of voxels in a grain multiplied by the voxel volume, and surface area is calculated by summing the areas of the triangular facets of an isosurface surrounding the grain, which is smoothed to reduce excess roughness from the cubic voxel edges. The shape parameters BoxA, BoxB, and BoxC are respectively the length (L), width (W), and height, corresponding to the dimensions of the smallest rectangular box that will enclose the grain (Ketcham and Mote, 2019). BoxC is calculated as the shortest 3D caliper length, BoxB is the shortest caliper length orthogonal to BoxC, and BoxA is the caliper length perpendicular to BoxC and BoxB (Figure 4).
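The threshold-and-count volume measurement can be illustrated with a small synthetic example. The sketch below is not Blob3D code: it builds a hypothetical 20x20x20 voxel volume containing a sphere, thresholds at the midpoint between assumed air and apatite CT numbers, and multiplies the voxel count by the voxel volume. The CT numbers, grid size, and function names are all illustrative assumptions; only the 4.58 µm voxel size comes from the text.

```python
import math


def make_sphere(n=20, radius=6.0, air=100.0, apatite=900.0):
    """Synthetic CT volume (nested lists) with a sphere centred in the grid."""
    c = (n - 1) / 2.0
    return [[[apatite if (x - c) ** 2 + (y - c) ** 2 + (z - c) ** 2 <= radius ** 2
              else air
              for x in range(n)] for y in range(n)] for z in range(n)]


def segment_volume(ct, air_value, apatite_value, voxel_size_um):
    """Count voxels at or above the 50% threshold and convert to um^3."""
    threshold = air_value + 0.5 * (apatite_value - air_value)
    n_voxels = sum(1 for plane in ct for row in plane for v in row
                   if v >= threshold)
    return n_voxels * voxel_size_um ** 3


VOXEL_UM = 4.58  # voxel size reported for one of the scans
ct = make_sphere()
volume_um3 = segment_volume(ct, 100.0, 900.0, VOXEL_UM)
ideal_um3 = 4.0 / 3.0 * math.pi * (6.0 * VOXEL_UM) ** 3
print(f"voxel-count volume / ideal sphere volume = {volume_um3 / ideal_um3:.3f}")
```

Surface area is harder to treat this way, since it requires an isosurface and smoothing step that this sketch does not attempt.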
A Monte Carlo method was implemented to measure FT, probably similar in many, but not all, respects to previous work (Herman et al., 2007; Glotzbach et al., 2019). Stopping distances for 238U, 235U, 232Th, and 147Sm for the set of minerals reported in Ketcham et al. (2011) are included in the software. Taking the set of selected voxels for a grain, the origin point for each alpha particle is selected by first randomizing from which voxel to start, and then an (x,y,z) location within that voxel. The direction for each particle is obtained by sequentially stepping through a list of near-uniformly distributed orientations calculated by starting with an octahedron and subdividing each triangular face four times until there are 1026 vertices, which are then scaled to lie on a unit sphere (Ketcham and Ryan, 2004). This approach provides slightly better precision than randomizing orientations, and 200,000 Monte Carlo samples are sufficient to get precision to within 0.1% in all tests reported below. Separate FT factors for each decay chain (FT,238, FT,235, FT,232, FT,147) are calculated, and a revised method for calculating mean FT that more precisely accounts for 235U is provided in EQ. 6 (explanation in the Appendix).
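The general idea of a Monte Carlo FT estimate can be sketched for the simplest case, an ideal sphere, where the result can be checked against the analytical sphere formula of Farley et al. (1996), FT = 1 - 3S/(4R) + S^3/(16R^3). This sketch differs from the implementation described above in two labeled ways: origins are drawn from a continuous sphere rather than from segmented voxels, and directions are fully randomized rather than stepped through a subdivided-octahedron list. The 20 µm stopping distance is an assumed illustrative value, not one of the tabulated values in the software.

```python
import math
import random


def sphere_ft_analytic(radius, stop):
    """Retained alpha fraction for a homogeneous sphere (valid for stop <= 2*radius)."""
    x = stop / radius
    return 1.0 - 0.75 * x + x ** 3 / 16.0


def sphere_ft_monte_carlo(radius, stop, n=200_000, seed=1):
    """Fraction of alpha particles that stop inside a sphere of given radius."""
    rng = random.Random(seed)
    r2 = radius * radius
    retained = 0
    for _ in range(n):
        # Origin: rejection-sample a uniformly distributed point inside the sphere
        while True:
            x = rng.uniform(-radius, radius)
            y = rng.uniform(-radius, radius)
            z = rng.uniform(-radius, radius)
            if x * x + y * y + z * z <= r2:
                break
        # Direction: uniformly distributed on the unit sphere
        cos_t = rng.uniform(-1.0, 1.0)
        sin_t = math.sqrt(1.0 - cos_t * cos_t)
        phi = rng.uniform(0.0, 2.0 * math.pi)
        ex = x + stop * sin_t * math.cos(phi)
        ey = y + stop * sin_t * math.sin(phi)
        ez = z + stop * cos_t
        if ex * ex + ey * ey + ez * ez <= r2:
            retained += 1
    return retained / n


RADIUS_UM = 63.0  # one of the test radii quoted in the text
STOP_UM = 20.0    # assumed illustrative stopping distance
ft_mc = sphere_ft_monte_carlo(RADIUS_UM, STOP_UM)
ft_exact = sphere_ft_analytic(RADIUS_UM, STOP_UM)
print(f"Monte Carlo FT = {ft_mc:.4f}, analytical FT = {ft_exact:.4f}")
```

For a real grain, the inside/outside test on the end point would be replaced by a lookup in the segmented voxel set.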
If the resolution of the scan is low with respect to the stopping distance (stopping distance/voxel size > 4), excess surface roughness effects from voxelation are reduced by super-sampling. The voxels for each grain, and surrounding voxels, are subdivided into 27 (3^3) elements, and the super-sampled volume is smoothed with a 5-voxel-wide cubic kernel. The result is then thresholded using a value that maintains the original volume as closely as possible.
These methods were tested on ideal spheres and cylinders with radii of 63 and 31.5 µm, the cylinders having an aspect ratio of 4. At voxel sizes up to 8 and 4 µm for the respective radii, mean FT,238 values averaged within 0.2% of the ideal-shape values for spheres; further doubling the voxel sizes raised the mean error to 0.5%. Cylinders performed better, with a mean error of 0.3% when voxel sizes were ¼ of the radius. In their Monte Carlo FT implementation, Herman et al. (2007) report poor precision for small spheres when the sphere centers are not centered in a voxel, with errors rising to several percent for a 40-µm radius sphere with 6.3-µm voxels across a range of center locations (calculated FT range ~0.58-0.67). Errors of this magnitude correspond to the effect of getting the radius wrong by plus or minus almost an entire voxel (~15% of the radius), which is too large to be reasonable and was probably caused by a problem with their test. We tested our segmentation method by running 100,000 trials randomizing the location of the sphere center using the same radius and voxel size and got maximum radius errors of +0.8/-1.1% and a standard deviation of 0.2%. We are thus confident that our implementation provides a high degree of accuracy and precision on even very small grains at low resolutions where voxel sizes are up to 25% of the radius.
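The statement that an FT range of ~0.58-0.67 corresponds to a radius error of almost one voxel can be checked with the analytical sphere formula of Farley et al. (1996). The sketch below evaluates FT for a 40 µm sphere and for radii one full 6.3 µm voxel smaller and larger; the 20 µm stopping distance is an assumed illustrative value, so the exact figures are indicative only.

```python
def sphere_ft(radius_um, stop_um):
    """Analytical retained fraction for a homogeneous sphere."""
    x = stop_um / radius_um
    return 1.0 - 0.75 * x + x ** 3 / 16.0


STOP_UM = 20.0    # assumed illustrative stopping distance
RADIUS_UM = 40.0  # sphere radius in the test discussed in the text
VOXEL_UM = 6.3    # voxel size in the test discussed in the text

ft_low = sphere_ft(RADIUS_UM - VOXEL_UM, STOP_UM)
ft_mid = sphere_ft(RADIUS_UM, STOP_UM)
ft_high = sphere_ft(RADIUS_UM + VOXEL_UM, STOP_UM)

print(f"voxel/radius = {VOXEL_UM / RADIUS_UM:.1%}")
print(f"FT at R - 1 voxel: {ft_low:.3f}")
print(f"FT at R:           {ft_mid:.3f}")
print(f"FT at R + 1 voxel: {ft_high:.3f}")
```

Under these assumptions a full-voxel radius error spans a slightly wider FT range than 0.58-0.67, which is consistent with the "almost an entire voxel" interpretation.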
We took three approaches to calculating ESR from the 3D data. The first two are based on the equivalent surface-to-volume ratio (SV) approach (Meesters and Dunai, 2002). The model-based value ESR_SVm uses the BoxA and BoxB caliper dimensions as L and W for equations (1) through (3), while the 3D CT-based value ESR_SV3D uses the 3D-measured volume and surface area for equation (3). Because of the unsupported assumptions of the model approach and the shortcomings of surface area measurements, both discussed below, neither of these solutions is ideal. An alternative ESR is based on the equivalent-FT approach; Ketcham et al. (2011) demonstrated that an equivalent-FT sphere provides a more accurate conversion for diffusion calculations than an equivalent-SV one. The set of calculations to determine the FT-equivalent sphere radius ESR_FT is provided in the Appendix.
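The difference between the two ESR definitions can be sketched as follows. The SV-based value is taken as 3V/SA, and the FT-equivalent value is found here by numerically inverting the analytical sphere FT formula of Farley et al. (1996) with bisection. This is an assumed illustration for a single stopping distance, not the Appendix procedure, which may combine decay chains differently; the function names and the 20 µm stopping distance are hypothetical.

```python
import math


def esr_from_sv(volume, surface_area):
    """Radius of the sphere with the same surface-to-volume ratio."""
    return 3.0 * volume / surface_area


def sphere_ft(radius_um, stop_um):
    """Analytical retained fraction for a homogeneous sphere."""
    x = stop_um / radius_um
    return 1.0 - 0.75 * x + x ** 3 / 16.0


def esr_from_ft(ft_target, stop_um, tol=1e-6):
    """Radius of the sphere whose FT equals ft_target (bisection).

    sphere_ft increases monotonically with radius for radius >= stop/2,
    where it equals zero, so a simple bracket is sufficient.
    """
    lo, hi = stop_um / 2.0, stop_um * 1e4
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sphere_ft(mid, stop_um) < ft_target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


STOP_UM = 20.0  # assumed illustrative stopping distance

# Round trip on an ideal sphere of radius 63 um: both definitions recover 63 um
r = 63.0
esr_sv = esr_from_sv(4.0 / 3.0 * math.pi * r ** 3, 4.0 * math.pi * r ** 2)
esr_ft = esr_from_ft(sphere_ft(r, STOP_UM), STOP_UM)
print(f"ESR_SV = {esr_sv:.3f} um, ESR_FT = {esr_ft:.3f} um")
```

For a sphere the two definitions agree by construction; they diverge for real grains, where the measured surface area and the Monte Carlo FT respond differently to shape and roughness.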

(U-Th)/He procedure
The apatite (U-Th)/He analyses were performed in the UTChron Thermochronology Laboratory at the University of Texas at Austin. Individual grains were measured, wrapped into platinum tubes, loaded into a 42-hole sample holder, and pumped to ultra-high vacuum. Each aliquot was heated to ~1070°C for 5 minutes using a Fusions Diode laser system. The released gas was spiked with a 3He tracer and purified by a Janis cryogenic cold trap at 40 K and an SAES NP-10 getter prior to measurement of the 4He/3He ratio on a Balzers Prisma QMS-200 quadrupole mass spectrometer. Final 4He contents were calculated using a manometrically calibrated 4He standard of known concentration measured during the analytical run. All apatite aliquots were reheated once under the same conditions to ensure full gas release.
After degassing, the platinum packets containing the apatite grains were placed into plastic vials and dissolved in 100 µl of a 30% HNO3 235U-230Th-149Sm spike solution for 90 minutes at 90°C. After acid digestion, 500 µl of Milli-Q ultra-pure H2O was added to dilute the solutions to ~5% HNO3, and the solutions were equilibrated for ≥ 24 hours prior to analysis. The solutions were analyzed using a Thermo Element2 HR-ICP-MS equipped with a 50 µl/min micro-concentric nebulizer. Final 238U, 232Th, and 147Sm values were blank corrected and calibrated using a spiked, gravimetrically calibrated ~1 ppb standard solution.
Final (uncorrected) ages were calculated by solving the He age equation by means of Taylor series expansion and are reported with a 6% standard error, based on long-term intra-laboratory analysis of apatite age standards. Corrected final ages are determined by dividing the uncorrected age by the mean FT factor (EQ. 5). U, Th, and Sm concentrations, although not used in the age calculations, were determined for reporting purposes using the grain volumes and a nominal apatite density (e.g., Figure 4, EQ. 4).
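The age calculation can be illustrated with the standard He ingrowth equation, 4He = 8·238U(e^(λ238·t) - 1) + 7·235U(e^(λ235·t) - 1) + 6·232Th(e^(λ232·t) - 1) + 147Sm(e^(λ147·t) - 1). The sketch below solves it with Newton iteration, which is a substitute for (not a reproduction of) the Taylor series solution used in the laboratory, and then applies the FT correction by division. The decay constants are the commonly used values; the nuclide amounts, the 238U/235U ratio of 137.88, and the FT value are hypothetical inputs in arbitrary but consistent molar units.

```python
import math

# Decay constants in 1/yr (commonly used values)
L238, L235, L232, L147 = 1.55125e-10, 9.8485e-10, 4.9475e-11, 6.54e-12


def he_produced(t, u238, u235, th232, sm147):
    """Radiogenic 4He accumulated after t years (same molar units as the parents)."""
    return (8.0 * u238 * math.expm1(L238 * t)
            + 7.0 * u235 * math.expm1(L235 * t)
            + 6.0 * th232 * math.expm1(L232 * t)
            + sm147 * math.expm1(L147 * t))


def he_age(he4, u238, u235, th232, sm147, tol_yr=1e-3, max_iter=50):
    """Uncorrected age in years by Newton iteration on the ingrowth equation."""
    # Linearised production rate gives the starting guess
    p0 = 8.0 * u238 * L238 + 7.0 * u235 * L235 + 6.0 * th232 * L232 + sm147 * L147
    t = he4 / p0
    for _ in range(max_iter):
        f = he_produced(t, u238, u235, th232, sm147) - he4
        df = (8.0 * u238 * L238 * math.exp(L238 * t)
              + 7.0 * u235 * L235 * math.exp(L235 * t)
              + 6.0 * th232 * L232 * math.exp(L232 * t)
              + sm147 * L147 * math.exp(L147 * t))
        step = f / df
        t -= step
        if abs(step) < tol_yr:
            break
    return t


# Hypothetical aliquot: build the He content for a known age, then recover it
u238, th232, sm147 = 1.0, 2.0, 10.0
u235 = u238 / 137.88       # assumed natural 238U/235U ratio
true_age = 43.5e6          # years, illustrative
he4 = he_produced(true_age, u238, u235, th232, sm147)

raw_age = he_age(he4, u238, u235, th232, sm147)
mean_ft = 0.76             # hypothetical mean FT
corrected_age = raw_age / mean_ft
print(f"raw age = {raw_age / 1e6:.2f} Ma, corrected age = {corrected_age / 1e6:.2f} Ma")
```

Because the correction is a simple division, a relative error in the mean FT maps directly onto the same relative error in the corrected age.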

Results
Blob3D provides 3D grain-specific volume, surface area, dimensions, and FT factors for each decay chain. The 2D optical measurements provide dimension information, which is used to calculate volume, surface area, U FT and Th FT based on an assumed grain geometry of an equidimensional hexagonal prism (all results are reported in the Appendix). We assume that the 3D-measured volume and FT values are accurate and serve to benchmark the 2D data (all comparisons reported in Table 1 and Figure 5). Surface area is more problematic to benchmark due to a number of factors, such as fractal roughness, CT data blurring and voxelation effects, as discussed below, and thus 2D and 3D results can only be compared in a relative sense for surface area.
2D and 3D data are compared for each sample and as an entire population in Tables 1 and 2. The average 3D/2D ratio of each parameter is reported with its 1σ standard deviation. This average ratio shows whether the 2D measurements on average overestimate (ratio <1) or underestimate (ratio >1) the 3D measurements. Also reported is the absolute percent difference between the 2D and 3D measurements to illustrate the magnitude of deviation between the measurements.
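The two summary statistics described above can be computed as in the following sketch. The input values are hypothetical, and the denominator used for the absolute percent difference (the 3D benchmark value) is an assumption, since the text does not state which measurement it is normalized to.

```python
import statistics


def compare_3d_to_2d(values_3d, values_2d):
    """Mean and 1-sigma of the 3D/2D ratio, plus mean absolute percent difference."""
    ratios = [v3 / v2 for v3, v2 in zip(values_3d, values_2d)]
    # Assumption: percent difference is expressed relative to the 3D value
    abs_pct = [abs(v2 - v3) / v3 * 100.0 for v3, v2 in zip(values_3d, values_2d)]
    return {
        "mean_ratio": statistics.mean(ratios),
        "sd_ratio": statistics.stdev(ratios),
        "mean_abs_pct_diff": statistics.mean(abs_pct),
    }


# Hypothetical grain volumes (arbitrary units)
vol_3d = [80.0, 90.0, 110.0]
vol_2d = [100.0, 100.0, 100.0]
summary = compare_3d_to_2d(vol_3d, vol_2d)
print(summary)
```

A mean ratio below 1 with a large absolute difference, as in this toy input, is the signature of a systematic 2D overestimate combined with grain-to-grain scatter.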
While comparing 2D and 3D results, it became apparent that one 2D grain measurement was made at an incorrect microscope magnification setting, causing the length and width to be off by 2x, far greater than every other grain measured. Hence, this grain measurement (97BS-CR8-1) was not included when calculating the average differences between 3D and 2D measuring techniques.

Grain factors
Grains from both samples display a range of habits typical for apatite, including two flat ends, two prismatic ends, one flat and one prismatic end, and one or two broken or chipped ends (Figures 1 and 4). The grain morphology and the presence of any visible inclusions were recorded during handpicking (Table 2). Surprisingly, there are no clear systematic relationships between the presence of inclusions and grain age, or between grain shape and ESR, volume, or surface area. The 2D length measurements are on average ~2% smaller than the 3D BoxA dimension. On the other hand, the 2D width dimension is on average ~3% greater than the 3D BoxB dimension (Table 1).
One inevitable source of uncertainty in 2D length and width measurements is analyst judgment and error. For example, if a grain has uneven terminations, it is at the analyst's discretion to measure the longest axis or split the difference, whereas the CT analysis always reflects the longest axis. Similarly, CT scanning is not subject to user error introduced by measuring an apatite grain that is not lying on its widest face, or by measuring at an incorrect magnification. In our dataset, a couple of grains have very large deviations from the CT-derived volume, which may be caused by the microscope magnification setting being slightly off during measuring. Unlike for grain 97BS-CR8-1, we cannot identify this as the cause with full certainty, so we do not exclude these grains.

Volume and Surface Area
Volumes and surface areas calculated using the 2D microscope dimensions both average ~20% larger than the 3D values (3D/2D VOL = 0.82, 3D/2D SA = 0.81) (Table 1, Figure 5). Specifically, 2D volumes and surface areas calculated from length and width data assuming a hexagonal prism shape have an absolute average difference of 23 ± 32% (2σ) and 22 ± 18% (2σ), respectively, from the volumes and surface areas calculated by Blob3D.

ESR and Mass
The 2D ESR is calculated using the surface area to volume ratio (SA/V), which is derived assuming a hexagonal prism with the length and width dimensions measured on the microscope (EQ. 2, Figure 6). For the 3D data, the ESR was calculated from SA/V in two ways. First, the SA/V for ESR_SVm is calculated using the BoxA and BoxB values provided by Blob3D and assuming a hexagonal prism, mimicking the 2D approach. The variation between 2D and 3D ESR_SVm measurements has a 2σ spread of ±12%, but the variability is fairly evenly split between over- and under-estimating the ESR, such that the average 3D/2D ratio is 1.02. Second, the 3D SA/V is calculated using the surface area and volume measurements output by Blob3D (ESR_SV3D). The variation between 2D and 3D ESR_SV3D is even larger at ±18% (2σ), with an average 3D/2D ratio of 1.01 (Table 1, Figure 5). The FT-based ESR was on average similar to the SV-based one (ESR_FT/ESR_SVm = 1.0), but the variation was ±9% for the two samples, and extreme values were 9% higher and 21% lower. The relative variation of the ESR_FT value with the 2D data is ±14%, similar to that for the other 3D ESR calculations (Table 1, Figure 5).
The grain mass is calculated from the volume data using a nominal apatite density, and therefore the 2D and 3D mass determinations directly reflect the variability in the 2D and 3D volume data. The 2D approach consistently overestimates the mass, with a high degree of scatter (3D/2D = 0.82 ± 0.44 (2σ)) (Table 1, Figure 5).

FT corrections
U FT and Th FT correction factors calculated from the 2D data are generally 1-2% lower than the Blob3D U and Th FT factors. To combine the FT factors into a single term that is applied to the (U-Th)/He age, a mean FT was calculated in two ways using EQ. 6 (see methods). This results in mean FT factors that vary by an average of 2% between the 2D and 3D datasets. The 1σ scatter in 3D/2D FT factors is 1.8%, though individual differences can reach up to 9% (Table 1, Figure 5).
One of the main motivations behind this study was to assess the accuracy of 2D grain measurements and of using an assumed grain geometry for calculating grain parameters (volume, ESR, mass, FT), and the impact on the accuracy of the final (U-Th)/He age and eU. For this reason, we selected two samples with relatively simple geologic histories, in order to reduce the impact of geologic or kinetic factors that could lead to age dispersion.
The most striking deviation between 2D and 3D measurements is in the volume and surface area, which 2D measurements consistently overestimated by 20-25% in our study, with a large degree of scatter (1σ = 22% and 14%, respectively). These results are in line with previous work. Evans and others (2008) observed a similar discrepancy in the 5 apatite grains they measured, where their 2D-based volumes were 30% greater than the 3D volumes (Table 3). Our dataset contains >100 apatite grains, implying that the 2D overestimation of volume (and therefore mass) may be systematic in the 2D measurement approach. In contrast, Glotzbach and others (2019) analyzed 24 apatite grains and found that the 2D volume measurements varied by a similar magnitude (~15%), but did not systematically overestimate the volume as in our study and Evans et al. (2008) (Table 3). This is likely due in large part to their procedure of selecting the appropriate shape model on a grain-by-grain basis, including ellipsoids for anhedral grains and accounting for terminations using the functions provided in Ketcham et al. (2011), rather than assuming exclusively flat-terminated hexagonal prisms.
There are multiple factors that can contribute to overestimating the volume of a given apatite crystal. First, the assumption of a hexagonal prism crystal shape with flat terminations, in which the length of the grain is used as the height of the prism, has the potential to overestimate the volume if the crystal has tapered ends (Figure 4). However, our data suggest this can only account for about a third of the volume difference, because even crystals with two flat (or broken) ends still had an average volume difference of 13%. Second, the ideal-prism model also presumes a perfect, equal-sided hexagonal cross section perpendicular to the c-axis, for which the ratio of width to height should be 2/sqrt(3), or 1.1547. The 3D shape measurements give mean ratios of 1.25(02) and 1.23(01) for our two samples, indicating that the cross sections are on average flatter than ideal hexagonal prisms. The non-ideality of this cross section was also noted by Glotzbach et al. (2019), and can result in either an underestimate or overestimate of volume, depending on which face the grain is lying on when measured in 2D. The systematic bias we observe is not surprising, as apatite grains commonly come to rest on their flatter side, whereas some of our observed scatter comes from this not always being the case. This shape divergence explains about a quarter of the departure between 2D and 3D volume in our data. The remaining deviation may be due to chipped crystals, surface roughness, or other deviations from a perfect prism that the 2D calculation cannot account for.
A number of factors will directly impact surface area calculations. Surface area is calculated from the 2D measurements by assuming a perfectly smooth prism. CT has the potential to capture irregular surfaces present in natural apatite grains, which, if they are present and the resolution is sufficient, should lead to higher surface area calculations in the 3D data. However, surface area is problematic to measure, regardless of resolution. Irregular surfaces are to some degree fractal entities, making their measured areas dependent on measurement scale, and the "correct" answer is not straightforward to define. All CT images are naturally blurry to some extent, smoothing out irregularities as well as sharp corners and edges. Conversely, the 3D measurement process of segmentation by thresholding can lead to artificial enhancement of surface area due to "voxelation" effects (the 3D equivalent of pixelation).
In our data, the 2D measurements consistently result in a higher surface area than the 3D measurements. This is probably partly due to the ~5-µm resolution of our CT data, and also to the flat-terminated hexagonal prism model leading to an overestimate. Evans et al. (2008) observe a similar discrepancy in surface area measurements between 2D and 3D data (2D ~23% higher) with a 3.77 µm resolution scan (Table 3). On the other hand, Glotzbach et al. (2019) scanned their grains at a 1.2 µm resolution and their 2D measurements gave surface areas on average 8% lower than 3D (Table 3). As with volume, a large part of the difference is probably due to using a more accurate shape model than an ideal equal-sided hexagonal prism. The overshoot may be in part due to their higher CT data resolution capturing roughness better, but their 3D images also show voxelation effects such as ridge sets on flat surfaces that likely increased their surface areas to an unknown extent.

Mass and eU
The discrepancy in volume between 2D and 3D measurements directly impacts the mass calculation, causing the grain masses derived from the 2D measurements to be ~25% higher than the 3D grain mass determinations (Figure 6). Evans and others (2008) found similar deviations, with their masses calculated from 2D volumes ~30% greater than their masses for 3D volumes (Table 3). Both of these divergences stem from using the assumption of a flat-ended hexagonal prism, whereas an approach that takes grain shape into account when choosing the FT formula (Ketcham et al., 2011; Glotzbach et al., 2019) avoids this systematic bias. However, in all cases that use perfect shape models, the relative scatter is on the order of 20% (1σ), which is large enough to warrant correction.
Although the age equation does not require knowledge of the grain volume or mass, both are necessary to calculate reported concentrations for U, Th, Sm and He (Figure 6). The U, Th, and Sm concentrations, often combined into a single term, 'effective uranium' (eU), have been used as a proxy for radiation damage within a crystal, and age versus eU correlations are commonly used for interpretation of age scatter and for thermal history inverse modeling (e.g., Flowers et al., 2009; Guenthner et al., 2013). Therefore, accurate knowledge of volume has cascading effects from mass to eU concentration and age interpretation (Figure 6). Comparison between eU calculated from the 3D mass data and from the 2D mass data shows that the 2D masses underestimate the bulk eU concentrations by ~20-30%. This is consistent with the 2D mass data being ~25% higher than the 3D mass data, which would have the effect of 'diluting' the eU signal.
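The dilution effect follows directly from concentration being amount divided by mass, as the sketch below illustrates. It assumes the commonly used weighting eU = U + 0.235·Th (the text does not state which eU formula was applied), and the nuclide amounts and grain masses are hypothetical.

```python
def eu_ppm(u_ng, th_ng, mass_ug):
    """Effective uranium concentration in ppm from absolute amounts and grain mass.

    Assumes eU = U + 0.235 * Th; 1 ng per ug equals 1000 ppm by mass.
    """
    u_ppm = u_ng / mass_ug * 1000.0
    th_ppm = th_ng / mass_ug * 1000.0
    return u_ppm + 0.235 * th_ppm


# Hypothetical grain: same measured U and Th, two different mass estimates
U_NG, TH_NG = 0.10, 0.20
mass_3d_ug = 4.0
mass_2d_ug = mass_3d_ug * 1.25  # 2D mass 25% higher than the 3D mass

eu_3d = eu_ppm(U_NG, TH_NG, mass_3d_ug)
eu_2d = eu_ppm(U_NG, TH_NG, mass_2d_ug)
eu_ratio = eu_2d / eu_3d
print(f"eU(3D) = {eu_3d:.2f} ppm, eU(2D) = {eu_2d:.2f} ppm, ratio = {eu_ratio:.2f}")
```

A 25% overestimate of mass therefore lowers the apparent eU by 20%, at the low end of the ~20-30% range quoted above.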

ESR
The various ESR calculations all yielded similar results on average, but with a high degree of variation between measurement and calculation modes (5-6%). In addition to being more accurate for simplifying complex shapes to spheres for diffusion calculations, the ESR_FT method is also likely more robust than others that presume or measure surface area. Surface area, beyond being difficult to define and measure for irregular natural objects in a resolution-resistant way, has only secondary importance for diffusion and FT calculations when it varies on a fine scale compared to the grain (i.e., µm-scale roughness).

FT
A somewhat surprising result of our study is that, despite volume and surface areas being very different between the 2D and 3D methods, these differences largely cancelled each other out in S/V-based FT calculations. This is in part because volume and surface area co-vary, both in the assumed models and the actual measurements, so an error in one leads to a similar-magnitude error in the other (Figure 6).
A result that more closely conformed to expectations is that, as grain size fell, dispersion between 2D and 3D FT values increased, although it remained modest. The standard deviation of 3D/2D U FT was 2.7% for grains with U FT values from 0.6-0.7, 2.4% from 0.7-0.8, and 1.3% for grains above 0.8.

Reproducibility of (U-Th)/He ages
In addition to assessing the accuracy of using the 2D measurements, this study aimed to quantify the uncertainties that may be introduced by such measurements, particularly in FT, as a means to potentially improve age accuracy and precision and reduce intra-sample dispersion. Previous studies have estimated that uncertainties in FT calculation can account for 1-5% of sample age uncertainty (Evans et al., 2008; Glotzbach et al., 2019). Our results are consistent with this range, and suggest that uncertainties in the U and Th FT calculation are on the order of 1-3%, and mean FT varies by 2%. We find the greatest deviations are caused by user error and not the assumed grain geometry. In samples with less euhedral apatite grains, the effects of FT and an assumed grain geometry can increase.
Our data also show that the 3D FT correction does not increase the overall sample age precision. For sample 97BS-CR8, 24 apatite grains were analyzed, two of which are outliers. Of the two outliers, one (97BS-CR8-1) was clearly caused by a user error during microscope measurement, leading to an incorrect FT correction (0.55) and old age (78.8 Ma). This was discovered during 3D image processing, in which the same grain was identified, measured correctly, and produced an FT of 0.76 and a more consistent corrected age of 57.2 Ma. In contrast, for a second outlier (97BS-CR8-24), the 2D and 3D FT corrections both produced anomalously old ages of 101.2 and 98.4 Ma, respectively. An unusually high He concentration is the likely culprit for the old age of this grain, potentially due to He implantation, but the reason for the high He concentration is not evident from our data. Excluding these two outliers, the average age and uncertainty for the sample population (n=22 grains) calculated based on the 2D and 3D measurements are indistinguishable (56.8 ± 2.9 Ma and 56.0 ± 2.9 Ma); relative errors are 5.1% in both cases.
Similarly, the sample ages calculated with 3D and 2D data for 95BS-11.3 (n=59 aliquots) are indistinguishable, 12.2 ± 4.0 and 12.1 ± 4.0 Ma, respectively. Unlike sample 97BS-CR8, there was no clear-cut evidence of user error, and the relatively high age uncertainty (33%) is reproducible between the 2D and 3D FT corrected ages. Five aliquots produced ages > 20 Ma, which skews the mean age older (the median age is 10.2 Ma, within error of the previously reported age in Stockli et al., 2002). The apatite ages do not correlate with factors such as ESR (grain size) or eU. The >20 Ma aliquots all have high He concentrations (nmol/g) compared with the bulk of the sample, suggesting excess He, possibly due to implantation from high U-Th neighbors or the presence of undetected and insoluble high-eU inclusions.
Overall, these data suggest that although the 3D FT can provide a more accurate FT correction and varies from 2D estimations by ~2%, it has a minimal effect on the calculated sample age (1-2%) and no effect on the reproducibility for these two samples. This is not surprising, as a ~2% error would constitute a negligible proportion of the often-cited 6% dispersion derived from analyzing age standards; error propagation indicates that removing a source of 2% error would only reduce an overall 6% error to 5.7%. This points to the importance of other factors in intra-sample dispersion, such as U-Th zonation, and/or excess He from nano-inclusions or high U-Th neighbors.
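Two pieces of arithmetic in this section can be checked with a short sketch. The first function assumes the first-order approximation that the corrected age is the uncorrected age divided by FT, which is reasonable for young ages but is not the full age equation. The second assumes that independent error sources add in quadrature. Both function names are hypothetical.

```python
import math


def rescale_corrected_age(age_ma, ft_old, ft_new):
    """Re-correct an age with a new FT, assuming corrected age ~ raw age / FT."""
    raw_age_ma = age_ma * ft_old   # undo the original correction
    return raw_age_ma / ft_new     # apply the revised correction


def remove_error_source(total_pct, source_pct):
    """Remaining error after removing one independent source (quadrature)."""
    return math.sqrt(total_pct ** 2 - source_pct ** 2)


# Grain 97BS-CR8-1: FT revised from 0.55 (2D, user error) to 0.76 (3D).
recorrected_age = rescale_corrected_age(78.8, 0.55, 0.76)
print(f"re-corrected age ~ {recorrected_age:.1f} Ma")

# Removing a 2% error source from an overall 6% dispersion.
remaining = remove_error_source(6.0, 2.0)
print(f"remaining dispersion ~ {remaining:.1f}%")
```

The first-order estimate of about 57.0 Ma is close to the reported 3D-corrected age of 57.2 Ma; the small difference may reflect the approximation used here. The quadrature result, sqrt(36 − 4) ≈ 5.7%, matches the value quoted above.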

Effects of inclusions or broken grains
It is widely accepted that inclusions and broken grains are both contributors to intra-sample dispersion and inaccurate He ages, particularly anomalously old ages. Inclusions in apatite can act as He traps or a source for excess He, particularly mineral inclusions that do not dissolve during apatite HNO3 digestion. Both apatite samples had multiple grains with high-density and low-density inclusions detectable by microscope during picking and/or the CT scan. In both samples, the presence of inclusions did not have any discernible effect on the (U-Th)/He age. While inclusions are almost certainly a source for error and dispersion in some samples, at least the easily visible ones do not appear to be relevant in these samples, which suggests they are likely also not U-Th bearing inclusions.
Similarly, broken grains can be a source of dispersion if they were broken after the sample passed through the He partial retention zone, i.e., after the grain began to accumulate He (see Brown et al., 2013a). Typically, this may occur during erosional transport or during mineral separations. Brown and others (2013) estimate that broken grains can contribute from 7 to >50% dispersion in the sample age, depending on cooling history. In our samples, grain terminations varied from doubly prismatic to flat, and in some cases appeared chipped or broken. However, there is no clear correlation between the chipped or broken grains and He age. One possibility is that the grains broke prior to cooling through the He retention zone. This seems somewhat unlikely, given that both samples come from crystalline rocks. Alternatively, and perhaps more plausibly, the variety of crystal habits may reflect how the crystals grew in the host rock. In any case, the grains in these samples that appear to be chipped or broken are not obvious sources for the age dispersion observed in the samples.

Benefits and limitations of X-ray CT over microscope measurements
CT scanning mineral grains for (U-Th)/He chronometry has both analytical and practical benefits. CT provides more accurate grain volume measurements, which becomes increasingly important as grain shapes deviate from idealized forms (e.g., abraded or broken grains). CT data are able to highlight inclusions or other internal heterogeneities based on contrasts in density in the X-ray data, which may not be visible to the naked eye. Furthermore, the CT-mounting method outlined in this study allows for the scanning of up to 250 grains in a single session, and potentially many more, making it cost- and time-effective. Different mineral phases can be scanned together, and data can be processed in a batch so that from a single scan, one can gather volume, surface area, caliper dimensions, FT, mass, and ESR at once for several samples and phases. We anticipate that more volume-based shape measurements can and will be developed to automatically and quantitatively evaluate grains for euhedrality, rounding, broken faces, and a wealth of other potentially informative data.
CT scanning mineral grains used for (U-Th)/He dating also has the benefit of removing the possibility of 'user error' during the grain measurement step. Unlike with microscope measurements, the orientation of the apatite grain on the CT mount does not matter, and there is no need to set a magnification or trace the dimensions of the grain by hand, reducing potential for mistakes. CT also eliminates variability that may arise from different microscopes, lighting conditions, and [...]
The main limitation of using CT is access to the instrumentation and cost for sample analysis. However, CT scanners are becoming more common as desktop instruments in earth science departments, and many universities have imaging facilities that include micro-CT machines. As CT instruments continue to proliferate and costs continue to fall, we anticipate that measuring, screening, and documenting grains used for thermo-geochronology will become a widely-used practice.
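The step from a segmented CT grain to volume and mass can be sketched as follows. This is a simple voxel-count estimate that ignores partial-volume effects at the grain boundary; it is not the algorithm used by the software in this study. The function names, the voxel count, and the apatite density of 3.2 g/cm³ are assumptions for illustration; the 5.03 µm voxel size is the value reported for the reconstructed data.

```python
def grain_volume_um3(n_voxels, voxel_size_um):
    """Volume (um^3) of a segmented grain from its voxel count.

    Assumes cubic voxels and ignores partial-volume effects at the surface.
    """
    return n_voxels * voxel_size_um ** 3


def grain_mass_ug(volume_um3, density_g_cm3=3.2):
    """Mass (ug) from volume and an assumed density.

    1 um^3 = 1e-12 cm^3 and 1 g = 1e6 ug, so ug = um^3 * density * 1e-6.
    """
    return volume_um3 * density_g_cm3 * 1e-6


# Hypothetical grain made of 10,000 voxels at 5.03 um voxel size.
volume = grain_volume_um3(10_000, 5.03)
mass = grain_mass_ug(volume)
print(f"V = {volume:.3e} um^3, mass = {mass:.2f} ug")
```

Because volume scales with the cube of the voxel size, any error in the voxel-size calibration propagates threefold (in relative terms) into volume and mass.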

Conclusions
The shape and size of 109 apatite grains from two rapidly-cooled plutonic samples were analysed by 2D and 3D methods. 2D length and width measurements made on an optical microscope were used to calculate surface area, volume, ESR, mass, FT and eU, assuming an ideal equal-sided hexagonal prism grain shape. The same apatite crystals were scanned using x-ray computed tomography at a 4-5 µm resolution, and the same factors were calculated using Blob3D software, which does not require assuming a grain shape. 83 new apatite (U-Th)/He ages were collected to resolve the influence of 2D versus 3D FT correction factors on final (U-Th)/He age and reproducibility. With these data, we derive the following conclusions:

https://doi.org/10.5194/gchron-2019-3 Preprint. Discussion started: 27 May 2019. © Author(s) 2019. CC BY 4.0 License.

Figure 1: Apatite grain photos with 2D measurements taken on an optical microscope. Dimensions are reported in µm and the grain aliquot name is in the top left corner of each photo. The top row is photographed on double-sided sticky tape, and the bottom row is photographed on a glass slide.

Figure 2: Schematic rendering of CT mounting procedure. Grains are adhered to the top of a plastic disc using double-sided sticky tape, with multiple grains placed onto a 5x5 mm surface. Multiple plastic disc layers with grains may be assembled and then [...]

[...] rays and weaker filtering necessitated application of a beam hardening correction during reconstruction. The reconstructed data had a voxel (3D pixel) size of 5.03 µm.

Figure 3: Example CT slices (upper row) and 3D renderings (lower row) of apatite grain mounts for BS95 (left) and 97BS (right). Arrows indicate two grains with high-attenuation mineral inclusions in BS95, and a fluid inclusion in 97BS. The CT slice for 97BS is an oblique slice through the original data, to allow all grains to appear in the same image.

Figure 4: (Top) Rendering of dimension data collected by 2D and 3D methods. Length and width are measured in 2D using an [...] exhibited by the apatite in this study. Highlighted in gray are potential areas of over-estimated volume if an ideal hexagonal prism is assumed and calculated with 2D length and width data.

Figure 5: XY plots for Volume, Surface Area, eU, Mass, Mean FT, ESR, and Age for both samples. Both samples are plotted together, unless otherwise noted. Each data point represents a single apatite aliquot. Black lines represent 1:1. 3D data measurements are plotted on the y-axis in all plots. 2D measurements overestimate volume, surface area and mass, and underestimate eU and mean FT.

Figure 6: Workflow diagram showing the effect of volume and surface area measurements on other parameters used for (U-Th)/He age calculation and interpretation. The average absolute differences between 2D and 3D measurements for each of the parameters are reported with their 1σ uncertainties (reported in Table 1). Note the greatest deviations are in volume and surface area, and those parameters that rely on volume alone. ESR, FT and age deviate less because they use the SA/V, which is ~1 between [...]

Table A1: Values used for calculating eU