Constructing Proxy Records from Age Models (copra)

Reliable age models are fundamental for any palaeoclimate reconstruction. Available interpolation procedures between age control points are often inadequately reported , and very few translate age uncertainties to proxy uncertainties. Most available modeling algorithms do not allow incorporation of layer counted intervals to improve the confidence limits of the age model in question. We present a framework that allows detection and interactive handling of age reversals and hiatuses, depth-age mod-eling, and proxy-record reconstruction. Monte Carlo simulation and a translation procedure are used to assign a precise time scale to climate proxies and to translate dating uncertainties to uncertainties in the proxy values. The presented framework allows integration of incremental relative dating information to improve the final age model. The free software package COPRA1.0 facilitates easy interactive usage.


Introduction
Palaeoclimate reconstructions are a way to relate recent variability in climatic patterns to past changes and to discuss the significance of such changes. They are based on proxy records, retrieved from a large variety of natural archives such as trees, glaciers, speleothems, or sediments. Such archives store information about climate parameters in stratigraphic, and, hence, chronological order.
The proxy-record in question is generally given against depth and must be related to a time scale before any attempt of interpretation can be made. This is done by dating individual points within the sediment column using layer counting, radiometric dating (e.g. 14 C-, or U-series), or marker horizons (e.g. tephrochronology). The (growth-) depth-age relationship can be determined for each proxy data point (the actual age modeling). Limitations on sample material, analytical costs, and time considerations allow for dating of only a few points within a proxy record, and various methods are employed for the age modeling process. Naturally, the quality of any age model depends on the number of dates and the associated uncertainties (Telford et al., 2004).
Inevitably, each dating method comes with inherent uncertainties, and herein lies one basic caveat for any climate reconstruction: the time frame is not as absolute (i.e., pinpointed in time) as one wishes and the uncertainties must be accounted for when interpreting these proxies. Dating uncertainties are often only discussed for discrete dated points, rather than the entire age model. Blaauw et al. (2007) and Blaauw (2010) present possible ways of resolving this issue. Scholz and Hoffmann (2011) point out that the interpolation procedures and techniques used are often not described in Published by Copernicus Publications on behalf of the European Geosciences Union.
adequate detail and lack of consistency and objectivity can cause difficulties when different records are to be compared or reanalyzed, or wherever leads and lags between different reconstructions are studied. The problem is becoming more acute because the spatio-temporal coverage of proxy records now allows for spatio-temporal analysis using complex networks (Rehfeld et al., 2011(Rehfeld et al., , 2012 and more quantitative reconstructions (Hu et al., 2008;Medina-Elizalde and Rohling, 2012). In order to obtain meaningful results from such studies, we must be able to consider the uncertainties of the compared records.
Establishing a methodology for comparing different proxy records is challenging because of the apparent uncertainties in the ages at which the proxy values are known. Such a method would require something equivalent to the "inertial frame of reference" in physics -a reference system that possesses certain universally valid properties without exception. In the particular case of proxy record construction, such an invariant "referential" quantity is the physical time. Physical time is the same for any proxy source, and the differences arise only in the deposition, growth, measurement and finally, the estimation of the proxy in question. In this study we establish a method for constructing such a precise time scale for proxy records with Gaussian uncertainty distribution. The primary goal of age modeling is to construct meaningful time series that relate uncertainties in climate proxies with depthage relationships and associated errors. We propose to improve the age modeling by including pointwise depth information, i.e., layer counting data (if available). Previous approaches, both Bayesian and Monte Carlo-based, do not usually include both relative (i.e., layer counting) and pointwise depth information together in order to improve the overall chronology. Only recently, a methodology has been proposed to combine layer counted floating chronologies with U-series dates (Domínguez-Villar et al., 2012). These authors anchor the layer counted chronology to the radiometrically dated chronology using a least squares fit of a linear relation between the two and also estimate the corresponding age model uncertainties because of the layer count data inclusion. Instead, we use the least squares fit to estimate the minimum "distance" between the radiometric age model and the layer counted age model -a significant difference of our approach from that of their's.
Ignoring the uncertainties between dated points in a sequence interrupts the error propagation and the true uncertainty behind the time series remains hidden. Most available approaches use the mean or median of the age model to construct the final proxy record, leaving a disjoint between the errors of the constructed age model and the final proxy uncertainties. Several techniques have been developed to construct consistent age models and uncertainty estimates (Blaauw et al., 2007;Bronk Ramsey, 2008;Scholz and Hoffmann, 2011). The most recent one is StalAge (Scholz and Hoffmann, 2011), a Monte Carlo-based age modeling software that allows users to construct age models with various interpolation choices, deals with potential outliers and estimates the uncertainties of the constructed age model at desired depths. StalAge was especially designed for speleothem U-series age modeling and allows for detection and handling of outlier and hiatuses. None of five recently compared modeling procedures translates the dating uncertainties to proxy uncertainties Scholz et al. (2012). Blaauw et al. (2007) and Blaauw (2012) discuss this problem and show a Bayesianbased solution for 14 C-based chronologies.
In this study however, COPRA takes this idea a step further to actually quantify the proxy errors for given ages. Here, the age uncertainties have been transferred to the proxy domain using conditional probability (Prob(A|B), where A and B are probabilistic events), which is a crucial difference to the study by Blaauw et al. (2007). This method results in an uncertainty-free time axis. Time domain-fixed proxy records can subsequently be used for direct statistical comparison with other, equally treated, time series. Age model constructions can be further improved with the incorporation of additional information into the numerical procedure, such as counted intervals between at least two "absolute"dated points (Domínguez-Villar et al., 2012). We also introduce a novel approach to proxy modeling that attempts to integrate the pragmatic and theoretical aspects of reconstructing a proxy record from the measurement data in a holistic framework. Moreover, this approach allows the assignment of the proxy values to an precise (i.e. error-free) time scale by translating the dating uncertainties to uncertainties in the proxy values.
A precise time scale 1 is a sequence of error-free calendar dates that represent the true chronological dates at which time the proxy signals were recorded in the core. Usually, the most likely age is somehow assigned to a measured proxy value. But now we pose the converse question: which proxy value is most likely for a given year? Instead of considering ages with uncertainties, we now have the uncertainty entirely in the proxy value. The precise time has the benefit that we can statistically compare different proxy records directly, because their time axes are identical, fixed, and without any error. These time axes are not necessarily equidistant.
With COPRA (COnstruction of Proxy Record from Age models) we propose a new, heuristic, framework that bridges the gap between the uncertainties in "age" and the uncertainties in the "proxy record". The introduced software implementation allows one to interactively detect and handle typical complications such as reversals and hiatuses, and narrow the age uncertainty by supplying additional information, such as layer counting data. As an implementation of this algorithm, we present COPRA1.0, an interactive interfacebased proxy reconstruction software that allows -detection, classification, and treatment of age reversals; -detection and treatment of hiatuses; -interpolation between discrete dating points (using standard functions: linear, cubic, or spline); -optional inclusion of layer counting information (thereby potentially including highly resolved nonlinear accumulation behavior between dating points); -mapping of the proxy records to a precise time scale and estimation of proxy record uncertainties which inherently take into account the uncertainties of the age model.
The interface allows the specialist to handle suspect data and/or include additional information in order to improve the final age model. The software logs and exports all relevant meta-data to ensure reproducibility. We hope that this software routine will help palaeoclimatologists to construct reliable and reproducible proxy record time series. Before we proceed to describe the algorithm, we have to distinguish between point estimates and incremental dating.
Point estimates, i.e., age estimates at the date points provide the only "absolute" chronological information. If the archive is actively accumulating then the top of the sequence also provides a date. While there are several forms of pointwise age-estimates, their treatment in age modeling is usually generic and we will focus on U-series dates in particular. Radiometric dates come with measurement uncertainties, so although they might be "absolute", they are not "exact". In the context of the COPRA algorithm, we assume that the uncertainty distribution of the U-series date (or any other pointwise age estimate) is Gaussian, and the standard deviation is given. Gaussianity presents a simplification, and is not always correct (Blaauw, 2010). This is seen in the highly asymmetric uncertainties prevalent in calibrated 14 C-dates. In future COPRA versions, handling of non-Gaussian uncertainty distributions will be implemented, enabling COPRA to also operate with 14 C-dated archives such as lake sediments. In the case of very high precision dates additional uncertainty might result from the physical sampling procedure, if the analytical error in years is smaller than the years integrated by sampling. At sufficiently high growth rates, this "sampling contribution" becomes negligible. Currently we do not propagate this "sampling uncertainty" in our modeling routine, but with future analytical improvements this additional uncertainty must be considered.
Incremental dating can be obtained if the archive growth is seasonally or annually structured. If this is the case, annual layers might be distinguished, e.g., from crystallographic, or geochemical changes (Treble et al., 2005;Mattey et al., 2006;Fairchild et al., 2006). Starting at a known date, the years can be counted backwards (from the top or the most recent section). This is a standard procedure in tree ring and ice core chronology building, and sometimes, in speleothems or lake sediments (Marwan et al., 2003;Mattey et al., 2006;Preunkert et al., 2000;Svensson et al., 2008;von Rad et al., 1999). In this case, highly resolved information about the depth-age relationship is available and should, if possible, be included in the age modeling procedure in order to improve the uncertainty estimates of the model. The COPRA algorithm can make use of the incremental dating information. The software allows layer counting information to be provided for any section of the record, in order to improve the overall chronology.
In summary, the fundamental assumptions are: -Age measurements (both pointwise and incremental) are assumed to be the expectation value of a normally distributed random variable with the standard deviation equalling the measurement error.
-An exhaustive computer-aided search of all stratigraphically possible (i.e., monotonic) relationships within the normally distributed age observations will help to quantify the most realistic age model within the limits of measurement uncertainty.
-In cases where the stratigraphic condition is violated at the level of the age observations themselves, it is assumed to be due to two primary causes: 1. one or more of the dating points are incorrect and not representative of the "true" accumulation history and have to be either removed (outliers) or considered only after treatment (age reversals); 2. a physical event in the archives accumulation history caused the observations to deviate from typical stratigraphic monotonicity and thus have to be treated (e.g., in case of hiatuses, see Sect. 2.4).
-Incremental dating information amounts to additional knowledge about the depth-age relation and hence, when incorporated into the age model, should reduce the overall uncertainty.

General remarks
For age modeling and subsequent assignment of proxy uncertainties in a precise time frame, two datasets are needed: one including the dated points, and another the proxy values, each with their respective distances from top or base. For simplicity, in the following we consider the distance as "depth". Additional information on marker layers (e.g., hiatuses), and other specific information might be provided in a third file. In order to compile the optimal input for age modeling, the direct dating information, here the U-series dated samples and their geochemical behavior, and mineralogical and petrographic environment, should be evaluated by the specialist. Information on sampling depth, possible contamination, hiatuses, or geochemical alterations might be available and can greatly help to identify outliers prior to any modeling. The COPRA algorithm enables reliable and reproducible uncertainty modeling for proxy time series. Therefore, CO-PRA has to record all necessary information required to reproduce the age modeling, including the input dating information (depth, error, age, error), input proxy values, information on layer counts (if given), and all information on the modeling, like number of Monte Carlo (MC) realizations, interpolation method used, excluded dates, enlarged error bars, confidence interval details, etc.

Monte Carlo modeling: the core of COPRA
Fundamental to the COPRA algorithm is the creation of the age model that accounts for the age uncertainties which are used subsequently to estimate proxy errors. The age model is generally derived by interpolation between the few dates in the dating table towards the higher-resolution depth scale of the proxy measurements. Each dated point is provided with an error value corresponding to a standard deviation σ of a normal distribution. This means that for each location dated the most likely age is represented by the peak in the distribution, but also other ages (younger and older) are possible, with lower probability. The probability of these slightly differing ages is specified by the normal distribution and its standard deviation σ . For example, a σ value of 5 yr would mean that with 5 % probability the given age could be 10 yr older or 10 yr younger than actually specified (the 95 % confidence interval can be estimated as 2σ ).
The point estimate age data are provided as a dating table, which is in the form of {D i , T i , σ T i }, with i = 1, . . . , N entries containing depths D i , corresponding age estimates T i and standard deviations σ T i of the age estimates T i . Here, for matters of simplicity, we shall require that D i+1 > D i , i.e., the depths at which ages are measured should always be reported in increasing order. Now, in order to incorporate the dating uncertainties σ T i into the age model, COPRA adds small random numbers drawn from a normal distribution with standard deviation σ T i to the ages T i and interpolates the ages to the proxy record. Repeating this many times (Monte Carlo simulation, Gilks et al., 1996), we get many slightly differing age models populating the confidence intervals of the dating points (defined within COPRA as different realizations of the final age model, cf. Fig. 1).
These age model realizations demonstrate the uncertainties of the ages given by the dating errors and allow the construction of an age distribution for the given depth D j , formally p j (T j |D j ). The median of these realizations for each depth value D j of the proxy record reveals the most likely Age Depth Fig. 1. Schematic of the Monte Carlo model: the point estimates are identified with normal distributions whose standard deviation equals the measurement error (represented here as cyan shaded areas over the error bars of each point estimate). Several realizations of the Monte Carlo simulation are shown as gray and brownish curves. The brownish curve includes an age reversal and is subsequently rejected. The median (blue) and the confidence limits (red dashed) represent the final age model resulting from a series of different Monte Carlo simulations.
age T j for this sample position j ; and the quantiles of the corresponding age distribution for each depth value can be used to infer the confidence interval for the corresponding ages ( Fig. 1). However, the shape of these confidence intervals depends on the chosen interpolation (linear, cubic, spline).
A further precondition for the interpolation is monotonicity -arising from the stratigraphic reasoning that in almost all palaeoclimatic archives "deeper is older". Therefore nontractable age reversals within the dating table have to be excluded beforehand (discussed in further detail in Sect. 2.3). Still, due to the addition of small random numbers to the ages, in some realizations the monotonicity might not be preserved if the age errors of the dated samples are largely overlapping. Such realizations will be dropped and a new Monte Carlo iteration is added (brown curve in Fig. 1). In some extreme cases this can lead to a very large number of realizations to be calculated until the predefined number of realizations fulfills the monotonicity criterion. The COPRA software uses 2000 MC realizations by default and will issue a warning if the MC simulation converges very slowly, allowing for interrupting the process if it is too time-consuming and interactively re-checking the original age data.

Clim. Past, 8, 1765-1779, 2012
www.clim-past.net/8/1765/2012/ Based on the age model realizations, we derive for each proxy value a distribution of corresponding ages p j (T j |D j ) (now assigned to the depth scale of the data). However, the ensemble of age model realizations also allows us to construct a precise time scale, to which we assign the likely proxy values. This is done by calculating the distribution of the positions in the record at a given age p j (D j |T j ) (using interpolation). For each age T j we can now calculate the distribution of proxy values. By this procedure we translate the dating uncertainties into proxy value uncertainties. 95 % confidence bounds are constructed using the ±2 σ deviation from the median trajectory of the proxy. As already mentioned, precise time scales have the advantage that they allow subsequent statistical comparison of different records, even if they are differently dated.

Age reversals and outliers
Age reversals and outliers are the main causes for problems in the construction of age models. It is important to differentiate between age reversals and outliers.
Age reversals violate the fundamental assumption of monotonicity, i.e., positive growth of the deposit. Outliers change the depth-age relationship significantly and can (but not necessarily always) lead to age reversals. Both features must be identified and solutions be found to obtain a monotone depth-age relationship. Whereas reversals can be handled by their error distribution (increasing the error margins of the involved dates), outliers should usually be excluded.

Age reversals
An age reversal occurs when a dated point leads to a nonmonotonic depth-age relationship, i.e., if it is older than the age of its subsequent dated point below it T i > T i+1 . Stratigraphic reasoning dictates that with positive depth difference in an archive, the age difference has to be positive as well: sequential sedimentation is preserved in most natural archives at positive growth rates. A stalagmite for example is always younger at the top and older at the base. Therefore, the monotonicity of the depth-age relationship is crucial, since we can infer from this stratigraphic information that only positive slopes in a graphical representation of the depth-age relationship are possible and meaningful.
We classify reversals into two types: tractable and nontractable. In the MC simulations (which are at the heart of COPRA) these two classes of reversals have different properties. A non-tractable reversal is said to be present if the considered error intervals of the two involved dated points do not overlap; otherwise the reversal is tractable (Fig. 2). More formally, a non-tractable reversal has its lower 2σ margin outside the upper 2σ margin of the subsequent dated point: . For a tractable, or benign, reversal, the lower 2σ margin of the dated point is smaller than the upper 2σ margin of the subsequent dating point, thus the Age Depth Fig. 2. Classifying age reversals: a non-tractable reversal (point marked in blue) has its lower 2σ margin outside the upper 2σ margin of the subsequent point in the dating table. Thus, the probability of finding a stratigraphically correct depth-age curve (such as the blue curve) is very low and tending toward zero, even though it is possible in principle. On the other hand, a tractable reversal (point marked in green) has non-zero overlap of the 2σ margin with that of the next point, making it computationally feasible to find a correct physically relevant depth-age curve (e.g., the green curve). The ends of 2σ error margins for the relevant points are marked in light red. The dark red line represents the possibly most likely depth-age curve if both outliers are eliminated. In this case, the age model reaches a very homogeneous growth rate, which might reflect the most realistic growth history. error intervals are overlapping and ensure that the negative slope can be compensated within the error bounds: so although we find T i > T i+1 , the ages and their errors satisfy A non-tractable reversal will have to be "treated" (see below) by the user, otherwise the algorithm will not converge to a final result. Conversely, a tractable reversal in the input data does not need correction for the Monte-Carlo approach to yield a result.
It is possible that a reversal is caused by an outlier (see below). Then such a dating point has to be excluded from the subsequent analysis.

Outliers
An outlier is a dating point that is not consistent with the growth history of the archive. Outliers occur for different reasons, e.g., geochemical alterations, contamination, or Outliers can be identified visually (in the current version of COPRA) if a dating point deviates extremely from the general depth-age relationship. But also non-tractable reversals have to be checked whether they are outliers or not, e.g., by using additional knowledge about the sample and the dating measurement.
In many speleothem cases, identified "outliers" can often be traced to problems such as, for example, high detrital thorium concentrations or mineralogical hints to altered segments in the stalagmite (e.g., aragonite to calcite diagenesis). The geochemical data obtained during U-series analysis help evaluate samples for unforeseen chemical changes (like leaching). In such cases, it is the scientist who must evaluate the geochemical data in its overall sedimentological/geological context. Samples affected by such influences should be marked and evaluated with extra care. If independent information proves outliers to have undergone alteration they have to be excluded from the age modeling procedure. As mentioned, such evidence could be geochemical data, Xray diffraction (XRD) results pointing to diagenesis, or other information.

Treatment of reversals, neighbors, and outliers
The treatment of reversals and outliers remains subject to individual evaluation, and the possible handling options are detailed below.

Reversals
If a date causes a tractable reversal COPRA will highlight the suspicious date (t i+1 ), calling for further inspection. Such reversals can be dealt with in the MC simulation procedure where only trajectories in agreement with the monotonicity law are propagated. A non-tractable reversal will require the user to inspect the dating table and modify it before it can begin modeling the depth-age relationship. Non-tractable reversals are often caused by outliers which can then be excluded from further analysis. If no outlier is identified, or if the cause for the reversal cannot be ascertained, the error margins might be conservatively enlarged, either of the highlighted sample or one of its neighbors.
Treatment is compulsory for points causing non-tractable reversals, but is optional for tractable reversals.

Neighbors
It is important to note that while our algorithm highlights one point of the dating table as a reversal, the dates right before and after this point might just as well be erroneous instead. Thus, these neighbors must be evaluated too. If identified as outliers or as suspicious samples, exclusion or error-widening of these adjacent dated points, respectively, can also lead to a consistent growth history and the final decision must be based on the experts knowledge.

Outliers
Statistical outliers must be excluded from further analysis and interpretation. Outlier removal usually alters the shape of the depth-age relationship in a positive way: often, reversals disappear and a simpler growth trend is established. In the current version of COPRA, no automatic statistical outlier detection has been implemented and the user has to identify outliers manually. If a sophisticated treatment is desired, the dating input might be scrutinized using the methods described elsewhere. Methods, like, for example, detailed in Aggarwal and Yu (2001); Barnett and Lewis (1994); Iglewicz and Hoaglin (1993); Knorr et al. (2000), all have their advantages and weaknesses and must be tested for their usefulness in our context. Implementation of such outlier detection schemes is planned for future COPRA versions.

Hiatuses
A hiatus is a growth interruption in the archive. Climatic changes, such as aridity, cooling, or biologic changes can force hiatuses, but also factors unrelated to the climate history, such as burial of stalagmites under sediment could be relevant. Therefore, their close investigation is important for reconstructing the growth history and the causes for their occurrence. In stalagmites, hiatuses occur if the supply of drip water, supersaturated with respect to CaCO 3 , ceases. Often, this points to dry conditions above the cave, and can in itself be a "drought indicator". Even "negative growth" can occur if undersaturated water dissolves the stalagmite (Lachniet, 2009). In a worst case scenario this leads to the destruction of the stalagmite, but if undersaturated water enters the cave on a seasonal scale it might also lead to "micro-hiatuses", lasting only weeks or months. The former extreme case might be rather unique and such samples are not used as palaeoclimate archives. The latter might occur undetected, and changing drip water saturation and chemistry can potentially affect the geochemistry by re-mobilizing uranium isotopes that can go undetected in the field. Hiatuses are not unique to stalagmites. Similar effects can be observed in low-accumulation ice cores, when strong winds can stimulate loss of accumulated snow mass (Mosely- Thompson et al., 2001).
In COPRA, potential hiatuses are evaluated in the context of the individual slopes T / D of the originally given dates to depth data. A statistical test is conducted to check whether the observed slope between two successive dated points is significantly lower (or zero) than in the rest of the record. If this is the case, the user can choose whether to split the age model at this point (thereby no dates are assigned between the bounds of the hiatus) and the age model is broken into segments. Alternatively, the user can also remove Clim. Past, 8, 1765-1779, 2012 www.clim-past.net/8/1765/2012/ points or increase error margins surrounding the hiatus. If COPRA fails to highlight a hiatus that the scientist suspects because of additional information, a hiatus can be specified manually and then the above treatment options can be used subsequently. Likewise, if COPRA returns false positives and detects hiatuses where there should be none (for example, slower -but non-zero -growth than in the rest of the archive), the user can ignore this false detection. The robustness of the statistical test is clearly dependent on the number of slope estimates, i.e. the number of entries in the dating table.
The age model can then be split at the potential hiatus and individual models will be calculated for each segment. Within each segment, the modeling extrapolates between the closest dated point and the respective hiatus depth. The scientist can also choose to either enter a known depth for any hiatus, or let COPRA use the mid-point between the bracketed dating point as the splitting depth.

Incorporating incremental dating
As discussed before in Sect. 2.1, several palaeoclimatic archives can also provide incremental dating information such as layer counting. Typically, this information is in the form of {d j , t j , σ d j }, j = 1, . . . , n. Such a dataset is obtained by counting N 0 times (say) the depthsd j at which the age t j occurs; and then d j and σ d j are the mean and standard deviation of the N 0dj observations for the age t j . Note here that the depth scale d j (where j = 1, . . . , n) might be quite different from the depth scale D i (i = 1, . . . , N) mentioned in Sect. 2.2.
Here we propose to incorporate such information so that we step closer to a more precise age model in the end involving as much information as possible in its construction. Although our approach shares some similarity with the one recently proposed by Domínguez-Villar et al. (2012) for positioning the "floating" chronology relative to the radiometrically determined one, there are differences between the two methods as outlined in Sect. 4.3.
First, we assume that incremental dating (layer counting) and point estimates (U-series dating) are independent experiments, especially in the sense that the errors of measurement of the incremental dating points are not correlated to the errors of measurement of the point estimates. Also, we assume that since incremental dating is a relative dating technique and that the ages refer to the first dated point in the table, the age of that first incrementally dated point is zero.
Considering the hypothetical example shown in Fig. 3 (for illustrative purposes) where layer counted age information is available in the light gray shaded area we carry out the following steps: Step 1: we run a Monte Carlo simulation of the point estimates alone and obtain an age model as one would have if the incremental dating information had not been there. Let us call this Age Model A (the brown curve in Fig. 3a).
Step 2: next, we run a second Monte Carlo simulation (analogous to the description in Sect. 2.2) of the incremental dating points alone but this time by drawing random numbers from the depth axis instead of the age axis as in the previous step. The standard deviation of all realizations at any given depth then shall be the error in age for that particular depth (Fig. 3, inset). Thus, by estimating the respective meanst j and deviations σt j at all d j , we obtain a second age model (say Age Model B) which starts at age zero (red curve, Fig. 3b); and where the earlier error of the depths are now "transferred" to the ages, to give us {d j ,t j , σt j }.
Step 3: the next step is to position Age Model B as optimally as possible within the context of Age Model A.
To do this, we minimize the least squares separation between Age Model A and Age Model B (S 2 , dashed magenta curve, Fig. 3c) which essentially means minimizing the overall distance (on the age axis) between the red and brown curves by shifting the red curve left and right. This gives us the age offset A o by which Age Model B has to be shifted in order to be closest to Age Model A. Although the interpolation method used for Age model A might theoretically bias the final result after including the layer counted interval, we assume this to be a minor problem. If the accuracy of the radiometric dates is low, the large errors allow for many different interpolations to be realized fulfilling the stratigraphic requirements. This results in larger error margins (without layer counting). When a segment with counted layers is included (which has a better internal chronology), this segment will markedly improve the final error estimate, regardless of the chosen interpolation method.
Step 4: we now shift Age Model B by the vector A o (Fig. 3d). This would transform every point in Age Model B {d j ,t j , σt j } to {d j ,t j + A o , σt j }. This allows us to construct a final dating table combining all the depth-age information from the point estimates {D i , T i , σ T i } and the age-shifted incremental dating information {d j ,t j + A o , σt j }. We arrange the combined set {D i , d j } in ascending order (while keeping track of their associated ages and errors) to get the final dating table.
Step 5: using this combined dating table we carry out a final Monte Carlo simulation to get the final age model that incorporates dating information from both the point estimates and the incrementally dated points.
The fundamental idea in this approach is that the incrementally dated points have to be first positioned in the right  (c) Estimation of minimum least squares separation between the brown and red curves (S 2 curve in magenta). (d) Shifting of the incrementally dated points by A o to get the final dating table combining point estimates and incremental dating. Inset: the standard deviation of all realizations (gray curves) along the age axis is considered the error/uncertainty in the age model. place within the point estimate dating table so that the relative dating information stored in them can be used to construct a better age model. Minimizing the least squares separation between the two independent Age Models A and B provides an intuitive way to achieve this.
Currently, in estimating the vector A 0 , we do not take into account the errors of the Age Models A and B and only use the mean values. For a more holistic solution we need to obtain an error for vector A 0 (that includes the errors of Age Models A and B) as well, which ultimately affects the final age model and also the uncertainty estimates. This is the focus of further analysis and is intended to be included in future COPRA versions.

Application of COPRA
In order to demonstrate the performance of COPRA, we discuss three different scenarios. First, a hypothetical dataset is employed that simulates tractable and non-tractable age reversals, and hiatuses. Second, we test COPRA on a realworld stalagmite for its performance to detect and handle reversals and hiatuses. Finally, we use a fast-growing, well U-series dated, and partly layer counted stalagmite to exemplify the inclusion of layer counting as a way to improve the confidence interval for the age model.

Hypothetical dataset
We have used artificial datasets to assess the detection of reversals, hiatuses and layer counting in COPRA. Here, we first simulate a growth history for a hypothetical palaeoclimate archive and concordant climate proxy variations and then sample both to obtain a dating table of point estimates and a proxy climate history.
We assume that a 1 m long stalagmite was obtained and ten point age estimates are spaced equidistantly along the record. We translate these depth intervals into age intervals by drawing average growth rates from a uniform distribution varying between 0.5-1.5 mm yr −1 . Cumulative summing of these age intervals gives us the ages for the dating table. To test COPRA's abilities with respect to reversals and hiatuses, we modify the randomly selected growth rates to yield low or even negative increments (for reversals) or a very large age increase at low depth gain (to simulate a hiatus).
Five hundred "proxy measurements" were obtained along the 1 m long record from the top down to shortly below the maximal dating depth. A true age was assigned to these measurements by interpolation to the growth history in the dating table, taking note of hiatuses, but not reversals. The proxy signal we used was a slowly varying sinusoid with a period of 150 yr.
Synthetic layer counting data was generated for the consistent growth history case. We assumed that 25 yr of the record could be counted, with a minimum error of 0.1 mm at the top and an additional term increasing with 1 % of layer counting depth.

Stalagmite TSAL-1
As a real-world example, we select stalagmite TSAL-1 from Tksaltubo Cave in the Georgian Caucasus mountains that includes outliers and a hiatus. This sample was found broken in the cave and was tested for its usefulness as palaeoclimate proxy archive by preliminary U-series dating and low resolution stable isotope sampling. 12 U-series dates have been measured on a multi-collector inductively coupled plasma mass spectrometer at the University of Minnesota (details on the radiometric dates can be found in Table 1 in the Supplement). 183 stable isotope samples (δ 18 O) have been measured at the ETH Zurich at 2 mm intervals as a reconnaissance profile. The ca. 360 mm long TSAL-1 grew between 46 and 35.5 kyr BP, with the lowermost age showing large uncertainties. , 8, 1765-1779 TSAL-1 features a visible growth interruption at a depth of 72.25 ± 0.2 mm (Fig. 4), evident as a change of growth rate below and above. The growth interruption is clearly visible in the stalagmites petrography. However, without the petrographic evidence, we would not be able to securely assign a hiatus. Luckily, we can measure the hiatus depth and use it in COPRA to split the age model into two segments. The reason for this hiatus is not clear; it may have been caused by lack of drip water, submergence of the cave, or burial of the stalagmite in sediment. The latter factor is evidenced by silt material ingrown along the sides of the stalagmite during the hiatus, whereas the hiatus surface seems to have been washed clean of sediment by drip water.

Stalagmite YOK-G
As a second example, we use a fast growing stalagmite from Yok Balum Cave, Toledo District, southern Belize. Stalagmite YOK-G (Fig. 5), collected in 2006, is an aragonite sample (confirmed by X-ray diffraction analysis) that displays visual growth laminations of altering dark compact and milky white with more porous material. Preliminary U-series dates and layer counts suggest that the sample was fast growing (growth rate between 0.12 and 1.63 mm yr −1 ) and for the most part annually laminated. Furthermore, environmental monitoring of Yok Balum cave supports the notion that seasonal changes in dripwater chemistry and cave environment cause changes in crystal arrangement and therefore annual fabric laminations. Details on the radiometric dates can be found in Table 2 in the Supplement. Interpretation of the palaeoclimatic information from YOK-G will be published in later contributions.
Annual layer counting was performed by two people on part of the core. Layer counting should ideally be performed multiple (at least three) times by the same or different people, so that an estimate of the uncertainty of the counting process is given. For the modeling process, both an expected depth for a given year, and an expected variation about this mean is needed. The first would usually be given by the mean absolute deviation of the depths assigned to each individual year in different counting runs, the latter by their standard deviation. However, we only had access to two counting iterations for YOK-G. Therefore we replaced the standard deviation of the depths at a given year by the maximum absolute www.clim-past.net/8/1765/2012/ Clim. Past, 8, 1765-1779, 2012 deviation about the mean depth of the two counts. The short layer counted interval is in good agreement with U-series dating results (see Fig. 5).

Hypothetical dataset
The true depth-age relationship is unknown for real palaeoclimate archives. In order to be able to test COPRA's general functionality and classification capability of age reversals and hiatuses we employ simulated stalagmites as described in Sect. 3.1.1. For this simulated, piecewise linearly grown stalagmite, we found the depth-age relationship without prior modification of the dating table (Fig. 6a) and the minima and maxima of the hypothetical record was determined. The proxy signal, a sinusoid with a period of 150 yr, varies slowly compared to the average sampling rate, however. For a signal with higher frequency variability it would be impossible to characterize minima, maxima and change-points with confidence. The confidence intervals of the final age model, as well as the proxy time series, can be significantly narrowed if layer counting data is available, as the insets in Fig. 6 reveal. If age reversals are deliberately embedded into the dating table they are faithfully detected and highlighted. Removing these false point age estimates leads to an depth-age relationship and proxy time series consistent with the true curve (results not shown).
Unrecognized and unaccounted hiatuses can lead to erroneously narrow confidence bands in the hiatus area of the depth-age relationship and the proxy time series. On the other hand, if the hiatus is compensated by splitting the age model simulation (allowing for a non-continuous depth-age relationship) the error margins before and after the period of slowed growth widen and no ages are assigned to the period in-between (not shown).

Reversals and outliers
In a first step, COPRA evaluates the input data and properly detects and marks (with violet circles) two reversals (Fig. 7). When we inspect the depth-age diagram, TSAL-1 shows two seemingly suspicious samples, because they do not easily fit into a monotonous depth-age plot. The two detected reversals are caused by non-tractable reversals, which are identified as outliers, and must be eliminated from further analysis. High detrital 230 Th is the most likely cause for these two dates apparent ages. This is confirmed by information from the laboratory, that the oldest age was contaminated (we use it for illustration purposes here only).  Fig. 6. Illustration of layer counting effects using synthetic data: (A) median age model for the synthetic dataset (linear interpolation, with/without layer counted information, which is available in the grey shaded region) shown along with the "true" depth-age curve. Inset: detailed view highlighting the improvement of confidence bounds inside the layer counted interval. (B) Median proxy record estimated for the age models in (A). Inset: detailed view showing better constrained proxy estimates after inclusion of layer counted data, compared to result not considering layer counts. 95 % confidence bounds were constructed using ±2-standard deviations from the median (legend valid for A and B). , 8, 1765-1779

Hiatus
Since we know the position of the hiatus in TSAL-1, we enter the depth when asked by COPRA in the hiatus detection and treatment loop. The age modeling is split into two compartments, and each age model segment is extrapolated from the nearest dating point towards the hiatus. Fig. 8a shows the result of an untreated hiatus, while Fig. 8b shows a split MC simulation with the hiatus shown as dashed line. In the continuous age model a very strong (and unlikely) decrease in growth rate is apparent, while in the split simulation two separate MC simulations are run, each extrapolating from the closest U-series date to the given hiatus depth. The realizations tend to fan out near the hiatus, as they do at the base and top of the stalagmite.
The proxy time series for TSAL-1 on an absolute time scale is obtained as the final output of COPRA (Fig. 9). While the older segment of TSAL-1 shows a trend towards more positive values, with rather large uncertainties, the younger segment above the hiatus reflects some clear variations. The confidence interval indicates how well (or not) these variations in δ 18 O can be interpreted in a palaeoclimatic context.
If we use a simple interpolation of the U-series dates to the depth scale of TSAL-1 instead considering the precise time scale and dating uncertainties, the resulting proxy record shows remarkable variations and might prompt to unwarranted conclusions about high-frequency fluctuations (Fig. 9). COPRA allows a more reliable assessment of such variations.

Stalagmite YOK-G
We use the presence of both, U/Th dates as well as layer counted (relative) age information in the case of the stalagmite YOK-G in order to test the efficiency of the COPRA methodology in effectively incorporating the layer counted ages to increase our confidence in the proxy record. U/Th dated point from top) with reasonable confidence limits (Fig. 10). However, incorporating the layer counted data further increases the confidence of the age model in that region (Fig. 10b). This has a significant impact on the consequent proxy record because this reduced uncertainty is also reflected in a reduced uncertainty of the proxy values: higher frequency variations remain interpretable (Fig. 11b) whereas the proxy record obtained without the layer counted information fails to capture any of the higher frequency variations in the same age interval (Fig. 11a).

Proof-of-concept
The results shown in Sect. 3.2.1 present a simple but sufficient proof-of-concept for the COPRA methodology. The proxy record and the variations within it are estimated to reasonable accuracy. Apart from the basic assumptions that were highlighted in Sect. 2.1, COPRA is not restricted to any particular model of growth or sediment accumulation. The current Monte Carlo approach using Gaussian distributions can be replaced by other methods estimating the confidence intervals, e.g., by Bayesian techniques, which might allow for considering other age distributions (e.g., as evident for radiocarbon ages, Blaauw 2010).
COPRA presents a general reconstruction framework that is limited only by the accuracy of the measurements provided. This is further corroborated by the fact that it is able to give a better estimate (closer to the true value) as well as a narrower confidence bound when additional information (in the form of layer counts) is provided (cf. Fig. 6). We would like to stress that in the future, a quantitative comparison of different modeling algorithms using benchmark datasets should be performed in order to assess the capabilities of software packages.
Furthermore, COPRA presents several pragmatic advantages in dealing with the hurdles confronted when developing age models and proxy climate records with associated uncertainties. This is discussed below.

Managing age reversals and hiatuses effectively
The real-world examples in Sects. 3.2.2 and 3.2.3 represent different practical problems when developing age models for proxy climate records: age reversals and hiatuses. COPRA is able to detect tractable and untractable reversals, and is flexible enough to handle reversals, based on additional information. It also allows the user to exclude any other problematic dating point that the specialist might be aware of due to independent information relating to the experiment. Furthermore, hiatuses can be either defined manually (if the hiatus position is known), or detected automatically by COPRA. On detecting a hiatus, the age model can be split into segments (above and below the hiatus) and the age model in each segment is calculated individually (cf. Fig 8). If a hiatus is assigned automatically COPRA uses the mid-point between the bracketing dates, as the hiatus position is unknown. This method of dealing with a hiatus is realistic and relevant to the actual growth of the stalagmite because the proxy record does not attribute any proxy values to the hiatus period (Fig. 9). If the age modeling had not been split into two independent segments at the hiatus then there would have been a proxy estimate inside the hiatus years with very narrow confidence bounds, which is both meaningless and false.

Layer counts increase accuracy
An algorithm to combine layer counting information with point-wise dating improves the confidence intervals for the counted segment of the age model. However, to date, only a few studies (e.g. Domínguez-Villar et al., 2012) provide a general approach to the incorporation of incrementally dated information in the construction of age models while allowing to estimate the final age model uncertainties. This feature in COPRA is thus a significant advancement, as it allows the construction of reliable age models even if individual dates show rather large uncertainties. Both COPRA and the study by Domínguez-Villar et al. (2012) use a least squares fit to anchor the floating layer counted data to the radiometric depth-age data. However, in COPRA, we first construct two independent age modelsone from the layer counted data and the other from the radiometric data -and then position the floating layer counted age model relative to the fixed radiometric age model. We do this by using least squares estimation to get the minimum distance between the two models in the depth-age coordinates. In contrast, Domínguez-Villar et al. (2012) anchor the layer counted chronology to the radiometric one by estimating the least squares fit of a linear relationship between the two. This represents a critical difference between the two methods.
Finally, the increase in accuracy of the age model within the layer counted interval can yield drastic improvements in the proxy record (Fig. 11). The layer counted interval in YOK-G did not show any significant oscillations without the layer counted data, whereas it revealed several oscillations in the climate proxy in the same interval as soon as the layer counts were included in the estimation process.

A "precise" time scale for all proxies
With COPRA we introduce the concept of a true, precise time scale for proxy time series. As outlined above, a precise time scale is needed if several records are to be compared statistically. Using the translation from dating uncertainties to proxy uncertainties, we ask what is the most likely deposited sediment range (depth) that was deposited at a certain (true and absolute) year. Following this idea, COPRA is able to provide a new time series of the proxy record where the probability distributions of the proxy values are assigned to the true dates, the absolute time scale, allowing for comparability of differently dated proxy records.
The MC-generated distribution of the ages is used to transfer the age uncertainties into uncertainties of the proxy record and to assign a true time axis (using interpolation). Thus, for each true chronological date, a distribution of the proxy values is assigned. The median of the proxy value distribution can be further used for subsequent time series analysis. The calculated confidence levels and usage of the median proxy record allow a more reliable interpretation of the proxy variation, as well as a better comparison with other proxy records.
Another observation about the incorporation of age uncertainties into proxy estimation is that the resultant proxy record has less variability than a proxy record that is constructed simply by interpolation alone (cf. Fig. 9). This does not necessarily imply that the proxy actually is less variable, but it simply means that we cannot say anything with www.clim-past.net/8/1765/2012/ Clim. Past, 8, 1765-1779, 2012 confidence about the high frequency variations of the proxy taking the uncertainties of the measurements into account. That is, our knowledge of proxy variations is constrained (or, to look at it conversely, "enhanced") by the measurement errors (or "precision").

Conclusions
Proxy uncertainty modeling, i.e., building a reliable chronology for palaeoclimate proxy records with proxy uncertainty estimates, is a complex process and is often difficult to be objectively performed and to be reported in sufficient detail. Moreover, different assumptions and priorities considered can produce incommensurable chronologies, thus, incomparable proxy time series. We present here a new framework for COnstructing Proxy-Record from Age models (COPRA). It allows for a reliable and reproducible age reversal definition, hiatus detection, error estimation, inclusion of layer counted intervals to improve overall confidence intervals, and it enables users to translate age uncertainties into proxy uncertainties.
In the future, we plan to continue the development of CO-PRA, e.g., for implementing a robust automatic outlier detection, and improving hiatus detection, which is challenged if the hiatus is short compared to the dating resolution. The latter is especially important for chronologies with very short (sub-centennial to sub-decadal) growth interruptions that are of great importance for a realistic interpretation of climate proxy records. As a flexible and expandable framework, CO-PRA allows adaptation and integration of various age modeling approaches, or other specific statistics to further improve proxy time series reconstructions.

Appendix A The COPRA workflow
After preparing the basic input data (dating and proxy data tables) and import of these files, the user is guided through the modeling algorithm (Fig. A1). First, the user is asked if additional information is available (e.g., layer counted segments or hiatus information). If no additional data is available, COPRA proceeds to check the record for age reversals. If a layer counted interval is available, the number of counted years, the lower depth, and the upper depth of the counted interval must be entered.
After these tests the user confirms the selected dating information (which is shown as an depth-age graph) and the actual modeling is initiated. In the process the user can modify the interpolation routine used (see above; linear = default). In the case of hiatuses, treatment selection choices must be made by the user (involving further exclusion of points, or  Fig. A1. COPRA Workflow: this schematic shows the basic structure of the COPRA algorithm. The flow is divided into five broad sections which are color-coded. The light blue boxes denote the initial part where the user is prompted for the input data and other relevant information for the proxy construction. The dark blue boxes denote the part where the dating table is checked for reversals and outliers and thereafter the user is allowed the option of selecting an appropriate "final" dating table to be used in the Monte Carlo age modeling stage. The pink boxes denote the core Monte Carlo modeling part where the user also has the option to tune various relevant aspects of the model as required. The orange boxes deal with hiatus detection and treatment. Finally, the green boxes in the chart deal with the construction of the final proxy model from the age model information and the exporting of the final outputs. increase of error margins, or the splitting of the simulation into two independent age models) for each hiatus detected and/or specified. Further, the number of MC simulations, the confidence interval, and the type of central estimate (e.g., median or mean) can be changed (COPRA uses 2000 MC simulations, the 95 % confidence level, and the median value as center point as default).
The MC simulation is performed next and the resulting depth-age relationship with the dates and their errors, and with the chosen confidence intervals, is displayed. Subsequently, the user can confirm and continue (if satisfied), or repeat the process by either making a different choice concerning the included dates and hiatuses, or choosing different MC simulation parameters. Once the user is satisfied COPRA proceeds with computing proxy-age relations and Clim. Past, 8, 1765-1779, 2012 www.clim-past.net/8/1765/2012/ associated uncertainties (COPRA uses the confidence level given above for the MC simulation, by default). Finally, the output data (proxy vs. depth or age, with vertical (proxy) and horizontal (age) errors) are displayed. All relevant metadata and the computed data are exported as plain ASCII files. The metadata ensures repeatability using the same input data and settings.