Climate change between the mid and late Holocene in northern high latitudes – Part 1: Survey of temperature and precipitation proxy data

We undertake a study in two parts, where the overall aim is to quantitatively compare results from climate proxy data with results from several climate model simulations from the Paleoclimate Modelling Intercomparison Project for the mid-Holocene period and the pre-industrial conditions for the pan-Arctic region, north of 60° N. In this first paper, we survey the available published local temperature and precipitation proxy records. We also discuss and quantify some uncertainties in the estimated difference in climate between the two periods as recorded in the available data. The spatial distribution of available published local proxies has a marked geographical bias towards land areas surrounding the North Atlantic sector, especially Fennoscandia. The majority of the reconstructions are terrestrial, and there is a large over-representation of summer temperature records. The available reconstructions indicate that the northern high latitudes were warmer in summer, in winter and in the annual mean temperature at the mid-Holocene (6000 BP ± 500 yrs)


Introduction
In recent decades the northern high latitudes have experienced significant warming, which is larger than elsewhere on the globe (e.g. Moritz et al., 2002; Brohan et al., 2006). Observations since 1961 show that temperatures have risen by more than 2 °C in Arctic areas (IPCC, 2007) and in the past 100 years averaged Arctic temperatures have increased at almost twice the global average rate (IPCC, 2007). The observational evidence is generally consistent with climate model simulations that include increased greenhouse gas concentrations and other observed external forcings (Holland and Bitz, 2003). However, uncertainties in Arctic regional climate predictions, which rely on our current understanding of climate-influencing processes in the various components of the climate system, still exist (IPCC, 2007). Improved knowledge of past climate variability and regional climate evolution in the Arctic is crucial for a better understanding of present climate dynamics and is a prerequisite to meet expressed needs for improved climate forecasting capabilities. This necessitates quantitative information also from data records that reach beyond the information available from instrumental records (e.g. Jones et al., 2009). Such data, i.e. climate proxy records, are further needed for validation of climate models, to ensure that the models are able to reproduce the observed climate change.
Reconstructions of past climate changes can be obtained by analyzing different types of proxy data from natural archives, like biological proxies preserved in lake and marine sediments, the width of tree rings or the oxygen isotope composition in ice, trees, speleothems and lake sediments. There is evidence that climate in the northern high-latitude regions was warmer than today during the mid-Holocene. Davies et al. (2003) suggested, based on evidence from pollen data, that the so-called Holocene Thermal Maximum occurred across a wide area of northern Europe at around 6000 years ago. While the insolation in boreal summer was maximal at around 11 000 years ago (Berger and Loutre, 1991) due to orbital factors (tilt and precession), the temperature maxima in the proxy records are in some places delayed by up to 4000 years as a result of the cooling effects of the melting of the big continental ice sheets (Kaufmann et al., 2004; Renssen et al., 2009). The long-term temperature trend since then, caused by the change in orbital forcing, appears to have been amplified by several positive feedbacks, for example the ice-albedo feedback (Deser et al., 2000), the sea-ice albedo feedback (Harvey, 1988), and the tundra-taiga feedback (Otterman et al., 1984). A further mechanism of importance is the slow release of the excess heat stored in the oceans due to their large heat capacity (Renssen et al., 2006). The existence of such feedback mechanisms implies that high-latitude climate variability is a complex process, which can only be understood by analyzing simulations with numerical climate models of a considerable degree of complexity.
The Paleoclimate Modelling Intercomparison Project (PMIP, http://pmip.lsce.ipsl.fr/ and http://pmip2.lsce.ipsl.fr/) was launched to study the role of climate feedbacks arising from the different climate subsystems and to evaluate the capability of state-of-the-art climate models to reproduce climate states that are different from those of today. Two of that project's focus periods are the mid-Holocene (6000 years ago) and the pre-industrial period (ca. AD 1750). In a number of studies, the response of several models to primarily orbital forcing has been compared with evidence from proxy data for different regions (e.g. Prentice et al., 1998; Gladstone et al., 2005; Brewer et al., 2007). Nevertheless, for the Arctic region there are more proxy records available than hitherto used in the model-data comparisons and, furthermore, the various types of uncertainties that always exist in proxy data (e.g. Wanner et al., 2008; Jones et al., 2009) have usually not been explicitly accounted for.
We undertake a study, in two companion papers, where we first address questions concerning temperature and precipitation proxy data availability and proxy data uncertainty, and then perform climate model vs. proxy data comparisons for the whole northern high latitude region (60-90° N), focusing on the periods at 6000 years ago and the pre-industrial period. By focusing on these two periods, it is possible to undertake a quantitative analysis where we compare the available local temperature and precipitation proxy series with all model simulations from the PMIP projects. Such a model-data comparison is the goal in Part 2 of our study (Zhang et al., 2010).
In the first part (this paper) the aim is to undertake a survey of available calibrated local temperature and precipitation proxy records extending 6000 years back. We analyze the geographical coverage of data, calculate the change in climate between 6000 years ago and the pre-industrial period in each record, compare what we find in different proxy types and regions, discuss sources of uncertainty in the data and attempt to quantify some important uncertainties of relevance when comparing the climate change deduced from proxy data with results from model simulations. The estimated local climate changes and the associated quantified uncertainties found here are then used in the model-data comparison in Part 2, where we implement a simple cost function approach of a type that has previously been used in proxy-data vs. model comparisons, in an attempt to identify the models that most closely agree with the proxy data. These "best-fit" models are then subject to analyses of the models' response to the (mainly) orbital forcing change and the processes and feedback mechanisms that are involved.

Screening of proxy data
There exists no public database that contains all published temperature and precipitation proxy records covering the region and period of interest here, which is the area north of ∼60° N and the time period from the mid-Holocene to the pre-industrial. Therefore, to obtain an overview of the available data it was necessary to start with a systematic screening of the peer-reviewed literature. We only searched for published records that were already calibrated, and we were not looking for regional compilations from multiple records; instead we focused on local site-by-site reconstructions. By doing so, we found a total of 129 reconstructions from 71 different sites (Fig. 1, Table 1). We obtained the majority of the data series directly from the respective author by personal contact (72), while only a few were obtained from the NOAA database (6) (www.ncdc.noaa.gov/paleo/recons.html) or from the respective author's home page (6). Whenever we could not obtain data digitally in any of those ways (45), we digitized figures in the original articles, using the program GetData.
There do, however, exist more potential data than those included in our survey. For example, for North America, regional summer temperature reconstructions from pollen covering the region 50-70° N have been developed (Viau and Gajewski, 2009). But this is partly outside the area of our study and, moreover, it is a regional-average reconstruction and could thus not be included here because we decided to use only local data published as such. Nevertheless, the number of proxies from high latitudes used here is substantially larger than in previous proxy data compilations for the same area. For example, Kaufman et al. (2009) used only 23 local proxy records to study Arctic temperature trends over the last 2000 years. One reason why we can include many more records is that we do not require a temporal resolution higher than about one sample per century. With such a low temporal resolution, it is of course not possible to undertake analyses at decadal time scales, which were essential in the study by Kaufman et al. (2009).

Estimation of climate change between the mid-Holocene and the pre-industrial periods
As our goal in Part 2 (Zhang et al., 2010) is to compare the PMIP simulations for the climate change between the mid-Holocene and the pre-industrial periods with the evidence from proxy data, we need to define suitable time periods for which we calculate the climate change recorded in the proxy data. The PMIP simulations are subject to orbital forcing and to forcing from changes in the atmospheric concentration of CH4. The change in orbital forcing is the most important (Zhang et al., 2010). Hence the PMIP simulated climate change between the two time periods is essentially the result of the slowly changing orbital forcing. PMIP defined the two periods as the 100-yr periods centered on 6000 yrs BP and AD 1750. We could in principle choose the same periods as in PMIP here. However, in reality there is certainly some climate variability at ∼100-yr scales that is due to either internal (unforced) variability or external forcings (e.g. solar and volcanic) that were not included in the PMIP simulations, and which would add some uncertainty in direct model-data comparisons. Therefore, to minimize influence from variability at the ∼100-yr scale in our estimates of climate change in the proxy data, we decided to use somewhat longer time periods. There is no obvious choice to make here, but we regard 500-yr means as a reasonable compromise. Such time windows should make it meaningful to quantify the climate change between the two periods, without too much influence from non-orbital forcing and internal variability to disturb the model-data comparison. Hence, for the early period, we average the reconstructed temperature or precipitation data over the 500-yr window defined as 6000 yrs BP ± 250 yrs. For the later period, however, it is not possible to use 1750 AD ± 250 yrs.
There are two reasons for this: (i) many proxy series do not extend up to 2000 AD and (ii) climate in recent times is influenced by anthropogenic forcing, which is not included in the PMIP simulations. Hence, we had to move the later 500-yr window slightly backwards in time. We decided to move it by half a window length, and use 1500 AD ± 250 yrs. This should avoid any significant anthropogenic forcing, and all records collected have data throughout the time window. The influence of this shift is considered to be small in comparison to the climate change we are interested in here. For example, the annual mean temperature difference between the 500-year periods 1251-1750 and 1501-2000 appears to be only ∼0.05 °C for the Northern Hemisphere as a whole, according to data from Mann et al. (2009). For Arctic summer temperatures the corresponding difference is ∼−0.07 °C according to data from Kaufman et al. (2009). We thus regard it as justified, for the purpose of comparing with the PMIP simulations, to use the period 1500 AD ± 250 yrs to represent the modern pre-industrial climate. For notational simplicity, we will denote this time period as 0.5 ka in the following (mainly in formulae, tables and figures). The early period, 6000 yrs BP ± 250 yrs, is denoted 6 ka. Once the time periods had been defined, the first practical obstacle was to handle the vastly varying temporal resolution among the records, in order to calculate comparable averages within the 500-yr time windows. The temporal resolution ranges from roughly one value per 500 years to one value per year. For the records with very low resolution (one or just a few data points within the time windows), it is necessary to use information also from the nearest points outside the windows to ensure that representative within-window averages are calculated.
To achieve this, we undertook a linear interpolation of each record to obtain a 10-yr resolution of all (non-annually resolved) series before calculating the 500-yr averages. For records with uncalibrated radiocarbon ages, the radiocarbon dates were first calibrated into years BP, using the program OXCAL 4.0 (Bronk Ramsey, 2001). From these data we then calculated 500-yr means for the two time windows. To estimate the climate change between the two periods, the 500-yr mean at 0.5 ka was subtracted from the value at 6 ka: $\Delta X = X_{6\,\mathrm{ka}} - X_{0.5\,\mathrm{ka}}$, where $X$ is the 500-yr mean of either temperature ($T$) or precipitation ($P$).
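The interpolation and window-averaging steps described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names are hypothetical, ages are assumed to be in calibrated years BP (relative to AD 1950, so that 1500 AD corresponds to 450 yrs BP), and `np.interp` holds the end values constant outside the dated range, so a record is assumed to cover both windows.

```python
import numpy as np

def window_mean(ages_bp, values, centre_bp, half_width=250.0, step=10.0):
    """Mean over centre_bp +/- half_width after linear interpolation
    of the record to a regular grid (10-yr spacing by default)."""
    ages_bp = np.asarray(ages_bp, dtype=float)
    values = np.asarray(values, dtype=float)
    order = np.argsort(ages_bp)  # np.interp requires increasing x values
    n_points = int(round(2.0 * half_width / step)) + 1
    grid = np.linspace(centre_bp - half_width, centre_bp + half_width, n_points)
    # Points just outside the window contribute through the interpolation.
    return float(np.mean(np.interp(grid, ages_bp[order], values[order])))

def climate_change(ages_bp, values, early_bp=6000.0, late_bp=450.0):
    """Delta X = X(6 ka) - X(0.5 ka); late_bp=450 assumes BP is counted from AD 1950."""
    return (window_mean(ages_bp, values, early_bp)
            - window_mean(ages_bp, values, late_bp))
```

For a low-resolution record with only a handful of dated samples, the same call applies, since the interpolation draws on the nearest samples on either side of each window.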

Sources of uncertainty and estimation of minimum uncertainties
There are several sources of uncertainty in climate reconstruction from proxy data, and thus also in our estimates of climate change between two time windows. A list of uncertainties includes; (a) the statistical uncertainty of the calibration, (b) the assumption that the statistical model used for calibration holds also outside the calibration period, (c) the statistical calibration model may work well for one particular frequency band but not for all frequencies, (d) influences on the proxy from other factors than the target climate variable and (e) uncertainty in the age determinations of proxy samples. These types of uncertainties are general and influence the calculation of climate variability over a range of time scales. In addition to the above listed uncertainties in the reconstructions themselves, we must also consider climate variability at the site level and non-climate related variability in the local proxy record as a source of uncertainty in our estimates of climate change between two distinct time windows. Although this is not an uncertainty in the proxy data as such, it is a factor of importance when comparing the evidence from proxy data with results from forced model simulations. In Part 2 of our study, we have, for each model, one simulation for each time slice and we regard the difference between the two simulated climates as the model's response to the imposed forcing. In the proxy series, however, we can also consider the internal variability (either due to climate variability or to other factors which can be regarded as noise in the proxy) at the 500-yr time scale as a source of uncertainty when we calculate the difference between the two time windows. Even if the forcing would change very smoothly (such as the orbital forcing, which is the primary forcing considered in Part 2), there is certainly some internal variability added to the climate response. 
This gives rise to some variability at the 500-yr time scale in the proxy records, which we have to consider as an uncertainty in the context of model-data comparisons. Hence, we need an estimate of the internal variability of 500-yr climate averages as recorded in the proxy records.
Not all uncertainties in the data we use are possible to quantify given the information at hand, which is the data itself and the information given in the original papers. The quantification of all types of uncertainties in all collected data series appears to be almost an impossible task, and any such attempt is beyond the scope of our investigation. However, it is relatively easy to obtain some quantitative information for three important sources of uncertainty, namely those related to the statistical calibration error, the uncertainty in the age determination of the proxy records, and the internal variability in the local proxy records. The uncertainty associated with the dating of the samples arises from laboratory and field practices and from the limited ability of the dating techniques to provide exact age estimates. Typically, the ages come from radiometric dating techniques (like radiocarbon or uranium series dating) or from the counting of presumably annually laminated deposits such as ice cores and speleothems. However, although the original papers normally provide information about uncertainties in the age determinations, these errors are given in units of time and they cannot straightforwardly be translated into units of a climate variable, because the dating errors may be complex and have multi-modal distributions. On the other hand, if the dating uncertainty is rather small compared to the time scale of interest, then it may be of minor importance. Figure 2 shows the frequency distribution of dating uncertainties in the records used in this study, expressed as 95% confidence intervals for the age (in years). About 3/4 of the records have dating uncertainties smaller than 200 years and nearly half are less than 100 years. Given that we study climate changes between two 500-yr windows, the influence of the dating uncertainty can be considered as relatively small on average, although it can have an important influence for a few records with large dating errors.
Hence, we restrict our analysis to the two remaining types of uncertainty, which can be rather easily quantified in units of climate variables, namely:
1. Uncertainty due to statistical calibration
2. Uncertainty due to internal variability
In our companion study (Zhang et al., 2010), it is essential to combine the two uncertainties for each proxy record into a single measure, which can be used in the cost function calculations. The two uncertainties are assumed to have the variances $\sigma_c^2$ and $\sigma_v^2$, respectively (in the unit of either temperature or precipitation). It appears reasonable to assume that the two types of uncertainty are uncorrelated. Hence they can be merged into a combined uncertainty with variance $\sigma_{\mathrm{comb}}^2 = \sigma_c^2 + \sigma_v^2$. An approximate 95% confidence interval for the reconstructed climate change $\Delta X$ in each proxy that accounts for the combined uncertainty is then $\Delta X \pm 2\sigma_{\mathrm{comb}}$. These combined uncertainty estimates are of course only minimum estimates of the total uncertainty, to which all other types of uncertainty also contribute. Below, we describe how we estimate each of the two uncertainties in the individual proxy records. Some discussion of uncertainty types that are not considered numerically here is additionally provided in Sect. 3.
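As a small illustration of how two uncorrelated uncertainties combine, the following sketch adds the variances and forms the approximate 95% interval as the change plus or minus two combined standard deviations. The function names are hypothetical and the inputs are standard deviations in the unit of the climate variable.

```python
import math

def combined_uncertainty(sigma_c, sigma_v):
    """Standard deviation of the combined uncertainty, assuming the
    calibration and internal-variability terms are uncorrelated."""
    return math.sqrt(sigma_c ** 2 + sigma_v ** 2)

def confidence_interval(delta_x, sigma_c, sigma_v):
    """Approximate 95% interval: delta_x +/- 2 * sigma_comb."""
    half_width = 2.0 * combined_uncertainty(sigma_c, sigma_v)
    return delta_x - half_width, delta_x + half_width
```

A reconstructed change whose interval includes zero cannot be distinguished from no change on the basis of these two uncertainty sources alone.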

Uncertainty due to statistical calibration
Quantitative environmental reconstruction builds on statistical modeling, where often a biological variable, e.g. pollen abundance, is a function of an environmental variable, e.g. temperature. Other examples are δ 18 O in ice or speleothem calcite as a function of temperature, or tree ring width as a function of temperature. It is obviously the climate changes that cause changes in the proxy data, and not the other way around. Nevertheless, from a statistical point of view the problem can be posed in the opposite way. In other words; the observed climate variations may be viewed as a function of the variations observed in the proxy data plus a noise component. The statistical relationship that describes how to translate the proxy data into units of a climate variable is often called a transfer function. Transfer functions can be either univariate or multivariate. They can be defined in the time domain (typically for tree ring data) or by using a space-fortime approach, where it is assumed that modern relationships between proxies and climate over geographical space are the same as those between proxies and a changing climate back in time (typically for pollen and chironomid data). However, they always involve some type of statistical calibration and are always associated with an uncertainty.
The size of the calibration uncertainty is likely mainly influenced by the incapability of the proxy data to perfectly portray past variations of the climate variable of interest, but measurement errors and other errors due to laboratory and field procedures also contribute. This uncertainty is usually reported by the original investigators as a root mean square error of prediction (RMSEP) or a sample specific error of prediction (SSEP) of the reconstructions. These measures, however, may be valid only for the data and the time period for which the calibration was undertaken and may not hold for all time scales. Typically, the original investigations do not provide any analysis of how well the calibration statistics may hold outside the calibration period or for time scales that could not be represented in the calibration procedure. Therefore one must regard the reported RMSEP and SSEP values as minimum estimates of the calibration uncertainty.
Assuming that the calibration uncertainties for the two time windows are uncorrelated, the uncertainties for the two time slices can be combined by simply adding the variances: $\sigma_c^2 = \sigma_{c,6\,\mathrm{ka}}^2 + \sigma_{c,0.5\,\mathrm{ka}}^2$. If there is more than one observation per time slice, however, i.e. if the resolution of the record is higher than one value per 500 years (which it always is in our study), and if the calibration uncertainty is valid for individual estimates, then the variance has to be reduced to account for the number of observations ($N$) in the time slice: $\sigma_c^2 = \sigma_{c,6\,\mathrm{ka}}^2 / N_{6\,\mathrm{ka}} + \sigma_{c,0.5\,\mathrm{ka}}^2 / N_{0.5\,\mathrm{ka}}$. We assume here that the calibration uncertainty is always given for individual estimates and, hence, we always apply the adjustment. A complicating factor, however, emerges if there is significant positive autocorrelation in the proxy time series. If this occurs, the number of "effective" samples is smaller than the actual number of data points and the variance estimate should therefore be inflated. Mitchell, Jr. et al. (1966) introduced a simple way to calculate the effective sample size in meteorological time series: $N_{\mathrm{eff}} = N\,(1-\rho)/(1+\rho)$, where $\rho$ is the lag-1 autocorrelation in the data series (assumed to be positive). In principle, the effective sample size defined in this way could be used here to inflate the variance estimates. However, from a practical point of view this is not straightforwardly applied, because the temporal resolution of many series is not constant in time and therefore it is not obvious what "lag-1" means in this context. Moreover, there is a question of whether to estimate the autocorrelation in only the respective time windows (which would often lead to very few actual data points and thus large uncertainty), or to estimate the autocorrelation from the full time series. We decided not to attempt to estimate the effective sample size, and our estimates of the calibration uncertainty in the 500-yr means must thus be regarded as minimum estimates.
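The calibration variance of the difference between two window means, and the effect that a positive lag-1 autocorrelation would have on the sample size, can be sketched as below. The function names are hypothetical; the reported RMSEP (or SSEP) is assumed to apply to individual estimates, and the effective sample size is shown only for illustration, since it is not applied in this study.

```python
def calibration_variance(rmsep_6ka, n_6ka, rmsep_05ka, n_05ka):
    """Variance of the 6 ka minus 0.5 ka difference due to calibration error,
    with each window's variance reduced by its number of observations."""
    return rmsep_6ka ** 2 / n_6ka + rmsep_05ka ** 2 / n_05ka

def effective_sample_size(n, rho):
    """Effective sample size N * (1 - rho) / (1 + rho) for a positive
    lag-1 autocorrelation rho."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho is assumed to lie in [0, 1)")
    return n * (1.0 - rho) / (1.0 + rho)
```

With an RMSEP of 1 °C and four samples in each window, the calibration variance of the difference is 0.5 °C², i.e. a standard deviation of about 0.7 °C. Replacing `n` by the effective sample size for any positive `rho` gives a larger variance, which is why the unadjusted values are minimum estimates.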

Uncertainty due to internal variability
As argued at the beginning of Sect. 2.3, we need an estimate of the proxy variability at the 500-yr scale when we undertake the model-data comparison in Part 2. We adopt a simple approach to achieve this: first, the linear trend over the entire series between the mid-Holocene and the pre-industrial period is calculated by ordinary least squares regression and subtracted from the data. Then, the variance of 500-yr means is estimated from the residual detrended series. Assuming that there is no covariance in climate between the two time windows, an estimate of the uncertainty due to internal variability can be formulated as $\sigma_v^2 = 2\sigma_{\mathrm{res}}^2$, where $\sigma_{\mathrm{res}}^2$ is the variance of 500-yr means in the detrended proxy series. The factor 2 is included because there is one variance at each of the two time windows, which should be summed. This uncertainty becomes more complicated if there is temporal autocorrelation in the series. If there is a covariance in climate between the two time windows, then a covariance term should be added on the right side. From a practical point of view it is not easy to quantify whether there exists any significant covariation between 500-yr windows separated by six millennia (or actually 5500 years as in our case). The proxy data series are too short to allow any meaningful estimates of such a covariance. And there hardly exist any model simulations long enough to estimate it from model data. It is clear, however, that the 5500-yr lagged covariance of 500-yr means must be smaller than the total corresponding variance, so neglecting the covariance should not be too much of a problem. It is also worth mentioning that Zorita (2009) pointed out that the autocorrelation function for 100-yr means of summer temperature decays to zero (or statistically insignificant values) already for time lags less than a few millennia for the Fennoscandian area, in a 7000-yr long simulation with a coupled atmosphere-ocean general circulation model.
In personal communication with us, he demonstrated that the same holds also for other regions in high northern latitudes, both for winter and summer temperatures. This further supports that autocorrelation between the two time windows is not an important contribution to uncertainty here.
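The detrending approach to internal variability can be sketched as follows. This is an illustrative reading of the procedure rather than the code used in the study: the function name is hypothetical, the series is assumed to be on a regular time grid (for example after interpolation to 10-yr resolution), and the 500-yr means are taken over consecutive non-overlapping blocks, which is one possible choice that the text does not specify.

```python
import numpy as np

def internal_variability_variance(ages, values, window=500.0):
    """Return 2 * var(500-yr means of the linearly detrended series)."""
    ages = np.asarray(ages, dtype=float)
    values = np.asarray(values, dtype=float)
    # Ordinary least squares linear trend over the whole series.
    slope, intercept = np.polyfit(ages, values, 1)
    residuals = values - (slope * ages + intercept)
    # Split the residuals into complete, non-overlapping blocks of one window length.
    step = ages[1] - ages[0]
    per_block = int(round(window / abs(step)))
    n_blocks = residuals.size // per_block
    if n_blocks < 2:
        raise ValueError("need at least two complete windows")
    blocks = residuals[: n_blocks * per_block].reshape(n_blocks, per_block)
    sigma_res_sq = blocks.mean(axis=1).var(ddof=1)
    # One variance for each of the two time windows, covariance neglected.
    return 2.0 * sigma_res_sq
```

A record covering roughly 6000 years yields only about a dozen such blocks, so the variance estimate itself is rather uncertain.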

Regional availability of proxy data
The data screening reveals that temperature reconstructions are far more numerous than precipitation reconstructions (104 vs. 25). Among the temperature reconstructions, there is a large over-representation of summer temperatures (80), while only 17 represent annual mean and 7 represent winter (January or February) temperature. For precipitation, 19 records represent annual and 6 winter conditions. These numbers refer to the interpretations made by the original investigators; see references in Table 1. The geographical distribution of the records is also not uniform (Fig. 1); there is a large bias towards the land areas surrounding the North Atlantic sector (76), especially Fennoscandia (62). Already from this simple summary it is evident that the proxy data are far from evenly distributed in terms of geographical and seasonal representation. This makes it difficult to undertake analyses of, for example, spatial patterns and seasonal differences in climate change in high-latitude areas.
The major part of the reconstructions comes from terrestrial archives (117) and most of these are derived from biological proxies; primarily pollen (67), chironomids (26) and diatoms (10), where the abundance of various species is calibrated using transfer functions that are determined from the distributions of species in modern surface lake or ocean hydrography over a wide geographical range, using the assumption that recent variations in species between different climatic regions are the same as variations caused by changes in climate over time at one and the same site (e.g. Birks and Seppä, 2004). Other terrestrial proxies are tree-ring width (1), δ18O in speleothems (1) and ice core (1), borehole temperature (2) and density of sediment in combination with pollen (5). The 12 records from marine archives are reconstructions of sea surface temperatures (SSTs) from proxies that make use of the chemical composition (alkenone unsaturation index, δ18O of foraminiferal shells) or abundance of planktonic organisms (diatoms and foraminifera). For information on how each proxy type was calibrated to either temperature or precipitation by the original investigators, see references in Tables 1a and b.
There are several reasons for the heterogeneous spatial distribution of proxy records. Depending on the environment there are different types of natural archives available, and thereby there are regional differences in the types of available proxy data. For example, the only large ice sheet is the one in Greenland, and there are no recent lakes in the middle of Greenland; hence there is no access to lacustrine sedimentary proxies from this region, only to ice cores. Also, a certain proxy can be either a better or worse climate indicator depending on the climatic regime. For example, annual mean temperature is probably a more appropriate climatic variable to be reconstructed from pollen data in southern and central Fennoscandia than July temperature, which on the contrary is presumably better reconstructed from pollen at more northern sites with a shorter growing season. Moreover, when performing research aimed at reconstructing past climates, it is desirable to study archives that are as little disturbed by human influence as possible, so that it is primarily a climate signal that is recorded and not effects of human settlement. This is the main reason why most of the data are derived from remote areas, like the mountain chains and Greenland. Another limitation is the availability of training sets for calibration of the proxy data into temperature or precipitation; for example, no modern chironomid training set is available for Russia (Brooks, 2006). Reasons why Fennoscandia stands out as the most well-sampled region could be the availability of regionally restricted calibration sets (e.g. Brooks and Birks, 2001; Birks, 2003) and also the Scandes Mountains providing an environment protected from human influence.
Yet another reason why Fennoscandia is the most well investigated high-latitude region is, likely, that it is the most densely populated high-latitude region, hosting several universities, and thus it is easy for scientists to reach the field sites without too expensive and logistically complicated expeditions.

Climate change between the mid-Holocene and pre-industrial periods
To enable comparison of the estimated uncertainties with the actual differences in climate depicted by the proxy data, it is useful to provide some simple summaries of the reconstructed climate changes in numeric form. To this end, we provide here some overview numbers which include arithmetic means calculated for selected groups of proxy records whenever any meaningful groups can be found. Such groups can be identified by region, season or proxy type. We stress, however, that any arithmetic means provided here are nothing more than simple descriptions of the actual data collection at hand, intended to provide a view of the order of magnitude of the reconstructed climate changes.

Temperature
A large majority of the temperature reconstructions (80/104, i.e. 74%) indicates that temperatures were on average higher (neglecting the quantified uncertainty) at the mid-Holocene than during the pre-industrial period (Fig. 3). Summer is the only season for which the proxy records are sufficiently numerous to allow some comparison of temperature differences in different regions. A cooling by ∼1 °C is seen for both Northern Siberia and Fennoscandia, while the estimated temperature differences over North America and Iceland are ∼0.5 °C. Note that the numbers above refer to simple arithmetic averages over the available proxy records in each region, and they are not assumed to represent the whole regions associated with the chosen geographical names. Note also that there is a large difference in the number of sites and reconstructions across the regions. For Fennoscandia a total of 38 reconstructions at 27 sites have been used. More than 60% of these indicate a summer temperature difference of more than 0.5 °C (i.e. warmer at 6 ka) and almost 40% a difference of more than 1 °C. The changes observed here are in line with previously reported temperature changes recorded from proxy data compilations for Fennoscandia and for central Canada (Viau and Gajewski, 2009). Reconstructions for summer SST are available from diatoms, alkenones, foraminifera or dinocysts, of which about half are from foraminifera. However, the average temperature change recorded by the foraminifera is 0.3 °C, which is less than most of the other marine records indicate. If the foraminifera are excluded, the Nordic Sea region has, according to the collected data, apparently seen a cooling by ∼1.5 °C. One reason for the different reconstructed SST changes between the proxy types is that, in contrast to diatoms and alkenone-producing algae, which live in the upper 50 m of the water column, the foraminifera are found deeper down near the permanent thermocline. Usually this lower part of the ocean is unaffected by the near-surface warming during the summer season (Jansen et al., 2008).
Fig. 3. Map of the estimated difference in temperature (annual, July, January) between 6 ka and 0.5 ka at the different proxy sites over the northern high latitudes.
Northern Siberia is the only region for which the number of proxy records for winter and annual mean temperatures is large enough for any meaningful intra-regional comparison between seasons. In addition to the nine summer temperature proxies, there are four proxies each that are interpreted as winter and annual mean temperature records. These latter reconstructions indicate warmer winter temperatures, by ∼0.3 °C, as well as warmer annual mean temperatures, by ∼0.5 °C, at 6 ka compared to 0.5 ka. Thus, the average temperature change in the winter and annual mean for the Siberian proxies is smaller than the change of ∼1 °C reported above for summer proxies from this region.
According to an unweighted average of ΔT across all seasonally separated proxy series, the climate of the northern high latitudes at 6 ka was ∼0.8 °C warmer in summer, ∼0.5 °C warmer in winter, and ∼1.7 °C warmer in the annual mean, in comparison to 0.5 ka. Winter data, however, are only available from seven sites and the calibration errors alone are on average above 3.5 °C. It is nevertheless noteworthy that the simple average change in reconstructed annual mean temperature is considerably larger than in both winter and summer separately. There may be several possible reasons for this, which we briefly discuss in Sect. 3.6.

Precipitation
Only 19 reconstructions of annual total precipitation are available from the northern high latitudes, and they are derived from 15 different sites. The average difference in annual total precipitation between 6 ka and 0.5 ka amounts to a decrease by ∼40 mm. Due to large uncertainties associated with the precipitation reconstructions (±130 mm for the 2σ_comb averaged across all records; a very rough approximation given the typical non-normal distribution of precipitation data), it is not possible to judge whether any significant average difference in precipitation has occurred at all. For southern Norway there exist six reconstructions of winter precipitation, which indicate that the period around 6 ka received between 20-0% less precipitation in winter compared to 0.5 ka. However, the calibration uncertainties for these reconstructions are ∼20% (Jostein Bakke, personal communication, May 2009), so it is not possible to judge whether any significant difference in winter precipitation has occurred even for this comparatively data-rich region. Clearly, many more precipitation proxies would be needed before we can judge whether any significant precipitation changes have occurred between the mid and late Holocene.
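The reasoning above amounts to asking whether the interval formed by the mean change plus or minus its 2σ uncertainty contains zero. A minimal sketch of that check, using the two numbers quoted in the text, could look as follows; the function name and the sign convention for the change are assumptions made here for illustration.

```python
def interval_excludes_zero(delta, two_sigma):
    """Return True if the interval delta +/- two_sigma does not contain zero."""
    return abs(delta) > two_sigma

# Values quoted in the text: a mean change of about 40 mm and a 2-sigma of about 130 mm.
# The negative sign (a decrease) is an assumed convention; only the magnitude matters here.
precip_change_mm = -40.0
precip_two_sigma_mm = 130.0

precip_significant = interval_excludes_zero(precip_change_mm, precip_two_sigma_mm)
print(precip_significant)  # False: the interval spans zero
```

This simple interval test treats the uncertainty as symmetric, which is itself a rough approximation for precipitation data.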

ΔT in different temperature proxies
The reconstructed difference in summer temperature varies among the different terrestrial proxies. On average, chironomids indicate a cooling between 6 ka and 0.5 ka by ∼0.6 °C, while pollen and diatoms show cooling by ∼0.9 °C and ∼1 °C respectively. In the data-rich Fennoscandian region, the chironomids on average indicate a cooling by ∼0.5 °C, while pollen and diatoms show cooling by ∼1.4 °C and ∼1.1 °C respectively (Fig. 4). Hence, both as an overall average and within Fennoscandia, chironomid records suggest a smaller summer temperature change compared to pollen and diatoms. This leads to questions concerning the causes for the different behaviour of the proxy types. Chironomids and diatoms live in lakes. They are dependent on the actual water temperature of the lake but also on catchment-driven fluctuations in, for example, pH, water depth, nutrients and dissolved oxygen, which may have had a stronger influence on the fauna during certain parts of the Holocene (Brooks, 2006). It is also difficult to convert the water temperature into air temperature, since the water temperature could be affected by, for example, glacier melt water, and this could cause underestimates of past air temperatures (Brooks, 2006). Temperature reconstructions from deep stratified lakes may yield lower temperatures than those from shallow unstratified lakes (Heiri et al., 2003). The potential influence of depth on chironomid and diatom communities is particularly important in view of the changes in lake depth that commonly occur over time. For the pollen-based reconstructions there could be problems with human influence on the vegetation as well as long-distance transported pollen. At Bjørnfjelltjørn in Norway, the chironomid-inferred temperatures consistently underestimate the mean July temperatures when compared to pollen throughout the early and mid-Holocene. Macrofossils from the same period support the pollen-inferred estimates (Brooks, 2006).
In short, the three most numerous proxies (pollen, diatoms and chironomids) are all associated with rather complex processes, which contribute to the total uncertainty in estimated climate changes derived from these types of data, and chironomid-based temperature changes appear to differ somewhat from those based on pollen and diatoms.

Quantitative uncertainty estimates and a brief discussion of uncertainties
The calibration uncertainty (σ_c) lies between 0.2 and 2.5 °C, with an average of 1.1 °C. The largest individual uncertainty is found for the Russian pollen reconstructions (Andreev et al., 2005), while the lowest uncertainty is found for the Canadian pollen records and for the Swedish tree-ring record (Grudd et al., 2002). The calibration uncertainty in the Canadian pollen data is smaller than what can realistically be expected. The original authors, however, offer no discussion of this issue. The reason for the low calibration uncertainty in the tree-ring data is that this series has annual resolution, from which we calculate 500-year averages. Taking simple averages of σ_c for the most abundant proxy types, we find the largest calibration uncertainties in the pollen reconstructions of winter temperature (Fig. 5), with an average of 1.6 °C, while the smallest uncertainties are found for the pollen reconstructions of summer or annual temperature and for diatoms (0.8-0.9 °C on average). One reason for this is likely that the pollen assemblages have a comparatively high correlation with summer temperature in areas with a short growing season, or with annual temperature in areas with a longer growing season (e.g. Seppä et al., 2009). The chironomids have an average uncertainty of 1.2 °C. This uncertainty, which is larger than that for summer pollen and diatoms, can account, at least partly, for why the average summer temperature change is smaller in chironomid data than in pollen and diatom data.
The combined (minimum) uncertainties from calibration and internal variability (expressed as σ_comb) for the estimated ΔT in the individual reconstructions are found to lie between 3.2 and 0.2 °C, with the overall largest uncertainties seen for winter temperature estimates (2.4 to 1.5 °C). When viewed across the various reconstructions, the largest contribution to the combined uncertainty is mostly the calibration uncertainty, while uncertainty due to internal variability is generally smaller (Table 1, Fig. 6).
One conclusion from this brief summary of estimated (minimum) uncertainties is that they are generally large compared to the climate changes depicted by the data. Another conclusion is that the calibration uncertainty is typically larger than the uncertainty in climate change due to internal variability at the 500-yr time scale. The often rather large calibration uncertainty (and other proxy data uncertainties not quantified here) clearly makes it difficult to assess the climate changes at the site level. Thus it would be desirable to reduce the uncertainties if possible. One possible way is to merge several records of different proxy types from within a small region, and use this average to represent the locality. This should reduce the influence of factors other than temperature (or precipitation) and of various types of noise that are uncorrelated across members in a group of proxy records (e.g. Velle et al., 2005; Bjune et al., 2009). We undertake such an averaging in the following section, for a small locality where several nearby records are available.
Another way to reduce uncertainty is to improve calibration techniques. For example, Korhola et al. (2002) used a Bayesian statistical method as an alternative to the traditional weighted averaging partial least squares regression (WA-PLS) method for Lake Toskaljavri in northern Finland, in an attempt to improve the performance of chironomid-based temperature estimates. The temperature trends produced by both models were similar, but the WA-PLS method had about 1 °C higher sample-specific errors. Another source of uncertainty that can affect some proxies is postglacial uplift between the mid-Holocene and today, which could be of great influence, especially for northern Scandinavia. The effect of postglacial uplift could make temperature reconstructions appear warmer during the early Holocene. In some studies, e.g. Rosén et al. (2001), attempts have been made to correct for this.

Fig. 7. Records: (1) Torneträsk, tree-ring width, (2) Vuoskkujavri, chironomids, (3) Vuoskkujavri, diatoms, (4) Vuoskkujavri, pollen, (5) Lake 850, chironomids, (6) Lake 850, diatoms, (7) Lake Njulla, diatoms, (8) Lake Njulla, chironomids, (9) Voulep Njakajaure, diatoms, (10) Lake Tibetanus, pollen. Note that all the reconstructions are reconstructions of July temperature except the tree-ring reconstruction, which is a reconstruction of June-August temperature. See Table 1a for references to the records.

Comparison between multiple reconstructions from a small region in Fennoscandia
As already seen, Fennoscandia is the most data-rich region in our survey. This region includes several reconstructions from the same or nearby sites. There are, however, major deviations between the different reconstructions. In Fig. 7, estimated differences between 6 ka and 0.5 ka for ten reconstructions of summer temperature from the Abisko area (68.33-68.50 °N, 18.07-19.12 °E) in northern Sweden are shown. This area is very small (a few tens of kilometers), and hence the climate must be considered to be almost perfectly correlated within the area at the time scale of interest here. Therefore, it appears meaningful to take an arithmetic average of the calibrated temperature proxy series, and let this composite series represent the site. Obtaining an estimate of the combined (minimum) uncertainty in ΔT calculated from this proxy average, however, requires separate treatment of σ_c and σ_v. The latter uncertainty can be estimated directly from the averaged proxy time sequence of 500-yr means, using Eq. (6), whereas the calibration uncertainty in the averaged proxy series should take the individual calibration uncertainties into account. Assuming that the latter are uncorrelated, a combined uncertainty can be formulated in which the first term is the uncertainty due to internal variability in the averaged series, N_p is the number of proxy series and σ_c,i is the calibration uncertainty of the individual series. (The assumption of uncorrelated calibration uncertainties, however, may not hold entirely if the same training data sets have been used for chironomids, or diatoms, or pollen from different sites. Thus, again, our uncertainty estimate is a minimum one.)
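For orientation, one form consistent with the verbal description above, assuming uncorrelated calibration uncertainties, would be the following. It is a sketch rather than a reproduction of the original equation, and the symbol for the internal-variability term of the averaged series, $\sigma_{v,\mathrm{ave}}$, is introduced here for illustration only.

```latex
\sigma_{\mathrm{comb,ave}}
  = \sqrt{\sigma_{v,\mathrm{ave}}^{2}
          + \frac{1}{N_p^{2}} \sum_{i=1}^{N_p} \sigma_{c,i}^{2}}
```

The second term is the variance of the mean of $N_p$ independent calibration errors. Any additional factors in the original formulation (for example, to account for ΔT being a difference between two time windows) are not represented in this sketch.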
The average ΔT across all ten records depicts a cooling from 6 ka to 0.5 ka by 0.9 °C. Nine of the ten records agree on the cooling, whereas one record suggests a small warming. This record is the only tree-ring based reconstruction in our entire dataset, and its entire 2σ_comb error bar (which is very small because of our use of 500-yr means, while the original calibration uncertainty is given for annual values) lies outside the overall 2σ_comb,ave interval. The authors of the reconstruction (Grudd et al., 2002) point out that the tree-ring width reconstruction does not express the full range of millennial time scale temperature variation in the Torneträsk area. The problem is also briefly discussed by Linderholm et al. (2010), in their review of tree-ring data from Fennoscandia. They argue that multi-millennial temperature trends reconstructed from tree-ring data are not reliable. There is indeed a vast literature on the capability, or incapability, of tree-ring width data to portray low-frequency climate variability (e.g. Cook et al., 1995).
Even if the other nine records suggest that ΔT is positive, seven of their individual 2σ_comb error bars include zero, and hence one may not conclude from any of those seven proxies that summers were significantly warmer at 6 ka. However, merging the ten records into one composite series reduces the estimated uncertainty drastically. The 2σ_comb,ave value is only 0.3 °C and thus the ±2σ_comb,ave interval does not include zero. Neglecting uncertainties that are not included in the estimates, the conclusion is that summer temperatures in the Abisko region were significantly warmer at 6 ka than at 0.5 ka, given the available proxy series from the region. The same conclusion could not have been drawn from most of the individual records alone. This demonstrates the usefulness of having access to several records, which can be merged.
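The effect of merging can be illustrated with a small numerical sketch. It assumes that the calibration uncertainties of the individual records are uncorrelated, so that the calibration variance of the average is the sum of the individual variances divided by the squared number of records, and that this term adds in quadrature to the internal-variability term. The input uncertainties are hypothetical and are not the values behind Fig. 7; only the average cooling of 0.9 °C is taken from the text.

```python
import math

def combined_uncertainty_of_average(sigma_v_ave, sigma_c_list):
    """Combine internal-variability and calibration terms for an averaged series.

    Assumes the individual calibration uncertainties are uncorrelated, so the
    calibration variance of the mean is sum(sigma_c_i**2) / N_p**2.
    """
    n_p = len(sigma_c_list)
    calib_var = sum(s ** 2 for s in sigma_c_list) / n_p ** 2
    return math.sqrt(sigma_v_ave ** 2 + calib_var)

# Hypothetical inputs in deg C (illustrative only).
sigma_c = [1.0] * 10   # ten records, each with calibration uncertainty 1.0
sigma_v_ave = 0.1      # internal variability of the averaged series

sigma_comb_ave = combined_uncertainty_of_average(sigma_v_ave, sigma_c)
sigma_comb_single = combined_uncertainty_of_average(sigma_v_ave, sigma_c[:1])

delta_t_ave = 0.9      # average cooling quoted in the text
composite_significant = abs(delta_t_ave) > 2 * sigma_comb_ave
single_significant = abs(delta_t_ave) > 2 * sigma_comb_single
```

With these placeholder inputs, the same 0.9 °C change lies within the 2σ interval of a single record but outside that of the ten-record composite, which mirrors the qualitative result described above.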

Some further comments on uncertainties in proxy data
As evidenced by our literature survey and data screening, July temperature is the most common climate variable being reconstructed for the mid to late Holocene epoch. However, what is really being reconstructed in these cases is the temperature of the warmest month. Today, for continental sites, this would be July, but at 6 ka it was more likely to be August (Berger, 1979; Zhang et al., 2010). January temperature, or the temperature of the coldest month, can also be reconstructed from pollen data. This is possible because winter climatic conditions are considered to be important for the distribution and regeneration of many plant species, especially those restricted to the most oceanic parts along the west coast of Fennoscandia (Giesecke et al., 2008). When reconstructing past climate, it is assumed that the environmental processes that govern the pattern of e.g. the vegetation or the diatom flora in a lake have been the same for the entire period of the reconstruction (e.g. the Holocene). This is most likely not true and hence adds to the uncertainty. Also, there are undesired influences from far-distance pollen, human influence on vegetation, biological interactions, and identification and morphological limitations that further complicate the quantification of past climatic changes.
Also, a certain proxy can be either a better or worse climate indicator depending on the climatic regime. For example, it has been argued that annual mean temperature is probably a more appropriate climatic variable to be reconstructed from pollen data in southern and central Fennoscandia than July temperature, which, in contrast, is presumably better reconstructed from pollen at more northern sites with a shorter growing season. Further, there are other factors that influence the proxy signal that must be considered; these include, for example, the presence of far-distance pollen, human influence on vegetation, biological interactions, and identification and morphological limitations.
As mentioned in Sect. 3.2.1, according to plain averages over available temperature proxy records, the change in annual mean temperature is notably larger than in winter and summer temperature. There could be several explanations for this behaviour. For example, the temperature difference in spring and autumn may have been larger than in both winter and summer. However, there is a difference in the number and the spatial distribution of annual mean temperature reconstructions compared to summer and winter reconstructions, which complicates a comparison of results for the different seasons. For locations under marine influence, the larger cooling in winter and annual mean compared to summer could perhaps be explained by summer heat uptake by the ocean that is released to the atmosphere during winter; this mechanism would be even more pronounced if the sea-ice cover was less extensive. Otto et al. (2009) have shown that the atmosphere-ocean feedback is the most important in amplifying the effects of the mid-Holocene insolation forcing, especially during the autumn. Albedo effects due to changes in vegetation, causing larger seasonal effects in spring or autumn, may also be considered as a reason. Another possible reason for the large estimated temperature difference seen in the annual mean temperature reconstructions, which cannot be excluded, is that the actual proxy data (predominantly pollen, but also speleothems, oxygen isotopes in ice and borehole temperature measurements in ice) or the transfer functions used to derive the temperature estimates are not sufficiently accurate to permit realistic estimates of past annual mean temperatures. In the case of pollen data in particular, one may perhaps suspect a "seasonal bias" towards summer temperatures that may have too large an influence on the estimated past annual mean temperatures.
It is beyond the scope of this paper to speculate further on this matter, but a more thorough investigation of this problem seems worthwhile. In Part 2 of our study, we find that climate models can show a response to orbital forcing that is larger in annual mean temperatures than in summer and winter temperatures (because the temperature change is particularly strong in autumn), and we discuss possible reasons for this behaviour in the models.

Conclusions
We have undertaken a systematic survey of the literature with respect to quantitative reconstructions of temperature and precipitation from proxy data in the northern high latitudes, and used these records to estimate the difference in temperature and precipitation between time slices in the mid- and late Holocene. We have also discussed sources of uncertainty in the reconstructions and in the estimated climate change between the two periods. There is not sufficient information to quantify all types of uncertainties, but we made an attempt to quantify and combine two uncertainties, of relevance for direct comparisons between proxy evidence and climate model simulations, that can rather easily be quantified from the given information, namely the original articles and the actual data.
Our first finding is that it is a time-consuming task to get access to the data. Very few proxy records are stored in public databases, and personal contacts and digitization of data from figures were the most important ways to obtain the data. This accentuates the need for improved systems for archiving climate proxy data. Without access to data, large-scale syntheses cannot be undertaken.
We also find that the available proxy records have a large over-representation towards summer temperatures, whereas only rather few represent annual mean temperature, winter temperature, annual precipitation and winter precipitation. The geographical distribution of the records is not uniform; there is a large bias towards the land areas surrounding the North Atlantic sector, especially Fennoscandia. The overwhelming majority of the reconstructions are from terrestrial archives, and hence only a few marine records are available.
Some improvement of the spatial density of data could be made by including local proxy records that have not (yet) been published as local series, but only included in regional compilations. Two such examples are the large-scale continental reconstructions for Europe (Davis et al., 2003) and North America (Viau et al., 2006). However, the inclusion of individual site-specific records from such compilations requires even more personal contacts than those undertaken here; thus further accentuating the need for data archiving and publishing.
A large majority of the temperature reconstructions investigated here indicate that temperatures were warmer at the mid-Holocene (6000 BP ± 500 yrs) compared to the pre-industrial period (1500 AD ± 500 yrs), in summer, winter and the annual mean. By taking simple arithmetic averages over the available data, the reconstructions indicate that the northern high latitudes were 0.9 °C warmer in summer, 0.5 °C warmer in winter and 1.7 °C warmer in the annual mean temperature at the mid-Holocene (6 ka) compared to the recent pre-industrial. Precipitation records are too few, and uncertainties too large, to draw any meaningful conclusions regarding whether climate was wetter or drier in one or the other of the periods.
Uncertainties in reconstructed temperatures at the individual site level are generally rather large. We estimated the contribution from calibration uncertainty (as reported by the original investigators) to the reconstructed temperature change between the selected time windows, and we find that the site-level calibration uncertainty alone is often larger than the reconstructed climate change, implying that it is often not possible to conclude whether any significant change has occurred locally. If we add to this the uncertainty due to internal variability at the time scale corresponding to the time window length, we obtain a combined (minimum) measure of uncertainty in climate change of relevance for comparison with forced model simulations. This internal variability is typically smaller than the calibration uncertainty, but certainly not negligible for model-data comparisons where the response to a particular climate forcing is in focus.
In many regions the density of proxy records is low, or records are even non-existent, but in some areas there are several neighbouring proxies from a small region. In such cases, neighbouring records can be merged into a composite record representing the locality. We demonstrated, for a data-rich small area in northern Sweden, that by doing so, the combined influence from calibration errors and internal variability is reduced drastically, and significant cooling of summer temperatures between the mid-Holocene and pre-industrial periods could be concluded. A caveat, though, is that the uncertainty estimates calculated here are only minimum estimates, as not all factors contributing to the total uncertainty can be quantified.
The challenge of producing reliably inferred climate reconstructions for the Holocene should not be underestimated, considering the fact that the estimated temperature and precipitation fluctuations during this period are similar in magnitude to, or lower than, the uncertainties of the reconstructions. Further, there are sometimes large discrepancies between different reconstructions from the same area. For the future there is a great need to reduce the errors of the reconstructions. One way to do this could be to produce training sets for Russia and other relevant regions, where there is a current lack of data. It is also essential to improve our understanding of how different proxies respond to changes in environmental variables. Fennoscandia is the most data-rich region in the northern high latitudes, and in order to make pan-arctic analyses, more data from other regions need to be collected.
A better understanding regarding the reasons for the observed differences can also be obtained by systematic quantitative comparisons between the observations seen in proxy data and those seen in climate model simulations. Such comparisons are undertaken in the companion paper by Zhang et al. (2010).