A chironomid-based mean July temperature inference model from the south-east margin of the Tibetan Plateau, China

A chironomid-based calibration training set comprising 100 lakes from south-western China was established. Multivariate ordination analyses were used to investigate the relationship between the distribution and abundance of chironomid species and 18 environmental variables from these lakes. Canonical correspondence analyses (CCAs) and partial CCAs showed that mean July temperature is one of the independent and significant variables and explains the second-largest amount of variance after potassium ions (K⁺) in the 100 south-western Chinese lakes. Quantitative transfer functions were created using the chironomid assemblages for this calibration data set. The second component of the weighted-average partial least squares (WA-PLS) model produced a coefficient of determination (r² bootstrap) of 0.63, a maximum bias (bootstrap) of 5.16 and a root-mean-square error of prediction (RMSEP) of 2.31 °C. We applied the transfer functions to a 150-year chironomid record from Tiancai Lake (26°38′3.8″ N, 99°43′ E; 3898 m a.s.l.), Yunnan, China, to obtain mean July temperature inferences. We validated these results by applying several reconstruction diagnostics and comparing them to a 50-year instrumental record from the nearest weather station (26°51′29.22″ N, 100°14′2.34″ E; 2390 m a.s.l.). The transfer function performs well in this comparison. We argue that this large 100-lake training set is suitable for reconstruction work despite the low explanatory power of mean July temperature because it contains a complete range of modern temperature and environmental data for the chironomid taxa observed and is therefore robust.


Introduction
South-western China is an important region for examining changes in low- and mid-latitude atmospheric circulation in the Northern Hemisphere. It lies at the intersection of the influence of the Northern Hemisphere westerlies and two tropical monsoon systems, namely the Indian Ocean south-west monsoon (IOSM) and the East Asian monsoon (EAM), and should be able to inform us about changes in both the latitude and longitude of the influence of these respective systems through time. Reconstructing changes in circulation requires information about several climatic parameters, including past precipitation and temperature. While there are reasonable records of precipitation from this region (e.g. Wang et al., 2001, 2008; Dykoski et al., 2005; Xiao et al., 2014), there is a paucity of information about temperature changes. In order to understand the extent and intensity of penetration of monsoonal air masses, robust summer temperature estimates are vital, as this is the season in which the monsoon penetrates south-western China.
Published by Copernicus Publications on behalf of the European Geosciences Union. E. Zhang et al.: A chironomid-based mean July temperature inference model
Chironomid larvae are frequently the most abundant insects in freshwater ecosystems (Cranston, 1995) and subfossil chironomids are widely employed for palaeoenvironmental studies due to their sensitivity to environmental changes and the ability of the head capsules to preserve well in lake sediments (Walker, 2001). A strong relationship between chironomid species assemblages and mean summer air temperature has been reported from many regions around the world and transfer functions were subsequently developed (e.g. Brooks and Birks, 2001; Larocque et al., 2001; Heiri et al., 2003; Gajewski et al., 2005; Barley et al., 2006; Woodward and Shulmeister, 2006; Langdon et al., 2008; Rees et al., 2008; Eggermont et al., 2010; Luoto, 2009; Holmes et al., 2011; Heiri et al., 2011; Chang et al., 2015a). The application of these transfer functions has provided quantitative temperature data since the last glacial period in many regions of the world (e.g. Woodward and Shulmeister, 2007; Rees and Cwynar, 2010; Samartin et al., 2012; Chang et al., 2015b; Muschitiello et al., 2015; Brooks et al., 2016). Consequently, subfossil chironomids have been the most widely applied proxy for past summer temperature reconstructions.
Merged regional chironomid training sets and combined inference models have been developed in Europe (Lotter et al., 1999; Holmes et al., 2011; Heiri et al., 2011; Luoto et al., 2014). These large data sets and models provide much more robust reconstructions than smaller local temperature inference models (Heiri et al., 2011; Luoto et al., 2014). However, the distribution of large regional inference models is limited to Europe and northern North America (e.g. Fortin et al., 2015). There is a need to build large training sets for other parts of the world where chironomids will likely be sensitive to temperature changes. Subfossil chironomids have been successfully used as palaeoenvironmental indicators in China for over a decade. These studies include salinity studies on the Tibetan Plateau (Zhang et al., 2007) and the development of a nutrient-based inference model for eastern China and parts of Yunnan (Zhang et al., 2006, 2010, 2011, 2012). A large database of relatively undisturbed lakes, in which nutrient changes are minimal while temperature gradients are suitably large, is now available from south-western China and this provides the opportunity to develop a summer temperature inference model for this broad region.
In this study, a chironomid species assemblage training set and chironomid-based mean July air temperature (MJT) inference models from 100 lakes on the south-east margin of the Tibetan Plateau are developed. We test and validate the selected transfer function models by applying them to a sediment core collected from Tiancai Lake (26°38′3.8″ N, 99°43′ E; 3898 m a.s.l.) (Fig. 1) in Yunnan Province, south-western China, for the last 120 years against a 50-year instrumental record from Lijiang weather station (26°51′29.22″ N, 100°14′2.34″ E; 2390 m a.s.l.) (Fig. 1), which is the closest meteorological station with the longest record.

Regional setting
The study area lies on the south-east margin of the Tibetan Plateau and includes the south-west part of Qinghai Province, the western part of Sichuan Province and the north-west part of Yunnan Province (Fig. 1). It is situated between 26 and 34° N and between 99 and 104° E, with elevations ranging from about 1000 m to above 5000 m a.s.l. The study area is characterized by many north-south-aligned high mountain ranges (e.g. Hengduan Mountains, Daxue Mountains, Gongga Mountains) that are fault controlled and fall away rapidly into adjacent tectonic basins. The mountain ranges have been deeply dissected by major rivers including the Nujiang, Lancang Jiang, Jinsha Jiang, Yalong Jiang and Dadu rivers. Local relief in many places exceeds 3000 m.
The climate of the study area is dominated by the westerlies in winter and by the IOSM in Yunnan and Tibet, but some of the easternmost lakes are affected by the EAM. There is a wet season that extends from May (June) to October and accounts for 85–90 % of total rainfall, and a dry season from November to April. Annual precipitation varies greatly according to altitude and latitude. Most of the precipitation is derived from a strong south-west summer monsoonal flow that emanates from the Bay of Bengal (Fig. 1). Precipitation declines from south-east to north-west. Mean summer temperatures vary between about 6 and 22 °C from the north-west to the south-east (Institute of Geography, Chinese Academy of Sciences, 1990). Vegetation across the study area changes from warm temperate to subtropical rainforest at lower elevations in the south-west to alpine grasslands and herb meadows at high altitude.

Description of model validation site
Tiancai Lake (26°38′3.8″ N, 99°43′ E; 3898 m a.s.l.) (Fig. 1) is in Yunnan Province, on the south-east margin of the Tibetan Plateau. It is a small alpine lake with a maximum depth of 7 m, a water surface area of ∼ 2.1 ha and a drainage area of ∼ 3 km². Tiancai Lake is dominated in summer by the IOSM and most likely retains a tropical airflow in winter, as the climate is remarkably temperate for this altitude. The mean annual and July air temperatures are approximately 2.5 and 8.4 °C, respectively, and the annual precipitation is modelled as > 910 mm (Xiao et al., 2014). The lake is charged by three streams and directly by precipitation, and drains into a lower alpine lake via a stream. The most common rock type in the catchment is a quartz-poor granitoid (syenite). Terrestrial vegetation in the catchment consists mainly of conifer forest comprising Abies sp. and Picea sp. with an understory of Rhododendron spp. Above the tree line, at about 4100 m a.s.l., Ericaceae shrubland (rhododendrons) gives way to alpine herb meadow and rock screes.

Field and laboratory analysis
Surface sediment samples were collected from 100 lakes on the south-east margin of the Tibetan Plateau via six field campaigns during autumn of each year between 2006 and 2012. The lakes in this area are mainly distributed at the top or upper slopes of the mountains and are primarily glacial in origin. Most lakes were reached by hiking or with horses and the lake investigation spanned several seasons. Small lakes (surface area ∼ 1 km²) were the primary target for sampling but some larger lakes were also included.
Surface sediments (0–1 cm) were collected from the deepest point in each lake after a survey of the bathymetry using a portable echosounder. Surface sediment samples were taken using a Kajak gravity corer (Renberg, 1991). The samples were stored in plastic bags and kept refrigerated at 4 °C before analysis. A 30 cm short core was extracted from the centre of Tiancai Lake at a water depth of 6.8 m using a UWITEC gravity corer in 2008. The sediment core was subsampled at 0.5 cm contiguous intervals and refrigerated at 4 °C prior to analysis.
Water samples were collected for chemical analysis from 0.5 m below the lake surface immediately before the sediment samples were obtained. Water samples for chemical analysis were stored in acid-washed polythene bottles and kept at 4 °C until analysis. Secchi depth was measured using a standard transparency disc. Conductivity, pH and dissolved oxygen (DO) were measured in the field using a HI-214 conductivity meter, a Hanna EC-214 pH meter and a JPB-607 portable DO meter, respectively. Chemical variables for the water samples included total phosphorus (TP), total nitrogen (TN), chlorophyll a (chl a), K⁺, Na⁺, Mg²⁺, Ca²⁺, Cl⁻ and SO₄²⁻.

Chironomid analyses
One hundred surface sediment samples from lakes of south-western China and 55 subsamples from the Tiancai Lake short core were analysed for chironomids following standard methods (Brooks et al., 2007). The sediment was deflocculated in 10 % potassium hydroxide (KOH) in a water bath at 75 °C for 15 min. The samples were then sieved at 212 µm and 90 µm and the residue was examined under a stereo-zoom microscope at ×25. Chironomid head capsules were hand-picked using fine forceps. All the head capsules found were mounted on microscope slides in a solution of Hydromatrix®. Samples that produced fewer than 50 head capsules were not included in the subsequent analyses (Quinlan and Smol, 2001). The chironomid head capsules were identified mainly following Wiederholm (1983), Oliver and Roussel (1982), Rieradevall and Brooks (2001), Brooks et al. (2007) and a photographic guide provided in Tang (2006).

Numerical analysis
A range of numerical methods was used to determine the relative influence of the measured environmental parameters on the distribution and abundance of chironomids in the surface sediments within the training set. A total of 18 environmental variables were considered in the initial statistical analyses (Table 1). These measurements were normalized using a log10 transformation prior to ordinations following a normality assessment of each data set. Chironomid species were used in the form of square-root-transformed percentage data in all statistical analyses. The ordinations were performed using CANOCO version 4.5 (ter Braak and Šmilauer, 2002). A detrended correspondence analysis (DCA; Hill and Gauch, 1980) with detrending by segments and nonlinear rescaling was used to explore the chironomid distribution pattern. The DCA was also used to identify the gradient length within the chironomid data and hence whether unimodal analyses were appropriate (ter Braak, 1987). Canonical correspondence analysis (CCA), with rare taxa down-weighted (taxa with a maximum abundance of less than 2 % and/or occurring in fewer than two lakes, i.e. Hill's N2 < 2), forward selection and Monte Carlo permutation tests (999 unrestricted permutations), was then used to identify the statistically significant (p < 0.05) variables influencing the chironomid distribution and abundance (ter Braak and Šmilauer, 2002). A preliminary CCA with all 18 variables was used to identify redundant variables, reducing excessive collinearity among variables (Hall and Smol, 1992); i.e. the environmental variable with the highest variance inflation factor (VIF) was removed after each CCA and the CCA was repeated until all VIFs were less than 20 (ter Braak and Šmilauer, 2002). In addition, we used stepwise selection based on pseudo-F to aid the variable selection process. Only the remaining significant (p < 0.05) variables were included in the final CCA ordination. The relationship between the significant environmental variables and ordination axes was assessed with canonical coefficients and the associated t values of the environmental variables with the respective axes. CCA biplots of sample and species scores were generated using CanoDraw (ter Braak and Šmilauer, 2002). Partial canonical correspondence analyses (pCCAs) were applied to test the direct and indirect effects of each of the significant variables in relation to the chironomid species data. These were performed for each of the significant variables with and without the remaining significant variables included as covariables. Environmental variables that retained their significance after all pCCAs were selected for use in the analyses, as they are the independent variables.
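The iterative VIF screening described above can be sketched as follows. This is a minimal illustration only: it uses ordinary, unweighted least-squares VIFs computed with NumPy, whereas CANOCO computes VIFs within the constrained ordination, so values will not match the software exactly. The function names `vif` and `drop_collinear` are hypothetical and are not taken from the study.

```python
import numpy as np

def vif(X):
    """Unweighted variance inflation factor for each column of X (sites x variables)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        # Regress variable j on all other variables plus an intercept.
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        out[j] = np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return out

def drop_collinear(X, names, threshold=20.0):
    """Remove the variable with the highest VIF until all VIFs fall below threshold."""
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:
        v = vif(X)
        worst = int(np.argmax(v))
        if v[worst] < threshold:
            break
        X = np.delete(X, worst, axis=1)
        names.pop(worst)
    return names
```

The threshold of 20 follows the text; the one-variable-at-a-time removal mirrors the repeated CCA runs, although here each "run" is only a set of auxiliary regressions.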
Chironomid-based transfer functions were developed for mean July temperatures using C2 version 1.5 (Juggins, 2005) for the calibration data set comprising 100 lakes. The models were constructed using algorithms based on weighted averaging (WA) and weighted-average partial least squares (WA-PLS) (Birks, 1995). The bootstrap cross-validation technique was used for the data set because it was previously demonstrated to be more suitable for large data sets than the jackknife technique (Heiri et al., 2011). Transfer function models were evaluated based on the coefficient of determination (r²boot), average bias of predictions, maximum bias of predictions and root-mean-square error of prediction (RMSEPboot). The number of components included in the final model was selected based on reducing the RMSEP by at least 5 % (Birks, 1998). In addition, instead of using 5 % as a simple threshold, we also performed a t test to check whether the additional component of the WA-PLS model significantly improved model performance.
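For readers unfamiliar with weighted averaging, the following is a minimal sketch of a WA model with inverse deshrinking, written with NumPy. It is not the C2 implementation used in the study and omits tolerance down-weighting, WA-PLS components and bootstrap cross-validation. The function names are hypothetical, and the code assumes every taxon and every sample has a non-zero total abundance.

```python
import numpy as np

def wa_fit(Y, x):
    """Fit a weighted-averaging model.

    Y: sites x taxa abundance matrix; x: observed temperature at each site.
    Returns taxon optima and inverse-deshrinking coefficients (b0, b1).
    """
    Y = np.asarray(Y, dtype=float)
    x = np.asarray(x, dtype=float)
    # Taxon optimum: abundance-weighted mean of the temperatures where it occurs.
    optima = (Y * x[:, None]).sum(axis=0) / Y.sum(axis=0)
    # Initial site estimate: abundance-weighted mean of the optima present.
    x_init = (Y * optima).sum(axis=1) / Y.sum(axis=1)
    # Inverse deshrinking: regress observed values on the initial estimates.
    b1, b0 = np.polyfit(x_init, x, 1)
    return optima, b0, b1

def wa_predict(Y, optima, b0, b1):
    """Infer temperature for new (e.g. fossil) samples."""
    Y = np.asarray(Y, dtype=float)
    x_init = (Y * optima).sum(axis=1) / Y.sum(axis=1)
    return b0 + b1 * x_init

def rmse(pred, obs):
    """Apparent root-mean-square error (not a cross-validated RMSEP)."""
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean((pred - obs) ** 2)))
```

A cross-validated RMSEP would be obtained by repeating `wa_fit` on bootstrap resamples of the lakes and evaluating `wa_predict` on the lakes left out of each resample; the apparent `rmse` above will generally be more optimistic.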
The transfer function models were then applied to the fossil chironomid data from Tiancai Lake. MJTs were reconstructed from the site and three types of reconstruction diagnostics suggested in Birks (1995) were applied to assess the reliability of the results. These include goodness-of-fit, the modern analogue technique (MAT) and the percentage (%) of modern rare taxa in the fossil samples. For the goodness-of-fit analysis, the squared residual length (SqRL) was calculated by passively fitting fossil samples to the CCA ordination axis of the modern training set data constrained to MJT in CANOCO version 4.5 (ter Braak and Šmilauer, 2002). Fossil samples with a SqRL to axis 1 higher than the extreme 10 and 5 % of all residual distances in the modern calibration data set were considered to have a "poor" and "very poor" fit with MJT, respectively. The chi-square distance to the closest modern assemblage data for each fossil sample was calculated in C2 (Juggins, 2005) using the MAT. Fossil samples with a chi-square distance to the closest modern sample larger than the fifth percentile of all chi-square distances in the modern assemblage data were identified as samples with "no good" analogue. The percentage of rare taxa in the fossil samples was also calculated in C2 (Juggins, 2005), where a rare taxon has a Hill's N2 < 2 in the modern data set (Hill, 1973). Fossil samples that contain > 10 % of these rare taxa were likely to be poorly estimated (Brooks and Birks, 2001). Finally, the chironomid-based transfer-function-inferred MJT patterns were compared to the instrumental data recorded at Lijiang weather station between the years of 1951 and 2014.
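The rare-taxa diagnostic above can be illustrated with a short sketch. Hill's N2 for a taxon is the reciprocal of the sum of squared proportions of its total abundance across the modern samples, i.e. an effective number of occurrences. The code below is an illustration in NumPy and not the C2 routine; the function names are hypothetical, fossil and modern matrices are assumed to share the same taxon columns, and taxa absent from the modern set are treated as rare (an assumption made here for simplicity).

```python
import numpy as np

def hills_n2(Y):
    """Hill's N2 (effective number of occurrences) for each taxon.

    Y: sites x taxa abundance matrix from the modern training set.
    Taxa with zero total abundance are assigned N2 = 0.
    """
    Y = np.asarray(Y, dtype=float)
    totals = Y.sum(axis=0)
    p = np.divide(Y, totals, out=np.zeros_like(Y), where=totals > 0)
    s = (p ** 2).sum(axis=0)
    return np.divide(1.0, s, out=np.zeros_like(s), where=s > 0)

def percent_rare(fossil, modern, n2_cutoff=2.0):
    """Percentage of each fossil sample made up of taxa that are rare in the modern set."""
    rare = hills_n2(modern) < n2_cutoff
    fossil = np.asarray(fossil, dtype=float)
    return 100.0 * fossil[:, rare].sum(axis=1) / fossil.sum(axis=1)
```

Fossil samples for which `percent_rare` exceeds 10 would be flagged as likely to be poorly estimated, following the criterion cited in the text.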

Chronology for Tiancai Lake core
The top 28 cm of the sediment core recovered from Tiancai Lake was used for ²¹⁰Pb dating. Sediment samples were dated using ²¹⁰Pb and ¹³⁷Cs by non-destructive gamma spectrometry (Appleby and Oldfield, 1992). Samples were counted on an Ortec HPGe GWL series well-type coaxial low-background intrinsic germanium detector to determine the activities of ²¹⁰Pb, ²²⁶Ra and ¹³⁷Cs. A total of 58 samples at an interval of every 0.5 cm were prepared and analysed at the Nanjing Institute of Geography and Limnology, Chinese Academy of Sciences. Sediment chronologies were calculated using a composite model (Appleby, 2001). ¹³⁷Cs was used to identify the 1963 nuclear weapons peak, which was then used as part of a constant rate of supply (CRS) model to calculate a ²¹⁰Pb chronology for the core.
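As an illustration of the CRS principle, the age at a given depth follows from the ratio of the total unsupported ²¹⁰Pb inventory to the inventory remaining below that depth, t = (1/λ) ln(A(0)/A(x)). The sketch below implements only this basic relation; it does not include the composite correction using the 1963 ¹³⁷Cs marker that the study applied. The function name, the input units and the half-life value of 22.3 years are assumptions made for the example.

```python
import numpy as np

PB210_HALF_LIFE_YR = 22.3                  # assumed half-life of 210Pb in years
DECAY_CONST = np.log(2) / PB210_HALF_LIFE_YR

def crs_ages(unsupported_activity, dry_mass):
    """Basic CRS ages (years before coring) at the base of each slice.

    unsupported_activity: unsupported 210Pb activity per slice (Bq/kg), top first.
    dry_mass: dry mass per unit area of each slice (kg/m2).
    The deepest slice has no inventory below it, so its age is returned as inf.
    """
    inv = np.asarray(unsupported_activity, float) * np.asarray(dry_mass, float)
    # Inventory below the base of each slice (exactly zero for the last slice).
    below = np.concatenate([np.cumsum(inv[::-1])[::-1][1:], [0.0]])
    total = below[0] + inv[0]
    with np.errstate(divide="ignore"):
        return np.log(total / below) / DECAY_CONST
```

In practice the unsupported activity is the total ²¹⁰Pb activity minus the ²²⁶Ra-supported fraction, and the inventory below the deepest measured slice has to be estimated rather than set to zero.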

Ordination analyses and model development
The DCA performed on the 100 lakes and 85 non-rare chironomid taxa had an axis 1 gradient length of 3.033, indicating that a CCA approach was appropriate for modelling the chironomid taxon response (Birks, 1998). The 18 environmental variables were tested in the initial CCA and the results showed that total dissolved solids had the highest VIF. It was therefore removed from the following CCAs. Seven of the remaining variables had significant (p < 0.05) explanatory power with respect to the chironomid species data. A total of 14.6 % of variance was explained by the four CCA axes with the seven significant variables included, and the first two axes explained 10 % of the total variance. Of these variables, conductivity and K⁺ were significantly correlated (p < 0.01) with CCA axis 1, and conductivity, depth, Cl⁻ and MJT showed a significant correlation (p < 0.01) with CCA axis 2 (Table 2, Fig. 3a, b, based on the t values). Potassium ions (K⁺) explained the largest variance in the chironomid species data and showed the strongest correlation with CCA axis 1. MJT and conductivity explained equally the second-largest amount of variance (4.4 %), where MJT was significantly correlated with CCA axis 2 and conductivity was significantly correlated with both axes 1 and 2 (Table 2). The pCCAs (Table 3) demonstrated that, among the significant variables, K⁺, MJT, Cl⁻, LOI and depth retained their significance (p < 0.01) when the other variables were included as covariables. Potassium ions (K⁺) are the independent variable that dominates the first CCA axis. MJT and Cl⁻ are the independent variables dominating the second CCA axis, but MJT has an overall higher explanatory power (Table 2).
A biplot of the CCA species scores indicated that taxa such as Heterotrissocladius marcidus-type and Tanytarsus lugens-type had a significant amount of variance explained by the first two CCA axes and were negatively correlated with CCA axis 1. Taxa including Polypedilum nubeculosum-type and Chironomus plumosus-type were positively correlated with CCA axis 1, with a significant amount of variance explained by CCA axes 1 and 2. A biplot of the CCA sample scores showed that a major proportion of sites clustered around depth (Fig. 3b), although depth explains only 2 % of the total variance in the chironomid data. The transfer functions were developed for MJT. We acknowledge that MJT is not the sole independent variable on CCA axis 2 in the data set, but transfer functions based on this large regional data set are created and applied to reconstruct MJT because it is a more useful parameter than K⁺ and Cl⁻. Both WA and WA-PLS models were tested in the modern calibration set. Summary statistics of inference models based on these two different numerical methods are listed in Table 4a. As expected, the bootstrapped WA with inverse deshrinking (WAinv) and WA-PLS models generated similar statistical results for the calibration training set. The WAinv model produced an r²boot of 0.61, AveBiasboot of 0.06, MaxBiasboot of 5.30 and RMSEP of 2.30 °C (Table 4a). We selected the second component of the WA-PLS bootstrap model as it is more robust according to the t test results (Table 4b). It produced an r²boot of 0.63, AveBiasboot of 0.101, a lower MaxBiasboot of 5.16 and RMSEP of 2.31 °C. Figure 4c and d show the chironomid-inferred versus observed MJT and the distribution of prediction residuals for the above transfer function models, respectively.
www.clim-past.net/13/185/2017/ Clim. Past, 13, 185–199, 2017

Reconstructions from Tiancai Lake
A total of 55 subsamples were analysed for chironomid taxa throughout the top 28 cm of the core recovered from Tiancai Lake. There were 41 non-rare (Hill's N2 > 2) taxa present (Fig. 2b). The general assemblages of these 55 subsamples include Heterotrissocladius marcidus-type, Tvetenia tamafalva-type, Micropsectra insignilobus-type, Corynoneura lobata-type, Paramerina divisa-type, Micropsectra radialis-type, Paratanytarsus austriacus-type, Thienemanniella clavicornis-type, Eukiefferiella claripennis-type, Rheocricotopus effusus-type, Macropelopia, Pseudodiamesa and Procladius (Fig. 2b). All the taxa identified from this record were well represented in the modern calibration training set, and most of them were recognized as cold stenotherms (Fig. 2a). We acknowledge that some of the lotic taxa may result in poor temperature estimates when applying the transfer function; therefore, reconstruction diagnostics were necessary.
The ²¹⁰Pb dating results demonstrated that the top 28 cm of the short core recovered from Tiancai Lake represent the last ∼ 150 years (Fig. 5). We applied both new transfer function models (WA and WA-PLS based on 100 lakes) to reconstruct the MJT changes between 1860 and 2008 AD. The RMSEPs of WA-PLS C2 and WA-PLS C1 are different (Fig. 6a). The WA and WA-PLS models showed identical trends in the MJT reconstructions over the last ∼ 150 years (Fig. 6a). There were small deviations in terms of absolute values, but the variations in the reconstructed MJT between the two models were within 0.1 °C for each sample (Fig. 6a). Goodness-of-fit analysis on the reconstruction results showed that, out of the 55 fossil samples, eight samples from the years between 2000 and 2007 AD have a "poor" or "very poor" fit to MJT (Fig. 6b). The modern analogue analysis showed that only four fossil samples have "no good" analogues in the 100-lake data set (Fig. 6c). All 55 fossil samples contain less than 10 % of the taxa that were rare in the modern 100-lake training set (Fig. 6d). Finally, the reconstructed results also showed a comparable MJT trend and a statistically significant correlation (p < 0.05, r = 0.45, n = 31) with the instrumentally measured data between 1951 and 2007 AD from Lijiang weather station (Fig. 6e).
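The reported significance of the correlation can be checked with a short calculation. The sketch below assumes an ordinary two-tailed test of a Pearson correlation with n − 2 degrees of freedom and no adjustment for autocorrelation; the source does not state which test was used, so this is an assumption. The critical value 2.045 is the standard two-tailed 5 % value of Student's t for 29 degrees of freedom.

```python
import math

def pearson_t(r, n):
    """t statistic for testing a Pearson correlation r from n pairs against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

# Values reported in the text: r = 0.45, n = 31.
t_stat = pearson_t(0.45, 31)          # about 2.71
T_CRIT_DF29 = 2.045                   # two-tailed 5 % critical value, 29 d.f.
significant = abs(t_stat) > T_CRIT_DF29
```

Under these assumptions the t statistic of about 2.71 exceeds the critical value, consistent with the reported p < 0.05.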

Reliability of the environmental and chironomid data
Obtaining reliable estimates of the modern climate data has been challenging in south-western China. There are very few meteorological stations, and climate monitoring in the high mountains of our study area is virtually non-existent. Climate parameters including mean July temperatures and mean annual precipitation used in this study are interpolated from climate surfaces derived from a mathematical climate surface model based on the limited meteorological data and a digital terrain model (DTM) applied to the whole of the wider Tibetan region (400 × 3000 km) (Böhner, 2006). We acknowledge that there are limitations in these data due to the sparse distribution of observations from meteorological stations. Modelling precipitation in topographically complex parts of this region such as Yunnan is problematic due to the orographic interception (or non-interception) of monsoonal air masses upwind of the sites, but the scale of the DTM means that mean temperature data should be reasonably robust, except in the most topographically complex areas. Further meteorological observations are required to refine this and other studies. We suspect that these data limitations contribute to the relatively low transfer function model coefficient (r²boot).
We examined the chironomid taxa that appeared as temperature indicators in the calibration set. A number of taxa, namely Pseudodiamesa, Pseudosmittia and Corynoneura lobata-type, emerge as cold stenotherms. Further examination shows that these three taxa are all likely lotic (Cranston, 2000). These taxa would possibly have been washed into the lakes from streams and therefore it is not appropriate to make temperature inferences based on them. We also observed that another cold stenotherm, Tanytarsus gracilentus-type, is closely related to lake depth, while both Tvetenia tamafalva-type and Micropsectra show closer correlation with LOI and Cl⁻ in the CCA biplot (Fig. 3a). These observations match the ecological recognition and interpretation of these taxa in the literature, where Tanytarsus gracilentus-type was identified as a benthic species in the Arctic and is sometimes found in temperate shallow eutrophic ponds (Einarsson et al., 2004; Ives et al., 2008). Tvetenia tamafalva-type was often found in streams and this is likely related to the organic content (LOI) of the substrates, as they are detritus feeders (Brennan and McLachlan, 1979), while Micropsectra was found in thermal springs and pools (Hayford et al., 1995; Batzer and Boix, 2016), and this is reflected in this data set by its close relationship with Cl⁻. It is present in lakes such as Tengchongqinghai, Qicai and Haizibian, which have high levels of Cl⁻ ions. These sites are located in the geothermal spring region of Sichuan and Yunnan provinces. Well-known warm stenotherms that are distributed along the MJT gradient of the CCA species biplot (Fig. 3a) include Dicrotendipes, Microchironomus, Polypedilum and Microtendipes. Many studies (e.g. Walker et al., 1991; Larocque et al., 2001; Rosenberg et al., 2004; Brodersen and Quinlan, 2006; Woodward and Shulmeister, 2006) show that these taxa are warm temperature indicators worldwide. We therefore argue that this large calibration training set contains a relatively complete range of temperatures and environments expected to have been experienced by lakes and their chironomid fauna in the past (Brooks and Birks, 2001). This will be particularly useful when applying the models to reconstruct changes in the late Pleistocene and Holocene, when climates were different (Heiri et al., 2011).
This 100-lake training set covers a temperature gradient ranging from 4.2 to 20.8 °C (MJT gradient of 16.6 °C). Based on the CCAs, we observed that the MJT signal in this larger training set is partially masked by a salinity gradient. This is represented by potassium ions (K⁺) and conductivity (Fig. 3a, b). CCA axis 1 is dominated by K⁺ and this may be related to weak weathering, for two reasons. (1) The first CCA axis is driven by lakes that have low precipitation but an intermediate level of evaporation; examples of these sites include Lake Xiniuhaijiuzhai, Lake Muchenghai and Lake Kashacuo, from the north margin of Sichuan Province. These lakes indicate cool, dry conditions with low windiness that lead to a weak weathering environment. We highlight that this area is different from the high Tibetan Plateau, where aridity and salinity dominate. (2) In chemical weathering sequences, K⁺ is an early-stage weathering product (Meunier and Velde, 2013) and K⁺ is often associated with primary minerals, such as feldspars and micas in the bedrock (Hinkley, 1996). Salinity responds to both temperature and aridity, but further pCCAs (Table 3) indicate that both K⁺ and MJT are independent variables in this training set.
The second CCA axis is co-dominated by MJT and Cl⁻ with very similar gradient lengths. Lakes distributed along the warmer end of the MJT gradient include Lake Longtan, Lake Lutu, Lake Luoguopingdahaizi and Lake Jianhu. Most of these sites are lower- to intermediate-altitude sites in the data set (below 2700 m a.s.l.) because elevation is correlated with temperature. Sodium ions (Na⁺) largely follow the same axis as MJT, as evaporation is related in part to temperature. In summary, MJT and Cl⁻ are both independent variables that drive the second CCA axis, and Cl⁻ and Na⁺ partially reflect evaporation effects because, on average, lakes in warmer climates evaporate more than those in colder ones. In addition, Cl⁻ concentration may also relate to the characteristics of the bedrock geology of the region. We highlight that there are very few lakes on the Cl⁻ gradient and these lakes are from the border of Sichuan and Yunnan provinces, where geothermal springs are widespread. We argue that developing an MJT transfer function is appropriate for this large lake training set because MJT is independent of other variables (e.g. Rees et al., 2008; Chang et al., 2015a). Although Cl⁻ is also independent and co-dominates CCA axis 2, its overall explanatory power is lower (Table 2) and its lambda ratio (λ1/λ2) is smaller than that of MJT (Table 3). We retained all 100 lakes from the region without removing sites to artificially enhance the MJT gradient in the ordination analyses and model development because this large data set is an accurate reflection of the natural environment of south-western China.
We selected the WA-PLS-based transfer function models over the WAinv-based approach for this training set because the addition of PLS components can reduce the prediction error in data sets with moderate to large noise (ter Braak and Juggins, 1993). The training set has an MJT gradient of 16.6 °C and the RMSEP represents 13.8 % of the scalar length of the MJT gradient. This is comparable with most chironomid-based transfer function models, including those developed from northern Sweden with 100 lakes (r² = 0.65, Larocque et al., 2001), western Ireland with 50 lakes (r² = 0.60, Potito et al., 2014) and Finland with 77 lakes (r² = 0.78, Luoto, 2009), representing 14.7, 15 and 12.5 % of the scalar length of the temperature gradient, respectively, but less robust than the combined 274-lake transfer function developed from Europe (r² = 0.84, RMSEP representing 10.4 % of the scalar length of the MJT gradient) (Heiri et al., 2011). Despite the relatively low model coefficient (r²boot = 0.63), we observe that, by having a large number of lakes in the calibration set, the distribution of the sites along the MJT gradient is relatively even (Fig. 4d). The distribution of the error residuals generates a smooth curve (Fig. 4d). The model leads to overestimation of low and underestimation of high temperature values, which is typical of WA models (ter Braak and Juggins, 1993). We acknowledge that the lower model coefficient (r²boot) may also relate to the low explanatory power of MJT in the chironomid species data and to the large number of independent and significant variables in the training set when a wide range of lakes was included. However, the extensive temperature gradient length allowed the incorporation of the full potential abundance and distributional ranges for each of the chironomid taxa.

Tiancai Lake reconstructions
All three types of applied diagnostic techniques (Fig. 6b–d) suggest that, overall, a reliable MJT reconstruction was provided by the two-component WA-PLS model based on this 100-lake data set. We highlight that the eight samples from the years between 2000 and 2007 AD have a "poor" or "very poor" fit to MJT, which may suggest that a second gradient other than MJT influenced the chironomid species distribution and abundance in the most recent fossil samples of Tiancai Lake. In the comparison of the MJT reconstruction results with the instrumental record from Lijiang weather station (Fig. 6a), we do not expect the absolute MJT values to be identical because Lijiang is located ∼ 55 km east-northeast of and ∼ 1600 m lower in altitude than Tiancai Lake. We applied a typical environmental lapse rate of temperature (change with altitude) for alpine regions, which is 0.58 °C per 100 m (Rolland, 2003), to estimate the equivalent MJT values from Lijiang station. If the chironomid-based transfer functions are able to provide reliable estimates for MJT, we expect the records to demonstrate a similar trend to the instrumental data (Fig. 6e).
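The lapse-rate adjustment described above is simple arithmetic and can be sketched as follows. The lapse rate of 0.58 °C per 100 m and the altitude difference of ∼ 1600 m are taken from the text; the function names and the station temperature used in the comment are hypothetical and serve only to illustrate the calculation.

```python
LAPSE_RATE_C_PER_100M = 0.58  # environmental lapse rate quoted in the text

def lapse_offset(elevation_diff_m, rate=LAPSE_RATE_C_PER_100M):
    """Temperature difference (degrees C) expected over an elevation difference in metres."""
    return rate * elevation_diff_m / 100.0

def adjust_to_site(station_temp_c, elevation_diff_m):
    """Estimate the temperature at a site lying elevation_diff_m above the station."""
    return station_temp_c - lapse_offset(elevation_diff_m)

# With the ~1600 m difference stated in the text the offset is 9.28 degrees C.
# Using the quoted elevations instead (3898 m - 2390 m = 1508 m) gives about 8.75.
offset_text = lapse_offset(1600)
offset_elev = lapse_offset(3898 - 2390)
# Hypothetical station value, for illustration only: 17.4 degrees C at the station
# would correspond to about 8.1 degrees C at the lake.
example_lake_temp = adjust_to_site(17.4, 1600)
```

The roughly 0.5 °C gap between the two offsets shows how sensitive the comparison of absolute values is to the elevation difference assumed, which is one reason to focus on trends rather than absolute temperatures.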
The reconstruction results match the expected outcomes well: the MJT reconstructed by the transfer function model based on 100 lakes from the broad area of south-western China broadly matches the trend recorded by the instrument. By applying the environmental lapse rate, we obtain a temperature depression from Lijiang to Tiancai Lake of about 9.3 °C (giving an inferred MJT at Tiancai Lake of 8.1 °C in the year 2004). This value is consistent with the chironomid-based reconstructions from Tiancai Lake (an average of 7.8 °C for the samples representing the years 2004-2005); the difference between the two means is 0.3 °C. The implication is that the transfer function model is able to reconstruct an MJT that closely reflects the actual climate record. We observe minor out-of-phase patterns (Fig. 6e), which may reflect the uncertainties of applying the 210Pb chronology to very recent lake sediments (Binford, 1990). Furthermore, we note that sediment samples reflect more than one season and consequently the total range of the temperature reconstructions from the chironomid samples is likely to be slightly less than that of the meteorological data because extreme years are smeared out. While we expect overall trends at Lijiang and Tiancai Lake to be similar, the sites are not closely co-located and some natural variability between them is expected. Nevertheless, a significant correlation (p < 0.05) was obtained between the instrumental data and the WA-PLS model-inferred MJT data for the last ∼ 50 years. We highlight that, in addition to the record validation provided by the reconstruction diagnostic techniques, the good agreement in trend with the instrumental record gives reassurance that the model is capable of providing a realistic pattern of long-term mean July temperature changes. In summary, the chironomid-based transfer function developed using the 100-lake calibration training set has generated reliable quantitative temperature records and can be applied to reconstructing past climate in south-western China.
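The trend comparison described above can be sketched as follows: convert each series to anomalies, smooth with a three-sample moving average, and correlate. The text does not state which correlation coefficient was used, so Pearson's r is an assumption here, as are the function names. The input values are made-up placeholders, not data from Tiancai Lake or Lijiang, and both series are assumed to be already aligned on common time steps.

```python
from math import sqrt


def anomalies(values):
    """Deviation of each value from the series mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def moving_average3(values):
    """Centred three-sample moving average; the two endpoints are dropped."""
    return [(values[i - 1] + values[i] + values[i + 1]) / 3
            for i in range(1, len(values) - 1)]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    dx, dy = anomalies(x), anomalies(y)
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(sxx * syy)


# Hypothetical placeholder series (degC), NOT data from the paper.
inferred = [7.6, 7.9, 7.7, 8.0, 8.2, 8.1]
instrumental = [7.8, 8.0, 7.9, 8.3, 8.4, 8.2]

r = pearson_r(moving_average3(anomalies(inferred)),
              moving_average3(anomalies(instrumental)))
print(f"Pearson r of smoothed anomalies: {r:.2f}")
```

Working with anomalies removes the constant offset between the two sites, so the coefficient reflects shared trend only. Smoothing introduces autocorrelation, which would need to be accounted for in any significance test.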

Conclusions
Chironomid-based summer temperature transfer functions using 100 lakes from south-western China have been constructed and applied to the Yunnan region on the south-east margin of the Tibetan Plateau. Both the ordination and transfer function statistics show that the chironomid-based transfer function is reliable. This large regional training set allowed insight into the regional chironomid distribution and species abundance despite having many more independent environmental gradients. The test of the transfer function models against the modern data suggests that the two-component WA-PLS model provided reconstructions that match the trend of the local instrumental record for the last 50 years. As also demonstrated by pan-European chironomid-based transfer functions (e.g. Brooks and Birks, 2001; Heiri et al., 2011), this broadly based Chinese 100-lake training set is likely robust and is appropriate for use in reconstructing long-term summer temperature changes in south-western China.

Figure 1. Map of south-west China (a) showing the location of the 100 lakes included in the calibration training set (square box). (b) Lakes from Yunnan Province are shown in the square box and (c) the location of Tiancai Lake is marked with a yellow triangle.

Figure 2. (a) Chironomid species stratigraphy diagram of the 85 non-rare taxa with N2 > 2. Mean July temperature is on the y axis and taxon abundance is in percentage. The taxon codes correspond to the codes used in Fig. 3a. Warm and cold stenotherms were identified and grouped based on optical observation and the beta coefficient (from low to high) calculated for each species from the bootstrap weighted-average partial least squares (WA-PLS) model in C2 software (Juggins, 2005). (b) Forty-one (41) non-rare chironomid species present in the short core (28 cm) from Tiancai Lake, where the calibrated 210Pb-based age is on the y axis and taxon abundance is in percentage.

Figure 3. CCA biplots of sample and species scores constrained to environmental variables that individually explain a significant (p < 0.05) proportion of the chironomid species data. (a) Species and (b) sample scores constrained to seven significant environmental variables in the 100 lakes of south-western China. The species codes correspond to the taxon names shown in Fig. 2a.

Figure 4. Performance of the weighted-average model with inverse deshrinking (WAinv) and the weighted-average partial least squares (WA-PLS) model using the 100-lake calibration data set: (a) WAinv bootstrap model; (b) the second component of the WA-PLS bootstrap model. Diagrams on the left show the predicted versus observed mean July temperature (MJT) and diagrams on the right display residuals of the predicted versus observed mean July temperature. Note that both models have a tendency to overpredict temperatures at the cold end of the gradient and underestimate temperatures at the warm end. This is typical of WA-based models.

Figure 5. The age and depth model for 210Pb dating results of the short core (28 cm) from Tiancai Lake. The concentration of 137Cs (circle), excess 210Pb (triangle) and the calibrated age (AD years) (square) are plotted against core sample depth.

Figure 6. (a) Chironomid-based mean July temperature (MJT) reconstruction results from Tiancai Lake based on two transfer function models: the solid black line is the reconstruction based on the weighted-average partial least squares (WA-PLS) bootstrap model with two components and the dashed black line is the reconstruction based on the weighted-average with inverse deshrinking (WAinv) bootstrap model. The red solid line is the instrumental data from Lijiang weather station, corrected by applying the lapse rate, and the solid grey line is the three-sample moving average of that data set. Reconstruction diagnostic statistics for the 100-lake data set: (b) goodness-of-fit statistics of the fossil samples with MJT. Dashed lines are used to identify samples with "poor fit" (> 90th percentile) and "very poor fit" (> 95th percentile) with temperature. (c) Nearest modern analogues for the fossil samples in the calibration data set, where the dashed line is used to show fossil samples with "no good" (5 %) modern analogues. (d) Percentage of chironomid taxa in fossil samples that are rare in the modern calibration data set (Hill's N2 < 2). (e) Comparison between the chironomid-based transfer function reconstructed trends (represented by MJT anomalies) and the instrumental data from Lijiang weather station (red solid line, with three-sample moving average). The black solid line represents the reconstruction based on the WA-PLS bootstrapped model with two components using the 100-lake calibration set.

Table 1. List of all 18 environmental and climate variables measured from 100 south-western Chinese lakes, with mean, minimum and maximum values.

Table 2. CCA summary of the seven significant variables (p < 0.05) including canonical coefficients and t values of the environmental variables with the ordination axes including 100 lakes and 85 non-rare species.

Table 3. Partial canonical correspondence analysis (pCCA) results for the environmental variables that showed a significant correlation (p < 0.05) with the chironomid species data in CCAs, based on the 100-lake calibration training set. Depth, K+, Cl−, LOI and MJT (bold) maintained their significance (p < 0.01) after each step of the pCCAs.

Table 4. (a) Transfer function output showing the performance of the weighted-average models with inverse and classical deshrinking (WAinv, WAcla) and the weighted-average partial least squares (WA-PLS) models for reconstructing mean July temperature using 100 lakes from south-western China and 85 non-rare chironomid species. Bold indicates the models that were tested for reconstructing mean July temperatures from Tiancai Lake. (b) The t test (two-sample assuming unequal variances) performed on the RMSEP output values of WA-PLS component 1 and component 2 shows that the result is significant at p < 0.05. This suggests there is a difference between the RMSEP of the two models. We therefore selected the second component of the WA-PLS because it produced a lower RMSEP value.
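The two-sample t test assuming unequal variances mentioned in (b) is commonly known as Welch's t test. A minimal sketch of the test statistic and its Welch-Satterthwaite degrees of freedom is given below, using only the Python standard library. The function name and the input values are hypothetical; the actual RMSEP output values compared in the table are not reproduced here.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for a two-sample t test assuming unequal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two values")
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical placeholder RMSEP values (degC), NOT values from the paper.
component_1 = [2.45, 2.50, 2.41, 2.48, 2.52]
component_2 = [2.30, 2.33, 2.28, 2.35, 2.31]

t_stat, dof = welch_t(component_1, component_2)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

The p value would then be read from a t distribution with `dof` degrees of freedom, using a statistics package or tables.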