Present and Lgm Permafrost from Climate Simulations: Contribution of Statistical Downscaling

We quantify the agreement between permafrost distributions from PMIP2 (Paleoclimate Modeling Intercom-parison Project) climate models and permafrost data. We evaluate the ability of several climate models to represent permafrost and assess the variability between their results. Studying a heterogeneous variable such as permafrost implies conducting analysis at a smaller spatial scale compared with climate models resolution. Our approach consists of applying statistical downscaling methods (SDMs) on large-or regional-scale atmospheric variables provided by climate models, leading to local-scale permafrost modelling. Among the SDMs, we first choose a transfer function approach based on Generalized Additive Models (GAMs) to produce high-resolution climatology of air temperature at the surface. Then we define permafrost distribution over Eurasia by air temperature conditions. In a first validation step on present climate (CTRL period), this method shows some limitations with non-systematic improvements in comparison with the large-scale fields. So, we develop an alternative method of statistical down-scaling based on a Multinomial Logistic GAM (ML-GAM), which directly predicts the occurrence probabilities of local-scale permafrost. The obtained permafrost distributions appear in a better agreement with CTRL data. In average for the nine PMIP2 models, we measure a global agreement with CTRL permafrost data that is better when using ML-GAM than when applying the GAM method with air temperature conditions. In both cases, the provided local information reduces the variability between climate models results. This also confirms that a simple relationship between permafrost and the air temperature only is not always sufficient to represent local-scale permafrost. Finally, we apply each method on a very different climate, the Last Glacial Maximum (LGM) time period, in order to quantify the ability of climate models to represent LGM per-mafrost. The prediction of the SDMs (GAM and ML-GAM) is not significantly in better agreement with LGM permafrost data than large-scale fields. At the LGM, both methods do not reduce the variability between climate models results. We show that LGM permafrost distribution from climate models strongly depends on large-scale air temperature at the surface. LGM simulations from climate models lead to larger differences with LGM data than in the CTRL period. These differences reduce the contribution of downscaling.


Introduction
Permafrost reacts to climate change (Harris et al., 2009) with critical feedbacks (Khvorostyanov et al., 2008;Tarnocai et al., 2009), especially on carbon storage and greenhouse gases emissions (Zimov et al., 2006;Beer, 2008). This issue becomes an important subject of interest for the future, especially in Arctic regions (Stendel and Christensen, 2002;Zhang et al., 2008). Through these feedback processes, the permafrost will likely play a significant role in climate and in climate models responses to global change. Three main approaches exist to modelling permafrost: G. Levavasseur et al.: Statistical downscaling applied to permafrost distribution -Some land-models simulate permafrost properties (Nicolsky et al., 2007;Koven et al., 2009) only from climate data; but permafrost representation partly depends on the resolution of climate models, which cannot reflect the local-scale physical processes involved.
-A dynamical model of permafrost can be forced by climate conditions and computes the complex permafrost physics and dynamics (Romanovsky et al., 1997) as the interactions with snow cover or hydrological network (Delisle et al., 2003). This method is mainly used to study mountain permafrost (Guglielmin et al., 2003) or to focus on a small region (Marchenko et al., 2008) because it needs large computing time and localscale data about soil properties (vegetation, lithology, geology, etc.).
-Near-surface permafrost can be derived from climatic variables using simple conditions as in Anisimov and Nelson (1997) or Renssen and Vandenberghe (2003).
For simplicity, we first assume that permafrost depends solely on air temperature at the surface (or temperature at 2 m above ground and hereafter referred to as "temperature") with the relationship from Renssen and Vandenberghe (2003), presented in Sect. 2 with the used permafrost databases. Applying these temperature conditions, we are able to extract a permafrost index from climate models outputs. In this article, we will assign the name "climate model" indifferently to GCMs (Global Circulation Models) or EMICs (Earth System Models of Intermediate Complexity). In order to be able to simulate long time periods, the equations of atmospheric or oceanic dynamics are solved on coarse spatial grids. Coarse scales cannot reflect the atmospheric local evolutions. Permafrost is an heterogeneous variable related to local-scale climate. Hence, downscaling methods, bringing local-scale information, are useful to compare permafrost data with global or regional results from climate models. Moreover, coarse resolutions generate a strong variability from one model to another; for example, with state-of-the-art climate models, the predictions of mean temperature change for the next century range from 1.4 to 3.8 • C for B2 scenario (Meehl et al., 2007). Downscaling could also reduce the variability between climate models results (or the inter-models variability), especially at CTRL period. Indeed, downscaling defines a model to reproduce calibration data. Hence, different CTRL simulations associated with different downscaling models will both be close to calibration data, reducing the differences between several downscaled climate models. Downscaling is the action of generating climate variables or characteristics at the local scale as a numerical zoom applied to climate models. On one hand, Regional Climate Models (RCMs) represent the physical approach. They have a higher spatial resolution than climate models and can compute some sub-scale atmospheric processes, parameterized in climate models. RCMs are often used in permafrost studies. Stendel et al. (2007) combined a RCM driven by global climate outputs with a dynamical model of permafrost to bridge the gap between GCMs and local-scale permafrost data. Christensen and Kuhry (2000) derived permafrost from RCM simulation using the "frost index" described originally by Nelson and Outcalt (1987). However, Salzmann et al. (2007) emphasized the need to use different RCMs to reduce uncertainties and to perform sensitivity studies. Nevertheless, RCMs are computationally very expensive. On the other hand, the statistical downscaling methods (SDMs) are less resource-intensive and represent an alternative to quickly obtain high-resolution fields from several different climate models. Such an approach consists of using statistical relationships between large-scale variables and the local-scale variable of interest. For instance, in permafrost context, Anisimov et al. (2002) used a stochastic model to map the thickness of the soil layer with annual freezing and thawing (the "active-layer"). Among the many existing SDMs, like "weather generators" (Wilby et al., 1998;Wilks, 1999) or "weather typing" (Zorita and von Storch, 1999;Vrac and Naveau, 2007) methods, we choose in Sect. 3 to directly model these relationships by transfer functions (Huth, 2002;Vrac et al., 2007a). To obtain a high-resolution permafrost index, we apply the conditions from Renssen and Vandenberghe (2003) on downscaled temperatures using a Generalized Additive Model (GAM - Vrac et al., 2007a;Martin et al., 2011), allowing to quantify the agreement between simulated high-resolution permafrost and local-scale permafrost data. GAM is suitable for continuous variable such as temperature. Studying permafrost, we are dealing with discrete variable; hence, we need relationships between temperature and permafrost. So, we develop in Sect. 4 an alternative SDM based on a Multinomial Logistic GAM (ML-GAM) that models directly the relationship between localscale permafrost and global-scale variables. In climatology, logistic models are often employed to predict wet or dry day sequences (Buishand et al., 2003;Vrac et al., 2007b;Fealy and Sweeney, 2007) or vegetation types distribution (Calef et al., 2005). Logistic models were also used in the context of periglacial landforms prediction by Lewkowicz and Ednie (2004) or more recently by Brenning (2009). In our case, ML-GAM produces a relationship between several continuous variables and the occurrence probabilities of each permafrost category. Applying logistic models on a large region as the Eurasian continent allow us to build a global/generic relationship between permafrost and several factors. For both approaches, a strong hypothesis is to consider the climate as a steady-state and to assume that the nearsurface permafrost (hereafter referred to as "permafrost") is in "pseudo-equilibrium" with it.
Also, climate modelling needs to determine the ability of climate models in simulating past climates in comparison with data. In paleoclimatology, discrepancies appear between large-scale climate models and data-proxies, the latter being intimately related to their close paleoenvironment (Gladstone et al., 2005;Ramstein et al., 2007;Otto-Bliesner et al., 2009). Downscaling may reduce these differences between climate models and data. Furthermore, an important exercise is to evaluate the ability of the two statistical models to represent the permafrost distribution of a very different climate. An application of these methods to the Last Glacial Maximum (LGM) is discussed in Sect. 5. We work with a representative set of climate models from the Paleoclimate Modeling Intercomparison Project (PMIP2) (Braconnot et al., 2007a,b), which provides climate simulations for the preindustrial and LGM time periods.

Permafrost: definition and data
Permafrost is defined as ground permanently at or below 0 • C for two or more consecutive years (French, 2007). To validate the statistical models for the control period (CTRL, hereafter refered to as "present"), we use geocryological observations reviewed and grouped into one circum-artic permafrost map by the International Permafrost Association (IPA) and the Frozen Ground Data Center (FGDC) (Brown et al., 1997). Most of compiled permafrost CTRL data are observations between 1960 and 1980 drawn on different maps with different scales by several authors, e.g. Heginbottom et al. (1993) and references therein. In a similar way, LGM permafrost data correspond to a recent map of permafrost extent maximum in Europe and Asia around 21 ky BP, combining different geological observations from different maps as described in Vandenberghe et al. (2008Vandenberghe et al. ( , 2011. The combined LGM maps are not always distinctive in describing the permafrost categories, which could have different definitions depending on the authors. Moreover, the age of LGM permafrost indicators is often not precisely defined. Consequently, it is difficult to judge the accuracy of the final maps and we keep in mind these restrictions in our interpretation. Both datasets describe the spatial distribution of two main types of permafrost (French, 2007): -Continuous permafrost is a permanently frozen ground that covers more than 80 % of the sub-soil.
-Discontinuous permafrost covers between 30 % and 80 % of sub-soil. The permanently frozen ground forms in sheltered spots, with possible pockets of unfrozen ground.
Consequently, our region of interest corresponds to the Eurasian continent with the Greenland ice-sheet approximately from 65 • W to 175 • E and from 20 • N to 85 • N (see Fig. 1). We consider the Greenland ice-sheet in order to calibrate the statistical model with the widest possible present temperature range for a downscaling in the LGM climate. Nevertheless, for permafrost representation we mask the icesheets (Greenland and Fennoscandia for LGM), as the presence of permafrost under an ice-sheet is not obvious and is currently debated. Moreover, since our estimate is based on temperature there is no reason why the permafrost under the ice-sheet shall be mainly driven by air temperature above the ice-sheet.

Downscaling with a Generalized Additive Model (GAM)
To simulate a discrete variable such as permafrost, we first decide to downscale the temperatures from different climate models with the same approach by GAM as Vrac et al. (2007b) and Martin et al. (2011). Then we deduce permafrost from the downscaled temperatures using a simple relationship between permafrost and temperature. This methodology is illustrated in Fig. 2 (left half).

Temperature data and permafrost relationship
To calibrate a GAM, we need observations. The highresolution data used for the downscaling scheme are the gridded temperature climatology from the Climate Research Unit (CRU) database (New et al., 2002). For each grid-point the dataset counts twelve monthly means (from 1961 to 1990) at a regular spatial resolution of 10 (i.e. 1/6 degree in longitude and latitude) corresponding to the downscaling resolution.
Although the CRU climatology corresponds to the period of the permafrost observations, the overall permafrost system is not in equilibrium with present climate. However, in the following we will consider the climate as the steady-state and assume that near-surface permafrost is in rough equilibrium with it. In order to obtain the permafrost limits from the downscaled temperatures, we derive a high-resolution permafrost index according to the assumption that permafrost depends solely on temperature. Several relationships exist in literature (e.g. Nechaev, 1981;Huijzer and Isarin, 1997); the most employed in climate modelling are the following conditions from Renssen and Vandenberghe (2003) (explicitly described in Vandenberghe et al., 2004), which we will use and assign the name "RV": -Continuous permafrost: Annual mean temperature −8 • C and Coldest month mean temperature −20 • C.
To check the consistency of this assumption of permafrost being only related to temperature, Fig. 1 compares the permafrost distribution obtained by applying these temperature conditions on CRU climatology, with the permafrost index from IPA/FGDC. The similarities between both representations are obvious and show a consistent relationship between the two variables. Some differences exist in high mountain regions for the category or presence of permafrost. Indeed, even if this isotherms combination is calibrated on Legend : Permafrost comparison between CRU temperature climatology with the Renssen and Vandenberghe (2003) conditions and the IPA/FGDC permafrost index. In the legend panel,"N" corresponds to "No permafrost", "D" to "Discontinuous permafrost" and "C" to "Continuous permafrost". The highlighted categories with bold letters shows the agreement between both datasets. the present climate, the temperature is not the only criterion to model permafrost: for example, snow cover, soil and vegetation types have key roles for mountain permafrost (Guglielmin et al., 2003;French, 2007). Nevertheless, to a first order, deriving permafrost from temperature will be the base assumption of this study.

Generalized Additive Model
We first use a statistical model applied by Vrac et al. (2007a) to downscale climatological variables and based on the Generalized Additive Models (GAMs) as precisely studied in this context by Martin et al. (2011). GAM models statistical relationships between local-scale observations (called predictand ) and large-scale variables (called predictors), generally from fields of climate models. The large-scale predictors will be described in Sect. 3.2.1. More precisely, this kind of statistical model represents the expectation of the explained variable Y (the predictand, temperature in our case) by a sum of nonlinear functions (f ), conditionally on the predictors X (Hastie and Tibshirani, 1990): where is the residual or error, β 0 is the intercept, k is the k th predictor and n is the number of predictors and i is the grid-cell. To use GAM, we need to define the distribution family of the explained variable. For simplicity, we assume that temperature has a Gaussian distribution, which implies a zero-mean Gaussian error (Hastie and Tibshirani, 1990). Then, we define the nonlinear functions as cubic regression splines (piecewise by third degree polynomials). Finally any SDM needs a calibration/projection procedure. The calibration is the fitting process of the splines on present climate. Afterward, we project on a different climate to predict a temperature climatology in each grid-point of our region. Initially, the calibration step takes into account the 12 months of the climatology (annual calibration). To be evaluated in fair conditions, the statistical model requires independent samples between the calibration and projection steps. Using climatology data does not satisfy this condition on present climate with an annual calibration and does not allow a classical cross-validation. As a workaround, we adapt a "cross-validation" procedure which consists of a calibration on 11 months and a projection on the remaining month. With a rotation of this month, we are able to project a local-scale climatology for any month.

Predictand:
Local--scale temperature (CRU) Predictors: --Air Surface Temperature (TAS), --Diffusive COn=nentliaty (DCO), --Advec=ve COn=nentality (ACO), --Topography. In this paper, we only use GAM as a "tool" and we do not directly discuss the behavior of the statistical model; for more details we refer the reader to Vrac et al. (2007a); Martin et al. (2011). We perform this analysis within the statistical programming environment R (R Development Core Team, 2009) and its "mgcv" package (Wood, 2006).

Explanatory variables (predictors)
Previous studies from Vrac et al. (2007a) and Martin et al. (2011) lead us to select four informative predictors for temperature downscaling, fully described in their studies. Note that we only downscale on the continents because CRU data are only defined on land grid-points. Most of the predictors are computed from a representative set of coupled oceanatmosphere simulations provided by the Paleoclimate Modeling Intercomparison Project (PMIP2) using state-of-the-art climate models. The required LGM outputs for Sect. 5 lead us to work with nine of them listed in Table 1. The explanatory variables may be divided into two groups: the "physical" predictors and the "geographical" ones. The "physical" predictors are directly extracted from climate models outputs and depend on climate dynamics. The "geographical" predictors provide information to the large-vs. local-scale relationships that are robust and stable with time.
Only one "physical" predictor is used and corresponds to the air temperature at the surface. This variable is extracted from present and LGM simulations from climate models bilinearly interpolated at 10 resolution in order to produce more spatial variability. If the interpolation may have an impact on the downscaling, we do not discuss this point in this study. Moreover, the preindustrial simulations from PMIP2 do not correspond to the 1961-1990 period of CRU data particularly in terms of CO 2 concentration. To account for this effect and to have a more relevant calibration, we lift climate models temperatures (preindustrial values) into the current  climate before calibration: we compare the global mean temperature from each climate model and CRU data (grid by grid) and add the difference in each grid-point. For LGM period, we do not assume any temporal shift of the simulations. Consequently, we do not apply a similar correction on LGM temperatures and we consider LGM nearsurface permafrost in equilibrium with LGM climate.
The "geographical" predictors are the topography and two continentality indices. The surface elevation from climate models depends on the resolution and does not account for small orographic structures. To take into account the effect of high-resolution topography, we use the high-resolution  Hasumi and Emori (2004) gridded dataset, ETOPO2 1 , from the National Geophysical Data Center (NGDC) which gathers several topographic and bathymetric sources from satellite data and relief models (Amante and Eakins, 2008). We build the LGM topography from ETOPO2 adding in each grid-point a value corresponding to the difference between LGM and present orography. This difference is calculated with the elevation provided by present and LGM simulations of the ice-sheet model GRISLI (Peyaud et al., 2007) to account for the ice-sheet elevation and subsidence, and the sea-level changes. The first continentality index is the "diffusive" continentality (DCO). DCO is between 0 and 100 % and can be attributed to the shortest distance to the ocean, 0 being at the ocean edge and 100 being very remote from any ocean corresponding to a purely continental air parcel. The physical interpretation is the effect of coastal atmospheric circulation on temperature. DCO does not depend on time and is only affected by sea-level change (or land-sea distribution). The second continentality index is the "advective" continentality (ACO). ACO is somewhat similar to DCO albeit being modulated by the largescale wind intensities and directions from climate models and represents an index of the continentalization of air masses. It is based on the hypothesis that an air parcel becomes progressively continental as it travels over land influencing temperature. Hence, ACO depends on the changes of land-sea distribution and on wind fields coming from the climate models simulations. For more details about ACO and DCO see Appendix ??.

GAM results on present climate
In this section, GAM is applied to the nine climate models from the PMIP2 database. In order to make a visual comparison with CTRL permafrost data and to highlight the influence of downscaling on permafrost modelling, we compare permafrost distributions deduced from interpolated and from downscaled temperatures for each climate model. We will assign the name "GAM-RV" for the procedure of applying the RV conditions on temperatures downscaled by GAM.
In the following, we only discuss the results from two representative models: on the one side ECHAM5 is heavily influenced by GAM-RV downscaling and shows the best results on CTRL period. One the other side, IPSL-CM4 is the coldest climate model leading to good downscaling results on LGM for this method. Figures 3a and 4a compare permafrost extents from interpolated temperatures (respectively for ECHAM5 and IPSL-CM4) when applying the RV conditions to derive permafrost, with the permafrost distribution from IPA/FGDC. The two maps reveal several differences between climate models and CTRL permafrost data at high latitudes and in mountain regions, especially in Himalayas for ECHAM5 and in eastern Siberia for IPSL-CM4. Both permafrost distributions are driven by the latitudinal gradient of large-scale temperature. Even if IPSL-CM4 has a higher resolution (Table 1), improving the representation of regional topographic structures, it does not contain enough local-scale information to represent the permafrost distribution from IPA/FGDC observations. Applying the GAM-RV approach, we obtain the corresponding Figs. 3b and 4b. Downscaling shows better permafrost distributions, particularly for discontinuous permafrost at high latitudes. For both climate models, some differences with CTRL permafrost data disappear and the major contribution of local-scale topography clearly appears for ECHAM5 with the onset of colder temperatures over the Siberian mountains or the Himalayas. However, the information provided by inferred downscaled temperatures cannot reduce the differences on the Scandinavian peninsula and around Himalayas or in eastern Siberia for IPSL-CM4.  Legend : In the legend panel,"N" corresponds to "No permafrost", "D" to "Discontinuous permafrost" and "C" to "Continuous permafrost". The highlighted categories with bold letters show the agreement between model and data.
To quantitatively assess the effect of the downscaling on CTRL permafrost representation, we measure the agreement between permafrost distributions from downscaled climate models and IPA/FGDC observations with different numerical indices, the results of which are listed in Table 2. Without GAM-RV downscaling, climate models obtain a smaller total permafrost area than observations from IPA/FGDC, with a difference of 3.4 × 10 6 km 2 and 3.1 × 10 6 km 2 respectively for ECHAM5 and IPSL-CM4. Contrary to our expectations, these differences with CTRL permafrost data increase with GAM-RV downscaling to about 10 6 km 2 for both climate models in comparison with interpolated fields. In order to distinguish between continuous and discontinuous permafrost, we consider their respective areas. The smaller permafrost area predicted by GAM-RV is mainly explained by a decrease of the continuous permafrost area of about 1.1 × 10 6 km 2 for ECHAM5 and 0.8 × 10 6 km 2 for IPSL-CM4. The area of discontinuous permafrost slightly increases for ECHAM5 (+0.2 × 10 6 km 2 ) and decreases (−0.3 × 10 6 km 2 ) for IPSL-CM4. To quantify the proportion of permafrost simulated in right location, %CP (%DP) is the percentage of continuous (discontinuous) permafrost in agreement with permafrost data. %CP (%DP) corresponds to the ratio of continuous (discontinuous) matching area (respectively in blue and turquoise areas on maps 3 and 4) over the continuous (discontinuous) area from IPA/FGDC observations. These percentages of common area between permafrost data and climate models are obtained by summing up the surface of the grid-cells including continuous (discontinuous) permafrost for both. For example, 0 %DP means that discontinuous permafrost from climate model and data are entirely non-overlapping. GAM-RV reduces all percentages of about 5 %, except for %DP from 16 to 31 % for ECHAM. The results for these two climate models show the limits of the GAM-RV method. Figure 5a shows the relative difference with permafrost data from IPA/FGDC for all interpolated and downscaled climate models. We confirm the decrease of total permafrost area for most of downscaled climate models by GAM-RV with a median relative difference with CTRL permafrost data of −27.4 % against −21.8 % for the interpolated climate models. The plots also reveal a weaker variability between climate models results with downscaling. Indeed, in Table 2 GAM-RV reduces the standard deviation for all area indices. Although standard deviation computed on small-sample is not very reliable statistically, it gives a first indication about the variability between climate models results. In conclusion, it clearly appears that the resolution plays a significant role in permafrost prediction. GAM-RV provides local information improving the CTRL permafrost distribution.
These area indices provide numerical information on the permafrost extent but do not quantify the statistical relevance of agreement between climate models and permafrost data. To judge if the GAM-RV results are better than the agreement by chance, we use the kappa coefficient (κ, Appendix A). This index can take values between 0 and 1 and measures the intensity or quality of the agreement (Cohen, 1960;Fleiss et al., 1969) based on a simple counting of matching and nonmatching grid-points in a matrix used to represent errors in assigning classes. Without downscaling, ECHAM5 obtains a κ of 0.64, while the value is 0.68 for IPSL-CM4 corresponding respectively to 72 % and 74 % of a maximum agreement beyond chance of 0.88 and 0.91. With GAM-RV, the %κ max increases by 14 % for ECHAM5 and 4 % for IPSL-CM4. Moreover, all studied climate models obtain a κ adj close to their κ. Consequently, the results obtained by GAM-RV are statistically relevant and in better agreement with permafrost data from IPA/FGDC than using a simple interpolation of temperatures.
Despite non-systematic improvement from GAM on permafrost distribution, this method is informative for temperature downscaling on CTRL period. All climate models obtained a percentage of explained variance between 97 and 100% with respect to temperature observations. GAM brings downscaled climate models closer to the CRU climatology by improving the temperature distribution (Vrac et al., 2007a;Martin et al., 2011). Hence, the limits of the GAM-RV method are mainly due to the RV relationship. We confirm that the RV relationship does not provide enough information for local-scale permafrost distribution and leads to a close dependence between temperature and permafrost. The permafrost distribution from climate models is strongly driven by the latitudinal gradient of temperature, leading to a disagreement with CTRL permafrost data. Furthermore, applying the RV conditions on CRU temperatures leads to a total permafrost area of 10.4 × 10 6 km 2 . Based on the hypothesis that CRU and CTRL permafrost data have no uncertainties, the RV relationship induces an error of −26.0 % compared to permafrost data (Fig. 5a). Consequently, GAM-RV includes this error and does not improve the permafrost distribution beyond the CRU permafrost distribution.

An alternative approach: the Multinomial Logistic -GAM
Using temperature downscaling to reconstruct permafrost limits requires conditions to go from continuous to discrete values. As shown in Sects. 3.1 and 3.2.2, the RV relationship is only based on the contribution of temperature for permafrost distribution. A study at a local-scale needs more information. Here, we propose to enlarge the spectrum of relationships between permafrost and several variables.
To link a categorical variable, such as permafrost, with continuous variables, a common statistical technique is the use of logistic models representing the occurrence probability of an event (often binary, e.g. permafrost or no permafrost). This probability can take continuous values between 0 and 1. For instance, Calef et al. (2005) built a hierarchical logistic regression model (three binary logistic Table 2. PMIP2 quantitative results for CTRL period. "DATA" column corresponds to IPA/FGDC permafrost index. The CPA, DPA, PA, and PD indices are respectively set for continuous, discontinuous, total permafrost areas and total permafrost difference with data and are expressed in 10 6 km 2 . The %CP and %DP indices are respectively the percentages of continuous and discontinuous permafrost in agreement with data. The κ, κ max , κ adj indices corresponds respectively to the κ coefficient, its maximum value and its adjusted value. The %κ max is the percentage of κ max reached by κ. Numbers from 1 to 9 correspond to the PMIP2 models referenced in Table 1  Here, ML-GAM is used as a SDM to estimate the occurrence probabilities of the explained variable (Y , permafrost in our case) for each category or class j by a sum of nonlinear functions (f ), conditionally on numerical or categorical predictors (X) (Hastie and Tibshirani, 1990): where P (Y i = j ) is the probability of the j th permafrost category, β j is the intercept for the j th permafrost category, f k are defined as cubic splines for the k th predictor, n is the number of predictors and i is the grid-cell. To use ML-GAM, we need to define a reference category (r). We obtain j − 1 relationships and the occurrence probability of the reference category can be deduced with m j =1 P (Y i = j ) = 1 (considering m categories). ML-GAM is performed with the R package "VGAM" (Yee and Wild, 1996;Yee, 2010a,b).
Local-scale data used for the calibration step are directly the local-scale observed permafrost indices from IPA/FGDC. In order to compare ML-GAM and GAM-RV, we use the same predictors for both methods. As discussed in Sect. 3.1, the topography, the temperature and the continentality indices were chosen for temperature downscaling. Although the temperature and the topography are clearly necessary for permafrost representation, a study on the predictors choice for permafrost downscaling could be an interesting prospect but is not the purpose of this article.
In GAM-RV we had to set the relationship between permafrost and downscaled temperatures. Here, the logistic models build a new relationship between permafrost and the selected predictors which can be compared to the previous isotherms combinations from Renssen and Vandenberghe (2003). Figure 6 shows the probabilities to obtain each category of permafrost in each approach. On the panels 6a-c, we apply the RV conditions on CRU temperatures. On the panels 6d-f, we model by ML-GAM the relationship between permafrost from IPA/FGDC and two predictors: the annual mean temperature and the coldest month mean temperature from CRU. Thus, each graph on the left is directly comparable to the corresponding one on the right (Fig. 6). Conditions from Renssen and Vandenberghe (2003) clearly appear with probabilities of 0 or 1 depending on the isotherms described in Sect. 3.1. With ML-GAM, visible similarities with the relationships used in GAM-RV demonstrates the consistency of the method. However, the probabilities can take continuous values between 0 and 1 and allows us to obtain for each grid-point three complementary probabilities for the continuous, discontinuous and no permafrost categories. In ML-GAM, the modelled relationship also varies according to the selected predictors and the studied climate model. Bypassing temperature  Fig. 6. Permafrost occurrence probabilities based on the annual mean local temperatures and the coldest month mean local temperatures from CRU data. Panels (a-c) (on the left) corresponds to the fixed temperature conditions from Renssen and Vandenberghe (2003) (isotherms combinations) used for the GAM-RV downscaling method; panels (d-f) (on the right) shows the modelled relationship between permafrost and the two same variables by the ML-GAM downscaling method. The grey area corresponds to the cells mathematically impossible (i.e. when the annual mean temperature is colder than the coldest month mean temperature). downscaling allows computing a more complex relationship between predictors and permafrost. To calibrate on a large region as Eurasian continent also allows to build a global relationship, which could be tested on other region of interest. Moreover, the multinomial logistic models could take into account other permafrost categories (e.g. sporadic or isolated permafrost ;French, 2007).

Comparison GAM-RV vs. ML-GAM on present climate
To confront ML-GAM with GAM, Figs. 3c and 4c compare the permafrost indices downscaled by ML-GAM (respectively for ECHAM5 and IPSL-CM4) with the permafrost distribution from IPA/FGDC observations. The permafrost indices downscaled by ML-GAM correspond in each gridpoint to the highest occurrence probability. Permafrost distribution obtained with ML-GAM shows better agreement with CTRL permafrost data than that obtained with GAM-RV (Figs. 3c and 4c). The contribution of local-scale topography directly improves the discontinuous permafrost representation in the Himalayas and Tibetan plateau and in other areas with mountain permafrost (Alps, Scandinavian and Siberian mountains). For both climate models, most of the differences persisting with the GAM-RV downscaling disappear with ML-GAM, as in eastern Siberia for IPSL-CM4. In Table 2, the ML-GAM downscaling improves the continuous and discontinuous permafrost areas for both climate models. In comparison with interpolated climate models, ML-GAM reduces the total permafrost difference with observations from IPA/FGDC to 1.4 × 10 6 km 2 for ECHAM5 and 1.6×10 6 km 2 for IPSL-CM4. The percentages of continuous and discontinuous areas in agreement with CTRL permafrost data also increase to values close to 90 % for %CP and 53 % for %DP. In the Fig. 5a ML-GAM downscaling clearly shows improvements for all climate models with a median relative difference with CTRL permafrost data of −9.6 %, compared with GAM-RV (−21.8 %).
Moreover, the permafrost distribution is very similar between ECHAM5 and IPSL-CM4. The same patterns can also be observed on the maps of the different climate models (not shown) especially for continuous permafrost. Figure 5a clearly shows that ML-GAM reduces the variability between climate models results, more than with GAM-RV. Indeed, ML-GAM has a weaker standard deviation whatever the index (Table 2). This alternative method brings all climate models closer to the permafrost distribution from IPA/FGDC observations.
In terms of κ statistics, ML-GAM systematically improves the statistical agreement from 0.64 to 0.80 for ECHAM5 and from 0.68 to 0.80 for IPSL-CM4. The higher %κ max reflects a better agreement with CTRL permafrost data. Note that the standard deviation is also reduced for κ indices: the quality of the agreement is equal for all climate models. ML-GAM provides more confidence than GAM-RV, based on the fact that our results are statistically better than chance agreement. Moreover, all climate models have a κ adj closer to κ than with GAM-RV: the intrinsic biases are slightly weaker with ML-GAM.
Nevertheless, some inconsistencies persist. A high disagreement on the permafrost category persists at high latitudes for ECHAM5 (Fig. 3c). As previously mentioned, this is due to the physics included in the statistical model: the predictors choice is relevant for temperature downscaling. Soil temperature, vegetation type and snow cover could bring more consistent physics to reconstruct a highresolution permafrost distribution.
Without temperature downscaling ML-GAM leads to a more precise spatial representation of permafrost in better agreement with observed CTRL data. In conclusion, bypassing temperature downscaling provides an adapted relationship between permafrost and predictors for each climate model.
Our results are the byproduct of several factors such as: the ability of climate models to correctly represent temperature, the relationship between permafrost and chosen variables, etc. It is thus difficult to independently quantify the error of each factor in the final result. Such a sensitivity analysis is beyond the scope of our paper and will be the subject of further studies.

Application to LGM permafrost
In a climate change context it is interesting to test the ability of the statistical models to represent past climates when they have been calibrated on present climate. In terms of temperatures and precipitation Martin et al. (2011) obtained remarkable results from the EMIC CLIMBER Petoukhov et al., 2000) in comparison with GCM outputs for the Last Glacial Maximum (LGM) climate and concluded to a great potential of GAM for applications in paleoclimatology (Vrac et al., 2007a;Martin et al., 2011).
Can we thus export the statistical models at a different past climate, as the LGM, in terms of permafrost distribution? To answer this question, we apply the three SDMs on LGM outputs from the PMIP2 climate models. For this time period, the permafrost distribution used to compare with climate models is from Vandenberghe et al. (2011). Figures 7a and 8a compare the permafrost distribution from interpolated climate models (with the RV conditions) with the LGM permafrost data. Without downscaling, ECHAM5 and IPSL-CM4 already appear too warm to correctly represent permafrost limits from LGM data. For ECHAM5, the permafrost limits do not comply with the Fennoscandian ice-sheet contours. Moreover, its coarse orography is not enough to represent mountain permafrost in Himalayas. IPSL-CM4 is colder and has a higher resolution, providing a more representative permafrost distribution around the ice-sheet and the Tibetan plateau. In the legend panel,"N" corresponds to "No permafrost", "D" to "Discontinuous permafrost" and "C" to "Continuous permafrost". The highlighted categories with bold letters show the agreement between model and data. and 8b compare in the same way the permafrost distribution from GAM-RV with the permafrost distribution from Vandenberghe et al. (2011). The contribution of the localscale topography appears particularly with the onset of mountain permafrost in Himalayas for ECHAM5 as for present climate. IPSL-CM4 obtains slightly warmer temperatures with GAM, leading to permafrost limits at higher latitudes. Permafrost downscaled with ML-GAM is compared with LGM permafrost data in Figs. 7c and 8c. For those two climate models continuous permafrost over Himalayas and Tibetan Plateau disappears almost completely and discontinuous permafrost reaches higher latitudes than GAM-RV for both climate models. We give in Table 3 the numerical indices for LGM period. Quantitatively, GAM-RV does not systematically improve the total permafrost area: +1.4 for ECHAM5 and −1.6 × 10 6 km 2 for IPSL-CM4 with respect to interpolated fields. Contrary to present climate, ML-GAM increases this discrepancy with +2.1 for ECHAM5 and −4.2×10 6 km 2 for IPSL-CM4. Then, even if GAM-RV degrades the permafrost distribution for IPSL-CM4, it remains the best representation with the highest %CP (63 %) and %DP (7 %) for this method. ML-GAM improve the percentage of discontinuous permafrost predicted in right location for each climate model.
Nevertheless, whatever SDM is used, the surface differences with LGM permafrost data are more pronounced than in CTRL period. Continuous permafrost derived from downscaled temperature is still underestimated. Moreover, depending on CMs, no or few discontinuous permafrost is predicted at the right place (%DP ranges between 0 and 20 %). No significant decrease appears in terms of variability between all climate models results: the measured standard deviation (Table 3) is higher than CTRL period and remains fairly stable around 3×10 6 km 2 , except for ML-GAM which halves the variability between climate models results. Figure 5b for LGM clearly shows that GAM-RV or logistic models face difficulties in improving the nine climate models with median relative differences with LGM permafrost data around −40 %. This shows that the permafrost distribution in the LGM is strongly driven by the large-scale temperature from climate models and we cannot base our interpretation of the LGM results on CTRL results. The SDMs cannot correct the large gap between interpolated climate models and LGM permafrost data (Fig. 5b). With a simulated LGM climate closer to LGM data, downscaling could have more impact.
The larger differences with LGM permafrost data than at CTRL period imply a lower κ coefficient (Table 3). With GAM-RV no changes appear for ECHAM5 except for the κ adj showing larger biases in calculation of κ. For IPSL-CM4 the κ coefficient decreases from 0.63 to 0.58. GAM-RV does not improve the statistical agreement, reflecting the weak potential of climate models to correctly represent permafrost limits for the LGM period. ML-GAM gives similar performances.
We can summarize with some remarks: 1. The contribution of GAM or ML-GAM is not sufficient to reduce the gap between climate models and LGM permafrost data in reproducing local-scale permafrost. ML-GAM produces a more realistic LGM permafrost distribution reaching latitudes similar to those from LGM data and improving the agreement with it. Nevertheless, the SDMs do not reduce the variability between climate models results at LGM.
2. The SDMs include the strong contribution of temperature and topography. Nevertheless as for CTRL period, the predictors ACO and DCO are not informative for permafrost. So common differences appear between the two periods. Despite consistent patterns, the permafrost distribution is still strongly driven by the latitudinal gradient of temperature and incorrect transitions from continuous to no permafrost appear.
3. With the hypotheses that LGM and CTRL permafrost data have no uncertainties, that the simulated climates from climate models are at equilibrium with permafrost data, and that the relationships between permafrost and chosen variables are stable with time, the nine climate models from PMIP2 cannot simulate a cold enough climate to represent the LGM period. Another study from Saito et al. (2010) confirms this result. Thus, the methods are limited by large-scale errors from climate models in the LGM time period. The better climate models are, the larger the improvement by the SDMs.
4. The differences observed between downscaled climate models and data partly come from the relationship between permafrost and the other variables. The RV conditions are based on present observations. The relationship between permafrost and predictors from ML-GAM is also calibrated in the CTRL period. The continuous or discontinuous permafrost extents may not be defined by the same isotherms seen in section 3.1; in the case of multinomial logistic models, the influence of different predictors may change in another climate.
5. Finally, LGM permafrost data are best currently available and based on geological observations of the maximum permafrost extent and correspond to the coldest time period around LGM (21 kyr BP). The LGM time period is defined with the maximum extent of the icesheets which is probably not directly related to temperature minimum. A lag may exist between the LGM data and the LGM climate simulated by climate models. Therefore, LGM permafrost data are likely to be overestimated. The differences between downscaled permafrost from PMIP2 models and LGM permafrost extent from Vandenberghe et al. (2011) should be taken as a gross estimate. Table 3. PMIP2 quantitative results for LGM period. "DATA" column corresponds to Vandenberghe et al. (2011) data. The CPA, DPA, PA, and PD indices are respectively set for continuous, discontinuous, total permafrost areas and total permafrost difference with data and are expressed in 10 6 km 2 . The %CP and %DP indices are respectively the percentages of continuous and discontinuous permafrost in agreement with data. The κ, κ max , κ adj indices corresponds respectively to the κ coefficient, its maximum value and its adjusted value. The %κ max is the percentage of κ max reached by κ. Numbers from 1 to 9 correspond to the PMIP2 models referenced in Table 1

Conclusions
We described three statistical downscaling methods (SDMs) for permafrost studies. In order to obtain high-resolution permafrost spatial distribution, we first applied these SDMs on climate models outputs for the present climate (CTRL). The approach by Generalized Additive Model (GAM) is suitable for representing the temperature behavior at a local-scale (Vrac et al., 2007b). According to Martin et al. (2011) results, choosing a GAM leads to a relevant physical model for the small scales with simple statistical relationships that are easily interpretable. Applying the conditions defined by Renssen and Vandenberghe (2003) on downscaled temperatures improves the spatial distribution of discontinuous permafrost but underestimates the total permafrost area. This GAM-RV method reaches some limits with a permafrost strongly driven by the latitudinal gradient of temperatures. Indeed, a simple combination of isotherms is not sufficient to describe the permafrost distribution at a local-scale. The approach by multinomial logistic models is more adapted for this application. The modelled relationship, as a function of several variables, provides a better representation of continuous permafrost and mountain permafrost (especially discontinuous permafrost) and reduces the variability between all climate models from PMIP2 database with a larger statistical relevance. The results from a multinomial logistic model (Multinomial Logistic GAM -ML-GAM) confirm that a study at a local-scale needs more physics about permafrost, such as the hydrological physical processes for example.
Applying the SDMs on a different climate, the Last Glacial Maximum (LGM), leads to permafrost distribution in slightly better agreement with LGM permafrost data. Nevertheless, downscaling of LGM permafrost extent faces difficulties with larger differences than CTRL period. None of the studied climate models can represent a LGM permafrost www.clim-past.net/7/1225/2011/ extent comparable to observed data. This is true for GAM-RV and ML-GAM. The variability between climate models strongly depends on large-scale temperature that cannot be completely corrected by the SDMs. The differences with LGM data reduce the contribution of downscaling and have different sources: (i) an assumed stationarity of the RV conditions for GAM-RV and the modelled relationship for ML-GAM; (ii) an initial bias from climate models which cannot simulate a proper LGM climate; (iii) a complex permafrost dynamics under-represented in the SDMs by predictors; (iv) a possible lag between the LGM period from climate models and the period represented by LGM data from Vandenberghe et al. (2011). Our approach is thus essentially limited by the ability of climate models to produce correct climatic signal, especially for climates different from CTRL. In order to obtain better contribution of the SDMs, climate models need to improve the representation of largescale temperature on continents at LGM.
To complement this study, some points would deserve to be deepened to improve our results. Permafrost is an heterogeneous variable with few observations. Climate models temperature, used to derive permafrost distribution, is a global and continuous variable. Therefore, we need localscale predictors that will add local variability to climate signal. Our SDMs use local-scale topography but other variables used in permafrost dynamic models, such as vegetation or soil properties (Marchenko et al., 2008), are required to have a representative physics of permafrost processes and a better distribution. The potential of the multinomial logistic models lies in the control of the physics included in the predictors. In this study we used the same predictors for both approaches. It is obvious that they can and should be changed in the ML-GAM methods to represent more accurately the permafrost distribution. Future research should include snow cover and thickness and soil temperature, especially for mountain permafrost influenced by snow cover. We can also imagine building new "geographical" predictors such as exposure to the sun depending on the orientation of the topography slope (Brown, 1969). The balancing and choice of "geographical" and "physical" predictors is crucial to maintain good local representation and a consistent and robust physical model applicable to different climates. To reconcile models and data, it would also be interesting to downscale permafrost at colder periods simulated by climate models, such as Heinrich events . We would be able to determine the needed temperatures to obtain the best permafrost limits according to the data from Vandenberghe et al. (2011). In this context, we also have to keep in mind our strong assumption of a near-surface permafrost in equilibrium with climate signal. Downscaling of transient climate simulations could help us to evaluate how large the difference is due to this disequilibrium.

Appendix A The kappa statistic
The following example details the calculation of the κ coefficient:
Total n .,1 n .,2 n .,3 n where "C", "D" and "N" correspond to the three categories "Continuous", "Discontinuous" and "No" permafrost, n i,j are the cell counts with the classification totals n i. and n .j , n is the number of grid-cells, P obs is the proportion of observed agreement and P chance is the proportion of random agreement or expected by chance with independent samples. The κ values are difficult to interpret because the kappa's scale (between 0 and 1) depends on the number of categories and on the sample-size. To gauge the strength of agreement without an arbitrary scale, we use the kappa maximum (κ max ). Based on the same counting as the κ, it estimates the best possible agreement (the maximum attainable κ). We adjust the cell counts (n i,j ) maximizing the agreement (cells n i,j =i ), keeping the same classification totals of each category for climate models and data (n i. and n .j ); this allows a more appropriate scaling of κ (Sim and Wright, 2005). The difference between κ and 1 indicates the total unachieved agreement. Accordingly, the difference between κ and κ max indicates the unachieved agreement beyond chance, and the difference between κ max and 1 shows the effect on agreement of pre-existing factors that tend to produce unequal classification totals such as nonlinearities or different sensitivities of climate models. Moreover, to provide useful information to interpret the magnitude of κ coefficient, we add the percentage of κ max reached by κ (%κ max ). Calculation of the κ coefficient implies intrinsic biases (Cicchetti and Feinstein, 1990). The adjusted kappa (κ adj , also called the prevalenceadjusted bias-adjusted kappa -PABAK) is also based on the same counting as the κ with adjusted cell counts minimizing those intrinsic biases. It gives an indication of the likely effects of biases alongside the true value of κ: if the value of κ adj is close to κ, then the biases are weak (Sim and Wright, 2005). κ adj is necessary to interpret in an appropriate manner the statistical meaning of κ coefficient.

Appendix B
List of abbreviations CTRL Pre-industrial or present time period LGM Last Glacial Maximum time period SDM Statistical Downscaling Method RV Renssen and Vandenberghe (2003) permafrost-temperature relationships GAM Generalized Additive Model ML-GAM Multinomial Logistic Generalized Additive Model GAM-RV Renssen and Vandenberghe (2003) relationships applied after downscaling by GAM CM Climate Models ACO "Advective" COntinentality DCO "Diffusive" COntinentality %CP Percentage of continuous permafrost in agreement with permafrost data. %DP Percentage of discontinuous permafrost in agreement with permafrost data.

Appendix C "Diffusive" (DCO) and "Advective" (ACO) continentality predictors
Description taken from Vrac et al. (2007a) and Martin et al. (2011): The proximity to the sea can locally induce a milder and wetter climate. To take into account this effect, we use the wind simulated by a climate model and the topography to build a continentality index, which can help to represent coastal effects and inland air drying. We can define different types of continentality, corresponding to different types of wind circulations, different spatial scales, and different effects on climate.
Thus, we define a quantity which is asked to account for the drying of an air parcel moving from the sea over the land, or in reverse the wetting of an air parcel leaving the land to move over the sea. This continentality C, should also account for the effect of oceans thermal inertia upon coastal areas. Practically, it is a percentage between 0 and 100: 0 for a purely maritime air, and 100 for a purely continental air. To build this index, we assume that when an air parcel moves along one path p, its continentality follows a simple decay law. Thus, we define a local decay time τ such that the rate of this changes of continentality dC p along one path p during a time dt can be written: The index i co gives the local relaxation state, that is 0 over sea or 100 over lands. For each land point, we compute the continentality by considering a large number of regularly distributed radial pathes converging towards the point from all directions. We affect a probability w p to each path, and form the weighted average: Firstly, we define a diffusive continentality (DCO) which corresponds to the shortest distance to the ocean. If a point is close to a sea or an ocean, then DCO is close to zero. Conversely, a point far away from the sea translates into a DCO close to one. This index might be adapted to account for local thermal influence of maritime air. The rate in the decay law of Eq. (C1) takes the simple form dt τ = ln2 × dl l d , where dl is an elementary displacement along the path, and l d is a tunable characteristic distance to the sea which was set to 200km in this study. In such a case the wind tends to alternate between sea-land and land-sea directions, leaving no specific monthly mean direction. Hence, all the radial pathes are considered equiprobable. Therefore, DCO does not effectively depend on the large scale wind simulated by the model. Secondly, we define the advective continentality (ACO), which will depend on the large scale monthly mean wind produced by the model. This might be suited to represent water vapour transport from the sea. The decay probability will now depend on the local magnitude of this wind U = |U |, by dt τ = ln2 × dl l a × U U a , where U a = 10 m s −1 and l a is a tunable characteristic distance set to 200 km in this study. Also, the large scale wind direction will define a preferential direction for local winds, penalizing an air-mass traveling against the wind, via the total probability of each path which is computed by: where 1 is the path local unit vector, and the factor 1 p wp indicates a subsequent normalization.