Arctic sea ice simulation in the PlioMIP ensemble.

Eight general circulation models have simulated the mid-Pliocene warm period (mid-Pliocene, 3.264 to 3.025 Ma) as part of the Pliocene Modelling Intercomparison Project (PlioMIP). Here, we analyse and compare their simulation of Arctic sea ice for both the pre-industrial period and the mid-Pliocene. Mid-Pliocene sea ice thickness and extent are reduced, and the model spread of extent is more than twice the pre-industrial spread in some summer months. Half of the PlioMIP models simulate ice-free summer conditions in the mid-Pliocene. This spread amongst the ensemble is in line with the uncertainties amongst proxy reconstructions for mid-Pliocene sea ice extent. Correlations between mid-Pliocene Arctic temperatures and sea ice extents are almost twice as strong as the equivalent correlations for the pre-industrial simulations. The need for more comprehensive sea ice proxy data is highlighted, in order to better compare model performances.


Introduction
The mid-Pliocene warm period (mid-Pliocene), spanning 3.264-3.025 Myr ago, was a period exhibiting episodes of global warmth, with global mean temperatures estimated to have been 2-3 °C higher than in the pre-industrial period (Haywood et al., 2013). The mid-Pliocene is the most recent period of Earth history that is thought to have had atmospheric CO2 concentrations resembling those seen in the 21st century, with concentrations estimated to be between 365 and 415 ppm (e.g. Pagani et al., 2010; Seki et al., 2010). Therefore, this time period is a useful interval in which to study the dynamics and characteristics of sea ice in a warmer world.
September 2012 saw Arctic sea ice fall to a minimum extent of 3.4 × 10^6 km^2, a reduction of 4.2 × 10^6 km^2 since the beginning of satellite observations in 1979 (Parkinson and Comiso, 2013; Zhang et al., 2013a). Under RCP 4.5, many models predict seasonally sea-ice-free conditions in the Arctic by the end of the 21st century (e.g. Stroeve et al., 2012; Massonnet et al., 2012), with some projections suggesting an ice-free Arctic by 2030 under RCP 8.5 (Wang and Overland, 2012), whilst other studies (e.g. Boé et al., 2009) suggest a later date for the disappearance of summer Arctic sea ice.
There is debate concerning whether the Arctic sea ice in the mid-Pliocene was seasonal or perennial. Darby (2008) suggests that the presence of iron grains in marine sediments extracted from the Arctic Coring Expedition (ACEX) core, located on the Lomonosov Ridge (87.5 °N, 138.3 °W), shows that there was year-round coverage of sea ice at this location, whilst there are indications from ostracode assemblages and ice-rafted debris sediments as far north as Meighen Island (approx. 80 °N) that Pliocene Arctic sea ice was seasonal (Cronin et al., 1993; Moran et al., 2006; Polyak et al., 2010). The prospect of the Arctic becoming ice-free in summer in the future increases the importance of the investigation of past climates which may have had seasonal Arctic sea ice.
Whilst many studies have focused on the simulation of Arctic sea ice for present and future climate by a variety of modelling groups (e.g. Arzel et al., 2006; Parkinson et al., 2006; Stroeve et al., 2007, 2012, 2014; Johnson et al., 2007, 2012; Holland and Stroeve, 2011; Blanchard-Wrigglesworth and Bitz, 2014; Shu et al., 2015), there has been little focus on the simulation of past sea ice conditions by an ensemble of models, particularly for climates with warmer than modern temperatures and reduced Arctic sea ice cover. Berger et al. (2013) look at the response of sea ice to insolation changes in simulations of mid-Holocene climate by PMIP2 and PMIP3 models. They show that all the models simulate a modest reduction in summer sea ice extent in the mid-Holocene compared to the pre-industrial control (the mean difference is lower than the difference in the mean observational Arctic sea ice extents for 1980-1989 and 2000-2009), but that in the winter approximately half simulate a more extensive mid-Holocene sea ice cover.
The Pliocene Modelling Intercomparison Project (PlioMIP) is a multi-model experiment which compares the output of different models' simulations of the mid-Pliocene, as well as pre-industrial simulations, each following a standard experimental design, set out in Haywood et al. (2010, 2011) (further details in Sect. 2.1). In this study we analyse the simulation of Arctic sea ice in each of the participating models in PlioMIP Experiment 2 (see Table 1), focusing on both the pre-industrial and mid-Pliocene outputs. We quantify the variability of sea ice extent and thickness in both simulations, and present an overview of some of the important mechanisms influencing the simulation of sea ice.

PlioMIP experimental design
Two experimental designs for the PlioMIP simulations are described, Experiment 1 in Haywood et al. (2010) and Experiment 2 in Haywood et al. (2011). Experiment 1 used atmosphere-only general circulation models (AGCMs), whilst Experiment 2 used coupled atmosphere-ocean GCMs (AOGCMs). Both experimental designs describe the model setup for pre-industrial and mid-Pliocene simulations. The PRISM3D reconstruction provides the boundary conditions for the mid-Pliocene simulations, which in Experiment 1 also include the prescribed sea surface temperatures (SSTs) and sea ice extents. The SST reconstruction utilises a multi-proxy approach, based on faunal analysis, alkenone unsaturation index palaeothermometry, and foraminiferal Mg/Ca ratios. Maximum sea ice extent in the mid-Pliocene is set as equal to the modern sea ice extent minimum, with sea-ice-free conditions for the mid-Pliocene minimum extent. These boundary conditions are based on inferences from the SST reconstruction, and evidence from diatoms and sedimentological data. In both Experiments 1 and 2, atmospheric CO2 is 405 ppm, and a modern orbital configuration is used.
In Table 1, details of the eight models which ran PlioMIP Experiment 2 simulations are summarised. With the exception of GISS-E2-R, each model was also used for Experiment 1 simulations. Four of the models (CCSM4, GISS-E2-R, HadCM3, and IPSLCM5A) are also represented in the CMIP5 ensemble, the results for which are contrasted with the PlioMIP results. Higher-resolution versions of MIROC4m and NorESM-L, and an updated version of MRI-CGCM also ran CMIP5 simulations. For COSMOS, results from the model MPI-ESM-LR, which has a higher resolution and an updated version of the ECHAM model in COSMOS, are shown.

Analysis of results
We focus on the key sea ice metrics of extent (defined as the area of ocean where sea ice concentration is at least 15 %), thickness (floe thickness), and volume. Root-mean-square deviations (RMSDs) and spatial pattern correlations (SPCs) are calculated for mean annual sea ice thicknesses. Analysis of spatial averages of sea ice thickness covers north of 80 °N (following the example of Berger et al., 2013), whereas the RMSD and SPC are calculated for ice-covered areas north of 60 °N. SPC is calculated using the Pearson product-moment coefficient of linear correlation.
To understand differences in the models' simulation of sea ice, we quantify correlations between the sea ice metrics and sea surface and surface air temperatures. We also compare the pre-industrial and mid-Pliocene sea ice extents to establish how closely correlated they are. This enables us to determine to what degree the mid-Pliocene sea ice cover is influenced by the temperatures and control simulations.
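The three metrics defined above can be illustrated with a minimal Python sketch. It is not the analysis code used for PlioMIP: the function names, the flat lists standing in for gridded fields, and the example values are all hypothetical, and the RMSD and correlation are left unweighted, whereas an analysis on a latitude-longitude grid might weight each cell by its area.

```python
import math

def sea_ice_extent(concentration, cell_area, threshold=0.15):
    """Total area of cells whose sea ice concentration is at least `threshold`."""
    return sum(a for c, a in zip(concentration, cell_area) if c >= threshold)

def rmsd(x, y):
    """Unweighted root-mean-square deviation between two equally sized fields."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def spatial_pattern_correlation(x, y):
    """Unweighted Pearson product-moment correlation between two fields."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical four-cell example (concentration as a fraction, area in arbitrary units).
concentration = [0.00, 0.10, 0.15, 0.80]
cell_area = [1.0, 2.0, 3.0, 4.0]
extent = sea_ice_extent(concentration, cell_area)  # only the last two cells qualify

# Hypothetical thickness fields (m) on the same three ice-covered cells.
thickness_a = [1.0, 2.0, 3.0]
thickness_b = [2.0, 3.0, 4.0]
offset = rmsd(thickness_a, thickness_b)
pattern = spatial_pattern_correlation(thickness_a, thickness_b)
```

In this toy case the two thickness fields differ by a uniform 1 m, so the RMSD is 1 m while the pattern correlation is 1. The two metrics are therefore complementary: RMSD responds to differences in magnitude, SPC to differences in spatial pattern.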
In our analysis, we define winter as the months February to April (FMA), and summer as the months August to October (ASO). The rationale is that in at least half of the models these are the three months with the highest and lowest mean sea ice extents respectively. This is in contrast to the typical seasonal definitions of winter (December-February) and summer (June-August).
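The selection rule behind these seasonal definitions can be sketched in a few lines of Python. The monthly values below are hypothetical round numbers chosen only to give a March maximum and a September minimum; they are not taken from any PlioMIP model, and the function name is an assumption made for illustration.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def extreme_months(monthly_extent, k=3):
    """Return the k months with the highest and the k with the lowest extent,
    each listed in calendar order."""
    order = sorted(range(12), key=lambda i: monthly_extent[i])
    lowest = sorted(order[:k])
    highest = sorted(order[-k:])
    return [MONTHS[i] for i in highest], [MONTHS[i] for i in lowest]

# Hypothetical annual cycle of sea ice extent (10^6 km^2), January to December.
cycle = [19, 21, 22, 20, 18, 16, 13, 10, 9, 11, 14, 17]
winter, summer = extreme_months(cycle)
```

For this illustrative cycle the rule returns February to April as winter and August to October as summer, matching the FMA and ASO definitions used here.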

Sea ice extent
Plots of the mean summer and winter pre-industrial Arctic sea ice concentrations are shown in Fig. 1. Across the eight-member ensemble, the multi-model mean annual sea ice extent is 16.17 × 10^6 km^2 (Table 2), with a winter (FMA) multi-model mean of 20.90 × 10^6 km^2 and a summer (ASO) multi-model mean of 10.98 × 10^6 km^2. The individual models' annual means range from 12.27 × 10^6 km^2 (IPSLCM5A) to 19.85 × 10^6 km^2 (MIROC4m) (Table 2), and monthly multi-model means range from a minimum of 10.01 × 10^6 km^2 (September) to a maximum of 21.24 × 10^6 km^2 (March, Fig. 2). The lowest individual monthly extent is 7.00 × 10^6 km^2 (HadCM3, September), with the highest monthly extent produced by MRI-CGCM (March), measuring 27.01 × 10^6 km^2 (Fig. 2). Figure 2 reveals the differences in the annual sea ice extent cycles across the ensemble. The sea ice extent amplitudes of NorESM-L and IPSLCM5A are 6.39 and 7.36 × 10^6 km^2 respectively (Table 2). These are the only models in the ensemble with seasonal amplitudes below 10 × 10^6 km^2. Other models in the ensemble show a much larger seasonal cycle, in particular GISS-E2-R, MIROC4m, and MRI-CGCM, which have sea ice extent amplitudes of 14.03, 14.05, and 15.91 × 10^6 km^2 respectively (Table 2). The ensemble mean sea ice extent amplitude is 11.18 × 10^6 km^2.

Sea ice thickness
North of 80 °N, the multi-model mean annual thickness is 2.97 m, with a winter multi-model mean of 3.29 m and a summer multi-model mean of 2.52 m. Across the ensemble, the annual mean thickness varies from 2.27 m (HadCM3) to 3.81 m (CCSM4). The winter thicknesses range from 2.56 m (NorESM-L) to 4.01 m (CCSM4), with summer between 1.27 m (GISS-E2-R) and 3.60 m (CCSM4). Plots of mean winter and summer pre-industrial Arctic sea ice thicknesses are shown in Fig. 3.
RMSDs and SPCs for mean annual Arctic sea ice thickness (for ice-covered areas north of 60 °N) are shown in Fig. 4. MIROC4m has the highest SPC with the ensemble mean (0.93), despite the thickest ice in its simulation being located north of eastern Siberia, opposite the region of thickest ice in many of the models (see Fig. 3). It also has the lowest RMSD (0.55 m), marginally lower than COSMOS (0.56 m). MRI-CGCM displays the lowest SPC with the ensemble mean (0.76) and the highest RMSD (1.33 m). The lowest SPC between two models is 0.51 (HadCM3 and MRI-CGCM), which also have an RMSD of 1.83 m, the highest of the ensemble. HadCM3 has a thickness spatial pattern which appears by eye very different to other PlioMIP models, with the thickest ice in a wedge bounded approximately by the 70 °N latitude line and 120 °W and 150 °E (see Fig. 3). However, it has a greater SPC with the ensemble mean than GISS-E2-R or MRI-CGCM, and its RMSD relative to the ensemble mean thickness is lower than that of either GISS-E2-R or MRI-CGCM (Fig. 4).

[...] (FMA, upper half) and summer (ASO, lower half) in the pre-industrial control simulations for each PlioMIP Experiment 2 model. Missing data at the poles is a plotting artefact (seen also in Figs. 1, 3, 5, and 7).

Table 2. Mean annual sea ice extents and amplitudes of sea ice extent (maximum annual sea ice extent minus minimum annual sea ice extent) for the pre-industrial (PI) and mid-Pliocene simulations from PlioMIP, and historical [...]

Comparison to CMIP5 simulations
Before examining the simulations of Arctic sea ice for the mid-Pliocene, the simulations of pre-industrial sea ice cover by individual models are assessed. A comparison with observed sea ice characteristics is a suitable methodology. Ideally, we would have compared the output of the pre-industrial simulations to observations of sea ice from the same time period. However, the most spatially and temporally comprehensive observations of sea ice originate from satellites. Respective data sets date back only as far as 1979, which is more than 100 years after the time period that the pre-industrial simulations represent. Whilst there are observations of sea ice characteristics available dating back to the early 20th century that could have been used for the comparison, most, particularly the earliest, are ship-based observations of ice margins. These observations are only available for the spring and summer months (e.g. Thomsen, 1947;Walsh and Chapman, 2001), and the sea ice extent in the remaining months must be estimated by extrapolation. Frequency and location of these observations are determined by shipping patterns, rather than by the scientific need for spatial and temporal coverage.
Due to the differences between the climate states represented by models and the chosen observations, we do not make any direct comparisons. However, all of the PlioMIP models, or related versions, are represented in the CMIP5 ensemble, for which historical simulations exist that can be directly compared to modern observations. Shu et al. (2015) provide an assessment of the historical simulation of Arctic sea ice by the CMIP5 models for the period 1979-2005. Their results show that, for the historical simulations by the PlioMIP models in CMIP5, MRI-CGCM simulates the highest mean annual sea ice extent (15.01 × 10^6 km^2), compared to the satellite observational mean of 12.02 × 10^6 km^2 for the comparable period. MRI-CGCM simulates the second highest pre-industrial mean annual sea ice extent (just 0.05 × 10^6 km^2 less than MIROC4m), and the highest mid-Pliocene mean annual sea ice extent. The CMIP5 historical extent simulated by MRI-CGCM is almost 25 % greater than the observational mean, and over 18 % greater than the ensemble mean (for CMIP5 simulations), showing that MRI-CGCM consistently simulates an Arctic sea ice extent larger than the ensemble mean.
In contrast, MIROC4m, which simulates a pre-industrial mean annual sea ice extent similar to that of the MRI-CGCM PlioMIP simulation, has the lowest historical mean annual sea ice extent of the CMIP5 models that are included in the PlioMIP ensemble (10.66 × 10^6 km^2; Shu et al., 2015). NorESM-M, the higher-resolution version of NorESM-L (which simulates both the lowest PlioMIP pre-industrial and mid-Pliocene mean annual sea ice extents), is the CMIP5 model which simulates the closest historical mean annual sea ice extent to the observations (12.01 × 10^6 km^2, just 0.01 × 10^6 km^2 lower than the observations). As NorESM-L does with the PlioMIP simulations, NorESM-M simulates the lowest sea ice extent amplitude of the PlioMIP models in CMIP5 (Shu et al., 2015).
In addition to the mean annual sea ice extent simulated by each model in the CMIP5 historical and PlioMIP simulations, Table 2 shows the ensemble mean annual extents for these sets of simulations. In both pre-industrial and mid-Pliocene simulations, compared to the ensemble mean, CCSM4 simulates a greater mean and HadCM3 simulates a smaller mean annual extent. In the CMIP5 simulations, the reverse is true (see Table 2).
Arctic sea ice thickness in the CMIP5 simulations is analysed in Stroeve et al. (2014). The correlations between the spatial patterns of Arctic sea ice thickness in the simulations (average over the years 1981-2010) and observations from Kwok et al. (2009) are less than 0.4 for all the considered PlioMIP models, with the exception of CCSM4, which has the highest SPC of the entire CMIP5 ensemble. For each PlioMIP model, the spatial pattern of sea ice thickness in the pre-industrial simulation resembles the thickness spatial pattern in that model's CMIP5 simulation, shown in Stroeve et al. (2014). It has been noted that the SPC between different ensemble simulations with the same model is significantly higher than the correlation between one model and the observations, which suggests that poor correlations are more likely explained by biases within the models than by natural variability.

Sea ice extent
In agreement with enhanced greenhouse forcing, each model in the ensemble simulates a smaller sea ice extent in the mid-Pliocene simulation in comparison to the pre-industrial (Figs. 1, 5). The multi-model mean annual extent for the mid-Pliocene simulations is 10.84 × 10^6 km^2, a reduction of 5.33 × 10^6 km^2 (33.0 %) in comparison to the respective multi-model mean of the pre-industrial simulations. Annual means in the ensemble range from 7.60 × 10^6 km^2 (NorESM-L) to 15.84 × 10^6 km^2 (MRI-CGCM) (Table 1).
The lowest multi-model monthly mean extent is 3.15 × 10^6 km^2 (September), and the highest is 16.59 × 10^6 km^2 (March). In comparison to the pre-industrial simulation, the lowest multi-model monthly mean extent is reduced by 6.86 × 10^6 km^2 (69 %). The reduction for the highest monthly multi-model mean is 4.65 × 10^6 km^2 (22 %). The relative change in the lowest extent is therefore over 3 times greater than the relative change in the highest extent.
MRI-CGCM, CCSM4, and MIROC4m simulate the highest maximum mid-Pliocene sea ice extents in the ensemble. CCSM4 and MRI-CGCM also produce the two highest minimum extents, but MIROC4m is one of the four models that simulate an ice-free Arctic summer. As a result, the sea ice extent amplitude in MIROC4m in the mid-Pliocene simulations is ≈ 64 % greater than the pre-industrial simulation extent amplitude (Table 2). The ensemble mean extent amplitude of the mid-Pliocene simulations is ≈ 20 % greater than the pre-industrial ensemble mean amplitude.
Not all of the models, however, show this trend. Three of the eight models (CCSM4, IPSLCM5A, and MRI-CGCM) simulate mid-Pliocene sea ice extent amplitudes which are smaller than the pre-industrial extent amplitudes. For CCSM4 and IPSLCM5A, the differences in extent amplitude between pre-industrial and mid-Pliocene are less than 10^6 km^2, and represent changes of 4.1 and 6.1 % respectively, so there is no substantial change in the annual cycle between the two simulations of either model. The decrease in MRI-CGCM, on the other hand, is larger (2.22 × 10^6 km^2, or 13.9 %).
In four of the eight models (COSMOS, GISS-E2-R, MIROC4m and NorESM-L) the mid-Pliocene Arctic Ocean is ice-free at some time during the summer (August-September, Fig. 6). In contrast to this, CCSM4 and MRI-CGCM simulate minimum sea ice extents of 8.90 × 10^6 km^2 and 8.26 × 10^6 km^2 respectively, which both exceed the pre-industrial minimum of HadCM3 (7.00 × 10^6 km^2), with the CCSM4 minimum also exceeding the NorESM-L pre-industrial minimum (8.34 × 10^6 km^2). This indicates the large spread in the representation of sea ice extent in the models.
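The percentages quoted above follow directly from the multi-model monthly means. The short Python check below reproduces the arithmetic using only the four values given in the text; the variable and function names are illustrative.

```python
# Multi-model monthly mean extents (10^6 km^2) quoted in the text.
pi_september, plio_september = 10.01, 3.15   # lowest monthly means
pi_march, plio_march = 21.24, 16.59          # highest monthly means

def relative_reduction(pre_industrial, mid_pliocene):
    """Fractional reduction from the pre-industrial to the mid-Pliocene value."""
    return (pre_industrial - mid_pliocene) / pre_industrial

september_reduction = relative_reduction(pi_september, plio_september)  # about 0.69
march_reduction = relative_reduction(pi_march, plio_march)              # about 0.22
ratio = september_reduction / march_reduction                           # just over 3
```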
For those models that simulate summer sea ice in the mid-Pliocene, the summer sea ice conditions vary strongly. Summer sea ice in HadCM3 is confined to the Arctic Basin, with concentrations that do not exceed 60 %, and very low concentrations along all ice edges. The summer sea ice margin in MRI-CGCM, on the other hand, extends almost to the southern tip of Greenland, and a large proportion of the sea ice cover is characterised by concentrations greater than 90 % (Fig. 5).
Four of the five models with larger mid-Pliocene extent amplitudes simulated ice-free conditions for part of the summer in the mid-Pliocene. The increase in extent amplitude ranges from a 9.4 % increase in COSMOS to a 101.3 % increase in NorESM-L. It might be expected that simulating a seasonally ice-free mid-Pliocene Arctic would lead to a decrease in extent amplitude, as the minimum extent has decreased as low as possible; however, this is not the case. As Fig. 3 shows, the four models with seasonally ice-free mid-Pliocene simulations have the thinnest pre-industrial summer ice, which disappears in the mid-Pliocene summer, whereas much of the winter sea ice has simply thinned, so there is less of a reduction in extent.

Sea ice thickness
Plots of the mean summer and winter mid-Pliocene Arctic sea ice thicknesses are shown in Fig. 7. The multi-model mean annual sea ice thickness is 1.30 m, which, compared to the pre-industrial simulations, is a reduction of 1.7 m (56 %). Across the ensemble, the annual mean thicknesses range from 0.44 m (NorESM-L) to 2.56 m (MRI-CGCM). The multi-model winter mean thickness is 1.77 m, 1.5 m (46 %) less than the pre-industrial, whereas the summer multi-model mean thickness drops by 1.8 m (71 %) to 0.74 m. Similar to the sea ice extent, the summer sea ice thickness shows a greater relative decline with respect to pre-industrial than during the winter, although the contrast is not as stark for the thickness. The individual model winter sea ice thicknesses range from 0.79 m (NorESM-L) to 2.78 m (MRI-CGCM), with the summer sea ice thicknesses between 0.3 m (NorESM-L) and 2.24 m (MRI-CGCM).
SPCs and RMSDs between the pre-industrial and mid-Pliocene simulations are shown in Fig. 4. All but five of the mid-Pliocene RMSDs are lower than the equivalent RMSD for the pre-industrial simulations. This trend is not seen in the SPCs, where just over half (19 out of 36) of the mid-Pliocene correlations are higher than the corresponding pre-industrial correlation. These results show that the differences in thicknesses between the models are lower in the mid-Pliocene simulations, but the differences between thickness patterns are comparable. Lower overall RMSDs are likely to be at least in part due to the increase in the area of ice-free ocean, and lower mean thicknesses in the mid-Pliocene simulations compared to the pre-industrial.
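The count of 36 comparisons is consistent with pairing nine thickness fields, i.e. the eight models plus the ensemble mean. This reading is an inference from the numbers rather than something stated explicitly, and the sketch below simply enumerates the pairs under that assumption.

```python
from itertools import combinations

# The eight PlioMIP Experiment 2 models named in the text.
models = ["CCSM4", "COSMOS", "GISS-E2-R", "HadCM3",
          "IPSLCM5A", "MIROC4m", "MRI-CGCM", "NorESM-L"]

# Assumption: the ensemble mean is treated as a ninth field in the comparison.
fields = models + ["ensemble mean"]
pairs = list(combinations(fields, 2))

n_pairs = len(pairs)                 # 9 choose 2
fraction_higher_spc = 19 / n_pairs   # share of pairs with a higher mid-Pliocene SPC
n_lower_rmsd = n_pairs - 5           # "all but five" RMSDs are lower
```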
GISS-E2-R has the highest SPC with the ensemble mean (0.90), with NorESM-L the lowest (0.60). NorESM-L has correlations of less than 0.5 with two models, CCSM4 (0.49) and MRI-CGCM (0.27). As with the pre-industrial results, MRI-CGCM has the highest RMSD compared to the ensemble of all the simulations (1.05 m), and the RMSD of 1.46 m between MRI-CGCM and NorESM-L is the highest between any two models. The highest SPC between two models is 0.97, between COSMOS and MIROC4m, which also have the lowest RMSD, at 0.11 m. Figure 4 also shows RMSDs and SPCs between each model's pre-industrial and mid-Pliocene runs. All but two models have SPCs exceeding 0.9 between the thicknesses of both simulations, with the exceptions being GISS-E2-R (0.81) and NorESM-L (0.56). The SPC between the ensemble means is 0.79.

Variability across the ensemble
The standard deviations (SDs) of the monthly ensemble sea ice extents and thicknesses for both the pre-industrial and mid-Pliocene simulations are shown in Fig. 8. In each month from December to June, the mid-Pliocene extent SD is lower than the pre-industrial extent SD. During these months, the maximum extent SD in both simulations occurs in February, and SD decreases each month from February to June. In the pre-industrial simulations, extent SD is lowest in July, following which it increases each month until the February peak. In the mid-Pliocene simulations, SD increases from June through July and August, and reaches its maximum in October. The mid-Pliocene extent SDs in August and October are greater than those in February and March. The annual cycle of pre-industrial sea ice thickness SD has a minimum in May and a maximum in September. The mid-Pliocene sea ice thickness SD annual cycle follows a similar pattern, with the lowest SD in March and the maximum in July, both 2 months earlier than the equivalent pre-industrial extremes.
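A minimal sketch of how such a monthly ensemble spread can be computed is given below. The model labels and values are hypothetical, and the use of the sample standard deviation (dividing by n − 1) is an assumption, since the text does not state which estimator underlies Fig. 8.

```python
from statistics import stdev

def monthly_spread(extent_by_model):
    """Standard deviation across models for each month.

    `extent_by_model` maps a model name to a list of monthly values;
    all lists must have the same length.
    """
    series = list(extent_by_model.values())
    n_months = len(series[0])
    return [stdev(model[m] for model in series) for m in range(n_months)]

# Hypothetical three-model, two-month example (10^6 km^2).
example = {
    "model_a": [1.0, 2.0],
    "model_b": [3.0, 2.0],
    "model_c": [5.0, 2.0],
}
spread = monthly_spread(example)
```

In the first month the three hypothetical models disagree and the spread is 2; in the second they agree exactly and the spread is 0.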

Correlation of sea ice characteristics in the ensemble
The correlation coefficient between the mean summer sea ice extents of the pre-industrial and mid-Pliocene simulations is 0.47, compared to a correlation coefficient of 0.87 between the mean winter sea ice extents of both time slices (Fig. 9a, b). The models' annual mean sea ice extents for the two climate states show a correlation coefficient of 0.74 (not shown). Sea ice thicknesses simulated by the pre-industrial and mid-Pliocene simulations are strongly correlated in both summer and winter, with correlation coefficients of 0.82 and 0.85 respectively (Fig. 9c, d). Whilst the winter pre-industrial sea ice thickness shows a weak relationship with the mid-Pliocene winter sea ice extent (Fig. 9f), with a correlation coefficient of just 0.30, the relationship between the summer values is stronger, with a correlation coefficient of 0.81 (Fig. 9e). It should be noted that, with a sample size of just 8, only correlation coefficients greater than 0.70 are significant at the 95 % level, so the correlation coefficients for the relationships shown in Fig. 9a and f are not significant at this level.
The simulated mid-Pliocene sea ice extent and sea ice volume show a stronger relationship with both surface air temperatures (SATs) and sea surface temperatures (SSTs) than the pre-industrial sea ice extent and sea ice volume (Fig. 10). The correlation coefficient of the mid-Pliocene mean annual sea ice extent with SAT is −0.76, whereas that of the pre-industrial sea ice extent with SAT is −0.18. For SST, the correlation coefficient with mid-Pliocene sea ice extent is −0.73, and with pre-industrial sea ice extent it is −0.26. For the summer, the mid-Pliocene sea ice extents have a correlation coefficient of −0.88 with both SAT and SST (not shown). In contrast, the pre-industrial summer sea ice extents have correlation coefficients of −0.27 (SAT) and −0.32 (SST) (not shown).
Mean annual pre-industrial SATs and SSTs have correlations with mean annual pre-industrial sea ice volume of −0.12 and −0.29 respectively. This contrasts with the respective mid-Pliocene correlation coefficients of −0.83 and −0.82. This confirms that the simulated mid-Pliocene sea ice extents and volumes have, independent of the season, stronger negative correlations (all significant at the 95 % level) with temperatures than the simulated pre-industrial sea ice extents (for which none of the correlations with temperature are significant at the 95 % level).
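The significance threshold of about 0.70 for a sample of eight models can be recovered from the relation between a Pearson correlation and Student's t statistic, r = t / sqrt(t^2 + n − 2). The Python sketch below assumes a two-tailed test at the 5 % level, which is not stated explicitly in the text, and takes the critical t value for 6 degrees of freedom (2.447) from standard tables rather than computing it.

```python
import math

# Two-tailed 5 % critical value of Student's t for 6 degrees of freedom
# (standard table value; the two-tailed choice is an assumption).
T_CRIT_DF6 = 2.447

def critical_r(n, t_crit):
    """Smallest |r| that is significant for a sample of size n."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)

r_crit = critical_r(8, T_CRIT_DF6)  # about 0.707

# A selection of correlation coefficients quoted in the text.
correlations = {
    "summer extent, PI vs mid-Pliocene (Fig. 9a)": 0.47,
    "winter extent, PI vs mid-Pliocene (Fig. 9b)": 0.87,
    "winter PI thickness vs mid-Pliocene extent (Fig. 9f)": 0.30,
    "mid-Pliocene annual extent vs SAT": -0.76,
    "pre-industrial annual extent vs SAT": -0.18,
}
significant = {name: abs(r) > r_crit for name, r in correlations.items()}
```

Under these assumptions the relationships in Fig. 9a and f fall below the threshold, as do the pre-industrial correlations with temperature, whereas the mid-Pliocene correlations with temperature exceed it.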

Influence of the sea ice models
The sea ice components of each model differ in resolution, representation of sea ice dynamics and thermodynamics, and formulation of various parameterisations, such as sea ice albedo. The key details of each model's sea ice component are summarised in Table 1. The models CCSM4 and NorESM-L use the same sea ice component, based on CICE4 (Hunke, 2010), although NorESM-L has a coarser model grid in the atmosphere than CCSM4, and furthermore employs a completely different ocean component (Table 1). The sea ice dynamics of the ensemble members can be categorised into three groups. First, CCSM4, NorESM-L, and MIROC4m, which all use the elastic-viscous-plastic (EVP) rheology of Hunke and Dukowicz (1997). Second, COSMOS, GISS-E2-R, and IPSLCM5A, which are based on viscous-plastic (VP) rheologies (Marsland et al., 2003; Zhang and Rothrock, 2000; Fichefet and Morales Maqueda, 1999). Third, HadCM3 and MRI-CGCM, which do not consider any type of sea ice rheology, the sea ice following simple free-drift dynamics (Cattle and Crossley, 1995; Mellor and Kantha, 1989). In PlioMIP, there does not appear to be any link between the type of dynamics of the sea ice components and the simulated sea ice extents: MRI-CGCM and MIROC4m produce the two highest annual means for the pre-industrial whilst having very different sea ice dynamics. The three models that produce the lowest pre-industrial extents, i.e. NorESM-L, IPSLCM5A, and HadCM3, employ different rheologies (EVP, VP, and no rheology respectively).
Most of the models use a leads parameterisation in their sea ice thermodynamics component, with only CCSM4 and NorESM-L employing explicit melt pond schemes. The models HadCM3 and COSMOS both use the leads parameterisation based on Hibler (1979). The models HadCM3, MIROC4m and MRI-CGCM all utilise the "zero-layer" model developed by Semtner (1976). Similarly to the considered sea ice dynamics, there is no clear influence of the thermodynamics schemes used in the models on the simulated pre-industrial sea ice extent.
The simulation of Arctic sea ice by means of GCMs has been demonstrated to be very sensitive to the parameterisation of sea ice albedo. This has been observed in the case of variations in albedo in different models (Hodson et al., 2013), and when adjusting the parameterisation in one specific model (Howell et al., 2014). Hill et al. (2014) show that clear-sky albedo is the dominant factor in high-latitude warming in the PlioMIP ensemble. The four models that display the highest warming effect from the clear-sky albedo are those four models that simulate an ice-free mid-Pliocene summer (COSMOS, GISS-E2-R, MIROC4m, and NorESM-L). NorESM-L shows the largest warming due to clear-sky albedo; CCSM4, on the other hand, shows the smallest clear-sky albedo effect. Both NorESM-L and CCSM4 use the same sea ice component, based on CICE4 (Hunke and Lipscomb, 2008). This sea ice model employs a shortwave radiative transfer scheme to internally simulate the sea ice albedo and thus produce a more physically based parameterisation (Holland et al., 2011).
However, it appears that the performance of this albedo scheme is very sensitive to differences in other components of the climate models: NorESM-L (which shows a large contribution of clear-sky albedo) uses the same atmosphere component as CCSM4 (low contribution of clear-sky albedo), albeit at a lower resolution in the PlioMIP experiment, but it employs a different ocean component that also has a lower resolution than the ocean component used in CCSM4. The contrast in the contribution of clear-sky albedo to high-latitude warming between NorESM-L and CCSM4 is reflected in the large difference in their simulations of summer mid-Pliocene sea ice. One cause is certainly the nature of the sea ice-albedo feedback mechanism (Curry et al., 1995). Reduced albedo at high latitudes can be both a cause of and a result of a reduced sea ice extent. Models with parameterisations with a lower sea ice albedo minimum therefore have a greater potential to amplify the warming that originates from other sources in simulations of the mid-Pliocene, such as greenhouse gas emissivity. The low sea ice albedo assumed in NorESM-L is a likely explanation for the low sea ice extents it simulates (Figs. 2, 6), both in mid-Pliocene and pre-industrial simulations.
Clear-sky albedo has the highest contribution to high-latitude warming in NorESM-L, with the second highest being in MIROC4m. In MIROC4m there is a fixed albedo of 0.5 for bare sea ice, with higher albedo for snow-covered sea ice that furthermore varies according to ambient surface air temperature (K-1 Model Developers, 2004). Of the six models that do not use a radiative transfer scheme to internally simulate sea ice albedo (those except NorESM-L and CCSM4), only GISS-E2-R has an albedo minimum lower than 0.5. However, this model allows the albedo to vary between 0.44 and 0.84 (Schmidt et al., 2006). All other models also allow the sea ice albedo to vary, and consequently MIROC4m has a lower overall albedo. This may help to explain the ability of MIROC4m to simulate an ice-free mid-Pliocene summer, despite simulating one of the highest winter sea ice extents for both pre-industrial and mid-Pliocene.
As the parameterisation of sea ice albedo is kept unchanged between pre-industrial and mid-Pliocene simulations, differences in the parameterisation between the models should have similar effects in both simulations. However, if there is a temperature threshold above which the ice-albedo feedback becomes more dominant in some of the models, then this could explain the different influence of the sea ice parameterisation on pre-industrial and mid-Pliocene simulations.
General circulation models are tuned to best reproduce modern-day climate conditions, and parameterisations are based on modern observations (Hunke, 2010; Mauritsen et al., 2012). When simulating the climate of time periods with different climate states, such as the mid-Pliocene, models that are tuned towards present-day conditions may be biased in some regions. However, it is disputed to what extent the adjustment of parameters, such as sea ice albedo, within the limits of observational uncertainties can affect the overall sea ice cover and compensate for other shortcomings in the model (Eisenman et al., 2007, 2008; DeWeaver et al., 2008). Massonnet et al. (2012) describe the characteristics of Arctic sea ice simulated by the CMIP5 ensemble for the time period from 1979 to 2010 as being related in a "complicated manner" to the simulated future change in September Arctic sea ice extent. Figure 9 demonstrates, based on correlation values, that some combinations of sea ice characteristics in the pre-industrial and mid-Pliocene simulations are much more strongly related to each other than others. In Sect. 3.2 it was highlighted that the differences in the PlioMIP models' simulation of sea ice for 1979-2005 in CMIP5 are not consistent with the differences in pre-industrial or mid-Pliocene simulations in the PlioMIP ensemble.

Influence of the control simulation
All of the models that simulate thinner pre-industrial summer sea ice than the ensemble mean also simulate ice-free conditions during the mid-Pliocene summer, with the exception of HadCM3. Holland and Bitz (2003) demonstrate that the thickness of sea ice in control simulations has a stronger influence on the climate state of the Northern Hemisphere polar region in simulations of future climates than sea ice extent. Massonnet et al. (2012) find that those CMIP5 models that predict an earlier disappearance of September Arctic sea ice generally have a smaller initial September sea ice extent. In PlioMIP, mean summer pre-industrial sea ice thicknesses have correlation coefficients of 0.81 and 0.82 with mean summer mid-Pliocene sea ice extents and thicknesses, respectively. Mean summer pre-industrial sea ice extents, on the other hand, show weaker correlations with mean summer mid-Pliocene sea ice extents and thicknesses, with respective correlation coefficients of 0.47 and 0.51. The relatively thin pre-industrial summer sea ice simulated in PlioMIP by COSMOS, GISS-E2-R, MIROC4m, and NorESM-L therefore appears to be an important factor for the ability of those models to simulate an ice-free mid-Pliocene summer. An exception is HadCM3, which simulates perennial sea ice in the mid-Pliocene, despite simulating relatively thin (within the PlioMIP ensemble) pre-industrial sea ice.
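The inter-model correlations quoted above relate one value per model in the pre-industrial simulation to one value per model in the mid-Pliocene simulation. A minimal sketch of such a calculation is given below, assuming a Pearson correlation coefficient (the type of coefficient is not stated here). The function name and the input values are hypothetical placeholders chosen for illustration; they are not PlioMIP output.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Placeholder values only, NOT PlioMIP results: one entry per hypothetical model.
pi_summer_thickness_m = [1.0, 1.5, 2.0, 2.5, 3.0]
plio_summer_extent_1e6_km2 = [0.0, 1.0, 1.5, 3.5, 4.0]

r = pearson_r(pi_summer_thickness_m, plio_summer_extent_1e6_km2)
print(f"inter-model correlation: {r:.2f}")
```

With only eight models in the ensemble, a coefficient computed in this way rests on eight pairs of values, so it should be read as indicative rather than as a precise estimate.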

Influence of atmosphere and ocean on the sea ice simulation
In the mid-Pliocene simulations, the correlation coefficient between Arctic surface temperatures and simulated sea ice extent is much higher than the corresponding correlation coefficient in the pre-industrial simulations (Fig. 10a, b). Pre-industrial sea ice is thicker than mid-Pliocene sea ice, which could explain the lower sensitivity of the pre-industrial sea ice extent to surface temperatures. However, similar differences in correlation strength between the pre-industrial and mid-Pliocene simulations are also seen for mean sea ice volume (Fig. 10c, d), so there is no strong relationship between warmer pre-industrial simulations and those with less total ice. In the pre-industrial simulations, much of the ocean north of 60° N is fully covered with sea ice, so all SSTs there will be −1.8 °C. The uniformity of the SSTs in this region could be a plausible explanation for the weak correlation between the overall Arctic sea ice extents and SSTs north of 60° N in the pre-industrial simulations of the PlioMIP ensemble. The reduced sea ice coverage in the mid-Pliocene simulations, particularly during the summer months, enables, on the other hand, a greater range of possible SST values. This is potentially the reason for a much stronger correlation with the simulated mid-Pliocene sea ice extents (Fig. 10). In the models, the presence of ice in a grid box, even at low concentrations, restricts the warming in the ocean. Larger parts of the ocean are ice-free for longer periods in the year in the mid-Pliocene simulations than in the pre-industrial simulations, meaning that there are longer periods in the mid-Pliocene simulations during which the ocean can warm. This will in turn affect the warming of the atmosphere in the models, and so is a possible reason for the better correlation between sea ice extent and surface temperatures in the mid-Pliocene simulations. In addition to SATs and SSTs, there are of course other atmospheric and oceanic influences on the simulation of Arctic sea ice.
The Atlantic meridional overturning circulation (AMOC) contributes significantly to poleward oceanic heat transport and has been shown to have a strong impact on Arctic sea ice (e.g. Mahajan et al., 2011; Day et al., 2012; Miles et al., 2014). Zhang et al. (2013b) analyse the simulation of the AMOC in both pre-industrial and mid-Pliocene simulations of the PlioMIP ensemble and find that there is little difference between each model's pre-industrial and mid-Pliocene AMOC simulation. There is no consistent change in northward ocean heat transport, with half the models simulating a slight increase (of less than 10 %) and half simulating a slight decrease (of less than 15 %). Of the models which simulate increased northward ocean heat transport (COSMOS, GISS-E2-R, IPSLCM5A, and MRI-CGCM), only two (COSMOS and GISS-E2-R) simulate an ice-free mid-Pliocene summer. This suggests that the AMOC and northward oceanic heat transport are not the most important factors behind the ensemble variability in sea ice in the mid-Pliocene simulations of PlioMIP.
An analysis of the influence of multi-decadal variability on Arctic sea ice extent in selected CMIP3 simulations (covering 1953-2010) by Day et al. (2012) showed a significant correlation between Arctic sea ice extents and Atlantic Multi-decadal Oscillation (AMO) indices. Kwok (2000) and Parkinson (2008) demonstrate evidence of an influence of the North Atlantic Oscillation (NAO) on Arctic sea ice. Table 3 shows annual and decadal correlations between Arctic sea ice extent and AMO and NAO indices for simulations from three PlioMIP models (CCSM4, HadCM3, and NorESM-L).
All three models show a small but significant (at the 90 % level) correlation between the pre-industrial annual Arctic sea ice extents and the NAO indices. The correlation coefficients at the decadal timescale are increased for both HadCM3 and NorESM-L but are not significant for any of the models. None of the correlations between mid-Pliocene Arctic sea ice extents and NAO indices are significant at the 90 % level. Likewise, none of the correlations between pre-industrial Arctic sea ice extents and AMO indices are significant at the 90 % level. For the mid-Pliocene simulations, only the correlation between the annual Arctic sea ice extents and AMO indices from the CCSM4 simulations is significant at the 90 % level.
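A larger decadal correlation coefficient can fail a significance test that a smaller annual coefficient passes, because averaging into decades leaves far fewer samples. The sketch below illustrates this with a two-sided test based on the Fisher z-transformation. The choice of test, the function names, and the sample sizes are assumptions made for illustration; the test actually applied to Table 3 is not specified here, and serial correlation in the time series (which reduces the effective sample size) is ignored.

```python
import math
from statistics import NormalDist


def correlation_p_value(r, n):
    """Two-sided p-value for a correlation r from n independent samples.

    Uses the Fisher z-transformation: atanh(r) * sqrt(n - 3) is approximately
    standard normal under the null hypothesis of zero correlation.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 3:
        raise ValueError("need more than 3 samples")
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def is_significant(r, n, level=0.90):
    """True if the correlation is significant at the given confidence level."""
    return correlation_p_value(r, n) < 1.0 - level


# Hypothetical numbers: 200 annual values versus the same record as 20 decades.
print(is_significant(0.2, 200))  # weak annual correlation, many samples
print(is_significant(0.3, 20))   # stronger decadal correlation, few samples
```

Under these assumptions the weaker annual correlation is significant at the 90 % level whereas the stronger decadal one is not, which mirrors the pattern described for HadCM3 and NorESM-L.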
There is no significant correlation between decadal sea ice extents and NAO/AMO indices in the three models shown, and so it is unlikely that differences in the mean sea ice extents (averages over between 30 and 200 years' worth of climatology) between different models and simulations can be explained by different influences of these variability indices. Investigating this more thoroughly would require much longer time series from all the modelling groups, which are not available. A comprehensive analysis of the relationships between variability indices and sea ice in the PlioMIP simulations is beyond the scope of this paper.
Patterns of ice thicknesses are strongly influenced by the motion of sea ice in the models. In each model, the equations used to determine sea ice motion account for stresses on the ice from surface winds and ocean currents, with the exceptions of HadCM3, which does not take surface winds into account (Gordon et al., 2000), and MRI-CGCM, where the ocean currents are not taken into account in determining ice motion (Mellor and Kantha, 1989). Figure 12 shows the mean annual 10 m surface winds and sea ice thicknesses for the IPSLCM5A and MIROC4m simulations. In MIROC4m, the dominant wind direction between 90 and 180° E over the Arctic Basin is towards the northern coast of eastern Siberia, where a build-up of thicker ice is present. Similarly, in IPSLCM5A (Fig. 12), the dominant wind direction is towards the north of Greenland and the Canadian Arctic Archipelago, where the thickest ice is found. Mean annual 10 m winds and sea ice thicknesses for all simulations (excluding CCSM4, for which 10 m winds are not an output) are included in the Supplement.
In HadCM3, the ocean surface currents form a vortex in part of the Arctic Basin (Beaufort Gyre), where the thickest sea ice is present in both simulations (see Fig. 13). Given that the sea ice motion is entirely determined by the surface ocean current, its influence on the spatial pattern of sea ice thickness is clear. If sea ice motion were instead determined by surface wind stresses in addition to the ocean currents (which do not have the same patterns in HadCM3), this should result in a different configuration of sea ice in the Arctic basin, and would likely affect the location of the sea ice margins simulated by the model. Mean annual surface ocean currents and sea ice thicknesses for all simulations are included in the Supplement.
Understanding the more precise influences of winds and ocean currents on the modelled sea ice and the causes of differences between models, as well as different simulations with the same model, would require a far more extensive analysis. Differences in seasonal, as well as annual patterns, alongside atmospheric circulations at higher levels, may be explored in further work.

Sea ice proxy data
Given the large spread within the ensemble with regard to the nature of mid-Pliocene sea ice, the comparison of the different models' sea ice simulation with a reconstruction of mid-Pliocene Arctic sea ice from proxy data could prove insightful. The recent development of organic biomarker proxies such as IP25 to reconstruct past sea ice presence (e.g. Knies et al., 2014) may indicate which models simulate the mid-Pliocene climate more realistically. A reasonable performance of a model in simulating mid-Pliocene sea ice may also improve confidence in its prediction of future sea ice, in particular if its simulation of present-day sea ice matches observations closely. If a model simulation matches well with observations/proxy reconstructions for just one climate, this may not necessarily be due to a good model performance; rather, the model may be producing "the right answers for the wrong reasons", such as error compensation (Massonnet et al., 2012). However, a greater degree of confidence could be held in the predictions from a model which produces sea ice simulations that closely match both modern observations in a modern simulation and proxy-data-based reconstructions in a mid-Pliocene simulation, as the probability that the model compares well to the data by chance for both is reduced.
Relating proxy data to mid-Pliocene sea ice is, however, subject to limitations due to uncertainty in the proxy itself. Darby (2008) demonstrates evidence for perennial Arctic sea ice in the mid-Pliocene, whilst the presence of IP25, a biomarker proxy for sea ice coverage (Belt and Müller, 2013), in mid-Pliocene sediments recovered from two boreholes in the Atlantic-Arctic gateway (located at 80.16° N, 6.35° E and 80.28° N, 8.17° E; see Fig. 11) implies that the maximum sea ice margin during the mid-Pliocene extended southwards beyond these two sites, but the minimum margin did not (Knies et al., 2014). The locations of these sites are within the maximum mid-Pliocene sea ice margins simulated by all of the PlioMIP models, but also within the minimum sea ice margins simulated by three of the models that simulate summer sea ice (CCSM4, IPSLCM5A and MRI-CGCM), although the sea ice concentration at these sites is less than 50 % in the CCSM4 and IPSLCM5A simulations. The extent of the sea ice minimum in HadCM3 does not reach the location of the sites analysed in Knies et al. (2014), and so is consistent with the conclusions drawn from proxy data in both the studies by Darby (2008) and Knies et al. (2014). A greater spatial coverage of sea ice proxy data, such as that used in Knies et al. (2014), would improve the analysis of the simulation of sea ice by the PlioMIP models. At the moment, limited data availability does not allow for robust model-proxy comparisons.
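The comparison against the two borehole sites amounts to asking whether a site lies inside the simulated sea ice margin in every month, in some months, or in none. A minimal sketch of that classification is given below. The function name, the input layout (twelve monthly mean concentrations at the grid point nearest the site, as fractions), and the default threshold of 0.15 for the ice margin are assumptions made for illustration; the threshold used for the margins in Fig. 11 is not restated here.

```python
def classify_site(monthly_concentration, threshold=0.15):
    """Classify a site from 12 monthly mean sea ice concentrations (0 to 1).

    Returns "perennial" if the site is inside the ice margin in every month,
    "seasonal" if it is inside the margin in some months only, and
    "ice-free" if it is never inside the margin.
    """
    if len(monthly_concentration) != 12:
        raise ValueError("expected 12 monthly values")
    covered = [c >= threshold for c in monthly_concentration]
    if all(covered):
        return "perennial"
    if any(covered):
        return "seasonal"
    return "ice-free"


# Hypothetical concentrations at one site, NOT taken from any PlioMIP model.
site = [0.9, 0.9, 0.9, 0.8, 0.6, 0.4, 0.3, 0.3, 0.3, 0.5, 0.7, 0.9]
print(classify_site(site))                 # inside the margin all year
print(classify_site(site, threshold=0.5))  # seasonal under a stricter margin
```

The second call shows why the choice of threshold matters for sites where the simulated summer concentration is low, as is the case for the CCSM4 and IPSLCM5A simulations, in which the concentration at the sites is below 50 %.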

Conclusions
We have presented a detailed analysis of the simulation of Arctic sea ice in the PlioMIP model ensemble, for both pre-industrial control and mid-Pliocene simulations. The sea ice in the mid-Pliocene simulations is overall less extensive and thinner than the pre-industrial sea ice, with a 33 % decrease in mean annual sea ice extent for the ensemble mean, and a 56 % reduction in the ensemble mean annual sea ice thickness. The changes in the mid-Pliocene, relative to the pre-industrial, are largest during the summer months, both in absolute and relative terms, and for both sea ice extent and sea ice thickness.
The simulated mid-Pliocene sea ice extents are strongly negatively correlated with the Arctic temperatures. In contrast, there is only a weak correlation between pre-industrial sea ice extents and temperature. Hill et al. (2014) identified clear-sky albedo as the dominant driver of high-latitude warming in the mid-Pliocene simulations of PlioMIP, particularly in those models that simulate an ice-free mid-Pliocene summer. Sea ice-albedo feedbacks may contribute to the stronger relationship between surface temperatures and sea ice in the mid-Pliocene simulations, as the feedback mechanism enhances the warming that originates from increased greenhouse gas concentrations. The effect of the sea ice-albedo feedback does not appear to be similarly pronounced in the pre-industrial simulations. If it is the case that some models see an enhanced ice-albedo feedback in warmer climates, then this is likely to affect those models' prediction of future Arctic sea ice change.
HadCM3 is the only model that simulates both perennial mid-Pliocene Arctic sea ice and a minimum sea ice extent that lies entirely north of the two sites studied in Knies et al. (2014), at 80.16° N, 6.35° E and 80.28° N, 8.17° E, where IP25 proxy data indicate the presence of a sea ice margin in the mid-Pliocene. However, this proxy evidence is sparse, originating from just two sites in the same region. If the proxy studies indicating seasonal mid-Pliocene Arctic sea ice (e.g. Cronin et al., 1993; Moran et al., 2006; Polyak et al., 2010) are correct, then the mid-Pliocene Arctic sea ice simulated by the COSMOS, GISS-E2-R, MIROC4m, and NorESM-L models is consistent with those data.
Given the limited amount of suitable proxy data, we are currently not able to make firm judgements with respect to a selection of models that simulate a more accurate mid-Pliocene Arctic sea ice cover if compared to the geologic record. The availability of additional proxy data may enable such a conclusion in the future and could help to identify strengths and weaknesses in the different models' simulations of sea ice and gauge confidence in their predictions of future sea ice.
However, as discussed in Sect. 4.1.3, there are numerous atmospheric and oceanic factors that influence the simulation of Arctic sea ice. As highlighted by Massonnet et al. (2012), a model can simulate the "right" results for the wrong reasons, perhaps due to error compensation. This does not mean that the analysis of sea ice simulations for past climates, such as the mid-Pliocene, is not valuable and justified, but that it is important to highlight that the forcings behind the sea ice simulation have to be better understood. Variability modes, such as NAO or AMO, whilst shown to have influence on sea ice extent from an annual viewpoint, do not appear to exert significant influence over the mean sea ice state on a decadal timescale. The models' representation of sea ice motion, and by extension ocean currents and surface winds, is an important influence on the distribution of sea ice, and worthy of a more detailed study. Future studies must particularly aim at quantifying the contribution of the various forcings on the sea ice in warmer climates.
The Supplement related to this article is available online at doi:10.5194/cp-12-749-2016-supplement.