1. Introduction
Sea ice is an important component of the climate system. Changes in sea ice cover alter the surface albedo, which in turn reinforces the initial change in ice area [1–3]. Extensive sea ice over Arctic regions plays a major role in heat, moisture, and momentum exchanges between the atmosphere and ocean [4]. This is because the sea ice surface reflects significantly more of the incident solar radiation than open water and because meltwater has a significant influence on oceanic circulation [5,6]. Therefore, changes in the extent of sea ice have great potential to influence variations in regional and global climatic systems [7,8].
Although a long-term dataset covering a large area is necessary for reliable spatiotemporal analysis, it is difficult to scrutinize sea ice changes because of the lack of in-situ observations [9]. However, Arctic sea ice concentration, extent, and area have been continuously monitored for approximately 34 years, since 1979, with the help of satellite-based multichannel passive microwave imaging systems [10] such as the Scanning Multichannel Microwave Radiometer (SMMR), the Special Sensor Microwave/Imager (SSM/I), and the Advanced Microwave Scanning Radiometer (AMSR). Sea ice concentration is defined as the fraction of ice-covered area at a given point in the ocean. Sea ice extent denotes the sum of the areas of pixels with concentrations of at least 15%, while sea ice area is the sum of the products of ice concentration and pixel area within the ice extent [11,12]. Recent satellite remote sensing studies show that there has been a significant decline in Arctic sea ice extent [13,14], with the possibility that global warming is occurring more rapidly than before. Therefore, reliable outlooks for sea ice conditions are crucial in understanding the future Arctic environment and global climate change [15].
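The definitions of sea ice extent and area given above can be illustrated with a minimal sketch. The function name `extent_and_area`, the toy 2 × 3 grid, and the uniform 625 km² pixel area are hypothetical choices made for illustration only; they are not taken from the NSIDC products.

```python
import numpy as np

def extent_and_area(concentration, pixel_area_km2, threshold=0.15):
    """Return (extent, area) in km^2 for a grid of ice concentrations in [0, 1]."""
    concentration = np.asarray(concentration, dtype=float)
    pixel_area_km2 = np.asarray(pixel_area_km2, dtype=float)
    # Pixels with at least 15% concentration count toward the extent.
    ice_covered = concentration >= threshold
    # Extent: total area of the ice-covered pixels.
    extent = float(np.sum(pixel_area_km2[ice_covered]))
    # Area: concentration-weighted pixel area within the extent.
    area = float(np.sum(concentration[ice_covered] * pixel_area_km2[ice_covered]))
    return extent, area

# Hypothetical grid with uniform 625 km^2 pixels (an assumption, not NSIDC data).
sic = np.array([[0.00, 0.10, 0.15],
                [0.50, 0.90, 1.00]])
pixel_area = np.full(sic.shape, 625.0)
extent, area = extent_and_area(sic, pixel_area)
print(extent, area)
```

Because each ice-covered pixel is weighted by its concentration, the area can never exceed the extent, which counts those pixels at their full size.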
In the Intergovernmental Panel on Climate Change (IPCC) 4th Assessment Report, six climate models indicated that the Arctic might have sea ice-free summers in the 2030s [16,17]. However, significant differences were found among the results predicted by several climate models evaluating changes in Arctic sea ice cover [16,17]. The mechanism behind the recent rapid decrease in sea ice extent is not yet fully understood, which may result in some uncertainties in the process-based Arctic sea ice module of climate models [18,19]. Some other studies on sea ice changes have employed statistical models based on empirical relationships between sea ice conditions and several explanatory variables (e.g., [18,20–26]). Variables in the statistical models include prior information on sea ice, as well as oceanic and atmospheric conditions that can influence sea ice changes. Temperature is reasonably expected to be a major predictor for the current loss of sea ice caused by the atmospheric warming trend over the Arctic area [27,28]. However, statistical predictions require additional climatic variables for a more stable explanation of Arctic sea ice changes [18,29,30]. Such empirical knowledge can help identify the physical processes underlying Arctic sea ice changes [16,31] and can contribute to making reasonable outlooks even without explicit physical mechanisms and realistic initial conditions [29].
Several previous studies have developed statistical models for the status of Arctic sea ice at seasonal to annual scales and showed considerable potential for explaining the impacts of climate change on Arctic sea ice extent. Drobot and Maslanik [20] used a statistical model with four regressors (winter multiyear ice concentration, spring total ice concentration, North Atlantic Oscillation index, and East Atlantic index) to explain summer ice conditions in the Beaufort Sea. Drobot [21] extended the work of Drobot and Maslanik [20] by adding more explanatory variables, such as heating degree-days, to the regression model for the Beaufort-Chukchi Sea. These works aimed to predict the Barnett severity index (BSI), which shows the expansion of open water, using surface and atmospheric variables. Drobot et al. [22] developed a multiple regression model to predict sea ice extent using satellite data such as ice concentration, surface skin temperature, surface albedo, and downward longwave radiative flux as explanatory variables. Drobot [18] also provided several regression equations for the relationship between minimum sea ice extent and satellite-observed surface variables including temperature, albedo, and downward longwave radiation. Lindsay et al. [23] presented a similar regression analysis that predicted sea ice extent by using surface variables for the Arctic Ocean. Årthun et al. [24] showed the correlations between Barents Sea ice area and Atlantic heat. Pavlova et al. [25] analyzed the impacts of winds and sea surface temperatures on the Barents Sea ice extent using correlation coefficients. In addition, Tivy et al. [26] analyzed July sea ice concentration in Hudson Bay with the help of canonical correlation analysis using sea surface temperature, geopotential height, and surface air temperature.
Previous statistical studies of satellite-observed sea ice change share the following characteristics. First, most of their techniques were somewhat limited in that they focused solely on the prediction of sea ice extent [26] using variables such as sea ice concentration, even though sea ice extent is, by definition, directly determined by sea ice concentration. Sea ice concentration itself has rarely been modeled by statistical methods using satellite imagery, except by Tivy et al. [26], whose analyses were conducted only for July. Second, the statistical models require improvements to achieve better accuracies by incorporating techniques that can deal with temporality or long-term variation of the relationships between sea ice concentration and climate factors. Indeed, 34 years of satellite imagery comprise 408 monthly time steps. This is quite a long-term dataset, so the relationships between sea ice concentration and climate factors may not be sufficiently explained by a single equation of the ordinary least squares (OLS) method. Instead, time-series statistical approaches such as vector autoregression (VAR) and autoregressive integrated moving average (ARIMA), whose predictive ability has been demonstrated in many other fields, can be an alternative for modeling sea ice changes in terms of temporally varying relationships.
In this paper, we describe the statistical modeling of Arctic sea ice changes in relation to various climatic factors using 34 years of satellite imagery and climate reanalysis data. The target variable is the sea ice concentration retrieved by the National Aeronautics and Space Administration (NASA) Team algorithm [11], and the explanatory variables are skin temperature, sea surface temperature, total column liquid water, total column water vapor, instantaneous moisture flux, and low cloud cover, obtained from the European Centre for Medium-Range Weather Forecasts (ECMWF) ERA-Interim datasets [32]. The six explanatory variables were selected by taking account of their correlation coefficients with sea ice concentration as well as multicollinearity among the variables. The OLS regression models were useful in summarizing climatological patterns in the relationships between sea ice concentration and climate factors, and the ARIMA models had the advantage of improved prediction accuracy. Our study area is the Barents and Kara Seas, which experienced considerable sea ice changes during the period.
2. Data and Methods
2.1. Satellite and Climate Datasets
The Barents and Kara Seas area was selected to analyze sea ice changes on a regional scale. Sea ice concentration in this area exhibits very high variability because the area is covered by relatively thin seasonal ice [33] impacted by highly variable Atlantic water inflow [34] and atmospheric forcing primarily driven by the North Atlantic Oscillation [35,36]. The sea ice concentration data produced by the NASA Team algorithm, which has been used in many sea ice studies, were obtained from the website of the National Snow and Ice Data Center (NSIDC). An enhancement of the NASA Team algorithm for sea ice concentration overcomes the problem of a low ice concentration bias associated with surface snow effects. The concentration is calculated based on the brightness temperature difference between the 19 and 37 GHz channels obtained from SMMR and SSM/I [11]. We used the monthly dataset with a spatial resolution of 0.25 × 0.25 degrees in the polar stereographic projection centered on the North Pole. The Barents and Kara Seas area was delineated using a region mask provided by NSIDC (Figure 1), which consists of 3912 pixels per scene.
As climate factors affecting the changes in sea ice concentration, we used monthly means of ERA-Interim products, the latest global climate reanalysis provided by ECMWF. We first investigated possible explanatory variables related to temperature, water, radiation, wind, pressure, heat energy, and cloud conditions. Because monthly radiation variables (e.g., surface net solar radiation and surface net thermal radiation) were provided only as forecast values and not as reanalyzed ones, they were not included in our analyses. Wind variables (e.g., wind speed and U/V components), pressure variables (e.g., mean sea level pressure and surface pressure), and heat energy variables (e.g., sensible heat and latent heat), which showed relatively low correlations (|R| < 0.5) with sea ice concentration during 1979–2012, were also excluded. We then divided the remaining explanatory variables into three groups: (1) temperature-related variables including 2-m temperature, 2-m dewpoint temperature, skin temperature, and sea surface temperature; (2) water-related variables including vertical integral of water vapor, total column liquid water, total column water vapor, total column water, and instantaneous moisture flux; and (3) cloud-related variables such as low, medium, and high cloud cover. We finally selected skin temperature, sea surface temperature, total column liquid water, total column water vapor, instantaneous moisture flux, and low cloud cover as appropriate explanatory variables by considering the correlation coefficients with sea ice concentration and the multicollinearity between variables within each group.
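The screening logic described above (drop candidates with |R| < 0.5, then avoid strongly inter-correlated predictors) can be sketched as follows. This is only one possible reading of the procedure: the greedy ordering, the pairwise cutoff `max_pair_r=0.9`, the function name `screen_predictors`, and the synthetic series are all assumptions for illustration, and the sketch ignores the grouping of variables used in the text.

```python
import numpy as np

def screen_predictors(target, candidates, min_abs_r=0.5, max_pair_r=0.9):
    """Keep candidates correlated with the target, skipping near-duplicates."""
    # Correlation of every candidate with the target series.
    r_to_target = {name: float(np.corrcoef(values, target)[0, 1])
                   for name, values in candidates.items()}
    # Candidates passing the |R| threshold, strongest first.
    ranked = sorted((n for n in candidates if abs(r_to_target[n]) >= min_abs_r),
                    key=lambda n: abs(r_to_target[n]), reverse=True)
    selected = []
    for name in ranked:
        # Reject a candidate that is nearly collinear with one already kept.
        collinear = any(
            abs(np.corrcoef(candidates[name], candidates[kept])[0, 1]) >= max_pair_r
            for kept in selected)
        if not collinear:
            selected.append(name)
    return selected, r_to_target

# Synthetic monthly series (408 steps); values are made up for illustration.
rng = np.random.default_rng(0)
n = 408
signal = rng.normal(size=n)
skin = signal + 0.2 * rng.normal(size=n)
candidates = {
    "skin_temperature": skin,
    "two_metre_temperature": skin + 0.01 * rng.normal(size=n),  # near-duplicate
    "wind_speed": rng.normal(size=n),                           # unrelated
}
sea_ice_concentration = -signal + 0.3 * rng.normal(size=n)
selected, r_to_target = screen_predictors(sea_ice_concentration, candidates)
print(selected)
```

With these synthetic inputs the unrelated series fails the correlation threshold, and only one of the two nearly identical temperature series is retained.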
The skin temperature (K) is defined as the temperature of the top skin of the sea (approximately ≤0.01 mm), and the sea surface temperature (K) indicates the temperature of the sea at approximately 20–30 cm depth. The total column liquid water (kg/m²) denotes the vertical integral of water in the liquid phase from the ground to the nominal top of the atmosphere, expressing the total amount of cloud liquid water, and the total column water vapor (kg/m²) is the total amount of water vapor in the atmosphere. The instantaneous moisture flux (kg/m²/s) indicates the amount of evaporation per unit area per second. Low cloud cover (0–1) is the fraction of clouds in the low layer, where the ratio of pressure to surface pressure is >0.8. The ERA-Interim products have been advanced by a data assimilation scheme based on 12 h 4D-Var, with improved model physics, a fast radiative transfer model, and a better formulation of the background error constraint [32]. The data are originally produced with a spatial resolution of 0.75 × 0.75 degrees and can also be provided on grids of 0.25 × 0.25 and 0.5 × 0.5 degrees through spatial interpolation. We used the data on the 0.25 × 0.25-degree grid, and its geographic (latitude and longitude) projection was converted into the polar stereographic projection in accordance with the NSIDC sea ice products. A set of map projection parameters such as the latitude of the standard parallel, the longitude of the central meridian, false easting, and false northing was specified for the conversion [37].
Finally, 408 monthly layers for the 34 years (1979–2012) were assembled from the NSIDC sea ice products and the ECMWF reanalysis datasets in order to examine the relationships between sea ice concentration and climate factors such as skin temperature, sea surface temperature, total column liquid water, total column water vapor, instantaneous moisture flux, and low cloud cover. Explanatory variables were normalized as z-scores, calculated as (x_i − x̄)/σ_x, where x_i is an individual value, x̄ is the mean, and σ_x is the standard deviation over all pixels during 1979–2012. Normalized values allow the regression coefficients of explanatory variables to be compared even though the variables originally had different units.
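As a small illustration of this normalization, the sketch below computes z-scores over a whole (month, row, column) stack at once, so that one mean and one standard deviation are shared by all pixels and months. The array values are hypothetical, and the use of the population standard deviation (NumPy's default, `ddof=0`) is an assumption, since the text does not say which estimator was used.

```python
import numpy as np

def zscore(stack):
    """Normalize a (month, row, col) stack using its overall mean and std."""
    stack = np.asarray(stack, dtype=float)
    # One mean and one standard deviation for all pixels and all months.
    return (stack - stack.mean()) / stack.std()

# Hypothetical skin temperatures (K): 2 months on a 2 x 2 grid.
skin_temperature = np.array([[[270.0, 272.0], [274.0, 276.0]],
                             [[271.0, 273.0], [275.0, 277.0]]])
normalized = zscore(skin_temperature)
print(normalized.mean(), normalized.std())
```

After this step each variable is dimensionless with zero mean and unit spread, which is what makes regression coefficients comparable across variables with different units.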
2.2. Statistical Models
A regular regression model is based on the ordinary least squares method, which can be expressed by the formula:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$$

where y is a response variable and x_1 to x_k are explanatory variables. In remote sensing studies, y is generally a remotely sensed variable and x_1 to x_k are environmental variables of interest. β_0 represents the intercept, and β_1 to β_k are the slopes of the relationship between y and x_1 to x_k. The error term ε may include all other factors influencing the response variable y except for the regressors x_1 to x_k. In this study, y is the sea ice concentration and x_1 to x_k are skin temperature, sea surface temperature, total column water vapor, total column liquid water, instantaneous moisture flux, and low cloud cover, respectively. The OLS regression model is often assumed to apply universally over the whole area and the whole period, which implies spatial and temporal stationarity in the relationship between the response and explanatory variables.
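A minimal sketch of fitting such a linear model by least squares is given below. The helper name `fit_ols`, the two synthetic regressors, and the noise-free response are assumptions chosen so that the recovered coefficients are easy to check; they do not reproduce the coefficients of this study.

```python
import numpy as np

def fit_ols(X, y):
    """Return [intercept, slope_1, ..., slope_k] minimizing the squared error."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

# Synthetic example: 408 monthly samples of two z-scored regressors.
rng = np.random.default_rng(42)
X = rng.normal(size=(408, 2))
y = 0.6 - 0.3 * X[:, 0] + 0.1 * X[:, 1]   # noise-free by construction
beta = fit_ols(X, y)
print(beta)
```

A single coefficient vector is estimated for all 408 samples, which is the sense in which the OLS model assumes one relationship for the whole period.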
Since some environmental phenomena can be explained in terms of temporality or seasonality, the ARIMA model may be useful in explaining a complex long-term dataset. In contrast to the OLS approach, the ARIMA model assumes temporal non-stationarity based on the possible differences in the relationships between response and explanatory variables over time [38]. The model is briefly expressed as ARIMA(p, d, q), where the parameters p, d, and q are non-negative integers that refer to the orders of the autoregressive (p), integrated (d), and moving average (q) parts of the model, respectively. A seasonal ARIMA model is denoted as ARIMA(p, d, q)(P, D, Q), with additional parameters for the seasonal autoregressive order (P), seasonal differencing order (D), and seasonal moving average order (Q). These parameters determine the predictive ability of an ARIMA model and can be optimized by minimizing criteria such as the AIC (Akaike information criterion) and BIC (Bayesian information criterion). A typical seasonal ARIMA model has the following form [39]:

$$(1 - B)^d (1 - B^S)^D\, Y_t = \mu + \frac{\theta(B)\,\theta_S(B^S)}{\phi(B)\,\phi_S(B^S)}\, a_t$$

where B is the backshift operator; Y_t is the time series; μ is the mean term; φ(B) is the non-seasonal autoregressive operator; φ_S(B^S) is the seasonal autoregressive operator; θ(B) is the non-seasonal moving average operator; θ_S(B^S) is the seasonal moving average operator; a_t is the random error; and S is the seasonal period. To employ the ARIMA method in time-series modeling with multiple explanatory variables, regressors can be added to the ARIMA equation, which literally adds them to the right-hand side of the equation.
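To make the backshift notation concrete, the sketch below applies the non-seasonal and seasonal differencing operators, (1 − B)^d and (1 − B^S)^D, to a synthetic monthly series. The function name `difference`, the seasonal period of 12, and the series itself (a constant, a 12-month cosine cycle, and a small linear trend) are assumptions for illustration; the sketch covers only the differencing step, not the estimation of the autoregressive or moving average terms.

```python
import numpy as np

def difference(series, d=0, D=0, season=12):
    """Apply (1 - B)^d (1 - B^season)^D to a 1-D series."""
    out = np.asarray(series, dtype=float)
    for _ in range(D):
        # (1 - B^S): subtract the value from one seasonal cycle earlier.
        out = out[season:] - out[:-season]
    for _ in range(d):
        # (1 - B): subtract the previous value.
        out = out[1:] - out[:-1]
    return out

# Synthetic 408-month series: mean level + annual cycle + linear decline.
months = np.arange(408)
series = 0.5 + 0.4 * np.cos(2 * np.pi * months / 12) - 0.0005 * months
residual = difference(series, d=1, D=1)
print(len(residual), float(np.max(np.abs(residual))))
```

Seasonal differencing removes the repeating annual cycle and regular differencing removes the remaining constant drift, leaving a series that is zero up to rounding error. In practice, time-series libraries typically bundle this step with parameter estimation; for example, the `SARIMAX` class in statsmodels accepts `order`, `seasonal_order`, and `exog` arguments, although the text does not state which software was used here.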
4. Conclusions
In this paper, we described the statistical modeling of sea ice concentration in relation to climatic factors, using satellite imagery and climate reanalysis data for the Barents and Kara Seas during 1979–2012. The OLS regression model summarized the whole period and provided information about the overall climatological characteristics of the relationships between sea ice concentration and climate variables. In particular, the ARIMA method was introduced into the statistical modeling of sea ice concentration for the first time, and it helped improve prediction accuracy because the time series of sea ice concentration is such a long-term dataset that the relationships may not be explained by a single equation of the OLS regression. We found that the temporally varying relationships between sea ice concentration and climate factors such as skin temperature, sea surface temperature, total column liquid water, total column water vapor, instantaneous moisture flux, and low cloud cover were captured by the ARIMA method, which resulted in better prediction accuracy. The RMSE improvement was 0.076 on average (0.199 by OLS and 0.123 by ARIMA), and the prediction accuracy of the ARIMA models was relatively stable throughout the months. Our improved statistical approach with the ARIMA method may be worth considering when forecasting future sea ice concentration using the climate data provided by general circulation models (GCMs).
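For reference, the root mean square error (RMSE) used to compare the two models can be sketched as below. The function name `rmse` and the four-value toy series are hypothetical; only the two reported averages (0.199 and 0.123) come from the text.

```python
import numpy as np

def rmse(observed, predicted):
    """Root mean square error between two equally sized arrays."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))

# Toy concentrations (fractions); the numbers are made up for illustration.
observed = [0.80, 0.60, 0.20, 0.00]
coarse_prediction = [0.60, 0.80, 0.40, 0.20]   # every value off by 0.2
closer_prediction = [0.70, 0.70, 0.30, 0.10]   # every value off by 0.1
print(rmse(observed, coarse_prediction), rmse(observed, closer_prediction))

# Difference between the average RMSE values reported in the text.
reported_improvement = 0.199 - 0.123
print(round(reported_improvement, 3))
```

Because sea ice concentration is a fraction between 0 and 1, an RMSE reduction of 0.076 corresponds to roughly 7.6 percentage points of concentration.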
In addition, some unique characteristics of the climate factors in relation to sea ice concentration were found during the analyses. In July and August, when the ice melts, β_SKT showed abrupt positive values, presumably because rapid increases of skin temperature might act as noise in the regression coefficients. During winter, β_SKT and β_SST showed relatively similar values close to zero, partly because of the weaker thermodynamic effects of skin temperature and sea surface temperature below the freezing point. Unlike the total column water vapor, the total column liquid water (in clouds) brought about a peculiar seasonal pattern of β_TCLW because the warming or cooling effects of clouds differed according to season, which will require further investigation to understand in detail. These results may be useful in better understanding the physical mechanism and in improving the statistical model.
Our results, derived from a limited number of explanatory variables, may not apply universally given the complexity of the climate system. Therefore, additional explanatory variables related to solar radiation [47], atmospheric refractivity [13], and surface roughness [48] should be incorporated in the statistical models for further improvements in prediction accuracy. In addition, we did not identify the time lag between explanatory variables and sea ice concentration, but given that previous studies have reported a time lag in melt onset or freeze-up [46], a closer examination of this will be necessary for improving the time-series modeling.