1. Introduction
Weather forecast models produce numerical solutions that simulate physical processes in the atmosphere. However, these solutions describe the physical processes only approximately, which is one reason why forecasts contain errors. An effective way to reduce these errors and improve the performance of a numerical weather prediction model is to post-process the model output with suitable algorithms. The goal of such algorithms is to predict the forecast error and remove it from the final forecast.
Machine learning is a subfield of artificial intelligence that focuses on developing algorithms and statistical models that enable computers to learn from data and make predictions or decisions without being explicitly programmed. The core idea is to learn patterns or relationships from data and generalize that knowledge to new, unseen data. Rather than following a predetermined set of rules, machine learning algorithms learn iteratively from examples or experiences, improving their performance over time.
Machine learning has a wide range of applications across various domains, including image [1] and speech recognition, recommendation systems, autonomous vehicles [2], finance, healthcare, weather forecasting [3,4], and many others. It has revolutionized many industries and continues to advance rapidly with the availability of large datasets, increased computational power, and advancements in algorithms and techniques.
Artificial neural networks (ANNs) are a subset of machine learning algorithms that are modeled after the structure and function of the human brain. They consist of interconnected nodes, or neurons, that are organized into layers. The input layer receives the raw data, which are then processed through one or more hidden layers before being output as a prediction or classification.
The training of an ANN is a form of machine learning that involves adjusting the weights and biases of the neurons in the network based on a training dataset [5]. The goal is to minimize the difference between the predicted output of the network and the actual output for a given input. Once trained, the ANN can be used to make forecasts by feeding in new input data and computing the output using the trained weights. The accuracy of the forecasts depends on the quality and quantity of the input data, the network architecture, and the training algorithm used.
ANNs have been widely used in weather forecasting due to their ability to learn complex patterns and relationships in data. Hanoon et al. [6] showed that an ANN architecture has good potential to predict daily temperature and relative humidity, with an acceptable range of accuracy. Machine learning and deep neural network models were tested on weather station data by Talsma et al. [7], showing results with promising accuracy (6 h prediction RMSE = 1.53–1.72 °C) for use in frost and minimum-temperature prediction applications. Moosavi et al. [8] performed experiments using WRF-ARW model forecast data that demonstrate the strong potential of machine learning approaches to aid the study of model errors. While their experiments were focused on forecasting precipitation, the methodology developed was general and can be applied to the study of errors in other models, for other quantities of interest, and for learning additional relationships between model physics and model errors.
In this study, an ANN is used with inputs from a weather forecast model to accurately predict the air temperature at 2 m. Both hourly and daily data are used. The output is compared to air temperature measurements from weather stations inside the region of Central Macedonia, Greece. The training period of the ANN is from 1 September 2021 to 31 December 2021. The test period is from 1 January 2022 to 31 March 2023.
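As an illustration of how the dataset is divided, the following sketch splits forecast and observation pairs by date using the two periods stated above. The record layout (a dictionary with a `date` key) and the function name `split_by_period` are assumptions made for this example and are not taken from the study.

```python
from datetime import date

# Training and test periods as stated in the text.
TRAIN_START, TRAIN_END = date(2021, 9, 1), date(2021, 12, 31)
TEST_START, TEST_END = date(2022, 1, 1), date(2023, 3, 31)

def split_by_period(records):
    """Split records into (train, test) by their 'date' field (hypothetical layout)."""
    train = [r for r in records if TRAIN_START <= r["date"] <= TRAIN_END]
    test = [r for r in records if TEST_START <= r["date"] <= TEST_END]
    return train, test

# Made-up records, for illustration only.
records = [
    {"date": date(2021, 9, 1), "forecast": 24.1, "observed": 22.8},
    {"date": date(2021, 12, 31), "forecast": 6.3, "observed": 4.0},
    {"date": date(2022, 1, 1), "forecast": 5.9, "observed": 3.5},
    {"date": date(2023, 3, 31), "forecast": 15.2, "observed": 14.7},
]
train, test = split_by_period(records)
print(len(train), len(test))
```

Splitting strictly by date keeps every test forecast later in time than every training forecast, so the network is evaluated only on data it has not seen.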
2. Materials and Methods
2.1. Artificial Neural Network Structure
ANNs used for temperature forecasts typically consist of an input layer, one or more hidden layers, and an output layer. The number of nodes in each layer and the connections between them are determined based on the problem requirements and the available data.
In this study, the ANN has between five and nine input nodes, each representing one input variable drawn from the following: temperature at 2 m, skin temperature, temperature at the first and second vertical layers of the model, temperature at 850 hPa, dew point at 2 m, wind speed, cloud fraction, and soil temperature. The input layer feeds into two hidden layers, each consisting of several nodes, with an activation function determining the output of each node. Each hidden layer was tested with different numbers of nodes, while the activation function was always the hyperbolic tangent. The output layer consists of a single node that provides the forecast value of the target variable: air temperature at 2 m for hourly data, or minimum or maximum temperature for daily data.
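The structure described above can be sketched as a forward pass in plain Python. This is a minimal illustration, not the implementation used in the study: the layer sizes (eight inputs, 150 nodes per hidden layer) follow the configuration reported in Section 3.2, while the random weight initialization, the linear output node, the input values, and all function names are assumptions.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer; the initialization is an assumption."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(x, layer, activation=None):
    """Weighted sum plus bias for every node, optionally passed through an activation."""
    weights, biases = layer
    z = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [activation(v) for v in z] if activation else z

def forward(x, layers):
    """Two tanh hidden layers followed by a single output node (assumed linear)."""
    h1 = dense(x, layers[0], math.tanh)
    h2 = dense(h1, layers[1], math.tanh)
    return dense(h2, layers[2])[0]

rng = random.Random(0)
n_inputs, n_hidden = 8, 150
layers = [
    make_layer(n_inputs, n_hidden, rng),
    make_layer(n_hidden, n_hidden, rng),
    make_layer(n_hidden, 1, rng),
]
x = [0.1] * n_inputs  # hypothetical, already-scaled predictor values
print(forward(x, layers))
```

With untrained weights the printed value is meaningless; the sketch only shows how the inputs flow through the two hidden layers to the single output node.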
During the training process, the network learns to map the input variables to the target variable by adjusting the weights of the connections between the nodes. This is achieved using a backpropagation algorithm that minimizes the mean squared error (MSE) between the predicted and actual values of the target variable. The MSE is calculated by taking the average of the squared differences between the predicted and actual values over all training examples. The backpropagation algorithm involves computing the error at the output layer and then propagating it backwards through the network to adjust the weights of the connections. This is carried out using the chain rule of calculus to calculate the gradient of the error with respect to each weight.
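The training procedure described above can be illustrated with a small, self-contained sketch. It is not the code used in the study: the network is deliberately tiny, the data are synthetic, the updates are applied one example at a time (stochastic gradient descent), and the learning rate, number of epochs, and all names are assumptions chosen for illustration.

```python
import math
import random

def init(sizes, rng):
    """One (W, b) pair per layer; W[j][i] connects input i to node j."""
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        s = 1.0 / math.sqrt(n_in)
        W = [[rng.uniform(-s, s) for _ in range(n_in)] for _ in range(n_out)]
        params.append((W, [0.0] * n_out))
    return params

def forward(params, x):
    """Return the activations of every layer (tanh hidden layers, linear output)."""
    acts = [x]
    for k, (W, b) in enumerate(params):
        z = [sum(w * a for w, a in zip(row, acts[-1])) + bj for row, bj in zip(W, b)]
        is_output = k == len(params) - 1
        acts.append(z if is_output else [math.tanh(v) for v in z])
    return acts

def mse(params, X, y):
    """Mean squared error over all examples."""
    return sum((forward(params, x)[-1][0] - t) ** 2 for x, t in zip(X, y)) / len(X)

def sgd_step(params, x, t, lr):
    """One backpropagation update for a single example with squared-error loss."""
    acts = forward(params, x)
    delta = [2.0 * (acts[-1][0] - t)]  # gradient of the error at the output node
    for k in reversed(range(len(params))):
        W, b = params[k]
        a_prev = acts[k]
        # Error propagated to the previous layer, using the weights before the update.
        back = [sum(W[j][i] * delta[j] for j in range(len(W))) for i in range(len(a_prev))]
        for j in range(len(W)):
            for i in range(len(a_prev)):
                W[j][i] -= lr * delta[j] * a_prev[i]
            b[j] -= lr * delta[j]
        if k > 0:
            # Chain rule through tanh: d tanh(z)/dz = 1 - tanh(z)^2.
            delta = [g * (1.0 - a * a) for g, a in zip(back, a_prev)]

rng = random.Random(0)
# Synthetic data: three made-up predictors and a smooth made-up target.
X = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(100)]
y = [0.5 * a - 0.3 * b + 0.2 * c + 0.1 for a, b, c in X]

params = init([3, 8, 8, 1], rng)
loss_before = mse(params, X, y)
for epoch in range(30):
    for x, t in zip(X, y):
        sgd_step(params, x, t, lr=0.02)
loss_after = mse(params, X, y)
print(loss_before, loss_after)
```

The backward loop mirrors the description in the text: the error is computed at the output node, passed back layer by layer, and multiplied by the derivative of the hyperbolic tangent at each hidden layer to obtain the gradient for every weight.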
2.2. Forecast Model
The Weather Research and Forecasting model with the Advanced Research WRF (WRF-ARW) core is used to produce temperature forecasts on an hourly basis for Greece for the next 5 days. The WRF-ARW model is a next-generation mesoscale numerical weather prediction system designed for both atmospheric research and operational forecasting applications [9]. It was developed at the National Center for Atmospheric Research (NCAR), which is operated by the University Corporation for Atmospheric Research (UCAR).
The necessary initial and boundary conditions for the 5-day forecast are provided by the US National Centers for Environmental Prediction (NCEP) through the 1200 UTC forecast cycle of the Global Forecast System (GFS).
Using nested domains, the necessary downscaling is achieved to the desired 2 × 2 km resolution for the final grid, which covers a large part of Greece (Figure 1). A complete set of forecasts for the period from 1 September 2021 to 31 March 2023 is available, produced with exactly the same model set-up. The model data are available both on an hourly basis and as daily values.
2.3. Observations
To verify the accuracy of the forecasts, the automated weather station network of the company General Aviation Applications 3D S.A. (3DSA) was used. The network has twenty weather stations operating inside the region of Central Macedonia, Greece. Hourly and daily data from ten of them, which have a complete dataset for the period from September 2021 to March 2023 without a single hour missing, were used for the verification. In addition, daily data from another seven automated weather stations, which are in the same region but belong to the network of the National Observatory of Athens (NOA) [10], were used. Figure 2 shows the location of each weather station. For each weather station, the nearest grid point of the model was selected for the verification process.
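The nearest-grid-point selection can be sketched as a brute-force search over the grid using great-circle distance. The distance measure, the toy grid, the station coordinates, and the function names below are assumptions made for illustration; the text does not state how the nearest point was computed.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def nearest_grid_point(station, grid):
    """Return the (i, j, lat, lon) grid entry closest to station = (lat, lon)."""
    return min(grid, key=lambda g: haversine_km(station[0], station[1], g[2], g[3]))

# Toy 5 x 5 grid with made-up coordinates and spacing.
grid = [(i, j, 40.60 + 0.02 * i, 22.90 + 0.02 * j) for i in range(5) for j in range(5)]
station = (40.643, 22.957)  # hypothetical station location

gi, gj, glat, glon = nearest_grid_point(station, grid)
print(gi, gj, round(haversine_km(station[0], station[1], glat, glon), 3))
```

For a grid of realistic size the same search can be vectorized or restricted to a small neighbourhood, but the selection criterion is unchanged.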
3. Results and Discussion
3.1. Error Analysis
Comparing the hourly air temperature forecasts with the measurements from the weather stations shows that the errors are, on average, larger during nighttime and much smaller during daytime, and this holds for all five forecast days. The average mean absolute error (MAE) for noon and afternoon hours never exceeds 2.0 °C, except on the fourth and fifth forecast days. In contrast, during night and early morning hours, the MAE is greater than 2.5 °C even for the first forecast day, and it reaches 3.5 °C for the last forecast day. Similarly, the daily maximum temperature forecast is much more accurate than the daily minimum temperature forecast. As shown in Table 1, the bias, the MAE, and the root mean squared error (RMSE) are roughly 1–2 °C lower for maximum temperature than for minimum temperature. The errors for maximum temperature are almost unbiased and similar across all stations. The errors for minimum temperature are larger at some stations (e.g., Agras, Elani, Serres) and much smaller at others (e.g., Epanomi, Naoussa).
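For reference, the three verification scores used here can be computed as in the following sketch. The sign convention (forecast minus observation) and the sample values are assumptions; the numbers are made up and are not taken from Table 1.

```python
import math

def verification_scores(forecast, observed):
    """Return (bias, MAE, RMSE) with errors defined as forecast minus observation."""
    errors = [f - o for f, o in zip(forecast, observed)]
    n = len(errors)
    bias = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return bias, mae, rmse

# Made-up temperatures in degrees Celsius, for illustration only.
forecast = [10.0, 12.0, 8.0, 5.0]
observed = [9.0, 12.0, 10.0, 1.0]
bias, mae, rmse = verification_scores(forecast, observed)
print(bias, mae, round(rmse, 3))  # 0.75 1.75 2.291
```

Because positive and negative errors cancel in the bias but not in the MAE or RMSE, a forecast can be almost unbiased and still have a sizeable MAE.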
3.2. Application of the Artificial Neural Network
Tests of the optimal number and choice of input variables showed that cloud fraction introduces noise, and the results are better without it. On the other hand, using fewer than eight variables gave a smaller reduction in the forecast errors. The final list of inputs for the hourly forecast is air temperature at 2 m, dew point at 2 m, skin temperature, soil temperature (closest level to the surface), model temperature (first and second vertical layers), temperature at 850 hPa, and wind speed. For the maximum and minimum temperature forecasts, the inputs are the maximum or minimum daily temperature, respectively, together with the values of the other seven variables at the time of the maximum or minimum temperature.
Another crucial choice is the number of nodes in the hidden layers, for which many values were tested. With five nodes in each hidden layer, the forecast accuracy is lowest. Increasing the number of nodes up to 150 improves the accuracy, while with 300 or 500 nodes in each hidden layer the accuracy is slightly worse. For this reason, the results presented below are for 150 nodes in each hidden layer.
The first major impact of the ANN is that it minimizes bias. The bias of the hourly temperature for all forecast hours, of the maximum temperature, and of the minimum temperature is less than 0.3 °C and in most cases is 0 °C. This means that any systematic error that the algorithm could detect in the available data is removed from the new forecasts. As shown in Table 2, the MAE and RMSE for the minimum temperature are significantly reduced, by about 1–1.2 °C. The MAE is about 2.6–3.0 °C before the application of the ANN and decreases to 1.6–1.9 °C afterwards. The RMSE decreases from about 3.2–3.6 °C to 2.0–2.4 °C. The percentage of forecast errors greater than 2 °C is between 51 and 57% before and about 30% afterwards. For the weather station in Agras, the MAE for the third forecast day decreased from 3.8 °C to 2.0 °C, and the number of forecasts with errors greater than 2 °C decreased from 312 to 173. The MAE for the weather station in Naoussa decreased only from 1.3 °C to 1.1 °C.
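The share of forecasts with errors greater than 2 °C can be computed as in the sketch below. It assumes that the threshold applies to the absolute error; the error values are made up for illustration and are not taken from Table 2.

```python
def percent_exceeding(errors, threshold=2.0):
    """Percentage of errors whose absolute value is greater than the threshold."""
    count = sum(1 for e in errors if abs(e) > threshold)
    return 100.0 * count / len(errors)

# Made-up minimum-temperature errors in degrees Celsius, before and after correction.
raw_errors = [3.1, -0.5, 2.4, -2.6, 1.0]
corrected_errors = [1.2, -0.3, 2.1, -0.9, 0.4]

print(percent_exceeding(raw_errors))        # 60.0
print(percent_exceeding(corrected_errors))  # 20.0
```

Unlike the MAE and RMSE, this indicator counts how often large errors occur rather than how large they are on average.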
The MAE and RMSE for the maximum temperature are essentially unchanged. For the second forecast day, the MAE even increased by 0.1 °C, and for the first forecast day, the RMSE increased from 1.7 °C to 1.9 °C. The main reason for this result is that the bias of the maximum temperature forecasts was already 0 °C or very close to it (0.2 °C for the fifth forecast day).
For the hourly temperature forecasts, the MAE and RMSE are reduced for the nighttime and early morning hours, when the initial errors are larger. For the noon and afternoon forecast hours, the reduction is small or absent.
4. Conclusions
The application of an artificial neural network to the forecast data of the WRF-ARW model showed that it can reduce forecast errors. Where the error was high, the reduction was also high. Where the error was small, or the bias was very close to zero, the algorithm failed to improve the forecasts. The best results are obtained when the ANN is applied to the minimum temperature forecasts and to hourly data with high errors, such as those in the nighttime and early morning hours. For maximum temperature forecasts and for hourly forecasts of the temperature at noon and in the afternoon, it is better not to apply the algorithm, as the improvement is insignificant.
Possibly, a third hidden layer and the addition of more input variables, from the same model or from another model, could improve the results in cases where the bias is zero.