**1. Introduction**

Climate classifications are frequently used to evaluate the real climate system. One of the oldest and still widely accepted systems of climate types was introduced by Wladimir Köppen [1] and later modified by Geiger [2] and additionally by Trewartha [3–7]. Köppen distinguished eleven climate types based on the annual and monthly variation of temperature and precipitation. Trewartha modified the Köppen classification so that the main qualitative differences between climates and the vegetation characteristics were better taken into consideration. The so-called Köppen–Geiger (K-G) climate classification is derived directly from eco-biological vegetation characteristics within the individual regions of the Earth, which makes it suitable for assessing climate change impacts on ecosystems. It is based on annual and monthly mean values of temperature and precipitation and distinguishes five main vegetation groups: the equatorial zone (A), the arid zone (B), the warm temperate zone (C), the snow zone (D) and the polar zone (E). The main groups are further divided into subtypes that reflect the annual course of air temperature or precipitation and compare their monthly values with defined thresholds. For a detailed overview of all K-G classes and their spatial distribution around the world, we refer to [8]. The K-G classification can be applied either to observed data of the Earth's climate or to present or future conditions simulated by climate models [5,7,9,10]. Some studies, e.g., [11,12], have used the Köppen–Trewartha classification [13] to map the extent of climate change in Europe using an ensemble mean of regional climate models (RCMs) and simulations, considering the uncertainty related to the driving global climate models (GCMs). However, the fact remains that all studies based on climate models should deal with model errors carefully before drawing conclusions.
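To make the assignment of the five main groups concrete, the following minimal sketch classifies a single grid point from its twelve monthly means. The numerical thresholds (10 °C, 18 °C, −3 °C and the dryness threshold) are the commonly published K-G values and are an assumption here, since this section does not state them. The function name `kg_main_group`, the `northern_hemisphere` flag and the April to September definition of the summer half-year are likewise illustrative choices, and subtypes are not handled.

```python
import numpy as np

def kg_main_group(temp_c, precip_mm, northern_hemisphere=True):
    """Return the K-G main group (A, B, C, D or E) for one location.

    temp_c    : 12 monthly mean temperatures in deg C (January..December)
    precip_mm : 12 monthly precipitation sums in mm
    Thresholds are the commonly published K-G values (assumed, see lead-in).
    """
    t = np.asarray(temp_c, dtype=float)
    p = np.asarray(precip_mm, dtype=float)
    if t.shape != (12,) or p.shape != (12,):
        raise ValueError("expected 12 monthly values for each variable")

    t_ann = t.mean()   # annual mean temperature
    p_ann = p.sum()    # annual precipitation sum

    # Polar zone (E) is tested first: warmest month below 10 deg C.
    if t.max() < 10.0:
        return "E"

    # Summer half-year assumed to be April..September in the
    # Northern Hemisphere and October..March in the Southern Hemisphere.
    apr_sep = np.zeros(12, dtype=bool)
    apr_sep[3:9] = True
    summer = apr_sep if northern_hemisphere else ~apr_sep
    p_summer = p[summer].sum()
    p_winter = p_ann - p_summer

    # Dryness threshold depends on the seasonality of precipitation.
    if p_winter >= (2.0 / 3.0) * p_ann:
        p_th = 2.0 * t_ann
    elif p_summer >= (2.0 / 3.0) * p_ann:
        p_th = 2.0 * t_ann + 28.0
    else:
        p_th = 2.0 * t_ann + 14.0

    # Arid zone (B) is tested before the temperature-based groups.
    if p_ann < 10.0 * p_th:
        return "B"

    # Remaining groups depend on the coldest month.
    if t.min() >= 18.0:
        return "A"   # equatorial
    if t.min() > -3.0:
        return "C"   # warm temperate
    return "D"       # snow
```

Applied to every grid cell of an observed or simulated climatology, a function of this kind yields the map of main K-G zones; the order of the tests matters because the arid and polar criteria take precedence over the coldest-month criteria.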

According to [14], model errors can be caused by the initial and boundary conditions, parameterization, physical formulation, internal variability or model shortcomings [15–19]. Model errors can be divided into two categories: unsystematic (random) errors and systematic errors (bias). Random errors stem from the internal variability of climate models, which is a dominant source of uncertainty for shorter (decadal) timescales in model simulations [20]. Bias is defined as any systematic discrepancy between model simulation and observation. Systematic errors can originate either from inadequately constrained parameters or from model structures that are unable to describe the physical process of interest [21]. Model bias is the most prevalent source of uncertainty for longer (century) timescales [20]. Moreover, bias-corrected climate model outputs may lead to a significant response in some impact models used as decision support tools [22–24].

In our previous work [25], we applied the K-G classification as a diagnostic tool for climate change to six RCM experiments originally produced as part of the EU FP6 project ENSEMBLES [26]. Every experiment represented one specific RCM, driven by one of two GCMs. The simulations followed the A1B emission scenario of the Intergovernmental Panel on Climate Change (IPCC) [27,28], and the results were evaluated for the near (2021−2050) and far (2071−2100) future periods. The model simulations were subjected to validation and bias correction using the empirical distribution mapping technique with E-OBS [29] observed data as a reference. We found that the area of warmer climate types increased in each RCM in the future, but the degree of this expansion differed among the models. These differences came from the different driving GCMs, the different physical packages of the RCMs and the different representations of natural variability in the individual models.
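The idea behind empirical distribution mapping can be illustrated with a minimal sketch: each simulated value is replaced by the observed value that has the same empirical quantile in the reference period. This is a generic illustration and not necessarily the exact procedure of [25]. The function name `empirical_quantile_map`, the number of quantiles and the clamping of values outside the calibrated range are assumptions made for brevity.

```python
import numpy as np

def empirical_quantile_map(model_ref, obs_ref, model_target, n_quantiles=99):
    """Correct model_target with a transfer function fitted on the reference period.

    model_ref    : simulated values in the reference period
    obs_ref      : observed values in the same period
    model_target : simulated values to be corrected (any period)
    """
    probs = np.linspace(0.01, 0.99, n_quantiles)
    q_model = np.quantile(model_ref, probs)   # simulated quantiles
    q_obs = np.quantile(obs_ref, probs)       # observed quantiles
    # Piecewise-linear transfer function from simulated to observed quantiles.
    # Values outside the calibrated range are clamped to the end points,
    # a simplification that a real application would replace by extrapolation.
    return np.interp(model_target, q_model, q_obs)
```

The sketch suits a continuous variable such as temperature. For precipitation, the many dry days produce tied quantiles, so practical implementations usually treat the wet-day frequency separately.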

Because any choice of bias correction method can be an additional source of uncertainty [23], in this study we aim to quantify the impacts of different bias correction techniques on the simulated distribution of K-G zones over Europe. The influence of different bias correction methods has so far been studied over small geographical domains, usually selected river basins in Scandinavia [30], North America [31], or North-estern China [32]. In these studies, the performance of bias correction methods was assessed with statistical indices. References [30] and [31] recommended distribution-based methods, whereas [32] found that quantile mapping and power transformation of precipitation performed equally well, and best, in terms of frequency-based indices, and that the local intensity scaling (LOCI) method performed best in terms of time-series-based indices. We intend to test the performance of bias correction over a large pan-European domain, as the bias varies among regions of the domain. Moreover, we assess the bias correction performance by applying the Köppen–Geiger climate classification.

Our two major research questions are as follows:

1. Which bias correction methods of precipitation and temperature are able to reproduce climate classification based on the observed parameters in the 1961–1990 time period?

2. Which bias correction methods of precipitation and temperature are the most reliable for climate prediction over the whole pan-European domain?

This paper is organized as follows: Section 2 gives a short description of the K-G classification, the selected models and the applied bias corrections. Section 3 presents the resulting climate classification for each bias correction method. Section 4 contains a discussion of our findings, and Section 5 offers the conclusions we draw.
