**1. Introduction**

Substances of anthropogenic origin have a major influence on the climate system [1]. Human economic activity contributes to thermal atmospheric pollution—increasing greenhouse gas (GHG) concentration enlarges the natural greenhouse effect and plays a decisive role in the rise of the average global temperature [2–4]. GHGs are mainly generated by the combustion of fossil fuels in industrial and agricultural production processes, and, by a large proportion, from waste [3,5–9]. GHG absorption is usually associated with the physiological properties of green vegetation, as other solutions to sequester carbon have not ye<sup>t</sup> been proven to be either technologically or economically effective [10,11]. Meanwhile, climate change is a global issue and needs to be addressed through global cooperation among countries to improve energy efficiency, develop and deploy clean technologies, and increase natural GHG absorption. In this context, the processes in and around land use, land use change, and forestry (LULUCF) are becoming crucially important. The LULUCF sector includes GHG emission and its removal from forests, arable or producing land, grasslands and pastures, wetlands, built-up areas, and other land plots. Emissions and removals of GHGs are accounted using internationally accepted approaches [12–14]. However, in order to actively increase carbon absorption, it is necessary to know and manage the processes involved in the development of land surface layers and land use. Cognitive processes and managemen<sup>t</sup> decisions will be hampered by a lack of access to scientifically based tools for modeling land use and hence GHG emissions.

**Citation:** Mozgeris, G.; Jukneliene, D.˙ Modeling Future Land Use Development: A Lithuanian Case. *Land* **2021**, *10*, 360. https://doi.org/ 10.3390/land10040360

Academic Editor: Cezary Kowalczyk

Received: 17 February 2021 Accepted: 20 March 2021 Published: 01 April 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

Nowadays, many land use change modeling tools exist, differing in their methodological implementation [15,16]. They may cover universal or very specific application fields, with the focus on local case studies or continental exercises. There are several concepts of land cover and use modeling [17]—economic models, system dynamics approaches, cellular automata, and agent-based models. Spatial economic or econometric models deliver generalized predictions of states of phenomenon by balancing various inter-related input factors determining their development. System dynamics or causality-driven models assume an empirical modeling of land cover or land use changes. This involves (i) an assessment of past changes first, (ii) a determination of relationships between land changes and factors driving such changes, (iii) an evaluation of change potential, and (iv) an allocation of land to the new land cover or land use types [18]. Cellular automata usually operate in a raster domain, representing the landscape as an n-dimensional grid of cells. Each cell may acquire a finite number of states, which may change over time following some set of rules and depending on the state of neighboring cells. Models are iterated over time, delivering land cover or land use status within the cell at specific times [19,20]. Agentbased models are aimed at modeling the behavior of autonomous individuals (agents) who may perceive their environment and interact with individuals [21]. Even though there are numerous potential solutions for land use change modeling, their applicability is heavily restricted by various legal, technological, and organizational aspects. The land use change modeling depends on the specific requirements of GHG emission accounting, the availability and specifics of input data, modeling tools, and experiences, especially when considering specific countrywide exercises.

There are many factors influencing GHG emissions and absorptions in the LULUCF sector, potentially resulting in uncertainties in both GHG accounting and projections [22–26].Simultaneously, availability, or often the lack of input data for land use change analysis, makes the task more challenging [27]. Even though there are international standards to account for GHGs, there are always some specialties present in the operational approaches of each country. Lithuania, following its international climate change mitigation commitments, has developed an original LULUCF monitoring system, which is used for GHG reporting. This system predetermines the approaches of land use development projections. The core data source for GHG accounting from the LULUCF sector in the country is the National Forest Inventory (NFI), which is implemented by the State Forest Service [28,29]. Originally developed to provide statistical information on forest resources for strategic forestry planning at a country level, the Lithuanian NFI has recently been expanded to collect countrywide data on land uses and land use changes. The land uses are monitored in a systematic network of observation points through the whole country, while forest attributes are surveyed at points in the forest. There are operational solutions introduced in Lithuania to model the development of forest resources and forestry, ranging from forest stand-level simulators to systems manipulating aggregated countrywide data [30–32]. The State Forest Service uses the European Forestry Dynamics Model (EFDM), developed as a harmonized forestry modeling tool for all European countries, based on NFI data. The EFDM has been used to calculate the forest reference level (FRL) for Lithuania following the European Union LULUCF regulation for 2021–2030 [13]. The EFDM is a matrix-based model of a Markov chain type representing change by transition of areas (in this case, the NFI sample plots) between different fixed states of the forest [33]. This matches well with the system dynamics or causality-driven models introduced above. The reference levels for land uses other than forest are based on historical data, thus, one may assume that no sophisticated modeling solution is needed. Nevertheless, successful land use managemen<sup>t</sup> provides challenges for modern decision-support tools which are based on land use development scenarios. To our knowledge, the solution that has been widely used to make GHG projections in the LULUCF sector in Lithuania has been the land use, land use change and forestry emission accounting tool, LULUCFeat [34]. LULUCFeat delivers GHG predictions based on aggregated LULUCF data and past trends, using information on driving factors and expert knowledge. Methodologically, this fits the economic models mentioned above.

However, the solution is too focused on delivering certain GHG reports and underfitting expectations for a versatile land use change modeling system, based on all NFI data and compatible modeling principles.

Thus, the aim of the study introduced in this paper is to test the methodological principles for land use development scenario modeling for use in processes of GHG accounting and management. First, we ask what is the performance of the Markov chain analyses methodological approach in modeling land use development using standard GIS software? To conduct the modeling exercise, we use inputs available from already running in Lithuania inventory projects and freely available geographic databases. Then, we test the capacity of the LULUCF sector in Lithuania to accumulate carbon during the next decade, starting in 2020. For that, we project the development of major land use types in Lithuania until 2030 using several land use managemen<sup>t</sup> scenarios and estimate potential contributions of different land uses on carbon emission/absorption. We hypothesize that the carbon accumulation in the LULUCF sector in Lithuania during the next decade should increase. Finally, we end with a discussion and proposals for both methodological enhancements of modeling solutions and land use managemen<sup>t</sup> policies.

### **2. Materials and Methods**

### *2.1. Study Area*

The study was conducted in Lithuania, located in Central Europe (Figure 1) and having historically strong links with Eastern Europe. Land use development in Lithuania in recent decades strongly depended on the radical societal transformations after Lithuania broke away from the Soviet Union in 1990 and later joined the European Union in 2004 [35]. The area of three land uses important in GHG accounting and managemen<sup>t</sup> (forest, producing land and grassland) was rather similar (around 28–30%) in 1971. Then, the proportions of forest, wetland, built-up areas, and other land use types changed relatively steadily since 1971, while the trends of producing land and grassland development changed their trajectories around 1990 and again about 2005 (Figure 2). The proportions of forest land and producing land in 2015 were, respectively, 34% and 33%. The proportion of grassland was reduced to 23%, and the proportions of both wetland and built-up land were 5%. It should be noted that the total area of Lithuania is 65,200 km2.

**Figure 1.** Location of the study area. Source of the data used: https://thematicmapping.org/ downloads/world\_borders.php (accessed on 22 March 2021).

**Figure 2.** Specification of the study area: (**left**) associative illustration of the distribution of sample plots at different scales, with gray squares referring to the 100 × 100 m cells associated with Lithuanian National Forest Inventory (NFI) sample plots with 100 buffers used to extract driver variables for land use change modeling; (**right**) proportions of major land use types in Lithuania in 2015 and changes in proportions since 1971. Source of the data used: Lithuanian National Forest Inventory.

### *2.2. Input Data*

Two types of input data were used in the study: (i) data describing the land uses in Lithuania and (ii) data describing the factors influencing the land use changes. Land use information was available from the Lithuanian NFI [29,36]. Land use types and subtypes have been identified annually on a network of 16,349 systematically distributed sampling points (Figure 2) since 1971 using the nomenclature of GHG inventories [37]. Usually, three levels of identification are used; however, we used only two levels in our study, i.e., Level 1 with 6 land use types (forest land, producing land, grassland and pastures, wetland, built-up areas, and other land) and Level 2 with 25 subtypes specifying the types in more detail (Appendix A, Table A1 provides a full list of land use subtypes). To conduct the modeling and to integrate the NFI data with other datasets, we created a raster map with a cell size of 100 × 100 m and assured that each NFI plot was associated with a unique cell. Only cells with an NFI plot were used for the study. Free data available from the spatial information portal of Lithuania (www.geoportal.lt, accessed on 22 March 2021) were used to describe the factors influencing the land use changes. The following geographic datasets were used: GRPK (spatial dataset of (geo) reference base cadaster), GDR50LT (georeferenced spatial dataset for the territory of the Republic of Lithuania at the scale of 1:50,000), AZ\_DRLT (spatial dataset of abandoned land of the territory of the Republic of Lithuania), SŽNS\_DR10LT (database of limited land use areas of the Republic of Lithuania at scale 1:10,000), Dirv\_DR10LT (spatial dataset of soil of the territory of the Republic of Lithuania at scale 1:10,000), KŽS (land parcel identification system database), the spatial dataset on the farmland, cropland, and crop types from the National Paying Agency under the Ministry of Agriculture and Population, and the 2011 housing census data from Lithuanian official statistics portal (https://osp.stat.gov.lt/documents/10180/ 1491916/WHOLE.zip, accessed on 22 March 2021). Two approaches were used to specify the explanatory variables: (i) the area of specific features within a 100 m buffer zone around each 100 × 100 m cell associated with the NFI sample plot was estimated, and (ii) the shortest distance from the NFI sample plot center to specific features was estimated. All explanatory variables were stored as raster maps with a cell size of 100 × 100 m. Optimization of the explanatory variables is described in the next subchapter.

### *2.3. Modeling Land Use Development*

Modeling of the land use development was implemented using the TerrSet 18.21 Land Change Modeler [38]; thus, some approaches used were predefined by the functionality of the available tools. Therefore, the modeling started with an analysis of land use changes between two points in time. The potential of land use transitions was then modeled using a set of driver or explanatory variables. A set of maps of suitability for each transition was developed. Based on land use changes in the past, probabilities of land use change in the future were calculated by building a matrix with probabilities of all possible land use changes. Finally, the land use changes were predicted using the historical rates of change and the transition potential models for a specified date in the future.

Our study consisted of two stages. First, we calibrated and validated land use change modeling using input data freely available in Lithuania. We then simulated land use development for the next decade using several land use change scenarios. The methodological framework of our study is summarized in Figure 3.

**Figure 3.** Flowchart summarizing the overall structure of the study: (**a**) calibrating and validating the land use change model, and (**b**) modeling land use development until 2030.

We first analyzed the land use development during the period from 2005 to 2010 to predict land uses in 2015. Transitions were modeled using a multilayer perceptron (MLP) neural network algorithm. All driver variables were tested before using them to build the transition potential models. First, Cramer's V statistic was calculated for each potential explanatory variable—only variables that had a Cramer's V of 0.15 or higher were considered as having potential for modeling. The variable with a higher Cramer's V statistic was considered for modeling among highly intercorrelated variables. Finally, the lists of driver variables were optimized, analyzing the modeling reports delivered by the TerrSet system and iterating the final lists of variables that produced the best MLP performance. All driver variables were considered static. Six strategies were tested to include the driver variables in building the transition potential models, differing by the number and type of driver variables, the date they referred to, and the preprocessing solutions (Table 1).

**Table 1.** Tested strategies for the inclusion of driver variables in building the transition potential models (+ means that the variables from the respective group were considered or an optimization of variables was applied).


Driver variables originating from the KŽS database were grouped according to the date they were created: variables based on data collected (i) before 2005, (ii) between 2005 and 2010, and (iii) after 2010. This was aimed to simulate exercises, where variables changing over time were considered land use development scenario specifications. For example, variables collected after 2010 did not influence the land use change before 2010, but they could be used to specify the future (actual or expected) dynamics of factors influencing land uses. The land use declaration data from the spatial dataset on farmland, cropland, and crop types refer to 2012. The most current versions of other datasets were used. A full list of explanatory variables considered is provided in Appendix A, Table A2. Future land use was predicted using a hard prediction model. The quantity of change in each transition was modeled through a Markov chain analysis.

The second stage of our study included predicting land use development in the future, i.e., acquiring the areas of major land use types for 2020, 2025, and 2030. The sixth strategy using driver variables was applied, i.e., all available explanatory variables were tested before use in the transition potential models. Two options of actual land use change were considered to build the Markov matrix, i.e., (i) the changes from 2005 to 2010 and (ii) from 2010 to 2015. The land use change scenarios were also specified by editing the Markov matrix. The land use development scenarios considered are introduced in Table 2.


### **Table 2.** Description of future scenarios of land use change.

To obtain approximate indications of potential contributions of different land uses on carbon emission/absorption, we applied average conversion factors for 2015, as used to prepare the national GHG report from the LULUCF sector [39]; i.e., the following emission values in tons of CO2 equivalent per ha were used: forest land, 3.93; producing land, 1.43; grassland, 0.51; wetland, 2.64; and built-up land, 1.6; other land, 6.25.

### *2.4. Validation Approaches*

Approaches originating from remote sensing were used to validate the performance of land use prediction. Land use types for the year 2015 were predicted on all NFI sample plots, and the predictions were compared with actual land use types recorded by the Lithuanian NFI. Error matrices were constructed where the true and predicted land use types were cross-tabulated. The validation statistics used to evaluate the prediction were the overall accuracy of prediction and Cohen's kappa:

$$Kappa = \frac{Observed\ accuracy - \text{Expected\ accuracy}}{1 - \text{Expected\ accuracy}} \tag{1}$$

$$Observed\ accuracy = \text{Overall accuracy} = \frac{tp}{N} \tag{2}$$

$$Expected\ accuracy = \sum\_{i=1}^{k} \frac{nt\_i}{N} \times \frac{nc\_i}{N} \,. \tag{3}$$

where *tp* refers to the number of samples predicted to be positive that are, in fact, positive, *k* refers to the number of classes, *nti* refers to the number of samples truly in class *i*, *nci* refers to the number of samples assigned to class *i*, and *N* refers to the total number of samples.

The interpretation of Cohen's kappa was as follows: under 0: "poor"; 0–0.2: "slight"; 0.2–0.4: "fair"; 0.4–0.6: "moderate"; 0.6–0.8: "substantial"; 0.8–1.0: "almost perfect" [40].

Land use type–specific prediction performance was evaluated using precision (producer's accuracy), recall (user's accuracy), and the *F*-score (the harmonic mean of recall and precision):

$$Precision = \frac{tp}{tp + fp} \tag{4}$$

where *fp* refers to the number of samples predicted positive that are, in fact, negative;

$$Recall = \frac{tp}{tp + fn} \tag{5}$$

where *fn* refers to the number of samples predicted negative that are, in fact, positive;

$$F\text{-score} = 2 \times \frac{Recall \times Precision}{Recall + Precision} \tag{6}$$

The *Z* statistic was used to test whether two prediction error matrices were statistically different:

$$Z = \frac{|\pounds\_1 - \pounds\_2|}{\sqrt{var(\hat{\p}\:1) + var(\hat{\p}\:2)}},\tag{7}$$

where *κ*ˆ 1 and *κ*ˆ 2 are the Cohen's kappas of compared predictions, and *var*(*κ*<sup>ˆ</sup> 1) and *var*(*κ*<sup>ˆ</sup> 2) refer to the variances of the respective matrices. Compared predictions were treated as statistically differing if *Z* was more than 1.96 [41].
