**1. Introduction**

The mapping of the spatial structure of forest communities is an integral part of biodiversity research and environmental planning at the regional level. Forest data must satisfy three basic requirements. First, the data on forest spatial structure must be up to date and regularly updatable at intervals of 2–5 years. Second, the data should describe the parameters of biodiversity and species composition of forest communities equally well, and not just the stocks of industrial wood species. Third, combining local ground-based measurements with multispectral satellite imagery and quantitative processing methods should ensure that important information about the structure and properties of vegetation is displayed on the map. The need for more diverse and detailed forest inventory information has been formulated as a result of international programs that regulate activities not only in the environmental, but also in the social and economic spheres [1].

In many countries, National Forest Inventories (NFIs) combine remote sensing data with ground data collected on a regular network of permanent sample plots [2]. Modern mapping of natural objects based on supervised classification [3] imposes a number of requirements: a sampling design [4], preliminary stratification of the study area into homogeneous strata [5,6], random uniform distribution of sampling points within strata, and equal sample sizes between strata. In particular, this is necessary to reduce spatial autocorrelation within field data samples [7,8]. Another important requirement is that the sample size reach a minimum value, which varies between studies from 25 to 80 sample elements for each modeled object [9]. These requirements are often difficult to meet because long-term field data collection programs carried out 5–10 years ago did not take many of these factors into account [10]. Because of limited capabilities, field materials do not possess such properties a priori (in whole or in part); therefore, appropriate preliminary preparation of samples is required.
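The sample preparation described above (equal samples between strata and reduced spatial autocorrelation) can be illustrated with a minimal sketch. It is not the procedure used in this study: the function name `thin_and_balance`, the greedy minimum-distance thinning rule, and the parameters `min_dist` and `per_stratum` are assumptions introduced only for illustration, and coordinates are assumed to be in a projected (metric) system.

```python
import random
from collections import defaultdict
from math import hypot


def thin_and_balance(points, min_dist, per_stratum, seed=0):
    """Hypothetical helper: thin and balance field plots by stratum.

    points      -- iterable of (x, y, stratum) tuples in projected coordinates
    min_dist    -- assumed minimum spacing between kept plots of one stratum
    per_stratum -- maximum number of plots kept for each stratum
    Returns a dict mapping stratum -> list of kept (x, y) tuples.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for x, y, stratum in points:
        by_stratum[stratum].append((x, y))

    kept_by_stratum = {}
    for stratum, candidates in by_stratum.items():
        candidates = candidates[:]
        rng.shuffle(candidates)  # random visiting order within the stratum
        kept = []
        for x, y in candidates:
            if len(kept) == per_stratum:
                break
            # Greedy thinning: keep a plot only if it is far enough
            # from every plot already kept in this stratum.
            if all(hypot(x - kx, y - ky) >= min_dist for kx, ky in kept):
                kept.append((x, y))
        kept_by_stratum[stratum] = kept
    return kept_by_stratum
```

A stratum that ends up with fewer plots than the minimum sample size discussed above (25 to 80 elements per modeled object [9]) would then need to be merged with another stratum or excluded; that decision is left outside this sketch.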

In Russia, unlike in countries with NFIs, sample plots are mostly irregularly distributed, their density per unit area is at least 6 times lower, and their spatial distribution is strongly biased toward the road network and settlements [11]. Moreover, the data of the state forest inventory are either officially classified or available only as old paper maps. Under these conditions, scientific data on the state of plant communities are collected by individual scientific institutes or teams, very rarely on a systematic basis, and suffer from a number of shortcomings: (1) inconsistent determination of forest association groups by different researchers and (2) uneven spatial distribution of field data due to transport infrastructure and the inaccessibility of some territories. In addition, the horizontal positioning uncertainty of civilian single-frequency (L1) GPS receivers under dense forest canopy makes a negative contribution [12]. These considerations pose the challenge of finding the most effective approaches and methods for modeling and mapping the spatial structure of forest communities using the available sources of data.

The Moscow Region is selected as a test area. Given the intensification of urban planning activities and the development of country and cottage construction and recreation in the region, maintaining the ecological and social functions of the "green belt" forests is extremely important [13–15]. Concern about the state of forest plantations is expressed not only by NGOs (non-governmental organizations) and experts [16], but also at the state and regional levels [17].

To date, there are three large-scale sources of information on the composition and spatial structure of forests in the Moscow Region: (1) materials of the state forest inventory, performed in 1995–2000 [17], (2) a map of the vegetation cover of the Moscow Region, made by a team of authors at Moscow University in 1996 [18], and (3) a map of terrestrial ecosystems of the Moscow Region [19]. The first two sources are considerably out of date. In addition, the state inventory is based on information about the stocks of industrial wood species and, to a much lesser extent, on data about the composition of the ground layers. It should also be noted that Russian forest management guidelines permit a systematic error in forest inventory data. This error may reach 10–20% of the species proportion in 32% of controlled forest patches [20]. The third source contains only 6 generalized forest types, which is not enough for a biodiversity inventory. Thus, for the Moscow Region (and probably for most regions of Russia), there is an obvious lack of up-to-date cartographic material on the spatial structure of vegetation, primarily forest cover, which raises the question of the availability of reliable field data on the state of the vegetation (forest) cover.

A variety of modeling approaches are currently available [21]. Linear discriminant analysis was tested by the authors, and it was demonstrated that accounting for non-linear features of environmental variables might improve the robustness of the model [22]. In the current study, the SDMtoolbox (Spatial Distribution Modeling toolbox 2.4, Durham, NC/Manhattan, NY/Carbondale, IL, USA) is chosen as a modeling tool. The SDMtoolbox is a Python-based ArcGIS 10.7 toolbox (Redlands, CA, USA) for spatial studies of ecology, evolution and genetics. The SDMtoolbox is chosen because it includes the basic MaxEnt (Maximum Entropy, Manhattan, NY, USA) algorithm and a number of additional tools necessary to control the autocorrelation of spatial data [23]. Among other methods, MaxEnt has been shown to handle non-linear interactions between response and predictor variables effectively and to be robust to small sample sizes [21]. MaxEnt is used not only for species distribution modeling, but also for a wide range of natural phenomena, for instance, for monitoring tree pests (Salento Peninsula, Italy) [24], for creating predictive risk maps of soil-transmitted helminth infections in Thailand [25], and for studying invasive species (Korea) [26] and the critically endangered Alaotran gentle lemur (Madagascar) [27]. MaxEnt is even applied in mineral prospectivity analysis (Nanling, China) [28] and as a model for predicting landslide patterns (Arno Basin, Italy) [29]. In the above-mentioned cases, geology, soil, climate and vegetation spatial data are used along with remote sensing data. It can be assumed that the maximum entropy method, together with the MaxEnt software, is a universal geographical tool for spatial modeling and mapping.
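For orientation, the maximum entropy principle underlying MaxEnt can be written in its standard textbook form. The notation is introduced here only for illustration and is not taken from the cited studies: $X$ is the set of grid cells of the study area, $f_1,\dots,f_n$ are features derived from the environmental variables, and $x_1,\dots,x_m$ are the cells with field observations. Among all distributions $p$ over $X$, the method seeks the one with the largest entropy whose feature expectations match the empirical averages:

$$
\max_{p}\; H(p) = -\sum_{x \in X} p(x)\,\ln p(x)
\quad \text{subject to} \quad
\sum_{x \in X} p(x)\, f_j(x) = \frac{1}{m}\sum_{i=1}^{m} f_j(x_i),\;\; j = 1,\dots,n,
\qquad \sum_{x \in X} p(x) = 1 .
$$

The solution of this problem has the form of a Gibbs distribution with one weight $\lambda_j$ per feature:

$$
p_\lambda(x) = \frac{\exp\!\Big(\sum_{j=1}^{n} \lambda_j f_j(x)\Big)}{Z_\lambda},
\qquad
Z_\lambda = \sum_{x' \in X} \exp\!\Big(\sum_{j=1}^{n} \lambda_j f_j(x')\Big).
$$

In practice, implementations typically relax the equality constraints through regularization, so the fitted model only approximately reproduces the empirical feature averages.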

It is also a developing practice to apply spatial modeling to such natural phenomena as forest formations. For example, habitat suitability modeling was performed for 10 types of forest formations in Europe. That study used a Random Forest model based on 1 km climate variables and more than 6000 forest inventory field plots. The overall accuracy of the final map was 76% [30]. A finite mixture model was applied to the national forest inventory of Italy, consisting of 6714 plots with a measure of abundance for 27 tree species, and a map of potential forest types was produced, also based on 1 km climate data supplemented with some geological and soil data [31]. According to a study of Panamanian tree species, their distribution appears to be determined primarily by dispersal limitation and then by environmental heterogeneity. This study used a permutation-based regression model computed on distance matrices and a hierarchical clustering of the tree composition to construct a predictive map of forest types of the Panama Canal. Fifty-three sample plots describing the floristic composition, along with climatic data, elevation, geologic formation and slope, were used in that study [32]. A study of southern Atlantic Rainforest formations (Brazil) aimed to verify the existence of indicator species and to identify relationships between the distributions of tree species and environmental and spatial variables. The study was based on 21 sample plots and on altitude and climatic variables [33]. In the aforementioned studies, forest formations are often referred to as Floristic Patterns or Species Composition Patterns. Satisfactory results have also been shown for the application of MaxEnt in land-cover classification and land change analysis. For example, a study in Trentino-South Tyrol, Italy, developed land cover and difference maps between 1976 and 2001 based on multispectral data and topographic variables. MaxEnt applied to land cover classes can provide reliable data, especially for classes with homogeneous texture properties and surface reflectance [34].

The purpose of this study is to assess the utility of modern approaches to the spatial modeling of forest communities, using the Moscow Region as an example and relying on field data obtained outside the state forest inventory. The tasks of this study are dictated by the need to develop and adapt optimal methods for handling an array of field descriptions that is unevenly distributed in space and between syntaxonomic units, as well as to develop a probabilistic cartographic model of forest communities at the regional level as an alternative to the generalized official data of the forest inventory.
