1. Introduction
Micro, small, and medium-sized enterprises (MSMEs) are an essential part of economies at the national, regional, and local levels. They provide jobs, create investment opportunities, and develop the economic potential required for development. The literature has therefore examined the determinants of this sector’s development. As emphasized by [1], a region facilitates the formation and transmission of social capital and serves as a geographic platform from which social capital can be accessed. Geographical location also affects spatial disparity in economic activity [2].
The theoretical developments in the new economic geography aim at increasing the understanding of spatial perspectives and economic growth. Within its framework, it is expected that existing institutional infrastructure measured as industry density can generate spillovers that contribute to entrepreneurship development. Moreover, the variation in the firm formation is higher across regions than in time [
3]. Boschma and Frenken [
4] described how evolutionary economics may contribute to a new and more dynamic understanding of the location of an industry. They pay attention to the role of geography and firm location behavior as being price signals (neoclassical) and place-specific institutions as conditioning the range of possible (location) behaviors and potential locations, but not determining the actual (location) behavior and locational outcomes. This is due to several factors. First, the geographical concentration of industrial activities can generate agglomeration economies fostering start-ups and innovation and, possibly, the birth of a related industry in the region. Second, it increases the level of competition and makes the exits of firms raise the average fitness of routines. Third, the spatial concentration of firms can also affect the opportunities of collective action. Place-specific institutions (the institutional view) only condition the range of possible behaviors and potential locations of firms, but the actual behavior and location is largely determined by organizational routines acquired in the past. Changing these routines is possible with innovation and relocation. In the article, the authors analyze the spatial distribution of enterprises and their determinants. Such analysis contributes to understanding the characteristics of the MSME sector in the region, which can support recommending directions for further development of the enterprises, including their innovativeness.
Furthermore, [
5] claims that territories can only be called relevant and meaningful units when the idea of routines and competences can be transferred from the organizational to the regional level. The region has become an entity on its own, providing intangible and non-tradable assets based on a unique knowledge and institutional base, which is not accessible for non-local firms. In the article, the authors investigate the structure at the most local municipality level, which can provide more insight into the intra-regional variation in enterprise structure. The snapshot of the existing distribution of the MSMEs and to what extent it is affected by the spatial and economic context can inform us about the factors that were underlying the establishment of such structure in the past, which, given the assumed path dependency, also affects current developments.
An important explanation of the spatial distribution of the MSMEs is localization economies, which arise from a spatial clustering of economic activities in either the same sector or related industries. As shown by [
6,
7], localization economies, including agglomeration and urbanization effects, may be decisive in explaining the emerging spatial pattern of industries. In particular, the clustering of small firms, related firms, and supplier/customer linkage can help to construct a favorable environment for manufacturing entrepreneurship. Furthermore, [
8] argue that spatial clustering is at the core of economic geography, highlighting the impact of proximity and distance, institutions, and local milieus on economic processes. Spatial clusters induce variation, observability, and comparability, while at the same time allowing increased differentiation without discouraging knowledge exchange. They show that the spatial attributes of interactive learning and innovation process could be a fruitful point of departure not only when analyzing spatial agglomerations, but also when it comes to re-invigorating research into economic geography.
Clusters emerge due to the benefits of proximity. Proximity is discussed by [
9], who analyses five dimensions of proximity (i.e., cognitive, organizational, social, institutional, and geographical). It is claimed that geographical proximity combined with some level of cognitive proximity is sufficient for interactive learning to take place. Geographical proximity may also stimulate the formation and evolution of institutions, such as norms and habits that may affect interactive learning and innovation. Too little proximity leads to the lack of spatial externalities. Yet, too much geographical proximity may lead to a lack of geographical openness. This can particularly apply to micro and small companies, which have fewer resources to reach out beyond the local context of activity. Overall, geographical proximity may facilitate interorganizational learning, but it is neither a necessary nor a sufficient condition. Understanding the diverse patterns of geographical proximity at the local level is therefore an important factor contributing to the identification of possible solutions aimed at enhancing innovation.
The degree of urbanization affects localization economies. Spatial agglomerations create localization advantages in terms of spillovers and cooperation between firms. The benefit of urbanization comes, among others, from lower transport costs and proximity to suppliers and customers [
10,
11,
12]. Access to customers can be proxied by the degree of urbanization. One of the important characteristics related to urbanization is access to transportation infrastructure, which has been found to have a significant effect on urban growth [
13].
Firm size also matters. Companies of different sizes differ in both their sectoral structure and their ability to innovate. Studies show that there is a negative relationship between firm size and employees’ tendency to become entrepreneurs (e.g., [
14,
15,
16]). Ref. [
17] shows that the occurrence of the small firm effect is closely related to low entry costs in industries typically clustered by small firms. Not only do the industry entry conditions of small parent firms have a strong and positive impact on the likelihood of employees’ entrepreneurial entry, but entrepreneurs spawned by small firms show a strong preference for starting a business in the exact same industry as their previous employer. Additionally, evidence in the literature shows that there is a link between firm size and innovative performance [
18,
19,
20]. Finally, smaller firms face more obstacles than larger firms with financing, taxes and regulations, inflation, or anti-competitive practices. For these obstacles, small firms have the biggest problems, followed by medium-sized and large firms [
21]. Moreover, the analysis by firm size in [22] shows that industrial structure and corporate organization affect the benefits that arise from clustering within a given industry. This finding is strongest with regard to the size distribution of establishments; own-industry employment at small establishments presents a much greater attraction to potential new arrivals than does a comparable level of own-industry employment at larger establishments.
Multiple factors, both at the regional and local levels, affect the creation process of new firms, their survival, and their innovative capacity. The evidence in the literature indicates that these include, among others, population growth, a high percentage share in the population of people with high managerial and vocational skills, urban concentration, household wealth, and demand [
23]. Factors that are conducive to the development of small and medium-sized enterprises (SMEs) differ locally. Evidence from Finland shows that urban and rural communities provide different environments for enterprise development, particularly with regards to human capital, access to technology, the number of cluster enterprises, and the intensity of communal cooperation [
23]. However, in their analysis of SMEs’ development in the UK, [
24] conclude that business growth is possible under different territorial conditions, including different levels of competition and market demand between regions and differences in the occupational and skill structure of the labor market. Many SMEs in peripheral regions may actively work to develop strategies to overcome these constraints. Therefore, an initial locational disadvantage may ultimately benefit rather than inhibit a company’s growth and performance. The growth of indigenous companies in peripheral regions results from more active and autonomous roles of the entrepreneurs as well as local governments in creating an entrepreneurship-friendly environment. In turn, this can contribute to the economic and social development of the local areas [
25,
26].
These considerations can also be placed in the context of the role of the MSMEs in the economic transition in Central and Eastern Europe (CEE). In recent years, three major spatial developments have been observed in the CEE countries [27]. First, differences in development are increasing between urban core regions and peripheral rural regions, with the former developing faster. Second, there are strong trends towards polarization between each country’s main metropolitan area (usually the national capital) and the rest of the country. Third, there is an east-west gradient, with the western parts performing better than eastern regions, which is also confirmed by [
28]. As shown by [
29], there is a risk that spatial development further concentrates in a smaller number of (metropolitan) regions, whereas more and more other regions might be affected by the processes of peripheralization. Ref. [
30] show that the regional inequalities in CEE countries are strong and persistent. Weaker CEE regions typically lost the greatest part of their industrial base which, being in capital-intensive sectors, was more exposed to international competition; as a result they are more exposed to recession shocks, which gains importance given the recent COVID-19 pandemic situation. An analysis of the non-core regions in CEE by [
31] shows that, after the financial crisis, the increasing number of SMEs, along with substantial R&D outlays and the development of human capital, were important stimuli for development. Analyses at the CEE intrastate level conducted by [
32] show that while there was general catching-up with the EU–15 average by the state economies, there was also growing economic diversification between regions in the studied countries (internal divergence). Analysis of [
33] indicates that, following the Williamson curve, disparities between regions are lower in the early stages of development, peak in the middle-income stages, and diminish again as a country becomes wealthy. Development policies must not focus exclusively on the country as a whole but have to take into account the preferences and possibilities of their peripheral regions as well. To that end, [
34] show that while considering business stimulation policies, both the quantity and the quality of the new firm start-ups should be taken into account.
The regional determinants of the growth of SMEs have also been investigated in Poland. According to [
35], the human capital development, wages, unemployment, economic activity of the population, and disposable incomes are essential for the development of SMEs. The study of enterprises in southern Poland indicates that access to financial instruments at the regional level is an essential barrier to enterprise development [
36]. Simultaneously, IT infrastructure, closeness to markets, suppliers, and cooperators are the most critical factors stimulating such development.
In this article, the authors analyze the local determinants of the spatial distribution of micro, small, and medium enterprises in the Kujawsko-Pomorskie region in Poland. For many years, this region has lagged behind the rest of the country in economic development. Its innovative capacity remains one of the lowest, in terms of both human capital and investment in research and development [
37]. Given these challenges, this article was developed as part of action research activities under the umbrella of the “REGIOGMINA” project, led by the regional authorities. These activities aim at co-learning that builds the knowledge and understanding needed to design effective policies for boosting innovative MSME growth in the region [38]. As discussed by [39], the choice of incubation strategy depends on the characteristics of local areas. Therefore, providing knowledge on the differences and similarities between municipalities is an important contribution to developing the regional strategy.
The number of micro, small, and medium-sized companies per 10,000 people of working age in the Kujawsko-Pomorskie region remains below the national average (
Figure 1). In the case of micro and medium-sized companies, the gap between their number and the national average has widened in recent years, while in the case of small companies, it has remained stable. The distribution of the number of companies is asymmetric, with the median number of companies below the regional average. At the same time, there are several gminas in which the number of companies per 10,000 people of working age is distinctly above the regional average. As [
24] emphasize, the close analysis of such “paradoxical” cases can open up new perspectives for regional and local policies.
There are also differences in the sectoral structure of the MSMEs both depending on the size of the companies and at the municipal level. In larger firms, the share of industry is higher. There is also spatial variation—there is a clear gradient between the shares of firms in industry and services, which also depends on locality (
Figure 2).
The novelty of the authors’ approach is threefold. Firstly, the authors focus on the lowest administrative level, the gmina (meaning municipality in the urban context or commune in the rural context), as the main actor in creating a friendly environment for conducting business activity [40]. Secondly, the authors use a unique database of local-level enterprises obtained from an administrative register, namely the Social Insurance Institution database. This is the best possible source of information on the actual activity (and employment) of micro, small, and medium enterprises in Poland. Additionally, the authors use gmina-level data from the Local Data Bank of Statistics Poland. Thirdly, the authors also consider spatial factors to explain differences in the MSMEs’ development in the Kujawsko-Pomorskie region. The two main research questions are:
Which factors affect the current distribution of micro, small, and medium enterprises in the region?
Are spatial factors important in explaining the local MSMEs’ development in the region, and can they be successfully included in the analytical models?
In addition to a theoretical understanding of the factors determining the existing structure of micro, small, and medium companies, the authors also identify local areas with higher levels of entrepreneurship than is explained by the analyzed factors.
2. Data and Methods
2.1. Data
The register of the Social Insurance Institution is the data source on the number of micro, small, and medium-sized enterprises at the gmina level. We define micro enterprises as those that have 1 to 9 employees, small as those with 10 to 49 employees, and medium as those with 50 to 249 employees. Employees are identified as those for whom the enterprises pay social insurance contributions. The information on the number of companies paying social insurance contributions was collected for December 2018 in the “REGIOGMINA” project. The use of the administrative register enables the identification of active companies based on their actual employment; therefore, it provides timely and accurate information on the size of the MSME sector and its structure at the local level. The analytical units are gminas in the Kujawsko-Pomorskie region. This choice restricted the explanatory variables to those available at this level of disaggregation. One should note that most of the statistical information on socio-economic development is provided at the powiat (district, NUTS 4) or regional (NUTS 2) level.
The response variable adopted in all the methods was the number of micro, small, and medium-sized enterprises (calculated per 10,000 inhabitants of the gmina—
Figure 3).
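The size thresholds and the per-10,000 response variable described above can be sketched as follows. This is a minimal illustration only: the function names and the input layout (a list of employee head counts per enterprise plus a gmina population figure) are assumptions made here and do not come from the register itself.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def size_class(employees: int) -> Optional[str]:
    """Map an employee head count to an MSME size class (None if outside 1-249)."""
    if 1 <= employees <= 9:
        return "micro"
    if 10 <= employees <= 49:
        return "small"
    if 50 <= employees <= 249:
        return "medium"
    return None  # no insured employees, or a large enterprise


def enterprises_per_10k(employee_counts: Iterable[int], population: int) -> Dict[str, float]:
    """Number of enterprises of each size class per 10,000 inhabitants of a gmina."""
    counts = Counter(size_class(e) for e in employee_counts)
    return {c: 10_000 * counts[c] / population for c in ("micro", "small", "medium")}


# Hypothetical gmina with eight registered payers and 4,000 inhabitants.
rates = enterprises_per_10k([1, 9, 10, 49, 50, 249, 250, 0], population=4000)
print(rates)
```

Enterprises with 250 or more employees, or with no insured employees, fall outside the three classes and are ignored in the rates.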
The authors used the Social Insurance Institution’s registry because it includes up-to-date information on the number of people for whom an enterprise paid social security contributions in December 2018. The resulting number of enterprises differs from that reported by Statistics Poland: enterprises do not regularly update their records there, so the Statistics Poland data show a higher number of companies.
The explanatory variables come from the following sources:
Statistics Poland data, including population and labor market characteristics as well as community wealth, using tax income as a proxy:
- the number of people in gminas, including people in pre-productive, productive, and post-productive ages;
- the number of registered unemployed (total, men, women);
- the number of registered unemployed per 100 inhabitants (derived by the authors);
- gminas’ own revenues from personal income taxes (PIT) per capita;
- local budgets’ total expenditure per capita.
Data from the local policy assessment supporting the entrepreneurship development, obtained from [
41].
Data from the Kujawsko-Pomorskie regional authorities: the number of projects supporting entrepreneurship development funded from the European Regional Development Fund (ERDF) and the amount of funding provided to these projects.
Spatial data of the Head Office of Geodesy and Cartography from the Topographic Object Database with 1:10,000 Level of Detail regarding, in particular, the degree of urbanization. Following [
42,
43], we proxy the transportation infrastructure using the road density; we also take into account the characteristics of the landcover. Therefore, the authors include the following:
- land cover: built-up areas (PTZB class, namely land cover: built-up);
- land cover: agricultural areas (PTTR02, namely land cover: grassland and arable farming);
- land cover: orchards (PTUT03, land cover: permanent cultivation);
- transport network: roads (SKDR, namely transport route: roads);
- transport network: railway tracks (SKTR, namely transport network: a rail or tracks);
- location of large cities (from ADMS class, a territorial division unit: a town);
- the geometry of administrative areas (from ADJA class, an administrative division unit).
Spatial data from other sources (
Figure 4):
- Location of large enterprises in the voivodship, obtained using the ranking of the largest companies in Poland. Cooperation with large companies can be an essential lever for increasing the potential of smaller companies, especially in the innovative (technological and organizational) dimension. Large companies need smaller ones because they are more agile and can propose innovative solutions. MSMEs often better know the local markets on which they focus. Specialized MSMEs can meet the diverse and complex needs of large businesses [44].
- Distance to science and technological parks, obtained using the information provided by the website, “Invest in Kujawsko-Pomorskie”. Science and technology parks create a base for the commercialization of scientific research, research cooperation, and knowledge transfer, which are vital for the development of MSMEs’ innovation and entrepreneurship. These parks offer, among others, management support, training services, venture capital access, intellectual property consultations, and laboratory services [45].
- Distance to areas of the Pomeranian Special Economic Zone, obtained using the information from its website. Special Economic Zones are instruments that support the MSME sector. The zones assure favorable conditions for business activity and foreign investment. Foreign companies operating within the SEZ provide new business standards such as technology, experience in production processes, business contacts, and good practice in training employees, which are exceedingly significant to the development of the SME sector; they are also their primary source of new technologies [46].
- Distance to higher education institutions (HEIs) (source: National Court Register). HEIs are important knowledge alliance partners of the SMEs on the regional level; they constitute the source of tacit knowledge for innovative firms [47].
- Location of the A1 highway exits (source: General Director for National Roads and Motorways): the road network on the local level is vital for economic development at both the local and regional levels, as accessibility is one of the main deciding factors in the location of new businesses [48]. In Poland, as [49] found, the more significant the investment in regional transport infrastructure, including national, regional, and local roads, the more visible the financial and economic outcomes of SMEs.
Spatial data were not used directly; instead, they were used to enrich the gmina-level data set with attributes that formalize the spatial context and spatial relations of each gmina. Consequently, the following additional attributes were created for each gmina:
Percentage coverage of the gmina with a built-up area (
Figure 3).
Percentage coverage of the gmina with agricultural areas.
Percentage coverage of the gmina with orchards.
Railway network density.
Distances from the nearest of the following structures:
- Bydgoszcz or Toruń (largest cities in the region with administrative functions, referred to as “main cities” in Figure 2 and Figure 3),
- another large town,
- Bydgoszcz, Toruń or another large town,
- a university,
- a large company,
- a technology park,
- an economic zone,
- a highway entrance/exit,
- key road infrastructure in the voivodship (national roads).
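The derived attributes listed above (coverage percentages, network density, and distance to the nearest structure) could be computed along the following lines. This is only a sketch under stated assumptions: each gmina is represented here by a single reference point in planar (projected) coordinates, and areas and lengths are assumed to be already aggregated per gmina. The text does not specify how the authors measured distances, so the point-to-point Euclidean distance is an assumption, and all names are hypothetical.

```python
import math
from typing import Iterable, Tuple

Point = Tuple[float, float]  # planar coordinates, e.g. metres in a projected system


def coverage_percent(class_area: float, gmina_area: float) -> float:
    """Share of the gmina covered by a land-cover class, in percent."""
    return 100.0 * class_area / gmina_area


def network_density(total_length_km: float, gmina_area_km2: float) -> float:
    """Length of a road or railway network per square kilometre of the gmina."""
    return total_length_km / gmina_area_km2


def nearest_distance(origin: Point, targets: Iterable[Point]) -> float:
    """Euclidean distance from the gmina reference point to the closest target."""
    return min(math.dist(origin, t) for t in targets)


# Hypothetical gmina reference point and two hypothetical highway exits.
gmina_point = (0.0, 0.0)
highway_exits = [(3.0, 4.0), (6.0, 8.0)]
print(nearest_distance(gmina_point, highway_exits))
print(coverage_percent(class_area=2.5, gmina_area=50.0))
```

The same `nearest_distance` helper can be reused for each structure type (main cities, universities, large companies, technology parks, economic zones) by swapping the list of target points.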
2.2. Methodology
The study of factors influencing the number of MSMEs in a gmina was conducted using several methods:
studying the correlation of variables,
multivariate regression,
models of spatial econometrics,
classification trees.
First, the authors checked the correlation of each potential explanatory variable with the number of micro, small, and medium enterprises separately. Because the number of MSMEs is naturally larger in more populated areas, the authors used the number of enterprises relative to the population (that is, per 10,000 people) in their analysis.
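The correlation screening step can be illustrated with a small sketch. The text does not state which correlation coefficient was used; Pearson's linear coefficient is assumed here, and the variable names and values are hypothetical.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical gmina-level values: PIT revenue per capita vs. enterprises per 10,000 people.
pit_per_capita = [500.0, 650.0, 700.0, 900.0, 1200.0]
micro_per_10k = [110.0, 150.0, 140.0, 190.0, 260.0]
print(round(pearson(pit_per_capita, micro_per_10k), 3))
```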
The next step was to create classical multivariate regression models through two methods:
forward selection: beginning with an empty model and adding further explanatory variables, starting from the one that affects the explained variable the most;
backward elimination: starting with a model with all the variables and then removing subsequent variables, starting from the variable with the least significance.
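The forward selection procedure can be sketched as below. This illustration swaps in a simpler stopping rule than significance testing: a variable is added only while the adjusted R² improves, which is an assumption made here to keep the example free of external libraries. Backward elimination would follow the same pattern in reverse. The data are synthetic and all names are hypothetical.

```python
import random


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def rss(rows, y, cols):
    """Residual sum of squares of an OLS fit (with intercept) on the chosen columns."""
    A = [[1.0] + [row[c] for c in cols] for row in rows]
    k = len(cols) + 1
    AtA = [[sum(a[i] * a[j] for a in A) for j in range(k)] for i in range(k)]
    Aty = [sum(a[i] * yi for a, yi in zip(A, y)) for i in range(k)]
    beta = solve(AtA, Aty)
    return sum((yi - sum(b * v for b, v in zip(beta, a))) ** 2 for a, yi in zip(A, y))


def adjusted_r2(rows, y, cols):
    """Adjusted coefficient of determination for the model using the chosen columns."""
    n, k = len(y), len(cols)
    mean = sum(y) / n
    tss = sum((yi - mean) ** 2 for yi in y)
    return 1.0 - (rss(rows, y, cols) / (n - k - 1)) / (tss / (n - 1))


def forward_selection(rows, y):
    """Greedily add the column that raises adjusted R² the most; stop when none does."""
    selected, remaining = [], list(range(len(rows[0])))
    best = adjusted_r2(rows, y, selected)
    while remaining:
        cand = max(remaining, key=lambda c: adjusted_r2(rows, y, selected + [c]))
        score = adjusted_r2(rows, y, selected + [cand])
        if score <= best + 1e-9:
            break
        selected.append(cand)
        remaining.remove(cand)
        best = score
    return selected


# Synthetic data: only columns 1 and 3 drive the response.
random.seed(0)
rows = [[random.gauss(0, 1) for _ in range(4)] for _ in range(60)]
y = [3 * r[1] - 2 * r[3] + random.gauss(0, 0.1) for r in rows]
chosen = forward_selection(rows, y)
print(chosen)
```

In the synthetic example, the two columns that actually generate the response should be picked first; noise columns may or may not be added afterwards, depending on whether they happen to raise the adjusted R².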
Subsequently, based on the sets of variables defined for the regression, the authors built spatial econometric models. Such models are also known as geographically weighted regression (GWR) and allow us to include the spatial heterogeneity of the variables in the analysis. In classical regression models, the influence of the spatial neighborhood is omitted, although it can have a significant impact on the explained variable. GWR models are used in particular for modeling economic indicators (e.g., the unemployment rate [50]).
The authors built four GWR models which differ with regards to the type of model and the matrix of weights used. Regarding the type of model:
Regarding the method of determining the matrix of weights (defining the neighborhood) that determines the influence of particular municipalities’ values on each other:
a neighborhood-based matrix—a common border between the gminas indicates that a neighborhood exists;
a distance-based matrix—the neighboring degree is inversely proportional to the distance between gminas.
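The two ways of defining the matrix of weights can be sketched as follows. The binary coding of shared borders and the plain inverse of the distance follow the description above; the row standardization step is a common convention that is assumed here rather than taken from the text, and the toy inputs are hypothetical.

```python
import math
from typing import Iterable, List, Sequence, Tuple


def contiguity_weights(n: int, borders: Iterable[Tuple[int, int]]) -> List[List[float]]:
    """Neighborhood-based matrix: w[i][j] = 1 if gminas i and j share a border."""
    W = [[0.0] * n for _ in range(n)]
    for i, j in borders:
        W[i][j] = W[j][i] = 1.0
    return W


def inverse_distance_weights(points: Sequence[Tuple[float, float]]) -> List[List[float]]:
    """Distance-based matrix: w[i][j] = 1 / distance(i, j); points must be distinct."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                W[i][j] = 1.0 / math.dist(points[i], points[j])
    return W


def row_standardize(W: List[List[float]]) -> List[List[float]]:
    """Scale each row to sum to 1 (rows without neighbors stay all zeros)."""
    out = []
    for row in W:
        s = sum(row)
        out.append([w / s if s else 0.0 for w in row])
    return out


# Three hypothetical gminas in a row: 0 borders 1, and 1 borders 2.
W_border = contiguity_weights(3, [(0, 1), (1, 2)])
W_dist = inverse_distance_weights([(0.0, 0.0), (0.0, 2.0), (0.0, 4.0)])
print(row_standardize(W_border))
print(W_dist[0])
```

Under the neighborhood-based matrix, gminas 0 and 2 do not influence each other at all, whereas the distance-based matrix still assigns them a positive, smaller weight.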
In the correlation analysis and in each of the regression and spatial econometric models, the authors use a significance level of 0.05.
The next step involved building models that explain the number of micro, small, and medium enterprises per person in particular gminas using classification trees (of the Classification And Regression Tree type, CART) [
51,
52]. A significant advantage of this type of model is that, unlike linear regression or the spatial econometrics based on it, there are no preliminary assumptions about the linearity of the model. It is also possible to use explanatory variables in the various levels of measurement (variables used in the nominal and ordinal levels were added to the previously used variables).
However, the use of classification trees required the discretization of the response variable. The authors decided on the discretization into three classes (low, medium, high) based on the Jenks natural breaks classification method. While this approach results in a loss of informational content of the data (a “downgrade” of the level of measurement), it does produce higher readability and a more straightforward interpretation of the models created this way. The obtained results can be presented in the form of several logical IF-THEN conditions.
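The discretization into three classes can be illustrated with a brute-force version of the natural breaks idea: among all ways of cutting the sorted values into three contiguous groups, choose the one with the smallest total within-class sum of squared deviations. Production implementations typically use dynamic programming (the Fisher-Jenks algorithm) instead; the exhaustive search below is a simplification adopted for readability, and the sample values are hypothetical.

```python
from itertools import combinations
from typing import Sequence, Tuple


def ssd(values: Sequence[float]) -> float:
    """Sum of squared deviations from the group mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def natural_breaks_three(values: Sequence[float]) -> Tuple[float, float]:
    """Return the upper bounds of the 'low' and 'medium' classes (needs >= 3 values)."""
    xs = sorted(values)
    best = None
    # i and j are the start indices of the second and third group; all groups are non-empty.
    for i, j in combinations(range(1, len(xs)), 2):
        cost = ssd(xs[:i]) + ssd(xs[i:j]) + ssd(xs[j:])
        if best is None or cost < best[0]:
            best = (cost, xs[i - 1], xs[j - 1])
    return best[1], best[2]


def label(value: float, low_max: float, medium_max: float) -> str:
    """Assign a value to one of the three ordered classes."""
    if value <= low_max:
        return "low"
    if value <= medium_max:
        return "medium"
    return "high"


# Hypothetical numbers of enterprises per 10,000 people in eight gminas.
density = [10, 11, 12, 50, 52, 55, 120, 125]
low_max, medium_max = natural_breaks_three(density)
print([label(v, low_max, medium_max) for v in density])
```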
Before determining the trees, the authors assessed the relevance of the available predictors using the chi-square test. The trees were then constructed using the five most significant predictors (determined independently for each model), a splitting criterion based on the Gini index, and a minimum number of nodes of 5.
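How a Gini-based split is chosen for a single numeric predictor can be sketched as follows. This is a simplified illustration of one CART step, not the authors' implementation; in particular, the `min_leaf` parameter (a minimum number of observations on each side of a split) is an interpretation assumed here, and the data are hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple


def gini(labels: Sequence[str]) -> float:
    """Gini impurity: 1 minus the sum of squared class shares."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(xs: Sequence[float], labels: Sequence[str],
               min_leaf: int = 5) -> Optional[Tuple[float, float]]:
    """Return (threshold, weighted impurity) of the best binary split, or None."""
    pairs = sorted(zip(xs, labels))
    n = len(pairs)
    best = None
    for k in range(min_leaf, n - min_leaf + 1):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # cannot split between identical predictor values
        left = [lab for _, lab in pairs[:k]]
        right = [lab for _, lab in pairs[k:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical predictor (percentage of built-up area) and discretized response classes.
built_up = [1, 2, 3, 4, 5, 6]
classes = ["low", "low", "low", "high", "high", "high"]
print(best_split(built_up, classes, min_leaf=2))
```

The resulting threshold translates directly into an IF-THEN rule of the kind mentioned above, for example "IF built-up share is below the threshold THEN class is low".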
4. Discussion
From the wide range of generalized regression models, the authors used three types: a multivariate linear regression model, spatially weighted regression models (spatial econometric models), and CART nonlinear classification trees. The use of diverse regression models, both parametric and non-parametric, would enable obtaining more varied results. However, the purpose of the conducted research is not to assess the adequacy or effectiveness of different models for generalized regression, but to analyze the development of MSME entrepreneurship in the Kujawsko-Pomorskie region. The obtained results indicate that it is possible to develop a relatively reliable model for estimating the level of entrepreneurship development, explaining as much as two-thirds of the variability in the phenomenon (for micro-enterprises), using simple multivariate linear regression models with several explanatory variables. In some models, accounting for the spatial distribution of the phenomenon through the spatial weight matrix increases the coefficient of determination, R², by several percent. It is possible to obtain similar results using non-parametric models, e.g., CART nonlinear regression models. This article is limited to the use of CART classification models, which facilitate the creation of conceptually simple models that explain the spatial differentiation of entrepreneurship development in a voivodship. Thanks to CART classification trees, it was possible not only to extract the decision variables critical for understanding the phenomenon, but also to explicitly formulate the decision rules.
The conducted analyses show a significant correlation between the level of entrepreneurship development, measured by the number of enterprises per 10,000 inhabitants, and the gminas’ own revenues from the PIT per capita.
Gmina’s own revenue from the PIT per capita is a proxy for households’ incomes at the local level and reflects the potential consumer demand. This result can indicate that the micro, small, and medium enterprises in the region belong to a relatively low hierarchy cluster focused on producing for local consumption, following the grouping proposed by [
53]. They divide SMEs in developing countries into three groups. At the lowest tier are small companies that produce for local consumption. The medium-tier companies are better endowed (in capital and skills) and can generate an investible surplus and produce, either directly or on contract, for the domestic and, often, export markets. The third tier includes technically innovative firms that maintain high quality, capable of entering export markets and aspiring to grow. What also confirms the relatively low tier of development of this sector in the analyzed region is the lack of significance of variables related to the distance to technological parks or higher education institutions. Furthermore, the CART classification indicated that the distance to large companies is significant to the density of the micro and small companies at the local level, while it is not significant to medium enterprises. Thus, this confirms that local factors, such as the consumption needs and cooperation between these companies and large enterprises, determine the activity of micro and small enterprises.
Moreover, the authors observed a substantial significance of variables that formalize the spatial context, such as the percentage of built-up areas in gminas, the density of the road network, or distances to some structures. The presence of such variables in both models shows the importance of the spatial context to entrepreneurship development. This is essential because economic analyses usually do not include these variables in such a comprehensive manner. However, this study shows that it is necessary to take into account an area’s spatial characteristics when determining the possibilities of entrepreneurship development. Therefore, enriching the standard data used in economics with information on the spatial context should be considered an essential step in the construction of econometric models, especially since the models based solely on spatial variables used in this study explain between 30 and 45 percent of the variability in the number of enterprises.
Another way of considering the spatial nature of data is by using spatial econometrics instead of classical regression models. These models prove efficient when the residuals from the classical regression models are spatially autocorrelated (that is, the residuals are not randomly distributed in space). In this research, this phenomenon occurred only for medium-sized enterprises, and spatial econometric models were therefore constructed for them. This may be because the business activities of larger enterprises cover a larger area, so their surroundings in neighboring gminas have a more significant impact on the businesses. Under the classification proposed by [53], medium-sized enterprises belong to the group with a range beyond the local market. In all of the cases, it was better to use a matrix of weights based on the distances between gminas rather than on their neighborhood, which suggests that space and distance matter more than administrative divisions. This is consistent with analyses of the labor market areas in Poland that form beyond the administrative borders [
40,
54]. The use of spatial econometric models in place of classical regression models for medium-sized enterprises increased the R² coefficient and improved the model’s quality of explaining the variable.
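One common way to check whether regression residuals are spatially autocorrelated is Moran's I. The text does not name the statistic the authors used, so the sketch below is an assumption offered only to illustrate the idea; the weight matrix and residual values are hypothetical.

```python
from typing import List, Sequence


def morans_i(values: Sequence[float], W: List[List[float]]) -> float:
    """Moran's I of `values` under the spatial weight matrix W (zero diagonal)."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in W)  # sum of all weights
    num = sum(W[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(zi ** 2 for zi in z)
    return (n / s0) * (num / den)


# Four hypothetical gminas along a line: 0-1, 1-2, and 2-3 are neighbors.
W = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
clustered = [1.0, 1.0, -1.0, -1.0]    # similar residuals sit next to each other
alternating = [1.0, -1.0, 1.0, -1.0]  # neighbors always differ
print(morans_i(clustered, W), morans_i(alternating, W))
```

Values clearly above zero indicate that similar residuals cluster in space, which is the situation in which a spatial model is worth considering; values below zero indicate that neighbors tend to differ.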
5. Conclusions
The results presented in the paper indicate that spatial development largely determines the local development of micro, small, and medium enterprises. The number of companies at the local level is strongly correlated with the population size. Therefore, the analysis focused on the number of companies relative to the population size.
The authors’ analysis indicates that the type of entrepreneurship observed in the Kujawsko-Pomorskie region mainly focuses on meeting the demand of local consumers and large companies, as there are more enterprises in those localities where the personal income taxes paid are higher, but also in those located closer to large enterprises. Furthermore, the results show that the existence of technological parks or special economic zones does not play a significant role in the current structure of the MSME sector in the Kujawsko-Pomorskie region. This may indicate that micro, small, and medium companies focus on local consumption. The further development of this sector requires providing the companies with the capital and skills needed to produce for broader markets (both domestic and foreign) and, ultimately, to reach a high level of technical innovation with a high potential for growth and international competition. Thus, regional policies should focus on providing access to financial instruments and investing in the current and future workforce’s qualifications, including vocational or higher education that recognizes innovative skills.
The authors have also confirmed that the spatial context matters; models that include only geographical variables explain a large share of the variance related to the development of micro, small, and medium enterprises. In the case of micro-enterprises, this development correlates with local factors, while in the case of medium-sized enterprises a broader geographical context is essential.
While the presented models do not indicate causality, they can direct the further monitoring of the MSME sector. The results show that there is a group of gminas in which enterprise development is much higher or much lower than explained by the proposed models. These gminas should be further analyzed (using qualitative methods) to identify the factors that stimulate or hinder entrepreneurship development. This may inform the framing of regional policies focused on the development of the MSMEs in the Kujawsko-Pomorskie voivodship.
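One possible way to shortlist such gminas for qualitative follow-up is to standardize the model residuals and flag those beyond a chosen threshold. The sketch below is only an illustration: the gmina names, the observed and predicted values, the function name, and the threshold of two standard deviations are all hypothetical assumptions and are not taken from the study.

```python
import statistics


def flag_unexplained(observed, predicted, threshold=2.0):
    """Return gminas whose standardized residual exceeds the threshold.

    "above" means more enterprises than the model predicts, "below" fewer.
    """
    residuals = {name: observed[name] - predicted[name] for name in observed}
    values = list(residuals.values())
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    flags = {}
    for name, residual in residuals.items():
        z = (residual - mean) / sd
        if z > threshold:
            flags[name] = "above"
        elif z < -threshold:
            flags[name] = "below"
    return flags


# Hypothetical residual pattern: ten unremarkable gminas and two extreme ones.
offsets = [1, -1, 2, -2, 0, 1, -1, 0, 1, -1, 30, -30]
names = [f"gmina_{i:02d}" for i in range(1, len(offsets) + 1)]
predicted = {name: 50.0 for name in names}  # hypothetical model predictions
observed = {name: 50.0 + off for name, off in zip(names, offsets)}

print(flag_unexplained(observed, predicted))
```

The threshold is a judgment call; a stricter or looser cut-off changes how many gminas are selected for further analysis.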
It should be emphasized that conducting comprehensive analyses of the differentiation of the spatial distribution of entrepreneurship in particular gminas of the Kujawsko-Pomorskie region required the integration of descriptive data collected by various institutions, as well as the consideration of spatially localized information. The use of the BDOT 10k Topographic Object Database and other spatial data sources (e.g., the location of special economic zones or technology parks) enabled data enrichment. Enriching the source tabular data with spatial information, e.g., on the distance to the main cities of the region, highway exits, and large enterprises, provided additional explanatory variables for the generalized regression models. Moreover, determining the distance and neighborhood matrices facilitated the creation of spatial econometric models and a more comprehensive explanation of the spatial variability of the phenomenon through a geographically weighted regression model.
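The enrichment step can be sketched as follows: for each gmina, the distance to the nearest object in each spatial layer is appended to the tabular record as a new explanatory variable. This is a minimal sketch under stated assumptions: coordinates are taken to be in a projected system with units of meters, distances are straight-line rather than along the road network, and all names, coordinates, and field names are hypothetical.

```python
import math


def nearest_distance_km(origin, targets):
    """Straight-line distance (km) from origin to the closest target.

    Assumes planar coordinates expressed in meters.
    """
    return min(math.dist(origin, target) for target in targets) / 1000.0


def enrich(records, layers):
    """Copy each record and add one distance column per spatial layer."""
    enriched = []
    for record in records:
        row = dict(record)
        for layer_name, points in layers.items():
            row[f"dist_{layer_name}_km"] = nearest_distance_km(
                record["centroid"], points
            )
        enriched.append(row)
    return enriched


# Hypothetical tabular records with gmina centroids (meters).
gminas = [
    {"name": "gmina_A", "centroid": (0.0, 0.0)},
    {"name": "gmina_B", "centroid": (10000.0, 0.0)},
]

# Hypothetical spatial layers: point locations of objects of interest.
layers = {
    "highway_exit": [(3000.0, 4000.0), (20000.0, 0.0)],
    "large_enterprise": [(10000.0, 12000.0)],
}

for row in enrich(gminas, layers):
    print(row)
```

The same pattern extends to any additional layer, such as special economic zones or technology parks, as long as its objects can be reduced to point coordinates in the same projection.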
The developed generalized regression models, as well as classification trees, have significant advantages over the frequently used non-parametric models. Multivariate linear regression models are straightforward to interpret and apply; they also allow the recipient to determine the impact of specific factors on the model intuitively. The CART classification trees used in this article enable the automatic extraction of decision rules that explain the hierarchical influence of particular predictors on the value of the dependent variable. The obtained results also show the analytical potential of a unified database that depicts entrepreneurship in the Kujawsko-Pomorskie voivodship, as well as the possibility for further data mining with other types of algorithms for generalized regression. This issue will be the subject of further research.
There are, of course, data limitations that need to be considered. As presented in Figure 1, there are outliers in the observed variables that affect the results, which is unavoidable in such an analysis. There are also specific factors that affect MSME development in particular localities (e.g., municipalities at the region's border that are affected by centers located outside the region), which cannot be explained by any type of regression model. Furthermore, in the analysis, the authors did not take into account additional characteristics of the MSMEs in the region, such as sectoral structure, which require further investigation.
Finally, in the article the authors show that administrative data offer additional utility for investigating local characteristics of the MSME sector. The presented models can be extended to other Polish regions but also to the wider context of CEE or European countries (provided that similar data exist). The authors show that the differences observed between regions in the CEE countries, as discussed in the literature, are also observed within regions, with more developed regional capitals and lagging peripheral municipalities. Strengthening the capacity and innovativeness of the MSME sector requires designing policies that take such differences into account and tailoring MSME development strategies to the specificity of the local environment, both socio-economic and spatial.