1. Introduction
Buildings in rural areas of China are commonly dispersed in different locations, making it difficult to implement centralized heating. For this reason, no effective heating solution that can perfectly strike the balance between technical efficiency and economic efficiency [
1,
2] has been found for rural areas. Abundant solar energy and low population density in rural areas provide favorable conditions for the utilization of solar energy [
3,
4]. However, for rural areas with a scattered building layout, choosing an appropriate solar heating mode is difficult. If a centralized solar energy system is used to provide heating for multiple households, heat loss during long-distance heat transmission will increase heating costs [
5,
6]. If independent solar heating systems are used to provide heating for individual households [
7,
8,
9], the overall energy efficiency of the heating systems cannot be improved [
10,
11] because it is impossible to fully utilize the complementing effect [
12,
13,
14] between the heating loads of different buildings that are located relatively close to each other. Therefore, neither a fully centralized nor a fully decentralized solar heating system is a good solution for space heating in dispersed rural areas, and a new design method for solar heating systems in scattered rural areas is urgently needed.
The energy loads of different types of buildings in an area can differ to varying extents. As the load pattern varies from building to building, demand profiles of several adjacent buildings can be aggregated over the same time horizon to level off the overall heating demand profile [
15] so as to reduce the energy supply cost [
16,
17]. This is called the complementing effect. To fully utilize the complementing effect, centralized–decentralized hybrid heating has been proposed [
18,
19]. To implement centralized–decentralized hybrid heating, a building space clustering analysis needs to be carried out for the target area based on building spacing and the complementary characteristics of loads. Then, independent centralized solar heating systems can be designed for different building groups. Spatial clustering is an analysis method that divides objects in a spatial data set into multiple clusters composed of similar objects [
20,
21]. Spatial clustering methods mainly include partition clustering algorithm [
22,
23], hierarchical clustering algorithm [
24,
25], density-based clustering method [
26,
27], grid-based clustering method [
28,
29], and model-based clustering method [
30,
31]. Among them, partitioning clustering algorithms, density-based methods, and graph theory-based methods are more often used in regional building clustering.
A partitioning clustering algorithm is a heuristic method (e.g., based on K-means or K-medoids algorithms) that divides a given set of objects into groups, so that each group contains at least one object and each object belongs to and can only belong to one of the groups. Unternährer et al. [
32] used K-means clustering technique to divide the entire urban area into several smaller neighborhoods. Similarly, Samira et al. [
33] proposed a systematic approach combining the K-means partitioning clustering method with a GIS model to represent an urban area macroscopically as a set of “integrated partitions” integrated by consumers, resources, and energy conversion technologies, solving the problem of energy system design and operational strategy optimization in urban areas. However, the center point of each region in this study was chosen autonomously. To overcome this shortcoming, Giovanni et al. [
34] used classification techniques and clustering algorithms to identify representative buildings in each cluster. Predictive modeling was used to expand cluster membership in the case where some buildings were excluded from the analysis. The graph-theory-based approach, also known as the Minimum Spanning Tree (MST) clustering algorithm, was first proposed by Zahn [
35]. The main idea of this clustering method is to treat each object to be processed as a node of a “graph”, connect the nodes through similarity relations (e.g., proximity relations) to form an undirected graph, and assign a constraint weight to each edge of the graph. Regnauld [
36,
37] analyzed the scale-independent Gestalt parameters such as average size, shape, and density of each building group and then established rules and constraints on the spatial structure of the triangular network, but these constraints were not introduced hierarchically into the building clustering. To disentangle the degree of influence of multiple constraints, Qi and Li [
38] introduced the constraints hierarchically into the building clustering process based on various influencing factors such as distance, direction, and similarity. Although the above two methods have often been applied to the regional clustering of buildings, each has obvious limitations. For the K-means partitioning clustering method, the number of clusters must be specified in advance. Moreover, the initial cluster centers are selected randomly, and the method is sensitive to noisy data. For the graph-theory-based method, constraints such as the number of clusters must also be determined in advance. Therefore, these two methods are not suitable for the spatial clustering of buildings with different distributions in areas where the buildings are scattered.
The density-based clustering method, which can identify any number of clusters with arbitrary shape in noisy datasets, is an appropriate solution to building clustering in various regions [
39]. The basic idea of the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm is that, for each point of a cluster, the neighborhood of a given radius has to contain at least a minimum number of points (MinPts), where the radius and MinPts are input parameters, but this initial method is sensitive to noisy data. Duan et al. [
40] improved on this by proposing a density clustering algorithm for discovering clusters of different local densities in spatial databases, which is able to solve the problem of clustering data with different local densities. The algorithm improved by Dharni et al. [
41] for multi-density data can obtain different values of the neighborhood radius according to the density of different data regions and can therefore handle multi-density data effectively; however, each additional radius value requires another traversal of the data set, which greatly reduces the efficiency of the algorithm. A fast clustering algorithm for DBSCAN was proposed in study [
42], which reduces the number of region queries and thus the clustering time by selecting individual representative objects in the core object neighborhood as seed points for class expansion. Liu et al. [
43] applied a new density-based spatial clustering algorithm, which is able to detect clusters of arbitrary shape and non-uniform density in the presence of noisy points in spatial objects. However, all of the above studies only performed density clustering for spatial locations but failed to consider the load differential characteristics between demand-side buildings. Wang [
44] used a density-based clustering method that considered the complementary effects of spatial dimensions and different demand curves to improve the efficiency and accuracy of large urban energy–water linked system optimization. Marquant [
45] considered the building distances along with the load demand of building users based on a density clustering algorithm and divided the urban-scale case into multiple zones to solve the multi-scale energy network planning problem. However, the above studies mostly target large-scale conventional energy systems in urban areas with relatively stable energy production system. Very few studies have examined solar heating systems, which are characterized by large fluctuations of power output.
In summary, although there has been some research focused on combining building clustering with energy system optimization [
46,
47] to improve the overall financial and technical performance [
48], few studies consider solar heating systems while fully accounting for hydraulic and thermal characteristics in the calculation of transmission loss. In this study, a new methodology is adopted to carry out an in-depth analysis of the impact of building clustering on the design and operation of solar heating systems. After the buildings within an area are clustered into multiple building groups, the designs of the solar heating systems for different building groups are optimized separately, and the optimal design of the centralized–decentralized hybrid solar heating system is obtained by comparing the system life cycle costs (LCCs) [
49] under different clustering schemes. The results of this study provide theoretical support for the design of solar heating systems in areas with a dispersed building layout. The main contributions of this study are as follows:
An optimization framework is developed for centralized–decentralized hybrid solar heating systems based on building clustering.
A building clustering method is proposed by combining the DBSCAN with the Kruskal minimum spanning tree algorithm.
A sensitivity analysis is conducted to investigate the impacts of pipeline price and building spacing on the design of solar heating systems.
The remainder of this paper is organized as follows: the methodology is provided in
Section 2, the results of the study are presented in
Section 3, the discussion is presented in
Section 4, and the conclusions of the study are given in
Section 5.
2. Methodology
The flowchart of the combined optimization method proposed in this paper is shown in
Figure 1. The process can be roughly divided into the following five steps:
- (1)
Collecting data;
- (2)
Building clustering based on density;
- (3)
Generating pipeline network in each building group;
- (4)
Optimizing the solar heating system design for each building group;
- (5)
Determining optimal building clustering scheme and optimal system design.
From Step (1) to Step (3), various categories of data are collected, the whole district is divided into several clusters by using the density-based building clustering technique, and the pipeline network is generated by using the Kruskal minimum spanning tree algorithm. In Step (4) the corresponding solar heating systems for all the building clusters are designed. Finally, by comparing the total system cost, the optimal building clustering scheme and optimal system design can be determined in Step (5). The technical details of the involved approaches are described in
Figure 1.
2.1. Building Clustering
Among density-based building clustering methods, the DBSCAN algorithm is the most widely used because (1) there is no need to specify the number of building groups in advance, (2) building groups with arbitrary shapes can be discovered, (3) noise points can be identified, (4) outliers can be handled properly. Because the DBSCAN algorithm is sensitive to the initial parameter settings, we adjusted the neighborhood radius and minimum number of samples within a certain range to obtain different clustering schemes.
The execution steps of the DBSCAN algorithm are as follows:
Input: Dataset $D$, neighborhood radius $Eps$, and minimum number of samples $MinPts$.
Step 1: Randomly select an unprocessed object $p$ from dataset $D$. If the neighborhood of $p$ with radius $Eps$ contains at least $MinPts$ objects, $p$ is called a “core object”.
Step 2: Traverse the entire dataset and find all the objects that are density-reachable from object $p$ to form a new group. (In an object set $D$, if there is a chain of points $p_1, p_2, \ldots, p_n$ with $p_1 = p$ and $p_n = q$, and each $p_{i+1}$ is directly density-reachable from $p_i$, then point $q$ is deemed density-reachable from $p$.)
Step 3: Generate the final clustering result based on density connections. (If there is an object $o$ such that both object $p$ and object $q$ are density-reachable from $o$, then $p$ and $q$ are deemed density-connected.)
Step 4: Repeat steps 2 and 3 until all objects in the dataset are processed.
It can be seen from the above steps that a density-based cluster is a group of density-connected objects, and its purpose is to maximize density reachability. After clustering the buildings using the DBSCAN algorithm, we can analyze the distribution of sample points of each building group and set the core object point of each building group as the location to install the centralized solar heating system.
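The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this study: the function names `region_query` and `dbscan`, the Euclidean distance on planar coordinates, and the building coordinates in the usage note are all assumptions made for the example, and objects are visited in index order instead of being selected randomly.

```python
import math

NOISE = -1  # label for objects that belong to no building group


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks noise points."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue                      # already processed
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE             # not a core object (may become a border point)
            continue
        cluster += 1                      # i is a core object: start a new group
        labels[i] = cluster
        seeds = list(neighbors)
        k = 0
        while k < len(seeds):             # expand through density-reachable objects
            j = seeds[k]
            k += 1
            if labels[j] == NOISE:
                labels[j] = cluster       # border point reached from a core object
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors) # j is also a core object: keep expanding
    return labels
```

With hypothetical coordinates in meters, `dbscan([(0, 0), (0, 20), (20, 0), (500, 500), (500, 520), (520, 500), (1000, 0)], eps=30.0, min_pts=3)` separates two groups of three buildings and flags the isolated seventh building as noise. Varying `eps` and `min_pts` over a range yields the alternative clustering schemes compared later.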
2.2. Generation of Pipeline Network in Each Building Group
After the buildings are clustered using the DBSCAN density-based clustering method, it is necessary to determine the pipeline network with minimum length in each building group. The conventional Delaunay triangulation method can generate a two-dimensional planning map of the buildings in each group and reduce the pipeline connections between buildings far away from each other, but it cannot generate the pipeline network with minimum length. On the basis of triangulation-based planning, this paper further applies the Kruskal minimum spanning tree algorithm to generate a pipeline network with minimum total length while ensuring that all buildings are connected.
The minimum spanning tree algorithm assumes that, in a given undirected graph $G = (V, E)$, $(u, v)$ represents the edge connecting vertex $u$ and vertex $v$, and $w(u, v)$ represents the weight of this edge. Suppose there exists a subset of $E$ called $T$; if $T$ is an acyclic graph that connects all vertices and $w(T)$ has the minimum value, then $T$ is the minimum spanning tree of $G$. The minimum total edge weight can be calculated using Equation (1):
$$w(T) = \sum_{(u,v) \in T} w(u, v) \qquad (1)$$
The Kruskal algorithm starts from a non-connected graph $T$ that contains all the vertices and no edges, so that each vertex constitutes its own connected component. The algorithm repeatedly selects the minimum-cost edge from the edge set $E$ of the graph. If the two vertices of the edge lie in different connected components of $T$, the edge is added to $T$; otherwise, the edge is discarded and the next minimum-cost edge is selected. This operation is repeated until all vertices in $T$ form a single connected component. In this way, the Kruskal algorithm determines the shortest network connecting all vertices from the edges available in the graph.
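The procedure can be illustrated with a short Python sketch. It is not the code used in this study: the function name `kruskal_mst` is hypothetical, edge weights are assumed to be straight-line distances between building coordinates, and, for brevity, every pair of buildings is treated as a candidate edge instead of only the edges of the Delaunay triangulation. Because a Euclidean minimum spanning tree is always contained in the Delaunay triangulation, both candidate sets give the same total length.

```python
import math
from itertools import combinations


def kruskal_mst(points):
    """Return (edges, total_length) of a minimum spanning tree over 2-D points."""
    parent = list(range(len(points)))  # each vertex starts as its own component

    def find(x):
        # Follow parent links to the component root (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Candidate edges sorted by weight (straight-line distance, an assumption).
    edges = sorted(
        (math.dist(points[u], points[v]), u, v)
        for u, v in combinations(range(len(points)), 2)
    )
    tree, total = [], 0.0
    for w, u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:                 # endpoints lie in different components
            parent[ru] = rv          # merge the two components
            tree.append((u, v))
            total += w
            if len(tree) == len(points) - 1:
                break                # all vertices are connected
    return tree, total
```

For four hypothetical buildings at the corners of a 3 m × 4 m rectangle, `kruskal_mst([(0, 0), (3, 0), (3, 4), (0, 4)])` keeps the two 3 m edges and one 4 m edge, for a total pipeline length of 10 m.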
2.3. Design Optimization of Solar Heating System in Each Building Group
2.3.1. System Structure and Components
After the building groups are determined, a centralized solar heating system for each building group can be constructed. The structure of the solar heating system in this study is shown in
Figure 2. The system consists of solar collectors, a natural gas boiler (auxiliary heat source), a water tank, heating pipelines, and multiple heat users.
- (1)
Solar collector
In solar heating systems, a flat plate collector is the most widely used collector type. The formulas [
50] for calculating heat collection, inlet water temperature, and outlet water temperature of collectors are as follows:
$$Q_c(t) = 3.6\, F_R A_c \left[ (\tau\alpha)\, I(t) - U_L \left( T_{c,in}(t) - T_a(t) \right) \right] \qquad (2)$$
where $Q_c(t)$ represents the heat collection capacity of the collector at time $t$, kJ; $F_R$ represents the dimensionless heat transfer factor of the collector; $A_c$ represents the effective heat collecting area of the collector, m²; $(\tau\alpha)$ represents the product of effective transmittance $\tau$ and absorptivity $\alpha$; $I(t)$ represents the solar radiation intensity, W/m²; $U_L$ represents the total heat loss coefficient of the collector, W/(m²·°C); $T_{c,in}(t)$ represents the inlet temperature of the collector at time $t$, °C; and $T_a(t)$ represents the ambient temperature at time $t$, °C. The constant 3.6 in Equation (2) converts heat units: the left side of the equation is in kJ per one-hour time step, whereas the right side is in W, and 1 W sustained for 1 h equals 3.6 kJ.
$$Q_c(t) = c_p m_c \left[ T_{c,out}(t) - T_{c,in}(t) \right] \qquad (3)$$
where $c_p$ represents the specific heat of the thermal mass, kJ/(kg·°C); $m_c$ represents the circulation mass flow of the collector, kg/h; and $T_{c,out}(t)$ represents the outlet temperature of the collector, °C. Equation (2) calculates the amount of heat collected by the solar collector, which is used as a known value in Equation (3) to calculate the outlet water temperature of the collector.
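As a worked illustration of the collector model, the sketch below assumes the standard flat-plate heat balance (heat removal factor times area times absorbed radiation minus thermal loss), the 3.6 conversion factor described above, and an outlet temperature obtained from an energy balance on the circulating fluid. The function names, the clipping of negative gains to zero, and all numerical inputs are assumptions for the example, not values from this study.

```python
def collector_heat_kj(f_r, area_m2, tau_alpha, irradiance_w_m2,
                      u_l, t_in_c, t_amb_c):
    """Hourly heat collected by the collector in kJ.

    Negative gains are clipped to zero (an assumption: the pump would be off).
    """
    gain_w = f_r * area_m2 * (tau_alpha * irradiance_w_m2
                              - u_l * (t_in_c - t_amb_c))
    return max(0.0, 3.6 * gain_w)  # 1 W sustained for 1 h = 3.6 kJ


def collector_outlet_temp_c(q_kj, cp_kj_kg_c, mass_flow_kg_h, t_in_c):
    """Outlet temperature from the balance Q = cp * m * (T_out - T_in)."""
    return t_in_c + q_kj / (cp_kj_kg_c * mass_flow_kg_h)


# Hypothetical inputs: 10 m² collector, 500 W/m² irradiance, 40 °C inlet, 0 °C ambient.
q = collector_heat_kj(0.8, 10.0, 0.8, 500.0, 5.0, 40.0, 0.0)
t_out = collector_outlet_temp_c(q, 4.2, 200.0, 40.0)
```

With these inputs the net gain is 0.8 × 10 × (400 − 200) = 1600 W, i.e., about 5760 kJ over the hour, which raises a 200 kg/h flow from 40 °C to roughly 46.9 °C.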
- (2)
Water tank
In this paper, the short-term water tank is chosen as the heat storage facility, and a single node model is developed. The mathematical expression of the temperature variation of the water in the water tank is as follows [
51]:
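$$\rho c_p V_{wt} \frac{dT_{wt}}{dt} = Q_c(t) + Q_{gb}(t) - Q_{wt,loss}(t) - Q_{pn,loss}(t) - Q_{load}(t) \qquad (4)$$
where $\rho$ is the water density, kg/m³; $c_p$ is the specific heat capacity of water at constant pressure, kJ/(kg·°C); $V_{wt}$ represents the water tank volume, m³; $dT_{wt}/dt$ is the temperature change in the water tank per unit time; $Q_c(t)$ represents the heat collected by the solar collector at time $t$, kJ; $Q_{gb}(t)$ represents the heat output from the gas boiler at time $t$, kJ; $Q_{wt,loss}(t)$ represents the heat lost to the ambient environment from the water tank at time $t$, kJ; $Q_{pn,loss}(t)$ is the amount of heat loss from the transmission of the pipe network at time $t$, kJ; and $Q_{load}(t)$ represents the heating demand of the building group at time $t$, kJ.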
The formula for calculating heat loss of water tank is as follows:
where
represents the heat loss coefficient of water tank, W/(m
3·°C).
- (3)
Auxiliary heat source
In this paper, the natural gas water boiler is used as the auxiliary heat source in the solar heating system, and its heat output can be calculated using Equation (6),
where
represents the operating load rate of gas boiler, %;
represents the heating efficiency (85%) of gas boiler, %;
represents the rated power of gas boiler, kW.
- (4)
Heat loss during transmission
When each clustering scheme is determined, the amount of heat loss
during transmission to each heat consumer at the time
can be calculated as follows:
where
represents the water flow of primary network at the time
, m
3/h;
represents the temperature of the water supply out of the tank, °C;
represents the final return water temperature of the primary network, °C;
represents the primary network inlet and outlet water temperature at user
n, °C;
represents the secondary network inlet water temperature at user
n, °C;
represents the heat transfer coefficient of the pipe at pipe
k, W/(m
2·°C);
represents the length of pipe
k, m;
represents the heat transfer efficiency of the heat exchanger at user
n, %;
represents the water supply flow of primary network at user
n, m
3/h;
represents the water supply flow in pipe
k, m
3/h;
represents the inlet and outlet water temperature of the pipe, °C; and
represents the heating demand of the building group at user
n, kJ.
2.3.2. System Control Strategy
The starting and stopping of the solar collector and auxiliary heat source are governed by the temperature of the water tank. The control strategy of the solar heating system is shown in
Figure 3. The upper limit of the water tank heating temperature is 85 °C. When the difference between the outlet temperature of the solar collector and the water tank temperature is larger than or equal to 8 °C and the water temperature of the heat storage tank is less than 85 °C, the circulating water pump at the collector end is turned on; otherwise, it is turned off. When the water temperature of the heat storage tank is less than 50 °C, the auxiliary heat source is turned on; otherwise, it is turned off. In
Figure 3, $S$ represents the start–stop control switch for each device, $S_{co}$ represents the start–stop control switch for the solar collector, and $S_{gb}$ represents the start–stop control switch for the auxiliary heat-source gas boiler. When the device start–stop factor $S$ is equal to 0, the device is off; when $S$ is equal to 1, the device is on.
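The control rules described above translate directly into a small decision function. The sketch below is illustrative only: the function name `control_switches` and its parameter names are hypothetical, while the thresholds (8 °C temperature difference, 85 °C tank upper limit, 50 °C boiler start temperature) are the values stated in the text.

```python
def control_switches(t_collector_out, t_tank,
                     dt_on=8.0, t_tank_max=85.0, t_boiler_on=50.0):
    """Return (s_co, s_gb): 1 means the device is on, 0 means it is off."""
    # Collector pump: outlet at least 8 °C above the tank and tank below 85 °C.
    s_co = int(t_collector_out - t_tank >= dt_on and t_tank < t_tank_max)
    # Gas boiler: tank temperature below 50 °C.
    s_gb = int(t_tank < t_boiler_on)
    return s_co, s_gb
```

For example, a collector outlet of 60 °C with a 45 °C tank switches both devices on, whereas a 95 °C outlet with an 86 °C tank leaves both off because the tank has exceeded its upper limit.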
2.3.3. Objective Function
For each building group, the optimization objective is to minimize the LCC of the centralized solar heating system in that building group. The objective function is
where
represents the total initial investment of all equipment in the system (including the cost of pipe network construction), CNY;
represents the operation cost of system equipment within the service life, CNY;
represents the residual value of system equipment, CNY.
The initial investment of the system can be expressed as:
where
represents the equipment cost per input power of gas-fired boiler, CNY/kW;
represents the unit price of solar collector, CNY/m
2;
represents the equipment cost of unit volume of water tank, CNY/m
3;
represents the unit price of pipe network, CNY/m;
represents the total length of pipe network, m;
represents the cost of accessories, including piping accessories such as water pumps, valves, etc., CNY.
The annual operating and maintenance (O&M) cost of the heating system refers to the fuel consumption cost incurred by the equipment operation and the related transportation cost. In this study, the O&M cost largely consists of the cost of natural gas consumed by the heating equipment. Therefore, the operation cost of the system equipment during the operation period is
where
represents the unit heat price of the gas-fired boiler, CNY/kJ;
represents the annual heat output of gas boiler, kJ;
represents the unit electricity price, CNY/kWh;
represents the cumulative power consumption of the pump in a year, kWh;
represents the ratio of equipment maintenance cost to equipment purchase cost, which is set to 2%.
The unit heat price of the gas-fired boiler can be calculated as
where
represents the unit price of the gas, CNY/m
3, and
represents the calorific value of natural gas, kJ/m
3.
The residual value of system equipment is calculated using Equation (16):
where
represents net residual value of fixed assets (portion of the residual value of a fixed asset at the end of its useful life, less any fixed asset liquidation costs payable), CNY, and
represents the ratio of the net residual value of fixed assets to the original value of fixed assets (varies in the range of 3–5% [
52], set to 4% in this paper).
The capital payback factor is calculated as follows:
$$CRF = \frac{i\,(1+i)^{n}}{(1+i)^{n} - 1}$$
where $i$ is the annual interest rate, set to 8%, and $n$ is the service life of the system, set to 15 years [53].
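As a quick numerical check, the sketch below evaluates the factor for the stated interest rate and service life. It assumes the standard capital recovery factor, $i(1+i)^n / ((1+i)^n - 1)$; the function name is hypothetical.

```python
def capital_recovery_factor(rate, years):
    """Standard capital recovery factor: i(1+i)^n / ((1+i)^n - 1)."""
    growth = (1.0 + rate) ** years
    return rate * growth / (growth - 1.0)


# Parameter values stated in the text: 8% annual interest, 15-year service life.
crf = capital_recovery_factor(0.08, 15)
```

Under this assumption, an interest rate of 8% and a service life of 15 years give a factor of roughly 0.117, so each CNY of initial investment corresponds to about 0.117 CNY of equivalent annual cost.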
2.3.4. Constraints
- (1)
Equality constraints
For each building group, the hourly heating supply of the solar heating system should always be equal to the hourly heating demand of all users in the building group. The mathematical expression of the equality constraint is
where
represents the amount of heat stored by the water tank at the time
, kJ.
- (2)
Inequality constraints
The system inequality constraints are expressed as follows.
where
represents the total area of all solar collectors, m
2;
represents the volume of water tank, m
3; and
represents the maximum heating demand of building group, in this case the maximum hourly heating load is set as 6.55 kW. The maximum area limit for solar flat plate collectors is taken from a simulated typical building, whose roof area is 65.88 m
2. The related parameter settings are shown in
Table 1.
At the beginning, the water temperature in the water tank is set to 50 °C, and the temperature of the working medium in the collector is set to 10 °C.
The model is solved using a genetic algorithm in the MATLAB environment, with the time step set to one hour and the whole heating season (1 November to 31 March) set as the calculation cycle. The optimization variables are the total area of the solar collectors, the volume of the water tank, and the rated power of the gas-fired boiler. The related calculation parameter settings are shown in
Table 2.
In the relevant literature, the initial population size of genetic algorithms is generally set between 50 and 200 [56,57]. Each individual in the initial population corresponds to the equipment capacities of one candidate configuration scheme. Within this range, the population size is set to 150 in this study, and the number of iterations is set to 20.
2.4. Comparison and Selection of Optimal Building Clustering Scheme and Optimal System Design
After multiple clustering schemes are obtained through density-based cluster analysis, we can optimize the design of solar heating system for each building group in each scheme and calculate the LCC of the solar heating system of each building group. Thus, the total LCC of the solar heating systems in each building clustering scheme can be obtained by adding up the LCCs of the solar heating systems of all building groups in that building clustering scheme. The scheme with the lowest total LCC is the optimal building clustering scheme, and the corresponding system design is the optimal design of the centralized–decentralized hybrid solar heating system [
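58]. The principle is shown in the following equation.
$$LCC_{opt} = \min_{s} \sum_{g=1}^{G_s} LCC_{s,g}$$
where $s$ indexes the building clustering schemes, $G_s$ is the number of building groups in scheme $s$, and $LCC_{s,g}$ is the LCC of the solar heating system of building group $g$ in scheme $s$, CNY.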
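The selection rule described in this subsection (sum the group LCCs within each scheme, then take the scheme with the lowest total) can be sketched as follows. The function name `select_optimal_scheme` and the LCC values in the usage note are hypothetical and serve only to show the mechanics.

```python
def select_optimal_scheme(group_lccs_by_scheme):
    """Map scheme name -> list of per-group LCCs; return (best_name, totals)."""
    # Total LCC of a scheme = sum of the LCCs of all its building groups.
    totals = {name: sum(lccs) for name, lccs in group_lccs_by_scheme.items()}
    # The optimal scheme is the one with the lowest total LCC.
    best = min(totals, key=totals.get)
    return best, totals
```

For instance, with made-up costs `{"A": [100.0], "B": [40.0, 50.0], "C": [30.0, 30.0, 45.0]}`, the totals are 100, 90, and 105, so scheme B (two building groups) would be selected.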
4. Discussion
In order to gain insights into how the clustering schemes can be appropriately made, sensitivity analysis was carried out as a theoretical analysis of several virtual cases to identify potential patterns in the clustering results related to pipeline price and building spacing. According to the above analysis, the price of heating pipeline will significantly affect the selection of the building clustering scheme, which in turn will affect the design of the solar heating system. In order to investigate the influence of the pipeline price on the design of solar heating system, we compared the system LCC of a typical solar heating system under different pipeline prices (100 CNY/m, 150 CNY/m, 200 CNY/m, 250 CNY/m, and 300 CNY/m) with different building clustering schemes. The results are shown in
Figure 9.
In
Figure 9, the vertical axis represents the total LCC of the solar heating systems in the cluster, and the horizontal axis represents the clustering schemes from A to F. Different colors represent the LCC under different pipeline prices. The price of the heating pipeline has a significant impact on the selection of the optimal building clustering scheme. Under clustering scheme F, each building forms its own building cluster with its own solar heating system. As there is no heat transmission between different buildings, there is no need to build heating pipelines. Thus, the system LCC remains unchanged (its variation is zero) when the heating pipeline price changes. When the clustering scheme changes from A to F under the pipeline price of 200 CNY/m, the total cost of the heating system first decreases, then increases, and finally decreases again. However, the range of variation is not significant, so this price can be used as the price threshold to judge whether the local area is suitable for deploying a centralized–decentralized hybrid solar heating system. When the pipeline price is much higher than 200 CNY/m, the optimal building clustering scheme is F. When the pipeline price is much lower than 200 CNY/m, the optimal building clustering scheme is A. When the pipeline price is close to 200 CNY/m, the optimal system design becomes hard to predict. Because factors such as climate, building layout, and building load in a region may all affect the design results, the price threshold is highly case-specific. Thus, it is necessary to conduct research based on local conditions in order to determine the optimal building clustering scheme and heating system design.
To represent the impact of building spacing on the design result, we assumed that the heating radius of the village expands from 21.56 m to 344.96 m and that the building spacing increases proportionally [
61]. In order to quantitatively assess the aggregation level of each cluster, a density index [
44] is introduced in this study. The density index is calculated by the following equation:
where
is the number of buildings in one cluster and
is the total length of heating pipeline used for transmitting heating to these buildings in the same cluster, m. The building density of the target area is 22.30.
The system LCCs under different building spacings with different building density schemes are shown in
Figure 10.
As can be seen from
Figure 10, the building density also has a significant impact on the selection of the optimal building clustering scheme. As with the pipeline price, there exists a threshold that can be used to judge whether a centralized–decentralized hybrid solar heating system is necessary. For the specific case in this study, the density of 22.30 can be regarded as the threshold, because there is no obvious fluctuation in the system LCC when the building clustering scheme changes. When the building density is less than 22.30, a centralized–decentralized hybrid solar heating system is favorable. When the building density is more than 22.30, a fully decentralized solar heating system is favorable. Overall, the above results indicate that a centralized–decentralized hybrid solar heating system can achieve the ideal cost-saving effect only when the pipeline price and building density fall within a certain range. Otherwise, either fully centralized heating or fully decentralized heating is the optimal system design.
It is clear from the optimization results that the design of energy systems in rural areas needs to take full account of local energy conditions and tailor energy policies to local conditions. For energy companies, various service packages should be introduced to meet the needs of different rural residents. For the government, a well-directed and preferential subsidy policy [
62] should be developed to encourage and stimulate rural residents to participate in renewable energy utilization projects. However, the actual conditions of the target area should be fully considered, e.g., the building layout, energy demands, and natural resources. Moreover, differences in household assets, demographic characteristics, and other livelihood capital may lead to different energy consumption behaviors [
63]. Therefore, there is a need to select appropriate technology pathways and support policies to accelerate the diffusion of renewable energy in rural areas.