1. Introduction
The building sector accounts for 40% of energy consumption and 16% of carbon emissions in the United States, based on the 2020 Energy Outlook from the United States Energy Information Administration [1]. Reducing building energy consumption and carbon emissions remains a challenge, although many advanced building technologies have been proposed. A few well-known technologies are continually evolving, such as ground source heat pumps [2] and heat pumps in cold climates [3], with the goal of building electrification and carbon reduction. For any such building heating/cooling equipment, control loops are an essential part of the system, aiming for optimal operation to reduce energy consumption, power demand, and carbon emissions.
Over the past 10 years, building controls have advanced actively, whereas sensors have not been well studied. Sensors are critical components of control systems: they collect the inputs on which subsequent control actions are based. When sensors work in fault (or unhealthy) conditions, the control benefits are compromised regardless of the effectiveness of the controls [4]. Buildings can easily end up operating under fault conditions [5]. For buildings, multiple factors directly influence sensor placement and deployment, such as sensor errors, sensor locations, sensor types, and sensor costs [4].
Sensors are usually calibrated by manufacturers. However, sensor accuracy might drift over time after installation. There are many reasons for sensor abnormalities, such as harsh environments and manufacturing defects. In such scenarios, sensor reading accuracy might suffer, which is commonly regarded as a sensor fault. HVAC systems usually have multiple sensors to assist the controls, and multiple sensors might have multiple faults [6]. One study described a total of nine types of sensor fault patterns based on measurement datasets [7]:
Outlier: usually a small number of isolated sensor readings, unexpectedly far from the majority of normal readings. The reason is usually unknown but could be related to the data logger;
Spike: a pattern with a much higher rate of change for multiple data points or sensor readings in a short time period. It might be related to battery failure, other hardware failure, or connection issues;
Stuck-at: a pattern with zero variance or constant sensor readings or data points. The reason is usually associated with hardware malfunction;
High noise or variance: a pattern with higher variance or noise than historical data suggests or normally expects for sensor readings or data points. The reasons might be associated with hardware failure, environmental conditions, or weakening battery power;
Calibration: a pattern in which the sensor readings are always offset from ground truth values. It might be related to calibration error or sensor drift. Incipient sensor drift (in which the amount of drift changes with time) is also common in modern sensors;
Connection or hardware: usually inaccurate sensor readings because of malfunctioning hardware (i.e., hardware dependent). Typical patterns are unusually high/low data readings that are frequently out of normal ranges. The possible reasons might be environment changes, sensor aging, short circuit, or loose wires;
Low battery: usually inaccurate sensor readings because of low battery power. Typical patterns are an unexpected gradient followed by zero variance, a lack of data, or excessive noise;
Environment out of range: when the environmental conditions go beyond what the sensor system can read. Typical examples are extremely high and low temperatures. Patterns might be much higher noise or flattening of the data. Similar patterns occur with improper calibrations;
Clipping: sensor readings max out. The pattern could be readings sticking at the maximum or minimum value, perhaps because of environmental conditions.
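A few of the patterns above can be made concrete with a short sketch. The helper names and the sample readings below are hypothetical and are not taken from the cited study; each function simply imposes one pattern (stuck-at, calibration offset, clipping, or a single outlier) on a clean series.

```python
def stuck_at(readings, start):
    """Stuck-at: hold the value at index `start` for the rest of the series."""
    return readings[:start] + [readings[start]] * (len(readings) - start)


def calibration_offset(readings, offset):
    """Calibration: every reading is shifted by the same offset."""
    return [r + offset for r in readings]


def clipping(readings, low, high):
    """Clipping: readings saturate at the sensor's minimum or maximum."""
    return [min(max(r, low), high) for r in readings]


def outlier(readings, index, jump):
    """Outlier: one isolated reading far from its neighbors."""
    faulty = list(readings)
    faulty[index] += jump
    return faulty


# Hypothetical clean zone temperature readings in deg C.
clean = [20.0, 20.5, 21.0, 21.5, 22.0, 22.5]

print(stuck_at(clean, 2))
print(calibration_offset(clean, 1.0))
print(clipping(clean, 20.5, 22.0))
print(outlier(clean, 3, 10.0))
```

Patterns such as spikes, high noise, or low battery would need a time-varying or random component and are left out of this sketch.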
Multiple sensors (e.g., temperature, flow rate) usually work together as a sensor set. Sensor sets differ depending on the HVAC system type and the control loops. HVAC systems vary with building characteristics and functions. For small to medium office buildings, rooftop units (RTUs) are usually used. Typical sensors are air-related [8], such as air temperature, airflow rate, and pressure sensors. For large commercial buildings (e.g., large office buildings), a chiller and cooling tower are usually applied. More sensors are placed on water loops [9], such as water flow rate and water temperature sensors. There are three types of controls for HVAC systems: rule-based control, local control, and supervisory control. Different control strategies might require different sensor sets. Demand control ventilation needs zone CO2 sensors for control actions [8,10]. Occupant control, which relies on occupant sensors, is another topic that has attracted attention in the past few years [11,12].
Only a limited number of studies have investigated sensor fault impacts on buildings and HVAC systems. Past studies show that sensor faults pose a great challenge to the optimal performance of advanced control solutions [13,14]. Sensor fault modeling studies can be classified into two groups: white-box and black-box [5,15]. The majority of studies applied the white-box method, whereas the black-box method is suitable for fault detection. Because of the severe fault impacts, sensor calibration and fault mitigation are becoming more important. The literature is summarized as follows:
- (1) A study investigated the sensor impact on building energy consumption [16] through a small office model in the EnergyPlus platform. The study proposed a new concept for sensor fault impacts: one-way impact and two-way impact. The one-way impact means that sensor faults cause decreased or increased energy consumption or thermal comfort. The two-way impact means that there could be higher energy consumption for a certain energy item (e.g., cooling) and simultaneously lower energy consumption for another energy item (e.g., heating). Another recent study proposed a sensor fault impact analysis framework [9]. This framework is based on white-box methods, which opened a door for sensor fault studies on building performance. Its results show that sensor faults could more than double energy consumption. Another study, using a white-box modeling platform, demonstrated sensor fault impacts for demand control ventilation (DCV) on building energy consumption [8]. The results show that sensor faults severely downgraded the control performance, leading to increased energy consumption. Another recent study developed a few fault models in the EnergyPlus platform, which were validated through experiments [17,18];
- (2) Black-box modeling, or machine learning, is becoming a new trend in fault detection and diagnostics. One study applied artificial intelligence (AI) algorithms to detect sensor faults, based on a large dataset. A review study [19] pointed out that the biggest issue for the black-box method is how to identify the baseline data (data without faults) from the building energy management system;
- (3) Sensor fault calibration and mitigation are receiving attention. One study [20] aimed to calibrate sensor faults by applying the virtual in-situ calibration method. Its results showed that the systematic errors of sensors were less than 2% and the random errors were reduced by as much as 74%. Such sensor calibration significantly reduced the possibility of abnormal data and enhanced the reliability of sensor measurements, which can effectively eliminate the negative sensor impacts on building energy consumption and thermal comfort. Another study [21] applied fault mitigation techniques for sensors (read-back for sensor readings and nearest-neighbor monitoring for correcting faulty sensors), which demonstrated up to 38% improvement in energy consumption and up to 75% improvement in thermal comfort. The sensor faults considered were stuck-at faults, spike-and-stay (SAS) faults with negative and positive spikes, and single-sample-spike (SSS) faults with negative and positive spikes.
However, current studies assume that sensor faults or errors are constant [5,9,15,22,23]. In real conditions, the sensor fault magnitude could evolve or develop over time, which is often observed in field measurements. This is the essence of incipient sensor faults and the main focus of this study. Addressing this issue relies on correct modeling of sensor errors. Another research gap is that no study has proposed a sensor impact evaluation framework; available studies each use their own sensor impact evaluation platform.
The structure of this study is organized as follows:
Section 2 summarizes the sensor impact and evaluation framework, which is the methodology;
Section 3 describes the surrogate model;
Section 4 describes the uncertainty analysis;
Section 5 describes the sensitivity analysis; and
Section 6 provides conclusions.
2. Methodology
This study aimed to systematically investigate the impacts of incipient sensor faults on building control performance. The two-story Flexible Research Platform (FRP-2) building at the US Department of Energy’s Oak Ridge National Laboratory (ORNL) was used to study the sensor fault impacts. The building has five zones on each floor. Cooling is provided by a rooftop unit (RTU). Heating is provided by a gas heating coil and variable air volume (VAV) electric coils. The control strategy for the single-duct VAV terminal boxes and the air handling unit (AHU) is implemented based on the control logic from ASHRAE Guideline 36-2018, High-Performance Sequences of Operation for HVAC Systems [24].
A sensor-impact-oriented framework is proposed for this purpose. The framework comprises (1) a physics-based emulator integrated with sensor faults, control sequences, and building/HVAC models; (2) large-scale cloud simulations in which sampled sensor errors are fed to the controls; (3) a surrogate model developed from the cloud simulation results for sensitivity analysis; and (4) sensitivity and uncertainty analyses for the sensors and desired outputs (e.g., energy consumption, thermal comfort).
This study is based on building energy models in the EnergyPlus platform. The overall workflow is illustrated in Figure 1. Cloud simulation was used to speed up the 3600 simulation cases, which were generated using a stochastic approach. The uncertainty and sensitivity analyses are based on the cloud simulation data. The building model details are not presented here; interested readers may refer to a recent publication on the building [25].
The pseudocode for the sensor fault injection and simulation is shown in Figure 2. The pseudocode follows the basic flowchart in Figure 1, which demonstrates the basic principle of how to implement the sensor impact analysis.
2.1. Sensor Sets
Based on extensive literature reviews, 34 sensors were identified. They are typical sensors used to operate RTU and variable air volume (VAV) systems in small to medium office buildings. The sensors were prioritized based on the severity of indoor air (IA) temperature impacts, which can significantly affect energy efficiency and occupant thermal comfort. The identified sensors are frequently used in commercial buildings. They are listed in
Table 1.
Based on the actual HVAC system configuration of the FRP-2 building, five sensor types were selected for the following reasons: (1) these sensors closely match the selected control logic, and different control logic might need different sets of sensors; (2) the IA temperature is the most important variable to be controlled to meet the heating and cooling set point temperatures; (3) the VAV box supply air (SA) temperature and SA flow rates (SAFs) directly affect the IA temperature from the control perspective; (4) RTU system-level operation also directly affects the VAV box operations; and (5) RTU outdoor air (OA) temperature (OAT) and SA temperature (SAT) are important for determining system-level energy consumption. The sensor types are listed in Table 2. The specifications of the selected sensors are described in Table 3.
2.2. Sensor Errors
Available literature assumes fixed or constant sensor errors. Here, we model the incipient sensor error as the combination of a bias error and a precision (random) error. This research team previously identified these two components of sensor faults [4]. Precision describes how much the sensor reading scatters around the true reading because of measurement noise. Bias describes how far the sensor reading is from the true reading because of systematic bias. Figure 3 shows a diagram of precision and bias. A typical characteristic of incipient faults is that the fault magnitude might change slowly with time, so the effects on control performance might go unnoticed.
For a sensor, an ideal reading (or true reading) exists at a given time step, as shown by the black line in Figure 4. The bias error is the systematic deviation from the ideal readings, as shown by the green dotted lines in Figure 4. The precision error is the random deviation or noise around the average sensor readings, as shown by the blue dashed lines in Figure 4.
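The mathematical expression of such a fault profile is given as

$$y_{\mathrm{fault}} = y_{\mathrm{ideal}} + \varepsilon_{\mathrm{bias}} + \varepsilon_{\mathrm{precision}} \qquad (1)$$

where $y_{\mathrm{fault}}$ is the fault reading, $y_{\mathrm{ideal}}$ is the ideal reading (no fault), $\varepsilon_{\mathrm{bias}}$ is the bias error, and $\varepsilon_{\mathrm{precision}}$ is the precision error.
The bias error is a normal distribution with a certain standard deviation. The expression is given as

$$\varepsilon_{\mathrm{bias}} \sim N\left(0, \sigma_{\mathrm{bias}}^{2}\right) \qquad (2)$$

The precision error is also a normal distribution with a certain standard deviation. The expression is given as

$$\varepsilon_{\mathrm{precision}} \sim N\left(0, \sigma_{\mathrm{precision}}^{2}\right) \qquad (3)$$

where $\sigma_{\mathrm{bias}}$ is the standard deviation of bias error and $\sigma_{\mathrm{precision}}$ is the standard deviation of precision error.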
The sensor errors were implemented in the emulator, which is based on EnergyPlus and Python EMS. Because larger airflow sensor errors caused technical difficulties, the airflow sensor errors had to stay within an effective range. The standard deviations for the five types of selected sensors are shown in Table 4.
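The error model described in this subsection (fault reading = ideal reading + bias error + precision error, each error drawn from a normal distribution) can be sketched in a few lines. This is a minimal illustration and not the emulator code: the function name, the zero mean, the fixed seed, and the standard deviations of 0.5 °C and 0.2 °C are assumptions chosen for the example, not the values in Table 4.

```python
import random


def faulty_reading(ideal, sigma_bias, sigma_precision, rng=random):
    """Return ideal + bias error + precision error.

    Both errors are drawn from zero-mean normal distributions (assumed),
    with the given standard deviations.
    """
    bias = rng.gauss(0.0, sigma_bias)
    precision = rng.gauss(0.0, sigma_precision)
    return ideal + bias + precision


# Hypothetical ideal zone temperatures (deg C) at four time steps.
rng = random.Random(42)  # fixed seed only to make the example repeatable
ideal_series = [21.0, 21.2, 21.4, 21.6]

# A new error is drawn at every time step, as in the simulation workflow.
faulty_series = [faulty_reading(t, 0.5, 0.2, rng) for t in ideal_series]

for ideal, faulty in zip(ideal_series, faulty_series):
    print(f"ideal={ideal:.2f}  faulty={faulty:.2f}  error={faulty - ideal:+.2f}")
```

Setting both standard deviations to zero returns the ideal reading unchanged, which matches the expectation that a zero-error case reproduces the baseline.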
2.3. Control Logic for RTU and Single-Duct VAV System (ASHRAE Guideline 36)
The installed HVAC systems in the FRP-2 building are RTUs, in which cooling is from a direct expansion cooling coil and heating is from a gas heating coil. The FRP-2 building has 10 conditioned zones. Each conditioned zone is served by a VAV box with an electric reheat coil. The air handling unit (AHU) connects all the zone VAV boxes and the RTU. Control logic from ASHRAE Guideline 36-2018, High-Performance Sequences of Operation for HVAC Systems [24], was developed for the RTUs and VAV boxes.
The first control logic is the trim and respond (T&R) set point logic for the AHU. T&R logic resets set points of pressure, temperature, or other variables on the AHU or plant side. T&R logic reduces the set point at a fixed rate until the zone thermal comfort is no longer satisfied, at which point the zone generates a request. The set point is increased in response to a sufficient number of requests. By adjusting the importance of each zone’s requests, the critical zones will always be satisfied. If there is not a sufficient number of requests, then the set point decreases at a fixed rate.
The term “request” refers to a request to reset a static pressure or temperature set point generated by downstream zones or AHUs. These requests are sent upstream to the AHU or plant that supplies the zone or area that generated the request. For more details of the T&R logic, please refer to [24,26].
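One update step of the T&R behavior described above can be sketched as follows. This is a simplified illustration, not the Guideline 36 sequence itself: the function name, the step sizes, and the fixed respond step are assumptions, and the default threshold of two requests follows the emulator setting in which fewer than two requests are ignored. The signs of the trim and respond steps are left as parameters because the direction of the reset depends on the variable being controlled.

```python
def trim_and_respond(setpoint, requests, sp_min, sp_max,
                     min_requests=2, trim_step=-0.1, respond_step=0.2):
    """Return the set point after one T&R update.

    - Fewer than `min_requests` requests: trim (move by `trim_step`).
    - Otherwise: respond (move by `respond_step`).
    The result is always kept within [sp_min, sp_max].
    All numeric defaults are hypothetical.
    """
    if requests >= min_requests:
        new_setpoint = setpoint + respond_step
    else:
        new_setpoint = setpoint + trim_step
    return min(max(new_setpoint, sp_min), sp_max)


# Hypothetical sequence of request counts over successive update intervals.
setpoint = 15.0
for requests in [0, 0, 1, 3, 4, 0]:
    setpoint = trim_and_respond(setpoint, requests, sp_min=12.0, sp_max=18.0)
    print(f"requests={requests}  setpoint={setpoint:.1f}")
```

A fuller implementation would also weight each zone's requests by an importance multiplier and scale the response with the number of requests, which this sketch omits.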
T&R control was used to reset the RTU SA set point temperature in the emulator. When the OAT was higher than the maximum OAT (21 °C), the RTU SAT was set to the minimum RTU SA set point temperature (12 °C). When the OAT was lower than the minimum OAT (16 °C), the RTU SAT was set to the maximum RTU SA set point temperature (18 °C). When the OAT was between the minimum and maximum OAT, the RTU SAT was linearly decreased from the maximum RTU SA set point temperature to the minimum RTU SA set point temperature as the OAT increased. For T&R control, as ASHRAE Guideline 36 describes, fewer than two requests were ignored.
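The OAT-based limits stated above define a piecewise-linear mapping, sketched below. The function and parameter names are hypothetical; the four temperatures are the ones given in the text, and the straight-line interpolation between the two end points is the only behavior assumed for the middle range.

```python
def rtu_sat_setpoint(oat, oat_min=16.0, oat_max=21.0,
                     sat_min=12.0, sat_max=18.0):
    """Map outdoor air temperature (deg C) to the RTU SA set point (deg C).

    OAT at or above oat_max -> sat_min; OAT at or below oat_min -> sat_max;
    in between, interpolate linearly between the two end points.
    """
    if oat >= oat_max:
        return sat_min
    if oat <= oat_min:
        return sat_max
    fraction = (oat - oat_min) / (oat_max - oat_min)
    return sat_max - fraction * (sat_max - sat_min)


for oat in [10.0, 16.0, 18.5, 21.0, 30.0]:
    print(f"OAT={oat:.1f} degC -> SAT set point={rtu_sat_setpoint(oat):.1f} degC")
```

At an OAT of 18.5 °C, halfway between the two limits, the sketch returns 15 °C, halfway between the two set point limits.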
- 2. VAV box control logic
The VAV box control is the second control logic applied to the emulator. Figure 5 shows the control logic for the VAV box from ASHRAE Guideline 36. The control logic has three sections, which correspond to the heating mode, cooling mode, and dead-band, and it uses the heating loop demand concept. Heating loop demand is the ratio (as a percentage) of the actual required heating load of the VAV box to the size of the VAV box. Equation (4) describes how to calculate the heating loop demand.
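Written out directly from the definition above, and with symbol names that are assumptions rather than the original notation, the heating loop demand would take a form such as:

```latex
\mathrm{HLD} = \frac{Q_{\mathrm{heat,required}}}{Q_{\mathrm{heat,size}}} \times 100\%
```

Here $Q_{\mathrm{heat,required}}$ stands for the actual required heating load of the VAV box and $Q_{\mathrm{heat,size}}$ for the size of the VAV box, both in the same units.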
The detailed logic is threefold:
- a. In the heating mode, when the heating loop is less than or equal to 50%, the discharge air (DA) set point temperature of the VAV box is increased from the RTU SAT to the maximum DA set point temperature of the VAV box, and the minimum SAF is maintained. When the heating loop is greater than 50%, if the DA temperature of the VAV box is greater than the IA temperature plus 3 °C, then the SAF of the VAV box is increased from the minimum SAF to the maximum SAF while maintaining the maximum DA set point temperature of the VAV box;
- b. In the cooling mode, the DA temperature of the VAV box is the same as the RTU SAT because no option exists to decrease the SAT using the VAV box. Therefore, VAV box control is linked with T&R control in the cooling season, when the VAV box control must take the RTU SAT into account. The four rules for cooling SA set point temperature reset requests are as follows:
If the IA temperature exceeds the indoor cooling set point temperature by 3 °C for 2 min, after the suppression period resulting from an RTU SA set point temperature change via the T&R control, then send three requests;
Else, if the IA temperature exceeds the indoor cooling set point temperature by 2 °C for 2 min, after the suppression period resulting from an RTU SA set point temperature change via the T&R control, then send two requests;
Else, if the cooling loop is greater than 95%, then send one request until the cooling loop is less than 85%;
Else, if the cooling loop is less than 95%, then send no request.
In terms of the SAF in the cooling season, the SAF of the VAV box is increased from the minimum SAF to the maximum SAF as the cooling loop is increased;
- c. In the dead-band mode, when neither heating nor cooling is needed, the SAF is set to the minimum SAF, and the DA temperature of the VAV box is set to the RTU SAT.
The overall control logic is shown in
Figure 6.
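The four cooling request rules listed above can be sketched as a small function. This is an illustrative simplification, not the emulator code: the function and argument names are hypothetical, a single timer (`minutes_exceeded`) is assumed to track how long the relevant temperature excess has lasted, "exceeds by" is read as "at least", and the one-request rule is modeled with a latch that holds the request until the cooling loop drops below 85%.

```python
def cooling_sat_requests(ia_temp, cooling_setpoint, cooling_loop,
                         minutes_exceeded, in_suppression, one_request_latched):
    """Return (number_of_requests, one_request_latched) for one zone.

    ia_temp, cooling_setpoint: deg C; cooling_loop: percent (0-100).
    minutes_exceeded: how long the temperature excess has persisted.
    in_suppression: True while the post-reset suppression period is active.
    one_request_latched: True if the one-request rule fired earlier and the
    cooling loop has not yet fallen below 85%.
    """
    excess = ia_temp - cooling_setpoint
    sustained = minutes_exceeded >= 2 and not in_suppression

    if excess >= 3.0 and sustained:
        return 3, False
    if excess >= 2.0 and sustained:
        return 2, False
    if cooling_loop > 95.0:
        return 1, True
    if one_request_latched and cooling_loop >= 85.0:
        return 1, True  # keep the single request until the loop drops below 85%
    return 0, False


# Hypothetical zone states: (IA temp, set point, loop %, minutes, suppressed, latch)
cases = [
    (27.5, 24.0, 50.0, 3, False, False),
    (26.2, 24.0, 50.0, 3, False, False),
    (24.5, 24.0, 96.0, 0, False, False),
    (24.5, 24.0, 90.0, 0, False, True),
    (24.5, 24.0, 80.0, 0, False, True),
]
for case in cases:
    print(case, "->", cooling_sat_requests(*case))
```

The request counts returned per zone would then be summed and passed to the T&R logic at the RTU level.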
2.4. Large-Scale Simulation
The large-scale simulation was based on a commercial cloud platform, Microsoft Azure. In total, 3600 cases were simulated on the cloud. The inputs were the sensor errors incorporated into the five selected sensors for the FRP-2 building emulator, as shown in
Table 2. The sensor errors were obtained using normal distribution samplings. EnergyPlus internal programming limits caused simulation crashes when larger sensor errors were incorporated. The standard deviations of sensor errors were based on multiple trials. The thresholds were based on engineering experience, domain knowledge, and actual RTU- and zone-level sensor ideal readings. The outputs were the target variables for energy consumption and thermal comfort, such as fan electricity consumption and reheat coil electricity energy in the VAV box.
The basic diagram is shown in
Figure 7. The basic workflow is as follows:
- (1) A Python script was developed to generate 3600 simulation input data files (IDF files). Each IDF file was associated with a Python class of sensor errors through Python EMS. During the simulation, at each time step, a new sensor error (including bias and precision) was injected into the ideal sensor readings from EnergyPlus;
- (2) After the 3600 cases were generated, they were uploaded to the Azure cloud platform;
- (3) In the Azure cloud platform, a bash script selected the appropriate virtual machine configuration (e.g., memory and hard drive, as shown in Table 4) and the number of virtual machines. The team’s subscription included 300 nodes (virtual machines);
- (4) The Azure cloud provided a job scheduler, which automatically distributed all 3600 cases across the 300 nodes;
- (5) The simulation ran automatically until all cases were completed;
- (6) Finally, all the results were collected to set up the data sets (inputs and outputs) used to create the black-box models;
- (7) The configuration for the cloud is shown in Table 4.
A total of 300 nodes were used for the cloud simulation, in which each node was a standard node with 16 cores, 64 GB of memory, and 600 GB of storage capacity. The total simulation time was about 9 h.
The sensor errors were sampled from a normal distribution at each time step. The sensor errors were added to the sensor readings from EnergyPlus to form the faulty sensor readings. The faulty sensor readings were used as inputs to the control sequences to calculate new set points. These new set points were used to control the building. Ultimately, the simulated energy consumption and thermal comfort differed from the results obtained using the ideal sensor readings.
2.5. Other Aspects
To ensure that the simulation results are correct, a few additional explanations are summarized below.
- (1) The baseline model was calibrated with the actual components and systems within the FRP-2 building on the ORNL campus. The input values for the HVAC system are from measurements and nameplate values. The simulation results demonstrated consistency between the model and the measurements [25];
- (2) There are 3600 simulation cases in total. Each case is matched with a sensor error module. At each time step, the sensor error value is injected into the model following the sensor error components (bias and precision). The energy consumption differences caused by the sensor errors were calculated directly between the baseline case and each sensor-error case. When the sensor errors were set to zero at all simulation time steps, the same energy consumption as the baseline model was obtained;
- (3) We analyzed the results and found them reasonable with respect to the sensor errors. For example, (a) when we increased the sensor error of the zone temperature sensor in cooling mode (lower zone temperature than it is supposed to be), the energy consumption increased. This is because the building model thinks it needs more cooling energy to meet the cooling set points. (b) When we increased the sensor error of the zone temperature sensor in heating mode (higher zone temperature than it is supposed to be), the energy consumption decreased. This is because the building model thinks it needs less heating energy to meet the heating set points;
- (4) The sensor error in this study followed a normal distribution (Figure 4), and the sensor error range was calculated as the bias error plus the precision error. For example, if the standard deviation of the temperature sensor error is 1 °C, the temperature sensor error is within −3 °C and +3 °C with a probability of about 99.73%. Similarly, the probability that the sensor error is between −1 °C and +1 °C is about 68%, and the probability that it is within −2 °C and +2 °C is about 95.4%. The extreme cases account for about 0.27% of scenarios on the two ends. Therefore, the differences (numbers) mentioned above occur when the sensor error is the largest (either positive or negative values).
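The coverage probabilities of a zero-mean normal error can be checked with the error function from the Python standard library. The sketch below is illustrative; the function names are hypothetical, and the second helper assumes that the bias and precision errors are independent, in which case the standard deviation of their sum is the root sum of squares.

```python
import math


def coverage_probability(k):
    """P(|error| <= k * sigma) for a zero-mean normal error."""
    return math.erf(k / math.sqrt(2.0))


def combined_sigma(sigma_bias, sigma_precision):
    """Standard deviation of bias + precision, assuming independent errors."""
    return math.hypot(sigma_bias, sigma_precision)


for k in (1, 2, 3):
    inside = 100.0 * coverage_probability(k)
    print(f"within +/-{k} sigma: {inside:.2f}%   outside: {100.0 - inside:.2f}%")
# within +/-1 sigma: 68.27%   outside: 31.73%
# within +/-2 sigma: 95.45%   outside: 4.55%
# within +/-3 sigma: 99.73%   outside: 0.27%
```

With a standard deviation of 1 °C, the ±3 °C band therefore covers about 99.73% of the sampled errors, and the two tails together hold about 0.27%.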
6. Conclusions
This study investigated the impacts of incipient sensor faults on the ASHRAE Guideline 36 control sequences through sensitivity and uncertainty analyses. The sensor errors had two components: bias error and precision (random) error. The sensor errors were sampled from normal distributions. Cloud simulations were conducted for 3600 simulation cases based on the sampled sensor errors. The results were collected to train surrogate models for sensitivity analysis.
The energy consumption was classified into system levels (power demands) and zone levels (zone air temperature, zone sensible heating, zone sensible cooling, and zone reheat coil energy). The thermal comfort (PPD) at the zone level was also investigated.
The uncertainty and sensitivity analyses were conducted with respect to sensor errors and energy/thermal comfort variables. The uncertainty analysis showed that the sensor errors and energy consumption have a nonlinear relationship. Under sensor error uncertainty, energy consumption showed wide distributions relative to the baseline model:
The site energy could be 3.3% lower or 18.1% higher than the baseline;
The heating energy could be 66.5% lower or 314.4% higher than the baseline;
The cooling energy could be 11.5% lower or 65.0% higher than the baseline;
The fan energy could be 0.15% lower or 6.9% higher than the baseline.
The sensitivity analysis was performed at both system and zone levels. At the system level, the random errors for SAT and OAT sensors had the most significant impacts. At the zone level, the random errors were the most influential, followed by total errors and then bias errors.
In the future, there are a few works worth exploring:
This study clearly demonstrated the severe impacts of incipient sensor faults. The implications for research, policy, and study are as follows: (1) sensors should be calibrated as recommended by the manufacturer; and (2) if calibration is feasible, fault mitigation is recommended.