Resilient Reinforcement Learning for Voltage Control in an Islanded DC Microgrid Integrating Data-Driven Piezoelectric

Sheida, Kouhyar; Seyedi, Mohammad; Afridi, Muhammad Ali; Ferdowsi, Farzad; Khattak, Mohammad J.; Gopu, Vijaya K.; Rupnow, Tyson

doi:10.3390/machines12100694



by Kouhyar Sheida ¹, Mohammad Seyedi ¹, Muhammad Ali Afridi ², Farzad Ferdowsi ¹,*, Mohammad J. Khattak ², Vijaya K. Gopu ³ and Tyson Rupnow ³

¹ Electrical & Computer Engineering, University of Louisiana, Lafayette, LA 70503, USA

² Civil Engineering, University of Louisiana, Lafayette, LA 70503, USA

³ Louisiana Transportation Research Center (LTRC), Baton Rouge, LA 70808, USA

* Author to whom correspondence should be addressed.

Machines 2024, 12(10), 694; https://doi.org/10.3390/machines12100694

Submission received: 4 September 2024 / Revised: 30 September 2024 / Accepted: 30 September 2024 / Published: 1 October 2024

(This article belongs to the Special Issue Applications of Piezoelectric Devices and Materials)


Abstract:

This research study presents a resilient control scheme for an islanded DC microgrid (DC MG) integrating solar photovoltaic (PV), battery storage (BESS), and piezoelectric (PE) energy harvesting modules. The microgrid (MG) case study represents an energy hub designed to provide electricity for lighting systems in transportation, roads, and other infrastructure. To enhance practicality, the PE is modeled using the real data captured from a traffic simulator. The proposed reinforcement learning (RL) method was tested against four severe and unexpected failure scenarios, including short circuit at the load side, sudden and severe change of load, open circuit, and converter failure. The performance of the controller was quantitatively compared with a conventional PI controller. The results show marginal improvement in one scenario and significant improvement in the other three, suggesting that the proposed scheme is a robust candidate for microgrids with high levels of uncertainty, such as those involving solar and PE harvesters.

Keywords:

islanded DC microgrid; piezoelectric; voltage control; reinforcement learning

1. Introduction

Renewable energies (REs) have become increasingly popular for modern microgrid (MG) systems because of their environmental safety and ability to address pollution concerns [1]. Although nuclear, hydro, and thermal power generation continue to be the primary sources of power globally [2], wind turbines and photovoltaic (PV) power generation technologies are gradually gaining prominence. However, their effectiveness is heavily reliant on meteorological conditions [3]. As a result, the majority of wind turbine plants are strategically situated on mountains and seashores to harness the wind efficiently. PV systems, in turn, can be installed on buildings and diverse structures, including rooftops and road tactile pavement [2,3]. Another source is piezoelectric (PE) devices, which convert environmental vibrations into electrical energy. Historically, their power generation capabilities were not adequate for use as primary sources of power. Instead, they were commonly employed in low-power applications such as sensors, quartz watches, and portable charging devices [4]. The authors in [5] used a sensorless buck–boost converter to optimize power for PE energy harvesting. Utilizing PE technology for energy harvesting has various benefits. It offers a potential source of sustainable RE by converting mechanical energy that would otherwise be wasted into electrical power [6]. Moreover, owing to their low cost and low maintenance requirements, PE devices are presently being investigated for generating substantial electricity in railroads and roadways [7]. Based on research and road testing conducted by the California Energy Commission, a PE power generation system installed in a single lane of a one-mile road may produce an annual electricity output of 72,800 kilowatt-hours [8].
The authors in [9] present a dual-layer substrate piezoelectric transducer for harvesting energy from vehicle vibrations in asphalt, demonstrating that a finite element model predicts 132 V output and 3.56 mW power at 20 Hz with reduced stress concentration, enhancing durability and efficiency. To provide a workable harvested energy configuration for pedestrian deployment, the study in [7] highlighted the various strain elements that are impacted by the energy produced by a PZT (Lead Zirconate Titanate) energy-harvesting floor tile (EHFT). A variety of methods for obtaining energy from PE sensors are shown in Figure 1, ranging from low-power wearables to energy sources found in pedestrian traffic.

A voltage feedback-based technique was used by a power management integrated circuit to harvest energy from PE components [10,11]; a PE micro-actuating system was controlled by an adaptive controller that incorporated an anti-windup compensator. PEs have also been employed as resonators in DC-DC converters. In [12], they were used in an inductorless step-up converter for more efficient operation, and in a self-bias flip rectifier for tracking the maximum power point [13]. Earlier research treated sources and converters separately, aiming either to increase energy extraction from PE sources or to decrease converter size. Yet each source requires its own controller when paired with its specific converter. Conversely, ref. [14] introduced a novel approach for hybrid power systems. By using a direct power management strategy to combine a low-power PV panel with a PE harvesting module, this method provides a more cohesive and efficient energy harvesting solution. No batteries are needed, because the system is directly connected to the grid. The authors of that study demonstrate the efficiency with which the PE module generates RE. However, the proposed system lacks an energy storage system (ESS), which could improve the utilization of the produced RE. It is well known that mechanical vibrations and the surrounding environment strongly influence how much energy PV and PE devices can produce. In this study, we describe an islanded microgrid (MG) system that operates in parallel and is intended for smart road applications. The system combines battery storage (BESS), integrated PE components, and solar PV cells. To create the required voltage, we used a Material Testing System (MTS) to simulate different pressure conditions on the PE components. Our goal in creating a model predictive RL controller for the PE system was to improve control and stabilize abrupt voltage fluctuations.
This all-encompassing strategy enhances energy harvesting while ensuring steady and dependable power transmission, increasing the usefulness and efficiency of smart infrastructure. The goal is to design and test a sustainable energy hub, with part of the energy sourced from piezo sensors embedded in asphalt or concrete. Because of the uncertainties associated with piezo harvesters, we propose and test a robust AI-driven controller that can handle these uncertainties and unexpected disturbances more effectively than traditional controllers. The remainder of the paper is organized as follows. Section 2 presents the experimental results from the MTS machine, with a focus on the performance of the embedded PE sensors under loading intended to mimic the impact of vehicle traffic on a roadway. Section 3 covers the MG configuration and the power electronics circuits designed for each energy source. Section 4 assesses the performance of the MG during islanded operation and describes the implementation of the proposed controller in detail. To evaluate the proposed controller’s resilience, Section 5 compares its effectiveness with a conventional Proportional–Integral (PI) controller across a range of fault scenarios. Section 6 concludes the study and provides recommendations for further research.

2. Material Testing Machine and Piezoelectric Data

2.1. Testing System

The testing apparatus used for the sample evaluations is the 810 Material Testing System depicted in Figure 2, equipped with servo-hydraulic loop functionalities. During testing, samples were loaded for 0.1 s and unloaded for 0.9 s, simulating a truck traveling at speeds of 60–70 mph. Two boundary conditions were employed: (1) simply supported and (2) fully supported. Two types of tests were conducted: single haversine waveform loading and multiple haversine waveform loading. Single haversine loading represents a single truck or car moving over the pavement system, while multiple haversine loading simulates two or more trucks with varying loads moving over the pavement system.

2.2. Embedment of Piezoelectric Sensors in Fiber-Reinforced Concrete Samples

The PE sensors were embedded at a depth of 1/8 inch from the bottom center of the fiber-reinforced concrete samples. This placement ensures that maximum bending occurs at the center of the sample under load, generating the highest possible voltage due to tension. In these samples, the boundary condition was simply supported. The loading frequencies for both single and multiple haversine loading were set at 1 and 10 Hz, with loads ranging from 650 lbf to 3200 lbf. Given that the flexural strength of the fiber-reinforced concrete beam is approximately 5900 lbf, the maximum load was capped at 3200 lbf. A sample of the fabricated concrete is shown in Figure 2.

The effects of single and multiple vehicle loads on piezoelectric sensors installed in fiber-reinforced concrete are depicted in Figure 3. The sensors produce a peak voltage of about 0.5 V when subjected to a single haversine load; however, when subjected to multiple loads, the voltage decreases to about 0.14 V because the concrete’s increased stiffness limits voltage generation and deformation.

2.3. Piezoelectric Sensors in Asphalt Samples

In the asphalt samples, PE sensors were placed between the asphalt and rubber layers to maximize bending and voltage generation when the load was applied. The effects of single and multiple vehicle loads on piezoelectric sensors embedded in asphalt are depicted in Figure 4. For the single haversine loading and multiple haversine loading of asphalt, the loading frequencies were 5 and 10 Hz, with loads ranging from 370 lbf to 270 lbf. In multiple haversine tests, the loading varied from 320 lbf to 400 lbf with frequencies of 1 Hz, 5 Hz, and 10 Hz. For both testing scenarios, the boundary condition was simply supported. The maximum load was set at 400 lbf to avoid exceeding the asphalt’s flexural strength of approximately 600 lbf. Figure 5 displays a sample of the manufactured asphalt. Because the asphalt is less stiff and can withstand more deformation, the sensors can produce up to 10 V when subjected to multiple haversine loads. For the same reason, the voltage output under a single haversine load is also larger than in the concrete samples, which demonstrates the material’s impact on voltage generation.

In the fiber-reinforced concrete samples, lower voltage outputs were observed due to their higher stiffness. Conversely, asphalt samples exhibited higher voltage outputs because of their lower stiffness.

3. Islanded DC Microgrid Configurations and Setup

In the proposed power system shown in Figure 6, a PV system, a BESS, and a PE module are used as power sources. This section explains each power source in detail, including its power rating, converter configuration, and mathematical model.

3.1. Piezoelectric

The two testing techniques were single-haversine waveform loading and multiple-haversine waveform loading. Single-haversine loading replicates a single truck or car moving over the pavement with a constant load. Multiple-haversine loading replicates two or more trucks with varying loads moving over the pavement system. In this experiment, the Piezo Ceramic Generator SM411 PE model was used. It measures 79 × 18 × 1.5 mm and has a static capacitance of 110 nF ± 30%. This PE module is displayed in Figure 7. PEs are low-voltage components; thus, to create the voltage needed for them to be used as a power source on a highway, we connect them in series.

3.1.1. Rectifier

A two-phase harvesting circuit is used to optimize power extraction from various dynamic sources. Figure 8 shows this procedure. The current produced by the PE components is first rectified into direct current and briefly held in a capacitor $C_{i}$, while $V_{in}$ is maintained at the ideal rectifier voltage $V_{rec}$. The rectified power stored in capacitor $C_{i}$ is then sent to the load bus via a DC-DC converter. This technique ensures that the load bus receives the most power possible from the vibrating PE element. PE modules are connected in series to provide a sufficient output voltage for the controller and boost converter to convert the source bus voltage to the intended load bus voltage.

3.1.2. DC-DC Boost Converter

The general circuit of a boost converter is shown in Figure 9. This model ignores the equivalent series resistance of the capacitor, the on-state resistance of the switch, and the off-state resistance of the diode. The output voltage of the converter can be changed by altering the width of the input pulses; a higher PWM pulse frequency permits a smaller inductor.

A boost converter exhibits a non-linear relationship between the duty cycle and the output voltage because of the non-linear characteristics of the inductor. During the charging period, the inductor current $i_{l}$ rises exponentially while the inductor voltage $v_{l}$ falls exponentially, so both have an exponential relationship with the inductor charging time. The voltage across the inductor of a boost converter determines the output voltage in direct proportion, and at a fixed operating frequency the charging time is set by the duty cycle. This exponential relationship is why a boost converter’s output voltage is a non-linear function of the PWM duty cycle. As the duty cycle increases, the output voltage first climbs until it peaks at a specific point and then starts to fall. The shape of the voltage-versus-duty-cycle curve depends on the load connected to the converter. This behavior is depicted in Figure 10. The system’s DC-DC boost converter can operate at 85–95% efficiency [15], depending on the load and switching frequency. The piezoelectric energy harvesting module can generate up to 13.5 V and 0.8 A when working optimally; however, a boost converter is needed to raise the harvested voltage to the required bus voltage. This high-efficiency conversion allows even small power contributions from the piezoelectric source to be incorporated into the microgrid, so that the captured energy is fully used. A more detailed model of the converter can be found in [16].
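The rise-then-fall of the output voltage with duty cycle can be reproduced with the standard averaged model of a boost converter whose only loss is the inductor winding resistance. The sketch below is not the model of [16]; the function names and the values of `r_l` and `r_load` are illustrative assumptions, and only the 13.5 V input is taken from the text.

```python
import math


def boost_gain(duty: float, r_l: float, r_load: float) -> float:
    """Steady-state V_out / V_in of a boost converter with inductor resistance r_l.

    Averaged-model result: M(D) = (1 / (1 - D)) / (1 + r_l / ((1 - D)**2 * r_load)).
    With r_l = 0 this reduces to the ideal gain 1 / (1 - D).
    """
    if not 0.0 <= duty < 1.0:
        raise ValueError("duty must be in [0, 1)")
    d_prime = 1.0 - duty
    return (1.0 / d_prime) / (1.0 + r_l / (d_prime ** 2 * r_load))


def peak_duty(r_l: float, r_load: float) -> float:
    """Duty cycle at which the gain peaks: D = 1 - sqrt(r_l / r_load)."""
    return 1.0 - math.sqrt(r_l / r_load)


# Assumed example values: 0.1 ohm winding resistance, 10 ohm load.
V_IN = 13.5  # V, maximum PE module voltage quoted in the text
R_L, R_LOAD = 0.1, 10.0
for duty in (0.2, 0.5, 0.8, 0.9, 0.95):
    v_out = V_IN * boost_gain(duty, R_L, R_LOAD)
    print(f"D = {duty:.2f} -> V_out = {v_out:.1f} V")
```

Under these assumed values the gain peaks near D = 0.9 and falls beyond it, and a heavier load (smaller `r_load`) moves the peak to a lower duty cycle, which is consistent with the load dependence described above.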

3.2. Solar

Power electronics interfaces, such as DC converters, are necessary to control the DC output voltage produced by a PV system. The PV system is equipped with a buck converter that can withstand variations in irradiance while preserving a high conversion efficiency of up to 90%. It runs at a nominal voltage of 48 V. A DC-DC buck converter consists of a power switch (Q), a switching controller, a diode (D), an inductor (L), a capacitor (C), and the DC load bus, as shown in Figure 11. It steps down an input voltage to obtain a lower output voltage. In this simulation, the input voltage is 48 V at an irradiance of 1000 W/m². With the controller’s assistance, the buck converter lowers the voltage to the load bus level. The inductor stores energy when the switch is on and releases it to the load when the switch is off, which smooths the output, and the output voltage grows linearly with the duty cycle, as illustrated in Figure 12. The performance can be affected by non-ideal elements such as switching losses and inductor resistance ($R_{L}$). Ref. [17] contains further modeling details.
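For an ideal buck converter in continuous conduction, the linear relationship is simply V_out = D × V_in. The short sketch below illustrates it for the 48 V input stated above; the 24 V target is an assumption inferred from the voltage band used in Equation (8), and the function names are hypothetical.

```python
def buck_output(v_in: float, duty: float) -> float:
    """Ideal buck converter output voltage: V_out = D * V_in."""
    if not 0.0 <= duty <= 1.0:
        raise ValueError("duty must be in [0, 1]")
    return duty * v_in


def buck_duty(v_in: float, v_out: float) -> float:
    """Duty cycle an ideal buck converter needs to reach v_out from v_in."""
    if v_in <= 0.0 or not 0.0 <= v_out <= v_in:
        raise ValueError("require v_in > 0 and 0 <= v_out <= v_in")
    return v_out / v_in


# 48 V PV input (from the text) stepped down to an assumed 24 V load bus.
print(buck_duty(48.0, 24.0))
```

In a real converter, losses such as the inductor resistance mean a slightly higher duty cycle is needed than this ideal value.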

3.3. Battery

The ability of the bidirectional DC converter to switch the direction of power flow and transfer power between two DC sources is well known. To efficiently handle this power transfer—which is essential for battery charging and discharging—two switches work together. Two MOSFETs are used as the switches in this instance. The connection between the load bus and the converter and battery is depicted in Figure 13. Two finely regulated switches are managed to accomplish the bidirectional conversion.

The output voltage $V_{out}$ of an ideal buck–boost converter is related to the duty cycle $D$ and input voltage $V_{in}$ by the following equation:

V_{out} = \frac{D}{1 - D} \times V_{in}

(1)

As Figure 14 illustrates, this equation shows that for $0 < D < 1$ the output voltage grows as the duty cycle increases. In real-world applications, a buck–boost converter’s performance can be affected by non-idealities such as switching losses and inductor resistance ($R_{L}$). The inductor’s internal resistance $R_{L}$ affects both the converter’s output voltage and its efficiency. A battery system’s efficiency depends on its internal resistance, charging and discharging rates, and temperature. For lithium-ion batteries, round-trip efficiency, that is, the ratio of energy delivered during discharge to energy supplied during charging, usually ranges from 85% to 95%, depending on the operating conditions and application. The voltage drop across $R_{L}$ during the on-time and off-time changes the ideal voltage conversion ratio. Taking this voltage drop into account, the real output voltage $V_{out}$ can be approximated as

V_{out} \approx \frac{D \times (V_{in} - i_{l} \times R_{L})}{1 - D}

(2)

where $i_{l}$ denotes the average inductor current. This formula shows how $R_{L}$ reduces the output voltage. Because of the converter’s topology, the output voltage of a buck–boost converter is negative: the converter reverses the polarity of the input voltage [18]. The configuration of the switching elements and inductor causes this inversion. Furthermore, the buck–boost converter in our system functions as a boost converter, because the battery is in discharge mode. This study does not cover the technical details of SoC control and battery charging. Nonetheless, bidirectional DC-DC converters are usually employed in microgrid systems to manage battery charging and discharging. Many battery management systems and industry standards recommend maintaining the SoC within 10% to 90% to balance performance, safety, and longevity. This practice is common in various applications, including electric vehicles and grid energy storage systems. Algorithms such as constant current/constant voltage (CC/CV) could be used to provide charge control and ensure proper charging. The SoC, which is crucial for preserving battery health, is frequently represented mathematically using voltage thresholds and current integration. The following equation can be used to describe how quickly a battery charges:

I_{charge} = \frac{P_{in}}{V_{bat}}

(3)

where $I_{charge}$ is the charging current, $P_{in}$ is the input power from the source (e.g., PV, piezoelectric, etc.), and $V_{bat}$ is the battery voltage. A battery’s state of charge (SoC) can be represented as follows:

SoC(t) = SoC(t_{0}) + \frac{1}{C_{bat}} \int_{t_{0}}^{t} I_{bat}(τ) \, dτ

(4)

where $C_{bat}$ is the battery capacity (in amp-hours, Ah), $I_{bat}$ is the charging or discharging current, and $SoC(t_{0})$ is the initial state of charge at time $t_{0}$. This equation integrates the current over time to determine the change in the state of charge. More modeling information about battery charging and discharging is provided in [19]. The converter parameters and power ratings of the planned DC MG are displayed in Table 1 along with its configuration.
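Equations (3) and (4) can be evaluated numerically by summing the current over fixed time steps (coulomb counting). The following is a minimal sketch, not the battery model of [19]; the function names, the sign convention (positive current charges the battery), the fixed time step, and the clamping of SoC to [0, 1] are all assumptions made for illustration.

```python
def charge_current(p_in_w: float, v_bat_v: float) -> float:
    """Equation (3): charging current in amperes from input power and battery voltage."""
    if v_bat_v <= 0.0:
        raise ValueError("battery voltage must be positive")
    return p_in_w / v_bat_v


def update_soc(soc0: float, currents_a, dt_s: float, capacity_ah: float) -> float:
    """Discrete form of Equation (4) using rectangular integration.

    soc0 and the result are fractions in [0, 1]; capacity is converted from
    amp-hours to amp-seconds so it matches the time step in seconds.
    """
    capacity_as = capacity_ah * 3600.0
    soc = soc0
    for current in currents_a:
        soc += current * dt_s / capacity_as
        soc = min(max(soc, 0.0), 1.0)  # assumed physical limits
    return soc


# Hypothetical example: 10 Ah battery charged at 5 A for one hour from 20% SoC.
i_chg = charge_current(120.0, 24.0)
print(update_soc(0.2, [i_chg] * 3600, 1.0, 10.0))
```

A practical battery management system would additionally enforce the 10% to 90% operating window mentioned above and correct the drift that pure current integration accumulates.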

4. Methodology and Proposed Controller

4.1. Reinforcement Learning

RL has been used extensively in power electronics and power systems. RL is used in electric vehicles (EVs) to improve battery utilization and energy management [20]. RL increases efficiency in the field of electrical machines and converters [21], and it enhances autonomy and resilience in MGs [22]. Sequential decision making under uncertainty can be modeled as a Markov Decision Process (MDP). An MDP is built from a set of states and actions and is described as a five-element tuple $(δ, A, T, R, γ)$, in which the action space is represented by $a_{t} \in A$ and the MG’s state space by $s_{t} \in δ$. $a_{t}$ denotes the action taken at time $t \in R^{+}$. RL is a machine learning method based on trial and error. Guided by a reward signal, an RL agent actively experiments with various control actions in its environment and learns the dynamics by observing the results. At each time step $t \in R^{+}$, the environment in an MDP provides a vector describing its state, $s_{t} \in δ$. The agent (or control policy) sends a suitable action $a_{t}$ in response to the perceived state $s_{t}$. Next, a scalar reward $r_{t + 1} = r(s_{t}, a_{t})$ is given to the agent. As a result of this action’s impact on the environment, the state evolves to $s_{t + 1} \in δ$ according to the state-transition probability $p(s_{t + 1} | s_{t}, a_{t})$. The policy is represented by a mapping, $π$, from states to actions, which can be either deterministic or stochastic. The total discounted reward $R_{t}$ is given by:

R_{t} = \sum_{k = 0}^{\infty} γ^{k} r (s_{t + k}, a_{t + k})

(5)

where the discount factor $γ \in [0, 1]$ accounts for future uncertainty. When $γ = 0$, the RL agent considers only immediate rewards. As $γ$ rises, the agent becomes more forward-looking, valuing long-term benefits over short-term gains.
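The effect of the discount factor in Equation (5) can be seen on a finite reward sequence. The sketch below truncates the infinite sum to the rewards supplied; the function name and the sample rewards are illustrative assumptions.

```python
def discounted_return(rewards, gamma: float) -> float:
    """Finite-horizon version of Equation (5): sum of gamma**k * r_k."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must be in [0, 1]")
    total = 0.0
    # Accumulate backwards so each step applies one more factor of gamma.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


rewards = [1.0, 1.0, 1.0]
for gamma in (0.0, 0.5, 1.0):
    print(gamma, discounted_return(rewards, gamma))
```

With `gamma = 0.0` only the first reward counts, which corresponds to the purely short-term case described above.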

4.1.1. Regression

A Q-table becomes infeasible when Q-learning is applied in scenarios where the state–action space is large or infinite. This challenge can be addressed by replacing the Q-table with a non-linear function approximator, which turns the task into a supervised learning problem similar to regression. The growing ability of deep neural networks to handle complicated, high-dimensional systems made them an attractive option for estimating the action-value function. This approach dates back to the introduction of the Deep Q-Network (DQN) by Mnih et al. in 2015 [23]. In power electronics converters, regression can be used to find a relationship between input and output variables. Regression analysis is mostly used in power electronics converters to predict the value of the input signal needed to obtain the intended output [24].

The relationship between the input and output variables varies with the type of converter. Some relationships, such as that of the boost converter, can only be captured by non-linear regression, whereas others, such as that of the buck converter, can be captured by linear regression, which is used in this paper. Linear regression involves fewer steps and less processing power than non-linear regression [25]. A polynomial of degree n is represented by Equation (6):

y = β_{1} x^{n} + β_{2} x^{n - 1} + β_{3} x^{n - 2} + \dots + β_{n + 1}

(6)

Linear regression is a statistical technique that uses a linear equation to represent the association between one or more independent variables (X) and a dependent variable (Y). Its main objective is to find the line of best fit that accurately predicts the value of Y for a given value of X. Equation (7) is the general linear regression formula for several independent variables [26].

Y = β_{0} + β_{1} X_{1} + β_{2} X_{2} + \dots + β_{k} X_{k} + ε

(7)
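For a single independent variable, the coefficients of Equation (7) have a closed-form least-squares solution that is cheap enough for a microcontroller-class device. The sketch below is a generic ordinary least-squares fit, not the paper's implementation; the function name and the sample data points are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for Y = beta0 + beta1 * X; returns (beta0, beta1)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta1 = sxy / sxx
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1


# Hypothetical (load, duty cycle) samples lying on a straight line.
loads = [10.0, 20.0, 30.0, 40.0]
duties = [0.5 + 0.01 * r for r in loads]
print(fit_line(loads, duties))
```

Because the solution needs only a few running sums, it avoids the iterative optimization that a non-linear fit requires.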

Although deep reinforcement learning-based controllers perform exceptionally well on high-end computers, they are unsuitable for end devices such as microcontrollers, which have limited computational capacity. To make this strategy usable on such microcontrollers, this work combines a regression-based optimization technique with a more straightforward Q-table-based method [25].

4.1.2. Suggested Approach

The control system described in this paper is based on RL and uses regression to determine the best controller policy. The converter’s performance depends strongly on its duty cycle and on the load, and the relationship between the two has been mapped using the RL model. The relationship between the duty cycle and the load impedance is nonlinear. To supply the required PWM signal, the model uses a second-order exponential function as the policy. To improve the policy, the RL model applies nonlinear regression to the Q-table data. The PWM duty cycle acts as the model’s action, while the load impedance acts as its state. The controller features an instantaneous load impedance measurement and a voltage-tracking loop. The output voltage at the load bus and the shunt voltage drop are used to track the load impedance. Equation (8) shows how the RL model updates the Q-table. After deciding whether the current action is better than the previous one for the given state, the procedure logs the reward for the current state–action pair. With a single action, the controller can provide the desired result under any condition. As such, only past rewards, and not prospective future rewards, need to be considered when rewarding any state–action pair.

SR_{\max} = \begin{cases} SnR & \text{if } 23.9 < \text{voltage} < 24.1 \text{ and } SnR = | \text{voltage} - 24 | \text{ and } SR_{\max} < SnR \\ SR_{\max} & \text{otherwise} \end{cases}

(8)

The update starts by obtaining the PWM and voltage readings. If the voltage is between 23.9 and 24.1, it evaluates the highest reward $SR_{\max}$ recorded for the current state. It then computes the new reward $SnR$ as the absolute difference between the voltage and 24. If $SR_{\max}$ is less than $SnR$, the process updates $SR_{\max}$ to $SnR$ before coming to an end. To implement the proposed controller in this experiment, state–action pair rewards are first recorded in a reward matrix. The controller defines an initial policy function, which is then optimized. Subsequently, the software analyzes the Q-table data to determine whether a reward for a new state–action pair has been added. If so, the software runs regression iterations to improve and optimize the policy function. After the load impedance has been measured, the policy function is used to generate the appropriate PWM signals. Algorithm 1 illustrates this.
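The update rule in Equation (8) can be written as a small function. The sketch below follows the equation literally as stated, including the use of the absolute deviation from 24 as the reward value; the function name, argument names, and default band limits are illustrative, and how the result is stored in the Q-table is not shown.

```python
def update_max_reward(sr_max: float, voltage: float,
                      lower: float = 23.9, upper: float = 24.1,
                      target: float = 24.0) -> float:
    """Literal reading of Equation (8).

    Returns the new reward SnR = |voltage - target| when the voltage lies
    inside the band and SnR exceeds the stored SR_max; otherwise returns
    SR_max unchanged.
    """
    if lower < voltage < upper:
        snr = abs(voltage - target)
        if sr_max < snr:
            return snr
    return sr_max


print(update_max_reward(0.0, 24.05))  # inside the band: stored value is replaced
print(update_max_reward(0.0, 25.00))  # outside the band: stored value is kept
```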

The policy is initialized with a randomly chosen second-order exponential function. The controller generates the PWM signal that drives the converters of the MG, while the loads attached to the converter output are switched at predetermined intervals. As the duty cycle adapts to the different load scenarios, the policy function begins to optimize. The controller fits the policy function in (9) to capture the relationship between the duty cycle and the connected load.

f(R_{n}) = a e^{b R_{n}} + c e^{d R_{n}}

(9)

Algorithm 1 Policy update algorithm

1: Start
2: Define Q matrix
3: Define a policy function
4: while true do
5:     Read data from Q table
6:     if $n (S, A) > n p (S, A)$ then
7:         for $i \leftarrow 0$ to 49 do
8:             Iterate for the best policy
9:         end for
10:         $n p (S, A) \leftarrow n (S, A)$
11:     end if
12:     Read the load impedance
13:     Calculate PWM
14:     Write PWM
15:     Update Q table
16: end while

The RL agent computes a, b, c, and d using the described policy, obtaining $a = 6.56$, $b = -0.54$, $c = 0.31$, and $d = -0.032$. These values are applied to the transfer function for the boost converter. For the buck converter, the policy function likewise begins to optimize as the duty cycle changes to account for changing loads, but it employs the linear function in Equation (10) rather than an exponential one, because the buck converter has a linear behavior, as depicted in Figure 12.

f (R_{n}) = a R_{n} + b

(10)

Following this strategy, the RL agent computes the coefficients a and b, which have the values $a = 0.00839$ and $b = 0.4897$. To compare the robustness of the suggested MG controller with that of the PI, four failure scenarios are examined. Figure 15 shows how RL is implemented in the DC MG.
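The two learned policy functions, Equations (9) and (10), can be evaluated directly with the reported coefficients. The units of $R_{n}$ and the scaling of the returned value are not stated in this section, so the sketch below returns the raw function value without clamping; the function and constant names are hypothetical.

```python
import math

# Coefficients reported in the text for Equation (9) (boost) and Equation (10) (buck).
BOOST_COEFFS = (6.56, -0.54, 0.31, -0.032)
BUCK_COEFFS = (0.00839, 0.4897)


def boost_policy(r_n: float, coeffs=BOOST_COEFFS) -> float:
    """Equation (9): f(R_n) = a * exp(b * R_n) + c * exp(d * R_n)."""
    a, b, c, d = coeffs
    return a * math.exp(b * r_n) + c * math.exp(d * r_n)


def buck_policy(r_n: float, coeffs=BUCK_COEFFS) -> float:
    """Equation (10): f(R_n) = a * R_n + b."""
    a, b = coeffs
    return a * r_n + b


for r_n in (1.0, 5.0, 10.0):  # assumed sample load values
    print(r_n, boost_policy(r_n), buck_policy(r_n))
```

Because both exponents in Equation (9) are negative, the boost policy output decreases monotonically with $R_{n}$, whereas the buck policy output rises linearly from its intercept of 0.4897.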

5. Results and Discussion

5.1. Short Circuit Across the Load

In DC MG systems, AC-side faults typically stem from failures in AC-side power converters, such as voltage source converters, or from short circuits in transmission and distribution lines. Conversely, DC faults can be short-circuit faults such as arc, line-to-ground (L2G), and line-to-line (L2L) faults. When a system malfunctions, the converters’ operating points change drastically [27]. If the controller cannot handle such situations, the DC load bus voltages will drop and fluctuate, which will de-energize the entire system. Furthermore, there may be a fire hazard, depending on the nature and location of the fault. Thus, it is imperative that short-circuit faults be taken into account when evaluating the resilience and reliability of converters [28]. In light of these concerns, we carried out a comparative analysis to illustrate the resilience of the suggested approach. Both the suggested technique and the PI controller were tested under a short-circuit fault across the load. The PI was tuned by trial and error, adjusting the proportional and integral gains based on system performance to minimize steady-state error and achieve the desired response characteristics. As Figure 16 illustrates, the fault occurred at $t = 1$ and was cleared at $t = 1.1$. As shown in Table 2, the RL outperforms the PI, with an approximately 6% lower overshoot and an undershoot reduced by about 11%. Furthermore, the peak-to-peak value is considerably lower with the RL. While the RL performs better overall, its steady-state error is slightly greater than that of the PI.
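Transient metrics of this kind can be extracted from a sampled load-voltage trace. The exact definitions used for Table 2 are not given here, so the sketch below adopts common ones as assumptions: overshoot and undershoot as percentages of the reference, peak-to-peak as the difference between the maximum and minimum samples, and steady-state error from the mean of the final portion of the trace. All names and the sample trace are hypothetical.

```python
def transient_metrics(voltages, reference: float, settle_fraction: float = 0.1):
    """Overshoot, undershoot, peak-to-peak, and steady-state error of a trace."""
    if not voltages or reference == 0.0:
        raise ValueError("need a non-empty trace and a non-zero reference")
    v_max, v_min = max(voltages), min(voltages)
    # Use the last settle_fraction of the samples as the settled value.
    n_tail = max(1, int(len(voltages) * settle_fraction))
    tail = voltages[-n_tail:]
    v_final = sum(tail) / len(tail)
    return {
        "overshoot_pct": max(0.0, (v_max - reference) / reference * 100.0),
        "undershoot_pct": max(0.0, (reference - v_min) / reference * 100.0),
        "peak_to_peak": v_max - v_min,
        "steady_state_error_pct": abs(v_final - reference) / reference * 100.0,
    }


# Hypothetical trace around an assumed 24 V reference.
trace = [24.0, 26.4, 21.6] + [24.0] * 7
print(transient_metrics(trace, 24.0))
```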

A bar chart comparing the integral of absolute error (IAE), the integral of time-weighted absolute error (ITAE), and the integral of squared error (ISE) is shown in Figure 17. Under SC fault conditions, the RL-based control approach works better than the PI, resulting in reduced IAE, ITAE, and ISE values. This demonstrates how well it minimizes deviations from the target load voltage compared with the PI.
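The three error criteria are integrals of the voltage error over time and can be approximated from sampled data with the trapezoidal rule. The following is a minimal sketch under that assumption; the function name and the sample values are illustrative, and the integration method used for the reported results is not stated.

```python
def error_integrals(times, voltages, reference: float):
    """Return (IAE, ITAE, ISE) of a sampled trace using trapezoidal integration.

    IAE  = integral of |e(t)| dt
    ITAE = integral of t * |e(t)| dt
    ISE  = integral of e(t)**2 dt
    """
    if len(times) != len(voltages) or len(times) < 2:
        raise ValueError("need at least two samples of equal length")
    iae = itae = ise = 0.0
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        e0 = abs(voltages[k - 1] - reference)
        e1 = abs(voltages[k] - reference)
        iae += 0.5 * (e0 + e1) * dt
        itae += 0.5 * (times[k - 1] * e0 + times[k] * e1) * dt
        ise += 0.5 * (e0 ** 2 + e1 ** 2) * dt
    return iae, itae, ise


# Hypothetical trace with a constant 1 V error for 2 s.
print(error_integrals([0.0, 1.0, 2.0], [25.0, 25.0, 25.0], 24.0))
```

ITAE weights late errors more heavily than early ones, so it penalizes slow recovery after a fault more than IAE or ISE does.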

5.2. Converter Failure

The failure of a converter has similarly negative consequences in a DC MG, but with its own subtleties. The immediate result is that the power supply to the associated loads is disrupted. Voltage stability is especially important in DC MGs, and when a converter fails, the system as a whole may experience significant voltage variations [29]. Both the RL and the PI were tested under the failure of one of the converters. This failure occurred at $t = 1$ and was rectified at $t = 1.1$, as seen in Figure 18. Table 2 shows that the RL controller performs better in terms of both overshoot and undershoot; however, because of the severity of the scenario, the improvements are not significant. Still, the RL performs better than the PI.

Figure 19 compares the IAE, ITAE, and ISE error criteria in a bar chart. The RL yields lower values for all three metrics during converter failure, which highlights its capacity to keep the load voltage closer to the desired level than the PI does.
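The IAE, ISE, and ITAE criteria can be evaluated numerically from a sampled voltage trace. The sketch below is a minimal illustration using the trapezoidal rule, assuming a 24 V reference (the load bus voltage in Table 1). The function names and the synthetic signal are hypothetical, and the normalization the paper uses to report these metrics in percent is not reproduced.

```python
import numpy as np


def _trapezoid(y, t):
    """Trapezoidal integration of samples y over time stamps t."""
    y = np.asarray(y, dtype=float)
    t = np.asarray(t, dtype=float)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))


def error_indices(t, v, v_ref=24.0):
    """Return IAE, ISE and ITAE of the error e(t) = v_ref - v(t)."""
    t = np.asarray(t, dtype=float)
    e = v_ref - np.asarray(v, dtype=float)
    return {
        "IAE": _trapezoid(np.abs(e), t),       # integral of |e|
        "ISE": _trapezoid(e ** 2, t),          # integral of e^2
        "ITAE": _trapezoid(t * np.abs(e), t),  # integral of t * |e|
    }


# Synthetic check: a constant 1 V error held for 2 s.
t = np.linspace(0.0, 2.0, 201)
v = np.full_like(t, 23.0)
indices = error_indices(t, v)
print(indices)
```

For this constant-error signal all three indices evaluate to 2, which gives a quick sanity check before applying the function to simulated or measured traces.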

5.3. Load Variation

One of the important duties of the controllers in a DC MG is to manage the connection and disconnection of loads. During these transitions, the controller must efficiently restore and maintain voltage stability to guarantee the overall stability and reliability of the MG. To assess this situation, a significant load was added to the system at t = 1, as shown in Figure 20. With a far smaller undershoot (0.82% versus 10.22%) and a lower steady-state error (1.79% versus 4.12%; see Table 2), the RL performs noticeably better than the PI.

Figure 21 presents a bar chart that contrasts the error metrics of the two controllers. It demonstrates that the ISE of the RL is roughly 3% lower than that of the PI, and that its IAE and ITAE are nearly 2% lower.

5.4. Open Circuit of the Load

When parts are disconnected for routine maintenance or equipment replacement, a brief open circuit may occur, possibly without prior notice to the maintenance crew and the engineers, so the controller must be able to restore the voltage to its nominal value. An open circuit is any break or disconnection in the electrical path that stops current from flowing through the circuit and causes the connected load to lose power. To replicate this, a breaker disconnects the load at t = 1 and reconnects it at t = 1.1. The controller is responsible for restoring the voltage during this scenario. As depicted in Figure 22, the RL shows almost 2% less undershoot and overshoot and a much smaller peak-to-peak value than the PI; although the steady-state error of the RL is slightly higher, its overall performance is better.

According to Figure 23, the suggested controller has IAE and ITAE control metrics that are over 2% and 4% lower, respectively, than the PI.

A thorough comparison of the two controllers is given in Table 2. The acronyms used in the table are as follows: CF for converter failure; LV for load variation; OCL for the open circuit of the load; and SCL for short circuit across the load.
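The margins quoted in the text can be read off Table 2 directly. As a worked check, the sketch below recomputes the PI-minus-RL difference for every scenario and metric, using only the values printed in Table 2; the data layout and function name are illustrative choices.

```python
METRICS = ["Steady-State Error", "Overshoot", "Undershoot",
           "Peak-to-Peak", "IAE", "ISE", "ITAE"]

# Table 2 values in %, listed in the order of METRICS.
RL = {
    "SCL": [0.07, 0.63, 1.07, 40.62, 0.55, 0.04, 0.59],
    "OCL": [0.078, 0.73, 2.83, 85.5, 3.35, 0.91, 3.56],
    "CF":  [0.06, 2.90, 1.91, 115.54, 4.01, 1.44, 4.31],
    "LV":  [1.79, 0.22, 0.82, 25.03, 2.06, 0.07, 3.03],
}
PI = {
    "SCL": [0.03, 6.82, 12.28, 458.3, 11.31, 12.14, 12.57],
    "OCL": [0.02, 1.51, 5.33, 164.12, 8.31, 5.16, 9.25],
    "CF":  [0.0, 3.48, 2.44, 142.16, 6.15, 2.75, 6.74],
    "LV":  [4.12, 0.34, 10.22, 253.69, 4.45, 3.70, 5.37],
}


def pi_minus_rl(scenario):
    """Difference PI - RL per metric; positive means the RL value is lower."""
    return {m: round(p - r, 3)
            for m, r, p in zip(METRICS, RL[scenario], PI[scenario])}


differences = {s: pi_minus_rl(s) for s in RL}
rl_wins = sum(d > 0 for s in differences for d in differences[s].values())

for scenario, diff in differences.items():
    print(scenario, diff)
print(f"RL is lower in {rl_wins} of {4 * len(METRICS)} entries")
```

With these numbers the RL value is lower in 25 of the 28 entries; the three exceptions are the steady-state errors for SCL, OCL, and CF, which matches the remarks in the text about the slightly higher steady-state error of the RL.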

6. What Is Next

Distributed control and load sharing have the potential to greatly improve system resilience and operational efficiency in islanded DC microgrids, and they will be investigated next, including during connection to and disconnection from the main grid. Distributed control enables decentralized decision-making: each power source, including the photovoltaic, battery, and piezoelectric modules, can function independently, responding to local conditions and reducing communication latency. By distributing the load evenly among the different sources, load sharing avoids overloading and makes the best use of the available energy resources. This method increases fault tolerance, optimizes energy flow, and strengthens system stability in situations such as unexpected load fluctuations or power supply failures.
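As one illustration of what load sharing could look like, the sketch below splits a load current among the sources in proportion to their current limits, so that every source carries the same fraction of its own rating. This is a hypothetical allocation rule chosen for illustration, not the authors' planned method. The limits reuse the Table 1 figures (3 A battery current, 3 A PV short-circuit current, 0.8 A per piezoelectric module) as stand-ins for ratings, and the function name is an assumption.

```python
def share_load(load_current, limits):
    """Split load_current (A) among sources in proportion to their limits (A).

    Hypothetical rule: each source supplies the same fraction of its own
    limit, so no source is overloaded as long as the total load does not
    exceed the combined limit.
    """
    if load_current < 0:
        raise ValueError("load current must be non-negative")
    total_limit = sum(limits.values())
    if load_current > total_limit:
        raise ValueError("load exceeds the combined capacity of the sources")
    return {name: load_current * limit / total_limit
            for name, limit in limits.items()}


# Stand-in ratings taken from Table 1 (assumed to act as current limits).
limits = {"battery": 3.0, "pv": 3.0, "piezo": 0.8}
shares = share_load(3.4, limits)
print(shares)
```

With a 3.4 A load, half of the combined 6.8 A limit, each source supplies half of its own limit: 1.5 A from the battery, 1.5 A from the PV, and 0.4 A from the piezoelectric module.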

7. Conclusions and Future Directions

This study introduces a resilient control scheme for an islanded DC microgrid (DC MG) integrating solar, battery storage, and piezoelectric harvesters. The MG serves as an energy hub to supply electricity to lighting systems in the transportation sector such as roads. In this study, the piezoelectric harvesters were modeled using experimental data from a traffic simulator. The proposed RL method was tested under four severe and unexpected failure scenarios: a short circuit at the load side, a sudden and severe change in load, an open circuit, and converter failure. The performance of the control scheme was compared with a benchmark controller (i.e., PI control scheme). The results show the effectiveness of the proposed controller in improving the resilience of the energy hub under test. As a future direction, it is essential to further assess the economics, reliability, and durability of piezoelectric modules to enhance their viability in practical applications.

Author Contributions

Conceptualization, F.F. and M.J.K.; methodology, F.F.; software, F.F. and M.J.K.; validation, K.S., M.A.A. and M.S.; formal analysis, K.S., M.A.A. and M.S.; investigation, F.F., K.S., M.A.A. and M.S.; resources, F.F., M.J.K., V.K.G. and T.R.; data curation, K.S., M.A.A. and M.S.; writing—original draft preparation, K.S., M.A.A. and M.S.; writing—review and editing, F.F., M.J.K., V.K.G. and T.R.; visualization, K.S. and M.S.; supervision, F.F., M.J.K., V.K.G. and T.R.; project administration, F.F., M.J.K., V.K.G. and T.R.; funding acquisition, F.F. and M.J.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially funded by Louisiana Department of Transportation and Development, under grant# 24-5TIRE SIO Number: 1000500.

Data Availability Statement

The data supporting the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Saeed, M.H.; Fangzong, W.; Kalwar, B.A.; Iqbal, S. A review on microgrids’ challenges & perspectives. IEEE Access 2021, 9, 166502–166517.
2. Strielkowski, W.; Civín, L.; Tarkhanova, E.; Tvaronavičienė, M.; Petrenko, Y. Renewable energy in the sustainable development of electrical power sector: A review. Energies 2021, 14, 8240.
3. Abdelghany, M.B.; Al-Durra, A.; Gao, F. A coordinated optimal operation of a grid-connected wind-solar microgrid incorporating hybrid energy storage management systems. IEEE Trans. Sustain. Energy 2023, 15, 39–51.
4. Edla, M.; Lim, Y.Y.; Mikio, D.; Padilla, R.V. A Single-Stage Rectifier-Less Boost Converter Circuit for Piezoelectric Energy Harvesting Systems. IEEE Trans. Energy Convers. 2022, 37, 505–514.
5. Lefeuvre, E.; Audigier, D.; Richard, C.; Guyomar, D. Buck-Boost Converter for Sensorless Power Optimization of Piezoelectric Energy Harvester. IEEE Trans. Power Electron. 2007, 22, 2018–2025.
6. Bairagi, S.; ul Islam, S.; Shahadat, M.; Mulvihill, D.M.; Ali, W. Mechanical energy harvesting and self-powered electronic applications of textile-based piezoelectric nanogenerators: A systematic review. Nano Energy 2023, 111, 108414.
7. Yingyong, P.; Thainiramit, P.; Jayasvasti, S.; Thanach-Issarasak, N.; Isarakorn, D. Evaluation of harvesting energy from pedestrians using piezoelectric floor tile energy harvester. Sens. Actuators A Phys. 2021, 331, 113035.
8. Sun, J.-Q.; Xu, T.-B.; Yazdani, A. Ultra-High Power Density Roadway Piezoelectric Energy Harvesting System. Ph.D. Thesis, University of California, Merced, CA, USA, 2023; pp. 1–42.
9. Long, S.X.; Khoo, S.Y.; Ong, Z.C.; Soong, M.F. Finite element analysis of a dual-layer substrate sandwiched bridge piezoelectric transducer for harvesting energy from asphalt pavement. In Proceedings of the 2019 IEEE International Conference on Sensors and Nanotechnology, Penang, Malaysia, 24–25 July 2019.
10. Rezaei-Hosseinabadi, N.; Amoorezaei, A.; Tabesh, A.; Khajehoddin, S.A.; Dehghani, R.; Moez, K. A Voltage-Feedback-Based Maximum Power Point Tracking Technique for Piezoelectric Energy Harvesting Interface Circuits. IEEE Internet Things J. 2024, 11, 20433–20442.
11. Feng, Y.; Liang, M.; Li, Y. Adaptive Controller with Anti-Windup Compensator for Piezoelectric Micro Actuating Systems. IEEE Trans. Nanotechnol. 2024, 23, 45–54.
12. Forrester, J.; Davidson, J.N.; Foster, M.P. Inductorless Step-Up Piezoelectric Resonator (SUPR) Converter: A Describing Function Analysis. IEEE Trans. Power Electron. 2023, 38, 12874–12885.
13. Li, Z.; Wang, J.; Law, M.K.; Du, S.; Liang, J.; Cheng, X.; Han, J.; Zeng, X.; Chen, Z. Piezoelectric Energy Harvesting Interface Using Self-Bias-Flip Rectifier and Switched-PEH DC–DC for MPPT. IEEE J. Solid-State Circuits 2024, 59, 2248–2259.
14. Mahmood, H.; Michaelson, D.; Jiang, J. A Power Management Strategy for PV/Battery Hybrid Systems in Islanded Microgrids. IEEE J. Emerg. Sel. Top. Power Electron. 2014, 2, 870–882.
15. Yang Zhao, K.W.; Guan, M. An adaptive boost converter for low voltage piezoelectric energy harvesting. Ferroelectrics 2016, 502, 107–118.
16. Arunkumari, T.; Indragandhi, V. An overview of high voltage conversion ratio DC-DC converter configurations used in DC micro-grid architectures. Renew. Sustain. Energy Rev. 2017, 77, 670–687.
17. Cristri, A.; Iskandar, R. Analysis and Design of Dynamic Buck Converter with Change in Value of Load Impedance. Procedia Eng. 2017, 170, 398–403.
18. Galkin, I.A.; Saltanovs, R.; Bubovich, A.; Blinov, A.; Peftitsis, D. Considerations on Combining Unfolding Inverters with Partial Power Regulators in Battery–Grid Interface Converters. Energies 2024, 17, 893.
19. Kumar, R.R.; Bharatiraja, C.; Udhayakumar, K.; Devakirubakaran, S.; Sekar, K.S.; Mihet-Popa, L. Advances in Batteries, Battery Modeling, Battery Management System, Battery Thermal Management, SOC, SOH, and Charge/Discharge Characteristics in EV Applications. IEEE Access 2023, 11, 105761–105809.
20. Mahazabeen, M.; Abianeh, A.J.; Ebrahimi, S.; Daoud, H.; Ferdowsi, F. Enhancing EV charger resilience with reinforcement learning aided control. e-Prime Adv. Electr. Eng. Electron. Energy 2023, 5, 100276.
21. Seyedi, M.; Sheida, K.; Siner, S.; Ferdowsi, F. Enhanced Resilience in Battery Charging through Co-Simulation with Reinforcement Learning. Available online: https://www.techrxiv.org/doi/pdf/10.36227/techrxiv.170846720.02245839 (accessed on 30 September 2024).
22. Sheida, K.; Seyedi, M.; Ferdowsi, F. Adaptive Voltage and Frequency Regulation for Secondary Control via Reinforcement Learning for Islanded Microgrids. In Proceedings of the 2024 IEEE Texas Power and Energy Conference (TPEC), College Station, TX, USA, 12–13 February 2024; pp. 1–6.
23. Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction; MIT Press: Cambridge, MA, USA, 2018.
24. Zhao, S.; Blaabjerg, F.; Wang, H. An overview of artificial intelligence applications for power electronics. IEEE Trans. Power Electron. 2020, 36, 4633–4658.
25. Marahatta, A.; Rajbhandari, Y.; Shrestha, A.; Phuyal, S.; Thapa, A.; Korba, P. Model predictive control of DC/DC boost converter with reinforcement learning. Heliyon 2022, 8, e11416.
26. Saeidinia, Y.; Arabshahi, M.; Aminirad, M.; Shafie-khah, M. Enhancing DC microgrid performance through machine learning-optimized droop control. IET Gener. Transm. Distrib. 2024, 18, 1919–1934.
27. Yadav, N.; Tummuru, N.R. Short-Circuit Fault Detection and Isolation Using Filter Capacitor Current Signature in Low-Voltage DC Microgrid Applications. IEEE Trans. Ind. Electron. 2022, 69, 8491–8500.
28. Tarzamni, H.; Esmaeelnia, F.P.; Tahami, F.; Fotuhi-Firuzabad, M.; Dehghanian, P.; Lehtonen, M.; Blaabjerg, F. Reliability Assessment of Conventional Isolated PWM DC-DC Converters. IEEE Access 2021, 9, 46191–46200.
29. Zhou, S.; Qian, Y.; Wan, Y.; Lin, Z.; Shamash, Y.A.; Premakumar, A.V.P.; Davoudi, A. On the Resilience Analysis of DC Microgrids with Power Buffer Control. IEEE Trans. Circuits Syst. I Regul. Pap. 2024, 71, 1–14.

Figure 1. Different approaches to piezoelectric energy harvesting.

Figure 2. 810 Material Testing System with concrete sample.

Figure 3. Load and voltage profiles for the concrete sample.

Figure 4. Load and voltage profiles for the asphalt sample.

Figure 5. Asphalt sample.

Figure 6. Overall schematic of the islanded DC microgrid.

Figure 7. Piezoelectric sensor.

Figure 8. Two-stage circuit for piezoelectric energy harvesting.

Figure 9. Boost converter circuit.

Figure 10. Output voltage vs. duty cycle with various load amounts for boost converter.

Figure 11. Schematic of a buck converter.

Figure 12. Output voltage vs. duty cycle with various load amounts for buck converter.

Figure 13. Schematic of a bidirectional buck–boost converter with battery source.

Figure 14. Output voltage vs. duty cycle with various load amounts for bi-directional buck–boost converter.

Figure 15. Reinforcement learning controller in the proposed DC microgrid.

Figure 16. Voltage response with short circuit across load.

Figure 17. Control metric comparison for the short circuit.

Figure 18. Voltage response with converter failure.

Figure 19. Control metric comparison for converter failure.

Figure 20. Voltage response with sudden changes in the load.

Figure 21. Control metric comparison for load variation.

Figure 22. Voltage response with open circuit across the load.

Figure 23. Control metric comparison for open circuit across the load.

Table 1. Specification of the islanded microgrid.

Sources	Parameters	Values
Battery	Battery Voltage	12 V
	Battery Current	3 A
	Converter Inductor	4 mH
	Converter Capacitor	400 µF
	Internal Inductor Resistance	0.1 Ω
	Load Bus Voltage	24 V
	SOC	90%
Solar PV	Open Circuit Voltage	48 V
	Short-circuit Current	3 A
	Irradiance	1000 W/m²
	Temperature	25 °C
	Converter Inductor	4.5 mH
	Converter Capacitor	400 µF
	Internal Inductor Resistance	0.1 Ω
	Load Bus Voltage	24 V
Piezoelectric	Voltage Range	0–13.5 V
	Max Current/Module	0.8 A
	Converter Inductor	4 mH
	Internal Inductor Resistance	0.1 Ω
	Converter Capacitor	400 µF
	Load Bus Voltage	24 V

Table 2. Controller behaviors under various fault scenarios.

Scenario	SCL (%)		OCL (%)		CF (%)		LV (%)	
Metric	RL	PI	RL	PI	RL	PI	RL	PI
Steady-State Error	0.07	0.03	0.078	0.02	0.06	0	1.79	4.12
Overshoot	0.63	6.82	0.73	1.51	2.90	3.48	0.22	0.34
Undershoot	1.07	12.28	2.83	5.33	1.91	2.44	0.82	10.22
Peak-to-Peak	40.62	458.3	85.5	164.12	115.54	142.16	25.03	253.69
IAE	0.55	11.31	3.35	8.31	4.01	6.15	2.06	4.45
ISE	0.04	12.14	0.91	5.16	1.44	2.75	0.07	3.70
ITAE	0.59	12.57	3.56	9.25	4.31	6.74	3.03	5.37

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
