1. Introduction
As science and technology have advanced, transportation has undergone a significant transformation. Initially, it relied on human or animal power, as with bicycles and horse-drawn carriages. Later, it shifted to mechanical power, as with automobiles and trains. Now, we are witnessing the emergence of electrified transportation, including electric cars and magnetic levitation trains, and we are entering the era of intelligent transportation, which includes self-driving cars and drones. However, transportation systems have become increasingly complex and uncertain. The complexity arises from the large number of people, transportation facilities, and environmental elements involved [1]. The uncertainty is mainly attributed to manufacturing and measurement errors, wear and aging, and state uncertainty. For example, variations in vehicle suspension and tire parameters can affect the vehicle’s dynamics and stability [2]. State uncertainty arises when the state of a system cannot be accurately predicted and controlled because of changes in the system’s internal and external environments. For example, sensors and actuators within the vehicle may malfunction or return distorted signals, and changes in road and weather conditions may affect the state of the vehicle [3]. Other contributing factors include the nonlinear properties of vehicle dynamics and the multi-body dynamics that result from complex system interactions.
Where dependability and safety are paramount, particularly in complex transportation systems such as airplanes, high-speed trains, and subways, any error in system control may have severe consequences and endanger lives and property. In these systems, the control system is critical: it keeps the controlled object at the desired state by regulating and adjusting key variables. However, every component of the control system is subject to failure. The types of failures that may occur include failures in the controlled object, instrumentation failures (which may involve sensor, actuator, and signal conversion interface failures), and computer and software failures (including hardware, troubleshooting programs, and control algorithm program failures). According to statistics, 80% of control system failures are attributed to sensor or actuator failures [4], which makes them the primary causes of control system failure.
Moreover, electronic components and mechanical parts are the fundamental units of a control system, and their reliability determines the reliability of the entire control system. However, using only high-reliability components can significantly increase the cost of the control system. Therefore, an objective of engineering design is to build a high-reliability control system from lower-reliability components. This requires a comprehensive understanding of the operating mechanisms of the basic units and their interactions [5]. Fault-tolerant control techniques have been developed to achieve this.
In general, fault-tolerant control technology combines redundant controllers, actuators, and other components with appropriate fault detection and diagnosis mechanisms so that faults can be detected and isolated quickly and the system can keep operating normally [6]. For example, fault-tolerant control in an aircraft control system is intended to keep the system operating and the aircraft flying safely even if a sensor or actuator fails. Fault-tolerant approaches are typically divided into hardware-based redundancy and software-based fault-tolerant control.
The principle of the hardware redundancy method is to provide a backup for each component within the control system, so that when a component fails, its backup is automatically activated and the system is reorganized so that normal operation is not affected by the failure [7]. In general, hardware redundancy gives a better fault-tolerant effect, but excessive redundancy increases the system cost. Therefore, a balance must be struck between fault-tolerant effectiveness and system cost.
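As a minimal illustration of hardware redundancy, the sketch below votes among redundant sensor readings by taking their median and flags readings that deviate from it. The function name `vote`, the tolerance, and the gap values are assumptions made for illustration; they are not taken from any of the reviewed systems.

```python
from statistics import median


def vote(readings, tolerance):
    """Return (voted_value, suspect_indices) for redundant sensor readings.

    The voted value is the median, so with three or more sensors a single
    outlier cannot push the result outside the range of the healthy readings.
    Readings farther than `tolerance` from the median are flagged as suspect.
    """
    if len(readings) < 3:
        raise ValueError("median voting needs at least three readings")
    voted = median(readings)
    suspects = [i for i, r in enumerate(readings) if abs(r - voted) > tolerance]
    return voted, suspects


# Hypothetical gap readings in millimetres; the third sensor is stuck at zero.
value, suspects = vote([8.02, 7.98, 0.0], tolerance=0.5)
```

With these illustrative numbers, the voted value stays close to the two consistent readings and only the stuck sensor is flagged, which is the behavior the backup components are meant to provide.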
Software-based fault-tolerant control can be classified in several ways: into linear and nonlinear fault-tolerant control based on the type of system [8]; into actuator and sensor fault-tolerant control based on where the fault occurs [9]; and into active and passive fault-tolerant control based on the control method [10]. In this paper, we distinguish active from passive fault-tolerant control based on whether the fault-tolerant system relies on a fault detection and diagnosis system and whether its control system can be restructured in terms of structure or parameters.
Active fault-tolerant control redesigns the characteristics of a control system after a fault has occurred in order to stabilize the entire system. The performance of the new control system may be inferior to that of the original system. Most active fault-tolerant control methods require a fault detection and diagnosis subsystem; some do not, but they require prior knowledge of the faults instead [11]. Active fault-tolerant control methods are classified into four main categories: signal reconfiguration, fault compensation, gain scheduling, and online automatic controller design. In gain scheduling methods, a pre-computed control law is selected based on the fault situation, which is determined by the fault detection and diagnosis subsystem. The online automatic controller design approach, a reconfigurable control technique, constructs a new controller and computes its parameters online. The general strategy for active fault-tolerant control is illustrated in Figure 1.
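The gain scheduling idea, in which a pre-computed control law is selected according to the diagnosed fault, can be sketched as a lookup table. The fault mode names, the proportional-derivative (PD) structure, and all gain values below are hypothetical, as is the choice to fall back to the healthy gains for an unrecognized mode.

```python
# Pre-computed PD gains (kp, kd) per diagnosed fault mode. All values are
# hypothetical placeholders, not gains from any reviewed controller.
GAIN_TABLE = {
    "healthy": (4000.0, 60.0),
    "actuator_50pct": (8000.0, 120.0),
    "gap_sensor_lost": (3000.0, 90.0),
}


def select_gains(fault_mode):
    """Return the pre-computed (kp, kd) for the mode reported by diagnosis.

    Unrecognized modes fall back to the healthy gains (an assumption of this
    sketch; a real design would need an explicit policy for unknown faults).
    """
    return GAIN_TABLE.get(fault_mode, GAIN_TABLE["healthy"])


def control(fault_mode, gap_error, gap_rate):
    """PD control law evaluated with the gains scheduled for `fault_mode`."""
    kp, kd = select_gains(fault_mode)
    return kp * gap_error + kd * gap_rate
```

The controller structure never changes here; only the gains are swapped, which is why the quality of the result depends on how accurately and quickly the fault mode is diagnosed.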
Passive fault-tolerant control is analogous to robust control: it constructs a feedback system that is insensitive to faults. The approach aims to maintain stability and acceptable performance both under normal operating conditions and after failures of actuators, sensors, or other components, by employing a controller with a fixed structure whose design takes into account the parameter values in both normal and fault situations. Its key feature is that the same control strategy is used before and after a fault occurs, without any adjustment [12]. The strategy for passive fault-tolerant control is shown in Figure 2.
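To make the passive idea concrete, the sketch below checks whether one fixed PD gain pair stabilizes a simplified model for every anticipated loss of actuator effectiveness. It assumes a linearized single-degree-of-freedom model x'' = a·x + b·η·u with a > 0 (open-loop unstable), the control law u = -(kp·x + kd·x'), and an effectiveness factor η between 0 and 1; the model, the symbols, and the numbers are illustrative assumptions rather than a model taken from the reviewed literature.

```python
def is_stable(a, b, eta, kp, kd):
    """Stability of x'' + b*eta*kd*x' + (b*eta*kp - a)*x = 0.

    A second-order characteristic polynomial with unit leading coefficient
    is stable exactly when both remaining coefficients are positive.
    """
    return b * eta * kd > 0 and b * eta * kp - a > 0


def tolerates_all(a, b, kp, kd, eta_values):
    """True if the single fixed gain pair is stabilizing for every eta."""
    return all(is_stable(a, b, eta, kp, kd) for eta in eta_values)


A, B = 2000.0, 10.0          # hypothetical plant parameters
ETAS = [1.0, 0.8, 0.6, 0.5]  # anticipated actuator effectiveness levels

# Gains that only cover the healthy case versus gains designed for all cases.
nominal_only = tolerates_all(A, B, kp=250.0, kd=5.0, eta_values=ETAS)
passive = tolerates_all(A, B, kp=500.0, kd=5.0, eta_values=ETAS)
```

In this example the smaller gain stabilizes only the healthy plant, while the larger one covers every listed fault level without any switching. The larger gain is also applied when no fault is present, which illustrates why the passive approach is conservative.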
The benefit of a passive fault-tolerant control strategy is that the controller configuration and parameters are typically simple and have a fixed form. However, this approach is conservative, and the performance of the fault-tolerant control system may not be optimal. Additionally, if unforeseen faults occur, the system’s performance and stability cannot be guaranteed [13]. Active fault-tolerant control addresses these limitations: it handles faults as they occur and provides stronger adaptive fault tolerance than passive fault-tolerant control. However, active fault-tolerant control systems require a more complex design, as they need a robust baseline controller to maintain stability while the control law is reconfigured, and a robust fault detection unit to reduce false alarms and shorten the fault detection time [14].
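The trade-off between false alarms and detection time can be illustrated with a residual-based detector that declares a fault only after the residual has exceeded a threshold for several consecutive samples. This is a generic sketch; the function name, the threshold, the persistence count, and the sample data are assumptions.

```python
def detect_fault(measured, predicted, threshold, persistence):
    """Return the sample index at which a fault is declared, or None.

    A fault is declared once |measured - predicted| has exceeded `threshold`
    for `persistence` consecutive samples. A larger `persistence` rejects
    isolated spikes (fewer false alarms) but delays detection.
    """
    run = 0
    for k, (y, y_hat) in enumerate(zip(measured, predicted)):
        if abs(y - y_hat) > threshold:
            run += 1
            if run >= persistence:
                return k
        else:
            run = 0
    return None


# Hypothetical gap measurements: one isolated spike, then a sustained offset.
measured = [8.0, 8.1, 9.5, 8.0, 9.6, 9.7, 9.8]
predicted = [8.0] * len(measured)

hasty = detect_fault(measured, predicted, threshold=1.0, persistence=1)
cautious = detect_fault(measured, predicted, threshold=1.0, persistence=3)
```

With a persistence of one sample, the detector fires on the isolated spike; with three samples, it ignores the spike and fires only on the sustained offset, at the cost of a later alarm.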
Fault-tolerant control methods rooted in these concepts are used extensively in transportation systems. In the aerospace sector, fault-tolerant control techniques have proven to be highly effective [15,16,17,18,19]. Given the exceptional safety and reliability demands placed on aircraft, particularly those with complex systems, fault-tolerant control is an essential tool for ensuring aircraft performance and safety and for preventing accidents. After several decades of development, fault-tolerant control has yielded remarkable outcomes in aircraft control, and fault-tolerant control design techniques have been proposed for various aircraft models and fault types, such as rudder faults, sensor faults, and process faults. The strategy for fault-tolerant control in an aircraft is depicted in Figure 3. In this aircraft system, when the fault detection and diagnosis unit identifies a fault, the control system adjusts the control strategy or reconfigures the system to maintain normal operation.
Fault-tolerant control is also widely used in the automotive sector to improve the reliability and safety of vehicles [20,21,22,23]. Fault-tolerant control systems for automotive actuators fall into two main categories. The first diagnoses faults directly at the actuator components and, once the fault has been localized, performs active fault-tolerant control by reconfiguring the upper-layer algorithms. The second diagnoses actuator faults indirectly at the whole-vehicle level, commonly by identifying key dynamics parameters such as the vehicle speed, the yaw rate, and the sideslip angle of the center of mass. Figure 4 shows a fault-tolerant control method for a vehicle.
In the field of magnetic levitation trains, fault-tolerant control research has also produced extensive results [24,25,26]. Most of this research has focused on the levitation control system, and in particular on fault-tolerant control for the sensors and actuators of magnetic suspension systems. On the engineering side, hardware redundancy approaches are widely used. On the software side, the mainstream approach is an active fault-tolerant control strategy consisting of fault diagnosis and feedback reconfiguration. This method diagnoses sensor faults, identifies critical parameters (e.g., levitation gap, levitation acceleration, and solenoid current), and then reconstructs the missing signals to maintain system stability. Figure 5 shows a fault-tolerant control algorithm structure of a magnetic suspension system.
In addition to the systems mentioned above, fault-tolerant control has a wide range of applications in other transportation systems such as ships and railroads. Several review studies on fault-tolerant control have been published [28,29,30,31]. These reviews introduce fault-tolerant control to some degree, and some provide detailed information on its application to aircraft. However, they do not cover the fault-tolerant control of suspension systems for magnetic levitation trains. The magnetic suspension system consists of several sets of actuators and sensors that act as its “hands” and “eyes”, working together to keep the system functioning properly. For example, in one section of a low-speed maglev train, 10 levitation controllers and 20 sets of levitation sensors are installed [32], and a high-speed maglev train is fitted with 32 levitation controllers and 64 levitation sensors [33]. For such a large and complex suspension control system, safety, reliability, and effectiveness are key. Fault-tolerant control (FTC), a technology developed in the 1980s to improve system reliability, has become a powerful tool for addressing fault tolerance in the suspension control systems of magnetic levitation trains and has attracted increasing academic attention.
In this literature review, our objective is to offer a comprehensive and up-to-date analysis of the maglev field and of recent advances in fault-tolerant control techniques. To ensure the breadth and representativeness of the study, we adopted the systematic literature review (SLR) methodology proposed by Kitchenham [34,35]. First, we searched the following electronic databases: IEEE Xplore, Web of Science, and Google Scholar. The search covered the period from the inception of each database up to January 2024. The search keywords were chosen to capture the core of research on fault-tolerant control methods for magnetic levitation systems and include “fault-tolerant control systems”, “magnetic levitation system”, “active suspension system”, “compensation”, and “real-time optimization”. To improve the accuracy and scope of the search, we combined keywords with the Boolean operators AND, OR, and NOT. For example, we employed the following query: ((Fault-Tolerant Control Systems) AND (Magnetic Levitation System OR Maglev Train)) AND (Real-time Optimization OR Compensation). Furthermore, we applied inclusion and exclusion criteria to ensure that the selected literature was directly related to our research goals and scope. We excluded papers that centered on theoretical discussion without experimental validation, as well as those not directly pertinent to the technological areas of interest. Through this approach, we endeavor to provide readers with a curated body of literature that reflects the latest trends and challenges in applying fault-tolerant control to magnetic levitation systems. This paper aims to provide a systematic and rich set of references for scholars working in this field and to highlight the technological frontiers and key issues that engineers of magnetic levitation transportation should pay attention to. The main contributions of this paper are twofold:
First, this paper focuses on the magnetic levitation train suspension system (MLTS) and considers it together with the vehicle semi-active/active suspension system, which shares a similar structure with the MLTS. The study examines, evaluates, and synthesizes past research on fault-tolerant control, focusing on the routes, theoretical approaches, and technological tools that are common to both systems. The intended audience includes scholars and engineers in the fields of rail transportation, fault-tolerant control, and magnetic levitation.
Second, the analysis examines the features of these two types of engineered systems with respect to fault-tolerant control, covering redundancy, fault detection, fault diagnosis, and fault-tolerant control. This information can guide the selection of fault-tolerant strategies in different failure scenarios and has significant implications for engineering applications. The fault-tolerant control methods for suspension systems discussed in this paper are classified as shown in Figure 6.
The remainder of the paper is structured as follows: in Section 2, the technical characteristics of the suspension control system, including the basic principles, the modeling process, and the control objectives, are presented. Section 3 reviews the literature on fault-tolerant control in vehicle active suspension systems. Section 4 presents a literature review of fault-tolerant control in maglev train suspension systems. Section 5 reveals the characteristics of several techniques. Finally, Section 6 gives a summary and outlook.
5. Discussion
As per the examination of the fault-tolerant methods for magnetic levitation train suspension systems and vehicle suspension systems, the fault-tolerant strategies for both systems are mainly categorized into hardware redundancy, passive fault-tolerant control, and active fault-tolerant control based on their principles. This section offers a summary of the commonly used fault-tolerant control techniques in both systems and the situations in which they are applicable.
Research on hardware redundancy methods primarily focuses on the suspension system of magnetic levitation trains. These methods enhance the fault tolerance of the suspension system by increasing the number of sensors and actuators while maintaining the continuity and stability of the system. Redundant fault-tolerant strategies can be applied to a diverse range of systems and are widely used in suspension systems for magnetic levitation trains. However, too many redundant components increase the hardware cost of the system. The additional components also take up space and add weight, which may be unacceptable in a system with strict volume and weight requirements. Some scholars have therefore proposed optimized redundancy strategies, such as the minimum-number-of-sensors strategy (using three sensors to replace up to five sensors), levitation controller integration (using a single control board to provide the redundant control of two control boards), and multi-objective optimization of sensor selection. These methods aim to reduce the complexity of the control system, provide sensor fault tolerance, and ensure optimal performance for each possible set of sensors prior to a fault condition while also reducing overall cost. Redundant fault tolerance strategies therefore remain a relatively simple, effective, and widely used approach for maglev train systems.
Passive fault-tolerant control methods have been studied extensively in both systems. The passive strategy relies on the redundancy and characteristics inherent in the system: the structure and parameters of the fault-tolerant controller are the same before and after a failure, and the system nevertheless remains stable. The concept is closely aligned with robust control. Its primary advantages are that the controller is relatively straightforward, requires neither a fault diagnosis unit nor information about the fault, and is therefore easy to implement in engineering applications. Because no fault diagnosis or controller retuning is needed, passive fault-tolerant control also responds faster. Researchers have combined this idea with robust control, adaptive control, intelligent control, and other methods to design a wide range of passive fault-tolerant control strategies. However, passive fault-tolerant control is primarily applicable to known fault conditions and may be less effective for unknown ones. As a result, its effectiveness may be limited, and it may not fully utilize the capabilities of the system. Passive fault-tolerant control for magnetic suspension systems often uses robust control, adaptive control, and other nonlinear techniques, whereas passive fault-tolerant control for vehicle suspension systems commonly employs fuzzy control, an intelligent control method. Fuzzy fault-tolerant control integrates fuzzy logic with fault-tolerant control strategies; its benefits include independence from precise models, strong adaptability, strong robustness, and good real-time performance. These features make fuzzy fault-tolerant control a practical and promising approach for a magnetic levitation train’s suspension system, and passive fault-tolerant control of magnetic suspension systems warrants further investigation.
Active fault-tolerant control involves readjusting the controller structure or modifying control parameters following a fault to maintain consistent system performance. This strategy is more adaptable and intelligent, as it detects and diagnoses faults in real time, implementing appropriate measures based on the type and severity of the fault. The accuracy and timeliness of fault diagnosis results are crucial for active fault tolerance, which has been addressed through various detection methods such as signal-based, analytical model-based, and artificial intelligence-based approaches. These methods enable the rapid identification of fault locations, determination of fault types, and guidance for subsequent actions to ensure that the faulty system maintains static and dynamic performance similar to a normal system. Categorized into signal reconfiguration, fault compensation, gain scheduling, and online automatic controller design methods, active fault-tolerant control actively adjusts the controller parameters or structure according to system conditions to sustain stable operation even when faults occur. In the event of a fault, active fault-tolerant control allows the system to maintain a high performance similar to that which would have been achieved without the fault. This type of control is adaptive to the occurrence and magnitude of faults and can handle a wide range of unknown fault conditions. However, it typically relies on a fault diagnosis and isolation module to provide system fault information; thus, its effectiveness depends on the performance of this module. Additionally, the design process for active fault-tolerant control is complex, posing challenges in implementation. Active fault-tolerant control for magnetic suspension systems primarily utilizes methods such as reconfiguration, switching, and online optimization; while for vehicles, it focuses mainly on reconfiguration and compensation methods. 
Reconfiguration mainly refers to signal reconfiguration, which includes the reconfiguration of sensor signals and of actuator control law signals. The basic principle is to reconstruct, from the healthy signals, the state signal that a faulty sensor should report or the control law that should be applied. Active fault-tolerant control based on signal reconfiguration has several features that make it well suited to handling system faults. The approach relies on a fault detection and diagnosis system that can detect and respond to system faults in real time: as soon as a fault is detected, the control system performs signal reconfiguration to preserve the stability and continuity of system performance. The method is flexible and widely applicable. However, the reconfiguration process usually relies on an accurate mathematical model of the system, and in the presence of nonlinear, time-varying, or uncertain factors, model inaccuracies may lead to poor reconfiguration results. Despite this limitation, signal reconfiguration remains an effective approach.
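A very simple form of sensor signal reconfiguration is sketched below: once the gap sensor is declared faulty, the gap is propagated from its last healthy value by integrating a healthy accelerometer signal. This dead-reckoning scheme, the function name `reconstruct_gap`, and the sample values are assumptions chosen for brevity; such a reconstruction drifts over time, and observer-based reconstructions built on a system model are a common alternative.

```python
def reconstruct_gap(gap0, rate0, accelerations, dt):
    """Reconstruct the gap signal after a gap sensor fault.

    Starting from the last healthy gap `gap0` and gap rate `rate0`, integrate
    the healthy accelerometer samples with a semi-implicit Euler step:
    the rate is updated first, then the gap uses the updated rate.
    """
    gap, rate = gap0, rate0
    gaps = []
    for acc in accelerations:
        rate += acc * dt
        gap += rate * dt
        gaps.append(gap)
    return gaps


# Hypothetical values: unit time step and constant unit acceleration.
reconstructed = reconstruct_gap(0.0, 0.0, [1.0, 1.0], dt=1.0)
```

The controller would then use the reconstructed gap in place of the faulty measurement, which is why the accuracy of the underlying model or integration scheme directly limits the quality of the fault-tolerant response.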
Similar to the signal reconfiguration method, the compensation-based active fault-tolerant control method for vehicle systems uses the input and output signals of the system, together with its state quantities, to reconstruct the fault signal and then feeds a corresponding compensation signal directly into the system. In other words, when a fault occurs, the method quickly identifies and calculates the impact of the fault on system performance and then offsets that impact. The approach is highly flexible: it can adopt different compensation strategies and parameters according to the fault type and severity to achieve the best fault tolerance. However, it places high demands on the speed with which the system detects and handles faults, because the compensation mechanism is activated as soon as the fault detection system detects a fault so that the system can quickly adapt and restore performance. Compensation-based methods are usually relatively simple to implement and can be easily integrated into existing control systems, which makes them feasible in practice. As with signal reconfiguration, the fault reconstruction usually relies on an accurate mathematical model of the system. The method is informative for the active fault-tolerant control of magnetic suspension systems.
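The compensation principle can be sketched for an additive actuator fault. The example assumes a scalar discrete-time model x[k+1] = a·x[k] + b·(u[k] + f) that is known exactly and free of noise, where f is the unknown fault; the model, the function names, and the numbers are illustrative assumptions.

```python
def estimate_fault(x_prev, x_now, u_prev, a, b):
    """Estimate the additive actuator fault f from one state transition.

    Solves x_now = a * x_prev + b * (u_prev + f) for f, which is only valid
    when the model parameters a and b are accurate.
    """
    return (x_now - a * x_prev) / b - u_prev


def compensated_input(u_nominal, f_hat):
    """Subtract the estimated fault so the actuator delivers u_nominal."""
    return u_nominal - f_hat


# Hypothetical plant with an actuator bias of 0.25 acting on it.
a, b, f_true = 0.5, 2.0, 0.25
x_prev, u_prev = 1.0, 0.5
x_now = a * x_prev + b * (u_prev + f_true)

f_hat = estimate_fault(x_prev, x_now, u_prev, a, b)
u_cmd = compensated_input(1.0, f_hat)
```

Because the estimate is obtained by inverting the model, any error in `a` or `b` appears directly in the fault estimate, which reflects the dependence on an accurate model noted above.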
The general idea of the switching fault-tolerant control method in magnetic suspension systems is to anticipate the faults that may occur in the system and to realize fault tolerance by reconfiguring the control law when one of them occurs. The core of the method lies in the timely detection and effective isolation of system faults and in designing, in advance, new control laws for certain kinds of faults that take over from the failed parts of the original control system after the fault occurs. This method has significant advantages in ensuring stable system operation and improving control performance. However, it also has limitations: it usually has to be designed and implemented for specific failure modes, so its ability to cope with unknown or complex faults and to recover system performance may be limited. This is because the reconfiguration process is often based on a priori knowledge of the failure modes and may lack adequate handling mechanisms for unknown faults.
Active fault-tolerant control based on online optimization for magnetic suspension systems uses system data collected in real time to dynamically adjust the control strategy through optimization algorithms, adapting to the changes caused by faults. The goal is to find the optimal control strategy so that the system maintains good performance under fault conditions. The method comprises real-time fault detection and diagnosis, dynamic adjustment of the control strategy, and online optimization. Real-time fault detection and diagnosis is the foundation of the method: by monitoring the system’s operating status and data in real time and applying fault diagnosis algorithms, the type and degree of a fault can be identified in a timely manner, which provides the basis for the subsequent optimization and control strategy adjustment. Dynamic adjustment of the control strategy includes changing the control parameters, adjusting the control structure, or introducing new control algorithms; through these adjustments, the impact of faults on system performance can be mitigated or eliminated so that the system continues to operate stably. The online optimization algorithm computes the optimal control parameters or structure from the real-time data and the fault diagnosis results. The method is flexible, since the control strategy can be adjusted to different fault conditions and demands to realize targeted fault-tolerant control, and the optimization can improve the performance and efficiency of the system. However, it requires the fault detection and diagnosis system to detect and handle faults in real time. The method has been applied to some extent in magnetic levitation train suspension systems, and it is expected to play a greater role as the related technology develops. Based on the above analysis, the respective pros and cons of these fault-tolerant methods are outlined in Table 6.
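The online optimization approach discussed in this section can be illustrated with a toy re-tuning step: given an estimate of the remaining actuator effectiveness from fault diagnosis, a feedback gain is chosen from a candidate set by minimizing a predicted quadratic cost. The scalar model x[k+1] = a·x[k] + b·η·u[k] with u = -k·x, the cost weights, the horizon, and the candidate gains are all assumptions of this sketch; practical schemes use richer models and optimizers.

```python
def predicted_cost(k, a, b, eta, x0=1.0, r=0.01, horizon=20):
    """Simulate the closed loop and accumulate the cost x^2 + r * u^2."""
    x, cost = x0, 0.0
    for _ in range(horizon):
        u = -k * x
        cost += x * x + r * u * u
        x = a * x + b * eta * u
    return cost


def retune(a, b, eta, candidates):
    """Pick the candidate gain with the lowest predicted cost for this eta."""
    return min(candidates, key=lambda k: predicted_cost(k, a, b, eta))


CANDIDATES = [0.6, 1.2, 1.8, 2.4]  # hypothetical gain grid

# Healthy actuator (eta = 1.0) versus a diagnosed 50% loss of effectiveness.
k_healthy = retune(1.2, 1.0, 1.0, CANDIDATES)
k_faulty = retune(1.2, 1.0, 0.5, CANDIDATES)
```

In this example the optimizer selects a larger gain after the fault to make up for the weakened actuator. The result is only as good as the effectiveness estimate supplied by fault diagnosis, which matches the real-time requirement noted above.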
6. Conclusions
Fault-tolerant strategies have been a focus of maglev train research. From the perspective of hardware and software, they can be categorized into hardware redundancy and fault-tolerant control methods, and the latter can be further categorized into active and passive fault-tolerant control. By analyzing fault-tolerant control strategies for maglev train suspension systems together with those for vehicle active suspension systems, this review outlines their strengths and weaknesses, as well as the conditions under which they are applicable. For example, active fault-tolerant control based on signal reconfiguration or signal compensation can be used if an exact model of the system is known. Passive fault-tolerant control, including robust and adaptive fault-tolerant control, can be used if an exact model of the system is known and the types of faults likely to occur are known; robust and adaptive passive fault-tolerant control can also be used if an exact model is unknown and uncertainty is high. If the system does not have a precise model and the uncertainty is strong, fuzzy fault-tolerant control can be used. If a large amount of operational data is available for the system in various situations and the system is required to exhibit a certain degree of judgment, active fault-tolerant control based on online optimization can be used. These observations offer guidance for the selection of fault-tolerant strategies in maglev transportation engineering. Nevertheless, fault-tolerant control in suspension systems is not without drawbacks. First, the use of fault-tolerant control inevitably increases system complexity, which affects cost and maintenance difficulty. Second, fault tolerance comes at the expense of system performance; for example, response speed, accuracy, or stability may not be as good as in a system without fault-tolerant control. Hence, engineers are required to make tradeoffs among fault tolerance, cost, and performance in maglev systems.
Future research on maglev train suspension systems should focus on intelligent control. Purely device-based fault detection and fault-tolerant control methods already have very rich results; compared with them, intelligent fault-tolerant control methods can correct the errors and deviations that may arise in the system, which opens a new research field for fault-tolerant control theory and technology. Second, future research should focus on fast fault detection and isolation (FDI) methods: the shorter the delay caused by fault detection and isolation, the more favorable it is for the reconfiguration of the control law. Additionally, in active fault-tolerant control based on signal reconfiguration, research should aim to ensure simultaneously the robustness of the underlying controller, of the fault detection and diagnosis methods, and of the reconstructed control law. Finally, the practicality of a designed fault-tolerant control strategy should be verified on a test bed or a real vehicle, so that the work has both theoretical and practical value.