Article

Investigation of Single-Event Effects for Space Applications: Instrumentation for In-Depth System Monitoring

1 IES, University of Montpellier, CNRS, 34095 Montpellier, France
2 LIRMM, University of Montpellier, CNRS, 34095 Montpellier, France
3 European Space Agency, ESTEC, 2200 AG Noordwijk, The Netherlands
* Author to whom correspondence should be addressed.
Current address: SENAI Innovation Institute in Embedded Systems, Florianópolis, 88054-700, SC, Brazil.
Electronics 2024, 13(10), 1822; https://doi.org/10.3390/electronics13101822
Submission received: 12 April 2024 / Revised: 29 April 2024 / Accepted: 1 May 2024 / Published: 8 May 2024
(This article belongs to the Special Issue New Insights in Radiation-Tolerant Electronics)

Abstract

Ionizing radiation degrades electronic systems. In memory devices, this degradation is often observed as corruption of the stored data and, in some cases, as sudden increases in current consumption during operation. In this work, we propose enhanced experimental instrumentation for in-depth monitoring and analysis of Single-Event Effects (SEEs) in electronic systems. In particular, we focus on Single-Event Latch-up (SEL) in memory devices, for which testing requires current monitoring and control. To illustrate the features and operation of the proposed instrumentation, we present results from a case study of an SRAM memory that has been used on board ESA’s PROBA-V satellite. For this study, we performed experimental campaigns at two irradiation facilities, with protons and heavy ions, demonstrating instrumentation capabilities such as synchronization, high sampling rate, fast response time, and flexibility. Using this instrumentation, we could report the cross sections of the observed SEEs and further investigate their correlation with the observed current behavior. Notably, it allowed us to identify that 95% of Single-Event Functional Interrupts (SEFIs) were triggered during SEL events.

1. Introduction

Ionizing radiation induces a plethora of effects on electronic devices, leading to several types of failures that may compromise the operation of entire systems. For instance, in space applications, systems are exposed to various energetic particles (e.g., protons, heavy ions) received from three main sources: Galactic Cosmic Rays (GCR), Solar Energetic Particles (SEP), and trapped radiation [1,2]. These radiation-induced faults can be transient or permanent depending on the device’s architecture, sensitized nodes, semiconductor materials, and degree of damage, triggered either by single interactions, Single-Event Effects (SEEs), or accumulative degradation, Total Ionizing Dose (TID) and Displacement Damage (DD) [3].
In memory devices, which act as data retention elements, SEEs are usually observed as corruption of the stored data. However, other effects can be observed, such as damaging current increases through parasitic elements of the circuit and functional mismatches during operation. Since memory elements represent a significant percentage of chip area in modern integrated systems, understanding the radiation effects in these structures is key for assessing reliability. In Static Random-Access Memories (SRAMs) built in CMOS technology, a single cell is generally implemented with six transistors. Each cell stores one bit by means of its two stable states (high and low voltage levels of the cell nodes). CMOS-based electronics are commonly susceptible to latch-up due to a parasitic structure (thyristor) that is nominally in a high-impedance state. When an ionizing particle interacts with the thyristor, a latch-up might be triggered, which is referred to as a Single-Event Latch-up (SEL). During this event, the thyristor enters a conducting state, generating a high-current state first in the cell and, shortly after, in the neighboring circuits. If this high-current state is not interrupted, the device may suffer permanent damage due to thermal runaway (Joule effect) [4,5].
To investigate a system’s response to induced faults, high-reliability systems are subjected to thorough characterization under radiation. Testing and applying mitigation techniques are imperative for systems that must operate consistently and dependably in the space environment over the mission lifetime. Hence, test methodologies have evolved over the years to accommodate the growing complexity of the devices and to achieve conclusive results. Experimental tests using particle accelerators have become a standard step towards reliable electronics, in which developers expose their systems and fault mitigation strategies to radiation using representative fault stimuli [6]. Depending on complexity and requirements, these systems are approached differently to keep the development process feasible, ranging from simple pass/fail tests to more robust analyses that investigate the underlying phenomena [7].
Performing the simplified screening of systems is a convenient strategy to reasonably assess the non-critical elements of a system or perform an initial characterization, reducing project resource utilization and development efforts [8,9]. The risk of this method is estimated and accepted in the early stage of development. To achieve more accurate results, it is important to provide comprehensive test stimuli [10]. Moreover, to improve the experimental setup capabilities and test coverage, further tests with dedicated equipment are required, in which precise and configurable measurements are possible [11]. Finally, for a better understanding of the SEE response, enhancing observability is an essential aspect to be addressed, as further discussed in this article.
In this work, we present compact experimental instrumentation for the SEL testing of memories, with extended observability to report other radiation-induced faults and synchronization to correlate these faults with the monitored current behavior. Although it focuses on SELs, this study also investigates Single-Event Upsets (SEUs) and Single-Event Functional Interrupts (SEFIs). We demonstrate the applicability of the instrumentation through a case study on an SRAM memory, but the instrumentation and methodology can be applied to different types of memories and even extended to other complex systems (e.g., processors, systems-on-chip). The studied memory is a Commercial Off-The-Shelf (COTS) 4 Mib SRAM with a 16-bit parallel interface: Samsung’s K6R4016V1D-TI10 [12], from the year 2001 (Rev. D). It was employed on board PROBA-V [13], a satellite mission from the European Space Agency (ESA). We investigate this memory based on its in-flight behavior related to SEL events and through two experimental campaigns with accelerated proton and heavy-ion beams. The proposed instrumentation and experiments are discussed in this article, while the characterization results are the subject of a complementary publication [14]. To that end, we introduce relevant instrumentation enhancements, such as improved test modes, log synchronization, and measurements, to achieve more conclusive results than baseline SEL characterization.
The remainder of the paper is structured as follows. Section 2 introduces related work and key aspects of the testing methodology targeting in-depth analysis. Section 3 presents the proposed experimental setup. Section 4 provides results from the performed irradiation campaigns and examples of analysis exploiting the instrumentation capabilities. Section 5 discusses the capabilities, limitations, and potential extension of the proposed approach. Finally, Section 6 gives conclusions on this work.

2. Related Work

This section describes different approaches and objectives of instrumentation targeting SEL testing in particle accelerators. It does not intend to be an extensive review of the subject but rather provides an insightful discussion of the practices and solutions found in the literature.

2.1. Simplified Screening

SEL testing can be primarily simplified to monitoring transients in the supply power of an operating system by measuring overcurrent events. When such a transient is detected, the supply power is limited and cycled to prevent permanent damage and enable continuous testing with a single device. Although the concept is straightforward, several factors influence the behavior of a system before, during, and after an SEL, and understanding them may be essential for a specific purpose. For example, an SEL can be functionally recoverable or fully destructive depending on the affected structures and its duration; the system may or may not be functional during the overcurrent state; evaluating complex SEL mitigation schemes, rather than only counting event rates, might be the main test objective; or overcurrent events can be subtle and hard to differentiate from dynamic consumption fluctuations.
Thus, a simplified screening of these systems is appropriate for specific purposes, as seen in [8,9]. In these works, the proposed approach is to perform a preliminary selection of components with respect to their SEE (mainly SEL) sensitivity, reducing testing efforts when multiple device options are available. Similarly, in [15], a simplified SEL testing scheme is used since the objective is to investigate differences among various components with particular architectures, technologies, and manufacturers, instead of deeply investigating the particularities of these devices. These simplified approaches may lead to limited conclusions about the system’s behavior and limited knowledge of the underlying phenomena. However, they provide a starting point for later comprehensive and accurate testing in a more effective and direct manner. Thus, they must be consciously applied and analyzed.

2.2. Comprehensive Test Stimuli

Representative testing schemes require realistic stimuli to be applied to the Device Under Test (DUT). In this case, SEL monitoring continues as described above, but more attention is given to achieving accurate and meaningful cross sections under different test conditions, as seen in [16] for SELs and more generally for SEEs in [17]. With proper stimuli and test modes, the target device’s structures are properly sensitized and the entire device is homogeneously stressed over the different runs and modes. Using realistic and diverse test patterns supports later correlation with real behavior due to similar resource utilization. For example, in the case of a memory device, static and dynamic test algorithms should be applied. Static tests consist of writing a fixed pattern in the memory and performing a readback operation after a significant period to check for corrupted bits. In dynamic testing, the memory is constantly written and read following predefined patterns, and the errors are reported when detected. Several classical memory test algorithms can be found in the literature [18] and some were evaluated for radiation testing [10,19]. In summary, test modes are important to provide proper stimuli for exposing different fault mechanisms, resulting in more comprehensive and accurate testing. They also reduce test biases that may be created by simplistic or specific test patterns.

2.3. Dedicated Equipment

In order to perform specific testing and monitor systems in real environments, dedicated equipment is required. In these cases, direct current measurements are preferred and provide better accuracy for SEL detection and classification. The equipment should be accurate, automated, and provide a fast response. In [11], the authors propose dedicated equipment and procedures to perform precise SEL characterization. The proposed setup is robust and designed to be used in challenging testing environments. It is composed of analog front-end components for the measurement and conditioning of the DUT current, power switches for handling the supply line, and a flash-based FPGA for controlling signals and logging. In [20], in-flight data is compared with ground estimations based on experiments in particle accelerators. For in-flight monitoring, different approaches can be used, but dedicated equipment is implemented to be integrated with the system and be resilient to the space environment, as demonstrated in [21].

2.4. This Work

The contribution of this work is to enhance the observation of dedicated SEL monitoring equipment by providing synchronized monitoring of the supply current and data errors in the memory, which are generated and observed with static and dynamic algorithms. It is worth noting that the proposed synchronization occurs in the tightly coupled measuring instrument itself and not externally in the monitoring equipment, as previously referred. It is important to acquire precise and coherent timestamps between the various observation points of experimental instrumentation. Enhanced time reporting provides the elements to correlate observed behaviors, find patterns, and elaborate novel hypotheses. For instance, one can correlate current spikes with certain events observed at the logical level in a circuit.
For that, we propose, implement, and analyze dedicated experimental equipment with those characteristics. Similar to [11], we proposed a monitoring system based on a flash-based FPGA for the controller and similar strategies for acquiring the current measurements. However, the monitoring was also extended to the IO lines of the DUT. Moreover, we included in the SEL reporting the simultaneous information regarding the errors (SEUs and SEFIs) dynamically induced in the memory.

3. Proposed Instrumentation

In this section, we describe the proposed instrumentation designed for in-depth analysis of SELs in the SRAM memories used for this case study.

3.1. Test Setup and Instrumentation

Figure 1 presents the proposed experimental setup. The system is composed of two main boards: a controller board, hosting an FPGA, and a test board, where the SRAM and measuring components were included. The test board was used to provide the electrical interface for the additional devices and facilitate a stable attachment of the controller board. This board includes the hardware for the load switches, current sensors, serial converters, and the DUT (SRAM) itself. We used dedicated digital current sensors, which allowed precise measuring and a convenient interface with the controller. The serial converters were used to provide the connection with a host computer through a robust protocol for receiving data and sending commands. The host computer continuously saves, manages, and displays the acquired experimental data. When required, the experiment operator can interact with the system using the host computer with a set of commands (e.g., start a specific test, stop a run, perform a functional test, change test parameters). Using this scheme, we achieved a compact, robust, and precise experimental setup, enabling tests in constrained spaces (e.g., the vacuum chamber used for heavy-ion irradiation) with enhanced observability and simplifying the test equipment.
Figure 2 details the architecture of the proposed instrumentation. In the controller board, we implemented: a timing management module for synchronizing all internal modules; two serial interfaces for sending the test data to the host computer; an SRAM controller for handling the memory operations; a test engine to execute various test algorithms; and the necessary logic for handling the measurements and load switches. It is worth mentioning that the controller was implemented in a COTS flash-based FPGA, in which the configuration memory is more robust to radiation effects [22], and the design included Triple Modular Redundancy (TMR) in the most critical elements to ensure proper operation during the tests. Hence, the controlling loop used the test algorithms and current monitoring to define the execution throughout the experiment and trigger the SEL protection and reporting.
This setup was proposed to allow flexibility, precise and coherent measurements, and functionally representative test stimuli. Since the experiments focus on SEL characterization, we designed a test board where current monitoring is carried out in the memory supply line and inputs/outputs (IOs). This is achieved by placing the current sensors in the supply lines of both the SRAM itself and the IO banks of the FPGA that interface to the memory. This additional IO monitoring and protection was used since these IOs contribute to the current consumption of the memory (as later shown in Section 3.3) and could be important to properly suppress ongoing SELs. Thus, the power supply of the FPGA IO banks is cut, and its interface signals are put in high impedance during the cut time to ensure proper power cycling.
Because the controller board simultaneously manages these measurements and the test algorithms, the setup allows coherent timestamping of two meaningful parameters during event occurrence, i.e., current variations and detected bit upsets. In particular, this instrumentation is capable of reporting the actual current during an SEL, while the event is ongoing. These features are essential to perform the specific analyses detailed in Section 4.

3.2. Test Execution and Monitoring

The plot in Figure 3 shows the SEL monitoring approach. The power switches feeding the memory are disabled after a hold time following the detected SEL, allowing the memory behavior to be monitored during the initial part of the SEL. SEL detection is triggered when the current feeding the memory exceeds a given threshold value. This threshold is large enough to allow normal memory operation without triggering false SEL detections and, at the same time, small enough to detect actual SELs as early as possible. The hold time is short enough to protect the circuit from unrecoverable damage. The power is enabled again after a cut time to recover the memory functionality and continue the test execution. Hold and cut times are configurable prior to the test run and depend on the experimental target. For this work, we set them to 50 ms and 200 ms, respectively. The current threshold is also configurable and was set at 80 mA for most of the experiment. Using these configurable parameters, we carried out test runs at the end of the experiment with different variations to verify the full effect of the induced SELs. It is worth mentioning that the configurable current threshold allowed additional tests without the need to physically access the setup inside the irradiation room, since it can be updated in real time through the host computer using the available commands.
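The threshold, hold, and cut sequence can be illustrated with a small state machine. This is only a sketch of the behavior described in the text, not the FPGA implementation: the class name `SelMonitor`, its `step` interface, and the sample values are hypothetical, while the 80 mA threshold and the 50 ms and 200 ms times are the values quoted above.

```python
from dataclasses import dataclass, field


@dataclass
class SelMonitor:
    """Hypothetical model of the threshold / hold / cut sequence."""
    threshold_ma: float = 80.0   # SEL detection threshold (value from the text)
    hold_ms: float = 50.0        # supply stays on this long after detection
    cut_ms: float = 200.0        # supply stays off this long before restart
    state: str = "NOMINAL"       # NOMINAL -> HOLD -> CUT -> NOMINAL
    t_mark_ms: float = 0.0       # time at which the current state was entered
    peak_ma: float = 0.0         # maximum current seen during the ongoing SEL
    peaks: list = field(default_factory=list)  # one peak per completed hold

    def step(self, t_ms: float, current_ma: float) -> bool:
        """Consume one current sample; return True while the supply is enabled."""
        if self.state == "NOMINAL" and current_ma > self.threshold_ma:
            # Overcurrent detected: keep the supply on and start the hold time.
            self.state, self.t_mark_ms, self.peak_ma = "HOLD", t_ms, current_ma
        elif self.state == "HOLD":
            # The SEL is observed while ongoing; track its maximum current.
            self.peak_ma = max(self.peak_ma, current_ma)
            if t_ms - self.t_mark_ms >= self.hold_ms:
                self.peaks.append(self.peak_ma)
                self.state, self.t_mark_ms = "CUT", t_ms
        elif self.state == "CUT" and t_ms - self.t_mark_ms >= self.cut_ms:
            # Cut time elapsed: re-enable the supply and resume testing.
            self.state = "NOMINAL"
        return self.state != "CUT"


# Illustrative samples: (time in ms, current in mA); the values are made up.
samples = [(0, 20.0), (1, 150.0), (30, 180.0), (51, 170.0), (100, 0.0), (251, 21.0)]
monitor = SelMonitor()
supply_on = [monitor.step(t, i) for t, i in samples]
```

In this sketch the supply remains enabled during the hold time, so the peak current of the event is still logged before the power cycle.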
Four test modes were used to explore realistic fault models: 1. standby, when no operations are performed, leaving the device in a powered-on state; 2. chip-enabled, similar to standby, but with the memory chip-select asserted (so that the memory control logic is powered); 3. static, in which the memory is written and then, after a sufficient period, read back; and 4. dynamic, in which the memory is constantly written and read using two different algorithms, and errors are reported when detected by read accesses. In particular, for the dynamic mode, we applied two algorithms discussed and tested in [10]: March C-, displayed in (1), and March Dynamic Stress, displayed in (2). In dynamic test mode, the algorithms constantly access the memory with read and write operations in order to emulate realistic applications and detect functional faults. In Equations (1) and (2), the arrow indicates the addressing order (either ‘↑’ increasing or ‘↓’ decreasing), ‘w’ (write) and ‘r’ (read) indicate the operation, and the following Boolean number indicates the data background. The algorithms are composed of elements, each indicated by an arrow followed by the operations in parentheses. In our work, the operations enclosed in parentheses are performed in sequence at each memory address. When the addressing order is ‘↑’, the operations are executed from address 0 to N, and when it is ‘↓’, the operations are executed from address N down to 0, with N being the highest memory address. Thus, e.g., the element ‘↑ (r0, w1)’ goes from the first address up to the last one, applying a read operation (where a solid ‘0’ data background is expected) followed by a write operation (using the solid ‘1’) at each address. A bracket pair ‘{ }’ delimits a complete dynamic test algorithm.
(w0); {(r0, w1); (r1, w0); (r0, w1); (r1, w0); (r0);}    (1)
(w1); {(r1, w0, r0, r0, r0, r0, r0); (r0, w1, r1, r1, r1, r1, r1); (r1, w0, r0, r0, r0, r0, r0); (r0, w1, r1, r1, r1, r1, r1); (r1, w0, r0, r0, r0, r0, r0); (r0, w1, r1, r1, r1, r1, r1);}    (2)
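To make the element notation concrete, the following sketch runs a march algorithm over a software model of the memory. It is an illustration under stated assumptions, not the test engine of this work: the memory is a plain Python list, the addressing orders in `MARCH_C_MINUS` follow the common textbook form of March C- (with the order-independent elements taken as ascending), and `run_march` and `flip_once` are hypothetical names.

```python
UP, DOWN = "up", "down"

# March C- in its common textbook ordering (an assumption of this sketch).
MARCH_C_MINUS = [
    (UP, ("w0",)),
    (UP, ("r0", "w1")),
    (UP, ("r1", "w0")),
    (DOWN, ("r0", "w1")),
    (DOWN, ("r1", "w0")),
    (UP, ("r0",)),
]


def run_march(memory, elements, word_bits=16, after_address=None):
    """Apply each element to every address; return the detected mismatches.

    Each error is (element index, address, expected word, observed word).
    `after_address` is an optional hook, used here only to inject a fault.
    """
    ones = (1 << word_bits) - 1          # solid '1' data background
    errors = []
    last = len(memory) - 1
    for index, (order, operations) in enumerate(elements):
        addresses = range(last + 1) if order == UP else range(last, -1, -1)
        for address in addresses:
            # All operations of the element run in sequence at this address.
            for op in operations:
                background = ones if op[1] == "1" else 0
                if op[0] == "w":
                    memory[address] = background
                elif memory[address] != background:
                    errors.append((index, address, background, memory[address]))
            if after_address is not None:
                after_address(index, address, memory)
    return errors


def flip_once(index, address, memory):
    """Hypothetical fault injector: flip bit 3 of word 5 during element 1."""
    if index == 1 and address == 7:
        memory[5] ^= 1 << 3


clean = run_march([0] * 16, MARCH_C_MINUS)
faulty = run_march([0] * 16, MARCH_C_MINUS, after_address=flip_once)
```

Here the injected upset is reported by the first read of the following element, which mirrors how dynamic mode reports errors only when a read access observes them.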
Figure 4 presents the test flow, which details all the steps to perform the static and dynamic tests. This procedure was repeated for all memory specimens throughout the experiment when feasible. Calibration runs were performed at every beam configuration change due to the high sensitivity of the device to the radiation sources. This was more important for heavy ions since the error rates were very high for certain Linear Energy Transfer (LET) values. As shown in the figure, the test flow includes functional checks to identify the device state before and after starting irradiation and, at the end, after the power cycle. The main test conditions defining the duration of a run are the accumulated fluence and the number of detected SELs. In particular, the runs were performed with a fluence of up to 10⁷ ions/cm² unless a large number of SELs (hundreds) occurred beforehand.

3.3. Measurements and Reporting

The waveforms in Figure 5 show the enhanced capabilities of timing coherence achieved with the proposed instrumentation. With this type of information, we can correlate memory operations to a current level not only on the main power input but also in the IO lines. The figure refers to a full dynamic test execution, in which each part of the algorithm is highlighted by the vertical separators and detailed with the captions at the top. The current is measured and logged in short intervals with a resolution of 25 μA. Using this resolution, it is possible not only to observe the current plateaus but also minor variations during each phase of the device operation (e.g., decoding, precharge).
This system takes from 220 μs to 370 μs to acquire a current sample and switch off the power line, depending on the moment that the new sample is received in the controller from the digital sensor. Every error reported by the test engine records the cycle counter at the error detection, the test identification, and the error data and address. The current logging is concurrently performed and reports the cycle counter and current values. The cycle counter operates with a clock period of 80 ns.
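Since both logs carry the same cycle counter, they can be placed on a common time axis. The sketch below converts cycle counts to seconds using the 80 ns clock period and looks up the most recent current sample for a given error. The record layout (tuples of cycle and current) and the function names are assumptions made for illustration only.

```python
from bisect import bisect_right

CLOCK_PERIOD_S = 80e-9  # cycle counter clock period quoted in the text


def cycles_to_seconds(cycles):
    """Convert a cycle counter value to elapsed time in seconds."""
    return cycles * CLOCK_PERIOD_S


def current_at(error_cycle, current_log):
    """Return the latest logged current at or before `error_cycle`.

    `current_log` is a list of (cycle, current_mA) tuples sorted by cycle.
    Returns None when the error precedes the first current sample.
    """
    cycles = [cycle for cycle, _ in current_log]
    position = bisect_right(cycles, error_cycle)
    return current_log[position - 1][1] if position else None


# Made-up log entries: (cycle counter, current in mA).
current_log = [(0, 21.5), (3000, 22.0), (6000, 145.0)]
during_spike = current_at(6500, current_log)
before_log = current_at(-1, current_log)
```

A single shared counter avoids any clock alignment between instruments, which is what makes this lookup a plain sorted search.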
In general, for memory testing, the SEL current threshold is defined before a test run and is not modified during the execution. However, dynamic threshold changing could be a required feature to support elaborated testing schemes in complex systems. For example, for a system-on-a-chip with multiple power domains and a long power-up sequence, the feature described in Section 3.2 enables real-time changing of the SEL current threshold with predefined values appropriate for each stage. Furthermore, as shown in Figure 5, distinct steps of an algorithm execution may have characteristic current profiles, allowing for the adjustment of the SEL monitoring scheme according to the executed algorithm.

4. Data Analysis

We performed experiments in two facilities with different particles: heavy ions, at the Radiation Effects Facility (RADEF) of the University of Jyväskylä, Finland, and protons, at the Proton Irradiation Facility (PIF) of the Paul Scherrer Institute (PSI), Switzerland. At RADEF, the beam has a 10% homogeneity over the selected irradiation area, and the 16.3 MeV/n heavy-ion cocktail is capable of generating a flux of up to approximately 3 × 10⁵ ions/cm²/s. The ion selection included the following species and LET values (at the silicon surface, in vacuum mode, in MeV·cm²/mg): ¹⁷O⁶⁺, 1.52; ⁴⁰Ar¹⁴⁺, 7.2; ⁵⁷Fe²⁰⁺, 13.3; ⁸³Kr²⁹⁺, 24.5; and ¹²⁶Xe⁴⁴⁺, 48.5. At PSI, we selected the following proton energies (in MeV): 16, 29, 51, 70, 101, 151, and 200. Copper plates were used in the beam path to generate the different energies, and two primary proton energies (100 MeV and 200 MeV) were used to allow proper testing in the lower energy range. Proton fluxes of about 3 × 10⁷ to 8 × 10⁷ p/cm²/s were used, and all tests were performed at ambient room temperature and air pressure. The beam had a 10% homogeneity over the selected irradiation area and beam energies throughout the full experiment.
The experimental setup was reused across the facilities with minor adaptations. For the heavy-ion experiment, eight SRAM devices were evaluated, covering two manufacturing lots, and they required delidding for proper particle penetration. Microscopy of the delidded chips allowed the die markings and some internal structures to be verified. Various irradiation angles and test modes were explored in the heavy-ion campaign. For the proton experiment, four SRAM devices were used: two were reused from the previous experiment to check experimental data consistency, and the remaining two (from the same manufacturing lot) were pristine, to increase the sample size and to spot potential cumulative biases, if significant. Only the most relevant test modes were applied in this campaign, based on the outcomes of the heavy-ion experiments.
As previously stated, the main objective of this work is not to present the characterization results or to extensively report cross sections, which are out of the scope of this article and are the subject of [14]. Instead, we present those outcomes that support the evaluation of the capabilities of the proposed instrumentation and their benefits in comparison to more traditional and simpler strategies. From the experimental results, we could observe many interesting behaviors that were decisive in corroborating or excluding formulated hypotheses and initial assumptions.

4.1. SEE Cross Sections

To present the baseline capabilities of the experimental instrumentation, we include a summary of the cross sections acquired during the heavy-ion campaign, shown in Table 1. The proposed instrumentation allowed the reliable acquisition of SEL [14], SEU, and SEFI cross sections for the DUTs throughout the experiment without the need for manual intervention inside the irradiation room. Moreover, together with the facility’s capabilities, it made it possible to explore different angles, allowing measurements at additional LETs, and to test four DUTs without opening the chamber, optimizing the beam utilization. As previously described in Figure 4, we targeted an accumulated fluence of 1 × 10⁷ ions/cm² or at least a hundred events for each test run. This condition was necessary to optimize the beam utilization due to the high DUT sensitivity at higher LETs, since hundreds of events were reached before the defined fluence. The outcomes suggest that the test mode has no impact on the cross section. Also, it is important to highlight that the SEU cross section values were artificially reduced for higher LETs due to the high number of SELs occurring. If the SEL rate is exceedingly high, the capability to retrieve the radiation-induced SEUs is reduced since the test is restarted frequently, preventing the retrieval of all induced errors.
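As a reminder of how such values are typically obtained, the sketch below computes a cross section as the number of events divided by the fluence, and the effective LET for a tilted device. The cosine corrections are the usual convention for tilted irradiation and are an assumption here, since the text does not state which corrections were applied; the function names and example numbers are illustrative only.

```python
import math


def cross_section(events, fluence_per_cm2, tilt_deg=0.0):
    """Event cross section in cm^2: events divided by the effective fluence.

    For a tilted device, the fluence through the die surface is commonly
    scaled by cos(tilt) (assumed convention).
    """
    effective_fluence = fluence_per_cm2 * math.cos(math.radians(tilt_deg))
    return events / effective_fluence


def effective_let(let_normal, tilt_deg):
    """Effective LET for a tilted beam: LET at normal incidence / cos(tilt)."""
    return let_normal / math.cos(math.radians(tilt_deg))


# Illustrative numbers only: 100 events at the 1e7 ions/cm^2 target fluence.
sigma = cross_section(events=100, fluence_per_cm2=1e7)
# Tilting by 60 degrees roughly doubles the LET seen by the sensitive volume.
let_tilted = effective_let(24.5, tilt_deg=60.0)
```

Under this convention, tilting the device is what gives access to LET values between those of the available ion species.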

4.2. SEL Current and LET Correlation

In Figure 6, we present an analysis intended to correlate the SEL current levels with the heavy-ion LET. For that, we used the reported maximum current during an SEL event and associated it with the LET of the corresponding run. This analysis was carried out to further investigate the various SEL current levels that were recorded during irradiation. Ultimately, we could not observe a clear correlation between these parameters, which supported ruling out LET as the cause of the distinct SEL types. However, this opened other possibilities that could be better investigated through a laser test campaign, in which region-specific targeting and isolation are possible. These current clusters might indicate the presence of multiple power domains or sensitive regions in the device that, when triggered by an SEL, generate distinct faulty current levels. Furthermore, in complex systems (e.g., mixed-signal integrated circuits), the SEL behavior may differ depending on the affected structures in the device architecture due to their heterogeneous elements. For instance, affecting embedded memory can lead to distinct behavior in comparison to the arithmetic logic of a processing core or even an analog circuit structure within the device.

4.3. SEL and SEFI Correlation

In Figure 7, the current measurements (blue line) and error information (green dots) are superposed, allowing a time-coherent analysis of the memory behavior under irradiation. In order to support the visualization, a histogram is also presented, providing the average number of errors for a period of 25 ms. Using this information, it is possible to identify a peak of errors (bit upsets) more than 10 times higher than the average during an SEL. This correlation was observed in most SEL events, leading to the conclusion that SEL and error accumulation are associated events. Figure 7 is an example run that was selected from the heavy-ion campaign, but this correlation was also observed in the proton irradiation.
To further investigate this behavior, we performed an additional analysis in which errors during and outside SEL events were highlighted to allow further classification. Figure 8 presents an example of this analysis using a heavy-ion test run in dynamic mode. It plots the timestamped bit errors in a logical bitmap representation. The logical bitmap is a two-dimensional plot of the memory that indexes contiguous data addresses following the logical mapping of rows, columns, and arrays. It can be observed that, during SEL events, errors not only accumulate but also form larger patterns (column block errors). This behavior is important to take into consideration since block errors are capable of defeating mitigation schemes, such as common Error Correcting Codes (ECC). In this context, based on their characteristics and occurrence (95% of test runs), we classified the accumulated column block errors during SELs as SEFI events and the isolated upsets (mainly outside the SELs) as SEU events. Therefore, the SEL not only changes the memory’s current behavior but also triggers SEFIs with more potential impact on the execution than most SEUs.
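The histogram step can be sketched directly on the cycle counter values, which keeps the binning in integer arithmetic. This is an illustrative reconstruction rather than the authors' analysis script: the function names and the synthetic error list are hypothetical, while the 25 ms bin width, the 80 ns clock period, and the factor of 10 come from the text.

```python
from collections import Counter

CLOCK_PERIOD_NS = 80                          # cycle counter clock period
BIN_CYCLES = 25_000_000 // CLOCK_PERIOD_NS    # 25 ms expressed in clock cycles


def errors_per_bin(error_cycles):
    """Count errors in consecutive 25 ms windows, keyed by window index."""
    return Counter(cycle // BIN_CYCLES for cycle in error_cycles)


def burst_bins(counts, n_bins, factor=10):
    """Return the window indices whose count exceeds `factor` times the mean.

    `n_bins` is the total number of windows in the run, including empty ones.
    """
    mean = sum(counts.values()) / n_bins
    return sorted(b for b, n in counts.items() if n > factor * mean)


# Synthetic run of 40 windows: sparse upsets plus a burst in window 20.
background = [b * BIN_CYCLES + 10 for b in range(10)]
burst = [20 * BIN_CYCLES + k for k in range(60)]
counts = errors_per_bin(background + burst)
flagged = burst_bins(counts, n_bins=40)
```

Windows flagged this way can then be compared against the logged SEL intervals to check whether the bursts coincide with overcurrent events.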
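The classification described above can be sketched as a two-step filter: errors are first split by whether their timestamp falls inside an SEL interval, and errors inside SELs are then grouped by logical column. The address-to-column mapping (`address % n_cols`), the minimum block size, and all names below are hypothetical, since the actual logical mapping of the device is not given here.

```python
from collections import defaultdict


def classify_errors(errors, sel_windows, n_cols=512, block_min=8):
    """Split errors into column blocks during SELs and isolated upsets.

    `errors` is a list of (cycle, address); `sel_windows` is a list of
    (start_cycle, end_cycle). The column mapping is an assumed placeholder.
    """
    def in_sel(cycle):
        return any(start <= cycle <= end for start, end in sel_windows)

    by_column = defaultdict(list)
    isolated = []
    for cycle, address in errors:
        if in_sel(cycle):
            by_column[address % n_cols].append(address)
        else:
            isolated.append(address)

    # Columns with many errors during an SEL are treated as block errors.
    blocks = {col: addrs for col, addrs in by_column.items()
              if len(addrs) >= block_min}
    # Sparse errors during an SEL are counted with the isolated upsets.
    for col, addrs in by_column.items():
        if col not in blocks:
            isolated.extend(addrs)
    return blocks, isolated


# Synthetic example: one SEL window with a 10-error block in column 3.
sel_windows = [(1000, 2000)]
errors = [(50, 100)]
errors += [(1000 + k, 3 + 512 * k) for k in range(10)]
errors += [(1005, 7), (9000, 200)]
blocks, isolated = classify_errors(errors, sel_windows)
```

With access to the physical layout, the placeholder mapping could be replaced by the real row and column decoding without changing the rest of the procedure.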

4.4. SEFI Classification

The logical bitmap shown in Figure 9 enables another type of analysis made possible by the proposed instrumentation. It uses the timestamped bit errors to plot a logical bitmap representation. This information is displayed in real time on the host computer based on the latest log files, allowing the operators to stay aware of the DUT behavior and follow the progress of the experiment. The figure shows an example of the errors reported during distinct SEL events. We selected this run because it shows the three main observed SEFI behaviors: 1. lines at the edges of the bitmap; 2. line errors in central regions; and 3. sparse error clusters, i.e., forming lines, in four distinct regions. This result supports the analysis of the effectiveness of ECC in correcting the SEFIs since, despite being a robust hardening strategy, ECC has limitations under error accumulation. With access to the memory’s physical bitmap, it would be possible to further investigate, classify, and correlate error types with the observed error location and current behavior. It is worth noting that, for this memory device, the observed SEFIs readily appear as block errors and are correlated with SELs, but for other systems, SEFIs can be difficult to classify or to attribute to a source. Thus, as proposed, with precise and synchronized current monitoring, it is possible to identify current signatures for each SEFI type, supporting their classification.

5. Discussion

We introduced experimental instrumentation based on widely available COTS components for in-depth monitoring of SEEs in electronic systems. Using a case study (SRAM memory), we presented the main characteristics of the proposed approach, discussed the obtained results, and provided examples exploiting the instrumentation capabilities. As discussed, traditional SEL monitoring solutions for memories often just count overcurrent events and rarely report the correlation with other types of events. In particular, coherent timestamping is another missing capability, which reduces the opportunities for in-depth analysis. Even when different test modes are employed, many setups perform the current monitoring with external circuitry and therefore cannot correlate events with current spikes. The proposed instrumentation provides a solution for these limitations, enabling enhanced analysis of the observed effects. Notably, some important conclusions of this study would not have been possible with traditional or simplified approaches, such as the correlation between SELs and SEFIs (error accumulation), which required precise time synchronization. This information was necessary to support the understanding of the flight behavior of the studied SRAM memories, as reported in prior work [14].
Besides that, the proposed instrumentation resolves individual SEL events adequately, owing to more precise current logging with high-frequency sampling and clustering capabilities. The use of fine-grained measurements greatly improves the investigation of the observed events, providing details of the induced errors through their current signatures and faulty behavior. The instrumentation also allowed the characterization of other SEE cross sections, which improves the overall knowledge about the memory’s response to radiation effects. Table 2 summarizes the performance and functional parameters of the developed instrumentation.
It is worth mentioning that this instrumentation is not limited to SRAM memories, since it can be used for other memory types. Only a few adaptations are required, such as addressing size, pin assignments, and specific protocols. In some cases, even the same test board could be used, since memories are often built with similar chip packages and pinouts. Because the proposed instrumentation is based on an FPGA, it has a very flexible pin assignment that supports easy integration with other memory types. Furthermore, as discussed in Section 4 with a few examples, the strategy presented in this work can be extended to other device types and more complex systems, such as microprocessors, FPGAs, and systems-on-chip. To achieve that, the test engine and error reporting should accommodate other types of tests (e.g., benchmark output checking). With this type of SEE monitoring with enhanced synchronization, it is possible to identify the current signatures of the tested device for a broad range of workloads and operational modes. Using this information, researchers can compile a dictionary of current signatures correlating operational states and SEL response, which could help predict system behavior under the target radiation environments.

6. Conclusions

We proposed experimental instrumentation for SEE monitoring aiming to achieve in-depth analysis by enhancing the observability and synchronization of distinct types of measurements. For that, we discussed the adopted methodology and practical aspects beyond the available radiation testing strategies in the literature. The limitations and opportunities of our approach were discussed. We also presented further perspectives on the improvement and extension of the instrumentation for complex devices and systems. After carrying out two experimental campaigns in irradiation facilities, we validated our proposal and presented the main positive outcomes of using this approach in comparison with more traditional strategies. Notably, the solution provides enhanced synchronization, high sampling rate, fast response time, detailed resolution, and flexibility. Therefore, as part of this study, we were not only able to further investigate the SRAM memories used on-board the PROBA-V satellite and provide insights about their flight behavior and SEL sensitivity, but also propose robust instrumentation for the SEE monitoring of general electronic systems.
In future work, integrating temperature control capabilities into the system will allow the exploration of other temperature ranges, which can significantly affect the SEL response of the devices. Besides that, a feature that might support the experiment execution is a flexible interface ready for integration with the facility’s beam shutter control, enabling fine timing adjustments, such as the isolation of SEL events. Finally, we also observed some behaviors that merit further investigation using the proposed instrumentation (e.g., through a laser test campaign).

Author Contributions

Conceptualization, A.M.P.M., D.A.S., L.M.L., V.G. and L.D.; methodology, A.M.P.M., D.A.S., L.M.L., V.G. and L.D.; software, D.A.S. and L.M.L.; validation, A.M.P.M., D.A.S., L.M.L. and V.G.; formal analysis, A.M.P.M., D.A.S. and L.M.L.; investigation, A.M.P.M., D.A.S., L.M.L. and V.G.; resources, V.G. and L.D.; data curation, A.M.P.M., D.A.S. and L.M.L.; writing—original draft preparation, A.M.P.M. and D.A.S.; writing—review and editing, A.M.P.M., D.A.S., L.M.L., V.G. and L.D.; visualization, A.M.P.M., D.A.S. and L.M.L.; supervision, V.G. and L.D.; project administration, V.G. and L.D.; funding acquisition, V.G. and L.D. All authors have read and agreed to the published version of the manuscript.

Funding

This study has received funding from the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 101008126), from an ESA contract (contract no. 4000134557/21/NL/KML/rk), and from the Region d’Occitanie and the École Doctorale I2S from the University of Montpellier (contract no. 20007368/ALDOCT-000932).

Data Availability Statement

The data presented in this study are available on reasonable request from the corresponding authors.

Acknowledgments

The authors would like to acknowledge the support given by the personnel of the irradiation facilities, RADEF and PSI. The authors also thank the former PROBA-V project members for the information provided and the discussions, and the ESA researchers for the discussions and for the test samples in accordance with the flight parts.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Stassinopoulos, E.; Raymond, J. The space radiation environment for electronics. Proc. IEEE 1988, 76, 1423–1442.
  2. Bourdarie, S.; Xapsos, M. The Near-Earth Space Radiation Environment. IEEE Trans. Nucl. Sci. 2008, 55, 1810–1832.
  3. Yang, M.; Hua, G.; Feng, Y.; Gong, J. Fault-Tolerance Techniques for Spacecraft Control Computers, 1st ed.; John Wiley & Sons, Ltd.: Hoboken, NJ, USA, 2017.
  4. Bruguier, G.; Palau, J.M. Single particle-induced latchup. IEEE Trans. Nucl. Sci. 1996, 43, 522–532.
  5. Schrimpf, R.D.; Fleetwood, D.M. Radiation Effects and Soft Errors in Integrated Circuits and Electronic Devices; World Scientific: Singapore, 2004; Volume 34.
  6. Rajkowski, T.; Saigné, F.; Wang, P.X. Radiation Qualification by Means of the System-Level Testing: Opportunities and Limitations. Electronics 2022, 11, 378.
  7. Quinn, H. Challenges in Testing Complex Systems. IEEE Trans. Nucl. Sci. 2014, 61, 766–786.
  8. García Alía, R.; Brugger, M.; Daly, E.; Danzeca, S.; Ferlet-Cavrois, V.; Gaillard, R.; Mekki, J.; Poivey, C.; Zadeh, A. Simplified SEE Sensitivity Screening for COTS Components in Space. IEEE Trans. Nucl. Sci. 2017, 64, 882–890.
  9. Bezerra, F.; Dangla, D.; Manni, F.; Mekki, J.; Standarovski, D.; Alia, R.G.; Brugger, M.; Danzeca, S. Evaluation of an Alternative Low Cost Approach for SEE Assessment of a SoC. In Proceedings of the 2017 17th European Conference on Radiation and Its Effects on Components and Systems (RADECS), Geneva, Switzerland, 2–6 October 2017; pp. 1–5.
  10. Dilillo, L.; Tsiligiannis, G.; Gupta, V.; Bosser, A.; Saigne, F.; Wrobel, F. Soft errors in commercial off-the-shelf static random access memories. Semicond. Sci. Technol. 2016, 32, 013006.
  11. Secondo, R.; Alía, R.G.; Peronnard, P.; Brugger, M.; Masi, A.; Danzeca, S.; Merlenghi, A.; Vaillé, J.R.; Dusseau, L. Analysis of SEL on Commercial SRAM Memories and Mixed-Field Characterization of a Latchup Detection Circuit for LEO Space Applications. IEEE Trans. Nucl. Sci. 2017, 64, 2107–2114.
  12. Samsung. K6R4016V1D Datasheet. 256K x 16 Bit High-Speed CMOS Static RAM, Rev. 1.0. 2002. Available online: https://www.farnell.com/datasheets/10596.pdf (accessed on 30 April 2024).
  13. Francois, M.; Santandrea, S.; Mellab, K.; Vrancken, D.; Versluys, J. The PROBA-V mission: The space segment. Int. J. Remote Sens. 2014, 35, 2548–2564.
  14. Mattos, A.M.P.; Santos, D.A.; Luza, L.M.; Gupta, V.; Borel, T.; Dilillo, L. Investigation on Radiation-Induced Latch-Ups in COTS SRAM Memories On-Board PROBA-V. IEEE Trans. Nucl. Sci. 2024, 1–9.
  15. Page, T.; Benedetto, J. Extreme latchup susceptibility in modern commercial-off-the-shelf (COTS) monolithic 1M and 4M CMOS static random-access memory (SRAM) devices. In Proceedings of the 2005 IEEE Radiation Effects Data Workshop, Seattle, WA, USA, 11–15 July 2005; pp. 1–7.
  16. Heidecker, J.; Allen, G.; Sheldon, D. Single Event Latchup (SEL) and Total Ionizing Dose (TID) of a 1 Mbit Magnetoresistive Random Access Memory (MRAM). In Proceedings of the 2010 IEEE Radiation Effects Data Workshop, Denver, CO, USA, 20–23 July 2010; p. 4.
  17. Söderström, D.; Luza, L.M.; de Mattos, A.M.P.; Gil, T.; Kettunen, H.; Niskanen, K.; Javanainen, A.; Dilillo, L. Technology Dependence of Stuck Bits and Single-Event Upsets in 110-, 72-, and 63-nm SDRAMs. IEEE Trans. Nucl. Sci. 2023, 70, 1861–1869.
  18. Lee, K.; Kim, J.; Baeg, S. Fault Coverage Re-Evaluation of Memory Test Algorithms With Physical Memory Characteristics. IEEE Access 2021, 9, 124632–124639.
  19. Tsiligiannis, G.; Dilillo, L.; Bosio, A.; Girard, P.; Todri, A.; Virazel, A.; Touboul, A.D.; Wrobel, F.; Saigné, F. Evaluation of test algorithms stress effect on SRAMs under neutron radiation. In Proceedings of the 2012 IEEE 18th International On-Line Testing Symposium (IOLTS), Sitges, Spain, 27–29 June 2012; pp. 121–122.
  20. Kerboub, N.; Alia, R.G.; Mekki, J.; Bezerra, F.; Monteuuis, A.; Fernández-Martinez, P.; Danzeca, S.; Brugger, M.; Standarovski, D.; Rauch, J. Comparison Between In-flight SEL Measurement and Ground Estimation Using Different Facilities. IEEE Trans. Nucl. Sci. 2019, 66, 1541–1547.
  21. Harboe-Sorensen, R.; Poivey, C.; Guerre, F.X.; Roseng, A.; Lochon, F.; Berger, G.; Hajdas, W.; Virtanen, A.; Kettunen, H.; Duzellier, S. From the Reference SEU Monitor to the Technology Demonstration Module On-Board PROBA-II. IEEE Trans. Nucl. Sci. 2008, 55, 3082–3087.
  22. Microsemi. WP0203 White Paper. Single Event Effects–A Comparison of Configuration Upsets and Data Upsets, Rev. 1. 2015. Available online: https://www.microchip.com/content/dam/mchp/documents/FPGA/ProductDocuments/SupportingCollateral/SEE-%20A%20Comparison%20of%20Configuration%20Upsets%20and%20Data%20Upsets.pdf (accessed on 30 April 2024).
Figure 1. Proposed experimental setup: controller board (rear) and test board (front).
Figure 2. Architecture of the proposed experimental setup.
Figure 3. SEL monitoring method. The current threshold and hold/cut times are configurable.
Figure 4. Test execution flow for all test modes.
Figure 5. Instrumentation capabilities: precise synchronization between memory operations and current measurements. The current profile refers to a complete March C- (1) execution, in which operation stages are separated by vertical lines.
Figure 6. Histogram of SEL current levels per LET during heavy-ion irradiation. Current levels are the average measured during the SEL period for the runs in standby mode.
Figure 7. Test run for March C- (1) during heavy-ion irradiation showing the bit upset histogram and the DUT current consumption superposed with synchronized timestamps.
Figure 8. Logical bitmap of the memory address space with the identified bit upsets during a March C- (1) run with a LET of 48.5 MeV·cm²/mg. The x and y axes represent addresses in an arrangement based on the logical memory organization with arrays, rows, and columns, as specified in the datasheet [12]. In this bitmap, bit upsets during and outside SEL events are shown in distinct grayscale tones and color hues, respectively.
Figure 9. Logical bitmap of the memory address space with examples of identified SEFIs during distinct SELs. The x and y axes represent addresses in an arrangement based on the logical memory organization with arrays, rows, and columns, as specified in the datasheet [12]. For this bitmap, a partial March C- (1) run is shown, in which isolated SEUs are hidden to allow better visualization of the clusters. The highlighted groups 1, 2, and 3 represent the identified types of clusters.
Table 1. Summary of SEE cross sections for the heavy-ion experiment. LET values are in MeV·cm²/mg; the last four columns give the Weibull fitting parameters.

| Cross Section ¹,² | LET 1.5 | LET 7.2 | LET 13.3 | LET 24.5 | LET 48.5 | W | S | XS_sat | LET_th |
|---|---|---|---|---|---|---|---|---|---|
| XS_SEU [cm²/bit] | 4.62 × 10⁻¹² | 4.15 × 10⁻⁹ | 1.09 × 10⁻⁸ | 2.24 × 10⁻⁸ | 1.20 × 10⁻⁸ | 14.6 | 1.7 | 2.06 × 10⁻⁸ | 1.5 |
| XS_SEL [cm²/device] | 9.97 × 10⁻⁸ | 3.65 × 10⁻⁷ | 1.61 × 10⁻⁶ | 3.46 × 10⁻³ | 7.64 × 10⁻² | 25.5 | 7.4 | 1.03 × 10⁻¹ | 7.2 |
| XS_SEFI [cm²/device] | 9.07 × 10⁻⁸ | - | - | 3.59 × 10⁻³ | 6.87 × 10⁻² | - | - | - | - |

¹ Only cross sections for perpendicular incidence in one DUT (same as flight lot) are shown; ² All tested modes included for SEL cross section and only static/dynamic for SEU and SEFI cross sections.
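To make the Weibull columns of Table 1 easier to interpret, the sketch below evaluates a fitted cross section as a function of LET. It assumes the commonly used four-parameter cumulative Weibull form, XS(LET) = XS_sat · (1 − exp(−((LET − LET_th)/W)^S)) for LET above LET_th and zero otherwise; the exact expression used for the fit is not restated here, so this form is an assumption. The exponents of the saturation cross sections are taken as negative, and the function and dictionary names are hypothetical.

```python
import math


def weibull_xs(let: float, xs_sat: float, let_th: float, w: float, s: float) -> float:
    """Evaluate a four-parameter Weibull cross-section curve at a given LET.

    let, let_th and w are in MeV.cm2/mg; the result has the unit of xs_sat.
    """
    if let <= let_th:
        return 0.0  # below threshold, the fitted cross section is zero
    return xs_sat * (1.0 - math.exp(-(((let - let_th) / w) ** s)))


# Fitting parameters as listed in Table 1 (exponent signs assumed negative).
SEU_FIT = dict(xs_sat=2.06e-8, let_th=1.5, w=14.6, s=1.7)   # cm2/bit
SEL_FIT = dict(xs_sat=1.03e-1, let_th=7.2, w=25.5, s=7.4)   # cm2/device

if __name__ == "__main__":
    for let in (1.5, 7.2, 13.3, 24.5, 48.5):
        seu = weibull_xs(let, **SEU_FIT)
        sel = weibull_xs(let, **SEL_FIT)
        print(f"LET {let:5.1f}: SEU {seu:.3e} cm2/bit, SEL {sel:.3e} cm2/device")
```

A fitted curve of this kind is a smooth approximation, so it is not expected to reproduce each measured point of the table exactly.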
Table 2. Summary of functional parameters of the instrumentation.

| Parameters | Value | Observations |
|---|---|---|
| Current resolution | 25 µA | - |
| Threshold current | up to 725 mA | Configurable in real-time |
| Timing resolution | 80 ns | Current and memory errors synchronization |
| Current conversion | 140 µs | Analog to digital conversion |
| Current sampling | 220 µs | - |
| Response time ¹ | 1.1 ms to 1.25 ms | From high current to SEL detection |
| Test modes | - | Standby, chip-enabled, static, and dynamic ² |
| Power monitoring | - | Main supply and IO banks monitored |

¹ Considering five current samples for filtering; ² March C- and Dynamic Stress.
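The lower bound of the response time in Table 2 is consistent with five current samples taken every 220 µs (5 × 220 µs = 1.1 ms). The sketch below illustrates one way such a filtered threshold detection could work. It assumes that an SEL is confirmed after a run of consecutive samples above the threshold, which is a guess about the filtering rule and not a description of the actual FPGA logic; the function names are hypothetical.

```python
from typing import Iterable, Optional

SAMPLE_PERIOD_US = 220  # current sampling period (Table 2)
FILTER_SAMPLES = 5      # samples considered for filtering (Table 2, footnote 1)


def detect_sel(samples_ma: Iterable[float], threshold_ma: float,
               n_filter: int = FILTER_SAMPLES) -> Optional[int]:
    """Return the index of the sample that confirms an SEL, or None.

    Assumed rule: n_filter consecutive samples above the threshold.
    """
    run = 0
    for index, current in enumerate(samples_ma):
        run = run + 1 if current > threshold_ma else 0
        if run >= n_filter:
            return index
    return None


def min_response_time_us(n_filter: int = FILTER_SAMPLES,
                         period_us: int = SAMPLE_PERIOD_US) -> int:
    """Shortest detection delay implied by the filter length, in microseconds."""
    return n_filter * period_us
```

Under this assumption, a longer filter rejects short current spikes at the cost of a proportionally longer delay before the supply is cut.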

Share and Cite

MDPI and ACS Style

Mattos, A.M.P.; Santos, D.A.; Luza, L.M.; Gupta, V.; Dilillo, L. Investigation of Single-Event Effects for Space Applications: Instrumentation for In-Depth System Monitoring. Electronics 2024, 13, 1822. https://doi.org/10.3390/electronics13101822