1. Introduction
Cardiotocography is a crucial part of modern perinatal medicine, which enables to monitor and assess a fetal condition during pregnancy and labor. It involves recording a fetal heart rate (FHR) signal against uterine contractile activity and fetal movements [
1,
2,
3,
4,
5]. Normal fetal heart activity is an indirect evidence of adequate fetal oxygenation and preservation of the central nervous system functions [
6]. The FHR signal values in the range 110 ÷ 150 bpm confirm a normal fetal heart activity. Accelerations or decelerations are defined as episodes of increasing or slowing of the heart rate. Another activity feature is an instantaneous variability, represented by two components: short-term variability (STV) describing FHR fluctuations occurring from beat-to-beat and long-term variability (LTV) defining periodic fluctuations in direction and extent of the STV.
The highest accuracy of the FHR signal determination is ensured by recording primary electrical excitations of a fetal heart, i.e., the fetal electrocardiogram (FECG) [
7,
8], which ensures very precise locations of characteristic points—R waves, representing subsequent fetal heartbeats. Based on time T
i between two consecutive R waves, the instantaneous FHR
i value is determined in beats per minute [bpm]:
where: T
i is the interval between two consecutive heartbeats expressed in milliseconds.
Unfortunately, the use of fetal electrocardiography during pregnancy faces serious problems [
9,
10,
11,
12]. When acquiring an indirect FECG from maternal abdominal surface, its amplitude is much lower than the electrical activity from mother, and in addition, a high level of other interferences often prevents successful T
i measurement [
13,
14,
15]. A good quality signal can be obtained invasively from an electrode directly attached to a fetal head, which can only be carried out during an advanced labor. These limitations caused that already in early 70s a Doppler ultrasound (US) method [
16,
17,
18,
19,
20,
21,
22] was applied for monitoring of mechanical activity of the fetal heart. The Doppler echo signal reflected from moving heart is characterized by a complex structure originating from opening and closing of valves, as well as from movements of heart walls [
23,
24,
25,
26,
27,
28]. In addition, it is characterized by high variability in time due to fetal and maternal movements [
3,
29,
30] and changing of transducer position in relation to a moving signal source—the fetal heart [
31,
32,
33]. Identification of the heartbeat occurrence, based on a simple peak detection in looking for characteristic component, is very inaccurate due to its variable shape [
23,
34]. In addition, any interfering impulses of sufficiently high amplitude can be identified as successive beats [
23,
35,
36,
37]. All of these makes practically impossible to indicate points equivalent to the R-waves in FECG [
38]. Therefore, for FHR determination based on the US signal, correlation methods considering the full impulse shape are used [
33,
35,
39,
40,
41,
42].
The dominant method is the autocorrelation function (AF), which comparing to the cross correlation, is easier to perform and does not require selecting a heartbeat pattern that makes it resistant to signal interferences [
43,
44,
45]. Unfortunately, only the periodicity is estimated, not the exact locations of subsequent heartbeats in the analyzed signal. Since measurement of the T
i is repeated in windows covering at least two complete cardiac cycles, a single heart interval can be represented by more than one measurement. Such signal form makes it impossible to reliably assess a beat-to-beat variability. The physiological FHR range is from 60 to 240 bpm (which corresponds to T
i intervals from 1000 to 250 ms). Therefore, not to miss any cardiac cycle, subsequent FHR values must be determined at least every 250 ms. Such repetition time is a widely accepted standard at which Doppler ultrasound monitor provides a new FHR value at its output.
To date, the possibilities of parameterization and objectification of the FHR variability evaluation relate to accurate quantitative signal analysis carried out by dedicated computer-aided fetal monitoring systems [
46,
47,
48,
49,
50,
51]. Automated signal analysis ensures repeatability of interpretation [
52,
53,
54,
55,
56], as well as the extraction of information hidden for visual assessment, such as instantaneous FHR variability at a beat-to-beat level [
57]. At turn of the 60 s and 70 s, several different indices were defined for quantitative description of the instantaneous FHR variability [
58]. Most of these definitions assume that the analyzed signal comes from an invasive direct FECG, i.e., the indices are calculated based on T
i intervals determined between the R waves. In other words, the analyzed FHR signal has a form of time series of events, in which each cardiac cycle is represented by a single measurement only. Nevertheless, a direct application of the definitions originating from FHR acquired via FECG, to the FHR determined from Doppler US signal, does not give good results. Especially the short-term variability indices, calculated from measurements occurring every 250 ms, show significant errors [
59,
60,
61,
62].
Therefore, attempts have been made to process the FHR signal from a time series of evenly spaced measurements into a time series of events representing subsequent heartbeats [
63,
64,
65]. Despite the improved method for determination of FHR variability indices [
65], their accuracy was still limited due to a wide autocorrelation window resulting in averaging of several neighboring intervals. The window width selection is a compromise between noise immunity and measurement accuracy. In practice, the AF windows of a few seconds are used, which ensures the continuity of periodicity measurement even in the presence of interferences from fetal movements, however at a cost of the measurement accuracy. The determined interval value does not then represent a single cardiac cycle, but an averaged value of few last beats [
36,
62,
64,
66]. It does not affect the results of classical FHR analysis—the baseline estimation or detecting accelerations and decelerations, because such analysis is carried out on a signal averaged in 2.5 s periods [
62]. However, averaging of measurements practically prevents a reliable STV determination regarding beat-to-beat changes [
62,
65].
It seems that analysis of the US signal inside a fetal monitor do not use its full potential. This results partly from a lack of requirements for this type instrumentation, due to relatively short history of fetal monitoring systems enabling automated FHR analysis with beat-to-beat accuracy. The question arises whether, by improving the built-in algorithm for determining the periodicity in a raw Doppler ultrasound signal, it is possible to obtain measurements close to reference values from the FECG. Peters [
66] developed an algorithm for determining the FHR from US signal, providing it in a form of time sequence of events, which enabled correct determination of indices describing the FHR instantaneous variability [
66]. The sequence of events was obtained by using markers that approximately corresponded to subsequent heartbeats. They were used to select the signal fragments for which the AF was determined. Peters showed a high correlation with the FHR signal determined from direct FECG. However, he examined only one short signal and did not evaluate an interval measurement accuracy.
Such evaluation was made in [
67], where an autocorrelation window comprising two cardiac cycles was positioned in US signal based on the simultaneously recorded FECG. The research concerned different values of window shift in relation to its initial position, determined by an occurrence of the R wave. Unfortunately, the work is only theoretical, because the method, as requiring an additional signal source, is not suitable for any practical use in fetal monitor.
Al-Angari has also investigated the problem of determining the FHR signal from the Doppler ultrasound, as time series of events [
68]. However, the obtained FHR signals were characterized by unusually high variability. Moreover, the source of reference FHR was a FECG signal, but recorded indirectly from a maternal abdomen. Jezewski also dealt with the analysis of a raw US signal [
36] recorded together with a reference direct FECG signal. The width and step of the AF window were adapted to the previously determined interval value. The efficiency of a new method was estimated by the interval measurement and de Haan’s variability index determination errors [
58]. Thus, the obtained results confirmed the high efficiency, but only in relation to a single FHR variability index [
64].
The cited works do not allow us to answer the previously asked question about the possibility of determining the FHR from a raw US signal with the accuracy similar to the reference direct FECG. Therefore, the authors of this work conducted comprehensive research on the Doppler ultrasound signal describing the mechanical activity of a fetal heart. The aim of this research was to develop new methods that will both improve the accuracy of FHR determination and provide the resulting signal in form of time series of events. Achieving of these two goals should result in a more accurate description of instantaneous FHR variability, and thus a more effective early detection of fetal distress.
The goal of our research was to develop a complete channel for processing the Doppler ultrasound signal enabling precise periodicity measurement. First, the information content of the US signal was analyzed-two subranges of Doppler bandwidths were separated using bandpass filtering to describe slower heart wall movements and valve motions of a higher speed. In a next stage, the US envelope was determined, and a classic low-pass filtering and Hilbert transform were considered. To determine the FHR signal, the autocorrelation function supported by the trapezoidal prediction function was used. In final stage, we proposed two different methods providing the FHR signal in a form of time series of events: first using correction of duplicate measurements and second based on segmentation of instantaneous periodicity measurements. In addition, very important was the original methodology for evaluating efficiency of the proposed algorithms, which was based on automated synchronization of examined FHR signals. All these aspects are important because there are no other publications comprehensively evaluating the FHR signal determination accuracy, both by estimating cardiac interval Ti accuracy and calculating clinically crucial indices describing the instantaneous FHR variability and additionally using direct fetal electrocardiography as reference data.
2. Methodology
Using an ultrasound method, determination of a fetal heart rate is based on analyzing the Doppler effect, i.e., a frequency change of US beam reflected from moving parts of a fetal heart. The received returning echo is a signal with a frequency of the US wave modulated by the speed of a given moving structure. After amplification it is demodulated for separation of particular Doppler components. Then the signal is converted into a digital form, and a further processing is carried out—leading to the FHR determination.
When all phases of a cardiac cycle, originating both from valve and wall movements are present (
Figure 1) much more complex signal structure (US1, US2) in relation to a simultaneously recorded FECG can be noticed. The amplitude and shape of individual components change during long-term monitoring, as the object (fetus) can change its position relative to a transducer [
31]. The FECG additionally presented in
Figure 1, allows for detailed identification of cardiac phases observed in signals of mechanical heart activity.
In addition to an occurrence rate, particular cardiac phases are characterized by different bandwidths. The lower frequencies correspond to the heart wall, while the valve movements are represented by higher frequency components (
Figure 2). The US signal envelope describing the valves is characterized by higher dynamics, and narrow peaks enabling accurate cardiac cycle length estimation. In turn, the envelope signal from the walls is much more smoothed, with wider peaks which reduces periodicity measurements accuracy. Additionally, a risk of erroneously doubling the heart rate increases, since the signal envelope fragments during the systole and diastole phases are very similar.
2.1. Data Acquisition
The research material used for this study was collected at the Obstetric Clinic of the Medical University of Silesia in Zabrze. It was carried out by investigators from the Medical University of Silesia during development the project “Application of direct electrocardiography for intrapartum assessment of fetal distress “(Bioethics Committee approval number: NN-013-345/02)”.
The fetal monitor used in this study enables to acquire a raw Doppler US signal (1 MHz) obtained after the demodulation process. The FHR_CTG measurements calculated by built-in fetal monitor software were accessible on a digital output.
Since the additional direct FECG recording (for determining FHR_E) required measuring electrode placed on fetal head, all signals were recorded as part of a routine procedure for labor monitoring. Six sessions (39 ÷ 41 week of pregnancy) lasting from 280 to 1263 s were carried out. The final research material included 58 min of simultaneously acquired: Doppler US signals being the input data for the algorithms investigated, original FHR_CTG signals determined by a fetal monitor, as well as reference information in the conducted research—FHR_E signals. The resolution of both FHR_CTG and FHR_E signals were 0.25 bpm.
2.2. Signal Processing
The goal of the research was to develop a complete channel for processing the Doppler ultrasound signal enabling precise periodicity measurement. The basic assumption was that the resulting FHR signal should be in form of time series of events, i.e., each subsequent interval between two heartbeats should be represented only by a single measurement—like in FECG.
The first stage of the research focused on optimal representation of the Doppler ultrasound signal (
Figure 3). The signal information content was analyzed for the influence of individual cardiac phases on the accuracy of measurement. Two bandwidths of Doppler frequencies were separated using bandpass filtering: one describing wall and one relating to valve movements. The next stage concerned reduction of signal complexity by calculating its envelope. Classic low-pass filtering and the Hilbert transform were considered. For methods evaluation, the FHR values were determined by AF, calculated every 250 ms.
During the next stage, we focused on tuning algorithms for determining the FHR signal: selection of autocorrelation windows parameters both for the autocorrelation and prediction function. Next, two different methods were developed to provide FHR signal in form of time series of events: using correction of duplicate measurements [
65] and basing on segmentation of instantaneous periodicity measurements. During the algorithm development, different factors were recognized as having a major impact on the final Ti measurements. No less important was the final measurements validation, if the determined interval value was correct, due to interferences originating from maternal heart activity or fetal movements.
The final stage of the study concerned metrological assessment of the proposed methods in relation to reference values provided by a fetal electrocardiogram. To validate the results, a precise synchronization of FHR signals was very important to ensure comparing the corresponding intervals. Two different procedures were proposed to estimate the inconsistency. A direct comparison of signals by the differences of instantaneous FHR values was accompanied by signal loss analysis and evaluating the signals impact on the indices describing long-term and short-term FHR variability.
2.2.1. Signal Preprocessing
In the first step, the appropriate signal bandwidth was selected to obtain “optimal” signal information content for more accurate interval measurement. Initially, it was assumed that the entire bandwidth, corresponding to both heart valve and wall movements, should be analyzed. It is difficult to measure signal periodicity precisely because each cardiac cycle in the envelope signal is represented by several impulses related to different cardiac phases. The number of events observed in a signal can be reduced by separation of signal components corresponding to different motion speeds of heart structures. In the case of fetal monitoring, it is assumed that using an ultrasound transducer emitting with 1 MHz, wall and valves movements are within bandwidth of 100 ÷ 600 Hz [
36] Using band-pass filtering the subranges were separated, covering frequencies from 100 to 300 Hz (wall movements—
Figure 4B) and from 300 to 600 Hz (valve movements—
Figure 4C). Filtering below 100 Hz enabled to remove components originating mainly from fetal trunk and limbs movements, while above 600 Hz from flows in maternal blood vessels.
The next step involved determining the Doppler US signal envelope aimed to reduce the signal complexity. The easiest way to obtain an envelope is to convert a bipolar signal into a unipolar one using a module function and next a low-pass filtering. Too low filter cutoff frequency causes blurring of phases observed in a cardiac cycle, which, leads to lowering the AF peak and reducing measurement accuracy. On the other hand, too high cutoff frequency (e.g., 150 Hz) incorrectly shifts the peak due to high AF corrugation. The filter parameters should be adjusted to Doppler frequencies, which are proportional to the 1 MHz. During the tests, the envelope signal was determined using different cutoff frequencies (25, 50, 75, 100, 150 Hz), as well as without any additional filtering.
The Hilbert transform was used as an alternative method for determining the envelope [
33]. The envelope determined using the Hilbert transform was characterized by quite high variability, especially in fragments of high dynamics. Therefore, a moving average filter (MA) was additionally used to smooth the envelope. The best smoothing while maintaining the impulse shape, is ensured by averaging over 11 or 21 signal samples (established empirically). For further studies, the envelope was determined using the Hilbert transform without additional filtering and MA with a window of 11 or 21 samples.
Figure 4B,C shows exemplary envelopes determined with a low-pass filter with a cutoff frequency of 50 Hz and using the Hilbert transform.
To assess the impact of the bandwidth applied on the accuracy of measurements at this stage of study, a simple AF determined every 250 ms within windows of 1.5 s was used. Because at this stage of analysis we just want to compare two methods for envelope determining, so such approach, which is a compromise between ensuring the continuity of measurements and their accuracy, is the most right. Additionally, the AF threshold relating to low level of signal quality and thus defining the FHR signal loss was empirically determined. For this purpose, the most disturbed signal fragments were analyzed, in which in the absence of fetal heart episodes, many local AF peaks were observed, whose amplitude usually did not exceed 0.1. Therefore, if the amplitude of AF peak did not exceed P
TH = 0.1, a loss marker was set in the resulting FHR signal. Such AF threshold is more rigid than 0.15 proposed in the study [
36].
2.2.2. Periodicity Determination
Two different methods for determining the autocorrelation function were used. The first is the detection of the AF peak calculated arbitrarily every 250 ms. In the second case, AF is calculated more frequently with every 25 ms, and the outlier values most often caused by temporary interferences, are rejected from a group of measurements. An important factor affecting both the measurement accuracy of T
i interval value and the level of FHR signal loss is the AF window width. It is required that the window should cover at least two of the same cardiac phases belonging to two consecutive cardiac cycles. The multiphase motion process observed in US signal means that determination of the T
i interval does not require a window including two full cardiac cycles. Even if certain cardiac phase will occur only once in a narrower window, then most likely the other phase will be represented twice, allowing correct measurement. A narrow AF window means minimization of the averaging effect over several cycles, and thus a higher measurement accuracy. However, too narrow of a window may prevent correct measurement, which will result in FHR signal loss. Jezewski [
36] set the minimum window width at 1.5 of the last measured interval duration. The advantage of this adaptive window width selection is reduction of necessary calculations, especially in case of short T
i intervals, which may be important in practical implementation in battery-powered mobile fetal monitors. On the other hand, so strong dependence of the calculation process on previously measured values can cause a permanent switch to an incorrect range of T
i values.
The algorithm for the periodicity determination assumes that when the signal quality is satisfactorily high, the maximum of AF found is likely to correspond to the T
i interval value in US signal. To prevent erroneous measurements for low-quality signal fragments (AF peak amplitude < 0.5), the prediction function (PF) was proposed, where recently determined interval value is taken as an expected value for the next one. Such AF peak amplitude used in prediction function was determined empirically based on analysis of problematic cases observed in available signals. In the absence of interferences, the most frequently observed AF peak amplitudes were usually above 0.6. In the presence of additional disturbing episodes observed due to fetal movement activity, the AF peak amplitude was just below 0.5. The value of the AF is corrected by using a trapezoidal PF window, with a selected width of the lower base, a base ratio of 1:4 and the center located at the point τ = T
i−1 (
Figure 5). The proposed solution is a modification of the approach from [
36], where a triangular window was used, which however exposed too much the last measured value. In cases of increased heart rate changes, e.g., on acceleration and deceleration slopes, where the differences of successive intervals are significant, the use of a triangular window caused erroneous measurements.
The new trapezoidal window causes that the AF amplitude for shifts τ close to the values of recently determined Ti remains unchanged. While for less likely intervals (τ values significantly different from Ti−1) the amplitude is reduced according to the applied PF. In addition, it was assumed that if the current interval was determined using a prediction algorithm, then it does not modify the PF position for the next iteration. Finally, the Ti takes the value of τ for which the AF(τ) reaches maximum value.
In the next part of investigation, we developed two independent methods based on the autocorrelation function, which allow the resulting FHR signal to get the required form of time series of events. For this purpose, a method of correcting duplicate measurements [
65] was used, as well as a segmentation of instantaneous periodicity measurements.
2.2.3. Extraction of Time Series of Events
The first method (D1) for determining the fetal heart rate signal from the Doppler US signal provides the FHR value every 250 ms (like in classical monitors), after which a correction of duplicate measurements proposed in [
65] is applied (
Figure 6). The AF was calculated every 250 ms, but sometimes also more often (every 25 ms) and then the representative FHR value was the median of 10 consecutive instantaneous measurements F
j. According to [
36], the minimum window width was set at 800 ms, which is 1.5 of maximum interval value (533 ms) in the reference signal, and other tested widths were: 1, 1.5, 2 and 3 s. The impact of different PF window widths (S) on the final measurements was also tested with 125, 250 and 500 ms, which were adopted basing on visual analysis of the AF. Value S = 0 means that PF was not used, and then the position of maximum peak of AF (in the physiologically acceptable range from 30 to 2000 ms) defined periodicity.
The FHR values determined every 250 ms, underwent correction of duplicate measurements to obtain a signal in the form of series of events according to the authors’ algorithm [
65]. The aim of the algorithm for identifying and correcting duplicate measurements is determination of the number of true heart cycles represented by a sequence of instantaneous measurements. For each analyzed sequence, the Min and Max values are determined, defining two different numbers of such intervals. If the rounded Min and Max values are equal, then this value represents the only correct solution. In the case of different values, determination of the number of real heart intervals is based on analysis of time dependencies as well as assessment of probability of a given solution existence.
In the D2 method, the autocorrelation function is determined in a window shifted with 25 ms (
Figure 7). Subsequent values of instantaneous periodicity F
j are determined using the prediction function (as in the D1 method). The next stage is segmentation of measurements F
j, which identifies a group of subsequent measurements, which will then be used to calculate the resulting interval T
i. Segmentation relies on continuous computation of a median value until a time being the product of number of measurements and repetition period (25 ms) exceeds the current median value. Then the median becomes the searched T
i value and the procedure is repeated. Unfortunately, the obtained results depend on a starting point determining the phase shift between segments and series of F
j. Two approaches are proposed to reduce this unwanted effect.
The first one is to specify a starting point other than the signal beginning (
P ≠ 0). If the point correctly represents a cardiac cycle beginning, then the value of T
i will be determined only on the F
j corresponding to the considered interval. Otherwise, some of them will relate to the neighboring T
i−1 interval, causing erroneous value of the current T
i. To find the starting point, RMS value was calculated using the envelope signal. Various windows (100, 200, 250, 500 and 1000 ms) moved with 1 ms step were tested (
Figure 8). This range of RMS window width was adopted from observation of information content in terms of identifying the patterns representing fetal cardiac phases. For the first three seconds of the signal, the maximum value of the RMS function was determined. Then, going back to the beginning, the starting point P was set as soon as the current RMS value dropped below 2/3 of the maximum.
The second matching algorithm (MAA) concerns the correction of already determined T
i interval value and compensates the effect of shift between the determined segments and the F
j series (
Figure 9). The matching is controlled by a mean value of the absolute differences calculated between particular F
j and mean value of all F
j measurements within the analyzed segment. It makes that the value is calculated for three different cases: no shift of time stamps of the analyzed segments, a negative shift and a positive shift. As many as seven recently determined segments are analyzed, which ensures that none of distorting factors affect the final result of matching. If the minimum difference corresponds to a nonzero shift value, then an appropriate phase correction is applied to the time stamp of the interval beginning, while the T
i value itself remains unchanged. The details of the algorithm operation are presented in
Figure 9. In the first stage, the value of the T
i interval is calculated as the median among measurements of instantaneous periodicity in the range between τ
i and τ
i + T
i−1 (
Figure 9A). Then a mean difference is calculated between F
j values, obtained for seven intervals from τ
i−6 to τ
i + T
i−1 and for values from T
i−6 to T
i, representing these intervals (
Figure 9B). To emphasize the time dependencies, only a signal fragment with a length of 1.5 s is presented and the measurements F
j (which differ significantly from a given T
i interval), are marked with a circle. In the next stage, the calculations are repeated for time markers τ
i−6 ÷ τ
i + T
i−1 shifted by γ = ±25 ms, i.e., by a period between two consecutive fetal heart rate measurements F
j (
Figure 9C,D). If the minimum average difference is in the time stamp shift, the correction coefficient ε takes a value of ±1/5·γ, i.e., ±5 ms, otherwise it is 0 (
Figure 9E). Then a time stamp value of a beginning of the next interval is determined as τ
i+1 = τ
i + T
i−1 − ε, the increment of the variable “i” and the repetition of calculations for subsequent intervals are carried out.
A similar correction method was used in [
36], however, only the last three T
i intervals were analyzed, which was not always enough. If some of F
j measurements represented incorrect values, proper matching was difficult or even impossible. A similar matching problem concerned the signal fragments with very low variability—several successive intervals had very similar values. Since in [
36] AF was calculated with variable step (depending on recently determined T
i), the value of the correction was also variable and equal to 0.25 of the currently adopted step for calculating F
j.
2.2.4. Signal Loss Analysis
Additional interferences, originating from fetal movements or maternal blood vessels, may result in shortening of the determined cardiac cycle length or its lengthening if one of heartbeats is missed. To ensure a high accuracy in measuring intervals, an elimination of significant errors is required. Since it cannot be done as part of the AF, a preliminary acceptance criterion should be applied to Ti intervals already at the stage of AF calculation. The AF amplitude is an indicator of a signal quality and consequently, the correctness of the Ti calculation. Instantaneous periodicity measurements for which the AF peak is lower than the threshold PTH = 0.1 are omitted. The threshold value was determined empirically, as it was described Signal preprocessing subsection. Intervals considered to be invalid cause that signal loss markers are inserted in the FHR.
The next stage concerns the FHR limits resulting from physiology. The classic van Geijn criterion is used to identify and remove artifacts [
69]. However, the results of our research indicate that in some situations the criterion does not work correctly. Too wide range of T
i instantaneous changes is considered acceptable, mainly to ensure correct interpretation of values within the slopes of deceleration and acceleration patterns. Unfortunately, very often these values are incorrect and distort the results of beat-to-beat variability analysis. Therefore, more rigorous validation criteria are proposed, where the T
i interval must meet the condition:
where:
The T
i interval is accepted if it belongs to a group of at least three successive intervals meeting the above condition. In some cases, not only artifacts are removed from the signal, but also the correct interval values, especially within the slope of FHR acceleration or deceleration patterns. To prevent this, validation takes place in two directions of the timeline, accompanied by a change of thresholds. Only the T
i, which does not meet the criterion for both directions, is considered as incorrect. The final verification of T
i values initially classified as incorrect is carried out based on the analysis of the monotonicity of beat-to-beat changes. A given T
i interval is considered incorrect, if the differences between itself and neighboring intervals are of the same sign and the product of these differences is greater than 35 ms
2, which was determined on the basis of research presented in [
70].
2.3. Evaluation Methodology
To assess the differences between FHR signals obtained by the investigated methods in relation to the reference values obtained from a direct fetal electrocardiography, a set of descriptive statistics parameters calculated for all 58 signals was determined:
—mean value of Ti interval differences between the investigated and reference method;
—standard deviation of the differences ;
—measurement error of the i–th interval, as an absolute value of ;
—mean value of the measurement error .
Each of the FHR determination methods introduced a variable shift of the calculated intervals in relation to the source Doppler US signal. It is caused by the fact that different window width and different starting point (depending on a signal content) were used for AF. In addition, reference signals were obtained from a direct FECG, hence the presence of an additional time shift. Therefore, to ensure the reliability of the analysis, it was very important to synchronize the FHR signals to make certain the comparison of corresponding cardiac cycles in the investigated signal (Ti_B) and the reference one (Ti_E). Due to the size of the research material, synchronization was carried out automatically.
In the first step, both FHR signals were presented in a form of time series (
Figure 10). Then the FHR_U signal determined by the investigated method was shifted in the range +/– 3000 ms, in steps of 1 ms. The mean interval measurement error was calculated for each shift. It was assumed that subsequent values
are calculated the middle of the duration of subsequent reference intervals (indicated by arrows in
Figure 8):
where: T
i_B—value of the i-th interval determined with the investigated method T
i_E—reference value of this interval obtained from a FECG.
In addition, only intervals within the range from 5 to 55 s (of a 60-s reference signal) were considered. It was found that the correct synchronization refers to the shift value for which the minimal mean error is obtained.
A very important FHR quality measure is the signal loss level, which has a major impact on the results of automated analysis [
32,
71]. Very high measurement accuracy may be a result of rejecting too large number of “uncertain” measurements during calculations. Only the evaluation of the T
i measurement error together with the amount of FHR signal loss enable a reliable performance assessment of a given method. FHR signal loss is calculated using loss markers determined during a method verification. As in the case of the T
i error, in order to standardize the results, an analyzed fragment concerned a time interval from 5 to 55 s.
Considering diagnostic significance of indices for quantitative description of the instantaneous FHR variability (both short- and long-term), the method evaluation was not limited only to descriptive statistics of the measurement error. Detailed error analysis in variability indices determination was considered as very important criterion for assessing the efficiency of the investigated method. For purposes of the study, six short-term (named with S followed by an index author’s name) and six long-term (named similarly with L) FHR variability indices [
58] were selected. They were calculated over the corresponding fragments from 5 to 55 s for the FHR_CTG and FHR_U signals and compared with the corresponding reference values. Relative errors of indices determination were calculated, for example, the determination error for the S_DAW index is described by a formula:
where:
4. Discussion
The paper proposed new methods for determining a fetal heart rate signal based on a detailed analysis of the raw Doppler ultrasound signal. The primary requirement for these methods was that the FHR signal should be in the form of time series of events reflecting subsequent cardiac cycles—Ti intervals.
Experimental studies were carried out based on simultaneously recorded Doppler ultrasound signal, original FHR_CTG signal calculated on its basis inside a fetal monitor, as well as the FHR_E signal determined from a direct fetal electrocardiogram (being the source of reference information). The effectiveness of the developed methods was evaluated on the basis of measurement accuracy of Ti intervals between successive heartbeats, combined with estimation of the resulting FHR signal loss. In turn, the second stage of assessment concerned the accuracy evaluation of the instantaneous FHR variability, based on dedicated indices describing short-term and long-term variability.
There are various ways to assess the quality of FHR signals from the outputs of popular ultrasound-based fetal monitors relative to the reference FECG. The comparison of signals in metrological aspect concerns both visual assessment of the FHR traces and estimation of statistical parameters describing the signals, as well as comparison of the synchronized measurements—Ti intervals. On the other hand, the verification of the ultrasound method in a clinical aspect consists in comparing diagnostically significant patterns determined on the FHR basis, such as: baseline, acceleration and deceleration patterns or indices describing instantaneous variability.
The first reports regarding the FHR signal recorded by an US method were published when fetal monitors became popular-in the seventies. Already in 1976, Lauersen [
72] studied the differences between simultaneously recorded FHR signals via ultrasound and electrocardiography. The standard deviation for differences in corresponding measurements was as much as 9.8 bpm, mainly because of a lack of precise signal synchronization. Dawes [
73] also omitted the synchronization problem—and when compared the FHR mean values representing one-hour signals—he obtained a correlation of 0.9. In addition, he assessed the FHR short-term variability using the S_DAW index, for which a correlation was 0.63. The difference between mean values of the S_DAW index: 2.3 ms for FECG and 4.6 ms for US, was statistically significant. Lawson [
74] used the same method, but with a different fetal monitor. For FHR signals from the FECG channel, the mean S_DAW value was 2.1 ms, while for the US it was three times higher −6.2 ms. The obtained errors [
72,
73,
74] resulted from using monitors with US channel of the so-called first generation, where T
i intervals were determined solely on the peak detection. In another work [
75], Lawson has already used a monitor with a US channel of second generation, in which AF has already been applied to measure a periodicity. Furthermore, in this case, the difference was statistically significant, although much smaller and with the opposite trend—the S_DAW index was 1.29 ms for the US, which was 35% lower than its value of 1.98 ms for the FECG channel. The reason for underestimation was the autocorrelation function leading to averaging of T
i intervals.
Jezewski made a comprehensive assessment of the accuracy of FHR measurements [
62], comparing 21,941 pairs of T
i intervals recorded simultaneously with US channel of fetal monitor MT-430 (Toitu) and direct fetal electrocardiography. Reference FECG was used to eliminate duplicate measurements, which after synchronization served as a clock signal for the FHR signal from a fetal monitor. This correction method has enabled a reliable assessment, but it was impossible to apply in clinical practice. For higher reliability of results, the FHR reference signal was determined by two independent algorithms from the FECG, and during final analysis the measurements for which the difference between them exceeded one millisecond were rejected. The mean error
between US and FECG was 2.98 ms. In turn, the errors of variability indices determination ranged from −2.0 to −9.3% for LTV and from −5.9 to as much as −39.5% for STV. No results were provided for uncorrected data from a fetal monitor, which would allow estimating the impact of duplicate measurements on errors of the instantaneous variability assessment.
Other studies concerned methods for determining FHR signals in the form of time series of events from measurements occurring every 250 ms and recorded with a classical fetal monitor [
63,
64,
65]. Research conducted in [
65] clearly showed that, in relation to the FECG the fetal heart rate signal containing duplicate measurements introduces errors, especially when determining the STV indices (from −51.5% to as much as −91.3%). Effective removal of duplicate measurements—conversion of the FHR signal to the form of time series of events—allowed to reduce these values (from −28.0% to −74.0%), which unfortunately still does not allow for reliable analysis of the FHR variability. This was confirmed by the results in [
62], but although errors of the algorithm for duplicate measurements correction were eliminated by FECG signal timing, the errors of STV indices still had unacceptable values. The main reason was a use of too wide AF windows being responsible for measurements averaging. Therefore, only the development of more effective algorithms to analyze the raw US signal may improve the accuracy of FHR determination and thus the quantitative assessment of instantaneous variability.
Therefore, the authors developed two new methods for extraction of time series of events—the occurrences of fetal heartbeats. The first is D1 which is based on the duplicate measurements correction previously developed by the authors [
65]. The obtained FHR signal and STV indices are much accurate than results obtained for the FHR_CTG signal from a fetal monitor. The mean error of interval measurements
was 1.91 ms, against 2.78 ms for the signal from a monitor output. For LTV indices, errors ranged from −0.1% to −1.4%. In contrast, for STV indices errors ranged from −2.1% to −13.3%. It is worth to note that, errors obtained for the original FHR_CTG signals were from −50.8% to −91.3%.
In the second D2 method, required events are results of segmentation of instantaneous FHR measurements, based on the analysis of measured values. This method provides a significantly lower interval measurement error of 1.35 ms. The increase in measurement accuracy does not occur at the expense of signal loss level equal to 0.41%. Errors of determining LTV indices using D2 were very similar to the D1 method and ranged from −0.1% to −1.3%. In the case of STV indices, errors were always lower than for D1 and ranged from −1.9% to −10.1%. The obtained results clearly confirm that the D2 method is the best in terms of both the accuracy of measurement and reliability of estimation of FHR instantaneous variability.
In general, the research carried out in the work covered a very wide range of issues, from an initial analysis of raw Doppler ultrasound signal after demodulation to the final measurement of FHR values. Therefore, the obtained results were influenced by many different control parameters. Some of them as: threshold of the AF peak amplitude defining the FHR signal loss and threshold deciding about the measurements support by an additional prediction function, were determined empirically basing on our previous experience [
36,
70]. Significantly larger group are parameters whose optimal values have been determined on the basis of results of research carried out as part of this study. These are: Doppler signal bandwidths, control parameters in determining the envelope signal, window parameters of the autocorrelation and prediction functions, as well as the width of the RMS window used in the final segmentation process. The process of determining all these parameters and evaluation of their impact on final results is documented in
Section 3.1 Study of sensitivity.
The analyzed signal bandwidth has a significant impact on the results obtained. It should cover the range from 300 to 600 Hz corresponding to fetal heart valves. The use of short AF window (1 s) and measurement support with a trapezoidal prediction function turned out to be very important, which ensured high accuracy of Ti measurement and low signal loss in the resulting FHR. Nevertheless, for reliable assessment of indices describing beat-to-beat variability, a signal in the form of time series of events is required. In this case, better results were obtained using segmentation of instantaneous measurements Fj.
As the final results showed, it was possible to obtain a high accuracy of Ti measurement and what is very important at a low level of FHR signal loss. However, we are far from claiming that the proposed algorithms are ready-to-use as completely reliable solutions. The work is primarily of a research nature, thanks to which it was shown that it is not the Doppler ultrasound signal itself (as representing less accurate mechanical activity of the fetal heart), that is the source of errors, but the methods of its analysis used.
The problem of determining the fetal heart rate in the form of time series of events for the US signal, which source was the second-generation HP 8040A monitor, was raised by Peters [
66]. He applied a low-pass filter with a cutoff frequency of two hertz, and then the obtained markers representing the subsequent fetal heartbeats were used to separate fragments of an envelope signal, for which the Ti intervals were determined using AF. The 100-s signal was processed, and the results were referred to a direct FECG. The correlation coefficient between FHR signals was high and reached 0.977. Unfortunately, there was no assessment of the interval measurement accuracy, only a graph illustrating differences between the measurements for both methods against their means was provided.
A detailed assessment of an interval measurement error and STV indices determination based on an US signal analysis is provided in [
67]. Referring to a simultaneously recorded FECG signal, in the US signal an AF window with the width of two full cardiac cycles was positioned. The tests concerned different values of window shift relative to an initial position, determined by the occurrence of the R wave. The minimum error
was 1.59 ms, while the minimum error of the S_DAW index was 7.12%. Unfortunately, the work has not any practical value because the US signal analysis requires to be supported by the FECG signal.
Jezewski also dealt with analysis of raw US signal to determine FHR in the form of time series of events [
36]. The signals were recorded during labor using a prototype instrumentation, while the reference was a direct FECG signal. An AF window was used, whose width and step were changed adaptively depending on the last determined interval. The best results were obtained for a window width equal to twice the width of a previous interval, and for a step ensuring five measurements of each interval. The mean error
was 1.91 ms, with FHR signal loss of 1.6%. In addition, an error of −6.9% was determined for the S_HAA variability index.
Although investigations [
36] were conducted using other database than in this study, in both cases a direct FECG was the reference method, and the number of analyzed intervals was similar—8945 against 7498 intervals examined by the authors. Therefore, the statistical significance of both studies represents similar level. This, in turn, allows to state that the results obtained in this work show higher accuracy of periodicity measurement and lower level of FHR signal loss. For the D2 method, the error in interval measuring was only 1.35 ms, with a signal loss of 0.41%, which confirms a high efficiency of the developed solution. However, lower interval error did not lead to decrease of the S_HAA variability index error, as it was −8.9%, while in [
36] only −6.9%. Unfortunately, in [
36] only S_HAA index was analyzed, whereas it is obvious that various indices react in different ways to the error in measuring the interval.