Approach 1 deals with cracks, breakouts, and rolling contact fatigue, as these effects occur relatively often in the area around frogs. High-definition images combined with image recognition algorithms are one way to detect effects of this kind. Refs. [25,26] provide examples of the possibilities of the technology. However, image recognition alone is not expected to be sufficient, as not all cracks are visible from above, so ultrasonic testing will still be required in the future. While all the other proposed quality indicators will be evaluated in detail, image recognition is not part of this study. The six approaches proposed are interrelated and need to be analysed together. However, each approach requires different data sources and data processing. Therefore, each approach is described in detail and finally summarised in a common datasheet. Since track geometry is the most widely used indicator in track quality assessment, we selected it as the first approach.
2.1. Track Geometry Assessment
Due to the dynamic impact loads, the frog area is susceptible to track geometry failures [11]. Track geometry in vertical direction is best described by the longitudinal level in the D1 range as defined by EN 13848-1 [27]. Most infrastructure managers worldwide measure this parameter regularly using dedicated measurement cars either based on an inertial unit or a chord-based system in order to guarantee the safe geometrical condition of the track. These measurements are also available in turnouts. Following appropriate data alignment, they can be used to monitor isolated track geometry faults at the frog.
Figure 2 illustrates the raw signal of longitudinal level and its deterioration over time for one of the analysed turnouts.
The upper half of the figure shows the longitudinal level expressed spatially. The frog area is highlighted by a grey rectangle. An isolated failure of the track geometry (local settlements) is clearly visible in this area. The lower half shows the deterioration of the longitudinal level of the highlighted area. To perform time series analysis, we utilised the maximum zero deviation of the longitudinal level as the quality indicator, as defined by regulatory guidelines. As the indicator demonstrates, local settlements increase over time. Historical maintenance actions are illustrated as vertical dashed lines. Tamping in 2011 had a positive effect on the longitudinal level of the entire turnout, but as there were no severe isolated failures in the frog area, tamping had no effect on the frog. By the end of 2017, no significant reduction in failures had been achieved, but the system had stabilised to some extent, with a lower rate of deterioration after tamping than before. In 2021 and 2022, only local measures were carried out on the frog instead of tamping the whole turnout. No data is available on the type of maintenance carried out, but this example is consistent with experience and shows that local measures are rarely sustainable.
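The indicator behind the time series can be sketched as follows. The sketch assumes that the maximum zero deviation is the largest absolute deviation of the longitudinal level from the zero line inside the frog window; the function name, the window limits, and the sample values are hypothetical and not taken from the study.

```python
def max_zero_deviation(positions_m, level_mm, start_m, end_m):
    """Largest absolute longitudinal-level value inside [start_m, end_m].

    Assumption: 'zero deviation' is measured from the zero line of the
    already filtered (D1) longitudinal level signal.
    """
    window = [abs(z) for x, z in zip(positions_m, level_mm)
              if start_m <= x <= end_m]
    if not window:
        raise ValueError("no samples inside the frog window")
    return max(window)


# Hypothetical measurement run: positions in m, longitudinal level in mm.
positions = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
level = [0.5, -1.0, -4.2, 3.1, -0.3, 6.0]
indicator = max_zero_deviation(positions, level, start_m=1.0, end_m=4.0)
```

Evaluating this function for every measurement run of the same aligned frog window would yield one point per run, which is the kind of series plotted against time.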
The time series displays a recurring pattern of tamping actions in turnouts: tamping typically enhances the track geometry of a turnout, though it is not always effective in addressing isolated track geometry failures beneath the frog. There may be two reasons for this. Either the tamping machine used is not capable of sufficiently compacting the ballast under the frog due to geometric constraints, or the ballast under the frog is already severely degraded to the point that the ballast bed cannot support the tamped geometry. A wavelength-based analysis of the longitudinal level can give an indication of the latter possibility and will be addressed in the next approach.
2.3. Consideration of Weld Surface Irregularities
Frogs are commonly welded into the track and are therefore adjacent to welded joints. Irregularities in the rail surface resulting from these welds can give rise to significant dynamic impacts and therefore necessitate monitoring. The effect of weld battering causing growing rail surface irregularities at welds is described in detail in [28]. As shown by previous research, these irregularities can be monitored using the rail surface measurement system. The system uses three vertical distance lasers to build a measuring chord, thereby recording the rail head’s longitudinal profile. An actuator shifts the device laterally to ensure it aligns with the rail head’s centre position. In Austria, this measurement system is mounted onto the measurement cars and therefore provides regular measurement of the longitudinal profile of the rail head in the wavelength span between 20 mm and 1000 mm.
Figure 5 pictures the welds of a frog and depicts relevant rail surface signal characteristics.
While the signal output is unstable due to the crossing nose gap, the evaluation of several turnouts shows that the signal is stable at the position of the welded joints close to the frog. Ramp angles are the best way to quantify rail surface irregularity, as they include both the amplitude and the longitudinal extent of an irregularity in the calculation. Ramp angles represent the sum of the linearised slopes of a V-shaped irregularity. The advantage of ramp angles over maximum amplitudes alone is that dynamic vehicle responses depend on a combination of amplitude and wavelength; ramp angles therefore better represent the vehicle response induced by the irregularity [29]. Calculating the ramp angle for every measurement run makes it possible to depict its evolution over time (Figure 6).
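A minimal sketch illustrates why the ramp angle carries more information than the amplitude alone. It assumes an idealised V-shaped dip described only by its depth and the lengths of its falling and rising flanks; the function name and the numbers are hypothetical.

```python
import math


def ramp_angle_mrad(depth_mm, falling_length_mm, rising_length_mm):
    """Sum of the two flank angles of an idealised V-shaped dip, in mrad."""
    alpha_in = math.atan(depth_mm / falling_length_mm)   # falling flank
    alpha_out = math.atan(depth_mm / rising_length_mm)   # rising flank
    return 1000.0 * (alpha_in + alpha_out)


# Two dips with the same 0.5 mm depth but different longitudinal extent.
short_dip = ramp_angle_mrad(0.5, 50.0, 50.0)
long_dip = ramp_angle_mrad(0.5, 250.0, 250.0)
```

Both dips have the same maximum amplitude, yet the short one produces a ramp angle roughly five times larger, which is the distinction an amplitude-only indicator would miss.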
For the first turnout (upper graph in Figure 6), there is a clear linear deterioration trend for both welds from 2012 to 2020. The rate of deterioration is the same for both welds; however, the weld on the side of the wing rail started with a larger ramp angle in 2012. In 2020, the frog and therefore the welds were replaced, resulting in a significant reduction in the ramp angles. After the replacement, the weld on the side of the frog started to deteriorate again. To date, no deterioration has been observed in the wing rail side weld. The second example (lower graph in Figure 6) again shows linear growth of the ramp angles of both welds, but with different growth rates. In 2020, a repair weld was performed which resulted in a reduction in angle for the wing side weld but not for the frog side weld. The worker performing the repair weld apparently only treated one of the welds during the grinding that is part of the repair weld process. In 2023, the frog was replaced, resulting in smooth welds on both sides.
Rail surface measurements make it possible to assess the welds around the frog. Welds in poor condition cause dynamic impacts themselves, but they can also interfere with a smooth wheel transition. Keeping these welds in good condition can therefore contribute to better frog performance. However, compared to the impact caused by the frog itself, the impact caused by these welds is significantly lower. Therefore, methods need to be established that include frog imperfections. Measurements of axle box acceleration can contribute to this assessment.
2.4. Axle Box Accelerations—Direct Assessment
Frogs in poor condition lead to higher vehicle reactions and consequently higher system forces. As axle box accelerations include information on unsprung (unfiltered) wheel vibrations, which are highly dependent on infrastructure condition, they have the potential to provide insights into the condition of passed frogs. There are two issues to consider when using axle box acceleration data: (1) The measured data contain not only information about the track, but also about the condition of the vehicle, in particular, the geometry of the wheel. Simulations performed by [30,31] show that wheel geometry is one of the dominant predictors of accelerations and forces. As the data used here is from the accelerometer mounted on the track-recording car, it can be assumed that the wheels are in good condition, as there are stricter limits and regular inspections for the recording car. (2) Speed has a significant effect on vehicle response. To use axle box accelerations as an indicator of frog condition, it is necessary to eliminate the influence of speed. For this purpose, the method developed by Joanneum Research, which is based on comparing accelerations from multiple turnout passes at different speeds [30], proved most effective. Their study employed a custom-built sensor setup, and the resulting formula was adapted in this work to process acceleration data from Austria’s standard recording car.
The formula from Joanneum Research is adjusted in two ways for this study. The unit of speed is changed from m/s to km/h, as the latter is more commonly used in the railway sector. In addition, a reference speed of 100 km/h is chosen for values in a realistic range. By applying this formula, the influence of speed can be reduced to an acceptable level. Using acceleration data from the standard recording car has the advantage of covering the entire network over approximately 20 years. However, a major limitation arises specifically for the Austrian recording car and does not necessarily apply to recording cars in other countries. As it is processed within the recording car into an indicator used to detect corrugation and then stored, raw data is not available.
Figure 7 provides an example of the stored accelerations for one turnout in the Austrian network.
The two most recent measurement runs are shown in order to demonstrate the stability of the signal characteristic. However, it is clear that high frequency parts of the signal are lost due to the sliding average calculations performed on the recording car. Nevertheless, clear signal characteristics are visible at relevant points. In the transition zone at the front of the turnout, an insulated rail joint and a change in track stiffness lead to relatively high accelerations. The effect of welds is also visible in the data. The highest accelerations are caused by the frog. As the signal has already been processed and does not allow for more sophisticated index calculation methods, root mean square (RMS) and maximum amplitude were analysed as potential quality indicators using different window lengths. The most stable results are obtained using RMS values with a window length of 5 m. In order to ascertain the extent to which the calculated indicator demonstrates good or poor quality, we conducted statistical evaluations. For 56 turnouts, information about frog replacements is provided by OeBB Infrastruktur AG. By assuming a poor quality before and a good quality after the exchange of the crossing, quality areas can be defined based on the historical data.
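The RMS indicator with a 5 m window can be sketched as below. The sketch assumes that the window is centred on the frog position and that the stored signal is sampled along the track; the function name, the centring, and the sample values are assumptions made for illustration.

```python
import math


def windowed_rms(positions_m, accel, centre_m, window_m=5.0):
    """RMS of all acceleration samples within window_m centred on centre_m."""
    half = window_m / 2.0
    samples = [a for x, a in zip(positions_m, accel)
               if abs(x - centre_m) <= half]
    if not samples:
        raise ValueError("no samples inside the window")
    return math.sqrt(sum(a * a for a in samples) / len(samples))


# Hypothetical stored signal sampled every metre, frog at position 5 m.
positions = [float(i) for i in range(11)]
accel = [0.0, 0.0, 0.0, 3.0, -3.0, 3.0, -3.0, 3.0, 0.0, 0.0, 0.0]
frog_rms = windowed_rms(positions, accel, centre_m=5.0)
```

Because the RMS averages over all samples in the window, it is less sensitive to a single outlier than the maximum amplitude, which may be one reason for the more stable results reported above.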
The left box in Figure 8 represents the quality before frog exchanges, which is relatively poor, while the right box represents the quality after frog exchanges, which is relatively good. The respective frogs of the same turnout are connected by a dashed red line. The data shows that almost every replacement resulted in a significant improvement in frog quality, which in turn resulted in lower axle box accelerations. A relatively high deviation can be seen for the quality indicators before the frog replacement, while only a few high values remain after the replacement. There are three possible explanations for this: (1) It is possible that several frogs were replaced due to rail defects and cracks. While these are valid reasons for replacement, they are not necessarily the cause of increased accelerations. (2) The data points represent the last measurement run prior to replacement. It is common practice to carry out repair welding before replacing a frog. This may result in slightly lower accelerations for frogs where repair welding is effective. (3) Replacing the frog without addressing the bedding quality may result in increased accelerations after frog replacement.
Despite the variations observed, it is clear that there is a distinction between the boxes. These results are used to define quality ranges (right side of the plot). Four quality ranges are defined, representing ‘very good’, ‘good’, ‘moderate’, and ‘poor’ quality. Values up to 2.89 are indicative of very good quality, as only frogs that have been replaced have reached this level. The range for good quality (2.89–5.09) is defined by the median of the accelerations before frog replacement. Values up to the third quartile of pre-frog-change accelerations are defined as moderate quality, while higher values are defined as poor quality. It should be noted that poor quality does not necessarily equate to safety issues, but it may be advisable to avoid accelerations of this magnitude in the long term, as high dynamics can lead to faster deterioration of the system. In Figure 9, a time series of the described indicator is depicted. The quality areas defined are included for an easier interpretation.
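The quality ranges can be expressed as a simple classification. In this sketch the limits 2.89 and 5.09 are taken from the text, while the third quartile is not stated numerically and is therefore computed from a hypothetical sample of pre-replacement values; the function names and the treatment of boundary values as inclusive are assumptions.

```python
import statistics


def third_quartile(values):
    """Third quartile using the default (exclusive) method of the statistics module."""
    return statistics.quantiles(values, n=4)[2]


def classify_frog_quality(value, moderate_max, very_good_max=2.89, good_max=5.09):
    """Map an RMS indicator value to one of the four quality ranges."""
    if value <= very_good_max:
        return "very good"
    if value <= good_max:
        return "good"
    if value <= moderate_max:
        return "moderate"
    return "poor"


# Hypothetical indicator values measured before frog replacements.
before_replacement = [3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
q3 = third_quartile(before_replacement)
labels = [classify_frog_quality(v, q3) for v in (2.0, 4.0, 6.0, 9.0)]
```

Deriving the upper limit from the data rather than fixing it keeps the classification reproducible when the sample of replaced frogs is extended.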
As axle box acceleration data is available for both rails, the indicator is calculated for the frog side and the guard rail side of the turnout. Darker crosses represent the frog side, which shows higher values than the guard rail side. This is to be expected, as the wheel passing the frog is more directly affected by the frog geometry. While there is a large scatter despite the harmonisation of the measurement speed, there is a trend of increasing accelerations over time from 2012 to 2020. In 2020, the frog was repair welded, which significantly reduced the acceleration (from the red ‘poor’ range to the green ‘good’ range). However, due to further wear of the crossing, the accelerations have increased again since 2020, indicating the need for further maintenance in the near future. While condition information is contained in the pre-processed acceleration data from the Austrian track-recording car, in-depth evaluations are limited, and detailed conclusions cannot be drawn. Since some track-recording cars and standard locomotives measure and store raw acceleration data, approaches using these data are meaningful and will be discussed in the next section.
2.5. Axle Box Accelerations—Assessment Based on Dynamic Loads
In Approach 5, a quality indicator for the frog is derived directly from axle box acceleration measurements. Approach 6, however, uses axle box accelerations to approximate the vertical wheel trajectory while passing a crossing and incorporates the calculated ramp angle into an analytical model for computing dynamic impact loads. For this purpose, we use axle box acceleration measurements from two regular locomotives equipped with vertical accelerometers. As noted previously, the standard track-recording car used in Austria does not provide raw acceleration measurements. Nevertheless, there is an increasing trend among infrastructure operators to record axle box accelerations using standard measurement vehicles, making raw axle box accelerations representative of typical measurement data. By integrating axle box accelerations and implementing a suitable filter, the vertical movement of the wheel over a given distance can be determined. Filtered accordingly, these movements approximate longitudinal level [32]. Two aspects must be considered. (1) In order to nullify the impact of the measurement speed (in this case the speed of the regular locomotive), the double-integrated axle box acceleration has to be divided by the squared measurement speed. (2) As integration results in long-wave drifts of the signal due to measurement noise, a method must be implemented which deals with the noise. This phenomenon is well known and often discussed in the literature [33]. While different approaches for eliminating drifts are suggested in the literature, a simple filter method is used here. After the first integration step, a Butterworth filter is used, allowing only short wavelengths. The second integration uses the filtered signal as input and returns the displacement of the wheel, which approximates longitudinal level. For validation, the double-integrated axle box acceleration signal is filtered to the D1 wavelength range (3–25 m) and compared to longitudinal level measurements from the track-recording car (Figure 10).
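The processing chain described above (speed normalisation, first integration, high-pass filtering, second integration) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the processing used in the study: the signal is assumed to be sampled at a constant spacing along the track at constant speed, the filter is a second-order Butterworth high-pass applied in a single causal pass, and the filter order, function names, and default cut-off wavelength are hypothetical.

```python
import math


def butterworth_highpass(signal, cutoff_wavelength_m, dx_m):
    """Second-order Butterworth high-pass over distance (single causal pass)."""
    k = math.tan(math.pi * dx_m / cutoff_wavelength_m)
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b0, b1, b2 = norm, -2.0 * norm, norm
    a1 = 2.0 * (k * k - 1.0) * norm
    a2 = (1.0 - math.sqrt(2.0) * k + k * k) * norm
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in signal:
        y0 = b0 * x0 + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


def cumulative_trapezoid(values, dx):
    """Running trapezoidal integral, starting at zero."""
    total = 0.0
    out = [0.0]
    for prev, cur in zip(values, values[1:]):
        total += 0.5 * (prev + cur) * dx
        out.append(total)
    return out


def wheel_trajectory(accel_ms2, speed_ms, dx_m, max_wavelength_m=25.0):
    """Approximate vertical wheel displacement (m) over distance."""
    # Dividing by v^2 converts acceleration over time into curvature over distance.
    curvature = [a / speed_ms ** 2 for a in accel_ms2]
    slope = cumulative_trapezoid(curvature, dx_m)
    # Remove long-wave drift before the second integration.
    slope = butterworth_highpass(slope, max_wavelength_m, dx_m)
    return cumulative_trapezoid(slope, dx_m)
```

A zero-phase (forward and backward) filter and an additional lower wavelength limit would be natural refinements; they are omitted here to keep the sketch short.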
The black line represents the longitudinal level as measured by the track-recording car using an IMU. The coloured lines represent the double-integrated axle box accelerations, filtered to the wavelength range of the longitudinal level (3–25 m). Four measurements from two vehicles are available for comparison. The comparison shows that three of the acceleration signals fit the longitudinal level quite well after double integration. One measurement, the red signal, shows some unexplained deviations. Further applications of the method show that double integration is a reliable approach to approximating the longitudinal level. However, as demonstrated in Figure 10, there are exceptions that warrant further investigation in future research.
The advantage of using double-integrated axle box acceleration data instead of longitudinal level is that it contains a wide frequency spectrum. Filtering to 3–25 m can be used for validation purposes, but the inclusion of shorter wavelengths can provide additional information on vertical track geometry. This is particularly useful for the frog area, as the inclusion of shorter wavelengths makes it possible to visualise the actual trajectory of the wheels.
Figure 11 compares longitudinal level with double-integrated axle box accelerations, including shorter wavelengths.
The red line represents the double-integrated acceleration data, including wavelengths from 1 to 25 m. Due to the inclusion of shorter wavelengths, a deeper peak is depicted in the data. This peak corresponds to the short-wave wheel drop in the frog area and is not detectable in the longitudinal level as it includes longer wavelengths only. The use of double-integrated axle box accelerations therefore adds information that can be used for evaluation. However, the results are affected by the wavelength range chosen. This is illustrated in Figure 12.
The figure illustrates the vertical wheel trajectory obtained through double integration of axle box accelerations, with lower wavelength limits varying between 0.01 m and 3 m. As the minimum wavelength decreases, the depth of trajectory peaks increases, although this effect is limited. A focused analysis of three frogs, shown on the right side of Figure 12, reinforces this observation, indicating that wheel drop amplitudes tend to grow with shorter wavelengths. However, this trend levels off within the minimum wavelength range of 0.1 to 0.25 m. Wavelengths shorter than 0.1 m do not seem to contribute to the wheel drop and are therefore not relevant for the load modelling. While this is not the case for the data analysed, data instabilities are more likely for signals including shorter wavelength fractions. Given the lack of additional information within wavelengths below 0.1 m and the increased risk of data instability, data is filtered to 0.1–25 m.
The severity of the wheel drop provides an indication of the condition of the frog. Worn frogs result in more severe wheel drops. More severe wheel drops are associated with higher dynamic impact loads that damage components. Monitoring the wheel drop and limiting it to a certain level could therefore lead to more sustainable frog performance. By incorporating track design and properties of passing vehicles, condition monitoring can be further specified for the respective line. For that, we utilise the analytical approach for the calculation of dynamic impact loads based on Jenkins [29]. Impact loads are typically characterised by two force peaks, named P1 and P2. While P1 acts directly after the wheel is stimulated by an irregularity (the crossing nose dip) and with very high frequencies around 1000 Hz, P2 acts several milliseconds later and with frequencies between 20 and 100 Hz. As P2 transfers more energy into the system and occurs at frequencies that lead to increased ballast degradation and track geometry deterioration, it is the more critical one.
P0 Static wheel load [N]
2α Ramp angle of the irregularity [rad]
V Vehicle speed [km/h]
mu Unsprung mass per vehicle wheel [kg]
mt Effective vertical track mass per vehicle wheel [kg]
Ct Effective track damping per vehicle wheel [Ns/m]
Kt Effective track stiffness per vehicle wheel [N/m]
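For reference, the P2 force in the form commonly attributed to Jenkins can be written with the symbols listed above. This is the widely cited textbook form, given here as an assumption rather than as the exact expression used in this study; it is dimensionally consistent in SI units, so the speed V has to be converted from km/h to m/s before it is inserted.

```latex
P_2 = P_0 + 2\alpha \, V \,
      \sqrt{\frac{m_u}{m_u + m_t}}
      \left[ 1 - \frac{C_t \, \pi}{4 \sqrt{K_t \, (m_u + m_t)}} \right]
      \sqrt{K_t \, m_u}
```

The dynamic increment on top of the static wheel load grows linearly with both the ramp angle and the speed, which is why the ramp angle is the key condition-dependent input.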
The formula comes with input parameters of track (blue) and vehicles (red) which have to be filled with representative values for the respective track–vehicle combination. For track, component-specific input data must be defined. For vehicle parameters, the Austrian standard universal locomotive is used as the reference vehicle. Vehicle speed is defined by the permitted speed of the track; 140 km/h was chosen for the frog analysed. The ramp angle 2α is derived from two data sources: (1) double-integrated axle box acceleration measurements, as described above, and (2) geometry information from handheld tools. The geometry data was provided by a project partner and obtained from rail profile measurements taken at several positions along the longitudinal extent of the frog. The measurement principle is described in detail in [34]. Ossberger also developed a method to simulate the wheel trajectory of an ORE-S1002 wheel passing the measured frog, which is likewise described in [34].
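A short numerical sketch shows how the listed inputs enter a P2 calculation. It assumes the commonly cited Jenkins form, P2 = P0 + 2α·V·sqrt(mu/(mu+mt))·[1 − Ct·π/(4·sqrt(Kt·(mu+mt)))]·sqrt(Kt·mu) with V in m/s; the function name and all parameter values are hypothetical and are not the track or vehicle data used in this study.

```python
import math


def jenkins_p2(p0_n, two_alpha_rad, speed_kmh, m_u, m_t, c_t, k_t):
    """P2 force in N, assuming the commonly cited Jenkins form (SI units)."""
    v = speed_kmh / 3.6  # the formula needs m/s
    mass_term = math.sqrt(m_u / (m_u + m_t))
    damping_term = 1.0 - c_t * math.pi / (4.0 * math.sqrt(k_t * (m_u + m_t)))
    stiffness_term = math.sqrt(k_t * m_u)
    return p0_n + two_alpha_rad * v * mass_term * damping_term * stiffness_term


# Hypothetical inputs, chosen only to show the sensitivity to the ramp angle.
params = dict(p0_n=105e3, speed_kmh=140.0, m_u=1900.0, m_t=245.0,
              c_t=55.4e3, k_t=109e6)
p2_good = jenkins_p2(two_alpha_rad=0.005, **params)
p2_worn = jenkins_p2(two_alpha_rad=0.010, **params)
```

Since all other terms stay fixed for a given track and vehicle, doubling the ramp angle doubles the dynamic increment above the static wheel load.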
Figure 13 provides a 2D representation of three frog geometry measurements available for this study.
The left (falling) lines represent the geometry of the wing rail, the right (rising) lines represent the geometry of the frog. The upper curves refer to the first measurement after the turnout was renewed, so the frog can be assumed to be in good condition. During operation, the geometry deteriorates, resulting in the lower curves. It is clearly visible that the ramp angle has increased due to the deterioration process. The frog was replaced after 2018, resulting in the geometry shown by the middle curves. Again, the geometry is more symmetrical and the transition of the wheel is smoother. For further interpretation, the geometries need to be linearised. This can be achieved by using linear regressions described by the coloured lines in Figure 13.
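The linearisation step can be sketched as two least-squares fits, one for the falling wing rail flank and one for the rising frog flank, with the ramp angle taken as the sum of the absolute slopes (small-angle approximation). Splitting the trajectory at its lowest point is an assumption of this sketch, as are the function names and the synthetic profile.

```python
def least_squares_slope(xs, zs):
    """Slope of the ordinary least-squares line through (xs, zs)."""
    n = len(xs)
    if n < 2:
        raise ValueError("at least two points are needed per flank")
    x_mean = sum(xs) / n
    z_mean = sum(zs) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxz = sum((x - x_mean) * (z - z_mean) for x, z in zip(xs, zs))
    return sxz / sxx


def linearised_ramp_angle(xs_mm, zs_mm):
    """Ramp angle 2*alpha in rad from a V-shaped wheel trajectory."""
    lowest = min(range(len(zs_mm)), key=lambda i: zs_mm[i])
    falling = least_squares_slope(xs_mm[:lowest + 1], zs_mm[:lowest + 1])
    rising = least_squares_slope(xs_mm[lowest:], zs_mm[lowest:])
    return abs(falling) + abs(rising)


# Synthetic trajectory: falls at 10 mrad on the wing rail, rises at 20 mrad on the frog.
xs = [10.0 * i for i in range(21)]
zs = [-0.01 * x if x <= 100.0 else -1.0 + 0.02 * (x - 100.0) for x in xs]
two_alpha = linearised_ramp_angle(xs, zs)
```

With measured profiles the flanks are curved rather than straight, so the result depends on how much of each flank is included in the fit.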
For one turnout in the Austrian network, both data sources are available; this turnout is therefore analysed in detail. Additionally, one of the regular locomotives equipped with axle box acceleration sensors was also equipped with a measurement wheel as described in [35] for the measurement of Q forces. Q forces are important for vehicle homologation and therefore well known in the railway sector. Typically, measurements are limited to a frequency of 20 Hz, which is an important aspect in result interpretation.
Figure 14 combines the described methods and depicts calculated and measured forces arising when a locomotive passes the analysed frog.
Local measurements are available between 2012 and 2023; therefore, dynamic forces are presented over time. The blue points represent calculated P2 forces when a locomotive passes through the geometry provided by local measurements. Due to wear of the frog and the wing rail, the linearised ramp angle grows over time, leading to increasing P2 forces. In 2020, repair welding restored the frog geometry and therefore reduced the P2 force level. In mid-2020, in addition to local measurements, ramp angles derived from double-integrated axle box accelerations were available from two locomotive types with two measurements each; they are depicted as red points. Again, the Jenkins model is used for P2 force calculation. As the two locomotives have similar masses, the P2 forces are comparable. Also in mid-2020, Q forces provided by a measurement wheel set were available; they are visualised as black points. As the vehicles passed the frog at a speed of 140 km/h, the P2 force calculation assumed the same velocity. When comparing the force level of the three groups, significant differences appear. Although they all represent dynamic forces, they cannot be compared directly. The lowest force levels (~200 kN) are represented by the measured Q forces. The reason for this is that the measurement principle (wheel set measurement) is limited to frequencies up to 20 Hz. This results in a force signal that filters out high frequency force peaks and therefore reduces the amplitudes of the force peaks, as the highest amplitudes are typically caused by high frequencies on top of lower frequencies. The P2 forces of the Jenkins model represent the 20–100 Hz frequency range. Therefore, Q forces from wheel set measurements clearly underestimate the force peaks relevant to ballast damage.
The intermediate force level (~270 kN) is the result of ramp angles derived from manually measured geometry data and incorporated into the Jenkins formula. Although the critical frequency spectrum is represented, the results still underestimate the actual force level as the manually measured geometry data only represents the unloaded frog geometry. The highest and most realistic force level (~450 kN) is achieved by using double-integrated axle box accelerations as input. In this case, the actual wheel trajectory represents the loaded ramp angle of the frog and includes geometric properties of the metal part and bedding effects.
Finally, the developed indicators are applied in two complementary ways. First, a datasheet integrating the time series of all indicators is presented to provide a detailed assessment of the condition of a single turnout. Such a datasheet can serve as a practical tool for decision-making in operational asset management. Second, indicators from multiple turnouts are analysed to determine typical index values and to examine the distribution of the indicators across a random sample of the main network, allowing for the identification of general trends and patterns.