1. Introduction
Nondestructive testing (NDT) of the surface integrity of machine parts has grown greatly over the last 20 years. Many methods and techniques, based on different physical effects, can be used for NDT (Figure 1). Introducing proper methods and innovative concepts for nondestructive analysis and verification of product quality enables faster and more efficient process control and reduces set-up time, especially when the component type is changed in the production line [1].
One of these NDT methods is Barkhausen noise testing (BNT), which is used to assess changes in the surface layer of ferromagnetic materials, especially changes in hardness and residual stresses. BNT is based on the interaction between an external magnetic field and the ferromagnetic material. The reorganization of the magnetic domains and the formation of an internal magnetic field are registered by the sensor. The magnitude of the registered signal and its parameters depend on many factors. Many of them are uncorrelated, while others are strongly correlated. The set of factors affecting the Barkhausen signal comprises more than 200 components, including interactions between factors (Table 1). The combination of all these factors determines the material response to external magnetization.
The basic equipment for generating and measuring Barkhausen noise (BN) consists of a measuring device with a sensor and a connecting cable (Figure 1).
A BN analyzing instrument must both generate the applied magnetic field and direct it into the material, and pick up and present the BN signal generated by the material. BN sensors are used for that purpose. A BN sensor normally has two primary functions. First, it applies an external magnetizing field that penetrates the surface of the material to be analyzed. This requires a magnetic yoke with two magnetic pole pieces. The orientation of the magnetizing pole pieces determines the direction of the applied field and thus the measurement direction of the BN signal. Second, it detects the BN signal. The most common detector is a sensing coil that can be tailored to the analyzing frequency. Alternatively, a Hall element can be used. It is also possible to use an external magnetic yoke to generate the applied magnetic field together with a separate pick-up sensor for the BN signal. The available BN sensor types are often classified into two groups: surface-specific sensors (external and internal) and product-specific sensors. The surface-specific sensors have a broader range of application and are divided into general purpose sensors, flat sensors, outer diameter (OD) sensors, and inner diameter (ID) sensors. The product-specific sensors are designed for specific components, such as camshafts, crankshafts, or gears.
The generated BN signal needs to be picked up by a sensing coil or element, amplified, filtered, and presented. The signal can be presented either as numbers on a display or as an oscilloscope trace that can be stored and analyzed further. The BN signal is analyzed by computing features that represent, for example, Barkhausen activity and the shape and position of the BN envelope. BN is a stochastic phenomenon, and thus only averaged properties are reproducible. Good averaging requires a high sample rate and the acquisition of time-related data over a specified number of magnetizing cycles, typically 10 BN bursts. The BN signal is acquired over a wide analyzing frequency range, for example, 1–1500 kHz, which makes it possible to select different or narrower analysis bands later. Typical features are the root-mean-square (RMS) value, peak height, peak width, and peak position of the signal. Both the pulse-like noise signal and the envelope of the BN burst can be analyzed. The amplitude spectrum and pulse height spectrum can also be studied to obtain information on the material properties.
A challenge with BNT is that the measured values are not reproducible but depend on the measurement arrangement. The sensor, the measurement parameters, the signal processing parameters, and issues related to the person performing the measurement may all influence the result. In this study, interlaboratory proficiency testing was carried out to evaluate whether participant-related issues are significant. Three laboratories with similar equipment performed BN measurements on two sets of samples. The samples were ground with different grinding parameters to obtain changes in the BN response. A standard analysis of variance (ANOVA) was carried out to distinguish between the effects of the grinding parameters and the effect of the measurer.
This study also evaluated how measurement uncertainty decreases as the BN measurement is repeated. For this purpose, the uncertainty was computed as a function of the number of repetitions. Two uncertainty indices were computed: the first represents the average expected uncertainty, while the second represents the worst-case scenario of maximum uncertainty. This study highlights the importance of repetitions for drawing valid conclusions.
2. Proficiency Testing
Proficiency testing is usually an essential activity of testing laboratories, and it has become a mandatory requirement for laboratory accreditation. The testing ensures that the adopted statistical methods are fit for their intended purposes [2]. Generally, the proficiency testing scheme is first described according to the intended objective and purpose of the study. Then, the statistical test plan with its methods is carried out. The last phase is to evaluate the results from the individual test laboratories (performance evaluation).
The samples can either be tested by each laboratory on its own following given instructions [2], or a set of samples prepared by one party can be distributed to the laboratories [3]. We studied pressure vessel samples with different annealing treatments (thermal degradation) as one project partner in a round-robin BNT study.
BNT itself is affected by many factors: the equipment [4], the sensor design [5], the participant [6], the way the measurement is carried out [7], and the software used. Generally, gauge repeatability and reproducibility (GR&R) tests are used to study measurement variations and their causes, which include the effects of the participant, the equipment, and the way the measurements are carried out. We studied, among other things, the use of different statistical calculation tools for testing the repeatability and reproducibility of BNT equipment in a quality check before sending the equipment to customers.
Fewer BNT-based round-robin studies have been carried out than X-ray-diffraction-based residual stress round-robin studies, such as those reported in [8] and [9]. As early as 1977, a round-robin study was performed in which railroad wheels were measured and their residual stresses were evaluated with different methods, including the BN method [10]. That study involved several research institutes, but the BNT measurements were carried out by only one of them. The outcome of the BNT measurements was compared with the results obtained with the other methods.
Takahashi’s group studied the degradation of ferromagnetic materials with BNT [11]. Their round-robin studies in the Universal Network for Magnetic NDE (Non-Destructive Evaluation) concentrated on evaluating the measurement technique to support the standardization of magnetic BNT. However, the BN results showed considerable disagreement among the participating groups, and the most likely reason was stated to be the differences in the measurement techniques.
The study and analysis of differently ground samples was the objective of this interlaboratory round-robin test that involved researchers from three different laboratories in Sweden and Finland. The first part of the experiment was to prepare two batches of ground samples in different facilities with similar grinding parameters. The second part of the experiment consisted of an interlaboratory round-robin comparison carried out with the magnetic Barkhausen noise method. The main tasks were (1) interlaboratory comparison and (2) evaluation of the effect of grinding on Barkhausen noise features.
3. Materials and Methods
In this experiment, the near-surface influence on magnetic BNT was investigated by grinding hardened specimens of various hardness values with different abrasives and intensities.
3.1. Design of Experiment
We implemented a full factorial experiment design with three repetitions [12]. Three factors were chosen: hardness, abrasive, and intensity, each at two levels. That gave $k \cdot 2^{p}$ experiments, where $k$ is the number of repeated experiments and $p$ is the number of factors studied. This gave $3 \cdot 2^{3} = 24$ experiments. Furthermore, the experiments were repeated for two sets of samples, giving a total number of 48 experiments. Running the full experiment design with all possible factor combinations meant that all of the main and interaction effects could be estimated. For three factors at two levels, this meant three main effects, three two-factor effects, and one three-factor effect. This combination is described by the following model:

$$y = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3 + \alpha_{12} x_1 x_2 + \alpha_{13} x_1 x_3 + \alpha_{23} x_2 x_3 + \alpha_{123} x_1 x_2 x_3 \quad (1)$$

where $y$ is the response and $x_1$, $x_2$, and $x_3$ are the levels of the three factors. This model allows estimation of all α coefficients and the analysis of significance of the terms. The design applied can be improved by adding at least three center point runs.
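As an illustration of the design size, the following minimal sketch enumerates a replicated two-level full factorial design and lists the terms of the corresponding full model. The factor names follow the text; the level labels for intensity and the function names are assumptions made for this example only.

```python
from itertools import combinations, product

# Factor names follow the text; "low"/"high" are illustrative level labels.
FACTORS = {
    "hardness": ("hardened", "tempered"),
    "abrasive": ("B126", "B181"),
    "intensity": ("low", "high"),
}

def full_factorial(factors, repetitions):
    """Return one dict per run: every level combination, repeated."""
    names = list(factors)
    runs = []
    for levels in product(*(factors[name] for name in names)):
        for rep in range(1, repetitions + 1):
            run = dict(zip(names, levels))
            run["repetition"] = rep
            runs.append(run)
    return runs

def model_terms(names):
    """Intercept, main effects, and all interaction terms of the full model."""
    terms = ["intercept"]
    for order in range(1, len(names) + 1):
        terms += ["*".join(combo) for combo in combinations(names, order)]
    return terms

runs = full_factorial(FACTORS, repetitions=3)
print(len(runs))                   # 3 * 2**3 = 24 runs per sample set
print(2 * len(runs))               # 48 runs for the two sample sets
print(model_terms(list(FACTORS)))  # 1 intercept + 3 + 3 + 1 effect terms
```

The eight model terms match the count given in the text: one intercept, three main effects, three two-factor effects, and one three-factor effect.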
3.2. Materials
Round bar samples manufactured from 20MnCrS5+A steel (Table 2) with a diameter of 40 mm and a height of 35 mm were used in this study. The samples were case hardened by carburizing and oil quenching. The total hardening depth was at most 1.2 mm. In total, 56 samples were prepared and divided into two batches for further processing. Half of the carburized samples were also tempered at 180 °C for 1.5 h. After the heat treatments, the samples were ground according to a carefully planned grinding procedure.
3.3. Grinding Plan
Two different grinding batches were prepared in different grinding facilities. In total, three variables were changed in the grinding, as shown in Figure 2.
The parameters of the grinding experiment are presented in Table 3. All of the specimens were case-carburized to a hardness of 63 HRC. Half of the hardened specimens were tempered at 180 °C for 1.5 h to differentiate the hardness. In total, 48 samples were studied, 24 for each batch.
The CBN grinding wheel grain size was either B126 or B181 (Ilyich Abrasive Company, Saint Petersburg, Russia). The grinding intensity was varied by removing a total stock of 0.6 mm in either four or eight steps (batch #1) or in three or six steps (batch #2). The grinding wheel speed was 35 m/s, and the grinding table speeds were 8 m/min (batch #1) and 10 m/min (batch #2). The flow of the cooling fluid (5% water emulsion) was 15 L/min. Four samples were ground by the normal grinding procedure, of which two were in a hardened condition and two were tempered. Normal grinding was carried out with a B126 grinding wheel with a high number of steps.
3.4. Measurements
The Barkhausen noise analyzer Rollscan 300 (Stresstech Oy, Vaajakoski, Finland) was used in each laboratory with a similar type of sensor (S1-16-13-01) for flat surfaces (Figure 3). Two of the laboratories used the same sensor, serial number S6387, and the third laboratory used a sensor with the serial number S7582. The sensors differed in the number of coil turns. The size of both sensors was 18 × 20 mm. The measurements were carried out with Microscan software (Stresstech Oy, Vaajakoski, Finland), which records the Barkhausen noise signal and the magnetizing signal. The sweep method was used to determine the measurement parameters (voltage and frequency) [13]. The magnetizing voltage was 5 V peak-to-peak (Vpp), and the magnetizing frequency was 80 Hz. It is worth noting that the magnetic field was not of interest in this study. The analyzing frequency range was 70–200 kHz. In total, 10 repetitions were carried out in two directions: the grinding direction and perpendicular to the grinding direction. The analysis was carried out perpendicular to the grinding direction, as is the standard procedure for stress and hardness change evaluation. A moving average was used to smooth the signal, and polynomial fitting was used for the peak calculation. The direct results obtained from the Microscan software were used in the data comparison.
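To make the signal features concrete, the sketch below computes an RMS value and a moving-average envelope for a sampled burst. It is not the Microscan implementation: the function names, the window length, and the synthetic burst are assumptions, and the peak is located with a plain maximum of the smoothed envelope instead of the polynomial fit mentioned above.

```python
import math

def rms(signal):
    """Root-mean-square value of a sampled signal."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))

def moving_average(values, window):
    """Simple moving average; returns len(values) - window + 1 points."""
    if not 1 <= window <= len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    smoothed = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]
        smoothed.append(running / window)
    return smoothed

def envelope_peak(signal, window):
    """Smooth the rectified signal and return (index, height) of its maximum."""
    envelope = moving_average([abs(v) for v in signal], window)
    index = max(range(len(envelope)), key=envelope.__getitem__)
    return index, envelope[index]

# Synthetic, purely illustrative burst: an oscillation under a Gaussian envelope.
burst = [math.sin(0.9 * i) * math.exp(-((i - 50) / 12) ** 2) for i in range(100)]
print(rms(burst), envelope_peak(burst, window=9))
```

The returned index refers to the start of the averaging window, so the peak position in the original sample axis is offset by roughly half the window length.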
3.5. Participating Laboratories
Both industry and university laboratories in Sweden and Finland were involved in the study. The participating laboratories were as follows: Kungliga Tekniska Högskolan (KTH Royal Institute of Technology, Stockholm, Sweden) in collaboration with Scania CV AB (Södertälje, Sweden), Stresstech Oy (Vaajakoski, Finland), and Tampere University of Technology (now Tampere University, Tampere, Finland). All participating laboratories used equipment from Stresstech Oy, Vaajakoski, Finland.
3.6. Datasets
Two sets of ground samples were prepared according to the full factorial design of experiments. Ten repetitions of BN measurements were carried out for each sample by the three laboratories. The measurement device calculated certain features of the BN signal. From these, the traditional RMS value together with the peak height, position, and width of the BN envelope were used. Thus, the data comprised 1440 rows (48 samples × 10 repetitions × 3 laboratories), divided into two separate datasets. Grinding burns were observed in two samples of dataset 1. The data from these samples were removed, as suggested in [14].
One of the challenges with Barkhausen noise measurement is that the measured values may not be reproducible but depend on the measurement arrangement (participant, sensor, etc.) [15]. Figure 4 shows a box plot of dataset 1, which demonstrates that the absolute values were not reproducible. Thus, the so-called z-scores [16] were calculated independently for each participant with:

$$z = \frac{x - X}{\hat{\sigma}} \quad (2)$$

where $x$ is the measured value, $X$ is the assigned value, and $\hat{\sigma}$ is the estimated standard deviation. Both values were calculated independently for each participant. The z-scores were used in the analysis instead of the direct measurement results. The analysis of variance used the average values from 10 repetitions, while the individual repetitions were used in the analysis of uncertainty.
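The per-participant standardization can be sketched as follows. The text does not state how the assigned value and the standard deviation were estimated, so this example assumes the participant's own mean and sample standard deviation; the laboratory names and readings are hypothetical.

```python
from statistics import mean, stdev

def z_scores_by_participant(measurements):
    """Map each participant to the z-scores of that participant's own values."""
    scores = {}
    for participant, values in measurements.items():
        assigned = mean(values)   # assumption: assigned value = participant mean
        sigma = stdev(values)     # assumption: sample standard deviation
        scores[participant] = [(v - assigned) / sigma for v in values]
    return scores

# Hypothetical RMS readings of the same three samples from two laboratories.
data = {
    "lab_A": [100.0, 120.0, 140.0],
    "lab_B": [50.0, 60.0, 70.0],
}
z = z_scores_by_participant(data)
print(z["lab_A"])  # [-1.0, 0.0, 1.0]
print(z["lab_B"])  # [-1.0, 0.0, 1.0]
```

Although the absolute levels of the two hypothetical laboratories differ by a factor of two, their z-scores coincide, which is why the standardized values can be compared across participants.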
3.7. Analysis of Variance
ANOVA is a statistical testing scheme where grouped measurements are compared with each other. The null hypothesis of the testing scheme is either rejected or not rejected based on a statistic calculated from the experimental data. Usually, the null hypothesis states that all the groups are random samples from the same population. The null hypothesis is rejected if the calculated p-value is lower than a predefined α-risk level. The α-risk is the probability of a type I error (false positive), in which a true null hypothesis is falsely rejected.
Depending on the data, one- or two-factor analysis can be applied. Furthermore, the computational procedure differs if the data include repeated measurements. ANOVA employs the F-test to determine the test result. The test statistic is computed by first calculating the sum of squares (SS) for the grouped measurements. Dividing the SS by its degrees of freedom gives the mean square. The mean square of the grouped measurements is divided by the within-group mean square to obtain the F-test statistic. The within-group mean square is an estimate of the variance of measurements made under the same conditions. The computed statistic is compared with the reference value, and the p-value is computed to determine whether the null hypothesis is rejected. The reference value depends on the α-risk level and the degrees of freedom of the variance estimates [16].
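The chain from sums of squares to the F-test statistic can be sketched for the simplest one-factor case. The function name and the example groups are hypothetical and are not taken from the study data.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-factor ANOVA."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Sum of squares between the groups and within the groups.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    # Mean squares: each sum of squares divided by its degrees of freedom.
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within

# Hypothetical z-scores for three groups with three measurements each.
f_stat, df1, df2 = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat, df1, df2)  # 13.0 with 2 and 6 degrees of freedom
```

For these degrees of freedom, standard F tables give a reference value of about 5.14 at an α-risk of 0.05, so the null hypothesis would be rejected in this hypothetical example.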
3.8. Computational Procedure
The standard ANOVA was applied to the z-scores to determine if the grinding parameters or the appraiser influenced the measured BN feature. Each experiment was repeated three times, and thus, a two-factor ANOVA with replication was applied. As mentioned above, the null hypothesis states that the grouped measurements are samples from the same population and, thus, the factor has no effect on the measured BN feature. It is rejected if the observed F-test statistic is greater than the reference value. The analysis of variance was carried out only for the measured RMS values. The equations for ANOVA are not presented here but can be found, for example, in [16].
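For orientation, a minimal sketch of a balanced two-factor ANOVA with replication is given below. It uses the textbook sums of squares; the interpretation of the two factors as grinding condition and appraiser, the function name, and the numbers are assumptions for illustration only.

```python
def two_factor_anova(cells):
    """F statistics for a balanced two-factor design with replication.

    cells[i][j] holds the replicate values for level i of factor A
    (e.g., grinding condition) and level j of factor B (e.g., appraiser).
    """
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    cell_mean = [[sum(c) / n for c in row] for row in cells]
    a_mean = [sum(row) / b for row in cell_mean]
    b_mean = [sum(cell_mean[i][j] for i in range(a)) / a for j in range(b)]
    grand = sum(a_mean) / a

    ss_a = b * n * sum((m - grand) ** 2 for m in a_mean)
    ss_b = a * n * sum((m - grand) ** 2 for m in b_mean)
    ss_ab = n * sum(
        (cell_mean[i][j] - a_mean[i] - b_mean[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    ss_error = sum(
        (v - cell_mean[i][j]) ** 2
        for i in range(a) for j in range(b) for v in cells[i][j]
    )

    df_a, df_b = a - 1, b - 1
    df_ab, df_error = df_a * df_b, a * b * (n - 1)
    ms_error = ss_error / df_error  # within-cell variance estimate
    return {
        "A": (ss_a / df_a) / ms_error,
        "B": (ss_b / df_b) / ms_error,
        "AxB": (ss_ab / df_ab) / ms_error,
    }

# Hypothetical 2 x 2 design with two replicates per cell.
f_values = two_factor_anova([[[1, 3], [3, 5]], [[5, 7], [7, 9]]])
print(f_values)  # {'A': 16.0, 'B': 4.0, 'AxB': 0.0}
```

Each F value would then be compared with the reference value for its own degrees of freedom, as described above.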
Barkhausen noise is a stochastic phenomenon, and thus, only averaged properties are reproducible. The measurement therefore needs to be repeated in order to draw reliable conclusions. The reliability of repeated measurements is assessed through uncertainty. In this study, uncertainty was computed to evaluate a feasible number of repetitions. Uncertainty was obtained as the standard deviation of the mean given by [14,15]:

$$u = \frac{s}{\sqrt{k}} \quad (3)$$

where $s$ is the standard deviation calculated from $k$ measurements, with $k$ ranging from 2 to 10. However, there are $\binom{10}{k}$ possible combinations of measurements depending on the value of $k$. In this study, the standard deviation was computed for every combination, and two uncertainty values were derived from them. The first one was obtained by taking the average of the $\binom{10}{k}$ standard deviations and the second one by taking the maximum of those. The computational procedure is illustrated in Figure 5. The uncertainties were calculated for all of the selected features. The uncertainty from (3) was further divided by the average to obtain the relative uncertainty in percent.
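The combination-based procedure can be sketched as follows. The readings are hypothetical, the function names are introduced for this example, and dividing by the mean of all ten readings to obtain the relative uncertainty is an assumption, since the text does not specify which average is used.

```python
from itertools import combinations
from math import comb, sqrt
from statistics import mean, stdev

def uncertainty_indices(readings, k):
    """Average and maximum standard deviation of the mean over all k-subsets."""
    values = [stdev(subset) / sqrt(k) for subset in combinations(readings, k)]
    assert len(values) == comb(len(readings), k)
    return mean(values), max(values)

def relative_uncertainty_table(readings):
    """Map k to (average, maximum) relative uncertainty in percent."""
    level = mean(readings)  # assumption: normalize by the mean of all readings
    table = {}
    for k in range(2, len(readings) + 1):
        u_avg, u_max = uncertainty_indices(readings, k)
        table[k] = (100 * u_avg / level, 100 * u_max / level)
    return table

# Ten hypothetical repetitions of one BN feature on one sample.
readings = [98, 101, 100, 103, 99, 102, 100, 97, 104, 96]
table = relative_uncertainty_table(readings)
for k, (avg_pct, max_pct) in table.items():
    print(k, round(avg_pct, 2), round(max_pct, 2))
```

With all ten readings there is only one combination, so the two indices coincide; for small k the maximum index is driven by the most extreme subset, which is why it represents the worst case.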