Article

Evaluation and Application of Machine Learning Techniques for Quality Improvement in Metal Product Manufacturing

1 Faculty of Mechanical Engineering and Aeronautics, Rzeszow University of Technology, Powstańców Warszawy 8, 35-959 Rzeszów, Poland
2 Department of Industrial Engineering and Informatics, Faculty of Manufacturing Technologies, Technical University of Košice, Bayerova 1, 08001 Prešov, Slovakia
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(22), 10450; https://doi.org/10.3390/app142210450
Submission received: 25 October 2024 / Revised: 5 November 2024 / Accepted: 7 November 2024 / Published: 13 November 2024
(This article belongs to the Topic Smart Production in Terms of Industry 4.0 and 5.0)

Abstract
This article presents a discussion of the application of machine learning methods to enhance the quality of drive shaft production, with a particular focus on the identification of critical quality issues, including cracks, scratches, and dimensional deviations, which have been observed in the final stages of machining. A variety of classification algorithms, including neural networks (NNs), bagged trees (BT), and support vector machines (SVMs), were employed to efficiently analyse and predict defects. The results show that neural networks achieved the highest accuracy (94.7%) and the fastest prediction time, thereby underscoring their efficiency in processing complex production data. The BT model demonstrated stability in its predictions with a slower prediction time, while the SVM model exhibited superior training speed, though with slightly lower accuracy. This article proposes that optimising key process parameters, such as temperature, machining speed, and the type of coolant used, can markedly reduce the prevalence of production defects. It also recommends integrating machine learning with traditional quality management techniques to create a more flexible and adaptive control system, which could help reduce production losses and enhance customer satisfaction.

1. Introduction

Implementing quality control represents a crucial element in enhancing both productivity and product quality within industrial contexts. Furthermore, it serves as a valuable instrument for the optimisation of production processes. The plethora of studies documented in the literature offers insight into the multifaceted aspects of implementing quality control in manufacturing, encompassing both traditional and automated production systems.
As Ryabchik et al. [1] observe, improvements in the quality control of manufacturing processes exert a direct influence on production efficiency. As evidenced by the research conducted by Pangestu et al. [2], the effective implementation of quality control measures can markedly improve the efficiency of manufacturing processes. Ramlah et al. [3] investigate the impact of methods such as Quality Control Circles and the ‘seven quality tools’ on improving productivity. Fani et al. [4] add that effective quality control not only increases productivity but also improves product quality, which is particularly important in competitive industries.
Marco et al. [5] proposed a quality visualisation model that facilitates more comprehensive and effective real-time quality monitoring, thus supporting the optimisation of production processes. Šedík et al. [6] investigate the potential for marketing approaches to support strategic management by influencing quality control and process alignment. Ana Carolina and Hérida Regina [7] posit that quality management tools are an indispensable component of strategic management success. Mayer and Jochem [8] provide an overview of the application of machine learning to quality control in digital industries, highlighting automation as a pivotal factor in development.
Hariharan [9] investigates the practical application of machine learning techniques for quality control in the textile industry, highlighting how digital technologies can optimise production processes.
Bekbayeva et al. [10] build on this theme by examining the automation of quality control using machine learning and computer vision. This approach has the potential to reduce the reliance on manual intervention while improving inspection precision. In a similar vein, Gross et al. [11] highlight the value of explainable methods in quality prediction models, emphasising the importance of fostering a deeper understanding of the results generated by ML algorithms. Himmel et al. [12] also delve into this subject matter, examining the potential of such methods in chemical and biochemical production systems, with promising indications of enhanced quality and productivity.
In a subsequent study, Mayer and Jochem [8] examine the utility of capability indices for predictive process control, underscoring the importance of these instruments in automating and optimising production in the context of Industry 4.0. Hariharan [9] once more directs attention to the deployment of machine learning in the textile industry, demonstrating that these algorithms can reduce waste and improve material quality.
In a similar vein, Kim et al. [13] concentrate on the optimisation of production processes and prediction of quality using ML models. They demonstrate that these algorithms can adjust production parameters in real time based on continuously collected data. This approach has the potential to enhance productivity while simultaneously reducing the incidence of production errors. This research is part of a broader trend towards the integration of machine learning into manufacturing systems, with the aim of improving operational efficiency.
Subsequent research, as exemplified by the works of Adamczak et al. [14] and Mascenik et al. [15], elucidates the potential of digitisation, including process simulations, to improve quality control, thus facilitating superior production management and performance forecasting. Such simulations are especially beneficial for the identification of potential quality issues at an early stage, prior to the commencement of mass production.
In a novel contribution to the field of quality management, Pavlenko et al. [16] highlight the importance of energy optimisation in the context of sustainable quality management. Fuzzy logic research demonstrates that optimising energy consumption in production has the potential to impact not only operational costs but also the quality of production processes. In this context [17], the concept of quality control also includes considerations of sustainability and resource management, which are becoming increasingly significant in modern manufacturing enterprises.
Hrehová and Matiskova [18] investigate the potential of incorporating machine learning elements into user interface design, with the aim of improving quality control processes in manufacturing systems. The use of intuitive and interactive interfaces supported by machine learning (ML) enables faster real-time detection of anomalies, which in turn increases efficiency and reduces production losses.
Barzizza et al. [19] concentrate on predictive quality control, using LightGBM and Random Forest models to analyse production data and predict potential defects. Their research demonstrates that these algorithms are capable of effectively optimising production processes, resulting in enhanced product quality and increased productivity. The deployment of these technologies is especially crucial in industries that require high precision and defect minimisation.
In their discussion of cyberphysical systems integrated with the Internet of Things (IoT) for real-time quality monitoring, Arora and Gupta [20] present a compelling argument for the potential of such systems to enhance the efficiency and responsiveness of quality monitoring processes. The research demonstrates that such systems can function as comprehensive quality control instruments, automating procedures and facilitating expedited responses to irregularities. Integrating IoT with production systems enables companies to monitor quality parameters in real time, thereby facilitating dynamic adjustment of processes to ensure compliance with requirements.
In their study on predictive maintenance in industry, Riccio et al. [21] demonstrate that the integration of machine learning with maintenance systems can enhance operational efficiency, reduce unplanned downtime, and improve production quality. The use of such methodologies facilitates the optimisation of resource management, which ultimately results in a reduction in operating costs and an improvement in production efficiency.
Mahapatra and Gaurav [22] focus on the utilisation of machine learning (ML)-based decision support systems to optimise manufacturing processes. The findings of their research indicate that these algorithms can help managers make more informed quality control decisions, which directly enhances operational efficiency and reduces defects. The implementation of data-driven decision support systems has the potential to significantly improve the precision and expediency of manufacturing decisions.
Zoubaidi et al. [23] investigate the implementation of a Quality 4.0 model that uses data-driven decision making to improve real-time production efficiency and product quality. The research indicates that data collected and evaluated in real time can be used to improve production processes, resulting in increased productivity and fewer defective products.
The application of machine learning (ML) algorithms has been shown to offer significant advantages over traditional statistical methods in the context of quality control, particularly in terms of efficiency and adaptability. These algorithms are particularly adept at processing large and complex datasets and discerning intricate patterns that are beyond the reach of traditional statistical techniques. As a result, they are well suited to modern manufacturing environments, which are characterised by ever-increasing data volumes and complexity. The following section presents a comparative analysis of the efficiency and adaptability of machine learning (ML) algorithms in quality control, contrasting them with conventional statistical approaches.
ML algorithms have demonstrated proficiency in handling substantial datasets from various sources, detecting nonlinear relationships and complex patterns that traditional statistical methods might overlook [24,25]. Studies have shown that Random Forests, for example, are more effective than conventional methods such as Hotelling’s T² control charts in predicting manufacturing performance, underscoring their superior efficiency in quality control tasks [24]. Furthermore, machine learning enables automation, thereby reducing the need for human intervention and improving product quality by providing rapid and reliable information for quality assessment [26].
The adaptability of ML is evident in its real-time data adjustment capabilities, which allow continuous parameter tuning based on ongoing analysis. This has been demonstrated in the QU4LITY project for bearing testing. Furthermore, machine learning’s predictive capabilities allow it to anticipate potential issues and adapt to changing conditions, which is crucial for consistent performance in dynamic systems such as enterprise resource planning (ERP) [27]. The continuous learning aspect of machine learning models also allows for accuracy and reliability improvements over time, a flexibility that static statistical models lack [28].
Although machine learning algorithms offer considerable advantages, traditional statistical methods remain relevant, particularly in contexts where the data are relatively straightforward or where interpretability is a priority. However, as the data produced by manufacturing processes becomes increasingly complex, the efficiency and adaptability of machine learning become increasingly valuable. The incorporation of machine learning (ML) into quality control procedures not only enhances operational efficiency but also facilitates predictive maintenance and risk management. To illustrate, industries that use IoT sensors with ML models can proactively address issues based on patterns in vibration or temperature data, thus reducing downtime and maintenance costs [29]. Despite these advances, challenges persist, including the need for substantial, high-quality training data, concerns about algorithm transparency, and the importance of skilled personnel to interpret ML outputs [29].
Implementing ML in quality control improves efficiency and adaptability over traditional methods, aligning with the demands of complex, data-intensive manufacturing environments and advancing toward more resilient production processes.
A review of the literature makes it evident that digital tools, including machine learning, cyberphysical systems, and data analytics, are significant in the context of modern quality management. The automation of quality control processes, the enhancement of operational efficiency, and the dynamic adjustment of production parameters based on real-time data are becoming pivotal elements that enable manufacturing companies to achieve superior product quality, minimise defects, and optimise the utilisation of resources.
Notwithstanding these advances, there remain several significant gaps in the existing literature that require further investigation. One of the main challenges is the restricted scalability of ML solutions for smaller enterprises. Toigo et al. [30] posit that small and medium enterprises (SMEs) encounter obstacles in the complete implementation of sophisticated predictive technologies due to limitations in technical and financial resources. Further research is required to develop more accessible and scalable solutions that can be easily implemented in diverse enterprises.
Although ML offers considerable promise in quality management, significant gaps remain that impede the comprehensive deployment of these technologies. More research is needed to assess the availability of ML tools, their integration with existing systems, the improvement of data quality, and the adaptation of solutions to specific industry needs. Further advancement in these areas can markedly improve operational efficiency and enhance production quality in numerous industrial sectors.
That is why this paper explores how machine learning techniques can support defect detection in the manufacturing process. By focussing on the final manufacturing stage of propeller shafts, this study demonstrates how ML can be used to achieve more proactive and effective quality management, ultimately resulting in improved efficiency, reduced defects, and enhanced product quality. Integration of ML into traditional quality management systems is an important step toward optimising manufacturing processes for future requirements.
The structure of this paper is as follows. In Section 2, the case study company is presented. The research methodology and methods used are shown in Section 3. In Section 4, the research results of the machine learning algorithms methodology are presented. Section 5 and Section 6 provide conclusions, limitations, and suggestions for future research. This study is a continuation of the research presented in previous works [31,32].

2. Description of the Research Problem and Case Study Company

2.1. Formulation of Research Problems

The research problem analysed in this paper concerns the reduction of nonconforming products in the finishing production of drive shafts. In this process, various defects occur in approximately 8% of output, affecting the quality of the final product and its compliance with technical requirements and customer specifications. The most common defects include nicks and scratches (D1), cracks (D2), surface irregularities (D3), radial runout (D4), improper dimensions (D5), and improper hardness (D6). Each of these defects can result in product rejection, additional machining, or, in extreme cases, failure in use, leading to significant financial losses and reduced customer confidence.
Analysis of these defects indicates the need to understand their root causes and implement appropriate countermeasures to minimise or eliminate them. Defects such as nicks, scratches, and cracks often arise from inadequate material preparation or improper machine operation. Surface irregularities, radial runout, improper dimensions, and hardness issues typically result from problems related to the machining process, including inaccurate machine settings, tool wear, or errors in the hardening process. Addressing these defects, which affect around 8% of production, is critical to improving overall product quality and operational efficiency.

2.2. Process Description in the Case Study Company

The case study company specialises in the manufacture and machining of drive shafts and offers a comprehensive service covering the entire manufacturing process, from initial material preparation through precision machining to final assembly [33]. With state-of-the-art machinery and a highly skilled team of engineers and technicians, the company can carry out even the most demanding projects. Committed to the highest quality standards, the company continually invests in innovative technologies and advanced manufacturing processes to ensure that its products meet the expectations of customers around the world [34].
The company’s propeller shafts are renowned for their strength, precision, and durability. Advanced technologies such as induction hardening and precision grinding enhance the wear resistance of the shafts and ensure smooth and reliable operation under the most demanding conditions. These drive shafts are used in a wide range of industries, including automotive, heavy industry, agricultural machinery, and rail vehicle drive systems. The combination of high product quality and customisability makes the company’s drive shafts the preferred choice of leading machinery and vehicle manufacturers worldwide.
The company’s finishing process for drive shafts begins with rough turning, which is the first stage of machining. Rough turning removes significant amounts of excess material from the raw shaft, bringing it close to its final dimensions. This stage prepares the semifinished product for more precise machining in subsequent stages.

3. Materials and Methods

3.1. Research Methodology

The research methodology, as illustrated in Figure 1, consisted of four main stages, each of which was meticulously planned to guarantee a complete analysis and inference. The initial stage of the research process involved the gathering of data, which began with the formulation of a comprehensive data collection plan. The plan delineated the specific data to be collected, the sources from which it would be obtained, and the methodology to be employed in its collection. Subsequently, the collected data were subjected to a preliminary analysis to verify their quality. At this stage, the completeness of the data was verified, and any anomalies or measurement errors were identified and assessed.
The second stage consisted of statistical analyses that were conducted with the objective of examining the relationships between variables. A Chi-square test was used to assess the presence of dependencies between categorical variables, which proved particularly effective in the context of data grouped by categories. Furthermore, the Kruskal–Wallis test was employed as a nonparametric alternative to ANOVA, allowing for the comparison of medians across more than two groups. This test proved invaluable when the data did not follow a normal distribution, as it prevented the potential for errors that could arise from using an inappropriate analysis method.
The third stage was dedicated to the utilisation of machine learning algorithms for the development of predictive models. Neural networks (NNs), SVM (Support Vector Machine), and Bagged Trees (BT) were employed to construct a variety of models that facilitated effective data analysis and prediction.
The fourth and final stage comprised the evaluation of the results and discussion thereof. The efficacy of the developed models was evaluated using a range of quality indicators that assessed the precision, sensitivity, and accuracy of the predictive models. For the assessment of classification models, the Receiver Operating Characteristic (ROC) curve was employed to illustrate the model’s capacity to differentiate between classes, along with the Area Under the Curve (AUC), which quantified the quality of the classifier. A higher AUC value indicated a better performance of the model.

3.2. Data Collection

During this stage, data from the production process were collected. The objective of this data collection was to evaluate the quality of the products after production, ensuring compliance with specified standards and identifying any defects. Data were collected after the completion of all production stages, when the products were prepared for final inspection.
Both attribute data and continuous data were collected. The attribute data comprised the number of nonconforming products and the types of defects (D1–D6) identified in the final products. The continuous data covered particular process parameters, designated as variables X, including the curing temperature (X1), type of coolant used (X2), cooling speed (X3), feed rate (X4), tool wear (X5 and X7), cutting speed (X6), grinding wheel speed (X8), and grinding time (X9). These parameters were instrumental in identifying the factors that had the greatest influence on the occurrence of defects.
In accordance with the quality control results, each product was classified as compliant (OK) if it met all quality standards or nonconforming (NOK) if any deviations from the required specifications were detected that could impact its functionality or appearance. For products classified as noncompliant, a comprehensive list of defects was compiled, including scratches and abrasions (D1), cracks (D2), surface irregularities (D3), radial runout (D4), incorrect dimensions (D5), and inappropriate hardness (D6). This detailed classification facilitated a more precise analysis of the root causes of defects.

3.3. Statistical Analyses

The main objective of the data analysis was to transform the collected data into information and knowledge that could be used to act. The initial step involved an evaluation of the significance of the influence of the process parameters (X1 to X9) on the occurrence of various defects. To achieve this, statistical methods such as the Kruskal–Wallis test and the Chi-square test were employed.
The Kruskal–Wallis test [35,36], a nonparametric statistical method, was used to compare medians in more than two independent groups. This test is appropriate when the assumption of normal data distribution is untenable or when the sample sizes vary across groups. It is an extension of the Mann–Whitney U test for cases involving more than two groups and is particularly effective for analysing ordinal or continuous data that do not meet the assumptions of normality. The null hypothesis (H0) of the Kruskal–Wallis test posits that all groups originate from populations with the same median, whereas the alternative hypothesis (H1) suggests that at least one group comes from a population with a different median.
The test ranks the data across all groups and compares the sums of ranks between the groups. The test statistic, designated H, is calculated using the following formula:
$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)$$
where
  • N is the total number of observations;
  • Ri is the sum of ranks for the i-th group;
  • ni is the number of observations in the i-th group;
  • k is the number of groups.
If the resulting p-value is small (typically p < 0.05), it indicates that there are statistically significant differences between the medians of the groups.
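To make the procedure concrete, the test can be reproduced with SciPy’s `scipy.stats.kruskal`. The grouped values below are hypothetical stand-ins for one process parameter split by outcome class, not data from this study.

```python
from scipy.stats import kruskal

# Hypothetical readings of one process parameter (e.g. X1) split into three
# groups by outcome class; the values are illustrative, not the study's data.
group_a = [850, 852, 849, 851, 853]
group_b = [861, 863, 860, 862, 864]
group_c = [845, 844, 846, 843, 847]

h_stat, p_value = kruskal(group_a, group_b, group_c)
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("At least one group median differs significantly.")
```

Because the three groups here are fully separated, the rank sums differ sharply and the test rejects the null hypothesis of equal medians.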
The Chi-square (χ2) test is used to examine the relationship between two categorical variables by comparing the observed and expected frequency of occurrence in different categories [27,28]. The null hypothesis (H0) of the Chi-square test assumes no association between the variables, thereby indicating that they are independent. In contrast, the alternative hypothesis (H1) suggests a dependency between the variables. The Chi-square statistic is calculated using the following formula:
$$\chi^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
where $O_{ij}$ are the observed frequencies and $E_{ij}$ the corresponding expected (theoretical) frequencies.
If the p-value obtained from this test is less than 0.05, the null hypothesis is rejected, indicating a statistically significant association between the variables.
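As an illustration, the test can be run on a contingency table with SciPy’s `chi2_contingency`; the table below is an invented example relating a categorical parameter (such as coolant type, X2) to the OK/NOK outcome.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x2 contingency table: rows = coolant type, columns = (OK, NOK).
# The counts are invented for demonstration only.
observed = np.array([[180, 12],
                     [150, 28]])

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}, dof = {dof}")
```

Here the NOK rate differs noticeably between the two coolant types, so the test reports a p-value below 0.05 and the null hypothesis of independence is rejected.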
In this study, the Kruskal–Wallis test was used to compare the medians of multiple groups when normality assumptions were not met. Meanwhile, the Chi-square test was used to assess the relationship between categorical variables by comparing the observed and expected frequencies. These methods were used to determine the importance of the influence of the process parameters (X1 to X9) on the appearance of various defects (D1 to D6) in the production process.

3.4. Machine Learning Methodology

Three different machine learning (ML) methods were used in the classification analysis to assess product quality: Neural Networks (NNs), Bagged Trees, and Support Vector Machines (SVMs). These methods were chosen because of their proven effectiveness in solving various classification problems, especially when dealing with complex data [37,38]. The dataset used in this study consisted of 565 records, further emphasising the importance of carefully selecting the classification approach to ensure optimal performance and prediction precision.
Neural networks (NNs) have demonstrated their suitability for analysing highly complex datasets due to their ability to model intricate, nonlinear relationships. With deep neural network architectures, it becomes possible to uncover hidden patterns and structures within the data, especially when the relationships between variables are complex and not easily captured by simpler models. However, one challenge of using NNs is the risk of overfitting, particularly with smaller datasets; to mitigate this, careful consideration must be given to the network architecture, including the number of layers, the number of neurons in each layer, and the choice of activation functions.
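As a rough sketch of these design choices (the study itself used MATLAB’s Classification Learner, not scikit-learn), a small feed-forward network can be set up in Python. The synthetic dataset, layer sizes, and regularisation value below are assumptions for demonstration only.

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for 565 records with nine process parameters (X1-X9).
X, y = make_classification(n_samples=565, n_features=9, n_informative=6,
                           random_state=0)

clf = make_pipeline(
    StandardScaler(),  # NNs are sensitive to feature scale
    MLPClassifier(hidden_layer_sizes=(25, 10),  # two hidden layers
                  activation="relu",
                  alpha=1e-3,                   # L2 penalty to curb overfitting
                  max_iter=2000,
                  random_state=0),
)
clf.fit(X, y)
print(f"training accuracy: {clf.score(X, y):.3f}")
```

In practice the layer sizes and the `alpha` penalty would be tuned via cross-validation rather than fixed as here.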
Bagged Trees, an ensemble method based on the bootstrap aggregation (bagging) technique, was also used. This method involves building multiple decision trees on different subsets of data and aggregating their results to produce more stable and accurate predictions. By training each decision tree on randomly selected subsets of the data, bagged trees can capture different aspects of the dataset and reduce model variance. In classification tasks, the final decision is based on the majority vote of all trees. Bagged trees are particularly useful for data with high variability and complexity and are more resistant to overfitting than single decision trees. However, parameters such as the number of trees in the ensemble and the depth of each tree must be carefully tuned for optimal performance. Despite their advantages, bagged trees can be computationally intensive, especially for larger datasets [39].
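The bagging scheme can be sketched as follows; again this is an illustrative Python analogue of the MATLAB workflow, with the number of trees and tree depth set to arbitrary example values.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the production dataset (565 records, 9 parameters).
X, y = make_classification(n_samples=565, n_features=9, random_state=0)

bag = BaggingClassifier(
    DecisionTreeClassifier(max_depth=8),  # depth of each tree
    n_estimators=50,                      # number of trees in the ensemble
    random_state=0,
)
# Each tree trains on a bootstrap sample; classification is by majority vote.
scores = cross_val_score(bag, X, y, cv=5)
print(f"5-fold CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Averaging over bootstrap-trained trees reduces variance relative to a single tree, which is the stability property noted above.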
Support Vector Machines (SVMs) are well known for their ability to effectively handle high-dimensional data and determine the optimal decision boundary between classes. By using kernel functions, SVMs can model complex, nonlinear relationships, which is critical when analysing data that are not linearly separable. However, while SVMs are a powerful tool for classification, selecting the correct kernel and tuning parameters, such as the kernel scale and box constraint, requires careful analysis and experimentation. Improper parameter selection can lead to suboptimal model performance.
The decision boundary in a linear SVM is represented by the equation of a hyperplane:
$$w \cdot x + b = 0$$
where
  • w is the weight vector;
  • x is the input characteristic vector;
  • b is the bias term.
SVM works by maximising the margin between the decision boundary and the nearest data points. The margin is given by:
$$\text{Margin} = \frac{2}{\|w\|}$$
Maximising this margin while minimising classification errors leads to the following optimisation problem:
$$\min_{w,b} \; \frac{1}{2}\|w\|^2 \quad \text{subject to} \quad y_i(w \cdot x_i + b) \ge 1 \;\; \forall i$$
where
  • yi is the label of the i-th training example (yi = +1 for one class, and yi = −1 for the other);
  • xi is the i-th training example.
For cases where the data are not linearly separable, SVMs introduce slack variables ξi that allow for some misclassification. This leads to the soft-margin SVM formulation:
$$\min_{w,b,\xi} \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i \quad \text{subject to} \quad y_i(w \cdot x_i + b) \ge 1 - \xi_i \;\; \text{and} \;\; \xi_i \ge 0$$
where ξi are the slack variables, and C is the regularisation parameter that controls the trade-off between maximising the margin and minimising classification errors. In cases where the data are not linearly separable, SVMs use kernel functions to map the data into a higher-dimensional space where a linear separation is possible. The choice of kernel function and the tuning of parameters are critical to achieving good performance with SVMs.
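A minimal Python sketch of a soft-margin SVM with an RBF kernel is shown below; the synthetic data and the values of C (the regularisation parameter in the soft-margin formulation) and the kernel scale are illustrative assumptions that would normally be tuned experimentally.

```python
from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the production dataset (565 records, 9 parameters).
X, y = make_classification(n_samples=565, n_features=9, random_state=0)

svm = make_pipeline(
    StandardScaler(),
    SVC(kernel="rbf",    # maps the data into a higher-dimensional space
        C=1.0,           # trade-off: margin width vs. misclassification
        gamma="scale"),  # kernel scale
)
svm.fit(X, y)
print(f"training accuracy: {svm.score(X, y):.3f}")
```

Larger C penalises slack more heavily (narrower margin, fewer training errors), while smaller C tolerates misclassification in exchange for a wider margin.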
When choosing a classification method, it is important to consider the specific characteristics of the data, the nature of the problem, and the available computational resources. Therefore, before applying a particular method, detailed data analysis and experimentation are necessary to evaluate the effectiveness of each model in the context of the given task. In the present study, the models were developed using Matlab R2024a, specifically using the Classification Learner app to facilitate the process.
To assess the performance of the models, k-fold cross-validation with k = 5 was used: the dataset is divided into five equal subsets, with each subset used once as the test set and the remaining four forming the training set. This ensures that every data point is used for validation, providing a comprehensive evaluation of model performance. In the case of the Bagged Trees model, several configurations were tested to capture differences between classes. Parameters such as the number of trees, tree depth, and voting methods were varied to determine their effect on model performance. In addition, the choice of splitting criterion, whether information gain or the Gini index, played a key role in the decision-making process for splitting nodes in the trees. The Gini index is determined as follows (5):
$$Q_G(m) = \sum_{j=1}^{s} p_{mj}\,(1 - p_{mj}) = 1 - \sum_{j=1}^{s} p_{mj}^2$$
where $p_{mj}$ is the conditional probability of the $j$-th class in node $m$, and $s$ is the number of classes. In node $m$ with $n_m$ observations, the conditional probability of the $j$-th class is given by the following (6):
$$p_{mj} = \frac{\#\{y_i = c_j : x_i \in R_m\}}{n_m}$$
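The Gini computation for a single node reduces to a few lines; the helper below is an illustrative sketch (not library code), with class labels chosen to match the OK/NOK convention used in this study.

```python
import numpy as np

# Gini impurity of a single node: p_mj is the fraction of the node's samples
# belonging to class j. Illustrative helper, not library code.
def gini(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

print(gini(["OK", "OK", "OK", "NOK"]))  # 1 - (0.75**2 + 0.25**2) = 0.375
print(gini(["OK", "OK", "OK", "OK"]))   # pure node: impurity 0.0
```

A split that lowers the weighted Gini impurity of the child nodes relative to the parent is preferred when growing each tree.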
The quality of the classification models was assessed using several key statistical measures. One of the primary metrics was the accuracy of the model (Equation (7)), calculated as the ratio of correctly classified instances (true positives, TP, and true negatives, TN) to the total number of instances (TP + TN + FP + FN, where FP and FN denote false positives and false negatives, respectively). The confusion matrix provided a detailed breakdown of the classification results, including the values of TP, TN, FP, and FN, giving a clear picture of the types and frequency of classification errors.
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
In addition to the accuracy, the receiver operating characteristic (ROC) curve and the area under the curve (AUC) were used to assess the model’s ability to discriminate between classes. These metrics are particularly useful for assessing how well the model performs in terms of sensitivity and specificity. The error rate (Equation (8)), which is the complement of accuracy, was used to express the percentage of cases that were incorrectly classified.
$$\mathrm{Error\ Rate} = 1 - \mathrm{Accuracy}$$
Analysis of these indicators provided a comprehensive profile of the performance of the model. This allowed an assessment of its effectiveness across several dimensions, including overall accuracy and its ability to accurately discriminate between different classes.
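A short sketch of how accuracy (Eq. (7)), the error rate (Eq. (8)), and a confusion matrix are derived from predictions. The labels below are hypothetical placeholders, not the study's data (the study used Matlab's Classification Learner):

```python
# Hypothetical true and predicted defect labels, for illustration only.
y_true = ["D1", "D2", "D2", "D3", "D1", "D3"]
y_pred = ["D1", "D2", "D3", "D3", "D1", "D3"]

# Accuracy (Eq. (7)): share of correctly classified instances.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
# Error rate (Eq. (8)) is simply the complement of accuracy.
error_rate = 1.0 - accuracy

# Confusion matrix as nested counts: confusion[true_class][predicted_class].
labels = ["D1", "D2", "D3"]
confusion = {t: {p: 0 for p in labels} for t in labels}
for t, p in zip(y_true, y_pred):
    confusion[t][p] += 1

print(f"accuracy = {accuracy:.3f}, error rate = {error_rate:.3f}")
```

The diagonal entries of `confusion` hold the correct classifications; every off-diagonal entry records one specific type of error, which is what makes the matrix more informative than accuracy alone.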

4. Results and Discussion

4.1. Preliminary Data Analyses

Data collection for the analytical process was conducted between April and May 2024, during the final quality control stage of the finished products. The data collected were subject to extensive preliminary analysis. The first step was to assess the completeness of the data, identify any missing data, and analyse outliers that could affect the results of further analyses. Advanced statistical methods and algorithms were used to assess the quality of the data, allowing anomalies to be accurately identified and removed. Data were then standardised and normalised to ensure consistency of scale and to prevent the influence of different units of measurement of variables on subsequent stages of analysis. This process ensured consistent results when comparing different variables.
Furthermore, a detailed analysis of the types of defects in finished products was performed. This analysis included classifying and categorising the different types of defects and determining their frequency of occurrence.
Figure 2 shows that the most common defects recorded in the manufacturing process are D2 (cracks) and D3 (surface irregularities), which represent 25.20% and 21.85% of the total defects, respectively. Cracks pose a significant threat to product integrity, often leading to serious performance problems or product failure. Surface irregularities, which affect both the appearance and functionality of the product, are also a major concern, particularly in industries that require high-precision and smooth surfaces. Addressing these two types of defects should be a primary focus to improve overall quality and reduce production losses.
D1 follows with 15.63% of defects, making it the third most common problem. Although not as critical as cracks or irregularities, scratches and abrasions can still affect aesthetic quality and, in certain applications, performance. D4, which accounts for 14.04% of defects, can cause significant problems in components that require rotational accuracy, such as drive shafts or machine parts. Controlling these medium-level defects can improve product reliability and customer satisfaction, especially in performance-critical environments.
The least common defects are D5 at 12.28% and D6 at 11.00%. Incorrect dimensions can lead to assembly problems or operational malfunctions, while inappropriate hardness can undermine the durability and wear resistance of the product. Although less common, these defects still need to be addressed, as their impact on product performance can be significant. Focussing on reducing these defects, in addition to addressing the most common problems, will ensure an overall improvement in production quality and process efficiency.
Figure 3 illustrates the distribution of defect types (D1 to D6) across specific weeks from April to May 2024.
In the initial week of April, the data indicate a notable prevalence of defects D1, D2, and D3, with D2 exhibiting the highest incidence among all identified defects. D4 and D5 are relatively less prevalent, while D6 has a minimal occurrence. In the second week, there is a notable increase in the incidence of defect D3, while D1 and D2 exhibit a slight decline in comparison to the preceding week. The distribution of defects D4, D5, and D6 remains low, but there is a more even distribution of these defects. In the third week, defects D3 and D2 continue to exert a dominant influence, maintaining high counts. In contrast, D1 and D5 exhibit a slight increase, while D4 and D6 remain at lower levels. In the fourth week, D3 reaches its highest count for this defect type throughout the period, with moderate counts for D1 and D2 and lower counts for D5 and D6.
In May, there is a notable shift in the distribution of defects. In the initial week, the frequency of defect D3 declines, while that of D1 and D2 increases, with a slight rise in D4 and minimal change in D5 and D6. In the second week, the number of occurrences of D1 and D2 remained consistent, with D2 exhibiting a slight lead. Defect D3 remains at a moderate level of occurrence, while D4 increases slightly, and D5 and D6 remain at a low level of occurrence. In the third week, there is a notable increase in the prevalence of defect D2, which becomes the most common defect type during this period. Moderate levels of D1, D3, and D4 are observed, while the levels of D5 and D6 remain relatively low. In the fourth week, D2 reaches a peak level comparable to that observed in the fourth week of April. The counts for D1 and D3 are moderate, while those for D4, D5, and D6 remain low.
This analysis demonstrates that defects D2 and D3 are the most prevalent throughout the period under review, with D2 peaking frequently in May and D3 peaking in late April. Defects D4, D5, and D6 demonstrate consistently lower counts, indicating a lesser impact compared to D1, D2, and D3. Both April and May’s fourth weeks exhibit elevated counts for specific defects, suggesting potential end-of-month process variations or batch-specific issues that may necessitate further investigation.
The preliminary analysis of the impact of individual process parameters on defect occurrence is presented in Figure 4. This analysis indicated that the curing temperature (X1) (Figure 4a) is correlated with the type of defect, with defects D1 and D2 occurring more frequently at higher temperatures. This suggests that a reduction in temperature may be an effective strategy for mitigating these defects, especially since some outliers (black dots) can be observed for defect D2. Defect D5 manifests at lower temperatures, underscoring the need to adjust the temperature to curtail its prevalence.
Moreover, the cooling speed (X3) (Figure 4b) has a considerable influence on defect D6, which manifests at elevated values of this parameter. A reduction in cooling speed may therefore be an effective method of limiting the occurrence of this defect, and control of this parameter may also help reduce the incidence of other defects, especially since some outliers (black dots) can be observed for several defect types. Furthermore, the feed rate (X4) (Figure 4c) is correlated with defect D5, which manifests at lower feed rates; an increase in the feed rate may serve to reduce this particular type of defect. A meticulous calibration of the feed rate for each specific defect type could also have a beneficial impact on the overall quality of the product.
Furthermore, the cutting speed (X6) (Figure 4d) has an impact on the occurrence of various defects. Defects D1 and D2 are more prevalent at higher cutting speeds, while defects D3 and D4 are more common at lower speeds. The implementation of an effective control strategy for cutting speed has the potential to reduce the prevalence of nonconforming products. Regarding the grinding wheel speed (X8) (Figure 4e), it was observed that defect D1 occurs at higher values, while D5 occurs at lower values. Optimisation of this parameter may prove to be an effective method of reducing the occurrence of these defects.
The grinding time (X9) (Figure 4f) also plays an important role in this process; the occurrence of defect D4 is more common with longer grinding times. This may suggest that overheating or excessive wear from extended processing times contributes to its occurrence. It can thus be surmised that optimising the grinding time may prove to be an effective method of reducing the number of defects associated with this particular parameter.
The aforementioned analysis demonstrated that a number of process parameters, including the curing temperature, coolant type, cooling speed, feed rate, cutting speed, grinding wheel speed, and grinding time, exert a considerable influence on the occurrence of specific defects. Optimising these parameters has the potential to markedly enhance product quality by reducing the number of defects, thereby improving the efficiency of the production process.

4.2. Results of Statistical Analyses

Two main statistical tests were performed on the data collected: the Chi-square test and the Kruskal–Wallis test. These tests were chosen based on the nature of the data; some parameters were categorical, making them suitable for the Chi-square test, while others were continuous or ordinal, making them suitable for the Kruskal–Wallis test. The results of these analyses are shown in Table 1.
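For readers reproducing the analysis outside Matlab, both tests are available, e.g., in SciPy. The data below are hypothetical placeholders chosen only to show the mechanics, not the study's measurements:

```python
from scipy.stats import chi2_contingency, kruskal

# Hypothetical contingency table: coolant type (categorical X2) vs. defect class.
contingency = [[30, 12,  8],   # coolant "A": counts of D1, D2, D3
               [ 5, 20, 25]]   # coolant "O": counts of D1, D2, D3
chi2, p_chi, dof, expected = chi2_contingency(contingency)

# Hypothetical continuous samples (e.g., curing temperature) grouped by defect class.
temps_d1 = [1015, 1022, 1030, 1025]
temps_d2 = [1032, 1040, 1038, 1045]
temps_d5 = [ 995, 1001,  998, 1003]
h_stat, p_kw = kruskal(temps_d1, temps_d2, temps_d5)

print(f"Chi-square: p = {p_chi:.4f}; Kruskal-Wallis: p = {p_kw:.4f}")
```

A small p-value in the Chi-square test indicates an association between the categorical parameter and the defect class, while a small Kruskal–Wallis p-value indicates that at least one defect class differs in the distribution of the continuous parameter.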

4.3. Machine Learning Models

In the machine learning (ML) process, three different classification models were developed and evaluated: Bagged Trees (BT), Neural Network (NN), and Support Vector Machine (SVM). These models were trained using significant process parameters (X1 to X9) identified in previous stages of the analysis as key to predicting the occurrence of defects labelled D1 to D6. The dataset employed in the present study comprised 565 data points obtained during the final quality control phase of the manufacturing process. These data points encompassed a range of defect types (such as cracks, scratches, surface irregularities, and dimensional inaccuracies) and critical process parameters (e.g., curing temperature, feed rate, tool wear, cooling speed, etc.). This scope permitted a meaningful evaluation of the machine learning models and was sufficient for training and validation purposes. To prevent overfitting, five-fold cross-validation was applied together with an 80/20 split between the training and test sets. This ensured that each model was tested on different subsets of the data, allowing full use of the dataset in the validation and training process.
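The training setup can be sketched, e.g., with scikit-learn. The synthetic data below are illustrative stand-ins for the 565 records, and the hyperparameters are mapped only approximately from the Matlab configuration (Matlab's kernel scale, in particular, has no exact scikit-learn analogue):

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 565 quality-control records: nine process
# parameters (X1..X9) and six defect classes (D1..D6). Illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(565, 9))
y = rng.integers(1, 7, size=565)

models = {
    # 30 bagged decision trees, matching the BT configuration in the text.
    "BT": BaggingClassifier(DecisionTreeClassifier(), n_estimators=30, random_state=0),
    # One hidden layer of 100 neurons with ReLU, matching the NN in the text.
    "NN": make_pipeline(StandardScaler(),
                        MLPClassifier(hidden_layer_sizes=(100,), activation="relu",
                                      max_iter=300, random_state=0)),
    # Gaussian (RBF) kernel; Matlab's KernelScale=0.75 does not map one-to-one
    # onto sklearn's gamma, so the default is kept here.
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf", random_state=0)),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
results = {name: cross_val_score(m, X, y, cv=cv).mean() for name, m in models.items()}
print(results)
```

With the study's real data, the same loop would reproduce the accuracy comparison reported in Table 2; with the random labels above, the scores are naturally close to chance level.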
Each model was subjected to a detailed evaluation based on several key metrics. These included prediction speed, which measures how quickly the model can make a prediction using new data; training time, which indicates how long it takes to train the model; and accuracy, which measures how well the model classifies defect cases compared to actual outcomes. Furthermore, the error rate was analysed, which reflects the percentage of incorrect classifications made by the model.
Table 2 provides a detailed comparison of the results for each of these metrics across all models, allowing an assessment of their effectiveness and practical applicability. By comparing these key parameters, it is possible to select the best model for defect prediction in terms of both prediction accuracy and time efficiency.
A comparative analysis of the three best classification models was performed. The performance of the Bagged Trees (BT), Neural Network (NN), and Support Vector Machine (SVM) models was evaluated in terms of key metrics, including the prediction speed, training time, hyperparameters, accuracy, and error rate.
The Bagged Trees (BT) model is based on an ensemble (bagging) method comprising 30 decision trees, with a maximum of 564 splits in each tree. The principal benefit of this approach is the high degree of stability in the predictions, which is achieved through the aggregation of the results from multiple models. This resulted in an accuracy rate of 94.2%, which is a highly commendable result. However, the model exhibits notable limitations in terms of prediction speed, with a processing capacity of only 1400 observations per second, rendering it the slowest among the models under analysis. The training time was 21.02 s, which also puts it last in terms of time efficiency. The error rate of the BT model was 5.8%, which was marginally higher than that of the neural network, but nevertheless within an acceptable range.
In several respects, the neural network (NN) model demonstrated the greatest efficiency. The network, comprising 100 neurons in its first layer with the ReLU activation function, performed strongly across numerous categories. The NN model demonstrated the highest accuracy, at 94.7%, and was therefore the most effective in terms of prediction accuracy. Furthermore, the neural network exhibited the lowest error rate of 5.3%, indicating the lowest number of misclassifications. In addition, the NN model exhibited an impressive prediction speed of 7800 observations per second, making it the fastest model in the analysis. The training time was 8.39 s, which, although not the shortest, was sufficiently rapid given the complexity of the model.
The Support Vector Machine (SVM) was utilised with a Gaussian kernel having a scale of 0.75, which allowed the creation of intricate decision boundaries.
The most significant advantage of this model was its exceptionally short training time of just 1.67 s, making it the most time-efficient model. However, the SVM achieved an accuracy of 94%, slightly lower than the NN (94.7%) and BT (94.2%) models. The error rate for the SVM was 6%, indicating a higher number of misclassifications compared to the other models.
The neural network (NN) was identified as the most comprehensive and efficient model, exhibiting the highest accuracy, lowest error rate, and fastest prediction speed. Although the training time was not the shortest, it was sufficiently rapid to allow the NN to outperform the other models in terms of overall efficiency. Despite demonstrating a level of accuracy similar to that of the NN, the BT exhibited slower processing times and a reduced capacity for real-time operation. Consequently, it is less suitable for applications that require rapid response times. In contrast, the SVM had the fastest training time but lower accuracy and a higher error rate, limiting its applicability to scenarios where training speed is a primary consideration. In other cases, the NN would be the optimal choice for accurate and rapid prediction.
The performance of each machine learning model was evaluated using confusion matrices, which are a key tool in this context. These matrices provide a comprehensive breakdown of the performance of the models by comparing the actual class labels (true labels) with the predicted labels generated by the models. This analysis enables not only the evaluation of the overall accuracy of the models but also gains a deeper understanding of the types and frequencies of specific errors made by each model. This allows for a detailed assessment of the models’ strengths and weaknesses, highlighting areas where improvements may be necessary, such as identifying which classes tend to be misclassified and the severity of such errors. Figure 5 presents the confusion matrices for each machine learning model developed.
The results presented in Figure 5a are based on a model developed using the bagged tree method. The matrix presents the percentages of correct and incorrect classifications for six different classes, labelled D1 through D6. Cells along the main diagonal of the matrix indicate the correct classifications. For example, the model achieved 96.2% accuracy for class D1, while the accuracy for D2 was 90.3%. The highest level of precision was observed for class D5, with 97.2% of cases correctly classified. In contrast, the least accurate classifications occurred for class D2. Additionally, the misclassifications are illustrated in the off-diagonal cells. For example, 3.2% of the instances belonging to class D1 were incorrectly assigned to class D2, while 4.5% of the instances of class D3 were misclassified as D2.
Furthermore, the positive predictive value (PPV), which gauges the accuracy of classification for each class, and the false discovery rate (FDR), which quantifies the proportion of erroneous predictions, are also presented. The highest level of precision was observed for class D5, at 97.2%, while the lowest was observed for class D2, at 90.3%. Correspondingly, class D2 exhibited the highest FDR at 9.7%, indicating that predictions for this class were more susceptible to error, whereas class D5 exhibited the lowest FDR at 2.8%, demonstrating the highest classification precision. Overall, the Bagged Trees model demonstrates robust performance, although further optimisation is necessary for classes such as D2 to reduce misclassification rates.
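PPV and FDR follow directly from the columns of a confusion matrix. A short sketch with hypothetical counts (the study's actual matrices are shown in Figure 5):

```python
import numpy as np

# Hypothetical confusion-matrix counts for three classes
# (rows: true class, columns: predicted class). Illustration only.
cm = np.array([[50,  3,  1],
               [ 2, 28,  3],
               [ 1,  4, 40]])

col_totals = cm.sum(axis=0)      # total predictions made for each class
ppv = np.diag(cm) / col_totals   # positive predictive value (precision)
fdr = 1.0 - ppv                  # false discovery rate = 1 - PPV
for cls, p, f in zip(["D1", "D2", "D3"], ppv, fdr):
    print(f"{cls}: PPV = {p:.1%}, FDR = {f:.1%}")
```

Because PPV and FDR are complements, a class with many off-diagonal entries in its column necessarily shows both low precision and a high false discovery rate, as observed for class D2 above.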
The confusion matrix for the NN model (Figure 5b) indicates a high degree of accuracy, with classes such as D3 and D5 achieving the highest results at 97.4% and 97.2%, respectively. Class D1 is correctly classified 95.2% of the time, while class D2 achieves an accuracy of 92.2%. The highest frequency of misclassification is observed for D6, with 9.0% of instances incorrectly predicted, predominantly as D1. The misclassification rates for the remaining classes are relatively moderate, with 4.5% of instances of class D1 being misclassified as class D6 and 3.6% of instances of class D2 being predicted as class D4.
Furthermore, the precision (PPV) and false discovery rate (FDR) metrics provide further insight into the model’s strengths and weaknesses. D3 and D5 demonstrate the highest precision at 97.4% and 97.2%, accompanied by low FDR values of 2.6% and 2.8%, respectively. This indicates a robust capacity to accurately identify these classes. However, D6 exhibits the lowest precision at 91.0% and the highest FDR at 9.0%, indicating that this class is the most challenging for the model to classify accurately. Although the overall performance of the neural network is strong, additional refinement, particularly for class D6, could help improve the model’s prediction accuracy and reduce misclassification rates.
The confusion matrix for the SVM model (Figure 5c) indicates a high level of accuracy across most classes, particularly for classes D1, D4, D5, and D6, where the model achieved a perfect classification rate of 100%. However, the performance for class D2 is notably less accurate, with an accuracy of 84%, while class D3 shows a correct classification of 94.3%. The highest frequency of misclassification was observed for class D2, with 16% of instances incorrectly classified as class D1, and for class D3, with 5.7% of instances incorrectly classified as class D2. Despite these misclassifications, classes D4, D5, and D6 demonstrate no misclassification whatsoever, thereby underscoring the robustness of the model with respect to these categories.
In terms of the precision (PPV) and false discovery rate (FDR), the model demonstrated remarkable performance for classes D1, D4, D5, and D6, achieving 100% precision and no false discoveries for these classes. However, the precision for D2 is significantly lower at 84% and D3 has a PPV of 94.3%, which is indicative of the impact of misclassifications. The false discovery rate (FDR) for D2 is the highest at 16%, indicating that the model faces greater challenges with this class. In comparison, D3 has an FDR of 5.7%. Overall, the SVM model demonstrates excellent performance, particularly for specific classes. However, further enhancements could be made to D2 to minimise misclassification and improve precision.
A comparison of the results of the confusion matrices for each model demonstrates a high overall performance, although there are differences in the precision and misclassification rates for specific classes. The BT model demonstrates optimal performance for classes D5 and D1, exhibiting high precision. However, it exhibits suboptimal performance for class D2, with the highest rate of misclassification. The neural network model also demonstrates robust performance, particularly for classes D3 and D5. However, it faces challenges with class D6, which presents the highest misclassification rate. In contrast, the SVM model shows excellent precision for classes D1, D4, D5, and D6, with 100% correct classifications for these classes. However, it significantly underperforms with class D2, showing the highest misclassification rate (16%). In general, all three models perform well in most classes. However, the SVM model demonstrates exceptional accuracy for some classes, while the NN and BT models would benefit from further optimisation for classes D6 and D2, respectively.
Table 3 presents a comparison of the three models (BT, NN, and SVM) based on their performance in classifying different types of defects (D1 to D6). The table includes accuracy information for each model, expressed as the true positive rate (TPR), and notes any classification problems (False Discovery Rate, FDR).
A comparison of these three models allows for the identification of the strengths and weaknesses of each in the classification of defect types D1 to D6. SVM achieves perfect accuracy (100% TPR) for D1, D4, D5, and D6, establishing it as the optimal choice for these defects. The neural network shows superior performance in identifying D3 (97.4% TPR) and exhibits satisfactory results for D5 and D2. However, it exhibits some degree of misclassification for D1 and D6. The bagged tree model exhibits balanced performance across all defects, with notable results for D5 (97.2% TPR) and high precision for D3, D4, and D6. This makes it suitable for general classification without focussing on any specific defect type.
In addition to the confusion matrices, the performance of the developed models was further evaluated using Receiver Operating Characteristic (ROC) curves and the corresponding Area Under the Curve (AUC) metrics. The ROC curve is a visual representation that demonstrates a classifier’s capacity to differentiate between classes. It is constructed by plotting the true positive rate (TPR) against the false positive rate (FPR) at varying threshold levels. This enables a comprehensive analysis of the trade-off between sensitivity (recall) and specificity at varying decision thresholds. The AUC metric serves to complement the ROC curve by providing a single numeric value that can summarise the model’s overall discriminatory ability. An AUC value approaching 1 indicates optimal performance, which indicates that the model is highly capable of distinguishing between the positive and negative classes. On the contrary, an AUC value of approximately 0.5 indicates that the model performance is no more accurate than random guessing. Therefore, the ROC-AUC analysis offers a more comprehensive evaluation of the robustness and effectiveness of the model across various classification thresholds (as illustrated in Figure 6), facilitating a more nuanced understanding of the model’s generalisation capabilities beyond mere accuracy metrics.
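The ROC-AUC computation can be sketched on a hypothetical three-class example; a one-vs-rest treatment with macro-averaging over classes is assumed here, which is a common multi-class convention:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

# Hypothetical class-membership probabilities for a three-class toy problem
# (rows sum to 1). Illustration only, not the study's scores.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_score = np.array([[0.80, 0.10, 0.10],
                    [0.60, 0.30, 0.10],
                    [0.20, 0.70, 0.10],
                    [0.30, 0.50, 0.20],
                    [0.10, 0.20, 0.70],
                    [0.05, 0.15, 0.80]])

# Macro-averaged one-vs-rest AUC across all classes.
auc = roc_auc_score(y_true, y_score, multi_class="ovr")

# ROC curve (TPR vs. FPR at varying thresholds) for class 0 treated one-vs-rest.
fpr, tpr, _ = roc_curve((y_true == 0).astype(int), y_score[:, 0])
print(f"macro OVR AUC = {auc:.3f}")
```

In this toy example, each class's members receive the highest scores for their own class, so the one-vs-rest rankings are perfect and the AUC reaches 1.0, the upper bound discussed above.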
The ROC curve analysis for the three models (Figure 6) demonstrates their robust performance, although with some variation between different classes.
The BT model (Figure 6a) exhibits exemplary classification capabilities, with AUC values ranging from 0.9691 for D1 to 1.0 for D5, signifying near-flawless performance for most classes. Furthermore, D6 shows an exceptional degree of performance, with an AUC value of 0.9994, indicating that the model is capable of almost perfectly distinguishing between positive and negative instances. However, the AUC for D1, although still high at 0.9691, exhibits a marginally elevated false positive rate compared to the other classes, thus identifying a potential avenue for improvement. For classes D2, D3, and D4, the model maintains an AUC value greater than 0.98, indicating a robust and reliable classification across the board.
The NN model (Figure 6b) also demonstrates robust performance, although the variability across the classes is more pronounced in comparison to the bagged trees model. The AUC values range from 0.9392 for D1 to 0.9984 for D5, indicating that, while the model performs well, there is a higher rate of false positives for certain classes, particularly D1 and D4. Regarding D5 and D6, the model demonstrates a near-perfect classification, as evidenced by the AUC values of 0.9984 and 0.9816, respectively. This indicates a high degree of discriminatory power for these classes. However, the somewhat diminished AUC values for D1 (0.9392) and D4 (0.9479) indicate that the model faces greater challenges with these classes. The model’s overall performance remains robust; however, further tuning could prove beneficial, particularly for classes presenting a higher rate of misclassification.
The SVM model (Figure 6c) demonstrates the most consistent and high performance across all classes, with AUC values ranging from 0.9937 for D4 to a perfect 1.0 for D5. This indicates a near-flawless performance, particularly for classes D5 and D6, where the AUC values reach 1.0 and 0.9991, respectively. Even for classes such as D1, D2, and D3, where the AUC values are 0.9956, 0.9989, and 0.9969, the model performs exceptionally well, consistently maintaining a very high true positive rate while minimising false positives. The slightly lower AUC for D4 (0.9937) is nevertheless still excellent, indicating that the model only struggles marginally with this class.
The above results for all three models demonstrate robust classification capabilities, with AUC values consistently exceeding 0.93 in all classes. The SVM is the most robust and reliable model, demonstrating near-perfect performance across all classes. Furthermore, the Bagged Trees model also performs exceptionally well, particularly for certain classes such as D5 and D6. However, while the NN model is strong overall, it shows more variability and a slightly higher rate of false positives, particularly for D1 and D4. Therefore, for applications that require consistent and highly accurate performance across all categories, the SVM would likely be the optimal choice. However, the Bagged Trees and Neural Network models remain strong contenders, depending on the specific requirements of the task.
Partial dependence analysis of input parameters and predicted defect categories (D1–D6) (Figure 7) in the optimal neural network (NN) model allows for the identification of pivotal production parameters that exert the most substantial influence on the incidence of diverse defect types. The following section provides a comprehensive account of the influence of each parameter on production quality.
The production temperature (X1) (Figure 7a) has been identified as a significant factor influencing the occurrence of cracks (D2). As illustrated in the graph, an increase in temperature above 1020 °C is associated with a notable increase in the predicted value for defect D2, indicating that elevated temperatures can contribute to an enhanced likelihood of cracking. In the case of defects such as surface irregularities (D3) and radial runout (D4), the predicted value exhibits a slight decrease at higher temperatures, which may be indicative of a reduced likelihood of occurrence at elevated temperatures. In contrast, defects such as scratches (D1) and improper hardness (D6) are practically unaffected by temperature changes. This suggests that controlling the production temperature is vital to reducing the incidence of cracks.
The type of coolant used (X2) (Figure 7b) has a significant impact on the incidence of scratches (D1). A notable reduction in the predicted value for defect D1 was observed when the coolant was changed from type “A” to type “O”, from 0.7 to nearly 0. This suggests that utilising an appropriate coolant can markedly diminish the likelihood of scratches. The remaining defects (D2, D3, D4, D5, D6) exhibit low predicted values and minimal variation dependent on the type of coolant, suggesting that this parameter exerts a relatively limited influence on these defects.
The analysis of the relationship between the parameter X4 (Figure 7c) and defects indicates that it has a significant impact on the occurrence of cracks (D2) and dimensional problems (D5). Regarding cracks, the predicted value for defect D2 shows a marked increase when X4 exceeds 0.35, suggesting that the control of this parameter may be a crucial factor in the reduction of cracks. Similarly, defect D5 (dimensional inaccuracies) exhibits a maximum at medium X4 values, indicating that maintaining X4 within a specific range can help ensure dimensional accuracy. The influence of X4 on other defects, such as D1 (scratches), D3 (surface irregularities), and D6 (hardness problems), is less pronounced.
The categorical parameter (X5) exerts a moderate influence on a range of defects. The predicted values for defects D2 (cracks) and D5 (dimensional inaccuracies) demonstrate a moderate increase depending on the category of X5, indicating a potential relationship, albeit less pronounced than for other parameters. The remaining defects demonstrate a relatively stable response to changes in X5.
The wear of the tool (X7) (Figure 7d) has a significant impact on the incidence of cracks (D2). Upon reaching a high level of tool wear (“H”), the predicted value for defect D2 exhibits a notable increase, reaching 0.8. This indicates that tools in suboptimal condition are a significant contributing factor to the appearance of cracks. In contrast, the predicted values for other defects, including surface irregularities (D3), radial runout (D4), and improper hardness (D6), exhibit greater stability and less sensitivity to this parameter. This suggests that prioritising the control of tool condition may be an effective strategy to reduce the occurrence of cracks, while other defects appear to be less dependent on this parameter.
Analysis of the relationship between input parameters and defects reveals the pivotal influence of parameters such as tool wear (X7), temperature (X1), and type of coolant (X2) on the quality of the final product. It is of particular importance to monitor and regulate tool wear and temperature to minimise the occurrence of cracks, which are among the most prevalent defects. Additionally, the selection of an appropriate coolant can markedly reduce the incidence of scratches, thus enhancing the overall surface quality of the products.
Furthermore, the parameter X4 (Figure 7c) has been demonstrated to exert a significant influence on the incidence of cracks and dimensional inaccuracy. This underscores the need to maintain its value within a defined range to eliminate these defects. The results of this analysis should be used to implement strategies to optimise the production process, with the objective of reducing defects and improving product quality.
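Partial dependence of the kind plotted in Figure 7 can be computed, e.g., with scikit-learn's `partial_dependence`. The synthetic data below merely illustrate the mechanics: the class depends mainly on the first feature, mimicking a single influential process parameter such as X1:

```python
import numpy as np
from sklearn.inspection import partial_dependence
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data: binary outcome driven by feature 0. Illustration only.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
y = (X[:, 0] > 0.5).astype(int)

clf = MLPClassifier(hidden_layer_sizes=(50,), max_iter=1000,
                    random_state=0).fit(X, y)

# One-way partial dependence: average predicted probability over the dataset
# while sweeping feature 0 across a grid of its observed values.
pd_result = partial_dependence(clf, X, features=[0], kind="average")
avg = pd_result["average"][0]
print(avg.min(), avg.max())
```

The resulting curve rises with the influential feature, which is exactly the pattern read off the X1 and X7 panels above; applying the same call per defect class to the trained NN would reproduce the Figure 7 curves.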

4.4. Proposal of the Improvements

A critical analysis of the input parameters and their influence on the incidence of defects in the production process reveals several key actions that the company must undertake to optimise production quality and reduce defects.
First, it is recommended that the company implement precise temperature control mechanisms in order to optimise and monitor production temperature (X1). Production temperatures must remain below 1020 °C to minimise the probability of cracks (D2) occurring. Implementing real-time temperature monitoring and control systems is recommended to ensure consistent temperature management, with the specific objective of minimising the occurrence of cracks.
Subsequently, the company should implement a policy of standardising the use of coolant type O throughout the production process, particularly in areas where scratches (D1) are prevalent. Given the significant reduction in the occurrence of scratches associated with the use of the “O” coolant, it would be prudent to establish this as the standard coolant. Additionally, periodic evaluations of coolant efficacy should be carried out to determine its continued capacity to diminish surface imperfections.
Regarding the parameter X4, it is recommended that the company implement monitoring systems to ensure that it remains within an optimal range, particularly below 0.35, with the objective of minimising both cracks (D2) and dimensional inaccuracies (D5). Given the significant impact of X4 on the prevalence of defects, meticulous regulation of this parameter can enhance product quality and reduce the incidence of defects.
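The two numeric limits recommended above (temperature X1 below 1020 °C, parameter X4 below 0.35) lend themselves to a simple rule-based check. The function below is a hypothetical sketch of such a monitor, not part of the study's system; the dictionary layout and function name are illustrative assumptions.

```python
# Illustrative rule-based parameter check using the thresholds reported
# in the text: temperature (X1) below 1020 and parameter X4 below 0.35.
THRESHOLDS = {"X1": 1020.0, "X4": 0.35}

def check_parameters(reading: dict) -> list:
    """Return alert messages for any reading at or above its threshold."""
    alerts = []
    for name, limit in THRESHOLDS.items():
        if reading.get(name, 0.0) >= limit:
            alerts.append(f"{name}={reading[name]} exceeds limit {limit}")
    return alerts

print(check_parameters({"X1": 1035.0, "X4": 0.30}))  # one alert, for X1
```

In a real deployment such checks would run against streaming sensor data and feed the automated alerting described later in this section.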
Furthermore, the company should adjust the production speed (X5) based on the specific types of products and the observed defect rates. It is recommended that a moderate production speed be maintained to achieve an equilibrium between quality and productivity, as this parameter exerts a moderate influence on the occurrence of cracks and dimensional inaccuracies.
The company should also implement a predictive maintenance system for tool wear (X7). The system should monitor tool usage and wear levels so that maintenance or replacement can be carried out promptly once wear reaches a critical threshold. Since tool wear plays a significant role in the formation of cracks, regular maintenance procedures are essential to minimise defects and ensure consistent quality.
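A minimal form of such a predictive trigger extrapolates the recent wear trend and flags the tool before the critical level is reached. The sketch below assumes a normalised wear scale of 0.0–1.0 with the critical level set at 0.8 (an illustrative mapping of the study's "H" wear category, not a value from the paper).

```python
# Minimal wear-projection maintenance trigger (assumed 0.0-1.0 wear scale;
# the 0.8 critical level is an illustrative stand-in for the "H" category).
CRITICAL_WEAR = 0.8

def maintenance_due(wear_history: list, horizon: int = 3) -> bool:
    """Flag a tool when wear, extrapolated linearly from the last two
    measurements, would reach the critical level within `horizon` cycles."""
    if len(wear_history) < 2:
        return bool(wear_history) and wear_history[-1] >= CRITICAL_WEAR
    rate = wear_history[-1] - wear_history[-2]  # wear per cycle
    projected = wear_history[-1] + max(rate, 0.0) * horizon
    return projected >= CRITICAL_WEAR

print(maintenance_due([0.55, 0.62, 0.70]))  # True: projected 0.94
```

Projecting forward rather than reacting to the threshold itself is what makes the maintenance "predictive": the tool is replaced during planned downtime instead of after cracks appear.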
To facilitate the aforementioned actions, it is recommended that the company implement advanced real-time quality control systems that monitor critical parameters, including the temperature, tool wear, and coolant. Such systems should be capable of providing automated alerts or adjustments to prevent the formation of defects, thereby reducing the need for rework and ensuring consistent quality throughout the production process.
Furthermore, it is imperative that employee training be conducted on a regular basis to ensure that operators and technical staff fully understand the importance of monitoring and controlling key parameters. Training should concentrate on the optimal methods for the management of temperature, tool wear, and coolant usage. Moreover, the company should undertake periodic process audits with a view toward identifying potential areas for further optimisation and ensuring that all production steps comply with the requisite quality standards.
Finally, the company should adopt a data-driven decision-making approach, using machine learning and data analytics to facilitate the continuous analysis of production data. By identifying trends in defect formation, the company can make adjustments to the production process in real time based on the insights gained from these models. This will assist in maintaining optimal conditions and reducing defect rates in a dynamic production environment.
By implementing these measures, the company can significantly reduce defects, particularly those of a cosmetic nature, improve dimensional accuracy, and enhance overall product quality. These steps not only address current problems but also establish a foundation for long-term consistency and optimisation, leading to cost savings, reduced waste, and increased customer satisfaction.

5. Conclusions

The application of traditional quality management techniques, including quality control and quality control circles (QCCs), has historically proven to be an effective approach for enhancing production process efficiency, reducing the prevalence of defective products, and improving competitiveness in the marketplace. The implementation of QCCs has been shown to enhance production efficiency by up to 92%. However, despite their advantages, traditional methods are constrained by limitations in today’s rapidly evolving production environments, which necessitate enhanced flexibility and rapid adaptation to changes. Maintaining high quality standards while increasing production speed has become a significant challenge, often resulting in elevated operating costs.
The advent of machine learning (ML) techniques represents a significant advancement in the field of quality management. Applying ML to drive shaft production has facilitated more precise identification of defects, reduced the occurrence of false alarms, and optimised the overall production process. In the company under examination, the most common defects were cracks (25.20%) and surface irregularities (21.85%), which markedly affected the quality of the final product. Furthermore, other defects, including scratches (15.63%), radial runout (14.04%), incorrect dimensions (12.28%), and incorrect hardness (11.00%), also played a significant role in affecting the overall quality and required effective solutions.
A variety of machine learning (ML) techniques, including bagged trees (BT), neural networks (NNs), and support vector machines (SVMs), were employed to construct predictive quality models. The neural network model demonstrated the highest accuracy (94.7%) and the fastest prediction speed, processing 7800 observations per second, thus exhibiting the greatest efficiency in terms of computational time. Bagged trees, while exhibiting a slower processing speed (1400 observations per second), demonstrated consistent prediction accuracy. SVMs, although displaying slightly lower accuracy (94.0%), exhibited a higher training speed (1.67 s).
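The three-model comparison can be reproduced in outline with scikit-learn. The sketch below is an assumption-laden analogue, not the study's actual pipeline: the MATLAB presets (Bagged Trees, Wide Neural Network, Fine-Gaussian SVM) are mapped onto roughly equivalent scikit-learn estimators, and the data are synthetic stand-ins sized like the 565-record dataset.

```python
import time
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 565-record, 9-parameter production dataset.
X, y = make_classification(n_samples=565, n_features=9, n_informative=5,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "BT": BaggingClassifier(n_estimators=30, random_state=0),
    "NN": MLPClassifier(hidden_layer_sizes=(100,), activation="relu",
                        max_iter=1000, random_state=0),
    "SVM": SVC(kernel="rbf", gamma="scale", random_state=0),
}
for name, model in models.items():
    t0 = time.perf_counter()
    model.fit(X_tr, y_tr)  # training time differences mirror Table 2's comparison
    acc = model.score(X_te, y_te)
    print(f"{name}: accuracy={acc:.3f}, train_time={time.perf_counter() - t0:.2f}s")
```

The relative accuracies and timings on synthetic data will not match the paper's figures; the point is the comparison protocol (identical split, accuracy, and wall-clock training time per model).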
The results of the confusion matrix analysis demonstrated that neural networks exhibited the highest efficacy in classifying defects such as surface irregularities (D3) and incorrect dimensions (D5), with classification accuracy rates of 97.4% and 97.2%, respectively. The SVMs demonstrated efficacy in classifying scratches (D1), radial runout (D4), incorrect dimensions (D5), and hardness problems (D6). However, they exhibited suboptimal performance in classifying cracks (D2), with an observed accuracy rate of 84%.
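The per-defect rates quoted here are row-wise recalls (true-positive rates) read off a multi-class confusion matrix. The snippet below shows the computation on a small illustrative set of labels; the label values and predictions are invented for demonstration.

```python
from sklearn.metrics import confusion_matrix

# Per-defect true-positive rate (recall) from a confusion matrix;
# the labels and predictions below are illustrative, not study data.
y_true = ["D1", "D2", "D2", "D3", "D3", "D3", "D5", "D5"]
y_pred = ["D1", "D2", "D3", "D3", "D3", "D3", "D5", "D5"]
labels = ["D1", "D2", "D3", "D5"]

cm = confusion_matrix(y_true, y_pred, labels=labels)
tpr = cm.diagonal() / cm.sum(axis=1)  # diagonal / row total = per-class recall
for label, rate in zip(labels, tpr):
    print(f"{label}: TPR={rate:.2f}")
```

The complementary false-discovery rate (FDR) reported in Table 3 is computed column-wise instead: off-diagonal column mass divided by the column total.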
Although traditional quality management methods, such as QCCs, offer significant advantages, particularly in terms of improving production efficiency, they are constrained by limitations in terms of adaptability and speed. In rapidly evolving production environments, traditional methods are ill-equipped to accommodate real-time alterations in parameters and processes. The application of machine learning (ML) addresses these challenges by automating the detection of defects and enabling the prediction of quality issues before they occur. This makes ML a more adaptable and responsive method than traditional techniques. Furthermore, ML techniques facilitate the optimisation of critical production parameters, such as temperature, cooling speed, and tool wear. ML models continuously learn from production data, thereby improving their performance in predicting and preventing defects over time. The combination of traditional and modern approaches offers a more comprehensive and adaptable framework for quality control in the current production environment.
To capitalise on these findings, several improvement actions are recommended for the company's implementation.
1. It would be beneficial for the company to integrate traditional quality management tools, such as the Quality Control Circle (QCC), with modern machine learning (ML) techniques. The combination of these two approaches will facilitate the identification of defects in a more expeditious and precise manner while also leveraging the predictive capacity of machine learning to anticipate potential quality issues before they materialise. This approach will result in a reduction in the number of defective products and an improvement in overall production efficiency.
2. It is evident that key parameters such as temperature, cooling speed, and tool wear have a significant impact on the occurrence of defects. It is imperative that these variables are monitored and controlled in real time to reduce defects and improve production efficiency. By focussing on optimising these parameters, the company can reduce the incidence of defects such as cracks and surface irregularities, thereby improving product quality.
3. Although machine learning models such as neural networks, bagged trees, and support vector machines have demonstrated satisfactory performance, there is scope for further optimisation. It would be beneficial for the company to investigate the potential of deeper neural networks and advanced ensemble learning techniques to enhance prediction accuracy and reduce processing time. This will facilitate more precise detection and prevention of defects.
4. The successful implementation of modern technologies such as machine learning (ML) requires ongoing training of employees. The organisation should facilitate regular workshops on the utilisation of predictive systems and the analysis of data, ensuring that the production team is fully equipped to employ these tools effectively. Training will facilitate a more rapid response to data-driven insights, thus further enhancing production processes.
5. It would be beneficial for the company to implement automated quality control systems based on machine learning and visual inspection technologies. Such systems will facilitate the real-time detection of defects, thus reducing the likelihood of halting production, mitigating the potential for production losses, and enhancing overall efficiency. In addition, automation will guarantee consistent quality while reducing operational costs.
Implementing these improvement actions will significantly improve production quality, reduce costs, and increase competitiveness in the marketplace. The integration of traditional methods with advanced ML techniques, along with the optimisation of production parameters, employee training, and the use of automation, will create a robust quality management framework. This framework will enable the company to meet the demands of modern production environments, ensuring high-quality products while maintaining efficiency and adaptability.

6. Limitations and Future Research

Despite the promising results of this study, there are several limitations that must be acknowledged to improve the effectiveness of machine learning (ML) techniques in the quality management of drive shaft manufacturing.
First, the dataset used in this study comprised 565 records, which, while sufficient for preliminary modelling, may not be representative of the full complexity of the manufacturing process. A larger and more diverse dataset would improve the reliability and generalisability of the models, making them more robust to different production environments.
Another limitation is the potential risk of model overfitting, particularly with complex techniques such as neural networks. With smaller datasets, models may perform well during training but fail to generalise to unseen data. Although cross-validation was implemented to address this issue, additional regularisation techniques or access to larger datasets could help mitigate this risk.
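The two safeguards mentioned here, cross-validation and regularisation, can be combined in a few lines. The sketch below uses synthetic data of the study's size; the L2 penalty strength (`alpha`) and network width are illustrative choices, not parameters taken from the paper.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

# Overfitting safeguards on a small dataset: 5-fold cross-validation
# plus L2 regularisation (alpha); both values are illustrative.
X, y = make_classification(n_samples=565, n_features=9, random_state=0)
model = MLPClassifier(hidden_layer_sizes=(50,), alpha=1e-2,
                      max_iter=1000, random_state=0)
scores = cross_val_score(model, X, y, cv=5)  # accuracy on each held-out fold
print(f"mean CV accuracy: {scores.mean():.3f}")
```

A large gap between training accuracy and the mean cross-validated score is the practical symptom of the overfitting risk discussed above.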
This study also focused on predicting certain types of defects, such as cracks and surface irregularities, which can limit the applicability of the models to other potential quality issues not included in the analysis. Future work should consider a broader range of defects to increase the versatility of ML-based solutions.
Furthermore, the complexity of the manufacturing environment may not have been fully captured in this research. Factors such as machine downtime, operator behaviour, and external environmental conditions such as temperature and humidity, which could influence defect occurrence, were not included in the models.
Finally, the computational demands of advanced machine learning models, particularly deep neural networks, can present a significant challenge for small and medium-sized enterprises (SMEs) with limited access to high-performance computing infrastructure. Although the presented approach demonstrates that standard hardware is sufficient for smaller datasets, the scalability of these models is constrained as the dataset size increases. For SMEs, this implies that while entry-level models can be effectively trained and deployed with typical computing resources, the adoption of more complex models for large datasets may be constrained by infrastructure limitations. Additionally, the implementation of ML in production environments requires the acquisition of specific employee skills in data handling, model application, and result interpretation. Consequently, although computational resources may not initially be a limiting factor for smaller datasets, both data volume and workforce competency are critical considerations for SMEs aiming to effectively scale ML solutions.
In summary, these limitations highlight areas for further improvement, including expanding data collection, optimising models for broader applicability, and addressing practical constraints on computational resources.
Future research on the application of machine learning (ML) to quality management in manufacturing should focus on several key areas to improve the effectiveness and scalability of current models. One of the most important directions is the expansion of the dataset. Collecting larger and more diverse data from multiple production cycles and different products would improve the generalisability of the models, allowing them to predict defects more accurately in different manufacturing environments.
Another important path for future research is to explore additional ML techniques. While this study used neural networks, bagged trees, and support vector machines (SVMs), other algorithms such as Random Forests, gradient boosting machines, or reinforcement learning could provide new insights and further improve prediction accuracy, especially when dealing with more complex, nonlinear datasets.
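Two of the suggested alternatives, Random Forests and gradient boosting, are straightforward to benchmark against the existing models under the same cross-validation protocol. The data below are again a synthetic stand-in, so the scores are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Quick comparison of two candidate algorithms from the future-work list,
# evaluated on an illustrative synthetic dataset of the study's size.
X, y = make_classification(n_samples=565, n_features=9, random_state=0)
for model in (RandomForestClassifier(random_state=0),
              GradientBoostingClassifier(random_state=0)):
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{type(model).__name__}: mean CV accuracy {scores.mean():.3f}")
```

Running candidate algorithms through an identical evaluation harness is what makes such comparisons meaningful when the defect data are eventually substituted in.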
The integration of real-time data into predictive models is another promising area. By incorporating sensor data from the production line, such as machine performance, environmental conditions, and operator behaviour, future research could improve real-time monitoring capabilities. This would enable immediate corrective action, reduce error rates, and optimise the overall production process.
Multi-objective optimisation models are also a valuable direction for future work. These models could balance different factors such as quality, cost, and efficiency, providing manufacturers with decision-making tools that optimise both production output and product quality.
Furthermore, investigating the impact of human factors on the occurrence of defects offers an opportunity for improvement. Incorporating data on operator performance, training, and compliance into ML models can reveal new ways to reduce defects and improve overall production quality.

Author Contributions

Conceptualization, K.A., L.K., and J.H.; methodology, K.A.; software, K.A.; validation, K.A., L.K., and J.H.; formal analysis, K.A.; investigation, K.A.; resources, K.A. and J.H.; data curation, K.A.; writing—original draft preparation, K.A.; writing—review and editing, L.K. and J.H.; visualization, K.A.; supervision, L.K.; project administration, K.A.; funding acquisition, K.A. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Slovak Research and Development Agency under contract No. APVV-23-0591 and by the projects VEGA 1/0268/22 and KEGA 038TUKE-4/2022 granted by the Ministry of Education, Science, Research, and Sport of the Slovak Republic.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article, further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Research methodology.
Figure 2. Defects recorded in the manufacturing process.
Figure 3. Distribution of defect types (D1 to D6) across specific weeks.
Figure 4. Impact of individual process parameters on defect occurrence. (a) X1 vs. Defect, (b) X3 vs. Defect, (c) X4 vs. Defect, (d) X6 vs. Defect, (e) X8 vs. Defect, (f) X9 vs. Defect.
Figure 5. The confusion matrices of the developed models. (a) BT model; (b) NN model; (c) SVM model.
Figure 6. ROC and AUC analyses for the developed models. (a) BT model; (b) NN model; (c) SVM model.
Figure 7. Partial dependence analysis for the NN model. (a) X1 vs. Defect, (b) X2 vs. Defect, (c) X4 vs. Defect, (d) X7 vs. Defect.
Table 1. Results of statistical tests.

| Parameter | p-Value | Significance | Test Type |
| --- | --- | --- | --- |
| Temperature (X1) | 0.00 | Significant | Kruskal–Wallis |
| Coolant Type (X2) | 0.00 | Significant | Chi-square |
| Cooling Speed (X3) | 0.00 | Significant | Kruskal–Wallis |
| Production Speed (X5) | 0.04 | Significant | Chi-square |
| Cutting Speed (X6) | 0.03 | Significant | Kruskal–Wallis |
| Tool Wear (X7) | 0.03 | Significant | Chi-square |
| Rotational Speed of the Grinding Wheel (X8) | 0.02 | Significant | Kruskal–Wallis |
| Grinding Time (X9) | 0.01 | Significant | Kruskal–Wallis |
Table 2. The best models: training parameters and results.

| Training Results | BT | NN | SVM |
| --- | --- | --- | --- |
| Prediction speed [obs./s] | 1400 | 7800 | 5900 |
| Training time [s] | 21.02 | 8.39 | 1.67 |
| Model hyperparameters | Preset: Bagged Trees; Ensemble method: Bag; Learner type: Decision Tree; Maximum number of splits: 564; Number of learners: 30 | Preset: Wide Neural Network; First layer: 100; Activation: ReLU; Iteration limit: 1000 | Preset: Fine-Gaussian SVM; Kernel function: Gaussian; Kernel scale: 0.75 |
| Accuracy [%] | 94.2 | 94.7 | 94.0 |
| Error rate [%] | 5.8 | 5.3 | 6.0 |
Table 3. The best models for particular types of defects.

| Defect Type | Bagged Trees | Neural Network | SVM |
| --- | --- | --- | --- |
| D1 | High accuracy (96.2% TPR), some minor errors | Good accuracy (95.2% TPR), slight misclassification (4.8% FDR) | Excellent accuracy (100% TPR) |
| D2 | Moderate performance (90.3% TPR) | Good accuracy (92.2% TPR) | Low precision (84.0% TPR), high misclassification (16.0% FDR) |
| D3 | High accuracy (95.0% TPR) | Excellent accuracy (97.4% TPR) | Good accuracy (94.3% TPR) |
| D4 | High accuracy (96.1% TPR) | Good accuracy (95.9% TPR) | Excellent accuracy (100% TPR) |
| D5 | Excellent accuracy (97.2% TPR) | Excellent accuracy (97.2% TPR) | Excellent accuracy (100% TPR) |
| D6 | High accuracy (93.8% TPR) | Moderate accuracy (91.0% TPR), some errors (9.0% FDR) | Excellent accuracy (100% TPR) |

The grey colour highlights the model that best predicts each defect type.