Article

The Effect of Grouping Output Parameters by Quality Characteristics on the Predictive Performance of Artificial Neural Networks in Injection Molding Process

Molding & Metal Forming R&D Department, Korea Institute of Industrial Technology, Bucheon 14442, Republic of Korea
*
Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(23), 12876; https://doi.org/10.3390/app132312876
Submission received: 17 October 2023 / Revised: 24 November 2023 / Accepted: 29 November 2023 / Published: 30 November 2023
(This article belongs to the Special Issue Applications of Deep Learning and Artificial Intelligence Methods)

Abstract
In this study, a multi-input, multi-output artificial neural network (ANN) was constructed by classifying the output parameters into groups according to the physical meanings and characteristics of the product quality factors in the injection molding process. Injection molding experiments were conducted on bowl products, and a dataset was established. Based on this dataset, an ANN model was developed to predict the quality of the molded products. The input parameters included melt temperature, mold temperature, packing pressure, packing time, and cooling time. The output parameters were the mass, diameter, and height of the molded product. The output parameters were grouped in two ways. In one case, diameter and height, both representing length, were grouped together, while mass was placed in a separate group. In the other case, mass, diameter, and height were each assigned to their own group and applied to the ANN. A multi-task learning method was used to group the output parameters. The performance of the two multi-task learning-based ANNs was compared with that of a conventional ANN in which the output parameters were not separated but applied to a single layer. The comparison showed that the multi-task learning architecture that grouped the output parameters according to the physical meaning and characteristics of the molded-product quality factors improved the prediction performance by about 32.8% based on the RMSE values.

1. Introduction

Injection molding is the process of injecting molten resin at high temperatures into a cavity in a mold at high speeds and pressures to form the final product. This process involves a complex interplay of various physical phenomena and material behaviors, such as rheology, flow dynamics, and heat transfer, at each stage of the process. Consequently, research has been actively pursued for many years to model the relationships between influencing factors such as process conditions in the injection molding process and the quality of the molded product, with the goal of predicting the process and optimizing product quality [1,2].
In recent years, with the advent of the Fourth Industrial Revolution, artificial intelligence (AI) technology has found applications in fields such as data mining, image processing, engineering system modeling, and technical control. This integration has been made possible by the development of intelligent information technology and data computing, ushering in a new era of AI-driven solutions. Among these technologies, there is growing industrial demand for artificial neural networks (ANNs), which have shown strong performance in unraveling complex nonlinear relationships, making them one of the most promising techniques in the field of artificial intelligence [3,4].
In line with this paradigm shift, the injection molding industry has also embraced artificial intelligence technology to overcome the limitations of existing techniques for predicting the relationship between process factors and product quality. Models for predicting the quality of molded products using ANNs can be categorized into two types. One is the multi-input, single-output (MISO) structure, which predicts a single quality parameter from multiple process conditions; the other is the multi-input, multi-output (MIMO) structure, which predicts multiple quality parameters from multiple process conditions. Yang et al. [5] used a MISO-ANN to predict the mass of injection-molded products based on 10 process conditions, including melt temperature and mold temperature, set as input parameters. Their research aimed to find the optimal conditions for molding products with the desired mass. The performance evaluation of the established ANN showed that the deviation between the predicted and actual values of the mass was within 0.23 ± 0.02 g. Based on this, they used the ANN to derive process conditions for a product with a target mass of 41.14 g. Injection molding experiments were conducted, and the mass of the product deviated from the target mass by only 0.15 ± 0.07 g, demonstrating a high degree of accuracy in deriving process conditions. They concluded that the ANN model for this product had a high degree of accuracy and reliability. Heinisch et al. [6] used simulations of 2 mm thick flat samples to create a dataset with three output variables: product mass, length, and width. The dataset was based on six input variables, including resin temperature, mold temperature, injection time, packing pressure level and time, and cooling time, and was constructed using various experimental designs.
When the prediction accuracy of ANNs was compared for datasets generated using different experimental designs, the central composite design (CCD) showed the highest coefficient of determination at 0.930, indicating the most effective performance. However, in some cases, the D-optimal design and the L25 orthogonal array method also showed coefficients of determination similar to that of the CCD, likewise indicating excellent performance. These results show that ANNs can achieve a high level of accuracy and reliability.
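The coefficient of determination cited above is defined as one minus the ratio of residual to total sum of squares; a minimal sketch (not the authors' code) is:

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)           # residual sum of squares
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot
```

A perfect prediction yields R² = 1, while predicting the mean of the targets yields R² = 0.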
However, in the case of MIMO models, multiple quality factors with different physical meanings and characteristics are evaluated using a single feed-forward neural network. This structure includes multiple input variables, input neurons, a certain number of hidden layers, and an appropriate number of output neurons responsible for predicting multiple desired variables. The disadvantage of this structure is that it is not flexible enough to evaluate all quality factors, because the output neurons must use the same features (the output of the last hidden layer) for all variables. If the input variables fundamentally affect each of the output variables in a different way, this structure may not produce acceptable results [7]. For example, when constructing a model using a MIMO structure, changing the weights and biases associated with the input variables to improve the prediction accuracy of one quality factor may result in a decrease in the prediction accuracy of other output values.
This study builds on a data-driven neural network approach previously developed for the injection molding of a light guide module with a fine pattern on a large-area mold core. This paper proposes a model of the correlation between injection molding process conditions and the multiple qualities of molded products using multi-task learning-based ANNs. Such an ANN establishes multiple task structures with separate branches for different sets of output parameters, all sharing common input parameters. Injection molding experiments were conducted on bowl products, and a dataset was established. Six process conditions, namely melt temperature, mold temperature, injection speed, packing pressure, packing time, and cooling time, were set as input parameters. The output parameters were mass, diameter, and height. Based on this, two multi-task learning-based ANN architectures were constructed. One architecture groups the length parameters, diameter and height, into a single category, while the remaining parameter, mass, is placed in a separate group. The other architecture assigns mass, diameter, and height to separate groups. The predictive performance of the two multi-task learning ANNs is then compared with that of a single-task MIMO structure. Based on this comparison, structural guidelines for ANNs that predict multiple qualities of injection-molded products are presented.

2. Theoretical Background

2.1. Artificial Neural Networks

Artificial neural networks (ANNs) provide a powerful solution for handling complex, non-linear relationships in industries where conventional methods struggle. They mimic the information processing structure of the human brain and are widely used in fields such as control engineering and robotics. In an ANN, an artificial neuron processes data by multiplying its inputs with weights and applying an activation function to generate an output. This process is represented by the simplest ANN, the perceptron in Figure 1, whose data operations are shown in Equation (1). In Equation (1), x represents the input variables, while w and b denote the weights and biases required for model updates; f represents the activation function in Figure 1.
$$ y = f\left( \sum_{i=1}^{n} \left( x_i w_i + b \right) \right) \tag{1} $$
An ANN is a computational processing system, as shown in Figure 2, that consists of multiple interconnected perceptron structures [9]. Unlike the perceptron, the structure in Figure 2 contains several intermediate computational layers called hidden layers, so named because the processes taking place within them are not readily observable by users. Numerous nodes (neurons) are distributed within these layers, and this configuration is known as an ANN. Hidden layers can consist of multiple layers, and the term “deep learning” refers to the depth of these hidden layers. Equation (2) represents the process of calculating the output value, $y_i^{(l)}$, of the ith neuron in the lth layer; it extends the output form of the perceptron shown in Equation (1) to a multi-layer ANN.
$$ y_i^{(l)} = f\left( \sum_{j=1}^{n^{(l-1)}} \left( y_j^{(l-1)} w_{ij}^{(l)} + b_i^{(l)} \right) \right) \tag{2} $$
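Equations (1) and (2) can be sketched in NumPy as follows; the tanh default is a placeholder activation (the networks in this study use ELU in the hidden layers):

```python
import numpy as np

def perceptron(x, w, b, f=np.tanh):
    """Equation (1): y = f(sum_i (x_i * w_i + b))."""
    return f(np.dot(x, w) + b)

def layer_forward(y_prev, W, b, f=np.tanh):
    """Equation (2): outputs of all neurons in layer l from layer l-1.
    W has shape (n_l, n_{l-1}); b has shape (n_l,)."""
    return f(W @ y_prev + b)
```

Stacking `layer_forward` calls, one per hidden layer, reproduces the multi-layer forward pass of Figure 2.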

2.2. Backpropagation

The backpropagation algorithm, a fundamental technique for training neural networks, takes its name from the way it handles errors: by propagating them in the direction opposite to the network's forward flow. The method involves two key steps, the forward pass and the backward pass. In the forward pass, the network computes predictions by processing input data through its layers. In the backward pass, it uses the error between the predictions and the actual data to adjust the model's internal parameters, such as weights and biases; the required gradients are computed with the chain rule of derivatives, as shown in Figure 3 and Equation (3) [8]. This iterative process continues until the network converges to an optimal state.
$$ \frac{\partial y}{\partial w} = \frac{\partial K(p,q)}{\partial p} \cdot g'(v) \cdot f'(w) + \frac{\partial K(p,q)}{\partial q} \cdot h'(z) \cdot f'(w) \tag{3} $$
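As a sanity check of Equation (3), a toy computational graph with hypothetical choices of f, g, h, and K (not taken from the paper) illustrates the chain rule, verified against a finite-difference gradient:

```python
import math

# Toy graph matching Equation (3):
#   v = f(w), z = f(w), p = g(v), q = h(z), y = K(p, q)
f = lambda w: w * w          # f(w) = w^2,  f'(w) = 2w
g = math.sin                 # g'(v) = cos(v)
h = math.cos                 # h'(z) = -sin(z)
K = lambda p, q: p * q       # dK/dp = q,  dK/dq = p

def dy_dw(w):
    """Chain rule, Equation (3):
    dy/dw = dK/dp * g'(v) * f'(w) + dK/dq * h'(z) * f'(w)."""
    v = z = f(w)
    p, q = g(v), h(z)
    return q * math.cos(v) * 2 * w + p * (-math.sin(z)) * 2 * w

def dy_dw_numeric(w, eps=1e-6):
    """Central finite-difference check of the analytic gradient."""
    y = lambda w: K(g(f(w)), h(f(w)))
    return (y(w + eps) - y(w - eps)) / (2 * eps)
```

The two gradients agree to within the finite-difference error, which is the same consistency that backpropagation exploits layer by layer.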

2.3. Hyperparameters

Hyperparameters are user-defined variables that are essential for training ANN models. These hyperparameters have a significant impact on the efficiency and performance of the model. Key hyperparameters include not only structural elements such as the number of neurons or hidden layers, but also various other factors that can affect the model’s effectiveness.
The proper configuration of these hyperparameters is a critical stage in determining the efficiency and performance of the model. In this work, the hyperband technique of Li et al. [10] is used to identify hyperparameter settings that suit the characteristics and structure of the data. The hyperband approach progressively selects and refines hyperparameter combinations that exhibit superior performance, rather than evaluating all possible combinations at once. This method is known to converge faster than traditional techniques such as grid search, random search, and Bayesian optimization, while achieving superior results. As a result, the hyperband method is widely used in practice.
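Libraries such as Keras Tuner provide ready-made Hyperband implementations; purely as an illustration, the successive-halving core of the method can be sketched in plain Python (the `evaluate` callback, `min_budget`, and `eta` here are hypothetical, not the settings used in this study):

```python
def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """One Hyperband-style bracket: score all configurations on a small
    training budget, keep the best 1/eta of them, then repeat with an
    eta-times larger budget until one configuration remains."""
    budget = min_budget
    while len(configs) > 1:
        # evaluate(config, budget) -> validation loss (lower is better)
        scores = [(evaluate(c, budget), c) for c in configs]
        scores.sort(key=lambda t: t[0])
        configs = [c for _, c in scores[: max(1, len(configs) // eta)]]
        budget *= eta
    return configs[0]
```

The saving comes from discarding poor configurations early, before they consume a full training budget.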

2.4. Multi-Task Learning

A neural network model that handles multiple input and output variables is known as a multi-input, multi-output (MIMO) model. The methods for building MIMO models can be divided into single-task learning and multi-task learning. In single-task learning, all variables share a layer, which poses difficulties due to interdependencies. Multi-task learning, on the other hand, separates the variables into different layers within a single model, allowing tailored learning for each variable. This approach is more efficient and suitable for building MIMO models for multiple output predictions. Figure 4 shows the basic structure of multi-task learning.
The goal of multi-task learning can be briefly described as follows: “Multi-task learning is an approach to inductive transfer that improves generalization by using the domain information contained in the training signals of related tasks as an inductive bias. It does this by learning tasks in parallel while using a shared low dimensional representation; what is learned for each task can help other tasks be learned better” [13,14]. The fundamental assumption of multi-task learning is that the tasks being learned are all related, or at least some of them are, so jointly learning all tasks can lead to better outcomes than learning each task independently. The idea is that when tasks learn together, information shared between them can improve overall performance. In processes like injection molding, the qualities of the molded product are not entirely independent of the process conditions and of one another, but are coupled through interactions: some process effects jointly influence several qualities, while others act on a single quality. This intertwined relationship among tasks makes multi-task learning a suitable structure. Fundamentally, multi-task learning is commonly employed to enhance pattern recognition accuracy in computer vision applications, where multiple tasks share the same input and are processed within a single neural network, a setup often referred to as multi-class learning [14]. More recently, however, multi-task learning techniques have been applied to various deep learning architectures such as CNNs and LSTMs, and to regression problems within artificial neural networks.
Lately, the adoption of deep learning techniques, particularly the deep neural network structure, has gained prominence in multi-task learning due to its ability to learn latent representations of data without requiring explicit hand-crafted formulations [11]. In various applications, the approach involves either hard parameter sharing (where the hidden layers are shared among all tasks) or soft parameter sharing (where each task has its own model, represented by its own set of parameters, within the hidden layers) [11,12,14], as shown in Figure 4.
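A minimal NumPy sketch of hard parameter sharing, assuming a six-input shared trunk and two task heads mirroring the mass and length (diameter + height) groups; all layer sizes and weights here are illustrative, not the optimized values in Table 2:

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, W, b):
    """One fully connected layer with ELU activation."""
    z = x @ W + b
    return np.where(z > 0, z, np.exp(np.minimum(z, 0.0)) - 1.0)

# Hard parameter sharing: one trunk shared by all tasks,
# then a separate head (task layer) per output group.
n_in, n_hidden = 6, 8
W_shared, b_shared = rng.normal(size=(n_in, n_hidden)), np.zeros(n_hidden)
W_mass, b_mass = rng.normal(size=(n_hidden, 1)), np.zeros(1)
W_dims, b_dims = rng.normal(size=(n_hidden, 2)), np.zeros(2)  # diameter + height

def forward(x):
    shared = dense(x, W_shared, b_shared)  # representation shared by all tasks
    mass = shared @ W_mass + b_mass        # task head 1 (linear output)
    dims = shared @ W_dims + b_dims        # task head 2 (linear output)
    return mass, dims

x = rng.normal(size=(4, n_in))             # a batch of 4 process conditions
mass, dims = forward(x)
```

During training, gradients from both heads flow back into `W_shared`, which is exactly how shared information between tasks enters the common representation.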

3. Experiment

3.1. Materials and Molding Equipment

In this study, injection molding experiments were conducted to collect data for the development of an artificial neural network (ANN). A single-cavity mold was used to injection mold a bowl-shaped product with specific dimensions, including a diameter of 99.90 mm and a height of 50.80 mm, as shown in Figure 5. The material selected for injection molding was LUPOL GP1007F polypropylene (LG Chem, Seoul, Republic of Korea). The injection molding machine used in this study is the LGE-150 model (LSMtron, Anyang, Gyeonggi, Republic of Korea) equipped with a 32 mm diameter screw. This model has a clamping force of 150 tons, a maximum injection speed of 1000 mm/s and a maximum injection pressure of 350 MPa.

3.2. Experimental Conditions

The recommended molding ranges for resin and mold temperatures were defined by considering the resin manufacturer’s recommendations and the property database of LUPOL GP1007F within Moldflow Insight 2023 (Autodesk, San Rafael, CA, USA). These temperature ranges were categorized into three levels, as shown in Table 1. The packing pressure and time were determined based on preliminary experiments to establish suitable process ranges for the product, and these were also divided into three levels and applied in actual molding experiments, as shown in Table 1. The injection time and cooling time were obtained through CAE analysis using Moldflow Insight 2023 (Autodesk, USA), and these conditions were also divided into three levels, similarly to other process parameters, for application in molding experiments. Based on the levels of process conditions in Table 1, 27 process conditions were generated using the L27 orthogonal array design. Additionally, 23 process conditions were created by randomly selecting values within the minimum and maximum ranges of the process conditions in Table 1 as shown in Table A1 (Appendix A). In total, injection molding experiments were conducted for 50 sets of process conditions, collecting data on the mass, diameter, and height of the injection-molded products to construct the dataset used for training the ANN. The mass of the injection-molded product was measured using the CUX420H (CAS, Yangju-si, Gyeonggi-do, Republic of Korea) electronic scale. The mass measurements were conducted under ambient conditions, using a case cover during the process to minimize the influence of atmospheric movement. The measurements were taken to two decimal places and the average of five measurements was used. The diameter of the product was evaluated using the average values measured at six points, as shown in Figure 6a. Diameter measurement was performed using Datastar200 (RAM OPTICAL INSTRUMENT, Westlake, OH, USA), a non-contact optical measurement device. 
The molded part was placed on a properly leveled measuring table, and the outline of the part was traced with the device to obtain the diameter shown in Figure 6b. The height of the product was determined using the average of the values measured at the four points shown in Figure 7 with a digimatic height gauge (Mitutoyo, Kawasaki, Kanagawa, Japan). The height was measured by attaching the gauge to a vertical rod and placing the product between the gauge and a leveled measuring table. Each point in Figure 7 was measured five times, and the average value was used.
Out of the dataset containing 50 different process conditions, 38 datasets were designated as the training dataset, 6 datasets were designated as the validation dataset, and the remaining 6 datasets were designated as the test dataset, which was used to evaluate the performance of the model. To ensure that the influence of the parameter scale was consistent and to standardize the magnitudes and differences between the parameter values, all datasets underwent min–max normalization using Equation (4). In the early stages of this research, an artificial neural network was constructed using standard min–max normalization, which is commonly employed in statistics, scaling values to the range between 0 and 1. However, challenges arose due to parameters being normalized to 0, leading to data saturation and difficulties in predicting accurate output values for certain cases. To resolve this issue, the saturation problem was addressed by implementing min–max normalization within the range of 0.1 to 0.9, as outlined in Equation (4):
$$ x_i' = (0.9 - 0.1) \times \frac{x_i - \min X}{\max X - \min X} + 0.1, \quad x_i \in X \tag{4} $$
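Equation (4) translates directly into code; a minimal sketch of the 0.1–0.9 scaling used to avoid saturation at 0:

```python
def minmax_scale(x, lo=0.1, hi=0.9):
    """Equation (4): scale a sequence of values into [lo, hi]
    instead of [0, 1], so no parameter is normalized exactly to 0."""
    x_min, x_max = min(x), max(x)
    return [(hi - lo) * (v - x_min) / (x_max - x_min) + lo for v in x]
```

For example, the three melt temperature levels 200, 220, and 240 °C map to 0.1, 0.5, and 0.9.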

4. Neural Network Architectures and Implementation

In this study, three multi-input, multi-output (MIMO) models were constructed with six process parameters, melt temperature, mold temperature, injection speed, packing pressure, packing time, and cooling time, as input parameters, and the mass, diameter, and height of the molded product as output parameters. One of the models is Network A, shown in Figure 8, a conventional artificial neural network (ANN) that combines all three output parameters in a single-task layer. The other two models were built using multi-task learning by grouping the output parameters based on their physical meanings and characteristics. One model groups length, represented by diameter and height, and separates mass as a separate group, as shown in Network B (Figure 9), while the other model, Network C, classifies all output parameters into separate groups, as shown in Figure 10.
Table 2 shows the hyperparameters used to build the ANN models in this study and the search ranges for optimization. The exploration included the seed number. Typically, when constructing artificial neural networks, the seed number is set to a fixed value while the remaining hyperparameters are explored. However, the seed number, like the batch size, reflects algorithmic characteristics of the device and program used to train the neural network and must be considered carefully. In the initial stages of the research, the seed number was fixed without exploration, but it was observed that, even with the same network structure, the results could vary significantly depending on the seed number. Therefore, to prevent such variations and optimize the results, the seed number was treated as a hyperparameter and explored for optimal values. The optimizer was consistently set to the widely used Adam optimizer, with its parameters defined based on the research of Kingma et al. [15]. Initially, the network was constructed with the commonly used default coefficient values of the Adam optimizer. However, after confirming the variations in Adam's performance observed in previous studies and recognizing the need to explore the coefficients for each dataset, this research applied the coefficient ranges explored in the study by Kingma et al. [15] in order to search for and apply the optimal Adam coefficients. The activation function was set to the popular ELU function from the ReLU family, and the initializer was the He normal initializer, which is known to perform well with the ReLU family.
For the output layer, where the results of the neural network model are produced, a linear function was applied as the activation function, and the Xavier normal initializer, which performs well with the linear function, was used. Other hyperparameters were explored according to the ranges shown in Table 2. However, to facilitate a comparison of the performance of Networks A, B, and C, the number of common hidden layers was set to 3, and the number of hidden layers associated with each output parameter was set to 1. In addition, the root mean square errors (RMSEs) were used as a metric to evaluate the performance during the training process of the ANNs.
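The RMSE metric used for training and evaluation is standard; a minimal implementation over normalized targets and predictions:

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))
```

Because all parameters were normalized to the 0.1–0.9 range beforehand, RMSE values for mass, diameter, and height are directly comparable across the three networks.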

5. Results

Table 3 shows the results of hyperparameter exploration for Networks A, B, and C. It is important to note that during hyperparameter exploration, the hidden layer structure was kept consistent across the three different artificial neural network (ANN) architectures to facilitate intuitive comparison. The prediction results for the untrained test data (Experiments #28, 30, 31, 32, 36, and 45, as shown in Table A1) using the neural network described in Table 3 are shown in Table 4. To evaluate the performance, the root mean square error (RMSE) between the measured values and the predictions generated by the neural network was calculated for the normalized test data.
As shown in Table 4, grouping the quality factors of the injection-molded products resulted in superior performance for Networks B and C, which used the multi-task learning structure, compared with the conventional single MIMO structure of Network A. In particular, Network C exhibited the best RMSE value, showing an improvement of approximately 32.8% over the conventional structure of Network A on the total normalized test data.
Table 5 shows the results comparing the prediction performance for each quality of the molded product, focusing on individual quality factors rather than the entire test dataset. Figure 11 graphically illustrates these results. In Table 5, it can be observed that Network A, the conventional MIMO neural network structure, exhibited the lowest prediction performance for all three factors: mass, diameter, and height. On the other hand, Network C showed the best prediction performance compared to that of Network A for mass, diameter, and height. It showed the most significant improvement in mass, achieving approximately 56.6% better performance based on the RMSE values. These results are also confirmed in Figure 11, where it is shown that the multi-task learning structures, grouped by the quality factors of the molded products, generally outperformed the conventional ANN.
To analyze the error deviation between the actual measurements of the injection-molded product's quality and the predictions of the networks, the mean and standard deviation of the squared errors were calculated, as shown in Table 6. The average of the squared errors is the mean squared error (MSE). Figure 12 compares the standard deviations of the prediction errors for the molded product's quality between the networks, as calculated in Table 6. Figure 12c shows that the difference in standard deviations between Networks A and B is almost negligible. However, considering the overall results for mass, diameter, and height, the standard deviations of the prediction errors for Networks B and C improved compared with the conventional single-task structure of Network A. Across mass, diameter, and height, Network C, which assigns a separate task to each, shows an improvement in standard deviation of up to approximately 83.9% compared with Network A.
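The two statistics compared in Table 6 can be reproduced with a short helper (illustrative only; the actual measurement and prediction values live in Tables 4–6):

```python
import statistics

def squared_error_stats(y_true, y_pred):
    """Mean (MSE) and sample standard deviation of the squared
    prediction errors, as used to compare the error spread of
    Networks A, B, and C."""
    sq = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return statistics.mean(sq), statistics.stdev(sq)
```

A lower standard deviation of the squared errors indicates that a network's prediction quality is more consistent across test conditions, not merely better on average.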
The performance of the networks with respect to the dimensional quality specifications of ISO 20457:2018 (plastic-molded parts—tolerances and acceptance conditions) for injection-molded parts, covering diameter and height in millimeters [16], as well as the quality specifications for mass in percent [17], which are commonly applied to PP molded parts, is shown in Figure 13. The ISO 20457:2018 specification for the injection-molded parts used in this study is 0.09 mm for both diameter and height [16], and the quality specification for the mass of the molded parts is 0.5% [17]. Studies applying artificial neural networks to injection molding processes have primarily expressed performance using metrics such as error ratios or RMSEs. However, to assess whether a constructed artificial neural network is practically applicable to injection molding processes, it is essential to compare the results against the quality specifications used in industry. Therefore, in this study, the final performance of the artificial neural network was evaluated against the actual quality specifications of the manufactured products. Comparing the results with the quality standard for mass confirms that test datasets 1, 4, and 5 in Network A exceed the quality specification, as shown in Figure 13a. On the other hand, both Network B and Network C meet the quality standard for mass, with Network C generally providing predictions closest to the actual measured values. The comparative results for diameter in Figure 13b and height in Figure 13c show that all networks produce predictions that meet the quality standards. Furthermore, similar to Figure 13a, Network C consistently produces results closest to the actual quality measurements.
Based on these results, it can be confirmed that in the construction of ANNs for predicting the quality of injection-molded products in terms of mass, diameter, and height, the architectures of Networks B and C, which apply multi-task learning by grouping the quality factors according to their characteristics, outperform the traditional single-task MIMO neural network structure (Network A).

6. Discussion and Conclusions

In this study, artificial neural networks (ANNs) were built to predict the relationship between process conditions and product quality in injection molding. Injection molding experiments were conducted on bowl products, and data were collected to evaluate the predictive performance of a multi-task learning structure with the grouping of quality factors based on their physical meanings and characteristics.
Based on the collected dataset, three ANNs with different architectures were constructed. The first is Network A, the conventional multi-input, multi-output (MIMO) structure, in which the output parameters for the mass, diameter, and height of the molded product are connected to a single task layer. The second is Network B, in which diameter and height are grouped and assigned to one task layer, while mass is separated into its own group with its own task layer. The last is Network C, in which all output parameters, mass, diameter, and height, are assigned to individual task layers. Networks B and C, which applied multi-task learning according to output parameter groups, both showed relatively superior performance in predicting product quality in all scenarios compared with the typical MIMO-ANN, Network A. In particular, the architecture of Network C, which assigns product mass, diameter, and height to separate task groups, showed excellent performance in predicting all three qualities. Compared with the RMSE value of the general MIMO-ANN, Network A, the overall root mean square error (RMSE) of Network C on the entire test data improved by approximately 32.8%. For mass, diameter, and height, the respective improvements were 56.6%, 15.0%, and 44.3%, indicating that Network C exhibited superior predictive performance to the conventional MIMO neural network (Network A) based on RMSE. These results suggest that a multi-task learning architecture that separates groups based on the characteristics of the quality factors of injection-molded products set as output parameters may be a more suitable approach for the quality prediction of injection-molded products using an ANN.
The analysis of the specific dataset of the bowl product used in this study indicates that a multi-task learning architecture, which divides and assigns separate tasks based on the physical meanings and characteristics of the quality factors of injection molded products set as output parameters, may be a better choice for predicting the mass, diameter, and height of injection-molded products compared to the conventional MIMO structure of the ANN. The results of this study are expected to serve as valuable reference material for future research on the application of the ANN in the injection molding industry.

Author Contributions

Conceptualization, J.L., J.K. (Jongsun Kim) and J.K. (Jongsu Kim); methodology, J.L., J.K. (Jongsun Kim) and J.K. (Jongsu Kim); data curation, J.L.; formal analysis, J.L., J.K. (Jongsun Kim) and J.K. (Jongsu Kim); validation, J.L., J.K. (Jongsun Kim) and J.K. (Jongsu Kim); visualization, J.L.; investigation, J.L.; resources, J.L.; writing—original draft preparation, J.L.; writing—review and editing, J.L., J.K. (Jongsun Kim) and J.K. (Jongsu Kim). All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Ministry of Trade, Industry and Energy, and the Korea Evaluation Institute of Industrial Technology (KEIT) in 2023 (20019131, KM230100).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are openly available at https://doi.org/10.3390/polym14091724, reference number [17].

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. Injection molding conditions: orthogonal array of L27 and random array. Reprinted/adapted with permission from Ref. [17]. 2022, Lee, J.; Yang, D.; Yoon, K.; Kim, J.
Exp. No. | Melt Temperature (°C) | Mold Temperature (°C) | Injection Speed (mm/s) | Packing Pressure (bar) | Packing Time (s) | Cooling Time (s) | Note
1 | 200 | 40 | 40.0 | 150 | 6.0 | 38 | L27
2 | 200 | 40 | 40.0 | 150 | 12.0 | 48 | L27
3 | 200 | 40 | 40.0 | 150 | 18.0 | 58 | L27
4 | 200 | 50 | 70.0 | 200 | 6.0 | 38 | L27
5 | 200 | 50 | 70.0 | 200 | 12.0 | 48 | L27
6 | 200 | 50 | 70.0 | 200 | 18.0 | 58 | L27
7 | 200 | 60 | 100.0 | 250 | 6.0 | 38 | L27
9 | 200 | 60 | 100.0 | 250 | 18.0 | 58 | L27
10 | 220 | 40 | 70.0 | 250 | 6.0 | 48 | L27
11 | 220 | 40 | 70.0 | 250 | 12.0 | 58 | L27
12 | 220 | 40 | 70.0 | 250 | 18.0 | 38 | L27
13 | 220 | 50 | 100.0 | 150 | 6.0 | 48 | L27
14 | 220 | 50 | 100.0 | 150 | 12.0 | 58 | L27
15 | 220 | 50 | 100.0 | 150 | 18.0 | 38 | L27
16 | 220 | 60 | 40.0 | 200 | 6.0 | 48 | L27
17 | 220 | 60 | 40.0 | 200 | 12.0 | 58 | L27
18 | 220 | 60 | 40.0 | 200 | 18.0 | 38 | L27
19 | 240 | 40 | 100.0 | 200 | 6.0 | 58 | L27
20 | 240 | 40 | 100.0 | 200 | 12.0 | 38 | L27
21 | 240 | 40 | 100.0 | 200 | 18.0 | 48 | L27
22 | 240 | 40 | 40.0 | 250 | 6.0 | 58 | L27
23 | 240 | 50 | 40.0 | 250 | 12.0 | 38 | L27
24 | 240 | 50 | 40.0 | 250 | 18.0 | 48 | L27
25 | 240 | 60 | 70.0 | 150 | 6.0 | 58 | L27
26 | 240 | 60 | 70.0 | 150 | 12.0 | 38 | L27
27 | 240 | 60 | 70.0 | 150 | 18.0 | 48 | L27
28 | 214 | 55 | 82.7 | 204 | 16.3 | 52 | Random
29 | 204 | 44 | 43.4 | 202 | 13.9 | 41 | Random
30 | 203 | 46 | 93.6 | 205 | 13.7 | 45 | Random
31 | 202 | 54 | 83.4 | 213 | 6.6 | 48 | Random
32 | 206 | 43 | 61.6 | 221 | 6.9 | 39 | Random
33 | 212 | 44 | 53.3 | 240 | 17.0 | 52 | Random
34 | 212 | 51 | 90.8 | 224 | 6.1 | 48 | Random
35 | 200 | 52 | 50.0 | 215 | 17.6 | 39 | Random
36 | 229 | 51 | 46.2 | 153 | 11.7 | 45 | Random
37 | 228 | 49 | 53.2 | 217 | 12.3 | 58 | Random
38 | 222 | 51 | 63.7 | 167 | 8.7 | 51 | Random
39 | 219 | 50 | 41.4 | 156 | 16.3 | 52 | Random
40 | 228 | 46 | 96.5 | 154 | 16.7 | 57 | Random
41 | 228 | 46 | 62.5 | 191 | 10.9 | 46 | Random
42 | 219 | 42 | 98.4 | 237 | 17.9 | 41 | Random
43 | 220 | 43 | 55.8 | 241 | 14.8 | 44 | Random
44 | 233 | 42 | 50.8 | 198 | 13.5 | 55 | Random
45 | 238 | 53 | 41.6 | 221 | 17.2 | 40 | Random
46 | 234 | 48 | 68.2 | 222 | 8.8 | 41 | Random
47 | 233 | 44 | 84.9 | 171 | 6.7 | 55 | Random
48 | 234 | 43 | 56.9 | 176 | 11.1 | 48 | Random
49 | 239 | 49 | 41.2 | 234 | 8.6 | 52 | Random
50 | 240 | 49 | 76.1 | 241 | 6.4 | 51 | Random

References

  1. Rosato, D.V.; Rosato, M.G. Injection Molding Handbook; Springer Science & Business Media: Berlin, Germany, 2012. [Google Scholar]
  2. Fernandes, C.; Pontes, A.J.; Viana, J.C.; Gaspar-Cunha, A. Modeling and optimization of the injection-molding process: A review. Adv. Polym. Technol. 2018, 37, 429–449. [Google Scholar] [CrossRef]
  3. Shen, C.; Wang, L.; Li, Q. Optimization of injection molding process parameters using combination of artificial neural network and genetic algorithm method. J. Mater. Process. Technol. 2007, 183, 412–418. [Google Scholar] [CrossRef]
  4. Ozcelik, B.; Erzurumlu, T. Comparison of the warpage optimization in the plastic injection molding using ANOVA, neural network model and genetic algorithm. J. Mater. Process. Technol. 2006, 171, 437–445. [Google Scholar] [CrossRef]
  5. Yang, D.C.; Lee, J.H.; Yoon, K.H.; Kim, J.S. A study on the prediction of optimized injection molding condition using artificial neural network (ANN). Trans. Mater. Process. 2020, 4, 218–228. [Google Scholar] [CrossRef]
  6. Heinisch, J.; Lockner, Y.; Hopmann, C. Comparison of design of experiment methods for modeling injection molding experiments using artificial neural networks. J. Manuf. Process. 2021, 61, 357–368. [Google Scholar] [CrossRef]
  7. Michelucci, U.; Venturini, F. Multi-task learning for multi-dimensional regression: Application to luminescence sensing. Appl. Sci. 2019, 9, 4748. [Google Scholar] [CrossRef]
  8. Aggarwal, C. Neural Networks and Deep Learning; Springer: Cham, Switzerland, 2018. [Google Scholar]
  9. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning (Adaptive Computation and Machine Learning Series); MIT Press: Cambridge, MA, USA, 2017. [Google Scholar]
  10. Li, L.; Jamieson, K.; DeSalvo, G.; Rostamizadeh, A.; Talwalkar, A. Hyperband: A novel bandit-based approach to hyper-parameter optimization. J. Mach. Learn. Res. 2017, 18, 6765–6816. [Google Scholar] [CrossRef]
  11. Ruder, S. An overview of multi-task learning in deep neural networks. arXiv 2017, arXiv:1706.05098. [Google Scholar] [CrossRef]
  12. Zhang, Y.; Yang, Q. An overview of multi-task learning. Natl. Sci. Rev. 2018, 5, 30–43. [Google Scholar] [CrossRef]
  13. Caruana, R. Multitask Learning; Springer: Berlin/Heidelberg, Germany, 1997. [Google Scholar]
  14. Thung, K.H.; Wee, C.Y. A brief review on multi-task learning. Multimed. Tools Appl. 2018, 77, 29705–29725. [Google Scholar] [CrossRef]
  15. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar] [CrossRef]
  16. ISO 20457:2018; Plastics Moulded Parts—Tolerances and Acceptance Conditions. ISO: Geneva, Switzerland, 2018.
  17. Lee, J.; Yang, D.; Yoon, K.; Kim, J. Effects of Input Parameter Range on the Accuracy of Artificial Neural Network Prediction for the Injection Molding Process. Polymers 2022, 14, 1724. [Google Scholar] [CrossRef]
Figure 1. Structure of perceptron with bias [8].
Figure 2. Process for artificial neural network.
Figure 3. Chain rule of derivatives in backpropagation algorithm [8].
Figure 4. Multi-task learning architecture in deep learning [11,12].
Figure 5. Structure: (a) mold; (b) bowl product.
Figure 6. (a) Measurement points of bowl product; (b) method for measurement of diameter.
Figure 7. Measurement points of bowl product: height.
Figure 8. Network A with the output parameters of mass, diameter, and height connected to a single task layer.
Figure 9. Network B with diameter and height as one group and mass as the other group.
Figure 10. Network C with mass, diameter, and height all categorized into separate groups.
Figure 11. Root mean square errors (RMSEs) for each quality of the injection-molded part according to the network structure: (a) mass; (b) diameter; (c) height.
Figure 12. Standard deviation of square errors for each quality of the injection-molded part according to the network structure: (a) mass; (b) diameter; (c) height.
Figure 13. Performances of the prediction models using test data according to networks in terms of (a) mass; (b) diameter; (c) height.
Table 1. Process conditions and levels for the experiment.

Conditions | Level 1 | Level 2 | Level 3
Melt temperature (°C) | 200 | 220 | 240
Mold temperature (°C) | 40 | 50 | 60
Injection speed (mm/s) | 40 | 70 | 100
Packing pressure (bar) | 150 | 200 | 250
Packing time (s) | 6.0 | 12.0 | 18.0
Cooling time (s) | 38 | 48 | 58
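The process conditions above span very different numeric scales, so inputs are typically rescaled before training (the paper reports results on normalized data). Below is a minimal sketch of min-max normalization over the Table 1 ranges; this common scaling scheme is an assumption for illustration, not necessarily the authors' exact preprocessing.

```python
# Lower/upper bounds for each condition, taken from Table 1 (Level 1 to Level 3)
bounds = {
    "melt_temp":     (200, 240),
    "mold_temp":     (40, 60),
    "inj_speed":     (40, 100),
    "pack_pressure": (150, 250),
    "pack_time":     (6.0, 18.0),
    "cool_time":     (38, 58),
}

def normalize(condition):
    """Min-max scale one experiment's conditions to [0, 1]."""
    return {k: (condition[k] - lo) / (hi - lo)
            for k, (lo, hi) in bounds.items()}

# Exp. no. 5 from Table A1: 200 / 50 / 70.0 / 200 / 12.0 / 48
exp5 = dict(zip(bounds, [200, 50, 70.0, 200, 12.0, 48]))
print(normalize(exp5))
```

For this row, every condition except melt temperature (at its lower bound, hence 0.0) maps to the midpoint 0.5 of its range.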
Table 2. Ranges of hyperparameters for networks.

Hyperparameters | Range | Note
Seed number | 0–50 | Step size was 1
Batch size | 16, 32, 64, … | Increased in multiples of 2 until it could cover the number of learning data
Optimizer | Adam [15] | Fixed
Learning rate | 0.0001–0.01 [15] | Step size was 0.0001
Beta 1 | 0.1–1.0 [15] | Step size was 0.1
Beta 2 | 0.9, 0.99, 0.999 [15] | -
Number of neurons | 3–18 | Step size was 1
Initializer | He normal (hidden layer), Xavier normal (output layer) | -
Activation function | ELU (hidden layer), linear (output layer) | -
Drop number | 0.0–0.4 | Step size was 0.1
Coefficient of L2 normalization | 0.001, 0.01, 0.1 | -
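A search over ranges like those in Table 2 can be sketched as simple random sampling of one configuration per trial. This is only an illustration of the search space; the actual tuning procedure (e.g. a bandit-based method such as Hyperband [10]) is more sophisticated, and all names below are illustrative.

```python
import random

random.seed(42)

# Stepped/discrete search space mirroring Table 2 (illustrative)
space = {
    "seed":          lambda: random.randrange(0, 51),           # 0-50, step 1
    "batch_size":    lambda: random.choice([16, 32, 64]),       # powers of 2
    "learning_rate": lambda: random.randrange(1, 101) * 1e-4,   # 0.0001-0.01
    "beta_1":        lambda: random.randrange(1, 11) * 0.1,     # 0.1-1.0
    "beta_2":        lambda: random.choice([0.9, 0.99, 0.999]),
    "neurons":       lambda: random.randrange(3, 19),           # 3-18 per layer
    "drop_number":   lambda: random.randrange(0, 5) * 0.1,      # 0.0-0.4
    "l2_coeff":      lambda: random.choice([0.001, 0.01, 0.1]),
}

def sample():
    """Draw one random hyperparameter configuration from the space."""
    return {name: draw() for name, draw in space.items()}

trial = sample()
print(trial)
```

Each call to `sample()` yields one candidate configuration; a tuner would train a network per candidate and keep the best-performing one, yielding values like those reported in Table 3.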
Table 3. Determined hyperparameters for networks.

Hyperparameters | Network A | Network B | Network C
Seed number | 17 | 6 | 47
Batch size | 16 | 16 | 32
Optimizer | Adam (fixed) | Adam (fixed) | Adam (fixed)
Learning rate | 0.0073 | 0.0051 | 0.0052
Beta 1 | 0.6 | 0.3 | 0.1
Beta 2 | 0.99 | 0.99 | 0.99
Number of neurons | 17-13-8-7 | 17-15-13 (common layers); [5, 7] (mass, length layers) | 18-7-7 (common layers); [3, 5, 6] (mass, diameter, height layers)
Initializer | He normal (hidden layers), Xavier normal (output layer); fixed for all networks
Activation function | ELU (hidden layers), linear (output layer); fixed for all networks
Drop number | 0.0-0.3-0.1-0.1 | 0.03-0.00-0.00 (common layers); [0.2, 0.0] (mass, length layers) | 0.0-0.2-0.0 (common layers); [0.1, 0.4, 0.2] (mass, diameter, height layers)
Coefficient of L2 normalization | 0.01 | 0.001, 0.001 (mass, length) | 0.01, 0.001, 0.1 (mass, diameter, height)
Table 4. Root mean square errors (RMSEs) of total normalized property data for networks.

Network | Total Normalized Test Data
A | 6.162 × 10⁻²
B | 5.280 × 10⁻²
C | 4.142 × 10⁻²
Table 5. Root mean square errors (RMSEs) of each normalized property data for networks.

Network | Mass | Diameter | Height
A | 6.280 × 10⁻² | 7.122 × 10⁻² | 4.876 × 10⁻²
B | 4.083 × 10⁻² | 6.753 × 10⁻² | 4.622 × 10⁻²
C | 2.725 × 10⁻² | 6.056 × 10⁻² | 2.714 × 10⁻²
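The improvement percentages quoted in the conclusions follow directly from the RMSE values in Tables 4 and 5; a quick check of Network C against Network A:

```python
# RMSE values (in units of 10^-2) for Networks A and C, Tables 4 and 5
rmse_A = {"total": 6.162, "mass": 6.280, "diameter": 7.122, "height": 4.876}
rmse_C = {"total": 4.142, "mass": 2.725, "diameter": 6.056, "height": 2.714}

# Percentage improvement of Network C relative to Network A
improvement = {k: round(100 * (rmse_A[k] - rmse_C[k]) / rmse_A[k], 1)
               for k in rmse_A}
print(improvement)
# {'total': 32.8, 'mass': 56.6, 'diameter': 15.0, 'height': 44.3}
```

These reproduce the 32.8% overall improvement and the per-property improvements of 56.6% (mass), 15.0% (diameter), and 44.3% (height).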
Table 6. Mean square errors (MSEs) and standard deviations of each normalized property data for networks.

Network | Mass | Diameter | Height
A | (3.944 ± 3.182) × 10⁻³ | (5.072 ± 7.723) × 10⁻³ | (2.377 ± 2.288) × 10⁻³
B | (1.667 ± 2.162) × 10⁻³ | (4.561 ± 4.336) × 10⁻³ | (2.136 ± 2.265) × 10⁻³
C | (7.430 ± 5.120) × 10⁻⁴ | (3.668 ± 4.275) × 10⁻³ | (7.370 ± 6.440) × 10⁻⁴
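The mean square errors in Table 6 are consistent with the RMSE values in Table 5, since MSE = RMSE². For example, for Network C's mass prediction:

```python
rmse_mass_C = 2.725e-2   # Table 5
mse_mass_C = 7.430e-4    # Table 6

# Squaring the RMSE recovers the tabulated MSE up to rounding
print(rmse_mass_C ** 2)  # ~7.43e-4, matching Table 6
```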
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Lee, J.; Kim, J.; Kim, J. The Effect of Grouping Output Parameters by Quality Characteristics on the Predictive Performance of Artificial Neural Networks in Injection Molding Process. Appl. Sci. 2023, 13, 12876. https://doi.org/10.3390/app132312876
