Comparison of Hybrid Machine Learning Approaches for Surrogate Modeling Part Shrinkage in Injection Molding

Wenzel, Manuel; Raisch, Sven Robert; Schmitz, Mauritius; Hopmann, Christian

doi:10.3390/polym16172465

¹ Corporate Research, Robert Bosch GmbH, Robert-Bosch-Campus 1, 71272 Renningen, Germany

² Institute for Plastics Processing (IKV) in Industry and Craft at RWTH Aachen University, Seffenter Weg 201, 52074 Aachen, Germany

* Author to whom correspondence should be addressed.

Polymers 2024, 16(17), 2465; https://doi.org/10.3390/polym16172465

Submission received: 10 July 2024 / Revised: 26 August 2024 / Accepted: 28 August 2024 / Published: 29 August 2024

(This article belongs to the Special Issue Machine Learning and Artificial Intelligence for Polymer Injection Molding Design and Processing)

Abstract

Machine learning (ML) methods present a valuable opportunity for modeling the non-linear behavior of the injection molding process. They have the potential to predict how various process and material parameters affect the quality of the resulting parts. However, the dynamic nature of the injection molding process and the challenges associated with collecting process data remain significant obstacles for the application of ML methods. To address this, within this study, hybrid approaches are compared that combine process data with additional process knowledge, such as constitutive equations and high-fidelity numerical simulations. The hybrid modeling approaches include feature learning, fine-tuning, delta-modeling, preprocessing, and using physical constraints, as well as combinations of the individual approaches. To train and validate the hybrid models, both the experimental and simulated shrinkage data of an injection-molded part are utilized. While all hybrid approaches outperform the purely data-based model, the fine-tuning approach yields the best result in the simulation setting. The combination of calibrating a physical model (feature learning) and incorporating it implicitly into the training process (physical constraints) outperforms the other approaches in the experimental setting.

Keywords: hybrid machine learning; hybrid modeling patterns; injection molding; surrogate model; shrinkage

1. Introduction

Injection molding is a processing technique widely utilized to produce plastic components. Its ability to achieve short cycle times and manufacture complex geometries has made it a preferred choice for high-volume production in various industries [1]. However, as customer demands continue to rise, manufacturers face the challenge of maintaining and improving the quality of injection-molded parts. This necessitates the optimization and monitoring of the injection molding process, which can be achieved through the application of ML methods and modeling techniques.

ML methods offer immense potential in optimizing the injection molding process by uncovering the underlying relationships between process feedback (e.g., cavity sensors) or process settings (e.g., set holding pressure) and the resulting quality attributes (e.g., dimensions). By learning these relationships, ML models can predict the resulting part quality and, subsequently, optimize or control the quality by determining the optimal process settings [2]. Applications of surrogate models in injection molding include the optimization of mechanical properties [3] and of shrinkage and warpage [4,5], as well as model predictive control [6,7].

Neural networks (NN) have gained significant popularity in recent years, particularly in areas such as vision [8] and speech applications [9]. In these domains, supervised training procedures are commonly employed, minimizing the discrepancy between the network’s predictions and the training data. However, in scientific and engineering fields, generating the necessary amount of training data can be a complex, time-consuming, and costly endeavor, especially when dealing with complex nonlinear relationships and incorporating real-world trials.

In the context of thermoplastic injection molding, Design of Experiments (DoE) is frequently employed to generate the required data for modeling the relationships between process and machine parameters and quality attributes. However, the learned relationships of data-driven models are statistical in nature and lack physical insights. Additionally, classical data-driven ML methods lack robustness or fail to generalize when confronted with partial information, i.e., small datasets, or when trying to extrapolate [10]. A potential solution is the combination of ML methods with physics-based domain knowledge, also known as hybrid modeling [11].

This study aims to reduce the amount of data needed to establish a surrogate model for shrinkage prediction in the injection molding process by improving the model's generalizability. Since the optimization of the shrinkage and warpage of injection-molded parts is a commonly performed task [5], the focus of the study is modeling the resulting part width depending on the process settings. To achieve improved generalization, various hybrid modeling patterns are evaluated by combining the underlying physics, via a process simulation, and data into the training of a surrogate model.

This paper is structured as follows. First, a brief background on shrinkage prediction for polymers in injection molding is provided, as well as an introduction to hybrid modeling. Next, the specimen and the setup of the hybrid models are presented together with the simulation and experimental data used for calibrating and validating the approaches. Lastly, the results of the hybrid approaches are investigated and possible future extensions are discussed.

2. State of the Art

2.1. Physics-Based Shrinkage Prediction

For the prediction of the resulting part dimensions in injection molding, two approaches are commonly utilized: PVT models and residual stress or strain models [12]. PVT models estimate the free volumetric shrinkage of the part after it detaches from the cavity wall. A typical injection molding cycle is depicted on a PVT plot in Figure 1. The shrinkage is estimated using the relative reduction in the specific volume between the two time points $t_{P_0}$ and $t_{room}$, where $t_{P_0}$ is the point where the pressure reaches ambient pressure ($P = 1$ bar).
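As a minimal illustration of the PVT-based estimate described above, the volumetric shrinkage can be written as the relative reduction in specific volume between $t_{P_0}$ and $t_{room}$. The function name and the numerical values in the following sketch are hypothetical and are not taken from this study.

```python
def volumetric_shrinkage(v_p0: float, v_room: float) -> float:
    """Relative reduction in specific volume between t_P0 and t_room."""
    if v_p0 <= 0.0:
        raise ValueError("specific volume at t_P0 must be positive")
    return (v_p0 - v_room) / v_p0


# Hypothetical specific volumes in cm^3/g, chosen only for illustration.
shrinkage = volumetric_shrinkage(v_p0=0.80, v_room=0.72)
print(f"volumetric shrinkage: {shrinkage:.1%}")
```

In practice, the two specific volumes would be read from a PVT model of the material at the respective pressure and temperature.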

To account for the effects of shrinkage and warpage, thermoelastic stress–strain models can be employed for the displacement calculation, given by Equation (1) [12]:

σ_{i j} = C_{i j k l} (ϵ_{k l}^{total} - ϵ_{k l}^{th}) + σ_{I_{i j}}

(1)

The displacement field u is determined by solving the following Equation (2), subject to appropriate boundary conditions [12]:

\nabla \cdot σ = 0

(2)

For instance, in the case of a detached part, the free boundary condition applies as follows [12]:

σ_{i j} n = 0

(3)

Within these equations, $ϵ_{kl}^{total}$ represents the total strain, while $ϵ_{kl}^{th}$ denotes the thermal strain from free quench during cooling after the part is ejected. The term $σ_{ij}$ represents the total stress and $σ_{I_{ij}}$ represents the initial stress, i.e., thermally and pressure-induced residual stresses generated during cooling inside the mold. Lastly, $C_{ijkl}$ is the strain–stress matrix, coupling the two. The constitutive strain–displacement relationship is expressed as follows:

ϵ_{x x} (x, θ) = \frac{d u_{x}}{d x}

(4)

Commercial simulation software is typically used to create physics-based shrinkage and warpage predictions. For instance, ref. [13] conducted a study to identify the design variables that have a significant impact on warpage and volume shrinkage in the injection molding process. They employed the response surface method (RSM), Moldflow® Insight® 2004.5 simulation, and statistical analysis of variance (ANOVA) to analyze the effects of various parameters. A central composite design with 30 runs was used to create the response surface. Their results indicated that the melt temperature had the highest influence on dimension shrinkage in the transverse direction, followed by packing pressure, mold temperature, and injection velocity. Similarly, ref. [4] conducted a separate study to identify the design variables that significantly impact warpage and volumetric shrinkage in the injection molding process. They also used Moldflow® simulations, RSM, and ANOVA to analyze the effects of various parameters. They used an orthogonal design with six factors and five levels, with a total of sixty-nine samples. Their findings revealed that the melt temperature, holding time, injection time, and cooling time were the most influential factors affecting warpage and volumetric shrinkage.

While modern simulation tools are capable of predicting the primary effects of process settings on shrinkage and warpage, they often neglect microscale effects, leading to discrepancies between the simulated and experimental results. While the thermal strain due to free quench during cooling can be accurately estimated with the Coefficient of Thermal Expansion (CTE), the estimation of the residual stresses $σ_{I_{ij}}$ formed within the process poses a challenge. To tackle this issue, some commercial simulation software packages have adopted their own strategies. Among others, Autodesk Moldflow Insight 2021.1® uses a PVT-scaled approach by default for the displacement calculation; this approach is proprietary, and the exact calculation is not publicly known. Another example is the Corrected Residual In-Mold Stress (CRIMS) model, which gives a better estimation of the residual stresses developed during the molding process [14]. Furthermore, efforts have been made to develop more accurate material models, including improved PVT models and crystallization models. For instance, ref. [15] incorporated microscale properties into standard PVT models, utilizing the Two-Domain Tait Equation, resulting in more accurate simulations. However, it is important to note that these approaches require time and effort to calibrate and implement.

2.2. Data-Driven Surrogate Models

The creation of data-driven surrogate models for the injection molding process with NNs dates back to the 1990s, when networks were first trained to model the relationships between process parameters and final quality attributes. The use of NNs enables the establishment of an approximate function to estimate the non-linear relationships between design variables and quality indicators. Subsequently, numerous studies have been published, wherein networks have been trained to learn the relationship between process inputs and final dimensions. These surrogate models play a pivotal role in the workflow of injection molding optimization, offering a computationally efficient method to explore the input design space. Furthermore, the precision of the subsequent optimization is directly influenced by the accuracy of these predictive models.

For example, in the study by [4], in addition to RSM, NNs with two hidden layers were employed to construct prediction models capable of handling the non-linear relationship between input variables and shrinkage and warpage. Similarly, in [16,17], a Taguchi experimental design and ANOVA method were initially used to investigate the impact of process settings on shrinkage, followed by training a simple NN to create a surrogate model which can then be used for optimization routines. Furthermore, ref. [18] compared different DoEs for generating datasets for a NN and polynomial regression models, finding that a $2^{6-3}$ fractional factorial design with a center point was the most efficient, while a central composite design was the most effective. However, data acquisition poses a challenge in injection molding due to the wide range of processes and phenomena involved. Extensive DoEs can be costly, and, in regular production, only limited variation is expected. Additionally, in the early development phases, only small datasets are typically available. Regardless of the input variables, the relationships established between input and output are purely data-driven and lack a comprehensive physical understanding of the injection molding process.

2.3. Hybrid Modeling Patterns

In recent years, the integration of domain knowledge into ML algorithms [11], also known as hybrid modeling, has gained significant attention to address the need for large datasets for initial training and model updating. A recent article by [19] has provided a comprehensive summary and formalization of frequently used hybrid models, deriving reusable patterns from them. The term “patterns” was chosen by the authors to denote the different types of base patterns that can be combined to create more complex hybrid models. In their paper, they presented four patterns as follows: physics-based preprocessing (PP), delta model (DM), feature learning (FL), and physical constraints (PC). Additionally, this work explores a fine-tuning (FT) approach. The following short descriptions and mathematical formalizations of combining data-driven models $D(θ)$ with physics-based models $P(θ)$ to create hybrid models $H(θ)$ are based on the work of [19] and have been extended to incorporate the FT approach.

2.3.1. Physics-Based Preprocessing (PP)

Physics-based preprocessing steps are commonly employed in practice, such as in physics-based feature engineering. The inputs $θ$ undergo a series of physics-inspired transformations, which are then additionally fed into the data-based model. Mathematically, $P(θ)$ is utilized as an additional input as follows:

H (θ) = D (θ, P (θ))

(5)

This pattern is applied in injection molding, for instance, for the dimensionality reduction of sensor data. Physically interpretable pressure integrals over the injection and holding phase are utilized for this purpose, owing to their high correlation with weight and dimensional features (see e.g., [20]).
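The pressure-integral feature mentioned above can be sketched as a trapezoidal integral over a sampled cavity pressure curve. This is only an illustrative sketch: the function name, the sampling, and the pressure values are hypothetical and do not reproduce the procedure of [20].

```python
from typing import Sequence


def pressure_integral(times: Sequence[float], pressures: Sequence[float]) -> float:
    """Trapezoidal integral of a sampled pressure curve over time."""
    if len(times) != len(pressures) or len(times) < 2:
        raise ValueError("need two or more matching time and pressure samples")
    return sum(
        0.5 * (pressures[i] + pressures[i - 1]) * (times[i] - times[i - 1])
        for i in range(1, len(times))
    )


# Hypothetical cavity pressure trace: time in s, pressure in bar.
times = [0.0, 1.0, 2.0, 3.0]
pressures = [0.0, 400.0, 400.0, 0.0]

# The scalar feature could serve as the additional input P(theta) in Equation (5).
feature = pressure_integral(times, pressures)
```

A single scalar of this kind replaces the full pressure curve as a model input, which is the dimensionality reduction referred to in the text.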

2.3.2. Delta Model (DM)

In scenarios where an initial prediction can be made based on a physical model, the data-based model can be employed to learn the error between the physics-based prediction and the observations. The final prediction can be obtained by combining the two models:

H (θ) = D (θ) + P (θ)

(6)

Despite the easy-to-implement approach, the authors are not aware of any publications that have applied this method specifically for quality prediction in injection molding.
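The compositions in Equations (5) and (6) can be sketched with plain functions. In the following minimal example, `physics_model` is a toy stand-in for $P(θ)$ and the lambda expressions are toy stand-ins for trained data-based models $D$; all names and coefficients are hypothetical and do not correspond to the models of this study.

```python
from typing import Callable, Sequence

Theta = Sequence[float]  # (T_mold, V_inj, P_hold)


def physics_model(theta: Theta) -> float:
    """Toy stand-in for P(theta); not the shrinkage model of this study."""
    t_mold, _v_inj, _p_hold = theta
    return 20.0 - 0.002 * t_mold


def make_preprocessing_hybrid(
    data_model: Callable[[Sequence[float]], float]
) -> Callable[[Theta], float]:
    """Equation (5): H(theta) = D(theta, P(theta))."""
    def hybrid(theta: Theta) -> float:
        return data_model([*theta, physics_model(theta)])
    return hybrid


def make_delta_hybrid(
    residual_model: Callable[[Theta], float]
) -> Callable[[Theta], float]:
    """Equation (6): H(theta) = D(theta) + P(theta)."""
    def hybrid(theta: Theta) -> float:
        return residual_model(theta) + physics_model(theta)
    return hybrid


# Toy data-based models (assumptions): a pressure-dependent correction.
pp_model = make_preprocessing_hybrid(lambda features: features[-1] + 0.0001 * features[2])
dm_model = make_delta_hybrid(lambda theta: 0.0001 * theta[2])

theta = (80.0, 50.0, 600.0)
print(pp_model(theta), dm_model(theta))
```

In the PP pattern the physics-based prediction is one more input feature, whereas in the DM pattern the data-based model only has to learn the residual of the physics-based prediction.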

2.3.3. Feature Learning (FL)

Instead of utilizing data solely to assess the error of the physics-based model, as in the delta model, an alternative approach involves using data to calibrate the physical model. In this context, the “D” index signifies the calibration, with the prediction still being made by a physics-based model:

H (θ) = P_{D} (θ)

(7)

For instance, in the field of injection molding, ref. [21] employed pressure data from an actual process to calibrate material coefficients for simulation by identifying matching simulated and real pressure curves.

2.3.4. Physical Constraints (PC)

Physical constraints are used to inform the architecture or learning process of a data-driven model. The constraints can affect the structure of the model, its parameters, or its computational results. The hybrid model is formed by incorporating these constraints either directly on the final outputs or intermediate results. The general form is denoted by the following equation:

H (θ) = D_{P} (θ)

(8)

Furthermore, a distinction is made between hard and soft constraints. Hard constraints are implemented to ensure that the hybrid model cannot violate the constraints, while soft constraints are typically expressed as physics-informed losses that guide predictions to fall within a desired range.

An example of a hard constraint is the use of a softmax activation function, where the final prediction cannot violate the desired constraint, such as becoming negative [19]. Implicit knowledge utilization has been explored in the work of [22], where conditional physics-informed neural networks (PINNs) were employed to develop a surrogate model of the part temperature during the cooling within the injection molding process. This study demonstrated that the effects of process parameters could be implicitly learned using only physics-informed loss functions.
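The distinction between hard and soft constraints can be sketched as follows. The softplus transform is used here merely as one simple way to enforce positivity by construction, and the range penalty is one simple form of a soft constraint; both are assumptions of this sketch and are not the constraints used in [19] or [22].

```python
import math
from typing import Sequence


def softplus(z: float) -> float:
    """Smooth, strictly positive transform (numerically stable form)."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def hard_constrained_output(raw_output: float) -> float:
    """Hard constraint: the prediction is positive by construction."""
    return softplus(raw_output)


def soft_constrained_loss(
    predictions: Sequence[float],
    targets: Sequence[float],
    lower: float,
    upper: float,
    weight: float = 1.0,
) -> float:
    """Soft constraint: data loss plus a penalty for leaving [lower, upper]."""
    n = len(predictions)
    data_loss = sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n
    penalty = sum(
        max(lower - p, 0.0) ** 2 + max(p - upper, 0.0) ** 2 for p in predictions
    ) / n
    return data_loss + weight * penalty


# The second prediction violates the upper bound and is penalized.
loss = soft_constrained_loss([1.0, 3.0], [1.0, 3.0], lower=0.0, upper=2.0)
```

The hard variant can never produce an invalid output, while the soft variant only discourages violations through the loss and therefore depends on the chosen weight.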

2.3.5. Fine-Tuning (FT)

The fine-tuning approach is a key hybrid modeling pattern that involves the pre-training of the network on data with similar physics, enabling a subsequent fine-tuning or transfer-learning [23] step with fewer data samples. This process can be represented as follows:

w = D (θ) \leftarrow P (θ)

(9)

An illustrative example of this hybrid modeling pattern in the context of injection molding is the application of a transfer-learning approach to share information between simulation and real processes [24] or different materials [25] using pretrained models.

2.4. Summary

The field of shrinkage prediction in injection molding has seen significant advancements, but there remains a need for further research to enhance the accuracy and efficiency of current approaches. While physics-based methods have made progress, they still struggle to capture microscale effects and fully account for the complex material behavior in the process. Data-driven methods, such as NNs, have been widely used but are limited by the availability of data and the lack of physical understanding. Consequently, there is a need to implement hybrid approaches for shrinkage prediction that integrate the knowledge of the underlying physics. Although individual hybrid modeling approaches have been applied in various applications, there is a lack of comparative studies on these approaches. Hence, there is a need to evaluate and compare different hybrid modeling patterns to determine their effectiveness.

3. Data and Methodology

This section provides an overview of the methodology and data utilized to establish hybrid models for predicting the final dimensions of the part based on the process settings. It begins by introducing the specimen, its material, and the simulated and experimental data. Next, an overview of the hybrid models used is given, followed by a detailed description of each approach, including the used domain knowledge and strategy for integrating data.

3.1. Specimen, Material, and Data Acquisition

A simple mold geometry is chosen for the evaluation of the hybrid approaches. The injection-molded part used in this work is shown in Figure 2. It is a thin-walled part characterized by a constant rectangular cross section. An unreinforced polyoxymethylene homo-polymer (POM) was chosen as the material due to its industrial relevance.

The simulation model used is shown in Figure 3a and was already introduced in more detail in [26]. The simulations are carried out using Autodesk® Moldflow® Insight 2021.1 (AMI2021.1). For the material, the default parameters from the Moldflow database for Delrin® 111P NC010 (Delrin, Wilmington, DE, USA) are selected. The 3D simulation includes filling, packing, and cooling phases.

The three process parameters $θ = (T_{mold}, V_{inj}, P_{hold})$ in Table 1 are varied in a full-factorial DoE across a total of 27 simulations. An overview of all combinations can be found in Table A1 in Appendix A. The process parameters are the coolant temperature $T_{c,in}$, the injection velocity $V_{inj}$ during the injection phase, and the holding pressure $P_{hold}$. Instead of the actual controlled coolant temperature, the resulting approximated mold temperature $T_{mold} = T_{c,in} - 4$ [26] is used.

The parameters $T_{mold}$ and $V_{inj}$ are chosen due to their influence on the temperature within the process. While the holding pressure does not impact the temperature significantly, it is the main influencing factor of the formed residual stresses $σ_{I_{ij}}$. An overview of the process settings, which are not varied, is presented in Table 2.

The data extracted from the simulation are the node values (at locations $x_i$ for a specific parameter combination $θ_i$) of the pressure $P$, temperature $T$, and displacement $u$. Examples of the extracted temperature and pressure profiles of the simulated process are illustrated for various process settings in Figures A1 and A2 in Appendix A. The displacement data $u(x, θ)$ need to be transformed to the part width $w(x, θ)$. For this, the displacements at the boundaries (1D case: $x_{min}$, $x_{max}$) are subtracted from the initial width $w_0 = 20$ mm of the geometry as follows:

w (θ) = w_{0} - (u (x_{min}, θ) - u (x_{max}, θ))

(10)
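Equation (10) amounts to a single subtraction, as the following minimal sketch shows. The function name and the displacement values are hypothetical; the sign convention assumed here is that of a displacement field, i.e., both edges moving inward means a positive displacement at $x_{min}$ and a negative one at $x_{max}$.

```python
def part_width(u_at_x_min: float, u_at_x_max: float, w0: float = 20.0) -> float:
    """Equation (10): width after shrinkage from the boundary displacements (mm)."""
    return w0 - (u_at_x_min - u_at_x_max)


# Hypothetical boundary displacements in mm: both edges move inward by 0.15 mm.
width = part_width(u_at_x_min=0.15, u_at_x_max=-0.15)
print(f"part width: {width:.2f} mm")
```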

For the experimental data generation, an electric injection molding machine (E-Motion 440/220 T from ENGEL AUSTRIA GmbH, Schwertberg, Austria) is used. The mold used is shown in Figure 3b. The material is dried using a hot air dryer for 2 h at a temperature of 80 °C. To ensure comparable experimental conditions, the basic procedure is maintained for each experiment. In the experiment preparation, cylinder zone heating and mold temperature control are first switched on and left in this state for one hour to warm up without producing molded parts. After this waiting time, the production is started in a fully automatic mode, which is maintained for half an hour (30 parts). This ensures sufficient thermal equilibrium before the actual experiment is conducted according to the respective specifications. For each experimental setting, seven parts are produced fully automatically to capture any additional process variations. During the experiment, the parts are removed using a robot handling system. With 27 experimental settings, this results in 189 parts. A 3D profilometer (Model VR-5000 from Keyence Corporation, Osaka, Japan) is used to measure shrinkage in the width direction. The point of measurement can be seen in Figure 2. Depending on the type of measurement and the monitor magnification, this device has different accuracies. According to the manufacturer, it has an accuracy of ±5 μm for width measurements at a 12× monitor magnification. For each process setting, the final widths in the experimental dataset are determined by calculating the average width across the seven repetitions. An overview of the process settings and of the simulated and experimental average widths, along with the standard deviations across the seven repetitions, can be found in Table A1 in Appendix A. The average standard deviation across the varied process settings is ±6.9 μm, which is acceptably close to the measurement device’s accuracy.

3.2. Overview of the Hybrid Models and Training Data

The primary objective of the study is to compare hybrid modeling approaches with the aim of improving the predictive accuracy of data-driven models in low data regimes. The general modeling task is to predict the width $w$, which is dependent on the process settings $θ$. Surrogate modeling this type of dependency is the first step in injection molding optimization routines, as described in [5,16,17].

The literature shows that NNs are a suitable choice for these types of regression tasks in injection molding due to their ability to model non-linear dependencies. Within this work, simple NNs with similar architectures are used to assess different hybrid approaches, aiming to physically inform the predictions while maintaining a consistent underlying data-based model to facilitate better comparability. NNs are also chosen because their flexibility allows the different hybrid approaches to be implemented with the same type of model. For the implementation, the TorchPhysics library [27] was used, which uses PyTorch [28] as a backend. The simulation dataset is utilized for the physics-based knowledge and to test the capabilities of the data-based, physics-based, and hybrid approaches. With the real measurement data, the hybrid approaches are further validated.

The performance of the different hybrid models is evaluated using different subsets of the datasets. The models are trained on two different subsets labeled “Combined Effects (CE)” and “Individual Effects (IE)”, as shown in Figure 4. The CE data split is a full-factorial DoE with two levels (low and high). The IE data split is a star DoE. One additional parameter combination is used as a test setting for the hyperparameter optimization, as well as for an EarlyStopping [28] criterion. The remaining data points from the dataset are used as validation points. For both data splits, the training data comprise less than 30% of the dataset and contain only the linear effects of the process settings. By using the two data splits, it can be tested how well the hybrid approaches learn non-linear patterns by incorporating a physics-based model. The mean absolute error ($MAE$) is used for the evaluation and comparison of the different approaches.
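The two training splits and the evaluation metric can be sketched with coded factor levels. The sketch assumes that the star design consists of the center point plus the six axial points; the exact points are defined in Figure 4, which is not reproduced here, so this layout is an assumption. All variable names are hypothetical.

```python
from itertools import product
from typing import Sequence

LEVELS = (-1, 0, 1)  # coded low, center, and high level of each process setting

# Full-factorial design: 3 factors x 3 levels = 27 settings.
full_factorial = list(product(LEVELS, repeat=3))

# CE split: two-level full factorial, i.e., the 8 corner points.
ce_train = [s for s in full_factorial if 0 not in s]

# IE split (assumed layout): center point plus the 6 axial points.
ie_train = [s for s in full_factorial if sum(abs(v) for v in s) <= 1]


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error between measured and predicted widths."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


ce_fraction = len(ce_train) / len(full_factorial)
ie_fraction = len(ie_train) / len(full_factorial)
print(f"CE: {ce_fraction:.1%} of the data, IE: {ie_fraction:.1%} of the data")
```

Under these assumptions, both splits stay below the 30% share of training data stated above.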

A 10-fold cross-validation is conducted since, due to the random initialization of the network weights and the random shuffling of the dataset, each optimization run of a network yields a slightly different result. The average across the 10 predictions is denoted by $MAE_{10}$.

In Table 3, an overview of the hybrid approaches is given. The first two models represent the purely data-based and purely physics-based approaches. The subsequent five models are the individual hybrid modeling patterns. The last five models are combinations of the individual hybrid modeling patterns with more complex models.

3.3. Base Models

3.3.1. Data-Based Model

The data-based model serves as a baseline model, learning a direct relationship between the process settings $θ$ and the part width $w$. With three process settings, the network has three inputs $θ = (T_{mold}, V_{inj}, P_{hold})$ and one output $w_{predicted}$. The data loss $L_{Data}$ used to train the network is modeled with the L2 norm of the error between the training data and the network predictions as follows:

L_{Data} = \frac{1}{N_{data}} \sum_{i = 1}^{N_{data}} \| w_{data} (θ_{data_{i}}) - w_{predicted} (θ_{data_{i}}) \|_{2}

(11)

Here, an individual process parameter combination is denoted by $θ_{data_{i}}$, and $\| \cdot \|_{2}$ denotes the L2 norm.
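Equation (11), taken as printed, averages the L2 norm of the prediction error over the training samples. The following plain-Python sketch implements exactly that form for outputs given as sequences of components; the function name and the sample values are hypothetical. For a scalar output such as the part width, the L2 norm of the error reduces to its absolute value.

```python
import math
from typing import Sequence


def data_loss(
    w_data: Sequence[Sequence[float]], w_predicted: Sequence[Sequence[float]]
) -> float:
    """Mean L2 norm of the error between training data and predictions."""
    if len(w_data) != len(w_predicted) or not w_data:
        raise ValueError("inputs must be non-empty and of equal length")
    norms = (
        math.sqrt(sum((d - p) ** 2 for d, p in zip(data_i, pred_i)))
        for data_i, pred_i in zip(w_data, w_predicted)
    )
    return sum(norms) / len(w_data)


# Hypothetical widths in mm; each sample has a single output component.
loss = data_loss([[20.0], [19.8]], [[19.9], [19.9]])
```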

To determine the optimal hyperparameters for the data-based approach, a grid search is conducted. Table 4 provides an overview of the hyperparameters and search space. The resulting optimal network architecture and optimization approach are highlighted in Table 4.

The identified hyperparameters are kept constant for the data models within the hybrid approaches, unless specified otherwise. This allows a better assessment of the impact of the various hybrid modeling patterns.

3.3.2. Physics-Based Model

To physically predict the shrinkage behavior based on the process settings $θ$, the final displacement $u(x, θ)$ needs to be determined. Here, the thermoelastic constitutive equation for the unidirectional composite (Equation (1)) is rewritten in terms of its strains as follows:

ϵ_{i j}^{total} = ϵ_{k l}^{th} + ϵ_{i j}

(12)

where the total strain $ϵ_{ij}^{total}$ is the sum of the elastic strain $ϵ_{ij}$ and the thermal strain $ϵ_{kl}^{th}$. The thermal strain is expressed as follows [12]:

ϵ_{k l}^{th} = α Δ T

(13)

where $α = 100 \times 10^{-6}\ \mathrm{K}^{-1}$ represents the coefficient of linear thermal expansion (CTE) of the used polymer, taken from the CAMPUS® (Computer-Aided Material Pre-selection by Uniform Standards) database, available at http://www.campusplastics.com/ (accessed on 10 July 2024). However, predicting the elastic strain $ϵ_{ij}$ is challenging due to the complex formation of the residual stresses $σ_{I_{ij}}$ within the process. As a result, for the purely physics-based model, a simplification is made and only the thermal strains are considered. The hybrid approaches aim to compensate for the resulting modeling error.

In the reference simulation software, the thermal strain $ϵ_{kl}^{th}$ is calculated using the difference between the temperature at the end of the process $T_{end}$ and room temperature $T_{room}$. Instead of $T_{end}$, in this work, the temperature $T_{P_0}$ at which the process reaches ambient pressure within the cavity and the part detaches from the wall is utilized. This temperature range is, e.g., used for calculating the volumetric shrinkage (see Section 2.1, Figure 1). Using this, parts of the temperature-dependent in-process shrinkage effects are included in the thermal strain.

To have $T_{P_0}$ available for different process settings, the entire dataset of the simulated temperatures and pressures is compressed into surrogate models, enabling interpolation to setting combinations not covered by the dataset. NNs are trained for this purpose, with one NN for the temperature $N_{T_{Sim}}(x, t, θ)$ and another for the pressure $N_{P_{Sim}}(x, t, θ)$. Due to the fast inference time of the NN, the results can be obtained in real time. The network specifications for the surrogate models are in Table 5. The specifications were chosen according to the networks’ ability to approximate the simulated temperatures and pressures, similarly to how it is described in [29]. Different network sizes were studied while monitoring the overall training error, i.e., the compression error. The same networks $N_{T_{Sim}}(x, t, θ)$ and $N_{P_{Sim}}(x, t, θ)$ are used for the physics-based as well as hybrid models, increasing the comparability of the different approaches by sourcing the same information.

Once trained, the surrogates can then be used to calculate $T_{P_0}(x, θ)$. For a given setting combination $θ_i$ at a specific location $x_i$, the model $N_{P_{Sim}}(t, x_i, θ_i)$ is utilized to predict the pressure at $n = 100$ discrete time points on a set interval from $t_{start}$ to $t_{end}$. Subsequently, the time point $t_{P_0}$ at which the predicted pressure drops below a threshold $ε = 1 + 0.1$ bar is determined ($N_{P_{Sim}}(t_j, x_i, θ_i) < ε \to t_j \approx t_{P_0}$), and this value is used to predict the temperature $T_{P_0}(x_i, θ_i)$ using $N_{T_{Sim}}(t_j, x_i, θ_i)$.
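The search for $t_{P_0}$ described above can be sketched as a scan over the sampled time points. The callables below are toy stand-ins for the trained surrogates $N_{P_{Sim}}$ and $N_{T_{Sim}}$, and taking the first sample below the threshold is an assumption of this sketch; it presumes that the interval starts after the cavity pressure has built up. All names and numerical values are hypothetical.

```python
from typing import Callable, Optional, Sequence

Surrogate = Callable[[float, float, Sequence[float]], float]  # (t, x, theta) -> value


def find_t_p0(pressure_model: Surrogate, x: float, theta: Sequence[float],
              t_start: float, t_end: float, n: int = 100,
              threshold: float = 1.1) -> Optional[float]:
    """First sampled time at which the predicted pressure (bar) is below the threshold."""
    for j in range(n):
        t_j = t_start + (t_end - t_start) * j / (n - 1)
        if pressure_model(t_j, x, theta) < threshold:
            return t_j
    return None


def temperature_at_p0(pressure_model: Surrogate, temperature_model: Surrogate,
                      x: float, theta: Sequence[float],
                      t_start: float, t_end: float) -> float:
    """Evaluate the temperature surrogate at the detected time t_P0."""
    t_p0 = find_t_p0(pressure_model, x, theta, t_start, t_end)
    if t_p0 is None:
        raise ValueError("pressure never drops below the threshold on the interval")
    return temperature_model(t_p0, x, theta)


# Toy stand-ins for the trained surrogates (assumptions, not the study's networks).
def toy_pressure(t: float, x: float, theta: Sequence[float]) -> float:
    return max(1.0, 500.0 - 50.0 * t)  # bar


def toy_temperature(t: float, x: float, theta: Sequence[float]) -> float:
    return 200.0 - 5.0 * t  # degrees Celsius


theta = (80.0, 50.0, 600.0)
t_p0 = find_t_p0(toy_pressure, 10.0, theta, t_start=0.0, t_end=20.0)
T_p0 = temperature_at_p0(toy_pressure, toy_temperature, 10.0, theta, 0.0, 20.0)
```

Because the time axis is sampled at discrete points, the detected $t_j$ only approximates $t_{P_0}$ to within one sampling step.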

To calculate the resulting part width, a 1D simplification is used. By employing the constitutive strain–displacement relationship (Equation (4)), the final displacement can be obtained using integration as follows:

u_{x} (x, θ) = \int ϵ_{k l}^{th} (x, θ) d x

(14)

In this work, a standard integration scheme is implemented with $u_{x_0}(x_0 = 0) = 0$ and the following:

u_{x_{i}} = (x_{i} - x_{i - 1}) ϵ_{k l}^{th} (x_{i}, θ) + u_{x_{i - 1}}

(15)

The final width of the part can be calculated by taking the final displacement of the boundary points $x_{\min} = 0$ mm and $x_{\max} = 20$ mm and subtracting it from the original geometry, where the original part width is $w_0 = 20$ mm:

w (θ) = P (θ) = w_{0} - \int_{x_{\min}}^{x_{\max}} ϵ_{k l}^{th} (x, θ) d x

(16)
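Equations (13), (15), and (16) can be combined into a short numerical sketch. It assumes that $ΔT$ is taken as $T_{P_0} - T_{room}$, so that the thermal strain is positive for a cooling part, and it uses a uniform, hypothetical temperature field; the node spacing and temperatures are illustrative only.

```python
from typing import Sequence


def thermal_strain(t_p0: float, t_room: float, alpha: float = 100e-6) -> float:
    """Equation (13): thermal strain alpha * delta T (delta T = T_P0 - T_room assumed)."""
    return alpha * (t_p0 - t_room)


def width_from_thermal_strain(
    x_nodes: Sequence[float], strains: Sequence[float], w0: float = 20.0
) -> float:
    """Equations (15) and (16): accumulate u over the nodes, then subtract from w0."""
    if len(x_nodes) != len(strains) or len(x_nodes) < 2:
        raise ValueError("need two or more matching nodes and strain values")
    u = 0.0  # u(x_0 = 0) = 0
    for i in range(1, len(x_nodes)):
        u += (x_nodes[i] - x_nodes[i - 1]) * strains[i]
    return w0 - u


# Hypothetical uniform field: T_P0 = 150 degC at every node, T_room = 23 degC.
x_nodes = [0.0, 5.0, 10.0, 15.0, 20.0]  # mm
strains = [thermal_strain(150.0, 23.0) for _ in x_nodes]
width = width_from_thermal_strain(x_nodes, strains)
print(f"predicted width: {width:.3f} mm")
```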

3.4. Hybrid Models

3.4.1. FL

Using the FL approach, a calibration of the physics-based model is targeted. Detailed domain knowledge about the shortcomings of the used physics-based model is necessary to ensure robust extrapolation capabilities. In the used physics-based model, the temperature-dependent shrinkage effects have been addressed, but the elastic strain

ϵ_{i j}

occurring due to residual stresses

σ_{I_{i j}}

, formed during the processes, has not been considered (Equations (1) and (12)). While for non-reinforced thermoplastic materials, the thermal strain accounts for the main effects of the tool temperature and injection velocity, the influence of pressure remains a challenge for physics-based models as well as for most standard simulation softwares. Hence, with the FL approach, a data-driven relationship between the elastic strain

ϵ_{i j} (P_{h o l d})

and the holding pressure is modeled. Instead of a NN, for the FL approach, a linear regression model (

y = a x + b

) is chosen to model

ϵ_{i j} (P_{h o l d})

due to the limited training samples in the dataset for the chosen input parameter

P_{h o l d}

. For the IE data split, only two observations of variations in

P_{h o l d}

are found within the training dataset, making a linear regression model with two degrees of freedom an optimal choice. To estimate the coefficients, combinations of process parameters from the training data that show variations at lower pressures

θ_{P_{{h o l d}_{-_{j}}}}

and higher pressures

θ_{P_{{h o l d}_{+_{k}}}}

are used to calculate the minimum and maximum total effective strains

ϵ_{P_{{h o l d}_{-}}}

and

ϵ_{P_{{h o l d}_{+}}}

, respectively:

ϵ_{P_{hold_{-}}} = \frac{1}{n_{data_{-}}} \sum_{j = 0}^{n_{data_{-}}} \frac{w_{data} (θ_{P_{hold_{-_{j}}}}) - w_{0}}{w_{0}}

(17)

ϵ_{P_{hold_{+}}} = \frac{1}{n_{data_{+}}} \sum_{k = 0}^{n_{data_{+}}} \frac{w_{data} (θ_{P_{hold_{+_{k}}}}) - w_{0}}{w_{0}}

(18)

Using the fitted total effective strains

ϵ_{P_{{h o l d}_{-}}}

and

ϵ_{P_{{h o l d}_{+}}}

, the linear model is given by the following:

ϵ_{i j} (P_{hold}) = \frac{P_{hold} - P_{hold_{-}}}{P_{hold_{+}} - P_{hold_{-}}} (ϵ_{P_{hold_{+}}} - ϵ_{P_{hold_{-}}}) + ϵ_{P_{hold_{-}}}

(19)
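As a worked illustration of Equations (17) to (19), the following sketch estimates the two mean effective strains and evaluates the linear model at an intermediate holding pressure. The choice of observations is an assumption made for demonstration only (two rows of Table A1, not necessarily the paper's IE split), the function names are hypothetical, and the mean is taken as a plain average over the available samples.

```python
W0 = 20.0  # original part width in mm


def mean_effective_strain(widths, w0=W0):
    """Mean of (w - w0) / w0 over the given measurements (cf. Eqs. (17), (18))."""
    return sum((w - w0) / w0 for w in widths) / len(widths)


def elastic_strain(p_hold, p_lo, p_hi, eps_lo, eps_hi):
    """Linear interpolation between the two fitted strains (cf. Eq. (19))."""
    return (p_hold - p_lo) / (p_hi - p_lo) * (eps_hi - eps_lo) + eps_lo


# Illustrative selection from Table A1 (assumed, not the paper's data split):
# DoE 11 (holding pressure 20) and DoE 17 (holding pressure 80), same other settings.
eps_lo = mean_effective_strain([19.581])
eps_hi = mean_effective_strain([19.713])
eps_mid = elastic_strain(50.0, 20.0, 80.0, eps_lo, eps_hi)
print(eps_lo, eps_hi, eps_mid)
```

With the sign convention of Equations (17) and (18) as written, the effective strains are negative for a part that is narrower than the original geometry.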

The corrected displacement values are calculated using the same integration scheme, incorporating the estimated elastic strain as follows:

u_{corrected} (x_{i}) = (x_{i} - x_{i - 1}) (ϵ_{k l}^{th} (x_{i}, θ) + ϵ_{i j} (P_{hold})) + u_{corrected} (x_{i - 1})

(20)

The resulting part width is then determined using the corrected displacement values as follows:

w (θ) = P_{D} (θ) = w_{0} - \int_{x_{\min}}^{x_{\max}} (ϵ_{k l}^{th} (x, θ) + ϵ_{i j} (P_{hold})) d x

(21)
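Because the elastic strain of Equation (19) depends only on the holding pressure and not on x, the integral in Equation (21) can be split. The following short derivation is a sketch that takes the leading term of Equation (21) to be the original width w_{0}, as in Equation (16):

```latex
w(θ) = w_{0} - \int_{x_{\min}}^{x_{\max}} ϵ_{kl}^{th}(x, θ) \, dx - (x_{\max} - x_{\min}) \, ϵ_{ij}(P_{hold})
     = P(θ) - (x_{\max} - x_{\min}) \, ϵ_{ij}(P_{hold})
```

Under this reading, the FL correction shifts the physics-based width P(θ) by an amount that is linear in the holding pressure, with x_{\max} - x_{\min} = 20 mm.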

3.4.2. DM

The DM approach builds on the physics-based model by using the same inputs as the purely data-based approach, but focuses on predicting the deviation from the physics-based model. By subtracting the physics-based model

P (θ_{data})

from the observed data

w_{data}

, one obtains the delta which is learned by the delta model using a NN.
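The delta-modeling idea can be sketched as follows. This is only an illustration under stated assumptions: a one-dimensional least-squares line stands in for the NN, `physics_width` is a hypothetical pressure-insensitive placeholder for P(θ), and the three measurements are taken from Table A1 (DoE 11, 14, and 17).

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = a * x + b (stand-in for the NN delta model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sxy / sxx
    return a, my - a * mx


def physics_width(p_hold):
    """Hypothetical physics-based prediction P(theta) in mm (placeholder value)."""
    return 19.70


# Experimental widths from Table A1 (DoE 11, 14, 17), varying only the holding pressure.
p_hold = [20.0, 50.0, 80.0]
w_data = [19.581, 19.659, 19.713]

# Delta target: observed width minus physics-based prediction.
delta = [w - physics_width(p) for p, w in zip(p_hold, w_data)]
a, b = fit_line(p_hold, delta)


def hybrid_width(p):
    """Delta-model prediction: physics-based width plus the learned correction."""
    return physics_width(p) + a * p + b


print(hybrid_width(50.0))
```

The data-driven part only has to learn the residual, which is typically smaller and smoother than the full response.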

3.4.3. DM + FL

The DM approach is extended by using the physics-based model calibrated via the FL approach. Within this hybrid approach, the data are used to calibrate the physics model, as well as to learn the differences between the calibrated physics model and the observed data points.

3.4.4. FT

For the FT approach, first, a NN is pretrained on a separate dataset created by the physical model

P (θ)

. The full-factorial DoE with three steps (Table 1) is used to create discrete predictions by the physics-based model. The actual training data are then used to fine-tune the pretrained model. The model uses the same inputs as a purely data-driven model and directly estimates the resulting part width. Notably, this approach does not involve freezing any layers during fine-tuning (i.e., transfer learning) as initial studies indicated that doing so does not enhance performance for the shallow networks employed.
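The two-stage procedure (pretraining on physics-based predictions, then continuing training on measured data without freezing parameters) can be illustrated with a deliberately small example. A linear model trained by gradient descent stands in for the NN here; the physics-based targets are hypothetical, while the measured offsets are derived from Table A1 (DoE 11, 14, and 17, relative to 19.7 mm).

```python
def train(params, xs, ys, lr=0.1, steps=200):
    """Gradient descent on the MSE of y = a * x + b (stand-in for NN training)."""
    a, b = params
    n = len(xs)
    for _ in range(steps):
        grad_a = sum(2 * (a * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (a * x + b - y) for x, y in zip(xs, ys)) / n
        a, b = a - lr * grad_a, b - lr * grad_b
    return a, b


def mse(params, xs, ys):
    """Mean squared error of the linear model on the given samples."""
    a, b = params
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Inputs: normalized holding pressure; targets: width offset in mm relative to 19.7 mm.
x_phys = [-1.0, -0.5, 0.0, 0.5, 1.0]
y_phys = [-0.010, -0.005, 0.0, 0.005, 0.010]  # hypothetical physics-based predictions
x_data = [-1.0, 0.0, 1.0]
y_data = [-0.119, -0.041, 0.013]  # from Table A1, DoE 11, 14, 17

pretrained = train((0.0, 0.0), x_phys, y_phys)  # stage 1: pretrain on physics data
fine_tuned = train(pretrained, x_data, y_data)  # stage 2: all parameters stay trainable
print(mse(pretrained, x_data, y_data), mse(fine_tuned, x_data, y_data))
```

All parameters remain trainable in the second stage, mirroring the statement that no layers are frozen.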

3.4.5. FT + FL

This extension of the FT method involves pretraining the NN on a dataset specifically generated through the FL approach. This step aims to leverage the refined data insights from FL for an even more accurate model initialization before fine-tuning with actual training data.

3.4.6. PP

For the PP approach, the prediction of the physics-based model serves as an additional input feature of the NN. This means that the network has four input parameters, the three process settings as well as the physics-based prediction

P (θ)

. The output of the network is the prediction of the final width.
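A minimal sketch of the input augmentation is given below. The function `physics_width` is a hypothetical placeholder for the physics-based model P(θ), and its coefficients are invented for illustration; only the structure of the four-element input vector follows the description above.

```python
def physics_width(t_mold, p_hold, v_inj):
    """Hypothetical placeholder for the physics-based prediction P(theta) in mm."""
    return 19.70 - 0.0003 * (t_mold - 99.0)


def augment(theta):
    """Append the physics-based prediction to the three process settings."""
    t_mold, p_hold, v_inj = theta
    return [t_mold, p_hold, v_inj, physics_width(t_mold, p_hold, v_inj)]


features = augment((99.0, 50.0, 30.0))
print(features)
```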

3.4.7. PP + FL

Like in the previous hybrid approaches, the combination of the PP approach with the FL approach is created by substituting the standard physics-based prediction

P (θ)

with the enhanced prediction

P_{D} (θ)

from the FL approach.

3.4.8. PC

Within this work, a soft constraint approach is examined, using conditional PINNs [30]. PINNs have the ability to incorporate spatiotemporal ODEs and PDEs into the learning process using additional physics-based loss terms [31]. Due to the general applicability of differential descriptions within the engineering domain, the approach has been used to study, e.g., heat equations [32], flow equations [33], and solid mechanics [29]. In [22,30], a more detailed description of the PINN methodology for solving parameterized ODEs and PDEs is given.

For the implementation of the physics-based shrinkage behavior with PINNs, the complexity of the data-based model is increased by adding a 1D spatial domain. The input parameters of the NN include the location parameter x next to the process parameters

θ

. Instead of predicting the resulting part width directly, the displacement

u_{P C} (x, θ)

is modeled like in the physics-based approach. The final width can be analytically obtained again by using the predicted displacement

u_{P C} (x, θ)

. The data-based losses as well as the physics-based losses for the displacement are introduced in the following.

Data-Based Loss

Since the network predicts the displacement, the resulting part width must be transformed into displacement data. While the final displacements are available for the simulation dataset, for the real parts, only the resulting part width is measured. To generate displacement data from width measurements, a simple linear displacement is assumed, starting from a zero-displacement point

x_{0}

, like in the physics-based model. The effective strain is calculated as follows:

ϵ_{eff} (θ) = \frac{w_{0} - w_{data} (θ)}{w_{0}}

(22)

Using the strain–displacement relationship, displacement data

u_{data} (x, θ)

are derived for both the minimum (

x_{\min}

) and maximum (

x_{\max}

) values of the 1D domain:

u_{data} (x_{\min}, θ) = (x_{0} - x_{\min}) ϵ_{eff} (θ)

(23)

u_{data} (x_{\max}, θ) = (x_{0} - x_{\max}) ϵ_{eff} (θ)

(24)
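Equations (22) to (24) amount to a small conversion from a measured width to two boundary displacements. The sketch below assumes the zero-displacement point x_0 = 0, matching the initial condition stated with Equation (15); the function name is hypothetical and the example width is DoE 14 from Table A1.

```python
def boundary_displacements(w_data, w0=20.0, x0=0.0, x_min=0.0, x_max=20.0):
    """Convert a measured width into displacement data at both boundary points."""
    eps_eff = (w0 - w_data) / w0    # effective strain, Eq. (22)
    u_min = (x0 - x_min) * eps_eff  # Eq. (23)
    u_max = (x0 - x_max) * eps_eff  # Eq. (24)
    return u_min, u_max


u_min, u_max = boundary_displacements(19.659)  # DoE 14 in Table A1
print(u_min, u_max)
```

With x_0 at the lower boundary, the entire width change appears as displacement of the upper boundary point.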

The data-based loss is defined as follows:

L_{Data_{PC}} = \frac{1}{N_{Data_{PC}}} \sum_{i = 1}^{N_{Data_{PC}}} \| u_{PC} (x_{i}, θ_{i}) - u_{data} (x_{i}, θ_{i}) \|

(25)

The number of training points, in this case

N_{D a t a_{P C}} = N_{θ_{train}} N_{x_{train}}

, is the number of training settings

N_{θ_{train}}

multiplied by the number of observed (boundary) points

N_{x_{train}} = 2

.

Physics-Based Loss

The network architecture is chosen such that by using automatic differentiation, the gradient

\frac{d u_{P C} (x, θ)}{d x}

can be obtained for creating a physics-based loss [34]. The physical loss uses a soft constraint, as well as the same physical process description and pretrained temperature and pressure models from Section 3.3.2, to obtain the total strain

ϵ_{i j}^{total} (θ)

during the training process:

L_{ϵ} = \frac{1}{N_{ϵ}} \sum_{i = 1}^{N_{ϵ}} (\frac{d u_{P C} (x, θ)}{d x} - ϵ_{i j}^{total} (θ))

(26)

An additional loss enforces a zero-displacement condition at the point

x_{0}

across the input domain of the process parameters

θ

:

L_{x_{0}} = \frac{1}{N_{x_{0}}} \sum_{i = 1}^{N_{x_{0}}} u_{P C} (x = x_{0}, θ)

(27)
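The two physics-based loss terms can be illustrated with a small, framework-free sketch. Several assumptions apply: the derivative is approximated by central finite differences instead of automatic differentiation, an absolute-value norm is used by analogy with Equation (25), and the surrogate `u_model` and the constant target strain are hypothetical functions chosen so that both losses vanish.

```python
def strain_loss(u, points, total_strain, h=1e-4):
    """Mean absolute residual between du/dx and the target strain (cf. Eq. (26)).

    Central differences replace automatic differentiation in this sketch.
    """
    residual = 0.0
    for x, theta in points:
        dudx = (u(x + h, theta) - u(x - h, theta)) / (2 * h)
        residual += abs(dudx - total_strain(theta))
    return residual / len(points)


def zero_displacement_loss(u, thetas, x0=0.0):
    """Mean absolute displacement at the point x0 (cf. Eq. (27))."""
    return sum(abs(u(x0, theta)) for theta in thetas) / len(thetas)


def total_strain(theta):
    """Hypothetical constant target strain."""
    return 0.015


def u_model(x, theta):
    """Hypothetical surrogate that satisfies both constraints exactly."""
    return total_strain(theta) * x


points = [(5.0, None), (10.0, None), (15.0, None)]
l_eps = strain_loss(u_model, points, total_strain)
l_x0 = zero_displacement_loss(u_model, [None, None])
print(l_eps, l_x0)
```

In an actual PINN, `u_model` would be the network and the derivative would be obtained from the automatic differentiation facilities of the training framework.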

While the observed points of the data-based loss are only on the boundary, the physics-based loss can be applied across the whole input domain. Specifically, the domain for

L_{ϵ}

, denoted as

Ω_{ϵ}

, spans both the process parameter space

Ω_{θ}

and the spatial domain

Ω_{x}

, whereas the loss enforcing zero displacement

L_{x_{0}}

operates over

Ω_{θ}

and at a specific spatial location

x_{0}

. The process parameter domain

Ω_{θ}

is within the range of values of the used full-factorial DoE (Table 1). The spatial domain is

Ω_{x} = [0 mm, 20 mm]

, the width of the original geometry. To generate data points for evaluating the physics-based loss, a random uniform sampling strategy is employed. The number of points

(θ_{i}, x_{i})

sampled at each training step is as follows:

N_{ϵ} = N_{x_{0}} = 1200

.
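The random uniform sampling of collocation points can be sketched as follows. The parameter ranges are read from the DoE levels listed in Table A1, the dictionary keys and function name are hypothetical, and the seed is an arbitrary choice for reproducibility; how the points are redrawn per training step in the original implementation is not specified here.

```python
import random

# Parameter ranges taken from the DoE levels in Table A1; spatial domain in mm.
BOUNDS = {"temperature": (84.0, 114.0), "p_hold": (20.0, 80.0), "v_inj": (10.0, 50.0)}
X_RANGE = (0.0, 20.0)


def sample_collocation(n, rng, x_fixed=None):
    """Draw n points (theta, x) uniformly; x is fixed for the zero-displacement loss."""
    points = []
    for _ in range(n):
        theta = tuple(rng.uniform(lo, hi) for lo, hi in BOUNDS.values())
        x = rng.uniform(*X_RANGE) if x_fixed is None else x_fixed
        points.append((theta, x))
    return points


rng = random.Random(0)  # arbitrary seed
pts_eps = sample_collocation(1200, rng)  # points for the strain loss
pts_x0 = sample_collocation(1200, rng, x_fixed=0.0)  # points for the zero-displacement loss
print(len(pts_eps), len(pts_x0))
```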

The optimization of PINNs relies on a loss function that is a weighted sum of the data-based loss

L_{D a t a_{P C}}

, the zero-displacement loss

L_{x_{0}}

, and the physics-based loss

L_{ϵ}

[31]:

L = w_{Data_{PC}} L_{Data_{PC}} + w_{x_{0}} L_{x_{0}} + w_{ϵ} L_{ϵ}

(28)

The weighting factors for these losses can be adjusted to prioritize certain aspects of the model. In this study, the weights

w_{x_{0}}

and

w_{D a t a_{P C}}

are set to 1, while the weight

w_{ϵ}

is significantly higher at 100. This higher weighting for

L_{ϵ}

is chosen to balance the scale of the loss terms, considering that the magnitude of strain values within

L_{ϵ}

is generally smaller than that of the overall displacement values. This strategic weighting ensures that each aspect of the loss function contributes appropriately to the model’s training, aiming for an accurate and physically consistent prediction of the system behavior.
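The effect of the weighting in Equation (28) can be seen in a short numerical sketch. The loss values below are hypothetical and only serve to show the scale balancing; the default weights are the ones reported in the text.

```python
def total_loss(l_data, l_x0, l_eps, w_data=1.0, w_x0=1.0, w_eps=100.0):
    """Weighted sum of the three loss terms (cf. Eq. (28))."""
    return w_data * l_data + w_x0 * l_x0 + w_eps * l_eps


# Hypothetical values: a strain residual of 1e-4 contributes as much as a
# displacement residual of 1e-2 once it is weighted by 100.
loss = total_loss(l_data=1e-2, l_x0=0.0, l_eps=1e-4)
print(loss)
```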

3.4.9. PC + FL

For the extension of the PC approach with the FL approach, the corrected total strain

ϵ_{k l}^{t h} (x_{i}, θ) + ϵ_{i j} (P_{h o l d})

(Equation (12)) is used within the physics-based loss function

L_{ϵ}

(Equation (26)). This adjustment allows the model to incorporate more accurate physical constraints derived from both theoretical and empirical insights, enhancing the model’s predictive accuracy.

3.4.10. PC + FT + FL

This approach leverages the 1D displacement calculations from the physics-based model to pretrain the NN. By utilizing the detailed displacement data, the network gains an initial understanding of the physical dynamics. The final hybrid model emerges from fine-tuning this pretrained network with the additional physical constraints and observed data, creating a model that benefits from both the depth of physics-based analysis and the adaptability of ML techniques.

4. Results

In this section, the results of the different hybrid approaches are presented, assessing their performance across different datasets (simulated and experimental) and data splits (IE and CE). First, the results of the purely data-based and physics-based approaches are shown and discussed, after which the results of the hybrid approaches are compared.

4.1. Base Models

For a first comparison of the base models, contour plots are created through the linear interpolation of the part width at the process settings of the full-factorial DoE with three levels. These plots are shown in Figure 5a for the simulation dataset and in Figure 5b for the experimental dataset.

First, a comparison of the simulated and experimental datasets reveals the increased complexity of the real process behavior. This is also reflected in the

M A E_{10}^{t o t a l}

, where the data-based model can fit the IE data split with an

M A E_{10}^{t o t a l}

of

0.0083

mm on the simulated dataset, while on the experimental dataset, the

M A E_{10}^{t o t a l}

increases to

0.0168

mm. Importantly, the average standard deviation observed in the experimental width measurements (±0.0069 mm) indicates the potential lowest

M A E

attainable for the experimental dataset. Given that the data splits only encompass linear effects, the purely data-driven model’s predictions are inherently linear.

While the simplified physics-based model demonstrates its capability to predict the directionality of effects for

T_{mold}

and

V_{i n j}

, an overall offset can be observed. This offset is more pronounced at lower

P_{h o l d}

, suggesting that the model’s inability to account for residual stresses contributes to its discrepancies. While the model aligns well with the simulated effects of

T_{mold}

and

V_{i n j}

, it fails to capture the non-linear dynamics present in the real process, particularly under varying

P_{h o l d}

.

Following this, the results of the hybrid approaches are shown, first on the simulated dataset and then on the experimental dataset.

4.2. Hybrid Approaches

4.2.1. Simulated Dataset

The summarized results in Table 6 highlight the train

M A E_{10}^{t r a i n}

, validation

M A E_{10}^{v a l}

, and total

M A E_{10}^{t o t a l}

across different models and data splits, offering a quantitative basis for assessing the performance of each hybrid approach.

The analysis reveals a clear trend: hybrid modeling patterns mostly achieve a better accuracy and lower standard deviation than the purely data-based model. This improvement underscores the significant advantage of incorporating physics-based insights into ML models, enhancing their predictive accuracy, robustness, and generalization capabilities.

Feature Learning (FL)

The FL approach shows an improvement in accuracy over the purely physics-based model, with an IE

M A E_{10}^{t o t a l}

of 0.0054 mm and a CE

M A E_{10}^{t o t a l}

of 0.0057 mm. Since the final prediction of the FL approach is physics-based, the standard deviation is

0.0

. This indicates the effectiveness of integrating data-based features, which enables the physics-based model to better capture the underlying dynamics of the injection molding process, thereby reducing prediction errors. However, since only the variations of the holding pressure are captured, the purely data-based model outperforms the FL approach using the CE data split. On the other hand, with only two examples, the physics-based model is able to extrapolate the unobserved

T_{mold}

and

V_{inj}

variations with good accuracy within the IE data split.

Delta Model (DM) and Fine-Tuning (FT)

Both DM and FT approaches exhibit promising performances, with the FT approach being slightly better with a CE

M A E_{10}^{t o t a l}

of 0.0027 mm compared to the DM’s CE

M A E_{10}

of 0.0029 mm. Across both data splits, the combined FT+FL approach achieves the lowest validation

M A E_{10}^{v a l}

.

Physics-Based Preprocessing (PP)

The PP approach, while not achieving the top performance, still improves upon the purely data-based model with a CE

M A E_{10}^{t o t a l}

of 0.0031 mm. This highlights the benefit of augmenting the input feature space with physics-based predictions, providing the model with additional contextual information for more accurate predictions.

Physical Constraints (PC)

Remarkably, the PC approach, especially when combined with FL (PC + FL), achieves one of the lowest

M A E_{10}^{t o t a l}

, with a CE

M A E_{10}^{t o t a l}

of 0.0024 mm. The PC approaches notably achieve the lowest

M A E_{10}^{t r a i n}

. This superior performance is likely due to the direct integration of physical laws as regularization terms, ensuring that predictions not only adhere to known physical constraints but also align closely with the observed data. In contrast, while the L2 regularization employed in the other approaches is pivotal for robust generalization, it results in a higher

M A E_{10}^{t r a i n}

.

Combined Hybrid Approach

The combined hybrid model PC + FL + FT stands out with an impressive CE

M A E_{10}^{t o t a l}

of 0.0022 mm, showcasing the potential of leveraging multiple hybrid modeling patterns for enhanced accuracy. This approach, by integrating feature learning, fine-tuning, and physical constraints, effectively captures the complex dynamics of the injection molding process, resulting in the most accurate surrogate model for the CE data split.

The comparative analysis of hybrid ML approaches on the simulated dataset demonstrates the clear benefits of integrating physics-based knowledge with data-driven models. The combined hybrid models, particularly PC + FL + FT, emerge as the most effective in accurately capturing part shrinkage in injection molding processes. While each hybrid approach uses the same underlying physics-based model, combining different strategies of extracting knowledge can lead to superior results. Concerning the overall accuracy, already with the IE data split, the

M A E

, obtained using a hybrid modeling approach, falls below the average repetition accuracy of the experimental process of

\pm 0.0069

mm. By utilizing more accurate and data-efficient underlying surrogate models, this approach enhances the potential for improved optimization tasks in the injection molding process, as, e.g., illustrated in [4].

4.2.2. Experimental Dataset

In this subsection, the hybrid ML approaches are further examined using the experimental dataset to gain a more nuanced understanding of their performance in real-world scenarios. The results, shown in Table 7, offer a granular view of the predictive accuracy of each model across two different data splits. This analysis is instrumental in showcasing the practical applicability of these hybrid models in the domain of injection molding, particularly for the prediction of part shrinkage.

The analysis of the experimental dataset reveals a complex landscape of model performance, with hybrid approaches generally demonstrating enhanced predictive capabilities over purely data-driven and physics-based models. The data-based model exhibits an IE

M A E_{10}^{t o t a l}

of 0.0146 mm and a CE

M A E_{10}^{t o t a l}

of 0.0124 mm, setting a baseline for the evaluation of hybrid models. On the other hand, the average standard deviation of ±0.0069 mm sets the benchmark for an optimal

M A E

with the attained experimental measurements.

Feature Learning (FL)

The FL approach shows the same

M A E_{10}^{t o t a l}

in the IE data split (0.0114 mm) and the CE data split (0.0114 mm). For the CE data split, this achieves the best validation

M A E_{10}^{v a l}

, showcasing the capability of the calibrated physics-based model to predict the effects of

T_{mold}

and

V_{i n j}

in real applications as well. The robust extrapolation capability, especially for the IE data split, is supported by the mostly linear dependency between

P_{h o l d}

and the width w in the given design space. However, the linear regression model used is not flexible enough to fit non-linear relationships, which is reflected in a comparatively high

M A E_{10}^{t r a i n}

.

Delta Model (DM) and Fine-Tuning (FT)

Both the DM and FT approaches show a good performance on the experimental dataset, with the FT approach marginally outperforming the DM in the CE data split with an

M A E_{10}^{t o t a l}

of 0.0111 mm. This suggests that the strategy of leveraging pretraining on physics-based predictions, followed by fine-tuning with empirical data, is beneficial in real-world applications. Combining either hybrid modeling pattern with the FL approach further improves the prediction accuracy.

Physics-Based Preprocessing (PP)

The PP approach, despite its integration of physics-based predictions as input features, yields a CE

M A E_{10}^{t o t a l}

of 0.0124 mm, mirroring the baseline set by the purely data-driven model. However, a lower standard deviation indicates an improved robustness. The combined approach, PP + FL, shows enhanced robustness and higher prediction accuracy compared to the data-driven model.

Physical Constraints (PC)

The PC approach, particularly in its standalone form, encounters challenges, as shown by a CE

M A E_{10}^{t o t a l}

of 0.0276 mm. However, when combined with FL (PC + FL), it demonstrates a marked improvement, achieving a CE total

M A E_{10}

of 0.0111 mm and the best performance on the IE data split with an

M A E_{10}^{t o t a l}

of 0.0104 mm. Furthermore, it demonstrates a comparable or lower standard deviation than the purely data-driven approach, thereby enhancing its robustness. This improvement illustrates the value of embedding physical laws directly into the learning process, especially when calibrated with data-driven insights.

Combined Hybrid Approach

The combined hybrid model PC + FL + FT exhibits a slightly worse performance than PC + FL, with a CE

M A E_{10}^{t o t a l}

of 0.0121 mm. The results indicate that adding more hybrid modeling patterns does not always improve the model. While various hybrid models can extract information, the total amount of information in the data and physics is finite, creating a potential upper limit. Nevertheless, leveraging the unique strengths of different hybrid modeling patterns improves the robustness in the studied case, with room for further optimization.

Overall, the examined hybrid modeling approaches showed an improved and more robust prediction accuracy for the real injection molding process for both data splits. For the IE data split, where less training data are available, adding physical knowledge has a greater effect. While the models trained on simulation data could surpass the repetition accuracy of the measurement device, the increased non-linear behavior within the real process increases the overall

M A E

for all approaches. However, the achieved accuracy of the surrogate model is already close to the measurement accuracy of the device at hand.

5. Conclusions and Outlook

In this work, a comprehensive exploration of hybrid ML approaches for the surrogate modeling of part shrinkage in injection molding processes has been carried out. Surrogate models of this type form the basis of most optimization workflows for injection molding. Increasing their overall accuracy and data efficiency through hybrid modeling patterns therefore benefits the subsequent optimization routines.

By comparing five distinct individual hybrid modeling patterns and developing more complex hybrid strategies systematically, the potential of integrating physics-based knowledge with data-driven ML models to enhance predictive accuracy and generalizability has been shown. The used formalization of these hybrid modeling patterns provides a clear framework for understanding and combining the different approaches.

The comparative analysis across both simulated and experimental datasets revealed that hybrid approaches generally outperform purely data-driven and physics-based models, with a higher relative improvement in terms of accuracy observed for the smaller IE data split. Notably, the FL approach, DM, and FT strategies emerged as particularly effective, with the FT approach showing a marked improvement in predictive accuracy, especially in the CE data split. This underscores the value of pretraining models on physics-based predictions before refining them with empirical data, allowing for a foundational understanding of process physics that is enhanced through exposure to real-world data. The FL approach proved to be the most efficient, particularly for the IE data split, where it could extrapolate to different temperature and injection speed settings without any reference data.

The PC approach, especially when combined with FL as PC + FL, showcased a robust capability to navigate the complexities inherent in experimental data, achieving one of the lowest

M A E

across both data splits. This illustrates the effectiveness of embedding physical laws directly into the learning process that is calibrated with data-driven insights. More complex hybrid models, combining multiple hybrid modeling patterns, outperformed the individual approaches in most cases, with the FT + FL approach having the best average performance and the more complex PC + FL approach performing best on the real dataset. The exploration of further combined hybrid patterns, particularly PC + FL + FT, highlighted the potential of leveraging multiple hybrid modeling patterns.

This investigation into hybrid ML approaches for surrogate modeling in injection molding has laid a foundation for future research aimed at optimizing these models for greater efficiency and integrating them in process optimization tasks. Research must also continue on improving the purely data-based and the physics-based models individually. A further development of the physics-based models is necessary for a more detailed integration of the underlying physical principles. For example, the utilized strain–displacement relationship could be extended to integrate the spatiotemporal stress–strain-based process description. However, the non-linear behavior of the real process that is not covered by the physics-based model can only be resolved by adding further observations to the training dataset. Consequently, further investigations into the optimal amount of training data, along with a selection strategy between the different hybrid modeling approaches, are needed for an optimal utilization of the underlying physics- and data-based knowledge. Combining hybrid modeling techniques with dynamic DoE approaches, e.g., active learning, could further enhance the models’ predictive capabilities while keeping the necessary resources to a minimum. Possible future extensions include the integration of additional material, process, and geometric parameters, as well as the prediction of further process and quality attributes. This would enhance the models’ applicability for subsequent optimization tasks, thus offering a more comprehensive approach to optimizing processes, materials, and designs for polymer products. As these hybrid models are improved, it is expected that they will play a key role in creating digital twins of injection molding processes, ultimately contributing to the optimization and production of higher quality parts with reduced time and cost.

Author Contributions

Conceptualization, M.W., S.R.R., M.S. and C.H.; methodology, M.W.; software, M.W.; validation, M.W., S.R.R. and M.S.; formal analysis, M.W.; investigation, M.W.; resources, S.R.R. and C.H.; data curation, M.W.; writing—original draft preparation, M.W.; writing—review and editing, S.R.R., M.S. and C.H.; visualization, M.W.; supervision, C.H.; project administration, S.R.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Datasets generated during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

Authors Manuel Wenzel and Sven Robert Raisch were employed by the company Robert Bosch GmbH. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

NN	Neural Network
ML	Machine Learning
PINN	Physics-Informed Neural Network
RSM	Response Surface Method
ANOVA	Analysis of Variance
DoE	Design of Experiments
POM	Polyoxymethylene
CTE	Coefficient of Thermal Expansion
FT	Fine-Tuning
PP	Preprocessing
PC	Physical Constraints
DM	Delta Model
FL	Feature Learning
CE	Combined Effects
IE	Individual Effects

Appendix A

Table A1. DoE used, with the simulated and experimentally measured part widths.

$T_{cool}$ [°C]	$P_{hold}$ [MPa]	$V_{inj}$ [cm³/s]	DoE No.	Sim. Width [mm]	Exp. Width ± Std. [mm]
84	20	10	1	19.695	19.615 ± 0.0017
84	20	30	2	19.692	19.591 ± 0.0023
84	20	50	3	19.691	19.586 ± 0.0021
84	50	10	4	19.705	19.668 ± 0.0021
84	50	30	5	19.704	19.653 ± 0.004
84	50	50	6	19.702	19.6785 ± 0.0029
84	80	10	7	19.721	19.751 ± 0.0005
84	80	30	8	19.715	19.724 ± 0.0103
84	80	50	9	19.713	19.786 ± 0.0167
99	20	10	10	19.694	19.603 ± 0.001
99	20	30	11	19.691	19.581 ± 0.0016
99	20	50	12	19.689	19.568 ± 0.0063
99	50	10	13	19.702	19.659 ± 0.01
99	50	30	14	19.700	19.659 ± 0.0025
99	50	50	15	19.697	19.654 ± 0.0026
99	80	10	16	19.717	19.7545 ± 0.0143
99	80	30	17	19.711	19.713 ± 0.0127
99	80	50	18	19.707	19.713 ± 0.0168
114	20	10	19	19.689	19.576 ± 0.0034
114	20	30	20	19.684	19.557 ± 0.0034
114	20	50	21	19.681	19.55 ± 0.0053
114	50	10	22	19.696	19.638 ± 0.0056
114	50	30	23	19.691	19.623 ± 0.0031
114	50	50	24	19.686	19.617 ± 0.0052
114	80	10	25	19.705	19.739 ± 0.0171
114	80	30	26	19.697	19.7135 ± 0.0203
114	80	50	27	19.692	19.702 ± 0.0047

Figure A1. Examples of exported temperature profiles within the cross section of the used part geometry at specific time steps of the simulation for different process parameter combinations. The network

N_{T} (t, x, θ)

is used to capture the dataset.

Figure A2. Examples of exported pressure profiles at the central node of the cross section of the used part geometry for different process parameter combinations. The network

N_{P} (t, x, θ)

is used to capture the dataset.

References

1. Kennedy, P.; Zheng, R. Flow Analysis of Injection Molds, 2nd ed.; Carl Hanser Verlag GmbH & Co. KG: Munich, Germany, 2013.
2. Arinez, J.F.; Chang, Q.; Gao, R.X.; Xu, C.; Zhang, J. Artificial intelligence in advanced manufacturing: Current status and future outlook. J. Manuf. Sci. Eng. 2020, 142, 110804.
3. Kenig, S.; Ben-David, A.; Omer, M.; Sadeh, A. Control of properties in injection molding by neural networks. Eng. Appl. Artif. Intell. 2001, 14, 819–823.
4. Song, Z.; Liu, S.; Wang, X.; Hu, Z. Optimization and prediction of volume shrinkage and warpage of injection-molded thin-walled parts based on neural network. Int. J. Adv. Manuf. Technol. 2020, 109, 755–769.
5. Manjunath, P.G.; Krishna, P. Prediction and Optimization of Dimensional Shrinkage Variations in Injection Molded Parts Using Forward and Reverse Mapping of Artificial Neural Networks. Adv. Mater. Res. 2012, 463–464, 674–678.
6. Reiter, M.; Stemmler, S.; Hopmann, C.; Ressmann, A.; Abel, D. Model Predictive Control of Cavity Pressure in an Injection Moulding Process. IFAC Proc. Vol. 2014, 47, 4358–4363.
7. Stemmler, S.; Vukovic, M.; Ay, M.; Heinisch, J.; Lockner, Y.; Abel, D.; Hopmann, C. Quality Control in Injection Molding based on Norm-optimal Iterative Learning Cavity Pressure Control. IFAC-PapersOnLine 2020, 53, 10380–10387.
8. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90.
9. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
10. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics informed deep learning (part i): Data-driven solutions of nonlinear partial differential equations. arXiv 2017, arXiv:1711.10561.
11. von Rueden, L.; Mayer, S.; Beckh, K.; Georgiev, B.; Giesselbach, S.; Heese, R.; Kirsch, B.; Walczak, M.; Pfrommer, J.; Pick, A.; et al. Informed Machine Learning—A Taxonomy and Survey of Integrating Prior Knowledge into Learning Systems. IEEE Trans. Knowl. Data Eng. 2021, 35, 614–633.
12. Zheng, R.; Tanner, R.I.; Fan, X.J. Injection Molding; Springer: Berlin/Heidelberg, Germany, 2011.
13. Chen, C.C.; Su, P.L.; Lin, Y.C. Analysis and modeling of effective parameters for dimension shrinkage variation of injection molded part with thin shell feature using response surface methodology. Int. J. Adv. Manuf. Technol. 2009, 45, 1087–1095.
14. Kennedy, P.; Zheng, R. High Accuracy Shrinkage and Warpage Prediction for Injection Molding 525. In Proceedings of the Society of Plastics Engineers, ANTEC 2002 Conference Proceedings, San Francisco, CA, USA, 5–9 May 2002; Volume 1, pp. 593–599.
15. Hopmann, C.; Xiao, C.; Kahve, C.E.; Fellerhoff, J. Prediction and validation of the specific volume for inline warpage control in injection molding. Polym. Test. 2021, 104, 107393.
16. Chen, W.C.; Kurniawan, D. Process parameters optimization for multiple quality characteristics in plastic injection molding using Taguchi method, BPNN, GA, and hybrid PSO-GA. Int. J. Precis. Eng. Manuf. 2014, 15, 1583–1593.
17. Oliaei, E.; Heidari, B.S.; Davachi, S.M.; Bahrami, M.; Davoodi, S.; Hejazi, I.; Seyfi, J. Warpage and Shrinkage Optimization of Injection-Molded Plastic Spoon Parts for Biodegradable Polymers Using Taguchi, ANOVA and Artificial Neural Network Methods. J. Mater. Sci. Technol. 2016, 32, 710–720.
18. Heinisch, J.; Lockner, Y.; Hopmann, C. Comparison of design of experiment methods for modeling injection molding experiments using artificial neural networks. J. Manuf. Process. 2021, 61, 357–368.
19. Rudolph, M.; Kurz, S.; Rakitsch, B. Hybrid modeling design patterns. J. Math. Ind. 2024, 14, 3.
20. Chen, J.Y.; Tseng, C.C.; Huang, M.S. Quality Indexes Design for Online Monitoring Polymer Injection Molding. Adv. Polym. Technol. 2019, 2019, 3720127.
21. Saad, S. Towards the Use of Surrogate Modeling in Model Parameter Calibration in Injection Molding Process Simulation. Ph.D. Thesis, HESAM Université, Paris, France, 2022.
22. Wenzel, M.; Raisch, S.R.; Saad, S.; Schmitz, M.; Hopmann, C. Hybrid Modeling of the injection molding process using PINNs. In Proceedings of the SPE ANTEC 2023—Proceedings, Denver, CO, USA, 27–30 March 2023; pp. 1–6.
23. Weiss, K.; Khoshgoftaar, T.M.; Wang, D. A survey of transfer learning. J. Big Data 2016, 3, 1–40.
24. Hopmann, C.; Jeschke, S.; Meisen, T.; Thiele, T.; Tercan, H.; Liebenberg, M.; Heinisch, J.; Theunissen, M. Combined learning processes for injection moulding based on simulation and experimental data. AIP Conf. Proc. 2019, 2139, 030003.
25. Lockner, Y.; Hopmann, C.; Zhao, W. Transfer learning with artificial neural networks between injection molding processes and different polymer materials. J. Manuf. Process. 2022, 73, 395–408.
Saad, S.; Sinha, A.; Cruz, C.; Régnier, G.; Ammar, A. Towards an accurate pressure estimation in injection molding simulation using surrogate modeling. Int. J. Mater. Form. 2022, 15, 72. [Google Scholar] [CrossRef]
Freudenberg, T.; Heilenkötter, N. TorchPhysics. 2021. Available online: https://torchphysics.readthedocs.io/en/latest/ (accessed on 6 March 2022).
Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. arXiv 2019, arXiv:1912.01703. [Google Scholar]
Haghighat, E.; Raissi, M.; Moure, A.; Gomez, H.; Juanes, R. A deep learning framework for solution and discovery in solid mechanics: Linear elasticity. arXiv 2020, arXiv:2003.02751. [Google Scholar]
Kovacs, A.; Exl, L.; Kornell, A.; Fischbacher, J.; Hovorka, M.; Gusenbauer, M.; Breth, L.; Oezelt, H.; Yano, M.; Sakuma, N.; et al. Conditional physics informed neural networks. Commun. Nonlinear Sci. Numer. Simul. 2022, 104, 106041. [Google Scholar] [CrossRef]
Raissi, M.; Perdikaris, P.; Karniadakis, G. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 2019, 378, 686–707. [Google Scholar] [CrossRef]
Cai, S.; Wang, Z.; Chryssostomidis, C.; Karniadakis, G.E. Heat Transfer Prediction With Unknown Thermal Boundary Conditions Using Physics-Informed Neural Networks. In Proceedings of the Fluids Engineering Division Summer Meeting, Virtual, 13–15 July 2020. [Google Scholar] [CrossRef]
Cai, S.; Mao, Z.; Wang, Z.; Yin, M.; Karniadakis, G.E. Physics-informed neural networks (PINNs) for fluid mechanics: A review. Acta Mech. Sin. 2021, 37, 1727–1738. [Google Scholar] [CrossRef]
Baydin, A.G.; Pearlmutter, B.A.; Radul, A.A.; Siskind, J.M. Automatic differentiation in machine learning: A survey. J. Mach. Learn. Res. 2018, 18, 1–43. [Google Scholar]

Figure 1. Example of an injection molding cycle within a PVT diagram for semi-crystalline thermoplastics.

Figure 2. Illustration of the used part and measurement position.

Figure 3. Simulation model and mold geometry used. (a) The meshed simulation model including the part (dark green), runner and sprue (light green), cooling channels (blue), and feed system (red) [21]. (b) The used mold for the experimental data acquisition.

Figure 4. Used “Individual Effects” (IE) and “Combined Effects” (CE) data split.

Figure 5. Sampled data points for the inverse approach. (a) Contour plots of the simulated widths $w$ dependent on process settings $θ$, where the data-based model has been trained on the IE data split. (b) Contour plots of the experimental widths $w$ dependent on process settings $θ$, where the data-based model has been trained on the IE data split.

Table 1. Steps used in the full-factorial DoE.

$θ$	$T_{mold}$ (°C)	$V_{inj}$ (cm³/s)	$P_{hold}$ (MPa)
+	85	10	80
0	95	30	120
−	105	50	160
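
As a minimal sketch, a full-factorial design over the three levels in Table 1 can be enumerated as the Cartesian product of the levels, which gives 3³ = 27 combinations of process settings. The dictionary keys and the function name below are illustrative assumptions and are not taken from the study.

```python
from itertools import product

# Levels per process setting, copied from Table 1.
# The key names are hypothetical labels chosen for this sketch.
LEVELS = {
    "T_mold_C": [85, 95, 105],      # T_mold in °C
    "V_inj_cm3_s": [10, 30, 50],    # V_inj in cm³/s
    "P_hold_MPa": [80, 120, 160],   # P_hold in MPa
}


def full_factorial(levels):
    """Return every combination of the factor levels as a list of dicts."""
    names = list(levels)
    return [dict(zip(names, combo)) for combo in product(*levels.values())]


design = full_factorial(LEVELS)
print(len(design))  # 27 combinations (3 levels ** 3 factors)
```

Each entry of `design` is one setting vector $θ$; the fixed settings of Table 2 would be held constant across all of them.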

Table 2. Fixed process settings.

Parameter	Value
Holding Time	16 s
Cooling Time	30 s
Melt Temperature	215 °C
Switch-Over	10 cm³

Table 3. Models used in the study and their corresponding formalization.

Model	Formalization
Data-based	$w = D (θ)$
Physics-based	$w = P (θ)$
FL	$w = P_{D} (θ)$
DM	$w = D (θ) + P (θ)$
FT	$w = D (θ) \leftarrow P (θ)$
PP	$w = D (θ, P (θ))$
PC	$w = D_{P} (θ)$
DM + FL	$w = D (θ) + P_{D} (θ)$
FT + FL	$w = D (θ) \leftarrow P_{D} (θ)$
PP + FL	$w = D (θ, P_{D} (θ))$
PC + FL	$w = D_{P_{D}} (θ)$
PC + FT + FL	$w = D_{P_{D}} (θ) \leftarrow P_{D} (θ)$
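
To make the notation in Table 3 concrete, the sketch below expresses two of the formalizations, DM and PP, as function compositions. It assumes that $D$ and $P$ are plain callables mapping process settings $θ$ to a width $w$. The `toy_*` functions are hypothetical stand-ins, not the models from the study. FL, FT, and PC change how $D$ or $P$ is trained rather than how the outputs are combined, so they are not shown here.

```python
from typing import Callable, Sequence

Theta = Sequence[float]  # process settings, assumed order: (T_mold, V_inj, P_hold)


def delta_model(D: Callable[[Theta], float],
                P: Callable[[Theta], float]) -> Callable[[Theta], float]:
    """DM: w = D(θ) + P(θ). D adds a correction to the physics prediction."""
    return lambda theta: D(theta) + P(theta)


def preprocessing_model(D: Callable[[Theta, float], float],
                        P: Callable[[Theta], float]) -> Callable[[Theta], float]:
    """PP: w = D(θ, P(θ)). The physics prediction is an extra input to D."""
    return lambda theta: D(theta, P(theta))


# Hypothetical placeholder models with made-up coefficients (illustration only).
def toy_physics(theta: Theta) -> float:
    return 50.0 - 0.001 * theta[2]


def toy_residual(theta: Theta) -> float:
    return 0.0005 * (theta[0] - 95.0)


def toy_corrector(theta: Theta, p: float) -> float:
    return p + 0.0005 * (theta[0] - 95.0)


dm = delta_model(toy_residual, toy_physics)
pp = preprocessing_model(toy_corrector, toy_physics)
print(dm((105, 30, 120)), pp((105, 30, 120)))
```

With these placeholders both compositions return the same value by construction; with trained models they generally differ, because in PP the data-based model is free to use the physics prediction non-additively.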

Table 4. Hyperparameters and search space for the data-based model, where the resulting optimal network architecture and optimization approach are highlighted.

Parameter	Values
Network Size	[5,5], [10,10], [20,20]
L2 Regularization	0.05, 0.005, 0.00001
Activation Function	ReLU(), Tanh()
Optimizer	Adam, SGD
Learning Rate	1 × 10⁻³, 5 × 10⁻⁴, 1 × 10⁻⁴, 1 × 10⁻⁵
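
The search space in Table 4 can be enumerated as in the sketch below. It assumes an exhaustive grid search, which the table itself does not state, and reads the second optimizer entry as stochastic gradient descent (SGD). The key names are hypothetical, and the training and scoring of each configuration are left out.

```python
from itertools import product

# Search space copied from Table 4; key names are illustrative assumptions.
SEARCH_SPACE = {
    "network_size": [(5, 5), (10, 10), (20, 20)],   # neurons per hidden layer
    "l2_regularization": [0.05, 0.005, 0.00001],
    "activation": ["ReLU", "Tanh"],
    "optimizer": ["Adam", "SGD"],                    # assumes SGD is meant
    "learning_rate": [1e-3, 5e-4, 1e-4, 1e-5],
}


def enumerate_configs(space):
    """Yield one dict per point of the Cartesian grid over the search space."""
    keys = list(space)
    for values in product(*space.values()):
        yield dict(zip(keys, values))


configs = list(enumerate_configs(SEARCH_SPACE))
print(len(configs))  # 3 * 3 * 2 * 2 * 4 = 144 candidate configurations
```

A random or Bayesian search over the same space would replace `product` with a sampler; the grid is shown only because it is the simplest reading of the table.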

Table 5. Network specifications for the surrogate models.

Parameter	Values
Network Size	[80,80,80,80]
Activation Function	Tanh()
Optimizer	Adam
Learning Rate	1 × 10⁻³

Table 6. Model performance comparison for the simulated dataset across the 10-fold cross validation.

Model	IE $MAE_{10}^{train}$ [mm]	IE $MAE_{10}^{val}$ [mm]	IE $MAE_{10}^{total}$ [mm]	CE $MAE_{10}^{train}$ [mm]	CE $MAE_{10}^{val}$ [mm]	CE $MAE_{10}^{total}$ [mm]
Data-Based	0.0035 ± 0.0009	0.0083 ± 0.0015	0.0072 ± 0.0014	0.0041 ± 0.0014	0.0029 ± 0.0024	0.0033 ± 0.0015
Physics-Based	0.0185 ± 0.0	0.0209 ± 0.0	0.0204 ± 0.0	0.0223 ± 0.0	0.0195 ± 0.0	0.0204 ± 0.0
FL	0.0029 ± 0.0	0.0062 ± 0.0	0.0054 ± 0.0	0.0072 ± 0.0	0.0051 ± 0.0	0.0057 ± 0.0
DM	0.0025 ± 0.0006	0.0064 ± 0.0009	0.0056 ± 0.0008	0.0038 ± 0.0012	0.0026 ± 0.0007	0.0029 ± 0.0008
FT	0.0024 ± 0.0009	0.0062 ± 0.0011	0.0053 ± 0.0009	0.0030 ± 0.0011	0.0026 ± 0.0011	0.0027 ± 0.0009
PP	0.0035 ± 0.0018	0.0080 ± 0.0025	0.0070 ± 0.0023	0.0046 ± 0.001	0.0025 ± 0.0005	0.0031 ± 0.0006
PC	0.0005 ± 0.0001	0.0059 ± 0.0043	0.0047 ± 0.0033	0.0070 ± 0.0066	0.0023 ± 0.002	0.0037 ± 0.0033
DM + FL	0.0014 ± 0.0004	0.0053 ± 0.0028	0.0045 ± 0.0023	0.0046 ± 0.0008	0.0030 ± 0.0036	0.0035 ± 0.0027
FT + FL	0.0025 ± 0.0016	0.0041 ± 0.0019	0.0038 ± 0.0018	0.0041 ± 0.0014	0.0023 ± 0.0012	0.0028 ± 0.0012
PP + FL	0.0028 ± 0.0013	0.0073 ± 0.0019	0.0063 ± 0.0018	0.0052 ± 0.0006	0.0028 ± 0.0003	0.0035 ± 0.0004
PC + FL	0.0014 ± 0.0004	0.0048 ± 0.0005	0.0041 ± 0.0005	0.0013 ± 0.0012	0.0028 ± 0.001	0.0024 ± 0.0006
PC + FT + FL	0.0016 ± 0.0005	0.0056 ± 0.0007	0.0047 ± 0.0004	0.0013 ± 0.0002	0.0026 ± 0.0002	0.0022 ± 0.0002
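
The sketch below shows one way such a mean absolute error and its ± spread over cross-validation folds can be computed. It assumes that $MAE_{10}$ is the mean of the per-fold MAE over the 10 folds and that the ± value is the standard deviation across folds. Whether the sample or the population standard deviation was used is not stated here, so the use of `pstdev` is an assumption, as are the function names.

```python
from statistics import mean, pstdev


def mae(y_true, y_pred):
    """Mean absolute error between measured and predicted widths (in mm)."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def summarize_folds(fold_results):
    """Aggregate per-fold MAE values into (mean, standard deviation).

    fold_results: list of (y_true, y_pred) pairs, one pair per fold.
    Uses the population standard deviation (an assumption of this sketch).
    """
    fold_maes = [mae(y_true, y_pred) for y_true, y_pred in fold_results]
    return mean(fold_maes), pstdev(fold_maes)


# Two made-up folds for illustration only (not data from the study).
example_folds = [
    ([1.0, 2.0], [1.5, 2.5]),  # per-fold MAE = 0.5
    ([1.0, 2.0], [1.0, 2.0]),  # per-fold MAE = 0.0
]
mae_mean, mae_std = summarize_folds(example_folds)
print(mae_mean, mae_std)
```

A deterministic model, such as a physics-based model without random initialization, yields the same result on every repetition, which is consistent with the ±0.0 entries in the tables.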

Table 7. Model performance comparison for the experimental dataset across the 10-fold cross validation.

Model	IE $MAE_{10}^{train}$ [mm]	IE $MAE_{10}^{val}$ [mm]	IE $MAE_{10}^{total}$ [mm]	CE $MAE_{10}^{train}$ [mm]	CE $MAE_{10}^{val}$ [mm]	CE $MAE_{10}^{total}$ [mm]
Data-Based	0.0068 ± 0.0012	0.0168 ± 0.0012	0.0146 ± 0.0012	0.0082 ± 0.0024	0.0141 ± 0.003	0.0124 ± 0.0017
Physics-Based	0.0805 ± 0.0	0.0774 ± 0.0	0.0781 ± 0.0	0.0782 ± 0.0	0.0781 ± 0.0	0.0781 ± 0.0
FL	0.0055 ± 0.0	0.0131 ± 0.0	0.0114 ± 0.0	0.0186 ± 0.0	0.0084 ± 0.0	0.0114 ± 0.0
DM	0.0014 ± 0.0012	0.0015 ± 0.0015	0.0012 ± 0.0014	0.0093 ± 0.0026	0.0154 ± 0.0023	0.0136 ± 0.0019
FT	0.0045 ± 0.0016	0.0154 ± 0.0027	0.0130 ± 0.0022	0.0069 ± 0.0024	0.0128 ± 0.0039	0.0111 ± 0.0028
PP	0.0061 ± 0.0006	0.0165 ± 0.0006	0.0142 ± 0.0005	0.0088 ± 0.0029	0.0140 ± 0.0012	0.0124 ± 0.0013
PC	0.0007 ± 0.0003	0.0198 ± 0.0088	0.0156 ± 0.0068	0.0008 ± 0.001	0.0389 ± 0.008	0.0276 ± 0.0057
DM + FL	0.0052 ± 0.0002	0.0138 ± 0.0003	0.0119 ± 0.0003	0.0110 ± 0.0036	0.0133 ± 0.0018	0.0127 ± 0.002
FT + FL	0.0051 ± 0.0004	0.0139 ± 0.0006	0.0119 ± 0.0004	0.0103 ± 0.0027	0.0132 ± 0.0014	0.0124 ± 0.0017
PP + FL	0.0059 ± 0.0016	0.0164 ± 0.0019	0.0141 ± 0.0018	0.0084 ± 0.0026	0.0128 ± 0.0015	0.0115 ± 0.0013
PC + FL	0.0032 ± 0.0015	0.0124 ± 0.0015	0.0104 ± 0.0012	0.0016 ± 0.0009	0.0151 ± 0.0012	0.0111 ± 0.001
PC + FT + FL	0.0045 ± 0.0018	0.0138 ± 0.0022	0.0117 ± 0.0015	0.0027 ± 0.0013	0.0161 ± 0.0011	0.0121 ± 0.0008

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

