**6. Discussion**

A suitable monitoring scheme requires an adequate model of the normal operating conditions (NOC) of the process. One way to construct such an NOC model is through data-driven methods based on historical records of normal operation. However, when a new product enters the production line, such historical information is not available. To overcome this situation, one can simply run the process without monitoring until sufficient data is acquired. However, this approach can lead to inadequate models, since the data collected over a short period of time might not represent the overall variability of the process. This is particularly relevant when the process has non-stationary characteristics, such as those caused by distinct production lots. As an alternative to current data-driven modelling approaches, we propose to generate artificial data based on accumulated process knowledge and then build a monitoring scheme using the augmented NOC dataset, enriched with information about inter-lot variation sources.

In this work, the proposed AGV module aims to simulate the structural components of the common cause variability of a solder paste printing process, accounting for (i) translation and rotation effects on offset-X and offset-Y, (ii) squeegee effects on offset-Y and height, and (iii) solder mask effects on height. Furthermore, these effects are structured into multiple components in order to accommodate inter-lot (common to all PCBs in a lot), intra-lot (common to all pads in a PCB) and pad-specific (unique to each pad) variability. The amount of variability attributed to each component, as well as the magnitude of the effects, is established by a set of tuning parameters defined by the user.
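
The three-level structure described above can be illustrated with a minimal sketch. It is not the AGV module itself: the function name `simulate_offsets`, the use of independent zero-mean Gaussian components, and the standard deviations `sd_lot`, `sd_pcb` and `sd_pad` (standing in for tuning parameters) are all assumptions made for illustration, and the translation, rotation, squeegee and solder mask effects are not modelled.

```python
import numpy as np

def simulate_offsets(n_lots, n_pcbs, n_pads, sd_lot, sd_pcb, sd_pad, seed=0):
    """Simulate one measurement (e.g., an offset) per pad.

    Hypothetical additive model: value = lot effect + PCB effect + pad effect.
    Returns an array of shape (n_lots, n_pcbs, n_pads).
    """
    rng = np.random.default_rng(seed)
    # Inter-lot component: one draw per lot, shared by all PCBs in that lot.
    lot = rng.normal(0.0, sd_lot, size=(n_lots, 1, 1))
    # Intra-lot component: one draw per PCB, shared by all pads on that PCB.
    pcb = rng.normal(0.0, sd_pcb, size=(n_lots, n_pcbs, 1))
    # Pad-specific component: one independent draw per pad.
    pad = rng.normal(0.0, sd_pad, size=(n_lots, n_pcbs, n_pads))
    # Broadcasting adds the shared components to every pad they apply to.
    return lot + pcb + pad
```

Increasing `sd_lot` relative to the other two standard deviations produces more clearly separated lots, which is the kind of trade-off the tuning parameters are meant to control.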

Throughout this work, it was observed that the tuning parameters can define a large variety of operating conditions. While this gives a high degree of flexibility to the AGV module, it also implies that selecting the best set of tuning parameters is critical to achieving realistic simulations. In this case, we had access to historical data, and the tuning parameters were chosen so that the simulated data resemble the real data of the product under study. For a new product with distinct characteristics, the best combination of tuning parameters might be different. We expect the tuning parameters to be related to external factors, such as operators and equipment operation, and therefore they should not vary much from product to product. Nevertheless, further analysis is required to confirm this conjecture and the applicability of similar tuning parameters to PCBs with different layouts.

Using the PCA model built on the simulated data, it was verified that the simulated data was distributed in clusters related to the simulated lots. This behavior was also observed for the real data. The real data was dispersed over the same range of values as the simulated data, which supports the reliability of the AGV module as a realistic data augmentation tool. The loadings of the PCA model trained on the simulated data were also shown to be related to known phenomena of the printing process. Therefore, the PCA model can provide very useful insights into the operating conditions of each PCB, as well as support fault diagnosis. For instance, the PCA model can be used to identify distinct lots and the printing direction.

In this study, the number of simulated lots was selected to give a reasonable representation of the process variability while also allowing for an easy visualization of the results. For a more accurate representation of the process, the number of lots should be selected as the point where the PCA parameters and control limits stabilize. Throughout this work, several realizations of 20 lots were simulated, and it was confirmed that they led to consistent PCA models and control limits for the monitoring statistics.
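
One simple way to make the stabilization criterion operational is sketched below. The source does not prescribe a specific rule, so the function `first_stable_index`, the relative tolerance `rel_tol` and the `window` of consecutive checks are hypothetical choices. The input is a sequence of nonzero values of any monitored quantity (for example, a control limit) recomputed as lots are added.

```python
def first_stable_index(values, rel_tol=0.05, window=3):
    """Return the first index i such that the next `window` successive
    relative changes, starting from values[i], are all below `rel_tol`.

    Returns None if no such index exists. Assumes all values are nonzero.
    """
    # Relative change between each pair of consecutive values.
    changes = [abs(b - a) / abs(a) for a, b in zip(values, values[1:])]
    for i in range(len(changes) - window + 1):
        if all(c < rel_tol for c in changes[i:i + window]):
            return i
    return None
```

If the values were computed for 2, 4, 6, ... lots, the returned index maps back to the corresponding number of lots. The same check could be applied to the PCA parameters, such as the variance explained by each component.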

Following the good results in terms of process modelling, the PCA model was used for process monitoring. In this regard, the monitoring scheme was shown to solve the excessive false alarm problem of current methodologies, at the cost of low sensitivity to faults located on a relatively small number of SPDs. The reason for this is twofold. Firstly, the PCA model was built to account for the variability across multiple lots. However, for a given lot, the PCBs are likely to vary within a localized region of the principal component subspace. Therefore, to detect abnormal PCBs within a lot, their measurements must differ not only from the typical measurements of their own lot, but also from the typical measurements of other lots. Secondly, deviations from the PCA model can be masked when the squared residuals are summed over a very large number of variables. To overcome this limitation, in analogy to the cumulative sum (CUSUM) control chart [6,11], we propose to screen for the relevant residuals (i.e., those above a given allowance threshold) and then build a truncated *Q*-statistic using only the residuals that were found relevant. This approach is already under development, and preliminary results show that it reduces the number of variables/residuals included in the *Q*-statistic and therefore puts more focus on the most critical deviations.
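
The screening idea can be sketched as follows. This is only an illustration under stated assumptions, not the method under development: the function names `q_statistic` and `truncated_q`, the comparison of absolute residuals against a single `allowance`, and the premise that residuals have been scaled so that one threshold is meaningful for all variables are all hypothetical.

```python
import numpy as np

def q_statistic(residuals):
    """Standard Q-statistic: sum of squared residuals over all variables."""
    r = np.asarray(residuals, dtype=float)
    return float(np.sum(r ** 2))

def truncated_q(residuals, allowance):
    """Hypothetical truncated Q-statistic.

    Keeps only residuals whose absolute value exceeds `allowance` and sums
    their squares. Returns the statistic and the number of residuals kept.
    """
    r = np.asarray(residuals, dtype=float)
    relevant = np.abs(r) > allowance  # screening step
    return float(np.sum(r[relevant] ** 2)), int(relevant.sum())
```

With many variables, the small residuals that dominate the full sum are discarded, so a few large deviations account for the whole statistic. Because the truncated statistic no longer follows the distribution of the usual *Q*-statistic, its control limit would presumably need to be re-estimated, for instance empirically from NOC data.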

Although none of the case studies led to alarms, fault diagnosis was run for PCBs with atypical monitoring statistics. For these cases, the contribution plots were able to identify the abnormal measurements as well as the affected pads. Therefore, the use of simulated data can provide useful information for fault diagnosis. These results also support the development of an alternative monitoring statistic to enhance the detection of deviations in the residual subspace.

Finally, it is noted that real data can also be included in the PCA model as the process progresses. Another improvement that can be made as soon as real data is available is to replace the correction constants for the volume (**f**, Equation (20)) by those computed from real data. Consequently, the relationships between volume and the product of area and height will be closer to reality. Nevertheless, even without the use of real data, the current study already demonstrates the usefulness of the AGV module for process monitoring when historical data is not available.
