3.4.2. Parameter Optimization

The parameter variations for the two statistical model frameworks, RF and ANN, are listed in Tables 5 and 6, respectively. Each combination of parameters represents one model type. The data cleaning strategies and variable batches were included as additional, framework-independent parameters. Including the cleaning strategies as a parameter makes it possible to determine which strategy achieves the best trade-off between accuracy and data omission. Including the variable batches as a parameter makes it possible to identify the model that attains a relatively high accuracy with the fewest input variables. This follows the principle of model parsimony: among a set of models with nearly the same performance, the simplest one should be selected.
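The enumeration of model types and the parsimony rule can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the parameter names and values in `grid`, the result fields `accuracy` and `n_inputs`, and the `tolerance` threshold are all hypothetical placeholders, and the actual parameter values are those given in Tables 5 and 6.

```python
from itertools import product

# Hypothetical grid: names and values are placeholders for illustration only,
# not the parameter values of Tables 5 and 6.
grid = {
    "cleaning_strategy": ["A", "B"],
    "variable_batch": [1, 2, 3],
    "n_trees": [100, 500],
}


def enumerate_model_types(grid):
    """Return one dict per parameter combination (one model type each)."""
    names = list(grid)
    value_lists = [grid[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


def select_parsimonious(results, tolerance=0.01):
    """Pick the simplest model among those close to the best accuracy.

    `results` is a list of dicts with the assumed keys "accuracy" and
    "n_inputs". Models whose accuracy lies within `tolerance` of the best
    are candidates; the one with the fewest inputs wins, and ties are
    broken by higher accuracy.
    """
    best = max(r["accuracy"] for r in results)
    candidates = [r for r in results if best - r["accuracy"] <= tolerance]
    return min(candidates, key=lambda r: (r["n_inputs"], -r["accuracy"]))


model_types = enumerate_model_types(grid)
print(len(model_types))  # 2 * 3 * 2 combinations
```

Treating the cleaning strategy and the variable batch as ordinary grid dimensions means every framework-specific setting is evaluated under every cleaning strategy and input set. The choice of `tolerance` decides what counts as "nearly the same performance" and would have to be set by the analyst.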


**Table 5.** Parameter combinations used for the RF models. Values are separated by commas. *m* is the number of input variables. Each combination of parameters represents one model type.


**Table 6.** Parameter combinations used for the ANN models. Values are separated by commas. The topologies (z) and (z,z) indicate one and two layers, respectively, with z nodes in each layer. Each combination of parameters represents one model type.
