*4.2. Tested Databases*

The different database-generation techniques defined in the previous section were applied to create four different training databases related to the test arrangement:

• MC20: based on MC sampling with the SOLAS probability distributions of the damage parameters;
• MCD20: based on MC sampling with a uniform probability distribution of the damage dimensions;
• MCA20: based on MC sampling with the inverse-area probability distribution;
• P8: based on the parametric generation method with *nd* = 8.

The number of damage cases *N* = 20,000 was chosen to ensure a proper convergence of the accuracy [11]. For the same reason, 20,000 damage cases were included in the MCD20 and MCA20 databases, whereas *nd* = 8 was chosen for the P8 database to obtain a similar number of damage cases and ensure a fair comparison of the methods. Figure 7 shows the resulting cumulative distribution functions of the damage dimensions, location, and *time-to-flood* of the damage cases in the four databases.

The MC-based methods provided a continuous distribution of all the parameters, whereas the parametric one led to scattered distributions except for *tf*. As expected, an almost uniform distribution of *tf* was associated with the MCA20 database. This means that many small damages located near the free surface were generated, while the number of nonsurvival scenarios was lower than in MC20. On the contrary, both the MCD20 and P8 databases contained a significantly higher percentage of nonsurvival scenarios than MC20. It is worth noticing that MC20 and MCD20 led to comparable cumulative distributions of *tf*, although they were driven by different probability distributions of the damage parameters.
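
To make the comparison concrete, the following minimal sketch illustrates how such an MC generation of damage cases can be set up. It is an illustration only: the parameter set, the bounds, and the tabulated CDF are placeholders, not the actual SOLAS or inverse-area distributions defined in the previous section.

```python
import numpy as np

rng = np.random.default_rng(42)
N_CASES = 20_000  # database size used for MC20/MCD20/MCA20

def sample_uniform_dimensions(n, ship_length=80.0):
    """MCD-style generation: uniform distributions of the damage location
    and dimensions. The ship length and all bounds are illustrative
    placeholders, except the 22.72 m maximum damage length quoted below."""
    return {
        "x_center": rng.uniform(0.0, ship_length, n),  # longitudinal location [m]
        "length": rng.uniform(0.1, 22.72, n),          # damage length [m]
        "penetration": rng.uniform(0.0, 1.0, n),       # nondimensional penetration
        "z_lower": rng.uniform(0.0, 1.0, n),           # nondimensional vertical position
    }

def sample_tabulated_cdf(n, grid, cdf):
    """MC20-style generation of a single parameter: inverse-transform
    sampling from a tabulated CDF (e.g. a digitised SOLAS damage-length
    distribution)."""
    u = rng.uniform(0.0, 1.0, n)
    return np.interp(u, cdf, grid)

cases = sample_uniform_dimensions(N_CASES)
```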

Besides the training databases, a single validation database was defined:

• MC50b: based on the SOLAS probability distributions and composed of 50,000 damage cases (8059 nonsurvival cases).

The SOLAS probability distributions were chosen for the validation database since they were considered the most representative of the collision damages that might occur in an operative environment. Since the aim is to define a methodology to be employed during a real flooding emergency, this choice was deemed appropriate.

For all the databases, the maximum simulation time was set to 2250 s.

**Figure 7.** Cumulative distribution functions of damage dimensions, location, and *time-to-flood* related to the tested database-generation methods.

#### **5. Results**

In Figure 8, the overall performances obtained by applying the different training databases are provided. It is worth noticing that the MC20 database, employing the SOLAS probability distributions, was not always associated with the best performance, despite a SOLAS-based dataset always being employed for validation.

**Figure 8.** Comparison of the performances of the tested database-generation methods. Validation: MC50b; method: RFs.

Namely, the MCD20 database provided results similar to MC20 for the *final fate* and *flooded compartments* classification problems. Besides, the MCD20 dataset was more accurate than MC20 in the prediction of *tf*. The MCA20 database showed a lower accuracy than MC20 and MCD20 for all the studied problems. However, MCA20 showed a larger region of stable *tf* forecast, although a lower maximum of *R*<sup>2</sup> and *R*<sup>2∗</sup> was reached. *R*<sup>2∗</sup> decayed below zero at about *t* = 1215 s (corresponding to 0.5% of the ongoing damage scenarios in the validation database) instead of *t* = 735 s, which was related to the MCD20 database. The parametric generation method always led to lower accuracy compared to all the methods based on MC generation. Moreover, the P8 training database led to strong accuracy instability in both classification problems.
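
As a side note on how the ongoing metric can be evaluated, the sketch below assumes that *R*<sup>2∗</sup> denotes the coefficient of determination restricted to the damage scenarios still ongoing at the elapsed time *t*<sup>∗</sup> (consistent with the percentage quoted above); the callable `predict_tf_at` is a hypothetical stand-in for the trained RFs queried with the sensor data available up to *t*<sup>∗</sup>.

```python
import numpy as np
from sklearn.metrics import r2_score

def ongoing_r2(t_star_grid, tf_true, predict_tf_at):
    """R2* curve over the elapsed time t*.

    At each t*, only the scenarios still ongoing (tf_true > t*) are
    scored; predict_tf_at(t_star) is a hypothetical callable returning
    the RF predictions of tf from the data available up to t*.
    """
    tf_true = np.asarray(tf_true)
    scores = []
    for t_star in t_star_grid:
        ongoing = tf_true > t_star
        if ongoing.sum() < 2:  # R2 is undefined on fewer than 2 samples
            scores.append(np.nan)
            continue
        tf_hat = np.asarray(predict_tf_at(t_star))
        scores.append(r2_score(tf_true[ongoing], tf_hat[ongoing]))
    return np.asarray(scores)
```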

#### **6. Discussion**

Considering the *final fate* classification problem, the adoption of the MCD20 training database had a very limited effect on the overall and ongoing accuracy. In detail, a less skewed training dataset was obtained thanks to the larger number of nonsurvival damage cases in MCD20. This reduced the type I error (i.e., when a nonsurvival scenario was classified as a survival one by the learner) by about 0.05%, while increasing the type II one (i.e., when a survival scenario was classified as a nonsurvival one by the learner) by about 0.15%. The adoption of the MCA20 training database decreased the accuracy in forecasting the *final fate* by about 2% compared to MC20/MCD20. Furthermore, a strong decay of the ongoing accuracy for long damage scenarios was also observed. The main reason can be found in the larger number of damage scenarios exceeding the maximum simulation time compared to the other databases: in MCA20, 30% of the damage scenarios had *tf* > 2250 s. The adoption of the P8 training database led to a similar behavior, although the accuracy reached an almost stable value later (at about 500 s). Analyzing the results provided by the parametric generation, it shall be noted that the generated damage cases affected only one or two adjoining compartments. On the other hand, the validation database (MC50b) was based on the SOLAS probability distributions. According to SOLAS, the maximum damage length for the test barge is 22.72 m, which exceeds the length of each watertight compartment (15 m). Hence, 7.4% of the damage scenarios in the validation database affected three adjoining compartments. The accuracy gap between MC20 and P8 for the *final fate* classification problem in the stable region was between 1% and 3%, thus lower than 7.4%. This means that many of the three-compartment damage scenarios were still correctly classified by the RFs trained with the P8 database.
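
For clarity, the two error rates can be computed from the predicted and observed final fates as sketched below; the label encoding (1 = nonsurvival, 0 = survival) and the normalization by the total number of validation cases are assumptions made for illustration, not taken from the paper.

```python
import numpy as np

def final_fate_errors(y_true, y_pred):
    """Type I / II error rates for the final-fate classification.

    Paper's convention: type I = a nonsurvival scenario classified as
    survival; type II = a survival scenario classified as nonsurvival.
    Assumed encoding: 1 = nonsurvival, 0 = survival; both rates are
    normalized by the total number of cases (an assumption).
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    n = y_true.size
    type_1 = np.sum((y_true == 1) & (y_pred == 0)) / n
    type_2 = np.sum((y_true == 0) & (y_pred == 1)) / n
    return type_1, type_2
```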

On the contrary, the classification of *flooded compartments* led to an accuracy gap of about 9–10%, which was larger than the percentage of the three-compartment damage scenarios included in the validation database. For the *flooded compartments* classification, the MCD20 and MCA20 databases had an overall accuracy 1% and 2% lower than the MC20 one, respectively. On the other hand, the ongoing accuracy of MCD20 was comparable to the MC20 one, whereas the MCA20 one was again 2% worse and showed a greater instability for large *t*<sup>∗</sup> values.
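
One possible formulation of the *flooded compartments* problem, sketched below under stated assumptions, treats it as multi-label classification with one binary flooded/intact label per watertight compartment; scikit-learn random forests accept such multi-output targets directly. The feature matrix, label encoding, and compartment count are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

# Placeholder training data: 12 sensor-derived features and one binary
# flooded/intact label per watertight compartment (5 here, arbitrarily).
X_train = rng.random((20_000, 12))
Y_train = rng.integers(0, 2, size=(20_000, 5))

# scikit-learn random forests accept a 2D label matrix directly,
# fitting a single multi-output ensemble for all compartments.
clf = RandomForestClassifier(n_estimators=100, n_jobs=-1, random_state=0)
clf.fit(X_train, Y_train)

Y_pred = clf.predict(X_train[:3])  # -> array of shape (3, 5)
```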

The *tf* regression for nonsurvival scenarios, based on the different techniques for training database generation, led to more interesting results. In Figures 9–12, the predicted–observed plots evaluated at *t*<sup>∗</sup> = 500 s and *t*<sup>∗</sup> = 1000 s are provided.

As mentioned, the SOLAS-based training database was not always the best option. The MCD20 training database led to better results since it contained more than double the nonsurvival scenarios of MC20, while the two databases had almost the same distribution of the *time-to-flood*. Due to the increased density of the training data, the RFs better forecasted *tf* up to *t*<sup>∗</sup> = 1700 s. Nevertheless, at larger *t*<sup>∗</sup> values, the ongoing accuracy still decayed. In such a region, better results could be reached by employing the MCA20 training database, which allowed a good forecast of the *time-to-flood* up to *t*<sup>∗</sup> = 2000 s. The main reason was the smaller dimensions of the generated damages, which led to a higher density of the training data in the most critical region. However, note that the model tended to overestimate *tf*, as highlighted by the clusters in the lower part of Figure 10.
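
A minimal sketch of this regression step is given below, assuming placeholder feature and target arrays in place of the actual sensor data and simulated *time-to-flood*; it reproduces only the mechanics behind the predicted–observed plots of Figures 9–12, not the paper's exact pipeline.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Placeholder arrays standing in for a training database (e.g. MCD20)
# and the MC50b validation set; only nonsurvival cases carry a finite tf.
X_train = rng.random((20_000, 12))
tf_train = rng.uniform(100.0, 2250.0, 20_000)
X_val = rng.random((8_059, 12))
tf_val = rng.uniform(100.0, 2250.0, 8_059)

model = RandomForestRegressor(n_estimators=100, n_jobs=-1, random_state=0)
model.fit(X_train, tf_train)
tf_hat = model.predict(X_val)

# Predicted-observed plot in the style of Figures 9-12.
plt.scatter(tf_val, tf_hat, s=2, alpha=0.3)
plt.plot([0, 2250], [0, 2250], "k--")  # perfect-prediction line
plt.xlabel("observed tf [s]")
plt.ylabel("predicted tf [s]")
plt.show()
```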

Another criticality associated with the MCA20 training database was its poor capacity to deal with large damages having short *tf*, mainly due to the limited number of nonsurvival damage scenarios. Large errors can be observed in the lower *tf* region, leading to poor values of the overall *R*<sup>2</sup> (never greater than 0.55). Regarding the parametric generation method, it was not capable of ensuring good results for the *tf* regression problem for the test geometry.

**Figure 9.** Predicted over observed values of *tf* computed at *t*<sup>∗</sup> = 500 s and *t*<sup>∗</sup> = 1000 s. Training: MC20; validation: MC50b; method: RFs.

**Figure 10.** Predicted over observed values of the *time-to-flood* computed at *t*<sup>∗</sup> = 500 s and *t*<sup>∗</sup> = 1000 s. Training: MCA20; validation: MC50b; method: RFs.

**Figure 11.** Predicted over observed values of the *time-to-flood* computed at *t*<sup>∗</sup> = 500 s and *t*<sup>∗</sup> = 1000 s. Training: MCD20; validation: MC50b; method: RFs.

**Figure 12.** Predicted over observed values of the *time-to-flood* computed at *t*<sup>∗</sup> = 500 s and *t*<sup>∗</sup> = 1000 s. Training: P8; validation: MC50b; method: RFs.

#### **7. Conclusions**

The paper explored multiple options for training database generation for damage consequences' assessment. Four generation algorithms were tested on a box-shaped barge, demonstrating that the prediction accuracy can be heavily affected by the database-generation method. An interesting result was that the training database generation based on the SOLAS probability distributions was not always the best option for the test geometry. Although the validation database was also based on the SOLAS probability distributions, different distributions applied to the damage parameters driving the MC sampling led to equal or better results.

Among the studied problems, the *tf* regression was the most affected by the training database. Namely, the application of the uniform probability distribution of the damage dimensions or of the inverse-area distribution significantly improved the performances for short and large *t*<sup>∗</sup> values, respectively. For the two classification problems, the uniform probability distribution of the damage dimensions led to slightly better results compared to the SOLAS one.

The parametric generation method always showed poorer performances compared to the other tested options. However, this gap was mainly due to the assumption, valid for the parametrically generated database, of a maximum of two adjoining compartments involved in a damage scenario. Moreover, the parametric method showed quite good resilience in the *final fate* classification problem and led to a large percentage of long nonsurvival damage cases, thus to a less skewed training database. Hence, it can be concluded that the parametric generation is worthy of further investigation to assess its real effectiveness.

Future research should also focus on more complex geometries (such as a full-scale passenger vessel), the effect of the ship loading condition, waves, and the internal openings' type and status (open/closed). All these issues shall be properly considered during the training database generation to move towards a real application of flooding-sensor-agnostic DSSs in an operative environment. Nevertheless, the outcomes of the present study can help naval architects in addressing the issues that might affect the result of a flooding-sensor-agnostic DSS due to the adopted database-generation technique.

**Author Contributions:** Conceptualization, L.B., J.P.-O., and M.V.; methodology, L.B., J.P.-O. and M.V.; software, L.B.; validation, L.B. and M.V.; formal analysis, L.B. and M.V.; investigation, L.B.; resources, L.B.; data curation, L.B.; writing—original draft preparation, L.B.; writing—review and editing, J.P.-O. and M.V.; visualization, L.B., M.V. and J.P.-O.; supervision, J.P.-O. and M.V.; project administration, J.P.-O.; funding acquisition, J.P.-O. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was fully supported by the Croatian Science Foundation under the project IP-2018-01-3739.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data are contained within the article.

**Acknowledgments:** This work was also supported by the University of Rijeka (Project Nos. uniri-tehnic-18-18 1146 and uniri-tehnic-18-266 6469).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

