*2.2. Sample Collection and Analytical Procedures*

The data used in this study resulted from several surveys carried out throughout the country over a time span of 13 years (2006–2019). Sampling sites within different administrative regions were chosen, taking into account the number of people served and the accessibility (Figure 1). The water sources comprised hand-dug shallow wells (*n* = 216) and boreholes (*n* = 35), fitted with a bucket or a pump (manual, solar, or electric) to collect water (Figure S1). Out of the 252 water sources surveyed, 47 were examined in the dry season, whereas 83 were studied in the wet season. The remaining 122 water sources were assessed in both seasons. Moreover, throughout the study period, each sampling location was surveyed between 1 and 17 times.

**Figure 1.** Location of surveyed water sources across Guinea-Bissau (source of second-level administrative divisions of Guinea-Bissau: Hjmans et al. [16]).

Water samples were collected using 500 mL plastic sterile flasks. All samples were kept in the dark in refrigerated ice chests and processed within 4 h of collection at a field laboratory similar to the one described in [13]. Water temperature, conductivity, dissolved oxygen, oxygen saturation, and pH were measured *in situ* using a Hanna Instruments 9828 portable meter. The exact position of each water source was obtained by means of GPS (Magellan 600).

Monthly precipitation data were extracted from the historical GHCN gridded V2 dataset provided by NOAA/OAR/ESRL PSD, available on their website (http://www.esrl.noaa.gov/psd/, accessed on 4 June 2020). Resolution was 2.5 degrees for the available grid of pixels covering the Guinea-Bissau country area (11–13.5° N, 18–14° W).

Samples for water colour, nitrate, nitrite, ammonium, aluminium, arsenic, copper, chromium, cyanide, and iron were assayed in a 12 V multiparameter Hanna HI83200 photometer, according to standard methods supplied by the manufacturer (www.hannacom.pt, accessed on 1 June 2019). A Hanna HI-93102 Multi Range Portable Turbidity Meter for water analysis was used for turbidity assessment.

Samples for faecal indicator evaluation were filtered onto sterile gridded cellulose nitrate membranes (0.45 µm pore size, 47 mm diameter, Whatman, Maidstone, UK), and placed on mFC-agar (Difco, Le Pont de Claix, France) and Slanetz–Bartley agar (Oxoid, Hants, UK) plates, for faecal coliforms (FC) and intestinal enterococci (IE) enumeration, respectively. Incubation was performed at 44.5 °C for 24 h (FC) or 48 h (IE) [17] using solar-generated electricity in the absence of an electrical supply grid. Typical colonies were counted and results expressed as colony-forming units (CFU)/100 mL.

Guinea-Bissau does not have guidelines concerning drinking water quality; therefore, the parameters assayed were compared with the WHO [18], EU [19] (1998), and UK [20] guidelines to establish whether the quality of the water was fit for human consumption.

*2.3. Statistical Analysis*

Faecal indicator concentrations were Log (*n* + 1) transformed prior to analysis. Spearman's rank correlation coefficient was used to assess the relationship between environmental factors and microbiological indicators. Statistically significant spatial and seasonal differences among samples were evaluated through analysis of variance (one-way ANOVA), followed by a post hoc Tukey honestly significant difference (HSD) multi-comparison test. The significance level used for all tests was 0.05.
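As an illustration of the first two steps, the following minimal sketch applies a Log (*n* + 1) transform and computes a Spearman rank correlation using only the Python standard library. It is not the code used in the study: the logarithm base (10), the function names, and the example values are all assumptions made for illustration.

```python
import math


def log_transform(counts):
    """Return log10(n + 1) for each count; the +1 keeps zero counts defined.

    Base 10 is an assumption; the text does not state the base.
    """
    return [math.log10(n + 1) for n in counts]


def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical example values (not data from the study).
fc_counts = [0, 12, 150, 3, 2400]       # CFU/100 mL
turbidity = [0.5, 2.1, 8.4, 1.0, 15.2]  # arbitrary units

fc_log = log_transform(fc_counts)
rho = spearman(turbidity, fc_log)
```

Because the log transform is monotonic, it leaves the ranks, and therefore Spearman's coefficient, unchanged; the transform matters for the ANOVA and the regression models rather than for the rank correlation.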

Boosted regression trees (BRTs) were used to assess the relationship between environmental factors and microbiological indicators in the hand-dug wells only, because these represented the primary water source in Guinea-Bissau and revealed the highest contamination levels. BRTs are tree-based ensemble methods that combine the algorithms of regression trees and boosting (which builds and combines a collection of models). The method works by iteratively fitting simple tree models using a forward stage-wise procedure, which progressively fits trees to the residuals of the previously fitted trees [21,22]. Some of the wells were surveyed several times; therefore, one sample per season (wet and dry) was selected at each location. Data were selected from the years in which the most samples were collected (2009 and 2010), and, when not available, from the closest years. When more than one sample was collected per season and year, the sample collected in the month closest to the middle of the season was chosen. Two BRT models were built, one for FC and the other for IE, using a Gaussian error distribution. Models were built in R software, version 4.0.5 [23], using the packages "dismo" and "gbm" [22,24]. Combinations of several settings were fitted before finding the optimal final setting: tree complexity (tc) of 1, learning rate of 0.001, bag fraction of 0.5, and 10-fold cross-validation. The full models were then simplified by removing non-informative variables based on the decrease in variance. Final models were chosen based on their statistical performance, evaluated by the explained cross-validated deviance, i.e., the cross-validated correlation between training and testing data.
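The forward stage-wise idea described above can be sketched in a few lines. The example below is a simplified illustration in plain Python, not the "gbm" or "dismo" implementation used in the study: it fits single-split trees (tree complexity of 1) to the current residuals under a squared-error (Gaussian) loss, using a random half of the data at each step (bag fraction of 0.5). All function and parameter names are hypothetical, the data are synthetic, and the learning rate is set to 0.1 rather than 0.001 so that the toy example converges with few trees.

```python
import random


def fit_stump(rows, residuals):
    """Find the single split (feature, threshold) that minimises squared error."""
    best = None
    for f in range(len(rows[0])):
        values = sorted(set(row[f] for row in rows))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(rows, residuals) if row[f] <= thr]
            right = [r for row, r in zip(rows, residuals) if row[f] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, f, thr, lm, rm)
    if best is None:  # no feature varies: fall back to a constant prediction
        mean = sum(residuals) / len(residuals)
        return (0, float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, row):
    f, thr, left_mean, right_mean = stump
    return left_mean if row[f] <= thr else right_mean


def fit_brt(rows, ys, n_trees=200, learning_rate=0.1, bag_fraction=0.5, seed=0):
    """Stage-wise boosting: each stump is fitted to the current residuals."""
    rng = random.Random(seed)
    intercept = sum(ys) / len(ys)
    preds = [intercept] * len(ys)
    stumps = []
    bag_size = max(2, int(bag_fraction * len(ys)))
    for _ in range(n_trees):
        # Stochastic step: fit on a random subsample (the "bag").
        idx = rng.sample(range(len(ys)), bag_size)
        res = [ys[i] - preds[i] for i in idx]
        stump = fit_stump([rows[i] for i in idx], res)
        stumps.append(stump)
        # Shrink each tree's contribution by the learning rate.
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, rows)]
    return intercept, learning_rate, stumps


def predict_brt(model, row):
    intercept, learning_rate, stumps = model
    return intercept + learning_rate * sum(predict_stump(s, row) for s in stumps)


# Synthetic example: one predictor with a step-shaped response.
rows = [[float(i)] for i in range(20)]
ys = [0.0 if i < 10 else 1.0 for i in range(20)]

model = fit_brt(rows, ys)
mean_y = sum(ys) / len(ys)
baseline_mse = sum((y - mean_y) ** 2 for y in ys) / len(ys)
model_mse = sum((y - predict_brt(model, r)) ** 2 for r, y in zip(rows, ys)) / len(ys)
```

A smaller learning rate, such as the 0.001 reported above, shrinks each tree's contribution further and therefore requires many more trees; the number of trees is typically chosen by cross-validation rather than fixed in advance as in this sketch.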
