*2.4. Microorganisms*

The inoculum used was a mixed culture of acetic acid bacteria (AAB) obtained from bioreactors previously operating in repeated acetification cycles with the same type of wine as the substrate. This is the usual mode of operation for the vinegar industry.

#### *2.5. Analytical Methods*

The only variable not determined automatically was acidity, which was measured by acid—base titration with an NaOH solution approximately 0.5 N that was previously standardized with potassium hydrogen phthalate. The volume and ethanol concentration were measured in a continuous manner by using an EJA 110 differential pressure probe from Yokogawa Electric Corp. (Tokyo, Japan) and an Alkosens probe equipped with an Acetomat transducer from Heinrich Frings, respectively.

#### *2.6. Mathematical Methods*

#### 2.6.1. Polynomial Models and Fitting Methods

The models used here were based on second-order non-linear polynomial equations of the following type:

$$Y = b\_0 + \sum\_{i=1}^{n} b\_{i^\*} X\_i + \sum\_{\substack{i=1 \\ i$$

where *Y* denotes the dependent variable, *b*<sup>0</sup> the independent regression term, *bi* the coefficients of the linear (first-order) terms, *bij* those of non-linear (interaction) terms, *bii* those of quadratic terms, *Xi* the independent variables, *Xj* the independent variables with *i* < *j*, and *n* the number of independent variables. The *Y* variable is the response or output of the polynomial model and *X* variables are the inputs.

The equations were fitted by using one of several specific methods along with experimental data to calculate the previous coefficients. Such methods allow identifying those terms in Equation (1) that are significant and those that are not. In practice, they solve a least-squares problem by minimizing the sum of squares of the residuals (prediction errors). Second-order polynomial models can be fitted by using several methods [38–43]. In this work, the Forward Stepwise Regression method has been used, which successively incorporates one by one those independent variables that contribute to predicting a dependent variable from the most to the least predictive. The sequence in which terms are added to or removed from a polynomial in each step of the process is established according to the F-to-enter, F-to-remove (reference values set by the modeler for the F-values associated to each term of the model), R and R2 criteria. Then, the F-statistics also give a measure of the model sensitivity to each term of the polynomial relating the input variables.

#### 2.6.2. ANalysis Of VAriance (ANOVA)

Before the model equations were fitted, the experimental data to be used for this purpose were checked for statistical differences by analysis of variance (ANOVA), which compares the means for two or more data samples in terms of their variances. ANOVA's null hypothesis is that all means are identical and the alternative hypothesis that at least one mean will be different from all others [36]. ANOVA is used to identify sources of differences among data samples and to assess whether the differences among sample media are too large to be assigned to random errors alone [44].

#### **3. Results and Discussion**

#### *3.1. Experimental Design*

The experimental design that was initially intended to use was a Box-Behnken or Doehlert central composite factorial design. With two levels per factor, the total number of experiments needed was 2*<sup>n</sup>* + 2*n* + 1, where the first term represents the overall factorial design, the second the central points of the faces, and the third the central point. The factors (operational variables) used were the ethanol concentration at the loading stage in the first bioreactor (*El*1), the ethanol concentration at unloading time in the first bioreactor (*Eu*1), the volume unloaded from the first bioreactor into the second one (*Vu*1), the working temperature in the first bioreactor (*T*1), the ethanol concentration at loading time in the second bioreactor (*El*2), and the working temperature in the second bioreactor (*T*2). A total of 77 different experiments were in theory needed for to model these operational variables. In practice, however, this would have involved too many runs—probably more than needed to fit the model. Also, each experiment would have to be replicated many times to obtain reproducible results and lag phases would be needed each time the operational variables were modified to investigate a new case.

It was therefore necessary to reduce the number of experiments without sacrificing experimental information. This was accomplished by using a fractional factorial design requiring only 2*n*−*<sup>p</sup>* experiments, where *n* is the total number of variables and *p* the number of those variables obtained from some interactions of the others. Thus, considering *p* = 2 (variables *T*<sup>1</sup> and *T*2) and design generators *I*<sup>1</sup> = *Vu*1·*Eu*1·*El*<sup>1</sup> and *I*<sup>2</sup> = *Vu*1·*Eu*1·*El*<sup>2</sup> for a 1/4 fraction design, the experimental setup is shown in Table 1. Normalized values −1 and +1 have been used for *Vu*1, *Eu*1, *El*1, *El*<sup>2</sup> and columns for *T*1, *T*<sup>2</sup> have been obtained using *I*1, *I*<sup>2</sup> generators, respectively. For example, normalized values for *T*<sup>1</sup> were the product of normalized values of *Vu*1, *Eu*<sup>1</sup> and *El*1; similarly, normalized values for *T*<sup>2</sup> were the product of normalized values of *Vu*1, *Eu*<sup>1</sup> and *El*2.

**Table 1.** Normalized values of the operational variables for the proposed 26−<sup>2</sup> fractional factorial experimental design.


It was thought essential to use more than one central experiment (level 0 of all operational variables) at different times in order to examine the variance of the response variables. In this work, we used two central experiments (see Table 2). We would also use far-spaced experiments (levels −2 and +2) symmetrically placed at a distance α from center, α being the fourth root of the number of fractional factorial design experiments excluding central points (i.e., <sup>α</sup> <sup>=</sup> <sup>√</sup><sup>4</sup> 16 = 2). Therefore, the distance of levels (−2) and (+2) from level (0) was twice that from (0) to (−1) and (+1). This required adding two experiments per operational variable (see Table 3). In summary, a total of 30 experiments would be required with the proposed design (i.e., less than one-half those required by a pure factorial design), with the added advantage that the ranges of operating conditions would be extended to far extreme values.

**Table 2.** Normalized values of the operational variables for the central points of the proposed 26−<sup>2</sup> experimental design.



**Table 3.** Normalized values of the operational variables for the extended points of the proposed 26−<sup>2</sup> experimental design.

Because of the serial operation of the two bioreactors, several specific constraints must be considered for previous variables, since they cannot take certain values in practice:

Constraints on *El*<sup>1</sup> :


$$(E\_{l1})\_{\max} = \frac{(8 - V\_{u1}) \cdot E\_{u1} + V\_{u1} \cdot 12}{8},\tag{2}$$

Constraints on *Eu*1:


```
Constraints on Vu1:
```
This variable must range from 1 to 7 L if the self-aspirating turbine with which the bioreactors were equipped is to operate properly. The turbine was used to feed oxygen to the microorganisms and help homogenize the culture medium.

Constraints on *T*<sup>1</sup> and *T*2:

The values of these variables must be compatible with the activity of acetic acid bacteria (AAB), which are the microorganisms affecting the process. The optimum temperature for AAB to oxidize ethanol into acetic acid is 25–35 ◦C. Because AAB activity drops outside this range [46,47], temperatures from 26 to 34 ◦C have been used here.

Constraints on *El*2:

1. The ethanol concentration in the feed wine is typically 12% (*v*/*v*).

2. *El*<sup>2</sup> must be less than the maximum ethanol concentration in the medium of the second bioreactor at any time. Such a concentration depends on *Eu*1, *Vu*<sup>1</sup> and the ethanol concentration of feed wine, and can be easily obtained from a simple mass balance:

$$(E\_{l2})\_{\max} = \frac{V\_{u1} \cdot E\_{u1} + (8 - V\_{u1}) \cdot 12}{8},\tag{3}$$

Based on the previous constraints, *Eu*1, *Vu*1, *El*<sup>1</sup> and *El*<sup>2</sup> are mutually related; also, feasible *Eu*<sup>1</sup> and *Vu*<sup>1</sup> values span the range 1–5% (*v*/*v*) and 1–7 L, respectively. However, not all potential combinations of the values of the operational variables can be used in practice. Thus, the initial volume of the second bioreactor—unloaded from the first—*Vu*1, together with *Eu*1, imposes some additional constraints on *El*<sup>1</sup> and *El*2. As a result, the experimental design is subjected to the following constraints:


Based on the foregoing, the constraints on *El*<sup>1</sup> and *El*<sup>2</sup> can be summarized as follows: Constraints on *El*1:


Constraints on *El*2:


A systematic analysis of all possible combinations of the values of the operational variables with provision for the above-described constraints was done by using a self-developed script in MATLAB [48] (see MATLAB file "Get\_feasible\_combinations.m" in Supplementary Materials). Each combination corresponded to a complete experimental design, as depicted in Tables 1–3. The software used different values for level (0) of each variable within a preset range in combination with different values for levels (−2) and (+2)—levels (−1) and (+1) were calculated automatically in each case. The specific ranges considered for level (0) were 1–5% (*v*/*v*) for *Eu*1, *El*<sup>1</sup> and *El*2; and 1–7 L for *Vu*1, all in 0.5 steps. A 0.5 step over the range 0.5–5 was used for iteration of levels (−2) and (+2) in each case.

The script initially identified 148611 feasible combinations. Such a large number required additional constraints to be imposed based on the following practical arguments:

A1. There must be a simultaneous maximum difference between levels (−1) and (+1) in *Eu*<sup>1</sup> and *Vu*<sup>1</sup> (see Table 1). This constraint expanded the feasible range of operational conditions and allowed the influence of each variable and mutual relations to be more accurately detected and examined as a result.

A2. Because in the semi-continuous operation mode it is a common practice to unload at least half of the working volume from the first bioreactor, *Vu*<sup>1</sup> level (−1) must be greater than or equal to 4 L.

A3. Because the initial ethanol concentration in the second bioreactor must not be too high in order to facilitate appropriate performance (see constraint 2 on *Eu*1), *Eu*<sup>1</sup> level (+2) must be less than 5% (*v*/*v*).

Introducing these additional constraints in the MATLAB script reduced the number of feasible combinations to 310 (see tab "Pass 1" in Excel file "Results.xlsx" in Supplementary Materials). Subsequently, in line with the previous argument A1, the MATLAB script was used with those combinations maximizing the difference between *El*<sup>1</sup> levels (−2) and (+2), which further reduced the number of combinations to 31 (see tab "Pass 2" in Excel file "Results.xlsx" in Supplementary Materials). Finally, selecting the combinations resulting in the greatest differences between *El*<sup>2</sup> levels (−2) and (+2) reduced the number of choices to 4 (see Table 4).


**Table 4.** The last four feasible combinations of the operational variables values and the selected combination (in grey).

On the other hand, too low an *El*<sup>2</sup> value would slow down the process exceedingly at the loading stage. This fact allows the combinations involving the lowest *El*<sup>2</sup> value at level (−2) (viz., 1, 3 and 4 in Table 4) to be discarded, thus leaving a single one which fulfils all constraints imposed (viz., combination 2 in Table 4). Tables 5–7 show the set of experiments corresponding to such a combination as established according to the 26−<sup>2</sup> experimental design previously described by using an identical range for both working temperatures (*T*<sup>1</sup> and *T*2): 26–34 ◦C.

As recommended when using experimental design methodology, only the experiments in Tables 5 and 6 would initially be performed (in random order). If the results were not conclusive enough, then the experiments in Table 7 would also be needed.


**Table 5.** Experiments performed according to the proposed 26−<sup>2</sup> fractional factorial design.

**Table 6.** Experiments for the central points of the proposed 26−<sup>2</sup> fractional factorial design.


**Table 7.** Extended experiments for the proposed 26−<sup>2</sup> fractional factorial design.


### *3.2. Experimental Results*

Conducting the 18 experiments of Tables 5 and 6 (compiled in Table S2.1 in file "S2.docx" in Supplementary Materials) required changing the operating conditions between experiments, so cycles had to be repeated until steady conditions were reached with each new set of operation variables; after that, the experiments were replicated to obtain statistically valid results. Table 8 shows the total number of cycles and that of useful cycles for each experiment. As can be seen, the minimum number of cycles used to calculate the target variables was 5—some experiments required nearly 20, however. The fact that the total number of cycles differed markedly among experiments was a result of being

performed in random order. Therefore, depending on the differences in the operating conditions between consecutive experiments, either a lower or a greater number of cycles may be required for the system to become stable again as a result.


**Table 8.** Experimental cycles.

By way of example, Figure 3 shows the results in terms of ethanol, acidity, volume, and total and viable cells in the first reactor in Experiment 1. Each result was the mean of 5 replicate cycles with its standard deviation (see Table 8). Similarly, Figure 4 shows the ethanol, acidity, and volume results for the second reactor as the means for 5 replicated runs and their standard deviations. The previous results were used to obtain the values of Table 9. The procedure used to calculate the values of non-measurable variables (e.g., mean rate of acetic acid formation, total acetic acid production and mean volume) is described in file "S1.docx" in Supplementary Materials. Figure S1.1 in such file shows the typical time course of ethanol concentration, acidity and volume in the first bioreactor and Figures S1.2–S1.3 show the time course of such variables in the second bioreactor when a fast loading or pre-depletion step exists, respectively. Also, the results obtained in the 18 experiments of Tables 5 and 6 are compiled in file "S2.docx" in Supplementary Materials (Figures S2.1 to S2.36 and Tables S2.2 to S2.19). On the other hand, Tables 10 and 11 show the experimental values of the key variables intended to be modelled using polynomial models relating them to the operational variables.

## *3.3. Obtained Polynomial Models*

Below are stated the polynomial models (or response surfaces) corresponding to the target variables. By way of example, the specific procedure used to estimate the mean overall rate of acetic acid formation in the proposed two-bioreactor system, (*rA*)*global est*, is detailed below, and that for each of the other variables in file "S3.docx" in the Supplementary Materials.

### 3.3.1. Mean Overall Rate of Acetic Acid Formation

ANOVA exposed the presence of significant differences at a 99.9% confidence level in the mean overall rate of acetic acid formation, (*rA*)*global*, among experiments. In the absence of such differences, (*rA*)*global* would be independent of the operational variables due to experimental error. Therefore, the 18 experiments of Tables 5 and 6 were representative enough of the operating conditions, so the other 12 potentially required (experiments in Table 7) were unnecessary. As shown below, the ANOVA led to identical conclusions for all target variables. Accordingly, the experimental (*rA*)*global* data in Table 10 were fitted to a second-order polynomial by Forward Stepwise Regression, as described in Section 2.6.1, using the software SigmaStat 2.0 [49]. Although 27 terms (viz., the 6 individual operational variables and their 21 possible interactions) were initially considered, the maximum possible number was that of the experiments performed: 18.

Each fitted polynomial was checked for significant differences in the regressions between each step and the previous one by the effect of the addition or removal of terms. ANOVA revealed the presence of significant differences at a 95% confidence level in all cases and, therefore, the results are worth no additional detailed description here.

The fitting steps using the Forward Stepwise Regression method (Section 2.6.1) are detailed in Section S3.1 in file "S3.docx" in Supplementary Materials (Tables S3.1 to S3.14 show intermediate results). The final model was that of Equation (4) and allowed the mean overall rate of acetic acid formation to be estimated with an error of 0.01 g acetic acid·(100 mL·h)<sup>−</sup>1. Such a model, however, only held over the operating ranges shown in Tables 5 and 6 for each variable.

$$\begin{array}{llll} \left(r\_A\right)\_{global\text{ set}} = & -0.51 + 0.43 \cdot E\_{u1} - 0.0654 \cdot E\_{u1}^2 \\ & - 0.00456 \cdot E\_{l2} \cdot V\_{u1} - 0.00468 \cdot E\_{u1} \cdot E\_{l1} \\ & + 0.000672 \cdot T\_1 \cdot E\_{l1} + 0.000839 \cdot E\_{l2} \cdot T\_1 \end{array} \tag{4}$$

Based on statistical significance, not all operational variables in Equation (4) had a direct influence on (*rA*)*global* (in fact, only *Eu*<sup>1</sup> had a direct impact) and this variable was independent of *T*2. Also, only some of the interaction terms influenced (*rA*)*global*.

**Figure 3.** Time course of the mean ethanol concentration, acidity and volume in the first reactor in Experiment 1, and corresponding standard deviations.

**Figure 4.** The time course of the mean ethanol concentration, acidity, and volume in the second reactor in Experiment 1, and corresponding standard deviations.



**Table 10.** Experimental values of selected variables: (*rA*)*global* is the mean overall rate of acetic acid formation in the two-bioreactor system, *Pm* the total acetic acid production in the two-bioreactor system, *Eu*<sup>2</sup> the ethanol concentration at the time the second bioreactor was unloaded, *Vu*<sup>2</sup> the volume unloaded from the second reactor, *tcycle* the duration of a cycle, and *Vm* the mean overall volume in the two-bioreactor system during a cycle.


**Table 11.** Experimental values of selected variables. *EtOHm*<sup>1</sup> is the mean ethanol concentration in the first bioreactor during a cycle, *EtOHm*<sup>2</sup> the mean ethanol concentration in the second bioreactor during a cycle, *HAcm*<sup>1</sup> the mean acetic acid concentration in the first bioreactor during a cycle, and *HAcm*<sup>2</sup> the mean acetic acid concentration in the second bioreactor during a cycle.


A comparison of the (*rA*)*global* predictions with the experimental values (Figure 5, which is the same as Figure S3.1 in in file "S3.docx" in Supplementary Materials) reveals that the two coincided exactly in 10 of the 18 experiments and differed only very slightly in the other 8. Also, all predictions fell within the 95% prediction interval.

**Figure 5.** Plot of (*rA*)*global est* against (*rA*)*global*, and curves for the 95% confidence and prediction intervals.

As can be seen from Figure 6 (same as Figure S3.2 in file "S3.docx" in Supplementary Materials), the residuals (viz., the differences between experimental and predicted values) had a zero or near-zero mean in most experiments and were normally distributed in all.

**Figure 6.** Residuals of the fitting of (*rA*)*global est* for each experiment.

The process used with the other estimated variables is described in file "S3.docx" in Supplementary Materials. Their polynomial models are discussed below.

### 3.3.2. Total Acetic Acid Production

This variable, *P*m, was also estimated from the experimental data of Table 10. ANOVA revealed statistically significant differences at a 99.9% confidence level between experiments and, hence, *P*<sup>m</sup> was dependent on the operational variables. Therefore, the experimental data of *P*<sup>m</sup> were fitted to a second-order polynomial by using Forward Stepwise Regression to construct the model of Equation (5), which estimated this variable with an error of 0.7 g acetic acid·h−1. The fitting steps are detailed in Section S3.2 in file "S3.docx" in Supplementary Materials (Tables S3.15 to S3.28 show intermediate results)

$$\begin{array}{rcl}P\_{\text{m.cst}} = & -243.705 + 18.324 \cdot V\_{\text{u1}1} + 72.736 \cdot E\_{l1} + 21.525 \cdot E\_{\text{u1}} \\ & - 9.708 \cdot E\_{l1}^2 - 1.102 \cdot E\_{\text{u1}} \cdot V\_{\text{u1}} - 0.534 \cdot T\_1 \cdot V\_{\text{u1}} \\ & + 0.742 \cdot T\_1 \cdot E\_{l1} - 0.416 \cdot E\_{l2} \cdot E\_{l1} + 0.175 \cdot T\_2 \cdot E\_{l1} \\ & - 0.12 \cdot T\_1 \cdot E\_{\text{u1}} - 0.399 \cdot T\_2 \cdot E\_{\text{u1}} + 0.101 \cdot T\_2 \cdot E\_{l2} \end{array} \tag{5}$$

As can be seen, *Pm est* depends directly on *Vu*1, *Eu*<sup>1</sup> and *El*1, as well as on various interaction terms. A comparison of *Pm est* and its experimental counterpart (Figure S3.3 in file "S3.docx" in Supplementary Materials) revealed the presence of scarcely significant differences. Also, the residuals of the fitting had a near-zero mean in most experiments and were normally distributed in all (see Figure S3.4 in file "S3.docx" in Supplementary Materials).

3.3.3. Final Ethanol Concentration at the Time the Second Reactor Was Unloaded

As with the previous two variables, the ANOVA on *Eu*<sup>2</sup> (Table 10) revealed the presence of statistically significant differences at a 99.9% confidence level between the predicted and experimental results. As before, the experimental results were fitted to a second-order polynomial by using Forward Stepwise Regression. The fitting steps are detailed in Section S3.3 in file "S3.docx" in Supplementary Materials (Tables S3.29 to S3.41 show intermediate results). The resulting model (Equation (6)) predicted the ethanol concentration with an error of 0.3% (*v*/*v*).

$$\begin{array}{rcl} E\_{\text{u2 est}} &=& 14.935 - 5.371 \cdot E\_{\text{u1}} + 0.988 \cdot E\_{\text{u1}}^2 - 0.0592 \cdot E\_{l1} \cdot V\_{\text{u1}} \\ &- 0.456 \cdot E\_{\text{u1}} \cdot V\_{\text{u1}} + 0.0678 \cdot E\_{\text{u1}} \cdot E\_{l1} \\ &+ 0.494 \cdot E\_{l2} \cdot E\_{\text{u1}} - 0.049 \cdot T\_{\text{l}} \cdot E\_{l2} \end{array} \tag{6}$$

As can be seen, the variable *Eu*<sup>2</sup> *est* is directly dependent on *Eu*<sup>1</sup> and on some interaction terms; however, it is independent of *T*1.

As with the previous variables, a comparison of *Eu*<sup>2</sup> *est* and its experimental counterpart (Figure S3.5 in file "S3.docx" in Supplementary Materials) confirmed that the predictions of the model were quite accurate; also, they fell within the 95% prediction interval except in one case.

Finally, as can be seen from Figure S3.6 in file "S3.docx" in Supplementary Materials, the residuals had a near-zero mean in most experiments and were normally distributed in all.

### 3.3.4. Volume of Fermentation Medium Unloaded from the Second Reactor

The ANOVA of the volume unloaded from the second reactor, *Vu*2, revealed the presence of statistically significant differences between experiment means at a 99.9% confidence level (see Table 10).

Like the previous variables, *Vu*<sup>2</sup> was fitted by Forward Stepwise Regression. The fitting steps are detailed in Section S3.4 in file "S3.docx" in Supplementary Materials (Tables S3.42 to S3.53 show intermediate results). The resulting model (Equation (7)) predicted it with an error of 0.22 L.

$$\begin{array}{llll}V\_{u2\ est} = & 4.921 + 1.399 \cdot E\_{l2} - 0.59 \cdot E\_{u1}^2 \\ & + 0.665 \cdot E\_{u1} \cdot V\_{u1} - 0.324 \cdot E\_{l2} \cdot V\_{u1} \\ & + 0.172 \cdot E\_{l2} \cdot E\_{u1} - 0.0275 \cdot T\_2 \cdot E\_{u1} \end{array} \tag{7}$$

As can be seen, *Vu*<sup>2</sup> *est* is independent of *El*<sup>1</sup> and *T*1. A comparison of *Vu*<sup>2</sup> *est* and its experimental counterpart, *Vu*2, confirmed the goodness of the predictions (see Figure S3.7 in file "S3.docx" in Supplementary Materials). In fact, all fell within the 95% prediction interval. Also, the residuals had a zero or near-zero mean in most experiments and were normally distributed in all (see Figure S3.8 in file "S3.docx" in Supplementary Materials).

#### 3.3.5. Total Cycle Duration

As in the previous cases, the ANOVA on *tcycle* revealed the presence of statistically significant differences at a 99.9% confidence level (see Table 10). Applying Forward Stepwise Regression to the experimental data provided the model of Equation (8), which predicted *tcycle* with an error of 2.5 h. The fitting steps are detailed in Section S3.5 in file "S3.docx" in Supplementary Materials (Tables S3.54 to S3.60 show intermediate results)

$$\begin{array}{llll} t\_{\text{cycle }est} &=& 518.591 - 156.652 \cdot V\_{\text{u1}} \\ &-6.474 \cdot T\_1 + 13.556 \cdot V\_{\text{u1}}^2 \\ &+ 1.076 \cdot T\_1 \cdot V\_{\text{u1}} - 0.864 \cdot E\_{\text{u1}} \cdot E\_{I1} \end{array} \tag{8}$$

As can be seen, *tcycle est* is influenced by the variables *Vu*1, *El*1, *Eu*<sup>1</sup> and *T*1, which is logical since the overall behaviour of the first bioreactor dictates when the second is to be unloaded. A comparison of *tcycle est* and its experimental counterpart, *tcycle*, revealed that all predictions fell within the 95% interval (see Figure S3.9 in file "S3.docx" in Supplementary Materials). Finally, as with the previous variables, the residuals had a near-zero mean in most cases and were normally distributed in all (see Figure S3.10 in file "S3.docx" in Supplementary Materials).

3.3.6. Mean Overall Volume in the Two-Bioreactor System

The ANOVA on the experimental data of *Vm* (Table 10) revealed statistically significant differences between experiment means at a 99.9% confidence level. The multiple regression model obtained (Equation (9)) estimated the mean overall volume in each cycle with an error of 0.33 L. The fitting steps are detailed in Section S3.6 in file "S3.docx" in Supplementary Materials (Tables S3.61 to S3.74 show intermediate results). Also, as can be seen from the equation, *Vm est* is independent of *T*1.

$$\begin{array}{ll} V\_{m\,\,est} = & 4.306 + 4.739 \cdot E\_{u1} - 0.841 \cdot E\_{u1}^2 \\ & + 0.336 \cdot E\_{u1} \cdot V\_{u1} - 0.0127 \cdot T\_2 \cdot V\_{u1} \\ & - 0.125 \cdot E\_{l2} \cdot E\_{l1} + 0.0326 \cdot T\_2 \cdot E\_{l1} \\ & - 0.0735 \cdot T\_2 \cdot E\_{u1} + 0.0358 \cdot T\_2 \cdot E\_{l2} \end{array} \tag{9}$$

As shown by a plot of *Vm est* against the experimental data *Vm* (Figure S3.11 in file "S3.docx" in Supplementary Materials), all predictions fell within the 95% interval, so the degree of fitting was acceptable. This was further confirmed by the residuals for each experiment (Figure S3.12 in file "S3.docx" in Supplementary Materials), which had a near-zero mean in virtually all experiments and were normally distributed.

#### 3.3.7. Mean Ethanol Concentration in the First Bioreactor

The ANOVA of *EtOHm*<sup>1</sup> (Table 11) also exposed statistically significant differences between experiment means at a 99.9% confidence level. The experimental data were fitted by Forward Stepwise Regression to construct Equation (10), which reproduced them with an estimation error of 0.2% (*v*/*v*). The fitting steps are detailed in Section S3.7 in file "S3.docx" in Supplementary Materials (Tables S3.75 to S3.81 show intermediate results)

$$\begin{array}{ll} \text{EtOH}\_{\text{m1 cst}} = & 2.312 - 0.0932 \cdot E\_{\text{u1}}^2 \\ & + 0.0191 \cdot E\_{l1} \cdot V\_{\text{u1}} + 0.175 \cdot E\_{\text{u1}} \cdot E\_{l1} \end{array} \tag{10}$$

As expected, *EtOHm*<sup>1</sup> *est* is independent of *T*1, *El*<sup>2</sup> and *T*2. Comparing *EtOHm*<sup>1</sup> *est* and its experimental counterpart *EtOHm*<sup>1</sup> (Figure S3.13 in in file "S3.docx" Supplementary Materials) revealed that all predictions of the model fell within the 95% interval. Therefore, the fitting was good, as further confirmed by the residuals (Figure S3.14 in file "S3.docx" in Supplementary Materials), which had a near-zero mean in most experiments and were normally distributed.

### 3.3.8. Mean Ethanol Concentration in the Second Bioreactor

Based on the results of the ANOVA on *EtOHm*<sup>2</sup> (Table 11), there were statistically significant differences between experiment means at a 99.9% confidence level. Fitting the experimental results by Forward Stepwise Regression provided the model of Equation (11), which estimated *EtOHm*<sup>2</sup> with an error of 0.4% (*v*/*v*). The fitting steps are detailed in Section S3.8 in file "S3.docx" in Supplementary Materials (Tables S3.82 to S3.88 show intermediate results)

$$\begin{aligned} \text{EtOH}\_{\text{m2 est}} &= \begin{aligned} 4.327 - 0.179 \cdot \text{E}\_{\text{u1}} \cdot V\_{\text{u1}} \\ + 0.306 \cdot \text{E}\_{l2} \cdot \text{E}\_{\text{u1}} - 0.0182 \cdot \text{T}\_{2} \cdot \text{E}\_{l2} \end{aligned} \tag{11}$$

As can be seen, *EtOHm*<sup>2</sup> *est* is independent of *El*<sup>1</sup> and *T*1. A plot of *EtOHm*<sup>2</sup> *est* against experimental data *EtOHm*<sup>2</sup> (Figure S3.15 in file "S3.docx" in Supplementary Materials) revealed that the former all fell within the 95% prediction interval. Also, the residuals (Figure S3.16 in file "S3.docx" in Supplementary Materials) were all normally distributed and had a near-zero mean in most cases.

#### 3.3.9. Mean Acetic Acid Concentration in the First Bioreactor

Based on the results of the ANOVA on *HAcm*<sup>1</sup> (Table 11), there were statistically significant differences between experiments at a 99.9% confidence level. Fitting the data provided the model represented by Equation (12), which predicted the mean acetic acid concentration in the first bioreactor with an error of 0.2% (*w*/*v*). The fitting steps are detailed in Section S3.9 in file "S3.docx" in Supplementary Materials (Tables S3.89 to S3.95 show intermediate results)

$$HAc\_{m1\text{ est}} = \begin{array}{r} 9.188 + 0.0932 \cdot E\_{u1}^2 \\ -0.0191 \cdot E\_{l1} \cdot V\_{u1} - 0.175 \cdot E\_{u1} \cdot E\_{l1} \end{array} \tag{12}$$

As can be seen, *HAcm*<sup>1</sup> *est* depends on the same variables as *EtOHm*<sup>1</sup> *est* and is also independent of *T*1, *El*<sup>2</sup> and *T*2—a logical result, since the two variables are mutually related.

A comparison of *HAcm*<sup>1</sup> *est* and its experimental counterpart *HAcm*<sup>1</sup> (Figure S3.17 in file "S3.docx" in Supplementary Materials) revealed that the predictions of the model all fell within the 95% interval. Also, as can be seen from Figure S3.18 in file "S3.docx" in the Supplementary Materials, the residuals had a near-zero mean in virtually all cases and were normally distributed.

#### 3.3.10. Mean Acetic Acid Concentration in the Second Bioreactor

As with the previous variables, the ANOVA on *HAcm*<sup>2</sup> (Table 11) exposed statistically significant differences at a 99.9% confidence level in mean acetic acid concentration between experiments. Equation (13) represents the linear model obtained by fitting the experimental data with the Forward Stepwise Regression method. The fitting steps are detailed in Section S3.10 in file "S3.docx" in Supplementary Materials (Tables S3.96 to S3.102 show intermediate results). The model predicted *HAcm*<sup>2</sup> with an error of 0.4% (*w*/*v*).

$$\begin{array}{rcl} \text{HAc}\_{m2\text{ est}} = & 7.227 + 0.177 \cdot \text{E}\_{\text{u1}} \cdot V\_{\text{u1}} \\ & - 0.305 \cdot \text{E}\_{\text{l2}} \cdot \text{E}\_{\text{u1}} + 0.0178 \cdot \text{T}\_2 \cdot \text{E}\_{\text{l2}} \text{ }' \end{array} \tag{13}$$

Similarly to *HAcm*<sup>1</sup> *est* and *EtOHm*<sup>1</sup> *est*, *HAcm*<sup>2</sup> *est* is dependent on the same operational variables as *HAcm*2. A comparison of *HAcm*<sup>2</sup> *est* predictions and experimental data (*HAcm*2, Figure S3.19 in file "S3.docx" in Supplementary Materials) revealed that the former invariably fell within the 95% interval. Also, the residuals (Figure S3.20 in file "S3.docx" in Supplementary Materials) had a near-zero

mean in most cases and were normally distributed in all. Therefore, the model can be deemed acceptably accurate.

#### *3.4. Discussion about the Obtained Polynomial Models*

Although a black-box model does not allow one to ascertain why some operational variables are influential whereas others are not, it could be interesting to identify the most influential variables and their interaction terms. The influence (statistical significance) of each term in a polynomial equation can be assessed through statistic *F*, which was used here to decide whether a term was to be included or excluded. By way of example, Table S3.14 in file "S3.docx" in Supplementary Materials reveals that the highest *F* values were those for *Eu*<sup>1</sup> and *E*<sup>2</sup> *<sup>u</sup>*1. Therefore, the variable (*rA*)*global est* was especially sensitive to the ethanol concentration at the time the first bioreactor was unloaded—it was directly influenced by *Eu*<sup>1</sup> and by its quadratic term.

As a rule, the operational variables associated to the first bioreactor were more markedly influential on most of the dependent variables than were those pertaining to the second. This is unsurprising if one considers that the first reactor not only contributed to the total acetic acid production but also supplied the second with the microorganisms which must work under more extreme conditions in the second bioreactor, since one of the main goals was to deplete ethanol in the medium. Therefore, the conditions prevailing in the first reactor should allow a high concentration of very active acetic acid bacteria to be maintained. As stated in the introduction, such conditions are obtained by keeping the ethanol and acetic acid concentration at not too high levels. It is thus unsurprising that the polynomials used to estimate (*rA*)*global est* and *Pm est* were so strongly dependent on *Eu*<sup>1</sup> and *El*<sup>1</sup> as the latter two variables are directly related to acidity in the reaction medium. *Vu*<sup>1</sup> is also highly influential; in fact, the greater the volume unloaded into the second bioreactor is, the more marked will be the potential changes in ethanol and acidity levels in the first as a result of the need for a greater volume of fresh medium for replenishment. As expected, the interaction term *Eu*1·*El*<sup>1</sup> in the polynomial for *EtOHm*<sup>1</sup> *est* is especially important; in fact, changes in *Eu*<sup>1</sup> and *El*<sup>1</sup> must have a strong impact on the mean ethanol concentration in each transformation cycle in the first bioreactor.

One other interesting inference is that the polynomials for *EtOHm*<sup>1</sup> *est* and *HAcm*<sup>1</sup> *est* are complementary; in fact, they only differ in the independent term and in the signs of the others. The sum of the independent term coincides with the overall content of the medium [% (*v*/*v*) ethanol + % (*w*/*v*) acetic acid]. Since the total concentration remains constant, in the absence of volatile losses by sweeping—which was the case with our experiments—this result is unsurprising and provides support for the correlation procedure used to develop the equations. Similar reasoning can be applied to *EtOHm*<sup>2</sup> *est* and *HAcm*<sup>2</sup> *est*.

The variable *Ed*<sup>2</sup> *est* is of special interest as it is a measure of ethanol depletion in each biotransformation cycle. One aim of the acetification process may be not to operate at the highest possible rate but rather to deplete or nearly deplete the substrate in each cycle—in which case *Ed*<sup>2</sup> *est* will be zero or near-zero. Again, the variables *Eu*1, *El*<sup>1</sup> and *Vu*<sup>1</sup> will be especially influential—not directly, but through their interaction terms—, but so will *T*<sup>2</sup> and *El*2.

Once the previously described models have been obtained, the operating conditions can only be optimized, for specific purposes, through a well-designed optimization process using several objective functions. Hence, additional work would be necessary in this regard.

#### **4. Conclusions**

Despite the broad available experience and technical knowledge available on the biotransformation of ethanol into acetic acid in the vinegar production process, a number of essential questions remain unanswered. Such is the case, for example, with the nature of the microbiota that effects the process and with its complex metabolic interactions. In practice, the process continues to require more or less extensive modelling in order to relate operational variables to specific objective functions. Black-box models based on generalized polynomials have proved especially suitable for representing the behaviour of acetification systems in the form of response surfaces.

In this work, an effective experimental design based on useful data for the unequivocal calculation of the coefficients of the polynomial equations has been developed. Once the main operational variables have been identified, their admissible ranges have been established. That information has been used into an experimental design in order to identify the combinations of values of the operational variables that would allow the number of experiments maximizing the predictive ability of a model to be minimized. Another aim carried out in this work was to model key variables related to the acetification process performed with two serial bioreactors working in the semi-continuous mode by using quadratic polynomial equations.

The operational variables considered were the ethanol concentration at the time of unloading (*Eu*1), unloaded volume (*Vu*1), ethanol concentration during loading (*El*1) and operating temperature (*T*1) in the first reactor, and the ethanol concentration during loading (*El*2) and operating temperature (*T*2) in the second. As it has been shown, the variation ranges for these variables are subject not only to physico–chemical constraints, but also to others arising from the fact that the two reactors operate in a serial mode and from the limited number of combinations allowed for by the experimental design.

As a result, a fractional factorial design with 30 experiments (see Tables 5–7) considering the previous constraints has been obtained, with it being necessary to perform only 18 of them (see Tables 5 and 6).

With the gathered experimental data, second-order polynomials for several target variables of interest in terms of the considered operational variables involved in the industrial production process were developed. Specifically, models for the mean overall rate of acetic acid formation in the two-bioreactor system [(*rA*)*global*], total acetic acid production in the system (*Pm*), ethanol concentration at the time the second bioreactor was unloaded (*Eu*2), the volume unloaded from the second bioreactor (*Vu*2), cycle duration (*tcycle*), the mean overall volume in the two bioreactors during a cycle (*Vm*), the mean ethanol concentration in the first bioreactor during a cycle (*EtOHm*1), the mean ethanol concentration in the second bioreactor during a cycle (*EtOHm*2), the mean acetic acid concentration in the first reactor during a cycle (*HAcm*1) and the mean acetic acid concentration in the second reactor during a cycle (*HAcm*2) were obtained. The experimental results and their estimates were correlated via Forward Stepwise Regression, which allowed models with high predictive ability and minimal errors to be obtained. The resulting goodness of fit allowed a set of polynomials to be established that accurately reproduced the experimental results with only 18 experiments rather than the 30 needed in theory. Such models should allow us to carry out further optimization studies.

With all models, the variables associated to the first bioreactor were the more influential on the process. This was particularly so with *Eu*1, *El*<sup>1</sup> and *Vu*1, and can be ascribed not only to the fact that these variables depend on the fermentation conditions in the first bioreactor—and hence contribute to the total production of acetic acid—but also to the operating conditions in the second—usually more extreme—being influenced by those under which the first is operated.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2076-3417/10/24/9064/s1. "Get\_feasible\_combinations.m": MATLAB script for systematic analysis of feasible combinations of operational variables. File "Results.xlsx": Excel file with successively obtained feasible combinations of operational variables. File "S1.pdf": Description of the procedure used to determine the non-measurable variables of the process. File "S2.pdf": Description of the experimental results. File "S3.pdf": Detailed description of the procedure used to obtain the polynomial models of all analysed variables.

**Author Contributions:** Conceptualization, I.G.-G.; methodology, C.M.Á.-C., I.M.S.-D. and I.G.-G.; formal analysis, C.M.Á.-C., I.M.S.-D., J.E.J.-H. and I.G.-G.; Data curation, C.M.Á.-C. and I.M.S.-D.; writing—original draft preparation, J.E.J.-H. and I.G.-G.; writing—review and editing, J.E.J.-H., I.G.-G. and I.M.S.-D.; funding acquisition, I.G.-G. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by "XXIII Programa Propio de Fomento de la Investigación 2018" (MOD 4.2. SINERGIAS, Ref XXIII. PP Mod 4.2) from University of Córdoba (Spain) and by "Programa PAIDI" from Junta de Andalucía (RNM-271).

**Conflicts of Interest:** The authors declare no conflict of interest.
