### *2.2. Experimental Data*

Inclusion criteria were the presence, in patients of the Emergency Department and of the Department of Medicine, of pathological conditions resulting in volume overload (heart failure) or volume depletion (dehydration or moderate bleeding). Patients without these conditions were selected as a control group. Exclusion criteria were chronic obstructive pulmonary disease, pulmonary hypertension, interstitial disease or thromboembolism, tension pneumothorax, cirrhosis and/or ascitic effusion, serum creatinine >3 mg/dL, constrictive pericarditis and cardiac tamponade. Fifty patients, selected from a database of 69 (Table 1), were included in the study. On the basis of clinical considerations (physical examination, laboratory data and imaging), each patient was assigned to one of three classes: hypovolemic (volume depletion), euvolemic (controls) or hypervolemic (volume overload).


US B-mode video clips of about 15 s were recorded at the bedside during spontaneous breathing, with a subxiphoid approach, using a MyLab Seven system (Esaote, Genova, Italy; frame rate 30 Hz, 256 gray levels) equipped with a convex 2–5 MHz probe. M-mode scans were also recorded to allow for standard manual measurements.

In accordance with the Declaration of Helsinki, subjects provided written informed consent for the collection of data and subsequent analysis. The study was approved by the local Ethics Committee.

Only the data of patients for whom both video clips (one recorded along the long axis and one along the short axis) could be reliably processed were included here: 20 patients in overload, 19 controls and 11 patients with volume depletion. Video clips of patients with volume depletion are more difficult to process because the IVC is small and may even collapse in some frames, which hinders proper processing.

Figure 2 shows examples of patients in the 3 classes.

**Table 1.** Number of patients included in different groups (with indication of the entire database and of the patients for which successful processing of both long and short axis ultrasound videos was achieved).


### *2.3. Automated Identification of the Volemic Status*

As a preliminary step, three classification approaches were fit to our dataset: the error-correcting output codes (ECOC) model, using support vector machines (SVM) [32] for one-versus-one binary classifications [33,34]; the Naive Bayes classifier (estimating data distributions using smoothed densities with a normal kernel) [35]; and the binary tree model (BTM) [35]. For each approach, different models were fit to our data, considering all possible combinations of input features (detailed below). The performances of the classifiers were compared by 10-fold cross-validation, which allowed the best input features and classification approach to be selected. The selected classifier was then tested by a leave-one-out approach and, finally, trained on the entire dataset to provide the final prediction model. In the following, we focus only on the BTM, as it provided the best results.
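The evaluation sequence described above (comparison by 10-fold cross-validation, leave-one-out test of the selected model, final training on the entire dataset) can be sketched as follows. This is only a minimal illustration under stated assumptions: the nearest-centroid classifier is a stand-in and is not one of the models used in this study, the error measure is the plain misclassification rate, the data are synthetic, and all function names are hypothetical.

```python
import random

def kfold_indices(n, k, seed=0):
    """Shuffle sample indices and deal them into k folds (assumes n >= k)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cv_error(train_fn, X, y, folds):
    """Average misclassification rate over the validation folds."""
    errors = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        predict = train_fn([X[i] for i in train], [y[i] for i in train])
        wrong = sum(predict(X[i]) != y[i] for i in fold)
        errors.append(wrong / len(fold))
    return sum(errors) / len(errors)

def nearest_centroid(X, y):
    """Stand-in classifier (NOT a model from the study): nearest class mean."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(x):
        def sq_dist(lab):
            return sum((a - b) ** 2 for a, b in zip(x, centroids[lab]))
        return min(centroids, key=sq_dist)

    return predict

# Synthetic, well-separated data (not from the study): 3 classes, 2 features
X = [[c + 0.1 * j, c - 0.1 * j] for c in (0.0, 5.0, 10.0) for j in range(10)]
y = [label for label in ("hypo", "eu", "hyper") for _ in range(10)]

# Step 1: 10-fold cross-validation (used to compare candidate models)
ten_fold = cv_error(nearest_centroid, X, y, kfold_indices(len(y), 10))
# Step 2: leave-one-out test, i.e., one fold per sample
leave_one_out = cv_error(nearest_centroid, X, y, [[i] for i in range(len(y))])
# Step 3: final model trained on the entire dataset
final_model = nearest_centroid(X, y)
```

In this sketch, leave-one-out is simply cross-validation with as many folds as samples, so the same routine serves both steps.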

Different BTMs were fit to our three-class classification problem, and the simplest one (i.e., with minimum dimension) among those with the best performance was selected. A BTM iteratively splits the dataset into two groups by comparing an index with a threshold (Gini's diversity index was used as the splitting criterion). It is thus built by choosing the optimal number of splits, the index to be considered for each binary separation and the threshold value for each split. Different BTMs were developed considering all possible combinations of input indexes (exhaustive search): all single indexes, all pairs, all triplets, and so on, up to the full set of indexes.
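A single binary separation of the kind described above can be illustrated with a short sketch. It assumes the usual definition of Gini's diversity index for a node, 1 − Σ p_k², and selects the threshold that minimizes the size-weighted impurity of the two resulting groups. The function names and the numerical values are hypothetical and are not taken from the study.

```python
from collections import Counter

def gini(labels):
    """Gini's diversity index of a node: 1 - sum_k p_k^2 (0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, weighted impurity) of the best split on one index.

    The threshold is None if no split improves on the unsplit node.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))  # impurity of the parent node (no split)
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # identical values cannot be separated by a threshold
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = ((low + high) / 2, score)  # midpoint between neighbours
    return best

# Hypothetical values of one pulsatility index (not data from the study)
index_values = [0.10, 0.15, 0.20, 0.45, 0.50, 0.80, 0.85]
status = ["hyper", "hyper", "hyper", "eu", "eu", "hypo", "hypo"]

threshold, impurity = best_threshold(index_values, status)
```

A full tree would apply the same search recursively to each of the two groups, and over every index in the candidate set.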

**Figure 2.** Examples of data from patients in hypo-, eu- or hyper-volemic conditions. The first frames of the long and short axis scans are shown (left and right, respectively), together with the IVC boundaries identified by the algorithm. Time series are also shown for the diameters in 5 sections of the IVC (in gray, with the mean diameter superimposed in black) and for the IVC area, estimated from the long and short axis scans, respectively. In the case of long axis scans, pulsatility indexes were computed as averages of the estimates from each of the 5 sections; in the case of short axis scans, they were computed from the equivalent diameter, proportional to the square root of the IVC cross-section area. *Dm*: mean diameter; *Am*: mean area; CI: caval index; RCI: respiratory caval index; CCI: cardiac caval index.
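The two ways of computing a pulsatility index mentioned in the caption of Figure 2 can be sketched as follows. The sketch assumes the common definition of the caval index, CI = (D_max − D_min)/D_max, and takes the equivalent diameter as that of the circle with the same area, D = 2·sqrt(A/π); both choices, as well as the function names and the numbers, are assumptions made for illustration. Only two sections are used for brevity, and RCI and CCI are not covered.

```python
import math

def caval_index(diameters):
    """Assumed definition: CI = (D_max - D_min) / D_max over the recording."""
    d_max, d_min = max(diameters), min(diameters)
    return (d_max - d_min) / d_max

def equivalent_diameter(area):
    """Diameter of the circle with the given area: D = 2 * sqrt(A / pi)."""
    return 2.0 * math.sqrt(area / math.pi)

def long_axis_ci(sections):
    """Long axis: average of the CI estimated separately in each section."""
    return sum(caval_index(d) for d in sections) / len(sections)

def short_axis_ci(areas):
    """Short axis: CI of the equivalent diameter derived from the area."""
    return caval_index([equivalent_diameter(a) for a in areas])

# Hypothetical time series (mm and mm^2), not data from the study
sections = [[20.0, 18.0, 16.0, 19.0], [22.0, 20.0, 17.6, 21.0]]
areas = [314.0, 280.0, 201.0, 300.0]

ci_long = long_axis_ci(sections)
ci_short = short_axis_ci(areas)
```

Because the equivalent diameter scales with the square root of the area, a given relative change in area corresponds to a smaller relative change in diameter.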

Different sets of indexes were used, depending on whether the semi-automated processing was employed or not (in the latter case, only manual measurements were considered).

The set of indexes obtained by semi-automated video processing was the following:



The set of indexes obtained by standard manual measurements was the following:


For each set of indexes, different BTMs were developed using all possible combinations of features taken from it (255 and 15 combinations for the first and second feature set, respectively, i.e., all non-empty subsets of 8 and 4 indexes), and the one with the highest performance was selected. Specifically, the best categorical predictor split was chosen from all possible combinations of choices. As mentioned above, the models were cross-validated using 10 folds. The data were randomly ordered, so that the three categories of patients had a similar representation in each fold (they could not, however, be equally represented, a problem emphasized by the small size of our dataset). The model providing the minimum average root mean squared regression error (or loss) on the validation sets was then selected. This model was then tested by a leave-one-out approach, to reduce the bias in error estimation (given our small dataset) [36].
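The numbers of candidate models reported above can be checked by enumerating the non-empty subsets of each feature set. In the sketch below, the set sizes (8 and 4) are inferred from the reported counts, since 2^8 − 1 = 255 and 2^4 − 1 = 15; the feature names are placeholders and do not correspond to the actual indexes.

```python
from itertools import combinations

def feature_subsets(features):
    """Yield every non-empty combination: singles, pairs, triplets, ..., all."""
    for size in range(1, len(features) + 1):
        yield from combinations(features, size)

# Placeholder names only (hypothetical); sizes inferred from the counts
automated = [f"auto_{i}" for i in range(1, 9)]   # 8 indexes -> 255 subsets
manual = [f"manual_{i}" for i in range(1, 5)]    # 4 indexes -> 15 subsets

n_automated = sum(1 for _ in feature_subsets(automated))
n_manual = sum(1 for _ in feature_subsets(manual))
```

Each subset would define one candidate model to be cross-validated, which remains tractable for sets of this size.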
