2.3.3. Dietary Patterns with Clustering Analysis

A two-step cluster analysis was first conducted on the FFQ variables to automatically select the optimal number of clusters, which would otherwise not be apparent from the data. Three clusters were taken as the optimal number. Subsequently, unsupervised hierarchical clustering was applied to group subjects with similar characteristics using the "pheatmap" R package. The distance matrix was computed from Euclidean distances, and Ward's method was used as the linkage criterion to merge clusters. The agglomerative coefficient, calculated with the agnes function, was always higher than 0.85 (values close to 1 indicate a strong clustering structure).
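
The clustering step described above can be illustrated with a minimal sketch. It is not the authors' code, which used R; it is a plain-Python analogue written for illustration only. The function names (`ward_linkage`, `agglomerative_coefficient`, `cut_tree`) and the toy intake matrix are hypothetical, and the Ward variant assumed here is the one that operates on unsquared Euclidean distances, since the text does not say which variant was used.

```python
import math


def ward_linkage(rows):
    """Agglomerative clustering with Euclidean distances and Ward's criterion.

    Returns one tuple (id_a, id_b, height, size) per merge. Observations have
    ids 0..n-1; the cluster created at step s gets id n + s.
    """
    n = len(rows)
    members = {i: [i] for i in range(n)}
    # Squared Euclidean distances between active clusters, keyed (small_id, large_id).
    d2 = {(i, j): sum((a - b) ** 2 for a, b in zip(rows[i], rows[j]))
          for i in range(n) for j in range(i + 1, n)}
    merges = []
    for step in range(n - 1):
        (i, j), dij = min(d2.items(), key=lambda kv: kv[1])
        new_id = n + step
        ni, nj = len(members[i]), len(members[j])
        # Lance-Williams update for Ward's method on squared distances.
        for k in members:
            if k in (i, j):
                continue
            nk = len(members[k])
            dik = d2.pop((min(i, k), max(i, k)))
            djk = d2.pop((min(j, k), max(j, k)))
            d2[(k, new_id)] = ((ni + nk) * dik + (nj + nk) * djk - nk * dij) / (ni + nj + nk)
        del d2[(i, j)]
        members[new_id] = members.pop(i) + members.pop(j)
        merges.append((i, j, math.sqrt(dij), ni + nj))
    return merges


def agglomerative_coefficient(merges, n):
    """Mean of 1 - (height of an observation's first merge / height of the final merge)."""
    final_height = merges[-1][2]
    first_merge = {}
    for a, b, height, _ in merges:
        for cid in (a, b):
            if cid < n:  # an original observation joining its first cluster
                first_merge[cid] = height
    return sum(1 - first_merge[i] / final_height for i in range(n)) / n


def cut_tree(merges, n, k):
    """Replay the first n - k merges and label the k remaining clusters."""
    members = {i: [i] for i in range(n)}
    for step, (a, b, _, _) in enumerate(merges[: n - k]):
        members[n + step] = members.pop(a) + members.pop(b)
    labels = [0] * n
    for label, cid in enumerate(sorted(members, key=lambda c: min(members[c]))):
        for obs in members[cid]:
            labels[obs] = label
    return labels


# Hypothetical toy data: 6 subjects x 2 food-frequency variables.
intake = [[1.0, 1.0], [1.2, 0.9], [5.0, 5.1], [5.2, 4.9], [9.0, 1.0], [9.1, 1.2]]

merges = ward_linkage(intake)
labels = cut_tree(merges, len(intake), k=3)  # three clusters, as in the text
ac = agglomerative_coefficient(merges, len(intake))
print(labels)  # [0, 0, 1, 1, 2, 2]
print(round(ac, 3))
```

In this sketch the coefficient compares the height at which each subject first joins a cluster with the height of the final merge, so well-separated groups push the value toward 1.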

Heat maps were used to visualize the hierarchical clustering, which allowed clusters of subjects and of features to be displayed simultaneously. Hierarchical clustering was performed on both the rows and the columns of the data matrix, which were then re-ordered according to the clustering result so that similar observations were placed close to each other. As a result, blocks of 'high' and 'low' values appear adjacent in the re-ordered matrix. A red-blue color scheme was applied to help identify the variables that are characteristic of each subject cluster.
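
The re-ordering behind such a heat map can be sketched as follows. This is an illustrative plain-Python analogue under stated assumptions, not the pheatmap implementation: `ward_leaf_order` is a hypothetical helper, the standardized intake values are invented, and mapping red to high and blue to low values is an assumption, since the text does not state the direction of the color scale.

```python
def ward_leaf_order(vectors):
    """Cluster vectors with Ward's method and return the dendrogram leaf order."""
    n = len(vectors)
    leaves = {i: [i] for i in range(n)}
    # Squared Euclidean distances between active clusters, keyed (small_id, large_id).
    d2 = {(i, j): sum((a - b) ** 2 for a, b in zip(vectors[i], vectors[j]))
          for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(leaves) > 1:
        (i, j), dij = min(d2.items(), key=lambda kv: kv[1])
        ni, nj = len(leaves[i]), len(leaves[j])
        # Lance-Williams update for Ward's method on squared distances.
        for k in leaves:
            if k in (i, j):
                continue
            nk = len(leaves[k])
            dik = d2.pop((min(i, k), max(i, k)))
            djk = d2.pop((min(j, k), max(j, k)))
            d2[(k, next_id)] = ((ni + nk) * dik + (nj + nk) * djk - nk * dij) / (ni + nj + nk)
        del d2[(i, j)]
        # Concatenating the children's leaves keeps each subtree contiguous.
        leaves[next_id] = leaves.pop(i) + leaves.pop(j)
        next_id += 1
    return leaves[next_id - 1]


# Hypothetical standardized intakes: 4 subjects (rows) x 4 food groups (columns).
matrix = [
    [1.2, -0.9, 1.0, -1.1],
    [-1.0, 1.1, -1.2, 0.9],
    [1.1, -1.0, 0.9, -1.0],
    [-0.9, 1.0, -1.1, 1.2],
]

row_order = ward_leaf_order(matrix)                       # cluster subjects
col_order = ward_leaf_order([list(c) for c in zip(*matrix)])  # cluster food groups
reordered = [[matrix[r][c] for c in col_order] for r in row_order]

# Text rendering; assumption: R (red) marks high values, B (blue) marks low values.
for row in reordered:
    print(" ".join("R" if value > 0 else "B" for value in row))
```

After re-ordering, subjects 0 and 2 and subjects 1 and 3 sit next to each other, as do the food groups with similar profiles, so the rendered grid shows contiguous blocks of one color.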

#### **3. Results**
