1. Introduction
Monitoring the diet quality of pregnant women is crucial for the improved health of both themselves and their child. Good nutrition during pregnancy may help to prevent deficiencies that affect the health of both the mother and fetus [
1,
2]. Women following a Mediterranean diet (MedDiet) during pregnancy have displayed superior health for both themselves and their offspring [
3]. The economic benefits of using food frequency questionnaires (FFQs), compared to more expensive methods such as 24 h recalls (24 hR) or food records [
4], make them a popular tool to estimate food intake for different populations. However, the validity of an FFQ is limited to the target population in which it was validated. Thus, an FFQ validated in the general population [
5] should not be administered to pregnant women.
The accuracy of an FFQ can be determined by comparing food and nutrient intake information collected by this instrument with that obtained using a reference method. Weighed food records are considered the gold standard method; however, the application of this method is often not feasible. Therefore, food records and 24 hR are widely used as the reference method. However, these instruments are not free from measurement errors caused by, among other factors, memory bias [
4]. Biological markers, in contrast to self-reported data, are objective measures of food intake. It is, therefore, appropriate to include biological markers of food intake in the evaluation of the validity of an FFQ [
6].
Most FFQ validation studies compare intake estimates collected by an FFQ with those collected by a reference method to assess their agreement in terms of reported intakes of individual nutrients and/or food groups. However, it should be acknowledged that these results cannot be generalized to a composite score built from several specific nutrients and/or foods, such as the Mediterranean diet score (MedDiet score).
Given the great impact of diet quality on health [
1,
3,
7], it is important that FFQs correctly estimate food and nutrient intake. However, most FFQ validation studies have not gone beyond reporting the test–retest reliability and concurrent validity of the questionnaire. Thus, this study aimed to determine the test–retest reliability, which indicates the stability of the FFQ over time, through repeated measures at two different time points. Furthermore, we determined concurrent or relative validity, which indicates the degree of agreement between two different measures, and construct validity, which indicates the degree to which an instrument measures the concept it is intended to measure. For this purpose, we compared dietary data obtained by the FFQ with those derived from food records and biological markers of food intake. Additionally, we analyzed these validity domains along with changes in adherence to the MedDiet.
3. Results
General characteristics of the study population are shown in
Table 1. The mean MedDiet score at baseline and follow-up was 4.0 ± 1.5 and 4.1 ± 1.6, respectively.
Table 2 shows the test–retest reliability of the FFQ administered at 19–24 and 34–36 weeks of gestation. We found a significant test–retest reliability for each of the 12 food groups and 17 nutrients. The ICC ranged from moderate (cereal = 0.42) to very good (vegetables = 0.83) for foods and from moderate (MUFA = 0.48) to very good (Vitamin A = 0.86) for nutrients. The average correlation of foods and nutrients was 0.70 and 0.69, respectively. Test–retest reliability for the MedDiet score showed an ICC of 0.53, considerably lower than for foods and nutrients.
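As an illustration of how a test–retest ICC of this kind can be obtained, the following is a minimal sketch only. It assumes a one-way random-effects ICC(1,1); the ICC model actually used in this study is not stated in this section. The function name `icc_oneway` and the intake values are hypothetical.

```python
from statistics import mean


def icc_oneway(measurements):
    """One-way random-effects ICC(1,1) for k repeated measures per subject.

    `measurements` is a list of equal-length tuples, one tuple per subject.
    Assumption: the one-way model is used here only for illustration.
    """
    n = len(measurements)
    k = len(measurements[0])
    grand_mean = mean(v for row in measurements for v in row)
    subject_means = [mean(row) for row in measurements]

    # Mean squares between and within subjects from a one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (v - m) ** 2 for row, m in zip(measurements, subject_means) for v in row
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical vegetable intakes (g/day) from two FFQ administrations.
ffq_pairs = [(210, 230), (95, 120), (310, 280), (150, 160), (60, 90)]
icc = icc_oneway(ffq_pairs)
print(f"ICC = {icc:.2f}")
```

An ICC close to 1 means that most of the variance lies between women rather than between the two administrations, which is what a stable questionnaire should show.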
The analysis of concurrent validity (
Table 3) yielded correlations ranging from poor to good for foods and nutrients. The average correlations of foods and nutrients between the two methods were 0.24 and 0.21, respectively. All correlations were significant, with the exception of olive oil and vitamin B12 for non-standardized foods and nutrients, and of olive oil, vitamin B12 and monounsaturated fat for standardized foods and nutrients (
Table 3 and
Table 4). The degree of correlations ranged from poor (olive oil ρ = 0.12) to good (dairy products ρ = 0.63) for foods and from poor (vitamin B12 ρ = 0.10) to moderate (vitamin C ρ = 0.47) for nutrients (
Table 3). Concordance between methods ranged from poor to fair for foods and nutrients. The κ statistic showed a fair concordance between the methods for foods (weighted κ = 0.28) and nutrients (weighted κ = 0.21). The adjustment for energy intake did not meaningfully improve correlations or concordance between the methods (
Table 4). The comparison of the biomarker α-linolenic acid was limited to nut consumption because no other relevant food sources of α-linolenic acid were included in the FFQ. Hydroxytyrosol was correlated with olive oil and olives, both principal sources of this bioactive compound. The comparison of biomarkers of food and nutrient intake with the corresponding data derived by the FFQ revealed a poor (r = 0.07;
p = 0.59) to moderate (r = 0.41;
p < 0.001) concurrent validity for hydroxytyrosol and α-linolenic acid, respectively (
Table 5).
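The ρ values reported above suggest rank correlations between FFQ and food-record intakes. As a hedged sketch of that calculation, the code below computes Spearman's ρ as the Pearson correlation of average ranks. That Spearman's coefficient was the one used is an assumption, and the helper names and intake values are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical dairy intakes (g/day): FFQ versus food records.
ffq_dairy = [250, 400, 120, 310, 90, 500]
record_dairy = [220, 350, 180, 300, 60, 410]
rho = spearman(ffq_dairy, record_dairy)
print(f"rho = {rho:.2f}")
```

Because ranks rather than raw intakes are compared, the coefficient reflects how well the FFQ orders women by intake, not whether absolute amounts agree.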
The mean MedDiet scores derived by the FFQ at baseline and follow-up were 4.0 ± 1.5 and 4.1 ± 1.6, respectively, and the mean score derived from dietary records was 4.0 ± 1.5. The FFQ significantly (
p < 0.001) overestimated the MedDiet score (by 12%) compared to the corresponding MedDiet score derived by the reference method. However, no proportional bias was found (β coefficient 0.072; 95% CI −0.064, 0.204;
p = 0.287) (
Table 6 and
Figure 1) across score ratings. The Pearson coefficient revealed a moderate and significant correlation (r = 0.46;
p < 0.001) between the scores derived by the dietary records and FFQ. Additionally, the intraclass correlation coefficient, an indicator of the degree to which both instruments assigned the same absolute score ratings, showed the same degree of correlation (ICC = 0.46;
p < 0.001). These findings indicate that the FFQ had a moderate ability to rank participants according to their adherence to the MedDiet. To analyze construct validity, we hypothesized a priori that higher MedDiet scores would be associated with more favorable intake profiles for 17 nutrients. We found that intakes of the 17 nutrients were associated in the anticipated direction with MedDiet score ratings derived by the FFQ; the associations were significant for nearly 50% of the nutrients (
Supplementary Table S1).
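The check for proportional bias described above can be illustrated with a Bland-Altman style regression, in which the difference between the two methods is regressed on their mean. This is a sketch under the assumption that the reported β coefficient is the slope of such a regression; the function name and the scores are hypothetical.

```python
def bland_altman_bias(test_scores, reference_scores):
    """Return (mean difference, slope of difference regressed on the mean).

    A mean difference away from zero indicates systematic over- or
    underestimation by the test method. A slope away from zero indicates
    proportional bias, i.e. the disagreement changes with the score level.
    Assumption: ordinary least squares, used here for illustration only.
    """
    diffs = [t - r for t, r in zip(test_scores, reference_scores)]
    means = [(t + r) / 2 for t, r in zip(test_scores, reference_scores)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    mean_of_means = sum(means) / n

    sxx = sum((m - mean_of_means) ** 2 for m in means)
    sxy = sum((m - mean_of_means) * (d - mean_diff) for m, d in zip(means, diffs))
    return mean_diff, sxy / sxx


# Hypothetical MedDiet scores from the FFQ and from dietary records.
ffq_scores = [5, 4, 6, 3, 5, 4, 7, 2]
record_scores = [4, 4, 5, 3, 4, 3, 6, 2]
bias, slope = bland_altman_bias(ffq_scores, record_scores)
print(f"mean difference = {bias:.2f}, slope = {slope:.3f}")
```

A positive mean difference combined with a slope near zero corresponds to the pattern reported here: a constant overestimation that does not depend on the level of the score.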
4. Discussion
The results of this validation study show a good test–retest reliability and a fair to moderate validity of the FFQ for pregnant women. The questionnaire adequately ranked women according to their adherence to the MedDiet. Additionally, the construct of the MedDiet score was valid.
FFQs are useful tools for assessing long-term dietary intake in epidemiological studies [
17]. Few FFQs capturing the complete diet have been developed and validated in European populations of pregnant women during the last 20 years [
18,
19,
20,
21,
22,
23,
24]. Of those that have, only test–retest reliability and concurrent validity were analyzed, using 24 hRs or dietary records as the reference method. In this study, the average test–retest reliability for 12 food groups and 17 nutrients was moderate (ICC = 0.55) and good (ICC = 0.61), respectively. This is somewhat higher than that found by Vioque et al. [
21] who reported moderate reliability for 29 nutrients (r = 0.51) and 17 foods (r = 0.41). A Finnish study found a higher average correlation (ICC = 0.65) for all foods and nutrients [
18]. However, these comparisons are somewhat limited due to the different amounts of foods and nutrients considered and, in the case of Vioque et al. [
21], different statistical methods used. Additionally, the test–retest reliability of diet in these studies is based on the assumption that differences found between the two estimations are mainly due to measurement errors and less to alterations in dietary habits. In the present study, one third of the pregnant women were allocated to a nutritional intervention program and have, therefore, likely changed their dietary habits during the test–retest period. This fact might partially explain the magnitude of the test–retest reliability of the present FFQ.
In this study, the concurrent validity of the FFQ compared with dietary records ranged from poor (olive oil ρ = 0.12) to good (dairy products ρ = 0.63) for foods and from poor (vitamin B12 ρ = 0.10) to moderate (vitamin C ρ = 0.47) for nutrients. These findings are comparable with previous reports of the validity of FFQs in European pregnant women [
22,
23,
24,
25,
26,
27]. The poor concurrent validity of olive oil in the present study was somewhat surprising because olive oil is a characteristic food of the Mediterranean diet; hence, it is unlikely that the relatively short recording period of the reference method explains this finding. However, the objective measure of olive oil consumption by its corresponding biological marker revealed a better correlation. In contrast, vitamin B12 intake derived by the FFQ correlated poorly with both the corresponding data from the reference method and the biological marker. One might argue that this reflects the difficulty in estimating meat servings, but the correlations of meat and processed meat between methods were substantially better than that for vitamin B12. The poor correlation of FFQ-derived vitamin B12 with its biological marker might be biased by dietary supplements containing vitamin B12.
Comparing the concordance between the methods applied in this study with that of other publications is challenging, as most validation studies present proportional agreement instead of kappa statistics. Proportional agreement is easy to interpret, but it does not account for agreement between two measures that occurs by chance. In this study, the concordance between food and nutrient intake derived by the FFQ and food records ranged from poor to moderate with an average weighted kappa of 0.21 and 0.28 for foods and nutrients, respectively. This finding indicates a fair overall concordance between the FFQ and the reference method.
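The difference between proportional agreement and a chance-corrected weighted kappa can be made concrete with a short sketch. It assumes linear disagreement weights and intake categories coded 0, 1, 2, ...; neither the weighting scheme nor the number of categories used in this study is stated in this section, and all names and data are hypothetical.

```python
def proportional_agreement(a, b):
    """Share of subjects placed in the same category by both methods."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def weighted_kappa(a, b, n_categories):
    """Linear-weighted Cohen's kappa for categories coded 0..n_categories-1.

    Undefined (division by zero) if either method puts every subject
    in the same single category.
    """
    n = len(a)
    c = n_categories
    observed = [[0.0] * c for _ in range(c)]
    for i, j in zip(a, b):
        observed[i][j] += 1 / n

    row_totals = [sum(observed[i]) for i in range(c)]
    col_totals = [sum(observed[i][j] for i in range(c)) for j in range(c)]

    # Weighted disagreement: observed, and expected if the methods were independent.
    observed_dis = 0.0
    expected_dis = 0.0
    for i in range(c):
        for j in range(c):
            weight = abs(i - j) / (c - 1)
            observed_dis += weight * observed[i][j]
            expected_dis += weight * row_totals[i] * col_totals[j]

    return 1 - observed_dis / expected_dis


# Hypothetical tertiles of intake (0 = lowest) from the FFQ and food records.
ffq_tertiles = [0, 0, 1, 1, 2, 2, 0, 1, 2]
record_tertiles = [0, 1, 1, 2, 2, 1, 0, 0, 2]
print(proportional_agreement(ffq_tertiles, record_tertiles))
print(weighted_kappa(ffq_tertiles, record_tertiles, 3))
```

For two categories with ratings `[0, 0, 1, 1]` and `[0, 1, 0, 1]`, proportional agreement is 0.5 while kappa is 0, because that level of agreement is exactly what chance alone would produce.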
Food records and 24 hRs are frequently used as gold standards in validation studies of dietary assessment [
17]. These reference methods are not free from measurement errors; however, their errors are independent of those of an FFQ [
17]. The objective measurement of food and nutrient intake by corresponding biological markers yields a more robust estimate of an FFQ's validity. In this study, objective measurements of plasma levels of both folic acid and α-linolenic acid and of urinary hydroxytyrosol were fairly correlated (
p < 0.05) with their corresponding FFQ-derived nutrient or food. Poor correlations were found for olives (r = 0.07;
p = 0.58) and vitamin B12 (r = 0.09;
p = 0.30). The magnitude of correlation was somewhat better for folic acid (0.12 vs. 0.25), and similar for vitamin B12, compared with that found by Vioque et al. [
21]. Usually, validation studies compare intake estimates from the instrument under evaluation with those from a reference method to assess whether the instrument correctly classifies reported intakes of nutrients and food groups. From these results, nothing can be deduced regarding the accuracy of the ranking for a composite score of multiple nutrients and/or foods, such as the MedDiet score; evidence for the validity of such predefined indices is scarce [
22,
23,
24]. Benitez-Arciniega et al. [
25] also reported a moderate (r = 0.48) correlation between a modified MedDiet score derived by an FFQ and repeated 24 h dietary recalls in a Mediterranean population. Stronger concurrent validity was found for the traditional and alternate MedDiet score in German women [
27] and a multiethnic Asian population, respectively [
27]. In comparison with this study, the agreement between test and reference method was considerably stronger in the Asian group but only slightly different in the German population.
Construct validity should also be considered when selecting a dietary assessment tool. We hypothesized that both the FFQ- and dietary records-derived dietary quality scores would show a positive correlation and that both of the FFQ-derived dietary quality indices would be positively associated with a favorable nutrient intake profile estimated by food records. Intakes of 17 nutrients were associated in the anticipated direction with MedDiet score ratings derived by the FFQ, and nearly 50% of these associations were significant. These findings are in line with those of Benitez-Arciniega et al. [
25] who reported good construct validity of two MedDiet scores by correlating nutrient intake derived by multiple food records with the MedDiet scores.
The strength of the present study is that validity was determined by correlations with foods and nutrients derived by a self-reported reference method and biological markers of food and nutrient intake. An additional strength is the inclusion of the validity of MedDiet adherence by a composite score. However, as in all validation studies, an inherent limitation is that reference methods such as multiple dietary recalls or records are themselves not free from error [
17]. Food records, for example, require motivated subjects and place a high burden on the participants. The ideal choice of reference method is weighed food records, which are considered the “gold standard”. However, the administration of food records or weighed food records may lead to participants changing their diet during the recording period. Food frequency questionnaires, on the other hand, are prone to memory bias because these questionnaires ask about food intake retrospectively. Furthermore, estimating the average consumption frequency of seasonal foods is especially difficult, and the fixed food list and fixed portion sizes are further sources of measurement error. Finally, the use of the FFQ to present data on absolute intakes of foods and nutrients is limited without prior calibration of these data by a reference method. This is especially the case for foods and nutrients with poor concurrent validity and concordance.