#### **1. Introduction**

Artificial sweeteners are widely used in a variety of foods and beverages as sugar substitutes that mimic the effect of sugar on taste without adding calories. However, many consumers perceive artificial sweeteners negatively, not only because of aversive sensations such as a bitter off-taste [1,2] but also because of potential health risks, and they demand more natural options [3]. To respond to consumers' demand for natural sugar substitutes with low or zero calories, the food industry has focused on stevia, a natural high-intensity non-nutritive sweetener. Stevia (*Stevia rebaudiana* Bertoni) is a shrub native to Paraguay, and its leaves have been used to sweeten teas for hundreds of years in Paraguay and Brazil [4,5]. Stevia is the source of many different types of steviol glycosides, which are the sweetening compounds in stevia leaves [6]. Stevioside and rebaudioside (Reb) A are the major sweet compounds among the steviol glycosides [6] and are the most widely used steviol glycosides on the market according to a Mintel Global New Products Database (GNPD) product search [7]. However, stevioside and Reb A exhibit a bitter and licorice off-taste [8–13], which poses challenges to product formulation.

To overcome the taste challenges of stevioside and Reb A, researchers and the food industry have looked into other minor steviol glycosides in stevia leaves to provide better sugar reduction solutions. Several studies have reported that two minor steviol glycosides, Reb D and M, elicit significantly less bitterness with better sweetness than Reb A and also work well in products without sacrificing taste [14–19]. Prakash et al. [16] reported that Reb M had less bitterness and astringency than Reb A. Most of the studies investigating sensory characteristics of steviol glycosides were conducted within a specific range of 5–10% sweetness equivalency relative to sucrose (SE) [8,10,11,13,16]. Little research has been done at high concentrations for high-sugar applications such as frozen desserts, which generally contain 13–22% sucrose *w*/*v* [20], although the sweetness potency of stevia depends heavily on the SE [13].

For sensory characterization of food products, sensory descriptive analysis using trained assessors is the most widely used method, but training a panel is time-consuming. Less time-consuming and more flexible methodologies that use consumers, such as check-all-that-apply (CATA) questions or intensity scales, have been discussed in the last two decades [21]. It has been reported that consumers were capable of evaluating sensory attributes of various products, showing good agreement with trained assessors in terms of discrimination, reproducibility, and consensus [22–25]. Although Worch et al. [24] found that trained panelists showed greater consensus with one another, the larger sample size of consumers compensated for the higher variability. Moskowitz [26] suggested that a minimum of 40–50 people was needed to get stable averages, and that the averages would not be affected much by the base size once the participant number exceeded 80. Ares et al. [27] also indicated that 80 consumers would be sufficient to get stable results when samples had large differences, but caution would be needed if samples had smaller differences or more complex attributes.

CATA is also often used to determine the characteristics of a product from a consumer perspective: consumers describe a product by selecting the terms from a given list that match the product [28]. CATA questions have been used for a variety of foods and beverages [28–32], and these studies showed that CATA was a simple way to understand consumer perception of the sensory profile of a product.

Phenylthiocarbamide (PTC) and 6-n-propylthiouracil (PROP) are bitter-tasting compounds that have been used to test people's sensitivity to bitter taste. Supertasters are people who perceive an intense bitter taste from PTC and PROP, while non-tasters barely detect their bitterness [33]. It has been reported that individuals differ in their sensitivity to the aftertaste of high-intensity sweeteners [34], and thus researchers have long been interested in understanding the relationship between PROP status (e.g., non-tasters vs. supertasters) and perceived taste intensities of high-intensity sweeteners. Bartoshuk [35] and Drewnowski et al. [36] found a significant difference between non-tasters and supertasters in the bitterness of saccharin at low concentrations. Zhao and Tepper [37] also suggested that supertasters perceived more bitterness and sweetness than non-tasters in carbonated soft drinks with artificial sweeteners, including sucralose, aspartame, and acesulfame-K. However, Horne et al. [38] did not find a relationship between PROP taster status and the sweetness and bitterness of saccharin and acesulfame-K. Rankin et al. [39] likewise failed to find any significant difference in bitterness between supertasters and non-tasters in cola drinks sweetened with artificial sweeteners. Risso et al. [40] looked into the effect of genetic variations on stevioside perception and found that the bitter taste receptor for PROP did not predict the bitterness perception of stevioside. However, little research has been done to investigate the influence of PROP status on the perceived sweet and bitter taste intensities of novel steviol glycosides such as Reb D and M.

The primary objective of this study was to determine sensory characteristics of Reb A, D, and M, compared to 14% (*w*/*v*) sucrose, using a consumer panel. A secondary objective was to determine if there is a relationship between PROP taster status and the perceived intensities of the three steviol glycosides.

#### **2. Materials and Methods**

#### *2.1. Materials*

Sweeteners used in the study were 95% Reb A (ENLITEN® 30000015 High Intensity Sweetener, Ingredion, Westchester, IL, USA), 95% Reb D (BESTEVIA® Reb D stevia leaf sweetener, Ingredion, Westchester, IL, USA), 95% Reb M (BESTEVIA® Reb M stevia leaf sweetener, Ingredion, Westchester, IL, USA), and sucrose (Smidge & Spoon™, Kroger, Cincinnati, OH, USA). PROP (6-Propyl-2-thiouracil, #P3755, Sigma-Aldrich, St. Louis, MO, USA), NaCl (Sigma-Aldrich, St. Louis, MO, USA), and filter papers (1.5 cm dia., VWR Scientific Products, West Chester, PA, USA) were used to make paper disks for supertaster screening.

#### *2.2. PROP Status Determination*

The paper disks for PROP status determination were prepared following the method described by Zhao et al. [41]. Blank, NaCl, and PROP disks were prepared. Blank disks were used as the control. NaCl disks were made by placing filter papers in 1.0 mol/L NaCl solution for 30 s at room temperature and then oven-drying them for 1 h at 121 °C (250 °F). A 50 mmol/L PROP solution at boiling temperature was used for the PROP disks.

PROP testing and classification were based on Zhao et al. [41] and Zhao and Tepper [37]. The Michigan State University SONA Paid Research Pool (https://msucas-paid.sona-systems.com) was used to recruit participants aged between 18 and 55. Participants were instructed to rinse their mouth with water, taste the paper disk for 15 s or until the disk was wet, discard the paper disk, and then rate the perceived intensity of the taste on the labeled magnitude scale (LMS). The participants tasted a blank, a NaCl, and a PROP disk, in that order, with a 30 s break between samples to minimize fatigue and carryover. The set was repeated after a 5 min break.

The LMS is a 100 mm vertical scale with quasi-logarithmic spacing and verbal labels from "barely detectable" to "strongest imaginable" [42]. The scale was set up as "no sensation" = 0, "barely detectable" = 1.5, "weak" = 6, "moderate" = 17, "strong" = 35, "very strong" = 52, and "strongest imaginable" = 100 [43,44]. The PROP score of each participant was calculated as the mean of the two replicates. Because the LMS is not equally spaced, a given difference in millimeters represents a smaller perceptual difference when both ratings are at the higher end of the scale than when they are at the lower end. If the difference between the two ratings was greater than 30 mm, or greater than 40 mm when both ratings were higher than "very strong", the participant was considered to have poor reproducibility and was not invited to the subsequent water solution testing. Out of 224 participants, 27 were excluded.
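The replicate screening rule described above can be sketched in Python as follows. This is a minimal illustration rather than the procedure or software actually used in the study; the function name `screen_prop_replicates` is hypothetical, and reading "higher than 'very strong'" as a strict `> 52` comparison is an assumption.

```python
VERY_STRONG_MM = 52.0  # LMS position of "very strong", as given in the text


def screen_prop_replicates(rating_1, rating_2):
    """Return (mean PROP score, is_reproducible) for two LMS ratings in mm.

    Assumption: "higher than very strong" is read as strictly greater than 52.
    """
    difference = abs(rating_1 - rating_2)
    both_high = rating_1 > VERY_STRONG_MM and rating_2 > VERY_STRONG_MM
    # A wider tolerance (40 mm) applies when both ratings sit at the upper end
    # of the scale; otherwise the tolerance is 30 mm.
    tolerance = 40.0 if both_high else 30.0
    mean_score = (rating_1 + rating_2) / 2
    return mean_score, difference <= tolerance
```

Under these assumptions, a participant with ratings of 55 and 90 mm would be retained (difference of 35 mm, both ratings above "very strong"), whereas ratings of 10 and 45 mm would lead to exclusion.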

Initially, PROP scores of "moderate" or below (≤17 mm on the LMS) and "very strong" or above (≥52 mm on the LMS) were used to group participants into non-tasters and supertasters. The group means and 95% confidence intervals were then calculated to set new cut-off scores. The new cut-off score was 10.3 for non-tasters and 70.7 for supertasters. Participants with scores in between were classified as medium tasters. When the PROP score of a participant was borderline, the NaCl score was used to help classify the person [41]. A participant was classified as a non-taster if the person gave a borderline non-taster score and rated NaCl much higher than PROP (~30 mm difference on the LMS). When a participant was at the supertaster borderline and gave a much lower NaCl score than PROP, the person was classified as a supertaster. Of the 197 remaining participants, 25 were identified as non-tasters and 55 as supertasters.
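As an illustration, the final cut-off scores can be applied as in the sketch below. The function name `classify_prop_status` is hypothetical, and treating both cut-offs as inclusive is an assumption because the text does not state how ties were handled. The NaCl-based adjustment for borderline scores is deliberately left out, since the width of the borderline zone is not specified.

```python
NON_TASTER_CUTOFF = 10.3   # mm on the LMS, from the text
SUPERTASTER_CUTOFF = 70.7  # mm on the LMS, from the text


def classify_prop_status(prop_score):
    """Classify a mean PROP score (mm on the LMS) into a taster group.

    Assumption: both cut-offs are inclusive. Borderline cases, which the
    study resolved using NaCl ratings, are not handled here.
    """
    if prop_score <= NON_TASTER_CUTOFF:
        return "non-taster"
    if prop_score >= SUPERTASTER_CUTOFF:
        return "supertaster"
    return "medium taster"
```

For example, `classify_prop_status(40.0)` returns `"medium taster"` under these assumptions.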

#### *2.3. Subjects Demographics*

Following the PROP test, participants were asked to provide some basic demographic information, including age, gender, ethnicity, educational level, weight, height, health condition, consumption frequency of low/zero sugar added products, consumption of sweeteners on a regular basis (at least once a month), and familiarity with stevia.

#### *2.4. Consumer Testing*

#### 2.4.1. Samples and Sample Preparation

All solutions were prepared using deionized water, and sample concentrations are expressed as percentages (*w*/*v*). Sucrose at 14% was chosen as the control. Reb A, D, and M at 0.09% were used in a preliminary test (*n* = 31) to determine their sweetness relative to 14% sucrose. The result showed that 0.09% Reb M was not statistically different from 14% sucrose in sweetness intensity (*p* = 0.16), but there was still a 1.1 difference on a marked 15-cm line scale with "not at all" and "extremely" as endpoint anchors. Another preliminary test (*n* = 65) was then conducted to confirm the sweetness equivalency of Reb M to sucrose, using 0.09% and 0.12% Reb M and 10% and 14% sucrose. The result indicated that neither 0.09% nor 0.12% Reb M was significantly different from 14% sucrose (*p* = 0.34 and *p* = 0.11, respectively), with 0.09% Reb M closer to 14% sucrose at a 0.5 difference on the 15-cm line scale, compared with a 1.0 difference between 0.12% Reb M and 14% sucrose. Since 0.09% Reb M was again lower in intensity on the scale, Reb M at 0.10% was chosen for the consumer testing. Reb A and D at 0.10% were used so that the sensory characteristics of the three steviol glycosides could be compared at the same concentration. Thus, the samples used for the testing were Reb A, D, and M at 0.10%, and 14% sucrose. The consumer test lasted four days, and fresh samples were prepared one day before each testing day. Ten milliliters of each solution was measured into a 1 oz soufflé cup and stored in the refrigerator (4 °C) prior to serving.
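The concentrations above are percentages (*w*/*v*), i.e., grams of solute per 100 mL of solution. The short sketch below shows the arithmetic for converting them to g/L and for the solute mass in one 10 mL serving. The helper names are hypothetical and are included only to make the unit conversion explicit.

```python
def percent_wv_to_g_per_l(percent_wv):
    """Convert % (w/v), i.e. g per 100 mL, to g/L."""
    return percent_wv * 10.0


def solute_mass_g(percent_wv, volume_ml):
    """Mass of solute (g) contained in volume_ml of a percent_wv solution."""
    return percent_wv / 100.0 * volume_ml


if __name__ == "__main__":
    # Concentrations used in the consumer test, as stated in the text.
    samples = {"sucrose": 14.0, "Reb A": 0.10, "Reb D": 0.10, "Reb M": 0.10}
    for name, pct in samples.items():
        print(f"{name}: {percent_wv_to_g_per_l(pct):.1f} g/L, "
              f"{solute_mass_g(pct, 10.0) * 1000:.0f} mg per 10 mL serving")
```

By this arithmetic, 14% sucrose corresponds to 140 g/L (1.4 g per 10 mL serving), and a 0.10% steviol glycoside solution corresponds to 1.0 g/L (10 mg per serving).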

#### 2.4.2. Testing Procedure

This study was approved by the Institutional Review Board of Michigan State University (East Lansing, MI, USA) (Study ID: STUDY00004019). SIMS 2000 software (SIMS Sensory Software, Morristown, NJ, USA) was used to create and administer the questionnaire.

Consumers were instructed to rate the sweetness and bitterness intensities of the solutions on a 15-cm line scale at three time points: while the solution was in the mouth, 5 s after expectorating it, and 1 min after expectorating it. Consumers were asked to pinch their nose while holding the solution in the mouth to focus on the taste. The sweet and bitter tastes perceived at this time are called in-mouth sweetness and bitterness throughout this paper. The perceived intensities of sweet and bitter tastes 5 s after expectorating are referred to as immediate sweetness and bitterness. A check-all-that-apply (CATA) question on the taste followed the evaluation of the immediate tastes. It included terms collected from an open-ended question in the two preliminary tests (*n*<sub>1</sub> = 31 and *n*<sub>2</sub> = 65) that asked whether the consumers noticed any aftertaste. The term *pleasant* was added to the list as a positive word, and *spicy* was added as an attention check to identify careless respondents and was to be removed from the correspondence analysis. The final CATA list consisted of 15 terms, presented in alphabetical order: *artificial*, *bitter*, *chemical*, *honey*, *licorice*, *metallic*, *minty*, *pleasant*, *pungent*, *spicy*, *sweet*, *tangy*, *tart*, *tingling*, and *vanilla*. A 45 s break was enforced after the CATA question, before the sweet and bitter tastes were evaluated 1 min after expectorating; these perceived intensities are referred to as lingering sweetness and bitterness. Water and crackers were provided as palate cleansers between samples.

#### *2.5. Statistical Analyses*

Data analysis was performed using XLSTAT (AddinSoft, New York, NY, USA). Intensity data were analyzed using a one-factor ANOVA model. For CATA analysis, the frequencies of each attribute were counted. Cochran's Q test was performed for each attribute to compare the difference among samples. Multiple pairwise comparisons using critical difference (Sheskin) were performed when the attribute was significant (*p* < 0.05). Correspondence analysis (CA) was generated to visually show the relationship between sensory attributes and samples. A two-way ANOVA model was used to determine the effect of PROP taster status, sweetener, and their interaction. Fisher's least significant difference (LSD) post hoc test was performed when *p* < 0.05. Agglomerative hierarchical clustering (AHC) was used as a second way to classify PROP groups. Pearson correlation test was performed and correlation coefficients were calculated between PROP bitterness and sweet and bitter tastes of Reb A, D, and M combined over time (in-mouth, immediate, lingering sweetness and bitterness).
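For readers unfamiliar with Cochran's Q test, the statistic for a single CATA attribute can be computed as in the sketch below. The study itself used XLSTAT, so this pure-Python version is only an illustration of the standard formula; the function name and the data layout (one row per consumer, one 0/1 entry per sample) are assumptions.

```python
def cochrans_q(responses):
    """Cochran's Q statistic for one CATA attribute.

    responses: list of rows, one per consumer; each row holds a 0/1 value
    per sample (1 = term checked). Returns (Q, degrees of freedom).
    """
    k = len(responses[0])  # number of samples
    if any(len(row) != k for row in responses):
        raise ValueError("every consumer must have one entry per sample")
    column_totals = [sum(row[j] for row in responses) for j in range(k)]
    row_totals = [sum(row) for row in responses]
    grand_total = sum(row_totals)
    denominator = k * grand_total - sum(r * r for r in row_totals)
    if denominator == 0:
        raise ValueError("Q is undefined when every consumer checks all or none")
    numerator = (k - 1) * (k * sum(c * c for c in column_totals) - grand_total ** 2)
    return numerator / denominator, k - 1
```

Under the null hypothesis, Q approximately follows a chi-square distribution with *k* − 1 degrees of freedom. With four samples (3 degrees of freedom), the critical value at *p* = 0.05 is 7.815, so a larger Q indicates that the proportion of consumers checking the term differs among samples.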

#### **3. Results**

#### *3.1. Participant Characteristics*

A total of 126 naïve consumers completed the study, with an average age of 23 ± 1.7 years and an average BMI of 24.7 ± 4.6 kg/m<sup>2</sup> based on self-reported height and weight. None of the participants had heart disease, cancer, or diabetes. The socio-demographics of participants are shown in Table 1. The majority were female (72.2%), and 60.3% of the participants identified themselves as white. Table 2 lists the responses to the questions on sweetener consumption behavior. Sucrose (81.0%) and honey (69.8%) were the most commonly consumed sweeteners on a regular basis (at least once a month), followed by the high-intensity sweeteners stevia (19.8%), sucralose (19.0%), and aspartame (19.0%). Other sweeteners consumed (8.7%) included maple syrup, brown sugar, xylitol, high fructose corn syrup, and acesulfame K. Sixty-seven percent of participants consumed low or zero sugar added products at least once a month. More than half of the participants (54.8%) said they were somewhat or very familiar with stevia.


**Table 1.** Socio-demographic characteristics of participants (*n* = 126).




**Table 2.** *Cont*.

<sup>1</sup> This is a check-all-that-apply question.

#### *3.2. Sensory Characteristics*

#### 3.2.1. Intensities of Sweet and Bitter Tastes

Table 3 summarizes the mean intensity ratings (± SEM) for four sweetener solutions evaluated by all participants. The decreasing trend in sweetness and bitterness intensities from in-mouth to immediate (5 s after expectorating the samples) to lingering (1 min after expectorating the samples) indicated that consumers followed the directions and evaluated the samples correctly, since a fading in intensity over time was expected.

**Table 3.** Mean intensity scores (± SEM) of sweetener solutions by participants (*n* = 126).


<sup>1,2</sup> In-mouth tastes (sweetness and bitterness) were evaluated when the solution was in the mouth; immediate tastes were evaluated 5 s after expectorating the sample; lingering tastes were evaluated 1 min after expectorating the sample. <sup>3</sup> Intensities were evaluated on a marked 15-cm line scale anchored with "not at all" to "extremely". <sup>4</sup> Different letters in the same column show the significant differences between sample means at *p* < 0.05 by Fisher's LSD.
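The mean ± SEM values reported for the intensity ratings follow the usual definition of the standard error of the mean (sample standard deviation divided by the square root of the number of ratings). A minimal sketch of that calculation is given below; the function name is hypothetical, and the ratings passed to it in any example are illustrative rather than study data.

```python
import math
import statistics


def mean_and_sem(ratings):
    """Return (mean, standard error of the mean) for a list of ratings."""
    n = len(ratings)
    if n < 2:
        raise ValueError("at least two ratings are needed to estimate the SEM")
    mean = statistics.mean(ratings)
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n).
    sem = statistics.stdev(ratings) / math.sqrt(n)
    return mean, sem
```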

The in-mouth sweetness of 14% sucrose and 0.1% Reb M were not significantly different (*p* = 0.55). The in-mouth sweetness of Reb D was slightly lower than that of sucrose but was still considered to be not different from sucrose (*p* = 0.19). Reb A showed significantly less in-mouth sweetness than Reb M and sucrose (*p* < 0.01 and *p* < 0.05, respectively). Reb M had the highest immediate sweetness among the samples and was significantly different from the others. The sweetness of Reb M remained the highest after one minute. The lingering sweetness of Reb M (intensity = 5.3) was higher than that of Reb D (intensity = 4.5), but the difference was not significant (*p* = 0.05). Reb D was higher in lingering sweetness than sucrose (intensity = 3.6), but it was not significantly different (*p* = 0.05). The participants rated the in-mouth bitterness of sucrose, Reb D, and Reb M around 1, while the rating of Reb A was at 3.5 on a 15-cm line scale. The bitterness of Reb A persisted after 5 s (intensity = 3.5). Reb D was perceived to have more immediate bitterness than sucrose (*p* < 0.05), and there was no significant difference in the immediate bitterness between Reb M and sucrose (*p* = 0.27). While the lingering bitterness of sucrose, Reb D, and Reb M was at a minimum, Reb A still had detectable bitterness remaining (intensity = 1.6).

#### 3.2.2. CATA

Table 4 summarizes the total counts of CATA attributes selected by the consumer panel (*n* = 126) to describe the aftertaste of each sweetener solution. As expected, *sweet* was the most frequently used term and *spicy* the least. Significant differences among samples were found in 10 out of 15 attributes (*p* < 0.05). Reb A, D, and M were described as *artificial* more frequently than sucrose. Reb A was described as *bitter* and *chemical* significantly more often than the other sweeteners, and fewer participants considered Reb A *sweet* and *pleasant*. *Honey* and *vanilla* were checked the most for sucrose, followed by Reb D and M, while Reb A was rarely associated with these two terms. *Licorice*, *metallic*, *minty*, *pungent*, *spicy*, *tangy*, *tart*, and *tingling* were rarely selected by participants, with no more than 15 counts for each sample. Among those 8 less-checked terms, *licorice*, *pungent*, *spicy*, *tangy*, and *tingling* were not significantly different among samples.


**Table 4.** Total counts of check-all-that-apply attributes for sweetener solutions.

\* indicates *p* < 0.05, \*\* indicates *p* < 0.01, \*\*\* indicates *p* < 0.001, and ns indicates no significant differences among samples. Different letters in the same row indicate the significant differences between sample means at *p* < 0.05 by Critical Difference (Sheskin).

The sensory attributes of the sweeteners are summarized visually in Figure 1. The first two dimensions explained 96% of the variation. *Honey* and *vanilla* were associated with sucrose. Reb A was close to *metallic*, *bitter*, and *chemical*. Reb D and M were similar to each other and were closer to sucrose than Reb A was. Reb D and M were mostly associated with the positive words, but *artificial* was positioned between Reb A and the Reb D and M pair.

**Figure 1.** Correspondence analysis (CA) of sweeteners. Gray color indicates non-significant attributes; Red color indicates significant attributes; Samples are in blue.
