1. Introduction
Front of pack (FOP) nutritional labels have attracted considerable attention in the marketing and nutrition research literature over the last two decades [
1,
2]. The purpose of an FOP nutritional label is to offer guidance to consumers on the nutrition status of the food product to which it is attached. It has been a requirement of most developed countries that food products carry specific information as to their ingredients, source and nutritional content on a dedicated nutrition information panel (NIP). However, this information is usually complex, cryptic and placed on areas other than the front/prime facing, where it is unavailable for use in the fast visual evaluations that characterize the large majority of fast moving consumer goods (FMCG) and food purchases [
3].
FOP labels seek to augment this existing nutritional information, and to present it in a manner that makes it available as an input to these high speed consumer decisions. There is as yet no consensus as to the best way that this might be done, and consequently a very wide variety of third party FOP label formats are proposed or in use around the world [
4]. In addition, there is an increasing number of “proprietary” FOP labelling systems that are the private property of individual manufacturers and retailers, which further increases communication “noise” and the potential for consumer confusion [
5]. Despite this variety, there is one underlying and uniform theoretical “backbone” to all these labels. They can all be placed upon a single continuum anchored at one end by FOP labels that are fully reductive in nature, and at the other by FOP labels that are fully evaluative (
Figure 1).
A fully reductive FOP label presents factual information only, without expressing any opinion. The percentage daily intake (PDI) label shown in
Figure 1 is an example of this type [
7]. A fully evaluative label presents opinion only, without information. The “Swedish Keyhole” label shown in
Figure 1 is an example of this type [
8]. The keyhole image transmits no factual information whatsoever; it merely expresses an unqualified third party opinion that this product is nutritionally “good”. In between these two extremes lie a variety of hybrid FOP systems that contain elements of both information and opinion [
9]. The Traffic Light Label (TLL) is the most widely deployed hybrid FOP label, and a wide range of nutritional information (e.g., percentage daily intake of fat) and opinion (related TLL colour) mixes are available within the broad TLL format (
Figure 1).
There is a second typological continuum, in addition to the anchored continuum described above, that is based upon the manner in which the information is transmitted. Fully reductive formats, such as the PDI label, are “ratio” in nature. They express information in regular units with a specific zero point. Purely evaluative labels are “binary” in nature; they express the third party opinion by the presence or absence of the relevant mark (either the food product has a Swedish keyhole, or it does not; there are only two available states). The intermediate types are all ordinal in nature. They can express several statuses for the product. This ordinal hierarchy can get quite complex. For instance, the four dimension, three status (green/amber/red) traffic light mark that is in the middle of the continuum in
Figure 1 can express nine different nutritional “scores” in any one of 3^4 (81) different ways. If the four traffic lights of a four light array are taken to be fully compensatory, with “red” scoring zero, “amber” scoring one and “green” scoring two, then the label is able to express nine possible discrete consolidated nutritional “scores” for the product, ranging from four reds (zero) to four greens (eight). The 81 possible combinations are distributed among these nine scores in a symmetric, bell-shaped pattern (score:count) as follows: 0:1, 1:4, 2:10, 3:16, 4:19, 5:16, 6:10, 7:4, 8:1.
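The count of 81 combinations and their spread across the nine scores can be checked by brute-force enumeration. The following is a minimal sketch, assuming the fully compensatory scoring described above (red = 0, amber = 1, green = 2); the function name `score_distribution` is introduced here for illustration only.

```python
from collections import Counter
from itertools import product

# Scoring assumed from the text: red = 0, amber = 1, green = 2.
LIGHT_SCORES = {"red": 0, "amber": 1, "green": 2}


def score_distribution(n_lights=4):
    """Count how many colour combinations produce each consolidated score."""
    counts = Counter()
    # Enumerate every possible colour assignment across the lights.
    for combo in product(LIGHT_SCORES, repeat=n_lights):
        counts[sum(LIGHT_SCORES[colour] for colour in combo)] += 1
    return dict(sorted(counts.items()))


distribution = score_distribution()
print(sum(distribution.values()))  # total number of colour combinations
print(distribution)                # score -> number of combinations
```

The enumeration makes the loss of information explicit: a middle score such as four can be produced by 19 different colour patterns, so the consolidated score alone does not identify which nutrients drove it.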
These two continuums, the nature of the information that is transmitted and the manner in which it is transmitted by the FOP, run in parallel with one another, and are represented by the double-headed arrow in
Figure 1. Several academic reviews of the extensive body of FOP research exist within the literature [
1,
2,
10,
11]. Two recent reviews have observed that the vast majority of the research to date has been developmental in nature in that it tests the efficacy of FOP label types [
10,
11]. Such developmental/efficacy studies typically involve the comparison of several different FOP label formats or sub formats. The dependent variables are typically consumer assessments of product healthfulness or a similar intermediate response measure that is then related to the various FOP label types used in the research.
The distinction between efficacy and effectiveness research is subject to a degree of interpretation. It may be drawn on the basis of research conducted in “real world” (effectiveness) and controlled (efficacy) environments. The researchers chose to draw it on the basis of the dependent variable being used, with studies using the “final” dependent consumer variable of consumer choice being an “effectiveness” test and research using the “intermediate” consumer variables of attitude, understanding, etc. being an “efficacy” test. Both definitions have their limitations, and the only study that would satisfy both these reasonable criteria for “effectiveness” would be a very large scale, real world experiment under controlled conditions with consumer choice/purchase as a dependent variable. Such a research exercise is theoretically possible, but is beyond the means of all researchers, bar a state agency operating with total industry support and cooperation at all levels. No such studies have been conducted to date.
Those studies that do test the effectiveness of FOP labels by measuring their effect on consumer choice or intentions are predominantly based upon scanner data of actual sales, which can create issues of effective control over longitudinal datasets in a free retail environment [
12,
13,
14,
15].
The two very detailed reports by the US Institute of Medicine (IOM) [
16,
17] concluded that the evidence to support performance of reductive FOP labels was lacking, and that future efforts to influence dietary choice should be based upon evaluative formats. They made a series of specific recommendations that the FOP system used should be: (1) Standardized across all fresh and packaged foods in a supermarket; (2) Simple and easy to understand; (3) Ordinal; (4) Interpretative via use of colour and symbols; (5) Supported by an ongoing promotional campaign [
17].
Lachat and Tseng made specific observations that the widespread introduction of FOP label systems was occurring before their effectiveness had been systematically tested, and they called for specific research in the area of direct FOP performance/effectiveness testing under controlled conditions [
11]. In response to this call, an experimental exercise was undertaken in New Zealand to test the impact that TLL (hybrid) and PDI (reductive) FOP nutritional labels had on consumer choice behaviours towards breakfast cereals of high and low nutritional status when measured in the field [
6]. A fully evaluative mark was not tested, as one was not available in New Zealand at the time.
The research results strongly indicated that consumers did “acquire” and process these labels, and that the labels consequently had a significant impact on consumer choice for all the cereal products that were tested. However, the impact on choice was not moderated by the information presented by the FOP labels. If an FOP label was present, then the consumers’ selection intent rose significantly and consistently, regardless of whether the nutritional status being communicated to the consumer by the FOP label was high or low. This could be more properly described as a malfunction, rather than a failure of the FOP labels, as they were processed by the sample, but were not processed in the expected manner. However, as the sole purpose of an FOP label is to support consumer discrimination on the basis of nutritional status, these results indicate that both formats were entirely ineffective in the research situation.
The results of the New Zealand study suggested that the consumers tested were in fact (mis)using the reductive and hybrid labels as a fully evaluative “binary” cue, much like the “Swedish Keyhole” system or the term “organic”. As the vast majority of proposed and existing FOP nutritional label systems outside Scandinavia are untested in terms of their impact upon unprompted consumer choice behaviour, and are either reductive or hybrid types, [
2,
11,
18,
19,
20,
21] the researchers concluded that replication and development of this small initial testing exercise was urgently required.
At this point, in mid-2014, the Australian New Zealand Food Standards Authority (ANZFA) announced the introduction of a common FOP nutritional label system that would be voluntary throughout Australasia [
22]. The “Health Star Rating” (HSR) FOP label is a hybrid type containing a reductive type PDI element, and a novel evaluative “star rating” system derived from the existing Energy Star Rating system for appliances [
23]. It incorporates four of the five recommendations made by the IOM [
16,
17], lacking only heavy promotion. The style guide allows a degree of latitude as to how the label is presented, but the “star disk” is a consistent feature (
Figure 2) [
24]. The star ratings are calculated for a small number of broad food categories from the nutritional information provided elsewhere on the product, using a website-delivered calculator [
25,
26].
The HSR system was introduced without testing its impact on consumer choice. The “Star Disk” principal component of the HSR system is a hybrid type that lies well to the evaluative end of the reductive/evaluative continuum. It is similar to the format suggested by the second IOM report [
17], and is well established as a system for indicating energy efficiencies in electrical goods in Europe, the United States and Australasia [
17]. The format also bears considerable similarities to the proprietary “Guiding Stars” system owned by the US retailer “Hannafords”. The Guiding Stars system is one of the very few that has empirical evidence to suggest that it effectively supports consumer purchase discrimination at the point of sale [
27].
However, there are also significant differences between the Hannafords and ANZFA systems. Most notably, the Hannafords system is simpler (three stars versus five), proprietary, and was backed up by considerable targeted communication effort in a retail environment that was controlled by its owner when the research was conducted. It is also notable that both of these features of the “Guiding Stars” system were recommended by the second IOM report [
17] (p. 4). The introduction of the HSR system was not without controversy, with some suggesting that the more established TLL system should have been deployed instead [
28,
29]. However, it should be noted that the TLL system is not as well established in Australasia as it is in Europe, has a very wide variety of presentation formats, with none of these dominant, and has, as yet, no consistent empirical evidence to support its effectiveness in influencing unprompted consumer choice [
6].
It was clear, therefore, that a much larger field test of this particular FOP label should be undertaken immediately, to establish if the new ANZFA HSR disk would influence consumer purchase behaviour as its proponents predicted, or if it would either not work at all or malfunction in the manner observed in the previously reported research [
6]. Two research hypotheses were therefore developed for testing:
H1—The HSR FOP label would significantly influence consumer choice.
H2—The HSR FOP labels’ impact upon consumer choice would be moderated by variations in the 0.5–5 “star” rating expressed by the label.
4. Discussion
The objective of this research was to test the impact of the HSR label on consumer choice with as simple and direct a methodology as possible. The ANZFA literature that supports the program indicates that there is no specific target market for the label [
25]. The researchers therefore selected a homogeneous but generic sample of retail cereal shoppers, acquired by qualified exit intercept from New Zealand’s leading retail store chain on a specific day of the week. The sample amounts to some 1.2% of the entire population of a cosmopolitan university city.
In order to maximize the internal validity of the exercise, a very tight research focus was adopted, with two very specific hypotheses and a very tightly controlled and focused research environment that nevertheless relied upon a field rather than a laboratory environment. The use of a homogeneous sample, a single product type, a full factorial design, a revealed consumer response and the choice dependent variable minimized the assumptions made to the greatest extent that was possible.
H1: “The HSR FOP label would significantly influence consumer choice” is supported by this result, as the consumer samples clearly took significant notice of the HSR FOP labels, and their presence exerted a significant influence upon consumer choice behaviour with regard to both products. The presence of the HSR FOP nutritional label consistently reduced consumer preference. This is shown by the strong downward slope of the two lines in
Figure 6. If the labels had failed to impact upon choice, then the two lines in
Figure 6 would be horizontal. This result is consistent with the earlier research undertaken by these authors [
6] in that the TLL and PDI labels also had a significant main effect impact upon consumer choice. However, the earlier research results indicated that the PDI and TLL FOP cues consistently increased consumer preference, whereas in this research, the HSR consistently reduced it.
The reasons for these differences in main effects are at present completely unknown, but may have their origins in subconscious consumer processing of the very different graphic designs of the two label systems. The mechanics of consumer reactions to package graphic design are at present poorly understood, but graphic design has been shown to have considerable and unpredictable effects on food consumer preference behaviour [
35].
H2: “The HSR FOP labels’ impact upon consumer choice would be moderated by variations in the 0.5–5 ‘star’ rating expressed by the label” is not supported by this result. The nutritional status reported by the two HSR FOP labels used varied widely between the two products, but this variation in status reported by the FOP had no observable differential impact upon consumer product choice. The two lines in the chart in
Figure 6 reflect this observation as they are almost precisely parallel with one another. This is also supported by the almost non-existent level of product × FOP label interaction reported in the analysis of variance, despite the very substantial main effects observed for both product and label in the same table. Had the nutritional status expressed by the HSR FOP label been a significant input to the consumer choice, then this effect would have been expressed as a product × FOP label interaction in the analysis of variance table, and the two lines on the chart would be at a distinct angle relative to each other.
Had the labels failed outright, then they would not have been used in the consumer evaluation and choice process. As a consequence there would have been no observable main effect for FOP label, and the two lines shown in the chart in
Figure 6 would have been parallel and horizontal. These three possible outcome scenarios are reproduced graphically in
Figure 7.
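The relationship between the three line patterns and the analysis of variance terms can be sketched with simple contrasts on a 2 × 2 table of cell means. This is a minimal illustration only: the cell means below are hypothetical and are not the study's data, the product names "A" and "B" and the function names are invented for the example, the layout (one line per product, label absent versus present) is an assumption, and the fixed tolerance `tol` is an arbitrary stand-in for a proper significance test.

```python
def factorial_contrasts(cells):
    """Return (label main effect, product x label interaction) for a 2 x 2 design.

    `cells` maps (product, label_present) to a mean choice score.
    """
    slope_a = cells[("A", True)] - cells[("A", False)]
    slope_b = cells[("B", True)] - cells[("B", False)]
    main_effect = (slope_a + slope_b) / 2  # average slope of the two lines
    interaction = slope_a - slope_b        # difference between the two slopes
    return main_effect, interaction


def describe(cells, tol=0.05):
    """Name the pattern; `tol` is an arbitrary stand-in for a significance test."""
    main_effect, interaction = factorial_contrasts(cells)
    if abs(interaction) > tol:
        return "moderated effect (lines at an angle)"
    if abs(main_effect) > tol:
        return "unmoderated main effect (parallel, sloped lines)"
    return "no effect (parallel, horizontal lines)"


# Hypothetical cell means, chosen only to illustrate each pattern.
moderated = {("A", False): 0.50, ("A", True): 0.25,
             ("B", False): 0.50, ("B", True): 0.75}
unmoderated = {("A", False): 0.75, ("A", True): 0.50,
               ("B", False): 0.50, ("B", True): 0.25}
no_effect = {("A", False): 0.75, ("A", True): 0.75,
             ("B", False): 0.50, ("B", True): 0.50}

for name, cells in [("moderated", moderated),
                    ("unmoderated", unmoderated),
                    ("no_effect", no_effect)]:
    print(name, "->", describe(cells))
```

In this sketch a label that worked as intended would show up in the interaction contrast, while a label that is noticed but not read for its content would show up only in the main effect contrast.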
The results achieved here are exactly consistent with scenario three of
Figure 7. They indicate that this large consumer sample’s choice behaviour was significantly depressed with regard to products that presented the HSR FOP, and that the consumer sample was also either unwilling, or unable, to use its informational features as an input to a differential evaluation of a food product in the manner expected by ANZFA. This outcome indicates a significant functional failure of the HSR FOP label in this research environment.
If this result is an outcome of a fundamental mismatch between the HSR FOP’s structured cognitive input, and a dominant, unstructured non-cognitive consumer decision process at the retail point of sale, then it may be a highly intractable issue, if it is tractable at all. The fact that H1 is supported, while H2 is not, does suggest that this may be a possibility.
The concept of potential failure via mismatch may be more easily understood when the nature of consumer communication via the package at the point of sale is considered.
Figure 8 shows a typology of cues, and their relative frequency of occurrence at the retail food point of sale. More than 99% of all cues presented to the consumer on the package at the point of sale are nominal (brands) or, much more rarely, binary (third party endorsements). Both of these types are processed by the consumer on the basis of their presence or absence, not by any range of states that they express, however simple.
Ordinal and ratio cues are virtually unknown as on-package consumer communication elements beyond the various FOP label formats that are available around the world. The “Gold”, “Silver” and “Bronze” wine awards are one of the very few known deployed examples of an ordinal cue [
36]. Nominal/binary cues and their related consumer processing pathways are thus the norm in retail food communication, a situation that is supported by commercial brand consumer research and communication budgets that run into the tens of billions of dollars worldwide [
37,
38]. The scale upon which nominal/binary cue based communication occurs at the retail point of sale means that any cue format that does not conform to this type, such as an ordinal cue, is likely to face a very challenging environment.
Research has shown that even when consumers are presented with the one truly common ratio cue, price, they will “convert” this information to a nominal/binary form if they are given the opportunity to do so (e.g., is it “on special” or not) [
39]. Retailers are well aware of this consumer tendency and exploit it widely via price/special branding programs that facilitate and encourage this process of ratio to binary cue conversion by the consumer [
40,
41].
It is now possible to relate this argument to the plot shown in
Figure 6, and to Scenario 3 of
Figure 7. Food consumers have been trained to react subconsciously to nominal or binary cues, presented as pictorial icons. Commercial food brands are the dominant type of such icons. As the presentation of these individual iconic cues is highly consistent, the individual consumer’s subconscious reaction to them is too. The massive intellectual property valuations attached to commercial brands and their registered designs are based upon the assumption that if these brand icons are applied to an appropriate food product, then, ceteris paribus, the plot of the type shown in Scenario 3 of
Figure 7 will occur [
42].
Figure 6 indicates that the HSR label was being processed by this sample in the same way that they would process an iconic brand. The results show that it was acquired and processed by the sample, and it was having a significant impact on their choices, but this impact was not moderated in any way by the number of stars presented on the label. The label was being treated as a single integrated iconic cue by the sample. The fact that the main impact of the HSR label was to significantly depress consumer selection is interesting, but not directly relevant to the HSR label’s failure to create a significant consumer distinction based upon the number of stars presented. Any attempt to explain this consistent depression in preference would be purely speculative at this point.
5. Conclusions
The result generated in this research is that the HSR FOP nutritional label is processed in some way by consumers in the sample, but it does not support consumer discrimination on the basis of the nutritional status reported by it. This is consistent with the consumer sample internally “converting” the ordinal HSR nutritional label into a nominal/binary type cue, and then using it in this form as an input for the purposes of supporting decision processing in a manner that is more familiar to them. This conversion process is theoretically plausible, and it has also been observed for other cue types such as price.
These results suggest a significant functional failure of the HSR FOP system in this instance. The implications of this are not encouraging for the backers of the HSR FOP, or of any of the other major hybrid/ordinal FOP systems. More field trials to test the relative impact of reductive, hybrid and evaluative FOP systems upon unprompted, revealed consumer choice are required to either confirm, modify, or dismiss this result.
While the researchers are satisfied that this research has a high degree of internal validity, there are some very clear limitations which relate to its very tight focus and consumer sample. The results were gathered within a single city in New Zealand, and any extrapolation to the entire population of New Zealand and Australia would have to be undertaken with caution. The same comments apply to the products used, which were restricted to two products in the cold cereals category. The area of consumer response to FOP labels represents an environment of considerable complexity, and considerable further situation-specific FOP label research needs to be conducted before generic statements such as “Food labelling would increase the amount of people selecting a healthier food product by about 17.95% (confidence interval: +11.24% to 22.46%)” [
43] (p. 201) can be made with any degree of confidence or credibility.
The hypothetical choice/purchase situation also has the capacity to introduce several forms of bias, including: social desirability bias relating to the consumer’s perception that this is an observed behaviour, forced exposure to a far more limited choice set than is usually available, and lack of tangible consequences for a decision that does not involve an actual purchase.
The HSR third-party FOP mark itself is of a type that is unique to Australasia, and other major markets are pursuing other FOP designs such as the TLL and PDI formats. While the TLL and PDI are of the same highly atypical ordinal/ratio cue type as the HSR label, and are in fact both slightly more extreme in this regard, a similar performance outcome for these label systems cannot be assumed on the basis of this commonality.