1. Introduction: The Need for Traceability of Food Ingredients
The European Union ([1] p. 8) defines “food traceability” as “the ability to trace and follow a food, feed, food-producing animal or substance intended to be, or expected to be incorporated into a food or feed, through all stages of production, processing and distribution”. Traceability enables consumers, regulators, and production and marketing actors in food supply chains to react to potential risks in food and feed, and to make sure that all food products within a certain territory are safe for consumption. A significant number of food poisoning cases are investigated by official health departments and reported each year, affecting tens of millions of people. Food services and restaurant chains suffer every year from lost sales through loss of consumer confidence as a result of such outbreaks.
Another common problem that may put consumers at risk, and that reliable food traceability systems could reduce, is food fraud [2]. Food fraud is a common type of crime that typically deceives the consumer by mislabeling food items to induce purchase, either by using ingredients other than the ones indicated or by making false claims about the sources of the ingredients.
It is paramount that, when food businesses identify potential risks to customers or brands, these risks can be traced back reliably and quickly to their sources and the suspect materials segregated, thus preventing them from reaching consumers.
Efficient traceability allows for targeted recalls, reducing the cost of recalling items that are in fact safe (false positives). It also allows for more accurate information to the public, reducing damage to the brand image. These costs can be very high, as has been demonstrated in recent cases such as the European dioxin contamination, Chinese milk products contaminated by melamine [3] and the horse meat salami scandal in Europe [4]. In several cases [5,6,7,8], food unsafe for human consumption was, according to the manufacturers, quickly identified and isolated. This protected consumer health from food poisoning risks and consumer good faith from counterfeit, malicious information or fraud, albeit without third party certification and at significant cost.
Recurring examples are available from the “Monthly Summary of Articles on Food Fraud and Adulteration” (https://ec.europa.eu/jrc/sites/jrcsh/files/jrc-food-fraud-summary-september-2017.pdf). Additionally, the FDA publishes several of the known food recall actions on its website. A brief analysis of this list typically shows about a dozen recall actions every month. The direct costs and brand damage involved in a recall action can hardly be overestimated.
Both in the food poisoning and the food fraud cases, a recall action of the suspect items is mandatory by law and necessary to maintain brand confidence. Any recall intervention should be as fast and as precise as possible, avoiding as much as possible the undue cost of recalling products that are safe.
The increasing discussion on genetically modified organism (GMO) labeling spurs the public to question the origin and the reliability of the track record of a food product. The final consumer has the right to know not only where the ingredients and associated products come from, and which processing methods were used, but also how reliable the sourcing information on the label is.
Consumers have not only shown increasing concern with the origin of the ingredients in their food, but some are apparently willing to pay more for quality. According to ([9] p. 1), “A portion of consumers display traceability consciousness, and are willing to pay a premium for traceable food products”.
For recipe based foods, i.e., foods that derive from processes of commingling a fixed proportion of ingredients, due to the complexity of tracing all ingredients, the focus has been on the traceability of some premium ingredients.
Case studies for full chain traceability, i.e., from farm to consumer, are available for foods in general [10], but none was found for recipe based foods. This is due to a direct conflict of interests between the consumers and the processors of recipe based food brands. The increasing interest of consumers in tracing specific ingredients in the recipe, rather than just believing in the claims on the labels, clashes with the manufacturer’s legitimate right to protect their business sensitive information concerning these recipes and suppliers. An efficient method that can be used for certification of ingredients without jeopardizing the commingler’s interests is needed. This method must reliably capture data on each transformation and transfer of custody of these specific ingredients along the food chain, in a transparent and trustworthy manner, without compromising the requirements of any stakeholder.
Blockchain technology (BCT) has been identified as advantageous for many applications requiring publicly verifiable certification and integrity, as in postal and lottery services [11,12]. Applications that also require transparency and privacy, such as Supply Chain Management (SCM), have adopted BCT, for example Skuchain (Skuchain, 2017. https://www.skuchain.com/), Provenance (Provenance, 2017. https://www.provenance.org/) and Everledger (Everledger, 2017. https://www.everledger.io/), as have applications for traceability of goods in complex distributed business environments [13,14,15].
One of the blockchain’s attributes that may help solve this conflict is the open nature of the code of the smart contracts [16,17]. A further property is the tamper proof and universally accessible nature of the information recording method [18].
The method also needs to detect if these valuable ingredients are used in a fraudulent manner, i.e., to avoid double-spending [19] of these ingredients. In this manner, the method builds trust among the stakeholders, and particularly the consumer, about the origin of the key ingredients used. Since the certified products and the certified ingredients they use will probably reach higher end prices, and BCT tends to reduce transaction costs, all the actors in the supply chain would be encouraged to use the method.
In summary this research was motivated by the conflict between:
Increasing interest of consumers in tracing any type of goods, including foods, particularly the specific case of recipe based foods, rather than just believing in the claims on the labels;
The commingler’s legitimate right to protect their business sensitive information concerning these recipes and suppliers.
This paper investigates the traceability requirements in recipe based foods and proposes an alternative solution to full chain traceability. It proposes a food ingredient certification framework capable of maintaining the non disclosure of business sensitive data for foods that derive from commingling operations using the blockchain.
This paper is organized as follows:
Section 2 describes the key definitions and requirements for traceability, data collection and organization along with some available solutions and their main characteristics.
Section 3 details the traceability aspects related to ingredients within recipes.
Section 4 summarizes BCT as currently applied to Supply Chain Management.
Section 5 justifies the use of BCT in the case of ingredient certification.
Section 6 presents the supply chain to token synchronization concept.
Section 7 addresses the algorithms and steps for the certification.
Section 8 discusses the smart contracts and implementation of the IGR token used to demonstrate the feasibility and usage of the certification framework.
Section 9 summarizes the results obtained.
Section 10 presents conclusions.
2. Traceability of Food: Key Definitions, Requirements and Review of Food Traceability Frameworks
The trustworthiness of the information concerning the ingredients of a food article and the nature of this information has been the focus of a large number of researchers.
For the purpose of this research, we use the concept of ingredient as being any of the components that are mixed to form a recipe. The concept of raw material is restricted to the components at their original state as of the moment they were extracted or harvested.
The Traceable Resource Unit (TRU), originally defined in ([20] p. 12) for batch processing of foods, was adapted by ISO 22005:2007 (Traceability in the feed and food chain: General principles and basic requirements for system design and implementation, https://www.iso.org/standard/36297.html). “Traceable lot” is defined as the “set of units of a product which has been produced or processed or packaged under similar conditions”. Further, the concept of granularity [21], which describes the level and the size of the units in a traceability system, has reached widespread agreement among models of food traceability and in the analysis of food recall actions.
The literature shows various proposed solutions to trace the life cycle of food products.
Lo Bello [22] proposed a broad framework for a network for traceability of food supply chains, mostly focused on reducing the high latency or uncertain response times of end-user traceability queries. The framework is capable of processing ad-hoc queries to the recipe based food chain traceability databases, although issues concerning Quality of Service, response times and, above all, the storage, processing costs and access rights to the server traceability database were identified as setbacks.
Cimino [23] proposed, within the Cerere project (Cerere website: http://cerere2020.eu), an information system supporting traceability using the electronic business extensible Markup Language (ebXML) (ebXML website: http://www.ebxml.org), which allows users to build loosely interconnected databases.
Similarly, [24] introduced the “TraceFood” framework containing recommendations for “Good Traceability Practice”, common principles for the unique identification of food items as well as a common generic standard for electronic exchange of traceability information (TraceCore XML).
In our understanding, both of these frameworks saw little practical use because of the data confidentiality and data control issues inherent to the recipe based food chain.
Bechini [25] builds on the Cerere project to introduce a data model for traceability, a set of patterns to encode generic traceability semantics, and suitable technological standards to define, register, and enable business collaborations. The Unified Modeling Language (UML) activity diagram shown in Figure 1 includes the responsible actors and the transitions involved in a traceable ingredient processing system.
Traceability requires that an individual, specific instance of a food ingredient (the TRU, lot or batch) be identified and positively followed. Since most industrial food packaging operations use the same date coding for a production batch, typically one work shift on one production line producing one specific package, this batch is chosen as the TRU. The date coding is required by law. Lots need to be individualized both externally, i.e., between businesses, and internally, i.e., within a factory or warehouse. The traceability granularity is thus one lot.
This calls for an information system able to manage a unique ID for each lot as well as some disciplined method to name and tag each lot unambiguously. The international standardization organization GS1 (GS1 official website: https://www.gs1.org) proposes an industry accepted internal and external traceability methodology by means of a well accepted criterion for individualization of lots and packages using the so-called stock keeping unit (SKU) code. The most important contribution of GS1 is the de facto individualization of each product instance by means of its unique SKU appended by the lot date code, as seen by consumers on food labels.
Once a specific instance of a food item is identified, it is necessary to record data on the transitions that this item goes through, in order to keep a record of its path from farm to consumer. Which pieces of data should be recorded as a minimum? The Institute of Food Technologists (IFT) (official website: http://www.ift.org) made significant advances in answering this question and provided a broad and powerful framework to apply traceability to food items in different segments [27]. The framework harmonizes the requirements into one structured set for the six major food supply chain families: baked foods, dairy, meat and poultry, prepared processed foods, produce and fruit, and fish and seafood. Typical actors within each of the six main food supply chain families can be established as per Table 1.
A Critical Tracking Event (CTE) is described as any event along the chain of transformation or handover of a food product instance where a supply chain actor makes changes to the nature or to the custody of the product lot.
Each CTE generates one particular record of data, the so-called Key Data Elements (KDEs), to ensure the flow of information is kept linked to the physical product flow.
The generic list of CTEs and associated KDEs is applicable to all food families as per Table 2. While some particular events do not occur in some food supply chains, and thus some KDEs need not be collected, and whilst nomenclature may differ from one food chain family to another, from a traceability perspective, “food is handled and distributed across the value chain in a fairly consistent manner.” [27].
The capture of each CTE with its respective KDEs is understood to be the necessary and sufficient condition to assure traceability of one specific food lot across any of the food chain families.
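To make the CTE and KDE relationship concrete, the following minimal sketch stores one KDE record per CTE and follows the input links of transformation events back to the source lots. The class name, the field names and the event labels are hypothetical and chosen only for illustration; they are not the IFT or GS1 field lists.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class KeyDataElements:
    """Hypothetical KDE record captured at one CTE (illustrative fields only)."""
    lot_id: str                  # lot being handled, e.g. SKU plus lot date code
    actor: str                   # supply chain actor performing the event
    event_type: str              # illustrative label, e.g. "harvest", "transformation"
    timestamp: datetime
    quantity_kg: float
    input_lot_ids: tuple = ()    # lots consumed by a transformation event


def trace_back(lot_id, records):
    """Return every lot id that contributed to lot_id, including lot_id itself."""
    by_lot = {}
    for rec in records:
        by_lot.setdefault(rec.lot_id, []).append(rec)
    seen, stack = set(), [lot_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        for rec in by_lot.get(current, []):
            stack.extend(rec.input_lot_ids)
    return seen
```

If one CTE in the chain is not captured, the corresponding input link is missing and the trace stops early, which is why the capture of every CTE is treated as a necessary condition.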
The IFT methodology has been extended by GS1 through the “GS1 Global Traceability Standard” (https://www.gs1.org/sites/default/files/docs/traceability/GS1_Global_Traceability_Standard_i2.pdf), which focuses on the data management aspects of traceability. It identifies and references the necessary requirements for capturing and sharing data “for full chain traceability meeting practical regulatory and industrial requirements”. Albeit using slightly different nomenclature, GS1 and IFT are compatible with each other and with the ISO 22000 (ISO 22000:2005 Food safety management systems: Requirements for any organization in the food chain) philosophy as well as Hazard Analysis and Critical Control Point (HACCP) systems.
The GS1 methodology focuses on three main tasks, namely to identify, capture and share the instances of the food products, and has achieved widespread acceptance for the unique identification and the capture of the flow of industrialized products.
Today, most retail food products are marked with the GS1 coding. A typical food product instance can be identified through its SKU bar code plus Lot Number or Lot “best before” date.
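As an illustration of how such a product instance identifier can be handled in software, the sketch below validates a GTIN-13 bar code number with the standard GS1 check digit rule (weights 1 and 3 alternating from the left over the first twelve digits) and combines it with the lot code. The function names and the `gtin:lot` separator are assumptions made for this example, not part of the GS1 standard.

```python
def gtin13_check_digit(first12: str) -> int:
    """Compute the GS1 check digit for the first 12 digits of a GTIN-13."""
    if len(first12) != 12 or not first12.isdigit():
        raise ValueError("expected exactly 12 digits")
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_gtin13(code: str) -> bool:
    """True if code is 13 digits long and its last digit matches the check digit."""
    return (
        len(code) == 13
        and code.isdigit()
        and gtin13_check_digit(code[:12]) == int(code[12])
    )


def product_instance_id(gtin13: str, lot_code: str) -> str:
    """Hypothetical instance identifier: GTIN-13 plus lot number or best-before date."""
    if not is_valid_gtin13(gtin13):
        raise ValueError("invalid GTIN-13")
    return f"{gtin13}:{lot_code}"
```

Rejecting codes with a wrong check digit at the point of capture avoids recording traceability data against a mistyped or misread product number.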
Neiva [26] showed the logical architecture capturing the main activities, responsibilities, boundaries, and services involved in the traceability business process based on a use case model as proposed by the GS1 standard.
Note that the bakery, processed food and dairy food families deal mostly with recipe based, i.e., commingled food products. In these families, the task of keeping a traceability trail of one specific food lot is intertwined with that of other food lots, because different ingredients are commingled, each with its own traceability trail.
Nevertheless, the GS1 methodology requires further support for certification of recipe based foods due to the reluctance of the recipe owners to have their business sensitive data disclosed.
3. Traceability of Ingredients Within Recipes
For commingled foods segments such as recipe based, animal feed, processed and dairy foods, although internal CTEs with their KDEs are usually captured, they are rarely shared. This is probably due not only to the increasing complexity of recipes and products, the geographical dispersion of the supply chain actors and the intense competition between recipe based foods producers, but also to the recipes being an important competitive factor within the business models of these producers.
Food products that have shown full chain traceability mostly belong to the produce, meat or fish supply chains. Within these families most of the supply chain processes consist of separating a larger item into smaller pieces, and therefore the traceability procedure is reduced to a straightforward sub-itemization concept.
These three families (produce and fruit, meat and poultry, and fish and seafood) typically involve no recipes or commingling operations. For these families a number of full-chain traceability cases are reported. For fruit and produce traceability, software tools based on dedicated client-server hand held devices are available and reported to be reliable, such as HarvestMark (https://www.harvestmark.com/). For the meat and poultry and the fish and seafood families, the literature reports a few successful use cases (Walmart and IBM will use blockchain to track pork from China, https://finance.yahoo.com/news/walmart-and-ibm-will-use-blockchain-to-track-pork-from-china-142530691.html).
On the other hand, very few full chain traceability use cases are reported for recipe based foods. In fact, businesses are extremely reluctant to make their recipes or sub-suppliers public, if not totally against it. This data is considered to be business sensitive information and is typically kept internal to corporate databases. Protecting corporate recipe and supplier data is a legitimate business goal. All recipe information is handled as classified information, almost as intellectual property and, as such, should not be made available to the external public or to competitors.
Additionally, most large businesses use extensive marketing efforts to promote trust in their brands and recipes, pushing consumers to see their corporate name as the trustworthy element behind customer choices [28]. When a corporation claims to use some differentiated ingredient or raw material, it supports the claim through the trust that has been built by its marketing efforts.
Further, more and more recipe based food producers claim to use special ingredients with added social, environmental or health related value. Are these claims trustworthy? How can a serious recipe based food processor provide evidence for such a claim without compromising trade secrets? This impasse explains the small number of publications showing full-chain traceability of recipe based foods.
Nevertheless, we will show that with the use of the Blockchain and Smart Contracts it is possible to obtain, even for the recipe based foods, an alternative to traceability by means of the certification of specific ingredient value properties all the way from harvest source to the final consumer through third party certificates issued by any desired authorities.
The traceability of foods use case diagram [26] was extended to show the separation of the Business-sensitive-data from the Non-business-sensitive-data as shown in Figure 2.
As an alternative to full traceability of all ingredients, we will show that certification of a specific ingredient, from source to final consumer, is possible without compromising the business sensitive information of the commingler. This can be achieved by means of open source code and freely accessible data structure such as those inherent to BCT.
4. Supply Chain Management and BCT
Recent research literature involving the keywords “supply chain management”, “traceability”, “blockchain” and “blockchain technology” [16,29,30,31] has shown that the traceability of pharmaceuticals and foods, both recipe based types of products, is an important field gaining increasing interest from researchers. Indeed, BCT shows several advantages that can strengthen the provenance information of food ingredients. Nevertheless, those articles and their references do not address the concern of business sensitive recipe data.
Bitcoin [32], the first cryptocurrency, introduced the concept of scarce digital objects and BCT to avoid double-spending of a digital asset. Ethereum [18] expanded the concept to include programs that are executed independently of human command, the so-called smart contracts and decentralized applications, expanding the utility of the blockchain. These distributed technologies provide the transparency and the protection against double-spending fraud needed for the ingredient chain certification system.
“A central aspect of BCT is the distributed ledger, which contains a record of all previous transactions. It is called a distributed ledger because it is not stored in a central location, rather it is stored across a network of computers across the world. Key to the operation of a distributed ledger is ensuring the entire network collectively agrees with the contents of the ledger; this is the job of the consensus mechanism” (https://hackernoon.com/consensus-mechanisms-explained-pow-vs-pos-89951c66ae10).
The blockchain is a collaborative distributed ledger replicated in several physical locations, capable of maintaining its information integrity. This allows companies doing business within a production and marketing chain (such as raw material producers and crop growers, processing companies, re-packers, transportation companies, distributors and retailers) to record concurrent transactions securely on a global basis [14,33,34,35]. “The BCT as a foundation for distributed ledgers offers an innovative platform for a new decentralized and transparent transaction mechanism in industries and businesses. The inherited characteristics of this technology enhances trust through transparency and traceability within any transaction of data, goods, and financial resources.” [36,37].
The technology behind blockchain smart contracts successfully implements access integrity for collaborative databases in the form of distributed ledgers for trust-less computing environments. The underlying data structures and control mechanisms, with validation by proof-of-work participants, show that this technology is an important step towards supply chain transparency of traceability data [38,39].
Blockchains are cryptographically auditable, append only, tamper resistant, distributed data structures accessible to anyone by means of a web browser [17]. Blockchains require no central trust mechanism, hence no central point of failure. Smart contracts are intimately linked to BCT and allow for extensive development and control, ensuring transparency of each data manipulation and thus trust.
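The append only and tamper resistant properties can be illustrated with a toy hash chain. This is only a single-machine sketch under simplifying assumptions: it has no network, no consensus mechanism and no smart contracts, and the block layout and function names are invented for the example.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, prev_hash, payload):
    """Hash the block contents together with the hash of the previous block."""
    body = json.dumps(
        {"index": index, "prev": prev_hash, "payload": payload}, sort_keys=True
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append a new block that commits to the current tip of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    chain.append(
        {
            "index": index,
            "prev": prev_hash,
            "payload": payload,
            "hash": block_hash(index, prev_hash, payload),
        }
    )


def verify_chain(chain):
    """Recompute every hash; any edit to an earlier block breaks the links."""
    prev_hash = GENESIS_PREV
    for index, block in enumerate(chain):
        if block["index"] != index or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(index, prev_hash, block["payload"]):
            return False
        prev_hash = block["hash"]
    return True
```

Because each block commits to the hash of its predecessor, changing a recorded quantity in an old block invalidates that block and every later one, so the change is detectable by anyone who re-runs the verification.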
BCT’s main strengths are:
Software driven, i.e., BCT does not require human privileged operators to maintain and operate the transactions, thus the system is not prone to bribery;
Tamper resistant, making it practically infeasible to change data that has been recorded;
Pseudo anonymity: Data is available publicly but encoded through hashes of keys that allow for certification without disclosing trade secrets;
Distributed presence: The data structures are replicated maintaining full integrity with each other data set in a perfectly smooth manner;
Allows for certification of the tamper proof storage of large volumes of data in a side blockchain, through the use of hierarchically certified sub-blockchain databases, which can also hold much more data than what retailers get today, providing tools for more detailed analysis, including certification of multimedia content.
On the other hand, BCT, in its current proof-of-work implementation, has shown weaknesses in scalability, speed of transaction validation and energy usage ([40] p. 5). Especially where a larger volume of data needs to be kept, the blockchain may not be the ideal solution. All of these issues are currently the subject of considerable improvement efforts.
Permissionless or Permissioned Blockchains
Storing large amounts of data on the blockchain can be costly and cumbersome. Since BCT uses a replicated database, every single node on the network would have to manage all data.
The type of read and write access restrictions imposed on the data of a blockchain establishes the nomenclature: permissionless and permissioned blockchains, as opposed to private centralized databases [41].
Permissionless blockchains are those that grant anyone read and write access, provided and validated by the corresponding smart contracts. These include the original Bitcoin, Ethereum, Rinkeby Testnet and several other blockchain projects accessible through web browser applications.
Permissioned blockchains are those that grant read access selectively and restrict write access, also provided and validated by the corresponding smart contracts or access methods. Some examples of permissioned blockchains are Hyperledger [13] and Corda [42].
6. Tracing Ingredient Value Properties and Their Certificates with Tokens
Consumers are becoming more demanding and require specific processes and raw materials in the food they consume. Claims such as non-genetically-modified seeds, organic crop cultures, agro-chemical free and gluten-free are often included in product advertisements, but manufacturers are seldom willing to provide evidence for those claims.
We here define such positive attributes of the raw materials used by a specific food product as an ingredient value property. The concept can also include claims of geographical origin such as “Oregon Potatoes”, “Champagne Sparkling Wine” or “Hallertau Hops”.
For the certification of ingredients we created an Ethereum token—the Ingredient Token (IGR). Making sure that for each change of custody of the food ingredient, the corresponding transfer of token takes place is the main challenge.
Figure 5 shows the synchronization between flow of food lots and the flow of information through the IGR token, in order to evidence to a final consumer that a certain instance of a recipe based food ingredient effectively contains a certain ingredient value property.
The interested crop grower can define the ingredient value property claimed and have this property, together with the corresponding date and volume produced, certified by an external third party: the certification authority.
The initiative of defining an ingredient value property comes from the farmer, but requires that a certification authority confirms the quality, quantity and date of the lot produced and generates a web page, i.e., a URL under the authority’s web domain, certifying this information. This lot of ingredients would then be awarded the corresponding amount of IGR tokens for the certified ingredient value property. For simplicity we define one IGR to be one gram of any ingredient with a certified ingredient value property, irrespective of the authority or the nature of the property.
The certifying authority uses a smart contract to create the IGR tokens based on the farmer’s claims and evidence, including the correct nature, quantity, location and time of the harvest. The full details of each batch of IGR tokens are made public through this URL, so as to serve as a “certificate of birth” for a given raw material product and lot.
At each CTE where changes in ownership, custody or quantities of the raw-material value properties are involved, a smart contract captures the linkage and quantity parameters, without recording the business sensitive information. This would serve the purpose of disciplined token transfer without disclosure of intellectual or business critical data.
At the final link, a consumer is able, e.g., by means of a mobile application, to scan the SKU and lot number to confirm the certification URL claimed on the label of the baked good or processed food. The same information is publicly available through web based blockchain scanners (examples of blockchain scanners: www.etherscan.io or www.ethplorer.io).
7. Steps of the Certification of Ingredients Used in Food Mixtures
Any ingredient that may be certified typically passes through several commercial and processing links along the supply chain from source to consumer. To be general, we use the three following types of processing or changes of custody for each specific lot from farm to consumer: “Sale without depletion”, “sale with depletion” and “sale from commingler”.
“Sale without depletion” is the operation in which the delivered lot has exactly the same mass as the ingredient received or harvested. This is opposed to the “sale with depletion”, where the incoming lot will lose part of its mass, e.g., a milling operation in which the husk is separated from the grain that will be sold as milled grain. On the other hand, the commingling operation mixes this ingredient in fixed proportions with other ingredients. These states and transitions are pictured in the Food Ingredient Transition Diagram in Figure 6.
In order to keep the provenance of the ingredients, it is necessary to use routines that will link six entities:
The first five of these entities need to write to the Ethereum blockchain. The consumer and any other interested party may read from the blockchain at any time, without any difficulty, using a mere web browser (e.g., https://etherscan.io).
At the very start, a Farmer will request a freely chosen certification authority to certify one specific lot of an ingredient through the procedure “farmerRequestCertificate()”. The request to the certification authority requires information such as: Public address of certification authority, quantity of ingredient to be certified, name of ingredient, the property that the ingredient meets that will add value to this ingredient, location of harvest (typically the GPS coordinates of the harvest) and date of harvest.
In the following step, the certification authority will inspect the farmer and use the smart contract to confirm that the claimed ingredient properties are trustworthy. After inspection, the certification authority will typically issue a public certificate on a webpage under its domain, i.e., URL. To make the certification data publicly available, the authority will perform the “certAuthIssuesCerticate()” procedure with parameters: Public address of farmer, amount of certified ingredients, URL in the web site, name of ingredient, ingredient value property, location and date of production. Once confirmed by the certification authority, the farmer can sell ingredients and transfer the corresponding IGR tokens certifying the trustworthiness of its property.
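The request and issuance steps can be sketched in plain Python as follows. This is not the IGR smart contract itself: the registry class, its storage layout and the rule that the authority cannot certify more than the farmer requested are assumptions made for illustration, and the method names are snake_case renderings of the procedure names in the text. The parameter list follows the description above, and one IGR equals one gram as defined earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CertificateRequest:
    """Parameters of the farmer's request, as listed in the text."""
    farmer: str            # public address of the farmer
    authority: str         # public address of the certification authority
    quantity_g: int        # quantity of ingredient to be certified, in grams
    ingredient: str
    value_property: str    # e.g. "organic"
    location: str          # typically GPS coordinates of the harvest
    harvest_date: str


class CertificationRegistry:
    """Hypothetical in-memory stand-in for the certification part of the contract."""

    def __init__(self):
        self.pending = {}   # (farmer, authority) -> CertificateRequest
        self.balances = {}  # (holder, certificate_url) -> IGR tokens

    def farmer_request_certificate(self, request):
        self.pending[(request.farmer, request.authority)] = request

    def cert_auth_issues_certificate(self, authority, farmer, quantity_g, url):
        """Mint IGR tokens to the farmer once the authority confirms the lot."""
        request = self.pending.get((farmer, authority))
        if request is None:
            raise ValueError("no pending request for this farmer and authority")
        if quantity_g > request.quantity_g:
            # Assumption: the authority cannot certify more than was requested.
            raise ValueError("certified quantity exceeds requested quantity")
        del self.pending[(farmer, authority)]
        key = (farmer, url)
        self.balances[key] = self.balances.get(key, 0) + quantity_g  # 1 IGR = 1 g
        return quantity_g
```

Keying the balance by certificate URL keeps tokens from different certificates distinguishable, which matches the use of the URL as a parameter in the later sale procedures.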
When selling ingredients, farmers should use the procedure “sellsIngrWithoutDepletion()” to transfer custody of the IGR tokens related to those ingredients. This requires the following parameters: Public address of the new holder of the tokens, quantity of ingredients, certificate URL. Typically the farmer transfers the ingredient to a first processor, commonly known as a mill, although direct transfer to a commingler or even to the final consumer is also possible.
After the mill receives custody of the certificates and the ingredients, it will typically process the food ingredients with some losses. The processed ingredient can be sold further but, because the processing at the mill involves a depletion, it is necessary to specify the percentage of mass reduction involved. The selling of ingredients with depletion uses the procedure “sellsIntermediateGoodWithDepletion()” with these parameters: Public address of new custodian, quantity of ingredients, URL, mass ratio of output to input in percentage.
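The token arithmetic of the two sale types can be sketched as below. The text does not spell out how the depletion ratio is applied, so this example assumes that the quantity parameter is the output mass sold, that the seller is debited the corresponding input mass (output × 100 / ratio, rounded up), and that the difference is burned. The class and method names are hypothetical.

```python
class IgrLedger:
    """Hypothetical in-memory ledger; balances are IGR tokens, i.e. grams."""

    def __init__(self):
        self.balances = {}  # (holder, certificate_url) -> tokens

    def balance(self, holder, url):
        return self.balances.get((holder, url), 0)

    def mint(self, holder, url, amount):
        self.balances[(holder, url)] = self.balance(holder, url) + amount

    def _debit(self, holder, url, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        if self.balance(holder, url) < amount:
            # Refusing overdrafts is what prevents double-spending of tokens.
            raise ValueError("insufficient IGR balance")
        self.balances[(holder, url)] -= amount

    def sell_without_depletion(self, seller, buyer, url, quantity_g):
        """Transfer tokens one to one with the mass delivered."""
        self._debit(seller, url, quantity_g)
        self.mint(buyer, url, quantity_g)

    def sell_with_depletion(self, seller, buyer, url, quantity_out_g, ratio_pct):
        """Credit the output mass to the buyer and burn the processing loss."""
        if not 0 < ratio_pct <= 100:
            raise ValueError("ratio must be in (0, 100]")
        consumed = -(-quantity_out_g * 100 // ratio_pct)  # ceiling division
        self._debit(seller, url, consumed)
        self.mint(buyer, url, quantity_out_g)
        return consumed - quantity_out_g  # tokens burned
```

Under these assumptions, a mill holding 1,000,000 IGR that sells 700,000 g of milled grain at a 70% output ratio ends with a zero balance, so it cannot later sell additional "certified" grain from the same lot.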
Once the ingredient is available to the commingler, the mass of the certified ingredient used in each SKU, according to the recipe, can be recorded in a manner that allows the end consumer to easily check for the presence of IGR tokens in the blockchain. In order for the final consumer to be able to check if a certain lot carries certified ingredients via IGR tokens, the SKU GTIN-13 identification, alongside the lot validity date will generate a public key in the Ethereum blockchain.
This blockchain public address will be credited with the corresponding tokens through the procedure “comminglerSellsProductSKUWithProRataIngred()” with the exact mass (in grams) of the certified ingredient. With the information that the consumer can read from the package, he can access the blockchain via a browser or via a dedicated Android application. The procedure needs as parameters: Public address of wholesaler or retailer, quantity of SKUs sold, URL, quantity of ingredient in the low level of SKU, GTIN 13 of package and “best consumed by” date.
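How the SKU and lot date map to a blockchain address is not detailed in the text. The sketch below shows one possible deterministic mapping together with the pro-rata token amount for a sale. It is an assumption throughout: the `gtin|date` encoding is invented for the example, and SHA-256 from the Python standard library is used as a stand-in hash, so the result is address-shaped but is not a real Ethereum address derivation (Ethereum uses Keccak-256, which `hashlib` does not provide).

```python
import hashlib


def lot_address(gtin13: str, best_before: str) -> str:
    """Derive a deterministic, address-shaped identifier for one SKU lot.

    Stand-in only: SHA-256 over a hypothetical "gtin|date" string,
    truncated to 20 bytes (40 hex characters).
    """
    digest = hashlib.sha256(f"{gtin13}|{best_before}".encode("utf-8")).hexdigest()
    return "0x" + digest[-40:]


def igr_for_sale(units_sold: int, certified_grams_per_unit: int) -> int:
    """Tokens to credit to the lot address: units sold times grams per unit."""
    if units_sold < 0 or certified_grams_per_unit < 0:
        raise ValueError("quantities must be non-negative")
    return units_sold * certified_grams_per_unit
```

Because the mapping depends only on data printed on the package, a consumer application can recompute the same identifier from the bar code and the "best consumed by" date and look up the credited tokens without any information from the commingler.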
Finally, to confirm that a specific lot of food includes certified ingredients, the consumer only needs to read the SKU GTIN 13 and validity date of the product to access the blockchain. This leads to all the harvest data at the certification authority web page URL. The different states of the IGR token are pictured in the IGR token state transition diagram in Figure 7. Food lot event transitions are followed by IGR transitions every time a change of ownership or custody takes place.
The implementation of the smart contracts followed the data flow diagram shown in Figure 8, which shows a generic case of ingredient provenance data being synchronized along its change of custody using the procedures of the IGR smart contract.
The motivation for the IGR changes of ownership is a probable higher sales price for the ingredient when certified, in the sense that a certified ingredient is more valuable than a non-certified equivalent. This mechanism is the driving force for the synchronization of the ingredients with the certification flow of data as shown in Figure 6.