Modernization Data Analysis and Visualization for Food Safety Research Outcomes

Vargas, David A.; Bueno López, Rossy; Casas, Diego E.; Osorio-Doblado, Andrea M.; Rodríguez, Karla M.; Vargas, Nathaly; Gragg, Sara E.; Brashears, Mindy M.; Miller, Markus F.; Sanchez-Plata, Marcos X.

doi:10.3390/app14125259

Open Access Review

by David A. Vargas ¹, Rossy Bueno López ¹, Diego E. Casas ¹, Andrea M. Osorio-Doblado ², Karla M. Rodríguez ¹, Nathaly Vargas ³, Sara E. Gragg ⁴, Mindy M. Brashears ¹, Markus F. Miller ¹ and Marcos X. Sanchez-Plata ¹,*

¹ International Center for Food Industry Excellence, Department of Animal and Food Sciences, Texas Tech University, Lubbock, TX 79409, USA

² Department of Animal & Dairy Science, University of Georgia, Athens, GA 30602, USA

³ Independent Researcher, Quito 170528, Pichincha, Ecuador

⁴ Department of Animal & Dairy Sciences, University of Wisconsin Madison, Madison, WI 53706, USA

* Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(12), 5259; https://doi.org/10.3390/app14125259

Submission received: 15 April 2024 / Revised: 10 June 2024 / Accepted: 13 June 2024 / Published: 18 June 2024

(This article belongs to the Special Issue Data Science Methods in Big Data Era)

Abstract: Appropriate data collection and the use of reliable and accurate procedures are the first steps in conducting an experiment that will provide trustworthy outcomes. It is key to perform an appropriate statistical analysis and data visualization for a correct interpretation and communication of results. A clear statistical summary and presentation of the data is critical for the reader to easily process and comprehend experimental results. Nowadays, there is a range of tools to perform proper statistical analysis and create elaborate graphs that help readers to understand the data, identify trends, detect outliers, evaluate statistical outputs, etc. However, researchers who are beginning to navigate experiments do not frequently encounter a guide that provides the basic concepts needed to begin their statistical analysis and data presentation. Therefore, the objective of this article is to provide a guide or manual to analyze and present results, focused on different types of common food safety experiments, including method comparisons, intervention studies, pathogen presence experiments, bio-mapping, statistical process control, and shelf life experiments. This review will provide information about data visualization options and statistical analysis approaches for different food safety experiments. In addition, basic concepts about descriptive statistics and possible solutions for issues related to microbiological measurements will be discussed.

Keywords: method comparison; intervention experiments; shelf life; bio-mapping studies; pathogen presence; limit of quantification; limit of detection; statistical analysis; data visualization

1. Introduction

Constant research in food safety is crucial for reducing foodborne illnesses because they impact society, health, and the economy of the food industry. Some of the benefits of performing research in this field are improving economic productivity and public safety, reducing healthcare costs for individuals affected by foodborne pathogens, increasing the shelf life of food products, and providing a safer food supply to consumers [1]. In the United States, it is estimated that foodborne diseases caused by pathogens affect 9 million people (about half the population of New York) and cause about 56,000 hospitalizations and 1300 deaths each year [2]. The four main pathogens of importance in terms of foodborne illnesses are Salmonella, Escherichia coli O157:H7, Listeria monocytogenes, and Campylobacter [3]. About 80% of foodborne illnesses are linked to Salmonella and Escherichia coli O157:H7 [3]. Also, around 75% of Salmonella illnesses are derived from chicken, fruits, pork, seeded vegetables, beef, or turkey [3], while Escherichia coli O157:H7 is linked to vegetable row crops and beef [3]. Listeria monocytogenes illnesses are associated with dairy products, fruits, and vegetable row crops [3]. Although Campylobacter is one of the major pathogens that cause foodborne illnesses in the United States, estimates linked to this pathogen are ambiguous because they contrast considerably with non-outbreak-associated illnesses [3]. This information demonstrates the importance of microbial foodborne pathogens in our society and shows that data are a powerful dissemination tool [4,5].

Collection of data in a proper manner that follows procedure and uses accurate methods is crucial to obtaining reliable data when conducting an experiment. However, it is key to perform an assertive statistical analysis and data visualization for a correct assessment of microorganisms [6,7,8]. A clear statistical summary and presentation of the data is important for the reader to easily process and comprehend results. The summary of data in an absorbable manner can be achieved through statistics, which is a science composed of two main branches, descriptive and inferential statistics. Descriptive statistics summarize a data set and disseminate it in an accessible manner, while inferential statistics lead to conclusions from a small sample size that is exposed to random variation and make predictions and generalizations about a population [9,10,11].

In microbiology, correct estimations of microbial concentrations and their presence in food products are necessary for public health regulations and decisions because they comprise the interpretation and description of models that can subsequently affect public health [12,13]. Microorganisms are discrete objects that cannot be measured precisely; therefore, microbial concentrations are measured by enumeration or detection of the objects in a fixed sample size prior to performing an experimental trial [14,15,16]. These approaches are broadly used in the food and health industry, and these estimates are mostly compared to determined concentrations or targets, which can decrease the accuracy of the data, detection, and enumeration of microorganisms [17,18]. Microbial data are analyzed and presented in a lognormal distribution to describe the variability of bacterial concentrations, which allows the data to be interpreted following a normal distribution [19,20,21,22]. Some of the challenges of log transforming microbial data are obtaining non-detectable values and back-calculating concentrations from microbial counts [12]. Therefore, a correct data analysis should be performed to avoid biased results.

Some of the misconceptions that can be dismissed regarding inaccurate data analysis are due to crucial differences between discrete and continuous data. These are obtained using analytical chemistry methodologies, and they expose bias obtained by statistical analysis within chemistry data and its dissipation into discrete microbial data [12]. These approaches impede the accurate presentation of low microbial concentrations and microbial methodologies with low or variable analytical power; therefore, non-detectable results can be expected. Also, it is not rare for many researchers to apply improper statistical tests without considering appropriate approaches before performing experiments [23,24].

A correct statistical analysis is essential when presenting results; however, data visualization adds value to such results by facilitating the interpretation and display of information. Understanding of the data is simplified through graphs, plots, histograms, maps, etc. [25,26]. Currently, just displaying results in an old-fashioned manner is not enough; rather, there is more value in providing graphics that can be interpreted in a clean and concise style. Data visualization is an excellent tool for cleaning and structuring data, detecting outliers, identifying trends, evaluating statistical outputs, and presenting results [27]. The theory of graphics has advanced through works such as Wilkinson’s Grammar of Graphics (2005), followed by Hadley Wickham’s implementation of it in the R package ggplot2 [28,29]. In the past, graphics were rarely used and difficult to construct; however, technology nowadays facilitates data visualization.

As discussed, statistical approaches and data visualization are necessary and powerful tools to spread assertive information efficiently. However, researchers who are beginning to navigate these sciences do not frequently encounter a guide that can provide the principal concepts to begin statistical analysis and presentation of their data. Therefore, the objective of this review paper is to provide a “manual” for researchers in the food safety sector on statistical analysis and proper data visualization before performing experimental trials involving microbial counts, validations, and detection of foodborne pathogens.

2. Types of Data

Every research experiment should follow the scientific method, which starts with a question followed by a hypothesis. Then, in order to answer the question, an experiment is constructed including an experimental design and statistical analysis. In addition, before beginning a research trial, the variables of interest are predefined and data are collected to obtain scientific evidence that the conclusion of your experiment is valid. When collecting data, it is important to understand the type of data generated in the experiment because the statistical methods used are highly dependent on this characteristic [30]. Data are classified in two main categories: quantitative or continuous data, and qualitative or categorical data.

2.1. Quantitative Data

A quantitative variable represents a numerical value designating how much, how often, or how many of an item is assigned to a specific sample, thus providing information about quantities of a specific object [31,32]. In food safety experiments, enumeration of bacteria is the most common example of quantitative data, as it provides the number of a specific bacterium in a sample; however, quantitative data can be further classified into two different types: discrete and continuous data. Continuous data represent a numerical value that can be measured, and they have an infinite number of possible values within a range, such as height, weight, and pH values. Discrete data take a countable number of possible values; thus, they can only be counted as whole integers, such as the number of bacteria in a specific sample.

In statistics, distribution functions and their properties are usually discussed; however, it is important to understand what a distribution represents. When a large number of measurements are collected during an experiment, those observations can be organized into groups or classes that have the same interval size, resulting in groups with a certain number of observations or frequencies [32]. If plotted, those groups and frequencies will result in a histogram, in which the height of the bars will represent the frequency of each class. Frequency distributions can be used as models for experimental data where appropriate [33]. The most common models used for microbiological analysis include the normal, lognormal, and binomial distributions.

2.1.1. Normal Distribution

The normal distribution is the most common type of distribution and is assumed in nearly all types of statistical analyses of continuous populations. It is symmetric about the mean, so data closer to the mean occur more frequently than data far from the mean [34]. The normal distribution appears as a bell curve, and it is described by two parameters: the mean and the standard deviation. In a normal distribution, approximately 68% of observations lie within one standard deviation of the mean, 95% lie within two standard deviations, and 99.7% lie within three standard deviations. A very strong theorem related to the normal distribution is the “central limit theorem”, which states that, under appropriate conditions, the distribution of the sample mean approximates a normal distribution as the sample size becomes larger [32]. However, how can it be known whether a data set comes from a normal population? If the sample is reasonably large (n > 30 is a good rule of thumb), a histogram may give a good indication: a peak in the center with frequencies decreasing more or less symmetrically on both sides is a good sign of normality [35]. For small samples, it is more difficult to tell whether the observations are normally distributed; however, statistical tests such as the Kolmogorov–Smirnov and Shapiro–Wilk tests exist to test normality. The null hypothesis of these tests states that the collected data are normally distributed, meaning that if the p-value of the test is not significant, the null hypothesis is not rejected and the data are treated as normally distributed. Some authors suggest that the Shapiro–Wilk test is more appropriate for small sample sizes, whereas the Kolmogorov–Smirnov test is used for large data sets [7].
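The coverage of a normal distribution within a given number of standard deviations can be checked numerically. The following is a minimal Python sketch using only the standard library; the helper name `coverage_within` is introduced here for illustration and does not come from the article. It gives roughly 68%, 95%, and 99.7% for one, two, and three standard deviations.

```python
from statistics import NormalDist


def coverage_within(k: float) -> float:
    """Fraction of a normal population lying within k standard deviations of the mean."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    # Area under the bell curve between -k and +k standard deviations.
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {coverage_within(k) * 100:.1f}%")
```

The result does not depend on the particular mean or standard deviation, which is why the standard normal distribution is enough for this check.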

2.1.2. Lognormal Distribution

The most common types of statistical tests (tests for the difference between two or more means) are based on parametric methods, in which data should come from a normal distribution with verified normality and assumed homoscedasticity [36]. Microbiological counts are usually heteroscedastic and non-normally distributed, meaning that the variation across a set of independent variables is not constant. Therefore, the variance is not homogeneous, suggesting that the use of parametric methods for statistical analysis may introduce considerable error [32]. To overcome this issue, microbial growth data should be log₁₀ transformed, followed by an appropriate check of whether the transformation improved the distribution of microbial counts, rendering it similar to a normal distribution. The transformation is relatively easy to perform, but one special condition to consider is that no zeros should exist in the data set, as the log₁₀ of zero is undefined. Options to overcome this issue are available and can be applied to different data sets; nevertheless, it is important to mention the option used in the description of the statistical analysis in an article or presentation so the audience can understand the significance of the results. Transformation of data is essential for microbial counts in order to use parametric statistical analyses, such as analysis of variance (ANOVA) or t-tests.

2.1.3. Binomial Distribution

The most typical example used in the statistics literature to explain the binomial distribution is tossing a certain number of coins, where the probabilities of heads and tails on each toss are equal, $p = q = (1 - p) = 0.5$. In its general form, each of the n trials results either in “success”, with probability p, or “failure”, with probability $q = (1 - p)$ [37]. In microbiology, the detection of pathogens is the best example of when this distribution is applied, as the result is either the presence (positive) or the absence (negative) of the pathogen. Data obtained in these situations should be analyzed using statistical methods based on the binomial distribution. The proportion of successes observed in a certain number of trials is only an estimate of p. Therefore, its uncertainty should also be estimated and reported when presenting binomial results, as is done with the mean and standard deviation for normal distributions [38].
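One way to report the uncertainty of an estimated proportion is a confidence interval. The sketch below uses the Wilson score interval, which is a choice made here for illustration and is not prescribed by the article; the function name `wilson_interval` and the sample numbers (12 positives out of 100 samples) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(positives, n, confidence=0.95):
    """Wilson score interval for a binomial proportion (positives out of n trials)."""
    if n <= 0 or not 0 <= positives <= n:
        raise ValueError("need n > 0 and 0 <= positives <= n")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = positives / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical data: 12 pathogen-positive samples out of 100 tested.
low, high = wilson_interval(12, 100)
print(f"estimated p = {12 / 100:.2f}, 95% CI = ({low:.3f}, {high:.3f})")
```

The Wilson interval stays within 0 and 1 even when the observed proportion is close to zero, which is common for pathogen prevalence data.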

2.2. Qualitative Data

A qualitative variable (also known as categorical or factors) represents variables that are not numerical and can be placed into categories; thus, its values belong to a specific group under a certain criterion when they are classified. These variables can be classified into nominal (no natural ordering) or ordinal (ordered groups). Nominal variables categorize a group based on a specific characteristic that cannot be ranked or ordered, such as sex, treatment groups (antimicrobials), symptoms of disease, etc. [39]. Ordinal variables allow for classification into categories that can be ranked or ordered; however, the distance between categories is not known [39]. For example, with temperature (high, medium, and low), it is clear that high is greater than medium, and medium greater than low; still, the difference between the categories cannot be described. Qualitative variables are usually summarized in terms of frequencies or proportions [32].

3. Descriptive Statistics

While conducting an experiment, data analysis and statistical methods are key components of research project management. They allow the researcher to make inferences about a certain population by relying on a sample. This sample needs to be randomly taken and have a predetermined size n, in which each observation is equally likely to be chosen from a certain population [40]. A good real-life example is a lottery, in which each collected ticket is equally likely to be the winning ticket. Moreover, the observations of a sample need to be independent from each other, meaning that observations chosen for one sample do not provide any information about observations chosen for the next sample. As a counterexample, suppose that, when testing the effect of an intervention on beef trimmings, swabs from both sides of each trim are collected to reduce the number of beef trimmings needed for the project. In this scenario, the measurements of the microbial load are related to each other, and therefore not independent, because the same beef trimming was swabbed twice.

After randomly selecting observations and making sure that they are independent, the next step is to provide summary statistics. Normally, summary statistics provide a measurement of central tendency, such as the mean, median, or mode, and a measurement of variation such as range, variance, and standard error. The combination of these two types of statistics allows the researcher to understand their data. An example of these measurements is provided below:

Assume that summary statistics of aerobic counts on fresh ground beef were calculated with actual values of 10,000, 1000, 2000, 5000, and 3000 colony-forming units (CFUs) per gram.

3.1. Range

The range measures the spread between the largest and smallest observations in a data set, and it is calculated by subtracting the lowest value from the highest value. For the example data, the correct way to calculate the range is $10{,}000 - 1000 = 9000$ CFU/g, which, converted to log₁₀, equals 3.95 Log₁₀ CFU/g. Others will suggest first applying log₁₀ to the values and then applying the same principle, resulting in $4.00 - 3.00 = 1.00$ Log₁₀ CFU/g, or 10 CFU/g after back-transformation, which is misleading, as the difference between logarithms should not be back-transformed. Instead, the difference between logarithms should be interpreted as the proportional difference between both values, resulting, in this case, in a 90% change.

$$\text{Range} = \left(\frac{10^{4} - 10^{3}}{10^{4}}\right) \times 100 = 90\% = 1\ \log\ \text{difference}$$
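The two ways of expressing the range can be compared with a few lines of Python. This is a minimal sketch using the example counts from this section; the variable names are chosen here for illustration.

```python
from math import log10

# Aerobic counts on fresh ground beef (CFU/g), example data from the text.
counts = [10_000, 1_000, 2_000, 5_000, 3_000]

# Range on the original scale, then converted to log10.
range_cfu = max(counts) - min(counts)
range_log_of_difference = log10(range_cfu)

# Difference between logarithms: a proportional change, not a count.
log_difference = log10(max(counts)) - log10(min(counts))
percent_change = (1 - 10 ** (-log_difference)) * 100

print(f"range: {range_cfu} CFU/g ({range_log_of_difference:.2f} Log10 CFU/g)")
print(f"log difference: {log_difference:.2f} log, i.e. a {percent_change:.0f}% change")
```

The last line makes the interpretation explicit: a difference of one log corresponds to a 90% change, not to 10 CFU/g.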

Range is often used in statistical process control, but since it is affected by extreme values, its use is limited.

3.2. Median

The median represents the value that separates a population or a data sample into groups of 50%. The calculation of the median depends on the number of observations or values in a data set. If the number of observations is odd, the median is the middle value of an ordered data set. If the number of observations is even, the median is the average of the two middle values in an ordered data set. For the example data, the median value of 1000, 2000, 3000, 5000, and 10,000 = 3000 CFU/g or 3.48 Log₁₀ CFU/g. The same principle can be applied with log₁₀ values, as the order of the observations will not change. For this sequence, the median value for 3.00, 3.30, 3.48, 3.70, and 4.00 = 3.48 Log₁₀ CFU/g.

3.3. Interquartile Range

The interquartile range (IQR) shows the spread of the central 50% of a distribution, whereas the range gives the spread of the whole data set. An ordered set of values can be divided by three quartiles (Q1 to Q3) into four equal parts. The second quartile (Q2) is the median, as it represents the middle value of an ordered data set. The first quartile (Q1) is the median of the lower half of the data, and the third quartile (Q3) is the median of the upper half; here, Q2 is included in both halves, and quartile conventions differ slightly between software packages. The IQR is calculated by subtracting Q1 from Q3. For this sequence of observations, the first step is to sort the values in ascending order and then identify the quartiles. In this case, Q1 = 2000 CFU/g or 3.30 Log₁₀ CFU/g, Q2 = 3000 CFU/g or 3.48 Log₁₀ CFU/g, and Q3 = 5000 CFU/g or 3.70 Log₁₀ CFU/g; thus, the IQR is $5000 - 2000 = 3000$ CFU/g. Remember that back-transforming the difference between logarithms can lead to misleading results, as explained in Section 3.1.
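The median and quartiles for the example data can be obtained with Python's standard library. This is a sketch under one assumption: the `"inclusive"` quartile method is selected because it reproduces the quartiles quoted in this section, whereas other conventions can give different values for such a small sample.

```python
from math import log10
from statistics import median, quantiles

# Example counts (CFU/g), sorted in ascending order.
counts = sorted([10_000, 1_000, 2_000, 5_000, 3_000])

med = median(counts)

# The "inclusive" method assumes the data include the population minimum and
# maximum; the default "exclusive" method gives different quartiles here.
q1, q2, q3 = quantiles(counts, n=4, method="inclusive")
iqr = q3 - q1

print(f"median = {med} CFU/g ({log10(med):.2f} Log10 CFU/g)")
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr} CFU/g")
```

Because the log₁₀ transformation preserves order, taking the median of the log values gives the same result as taking the log of the median.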

3.4. Arithmetic Mean and Geometric Mean

The arithmetic mean is one of the best-known measures of central tendency of a distribution, and it is calculated by adding a sequence of values and then dividing the sum by the number of values. Nevertheless, the mean has a flaw, as it is easily affected by the presence of extreme values or outliers in a distribution. To illustrate this, one of the values in the current data set is changed from 10,000 to 100,000 in the calculation that follows. The formula for the arithmetic mean is represented in Equation (1).

$$\bar{x} = \frac{\sum_{i=1}^{n} x_{i}}{n} \qquad (1)$$

where $\bar{x}$ is the arithmetic mean value, $x_{i}$ is the i-th observation, and n is the number of observations.

$$\begin{aligned} \bar{x} &= \frac{1000 + 2000 + 3000 + 5000 + 100{,}000}{5} \\ &= 22{,}200\ \text{CFU/g} \end{aligned}$$

Is this value a good measure of the central tendency for this distribution of values? It is not: the extreme value strongly affects the arithmetic mean. Therefore, the arithmetic mean is not the most accurate measure of central tendency for data that change multiplicatively rather than additively, as is the case for bacterial growth in microbiology.

The geometric mean is the n-th root of the product of all the values in a data set. The formula of the geometric mean is represented in Equation (2).

$$\bar{x}_{geom} = \sqrt[n]{\prod_{i=1}^{n} x_{i}} \qquad (2)$$

where $\bar{x}_{geom}$ is the geometric mean value and n is the number of observations.

$$\begin{aligned} \bar{x}_{geom} &= (1000 \times 2000 \times 3000 \times 5000 \times 100{,}000)^{1/5} \\ &= 4959\ \text{CFU/g} \end{aligned}$$

Alternatively, another way to calculate the geometric mean, which applies perfectly to microbiology, is to take the log₁₀ of the values of the data set and then follow the steps to calculate the arithmetic mean. The formula of the second way to calculate the geometric mean is represented in Equation (3).

$$\bar{x}_{geom} = \operatorname{antilog}\left[\frac{\sum_{i=1}^{n} \log_{10} x_{i}}{n}\right] \qquad (3)$$

where $\bar{x}_{geom}$ is the geometric mean value and n is the number of observations.

$$\begin{aligned} \bar{x}_{geom} &= 10^{[\log_{10}(1000) + \log_{10}(2000) + \log_{10}(3000) + \log_{10}(5000) + \log_{10}(100{,}000)]/5} \\ &= 4959\ \text{CFU/g} \end{aligned}$$

Note that the geometric mean value can be presented in Log₁₀ CFU/g, which in both cases equals 3.6954 Log₁₀ CFU/g.
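The arithmetic and geometric means of the example data with the outlier can be compared in a short Python sketch. It uses only the standard library, and the variable names are chosen here for illustration; it also shows that the direct n-th root and the mean of the log₁₀ values agree.

```python
from math import log10
from statistics import fmean, geometric_mean

# Example counts (CFU/g) with the outlier of 100,000 used in the text.
counts = [1_000, 2_000, 3_000, 5_000, 100_000]

arithmetic = fmean(counts)

# Direct geometric mean: n-th root of the product of the values.
geometric = geometric_mean(counts)

# Same quantity via logarithms: antilog of the mean log10 count.
mean_log = fmean([log10(x) for x in counts])
geometric_via_logs = 10 ** mean_log

print(f"arithmetic mean: {arithmetic:.0f} CFU/g")
print(f"geometric mean:  {geometric:.0f} CFU/g ({mean_log:.4f} Log10 CFU/g)")
```

The arithmetic mean is pulled toward the single outlier, while the geometric mean stays close to the bulk of the counts.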

3.5. Variance and Standard Deviation

Variance is a measurement of the spread between observations in a data set, and it quantifies how far the values in the data set are from the mean. The adjusted sample variance is calculated as the sum of the squared deviations between each value and the mean, divided by the degrees of freedom of the data set (n − 1). Since the geometric mean is the measure used for positively skewed data such as microbial counts, the variance of the log₁₀-transformed values should be estimated. Moreover, variance does not have the same units as the original data, as the deviations are squared for its calculation. The formula of the variance is represented in Equation (4).

$$s^{2} = \sum_{i=1}^{n} \frac{\left(\log_{10} x_{i} - \overline{\log_{10} x}\right)^{2}}{n - 1} \qquad (4)$$

where $s^{2}$ is the variance of the transformed values, $\overline{\log_{10} x}$ is the mean of the log₁₀-transformed values, and n is the number of observations.

$$\begin{aligned} s^{2} &= \frac{(3 - 3.6954)^{2} + (3.3010 - 3.6954)^{2} + (3.4771 - 3.6954)^{2} + (3.6990 - 3.6954)^{2} + (5 - 3.6954)^{2}}{5 - 1} \\ &= \frac{0.4836 + 0.1555 + 0.0477 + 0.00001 + 1.7019}{4} \\ &= 0.5971\ (\text{Log CFU/g})^{2} \end{aligned}$$

The standard deviation represents the average amount of variability in the data set, and it is calculated as the square root of the variance, resulting in a measurement with the same units as the original data. A simple interpretation is that a high standard deviation suggests that the observations in the data set are generally far from the mean, while low values indicate that the observations are clustered close to the mean. The wider the distribution, the further the sample mean is from the population mean, suggesting that the parameter is not very good at explaining the data set. The formula for the standard deviation is represented in Equation (5).

$$s = \sqrt{s^{2}} \qquad (5)$$

where $s$ is the standard deviation.

$$\begin{aligned} s &= \sqrt{0.5971} \\ &= 0.7727\ \text{Log CFU/g} \end{aligned}$$

Note that the variance and the standard deviation of the geometric mean (mean log count) should not be directly back-transformed, since the resulting value can be misleading.
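The variance and standard deviation of the log₁₀-transformed example counts can be reproduced with the standard library. In this sketch, `statistics.variance` and `statistics.stdev` use the n − 1 denominator, matching Equation (4); the variable names are chosen here for illustration.

```python
from math import log10
from statistics import stdev, variance

# Example counts (CFU/g) with the outlier of 100,000 used in the text.
counts = [1_000, 2_000, 3_000, 5_000, 100_000]
log_counts = [log10(x) for x in counts]

# Sample variance and standard deviation of the log10 values (n - 1 denominator).
s2 = variance(log_counts)
s = stdev(log_counts)

print(f"variance = {s2:.4f} (Log CFU/g)^2")
print(f"standard deviation = {s:.4f} Log CFU/g")
```

Small differences in the last decimal place relative to the hand calculation come from rounding the intermediate log values.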

3.6. Standard Error of the Mean and Confidence Intervals

The standard error of the mean is a measurement that shows how different the sample mean is likely to be from the population mean. Basically, it describes how much the sample mean would vary if the same study were repeated using different sets of samples from the same population (precision). It is calculated by dividing the standard deviation by the square root of the number of samples. The formula for the standard error of the mean is presented in Equation (6).

$$SE_{M} = \frac{s}{\sqrt{n}} \qquad (6)$$

where $SE_{M}$ is the standard error of the mean, $s$ is the standard deviation, and $n$ is the number of observations.

$$\begin{aligned} SE_{M} &= \frac{0.7727}{\sqrt{5}} \\ &= 0.3456\ \text{Log CFU/g} \end{aligned}$$

Furthermore, another parameter used to describe the results of a study in which several replicates were collected is the confidence interval (CI), which describes a range of values that is likely, with a certain percentage of confidence, to include the population mean. The two-sided 95% CI is the most common range used in microbiological research, as it provides a range expected to contain the population mean with a 95% confidence level (CL). When data are normally distributed, a 95% CI is approximately determined as $\pm 2 \times SE_{M}$. If more certainty is needed, a 99% CI can be approximately estimated as $\pm 2.6 \times SE_{M}$ (a multiplier of 3 corresponds to roughly 99.7% confidence). It is important to understand that a higher confidence level produces a wider range between the confidence limits. Moreover, a wider CI suggests that the estimated population mean has more uncertainty, and possibly more data need to be collected before reaching conclusions. The formula for the estimation of confidence intervals is represented in Equation (7).

$$CI = \bar{X}_{geom} \pm t_{n-1,\,\alpha/2} \times SE_{M} \qquad (7)$$

where $CI$ is the confidence interval, $\bar{X}_{geom}$ is the geometric mean (expressed as the mean log₁₀ count), $t_{n-1,\,\alpha/2}$ is a factor from Student’s t distribution that depends on the number of observations minus 1 and the confidence level ($\alpha = 1 - CL/100$), and $SE_{M}$ is the standard error of the mean. In our example, $n - 1$ equals 4 and $\alpha/2$ equals 0.025, producing a factor value of 2.776.

$$\begin{aligned} 95\%\ CI &= 3.6954 \pm 2.776 \times 0.3456 \\ &= (2.7360;\ 4.6548)\ \text{Log CFU/g} \end{aligned}$$

Note that the approximate lower (LCL) and upper (UCL) 95% confidence limits around the geometric mean can be converted to CFU/g by taking the antilog of the confidence limits estimated above.

$$\begin{aligned} LCL &= 10^{2.7360} = 545\ \text{CFU/g} \\ UCL &= 10^{4.6548} = 45{,}165\ \text{CFU/g} \end{aligned}$$

Also, the back-transformed confidence limits are asymmetrical around the geometric mean, and the wide confidence interval is due to the outlier used in this example. These results can also be expressed as follows: on 5 occasions out of 100 (1 in 20), the sample mean for a repeated test would be expected to fall outside the reported confidence interval.
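The standard error and the 95% confidence limits for the example data can be sketched in Python as follows. The t factor of 2.776 is taken directly from the text because the standard library does not provide Student’s t distribution; the constant name `T_CRIT` and the other variable names are chosen here for illustration.

```python
from math import log10, sqrt
from statistics import fmean, stdev

# Example counts (CFU/g) with the outlier of 100,000 used in the text.
counts = [1_000, 2_000, 3_000, 5_000, 100_000]
log_counts = [log10(x) for x in counts]

n = len(log_counts)
mean_log = fmean(log_counts)
sem = stdev(log_counts) / sqrt(n)

# Two-sided 95% t factor for n - 1 = 4 degrees of freedom, as quoted in the text.
T_CRIT = 2.776

lcl_log = mean_log - T_CRIT * sem
ucl_log = mean_log + T_CRIT * sem

# Back-transform the limits (not the standard error) to CFU/g.
lcl_cfu = 10 ** lcl_log
ucl_cfu = 10 ** ucl_log

print(f"SEM = {sem:.4f} Log CFU/g")
print(f"95% CI = ({lcl_log:.4f}; {ucl_log:.4f}) Log CFU/g")
print(f"95% CI = ({lcl_cfu:.0f}; {ucl_cfu:.0f}) CFU/g")
```

The limits are symmetric around the mean on the log scale and become asymmetric only after back-transformation.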

3.7. Issues in Microbiological Counts

3.7.1. Negative Counts

During an estimation of microbial counts, one of the issues that microbiologists encounter is how to deal with observations that are not quantifiable by the method, or with counts that become negative on the log scale because of the reporting units. At first, negative counts are not intuitive, because one thinks, “How is it possible to count something and at the same time get a negative value?” These issues are very common in microbiology, especially when changing the reporting units for the counts, which dramatically affects the final result. For example, an experiment was conducted in which beef carcasses were swabbed with a 25 mL pre-hydrated swab on a 100 cm² area, and aerobic plate counts were analyzed. It was decided to plate directly from the swab, as counts were known to be very low, and the average result obtained from the plates was 2 CFU/mL. Then, after some literature review, it was observed that swabbed areas are normally reported as Log CFU/cm². The conversion is calculated below.

$$\frac{2\ \text{CFU}}{\text{mL}} \times \frac{25\ \text{mL}}{\text{swab}} \times \frac{1\ \text{swab}}{100\ \text{cm}^{2}} = \frac{0.5\ \text{CFU}}{\text{cm}^{2}} = -0.30\ \frac{\text{Log CFU}}{\text{cm}^{2}}$$

As mentioned before, the final units used to express results determine whether the log-transformed value is positive or negative. One way to overcome this issue is to change the reporting units: if the final reporting units for the last example were Log CFU/mL, the final result would be 0.30 Log CFU/mL. Furthermore, the use of negative values will directly affect the calculation of the mean; however, another problem arises when dealing with very low counts. Usually, researchers will report zeros in their data sets for samples that were not quantifiable, meaning that these samples had lower counts than the ones that researchers were actually able to count. For those cases, another set of solutions can be used to express the data in the best possible way, as explained below.
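The effect of the reporting units on the sign of the log count can be sketched as follows. The function name `swab_count_per_cm2` is hypothetical and introduced only for this illustration; the input values are the ones from the swab example in this section.

```python
from math import log10


def swab_count_per_cm2(cfu_per_ml: float, diluent_ml: float, area_cm2: float) -> float:
    """Convert a plate count in CFU/mL of swab diluent to CFU/cm2 of swabbed surface."""
    return cfu_per_ml * diluent_ml / area_cm2


cfu_per_ml = 2.0  # average plate count, CFU/mL
per_cm2 = swab_count_per_cm2(cfu_per_ml, diluent_ml=25.0, area_cm2=100.0)

print(f"{per_cm2} CFU/cm2 = {log10(per_cm2):.2f} Log CFU/cm2")
print(f"{cfu_per_ml} CFU/mL = {log10(cfu_per_ml):.2f} Log CFU/mL")
```

Any count below 1 in the chosen unit gives a negative logarithm, so the sign reflects the unit of expression rather than the contamination level itself.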

3.7.2. Zeros, Limits of Detection, and Limits of Quantification

When describing microbial populations on a certain product or surface, concentrations are estimated by counting the number of colony-forming units in a finite sample portion that later will be evaluated against a concentration-based criterion to make a decision. While enumerating, zero counts or non-results will come up frequently and usually will be reported as “zero” or “below the limit of quantification” (LOQ). LOQ in microbiology stands for the lowest concentration that can be determined in a specific analytical procedure [12,17]. The use of zeros or “below the LOQ” has been widely implemented and usually provides results with a biased mean concentration estimate (depending on the number of observations reported). Approaches for handling this type of problem can be implemented in a data set in the following way:

(1): The simplest way to deal with values < LOQ is to ignore non-quantifiable observations or to set their log₁₀ value to zero (0 Log). Both alternatives will bias the mean: setting observations below the LOQ to zero will decrease the mean, and ignoring them will increase the mean.
(2): As mentioned before, if the microbial counts are log₁₀-transformed, the presence of zero counts in a data set can cause an issue, as the log₁₀ of zero is undefined. One of the options to deal with this issue is transforming microbial counts with a log₁₀ of the observation plus one $[\log_{10}(x + 1)]$ [32]. This transformation changes all zero counts to 1 (0 Log), so the log₁₀ transformation can be applied without any problem. Moreover, this correction is usually insignificant in terms of microbial counts; however, it should be considered when back-transforming the data and interpreting the resulting biased mean.
(3): Another simplistic approach consists of setting all values < LOQ to one half the limit of quantification [41]. Then the log₁₀ transformation is applied, and estimates will be obtained. An implicit assumption for this method is that a distribution below the limit of quantification follows a uniform distribution. If the data collected follow a lognormal distribution, then this method would give acceptable estimates; however, if the lognormal assumption is nearly correct some limitations may be encountered [42].
(4): To overcome the limitation previously mentioned, it is suggested to set all values < LOQ to $\mathrm{LOQ}/\sqrt{2}$ and then log₁₀-transform the observations to obtain the estimates [42]. When the data are not highly skewed, replacing values below the LOQ using this method produces good estimates [42].
(5): The last and most accurate method to deal with values below the LOQ takes advantage of the normal distribution, extrapolating it back from the LOQ to obtain maximum likelihood estimates of the mean and standard deviation [12,17,42]. While this method is very accurate, it is laborious, and understanding of it is somewhat limited among researchers.
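As a rough illustration of how approaches (1) to (4) shift the mean, the following Python sketch applies each substitution rule to a small data set. The counts, the LOQ of 10 CFU, and the function name are hypothetical values chosen for illustration only; the maximum likelihood approach in (5) is not implemented here.

```python
import math
import statistics

def mean_log10(counts, loq, method):
    """Mean log10 count under one substitution rule for values below the LOQ.

    counts: CFU values per sample; None marks a sample reported as < LOQ.
    """
    substitutes = {
        "half_loq": loq / 2,                    # approach (3)
        "loq_over_sqrt2": loq / math.sqrt(2),   # approach (4)
    }
    logs = []
    for c in counts:
        if c is not None:
            # approach (2) adds one to every observation, not only to zeros
            logs.append(math.log10(c + 1) if method == "plus_one" else math.log10(c))
        elif method == "ignore":
            continue                            # approach (1): drop the sample
        elif method in ("zero_log", "plus_one"):
            logs.append(0.0)                    # 0 Log; log10(0 + 1) is also 0
        else:
            logs.append(math.log10(substitutes[method]))
    return statistics.mean(logs)

# Hypothetical data: two of six samples were below an assumed LOQ of 10 CFU
counts = [None, None, 150, 320, 80, 1200]
LOQ = 10
methods = ["ignore", "zero_log", "plus_one", "half_loq", "loq_over_sqrt2"]
results = {m: mean_log10(counts, LOQ, m) for m in methods}
for name, value in results.items():
    print(f"{name:>15}: {value:.2f} Log CFU")
```

With these illustrative numbers, ignoring the non-quantifiable samples gives the highest mean and setting them to 0 Log gives the lowest, with the LOQ-based substitutions in between.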

Despite the existence of these solutions to overcome values below the LOQ, when more than half of the data are not quantifiable, it is impossible to obtain an accurate estimate without biasing the results [42]. Moreover, in this type of case, reporting mean and standard deviation is questionable, so other estimates should be presented, such as the percentage of samples below the limit of detection and the range of the samples that were quantifiable [42].

Generally, two measurements are analyzed in food microbiology: (1) the occurrence of a microorganism, characterized in terms of presence or absence (proportion of food contaminated) and (2) the concentration of the microorganism in a food unit. Presence or absence is determined by detection methods, while concentrations are determined by quantitative enumeration methods. Although they are determined separately, they are closely related, as higher concentrations are more likely at higher presence levels [17]. This relationship brings up the concept of the limit of detection (LOD), which is the minimum concentration detected by a detection method, as distinct from the LOQ, which is defined as the minimum concentration enumerable by a quantitative enumeration method [17]. Typically, the LOD is smaller than the LOQ; therefore, when conducting an experiment in which both presence and concentration are estimated, it is recommended to quantify first and then run the detection method only on the samples with results below the LOQ.

Nowadays, due to the improvement of methodologies, there are methods that provide both presence and quantification results, combining binomial and continuous distributions [43,44,45]. However, this improvement has a downside for analysis, because common parametric methods are designed to deal with only one type of distribution at a time. The most common solution is to use separate analyses (one for presence and one for quantification); however, non-parametric tests that create different ranking groups for presence and quantification results based on the LOD, the LOQ, and quantifiable values offer an opportunity for their simultaneous analysis [46,47].

4. Types of Food Safety Experiments

This review will cover some of the most common types of food safety experiments, describing objectives, outcomes, and options for presenting the results graphically. A brief explanation of the statistical approaches that can be used to assess the objectives established within the experimental design will also be included.

4.1. Methodology Comparison

For the scope of this paper, methodology comparison studies will cover experiments in which different methods were used to analyze the same sample to find statistical differences in enumeration or detection of microorganisms. Typically, comparison between methodologies is performed due to a lack of information about the method tested in certain matrices or limited information in the literature regarding discrepancies between methods to enumerate or detect a certain microorganism [45,48,49,50,51,52,53,54,55,56]. Moreover, when companies develop new methodologies, it is common to conduct method comparison studies to validate the newly developed method [57,58,59,60,61].

For instance, Line et al. 2011 compared an automated most probable number technique with traditional plating methods for estimating populations of total aerobes, coliforms, and Escherichia coli associated with freshly processed broiler carcasses [52]. Owen et al. 2010 evaluated the same automated most probable number technique (TEMPO®), but for the enumeration of Enterobacteriaceae in food and dairy products [55]. Vargas et al. 2021 compared TEMPO® with 3M™ Petrifilm™ for aerobic counts in order to support the study’s validity, as the experimental design tested different lengths of time and used different microbial enumeration methods [56]. Meighan et al. 2016 validated Hygiena’s MicroSnap™ method for enumeration of total viable count in a variety of foods to provide supporting data for AOAC certification consideration [54]. Vargas et al. 2023 developed a novel methodology for quantification of Salmonella in pork and beef lymph nodes and also validated the methodology with a comparison study of commonly used methods [45]. These are just a few examples of method comparison studies with different objectives, following the same basic principle of validating a certain methodology while conducting an experiment.

4.1.1. Data Visualization and Data Analysis

The objective for this type of experiment is to show how different methodologies compare and contrast for detecting or enumerating a certain microorganism in a specific matrix. Therefore, the outcome should be any measurement that shows how comparable the methods are, such as p-values, confidence intervals, or correlation coefficients. There are several approaches to show, with a certain level of confidence, whether the methods are similar or different. However, the selection process requires an understanding of the statistical analysis and interpretation of the results.

(1): Means Comparison: when comparing two methods for enumeration of microorganisms, it is intuitive to determine whether the microbial means of both methods are equal [50,52,55]. The basic idea is quite simple: if the difference between the sample means is far enough from zero, there will be enough evidence that both methods are different. As a result, a Student’s t-test for paired samples or the non-parametric alternative, Wilcoxon’s test for paired samples, are the best alternatives (Figure 1A,B,D). A significant p-value will show enough evidence that both methods are different when enumerating a certain microorganism on a specific matrix. When more than two methods are to be compared, an ANOVA test followed by a pairwise comparison t-test or the non-parametric alternative Kruskal–Wallis test followed by a pairwise Wilcoxon’s test can be used for these purposes (Figure 1C). The multiple comparison test is performed only if the overall null hypothesis is rejected.

Figure 1A shows a bar chart with error bars (±3SE) representing the mean and variability of the two methods. A non-parametric statistical test was used for descriptive purposes. A more informative plot is shown in Figure 1B, where boxplots and actual data points are used to represent the observations collected during the experiment, with a Student’s t-test for statistical analysis. In each boxplot, the horizontal line crossing the box represents the median, the bottom and top of the box are the lower and upper quartiles, the upper whisker extends to the largest observation within 1.5 times the interquartile range above the upper quartile, and the lower whisker extends to the smallest observation within 1.5 times the interquartile range below the lower quartile. Furthermore, in Figure 1D, the difference in microbial counts between both methods was calculated and is displayed using a boxplot and the actual data points with a Student’s t-test. In these three figures (Figure 1A,B,D), the plots and statistical analyses are different approaches to the comparison between two methods. Figure 1C presents a three-method comparison; thus, a boxplot and the actual data points were used for graphical representation, and an ANOVA was performed to check differences between the means of the three methods. In all these cases, no statistically significant differences were found at an α value of 0.05, suggesting that the methods are comparable to each other.

(2): Linear Regression: When conducting a simple linear regression for method comparison, it is essential to establish the standard method (Method 1) and the methods to be compared (Methods 2 and 3). Then, the analysis will estimate two parameters, the slope and the intercept, that describe the relationship between the two variables and can be used to estimate the average rate of change (Figure 2). The slope represents the estimated increase in the dependent variable (Methods 2 and 3) per unit increase in the independent variable (Method 1); the greater the value of the slope, the greater the rate of change between the variables. The intercept represents the value associated with the dependent variable (Methods 2 and 3) when the independent variable (Method 1) is equal to zero [45,49,51,53,54,56].

For example, Figure 2 shows the graphical representation of the linear regression analysis of Methods 2 and 3 against Method 1. Ideally, the slope should take a value of 1, suggesting that for any 1 Log increase using the standard methodology, there will be a 1 Log increase using the alternative methodology. In addition, confidence intervals for the slope are estimated to determine, with a certain level of confidence, whether the estimated slope is different from 1 (Table 1).

Moreover, the intercept measures the similarity of the total magnitude between the methods. Ideally, the intercept should be equal to 0. The p-value for the intercept can be used to assess statistical significance, as it is calculated under the null hypothesis that the intercept is equal to zero; thus, a significant p-value indicates a discrepancy between methodologies (Table 1). For this example, the 95% confidence interval of the slope for Method 3 contained the value 1, and the p-value for the intercept was not significant (p = 0.653); therefore, there is no evidence that the methodologies differ.
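The slope and intercept checks described above can be sketched as follows, again with the Python standard library only. The paired log counts are hypothetical and do not reproduce Figure 2 or Table 1; the critical value 2.447 is the tabulated two-sided t value for 6 degrees of freedom at α = 0.05.

```python
import math

def simple_linear_regression(x, y):
    """Least-squares fit y = intercept + slope * x, with standard errors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    residual_se = math.sqrt(residual_ss / (n - 2))
    se_slope = residual_se / math.sqrt(sxx)
    se_intercept = residual_se * math.sqrt(1 / n + mean_x ** 2 / sxx)
    return slope, intercept, se_slope, se_intercept

# Hypothetical Log CFU results: standard method (x) vs. alternative method (y)
standard = [2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5]
alternative = [2.1, 2.4, 3.1, 3.4, 4.1, 4.4, 5.1, 5.4]

slope, intercept, se_slope, se_intercept = simple_linear_regression(standard, alternative)
T_CRIT = 2.447  # tabulated two-sided critical value, alpha = 0.05, df = n - 2 = 6
ci_low, ci_high = slope - T_CRIT * se_slope, slope + T_CRIT * se_slope
t_intercept = intercept / se_intercept  # tests the null hypothesis intercept = 0

print(f"slope = {slope:.3f}, 95% CI ({ci_low:.3f}, {ci_high:.3f})")
print(f"intercept = {intercept:.3f}, t = {t_intercept:.3f}")
print("slope CI contains 1:", ci_low <= 1 <= ci_high)
```

With these illustrative data, the confidence interval for the slope contains 1 and the intercept is not significantly different from 0, which is the pattern interpreted in the text as agreement between methods.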

(3): Proportion Comparison: the last two examples covered the comparison between enumeration methodologies, but sometimes methods give results that follow binomial distributions. A clear example in microbiology is the presence or non-detection of a certain microorganism. The basic idea behind this type of experiment is to determine whether the proportions of samples found in each category are statistically different [57,58,61]. McNemar’s chi-square test is often the best approach when two categories are created (Table 2). A significant p-value suggests that the methods do not classify samples into the two categories in the same way. In this case, no significant difference was found between Methods 1 and 2, while Method 3 is statistically different from Methods 1 and 2 (Table 2). An alternative is the test for the difference between two proportions, which can be implemented when comparing the proportion of positive (or negative) results between methods. A significant p-value serves as evidence to conclude that the methods are different when detecting positive (or negative) results. Note that this test is based on the central limit theorem, and in order to be valid, samples from each group must be reasonably large (at least 10 positive and 10 negative results in each sample) [34]. Finally, other measures such as sensitivity, specificity, positive predictive value, negative predictive value, and accuracy can also be estimated to evaluate the performance of the new method when compared with the gold standard.
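The following Python sketch illustrates McNemar’s chi-square test and the performance measures listed above for a new method compared against a gold standard. The counts are hypothetical and are not taken from Table 2. The test is computed without a continuity correction, and the p-value uses the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable.

```python
import math

def mcnemar_test(only_reference_positive, only_new_positive):
    """McNemar's chi-square (no continuity correction) on the discordant pairs."""
    b, c = only_reference_positive, only_new_positive
    if b + c == 0:
        raise ValueError("no discordant pairs: the test is undefined")
    statistic = (b - c) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(statistic / 2))  # chi-square survival, 1 df
    return statistic, p_value

def performance_measures(tp, fp, fn, tn):
    """Performance of the new method, taking the reference as the truth."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Hypothetical paired results for 100 samples tested by both methods
tp, fp, fn, tn = 40, 3, 5, 52   # fn and fp are the discordant pairs

statistic, p_value = mcnemar_test(fn, fp)
measures = performance_measures(tp, fp, fn, tn)

print(f"McNemar chi-square = {statistic:.2f}, p = {p_value:.3f}")
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Only the discordant pairs (samples classified differently by the two methods) enter McNemar’s statistic, which is why the test requires paired results from the same samples.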

4.2. Pathogen Prevalence Studies

As technology has advanced, the quantification of pathogens has become easier. However, there are various situations in which it is desired that the pathogen presence within a set of samples be compared. The most common studies that use pathogen prevalence to make inferences about a population are epidemiological studies. In this context, we must define the difference between incidence and prevalence. Incidence measures the frequency of new cases of a particular disease in a population over a period of time. In general, it is reported as x number of cases in a set population, for example, 23 cases in 100,000. Comparatively, prevalence refers to the proportion of a population that presents the evaluated characteristic, for example, the percentage of salmonellosis cases in Mexico. For the scope of this review, the proportion of pathogen presence in a set of samples will be compared; however, the prevalence of pathogens in a set population will not be reported.

For instance, pathogen presence variation due to seasonality has been compared using a chi-square test of independence to determine if the presence of a pathogen is significantly greater in the rainy season than in the dry season [62,63]. Similarly, differences in pathogen presence at different stages and times of a processing environment have been evaluated [64,65,66,67,68]. Large studies that evaluate epidemiological data need more complex analyses, which can be carried out by running logistic regression models that associate different factors with the prevalence of the pathogen of interest. Understanding which factors have a bigger impact on the prevalence of the pathogen makes it possible to predict more effectively the risk of detecting a pathogen in a given environment [69,70]. Multinomial logistic regression analysis will not be included in the scope of this review.

Data Visualization and Data Analysis

When comparing sets of samples with different foodborne pathogen presences, the distribution of the data compared will follow a binomial distribution (presence or non-detection in a sample). In this kind of study, the objective is to identify if a set of samples contains significant differences in pathogen presence. The most common comparison used involves the use of contingency tables, in which the statistical comparison used is a chi-square test of independence; however, other ways of presenting results, such as odds ratios with 95% confidence intervals, are alternatives widely used in food safety. For discussion purposes, a 2 × 2 contingency table will be presented. In this contingency table, the presence and non-detection of the foodborne pathogen of interest throughout different samples was evaluated (Table 3). The table analyzes the counts of the pathogen presence and non-detection and compares the different samples. Percentages or proportions were not used to perform this comparison.

Table 3 shows an example of a 2 × 2 contingency table in the context of pathogen presence comparison between 2 seasons (rainy vs. dry). In this case, the chi-square test will compare the distribution of the rows (rainy season against dry season). The chi-square test is best suited for comparing a large number of observations within each category. An alternative method (rarely used) is Fisher’s exact test, which calculates p-values for small sample sizes; however, it is only suitable when the marginal row and column totals are fixed, which is rarely the case in pathogen presence studies. The chi-square test uses the observed values of the contingency table and compares them against their expected values. If results are significant, the conclusion is that Salmonella presence is not independent of season, that is, the binomial distributions of Salmonella presence in the rainy and dry seasons differ; hence the name chi-square test of independence. For this example, the p-value is less than 0.05, which indicates that the sample distribution of Salmonella presence in the rainy season is different from that in the dry season. Therefore, it was concluded that the presence of Salmonella on the beef harvest floor is significantly greater in the rainy season than in the dry season. For multiple comparisons of 3 or more sample sets, the chi-square statistic is set up to test if at least one of the sample sets has a different distribution than the rest. If significant, a post hoc chi-square test can be carried out to compare distributions of each sample set and identify differences [37].
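A chi-square test of independence on a 2 × 2 table can be sketched as below. The counts are hypothetical and are not the values reported in Table 3; no continuity correction is applied, and the p-value formula is valid only for 1 degree of freedom, as in a 2 × 2 table.

```python
import math

def chi_square_2x2(table):
    """Chi-square test of independence for a 2 x 2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # count expected under independence
            statistic += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(statistic / 2))  # chi-square survival, 1 df
    return statistic, p_value

# Hypothetical counts: rows = season, columns = (presence, non-detection)
season_table = [
    [30, 70],   # rainy season
    [12, 88],   # dry season
]

statistic, p_value = chi_square_2x2(season_table)
print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
print("presence depends on season" if p_value < 0.05 else "no evidence of dependence")
```

As noted in the text, the test operates on the observed counts rather than on percentages, so the table must contain raw frequencies.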

Data in this kind of scenario within food microbiology are generally presented as a proportion of presence in a given sample, with the respective standard error of proportions as a table. Additionally, the most suitable graph representation of a contingency table would be a side-by-side bar chart indicating the frequencies at which each outcome occurs within each group. Pathogen presence studies are limited to positive or negative outcomes, and authors generally present their data as side-by-side bar charts comparing the proportion of presence among sample sets.

4.3. Intervention Studies

Different interventions can be applied in the food industry as control measures to avoid pathogenic contamination. Interventions can be classified as physical, chemical, or biological. Physical interventions include temperature treatments such as hot water, steam, chilling, and thermal treatments, among others [71]. Scheinburg et al. [72] conducted a study on high-pressure processing and boiling water treatments to reduce Listeria monocytogenes, E. coli O157:H7, Salmonella, and Staphylococcus aureus in beef jerky. Dixon et al. [73] observed that steam vacuum and high-pressure water have the potential to reduce microbial loads during beef processing. Biological interventions are an approach for consumers who look for more natural interventions, such as the use of prebiotics, probiotics, antioxidants, or natural processes such as fermentation [74]. A study that described pre- and post-harvest interventions to reduce microbial loads in the beef industry established that the use of probiotics is now more common and is effective as an alternative to avoid foodborne pathogens [75]. Balta et al. [76] conducted a project to analyze the effects of different probiotics added to the poultry diet on Campylobacter colonization, successfully showing reductions in bacterial populations in the cecum, feces, and other organs.

Chemical interventions are widely used in the food industry, especially in the meat industry. Antimicrobial interventions such as lactic acid, Citrilow, BoviBrom, etc., are implemented to reduce any potential pathogens present on the carcass surface [77]. Intervention studies are an efficient methodology that can be applied in a processing plant to assess the effectiveness of microbial load reduction “before” and “after” treatments. These interventions can be implemented from the beginning of the process at the lairage area to the final part of the process [78]. Different antimicrobial interventions can be applied in the food industry, but it is necessary to analyze and define the expected results and evaluate which intervention will fit better.

In some cases, an intervention can be successfully implemented with reliable results on its efficacy; however, that same intervention may not deliver the same results when applied to different matrices. Vargas et al. [56] evaluated the antimicrobial efficacy of an in-plant ozone intervention in variety meats (head, heart, and liver) compared to a lactic acid intervention. In both interventions, aerobic counts and Escherichia coli counts were significantly reduced after treatment, except for Escherichia coli counts in heart samples. In another study, Casas et al. [62] compared Bio-Safe aqueous ozone treatment to lactic acid treatment on beef carcasses and beef trimmings inoculated with Salmonella and E. coli surrogates. Muriana et al. [79] evaluated fourteen antimicrobials on inoculated lean beef wafers and beef subprimals with a cocktail containing four E. coli O157:H7 strains; seven antimicrobials were effective in terms of microbial reduction after treatment during blade tenderization. In pork, Fernandez et al. [65] fed alginate hydrogel beads to pigs during transportation to reduce microbial loads in the animals’ feces, evaluating loads at three stages of the process: “before fast”, “before transportation”, and “after transportation”, showing a significant difference between the “before fast” and “after transportation” sampling points. Furthermore, Vargas et al. [80] evaluated the effectiveness of five different antimicrobials on pork loins for up to 42 days in storage, assessing microbial loads and shelf life for both “before” and “after” treatment on each designated storage day. In poultry, Wideman et al. [81] evaluated different practices performed in six processing plants before and after interventions, establishing a baseline to assess which quantity of peroxyacetic acid, an antimicrobial that is widely used, was the most effective in reducing Campylobacter and Salmonella on carcass samples. Benli et al. [82] evaluated the antimicrobial efficacy of a sequential spray application of ɛ-polylysine (EPL) + acidic calcium sulfate (ACS) or lauramide arginine ethyl ester (LAE) + ACS on chicken carcasses inoculated with Salmonella, showing that both treatments effectively reduced Salmonella when comparing before and after treatment.

Data Visualization and Data Analysis

The objective for this type of experiment is to show how an intervention or a treatment applied to a certain product or surface impacts its microbial loads. Typically, enumeration or detection of microorganisms is performed before and after the application of the intervention; therefore, the outcome should be any measurement that demonstrates that the counts or presence of a certain microorganism before and after treatment are similar or different with a certain level of confidence. Similar methods as described in Section 4.1.1 can be used to analyze this type of study.

(1): Means comparison: As described in Section 4.1.1, the same principle of means comparison is used to evaluate differences before and after the application of an intervention. If the study was designed with random samples taken before and after intervention, a Student’s t-test or non-parametric alternative Wilcoxon’s test are the best alternatives for statistical analysis (Figure 3B). However, if results are desired to be presented as reductions (difference between before and after intervention), it is important to design the study properly, with paired samples, in order to use a Student’s t-test for paired samples or a non-parametric alternative Wilcoxon’s test for paired samples, which will give more robust results for analysis. Moreover, if two interventions are compared before and after, an ANOVA test followed by a pairwise comparison t-test or non-parametric alternative Kruskal–Wallis test followed by a pairwise Wilcoxon’s test can be used for these purposes (Figure 3A).

Figure 3A presents a boxplot with an ANOVA followed by a pairwise comparison-adjusted Tukey’s test for two antimicrobials tested before and after their use. Boxes with different letters are significantly different at p < 0.05. This analysis shows differences before and after each intervention. In addition, it compares the interventions at each sampling point (before and after). In this case, the initial loads before intervention for both antimicrobials were similar, and both antimicrobials had a significant reduction after application. Moreover, no difference in microbial loads after intervention was observed between the antimicrobials. On the other hand, Figure 3B presents a bar chart with error bars (±3SE) of the two antimicrobials tested before and after their use. This analysis only allows differences before and after the use of each specific antimicrobial to be assessed. For instance, there were significant differences before and after the use of the intervention for each specific antimicrobial. Furthermore, if reductions are to be presented, Figure 1D shows the graphical representation for this type of analysis.

(2): Proportion comparison: When presence or non-detection is to be evaluated before and after the use of a certain intervention, the methods and graphical representation described in Section 4.1.1 can be applied in this case. Briefly, McNemar’s chi-square test or a test for the difference between two proportions can be implemented. In both cases, a significant p-value serves as evidence to conclude that the proportions of positive (or negative) results before and after the intervention are different.

4.4. Bio-Mapping of Processing Facilities and Process Monitoring

Bio-mapping or process mapping refers to the practice of sampling at specific points in processing, which allows for an assessment of contamination levels. For instance, in poultry processing this is conducted by measuring the microbiological status of birds using a specific target organism or a class of organisms [83]. Similarly, bio-mapping can also be applied in the beef and pork industries. Bio-mapping can also be seen as a process of monitoring antimicrobial interventions, which can help processors meet performance standards while simultaneously improving the overall microbial quality of products through better process control [83,84]. Bio-mapping consists of collecting samples for one or more indicator bacteria and pathogens of interest at one or several points in the process. One of the major advantages of bio-mapping is that it allows facilities to use the collected data to compare pathogen levels on incoming and final product, which serves to determine if the process achieved the desired or intended levels of reduction of microbial loads.

Typically, when the process is performing adequately, bio-mapping results of indicator organisms provide a baseline level of the specific operation, which can be used to make adjustments in situations in which monitoring information indicates deviations from the “usual” performance. Furthermore, in cases when indicator data are used as a measure of pathogen control, then indicator results from bio-mapping are beneficial for food safety management decisions. Although a few process mapping studies have been conducted in the past [85,86], there has been a recent increased interest in bio-mapping due to regulatory changes, new Salmonella quantification methodologies, intervention assessment, and risk-based decision making, resulting in poultry [47,87], beef [56,62,88,89], pork [46], and leafy green [50] bio-mapping studies.

As evaluated in the study by De Villena et al. [47], poultry bio-mapping baselines paired with pathogen quantification serve as a basis for the development of statistical process control parameters for food safety management. Bio-mapping data can enhance statistics-based process control from farm to fork [84]. One of the major takeaways of bio-mapping studies is identifying where food safety risk is the greatest on the processing line. Microbial “hot spots” in the process can more easily be identified with bio-maps. In addition, bio-mapping helps optimize the process for improved microbial control, enables better resource utilization, and facilitates the implementation of newer technologies into the process. The experimental design of a bio-map is dependent on obtaining a microbial baseline of the overall process over time. A robust bio-map is one that assesses process control based on microbial indicators and/or pathogen tests relevant to the facility, ultimately facilitating food safety management. For example, 1000 samples can be collected over the course of 10 weeks (5 samples per sampling location, 2 shifts, 10 sampling locations) to construct a poultry facility’s bio-map. The proposed experimental design can result in a microbial baseline before and after the implementation of higher line speeds under the New Poultry Inspection System (NPIS). This baseline can also serve as an assessment of the impact of antimicrobial intervention modifications and process optimization based on the facility’s performance before and after interventions. In this example, 5 samples per shift are collected at each processing location to account for both origin and microbial accumulation, ensuring coverage across shift changes over 10 weeks (which may also provide data on seasonality) within the facility. Sampling for 10 weeks can provide information on week-to-week variability and insight regarding any facility operational or logistical issues.

Data Visualization and Data Analysis

(1): Boxplots: one way to display bio-mapping data is through the use of boxplots, as they constitute a powerful graphing tool to visualize sample data and to make comparisons across samples [90]. Boxplots can be used with a sample size greater than 5 and deliver more information regarding the tails of the distribution. A full description of the components of a boxplot is presented in Section 4.1.1. In practice, the statistical analysis of bio-mapping data for microbial indicators in a processing facility consists of an analysis of variance, when parametric assumptions are satisfied, as displayed in Figure 4B with a bio-map of a beef processing facility encompassing 5 sampling locations. ANOVA is a statistical method applied to uncover the main effects of independent variables on a dependent variable [32]. The between-group estimate of variance is calculated from the sum of squares of the differences between each group mean and the overall mean. A Kruskal–Wallis test is used as a non-parametric alternative to ANOVA when parametric assumptions are not met. When the ANOVA or Kruskal–Wallis test is found to be significant, a pairwise comparison is applied, typically using a pairwise t-test after a significant ANOVA or a pairwise Wilcoxon’s test after a significant Kruskal–Wallis test [37].

(2): Kernel Density Estimation: A kernel density estimation is an alternative technique used for visualization of bio-mapping data, based on the estimation of probability density function (pdf). The kernel estimation yields a smooth empirical pdf using the individual locations of all sample data [91]. Consequently, using this probability density function estimate is a better option for representation of a “true” pdf of a continuous variable. This approach is presented in Figure 4A with a bio-map of a beef processing facility, including 5 sampling locations and the assessment of 2 treatments.
(3): Shewhart’s Statistical Process Control: statistical process control (SPC) is an approach for continuous control of quality and safety which provides tools to monitor, analyze, and evaluate the control of a process with the goal of process efficiency optimization [32]. In terms of process optimization, SPC utilizes monitoring, analysis, and evaluation to achieve a stable process. SPC represents another way to present and visualize microbiological data such as those obtained from bio-mapping. Control charts serve to indicate whether a process is in a state of control and can help determine when changes or deviations occur. Two control charts used to display bio-mapping data are the R-chart (based on the range of the results), as seen in Figure 5B, and the S-chart (based on the sample standard deviation), presented in Figure 5A. The applicability of these two types is based on their ability to monitor variations within a process [32]. As its name implies, the R-chart uses the range of a sample set to monitor process variation. Control limits are established for range data sets by setting upper and lower control limits symmetrically about the target center line. Figure 5B applies the R-chart to a pork bio-map to compare the gambrel table location with the post-intervention location. With a sample size greater than n = 12, the standard deviation, and therefore an S-chart, is a better option to monitor variation [32]. When using an S-chart, s is calculated by averaging the sample standard deviations of the sample sets. Figure 5A presents the SPC of a 10-week poultry bio-map using an S-chart to establish the upper and lower control limits. This plot is very useful for comparing initial and final sampling locations, which provides information on the effect of interventions, and for identifying weeks in which process deviations occurred.

(4): Statistical Process Control (XmR Chart): This type of statistical process control is commonly used to monitor industrial processes with a subgroup of size one. X is the data being measured, and mR is the moving range, which is calculated as the difference between consecutive measurements [34]. In an XmR chart, as shown in Figure 6, the center line represents the mean of the data, while the dashed lines represent the upper and lower control limits. As mentioned before, control limits can be established using the standard deviation or the range; however, for this type of chart, the sequential deviation is calculated, as it better explains the random process variation, whereas the standard deviation is strongly influenced by systematic variation [50]. After the plot is created, violation analysis can be performed under a set of four rules established by Shewhart to determine whether the process is in control or out of control. As an example, one of the rules states that any point outside the control limits is evidence that a process is out of control [32,34]. In microbiology, observations below the lower control limit (LCL) are not considered out of control, as lower counts represent better microbial performance. However, observations above the upper control limit (UCL) are considered for violation analysis. In our case, one observation in live receiving is above the UCL, suggesting that our process is out of control.
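A minimal sketch of the XmR calculation is shown below, assuming the widely used factor 2.66 (3 divided by the d2 constant 1.128 for ranges of two points) to convert the mean moving range into 3-sigma limits. The counts are hypothetical, and only the rule about points beyond the control limits is applied, restricted to the UCL as described above.

```python
from statistics import mean


def xmr_limits(values):
    """Return (LCL, center line, UCL) for an individuals (X) chart."""
    # Moving range: absolute difference between consecutive measurements.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    mr_bar = mean(moving_ranges)
    # 2.66 = 3 / 1.128, where 1.128 is the d2 constant for ranges of two points.
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar


# Hypothetical counts (log CFU/mL), one observation per sampling event.
counts = [3.1, 3.3, 3.0, 3.2, 3.4, 3.1, 5.2, 3.2, 3.0, 3.3]
lcl, center, ucl = xmr_limits(counts)

# In microbiology only points above the UCL are treated as violations.
above_ucl = [i for i, x in enumerate(counts) if x > ucl]
print(f"center={center:.2f}, LCL={lcl:.2f}, UCL={ucl:.2f}, above UCL: {above_ucl}")
```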

4.5. Shelf Life Studies

Food preservation has been practiced for a very long time. Although practices have changed over the years, the main objective remains for food to be safe for consumption and to last longer while maintaining high-quality traits. Different methods have been implemented for food preservation, such as fermentation, salting, drying, irradiation, freezing, and antimicrobial agents, and the type of packaging plays an important role depending on the type of food [92]. Methods do not have to be used alone; combinations of strategies have proven favorable for food preservation, resulting in longer shelf life. One of the main challenges when it comes to shelf life is the consumer’s perception of the product, which relates to the period from when the food is produced or processed and is acceptable for consumption to when it is no longer acceptable from a quality and safety point of view [93]. For consumers, important factors considered before purchase, for example in meat products, are appearance, such as visible fat, and quality traits, such as flavor, tenderness, and juiciness [94]. This is one of the reasons research projects are conducted on meat color and shelf life: to maintain food at a high level of quality for the longest time possible.

Different methods can be used to extend shelf life, and they are very effective in inhibiting both pathogenic and spoilage microorganisms. The food industry continues to research how to extend food shelf life by introducing novel strategies. Approximately 30–40 percent of the food supply is wasted in the United States per year [95]. This can happen because food that is not stored properly is no longer safe to consume, or because its quality has declined to the point that it is no longer acceptable to the consumer.

Shelf life can be evaluated in different ways depending on which method or combination of methods has been used. Vargas et al. [96] conducted a study on pork loins treated with different antimicrobials to evaluate the instrumental and visual color of both fat and muscle from a quality perspective, with further analysis of indicator microorganisms. Casas et al. [88] determined the effect on microbial loads after spray- or dry-chilling combined with hot water treatment for up to 135 days. Steele et al. [97] evaluated internal meat temperature, case temperature, and the visual and instrumental color of pork chops, beef steaks, ground beef, and ground turkey displayed under LED and FLS lighting separately to determine the effect of lighting on color. Allen et al. [98] evaluated the physical and microbiological properties of broiler breasts of different colors to determine meat quality and shelf life. Xu et al. [99] applied a food-grade microbial culture to fresh meat to extend shelf life and additionally surveyed Australians on their perception of this novel approach. Guo et al. [100] conducted a study to determine the effects of normal and modified atmosphere packaging by evaluating the physicochemical and microbiological properties of roast chicken meat. Bolton et al. [101] evaluated the effect of chemical treatments such as lactic acid, citric acid, trisodium phosphate, and acidified sodium chlorite on microbial loads and shelf life.

Data Visualization and Data Analysis

When conducting shelf life studies, the main question to be answered is, “what is the shelf life of a certain product?”, meaning the maximum allowable time in which a food will remain safe and retain all its sensorial and functional characteristics when stored under specific conditions (temperature and oxygen) [96,102]. Many factors must be considered when conducting shelf life studies, as the shelf life of a product can end not only when a certain microbial concentration is reached but also because of changes in color, odor, overall appearance, sensorial properties, etc. For the scope of this paper, only microbial measurements are presented; however, the statistical analyses and visualization plots can also be used for other types of measurement conducted in shelf life studies, such as color, odor, overall liking, volatile compound analysis, oxidation byproducts, pH, etc. Since the objective of this type of study is to determine the shelf life of a product, the analysis should provide enough evidence that a product will last a certain amount of time under specific conditions with a certain level of confidence. Different approaches and examples are described below:

(1): The most common statistical approach to shelf life studies consists of evaluating the effect of a certain treatment over a certain amount of time. Since two factors are evaluated (treatment and time), a two-way ANOVA will provide information about the effect of treatment and time by themselves (main effects) on microbial counts and about the interaction between both factors (treatment × time). If the interaction is significant, the change in the dependent variable (microbial counts) over time differs depending on group membership (treatment). In that case, pairwise comparisons between the treatment × time combinations need to be performed to find differences between treatments at specific time points (Figure 7B). However, if the interaction is not significant, the main effects should be evaluated, understanding that the change in the dependent variable over time does not differ depending on group membership, and the results should be presented as they would be for a one-way ANOVA performed on each of the main effects.
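The partition behind a two-way ANOVA with interaction can be sketched for a balanced design as follows. The data are hypothetical, the function name is illustrative, and only the F statistics are computed; turning them into p-values requires the F distribution, which is left to a statistics package.

```python
from statistics import mean


def two_way_anova(data):
    """F statistics for a balanced two-factor design.

    `data` maps (treatment, time) to a list of replicate counts.
    """
    a_levels = sorted({a for a, _ in data})
    b_levels = sorted({b for _, b in data})
    n = len(next(iter(data.values())))  # replicates per cell (balanced design)

    grand = mean(v for cell in data.values() for v in cell)
    a_mean = {a: mean(v for b in b_levels for v in data[(a, b)]) for a in a_levels}
    b_mean = {b: mean(v for a in a_levels for v in data[(a, b)]) for b in b_levels}
    cell_mean = {key: mean(cell) for key, cell in data.items()}

    # Sums of squares for the main effects, the interaction, and the error.
    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    ss_ab = n * sum((cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
                    for a in a_levels for b in b_levels)
    ss_err = sum((v - cell_mean[key]) ** 2 for key, cell in data.items() for v in cell)

    df_a, df_b = len(a_levels) - 1, len(b_levels) - 1
    df_err = len(a_levels) * len(b_levels) * (n - 1)
    ms_err = ss_err / df_err
    return {
        "treatment": (ss_a / df_a) / ms_err,
        "time": (ss_b / df_b) / ms_err,
        "interaction": (ss_ab / (df_a * df_b)) / ms_err,
    }


# Hypothetical counts (log CFU/mL), three replicates per treatment x time cell.
data = {
    ("Control", 0): [3.0, 3.2, 3.1], ("Control", 48): [4.5, 4.7, 4.6],
    ("Control", 96): [6.0, 6.2, 6.1], ("Treated", 0): [3.1, 2.9, 3.0],
    ("Treated", 48): [3.8, 4.0, 3.9], ("Treated", 96): [4.6, 4.8, 4.7],
}
f_stats = two_way_anova(data)
print({name: round(f, 1) for name, f in f_stats.items()})
```

In this made-up data set the treated group grows more slowly than the control, so the interaction F statistic is large, which is the situation that calls for pairwise comparisons between treatment × time combinations.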

Furthermore, some experiments may evaluate more than two factors, and each factor can have several levels. When multiple factors are analyzed, mean comparisons can be very difficult to interpret, especially when interactions are found to be significant. A possible solution is to first test whether the effect of time is statistically significant (which it normally is in microbiological experiments, as bacteria grow over time) and, if it is, exclude time from the initial model and then analyze differences between the other factors at each specific time point (Figure 7A).

Figure 7A shows a boxplot comparing microbial counts between a control group and a treatment group. Differences between the groups were assessed with a t-test at each time point, and boxes with different letters are significantly different at p < 0.05. In this case, the effect of time was pre-assessed and found to be significant (p < 0.001); time was then removed from the model, and the effect of treatment was evaluated at each time point. No difference was found between the control and treated groups at 0 h; however, at 48 h and 96 h, the differences between the groups were significant. Figure 7B, on the other hand, shows a line plot of the same data presented in Figure 7A, but analyzed with a two-way ANOVA that yielded a significant interaction, followed by pairwise comparisons between the treatment × time groups. Means with different letters are significantly different at p < 0.05. Unlike Figure 7A, this analysis allows us to compare the means of the control and treated groups across 0 h, 48 h, and 96 h, rather than only within each time point. The choice between these types of analysis depends on the outcome you want to achieve.
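The per-time-point comparison can be sketched as a t statistic computed separately at each sampling time. The example assumes Welch's unequal-variance form of the t-test, which may differ from the test used for Figure 7A, and the counts are hypothetical. Each statistic is compared with 2.447, the two-sided 5% critical value of the t distribution with 6 degrees of freedom, which matches this particular data set.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


# Hypothetical counts (log CFU/mL): (control, treated) at each display time in hours.
by_time = {
    0: ([3.0, 3.2, 3.1, 3.3], [3.1, 2.9, 3.2, 3.0]),
    48: ([4.5, 4.7, 4.6, 4.8], [3.8, 4.0, 3.9, 4.1]),
    96: ([6.0, 6.2, 6.1, 6.3], [4.6, 4.8, 4.7, 4.9]),
}

T_CRIT = 2.447  # two-sided 5% critical value of t with 6 degrees of freedom

results = {}
for hours, (control, treated) in by_time.items():
    t, df = welch_t(control, treated)
    results[hours] = (t, df, abs(t) > T_CRIT)
    print(f"{hours:>2} h: t = {t:.2f}, df = {df:.1f}, significant = {abs(t) > T_CRIT}")
```

Because one test is run per time point, a correction for multiple comparisons may be worth considering when many time points are evaluated.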

(2): Multivariate analysis: more complex analyses can be performed in shelf life studies, especially when multiple dependent variables must be analyzed, which is usually the case. As mentioned before, many factors need to be measured to establish the shelf life of a product, which is why multivariate analyses such as principal component analysis (PCA) or linear discriminant analysis (LDA) are great tools to simplify the complexity of high-dimensional data while enabling us to observe trends and patterns. When multiple factors are evaluated one at a time, the risk of error increases and multiple-testing correction is needed when assessing each factor and its association with one or several outcomes. The main objective of PCA is to reduce data by projecting them onto dimensions called principal components (PCs) that summarize the data by maximizing the correlation between the data and their projection onto the PCs while trying to maximize the explained variance of the model (Figure 8A). Scale matters with PCA: a set of variables with larger magnitudes than the others will dominate the PCA, so that the remaining variables are effectively ignored and the results are driven by the high-magnitude variables. This is why standardization is a key step before conducting PCA.

Figure 8A shows a PCA of microbial counts of three different microorganisms for samples with and without an intervention (Control and Treatment A) displayed for 96 h. Microbial enumeration was performed at 0 h, 48 h, and 96 h of display under refrigerated conditions. For the raw-data PCA, PC1 explained 96.76% and PC2 explained 2.47% of the variation associated with microbial counts of the treatment × time combinations. PC1 clearly separated all group combinations by time, while PC2 separated all group combinations by treatment. Bacteria C are more associated with the control group, suggesting that more growth of bacteria C was promoted when no treatment was applied. Moreover, bacteria A and bacteria B are more associated with the treatment group, suggesting that more growth of bacteria A and B was promoted when the treatment was applied. Of course, this is a simple example with only three dependent variables, but many more variables can be analyzed, and associations can be estimated between group combinations and the dependent variables [96,103].
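The effect of scale on PCA can be illustrated with a small sketch that extracts the first principal component with and without standardization. The data are hypothetical (two variables in log units and one in raw units), and power iteration is used only to keep the example free of external libraries; a statistics package would normally perform the full decomposition.

```python
import math
from statistics import mean, stdev


def standardize(columns):
    """Center each variable and scale it to unit standard deviation."""
    return [[(v - mean(c)) / stdev(c) for v in c] for c in columns]


def covariance_matrix(columns):
    n = len(columns[0])
    centered = [[v - mean(c) for v in c] for c in columns]
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in centered]
            for ci in centered]


def first_pc(matrix, iters=500):
    """Power iteration: loadings of PC1 and the share of variance it explains."""
    v = [1.0] * len(matrix)
    for _ in range(iters):
        w = [sum(m * x for m, x in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(m * x for m, x in zip(matrix[i], v))
                     for i in range(len(v)))
    total_variance = sum(matrix[i][i] for i in range(len(matrix)))
    return v, eigenvalue / total_variance


# Hypothetical measurements on six samples.
bacteria_a = [3.0, 3.2, 4.5, 4.7, 6.0, 6.2]    # log CFU/mL
bacteria_b = [2.0, 2.2, 2.9, 3.1, 3.9, 4.1]    # log CFU/mL
raw_scale = [1200, 900, 1500, 1100, 1300, 1000]  # a variable in much larger units
columns = [bacteria_a, bacteria_b, raw_scale]

raw_loadings, raw_share = first_pc(covariance_matrix(columns))
std_loadings, std_share = first_pc(covariance_matrix(standardize(columns)))
print("raw:         ", [round(x, 3) for x in raw_loadings], round(raw_share, 3))
print("standardized:", [round(x, 3) for x in std_loadings], round(std_share, 3))
```

Without standardization, PC1 is essentially the large-magnitude variable alone; after standardization, PC1 reflects the two correlated bacterial counts instead.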

(3): Multiple linear regression: when conducting shelf life studies, a threshold value is typically established to assess whether a certain attribute has failed. Predicting when this threshold value will be reached makes it possible to determine the shelf life of a product with respect to that attribute. Multiple linear regression analysis is a more complex type of statistical analysis that uses several explanatory variables to predict the outcome of a response variable, unlike simple linear regression, in which only one independent variable is used to model the response of a dependent variable. In shelf life studies, several factors with multiple levels are normally evaluated to describe their effect on the shelf life of a product, which is why multiple linear regression is an interesting alternative for this type of experiment. An analysis of variance (ANOVA) compares the means of the groups, so its results are tied to the factor levels established at the beginning of the trial. As an example, imagine that you evaluated the growth of a specific microorganism at different temperatures. The analysis of variance will report differences between the group means for each specific temperature evaluated. A more informative conclusion would be how much growth is affected when temperature changes by one unit. This brings up the concept of marginal effect, which measures the impact that a unit of change in a variable has on the outcome variable while other variables are held constant. In simple linear regression, the slope of the model measures the marginal effect of the independent variable on the dependent variable, as only one variable is modeled. However, if multiple variables are used in the regression, the marginal effect needs to be calculated with more advanced mathematical methods.
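As a brief illustration of the marginal effect, consider a generic model with two explanatory variables and their interaction. The symbols x_1, x_2, and y are illustrative and are not taken from the original text; the marginal effect of x_1 is the partial derivative of the expected response with respect to x_1.

```latex
\begin{aligned}
y &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon \\
\frac{\partial\, \mathrm{E}[y]}{\partial x_1} &= \beta_1 + \beta_3 x_2
\end{aligned}
```

Without the interaction term (β_3 = 0), the marginal effect reduces to the single coefficient β_1, as in simple linear regression; with it, the effect of x_1 depends on the level at which x_2 is held.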

Figure 8B shows the use of multiple linear regression analysis to evaluate shelf life studies. Microbial counts were established as the dependent variable, treatment as a qualitative variable with three levels (Control, Treatment A [TrtA], and Treatment B [TrtB]), and display time as a continuous variable. The control group was used as the base level for the qualitative variable, and 95% confidence intervals were created for each slope. As an example, the model for Bacteria A is described in Equation (8).

$$\mathrm{Counts} = \beta_{0} + \beta_{1}\,\mathrm{TrtA} + \beta_{2}\,\mathrm{TrtB} + \beta_{3}\,\mathrm{Time} + \beta_{4}\,(\mathrm{TrtA} \times \mathrm{Time}) + \beta_{5}\,(\mathrm{TrtB} \times \mathrm{Time}) + \varepsilon \qquad (8)$$

in which β_{i} are parameters, and ε is the error term.

As an example, the slope of the control group for Bacteria A is 0.70 Log CFU/mL per day, suggesting that for every one-day increase in display time, there is an increase of 0.70 Log CFU/mL of Bacteria A. The same interpretation applies to each of the treatments (control, Treatment A, and Treatment B) and types of bacteria (Bacteria A and Bacteria B). Then, the 95% confidence intervals can be used for statistical comparison. For instance, Treatment A is different from Treatment B and the control group for both microorganisms. Additionally, the intercept of the regression model denotes the value of the dependent variable when the independent variables are equal to zero. In our example, the intercept represents the initial microbial load of the sample before the display time starts.
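A minimal sketch of fitting a model of the form of Equation (8) by ordinary least squares is shown below. The data, the treatment slopes other than the 0.70 quoted for the control, and the spoilage limit of 7 Log CFU/mL are all hypothetical, and the normal equations are solved by hand only to keep the example self-contained. Confidence intervals are not computed here.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical data: counts grow linearly from 2.0 Log CFU/mL with small noise.
days = [0, 1, 2, 3, 4]
noise = [0.05, -0.05, 0.0, -0.05, 0.05]
true_slope = {"Control": 0.70, "TrtA": 0.30, "TrtB": 0.65}

rows, y = [], []
for trt, slope in true_slope.items():
    trt_a, trt_b = float(trt == "TrtA"), float(trt == "TrtB")  # dummy coding
    for d, e in zip(days, noise):
        # Columns: intercept, TrtA, TrtB, Time, TrtA x Time, TrtB x Time
        rows.append([1.0, trt_a, trt_b, d, trt_a * d, trt_b * d])
        y.append(2.0 + slope * d + e)

b0, b1, b2, b3, b4, b5 = fit_ols(rows, y)

# Slope of each group: base slope plus that group's interaction coefficient.
slopes = {"Control": b3, "TrtA": b3 + b4, "TrtB": b3 + b5}
intercepts = {"Control": b0, "TrtA": b0 + b1, "TrtB": b0 + b2}

LIMIT = 7.0  # hypothetical spoilage threshold, Log CFU/mL
days_to_limit = {g: (LIMIT - intercepts[g]) / slopes[g] for g in slopes}
for g in slopes:
    print(f"{g}: slope = {slopes[g]:.2f} per day, limit reached at day {days_to_limit[g]:.1f}")
```

With the control group as the base level, the interaction coefficients express how much each treatment changes the growth rate relative to the control, and dividing the distance to the threshold by a group's slope gives a point estimate of its shelf life.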

5. Conclusions

The main objective of food safety is to protect consumers from health risks related to foodborne pathogens associated with the consumption of food. Data collection, appropriate and reliable methods, suitable statistical analysis, and effective data visualization and communication are crucial steps when conducting any type of experiment, as one without the others will not result in reliable conclusions. A simple guide that presents alternatives for statistical analysis and visualization for the most common types of food safety experiments can be a tool for current and future researchers, helping them answer the questions established in their studies while presenting their results in a clear and well-developed way. This article covered some of the most basic food safety experiments with a few options for statistical analysis approaches and visualization techniques that can be implemented; however, many more complex and interesting alternatives are available to analyze and present the outcomes of food safety experiments.

Author Contributions

Conceptualization, D.A.V., S.E.G., M.M.B., M.F.M. and M.X.S.-P.; methodology, D.A.V., M.X.S.-P., A.M.O.-D., K.M.R., R.B.L., N.V. and D.E.C.; validation, D.A.V., M.X.S.-P., A.M.O.-D., K.M.R., R.B.L., N.V. and D.E.C.; investigation D.A.V., M.X.S.-P., A.M.O.-D., K.M.R., R.B.L., N.V. and D.E.C.; data curation, D.A.V.; writing—original draft preparation, D.A.V., A.M.O.-D., K.M.R., R.B.L., N.V. and D.E.C.; writing—review and editing, D.A.V., S.E.G. and M.X.S.-P.; visualization, D.A.V.; supervision, M.X.S.-P.; project administration, D.A.V. All authors have read and agreed to the published version of the manuscript.

Funding

This review was funded by the International Center for Food Industry Excellence (ICFIE) at Texas Tech University.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflicts of interest.

References

White, J.C.; Haven, N. USDA NIFA Workshop on Toxic Elements in Food: Identification of Critical Knowledge Gaps. 2022. Available online: https://portal.ct.gov/-/media/caes/documents/publications/press_releases/2022/october-20/nifa-c2z-workshop-full-report_toxic-elements-in-food.pdf (accessed on 16 March 2023).
Hedberg, C.W. Foodborne illness acquired in the United States. Emerg. Infect. Dis. 2011, 17, 1338–1340. [Google Scholar] [CrossRef] [PubMed]
Interagency Food Safety Analytics Collaboration. Foodborne Illness Source Attribution Estimates for Salmonella, Escherichia coli O157, Listeria monocytogenes, and Campylobacter using Outbreak Surveillance Data, United States; Interagency Food Safety Analytics Collaboration: Bossier City, LA, USA, 2022.
Bintsis, T. Foodborne pathogens. AIMS Microbiol. 2017, 3, 529–563. [Google Scholar] [CrossRef] [PubMed]
Newman, K.L.; Leon, J.S.; Rebolledo, P.A.; Scallan, E. The impact of socioeconomic status on foodborne illness in high-income countries: A systematic review. Epidemiol. Infect. 2015, 143, 2473–2485. [Google Scholar] [CrossRef] [PubMed]
Granato, D.; de Araújo Calado, V.Ô.M.; Jarvis, B. Observations on the use of statistical methods in Food Science and Technology. Food Res. Int. 2014, 55, 137–149. [Google Scholar] [CrossRef]
Mishra, P.; Pandey, C.M.; Singh, U.; Anshul, G.; Sahu, C.; Keshri, A. Descriptive Statistics and Normality Tests for Statistical Data. Ann. Card. Anaesth. 2019, 22, 67–72. [Google Scholar] [PubMed]
Pampoukis, G.; Lytou, A.E.; Argyri, A.A.; Panagou, E.Z.; Nychas, G.J.E. Recent Advances and Applications of Rapid Microbial Assessment from a Food Safety Perspective. Sensors 2022, 22, 2800. [Google Scholar] [CrossRef] [PubMed]
Bland, M. Introduction to Medical Statistics, 4th ed.; Oxford University Press: Oxford, UK, 2015. [Google Scholar]
Kaplan, R.M.; Chambers, D.A.; Glasgow, R.E. Big data and large sample size: A cautionary note on the potential for bias. Clin. Transl. Sci. 2014, 7, 342–346. [Google Scholar] [CrossRef]
Sundaram, K.R.; Dwivedi, S.N.; Sreenivas, V. Medical Statistics: Principles and Methods, 2nd ed.; Medknow Publications and Media Pvt. Ltd.: New Delhi, India, 2015. [Google Scholar]
Chik, A.H.S.; Schmidt, P.J.; Emelko, M.B. Learning Something From Nothing: The Critical Importance of Rethinking. Front. Microbiol. 2018, 9, 2304. [Google Scholar] [CrossRef] [PubMed]
Tropea, A. Microbial Contamination and Public Health: An Overview. Int. J. Environ. Res. Public Health 2022, 19, 7441. [Google Scholar] [CrossRef]
Emelko, M.B.; Schmidt, P.J.; Reilly, P.M. Particle and microorganism enumeration data: Enabling quantitative rigor and judicious interpretation. Environ. Sci. Technol. 2010, 44, 1720–1727. [Google Scholar] [CrossRef]
Gracias, K.S.; McKillip, J.L. A review of conventional detection and enumeration methods for pathogenic bacteria in food. Can. J. Microbiol. 2004, 50, 883–890. [Google Scholar] [CrossRef] [PubMed]
Lund, B.; Baird-Parker, T.C.; Gould, G.W. Microbiological Safety and Quality of Food, 1st ed.; Springer Science & Business Media: Gaithersburg, MD, USA, 2000. [Google Scholar]
Duarte, A.S.R.; Stockmarr, A.; Nauta, M.J. Fitting a distribution to microbial counts: Making sense of zeroes. Int. J. Food Microbiol. 2015, 196, 40–50. [Google Scholar] [CrossRef] [PubMed]
Schijven, J.F.; De-Roda-Husman, A.M. Applications of Quantitative Microbial Source Tracking and Quantitative Microbial Risk Assessment. In Microbial Source Tracking: Methods, Applications & Case Studies; Springer: New York, NY, USA, 2011. [Google Scholar]
Busschaert, P.; Geeraerd, A.H.; Uyttendaele, M.; Van Impe, J.F. Hierarchical Bayesian analysis of censored microbiological contamination data for use in risk assessment and mitigation. Food Microbiol. 2011, 28, 712–719. [Google Scholar] [CrossRef] [PubMed]
Gao, A.; Martos, P. Log transformation and the effect on estimation, implication, and interpretation of mean and measurement uncertainty in microbial enumeration. J. AOAC Int. 2019, 102, 233–238. [Google Scholar] [CrossRef] [PubMed]
Gilchrist, J.E.; Campbell, J.E.; Donnelly, C.B.; Peeler, J.T.; Delaney, J.M. Spiral plate method for bacterial determination. Appl. Microbiol. 1973, 25, 244–252. [Google Scholar] [CrossRef] [PubMed]
Kilsby, D.C.; Pugh, M.E. The Relevance of the Distribution of Microorganisms Within Batches of Food to the Control of Microbiological Hazards from Foods. J. Appl. Bacteriol. 1981, 51, 345–354. [Google Scholar] [CrossRef] [PubMed]
Gherezgihier, B.A.; Mahmud, A.; Samuel, M.; Tsighe, N. Methods and Application of Statistical Analysis in Food Technology. J. Acad. Ind. Res. 2017, 6. [Google Scholar]
Weiss, N.A. Introductory Statistics, 10th ed.; Pearson: New York, NY, USA, 2015. [Google Scholar]
Chang, W. R Graphics Cookbook: Practical Recipes for Visualizing Data, 2nd ed.; O’Reilly Media: Sebastopol, CA, USA, 2018. [Google Scholar]
Unwin, A. Why is Data Visualization Important? What is Important in Data Visualization? Harvard Data Sci. Rev. 2020, 1–7. [Google Scholar]
Unwin, A. Graphical Data Analysis with R; Chapman & Hall/CRC: Boca Raton, FL, USA, 2015. [Google Scholar]
Wilkinson, L. The Grammar of Graphics, 2nd ed.; Springer: Berlin/Heidelberg, Germany, 2005. [Google Scholar]
Wickham, H. Elegant Graphics for Data Analysis; Springer: Berlin/Heidelberg, Germany, 2009. [Google Scholar]
Freeman, J.V.; Walters, S.J.; Campbell, M.J. How to Display Data, 1st ed.; Blackwell Publishing: Malden, MA, USA, 2008. [Google Scholar]
Aitken, M.; Broadhurst, B.; Hladky, S. Mathematics for Biological Scientists; Taylor & Francis Group: New York, NY, USA, 2010. [Google Scholar]
Jarvis, B. Statistical Aspects of Sampling of Microbiological Analysis. In Statistical Aspects of the Microbiological Examination of Foods, 3rd ed.; Academic Press: London, UK, 2016. [Google Scholar]
Bliss, C.I.; Fisher, R.A. Fitting the Negative Binomial Distribution to Biological Data. Biometrics 1953, 9, 176. [Google Scholar] [CrossRef]
Navidi, W. Statistics for Engineers and Scientists, 3rd ed.; McGraw-Hill: New York, NY, USA, 2011. [Google Scholar]
McDonald, J.H. Handbook of Biological Statistics, 3rd ed.; Sparky House Publishing: Baltimore, MD, USA, 2014. [Google Scholar]
Fagerland, M.W.; Sandvik, L.; Mowinckel, P. Parametric methods outperformed non-parametric methods in comparisons of discrete numerical variables. BMC Med. Res. Methodol. 2011, 11, 44–48. [Google Scholar] [CrossRef]
Conover, W.J. Practical NonParametric Statistics, 3rd ed.; John Wiley & Sons Inc.: Hoboken, NJ, USA, 1999. [Google Scholar]
Jarvis, B. Frequency distributions. In Statistical Aspects of the Microbiological Examination of Foods, 3rd ed.; Jarvis, B., Ed.; Academic Press: Cambridge, MA, USA, 2016. [Google Scholar]
Mayya, S.S.; Monteiro, A.D.; Ganapathy, S. Types of biological variables. J. Thorac. Dis. 2017, 9, 1730–1733. [Google Scholar] [CrossRef] [PubMed]
Hawkins, D. Biomeasurement: A Student’s Guide to Biological Statistics, 4th ed.; Oxford University Press: Oxford, UK, 2019. [Google Scholar]
Nehls, G.J.; Akland, G.G. Procedures for Handling Aerometric Data. J. Air Pollut. Control Assoc. 1973, 23, 180–184. [Google Scholar] [CrossRef]
Hornung, R.W.; Reed, L.D. Estimation of Average Concentration in the Presence of Nondetectable Values. Appl. Occup. Environ. Hyg. 1990, 5, 46–51. [Google Scholar] [CrossRef]
Applegate, S.F.; Englishbey, A.K.; Stephens, T.P.; Sanchez-Plata, M.X. Development and Verification of a Poultry Management Tool to Quantify Salmonella from Live to Final Product Utilizing RT-PCR. Foods 2023, 12, 419. [Google Scholar] [CrossRef] [PubMed]
Chaney, W.E.; Englishbey, A.K.; Stephens, T.P.; Applegate, S.F.; Sanchez-Plata, M.X. Application of a Commercial Salmonella Real-Time PCR Assay for the Detection and Quantitation of Salmonella enterica in Poultry Ceca. J. Food Prot. 2022, 85, 527–533. [Google Scholar] [CrossRef] [PubMed]
Vargas, D.A.; Betancourt-Barszcz, G.K.; Blandon, S.E.; Applegate, S.F.; Brashears, M.M.; Miller, M.F.; Gragg, S.E.; Sanchez-Plata, M.X. Rapid Quantitative Method Development for Beef and Pork Lymph Nodes Using BAX® System Real Time Salmonella Assay. Foods 2023, 12, 822. [Google Scholar] [CrossRef] [PubMed]
Bueno López, R.; Vargas, D.A.; Jimenez, R.L.; Casas, D.E.; Miller, M.F.; Brashears, M.M.; Sanchez-Plata, M.X. Quantitative Bio-Mapping of Salmonella and Indicator Organisms at Different Stages in a Commercial Pork Processing Facility. Foods 2022, 11, 2580. [Google Scholar] [CrossRef] [PubMed]
De Villena, J.F.; Vargas, D.A.; López, R.B.; Chávez-Velado, D.R.; Casas, D.E.; Jiménez, R.L.; Sanchez-Plata, M.X. Bio-Mapping Indicators and Pathogen Loads in a Commercial Broiler Processing Facility Operating with High and Low Antimicrobial Intervention Levels. Foods 2022, 11, 775. [Google Scholar] [CrossRef]
Applegate, S.F.; Sánchez-Plata, M.X.; Nightingale, K.K.; Thompson, L.; Stephens, T.P.; Brashears, M.M. Development, Verification, and Validation of a RT-PCR Based Methodology for Salmonella Quantification as a Tool for Integrated Food Safety Management in Poultry from Live Production to Final Product. Foods 2023, 12, 419. [Google Scholar] [CrossRef]
Beuchat, L.R.; Copeland, F.; Curiale, M.S.; Danisavich, T.; Gangar, V.; King, B.W.; Lawlis, T.L.; Likin, R.O.; Okwusoa, J.; Smith, C.F.; et al. Comparison of the SimPlate Total Plate Count Method with Petrifilm, Redigel, and Conventional Pour-Plate Methods for Enumerating Aerobic Microorganisms in Foods. J. Food Prot. 1998, 61, 14–18. [Google Scholar] [CrossRef]
Brown, L.N.P.; Sanchez-Plata, M.X.; Thompson, L.; Singh, S.; Echeverry, A.; Brashears, M.M. Integration of Regulatory Compliance Assessments, Microbial Bio-Mapping, and Novel Intervention Technologies for Food Safety Management in Controlled Environment Agriculture: Vertical Hydroponics Leafy Green Facility; Texas Tech University: Lubbock, TX, USA, 2022. [Google Scholar]
Hygiena. Hygiena™ MicroSnap™ vs. 3M™ Petrifilm™ vs. bioMérieux TEMPO® Correlation Objective. Available online: www.hygiena.com (accessed on 16 March 2023).
Line, J.E.; Stern, N.J.; Oakley, B.B.; Seal, B.S. Comparison of an Automated Most-Probable-Number Technique with Traditional Plating Methods for Estimating Populations of Total Aerobes, Coliforms, and Escherichia coli Associated with Freshly Processed Broiler Chickens. J. Food Prot. 2011, 74, 1558–1563. [Google Scholar] [CrossRef] [PubMed]
Meighan, P.; Chen, Y.; Brodsky, M.; Agin, J. Validation of the microsnap coliform and E. coli test system for enumeration and detection of coliforms and E. coli in a variety of foods. J. AOAC Int. 2014, 97, 453–478. [Google Scholar] [CrossRef] [PubMed]
Meighan, P.; Smith, M.; Datta, S.; Katz, B.; Nason, F. The validation of the microsnap total for enumeration of total viable count in a variety of foods. J. AOAC Int. 2016, 99, 686–694. [Google Scholar] [CrossRef] [PubMed]
Owen, M.; Willis, C.; Lamph, D. Evaluation of the TEMPO most probable number technique for the enumeration of Enterobacteriaceae in food and dairy products. J. Appl. Microbiol. 2010, 2004, 1810–1816. [Google Scholar] [CrossRef] [PubMed]
Vargas, D.A.; Casas, D.E.; Chávez-Velado, D.R.; Jiménez, R.L.; Betancourt-Barszcz, G.K.; Randazzo, E.; Lynn, D.; Echeverry, A.; Brashears, M.M.; Sánchez-Plata, M.X.; et al. In-plant intervention validation of a novel ozone generation technology (Bio-safe) compared to lactic acid in variety meats. Foods 2021, 10, 2106. [Google Scholar] [CrossRef]
Belete, T.; Crowley, E.; Bird, P.; Gensic, J.; Wallace, F.M. A Comparison of the BAX System Method to the U.S. Food and Drug Administration’s Bacteriological Analytical Manual and International Organization for Standardization Reference Methods for the Detection of Salmonella in a Variety of Soy Ingredients. J. Food Prot. 2014, 77, 1778–1783. [Google Scholar] [CrossRef] [PubMed]
Johnson, J.L.; Brooke, C.L. Comparison of the BAX for Screening/E.coli O157: H7 Method with Conventional Methods for Detection of Extremely Low Levels of Escherichia coli O157: H7 in Ground Beef. Appl. Environ. Microbiol. 1998, 64, 4390–4395. [Google Scholar] [CrossRef] [PubMed]
Liu, T.; Belk, K.E.; Zagmutt, F.J. Evaluation of Gene-Up and TEMPO AC for Determination of Shiga Toxin Producing Escherichia coli and Total Aerobic Microbial Populations from Microtally Sheets used to Sample Beef Carcasses and Hides; Colorado State University: Fort Collins, CO, USA, 2020. [Google Scholar]
Manfreda, G.; De Cesare, A.; Bondioli, V.; Franchini, A. Comparison of the BAX® System with a multiplex PCR method for simultaneous detection and identification of Campylobacter jejuni and Campylobacter coli in environmental samples. Int. J. Food Microbiol. 2003, 87, 271–278. [Google Scholar] [CrossRef]
Maria, U.; Silva, D.A.; Mu, J.; Filipini, T.A.; Ange, D.; Moliterno, L.; Santos, A.D.O.S.; Baccarin, A.; Lea, A.; Frezza, O.C.; et al. Comparison of the BAX System PCR Method to Brazil’s Official Method for the Detection of Salmonella in Food, Water, and environmental samples. J. Food Prot. 2008, 71, 2442–2447. [Google Scholar]
Casas, D.; Brashears, M.M.; Miller, M.F.; Inestroza, B.; Bueso-Ponce, M.; Huerta-Leidenz, N.; Calle, A.; Paz, R.; Bueno, M.; Echeverry, A. In-Plant Validation Study of Harvest Process Controls in Two Beef Processing Plants in Honduras. J. Food Prot. 2019, 82, 677–683. [Google Scholar] [CrossRef]
Shah, M.; Kathiiko, C.; Wada, A.; Odoyo, E.; Bundi, M.; Miringu, G.; Guyo, S. Prevalence, seasonal variation, and antibiotic resistance pattern of enteric bacterial pathogens among hospitalized diarrheic children in suburban regions of central Kenya. Trop. Med. Health. 2016, 44, 39. [Google Scholar] [CrossRef]
Casas, D.E.; Vargas, D.A.; Randazzo, E.; Lynn, D.; Echeverry, A.; Brashears, M.M.; Sanchez-Plata, M.X.; Miller, M.F. In-Plant Validation of Novel on-Site Ozone Generation Technology (Bio-Safe) Compared to Lactic Acid Beef Carcasses and Trim Using Natural Microbiota and Salmonella and E. coli O157:H7 Surrogate Enumeration. Foods 2021, 10, 1002. [Google Scholar] [CrossRef] [PubMed]
Fernandez, M.; Garcia, A.; Vargas, D.A.; Calle, A. Dynamics of Microbial Shedding in Market Pigs during Fasting and the Influence of Alginate Hydrogel Bead Supplementation during Transportation. Microbiol. Res. 2021, 12, 888–898. [Google Scholar] [CrossRef]
Forgey, S.J.; Englishbey, A.K.; Casas, D.E.; Jackson, S.P.; Miller, M.F.; Echeverry, A.; Brashears, M.M. Presence of Presumptive Shiga Toxin—Producing Escherichia coli and Salmonella on Sheep during Harvest in Honduras. J. Food Prot. 2020, 83, 2008–2013. [Google Scholar] [CrossRef]
Mcauley, C.M.; Mcmillan, K.; Moore, S.C.; Fegan, N.; Fox, E.M. Prevalence and characterization of foodborne pathogens from Australian dairy farm environments. J. Dairy Sci. 2014, 97, 7402–7412. [Google Scholar] [CrossRef] [PubMed]
Rortana, C.; Nguyen-viet, H.; Tum, S.; Unger, F.; Boqvist, S.; Dang-xuan, S.; Koam, S.; Grace, D.; Osbjer, K.; Heng, T.; et al. Prevalence of Salmonella spp. and Staphylococcus aureus in Chicken Meat and Pork from Cambodian Markets. Pathogens 2021, 10, 556. [Google Scholar] [CrossRef]
Pelt, A.E.; Quiñonez, V.B.; Lofgren, H.L.; Bartz, F.E.; Newman, K.L.; Leon, J.S. Low Prevalence of Human Pathogens on Fresh Produce on Farms and in Packing Facilities: A Systematic Review. Front. Public Health 2018, 6, 40. [Google Scholar] [CrossRef]
Smith, B.A.; Meadows, S.; Meyers, R.; Parmley, E.J.; Fazil, A. Seasonality and zoonotic foodborne pathogens in Canada: Relationships between climate and Campylobacter, E. coli and Salmonella in meat products. Epidemiol. Infect. 2019, 147, e190. [Google Scholar] [CrossRef]
Loretz, M.; Stephan, R.; Zweifel, C. Antibacterial activity of decontamination treatments for pig carcasses. Food Control. 2011, 22, 1121–1125. [Google Scholar] [CrossRef]
Scheinberg, J.A.; Svoboda, A.L.; Cutter, C.N. High-pressure processing and boiling water treatments for reducing Listeria monocytogenes, Escherichia coli O157:H7, Salmonella spp., and Staphylococcus aureus during beef jerky processing. Food Control. 2014, 39, 105–110. [Google Scholar] [CrossRef]
Dixon, E.; Rabanser, I.; Dzieciol, M.; Zwirzitz, B.; Wagner, M.; Mann, E.; Stessl, B.; Wetzels, S.U. Reduction potential of steam vacuum and high-pressure water treatment on microbes during beef meat processing. Food Control. 2019, 106, 106728. [Google Scholar] [CrossRef]
Nielsen, B.; Colle, M.J.; Ünlü, G. Meat safety and quality: A biological approach. Int. J. Food Sci. Technol. 2021, 56, 39–51. [Google Scholar] [CrossRef]
Wheeler, T.L.; Kalchayanand, N.; Bosilevac, J.M. Pre- and post-harvest interventions to reduce pathogen contamination in the U.S. beef industry. Meat Sci. 2014, 98, 372–382. [Google Scholar] [CrossRef] [PubMed]
Balta, I.; Butucel, E.; Stef, L.; Pet, I.; Gradisteanu-Pircalabioru, G.; Chifiriuc, C.; Gundogdu, O.; McCleery, D.; Corcionivoschi, N. Anti-Campylobacter Probiotics: Latest Mechanistic Insights. Foodborne Pathog. Dis. 2022, 19, 693–703. [Google Scholar] [CrossRef] [PubMed]
Brashears, M.M.; Chaves, B.D. The diversity of beef safety: A global reason to strengthen our current systems. Meat Sci. 2017, 132, 59–71. [Google Scholar] [CrossRef] [PubMed]
Zdolec, N.; Kotsiri, A.; Houf, K.; Alvarez-Ordóñez, A.; Blagojevic, B.; Karabasil, N.; Salines, M.; Antic, D. Systematic Review and Meta-Analysis of the Efficacy of Interventions Applied during Primary Processing to Reduce Microbial Contamination on Pig Carcasses. Foods 2022, 11, 2110. [Google Scholar] [CrossRef]
Muriana, P.M.; Eager, J.; Wellings, B.; Morgan, B.; Nelson, J.; Kushwaha, K. Evaluation of Antimicrobial Interventions against E. coli O157:H7 on the Surface of Raw Beef to Reduce Bacterial Translocation during Blade Tenderization. Foods 2019, 8, 80. [Google Scholar] [CrossRef] [PubMed]
Vargas, D.A.; Miller, M.F.; Woerner, D.R.; Echeverry, A. Microbial growth study on pork loins as influenced by the application of different antimicrobials. Foods 2021, 10, 968. [Google Scholar] [CrossRef]
Wideman, N.; Bailey, M.; Bilgili, S.F.; Thippareddi, H.; Wang, L.; Bratcher, C.; Sanchez-Plata, M.; Singh, M. Evaluating best practices for Campylobacter and Salmonella reduction in poultry processing plants. Poult. Sci. 2016, 95, 306–315. [Google Scholar] [CrossRef]
Benli, H.; Sanchez-Plata, M.X.; Ilhak, O.I.; De González, M.T.N.; Keeton, J.T. Evaluation of antimicrobial activities of sequential spray applications of decontamination treatments on chicken carcasses. Asian-Australas. J. Anim. Sci. 2015, 28, 405–410. [Google Scholar] [CrossRef]
Singh, M.; Thippareddi, H. Biomapping: An Effective Tool for Pathogen Control during Poultry Processing. 2020. Available online: https://extension.uga.edu/publications/detail.html?number=C1200&title=biomapping-an-effective-tool-for-pathogen-control-during-poultry-processing (accessed on 16 March 2023).
Dutta, V. The Importance of Leveraging Biomapping in Salmonella Control. 2022. Available online: https://www.foodqualityandsafety.com/article/the-importance-of-leveraging-biomapping-in-salmonella-control/ (accessed on 16 March 2023).
Biasino, W.; De Zutter, L.; Mattheus, W.; Bertrand, S.; Uyttendaele, M.; Van Damme, I. Correlation between slaughter practices and the distribution of Salmonella and hygiene indicator bacteria on pig carcasses during slaughter. Food Microbiol. 2018, 70, 192–199. [Google Scholar] [CrossRef] [PubMed]
O’Connor, A.M.; Wang, B.; Denagamage, T.; McKean, J. Process Mapping the Prevalence of Salmonella Contamination on Pork Carcass from Slaughter to Chilling: A Systematic Review Approach. Foodborne Pathog. Dis. 2012, 9, 386–395. [Google Scholar] [CrossRef] [PubMed]
Vargas, D.A.; De Villena, J.F.; Larios, V.; Bueno, R.; Ch, D.R.; Casas, D.E.; Jim, R.L.; Blandon, S.E.; Sanchez-Plata, M.X. Data-Mining Poultry Processing Bio-Mapping Counts of Management Decision Making. Foods 2023, 12, 898. [Google Scholar] [CrossRef] [PubMed]
Casas, D.E.; Manishimwe, R.; Forgey, S.J.; Hanlon, K.E.; Miller, M.F.; Brashears, M.M.; Sanchez-Plata, M.X. Biomapping of Microbial Indicators on Beef Subprimals Subjected to Spray or Dry Chilling over Prolonged Refrigerated Storage. Foods 2021, 10, 1403. [Google Scholar] [CrossRef] [PubMed]
Vargas, D.A.; Rodríguez, K.M.; Betancourt-Barszcz, G.K.; Ajcet-Reyes, M.I.; Dogan, O.B.; Randazzo, E.; Sánchez-Plata, M.X.; Brashears, M.M.; Miller, M.F. Bio-Mapping of Microbial Indicators to Establish Statistical Process Control Parameters in a Commercial Beef Processing Facility. Foods 2022, 11, 1133. [Google Scholar] [CrossRef] [PubMed]
Krzywinski, M.; Altman, N. Visualizing samples with box plots. Nat. Methods 2014, 11, 119–120. [Google Scholar] [CrossRef] [PubMed]
Węglarczyk, S. Kernel density estimation and its application. ITM Web Conf. 2018, 23, 00037. [Google Scholar] [CrossRef]
Papadochristopoulos, A.; Kerry, J.P.; Fegan, N.; Burgess, C.M.; Duffy, G. Natural anti-microbials for enhanced microbial safety and shelf-life of processed packaged meat. Foods 2021, 10, 1598. [Google Scholar] [CrossRef] [PubMed]
Nicoli, M.C. Shelf Life Assessment of Food; CRC Press: Boca Raton, FL, USA, 2012. [Google Scholar]
Santos, D.; Monteiro, M.J.; Voss, H.P.; Komora, N.; Teixeira, P.; Pintado, M. The most important attributes of beef sensory quality and production variables that can affect it: A review. Livest. Sci. 2021, 250, 104573. [Google Scholar] [CrossRef]
United States Department of Agriculture. Food Waste FAQs. 2023. Available online: https://www.usda.gov/foodwaste/faqs (accessed on 16 March 2023).
Vargas, D.A.; Blandon, S.E.; Sarasty, O.; Osorio-Doblado, A.M.; Miller, M.F.; Echeverry, A. Shelf-Life Evaluation of Pork Loins as Influenced by the Application of Different Antimicrobial Interventions. Foods 2022, 11, 3464. [Google Scholar] [CrossRef]
Steele, K.S.; Weber, M.J.; Boyle, E.A.E.; Hunt, M.C.; Lobaton-Sulabo, A.S.; Cundith, C.; Hiebert, Y.H.; Abrolat, K.A.; Attey, J.M.; Clark, S.D.; et al. Shelf life of fresh meat products under LED or fluorescent lighting. Meat Sci. 2016, 117, 75–84. [Google Scholar] [CrossRef]
Allen, C.D.; Fletcher, D.L.; Northcutt, J.K.; Russell, S.M. The Relationship of Broiler Breast Color to Meat Quality and Shelf-Life. Poult. Sci. 1998, 77, 361–366. [Google Scholar] [CrossRef] [PubMed]
Xu, M.M.; Kaur, M.; Pillidge, C.J.; Torley, P.J. Australian consumers’ attitudes to packaged fresh meat products with added microbial bioprotective cultures for shelf-life extension. Meat Sci. 2023, 198, 109095. [Google Scholar] [CrossRef] [PubMed]
Guo, Y.; Huang, J.; Sun, X.; Lu, Q.; Huang, M.; Zhou, G. Effect of normal and modified atmosphere packaging on shelf life of roast chicken meat. J. Food Saf. 2018, 38, e12493. [Google Scholar] [CrossRef]
Bolton, D.J.; Meredith, H.; Walsh, D.; McDowell, D.A. The effect of chemical treatments in laboratory and broiler plant studies on the microbial status and shelf-life of poultry. Food Control. 2013, 36, 230–237. [Google Scholar] [CrossRef]
Institute of Food Science and Technology. Shelf-Life of Foods: Guidelines for Its Determination and Prediction, 1st ed.; Institute of Food Science and Technology: London, UK, 1993. [Google Scholar]
Ponce, J.; Brooks, J.C.; Legako, J.F. Chemical Characterization and Sensory Relationships of Beef M. longissimus lumborum and M. gluteus medius Steaks After Retail Display in Various Packaging Environments. Meat Muscle Biol. 2020, 44, 10481. [Google Scholar] [CrossRef]

Figure 1. Different graphical representation and statistical analysis approaches to evaluate method comparison studies based on means comparison. (A) Microbial counts (Log CFU/mL) of two different methods (Method 1 and Method 2) (n = 30 observations per method). The bar chart represents the mean counts for each method, and the error bars represent ± 3SE. Bars with different letters are significantly different according to Wilcoxon’s test for paired samples at p < 0.05. (B) Microbial counts (Log CFU/mL) of two different methods (Method 1 and Method 2) (n = 30 observations per method). In each boxplot, the horizontal line crossing the box represents the median, the bottom and top of the box are the lower and upper quartiles, and the whiskers extend up to 1.5 times the interquartile range below the lower quartile and above the upper quartile. Boxes with different letters are significantly different according to the t-test at p < 0.05. The points represent the actual data points. (C) Microbial counts (Log CFU/mL) of three different methods (Method 1, Method 2, and Method 3) (n = 30 observations per method). In each boxplot, the horizontal line crossing the box represents the median, the bottom and top of the box are the lower and upper quartiles, and the whiskers extend up to 1.5 times the interquartile range below the lower quartile and above the upper quartile. Boxes with different letters are significantly different according to ANOVA followed by pairwise comparisons with Tukey’s adjustment at p < 0.05. The points represent the actual data points. (D) Difference in microbial counts (Log) between two methods (Method 1 and Method 2) (n = 30 observations per method). In the boxplot, the horizontal line crossing the box represents the median, the bottom and top of the box are the lower and upper quartiles, and the whiskers extend up to 1.5 times the interquartile range below the lower quartile and above the upper quartile. Boxes with different letters are significantly different according to the t-test at p < 0.05. The points represent the actual data points.
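The boxplot elements described in this caption can be computed directly from the data. The following is a minimal sketch, assuming the common Tukey convention in which whiskers stop at the most extreme observation within 1.5 times the interquartile range of the box; the function name `boxplot_summary`, the quartile method, and the example counts are illustrative assumptions, and plotting software may use a slightly different quartile definition.

```python
import statistics


def boxplot_summary(values):
    """Return the quantities a Tukey-style boxplot draws for one group."""
    data = sorted(values)
    # Quartiles with linear interpolation ("inclusive" method); other
    # software may use a different quartile convention.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Hypothetical counts (Log CFU/mL), not taken from the article.
counts = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 20.0]
summary = boxplot_summary(counts)
print(summary)
```

In this hypothetical example, the value 20.0 falls above the upper fence, so it would be drawn as an individual point rather than included in the whisker.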

Figure 2. Graphical representation of the linear correlation between Method 2 and Method 3 when compared with Method 1 (n = 64 observations per method). The solid line represents the least-squares regression line, and the dots represent the actual data points.

Figure 3. Different graphical representation and statistical analysis approaches to evaluate intervention studies. (A) Microbial counts (Log CFU/mL) before and after the use of two different antimicrobials (Antimicrobial 1 and Antimicrobial 2) (n = 30 observations per antimicrobial). In each boxplot, the horizontal line crossing the box represents the median, the bottom and top of the box are the lower and upper quartiles, and the whiskers extend up to 1.5 times the interquartile range below the lower quartile and above the upper quartile. Boxes with different letters are significantly different according to ANOVA followed by pairwise comparisons with Tukey’s adjustment at p < 0.05. The dots represent the actual data points. (B) Microbial counts (Log CFU/mL) before and after the use of two different antimicrobials (Antimicrobial 1 and Antimicrobial 2) (n = 30 observations per antimicrobial). The bar chart represents the mean counts before and after each antimicrobial intervention, and the error bars represent ± 3SE. Bars with different letters are significantly different according to the t-test at p < 0.05.

Figure 4. Different graphical representation and statistical analysis approaches while conducting bio-mapping of processing facilities. (A) Kernel density estimation of microbial counts (Log CFU/cm²) at each sampling location with distinct treatments (Treatment A and Treatment B) applied at different locations in a processing facility (n = 50 observations per location and treatment). (B) Microbial counts (Log CFU/cm²) from each sampling location throughout a processing facility (n = 50 observations per location). In each boxplot, the horizontal line crossing the box represents the median, the bottom and top of the box are the lower and upper quartiles, and the whiskers extend up to 1.5 times the interquartile range below the lower quartile and above the upper quartile. Boxes with different letters are significantly different according to ANOVA followed by pairwise comparisons with Tukey’s adjustment at p < 0.05. The points represent the actual data points.
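For readers unfamiliar with kernel density estimation, the curve in panel (A) can be thought of as an average of small bell-shaped curves centered on each observation. The sketch below is only an illustration under stated assumptions: it uses a Gaussian kernel and a normal-reference rule-of-thumb bandwidth, neither of which is specified in the caption, and the function name `gaussian_kde` and the sample counts are hypothetical.

```python
import math
import statistics


def gaussian_kde(samples, bandwidth=None):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    n = len(samples)
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumption of this sketch).
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observation, then normalize.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical counts (Log CFU/cm^2) for one sampling location.
samples = [2.1, 2.4, 2.6, 3.0, 3.3, 3.5]
density = gaussian_kde(samples)
for x in (2.0, 2.8, 3.6):
    print(f"density at {x:.1f}: {density(x):.3f}")
```

A smaller bandwidth yields a more jagged curve that follows individual observations, whereas a larger bandwidth yields a smoother curve, so the bandwidth choice should be reported alongside the plot.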

Figure 5. Graphical representation of statistical microbial process control for process monitoring. (A) Statistical microbial process control (Log CFU/mL) using an S-chart (standard deviation chart) at the harvest stage, comparing the initial (live receiving) and final (post-chilling) locations of a processing facility over ten weeks (n = 5 observations per location at each sampling week). The horizontal solid line represents the mean of each location, and the dashed lines represent the upper and lower control limits using ±3σ. The solid line represents the change in the average of each location over 10 different sampling weeks. The solid square represents the mean of each location at each sampling week. (B) Statistical microbial process control (Log CFU/mL) using an R-chart (range chart) at the harvest stage, comparing the initial (gambrel table) and final (post-intervention) sampling locations of a processing facility over ten weeks (n = 5 observations per location at each sampling week). The horizontal solid line represents the mean of each location, and the dashed lines represent the upper and lower control limits using ±3σ. The solid line represents the change in the average of each location over 10 different sampling weeks. The solid square represents the mean of each location at each sampling week.

Figure 6. Shewhart’s statistical microbial process control (Log CFU/mL) using an XmR chart at two different locations in a processing facility (n = 50 observations per location). The solid black line represents the mean of each location, and the black dashed lines represent the upper (UCL) and lower (LCL) control limits using ±3 sequential deviations for each location. The colored interactive line represents the change in the actual value of sequential observations in each location. The points represent the actual value of sequential observations in each location.
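The control limits of an XmR (individuals and moving range) chart can be reproduced with a few lines of code. The sketch below assumes the conventional XmR calculation, in which the process sigma is estimated from the average moving range between consecutive observations, so that the limits are the mean ± 2.66 times the average moving range; whether this matches the "sequential deviations" used for the figure is an assumption. The function name `xmr_limits` and the example counts are hypothetical.

```python
def xmr_limits(observations):
    """Return (center line, LCL, UCL) for an individuals (X) chart."""
    center = sum(observations) / len(observations)
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [
        abs(later - earlier)
        for earlier, later in zip(observations, observations[1:])
    ]
    mean_mr = sum(moving_ranges) / len(moving_ranges)
    # 2.66 = 3 / d2, where d2 = 1.128 for moving ranges of two observations.
    ucl = center + 2.66 * mean_mr
    lcl = center - 2.66 * mean_mr
    return center, lcl, ucl


# Hypothetical sequential counts (Log CFU/mL) at one location.
counts = [2.0, 2.2, 1.9, 2.1, 2.3]
center, lcl, ucl = xmr_limits(counts)
print(f"center = {center:.3f}, LCL = {lcl:.3f}, UCL = {ucl:.3f}")
```

An observation falling outside the interval between LCL and UCL would be flagged as a potential out-of-control signal that warrants investigation.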

Figure 7. Different graphical representation and statistical analysis approaches to evaluate shelf life studies. (A) Microbial counts (Log CFU/mL) of samples with (Treatment A) and without (control) an intervention, retail displayed for 96 h under refrigerated conditions (n = 50 samples per treatment/retail display time). In each boxplot, the horizontal line crossing the box represents the median, the bottom and top of the box are the lower and upper quartiles, and the whiskers extend up to 1.5 times the interquartile range below the lower quartile and above the upper quartile. Boxes with different letters or symbols are significantly different according to the t-test at p < 0.05. The points represent the actual data points. (B) Microbial counts (Log CFU/mL) of samples with (Treatment A) and without (control) an intervention, retail displayed for 96 h under refrigerated conditions (n = 50 samples per treatment/retail display time). Solid symbols represent mean counts per treatment and retail display time. Means with different letters are significantly different according to a two-way ANOVA followed by pairwise comparisons with Tukey’s adjustment at p < 0.05.

Figure 8. Alternative graphical representation and statistical analysis approaches to evaluate shelf life studies. (A) Principal component (PC) analysis for indicator microorganisms for all treatments (Treatment A and Control) × display time (0 h, 48 h, and 96 h) combinations (n = 50 observations per intervention and display time). (B) Average (± 95% confidence intervals) growth rates (Log CFU/mL × Day) for each microorganism (Bacteria A and Bacteria B) and treatment (control, Treatment A, and Treatment B) combination. The dot represents the average growth rate and the horizontal lines represent the 95% confidence interval (n = 50 observations per intervention and retail display time).

Table 1. Summary of the linear models fitted by least-squares regression to predict bacterial counts using Method 2 and Method 3 when compared with Method 1.

Method	Coefficient	Estimate	Standard Error	p-Value	95% CI Lower (2.5%)	95% CI Upper (97.5%)
Method 2	Intercept	−0.028	0.035	0.426	−0.099	0.042
Method 2	Slope	1.018	0.007	<0.001	1.004	1.032
Method 3	Intercept	−0.037	0.081	0.653	0.085	0.837
Method 3	Slope	1.023	0.016	<0.001	0.991	1.056
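The quantities reported in Table 1 follow from the standard formulas for simple linear regression. The sketch below is a minimal illustration, not the authors' code: the function names `least_squares_fit` and `confidence_interval` and the paired example data are hypothetical, and the critical value of 1.999 assumes a two-sided 95% interval with 62 degrees of freedom (n = 64 observations minus two estimated coefficients).

```python
import math


def least_squares_fit(x, y):
    """Return slope, intercept, and their standard errors for y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance uses n - 2 degrees of freedom.
    residual_ss = sum(
        (yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y)
    )
    sigma2 = residual_ss / (n - 2)
    se_slope = math.sqrt(sigma2 / sxx)
    se_intercept = math.sqrt(sigma2 * (1.0 / n + mean_x ** 2 / sxx))
    return slope, intercept, se_slope, se_intercept


def confidence_interval(estimate, standard_error, t_critical):
    """Return (lower, upper) as estimate +/- t_critical * standard error."""
    half_width = t_critical * standard_error
    return estimate - half_width, estimate + half_width


# Hypothetical paired counts (Log CFU/mL): reference method vs. new method.
reference = [1.0, 2.0, 3.0, 4.0]
candidate = [1.1, 1.9, 3.2, 3.8]
slope, intercept, se_slope, se_intercept = least_squares_fit(reference, candidate)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")

# Method 2 slope from Table 1 (estimate 1.018, standard error 0.007).
lower, upper = confidence_interval(1.018, 0.007, t_critical=1.999)
print(f"95% CI: {lower:.3f} to {upper:.3f}")  # rounds to 1.004 and 1.032
```

When two methods agree, the slope is expected to be close to 1 and the intercept close to 0, so checking whether the confidence intervals contain those values is a convenient way to read the table.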

Table 2. McNemar’s chi-square p-values of the different methods to detect a certain microorganism.

Detection Method	Method 1	Method 2	Method 3
Method 1	-	0.305	0.005
Method 2	0.305	-	0.002
Method 3	0.005	0.002	-

p-values < 0.05 indicate a statistically significant difference between methods.
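McNemar’s test compares two detection methods applied to the same samples and uses only the discordant pairs, that is, the samples in which one method detected the microorganism and the other did not. The following sketch assumes the uncorrected chi-square form of the test with one degree of freedom; the function name `mcnemar_test` and the discordant counts are hypothetical and are not the data behind Table 2.

```python
import math


def mcnemar_test(only_a_positive, only_b_positive):
    """Return (chi-square statistic, p-value) from the discordant pairs."""
    b, c = only_a_positive, only_b_positive
    if b + c == 0:
        raise ValueError("McNemar's test needs at least one discordant pair")
    statistic = (b - c) ** 2 / (b + c)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical discordant pairs: 15 samples positive only by Method A,
# 4 samples positive only by Method B.
statistic, p_value = mcnemar_test(15, 4)
print(f"chi-square = {statistic:.3f}, p = {p_value:.4f}")
```

With few discordant pairs, an exact binomial version or a continuity-corrected version of the test is often preferred over this large-sample approximation.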

Table 3. Example of 2 × 2 contingency table for Salmonella presence in dry and rainy seasons on a beef harvest floor.

Season	Salmonella Presence	Salmonella Non-Detected	Row Total
Rainy ^a	112	458	570
Dry ^b	55	530	585
Column Total	167	988	1155

^a,b Different letters indicate a statistically significant difference between seasons according to the chi-squared test at p < 0.05.
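The chi-squared comparison for the counts in Table 3 can be worked through as follows. This is a minimal sketch, assuming Pearson’s chi-squared test without continuity correction and one degree of freedom; the function name `chi_squared_2x2` is hypothetical, and statistical software may apply a correction by default and report a slightly different value.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Counts from Table 3: rainy season (112 present, 458 non-detected),
# dry season (55 present, 530 non-detected).
statistic, p_value = chi_squared_2x2(112, 458, 55, 530)
print(f"rainy prevalence = {112 / 570:.1%}, dry prevalence = {55 / 585:.1%}")
print(f"chi-square = {statistic:.2f}, p = {p_value:.2e}")
```

The resulting p-value is well below 0.05, which is consistent with the different letters assigned to the rainy and dry seasons in the table.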

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

