1. Introduction
Better product quality brings higher profit. Quality affects the costs of processing, marketing, and the pricing of food products. Quality evaluation is a labor-intensive process and constitutes a significant portion of the expense of food production. Due to decreasing labor availability and increasing labor costs, the constant hiring and training of capable workers has become a daunting task for food processing facilities. Machine vision technology has been a proven solution for inspection and grading of food products since the late 1980s. Feedback received from several small and midsize food processing facilities in the US indicated that system and operating costs, operation complexity and, in some cases, unsatisfactory results have discouraged them from embracing or upgrading to the latest developments.
Quality evaluation involves (1) grading to separate products into different quality grades and (2) sorting to detect and remove imperfect products. Human vision can perform these tasks nearly effortlessly, but its efficiency often suffers from inconsistency, fatigue, or inexperience. Many commercial grading or sorting machines for food products on the market have been designed to automate these tasks and have proven to be very successful.
Many recent developments were aimed at a broad range of grading applications. For example, a machine learning-based system was built to use color and texture features for tomato grading and surface defect detection [
1]. Color and texture features were collected for maturity evaluation, defect detection, and size and shape analysis of mangoes [
2]. Guava fruits were classified into four maturity levels using the K-nearest neighbor (KNN) algorithm to analyze color distribution in HSI (Hue, Saturation, and Intensity) color space [
3]. Image processing techniques were used for corn seed grading [
4], plum fruit maturity evaluation [
5], quality assessment of pomegranate fruits [
6], and the ripeness of palm fruit [
7]. Two grading systems for date maturity [
8] and skin quality [
9] evaluation were designed specifically for Medjool dates. All these recent developments were designed for their unique grading applications. Although some may be easier than others to adapt for similar applications, minor adjustments are needed even for a simple generic solution such as color grading or shape analysis.
A comprehensive review of fruit and vegetable quality evaluation systems revealed that many systems use slightly different handcrafted features for even the same applications [
10]. For example, twenty-six different color features in four different color spaces and twenty-two texture features (some including color and size) were designed for quality analysis. Eighteen morphological features were designed for shape quality grading. Forty-three combinations of methods and features were designed for defect detection. Thirty-six classification techniques (some similar) were implemented for evaluating the quality of fresh produce. Although handcrafted image features have been successfully used to describe the characteristics of food products for quality evaluation, they often work well only for their unique applications. Adapting handcrafted features to different applications requires either developing new algorithms or setting new grading criteria. These are not trivial tasks and require highly trained workers.
Our solution to the above challenges is an affordable smart camera capable of handling the computation load and interfacing with existing equipment. It is designed to be integrated into grading and sorting machines for all sorts of food products. Cost involves the hardware, installation, operation, and maintenance. The hardware cost of this solution includes the camera, ARM processor board, control board, lens, and enclosure. It is housed in a self-contained IP-66 enclosure that is “dust tight” and protected against powerful jets of water. Installation is easier and the space requirement is smaller than for most computer-based vision systems. The operating cost is lower because of its ease of changeover for different products or different grading criteria.
Besides the hardware, the most important element of the smart camera is a robust visual inspection algorithm to improve the efficiency of the quality evaluation process [
10,
11]. Unlike the general object recognition that includes a large number of classes, visual inspection usually has two (good or bad) or a very small number of classes, consistent lighting and a uniform image background. All these unique situations make it a much more manageable task. The biggest challenge for machine-learning based visual inspection algorithms is the collection of a large number of negative samples for training. A good visual inspection algorithm must be able to automatically construct needed distinctive features to achieve good performance [
10,
11].
A review published in 2017 discussed methods for feature extraction for color, size, shape, and texture [
11]. The authors also discussed machine learning methods for machine vision, including K-nearest neighbor (KNN), support vector machine (SVM), artificial neural network (ANN), and the latest developments in deep learning or convolutional neural network (CNN). A fuzzy inference system was applied to fruit maturity classification using color features [
12]. Artificial neural networks have been successfully used for sorting pomegranate fruits [
13], apples [
14], fruit grading based on external appearance and internal flavor [
15], and color-based fruit classification [
16]. Some researchers explored and applied deeper neural networks, such as CNN, to fruit and vegetable grading [
17,
18,
19] and flower grading [
20] and achieved great success. With different degrees of complexity, all these methods significantly advanced the development of machine vision technology.
Our visual inspection algorithm uses evolutionary learning to automatically learn and discover salient features from training images without human intervention [
21]. The proposed algorithm is able to find subtle differences in training images to construct non-intuitive features that are hard to describe, even for humans. This unique ability makes it a good candidate for the visual inspection of food products when the differentiation among grades is hard to describe. Compared to the aforementioned machine learning approaches and the sophisticated but powerful deep learning approaches, the proposed algorithm does not require a large number of training images or substantial computational power [
22]. It does not require extensive training for configuration and it is easier to change the algorithm to fit new products or new grading criteria in the factory. Its classification is faster and more efficient than most machine learning and deep neural network approaches as well.
We created two datasets as examples of quality evaluation for specialty crops and aquaculture products. One has infrared images of Medjool dates with four levels of skin delamination. The other consists of grayscale images of oysters with varying shape quality. The challenges and related work on these two representative applications are outlined in the next two subsections.
1.1. Date Skin Quality Evaluation
Because of their unique climate patterns, Southern California and Arizona are the best areas in the U.S. to grow dates [
23]. Dates are usually harvested over a short period in August and September, almost all at once, regardless of their maturity. Harvested dates are graded into three or four categories according to their maturity levels [
8]. Mature dates are graded based on their skin quality before packing [
9].
The literature review found few high-quality studies that address date surface quality evaluation. RGB images of date fruit were used for grading and sorting and achieved 80% accuracy in the experiment [
24]. A Mamdani fuzzy inference system (MFIS) was used to grade the quality of Mozafati dates [
25]. A bag of features (BOF) method was used for evaluating fruit external quality [
26]. Histogram and texture features were extracted from monochrome images to classify dates into soft and hard fruit using linear discriminant analysis (LDA) and an artificial neural network (ANN), obtaining 84% and 77% accuracy, respectively. A simple CNN structure was used to separate healthy and defective dates and predict the ripening stage of the healthy dates [
27].
The majority of work related to date processing addresses date classification or recognition. CNN was used to help consumers identify the variety and origin of dates [
28] and classifying dates according to their type and maturity level for robotic harvest decisions [
29]. Basic image processing techniques were used for grading date maturity in HSV (Hue, Saturation and Value) color space [
30]. Mixtures of handcrafted features, such as shape, color, and texture, were used to classify four [
31] and seven [
32] varieties of dates. Others used SVM [
33] and Gaussian Mixture Models [
34] to classify dates.
Some of the works mentioned above focused on using handcrafted features for classifying varieties of dates, not quality evaluation. The few that were developed for date quality evaluation used basic image processing techniques and handcrafted features. They did not address the challenges mentioned previously: they require an experienced worker to configure and operate the system, and their performance is in the mid-80% range.
1.2. Oyster Shape Quality
Product quality can be determined by many criteria. For most man-made products, food or non-food, quality can be evaluated by the product’s dimensions, color, shape, etc. Unlike simply shaped man-made products, agricultural and aquaculture products are organic, naturally growing objects with irregular and inconsistent shapes or surface curvature. Their quality is usually evaluated by criteria such as color, size, surface texture, and shape [
10,
11]. For consumer satisfaction, shape is one factor that cannot be ignored, especially for food products. This experiment focuses on shape evaluation for food products, whose quality evaluation is more challenging than that of man-made products.
More than 2500 oyster processing facilities were registered and licensed in the United States to grade oysters for packing or repacking as of 22 August 2017 [
35]. Sorting oysters by shape and size is an important step in getting high-quality half-shell oysters to market. Restaurants prefer oysters that have a strong or thick shell, smooth shape, and deep cup filled with meat. Unfortunately, oysters are not like man-made products that are made with a uniform shape. Some are very long, others are deep-cupped, and still others are thin and round. Their varying shape and size are the result of growing area salinity, species, food, and tidal action. Even with the same growing conditions, oysters are never identical in shape and size.
We selected an oyster shape grading application that had the most shape variation to demonstrate the performance of our visual inspection algorithm. Currently, whole oysters are graded manually by their diameter and weight, which largely ignores the consumer’s demand for appearance. A study conducted in 2004 showed that consumers prefer round and medium-size oysters around 2 inches in diameter [
36]. Guidelines were established to describe desirable and undesirable shapes [
37].
Machine vision methods were developed to estimate the volume and weight of raw oyster meat [
38,
39]. Not many papers in the literature reported research on shape grading specifically for whole oysters [
40,
41]. Machine learning methods have been successfully used for shape analysis [
10,
11], but none of them used machine learning techniques that can automatically learn distinct features from the images and adapt to the different grading criteria each grower prefers. As discussed previously, each oyster grower or harvester has their own set of grading rules. A visual inspection algorithm that can learn and mimic human grading is essential to building a versatile smart camera. As an example, we use the proposed algorithm to grade oysters into three categories. Its grading accuracy is more than adequate for commercial use.
3. Experiments, Results, and Discussions
Besides the three simple test cases shown in the previous section, we created two datasets to test our visual inspection algorithm for specialty crops and aquaculture products. The first dataset includes infrared images of Medjool dates with four levels of skin delamination. The second dataset includes images of oysters in four shape categories, which were consolidated into three grades for the experiment.
Training for both datasets was performed on a desktop computer. We limited training to 80 iterations and performed it multiple times, obtaining 30 features each time. As shown in later sections, the algorithm took slightly over 60 iterations for the date dataset and fewer than 10 iterations for the oyster dataset to reach steady-state performance. After the features were learned, classification could be performed with the strong classifier. We also used an embedded system (Odroid-XU4) equipped with a Cortex-A15 2 GHz processor to test the classification speed. The classification time depends on the number of features used; with 30 features, each classification took approximately 10 milliseconds, or 100 frames per second.
The three example cases reported in
Section 2.5.3 and even the two real-world applications discussed in this section may be viewed as trivial. As discussed in the introduction, unlike general object recognition, which includes a large number of classes, visual inspection applications usually have two (good or bad) or a very small number of classes. Unlike outdoor robot vision applications, they usually have consistent lighting and a uniform image background. All these conditions make visual inspection a much more manageable task and make our simple algorithm a viable solution for embedded applications. The three example cases and the following two test cases clearly demonstrate the versatility of the proposed visual inspection algorithm.
3.1. Date Skin Quality
Dates are sorted into four grades in Southern California and Arizona, USA according to the criteria shown in
Table 1 [
7]. With proper calibration and segmentation, the length of the date can be measured with high accuracy, since the contrast between the background and the fruit is fairly high. In this work, the focus was on testing our visual inspection algorithm on skin quality evaluation, not size measurement. According to
Table 1, we labeled four classes of skin quality as Large (<10% skin delamination), Extra Fancy (10%~25%), Fancy (25%~40%), and Confection (>40%) for our experiments.
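As a minimal sketch, the Table 1 criteria can be expressed as a threshold function; the handling of exact boundary values is an assumption here, not something Table 1 specifies.

```python
def skin_grade(delamination_pct: float) -> str:
    """Map a skin-delamination percentage to one of the four grades.
    Thresholds follow Table 1; boundary handling (half-open intervals)
    is an illustrative assumption."""
    if delamination_pct < 10:
        return "Large"
    elif delamination_pct < 25:
        return "Extra Fancy"
    elif delamination_pct < 40:
        return "Fancy"
    else:
        return "Confection"
```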
The original image size was 200 × 800 pixels. For the infrared date image dataset, we cropped the images to 200 × 300 (with the date in the center) and collected 419 images for training (118 Large, 100 Extra Fancy, 101 Fancy, 100 Confection) and 240 images for testing (74 Large, 79 Extra Fancy, 60 Fancy, 27 Confection), covering four levels of skin delamination.
Figure 8a shows sample images of these four classes from this dataset. All images in the dataset were captured on the blue plastic chains used on a sorting machine as shown in
Figure 8b. The dates were singulated and aligned with the chain to allow very minor rotation. The blue plastic background is very bright in the infrared image, whereas the delaminated or dry date skin is slightly darker, and the moist date is the darkest.
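The intensity ordering described above (bright background, mid-gray delaminated skin, dark moist skin) suggests a simple two-threshold estimate of the delaminated fraction. The sketch below uses illustrative threshold values, not the system's actual parameters.

```python
import numpy as np

def estimate_delamination(ir_img, bg_thresh=200, dry_thresh=120):
    """Rough sketch: the blue chain background is brightest in infrared,
    delaminated (dry) skin is mid-gray, and moist skin is darkest.
    Returns the delaminated percentage of the fruit area.
    Threshold values are illustrative assumptions."""
    fruit = ir_img < bg_thresh            # mask out the bright background
    dry = fruit & (ir_img >= dry_thresh)  # delaminated skin within the fruit
    if fruit.sum() == 0:
        return 0.0
    return 100.0 * dry.sum() / fruit.sum()
```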
We recognize that many image processing techniques can be used to extract hand-crafted features to solve this problem with high accuracies around 86% [
26], and 95%~98% [
9]. This study demonstrates the versatility of our visual inspection algorithm and its advantages: no dependence on hand-crafted features, a very small number of training images, fast and accurate grading, and ease of training and operation.
3.1.1. Performance
The proposed visual inspection algorithm successfully graded the test images into four grades according to their skin delamination. An overall accuracy of 95% was achieved, easily surpassing human graders’ average accuracy of 75%~85% [
9]. The confusion matrix of this experiment is shown in
Table 2. The error change during training is shown in
Figure 9. The error rate dropped as training went on, finally reaching a steady-state error rate of approximately 5% after 60 iterations of learning.
Of the 12 misclassifications out of 240 test samples, 10 occurred between neighboring classes and only two were off by two grades. For example, one date from the Fancy class was misclassified as Confection, which is only one grade lower than Fancy, and three dates from the Confection class were misclassified as Fancy, which is only one grade higher than Confection. Grading borderline dates is often subjective, and such errors are acceptable, especially at such a low rate (4.2% of dates were misclassified by one grade). Accepting those borderline samples as correct grades, our visual inspection algorithm correctly graded 238 of 240 samples, an impressive 99.2% accuracy.
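The within-one-grade accuracy can be computed directly from a confusion matrix, as sketched below. The matrix here is a hypothetical example consistent with the counts reported in this section (the exact Table 2 entries may differ).

```python
import numpy as np

def tolerance_accuracy(cm, tol=1):
    """Fraction of samples whose predicted grade is within `tol` grades
    of the true grade; cm[i, j] counts true grade i predicted as grade j."""
    cm = np.asarray(cm)
    n = cm.shape[0]
    ok = sum(cm[i, j] for i in range(n) for j in range(n) if abs(i - j) <= tol)
    return ok / cm.sum()

# Hypothetical matrix (rows: Large, Extra Fancy, Fancy, Confection)
# matching the reported totals: 240 samples, 12 errors, 2 off by two grades.
cm = [[72, 1, 1, 0],
      [2, 75, 2, 0],
      [0, 1, 58, 1],
      [0, 1, 3, 23]]
```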
3.1.2. Visualization of Date Features
Since the features for classification were learned automatically by the evolutionary learning algorithm, it was unclear what features the algorithm actually used to obtain such good performance. As discussed previously, our features consist of a series of simple image transforms, where the output of one transform is the input of the next. To further analyze what was learned, we selected two learned features and displayed their transform outputs. Because every training image is slightly different, we averaged the outputs of each image transform in a learned feature over all training images for visualization.
The two selected features are shown in
Figure 10. There are four rows in this figure. Each of them is for one grade as marked (Large, Extra Fancy, Fancy, Confection). The first feature selected and shown in
Figure 10a included six transforms. They were Median Blur, Laplacian, Gradient, Gaussian, Sobel, and Gabor transforms. The second feature selected and shown in
Figure 10b included three transforms: Gradient, Sobel, and Gabor. White pixels indicate responses that are more common in the output across all training images.
The first feature emphasizes the texture of the dates; the second emphasizes both shape and texture. Shape and texture are the key information used in the infrared date classification in our experiments, and with these learned features our model obtained near-perfect classification performance.
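The idea of a feature as a chain of transforms can be sketched as follows. The two stand-in transforms below are simplified numpy implementations for illustration, not the exact Median Blur, Laplacian, Gaussian, or Gabor transforms used by the algorithm.

```python
import numpy as np

def box_blur(img):
    """3x3 mean filter via edge-padded neighborhood averaging
    (a stand-in for the smoothing transforms)."""
    p = np.pad(img.astype(float), 1, mode="edge")
    h, w = img.shape
    return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0

def sobel_mag(img):
    """Gradient magnitude from 3x3 Sobel kernels
    (a stand-in for the edge/gradient transforms)."""
    p = np.pad(img.astype(float), 1, mode="edge")
    h, w = img.shape
    win = lambda i, j: p[i:i + h, j:j + w]
    gx = (win(0, 2) + 2 * win(1, 2) + win(2, 2)) - (win(0, 0) + 2 * win(1, 0) + win(2, 0))
    gy = (win(2, 0) + 2 * win(2, 1) + win(2, 2)) - (win(0, 0) + 2 * win(0, 1) + win(0, 2))
    return np.hypot(gx, gy)

def apply_feature(img, chain):
    """A feature is a chain of transforms: each output feeds the next."""
    for transform in chain:
        img = transform(img)
    return img
```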
3.2. Oyster Shape Quality
We collected 300 oysters with varying shapes for this work. An experienced grader graded them into three categories: Banana, Irregular, and Good. The Good oysters could be further graded into large, medium, and small based on their estimated diameter. Size can be measured easily by counting the oyster’s pixels or by fitting a circle or an ellipse to estimate the diameter. We focused on shape grading in this experiment. Broken shells were combined with Irregulars, as both should be considered the lowest quality.
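As a sketch of the pixel-counting approach to size measurement mentioned above, an equivalent diameter can be estimated from a segmented binary mask. The circular-outline assumption and the scale factor are illustrative.

```python
import numpy as np

def equivalent_diameter(mask, mm_per_pixel=1.0):
    """Estimate a diameter from a binary oyster mask by counting pixels
    and assuming a roughly circular outline: d = 2*sqrt(area/pi).
    The physical scale `mm_per_pixel` is an assumed calibration value."""
    area = mask.sum() * mm_per_pixel ** 2
    return 2.0 * np.sqrt(area / np.pi)
```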
Figure 11 shows an example of each of the three final categories. A sample of Broken shape is also included. We collected 50 pieces of Banana, 100 pieces of Irregular, and 150 pieces of Good shape. The original image size was 640 × 480. Images were rotated to have the major axis aligned horizontally.
Oyster shape grading is subjective. Banana shapes can be hard to separate from Irregular, and some Broken shapes can appear very close to Good. We had an experienced grader select 38 pieces of Banana shape, 75 pieces of Irregular shape, and 113 pieces of Good shape to generate a training set. The samples in this training set are the least ambiguous ones in their respective categories. Our test set was also created by experienced graders. Our algorithm was trained and tested with these human-graded samples, which, by design, guides it to perform shape grading that satisfies human preference.
3.2.1. Performance
The proposed visual inspection algorithm successfully graded the test images into three grades according to their shape quality. An overall accuracy of 98.6% was achieved, easily surpassing human graders’ average accuracy of 75%~85%. The confusion matrix of this experiment is shown in
Table 3. All the Banana and Irregular oysters were correctly classified. There was one Good oyster that was misclassified as Irregular. The error rate change during training is shown in
Figure 12. The error rate dropped slowly as training went on and reached a steady-state error rate of 1.4% after 8 iterations of learning.
We also analyzed the performance of learned features during the evolutionary learning process and through multiple generations.
Figure 13 shows the statistics of the fitness scores for the entire population in each iteration. The feature with the highest fitness score in each iteration was selected for classification. The maximum fitness score reached 100 (a feature that provides a perfect classification result) after around seven learning iterations.
Figure 14 shows how five different features evolved through generations. Our evolutionary learning process is able to learn good features through evolution. Features f1 and f2 took 10 generations to reach their highest fitness scores. Feature f3 was selected, but its fitness plateaued at around 40%. Features f4 and f5 reached their highest fitness scores after six generations. Fitness scores evolve differently, but only the best features are selected for classification.
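The selection dynamic described above can be illustrated with a toy loop. The population, fitness function, and mutation operator below are stand-ins for the actual transform-chain candidates and their classification-based fitness scores.

```python
import random

def evolve(population, fitness, generations=10, seed=0):
    """Toy sketch of the selection loop: score every candidate feature,
    keep the fittest half, and refill the population with mutated copies.
    `mutate` is a stand-in for swapping transforms in a feature chain."""
    rng = random.Random(seed)

    def mutate(candidate):
        c = list(candidate)
        c[rng.randrange(len(c))] = rng.random()  # perturb one "transform slot"
        return c

    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elite = population[: len(population) // 2]   # keep the fittest half
        population = elite + [mutate(rng.choice(elite)) for _ in elite]
    return max(population, key=fitness)
```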
3.2.2. Visualization of Oyster Features
The most straightforward way to interpret the features our evolutionary learning algorithm constructs is to show the image transform outputs for each class. The outputs of each image transform in a feature were averaged over all training images and normalized from zero to 255 so they can be viewed as images. White pixels indicate responses that are more common among all training images. As examples, two features learned from the oyster dataset are shown in
Figure 15. There are three categories of oysters in the dataset, including Good, Banana, and Irregular.
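The averaging-and-normalization step used for these visualizations can be sketched as follows, assuming the averaged transform output is a floating-point array.

```python
import numpy as np

def to_display(avg_output):
    """Normalize an averaged transform output to 0..255 for viewing:
    the brightest (white) pixels are the most common responses
    across the training images."""
    lo, hi = float(avg_output.min()), float(avg_output.max())
    if hi == lo:
        return np.zeros_like(avg_output, dtype=np.uint8)
    return ((avg_output - lo) / (hi - lo) * 255).astype(np.uint8)
```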
There are three rows in
Figure 15. Each of them is for one grade, as marked (Good, Banana, Irregular). The first feature selected and shown in
Figure 15a included four transforms. They were Sobel, Gabor, Gabor, and Gradient transforms. The second feature selected and shown in
Figure 15b included three transforms. They were Gabor, Gradient, and Laplacian transforms.
Both features in
Figure 15 show that shape information is the most important piece of information being extracted by the evolutionary learning algorithm. The first feature in
Figure 15a emphasizes both the shape and texture information. The feature in
Figure 15b emphasizes the shape of the oysters more.
4. Conclusions
Vision-based computer systems have been around for decades, and vision technology has been a proven solution for food grading and inspection since the late 1980s. We have briefly discussed the latest developments in machine vision systems and reviewed sophisticated and more powerful machine learning and deep learning methods that can easily perform the same visual inspection tasks with impressive results. As powerful as the popular convolutional neural network approaches are, they often must be fine-tuned to find the optimal hyperparameters in order to achieve the best classification results.
Most facilities we have come in contact with have either old optical sensor-based systems or bulky computer-based vision systems, neither of which is flexible or user friendly. Most importantly, those systems cannot be easily adapted to the new challenges posed by increasing demands for food quality and safety. Surprisingly, a compact smart camera that is versatile and capable of “learning” is not yet offered by others. What we have presented in this paper is a niche embedded visual inspection solution for food processing facilities: an embedded vision system for the visual inspection of food products. We have shown impressive results for three simple test cases and two real-world applications.
Our evolutionary learning process was developed for simplicity [
21,
22] and for visual inspection of food products. It is not only capable of automatically learning unique information from training images, but also of improving its performance through boosting techniques. Its simplicity and computational efficiency make it suitable for real-time embedded vision applications. Unlike other robot vision applications, visual inspection for factory automation usually operates indoors under controlled lighting, especially when using LED lights with regulated voltage. This is another reason our simple visual inspection algorithm works well.
We performed training multiple times for our date and oyster datasets on a desktop computer, using our training images for up to 80 iterations to obtain between 30 and 50 features. We used the learned features on our smart camera equipped with a Cortex-A57 processor. The processing time was approximately 10 milliseconds per prediction, or 100 frames per second. Our algorithm proved efficient enough for very high frame rates even on a small ARM processor.