1. Introduction
Plastic parts and products have become an integral part of daily life and remain essential in many everyday routines. However, once a product has been used, it gives rise to environmental issues, particularly plastic waste pollution, which is a significant concern in today’s world. Recent research indicates that, on average, humans could ingest 0.1–5 g of plastic per week, with the highest value equivalent to consuming a credit card weekly [
1]. As of 2021, there were 24.4 trillion pieces of microplastics in the world’s oceans, which is equivalent to 8.2 × 10⁴ to 57.8 × 10⁴ tons [
2]. Moreover, the global plastic market is projected to grow at an annual rate of 3.4% from 2021 to 2028 [
3]. Plastic offers advantages, such as low weight, durability, and cost-effectiveness, which have contributed to the rapid growth of the plastic industry [
4,
5]. Astonishingly, humans produce approximately 500 million tons of plastic products each year, with 40% being single-use items [
6]. Despite this substantial production and single-use rate, less than 16% of plastic is recycled. Insufficient recycling facilities and confusion regarding what can and cannot be recycled result in the majority of plastics ending up in landfill [
7]. The process of classifying plastic waste, whether through manual or automated means, represents a significant challenge in waste management. Manual methods require more time, effort, and labour, making them less profitable compared with automated approaches.
Automated methods for plastic classification utilize sensors, radiation, or analysis of the chemical and physical properties of different materials [
8]. A common technique employed in automated classification processes is spectroscopy, which has been the subject of numerous studies aiming to enhance classification efficiency [
9,
10]. The most common spectroscopic methods investigated for this purpose are Raman spectroscopy, laser-induced breakdown spectroscopy (LIBS), infrared spectroscopy, and X-ray spectroscopy. However, the presence of additives, such as flame retardants, cannot be detected accurately using LIBS and Raman spectroscopy [
11]. Infrared spectroscopy, particularly near-infrared spectroscopy (NIR) within the wavelength range of 0.8 to 2.5 μm, has gained significant attention globally for plastic identification and classification, and has already been implemented in some recycling facilities, exhibiting high performance in identifying polymer classes. However, it may not be suitable for classifying black plastics due to their high absorption within this wavelength range [
12]. Fourier-transform infrared (FTIR) spectroscopy in the medium-wave infrared (MWIR) range can effectively classify polymers by type, including black plastics. However, it is highly sensitive to plastic shape and surface characteristics [
13].
In contrast, X-ray spectroscopy can classify plastics based on traces, making it suitable for the identification of black plastics, but less effective in distinguishing between polymers of the same family due to their identical chemical composition [
14].
In the field of plastic classification, researchers have explored non-invasive techniques, such as near-infrared (NIR) and medium-wave infrared (MWIR) spectroscopy, for material characterization. However, these spectroscopic methods are influenced by factors like surface topology, particle size, and orientation, which can affect the spectra. Various statistical methods have been employed to identify and classify polymer spectra. These efforts have aimed to differentiate between types of plastics, but certain challenges remain. For example, the NIR domain has limitations in identifying resin collections and distinguishing between high-density polyethylene (HDPE) and low-density polyethylene (LDPE).
Recent studies have utilized techniques like attenuated total reflectance Fourier-transform infrared spectroscopy (ATR-FTIR) and mid-infrared hyperspectral imaging (MIR-HSI) for plastic classification. Researchers have developed reference spectral libraries and used ATR-FTIR measurements for polymer identification. Hyperspectral imaging has been employed for the classification of marine microplastics, and deep learning architectures have been utilized for the automatic counting and classification of microplastics. These studies indicate the potential for improving plastic classification using machine learning and advanced spectroscopic techniques.
Some researchers have begun applying machine learning and AI to the plastic classification process. Lorenzo-Navarro et al. developed a deep learning network architecture for the automatic counting and classification of microplastics between 1 and 5 mm in size, classifying them into fragments, pellets, and lines [
15]. Their results suggest that deep learning architectures have the potential to improve further and can be applied to different spectral ranges. Jacquin et al. employed a mid-infrared hyperspectral imager (MIR-HSI) camera and machine learning algorithms to propose a cautious classification procedure for waste electrical and electronic equipment (WEEE) plastics, ensuring high purity of the classified samples [
16]. However, the MWIR wavelength range did not work effectively for certain plastics of interest, limiting the classification to only four polymer classes. Wu et al. compared the performance of three algorithms—spectral angle mapper (SAM), partial least-squares discriminant analysis (PLS-DA), and linear discriminant analysis combined with principal component analysis (PCA-LDA)—for classifying NIR spectra from WEEE plastic samples [
17]. While achieving good results, the NIR experiment was primarily suitable for light-coloured plastics, whereas most WEEE plastics are dark. Additionally, several studies have utilized hyperspectral imaging (HSI) in the short-wave infrared range (SWIR: 1000–2500 nm) for different imaging classifications [
18,
19,
20,
21].
Carrera et al. carried out an extensive implementation of machine learning models for polymer identification and classification using infrared spectra (NIR or MWIR) measured using different spectrometers and wavelengths. Their study focused on specific plastic types, but did not cover black plastics or differentiation between HDPE and LDPE [
3].
Considering all the existing approaches discussed above, there is an important issue in plastic recycling that still needs to be addressed, especially when microwave-heating of foods is involved [
22]. The mixing of plastics containing toxic materials with food-grade plastics poses a significant concern in terms of public health and safety. When different types of plastics, especially those intended for non-food applications, are mistakenly mixed with food-grade plastics, it can lead to contamination of the food packaging or storage containers [
23]. Toxic substances present in certain plastics, such as phthalates, bisphenol A (BPA), or heavy metals, can leach into the food, posing potential health risks upon consumption. Therefore, strict separation and proper identification of plastics during the recycling process are crucial to prevent the mixing of toxic materials with food-grade plastics and ensure the integrity of food packaging and storage systems.
In this study, we propose a complementary first-stage separation technique that separates food-grade plastic packaging from all other plastics at the beginning of the recycling process. We have developed a unique label coated with a thin film that is transparent to the visible spectrum of light yet possesses infrared-reflecting properties. This thin film coating technology, coupled with a computer vision-based system, can identify food-grade plastic packaging at the first stage of the recycling process. The proposed methodology classifies thermal images of the packaging via the label: a thermal camera distinguishes the special food-grade label from a conventional label, irrespective of the pattern printed on the label.
2. Experimental
In this study, we explore the application of a unique metal oxide coating on polyvinyl chloride (PVC) label sheets, deposited using a V6000 confocal sputter system (manufactured by Scientific Vacuum Systems LTD., Reading, UK) equipped with an RF plasma source, to facilitate the classification of plastics during recycling. The coating, which remains transparent to visible light, exhibits the distinct property of reflecting mid-infrared (mid-IR) radiation. As such, it will not interfere with the desired pattern or designs of the labels placed on food products. By utilizing these coated labels affixed to plastic packaging, we present a novel approach to plastic classification based on infrared thermal imaging.
The experimental setup involves the deposition of the metal oxide coating onto plastic packaging labels using the sputter deposition system. The nature of the metal oxide and its specific structure and deposition regime cannot be disclosed in full detail for commercial purposes. However, it is primarily based on zinc oxide with a particular doping and specific control of its deposition process to achieve maximum mid-infrared reflection. Infrared reflection by metal oxides occurs due to their unique electronic and structural properties. Metal oxides exhibit different levels of infrared reflectivity depending on factors such as composition, crystal structure, and surface morphology. Via the selective doping of metal oxides, the concentration of charge carriers can be manipulated, which ultimately translates to bandgap manipulation [
24]. Different metal oxides, such as aluminium oxide (Al₂O₃), titanium dioxide (TiO₂), or zinc oxide (ZnO), exhibit different levels of infrared reflectivity and have unique mechanisms governing their behaviour. The deposition conditions and doping can lead to eliminating energy levels that typically absorb photons in the mid-infrared range and, thus, enhancing the reflectivity of infrared photons. ZnO can also absorb infrared radiation through the mechanism of free carrier absorption. When ZnO is doped or contains defects, it can generate free electrons or holes that can absorb infrared photons whose energy matches their levels. This absorption can reduce the overall reflectivity of ZnO in the infrared range [
24,
25]. Therefore, precise deposition conditions must be selected to prevent free carrier absorption. In our experiments, while developing these specific coatings, the following conditions were precisely fine-tuned for the desired reflective properties:
The deposition has to be carried out at a very specific RF plasma power, which translates to the kinetic energy of the argon atoms attacking the target and ejecting the zinc and doping element to be deposited on the substrate;
Very low chamber pressures. The best results were observed when depositing at pressures below 1.5 × 10⁻³ mbar. Above this chamber pressure, the infrared reflectivity was significantly reduced.
Generally, depositing under low pressure has certain advantages, such as increasing the mean free path, decreasing gas scattering, increased energy transfer, and decreased gas density variation. Gas density variation can lead to uneven deposition and hinder uniformity [
26,
27]. Considering that our objective was to achieve maximum infrared reflection, smoother and more even surface coatings were desired, which is why the pressure limit affected the reflectivity performance.
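To give a sense of scale for the mean free path argument, the sketch below evaluates the standard kinetic-theory estimate λ = k_B·T / (√2·π·d²·p) at the reported pressure limit. The gas temperature (300 K) and the argon kinetic diameter (about 0.34 nm) are assumptions introduced for illustration; neither value is reported in this study.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def mean_free_path_m(pressure_mbar, temp_k=300.0, diameter_m=3.4e-10):
    """Kinetic-theory mean free path of a gas in metres.

    temp_k and diameter_m are assumed values (room temperature and an
    approximate kinetic diameter for argon), not measurements from the study.
    """
    pressure_pa = pressure_mbar * 100.0  # 1 mbar = 100 Pa
    return K_B * temp_k / (math.sqrt(2) * math.pi * diameter_m**2 * pressure_pa)

# Pressure limit reported in the text: 1.5e-3 mbar.
print(f"{mean_free_path_m(1.5e-3) * 100:.1f} cm")
```

Under these assumptions the mean free path is on the order of a few centimetres, and it scales inversely with pressure, so raising the chamber pressure shortens it proportionally.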
The metal oxide coating, with its specific composition and thickness, imparted the desired optical properties necessary for mid-IR reflection while maintaining transparency to visible light. The coated labels were then attached to multiple plastic objects representing different types of plastics commonly encountered in recycling processes.
To validate the effectiveness of the proposed classification method, we acquired thermal images of plastic samples both with and without the metal oxide-coated labels. These thermal images served as inputs for a computer vision-based image classification algorithm, which enabled the automated identification and sorting of plastic materials. By leveraging the distinctive mid-IR reflection properties of the metal oxide coating, the thermal images provided valuable information for accurate plastic classification.
The main objective of this research was to demonstrate the feasibility and effectiveness of using our unique transparent metal oxide coating in conjunction with infrared thermal imaging for plastic classification during recycling. By streamlining the sorting process through automation, we aimed to enhance the efficiency and accuracy of food plastic recycling, ultimately contributing to the reduction in toxins contained in food-grade plastics during recycling.
All the labels (coated and uncoated) were identical. The pattern of the labels is presented in
Figure 1. The labels were coated under vacuum (under 1.5 × 10⁻³ mbar) in an argon atmosphere with a plasma power of 100 W applied to a 6-inch magnetron fitted with a target material with a unique metal ZnO/doping complex. At an industrial scale, such label preparation can be carried out on a roll-to-roll basis with a plasma coating system that can be adequately designed for this objective to produce the labels very cheaply.
3. Results
The Fourier-transform infrared (FTIR) transmittance spectra of the coated and uncoated labels are depicted in
Figure 2. FTIR analysis provides valuable insights into the molecular composition and chemical characteristics of samples; however, in this study, our objective for using FTIR was to illustrate the comparative transmission of the coated labels and the uncoated labels in the IR region. The peaks present in
Figure 2 are, as such, of no interest, as they are only related to the primary label material. From
Figure 2, we can clearly see that our coating transmitted a significantly lower quantity of infrared radiation between 2000 and 16,500 nm; this radiation was instead reflected, as observed in the thermal images. The UV/Vis spectrum of a coated label in the visible to near-infrared region is presented in
Figure 3.
The labels coated with the infrared-reflecting film and the uncoated labels were placed on various plastic objects, with identical objects used for the coated and uncoated labels. The samples were then deformed randomly by hand to mimic their shape after disposal, as they would appear on a recycling conveyor belt. A total of just under 450 images of various plastic objects with an IR-reflecting label or a standard label were obtained using a FLIR E96 thermal imaging camera. The FLIR E96 has a thermal resolution of 640 × 480 pixels, which enables accurate temperature measurement and detailed thermal imaging. With a thermal sensitivity of less than 0.03 °C, it can detect even subtle temperature variations. An example of objects labelled accordingly is illustrated in
Figure 4. However, many more objects, as well as varying orientations and angles of view, were used to capture the dataset of images for this project.
The image data were split into training and validation sets: 356 images were allocated for model training and 89 images were held out to evaluate the performance of the computer vision models intended for the classification process. Two models were evaluated. The first model, which has a simple structure, was built using the Sequential API of the TensorFlow software library.
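The 356/89 split corresponds to holding out 20% of 445 images. A minimal sketch of such a split is given below; the file names, the random shuffling, and the fixed seed are illustrative assumptions, since the text does not state how images were assigned to each set.

```python
import random

def split_dataset(paths, val_fraction=0.2, seed=0):
    """Shuffle paths reproducibly and hold out a validation subset."""
    shuffled = list(paths)
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]

# Hypothetical file names standing in for the 445 thermal images.
image_paths = [f"thermal_{i:03d}.png" for i in range(445)]
train_paths, val_paths = split_dataset(image_paths)
print(len(train_paths), len(val_paths))
```

In practice one would likely also stratify by label class so that coated and uncoated samples appear in similar proportions in both sets.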
3.1. Basic Convolutional Neural Network Model
The model was a sequential deep learning model designed for image classification tasks. The model architecture consisted of several layers, including convolutional layers, max pooling layers, a flatten layer, dense layers, a dropout layer, and a final dense layer with a sigmoid activation function. The convolutional layers extracted relevant features from the input images, while the max pooling layers downsampled the feature maps to reduce their dimensions. The flatten layer converted the 2D feature maps into a 1D vector. The dense layers performed high-level feature extraction, and the final dense layer with sigmoid activation produced the classification output. The model was compiled with the binary cross-entropy loss function, the Adam optimizer (a stochastic gradient descent method used in various deep learning applications, such as computer vision), and accuracy as the evaluation metric. This architecture and optimization setup yields a two-class classifier; here, the model was trained on the training data to distinguish infrared-reflecting samples from those with conventional uncoated labels. The structure of this model is illustrated in
Figure 5.
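For reference, the sigmoid output and binary cross-entropy loss named above take the following standard form. The symbols are notational assumptions not used elsewhere in the text: z_i is the pre-activation of the final dense layer for image i, ŷ_i is the predicted probability of the positive class, y_i ∈ {0, 1} is the true class, and N is the number of images in a batch. Which label type is encoded as 1 is also an assumption.

```latex
\hat{y}_i = \sigma(z_i) = \frac{1}{1 + e^{-z_i}}, \qquad
\mathcal{L}_{\mathrm{BCE}} = -\frac{1}{N} \sum_{i=1}^{N}
  \left[ y_i \log \hat{y}_i + (1 - y_i) \log\left(1 - \hat{y}_i\right) \right]
```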
The model performance metrics plots are illustrated in
Figure 6. The validation loss plot shows that, while good training accuracy was achieved, the model was overfitting the training data. Because the thermal signatures varied significantly depending on how the samples were crushed or deformed, and because a thermal image contains fewer features than a standard image of the same objects, such a simple model would find it challenging to reach 100% accuracy.
In the context of machine learning models, training loss, validation loss, accuracy, and validation accuracy are important metrics used to evaluate and monitor the performance of the model during the training process. Training loss refers to the measure of error or mismatch between the predicted output of the model and the actual target output on the training dataset. It quantifies how well the model is learning the patterns and relationships within the training data. The goal during training is to minimize the training loss, indicating that the model is improving its ability to make accurate predictions.
Validation loss, on the other hand, measures the error on a separate validation dataset that is not used during the training phase. It provides an estimate of how well the model generalizes to unseen data. Validation loss helps to assess whether the model is overfitting or underfitting. Ideally, the validation loss should be close to the training loss, indicating that the model is performing well on unseen data.
Accuracy is a metric that quantifies the overall correctness of the model’s predictions. It calculates the percentage of correctly classified instances out of the total number of instances. In the context of training, the training accuracy is calculated using the training dataset and reflects how well the model predicts the correct labels for the training data. Validation accuracy measures the performance of the model on the validation dataset (in this case, thermal images of the plastic samples that the model has not encountered before). It indicates how accurately the model predicts the labels of the validation data. A high validation accuracy indicates that the model is performing well and generalizing effectively to unseen data.
During the training process, the objective is to minimize both training loss and validation loss, while maximizing accuracy and validation accuracy. Monitoring these metrics helps in assessing the progress of the model, identifying potential issues, such as overfitting, and making informed decisions for model optimization and improvement.
The validation data (set of thermal images of the plastics labelled accordingly, which were kept hidden from the model) were separated into three batches and the trained model was used to classify their images into IR-labelled or conventionally labelled. The three confusion matrices, as well as the associated F1 score, in
Figure 7 illustrate the performance of the model on thermal images that it had never encountered before. The F1 score is a widely used metric for evaluating the performance of classification models. It combines precision and recall into a single value, providing a balanced measure of a model’s accuracy. The F1 score considers both the model’s ability to correctly identify positive instances (precision) and its ability to capture all positive instances (recall). It is calculated as the harmonic mean of precision and recall, ranging from 0 to 1, where a higher value indicates better performance. The F1 score is particularly useful when the dataset is imbalanced, where the number of instances in different classes varies significantly. By considering both precision and recall, the F1 score provides a comprehensive assessment of the model’s ability to make accurate and comprehensive predictions.
A confusion matrix is a tabular representation that summarizes the performance of a classification model. It provides a visual representation of how well the model predicted the actual classes of the thermal images. The matrix consists of rows and columns, with each row representing the instances in a predicted class and each column representing the instances in an actual class.
The confusion matrix shows four key metrics:
True Positives (TP—Top green): The number of instances correctly predicted as positive by the model;
True Negatives (TN—Bottom green): The number of instances correctly predicted as negative by the model;
False Positives (FP—Top pink): The number of instances incorrectly predicted as positive by the model;
False Negatives (FN—Bottom pink): The number of instances incorrectly predicted as negative by the model.
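The four counts above, and the F1 score derived from them, can be sketched in a few lines. The label vectors below are hypothetical toy data, not results from this study, and encoding the IR-reflecting label as the positive class (1) is an assumption.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; 0.0 when undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical toy labels: 1 = IR-reflecting label, 0 = conventional label.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(tp, tn, fp, fn, f1_score(tp, fp, fn))
```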
From the confusion matrices presented in
Figure 7, we can see that, while the model was able to detect the infrared reflections in the majority of the images, a number of misclassifications were made. Overall, 7 of the 89 images were misclassified, corresponding to an accuracy of about 92%, which is not acceptable for this application. Four of the misclassified images are illustrated in
Figure 8. Although the reflection of the IR is visually noticeable in these images, the model wrongly classified them as the conventional uncoated label type.
As such, we can conclude that the model needs to be powerful enough to distinguish the shadows and brightness patterns of an infrared image. Therefore, we decided to implement transfer learning and benefit from a more sophisticated model.
3.2. Transfer Learning Model
ResNet-50 v2 is a convolutional neural network (CNN) architecture that is a variant of the original ResNet-50 model. ResNet stands for “Residual Network,” and it was introduced by He et al. to address the degradation problem encountered in very deep neural networks [
28]. ResNet-50 v2 improves upon the original ResNet-50 by introducing changes to the architecture to enhance performance and training efficiency. The ResNet-50 v2 architecture consists of 50 layers, including convolutional layers, pooling layers, fully connected layers, and shortcut connections. The core building block of the network is the residual block, which contains two or three convolutional layers, depending on the variant. The residual block introduces shortcut connections that allow the network to learn residual mappings, instead of attempting to learn the full transformation. This helps alleviate the vanishing gradient problem (the vanishing gradient problem occurs when gradients become extremely small during deep neural network training, leading to slow learning or ineffective training of earlier layers) and enables the training of very deep networks. During the recycling process, the plastic samples can be crushed or deformed in shape and, as such, the IR signature captured by the thermal camera can look significantly different, as observed in
Figure 8; a more powerful model such as ResNet is therefore better suited to identifying the varied thermal signatures for the classification objective.
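The residual mapping described above can be written compactly. In the notation assumed here (not used in the original text), x is the block input, F is the stack of convolutional layers inside the block with weights W, y is the block output, and L is the training loss.

```latex
\mathbf{y} = \mathbf{x} + \mathcal{F}(\mathbf{x}; W), \qquad
\frac{\partial \mathcal{L}}{\partial \mathbf{x}}
  = \frac{\partial \mathcal{L}}{\partial \mathbf{y}}
    \left( I + \frac{\partial \mathcal{F}}{\partial \mathbf{x}} \right)
```

The identity term I gives the gradient a direct path through the shortcut connection, which is why stacking many such blocks is less prone to vanishing gradients than stacking plain layers.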
In ResNet-50 v2, the residual blocks are organized into different stages. The first stage performs initial convolution and pooling operations, while subsequent stages consist of multiple residual blocks stacked together. The number of residual blocks per stage may vary depending on the architecture variant.
Another key feature of ResNet-50 v2 is the use of bottleneck blocks. These blocks are designed to reduce computational complexity by employing 1 × 1 convolutions to reduce the number of input channels, followed by 3 × 3 convolutions, and finally 1 × 1 convolutions to restore the number of channels. This bottleneck design allows the model to achieve better performance with fewer parameters. ResNet-50 v2 has demonstrated excellent generalization capabilities, allowing it to learn hierarchical features that are transferable across different datasets and tasks. This makes it suitable for transfer learning and fine-tuning on specific domain datasets. We modified the output layer of ResNet-50 v2 to match it to our set of binary data and kept all of the original weights frozen, only training the model for weights associated with the modified output layer. The model’s structure is presented in
Figure 9.
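The saving offered by the bottleneck design can be illustrated by counting convolution weights. The sketch below compares a 1 × 1, 3 × 3, 1 × 1 bottleneck with two plain 3 × 3 convolutions at the same outer channel count; the sizes used (256 outer channels, 64 inner channels) are illustrative assumptions, and biases and normalization parameters are ignored.

```python
def conv_params(kernel, c_in, c_out):
    """Weight count of a kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out

def bottleneck_params(channels, width):
    """1x1 reduce -> 3x3 at reduced width -> 1x1 restore."""
    return (conv_params(1, channels, width)
            + conv_params(3, width, width)
            + conv_params(1, width, channels))

def plain_block_params(channels):
    """Two stacked 3x3 convolutions at full width."""
    return 2 * conv_params(3, channels, channels)

# Illustrative sizes only: 256 outer channels, 64 inner channels.
print(bottleneck_params(256, 64), plain_block_params(256))
```

With these sizes the bottleneck uses 69,632 weights against 1,179,648 for the plain block, roughly a 17-fold reduction.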
The transfer learning model was then trained with the same training data over 60 epochs, and the learning curve metrics of the model are presented in
Figure 10. The model was then tested using the same validation data, and it was able to classify the infrared thermal images with 100% accuracy. The confusion matrices demonstrating the accuracy of the model’s performance are presented in
Figure 11.
4. Discussion
By precisely manipulating the optoelectronic properties of a zinc oxide-based thin film via doping and precise deposition, its mid-infrared-reflecting properties can render the film useful for various applications. In our studies, the main challenge faced was identifying hot spots under the precise conditions of deposition, particularly the deposition chamber pressure and the RF plasma generating power.
Figure 12 illustrates the UV-Vis spectroscopy result associated with a portion of our samples during the fine-tuning process of achieving a thin film with the desired infrared-reflecting properties. Increasing plasma power would lead to a higher rate of deposition; however, we had to ensure that the power did not exceed levels that would damage the PVC labels. While a plasma power of 1 W cm⁻¹ achieved faster coating on glass substrates, it would damage the labels. Hence, we had to identify a plasma power threshold that would yield optimal coating conditions without damaging the labels. The chamber pressure was of particular interest, as only under 1.5 × 10⁻³ mbar could we observe a drop in the IR absorption after the 1500 nm wavelength range.
However, when considering real-world applications, it is important to note that the labels coated with the film reflected infrared light, just as a mirror reflects visible light. This means that the angle of incidence from a thermal radiating source (in our experiments, a black-bodied object with a temperature of above 40 degrees centigrade) to the object carrying the label needs to be such that the thermal reflection falls within the receptive field of the thermal camera lens. Considering that such objects may be crushed and deformed, a single camera will fail to capture the thermal reflection; as such, a setup, as illustrated in
Figure 13, will need to be considered in which two thermal cameras operate simultaneously. This would add a one-off cost to the process, but would not be difficult to implement.
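The mirror-like behaviour described above can be sketched with the specular reflection formula r = d − 2(d·n)n. The geometry below is entirely hypothetical (source direction, camera positions, a 15-degree acceptance half-angle, and a 20-degree label tilt are all assumptions); it only illustrates why a deformed label can steer the reflection away from one camera and towards another.

```python
import math

def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def reflect(incident, normal):
    """Specular reflection of a unit direction about a unit surface normal."""
    d_dot_n = sum(a * b for a, b in zip(incident, normal))
    return tuple(a - 2.0 * d_dot_n * b for a, b in zip(incident, normal))

def reaches_camera(ray, label_to_camera, half_angle_deg):
    """True if the reflected ray points at the camera within its acceptance cone."""
    cos_angle = sum(a * b for a, b in zip(normalize(ray), normalize(label_to_camera)))
    return cos_angle >= math.cos(math.radians(half_angle_deg))

# Hypothetical geometry: radiation arrives 45 degrees off the vertical.
incident = normalize((1.0, 0.0, -1.0))
camera_a = (1.0, 0.0, 1.0)  # on the mirror direction of a flat label
camera_b = (0.0, 0.0, 1.0)  # directly above the belt

flat_normal = (0.0, 0.0, 1.0)
tilt = math.radians(20.0)   # assumed tilt of the label after crushing
tilted_normal = (-math.sin(tilt), 0.0, math.cos(tilt))

flat_ray = reflect(incident, flat_normal)
tilted_ray = reflect(incident, tilted_normal)
print(reaches_camera(flat_ray, camera_a, 15.0))
print(reaches_camera(tilted_ray, camera_a, 15.0))
print(reaches_camera(tilted_ray, camera_b, 15.0))
```

In this toy configuration the flat label reflects into the first camera, while the tilted label misses it and is picked up by the second, since tilting a reflecting surface by an angle rotates the reflected ray by twice that angle.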
The thermal source can be a surface that bears a certain temperature, ideally above 40 degrees centigrade. The design of the thermal source can be carried out considering the reflectance of the IR radiation above the 1500 nm range and utilizing the Stefan–Boltzmann law and, in particular, Wien’s law. Wien’s law, also known as Wien’s displacement law, is a fundamental principle in the study of black body radiation [
29]. It establishes a relationship between the temperature of a black body and the wavelength at which the intensity of its radiation is at its maximum (peak wavelength). According to Wien’s law, the product of the peak wavelength (λ_max) and the temperature (T) of a black body is a constant. Mathematically, it can be expressed as λ_max = C/T, where C is Wien’s displacement constant (2.898 × 10⁻³ m·K) and T is the temperature of the black body (the thermal radiating source for our experiment) in kelvins. For example, a black body at a temperature of 40 degrees Celsius (313.15 K) will emit thermal radiation with a peak wavelength of 9254 nm.
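The peak-wavelength figure can be checked directly from Wien’s displacement law. The sketch below uses the constant quoted in the text; the function name is introduced here purely for illustration.

```python
WIEN_CONSTANT_M_K = 2.898e-3  # Wien's displacement constant, m*K

def peak_wavelength_nm(temp_celsius):
    """Peak emission wavelength of a black body, in nanometres."""
    temp_kelvin = temp_celsius + 273.15
    return WIEN_CONSTANT_M_K / temp_kelvin * 1e9

print(round(peak_wavelength_nm(40.0)))
```

A warmer source shifts the peak to shorter wavelengths, so the source temperature can be chosen with the reflective band of the coating in mind.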
The Stefan–Boltzmann law describes the total power radiated by a black body and states that the power emitted per unit area is proportional to the fourth power of the absolute temperature [
30]. Mathematically, it can be expressed as P = σ × A × T⁴, where P is the power, σ is the Stefan–Boltzmann constant (approximately 5.67 × 10⁻⁸ W/(m²·K⁴)), A is the surface area, and T is the absolute temperature. This means, for example, that an ideal black-body thermal source with one square meter of area at 40 degrees Celsius (313.15 K) will emit around 545 watts of power. As such, designing a detection module as depicted in
Figure 13 will not be hard to achieve and will be within economical scope.
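The Stefan–Boltzmann estimate can be reproduced in the same way, taking the 40 °C source temperature discussed above. The emissivity parameter is an assumption added for illustration (1.0 corresponds to an ideal black body); a real heated surface would emit somewhat less, and the net power exchanged also depends on the temperature of the surroundings.

```python
STEFAN_BOLTZMANN = 5.67e-8  # W/(m^2*K^4)

def radiated_power_w(temp_celsius, area_m2=1.0, emissivity=1.0):
    """Total power radiated by a surface; emissivity=1.0 is an ideal black body."""
    temp_kelvin = temp_celsius + 273.15
    return emissivity * STEFAN_BOLTZMANN * area_m2 * temp_kelvin**4

print(round(radiated_power_w(40.0)))
```

For a 1 m² black body at 40 °C this evaluates to roughly 545 W.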
The successful application of the thin film coating process to create infrared-reflecting labels, coupled with an AI-based model for thermal image classification, can open new avenues in the field of plastic recycling. What is rather interesting is that we achieved the best results using a ResNet50 model pretrained on standard (non-thermal) images spanning 1000 classes. ResNet50 has not been trained on thermal images, yet it successfully captured features from the thermal reflections and accurately classified the labelled samples. By harnessing the properties of the unique metal oxide coating, our study leverages the power of AI-based deep learning models for image classification. On a final note, the labels produced in this study were, in fact, based on a plastic that is ultimately a pollutant itself. More recently, biodegradable materials for food packaging have been explored [
31,
32,
33], and this labelling technology can potentially be coupled with these novel materials, either by labelling them or, more interestingly, by making the label itself from such biodegradable materials. Such an approach would bring this labelling concept further in line with the objective of minimizing environmental plastic pollution.
5. Conclusions
In this report, we conducted an explorative study that showcases the potential of infrared-reflecting labels created via the plasma sputtering of a thin film coating on the labels. Our research demonstrates that, when combined with AI-based deep learning models for image classification, these labels hold significant promise in effectively separating food-grade plastics during the initial stages of plastic recycling. Although this methodology is in its nascent stages, further advancements and refinement can yield remarkable results, particularly in the context of segregating food-grade plastics from plastics used for other applications, in particular, exploring the implementation of our coating on biodegradable labels.
While the potential of this methodology is already evident, further research and development hold tremendous promise for its integration into existing plastic-recycling workflows. This advancement would enable the efficient separation of food-grade plastics, which often require stricter processing and handling protocols, from plastics used in other applications.
The merging of thin film material science with artificial intelligence (AI) is generating significant excitement and opening new possibilities in research and applications. By combining the expertise of material scientists with the power of AI, we can accelerate the discovery and development of novel thin film materials with enhanced properties and functionalities. AI algorithms can efficiently analyse large datasets, identify patterns, and make predictions, enabling the rapid screening and optimization of thin film materials for specific applications.