1. Introduction
With the tremendous growth of population and consumption, a huge amount of municipal solid waste (MSW) is generated every day, especially in developing countries [1]. MSW in developing countries is composed mainly of household garbage (55–80%) and commercial waste (10–30%) [2]. China has a population of 1.4 billion people and is the most populous developing country in the world. According to a survey conducted by the Ministry of Ecology and Environment of the People’s Republic of China, the 200 large and medium-sized cities investigated generated 21,147.3 million tons of household solid waste (HSW) in 2018 [3]. Landfilling, the dominant waste disposal technology in China, has caused serious water contamination at over half of the existing landfills, owing to the limited land available in cities and the lack of costly leachate collection equipment in treatment systems [4]. Incineration, another important disposal method, is expensive to operate and maintain, and readily causes air pollution when air pollution control equipment is lacking [5]. Moreover, incineration emits dioxins, which are toxic and persistent pollutants, as well as greenhouse gases that aggravate global warming. Thus, solid waste disposal has become a challenging problem in China.
In the modern procedure of waste disposal, which includes waste separation, collection, transportation, and final treatment, recyclables and compostable waste account for 89.3% of HSW [6,7]. The accurate and efficient classification of waste can prevent waste pollution caused by mixing waste of different types, and can preclude the need for secondary sorting. As the initial point of the entire waste recycling process and the fundamental condition for ensuring effective recycling, the classification of waste can both enhance the efficiency of recycling and effectively protect the environment [8]. In other words, waste sorting is an effective way to reduce waste [9]. Because various types of waste require different types of disposal, a proper HSW standard is imperative in waste classification [10].
Waste classification is the procedure by which waste is assigned to specific classes based on its properties, characteristics, and/or components [6]. In past years, to effectively dispose of waste in China, the general criterion of municipal waste separation has been to divide waste into two types: recyclables and non-recyclables [6]. Nevertheless, to keep up with the practical demands of economic and environmental development, Beijing enacted a new waste classification policy in 2020 based on the policies of developed countries, including Japan [11] and Germany [12]. The new standard classifies waste into four types, namely wet waste, recyclable waste, harmful waste, and dry waste [13]. Improving the efficiency of waste treatment relies not only on scientific criteria for waste classification, but also on the reliable and fast implementation of that classification [14].
At present, waste classification mostly relies on inefficient manual work, which has many shortcomings, such as high work intensity, high cost, and potential harm to the health of workers [15]. Recently, automatic waste sorting and recycling systems based on spectroscopic sensing have been proposed. For example, Wu et al. [16] proposed an automatic plastic sorting system that separates different plastics from waste electrical and electronic equipment based on near-infrared (NIR) spectroscopy. Additionally, Riba et al. [17] presented an approach for the sensing and classification of parts of an automatic waste textile sorting machine based on the infrared spectra of textile samples. These two studies focused on the finer-grained classification of plastics and textiles, respectively. However, gathering spectral data for further processing requires particular equipment, including NIR spectrometers, such as the NIR512 spectrometers by Ocean Optics, and a halogen light source to provide the NIR radiation. This equipment is expensive and complicated, and must be operated by professional personnel. In contrast, image-based waste classification via machine learning can be accurate, simple, and convenient, and could therefore be used to construct an automatic smart waste sorter that alleviates the difficulties inherent in manual waste classification.
With the development of machine learning in recent years, deep learning has been widely used in speech recognition, visual object recognition, object detection, and many other fields [18,19,20]. A convolutional neural network (CNN) is a typical model that has been extensively used in image recognition and detection problems. Most recently, image recognition techniques in the computer vision field have been applied to waste classification. For instance, Xie et al. [21] proposed a framework based on a multilayer hybrid deep learning system (MHS) to recognize waste in urban public areas as recyclables or other types of waste. AlexNet [22] was used to extract representative features from waste images, and multiple functional sensors were used to obtain other information about the waste. While a high accuracy of over 90% was achieved, only two categories of waste were considered.
Thung et al. [23] released a dataset called TrashNet, which consists of 2527 images of waste divided into six different classes, namely glass, paper, plastic, metal, cardboard, and trash. The authors of Reference [24] proposed a model called RecycleNet, which achieved a classification accuracy of 81% on the TrashNet dataset. The authors of Reference [25] proposed a combined model called Inception-ResNet, which achieved a classification accuracy of 88.6% on the TrashNet dataset. However, the size of this dataset is not large enough for deep learning, and easily leads to overfitting. Moreover, the performance results of these models have room for improvement.
In the context of classification tasks, ensemble-based methods have been employed to minimize the test errors [26]. There also exist several ensemble strategies for combining the predictions of different models. For example, Szegedy et al. [27] proposed an ensemble method that averages the softmax probabilities over all the individual classifiers to obtain the final prediction, and this method was found to outperform single classifiers. Additionally, Chen et al. [28] adopted majority voting to combine models, and obtained a similar result. However, neither majority voting nor softmax probability averaging considers the differences between classifiers; both assign the same weight to every classifier. As a result, these methods are prone to false predictions, because the ensemble treats all individual classifiers as equally reliable. In a weighted ensemble, each weight coefficient must be set properly, as it is essential to the final prediction [29].
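The two equal-weight strategies discussed above can be contrasted with a minimal sketch. The function names and the softmax outputs below are hypothetical and chosen only for illustration; they are not taken from the cited works.

```python
import numpy as np

def average_softmax(probs):
    """Average the softmax vectors of all models and pick the top class.

    probs: array of shape (n_models, n_classes).
    """
    return int(np.argmax(probs.mean(axis=0)))

def majority_vote(probs):
    """Let each model vote for its own top class and pick the most voted one.

    Ties are resolved in favour of the lowest class index.
    """
    votes = probs.argmax(axis=1)
    counts = np.bincount(votes, minlength=probs.shape[1])
    return int(np.argmax(counts))

# Hypothetical softmax outputs of three classifiers for one image.
probs = np.array([
    [0.45, 0.55, 0.00],  # model A: narrowly prefers class 1
    [0.40, 0.60, 0.00],  # model B: narrowly prefers class 1
    [0.95, 0.05, 0.00],  # model C: strongly prefers class 0
])

print(average_softmax(probs))  # class chosen by probability averaging
print(majority_vote(probs))    # class chosen by majority voting
```

In this example the two strategies disagree, yet both give every classifier the same influence regardless of how reliable it is, which is the limitation a weighted ensemble is meant to address.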
Unequal precision measurement (UPM) is common in practice. Bar-Shalom et al. [30] proposed a one-step target tracking system solution for measurements obtained in discrete time, and Prieto et al. [31] proposed an adaptive likelihood method for robust data fusion in location systems. These models both fuse data by processing data of different types and with unequal precision, and parameter estimation can be improved with the assistance of data fusion. In the process of data fusion, the fusion weights of multiple heterogeneous unequal-precision data obtained under several different conditions are significant for the improvement of the precision of the measurement result [32].
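As an illustration of why fusion weights matter, the following sketch applies the classical inverse-variance weighting of unequal-precision measurements, in which each measurement is weighted in proportion to the reciprocal of its variance. This is a textbook rule used here as an assumption; it is not necessarily the exact weighting derived later in this paper, and the function name and sample values are hypothetical.

```python
import numpy as np

def inverse_variance_fusion(values, sigmas):
    """Fuse measurements of the same quantity that have unequal precision.

    values: measured values; sigmas: their standard deviations.
    Returns the fused value, its standard deviation, and the normalized weights.
    """
    values = np.asarray(values, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    raw = 1.0 / sigmas**2                 # more precise -> larger weight
    weights = raw / raw.sum()             # normalize so the weights sum to 1
    fused = float(np.dot(weights, values))
    fused_sigma = float(np.sqrt(1.0 / raw.sum()))
    return fused, fused_sigma, weights

# Hypothetical example: two instruments measuring the same quantity.
fused, fused_sigma, weights = inverse_variance_fusion([10.0, 12.0], [1.0, 2.0])
print(fused, fused_sigma, weights)
```

Under this rule the fused estimate lies closer to the more precise measurement, and its standard deviation is smaller than that of either input.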
Based on the preceding discussion, this paper proposes an ensemble learning model called EnCNN-UPMWS, which is based on three CNNs with different architectures and a UPM weighting strategy (UPMWS). The CNN ensemble couples the capabilities of the individual CNNs in learning and exploring the patterns in waste image data, which improves the accuracy of the ensemble. Three state-of-the-art (SOTA) CNNs, namely GoogLeNet [27], ResNet-50 [33], and MobileNetV2 [34], are chosen as ingredient classifiers, and their performance on waste datasets is also demonstrated. To achieve further improvement in waste classification, the UPMWS, which determines the weights for UPM, is introduced in the CNN ensemble. To the best of our knowledge, the UPMWS, which originates in the weighting of measured values during data fusion, has not previously been used in ensemble learning, let alone in an ensemble of CNNs. The main contributions of this work lie in the following three aspects:
In this study, 47,332 images of waste belonging to four different classes, namely wet waste, recyclable waste, harmful waste, and dry waste, were collected from several open-access datasets and the Internet to create the FourTrash dataset;
The proposed framework consists of several diverse SOTA CNNs (GoogLeNet, ResNet-50, and MobileNetV2) with different structures to deeply learn the features and explore the implicit information in waste images. These networks are treated as ingredient classifiers in the CNN ensemble;
UPMWS is introduced to obtain reliable predictions by multiplying the output of each classifier by its corresponding weight coefficient. This can provide more robust results when the predictions of the CNNs are aggregated.
The remainder of this article is organized as follows. Information about the materials and methods is provided in Section 2, and the proposed methods are presented in Section 3. The experimental results and discussion of this study are explained in Section 4. Finally, the conclusions are given in Section 5.
5. Conclusions
In this paper, a framework (EnCNN-UPMWS) based on an ensemble learning strategy of three CNNs (GoogLeNet, ResNet-50, and MobileNetV2) and integration with the unequal precision measurement weighting strategy (UPMWS) was presented for HSW classification. In the proposed EnCNN-UPMWS model, three different types of CNN models are separately trained and saved. During training, the UPMWS is used to compute the weights for individual models. The three trained classifiers are then combined by adding the weighted predicted probability vectors together to obtain the final result for test samples. To evaluate the performance of the developed framework, it was compared with existing SOTA models in terms of four metrics (accuracy, F1-score, weighted F1-score, and macro F1-score) on two waste image datasets, namely FourTrash and TrashNet. In addition, the use of the majority voting method in the ensemble was also compared with the UPMWS.
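The combination step summarized above, in which the weighted probability vectors of the trained classifiers are added together, can be sketched as follows. The weights and probability values are hypothetical placeholders; in the proposed model the weights come from the UPMWS, whose computation is not reproduced here.

```python
import numpy as np

def weighted_ensemble(prob_stack, weights):
    """Combine per-model class probabilities with per-model weights.

    prob_stack: array of shape (n_models, n_samples, n_classes).
    weights: array of shape (n_models,), normalized here to sum to 1.
    Returns the predicted labels and the combined probabilities.
    """
    prob_stack = np.asarray(prob_stack, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    # Weighted sum over the model axis -> shape (n_samples, n_classes).
    combined = np.tensordot(w, prob_stack, axes=1)
    return combined.argmax(axis=1), combined

# Hypothetical outputs of three classifiers for two images and four classes.
prob_stack = np.array([
    [[0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]],  # model 1
    [[0.6, 0.2, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]],  # model 2
    [[0.2, 0.6, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]],  # model 3
])
weights = [0.1, 0.1, 0.8]  # placeholder weights, not values from the paper

labels, combined = weighted_ensemble(prob_stack, weights)
print(labels)
```

For the first image, the heavily weighted third model overrides the other two, whereas equal weights would have selected class 0; this is the kind of difference that a weighting strategy introduces relative to plain averaging.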
Via the comparison of the results presented in Section 4, the proposed EnCNN-UPMWS was found to exhibit enhanced classification performance as compared to GoogLeNet, ResNet-50, and MobileNetV2. Moreover, the experimental results imply that the ensemble learning strategy outperformed the single models, and the proposed UPMWS method for weight setting outperformed the majority voting. On the FourTrash dataset, the overall accuracy of the proposed model for the four waste classes was 92.85%, which was 1.88% higher than the best accuracy of the single models and 0.53% higher than that of voting. Moreover, the macro and weighted F1-scores were respectively 0.8825 and 0.9264, which were respectively 0.026 and 0.0183 higher than the best indices of the single models and respectively 0.0064 and 0.0048 higher than voting. Furthermore, the proposed framework exhibited superior F1-scores for each class. For TrashNet, the overall accuracy of the proposed model for the six waste classes was 93.50%, which was 1.62% higher than the best accuracy of the single models and 0.92% higher than that of voting. Moreover, the weighted and macro F1-scores were respectively 0.9351 and 0.9315, which were respectively 0.0158 and 0.019 higher than the best index of the single models and respectively 0.0093 and 0.0107 higher than voting. Furthermore, the proposed framework was superior to the other models in terms of the F1-score for most categories. Finally, the overall results demonstrate that the proposed EnCNN-UPMWS model can be considered to be a candidate for waste image classification.
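For readers comparing the macro and weighted F1-scores reported above, the following minimal sketch shows how the two averages are conventionally computed from per-class F1-scores. It follows the standard definitions and is assumed, not confirmed, to match the evaluation code used in this study; the labels are hypothetical.

```python
import numpy as np

def f1_summary(y_true, y_pred, n_classes):
    """Return per-class F1, macro F1 (plain mean), and weighted F1
    (mean weighted by the number of true samples in each class)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    f1 = np.zeros(n_classes)
    support = np.zeros(n_classes)
    for c in range(n_classes):
        tp = np.sum((y_true == c) & (y_pred == c))
        fp = np.sum((y_true != c) & (y_pred == c))
        fn = np.sum((y_true == c) & (y_pred != c))
        denom = 2 * tp + fp + fn
        f1[c] = 2 * tp / denom if denom else 0.0
        support[c] = np.sum(y_true == c)
    macro = float(f1.mean())
    weighted = float(np.sum(f1 * support) / support.sum())
    return f1, macro, weighted

# Hypothetical imbalanced example: four samples of class 0, two of class 1.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 1, 0]
f1, macro, weighted = f1_summary(y_true, y_pred, n_classes=2)
print(f1, macro, weighted)
```

When classes are imbalanced and the smaller class is recognized less well, the weighted score exceeds the macro score, which is one possible reading of the gap between the two values on FourTrash.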
The proposed UPMWS method, which assigns a suitable weight coefficient to each base classifier, performed better than the majority voting method, which assigns the same weight to every classifier, and can therefore be applied in ensemble learning for classification tasks. In the future, the potential of the EnCNN-UPMWS model to solve more complicated waste image detection tasks, particularly those involving complex backgrounds, will be explored.