**1. Introduction**

The automatic detection of abnormalities, diseases and pathologies is a significant factor in computer-aided medical diagnosis and a vital component of radiologic image analysis. For over a century, radiology has been a standard method for abnormality detection. A typical radiological examination uses a posterior–anterior chest radiograph, commonly referred to as a Chest X-Ray (CXR). CXR imaging is widely used for health diagnosis and monitoring due to its relatively low cost and easy accessibility; thus, it has been established as the most frequently acquired medical imaging modality [1]. It plays an important role in the detection and diagnosis of several pulmonary diseases, such as tuberculosis, lung cancer, pulmonary embolism and interstitial lung disease [1]. However, due to increasing workload pressures, many radiologists today have to examine an enormous number of CXRs every day. Thus, a prediction system trained to estimate the risk of specific abnormalities from a given CXR image is considered essential for providing high-quality medical assistance. More specifically, such a decision support system has the potential to support the reading workflow, improve efficiency and reduce prediction errors. Moreover, it could be used to enhance the confidence of the radiologist or to prioritize the reading list so that critical cases are read first.

The significant advances in digital chest radiography and the continuously growing storage capacity of electronic media have enabled research centers to accumulate large repositories of images, some of which have been classified (labeled) by human experts while most remain unclassified (unlabeled). Researchers and medical staff have therefore been able to exploit these images by adopting machine learning and data mining techniques for the development of intelligent computational systems that extract useful and valuable information. As a result, the areas of biomedical research and diagnostic medicine have been dramatically transformed, from rather qualitative sciences based on observations of whole organisms to more quantitative sciences based on the extraction of useful knowledge from large amounts of data [2].

Nevertheless, distinguishing the various chest abnormalities from CXRs is a rather challenging task, not only for a prediction model but even for a human expert. Progress in the field has been hampered by the lack of available labeled images for efficiently training a powerful and accurate supervised classification model. Moreover, correctly labeling new unlabeled CXRs usually incurs high monetary and time costs, since it is a long and complicated process that requires the efforts of specialized personnel and expert physicians.

Semi-Supervised Learning (SSL) algorithms, which combine characteristics of both supervised and unsupervised learning algorithms, have been proposed as a new direction to address the shortage of available labeled data. These algorithms efficiently develop powerful classifiers by meaningfully relating the explicit classification information of labeled data with the information hidden in the unlabeled data [3,4]. Self-labeled algorithms probably constitute the most popular class of SSL algorithms due to their simplicity of implementation, their wrapper-based philosophy and their good classification performance [2,5–8]. This class of algorithms exploits a large amount of unlabeled data via a self-learning process based on supervised learners. In other words, they perform an iterative procedure that enriches the initial labeled data, based on the assumption that their own predictions tend to be correct.

Recently, Triguero et al. [9] proposed an in-depth taxonomy of self-labeled algorithms based on their main characteristics and conducted a comprehensive study of their classification efficacy on several datasets. Generally, self-labeled algorithms can be classified into two main groups: *Self-training* and *Co-training*. In the original Self-training [10], a single classifier is iteratively retrained on a labeled dataset that is enlarged with its own most confident predictions on unlabeled data, while in Co-training [11], two classifiers are trained separately on two different views of a labeled dataset and then each classifier augments the labeled data of the other with its most confident predictions on unlabeled data. Along this line, several self-labeled algorithms have been proposed in the literature, some of which exploit ensemble methodologies and techniques.
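The Self-training loop described above can be illustrated with a minimal sketch. It is not the implementation of [10]: the one-dimensional nearest-centroid base learner, the margin-based confidence score, the `threshold` and `max_iter` parameters and all function names are assumptions chosen only to keep the example self-contained.

```python
from statistics import mean


def fit_centroids(X, y):
    """Toy base learner (assumption): the mean feature value of each class."""
    return {c: mean(x for x, label in zip(X, y) if label == c) for c in set(y)}


def predict_with_confidence(centroids, x):
    """Predict the nearest centroid; confidence is the relative margin
    between the nearest and second-nearest centroid (0 = tie, 1 = exact hit).
    Requires at least two classes."""
    dists = sorted((abs(x - m), c) for c, m in centroids.items())
    (d1, c1), (d2, _) = dists[0], dists[1]
    conf = (d2 - d1) / (d1 + d2) if (d1 + d2) > 0 else 0.0
    return c1, conf


def self_train(X_lab, y_lab, X_unlab, threshold=0.5, max_iter=10):
    """Iteratively move confidently predicted unlabeled points into the
    labeled set and retrain, until no confident prediction remains."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_iter):
        model = fit_centroids(X_lab, y_lab)
        confident, remaining = [], []
        for x in pool:
            label, conf = predict_with_confidence(model, x)
            (confident if conf >= threshold else remaining).append((x, label))
        if not confident:
            break  # nothing left that the classifier trusts
        for x, label in confident:
            X_lab.append(x)
            y_lab.append(label)
        pool = [x for x, _ in remaining]
    return fit_centroids(X_lab, y_lab), pool


# Two labeled points, five unlabeled ones; 5.0 is equidistant and stays unlabeled.
model, leftover = self_train([0.0, 10.0], ["normal", "abnormal"],
                             [1.0, 2.0, 9.0, 8.0, 5.0])
```

Co-training follows the same pattern with two classifiers trained on different feature views, each one supplying its confident predictions to the other's labeled set rather than to its own.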

Democratic-Co learning [12] is based on an ensemble philosophy, since it uses three independent classifiers that follow a majority voting and a confidence measurement strategy for predicting the labels of unlabeled examples. The Tri-training algorithm [13] utilizes a bagging ensemble of three classifiers, which are trained on data subsets generated through bootstrap sampling from the original labeled set and teach each other using a majority voting strategy. Co-Forest [14] utilizes bootstrap samples from the labeled set in order to train random trees. At each iteration, each random tree is reconstructed with the unlabeled instances newly selected by its concomitant ensemble, utilizing a majority voting technique. Co-Bagging [15] trains multiple base classifiers on bootstrap data created by random resampling with replacement from the training set. Each bootstrap sample contains about 2/3 of the distinct examples of the original training set, since each example can appear multiple times. Recently, a new approach has been proposed by Livieris et al. [2,8,16,17] and Livieris [18], who introduced ensemble self-labeled algorithms based on voting schemes. These algorithms exploit the individual predictions of the most efficient and frequently used self-labeled algorithms using simple voting methodologies.
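Two building blocks recur in the ensemble methods above: bootstrap sampling and majority voting. The following sketch illustrates both in isolation; the function names are hypothetical and the code is not taken from any of the cited algorithms. It also shows where the "about 2/3" figure comes from: drawing $n$ items with replacement from $n$ examples leaves roughly $1 - 1/e \approx 63\%$ of the distinct examples in the sample.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items uniformly at random, with replacement."""
    return [rng.choice(data) for _ in data]


def distinct_fraction(n, rng):
    """Fraction of the n original examples that appear at least once
    in a single bootstrap sample of size n."""
    sample = bootstrap_sample(range(n), rng)
    return len(set(sample)) / n


def majority_vote(predictions):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)               # fixed seed only for reproducibility
frac = distinct_fraction(10_000, rng)  # close to 1 - 1/e for large n
label = majority_vote(["abnormal", "normal", "abnormal"])
```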

Motivated by these works, we propose a new semi-supervised self-labeled algorithm which is based on a sophisticated ensemble philosophy. The proposed algorithm exploits the individual predictions of self-labeled algorithms using a new weighted voting methodology. The proposed weighted strategy assigns a weight to each component classifier of the ensemble based on its accuracy on each class. Our main aim is to measure the effectiveness of our weighted voting ensemble scheme against majority voting ensembles, using identical component classifiers in all cases. In addition, we want to verify that powerful classification models can be developed by adapting advanced ensemble methodologies to the SSL framework. Our preliminary numerical experiments demonstrate the efficiency and classification accuracy of the proposed algorithm, indicating that reliable prediction models can be developed by incorporating ensemble methodologies in the semi-supervised framework.
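To make the idea of class-wise weighted voting concrete, the following sketch shows one plausible reading of it. It should not be taken as the exact scheme of the proposed algorithm, which is detailed in Section 3. Here we assume that the weight of a classifier for a class is the fraction of examples of that class it labels correctly on a held-out labeled set, and that each classifier votes for its predicted class with the corresponding weight; the function names and the toy data are hypothetical.

```python
from collections import defaultdict


def per_class_accuracy(y_true, y_pred):
    """Assumed weight definition: for each class, the fraction of its
    examples that the classifier labeled correctly."""
    totals, hits = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += (t == p)
    return {c: hits[c] / totals[c] for c in totals}


def weighted_vote(predictions, weights):
    """Each classifier adds its weight for the class it predicts;
    the class with the largest total score wins."""
    scores = defaultdict(float)
    for pred, w in zip(predictions, weights):
        scores[pred] += w.get(pred, 0.0)
    return max(scores, key=scores.get)


# Held-out labels and the predictions of three component classifiers.
y_val = ["normal", "normal", "abnormal", "abnormal"]
w_a = per_class_accuracy(y_val, ["normal", "normal", "normal", "abnormal"])
w_b = per_class_accuracy(y_val, ["normal", "abnormal", "normal", "normal"])
w_c = per_class_accuracy(y_val, ["abnormal", "normal", "normal", "abnormal"])

# New image: A says "normal", B and C say "abnormal".
votes = ["normal", "abnormal", "abnormal"]
decision = weighted_vote(votes, [w_a, w_b, w_c])
```

In this toy case a simple majority vote would return "abnormal", whereas the weighted vote returns "normal", because classifier A is fully reliable on the "normal" class while B and C are weak on the "abnormal" class.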

The remainder of this paper is organized as follows: Section 2 presents a brief survey of recent studies concerning the application of machine learning for the detection of lung abnormalities from X-rays. Section 3 presents a detailed description of the proposed weighted voting scheme and ensemble algorithm. Section 4 presents a series of experiments carried out in order to examine and evaluate the accuracy of the proposed algorithm against the most popular self-labeled classification algorithms. Finally, Section 5 discusses the conclusions and some research topics for future work.
