1. Introduction
The efficiency of photovoltaic (PV) panels depends strongly on limiting dust accumulation. The soiling effect occurs due to the deposition of dust particles, sand, pollutants, or organic debris on the panel surface. These deposits block solar radiation and increase the local temperature, leading to a reduction in the short-circuit current (Isc) and open-circuit voltage (Voc). These statements are supported by the research of Bošnjaković et al. [1], which documents a 7.39% decrease in power after dust contamination. Also, in the study by Uzorka et al. [
2], cement dust reduced strength by 80.6% due to fine particles (1–100 µm) and high CaO content. To detect this effect, advanced techniques such as convolutional neural networks (CNNs) have been put forward in the paper by Onim et al. [
3], where the SolNet model achieved an accuracy of 98.2% in classifying clean versus dirty panels, based on a new dataset. Chen et al. [
4] analyze the particle–substrate adhesion forces, which are dependent on relative humidity and condensation and intensify agglomeration. Cleaning solutions include both passive and active methods. For example, in the paper by Alagoz and Apak [
5], Surface Acoustic Waves (SAWs) are studied for non-invasive and damage-free particle removal by modeling adhesion and gravitational forces. Active methods include automated robotic brush systems, water pumps, and spraying systems [
6]. These robots are controlled by microcontrollers and Internet of Things (IoT) infrastructure. Intelligent systems integrate dust sensors that trigger cleaning when the power drops by 5% [
7]. An interesting idea is presented in research by Mohandes et al. [
8], which uses drones to generate aerodynamic turbulence with the aim of reducing deposits through air movement. In the paper by Elnaby Kabeel et al. [
9], a combination of a concave roof, water cooling, natural air, and mirrors increased electrical efficiency by 24.11% compared to a conventional module. Innovative techniques such as super-hydrophilic and super-hydrophobic coatings [
10] and a paraffin spectral filter [
11], which reflects IR and cools the panel, improve transmissivity and reduce temperatures by up to 68.77%. Kim et al. [
12] recommend optimizing the tilt angle to maximize the irradiation received and thus the energy converted, while also accounting for the dust level through machine learning (ML) algorithms.
Current–voltage (I-V) curve analysis is a tool that evaluates the performance of PV panels. This curve shows the variation in the electric current produced by a solar panel as a function of the terminal voltage. Gupta et al. [
13] use the I-V curve to demonstrate that vacuum conditions produce a reduction in the panel temperature from 48.17 °C to 37.14 °C. The visible change in the curve shape is correlated with a 6% increase in maximum power. In the paper by Attia et al. [
14], the comparison of half-cell panels versus full-cell panels is also performed using the I-V curve under partial shading. Distorted shapes indicate local current losses. Also based on the I-V curve, Awoyinka et al. [
15] show that under real conditions in Nigeria, the maximum power decreases by up to 28.9% compared to the values declared by the manufacturer. These differences arise because the on-site irradiance and temperature deviate from Standard Test Conditions (STCs). To compensate for these variations, techniques such as Maximum Power Point Tracking (MPPT) are integrated using fuzzy controllers, which adjust the parameters of a DC–DC (direct current) converter in real time [
16]. In the paper by Benaboud [
17], MPPT is combined with a two-axis solar tracking system to keep the panel perpendicular to the sun. The work notes that this is how the maximum possible power is obtained from the available I-V curve. From a modeling perspective, I-V curves computed from mathematical models are compared with those measured by the manufacturer at STCs, which validates the model [
18,
19]. In the paper by Xu and Qiu [
20], a stochastic fractal search optimization algorithm estimates the unknown parameters of the equivalent electrical model for I-V curve fitting until it matches the experimental one. Olivares et al. [
21] and Hossion [
22] demonstrate that dust or physical degradation alters the I-V curve, and Kalliojarvi-Viljakainen et al. [
23] propose preprocessing methods to remove noise from measured data. In the paper by Alaas [
24], an emulator artificially generates I-V curves to predict the yield of PV panels during testing phases without the need for measurement under real sunlight conditions. Analyzing these specialized works, it is found that the I-V curve analysis measures the efficiency of PV systems [
22]. These curves are measured using specialized equipment called I-V analyzers or I-V sweep analyzers. The process involves real-time measurement of the current and voltage produced by the solar panel under controlled irradiation [
25]. The cost of such a device, for example, a Seaward, is approximately 3500 EUR [
26].
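As a minimal illustration of what I-V curve analysis extracts, the following Python sketch locates the maximum power point and computes the fill factor from sampled (V, I) pairs. The function names and the sample sweep are hypothetical and do not come from any of the cited works; the sketch assumes the sweep starts at short circuit (V = 0) and ends at open circuit (I = 0).

```python
from typing import Sequence, Tuple


def max_power_point(voltages: Sequence[float],
                    currents: Sequence[float]) -> Tuple[float, float, float]:
    """Return (V_mp, I_mp, P_max) for a sampled I-V sweep."""
    if len(voltages) == 0 or len(voltages) != len(currents):
        raise ValueError("voltages and currents must be non-empty and of equal length")
    powers = [v * i for v, i in zip(voltages, currents)]
    k = max(range(len(powers)), key=powers.__getitem__)
    return voltages[k], currents[k], powers[k]


def fill_factor(voltages: Sequence[float], currents: Sequence[float]) -> float:
    """FF = P_max / (Voc * Isc).

    Assumption: the first sample is taken at V = 0 (so its current is Isc)
    and the last sample at I = 0 (so its voltage is Voc).
    """
    isc = currents[0]
    voc = voltages[-1]
    _, _, p_max = max_power_point(voltages, currents)
    return p_max / (voc * isc)


# Hypothetical sweep, for illustration only (not measured data).
SAMPLE_V = [0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 33.0, 36.0, 38.0, 40.0]
SAMPLE_I = [9.0, 8.98, 8.95, 8.9, 8.8, 8.6, 8.0, 7.0, 5.0, 2.8, 0.0]
```

A soiled or partially shaded panel would typically show a lower `P_max` and a distorted curve shape, which is why these two quantities are convenient summary indicators.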
Another tool for detecting issues such as soiling, partial shading, and hotspots, caused by electrical or mechanical defects in solar panels, is infrared thermography (IRT) analysis. Cardinale-Villalobos et al. [
27,
28] show that visual inspection has 100% efficiency in detecting dirt and shading. However, this technique does not identify internal defects, which is why it is recommended to combine visual inspection with IRT or electrical analysis (EA). These techniques achieve an efficiency of 73% to 78%. Infrared thermography works by measuring the infrared radiation emitted by panels, where defective areas with cracked cells or faulty diodes heat up more than normal ones, generating distinct thermal images [
29]. For automating the analysis, artificial intelligence (AI) techniques [
30,
31], such as CNNs [
32], can be used. For example, in the paper by Bu et al. [
33], the obtained mean average precision was 97.4%. In the research by Gallardo-Saavedra et al. [
34] and Henry et al. [
35], drones equipped with Red, Green, and Blue (RGB) and thermal cameras are advised for manual intervention-free inspection. Automated systems, such as those presented by de Oliveira [
36], use algorithms like Mask R-CNN for instance segmentation of defective modules and their precise location on orthomosaic maps, with only 10% false positives. Hybrid models achieved maintenance cost reductions of up to 70% [37]. Replacing manual inspection with automated systems based on IRT combined with AI techniques has improved solar panel maintenance methods through rapid monitoring. The basic element of IRT inspection is the infrared thermal camera, which functions as a multi-element acquisition assembly [
38].
The efficiency of solar panels depends on the timely detection of dust on their surface. Dust deposition reduces light transmission, which leads to a decrease in energy production of up to 30% within a month [
39]. This decline has financial implications, especially within large solar parks [
40]. Recent studies show that deep learning (DL) models trained on large sets of labeled images can detect dust deposits. In this context, several approaches have been analyzed in specialized literature. In the paper by Mohammed and Ba Alawi [
41], an EfficientNet-based model named CASolarNet is developed. It achieved over 98% accuracy in tests under various environmental conditions. Additionally, DVNET, an end-to-end model based on light transmittance estimation, uses image processing to calculate dust density [
42]. The results presented in the paper achieve a mean squared error of 0.00044. Another hybrid approach combined feature extraction with tree-based classifiers to achieve 97% accuracy [
43,
44]. In regions with high pollution, such as the United Arab Emirates, models like Long Short-Term Memory (LSTM) and Artificial Neural Network (ANN) have been tested. In these tests, accuracies of 99.5% and 98.5%, respectively, were achieved in predicting panel performance, taking into account factors such as temperature and irradiance. Another method presented by Bashir et al. [
39] transforms current–voltage electrical parameters into RGB mosaic images using a CNN–Random Forest (RF) model. It achieves 100% accuracy in classifying operational states and 98% in assessing dust severity, resulting in a reduction in water consumption of up to 90%. Dust identification errors in complex backgrounds are reduced in the paper by Cui et al. [
45] to improve the efficiency of PV panels. Additionally, optimized algorithms like the Equilibrium Optimizer (EO) have improved the hyperparameters of CNNs [
40]. For inspections using an Unmanned Aerial Vehicle (UAV), accuracy increased by 4.6% and the number of parameters was reduced by 24%, making it suitable for edge computing [
46,
47]. The combination of DL, visual attention, hyperparameter optimization, and integration with UAVs offers scalable, cost-effective, and precise solutions for automated dust monitoring, contributing to the profitability of solar energy [
48].
This paper does not aim to propose a new ML model, but rather to analyze and validate an IoT–cloud infrastructure through which existing models can be integrated into a scalable system for automatic monitoring and cleaning of PV panels. The focus is on integrating hardware components (Raspberry Pi, dedicated camera) and software (Azure Custom Vision, acquisition/transmission scripts) into a complete industrial architecture compatible with the Industry 4.0 paradigm.
The main objective of this paper is to propose a distributed IoT infrastructure for the predictive maintenance of PV panels. This infrastructure must demonstrate the ability to automatically detect dust deposits to activate the cleaning process.
The secondary objectives are as follows:
Integrating ML-based visual classification into an IoT architecture;
Evaluating the performance of existing cloud classification models in Azure Custom Vision for PV surface monitoring;
Analyzing the economic and sustainability implications of such an implementation in the context of Industry 4.0.
The major contributions of the paper are stated below:
Designing and implementing a hardware–software IoT infrastructure based on Raspberry Pi and cloud services for monitoring PV panels.
Identifying how to choose the training scenario that yields the best classification results.
Integrating image classification models (Azure Custom Vision) into a complete predictive maintenance workflow.
Validating the functionality of the infrastructure through experiments that test image acquisition, transmission, classification, and the automatic triggering of the cleaning process.
Cost analysis of the proposed solution compared to traditional maintenance methods.
The paper is structured into five sections.
Section 2 includes the authors’ proposed hardware and software architecture (Raspberry Pi, camera, cloud services used, feasibility tests, dedicated scripting).
Section 3 presents the operational method of the infrastructure, the execution of the proposed comparative tests, the interpretation of the classifier’s performance as validation of the workflow, and the elements used at the hardware infrastructure level.
Section 4 contains the discussion, which highlights the advantages of the distributed IoT infrastructure, its economic and practical implications, a comparison with traditional methods, and future directions.
Section 5 focuses on the conclusions, outlining the main contribution (scalable infrastructure for predictive maintenance), the relationship between the work and the secondary objectives, and the limitations.
2. Materials and Methods
The methodology discussed in this paper addresses the need to integrate Industry 4.0-specific digital technologies into the predictive maintenance processes of PV panels. The main objective of maintenance processes is to optimize energy production. The authors of this paper begin with the premise that dust deposits on the surface of solar modules cause energy production losses. These losses increase operating costs. In this context, the methodological strategy aims to integrate data acquisition, automatic classification through an ML algorithm hosted in the cloud, and the automatic triggering of a cleaning process, all while continuously monitoring a large area associated with PV parks, with all these components residing within a dedicated architecture for the smart factory concept.
Solar panel technology has evolved considerably in recent years, due to new technological innovations in materials science. The choice of materials is important in determining the efficiency, cost, and environmental impact of solar panels. Solar panels are made from several important materials that synergistically work to capture sunlight and turn it into electricity. The main ingredient is silicon, in the form of monocrystalline silicon or polycrystalline silicon. The front layer is made of tough tempered glass, which protects the cells from weather and damage. The frame surrounding the panel is usually made of lightweight aluminum to prevent corrosion and provide sturdy support. Inside, the substance named Ethylene Vinyl Acetate (EVA) acts like a glue, holding everything together and insulating the cells. The back of the panel is covered with a special polymer sheet, often referred to as a backsheet, which protects the inside components from moisture and mechanical factors. Concerning the electrical connections, manufacturers often use silver or copper wires because they are excellent conductors (silver being a top choice for high-performance panels). Finally, there are tiny devices called bypass diodes made from silicon, which help the panel work better even if parts of it are shaded or dirty [
49].
The increasing number of solar panel installations has led to some concerns about the environmental impact of solar panel waste [
50]. Recycling solar panel materials is required for ensuring a circular economy and minimizing the environmental footprint of solar energy. The main materials from which solar panels are made are silicon (60% of the weight), glass (20%), aluminum (10%), copper (5%), and plastic (3%). As for their recycling efficiency, the highest value is achieved for silicon (95%), followed by glass (90%), copper (85%), aluminum (80%), and plastic (70%) [
51].
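The two sets of percentages can be combined into a single estimate of how much of a panel’s mass is recoverable. The short Python sketch below does this arithmetic under two assumptions that the text does not state explicitly: the composition figures are mass fractions, and each recycling efficiency applies uniformly to its material. The names are illustrative.

```python
# Mass shares and recycling efficiencies as quoted in the text [51].
MASS_SHARE = {
    "silicon": 0.60,
    "glass": 0.20,
    "aluminum": 0.10,
    "copper": 0.05,
    "plastic": 0.03,
}
RECYCLING_EFFICIENCY = {
    "silicon": 0.95,
    "glass": 0.90,
    "aluminum": 0.80,
    "copper": 0.85,
    "plastic": 0.70,
}


def recoverable_mass_fraction(shares: dict, efficiencies: dict) -> float:
    """Sum of (mass share x recycling efficiency) over all listed materials."""
    return sum(shares[m] * efficiencies[m] for m in shares)


listed_share = sum(MASS_SHARE.values())  # share of the panel mass covered by the list
recovered = recoverable_mass_fraction(MASS_SHARE, RECYCLING_EFFICIENCY)
```

With the quoted figures, the listed materials cover 98% of the panel mass, and roughly 89% of the total mass would be recovered; the remaining 2% of the composition is not itemized in the text.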
The identification of dust on the surface of a PV panel is treated at a conceptual level, as the principle can be extended to other factors that reduce production capacity. This category can include bird droppings, leaves, cracks on the panel surface, and other elements that can reduce energy production. The energy performance of solar panels depends strongly on how clean they are. When panels are covered with dust, dirt, leaves, or bird droppings, they cannot absorb sunlight properly, which leads to low energy output. For instance, a clean panel might operate at nearly full capacity, but if it is greasy or dusty, it could lose 20–30% of its efficiency. Thus, regular cleaning is important to keep panels working at full capacity. In areas where dirt, pollution, or dust are more prevalent, cleaning becomes even more important. When solar panels are clean, they generate more power, are more efficient, and last longer [
52].
2.1. IoT Prototype Hardware and Software Infrastructure
Unlike existing solutions, which send all data to a central application for analysis and decision-making, this paper proposes a distributed mechanism where local units (Raspberry Pi) collect and transmit data, receive a response from the cloud service, and autonomously decide to activate the cleaning mechanism. This flow reduces latency, decreases the load on the central infrastructure, optimizes costs, and intervenes in the automated decision-making process.
From a methodological perspective, the hardware component refers to the physical infrastructure necessary for data collection. For this, the Raspberry Pi 5 platform (manufactured by Raspberry Pi Ltd, Pencoed, Wales) was proposed, equipped with a dedicated Pi camera, which is used to acquire images containing the PV panels of an area within the PV park. The way the Raspberry Pi 5 platform is designed allows it to be integrated as an IoT node into an overall infrastructure. This node captures images at regular intervals and sends them to the cloud, where a service then analyzes the received image. Thanks to its Sony IMX708 12-megapixel sensor with I2C-controlled focus actuator (produced by Raspberry Pi Ltd, Pencoed, Wales), the Pi camera captures the visual differences between clean and dusty panels. The camera, which has a 75-degree diagonal angle of view, is positioned to capture an area of a PV module that serves as a reference for a specific part of the PV park. This positioning avoids excessive reflections and perspective distortions. In addition, the device is protected by a weather-resistant casing, given that the outdoor working environment exposes it to intense solar radiation, temperature variations, precipitation, or other mechanical interventions.
The physical equipment, consisting of the Raspberry Pi module, camera, and power supply, is integrated into a single IoT node. This node operates autonomously, capable of capturing, processing, and transmitting data to the cloud service without additional equipment. Thus, the proposed architecture is completely integrated in terms of hardware and software.
By designing the IoT node, the infrastructure complies with Industry 4.0 requirements for interoperability and modularity, with each unit functioning as an autonomous element connected to a centralized hub.
Figure 1 presents the hardware IoT node.
The central element of the software infrastructure is responsible for data processing within the proposed flow in
Figure 2. The image captured locally from the Raspberry Pi device is transmitted to a service within the Microsoft Azure cloud platform. The service used is Azure Custom Vision, and it is responsible for the classification task. The images used in the training process are the “Solar Panel Dust Detection” dataset [
53]. The dataset comprises 2302 images: 1385 images categorized as dust-free and 917 images of PV panels that require cleaning. This initial distribution exhibits a slight class imbalance, favoring the class of dust-free images. This is a public corpus containing images labeled into two classes: clean panels (named in the following as
no_dust) and dusty panels (labeled as
dust). The service offers training through six scenarios, called General, General [A1], General [A2], as well as non-standard options like Food, Landmarks, and Retail. The present research examines whether there are unforeseen benefits from the transfer of features between domains. Each model was evaluated using the metrics of accuracy, precision, recall, and F1-score. Using these metrics, a comparative analysis of the resulting models was performed from a performance perspective.
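For reference, the four metrics named above can be computed from the counts of a binary confusion matrix, with dust taken as the positive class. The following Python sketch uses the standard definitions; the class and function names are illustrative and are not part of the Azure Custom Vision service.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # dusty panels classified as dust
    fp: int  # clean panels classified as dust
    fn: int  # dusty panels classified as no_dust
    tn: int  # clean panels classified as no_dust


def classification_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1-score for the positive (dust) class."""
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

In this application, a false negative leaves a dusty panel uncleaned, while a false positive triggers an unnecessary cleaning cycle, so recall and precision map directly onto the two kinds of operational cost.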
Another issue analyzed in this paper relates to the number of images used in training. Initially, 1385 images were used for the no_dust class and 917 for the dust class. This imbalance between dust and no_dust led to an initial set of performance metrics. The unbalanced ratio between the two classes is justified by the fact that the number of situations where the panel is clean is greater than the number of cases where it is dirty.
Subsequently, all six tests were repeated within the Azure Custom Vision platform, using an equal number of images, specifically 915 for each category. The objective of this type of test was to highlight whether the balanced image variant leads to a superior performance compared to the unbalanced image variant.
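The text does not state how the 915 images per class were selected. One common option is random undersampling with a fixed seed, sketched below in Python as an assumption rather than as the authors’ procedure; the function name and the file names in the usage example are hypothetical.

```python
import random
from typing import Dict, List


def undersample(paths_by_class: Dict[str, List[str]],
                per_class: int,
                seed: int = 0) -> Dict[str, List[str]]:
    """Draw `per_class` items without replacement from every class.

    A fixed seed keeps the selection reproducible across training runs.
    """
    rng = random.Random(seed)
    balanced = {}
    for label, paths in paths_by_class.items():
        if len(paths) < per_class:
            raise ValueError(f"class {label!r} has only {len(paths)} images")
        balanced[label] = rng.sample(paths, per_class)
    return balanced


# Hypothetical file names that mirror the class sizes reported in the text.
dataset = {
    "no_dust": [f"no_dust_{k:04d}.jpg" for k in range(1385)],
    "dust": [f"dust_{k:04d}.jpg" for k in range(917)],
}
balanced_dataset = undersample(dataset, per_class=915)
```

Undersampling discards 470 clean-panel images and 2 dusty-panel images here, which is the price paid for equal class sizes.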
The comparative analysis of Azure Custom Vision scenarios is used only as an infrastructure validation step, not as a research objective aimed at developing new detection models. The results of these tests demonstrate the superiority of pre-implemented models at the service level within cloud platforms. The ratio between classification accuracy and estimated processing cost per image shows that the Azure Custom Vision-based solution offers a superior cost–benefit ratio compared to those stated by Di Nardo et al. [
54] and Bautista et al. [
55]. This is due to a reduction in implementation time by approximately 80% compared to a locally trained ML model. This analysis extends the contribution of the paper by demonstrating the algorithmic and economic advantage of the proposed architecture.
This methodological approach is integrated into a centralized IoT Hub platform, where data collected from multiple nodes is aggregated. In this way, the solution is scaled to large-scale PV parks, where each Raspberry Pi element monitors a section of the solar field. Decisions are made locally, at the level of each Raspberry Pi unit. These functionalities align with the Industry 4.0 paradigm, which involves data connectivity and real-time decision-making.
The workflow presented in
Figure 2 summarizes the reference elements within the process. Thus, the section of the PV park contains a series of solar panels. The image from this section is captured by the Raspberry Pi camera and transmitted by the module to the cloud service, Azure Custom Vision. This service performs the classification, and the result is returned to the local node. If the panel is considered dusty, the Raspberry Pi module will prompt the cleaning system to be activated. A key element of the proposed methodology is the comparative analysis of Azure Custom Vision training scenarios. The proposed methodology integrates a system designed to be activated only when necessary, which means a reduction in resource consumption. Eliminating repeated manual inspections contributes to reducing maintenance costs. Thus, the proposed methodology presents a technical system constituted by a complete intelligent monitoring and actuation architecture through IoT technology. The proposed architecture demonstrates how a hardware node integrated into a cloud ecosystem and supported by ML services transforms the traditional maintenance process into an intelligent one.
2.2. Automation Scheme for the Proposed System
The proposed automation scheme is designed according to the logic of an industrial control system. This was adapted specifically to the monitoring of PV panels for automatically triggering the cleaning process. The process begins with the section containing the solar panels, as the process targets the physical surface of the monitored PV panels. The disruptive factor of the process is the dust deposits on them, which must be identified and removed. To detect this disturbance, a Raspberry Pi camera is used, which acquires images through its sensor (Pi Camera Sensor—PCS) at regular time intervals, forming the input signal to the system. Because the process is slow, the time interval can be set to 6 h or even longer. The images obtained are captured by the Raspberry Pi acquisition module, on which a script is running. It functions as an acquisition and preprocessing element. The Raspberry Pi then transmits the image to the cloud, also through the script and via a communication channel. This module is an interface with the cloud infrastructure. The information then reaches the Azure Custom Vision Classification Engine (CEACV), where the analysis is performed using ML algorithms. This block transforms raw images into a binary decision signal,
dust or
no_dust. The result is transmitted back to the local control level. The Decision Logic Control Raspberry Pi (DLCRP) module receives the result and triggers the cleaning action if necessary. The decision threshold functions as a binary regulator. The signal is transmitted to the next stage, where the command is directed to the execution element that performs the cleaning of the PV panel. This powers the Cleaning Mechanism Raspberry Pi (CMRP), shown in
Figure 3. To close the control loop, the system includes a Feedback Dust Detection Confirmation (FSDDC) that checks the panel’s status after cleaning. This feedback avoids unnecessary repeated cycles in the automation scheme.
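The closed loop described above (acquire, classify, decide, clean, confirm) can be sketched as follows. This is an illustrative Python outline, not the authors’ Bash implementation: the three callables stand in for the camera, the cloud classifier, and the cleaning mechanism, and `max_attempts` is a hypothetical safeguard against endless cleaning cycles.

```python
from typing import Callable


def control_cycle(capture: Callable[[], bytes],
                  classify: Callable[[bytes], float],
                  clean: Callable[[], None],
                  threshold: float = 0.5,
                  max_attempts: int = 2) -> str:
    """Run one monitoring iteration with feedback confirmation.

    classify() returns the probability that the panel is dusty.
    Returns "no_action", "cleaned", or "needs_attention".
    """
    attempts = 0
    while True:
        probability = classify(capture())
        if probability <= threshold:
            # Panel judged clean: either nothing was needed or cleaning worked.
            return "no_action" if attempts == 0 else "cleaned"
        if attempts >= max_attempts:
            # Still dusty after repeated cleaning: stop and flag for review.
            return "needs_attention"
        clean()
        attempts += 1
```

The re-check after each call to `clean()` plays the role of the feedback confirmation block: the loop only ends once the classifier reports a clean panel or the attempt limit is reached.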
In the automation scheme proposed in
Figure 3, the process is represented by the surface of the solar panel. The input variable is the presence of dust particles. The controller includes the Raspberry Pi module along with the Azure Custom Vision classification module. This block receives information from the Raspberry Pi camera and the feedback confirmation module. Next, it sends a signal to the CMRP, which acts as an execution element, modifying the dust particle level on the solar panel surface by wiping it clean.
The controller logic is implemented as a script written in Bash for the Raspberry Pi 5 native operating system. The script is the software component that serves as the central acquisition and interfacing element between the local IoT node and the cloud platform. In methodological terms, it represents the data acquisition and transmission agent responsible for transforming raw visual information into digital signals that can be automatically analyzed for decision-making purposes.
Figure 4 shows one iteration of the process of checking for dust particles on the surface of solar panels using the mentioned Bash script.
This iteration is executed every 6 h in the proposed setup. The script performs data acquisition by capturing images of the PV panels at regular intervals; these images are the input signal for the automated system. The script also performs local pre-processing by checking the integrity and format of the images to avoid transmitting corrupted data to the cloud service. It then interfaces with the Azure Custom Vision service, sending the image for automatic classification. The response from the classification engine is received by the script and transformed into a binary signal usable in the local control logic. The script thus acts as a link between the local hardware and the cloud infrastructure, transforming raw data within the monitoring and automation flow. The integration of Industry 4.0 digital technologies into the predictive maintenance processes of PV panels is therefore based on this script. It is designed to be implemented on a Raspberry Pi 5 unit operating as an autonomous IoT node responsible for imaging PV panels within a predefined area of the solar park. The script’s functionality is based on a series of sequential steps that adhere to data acquisition principles. Initially, the script checks for the presence of a system function capable of acquiring images through the Raspberry Pi camera. Next, it captures the image, a process controlled by resolution and exposure time parameters, ensuring that each image meets Azure Custom Vision analysis standards.
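The text does not specify how image integrity and format are checked. One lightweight possibility, assuming the camera output is saved as JPEG, is to verify the file’s start and end markers before uploading. The Python sketch below illustrates that idea; the function name and the `min_size` value are hypothetical.

```python
def looks_like_complete_jpeg(data: bytes, min_size: int = 1024) -> bool:
    """Cheap sanity check before uploading an image.

    A JPEG stream starts with the SOI marker (FF D8) and ends with the
    EOI marker (FF D9). A truncated write usually loses the EOI marker.
    `min_size` is an arbitrary lower bound that rejects empty captures.
    """
    return (
        len(data) >= min_size
        and data[:2] == b"\xff\xd8"
        and data[-2:] == b"\xff\xd9"
    )
```

This check does not decode the image, so it cannot detect every form of corruption, but it catches empty or truncated files without adding any dependency on the node.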
The novelty of the scheme lies in the fact that the decision to activate cleaning is not made at the level of a central server, but locally, at the Raspberry Pi unit. In this way, each node in the infrastructure functions as an autonomous controller, capable of reacting quickly and reducing dependence on permanent connections or a centralized control application.
The image is subsequently transmitted by the script to the Azure Custom Vision service via an HTTP POST request, to which the script attaches the image and the access keys required for authentication. The response generated by the cloud service is retrieved and processed by the script, giving the IoT node the information it needs to decide whether to activate the cleaning process. Decisions are thus based on automated data analysis, eliminating the need for manual inspection. The script is therefore a tool for acquiring and transmitting images within the intelligent IoT ecosystem. It enables communication between local hardware elements and cloud services, transforming raw data into autonomous decisions that align with the Industry 4.0 paradigm. This methodological integration ensures that each IoT node functions as an autonomous controller participating in a distributed monitoring and control system for the entire PV park.
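The request and response handling can be sketched in Python as follows (the authors’ implementation is a Bash script, which is not reproduced in the text). The URL pattern and the `Prediction-Key` header follow the Custom Vision prediction REST API (v3.0) as commonly documented; they should be treated as assumptions and checked against the prediction URL shown for the published iteration in the Custom Vision portal. The endpoint, project ID, iteration name, and key are placeholders.

```python
import json
import urllib.request


def build_classify_request(endpoint: str, project_id: str, published_name: str,
                           prediction_key: str, image_bytes: bytes) -> urllib.request.Request:
    """Build (but do not send) an image-classification POST request."""
    url = (f"{endpoint.rstrip('/')}/customvision/v3.0/Prediction/{project_id}"
           f"/classify/iterations/{published_name}/image")
    headers = {
        "Prediction-Key": prediction_key,            # access key for authentication
        "Content-Type": "application/octet-stream",  # raw image bytes in the body
    }
    return urllib.request.Request(url, data=image_bytes, headers=headers, method="POST")


def dust_probability(response_body: bytes, dust_tag: str = "dust") -> float:
    """Extract the probability of the dust tag from the JSON response.

    Assumes a body of the form {"predictions": [{"tagName": ..., "probability": ...}]}.
    Returns 0.0 if the tag is absent.
    """
    payload = json.loads(response_body)
    for prediction in payload.get("predictions", []):
        if prediction.get("tagName") == dust_tag:
            return float(prediction["probability"])
    return 0.0
```

Sending the request would then be a call such as `urllib.request.urlopen(request, timeout=30)`, with the body of the reply passed to `dust_probability`. Keeping request construction separate from sending makes the logic testable without network access.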
The entire IoT proposal is further based on the quality of the Azure Custom Vision service. Its performance will further determine the efficiency of the proposed prototype. The results section will present the details of the tests performed.
The setup of the experiment included a Raspberry Pi 5 platform with 8 GB RAM, a Sony IMX708 camera, and specific cables for connections. The images were captured every six hours under natural light conditions and transmitted via HTTP POST to the Azure Custom Vision service (General [A2] scenario). The classification threshold was set at 0.5, and the decision logic was implemented in a Bash script that controlled the cleaning mechanism. Network communication was secured using the HTTPS protocol, with each node operating as an independent local processing unit (edge). These details allow the proposed flow to be reproduced. Azure Custom Vision does not disclose the internal details of its models, so the models themselves cannot be replicated outside the platform. This keeps Microsoft’s proprietary models from being copied.
In the current stage, the cleaning process is handled at a logical level. Its activation is simulated by software within the IoT flow. The proposed architecture is compatible with the implementation of a real robotic mechanism. For example, an automatically controlled brush system or water jet can be integrated into the proposed infrastructure within this work. The present research focuses on the integration of automated decision-making, not the construction of the mechanical device itself.
The proposed methodology is built on the principle of integrating all stages of a complete IoT–cloud flow, starting from data acquisition and ending with the activation of an automated action. In the first stage, image acquisition is performed at the IoT node. These images are transmitted via the HTTP protocol to the Azure Custom Vision service, where automatic classification takes place using a pre-trained dust detection model.
In the second stage, the response received from the cloud service (label and confidence score) is processed locally by the IoT node, which applies a simple decision-making logic: if the probability of contamination exceeds a predefined threshold, the command for the cleaning mechanism is activated. Thus, each component of the methodology directly contributes to validating the distributed architecture: data acquisition demonstrates hardware–cloud interoperability, automatic classification illustrates the use of intelligent services as part of the IoT infrastructure, and local decision-making validates the operational independence of the nodes. Together, these stages define a flow that supports the main objective of the work, which is technological integration in the context of Industry 4.0.
3. Results
The energy performance of a PV park is dependent on the quality and cleanliness of the active surface of the solar modules, as presented in the Materials and Methods section. PV panels are predominantly made of crystalline silicon covered with an anti-reflective glass layer. These materials allow for the absorption of solar radiation and its conversion into electrical energy. Deposits of dust, pollen, organic particles, or bird droppings reduce the transparency of the protective layer and therefore affect energy production. Even a thin film of dust can cause losses in energy production. The uneven distribution of impurities creates partial shading effects that generate hot spots on solar cells. From an economic perspective, keeping the modules clean translates into reduced operating, maintenance, and intervention costs and an increased capacity factor for the PV park.
The implementation of an IoT prototype based on a Raspberry Pi 5 equipped with a Sony IMX708 12 MP Pi camera with an I2C-controlled focus actuator demonstrated the feasibility of automated PV park monitoring. This IoT infrastructure, which includes Microsoft Azure Custom Vision services, has not been presented in the specialized literature in the context of PV panels until now. Experimental results have shown that an architecture integrating multiple autonomous IoT nodes via Raspberry Pi units is capable of transmitting data to the cloud and receiving intervention instructions. This validates the idea that such an IoT configuration can be integrated at the level of a large-scale, fully covered PV park.
The detection–cleaning flow automation is implemented according to an industrial control cycle adapted specifically for solar panels. The tests, run with a Bash script executed on a Raspberry Pi unit, demonstrated the node’s ability to acquire data, verify image integrity, transmit the image to Azure Custom Vision, and receive the response from it in binary format, which allows for integration into local control logic. The feedback loop confirms particle removal after cleaning, thereby reducing the probability of unnecessary cycles being triggered. Tests have demonstrated that distributed decision logic at the level of each IoT node minimizes reaction times by reducing dependence on central servers. The distributed system, operating exclusively through the IoT infrastructure, triggers the cleaning process only when necessary. At the macro level, the proposed architecture allows dozens or hundreds of IoT nodes to be connected to a centralized IoT hub, which is expected to reduce the costs of the predictive maintenance solution. From this perspective, the major responsibility falls on the Azure Custom Vision service’s training process.
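The integrity check and transmission steps of this cycle can be sketched as follows. The sketch is in Python rather than the Bash used by the authors, and it rests on several assumptions: the camera output is taken to be a JPEG file, the integrity check is reduced to inspecting the JPEG start and end markers, and the URL is a placeholder to be replaced with the prediction URL of the published model. The request is only built, not sent.

```python
import urllib.request


def looks_like_complete_jpeg(data):
    """Cheap integrity check: JPEG start (FF D8) and end (FF D9) markers present."""
    return len(data) > 4 and data[:2] == b"\xff\xd8" and data[-2:] == b"\xff\xd9"


def build_prediction_request(image, url, prediction_key):
    """Build (but do not send) the HTTP POST that carries the image to the cloud."""
    if not looks_like_complete_jpeg(image):
        raise ValueError("image failed the integrity check; not sending")
    return urllib.request.Request(
        url,
        data=image,
        method="POST",
        headers={
            "Prediction-Key": prediction_key,            # key issued by the service
            "Content-Type": "application/octet-stream",  # raw image bytes
        },
    )


# Placeholder values for illustration only.
fake_image = b"\xff\xd8" + b"\x00" * 16 + b"\xff\xd9"
request = build_prediction_request(
    fake_image, "https://example.invalid/classify", "PLACEHOLDER-KEY"
)
print(request.get_method(), len(request.data))
```

Rejecting truncated captures before transmission avoids paying for a cloud call whose result could not be trusted. Sending the request would be a call to `urllib.request.urlopen(request)`, with the JSON body of the reply passed to the local decision logic.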
The main objective of the experiment conducted within this research was to determine the performance of classification models applied to images from PV parks. The experiment used the Azure Custom Vision platform to identify the presence or absence of dust deposits on the surface of the panels. The main difficulty stems from the fact that dust particles are hard to identify at the image level. The authors questioned whether this identification is possible using a cloud infrastructure. If so, another issue concerns the conditions under which these cloud services operate at their best possible level. The analysis was conducted using the metrics of Precision, Recall, and average precision (AP). These three metrics are standard in the field of automatic classification and accurately describe the model’s ability to make decisions in real-world scenarios. The dataset used was a public one, and the tests were divided into two scenarios. The first scenario used the imbalanced dataset, favoring the most frequent case in reality, namely panels that do not contain dust. In other words, the number of images with clean panels was greater than the number of images with dusty panels. The second scenario used a balanced dataset for the two classes, with 915 images for each class.
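For reference, the three metrics can be computed from labeled predictions as in the following sketch. It uses the textbook definitions (precision = TP/(TP+FP), recall = TP/(TP+FN), and non-interpolated AP as the mean of the precision values at each correctly ranked positive). The exact AP formula used internally by Azure Custom Vision is not stated in the paper, so the values reported by the service may differ slightly from this definition.

```python
def precision_recall(y_true, y_pred, positive="dust"):
    """Precision and recall for the `positive` class from hard predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def average_precision(y_true, scores, positive="dust"):
    """Non-interpolated AP: mean precision at the rank of each true positive."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits, precision_sum = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == positive:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Toy data for illustration only (not taken from the paper).
truth = ["dust", "dust", "no_dust", "no_dust"]
predicted = ["dust", "no_dust", "dust", "no_dust"]
dust_scores = [0.9, 0.4, 0.6, 0.1]
print(precision_recall(truth, predicted))
print(average_precision(truth, dust_scores))
```

Precision and recall depend on the chosen decision threshold, whereas AP summarizes the ranking quality over all thresholds, which is why the paper reports it alongside the other two.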
The hardware infrastructure is presented at a conceptual level in this paper, which is why the reference images come from the public dataset “Solar Panel Dust Detection”. As already mentioned, it contains samples for clean panels and dust-covered panels. These images were included to illustrate the classification process and model behavior without taking new field captures.
Figure 5 presents the precision graph for all 12 scenarios analyzed. Six scenarios used unequal classes, corresponding to the iterations marked in blue in Figure 5, inspecting all six types of domains explicitly set in Azure Custom Vision. The last six tests focused on equal classes of images, using all six predefined domains, marked in orange in Figure 5.
A comparative analysis between the six domains demonstrates the system’s ability to generalize in unexpected situations. Transfer learning between domains is a strategy used in ML, and in this case, it is applied to observe whether models trained on datasets labeled for other purposes improve dust detection performance. In Figure 5, the results show that the General [A1] and General [A2] domains provide the best performance, but in different scenarios. For classes with an equal number of images, General [A2] provided the best results. For classes with unequal numbers of images, General [A1] provided the best results. The remaining two metrics are analyzed next.
Figure 6 and Figure 7 show the results for Recall and AP.
The iteration with the two balanced classes on the General [A2] domain yielded the best results. This iteration achieved an AP value of 95.1%, representing consistency across the entire spectrum of classification thresholds. The achieved precision was 88.5% and the recall was 88.3%, which indicates that the two metrics are balanced. In comparison, the corresponding iteration of General [A1] with unequal classes achieved similar values, with an AP of 95%, a precision of 88.5%, and a recall of 88.1%. Further analysis of the results by class, as shown in Table 1, reveals that the recall for the dust class is lower, with a value of 83.7%, which suggests a higher probability of missing some dusty panels. From an operational perspective, this difference affects the performance of the proposed infrastructure, as dirty panels are precisely the category of interest for triggering maintenance. Other iterations, such as the Landmark iteration with unequal classes (precision 87.6%, recall 87.2%) or the General [A1] iteration with equal classes (precision 87.1%, recall 86.9%), performed well, but below the levels of the best-performing iteration. Their AP is also lower than that of the best scenario, which confirms the equal-class iteration in the General [A2] variant as having the best results.
The analysis of performance evaluation by class provides information related to the capability of the proposed system in predictive maintenance. In the case of solar panels, the dust class is considered a critical class because it determines whether detection is performed correctly regarding dirty panels, for which the cleaning process is triggered. For the iteration identified with the best results, the no_dust class achieved a precision of 89%, a recall of 88%, and an average precision of 94.4%. These values show that most clean panels are correctly identified. This reduces the risk of unnecessarily triggering the cleaning mechanism. For the dust class, the iteration yielded precision values of 88%, recall of 88.5%, and an average precision of 95.6%. This is the best result from the entire set of experiments. This means the system has an excellent ability to detect dirty panels, which minimizes the number of cases where a dusty panel might be incorrectly labeled as clean. Comparing the results, it becomes evident that balancing the dataset led to a noticeable increase in performance, especially for the dust class, which was established as the most important class. Unlike iterations with imbalanced datasets, the increase was almost 10 percentage points, clearly demonstrating the superiority of this iteration. The phenomenon is explained by the fact that in an imbalanced dataset, the algorithm prioritizes the dominant class, which is no_dust. This results in poorer performance for the minority class, which is the dust class. Therefore, the result of iteration 7 represents the best numerical score, confirming the validity of a fundamental best practice in ML regarding maintaining a balance between data classes to avoid algorithmic bias.
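The balancing step can be illustrated with a short sketch. The paper states only that the balanced scenario contained 915 images per class; how the balanced set was obtained is not described, so random undersampling of the majority class is an assumption here, and the majority-class count and file names in the example are hypothetical.

```python
import random


def balance_by_undersampling(samples_by_class, seed=0):
    """Randomly reduce every class to the size of the smallest class."""
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    target = min(len(items) for items in samples_by_class.values())
    return {
        label: rng.sample(list(items), target)
        for label, items in samples_by_class.items()
    }


# Hypothetical file lists: 915 dusty images, as in the balanced scenario,
# and an arbitrary larger number of clean images.
dataset = {
    "dust": [f"dust_{i}.jpg" for i in range(915)],
    "no_dust": [f"clean_{i}.jpg" for i in range(1500)],
}
balanced = balance_by_undersampling(dataset)
print({label: len(items) for label, items in balanced.items()})
```

Undersampling discards majority-class images; oversampling or augmenting the minority class would be an alternative when the dataset is small.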
The response provided by the Azure Custom Vision service for no_dust solar panels when running the Bash script is depicted in Figure 8.
This result classifies the image as belonging to the no_dust class with a probability of 89.22%, while the probability of belonging to the dust class is only 10.07%. Comparing the two probabilities shows that the Azure Custom Vision service classified the image correctly.
Beyond the classifier’s performance, the tests also aimed to validate the proposed IoT infrastructure. The results show that each Raspberry Pi node was able to transmit images to the cloud service and receive responses within an average time of under 3.78 s, with a download speed of 299.79 Mb/s and an upload speed of 50.59 Mb/s. Additionally, distributing the decision-making process locally reduced the load on communication channels and central infrastructure, which confirms the advantage of a distributed architecture.
The experimental results should be interpreted within the context of the proposed infrastructure: the performance of the classification model serves as a benchmark for sizing the IoT flow, while the main goal of the research remains testing the feasibility of a distributed architecture for automated predictive maintenance. The validation process is carried out under real operating conditions of PV systems. During model training, the service automatically applies multiple augmentation operations, including brightness and contrast adjustment, random rotations, scaling, and cropping. These augmentations expose the model to a wider range of visual conditions than those explicitly present in the original dataset, which improves its ability to generalize. As a result, subsequent predictions are less affected by environmental variations encountered in field PV installations that were not explored during the training phases, such as changes in solar irradiance, camera perspective, partial shading, or weather.
4. Discussion
The results obtained during the experiment show that the energy performance of PV panels depends on the condition of the active surface. Glass with anti-reflective coatings and polycrystalline or monocrystalline silicon cells partially lose their optical properties when covered with dust, pollen, excrement, or other types of particles. The monitoring process is a matter of visual maintenance with direct consequences on the energy efficiency and lifespan of solar panels. Because deposits are not uniform and partial contamination can create local shading effects and hot spots, automatic monitoring mechanisms are needed; these can be implemented through IoT systems that act as protection tools for the investment in PV infrastructure. From a hardware perspective, the authors’ proposal to use the Raspberry Pi 5 platform with a Sony camera demonstrates a low-cost solution suitable for achieving the proposed objective. The automation scheme implemented through distributed control offers the great advantage of local decision-making. Each node is capable of activating cleaning independently, strictly in the necessary intervention area, which prevents overloading of the central network. The authors propose a six-hour interval for the maintenance process because this process does not experience rapid variations in environmental conditions that would necessitate the adoption of specific fast-dynamic standards. Furthermore, the implementation of a confirmation loop to avoid repeated cycles immediately detects the need for the cleaning process. It is not dependent on the condition of the remaining residue on the panels.
Compared to traditional IoT architectures, where decisions are centralized and generate additional costs, the proposed solution introduces a distributed logic. Thus, each IoT node makes a local decision based on the response from the cloud service. Multiple advantages are associated with reducing data traffic, lowering costs associated with central infrastructure, increasing response speed, and ensuring greater system resilience against potential connectivity issues. The implementation of this infrastructure highlights the potential of Industry 4.0 in the field of renewable energy: data-driven predictive maintenance, IoT integration with cloud services, cost reduction through automation, and the possibility of expansion toward complete cyber-physical systems.
From the perspective of IoT and Industry 4.0, integrating nodes into a centralized IoT hub is a solution for intelligent predictive management of PV parks. In this regard, the discussion focuses on the fact that systems can be scaled to hundreds of nodes, leading to an increase in the complexity of the communication infrastructure that can be seamlessly supported by the Azure platform, which is specifically designed to handle a large influx of users. Industrial adoption depends on a detailed cost–benefit analysis. Manual cleaning costs can be lower than implementing an IoT infrastructure, but in the long run, the latter can offer energy benefits that justify the initial investment. Beyond this limitation, several factors that can affect image quality in the short term can also be discussed, including dependence on environmental conditions such as fog or solar reflections. Another limitation is the dependence of the infrastructure on proprietary cloud services. This dependence generates recurring costs. The data privacy vulnerability issue is not a specific concern in the context of PV panels, as they do not contain confidential data. In the context of the transition toward open architectures, a future development direction for this project lies in the development of a hybrid system that combines local processing (edge computing) with open-source ML models hosted on independent platforms.
The results present an experimental model developed within a demonstration of the feasibility of using ML techniques applied to images in the context of predictive maintenance for PV parks. The images included in the database come from various contexts, including different capture angles, different degrees of lighting, different atmospheric variations, and heterogeneous backgrounds. These variations represent a source of variability in the conditions that affect the accuracy of a visual model. However, the results presented by the experimental model demonstrate that the classification system achieves performance levels that allow for its practical application. It can be anticipated that by adopting rigid data collection protocols, such as consistently capturing images from the same angle, in the same optical context, under the same lighting conditions, and with the same background, models with a superior performance to those reported in this paper will be generated. This perspective highlights that standardized data acquisition procedures reduce unwanted variability and lead to the development of more accurate models with better long-term predictability. In an industrial scenario, this translates to much better detection of dirty panels.
It is important to note that this paper proposes the use of a new pre-implemented classification model with Azure Custom Vision. The research conducts experimental tests on real PV installations, and model validation is also performed using the public dataset, employing images that were not included in the training phase. Furthermore, the authors conceptually proposed a distributed IoT infrastructure for integrating existing ML services into predictive maintenance workflows. The dust detection model used is trained on a public dataset and serves solely to demonstrate the functionality of the proposed cloud-edge infrastructure.
At this stage, validation was performed at the software platform level by simulating the complete acquisition, transmission, and automatic decision-making flow without physically implementing the cleaning mechanism. This approach is suitable for research focused on architecture and system integration, and a complete experimental validation is a separate study that will provide details regarding node placement relative to the panel, present scenarios, and other hardware-related details.
From a statistical perspective, comparing the 12 training iterations indicates that balancing the dataset led to an improvement in model performance. The increase in recall values for the dust class confirms a specific statistical behavior of binary classification under class imbalance conditions. This result allows us to infer that maintaining a balanced dataset is a determining factor for increasing detection reliability and, consequently, the performance of the proposed distributed architecture.
A deeper analysis of the results suggests that the model’s main limitation is not related to the classification algorithm itself, but to the variables that introduce noise. However, the distributed control logic compensates for this limitation through temporal filtering: the cleaning mechanism is activated only when multiple successive detections confirm the presence of dust. This approach increases confidence in decision-making and highlights the adaptive nature of the proposed IoT architecture under dynamic environmental conditions.
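The confirmation by successive detections can be sketched as a small stateful filter. The number of required detections and the probability threshold are not given in the paper; the values below, as well as the class and method names, are hypothetical.

```python
from collections import deque


class SuccessiveDetectionFilter:
    """Signal cleaning only after `required` consecutive dust detections."""

    def __init__(self, required=3, threshold=0.5):
        self.required = required      # hypothetical number of confirmations
        self.threshold = threshold    # hypothetical probability threshold
        self._recent = deque(maxlen=required)

    def update(self, dust_probability):
        """Record one classification result and return True if cleaning is due."""
        self._recent.append(dust_probability > self.threshold)
        return len(self._recent) == self.required and all(self._recent)


# Made-up sequence of dust probabilities from consecutive acquisitions.
detector = SuccessiveDetectionFilter(required=3, threshold=0.5)
decisions = [detector.update(p) for p in [0.9, 0.2, 0.8, 0.7, 0.95]]
print(decisions)  # [False, False, False, False, True]
```

A single low-probability reading resets the streak, so isolated false positives caused by reflections or transient shadows do not trigger a cleaning cycle.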
The authors acknowledge that in industrial practice, the performance of PV panels is monitored through several established methods: current–voltage curve analysis, which highlights efficiency losses caused by dirt or defects; integrated solar radiation and temperature sensors for calculating the deviation between theoretical and actual production; and infrared thermography (IRT) techniques for identifying overheating areas or solar cell defects, among others. The authors propose a complementary approach that integrates image processing through ML services embedded in IoT infrastructures. The objective of the research is not to replace established methods, but to demonstrate the possibility of using a computer vision-based system, trained with data from an IoT stream, that provides information for predictive maintenance. The advantage of this methodology is its low cost: it uses simple cameras and edge computing modules, which avoids the need for expensive equipment. In the context of Industry 4.0, the experiment presented demonstrates that image processing is a valuable complementary tool for the management of PV parks.
Experimental results demonstrated that, among all the tests conducted, using an equal number of images in each training class generated the best performance evaluation metrics, obtained with the General [A2] domain in the Azure Custom Vision platform. The global AP reached 95.1%, a value that indicates excellent performance, including for the critical dust class. This value suggests that the large majority of dirty panels will be detected and cleaned in time, reducing yield losses. The precision for the no_dust class was 89%, which limits unnecessary activation of the cleaning mechanism. This conserves resources such as water and energy and reduces equipment wear and tear. AP exceeded 94% for both classes, confirming the model’s consistency across a wide range of decision thresholds. This allows for flexible adjustment of the system’s sensitivity based on field conditions. Based on these arguments, it is concluded that Azure Custom Vision allows for the integration of a cleaning activation mechanism into an IoT infrastructure through the tested model, which achieved a global AP of 95.1%. This performance brings a clear benefit to the implementation of PV park predictive maintenance in a real system. Choosing this model offers a compromise between maximizing energy efficiency and minimizing operational costs within the context of the Industry 4.0 paradigm.
Classical methods used for identifying dust on PV panels involve investments in equipment and human resources. Table 2 presents a comparison between the costs of these solutions and the costs associated with the prototype proposed in this study.
The costs presented are indicative values valid for Romania and the European Union. They were obtained from public commercial sources (FLIR, Testo, Test-Meter UK, Raspberry Pi Foundation, Microsoft Azure Pricelist). Their purpose is to offer the reader an overview of the difference between methods, not to reflect absolute values. The comparative cost structure remains valid regardless of the region.
Table 2 shows that the proposed method has a hardware and variable cloud cost of approximately 160 EUR per node. The value is lower than that of traditional methods. The recurring costs originate from the Microsoft Azure Custom Vision Service, which are proportional to the number of processed images and can be optimized by activating detection only at scheduled intervals. From this perspective, our solution provides a superior cost–efficiency ratio, making it feasible for large-scale implementation in PV parks.
Table 3 presents the estimated costs of a node that includes the proposed hardware configuration. The total cost of a complete IoT node is 158 EUR, which is lower than that of traditional thermographic or drone-based inspection systems, which typically range from 1500 EUR to 10,000 EUR per monitored section (based on FLIR and Testo commercial data, 2025).
The economic feasibility assessment is conducted through an extended cost analysis in the form of a Total Cost of Ownership (TCO) model. This takes into account the initial equipment cost, operational and maintenance costs, and the system’s lifespan. In this context, traditional inspection solutions (thermography, I-V curves, drones) involve high recurring costs associated with skilled personnel, periodic travel, fault-finding interventions, and the replacement of worn components. Instead, the proposed IoT architecture uses a low initial cost, which, along with minimal operational costs limited to preventive node maintenance and cloud service usage fees, makes the proposed solution competitive compared to traditional options.
Over the estimated ten-year lifespan of a PV system, the total cost of ownership of the proposed solution remains low due to automation, which eliminates manual interventions. This economically favors distributed architecture from a TCO perspective, which promotes its industrial-scale implementation within the Industry 4.0 concept.
In the economic evaluation of large-scale PV, the cost analyzed over time is related to the operational lifespan and the frequency of maintenance operations. In this paper, this aspect is addressed through a TCO analysis, which integrates the initial equipment cost and the recurring operating costs over the estimated lifespan of the system. The proposed architecture reduces the frequency of manual interventions, thus leading to a gradual decrease in the total annual cost relative to the operating period.
From a sustainability perspective, the proposed architecture optimizes resources, an aspect demonstrated quantitatively based on the data in Table 2 and the operating characteristics of the IoT nodes. Each node has an estimated hardware and variable cloud cost of approximately 160 EUR, compared to 1500–10,000 EUR for conventional thermographic or manual inspection systems. Assuming an IoT node can monitor an area equivalent to that monitored by a human team, the reduction in initial investment ranges from approximately 89% to 98%. Distributed logic also ensures that the cleaning mechanism is activated only when necessary.
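The size of this reduction can be checked directly from the quoted figures (approximately 160 EUR per node against 1500–10,000 EUR for conventional systems). The short sketch below uses only these values; the function name is introduced for illustration, and the equal-coverage assumption stated above is carried over unchanged.

```python
def investment_reduction(node_cost_eur, conventional_cost_eur):
    """Relative reduction in initial investment, as a fraction between 0 and 1."""
    return 1.0 - node_cost_eur / conventional_cost_eur


NODE_COST_EUR = 160  # approximate per-node cost quoted in the text

for conventional_cost in (1500, 10_000):  # bounds of the quoted range
    reduction = investment_reduction(NODE_COST_EUR, conventional_cost)
    print(f"{conventional_cost} EUR -> {reduction:.1%}")
# 1500 EUR -> 89.3%
# 10000 EUR -> 98.4%
```

The reduction is about 89% at the lower end of the conventional price range and about 98% at the upper end.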
From an energy perspective, the architecture indirectly increases the annual yield of PV installations by reducing the periods during which the panels operate in a dirty state. According to specialized literature, energy losses due to dust deposits fall within the range of 5% to 30% of annual production. By integrating automatic detection that triggers the cleaning process only when necessary, the system reduces the average duration of yield losses. This translates into an estimated gain of 5–10% of annual energy production compared to cleaning at fixed intervals. Considering a typical 10% loss due to dust and a halving of the exposure time through automatic monitoring, this results in a theoretical increase of approximately 5% in annual production.
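The arithmetic behind the final estimate can be written out explicitly. The sketch assumes, as a simplification not stated in the paper, that the yield loss scales linearly with the time the panels spend in a dirty state; the function name is illustrative.

```python
def recovered_yield(loss_fraction, exposure_reduction):
    """Share of annual production recovered when dirty-state time is reduced.

    Assumes losses are proportional to the time spent in a dirty state.
    """
    return loss_fraction * exposure_reduction


# Values from the text: 10% typical loss, exposure time halved.
gain = recovered_yield(loss_fraction=0.10, exposure_reduction=0.5)
print(f"{gain:.1%}")  # 5.0%
```

Under the same linear assumption, the 5–30% loss range reported in the literature would correspond to a recovered share of 2.5–15% of annual production for a halved exposure time.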
The suggested distributed IoT architecture, combined with Azure Custom Vision for automated PV maintenance, is one of the paper’s scientific contributions. Another is the use of an algorithm proposed by the authors to demonstrate autonomous decision-making at the local node level. Incorporating sustainability concepts into the Industry 4.0 predictive maintenance process is a further contribution of the authors.
Future development directions include integrating multiple solutions for multimodal analysis, integrating drones or mobile robots, and using optical sensors for simultaneous identification of dust particles or other elements with which image acquisition can be achieved. The authors also propose future research involving the installation of systems at the university’s PV panels that can acquire images from the same angles and under the same repeatable conditions, with acquisition performed at predetermined time intervals.
5. Conclusions
The objective of this paper was to integrate digital technologies specific to the Industry 4.0 paradigm into the predictive maintenance processes of PV panels. The authors proposed a new IoT prototype based on the Raspberry Pi 5 and image analysis using the Azure Custom Vision service. The issue addressed is a current one, given that the efficiency of PV systems is directly influenced by the degree of soiling on the active surface of the panels. Currently, conventional monitoring solutions generate high costs or significant operational complexity, which is why specialized literature is exploring various solutions suitable for Industry 4.0. In this context, the authors aimed to develop an autonomous system that detects dust deposits on panel surfaces by transmitting images to a cloud classification service, based on which the cleaning process would be automatically triggered. Integration into a distributed IoT infrastructure capable of providing predictive maintenance was achieved using local processing at the IoT node level, as well as centralized processing performed by the Azure Custom Vision service.
Experimental results have demonstrated that the proposed system is feasible for practical use by confirming the working hypothesis. The most important contributions can be summarized as follows:
The IoT hardware prototype is based on the Raspberry Pi 5 platform equipped with a Sony camera, which demonstrated the ability to capture images and transmit them via a Bash script to the Azure Custom Vision service. This script implements a complete image acquisition cycle, transmission to the cloud, decision-making, and activation of a cleaning system with post-cleaning feedback. Local decision logic reduces dependence on a centralized infrastructure, thereby increasing system resilience.
The performance of the ML models integrated into the Azure Custom Vision service was evaluated by analyzing 12 training iterations under different scenarios. Six scenarios utilized imbalanced datasets that favored the clean PV panel class, which is the majority class, while another six scenarios employed a balanced dataset. Across the 12 iterations, the training domain was varied over all six domains available in the Azure Custom Vision service. Of the 12 iterations, the balanced dataset combined with the General [A2] domain yielded the best results, with an AP of 95.1%. This result confirms that a balanced set of images between the dust and no_dust classes leads to the best detection.
Integration into the Industry 4.0 ecosystem was achieved by making the IoT infrastructure compatible with interoperable hardware–software platforms.
The results obtained were discussed in this paper in terms of their impact on infrastructure. Its implementation leads to reduced production losses, as dirty panels are promptly detected; labor costs for periodic manual inspection are reduced by replacing it with an autonomous system; and water resources used for cleaning are optimized through the proposed automation scheme. In the long term, this approach favors the adoption of an IoT solution similar to the one proposed in this study.
Within this paper, a new infrastructure for monitoring and intervention on PV systems was proposed, using a low-cost configuration that includes a Raspberry Pi device with a camera for image acquisition. Another proposal considered by the authors as a novelty is the identification of the degree of soiling of a PV panel using the Azure Custom Vision service, by training a model capable of classifying whether a panel is clean or dirty. Additionally, the authors proposed an automation scheme in which the proposed algorithm uses the Azure Custom Vision service, a unique approach in the specialized literature.
Despite the presented performance, the study highlighted several limitations concerning the variability of environmental conditions, dependence on the Microsoft Azure Custom Vision infrastructure, the need for a Raspberry Pi platform and a Bash script developed by a domain expert, and the necessity for models capable of generalizing based on the training dataset. Based on these limitations, several future research directions were outlined in the discussion section. The most important is the acquisition of images under the same environmental conditions, as the authors speculate that the performance of the Azure Custom Vision service would be significantly improved if the images in the dataset were acquired under the same conditions. The paper represents a conceptual validation stage of the proposed architecture, which presents the functional integration of pre-implemented Azure Custom Vision services into IoT workflows for predictive maintenance. Physical testing of equipment and validation under real-world conditions will be addressed in a future stage of applied research. Furthermore, the authors propose, and are currently implementing, an extension of this infrastructure using a drone that will acquire images and transmit GPS coordinates when it identifies solar panels that need cleaning.