Recent Advances in Automated Machine Learning

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Electrical, Electronics and Communications Engineering".

Deadline for manuscript submissions: closed (20 August 2023) | Viewed by 28866

Special Issue Editor


Prof. Dr. Dae-Ki Kang
Guest Editor
Machine Learning/Deep Learning Research Labs, Department of Computer Engineering, Dongseo University, Busan 47011, Republic of Korea
Interests: automated machine learning; adversarial machine learning; multi-agent reinforcement learning; few-shot learning; generative adversarial networks

Special Issue Information

Dear Colleagues,

We are inviting submissions to the Special Issue on Recent Advances in Automated Machine Learning.

Big data can now be found in a wide variety of domains, a phenomenon that has spurred remarkable advances in deep learning, with many researchers investigating the theory and applications of automated machine learning (AutoML). Advances in AutoML will have a huge impact on many areas of deep learning, such as data preparation, feature engineering, model selection and evaluation, hyperparameter tuning, network architecture search, and ensemble methods. For machine learning projects to start successfully, we need to automate exploratory data analysis and feature selection in order to explore and understand the context, properties, and quality of the data; in this initial process, automated data exploration and feature recommendation tools will be of great assistance. For optimal performance in terms of learning time and evaluation metrics (including accuracy), however, we need to develop effective model selection and evaluation methods for searching over hyperparameters and network architectures. Moreover, since AutoML methodologies deal with multiple models simultaneously, we need to devise smart strategies for maintaining homogeneous or heterogeneous models with parallelized and limited resources. Techniques for searching for (or creating) optimal hyperparameters and network architectures under contemporary machine learning scenarios, such as federated machine learning, meta-learning, and self-supervised machine learning, are attracting increasing interest from the research community.

In this Special Issue, we invite submissions exploring cutting-edge research and recent advances in the field of automated machine learning. Both theoretical and experimental studies are welcome, as are comprehensive review and survey papers.

Prof. Dr. Dae-Ki Kang
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • automated domain adaptation
  • automated feature engineering
  • AutoML for meta-learning
  • explainability in AutoML
  • federated AutoML
  • hyperparameter optimization and creation
  • metaheuristics for AutoML
  • network architecture search
  • optimal resource utilization in AutoML
  • reinforcement learning for AutoML
  • security and privacy in AutoML
  • self-supervised learning and AutoML
  • semi-automated machine learning
  • stopping criteria for AutoML

Published Papers (12 papers)

Research

20 pages, 594 KiB  
Article
Improving Automated Machine-Learning Systems through Green AI
by Dagoberto Castellanos-Nieves and Luis García-Forte
Appl. Sci. 2023, 13(20), 11583; https://doi.org/10.3390/app132011583 - 23 Oct 2023
Cited by 1 | Viewed by 1517
Abstract
Automated machine learning (AutoML), which aims to facilitate the design and optimization of machine-learning models with reduced human effort and expertise, is a research field with significant potential to drive the development of artificial intelligence in science and industry. However, AutoML also poses challenges due to its resource and energy consumption and environmental impact, aspects that have often been overlooked. This paper centers on the sustainability implications of the computational processes within AutoML. In this study, a proof of concept was conducted using the widely adopted Scikit-learn library. Energy-efficiency metrics were employed to fine-tune hyperparameters in both Bayesian and random search strategies, with the goal of reducing the environmental footprint. The findings suggest that AutoML can be made more sustainable by thoughtfully considering the energy efficiency of its computational processes. The experimental results are promising and align with the framework of Green AI, a paradigm that seeks to reduce the ecological footprint of the entire AutoML process. Guided by the proposed metrics, the most suitable configuration for the studied problem was identified, with potential generalizability to other analogous problems.
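
To make the idea concrete, here is a minimal sketch (not the authors' code) of energy-aware random-search hyperparameter tuning with Scikit-learn, where each candidate's score trades accuracy against a crude energy proxy; the dataset, the ALPHA weight, and the AVG_POWER_WATTS figure are illustrative assumptions.

```python
# A minimal sketch: random-search tuning scored by accuracy minus an
# energy proxy (wall-clock time x an assumed average power draw).
import time
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, ParameterSampler

X, y = load_digits(return_X_y=True)
AVG_POWER_WATTS = 65.0   # assumed average CPU power draw
ALPHA = 0.01             # assumed weight of the energy term

param_distributions = {"n_estimators": [50, 100, 200], "max_depth": [4, 8, None]}
best = None
for params in ParameterSampler(param_distributions, n_iter=6, random_state=0):
    t0 = time.perf_counter()
    acc = cross_val_score(RandomForestClassifier(**params, random_state=0),
                          X, y, cv=3).mean()
    energy_j = (time.perf_counter() - t0) * AVG_POWER_WATTS  # crude energy proxy
    score = acc - ALPHA * np.log1p(energy_j)  # greener configurations score higher
    if best is None or score > best[0]:
        best = (score, params, acc, energy_j)

print(f"best params={best[1]}  acc={best[2]:.3f}  ~energy={best[3]:.1f} J")
```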

15 pages, 2549 KiB  
Article
SANA: Sensitivity-Aware Neural Architecture Adaptation for Uniform Quantization
by Mingfei Guo, Zhen Dong and Kurt Keutzer
Appl. Sci. 2023, 13(18), 10329; https://doi.org/10.3390/app131810329 - 15 Sep 2023
Cited by 1 | Viewed by 903
Abstract
Uniform quantization is widely adopted as an efficient compression method in practical applications. Despite its merit of low computational overhead, uniform quantization fails to preserve sensitive components in neural networks when applied with ultra-low bit precision, which can lead to non-trivial accuracy degradation. Previous works have applied mixed-precision quantization to address this problem. However, finding the correct bit settings for different layers demands significant time and resources. Moreover, mixed-precision quantization is not well supported on current general-purpose machines such as GPUs and CPUs and thus causes intolerable overheads in deployment. To leverage the efficiency of uniform quantization while maintaining accuracy, in this paper we propose sensitivity-aware network adaptation (SANA), which automatically modifies the model architecture based on sensitivity analysis to make it more compatible with uniform quantization. Furthermore, we formulated four different channel initialization strategies to accelerate the quantization-aware fine-tuning process of SANA. Our experimental results showed that SANA can outperform standard uniform quantization and other state-of-the-art quantization methods in terms of accuracy, with comparable or even smaller memory consumption. Notably, ResNet-50-SANA (24.4 MB) with W4A8 quantization achieved 77.8% top-1 accuracy on ImageNet, surpassing even the 77.6% of the full-precision ResNet-50 (97.8 MB) baseline.
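
The sensitivity analysis at the heart of such an approach can be illustrated with a short sketch (a toy model and assumptions, not the paper's implementation): fake-quantize one layer at a time and record how much the loss grows, identifying the layers most sensitive to uniform low-bit quantization.

```python
# A minimal per-layer sensitivity probe: fake-quantize each Linear layer
# to 4 bits in turn and measure the loss increase on a batch.
import torch
import torch.nn as nn

def fake_quantize(w: torch.Tensor, bits: int) -> torch.Tensor:
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    return (w / scale).round().clamp(-qmax - 1, qmax) * scale

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))  # toy model
x, y = torch.randn(64, 16), torch.randint(0, 4, (64,))
criterion = nn.CrossEntropyLoss()

with torch.no_grad():
    base_loss = criterion(model(x), y).item()
for name, module in model.named_modules():
    if isinstance(module, nn.Linear):
        original = module.weight.data.clone()
        module.weight.data = fake_quantize(original, bits=4)
        with torch.no_grad():
            sensitivity = criterion(model(x), y).item() - base_loss
        module.weight.data = original  # restore full precision
        print(f"layer {name}: loss increase under 4-bit quantization = {sensitivity:+.4f}")
```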

24 pages, 688 KiB  
Article
TA-DARTS: Temperature Annealing of Discrete Operator Distribution for Effective Differential Architecture Search
by Jiyong Shin, Kyongseok Park and Dae-Ki Kang
Appl. Sci. 2023, 13(18), 10138; https://doi.org/10.3390/app131810138 - 8 Sep 2023
Cited by 1 | Viewed by 638
Abstract
In machine learning, the optimization of hyperparameters and the design of neural architectures are laborious and time-intensive tasks. To address these challenges, considerable research effort has been directed toward Automated Machine Learning (AutoML). A pivotal facet of this pursuit is Neural Architecture Search (NAS), which automates the design of neural network architectures. Given the pronounced impact of network architecture on neural network performance, NAS techniques strive to identify architectures that yield optimal performance. A prominent algorithm in this area is Differentiable Architecture Search (DARTS), which relaxes the discrete search space into a continuous one so that gradient-based methods can be applied, thereby surpassing prior NAS methods. Notwithstanding the achievements of DARTS, a discrepancy between the continuously encoded architecture and the final discrete architecture persists. To reduce this disparity, we propose TA-DARTS, a temperature-annealing technique applied to the softmax function used to encode the continuous search space. By adjusting the temperature, the architectural weights are controlled either to alleviate biases in the search process or to align the resulting architectures more closely with discrete values. Our findings show improvements over the original DARTS methodology, with a 0.07%p gain in validation accuracy and a 0.16%p gain in test accuracy on the CIFAR-100 dataset. Through systematic experimentation on benchmark datasets, we establish the advantage of TA-DARTS over the original mixed operator, underscoring its efficacy in automating neural architecture design.
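
The core mechanism, a temperature-annealed softmax over architecture logits, can be sketched in a few lines (the logits and annealing schedule below are illustrative assumptions): as the temperature tau decreases, the mixing distribution over candidate operations sharpens toward a discrete, one-hot choice.

```python
# A minimal sketch of the temperature-annealed softmax idea behind TA-DARTS.
import numpy as np

def softmax_with_temperature(alpha: np.ndarray, tau: float) -> np.ndarray:
    z = alpha / tau
    z -= z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

alpha = np.array([1.2, 0.9, 1.0, 0.4])              # learnable architecture logits
for tau in [5.0, 1.0, 0.2, 0.05]:                   # assumed annealing schedule
    weights = softmax_with_temperature(alpha, tau)  # mixing weights over operators
    print(f"tau={tau:>4}: {np.round(weights, 3)}")
```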

16 pages, 1258 KiB  
Article
Predicting Commercial Building Energy Consumption Using a Multivariate Multilayered Long-Short Term Memory Time-Series Model
by Tan Ngoc Dinh, Gokul Sidarth Thirunavukkarasu, Mehdi Seyedmahmoudian, Saad Mekhilef and Alex Stojcevski
Appl. Sci. 2023, 13(13), 7775; https://doi.org/10.3390/app13137775 - 30 Jun 2023
Cited by 4 | Viewed by 1473
Abstract
The global demand for energy has been steadily increasing due to population growth, urbanization, and industrialization. Numerous researchers worldwide are striving to create precise forecasting models for predicting energy consumption in order to manage supply and demand effectively. In this research, a time-series forecasting model based on a multivariate, multilayered long short-term memory (LSTM) network is proposed for forecasting energy consumption and tested using data obtained from commercial buildings in Melbourne, Australia: the Advanced Technologies Center, the Advanced Manufacturing and Design Center, and the Knox Innovation, Opportunity, and Sustainability Center. This research specifically identifies the best forecasting method for subtropical conditions and evaluates its performance by comparing it with the most commonly used methods at present, including LSTM, bidirectional LSTM, and linear regression. The proposed multivariate, multilayered LSTM model was assessed by comparing mean absolute error (MAE), root-mean-square error (RMSE), and mean absolute percentage error (MAPE) values with and without labeled time. Results indicate that the proposed model exhibits optimal performance, with improved precision and accuracy. Specifically, the proposed model achieved a decrease in MAE of 30%, RMSE of 25%, and MAPE of 20% compared with the LSTM method. Moreover, it outperformed the bidirectional LSTM method with a reduction in MAE of 10%, RMSE of 20%, and MAPE of 18%. Furthermore, it surpassed linear regression with a decrease in MAE of 2%, RMSE of 7%, and MAPE of 10%. These findings highlight the significant performance gains achieved by the proposed multivariate, multilayered LSTM model in energy consumption forecasting.
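
As a rough illustration of the model family (not the authors' exact network), a multivariate, multilayered LSTM forecaster can be sketched as follows; the feature count, hidden size, and layer count are assumptions.

```python
# A minimal sketch of a multivariate, stacked-LSTM forecaster: several input
# features per time step in, one energy-consumption value out.
import torch
import torch.nn as nn

class MultivariateLSTM(nn.Module):
    def __init__(self, n_features: int = 6, hidden: int = 64, layers: int = 3):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, num_layers=layers, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                 # x: (batch, time, features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])   # predict from the last time step

model = MultivariateLSTM()
window = torch.randn(32, 48, 6)           # 32 sequences of 48 steps, 6 features
print(model(window).shape)                # torch.Size([32, 1])
```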

12 pages, 1780 KiB  
Article
Multiple Object Tracking Using Re-Identification Model with Attention Module
by Woo-Jin Ahn, Koung-Suk Ko, Myo-Taeg Lim, Dong-Sung Pae and Tae-Koo Kang
Appl. Sci. 2023, 13(7), 4298; https://doi.org/10.3390/app13074298 - 28 Mar 2023
Cited by 3 | Viewed by 3168
Abstract
Multi-object tracking (MOT) has gained significant attention in computer vision due to its wide range of applications. Detection-based trackers in particular have shown high performance in MOT, but they tend to fail in occlusion scenarios, such as the moments when objects overlap or separate. In this paper, we propose a triplet-based MOT network that integrates tracking information with the visual features of the object. Using triplet-based image features, the network can differentiate similar-looking objects, reducing the number of identity switches over long sequences. Furthermore, an attention-based re-identification model that focuses on the appearance of objects is introduced to extract feature vectors from the images and effectively associate objects. Extensive experimental results demonstrate that the proposed method outperforms existing methods on the ID-switch metric and improves the detection performance of the tracking system.
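
The triplet objective that drives such re-identification embeddings can be sketched briefly (the toy encoder, crop size, and margin below are assumptions, not the paper's settings): it pulls embeddings of the same identity together and pushes different identities apart.

```python
# A minimal sketch of the triplet objective for re-identification embeddings.
import torch
import torch.nn as nn

embed = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 32, 128))  # toy re-ID encoder
triplet = nn.TripletMarginLoss(margin=0.3)                        # assumed margin

anchor = embed(torch.randn(8, 3, 64, 32))    # crops of a tracked object
positive = embed(torch.randn(8, 3, 64, 32))  # same identity, later frame
negative = embed(torch.randn(8, 3, 64, 32))  # different identity
loss = triplet(anchor, positive, negative)
loss.backward()   # gradients pull matching identities together in embedding space
print(loss.item())
```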

22 pages, 3823 KiB  
Article
Image Segmentation for Human Skin Detection
by Marcelo Leite, Wemerson Delcio Parreira, Anita Maria da Rocha Fernandes and Valderi Reis Quietinho Leithardt
Appl. Sci. 2022, 12(23), 12140; https://doi.org/10.3390/app122312140 - 27 Nov 2022
Cited by 7 | Viewed by 4197
Abstract
Human skin detection is a core task in various human–computer interaction applications, and several computer vision-based approaches for it have been developed in recent years. However, many factors can interfere with the segmentation process, such as lighting conditions, skin tones, complex backgrounds, and image capture equipment. In digital imaging, skin segmentation methods can overcome these challenges, or at least part of them, but the images analyzed typically follow an application-specific pattern. In this paper, we present an approach that uses a set of methods to segment skin and non-skin pixels in images from uncontrolled or unknown environments. Our main result is the ability to segment skin and non-skin pixels in digital images from an unconstrained capture environment, thereby overcoming challenges such as lighting conditions, compression, and scene complexity. By examining the segmented image, we determine the proportion of skin pixels present in the image, considering only the objects of interest (i.e., the people). In addition, this segmented analysis can generate independent information for each part of the human body. The proposed solution produces a dataset composed of a combination of other datasets present in the literature, which enables the construction of a heterogeneous set of images.
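
For intuition only, here is a classic colour-space skin heuristic followed by the kind of skin-pixel-proportion measurement the abstract mentions; this is a textbook baseline with assumed threshold values and a hypothetical input file, not the paper's segmentation method.

```python
# A minimal baseline: threshold the YCrCb colour space with a common
# rule-of-thumb skin range, then measure the skin-pixel proportion.
import cv2
import numpy as np

image = cv2.imread("person.jpg")                          # hypothetical input file
ycrcb = cv2.cvtColor(image, cv2.COLOR_BGR2YCrCb)
mask = cv2.inRange(ycrcb, (0, 133, 77), (255, 173, 127))  # rough skin range
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))

skin_ratio = (mask > 0).mean()
print(f"skin pixels: {skin_ratio:.1%} of the image")
```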

10 pages, 2722 KiB  
Article
Finite Element-Based Machine Learning Model for Predicting the Mechanical Properties of Composite Hydrogels
by Yasin Shokrollahi, Pengfei Dong, Peshala T. Gamage, Nashaita Patrawalla, Vipuil Kishore, Hozhabr Mozafari and Linxia Gu
Appl. Sci. 2022, 12(21), 10835; https://doi.org/10.3390/app122110835 - 26 Oct 2022
Cited by 6 | Viewed by 2841
Abstract
In this study, a finite element (FE)-based machine learning model was developed to predict the mechanical properties of bioglass (BG)-collagen (COL) composite hydrogels. Based on scanning electron microscope observations of BG-COL composite hydrogels, 2000 microstructural images with randomly distributed BG particles were created. The BG particles have diameters ranging from 0.5 µm to 1.5 µm and volume fractions from 17% to 59%. FE simulations of tensile testing were performed to calculate the Young's modulus and Poisson's ratio of the 2000 microstructures. The microstructural images, together with the Young's modulus and Poisson's ratio calculated by FE simulation, were used for training and testing a convolutional neural network regression model. Results showed that the network developed in this work can effectively predict the mechanical properties of the composite hydrogels, with R-squared values of 95% and 83% for Young's modulus and Poisson's ratio, respectively. This work provides a surrogate model for finite element analysis that predicts the mechanical properties of BG-COL hydrogels from microstructure images, which could be further utilized for characterizing heterogeneous materials in big-data-driven material design.
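
A surrogate of this kind can be sketched as a small CNN regressor that maps a microstructure image to the two target properties; the layer sizes and image resolution below are illustrative assumptions, not the authors' architecture.

```python
# A minimal sketch of a CNN regressor from microstructure images to
# [Young's modulus, Poisson's ratio], trained against FE-computed targets.
import torch
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 16 * 16, 64), nn.ReLU(),
    nn.Linear(64, 2),            # outputs: [Young's modulus, Poisson's ratio]
)

images = torch.randn(8, 1, 64, 64)   # stand-in microstructure images
targets = torch.rand(8, 2)           # stand-in FE-computed properties (normalized)
loss = nn.MSELoss()(cnn(images), targets)
loss.backward()
print(loss.item())
```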

24 pages, 9173 KiB  
Article
Machine Learning Sequential Methodology for Robot Inverse Kinematic Modelling
by Franco Luis Tagliani, Nicola Pellegrini and Francesco Aggogeri
Appl. Sci. 2022, 12(19), 9417; https://doi.org/10.3390/app12199417 - 20 Sep 2022
Cited by 5 | Viewed by 2349
Abstract
The application of robots is growing in most countries, and robots occupy a relevant place in everyday environments. Robots are still affected by errors due to their limitations, which may compromise final performance. Accurate trajectories and positioning are strict requirements that robots have to satisfy and may be studied through the inverse kinematic (IK) formulation. Conventional numerical IK techniques are computationally intensive procedures that solve for all robot joint values simultaneously, which increases the complexity of identifying a solution. In this scenario, machine learning strategies may be adopted to achieve an effective and robust IK formulation for manipulators thanks to their computational efficiency and learning ability. This work proposes a machine learning (ML) sequential methodology for robot inverse kinematics modeling that iterates the model prediction over each joint. The method implements an automatic Denavit-Hartenberg (D-H) parameter formulation code to obtain the forward kinematic (FK) equations required to produce the robot dataset. Moreover, an artificial neural network (ANN) is selected as the model structure, and the number of layers and hidden neurons per layer is defined for each joint by an offline optimization phase based on the genetic algorithm (GA) technique. The ANN is implemented with the scaled conjugate gradient as the training function and the mean squared error as the loss function. Different network architectures are examined to validate the IK procedure, ranging from global to sequential, and considering the computation direction (from the end-effector or from the base). The method is validated in simulated and experimental laboratory environments on articulated robots. The sequential method exhibits a reduction of the mean squared error index of 42.7–56.7% compared to the global scheme. Results show the outstanding performance of the IK model in robot joint space prediction, with a residual mean absolute error of 0.370–0.699 mm when following 150.0–200.0 mm paths on a real robot.
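
The sequential scheme can be sketched compactly (a hypothetical 3-DoF setup with assumed network sizes, not the authors' code): each joint's predictor receives the target pose plus the joint values already predicted before it.

```python
# A minimal sketch of sequential per-joint IK prediction: one small network
# per joint, each conditioned on the pose and on earlier joint predictions.
import torch
import torch.nn as nn

N_JOINTS, POSE_DIM = 3, 6   # assumed articulated robot and pose encoding
joint_nets = nn.ModuleList(
    nn.Sequential(nn.Linear(POSE_DIM + i, 32), nn.Tanh(), nn.Linear(32, 1))
    for i in range(N_JOINTS)
)

def predict_joints(pose: torch.Tensor) -> torch.Tensor:
    predicted = []
    for net in joint_nets:                       # iterate joint by joint
        features = torch.cat([pose] + predicted, dim=-1)
        predicted.append(net(features))          # condition on earlier joints
    return torch.cat(predicted, dim=-1)

print(predict_joints(torch.randn(4, POSE_DIM)).shape)  # torch.Size([4, 3])
```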

11 pages, 393 KiB  
Article
Cloud Storage Data Verification Using Signcryption Scheme
by Elizabeth Nathania Witanto and Sang-Gon Lee
Appl. Sci. 2022, 12(17), 8602; https://doi.org/10.3390/app12178602 - 27 Aug 2022
Cited by 4 | Viewed by 1471
Abstract
Cloud computing brings convenience to users by providing computational resources and services. However, it comes with security challenges, such as unreliable cloud service providers that could threaten users' data integrity. We therefore need a data verification protocol to ensure that users' data remain intact in cloud storage. A data verification protocol has three important properties: public verifiability, privacy preservation, and blockless verification. Unfortunately, various existing signcryption schemes do not fully provide these properties. We therefore propose an improved version of a signcryption technique based on the ZSS short signature that fulfills the aforementioned data verification properties. Our computational-cost and time-complexity assessment demonstrates that the suggested scheme offers more properties than another ZSS signcryption scheme at the same computational cost.
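
For reference, the standard ZSS short-signature equations that such a scheme builds on are recalled below (a recap of ZSS itself, not the paper's full signcryption construction), using a bilinear pairing e over a group generated by P and a hash function H.

```latex
% Standard ZSS short-signature scheme (Zhang-Safavi-Naini-Susilo):
% bilinear pairing e : G_1 x G_1 -> G_2, generator P of G_1, hash H.
\begin{align*}
  \text{KeyGen:} \quad & x \xleftarrow{\$} \mathbb{Z}_q^{*}, \qquad Q = xP \\
  \text{Sign:}   \quad & \sigma = \frac{1}{H(m) + x}\, P \\
  \text{Verify:} \quad & e\bigl(H(m)P + Q,\ \sigma\bigr) \stackrel{?}{=} e(P, P)
\end{align*}
```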

26 pages, 5110 KiB  
Article
Detecting Malignant Leukemia Cells Using Microscopic Blood Smear Images: A Deep Learning Approach
by Raheel Baig, Abdur Rehman, Abdullah Almuhaimeed, Abdulkareem Alzahrani and Hafiz Tayyab Rauf
Appl. Sci. 2022, 12(13), 6317; https://doi.org/10.3390/app12136317 - 21 Jun 2022
Cited by 22 | Viewed by 3170
Abstract
Leukaemia is a form of blood cancer that develops when the human body's bone marrow contains too many white blood cells. This medical condition affects adults and is a prevalent form of cancer in children. Treatment for leukaemia is determined by its type and by the extent to which the cancer has developed across the body, so it is crucial to diagnose leukaemia early in order to provide adequate care and to cure patients. Researchers have been working on advanced diagnostic systems based on machine learning (ML) approaches for early diagnosis. In this research, we employ a deep learning (DL)-based convolutional neural network (CNN) approach that hybridizes two individual CNN blocks, named CNN-1 and CNN-2, to detect acute lymphoblastic leukaemia (ALL), acute myeloid leukaemia (AML), and multiple myeloma (MM). The proposed model detects malignant leukaemia cells using microscopic blood smear images. We construct a dataset of about 4150 images from a public directory. The main challenges were background removal, stripping out non-essential blood components, reducing noise and blurriness, and finding a minimal method for image segmentation. For pre-processing and segmentation, we convert the RGB colour space into 8-bit greyscale, enhancing the contrast of the images with the image intensity adjustment method and the adaptive histogram equalisation (AHE) method. We increase the structure and sharpness of the images by multiplying a binary image with the enhanced images. In the next step, the images are complemented so that the background becomes black and the blood-cell nuclei white. Thereafter, we apply area and closing operations to remove background noise. Finally, we multiply the result with the source image to regenerate the dataset in RGB colour space and resize the images to 400 × 400 pixels. After applying all these methods and techniques, we obtain noiseless, non-blurred, sharpened, and segmented images of the lesions. The enhanced, segmented images are then given as input to the CNNs. Two parallel CNN models are trained to extract deep features, which are further combined using the canonical correlation analysis (CCA) fusion method to obtain more prominent features. We used five classification algorithms, namely SVM, bagging ensemble, total boosts, RUSBoost, and fine KNN, to evaluate the performance of the feature extraction. Among these, the bagging ensemble outperformed the other algorithms, achieving the highest accuracy of 97.04%.
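
The described pre-processing chain maps naturally onto standard OpenCV operations; the sketch below (with an assumed input filename and kernel sizes, and CLAHE standing in for the AHE step) shows one plausible reading of the pipeline, not the authors' exact code.

```python
# A minimal sketch of the pre-processing pipeline: greyscale conversion,
# contrast enhancement, binarization, complement, morphological cleanup,
# and remasking back into RGB colour space.
import cv2
import numpy as np

rgb = cv2.imread("blood_smear.jpg")                        # hypothetical input
gray = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)               # 8-bit greyscale
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(gray)                               # adaptive hist. equalization

_, binary = cv2.threshold(enhanced, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
binary = cv2.bitwise_not(binary)                           # nuclei white, background black
kernel = np.ones((5, 5), np.uint8)
binary = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel) # close small holes
binary = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)  # drop background noise

segmented = cv2.bitwise_and(rgb, rgb, mask=binary)         # remask the source image
segmented = cv2.resize(segmented, (400, 400))
cv2.imwrite("segmented.png", segmented)
```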

12 pages, 1643 KiB  
Article
A New Body Weight Lifelog Outliers Generation Method: Reflecting Characteristics of Body Weight Data
by Jiyong Kim and Minseo Park
Appl. Sci. 2022, 12(9), 4726; https://doi.org/10.3390/app12094726 - 7 May 2022
Cited by 1 | Viewed by 1436
Abstract
Lifelogs are generated in our daily lives and contain useful information for health monitoring. Nowadays, one can easily obtain various lifelogs from a wearable device such as a smartwatch. These lifelogs can include noise and outliers, and in general the amount of noise and outliers is significantly smaller than that of normal data, resulting in class imbalance. To achieve good analytic accuracy, the noise and outliers should be filtered out. Lifelogs have specific characteristics, namely low volatility and periodicity, and it is very important to continuously analyze and manage them within a specific time window. To solve the class imbalance problem of outliers in body weight lifelog data, we propose a new outlier generation method that reflects the characteristics of body weight. This study compared the proposed method with SMOTE-based and GAN-based data augmentation methods. Our results confirm that our proposed method for outlier detection was better than the SVM, XGBoost, and CatBoost algorithms. With it, we can reduce the level of data imbalance, improve data quality, and improve analytic accuracy.
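
The underlying idea, that body weight is low-volatility so large day-to-day jumps act as outliers, can be sketched as follows (all magnitudes and the rolling-mean detector are illustrative assumptions, not the paper's generator).

```python
# A minimal sketch: simulate a low-volatility weight series, inject synthetic
# outliers (implausible jumps), and flag them with a rolling-mean baseline.
import numpy as np

rng = np.random.default_rng(0)
days = 180
weight = 70 + np.cumsum(rng.normal(0, 0.05, days))   # low-volatility daily series

n_outliers = 8
idx = rng.choice(days, size=n_outliers, replace=False)
outliers = weight.copy()
# jumps of 3-8 kg in either direction are implausible day-to-day changes
outliers[idx] += rng.choice([-1, 1], n_outliers) * rng.uniform(3, 8, n_outliers)

rolling_mean = np.convolve(outliers, np.ones(7) / 7, mode="same")
flagged = np.abs(outliers - rolling_mean) > 2.0      # simple detector baseline
print(f"injected at {sorted(idx)}, flagged {flagged.sum()} points")
```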

22 pages, 1559 KiB  
Article
Classification and Fast Few-Shot Learning of Steel Surface Defects with Randomized Network
by Amr M. Nagy and László Czúni
Appl. Sci. 2022, 12(8), 3967; https://doi.org/10.3390/app12083967 - 14 Apr 2022
Cited by 9 | Viewed by 3292
Abstract
Quality inspection is inevitable in the steel industry, so benchmark datasets already exist for the visual inspection of steel surface defects. In our work we show, contrary to recent articles, that a generic state-of-the-art deep neural network is capable of almost-perfect classification of the defects in two popular benchmark datasets. However, in real-life applications new types of errors can always appear, so incremental learning based on very few example shots is challenging. In our article, we address the problems of the low number of available shots of new classes, the catastrophic forgetting of known information when tuning for new artifacts, and the long training time required for re-training or fine-tuning existing models. In the proposed architecture, we combine EfficientNet deep neural networks with randomized classifiers to provide an efficient solution for these demanding problems. The classification outperforms all other known approaches, with an accuracy of 100% or almost 100%, on the two datasets with the off-the-shelf network. The proposed few-shot learning approach shows considerably higher accuracy at a low number of shots than the other methods under testing, while its speed is significantly (at least 10 times) higher than that of its competitors. According to these results, the classification and few-shot learning of steel surface defects can be solved more efficiently than was previously possible.
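
The randomized-classifier idea can be sketched in a few lines of NumPy (with random features standing in for EfficientNet embeddings): the hidden weights are never trained and the readout is solved in closed form, which is why such classifiers can be fit orders of magnitude faster than backpropagated heads.

```python
# A minimal sketch of an ELM-style randomized classifier on frozen deep
# features: random hidden projection, closed-form least-squares readout.
import numpy as np

rng = np.random.default_rng(0)
n_samples, feat_dim, hidden, n_classes = 200, 1280, 512, 6

features = rng.normal(size=(n_samples, feat_dim))       # stand-in CNN features
labels = rng.integers(0, n_classes, n_samples)
targets = np.eye(n_classes)[labels]                     # one-hot targets

W = rng.normal(size=(feat_dim, hidden))                 # random, never trained
H = np.tanh(features @ W)                               # randomized hidden layer
beta, *_ = np.linalg.lstsq(H, targets, rcond=None)      # closed-form readout

predictions = (H @ beta).argmax(axis=1)
print(f"train accuracy: {(predictions == labels).mean():.2%}")
```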
