Gen2Gen: Efficiently Training Artificial Neural Networks Using a Series of Genetic Algorithms
Abstract
1. Introduction
2. Method Description
2.1. The Algorithm of the First Stage
Algorithm 1. Calculating the penalty quantity $B(g, a)$ for a provided neural network.
- Initialization step.
  - (a) Set $N_g$ as the maximum number of allowed generations.
  - (b) Set $N_c$ as the number of chromosomes used. Each chromosome is considered a vector of double-precision values; the value $d$ represents the dimension of the input pattern, and the constant $H$ defines the number of processing nodes of the neural network. Every value in the chromosomes is initialized randomly in the range $[-F, F]$, where $F$ denotes the initialization factor.
  - (c) Set the selection rate $p_s$, where $p_s \in [0, 1]$.
  - (d) Set the mutation rate $p_m$, where $p_m \in [0, 1]$.
  - (e) Set $k = 0$ as the generation counter.
- Fitness calculation step.
  - (a) For $i = 1, \dots, N_c$, perform the following.
    - i. Create the neural network $N(g_i)$ for the chromosome $g_i$.
    - ii. Set $E_i = \sum_{j=1}^{M} \left( N(g_i, x_j) - y_j \right)^2$, the training error of the network over the $M$ training patterns $(x_j, y_j)$.
    - iii. Set $B_i = B(g_i, a)$ using the function of Algorithm 1.
    - iv. Set the fitness of chromosome $g_i$ as the training error $E_i$ penalized according to $B_i$ and the penalty value $\lambda$, with $\lambda > 0$.
  - (b) End For.
- Application of genetic operations.
  - (a) Copy the best chromosomes of the current population intact to the next generation, according to the selection rate $p_s$. The remaining chromosomes will be replaced by new chromosomes produced during crossover and mutation.
  - (b) Perform the crossover procedure. For each new pair of offspring, two parents $z$ and $w$ are selected from the current population using tournament selection. After the selection of the parents, the new offspring $\tilde{z}$ and $\tilde{w}$ are formed element-wise using the following:
    $$\tilde{z}_i = a_i z_i + \left( 1 - a_i \right) w_i, \qquad \tilde{w}_i = a_i w_i + \left( 1 - a_i \right) z_i$$
    where $a_i$ is a randomly drawn blending coefficient.
  - (c) Perform the mutation procedure, as proposed in [44]: for every chromosome $g$ and each element $g_i$, select a random number $r \in [0, 1]$. If $r \le p_m$, alter the corresponding element as
    $$g_i' = \begin{cases} g_i + \Delta\left( k, u_i - g_i \right), & t = 0 \\ g_i - \Delta\left( k, g_i - l_i \right), & t = 1 \end{cases}$$
    where $l_i$ and $u_i$ denote the lower and upper bounds of element $g_i$. The number $t$ is a random number that can be 0 or 1, and the function $\Delta(k, y)$ is calculated as
    $$\Delta(k, y) = y \left( 1 - \xi^{\left( 1 - k / N_g \right)^{b}} \right)$$
    where $\xi$ is a random number in $[0, 1]$ and $b$ is a user-defined constant. (A minimal code sketch of this stage, including both operators, follows the list.)
- Termination check step.
  - (a) Set $k = k + 1$.
  - (b) If $k \le N_g$, go to the fitness calculation step.
- Final step.
  - (a) Obtain the chromosome $g^{*}$ having the lowest fitness value in the population.
  - (b) Produce the bounding vectors $L$ and $R$ from $g^{*}$, widening its elements through the margin scale factor $f$; these vectors delimit the search region of the second stage.
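To make the mechanics of this first stage concrete, the following is a minimal Python sketch, not the authors' implementation. The callables `train_error` (the sum-of-squared training errors of the network $N(g)$) and `penalty` (the quantity $B(g, a)$ of Algorithm 1, whose listing is not reproduced here) must be supplied by the caller; the tournament size, the blend range $[-0.5, 1.5]$, the elitism split, and the closing margin rule are all illustrative assumptions rather than the paper's exact definitions. For a network with $H$ sigmoidal nodes and $d$ inputs, `dim` would typically be $(d+2)H$, counting input weights, biases, and output weights, again an assumption.

```python
import numpy as np

rng = np.random.default_rng(42)

def tournament(pop, fit, size=4):
    # Return the chromosome with the lowest fitness among `size` random picks.
    idx = rng.integers(0, len(pop), size)
    return pop[idx[np.argmin(fit[idx])]]

def blend_crossover(z, w):
    # Element-wise blend of two parents; the range [-0.5, 1.5] is an assumption.
    a = rng.uniform(-0.5, 1.5, size=z.shape)
    return a * z + (1.0 - a) * w, a * w + (1.0 - a) * z

def nonuniform_mutation(g, k, n_gen, p_m, lo, hi, b=5.0):
    # Move a mutated element toward one of its bounds by a step that
    # shrinks as the generation counter k approaches n_gen.
    g = g.copy()
    for i in range(len(g)):
        if rng.random() <= p_m:
            t = rng.integers(2)  # direction flag, 0 or 1
            y = hi[i] - g[i] if t == 0 else g[i] - lo[i]
            step = y * (1.0 - rng.random() ** ((1.0 - k / n_gen) ** b))
            g[i] += step if t == 0 else -step
    return g

def stage1(train_error, penalty, dim, n_chrom=500, n_gen=200,
           p_s=0.1, p_m=0.05, F=10.0, a=10.0, f=2.0, lam=100.0):
    # Default parameter values follow Table 1.
    lo, hi = np.full(dim, -F), np.full(dim, F)
    pop = rng.uniform(lo, hi, size=(n_chrom, dim))

    def fitness(g):
        # Training error penalized through the quantity of Algorithm 1.
        return train_error(g) * (1.0 + lam * penalty(g, a) ** 2)

    for k in range(n_gen):
        fit = np.array([fitness(g) for g in pop])
        order = np.argsort(fit)
        pop, fit = pop[order], fit[order]
        n_keep = int((1.0 - p_s) * n_chrom)  # assumed elitism split
        children = []
        while len(children) < n_chrom - n_keep:
            z, w = tournament(pop, fit), tournament(pop, fit)
            for c in blend_crossover(z, w):
                children.append(nonuniform_mutation(c, k, n_gen, p_m, lo, hi))
        pop = np.vstack([pop[:n_keep]] + children[:n_chrom - n_keep])

    fit = np.array([fitness(g) for g in pop])
    best = pop[np.argmin(fit)]
    # Assumed margin rule: bounds scaled from the best chromosome by f.
    return best, -f * np.abs(best), f * np.abs(best)
```

The two returned vectors play the role of $L$ and $R$: they delimit the region inside which the second stage searches for promising parameter intervals.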
2.2. The Algorithm of the Second Stage
- Initialization step.
  - (a) Set $N_g$ as the maximum number of allowed generations and $N_c$ as the total number of chromosomes.
  - (b) Set $p_s$ as the selection rate and $p_m$ as the mutation rate.
  - (c) Initialize every chromosome randomly inside the vectors $L$ and $R$ produced during the previous phase.
  - (d) Set $N_s$ as the number of samples used in the fitness calculation step.
  - (e) Set $k = 0$, the generation counter.
- Fitness calculation step.
  - (a) For $i = 1, \dots, N_c$, calculate the fitness of each chromosome using the procedure provided in Algorithm 2.
  - (b) End For.
- Application of genetic operators.
  - (a) Selection procedure. Copy the best chromosomes to the next generation without changes; the remaining ones will be replaced by offspring created using the crossover and mutation procedures. The chromosomes are sorted according to their fitness values.
  - (b) Crossover procedure. For every pair of produced chromosomes, two parents are chosen using tournament selection, and the new chromosomes are produced using the one-point crossover method, graphically presented in Figure 2.
  - (c) Mutation procedure. For each element of each chromosome, a random number $r \in [0, 1]$ is drawn; the corresponding element is altered randomly when $r \le p_m$. (A sketch of these two operators appears after this list.)
- Termination check step.
  - (a) Set $k = k + 1$.
  - (b) If $k \le N_g$, go to the fitness calculation step.
- Final step.
  - (a) Obtain the best chromosome from the population.
  - (b) Produce the corresponding set of intervals for the parameters of the neural network; these intervals bound the final training stage.
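Since Figure 2 is not reproduced here, the following is a minimal sketch of the two stage-2 operators, assuming chromosomes are flat double-precision vectors kept inside the bounds $L$ and $R$; reading "altered randomly" as a uniform redraw inside the bounds is an assumption.

```python
import numpy as np

rng = np.random.default_rng(7)

def one_point_crossover(z, w):
    # Swap the tails of two parents at a random cut point, in the spirit
    # of the one-point scheme that Figure 2 illustrates.
    cut = rng.integers(1, len(z))
    return (np.concatenate([z[:cut], w[cut:]]),
            np.concatenate([w[:cut], z[cut:]]))

def random_reset_mutation(g, p_m, L, R):
    # Redraw each element uniformly inside [L_i, R_i] with probability p_m.
    g = g.copy()
    mask = rng.random(len(g)) <= p_m
    g[mask] = rng.uniform(L[mask], R[mask])
    return g
```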
Algorithm 2. Fitness calculation function.
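The listing of Algorithm 2 is likewise not reproduced, so the sketch below illustrates one plausible sampling-based fitness rather than the authors' exact procedure: it draws $N_s$ random weight vectors inside a candidate set of intervals and scores the candidate by the best training error encountered, so that well-placed intervals receive lower fitness. The name `interval_fitness` and the best-of-samples aggregation are assumptions.

```python
import numpy as np

rng = np.random.default_rng(11)

def interval_fitness(lo, hi, train_error, n_samples=50):
    # Sample n_samples weight vectors uniformly inside the candidate
    # intervals [lo_i, hi_i] and keep the best training error found.
    best = np.inf
    for _ in range(n_samples):
        w = rng.uniform(lo, hi)
        best = min(best, train_error(w))
    return best
```

Averaging the sampled errors instead of keeping the minimum is an equally plausible aggregation; either way, the sample count $N_s$ trades fitness accuracy for speed.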
2.3. The Final Training Algorithm
- Initialization step.
  - (a) Set $N_g$ as the maximum number of allowed generations and $N_c$ as the total number of chromosomes.
  - (b) Set $p_s$ as the selection rate and $p_m$ as the mutation rate.
  - (c) Initialize the chromosomes as random vectors with elements inside the set of intervals produced by the second stage.
  - (d) Set $k = 0$, the generation counter.
- Fitness calculation step.
  - (a) For $i = 1, \dots, N_c$, perform the following.
    - i. Create the neural network $N(g_i)$ for the chromosome $g_i$.
    - ii. Calculate the associated fitness value as $f_i = \sum_{j=1}^{M} \left( N(g_i, x_j) - y_j \right)^2$, the training error over the $M$ training patterns.
  - (b) End For.
- Incorporation of genetic operators. Apply the same genetic operators as in the first phase of the proposed algorithm, described in Section 2.1.
- Termination check step.
  - (a) Set $k = k + 1$.
  - (b) If $k \le N_g$, go to the fitness calculation step of the current algorithm.
- Testing step.
  - (a) Obtain the chromosome with the lowest fitness value in the population and denote it as $g^{*}$.
  - (b) Produce the associated neural network $N(g^{*})$.
  - (c) Apply a local search procedure to the error function of this network. The local search procedure used was the tolerant BFGS variant of Powell [45]. (A sketch of this step follows the list.)
  - (d) Apply the neural network to the associated test set of the problem to obtain the test error.
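A minimal sketch of this testing step follows, assuming `train_error` and `test_error` are callables mapping a weight vector to a scalar error. SciPy's standard BFGS is used as a stand-in, since the tolerant variant of Powell [45] is not available in common libraries.

```python
import numpy as np
from scipy.optimize import minimize

def polish_and_test(train_error, test_error, w_best):
    # Refine the best chromosome of the final generation with a local
    # search, then score the refined network on the test set. The 2000
    # iteration cap mirrors the budget quoted for BFGS in Section 3.2.
    res = minimize(train_error, np.asarray(w_best, dtype=float),
                   method="BFGS", options={"maxiter": 2000})
    return test_error(res.x)
```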
3. Results
- The UCI database, https://archive.ics.uci.edu/ (accessed on 19 April 2025) [46].
- The Keel website, https://sci2s.ugr.es/keel/datasets.php (accessed on 19 April 2025) [47].
- The StatLib repository, https://stat.ethz.ch/Teaching/Datasets/Statlib-index (accessed on 19 April 2025).
3.1. Experimental Datasets
1. Appendicitis, which is a medical dataset [48].
2. Alcohol, which is a dataset regarding alcohol consumption [49].
3. Australian, which is a dataset produced from various bank transactions [50].
4. Balance dataset [51], produced from various psychological experiments.
5. Cleveland, which is a medical dataset related to heart diseases [52,53].
6. Circular dataset, which is an artificial dataset.
7. Dermatology, a medical dataset for dermatology problems [54].
8. The Hayes–Roth dataset, which was initially suggested in [55].
9. Heart, which is a dataset related to heart diseases [56].
10. HeartAttack, which is related to heart diseases.
11. Housevotes, a dataset which contains data from Congressional voting in the USA [57].
12. Ionosphere, which contains measurements from the ionosphere [58,59].
13. Liverdisorder, a medical dataset used to detect liver disorders [60,61].
14. The Lymography dataset [62].
15. Mammographic, which is related to the presence of breast cancer [63].
16. Parkinsons, a dataset related to the detection of Parkinson's disease [64,65].
17. Pima, which is related to the presence of diabetes [66].
18. Popfailures, a dataset related to experiments regarding climate [67].
19. Regions2, a medical dataset applied to liver problems [68].
20. Saheart, which is a medical dataset concerning heart diseases [69].
21. Segment dataset [70].
22. The Sonar dataset, related to sonar signals [71].
23. Statheart, a medical dataset related to heart diseases.
24. Spiral, which was created artificially and contains two distinct classes.
25. Student, which is a dataset regarding experiments in schools [72].
26. Transfusion, which is also a dataset used for medical purposes [73].
27. Wdbc, a medical dataset related to the detection of breast cancer [74,75].
28. Wine, a dataset related to the chemical analysis of wines [76,77].
29. The Z_F_S, Z_O_N_F_S, ZO_NF_S, and ZONF_S datasets, which are related to EEG recordings [78,79].
30. Zoo, which is a dataset regarding animal classification [80].
1. Abalone, which is a dataset for the detection of the age of abalones [81].
2. Airfoil, a dataset provided by NASA [82].
3. Auto, a dataset used to predict the fuel consumption of cars.
4. BK, which is used to predict the points scored in basketball games.
5. BL, a dataset that contains measurements from electricity experiments.
6. Baseball, which is a dataset used to predict the income of baseball players.
7. Concrete, which is a civil engineering dataset [83].
8. DEE, a dataset that is used to predict the price of electricity.
9. Friedman, which is an artificial dataset [84].
10. FY, which is a dataset regarding the longevity of fruit flies.
11. HO, a dataset located in the STATLIB repository.
12. Housing, regarding the price of houses [85].
13. Laser, which is used in physics experiments.
14. The MB dataset, which originates from Smoothing Methods in Statistics.
15. The NT dataset [86].
16. Mortgage, a dataset that contains data from the economy of the USA.
17. The PL dataset, located in the STATLIB repository.
18. Plastic, a dataset regarding problems occurring with pressure on plastics.
19. The PY dataset [87].
20. Quake, a dataset regarding the measurements of earthquakes.
21. SN, a dataset related to trellising and pruning.
22. Stock, which is related to the prices of stocks.
23. Treasury, a dataset that contains measurements from the economy of the USA.
3.2. Experimental Results
- The column DATASET is used to denote the name of the dataset.
- The column BFGS represents the results obtained by training a neural network with $H$ processing nodes using the BFGS optimization method [45]. This method terminates either when the gradient becomes zero or when a maximum number of iterations is reached; in the experiments performed, this number was set to 2000.
- The column ADAM is used to denote the training of a neural network with $H$ processing nodes using the ADAM optimization method [17]. The maximum number of iterations was set to 10,000.
- The column NEAT represents the incorporation of the NEAT method (NeuroEvolution of Augmenting Topologies) [89]. The population size was set to 500, as in the case of the proposed method.
- The column RBF is used to denote the usage of a Radial Basis Function (RBF) network [90,91] with 10 processing nodes. The network was trained with the original two-phase training method of RBF networks: during the first phase, the centers and the variances of the model were calculated using the k-means algorithm [92], and during the second phase, the weights of the network were obtained by solving a linear system of equations. (A sketch of this two-phase scheme follows the list.)
- The column GENETIC denotes the usage of a genetic algorithm to train a neural network with $H$ processing nodes. The parameters used in this algorithm are listed in Table 1.
- The column PROPOSED denotes experimental results of the proposed method.
- The row AVERAGE represents the average classification or regression error for all datasets.
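For reference, the following is a minimal NumPy/SciPy sketch of the two-phase RBF training described above. The Gaussian basis and the per-center spread rule (the mean distance of a cluster's members to its center) are assumptions, and `kmeans2` stands in for the k-means step [92].

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def train_rbf(X, y, n_centers=10, seed=3):
    # Phase 1: k-means places the centers; each center's spread is taken
    # as the mean distance of its cluster members (an assumed rule).
    centers, labels = kmeans2(X, n_centers, minit="++", seed=seed)
    sigma = np.ones(n_centers)
    for j in range(n_centers):
        members = X[labels == j]
        if len(members) > 0:
            d = np.linalg.norm(members - centers[j], axis=1).mean()
            sigma[j] = d if d > 0 else 1.0
    # Phase 2: the output weights solve a linear least-squares system.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    Phi = np.exp(-d2 / (2.0 * sigma ** 2))
    weights, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return centers, sigma, weights
```

Calling `train_rbf(X, y)` on a training matrix `X` and a target vector `y` returns the centers, spreads, and output weights; predictions for new inputs follow by rebuilding the design matrix and multiplying it by the weights.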
3.3. A Practical Example
- RBF, which represents the application of the RBF network with 10 processing nodes.
- BFGS, which stands for the BFGS method, used to train a neural network with $H$ processing nodes.
- GENETIC, which represents a genetic algorithm used to train a neural network with $H$ processing nodes.
- GEN2GEN, which represents the proposed method.
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Conflicts of Interest
References
- Abiodun, O.I.; Jantan, A.; Omolara, A.E.; Dada, K.V.; Mohamed, N.A.; Arshad, H. State-of-the-art in artificial neural network applications: A survey. Heliyon 2018, 4, e00938. [Google Scholar] [CrossRef]
- Suryadevara, S.; Yanamala, A.K.Y. A Comprehensive Overview of Artificial Neural Networks: Evolution, Architectures, and Applications. Rev. Intel. Artif. Med. 2021, 12, 51–76. [Google Scholar]
- Tsoulos, I.G.; Gavrilis, D.; Glavas, E. Neural network construction and training using grammatical evolution. Neurocomputing 2008, 72, 269–277. [Google Scholar] [CrossRef]
- Guarnieri, S.; Piazza, F.; Uncini, A. Multilayer feedforward networks with adaptive spline activation function. IEEE Trans. Neural Netw. 1999, 10, 672–683. [Google Scholar] [CrossRef] [PubMed]
- Ertuğrul, Ö.F. A novel type of activation function in artificial neural networks: Trained activation function. Neural Netw. 2018, 99, 148–157. [Google Scholar] [CrossRef] [PubMed]
- Rasamoelina, A.D.; Adjailia, F.; Sinčák, P. A Review of Activation Function for Artificial Neural Network. In Proceedings of the 2020 IEEE 18th World Symposium on Applied Machine Intelligence and Informatics (SAMI), Herlany, Slovakia, 23–25 January 2020; pp. 281–286. [Google Scholar]
- Egmont-Petersen, M.; de Ridder, D.; Handels, H. Image processing with neural networks—A review. Pattern Recognit. 2002, 35, 2279–2301. [Google Scholar] [CrossRef]
- Zhang, G.P. Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing 2003, 50, 159–175. [Google Scholar] [CrossRef]
- Huang, Z.; Chen, H.; Hsu, C.-J.; Chen, W.-H.; Wu, S. Credit rating analysis with support vector machines and neural networks: A market comparative study. Decis. Support Syst. 2004, 37, 543–558. [Google Scholar] [CrossRef]
- Baldi, P.; Cranmer, K.; Faucett, T.; Sadowski, P.; Whiteson, D. Parameterized neural networks for high-energy physics. Eur. Phys. J. C 2016, 76, 235. [Google Scholar] [CrossRef]
- Baskin, I.I.; Winkler, D.; Tetko, I.V. A renaissance of neural networks in drug discovery. Expert Opin. Drug Discov. 2016, 11, 785–795. [Google Scholar] [CrossRef]
- Bartzatt, R. Prediction of Novel Anti-Ebola Virus Compounds Utilizing Artificial Neural Network (ANN). Chem. Fac. 2018, 49, 16–34. [Google Scholar]
- Peta, K.; Żurek, J. Prediction of air leakage in heat exchangers for automotive applications using artificial neural networks. In Proceedings of the 2018 9th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), New York, NY, USA, 8–10 November 2018; pp. 721–725. [Google Scholar]
- Vora, K.; Yagnik, S. A survey on backpropagation algorithms for feedforward neural networks. Int. J. Eng. Dev. Res. 2014, 1, 193–197. [Google Scholar]
- Pajchrowski, T.; Zawirski, K.; Nowopolski, K. Neural speed controller trained online by means of modified RPROP algorithm. IEEE Trans. Ind. Inform. 2014, 11, 560–568. [Google Scholar] [CrossRef]
- Hermanto, R.P.S.; Nugroho, A. Waiting-time estimation in bank customer queues using RPROP neural networks. Procedia Comput. Sci. 2018, 135, 35–42. [Google Scholar] [CrossRef]
- Kingma, D.P.; Ba, J.L. ADAM: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations (ICLR 2015), San Diego, CA, USA, 7–9 May 2015; pp. 1–15. [Google Scholar]
- Reynolds, J.; Rezgui, Y.; Kwan, A.; Piriou, S. A zone-level, building energy optimisation combining an artificial neural network, a genetic algorithm, and model predictive control. Energy 2018, 151, 729–739. [Google Scholar] [CrossRef]
- Das, G.; Pattnaik, P.K.; Padhy, S.K. Artificial neural network trained by particle swarm optimization for non-linear channel equalization. Expert Syst. Appl. 2014, 41, 3491–3496. [Google Scholar] [CrossRef]
- Sexton, R.S.; Dorsey, R.E.; Johnson, J.D. Beyond backpropagation: Using simulated annealing for training neural networks. J. Organ. End User Comput. 1999, 11, 3–10. [Google Scholar] [CrossRef]
- Wang, L.; Zeng, Y.; Chen, T. Back propagation neural network with adaptive differential evolution algorithm for time series forecasting. Expert Syst. Appl. 2015, 42, 855–863. [Google Scholar] [CrossRef]
- Karaboga, D.; Akay, B. Artificial bee colony (ABC) algorithm on training artificial neural networks. In Proceedings of the 2007 IEEE 15th Signal Processing and Communications Applications, Eskisehir, Turkey, 11–13 June 2007; IEEE: Piscataway, NJ, USA, 2007; pp. 1–4. [Google Scholar]
- Sexton, R.S.; Alidaee, B.; Dorsey, R.E.; Johnson, J.D. Global optimization for artificial neural networks: A tabu search application. Eur. J. Oper. Res. 1998, 106, 570–584. [Google Scholar] [CrossRef]
- Zhang, J.-R.; Zhang, J.; Lok, T.-M.; Lyu, M.R. A hybrid particle swarm optimization—Back-propagation algorithm for feedforward neural network training. Appl. Math. Comput. 2007, 185, 1026–1037. [Google Scholar] [CrossRef]
- Zhao, G.; Wang, T.; Jin, Y.; Lang, C.; Li, Y.; Ling, H. The Cascaded Forward algorithm for neural network training. Pattern Recognit. 2025, 161, 111292. [Google Scholar] [CrossRef]
- Oh, K.-S.; Jung, K. GPU implementation of neural networks. Pattern Recognit. 2004, 37, 1311–1314. [Google Scholar] [CrossRef]
- Zhang, M.; Hibi, K.; Inoue, J. GPU-accelerated artificial neural network potential for molecular dynamics simulation. Comput. Phys. Commun. 2023, 285, 108655. [Google Scholar] [CrossRef]
- Nowlan, S.J.; Hinton, G.E. Simplifying neural networks by soft weight sharing. Neural Comput. 1992, 4, 473–493. [Google Scholar] [CrossRef]
- Hanson, S.J.; Pratt, L.Y. Comparing biases for minimal network construction with back propagation. In Advances in Neural Information Processing Systems; Touretzky, D.S., Ed.; Morgan Kaufmann: San Mateo, CA, USA, 1989; Volume 1, pp. 177–185. [Google Scholar]
- Augasta, M.; Kathirvalavakumar, T. Pruning algorithms of neural networks—A comparative study. Cent. Eur. J. Comput. Sci. 2013, 3, 105–115. [Google Scholar] [CrossRef]
- Prechelt, L. Automatic early stopping using cross validation: Quantifying the criteria. Neural Netw. 1998, 11, 761–767. [Google Scholar] [CrossRef]
- Wu, X.; Liu, J. A New Early Stopping Algorithm for Improving Neural Network Generalization. In Proceedings of the 2009 Second International Conference on Intelligent Computation Technology and Automation, Changsha, China, 10–11 October 2009; pp. 15–18. [Google Scholar]
- Treadgold, N.K.; Gedeon, T.D. Simulated annealing and weight decay in adaptive learning: The SARPROP algorithm. IEEE Trans. Neural Netw. 1998, 9, 662–668. [Google Scholar] [CrossRef]
- Carvalho, M.; Ludermir, T.B. Particle Swarm Optimization of Feed-Forward Neural Networks with Weight Decay. In Proceedings of the 2006 Sixth International Conference on Hybrid Intelligent Systems (HIS’06), Rio de Janeiro, Brazil, 13–15 December 2006; p. 5. [Google Scholar]
- Arifovic, J.; Gençay, R. Using genetic algorithms to select architecture of a feedforward artificial neural network. Phys. A Stat. Mech. Appl. 2001, 289, 574–594. [Google Scholar] [CrossRef]
- Benardos, P.G.; Vosniakos, G.C. Optimizing feedforward artificial neural network architecture. Eng. Appl. Artif. Intell. 2007, 20, 365–382. [Google Scholar] [CrossRef]
- Garro, B.A.; Vázquez, R.A. Designing Artificial Neural Networks Using Particle Swarm Optimization Algorithms. Comput. Neurosci. 2015, 2015, 369298. [Google Scholar] [CrossRef]
- Siebel, N.T.; Sommer, G. Evolutionary reinforcement learning of artificial neural networks. Int. J. Hybrid Intell. Syst. 2007, 4, 171–183. [Google Scholar] [CrossRef]
- Jaafra, Y.; Laurent, J.L.; Deruyver, A.; Naceur, M.S. Reinforcement learning for neural architecture search: A review. Image Vis. Comput. 2019, 89, 57–66. [Google Scholar] [CrossRef]
- Pham, H.; Guan, M.; Zoph, B.; Le, Q.; Dean, J. Efficient neural architecture search via parameters sharing. In Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; pp. 4095–4104. [Google Scholar]
- Xie, S.; Zheng, H.; Liu, C.; Lin, L. SNAS: Stochastic neural architecture search. arXiv 2018, arXiv:1812.09926. [Google Scholar]
- Zhou, H.; Yang, M.; Wang, J.; Pan, W. Bayesnas: A bayesian approach for neural architecture search. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; pp. 7603–7613. [Google Scholar]
- Huqqani, A.A.; Schikuta, E.; Ye, S.; Chen, P. Multicore and GPU Parallelization of Neural Networks for Face Recognition. Procedia Comput. Sci. 2013, 18, 349–358. [Google Scholar] [CrossRef]
- Kaelo, P.; Ali, M.M. Integrated crossover rules in real coded genetic algorithms. Eur. J. Oper. Res. 2007, 176, 60–76. [Google Scholar] [CrossRef]
- Powell, M.J.D. A Tolerant Algorithm for Linearly Constrained Optimization Calculations. Math. Program. 1989, 45, 547–566. [Google Scholar] [CrossRef]
- Kelly, M.; Longjohn, R.; Nottingham, K. The UCI Machine Learning Repository. Available online: https://archive.ics.uci.edu (accessed on 19 August 2025).
- Alcalá-Fdez, J.; Fernandez, A.; Luengo, J.; Derrac, J.; García, S.; Sánchez, L.; Herrera, F. KEEL Data-Mining Software Tool: Data Set Repository, Integration of Algorithms and Experimental Analysis Framework. J. Mult.-Valued Log. Soft Comput. 2011, 17, 255–287. [Google Scholar]
- Weiss, S.M.; Kulikowski, C.A. Computer Systems That Learn: Classification and Prediction Methods from Statistics, Neural Nets, Machine Learning, and Expert Systems; Morgan Kaufmann Publishers Inc.: Burlington, MA, USA, 1991. [Google Scholar]
- Tzimourta, K.D.; Tsoulos, I.; Bilero, I.T.; Tzallas, A.T.; Tsipouras, M.G.; Giannakeas, N. Direct Assessment of Alcohol Consumption in Mental State Using Brain Computer Interfaces and Grammatical Evolution. Inventions 2018, 3, 51. [Google Scholar] [CrossRef]
- Quinlan, J.R. Simplifying Decision Trees. Int. J. Man-Mach. Stud. 1987, 27, 221–234. [Google Scholar] [CrossRef]
- Shultz, T.; Mareschal, D.; Schmidt, W. Modeling Cognitive Development on Balance Scale Phenomena. Mach. Learn. 1994, 16, 59–88. [Google Scholar] [CrossRef]
- Zhou, Z.H.; Jiang, Y. NeC4.5: Neural ensemble based C4.5. IEEE Trans. Knowl. Data Eng. 2004, 16, 770–773. [Google Scholar] [CrossRef]
- Setiono, R.; Leow, W.K. FERNN: An Algorithm for Fast Extraction of Rules from Neural Networks. Appl. Intell. 2000, 12, 15–25. [Google Scholar] [CrossRef]
- Demiroz, G.; Govenir, H.A.; Ilter, N. Learning Differential Diagnosis of Erythemato-Squamous Diseases using Voting Feature Intervals. Artif. Intell. Med. 1998, 13, 147–165. [Google Scholar]
- Hayes-Roth, B.; Hayes-Roth, B.F. Concept learning and the recognition and classification of exemplars. J. Verbal Learn. Verbal Behav. 1977, 16, 321–338. [Google Scholar] [CrossRef]
- Kononenko, I.; Šimec, E.; Robnik-Šikonja, M. Overcoming the Myopia of Inductive Learning Algorithms with RELIEFF. Appl. Intell. 1997, 7, 39–55. [Google Scholar] [CrossRef]
- French, R.M.; Chater, N. Using noise to compute error surfaces in connectionist networks: A novel means of reducing catastrophic forgetting. Neural Comput. 2002, 14, 1755–1769. [Google Scholar] [CrossRef] [PubMed]
- Dy, J.G.; Brodley, C.E. Feature Selection for Unsupervised Learning. J. Mach. Learn. Res. 2004, 5, 845–889. [Google Scholar]
- Perantonis, S.J.; Virvilis, V. Input Feature Extraction for Multilayered Perceptrons Using Supervised Principal Component Analysis. Neural Process. Lett. 1999, 10, 243–252. [Google Scholar] [CrossRef]
- Garcke, J.; Griebel, M. Classification with sparse grids using simplicial basis functions. Intell. Data Anal. 2002, 6, 483–502. [Google Scholar] [CrossRef]
- McDermott, J.; Forsyth, R.S. Diagnosing a disorder in a classification benchmark. Pattern Recognit. Lett. 2016, 73, 41–43. [Google Scholar] [CrossRef]
- Cestnik, G.; Konenenko, I.; Bratko, I. Assistant-86: A Knowledge-Elicitation Tool for Sophisticated Users. In Progress in Machine Learning; Bratko, I., Lavrac, N., Eds.; Sigma Press: Wilmslow, UK, 1987; pp. 31–45. [Google Scholar]
- Elter, M.; Schulz-Wendtland, R.; Wittenberg, T. The prediction of breast cancer biopsy outcomes using two CAD approaches that both emphasize an intelligible decision process. Med. Phys. 2007, 34, 4164–4172. [Google Scholar] [CrossRef] [PubMed]
- Little, M.A.; Mcsharry, P.E.; Roberts, S.J.; Costello, D.; Moroz, I. Exploiting Nonlinear Recurrence and Fractal Scaling Properties for Voice Disorder Detection. BioMed Eng. OnLine 2007, 6, 23. [Google Scholar] [CrossRef] [PubMed]
- Little, M.A.; McSharry, P.E.; Hunter, E.J.; Spielman, J.; Ramig, L.O. Suitability of dysphonia measurements for telemonitoring of Parkinson’s disease. IEEE Trans. Biomed. Eng. 2009, 56, 1015–1022. [Google Scholar] [CrossRef] [PubMed]
- Smith, J.W.; Everhart, J.E.; Dickson, W.C.; Knowler, W.C.; Johannes, R.S. Using the ADAP learning algorithm to forecast the onset of diabetes mellitus. In Proceedings of the Symposium on Computer Applications and Medical Care, Washington, DC, USA, 6–9 November 1988; IEEE Computer Society Press: Piscataway, NJ, USA, 1988; pp. 261–265. [Google Scholar]
- Lucas, D.D.; Klein, R.; Tannahill, J.; Ivanova, D.; Brandon, S.; Domyancic, D.; Zhang, Y. Failure analysis of parameter-induced simulation crashes in climate models. Geosci. Model Dev. 2013, 6, 1157–1171. [Google Scholar] [CrossRef]
- Giannakeas, N.; Tsipouras, M.G.; Tzallas, A.T.; Kyriakidi, K.; Tsianou, Z.E.; Manousou, P.; Hall, A.; Karvounis, E.C.; Tsianos, V.; Tsianos, E. A clustering based method for collagen proportional area extraction in liver biopsy images. In Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS, Milano, Italy, 25–29 August 2015; art. no. 7319047. pp. 3097–3100. [Google Scholar]
- Hastie, T.; Tibshirani, R. Non-parametric logistic and proportional odds regression. J. R. Stat. Soc. Ser. C Appl. Stat. 1987, 36, 260–276. [Google Scholar] [CrossRef]
- Dash, M.; Liu, H.; Scheuermann, P.; Tan, K.L. Fast hierarchical clustering and its validation. Data Knowl. Eng. 2003, 44, 109–138. [Google Scholar] [CrossRef]
- Gorman, R.P.; Sejnowski, T.J. Analysis of Hidden Units in a Layered Network Trained to Classify Sonar Targets. Neural Netw. 1988, 1, 75–89. [Google Scholar] [CrossRef]
- Cortez, P.; Silva, A.M.G. Using data mining to predict secondary school student performance. In Proceedings of the 5th FUture BUsiness TEChnology Conference (FUBUTEC 2008), Porto, Portugal, 9–11 April 2008; pp. 5–12. [Google Scholar]
- Yeh, I.C.; Yang, K.J.; Ting, T.M. Knowledge discovery on RFM model using Bernoulli sequence. Expert Syst. Appl. 2009, 36, 5866–5871. [Google Scholar] [CrossRef]
- Jeyasingh, S.; Veluchamy, M. Modified bat algorithm for feature selection with the Wisconsin diagnosis breast cancer (WDBC) dataset. Asian Pac. J. Cancer Prev. APJCP 2017, 18, 1257. [Google Scholar]
- Alshayeji, M.H.; Ellethy, H.; Gupta, R. Computer-aided detection of breast cancer on the Wisconsin dataset: An artificial neural networks approach. Biomed. Signal Process. Control 2022, 71, 103141. [Google Scholar] [CrossRef]
- Raymer, M.; Doom, T.E.; Kuhn, L.A.; Punch, W.F. Knowledge discovery in medical and biological datasets using a hybrid Bayes classifier/evolutionary algorithm. IEEE Trans. Syst. Man Cybern. Part B Cybern. 2003, 33, 802–813. [Google Scholar] [CrossRef] [PubMed]
- Zhong, P.; Fukushima, M. Regularized nonsmooth Newton method for multi-class support vector machines. Optim. Methods Softw. 2007, 22, 225–236. [Google Scholar] [CrossRef]
- Andrzejak, R.G.; Lehnertz, K.; Mormann, F.; Rieke, C.; David, P.; Elger, C.E. Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: Dependence on recording region and brain state. Phys. Rev. E 2001, 64, 061907. [Google Scholar] [CrossRef] [PubMed]
- Tzallas, A.T.; Tsipouras, M.G.; Fotiadis, D.I. Automatic Seizure Detection Based on Time-Frequency Analysis and Artificial Neural Networks. Comput. Intell. Neurosci. 2007, 2007, 80510. [Google Scholar] [CrossRef]
- Koivisto, M.; Sood, K. Exact Bayesian Structure Discovery in Bayesian Networks. J. Mach. Learn. Res. 2004, 5, 549–573. [Google Scholar]
- Nash, W.J.; Sellers, T.L.; Talbot, S.R.; Cawthor, A.J.; Ford, W.B. The Population Biology of Abalone (Haliotis Species) in Tasmania. I. Blacklip Abalone (H. rubra) from the North Coast and Islands of Bass Strait, Sea Fisheries Division; Technical Report No. 48; Department of Primary Industry and Fisheries, Tasmania: Hobart, Australia, 1994; ISSN 1034-3288.
- Brooks, T.F.; Pope, D.S.; Marcolini, A.M. Airfoil Self-Noise and Prediction. Technical Report, NASA RP-1218. July 1989. Available online: https://ntrs.nasa.gov/citations/19890016302 (accessed on 14 November 2024).
- Yeh, I.C. Modeling of strength of high performance concrete using artificial neural networks. Cem. Concr. Res. 1998, 28, 1797–1808. [Google Scholar] [CrossRef]
- Friedman, J. Multivariate Adaptive Regression Splines. Ann. Stat. 1991, 19, 1–141. [Google Scholar]
- Harrison, D.; Rubinfeld, D.L. Hedonic prices and the demand for clean air. J. Environ. Econ. Manag. 1978, 5, 81–102. [Google Scholar] [CrossRef]
- Mackowiak, P.A.; Wasserman, S.S.; Levine, M.M. A critical appraisal of 98.6 degrees F, the upper limit of the normal body temperature, and other legacies of Carl Reinhold August Wunderlich. J. Am. Med. Assoc. 1992, 268, 1578–1580. [Google Scholar] [CrossRef]
- King, R.D.; Muggleton, S.; Lewis, R.A.; Sternberg, M.J.E. Drug design by machine learning: The use of inductive logic programming to model the structure-activity relationships of trimethoprim analogues binding to dihydrofolate reductase. Proc. Nat. Acad. Sci. USA 1992, 89, 11322–11326. [Google Scholar] [CrossRef]
- Tsoulos, I.G.; Charilogis, V.; Kyrou, G.; Stavrou, V.N.; Tzallas, A. OPTIMUS: A Multidimensional Global Optimization Package. J. Open Source Softw. 2025, 10, 7584. [Google Scholar] [CrossRef]
- Stanley, K.O.; Miikkulainen, R. Evolving Neural Networks through Augmenting Topologies. Evol. Comput. 2002, 10, 99–127. [Google Scholar] [CrossRef]
- Park, J.; Sandberg, I.W. Universal Approximation Using Radial-Basis-Function Networks. Neural Comput. 1991, 3, 246–257. [Google Scholar] [CrossRef]
- Montazer, G.A.; Giveki, D.; Karami, M.; Rastegar, H. Radial basis function neural networks: A review. Comput. Rev. J. 2018, 1, 52–74. [Google Scholar]
- MacQueen, J. Some methods for classification and analysis of multivariate observations. In Proceedings of the Berkeley Symposium on Mathematical Statistics & Probability; University of California Press: Oakland, CA, USA, 1967; Volume 1, pp. 281–297. [Google Scholar]
- Emad-Ud-Din, M.; Wang, Y. Promoting occupancy detection accuracy using on-device lifelong learning. IEEE Sens. J. 2023, 23, 9595–9606. [Google Scholar] [CrossRef]
Parameter | Meaning | Value
---|---|---
$N_c$ | Number of chromosomes | 500
$N_g$ | Maximum number of generations | 200
$p_s$ | Selection rate | 0.1
$p_m$ | Mutation rate | 0.05
H | Number of nodes | 10
$F$ | Initialization factor | 10.0
a | Bounding factor | 10.0
f | Scale factor for the margins | 2.0
$\lambda$ | Value used for penalties | 100.0
DATASET | BFGS | ADAM | NEAT | RBF | GENETIC | PROPOSED |
---|---|---|---|---|---|---|
Alcohol | 41.50% | 57.78% | 66.80% | 49.38% | 39.57% | 26.24% |
Appendicitis | 18.00% | 16.50% | 17.20% | 12.23% | 18.10% | 14.90% |
Australian | 38.13% | 35.65% | 31.98% | 34.89% | 32.21% | 31.64% |
Balance | 8.64% | 7.87% | 23.14% | 33.42% | 8.97% | 7.80% |
Cleveland | 77.55% | 67.55% | 53.44% | 67.10% | 51.60% | 47.51% |
Circular | 6.08% | 19.95% | 35.18% | 5.98% | 5.99% | 5.42% |
Dermatology | 52.92% | 26.14% | 32.43% | 62.34% | 30.58% | 5.97% |
Hayes Roth | 37.33% | 59.70% | 50.15% | 64.36% | 56.18% | 39.28% |
Heart | 39.44% | 38.53% | 39.27% | 31.20% | 28.34% | 16.85% |
HeartAttack | 46.67% | 45.55% | 32.34% | 29.00% | 29.03% | 23.77% |
HouseVotes | 7.13% | 7.48% | 10.89% | 6.13% | 6.62% | 3.05% |
Ionosphere | 15.29% | 16.64% | 19.67% | 16.22% | 15.14% | 8.75% |
Liverdisorder | 42.59% | 41.53% | 30.67% | 30.84% | 31.11% | 29.53% |
Lymography | 35.43% | 29.26% | 33.70% | 25.50% | 28.42% | 17.17% |
Mammographic | 17.24% | 46.25% | 22.85% | 21.38% | 19.88% | 16.45% |
Parkinsons | 27.58% | 24.06% | 18.56% | 17.41% | 18.05% | 17.46% |
Pima | 35.59% | 34.85% | 34.51% | 25.78% | 32.19% | 27.25% |
Popfailures | 5.24% | 5.18% | 7.05% | 7.04% | 5.94% | 4.66% |
Regions2 | 36.28% | 29.85% | 33.23% | 38.29% | 29.39% | 25.88% |
Saheart | 37.48% | 34.04% | 34.51% | 32.19% | 34.86% | 31.59% |
Segment | 68.97% | 49.75% | 66.72% | 59.68% | 57.72% | 42.43% |
Sonar | 25.85% | 30.33% | 34.10% | 27.90% | 22.40% | 19.30% |
Spiral | 47.99% | 48.90% | 50.22% | 44.87% | 48.66% | 44.67% |
Statheart | 39.65% | 44.04% | 44.36% | 31.36% | 27.25% | 18.90% |
Student | 7.14% | 5.13% | 10.20% | 5.49% | 5.61% | 4.33% |
Transfusion | 25.84% | 25.68% | 24.87% | 26.41% | 24.87% | 23.60% |
Wdbc | 29.91% | 35.35% | 12.88% | 7.27% | 8.56% | 8.69% |
Wine | 59.71% | 29.40% | 25.43% | 31.41% | 19.20% | 7.27% |
Z_F_S | 39.37% | 47.81% | 38.41% | 13.16% | 10.73% | 5.33% |
Z_O_N_F_S | 65.67% | 78.79% | 77.08% | 48.70% | 64.81% | 53.15% |
ZO_NF_S | 43.04% | 47.43% | 43.75% | 9.02% | 21.54% | 5.82% |
ZONF_S | 15.62% | 11.99% | 5.44% | 4.03% | 4.36% | 2.35% |
ZOO | 10.70% | 14.13% | 20.27% | 21.93% | 9.50% | 6.07% |
AVERAGE | 33.50% | 33.73% | 32.77% | 28.54% | 25.68% | 19.49% |
DATASET | BFGS | ADAM | NEAT | RBF | GENETIC | PROPOSED |
---|---|---|---|---|---|---|
Abalone | 5.69 | 4.30 | 9.88 | 7.37 | 7.17 | 4.42 |
Airfoil | 0.003 | 0.005 | 0.067 | 0.27 | 0.003 | 0.003 |
Auto | 60.97 | 70.84 | 56.06 | 17.87 | 12.18 | 12.10 |
Baseball | 119.63 | 77.90 | 100.39 | 93.02 | 103.60 | 79.30 |
BK | 0.28 | 0.03 | 0.15 | 0.02 | 0.03 | 0.017 |
BL | 2.55 | 0.28 | 0.05 | 0.013 | 5.74 | 0.001 |
Concrete | 0.066 | 0.078 | 0.081 | 0.011 | 0.0099 | 0.004 |
Dee | 2.36 | 0.630 | 1.512 | 0.17 | 1.013 | 0.21 |
Housing | 97.38 | 80.20 | 56.49 | 57.68 | 43.26 | 20.74 |
Friedman | 1.26 | 22.90 | 19.35 | 7.23 | 1.249 | 3.569 |
FA | 0.426 | 0.11 | 0.19 | 0.015 | 0.025 | 0.011 |
FY | 0.22 | 0.038 | 0.08 | 0.041 | 0.65 | 0.038 |
HO | 0.62 | 0.035 | 0.169 | 0.03 | 2.78 | 0.012 |
Laser | 0.015 | 0.03 | 0.084 | 0.03 | 0.59 | 0.004 |
MB | 0.129 | 0.06 | 0.061 | 2.16 | 0.051 | 0.048 |
Mortgage | 8.23 | 9.24 | 14.11 | 1.45 | 2.41 | 0.85 |
NT | 0.129 | 0.12 | 0.33 | 8.14 | 0.006 | 0.006 |
PL | 0.29 | 0.117 | 0.098 | 2.12 | 0.28 | 0.022 |
Plastic | 20.32 | 11.71 | 20.77 | 8.62 | 2.79 | 2.20 |
PY | 0.578 | 0.09 | 0.075 | 0.012 | 0.564 | 0.016 |
Quake | 0.42 | 0.06 | 0.298 | 0.07 | 0.12 | 0.037 |
SN | 0.40 | 0.026 | 0.174 | 0.027 | 2.95 | 0.024 |
Stock | 302.43 | 180.89 | 12.23 | 12.23 | 3.88 | 3.25 |
Treasury | 9.91 | 11.16 | 15.52 | 2.02 | 2.93 | 1.11 |
AVERAGE | 26.43 | 19.62 | 12.84 | 9.19 | 8.10 | 5.33 |
DATASET | GENETIC | PROPOSED (NO BFGS) | PROPOSED |
---|---|---|---|
Alcohol | 39.57% | 26.32% | 26.24% |
Appendicitis | 18.10% | 16.00% | 14.90% |
Australian | 32.21% | 28.09% | 31.64% |
Balance | 8.97% | 7.81% | 7.80% |
Cleveland | 51.60% | 46.24% | 47.51% |
Circular | 5.99% | 5.51% | 5.42% |
Dermatology | 30.58% | 8.83% | 5.97% |
Hayes Roth | 56.18% | 42.38% | 39.28% |
Heart | 28.34% | 18.37% | 16.85% |
HeartAttack | 29.03% | 19.50% | 23.77% |
HouseVotes | 6.62% | 3.48% | 3.05% |
Ionosphere | 15.14% | 10.03% | 8.75% |
Liverdisorder | 31.11% | 30.94% | 29.53% |
Lymography | 28.42% | 20.79% | 17.17% |
Mammographic | 19.88% | 16.59% | 16.45% |
Parkinsons | 18.05% | 16.21% | 17.46% |
Pima | 32.19% | 31.11% | 27.25% |
Popfailures | 5.94% | 4.61% | 4.66% |
Regions2 | 29.39% | 25.10% | 25.88% |
Saheart | 34.86% | 31.20% | 31.59% |
Segment | 57.72% | 40.87% | 42.43% |
Sonar | 22.40% | 25.55% | 19.30% |
Spiral | 48.66% | 46.13% | 44.67% |
Statheart | 27.25% | 17.59% | 18.90% |
Student | 5.61% | 3.88% | 4.33% |
Transfusion | 24.87% | 22.99% | 23.60% |
Wdbc | 8.56% | 8.43% | 8.69% |
Wine | 19.20% | 6.53% | 7.27% |
Z_F_S | 10.73% | 6.73% | 5.33% |
Z_O_N_F_S | 64.81% | 49.68% | 53.15% |
ZO_NF_S | 21.54% | 7.52% | 5.82% |
ZONF_S | 4.36% | 2.28% | 2.35% |
ZOO | 9.50% | 13.90% | 6.07% |
AVERAGE | 25.68% | 20.04% | 19.49% |