Batch Gradient Learning Algorithm with Smoothing Regularization for Feedforward Neural Networks
Abstract
1. Introduction
2. Network Structure and Learning Algorithm Methodology
2.1. Network Structure
2.2. Modified Error Function with Smoothing Regularization (BGS)
3. Materials and Methods
- I. $E(W^{k+1}, V^{k+1}) \le E(W^k, V^k)$ for $k = 0, 1, 2, \ldots$;
- II. There exists $E^{*} \ge 0$ such that $\lim_{k \to \infty} E(W^k, V^k) = E^{*}$;
- III. $\lim_{k \to \infty} \left\| \nabla_{W} E(W^k, V^k) \right\| = 0$;
- IV. $\lim_{k \to \infty} \left\| \nabla_{V} E(W^k, V^k) \right\| = 0$.

Further, if Proposition 4 is also valid, we have the following strong convergence:

- V. There exists a point $(W^{*}, V^{*})$ such that $\lim_{k \to \infty} (W^k, V^k) = (W^{*}, V^{*})$.
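For orientation, the smoothing-regularized error function referenced as Equation (10) in this family of methods has the generic shape sketched below. This is an assumption-laden sketch: $f$ stands in for the paper's particular smoothing function and $\lambda$ for its regularization parameter; the paper's own Equation (10) fixes the exact choice.

```latex
% Generic smoothing-regularized batch error (a sketch, not the paper's exact Eq. (10)):
% squared error over the J training samples plus a smoothed penalty on every weight w_i.
E(W, V) \;=\; \frac{1}{2} \sum_{j=1}^{J} \bigl( O_j - y_j \bigr)^{2}
\;+\; \lambda \sum_{i} f(w_i)
```

Because $f$ is continuously differentiable, unlike the non-smooth penalty it approximates, the gradients in Equation (15) exist everywhere, which is what makes results I–V provable for plain batch gradient descent.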
Algorithm 1 The learning algorithm

| Step | Description |
|---|---|
| Input | The input dimension, the number of hidden nodes, the maximum iteration number, the learning rate, the regularization parameter, and the training sample set. |
| Initialization | Randomly initialize the two weight vectors. |
| Training | For each iteration up to the maximum iteration number: compute the error function by Equation (10); compute the gradients by Equation (15); update the weights by Equation (14). |
| Output | The final weight vectors. |
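To make the training loop concrete, here is a minimal NumPy sketch of the batch gradient update with a smoothed penalty. Everything here is an illustrative assumption rather than the paper's exact formulas: the network is a one-hidden-layer sigmoid network, `smooth_penalty` is a Huber-style smoothing of the ℓ1 penalty standing in for the paper's smoothing function, and the names `train_bgs`, `eps`, and so on are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def smooth_penalty(w, eps=0.05):
    # Huber-style smoothing of |w|: quadratic on [-eps, eps], |w| outside.
    # A stand-in for the paper's smoothing function in Equation (10).
    return np.where(np.abs(w) < eps, w**2 / (2 * eps) + eps / 2, np.abs(w))

def smooth_penalty_grad(w, eps=0.05):
    # Derivative of the smoothed penalty; continuous everywhere.
    return np.where(np.abs(w) < eps, w / eps, np.sign(w))

def train_bgs(X, O, n_hidden, eta, lam, max_iter, seed=0):
    """Batch gradient descent with smoothing regularization (BGS-style sketch)."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    W = rng.uniform(-0.5, 0.5, size=(p, n_hidden))  # input-to-hidden weights
    V = rng.uniform(-0.5, 0.5, size=(n_hidden, 1))  # hidden-to-output weights
    for _ in range(max_iter):
        H = sigmoid(X @ W)                 # hidden-layer outputs for the whole batch
        Y = sigmoid(H @ V)                 # network outputs
        dY = (Y - O) * Y * (1 - Y)         # backprop through the output sigmoid
        gV = H.T @ dY + lam * smooth_penalty_grad(V)
        gW = X.T @ ((dY @ V.T) * H * (1 - H)) + lam * smooth_penalty_grad(W)
        V -= eta * gV                      # batch update: all samples per step
        W -= eta * gW
    Y = sigmoid(sigmoid(X @ W) @ V)
    E = 0.5 * np.sum((Y - O) ** 2) + lam * (smooth_penalty(W).sum()
                                            + smooth_penalty(V).sum())
    return W, V, E
```

The only BGS-specific ingredient is the `lam * smooth_penalty_grad(...)` term in each gradient; setting `lam = 0` recovers a plain BG baseline of the kind compared against in Section 4.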
4. Experimental Results
4.1. N-Dimensional Parity Problems
4.2. Function Approximation Problem
5. Discussion
6. Conclusions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
- (a) ;
- (b) .
References
- Deperlioglu, O.; Kose, U. An educational tool for artificial neural networks. Comput. Electr. Eng. 2011, 37, 392–402.
- Abu-Elanien, A.E.; Salama, M.M.A.; Ibrahim, M. Determination of transformer health condition using artificial neural networks. In Proceedings of the 2011 International Symposium on Innovations in Intelligent Systems and Applications, Istanbul, Turkey, 15–18 June 2011; pp. 1–5.
- Huang, W.; Lai, K.K.; Nakamori, Y.; Wang, S.; Yu, L. Neural networks in finance and economics forecasting. Int. J. Inf. Technol. Decis. Mak. 2007, 6, 113–140.
- Papic, C.; Sanders, R.H.; Naemi, R.; Elipot, M.; Andersen, J. Improving data acquisition speed and accuracy in sport using neural networks. J. Sports Sci. 2021, 39, 513–522.
- Pirdashti, M.; Curteanu, S.; Kamangar, M.H.; Hassim, M.H.; Khatami, M.A. Artificial neural networks: Applications in chemical engineering. Rev. Chem. Eng. 2013, 29, 205–239.
- Li, J.; Cheng, J.H.; Shi, J.Y.; Huang, F. Brief introduction of back propagation (BP) neural network algorithm and its improvement. In Advances in Computer Science and Information Engineering; Springer: Berlin/Heidelberg, Germany, 2012; pp. 553–558.
- Hoi, S.C.; Sahoo, D.; Lu, J.; Zhao, P. Online learning: A comprehensive survey. Neurocomputing 2021, 459, 249–289.
- Fukumizu, K. Effect of batch learning in multilayer neural networks. Gen 1998, 1, 1E-03.
- Hawkins, D.M. The problem of overfitting. J. Chem. Inf. Comput. Sci. 2004, 44, 1–12.
- Dietterich, T. Overfitting and undercomputing in machine learning. ACM Comput. Surv. 1995, 27, 326–327.
- Everitt, B.S.; Skrondal, A. The Cambridge Dictionary of Statistics; Cambridge University Press: Cambridge, UK, 2010.
- Moore, A.W. Cross-Validation for Detecting and Preventing Overfitting; School of Computer Science, Carnegie Mellon University: Pittsburgh, PA, USA, 2001.
- Yao, Y.; Rosasco, L.; Caponnetto, A. On early stopping in gradient descent learning. Constr. Approx. 2007, 26, 289–315.
- Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 2014, 15, 1929–1958.
- Santos, C.F.G.D.; Papa, J.P. Avoiding overfitting: A survey on regularization methods for convolutional neural networks. ACM Comput. Surv. 2022, 54, 1–25.
- Waseem, M.; Lin, Z.; Yang, L. Data-driven load forecasting of air conditioners for demand response using Levenberg–Marquardt algorithm-based ANN. Big Data Cogn. Comput. 2019, 3, 36.
- Waseem, M.; Lin, Z.; Liu, S.; Jinai, Z.; Rizwan, M.; Sajjad, I.A. Optimal BRA based electric demand prediction strategy considering instance-based learning of the forecast factors. Int. Trans. Electr. Energy Syst. 2021, 31, e12967.
- Alemu, H.Z.; Wu, W.; Zhao, J. Feedforward neural networks with a hidden layer regularization method. Symmetry 2018, 10, 525.
- Li, F.; Zurada, J.M.; Liu, Y.; Wu, W. Input layer regularization of multilayer feedforward neural networks. IEEE Access 2017, 5, 10979–10985.
- Mohamed, K.S.; Wu, W.; Liu, Y. A modified higher-order feed forward neural network with smoothing regularization. Neural Netw. World 2017, 27, 577–592.
- Reed, R. Pruning algorithms-a survey. IEEE Trans. Neural Netw. 1993, 4, 740–747.
- Setiono, R. A penalty-function approach for pruning feedforward neural networks. Neural Comput. 1997, 9, 185–204.
- Nakamura, K.; Hong, B.W. Adaptive weight decay for deep neural networks. IEEE Access 2019, 7, 118857–118865.
- Bosman, A.; Engelbrecht, A.; Helbig, M. Fitness landscape analysis of weight-elimination neural networks. Neural Process. Lett. 2018, 48, 353–373.
- Rosato, A.; Panella, M.; Andreotti, A.; Mohammed, O.A.; Araneo, R. Two-stage dynamic management in energy communities using a decision system based on elastic net regularization. Appl. Energy 2021, 291, 116852.
- Pan, C.; Ye, X.; Zhou, J.; Sun, Z. Matrix regularization-based method for large-scale inverse problem of force identification. Mech. Syst. Signal Process. 2020, 140, 106698.
- Liang, S.; Yin, M.; Huang, Y.; Dai, X.; Wang, Q. Nuclear norm regularized deep neural network for EEG-based emotion recognition. Front. Psychol. 2022, 13, 924793.
- Candes, E.J.; Tao, T. Decoding by linear programming. IEEE Trans. Inf. Theory 2005, 51, 4203–4215.
- Wang, Y.; Liu, P.; Li, Z.; Sun, T.; Yang, C.; Zheng, Q. Data regularization using Gaussian beams decomposition and sparse norms. J. Inverse Ill-Posed Probl. 2013, 21, 1–23.
- Zhang, H.; Tang, Y. Online gradient method with smoothing ℓ0 regularization for feedforward neural networks. Neurocomputing 2017, 224, 1–8.
- Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B Methodol. 1996, 58, 267–288.
- Koneru, B.N.G.; Vasudevan, V. Sparse artificial neural networks using a novel smoothed LASSO penalization. IEEE Trans. Circuits Syst. II Express Briefs 2019, 66, 848–852.
- Xu, Z.; Zhang, H.; Wang, Y.; Chang, X.; Liang, Y. L1/2 regularization. Sci. China Inf. Sci. 2010, 53, 1159–1169.
- Wu, W.; Fan, Q.; Zurada, J.M.; Wang, J.; Yang, D.; Liu, Y. Batch gradient method with smoothing L1/2 regularization for training of feedforward neural networks. Neural Netw. 2014, 50, 72–78.
- Liu, Y.; Yang, D.; Zhang, C. Relaxed conditions for convergence analysis of online back-propagation algorithm with L2 regularizer for Sigma-Pi-Sigma neural network. Neurocomputing 2018, 272, 163–169.
- Mohamed, K.S.; Liu, Y.; Wu, W.; Alemu, H.Z. Batch gradient method for training of Pi-Sigma neural network with penalty. Int. J. Artif. Intell. Appl. 2016, 7, 11–20.
- Zhang, H.; Wu, W.; Liu, F.; Yao, M. Boundedness and convergence of online gradient method with penalty for feedforward neural networks. IEEE Trans. Neural Netw. 2009, 20, 1050–1054.
- Zhang, H.; Wu, W.; Yao, M. Boundedness and convergence of batch back-propagation algorithm with penalty for feedforward neural networks. Neurocomputing 2012, 89, 141–146.
- Haykin, S. Neural Networks: A Comprehensive Foundation, 2nd ed.; Tsinghua University Press: Beijing, China; Prentice Hall: Hoboken, NJ, USA, 2001.
- Liu, Y.; Wu, W.; Fan, Q.; Yang, D.; Wang, J. A modified gradient learning algorithm with smoothing L1/2 regularization for Takagi–Sugeno fuzzy models. Neurocomputing 2014, 138, 229–237.
- Iyoda, E.M.; Nobuhara, H.; Hirota, K. A solution for the n-bit parity problem using a single translated multiplicative neuron. Neural Process. Lett. 2003, 18, 233–238.
| Problem | Network Structure | Initial Weight Range | Max Iterations | Learning Rate (LR) | Regularization Parameter (RP) |
|---|---|---|---|---|---|
| 3-bit parity | 3-6-1 | [−0.5, 0.5] | 2000 | 0.009 | 0.0003 |
| 6-bit parity | 6-20-1 | [−0.5, 0.5] | 3000 | 0.006 | 0.003 |
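As a worked example of the first row of the table above, the sketch below builds the complete 3-bit parity training set (all 2³ binary inputs, labeled 1 for an odd number of ones) and trains the hypothetical `train_bgs` routine sketched in Section 3 with a 3-6-1 structure, learning rate 0.009, and regularization parameter 0.0003. The helper name `parity_dataset` is illustrative, not from the paper.

```python
import numpy as np
from itertools import product

def parity_dataset(n_bits):
    # All 2^n binary input vectors, each labeled with its parity bit.
    X = np.array(list(product([0, 1], repeat=n_bits)), dtype=float)
    O = (X.sum(axis=1) % 2).reshape(-1, 1)
    return X, O

X, O = parity_dataset(3)                      # 8 samples, 3 inputs each
W, V, E = train_bgs(X, O, n_hidden=6,         # 3-6-1 network from the table
                    eta=0.009, lam=0.0003, max_iter=2000)
print(f"final regularized batch error: {E:.2e}")
```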
| Problem | Learning Algorithm | Average Error | Norm of Gradient | Time (s) |
|---|---|---|---|---|
| 3-bit parity | BG | 3.7979 × 10⁻⁷ | 0.0422 | 1.156248 |
| 3-bit parity | BG | 5.4060 × 10⁻⁷ | 7.1536 × 10⁻⁴ | 1.216248 |
| 3-bit parity | BG | 9.7820 × 10⁻⁷ | 8.7826 × 10⁻⁴ | 1.164721 |
| 3-bit parity | BGS | 1.7951 × 10⁻⁸ | 0.0011 | 1.155829 |
| 3-bit parity | BGS | 7.6653 × 10⁻⁹ | 7.9579 × 10⁻⁵ | 1.135742 |
| 6-bit parity | BG | 8.1281 × 10⁻⁵ | 1.1669 | 52.225856 |
| 6-bit parity | BG | 3.8917 × 10⁻⁵ | 0.0316 | 52.359129 |
| 6-bit parity | BG | 4.1744 × 10⁻⁵ | 0.0167 | 52.196552 |
| 6-bit parity | BGS | 4.8349 × 10⁻⁵ | 0.0088 | 52.210994 |
| 6-bit parity | BGS | 4.1656 × 10⁻⁶ | 0.0015 | 52.106554 |
| Learning Algorithm | Average Error | Norm of Gradient | Time (s) |
|---|---|---|---|
| BG | 0.0388 | 0.3533 | 4.415500 |
| BG | 0.0389 | 0.3050 | 4.368372 |
| BG | 0.0390 | 0.3087 | 4.368503 |
| BGS | 0.0386 | 0.2999 | 4.349813 |
| BGS | 0.0379 | 0.2919 | 4.320198 |
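The function approximation experiment of Section 4.2 can be mirrored the same way. Since the target function is not reproduced in this extract, the sketch below substitutes a hypothetical one-dimensional target (a rescaled sine) purely to show the setup; none of these choices come from the paper.

```python
import numpy as np

# Hypothetical function-approximation setup: the paper's actual target
# function and sampling are not shown here, so a rescaled sine is used.
X = np.linspace(-np.pi, np.pi, 100).reshape(-1, 1)
O = 0.5 * (np.sin(X) + 1.0)   # mapped into (0, 1) to suit the sigmoid output
W, V, E = train_bgs(X, O, n_hidden=10, eta=0.01, lam=0.001, max_iter=3000)
print(f"final regularized batch error: {E:.6f}")
```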