Spiking Neural-Networks-Based Data-Driven Control
Abstract
1. Introduction
2. Preliminary
2.1. A Brief Introduction to Reinforcement Learning (RL)
2.2. An Overview of Learning Methods for SNN
2.3. The Cart-Pole Environment
- The position of the cart: $x$,
- The velocity of the cart: $\dot{x}$,
- The pole angle: $\theta$,
- The angular velocity of the pole: $\dot{\theta}$.
- The pole drops, i.e., the absolute value of the pole angle exceeds a preset threshold.
- The cart slides off the edge of the animation display (i.e., the cart position is outside the range from −2.4 to 2.4).
- The number of simulation steps exceeds 200 (or another user-defined limit).
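The state variables and the three termination conditions listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' program: the names `CartPoleState` and `episode_done` are hypothetical, and the default `angle_limit` of 12 degrees is an assumption borrowed from the common Gym CartPole setting rather than a value stated in this section.

```python
import math
from typing import NamedTuple


class CartPoleState(NamedTuple):
    """The four observed quantities of the cart-pole (hypothetical container)."""
    x: float          # cart position
    x_dot: float      # cart velocity
    theta: float      # pole angle, in radians
    theta_dot: float  # pole angular velocity


def episode_done(
    state: CartPoleState,
    step: int,
    angle_limit: float = math.radians(12.0),  # ASSUMPTION: common Gym default
    x_limit: float = 2.4,                     # cart position range from the text
    max_steps: int = 200,                     # step limit from the text
) -> bool:
    """Return True if any of the three termination conditions holds."""
    pole_dropped = abs(state.theta) > angle_limit
    cart_out_of_range = abs(state.x) > x_limit
    step_limit_exceeded = step > max_steps
    return pole_dropped or cart_out_of_range or step_limit_exceeded


if __name__ == "__main__":
    upright = CartPoleState(x=0.0, x_dot=0.0, theta=0.0, theta_dot=0.0)
    print(episode_done(upright, step=10))
    print(episode_done(upright._replace(x=2.5), step=10))
```

Keeping the limits as keyword arguments mirrors the remark that the step limit (and, by assumption, the angle threshold) can be customized.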
3. The TD-STDP SNN
3.1. The Overall Workflow of the Program
3.2. The Architecture of the SNN
3.3. The Input Neurons
3.4. The Output Neurons
3.5. Q-Learning by SNN
3.6. Determining Eligibility of the Synapses
3.7. Learning the Synaptic Weights by TD-STDP
3.8. Exploration and Exploitation in Training
4. The R-STDP SNN
4.1. The Differences between the R-STDP and the TD-STDP Programs
4.2. Updating Synapse Weights with Delayed Reward
4.3. Reward Function Designs
4.4. Exploitation and Exploration
5. The SNN Simulator
6. Results
6.1. TD-STDP Learning
6.2. R-STDP Learning
7. Discussion
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Liu, Y.; Pan, W. Spiking Neural-Networks-Based Data-Driven Control. Electronics 2023, 12, 310. https://doi.org/10.3390/electronics12020310