**5. Conclusions**

IoT technology and sensor networks are widely used for monitoring, data collection, and analysis in smart environments over wireless communication systems. However, because the nodes are resource-constrained, most existing solutions cannot balance the routing load across the selected routes, and they suffer rapid data loss in the presence of security attacks. In this paper, we presented a D2D multi-criteria reinforcement learning algorithm with a secured IoT infrastructure for smart cities. It provides authenticated and verified communication between directly connected devices and increases the trustworthiness of transmission. Using multi-criteria reinforcement learning, the proposed algorithm offers intelligent methods for sensing the coverage area and efficiently distributing the energy load among mobile devices. The proposed algorithm can be used in smart buildings to interconnect various operations and for security surveillance using mobile IoT devices and sensor technologies. It makes it possible to gather real-time data from the smart building and transmit them in a timely manner to network applications for further analysis and appropriate action.

However, the proposed algorithm still suffers from link disruption with a high exchange of control packets. In future work, we therefore aim to use a deep learning model and a real-time dataset to train the network nodes to cope with network anomalies. Additionally, we would like to introduce multi-cloud support into the proposed algorithm for high scalability and parallel processing.

**Author Contributions:** Conceptualization, K.H. and A.R.; methodology, K.H. and A.R.; software, T.S. and S.A.B.; validation, J.L. and T.S.; formal analysis, A.R., J.L., and K.H.; investigation, J.L. and S.A.B.; resources, J.L. and A.R.; data curation, A.R. and T.S.; writing—original draft preparation, K.H. and A.R.; writing—review and editing, T.S. and J.L.; visualization, S.A.B. and T.S.; supervision, J.L. and T.S.; project administration, J.L. and A.R.; funding acquisition, J.L. and A.R. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** All data are available in the manuscript.

**Acknowledgments:** This research was technically supported by the Artificial Intelligence and Data Analytics Lab (AIDA), CCIS, Prince Sultan University, Riyadh, Saudi Arabia. The authors are thankful for this technical support.

**Conflicts of Interest:** The authors declare no conflict of interest.
