**1. Introduction**

The complexity of control synthesis problems for autonomous robots, which must perform assigned tasks and achieve a set goal, has led to new ideas in control theory. Today, a control system for an autonomous robot is obtained by training [1,2] rather than by solving a known optimization problem.

To formulate a realistic mobile robot control problem, a large number of different phase constraints must be described. These can be walls, doors between rooms, windows, columns and other obstacles. For example, a robot has to go around a column, avoid hitting a wall and pass through a door. At present, when control systems for mobile robots are created, programmers imagine the problems the robot will face and decide how it should overcome them. This is a laborious process, although it was justified when control systems were developed individually for single technical objects, such as spacecraft. However, modern automation and robotization are reaching a broader scale and becoming ubiquitous. This trend requires new universal, and even automatic, approaches to the development of control systems.

Symbolic regression methods make it possible to obtain mathematical expressions for control functions automatically. Such expressions describe how the robot should reach the goal optimally while avoiding obstacles.

Only symbolic regression methods search for both the structure and the parameters of a mathematical expression. Other methods, including artificial neural networks, search for parameters only, within a structure that is fixed in advance.
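The distinction between structure and parameters can be illustrated with a minimal sketch. The nested-tuple encoding, the operation names and the numerical values below are assumptions made for illustration only; they are not the encoding used by any particular symbolic regression method. The sketch evaluates two candidate control functions that share the same parameter vector but differ in structure.

```python
import math

# Hypothetical encoding (illustration only): an expression is a nested tuple.
#   ("x", i)  -> i-th component of the state vector
#   ("q", j)  -> j-th tunable parameter
#   (op, ...) -> a unary or binary operation applied to sub-expressions
UNARY = {"neg": lambda a: -a, "tanh": math.tanh}
BINARY = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}

def evaluate(expr, x, q):
    """Recursively evaluate an expression tree for state x and parameters q."""
    kind = expr[0]
    if kind == "x":
        return x[expr[1]]
    if kind == "q":
        return q[expr[1]]
    if kind in UNARY:
        return UNARY[kind](evaluate(expr[1], x, q))
    return BINARY[kind](evaluate(expr[1], x, q), evaluate(expr[2], x, q))

# Structure A: u = q0*x0 + q1*x1 (linear feedback).
linear = ("add", ("mul", ("q", 0), ("x", 0)), ("mul", ("q", 1), ("x", 1)))
# Structure B: u = tanh(q0*x0 + q1*x1), same parameters, different structure.
saturated = ("tanh", linear)

x = [1.0, -0.5]    # example state (assumed values)
q = [-2.0, -1.0]   # example parameters (assumed values)

u_linear = evaluate(linear, x, q)        # -2*1 + (-1)*(-0.5) = -1.5
u_saturated = evaluate(saturated, x, q)  # tanh(-1.5)

# A parameter search changes only q; the tree stays the same.
u_retuned = evaluate(linear, x, [-1.0, 0.0])  # -1*1 + 0*(-0.5) = -1.0
```

In this picture, a parameter search moves only within `q`, whereas a structural search also replaces the tree itself, as in the step from `linear` to `saturated`.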

**Citation:** Diveev, A.; Konstantinov, S.; Shmalko, E.; Dong, G. Machine Learning Control Based on Approximation of Optimal Trajectories. *Mathematics* **2021**, *9*, 265. https://doi.org/10.3390/math9030265

Academic Editor: Mikhail Posypkin; Received: 30 December 2020; Accepted: 26 January 2021; Published: 29 January 2021

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The search for the structure of the control function in the control synthesis problem is called machine learning control [1]. This is a new technology in control system development, and no rigorous mathematical formulation defining the approach has yet been proposed. In this paper, we propose a mathematical formalization of the machine learning problem (Section 2) and, on the basis of the proposed definitions, single out a special area of machine learning: machine learning control (Section 3).

One of the main problems of machine learning control is the control synthesis problem. The paper first presents the general mathematical formulation of this problem and then proposes its numerical formulation, since, according to the methodology of machine learning control, the synthesis problem must be solved numerically using symbolic regression methods.

In Section 4, we present our approach to machine learning control, based on the approximation of optimal trajectories. As in any learning technique, a training set must first be created to show the learning object what is required of it. For this purpose, the optimal control problem is first solved for several different initial conditions, with the same quality criterion as in the synthesis problem. The optimal trajectories obtained serve as templates for learning: they show what form the plots of the variables should take in the solution of the control synthesis problem and what values of the functional these solutions should give. The optimal trajectories for the different initial conditions are then approximated by a numerical symbolic regression method. The proposed approach is demonstrated in a computational example of the general synthesis of optimal control for a spacecraft landing on the surface of the Moon (Section 5).
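The two stages described above can be sketched on a toy problem. Everything in this sketch is an assumption made for illustration and is not the spacecraft example of Section 5. The plant is the scalar system $\dot{x} = u$ with the criterion $J = \int_0^\infty (x^2 + u^2)\,dt$, whose optimal trajectories are known in closed form, $x(t) = x_0 e^{-t}$. A small hand-written list of candidate control functions stands in for the symbolic regression search, and only the trajectory mismatch is compared, not the values of the functional.

```python
import math

DT, STEPS = 0.01, 300  # Euler step and horizon (3 time units); assumed values

def reference_trajectory(x0):
    """Stage 1 (toy stand-in): optimal trajectory x(t) = x0*exp(-t) of the
    assumed problem x' = u with J = integral of (x^2 + u^2) dt."""
    return [x0 * math.exp(-k * DT) for k in range(STEPS + 1)]

def simulate(control, x0):
    """Integrate the closed-loop system x' = control(x) with the Euler method."""
    xs = [x0]
    for _ in range(STEPS):
        xs.append(xs[-1] + DT * control(xs[-1]))
    return xs

def approximation_error(control, training_set):
    """Sum over the training set of the maximum deviation from the template."""
    total = 0.0
    for x0, ref in training_set:
        sim = simulate(control, x0)
        total += max(abs(a - b) for a, b in zip(sim, ref))
    return total

# Training set: optimal trajectories from several initial conditions.
initial_conditions = [-2.0, -1.0, 1.0, 2.0]
training_set = [(x0, reference_trajectory(x0)) for x0 in initial_conditions]

# Stage 2 (toy stand-in): a fixed candidate list replaces the structural search.
candidates = {
    "-x": lambda x: -x,
    "-2*x": lambda x: -2.0 * x,
    "-x**3": lambda x: -x ** 3,
    "-tanh(x)": lambda x: -math.tanh(x),
}
errors = {name: approximation_error(f, training_set)
          for name, f in candidates.items()}
best = min(errors, key=errors.get)  # candidate closest to the templates
```

For this toy problem the selected candidate is the linear feedback `-x`, which coincides with the known optimal feedback law; the remaining mismatch comes from the Euler discretization. A real symbolic regression method would generate and modify the candidate expressions itself instead of choosing from a fixed list.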
