*3.1. Deep Learning (DL)-Powered UAM Security*

To better detect anomalous behaviors (e.g., aircraft route anomalies) and to continuously collect high-resolution cyber-attack information across avionics flight data, we have designed and developed DL-based cybersecurity monitoring techniques against cyber threats for UAM situation awareness (SAW). The developed LightMAN, with cognitive-based decision support, is not intended to replace human interaction and decision-making; rather, it is meant to support the operator by combining data, rapidly identifying potential threats for a pre-planned mission, and providing timely recommended actions.

Learning directly from high-dimensional sensory inputs is a long-standing challenge. Our objective is to develop machine learning (ML)-based anomaly detection (MLAD) and reinforcement learning (RL) artificial agents that can achieve a good level of performance and generality on diagnostics and prognostics. Similar to a human operator, the goal for the agents is to learn strategies that lead to the greatest long-term rewards. Formally, MLAD can be described as a Markov decision process (MDP), which consists of a set of states, $S$, plus a distribution of starting states, $P(s_0)$; a set of actions, $A$; transition dynamics, $T(s_{t+1} \mid s_t, a_t)$, that map a state-action pair at time $t$ to the distribution of states at time $t+1$; a reward function, $R(s_t, a_t, s_{t+1})$; and a discount factor, $\delta \in [0, 1]$, where smaller values place more emphasis on immediate rewards. It is assumed that an agent interacts with an environment, S, in a sequence of actions, observations, and rewards. At each time-step, the agent selects an action, $a_t \in A$, $A = \{1, \ldots, K\}$, which is passed to the environment and modifies its internal state and the corresponding reward [32]. In general, S may be stochastic. The system's internal state is not observable to the agent most of the time; instead, the agent observes various target features of interest from the environment, such as the signal features. It receives a reward $R$ representing the change in overall system performance.
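The role of the discount factor $\delta$ can be illustrated with a short Python sketch. It is only an illustration of the standard discounted return, $\sum_k \delta^k r_{t+k}$, and is not taken from the LightMAN implementation; the function name `discounted_return` and the reward values are hypothetical.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], delta: float) -> float:
    """Return sum_k delta**k * rewards[k] for a finite reward sequence.

    `delta` is the discount factor from the MDP definition; it must lie
    in [0, 1]. Smaller values weight immediate rewards more heavily.
    """
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must be in [0, 1]")
    total = 0.0
    # Accumulate from the last reward backwards: G_t = r_t + delta * G_{t+1}.
    for reward in reversed(rewards):
        total = reward + delta * total
    return total


# Hypothetical reward sequence: one unit of reward at each of three steps.
rewards = [1.0, 1.0, 1.0]
myopic = discounted_return(rewards, 0.0)      # only the first reward counts
balanced = discounted_return(rewards, 0.5)    # 1 + 0.5 + 0.25
farsighted = discounted_return(rewards, 1.0)  # all rewards count equally
```

With $\delta = 0$ the agent values only the immediate reward, whereas with $\delta = 1$ every future reward counts as much as the current one.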

Based on the MLAD-RL strategy, we developed an automated monitoring mechanism for system-level source analytics. The monitoring data are defined as a set of metrics (e.g., route latitude/longitude, transmission delay, traffic buffer queue length, etc.) on each UAM edge and associated applications and processes. Given a large number of features, LightMAN applies feature extraction and reduction techniques to the collected log data to select a set of the most critical features, and implements deep learning-based detection schemes for identifying anomalous statuses. The general steps of the proposed anomaly monitoring technique are as follows: (i) *Data Collection*: The relevant sensory data collected across the system are assembled into a set of feature matrices. We define a feature as an individually measurable variable of the node being monitored (e.g., data frames, MAVLink messages, command and control (C2) mission logs, controller area network (CAN) buses, etc.); (ii) *Feature Extraction*: To effectively deal with high-dimensional data, we implement feature extraction techniques via named entity recognition (NER) [33] and the vector space model (VSM), which can reduce data dimensionality and improve analysis by removing inherent data dependency; (iii) *Deep Learning-Based Detection*: LightMAN applies DL techniques (e.g., L-CNN, RNN/LSTM, etc.) to characterize the dynamic state of the monitored system. With the trained model in place, the operator can conduct the detection and classification of potential attacks.
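The vector space model used in step (ii) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the log messages are invented, all function names are hypothetical, and the scoring step is a simple cosine distance to the centroid of normal traffic, which stands in for the DL detectors of step (iii) and is not the method used in LightMAN.

```python
import math
from collections import Counter


def build_vocabulary(messages):
    """Map each distinct token in the training messages to a vector index."""
    tokens = sorted({tok for msg in messages for tok in msg.lower().split()})
    return {tok: idx for idx, tok in enumerate(tokens)}


def vectorize(message, vocab):
    """Term-count vector of a message; tokens outside the vocabulary are ignored."""
    vec = [0.0] * len(vocab)
    for tok, count in Counter(message.lower().split()).items():
        if tok in vocab:
            vec[vocab[tok]] = float(count)
    return vec


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical log lines recorded during normal operation.
normal_logs = [
    "heartbeat ok mode auto",
    "heartbeat ok mode auto",
    "gps fix ok mode auto",
]
vocab = build_vocabulary(normal_logs)
vectors = [vectorize(msg, vocab) for msg in normal_logs]
# Centroid of the normal vectors, computed column by column.
centroid = [sum(column) / len(vectors) for column in zip(*vectors)]


def anomaly_score(message):
    """1 - cosine similarity to the normal centroid; higher means more unusual."""
    return 1.0 - cosine_similarity(vectorize(message, vocab), centroid)


typical = anomaly_score("heartbeat ok mode auto")
unusual = anomaly_score("mode override command rejected")
```

A message that shares most of its tokens with normal traffic scores close to 0, while a message with mostly unseen tokens scores close to 1. In a full pipeline, the vectors would instead be passed to a trained classifier.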

As shown in Figure 2, the detection process consists of two main steps: the training process and the detecting process. In the training process, the collected log data are converted to a uniform data format for the learning process. We then train the classifier model on both normal and abnormal system states. In the detecting (online monitoring) process, LightMAN monitoring tools collect real-time flight data, and the processed traffic data are sent to the learned classifier for anomaly detection. The effectiveness of the monitoring schemes is characterized by the true positive rate, false positive rate, monitoring time, overhead, etc.
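The true positive rate and false positive rate mentioned above can be computed from labeled detections as in the following sketch. The function name `detection_rates` and the example labels are hypothetical; a label of 1 is assumed to denote an anomalous sample and 0 a normal one.

```python
def detection_rates(y_true, y_pred):
    """Return (true positive rate, false positive rate) for binary labels.

    A label of 1 marks an anomalous sample, 0 a normal one. A rate is
    reported as 0.0 when its denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # anomalies caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # anomalies missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal, not flagged
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Hypothetical ground truth and classifier output for five samples.
y_true = [1, 1, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0]
tpr, fpr = detection_rates(y_true, y_pred)  # tpr = 1/2, fpr = 1/3
```

Here one of the two anomalies is detected and one of the three normal samples raises a false alarm.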

**Figure 2.** ML/DL Learning Process for UAM Monitoring.
