*3.1. Data Sample Generation*

The first step in training such learning-aided models is data generation. To this end, random operating points are sampled and simulated under prior distributions of the power system. For simplicity, we denote the input features and the target features as *X* and *Y*, respectively. *X* covers almost all variables that SCADA can measure, while *Y* is the TS margin index. Equation (7) details the data structure:

$$\begin{array}{c} \mathbf{X} = \{ \mathbf{P}_{\mathbf{G}}, \mathbf{V}_{\mathbf{G}}, \mathbf{P}_{\mathbf{D}}, \mathbf{Q}_{\mathbf{D}}, \mathbf{V}_{\mathbf{b}} \}, \quad \mathbf{G} \in \mathbf{S}_{\mathbf{G}} \cup \mathbf{S}_{\mathbf{W}}, \; \mathbf{D} \in \mathbf{S}_{\mathbf{D}} \\ \mathbf{Y} = \{ \Gamma_{\mathbf{c}} \}, \quad \mathbf{c} \in \mathbf{S}_{\mathbf{c}} \end{array} \tag{7}$$

where *P<sub>G</sub>* and *P<sub>D</sub>* are the vectors of active generator output and active load, respectively; *Q<sub>D</sub>* is the vector of reactive load; *V<sub>G</sub>* represents the voltages of the buses where generators are located; and *V<sub>b</sub>* denotes the voltages of the other buses. Γ*<sub>c</sub>* represents the TS margin of the corresponding operating point under contingency *c*. In this paper, the TS index (TSI) is adopted to quantify the TS margin, and is formulated as:

$$\begin{array}{l} \text{TSI} = 100 \times \left( \delta_{\text{thr}} - \left| \delta_{\text{max}} \right| \right) / \left( \delta_{\text{thr}} + \left| \delta_{\text{max}} \right| \right), \\ \delta_{\text{max}} = \max \left\{ \left| \delta_{\text{G}i} - \delta_{\text{G}j} \right| \right\}, \quad \text{G}i, \text{G}j \in \text{S}_{\text{G}} \end{array} \tag{8}$$

where *δ*<sub>max</sub> is the maximum power angle difference between any two generators during the post-fault period.
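As an illustration of Equation (8), the following minimal Python sketch computes the TSI from post-fault rotor angle trajectories. The function names, the dictionary layout of the trajectories, and the default threshold of 360 degrees are assumptions made for this example only; the text above does not fix a value for *δ*<sub>thr</sub>.

```python
from itertools import combinations


def max_angle_separation(trajectories):
    """Largest |delta_Gi - delta_Gj| over all generator pairs and time steps.

    trajectories: dict mapping a generator name to a list of rotor angles
    (degrees), all sampled at the same post-fault time steps.
    """
    names = list(trajectories)
    n_steps = len(trajectories[names[0]])
    delta_max = 0.0
    for t in range(n_steps):
        for gi, gj in combinations(names, 2):
            diff = abs(trajectories[gi][t] - trajectories[gj][t])
            delta_max = max(delta_max, diff)
    return delta_max


def tsi(delta_max, delta_thr=360.0):
    """Equation (8). delta_thr = 360 degrees is an assumed default."""
    return 100.0 * (delta_thr - abs(delta_max)) / (delta_thr + abs(delta_max))


# Hypothetical three-machine trajectories (degrees), for illustration only.
example = {
    "G1": [10.0, 40.0, 90.0],
    "G2": [5.0, 20.0, 30.0],
    "G3": [0.0, -10.0, -30.0],
}
d_max = max_angle_separation(example)  # 120.0 (G1 vs. G3 at the last step)
print(tsi(d_max))  # 50.0
```

With this definition, a positive TSI means the maximum angle separation stays below the threshold, and a value at or below zero means the threshold is reached or exceeded.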

We now describe how the quantities in Equation (7) are computed. To obtain *X*, we first sample the controllable variables within their prior limits and the loads from their historical distributions, as collected in Equation (9):

$$\begin{array}{l} \mathbf{X}_{\mathrm{gen}} = \{ \mathbf{X}^{1}_{\mathrm{gen}}; \dots; \mathbf{X}^{n}_{\mathrm{gen}} \} = \{ \mathbf{P}^{1}_{\mathrm{G}}, \mathbf{V}^{1}_{\mathrm{G}}; \dots; \mathbf{P}^{n}_{\mathrm{G}}, \mathbf{V}^{n}_{\mathrm{G}} \}, \\ \mathbf{X}_{\mathrm{load}} = \{ \mathbf{X}^{1}_{\mathrm{load}}; \dots; \mathbf{X}^{n}_{\mathrm{load}} \} = \{ \mathbf{P}^{1}_{\mathrm{D}}, \mathbf{Q}^{1}_{\mathrm{D}}; \dots; \mathbf{P}^{n}_{\mathrm{D}}, \mathbf{Q}^{n}_{\mathrm{D}} \} \end{array} \tag{9}$$

where *X*<sub>gen</sub> and *X*<sub>load</sub> are the control-variable and load-variable subsets of *X*, respectively, and *n* is the number of samples.

A power flow program is then run to obtain the equilibrium point of each sample and thereby determine the state variables *V<sub>b</sub>*. Notably, samples should be evenly distributed over the operational space to ensure the generalization ability of the learning-aided model. Therefore, Latin hypercube sampling (LHS) is adopted to generate samples in this paper [24].
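To make the sampling step concrete, here is a minimal pure-Python sketch of Latin hypercube sampling over box limits. It is a generic textbook form of LHS, not necessarily the variant used in [24]; the function name `latin_hypercube` and the example limits are hypothetical. Drawing loads from historical distributions would additionally require mapping the unit-interval samples through those distributions, which is not shown here.

```python
import random


def latin_hypercube(n_samples, bounds, seed=None):
    """Draw n_samples points inside the box given by bounds.

    bounds: list of (low, high) pairs, one per variable.
    Each variable's range is split into n_samples equal strata, and every
    stratum is used exactly once per variable.
    """
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One uniform draw inside each stratum of the unit interval.
        unit = [(k + rng.random()) / n_samples for k in range(n_samples)]
        # Shuffle so that strata are paired randomly across variables.
        rng.shuffle(unit)
        columns.append([low + u * (high - low) for u in unit])
    # Transpose the per-variable columns into per-sample rows.
    return [list(row) for row in zip(*columns)]


# Hypothetical limits: one generator output in MW and one voltage set-point in p.u.
samples = latin_hypercube(5, [(0.0, 100.0), (0.95, 1.05)], seed=0)
for row in samples:
    print(row)
```

Because every stratum of every variable is hit exactly once, the samples cover each variable's range evenly even for small *n*, which is the property the paragraph above relies on.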

Afterward, for each equilibrium point in *X*, we impose the disturbances in the contingency set and simulate the post-fault trajectories to compute the TS margin Γ*<sub>c</sub>*. Traversing every point in *X* yields *Y*. Supervised learning can then be used to train the learning-aided model.
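The overall traversal can be sketched as a short loop. The two callables `solve_power_flow` and `simulate_margin` are hypothetical placeholders for a power flow solver and a time-domain simulator, and discarding samples whose power flow does not converge is an assumption of this sketch rather than something stated above.

```python
def build_dataset(operating_points, contingencies, solve_power_flow, simulate_margin):
    """Assemble (X, Y) by traversing every sampled operating point.

    solve_power_flow(point) -> list of remaining bus voltages V_b, or None
        if no equilibrium is found (hypothetical interface).
    simulate_margin(point, v_b, contingency) -> TS margin for that contingency
        (hypothetical interface wrapping a time-domain simulation).
    """
    X, Y = [], []
    for point in operating_points:
        v_b = solve_power_flow(point)
        if v_b is None:
            continue  # assumption: skip samples without an equilibrium point
        # Input row: sampled control/load variables followed by state variables.
        X.append(list(point) + list(v_b))
        # Target row: one margin per contingency in the contingency set.
        Y.append([simulate_margin(point, v_b, c) for c in contingencies])
    return X, Y
```

Each row of `X` then pairs with one row of `Y`, which is the form a supervised learner expects.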
