#### 6.2.1. Simulation Model

The carbon emission rate $\delta_{sw}$ of each unit in the IEEE 300-bus system, which depends on the generator type, is summarized in Table 7. In addition, 96 different load scenarios are designed to simulate the different optimization tasks over a day for the IEEE 300-bus system, as shown in Figure 7. The optimization variables are given in Table 8.


**Table 7.** Carbon emission rate of the IEEE 300-bus system.

**Figure 7.** The load scenarios of the IEEE 300-bus system.

**Table 8.** Optimization variables of the IEEE 300-bus system.


#### 6.2.2. Comparative Analysis of Simulation Results

To evaluate the optimization capability of MCR-Q(λ) learning, all the algorithms are applied to solve the OCECF model in 10 runs. Because the number of optimization variables increases dramatically in the IEEE 300-bus system, the conventional Q and Q(λ) algorithms cannot carry out the optimization owing to the curse of dimensionality. Figure 8 compares the results of the different methods, where each value is the daily total averaged over 10 runs. The proposed MCR-Q(λ) learning significantly outperforms the other methods in terms of the total carbon flow loss, total power loss, voltage stability component and objective function. Hence, the MCR-Q(λ) learning-based OCECF can achieve low-carbon operation of the power network. In particular, the values obtained by MCR-Q(λ) learning are 2.0%, 3.4%, 45.9% and 10.3% lower, respectively, than those obtained by GSO. This verifies that MCR-Q(λ) learning performs much better than the conventional meta-heuristic algorithms as the system scale increases.

**Figure 8.** Comparison of results obtained by different methods in the IEEE 300-bus system: (**c**) voltage stability component; (**d**) objective function.
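The aggregation behind the reported comparison (a daily total over all load scenarios, averaged over the runs, then expressed as a percentage reduction relative to a baseline such as GSO) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the numbers are placeholders rather than results from this study.

```python
from statistics import mean


def daily_total(values_per_scenario):
    """Sum one metric over all load scenarios of a day (96 in the text)."""
    return sum(values_per_scenario)


def average_over_runs(runs):
    """Average the daily totals across independent runs (10 in the text)."""
    return mean([daily_total(run) for run in runs])


def relative_reduction(candidate, baseline):
    """Percentage by which `candidate` is lower than `baseline`."""
    return 100.0 * (baseline - candidate) / baseline


# Placeholder data (assumption): 2 runs x 3 scenarios, not values from the paper.
baseline_runs = [[10.0, 12.0, 8.0], [11.0, 9.0, 10.0]]
candidate_runs = [[9.0, 10.0, 8.0], [9.0, 9.0, 9.0]]

baseline_avg = average_over_runs(baseline_runs)    # 30.0
candidate_avg = average_over_runs(candidate_runs)  # 27.0
reduction = relative_reduction(candidate_avg, baseline_avg)
print(f"{reduction:.1f}% lower than the baseline")  # 10.0% lower than the baseline
```

Each of the four reported percentages would correspond to one call of `relative_reduction` on a different metric, with the GSO average as the baseline.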

In addition, Table 9 gives the distribution statistics of the objective function under the different algorithms in the IEEE 300-bus system, where each value is computed from the daily totals of the objective function over 10 runs; the best, worst, variance and standard deviation (Std. Dev.) are calculated to evaluate the convergence stability [51]. Table 9 shows that MCR-Q(λ) learning has the highest convergence stability among all the methods, with the smallest variance and standard deviation of the objective function.

**Table 9.** Distribution statistics of the objective function under different algorithms in the IEEE 300-bus system in 10 runs.
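The statistics listed in Table 9 can be computed from the per-run daily totals of the objective function as sketched below. Two assumptions are made here that the text does not state: the objective is minimized, so the best value is the smallest one, and the variance is the sample variance (divisor n − 1). The function name and the input values are hypothetical placeholders.

```python
from statistics import stdev, variance


def distribution_stats(daily_objective_per_run):
    """Best, worst, variance and standard deviation across runs.

    Assumes a minimization problem (best = smallest value) and the
    sample variance (divisor n - 1); both are assumptions of this sketch.
    """
    return {
        "best": min(daily_objective_per_run),
        "worst": max(daily_objective_per_run),
        "variance": variance(daily_objective_per_run),
        "std_dev": stdev(daily_objective_per_run),
    }


# Placeholder daily totals from 3 runs (not values from Table 9).
stats = distribution_stats([10.0, 12.0, 14.0])
print(stats)
```

A smaller variance and standard deviation indicate that the algorithm converges to similar objective values from run to run, which is the sense in which convergence stability is compared here.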

