Evolutionary Game Analysis of the Regulatory Strategy of Third-Party Environmental Pollution Management

Guolong Wei; Guoliang Li; Xue Sun

doi:10.3390/su142215449

Abstract

The “multiple-interaction” model of third-party management for environmental pollution has gradually replaced the traditional “command-and-control” model and become a new trend in governance. However, this new governance system suffers from insufficient regulatory capacity, an overly simple reward and punishment mechanism, and frequent rent-seeking behavior, and these governance problems are becoming increasingly prominent. Based on the premise of limited rationality and considering the possible rent-seeking behavior of pollution control enterprises and professional environmental testing institutions, this paper constructs a tripartite evolutionary game model with pollution control enterprises, professional environmental testing institutions, and government regulatory departments as the main bodies. The evolutionarily stable strategy of the three-party game is analyzed according to Lyapunov’s theory, and the system is optimized through a computational experimental simulation in MATLAB. The research results show that the government can effectively regulate the behavior of pollution control enterprises and professional environmental testing institutions by appropriately increasing the rewards and punishments, but excessive rewards are not conducive to the government regulators’ own performance; the existing static reward and punishment mechanism of the government regulators fails to reward and punish the behavior of governance subjects in real time, and the linear dynamic punishment mechanism greatly increases the probability of rent-seeking behavior, so neither is a stable control strategy for the system. The non-linear dynamic reward and punishment mechanism takes into account both dynamic incentives and dynamic constraints and lets the system reach the desired evolutionarily stable strategy, i.e., pollution control enterprises follow regulations, professional environmental testing agencies refuse to seek rent, and the government actively regulates the system as the final evolutionary direction. The research findings and management implications provide countermeasures and suggestions for government regulators to improve the regulatory mechanism for the third-party management of environmental pollution.

Keywords:

environmental pollution; third-party governance; rent-seeking behavior; evolutionary game; system optimization

1. Introduction

The environmental problems arising from economic development are widely related to the government and the public, and it is difficult to balance the efficiency and effectiveness of traditional environmental pollution management approaches. For this reason, the state has tried a new governance idea in the environmental field, namely the third-party governance of environmental pollution, which means that independent third-party pollution control enterprises other than the sewage enterprises and government regulatory departments undertake the task of environmental pollution control [1]. In 2013, the third-party management model of environmental pollution was first proposed in the Decision of the Central Committee of the Communist Party of China on Several Major Issues of Comprehensively Deepening Reform, and in 2015 the State Council issued the Opinions on Promoting Third-Party Management of Environmental Pollution, marking the full implementation of third-party management for environmental pollution from a pilot reform to the national level [2]. The implementation of this system in China’s environmental pollution sector has achieved certain results, but the existing problems are also increasingly evident, including:

Government departments have “sometimes tight, sometimes loose” regulatory behavior, and the regulatory capacity needs to be improved [3];
The reward and punishment system of the government regulatory department in the third-party environmental pollution management system is not yet perfect, and the existing reward and punishment system is too singular to mobilize the enthusiasm of the participants [4];
Some pollution control enterprises and professional environmental testing institutions of the third-party environmental pollution management system take advantage of information asymmetry to adopt irregular governance and rent-seeking behaviors, which seriously undermine the effectiveness of the governance system [1].

Therefore, considering the demands of governance practice, there is an urgent need to explore a set of effective regulatory strategies for the third-party governance of environmental pollution to stimulate the compliance behavior of pollution treatment participants and punish and restrain their violations, so as to provide support for the compliance governance of environmental pollution projects and the effectiveness of the governance system.

At present, the research on the regulation of the third-party governance of environmental pollution focuses on two aspects: one raises regulatory issues at the institutional level, and the other analyzes the behavior of governance subjects through quantitative models. Most scholars raise problems with regulation at the institutional level; for example, Tang [3] points out that the key to unblocking the system of third-party governance is to regulate the behavior of governance subjects by using incentive regulation theory. The literature [5,6] indicates that the regulatory system of third-party governance for environmental pollution in China has not been perfected, and third-party governance needs to be incorporated into the legal track. Very few scholars have analyzed the behaviors of pollution treatment enterprises, pollution emission enterprises, the public, and the government by means of quantitative research; for example, Du [7] constructed a two-party evolutionary game model of the government and third-party pollution treatment enterprises, indicating that the evolution of the government and pollution treatment enterprises at this stage depends on the relative payoffs of the various strategies. Chu [8] constructed a tripartite game model of the public, local government, and central government based on the prisoner’s dilemma between public participation and local government, and the study showed that the public regulation of environmental impacts by local governments could be an alternative to central government regulation. In addition, other studies [9,10] also applied quantitative analysis to study the strategy choices of governance subjects.
The research results of regulatory strategies in other fields are quite fruitful, including the use of game theory to analyze the regulatory strategies of the different evolutionary paths of each subject and the benefits of public participation in regulation, which provide a lot of suggestions for high-quality regulation by government regulators. For example, Zhu and He [11] studied the supervision strategy of regulating the quality of online goods. He [12] studied the regulation of green product quality. Kong [13] studied the regulation of product quality in industrial clusters. It can be seen that the above-mentioned literature on third-party governance focuses on analyzing the strategy choices of governance subjects, with less research on regulatory strategies and especially a lack of quantitative research on regulating the behavior of governance subjects through reward and punishment mechanisms.

On the other hand, there is less research on professional environmental testing institutions. Due to the information asymmetry between government regulators and pollution treatment enterprises, government regulators commission professional environmental testing agencies to test the effectiveness of pollution treatment enterprises. However, it is difficult to prevent pollution treatment enterprises from seeking rent from professional environmental testing agencies. Pollution control enterprises that treat pollution in compliance with the rules generally face high governance costs, difficulties in technological innovation, and other problems, whereas false pollution control saves governance costs and leaves room for profit. Under the constraint of “sometimes tight, sometimes loose” government regulations, pollution control enterprises have a tendency to seek rent from professional environmental testing agencies to obtain governance approval [1]. Driven by huge rent-seeking interests, professional environmental testing institutions are also at risk of forming rent-seeking intentions [14]. It can also be seen from the literature [1,8,14,15,16] that research on the third-party governance of environmental pollution mostly takes government regulators, sewage enterprises, pollution treatment enterprises, and the public as its subjects.

In recent years, behavioral operations research has been widely used in various fields [17,18,19], and game theory is a standard tool for analyzing strategy choices. If we disregard the assumption of “perfect rationality” in traditional game theory and consider the whole dynamic system of environmental governance, the strategy choices of government regulators, pollution control enterprises, and other subjects with limited rationality will no longer be a negligible factor [20]. Evolutionary game theory takes into account not only the limited rationality and incomplete information state of the subject, but also the dependence and interaction of the subject’s strategy choice over time, which is actually more relevant than the traditional game theory. Therefore, in the state of limited rationality and incomplete information of the governance subjects, the evolutionary game theory is an effective method to study the regulatory strategy of environmental pollution third-party governance.

The research in this paper differs from the aforementioned scholars’ research in three aspects. First, it considers the problem of false pollution control through collusion between pollution control enterprises and professional environmental testing institutions. Second, under the regulatory incentive and constraint mechanism, it analyzes the performance behavior of each subject through evolution driven by the self-interest of decision makers, so that they spontaneously adopt “compliance” behavior as a long-term evolutionarily stable strategy, rather than only restraining the behavior of subjects through laws and regulations. Third, for system optimization, it builds on the dynamic performance payment mechanisms studied by other scholars to propose improvement measures, introducing a non-linear dynamic reward and punishment mechanism to control the stability of the evolutionary system and providing suggestions and countermeasures to improve the governmental regulatory mechanism for the third-party treatment of environmental pollution.

2. Model Assumptions and Construction

2.1. Basic Assumptions

The above analysis shows that pollution control enterprises, seeking to maximize their own interests in environmental treatment, have a motive for false pollution control [1]. To pass the tests, such enterprises must collude with professional environmental testing agencies, and professional environmental testing agencies, although constrained by government regulation, may intend to seek rent when driven by certain interests [16]. As the regulator, the government regulatory department is the implementer of the incentive and constraint mechanism. “Incentive” means that the government regulator motivates the pollution control enterprises and professional environmental testing institutions to adopt compliance behavior by means of pollution control subsidies, technological innovation, and monetary rewards, while “constraint” means that the government regulator restrains the illegal pollution control behavior of the pollution control enterprise and the rent-seeking behavior of the professional environmental testing institution by means of financial penalties and reputation losses. Active supervision by the governmental supervisory department costs a certain amount of manpower, financial resources, and time, and rewards and punishments are applied according to the performance of the pollution control enterprises and professional environmental testing agencies. Negative supervision by the government may allow collusion between the pollution control enterprises and professional environmental testing agencies, for which the regulatory department will be held accountable by the higher governmental supervisory department. The interests of the three parties under the limited rationality of the subjects are different, but there is a certain interdependence.

Hypothesis 1.

There are three parties involved in the game, i.e., pollution control enterprises, professional environmental testing agencies, and government regulators, and the information among the three parties is incomplete.

Hypothesis 2.

Suppose the probability of the pollution treatment enterprises adopting the strategy of compliant pollution treatment is x, then the probability of adopting the strategy of false pollution treatment is 1 − x; the probability of professional environmental testing institutions adopting the strategy of rejecting rent-seeking is y, then the probability of adopting the strategy of intentional rent-seeking is 1 − y; and the probability of government regulators adopting the strategy of positive regulation is z, then the probability of adopting the strategy of negative regulation is 1 − z.

Hypothesis 3.

The three subjects of the game are finitely rational, aiming to maximize their own interests, with no predictive ability beforehand, and have the ability to learn and imitate afterwards.

Hypothesis 4.

The revenue obtained by the pollution treatment enterprise after completing the environmental treatment of the project is

R_{E}

, the cost of the pollution treatment enterprise following the rules is

C_{D}

, and the cost of the false pollution treatment is

C_{F}

,

C_{D} > C_{F}

. When the pollution treatment enterprise follows the rules, it passes the tests of the professional environmental testing agency. When the pollution treatment enterprise falsely treats the pollution, it bribes the professional environmental testing agency to pass the test, and the rent-seeking cost of the pollution treatment enterprise is recorded as

T_{S}

,

T_{S} < C_{D} - C_{F}

. When the pollution treatment enterprise falsely treats the pollution, its speculative behavior will generate speculative costs

C_{H}

, and such costs include falsifying environmental quality qualification certificates, false propaganda, and other operational management costs.

Hypothesis 5.

After the environmental treatment of the project is completed, the pollution control enterprise must pass the inspection of the professional environmental testing agency; if the project fails the inspection, the environmental protection project of the pollution treatment enterprise cannot be accepted. The revenue of the professional environmental testing agency from testing the project is

L_{K}

. When the pollution treatment enterprise falsely treats the pollution, the project fails the test if the professional environmental testing agency refuses to seek rent; if the professional environmental testing agency intends to seek rent, it colludes with the pollution treatment enterprise, and the falsely treated project passes the test. The speculation cost of the professional environmental testing agency is

C_{G}

, with

C_{G} < T_{S}

, and mainly includes the costs of falsifying test records and issuing false reports.

Hypothesis 6.

When the government regulators actively regulate, if the pollution treatment enterprises treat the pollution falsely, they will be punished

P_{A}

, and professional environmental testing institutions with the intention of rent-seeking will be punished

Q_{B}

; if the pollution treatment enterprises follow the rules, they will be rewarded

M_{A}

, and professional environmental testing institutions will be rewarded

M_{B}

for refusing to rent-seek. When the government regulator negatively regulates, it cannot get information about the behavior of the pollution control enterprises and professional environmental testing institutions. We assume that the cost paid by the government regulators for active regulation is

C_{Z}

.

Hypothesis 7.

When the pollution control enterprises follow the rules, this is conducive to environmental and ecological stability and stable social and economic development, which brings certain economic benefits

N_{U}

to government departments. When the pollution treatment enterprises and professional environmental testing agencies conduct rent-seeking behavior, unqualified environmental management projects flow into society, resulting in environmental pollution and the loss of people’s property, and the cost of government regulators to maintain social stability and remediate environmental pollution is

O_{V}

. When the government regulatory department supervises negatively, it is unable to know the behavioral choices of the pollution treatment enterprises and professional environmental testing agencies, which leads to substandard environmental management projects in society, and the higher government departments will hold the government regulatory department accountable, with an accountability penalty of

C_{J}

.

2.2. Model Building

Based on the model assumptions, a three-party game payment matrix can be established for pollution treatment enterprises, professional environmental testing agencies, and government regulators, as shown in Table 1.

Table 1. Payment matrix of the game between pollution control enterprises, professional environmental testing agencies, and government regulators.

According to the basic assumptions and Table 1, we can construct a logical relationship diagram between pollution treatment enterprises, professional environmental testing institutions, and government regulatory departments, as shown in Figure 1.

Figure 1. A logic diagram of pollution control enterprises, professional environmental testing institutions, and government regulatory departments.
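
As an illustration of how Hypotheses 4–7 translate into payoffs, the following minimal Python sketch returns the payoff of each party for one pure-strategy profile. The entries are reconstructed from the terms of Equations (1), (2), (7), (8), (13), and (14) in Section 3 rather than copied from Table 1; the function names, dictionary keys, and all numerical values are illustrative assumptions and are not taken from the paper.

```python
from itertools import product

# Illustrative parameter values only (assumed, not from the paper).
# They respect C_D > C_F, T_S < C_D - C_F, and C_G < T_S.
PARAMS = {
    "R_E": 20.0, "C_D": 10.0, "C_F": 4.0, "C_H": 4.0, "T_S": 3.0,
    "L_K": 5.0, "C_G": 1.0,
    "M_A": 2.0, "P_A": 4.0, "M_B": 2.0, "Q_B": 4.0, "C_Z": 2.0,
    "N_U": 8.0, "O_V": 6.0, "C_J": 5.0,
}


def payoffs(comply, reject, active, p=PARAMS):
    """Return (enterprise, agency, government) payoffs for one profile.

    comply: enterprise treats pollution in compliance with the rules
    reject: testing agency refuses rent-seeking
    active: government regulates positively
    """
    # Enterprise: terms of Equations (1) and (2).
    if comply:
        e = p["R_E"] - p["C_D"] + (p["M_A"] if active else 0.0)
    else:
        e = -p["C_F"] - p["C_H"] - (p["P_A"] if active else 0.0)
        if not reject:  # the bribed agency lets the project pass
            e += p["R_E"] - p["T_S"]

    # Testing agency: terms of Equations (7) and (8).
    if reject:
        a = p["L_K"] + (p["M_B"] if active else 0.0)
    else:
        a = p["L_K"] - p["C_G"] - (p["Q_B"] if active else 0.0)
        if not comply:  # a bribe is only paid under false treatment
            a += p["T_S"]

    # Government: terms of Equations (13) and (14).
    if active:
        g = -p["C_Z"]
        g += (p["N_U"] - p["M_A"]) if comply else p["P_A"]
        g += -p["M_B"] if reject else p["Q_B"]
        if not comply and not reject:
            g -= p["O_V"]
    elif comply:
        g = p["N_U"]
    elif reject:
        g = 0.0
    else:
        g = -(p["O_V"] + p["C_J"])
    return e, a, g


def payoff_table(p=PARAMS):
    """All eight pure-strategy profiles, keyed by (comply, reject, active)."""
    return {s: payoffs(*s, p) for s in product([True, False], repeat=3)}
```

Calling `payoff_table()` lists the eight cells of the payoff matrix for the assumed parameter values, which can be compared cell by cell with the terms of the expected-return equations.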

3. Model Analysis

3.1. Evolutionary Stability Analysis of the Strategy of Pollution Control Companies

Assuming that the expected return of a pollution treatment company to adopt a rule-based treatment strategy is

U_{I 11}

, the expected return to adopt a false treatment strategy is

U_{I 12}

, and the expected average return to adopt a mixed strategy is

{\bar{U}}_{I}

, then

U_{I 11}

,

U_{I 12}

, and

{\bar{U}}_{I}

are expressed as in Equations (1)–(3), respectively:

U_{I 11} = y z (R_{E} - C_{D} + M_{A}) + y (1 - z) (R_{E} - C_{D}) + z (1 - y) (R_{E} - C_{D} + M_{A}) + (1 - y) (1 - z) (R_{E} - C_{D})

(1)

U_{I 12} = y z (- C_{F} - C_{H} - P_{A}) + y (1 - z) (- C_{F} - C_{H}) + z (1 - y) (R_{E} - C_{F} - C_{H} - P_{A} - T_{S}) + (1 - y) (1 - z) (R_{E} - C_{F} - C_{H} - T_{S})

(2)

{\bar{U}}_{I} = x U_{I 11} + (1 - x) U_{I 12}

(3)

From Equations (1)–(3), the replication dynamics equation for the adoption of the rule-based pollution control strategy by the pollution treatment enterprises can be obtained as Equation (4):

F (x) = \frac{d x}{d t} = x (U_{I 11} - {\bar{U}}_{I}) = x (1 - x) [y (R_{E} - T_{S}) + z (M_{A} + P_{A}) + C_{F} + C_{H} + T_{S} - C_{D}]

(4)
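
The simplification from Equations (1)–(3) to the closed form in Equation (4) can be checked numerically. The sketch below is only a sanity check under assumed inputs: it draws random parameter values (hypothetical, not from the paper) and compares x(U_{I 11} − \bar{U}_{I}) with the closed-form right-hand side; the function names are illustrative.

```python
import random


def u_comply(y, z, p):
    """Equation (1): expected return of compliant treatment."""
    base = p["R_E"] - p["C_D"]
    return (y * z * (base + p["M_A"]) + y * (1 - z) * base
            + z * (1 - y) * (base + p["M_A"]) + (1 - y) * (1 - z) * base)


def u_false(y, z, p):
    """Equation (2): expected return of false treatment."""
    base = -p["C_F"] - p["C_H"]
    return (y * z * (base - p["P_A"]) + y * (1 - z) * base
            + z * (1 - y) * (p["R_E"] + base - p["P_A"] - p["T_S"])
            + (1 - y) * (1 - z) * (p["R_E"] + base - p["T_S"]))


def f_from_payoffs(x, y, z, p):
    """x * (U_I11 - average return), using Equation (3) for the average."""
    u1, u2 = u_comply(y, z, p), u_false(y, z, p)
    u_bar = x * u1 + (1 - x) * u2
    return x * (u1 - u_bar)


def f_closed_form(x, y, z, p):
    """Right-hand side of Equation (4)."""
    bracket = (y * (p["R_E"] - p["T_S"]) + z * (p["M_A"] + p["P_A"])
               + p["C_F"] + p["C_H"] + p["T_S"] - p["C_D"])
    return x * (1 - x) * bracket


def max_gap(trials=1000, seed=0):
    """Largest absolute difference between the two forms over random draws."""
    rng = random.Random(seed)
    names = ["R_E", "C_D", "C_F", "C_H", "T_S", "M_A", "P_A"]
    worst = 0.0
    for _ in range(trials):
        p = {k: rng.uniform(0.0, 10.0) for k in names}
        x, y, z = rng.random(), rng.random(), rng.random()
        gap = abs(f_from_payoffs(x, y, z, p) - f_closed_form(x, y, z, p))
        worst = max(worst, gap)
    return worst
```

A value of `max_gap()` at the level of floating-point rounding indicates that the two expressions agree for the sampled points.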

Differentiating Equation (4) with respect to x gives Equation (5):

\frac{d F (x)}{d x} = (1 - 2 x) [y (R_{E} - T_{S}) + z (M_{A} + P_{A}) + C_{F} + C_{H} + T_{S} - C_{D}]

(5)

Let

F (x) = 0

, which gives the two equilibrium points of the replicator dynamic equation, i.e.,

x = 0

and

x = 1

. Assuming

R_{E} > T_{S}

, so that the bracketed term in Equation (4) increases with y, the evolutionary stability of the replicator dynamic system can be discussed in the following three cases: when

y = y_{0} = \frac{C_{D} - C_{F} - C_{H} - T_{S} - z (M_{A} + P_{A})}{(R_{E} - T_{S})}

(

0 < y_{0} < 1

),

F (x)

≡ 0, at which time the system reaches a stable state no matter what value

x

is taken. When

y < y_{0} = \frac{C_{D} - C_{F} - C_{H} - T_{S} - z (M_{A} + P_{A})}{(R_{E} - T_{S})}

,

{\frac{d F (x)}{d x} |}_{x = 0} < 0

,

{\frac{d F (x)}{d x} |}_{x = 1} > 0

, at this time

x = 0

is the evolutionarily stable strategy and

x = 1

is the unstable strategy. When

y > y_{0} = \frac{C_{D} - C_{F} - C_{H} - T_{S} - z (M_{A} + P_{A})}{(R_{E} - T_{S})}

,

{\frac{d F (x)}{d x} |}_{x = 0} > 0

,

{\frac{d F (x)}{d x} |}_{x = 1} < 0

, at this time

x = 0

is the unstable strategy and

x = 1

is the evolutionarily stable strategy.
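
The three cases above can be summarized as a small decision rule. The following Python sketch computes the threshold y_0 and reports which pure strategy of the pollution control enterprise is evolutionarily stable for given y and z. It assumes R_E > T_S, as the sign analysis requires; the function names and the example values are hypothetical.

```python
from typing import Optional


def y_threshold(z, p):
    """Threshold y0 at which the bracket in Equation (4) vanishes."""
    num = (p["C_D"] - p["C_F"] - p["C_H"] - p["T_S"]
           - z * (p["M_A"] + p["P_A"]))
    return num / (p["R_E"] - p["T_S"])


def enterprise_ess(y, z, p, tol=1e-12) -> Optional[int]:
    """Return 1 if x = 1 is the ESS, 0 if x = 0 is, None if F(x) is zero.

    Assumes R_E > T_S, so the bracket in Equation (4) increases with y.
    """
    y0 = y_threshold(z, p)
    if abs(y - y0) <= tol:
        return None  # every x is a rest point
    return 1 if y > y0 else 0


# Assumed example values (not from the paper).
EXAMPLE = {"R_E": 12.0, "C_D": 10.0, "C_F": 3.0, "C_H": 1.0,
           "T_S": 2.0, "M_A": 1.0, "P_A": 2.0}
```

With the assumed values, `y_threshold(0.0, EXAMPLE)` is 0.4, so `enterprise_ess(0.2, 0.0, EXAMPLE)` returns 0 (false treatment) and `enterprise_ess(0.6, 0.0, EXAMPLE)` returns 1 (compliant treatment).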

The evolutionary phase diagram of the strategy of pollution control companies is shown in Figure 2.

Figure 2. Phase diagram of the evolution of the strategy of pollution control enterprises.

Let

Ω = \{N (x, y, z) \mid 0 \leq x \leq 1, 0 \leq y \leq 1, 0 \leq z \leq 1\}

. From Figure 2, we can see that the surface

y = y_{0} = \frac{C_{D} - C_{F} - C_{H} - T_{S} - z (M_{A} + P_{A})}{(R_{E} - T_{S})}

divides

Ω

into two spaces

A_{1}

and

A_{2}

. When the initial region falls in

A_{1}

, the strategy of the pollution treatment companies will evolve into false pollution treatment, and when the initial region falls into

A_{2}

, the strategy of the pollution treatment companies will evolve into rule-based pollution treatment.

Proposition 1.

The adoption of a compliance strategy by pollution treatment companies is correlated with the revenue obtained after environmental pollution treatment

R_{E}

, the cost of speculation

C_{H}

, the rent-seeking cost of pollution treatment companies

T_{S}

, the incentive and punishment of government regulators

M_{A} + P_{A}

, and the cost saved by the false pollution treatment relative to compliance

C_{D} - C_{F}

, which is positively correlated with respect to

R_{E}

,

C_{H}

,

T_{S}

,

M_{A} + P_{A}

and negatively correlated with respect to

C_{D} - C_{F}

.

Proof.

Figure 2 shows that the probability of a pollution treatment company adopting a false treatment strategy is the volume

V_{A_{1}}

of

A_{1}

and the probability of adopting a compliance strategy is the volume

V_{A_{2}}

of

A_{2}

. Here, we calculate

V_{A_{2}}

as shown in Equation (6):

V_{A_{2}} = 1 - V_{A_{1}} = 1 - \int_{0}^{1} \int_{0}^{1} \frac{C_{D} - C_{F} - C_{H} - T_{S} - z (M_{A} + P_{A})}{(R_{E} - T_{S})} d z d x = 1 - \frac{2 (C_{D} - C_{F} - C_{H} - T_{S}) - (M_{A} + P_{A})}{2 (R_{E} - T_{S})}

(6)

Taking the partial derivatives of

V_{A_{2}}

with respect to each parameter yields

\frac{\partial V_{A_{2}}}{\partial R_{E}} > 0

,

\frac{\partial V_{A_{2}}}{\partial C_{H}} > 0

,

\frac{\partial V_{A_{2}}}{\partial T_{S}} > 0

,

\frac{\partial V_{A_{2}}}{\partial (M_{A} + P_{A})} > 0

,

\frac{\partial V_{A_{2}}}{\partial (C_{D} - C_{F})} < 0

. Thus, increasing

R_{E}

,

C_{H}

,

T_{S}

,

M_{A} + P_{A}

or decreasing

C_{D} - C_{F}

can make

V_{A_{2}}

increase, i.e., the probability of pollution treatment enterprises following the rules increases. □
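
The closed form in Equation (6) and the signs of its partial derivatives can be checked with a short numerical sketch. The code below compares the closed form with a midpoint-rule integration of y_0 over z and estimates each slope by central finite differences. The parameter values are assumptions chosen so that 0 < y_0 < 1 for all z in [0, 1]; they are not taken from the paper.

```python
# Assumed example values (not from the paper); y0 runs from 0.4 to 0.1.
P = {"R_E": 12.0, "C_D": 10.0, "C_F": 3.0, "C_H": 1.0,
     "T_S": 2.0, "M_A": 1.0, "P_A": 2.0}


def v_a2(p):
    """Closed form of Equation (6)."""
    num = (2 * (p["C_D"] - p["C_F"] - p["C_H"] - p["T_S"])
           - (p["M_A"] + p["P_A"]))
    return 1 - num / (2 * (p["R_E"] - p["T_S"]))


def v_a2_numeric(p, n=2000):
    """1 minus the midpoint-rule integral of y0(z) over z in [0, 1]."""
    total = 0.0
    for i in range(n):
        z = (i + 0.5) / n
        y0 = ((p["C_D"] - p["C_F"] - p["C_H"] - p["T_S"]
               - z * (p["M_A"] + p["P_A"])) / (p["R_E"] - p["T_S"]))
        total += y0
    return 1 - total / n


def slope(p, name, h=1e-6):
    """Central finite-difference estimate of dV_A2 / d(name)."""
    up, down = dict(p), dict(p)
    up[name] += h
    down[name] -= h
    return (v_a2(up) - v_a2(down)) / (2 * h)
```

For the assumed values, `v_a2(P)` evaluates to 0.75. Raising `C_D` with `C_F` held fixed corresponds to increasing the cost gap C_D − C_F, so `slope(P, "C_D")` gives the sensitivity to that gap.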

From Proposition 1, it can be seen that guaranteeing the profits of the pollution treatment enterprises after the environmental treatment and appropriately increasing the strength of the rewards and punishments from the governmental regulatory departments can effectively motivate pollution treatment enterprises to adopt compliance strategies. At the same time, government regulators can also increase the speculative costs for pollution treatment enterprises by increasing the credibility of the media’s opinions and influence, and by expanding the competitiveness of enterprises to promote the adoption of compliance pollution treatment strategies.

3.2. Evolutionary Stability Analysis of the Strategies of Professional Environmental Testing Institutions

Assuming that the expected return of the professional environmental testing institutions adopting a rent-seeking rejection strategy is

U_{R 21}

, the expected return of adopting an intentional rent-seeking strategy is

U_{R 22}

, and the expected average return of adopting a mixed strategy is

{\bar{U}}_{R}

, then

U_{R 21}

,

U_{R 22}

, and

{\bar{U}}_{R}

are expressed as in Equations (7)–(9), respectively:

U_{R 21} = x z (L_{K} + M_{B}) + x (1 - z) L_{K} + (1 - x) z (L_{K} + M_{B}) + (1 - x) (1 - z) L_{K}

(7)

U_{R 22} = x z (L_{K} - Q_{B} - C_{G}) + x (1 - z) (L_{K} - C_{G}) + (1 - x) z (L_{K} + T_{S} - Q_{B} - C_{G}) + (1 - x) (1 - z) (L_{K} + T_{S} - C_{G})

(8)

{\bar{U}}_{R} = y U_{R 21} + (1 - y) U_{R 22}

(9)

From Equations (7)–(9), the replicator dynamic equation for the rent-seeking rejection strategy adopted by a professional environmental testing organization can be obtained as Equation (10):

G (y) = \frac{d y}{d t} = y (U_{R 21} - {\bar{U}}_{R}) = y (1 - y) [z (M_{B} + Q_{B}) + x T_{S} + C_{G} - T_{S}]

(10)
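
As with Equation (4), the step from Equations (7)–(9) to Equation (10) can be verified numerically. This sketch uses randomly drawn, purely hypothetical parameter values and compares y(U_{R 21} − \bar{U}_{R}) with the closed-form right-hand side; the function names are illustrative.

```python
import random


def u_reject(x, z, p):
    """Equation (7): expected return of rejecting rent-seeking."""
    return (x * z * (p["L_K"] + p["M_B"]) + x * (1 - z) * p["L_K"]
            + (1 - x) * z * (p["L_K"] + p["M_B"])
            + (1 - x) * (1 - z) * p["L_K"])


def u_rent(x, z, p):
    """Equation (8): expected return of intentional rent-seeking."""
    return (x * z * (p["L_K"] - p["Q_B"] - p["C_G"])
            + x * (1 - z) * (p["L_K"] - p["C_G"])
            + (1 - x) * z * (p["L_K"] + p["T_S"] - p["Q_B"] - p["C_G"])
            + (1 - x) * (1 - z) * (p["L_K"] + p["T_S"] - p["C_G"]))


def g_from_payoffs(x, y, z, p):
    """y * (U_R21 - average return), with the average weighted by y."""
    u1, u2 = u_reject(x, z, p), u_rent(x, z, p)
    u_bar = y * u1 + (1 - y) * u2
    return y * (u1 - u_bar)


def g_closed_form(x, y, z, p):
    """Right-hand side of Equation (10)."""
    bracket = (z * (p["M_B"] + p["Q_B"]) + x * p["T_S"]
               + p["C_G"] - p["T_S"])
    return y * (1 - y) * bracket


def max_gap(trials=1000, seed=1):
    """Largest absolute difference between the two forms over random draws."""
    rng = random.Random(seed)
    names = ["L_K", "M_B", "Q_B", "C_G", "T_S"]
    worst = 0.0
    for _ in range(trials):
        p = {k: rng.uniform(0.0, 10.0) for k in names}
        x, y, z = rng.random(), rng.random(), rng.random()
        gap = abs(g_from_payoffs(x, y, z, p) - g_closed_form(x, y, z, p))
        worst = max(worst, gap)
    return worst
```

The test revenue L_K cancels in the difference U_{R 21} − U_{R 22}, which is why it does not appear in the closed form.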

Differentiating Equation (10) with respect to y gives Equation (11):

\frac{d G (y)}{d y} = (1 - 2 y) [z (M_{B} + Q_{B}) + x T_{S} + C_{G} - T_{S}]

(11)

Let

G (y) = 0

, which gives the two equilibrium points of the replicator dynamic equation, namely

y = 0

and

y = 1

. Therefore, the evolutionary stability of the replicator dynamic system can be discussed in the following three cases: when

z = z_{0} = \frac{T_{S} - C_{G} - x T_{S}}{(M_{B} + Q_{B})}

(

0 < z_{0} < 1

),

G (y)

≡ 0, at this time the system reaches a stable state, regardless of the value of

y

. When

z < z_{0} = \frac{T_{S} - C_{G} - x T_{S}}{(M_{B} + Q_{B})}

,

{\frac{d G (y)}{d y} |}_{y = 0} < 0

,

{\frac{d G (y)}{d y} |}_{y = 1} > 0

, at this time

y = 0

is an evolutionarily stable strategy and

y = 1

is an unstable strategy. When

z > z_{0} = \frac{T_{S} - C_{G} - x T_{S}}{(M_{B} + Q_{B})}

,

{\frac{d G (y)}{d y} |}_{y = 0} > 0

,

{\frac{d G (y)}{d y} |}_{y = 1} < 0

, at this time

y = 0

is the unstable strategy and

y = 1

is an evolutionarily stable strategy.

The evolutionary phase diagram of the professional environmental testing agency’s strategy is shown in Figure 3.

Figure 3. Phase diagram of the evolution of the strategies of professional environmental testing institutions.

From Figure 3, it can be seen that the

z = z_{0} = \frac{T_{S} - C_{G} - x T_{S}}{(M_{B} + Q_{B})}

surface divides

Ω

into two spaces,

B_{1}

and

B_{2}

. When the initial region falls in

B_{1}

, the strategy of the professional environmental testing institutions will evolve to rent-seeking rejection, and when the initial region falls in

B_{2}

, the strategy of the professional environmental testing organization will evolve to intentional rent-seeking.

Proposition 2.

The adoption of a rent-seeking rejection strategy by professional environmental testing institutions is correlated with the reward and punishment of professional environmental testing institutions by government regulators (

M_{B} + Q_{B}

), the speculative cost of professional environmental testing institutions

C_{G}

, and the rent-seeking cost of pollution treatment enterprises

T_{S}

. Specifically, it is positively correlated with respect to

M_{B} + Q_{B}

and

C_{G}

and negatively correlated with respect to

T_{S}

.

Proof.

Figure 3 shows that the probability of a professional environmental testing institution adopting a rent-seeking rejection strategy is the volume

V_{B_{1}}

of

B_{1}

, and the probability of it adopting an intentional rent-seeking strategy is the volume

V_{B_{2}}

of

B_{2}

. Here, we calculate

V_{B_{1}}

as in Equation (12).

V_{B_{1}} = 1 - V_{B_{2}} = 1 - \int_{0}^{1} \int_{0}^{\frac{T_{S} - C_{G}}{T_{S}}} \frac{T_{S} - C_{G} - x T_{S}}{M_{B} + Q_{B}} d x d y = 1 - \frac{{(C_{G} - T_{S})}^{2}}{2 T_{S} (M_{B} + Q_{B})}

(12)

Taking the partial derivatives of

V_{B_{1}}

and using

C_{G} < T_{S}

from Hypothesis 5 yields

\frac{\partial V_{B_{1}}}{\partial M_{B}} > 0

,

\frac{\partial V_{B_{1}}}{\partial Q_{B}} > 0

,

\frac{\partial V_{B_{1}}}{\partial C_{G}} > 0

,

\frac{\partial V_{B_{1}}}{\partial T_{S}} < 0

. Thus, increasing

M_{B}

,

Q_{B}

, or

C_{G}

or decreasing

T_{S}

can make

V_{B_{1}}

increase, i.e., the probability of rejecting rent-seeking by professional environmental testing institutions increases. □
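
The volume in Equation (12) can be evaluated numerically in the same way. In the sketch below, the threshold z_0(x) is clipped to [0, 1] before integrating, since z_0 is nonnegative only for x ≤ (T_S − C_G)/T_S; the closed form additionally assumes T_S − C_G ≤ M_B + Q_B so that z_0 never exceeds 1. The parameter values and function names are illustrative assumptions, not values from the paper.

```python
# Assumed example values (not from the paper), with C_G < T_S.
P = {"T_S": 3.0, "C_G": 1.0, "M_B": 2.0, "Q_B": 4.0}


def v_b1(p):
    """Closed form on the right-hand side of Equation (12)."""
    return 1 - (p["C_G"] - p["T_S"]) ** 2 / (
        2 * p["T_S"] * (p["M_B"] + p["Q_B"]))


def v_b1_numeric(p, n=20000):
    """1 minus the volume below the surface z = z0(x), z0 clipped to [0, 1]."""
    total = 0.0
    for i in range(n):
        x = (i + 0.5) / n  # midpoint rule in x; the integrand is flat in y
        z0 = (p["T_S"] - p["C_G"] - x * p["T_S"]) / (p["M_B"] + p["Q_B"])
        total += min(1.0, max(0.0, z0))
    return 1 - total / n


def slope(p, name, h=1e-6):
    """Central finite-difference estimate of dV_B1 / d(name)."""
    up, down = dict(p), dict(p)
    up[name] += h
    down[name] -= h
    return (v_b1(up) - v_b1(down)) / (2 * h)
```

Here `slope` offers a quick way to check the sign of each partial derivative of V_{B_{1}} for a chosen parameter set, for example `slope(P, "C_G")` or `slope(P, "T_S")`.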

From Proposition 2, it can be seen that government regulators can reduce the speculative behavior of professional environmental testing institutions by appropriately increasing the rewards and punishments for these institutions and by raising the cost of rent-seeking, for example by exposing their rent-seeking behavior through the media. In addition, active supervision by government regulators reduces the probability of rent-seeking behavior and facilitates fair testing by professional environmental testing institutions.

3.3. Analysis of the Evolutionary Stability of the Government Regulator’s Strategy

Assuming that the government regulator’s expected return from adopting a positive regulatory strategy is

U_{G 31}

, the expected return from adopting a negative regulatory strategy is

U_{G 32}

, and the expected average return from adopting a mixed strategy is

{\bar{U}}_{G}

, then

U_{G 31}

,

U_{G 32}

, and

{\bar{U}}_{G}

are expressed as in Equations (13)–(15), respectively:

U_{G 31} = x y (N_{U} - M_{A} - M_{B} - C_{Z}) + x (1 - y) (N_{U} + Q_{B} - M_{A} - C_{Z}) + y (1 - x) (P_{A} - C_{Z} - M_{B}) + (1 - x) (1 - y) (P_{A} + Q_{B} - C_{Z} - O_{V})

(13)

U_{G 32} = x y N_{U} + x (1 - y) N_{U} - (1 - x) (1 - y) (O_{V} + C_{J})

(14)

{\bar{U}}_{G} = z U_{G 31} + (1 - z) U_{G 32}

(15)

From Equations (13)–(15), the replicator dynamic equation for the positive regulatory strategy of government regulators can be obtained as Equation (16):

M (z) = \frac{d z}{d t} = z (U_{G 31} - {\bar{U}}_{G}) = z (1 - z) [x y C_{J} - x (M_{A} + P_{A} + C_{J}) - y (M_{B} + Q_{B} + C_{J}) + P_{A} + C_{J} + Q_{B} - C_{Z}]

(16)
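
The algebra behind Equation (16) involves more terms than the previous two cases, so a numerical check is useful. The sketch below evaluates z(U_{G 31} − \bar{U}_{G}) directly from Equations (13)–(15) and compares it with the closed-form right-hand side at random points; all parameter draws and function names are hypothetical.

```python
import random


def u_active(x, y, p):
    """Equation (13): expected return of positive regulation."""
    return (x * y * (p["N_U"] - p["M_A"] - p["M_B"] - p["C_Z"])
            + x * (1 - y) * (p["N_U"] + p["Q_B"] - p["M_A"] - p["C_Z"])
            + y * (1 - x) * (p["P_A"] - p["C_Z"] - p["M_B"])
            + (1 - x) * (1 - y) * (p["P_A"] + p["Q_B"] - p["C_Z"] - p["O_V"]))


def u_negative(x, y, p):
    """Equation (14): expected return of negative regulation."""
    return (x * y * p["N_U"] + x * (1 - y) * p["N_U"]
            - (1 - x) * (1 - y) * (p["O_V"] + p["C_J"]))


def m_from_payoffs(x, y, z, p):
    """z * (U_G31 - average return), using Equation (15) for the average."""
    u1, u2 = u_active(x, y, p), u_negative(x, y, p)
    u_bar = z * u1 + (1 - z) * u2
    return z * (u1 - u_bar)


def m_closed_form(x, y, z, p):
    """Right-hand side of Equation (16)."""
    bracket = (x * y * p["C_J"] - x * (p["M_A"] + p["P_A"] + p["C_J"])
               - y * (p["M_B"] + p["Q_B"] + p["C_J"])
               + p["P_A"] + p["C_J"] + p["Q_B"] - p["C_Z"])
    return z * (1 - z) * bracket


def max_gap(trials=1000, seed=2):
    """Largest absolute difference between the two forms over random draws."""
    rng = random.Random(seed)
    names = ["N_U", "O_V", "C_J", "C_Z", "M_A", "P_A", "M_B", "Q_B"]
    worst = 0.0
    for _ in range(trials):
        p = {k: rng.uniform(0.0, 10.0) for k in names}
        x, y, z = rng.random(), rng.random(), rng.random()
        gap = abs(m_from_payoffs(x, y, z, p) - m_closed_form(x, y, z, p))
        worst = max(worst, gap)
    return worst
```

The benefit N_U and the remediation cost O_V appear in both expected returns and cancel in their difference, which is why neither enters the bracketed term.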

Differentiating Equation (16) with respect to z gives Equation (17):

\frac{d M (z)}{d z} = (1 - 2 z) [x y C_{J} - x (M_{A} + P_{A} + C_{J}) - y (M_{B} + Q_{B} + C_{J}) + P_{A} + C_{J} + Q_{B} - C_{Z}]

(17)

Let

M (z) = 0

, which gives the two equilibrium points of the replicator dynamic equation, i.e.,

z = 0

and

z = 1

. Thus, the evolutionary stability of the replicator dynamic system can be discussed in the following three cases: when

x = x_{0} = \frac{y (M_{B} + Q_{B} + C_{J}) + C_{Z} - P_{A} - C_{J} - Q_{B}}{y C_{J} - M_{A} - P_{A} - C_{J}}

(

0 < x_{0} < 1

),

M (z)

≡ 0, at which time the system reaches a stable state, regardless of the value of

z

. When

x < x_{0} = \frac{y (M_{B} + Q_{B} + C_{J}) + C_{Z} - P_{A} - C_{J} - Q_{B}}{y C_{J} - M_{A} - P_{A} - C_{J}}

,

{\frac{d M (z)}{d z} |}_{z = 0} > 0

,

{\frac{d M (z)}{d z} |}_{z = 1} < 0

, at this time

z = 1

is the evolutionarily stable strategy and

z = 0

is the unstable strategy. When

x > x_{0} = \frac{y (M_{B} + Q_{B} + C_{J}) + C_{Z} - P_{A} - C_{J} - Q_{B}}{y C_{J} - M_{A} - P_{A} - C_{J}}

,

{\frac{d M (z)}{d z} |}_{z = 0} < 0

,

{\frac{d M (z)}{d z} |}_{z = 1} > 0

, at this time

z = 1

is the unstable strategy and

z = 0

is the evolutionarily stable strategy.

The evolutionary phase diagram of the government regulator’s strategy is shown in Figure 4.

Figure 4. Phase diagram of the evolution of the government regulator’s strategy.

From Figure 4, it can be seen that the

x = x_{0} = \frac{y (M_{B} + Q_{B} + C_{J}) + C_{Z} - P_{A} - C_{J} - Q_{B}}{y C_{J} - M_{A} - P_{A} - C_{J}}

surface divides

Ω

into two spaces,

C_{1}

and

C_{2}

. When the initial region falls in

C_{1}

, the government regulator’s strategy will evolve to positive regulation, and when the initial region falls in

C_{2}

, the government regulator’s strategy will evolve to negative regulation.

Proposition 3.

The positive regulatory strategy adopted by government regulators is related to the rewards and punishments (

P_{A}, M_{A}

,

M_{B}, Q_{B}

) of government regulators for pollution control enterprises and professional environmental testing institutions and the penalties

C_{J}

imposed by higher-level government authorities on government regulators that regulate negatively; it is positively correlated with

P_{A}, Q_{B}, C_{J}

and negatively correlated with respect to

M_{A}

,

M_{B}

.

Proof.

Figure 4 shows that the probability of a government regulator adopting a positive regulatory strategy is the volume

V_{C_{1}}

of

C_{1}

, and the probability of adopting a negative regulatory strategy is the volume

V_{C_{2}}

of

C_{2}

. The calculation of

V_{C_{1}}

is shown in Equation (18):

V_{C_{1}} = \int_{0}^{1} \int_{0}^{1} \frac{y (M_{B} + Q_{B} + C_{J}) + C_{Z} - P_{A} - C_{J} - Q_{B}}{y C_{J} - M_{A} - P_{A} - C_{J}} d y d z = \frac{M_{B} + Q_{B} + C_{J}}{C_{J}} + \frac{C_{J} (C_{Z} - P_{A} - C_{J} - Q_{B}) + (M_{A} + P_{A} + C_{J}) (M_{B} + Q_{B} + C_{J})}{C_{J}^{2}} \ln (1 - \frac{C_{J}}{M_{A} + P_{A} + C_{J}}) = \frac{M_{B} + Q_{B} + C_{J}}{C_{J}} - [\frac{C_{Z} + M_{A} + M_{B}}{C_{J}} + \frac{(M_{A} + P_{A}) (M_{B} + Q_{B})}{C_{J}^{2}}] \ln (1 + \frac{C_{J}}{M_{A} + P_{A}})

(18)

Taking the partial derivative of

V_{C_{1}}

yields

\frac{\partial V_{C_{1}}}{\partial P_{A}} > 0

,

\frac{\partial V_{C_{1}}}{\partial M_{A}} < 0

,

\frac{\partial V_{C_{1}}}{\partial M_{B}} < 0

,

\frac{\partial V_{C_{1}}}{\partial Q_{B}} > 0

,

\frac{\partial V_{C_{1}}}{\partial C_{J}} > 0

. Thus, increasing

P_{A}, Q_{B}, C_{J}

or decreasing

M_{A}

,

M_{B}

increases the volume of

C_{1}

. □
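The closed form in Equation (18) and the signs of the partial derivatives can be cross-checked numerically. The sketch below is an illustration under stated assumptions rather than part of the original proof: it uses the array 1 values from Section 4, a midpoint rule in place of the exact integral, and central finite differences in place of the analytic derivatives. All function names are hypothetical.

```python
import math


def x0(y, M_A, P_A, M_B, Q_B, C_J, C_Z):
    """Threshold surface x = x0(y) from the case analysis of Equation (16)."""
    return ((y * (M_B + Q_B + C_J) + C_Z - P_A - C_J - Q_B)
            / (y * C_J - M_A - P_A - C_J))


def volume_closed_form(M_A, P_A, M_B, Q_B, C_J, C_Z):
    """Last expression of Equation (18)."""
    weight = (C_Z + M_A + M_B) / C_J + (M_A + P_A) * (M_B + Q_B) / C_J ** 2
    return (M_B + Q_B + C_J) / C_J - weight * math.log(1 + C_J / (M_A + P_A))


def volume_numeric(n=20000, **p):
    """Midpoint rule in y; the integrand does not depend on z."""
    return sum(x0((i + 0.5) / n, **p) for i in range(n)) / n


def sensitivity(name, h=1e-6, **p):
    """Central finite-difference estimate of dV/d(name)."""
    up = dict(p)
    up[name] += h
    down = dict(p)
    down[name] -= h
    return (volume_closed_form(**up) - volume_closed_form(**down)) / (2 * h)


ARRAY_1 = dict(M_A=6.0, P_A=12.0, M_B=4.5, Q_B=6.0, C_J=12.0, C_Z=4.5)
closed = volume_closed_form(**ARRAY_1)
numeric = volume_numeric(**ARRAY_1)
signs = {k: sensitivity(k, **ARRAY_1) > 0 for k in ("P_A", "M_A", "M_B", "Q_B", "C_J")}
print(closed, numeric, signs)
```

At these parameter values the closed form and the numerical integral agree at about 0.566, and the estimated derivatives are positive for P_A, Q_B, and C_J and negative for M_A and M_B. This checks the signs at one parameter point only and does not replace the general proof.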

Proposition 3 shows that the larger the penalties that government regulators set for pollution control enterprises and professional environmental testing institutions, the higher the rate of positive regulation by the government regulators, whereas larger rewards discourage regulators from performing their own regulatory duties and lower the rate of positive regulation. The penalty that higher-level government authorities impose for negative regulation is also an important driver of positive regulation by government regulators.

3.4. System Equilibrium Point Stability Analysis

Solving the system of equations consisting of Equations (4), (10) and (16), we can obtain nine equilibrium points in the game process of pollution control enterprises, professional environmental testing agencies, and government regulators, which are

E_{1}

(0,0,0),

E_{2}

(0,1,0),

E_{3}

(0,0,1),

E_{4}

(0,1,1),

E_{5}

(1,0,0),

E_{6}

(1,1,0),

E_{7}

(1,0,1),

E_{8}

(1,1,1),

E_{9}

(

x_{0}

,

y_{0}

,

z_{0}

), where

E_{9}

(

x_{0}

,

y_{0}

,

z_{0}

)

\in Ω

,

Ω = {N (x, y, z) | 0 \leq x \leq 1, 0 \leq y \leq 1, 0 \leq z \leq 1}

. Here,

x_{0} = \frac{y (M_{B} + Q_{B} + C_{J}) + C_{Z} - P_{A} - C_{J} - Q_{B}}{y C_{J} - M_{A} - P_{A} - C_{J}}

,

y_{0} = \frac{C_{D} - C_{F} - C_{H} - T_{S} - z (M_{A} + P_{A})}{(R_{E} - T_{S})}

,

z_{0} = \frac{T_{S} - C_{G} - x T_{S}}{(M_{B} + Q_{B})}

. The stability of the evolving system can be judged using the Jacobian matrix. According to Lyapunov's stability theorem for ordinary differential equations, an equilibrium point is asymptotically stable when all eigenvalues of the Jacobian matrix have negative real parts, and it is unstable if at least one eigenvalue has a positive real part [21,22]. The expression of the Jacobian matrix J is as follows:

J = [\begin{matrix} \frac{\partial F (x)}{\partial x} & \frac{\partial F (x)}{\partial y} & \frac{\partial F (x)}{\partial z} \\ \frac{\partial G (y)}{\partial x} & \frac{\partial G (y)}{\partial y} & \frac{\partial G (y)}{\partial z} \\ \frac{\partial M (z)}{\partial x} & \frac{\partial M (z)}{\partial y} & \frac{\partial M (z)}{\partial z} \end{matrix}] = [\begin{matrix} (1 - 2 x) [y (R_{E} - T_{S}) + z (M_{A} + P_{A}) + C_{F} + C_{H} + T_{S} - C_{D}] & x (1 - x) (R_{E} - T_{S}) & x (1 - x) (M_{A} + P_{A}) \\ y (1 - y) T_{S} & (1 - 2 y) [z (M_{B} + Q_{B}) + x T_{S} + C_{G} - T_{S}] & y (1 - y) (M_{B} + Q_{B}) \\ z (1 - z) [y C_{J} - (M_{A} + P_{A} + C_{J})] & z (1 - z) [x C_{J} - (M_{B} + Q_{B} + C_{J})] & (1 - 2 z) [x y C_{J} - x (M_{A} + P_{A} + C_{J}) - y (M_{B} + Q_{B} + C_{J}) + P_{A} + C_{J} + Q_{B} - C_{Z}] \end{matrix}]

The eigenvalues corresponding to each equilibrium point are obtained by substituting the point into the Jacobian matrix, and the stability analysis of the equilibrium points is shown in Table 2 and Table 3.

Table 2. Eigenvalues of the Jacobian matrix corresponding to each equilibrium point.

Table 3. Equilibrium point evolutionary stabilization strategy (ESS) analysis.
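At the eight pure-strategy points the off-diagonal entries of J vanish, because each contains a factor x(1 - x), y(1 - y), or z(1 - z), so the eigenvalues are simply the diagonal entries. The following minimal sketch evaluates them for the array 1 values of Section 4. It is an illustration, not a reproduction of Table 2, and the names `corner_eigenvalues` and `stable_corners` are hypothetical.

```python
from itertools import product

# Array 1 from Section 4 (units of one million).
ARRAY_1 = dict(R_E=45.0, T_S=12.0, M_A=6.0, P_A=12.0, C_F=4.5, C_H=3.0,
               C_D=30.0, M_B=4.5, Q_B=6.0, C_G=3.0, C_J=12.0, C_Z=4.5)


def corner_eigenvalues(x, y, z, p):
    """Diagonal of J at a pure-strategy point, where J is diagonal."""
    lam_x = (1 - 2 * x) * (y * (p["R_E"] - p["T_S"]) + z * (p["M_A"] + p["P_A"])
                           + p["C_F"] + p["C_H"] + p["T_S"] - p["C_D"])
    lam_y = (1 - 2 * y) * (z * (p["M_B"] + p["Q_B"]) + x * p["T_S"]
                           + p["C_G"] - p["T_S"])
    lam_z = (1 - 2 * z) * (x * y * p["C_J"] - x * (p["M_A"] + p["P_A"] + p["C_J"])
                           - y * (p["M_B"] + p["Q_B"] + p["C_J"])
                           + p["P_A"] + p["C_J"] + p["Q_B"] - p["C_Z"])
    return lam_x, lam_y, lam_z


def stable_corners(p):
    """Pure-strategy points whose three eigenvalues are all negative."""
    return [c for c in product((0, 1), repeat=3)
            if all(lam < 0 for lam in corner_eigenvalues(*c, p))]


print(stable_corners(ARRAY_1))
```

For array 1 the only pure-strategy point with three negative eigenvalues is (1, 1, 0), with eigenvalues (-22.5, -3, -15).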

As can be seen from Table 3, the system will evolve to

E_{3}

(0, 0, 1) and

E_{6}

(1, 1, 0) from different initial states when government regulators set small rewards and penalties for pollution control companies and professional environmental testing agencies and when rent-seeking by pollution control companies and third parties yields high returns. For the strategy combination of

E_{3}

(0, 0, 1), the pollution treatment enterprise falsely treats pollution, the professional environmental testing agency intends to seek rent, and the falsely treated environmental protection project passes inspections, causing social environmental pollution. In order to avoid this situation, government regulatory departments should reasonably develop reward and punishment programs to appropriately increase rewards and punishments for pollution treatment enterprises and professional environmental testing agencies to ensure that the project follows the rules of governance.

In summary, based on the model assumptions, two evolutionary stabilization strategies,

E_{3}

(0, 0, 1) and

E_{6}

(1, 1, 0), can be achieved. The strategy combination of false pollution control, intentional rent-seeking, and active regulation can be avoided under a reasonable reward and punishment system of government regulators. In the strategy combination of compliance, refusal of rent-seeking, and negative regulation, however, the pollution control companies spontaneously adopt the compliance strategy; that is, compliance behavior can still evolve even under negative regulation by the government. In fact, from the perspective of the managers, the evolutionary stabilization strategy of the equilibrium point

E_{8}

(1, 1, 1) is what we expect, in which the pollution control enterprises follow the rules and regulations, the third-party agencies refuse to seek rent, the government regulators actively supervise the process, and the three parties participate in the environmental pollution control projects. To achieve the equilibrium point

E_{8}

(1, 1, 1), the rewards and penalties for pollution control enterprises and professional environmental testing agencies need to be redesigned, which is discussed in depth in Section 5.

4. Computational Experimental Simulation

Based on the above analysis, MATLAB 2016b is used for numerical simulation to show more intuitively the factors that influence the evolutionary process of environmental project management. The simulation runs from time 0 to time 30 with no specific time unit; this horizon is long enough for the model to reach its evolutionary stabilization strategy. We take the following parameter values (array 1):

R_{E} = 45

,

T_{S} = 12

,

M_{A} = 6

,

P_{A} = 12

,

C_{F} = 4.5

,

C_{H} = 3

,

C_{D} = 30

,

M_{B} = 4.5

,

Q_{B} = 6

,

C_{G} = 3

,

C_{J} = 12

,

C_{Z} = 4.5

, where the parameters are in units of one million. These values satisfy the stability conditions of the point

E_{6}

(1,1,0) in Table 3:

C_{D} - C_{F} - C_{H} - R_{E} < 0

and

C_{G} > 0

,

M_{A} + M_{B} + C_{Z} > 0

. In this case, the evolution of the pollution control enterprises, professional environmental testing agencies, and government regulators is as shown in Figure 5, and the three-party evolutionary game system converges to

E_{6}

(1,1,0).

Figure 5. The

E_{6}

(1,1,0) evolutionary path diagram.
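The evolutionary path in Figure 5 can be approximated outside MATLAB. The sketch below is an assumption-laden illustration, not the authors' script: a fixed-step fourth-order Runge-Kutta integrator stands in for the unspecified MATLAB solver, and the right-hand sides of Equations (4) and (10), which are stated in an earlier section, are reconstructed here from the diagonal of the Jacobian matrix J.

```python
# Array 1 from the text (units of one million).
ARRAY_1 = dict(R_E=45.0, T_S=12.0, M_A=6.0, P_A=12.0, C_F=4.5, C_H=3.0,
               C_D=30.0, M_B=4.5, Q_B=6.0, C_G=3.0, C_J=12.0, C_Z=4.5)


def replicator_rhs(state, p):
    """Right-hand sides of the replicator system, Equations (4), (10) and (16)."""
    x, y, z = state
    fx = x * (1 - x) * (y * (p["R_E"] - p["T_S"]) + z * (p["M_A"] + p["P_A"])
                        + p["C_F"] + p["C_H"] + p["T_S"] - p["C_D"])
    gy = y * (1 - y) * (z * (p["M_B"] + p["Q_B"]) + x * p["T_S"]
                        + p["C_G"] - p["T_S"])
    mz = z * (1 - z) * (x * y * p["C_J"] - x * (p["M_A"] + p["P_A"] + p["C_J"])
                        - y * (p["M_B"] + p["Q_B"] + p["C_J"])
                        + p["P_A"] + p["C_J"] + p["Q_B"] - p["C_Z"])
    return fx, gy, mz


def rk4_step(state, dt, p):
    """One classical Runge-Kutta step of size dt."""
    def shifted(k, scale):
        return tuple(s + scale * dt * ki for s, ki in zip(state, k))
    k1 = replicator_rhs(state, p)
    k2 = replicator_rhs(shifted(k1, 0.5), p)
    k3 = replicator_rhs(shifted(k2, 0.5), p)
    k4 = replicator_rhs(shifted(k3, 1.0), p)
    return tuple(s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def simulate(start, p, t_end=30.0, dt=0.01):
    """Integrate from t = 0 to t_end and return the whole path."""
    path = [start]
    for _ in range(int(round(t_end / dt))):
        path.append(rk4_step(path[-1], dt, p))
    return path


path = simulate((0.5, 0.5, 0.5), ARRAY_1)
final_state = path[-1]
peak_z = max(state[2] for state in path)
print(final_state, peak_z)
```

Starting from (0.5, 0.5, 0.5), the path ends close to (1, 1, 0), and z rises above its initial value before falling, which matches the rise-then-fall shape described for the regulator's curve.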

In order to clearly observe the evolutionary path of the three game subjects, we assume that the probability of implementing each strategy in the initial state is 0.5 for pollution control enterprises, professional environmental testing institutions, and government regulatory departments, and we conduct a simulation analysis for some factors.

(1) Incentives from government regulators for pollution treatment enterprises. Here, we decrease

M_{A}

to 3 and increase

M_{A}

to 12, respectively, and compare both cases with the baseline

M_{A} = 6

, as shown in Figure 6a. From Figure 6a, it can be seen that as

M_{A}

increases, the probability of convergence of the pollution treatment enterprises to the compliance strategy increases, but the rate of convergence of government regulators to active regulation decreases significantly, then finally evolves to 0 and reaches a steady state. At the same time, there is a significant increase in the probability of professional environmental testing institutions rejecting rent-seeking in the

M_{A}

increase process. This indicates that under the condition of active supervision by governmental regulatory authorities, the reasonable formulation of incentive measures plays a positive role in the project’s rule-based pollution control and is conducive to the project’s environmental governance.

Figure 6. Effects of some parameters on the evolutionary results.

(2) The accountability of higher levels of government to government regulators. Here, we decrease

C_{J}

to 6 and increase

C_{J}

to 24, respectively, and compare both cases with the baseline

C_{J} = 12

, as shown in Figure 6b. From Figure 6b, we can see that an increase in

C_{J}

leads to an increase in the probability of active regulation by government regulators. Figure 5 also reveals an interesting phenomenon: every evolutionary curve of the government regulators first rises and then falls. Government regulators tend to adopt the positive regulation strategy in the early stage of evolution, but as the reward and punishment mechanism for pollution control enterprises and professional environmental testing agencies takes effect over time, positive regulation ceases to be the dominant strategy, so the regulators change strategy and eventually choose negative regulation. Supervision by higher levels of government nevertheless slows the rate at which government regulatory departments shift to the negative regulation strategy, which helps the pollution treatment enterprises follow the rules and regulations in controlling environmental pollution.
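The role of the accountability penalty can also be read directly from the bracketed term of Equation (16). The regrouping below is a rearrangement added here for illustration; it is algebraically identical to the original bracket and is not part of the authors' derivation.

```latex
\begin{aligned}
& x y C_{J} - x (M_{A} + P_{A} + C_{J}) - y (M_{B} + Q_{B} + C_{J}) + P_{A} + C_{J} + Q_{B} - C_{Z} \\
& \quad = (1 - x)(1 - y) C_{J} + (1 - x) P_{A} + (1 - y) Q_{B} - x M_{A} - y M_{B} - C_{Z}
\end{aligned}
```

Because (1 - x)(1 - y) is non-negative on Ω, a larger C_J raises dz/dt at every state with x < 1 and y < 1. The coefficient shrinks to zero as x or y approaches 1, where the bracket reduces to -M_A - M_B - C_Z < 0, which is consistent with regulator curves that rise early and then fall.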

The MATLAB simulation shows that the conclusions of the theoretical analysis and the simulation analysis are consistent, and the simulation results for the remaining parameters also agree with the theoretical analysis. The same sensitivity analysis can be carried out for any other parameter, but it is omitted here to save space.

5. System Optimization

In the game process, the strategy choices of pollution control enterprises, professional environmental testing institutions, and government regulatory departments are interdependent. Over time, a change in any party’s strategy causes changes in the behavior of the other subjects, and this instability makes the behavior of all three parties unpredictable. Government supervision then becomes difficult to implement effectively, which in the long run leads to enterprises “exploiting the loopholes” and to environmental pollution not being effectively managed. Therefore, it is necessary to control the stability of the evolutionary system and make the equilibrium point (1, 1, 1) evolutionarily stable to ensure the stable and effective implementation of the environmental management project. To visualize the behavior of the tripartite subjects, let the simulation start time be 0 and the end time be 5, and take the following parameter values (array 2):

R_{E} = 80

,

T_{S} = 60

,

M_{A} = 12

,

P_{A} = 27

,

C_{F} = 4

,

C_{H} = 2

,

C_{D} = 75

,

M_{B} = 12

,

Q_{B} = 12

,

C_{G} = 2

,

C_{J} = 3

,

C_{Z} = 24

, i.e., the initial conditions of the saddle point are satisfied, and the simulation diagram is shown in Figure 7a.

Figure 7. Evolutionary paths of tripartite behavior choices under a static reward and punishment mechanism.

5.1. Static Reward and Punishment Mechanism

The simulation analysis shows that adjusting the fixed values of the parameters does not change the evolution direction of the system, although it does change the rate at which the evolutionary stability strategy is reached. To further verify that the static reward and punishment mechanism can hardly control the stability of the dynamic system, we adjust some of the parameters in array 2, keep the initial probability of each game subject executing each strategy at 0.5, and increase the reward and penalty to

M_{A} = 24

,

P_{A} = 54

. The simulation result is shown in Figure 7b.

Comparing Figure 7a,b, it can be seen that for the enterprises, strengthening the static reward and punishment mechanism can increase the probability of adopting the compliance strategy, but it cannot remove the fluctuating instability of the behavioral choices of the treatment enterprises. For the professional environmental testing agencies and the government regulators, the static reward and punishment mechanism likewise does not lead to a stable strategy. Similar results are obtained by adjusting the rewards and penalties that the government regulator applies to the professional environmental testing agency. These simulation results show that the static reward and punishment mechanism only changes the fixed values of the parameters, which can improve the performance behavior of the game subjects for a short period of time; as the evolution proceeds, however, the mechanism loses its effect. The static reward and punishment mechanism is therefore not a stability control strategy for the system: the strategy choices of the three parties are adjusted dynamically during the game, whereas the static mechanism does not respond in time to the performance behavior of each subject, and the fluctuations of the system become more and more intense. This is also an important reason why rent-seeking behavior still occurs frequently even though a regulatory system for environmental management projects exists.

5.2. Linear Dynamic Penalty Mechanism

The linear dynamic penalty mechanism introduced in this paper draws mainly on the studies by Liang et al. [22] and Gao [15] on the dynamic performance payment mechanism and the dynamic penalty mechanism. Under this mechanism, the government regulator imposes dynamic penalties on enterprises and professional environmental testing institutions according to their degree of violation. It is assumed that the penalty

P_{A}

imposed on a pollution treatment enterprise that falsely treats pollution and the penalty

Q_{B}

imposed on a professional environmental testing agency that intends to seek rent are both linearly related to the degree of violation. The dynamic penalty variables

P_{A}^{*}, Q_{B}^{*}

are, thus, introduced, as shown in Equations (19) and (20):

P_{A}^{*} = α (1 - x) P_{A}, 0 \leq x \leq 1

(19)

Q_{B}^{*} = β (1 - y) Q_{B}, 0 \leq y \leq 1

(20)

where

α, β

are linear dynamic penalty coefficients,

α

,

β > 1

. The effect of the linear dynamic penalty mechanism is visualized by substituting Equations (19) and (20) into the system consisting of Equations (4), (10) and (16), taking

α = β = 2

, while the initial strategies of the game subjects are (0.5, 0.5, 0.5) and (0.5, 0.2, 0.8), extending the simulation end time to 10. The simulation results are shown in Figure 8a,b.

Figure 8. Evolutionary paths of tripartite behavior choices under the linear dynamic punishment mechanism.

By comparing the simulation results in Figure 8a,b, we can see that the stability of the system is effectively controlled after the implementation of the linear dynamic penalty mechanism: the system converges to (0.648, 0, 0.29) regardless of whether the game subjects start from (0.5, 0.5, 0.5) or (0.5, 0.2, 0.8). Although the fluctuations of the system gradually die out under the linear dynamic penalty mechanism, two problems remain. First, pollution treatment enterprises may still falsely treat environmental projects in the short term, and government regulatory departments may still supervise negatively, which poses a considerable threat to the environmental management of the project. Second, in the evolutionary process, professional environmental testing agencies eventually reach the fully intentional rent-seeking (

y = 0

) evolutionary stabilization strategy, which substantially increases the probability of rent-seeking behavior, causes serious corruption, and is not conducive to incentivizing pollution control enterprises to follow the rules of pollution control environmental projects.
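The reported limit point can be located without running a simulation. On the face y = 0, setting the bracketed terms of the enterprise and regulator equations to zero after substituting Equations (19) and (20) gives a quadratic in x and a linear equation in z. The sketch below works this out for array 2 with α = β = 2. It assumes that the dynamic penalties replace P_A and Q_B everywhere in the system and that Equations (4) and (10) have the form implied by the Jacobian matrix J; all function names are hypothetical.

```python
import math

# Array 2 from the text, with the linear penalty coefficients alpha = beta = 2.
ARRAY_2 = dict(R_E=80.0, T_S=60.0, M_A=12.0, P_A=27.0, C_F=4.0, C_H=2.0,
               C_D=75.0, M_B=12.0, Q_B=12.0, C_G=2.0, C_J=3.0, C_Z=24.0)
ALPHA = BETA = 2.0


def rhs_linear_penalty(x, y, z, p, alpha=ALPHA, beta=BETA):
    """Replicator system with P_A, Q_B replaced by Equations (19) and (20)."""
    pa = alpha * (1 - x) * p["P_A"]
    qb = beta * (1 - y) * p["Q_B"]
    fx = x * (1 - x) * (y * (p["R_E"] - p["T_S"]) + z * (p["M_A"] + pa)
                        + p["C_F"] + p["C_H"] + p["T_S"] - p["C_D"])
    gy = y * (1 - y) * (z * (p["M_B"] + qb) + x * p["T_S"] + p["C_G"] - p["T_S"])
    mz = z * (1 - z) * (x * y * p["C_J"] - x * (p["M_A"] + pa + p["C_J"])
                        - y * (p["M_B"] + qb + p["C_J"])
                        + pa + p["C_J"] + qb - p["C_Z"])
    return fx, gy, mz


def face_equilibrium(p, alpha=ALPHA, beta=BETA):
    """Rest point with y = 0 and 0 < x, z < 1 (smaller root of the quadratic in x)."""
    a = alpha * p["P_A"]
    b = -(p["M_A"] + 2 * alpha * p["P_A"] + p["C_J"])
    c = alpha * p["P_A"] + p["C_J"] + beta * p["Q_B"] - p["C_Z"]
    x = (-b - math.sqrt(b * b - 4 * a * c)) / (2 * a)
    z = ((p["C_D"] - p["C_F"] - p["C_H"] - p["T_S"])
         / (p["M_A"] + alpha * (1 - x) * p["P_A"]))
    return x, 0.0, z


x_eq, y_eq, z_eq = face_equilibrium(ARRAY_2)
residual = rhs_linear_penalty(x_eq, y_eq, z_eq, ARRAY_2)
# Per-capita growth rate of y near the face y = 0; negative means y decays to 0.
y_growth = (z_eq * (ARRAY_2["M_B"] + BETA * ARRAY_2["Q_B"])
            + x_eq * ARRAY_2["T_S"] + ARRAY_2["C_G"] - ARRAY_2["T_S"])
print((x_eq, y_eq, z_eq), residual, y_growth)
```

Under these assumptions the rest point is approximately (0.647, 0, 0.290), close to the value (0.648, 0, 0.29) read from Figure 8, and the growth rate of y there is negative, so a small share of agencies refusing rent-seeking dies out near this point.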

In summary, the linear dynamic punishment mechanism effectively controls the stability of the system, but the evolutionary stabilization strategy it produces is not the one we expect, and certain problems remain. The system therefore requires further optimization to change the performance behavior of the tripartite subjects.

5.3. Non-Linear Dynamic Reward and Punishment Mechanism

Given the shortcomings of the linear dynamic punishment mechanism, we consider two-way incentives for the game subjects’ performance behavior. To this end, a non-linear dynamic reward and punishment mechanism is introduced to regulate the behaviors of enterprises, professional environmental testing institutions, and government regulatory departments. The mechanism has two components. First, the linear dynamic punishment mechanism is optimized: it is combined with the active regulation rate of the government regulator, the rent

T_{S}

obtained by the professional environmental testing agency, and the speculative interest

C_{D} - C_{F}

of the enterprise, thereby introducing the dynamic penalty parameters

P_{A}^{'}

,

Q_{B}^{'}

, as shown in Equations (21) and (22).

P_{A}^{'} = α (1 - x) P_{A} + z (C_{D} - C_{F})

(21)

Q_{B}^{'} = β (1 - y) Q_{B} + z T_{S}

(22)

Second, a non-linear dynamic incentive mechanism is introduced. The consideration of introducing this mechanism is based on the studies by Wang [23] and Chen [24], which show that the compensation mechanism of the governmental regulatory authorities and the compliance behavior of the enterprises present an inverted “U” non-linear relationship. Yang et al. [25] showed that the incentive mechanism is different from the punishment mechanism, and it is counterproductive to blindly regulate the compliance behavior of enterprises through the incentive mechanism. Drawing on scholars’ related research results, this paper combines the non-linear dynamic incentive mechanism with the active supervision rate of governmental regulators

z

, the rate at which the professional environmental testing institutions refuse rent-seeking

y

, the compliance rate of the enterprises

x

, the incentives of the governmental regulators for enterprise compliance

M_{A}

, the incentives of governmental regulators for professional environmental testing institutions that refuse rent-seeking

M_{B}

, the rent-seeking rents

T_{S}

, and the enterprises’ speculative interests

C_{D} - C_{F}

. When the compliance behavior of the enterprises and professional environmental testing institutions is low, a dynamic high-incentive strategy is adopted; when the compliance behavior of the enterprises and professional environmental testing institutions reaches the standard, the incentive level is appropriately reduced. The non-linear dynamic incentive parameters

M_{A}^{*}

,

M_{B}^{*}

, as shown in Equations (23) and (24), are, thus, introduced:

M_{A}^{*} = - x^{2} + M_{A} x - z (C_{D} - C_{F})

(23)

M_{B}^{*} = - y^{2} + M_{B} y - z T_{S}

(24)

Taking

α = β = 2

, the initial strategies of the game subjects are (0.5, 0.5, 0.5) and (0.5, 0.2, 0.8). To visualize the effects of the non-linear dynamic reward and punishment mechanism, the simulation time is shortened to 1, as shown in Figure 9a,b.

Figure 9. Evolutionary paths of tripartite behavior choices based on the non-linear dynamic reward and punishment mechanism.

Comparing Figure 9a,b, it can be seen that under the non-linear dynamic reward and punishment mechanism, the system stabilizes at the equilibrium of (1, 1, 1), regardless of whether the initial values are (0.5, 0.5, 0.5) or (0.5, 0.2, 0.8).

Comparing Figure 8 and Figure 9, it can be seen that under the non-linear mechanism the system evolves to the strategy (1, 1, 1), at which the strategy selection of all three parties reaches the ideal state. It can therefore be conjectured that the system has the ideal evolutionary stability under the non-linear dynamic reward and punishment mechanism whatever the initial strategy, but a theoretical proof is still needed. The following proof tests the effectiveness of the non-linear dynamic reward and punishment mechanism. Randomly taking the initial strategy of the saddle point as (0.6, 0.1, 0.1) and substituting

P_{A}^{'}

,

Q_{B}^{'}

,

M_{A}^{*}

, and

M_{B}^{*}

into the system of equations consisting of Equations (4), (10) and (16), we obtain the new system as Equation (25):

\left\{ \begin{matrix} F^{*} (x) = \frac{d x}{d t} = x (U_{I 11} - {\bar{U}}_{I}) = x (1 - x) [y (R_{E} - T_{S}) + z (- x^{2} + M_{A} x + α (1 - x) P_{A}) + C_{F} + C_{H} + T_{S} - C_{D}] \\ G^{*} (y) = \frac{d y}{d t} = y (U_{R 21} - {\bar{U}}_{R}) = y (1 - y) [z (- y^{2} + M_{B} y + β (1 - y) Q_{B}) + x T_{S} + C_{G} - T_{S}] \\ M^{*} (z) = \frac{d z}{d t} = z (U_{G 21} - {\bar{U}}_{G}) = z (1 - z) [x y C_{J} - x (- x^{2} + M_{A} x + α (1 - x) P_{A} + C_{J}) - y (- y^{2} + M_{B} y + β (1 - y) Q_{B} + C_{J}) + α (1 - x) P_{A} + z (C_{D} - C_{F}) + C_{J} + β (1 - y) Q_{B} + z T_{S} - C_{Z}] \end{matrix} \right.

(25)
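The behavior of the system in Equation (25) can be explored with a short simulation. The sketch below is an illustration under stated assumptions, not the authors' MATLAB script: it uses array 2 with α = β = 2, a fixed-step fourth-order Runge-Kutta integrator, and a horizon of 5 time units (longer than the horizon of 1 used for Figure 9) so that convergence is easy to check. Note that the sums M_A* + P_A' and M_B* + Q_B' no longer contain z, because the z-dependent terms of Equations (21) to (24) cancel.

```python
ARRAY_2 = dict(R_E=80.0, T_S=60.0, M_A=12.0, P_A=27.0, C_F=4.0, C_H=2.0,
               C_D=75.0, M_B=12.0, Q_B=12.0, C_G=2.0, C_J=3.0, C_Z=24.0)
ALPHA = BETA = 2.0


def rhs_nonlinear(state, p, alpha=ALPHA, beta=BETA):
    """Right-hand sides of Equation (25)."""
    x, y, z = state
    # M_A* + P_A' and M_B* + Q_B': the z-dependent terms cancel in these sums.
    enterprise = -x ** 2 + p["M_A"] * x + alpha * (1 - x) * p["P_A"]
    agency = -y ** 2 + p["M_B"] * y + beta * (1 - y) * p["Q_B"]
    penalty_a = alpha * (1 - x) * p["P_A"] + z * (p["C_D"] - p["C_F"])  # Eq. (21)
    penalty_b = beta * (1 - y) * p["Q_B"] + z * p["T_S"]                # Eq. (22)
    fx = x * (1 - x) * (y * (p["R_E"] - p["T_S"]) + z * enterprise
                        + p["C_F"] + p["C_H"] + p["T_S"] - p["C_D"])
    gy = y * (1 - y) * (z * agency + x * p["T_S"] + p["C_G"] - p["T_S"])
    mz = z * (1 - z) * (x * y * p["C_J"] - x * (enterprise + p["C_J"])
                        - y * (agency + p["C_J"])
                        + penalty_a + p["C_J"] + penalty_b - p["C_Z"])
    return fx, gy, mz


def rk4_step(state, dt, p):
    """One classical Runge-Kutta step of size dt."""
    def shifted(k, scale):
        return tuple(s + scale * dt * ki for s, ki in zip(state, k))
    k1 = rhs_nonlinear(state, p)
    k2 = rhs_nonlinear(shifted(k1, 0.5), p)
    k3 = rhs_nonlinear(shifted(k2, 0.5), p)
    k4 = rhs_nonlinear(shifted(k3, 1.0), p)
    return tuple(s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def final_state(start, p, t_end=5.0, dt=0.002):
    """Integrate from t = 0 to t_end and return the last state."""
    state = start
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, p)
    return state


end_a = final_state((0.5, 0.5, 0.5), ARRAY_2)
end_b = final_state((0.5, 0.2, 0.8), ARRAY_2)
print(end_a, end_b)
```

Both initial strategies used for Figure 9 end close to (1, 1, 1) in this sketch, in agreement with the behavior described for the non-linear mechanism.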

From Equation (25), the Jacobian matrix

J^{*}

at the interior equilibrium, where the bracketed terms of Equation (25) vanish, is obtained as shown in (26):

J^{*} = [\begin{matrix} x (1 - x) (- 2 x z + M_{A} z - α P_{A} z) & x (1 - x) (R_{E} - T_{S}) & x (1 - x) [- x^{2} + M_{A} x + α (1 - x) P_{A}] \\ y (1 - y) T_{S} & y (1 - y) (- 2 y z + M_{B} z - β Q_{B} z) & y (1 - y) [- y^{2} + M_{B} y + β (1 - y) Q_{B}] \\ z (1 - z) [y C_{J} + 3 x^{2} - 2 M_{A} x - α (1 - 2 x) P_{A} - C_{J} - α P_{A}] & z (1 - z) [x C_{J} + 3 y^{2} - 2 M_{B} y - β (1 - 2 y) Q_{B} - C_{J} - β Q_{B}] & z (1 - z) (C_{D} - C_{F} + T_{S}) \end{matrix}]

(26)

Setting the right-hand sides of the system in Equation (25) to 0 gives the nine equilibrium points

E_{1}^{*}

(0, 0, 0),

E_{2}^{*}

(0, 1, 0),

E_{3}^{*}

(0, 0, 1),

E_{4}^{*}

(0, 1, 1),

E_{5}^{*}

(1, 0, 0),

E_{6}^{*}

(1, 1, 0),

E_{7}^{*}

(1, 0, 1),

E_{8}^{*}

(1, 1, 1),

E_{9}^{*}

(

x_{0}^{*}

,

y_{0}^{*}

,

z_{0}^{*}

). Substituting the saddle point (

x_{0}^{*}

,

y_{0}^{*}

,

z_{0}^{*}

) into the Jacobian matrix gives

J^{*} = [\begin{matrix} - 0.2056 & 4.8 & 6.8256 \\ 5.4 & - 15.8866 & 2.0511 \\ - 5.3298 & - 2.6253 & - 8.044 \end{matrix}]

, whose eigenvalues are

λ_{1} = - 3.3668 + 4.4767 i

,

λ_{2} = - 3.3668 - 4.4767 i

,

λ_{3} = - 17.4029

, where

\mathrm{Re} (λ_{1, 2, 3}) < 0

. According to Lyapunov’s theory, an equilibrium point of a third-order system is asymptotically stable if all three eigenvalues of its Jacobian matrix have negative real parts [26], so the saddle point is asymptotically stable under the non-linear dynamic reward and punishment mechanism.
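The stability claim for the numerical matrix above can be verified without an eigenvalue solver. The sketch below applies the Routh-Hurwitz criterion for a cubic characteristic polynomial, which is a swapped-in technique and not the authors' method: all roots of λ³ + a₂λ² + a₁λ + a₀ have negative real parts exactly when a₂ > 0, a₀ > 0, and a₂a₁ > a₀. The helper names are hypothetical.

```python
# Numerical Jacobian at the saddle point, copied from the text.
J_STAR = [[-0.2056, 4.8, 6.8256],
          [5.4, -15.8866, 2.0511],
          [-5.3298, -2.6253, -8.044]]


def char_poly_coeffs(a):
    """Return (a2, a1, a0) of det(lambda*I - A) = lambda^3 + a2*lambda^2 + a1*lambda + a0."""
    trace = a[0][0] + a[1][1] + a[2][2]
    minors = (a[0][0] * a[1][1] - a[0][1] * a[1][0]
              + a[0][0] * a[2][2] - a[0][2] * a[2][0]
              + a[1][1] * a[2][2] - a[1][2] * a[2][1])
    det = (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
           - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
           + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    return -trace, minors, -det


def is_hurwitz(a2, a1, a0):
    """Routh-Hurwitz test for a monic cubic polynomial."""
    return a2 > 0 and a0 > 0 and a2 * a1 > a0


coeffs = char_poly_coeffs(J_STAR)
stable = is_hurwitz(*coeffs)

# Cross-check: the reported eigenvalues should sum to the trace of J*.
reported = [complex(-3.3668, 4.4767), complex(-3.3668, -4.4767), complex(-17.4029, 0.0)]
trace_gap = abs(sum(reported).real + coeffs[0])
print(coeffs, stable, trace_gap)
```

The criterion is satisfied for this matrix, and the reported eigenvalues sum to the trace of the matrix to within rounding, which supports the conclusion that all three eigenvalues have negative real parts.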

6. Conclusions

6.1. Research Findings and Significance of the Study

Based on the third-party management of environmental pollution, this paper considered the rent-seeking behavior of pollution treatment enterprises and professional environmental testing institutions and constructed an evolutionary game model of pollution treatment enterprises, professional environmental testing institutions, and government regulatory departments. Firstly, the stability and influencing factors of the respective strategies of the three subjects were analyzed. Secondly, the evolutionary stability strategy of the whole dynamic system was analyzed and the evolutionary path of the three-party evolutionary game was revealed intuitively through computational experimental simulations to verify the correctness of the theoretical analysis. Finally, the system was optimized to achieve the desired evolutionary stability strategy. The main conclusions drawn from the theoretical and simulation analyses are as follows:

(1) Appropriate increases in rewards and penalties by government regulators can not only incentivize pollution control enterprises to follow the rules of pollution control, but can also regulate the behavior of professional environmental testing agencies, although excessive rewards will not be conducive to the performance of government regulators themselves;

(2) The existing static reward and punishment mechanism of the government regulatory department fails to make timely adjustments according to the strategic choices of each subject, and there is no legal system to regulate the behavior of governance subjects;

(3) The adoption of a linear dynamic punishment mechanism by government regulators plays a controlling role in the stability of the system, but it greatly increases the probability of rent-seeking behavior and poses a greater threat to environmental governance;

(4) The non-linear dynamic reward and punishment mechanism considers both dynamic incentives and dynamic constraints to make the system achieve the desired evolutionary stability strategy, i.e., the pollution control enterprises follow the rules and regulations, the professional environmental testing agencies refuse to seek rent, and the government actively supervises the system as the final evolutionary direction.

The main significance of this work is threefold. First, we constructed an evolutionary game model of the regulatory strategy for third-party environmental pollution governance, analyzed the behavior of the participants from the perspective of their own interests, and made their spontaneous “compliance” behavior a long-term evolutionarily stable strategy, which enriches the theoretical research on the third-party governance system for environmental pollution. Second, based on the deficiencies of the dynamic performance payment mechanism, we introduced a non-linear dynamic reward and punishment mechanism to control the stability of the evolutionary system, which addresses the problem of a single reward and punishment mechanism in the third-party environmental pollution governance system. Finally, this paper has some practical significance, in that it can provide new ideas for environmental pollution governance in other countries or regions.

6.2. Management Insights

The above findings offer the following management insights for third-party environmental pollution management:

(1) To improve the government service capacity, build a “sometimes tight, sometimes loose” type of government regulation and accountability mechanism, improve the institutional reform process for third-party environmental pollution management, and ensure that government regulations are effectively applied to the environmental pollution management project itself;

(2) Government regulatory departments should establish and improve their multi-directional supervision mechanism, give full play to the supervisory role of the public and social media, and set up a reporting reward mechanism to fundamentally improve the efficiency of the supervision process;

(3) A non-linear dynamic reward and punishment mechanism should be implemented to monitor the performance of the pollution control enterprises and professional environmental testing agencies in real time, to stimulate the compliance behaviors of enterprises and professional environmental testing agencies, and to severely punish false pollution control and rent-seeking behavior;

(4) The risk of technological innovation in pollution treatment enterprises should be reduced and the determination of pollution treatment enterprises to follow regulations should be strengthened. In terms of technological innovation, the government should provide technical support to encourage environmental pollution control technology innovation in enterprises so as to reduce the cost of technological innovation in pollution control enterprises.

6.3. Research Limitations of This Paper

One of the most common assumptions, and the one used in the environmental pollution third-party governance model constructed in this paper, is that the participants are boundedly rational. In real cases, however, some participants inevitably depart from bounded rationality because of differences in political, economic, cultural, and environmental factors across regions and countries. Considering the behavior of such participants and narrowing the gap between the theoretical model and real cases will be a topic of the authors’ future research.

Author Contributions

Conceptualization, G.L.; Resources, X.S.; Writing—review & editing, G.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

This study has no ethical implications and therefore does not require ethical approval.

Informed Consent Statement

This study did not involve humans.

Data Availability Statement

This study did not report data.

Conflicts of Interest

The authors declare no conflict of interest.

Figure 1. A logic diagram of pollution control enterprises, professional environmental testing institutions, and government regulatory departments.

Figure 2. Phase diagram of the evolution of the strategy of pollution control enterprises.

Figure 3. Phase diagram of the evolution of the strategies of professional environmental testing institutions.

Figure 4. Phase diagram of the evolution of the government regulator’s strategy.

Figure 5. The $E_{6}$ (1,1,0) evolutionary path diagram.

Figure 6. Effects of some parameters on the evolutionary results.

Figure 7. Evolutionary paths of tripartite behavior choices under a static reward and punishment mechanism.

Figure 8. Evolutionary paths of tripartite behavior choices under the linear dynamic punishment mechanism.

Figure 9. Evolutionary paths of tripartite behavior choices based on the linear dynamic reward and punishment mechanism.

Table 1. Payoff matrix of the game between pollution control enterprises, professional environmental testing agencies, and government regulators.

Pollution Control Enterprises	Professional Environmental Testing Agency	Government Regulators: Active Regulation $z$	Government Regulators: Negative Regulation $1 - z$
Rule-based pollution control $x$	Rejecting rent-seeking $y$	$R_{E} - C_{D} + M_{A},\ L_{K} + M_{B},\ N_{U} - M_{A} - M_{B} - C_{Z}$	$R_{E} - C_{D},\ L_{K},\ N_{U}$
Rule-based pollution control $x$	Intentional rent-seeking $1 - y$	$R_{E} - C_{D} + M_{A},\ L_{K} - Q_{B} - C_{G},\ N_{U} + Q_{B} - M_{A} - C_{Z}$	$R_{E} - C_{D},\ L_{K} - C_{G},\ N_{U}$
Fake pollution control $1 - x$	Rejecting rent-seeking $y$	$- C_{F} - C_{H} - P_{A},\ L_{K} + M_{B},\ P_{A} - C_{Z} - M_{B}$	$- C_{F} - C_{H},\ L_{K},\ 0$
Fake pollution control $1 - x$	Intentional rent-seeking $1 - y$	$R_{E} - C_{F} - C_{H} - P_{A} - T_{S},\ L_{K} + T_{S} - Q_{B} - C_{G},\ P_{A} + Q_{B} - C_{Z} - O_{V}$	$R_{E} - C_{F} - C_{H} - T_{S},\ L_{K} + T_{S} - C_{G},\ - O_{V} - C_{J}$
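
The payoff matrix in Table 1 can be turned into replicator equations mechanically. The following Python sketch (the paper's own simulations were run in MATLAB) encodes the eight payoff triples and evaluates the standard two-strategy replicator form $dq/dt = q(1 - q)(U_{1} - U_{0})$ for each population. The numeric parameter values and the names `PARAMS`, `payoff`, `expected_payoff`, and `replicator_rhs` are assumptions introduced for illustration only; they are not taken from the paper.

```python
from itertools import product
from types import SimpleNamespace

# Hypothetical parameter values for illustration only; not taken from the paper.
PARAMS = SimpleNamespace(
    R_E=10.0, C_D=6.0, C_F=1.0, C_H=2.0, T_S=1.5, L_K=5.0, C_G=1.0,
    M_A=1.0, M_B=0.5, P_A=3.0, Q_B=2.0, C_Z=1.0, N_U=4.0, O_V=2.0, C_J=1.0,
)


def payoff(profile, p=PARAMS):
    """Return the Table 1 payoff triple (enterprise, testing agency, government).

    profile = (i, j, k): 1 marks the strategy played with probability x, y, z
    (rule-based control, rejecting rent-seeking, active regulation) and
    0 marks the complementary strategy.
    """
    table = {
        (1, 1, 1): (p.R_E - p.C_D + p.M_A, p.L_K + p.M_B,
                    p.N_U - p.M_A - p.M_B - p.C_Z),
        (1, 1, 0): (p.R_E - p.C_D, p.L_K, p.N_U),
        (1, 0, 1): (p.R_E - p.C_D + p.M_A, p.L_K - p.Q_B - p.C_G,
                    p.N_U + p.Q_B - p.M_A - p.C_Z),
        (1, 0, 0): (p.R_E - p.C_D, p.L_K - p.C_G, p.N_U),
        (0, 1, 1): (-p.C_F - p.C_H - p.P_A, p.L_K + p.M_B,
                    p.P_A - p.C_Z - p.M_B),
        (0, 1, 0): (-p.C_F - p.C_H, p.L_K, 0.0),
        (0, 0, 1): (p.R_E - p.C_F - p.C_H - p.P_A - p.T_S,
                    p.L_K + p.T_S - p.Q_B - p.C_G,
                    p.P_A + p.Q_B - p.C_Z - p.O_V),
        (0, 0, 0): (p.R_E - p.C_F - p.C_H - p.T_S,
                    p.L_K + p.T_S - p.C_G, -p.O_V - p.C_J),
    }
    return table[profile]


def expected_payoff(player, strategy, mix, p=PARAMS):
    """Expected payoff of `player` (0, 1, 2) using pure `strategy` (1 or 0)
    while the other two populations play 1 with the probabilities in `mix`."""
    total = 0.0
    for profile in product((0, 1), repeat=3):
        if profile[player] != strategy:
            continue
        weight = 1.0
        for other in range(3):
            if other != player:
                weight *= mix[other] if profile[other] == 1 else 1.0 - mix[other]
        total += weight * payoff(profile, p)[player]
    return total


def replicator_rhs(mix, p=PARAMS):
    """Right-hand sides (dx/dt, dy/dt, dz/dt) of the replicator equations."""
    return tuple(
        q * (1.0 - q)
        * (expected_payoff(n, 1, mix, p) - expected_payoff(n, 0, mix, p))
        for n, q in enumerate(mix)
    )


print(replicator_rhs((0.5, 0.5, 0.5)))
```

Every pure-strategy corner such as (1, 1, 0) makes all three right-hand sides vanish, which is why the eight corners appear as equilibrium points in the stability analysis.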

Table 2. Eigenvalues of the Jacobian matrix corresponding to each equilibrium point.

Equilibrium Point	$λ_{1}$	$λ_{2}$	$λ_{3}$
$E_{1}$ (0,0,0)	$C_{F} + C_{H} + T_{S} - C_{D}$	$C_{G} - T_{S}$	$P_{A} + C_{J} + Q_{B} - C_{Z}$
$E_{2}$ (0,1,0)	$R_{E} + C_{F} + C_{H} - C_{D}$	$T_{S} - C_{G}$	$P_{A} - M_{B} - C_{Z}$
$E_{3}$ (0,0,1)	$M_{A} + P_{A} + C_{F} + C_{H} + T_{S} - C_{D}$	$M_{B} + Q_{B} + C_{G} - T_{S}$	$C_{Z} - P_{A} - C_{J} - Q_{B}$
$E_{4}$ (0,1,1)	$R_{E} + M_{A} + P_{A} + C_{F} + C_{H} - C_{D}$	$T_{S} - M_{B} - Q_{B} - C_{G}$	$M_{B} + C_{Z} - P_{A}$
$E_{5}$ (1,0,0)	$C_{D} - C_{F} - C_{H} - T_{S}$	$C_{G}$	$Q_{B} - M_{A} - C_{Z}$
$E_{6}$ (1,1,0)	$C_{D} - C_{F} - C_{H} - R_{E}$	$- C_{G}$	$- M_{A} - M_{B} - C_{Z}$
$E_{7}$ (1,0,1)	$C_{D} - M_{A} - P_{A} - C_{F} - C_{H} - T_{S}$	$M_{B} + Q_{B} + C_{G}$	$C_{Z} + M_{A} - Q_{B}$
$E_{8}$ (1,1,1)	$C_{D} - R_{E} - M_{A} - P_{A} - C_{F} - C_{H}$	$- M_{B} - Q_{B} - C_{G}$	$M_{A} + M_{B} + C_{Z}$
$E_{9}$ $(x_{0}, y_{0}, z_{0})$	Eigenvalues with different signs

Table 3. Evolutionarily stable strategy (ESS) analysis of the equilibrium points.

Equilibrium Point	Sign of Real Parts of Jacobian Eigenvalues	Stability	Judgment Conditions
$E_{1}$ (0,0,0)	$(-, -, +)$	Unstable	/
$E_{2}$ (0,1,0)	$(+, +, \pm)$	Unstable	/
$E_{3}$ (0,0,1)	$(-, -, -)$	ESS	$M_{A} + P_{A} + C_{F} + C_{H} + T_{S} - C_{D} < 0$; $M_{B} + Q_{B} + C_{G} - T_{S} < 0$
$E_{4}$ (0,1,1)	$(+, \pm, \pm)$	Unstable	/
$E_{5}$ (1,0,0)	$(+, +, \pm)$	Unstable	/
$E_{6}$ (1,1,0)	$(-, -, -)$	ESS	/
$E_{7}$ (1,0,1)	$(\pm, +, \pm)$	Unstable	/
$E_{8}$ (1,1,1)	$(-, -, +)$	Unstable	/
$E_{9}$ $(x_{0}, y_{0}, z_{0})$	Eigenvalues with different signs	Unstable	/
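
The sign test behind Tables 2 and 3 can be checked numerically for any given parameter set. The sketch below evaluates the eigenvalue expressions of Table 2 at the eight pure-strategy equilibrium points and applies the usual linearization criterion: a point is asymptotically stable (an ESS) when all three eigenvalues are negative and unstable when at least one is positive. The parameter values and the names `corner_eigenvalues` and `classify` are assumptions made for illustration; with other values, for example ones satisfying the conditions listed for $E_{3}$, the classification changes.

```python
from types import SimpleNamespace

# Hypothetical parameter values for illustration only; not taken from the paper.
PARAMS = SimpleNamespace(
    R_E=10.0, C_D=6.0, C_F=1.0, C_H=2.0, T_S=1.5, C_G=1.0,
    M_A=1.0, M_B=0.5, P_A=3.0, Q_B=2.0, C_Z=1.0, C_J=1.0,
)


def corner_eigenvalues(p=PARAMS):
    """Evaluate the Table 2 eigenvalue expressions for one parameter set."""
    return {
        "E1": (p.C_F + p.C_H + p.T_S - p.C_D,
               p.C_G - p.T_S,
               p.P_A + p.C_J + p.Q_B - p.C_Z),
        "E2": (p.R_E + p.C_F + p.C_H - p.C_D,
               p.T_S - p.C_G,
               p.P_A - p.M_B - p.C_Z),
        "E3": (p.M_A + p.P_A + p.C_F + p.C_H + p.T_S - p.C_D,
               p.M_B + p.Q_B + p.C_G - p.T_S,
               p.C_Z - p.P_A - p.C_J - p.Q_B),
        "E4": (p.R_E + p.M_A + p.P_A + p.C_F + p.C_H - p.C_D,
               p.T_S - p.M_B - p.Q_B - p.C_G,
               p.M_B + p.C_Z - p.P_A),
        "E5": (p.C_D - p.C_F - p.C_H - p.T_S,
               p.C_G,
               p.Q_B - p.M_A - p.C_Z),
        "E6": (p.C_D - p.C_F - p.C_H - p.R_E,
               -p.C_G,
               -p.M_A - p.M_B - p.C_Z),
        "E7": (p.C_D - p.M_A - p.P_A - p.C_F - p.C_H - p.T_S,
               p.M_B + p.Q_B + p.C_G,
               p.C_Z + p.M_A - p.Q_B),
        "E8": (p.C_D - p.R_E - p.M_A - p.P_A - p.C_F - p.C_H,
               -p.M_B - p.Q_B - p.C_G,
               p.M_A + p.M_B + p.C_Z),
    }


def classify(eigenvalues):
    """Classify an equilibrium point from the signs of its eigenvalues."""
    if all(value < 0 for value in eigenvalues):
        return "ESS"
    if any(value > 0 for value in eigenvalues):
        return "Unstable"
    # No positive eigenvalue but at least one zero: linearization is inconclusive.
    return "Undetermined"


results = {name: classify(eigs) for name, eigs in corner_eigenvalues().items()}
stable_points = [name for name, label in results.items() if label == "ESS"]
for name, label in results.items():
    print(name, corner_eigenvalues()[name], label)
```

With these illustrative values only $E_{6}$ (1,1,0) passes the test, because the first condition for $E_{3}$, $M_{A} + P_{A} + C_{F} + C_{H} + T_{S} - C_{D} < 0$, does not hold for them.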

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
