1. Introduction
Reciprocal cooperation is an indispensable component of a sustainable society. Nearly half a century after the seminal work of Trivers [1] on reciprocal altruism, game-theoretic models of the evolution of cooperation through reciprocity remain at the forefront of evolutionary biology and the social sciences. Because helping is costly, self-interested individuals will free-ride on others, so unconditional cooperation is unlikely to evolve. The standard paradigm for the evolution of cooperation is therefore cooperation that is conditional on the other party’s cooperativeness, as in reciprocal cooperation.
To succeed in competition with free riders, reciprocal altruists require sufficient cognitive capacity to process information and discriminate non-free riders from free riders. When an interaction consists of iterated rounds between the same pair of individuals, reciprocity often takes the form of direct reciprocity [1,2,3]. Direct reciprocity is expressed as follows: A helps B, and then B helps A. Direct reciprocity requires memorizing what the co-player and oneself did for each other in past rounds of the iteration. In the absence of such iterations, as in the case of generalized exchange [4], reciprocity must be indirect [5,6,7,8,9]. Indirect reciprocity extends closed pairwise interactions to relationships involving external third parties. Implementing indirect reciprocity thus requires knowing what the involved players did to others, or had done to them by others, in the past, for example, through observing the co-player(s) directly or using reputation systems.
There are two types of indirect reciprocity: upstream and downstream reciprocity [7]. Downstream reciprocity can be expressed as follows: B helps C, and then A helps B (Figure 1b). In other words, the response to B helping C is not C helping B directly, but B being helped by a third party A, who observed B helping C and consequently evaluated B positively. B may thus be helped by A or by another party who was influenced by B’s positive evaluation, for instance, through gossip or reputation [10,11,12]. This is called ‘rewarding reputation’ [13]. Downstream reciprocity therefore uses reputation to identify partners with whom to cooperate. The motivation for such a reputational mechanism is often described as follows: ‘If I help you, then I will be deemed good, and then someone will help me.’ This is called ‘reputational giving’ [14].
Upstream reciprocity (also known as generalized reciprocity) is expressed as follows: A helps B, and then B helps C (Figure 1a). This type of reciprocity is characterized by not choosing the partners with whom to cooperate, which differs from the logic behind downstream reciprocity, based on conditional cooperation. Upstream reciprocity involves a chain of altruistic behaviors [15], called ‘paying it forward’ [16,17,18], which is driven by forces such as gratitude [13,18,19,20,21,22] or a sense of indebtedness [22], rather than by the expectation of direct or indirect reward. However, in the eyes of a third party, emotional behavior can be viewed as a kind of reputational behavior (and vice versa), so these motivations for reciprocity can easily become intertwined when evaluated.
Upstream and downstream reciprocity are commonly observed in experimental settings and field research [23,24,25,26,27,28,29,30]. Notably, different types of reciprocal mechanisms can be applied in tandem to promote cooperation [14]. However, upstream and downstream reciprocity have mostly been studied theoretically in isolation, and a comprehensive study combining both types of reciprocity is still missing.
In this study, we present a new model that integrates both types of indirect reciprocity. Our model demonstrates that a stable coexistence of reciprocal altruists and free riders is attainable through a strategy that relies solely on indirect reciprocity, without additional mechanisms such as direct or spatial reciprocity. Spatial reciprocity, like direct reciprocity, is a mechanism of reciprocity [31] that continues to be extensively studied [32,33,34,35]. It is characterized by a spatial or subgroup structure (or relatedness) among interacting individuals, which can increase cooperation among like-minded individuals. Existing studies [36,37,38,39] have shown that integrating direct or spatial reciprocity into upstream reciprocity can promote the evolution of the latter, suggesting that such outcomes may be a byproduct of the primary mechanism [31]. Generally, reciprocity mechanisms make cooperative engagements with peers more likely than encounters with all-out defectors [33]. In contrast, our integrated model leads to a unique dynamic in which interactions with all-out defectors may even be encouraged.
We specifically implement the interplay of upstream and downstream reciprocity within our framework [13], as follows (see Figure 1c). Let B be the modeled integrated reciprocator, who can act as either an upstream or a downstream reciprocator. First, assume that D helps E; witnessing this, the integrated reciprocator B deems D to be good and, as a downstream reciprocator, rewards D by helping them. Furthermore, if A is another integrated reciprocator who has already deemed B good, A will try to reward B by helping them as well. Then, as an upstream reciprocator, B will forward the help received to someone else (C). This may in turn lead to B being rewarded by another integrated reciprocator who witnesses it. The chain of forwarding and rewarding help may then continue in the same manner.
Recall that the chain of unconditional helping by upstream reciprocators is easily terminated when it reaches a free rider [37]. Helping is then reactivated only when a new chain begins. In contrast, in the model depicted above, selective rewarding is expected to make the reactivation of helping more likely. This point has been overlooked to date and may provide an important starting point for a comprehensive study of upstream and downstream reciprocity.
In the subsequent section, we formulate the integrated reciprocator model by incorporating forwarding and rewarding behaviors into the action rules of individuals, and then analyze the framework using evolutionary game theory [40,41]. This method diverges from conventional game models by accounting for the bounded rationality of players, and it aligns with biological evolution, in which advantageous strategies are naturally selected for. Our methodology allows the stability of a strategy to be tracked dynamically across evolutionary processes, not just at static equilibria, which goes beyond classical game-theoretic analyses. Specifically, we examine the global stability of the mixed equilibrium state brought about by the newly proposed strategy, that is, whether all interior orbits of the dynamical system converge to that point regardless of their starting conditions [41].
2. Results
2.1. The Setup
We build the model on the giving game in a well-mixed population. In each interaction event, two players are randomly selected from the population and interact for only one round. The roles of donor and recipient are determined by a coin toss. To simplify the analysis, we assume that a player acts as both a donor and a recipient in each round [40]. When acting as a donor, the player has the option to help (C) or not (D). Helping yields a benefit $b$ for the recipient at a cost $c$ to the donor, where $b > c > 0$. Not helping has no effect on either the donor or recipient. This yields an example of the well-known prisoner’s dilemma game [2]. We also consider implementation errors: with a given probability (the error rate), a donor fails to carry out the intended action, whether that is to help or not to help [42].
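As a minimal sketch of a single donor-recipient interaction in the giving game described above, the following Python function returns the payoffs of one encounter. The names `play_round`, `b`, `c`, and `error_rate` are illustrative, and the sketch assumes that an implementation error flips the intended action in either direction.

```python
import random


def play_round(intends_to_help, b, c, error_rate, rng=random):
    """Return (donor_payoff, recipient_payoff, helped) for one encounter.

    b: benefit to the recipient when helped (illustrative name)
    c: cost to the donor when helping (illustrative name)
    error_rate: probability that the intended action is not implemented
    """
    helped = intends_to_help
    # Assumption: an error flips the intention, whether it was C or D.
    if rng.random() < error_rate:
        helped = not helped
    if helped:
        return -c, b, True
    # Not helping has no effect on either player.
    return 0.0, 0.0, False
```

With `error_rate=0.0` the intended action is always carried out, so a helping donor pays `c` and the recipient gains `b`.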
We then apply the standard framework for studying the evolution of indirect reciprocity based on the giving game [43,44,45]. A player’s strategy is described by an action rule and an assessment rule. The action rule prescribes whether a player helps or not. After every round, each player acting as a donor is assigned a binary image of ‘good’ (G) or ‘bad’ (B) following the assessment rule. Note that the player’s image when acting as a recipient is assumed to remain unchanged. In this study, we consider public assessment, under which a representative observer monitors each game, applies an assessment rule to update images, and broadcasts the resulting information to the population. We assume that each player has perfect knowledge of the co-players’ actions and images.
2.2. Modeling Integrated Reciprocators
To study the interplay between upstream and downstream reciprocity, we allow for the circulation of forwarded and rewarded help, as shown in Figure 1c. We examine reciprocators that help conditionally according to the integrated action rule (Table 1a): those who received help in the previous round help a potential recipient regardless of the recipient’s image, while those who did not receive help in the previous round help a potential recipient only if the recipient’s image is good. In the following, we analyze a minimalistic setting in which each individual can choose one of three strategies: unconditional cooperator (X), unconditional defector (Y), and integrated reciprocator (Z). Unconditional cooperators and defectors always intend to help and not to help, respectively. The relative frequencies of the three strategies are denoted by $x$, $y$, and $z$, respectively, where $x + y + z = 1$. We assume that, in the learning process, strategies that earn a higher payoff are more likely to be imitated in the population. We study this process using replicator dynamics [41] (see Section 4 for details). In the following, we present the results obtained with the baseline model (Model I) and the refined model (Model II).
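The intentions of the three strategies can be written down directly from the verbal description above. The following Python sketch is illustrative only; the function names are hypothetical and implementation errors are not modeled here.

```python
def cooperator_intends_to_help(received_help_last_round, recipient_is_good):
    """Unconditional cooperator (X): always intends to help."""
    return True


def defector_intends_to_help(received_help_last_round, recipient_is_good):
    """Unconditional defector (Y): never intends to help."""
    return False


def integrated_intends_to_help(received_help_last_round, recipient_is_good):
    """Integrated reciprocator (Z).

    Upstream part: if helped in the previous round, forward the help
    regardless of the recipient's image.
    Downstream part: otherwise, help only a recipient with a good image.
    """
    if received_help_last_round:
        return True
    return recipient_is_good
```

The only situation in which an integrated reciprocator intends to withhold help is when it received no help in the previous round and the recipient has a bad image.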
2.3. Model I: Stable Coexistence of the Good and the Bad
We first developed Model I by considering the simplest assessment rule: those who help are deemed good and those who do not are deemed bad (Table 1b). We denote this rule by GBGB, where each component corresponds to an assessment outcome; for example, the first component, ‘G’, indicates that the donor is evaluated as good when they receive help (C) and subsequently provide help (C). This is the well-known scoring rule [46,47,48]. As shown in Figure 2, Model I can stabilize an intermediate level of cooperation in a mixed state of reciprocators and defectors (at P in Figure 2a,b). In this coexistence, unconditionally forwarding help through the upstream reciprocation part of the action rule (the upper row of Table 1a) can be exploited by defectors, but this is compensated for by conditional rewarding in the downstream reciprocation part (the bottom row of Table 1a). Thus, the problem regarding the evolution of upstream reciprocity [7,36,37,38,39,40] can be resolved through indirect reciprocity.
Figure 2 shows further details of the evolution of the three strategies. The phase portraits present a continuum of fixed points in the interior of the simplex
. Notably, dimorphic dynamics can be observed between integrated reciprocators and defectors, seen along the edge YZ given by
. In the case without errors (
Figure 2a), the edge YZ generally consists of a segment RZ, which is a basin of the attractor P, and another segment YR, which is a continuum of boundary fixed points. The attractor P:
is given by
The location of the attractor P asymptotically approaches the node Z () as the benefit–cost ratio increases. At the attractor P, the population average of the probability of helping is . We see that the curve PQ—which is a continuum of interior fixed points connecting the points P and Q—divides the simplex. Turning to the other boundaries, the dynamics between integrated reciprocators and cooperators along the edge XZ are neutral, and the dynamics between cooperators and defectors along the edge XY are dominated by defectors. Therefore, in the long run, considering random fluctuations can lead the population to ‘neutral drift’ along the lines of equilibria (i.e., PQ and RY), eventually coming into the vicinity of the node Y (the 100% defectors state).
In the case with errors (Figure 2b), an attractor P and a repeller Q can appear along the edge YZ. While the continua of boundary fixed points disappear, the continuum of interior fixed points along PQ remains. The dynamics between integrated reciprocators and cooperators become dominated by the former. Apart from these changes, the long-run evolutionary fate of the population remains similar, converging even more decisively to the 100% defector state (see Section 4 for details).
While Model I succeeds in inducing an attractor between reciprocators and defectors, the induced equilibrium is not asymptotically stable [41] against the invasion of cooperators. Therefore, regardless of the presence or absence of errors, random perturbations cause the population to leave the coexistence state in the long run. This is similar to the evolution of indirect reciprocity through scoring [42,49,50]. The lack of stability of the coexistence state can be explained as follows: the definition of goodness in Model I is based only on whether one helps or not, thus giving rise to the well-known problem of ‘unjustified defection’ [9,51,52] when reciprocators refuse to help those who are deemed bad. In this case, the image of the reciprocators becomes bad and their chance of being rewarded by other reciprocators decreases. When such a chain reaction of unjustified defection and image downgrading occurs, the advantage of being a reciprocator rather than a cooperator is lost.
2.4. Model II: Robustness against the Invasion of Cooperators
To strengthen the stability of the coexistence state, we propose Model II with a refined assessment rule (Table 1c): GBKK. Under the new rule, only those who implement upstream reciprocity should be rewarded by those who follow the action rule. Specifically, when a donor received help in the previous round, they are deemed good if they help and bad if they do not; when a donor received no help in the previous round, their image remains unchanged (denoted as K in Table 1c), regardless of whether they help or not in the current round. The new assessment rule is a kind of staying rule [53,54], designed so that rewarding focuses on upstream reciprocation.
The rationale for constructing Model II is multifaceted. Initially, in Model I governed by GBGB, the probability of reciprocators maintaining a good image varies with the fraction of reciprocators , as derived in Equation (14). In contrast, the probabilities that cooperators and defectors are perceived as good remain constant at and , respectively. From this, through preserving at the minimum and elevating to its maximum, we aim to enhance the chances for reciprocators to obtain benefits , equalizing them with cooperators. Consequently, in the presence of defectors, the probability that reciprocators—that is, those who cooperate conditionally—will incur costs is reliably lower than that of unconditional cooperators. It is thereby anticipated that reciprocators will, on average, realize a net payoff surpassing that of the cooperators, leading the mixed equilibrium P to be stable against the invasion of rare mutant cooperators. Accordingly, the design of Model II omits (i.e., ‘ignores’) the case where the focal player received no help (D) in the assessment, while exclusively assessing the case where the focal player received help (C). This realizes the goodness probability of reciprocators, , which takes the largest value of 1. As a result, the third and fourth bits of GBGB in Model I are set to K, and Model II can be represented as GBKK.
Model II can result in the coexistence of reciprocators and defectors, which does not allow cooperators to invade. In fact, in striking contrast to Model I, the dynamics for Model II have no interior equilibria (with or without errors).
Figure 3 shows that all interior orbits converge to the boundary of the simplex, particularly the edge YZ. If the rate of the implementation error,
, is sufficiently small, integrated reciprocators are better off than cooperators (i.e.,
holds). In the case without errors (
Figure 3a), along the edge YZ, there exists a unique fixed point P:
, with the same coordinates as in Model I, and the node Y is a saddle. At the attractor P, the cooperation rate (the probability to perform C) over the population is given by
. The dynamics on the other edges, XZ and XY, remain unchanged, as was the case in Model I. Thus, P is even the global attractor. Then, turning to the case with errors (
Figure 3b), the edge YZ can exhibit an attractor P and a repeller Q; thus, the population dynamics can be bistable, evolving either to the mixed state P or the 100% defector state Y (see
Section 4 for details).
The stability of the attractor P against the invasion of cooperators can be understood as follows. Suppose that an integrated reciprocator received no help in the previous round. Even if they then interact with a co-player with a bad image, the reciprocator’s image does not change, owing to the staying element (K) of the assessment rule in Model II (Table 1c). Thus, the occurrence of unjustified defection is prevented. This means that a reciprocator with a good image can keep that image and, thus, continue to be rewarded by other reciprocators.
Upon closer examination of the GBKK assessment rule, it is conceivable that an individual labeled good after reciprocating help (receiving prior C and giving C) might also be justly perceived as good when they choose to help (C), despite not receiving prior help (D). This more lenient variation is captured by the GBGK assessment rule. Our analysis indicates that its effects align closely with those under the GBKK rule. A significant point to understand this is that the component in GBGK enhances the probability of an individual attaining a good image () over the in GBKK, thus elevating to its maximum value of 1 when errors are absent, as well as in GBKK. The evolutionary dynamics are dictated by the expected payoffs of strategies, which fully hinge on the goodness probabilities , , and . With no errors present, GBKK and GBGK both yield probabilities of 1, 0, and 1, correspondingly, thus resulting in a consistent evolutionary trajectory for the population.
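To make the three assessment rules discussed so far concrete, the sketch below encodes each four-letter code as a lookup table. It assumes the component order (received C, gave C), (received C, gave D), (received D, gave C), (received D, gave D), which is how the rules are described in the text. The helper names `make_rule` and `update_image` are hypothetical, and assessment errors are not modeled.

```python
# Order of situations assumed for the four letters of a rule code:
# (what the donor received in the previous round, what the donor gave now).
SITUATIONS = [("C", "C"), ("C", "D"), ("D", "C"), ("D", "D")]


def make_rule(code):
    """Turn a code such as 'GBKK' into a {(received, gave): verdict} table."""
    if len(code) != 4 or any(ch not in "GBK" for ch in code):
        raise ValueError("expected four letters from G, B, K")
    return dict(zip(SITUATIONS, code))


def update_image(code, received, gave, current_image):
    """Return the donor's new image; 'K' keeps the current image."""
    verdict = make_rule(code)[(received, gave)]
    return current_image if verdict == "K" else verdict


# A good donor who received no help and then refuses to help:
print(update_image("GBGB", "D", "D", "G"))  # Model I (scoring): downgraded
print(update_image("GBKK", "D", "D", "G"))  # Model II (staying): image kept
print(update_image("GBGK", "D", "D", "G"))  # lenient variant: image kept
```

The contrast between the first and the other two calls illustrates why the staying element protects a reciprocator who withholds help after receiving none.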
2.5. Cooperator, Defector, Upstream Reciprocator, and Downstream Reciprocator
The results above can be compared with those considering the evolution of the four strategies: unconditional cooperators, unconditional defectors, upstream reciprocators, and downstream reciprocators. Indeed, our study shows that the replicator dynamics for the four strategies can only result in the bistable fate of the population, as in the evolution of downstream reciprocity. The state space is divided into two distinct regions by a continuum of stable and unstable fixed points, given by
(with
), that is, the planar set (
Figure 4). Considering random perturbations, this causes the population to end up in the 100% defector state. This reveals that the simple extension of the strategy space to upstream reciprocators has no effect on improving the stability of cooperation (see
Section 4 for details).
3. Discussion
This study ventures into unexplored territory within the evolution of indirect reciprocity. Unlike traditional models, which often pair upstream reciprocity with direct or spatial reciprocity while overlooking downstream reciprocity, our model reveals that an asymptotically stable global attractor that sustains high levels of upstream reciprocation is achievable when upstream reciprocation is integrated with downstream reciprocation, even in the absence of errors. Notably, in this attractor, the integrated reciprocators coexist with all-out defectors, yielding evolutionary dynamics not seen in previous models within the challenging confines of the one-shot prisoner’s dilemma game in a well-mixed population. In contrast, attractors between conditional and unconditional cooperators have been intensively studied [43,55,56,57,58,59].
The role of errors has been central in previous models: conditional strategies that foster cooperation are, ironically, at risk of erosion by unconditional cooperation once a fully cooperative state is attained. To stabilize conditional cooperation, most models have introduced errors that cause conditional cooperators to deny help to unconditional ones, thereby securing an advantage. As such, errors have hitherto been essential in stabilizing conditional cooperation [44,57,60,61,62]. Our model departs from this convention, showing that the integration of upstream and downstream reciprocity can stabilize altruists in the presence of free riders without relying on errors.
A broad aspect of our framework involves the interaction with errors in perception, such as when a player misinterprets an opponent’s reputation. Initially, we assumed a perfect public assessment condition, under which the model’s design focuses on errors in implementation, thus bypassing errors in perception. This simplification, while not detracting from the principal outcome—namely, that stable coexistence is attainable sans errors—implies that the inclusion of errors in perception could yield different results under varying model assumptions, such as the action and assessment rules.
Model I’s GBGB assessment rule confines the mixed equilibrium point P to a limited basin of attraction and leaves it only neutrally stable against rare mutant cooperators (Figure 2a). The introduction of inattention (the staying element K) in Model II addresses these concerns by allowing integrated reciprocators to invade the homogeneous defector state (node Y) while maintaining their good image. Furthermore, inattention raises the probability that integrated reciprocators are deemed good to its maximum of 1, equal to that of cooperators. This thwarts the invasion at the point P of cooperators, who, unlike reciprocators, are subject to exploitation by defectors. Consequently, the stability of the point P is strengthened (Figure 3a) without unduly increasing the model’s complexity.
A remaining issue, which we leave for future work, is that the stable coexistence established in Model II can become unstable due to the invasion of ‘pure’ upstream reciprocators (Table 2a). Pure upstream reciprocators can free ride on the costly rewarding by the integrated reciprocators. A conceivable countermeasure would be to update the assessment rule so as to downgrade pure upstream reciprocators. We also remark on another type of free rider: those who only employ downstream reciprocity (Table 2b) and, thus, can free ride on the costly unconditional forwarding of help. Our analysis suggests that the coexistence state can be stable against the invasion of pure downstream reciprocators (see Section 4 for details).
Our research underscores the need to investigate robustness with respect to the inclusion of all possible action rules and to systematically encompass concepts such as negative reciprocity or forwarding ‘greed’ strategies [63,64]. The interplay with more complex forms of downstream reciprocity, exemplified by the leading eight norms [45,65,66], warrants comprehensive examination. The application of private assessment in the norm ecosystem [30,67,68,69,70,71,72,73], combined with other types of reciprocity [36,37,38,39,74,75] and sanctioning mechanisms [76,77], remains a significant area for future research. The relevance of spatial reciprocity can also be revisited in light of findings suggesting that certain network structures, such as directed triangular cycles [34], can indeed promote cooperative behaviors, contrary to earlier beliefs [7].
The coexistence of different cooperation levels in human societies, which is often observed but less well understood, opens a new realm of inquiry into the conditions fostering such polymorphism [78,79,80]. Our analysis of the demanding one-shot prisoner’s dilemma-like environment in a well-mixed population, independent of additional factors such as repeated interactions or spatial structure, yields a stable polymorphism of reciprocal altruists and free riders through the sole mechanism of indirect reciprocity. This theoretical foundation contributes to the discussion on the evolution of cooperation and diversity and invites further exploration.
4. Materials and Methods
4.1. Evolutionary Dynamics and Image Dynamics
We analyzed the model using evolutionary game theory and investigated the replicator dynamics for the set of strategies considered. We assumed an infinitely large population that evolves slowly, such that the composition of the population can be regarded as constant over consecutive rounds. In general, the replicator dynamics are given by $\dot{x}_s = x_s(\pi_s - \bar{\pi})$, where $x_s$ denotes the relative frequency of individuals who employ strategy $s$, $\pi_s$ is the expected payoff per round for strategy $s$ ($\pi_s$ is determined after playing an infinitely large number of rounds), and $\bar{\pi}$ is the average payoff over the population, given by $\bar{\pi} = \sum_s x_s \pi_s$.
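For readers who wish to explore such dynamics numerically, the following is a minimal sketch of a forward-Euler integration of the standard replicator equation. It is not the procedure used to produce the figures. The function names and the `payoff_fn` callback, which must return the expected payoff of each strategy for a given composition, are hypothetical, and the model-specific payoffs are deliberately left to the caller.

```python
def replicator_step(freqs, payoff_fn, dt=0.01):
    """Advance the strategy frequencies by one forward-Euler step.

    freqs: list of relative frequencies summing to 1
    payoff_fn: callable mapping freqs to a list of expected payoffs
    """
    payoffs = payoff_fn(freqs)
    average = sum(x * p for x, p in zip(freqs, payoffs))
    # Each strategy grows in proportion to its payoff advantage over the average.
    updated = [x + dt * x * (p - average) for x, p in zip(freqs, payoffs)]
    # Renormalize to counter floating-point drift away from the simplex.
    total = sum(updated)
    return [u / total for u in updated]


def integrate(freqs, payoff_fn, steps=1000, dt=0.01):
    """Iterate replicator_step and return the final frequencies."""
    for _ in range(steps):
        freqs = replicator_step(freqs, payoff_fn, dt)
    return freqs
```

A small step size `dt` keeps the Euler scheme close to the continuous-time dynamics; an adaptive ODE solver would be a reasonable alternative.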
In the first step, we investigated the dynamics for three strategies: unconditional cooperator (X), unconditional defector (Y), and integrated reciprocator (Z). We denote these relative frequencies by , , and , respectively. Thus, and . We also denote the relative frequency of those who have a good image within each strategy subpopulation by , with . We denote the frequency of the good over the whole population as .
We introduce a minimalistic framework that can deal with the interplay of upstream and downstream reciprocity using the generalized first-order action and assessment rule. The generalized first-order assessment rule is given by the following matrix:
where each element
denotes the probability that the focal player who received action
in the previous round and then gave action
in the current round, with
, is deemed good. This matrix acts as a function of what the focal player does and what was done to the focal player and, thus, can cover the first-order assessment rules such as scoring (
Table 1b).
In the equilibrium state (attained by starting from the state in which all have a good image,
), the frequency of the good for each strategy should satisfy the following:
where
and
denote the probabilities that the focal player with strategy
receives action
and that the focal player with strategy
gives action
if they have most recently received action
in a given round, respectively. Therefore,
,
, and
.
We provide the generalized first-order action rule using the following matrix:
where each element
denotes the probability that the focal player who received action
in the previous round and is then given an opponent with image
implements action
as the potential donor in the current round. This framework covers the fundamental action rules of integrated reciprocity (
Table 1a), upstream reciprocity (
Table 2a), and downstream reciprocity (
Table 2b).
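The three action rules covered by this framework can be sketched as tables indexed by what the focal player received in the previous round and by the recipient's image. The encoding below is illustrative. It assumes that an implementation error flips the intended action, so that the probability of actually giving C is one minus the error rate when help is intended and equal to the error rate otherwise. The names `INTEGRATED`, `UPSTREAM`, `DOWNSTREAM`, and `action_probabilities` are hypothetical.

```python
# Keys: (action received in the previous round, recipient's image).
# Values: whether the donor intends to help.
INTEGRATED = {("C", "G"): True, ("C", "B"): True,
              ("D", "G"): True, ("D", "B"): False}
UPSTREAM = {("C", "G"): True, ("C", "B"): True,
            ("D", "G"): False, ("D", "B"): False}
DOWNSTREAM = {("C", "G"): True, ("C", "B"): False,
              ("D", "G"): True, ("D", "B"): False}


def action_probabilities(intentions, error_rate):
    """Map each situation to the probability of actually implementing C.

    Assumption: an error flips the intended action in either direction.
    """
    return {
        situation: (1.0 - error_rate) if intends else error_rate
        for situation, intends in intentions.items()
    }
```

Comparing the tables shows that the integrated rule intends to help whenever either the upstream or the downstream rule does.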
Using the notation in Equation (4), the probability that a donor with strategy
implements C to (or helps) a recipient with strategy
is given by
in which
and, thus,
. This yields
and
where
denotes the relative frequency of strategy
.
Therefore, for the minimalistic setting with the strategy space
, we have
and
By solving Equations (3), (8) and (9), we can obtain , , and for each point of the state space .
We assume that the image dynamics in Equations (3) and (5) are so fast that the replicator dynamics can be determined according to the expected payoffs, which depend on
and
in the equilibrium state of the image dynamics. We also assumed that the image dynamics start from a situation in which all individuals have a good image. The expected payoffs for the strategies are given by
4.2. Model I
From the assessment rule that those who help are deemed good (
Table 1b), we have
Thus, substituting Equation (11) into Equation (8) yields
in which
holds. The zero set of
as a function of
provides a continuum of fixed points for the replicator dynamics in the interior of the two-dimensional state space
. This is what the interior curve PQ describes in
Figure 2a,b. First, we focus on the case without errors (
Figure 2a). From Equations (12) and (13), we see that, for the segment ZR on the edge YZ with
, the fraction of the good converges to
or, otherwise, for the segment RY, with
, to
. Hence, the fraction of the good over the entire population (i.e., the frequency of those who cooperate) is
. Substituting Equation (14) into Equation (13) yields the zero set of Equation (13) on the segment ZR, which is given by
in Equation (1). It follows that the point P with
is an attractor with a basin ZR. In contrast, the segment RY consists exclusively of fixed points, along which
yield
. Turning to the dynamics along the edge XZ, we have
and, thus,
. Hence, it follows that the dynamics of integrated reciprocators and cooperators are neutral. On the edge XY, it is obvious that
yields
and, thus, defectors dominate cooperators.
We then examined the case with errors (
Figure 2b). Using numerical simulations, we observed that an attractor P and a repeller Q can appear along the edge YZ in general. As the error rate is non-zero, the fraction of the good among integrated reciprocators,
, can always take a non-zero value. Similarly,
, never attains its full value. As a result, no continuum of boundary fixed points appeared along the boundary of the state space. In contrast, Equations (12) and (13) hold regardless of the presence or absence of errors and, thus, a continuum of interior fixed points remains. When considering neutral drift or random perturbations, particularly in the case with errors, the population in the long run converges to the 100% defector state (node Y). Interestingly, the global dynamics for Model I are similar to those for scoring [
40].
4.3. Model II
Using the staying element in the assessment rule (
Table 1c) for the equilibrium state of the image dynamics, we have the following equations:
which obviously lead to the following constant values:
In striking contrast to Model I, the replicator dynamics for Model II have no interior equilibrium in the state space, and we can thus see that all interior orbits converge to the boundary of the state space (
Figure 3a,b). Indeed, the payoff difference between reciprocators and cooperators is given by
in which
holds in the interior state space, yielding
for sufficiently small errors with
.
Next, let us examine the dynamics between integrated reciprocators and defectors along the edge YZ. For
, we have that
and, furthermore, in the case without errors (
= 0),
Thus, for , the point P with as in Equation (1) becomes an attractor along the edge YZ. We also note that the dynamics along the edges XZ and XY remain unchanged from those for Model I.
For these reasons, in the case without errors, it follows that all interior orbits will converge to P; thus, P is the global attractor (
Figure 3a). Using Equation (9), we have that the probability that integrated reciprocators give C,
, is equal to
. Thus, its population average is
.
In the case with errors, it turns out that an attractor P and repeller Q appear simultaneously on the edge YZ. Hence, the replicator dynamics have only two local attractors, P and the node Y. As a result, the global dynamics are bistable: the population converges to either P or Y (
Figure 3b).
4.4. Stability of the Attractor P against Invasion of Pure Downstream Reciprocators in Model II
Here, we prove that a rare mutant of pure downstream reciprocators (W) is worse off than the resident population consisting of defectors (Y) and reciprocators (Z) in the case without errors. Consider pure downstream reciprocators (PDR), who employ the action rule in
Table 2b and the assessment rule in
Table 1c. We first note that
and, thus,
on the edge YZ. Using this, we calculated the probability for PDR to receive C as
, and the probability for PDR to give C as
.
Similarly, the probability for integrated reciprocators to receive C is given by , and the probability for integrated reciprocators to give C, , is equal to . Therefore, it follows that the expected payoff for mutant PDRs, , is smaller than that for resident integrated reciprocators . In other words, the mutants are not selected for among the residents along the edge YZ (including P).
4.5. Cooperator, Defector, Upstream Reciprocator, and Downstream Reciprocator
We also explored the evolution of four strategies: unconditional cooperators, unconditional defectors, upstream reciprocators, and downstream reciprocators. Downstream reciprocators intend to help the recipient if the recipient helped someone else in the previous round; if the recipient did not help, downstream reciprocators intend not to help (Table 2b). Upstream reciprocators intend to help the recipient, regardless of the recipient’s image, if they received help in the previous round; otherwise, they intend not to help (Table 2a).
We denote by
,
,
, and
the relative frequencies of unconditional cooperators (X), unconditional defectors (Y), upstream reciprocators (V), and downstream reciprocators (W), respectively. Thus,
and
. The frequency of the good over the entire population is given by
. Then, as in Equation (8), we can obtain the following equations to recursively define
:
By solving Equations (3), (9), and (20), we obtain
,
and
. Substituting these values into Equation (10) allows us to calculate the payoffs and, thus, the replicator dynamics. For Model II,
, the probability that a player with strategy
gives C is given by
Figure 4 describes the evolution of the four strategies based on the replicator dynamics.
Figure 4a shows the boundary dynamics on each face. On the X-Y-V plane, defectors are dominant. For the other three faces (X-Y-W, X-W-V, and Y-W-V), there can exist a continuum of interior fixed points if
. We also see that the edge dynamics between downstream reciprocators and upstream reciprocators are neutral. Therefore, a random shock can eventually bring the population to the node Y, which is the homogeneous state for defectors.
Figure 4b shows the interior dynamics. If
, there exists an intersection of the plane (planar continuum of fixed points) and the 3D simplex
; otherwise, there is no interior fixed point in
.
Figure 4b shows that the intersection consists of stable and unstable fixed points. Depending on the initial conditions, the population may first evolve to a stable point within the plane. Regardless of the initial conditions, the random perturbation can still cause the population to finally converge to the node Y.