1. Introduction
Due to the high energy costs implied by Proof-of-Work (PoW) consensus, Proof-of-Stake (PoS) has increasingly attracted the attention of investors [1,2], mainly for technological reasons [3]. Instead of finding a nonce by the usual trial-and-error process, which requires computational power, a validator, i.e., a staker, invests an amount of the underlying cryptocurrency to contribute to the blockchain [4,5,6]. The contribution could be the validation of history, or of the most recent transactions to form the next block. In the latter case, the consensus pseudo-randomly chooses one staker among the pool of stakers, and the probability of selection is equal to the staker's proportion of the total investment [2]. As an example, if there are three stakers, A, B and C, such that A and C each hold a proportion of 25% of the staked coins, then B is twice as likely to be selected by the pseudo-random process as A and C, who have an equal probability of being selected. It is worth pointing out that a cap could be applied: stakers cannot invest above a threshold, corresponding to the maximum investment possible. They also cannot earn more than a given amount. This ensures diversification and avoids the presence of whales in the staking pool.
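The three-staker example can be sketched in a few lines. This is only an illustration: the function name and the integer deposits are hypothetical, and the only rule taken from the text is that the selection probability equals the share of the total stake.

```python
from fractions import Fraction


def selection_probabilities(stakes):
    """Map each staker to stake / total stake, i.e., its selection probability."""
    total = sum(stakes.values())
    if total <= 0:
        raise ValueError("total stake must be positive")
    return {name: Fraction(amount, total) for name, amount in stakes.items()}


# Hypothetical deposits: A and C hold 25% each, B holds the remaining 50%.
stakes = {"A": 1, "B": 2, "C": 1}
probs = selection_probabilities(stakes)
```

Exact fractions are used so that the "twice as likely" relation between B and the other two stakers holds without rounding error.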
It is then quite logical that investors are looking for a standard model of the expected return they would earn in the future by staking coins in a PoS blockchain and positioning themselves as stakers. Thus, from a staker's point of view, the investment, which we name staking in our context, consists of depositing some coins; in exchange, the investor expects to gain some reward for the validation work they perform on the blockchain.
To the best of our knowledge, there has been very little focus in the academic literature on modeling staking rewards. The reason appears simple to us: each PoS blockchain has its own reward rules, and it seems difficult to propose a general framework of rewards. However, [7] provides a dynamical model of the staking economy. In particular, the staking rewards follow a dynamic process through the Hamilton–Jacobi–Bellman equation and are a function of the aggregate amount of staked coins. The response of staking ratios to stochastic impulses is shown, and statistics reveal the range for the staking reward rate to be between
and
. Slashing is mentioned but neglected. Ref. [
8] exposes some arguments in favor of PoS being a fixed income product, and the main finding is that the PoS ‘yield’ should remain stable in time. In fact, we think that the stability of the reward rate should be effective for blockchains which are sufficiently robust against rule changes and attacks. Ref. [
9] defines the block reward as the difference between the cryptocurrency supply of one block and that of its previous one. They then use reward as a parameter to compare the number of investors with the one when there is no reward. Ref. [
10] develops a model which affects the dynamics of investor wealth. Ref. [
11] provides the optimal reward design at equilibrium in the presence of malicious agents.
Regarding transaction costs, [12] estimates transaction costs in an equilibrium framework (not necessarily in the crypto area). Ref. [13] provides the optimal transaction fee so that a transaction is included in the Ethereum blockchain. Ref. [14] uses LSTM models, attention models, and CNN-LSTM to forecast the gas price. To the best of our knowledge, no statistical analysis has been performed to capture the distribution of transaction fees in a PoS context. This kind of analysis is needed (though in a non-exhaustive way), as our model makes an assumption on this distribution.
It is worth noting that, generally speaking, existing models use the staking rate as a parameter to characterize the dynamics of a PoS blockchain. We have not seen specific modeling for proper estimation of the staking rate in a standard way for the industry (including especially slashing effects), which is the purpose of this paper.
2. On the Staking Reward Calculation
Our approach for modeling staking rewards, in this article, is based on a comparison of staking with a cash-flow discount system applied to a floating-rate note model: an investor invests an amount of cash to earn some expected interest in the future. Mathematically, staking should correspond to a floating-rate process (see, for example, [15,16]), where the future returns may vary, and to a blockchain which can default if it is not sufficiently active, i.e., if there is not enough fluidity in its maintenance and construction. If the numbers of validators, users, and staked coins increase with time (which is the case for Ethereum; see [1]), it seems reasonable not to consider the risk of default for a healthy blockchain (in other words, a healthy blockchain may be considered triple-A rated in standard rating notation). However, there are some differences between a PoS consensus and an investment in a cash-flow discount model: the ‘savers’ need to make sure they are performing the validation correctly, as, otherwise, they risk being slashed, that is, being excluded for a certain amount of time, with some proportion of their staked coins burnt. In addition, the Maximal Extractable Value (MEV) (see [1] for an introduction, for example, to the MEV-boost algorithm) is an important aspect of the PoS consensus: the transactions to be stored in the coming block are ranked according to the amount of their transaction fees so that stakers extract maximal gain. Drawing a parallel with TradFi, the MEV may be viewed as the transaction costs for investing in traditional capital markets.
The investment due to staking implies a rate of gain for the staker. Fundamentally, a rate is the amount charged by the lender to the borrower to lend money. A reward is an incentive given in recognition for a service, effort, achievement, or a mechanism to motivate participation. When it comes to a blockchain, be it PoW, PoS or any other consensus mechanism, the above considerations still remain. For example, the miners spend their time and energy to mint new coins in the hope of receiving compensation for their effort, while stakers invest cash to earn an expected reward. Below are some of the benefits of mining and/or staking:
Mining rewards: Miners receive newly minted coins for successfully validating and adding new blocks to the blockchain. The miners are also compensated for protecting the network from spam attacks.
Staking rewards: In addition to the rewards mentioned above, the stakers receive opportunities to vote on protocol upgrades (e.g., staking reward amount) and changes in addition to partaking in the overall governance of the blockchain.
Regarding voting on protocol upgrades, blockchains are designed to be adaptable, allowing for upgrades and improvements to the protocol over time. These upgrades can include changes to consensus algorithms, security features, or the addition of new functionalities. Stakers often get to vote on these and other changes to the protocols, which might address issues such as security vulnerabilities, scalability improvements, or the addition of new features.
From a blockchain viewpoint, through this mechanism of rewards, the consensus encourages participation, which in turn helps it have a broader reach and appeal. Through this procedure, the blockchain achieves coin distribution: coins get distributed to the wider community, helping enlarge the stakeholder base and reducing the concentration of coins in the hands of a few. In addition, increasing the number of agents increases the security of the blockchain.
Overall, staking rewards play a critical role in fostering network participation, securing the blockchain network, and promoting the growth and adoption of the underlying digital asset. Nonetheless, a general and standard model for a staking rate has become an industrial need as discussed above. Investors are mostly interested in an estimation of their Annual Percentage Yield (APY) as stakers. At a given time, if
r is the staking rate and
f is the yearly frequency of reward, the APY is given by
It is worth mentioning that the specific mechanisms for determining the rewards, be it mining or staking, can vary significantly between different networks. Some may have fixed or predictable rates, while others may use more dynamic or adaptive methods. One may compare these mechanisms with the activities of central banks within closed and open economies. Thus, some models developed for the purpose of TradFi help determine the value of the rates. These models are based on fixed-income valuation models, more specifically, cash-flow discount valuation models. The reward that the validator receives should broadly compensate for (i) effort towards validating the transactions; (ii) risk for staking in the case of the PoS consensus mechanisms; and (iii) demand and supply for the validation services.
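How the APY is obtained from a staking rate r and a yearly reward frequency f depends on whether r is quoted per reward period or per year. The sketch below shows both usual compounding conventions; the function names are hypothetical, and we do not claim that either one is the exact form used in this paper.

```python
def apy_from_periodic_rate(rate_per_period, periods_per_year):
    """APY when the rate is earned (and reinvested) once per reward period."""
    return (1.0 + rate_per_period) ** periods_per_year - 1.0


def apy_from_nominal_rate(nominal_annual_rate, periods_per_year):
    """APY when the rate is a nominal annual rate split evenly over the periods."""
    per_period = nominal_annual_rate / periods_per_year
    return (1.0 + per_period) ** periods_per_year - 1.0


# Illustrative inputs only: 1% per month is the same as a 12% nominal annual rate.
monthly_view = apy_from_periodic_rate(0.01, 12)
nominal_view = apy_from_nominal_rate(0.12, 12)
```

Both calls describe the same investment, so they return the same yield; the two functions differ only in how the input rate is quoted.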
Our approach to calculating the staking rate is therefore to use cash-flow discount mathematics (see Section 4.2) because one can see a staker as an investor investing money in a fixed income security, receiving expected gains from the blockchain in the future at validation times. In fact, if the investor is selected by the blockchain, then the gain is positive, and if not, it is zero, provided the staker is not slashed and has no particular reason to withdraw their money from the staking pool. This feature could be viewed as an investment in a security where the issuer will not pay the interest if there is a violation of the validation rules. Practically, the idea of our model is that the calculated rate below should be updated each time a new block is added to the blockchain. This gives a time series of rates which reflects the gain investors earn over time (see Section 4.5). It is worth stressing that our approach is valid at the moment we calculate the rate: from one moment to another, data change, and what we calculate is a dynamic rate of return. In addition, it is also worth stressing that the gain should consider MEV as well. This is what we do in this paper.
However, such an approach is more debatable in a PoW context: we propose an ‘equivalent’ cash-flow discount approach to PoW blockchains, and derive a working rate, whose final expression is structurally very close to the staking rate. In the PoW context, the probability that a given miner, rather than any other, appends their candidate block needs to be calculated (for a study of the mining probability laws, see [17]; for a mathematical introduction to (PoW) blockchain, see [18]).
The rest of this paper is organized as follows. Section 3 presents the results. Section 4 explains the core of our model, and first introduces the Staking Probability Space (Section 4.1), which validates the settings and allows the probability calculation; then we derive the staking rate in Section 4.2 through a very simple floating-rate note model. Section 4.3 and Section 4.4 deal with two important staking addons to the simple floating-rate note model, which make the staking model more exhaustive and specific to a staking context: Section 4.3 introduces the slash rate into the model, while Section 4.4 adds the MEV. Section 4.5 applies the developed concept to Ethereum 2.0. We then apply in Section 4.6 the cash-flow discount model in a PoW context. We discuss the assumptions of the methodology and the results in Section 5 and conclude in Section 6.
4. Formal Derivation of the Staking Rate
We introduce the following notations.
and ;
The discrete set is , for any ;
is the set of real numbers and ;
$\mathbf{1}_X$ is the indicator function associated with the event X (i.e., it is 1 if the event X occurs, and 0 otherwise).
4.1. Probabilistic Definition of Staking
We suppose there are stakers in total. The validator, for , has deposited an amount of coins. This staker is depositing coins so that they can validate the next block. We assume they start investing at time (this is considered to be at present—thus, in the following, the variable t represents a forward time), and the choices of a validator occur at times , then , etc. Thus, we define as the increasing sequence of validation times (we assume that validation coincides with reward and there is selection of one unique staker per round). Without any loss of generality, we set for all .
For each
, the
staker is selected, or not. Thus, if
means they are selected and
means they are not, we define the sampling set as
Thus, an element
of
writes as
Let
be the algebra generated by all elementary cylinders
C of the form
The
$\sigma$-algebra generated by
is denoted
. The space
is similar to the Bernoulli one. Thus, we can construct (e.g., [
19,
20]) a unique probability measure
such that
and
, for any
, for some
. The probability space
is the space of our interest.
Therefore, we can set the random variable
defined on
with values in
, and such that
so that
is a Bernoulli random variable describing if the staker
i is selected, with probability
p, or not, with probability
.
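The Bernoulli description of selection can be illustrated by simulation. The sketch below assumes independent rounds with the same selection probability p for the staker under study; the function name, the value p = 0.25 and the seed are hypothetical choices made for the example.

```python
import random


def simulate_selection(p, rounds, seed=0):
    """Draw one Bernoulli(p) indicator per round: 1 if the staker is selected."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(rounds)]


# Hypothetical staker holding a quarter of the staked coins.
draws = simulate_selection(p=0.25, rounds=100_000, seed=42)
empirical_frequency = sum(draws) / len(draws)
```

Over many rounds the empirical selection frequency should be close to p, which is the property the expected-gain computation of the next section relies on.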
4.2. Staking Rate Derivation
Given the investment of
, we set
where
(interpreted as the total staked coins). Note that we do not necessarily assume that
is bounded: the consensus can avoid (depending on the blockchain) any single staker having too much power so that
p cannot exceed a given number in
.
Let . At time , the staker gain is given by . Then, if and only if the staker i is selected (otherwise, ). We assume that does not depend on i and n. We write , where g is the expected reward of any staker at any added block, and is a measurable quantity from the blockchain.
The whole system could be seen as a two-counterparty entity:
- 1.
The staker i;
- 2.
The blockchain pool, regularly rewarding staker i.
If today the rate of the gain to be received at time is given by , the expected present value of this gain is .
Definition 1. From the staker viewpoint, the total investment is , where represents the present value for the staking investment. The staking rate is the rate r, which makes the total investment equal to 0.
The rate r makes sense from both party viewpoints when since none of the two parties will commit if at least one loses money immediately.
Claim 1. Under the above notations and assumptions, the staking rate r is given by: Proof. The
present value for the staking investment after engaging in such an exchange with the blockchain is given by:
Here,
g is the
expected gain staker
i is looking for. From this equation, we therefore have:
We finally obtain the
staking rate by using the equation
. We have:
□
This rate does not depend on i, which makes it universal: it is relevant to any investor.
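As a numerical illustration of Definition 1, the sketch below solves for the rate at which the present value vanishes. It rests on our reading of the setting, stated here as an assumption: staker i, with deposit `deposit` out of `total_staked` coins, expects `p * block_reward` per round with `p = deposit / total_staked`, paid at times 1, 2, ... in perpetuity. All names and numbers are hypothetical.

```python
def present_value(rate, deposit, expected_gain_per_round):
    """Pay `deposit` now, then receive a perpetual expected gain each round."""
    # Geometric series: sum over n >= 1 of c / (1 + rate)^n equals c / rate.
    return -deposit + expected_gain_per_round / rate


def staking_rate_by_bisection(deposit, total_staked, block_reward, lo=1e-12, hi=1.0):
    """Find the rate with zero present value (assumed to lie between lo and hi)."""
    p = deposit / total_staked          # assumed selection probability
    gain = p * block_reward             # assumed expected gain per round
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if present_value(mid, deposit, gain) > 0.0:
            lo = mid                    # present value still positive: rate too low
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Two hypothetical stakers of different sizes in the same pool.
rate_small = staking_rate_by_bisection(32.0, 1_000_000.0, 100.0)
rate_large = staking_rate_by_bisection(320.0, 1_000_000.0, 100.0)
```

Under these assumptions both stakers obtain the same root, `block_reward / total_staked`, in line with the remark that the rate does not depend on i.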
4.3. Slash Rate Inclusion
We amend the model developed above with the inclusion of the
slash rate, which is the percentage of stakers slashed because they have not respected validation conditions. Still focusing on staker
i, we introduce the slash rate
s (we assume it is independent of
i), as the probability for staker
i to be slashed between two consecutive blocks (typically the previous one at present and the new coming one), that is to be banned from the staking pool due to not following the required validations (they may come back in the future, which means, for simplicity, that they would have to start from the beginning). Thus, we are assuming that, once a staker is slashed, they recover their coins (minus a burnt proportion) and are not stakers anymore (see
Section 5 for a discussion of this assumption). In practice, if the staker
i is slashed, then a proportion
of their staked coins is burnt, resulting in
retrieved staked coins.
Let
be the discrete random variable defined on
with values in
, which is the time for staker
i to be slashed. We introduce once more the gain
as in
Section 4.2.
In addition, we assume that the slashing process is memoryless: the slashing process for staker
i can occur at any time in the process and independently of its history. In practice, this means that the risk of being slashed arises between any two consecutive blocks, no matter their respective place in the blockchain, and with equal probability. Since the time
to be slashed is discrete, the
Memoryless Property Theorem (see [
21]) implies that
follows a geometric distribution.
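The memoryless property of a geometric slash time can be checked directly. The sketch assumes the convention that the slash time takes values 1, 2, ... with per-block slash probability s; the function names and the numerical value of s are hypothetical.

```python
def slash_time_pmf(s, n):
    """P(T = n): survive n - 1 blocks, then be slashed at block n (n = 1, 2, ...)."""
    return s * (1.0 - s) ** (n - 1)


def survival(s, n):
    """P(T > n): not slashed during the first n blocks."""
    return (1.0 - s) ** n


def conditional_survival(s, m, n):
    """P(T > m + n | T > m): survive n more blocks given m blocks survived."""
    return survival(s, m + n) / survival(s, m)


# Hypothetical slash probability per block.
s = 0.01
unconditional = survival(s, 5)
after_seven_blocks = conditional_survival(s, 7, 5)
```

The two values coincide: having already survived seven blocks does not change the probability of surviving five more, which is exactly the memoryless property.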
We can set the random variable
defined on
with values in
, and such that:
so that
is a Bernoulli random variable describing if the staker
i is slashed, with probability
s, or not, with probability
.
We introduce the random variable
defined on
with values in
, and such that the event
is that staker
i is slashed at time
. We therefore have:
with the convention
. The first equality is due to the memoryless property. Finally, it is worth pointing out that we now let:
Claim 2. Under all above notations and assumptions, the staking rate is given by: Proof. Considering slashing, the gain becomes
. Equation (
8) becomes
Since
almost surely by definition, and
almost surely for all
, and since
, we then have:
The first term in Equation (
17) is calculated below, with the trick that
, and we have:
Since
and
, we apply the
Dominated Convergence Theorem (see [
22]), and we interchange the sum and the mathematical expectation:
In addition, we note that:
The event
represents the fact that staker
i is
not slashed at time
n and has been selected to validate the block in construction at time
. Using Equation (
14), we have:
Moreover, we have:
and, hence,
Regarding the second term in Equation (
17), we have
Regrouping the terms in Equation (
17), and using the equation
, we deduce Equation (
15). □
4.4. MEV for Estimating the Reward
The estimation of the set of transaction fees is an important aspect to consider for the estimation of the expected gain g for a staker. In this section, we develop an addon model to shed light on the implication of the Maximal Extractable Value to the estimation of g.
We consider the random variable F representing the transaction fee valued per transaction. A reasonable assumption is that the law of F follows a memoryless process: if is a chronological sequence of transaction fees (each corresponds to transaction i in the memory pool), then it is an independent sequence. It is not entirely true though: a user could check the average transaction fee and pay a competitive fee by indeed referring to the market. However, we assume that the memory pool is mainly constructed from a set of randomly selected numbers according to a given distribution.
Since
F is a continuous positive random variable and possesses the memoryless property, then (see [
18] or [
21])
F follows an
exponential law:
where
is the average transaction fee (available on-chain). See
Section 5 for a discussion on this assumption.
The Maximal Extractable Value (MEV) (see, for example, [
1] for an introduction) consists of a process which organizes the transactions to maximize the profit of a staker, in terms of transaction fees. Bearing this in mind, a simple model for MEV can be expressed by the means of order statistics (see, for example, [
23]).
More specifically, suppose we have a list of transactions queuing in the memory pool. Only transactions will be chosen to be in the official list of transactions stored in the coming block. Consider the associating sequence of transaction fees.
Definition 2 (MEV process). Let and be the group of permutations of the set . An MEV process consists in choosing a permutation such that , and classify the transaction fees as such.
This defines a sequence
of non-increasing transaction fees random variables. We rename this sequence
. It is worth stressing that this is the sequence of the order statistics associated with the random variable
F. The
total transaction fee reward is therefore given by:
Claim 3. The average total transaction fee reward from an MEV process is given by: It is worth pointing out that is an essential component of g.
Proof. The joint probability distribution function for the family
is given by:
Indeed, first of all, without any loss of generality, we can assume that since the contrary event, i.e., , has 0 probability.
Next, note that the compounded function
is a
diffeomorphism. The
Variable Change Theorem (see [
22,
23]) leads to
hence the equation above. The reason that
is because the matrix of
is a permutation matrix.
Since
, the above equation gives
We now introduce the variable change
Note that , almost surely, for any . We want to derive the law of .
is also a
diffeomorphism whose inverse is
Since
then, by the
Variable Change Theorem, we have
This proves that the family
is composed of mutually independent random variables and
Now, let
be the total transaction fees reward. We want to calculate the mathematical expectation of
. We have:
Thus,
and, therefore, we have:
□
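The expectation in Claim 3 can be checked numerically. The sketch below assumes n i.i.d. exponential fees with mean `mean_fee`, of which the k largest are kept (n, k and the function names are hypothetical labels). It uses the standard closed form for exponential order statistics, mean_fee · k · (1 + Σ_{j=k+1}^{n} 1/j), next to a Monte Carlo estimate; we do not claim this is written exactly as in the claim.

```python
import random


def expected_top_k_fee_sum(mean_fee, n, k):
    """Expected sum of the k largest of n i.i.d. exponential fees."""
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n")
    tail = sum(1.0 / j for j in range(k + 1, n + 1))
    return mean_fee * k * (1.0 + tail)


def simulated_top_k_fee_sum(mean_fee, n, k, trials=5_000, seed=0):
    """Monte Carlo estimate: draw n fees, sort non-increasingly, keep the top k."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        fees = [rng.expovariate(1.0 / mean_fee) for _ in range(n)]
        fees.sort(reverse=True)          # the MEV ordering of Definition 2
        total += sum(fees[:k])
    return total / trials


# 0.0007 is the average fee quoted in Section 4.5.1; n and k are illustrative only.
closed_form = expected_top_k_fee_sum(0.0007, n=50, k=10)
```

Two sanity checks follow from the formula: keeping every transaction (k = n) gives n times the mean fee, and keeping only the largest one (k = 1) gives the mean fee times the n-th harmonic number.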
4.5. The Ethereum 2.0 Staking Rate
This section aims at providing an estimation of the annual percentage yield (APY) for the Ethereum blockchain. At the time of writing, the APY is empirically estimated at around
(see [
1], in accordance with the May 2023 rate). The model above yields an APY of the same order of magnitude.
4.5.1. Rate Estimation
In May 2023, the average transaction fee per transaction for the Ethereum blockchain is
ETH 0.0007, while
are processed for each block on average, and there are roughly
transactions queuing in the memory pool (see, for example, [
1]). Assuming this occurs every 15 s (average time to have a block when Ethereum was PoW), the average distributed reward in a day is
Assuming MEV represents the main revenue stream, we can set or ETH 2102.64 per day.
The total amount of staked coins at the time of writing is
(in May 2023); hence, the rate estimation gives
4.5.2. Electricity Cost Addon
According to [
1], the annualized energy consumption of the Ethereum 2.0 blockchain is
TWh (in May 2023). At this time, a reasonable order of magnitude for the US electricity price is
USD/kWh. This order of magnitude looks conservative, for example, in the UK or in France.
This makes
for one year. The staking cost is thus
ETH
per day, highly negligible when compared with
g (see
Section 4.5.1). It is worth noting that this cost is an overestimation, as it is an overall cost of maintaining the full blockchain.
According to this approach, the electricity cost is not going to negatively contribute to the rate.
4.5.3. Annual Percentage Yield
Using Equation (
1) for the APY estimation, we have
The above model allows the current APY for Ethereum to be estimated. It is worth pointing out that an increase in the number of transactions Ethereum can process per block would significantly increase the APY.
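The chain of computations in this section can be sketched as follows. The 15 s block time and the daily reward of ETH 2102.64 are taken from Section 4.5.1. The total stake below is a hypothetical placeholder, not the figure used in this paper. Reading the rate as daily reward divided by total stake, and compounding it daily, are also assumptions of the sketch.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def blocks_per_day(block_time_seconds):
    """Number of blocks produced in one day for a given average block time."""
    return SECONDS_PER_DAY / block_time_seconds


def daily_staking_rate(daily_reward_eth, total_staked_eth):
    """Assumed reading of the rate: distributed reward over total staked coins."""
    return daily_reward_eth / total_staked_eth


def apy_daily_compounding(daily_rate, days_per_year=365):
    """Assumed convention: the daily rate is reinvested every day of the year."""
    return (1.0 + daily_rate) ** days_per_year - 1.0


daily_reward_eth = 2102.64           # quoted in Section 4.5.1
total_staked_eth = 20_000_000.0      # HYPOTHETICAL placeholder for the total stake

reward_per_block = daily_reward_eth / blocks_per_day(15)
rate = daily_staking_rate(daily_reward_eth, total_staked_eth)
apy = apy_daily_compounding(rate)
```

With a different total stake the resulting APY scales roughly inversely, which is why this quantity needs to be refreshed from on-chain data whenever the rate is recomputed.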
4.5.4. Implementation
We show the evolution of the APY with respect to time in
Figure 2, from March 2023 to May 2023. The needed data (mainly
g and
) are available on-chain.
In
Figure 2, the APY can vary abruptly based on the economic environment. Here, for instance, the spikes might relate to the US banking crisis—Silicon Valley Bank and Signature Bank—in March. This might be because the investors were looking to move their funds out of relatively higher-risk assets, especially since both these banks were heavy lenders to the technology sector, thus the spill-over effect. This assumption would need further testing to be properly validated and is out of the scope of this article. However, the main reason for such spikes observed in
Figure 2 is likely due to the Shanghai release allowing withdrawals and increasing reward (
g increased) [
24,
25].
4.6. Mining Rate Derivation
The cash-flow discount models in a PoW context seem to be more disputable. The underlying economic environment is quite different this time: staking is about
depositing to receive an expected reward, while working consists in
spending electricity to find a relevant nonce and connecting the latest block to the miner’s candidate block. A
working probability space is defined the same way as in
Section 4.1. In addition, if we still want to focus on an equation of the style of Equation (
8):
where
represents the gain earned by miner
i at time
, and
is the present value of the total future gains, then the rate
r is the rate of return a participant obtains on the money they spend. To some extent, mining is like participating in a
game by paying to earn reward and, contrary to staking, the payment of the game is continuously performed over time.
There are
miners in total. For all
, we introduce the random variable
to be the time for miner
i to mine the coming block, i.e., be the first one to find a nonce among the pool of miners. The random variable
can be assumed to have the memoryless property [
18], and since it is a continuous and positive random variable, then
, with
for all
. Concretely,
represents the
hash rate for miner
i: the higher the rate, the less time miner
i takes to mine its block. Henceforth,
is the total hash rate, and if
is an arbitrary time period of mining, then
is the total hash computed by the set of miners during
. It is, therefore, the
total cost for the whole mining activity.
Bearing this in mind, we have the main claim for this section.
Claim 4. If g is the average reward per block and R is the total hash to get this block constructed, then the working rate r is given by: Proof. First, we would like to prove that
In fact, if
, by using the
Bayes formula, we have
Now, by mathematical induction, we can prove that
, for any
. Bearing this in mind, replacing
with
and
with
gives Equation (
22).
Following the spirit of the proof of Claim 1, for a fixed miner
i and time
, we have
Here, we have
and
g is the average reward, or
From Equation (
20), we have
During
, the total investment is
, and thus
finally leads to
□
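The first step of this proof relies on a classical property of independent exponential times: the probability that miner i is the first to find a nonce equals its hash rate divided by the total hash rate. The sketch below checks this by simulation; the function names and the three hash rates are hypothetical.

```python
import random


def first_miner_probability(hash_rates, i):
    """Probability that miner i mines first: own hash rate over total hash rate."""
    return hash_rates[i] / sum(hash_rates)


def simulate_first_miner_frequency(hash_rates, i, trials=100_000, seed=0):
    """Draw one exponential mining time per miner and count how often i is first."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        times = [rng.expovariate(rate) for rate in hash_rates]
        if times.index(min(times)) == i:
            wins += 1
    return wins / trials


# Hypothetical pool of three miners with hash rates in proportion 1 : 2 : 3.
hash_rates = [1.0, 2.0, 3.0]
theoretical = first_miner_probability(hash_rates, 1)
```

This plays the same role for mining as the stake proportion does for staking, which is one way to see why the working rate ends up structurally close to the staking rate.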
5. Discussion
5.1. Time-Dependency
It is worth pointing out that our approach is applied each time one needs to estimate the staking rate r. In practice, an update is performed each time a new block is added to the blockchain, giving a time series of the staking rate with respect to the block number. In particular, the total number of staked coins , the reward g, the slash rate s and the proportion q of burnt coins need to be updated systematically.
5.2. General Discussion on the Approach
We provide a rigorous mathematical foundation for modeling the staking rate, open to practitioner and academic scrutiny. More specifically, in order for the probability of an event and for the mathematical expectation to make sense, we pose the problem in the way of
Section 4.1. Without a clear understanding of the underlying probability space, the model may produce misleading or inconsistent outcomes. From a business perspective, defining a probability space provides a common language for communication and collaboration among professionals. It ensures that the assumptions and interpretations of probabilities are clear and consistent across individuals or teams, fostering effective teamwork and minimizing misunderstandings. Last but not least, although this problem positioning may sound heavy, it appears necessary when considering the slash rate in the stake rate derivation: the definitions of
,
and
do not appear ambiguous.
5.3. Adding Maturity
The rate will remain unchanged if one adds a
maturity to our cash-flow discount model (here, a maturity represents the time when the staker retrieves their staked coins and thus stops being a staker). To see this, suppose
, for
, is the time at which the staker stops investing. Equation (
8) becomes
Then, the equation
leads to the same expression
r for the rate as in Equation (
7).
We have two remarks: (i) can be interpreted as the par of the investment, and (ii) regardless of whether the staker decides to stop their investment or not, the staking rate is the same. This is expected as long as we calculate a rate of return.
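The invariance of the rate under a maturity can be verified numerically. The sketch assumes that the staker receives a constant expected gain at times 1, ..., N and recovers the full deposit at time N, discounted at the same rate (the par interpretation of remark (i)). Names and numbers are hypothetical.

```python
def present_value_with_maturity(rate, deposit, expected_gain_per_round, maturity):
    """Pay `deposit` now, collect gains for `maturity` rounds, then redeem the deposit."""
    discounted_gains = sum(
        expected_gain_per_round / (1.0 + rate) ** n for n in range(1, maturity + 1)
    )
    redeemed_deposit = deposit / (1.0 + rate) ** maturity
    return -deposit + discounted_gains + redeemed_deposit


# Hypothetical staker: deposit of 32 coins, expected gain of 0.0032 coins per round.
deposit, gain = 32.0, 0.0032
perpetual_rate = gain / deposit      # zero-present-value rate without maturity
values = [
    present_value_with_maturity(perpetual_rate, deposit, gain, maturity)
    for maturity in (1, 10, 1000)
]
```

Every entry of `values` is zero up to rounding: the rate that balances the perpetual investment also balances the investment stopped at any maturity, as with a bond priced at par.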
5.4. Assumptions and Healthy Blockchain
In the whole study, we assume that the blockchain remains sufficiently stable over time: it is not supposed to undergo substantial changes (e.g., no fork) or collapse. We do not integrate attack events in our model either, so we assume a blockchain which has a sufficiently long history with many honest agents acting on it. Such a
healthy blockchain is likely to survive for a sufficiently long time so that staking perpetually remains a relevant approximation. It is worth pointing out that a healthy blockchain and the memoryless property of intrinsic features (e.g., transaction fees) are two faces of the same coin. Intrinsically related to this main assumption, the reward dates are supposed to be known in advance (as suggested by the equation
for all
) and the blockchain is supposed to continue to pay the rewards indefinitely (see, for example, Equation (
8)). In addition, Equation (
8) also suggests that a constant discount rate is applied to value the infinite stream of rewards, i.e., the staking rate
r is constant in the discounting of the rewards, which are thus supposed to be reinvested systematically each time they are earned.
5.5. Model Limitations
The main assumption of this model, as discussed above, is that it operates only on healthy blockchains. The perpetual characteristic of the bond approach relies on the assumption of a blockchain that is sufficiently stable in time. This cannot hold if the blockchain is either forked or attacked, that is, if there is any specific change (i.e., rule breaking) which makes the blockchain behave differently from what was expected when calculating the rate. Thus, the model cannot apply if the blockchain will not continue to pay the rewards indefinitely (however, this aspect of the model is flexible by implementing some maturity; see above). Starting from a perfectly healthy blockchain, which implies the stability of the whole system over time, the idea is to progressively add what makes the blockchain less healthy, such as a lack of hardware, or attacks. However, the first feature to consider, as it is inherent to staking, is slashing.
5.6. Slashing
Although the formula
might appear intuitive and trivial, the implementation of the slash rate into the process reveals an equation which was not easily expected (see Equation (
15)): the first term is a quadratic decrease in the gain, while the second one is a linear decrease, with the slope being the proportion of burnt staked coins. Overall, the staking rate is a decreasing quadratic function of the slash rate. One might think that the staker is taking more risks by staking since they can lose the initial investment, and thus, the reward should increase. However, the context is quite different from standard cash-flow discount models: the investors themselves can cause a false or wrong validation. Thus, the decrease in the staking rate can be seen as an average penalty included in the rate.
In addition, we have assumed that the staker is banned from the blockchain, which is not necessarily true: the staker may only have a proportion of their coins burnt, remaining a staker as long as they still have staked coins remaining in the staking pool. It would be interesting to see what Equation (
16) would become then. We would need to introduce the cumulative slashing time
, which is the time staker
i has been slashed for the
time,
, i.e., to simplify:
where we assumed time independence between two consecutive slashes. Since
is the time of slashing for staker
i, then
is the time for being slashed
p times. Thus, Equation (
16) becomes:
The first term leads to
(see Equation (
10)), while the second term would write as (inverting sum and expectation and setting
):
Unfortunately, there is no closed-form formula for the sum above, to the best of our knowledge. In fact, in our model, we do not pretend that a slashed staker will never be able to come back through another round, perhaps after some time. The above calculation could be more complicated, but we do not believe it is necessary for what we want to achieve in this study.
5.7. Memoryless Property for Slashing Events
In
Section 4.3, we assumed that stakers can be slashed in a time-independent way. Stakers can be slashed for various reasons, e.g., double signing (validation of conflicting transactions), downtime (offline staker, not able to validate while selected), or non-compliance (failure to follow the protocol rules). Although the exact slashing conditions depend on the specific rules of each blockchain protocol, there is no evidence, to the best of our knowledge, of a spontaneous time dependency in the slashing process for individual stakers. Time dependency appears due to a common decision to fork, or due to an attack provoking radical protocol changes. We are assuming a healthy blockchain; thus, we do not consider these events to occur.
5.8. Slashing Event Independent of Staker
In Section 4.3, we assumed that the slash rate was independent of i. This is an approximation, as it supposes that all stakers have the same resources and the same implementation of the verification and validation processes. However, it remains difficult to evaluate individual abilities to correctly validate blocks. In addition, for Ethereum, the staking amount is the same for all stakers, namely 32 ETH, which means that (i) the process tends to give stakers an equal chance of selection, and (ii) resources may be comparable.
5.9. MEV and Total Income
MEV represents a significant portion of the stakers’ income in a high-traffic network like Ethereum 2.0. In Section 4.5, we provided an estimate of the income g from MEV only. However, the actual income can vary widely. Some cryptocurrencies offer a fixed percentage return for staking their coins, whilst for others the return fluctuates with network usage and transaction volumes. To obtain more specific numbers, one would need to look into the staking model and rewards of each individual coin. It therefore seems difficult to provide a general income model, given the strong variability across PoS blockchains. Nevertheless, we think our approach captures the general idea of MEV as a classification of transactions by transaction fee amount, which allows reward gains to increase.
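As a rough illustration of why classifying transactions by fee increases the reward, the sketch below assumes exponentially distributed fees (the assumption of Section 4.4) and compares the average fee of all transactions with the average fee of those above a threshold m. The mean fee, the threshold and the sample size are hypothetical, and the sketch is not the estimate of g derived in Section 4.5.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


mean_fee = 0.002  # hypothetical average transaction fee (in ETH)
m = 0.004         # hypothetical fee threshold for "high-fee" transactions

rng = random.Random(1)  # fixed seed so that the run is reproducible
fees = [rng.expovariate(1.0 / mean_fee) for _ in range(200_000)]

# Classification step: keep only the transactions whose fee exceeds m.
high_fees = [f for f in fees if f > m]

avg_all = mean(fees)
avg_high = mean(high_fees)

# For an exponential law, the memoryless property gives
# E[F | F > m] = m + E[F].
expected_high = m + mean_fee

print(avg_all, avg_high, expected_high)
```

In this simplified setting, a validator that includes only transactions above the threshold collects, on average, m more per transaction than one that picks transactions without regard to their fee.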
5.10. Transaction Fee—Exponential Assumption
In Section 4.4, we assumed that the transaction fee was represented by a random variable following an exponential law. This is a consequence of the discussion therein about the memoryless process. Estimating F would require access to a sufficiently large number of transaction fees at a given time. If the collected sample is a sufficiently good representation of the whole population, the average transaction fee would be close to its true value and, more generally, we would have access to a broader distribution of transaction fees. Only then could we assess the distribution of the transaction fees, i.e., whether they follow an exponential law rather than a log-normal one. Nevertheless, we have performed a fit to the distribution of daily average transaction fees (in ETH) for the Ethereum blockchain (see Figure 3). The data were collected from 7 November 2022 to 7 November 2023 on Blockchair (https://blockchair.com). This period corresponds to Ethereum 2.0 and is relatively long after the fork, allowing more stability in the chain data. We fit the exponential and log-normal distributions to the data histogram; other distributions (e.g., normal) are not significant enough to be shown here. In Figure 3, we rescale the distributions to the empirical histogram so that both fits can be shown in the same figure.
The fits use the fitdistr function in R (optimization based on the Nelder–Mead, quasi-Newton and conjugate gradient algorithms). We show three fits: (i) the exponential law fitted to the tail (from the median of the distribution), (ii) the log-normal law fitted to the whole distribution, and (iii) the log-normal law fitted to the tail. The Kolmogorov–Smirnov test (null hypothesis: the data follow the fitted distribution) gives a p-value below 0.05 for the second case, and p-values well above 0.05 for the other two cases (see the caption of Figure 3). Thus, at the 95% confidence level, we reject the null hypothesis that the whole data set follows a log-normal distribution, whereas we cannot reject the null hypothesis that the tail follows an exponential or a log-normal distribution. In the model of Section 4.4, we consider large values of transaction fees (m can be chosen so as to focus on values that are well fitted by an exponential law). Thus, we cannot reject the exponential assumption for the tail of this data set. It is worth stressing that the above fit already assumes the memoryless property: the distribution is taken over time, rather than at a given time.
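The tail-fitting procedure can be sketched in a few lines. The example below is not the R code used for Figure 3: it is a minimal Python sketch run on synthetic data, assuming that "fitting the tail" means fitting an exponential law by maximum likelihood to the excesses above the median. The function names, the mean fee and the seed are hypothetical. The p-value uses the asymptotic Kolmogorov distribution, which is only approximate when the rate is estimated from the same data.

```python
import math
import random
import statistics


def fit_exponential_tail(data, threshold):
    """Maximum-likelihood exponential fit of the excesses above a threshold."""
    excesses = sorted(x - threshold for x in data if x > threshold)
    rate = len(excesses) / sum(excesses)  # MLE: 1 / mean excess
    return rate, excesses


def ks_statistic(sorted_sample, cdf):
    """Largest gap between the empirical CDF and a candidate CDF."""
    n = len(sorted_sample)
    gap = 0.0
    for i, x in enumerate(sorted_sample):
        f = cdf(x)
        gap = max(gap, (i + 1) / n - f, f - i / n)
    return gap


def ks_pvalue(d, n, terms=100):
    """Asymptotic Kolmogorov p-value (approximate when parameters are fitted)."""
    z = math.sqrt(n) * d
    if z < 0.2:
        return 1.0  # the series converges slowly here and the p-value is ~1
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * z) ** 2)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


# Synthetic stand-in for one year of daily average fees (NOT the Blockchair data).
true_mean_fee = 0.003  # hypothetical value in ETH, for illustration only
rng = random.Random(42)
daily_fees = [rng.expovariate(1.0 / true_mean_fee) for _ in range(365)]

threshold = statistics.median(daily_fees)
rate_hat, tail = fit_exponential_tail(daily_fees, threshold)
d_stat = ks_statistic(tail, lambda x: 1.0 - math.exp(-rate_hat * x))
p_value = ks_pvalue(d_stat, len(tail))

print(f"rate={rate_hat:.1f}, D={d_stat:.3f}, p={p_value:.3f}")
```

A p-value above 0.05 would mean that the exponential law cannot be rejected for the tail at the 95% confidence level, which is how the test is read above. With real data, daily_fees would be replaced by the observed daily average fees.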
5.11. Mining Rate
Conceptually, it is interesting to have a mining equivalent of the cash-flow discount approach. We can still derive a rate, not in the sense of an investment, but rather as the ratio ‘gain for mining a block/expense to mine’. However, contrary to the staking rate, where alliances between pool operators and depositors usually occur, it is not straightforward to identify a business utility for the mining rate.