Article

Demystifying the Two-Armed Futurity Bandit’s Unfairness and Apparent Fairness

1
Zhongtai Securities Institute for Financial Studies, Shandong University, Jinan 250100, China
2
School of Mathematics, Shandong University, Jinan 250100, China
3
School of Statistics and Mathematics, Shandong University of Finance and Economics, Jinan 250014, China
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Mathematics 2024, 12(11), 1713; https://doi.org/10.3390/math12111713
Submission received: 16 April 2024 / Revised: 26 May 2024 / Accepted: 28 May 2024 / Published: 30 May 2024
(This article belongs to the Special Issue New Trends in Stochastic Processes, Probability and Statistics)

Abstract

While a gambler may occasionally win, continuous gambling inevitably results in a net loss for the gambler and a gain for the casino. This study experimentally demonstrates the profitability of a particularly deceptive casino game: a two-armed antique Mills Futurity slot machine. The main findings clearly show that both non-random and random two-arm strategies, predetermined by the player and repeated without interruption, are always profitable for the casino, despite two coins being refunded for every two consecutive losses by the gambler. We theoretically explore the cyclical nature of slot machine strategies and speculate on the impact of the frequency of switching strategies on casino returns. Our results not only assist casino owners in developing and improving casino designs, but also guide gamblers to participate more cautiously in gambling.
MSC:
60J10; 60F05

1. Introduction

The origin of human gambling is estimated to coincide with the emergence of human civilization. Evidence suggests that people engaged in “taking chances” as early as the late Paleolithic Age. For example, divination was widely practiced to discern good and bad outcomes in prehistoric China. More recently, the establishment of casinos has significantly boosted the longstanding prosperity of the gambling industry. Over the centuries, various forms of gambling have been developed, including horse racing, lotteries, dice, baccarat, slot machines, roulette, and blackjack.
Today, some governments support and encourage the development of the gambling industry because it stimulates domestic economic growth, even during global economic downturns. This highlights not only the profitability of the gambling industry but also an implicit truth: casino games reliably generate revenue, at least partly due to their inherent design.
Gambling attracts players through the illusion of fairness, including the misconception that casinos are unprofitable. When gambler enthusiasm is heightened by ostensibly honest advertisements of fairness, gamblers indulge in fantasies of winning vast sums of money. Casinos are particularly captivating to individuals with gambling-related pathologies [1] who become deeply immersed in gambling, subsequently experiencing depressive symptoms, heightened gambling expectancy, and increased dark flow ratings [2].
Such ostensibly honest advertisements of fairness are often promoted through casino loyalty programs, which offer equal rewards to gamblers who wager equal amounts [3]. The aim of these loyalty programs is to enhance both attitudinal and behavioral loyalty. Attitudinal loyalty refers to the extent to which individuals trust and are satisfied with the casino, including a sense of identification with the casino brand. Behavioral loyalty, on the other hand, refers to the actual behaviors that demonstrate loyalty, such as repeatedly visiting the casino to gamble. However, despite the appearance of fairness, all casino games are inherently unfair. Indeed, casinos have consistently reported profits from players, with the notable exception of the Kelly formula [4], which determines the optimal proportion to wager in each period in a series of blackjack (“21”) hands or repeated investments, ensuring a win rate greater than 50%. Nonetheless, attitudinal loyalty remains high among casino players.
This article uses a multi-armed Futurity bandit to mathematically explore the profound mystery of attitudinal and behavioral loyalty in the face of casino profitability. The multi-armed bandit (MAB) [5], a popular entertainment tool, is selected because it has been meticulously designed by casinos to appear fair and attract gamblers [6]. The MAB has also been extensively studied theoretically to analyze various complex decision problems [7,8,9] in fields such as science, society, economy, and management. It also plays a central role in research on reinforcement learning [10,11,12]. Specifically, this study introduces a two-armed Futurity bandit to elucidate the pervasive absorption of gamblers at casinos. The two-armed slot machine contrasts with the seemingly fair one-armed slot machine, which can be unprofitable for the casino depending on the Futurity award design. For example, a Futurity slot machine may offer a truly fair reward: when the current number of consecutive gambler losses reaches a value of J, all coins invested by the gambler in these losses are refunded. However, two-armed slot machines disrupt this fairness and exhibit the phenomenon of Parrondo’s paradox [13,14]: the game becomes profitable for the casino when a player alternates between arms in any random or non-random manner, despite the true advertisement that each of the two arms is fair individually.
In the rest of this paper, Section 2 reviews related research. Section 3 describes the model and results. Section 4 conducts experiments to illustrate the results. Section 5 presents the method of this paper and the related lemmas. Section 6 concludes the paper.

2. Recent Work

Parrondo’s Paradox is a counterintuitive phenomenon where the combination of two losing strategies can lead to a winning outcome. This paradox was first proposed in 1996 by physicist Juan Parrondo of the Complutense University of Madrid. Several studies [15] have examined this paradox from perspectives including game theory, quantum game theory, and information theory. Pyke [16] introduced a fairness assumption applicable to our model suggesting that the two fair arms of the slot machine may lead to long-term profits for the casino through random or non-random strategy combinations.
Many people learn from their experiences in casinos, but the underlying inevitability of their outcomes is dictated by the “law of large numbers” in probability theory. Consequently, the mystery behind casino profitability and inherent unfairness remains elusive to those without a background in probability theory. This article employs probabilistic tools to examine the law of large numbers as it applies to a two-armed antique Mills Futurity slot machine designed by the Chicago Mills Novelty Company in 1936 [6,17,18]. The Futurity slot machine offers a reward whereby when the current number of consecutive gambler losses reaches a value of J, all coins invested by the gambler in these losses are refunded. The long-term profitability of such a machine exposes the deception of the apparent fairness of this game, the “fairness illusion”. According to the game’s compensation rule, two coins are returned to the gambler each time their consecutive losses reach two. In this context, the deception can be articulated as follows: casinos honestly but shrewdly advertise that the one-armed Futurity bandit is unprofitable due to its fairness, implying long-term unprofitability. This portrayal of fairness for the one-armed Futurity bandit enhances the casino’s reputation among gamblers. However, statistical artifacts emerge with the two-armed Futurity bandit when its left and right arms are alternately played, resulting in consistent profitability for the casino under the rule of returning two coins after two consecutive losses by the same gambler. This outcome aligns with the conjecture proposed by Ethier and Lee [6], who suggested that such a two-armed Futurity bandit adheres to Parrondo’s paradox when executing any non-random mixed strategy after proving this rule for any random mixed strategy. The present article employs experiments to validate these theoretical results.
Therefore, our results, along with the conclusions of Ethier and Lee [6], demonstrate that a non-random or random two-arm strategy decided by the player before playing and then repeated without interruption is always profitable for the casino, even though two coins are refunded for every two consecutive gambler losses. This phenomenon is theoretically exemplified. This model was ingeniously proven by Chen and Liang [19], who provided an expression for the casino’s asymptotic profit expectation.
The main contributions of this article include the following:
  • This study experimentally demonstrates the profitability of a deceptively unfair game for gamblers under a mixed strategy. The Kelly formula not only aids blackjack owners in better developing and improving the design of the game, but also helps gamblers participate cautiously in gambling. Furthermore, the formula is widely used in financial risk management as a component of modern financial technology.
  • This paper provides a preliminary theoretical proof of the traditional two-arm Futurity slot machine ( J = 10 ) model. It also presents our conjecture on the underlying mechanisms of slot machine profitability and offers inspiring ideas for the further exploration of Parrondo’s paradox.
In Section 3, this paper provides a detailed introduction to the model of the Futurity two-armed slot machine and presents the theoretical results of both random and non-random strategies under the condition of J = 2 . Section 4 employs the Monte Carlo method to simulate four different strategic scenarios to verify the theoretical results from Section 3, and then compares the theoretical gains with the benefits obtained from the simulation results. In the latter part of Section 4, we conduct experimental simulations of the traditional J = 10 Futurity two-armed slot machine model and compare the casino’s empirical average profits for each of the four strategies. This comparison aims to demonstrate, from an experimental perspective, that the casino can achieve long-term profits. In Section 5, we prospectively prove the periodic impact of non-random strategies on casino returns and speculate that the frequency of player strategy exchanges is positively related to the casino’s asymptotic return expectations. This provides a theoretical direction for further research on the Futurity two-armed slot machine and Parrondo’s paradox.

3. Model and Results

The antique Futurity slot machine, designed by the Chicago Mills Novelty Company [6,17,18], was in production from 1936 to 1941. After 7 December 1941, Mills Novelty ceased slot machine production and became a defense contractor for the duration of the war. When slot machine production resumed in 1945, it did so with new designs. In this article, we use the antique Futurity slot machine designed by the Chicago Mills Novelty in 1936 as an example to explore the scientific mystery of why “long bets will lose”.
In the antique Futurity slot machine, a player spends one coin per play. There are two screens on the machine: one screen’s pointer records the current number of consecutive gambler losses. When this number reaches 10 (which can be set to another value by the casino), all 10 coins are refunded to the gambler. This refund is called the futurity award. The other screen displays the current mode. The machine’s internal structure features a periodic cam with several fixed modes, each having different winning conditions and rewards. With each play, the cam rotates to the next mode. Each arm has its own mode cam, different from the others, resulting in independent payoff distributions for each arm. The gambler pulls one arm to play. For the futurity award, the number of consecutive losses is recorded regardless of the order in which the player plays the arms. When the pointer reaches value $J$ (where $J \geq 2$) set by the casino, $J$ coins are refunded to the gambler. The casino advertises that each arm on its multiple-armed machine is “fair,” meaning each arm has a 50% chance of profit for the gambler. The gambler can play either arm in a deterministic order or at random.
For simplicity, we consider a simple two-armed Futurity bandit, with two arms denoted as A and B, each arm with a different i.i.d. payoff sequence. For the convenience of analysis, we can regard each arm’s distribution of wins as a Bernoulli distribution, that is, the probability of winning a game is $p_A$ (resp. $p_B$), with $0 < p_A < 1$ and $0 < p_B < 1$. The gambler must pay one coin to the casino for each coup and alternates between arms according to a pre-determined repeating sequence called a strategy. The casino also offers a futurity award each time a gambler suffers $J$ consecutive losses in gambling, as described above. Casinos usually advertise the design of $J = 2$, which is the case considered in this work and the most attractive to gamblers. We consider the case in which the gambler chooses a pre-formulated non-random mixed strategy $D$ before the game starts, where $D$ contains at least one A and one B. For instance, for strategy $D = ABB$, the gambler pulls arm A, then arm B, then arm B, repeating that sequence indefinitely. This work considers a “fairness” design for the one-armed Futurity bandit by adjusting the payoff distribution of each arm, where the reward is assumed to be $(3 - 2p)/(2 - p)$ under win probabilities $p = p_A$ and $p_B$ for arms A and B, respectively. If the game is played according to the above rules, it seems that the gambler is playing a fair game with no long-term loss, but in fact, the casino definitely makes a profit in the long run, as demystified in Theorem 1 [19] below.
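As a quick check of the fairness design, the reward $(3 - 2p)/(2 - p)$ can be recovered from the single-arm futurity probability. The derivation below is a sketch that assumes the single-arm award probability per coup $p q^J / (1 - q^J)$ with $q = 1 - p$ (the form quoted from Ethier and Lee [6] in Section 5.1), specialized to $J = 2$; the symbol $u$ for the reward per win is borrowed from Section 5.1.

```latex
% Award probability per coup on a single arm with J = 2, q = 1 - p:
\[
  \frac{p\,q^{2}}{1 - q^{2}} = \frac{q^{2}}{1 + q} = \frac{(1 - p)^{2}}{2 - p}.
\]
% Fairness: expected return per coup (wins plus 2-coin refunds) equals the 1-coin stake.
\[
  p\,u + 2\,\frac{(1 - p)^{2}}{2 - p} = 1
  \quad\Longrightarrow\quad
  u = \frac{(2 - p) - 2(1 - p)^{2}}{p\,(2 - p)}
    = \frac{3p - 2p^{2}}{p\,(2 - p)}
    = \frac{3 - 2p}{2 - p}.
\]
```

For example, $p = 0.5$ gives $u = 4/3$, so a single arm returns $0.5 \cdot 4/3 + 2 \cdot 0.25/1.5 = 1$ coin per coup on average.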
Subtle mathematical induction shows that any non-random repeating mixed strategy $D$ can be arranged in the following asymptotic form $D(a(h, r, s))$:
$$D(a(h, r, s)) = \underbrace{A \cdots A}_{r_1} \underbrace{B \cdots B}_{s_1} \cdots \underbrace{A \cdots A}_{r_k} \underbrace{B \cdots B}_{s_k} \cdots \underbrace{A \cdots A}_{r_h} \underbrace{B \cdots B}_{s_h} := A^{r_1} B^{s_1} \cdots A^{r_h} B^{s_h}.$$
Here, $r_k > 0$, $s_k > 0$, $\sum_{k=1}^{h} r_k = r$, $\sum_{k=1}^{h} s_k = s$ and vector $a(h, r, s) = (a_1, a_2, \ldots, a_{2h}) = (r_1, s_1, \ldots, r_k, s_k, \ldots, r_h, s_h)$. In order to make our results more concise, we define function $b_i$ of vector $a$ for $-2h + 1 \leq i \leq 4h$ as follows:
  • $b_{2j-1} = (-1)^{a_{2j-1}} (1 - p_A)^{a_{2j-1}}$, $b_{2j} = (-1)^{a_{2j}} (1 - p_B)^{a_{2j}}$ for $1 \leq j \leq h$,
  • $b_i = b_{i-2h}$ for $2h + 1 \leq i \leq 4h$, $b_i = b_{i+2h}$ for $-2h + 1 \leq i \leq 0$.
Theorem 1
(Chen and Liang [19] (2023)). The casino’s asymptotic profit expectation $R$ is $2QS$, where
$$Q := Q(D(a(h, r, s))) = h + \sum_{m=1}^{2h} \sum_{j=1}^{2h-1} (-1)^{j} \prod_{i=m}^{m+j-1} b_i + h \prod_{i=1}^{2h} b_i,$$
$$S := S(r, s, p_A, p_B) = \frac{(p_A - p_B)^2 \left(1 + (-1)^{r+s} (1 - p_A)^r (1 - p_B)^s\right)}{(r + s)(2 - p_A)^2 (2 - p_B)^2 \left(1 - (1 - p_A)^{2r} (1 - p_B)^{2s}\right)}.$$
This theoretical result demonstrates that the game is always profitable for the casino in the long term for all $p_A \neq p_B$. The asymptotic profit expectation $R = 0$ applies if and only if $p_A = p_B$. These results make clear that the win probability discrepancy between the two arms favors the casino. In detail, the expression of the casino’s asymptotic profit expectation $R$ consists of three parts. The first part is the number two, representing settlement rule $J = 2$ of the futurity bandit award. The second part, function $Q$, denotes the gambler’s playing rule across the two arms as laid out in the internal structure of strategy $D$. The last part, function $S$, characterizes the changes in profitability to the casino accompanying changes in the values of the considered parameters $p_A$, $p_B$ and the considered numbers of plays $r$, $s$. Figure 1 shows the casino’s payoff of a single arm across different probabilities.
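The expression $R = 2QS$ can be evaluated directly from the run-length vector $a = (r_1, s_1, \ldots, r_h, s_h)$. The following is a minimal Python sketch, not the authors' code: the function names `theorem_profit` and `exact_profit` are illustrative, and `exact_profit` is an independent cross-check that assumes $J = 2$ and the fair reward $(3 - 2p)/(2 - p)$ per win, tracking only whether the loss counter is at zero before each coup.

```python
from math import prod


def theorem_profit(a, p_a, p_b):
    """Casino's asymptotic profit R = 2*Q*S for run lengths a = (r1, s1, ..., rh, sh)."""
    two_h = len(a)
    h = two_h // 2
    r, s = sum(a[0::2]), sum(a[1::2])
    q_a, q_b = 1 - p_a, 1 - p_b
    # base[i] holds b_{i+1}: odd-numbered b use arm A, even-numbered b use arm B.
    base = [(-1) ** n * (q_a if i % 2 == 0 else q_b) ** n for i, n in enumerate(a)]

    def b(i):
        # Periodic extension b_i = b_{i - 2h}.
        return base[(i - 1) % two_h]

    q_val = h + h * prod(base)
    for m in range(1, two_h + 1):
        for j in range(1, two_h):
            q_val += (-1) ** j * prod(b(i) for i in range(m, m + j))

    s_val = ((p_a - p_b) ** 2 * (1 + (-1) ** (r + s) * q_a ** r * q_b ** s)) / (
        (r + s) * (2 - p_a) ** 2 * (2 - p_b) ** 2
        * (1 - q_a ** (2 * r) * q_b ** (2 * s))
    )
    return 2 * q_val * s_val


def exact_profit(a, p_a, p_b):
    """Cross-check for J = 2 via the stationary state of the loss counter."""
    losses = []
    for i, n in enumerate(a):
        losses += [1 - (p_a if i % 2 == 0 else p_b)] * n
    # x = P(counter is 0 before a coup); one coup maps x -> 1 - q*x.
    alpha, beta = 0.0, 1.0
    for q in losses:
        alpha, beta = 1 - q * alpha, -q * beta
    x = alpha / (1 - beta)  # fixed point over one full cycle of the strategy
    payout = 0.0
    for q in losses:
        payout += 1 - 2 * q * q / (1 + q)  # expected win payout p*u of a fair arm
        payout += 2 * (1 - x) * q          # 2-coin refund on a second straight loss
        x = 1 - q * x
    return 1 - payout / len(losses)
```

For $D = AB$ the vector is `(1, 1)` and the formula collapses to $(p_A - p_B)^2 / \big((2 - p_A)(2 - p_B)(1 - q_A q_B)\big)$, so `theorem_profit((1, 1), 0.5, 0.2)` is about 0.0556, while `theorem_profit((2, 2), 0.5, 0.2)` (strategy $A^2B^2$) is about 0.0020.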
Figure 2 shows three-dimensional surfaces for the casino’s payoff as functions of win probabilities p A and p B for arms A and B, respectively, under four different but representative non-random strategies. The four panels, each with a different vertical scale, show that each non-random strategy generates distinct profit modes, but each is dominated by a region of casino profitability. Figure 2a implies that playing the two arms in direct alternation generates the greatest profits for the casino. Playing the two arms in equal numbers of pulls guarantees the symmetric form of the payoffs (see Figure 2a,b).
Now, suppose a gambler plays the two-armed Futurity bandit according to a random strategy $C$ with probability $p_\gamma$ of pulling arm A and correspondingly probability $1 - p_\gamma$ of pulling arm B. Previous research has shown that the asymptotic profit expectation $R_C$ of the casino [6] is
$$R_C = p_\gamma f(1 - p_A) + (1 - p_\gamma) f(1 - p_B) - f\big(p_\gamma (1 - p_A) + (1 - p_\gamma)(1 - p_B)\big),$$
where $f(z) = \frac{2z^2}{1 + z}$. Since $f(z)$ is a convex function, Jensen’s inequality gives $R_C \geq 0$, so the casino is profitable in the long run for all $p_A \neq p_B$; $R_C = 0$ if and only if $p_A = p_B$. Figure 3 shows the payoff performance under a random strategy with probabilities $p_\gamma$ of 0.1, 0.3, 0.5, 0.7, and 0.9 of selecting arm A. Each panel displays non-negative payoffs under all combinations of $p_A$ and $p_B$. For $p_\gamma = 0.5$, meaning equal expected numbers of pulls of the two arms, the payoff surface is symmetric, in line with the symmetry of the results shown in Figure 2a,b.
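The convexity argument is easy to probe numerically. The sketch below is illustrative only: it assumes the sign convention in which a non-negative value means profit for the casino, i.e. $R_C$ is the Jensen gap of $f$ between the two arms' loss probabilities, and the name `random_strategy_profit` is hypothetical.

```python
def f(z):
    """f(z) = 2 z^2 / (1 + z): expected 2-coin refund per coup at loss probability z."""
    return 2 * z * z / (1 + z)


def random_strategy_profit(p_gamma, p_a, p_b):
    """Casino's asymptotic profit when arm A is pulled with probability p_gamma."""
    q_mix = p_gamma * (1 - p_a) + (1 - p_gamma) * (1 - p_b)
    # Mixture of f minus f of the mixture; non-negative because f is convex.
    return p_gamma * f(1 - p_a) + (1 - p_gamma) * f(1 - p_b) - f(q_mix)
```

With $p_\gamma = 0.5$, $p_A = 0.5$ and $p_B = 0.2$ this evaluates to $1/99 \approx 0.0101$, and it vanishes whenever $p_A = p_B$.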

4. Experiments

4.1. Simulation Verifying Theoretical Results

This section implements Monte Carlo simulations to verify the theoretical results above for four cases corresponding to the non-random mixed strategies $D = AB$, $D = AABB := A^2B^2$, $D = A^3B^2$, and $D = A^4B^4A^6B^3$. The number of coups is $M = 100{,}000$, and the casino’s mean profit per coup equals $(M - W - JC)/M$, where $W$ represents the gambler’s total winnings from winning coups and $C$ denotes the count of futurity awards for $J$ consecutive gambler losses, where $J = 2$. Ten thousand replicates are conducted in each simulation.
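A single replicate of such a simulation can be sketched as follows. This is not the authors' simulation code: the function name `simulate_casino_profit`, the seed handling, and the use of Python's standard `random` module are assumptions made for illustration, and each win is assumed to pay the fair reward $(3 - 2p)/(2 - p)$ from Section 3.

```python
import random

J = 2  # futurity threshold: two consecutive losses trigger a 2-coin refund


def simulate_casino_profit(strategy, p_a, p_b, coups=100_000, seed=0):
    """Mean casino profit per coup, (M - W - J*C) / M, for a repeating strategy string."""
    rng = random.Random(seed)
    win_prob = {"A": p_a, "B": p_b}
    # Fair reward per win for J = 2 (assumed from the model description).
    reward = {arm: (3 - 2 * p) / (2 - p) for arm, p in win_prob.items()}
    winnings = 0.0  # W: total amount paid out on winning coups
    awards = 0      # C: number of futurity awards
    streak = 0      # current run of consecutive losses
    for t in range(coups):
        arm = strategy[t % len(strategy)]
        if rng.random() < win_prob[arm]:
            winnings += reward[arm]
            streak = 0
        else:
            streak += 1
            if streak == J:
                awards += 1
                streak = 0  # the pointer resets after a refund
    return (coups - winnings - J * awards) / coups
```

Calling `simulate_casino_profit("AB", 0.5, 0.2)` should give an estimate near the theoretical value of about 0.0556 for $D = AB$, while a single-arm run such as `simulate_casino_profit("A", 0.5, 0.2)` should hover around zero; averaging over many seeds plays the role of the replicates.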
As stated above, the casino can adjust the win probability distribution of each of the two arms so as to adjust its own profit while ensuring the fairness of each arm. In an initial simulation of a single arm, Figure 1 shows that the long-term payoff to the gambler or the casino always lies close to zero for any given win probability on interval [0, 1] for the single arm. This result represents the fairness of each individual arm. In particular, the long-term payoff to the gambler is zero without uncertainty when the win probability is zero, while the long-term payoff to the casino is zero without uncertainty when the win probability is one.
Since the value of Q in Theorem 1 is also related to the win probability distributions of the two arms, the impact of Q on profits should also be considered by the casino when adjusting the win probability distributions of the arms. Figure 2 shows three-dimensional surfaces of the payoff to the casino for all combinations of win probabilities of the two arms. These graphs vividly illustrate that the casino can select the win probabilities for A and B that maximize its profit.
Next, we aim to compare the theoretical payoff with that obtained from the simulation results by examining four vertical cross-sections of the three-dimensional surfaces in Figure 2. Without loss of generality, we fix the win probability of arm B at $p_B = 0.5$. Figure 4 shows the theoretical and simulated curves for those four non-random strategies. This agreement demonstrates that the theoretical conclusions are highly consistent with the simulated results, thus verifying Theorem 1 for these four cases.
Last, we simulate a mixed strategy in which a non-random strategy is followed by a random strategy. Figure 5a shows the sample mean casino payoffs over the full range of win probabilities for arms A and B under this mixed strategy. Without loss of generality, we fix the win probability of arm B at $p_B = 0.5$ in Figure 5b under the mixed strategy. The results show that the casino can incur a loss in this setting, which could encourage gamblers to adopt such a mixed strategy if they were free to choose their own strategy on the machine.

4.2. Empirical Study with a Real Two-Armed Futurity Slot Machine

This section considers a real antique Mills Futurity slot machine designed in 1936 by the Chicago Mills Novelty Company. There are two screens on the slot machine, one screen recording the current number of consecutive gambler losses and the other displaying the current mode. In detail, a player consumes 1 coin per coup, and when the number of consecutive gambler losses reaches $J = 10$, all 10 coins are refunded. In the preceding sections, the machine’s value of $J = 10$ was replaced by $J = 2$ because the latter is even more attractive to gamblers. The machine’s internal structure includes a periodic cam switching between Modes E and O, corresponding to arm A and arm B. Both closely follow the multi-point distribution shown in Table 1. The win probabilities and rewards are distinct for the two modes, and the two-armed machines are “fair” in the sense that each arm has a 50% chance of profiting the gambler, as honestly advertised by the casino. However, the casino does not allow a gambler to play solely on one arm, since such an experiment would reveal that it is the alternation between the two “fair” arms that makes money for the casino.
In this application, we transform the multi-point distribution of each mode into a two-point distribution. For each mode, we split the distribution into gain and loss, which yields the reward of each mode and reveals that each individual mode is indeed fair. In particular, we show the casino’s empirical mean profit for each of the four strategies in Figure 6, revealing that the sample mean casino profit converges in the long term to a positive value in each case. This finding again confirms the conclusion that the casino can earn money, in the long run, using a two-armed Futurity bandit under a compensation rule equivalent to that for $J = 10$ studied by Ethier and Lee [6].

5. Method

Chen and Liang [19] demonstrated the long-term profitability of this slot machine, and their theoretical proof provides enlightening ideas for proving the profitability of the classic two-arm slot machine ( J = 10 ). The gambler’s motivation to gamble is presumably tied to the casino’s claim that each arm is fair. Although a given strategy D implemented by the player may yield a range of results, gamblers often believe they can formulate profitable strategies in advance. Therefore, we must examine the relationship among the various strategies. When a casino is confronted with strategy D, it must determine how this strategy is expected to affect its profitability. This slot machine can also be regarded as a confrontation game between the casino and the player. In this context, how is the asymptotic profit expectation difference between the casino and the player calculated? Based on the casino’s calculation of the asymptotic profit expectation, we can ingeniously determine that the value of this profit is strictly positive, thus explaining why the casino is profitable in the long run.
Below, we conduct a preliminary theoretical analysis of the characteristics of the model based on the assumptions of the classic two-arm bandit machine. We explain why a casino can claim that slot machines are fair and how certain player strategies can have a cyclical impact on the casino’s asymptotic returns. Finally, we elaborate on future work on this model and propose our conjectures.

5.1. Why Can Casinos Claim Each Arm of the Slot Machine Is Fair?

The source of the casino’s profit is every single coin paid by the gambler before each coup. The player’s profit from the slot machine is divided into two parts: one part is payoff $u$ obtained by winning a single coup, and the other part is the refund obtained by losing $J$ consecutive coups (here $J = 10$). We can choose either arm for initial analysis. If the gambler plays only the A arm, then $p^A$ represents the asymptotic probability, per coup, of the player obtaining the futurity award. The player’s expected asymptotic revenue per coup is then $\mu_A^* = p_A u_A + 10 p^A$, where $u_A$ is the payoff obtained by the gambler by winning a single coup. The casino can set $\mu_A^* = 1$ by tuning parameters $p_A$ and $u_A$. In such a case, the player’s asymptotic payoff expectation per coup is equal to the 1-coin payment received by the casino before each coup. Ethier and Lee [6] calculated the value of $p^A$ as $p^A = \frac{p_A q_A^{10}}{1 - q_A^{10}}$, where $q_A = 1 - p_A$. Then, to maintain fairness, the casino must ensure that $u_A = \frac{10 p_A (1 - p_A)^{10} + (1 - p_A)^{10} - 1}{p_A \left((1 - p_A)^{10} - 1\right)}$ while modifying the arm’s payoff distribution to maintain or maximize profitability. In the same way, the casino can also make the B arm fair, but the parameters must satisfy $p_A \neq p_B$, $0 < p_A < 1$, $0 < p_B < 1$.
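The fairness condition $\mu^* = 1$ can be written for a general threshold $J$. The short sketch below uses hypothetical function names and assumes the single-arm award probability $p q^J / (1 - q^J)$ quoted above; it solves $p u + J p^{\text{arm}} = 1$ for the reward $u$.

```python
def futurity_probability(p, j_award=10):
    """Asymptotic probability, per coup, of a futurity award on a single arm."""
    q = 1 - p
    return p * q ** j_award / (1 - q ** j_award)


def fair_reward(p, j_award=10):
    """Reward per win that makes a single arm fair: p*u + J*award_probability = 1."""
    return (1 - j_award * futurity_probability(p, j_award)) / p
```

With `j_award=2` this reduces to the reward $(3 - 2p)/(2 - p)$ used in Section 3, and with the default `j_award=10` it agrees with the expression for $u_A$ given in the text.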

5.2. What Is the Relationship among Various Non-Random Mixing Strategies?

For fixed values $r, s > 0$, the casino can ignore everything about a gambler’s strategy $D$ other than how the strategy affects the casino’s asymptotic profit expectation. We let $p^D$ denote the asymptotic probability, per coup, of the gambler obtaining the futurity award under strategy $D$, and we let $p_i^D$ denote the win probability of the $i$th game under strategy $D$. Ethier and Lee [6] provided a preliminary form of $p^D$. On this basis, we calculate the casino’s asymptotic profit expectation and the value of $p^D$.
Lemma 1.
The casino’s asymptotic profit expectation $R$ is
$$R = 10 \left( \frac{r}{r+s} p^A + \frac{s}{r+s} p^B - p^D \right),$$
where
$$p^D = \frac{1}{r+s} \sum_{k=1}^{r+s} \sum_{j=1}^{r+s} p_j^D \prod_{i=j+1}^{j+10k} q_i^D \, \frac{1}{1 - (q_A^r q_B^s)^{10}},$$
where $q_i^D = 1 - p_i^D$.
Proof. 
Based on the casino’s claim that each arm is fair and the discussion above, we have
$$R = 1 - \mu_D^* = 1 - \mu_D - 10 p^D,$$
where $\mu_D$ is the asymptotic payoff expectation for the player per coup, disregarding the Futurity award. From the law of large numbers, we know that
$$\mu_D = \frac{r}{r+s} p_A u_A + \frac{s}{r+s} p_B u_B.$$
Ethier and Lee [6] showed that $p^D$ has the following form:
$$p^D = \frac{1}{r+s} \sum_{k=1}^{r+s} \sum_{j=1}^{r+s} p_j \left( \prod_{i=j+1}^{j+10k-(r+s)\lfloor 10k/(r+s) \rfloor} q_i \right) \frac{(q_A^r q_B^s)^{\lfloor 10k/(r+s) \rfloor}}{1 - (q_A^r q_B^s)^{10}}.$$
For any nonrandom-pattern strategy $D$, we have
$$q_A^r q_B^s = \prod_{i=1}^{r+s} q_i = \prod_{i=1}^{m} q_i \prod_{i=m+1}^{r+s} q_i = \prod_{i=1}^{m} q_{i+r+s} \prod_{i=m+1}^{r+s} q_i = \prod_{i=m+1}^{m+r+s} q_i$$
for any $m = 1, 2, \ldots, r+s$, and similarly for $m = r+s+1, r+s+2, \ldots, 10r+10s$, we also have $q_A^r q_B^s = \prod_{i=1}^{r+s} q_{i+r+s} = \prod_{i=m+1}^{m+r+s} q_i$. Then, by Equation (1), we note that $1 \leq j \leq j + 10k - (r+s)\lfloor 10k/(r+s) \rfloor < j + r + s \leq 10r + 10s$. We let $m = j + 10k - (r+s)\lfloor 10k/(r+s) \rfloor$, and then we have $(q_A^r q_B^s)^{\lfloor 10k/(r+s) \rfloor} = \prod_{i=m+1}^{j+10k} q_i$. Then, Equation (1) can be rewritten as follows:
$$p^D = \frac{1}{r+s} \sum_{k=1}^{r+s} \sum_{j=1}^{r+s} p_j^D \prod_{i=j+1}^{j+10k} q_i^D \, \frac{1}{1 - (q_A^r q_B^s)^{10}},$$
and
$$R = \frac{r}{r+s} \mu_A^* + \frac{s}{r+s} \mu_B^* - \mu_D - 10 p^D = 10 \left( \frac{r}{r+s} p^A + \frac{s}{r+s} p^B - p^D \right).$$
From the above lemma, we can observe that (1) the value of $p^D$ summarizes all relevant effects of a given strategy $D$ and that (2) the value of $p^D$ affects the casino’s asymptotic profit expectation $R$. From the expression for $p^D$, we observe that the value of $p^D$ is relatively insensitive to the choice of strategy.
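The double sum in Lemma 1 is straightforward to evaluate for a concrete pattern. The following sketch is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the strategy is passed as a string such as `"AABAB"`, indices are taken cyclically over one period, and the profit is computed as $J$ times the gap between the blended single-arm award probability and $p^D$.

```python
from math import prod

J = 10  # futurity threshold of the classic machine


def single_arm_award(p):
    """Asymptotic futurity-award probability per coup when only one arm is played."""
    q = 1 - p
    return p * q ** J / (1 - q ** J)


def pattern_award(pattern, p_a, p_b):
    """Award probability p^D of Lemma 1 for a repeating pattern such as "AABAB"."""
    n = len(pattern)
    p = [p_a if arm == "A" else p_b for arm in pattern]
    q = [1 - x for x in p]
    cycle = prod(q)  # q_A^r * q_B^s
    total = 0.0
    for k in range(1, n + 1):
        for j in range(1, n + 1):
            # Product of q_i for i = j+1, ..., j+J*k, with indices taken cyclically.
            run = prod(q[(i - 1) % n] for i in range(j + 1, j + J * k + 1))
            total += p[j - 1] * run
    return total / (n * (1 - cycle ** J))


def casino_profit(pattern, p_a, p_b):
    """Casino's asymptotic profit: J times (blended single-arm award rate minus p^D)."""
    n = len(pattern)
    r, s = pattern.count("A"), pattern.count("B")
    blended = r / n * single_arm_award(p_a) + s / n * single_arm_award(p_b)
    return J * (blended - pattern_award(pattern, p_a, p_b))
```

As a sanity check, a pattern made of a single arm (for example `"AAA"`) reproduces the single-arm award probability, so its profit is zero, whereas `casino_profit("AB", 0.5, 0.2)` comes out at roughly 0.087.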
Lemma 2.
For any fixed values of $r$, $s$, and $l$, where $l = 1, 2, \ldots, r+s$, suppose that non-random-pattern strategies $D_1$ and $D_2$ satisfy $p_i^{D_1} = p_{i+l}^{D_2}$ for all $i = 1, 2, \ldots, r+s$. Then, the casino’s asymptotic profit expectations under the two strategies are equal, that is, $R^{D_1} = R^{D_2}$.
Proof. 
We consider strategies $D_1$ and $D_2$ where $p_i^{D_1} = p_{i+l}^{D_2}$ for any $i = 1, 2, \ldots, r+s$. Then, it also holds for $i = r+s+1, r+s+2, \ldots, 11r+11s$ by periodicity, and by Lemma 1, we have
$$\begin{aligned}
\sum_{j=1}^{r+s} p_j^{D_1} \prod_{i=j+1}^{j+10k} q_i^{D_1}
&= \sum_{j=1}^{r+s} p_{j+l}^{D_2} \prod_{i=j+1}^{j+10k} q_{i+l}^{D_2} \\
&= \sum_{j=1}^{r+s-l} p_{j+l}^{D_2} \prod_{i=j+1}^{j+10k} q_{i+l}^{D_2} + \sum_{j=r+s-l+1}^{r+s} p_{j+l}^{D_2} \prod_{i=j+1}^{j+10k} q_{i+l}^{D_2} \\
&= \sum_{j=l+1}^{r+s} p_j^{D_2} \prod_{i=j+1-l}^{j+10k-l} q_{i+l}^{D_2} + \sum_{j=1}^{l} p_{j+r+s}^{D_2} \prod_{i=j+1-l}^{j+10k-l} q_{i+r+s+l}^{D_2} \\
&= \sum_{j=l+1}^{r+s} p_j^{D_2} \prod_{i=j+1}^{j+10k} q_i^{D_2} + \sum_{j=1}^{l} p_j^{D_2} \prod_{i=j+1}^{j+10k} q_i^{D_2} \\
&= \sum_{j=1}^{r+s} p_j^{D_2} \prod_{i=j+1}^{j+10k} q_i^{D_2}.
\end{aligned}$$
That is, $p^{D_1} = p^{D_2}$, and by Lemma 1,
$$R^{D_1} = 10 \left( \frac{r}{r+s} p^A + \frac{s}{r+s} p^B - p^{D_1} \right) = 10 \left( \frac{r}{r+s} p^A + \frac{s}{r+s} p^B - p^{D_2} \right) = R^{D_2}.$$
This completes the proof. □
To understand the above lemma more intuitively, we can consider Steps A and B in the strategy as $r$ “A” balls and $s$ “B” balls. If these balls are placed in a cycle, then the values of $p^D$ for different starting points in the same arrangement are equal. For example, for $r = 4$, $s = 2$, the following two arrangements yield the same value, that is, $p^{AABABA} = p^{ABABAA}$. Hence, $R^{AABABA} = R^{ABABAA}$.
[Diagram: the two cyclic arrangements AABABA and ABABAA, each drawn as balls placed around a circle.]
In this way, any non-random pattern strategy provided by the player can be regarded by the casino as a strategy starting from arm A in the process of calculating profitability. Vector $a(h, r, s)$ can be used to represent the structure of this strategy, that is, Equation (1), where $2h$ is the number of times the arm is switched during a single cycle of the strategy. We conjecture that in any non-random pattern strategy provided by the player for fixed values of $r$ and $s$, the more frequent the switching of arms, the higher the casino’s profitability; that is, $R$ and $h$ are positively correlated.
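The reduction of an arbitrary repeating pattern to a run-length vector starting with arm A can be sketched in a few lines. The helper name `run_length_vector` is hypothetical, and the choice of the first admissible rotation is an arbitrary convention: by Lemma 2, every rotation of the same cycle gives the same casino profit.

```python
from itertools import groupby


def run_length_vector(pattern):
    """Rotate a repeating A/B pattern to start with A and end with B, then
    return its run lengths (r1, s1, ..., rh, sh)."""
    n = len(pattern)
    for shift in range(n):
        rotated = pattern[shift:] + pattern[:shift]
        if rotated[0] == "A" and rotated[-1] == "B":
            break
    else:
        raise ValueError("pattern must contain at least one A and one B")
    # groupby yields maximal runs of identical letters in order.
    return tuple(len(list(group)) for _, group in groupby(rotated))
```

For instance, `run_length_vector("AABABA")` rotates the cycle to `ABAAAB` and returns `(1, 1, 3, 1)`, so $h = 2$, $r = 4$ and $s = 2$; the pattern `"BBA"` maps to `(1, 2)`, the same vector as `"ABB"`.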
Conjecture 1.
We consider non-random pattern strategy $D_1 = D(a(h, r, s))$, where $h < \min\{r, s\}$; then, there is a strategy $D_2 = D(a(h+1, r, s))$ such that $R^{D_1} < R^{D_2}$. In particular, player strategy $D = AB$ is most beneficial to the casino, and strategy $D = A^r B^s$ is most beneficial to the player.
Ethier and Lee [6] showed that if no restrictions are placed on the player’s strategy, the casino may not be profitable in the long term. They pointed out that long-term profitability of the casino can be ensured if the player’s strategy includes A or B only once, or if the casino sets the win probabilities of the arms such that $p_A + p_B > 1/3$, or if the player’s strategy satisfies a condition relating $r + s$ to $J$; whether these conditions can be relaxed remains an open question. Based on the heuristic conjecture above, we relate simple, easy-to-calculate strategies to the complex strategies provided by players by exploiting the periodicity of the impact of player strategies on casino returns and calculating the difference in the casino’s expected asymptotic returns between similar strategies. In this way, we may obtain theoretical proofs of other conjectures of Ethier and Lee, which would also reveal the principle behind Parrondo’s paradox and provide a theoretical explanation for casino profits.

6. Conclusions

This article suggests that the root cause of gamblers’ losses lies in the intricate mathematical logic of gambling equipment and the sophisticated program design based on probability modelling and random calculation. This work rigorously demystifies the so-called casino loyalty programs that advertise fair returns with one-armed Futurity bandits to attract gamblers but then continuously profit from them using two-armed Futurity bandits. We thus expose the fraud of the seemingly fair two-armed Futurity bandit. The explicit mathematical expression of expected casino profits, as found in the Results and illustrated in the corresponding figures, vividly elucidates how expected profit changes accompany variations in the considered parameters, again implying that the game can always be profitable for the casino in the long run. The experiments conducted were designed to validate the theoretical results through simulation, and a real two-armed Futurity slot machine with a more complex output was also tested to verify this conclusion.
We anticipate that this study will benefit gamblers by helping them recognize the fundamental unfairness within the gambling industry, particularly regarding so-called loyalty programs that are typically advertised with claims of fairness. On the other hand, we do not intend for our theoretical findings to be used in the further design of slot machines, nor by other businesses such as those engaging in discount marketing, bundled sales, or other induced consumption tactics. This article may serve as a starting point for further study of the mathematically inherent profitability of casino games, including more sophisticated multi-armed Futurity bandits, based on the probabilistic tools presented herein. We also hope this study will assist casino owners in better designing their casinos and help gamblers participate in gambling more cautiously.

Author Contributions

Methodology, H.L.; Software, J.M.; Writing—review & editing, X.Y.; Project administration, W.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the National Key R&D Program of China (No. 2023YFA1008701), the National Natural Science Foundation of China (No. 12371292), the National Statistical Science Research Project (No. 2022LY080), the Shandong Provincial Natural Science Foundation (No. ZR2022QA109), and the Jinan Science and Technology Bureau (No. 2021GXRC056).

Data Availability Statement

The data are freely available for academic use from GitHub at https://github.com/yanxiaodong128/datareconstruction (accessed on 1 April 2024).

Conflicts of Interest

The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Dixon, M.J.; Graydon, C.; Harrigan, K.A.; Wojtowicz, L.; Siu, V.; Fugelsang, J.A. The allure of multi-line games in modern slot machines. Addiction 2014, 109, 1920–1928.
  2. Dixon, M.J.; Stange, M.; Larche, C.J.; Graydon, C.; Fugelsang, J.A.; Harrigan, K.A. Dark flow, depression and multiline slot machine play. J. Gambl. Stud. 2018, 34, 73–84.
  3. Hollingshead, S.J.; Davis, C.G.; Wohl, M.J. The customer-brand relationship in the gambling industry: Positive play predicts attitudinal and behavioral loyalty. Int. Gambl. Stud. 2023, 23, 118–138.
  4. Kelly, J.L. A new interpretation of information rate. Bell Syst. Tech. J. 1956, 35, 917–926.
  5. Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction, 2nd ed.; MIT Press: Cambridge, MA, USA, 2018.
  6. Ethier, S.N.; Lee, J. A Markovian slot machine and Parrondo’s paradox. Ann. Appl. Probab. 2010, 20, 1098–1125.
  7. Auer, P.; Cesa-Bianchi, N.; Fischer, P. Finite-time analysis of the multi-armed bandit problem. Mach. Learn. 2002, 47, 235–256.
  8. Chen, Z.; Epstein, L.G. A central limit theorem for sets of probability measures. Stoch. Process. Their Appl. 2022, 152, 424–451.
  9. Chen, Z.; Feng, S.; Zhang, G. Strategic limit theorems associated bandit problems. arXiv 2022, arXiv:2204.04442.
  10. Agrawal, R. Sample mean based index policies by O(log n) regret for the multi-armed bandit problem. Adv. Appl. Probab. 1995, 27, 1054–1078.
  11. Collins, A.G.; Cockburn, J. Beyond dichotomies in reinforcement learning. Nat. Rev. Neurosci. 2020, 21, 576–586.
  12. Narisawa, N.; Chauvet, N.; Hasegawa, M.; Naruse, M. Arm order recognition in multi-armed bandit problem with laser chaos time series. Sci. Rep. 2021, 11, 4459.
  13. Harmer, G.P.; Abbott, D. A review of Parrondo’s paradox. Fluct. Noise Lett. 2002, 2, 71–107.
  14. Parrondo, J.M.; Harmer, G.P.; Abbott, D. New paradoxical games based on Brownian ratchets. Phys. Rev. Lett. 2000, 85, 5226–5229.
  15. Abbott, D. Asymmetry and disorder: A decade of Parrondo’s paradox. Fluct. Noise Lett. 2010, 9, 129–156.
  16. Pyke, R. On random walks and diffusions related to Parrondo’s games. Lect.-Notes-Monogr. Ser. 2003, 42, 185–216.
  17. Geddes, R.N. The Mills Futurity. Loose Chang. 1980, 3, 10–14.
  18. Geddes, R.N.; Saul, D.L. The mathematics of the Mills Futurity slot machine. Loose Chang. 1980, 3, 22–27.
  19. Chen, Z.; Liang, H. Proof of a conjecture about Parrondo’s paradox for two-armed slot machines. arXiv 2023, arXiv:2310.08935v2.
Figure 1. Payoff of a single arm across the full range of win probabilities.
Figure 2. Casino payoffs (theoretical values) for the full range of win probabilities for arms A and B under four different non-random strategies D. Note that the vertical scale differs among panels.
Figure 3. True mean casino payoff under a strategy of randomness with probability p γ of selecting arm A.
Figure 4. Sample mean casino payoff for the full range of win probabilities for an arm A pull under the fixed probability p B = 0.5 of an arm B pull, for various non-random strategies D.
Figure 5. (a): Sample mean casino payoffs for the full range of win probabilities for arms A and B under mixed strategy. (b): Sample mean casino payoff for the full range of win probabilities for an arm A pull under fixed probability p B = 0.5 of an arm B pull, for mixed strategy. Note that the vertical scale differs among panels.
Figure 6. Casino’s cumulative payoff vs. the number of coups for four strategies D applied to an actual two-armed antique Mills Futurity slot machine [6].
Table 1. Multi-point distribution of reward values in Modes E and O for the actual two-armed antique Mills Futurity slot machine.
| Reward | 0 | 3 | 5 | 10 | 14 | 18 | 150 |
|---|---|---|---|---|---|---|---|
| Probability, Mode E | 0.968 | 0.003 | 0.007 | 0.018 | 0.004 | 0 | 0 |
| Probability, Mode O | 0.357 | 0.576 | 0.064 | 0 | 0 | 0.002 | 0.001 |
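As a quick arithmetic check on Table 1, the sketch below verifies that each mode's probabilities sum to one and computes the mean reward per coup in each mode. The variable and function names are illustrative, and the reading of the table (rewards 0, 3, 5, 10, 14, 18, 150 with one probability row per mode) is an assumption based on the flattened layout.

```python
# Reward values and per-mode probabilities as read from Table 1 (assumed layout).
REWARDS = [0, 3, 5, 10, 14, 18, 150]
PROBS = {
    "E": [0.968, 0.003, 0.007, 0.018, 0.004, 0.0, 0.0],
    "O": [0.357, 0.576, 0.064, 0.0, 0.0, 0.002, 0.001],
}


def total_probability(mode):
    """Sum of the probabilities listed for one mode (should be 1)."""
    return sum(PROBS[mode])


def mean_reward(mode):
    """Expected reward of a single coup in the given mode."""
    return sum(r * p for r, p in zip(REWARDS, PROBS[mode]))


if __name__ == "__main__":
    for mode in ("E", "O"):
        print(mode, round(total_probability(mode), 3), round(mean_reward(mode), 3))
```

Under this reading, the mean reward is about 0.28 coins in Mode E and about 2.234 coins in Mode O, which makes the contrast between the two modes explicit.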
Share and Cite

MDPI and ACS Style

Liang, H.; Ma, J.; Wang, W.; Yan, X. Demystifying the Two-Armed Futurity Bandit’s Unfairness and Apparent Fairness. Mathematics 2024, 12, 1713. https://doi.org/10.3390/math12111713
