A Markovian Mechanism of Proportional Resource Allocation in the Incentive Model as a Dynamic Stochastic Inverse Stackelberg Game
Abstract
1. Introduction
- We have developed fast-converging algorithms for computing the solution of the dynamic inverse Stackelberg game when information about the agents is insufficient.
- The suggested algorithms can also be viewed as numerical methods for solving the corresponding static inverse Stackelberg game when the agents' payoff functions are not fully known.
- This game-theoretic model has been applied to optimal resource allocation among producers whose cost functions are insufficiently known.
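For orientation, the display below sketches the generic structure of an incentive problem formulated as an inverse Stackelberg game; the notation (reward rules σ_i, actions u_i, cost functions c_i, Principal's payoff J_0) is illustrative and need not coincide with the notation used in the paper.

```latex
% Generic inverse Stackelberg (incentive) structure -- illustrative notation only.
% The Principal announces reward rules \sigma_i that map the agents' actions to payments;
% each agent then best-responds, and the Principal's payoff is evaluated on the outcome.
\begin{align*}
  \text{Agents (given } \sigma\text{):}\quad
     & u_i^* \in \arg\max_{u_i \in U_i}\bigl[\sigma_i(u_i, u_{-i}^*) - c_i(u_i)\bigr],
       \qquad i = 1,\dots,n,\\
  \text{Principal:}\quad
     & \sigma^* \in \arg\max_{\sigma \in \Sigma}\; J_0\bigl(\sigma, u^*(\sigma)\bigr).
\end{align*}
```

When the cost functions c_i are insufficiently known, the Principal cannot solve the lower-level problem in closed form, which is what motivates the learning-type algorithms described in Section 3.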
2. Related Work
- (1) Resource dynamics are explicitly described by a phase variable that depends on the Principal's control. The control function may be non-differentiable; in this case, the dynamic equation is interpreted in terms of the Lebesgue–Stieltjes integral.
- (2) The Principal's control varies smoothly, which is formalized through the Lipschitz property of the control function. This assumption seems natural for most real organizational and economic systems.
- (3) The Principal allocates resources among the agents in proportion to their actions, which stimulates them to choose more intensive plans (a minimal sketch of this rule is given after this list).
- (4) This assumption is used to develop a genetic algorithm for calculating the Principal's optimal strategy with a non-uniform partition of the time interval.
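A minimal sketch of the proportional allocation rule from item (3), assuming a single divisible resource of size R and nonnegative agent actions; the function name and arguments are illustrative and not taken from the paper.

```python
def allocate_proportionally(total_resource, actions, eps=1e-12):
    """Split a divisible resource among agents in proportion to their actions.

    Agents that report larger (more intensive) actions receive a larger share,
    which is the incentive effect described in item (3).
    """
    total_action = sum(actions)
    if total_action < eps:            # no activity: nothing is allocated
        return [0.0 for _ in actions]
    return [total_resource * a / total_action for a in actions]


# Example: three agents with actions 1, 2, 3 share R = 12 units as 2, 4, 6.
print(allocate_proportionally(12.0, [1.0, 2.0, 3.0]))
```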
3. Model
3.1. Static Setup and Dynamic Generalization
3.2. Dynamic Setup 1
- calculation of the maximum likelihood estimate of the transition probability matrix;
- calculation of the next value of the Q-function (both steps are sketched below).
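The list above can be read as one iteration of a model-based Q-learning scheme. The sketch below is a hedged illustration under that reading, assuming a finite state and action space and a discount factor gamma; the class name, the per-(state, action) bookkeeping, and the reward averaging are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np


class ModelBasedQLearner:
    """Illustrative sketch of the two steps for Dynamic Setup 1:
    a count-based maximum likelihood estimate of the transition probabilities
    and a model-based update of the Q-function."""

    def __init__(self, n_states, n_actions, gamma=0.9):
        self.gamma = gamma
        self.counts = np.zeros((n_states, n_actions, n_states))  # N(s, a, s') visit counts
        self.reward_sum = np.zeros((n_states, n_actions))        # accumulated one-step rewards
        self.Q = np.zeros((n_states, n_actions))

    def observe(self, s, a, r, s_next):
        """Record one observed transition (s, a, r, s')."""
        self.counts[s, a, s_next] += 1
        self.reward_sum[s, a] += r

    def transition_mle(self):
        """Step 1: maximum likelihood estimate of the transition probability matrix."""
        n_sa = self.counts.sum(axis=2, keepdims=True)
        return np.divide(self.counts, n_sa, out=np.zeros_like(self.counts), where=n_sa > 0)

    def q_update(self):
        """Step 2: next value of the Q-function via a Bellman backup on the estimated model."""
        p_hat = self.transition_mle()
        n_sa = self.counts.sum(axis=2)
        r_hat = np.divide(self.reward_sum, n_sa,
                          out=np.zeros_like(self.reward_sum), where=n_sa > 0)
        v = self.Q.max(axis=1)                      # V(s') = max_a' Q(s', a')
        self.Q = r_hat + self.gamma * (p_hat @ v)   # expected reward + discounted expected value
        return self.Q
```

A typical loop would call `observe` for each simulated transition and `q_update` once per iteration.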
3.3. Dynamic Setup 2
- calculation of the estimates of the first and second moments of the distribution;
- calculation of the next approximation (an illustrative sketch of these steps is given below).
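The two steps for Dynamic Setup 2 are stated only at this level of detail here, so the sketch below is one possible reading: online estimates of the first and second moments of an observed scalar distribution, with the next approximation taken as the current mean and accepted once its standard error is small. The recursion actually used in the paper may differ; all names and the stopping rule are illustrative assumptions.

```python
import numpy as np


def next_approximation(sample_stream, tol=1e-2, max_iter=100_000):
    """One possible reading of the two steps for Dynamic Setup 2 (illustrative only):
    (i) running estimates of the first and second moments of the observed values,
    (ii) the next approximation taken as the current mean, accepted once its
    standard error drops below `tol`."""
    mean, second_moment, variance, k = 0.0, 0.0, 0.0, 0
    for k, y in enumerate(sample_stream, start=1):
        # Step 1: online estimates of E[Y] and E[Y^2].
        mean += (y - mean) / k
        second_moment += (y * y - second_moment) / k
        variance = max(second_moment - mean ** 2, 0.0)

        # Step 2: accept the current mean as the next approximation
        # once its standard error sqrt(variance / k) is small enough.
        if k >= 2 and (variance / k) ** 0.5 < tol:
            break
        if k >= max_iter:
            break
    return mean, variance, k


# Example with synthetic observations: stops after a few thousand samples.
rng = np.random.default_rng(0)
print(next_approximation(rng.normal(loc=1.0, scale=0.5, size=50_000)))
```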
4. Examples and Numerical Calculations
5. Conclusions and Future Work
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
Numerical results for the example in Section 4 (parameter labels and column headings not recovered):

Parameters: 2; 0.9; 1; 2.

| 0.35 | 0.1837500000 | 0.09187500000 | 0.2625000000 |
| 0.36 | 0.1778142858 | 0.08890714290 | 0.4172257999 |
| 0.37 | 0.1788773809 | 0.08943869045 | 0.4193054183 |
| 0.38 | 0.1816893340 | 0.09084466700 | 0.4200877844 |
| 0.39 | 0.1852001900 | 0.09260009500 | 0.4207793681 |
| 0.40 | 0.1890599915 | 0.09452999575 | 0.4214115632 |
| 0.41 | 0.1931187782 | 0.09655938910 | 0.4219456888 |
| 0.42 | 0.1973015878 | 0.09865079390 | 0.4223502262 |
| 0.43 | 0.2015667888 | 0.1007833944 | 0.4226042619 |
| 0.44 | 0.2058894150 | 0.1029447075 | 0.4226943805 |
| 0.45 | 0.2102535881 | 0.1051267940 | 0.4226120220 |

| 0.445 | 0.2070930676 | 0.1035465338 | 0.4170062493 |
| 0.4475 | 0.2078141892 | 0.1039070946 | 0.4193088334 |
| 0.45 | 0.2085934568 | 0.1042967284 | 0.4190913001 |
| 0.44875 | 0.2075715326 | 0.1037857663 | 0.4175845248 |
| 0.449375 | 0.2075256146 | 0.1037628073 | 0.4180160874 |