Article

A New Entropy Optimization Model for Graduation of Data in Survival Analysis

Dayi He, Qi Huang and Jianwei Gao *
1 School of Humanities & Economics Management, Chinese University of Geosciences, Number 29, Xueyuan Rd., Haidian, Beijing 100083, China
2 Lab of Resources & Environment Management, Chinese University of Geosciences, Number 29, Xueyuan Rd., Haidian, Beijing 100083, China
3 School of Business Administration, North China Electric Power University, Number 2, Beinong Rd., Changping, Beijing 102206, China
* Author to whom correspondence should be addressed.
Entropy 2012, 14(8), 1306-1316; https://doi.org/10.3390/e14081306
Submission received: 7 March 2012 / Revised: 6 June 2012 / Accepted: 4 July 2012 / Published: 25 July 2012
(This article belongs to the Special Issue Concepts of Entropy and Their Applications)

Abstract:
Graduation of data is of great importance in survival analysis. Smoothness and goodness of fit are two fundamental requirements in graduation. Based on the defining expression for entropy in terms of a probability distribution, two optimization models, built on the Maximum Entropy Principle (MaxEnt) and the Minimum Cross Entropy Principle (MinCEnt), are presented for estimating mortality probability distributions. The results demonstrate that these two approaches meet the two basic requirements of data graduation, smoothness and goodness of fit, respectively. Then, to achieve a compromise between these requirements, a new entropy optimization model is proposed by defining a hybrid objective function combining the MaxEnt and MinCEnt objectives, linked by a given adjustment factor which reflects the preference between smoothness and goodness of fit in the graduation. The proposed approach is feasible and more reasonable in data graduation when both smoothness and goodness of fit are of concern.
PACS Codes:
89.70.Cf; 02.50.Cw

1. Introduction

Survival analysis is an important topic in actuarial science; it has a long history and many kinds of approaches exist. The most common approach to estimating the survival distribution is the parametric one, in which a theoretical survival distribution is specified and the parameters involved are determined by certain methods. Many different methods are available in the literature, such as maximum likelihood estimators [1,2,3] and Bayesian estimators [4,5,6,7]. To the best of our knowledge, the choice of theoretical survival distributions or other prior distributions is difficult and critical for the implementation of these kinds of methods. In this paper, we discuss how to utilize an entropy optimization approach to estimate the mortality distribution, based on the intrinsic relationship between entropy and a probability distribution.
Information-theoretic entropy, introduced by Shannon in 1948 [8], together with the entropy optimization principles proposed by Jaynes in 1957 [9,10] and by Kullback et al. [11,12], has widened the application area of entropy and transformed it from a measure of information into a tool of statistical inference. Generally speaking, entropy optimization includes the Maximum Entropy Principle (MaxEnt) and the Minimum Cross Entropy Principle (MinCEnt). MaxEnt estimates a probability distribution based only on the known information, without adding any other subjective information. MinCEnt estimates the probability distribution which is closest to a prior one by minimizing the cross entropy between the estimated and the prior distributions.
Based on the defining expression for entropy in terms of a probability distribution, this paper first reviews the application of entropy optimization rules to mortality distribution estimation. Then, in consideration of the goodness of fit and smoothness requirements in data graduation, a new approach is proposed which combines the MaxEnt and MinCEnt methods to balance the goodness of fit and smoothness of the estimation by means of an adjustment factor.

2. Review of Entropy Optimization Principles

There are two basic approaches to estimating probability distributions using the concept of entropy: the Maximum Entropy Principle (MaxEnt) and the Minimum Cross Entropy Principle (MinCEnt). MaxEnt was proposed by Jaynes in 1957 [9,10]. He observed that, given only some mean values, there are usually infinitely many compatible distributions. MaxEnt encourages us to select the distribution that maximizes the Shannon entropy measure while remaining consistent with the mean-value constraints. This is a natural extension of Laplace's famous principle of insufficient reason, which postulates that the uniform distribution is the most satisfactory representation of our knowledge when we know nothing about the random variable except that each probability is non-negative and that the probabilities sum to unity.
Let Θ be a discrete random variable on the probability space (Ω, F, P), where Ω = {θ_1, θ_2, …, θ_n} and P(Θ = θ_i) = p_i, i = 1, 2, …, n, is unknown and needs to be estimated from some known information denoted by g_j(θ) (j = 1, 2, …, m), such as the mean value, variance, jth moment, etc. Mathematically, MaxEnt can be described as the following optimization model:

$$\begin{aligned} \max\; & H(P) = -\sum_{i=1}^{n} p_i \ln p_i \\ \text{s.t.}\; & \sum_{i=1}^{n} p_i = 1, \quad \sum_{i=1}^{n} p_i g_j(\theta_i) = E_j,\; j = 1, \dots, m, \quad p_i \ge 0 \end{aligned} \tag{1}$$
where H(P) is the entropy of a probability distribution P on Ω.
If there is a prior distribution of Θ, i.e., Q(Θ = θi) = qi, then MinCEnt can be used to get another estimated distribution which is statistically closest to the prior distribution under the same constraints as MaxEnt. MinCEnt can be modeled as:
$$\begin{aligned} \min\; & K(P,Q) = \sum_{i=1}^{n} p_i \ln \frac{p_i}{q_i} \\ \text{s.t.}\; & \sum_{i=1}^{n} p_i = 1, \quad \sum_{i=1}^{n} p_i g_j(\theta_i) = E_j,\; j = 1, \dots, m, \quad p_i \ge 0 \end{aligned} \tag{2}$$
where K(P,Q) is the cross entropy between the probability distribution P and Q.
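The key property behind MinCEnt is Gibbs' inequality: K(P,Q) ≥ 0, with equality exactly when P = Q, so the prior itself is the closest feasible distribution whenever it satisfies the constraints. A minimal Python sketch (the four-state prior q below is an arbitrary illustrative choice, not data from the paper) checks this numerically:

```python
import math
import random

def cross_entropy_K(p, q):
    """K(P,Q) = sum_i p_i * ln(p_i / q_i), the Kullback-Leibler divergence."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

random.seed(0)
q = [0.1, 0.2, 0.3, 0.4]  # an arbitrary strictly positive prior

# K(Q,Q) = 0: the prior is closest to itself.
assert abs(cross_entropy_K(q, q)) < 1e-12

# K(P,Q) >= 0 for any other distribution P (Gibbs' inequality).
for _ in range(100):
    w = [random.random() for _ in q]
    p = [wi / sum(w) for wi in w]
    assert cross_entropy_K(p, q) >= 0.0
```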
The two models can be solved by the Lagrangian approach [13]. The solutions are:

$$p_i = \exp\Big(-\alpha_0 - \sum_{j=1}^{m} \alpha_j g_j(\theta_i)\Big) \tag{3}$$

and:

$$p_i = q_i \exp\Big(-\beta_0 - \sum_{j=1}^{m} \beta_j g_j(\theta_i)\Big) \tag{4}$$

where α_0, α_1, …, α_m and β_0, β_1, …, β_m are the Lagrangian multipliers of the two models above.
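The exponential form of solution (3) can be checked numerically. The Python sketch below (the three-state space and the target mean are illustrative assumptions, not data from the paper) solves a MaxEnt problem with a single mean constraint by bisection on the one non-trivial Lagrangian multiplier:

```python
import math

# States and a single known-information constraint: the mean must be 2.5.
theta = [1.0, 2.0, 3.0]
target_mean = 2.5

def maxent_dist(alpha):
    # Solution form (3): p_i proportional to exp(-alpha * theta_i);
    # alpha_0 is absorbed by the normalization.
    w = [math.exp(-alpha * t) for t in theta]
    z = sum(w)
    return [wi / z for wi in w]

def mean_of(alpha):
    p = maxent_dist(alpha)
    return sum(pi * t for pi, t in zip(p, theta))

# mean_of is decreasing in alpha, so bisection finds the multiplier.
lo, hi = -20.0, 20.0
for _ in range(200):
    mid = (lo + hi) / 2
    if mean_of(mid) > target_mean:
        lo = mid
    else:
        hi = mid
alpha1 = (lo + hi) / 2

p = maxent_dist(alpha1)
assert abs(sum(p) - 1.0) < 1e-9            # a valid distribution
assert abs(mean_of(alpha1) - target_mean) < 1e-6   # constraint satisfied
```

With more moment constraints the same idea applies, with one multiplier per constraint solved simultaneously.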

3. Data Graduating with Entropy Optimization Principles

Assume that there is a living group whose original population is l_0. At time x (x > 0), the living population is denoted by l_x (x = 0, 1, …, T), and we assume that l_T = 0. Let d_x (x = 0, 1, …, T − 1) be the number of deaths between time x and x + 1; then:

$$d_x = l_x - l_{x+1} \tag{5}$$
The value of d_x can usually be obtained from observation. In survival analysis, q_x denotes the mortality probability, i.e., the probability that an individual alive at time x will not live another year. The mortality probability is a basic parameter for constructing a life table and plays a very important role in life insurance actuarial science. In real situations, it can be estimated from sample data as:

$$\hat{q}_x = \frac{d_x}{l_x} \tag{6}$$

It is easy to see that, in general, the estimates \hat{q}_x do not sum to unity. Moreover, \hat{q}_x is just an estimate of q_x, which has to be graduated to approximate the real mortality probability as closely as possible. This process is called graduation of data in life insurance.
Furthermore, since the estimation of q_x is based on sample information, we must utilize the information in the sample fully while adding as little extra information as possible. This provides a reasonable foundation for using MaxEnt and MinCEnt. It should be noted, however, that the mortality probabilities do not meet the unity requirement of a probability distribution. Hence, in order to use MaxEnt and MinCEnt, let:

$$p_x = \frac{d_x}{l_0}$$

and define p_x as the death probability, i.e., the ratio of the deaths in each age group to the original population. It is easy to verify that p_x satisfies the requirements of a probability distribution (p_x ≥ 0 and \sum_x p_x = 1), and that:

$$q_x = \frac{p_x}{1 - \sum_{t=0}^{x-1} p_t}$$

Hence \hat{q}_x, the estimate of q_x, can be obtained once \hat{p}_x is determined.
Based on the sample information, the MaxEnt estimation of p_x can be described as:

$$\begin{aligned} \max\; & H(\hat{P}) = -\sum_{x} \hat{p}_x \ln \hat{p}_x \\ \text{s.t.}\; & \sum_{x} \hat{p}_x = 1, \quad \sum_{x} x^j \hat{p}_x = E_j,\; j = 1, \dots, k, \quad \hat{p}_x \ge 0 \end{aligned} \tag{7}$$

where \hat{p}_x is the estimate of p_x, E_1 is the mean value of the sample data and E_j is the jth sample moment.
On the other hand, the sample death probability p_x can be viewed as a prior distribution of the death probability, and a MinCEnt model can be established to estimate it as:

$$\begin{aligned} \min\; & K(\hat{P}, P) = \sum_{x} \hat{p}_x \ln \frac{\hat{p}_x}{p_x} \\ \text{s.t.}\; & \sum_{x} \hat{p}_x = 1, \quad \sum_{x} x^j \hat{p}_x = E_j,\; j = 1, \dots, k, \quad \hat{p}_x \ge 0 \end{aligned} \tag{8}$$
The solutions of Equations (7) and (8) can be obtained by the Lagrangian approach [13]. The results are:

$$\hat{p}_x = \exp\Big(-\alpha_0 - \sum_{j=1}^{k} \alpha_j x^j\Big) \tag{9}$$

and:

$$\hat{p}_x = p_x \exp\Big(-\beta_0 - \sum_{j=1}^{k} \beta_j x^j\Big) \tag{10}$$

where α_j, β_j (j = 0, 1, …, k) are the Lagrangian multipliers.
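As a sketch of how solution (10) behaves, the following Python snippet fits the MinCEnt model with a single moment constraint (k = 1) to the mouse data of Table 1 below. Because the constraint E_1 is computed from the prior itself, the multiplier β_1 comes out as zero and the graduated distribution reproduces the prior, foreshadowing the observation later in this section that MinCEnt returns the prior in this situation. The bisection solver is an illustrative choice; the paper does not specify a numerical method.

```python
import math

# Observed death counts for the 14 groups (Table 1); l0 = 208 mice.
d = [3, 3, 6, 6, 16, 14, 25, 20, 32, 25, 27, 13, 11, 7]
l0 = float(sum(d))
prior = [dx / l0 for dx in d]           # p_x = d_x / l0
xs = list(range(1, 15))

E1 = sum(x * p for x, p in zip(xs, prior))   # mean computed FROM the prior

def mincent_dist(beta):
    # Solution form (10) with k = 1: p_hat_x = p_x * exp(-beta0 - beta1*x);
    # beta0 is absorbed by the normalization.
    w = [p * math.exp(-beta * x) for x, p in zip(xs, prior)]
    z = sum(w)
    return [wi / z for wi in w]

def mean_of(beta):
    p = mincent_dist(beta)
    return sum(x * pi for x, pi in zip(xs, p))

# Bisection on the single multiplier beta1 (the mean decreases in beta1).
lo, hi = -5.0, 5.0
for _ in range(200):
    mid = (lo + hi) / 2
    if mean_of(mid) > E1:
        lo = mid
    else:
        hi = mid
beta1 = (lo + hi) / 2

# Since E1 came from the prior itself, beta1 -> 0 and the graduated
# distribution coincides with the prior.
assert abs(beta1) < 1e-6
p_hat = mincent_dist(beta1)
assert all(abs(a - b) < 1e-6 for a, b in zip(p_hat, prior))
```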
To clarify the feasibility and properties of the above estimation methods, the models are applied to estimating the mortality distribution on experimental data taken from Ananda et al. [5].
Example: The following data (Table 1) are death times for 208 mice that were exposed to gamma radiation. The data are divided into 14 groups by the time intervals given in [5], and the mortality distribution is then estimated by the MaxEnt and MinCEnt models.
Table 1. Experimental data.

| Time Interval (x) | Death Population (d_x) | Death Probability (p_x) |
|---|---|---|
| 1 | 3 | 0.0144 |
| 2 | 3 | 0.0144 |
| 3 | 6 | 0.0288 |
| 4 | 6 | 0.0288 |
| 5 | 16 | 0.0769 |
| 6 | 14 | 0.0673 |
| 7 | 25 | 0.1202 |
| 8 | 20 | 0.0962 |
| 9 | 32 | 0.1538 |
| 10 | 25 | 0.1202 |
| 11 | 27 | 0.1298 |
| 12 | 13 | 0.0625 |
| 13 | 11 | 0.0529 |
| 14 | 7 | 0.0337 |
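The definitions above can be checked directly on Table 1. The Python sketch below recomputes p_x = d_x/l_0 and \hat{q}_x = d_x/l_x from the death counts and verifies the identity q_x = p_x/(1 − Σ_{t<x} p_t):

```python
# Death counts per interval from Table 1; 208 mice in total.
d = [3, 3, 6, 6, 16, 14, 25, 20, 32, 25, 27, 13, 11, 7]
l0 = sum(d)                              # 208

# Living population at the start of each interval: l_{x+1} = l_x - d_x.
l = [l0]
for dx in d:
    l.append(l[-1] - dx)

p = [dx / l0 for dx in d]                # death probability p_x = d_x / l0
q = [dx / lx for dx, lx in zip(d, l)]    # mortality probability q_x = d_x / l_x

# p is a genuine probability distribution; q is not (its sum exceeds 1):
assert abs(sum(p) - 1.0) < 1e-12
assert sum(q) > 1.0

# Identity from the text: q_x = p_x / (1 - sum_{t<x} p_t).
cum = 0.0
for px, qx in zip(p, q):
    assert abs(qx - px / (1.0 - cum)) < 1e-12
    cum += px

assert round(p[0], 4) == 0.0144          # matches the first row of Table 1
```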
With the MaxEnt and MinCEnt approaches, Table 2 shows the data graduation results under different moment constraints (up to the 5th moment); they are also plotted in Figure 1 and Figure 2.
Table 2. Results of MaxEnt and MinCEnt. ("Up to E_j" means moment constraints up to the jth moment; χ² measures goodness of fit and S measures smoothness.)

| x | p_x | MaxEnt E1 | MaxEnt E2 | MaxEnt E3 | MaxEnt E4 | MaxEnt E5 | MinCEnt E1 | MinCEnt E2 | MinCEnt E3 | MinCEnt E4 | MinCEnt E5 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.0144 | 0.0447 | 0.0074 | 0.0115 | 0.0131 | 0.0126 | 0.0144 | 0.0144 | 0.0144 | 0.0144 | 0.0144 |
| 2 | 0.0144 | 0.0478 | 0.0147 | 0.0177 | 0.0177 | 0.0180 | 0.0144 | 0.0144 | 0.0144 | 0.0144 | 0.0144 |
| 3 | 0.0288 | 0.0511 | 0.0264 | 0.0271 | 0.0258 | 0.0263 | 0.0288 | 0.0288 | 0.0288 | 0.0288 | 0.0288 |
| 4 | 0.0288 | 0.0546 | 0.0434 | 0.0407 | 0.0386 | 0.0389 | 0.0288 | 0.0288 | 0.0288 | 0.0288 | 0.0288 |
| 5 | 0.0769 | 0.0583 | 0.0648 | 0.0587 | 0.0570 | 0.0567 | 0.0769 | 0.0769 | 0.0770 | 0.0769 | 0.0770 |
| 6 | 0.0673 | 0.0624 | 0.0882 | 0.0806 | 0.0805 | 0.0797 | 0.0673 | 0.0673 | 0.0674 | 0.0674 | 0.0674 |
| 7 | 0.1202 | 0.0667 | 0.1094 | 0.1036 | 0.1058 | 0.1050 | 0.1202 | 0.1202 | 0.1203 | 0.1203 | 0.1204 |
| 8 | 0.0962 | 0.0712 | 0.1237 | 0.1230 | 0.1264 | 0.1265 | 0.0962 | 0.0962 | 0.0962 | 0.0963 | 0.0963 |
| 9 | 0.1538 | 0.0762 | 0.1274 | 0.1329 | 0.1353 | 0.1363 | 0.1538 | 0.1538 | 0.1538 | 0.1538 | 0.1538 |
| 10 | 0.1202 | 0.0814 | 0.1195 | 0.1288 | 0.1284 | 0.1293 | 0.1202 | 0.1202 | 0.1202 | 0.1202 | 0.1201 |
| 11 | 0.1298 | 0.0870 | 0.1022 | 0.1105 | 0.1076 | 0.1076 | 0.1298 | 0.1298 | 0.1297 | 0.1297 | 0.1297 |
| 12 | 0.0625 | 0.0930 | 0.0796 | 0.0827 | 0.0797 | 0.0789 | 0.0625 | 0.0625 | 0.0625 | 0.0624 | 0.0625 |
| 13 | 0.0529 | 0.0994 | 0.0565 | 0.0532 | 0.0527 | 0.0520 | 0.0529 | 0.0529 | 0.0529 | 0.0529 | 0.0529 |
| 14 | 0.0337 | 0.1063 | 0.0366 | 0.0290 | 0.0315 | 0.0321 | 0.0337 | 0.0337 | 0.0337 | 0.0337 | 0.0337 |
| χ² | — | 0.3237 | 0.0432 | 0.0342 | 0.0334 | 0.0334 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| S | — | 0.0000 | 0.0000 | 0.0001 | 0.0001 | 0.0001 | 0.6184 | 0.6184 | 0.6184 | 0.6184 | 0.6184 |
Figure 1. Results of MaxEnt.
Figure 2. Results of MinCEnt.
Table 3 and Figure 3 compare the results of MaxEnt (up to the 5th moment) with the results of the maximum likelihood estimation (MLE) and Bayesian estimation (BE) methods from [5].
Table 3. Results of different graduation approaches.

| x | p_x | MaxEnt | MLE | BE |
|---|---|---|---|---|
| 1 | 0.0144 | 0.0126 | 0.0311 | 0.0321 |
| 2 | 0.0144 | 0.0180 | 0.0168 | 0.0173 |
| 3 | 0.0288 | 0.0263 | 0.0244 | 0.0249 |
| 4 | 0.0288 | 0.0389 | 0.0349 | 0.0354 |
| 5 | 0.0769 | 0.0567 | 0.0491 | 0.0496 |
| 6 | 0.0673 | 0.0797 | 0.0676 | 0.0680 |
| 7 | 0.1202 | 0.1050 | 0.0901 | 0.0902 |
| 8 | 0.0962 | 0.1265 | 0.1144 | 0.1140 |
| 9 | 0.1538 | 0.1363 | 0.1351 | 0.1342 |
| 10 | 0.1202 | 0.1293 | 0.1437 | 0.1423 |
| 11 | 0.1298 | 0.1076 | 0.1310 | 0.1297 |
| 12 | 0.0625 | 0.0789 | 0.0953 | 0.0948 |
| 13 | 0.0529 | 0.0520 | 0.0501 | 0.0505 |
| 14 | 0.0337 | 0.0321 | 0.0165 | 0.0170 |
| χ² | — | 0.0334 | 0.0756 | 0.0737 |
| S | — | 0.0001 | 0.0008 | 0.0007 |
Figure 3. Results of Different Graduation Approaches.
From above results, it can be found that:
(1) From Table 3, the results of MaxEnt have a smaller χ² value (a measure of goodness of fit to the original data, calculated from the Euclidean distance) and a smaller S (a measure of smoothness using the fourth moment of the differences) than the results of the MLE and BE approaches. Hence, the MaxEnt method of data graduation can be regarded as a better method.
(2) From Table 2, the results of the MinCEnt approach coincide with the prior distribution; the reason is that both p_x and E_j are calculated from the same experimental data. If p_x or E_j were given by other information, the results would differ from the prior distribution. Nevertheless, this extreme situation shows that MinCEnt focuses on goodness of fit in data graduation.
(3) Compared with the MinCEnt approach, the MaxEnt approach is better from the viewpoint of the smoothness of the data graduation and worse from the viewpoint of goodness of fit.
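Finding (1) can be reproduced from the columns of Table 3. Assuming the χ² statistic is the usual sum of (estimate − observation)²/observation (the paper says "Euclidean distance", so this reading is an assumption, but it reproduces the reported ordering), a short Python check confirms that MaxEnt gives the smallest value:

```python
# Columns of Table 3: original death probabilities and three graduations.
orig   = [0.0144, 0.0144, 0.0288, 0.0288, 0.0769, 0.0673, 0.1202,
          0.0962, 0.1538, 0.1202, 0.1298, 0.0625, 0.0529, 0.0337]
maxent = [0.0126, 0.0180, 0.0263, 0.0389, 0.0567, 0.0797, 0.1050,
          0.1265, 0.1363, 0.1293, 0.1076, 0.0789, 0.0520, 0.0321]
mle    = [0.0311, 0.0168, 0.0244, 0.0349, 0.0491, 0.0676, 0.0901,
          0.1144, 0.1351, 0.1437, 0.1310, 0.0953, 0.0501, 0.0165]
be     = [0.0321, 0.0173, 0.0249, 0.0354, 0.0496, 0.0680, 0.0902,
          0.1140, 0.1342, 0.1423, 0.1297, 0.0948, 0.0505, 0.0170]

def chi2(est, obs):
    """Pearson-style goodness-of-fit statistic: sum of (est-obs)^2 / obs."""
    return sum((e - o) ** 2 / o for e, o in zip(est, obs))

# MaxEnt fits the original data more closely than both MLE and BE:
assert chi2(maxent, orig) < chi2(mle, orig)
assert chi2(maxent, orig) < chi2(be, orig)
```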

4. Data Graduating by Combining MaxEnt and MinCEnt

Smoothness and goodness of fit have always been the most important considerations in graduation of data, and many techniques have been developed. For example, the most widely used method, especially by North American actuaries for the construction of life tables, is the Whittaker-Henderson method of graduation. This method originates in the work of Bohlmann (1899) [14] and Whittaker (1923) [15], and contributions to the theory were made by Henderson (1924, 1925) [16,17] and others. The Whittaker-Henderson method gives the graduated values by minimizing the quantity:
$$\min M = F + hS = \sum_{x} w_x (v_x - u_x)^2 + h \sum_{x} (\Delta^z v_x)^2 \tag{11}$$

where F and S are the weighted measure of goodness of fit to the original data and the measure of smoothness, respectively; u_x is the prior (observed) mortality, v_x is the estimated mortality, i.e., the result of the data graduation, and Δ^z v_x is the zth difference of v_x (usually z = 3 or higher); w_x is a weight coefficient and h is a positive adjustment factor between goodness of fit and smoothness. This method is widely used and has become a basic logic in the graduation of data which many other approaches follow. Generally speaking, the characteristics of different data graduation methods differ in two respects: the emphasis placed on goodness of fit versus smoothness, and the way in which smoothness and goodness of fit are measured. Therefore, graduation of data can be viewed as a bi-objective problem: on the one hand, the graduated results should be smooth, and on the other hand, they should be close to the original data. The previous section showed that the results of MaxEnt graduation are smoother while those of MinCEnt are closer to the original data, so we propose a new approach to data graduation which combines both methods in the following model:
$$\begin{aligned} \min\; & G(\hat{P}) = \mu \sum_{x} \hat{p}_x \ln \hat{p}_x + (1 - \mu) \sum_{x} \hat{p}_x \ln \frac{\hat{p}_x}{p_x} \\ \text{s.t.}\; & \sum_{x} \hat{p}_x = 1, \quad \sum_{x} x^j \hat{p}_x = E_j,\; j = 1, \dots, k, \quad \hat{p}_x \ge 0 \end{aligned} \tag{12}$$

where 0 ≤ μ ≤ 1 is a given adjustment factor between smoothness and goodness of fit. When μ = 0 the model reduces to the MinCEnt approach, and when μ = 1 to the MaxEnt approach, since minimizing \sum_x \hat{p}_x \ln \hat{p}_x is equivalent to maximizing H(\hat{P}). Because Equations (7) and (8) are solvable, it is easy to conclude that Equation (12) is solvable too.
In the above model, the MaxEnt term serves as a measure of smoothness and the MinCEnt term as a measure of goodness of fit. The two measures are integrated with a linear coefficient μ to reflect different weights on smoothness and goodness of fit. The reason for adopting a convex combination is to assure the convexity of the objective function G and the solvability of the proposed model. In practice, how to decide the appropriate weight between smoothness and goodness of fit is a highly controversial topic. We propose that the value of μ be determined in the same way as h in Equation (11), which is usually determined by experimental approaches.
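To see how μ interpolates between the two principles, consider model (12) with only the normalization constraint retained. Setting the derivative of the Lagrangian to zero gives the closed-form minimizer p̂_i ∝ q_i^{1−μ}, which equals the prior at μ = 0 and the uniform distribution at μ = 1. The Python sketch below (with an assumed three-state prior and an illustrative μ, not data from the paper) confirms this against a brute-force grid search over the simplex:

```python
import math

q = [0.1, 0.3, 0.6]   # an assumed 3-state prior distribution
mu = 0.4              # illustrative adjustment factor

def G(p):
    # mu * (negative entropy) + (1 - mu) * cross entropy to the prior q.
    return (mu * sum(pi * math.log(pi) for pi in p)
            + (1 - mu) * sum(pi * math.log(pi / qi) for pi, qi in zip(p, q)))

# Closed-form minimizer under the normalization constraint alone:
# p_i proportional to q_i ** (1 - mu).
w = [qi ** (1 - mu) for qi in q]
p_star = [wi / sum(w) for wi in w]

# Grid search over the 2-simplex confirms p_star up to grid resolution.
step = 0.002
best, best_val = None, float("inf")
for i in range(1, 500):
    for j in range(1, 500 - i):
        p = [i * step, j * step, 1.0 - (i + j) * step]
        val = G(p)
        if val < best_val:
            best, best_val = p, val

assert all(abs(a - b) < 0.005 for a, b in zip(best, p_star))
```

At μ = 0 the same formula returns q itself, and at μ = 1 it returns the uniform distribution, matching the MinCEnt and MaxEnt limits of model (12).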
Based on the data of the above example, we calculate the estimated mortality distribution with this model. Table 4 and Figure 4 give the results of this approach when taking moments up to the 5th as constraints.
Table 4. Combination method results.

| x | p_x | μ = 0.1 | μ = 0.3 | μ = 0.5 | μ = 0.7 | μ = 0.9 |
|---|---|---|---|---|---|---|
| 1 | 0.0144 | 0.0142 | 0.0139 | 0.0135 | 0.0132 | 0.0128 |
| 2 | 0.0144 | 0.0147 | 0.0154 | 0.0162 | 0.0169 | 0.0176 |
| 3 | 0.0288 | 0.0286 | 0.0282 | 0.0277 | 0.0272 | 0.0266 |
| 4 | 0.0288 | 0.0298 | 0.0317 | 0.0337 | 0.0358 | 0.0378 |
| 5 | 0.0769 | 0.0749 | 0.0707 | 0.0666 | 0.0626 | 0.0586 |
| 6 | 0.0673 | 0.0687 | 0.0713 | 0.0738 | 0.0762 | 0.0786 |
| 7 | 0.1202 | 0.1189 | 0.1160 | 0.1130 | 0.1099 | 0.1067 |
| 8 | 0.0962 | 0.0991 | 0.1048 | 0.1108 | 0.1169 | 0.1233 |
| 9 | 0.1538 | 0.1521 | 0.1487 | 0.1452 | 0.1416 | 0.1381 |
| 10 | 0.1202 | 0.1211 | 0.1231 | 0.1250 | 0.1268 | 0.1285 |
| 11 | 0.1298 | 0.1274 | 0.1230 | 0.1185 | 0.1141 | 0.1097 |
| 12 | 0.0625 | 0.0640 | 0.0672 | 0.0705 | 0.0738 | 0.0772 |
| 13 | 0.0529 | 0.0529 | 0.0528 | 0.0527 | 0.0525 | 0.0522 |
| 14 | 0.0337 | 0.0335 | 0.0332 | 0.0329 | 0.0326 | 0.0323 |
Figure 4. Combination Method Results.
From the above results, it can be concluded that the proposed method is feasible and the adjustment factor plays an important role in trading off smoothness and goodness of fit, which provides flexibility in graduation of data.

5. Conclusions

In this paper, based on the defining expression for entropy in terms of a probability distribution, applications of entropy optimization in survival analysis were discussed. It was found that the results of the MaxEnt model emphasize the smoothness of the data graduation, while the results of the MinCEnt model emphasize goodness of fit to the original data. In consideration of both requirements in data graduation, a new approach was proposed that combines MaxEnt and MinCEnt and provides a trade-off between smoothness and goodness of fit by means of a given adjustment factor.

Acknowledgments

This research is supported by the Program for New Century Excellent Talents in University (NCET-10-0375) from the Ministry of Education of China, and by the Fundamental Research Funds for the Central Universities.

References

  1. Broffitt, J.D. Maximum likelihood alternatives to actuarial estimators of mortality rates. Trans. Soc. Actuar. 1984, 36, 77–142. [Google Scholar]
  2. Zhou, J.; Liu, J.; Li, Y. The Theory of Constructing Life Table; Nankai University Press: Tianjin, China, 2001. [Google Scholar]
  3. Dennis, T.H.; Fellingham, G.W. Likelihood methods for combining tables of data. Scand. Actuar. J. 2000, 2, 89–101. [Google Scholar]
  4. Singh, A.K.; Ananda, M.A.; Dalpatadu, R. Bayesian estimation of tabular survival models from complete samples. Actuar. Res. Clear. House 1993, 1, 335–342. [Google Scholar]
  5. Ananda, M.M.; Dalpatadu, R.J.; Singh, A.K. Estimating parameters of the force of mortality in actuarial studies. Actuar. Res. Clear. House 1993, 1, 129–141. [Google Scholar]
  6. Haastrup, S. Comparison of some Bayesian analysis of heterogeneity in group life insurance. Scand. Actuar. J. 2000, 1, 2–16. [Google Scholar] [CrossRef]
  7. Nielson, A.; Lewy, P. Comparison of the frequentist properties of Bayes and the maximum likelihood estimators in an age structured fish stock assessment model. Can. J. Fish. Aquat. Sci. 2002, 59, 136–143. [Google Scholar] [CrossRef]
  8. Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423, 623–656. [Google Scholar] [CrossRef]
  9. Jaynes, E.T. Information theory and statistical mechanics. Phys. Rev. 1957, 106, 620–630. [Google Scholar] [CrossRef]
  10. Jaynes, E.T. Information theory and statistical mechanics II. Phys. Rev. 1957, 108, 171–190. [Google Scholar] [CrossRef]
  11. Kullback, S.; Leibler, R.A. On information and sufficiency. Ann. Math. Stat. 1951, 22, 79–86. [Google Scholar] [CrossRef]
  12. Kullback, S. Information Theory and Statistics; John Wiley and Sons: New York, NY, USA, 1959. [Google Scholar]
  13. Kapur, J.N.; Kesavan, H.K. Entropy optimization Principles with Applications; Academic Press Inc.: San Diego, CA, USA, 1992. [Google Scholar]
  14. Bohlmann, G. Ein Ausgleichungs Problem. In Nachrichten von der Königl Gesellschaft der Wissenschaften zu Göttingen, Mathematisch-physikalische Klasse; Horstmann, L., Ed.; Commissionsverlag der Dieterich’schen Universitätsbuchhandlung: Göttingen, Germany, 1899; pp. 260–271. [Google Scholar]
  15. Whittaker, E.T. On a new method of graduation. Proc. Edinb. Math. Soc. 1923, 41, 63–75. [Google Scholar]
  16. Henderson, R. A new method of graduation. Trans. Actuar. Soc. Am. 1924, 25, 29–53. [Google Scholar]
  17. Henderson, R. Further remarks on graduation. Trans. Actuar. Soc. Am. 1925, 26, 52–74. [Google Scholar]
  18. Nocoń, A.S.; Scott, W.F. An extension of the Whittaker-Henderson method of graduation. Scand. Actuar. J. 2012, 1, 70–79. [Google Scholar]

He, D.; Huang, Q.; Gao, J. A New Entropy Optimization Model for Graduation of Data in Survival Analysis. Entropy 2012, 14, 1306-1316. https://doi.org/10.3390/e14081306