Proceeding Paper

Calculating Percentiles of T-Distribution Using Gaussian Integration Method †

Tzu-Li Tien
Department of Vehicle Engineering, National Formosa University, Yunlin 632, Taiwan
Presented at the 2024 IEEE 6th Eurasia Conference on IoT, Communication and Engineering, Yunlin, Taiwan, 15–17 November 2024.
Eng. Proc. 2025, 92(1), 2; https://doi.org/10.3390/engproc2025092002
Published: 10 April 2025
(This article belongs to the Proceedings of 2024 IEEE 6th Eurasia Conference on IoT, Communication and Engineering)

Abstract

Statistical inference estimates population parameters from sample information and quantifies the sampling error in probabilistic terms. The population mean is inferred from the sample mean; when the population variance is unknown and the sample variance is used in its place, the t-distribution is used in the quantitative analysis of the sampling error. To determine the percentiles of the t-distribution, its cumulative distribution function is necessary. However, a closed-form expression for the cumulative distribution function of the t-distribution is not available for general degrees of freedom, so its values are obtained by numerical integration. Moreover, in textbooks on probability theory or mathematical statistics, the percentiles of the t-distribution are usually not listed for every degree of freedom over 30 but only for every 10 degrees of freedom. This is inconvenient for research. Therefore, the cumulative distribution function of the t-distribution was calculated using the Gaussian integration method in this study. The results show that the percentiles of the t-distribution are accurately estimated using the algorithm developed in this study.

1. Introduction

A census measures or examines every member of a population, whereas sampling measures or examines only some members of it. A census is costly and time-consuming, and human errors arise from personnel fatigue. Therefore, a census is not necessarily more efficient than sampling, and quantifying the sampling error becomes critical.
The t-distribution is used in the quantitative analysis of sampling errors. To determine the percentiles of the t-distribution, its cumulative distribution function is used [1,2,3]. The values of this function are obtained by numerical integration. The Gaussian integration method is adopted in this study to integrate the probability density function of the t-distribution [4,5]. In effect, the Gaussian integration method fits the integrand with a power series (a polynomial). If the step size is small enough and does not lead to truncation error, accurate numerical integration is anticipated. The results of this study show that highly accurate percentiles of the t-distribution can be obtained using the algorithm developed in this article.

2. Gaussian Integration Method

In the Gaussian integration method, +1 and −1 are taken as the upper and lower limits of integration, respectively. The integral is obtained by evaluating the integrand at specific points, multiplying each value by its corresponding weight, and summing the products. These specific points are called Gaussian points [4,5].
$$I = \int_{-1}^{+1} f(\lambda)\,d\lambda = \sum_{i=1}^{m} w_i f(\lambda_i) \tag{1}$$
For example, the two-point formula (m = 2 in (1)) is derived as follows.
When m equals 2 in (1), two weighting coefficients, $w_1$ and $w_2$, and two sampling points, $\lambda_1$ and $\lambda_2$, must be determined in advance. There are four unknowns, so let
$$f(\lambda) = a_0 + a_1\lambda + a_2\lambda^2 + a_3\lambda^3 \tag{2}$$
For arbitrary $a_i$, $i = 0, 1, \ldots, 3$, the following condition must be satisfied:
$$\int_{-1}^{+1} \left(a_0 + a_1\lambda + a_2\lambda^2 + a_3\lambda^3\right) d\lambda = w_1 f(\lambda_1) + w_2 f(\lambda_2) \tag{3}$$
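This implies the following:
(3) must be satisfied when $a_0 \neq 0$, $a_1 = a_2 = a_3 = 0$. Therefore,
$$w_1 + w_2 = 2 \tag{4}$$
Let $a_1 \neq 0$, $a_0 = a_2 = a_3 = 0$ in (3), then
$$\int_{-1}^{+1} a_1\lambda\,d\lambda = w_1 a_1\lambda_1 + w_2 a_1\lambda_2 \;\Rightarrow\; w_1\lambda_1 + w_2\lambda_2 = 0 \tag{5}$$
When $a_2 \neq 0$, $a_0 = a_1 = a_3 = 0$ in (3), we have
$$\int_{-1}^{+1} a_2\lambda^2\,d\lambda = w_1 a_2\lambda_1^2 + w_2 a_2\lambda_2^2 \;\Rightarrow\; w_1\lambda_1^2 + w_2\lambda_2^2 = \frac{2}{3} \tag{6}$$
When $a_3 \neq 0$, $a_0 = a_1 = a_2 = 0$ in (3),
$$\int_{-1}^{+1} a_3\lambda^3\,d\lambda = w_1 a_3\lambda_1^3 + w_2 a_3\lambda_2^3 \;\Rightarrow\; w_1\lambda_1^3 + w_2\lambda_2^3 = 0 \tag{7}$$
Using (4) to (7),
$$w_1 = w_2 = 1.0, \quad \lambda_1 = -\frac{1}{\sqrt{3}}, \quad \text{and} \quad \lambda_2 = \frac{1}{\sqrt{3}} \tag{8}$$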
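The two-point result can be checked numerically. The following is a minimal sketch, not the author's code: the function name `gauss2` and the cubic coefficients are hypothetical and chosen only for illustration. It applies the weights and sampling points of (8) to an arbitrary cubic and compares the result with the exact integral over [−1, +1].

```python
import math

def gauss2(f):
    """Two-point Gaussian rule on [-1, +1]: w1 = w2 = 1, nodes -/+ 1/sqrt(3)."""
    node = 1.0 / math.sqrt(3.0)
    return f(-node) + f(node)

# Hypothetical cubic coefficients, chosen only for illustration.
a0, a1, a2, a3 = 2.0, -1.0, 3.0, 0.5

def cubic(lam):
    return a0 + a1 * lam + a2 * lam**2 + a3 * lam**3

# Exact integral over [-1, +1]: the odd powers integrate to zero.
exact = 2.0 * a0 + 2.0 * a2 / 3.0
approx = gauss2(cubic)
print(exact, approx)
```

The two printed values should agree to within floating-point rounding, which is the condition expressed by (3).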
The accuracy of the evaluation increases with the number of Gaussian points. The sampling points and their corresponding weighting coefficients for m = 1 through 6 are listed in Table 1 [4,5].
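If the upper and lower limits are not exactly equal to +1 and −1,
$$I = \int_{a}^{b} f(x)\,dx \tag{9}$$
Equation (9) can be normalized as
$$x = h_0 + h_1\lambda \tag{10}$$
Then,
$$a = h_0 + h_1(-1) \tag{11}$$
$$b = h_0 + h_1(1) \tag{12}$$
By solving (11) and (12),
$$h_0 = \frac{b + a}{2} \tag{13}$$
$$h_1 = \frac{b - a}{2} \tag{14}$$
$$x = \frac{b + a}{2} + \frac{b - a}{2}\lambda \tag{15}$$
$$dx = \frac{b - a}{2}\,d\lambda \tag{16}$$
Substituting (15) and (16) into (9), the following is obtained:
$$I = \left(\frac{b - a}{2}\right) \int_{-1}^{+1} f\!\left(\frac{b + a}{2} + \frac{b - a}{2}\lambda\right) d\lambda = \left(\frac{b - a}{2}\right) \int_{-1}^{+1} F(\lambda)\,d\lambda = \left(\frac{b - a}{2}\right) \sum_{i=1}^{m} w_i F(\lambda_i) \tag{17}$$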
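The change of variables in (15) to (17) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: the function name `gauss5` and the test integrand are hypothetical, and the five-point sampling points and weights are copied from Table 1.

```python
import math

# Five-point sampling points and weighting coefficients (Table 1).
NODES_5 = (-0.906179845938664, -0.538469310105683, 0.0,
           0.538469310105683, 0.906179845938664)
WEIGHTS_5 = (0.236926885056189, 0.478628670499366, 0.568888888888889,
             0.478628670499366, 0.236926885056189)

def gauss5(f, a, b):
    """Approximate the integral of f over [a, b] using (17) with m = 5."""
    h0 = 0.5 * (b + a)   # midpoint, as in (13)
    h1 = 0.5 * (b - a)   # half-width, as in (14)
    return h1 * sum(w * f(h0 + h1 * lam) for lam, w in zip(NODES_5, WEIGHTS_5))

# Illustrative check: the integral of exp(x) over [0, 1] is e - 1.
approx = gauss5(math.exp, 0.0, 1.0)
exact = math.e - 1.0
print(approx, exact)
```

Because exp(x) is smooth on [0, 1], the single-interval five-point result should already agree with e − 1 to many digits; less smooth or longer intervals generally need subdivision.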

3. Evaluating Percentiles of T-Distribution

The t-distribution is short for Student's t-distribution, which was proposed by William Sealy Gosset. It is used to infer the mean of a normally distributed population from a small sample when the population variance is unknown. The quantitative analysis of sampling error therefore commonly involves the t-distribution.
The probability density function of the t-distribution with degrees of freedom n is calculated as
$$T_n(x) = \frac{\Gamma\big((n + 1)/2\big)}{\sqrt{n\pi}\,\Gamma(n/2)} \left(1 + \frac{x^2}{n}\right)^{-(n + 1)/2} \tag{18}$$
where n is the degrees of freedom and
$$\Gamma(\upsilon) = \int_{0}^{\infty} t^{\upsilon - 1} e^{-t}\,dt \tag{19}$$
is the Gamma function.
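Because the degrees of freedom n in (18) is a natural number, the Gamma functions in (18) can be evaluated with the following identities, which hold for any natural number n:
$$\Gamma(n) = (n - 1)! \tag{20}$$
and
$$\Gamma\!\left(n + \frac{1}{2}\right) = \frac{(2n)!\,\sqrt{\pi}}{4^{n}\,n!} \tag{21}$$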
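As an illustration of the density and the Gamma-function identities above, the following minimal sketch evaluates the probability density function with Python's standard library. The names `t_pdf` and `gamma_half` are hypothetical, and the use of `math.lgamma` (the logarithm of the Gamma function) is an implementation choice made here to avoid overflow for large n; it is not stated in the paper.

```python
import math

def gamma_half(n):
    """Gamma(n + 1/2) from the factorial identity, for a natural number n."""
    return math.factorial(2 * n) * math.sqrt(math.pi) / (4**n * math.factorial(n))

def t_pdf(x, n):
    """Probability density of the t-distribution with n degrees of freedom."""
    # Work with logarithms so that large n does not overflow the Gamma function.
    log_c = (math.lgamma((n + 1) / 2) - math.lgamma(n / 2)
             - 0.5 * math.log(n * math.pi))
    return math.exp(log_c - 0.5 * (n + 1) * math.log1p(x * x / n))

# Check the identities against the library Gamma function.
print(math.gamma(5), math.factorial(4))   # Gamma(n) = (n - 1)! with n = 5
print(math.gamma(3.5), gamma_half(3))     # Gamma(n + 1/2) with n = 3
# For n = 1 the density at x = 0 equals 1/pi.
print(t_pdf(0.0, 1), 1.0 / math.pi)
```

Each printed pair should agree to within floating-point rounding.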
To evaluate the $100(1 - \alpha)$ percentile in Figure 1, $t_{\alpha,n}$ must be evaluated. The probability $P(T_n > t_{\alpha,n})$ that the random variable $T_n$ following the t-distribution exceeds $t_{\alpha,n}$ is $\alpha$. To find $t_{\alpha,n}$, (22) is used.
$$P(T_n > t_{\alpha,n}) = \alpha \tag{22}$$
or
$$\int_{t_{\alpha,n}}^{\infty} \frac{\Gamma\big((n + 1)/2\big)}{\sqrt{n\pi}\,\Gamma(n/2)} \left(1 + \frac{x^2}{n}\right)^{-(n + 1)/2} dx = \alpha \tag{23}$$
where $\alpha$ is a function of $t_{\alpha,n}$ in (23). The inverse function cannot be written explicitly, so $t_{\alpha,n}$ is found by Newton's iteration method. Equation (23) is rewritten as
$$g(t_{\alpha,n}) = \int_{t_{\alpha,n}}^{\infty} \frac{\Gamma\big((n + 1)/2\big)}{\sqrt{n\pi}\,\Gamma(n/2)} \left(1 + \frac{x^2}{n}\right)^{-(n + 1)/2} dx - \alpha \tag{24}$$
Then, to find $t_{\alpha,n}$, the equation
$$g(t_{\alpha,n}) = 0 \tag{25}$$
is solved. Given a guess value $t_{\alpha,n,0}$, the update
$$t_{\alpha,n,new} = t_{\alpha,n,old} - \frac{g(t_{\alpha,n,old})}{g'(t_{\alpha,n,old})} \tag{26}$$
is applied and the iteration is conducted. $g'(t_{\alpha,n,old})$ in (26) is obtained by the following numerical differentiation.
$$g'(t_{\alpha,n,old}) = \lim_{\Delta t_{\alpha,n} \to 0} \frac{g(t_{\alpha,n,old} + \Delta t_{\alpha,n}) - g(t_{\alpha,n,old})}{\Delta t_{\alpha,n}} \tag{27}$$
Because the probability density function of the t-distribution is symmetric about x = 0, the upper-tail probability equals 0.5 minus the integral of $T_n(x)$ over the interval $(0, t_{\alpha,n})$. This integral is evaluated with the five-point Gaussian integration method, with the interval $(0, t_{\alpha,n})$ divided into $10^{6}$ equally spaced subintervals. With a guess value of 1.0, $\Delta t_{\alpha,n}$ in (27) is set to $10^{-5}$, and the iteration is stopped when the magnitude of $g(t_{\alpha,n,old})/g'(t_{\alpha,n,old})$ becomes smaller than $10^{-5}$. The resulting $100(1 - \alpha)$ percentiles $t_{\alpha,n}$ for degrees of freedom n of 1 through 120 and $\alpha$ equal to 0.200, 0.100, 0.050, 0.025, 0.010, 0.005, 0.001, and 0.0005 are listed in Table 2.
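The procedure described in this section can be sketched as follows. This is a minimal illustration, not the author's program: the function names are hypothetical, the sampling points and weights are the five-point values from Table 1, and the number of subintervals defaults to 2000 rather than the $10^{6}$ used in the paper, an assumption made here only to keep the pure-Python run time short.

```python
import math

# Five-point sampling points and weighting coefficients (Table 1).
NODES_5 = (-0.906179845938664, -0.538469310105683, 0.0,
           0.538469310105683, 0.906179845938664)
WEIGHTS_5 = (0.236926885056189, 0.478628670499366, 0.568888888888889,
             0.478628670499366, 0.236926885056189)

def t_pdf(x, n):
    """Probability density of the t-distribution with n degrees of freedom."""
    log_c = (math.lgamma((n + 1) / 2) - math.lgamma(n / 2)
             - 0.5 * math.log(n * math.pi))
    return math.exp(log_c - 0.5 * (n + 1) * math.log1p(x * x / n))

def upper_tail(t, n, nsect=2000):
    """P(T_n > t) as 0.5 minus the composite five-point Gauss integral over (0, t)."""
    half = 0.5 * t / nsect            # half-width of one subinterval
    total = 0.0
    for k in range(nsect):
        mid = (2 * k + 1) * half      # midpoint of subinterval k
        total += sum(w * t_pdf(mid + half * lam, n)
                     for lam, w in zip(NODES_5, WEIGHTS_5))
    return 0.5 - half * total

def t_percentile(alpha, n, guess=1.0, dt=1e-5, tol=1e-5, nsect=2000, max_iter=100):
    """Newton iteration on g(t) = P(T_n > t) - alpha with a finite-difference slope."""
    t = guess
    for _ in range(max_iter):
        g = upper_tail(t, n, nsect) - alpha
        dg = (upper_tail(t + dt, n, nsect) - alpha - g) / dt   # forward difference
        step = g / dg
        t -= step
        if abs(step) < tol:
            return t
    raise RuntimeError("Newton iteration did not converge")

t_10 = t_percentile(0.05, 10)
print(round(t_10, 4))   # Table 2 lists 1.8125 for n = 10, alpha = 0.050
```

By the fundamental theorem of calculus, the slope of g is also available analytically as the negative of the density at t, so the forward difference could be replaced by `-t_pdf(t, n)`; the sketch keeps the numerical derivative to mirror the description above.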

4. Conclusions

The derivation of the Gaussian numerical integration in (1)–(17) shows that when m = 2, two weighting coefficients, $w_1$ and $w_2$, and two sampling points, $\lambda_1$ and $\lambda_2$, need to be determined, resulting in four unknowns, so the integrand is fitted as a cubic polynomial. In general, therefore, the form of the integrand in (1) is
$$f(\lambda) = \sum_{i=0}^{2m-1} a_i \lambda^{i} \tag{28}$$
where m is the number of sampling points in the Gaussian integration. When (1) is used to integrate a polynomial of this form, the theoretical error is zero even if the integration interval is not subdivided. However, since the probability density function of the t-distribution in (18) does not have the finite polynomial form of (28), it is necessary to partition the integration interval appropriately, apply Gaussian integration with m sampling points on each subinterval, and sum the results.
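The role of the polynomial degree 2m − 1 can be illustrated with the three-point rule from Table 1. In this minimal sketch (the name `gauss3` and the test monomials are chosen here for illustration), a monomial of degree 4, which is within 2m − 1 = 5, is integrated without error, whereas a monomial of degree 6 is not.

```python
# Three-point sampling points and weighting coefficients (Table 1).
NODES_3 = (-0.774596669241483, 0.0, 0.774596669241483)
WEIGHTS_3 = (0.555555555555556, 0.888888888888889, 0.555555555555556)

def gauss3(f):
    """Three-point Gaussian rule on [-1, +1]."""
    return sum(w * f(lam) for lam, w in zip(NODES_3, WEIGHTS_3))

quartic = gauss3(lambda lam: lam**4)   # degree 4 <= 2m - 1 = 5
sextic = gauss3(lambda lam: lam**6)    # degree 6 > 2m - 1 = 5

print(quartic, 2 / 5)   # exact integral of lam^4 over [-1, +1] is 2/5
print(sextic, 2 / 7)    # exact integral of lam^6 over [-1, +1] is 2/7
```

The first pair should agree to within rounding, while the second pair differs visibly, which is why a non-polynomial integrand calls for subdividing the interval.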
The probability density function $T_n(x)$ of the t-distribution with degrees of freedom n in (18) indicates that the $100(1 - \alpha)$ percentile $t_{\alpha,n}$ for n = 1 is particularly difficult to calculate accurately, especially the $100(1 - 0.0005)$ percentile $t_{0.0005,1} = 636.619$.
This value is therefore calculated to verify the correctness of the $100(1 - \alpha)$ percentiles $t_{\alpha,n}$ of the t-distribution listed in Table 2. The Newton–Raphson method in (26) and (27) is used to calculate the percentiles of the t-distribution, with an initial guess of 1.0 and an increment $\Delta t_{\alpha,n}$ equal to $10^{-5}$ in (27). Iteration stops when the magnitude of the adjustment $g(t_{\alpha,n,old})/g'(t_{\alpha,n,old})$ in (26) is smaller than the specified threshold of $10^{-5}$.
For n = 1, $P(T_1 > t_{0.0005,1}) = \alpha = 0.0005$. The Gaussian integration method is used with m = 5 sampling points, dividing the interval from 0 to $t_{0.0005,1}$ into $10^{6}$ equal subintervals, and the result is $t_{0.0005,1} = 636.619249$ after 15 iterations. The interval from 0 to $t_{0.0005,1}$ is also partitioned into varying numbers of equal subintervals, NSECT = 100, 175, 250, 500, 750, 1000, 2500, 5000, $10^{4}$, $10^{5}$, $10^{6}$, and $10^{7}$, and the results are listed in Table 3. With 1000 to 2500 subintervals, the computed $100(1 - 0.0005)$ percentile of the t-distribution approaches 636.619249. Moreover, as NSECT increases further, the calculated result remains 636.619249, indicating that rounding errors cause no issue in the computation.
To verify the accuracy of $t_{0.0005,1} = 636.619249$, this value is substituted into (23) for validation. The interval from 0 to 636.619249 is partitioned into NSECT = 100, 175, 250, 500, 750, 1000, 2500, 5000, $10^{4}$, $10^{5}$, $10^{6}$, and $10^{7}$ equal subintervals, and the probability $P(T_1 > 636.619249)$ that the random variable of the t-distribution exceeds 636.619249 is calculated (Table 4). If the number of subintervals NSECT is 750 or more, the probability $P(T_1 > 636.619249)$ is accurately calculated as 0.000500.
The negative result for NSECT = 100 in Table 4 arises because 100 subintervals are too few: the integration error is large enough that the computed probability $P(0 < T_1 < 636.619249)$ exceeds 0.5. Thus, it is reasonable to conclude that setting the increment $\Delta t_{\alpha,n}$ to $10^{-5}$ in (27), stopping the iteration when the magnitude of the adjustment $g(t_{\alpha,n,old})/g'(t_{\alpha,n,old})$ in (26) is less than the specified threshold of $10^{-5}$, and selecting $10^{6}$ as the number of subintervals is valid. Each of the $100(1 - \alpha)$ percentiles of the t-distribution in Table 2 is calculated in five seconds. Furthermore, rounding the values in Table 2 to three decimal places yields results that match exactly those from a highly authoritative publication [6].
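The subinterval study for n = 1 can be cross-checked independently, because for one degree of freedom the t-distribution has the density $1/(\pi(1 + x^2))$, whose upper-tail probability has the closed form $0.5 - \arctan(t)/\pi$. The closed form is a standard result and is used here only as a reference; it is not part of the paper's method. The following minimal sketch uses hypothetical function names and a reduced list of NSECT values.

```python
import math

# Five-point sampling points and weighting coefficients (Table 1).
NODES_5 = (-0.906179845938664, -0.538469310105683, 0.0,
           0.538469310105683, 0.906179845938664)
WEIGHTS_5 = (0.236926885056189, 0.478628670499366, 0.568888888888889,
             0.478628670499366, 0.236926885056189)

def pdf_n1(x):
    """Density of the t-distribution with n = 1: 1 / (pi * (1 + x^2))."""
    return 1.0 / (math.pi * (1.0 + x * x))

def upper_tail_n1(t, nsect):
    """P(T_1 > t) as 0.5 minus the composite five-point Gauss integral over (0, t)."""
    half = 0.5 * t / nsect
    total = 0.0
    for k in range(nsect):
        mid = (2 * k + 1) * half
        total += sum(w * pdf_n1(mid + half * lam)
                     for lam, w in zip(NODES_5, WEIGHTS_5))
    return 0.5 - half * total

t_ref = 636.619249
exact_tail = 0.5 - math.atan(t_ref) / math.pi   # closed form, valid for n = 1 only
exact_t = 1.0 / math.tan(math.pi * 0.0005)      # closed-form percentile for n = 1

for nsect in (100, 250, 1000, 5000):
    print(nsect, f"{upper_tail_n1(t_ref, nsect):.6f}")
print("closed form:", f"{exact_tail:.6f}", f"{exact_t:.6f}")
```

The printed probabilities for small NSECT can be compared with Table 4, and the closed-form values give a reference that does not depend on the number of subintervals.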

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Montgomery, D.C. Design and Analysis of Experiments; John Wiley & Sons: New York, NY, USA, 2001.
  2. Bain, L.J.; Engelhardt, M. Introduction to Probability and Mathematical Statistics; Duxbury Press: Belmont, CA, USA, 1992.
  3. Devore, J.L. Probability and Statistics for Engineering and the Sciences; Brooks/Cole: Pacific Grove, CA, USA, 2012.
  4. Zienkiewicz, O.C. The Finite Element Method; McGraw-Hill Book Co.: Maidenhead, UK, 1977.
  5. Bathe, K.J. Finite Element Procedures in Engineering Analysis; Prentice-Hall: Upper Saddle River, NJ, USA, 1982.
  6. Pearson, E.S.; Hartley, H.O. Biometrika Tables for Statisticians; Cambridge University Press: Cambridge, UK, 1966.
Figure 1. The 100(1 − α) percentile tα,n corresponding to degrees of freedom n.
Table 1. Sampling points and their corresponding weighting coefficients for numbers of sampling points m = 1 through 6.

| Number of Sampling Points (m) | Sampling Points (λi) | Weighting Coefficients (wi) |
|---|---|---|
| 1 | 0.0 | 2.0 |
| 2 | λ1 = −0.57735 02691 89626 | w1 = 1.0 |
|   | λ2 = 0.57735 02691 89626 | w2 = 1.0 |
| 3 | λ1 = −0.77459 66692 41483 | w1 = 0.55555 55555 55556 |
|   | λ2 = 0.00000 00000 00000 | w2 = 0.88888 88888 88889 |
|   | λ3 = 0.77459 66692 41483 | w3 = 0.55555 55555 55556 |
| 4 | λ1 = −0.86113 63115 94053 | w1 = 0.34785 48451 37454 |
|   | λ2 = −0.33998 10435 84856 | w2 = 0.65214 51548 62546 |
|   | λ3 = 0.33998 10435 84856 | w3 = 0.65214 51548 62546 |
|   | λ4 = 0.86113 63115 94053 | w4 = 0.34785 48451 37454 |
| 5 | λ1 = −0.90617 98459 38664 | w1 = 0.23692 68850 56189 |
|   | λ2 = −0.53846 93101 05683 | w2 = 0.47862 86704 99366 |
|   | λ3 = 0.00000 00000 00000 | w3 = 0.56888 88888 88889 |
|   | λ4 = 0.53846 93101 05683 | w4 = 0.47862 86704 99366 |
|   | λ5 = 0.90617 98459 38664 | w5 = 0.23692 68850 56189 |
| 6 | λ1 = −0.93246 95142 03152 | w1 = 0.17132 44923 79170 |
|   | λ2 = −0.66120 93864 66265 | w2 = 0.36076 15730 48139 |
|   | λ3 = −0.23861 91860 83197 | w3 = 0.46791 39345 72691 |
|   | λ4 = 0.23861 91860 83197 | w4 = 0.46791 39345 72691 |
|   | λ5 = 0.66120 93864 66265 | w5 = 0.36076 15730 48139 |
|   | λ6 = 0.93246 95142 03152 | w6 = 0.17132 44923 79170 |
Table 2. The $100(1 - \alpha)$ percentile $t_{\alpha,n}$ corresponding to the degrees of freedom n equal to 1 through 120 and $\alpha$ equal to 0.200, 0.100, 0.050, 0.025, 0.010, 0.005, 0.001, and 0.0005.

| Degrees of Freedom n | α = 0.200 | α = 0.100 | α = 0.050 | α = 0.025 | α = 0.010 | α = 0.005 | α = 0.001 | α = 0.0005 |
|---|---|---|---|---|---|---|---|---|
| 1 | 1.3764 | 3.0777 | 6.3138 | 12.7062 | 31.8205 | 63.6567 | 318.3088 | 636.6192 |
| 2 | 1.0607 | 1.8856 | 2.9200 | 4.3027 | 6.9646 | 9.9248 | 22.3271 | 31.5991 |
| 3 | 0.9785 | 1.6377 | 2.3534 | 3.1824 | 4.5407 | 5.8409 | 10.2145 | 12.9240 |
| 4 | 0.9410 | 1.5332 | 2.1318 | 2.7764 | 3.7469 | 4.6041 | 7.1732 | 8.6103 |
| 5 | 0.9195 | 1.4759 | 2.0150 | 2.5706 | 3.3649 | 4.0321 | 5.8934 | 6.8688 |
| 6 | 0.9057 | 1.4398 | 1.9432 | 2.4469 | 3.1427 | 3.7074 | 5.2076 | 5.9588 |
| 7 | 0.8960 | 1.4149 | 1.8941 | 2.3646 | 2.9980 | 3.4995 | 4.7853 | 5.4079 |
| 8 | 0.8889 | 1.3968 | 1.8595 | 2.3060 | 2.8965 | 3.3554 | 4.5008 | 5.0413 |
| 9 | 0.8834 | 1.3830 | 1.8331 | 2.2622 | 2.8214 | 3.2498 | 4.2968 | 4.7809 |
| 10 | 0.8791 | 1.3722 | 1.8125 | 2.2281 | 2.7638 | 3.1693 | 4.1437 | 4.5869 |
| 15 | 0.8662 | 1.3406 | 1.7531 | 2.1315 | 2.6025 | 2.9467 | 3.7328 | 4.0728 |
| 20 | 0.8600 | 1.3253 | 1.7247 | 2.0860 | 2.5280 | 2.8453 | 3.5518 | 3.8495 |
| 30 | 0.8538 | 1.3104 | 1.6973 | 2.0423 | 2.4573 | 2.7500 | 3.3852 | 3.6460 |
| 40 | 0.8507 | 1.3031 | 1.6839 | 2.0211 | 2.4233 | 2.7045 | 3.3069 | 3.5510 |
| 60 | 0.8477 | 1.2958 | 1.6706 | 2.0003 | 2.3901 | 2.6603 | 3.2317 | 3.4602 |
| 80 | 0.8461 | 1.2922 | 1.6641 | 1.9901 | 2.3739 | 2.6387 | 3.1953 | 3.4163 |
| 100 | 0.8452 | 1.2901 | 1.6602 | 1.9840 | 2.3642 | 2.6259 | 3.1737 | 3.3905 |
| 120 | 0.8446 | 1.2886 | 1.6577 | 1.9799 | 2.3578 | 2.6174 | 3.1595 | 3.3735 |
Table 3. The $100(1 - 0.0005)$ percentile $t_{0.0005,1}$ obtained by partitioning the interval from 0 to $t_{0.0005,1}$ into varying numbers of equal subintervals, NSECT = 100, 175, 250, 500, 750, 1000, 2500, 5000, $10^{4}$, $10^{5}$, $10^{6}$, and $10^{7}$.

| NSECT | $t_{0.0005,1}$ |
|---|---|
| 100 | 574.327106 |
| 175 | 489.829447 |
| 250 | 599.936222 |
| 500 | 635.953237 |
| 750 | 636.635366 |
| 1000 | 636.621221 |
| 2500 | 636.619249 |
| 5000 | 636.619249 |
| $10^{4}$ | 636.619249 |
| $10^{5}$ | 636.619249 |
| $10^{6}$ | 636.619249 |
| $10^{7}$ | 636.619249 |
Table 4. The probability $P(T_1 > 636.619249)$ that a random variable of the t-distribution exceeds the value 636.619249, obtained by partitioning the interval from 0 to 636.619249 into various numbers of equal subintervals, NSECT = 100, 175, 250, 500, 750, 1000, 2500, 5000, $10^{4}$, $10^{5}$, $10^{6}$, and $10^{7}$.

| NSECT | $P(T_1 > 636.619249)$ |
|---|---|
| 100 | −0.000585 |
| 175 | 0.000895 |
| 250 | 0.000527 |
| 500 | 0.000499 |
| 750 | 0.000500 |
| 1000 | 0.000500 |
| 2500 | 0.000500 |
| 5000 | 0.000500 |
| $10^{4}$ | 0.000500 |
| $10^{5}$ | 0.000500 |
| $10^{6}$ | 0.000500 |
| $10^{7}$ | 0.000500 |

Share and Cite

MDPI and ACS Style

Tien, T.-L. Calculating Percentiles of T-Distribution Using Gaussian Integration Method. Eng. Proc. 2025, 92, 2. https://doi.org/10.3390/engproc2025092002
