Calculating Percentiles of T-Distribution Using Gaussian Integration Method

Tien, Tzu-Li

doi:10.3390/engproc2025092002

Open Access Proceeding Paper

Calculating Percentiles of T-Distribution Using Gaussian Integration Method^†

by

Tzu-Li Tien

Department of Vehicle Engineering, National Formosa University, Yunlin 632, Taiwan

^† Presented at the 2024 IEEE 6th Eurasia Conference on IoT, Communication and Engineering, Yunlin, Taiwan, 15–17 November 2024.

Eng. Proc. 2025, 92(1), 2; https://doi.org/10.3390/engproc2025092002

Published: 10 April 2025

(This article belongs to the Proceedings of 2024 IEEE 6th Eurasia Conference on IoT, Communication and Engineering)

Abstract

Statistical inference estimates population parameters from sample information and quantifies the sampling error in terms of probability. The population mean is inferred from the sample mean, and when the population variance is unknown, the sample variance is used in its place. In that case, the quantitative analysis of the sampling error relies on the t-distribution. To determine the percentiles of the t-distribution, its cumulative distribution function is necessary. However, no simple analytic expression is available for the cumulative distribution function of the t-distribution, so its values are obtained using numerical integration. In textbooks on probability theory and mathematical statistics, the percentiles of the t-distribution are usually not listed for every degree of freedom above 30 but only at intervals of 10, which is inconvenient for research. Therefore, the cumulative distribution function of the t-distribution was calculated using the Gaussian integration method in this study. The results show that the percentiles of the t-distribution are accurately estimated using the algorithm developed in this study.

Keywords:

statistical inference; percentile; t-distribution; Gaussian integration method

1. Introduction

A census measures or examines every member of a population, whereas sampling measures or examines only some of its members. A census is costly and time-consuming, and human errors occur due to personnel fatigue, so a census is not necessarily more efficient than sampling. When sampling is used, however, quantifying the sampling error is critical.

The t-distribution is used in the quantitative analysis of sampling errors. To determine the percentiles of the t-distribution, its cumulative distribution function is used [1,2,3]. The function is evaluated by numerical integration. The Gaussian integration method is adopted in this study to integrate the probability density function of the t-distribution [4,5]. In effect, the Gaussian integration method fits the integrand with a truncated power series. If the step size is small enough that the truncation error is negligible, accurate numerical integration is anticipated. The results of this study show that a high accuracy of percentiles of the t-distribution can be obtained using the algorithm developed in this article.

2. Gaussian Integration Method

In the Gaussian integration method, +1 and −1 are taken as the upper and lower limits of integration, respectively. The integration result is obtained by evaluating the integrand at specific points, multiplying each value by its corresponding weight, and summing the products. These specific points are called Gaussian points [4,5].

I = \int_{- 1}^{+ 1} f (λ) d λ = \sum_{i = 1}^{m} w_{i} f (λ_{i})

(1)

For example, the two-point formula (m = 2 in (1)) is analyzed as follows.

When m equals 2 in (1), two weighting coefficients, w₁ and w₂, and two sampling points, λ₁ and λ₂, must be determined in advance. These are four unknowns, so the integrand is taken as a cubic polynomial with four coefficients. Let

f (λ) = a_{0} + a_{1} λ + a_{2} λ^{2} + a_{3} λ^{3}

(2)

For arbitrary coefficients $a_{i}$, $i = 0, 1, \ldots, 3$, the following condition must be satisfied:

\int_{- 1}^{+ 1} (a_{0} + a_{1} λ + a_{2} λ^{2} + a_{3} λ^{3}) d λ = w_{1} f (λ_{1}) + w_{2} f (λ_{2})

(3)

In particular, (3) must hold for the choice $a_{0} \neq 0$, $a_{1} = a_{2} = a_{3} = 0$. Therefore,

w_{1} + w_{2} = 2

(4)

Let $a_{1} \neq 0$, $a_{0} = a_{2} = a_{3} = 0$ in (3); then

\int_{- 1}^{+ 1} a_{1} λ d λ = w_{1} a_{1} λ_{1} + w_{2} a_{1} λ_{2}

w_{1} λ_{1} + w_{2} λ_{2} = 0

(5)

When $a_{2} \neq 0$, $a_{0} = a_{1} = a_{3} = 0$ in (3), we have

\int_{- 1}^{+ 1} a_{2} λ^{2} d λ = w_{1} a_{2} λ_{1}^{2} + w_{2} a_{2} λ_{2}^{2}

w_{1} λ_{1}^{2} + w_{2} λ_{2}^{2} = \frac{2}{3}

(6)

When $a_{3} \neq 0$, $a_{0} = a_{1} = a_{2} = 0$ in (3),

\int_{- 1}^{+ 1} a_{3} λ^{3} d λ = w_{1} a_{3} λ_{1}^{3} + w_{2} a_{3} λ_{2}^{3}

w_{1} λ_{1}^{3} + w_{2} λ_{2}^{3} = 0

(7)

Using (4) to (7),

w_{1} = w_{2} = 1.0, λ_{1} = - \frac{1}{\sqrt{3}}, and λ_{2} = \frac{1}{\sqrt{3}}

(8)
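As a quick numerical check of (8), the following minimal Python sketch compares both sides of (3) for one arbitrary cubic. The function names and the coefficient values are illustrative assumptions, not part of the original study.

```python
import math

# Two-point rule from (8): both weights are 1, points are -1/sqrt(3) and +1/sqrt(3).
W = (1.0, 1.0)
LAM = (-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0))


def cubic(a, lam):
    """Evaluate a0 + a1*lam + a2*lam**2 + a3*lam**3, the integrand in (2)."""
    a0, a1, a2, a3 = a
    return a0 + a1 * lam + a2 * lam**2 + a3 * lam**3


def gauss2(a):
    """Right-hand side of (3): w1*f(lam1) + w2*f(lam2)."""
    return sum(w * cubic(a, lam) for w, lam in zip(W, LAM))


def exact(a):
    """Left-hand side of (3): the odd powers integrate to zero on [-1, 1]."""
    a0, _, a2, _ = a
    return 2.0 * a0 + 2.0 * a2 / 3.0


coeffs = (1.5, -2.0, 0.75, 4.0)  # arbitrary illustrative coefficients
print(gauss2(coeffs), exact(coeffs))
```

The two printed values should agree to within floating-point rounding, which is what conditions (4) to (7) guarantee for any cubic.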

The accuracy of the evaluation increases with the number of Gaussian points. The sampling points and their corresponding weighting coefficients for m = 1 through 6 are listed in Table 1 [4,5].

If the upper and lower limits are not +1 and −1, that is,

I = \int_{a}^{b} f (x) d x

(9)

then (9) can be normalized to the interval from −1 to +1 by the linear substitution

x = h_{0} + h_{1} λ

(10)

Then,

a = h_{0} + h_{1} (- 1)

(11)

b = h_{0} + h_{1} (1)

(12)

By solving (11) and (12),

h_{0} = \frac{b + a}{2}

(13)

h_{1} = \frac{b - a}{2}

(14)

x = \frac{b + a}{2} + \frac{b - a}{2} λ

(15)

d x = \frac{b - a}{2} d λ

(16)

Substituting (15) and (16) into (9), the following is obtained:

I = (\frac{b - a}{2}) \int_{- 1}^{+ 1} f (\frac{b + a}{2} + \frac{b - a}{2} λ) d λ = (\frac{b - a}{2}) \int_{- 1}^{+ 1} F (λ) d λ = (\frac{b - a}{2}) \sum_{i = 1}^{m} w_{i} F (λ_{i})

(17)
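The mapping in (13) to (17) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names `gauss5` and `composite_gauss5` are hypothetical, the five-point values are copied from Table 1, and the cosine test integral is chosen only because its exact value is known.

```python
import math

# Five-point sampling points and weighting coefficients, copied from Table 1.
LAM5 = (-0.906179845938664, -0.538469310105683, 0.0,
        0.538469310105683, 0.906179845938664)
W5 = (0.236926885056189, 0.478628670499366, 0.568888888888889,
      0.478628670499366, 0.236926885056189)


def gauss5(f, a, b):
    """Apply (17) once on [a, b] with m = 5."""
    h0 = (b + a) / 2.0  # (13)
    h1 = (b - a) / 2.0  # (14)
    return h1 * sum(w * f(h0 + h1 * lam) for w, lam in zip(W5, LAM5))


def composite_gauss5(f, a, b, nsect):
    """Split [a, b] into nsect equal subintervals and sum (17) over them."""
    width = (b - a) / nsect
    return sum(gauss5(f, a + k * width, a + (k + 1) * width)
               for k in range(nsect))


# Known test case: the integral of cos(x) from 0 to pi/2 equals 1.
approx = composite_gauss5(math.cos, 0.0, math.pi / 2.0, 4)
print(approx)
```

Summing the rule over equal subintervals is the same partitioning idea that the later sections apply to the t-distribution density.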

3. Evaluation Percentiles of T-Distribution

The t-distribution is short for Student′s t-distribution, which was proposed by William Sealy Gosset. It is used to infer the population mean from a small sample drawn from a normally distributed population whose variance is unknown. The quantitative analysis of sampling error therefore regularly involves the t-distribution.

The probability density function of the t-distribution with degrees of freedom n is calculated as

T_{n} (x) = \frac{Γ ((n + 1) / 2)}{\sqrt{n π} \times Γ (n / 2) \times (1 + (x^{2} / n))^{(n + 1) / 2}}

(18)

where n is the degree of freedom and

Γ (υ) = \int_{0}^{\infty} t^{υ - 1} \times e^{- t} d t

(19)

is the Gamma function.

When n is a natural number, the Gamma function values needed in (18) have the closed forms

Γ (n) = (n - 1)!

(20a)

and

Γ (n + \frac{1}{2}) = (2 n)! \times \sqrt{π} / [4^{n} \times (n!)]

(20b)
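The density (18) and the identities (20a) and (20b) can be checked with the Python standard library. The sketch below is an illustration only; computing the Gamma ratio through `math.lgamma` is an assumption made here to stay safe for large n, and the names `t_pdf` and `gamma_half` are hypothetical.

```python
import math


def t_pdf(x, n):
    """Density (18), with the Gamma ratio computed through log-Gamma values."""
    log_ratio = math.lgamma((n + 1) / 2.0) - math.lgamma(n / 2.0)
    return math.exp(log_ratio) / (
        math.sqrt(n * math.pi) * (1.0 + x * x / n) ** ((n + 1) / 2.0)
    )


def gamma_half(n):
    """Closed form (20b) for Gamma(n + 1/2), with n a natural number."""
    return math.factorial(2 * n) * math.sqrt(math.pi) / (4**n * math.factorial(n))


# (20a) and (20b) compared with the library Gamma function.
print(math.gamma(5), math.factorial(4))
print(math.gamma(3.5), gamma_half(3))
# For n = 1 the density (18) reduces to 1 / (pi * (1 + x^2)).
print(t_pdf(2.0, 1), 1.0 / (math.pi * 5.0))
```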

To evaluate the $100 (1 - α)$ percentile in Figure 1, $t_{α, n}$ must be evaluated. The probability $P (T_{n} > t_{α, n})$ that a random variable $T_{n}$ following the t-distribution exceeds $t_{α, n}$ is $α$. To find $t_{α, n}$, (21) is used.

P (T_{n} > t_{α, n}) = α

(21)

or, equivalently, in terms of the cumulative distribution function,

\int_{- \infty}^{t_{α, n}} \frac{Γ ((n + 1) / 2)}{\sqrt{n π} \times Γ (n / 2) \times (1 + (x^{2} / n))^{(n + 1) / 2}} d x = 1 - α

(22)

where $α$ is a function of $t_{α, n}$ in (22). The inverse function has no closed form in general, so $t_{α, n}$ is found by Newton’s iteration method. Equation (22) is rewritten as

g (t_{α, n}) = \int_{- \infty}^{t_{α, n}} \frac{Γ ((n + 1) / 2)}{\sqrt{n π} \times Γ (n / 2) \times (1 + (x^{2} / n))^{(n + 1) / 2}} d x - (1 - α)

(23)

Then, $t_{α, n}$ is the root of

g (t_{α, n}) = 0

(24)

Starting from a guess value $t_{α, n, 0}$, the iteration

t_{α, n, new} = t_{α, n, old} - g (t_{α, n, old}) / g^{'} (t_{α, n, old})

(25)

is conducted. The derivative $g^{'} (t_{α, n, old})$ in (25) is obtained by the following numerical differentiation.

g^{'} (t_{α, n, old}) = \lim_{Δ t_{α, n} \to 0} \frac{g (t_{α, n, old} + Δ t_{α, n}) - g (t_{α, n, old})}{Δ t_{α, n}}

(26)

Because the probability density function of the t-distribution is symmetric about x = 0, the integral from $- \infty$ to 0 equals 0.5, and only the interval (0, $t_{α, n}$) needs to be integrated numerically. The five-point Gaussian integration method is adopted, and the interval (0, $t_{α, n}$) is divided into $10^{6}$ equally spaced subintervals. With a guess value of 1.0, $Δ t_{α, n}$ in (26) is set to $10^{- 5}$, and the iteration is stopped when the magnitude of $g (t_{α, n, old}) / g^{'} (t_{α, n, old})$ becomes smaller than $10^{- 5}$. The 100(1 − $α$) percentiles $t_{α, n}$ for degrees of freedom n from 1 through 120 and $α$ equal to 0.200, 0.100, 0.050, 0.025, 0.010, 0.005, 0.001, and 0.0005 are listed in Table 2.
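The procedure described above can be sketched in Python as follows. This is a minimal illustration, not the author's program: the function names are hypothetical, the target of the root search is written as the cumulative probability 1 − α so that the upper-tail probability in (21) equals α, and the default of 1000 subintervals is an assumption chosen to keep the run short (the study uses $10^{6}$).

```python
import math

# Five-point sampling points and weighting coefficients, copied from Table 1.
LAM5 = (-0.906179845938664, -0.538469310105683, 0.0,
        0.538469310105683, 0.906179845938664)
W5 = (0.236926885056189, 0.478628670499366, 0.568888888888889,
      0.478628670499366, 0.236926885056189)


def t_pdf(x, n):
    """Density (18) of the t-distribution with n degrees of freedom."""
    log_ratio = math.lgamma((n + 1) / 2.0) - math.lgamma(n / 2.0)
    return math.exp(log_ratio) / (
        math.sqrt(n * math.pi) * (1.0 + x * x / n) ** ((n + 1) / 2.0)
    )


def t_cdf(t, n, nsect):
    """0.5 (by symmetry) plus the composite five-point Gauss integral on (0, t)."""
    width = t / nsect
    h1 = width / 2.0
    total = 0.0
    for k in range(nsect):
        h0 = (k + 0.5) * width  # midpoint of the k-th subinterval
        total += h1 * sum(w * t_pdf(h0 + h1 * lam, n)
                          for w, lam in zip(W5, LAM5))
    return 0.5 + total


def t_percentile(alpha, n, nsect=1000, guess=1.0, dt=1e-5, tol=1e-5,
                 max_iter=100):
    """Newton iteration (25) with the forward-difference derivative (26)."""
    target = 1.0 - alpha
    t = guess
    for _ in range(max_iter):
        g_old = t_cdf(t, n, nsect) - target
        g_shift = t_cdf(t + dt, n, nsect) - target
        step = g_old / ((g_shift - g_old) / dt)
        t -= step
        if abs(step) < tol:
            return t
    raise RuntimeError("Newton iteration did not converge")


print(t_percentile(0.025, 10))
print(t_percentile(0.200, 1))
```

The two printed values can be compared with the Table 2 entries for n = 10, α = 0.025 and for n = 1, α = 0.200.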

4. Conclusions

The derivation process of the Gaussian numerical integration using (1)–(17) shows that when m = 2, two weight coefficients, w₁ and w₂, and two sampling points, λ₁ and λ₂, need to be determined, resulting in four unknowns. The integrand is therefore fitted as a cubic polynomial. In general, the form of the integrand that (1) integrates exactly is

f (λ) = \sum_{i = 0}^{2 m - 1} a_{i} λ^{i}

(27)

where m is the number of sampling points in the Gaussian integral. For an integrand of this form, the theoretical error of (1) is zero even if the integration interval is not subdivided. However, since the probability density function of the t-distribution in (18) does not have the finite polynomial form of (27), it is necessary to partition the integration interval appropriately, apply Gaussian integration with m sampling points on each subinterval, and sum the results.

The probability density function $T_{n} (x)$ of the t-distribution with degrees of freedom n in (18) indicates that the $100 (1 - α)$ percentile $t_{α, n}$ for n = 1 is particularly difficult to calculate accurately, because the tails of the density decay most slowly for n = 1. This holds especially for the $100 (1 - 0.0005)$ percentile $t_{0.0005, 1} = 636.619$. This value is therefore calculated to verify the correctness of the 100(1 − α) percentiles $t_{α, n}$ of the t-distribution listed in Table 2. The Newton–Raphson method in (25) and (26) is used to calculate the percentiles of the t-distribution, with an initial guess of 1.0 and an increment $Δ t_{α, n}$ equal to $10^{- 5}$ in (26). Iteration stops when the magnitude of the adjustment $g (t_{α, n, old}) / g^{'} (t_{α, n, old})$ in (25) is smaller than the specified threshold $10^{- 5}$.

For n = 1, $P (T_{1} > t_{0.0005, 1}) = α = 0.0005$. The Gaussian integration method is used with m = 5 sampling points, dividing the interval from 0 to the specified limit $t_{0.0005, 1}$ into $10^{6}$ equal subintervals, and the result is $t_{0.0005, 1} = 636.619249$ after 15 iterations. The interval from 0 to $t_{0.0005, 1}$ is also partitioned into varying numbers of equal subintervals, NSECT = 100, 175, 250, 500, 750, 1000, 2500, 5000, $10^{4}$, $10^{5}$, $10^{6}$, and $10^{7}$, and the results are listed in Table 3. As NSECT increases from 1000 to 2500, the computed 100(1 – 0.0005) percentile of the t-distribution converges to 636.619249. Moreover, as NSECT increases further, the calculated result remains 636.619249, indicating that rounding errors do not degrade the computation.

To verify the accuracy of $t_{0.0005, 1}$ = 636.619249, this value is substituted into (23) for validation. The interval from 0 to 636.619249 is partitioned into various numbers of equal subintervals, NSECT = 100, 175, 250, 500, 750, 1000, 2500, 5000, $10^{4}$, $10^{5}$, $10^{6}$, and $10^{7}$, and the probability $P (T_{1} > 636.619249)$ that a random variable following the t-distribution exceeds 636.619249 is calculated (Table 4). If the number of subintervals NSECT is 750 or more, the probability $P (T_{1} > 636.619249)$ is accurately calculated as 0.000500.
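For n = 1 the density (18) reduces to $1 / (π (1 + x^{2}))$, whose integral is known in closed form, so the validation can be cross-checked independently. The sketch below is illustrative only: the names `cauchy_pdf` and `upper_tail` are hypothetical, the subinterval layout is assumed to be equal-width on (0, t), and no claim is made that it reproduces the Table 4 entries for small NSECT digit for digit.

```python
import math

# Five-point sampling points and weighting coefficients, copied from Table 1.
LAM5 = (-0.906179845938664, -0.538469310105683, 0.0,
        0.538469310105683, 0.906179845938664)
W5 = (0.236926885056189, 0.478628670499366, 0.568888888888889,
      0.478628670499366, 0.236926885056189)


def cauchy_pdf(x):
    """Density (18) for n = 1."""
    return 1.0 / (math.pi * (1.0 + x * x))


def upper_tail(t, nsect):
    """P(T_1 > t) = 0.5 minus the composite Gauss integral of the density on (0, t)."""
    width = t / nsect
    h1 = width / 2.0
    total = 0.0
    for k in range(nsect):
        h0 = (k + 0.5) * width  # midpoint of the k-th subinterval
        total += h1 * sum(w * cauchy_pdf(h0 + h1 * lam)
                          for w, lam in zip(W5, LAM5))
    return 0.5 - total


t = 636.619249
for nsect in (100, 1000, 5000):
    print(nsect, upper_tail(t, nsect))

# Closed forms for n = 1: P(T_1 > t) = 0.5 - atan(t) / pi, and the
# percentile itself is 1 / tan(pi * alpha).
exact_tail = 0.5 - math.atan(t) / math.pi
exact_t = 1.0 / math.tan(math.pi * 0.0005)
print(exact_tail, exact_t)
```

The closed-form values give a reference that does not depend on the number of subintervals, which makes the effect of NSECT on the quadrature easy to isolate.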

The negative result for NSECT = 100 arises because 100 subintervals are too few: the integration error is large enough that the computed probability $P (636.619249 > T_{1} > 0)$ exceeds 0.5. Thus, it is reasonable to conclude that setting the increment $Δ t_{α, n}$ to $10^{- 5}$ in (26), stopping the iteration when the adjustment $g (t_{α, n, old}) / g^{'} (t_{α, n, old})$ in (25) is less than the specified threshold $10^{- 5}$, and using $10^{6}$ subintervals are valid choices. Each of the 100(1 − α) percentiles of the t-distribution in Table 2 is calculated in five seconds. Furthermore, rounding the values in Table 2 to three decimal places yields results that match exactly those in the authoritative tables of [6].

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The author declares no conflict of interest.

References

1. Montgomery, D.C. Design and Analysis of Experiments; John Wiley & Sons: New York, NY, USA, 2001.
2. Lee, J.B.; Max, E. Introduction to Probability and Mathematical Statistics; Duxbury Press: Belmont, CA, USA, 1992.
3. Jay, L.D. Probability and Statistics for Engineering and the Sciences; Brooks/Cole: Pacific Grove, CA, USA, 2012.
4. Zienkiewicz, O.C. The Finite Element Method; McGraw-Hill Book Co.: Maidenhead, UK, 1977.
5. Bathe, K.J. Finite Element Procedures in Engineering Analysis; Prentice-Hall: Upper Saddle River, NJ, USA, 1982.
6. Pearson, E.S.; Hartley, H.O. Biometrika Tables for Statisticians; Cambridge University Press: Cambridge, UK, 1966.

Figure 1. The 100(1 − α) percentile t_α,n corresponding to degrees of freedom n.

Table 1. Sampling points and their corresponding weighting coefficients for m = 1 through 6 sampling points.

Number of Sampling Points (m)	Sampling Points (λ_i)	Weighting Coefficients (w_i)
1	λ1 = 0.00000 00000 00000	w1 = 2.0
2	λ1 = −0.57735 02691 89626	w1 = 1.0
	λ2 = 0.57735 02691 89626	w2 = 1.0
3	λ1 = −0.77459 66692 41483	w1 = 0.55555 55555 55556
	λ2 = 0.00000 00000 00000	w2 = 0.88888 88888 88889
	λ3 = 0.77459 66692 41483	w3 = 0.55555 55555 55556
4	λ1 = −0.86113 63115 94053	w1 = 0.34785 48451 37454
	λ2 = −0.33998 10435 84856	w2 = 0.65214 51548 62546
	λ3 = 0.33998 10435 84856	w3 = 0.65214 51548 62546
	λ4 = 0.86113 63115 94053	w4 = 0.34785 48451 37454
5	λ1 = −0.90617 98459 38664	w1 = 0.23692 68850 56189
	λ2 = −0.53846 93101 05683	w2 = 0.47862 86704 99366
	λ3 = 0.00000 00000 00000	w3 = 0.56888 88888 88889
	λ4 = 0.53846 93101 05683	w4 = 0.47862 86704 99366
	λ5 = 0.90617 98459 38664	w5 = 0.23692 68850 56189
6	λ1 = −0.93246 95142 03152	w1 = 0.17132 44923 79170
	λ2 = −0.66120 93864 66265	w2 = 0.36076 15730 48139
	λ3 = −0.23861 91860 83197	w3 = 0.46791 39345 72691
	λ4 = 0.23861 91860 83197	w4 = 0.46791 39345 72691
	λ5 = 0.66120 93864 66265	w5 = 0.36076 15730 48139
	λ6 = 0.93246 95142 03152	w6 = 0.17132 44923 79170

Table 2. The 100(1 − $α$) percentile $t_{α, n}$ corresponding to the degrees of freedom n from 1 through 120 and $α$ equal to 0.200, 0.100, 0.050, 0.025, 0.010, 0.005, 0.001, and 0.0005.

Degrees of Freedom n	$α$ = 0.200	$α$ = 0.100	$α$ = 0.050	$α$ = 0.025
1	1.3764	3.0777	6.3138	12.7062
2	1.0607	1.8856	2.9200	4.3027
3	0.9785	1.6377	2.3534	3.1824
4	0.9410	1.5332	2.1318	2.7764
5	0.9195	1.4759	2.0150	2.5706
6	0.9057	1.4398	1.9432	2.4469
7	0.8960	1.4149	1.8941	2.3646
8	0.8889	1.3968	1.8595	2.3060
9	0.8834	1.3830	1.8331	2.2622
10	0.8791	1.3722	1.8125	2.2281
15	0.8662	1.3406	1.7531	2.1315
20	0.8600	1.3253	1.7247	2.0860
30	0.8538	1.3104	1.6973	2.0423
40	0.8507	1.3031	1.6839	2.0211
60	0.8477	1.2958	1.6706	2.0003
80	0.8461	1.2922	1.6641	1.9901
100	0.8452	1.2901	1.6602	1.9840
120	0.8446	1.2886	1.6577	1.9799
Degrees of Freedom n	$α$ = 0.010	$α$ = 0.005	$α$ = 0.001	$α$ = 0.0005
1	31.8205	63.6567	318.3088	636.6192
2	6.9646	9.9248	22.3271	31.5991
3	4.5407	5.8409	10.2145	12.9240
4	3.7469	4.6041	7.1732	8.6103
5	3.3649	4.0321	5.8934	6.8688
6	3.1427	3.7074	5.2076	5.9588
7	2.9980	3.4995	4.7853	5.4079
8	2.8965	3.3554	4.5008	5.0413
9	2.8214	3.2498	4.2968	4.7809
10	2.7638	3.1693	4.1437	4.5869
15	2.6025	2.9467	3.7328	4.0728
20	2.5280	2.8453	3.5518	3.8495
30	2.4573	2.7500	3.3852	3.6460
40	2.4233	2.7045	3.3069	3.5510
60	2.3901	2.6603	3.2317	3.4602
80	2.3739	2.6387	3.1953	3.4163
100	2.3642	2.6259	3.1737	3.3905
120	2.3578	2.6174	3.1595	3.3735

Table 3. The 100(1 – 0.0005) percentile $t_{0.0005, 1}$ obtained by partitioning the interval from 0 to $t_{0.0005, 1}$ into varying numbers of equal subintervals, NSECT = 100, 175, 250, 500, 750, 1000, 2500, 5000, $10^{4}$, $10^{5}$, $10^{6}$, and $10^{7}$.

NSECT	$t_{0.0005, 1}$	NSECT	$t_{0.0005, 1}$
100	574.327106	2500	636.619249
175	489.829447	5000	636.619249
250	599.936222	$10^{4}$	636.619249
500	635.953237	$10^{5}$	636.619249
750	636.635366	$10^{6}$	636.619249
1000	636.621221	$10^{7}$	636.619249

Table 4. The probability $P (T_{1} > 636.619249)$ that a random variable of the t-distribution exceeds the value 636.619249, obtained by partitioning the interval from 0 to 636.619249 into various numbers of equal subintervals, NSECT = 100, 175, 250, 500, 750, 1000, 2500, 5000, $10^{4}$, $10^{5}$, $10^{6}$, and $10^{7}$.

NSECT	$P (T_{1} > 636.619249)$	NSECT	$P (T_{1} > 636.619249)$
100	−0.000585	2500	0.000500
175	0.000895	5000	0.000500
250	0.000527	$10^{4}$	0.000500
500	0.000499	$10^{5}$	0.000500
750	0.000500	$10^{6}$	0.000500
1000	0.000500	$10^{7}$	0.000500

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
