A Comprehensive Analysis of Proportional Intensity-Based Software Reliability Models with Covariates

Li, Siqiao; Dohi, Tadashi; Okamura, Hiroyuki

doi:10.3390/electronics11152353

Open Access Article

by Siqiao Li, Tadashi Dohi * and Hiroyuki Okamura

Graduate School of Advanced Science and Engineering, Hiroshima University, Hiroshima 739-8511, Japan

* Author to whom correspondence should be addressed.

Electronics 2022, 11(15), 2353; https://doi.org/10.3390/electronics11152353

Submission received: 12 June 2022 / Revised: 22 July 2022 / Accepted: 23 July 2022 / Published: 28 July 2022

(This article belongs to the Special Issue 10th Anniversary of Electronics: Recent Advances in Computer Science & Engineering)

Abstract: This paper focuses on the so-called proportional intensity-based software reliability models (PI-SRMs), which are extensions of the common non-homogeneous Poisson process (NHPP)-based SRMs and describe the probabilistic behavior of the software fault-detection process by incorporating the time-dependent software metrics data observed in the development process. The PI-SRM was proposed by Rinsaka et al. in the paper “PISRAT: Proportional Intensity-Based Software Reliability Assessment Tool” in 2006. Specifically, we generalize this seminal model by introducing eleven well-known fault-detection time distributions, and investigate their goodness-of-fit and predictive performances. In numerical illustrations with four data sets collected in real software development projects, we use maximum likelihood estimation to estimate the model parameters with three time-dependent covariates (test execution time, failure identification work, and computer time-failure identification), and examine the performances of our PI-SRMs in comparison with the existing NHPP-based SRMs without covariates. It is shown that our PI-SRMs could give better goodness-of-fit and predictive performances in many cases.

Keywords: software reliability models; proportional intensity model; non-homogeneous Poisson process; time-dependent covariate; maximum likelihood estimation; goodness-of-fit performance; predictive performance

1. Introduction

Over almost the last five decades, hundreds of stochastic models, known as software reliability models (SRMs), have been developed in the literature [1,2,3] to investigate and describe software fault-detection phenomena mathematically. Among these models, non-homogeneous Poisson process (NHPP)-based SRMs are the most traditional and important class, and they have attracted extensive attention from the software reliability community for describing the stochastic behavior of software fault detection. Relevant investigations have also validated their superior goodness-of-fit to software fault count data compared with other SRMs. In fact, most of the existing NHPP-based SRMs in the literature have been developed by assuming a representative probability distribution for the time to fault detection, including the exponential distribution [4], gamma distribution [5,6], Pareto distribution [7], truncated-logistic [8] and log-logistic [9] distributions, truncated-normal distribution [10], log-normal distribution [10,11], and extreme-value distributions [12]. These NHPP-based SRMs are characterized by their mean value functions or by the cumulative distribution functions (CDFs) of the software fault-detection time. Hence, they can qualitatively represent typical software reliability growth phenomena and software debugging scenarios during the software testing phase. In other words, the above approach is a black-box approach, where the software fault-detection time distribution is estimated from the fault count data alone and does not depend on the knowledge/learning effects of the software product, test resources, or process information.
It should be noted that the common NHPP-based SRMs are quite simple to use in software reliability measurement and fault prediction but ignore the software development/testing metrics data collected throughout the software development process.

As an extension of the common NHPP-based SRMs, this paper summarizes the so-called proportional intensity-based software reliability models (PI-SRMs) by Rinsaka et al. [13], which describe the probabilistic behavior of the software fault-detection process by incorporating the time-dependent software metrics data observed in the development process. In a subsequent paper, Shibata et al. [14] developed a software reliability assessment tool, PISRAT, to automate the parameter estimation and quantify the software reliability. Specifically, we generalize the seminal PI-SRM in [13] by introducing several well-known fault-detection time distributions, because the work in [13] was limited to a few kinds of software fault-detection time distributions. The advantage of PI-SRMs is that they combine a regression formula, which represents the dependence on software metrics data, with a stochastic counting process, which represents the software fault count. Similar to the well-known software reliability assessment tool SRATS [15], we introduce eleven parametric models (baseline intensity functions) in the PI-SRM and comprehensively evaluate their potential performances, including the goodness-of-fit and predictive performances. In numerical illustrations with four data sets collected in real software development projects, we apply maximum likelihood estimation to estimate the model parameters with three time-dependent covariates (software metrics), namely test execution time (CPU hr), failure identification work (person hr), and computer time-failure identification (CPU hr), and then examine the performances of our PI-SRMs in comparison with the existing NHPP-based SRMs without covariates. Furthermore, we investigate the dependence of the software fault count process on the software metrics by checking the contribution of each software metric through the resulting regression coefficients.

The remainder of this paper is organized as follows. In Section 2, we provide a summary of the related work on the PIM-based software reliability modeling approach. Section 3 summarizes the well-known NHPP-based SRMs and the maximum likelihood estimation of their model parameters. In Section 4, we give the definition of PI-SRMs and introduce the piecewise continuous mean value function to represent the cumulative and non-cumulative software development/test effect. Section 5 presents numerical experiments to analyze the effects of the respective software metrics on the software fault count process. Finally, the paper is concluded in Section 6.

2. Related Works

It has been known that there is a strong correlation between software quality and some kinds of software development metrics. McCabe [16] presented various software engineering metrics that serve as units of measurement for the development of human resources, software products, and processes. Halstead [17] also emphasized the importance of software science and developed deterministic equations for estimating the quantity of residual faults in software as a function of programming effort. Putnam [18] and Takahashi and Kamayachi [19] demonstrated empirical relationships between the software product development environmental factors (programming language, coding techniques, reusability of existing code, programmer skill, etc.), and software fault characteristics. With numerous metrics data, Pillai and Nair [20] described an estimation problem of software cost and development effort. The methodologies mentioned above, on the other hand, are basically deterministic and cannot account for the uncertainty of software fault-detection mechanisms in the testing.

In general, it is known that software metrics can be divided into four categories: product metrics, development/test metrics, deployment/usage metrics, and software-hardware configuration metrics. Therefore, when both the software metrics and the fault-detection data are utilized in the software reliability analysis, the software reliability assessment can be expected to become more plausible and more accurate. Khoshgoftaar and Munson [21] and Khoshgoftaar et al. [22,23,24] introduced linear and non-linear regressions into predictive modeling approaches to quantify software quality with software complexity metrics data. Schneidewind [25,26] also used regression models to quantify the software maintenance process. For the prediction of the field defect-occurrence rate, Li et al. [27] considered a defect-occurrence projection (metrics-based) method based on exponential smoothing and the classical moving average. Khoshgoftaar et al. [28] used the pure and zero-inflated Poisson regression approaches to develop software fault prediction models that predict the rank-order of software modules. Amasaki et al. [29] utilized the rank correlation coefficient and the logistic regression model to identify the fault data trend and evaluate the software quality, which is quantified by the number of faults detected after shipping. Unfortunately, these models [21,22,23,24,27,28,29] fail to deal with both the time series metrics data and the regression data simultaneously.

Ascher [30,31], Bendell [32], Evanco and Lacovara [33], Evanco [34] and Nishio and Dohi [35] used the proportional hazard model (PHM) or Cox regression model [36] to integrate the software development metrics and/or environmental factors, and defined the software fault-detection time distribution by taking the time-series metrics data into account as covariates [37,38]. Pham [3] investigated a dynamic variant of the PHM and constructed an enhanced PHM based on a continuous-time Markov chain. However, since the covariates representing the development effort were composed of 0-1 binary values in their modeling, the resulting PHM-based modeling approach loses its validity when the cumulative effect of the software development/testing effort reported in [39] is analyzed. By observing the behavior of myopic software debugging phenomena, Shibata et al. [40] introduced the discrete-time PHM into the cumulative Bernoulli trial process and proposed a software metrics-based modeling approach. Okamura et al. [41] assumed the logistic regression instead of the Cox regression and proposed a different discrete-time multi-factor modeling framework. Kuwa and Dohi [42,43] further extended the metrics-based SRMs with the logistic and Cox regressions to improve both goodness-of-fit and predictive performances. Nagaraju et al. [44] applied the discrete-time Cox-regression-based SRMs to an optimal test activity allocation problem.

It is worth mentioning that the discrete-time metrics-based SRMs in [40,41,42,43] were quite convincing in modeling because both the logistic and Cox regression approaches were consistently taken into consideration. However, the discrete-time metrics-based SRMs depend on a discrete fault-detection time distribution. In general, it is known that handling continuous probability distributions is much easier than handling discrete probability distributions. In fact, in the references [40,41,42,43], the authors dealt with very few discrete probability distributions, such as the geometric distribution, negative binomial distribution, and discrete Weibull distribution. In contrast, the PI-SRMs in [13] are based on continuous probability distributions and can represent many software fault-detection patterns within one framework. The basic idea of PI-SRMs comes from the proportional intensity model (PIM) by Lawless [45], which is an extension of the common PHM in terms of time series analysis. However, it is worth noting that the PIM in [45] is not directly applicable to non-decreasing cumulative data such as the software fault count. Hence, Rinsaka et al. [13] proposed a modified NHPP-based SRM with a piecewise continuous, monotonically increasing mean value function in order to apply the PIM to software fault count data analysis.

3. NHPP-Based Software Reliability Modeling

From the viewpoint of modeling and statistical estimation, consider a stochastic counting process $\{N(t), t \geq 0\}$ satisfying the following four conditions:

(i): $N(0) = 0$;
(ii): $\{N(t), t \geq 0\}$ has independent increments;
(iii): $\Pr\{N(t + \Delta t) - N(t) \geq 2\} = o(\Delta t)$;
(iv): $\Pr\{N(t + \Delta t) - N(t) = 1\} = \lambda(t) \Delta t + o(\Delta t)$,

where $N(t)$ denotes the total number of events that occurred up to and including time $t$, $\lambda(t)$ is a continuous (deterministic) function of time $t$, called the intensity function, and $o(\Delta t)$ is a higher-order term of the infinitesimal time $\Delta t$ satisfying:

$$\lim_{\Delta t \to 0} \frac{o(\Delta t)}{\Delta t} = 0. \tag{1}$$

Then the stochastic counting process $\{N(t), t \geq 0\}$ can be regarded as a non-homogeneous Poisson process (NHPP). Since the NHPP is a typical Markov process with a time-dependent transition rate, its probability mass function (PMF) is given by:

$$\Pr\{N(t) = n\} = \frac{\{H(t)\}^{n}}{n!} \exp(-H(t)), \tag{2}$$

where:

$$H(t) = \int_{0}^{t} \lambda(x)\, dx = \mathrm{E}[N(t)] \tag{3}$$

is the mean value function of the NHPP with $H(0) = 0$. It denotes the expected cumulative number of software faults detected by time $t$.
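As a minimal sketch of Equations (2) and (3), the Python snippet below evaluates the NHPP probability mass function for one concrete mean value function. The exponential form $H(t) = \omega (1 - e^{-b t})$ and the names `mean_value`, `nhpp_pmf`, `omega`, and `b` are assumptions chosen for illustration only; any other non-decreasing $H(t)$ with $H(0) = 0$ could be substituted.

```python
import math


def mean_value(t, omega, b):
    """Illustrative mean value function H(t) = omega * (1 - exp(-b t)) (an assumed example)."""
    return omega * (1.0 - math.exp(-b * t))


def nhpp_pmf(n, t, omega, b):
    """Pr{N(t) = n} from Equation (2), evaluated in log space for numerical stability."""
    h = mean_value(t, omega, b)
    if h == 0.0:
        # With H(t) = 0 (e.g., at t = 0) all probability mass sits on n = 0.
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(h) - h - math.lgamma(n + 1))


if __name__ == "__main__":
    # Hypothetical parameter values, not taken from the paper.
    print(nhpp_pmf(3, 2.0, omega=10.0, b=0.2))
```

Working in log space avoids overflow in $\{H(t)\}^{n}$ and $n!$ when the fault counts are large.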

In software reliability engineering, two commonly used modeling assumptions are made:

(i): Software faults are detected at independent and identically distributed (i.i.d.) random times with the non-degenerate cumulative distribution function (CDF), $F (t; α)$ , where $α$ is a free parameter vector.
(ii): The total number of software faults remaining in software before testing, say, at time $t = 0$ , is a Poisson random variable with parameter $ω (> 0)$ .

Under the above two assumptions, it can be confirmed that the software fault-detection process $N(t)$ follows an NHPP with mean value function $H(t; \theta) = \omega F(t; \alpha)$, where $\theta = (\omega, \alpha)$ and $\lim_{t \to \infty} H(t; \theta) = \omega \ (> 0)$. The resulting bounded mean value function implies that the initial number of residual faults in the software is finite.

The key idea in traditional software reliability modeling is to determine the mean value function $H(t; \theta)$, or the fault-detection time CDF $F(t; \alpha)$, so as to fit the software fault count data. In the software reliability assessment tool on the spreadsheet (SRATS), Okamura and Dohi [15] implemented NHPP-based SRMs with eleven representative software fault-detection time CDFs belonging to the generalized exponential distribution family and the extreme-value distribution family. All of them were developed in the heavily cited references [4,5,6,7,8,9,10,11,12]. We apply these eleven NHPP-based SRMs to our PI-SRMs. More specifically, Table 1 lists the abbreviations and intensity functions associated with the eleven CDFs.

Before closing this section, we summarize the maximum likelihood estimation for the common NHPP-based SRMs. Suppose that the cumulative number of software faults detected by each testing time $t_{k}$ $(k = 1, 2, \dots, n)$, measured in calendar time, is denoted by $y_{k}$. For the time-interval (group) data $(t_{k}, y_{k})$ $(k = 1, 2, \dots, n)$, the likelihood function and the log-likelihood function with unknown parameter $\theta$ are given by:

$$L(\theta) = \exp(-H(t_{n}; \theta)) \prod_{k=1}^{n} \frac{\{H(t_{k}; \theta) - H(t_{k-1}; \theta)\}^{y_{k} - y_{k-1}}}{(y_{k} - y_{k-1})!} \tag{4}$$

and

$$\begin{aligned} LLF(\theta) = {} & \sum_{k=1}^{n} (y_{k} - y_{k-1}) \ln [H(t_{k}; \theta) - H(t_{k-1}; \theta)] - H(t_{n}; \theta) \\ & - \sum_{k=1}^{n} \ln [(y_{k} - y_{k-1})!], \end{aligned} \tag{5}$$

respectively, where $(t_{0}, y_{0}) = (0, 0)$. Note that almost all software fault count data observed in practice are group data, because software debugging is often carried out in a distributed testing environment, and measuring the execution time (CPU time) needed to detect each individual software fault is almost impossible in industry. Finally, we obtain the maximum likelihood (ML) estimate $\hat{\theta}$ by maximizing Equation (5) with respect to $\theta$.
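The grouped-data log-likelihood above can be sketched in Python as follows. This is only an illustration under stated assumptions: the exponential mean value function $H(t; \theta) = \omega (1 - e^{-b t})$ stands in for any of the eleven models, and the function names and the toy data are hypothetical rather than taken from the paper.

```python
import math


def exp_mean_value(t, omega, b):
    """Assumed example model: H(t) = omega * (1 - exp(-b t))."""
    return omega * (1.0 - math.exp(-b * t))


def nhpp_grouped_llf(times, cum_counts, omega, b):
    """Log-likelihood for group data (t_k, y_k), k = 1..n, with (t_0, y_0) = (0, 0)."""
    llf = 0.0
    t_prev, y_prev = 0.0, 0
    for t_k, y_k in zip(times, cum_counts):
        dh = exp_mean_value(t_k, omega, b) - exp_mean_value(t_prev, omega, b)
        dy = y_k - y_prev
        # lgamma(dy + 1) equals ln(dy!).
        llf += dy * math.log(dh) - math.lgamma(dy + 1)
        t_prev, y_prev = t_k, y_k
    return llf - exp_mean_value(times[-1], omega, b)


if __name__ == "__main__":
    # Hypothetical weekly fault counts, for illustration only.
    times = [1.0, 2.0, 3.0, 4.0, 5.0]
    cum_counts = [5, 9, 12, 14, 15]
    print(nhpp_grouped_llf(times, cum_counts, omega=20.0, b=0.3))
```

The function only evaluates the objective; the ML estimate would be obtained by passing it to a numerical optimizer over `omega` and `b`.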

4. Proportional Intensity Model

4.1. Model Description

In this section, we introduce a PI-SRM that is compatible with maximum likelihood estimation and incorporates several testing-effort factors observed on the respective testing dates. Suppose that $l$ types of software metrics data, $x_{k} = (x_{k1}, \dots, x_{kl})$ $(k = 1, 2, \dots, n)$, are observed at each testing time $t_{k}$ $(k = 0, 1, 2, \dots, n)$. For analytical purposes, we assume that each software metric $x_{k}$ depends on the cumulative testing time $t_{k}$ and can be considered a time-dependent function, denoted by $x_{k}(t_{k})$. In fact, this sort of parameter is referred to as a time-dependent covariate [37,38] in statistics and has been widely investigated in the context of the Cox regression-based proportional hazard model (PHM). We define the intensity function for our PI-SRM by:

$$\lambda_{x}(t_{k}, x_{k}; \theta, \beta) = \lambda_{0}(t_{k}; \theta)\, g(x_{k}; \beta), \tag{6}$$

with the regression coefficients $\beta = (\beta_{1}, \dots, \beta_{l})$, the baseline intensity $\lambda_{0}(t_{k}; \theta) \ (> 0)$, and the covariate function $g(x_{k}; \beta) \ (> 0)$. When $g(x_{k}; \beta) = 1$ for any $x_{k}$, the PI-SRMs reduce to the NHPP-based SRMs with the baseline intensity $\lambda_{0}(t; \theta)$. Following the common Cox regression-based PHM, it is appropriate to assume the following exponential form for the covariate function:

$$g(x_{k}; \beta) = \exp(x_{k} \beta). \tag{7}$$

In the literature [36,37,38], the above form is widely accepted because it makes the analysis easy and flexible. Lawless [45] also analyzed event count data in actual medical applications with the same exponential covariate function. Note that the time-independent covariates considered by Lawless [45] were binary data taking the values 0 and 1. Rinsaka et al. [13] proposed an intuitive but reasonable model to deal with the effect of the cumulative number of software faults and the software metrics in the covariate function. Define the mean value function for the given data $(t_{k}, y_{k}, x_{k})$ $(k = 1, 2, \dots, n)$ by:

$$H_{p}(t_{1}; \theta, \beta) = \int_{0}^{t_{1}} \lambda_{0}(u; \theta) \exp(x_{1} \beta)\, du, \tag{8}$$

$$H_{p}(t_{2}; \theta, \beta) = \int_{t_{1}}^{t_{2}} \lambda_{0}(u; \theta) \exp(x_{2} \beta)\, du + H_{p}(t_{1}; \theta, \beta), \tag{9}$$

$$\vdots$$

$$\begin{aligned} H_{p}(t_{k}; \theta, \beta) &= \sum_{i=1}^{k} \exp(x_{i} \beta) \int_{t_{i-1}}^{t_{i}} \lambda_{0}(u; \theta)\, du \\ &= \sum_{i=1}^{k} \exp(x_{i} \beta) \times [H_{0}(t_{i}; \theta) - H_{0}(t_{i-1}; \theta)], \end{aligned} \tag{10}$$

where $H_{0}(t_{i}; \theta) = \int_{0}^{t_{i}} \lambda_{0}(u; \theta)\, du$. It is seen again that the PI-SRM reduces to the common NHPP-based SRM when $\beta_{j} = 0$ for all $j$ $(= 1, 2, \dots, l)$. By introducing $H_{p}(t; \theta, \beta)$, we confirm that the monotone property of the mean value function with respect to the testing time $t$ is guaranteed. Substituting the intensity functions in Table 1 into the baseline intensity $\lambda_{0}(t; \theta)$, we obtain the eleven PI-SRMs corresponding to the NHPP-based SRMs in SRATS [15].
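The piecewise construction of the PI-SRM mean value function can be sketched in Python as below. The exponential baseline $H_{0}(t) = \omega (1 - e^{-b t})$ is an assumption standing in for any baseline model, and the names `baseline_mean_value` and `pi_mean_values` are hypothetical helpers, not part of any existing tool.

```python
import math


def baseline_mean_value(t, omega, b):
    """Assumed exponential baseline H_0(t) = omega * (1 - exp(-b t))."""
    return omega * (1.0 - math.exp(-b * t))


def pi_mean_values(times, metrics, beta, omega, b):
    """Return [H_p(t_1), ..., H_p(t_n)] accumulated interval by interval.

    times   : increasing testing times t_1 < ... < t_n
    metrics : one covariate vector x_k per testing time
    beta    : regression coefficients, same length as each x_k
    """
    values = []
    h_p = 0.0
    t_prev = 0.0
    for t_k, x_k in zip(times, metrics):
        linear = sum(x * bj for x, bj in zip(x_k, beta))
        increment = baseline_mean_value(t_k, omega, b) - baseline_mean_value(t_prev, omega, b)
        # Each baseline increment is scaled by exp(x_k beta) > 0,
        # so the accumulated value never decreases.
        h_p += math.exp(linear) * increment
        values.append(h_p)
        t_prev = t_k
    return values
```

Because every baseline increment is multiplied by a strictly positive factor, the returned sequence is non-decreasing for any sign of the regression coefficients, and setting all coefficients to zero recovers the baseline mean value function.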

4.2. Maximum Likelihood Estimation

We again use maximum likelihood estimation to estimate the parameter vectors $\theta$ and $\beta$ of the PI-SRM. For the fault count data $(t_{k}, y_{k})$ and software metrics data $x_{k} = (x_{k1}, \dots, x_{kl})$ $(k = 1, 2, \dots, n)$, we define the likelihood function by:

$$L(\theta, \beta) = \prod_{k=1}^{n} \frac{\{H_{p}(t_{k}; \theta, \beta) - H_{p}(t_{k-1}; \theta, \beta)\}^{y_{k} - y_{k-1}}}{(y_{k} - y_{k-1})!} \exp(-H_{p}(t_{n}; \theta, \beta)), \tag{11}$$

so that the log-likelihood function of the PI-SRM can be written as:

$$\begin{aligned} LLF(\theta, \beta) = {} & \sum_{k=1}^{n} (y_{k} - y_{k-1}) \ln [H_{p}(t_{k}; \theta, \beta) - H_{p}(t_{k-1}; \theta, \beta)] \\ & - \sum_{k=1}^{n} \ln [(y_{k} - y_{k-1})!] - H_{p}(t_{n}; \theta, \beta). \end{aligned} \tag{12}$$

By maximizing Equation (12) with the Newton–Raphson method, we obtain the maximum likelihood estimates $(\hat{\theta}, \hat{\beta})$ of the PI-SRM.
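A hedged sketch of the PI-SRM log-likelihood follows. It uses the fact that the increment of the PI-SRM mean value function over $(t_{k-1}, t_{k}]$ equals $\exp(x_{k} \beta)$ times the baseline increment. The exponential baseline and the function name `pi_srm_llf` are assumptions for illustration, and the sketch only evaluates the objective; it does not implement the Newton–Raphson iteration.

```python
import math


def baseline_mean_value(t, omega, b):
    """Assumed exponential baseline H_0(t) = omega * (1 - exp(-b t))."""
    return omega * (1.0 - math.exp(-b * t))


def pi_srm_llf(times, cum_counts, metrics, beta, omega, b):
    """Log-likelihood of the PI-SRM for group data with covariate vectors x_k."""
    llf = 0.0
    h_p_total = 0.0  # accumulates H_p(t_n)
    t_prev, y_prev = 0.0, 0
    for t_k, y_k, x_k in zip(times, cum_counts, metrics):
        linear = sum(x * bj for x, bj in zip(x_k, beta))
        dh = math.exp(linear) * (
            baseline_mean_value(t_k, omega, b) - baseline_mean_value(t_prev, omega, b)
        )
        dy = y_k - y_prev
        # lgamma(dy + 1) equals ln(dy!).
        llf += dy * math.log(dh) - math.lgamma(dy + 1)
        h_p_total += dh
        t_prev, y_prev = t_k, y_k
    return llf - h_p_total
```

Any gradient-based or derivative-free optimizer can be applied to the negative of this function; the choice of optimizer is left open here.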

5. Numerical Examples

In our numerical examples, four software fault count data sets with software metrics are used; these data were measured in real-time command and control system development projects [39]. Details are shown in Table 2, in which three software metrics (failure identification work, execution time, and computer time-failure identification) are recorded in addition to the cumulative number of software faults detected at each testing time (calendar week in [39]). We quantitatively evaluate the goodness-of-fit performances of the eleven PI-SRMs and evaluate their predictive performances with the above four time-dependent metrics data sets as the covariates. In the following discussion, we consider two patterns for handling the software metrics. One is to input the software metrics as the cumulative values $x_{k} = (x_{k1}, \dots, x_{kl})$, and the other is to input them as the differences $x_{k} = (x_{k1} - x_{(k-1)1}, \dots, x_{kl} - x_{(k-1)l})$, where $l$ is the number of time-dependent metrics in each data set and $k = 0, 1, 2, \dots, n$. The main concern here is to investigate the effects of cumulative values of software metrics on the contribution to the software fault count. For instance, we examine the difference between the cumulative length of test execution time up to the present testing time and the test execution time spent at that testing time.
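The two input patterns differ only by a first-difference transform, which can be sketched in Python as follows. The helper name `to_increments` is hypothetical, and taking the metrics before the first testing time to be zero is an assumption made for this illustration.

```python
def to_increments(cumulative):
    """Convert cumulative metric vectors x_1..x_n into per-interval differences.

    Assumes the metric values before the first testing time are zero.
    """
    increments = []
    prev = [0.0] * len(cumulative[0])
    for row in cumulative:
        increments.append([cur - old for cur, old in zip(row, prev)])
        prev = row
    return increments


if __name__ == "__main__":
    # Hypothetical cumulative values of two metrics over three weeks.
    print(to_increments([[1.0, 2.0], [3.0, 5.0], [6.0, 5.0]]))
```

A metric that is not exercised in a given week yields a zero increment, which matters later when a logarithm of the metric is needed.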

5.1. Goodness-of-Fit Performance

For our PI-SRMs, we assume the eleven baseline intensity functions in Table 1 and compare them to investigate the effects of each time-dependent software metric on the stochastic behavior of the cumulative number of software faults detected in the testing phase. We calculate the maximum likelihood estimates $(\hat{\theta}, \hat{\beta})$ with the covariate function $g(x_{k}; \beta) = \exp(x_{k} \beta)$ for all combinations of the software metrics data in Table 2, giving a total of 7 combinations, as shown in Table 3. From the corresponding maximum log-likelihood (LLF), the Akaike information criterion (AIC) and the mean squared error (MSE) are used to evaluate the goodness-of-fit performances of our PI-SRMs, where:

$$\mathrm{AIC} = -2\, LLF(\hat{\theta}, \hat{\beta}) + 2 \pi, \tag{13}$$

and

$$\mathrm{MSE} = \frac{1}{n} \sqrt{\sum_{k=1}^{n} \{y_{k} - H_{p}(t_{k}; \hat{\theta}, \hat{\beta})\}^{2}}. \tag{14}$$

Here, $\pi$ represents the number of free parameters. The lower the AIC/MSE, the better the SRM in terms of goodness-of-fit to the fault count data.
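The two goodness-of-fit measures can be computed as in the short Python sketch below. The function names `aic` and `mse` are hypothetical; note that `mse` follows the definition given in this section (square root of the summed squared errors divided by $n$), which differs from the more common mean of squared errors.

```python
import math


def aic(max_llf, num_free_params):
    """AIC = -2 * (maximum log-likelihood) + 2 * (number of free parameters)."""
    return -2.0 * max_llf + 2.0 * num_free_params


def mse(observed, fitted):
    """MSE as defined in this section: sqrt(sum of squared errors) / n."""
    n = len(observed)
    return math.sqrt(sum((y - h) ** 2 for y, h in zip(observed, fitted))) / n
```

For a PI-SRM, the number of free parameters counts both the baseline parameters and the regression coefficients, so adding a metric must improve the log-likelihood by more than one unit to lower the AIC.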

In Figure 1, we plot the cumulative number of detected software faults in GDS1 and the estimated mean value functions of the best-fitted SRMs. Among the eleven intensity functions, we select the best model with the minimum AIC for each of the common NHPP-based SRMs without software metrics in SRATS [15] (orange curve), the PI-SRM with cumulative software metrics (red curve), and the PI-SRM with non-cumulative software metrics (blue curve). At first glance, the three curves exhibit similar behavior, but a closer look reveals that our PI-SRMs can show more complex behaviors than the existing NHPP-based SRMs without software metrics. Figure 2 illustrates the estimated number of detected faults in each testing time interval in GDS1, where the same models as in Figure 1 are used for comparison and the orange bar chart represents the actual number of software faults in each testing week. The result shows that our two PI-SRMs achieve better goodness-of-fit than the existing NHPP-based SRM without software metrics and can follow the detailed trend of the software fault count.

To compare our PI-SRMs with the common NHPP-based SRMs without software metrics more precisely, we present the best AIC results for the four time-dependent metrics data sets in Table 4. By comparing our two PI-SRMs with cumulative/non-cumulative metrics values, we investigate how the software metrics data should be handled in software fault data analysis. From the results in Table 4, it is found that our PI-SRMs are more appealing in software reliability modeling and outperform the existing NHPP-based SRMs without software metrics in terms of goodness-of-fit. In the comparison of the two patterns with cumulative/non-cumulative metric data, the non-cumulative software metrics tend to show better fitting results except in GDS4. Note that the difference in AIC between the cumulative and non-cumulative metric patterns is minimal and negligible. Therefore, our conclusion on the goodness-of-fit performance is that the PI-SRM with non-cumulative software metric data should be better. Furthermore, in Table 4, it is observed that both the execution time and the failure identification work contribute to the goodness-of-fit performance of the PI-SRMs. Hence, measuring the test execution time and the failure identification work helps to understand the software fault count in the testing phase more accurately and is useful for monitoring the software testing progress.

5.2. Predictive Performance

Next, we investigate the predictive performances of our PI-SRMs. At each observation point $n'$ $(1 \leq n' < n)$, when 50% or 80% of the whole data are available, we predict the future behavior of the cumulative number of software faults. To assess the predictive ability, we apply the predictive mean squared error (PMSE) as the predictive performance measure, where:

$$\mathrm{PMSE} = \frac{1}{n - n'} \sqrt{\sum_{k = n' + 1}^{n} [y_{k} - H_{p}(t_{k}; \hat{\theta}, \hat{\beta})]^{2}}. \tag{15}$$

The smaller the PMSE, the better the predictive performance of the model. As expected, when we predict the number of software faults detected in the future, both the software metrics $x_{k}$ $(k = 1, 2, \dots, n)$ and the regression coefficients $\beta$ must be estimated. The regression coefficients are obtained as plug-in estimates (maximum likelihood estimates) from the past observations. The difficulty in using the PI-SRMs, however, is that we also have to predict the future values of the software metrics themselves. In our numerical experiments, we consider the following three cases:

Case I: All the test/development metrics data are completely known through the testing phase in advance, so the software testing expenditures are exactly given in the testing.
Case II: The test/development metrics data do not change from the observation point in the future.
Case III: The test/development metrics data experienced in the future are regarded as independent random variables and predictable by any statistical method.

Case I corresponds to the situation where the software test plan is established and there is no confusion in the software testing phase. Case II implicitly assumes that the observation point is regarded as the release point of the software, because no testing effort will be spent in the operational phase. Case III would be the most plausible case in software testing. In this case, we need to introduce some statistical model to predict the test/development metrics data. We employ two elementary regression methods, linear regression and exponential regression, to predict the future software metrics data. More specifically, we assume that the metric data $x_{k}$ up to the observation point $n'$ and the corresponding time points $t_{k}$ have been observed for $k = 1, 2, \dots, n'$. Our goal is then to calculate the predicted values $\hat{x}_{k}$ of the metric data over the period $(t_{n'+1}, t_{n})$ by substituting the independent variable $T = \{t_{n'+1}, t_{n'+2}, \dots, t_{n}\}$ into the linear regression equation:

$$\hat{x}_{k} = \delta_{1} + \delta_{2} t_{k}, \tag{16}$$

where the intercept $\delta_{1}$ and the coefficient $\delta_{2}$ are given by:

$$\delta_{1} = \frac{\left(\sum_{k=1}^{n'} x_{k}\right) \left(\sum_{k=1}^{n'} t_{k}^{2}\right) - \left(\sum_{k=1}^{n'} t_{k}\right) \left(\sum_{k=1}^{n'} t_{k} x_{k}\right)}{n' \left(\sum_{k=1}^{n'} t_{k}^{2}\right) - \left(\sum_{k=1}^{n'} t_{k}\right)^{2}} \tag{17}$$

and

$$\delta_{2} = \frac{n' \left(\sum_{k=1}^{n'} t_{k} x_{k}\right) - \left(\sum_{k=1}^{n'} t_{k}\right) \left(\sum_{k=1}^{n'} x_{k}\right)}{n' \left(\sum_{k=1}^{n'} t_{k}^{2}\right) - \left(\sum_{k=1}^{n'} t_{k}\right)^{2}}, \tag{18}$$

respectively. Similarly, we can obtain the predicted values $\hat{x}_{k}$ of the metric data by substituting $T = \{t_{n'+1}, t_{n'+2}, \dots, t_{n}\}$ into the exponential regression equation:

$$\hat{x}_{k} = \delta_{3}\, \delta_{4}^{\,t_{k}}, \tag{19}$$

where the coefficients $\delta_{3}$ and $\delta_{4}$ are given by:

$$\delta_{3} = \exp\left(\frac{\left(\sum_{k=1}^{n'} \ln x_{k}\right) \left(\sum_{k=1}^{n'} t_{k}^{2}\right) - \left(\sum_{k=1}^{n'} t_{k}\right) \left(\sum_{k=1}^{n'} t_{k} \ln x_{k}\right)}{n' \left(\sum_{k=1}^{n'} t_{k}^{2}\right) - \left(\sum_{k=1}^{n'} t_{k}\right)^{2}}\right) \tag{20}$$

and

$$\delta_{4} = \exp\left(\frac{n' \left(\sum_{k=1}^{n'} t_{k} \ln x_{k}\right) - \left(\sum_{k=1}^{n'} t_{k}\right) \left(\sum_{k=1}^{n'} \ln x_{k}\right)}{n' \left(\sum_{k=1}^{n'} t_{k}^{2}\right) - \left(\sum_{k=1}^{n'} t_{k}\right)^{2}}\right), \tag{21}$$

respectively. Note that, from Equations (20) and (21), the exponential regression is not appropriate for prediction when the non-cumulative metric data are used in the PI-SRMs, because a metric may take the value 0, in which case $\ln x_{k}$ is undefined and the coefficients cannot be calculated. Therefore, we consider in total seven patterns of estimated development/test metrics data for the future phase across the above three cases, and investigate the predictive performances of our PI-SRMs.
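The two regression forecasts for Case III can be sketched in Python as below, using the closed-form least-squares coefficients. All function names are hypothetical. The exponential fit is obtained by applying the same linear formulas to the logarithm of the metric and exponentiating the result, and it rejects non-positive values because their logarithm is undefined.

```python
import math


def linear_fit(t, x):
    """Return (delta1, delta2), the least-squares intercept and slope of x on t."""
    m = len(t)
    st = sum(t)
    sx = sum(x)
    stt = sum(v * v for v in t)
    stx = sum(a * b for a, b in zip(t, x))
    denom = m * stt - st ** 2
    delta1 = (sx * stt - st * stx) / denom
    delta2 = (m * stx - st * sx) / denom
    return delta1, delta2


def exponential_fit(t, x):
    """Return (delta3, delta4) for x = delta3 * delta4 ** t; requires every x_k > 0."""
    if any(v <= 0 for v in x):
        raise ValueError("exponential regression needs strictly positive metric values")
    log_d3, log_d4 = linear_fit(t, [math.log(v) for v in x])
    return math.exp(log_d3), math.exp(log_d4)


def predict_linear(t_future, delta1, delta2):
    """Predicted metric values at future testing times under the linear model."""
    return [delta1 + delta2 * tk for tk in t_future]


def predict_exponential(t_future, delta3, delta4):
    """Predicted metric values at future testing times under the exponential model."""
    return [delta3 * delta4 ** tk for tk in t_future]
```

In use, the fit would be computed from the observations up to the observation point and the predictions evaluated at the remaining testing times, one metric at a time.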

Figure 3 and Figure 4 depict the prediction results of the cumulative number of software faults in GDS1 at the 50% and 80% observation points, respectively. It is easy to see that our two PI-SRMs show predictive trends quite different from those of the common NHPP-based SRMs, and that they follow the increasing trend of the underlying software fault count data more closely, regardless of whether the prediction length is long or short, i.e., in the testing phase after both the 50% and 80% observation points. The quantitative comparison in terms of predictive performance is given in Table 5 and Table 6, where we present the PMSE for the four data sets at the 50% and 80% observation points, respectively. Here we select the best SRMs with the smallest PMSE among the PI-SRMs with cumulative/non-cumulative software metric data in Case I, Case II, and Case III, and among the existing NHPP-based SRMs. From these results, it is immediately seen that our PI-SRMs still outperform the existing NHPP-based SRMs in all the data sets. We also find that utilizing the estimated metrics data in Case II, i.e., when the test/development metrics data do not change in the future, tends to give better predictive performances than the other two cases in many cases (GDS1 50%, GDS2 50%, GDS3 50%, GDS4 50%, and GDS4 80%). In 5 out of 8 cases (GDS2 50%, GDS4 50%, GDS2 80%, GDS3 80%, and GDS4 80%), our PI-SRMs with non-cumulative metric data provide the minimum PMSE. More specifically, Combination II of software metrics in Table 3 gives the minimum PMSEs in the GDS1 80%, GDS3 80%, and GDS4 80% data sets with non-cumulative software metric data, and in GDS1 50% and GDS2 50% with cumulative software metric data. The remaining three minimum PMSEs are given by the PI-SRMs with Combinations V, VI, and VII in Table 3.
Finally, by carefully checking the prediction results in Table 4 and Table 5, we conclude that the failure identification work is the most important development metric in prediction and improves the accuracy of software fault prediction.

5.3. Software Reliability Assessment

In the previous discussion, we confirmed that our PI-SRMs show better predictive performances than the existing NHPP-based SRMs in all cases. As the next step, we quantify the software reliability, which is defined as the probability that the software after release is fault-free. Let $R(t_{l} \mid t_{m}) = \Pr\{N(t_{m}) - N(t_{l}) = 0 \mid N(t_{l}) = n\}$ denote the software reliability in the operational phase $(t_{l}, t_{m}]$, where $t_{l}$ is the release point. Then, from the NHPP assumption, it is easy to obtain:

$$R(t_{l} \mid t_{m}) = \exp\{-[H_{p}(t_{m}; \hat{\theta}, \hat{\beta}) - H_{p}(t_{l}; \hat{\theta}, \hat{\beta})]\}. \tag{22}$$

In our numerical example, we set

t_{m} = 2 t_{l}

, say, the operational period is twice the length, and assume that the software metrics

x_{k} = (x_{k 1}, x_{k 2}, x_{k 3})

are constant in the time interval

(t_{l}, t_{m})

, since the software product has not been tested after the release time

t_{l}

. We assess the software reliability quantitatively with the best PI-SRMs, which are selected with the minimum AIC at the release time point

t_{l} = t_{n}

.
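As a rough illustration of the reliability computation, the following sketch evaluates the probability of observing no fault in $(t_{l}, t_{m}]$, with the exponent taken as the negative of the expected number of faults in that interval, as the zero-count probability of a Poisson distribution requires. It is not the PI-SRM itself: the mean value function is passed in as an arbitrary callable, and the example uses the exponential NHPP-based SRM without covariates, assuming the standard form $ω (1 - e^{- b t})$. The function names and the parameter values `omega = 140.0` and `b = 0.1` are hypothetical and are not estimates from the paper.

```python
import math


def reliability(mean_value, t_l, t_m):
    """Probability that no fault is detected in (t_l, t_m] under an NHPP.

    mean_value: callable t -> expected cumulative number of faults by time t.
    """
    if t_m < t_l:
        raise ValueError("t_m must not be smaller than the release point t_l")
    expected_faults = mean_value(t_m) - mean_value(t_l)
    return math.exp(-expected_faults)


def exp_mean_value(omega, b):
    """Mean value function omega * (1 - exp(-b t)) of the exponential NHPP-based SRM."""
    return lambda t: omega * (1.0 - math.exp(-b * t))


# Hypothetical parameter values, chosen only for illustration.
H = exp_mean_value(omega=140.0, b=0.1)
t_l = 21.0            # release point (GDS1 has 21 testing days in Table 2)
t_m = 2.0 * t_l       # operational period as long as the testing period
print(reliability(H, t_l, t_m))
```

For a PI-SRM, the same `reliability` function could be reused by passing a callable that evaluates $H_{p} (t; \hat{θ}, \hat{β})$ with the metrics held constant after release.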

Table 7 presents the comparison of our PI-SRMs with the existing NHPP-based SRMs. It can be seen that, in GDS1, GDS2, and GDS3, our PI-SRMs with cumulative/non-cumulative software metrics provide larger software reliability than the common NHPP-based SRMs without software metrics. This result implies that if the PI-SRMs are reliable in goodness-of-fit and predictive performances, they are more inclined to support positive decisions in software reliability assessment, and that the NHPP-based SRMs without software metrics tend to underestimate the software reliability. On the other hand, we also note that in all four data sets, the software reliability estimated by almost all of the SRMs, except the txvmin-II PI-SRM in GDS3 and the txvmin NHPP-based SRM in GDS4, is not promising. This observation also implies that, over the time period $t_{m} - t_{l}$, these SRMs tend to give false alarms from the viewpoint of safety, so that the software products under testing seem to require more testing to meet the software reliability requirement.

6. Conclusions

This paper presented the proportional intensity NHPP-based SRMs (PI-SRMs for short) with eleven representative baseline intensity functions, which can incorporate multiple time-dependent cumulative/non-cumulative software development/test metrics data. In our numerical experiments with actual software project data, we quantitatively evaluated the goodness-of-fit and predictive performances of our PI-SRMs and compared them with the common NHPP-based SRMs with the same baseline intensity functions. We verified that our PI-SRMs performed well in all data sets and showed strong potential for prediction. By carefully checking the regression coefficients, we also confirmed that failure identification work was the most important testing metric contributing to software debugging, and that it could improve the goodness-of-fit and predictive performances.

In the future, we will propose other PI-SRMs with different baseline intensity functions. In software engineering, the measurement of software metrics and development effort has been considered the most fundamental technique for quantifying software product quality. However, compared with traditional software reliability modeling with only software fault count data, metrics-based software reliability quantification has not been fully studied yet. In the NHPP-based modeling framework, it is important to identify the best parametric model, i.e., the best intensity function. A similar attempt should be made to find the best baseline intensity function, which depends on the kind of software metrics used in the analysis.

On the other hand, to analyze the classical software fault count data, we applied only the three time-dependent software metrics mentioned in [39] (failure identification work, execution time, and computer time-failure identification), while other metrics that are more easily observed during software testing, whether time-dependent or not (e.g., the total number of operators, program volume, number of lines of comments, number of lines of code, and number of lines of executable source code), were ignored. Therefore, we will continue to investigate our PI-SRMs in the near future by using the above-mentioned metrics data as well as software fault count data.

Author Contributions

Conceptualization, S.L., T.D. and H.O.; methodology, S.L., T.D. and H.O.; validation, S.L., T.D. and H.O.; writing—original draft preparation, S.L.; writing—review and editing, T.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Acknowledgments

This work was supported by the Program for Developing and Supporting the Next-Generation of Innovative Researchers at Hiroshima University (Next Generation Fellows), Japan.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Lyu, M. (Ed.) Handbook of Software Reliability Engineering; McGraw Hill: New York, NY, USA, 1996.
2. Musa, J.D.; Iannino, A.; Okumoto, K. Software Reliability Measurement, Prediction, Application; McGraw-Hill: New York, NY, USA, 1987.
3. Pham, H. Software Reliability; Springer: London, UK, 2000.
4. Goel, A.L.; Okumoto, K. Time-dependent error-detection rate model for software reliability and other performance measures. IEEE Trans. Reliab. 1979, R-28, 206–211.
5. Yamada, S.; Ohba, M.; Osaki, S. S-shaped reliability growth modeling for software error detection. IEEE Trans. Reliab. 1983, R-32, 475–478.
6. Zhao, M.; Xie, M. On maximum likelihood estimation for a general non-homogeneous Poisson process. Scand. J. Stat. 1996, 23, 597–607.
7. Abdel-Ghaly, A.A.; Chan, P.Y.; Littlewood, B. Evaluation of competing software reliability predictions. IEEE Trans. Softw. Eng. 1986, SE-12, 950–967.
8. Ohba, M. Inflection S-shaped software reliability growth model. In Stochastic Models in Reliability Theory; Springer: New York, NY, USA, 1984; pp. 144–162.
9. Gokhale, S.S.; Trivedi, K.S. Log-logistic software reliability growth model. In Proceedings of the Third IEEE International High-Assurance Systems Engineering Symposium (HASE 1998), Washington, DC, USA, 13–14 November 1998; pp. 34–41.
10. Okamura, H.; Dohi, T.; Osaki, S. Software reliability growth models with normal failure time distributions. Reliab. Eng. Syst. Saf. 2013, 116, 135–141.
11. Achcar, J.A.; Dey, D.K.; Niverthi, M. A Bayesian approach using nonhomogeneous Poisson processes for software reliability models. In Frontiers in Reliability; World Scientific: Singapore, 1998; pp. 1–18.
12. Ohishi, K.; Okamura, H.; Dohi, T. Gompertz software reliability model: Estimation algorithm and empirical validation. J. Syst. Softw. 2009, 82, 535–543.
13. Rinsaka, K.; Shibata, K.; Dohi, T. Proportional Intensity-Based Software Reliability Modeling with Time-Dependent Metrics. In Proceedings of the 30th Annual International Computer Software and Applications Conference (COMPSAC’06), Chicago, IL, USA, 17–21 September 2006; Volume 1, pp. 369–376.
14. Shibata, K.; Rinsaka, K.; Dohi, T. PISRAT: Proportional Intensity-Based Software Reliability Assessment Tool. In Proceedings of the 13th Pacific Rim International Symposium on Dependable Computing (PRDC 2007), Melbourne, VIC, Australia, 17–19 December 2007; pp. 43–52.
15. Okamura, H.; Dohi, T. SRATS: Software reliability assessment tool on spreadsheet (Experience report). In Proceedings of the 2013 IEEE 24th International Symposium on Software Reliability Engineering (ISSRE 2013), Pasadena, CA, USA, 4–7 November 2013; pp. 100–107.
16. McCabe, T.J. A complexity measure. IEEE Trans. Softw. Eng. 1976, SE-2, 308–320.
17. Halstead, M.H. Elements of Software Science; Elsevier: New York, NY, USA, 1977.
18. Putnam, L.H. A general empirical solution to the macro software sizing and estimating problem. IEEE Trans. Softw. Eng. 1978, SE-4, 345–367.
19. Takahashi, M.; Kamayachi, Y. An empirical study of a model for program error prediction. In Proceedings of the 8th International Conference on Software Engineering, London, UK, 28–30 August 1985; pp. 330–336.
20. Pillai, K.; Nair, V.S.S. A model for software development effort and cost estimation. IEEE Trans. Softw. Eng. 1997, 23, 485–497.
21. Khoshgoftaar, T.M.; Munson, J.C. Predicting software development errors using software complexity metrics. IEEE J. Sel. Areas Commun. 1990, 8, 253–261.
22. Khoshgoftaar, T.M.; Bhattacharyya, B.B.; Richardson, G.D. Predicting software errors, during development, using nonlinear regression models: A comparative study. IEEE Trans. Reliab. 1992, 41, 390–395.
23. Khoshgoftaar, T.M.; Munson, J.C.; Bhattacharyya, B.B.; Richardson, G.D. Predictive modeling techniques of software quality from software measures. IEEE Trans. Softw. Eng. 1992, 18, 979–987.
24. Khoshgoftaar, T.M.; Pandya, A.; Lanning, D. Application of neural networks for predicting program fault. Ann. Softw. Eng. 1995, 1, 141–154.
25. Schneidewind, N.F. Software metrics model for integrating quality control and prediction. In Proceedings of the Eighth International Symposium on Software Reliability Engineering, Washington, DC, USA, 2–5 November 1997; pp. 402–415.
26. Schneidewind, N.F. Measuring and evaluating maintenance process using reliability, risk, and test metrics. IEEE Trans. Softw. Eng. 1999, 25, 768–781.
27. Li, P.L.; Shaw, M.; Herbsleb, J.; Ray, B.; Santhanam, P. Empirical evaluation of defect projection models for widely-deployed production software systems. In Proceedings of the 12th ACM SIGSOFT Symposium on Foundations of Software Engineering, Newport Beach, CA, USA, 31 October–6 November 2004; pp. 263–272.
28. Khoshgoftaar, T.M.; Gao, K.; Szabo, R. Comparing software fault predictions of pure and zero-inflated Poisson regression models. Int. J. Syst. Sci. 2005, 36, 705–715.
29. Amasaki, S.; Yoshitomi, T.; Mizuno, O.; Takagi, Y.; Kikuno, T. A new challenge for applying time series metrics data to software quality estimation. Softw. Qual. J. 2005, 13, 177–193.
30. Ascher, H. Proportional hazards modelling of software failure data. In Software Reliability; State of the Art Report; Bendell, A., Mellor, P., Eds.; Pergamon Infotech: Berkshire, UK, 1986; pp. 229–263.
31. Ascher, H. The use of regression techniques for matching reliability models to the real world. In Software System Design Methods, NATO ASI Series; Skwirzynski, J.K., Ed.; Springer: Berlin/Heidelberg, Germany, 1986; Volume F22, pp. 366–378.
32. Bendell, A. The use of exploratory data analysis techniques for software reliability assessment and prediction. In Software System Design Methods, NATO ASI Series; Skwirzynski, J.K., Ed.; Springer: Berlin/Heidelberg, Germany, 1986; Volume F22, pp. 337–351.
33. Evanco, W.M.; Lacovara, R. A model-based framework for the integration of software metrics. J. Syst. Softw. 1995, 26, 75–84.
34. Evanco, W.M. Using a proportional hazards model to analyze software reliability. In Proceedings of the 9th International Conference Software Technology & Engineering Practice, Pittsburgh, PA, USA, 2 September 1999; pp. 134–141.
35. Nishio, Y.; Dohi, T. Determination of the optimal software release time based on proportional hazards software reliability growth models. J. Qual. Maint. Eng. 2003, 9, 48–65.
36. Cox, D.R. Regression models and life-tables. J. R. Stat. Soc. 1972, B-34, 187–220.
37. Murphy, S.; Sen, P. Time-dependent coefficients in a Cox type regression model. Stoch. Process. Their Appl. 1991, 39, 153–180.
38. Tian, T.; Zucker, D.; Wei, L.J. On the Cox model with time-varying regression coefficient. J. Am. Stat. Assoc. 2005, 100, 172–183.
39. Musa, J.D. Software Reliability Data, Technical Report, Data and Analysis Center for Software; Rome Air Development Center: New York, NY, USA, 1979.
40. Shibata, K.; Rinsaka, K.; Dohi, T. Metrics-Based Software Reliability Models Using Non-homogeneous Poisson Processes. In Proceedings of the 2006 17th International Symposium on Software Reliability Engineering, Raleigh, NC, USA, 7–10 November 2006; pp. 52–61.
41. Okamura, H.; Etani, Y.; Dohi, T. A Multi-factor Software Reliability Model Based on Logistic Regression. In Proceedings of the 2010 IEEE 21st International Symposium on Software Reliability Engineering, San Jose, CA, USA, 1–4 November 2010; pp. 31–40.
42. Kuwa, D.; Dohi, T. Generalized Logit Regression-Based Software Reliability Modeling with Metrics Data. In Proceedings of the 2013 IEEE 37th Annual Computer Software and Applications Conference, Kyoto, Japan, 22–26 July 2013; pp. 246–255.
43. Kuwa, D.; Dohi, T.; Okamura, H. Generalized Cox Proportional Hazards Regression-Based Software Reliability Modeling with Metrics Data. In Proceedings of the 2013 IEEE 19th Pacific Rim International Symposium on Dependable Computing, Vancouver, BC, Canada, 2–4 December 2013; pp. 328–337.
44. Nagaraju, V.; Jayasinghe, C.; Fiondella, L. Optimal test activity allocation for covariate software reliability and security models. J. Syst. Softw. 2020, 168, 110643.
45. Lawless, J.F. Regression methods for Poisson process data. J. Am. Stat. Assoc. 1987, 82, 808–815.
46. Goel, A.L. Software reliability models: Assumptions, limitations, and applicability. IEEE Trans. Softw. Eng. 1985, SE-11, 1411–1423.

Figure 1. Behavior of estimated cumulative number of software faults in GDS1.

Figure 2. Behavior of estimated number of software faults in each time interval in GDS1.

Figure 3. Behavior of the predicted cumulative number of software faults with PI-SRMs and common NHPP-based SRM in GDS1 (50% observation point).

Figure 4. Behavior of the predicted cumulative number of software faults with PI-SRMs and common NHPP-based SRM in GDS1 (80% observation point).

Table 1. The existing NHPP-based SRMs.

Models	$λ (t; θ)$	$F (t; α)$
Exponential distribution (exp) [4]	$λ (t; θ) = ω b e^{- b t}$	$F (t; α) = 1 - e^{- b t}$
Gamma distribution (gamma) [5,6]	$λ (t; θ) = ω \frac{e^{- \frac{t}{c}} {(\frac{t}{c})}^{b - 1}}{c Γ (b)}$	$F (t; α) = \int_{0}^{t} \frac{c^{b} s^{b - 1} e^{- c s}}{Γ (b)} d s$
Pareto distribution (pareto) [7]	$λ (t; θ) = \frac{ω b c {(\frac{c}{c + t})}^{b - 1}}{{(c + t)}^{2}}$	$F (t; α) = 1 - {(\frac{b}{t + b})}^{c}$
Truncated normal distribution (tnorm) [10]	$λ (t; θ) = \frac{ω e^{- \frac{{(c - t)}^{2}}{2 b^{2}}}}{\sqrt{2 π} b (1 - \frac{1}{2} \mathrm{erfc} (\frac{c}{\sqrt{2} b}))}$	$F (t; α) = \frac{1}{\sqrt{2 π} b} \int_{- \infty}^{t} e^{- \frac{{(s - c)}^{2}}{2 b^{2}}} d s$
Log-normal distribution (lnorm) [10,11]	$λ (t; θ) = \frac{ω e^{- \frac{{(c - \log (t))}^{2}}{2 b^{2}}}}{\sqrt{2 π} b t}$	$F (t; α) = \frac{1}{\sqrt{2 π} b} \int_{- \infty}^{t} e^{- \frac{{(s - c)}^{2}}{2 b^{2}}} d s$
Truncated logistic distribution (tlogist) [8]	$λ (t; θ) = \frac{ω e^{- \frac{t - c}{b}}}{b (1 - \frac{1}{e^{c / b} + 1}) {(e^{- \frac{t - c}{b}} + 1)}^{2}}$	$F (t; α) = \frac{1 - e^{- b t}}{1 + c e^{- b t}}$
Log-logistic distribution (llogist) [9]	$λ (t; θ) = \frac{ω e^{- \frac{\log (t) - c}{b}}}{b t {(e^{- \frac{\log (t) - c}{b}} + 1)}^{2}}$	$F (t; α) = \frac{{(b t)}^{c}}{1 + {(b t)}^{c}}$
Truncated extreme-value maximum distribution (txvmax) [12]	$λ (t; θ) = \frac{ω e^{- \frac{t - c}{b} - e^{- \frac{t - c}{b}}}}{b (1 - e^{- e^{c / b}})}$	$F (t; α) = e^{- e^{- \frac{t - c}{b}}}$
Log-extreme-value maximum distribution (lxvmax) [12]	$λ (t; θ) = \frac{ω c e^{- {(\frac{t}{b})}^{- c}} {(\frac{t}{b})}^{- c - 1}}{b}$	$F (t; α) = e^{- {(\frac{t}{b})}^{- c}}$
Truncated extreme-value minimum distribution (txvmin) [12]	$λ (t; θ) = \frac{ω e^{- \frac{- c - t}{b} - e^{- \frac{- c - t}{b}} + e^{c / b}}}{b}$	$F (t; α) = e^{- e^{- \frac{t - c}{b}}}$
Log-extreme-value minimum distribution (lxvmin) [46]	$λ (t; θ) = \frac{ω e^{- \frac{- c - \log (t)}{b} - e^{- \frac{- c - \log (t)}{b}}}}{b t}$	$F (t; α) = e^{- e^{- \frac{t - c}{b}}}$

($ω > 0, b > 0, c > 0$).

Table 2. Data sets.

Data	No. Faults	Testing Days
GDS1	136	21
GDS2	54	17
GDS3	38	14
GDS4	53	16
Metrics Data:	Failure identification work, Execution time, Computer time-failure identification.

Table 3. Combination of covariates $g (x_{k l}; β)$.

	$g (x_{kl}; β) (l = 1, 2, 3)$
Combination I	$exp (β_{0} + x_{k 1} β_{1})$
Combination II	$exp (β_{0} + x_{k 2} β_{2})$
Combination III	$exp (β_{0} + x_{k 3} β_{3})$
Combination IV	$exp (β_{0} + x_{k 1} β_{1} + x_{k 2} β_{2})$
Combination V	$exp (β_{0} + x_{k 1} β_{1} + x_{k 3} β_{3})$
Combination VI	$exp (β_{0} + x_{k 2} β_{2} + x_{k 3} β_{3})$
Combination VII	$exp (β_{0} + x_{k 1} β_{1} + x_{k 2} β_{2} + x_{k 3} β_{3})$
$x_{k 1}$: Execution time, $x_{k 2}$: Failure identification work, $x_{k 3}$: Computer time-failure identification.
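The seven combinations in Table 3 differ only in which covariates enter the exponent of $g (x_{k l}; β)$. A minimal sketch of this covariate factor is given below; the names `COMBINATIONS` and `covariate_factor` are illustrative, the coefficient values are the GDS3 txvmin-II estimates reported in Table 4, and the metric vector `x_k` is a hypothetical observation rather than data from the paper.

```python
import math

# Covariate indices l used by each combination in Table 3
# (1: execution time, 2: failure identification work,
#  3: computer time-failure identification).
COMBINATIONS = {
    "I": (1,),
    "II": (2,),
    "III": (3,),
    "IV": (1, 2),
    "V": (1, 3),
    "VI": (2, 3),
    "VII": (1, 2, 3),
}


def covariate_factor(combination, x_k, beta):
    """Evaluate g(x_k; beta) = exp(beta_0 + sum over used l of x_kl * beta_l).

    x_k:  tuple (x_k1, x_k2, x_k3) of metrics observed in the k-th interval.
    beta: dict mapping 0 (intercept) and each used index l to a coefficient.
    """
    linear = beta[0] + sum(x_k[l - 1] * beta[l] for l in COMBINATIONS[combination])
    return math.exp(linear)


# Coefficients of the GDS3 txvmin-II model in Table 4 (cumulative metrics).
beta_hat = {0: -3.7048, 2: 1.2197}
# Hypothetical metric values, for illustration only.
x_k = (5.0, 2.0, 1.0)
print(covariate_factor("II", x_k, beta_hat))
```

Because Combination II uses only $x_{k 2}$, changing the first or third entry of `x_k` leaves the result unchanged in this sketch.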

Table 4. Goodness-of-fit performance based on AIC.

(i) Best proportional intensity model (cumulative metrics data)
	Model	AIC	MSE	$\hat{β}$
GDS1	tlogist-VI	110.114	0.470	$\hat{β_{0}} = - 2.5903, \hat{β_{2}} = - 0.0805, \hat{β_{3}} = 0.0277$
GDS2	tlogist-III	69.785	0.282	$\hat{β_{0}} = 1.7326, \hat{β_{3}} = 0.1406$
GDS3	txvmin-II	57.281	0.289	$\hat{β_{0}} = - 3.7048, \hat{β_{2}} = 1.2197$
GDS4	exp-I	81.059	0.612	$\hat{β_{0}} = 4.6132, \hat{β_{1}} = - 0.1659$
(ii) Best proportional intensity model (non-cumulative metrics data)
GDS1	txvmin-II	109.015	0.721	$\hat{β_{0}} = 2.9503, \hat{β_{2}} = 0.0206$
GDS2	llogist-II	67.352	0.261	$\hat{β_{0}} = - 0.4155, \hat{β_{2}} = 0.0447$
GDS3	gamma-II	50.696	0.221	$\hat{β_{0}} = 0.6061, \hat{β_{2}} = 1.1493$
GDS4	exp-VI	81.131	0.450	$\hat{β_{0}} = 3.8840, \hat{β_{2}} = - 0.2963, \hat{β_{3}} = 0.8060$
(iii) Best SRATS (no metrics data)
GDS1	tlogist	116.891	0.820	-
GDS2	llogist	73.053	0.501	-
GDS3	lxvmax	61.694	0.481	-
GDS4	txvmin	79.761	0.530	-

Table 5. Predictive performance based on PMSE at 50% observation point.

GDS1
	Best model	PMSE
Case I (cumulative)	tlogist-III	6.409
Case I (non-cumulative)	tlogist-II	4.014
Case II (cumulative)	lxvmax-II	2.160
Case II (non-cumulative)	txvmax-IV	4.931
Case III (cumulative): Linear regression	exp-IV	4.146
Case III (cumulative): Exponential regression	txvmin-V	19.213
Case III (non-cumulative): Linear regression	txvmax-II	3.916
SRATS	tnorm	3.408
GDS2
	Best model	PMSE
Case I (cumulative)	tlogist-II	0.816
Case I (non-cumulative)	tnorm-III	0.799
Case II (cumulative)	gamma-II	0.742
Case II (non-cumulative)	txvmax-II	0.407
Case III (cumulative): Linear regression	tlogist-IV	0.616
Case III (cumulative): Exponential regression	tnorm-III	1.644
Case III (non-cumulative): Linear regression	tlogist-IV	0.780
SRATS	tlogist	1.769
GDS3
	Best model	PMSE
Case I (cumulative)	tlogist-II	2.676
Case I (non-cumulative)	txvmax-III	0.481
Case II (cumulative)	exp-VII	0.467
Case II (non-cumulative)	pareto-VI	1.506
Case III (cumulative): Linear regression	llogist-II	0.748
Case III (cumulative): Exponential regression	lxvmax-VI	1.842
Case III (non-cumulative): Linear regression	lxvmax-VII	1.769
SRATS	exp	1.836
GDS4
	Best model	PMSE
Case I (cumulative)	tlogist-III	2.088
Case I (non-cumulative)	pareto-II	1.506
Case II (cumulative)	exp-I	0.495
Case II (non-cumulative)	tnorm-VI	0.425
Case III (cumulative): Linear regression	txvmax-VI	1.139
Case III (cumulative): Exponential regression	exp-II	0.688
Case III (non-cumulative): Linear regression	lxvmin-I	0.703
SRATS	tlogist	1.754

Table 6. Predictive performance based on PMSE at 80% observation point.

GDS1
	Best model	PMSE
Case I (cumulative)	tnorm-II	2.482
Case I (non-cumulative)	txvmax-III	1.768
Case II (cumulative)	txvmax-VII	2.142
Case II (non-cumulative)	txvmax-V	2.903
Case III (cumulative): Linear regression	tnorm-II	1.033
Case III (cumulative): Exponential regression	tlogist-VII	3.159
SRATS	txvmin	1.218
GDS2
	Best model	PMSE
Case I (cumulative)	pareto-IV	0.488
Case I (non-cumulative)	gamma-V	0.277
Case II (cumulative)	lnorm-VII	0.399
Case II (non-cumulative)	pareto-I	0.466
Case III (cumulative): Linear regression	exp-IV	0.455
Case III (cumulative): Exponential regression	llogist-VI	0.499
Case III (non-cumulative): Linear regression	llogist-IV	0.508
SRATS	lnorm	0.531
GDS3
	Best model	PMSE
Case I (cumulative)	tnorm-II	0.326
Case I (non-cumulative)	txvmax-II	0.150
Case II (cumulative)	txvmax-IV	0.330
Case II (non-cumulative)	lxvmax-II	0.982
Case III (cumulative): Linear regression	lxvmin-I	0.340
Case III (cumulative): Exponential regression	txvmin-VI	1.484
Case III (non-cumulative): Linear regression	pareto-III	0.293
SRATS	exp	0.295
GDS4
	Best model	PMSE
Case I (cumulative)	exp-I	0.213
Case I (non-cumulative)	lxvmin-V	0.227
Case II (cumulative)	tnorm-IV	0.220
Case II (non-cumulative)	tnorm-II	0.206
Case III (cumulative): Linear regression	tlogist-II	0.207
Case III (cumulative): Exponential regression	lxvmax-III	0.273
Case III (non-cumulative): Linear regression	tlogist-VII	0.220
SRATS	gamma	0.230

Table 7. Software reliability assessment with best SRM (minimum AIC).

(i) Best proportional intensity model (cumulative metrics data)
	Model	Reliability
GDS1	tlogist-VI	$2.969 × 10^{-2}$
GDS2	tlogist-III	$9.260 × 10^{-1}$
GDS3	txvmin-II	$9.998 × 10^{-1}$
GDS4	exp-I	$5.455 × 10^{-3}$
(ii) Best proportional intensity model (non-cumulative metrics data)
GDS1	txvmin-II	$4.393 × 10^{-1}$
GDS2	llogist-II	$1.984 × 10^{-2}$
GDS3	gamma-II	$2.945 × 10^{-1}$
GDS4	exp-VI	$4.324 × 10^{-1}$
(iii) Best SRATS (no metrics data)
GDS1	tlogist	$6.977 × 10^{-5}$
GDS2	llogist	$4.152 × 10^{-3}$
GDS3	lxvmax	$7.236 × 10^{-5}$
GDS4	txvmin	$9.559 × 10^{-1}$

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
