Article

An Optimized LSTM Neural Network for Accurate Estimation of Software Development Effort

by
Anca-Elena Iordan
Department of Computer Science, Technical University of Cluj-Napoca, 400027 Cluj-Napoca, Romania
Mathematics 2024, 12(2), 200; https://doi.org/10.3390/math12020200
Submission received: 24 November 2023 / Revised: 1 January 2024 / Accepted: 3 January 2024 / Published: 8 January 2024
(This article belongs to the Special Issue Advances in Data Mining, Neural Networks and Deep Graph Learning)

Abstract

Software effort estimation has been a significant research theme in recent years. The main challenge for project managers is reaching their targets within a fixed time frame. Machine learning strategies can take software management to an entirely new stage. The purpose of this research work is to compare an optimized long short-term memory neural network, based on particle swarm optimization, with six machine learning methods used to predict software development effort: K-nearest neighbours, decision tree, random forest, gradient boosted tree, multilayer perceptron, and long short-term memory. The process of effort estimation uses five datasets: China and Desharnais, for which outputs are expressed in person-hours; and Albrecht, Kemerer, and Cocomo81, for which outputs are measured in person-months. To compare the accuracy of these intelligent methods, four metrics were used: mean absolute error, median absolute error, root mean square error, and coefficient of determination. For all five datasets, based on metric values, it was concluded that the proposed optimized long short-term memory method predicts the effort required to develop a software product more accurately. The Python 3.8.12 programming language was used in conjunction with TensorFlow 2.10.0, Keras 2.10.0, and SKlearn 1.0.1 to implement these machine learning methods.

1. Introduction

In the process of developing a software product, estimating the total effort required to obtain the product according to the initial specifications is a very complex task for the project manager. To facilitate the work of the project manager, multiple techniques specific to artificial intelligence [1] have been used to predict as accurately as possible the effort needed for software development. These techniques are rarely regarded as compelling for uncertainty administration, and the results show their improbable prediction abilities for effort estimation at underlying phases of the software lifecycle [2].
Machine learning [3] is a subdomain of computer science that gives computers the ability to act without explicit programming. Machine learning [4] deals with the study and construction of algorithms that can learn certain patterns from a set of training data, then make predictions and decisions with a completely new dataset as input. The challenge of this study is to reveal which of the six methods used (K-nearest neighbours, decision tree, random forest, gradient boosted tree, multilayer perceptron, and long short-term memory) is the most efficient for the domain of software project management.
The best results obtained by the six used intelligent methods, implemented based on the parameter tuning process, are compared with the results of other studies. In order to obtain a more accurate estimate of the effort than the existing ones, the LSTM method is improved by using particle swarm optimization which aims to optimize the weights of the LSTM neural network. The scientific contributions of this research work mainly consist of the following two aspects:
  • An improved LSTM model based on particle swarm optimization is proposed, and its superiority is proved by comparing not only with the results obtained in this study, but also with the results obtained in the analysed scientific works.
  • The optimized LSTM neural network using particle swarm optimization is innovatively used in software development effort estimation.
The direct results of this study based on the six methods previously specified can be used to simplify the tasks of project managers and increase the efficiency of the development process. To understand this research work as a whole, this article is structured as follows:
  • The introduction presents the reason for choosing the theme.
  • The second section presents a brief description of the current stage in the effort estimation process.
  • The third section contains details about the used approach for software development effort estimation.
  • The next section includes an analysis of the results provided by the six used intelligent methods, following the parameter tuning process and comparison with previous results.
  • The fifth section presents the improved version of the LSTM method based on particle swarm optimization, highlighting the superior results it provides.
  • The last section presents all the conclusions reached after the implementation of used methods.

2. Literature Survey

Over the years, software effort approximation has used approaches based on fuzzy logic, evolutionary methods, and artificial neural networks.
In paper [5], two machine learning methods were used (linear regression, K-nearest neighbours) and three versions of the Cocomo dataset. To determine which of the two analysed methods is better, the following metrics were considered: root mean square error, relative absolute error, mean absolute error, and correlation coefficient. The model proposed in the aforementioned work consisted of identifying the problem domain, scanning the data, and partitioning the data into training and testing sets, using the WEKA tool. The results show that linear regression is a better estimator than KNN.
Another version for approximation of software effort is explored in [6] using the Case-Based Reasoning method optimized by the Genetic Algorithm on seven datasets: Albrecht, Maxwell, NASA, Telecom, Kemerer, China, and Desharnais (Table 1). The main goals of the authors were to investigate the combination of the GA algorithm with CBR to find the best combination of CBR parameters in order to improve the accuracy of software effort prediction. The research methodology consisted of processing the dataset, splitting the processed data for training and testing, and computing the CBR-GA model. Based on the values obtained for the three metrics used—mean absolute error, mean balanced relative error, and mean inverted balanced error—the proposed model provides more accurate estimates, especially in the case of larger datasets.
A gradient boosting regressor model proposed in paper [7] was applied on two datasets: Cocomo81 and China. The performance of the proposed model was compared across seven models: stochastic gradient descent, K-nearest neighbours, decision tree, bagging regressor, random forest regressor, Ada-boost regressor, and gradient boosting regressor, using four metrics: mean absolute error, mean square error, root mean square error, and coefficient of determination. The gradient boosting algorithm is improved by adding the summation of the predicted results of the previous tree, and this iteration continues until the estimated accuracy is achieved. The research procedure consisted of data collection, data preprocessing, splitting the data for training and testing in a ratio of 80:20, and implementing the gradient boosting model. The study results showed that the gradient boosting regressor performs best on the two datasets for all used metrics, obtaining values such as 0.98 (Cocomo81) and 0.93 (China) for coefficient of determination and 153 (Cocomo81) and 676.6 (China) for mean absolute error.
In study [8], three methods—linear regression, random forest, and multilayer perceptron—were used to obtain an accurate estimation of software effort. These methods were used on the Desharnais dataset, and the implementation was achieved through the WEKA toolkit. The research methodology steps cover a preprocessing technique used to eliminate irrelevant and excessive attributes, splitting the data for training and testing and for developing the three chosen models. The conclusion obtained by comparing these three methods was that linear regression determines a more accurate estimate than the other two methods based on five metrics: mean absolute error, root mean square error, root relative absolute error, relative squared error, and correlation coefficient.
In the research presented in paper [9], several variants of the Cocomo dataset were used, each with a different number of selected attributes, to estimate effort based on four machine learning algorithms: linear regression, support vector machine, regression tree, and random forest. The phases of the experimental procedure include data collection, data preprocessing, data analysis, data splitting, and development of the prediction models. The experimental results revealed that the support vector machine and random forest methods provide consistent results when only five important selected attributes are used, compared to the case of using all the attributes.
A comparison between the multiple linear regression method and the expert judgement method applied on a real-time dataset obtained from a medium-sized multinational organization was made in [10]. Multiple linear regression generates better results based on the values of the used evaluation metrics.
In paper [11], the random forest method was compared with the regression tree method based on three datasets: ISBSG R8, Tukutuku, and Cocomo. The improved random forest method analyses the dependence on the number of attributes used from a dataset, observing whether the accuracy is sensitive to the considered parameters. The evaluation metrics (magnitude of relative error, mean magnitude of relative error, and median magnitude of relative error) show that the random forest technique surpasses the regression tree technique.
In study [12], decision tree methods were used to estimate development effort and the cost of software when agile principles are fulfilled. The training process was based on 10-fold cross-validation, and the combining of three learning methods (decision tree, random forest and Ada-boost regressor) led to the improvement of prediction accuracy.
Another research [13] proposed the Neural Coherent Clustered Ensemble Classifier model combined with the Optimized Satin Bowerbird model applied to seven datasets: Cocomo81, CocomoNasa93, CocomoNASA60, Desharnais, AlbrechtFPA, ChinaFPA, and Cocomo_sdr (Table 1). The Neural Gas Coherent Clustering method was used with the scope to group the datasets in a coherent approach using an index of the nearest characteristic vector as the definitive parameter. The ECOPB proposed model was evaluated using two metrics: clustering accuracy and mean magnitude of relative error.
An improved analogy-based effort estimation method was introduced in [14] through the standard deviation technique. Validation of achieving improvement was based on a magnitude of relative error metric applied to a dataset, which included information about 21 projects developed using Scrum-based agile software development collected from six different software houses in Pakistan.
In the software development domain, accurate estimation of software effort remains one of the most debated problems. Accurate estimation of the effort needed to develop a software project is fundamental to software project management. For this reason, correct estimation of software effort is a difficult research task.
Table 1. Software effort estimation—related works synthesis.
| Existing Works | Datasets | Methods | Metrics |
|---|---|---|---|
| Marapelli [5] | Cocomo | Linear regression; K-nearest neighbours | Root mean square error; Relative absolute error; Mean absolute error; Correlation coefficient |
| Hameed et al. [6] | Albrecht; Maxwell; NASA; Telecom; Kemerer; China; Desharnais | Genetic algorithm | Mean absolute error; Mean balanced relative error; Mean inverted balanced error |
| Kumar et al. [7] | Cocomo81; China | Stochastic gradient descent; K-nearest neighbours; Decision tree; Bagging regressor; Random forest regressor; Ada-boost regressor; Gradient boosting regressor | Mean absolute error; Mean square error; Root mean square error; Coefficient of determination |
| Singh et al. [8] | Desharnais | Linear regression; Random forest; Multilayer perceptron | Mean absolute error; Root mean square error; Root relative absolute error; Relative squared error; Correlation coefficient |
| Zakaria et al. [9] | Cocomo | Linear regression; Support vector machine; Regression tree; Random forest | Mean square error; Root mean square error; Mean absolute error; Mean absolute percentage error |
| Abdelali et al. [11] | ISBSG R8; Tukutuku; Cocomo | Random forest; Regression tree | Magnitude of relative error; Mean magnitude of relative error; Median magnitude of relative error |
| Sanchez et al. [12] | Projects set from Pakistan | Decision tree; Random forest; Ada-boost regressor | Mean square error; Mean relative error; Coefficient of determination; Mean magnitude of relative error |
| Resmi et al. [13] | Cocomo81; CocomoNasa93; CocomoNASA60; Desharnais; AlbrechtFPA; ChinaFPA; Cocomo_sdr | ECOPB | Clustering accuracy; Mean magnitude of relative error |
| Muhammad et al. [14] | 21 projects from Pakistan | Analogy-based effort | Magnitude of relative error |

3. Research Approach

3.1. Data Preparation

Many datasets are available for establishing the effort needed to develop a software product, such as Albrecht, Cocomo81, China, Desharnais, ISBSG, Kemerer, Kitchenham, Maxwell, Miyazaki, NASA, and Tukutuku. In the effort estimation procedure presented in this study, five datasets were used: Albrecht, Kemerer, Cocomo81, China, and Desharnais.
The number of attributes and the number of analysed projects for these five datasets are presented in Table 2. The Albrecht dataset [15] is characterized by eight attributes obtained from 24 IBM software projects, the Kemerer dataset [16] is characterized by seven attributes obtained from 15 analysed projects, and the Cocomo81 dataset [17] is characterized by 17 attributes obtained from 63 NASA software projects. The outputs of all these three datasets, representing the actual effort required to develop the software project, were in the unit person-months. The China dataset [18] was represented by 16 attributes extracted from 499 analysed projects, and the Desharnais dataset [19] was represented by 12 attributes extracted from 81 software projects accomplished in Canada. The outputs of these last two mentioned datasets, representing the actual effort required to develop the software project, were in the unit person-hours.
From the collections of attributes associated with all datasets, six attributes were used; information about them is presented in Table 3. The selection of the used input attributes was performed arbitrarily, influencing the intelligent models implemented in this study. The second column in Table 3 contains the names of the used attributes, and the third column the meaning of each attribute. The last four columns of Table 3 contain information about each of the used attribute values as follows: the fifth column specifies the minimum value of the attribute, the sixth column specifies the maximum value of the attribute, the seventh column specifies the average value of the attribute, and the eighth column specifies the standard deviation of the attribute.

3.2. Used Metrics

Accurate assessment of the performance of artificial intelligence strategies [20] is very complicated due to unbalanced data collections. The following four metrics were used to achieve the previously stated scope: mean absolute error, root mean square error, median absolute error, and coefficient of determination.
Mean absolute error [21], denoted by MAE, is the average of the absolute errors. It is defined by Equation (1).
$$\mathrm{MAE} = \frac{1}{m} \sum_{k=1}^{m} \left| x_k - x_k'' \right| \tag{1}$$
In this formula (as well as in the following three), $m$ signifies the number of data points, $x_k$ signifies the real value to be estimated, and $x_k''$ is the estimated value. Median absolute error [22], denoted by MdAE, is the median of all absolute differences between the real effort and the estimated effort, as defined in Equation (2).
$$\mathrm{MdAE} = \operatorname{median}\left( \left| x_k - x_k'' \right| \right)_{k=1}^{m} \tag{2}$$
Root mean square error [23], denoted by RMSE, measures the standard deviation of the prediction errors. It is defined by Equation (3).
$$\mathrm{RMSE} = \sqrt{\frac{1}{m} \sum_{k=1}^{m} \left( x_k - x_k'' \right)^2} \tag{3}$$
Coefficient of determination [21], denoted by CD, is defined as one minus the ratio of the sum of squared residuals to the total sum of squares; the relation is given in Equation (4).
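$$\mathrm{CD} = 1 - \frac{\sum_{k=1}^{m} \left( x_k - x_k'' \right)^2}{\sum_{k=1}^{m} \left( x_k - \bar{x} \right)^2} \tag{4}$$
where:
$$\bar{x} = \frac{1}{m} \sum_{k=1}^{m} x_k.$$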
The coefficient of determination is at most 1, and it lies between 0 and 1 for any model that predicts at least as well as the mean of the real values. In the best situation, the predicted values fit the real values exactly, resulting in a value of 1 for the coefficient of determination.
If the obtained value for the coefficient is negative, the model fits the data worse than a constant prediction equal to the mean of the real values. These four selected metrics are important performance statistics for regression models because they are easy to understand, interpretable, and reliable for evaluating prediction accuracy. The lower the values of the first three metrics, the more accurately the model predicts. The values of these four metrics were computed with the sklearn.metrics module of the Scikit-learn [24] tool.
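The four metrics can be sketched directly from Equations (1) to (4) using only the Python standard library. The function names and the effort values below are hypothetical and serve only as an illustration; in scikit-learn the corresponding functions are `mean_absolute_error`, `median_absolute_error`, `mean_squared_error`, and `r2_score` from `sklearn.metrics`.

```python
import math
import statistics


def mae(actual, predicted):
    """Mean absolute error, Equation (1)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mdae(actual, predicted):
    """Median absolute error, Equation (2)."""
    return statistics.median(abs(a - p) for a, p in zip(actual, predicted))


def rmse(actual, predicted):
    """Root mean square error, Equation (3)."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def cd(actual, predicted):
    """Coefficient of determination, Equation (4)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical effort values (not taken from any of the five datasets).
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

print(mae(actual, predicted), mdae(actual, predicted))
print(rmse(actual, predicted), cd(actual, predicted))

# A model that reverses the ordering is worse than predicting the mean,
# so its coefficient of determination is negative.
print(cd(actual, [40.0, 30.0, 20.0, 10.0]))
```

The last call illustrates why the coefficient of determination is not bounded below by 0: whenever the squared residuals exceed the total sum of squares, the value becomes negative.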

3.3. Selected Machine Learning Methods

For effort estimation, six machine learning methods were chosen: K-nearest neighbours (KNN), decision tree (DT), random forest (RF), gradient boosted tree (GBT), multilayer perceptron (MLP), and long short-term memory (LSTM).
The K-nearest neighbours method [25] is one of the simplest machine learning methods, and the fact that it produces good results in a short time has led to its wide use for both classification and regression problems. This method is based on feature similarity: how closely the feature values of a new input resemble those of the training set determines how the new input is predicted. An output value is predicted for the new input from its nearest neighbours: by majority vote among the neighbours in classification, and by averaging the neighbours' output values in regression, as is the case for effort estimation.
A decision tree [26] consists of internal nodes, branches, and leaf nodes, where each internal node represents the test for a certain attribute, each branch expresses the result of the test, and the leaf nodes represent the classes (or, in a regression tree, the predicted values). The decision tree classifier includes two stages: building the tree, and applying it as a solution model to the addressed problem. To evaluate the utility of an attribute, the notion of information gain, defined in terms of entropy, is used. Thus, the greater the information gain, the lower the entropy of the subsets produced by the split.
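The relation between entropy and information gain can be illustrated with a short sketch. The labels and candidate splits below are hypothetical, and the example covers the classification case described above; a regression tree such as the one used later in this study typically scores splits by squared error instead.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(labels, groups):
    """Entropy of the parent minus the size-weighted entropy of the groups."""
    total = len(labels)
    weighted = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - weighted


# Hypothetical labels for four projects and two candidate splits.
labels = ["low", "low", "high", "high"]
perfect_split = [["low", "low"], ["high", "high"]]   # pure subsets
useless_split = [["low", "high"], ["low", "high"]]   # subsets as mixed as the parent

print(information_gain(labels, perfect_split))
print(information_gain(labels, useless_split))
```

The split that produces pure subsets leaves no entropy in the children and therefore has the larger gain, which is why the attribute with the highest information gain is preferred at each node.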
The random forest [27] strategy consists of a collection of decision trees in which each tree is built by applying an algorithm and a random vector to the training dataset. The prediction given by this method is obtained by a majority vote over the predictions of the individual trees in classification, and by averaging those predictions in regression.
The gradient boosted tree [28] is one of the most powerful strategies for building predictive models, involving three elements: a loss function to be optimized, a weak learner to make the predictions, and an additive model to add weak learners to minimize the loss function. The loss function depends on the type of problem to be solved. Decision trees are used as weak learners in this strategy, being built in a greedy way in which the best branching points are chosen based on the purity of the scores.
Artificial neural networks are computational systems inspired by the way biological nervous systems process information. The main element of this paradigm is the structure of the information processing system, which is composed of a large number of highly interconnected processing elements that work together to solve a certain task. The learning process of biological systems involves adjustments of the synaptic connections that exist between neurons, which also happens in the case of artificial networks that learn and adjust by example. From the numerous types of artificial neural networks, the multilayer perceptron and the long short-term memory neural networks were used in this study.
MLP comprises several layers, each connected to the following one. The nodes of each layer are neurons characterized by activation functions, except for the nodes of the input layer. Between the input and the output layers there are one or more hidden layers. Backpropagation is the learning strategy that allows the multilayer perceptron to iteratively adapt the weights in the network layers in order to optimize the cost function.
LSTM [29] is a category of artificial recurrent neural network [30] used in the deep learning area. Compared with standard feed-forward neural networks, LSTM is characterized by feedback connections. Moreover, LSTM is very sensitive to the number of nodes in hidden layers, the number of training epochs, the initial learning rate, the momentum, and the dropout probability. These parameters will have a large impact on the software effort estimation performance.
The Python 3.8.12 programming language [31] together with four libraries (Keras 2.10.0 [32], TensorFlow 2.10.0, Matplotlib [33], and SKlearn 1.0.1) were chosen to implement and evaluate these six selected machine learning methods.

3.4. Development of Effort Estimation Software

To successfully achieve the proposed objectives, an intelligent software was developed whose functionalities are included in the UML use case diagram [34] shown in Figure 1. The use case diagram contains an actor (the user of the intelligent software), eleven use cases, and relationships between them.
Thereby, the software functionalities consist of:
  • Selecting a dataset from the five analysed sets used in model evaluation;
  • Parameter tuning, training, testing, and evaluating an intelligent method selected from the six intelligent methods used in this study;
  • Improving the LSTM method through particle swarm optimization;
  • Parameter tuning, training, testing, and evaluation of the optimized LSTM method;
  • Comparing the performance of the improved LSTM method with the results generated by the six intelligent chosen methods.

4. Analysis of the Six Selected Classical Intelligent Methods

In most methods of machine learning [35], the parameters represent variables used by methods to learn the data characteristics and to adjust the learning from the dataset, with the purpose of obtaining the best performance. The parameter tuning [36] concept involves obtaining the suitable parameters for every learning method, such that the predicted results are optimal.
After the parameter tuning process, it is necessary to split each dataset into a training set and a test set. For every used dataset, in order to detect the suitable percentage to train and test the data, a fitting strategy was used and a value of approximately 80% was chosen for the training set and the remainder for the test set. Table 4 shows information about the number of effort values (columns 2 and 7), the minimum effort value (columns 3 and 8), the maximum effort value (columns 4 and 9), the average effort value (columns 5 and 10) and the standard deviation (columns 6 and 11) used both in the training stage of the intelligent methods and in their testing stage.
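An approximately 80/20 split can be sketched as follows. The helper `split_dataset`, the seed, and the rounding rule are assumptions made for illustration, so the resulting sizes are not guaranteed to match the counts reported in Table 4; only the project counts per dataset are taken from the text.

```python
import random


def split_dataset(rows, train_fraction=0.8, seed=0):
    """Shuffle a copy of the rows and cut it into training and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Number of projects per dataset, as stated in Section 3.1.
project_counts = {"Albrecht": 24, "Kemerer": 15, "Cocomo81": 63,
                  "China": 499, "Desharnais": 81}

split_sizes = {}
for name, count in project_counts.items():
    # Project indices stand in for the real rows of each dataset.
    train, test = split_dataset(list(range(count)))
    split_sizes[name] = (len(train), len(test))

print(split_sizes)
```

Fixing the seed keeps the split reproducible, so that every tuned variant of a method is evaluated on the same test projects.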

4.1. K-Nearest Neighbours

The K-nearest neighbours method, having the six inputs presented in Table 3 and an output (estimated effort), was set with the following features: Minkowski metric for distance computation, leaf size equal to 30, and uniform weights. To implement this method, the KNeighborsRegressor class from the SKlearn 1.0.1 library was used. For the K-nearest neighbours method, in the parameter tuning process, the following two parameters were used to determine the performance of this method:
  • k—signifies the number of neighbours used to determine the prediction for new instances. In this study, the eight used values of this parameter vary between 3 and 10.
  • p—signifies the power parameter from the Minkowski distance [37], characterized by the following formula:
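$$d_{\mathrm{Minkowski}} = \left( \sum_{k=1}^{m} \left| x_k - x_k'' \right|^{p} \right)^{\frac{1}{p}}$$
where, in the distance formulas, $x_k$ and $x_k''$ denote the values of the $k$-th attribute of the two compared instances and $m$ is the number of attributes.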
The usefulness of the Minkowski distance within the KNN method consists in determining which neighbours will be analysed to compare their characteristics with those of a new instance for which a new prediction is determined. Thereby, distance metrics are used to calculate which neighbours have the most similar features and to choose the first K neighbours to obtain the new prediction. For this second parameter, three values between 1 and 3 were used by the KNN method to predict the effort. If the parameter p is equal to 1, the Minkowski distance is reduced to the Manhattan distance [38] given by the following formula:
$$d_{\mathrm{Manhattan}} = \sum_{k=1}^{m} \left| x_k - x_k'' \right|$$
When the parameter p is equal to 2, the Minkowski distance becomes the Euclidean distance [39], represented by the next formula:
$$d_{\mathrm{Euclidean}} = \sqrt{\sum_{k=1}^{m} \left( x_k - x_k'' \right)^2}$$
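As a quick numerical check of the three distance formulas, the sketch below evaluates the Minkowski distance for the three values of p used in the tuning process. The function name and the two attribute vectors are hypothetical.

```python
def minkowski(u, v, p):
    """Minkowski distance of order p between two equally long vectors."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1 / p)


# Two hypothetical instances described by two attributes each.
u = [0.0, 0.0]
v = [3.0, 4.0]

manhattan = minkowski(u, v, 1)   # p = 1: sum of absolute differences
euclidean = minkowski(u, v, 2)   # p = 2: straight-line distance
cubic = minkowski(u, v, 3)       # p = 3: third value used in the tuning

print(manhattan, euclidean, cubic)
```

For a fixed pair of instances the distance does not increase as p grows, so the choice of p changes which training projects count as the nearest neighbours.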
Following the used values in the parameter tuning process (eight values for parameter k and three values for parameter p), 24 variants of the KNN method were trained and tested. In Table 5, columns 3 to 5 show the values obtained for the four used metrics by these 24 variants of the KNN method: the third column shows the minimum value, the fourth column shows the maximum value, and the fifth column shows the average value for each metric. Columns 6 and 7 present the values of the parameters for which the minimum values of the four metrics were obtained.
The last three columns show information about the estimated effort by the KNN model for which optimum values of the metrics were obtained. For all five datasets, by comparing the real effort values from Table 4 with the estimated values from Table 5, it is observed that all estimated intervals are included in the intervals corresponding to the real values of the effort.
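A minimal sketch of the tuning loop described in this subsection, assuming scikit-learn and NumPy are installed. The data are synthetic (random values standing in for the six input attributes), the 80/20 split is a plain slice, and MAE alone is used to pick the best variant, so nothing here reproduces the study's datasets or the values in Table 5.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in data: 100 projects, six attributes, one effort value.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(100, 6))
y = 100.0 * X[:, 0] + 50.0 * X[:, 1] + rng.normal(0.0, 1.0, size=100)
X_train, X_test, y_train, y_test = X[:80], X[80:], y[:80], y[80:]

results = {}
for k in range(3, 11):          # eight values of k: 3..10
    for p in (1, 2, 3):         # three values of the Minkowski power p
        model = KNeighborsRegressor(n_neighbors=k, p=p, metric="minkowski",
                                    leaf_size=30, weights="uniform")
        model.fit(X_train, y_train)
        results[(k, p)] = mean_absolute_error(y_test, model.predict(X_test))

# The variant with the lowest test MAE among the 24 trained models.
best_k, best_p = min(results, key=results.get)
print(len(results), best_k, best_p)
```

The same loop structure applies to the other methods in this section by swapping the estimator and the two tuned parameters.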

4.2. Decision Tree

Characterized by the six inputs presented in Table 3 and an output (estimated effort), the DT method was designed as follows: the minimum number of samples required to split an internal node is equal to 2 and the strategy to split each node is the best split. This method was implemented through the DecisionTreeRegressor class from the SKlearn 1.0.1 library. To determine the performance of the Decision Tree [40] method applied to the five datasets, in the parameter tuning process, the following two parameters were tuned:
  • d—represents the maximum depth of the tree. In this research study, six values were used for this parameter varying between 5 and 10.
  • n—represents the maximum leaf nodes. In this research study, 10 values were used for this parameter varying between 11 and 20.
Following the used values in the parameter tuning process (six values for parameter d and 10 values for parameter n), 60 variants of the DT method were trained and tested. Table 6 shows the values provided by the 60 variants of the DT method, the meanings of the columns from Table 6 being the same as those from Table 5. For all five datasets, by comparing the real effort values from Table 4 with the estimated values from Table 6, it is observed that all estimated intervals are included in the intervals corresponding to the real values of the effort.

4.3. Random Forest

Random forest [41] is an estimator that fits a number of decision trees on various sub-samples of the dataset and uses averaging to improve the predictive accuracy. To implement the RF method, the RandomForestRegressor class from the SKlearn 1.0.1 library was used. The RF method, having the six inputs presented in Table 3 and an output (estimated effort), was set with the following features: squared error as a function to measure the quality of a split, and the value of 2 for the minimum number of samples required to split an internal node and also for the minimum number of samples required to be at a leaf node.
To determine the performance of the random forest method applied to the five datasets, in the parameter tuning process, the following two parameters were tuned:
  • d—represents the maximum depth of the tree. In this research study, six values were used for this parameter varying between 5 and 10.
  • t—represents the number of trees in the forest. In this research study, six values were used for this parameter belonging to the following set: {50, 100, 150, 200, 250, 300}.
Following the used values in the parameter tuning process (six values for parameter d and six values for parameter t), 36 variants of the RF method were trained and tested. Table 7 shows the values provided by the 36 variants of the RF method, the meanings of the columns from Table 7 being the same as those from Table 5.
For all five datasets, by comparing the real effort values from Table 4 with the estimated values from Table 7, it is observed that all estimated intervals are included in the intervals corresponding to the real values of the effort.

4.4. Gradient Boosted Tree

The gradient boosted tree [42] estimator builds an additive model in a forward stepwise manner, allowing the optimization of arbitrary differentiable loss functions. In each step, a regression tree is fitted on the negative gradient of the given loss function. For the implementation of the GBT method, the GradientBoostingRegressor class from the SKlearn 1.0.1 library was used. The GBT method, characterized by the six inputs presented in Table 3 and an output (estimated effort), has been designed with the following features: squared error as the loss function to be optimized, the friedman_mse function to measure the quality of a split, the value of 3 as the minimum number of samples required to split an internal node, and the value of 200 for the number of boosting stages to perform.
To determine the performance of the gradient boosted tree method applied to the five datasets, in the parameter tuning process, the following two parameters were tuned:
  • d—represents the maximum depth of the individual regression estimators. In this research study, five values were used for this parameter varying between 1 and 5.
  • l—represents the learning rate which shrinks the contribution of each tree. In this research study, five values were used for this parameter belonging to the following set: {0.05, 0.1, 0.15, 0.2, 0.25}.
Following the used values in the parameter tuning process (five values for parameter d and five values for parameter l), 25 variants of the GBT method were trained and tested. Table 8 shows the values provided by the 25 variants of the GBT method, the meanings of the columns from Table 8 being the same as those from Table 5. For all five datasets, by comparing the real effort values from Table 4 with the estimated values from Table 8, it is observed that all estimated intervals are included in the intervals corresponding to the real values of the effort.
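The forward stepwise idea behind gradient boosting can be sketched from scratch for the squared error loss, where the negative gradient is simply the current residual. This is an illustrative toy, not the GradientBoostingRegressor implementation: it uses depth-1 trees (stumps) on a single hypothetical attribute, and the helper names and data are assumptions.

```python
def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree: one threshold, one constant per side."""
    best = None
    for threshold in sorted(set(xs))[:-1]:   # keep both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_stages=200, learning_rate=0.1):
    """Additive model: mean prediction plus shrunken stumps fitted to residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    predictions = [base] * len(xs)
    for _ in range(n_stages):
        # For squared error, the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x)
                       for x, p in zip(xs, predictions)]

    def predict(x):
        return base + learning_rate * sum(s(x) for s in stumps)

    return predict


def mse(model_fn, xs, ys):
    return sum((y - model_fn(x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Hypothetical one-attribute data (e.g. size versus effort).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [5.0, 7.0, 9.0, 20.0, 22.0, 24.0]

model = gradient_boost(xs, ys)
mse_before = mse(lambda x: sum(ys) / len(ys), xs, ys)  # constant mean model
mse_after = mse(model, xs, ys)
print(mse_before, mse_after)
```

The learning rate plays the role of the tuned parameter l: it shrinks the contribution of each tree, so smaller values need more boosting stages to reach the same training error.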

4.5. Multilayer Perceptron

To implement the MLP method, the MLPRegressor class from the SKlearn 1.0.1 library was used. The MLP method, characterized by the six inputs presented in Table 3 and an output (estimated effort), has been designed with the following features: three hidden layers each with 100 neurons, the relu activation function for the hidden layers, and the Adam solver for weight optimization. To determine the performance of the multilayer perceptron [43] method applied to the five datasets, in the parameter tuning process, the following two parameters were tuned:
  • t—represents the maximum number of iterations. The values of this parameter are represented by the elements of the set {100, 200, 300, 400, 500, 600, 700, 800, 900, 1000}.
  • l—represents the initial learning rate. In this research study, the values of this parameter are represented by the elements of the set {0.002, 0.003, 0.004, 0.005, 0.006}.
Following the used values in the parameter tuning process (10 values for parameter t and five values for parameter l), 50 variants of the MLP method were trained and tested. Table 9 shows the values provided by the 50 variants of the MLP method, the meanings of the columns from Table 9 being the same as those from Table 5.
For all five datasets, by comparing the real effort values from Table 4 with the estimated values from Table 9, it is observed that all estimated intervals are included in the intervals corresponding to the real values of the effort.
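As a rough illustration of the MLP setup described in this subsection, the sketch below enumerates the 50 (t, l) combinations and maps them onto `MLPRegressor` arguments. The helper names are hypothetical, and the choice of `max_iter` for t and `learning_rate_init` for l is an assumption based on the parameter descriptions given above.

```python
from itertools import product

# Tuned values quoted in the text: maximum iterations t and initial learning rate l.
MAX_ITERS = list(range(100, 1001, 100))
INITIAL_RATES = [0.002, 0.003, 0.004, 0.005, 0.006]


def mlp_variants():
    """Return one keyword-argument dict per (t, l) combination."""
    fixed = {
        "hidden_layer_sizes": (100, 100, 100),  # three hidden layers of 100 neurons
        "activation": "relu",
        "solver": "adam",
    }
    return [
        dict(fixed, max_iter=t, learning_rate_init=l)
        for t, l in product(MAX_ITERS, INITIAL_RATES)
    ]


def build_mlp(params):
    """Instantiate the regressor; SKlearn is imported only when needed."""
    from sklearn.neural_network import MLPRegressor

    return MLPRegressor(**params)
```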

4.6. Long Short-Term Memory

Long short-term memory (LSTM) [44] belongs to the category of recurrent neural networks and adds a gate structure to the recurrent architecture. Whereas a traditional neural network unit is characterized by only one input, an LSTM unit has two input sets: the current information and the output vector provided by the previous unit, while the complex processes associated with the LSTM cell are carried by the unit state. The essence of LSTM lies in the hidden layer, which contains memory blocks instead of simple nodes. These blocks, called memory cells, use three separate gates to regulate the flow and modification of information: a forget gate, an input gate, and an output gate. The forget gate chooses which information is retained and which is discarded. The input gate determines which information is stored internally, ensuring that critical information can be saved. The output gate determines the output value and controls the part of the current LSTM state that is passed on through the activation function.
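The role of the three gates can be made concrete with a single time step of one memory cell. The sketch below uses the standard LSTM update equations in plain Python; the function name `lstm_step` and the parameter dictionary layout (`Wf`, `Wi`, `Wg`, `Wo` and matching biases) are illustrative assumptions, and the hidden size is fixed at 1 to keep the example short.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(weights, values):
    return sum(w * v for w, v in zip(weights, values))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single LSTM memory cell (hidden size 1).

    x is the current input vector, h_prev the previous output and
    c_prev the previous unit (cell) state.
    """
    z = list(x) + [h_prev]  # the two input sets: current input and previous output
    f = sigmoid(dot(params["Wf"], z) + params["bf"])    # forget gate: what to keep of c_prev
    i = sigmoid(dot(params["Wi"], z) + params["bi"])    # input gate: how much new information to store
    g = math.tanh(dot(params["Wg"], z) + params["bg"])  # candidate information
    o = sigmoid(dot(params["Wo"], z) + params["bo"])    # output gate: what part of the state to emit
    c = f * c_prev + i * g   # updated unit state
    h = o * math.tanh(c)     # output passed to the next step
    return h, c
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate to 0, so the step simply halves the previous unit state.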
In order to implement the LSTM method, an instance of the LSTM class defined in Keras.layers was used. The LSTM method, characterized by the six inputs presented in Table 3 and an output (estimated effort), has been designed with the following features: hyperbolic tangent as activation function, sigmoid as recurrent activation function, and a value of 0.5 for dropout probability. To determine the performance of the LSTM method applied to five datasets using an Adam optimizer, in the parameter tuning process the following two parameters were tuned:
  • e—represents the number of training epochs. The values of this parameter are represented by the elements of the set {100, 200, 300, 400, 500, 600, 700, 800, 900, 1000}.
  • n—represents the number of neurons in the LSTM hidden layer. The values of this parameter belong to the set {25, 50, 75, 100}.
With 10 values for parameter e and four values for parameter n, 40 variants of the LSTM method were trained and tested. Table 10 shows the values provided by these 40 variants; the columns of Table 10 have the same meanings as those of Table 5. For all five datasets, comparing the real effort values from Table 4 with the estimated values from Table 10 shows that all estimated intervals are included in the intervals corresponding to the real values of the effort.
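A possible Keras formulation of the LSTM configuration described above is sketched below. Several details are assumptions that the text does not state: the six inputs are treated as a single time step of six features, the dropout probability is passed through the `dropout` argument of the LSTM layer, a single dense output neuron produces the effort estimate, and mean squared error is used as the training loss. The helper names are hypothetical.

```python
from itertools import product

# Tuned values quoted in the text: training epochs e and hidden neurons n.
EPOCHS = list(range(100, 1001, 100))
UNITS = [25, 50, 75, 100]


def lstm_variants():
    """Return one (epochs, units) setting per combination."""
    return [{"epochs": e, "units": n} for e, n in product(EPOCHS, UNITS)]


def build_lstm(units, n_features=6, timesteps=1):
    """Build the network; TensorFlow is imported only when needed."""
    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(timesteps, n_features)),  # assumed input layout
        tf.keras.layers.LSTM(
            units,
            activation="tanh",
            recurrent_activation="sigmoid",
            dropout=0.5,
        ),
        tf.keras.layers.Dense(1),  # estimated effort
    ])
    model.compile(optimizer="adam", loss="mse")  # loss is an assumption
    return model
```

Under these assumptions, a variant would be trained with `build_lstm(v["units"]).fit(X, y, epochs=v["epochs"])`, where `X` has shape `(samples, 1, 6)`.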

4.7. Comparative Analysis with Previous Works

In the case of the Albrecht dataset, the minimum value obtained for the MAE metric is 4.870, the minimum value obtained for the MdAE metric is 2.911, and the minimum value obtained for the RMSE metric is 6.560, all three values being obtained using the LSTM method.
Thus, this method provides the most accurate estimate for the Albrecht dataset among the six analysed methods. Comparing the results obtained for the Albrecht dataset with the value of 7.742 (Table 11) for the MAE metric presented in paper [6], it is observed that the optimal variants of the DT, RF, GBT, MLP, and LSTM methods provide lower values and therefore better estimates of the effort. Only in the case of the KNN method was a value greater than 7.742 obtained.
For the Kemerer dataset, the minimum value obtained for the MAE metric is 42.094, the minimum value obtained for the MdAE metric is 26.206, the minimum value obtained for the RMSE metric is 57.560, and the maximum value for the CD metric is 0.493, all these values also being obtained using the LSTM method. Therefore, the LSTM method provides the most accurate estimate for the Kemerer dataset among the six analysed methods. Comparing the results obtained for the Kemerer dataset with the value 138.911 (Table 11) for the MAE metric presented in paper [6], it is observed that the optimal variants of all six analysed methods provide lower values for the MAE metric and thus better estimates of the effort.
In the case of the Cocomo81 dataset, the minimum value obtained for the MAE metric is 178.051, the minimum value obtained for the MdAE metric is 30.642, the minimum value obtained for the RMSE metric is 245.153, and the maximum value for the CD metric is 0.897, all these values being obtained using the LSTM method. Thus, this method provides the most accurate estimate for the Cocomo81 dataset among the six analysed methods. Comparing the results obtained for the Cocomo81 dataset with the value 928.3318 (Table 11) for the MAE metric and with the value 2278.87 for the RMSE metric presented in paper [9], it is observed that the optimal variants of all six analysed methods provide lower values and thus better estimates of the effort. Comparing the results obtained for the Cocomo81 dataset with the value 255.2615 (Table 11) for the MAE metric presented in paper [5], it is observed that the optimal variants of the KNN, RF, GBT, MLP, and LSTM methods provide lower values and thus better estimates of the effort. Only in the case of the DT method was a value greater than 255.2615 obtained. Comparing the results obtained for the Cocomo81 dataset with the value of 533.4206 for the RMSE metric presented in paper [5], it is observed that the optimal variants of all six analysed methods provide lower values and thus better estimates of the effort. Comparing the minimum value 178.051 obtained for the MAE metric through the LSTM method with the value 153 (Table 11) presented in paper [7], the minimum value 245.153 obtained for the RMSE metric through the LSTM method with the value 228.7 presented in paper [7], and the maximum value 0.897 obtained for the CD metric through the LSTM method with the value 0.98 presented in the same paper, it can be concluded that the model presented in [7] is more accurate than the LSTM method used in this research work.
For the China dataset, the minimum value obtained for the MAE metric is 865.122, the minimum value obtained for the MdAE metric is 272.353, the minimum value obtained for the RMSE metric is 2034.574, and the maximum value for the CD metric is 0.902, all these values being obtained using the LSTM method. Thus, this method provides the most accurate estimate for the China dataset among the six analysed methods. Comparing the results obtained for the China dataset with the value 926.182 (Table 11) for the MAE metric presented in paper [6], it is observed that only the LSTM method provides a lower value and thus a better estimate of the effort. In the case of the KNN, DT, RF, GBT, and MLP methods, values higher than 926.182 were obtained, so these methods are less accurate. Comparing the minimum value 865.122 obtained for the MAE metric by means of the LSTM method with the value 676.6 (Table 11) presented in paper [7], the minimum value 2034.574 obtained for the RMSE metric by means of the LSTM method with the value 1803.3 presented in paper [7], and the maximum value 0.902 obtained for the CD metric through the LSTM method with the value 0.93 presented in the same paper, it can be concluded that the model presented in [7] is more accurate than the LSTM method used in this research paper for the China dataset.
In the case of the Desharnais dataset, the minimum value obtained for the MAE metric is 1404.571, the minimum value obtained for the MdAE metric is 880.121, and the minimum value obtained for the RMSE metric is 1847.561, the first two values being obtained using the LSTM method and the last one using the KNN method. For the CD metric, the maximum value 0.662 was obtained using the LSTM method. Comparing the results obtained for the Desharnais dataset with the value 2244.675 (Table 11) for the MAE metric presented in paper [6], it is observed that the optimal variants of all six analysed methods provide lower values and therefore better estimates of the effort. Moreover, comparing the results obtained for the Desharnais dataset with the value 2013.79 for the MAE metric presented in work [8] and with the value 2824.57 for the RMSE metric presented in the same work, it is observed that the optimal variants of all six analysed methods provide lower values and thus better estimates of the effort.
As can be seen from the second paragraph, the research methodologies of the five works [5,6,7,8,9] used for comparison are similar to the methodological process used in this study, but with different percentages used when dividing the data for training and testing. After comparing the results obtained with the values presented in the research works selected for comparison, it can be observed that for the Albrecht, Kemerer, and Desharnais datasets, the LSTM method provides better estimates of the effort. Because the architecture of the LSTM method presented in this paper does not provide satisfactory results for the Cocomo81 and China datasets in comparison with the results obtained by the model presented in paper [7], further research should be carried out to improve the LSTM method.
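For reference, the four metrics compared throughout this section follow their standard definitions, which can be sketched in plain Python as below. The function names are illustrative; SKlearn provides equivalent functions in `sklearn.metrics` (`mean_absolute_error`, `median_absolute_error`, `mean_squared_error`, and `r2_score`).

```python
import math
from statistics import mean, median


def mae(y_true, y_pred):
    """Mean absolute error."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def mdae(y_true, y_pred):
    """Median absolute error."""
    return median(abs(t - p) for t, p in zip(y_true, y_pred))


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def cd(y_true, y_pred):
    """Coefficient of determination (R squared)."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot
```

Lower values of MAE, MdAE, and RMSE indicate a better estimate, whereas for CD a value closer to 1 is better. MdAE is less sensitive than MAE to a few very large errors, which is why the two can favour different model variants.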

5. Optimized LSTM Based on Particle Swarm Optimization

Particle swarm optimization (PSO) [45] is an optimization method from the field of computational intelligence, derived from the study of bird predation behaviour. Every particle in the PSO method corresponds to a possible solution of the problem and is characterized by three attributes: velocity, position, and fitness. PSO evaluates the fitness of the particles while continuously updating their positions and velocities during the iterative process, with the aim of reaching the global optimum. The PSO method is characterized by a fast search speed, easy convergence, and great efficiency. For these reasons, the PSO method combined with LSTM has been applied in many fields, such as stock forecasting [46], smart agriculture [47], financial risk [48], and teaching quality [49].
In the case of the standard LSTM method, the values of some hyperparameters, such as the initial learning rate, the dropout probability, and the momentum, must be set manually. The choice of suitable values for these parameters is based on the experience of the researchers. The weight of the LSTM hidden layer represents the input of the particle swarm. The initial output error of the LSTM is used as the fitness of the particle swarm, then the particle performance is analysed according to condition. Starting from a random initial swarm, each particle updates its parameters according to the individual extremum (its own best position so far) and the global extremum (the best position found by the whole swarm).
For a better initialization of these three hyperparameters, in this research paper the advantages of the PSO method are combined with the LSTM method. The activities performed within the optimized LSTM method based on PSO are shown in the UML activity diagram drawn in Figure 2. It can be observed that after the activity of establishing the training set and before the training of the LSTM method, the PSO method is used to establish the values of the previously mentioned hyperparameters. The initial values of the PSO method parameters are 10 for the population size, 50 for the number of iterations, and 1.5 for the two acceleration factors, and the range of optimized LSTM hyperparameters is shown in Table 12.
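The PSO search over the three hyperparameters can be illustrated with the minimal sketch below, which uses the population size (10), number of iterations (50), and acceleration factors (1.5) quoted above. Everything else is an assumption made for illustration: the inertia weight `w`, the search ranges in `BOUNDS` (the actual ranges are those of Table 12), and the `toy_fitness` function, which stands in for the LSTM output error that the method actually uses as fitness.

```python
import random

# Hypothetical search ranges for (learning rate, dropout, momentum).
BOUNDS = [(0.001, 0.1), (0.1, 0.6), (0.5, 0.99)]


def toy_fitness(params):
    """Stand-in for the LSTM error; lower is better."""
    lr, dropout, momentum = params
    return (lr - 0.01) ** 2 + (dropout - 0.3) ** 2 + (momentum - 0.9) ** 2


def pso(fitness, bounds, n_particles=10, n_iter=50, c1=1.5, c2=1.5, w=0.7, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                # individual extrema
    pbest_fit = [fitness(p) for p in pos]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]  # global extremum
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep the particle inside the search range.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            fit = fitness(pos[i])
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit
```

In the optimized LSTM method, the position returned by such a search would supply the hyperparameter values used when the LSTM is subsequently trained.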
To estimate the software effort using the optimized LSTM method, the same two parameters (number of training epochs and number of neurons in the LSTM hidden layer) are used in the tuning process as in the case of the standard LSTM method, their values belonging to the same sets. Table 13 shows the values provided by the 40 variants of the optimized LSTM method, the meanings of the columns from Table 13 being the same as those from Table 5.
The real effort values provided by the Albrecht test dataset are between 2.9 and 102.4 (Table 4). The range of effort values estimated by the model provided by the minimum values of MAE and MdAE metrics and by the maximum value of CD metric is determined by the values 7.879 and 102.378 (Table 13). The range of effort values estimated by the model provided by the minimum values of the RMSE metric is determined by the values 7.808 and 100.381.
The real effort values provided by the Kemerer test dataset are between 72 and 287 (Table 4). The range of effort values estimated by the model provided by the minimum values of the MAE metric and by the maximum value of the CD metric is determined by the values 72.475 and 281.969 (Table 13). The range of effort values estimated by the model provided by the minimum values of the MdAE metric is determined by the values 79.445 and 280.277. The range of effort values estimated by the model provided by the minimum values of the RMSE metric is determined by the values 75.541 and 274.121.
Figure 3 shows a graphical representation of the predicted values for the software effort by optimized LSTM method in the case of the minimum values of the metrics relative to the real effort specific to each of the five datasets.
Comparing the minimum value 133.096 (Table 13) obtained for the MAE metric by means of the optimized LSTM method with the value 153 (Table 11) presented in the paper [7], the minimum value 217.794 obtained for the RMSE metric by means of the optimized LSTM method with the value 228.7 presented in the paper [7], and the maximum value 0.986 obtained for the CD metric by means of the optimized LSTM method with the value 0.98 presented in the same paper, it can be concluded that the optimized LSTM method presented in this study is more efficient than the model presented in [7] for the Cocomo81 dataset. The real effort values provided by the Cocomo81 test dataset are between 6 and 2040 (Table 4). The range of effort values estimated by the model provided by the minimum values of the MAE and RMSE metrics and by the maximum value of the CD metric is determined by the values 42.699 and 1996.398 (Table 13). The range of effort values estimated by the model provided by the minimum values of the MdAE metric is determined by the values 34.991 and 1239.811.
Comparing the minimum value 331.089 (Table 13) obtained for the MAE metric by means of the optimized LSTM method with the value 676.6 (Table 11) presented in the paper [7], the minimum value 873.102 obtained for the RMSE metric by means of the optimized LSTM method with the value 1803.3 presented in the paper [7], and the maximum value 0.951 obtained for the CD metric by means of the optimized LSTM method with the value 0.93 presented in the same paper, it can be concluded that the optimized LSTM method presented in this study is more efficient than the model presented in [7] for the China dataset. The real effort values provided by the China test dataset are between 89 and 49,034 (Table 4). The range of effort values estimated by the model provided by the minimum values of the MAE and MdAE metrics and by the maximum value of the CD metric is determined by the values 136.281 and 40,989.973 (Table 13).
The range of effort values estimated by the model provided by the minimum values of the RMSE metric is determined by the values 116.507 and 42,016.782. The real effort values provided by the Desharnais test dataset are between 546 and 14,987 (Table 4). The range of effort values estimated by the model provided by the minimum values of the MAE and RMSE metrics and by the maximum value of the CD metric is determined by the values 1369.367 and 14,461.862 (Table 13). The range of effort values estimated by the model provided by the minimum values of the MdAE metric is determined by the values 667.482 and 7990.067. For all five datasets, by comparing the real effort values from Table 4 with the estimated values from Table 13, it is observed that all estimated intervals are included in the intervals corresponding to the real values of the effort. Table 14 presents a synthesis of the optimal values obtained for all four used metrics in the case of all the intelligent methods used in this study.
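The interval inclusion claim can be checked directly from the figures quoted in the text. The sketch below uses the real test ranges from Table 4 and, for each dataset, the estimated range of the model with the minimum MAE from Table 13; the names `RANGES` and `is_included` are illustrative.

```python
# dataset: ((real_min, real_max), (estimated_min, estimated_max))
RANGES = {
    "Albrecht": ((2.9, 102.4), (7.879, 102.378)),
    "Kemerer": ((72.0, 287.0), (72.475, 281.969)),
    "Cocomo81": ((6.0, 2040.0), (42.699, 1996.398)),
    "China": ((89.0, 49034.0), (136.281, 40989.973)),
    "Desharnais": ((546.0, 14987.0), (1369.367, 14461.862)),
}


def is_included(real, estimated):
    """True when the estimated interval lies inside the real interval."""
    return real[0] <= estimated[0] and estimated[1] <= real[1]


included = {name: is_included(real, est) for name, (real, est) in RANGES.items()}
```

The same check applies unchanged to the ranges reported for the minimum MdAE and minimum RMSE models.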
From the comparison of the values of the metrics obtained (Table 11 and Table 14), it is concluded that the optimized LSTM method provides a more efficient option for estimating the software effort.

6. Conclusions

Estimating the effort required to develop software products is one of the most vexing problems for software project managers because it affects the status of the projects in terms of success or failure. This research paper proposes an optimized LSTM neural network method which uses the PSO method to optimize three hyperparameters of LSTM with the aim of obtaining the most accurate estimate of the effort required to develop a software product. The results obtained by the optimized LSTM method were compared with the results provided by six other prediction methods (KNN, DT, RF, GBT, MLP, and LSTM) applied to five datasets: Albrecht, Kemerer, Cocomo81, China, and Desharnais. The superiority of the optimized LSTM method over the other six prediction methods is shown by the lower values obtained for the MAE, MdAE, and RMSE metrics and the higher values obtained for the CD metric.
There were some limitations, especially for small datasets such as Albrecht and Kemerer, which have a small number of observations and low dimensionality. Future work will be needed to investigate the reason for this discrepancy and to identify which optimization methods are suited to small datasets. In addition, as a further direction, optimization of the LSTM method should be attempted using other techniques, with the purpose of obtaining a better estimate of the effort on the existing datasets.
Software developers can use the results of this work to select the best models for predicting the development effort of software products before they are developed.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are openly available: the Albrecht dataset at https://doi.org/10.1109/TSE.1983.235271 [15]; the Kemerer dataset at https://zenodo.org/record/268464 [16]; the Cocomo81 dataset at https://doi.org/10.1109/TSE.1984.5010193 [17]; the China dataset at https://zenodo.org/record/268446 [18]; and the Desharnais dataset in [19].

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Panoiu, M.; Panoiu, C.; Mezinescu, S.; Militaru, G.; Baciu, I. Machines Learning Techniques Applied to the Harmonic Analysis of Railway Power Supply. Mathematics 2023, 11, 1381.
  2. Walter, B.; Jolevski, I.; Garnizov, I.; Arsovic, A. Supporting Product Management Lifecycle with Common Best Practices. In Systems, Software and Services Process Improvement; Springer: Grenoble, France, 2023; pp. 207–215.
  3. Muscalagiu, I.; Popa, H.E.; Negru, V. Improving the Performances of Asynchronous Search Algorithms in Scale-Free Networks using the Nogood Processor Technique. Comput. Inform. 2015, 34, 254–274.
  4. Iordan, A.E. Supervised learning use to acquire knowledge from 2D analytic geometry problems. In Recent Challenges in Intelligent Information and Database Systems; Springer: Singapore, 2022; pp. 189–200.
  5. Marapelli, B. Software Development Effort and Cost Estimation using Linear Regression and K-Nearest Neighbours Machine Learning Algorithms. Int. J. Innov. Technol. Explor. Eng. 2019, 9, 2278–3075.
  6. Hameed, S.; Elsheikh, Y.; Azzeh, M. An Optimized Case-Based Software Project Effort Estimation Using Genetic Algorithm. Inf. Softw. Technol. 2023, 153, 107088.
  7. Kumar, P.S.; Behera, H.; Nayak, J.; Naik, B. A Pragmatic Ensemble Learning Approach for Effective Software Effort Estimation. Innov. Syst. Softw. Eng. 2022, 18, 283–299.
  8. Singh, A.J.; Kumar, M. Comparative Analysis on Prediction of Software Effort Estimation using Machine Learning Techniques. In Proceedings of the International Conference on Intelligent Communication and Computational Research, Punjab, India, 25 January 2020.
  9. Zakaria, N.A.; Ismail, A.R.; Ali, A.Y.; Khalid, N.H.; Abidin, N.Z. Software Project Estimation with Machine Learning. Int. J. Adv. Comput. Sci. Appl. 2021, 12, 726–734.
  10. Fedotova, O.; Teixeira, L.; Alvelos, H. Software Effort Estimation with Multiple Linear Regression: Review and Practical Application. J. Inf. Sci. Eng. 2018, 29, 925–945.
  11. Abdelali, Z.; Mustapha, H.; Abdelwahed, N. Investigating the Use of Random Forest in Software Effort Estimation. Procedia Comput. Sci. 2019, 148, 343–352.
  12. Sanchez, E.R.; Santacruz, E.F.V.; Maceda, H.C. Effort and Cost Estimation Using Decision Tree Techniques and Story Points in Agile Software Development. Mathematics 2023, 11, 1477.
  13. Resmi, V.; Anitha, K.L.; Narasimha Murthy, G.K. Optimized Satin Bowerbird for Software Project Effort Estimation. Eur. Chem. Bull. 2023, 12, 410–423.
  14. Muhammad, A.L.; Khalid, M.K.; Hani, U. Using Standard Deviation with Analogy-Based Estimation for Improved Software Effort Prediction. KSII Trans. Internet Inf. Syst. 2023, 17, 1356–1375.
  15. Albrecht, A.J.; Gaffney, J.E. Software Function, Source Lines of Code, and Development Effort Prediction: A Software Science Validation. IEEE Trans. Softw. Eng. 1983, 9, 639–648.
  16. Zenodo. Kemerer. Available online: https://zenodo.org/record/268464 (accessed on 11 July 2023).
  17. Boehm, B.W. Software Engineering Economics. IEEE Trans. Softw. Eng. 1984, 10, 4–21.
  18. Zenodo. China: Effort Estimation Dataset. Available online: https://zenodo.org/record/268446 (accessed on 15 July 2023).
  19. Desharnais, J.M. Analyse Statistique de la Productivitie des Projets Informatique a Partie de la Technique des Point des Function. Master’s Thesis, University of Montreal, Montréal, QC, Canada, 1999.
  20. Panoiu, M.; Panoiu, C.; Iordan, A.; Ghiormez, L. Artificial Neural Networks in Predicting Current in Electric Arc Furnaces. IOP Conf. Ser. Mater. Sci. Eng. 2014, 57, 012011.
  21. Handelman, G.S.; Kok, H.K.; Chandra, R.; Razavi, A.; Huang, S.; Brooks, M.; Lee, M.; Asadi, H. Peering into the Black Box of Artificial Intelligence: Evaluation Metrics of Machine Learning Methods. Am. J. Roentgenol. 2019, 212, 38–43.
  22. Botchkarev, A. Performance Metrics in Machine Learning Regression, Forecasting and Prognostics: Properties and Topology. Interdiscip. J. Inf. Knowl. Manag. 2019, 14, 45–79.
  23. Hossin, M.; Sulaiman, M.N. A Review on Evaluation Metrics for Data Classification Evaluations. Int. J. Data Min. Knowl. Manag. Process 2015, 5, 1–12.
  24. Hackeling, G. Mastering Machine Learning with Scikit-Learn; Packt Publishing Ltd.: Birmingham, UK, 2018.
  25. Covaciu, F.; Pisla, A.; Iordan, A.E. Development of a Virtual Reality Simulator for an Intelligent Robotic System Used in Ankle Rehabilitation. Sensors 2021, 21, 1537.
  26. Patel, H.; Prajapati, P. Study and Analysis of Decision Tree Based Classification Algorithms. Int. J. Comput. Sci. Eng. 2018, 6, 74–78.
  27. Spoon, K.; Beemer, J.; Whitmer, J.; Fan, J.; Frazee, J.; Stronach, J.; Bohonak, A.; Levine, R. Random Forests for Evaluating Pedagogy and Informing Personalized Learning. J. Educ. Data Min. 2016, 8, 20–50.
  28. Castro-Martín, L.; Mar Rueda, M.; Ferri-García, R.; Hernando-Tamayo, C. On the Use of Gradient Boosting Methods to Improve the Estimation with Data Obtained with Self-Selection Procedures. Mathematics 2021, 9, 2991.
  29. Iordan, A.E. Usage of Stacked Long Short-Term Memory for Recognition of 3D Analytic Geometry Elements. In Proceedings of the International Conference on Agents and Artificial Intelligence, Lisbon, Portugal, 3–5 February 2022.
  30. Alamia, A.; Gauducheau, V.; Paisios, D.; VanRullen, R. Comparing Feedforward and Recurrent Neural Network Architectures with Human Behavior in Artificial Grammar Learning. Sci. Rep. 2020, 10, 22172.
  31. Awar, N.; Zhu, S.; Biros, G.; Gligoric, M. A performance portability framework for Python. In Proceedings of the ACM International Conference on Supercomputing, New York, NY, USA, 14–18 June 2021.
  32. Ullo, S.L.; Del Rosso, M.P.; Sebastianelli, A.; Puglisi, E.; Bernardi, M.L.; Cimitile, M. How to develop your network with Python and Keras. Artif. Intell. Appl. Satell.-Based Remote Sens. Data Earth Obs. 2021, 98, 131–158.
  33. Hunt, J. Introduction to Matplotlib. In Advanced Guide to Python 3 Programming; Springer: Cham, Switzerland, 2019; Volume 5, pp. 35–42.
  34. Iordan, A.E.; Covaciu, F. Improving design of a triangle geometry computer application using a creational pattern. Acta Tech. Napoc. Appl. Math. Mech. Eng. 2020, 63, 73–78.
  35. Covaciu, F.; Crisan, N.; Vaida, C.; Andras, I.; Pusca, A.; Gherman, B.; Radu, C.; Tucan, P.; Hajjar, N.A.; Pisla, D. Integration of Virtual Reality in the Control System of an Innovative Medical Robot for Single-Incision Laparoscopic Surgery. Sensors 2023, 23, 5400.
  36. Mabayoje, A.; Balogun, A.; Hajarah, H.; Atoyebi, J.; Mojeed, H.; Adeyemo, V. Parameter tuning in KNN for software defect prediction: An empirical analysis. J. Teknol. Sist. Komput. 2019, 7, 121–126.
  37. Kumbure, M.; Luukka, P. A generalized fuzzy k-nearest neighbor regression model based on Minkowski distance. Granul. Comput. 2022, 7, 657–671.
  38. Uyanik, B.; Orman, G.K. A Manhattan distance based hybrid recommendation system. Int. J. Appl. Math. Electron. Comput. 2023, 11, 20–29.
  39. Iordan, A.E. Optimal Solution of the Guarini Puzzle Extension using Tripartite Graphs. IOP Conf. Ser. Mater. Sci. Eng. 2019, 477, 012046.
  40. Roshanski, I.; Kalech, M.; Rokach, L. Automatic Feature Engineering for Learning Compact Decision Trees. Expert Syst. Appl. 2023, 229, 120470.
  41. Yu, Y.; Wang, L.; Huang, H.; Yang, W. An Improved Random Forest Algorithm. J. Phys. Conf. Ser. 2020, 1646, 012070.
  42. Xia, Y.; Chen, J. Traffic Flow Forecasting Method based on Gradient Boosting Decision Tree. Adv. Eng. Res. 2017, 130, 413–416.
  43. Han, Y.; Zhang, Z.; Kobe, F. The Hybrid of Multilayer Perceptrons: A New Geostatistical Tool to Generate High-Resolution Climate Maps in Developing Countries. Mathematics 2023, 11, 1239.
  44. Hsieh, S.C. Tourism demand forecasting based on an LSTM network and its variants. Algorithms 2021, 14, 243.
  45. Higashitani, M.; Ishigame, A.; Yasuda, K. Particle swarm optimization considering the concept of predator-prey behavior. In Proceedings of the IEEE International Conference on Evolutionary Computation, Vancouver, BC, Canada, 16–21 July 2006; pp. 434–437.
  46. Lv, L.; Kong, W.; Qi, J.; Zhang, J. An improved long short-term memory neural network for stock forecast. MATEC Web Conf. 2018, 232, 01024.
  47. Zheng, C.; Li, H. The Prediction of Collective Economic Development based on the PSO-LSTM Model in Smart Agriculture. PeerJ Comput. Sci. 2023, 9, 1304.
  48. Chen, X.; Long, Z. E-Commerce Enterprises Financial Risk Prediction Based on FA-PSO-LSTM Neural Network Deep Learning Model. Sustainability 2023, 15, 5882.
  49. Qu, Z.; Yin, J. Optimized LSTM Networks with Improved PSO for the Teaching Quality Evaluation Model of Physical Education. Int. Trans. Electr. Energy Syst. 2022, 2022, 8743694.
Figure 1. UML use case diagram.
Figure 2. UML activity diagram.
Figure 3. Effort estimated by optimized LSTM method compared to the real effort. (a) Albrecht dataset; (b) Kemerer dataset; (c) China dataset; (d) Cocomo81 dataset; (e) Desharnais dataset.
Table 2. Datasets dimensions.
| Datasets | Projects | Input Attributes | Output Attribute | Output Unit |
|---|---|---|---|---|
| Albrecht | 24 | 7 | 1 | Person-months |
| Kemerer | 15 | 6 | 1 | Person-months |
| Cocomo81 | 63 | 16 | 1 | Person-months |
| China | 499 | 15 | 1 | Person-hours |
| Desharnais | 81 | 11 | 1 | Person-hours |
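Table 3. Lists with chosen attributes for all five used datasets.
| Datasets | Used Attributes | Attributes Description | Min | Max | Mean | Std |
|---|---|---|---|---|---|---|
| Albrecht | InputNumeric | Input functions number | 7 | 193 | 40.25 | 36.913 |
| Albrecht | OutputNumeric | Output functions number | 12 | 150 | 47.25 | 35.169 |
| Albrecht | FileNumeric | Count of file processing | 3 | 60 | 17.37 | 15.522 |
| Albrecht | RawFPcouns | Raw function points | 189.52 | 1902 | 638.54 | 452.654 |
| Albrecht | AdjfpNumeric | Adjusted function points | 199 | 1902 | 647.62 | 487.995 |
| Albrecht | InquiryNumeric | Count of query functions | 0 | 75 | 16.87 | 19.337 |
| Kemerer | Language | Programming language | 1 | 3 | 1.2 | 0.561 |
| Kemerer | RawFP | Unadjusted function points | 97 | 2284 | 993.86 | 597.426 |
| Kemerer | Duration | Duration of the project | 5 | 31 | 14.26 | 7.544 |
| Kemerer | AdjFP | Adjusted function points | 99.9 | 2306.8 | 999.14 | 589.592 |
| Kemerer | Hardware | Hardware resources | 1 | 6 | 2.33 | 1.676 |
| Kemerer | KSLOC | Kilo lines of code | 39 | 450 | 186.57 | 136.817 |
| Cocomo81 | Data | Database size | 0.94 | 1.16 | 1.04 | 0.073 |
| Cocomo81 | Time | Time constraint | 1 | 1.66 | 1.11 | 0.161 |
| Cocomo81 | Cplx | Complexity of product | 0.7 | 1.65 | 1.09 | 0.202 |
| Cocomo81 | Stor | Storage constraint | 1 | 1.56 | 1.14 | 0.179 |
| Cocomo81 | Tool | Software tools use | 0.83 | 1.24 | 1.01 | 0.085 |
| Cocomo81 | Loc | Lines of code | 1.98 | 1150 | 77.21 | 168.51 |
| China | Input | Function points of input | 0 | 9404 | 167.12 | 486.301 |
| China | Output | Function points of external output | 0 | 2455 | 113.61 | 221.299 |
| China | File | Function points of internal logical files | 0 | 2955 | 91.23 | 210.289 |
| China | Added | Function points of added functions | 0 | 13,580 | 360.41 | 829.797 |
| China | Resource | Team type | 1 | 4 | 1.45 | 0.823 |
| China | AFP | Adjusted function points | 9 | 17,518 | 486.91 | 1059.008 |
| Desharnais | TeamExp | Team experience | −1 | 4 | 2.18 | 1.415 |
| Desharnais | ManagerExp | Manager experience | −1 | 7 | 2.53 | 1.643 |
| Desharnais | Length | Length of the project | 1 | 39 | 11.67 | 7.424 |
| Desharnais | Entities | Number of entities | 7 | 387 | 122.33 | 84.882 |
| Desharnais | Language | Programming language | 1 | 3 | 1.55 | 0.707 |
| Desharnais | Adjustment | Adjusted factor | 5 | 52 | 27.63 | 10.592 |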
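Table 4. Real effort values.
| Datasets | Training: Number | Training: Min | Training: Max | Training: Mean | Training: Std | Testing: Number | Testing: Min | Testing: Max | Testing: Mean | Testing: Std |
|---|---|---|---|---|---|---|---|---|---|---|
| Albrecht | 18 | 0.5 | 105.2 | 18.41 | 25.609 | 6 | 2.9 | 102.4 | 32.26 | 36.225 |
| Kemerer | 11 | 23.2 | 1107.31 | 222.91 | 306.831 | 4 | 72.0 | 287.0 | 209.15 | 94.446 |
| Cocomo81 | 47 | 5.9 | 11,400.0 | 819.91 | 2075.392 | 16 | 6.0 | 2040.0 | 282.06 | 525.316 |
| China | 374 | 26.0 | 54,620.0 | 4218.06 | 6679.071 | 125 | 89.0 | 49,034.0 | 3417.62 | 5826.511 |
| Desharnais | 60 | 651.0 | 23,940.0 | 5241.55 | 4714.124 | 21 | 546.0 | 14,987.0 | 4488.47 | 3478.961 |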
Table 5. Test results for KNN method.
Table 5. Test results for KNN method.
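
| Metric | Dataset | Metric Min | Metric Max | Metric Mean | p | k | Estimated Effort Min | Estimated Effort Max | Estimated Effort Mean |
|---|---|---|---|---|---|---|---|---|---|
| MAE | Albrecht | 11.228 | 18.557 | 15.914 | 1 | 3 | 6.333 | 59.467 | 22.872 |
| MAE | Kemerer | 75.519 | 99.728 | 87.201 | 1 | 4 | 86.675 | 210.325 | 140.968 |
| MAE | Cocomo81 | 221.619 | 257.051 | 235.505 | 2 | 3 | 14.001 | 476.333 | 170.152 |
| MAE | China | 1747.048 | 2154.672 | 1983.695 | 1 | 3 | 272.041 | 24,896.841 | 12,753.127 |
| MAE | Desharnais | 1422.878 | 1660.392 | 1562.577 | 1 | 7 | 1831.011 | 8977.015 | 4149.707 |
| MdAE | Albrecht | 4.556 | 6.657 | 5.671 | 2 | 8 | 6.251 | 30.187 | 16.237 |
| MdAE | Kemerer | 57.933 | 105.039 | 89.929 | 1 | 3 | 63.233 | 237.001 | 130.167 |
| MdAE | Cocomo81 | 35.783 | 138.357 | 90.451 | 1 | 3 | 13.998 | 477.336 | 169.568 |
| MdAE | China | 611.602 | 1228.203 | 963.552 | 1 | 5 | 211.713 | 21,100.605 | 11,009.203 |
| MdAE | Desharnais | 921.667 | 1541.417 | 1270.797 | 1 | 3 | 1547.02 | 10,061.333 | 3580.746 |
| RMSE | Albrecht | 18.202 | 31.746 | 27.054 | 1 | 3 | 6.333 | 59.467 | 22.872 |
| RMSE | Kemerer | 90.933 | 109.744 | 101.412 | 1 | 4 | 86.671 | 210.325 | 140.968 |
| RMSE | Cocomo81 | 446.987 | 483.493 | 466.399 | 1 | 7 | 17.742 | 448.714 | 203.692 |
| RMSE | China | 3922.755 | 4436.163 | 4161.753 | 1 | 7 | 237.135 | 22,966.109 | 10,303.723 |
| RMSE | Desharnais | 1847.561 | 2839.927 | 2230.060 | 1 | 6 | 1886.501 | 9945.833 | 4193.794 |
| CD | Albrecht | 0.389 | 0.437 | 0.416 | 1 | 3 | 6.333 | 59.467 | 22.872 |
| CD | Kemerer | 0.303 | 0.351 | 0.332 | 1 | 4 | 86.675 | 210.325 | 140.968 |
| CD | Cocomo81 | 0.738 | 0.793 | 0.762 | 2 | 3 | 14.001 | 476.333 | 170.152 |
| CD | China | 0.668 | 0.762 | 0.713 | 1 | 3 | 272.041 | 24,896.841 | 12,753.127 |
| CD | Desharnais | 0.591 | 0.658 | 0.624 | 1 | 7 | 1831.011 | 8977.015 | 4149.707 |
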
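
The four metrics reported in the result tables (MAE, MdAE, RMSE, and CD) can be computed from real and estimated effort values as sketched below. This is a minimal standard-library illustration, not the study's code: CD is assumed to be the usual coefficient of determination, 1 − SS_res/SS_tot, and the effort values in the usage lines are made up for the example rather than taken from any of the five datasets.

```python
import math
import statistics


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mdae(y_true, y_pred):
    """Median absolute error."""
    return statistics.median(abs(t - p) for t, p in zip(y_true, y_pred))


def rmse(y_true, y_pred):
    """Root mean square error."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def cd(y_true, y_pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up effort values, for illustration only.
real_effort = [10.0, 20.0, 30.0, 40.0]
estimated_effort = [12.0, 18.0, 33.0, 36.0]

print(f"MAE  = {mae(real_effort, estimated_effort):.3f}")
print(f"MdAE = {mdae(real_effort, estimated_effort):.3f}")
print(f"RMSE = {rmse(real_effort, estimated_effort):.3f}")
print(f"CD   = {cd(real_effort, estimated_effort):.3f}")
```

For these made-up values the sketch gives MAE 2.75, MdAE 2.5, RMSE of about 2.872, and CD 0.934. Lower values are better for the three error metrics, while CD is better the closer it is to 1.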
Table 6. Test results for DT method.
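
| Metric | Dataset | Metric Min | Metric Max | Metric Mean | d | n | Estimated Effort Min | Estimated Effort Max | Estimated Effort Mean |
|---|---|---|---|---|---|---|---|---|---|
| MAE | Albrecht | 6.975 | 23.439 | 12.378 | 5 | 12 | 7.751 | 101.202 | 68.357 |
| MAE | Kemerer | 95.853 | 123.987 | 105.829 | 6 | 12 | 73.202 | 159.006 | 124.358 |
| MAE | Cocomo81 | 275.979 | 305.639 | 287.251 | 6 | 17 | 24.016 | 1069.077 | 453.055 |
| MAE | China | 1575.633 | 2556.485 | 2098.901 | 5 | 16 | 895.606 | 33,291.007 | 18,912.274 |
| MAE | Desharnais | 1953.361 | 2693.082 | 2087.125 | 7 | 17 | 2593.501 | 9100.246 | 4507.015 |
| MdAE | Albrecht | 3.825 | 9.868 | 6.738 | 5 | 12 | 7.751 | 101.202 | 68.357 |
| MdAE | Kemerer | 81.802 | 123.525 | 99.104 | 5 | 18 | 81.121 | 162.015 | 132.831 |
| MdAE | Cocomo81 | 48.617 | 154.181 | 77.217 | 6 | 17 | 24.016 | 1069.077 | 453.055 |
| MdAE | China | 755.642 | 1845.505 | 1398.297 | 6 | 19 | 895.603 | 33,291.075 | 17,588.015 |
| MdAE | Desharnais | 1453.083 | 2108.705 | 1701.025 | 6 | 11 | 1414.447 | 14,973.075 | 7982.055 |
| RMSE | Albrecht | 10.213 | 32.872 | 17.083 | 5 | 12 | 7.751 | 101.202 | 68.357 |
| RMSE | Kemerer | 106.226 | 137.477 | 119.253 | 6 | 12 | 73.202 | 159.006 | 124.358 |
| RMSE | Cocomo81 | 514.792 | 585.492 | 541.035 | 8 | 18 | 98.764 | 902.033 | 345.159 |
| RMSE | China | 2848.557 | 4441.981 | 3471.155 | 5 | 16 | 895.606 | 33,291.007 | 18,912.274 |
| RMSE | Desharnais | 2526.071 | 3704.809 | 3306.172 | 5 | 18 | 2507.501 | 9217.258 | 6988.345 |
| CD | Albrecht | 0.497 | 0.561 | 0.536 | 5 | 12 | 7.751 | 101.202 | 68.357 |
| CD | Kemerer | 0.275 | 0.309 | 0.289 | 6 | 12 | 73.202 | 159.006 | 124.358 |
| CD | Cocomo81 | 0.691 | 0.774 | 0.733 | 6 | 17 | 24.016 | 1069.077 | 453.055 |
| CD | China | 0.685 | 0.793 | 0.723 | 5 | 16 | 895.606 | 33,291.007 | 18,912.274 |
| CD | Desharnais | 0.479 | 0.563 | 0.515 | 7 | 17 | 2593.501 | 9100.246 | 4507.015 |
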
Table 7. Test results for RF method.
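
| Metric | Dataset | Metric Min | Metric Max | Metric Mean | d | t | Estimated Effort Min | Estimated Effort Max | Estimated Effort Mean |
|---|---|---|---|---|---|---|---|---|---|
| MAE | Albrecht | 6.015 | 20.914 | 10.817 | 7 | 150 | 9.374 | 98.207 | 67.264 |
| MAE | Kemerer | 82.739 | 135.792 | 110.145 | 9 | 200 | 77.025 | 204.876 | 132.329 |
| MAE | Cocomo81 | 235.857 | 299.066 | 267.743 | 8 | 250 | 33.099 | 1329.045 | 567.987 |
| MAE | China | 1409.324 | 2441.075 | 1970.845 | 6 | 300 | 706.125 | 35,987.755 | 20,075.177 |
| MAE | Desharnais | 1792.837 | 2479.175 | 1995.075 | 8 | 200 | 2275.045 | 11,975.357 | 5705.075 |
| MdAE | Albrecht | 3.579 | 8.528 | 4.381 | 8 | 100 | 12.842 | 86.925 | 46.152 |
| MdAE | Kemerer | 69.174 | 117.026 | 98.713 | 10 | 150 | 82.038 | 236.109 | 157.235 |
| MdAE | Cocomo81 | 43.925 | 138.271 | 69.925 | 9 | 100 | 78.808 | 1099.125 | 458.794 |
| MdAE | China | 621.782 | 1753.655 | 1203.555 | 10 | 250 | 819.255 | 36,606.177 | 19,877.095 |
| MdAE | Desharnais | 1284.947 | 2095.175 | 1685.255 | 9 | 300 | 1155.038 | 13,475.105 | 6795.755 |
| RMSE | Albrecht | 7.726 | 30.714 | 24.003 | 7 | 150 | 9.374 | 98.207 | 67.264 |
| RMSE | Kemerer | 83.726 | 128.057 | 102.113 | 9 | 200 | 77.025 | 204.876 | 132.329 |
| RMSE | Cocomo81 | 487.283 | 557.883 | 521.151 | 9 | 100 | 78.808 | 1099.125 | 458.794 |
| RMSE | China | 2792.372 | 4337.077 | 3507.808 | 6 | 300 | 706.125 | 35,987.755 | 20,075.177 |
| RMSE | Desharnais | 2379.063 | 3685.915 | 2985.175 | 9 | 300 | 1155.038 | 13,475.105 | 6795.755 |
| CD | Albrecht | 0.519 | 0.593 | 0.553 | 7 | 150 | 9.374 | 98.207 | 67.264 |
| CD | Kemerer | 0.284 | 0.327 | 0.301 | 9 | 200 | 77.025 | 204.876 | 132.329 |
| CD | Cocomo81 | 0.658 | 0.732 | 0.693 | 8 | 250 | 33.099 | 1329.045 | 567.987 |
| CD | China | 0.709 | 0.821 | 0.765 | 6 | 300 | 706.125 | 35,987.755 | 20,075.177 |
| CD | Desharnais | 0.491 | 0.581 | 0.543 | 8 | 200 | 2275.045 | 11,975.357 | 5705.075 |
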
Table 8. Test results for GBT method.
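
| Metric | Dataset | Metric Min | Metric Max | Metric Mean | d | l | Estimated Effort Min | Estimated Effort Max | Estimated Effort Mean |
|---|---|---|---|---|---|---|---|---|---|
| MAE | Albrecht | 5.892 | 27.674 | 18.936 | 5 | 0.15 | 8.294 | 89.491 | 69.174 |
| MAE | Kemerer | 60.197 | 120.345 | 82.392 | 4 | 0.25 | 74.147 | 223.736 | 148.884 |
| MAE | Cocomo81 | 209.198 | 279.885 | 251.092 | 5 | 0.2 | 31.725 | 1492.221 | 671.257 |
| MAE | China | 1287.382 | 2835.135 | 1487.655 | 4 | 0.25 | 95.125 | 39,875.755 | 15,978.045 |
| MAE | Desharnais | 1549.792 | 2278.755 | 1956.705 | 5 | 0.15 | 908.755 | 11,309.875 | 5995.255 |
| MdAE | Albrecht | 3.092 | 17.538 | 11.941 | 4 | 0.2 | 12.743 | 78.926 | 54.729 |
| MdAE | Kemerer | 58.682 | 106.827 | 80.724 | 3 | 0.15 | 89.387 | 213.357 | 176.086 |
| MdAE | Cocomo81 | 39.826 | 127.864 | 81.023 | 3 | 0.25 | 81.216 | 1283.117 | 687.832 |
| MdAE | China | 504.672 | 2095.185 | 1675.085 | 2 | 0.15 | 709.148 | 38,075.275 | 17,899.345 |
| MdAE | Desharnais | 901.925 | 2399.075 | 1108.125 | 3 | 0.2 | 1093.445 | 8796.095 | 5750.174 |
| RMSE | Albrecht | 8.094 | 38.757 | 23.931 | 5 | 0.25 | 8.839 | 88.937 | 68.147 |
| RMSE | Kemerer | 64.045 | 134.981 | 98.927 | 4 | 0.25 | 74.147 | 223.736 | 148.884 |
| RMSE | Cocomo81 | 463.782 | 521.236 | 482.226 | 3 | 0.25 | 81.216 | 1283.117 | 687.832 |
| RMSE | China | 2563.824 | 6235.975 | 4387.187 | 4 | 0.25 | 95.125 | 39,875.755 | 15,978.045 |
| RMSE | Desharnais | 2095.827 | 3190.725 | 2601.007 | 3 | 0.2 | 1093.445 | 8796.095 | 5750.174 |
| CD | Albrecht | 0.527 | 0.609 | 0.563 | 5 | 0.15 | 8.294 | 89.491 | 69.174 |
| CD | Kemerer | 0.347 | 0.392 | 0.367 | 4 | 0.25 | 74.147 | 223.736 | 148.884 |
| CD | Cocomo81 | 0.759 | 0.812 | 0.774 | 5 | 0.2 | 31.725 | 1492.221 | 671.257 |
| CD | China | 0.773 | 0.865 | 0.815 | 4 | 0.25 | 95.125 | 39,875.755 | 15,978.045 |
| CD | Desharnais | 0.585 | 0.639 | 0.607 | 5 | 0.15 | 908.755 | 11,309.875 | 5995.255 |
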
Table 9. Test results for MLP method.
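
| Metric | Dataset | Metric Min | Metric Max | Metric Mean | t | l | Estimated Effort Min | Estimated Effort Max | Estimated Effort Mean |
|---|---|---|---|---|---|---|---|---|---|
| MAE | Albrecht | 7.555 | 32.306 | 17.046 | 800 | 0.003 | 3.726 | 92.538 | 56.623 |
| MAE | Kemerer | 48.145 | 523.383 | 275.035 | 400 | 0.003 | 85.215 | 239.455 | 135.764 |
| MAE | Cocomo81 | 199.513 | 224.522 | 211.092 | 400 | 0.004 | 39.785 | 945.767 | 435.843 |
| MAE | China | 1383.713 | 2764.524 | 1309.175 | 500 | 0.004 | 100.262 | 36,214.905 | 17,088.325 |
| MAE | Desharnais | 1587.619 | 2434.246 | 1997.077 | 200 | 0.004 | 1189.909 | 8186.266 | 4378.075 |
| MdAE | Albrecht | 3.256 | 25.655 | 12.104 | 400 | 0.003 | 2.598 | 41.759 | 28.729 |
| MdAE | Kemerer | 47.427 | 582.999 | 321.428 | 400 | 0.003 | 85.215 | 239.455 | 135.764 |
| MdAE | Cocomo81 | 57.833 | 98.139 | 69.924 | 1000 | 0.003 | 48.505 | 957.248 | 675.912 |
| MdAE | China | 517.898 | 1929.029 | 1417.075 | 500 | 0.004 | 100.262 | 36,214.905 | 17,088.325 |
| MdAE | Desharnais | 905.841 | 2521.222 | 1203.755 | 100 | 0.006 | 1114.505 | 7368.811 | 4877.075 |
| RMSE | Albrecht | 8.208 | 42.338 | 24.753 | 800 | 0.003 | 3.726 | 92.538 | 56.623 |
| RMSE | Kemerer | 58.217 | 570.662 | 298.519 | 400 | 0.003 | 85.215 | 239.455 | 135.764 |
| RMSE | Cocomo81 | 355.245 | 371.664 | 362.214 | 500 | 0.002 | 60.725 | 964.043 | 542.912 |
| RMSE | China | 2892.196 | 5864.361 | 3708.185 | 500 | 0.004 | 100.262 | 36,214.905 | 17,088.325 |
| RMSE | Desharnais | 2176.591 | 3252.083 | 2709.122 | 200 | 0.004 | 1189.909 | 8186.266 | 4378.075 |
| CD | Albrecht | 0.503 | 0.547 | 0.522 | 800 | 0.003 | 3.726 | 92.538 | 56.623 |
| CD | Kemerer | 0.382 | 0.438 | 0.413 | 400 | 0.003 | 85.215 | 239.455 | 135.764 |
| CD | Cocomo81 | 0.762 | 0.826 | 0.793 | 400 | 0.004 | 39.785 | 945.767 | 435.843 |
| CD | China | 0.792 | 0.843 | 0.825 | 500 | 0.004 | 100.262 | 36,214.905 | 17,088.325 |
| CD | Desharnais | 0.576 | 0.624 | 0.599 | 200 | 0.004 | 1189.909 | 8186.266 | 4378.075 |
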
Table 10. Test results for LSTM method.
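
| Metric | Dataset | Metric Min | Metric Max | Metric Mean | e | n | Estimated Effort Min | Estimated Effort Max | Estimated Effort Mean |
|---|---|---|---|---|---|---|---|---|---|
| MAE | Albrecht | 4.870 | 7.428 | 5.341 | 700 | 50 | 7.218 | 102.357 | 62.851 |
| MAE | Kemerer | 42.094 | 499.132 | 233.437 | 500 | 25 | 72.912 | 253.776 | 148.866 |
| MAE | Cocomo81 | 178.051 | 433.352 | 256.467 | 300 | 25 | 93.918 | 1845.402 | 1001.563 |
| MAE | China | 865.122 | 2837.991 | 1899.123 | 1000 | 25 | 544.762 | 39,938.265 | 28,055.125 |
| MAE | Desharnais | 1404.571 | 2297.282 | 1709.153 | 600 | 100 | 610.162 | 9234.885 | 3545.995 |
| MdAE | Albrecht | 2.911 | 6.995 | 4.235 | 500 | 100 | 6.286 | 100.391 | 59.973 |
| MdAE | Kemerer | 26.206 | 580.956 | 311.468 | 500 | 25 | 72.912 | 253.776 | 148.866 |
| MdAE | Cocomo81 | 30.642 | 332.799 | 187.395 | 900 | 50 | 21.553 | 500.38 | 245.728 |
| MdAE | China | 272.353 | 2188.822 | 1550.135 | 900 | 50 | 224.834 | 26,239.266 | 15,066.025 |
| MdAE | Desharnais | 880.121 | 2130.503 | 1406.345 | 1000 | 75 | 657.802 | 7847.106 | 5008.755 |
| RMSE | Albrecht | 6.560 | 8.209 | 7.349 | 100 | 75 | 7.637 | 99.044 | 55.862 |
| RMSE | Kemerer | 57.560 | 544.419 | 279.828 | 500 | 25 | 72.912 | 253.776 | 148.866 |
| RMSE | Cocomo81 | 245.153 | 574.658 | 352.714 | 300 | 25 | 93.918 | 1845.402 | 1001.563 |
| RMSE | China | 2034.574 | 4864.522 | 2897.095 | 1000 | 25 | 544.762 | 39,938.265 | 28,055.125 |
| RMSE | Desharnais | 1960.007 | 3178.601 | 2499.065 | 600 | 100 | 610.162 | 9234.885 | 3545.995 |
| CD | Albrecht | 0.594 | 0.631 | 0.612 | 700 | 50 | 7.218 | 102.357 | 62.851 |
| CD | Kemerer | 0.431 | 0.493 | 0.464 | 500 | 25 | 72.912 | 253.776 | 148.866 |
| CD | Cocomo81 | 0.816 | 0.897 | 0.845 | 300 | 25 | 93.918 | 1845.402 | 1001.563 |
| CD | China | 0.784 | 0.902 | 0.845 | 1000 | 25 | 544.762 | 39,938.265 | 28,055.125 |
| CD | Desharnais | 0.608 | 0.662 | 0.631 | 600 | 100 | 610.162 | 9234.885 | 3545.995 |
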
Table 11. Comparison with the results of other studies.
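
| Dataset | Metric | Other Studies | LSTM (This Study) |
|---|---|---|---|
| Albrecht | MAE | 7.742 [6] | 4.870 |
| Kemerer | MAE | 138.911 [6] | 42.094 |
| Cocomo81 | MAE | 928.3318 [9]; 255.2615 [5]; 153 [7] | 178.051 |
| Cocomo81 | RMSE | 2278.87 [9]; 533.4206 [5]; 228.7 [7] | 245.153 |
| Cocomo81 | CD | 0.98 [7] | 0.897 |
| China | MAE | 926.182 [6]; 676.6 [7] | 865.122 |
| China | RMSE | 1803.3 [7] | 2034.574 |
| China | CD | 0.93 [7] | 0.902 |
| Desharnais | MAE | 2244.675 [6]; 2013.7987 [8] | 1404.571 |
| Desharnais | RMSE | 2824.57 [8] | 1847.560 |
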
Table 12. Range of optimized LSTM parameters.

| Parameter | Lower Value | Upper Value |
|---|---|---|
| Initial learning rate | 0.0001 | 0.1 |
| Momentum | 0.5 | 0.9 |
| Dropout probability | 0 | 1 |

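
The ranges in Table 12 define the search space that particle swarm optimization explores for the LSTM. The sketch below shows how such a bounded search can be organised in plain Python. It rests on several assumptions that are not taken from the study: the swarm size, iteration count, inertia weight, and acceleration coefficients are illustrative defaults, and `placeholder_objective` is a hypothetical stand-in for the error of a trained network, included only so that the example runs.

```python
import random

# Search ranges from Table 12: (lower, upper) for each LSTM hyperparameter.
BOUNDS = {
    "learning_rate": (0.0001, 0.1),
    "momentum": (0.5, 0.9),
    "dropout": (0.0, 1.0),
}


def placeholder_objective(params):
    """Hypothetical stand-in for the validation error of a trained LSTM.

    A real run would train the network with `params` and return its error;
    this quadratic bowl (with an arbitrary minimum) only makes the sketch run.
    """
    target = {"learning_rate": 0.01, "momentum": 0.8, "dropout": 0.2}
    return sum((params[k] - target[k]) ** 2 for k in BOUNDS)


def clip(value, low, high):
    """Keep a coordinate inside its allowed range."""
    return max(low, min(high, value))


def pso(objective, bounds, n_particles=20, n_iters=50,
        inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` inside `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    keys = list(bounds)
    positions = [{k: rng.uniform(*bounds[k]) for k in keys}
                 for _ in range(n_particles)]
    velocities = [{k: 0.0 for k in keys} for _ in range(n_particles)]
    personal_best = [dict(p) for p in positions]
    personal_score = [objective(p) for p in positions]
    best_idx = min(range(n_particles), key=personal_score.__getitem__)
    global_best = dict(personal_best[best_idx])
    global_score = personal_score[best_idx]

    for _ in range(n_iters):
        for i in range(n_particles):
            for k in keys:
                r1, r2 = rng.random(), rng.random()
                # Pull towards the particle's own best and the swarm's best.
                velocities[i][k] = (
                    inertia * velocities[i][k]
                    + c1 * r1 * (personal_best[i][k] - positions[i][k])
                    + c2 * r2 * (global_best[k] - positions[i][k])
                )
                positions[i][k] = clip(positions[i][k] + velocities[i][k],
                                       *bounds[k])
            score = objective(positions[i])
            if score < personal_score[i]:
                personal_score[i] = score
                personal_best[i] = dict(positions[i])
                if score < global_score:
                    global_score = score
                    global_best = dict(positions[i])
    return global_best, global_score


best_params, best_score = pso(placeholder_objective, BOUNDS)
print(best_params, best_score)
```

Clipping each coordinate after the position update keeps every candidate inside the Table 12 ranges. In an actual experiment the placeholder would be replaced by a function that trains the LSTM and returns one of the error metrics on held-out data.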
Table 13. Test results for improved LSTM method.
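
| Metric | Dataset | Metric Min | Metric Max | Metric Mean | e | n | Estimated Effort Min | Estimated Effort Max | Estimated Effort Mean |
|---|---|---|---|---|---|---|---|---|---|
| MAE | Albrecht | 3.604 | 20.026 | 12.046 | 1000 | 100 | 7.879 | 102.378 | 62.199 |
| MAE | Kemerer | 35.294 | 81.210 | 51.715 | 300 | 25 | 72.475 | 281.969 | 151.872 |
| MAE | Cocomo81 | 133.096 | 498.216 | 243.865 | 700 | 100 | 42.699 | 1996.398 | 875.823 |
| MAE | China | 331.089 | 686.181 | 498.155 | 600 | 75 | 136.281 | 40,989.973 | 24,775.855 |
| MAE | Desharnais | 1253.781 | 1857.354 | 1437.557 | 100 | 25 | 1369.367 | 14,461.862 | 7868.247 |
| MdAE | Albrecht | 2.902 | 11.568 | 5.092 | 1000 | 100 | 7.879 | 102.378 | 62.199 |
| MdAE | Kemerer | 13.747 | 83.125 | 47.836 | 1000 | 100 | 79.445 | 280.277 | 149.346 |
| MdAE | Cocomo81 | 30.142 | 363.576 | 215.732 | 500 | 50 | 34.991 | 1239.811 | 676.557 |
| MdAE | China | 111.213 | 464.951 | 277.955 | 600 | 75 | 136.281 | 40,989.973 | 24,775.855 |
| MdAE | Desharnais | 849.929 | 1393.136 | 1007.793 | 400 | 25 | 667.482 | 7990.067 | 5908.635 |
| RMSE | Albrecht | 4.717 | 32.982 | 19.187 | 800 | 75 | 7.808 | 100.381 | 59.035 |
| RMSE | Kemerer | 41.412 | 133.247 | 78.945 | 100 | 75 | 75.541 | 274.121 | 147.326 |
| RMSE | Cocomo81 | 217.794 | 790.626 | 373.584 | 700 | 100 | 42.699 | 1996.398 | 875.823 |
| RMSE | China | 873.102 | 1386.098 | 1175.075 | 500 | 75 | 116.507 | 42,016.782 | 29,775.095 |
| RMSE | Desharnais | 1535.852 | 2949.732 | 2211.123 | 100 | 25 | 1369.367 | 14,461.862 | 7868.247 |
| CD | Albrecht | 0.615 | 0.697 | 0.648 | 1000 | 100 | 7.879 | 102.378 | 62.199 |
| CD | Kemerer | 0.528 | 0.579 | 0.541 | 300 | 25 | 72.475 | 281.969 | 151.872 |
| CD | Cocomo81 | 0.917 | 0.986 | 0.953 | 700 | 100 | 42.699 | 1996.398 | 875.823 |
| CD | China | 0.893 | 0.951 | 0.937 | 600 | 75 | 136.281 | 40,989.973 | 24,775.855 |
| CD | Desharnais | 0.627 | 0.683 | 0.645 | 100 | 25 | 1369.367 | 14,461.862 | 7868.247 |
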
Table 14. Optimal metrics values.
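
| Dataset | Method | MAE | MdAE | RMSE | CD |
|---|---|---|---|---|---|
| Albrecht | KNN | 11.228 | 4.556 | 18.202 | 0.437 |
| Albrecht | DT | 6.975 | 3.825 | 10.213 | 0.561 |
| Albrecht | RF | 6.015 | 3.579 | 7.726 | 0.593 |
| Albrecht | GBT | 5.892 | 3.092 | 7.094 | 0.609 |
| Albrecht | MLP | 7.555 | 3.256 | 8.208 | 0.547 |
| Albrecht | LSTM | 4.870 | 2.911 | 6.560 | 0.631 |
| Albrecht | Optimized LSTM | 3.604 | 2.902 | 4.717 | 0.697 |
| Kemerer | KNN | 75.519 | 57.933 | 90.933 | 0.351 |
| Kemerer | DT | 95.853 | 81.802 | 106.226 | 0.309 |
| Kemerer | RF | 82.739 | 69.174 | 83.726 | 0.327 |
| Kemerer | GBT | 60.197 | 58.682 | 64.045 | 0.392 |
| Kemerer | MLP | 48.145 | 47.427 | 58.217 | 0.438 |
| Kemerer | LSTM | 42.094 | 26.206 | 57.560 | 0.493 |
| Kemerer | Optimized LSTM | 35.294 | 13.747 | 41.412 | 0.579 |
| Cocomo81 | KNN | 221.619 | 35.783 | 446.987 | 0.793 |
| Cocomo81 | DT | 275.979 | 48.617 | 514.792 | 0.774 |
| Cocomo81 | RF | 235.857 | 43.925 | 487.283 | 0.732 |
| Cocomo81 | GBT | 209.198 | 39.826 | 463.782 | 0.812 |
| Cocomo81 | MLP | 199.513 | 57.833 | 355.245 | 0.826 |
| Cocomo81 | LSTM | 178.051 | 30.642 | 245.153 | 0.897 |
| Cocomo81 | Optimized LSTM | 133.096 | 30.142 | 217.794 | 0.986 |
| China | KNN | 1747.048 | 611.602 | 3922.755 | 0.762 |
| China | DT | 1575.633 | 755.642 | 2848.557 | 0.793 |
| China | RF | 1409.324 | 621.782 | 2792.372 | 0.821 |
| China | GBT | 1287.382 | 504.672 | 2563.824 | 0.865 |
| China | MLP | 1383.713 | 517.898 | 2892.196 | 0.843 |
| China | LSTM | 865.122 | 272.353 | 2034.574 | 0.902 |
| China | Optimized LSTM | 331.089 | 111.213 | 873.102 | 0.951 |
| Desharnais | KNN | 1422.878 | 921.667 | 1847.561 | 0.658 |
| Desharnais | DT | 1953.361 | 1453.083 | 2526.071 | 0.563 |
| Desharnais | RF | 1792.837 | 1284.947 | 2379.063 | 0.581 |
| Desharnais | GBT | 1549.792 | 901.925 | 2095.827 | 0.639 |
| Desharnais | MLP | 1587.619 | 905.841 | 2176.591 | 0.624 |
| Desharnais | LSTM | 1404.571 | 880.121 | 1960.007 | 0.662 |
| Desharnais | Optimized LSTM | 1253.781 | 849.929 | 1535.852 | 0.683 |
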
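
To make the gain from optimization in Table 14 easier to read, the best MAE of the optimized LSTM can be expressed as a percentage reduction relative to the plain LSTM. The lines below are a small worked calculation on the MAE values from the table; the helper name `relative_reduction` is illustrative.

```python
# Best MAE values from Table 14: (LSTM, Optimized LSTM).
BEST_MAE = {
    "Albrecht": (4.870, 3.604),
    "Kemerer": (42.094, 35.294),
    "Cocomo81": (178.051, 133.096),
    "China": (865.122, 331.089),
    "Desharnais": (1404.571, 1253.781),
}


def relative_reduction(baseline: float, improved: float) -> float:
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


reductions = {name: relative_reduction(*pair) for name, pair in BEST_MAE.items()}

for name, pct in reductions.items():
    print(f"{name:<11} MAE reduction: {pct:.1f}%")
```

On these figures the reduction is largest for China (about 62%) and smallest for Desharnais (about 11%), with Albrecht, Cocomo81, and Kemerer in between.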
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Iordan, A.-E. An Optimized LSTM Neural Network for Accurate Estimation of Software Development Effort. Mathematics 2024, 12, 200. https://doi.org/10.3390/math12020200

