Do Stop Words Matter in Bug Report Analysis? Empirical Findings Using Deep Learning Models Across Duplicate, Severity, and Priority Classification
Abstract
1. Introduction
2. Background Knowledge
2.1. Bug Report and Bug Life Cycle
- Open: The bug has been reported and is pending investigation or assignment.
- Closed: The issue has been resolved, verified, or determined to be invalid or unreproducible.
- Reopened: A previously closed bug is reopened because the fix was ineffective or the problem has reoccurred.
- Deferred: The bug is considered low priority or out of scope for the current release and is scheduled for future consideration.
2.2. Bug Report Preprocessing
2.3. Feature Selection in Bug Reports
2.4. Bug Duplicate Detection
2.5. Bug Severity Prediction
- S1 (Very Severe): Issues that cause major failures, such as system crashes, memory leaks, or complete data corruption.
- S2 (Severe): Defects that affect core functionality or essential features but do not lead to total system failure.
- S3 (Moderate): Problems such as interface inconsistencies, logical errors, or boundary condition issues that affect user workflows without halting operation.
- S4 (Minor): Cosmetic issues, including typographical errors or formatting inconsistencies, that do not impact software usability or correctness.
2.6. Bug Priority Prediction
- P1 (Highest Priority): Critical defects that cause major system failures or render essential functions inoperable. Immediate attention is required.
- P2 (High Priority): Significant issues that affect key components but may not prevent system operation. These issues should be addressed promptly.
- P3 (Medium Priority): Defects that affect non-core functionality or lead to unexpected results without causing major disruption.
- P4 (Low Priority): Minor usability or cosmetic issues that do not interfere with normal operation and can be resolved at a later time.
- P5 (Lowest Priority): Trivial or unclear problems that have minimal or no impact on system behavior or user experience.
3. Our Approach
3.1. Data Extraction and Preprocessing
- Tokenization: Text fields were segmented into individual word-level tokens.
- Lemmatization: Each token was normalized to its base or root form to reduce morphological variance.
- Stop Word Removal (optional): Common function words that are typically considered semantically uninformative, such as “the,” “is,” and “at,” were removed in one version of the dataset.
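The three steps above can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: the stop word list `STOP_WORDS` and the lemma table `LEMMAS` are small hypothetical stand-ins for the resources an NLP library would provide, and the function names are assumptions made for this example.

```python
import re

# Hypothetical, deliberately tiny stop word list (assumption for illustration).
STOP_WORDS = {"the", "is", "at", "a", "an", "of", "on", "in", "to", "and"}

# Toy lemma table standing in for a real lemmatizer (assumption for illustration).
LEMMAS = {"crashes": "crash", "crashed": "crash", "files": "file", "opening": "open"}


def tokenize(text):
    """Split text into lowercase word-level tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def lemmatize(tokens):
    """Map each token to its base form when the toy table knows it."""
    return [LEMMAS.get(tok, tok) for tok in tokens]


def preprocess(text, remove_stop_words):
    """Tokenize, lemmatize, and optionally drop stop words."""
    tokens = lemmatize(tokenize(text))
    if remove_stop_words:
        tokens = [tok for tok in tokens if tok not in STOP_WORDS]
    return tokens


# Build both dataset variants for one example report summary.
report = "The editor crashes at startup when opening the files"
with_removal = preprocess(report, remove_stop_words=True)
without_removal = preprocess(report, remove_stop_words=False)
```

Keeping stop word removal behind a single flag makes it straightforward to produce the two dataset versions from identical tokenization and lemmatization, so that any difference in model performance can be attributed to the stop words alone.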
3.2. Feature Selection
3.3. Model Training and Evaluation
3.3.1. Convolutional Neural Networks (CNNs)
- Input Layer: Accepts preprocessed text in the form of token indices or embedding vectors.
- Convolutional Layer: Applies filters to extract local feature patterns such as key phrases or term co-occurrences.
- Pooling Layer: Reduces the dimensionality of feature maps and preserves the most salient features.
- Fully Connected Layer: Aggregates extracted features to form a dense representation suitable for classification.
- Output Layer: Produces the final prediction for each task using a softmax or sigmoid function, depending on the classification type.
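The layer sequence above can be traced with a small forward pass in plain Python. This is a sketch under stated assumptions, not the configuration used in the study: all sizes are hypothetical, the weights are random rather than trained, biases are omitted, and global max pooling is used as one possible pooling choice.

```python
import math
import random

random.seed(0)  # fixed seed so the illustrative weights are reproducible

# Hypothetical sizes chosen only for illustration.
VOCAB_SIZE, EMBED_DIM, NUM_FILTERS, KERNEL_SIZE, HIDDEN, NUM_CLASSES = 50, 4, 3, 2, 4, 2


def rand_matrix(rows, cols):
    """Random weights standing in for trained parameters."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


embedding = rand_matrix(VOCAB_SIZE, EMBED_DIM)
filters = [rand_matrix(KERNEL_SIZE, EMBED_DIM) for _ in range(NUM_FILTERS)]
dense_w = rand_matrix(NUM_FILTERS, HIDDEN)
output_w = rand_matrix(HIDDEN, NUM_CLASSES)


def conv1d_relu(seq, kernel):
    """Slide one filter over the embedded sequence and apply ReLU."""
    out = []
    for start in range(len(seq) - len(kernel) + 1):
        total = sum(
            seq[start + i][d] * kernel[i][d]
            for i in range(len(kernel))
            for d in range(len(kernel[0]))
        )
        out.append(max(0.0, total))
    return out


def dense(vec, weights):
    """Multiply a vector by a weight matrix (biases omitted for brevity)."""
    return [sum(v * row[c] for v, row in zip(vec, weights)) for c in range(len(weights[0]))]


def softmax(scores):
    """Turn raw scores into probabilities that sum to one."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(token_ids):
    embedded = [embedding[t] for t in token_ids]                 # input layer
    feature_maps = [conv1d_relu(embedded, k) for k in filters]   # convolutional layer
    pooled = [max(fm) for fm in feature_maps]                    # pooling layer (global max)
    hidden = [max(0.0, h) for h in dense(pooled, dense_w)]       # fully connected layer
    return softmax(dense(hidden, output_w))                      # output layer


probs = forward([3, 17, 8, 42, 5])  # token indices of one preprocessed report
```

A softmax output is shown here; for a binary task a single sigmoid unit would serve the same purpose.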
3.3.2. Long Short-Term Memory (LSTM)
3.3.3. Gated Recurrent Unit (GRU)
3.3.4. Transformer
3.3.5. BERT (Bidirectional Encoder Representations from Transformers)
3.3.6. Training Configuration
3.4. Implementation Details
3.4.1. Model Configuration
3.4.2. Text Processing and Vectorization
3.4.3. Training Settings
3.4.4. Imbalanced Data Handling
3.4.5. Evaluation Metrics
4. Experimental Results
4.1. Overview
4.2. Data Collection and Preprocessing
4.3. Model Training and Evaluation
4.4. Research Questions
- RQ1. How effectively do deep learning models perform in detecting duplicate bug reports and predicting severity and priority levels across diverse open-source projects?
- RQ2. To what extent does stop word removal affect the performance of deep learning models on bug report classification tasks?
- RQ3. How do models trained with stop word removal compare with those trained without it across the standard baseline architectures (CNN, LSTM, GRU, Transformer, and BERT)?
4.5. Results
- Answer for RQ1.
- Duplicate Detection: All five deep learning models (CNN, LSTM, GRU, Transformer, and BERT) achieved an average F1-score of 0.36 across the eight projects.
- Severity Prediction: The average F1-score across all models was 0.33, reflecting stable performance in classifying the impact level of reported bugs.
- Priority Prediction: The models also achieved an average F1-score of 0.33 in predicting the urgency of resolving each bug report.
- Answer for RQ2.
- For duplicate detection, the F1-score was 0.36 both with and without stop word removal.
- For severity prediction, the F1-score remained at 0.33 in both cases.
- For priority prediction, the F1-score also remained stable at 0.33 regardless of whether stop words were removed.
- Answer for RQ3.
- For duplicate detection, both the proposed model and all baseline models achieved an average F1-score of 0.36.
- For severity prediction, the F1-score remained stable at 0.33 across all models.
- For priority prediction, the proposed model also matched the baseline performance with an average F1-score of 0.33.
- Statistical Hypothesis Testing
- The t-test [39] was used when the performance data followed a normal distribution. This test compares the mean values of two groups to determine if the difference is statistically significant.
- The Wilcoxon signed-rank test [40], a non-parametric alternative, was applied when normality could not be assumed. This test evaluates whether the median difference between paired samples is significantly different from zero.
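As a worked illustration of the two tests, the sketch below computes their test statistics for the CNN duplicate-detection F1-scores of the eight projects listed in Appendix A (Eclipse through WebKit, first CNN table). It assumes a paired design, which the description above does not state explicitly, and it stops at the statistics: obtaining p-values would additionally require the t and signed-rank distributions, typically taken from a statistics library. The function names are hypothetical.

```python
import math

# F1-scores for CNN duplicate detection, one value per project (Appendix A).
removed = [0.34, 0.51, 0.33, 0.35, 0.34, 0.39, 0.34, 0.33]  # stop words removed
kept = [0.35, 0.50, 0.33, 0.36, 0.34, 0.38, 0.34, 0.34]     # stop words kept

# Round to two decimals so equal differences compare as exactly equal.
diffs = [round(a - b, 2) for a, b in zip(removed, kept)]


def paired_t_statistic(d):
    """t = mean(d) / (sample_sd(d) / sqrt(n)) for paired differences."""
    n = len(d)
    mean = sum(d) / n
    variance = sum((x - mean) ** 2 for x in d) / (n - 1)
    return mean / math.sqrt(variance / n)


def signed_rank_sums(d):
    """Return (W+, W-): rank sums of positive and negative differences.

    Zero differences are dropped and tied magnitudes share their average rank.
    """
    nonzero = sorted((x for x in d if x != 0), key=abs)
    rank_of = {}
    i = 0
    while i < len(nonzero):
        j = i
        while j < len(nonzero) and abs(nonzero[j]) == abs(nonzero[i]):
            j += 1
        rank_of[abs(nonzero[i])] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    w_plus = sum(rank_of[abs(x)] for x in nonzero if x > 0)
    w_minus = sum(rank_of[abs(x)] for x in nonzero if x < 0)
    return w_plus, w_minus


t_stat = paired_t_statistic(diffs)
w_plus, w_minus = signed_rank_sums(diffs)
w_stat = min(w_plus, w_minus)  # the Wilcoxon statistic is the smaller rank sum
```

Three of the eight differences are zero and the rest are all 0.01 in magnitude, so the signed-rank test here works with only five tied observations, which illustrates how little signal such small F1 differences carry.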
5. Discussion
5.1. Experiment Results
- For duplicate detection, the difference in F1-scores typically ranged between 0.01 and 0.05. For instance, in the Eclipse dataset using the CNN model (Table A1), the F1-score was 0.34 with stop word removal and 0.35 without.
- For severity prediction, the difference was generally around 0.01 across models and datasets. In Table A2, the RedHat dataset with the LSTM model showed an F1-score of 0.34 when stop words were removed and 0.33 when they were retained.
- For priority prediction, differences in F1-scores were also limited, with most variations falling between 0.01 and 0.03. In Table A11, for example, the Kernel dataset using CNN produced an F1-score of 0.38 with stop word removal and 0.35 without.
- The limited impact of stop word removal indicates that the models are already effective at identifying relevant semantic features without needing to discard common words.
- The consistency in performance suggests that duplicate and non-duplicate bug reports may share a common core vocabulary, which reduces the influence of stop words in distinguishing between them.
- The deep learning models used in this study, particularly BERT, Transformer, and LSTM, are designed to capture contextual information and exhibit robustness to uninformative tokens. This likely diminishes the importance of removing stop words during preprocessing.
5.2. Detailed Analysis of Experimental Results
5.3. Threats to Validity
- (1) Severe class imbalance, particularly in the duplicate detection task (e.g., 11,539 duplicates vs. 231,068 non-duplicates), which may not be fully addressed by SMOTE due to its synthetic nature;
- (2) Linguistic ambiguity and noise in bug reports, including jargon and inconsistent phrasing;
- (3) Simplified model architectures without advanced tuning, domain-specific embedding, or attention mechanisms.
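To make the first threat concrete, the sketch below computes the imbalance ratio from the counts quoted above and shows the interpolation idea behind SMOTE in plain Python. It is a simplified illustration, not the oversampling setup used in the study: the function `smote`, the neighbor count `k`, and the two-dimensional feature vectors are all assumptions made for this example.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of base (squared Euclidean distance).
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbor)])
    return synthetic


# Class counts quoted in the text for the duplicate detection task.
duplicates, non_duplicates = 11_539, 231_068
imbalance_ratio = non_duplicates / duplicates           # roughly 20 to 1
duplicate_share = duplicates / (duplicates + non_duplicates)

# Hypothetical 2-D feature vectors for four minority-class reports.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_new=6)
```

Every synthetic point lies on a line segment between two existing minority samples, so oversampling adds no information beyond what those samples already contain. This is one reason a ratio of about 20 to 1 may remain a problem after resampling.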
6. Related Work
6.1. Bug Duplicate Detection
6.2. Bug Severity Prediction
6.3. Bug Priority Prediction
6.4. Comparison with Our Study
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A

Table A1. CNN results with stop word removal (SWR) and without it (no SWR).

Project | Task | Precision (SWR) | Recall (SWR) | F1 (SWR) | ROC (SWR) | PR (SWR) | Precision (no SWR) | Recall (no SWR) | F1 (no SWR) | ROC (no SWR) | PR (no SWR) |
---|---|---|---|---|---|---|---|---|---|---|---|
Eclipse | Duplicate | 0.35 | 0.50 | 0.34 | 0.53 | 0.54 | 0.38 | 0.50 | 0.35 | 0.57 | 0.56 |
Eclipse | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.53 | 0.27 | 0.50 | 0.33 | 0.51 | 0.56 |
Eclipse | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.55 |
FreeBSD | Duplicate | 0.35 | 0.58 | 0.51 | 0.71 | 0.71 | 0.38 | 0.57 | 0.50 | 0.69 | 0.70 |
FreeBSD | Severity | 0.27 | 0.50 | 0.33 | 0.50 | 0.55 | 0.25 | 0.50 | 0.33 | 0.51 | 0.53 |
FreeBSD | Priority | 0.25 | 0.50 | 0.33 | 0.51 | 0.57 | 0.26 | 0.50 | 0.33 | 0.52 | 0.59 |
GCC | Duplicate | 0.35 | 0.50 | 0.33 | 0.53 | 0.53 | 0.37 | 0.50 | 0.33 | 0.52 | 0.53 |
GCC | Severity | 0.25 | 0.50 | 0.33 | 0.49 | 0.53 | 0.25 | 0.50 | 0.33 | 0.51 | 0.53 |
GCC | Priority | 0.26 | 0.50 | 0.33 | 0.50 | 0.50 | 0.26 | 0.50 | 0.33 | 0.50 | 0.51 |
Gentoo | Duplicate | 0.38 | 0.50 | 0.35 | 0.56 | 0.58 | 0.36 | 0.50 | 0.36 | 0.57 | 0.58 |
Gentoo | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.35 | 0.51 | 0.50 |
Gentoo | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Kernel | Duplicate | 0.38 | 0.50 | 0.34 | 0.50 | 0.56 | 0.38 | 0.50 | 0.34 | 0.50 | 0.53 |
Kernel | Severity | 0.26 | 0.50 | 0.33 | 0.52 | 0.53 | 0.30 | 0.50 | 0.33 | 0.51 | 0.53 |
Kernel | Priority | 0.26 | 0.50 | 0.33 | 0.51 | 0.52 | 0.31 | 0.50 | 0.34 | 0.50 | 0.50 |
RedHat | Duplicate | 0.32 | 0.52 | 0.39 | 0.61 | 0.59 | 0.38 | 0.51 | 0.38 | 0.57 | 0.56 |
RedHat | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.27 | 0.50 | 0.33 | 0.55 | 0.61 |
RedHat | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.27 | 0.50 | 0.33 | 0.52 | 0.63 |
Sourceware | Duplicate | 0.34 | 0.50 | 0.34 | 0.49 | 0.49 | 0.38 | 0.50 | 0.34 | 0.51 | 0.50 |
Sourceware | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.49 | 0.27 | 0.50 | 0.33 | 0.51 | 0.50 |
Sourceware | Priority | 0.32 | 0.50 | 0.33 | 0.50 | 0.49 | 0.28 | 0.50 | 0.33 | 0.50 | 0.55 |
WebKit | Duplicate | 0.36 | 0.50 | 0.33 | 0.63 | 0.62 | 0.35 | 0.50 | 0.34 | 0.63 | 0.62 |
WebKit | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.30 | 0.50 | 0.33 | 0.51 | 0.74 |
WebKit | Priority | 0.26 | 0.50 | 0.33 | 0.50 | 0.50 | 0.27 | 0.50 | 0.33 | 0.50 | 0.75 |

Table A2. LSTM results with stop word removal (SWR) and without it (no SWR).

Project | Task | Precision (SWR) | Recall (SWR) | F1 (SWR) | ROC (SWR) | PR (SWR) | Precision (no SWR) | Recall (no SWR) | F1 (no SWR) | ROC (no SWR) | PR (no SWR) |
---|---|---|---|---|---|---|---|---|---|---|---|
Eclipse | Duplicate | 0.38 | 0.50 | 0.33 | 0.54 | 0.54 | 0.25 | 0.52 | 0.35 | 0.62 | 0.60 |
Eclipse | Severity | 0.26 | 0.50 | 0.33 | 0.50 | 0.51 | 0.25 | 0.50 | 0.34 | 0.51 | 0.50 |
Eclipse | Priority | 0.25 | 0.50 | 0.33 | 0.52 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.53 |
FreeBSD | Duplicate | 0.70 | 0.71 | 0.69 | 0.76 | 0.77 | 0.68 | 0.70 | 0.69 | 0.76 | 0.77 |
FreeBSD | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.52 | 0.25 | 0.50 | 0.33 | 0.51 | 0.53 |
FreeBSD | Priority | 0.25 | 0.50 | 0.33 | 0.53 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 |
GCC | Duplicate | 0.39 | 0.51 | 0.35 | 0.60 | 0.58 | 0.35 | 0.50 | 0.33 | 0.50 | 0.50 |
GCC | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.52 |
GCC | Priority | 0.25 | 0.50 | 0.34 | 0.50 | 0.51 | 0.25 | 0.50 | 0.33 | 0.51 | 0.52 |
Gentoo | Duplicate | 0.46 | 0.50 | 0.34 | 0.60 | 0.61 | 0.46 | 0.51 | 0.34 | 0.60 | 0.60 |
Gentoo | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.51 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 |
Gentoo | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.55 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 |
Kernel | Duplicate | 0.51 | 0.50 | 0.35 | 0.55 | 0.54 | 0.52 | 0.56 | 0.38 | 0.62 | 0.58 |
Kernel | Severity | 0.25 | 0.50 | 0.33 | 0.51 | 0.58 | 0.25 | 0.50 | 0.33 | 0.49 | 0.60 |
Kernel | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.60 | 0.25 | 0.50 | 0.33 | 0.49 | 0.55 |
RedHat | Duplicate | 0.66 | 0.61 | 0.68 | 0.68 | 0.65 | 0.70 | 0.67 | 0.65 | 0.75 | 0.72 |
RedHat | Severity | 0.25 | 0.50 | 0.34 | 0.53 | 0.52 | 0.25 | 0.50 | 0.33 | 0.50 | 0.64 |
RedHat | Priority | 0.25 | 0.50 | 0.33 | 0.52 | 0.52 | 0.25 | 0.50 | 0.33 | 0.50 | 0.66 |
Sourceware | Duplicate | 0.48 | 0.50 | 0.34 | 0.50 | 0.50 | 0.48 | 0.50 | 0.34 | 0.50 | 0.50 |
Sourceware | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.56 | 0.25 | 0.50 | 0.33 | 0.50 | 0.60 |
Sourceware | Priority | 0.25 | 0.50 | 0.33 | 0.51 | 0.61 | 0.25 | 0.50 | 0.33 | 0.50 | 0.62 |
WebKit | Duplicate | 0.44 | 0.51 | 0.33 | 0.66 | 0.66 | 0.53 | 0.50 | 0.33 | 0.65 | 0.65 |
WebKit | Severity | 0.25 | 0.50 | 0.33 | 0.51 | 0.52 | 0.25 | 0.50 | 0.33 | 0.50 | 0.75 |
WebKit | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.51 | 0.25 | 0.50 | 0.33 | 0.50 | 0.73 |

Table A3. GRU results with stop word removal (SWR) and without it (no SWR).

Project | Task | Precision (SWR) | Recall (SWR) | F1 (SWR) | ROC (SWR) | PR (SWR) | Precision (no SWR) | Recall (no SWR) | F1 (no SWR) | ROC (no SWR) | PR (no SWR) |
---|---|---|---|---|---|---|---|---|---|---|---|
Eclipse | Duplicate | 0.48 | 0.50 | 0.33 | 0.54 | 0.54 | 0.45 | 0.50 | 0.33 | 0.56 | 0.56 |
Eclipse | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.57 | 0.25 | 0.50 | 0.33 | 0.49 | 0.50 |
Eclipse | Priority | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.59 |
FreeBSD | Duplicate | 0.44 | 0.50 | 0.33 | 0.53 | 0.50 | 0.43 | 0.50 | 0.33 | 0.50 | 0.50 |
FreeBSD | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.63 |
FreeBSD | Priority | 0.25 | 0.50 | 0.34 | 0.51 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.55 |
GCC | Duplicate | 0.33 | 0.50 | 0.34 | 0.54 | 0.54 | 0.35 | 0.50 | 0.33 | 0.50 | 0.50 |
GCC | Severity | 0.25 | 0.50 | 0.33 | 0.49 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.51 |
GCC | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.51 |
Gentoo | Duplicate | 0.32 | 0.50 | 0.35 | 0.55 | 0.50 | 0.29 | 0.50 | 0.33 | 0.56 | 0.56 |
Gentoo | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Gentoo | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Kernel | Duplicate | 0.63 | 0.58 | 0.34 | 0.66 | 0.63 | 0.57 | 0.50 | 0.35 | 0.52 | 0.50 |
Kernel | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.62 | 0.25 | 0.50 | 0.33 | 0.50 | 0.58 |
Kernel | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.62 | 0.25 | 0.50 | 0.33 | 0.50 | 0.64 |
RedHat | Duplicate | 0.65 | 0.59 | 0.53 | 0.59 | 0.54 | 0.67 | 0.62 | 0.58 | 0.67 | 0.62 |
RedHat | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.71 |
RedHat | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.66 |
Sourceware | Duplicate | 0.42 | 0.50 | 0.33 | 0.51 | 0.51 | 0.41 | 0.50 | 0.33 | 0.56 | 0.50 |
Sourceware | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.66 | 0.25 | 0.50 | 0.33 | 0.50 | 0.63 |
Sourceware | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.68 | 0.25 | 0.50 | 0.33 | 0.50 | 0.59 |
WebKit | Duplicate | 0.38 | 0.50 | 0.33 | 0.65 | 0.64 | 0.27 | 0.50 | 0.33 | 0.65 | 0.65 |
WebKit | Severity | 0.26 | 0.50 | 0.33 | 0.51 | 0.51 | 0.25 | 0.50 | 0.33 | 0.50 | 0.73 |
WebKit | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.69 |

Table A4. Transformer results with stop word removal (SWR) and without it (no SWR).

Project | Task | Precision (SWR) | Recall (SWR) | F1 (SWR) | ROC (SWR) | PR (SWR) | Precision (no SWR) | Recall (no SWR) | F1 (no SWR) | ROC (no SWR) | PR (no SWR) |
---|---|---|---|---|---|---|---|---|---|---|---|
Eclipse | Duplicate | 0.43 | 0.50 | 0.33 | 0.51 | 0.55 | 0.35 | 0.50 | 0.33 | 0.55 | 0.56 |
Eclipse | Severity | 0.25 | 0.50 | 0.33 | 0.49 | 0.50 | 0.25 | 0.50 | 0.34 | 0.50 | 0.55 |
Eclipse | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.55 | 0.50 |
FreeBSD | Duplicate | 0.33 | 0.50 | 0.33 | 0.56 | 0.55 | 0.36 | 0.50 | 0.33 | 0.55 | 0.50 |
FreeBSD | Severity | 0.25 | 0.50 | 0.33 | 0.49 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
FreeBSD | Priority | 0.27 | 0.50 | 0.34 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
GCC | Duplicate | 0.28 | 0.50 | 0.36 | 0.50 | 0.50 | 0.32 | 0.50 | 0.35 | 0.55 | 0.50 |
GCC | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
GCC | Priority | 0.25 | 0.50 | 0.33 | 0.51 | 0.56 | 0.25 | 0.50 | 0.33 | 0.52 | 0.56 |
Gentoo | Duplicate | 0.26 | 0.50 | 0.34 | 0.56 | 0.55 | 0.28 | 0.50 | 0.34 | 0.50 | 0.56 |
Gentoo | Severity | 0.25 | 0.50 | 0.34 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Gentoo | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.35 | 0.50 | 0.50 |
Kernel | Duplicate | 0.48 | 0.50 | 0.34 | 0.50 | 0.49 | 0.48 | 0.50 | 0.35 | 0.50 | 0.50 |
Kernel | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 |
Kernel | Priority | 0.26 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 |
RedHat | Duplicate | 0.43 | 0.50 | 0.35 | 0.56 | 0.54 | 0.37 | 0.50 | 0.34 | 0.50 | 0.67 |
RedHat | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
RedHat | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Sourceware | Duplicate | 0.50 | 0.50 | 0.34 | 0.49 | 0.50 | 0.49 | 0.50 | 0.33 | 0.50 | 0.51 |
Sourceware | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Sourceware | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
WebKit | Duplicate | 0.40 | 0.51 | 0.36 | 0.59 | 0.60 | 0.43 | 0.50 | 0.33 | 0.61 | 0.62 |
WebKit | Severity | 0.26 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
WebKit | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |

Table A5. BERT results with stop word removal (SWR) and without it (no SWR).

Project | Task | Precision (SWR) | Recall (SWR) | F1 (SWR) | ROC (SWR) | PR (SWR) | Precision (no SWR) | Recall (no SWR) | F1 (no SWR) | ROC (no SWR) | PR (no SWR) |
---|---|---|---|---|---|---|---|---|---|---|---|
Eclipse | Duplicate | 0.34 | 0.50 | 0.33 | 0.50 | 0.52 | 0.37 | 0.50 | 0.33 | 0.58 | 0.57 |
Eclipse | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.52 | 0.50 |
Eclipse | Priority | 0.25 | 0.50 | 0.33 | 0.49 | 0.52 | 0.25 | 0.50 | 0.33 | 0.50 | 0.69 |
FreeBSD | Duplicate | 0.33 | 0.50 | 0.34 | 0.50 | 0.52 | 0.32 | 0.50 | 0.33 | 0.50 | 0.50 |
FreeBSD | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
FreeBSD | Priority | 0.26 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.51 |
GCC | Duplicate | 0.32 | 0.50 | 0.35 | 0.50 | 0.50 | 0.32 | 0.50 | 0.33 | 0.50 | 0.50 |
GCC | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.34 | 0.50 | 0.50 |
GCC | Priority | 0.25 | 0.50 | 0.33 | 0.49 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Gentoo | Duplicate | 0.29 | 0.50 | 0.33 | 0.52 | 0.59 | 0.32 | 0.50 | 0.34 | 0.59 | 0.51 |
Gentoo | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Gentoo | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Kernel | Duplicate | 0.53 | 0.52 | 0.38 | 0.57 | 0.54 | 0.52 | 0.50 | 0.34 | 0.58 | 0.55 |
Kernel | Severity | 0.25 | 0.50 | 0.33 | 0.52 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 |
Kernel | Priority | 0.26 | 0.50 | 0.34 | 0.51 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 |
RedHat | Duplicate | 0.62 | 0.53 | 0.41 | 0.58 | 0.55 | 0.64 | 0.59 | 0.44 | 0.67 | 0.64 |
RedHat | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.53 | 0.52 |
RedHat | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.52 | 0.51 |
Sourceware | Duplicate | 0.44 | 0.50 | 0.34 | 0.49 | 0.49 | 0.39 | 0.50 | 0.33 | 0.50 | 0.50 |
Sourceware | Severity | 0.28 | 0.50 | 0.33 | 0.50 | 0.49 | 0.29 | 0.50 | 0.33 | 0.51 | 0.50 |
Sourceware | Priority | 0.26 | 0.50 | 0.33 | 0.50 | 0.49 | 0.26 | 0.50 | 0.33 | 0.51 | 0.50 |
WebKit | Duplicate | 0.30 | 0.50 | 0.33 | 0.60 | 0.59 | 0.42 | 0.50 | 0.33 | 0.58 | 0.58 |
WebKit | Severity | 0.26 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
WebKit | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.49 | 0.50 |

Table A6. CNN results with stop word removal (SWR) and without it (no SWR).

Project | Task | Precision (SWR) | Recall (SWR) | F1 (SWR) | ROC (SWR) | PR (SWR) | Precision (no SWR) | Recall (no SWR) | F1 (no SWR) | ROC (no SWR) | PR (no SWR) |
---|---|---|---|---|---|---|---|---|---|---|---|
Eclipse | Duplicate | 0.42 | 0.50 | 0.33 | 0.51 | 0.51 | 0.54 | 0.50 | 0.34 | 0.53 | 0.53 |
Eclipse | Severity | 0.25 | 0.50 | 0.34 | 0.52 | 0.50 | 0.25 | 0.50 | 0.33 | 0.52 | 0.50 |
Eclipse | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.52 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 |
FreeBSD | Duplicate | 0.53 | 0.50 | 0.33 | 0.55 | 0.56 | 0.48 | 0.51 | 0.36 | 0.65 | 0.66 |
FreeBSD | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.72 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
FreeBSD | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
GCC | Duplicate | 0.42 | 0.50 | 0.33 | 0.54 | 0.55 | 0.36 | 0.50 | 0.33 | 0.54 | 0.55 |
GCC | Severity | 0.25 | 0.50 | 0.33 | 0.52 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.69 |
GCC | Priority | 0.25 | 0.50 | 0.33 | 0.51 | 0.69 | 0.25 | 0.50 | 0.33 | 0.52 | 0.50 |
Gentoo | Duplicate | 0.43 | 0.50 | 0.33 | 0.53 | 0.53 | 0.50 | 0.50 | 0.33 | 0.54 | 0.54 |
Gentoo | Severity | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Gentoo | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Kernel | Duplicate | 0.31 | 0.50 | 0.33 | 0.51 | 0.51 | 0.34 | 0.50 | 0.34 | 0.53 | 0.52 |
Kernel | Severity | 0.25 | 0.50 | 0.33 | 0.49 | 0.49 | 0.25 | 0.50 | 0.33 | 0.49 | 0.50 |
Kernel | Priority | 0.26 | 0.50 | 0.33 | 0.51 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.51 |
RedHat | Duplicate | 0.45 | 0.50 | 0.33 | 0.49 | 0.49 | 0.50 | 0.50 | 0.33 | 0.50 | 0.51 |
RedHat | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.52 | 0.59 |
RedHat | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.59 |
Sourceware | Duplicate | 0.49 | 0.50 | 0.34 | 0.49 | 0.49 | 0.45 | 0.50 | 0.33 | 0.50 | 0.52 |
Sourceware | Severity | 0.25 | 0.50 | 0.33 | 0.49 | 0.50 | 0.25 | 0.50 | 0.33 | 0.49 | 0.53 |
Sourceware | Priority | 0.25 | 0.50 | 0.33 | 0.49 | 0.50 | 0.25 | 0.50 | 0.33 | 0.49 | 0.49 |
WebKit | Duplicate | 0.40 | 0.50 | 0.33 | 0.51 | 0.51 | 0.33 | 0.50 | 0.33 | 0.53 | 0.54 |
WebKit | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.67 |
WebKit | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.51 | 0.25 | 0.50 | 0.33 | 0.50 | 0.67 |

Table A7. LSTM results with stop word removal (SWR) and without it (no SWR).

Project | Task | Precision (SWR) | Recall (SWR) | F1 (SWR) | ROC (SWR) | PR (SWR) | Precision (no SWR) | Recall (no SWR) | F1 (no SWR) | ROC (no SWR) | PR (no SWR) |
---|---|---|---|---|---|---|---|---|---|---|---|
Eclipse | Duplicate | 0.34 | 0.50 | 0.33 | 0.51 | 0.51 | 0.57 | 0.50 | 0.33 | 0.53 | 0.53 |
Eclipse | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.54 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 |
Eclipse | Priority | 0.25 | 0.50 | 0.33 | 0.49 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.69 |
FreeBSD | Duplicate | 0.57 | 0.55 | 0.46 | 0.69 | 0.70 | 0.57 | 0.55 | 0.47 | 0.68 | 0.68 |
FreeBSD | Severity | 0.26 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
FreeBSD | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.59 | 0.25 | 0.50 | 0.33 | 0.52 | 0.50 |
GCC | Duplicate | 0.40 | 0.51 | 0.34 | 0.60 | 0.59 | 0.39 | 0.51 | 0.35 | 0.60 | 0.60 |
GCC | Severity | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.55 |
GCC | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Gentoo | Duplicate | 0.47 | 0.50 | 0.34 | 0.60 | 0.61 | 0.46 | 0.50 | 0.33 | 0.61 | 0.60 |
Gentoo | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Gentoo | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Kernel | Duplicate | 0.51 | 0.50 | 0.35 | 0.55 | 0.54 | 0.53 | 0.50 | 0.34 | 0.53 | 0.52 |
Kernel | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.58 | 0.25 | 0.50 | 0.33 | 0.49 | 0.58 |
Kernel | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.62 | 0.25 | 0.50 | 0.33 | 0.50 | 0.65 |
RedHat | Duplicate | 0.62 | 0.50 | 0.35 | 0.55 | 0.54 | 0.62 | 0.51 | 0.37 | 0.58 | 0.56 |
RedHat | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.26 | 0.50 | 0.33 | 0.50 | 0.71 |
RedHat | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.73 |
Sourceware | Duplicate | 0.39 | 0.50 | 0.33 | 0.51 | 0.50 | 0.38 | 0.50 | 0.33 | 0.50 | 0.50 |
Sourceware | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.64 | 0.25 | 0.50 | 0.33 | 0.50 | 0.56 |
Sourceware | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.58 | 0.25 | 0.50 | 0.33 | 0.50 | 0.66 |
WebKit | Duplicate | 0.52 | 0.50 | 0.34 | 0.56 | 0.56 | 0.43 | 0.50 | 0.33 | 0.55 | 0.56 |
WebKit | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.55 |
WebKit | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.55 |

Table A8. GRU results with stop word removal (SWR) and without it (no SWR).

Project | Task | Precision (SWR) | Recall (SWR) | F1 (SWR) | ROC (SWR) | PR (SWR) | Precision (no SWR) | Recall (no SWR) | F1 (no SWR) | ROC (no SWR) | PR (no SWR) |
---|---|---|---|---|---|---|---|---|---|---|---|
Eclipse | Duplicate | 0.41 | 0.50 | 0.33 | 0.50 | 0.50 | 0.50 | 0.50 | 0.33 | 0.53 | 0.53 |
Eclipse | Severity | 0.26 | 0.50 | 0.33 | 0.51 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Eclipse | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.26 | 0.50 | 0.33 | 0.50 | 0.50 |
FreeBSD | Duplicate | 0.39 | 0.50 | 0.33 | 0.50 | 0.50 | 0.33 | 0.50 | 0.33 | 0.52 | 0.56 |
FreeBSD | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.70 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
FreeBSD | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
GCC | Duplicate | 0.33 | 0.50 | 0.33 | 0.56 | 0.56 | 0.32 | 0.50 | 0.34 | 0.55 | 0.55 |
GCC | Severity | 0.25 | 0.50 | 0.34 | 0.50 | 0.58 | 0.26 | 0.50 | 0.33 | 0.51 | 0.50 |
GCC | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Gentoo | Duplicate | 0.31 | 0.50 | 0.33 | 0.59 | 0.58 | 0.33 | 0.50 | 0.35 | 0.59 | 0.59 |
Gentoo | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.51 |
Gentoo | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Kernel | Duplicate | 0.48 | 0.50 | 0.34 | 0.52 | 0.53 | 0.50 | 0.50 | 0.34 | 0.51 | 0.51 |
Kernel | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.63 | 0.25 | 0.50 | 0.33 | 0.50 | 0.65 |
Kernel | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.66 | 0.25 | 0.50 | 0.33 | 0.50 | 0.61 |
RedHat | Duplicate | 0.58 | 0.50 | 0.34 | 0.53 | 0.52 | 0.61 | 0.51 | 0.36 | 0.56 | 0.54 |
RedHat | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.73 |
RedHat | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.72 |
Sourceware | Duplicate | 0.45 | 0.50 | 0.34 | 0.50 | 0.50 | 0.39 | 0.50 | 0.33 | 0.50 | 0.50 |
Sourceware | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.59 | 0.25 | 0.50 | 0.33 | 0.50 | 0.59 |
Sourceware | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.67 | 0.25 | 0.50 | 0.33 | 0.50 | 0.63 |
WebKit | Duplicate | 0.55 | 0.50 | 0.33 | 0.55 | 0.56 | 0.57 | 0.50 | 0.33 | 0.57 | 0.58 |
WebKit | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.75 |
WebKit | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.59 | 0.25 | 0.50 | 0.33 | 0.50 | 0.75 |

Table A9. Transformer results with stop word removal (SWR) and without it (no SWR).

Project | Task | Precision (SWR) | Recall (SWR) | F1 (SWR) | ROC (SWR) | PR (SWR) | Precision (no SWR) | Recall (no SWR) | F1 (no SWR) | ROC (no SWR) | PR (no SWR) |
---|---|---|---|---|---|---|---|---|---|---|---|
Eclipse | Duplicate | 0.37 | 0.50 | 0.34 | 0.50 | 0.51 | 0.36 | 0.50 | 0.33 | 0.51 | 0.51 |
Eclipse | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Eclipse | Priority | 0.25 | 0.50 | 0.33 | 0.51 | 0.51 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 |
FreeBSD | Duplicate | 0.32 | 0.50 | 0.33 | 0.51 | 0.51 | 0.33 | 0.50 | 0.33 | 0.56 | 0.52 |
FreeBSD | Severity | 0.25 | 0.50 | 0.33 | 0.51 | 0.49 | 0.25 | 0.50 | 0.33 | 0.50 | 0.68 |
FreeBSD | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.52 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 |
GCC | Duplicate | 0.31 | 0.50 | 0.34 | 0.50 | 0.55 | 0.33 | 0.50 | 0.35 | 0.52 | 0.55 |
GCC | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.49 | 0.25 | 0.50 | 0.33 | 0.50 | 0.59 |
GCC | Priority | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | 0.25 | 0.50 | 0.33 | 0.49 | 0.50 |
Gentoo | Duplicate | 0.26 | 0.50 | 0.33 | 0.55 | 0.52 | 0.29 | 0.50 | 0.34 | 0.50 | 0.51 |
Gentoo | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.51 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Gentoo | Priority | 0.25 | 0.50 | 0.34 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Kernel | Duplicate | 0.37 | 0.50 | 0.34 | 0.50 | 0.51 | 0.51 | 0.50 | 0.35 | 0.50 | 0.50 |
Kernel | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Kernel | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
RedHat | Duplicate | 0.48 | 0.51 | 0.41 | 0.51 | 0.50 | 0.50 | 0.50 | 0.41 | 0.50 | 0.51 |
RedHat | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
RedHat | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Sourceware | Duplicate | 0.40 | 0.50 | 0.33 | 0.50 | 0.50 | 0.41 | 0.50 | 0.34 | 0.50 | 0.55 |
Sourceware | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Sourceware | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
WebKit | Duplicate | 0.44 | 0.50 | 0.33 | 0.54 | 0.55 | 0.46 | 0.50 | 0.33 | 0.57 | 0.57 |
WebKit | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
WebKit | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.51 |

Table A10. BERT results with stop word removal (SWR) and without it (no SWR).

Project | Task | Precision (SWR) | Recall (SWR) | F1 (SWR) | ROC (SWR) | PR (SWR) | Precision (no SWR) | Recall (no SWR) | F1 (no SWR) | ROC (no SWR) | PR (no SWR) |
---|---|---|---|---|---|---|---|---|---|---|---|
Eclipse | Duplicate | 0.48 | 0.50 | 0.33 | 0.52 | 0.51 | 0.47 | 0.50 | 0.33 | 0.55 | 0.56 |
Eclipse | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.54 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 |
Eclipse | Priority | 0.25 | 0.50 | 0.33 | 0.53 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
FreeBSD | Duplicate | 0.35 | 0.50 | 0.33 | 0.56 | 0.55 | 0.33 | 0.50 | 0.34 | 0.50 | 0.52 |
FreeBSD | Severity | 0.26 | 0.50 | 0.33 | 0.50 | 0.50 | 0.26 | 0.50 | 0.33 | 0.50 | 0.50 |
FreeBSD | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.26 | 0.50 | 0.33 | 0.50 | 0.50 |
GCC | Duplicate | 0.29 | 0.50 | 0.34 | 0.50 | 0.50 | 0.29 | 0.50 | 0.33 | 0.51 | 0.51 |
GCC | Severity | 0.25 | 0.50 | 0.33 | 0.52 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.69 |
GCC | Priority | 0.27 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 |
Gentoo | Duplicate | 0.25 | 0.50 | 0.34 | 0.55 | 0.59 | 0.26 | 0.50 | 0.35 | 0.51 | 0.51 |
Gentoo | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Gentoo | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Kernel | Duplicate | 0.52 | 0.50 | 0.34 | 0.49 | 0.49 | 0.41 | 0.50 | 0.34 | 0.49 | 0.49 |
Kernel | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Kernel | Priority | 0.26 | 0.50 | 0.33 | 0.51 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
RedHat | Duplicate | 0.58 | 0.50 | 0.34 | 0.50 | 0.50 | 0.60 | 0.50 | 0.34 | 0.51 | 0.51 |
RedHat | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
RedHat | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Sourceware | Duplicate | 0.43 | 0.50 | 0.33 | 0.50 | 0.51 | 0.41 | 0.50 | 0.33 | 0.52 | 0.52 |
Sourceware | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Sourceware | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
WebKit | Duplicate | 0.39 | 0.50 | 0.33 | 0.53 | 0.53 | 0.52 | 0.50 | 0.33 | 0.52 | 0.51 |
WebKit | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
WebKit | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |

Table A11. CNN results with stop word removal (SWR) and without it (no SWR).

Project | Task | Precision (SWR) | Recall (SWR) | F1 (SWR) | ROC (SWR) | PR (SWR) | Precision (no SWR) | Recall (no SWR) | F1 (no SWR) | ROC (no SWR) | PR (no SWR) |
---|---|---|---|---|---|---|---|---|---|---|---|
Eclipse | Duplicate | 0.35 | 0.50 | 0.33 | 0.55 | 0.55 | 0.33 | 0.50 | 0.34 | 0.55 | 0.55 |
Eclipse | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Eclipse | Priority | 0.25 | 0.50 | 0.33 | 0.52 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
FreeBSD | Duplicate | 0.63 | 0.59 | 0.45 | 0.70 | 0.69 | 0.57 | 0.54 | 0.43 | 0.69 | 0.69 |
FreeBSD | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
FreeBSD | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.56 | 0.25 | 0.50 | 0.33 | 0.50 | 0.62 |
GCC | Duplicate | 0.34 | 0.50 | 0.33 | 0.54 | 0.54 | 0.41 | 0.50 | 0.33 | 0.53 | 0.53 |
GCC | Severity | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | 0.25 | 0.50 | 0.33 | 0.52 | 0.50 |
GCC | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.52 |
Gentoo | Duplicate | 0.30 | 0.50 | 0.33 | 0.57 | 0.57 | 0.40 | 0.50 | 0.36 | 0.58 | 0.58 |
Gentoo | Severity | 0.25 | 0.50 | 0.33 | 0.53 | 0.51 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Gentoo | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Kernel | Duplicate | 0.47 | 0.50 | 0.34 | 0.50 | 0.51 | 0.52 | 0.50 | 0.34 | 0.51 | 0.51 |
Kernel | Severity | 0.29 | 0.50 | 0.35 | 0.51 | 0.51 | 0.32 | 0.50 | 0.34 | 0.51 | 0.51 |
Kernel | Priority | 0.39 | 0.50 | 0.38 | 0.49 | 0.50 | 0.27 | 0.50 | 0.35 | 0.50 | 0.50 |
RedHat | Duplicate | 0.60 | 0.51 | 0.38 | 0.60 | 0.59 | 0.62 | 0.55 | 0.40 | 0.65 | 0.64 |
RedHat | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.31 | 0.50 | 0.33 | 0.51 | 0.74 |
RedHat | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.36 | 0.50 | 0.33 | 0.50 | 0.75 |
Sourceware | Duplicate | 0.40 | 0.50 | 0.34 | 0.51 | 0.51 | 0.36 | 0.50 | 0.33 | 0.51 | 0.52 |
Sourceware | Severity | 0.31 | 0.50 | 0.34 | 0.50 | 0.50 | 0.30 | 0.50 | 0.33 | 0.51 | 0.52 |
Sourceware | Priority | 0.33 | 0.50 | 0.33 | 0.52 | 0.51 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 |
WebKit | Duplicate | 0.34 | 0.50 | 0.33 | 0.61 | 0.60 | 0.30 | 0.50 | 0.33 | 0.63 | 0.61 |
WebKit | Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
WebKit | Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Model | LSTM | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Preprocessing | | With Stopword Removal | | | | | Without Stopword Removal | | | | |
Project | Task | Precision | Recall | F1 | ROC | PR | Precision | Recall | F1 | ROC | PR |
Eclipse | Duplicate | 0.48 | 0.51 | 0.37 | 0.61 | 0.59 | 0.49 | 0.51 | 0.35 | 0.60 | 0.60 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.52 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | |
FreeBSD | Duplicate | 0.36 | 0.50 | 0.36 | 0.58 | 0.59 | 0.37 | 0.50 | 0.36 | 0.59 | 0.60 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.55 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
GCC | Duplicate | 0.39 | 0.50 | 0.34 | 0.55 | 0.58 | 0.40 | 0.50 | 0.35 | 0.60 | 0.60 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.69 | 0.25 | 0.50 | 0.33 | 0.50 | 0.59 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Gentoo | Duplicate | 0.37 | 0.50 | 0.33 | 0.61 | 0.61 | 0.35 | 0.50 | 0.33 | 0.59 | 0.59 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Kernel | Duplicate | 0.54 | 0.52 | 0.39 | 0.60 | 0.57 | 0.60 | 0.55 | 0.37 | 0.63 | 0.50 |
Severity | 0.26 | 0.50 | 0.33 | 0.50 | 0.59 | 0.26 | 0.50 | 0.33 | 0.49 | 0.54 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.60 | 0.25 | 0.50 | 0.33 | 0.50 | 0.57 | |
RedHat | Duplicate | 0.68 | 0.65 | 0.64 | 0.72 | 0.69 | 0.71 | 0.69 | 0.67 | 0.75 | 0.72 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.73 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.72 | |
Sourceware | Duplicate | 0.49 | 0.50 | 0.34 | 0.50 | 0.50 | 0.47 | 0.50 | 0.33 | 0.51 | 0.55 |
Severity | 0.28 | 0.50 | 0.33 | 0.50 | 0.64 | 0.25 | 0.50 | 0.33 | 0.50 | 0.58 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.57 | 0.26 | 0.50 | 0.33 | 0.49 | 0.59 | |
WebKit | Duplicate | 0.42 | 0.52 | 0.38 | 0.63 | 0.62 | 0.57 | 0.53 | 0.40 | 0.65 | 0.64 |
Severity | 0.25 | 0.50 | 0.33 | 0.52 | 0.56 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Model | GRU | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Preprocessing | | With Stopword Removal | | | | | Without Stopword Removal | | | | |
Project | Task | Precision | Recall | F1 | ROC | PR | Precision | Recall | F1 | ROC | PR |
Eclipse | Duplicate | 0.42 | 0.50 | 0.33 | 0.58 | 0.57 | 0.41 | 0.50 | 0.33 | 0.60 | 0.59 |
Severity | 0.25 | 0.50 | 0.33 | 0.51 | 0.55 | 0.25 | 0.50 | 0.33 | 0.50 | 0.69 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
FreeBSD | Duplicate | 0.32 | 0.50 | 0.33 | 0.52 | 0.50 | 0.35 | 0.50 | 0.34 | 0.50 | 0.51 |
Severity | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.61 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.51 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
GCC | Duplicate | 0.34 | 0.50 | 0.33 | 0.53 | 0.54 | 0.33 | 0.50 | 0.33 | 0.54 | 0.54 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.52 | |
Priority | 0.25 | 0.50 | 0.33 | 0.52 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Gentoo | Duplicate | 0.36 | 0.50 | 0.33 | 0.52 | 0.50 | 0.33 | 0.50 | 0.33 | 0.49 | 0.50 |
Severity | 0.25 | 0.50 | 0.33 | 0.49 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Kernel | Duplicate | 0.42 | 0.50 | 0.34 | 0.52 | 0.52 | 0.42 | 0.50 | 0.34 | 0.52 | 0.52 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.59 | 0.25 | 0.50 | 0.33 | 0.50 | 0.60 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.62 | 0.25 | 0.50 | 0.33 | 0.50 | 0.65 | |
RedHat | Duplicate | 0.64 | 0.58 | 0.62 | 0.60 | 0.55 | 0.67 | 0.64 | 0.62 | 0.68 | 0.62 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.72 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.73 | |
Sourceware | Duplicate | 0.46 | 0.50 | 0.34 | 0.51 | 0.51 | 0.46 | 0.50 | 0.33 | 0.50 | 0.51 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.66 | 0.25 | 0.50 | 0.33 | 0.50 | 0.65 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.60 | 0.25 | 0.50 | 0.33 | 0.50 | 0.68 | |
WebKit | Duplicate | 0.39 | 0.50 | 0.33 | 0.62 | 0.61 | 0.36 | 0.50 | 0.34 | 0.63 | 0.62 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Model | Transformer | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Preprocessing | | With Stopword Removal | | | | | Without Stopword Removal | | | | |
Project | Task | Precision | Recall | F1 | ROC | PR | Precision | Recall | F1 | ROC | PR |
Eclipse | Duplicate | 0.28 | 0.50 | 0.33 | 0.56 | 0.56 | 0.29 | 0.50 | 0.33 | 0.50 | 0.50 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.64 | |
Priority | 0.25 | 0.50 | 0.33 | 0.51 | 0.55 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | |
FreeBSD | Duplicate | 0.35 | 0.50 | 0.33 | 0.50 | 0.51 | 0.35 | 0.50 | 0.33 | 0.51 | 0.51 |
Severity | 0.27 | 0.50 | 0.33 | 0.51 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.61 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
GCC | Duplicate | 0.40 | 0.50 | 0.34 | 0.50 | 0.49 | 0.41 | 0.50 | 0.33 | 0.51 | 0.51 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.55 | 0.25 | 0.50 | 0.33 | 0.52 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.34 | 0.50 | 0.50 | |
Gentoo | Duplicate | 0.36 | 0.50 | 0.34 | 0.50 | 0.50 | 0.35 | 0.50 | 0.33 | 0.50 | 0.50 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.52 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.51 | |
Kernel | Duplicate | 0.50 | 0.50 | 0.38 | 0.50 | 0.54 | 0.46 | 0.50 | 0.35 | 0.52 | 0.50 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.54 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | |
RedHat | Duplicate | 0.45 | 0.50 | 0.36 | 0.51 | 0.53 | 0.54 | 0.52 | 0.35 | 0.52 | 0.46 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Sourceware | Duplicate | 0.43 | 0.50 | 0.36 | 0.50 | 0.50 | 0.42 | 0.50 | 0.35 | 0.52 | 0.52 |
Severity | 0.26 | 0.50 | 0.34 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
WebKit | Duplicate | 0.40 | 0.51 | 0.35 | 0.60 | 0.60 | 0.35 | 0.50 | 0.33 | 0.58 | 0.59 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Model | BERT | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Preprocessing | | With Stopword Removal | | | | | Without Stopword Removal | | | | |
Project | Task | Precision | Recall | F1 | ROC | PR | Precision | Recall | F1 | ROC | PR |
Eclipse | Duplicate | 0.35 | 0.50 | 0.34 | 0.59 | 0.59 | 0.35 | 0.50 | 0.33 | 0.56 | 0.56 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.54 | 0.25 | 0.50 | 0.33 | 0.52 | 0.50 | |
Priority | 0.26 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.74 | |
FreeBSD | Duplicate | 0.35 | 0.50 | 0.33 | 0.55 | 0.56 | 0.34 | 0.50 | 0.33 | 0.52 | 0.52 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.64 | |
GCC | Duplicate | 0.33 | 0.50 | 0.33 | 0.50 | 0.50 | 0.32 | 0.50 | 0.33 | 0.51 | 0.51 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.69 | 0.25 | 0.50 | 0.33 | 0.49 | 0.54 | |
Gentoo | Duplicate | 0.31 | 0.50 | 0.33 | 0.55 | 0.55 | 0.32 | 0.50 | 0.33 | 0.50 | 0.50 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.51 | 0.55 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Kernel | Duplicate | 0.45 | 0.50 | 0.34 | 0.55 | 0.55 | 0.42 | 0.50 | 0.33 | 0.56 | 0.52 |
Severity | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.52 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | |
RedHat | Duplicate | 0.42 | 0.50 | 0.33 | 0.56 | 0.55 | 0.41 | 0.50 | 0.34 | 0.52 | 0.52 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Sourceware | Duplicate | 0.36 | 0.50 | 0.36 | 0.50 | 0.50 | 0.33 | 0.50 | 0.35 | 0.54 | 0.56 |
Severity | 0.26 | 0.50 | 0.33 | 0.50 | 0.49 | 0.25 | 0.50 | 0.33 | 0.50 | 0.49 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.49 | 0.25 | 0.50 | 0.33 | 0.49 | 0.49 | |
WebKit | Duplicate | 0.35 | 0.50 | 0.34 | 0.50 | 0.51 | 0.33 | 0.50 | 0.33 | 0.56 | 0.56 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Model | CNN | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Preprocessing | | With Stopword Removal | | | | | Without Stopword Removal | | | | |
Project | Task | Precision | Recall | F1 | ROC | PR | Precision | Recall | F1 | ROC | PR |
Eclipse | Duplicate | 0.29 | 0.50 | 0.33 | 0.56 | 0.56 | 0.30 | 0.50 | 0.35 | 0.51 | 0.50 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.61 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.34 | 0.50 | 0.50 | |
FreeBSD | Duplicate | 0.70 | 0.69 | 0.69 | 0.76 | 0.77 | 0.74 | 0.73 | 0.73 | 0.79 | 0.80 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.56 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.34 | 0.50 | 0.50 | |
GCC | Duplicate | 0.40 | 0.50 | 0.33 | 0.60 | 0.59 | 0.40 | 0.50 | 0.33 | 0.60 | 0.60 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.54 | 0.55 | |
Priority | 0.25 | 0.50 | 0.34 | 0.51 | 0.55 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Gentoo | Duplicate | 0.46 | 0.51 | 0.36 | 0.63 | 0.63 | 0.39 | 0.50 | 0.34 | 0.63 | 0.63 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Kernel | Duplicate | 0.54 | 0.51 | 0.36 | 0.56 | 0.54 | 0.46 | 0.51 | 0.36 | 0.52 | 0.59 |
Severity | 0.25 | 0.50 | 0.33 | 0.51 | 0.52 | 0.28 | 0.50 | 0.33 | 0.51 | 0.51 | |
Priority | 0.25 | 0.50 | 0.33 | 0.49 | 0.49 | 0.25 | 0.50 | 0.33 | 0.49 | 0.50 | |
RedHat | Duplicate | 0.78 | 0.78 | 0.78 | 0.82 | 0.79 | 0.78 | 0.78 | 0.78 | 0.81 | 0.78 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.29 | 0.50 | 0.33 | 0.51 | 0.58 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.52 | 0.61 | |
Sourceware | Duplicate | 0.49 | 0.50 | 0.37 | 0.50 | 0.51 | 0.43 | 0.50 | 0.35 | 0.50 | 0.51 |
Severity | 0.31 | 0.50 | 0.33 | 0.50 | 0.50 | 0.28 | 0.50 | 0.33 | 0.49 | 0.51 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.54 | 0.27 | 0.50 | 0.33 | 0.50 | 0.50 | |
WebKit | Duplicate | 0.50 | 0.53 | 0.38 | 0.62 | 0.64 | 0.27 | 0.50 | 0.33 | 0.63 | 0.66 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Model | LSTM | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Preprocessing | | With Stopword Removal | | | | | Without Stopword Removal | | | | |
Project | Task | Precision | Recall | F1 | ROC | PR | Precision | Recall | F1 | ROC | PR |
Eclipse | Duplicate | 0.38 | 0.50 | 0.34 | 0.59 | 0.58 | 0.35 | 0.50 | 0.33 | 0.52 | 0.56 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.51 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.51 | 0.52 | 0.25 | 0.50 | 0.33 | 0.50 | 0.55 | |
FreeBSD | Duplicate | 0.35 | 0.50 | 0.33 | 0.52 | 0.52 | 0.33 | 0.50 | 0.33 | 0.50 | 0.52 |
Severity | 0.25 | 0.50 | 0.33 | 0.52 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.61 | |
Priority | 0.25 | 0.50 | 0.33 | 0.49 | 0.49 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
GCC | Duplicate | 0.32 | 0.50 | 0.34 | 0.50 | 0.52 | 0.33 | 0.50 | 0.34 | 0.55 | 0.56 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.51 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Gentoo | Duplicate | 0.29 | 0.50 | 0.33 | 0.55 | 0.51 | 0.32 | 0.50 | 0.33 | 0.50 | 0.52 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | |
Kernel | Duplicate | 0.49 | 0.50 | 0.35 | 0.55 | 0.54 | 0.58 | 0.51 | 0.37 | 0.56 | 0.54 |
Severity | 0.26 | 0.50 | 0.33 | 0.50 | 0.57 | 0.25 | 0.50 | 0.33 | 0.49 | 0.62 | |
Priority | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.61 | |
RedHat | Duplicate | 0.63 | 0.57 | 0.40 | 0.66 | 0.64 | 0.62 | 0.52 | 0.39 | 0.56 | 0.54 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.72 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.26 | 0.50 | 0.33 | 0.50 | 0.71 | |
Sourceware | Duplicate | 0.44 | 0.50 | 0.33 | 0.50 | 0.50 | 0.42 | 0.50 | 0.33 | 0.51 | 0.50 |
Severity | 0.25 | 0.50 | 0.33 | 0.51 | 0.60 | 0.26 | 0.50 | 0.33 | 0.50 | 0.63 | |
Priority | 0.25 | 0.50 | 0.33 | 0.49 | 0.59 | 0.25 | 0.50 | 0.33 | 0.49 | 0.64 | |
WebKit | Duplicate | 0.41 | 0.50 | 0.33 | 0.58 | 0.57 | 0.45 | 0.50 | 0.33 | 0.60 | 0.60 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Model | GRU | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Preprocessing | | With Stopword Removal | | | | | Without Stopword Removal | | | | |
Project | Task | Precision | Recall | F1 | ROC | PR | Precision | Recall | F1 | ROC | PR |
Eclipse | Duplicate | 0.42 | 0.50 | 0.33 | 0.58 | 0.57 | 0.42 | 0.50 | 0.34 | 0.58 | 0.58 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.49 | |
FreeBSD | Duplicate | 0.32 | 0.50 | 0.33 | 0.50 | 0.54 | 0.32 | 0.50 | 0.33 | 0.51 | 0.51 |
Severity | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.52 | |
GCC | Duplicate | 0.33 | 0.50 | 0.33 | 0.53 | 0.54 | 0.34 | 0.50 | 0.33 | 0.50 | 0.52 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.55 | |
Gentoo | Duplicate | 0.32 | 0.50 | 0.34 | 0.51 | 0.51 | 0.31 | 0.50 | 0.33 | 0.51 | 0.52 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.65 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Kernel | Duplicate | 0.52 | 0.50 | 0.34 | 0.52 | 0.52 | 0.42 | 0.50 | 0.34 | 0.52 | 0.52 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.62 | 0.25 | 0.50 | 0.33 | 0.50 | 0.64 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.64 | 0.25 | 0.50 | 0.33 | 0.50 | 0.59 | |
RedHat | Duplicate | 0.64 | 0.58 | 0.52 | 0.60 | 0.55 | 0.67 | 0.64 | 0.52 | 0.68 | 0.62 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.72 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.70 | |
Sourceware | Duplicate | 0.46 | 0.50 | 0.34 | 0.52 | 0.50 | 0.45 | 0.50 | 0.34 | 0.51 | 0.51 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.61 | 0.25 | 0.50 | 0.33 | 0.50 | 0.65 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.66 | 0.25 | 0.50 | 0.33 | 0.50 | 0.63 | |
WebKit | Duplicate | 0.39 | 0.50 | 0.33 | 0.62 | 0.61 | 0.38 | 0.50 | 0.33 | 0.62 | 0.62 |
Severity | 0.25 | 0.50 | 0.33 | 0.49 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Model | Transformer | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Preprocessing | | With Stopword Removal | | | | | Without Stopword Removal | | | | |
Project | Task | Precision | Recall | F1 | ROC | PR | Precision | Recall | F1 | ROC | PR |
Eclipse | Duplicate | 0.28 | 0.50 | 0.33 | 0.56 | 0.56 | 0.28 | 0.50 | 0.33 | 0.55 | 0.52 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.51 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
FreeBSD | Duplicate | 0.29 | 0.50 | 0.33 | 0.51 | 0.52 | 0.29 | 0.50 | 0.33 | 0.52 | 0.52 |
Severity | 0.25 | 0.50 | 0.33 | 0.51 | 0.56 | 0.25 | 0.50 | 0.33 | 0.49 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.57 | |
GCC | Duplicate | 0.40 | 0.50 | 0.33 | 0.57 | 0.55 | 0.38 | 0.50 | 0.33 | 0.50 | 0.51 |
Severity | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | 0.25 | 0.50 | 0.33 | 0.49 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.49 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.51 | |
Gentoo | Duplicate | 0.31 | 0.50 | 0.33 | 0.55 | 0.56 | 0.32 | 0.50 | 0.33 | 0.55 | 0.55 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.26 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Kernel | Duplicate | 0.50 | 0.50 | 0.38 | 0.50 | 0.54 | 0.46 | 0.50 | 0.35 | 0.52 | 0.50 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
RedHat | Duplicate | 0.45 | 0.50 | 0.36 | 0.50 | 0.53 | 0.54 | 0.52 | 0.35 | 0.52 | 0.56 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Sourceware | Duplicate | 0.53 | 0.50 | 0.36 | 0.50 | 0.50 | 0.52 | 0.50 | 0.35 | 0.51 | 0.50 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.49 | 0.50 | |
WebKit | Duplicate | 0.41 | 0.51 | 0.35 | 0.60 | 0.60 | 0.35 | 0.50 | 0.33 | 0.58 | 0.59 |
Severity | 0.26 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Model | BERT | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Preprocessing | | With Stopword Removal | | | | | Without Stopword Removal | | | | |
Project | Task | Precision | Recall | F1 | ROC | PR | Precision | Recall | F1 | ROC | PR |
Eclipse | Duplicate | 0.35 | 0.50 | 0.33 | 0.52 | 0.52 | 0.36 | 0.50 | 0.33 | 0.55 | 0.51 |
Severity | 0.25 | 0.50 | 0.33 | 0.51 | 0.51 | 0.25 | 0.50 | 0.33 | 0.51 | 0.51 | |
Priority | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
FreeBSD | Duplicate | 0.28 | 0.50 | 0.33 | 0.50 | 0.50 | 0.28 | 0.50 | 0.33 | 0.52 | 0.51 |
Severity | 0.25 | 0.50 | 0.33 | 0.49 | 0.49 | 0.25 | 0.50 | 0.33 | 0.50 | 0.55 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.34 | 0.51 | 0.56 | |
GCC | Duplicate | 0.33 | 0.50 | 0.34 | 0.55 | 0.56 | 0.33 | 0.50 | 0.33 | 0.50 | 0.51 |
Severity | 0.25 | 0.50 | 0.33 | 0.49 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.49 | 0.25 | 0.50 | 0.33 | 0.49 | 0.69 | |
Gentoo | Duplicate | 0.36 | 0.50 | 0.33 | 0.50 | 0.54 | 0.35 | 0.50 | 0.34 | 0.56 | 0.56 |
Severity | 0.25 | 0.50 | 0.34 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Kernel | Duplicate | 0.35 | 0.50 | 0.33 | 0.59 | 0.60 | 0.36 | 0.50 | 0.33 | 0.54 | 0.56 |
Severity | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.49 | 0.50 | 0.25 | 0.50 | 0.33 | 0.51 | 0.50 | |
RedHat | Duplicate | 0.57 | 0.53 | 0.42 | 0.49 | 0.49 | 0.57 | 0.56 | 0.45 | 0.49 | 0.49 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | |
Sourceware | Duplicate | 0.32 | 0.50 | 0.33 | 0.50 | 0.51 | 0.33 | 0.50 | 0.33 | 0.52 | 0.52 |
Severity | 0.27 | 0.50 | 0.33 | 0.51 | 0.51 | 0.25 | 0.50 | 0.33 | 0.51 | 0.51 | |
Priority | 0.25 | 0.50 | 0.33 | 0.51 | 0.51 | 0.25 | 0.50 | 0.33 | 0.52 | 0.51 | |
WebKit | Duplicate | 0.35 | 0.50 | 0.33 | 0.51 | 0.54 | 0.27 | 0.50 | 0.33 | 0.54 | 0.57 |
Severity | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.54 | |
Priority | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 | 0.25 | 0.50 | 0.33 | 0.50 | 0.50 |
Hypothesis | p-Value | Result | Hypothesis | p-Value | Result | Hypothesis | p-Value | Result |
---|---|---|---|---|---|---|---|---|
$H1_{0}$ | 0.9944 (t) | Accept | $H2_{0}$ | 0.9234 (w) | Accept | $H3_{0}$ | 0.9238 (w) | Accept |
$H4_{0}$ | 0.9550 (w) | Accept | $H5_{0}$ | 0.9549 (t) | Accept | $H6_{0}$ | 0.9934 (t) | Accept |
$H7_{0}$ | 0.9838 (w) | Accept | $H8_{0}$ | 0.9731 (w) | Accept | $H9_{0}$ | 0.7903 (t) | Accept |
$H10_{0}$ | 0.8318 (w) | Accept | $H11_{0}$ | 0.9835 (t) | Accept | $H12_{0}$ | 0.9250 (t) | Accept |
$H13_{0}$ | 0.8620 (t) | Accept | $H14_{0}$ | 0.9675 (t) | Accept | $H15_{0}$ | 0.9492 (w) | Accept |
$H16_{0}$ | 0.9779 (t) | Accept | $H17_{0}$ | 0.9389 (t) | Accept | $H18_{0}$ | 0.9303 (t) | Accept |
$H19_{0}$ | 0.9089 (t) | Accept | $H20_{0}$ | 0.9593 (w) | Accept | $H21_{0}$ | 0.8965 (t) | Accept |
$H22_{0}$ | 0.9635 (t) | Accept | $H23_{0}$ | 0.9529 (t) | Accept | $H24_{0}$ | 0.6545 (t) | Accept |
$H25_{0}$ | 0.8565 (t) | Accept | $H26_{0}$ | 0.8865 (t) | Accept | $H27_{0}$ | 0.9556 (w) | Accept |
$H28_{0}$ | 0.9517 (w) | Accept | $H29_{0}$ | 0.9995 (w) | Accept | $H30_{0}$ | 0.9545 (t) | Accept |
$H31_{0}$ | 0.9970 (t) | Accept | $H32_{0}$ | 0.9994 (t) | Accept | $H33_{0}$ | 0.9954 (t) | Accept |
$H34_{0}$ | 0.6988 (t) | Accept | $H35_{0}$ | 0.7965 (t) | Accept | $H36_{0}$ | 0.9677 (t) | Accept |
$H37_{0}$ | 0.9638 (w) | Accept | $H38_{0}$ | 0.9550 (w) | Accept | $H39_{0}$ | 0.9626 (t) | Accept |
$H40_{0}$ | 0.9324 (t) | Accept | | | | | | |
Hypothesis | p-Value | Result | Hypothesis | p-Value | Result | Hypothesis | p-Value | Result |
---|---|---|---|---|---|---|---|---|
$H41_{0}$ | 0.9826 (t) | Accept | $H42_{0}$ | 0.9590 (w) | Accept | $H43_{0}$ | 0.7940 (t) | Accept |
$H44_{0}$ | 0.9306 (t) | Accept | $H45_{0}$ | 0.9830 (t) | Accept | $H46_{0}$ | 0.9851 (t) | Accept |
$H47_{0}$ | 0.8837 (t) | Accept | $H48_{0}$ | 0.8697 (t) | Accept | $H49_{0}$ | 0.9018 (t) | Accept |
$H50_{0}$ | 0.9819 (t) | Accept | $H51_{0}$ | 0.9360 (t) | Accept | $H52_{0}$ | 0.8971 (t) | Accept |
$H53_{0}$ | 0.9367 (t) | Accept | $H54_{0}$ | 0.9632 (t) | Accept | $H55_{0}$ | 0.9836 (t) | Accept |
$H56_{0}$ | 0.9183 (t) | Accept | $H57_{0}$ | 0.9607 (w) | Accept | $H58_{0}$ | 0.9721 (t) | Accept |
$H59_{0}$ | 0.9543 (t) | Accept | $H60_{0}$ | 0.9153 (w) | Accept | $H61_{0}$ | 0.9775 (w) | Accept |
$H62_{0}$ | 0.9015 (t) | Accept | $H63_{0}$ | 0.9859 (t) | Accept | $H64_{0}$ | 0.9699 (w) | Accept |
$H65_{0}$ | 0.9776 (t) | Accept | $H66_{0}$ | 0.8544 (w) | Accept | $H67_{0}$ | 0.9544 (w) | Accept |
$H68_{0}$ | 0.8582 (t) | Accept | $H69_{0}$ | 0.9452 (w) | Accept | $H70_{0}$ | 0.9339 (t) | Accept |
$H71_{0}$ | 0.8080 (w) | Accept | $H72_{0}$ | 0.9681 (t) | Accept | $H73_{0}$ | 0.9958 (t) | Accept |
$H74_{0}$ | 0.8232 (w) | Accept | $H75_{0}$ | 0.8938 (w) | Accept | $H76_{0}$ | 0.7578 (w) | Accept |
$H77_{0}$ | 0.8982 (t) | Accept | $H78_{0}$ | 0.9386 (t) | Accept | $H79_{0}$ | 0.9568 (w) | Accept |
$H80_{0}$ | 0.9169 (t) | Accept | | | | | | |
Hypothesis | p-Value | Result | Hypothesis | p-Value | Result | Hypothesis | p-Value | Result |
---|---|---|---|---|---|---|---|---|
$H81_{0}$ | 0.9719 (t) | Accept | $H82_{0}$ | 0.9126 (t) | Accept | $H83_{0}$ | 0.9701 (t) | Accept |
$H84_{0}$ | 0.9648 (t) | Accept | $H85_{0}$ | 0.9564 (t) | Accept | $H86_{0}$ | 0.8379 (t) | Accept |
$H87_{0}$ | 0.9672 (t) | Accept | $H88_{0}$ | 0.8774 (t) | Accept | $H89_{0}$ | 0.8283 (t) | Accept |
$H90_{0}$ | 0.8210 (t) | Accept | $H91_{0}$ | 0.9140 (t) | Accept | $H92_{0}$ | 0.9670 (t) | Accept |
$H93_{0}$ | 0.9248 (t) | Accept | $H94_{0}$ | 0.9527 (t) | Accept | $H95_{0}$ | 0.9853 (t) | Accept |
$H96_{0}$ | 0.9820 (t) | Accept | $H97_{0}$ | 0.9479 (t) | Accept | $H98_{0}$ | 0.8891 (t) | Accept |
$H99_{0}$ | 0.9457 (t) | Accept | $H100_{0}$ | 0.8779 (t) | Accept | $H101_{0}$ | 0.9521 (w) | Accept |
$H102_{0}$ | 0.8726 (t) | Accept | $H103_{0}$ | 0.9979 (t) | Accept | $H104_{0}$ | 0.9284 (t) | Accept |
$H105_{0}$ | 0.9875 (t) | Accept | $H106_{0}$ | 0.8701 (t) | Accept | $H107_{0}$ | 0.8946 (t) | Accept |
$H108_{0}$ | 0.8522 (t) | Accept | $H109_{0}$ | 0.9585 (t) | Accept | $H110_{0}$ | 0.9508 (t) | Accept |
$H111_{0}$ | 0.8978 (w) | Accept | $H112_{0}$ | 0.9339 (w) | Accept | $H113_{0}$ | 0.9952 (w) | Accept |
$H114_{0}$ | 0.9066 (w) | Accept | $H115_{0}$ | 0.6370 (t) | Accept | $H116_{0}$ | 0.9201 (t) | Accept |
$H117_{0}$ | 0.9077 (t) | Accept | $H118_{0}$ | 0.8913 (t) | Accept | $H119_{0}$ | 0.9214 (t) | Accept |
$H120_{0}$ | 0.9583 (w) | Accept | | | | | | |
References
- Banerjee, S.; Syed, Z.; Helmick, J.; Culp, M.; Ryan, K.; Cukic, B. Automated Triaging of Very Large Bug Repositories. Inf. Softw. Technol. 2017, 89, 1–13.
- Eclipse. Available online: https://bugs.eclipse.org/bugs (accessed on 5 July 2024).
- Mozilla. Available online: https://bugzilla.mozilla.org/ (accessed on 5 July 2024).
- Mukherjee, U.; Rahman, M.M. Understanding the Impact of Domain Term Explanation on Duplicate Bug Report Detection. arXiv 2025, arXiv:2503.18832.
- Bugzilla. Available online: https://bugzilla.org/ (accessed on 5 July 2024).
- Kim, Y. Convolutional Neural Networks for Sentence Classification. arXiv 2014, arXiv:1408.5882.
- Wang, L.; Zhang, L.; Jiang, J. Duplicate Question Detection with Deep Learning in Stack Overflow. IEEE Access 2020, 8, 25964–25975.
- FreeBSD. Available online: https://bugs.freebsd.org/bugzilla/ (accessed on 5 July 2024).
- GCC. Available online: https://gcc.gnu.org/bugzilla/ (accessed on 5 July 2024).
- Gentoo. Available online: https://bugs.gentoo.org/ (accessed on 5 July 2024).
- Kernel. Available online: https://bugzilla.kernel.org/ (accessed on 5 July 2024).
- RedHat. Available online: https://bugzilla.redhat.com/ (accessed on 5 July 2024).
- Sourceware. Available online: https://sourceware.org/bugzilla/ (accessed on 5 July 2024).
- WebKit. Available online: https://bugs.webkit.org/ (accessed on 5 July 2024).
- Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780.
- Cho, K.; Van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning Phrase Representations Using RNN Encoder-Decoder for Statistical Machine Translation. arXiv 2014, arXiv:1406.1078.
- Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding. arXiv 2018, arXiv:1810.04805.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. Adv. Neural Inf. Process. Syst. 2017, 30, 6000–6010.
- GCC #65092. Available online: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65092 (accessed on 5 July 2024).
- Bugzilla. Available online: https://bugzilla.readthedocs.io/en/5.2/using/editing.html (accessed on 5 July 2024).
- Eren, Ç.; Şahin, K.; Tüzün, E. Analyzing Bug Life Cycles to Derive Practical Insights. In Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering, Oulu, Finland, 14–16 June 2023; pp. 162–171.
- Kao, A.; Poteet, S.R. Natural Language Processing and Text Mining; Springer: London, UK, 2007.
- Xia, Y.; Hua, J.; Dougherty, E.R. Quantification of the Impact of Feature Selection on the Variance of Cross-Validation Error Estimation. EURASIP J. Bioinform. Syst. Biol. 2007, 2007, 1–11.
- Nguyen, A.T.; Nguyen, T.T.; Nguyen, T.N.; Lo, D.; Sun, C. Duplicate Bug Report Detection with a Combination of Information Retrieval and Topic Modeling. In Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering (ASE), Essen, Germany, 3–7 September 2012; pp. 70–79.
- Sun, C.; Lo, D.; Wang, X.; Jiang, J.; Khoo, S.-C. A Discriminative Model Approach for Accurate Duplicate Bug Report Retrieval. In Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering (ICSE), Cape Town, South Africa, 2–8 May 2010; Volume 1, pp. 45–54.
- Wang, X.; Zhang, L.; Xie, T.; Anvik, J.; Sun, J. An Approach to Detecting Duplicate Bug Reports Using Natural Language and Execution Information. In Proceedings of the 30th International Conference on Software Engineering (ICSE), Leipzig, Germany, 10–18 May 2008; pp. 461–470.
- Meng, Q.; Zhang, X.; Ramackers, G.; Visser, J. Combining Retrieval and Classification: Balancing Efficiency and Accuracy in Duplicate Bug Report Detection. arXiv 2024, arXiv:2404.14877.
- Patil, A.; Jadon, A. Auto-Labelling of Bug Report Using Natural Language Processing. In Proceedings of the 2023 IEEE 8th International Conference for Convergence in Technology (I2CT), Pune, India, 7–9 April 2023; pp. 1–7.
- Eclipse. Available online: https://bugs.eclipse.org/bugs/show_bug.cgi?id=30959 (accessed on 5 July 2024).
- Eclipse. Available online: https://bugs.eclipse.org/bugs/show_bug.cgi?id=31082 (accessed on 5 July 2024).
- Gupta, S.; Gupta, S.K. A Systematic Study of Duplicate Bug Report Detection. Int. J. Adv. Comput. Sci. Appl. 2021, 12, 578–589.
- Al-Msie, R.F. BushraDBR: An Automatic Approach to Retrieving Duplicate Bug Reports. arXiv 2024, arXiv:2407.04707.
- Ramos, J. Using TF-IDF to Determine Word Relevance in Document Queries. In Proceedings of the First Instructional Conference on Machine Learning; Citeseer: Princeton, NJ, USA, 2003; Volume 242, pp. 29–48.
- McHugh, M.L. The Chi-Square Test of Independence. Biochem. Med. 2013, 23, 143–149.
- Stopword List. Available online: https://gist.github.com/rg089/35e00abf8941d72d419224cfd5b5925d (accessed on 5 July 2024).
- Berrar, D. Cross-validation. In Encyclopedia of Bioinformatics and Computational Biology, 2nd ed.; Elsevier: Amsterdam, The Netherlands, 2019; pp. 542–545.
- Goutte, C.; Gaussier, E. A Probabilistic Interpretation of Precision, Recall and F-score, with Implication for Evaluation. In Proceedings of the European Conference on Information Retrieval, Santiago de Compostela, Spain, 21–23 March 2005; Springer: Berlin/Heidelberg, Germany, 2005; pp. 345–359.
- Davis, J.; Goadrich, M. The Relationship between Precision-Recall and ROC Curves. In Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA, USA, 25–29 June 2006; pp. 233–240.
- Gravetter, F.J.; Wallnau, L.B. Introduction to the T statistic. Essent. Statist. Behav. Sci. 2014, 8, 252.
- Woolson, R.F. Wilcoxon signed-rank test. In Wiley Encyclopedia of Clinical Trials; Wiley: Hoboken, NJ, USA, 2007; p. 13.
- Bland, J.M.; Altman, D.G. Multiple Significance Tests: The Bonferroni Method. BMJ 1995, 310, 170.
- Harris, C.R.; Millman, K.J.; Van Der Walt, S.J.; Gommers, R.; Virtanen, P.; Cournapeau, D.; Oliphant, T.E. Array Programming with NumPy. Nature 2020, 585, 357–362.
- Chaparro, O.; Florez, J.M.; Singh, U.; Marcus, A. Reformulating Queries for Duplicate Bug Report Detection. In Proceedings of the IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER), Hangzhou, China, 24–27 February 2019; pp. 218–229.
- Kukkar, A.; Mohana, R.; Kumar, Y.; Nayyar, A.; Bilal, M.; Kwak, K.S. Duplicate Bug Report Detection and Classification System based on Deep Learning Technique. IEEE Access 2020, 8, 200749–200763.
- Chauhan, R.; Sharma, S.; Goyal, A. DENATURE: Duplicate Detection and Type Identification in Open Source Bug Repositories. Int. J. Syst. Assur. Eng. Manag. 2023, 14, 1–18.
- Rocha, T.M.; Carvalho, A.L.D.C. SiameseQAT: A Semantic Context-Based Duplicate Bug Report Detection using Replicated Cluster Information. IEEE Access 2021, 9, 44610–44630.
- Xie, Q.; Wen, Z.; Zhu, J.; Gao, C.; Zheng, Z. Detecting Duplicate Bug Reports with Convolutional Neural Networks. In Proceedings of the IEEE 25th Asia-Pacific Software Engineering Conference (APSEC), Nara, Japan, 4–7 December 2018; pp. 416–425.
- He, J.; Xu, L.; Yan, M.; Xia, X.; Lei, Y. Duplicate Bug Report Detection using Dual-Channel Convolutional Neural Networks. In Proceedings of the 28th International Conference on Program Comprehension, Seoul, Republic of Korea, 13–15 July 2020; pp. 117–127.
- Mashhadi, E.; Ahmadvand, H.; Hemmati, H. Method-Level Bug Severity Prediction using Source Code Metrics and LLMs. In Proceedings of the IEEE 34th International Symposium on Software Reliability Engineering (ISSRE), Florence, Italy, 9–12 October 2023; pp. 635–646.
- Shatnawi, M.Q.; Alazzam, B. An Assessment of Eclipse Bugs’ Priority and Severity Prediction Using Machine Learning. Int. J. Commun. Netw. Inf. Secur. 2022, 4, 62–69.
- Ramay, W.Y.; Umer, Q.; Yin, X.C.; Zhu, C.; Illahi, I. Deep Neural Network-Based Severity Prediction of Bug Reports. IEEE Access 2019, 7, 46846–46857.
- Bani-Salameh, H.; Sallam, M.; Al shboul, B. A Deep-Learning-Based Bug Priority Prediction using RNN-LSTM Neural Networks. E-Inform. Softw. Eng. J. 2021, 15, 29–45.
- Rathnayake, R.M.D.S.; Kumara, B.T.G.S.; Ekanayake, E.M.U.W.J.B. CNN-Based Priority Prediction of Bug Reports. In Proceedings of the IEEE International Conference on Decision Aid Sciences and Application (DASA), Online, 7–8 December 2021; pp. 299–303.
- Zhang, W.; Challis, C. Automatic Bug Priority Prediction using DNN Based Regression. In Advances in Natural Computation, Fuzzy Systems and Knowledge Discovery, Proceedings of the International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery, Xi'an, China, 1–3 August 2020; Springer: Cham, Switzerland, 2020; Volume 1, pp. 333–340.
- Umer, Q.; Liu, H.; Sultan, Y. Emotion based Automated Priority Prediction for Bug Reports. IEEE Access 2018, 6, 35743–35752.
- Choudhary, P.A.; Singh, S. Neural Network Based Bug Priority Prediction Model using Text Classification Techniques. Int. J. Adv. Res. Comput. Sci. 2017, 8, 1315.
Project Type | Project | Time Frame (dd/mm/yy) | Number of Reports |
---|---|---|---|
Development Tool | Eclipse | 28/02/02–21/08/23 | 559,680 |
Development Tool | GCC | 24/06/02–15/08/23 | 94,589 |
Server | Sourceware | 27/10/00–16/08/23 | 30,710 |
Web Browser Engine | WebKit | 01/01/01–15/08/23 | 242,605 |
Operating System | FreeBSD | 03/06/95–21/08/23 | 262,410 |
Operating System | Gentoo | 09/11/02–23/08/23 | 536,646 |
Operating System | Kernel | 20/04/08–17/12/14 | 14,366 |
Operating System | Red Hat | 31/03/00–29/08/07 | 160,178 |
Total | | | 1,901,084 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ji, J.; Yang, G. Do Stop Words Matter in Bug Report Analysis? Empirical Findings Using Deep Learning Models Across Duplicate, Severity, and Priority Classification. Appl. Sci. 2025, 15, 9178. https://doi.org/10.3390/app15169178