Article

Evaluating the Performance of Topic Modeling Techniques with Human Validation to Support Qualitative Analysis

by Julian D. Romero 1, Miguel A. Feijoo-Garcia 2, Gaurav Nanda 1, Brittany Newell 1 and Alejandra J. Magana 2,*

1 School of Engineering Technology, Purdue University, 401 N. Grant St., West Lafayette, IN 47907, USA
2 Department of Computer and Information Technology, Purdue University, 401 N. Grant St., West Lafayette, IN 47907, USA
* Author to whom correspondence should be addressed.
Big Data Cogn. Comput. 2024, 8(10), 132; https://doi.org/10.3390/bdcc8100132
Submission received: 26 July 2024 / Revised: 8 September 2024 / Accepted: 10 September 2024 / Published: 8 October 2024

Abstract

Examining the effectiveness of machine learning techniques in analyzing engineering students’ decision-making processes through topic modeling during simulation-based design tasks is crucial for advancing educational methods and tools. This study therefore presents a comparative analysis of supervised and unsupervised machine learning techniques for topic modeling, combined with human validation. It contributes by evaluating how effectively these techniques identify nuanced topics within the argumentation framework and by improving computational methods for assessing students’ abilities and performance levels based on their informed decisions. The study examined the decision-making processes of engineering students as they participated in a simulation-based design challenge. During this task, students were prompted to use an argumentation framework to articulate their claims, evidence, and reasoning by recording their informed design decisions in a design journal. Qualitative and computational methods were combined to analyze the students’ design journals, and the accuracy of the findings was ensured through the researchers’ review and interpretation of the results. Several machine learning models, including random forest, SVM, and K-nearest neighbors (KNN), were tested for multilabel regression, using preprocessing techniques such as TF-IDF, GloVe, and BERT embeddings. Hyperparameter optimization and model interpretability were also explored, along with models such as RNNs with LSTM, XGBoost, and LightGBM. The results demonstrate that both supervised and unsupervised machine learning models effectively identified nuanced topics within the argumentation framework used during the challenge of designing a zero-energy home for a Midwestern city on a CAD/CAE simulation platform. Notably, XGBoost exhibited superior predictive accuracy in estimating topic proportions, highlighting its potential for broader application in engineering education.
Keywords: argumentation framework; topic modeling; machine learning; qualitative analysis; natural language processing
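
To make the pipeline described in the abstract concrete, the following is a minimal illustrative sketch (not the authors’ code) of one of the configurations mentioned: TF-IDF features feeding an XGBoost regressor wrapped for multi-output (multilabel) regression to estimate per-document topic proportions. The variable names `journal_texts` (student design-journal entries) and `topic_props` (human-coded topic proportions, one row per document) are assumptions introduced here for illustration.

```python
# Illustrative sketch only: TF-IDF + XGBoost multi-output regression for
# estimating topic proportions from design-journal text. `journal_texts` and
# `topic_props` are hypothetical placeholders for the study's data.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputRegressor
from sklearn.metrics import mean_absolute_error
from xgboost import XGBRegressor

X_train, X_test, y_train, y_test = train_test_split(
    journal_texts, topic_props, test_size=0.2, random_state=42
)

# Convert free-text journal entries into TF-IDF vectors.
vectorizer = TfidfVectorizer(max_features=5000, ngram_range=(1, 2), stop_words="english")
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

# Fit one XGBoost regressor per topic dimension via MultiOutputRegressor.
model = MultiOutputRegressor(
    XGBRegressor(n_estimators=300, max_depth=4, learning_rate=0.1, random_state=42)
)
model.fit(X_train_vec, y_train)

# Compare predicted topic proportions against the human-coded labels.
pred = np.clip(model.predict(X_test_vec), 0.0, 1.0)
print("MAE per topic:", mean_absolute_error(y_test, pred, multioutput="raw_values"))
```

The same scaffold can be reused to compare the other models the abstract lists (random forest, SVM, KNN, LightGBM) by swapping the wrapped estimator, or to compare embeddings by replacing the TF-IDF step with GloVe or BERT document vectors.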

Share and Cite

MDPI and ACS Style

Romero, J.D.; Feijoo-Garcia, M.A.; Nanda, G.; Newell, B.; Magana, A.J. Evaluating the Performance of Topic Modeling Techniques with Human Validation to Support Qualitative Analysis. Big Data Cogn. Comput. 2024, 8, 132. https://doi.org/10.3390/bdcc8100132

AMA Style

Romero JD, Feijoo-Garcia MA, Nanda G, Newell B, Magana AJ. Evaluating the Performance of Topic Modeling Techniques with Human Validation to Support Qualitative Analysis. Big Data and Cognitive Computing. 2024; 8(10):132. https://doi.org/10.3390/bdcc8100132

Chicago/Turabian Style

Romero, Julian D., Miguel A. Feijoo-Garcia, Gaurav Nanda, Brittany Newell, and Alejandra J. Magana. 2024. "Evaluating the Performance of Topic Modeling Techniques with Human Validation to Support Qualitative Analysis" Big Data and Cognitive Computing 8, no. 10: 132. https://doi.org/10.3390/bdcc8100132

APA Style

Romero, J. D., Feijoo-Garcia, M. A., Nanda, G., Newell, B., & Magana, A. J. (2024). Evaluating the Performance of Topic Modeling Techniques with Human Validation to Support Qualitative Analysis. Big Data and Cognitive Computing, 8(10), 132. https://doi.org/10.3390/bdcc8100132

