Automatic Recommendation of Forum Threads and Reinforcement Activities in a Data Structure and Programming Course
Abstract
1. Introduction
- Domain-specific application: The study delves into a highly specific and challenging domain: the recommendation of conversations in forums focused on data structures and algorithms. In this domain, students pose questions seeking precise and constrained solutions to intricate problems.
- Query-driven recommendation: A distinctive feature is the recommendation solely based on the student’s query. This enables the recommender to function effectively without needing prior information about the student’s performance or interests.
- Holistic recommendations: The recommender yields promising results both in the recommendation of conversations and activities from exams. This not only provides answers to the questions posed by the students, but also offers them supplementary learning activities.
- Technical innovation: The study introduces technical innovation through the application of transformers to process texts related to data structures and algorithmic concepts. Additionally, it adapts the key phrase definition presented in [12] to our specific domain, marking a novel approach.
2. Related Work
3. Material and Method
- Step 1: Text Pre-Processing and Key Phrases Extraction
- Step 2: Text Representation and Similarity Calculation
- Computing similarity using bags of words. We calculate the similarity between the student’s query and (1) all the questions in the test repository and (2) all the posts in the forums. We represent pre-processed texts as bags of words, which model a text as a multiset (bag) of its words, ignoring grammar and word order. We then calculate the Jaccard coefficient between the bags of words that represent each pair of texts. (We implemented and tested several other similarity measures, including LIN and cosine distance, but we present the results obtained with the Jaccard coefficient because it produced the best outcomes.) The Jaccard coefficient is a measure of similarity between finite sample sets, defined as the size of the intersection divided by the size of the union of the sample sets: $J(A, B) = \frac{|A \cap B|}{|A \cup B|}$, where $A$ and $B$ are the two sets being compared. In addition, we compute the similarity separately between the titles and between the bodies (main text) of each post and query. Since titles often contain the most important elements of the body, we assign double weight to the similarity between titles.
- Computing similarity using key phrases. To represent the texts, we use bags of key phrases, extracted as explained in step 1. Using these representations, we compute the similarity between the student’s query and the test questions and posts with the Jaccard coefficient. As before, we compute the similarity separately between the titles and between the main bodies of each post and query. Since titles usually contain the most important information from the body, we assign double weight to the similarity between titles.
- Computing similarity using word embeddings. We first convert the texts into embeddings and then calculate the cosine similarity between the embeddings of each pair of texts, so that posts with similar meanings lie close together in the vector space. To this end, we use the Sentence Transformers (ST) framework [33], which relies on a pre-trained BERT model to obtain a contextual representation of the posts. BERT and other transformer networks produce one embedding per token of the input text; to obtain a sentence embedding of fixed size, ST applies mean pooling to the output, which by default averages the output embeddings of all tokens.
  The Sentence Transformers framework provides pre-trained models for tasks such as semantic search, semantic similarity, question answering, clustering, and image and text. Most of these models are available only for English, and only some have a multilingual version. Since the analyzed texts are in Spanish, we selected the best-performing pre-trained model for semantic similarity that also has a multilingual version. (The performance of the models available for this framework is reported at https://www.sbert.net/docs/pretrained_models.html (accessed on 18 September 2023).)
  Some pre-trained models support several similarity functions, such as dot product, cosine similarity, or Euclidean distance. The model we use offers only cosine similarity. This is nevertheless the function that works best, since the framework also includes a loss function designed specifically for it (CosineSimilarityLoss).
  We used the multilingual model “paraphrase-multilingual-mpnet-base-v2” [33], which has been trained on parallel data for over 50 languages, including Spanish. The model generates aligned vector spaces, meaning that similar inputs in different languages are mapped close together in the vector space without the input language having to be specified. It maps sentences and paragraphs to a dense vector space of 768 dimensions and can be employed for tasks such as text similarity, clustering, and semantic search.
  To compare each post with the student’s query, we obtain embeddings for the title and the body separately. As with the other representations, we assign double weight to the similarity between titles, as they often contain the most important information.
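The mean pooling and cosine comparison described for the embedding-based similarity can be illustrated with a minimal sketch. It does not load the actual model: the token vectors are made-up three-dimensional toy values standing in for the 768-dimensional outputs, and the function names are hypothetical. The rule used to combine title and body similarity (a weighted mean with weight 2 for titles) is an assumption, since the text only states that titles receive double weight.

```python
import math
from typing import List, Sequence


def mean_pool(token_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Average per-token vectors into one fixed-size vector."""
    if not token_embeddings:
        raise ValueError("at least one token embedding is required")
    count = len(token_embeddings)
    dim = len(token_embeddings[0])
    return [sum(tok[i] for tok in token_embeddings) / count for i in range(dim)]


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def title_body_similarity(title_sim: float, body_sim: float,
                          title_weight: float = 2.0) -> float:
    """Weighted mean of the two similarities (assumed combination rule)."""
    return (title_weight * title_sim + body_sim) / (title_weight + 1.0)


# Toy token embeddings (made up; the real model outputs 768 dimensions per token).
query_title = mean_pool([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
post_title = mean_pool([[1.0, 1.0, 0.0]])
query_body = mean_pool([[0.0, 0.0, 1.0]])
post_body = mean_pool([[0.0, 1.0, 0.0]])

score = title_body_similarity(
    cosine_similarity(query_title, post_title),
    cosine_similarity(query_body, post_body),
)
```

In a real pipeline the pooled vectors would come from the pre-trained model rather than from hand-written lists; only the pooling and comparison steps are shown here.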
- Step 3: Ranking and Presentation
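As a rough illustration of how the bag-based similarity of Step 2 could feed the ranking of Step 3, the sketch below scores hypothetical posts against a query with the Jaccard coefficient and returns the top-ranked ones. Several simplifications are assumptions rather than details from the study: texts are tokenized by whitespace and compared as sets of words, title and body similarities are combined as a weighted mean with weight 2 for titles, and the cutoff of five results simply mirrors the P@5 measure. The example texts are invented and in English for readability.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by the size of the union."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


def tokens(text: str) -> Set[str]:
    """Very simple stand-in for the pre-processing of Step 1."""
    return set(text.lower().split())


def score(query: Dict[str, str], post: Dict[str, str],
          title_weight: float = 2.0) -> float:
    """Weighted mean of title and body similarity (assumed combination rule)."""
    title_sim = jaccard(tokens(query["title"]), tokens(post["title"]))
    body_sim = jaccard(tokens(query["body"]), tokens(post["body"]))
    return (title_weight * title_sim + body_sim) / (title_weight + 1.0)


def recommend(query: Dict[str, str], posts: List[Dict[str, str]],
              k: int = 5) -> List[Tuple[str, float]]:
    """Return the ids and scores of the k most similar posts, best first."""
    scored = [(post["id"], score(query, post)) for post in posts]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical query and forum posts.
query = {"title": "greedy knapsack",
         "body": "how to order items by value per weight"}
posts = [
    {"id": "t2", "title": "dijkstra shortest path",
     "body": "priority queue of nodes"},
    {"id": "t1", "title": "knapsack greedy scheme",
     "body": "order items by value per weight ratio"},
]

ranking = recommend(query, posts, k=5)
```

The same functions apply unchanged to bags of key phrases if `tokens` is replaced by a key phrase extractor.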
4. Evaluation Methodology
4.1. Dataset
4.2. Evaluation Methodology
- The strict evaluation criterion or problem-related recommendation: a question from a test is relevant to the student query if it refers to the same concept or problem solved or it uses the same data structure or scheme. This is a very restrictive criterion that considers only questions strongly related to the query.
- The relaxed evaluation criterion or scheme-related recommendation: it also accepts questions that concern the same data structure or scheme but a different problem, since they may help the student reinforce learning. For example, if the student asks how to solve the knapsack problem with the greedy scheme, the recommender may suggest activities on other problems solved with the greedy strategy (e.g., Dijkstra’s algorithm or the change-making problem).
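The two criteria can be made concrete with a small sketch of precision at K (P@K), the measure reported in the results. The labels, identifiers, and function names below are hypothetical. The sketch also adopts a simplified reading of the criteria as an assumption: under the strict criterion an item is relevant when it shares both the problem and the scheme with the query, and under the relaxed criterion sharing the scheme is enough.

```python
from typing import Dict, List

Labels = Dict[str, str]


def is_relevant(query: Labels, item: Labels, criterion: str) -> bool:
    """Relevance judgment under the strict or relaxed criterion (simplified)."""
    same_scheme = item["scheme"] == query["scheme"]
    if criterion == "strict":
        return same_scheme and item["problem"] == query["problem"]
    if criterion == "relaxed":
        return same_scheme
    raise ValueError(f"unknown criterion: {criterion}")


def precision_at_k(recommended: List[str], labels: Dict[str, Labels],
                   query: Labels, criterion: str, k: int = 5) -> float:
    """Fraction of the top-k recommended items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(
        1 for item_id in recommended[:k]
        if is_relevant(query, labels[item_id], criterion)
    )
    return hits / k


# Hypothetical annotations for five recommended test questions.
labels = {
    "q1": {"problem": "knapsack", "scheme": "greedy"},
    "q2": {"problem": "dijkstra", "scheme": "greedy"},
    "q3": {"problem": "change-making", "scheme": "greedy"},
    "q4": {"problem": "mergesort", "scheme": "divide-and-conquer"},
    "q5": {"problem": "knapsack", "scheme": "greedy"},
}
query = {"problem": "knapsack", "scheme": "greedy"}
recommended = ["q1", "q2", "q3", "q4", "q5"]

strict_p5 = precision_at_k(recommended, labels, query, "strict")
relaxed_p5 = precision_at_k(recommended, labels, query, "relaxed")
```

With these invented annotations, two of the five items count under the strict criterion and four under the relaxed one, which shows why the relaxed score can never be lower than the strict score for the same ranking.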
5. Results and Discussion
5.1. Recommendation of Discussion Threads
5.2. Recommendation of Reinforcement Activities
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
UNED | Universidad Nacional de Educación a Distancia |
POS | Part Of Speech |
ST | Sentence Transformers |
P@K | Precision at K |
BERT | Bidirectional Encoder Representations from Transformers |
Appendix A
References
- Dhawan, S. Online Learning: A Panacea in the Time of COVID-19 Crisis. J. Educ. Technol. Syst. 2020, 49, 5–22.
- Adkins, S.S. The 2019 Global Learning Technology Investment Patterns: Another Record Shattering Year; Technical Report; Metaari’s Analysis of the 2019 Global Learning Technology Investment Patterns; Metaari: Monroe, WA, USA, 2020.
- Arkorful, V.; Abaidoo, N. The role of e-learning, the advantages and disadvantages of its adoption in Higher Education. Int. J. Educ. Res. 2014, 2, 397–410.
- Tuckman, B. Relations of academic procrastination, rationalizations, and performance in a web course with deadlines. Psychol. Rep. 2005, 96, 1015–1021.
- Bakia, M.; Shear, L.; Toyama, Y.; Lasseter, A. Understanding the Implications of Online Learning for Educational Productivity. Technical Report; U.S. Department of Education Office of Educational Technology: Washington, DC, USA, 2012.
- Twigg, C. Improving quality and reducing cost: Designs for effective learning. Change 2003, 35, 22–29.
- Twigg, C. Improving learning and reducing costs: New models for online learning. Educ. Rev. 2003, 38, 28–38.
- Dumford, A.D.; Miller, A.L. Online Learning in Higher Education: Exploring Advantages and Disadvantages for Engagement. J. Comput. High. Educ. 2018, 30, 452–465.
- Plaza, L.; Araujo, L.; López-Ostenero, F.; Martínez-Romo, J. Use of advanced natural language processing techniques for the automatic recommendation of reinforcement activities. In INTED2021 Proceedings; IATED: Valencia, Spain, 2021; pp. 5699–5705.
- Norman, M. Three Ways to Encourage Conversation in Online Discussion Forums. 2016. Available online: https://ctl.wiley.com/three-ways-to-encourage-conversation-in-online-discussion-forums/ (accessed on 18 September 2023).
- Irish, I.; Chatterjee, S.; Tailor, C.; Finkelberg, R.; Arriaga, R.; Starner, T. Post Recommendation System Impact on Student Participation and Performance in an Online AI Graduate Course. In Proceedings of the Ninth ACM Conference on Learning @ Scale, Roosevelt Island, NY, USA, 1–3 June 2022; pp. 24–34.
- Duque, A.; Fabregat, H.; Araujo, L.; Martinez-Romo, J. A keyphrase-based approach for interpretable ICD-10 code classification of Spanish medical reports. Artif. Intell. Med. 2021, 121, 102177.
- Yengin, I.; Karahoca, D.; Karahoca, A.; Yücel, A. Roles of teachers in e-learning: How to engage students and how to get free e-learning and the future. Procedia-Soc. Behav. Sci. 2010, 2, 5775–5787.
- Resnick, P.; Varian, H.R. Recommender Systems. Commun. ACM 1997, 40, 55–58.
- Park, D.; Kim, H.; Choi, I.; Kim, J. A literature review and classification of recommender systems research. Expert Syst. Appl. 2010, 39, 10059–10072.
- Prins, F.J.; Nadolski, R.J.; Berlanga, A.J.; Drachsler, H.; Hummel, H.G.; Koper, R. Competence description for personal recommendations: The importance of identifying the complexity of learning and performance situations. Educ. Technol. Soc. 2008, 11, 141–152.
- Al-Badarneh, A.; Alsakran, J. An Automated Recommender System for Course Selection. Int. J. Adv. Comput. Sci. Appl. 2016, 7, 166–175.
- Liu, J.; Wang, X.; Liu, X.; Yang, F. Analysis and design of personalized recommendation system for university physical education. In Proceedings of the International Conference on Networking and Digital Society, Wenzhou, China, 30–31 May 2010; Volume 2, pp. 472–475.
- Pinto, F.M.; Estefania, M.; Cerón, N.; Andrade, R.; Campaña, M. iRecomendYou: A Design Proposal for the Development of a Pervasive Recommendation System Based on Student’s Profile for Ecuador’s Students’ Candidature to a Scholarship. In New Advances in Information Systems and Technologies: Volume 2; Springer International Publishing: Cham, Switzerland, 2016; Volume 445, pp. 537–546.
- Ray, S.; Sharma, A. A Collaborative Filtering Based Approach for Recommending Elective Courses. In Information Intelligence, Systems, Technology and Management: 5th International Conference, ICISTM 2011, Gurgaon, India, 10–12 March 2011. Proceedings 5; Springer: Berlin/Heidelberg, Germany, 2011; Volume 141, pp. 330–339.
- Valdiviezo-Díaz, P.; Aguilar, J.; Riofrío, G. A fuzzy cognitive map like recommender system of learning resources. In Proceedings of the IEEE International Conference on Fuzzy Systems, Vancouver, BC, Canada, 24–29 July 2016; pp. 1539–1546.
- Ansari, M.H.; Moradi, M.; Nikrah, O.; Kambakhs, K. CodERS: A hybrid recommender system for an E-learning system. In Proceedings of the 2nd International Conference of Signal Processing and Intelligent Systems, Tehran, Iran, 14–15 December 2016; pp. 1–5.
- Bourkoukou, O.; Bachari, E.E.; El, M. A Personalized E-Learning Based on Recommender System. Int. J. Learn. Teach. 2016, 2, 99–103.
- Chau, H.; Barria-Pineda, J.; Brusilovsky, P. Learning Content Recommender System for Instructors of Programming Courses. In Artificial Intelligence in Education: 19th International Conference, AIED 2018, London, UK, 27–30 June 2018, Proceedings, Part II 19; Springer International Publishing: Cham, Switzerland, 2018; Volume 10948, pp. 47–51.
- Singh, A.; P, D.; Raghu, D. Retrieving similar discussion forum threads: A structure based approach. In Proceedings of SIGIR’12: International ACM SIGIR Conference on Research and Development in Information Retrieval, Portland, OR, USA, 12–16 August 2012.
- Duan, H.; Zhai, C. Exploiting Thread Structures to Improve Smoothing of Language Models for Forum Post Retrieval. In Proceedings of the Advances in Information Retrieval, Dublin, Ireland, 18–21 April 2011; pp. 350–361.
- Papadimitriou, D.; Koutrika, G.; Velegrakis, Y.; Mylopoulos, J. Finding Related Forum Posts through Content Similarity over Intention-Based Segmentation. IEEE Trans. Knowl. Data Eng. 2017, 29, 9.
- Pattabiraman, K.; Sondhi, P.; Zhai, C. Exploiting Forum Thread Structures to Improve Thread Clustering. In Proceedings of the 2013 Conference on the Theory of Information Retrieval, Copenhagen, Denmark, 29 September–2 October 2013; pp. 64–71.
- Li, M.; Gao, W.; Chen, Y. A Topic and Concept Integrated Model for Thread Recommendation in Online Health Communities. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management, Virtual Event, Ireland, 19–23 October 2020; pp. 765–774.
- Lan, A.S.; Spencer, J.C.; Chen, Z.; Brinton, C.G.; Chiang, M. Personalized Thread Recommendation for MOOC Discussion Forums. In Proceedings of the Machine Learning and Knowledge Discovery in Databases, Würzburg, Germany, 16 September 2019; pp. 725–740.
- Zhu, P.; Hauff, C.; Yang, J. MOOC-Rec: Instructional Video Clip Recommendation for MOOC Forum Questions. In Proceedings of the 15th International Conference on Educational Data Mining, Durham, UK, 24–27 July 2022; pp. 705–709.
- Irish, I.; Chatterjee, S.; Jivani, S.; Jia, X.; Lee, J.; Arriaga, R.; Starner, T. Managing the Chaos: Approaches to Navigating Discussion Forums for Instructional Staff. In Proceedings of the 10th ACM Conference on Learning @ Scale, Copenhagen, Denmark, 20–22 July 2023; pp. 406–410.
- Reimers, N.; Gurevych, I. Sentence-bert: Sentence embeddings using siamese bert-networks. arXiv 2019, arXiv:1908.10084.
Similarity | Criterion | P@5 |
---|---|---|
Bags of words | Problem (strict) | 0.56 |
Bags of words | Scheme (relaxed) | 0.69 |
Bags of key phrases | Problem (strict) | 0.60 |
Bags of key phrases | Scheme (relaxed) | 0.82 |
Embeddings | Problem (strict) | 0.58 |
Embeddings | Scheme (relaxed) | 0.73 |

Similarity | Criterion | P@5 |
---|---|---|
Bags of words | Problem (strict) | 0.64 |
Bags of words | Scheme (relaxed) | 0.75 |
Bags of key phrases | Problem (strict) | 0.65 |
Bags of key phrases | Scheme (relaxed) | 0.80 |
Embeddings | Problem (strict) | 0.45 |
Embeddings | Scheme (relaxed) | 0.63 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Plaza, L.; Araujo, L.; López-Ostenero, F.; Martínez-Romo, J. Automatic Recommendation of Forum Threads and Reinforcement Activities in a Data Structure and Programming Course. Appl. Syst. Innov. 2023, 6, 83. https://doi.org/10.3390/asi6050083