Visualization and Data Analysis of Multi-Factors for the Scientific Research Training of Graduate Students
Abstract
1. Introduction
- A questionnaire is designed based on multi-dimensional data of the graduate training process to achieve refined tracking of graduate training data.
- Data mining methods are combined with multi-dimensional questionnaire data to identify the key factors in the graduate training process.
- A set of interactive visual analytics tools that integrates visualization methods and human–computer interaction is provided to help administrators understand and explore the key factors affecting the graduate training process.
2. Related Work
2.1. Higher Education
2.2. Education Data Mining
2.3. Education Data Visualization
3. Requirement Analysis and System Overview
3.1. Data Description
3.2. Requirement Analysis
3.3. System Pipeline
4. Data Mining Methods for Graduates’ Cultivation
4.1. Visualization and Interaction
4.2. Factor Selection and Classification
4.2.1. Feature Selection Algorithms
4.2.2. Classification Algorithms
5. Evaluation
5.1. Selection Results
5.2. Classification and Prediction
5.2.1. ROC Curve and AUC Value
5.2.2. Prediction Results
5.3. Advice for Graduates, Supervisors, and University Administrators
5.3.1. For Graduate Students
5.3.2. For Supervisors
5.3.3. For University Administrators
6. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
| Module | Factors | Value | All No. | All PCT | Third Year No. | Third Year PCT | Second Year No. | Second Year PCT | First Year No. | First Year PCT |
|---|---|---|---|---|---|---|---|---|---|---|
| Students’ basic information | Gender (Sbasic3) | Male | 225 | 33.23% | 45 | 29.41% | 89 | 33.09% | 91 | 35.69% |
| | | Female | 452 | 66.77% | 108 | 70.59% | 180 | 66.91% | 164 | 64.31% |
| | Recent graduates (Sbasic4) | Yes | 495 | 73.12% | 123 | 80.39% | 191 | 71.00% | 181 | 70.98% |
| | | No | 182 | 26.88% | 30 | 19.61% | 78 | 29.00% | 74 | 29.02% |
| | Bachelor’s degree from our university (Sbasic5) | Yes | 155 | 22.90% | 27 | 17.65% | 62 | 23.05% | 66 | 25.88% |
| | | No | 522 | 77.10% | 126 | 82.35% | 207 | 76.95% | 189 | 74.12% |
| | Plan after graduation (Sbasic6) | PhD | 78 | 11.52% | 8 | 5.23% | 28 | 10.41% | 42 | 16.47% |
| | | Civil servant | 192 | 28.36% | 43 | 28.10% | 80 | 29.74% | 69 | 27.06% |
| | | Staff | 366 | 54.06% | 96 | 62.75% | 143 | 53.16% | 127 | 49.80% |
| | | Others | 41 | 6.06% | 6 | 3.92% | 18 | 6.69% | 17 | 6.67% |
| | Participation in academic competitions (Sbasic7) | None | 409 | 60.41% | 82 | 53.59% | 145 | 53.90% | 182 | 71.37% |
| | | School | 185 | 27.33% | 45 | 29.41% | 84 | 31.23% | 56 | 21.96% |
| | | Province | 42 | 6.20% | 20 | 13.07% | 13 | 4.83% | 9 | 3.53% |
| | | National | 41 | 6.06% | 6 | 3.92% | 27 | 10.04% | 8 | 3.14% |
| Students’ academic information | Frequency of academic lectures organized by your college (Sacademic1) | Uncertain | 31 | 4.58% | 8 | 5.23% | 11 | 4.09% | 12 | 4.71% |
| | | 1–2/semester | 37 | 5.47% | 5 | 3.27% | 23 | 8.55% | 9 | 3.53% |
| | | 1–2/month | 296 | 43.72% | 77 | 50.33% | 112 | 41.64% | 107 | 41.96% |
| | | Per week | 313 | 46.23% | 63 | 41.18% | 123 | 45.72% | 127 | 49.80% |
| | Frequency of participating in academic lectures (Sacademic2) | Hardly | 13 | 1.92% | 1 | 0.65% | 8 | 2.97% | 4 | 1.57% |
| | | Once/semester | 133 | 19.65% | 33 | 21.57% | 66 | 24.54% | 34 | 13.33% |
| | | Once/week | 111 | 16.40% | 18 | 11.76% | 25 | 9.29% | 68 | 26.67% |
| | | Twice/week | 420 | 62.04% | 101 | 66.01% | 170 | 63.20% | 149 | 58.43% |
| | Frequency of participating in academic training (Sacademic3) | None | 80 | 11.82% | 11 | 7.19% | 33 | 12.27% | 36 | 14.12% |
| | | 1–2 times | 300 | 44.31% | 68 | 44.44% | 106 | 39.41% | 126 | 49.41% |
| | | More than 3 | 297 | 43.87% | 74 | 48.37% | 130 | 48.33% | 93 | 36.47% |
| | Frequency of reading papers (Sacademic4) | 1–5/semester | 35 | 5.17% | 9 | 5.88% | 11 | 4.09% | 15 | 5.88% |
| | | 1–5/month | 255 | 37.67% | 76 | 49.67% | 95 | 35.32% | 84 | 32.94% |
| | | 1–5/week | 327 | 48.30% | 59 | 38.56% | 135 | 50.19% | 133 | 52.16% |
| | | 1/day | 60 | 8.86% | 9 | 5.88% | 28 | 10.41% | 23 | 9.02% |
| Supervisor’s basic information | Supervisor’s guidance mode (Tbasic1) | Single tutor | 623 | 92.02% | 146 | 95.42% | 248 | 92.19% | 229 | 89.80% |
| | | Tutor group | 54 | 7.98% | 7 | 4.58% | 21 | 7.81% | 26 | 10.20% |
| | Supervisor’s title (Tbasic2) | Lecturer | 9 | 1.33% | 0 | 0.00% | 5 | 1.86% | 4 | 1.57% |
| | | Associate professor | 207 | 30.58% | 36 | 23.53% | 81 | 30.11% | 90 | 35.29% |
| | | Professor | 461 | 68.09% | 117 | 76.47% | 183 | 68.03% | 161 | 63.14% |
| | Number of students per supervisor (Tbasic3) | No more than 3 | 545 | 80.50% | 98 | 64.05% | 208 | 77.32% | 239 | 93.73% |
| | | No more than 6 | 130 | 19.20% | 55 | 35.95% | 60 | 22.30% | 15 | 5.88% |
| | | More than 7 | 2 | 0.30% | 0 | 0.00% | 1 | 0.37% | 1 | 0.39% |
| Supervisor’s information about guidance | Communication frequency (Tguidance1) | None | 10 | 1.48% | 3 | 1.96% | 2 | 0.74% | 5 | 1.96% |
| | | 1–2/semester | 60 | 8.86% | 14 | 9.15% | 21 | 7.81% | 25 | 9.80% |
| | | 1–2/month | 320 | 47.27% | 82 | 53.59% | 142 | 52.79% | 96 | 37.65% |
| | | 1–2/week | 287 | 42.39% | 54 | 35.29% | 104 | 38.66% | 129 | 50.59% |
| | Communication method (Tguidance2) | Face-to-face | 541 | 79.91% | 119 | 77.78% | 211 | 78.44% | 211 | 82.75% |
| | | Telephone | 15 | 2.22% | 4 | 2.61% | 7 | 2.60% | 4 | 1.57% |
| | | | 14 | 2.07% | 4 | 2.61% | 7 | 2.60% | 3 | 1.18% |
| | | Message | 107 | 15.81% | 26 | 16.99% | 44 | 16.36% | 37 | 14.51% |
| | Participation in supervisor’s projects (Tguidance3) | None | 339 | 50.07% | 73 | 47.71% | 126 | 46.84% | 140 | 54.90% |
| | | One | 231 | 34.12% | 55 | 35.95% | 93 | 34.57% | 83 | 32.55% |
| | | More than two | 107 | 15.81% | 25 | 16.34% | 50 | 18.59% | 32 | 12.55% |
| | Supervisor’s requirements for paper publication (Tguidance4) | None | 180 | 26.59% | 31 | 20.26% | 82 | 30.48% | 67 | 26.27% |
| | | Request | 46 | 6.79% | 11 | 7.19% | 14 | 5.20% | 21 | 8.24% |
| | | Guidance | 451 | 66.62% | 111 | 72.55% | 173 | 64.31% | 167 | 65.49% |
| | Frequency of academic discussion (Tguidance5) | None | 105 | 15.51% | 16 | 10.46% | 38 | 14.13% | 51 | 20.00% |
| | | Once/semester | 108 | 15.95% | 32 | 20.92% | 44 | 16.36% | 32 | 12.55% |
| | | Once/month | 231 | 34.12% | 61 | 39.87% | 97 | 36.06% | 73 | 28.63% |
| | | Once/week | 233 | 34.42% | 44 | 28.76% | 90 | 33.46% | 99 | 38.82% |
| | Supervisor’s guidance degree (Tguidance6) | None | 28 | 4.14% | 5 | 3.27% | 11 | 4.09% | 12 | 4.71% |
| | | Little | 58 | 8.57% | 13 | 8.50% | 21 | 7.81% | 24 | 9.41% |
| | | Much | 591 | 87.30% | 135 | 88.24% | 237 | 88.10% | 219 | 85.88% |
| | Supervisor’s disadvantages (Tguidance7) | Too many students | 77 | 11.37% | 30 | 19.61% | 28 | 10.41% | 19 | 7.45% |
| | | Too busy | 141 | 20.83% | 36 | 23.53% | 53 | 19.70% | 52 | 20.39% |
| | | High demand | 179 | 26.44% | 35 | 22.88% | 78 | 29.00% | 66 | 25.88% |
| | | Invalid interaction | 105 | 15.51% | 20 | 13.07% | 43 | 15.99% | 42 | 16.47% |
| | | None | 175 | 25.85% | 32 | 20.92% | 67 | 24.91% | 76 | 29.80% |
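
The PCT columns in the table above appear to be each count divided by the size of the corresponding cohort (all respondents, or one year group). This is inferred from the figures rather than stated in the text. A minimal Python sketch of that arithmetic, using only the Gender (Sbasic3) counts from the table, follows; the names `GENDER_COUNTS`, `cohort_sizes`, and `percentages` are hypothetical and introduced here for illustration.

```python
# Counts copied from the Gender (Sbasic3) rows of the table above.
GENDER_COUNTS = {
    "Male": {"All": 225, "Third Year": 45, "Second Year": 89, "First Year": 91},
    "Female": {"All": 452, "Third Year": 108, "Second Year": 180, "First Year": 164},
}


def cohort_sizes(counts):
    """Sum the counts of every answer option to get each cohort's size."""
    sizes = {}
    for per_cohort in counts.values():
        for cohort, n in per_cohort.items():
            sizes[cohort] = sizes.get(cohort, 0) + n
    return sizes


def percentages(counts):
    """Assumed rule: PCT = 100 * count / cohort size, rounded to two decimals."""
    sizes = cohort_sizes(counts)
    return {
        option: {
            cohort: round(100 * n / sizes[cohort], 2)
            for cohort, n in per_cohort.items()
        }
        for option, per_cohort in counts.items()
    }


SIZES = cohort_sizes(GENDER_COUNTS)
PCT = percentages(GENDER_COUNTS)

print(SIZES)
print(PCT["Male"])
print(PCT["Female"])
```

Because every respondent picks exactly one option per factor, the counts in each cohort column should sum to the same cohort size for every factor, and the percentages should sum to roughly 100%. The same check can be applied to any other factor in the table by swapping in its counts.
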

| Module | Factors | Value | All No. | All PCT | Third Year No. | Third Year PCT | Second Year No. | Second Year PCT | First Year No. | First Year PCT |
|---|---|---|---|---|---|---|---|---|---|---|
| Evaluation basis | Number of published papers (paper1) | 0 | 448 | 66.17% | 45 | 29.41% | 167 | 62.08% | 236 | 92.55% |
| | | 1 | 130 | 19.20% | 55 | 35.95% | 60 | 22.30% | 15 | 5.88% |
| | | 2 | 64 | 9.45% | 31 | 20.26% | 30 | 11.15% | 3 | 1.18% |
| | | 3 | 26 | 3.84% | 16 | 10.46% | 9 | 3.35% | 1 | 0.39% |
| | | 4 | 6 | 0.89% | 5 | 3.27% | 1 | 0.37% | 0 | 0.00% |
| | | 5 | 3 | 0.44% | 1 | 0.65% | 2 | 0.74% | 0 | 0.00% |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Liu, Y.; Li, G.; Yin, Y.; Zhang, L. Visualization and Data Analysis of Multi-Factors for the Scientific Research Training of Graduate Students. Appl. Sci. 2022, 12, 12845. https://doi.org/10.3390/app122412845