Article

A Predictive System Informed by Students’ Similar Behaviour

Daniel Burgos
Research Institute for Innovation & Technology in Education (UNIR iTED), Universidad Internacional de La Rioja (UNIR), 26006 Logroño, La Rioja, Spain
Sustainability 2020, 12(2), 706; https://doi.org/10.3390/su12020706
Submission received: 16 December 2019 / Revised: 15 January 2020 / Accepted: 16 January 2020 / Published: 18 January 2020
(This article belongs to the Special Issue Innovating Learning Analytics for Sustainable Higher Education)

Abstract

Adapting instruction to student needs is particularly complex in online education, owing to the communication disconnection inherent in such learning environments. Decision support systems assist by automatically gathering students' data and presenting them to the tutor in the appropriate context, so that student behaviour can be predicted and action taken in advance to avert or promote the final outcome. This study presents a decision support system called u-Tutor, centred on computing the similarity between current learners and learners from past courses, and shows how it was used in a real-case scenario. The tool was used in two real courses comprising 392 learners and their academic faculty, drawing on course editions from 2015 to 2019. The analysis focuses on three research areas: (1) perceived usefulness, (2) usability of the tool and (3) success rate of classification. The data show that the teaching team offered excellent estimations for those learners who eventually passed the course, whereas u-Tutor acted as an early warning for learners at risk, indicating its capacity as a supportive tool for tutors.

1. Introduction

1.1. Data Analysis and Learning Analytics

Methods of data analysis are currently receiving attention in the educational research literature as a field of study. When applied to education, data analysis may be approached from two different viewpoints: learning analytics and educational data mining. The latter concentrates on algorithms and techniques and how to improve them, while the former deals with how the educational scenario benefits from these methods [1]. Learning analytics seeks to analyze the data arising from educational settings and produce information that can be used to enrich the teaching/learning process. Learning analytics methods may be applied in many different educational environments, such as distance, face-to-face or blended learning. For instance, Vieira, Parsons, and Byrd [2] examined 52 papers, of which 3 belonged to classroom settings, 30 to blended learning settings and 19 to online learning settings. The reviewed literature thus suggests that implementations of learning analytics focus mainly on blended and online learning settings.
Moreover, distance learning is highly relevant to the learning analytics area for two key reasons. First, in online settings, both tutors and learners use a virtual learning environment (VLE) as their main interaction point. As a result, it is simpler than in blended or classroom environments to capture the activity of most of the participants in a course [3]. Data capture methods are, in fact, a research topic in the area in their own right [4,5,6]. Second, online education is burdened with communication disconnection [7,8]; it is therefore particularly pertinent to create informative techniques to understand what happens in a course and how students learn. As broadly examined by Mangaroska and Giannakos [9], the relationship between learning design and learning analytics has attracted mounting interest in the Technology-Enhanced Learning (TEL) community, given the potential of learning analytics tools to inform data-driven design decisions. Consistent with Vieira et al. [2], learning analytics methods may be used for many different purposes: understanding teamwork, instructional design, understanding motivation, enhancing reflection, or examining usage behaviors, among others.

1.2. At-Risk Learners

Of all the recognized objectives, early detection of at-risk learners and mark prediction have been examined in numerous studies. For instance, Jayaprakash et al. [10] and Bainbridge et al. [11] used activity logs merged with demographic data to achieve early detection of at-risk learners, where more than 80% of such students were detected effectively. Similar results were reported by Cambruzzi, Rigo, and Barbosa [12], who examined distance learning and presented an architecture able to capture and analyze data from diverse sources. Another study, by Agudo-Peregrina et al., analyzed the association between different variables and academic achievement, and identified a positive correlation in online courses but no association in classroom settings. These findings highlight the significance of the educational environment for the success of a predictive analysis. Romero et al. [5] discussed the mining of forum interactions to estimate student performance. There is ample literature in this field; Papamitsiou and Economides [13], for example, identified 17 research papers whose objective was to predict performance. Regardless of all these studies and development efforts in the scientific literature, Prieto et al. argue that adoption problems currently remain unresolved in learning analytics [14]. They argue that, despite the current growth of interest in learning analytics, its integration into daily classroom practices is still stagnant, and they identify the complexity of communication among the diverse stakeholders engaged in implementing learning analytics at the classroom level as a key issue meriting discussion [14]. Perceived usefulness and usability are likewise issues influencing the adoption of any technological innovation, and are therefore considered complex and significant topics to examine in learning analytics systems [15]. Accordingly, the solutions provided ought to be applicable and have a clearly explicable value for the participants in their practical application.
As stated above, perceived usefulness and usability are significant drivers of the adoption of learning analytics. The present paper offers an empirical case study validating the usefulness and usability of u-Tutor (whose full motto is Alumni Alike Activity Awareness) in a realistic situation. The precision of u-Tutor as a predictive system has also been evaluated. Empirical work is specifically pertinent in the learning analytics sector, given that a significant number of the most-cited papers are theoretical rather than empirical [16]. The rest of this paper is organized as follows: the next section describes the educational and technological details needed to understand the setting in which the case study was conducted. This is followed by an explanation of the research questions considered in the study. After that, the case study's methodology is explained in depth, and the results and a discussion of the results are offered in the final sections.

2. Case Study Contextual Description

2.1. The Technological Context

According to de-la-Fuente-Valentín and Burgos, u-Tutor is a decision support approach that predicts student behavior by analyzing the similarity of current students to those from previous courses [17]. This similarity analysis is correlated (as a behavior prediction approach) with the final scores that learners from previous courses attained upon completion of the course. Tutors/teachers are provided with a visual depiction of the learners in current courses and a measure of how similar the current students are to learners from previous courses. In Shneiderman's terms, the tutor can obtain a fast overview of the course status as well as information on demand [18]. By offering tailored information about individuals, u-Tutor enables the tutor to personalize instruction to the needs of the learners: the tutor is the key driver of the adaptation, while the tool becomes the information source. To foster institutional adoption of the tool, the technique aims at smooth incorporation into the teachers' workflow, while attending to the relevance of the solution and the usability of the tool.
In the present study, any learner admitted to an academic course is designated a 'student', the professor a 'teacher', and the assistant offering support to students throughout an academic year a 'tutor'. In this specific context, two students are similar when they generate a comparable event log within a certain arbitrary duration (for this study, 3 weeks). The similarity measurement is based on the event type and the number of repetitions per event (such as four forum views, seven course content views and a single activity submission), and the calculation has a subject scope (two students may be similar in one subject and different in another). De-la-Fuente-Valentín and Burgos [17] provide a comprehensive discussion of the similarity metrics. u-Tutor is designed for online, distance settings where the VLE is the main interaction point with peers and course material. In this context, tutors hold regular consultancy time with the students, and identifying similar students helps optimize that consultancy. The following example illustrates how information is retrieved from u-Tutor: Simon (a student) is admitted to a programming course that runs for 3 months and is currently in its third week. During the preceding course edition (with equal duration and identical learning outcomes, available tools and pedagogical approach), the monitoring system captured the other students' activities, and it is now capturing Simon's. Simon's activity pattern (that is, the number of occurrences of the different activity types) is compared against that of each of the previous course's students, yielding a similarity score between Simon and each historical student. This similarity information is then related visually to the resulting scores as follows: the students of the previous courses are grouped according to their attained score (0–1, 1–2, and so on, on a scale of 0–10). u-Tutor calculates Simon's similarity to a group as the average of his similarity to all the group's students. This similarity value determines the color used to denote the group in the visualization (higher similarity is represented by darker colors). The resulting visualization is portrayed in Figure 1, indicating that Simon's behavior is similar to that of the students who attained scores ranging from 4 to 8.
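To make this pipeline concrete, the following minimal sketch reproduces the two steps just described: a similarity between event-count patterns within the observation window, and the averaging over score bands that drives the colouring. The exact metric is the one defined in [17]; the cosine measure, the event names and the sample data below are illustrative assumptions, not the tool's actual implementation.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Similarity between two students' event-count patterns, in [0, 1]."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0) * b.get(k, 0) for k in keys)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def band_similarities(current, history):
    """history: (event_counts, final_score) pairs for past students.
    Groups them into score bands (0-1, 1-2, ..., 9-10) and returns the
    average similarity per band; the visualization maps higher values
    to darker colours."""
    bands = {}
    for counts, score in history:
        low = min(int(score), 9)  # a score of 10 joins the 9-10 band
        bands.setdefault(low, []).append(cosine_similarity(current, counts))
    return {low: sum(v) / len(v) for low, v in bands.items()}

# Simon's 3-week pattern: four forum views, seven content views, one submission
simon = {"forum": 4, "content": 7, "submission": 1}
history = [({"forum": 5, "content": 6, "submission": 1}, 6.5),
           ({"forum": 0, "content": 1, "submission": 0}, 2.0)]
print(band_similarities(simon, history))  # approx. {6: 0.98, 2: 0.86}
```

In the tool itself, each band's average similarity is rendered as a colour intensity rather than printed, which is what produces the row of shaded cells shown in Figure 1.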
The teacher/tutor interface shows the students of the current course as rows; Figure 2 portrays the resulting visualization.
Rather than directly estimating Simon's score, u-Tutor indicates the average score of similar historical students; the estimation is made by visually interpreting the figure. The tutor has systematic, personal communication with all the students, by email or phone call. This means that a tutor is aware of each student's personal situation and is able to contextualize the visualized information, interpret it, and make the necessary decision when required. u-Tutor offers three visualization approaches. First, the global view is a pie chart classifying learners as 'at severe risk', 'at risk', 'pass' and 'outstanding'. Second, the grid view (see Figure 2) represents each student as a row; it is split into tabs, one for each of the categories in the global view. Third, the student-centric view (Figure 1) portrays comprehensive information about a single student, including a row with the similarity values. Additional information about u-Tutor can be found in de-la-Fuente-Valentín and Burgos [17,19].

2.2. The Educational Context

The setup was deployed in courses run at the Universidad Internacional de La Rioja (UNIR), an online distance-learning institution operating in Spain, South America and the USA, where most of the 40,000 students are Spanish and Latin American. The case study concentrated on two courses, both taught in Spanish, "Web Projects Management" and "Web Services Administration", two 4-week modules of the same master's programme. The courses had the following features. A total of 392 students took part in the two courses. Each course was divided into groups of 35 students, each group with an allocated teacher, called a tutor; 12 different tutors took part in total. The two courses started and finished on the same dates. Throughout the 4-week duration, the students completed several activities, submitted by the final day of the course; these activities accounted for 40% of the overall mark, while a final face-to-face assessment provided the remaining 60%. The tracked online activity therefore did not cover the face-to-face assessment. Furthermore, the tutors tracked the students' activity throughout the master's programme and provided them with customized advice. Students were required to pass the course in order to obtain the master's degree. Finally, the previous courses considered for estimating the similarity measures comprised the course's 9 previous editions, in which over 500 students were enrolled in total.
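As a worked reading of the assessment scheme just stated (the 40/60 weights are as described; the formula itself is our paraphrase, and note that passing the final exam was additionally required, as discussed in Section 5.3):

$$\text{Final mark} = 0.4 \times \text{continuous activities} + 0.6 \times \text{final exam}$$

For instance, a student scoring 7 in the continuous activities and 6 in the final exam would obtain 0.4 × 7 + 0.6 × 6 = 6.4.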
Given that the tutors were the users of u-Tutor in the present case study, the tutor's role in the learning process needs to be defined. The master's programme comprised 20 independent courses of 4 weeks' average duration. Two courses run at the onset of the master's programme, and the subsequent two commence when the first ones have finished. The courses studied here ("Web Services Administration" and "Web Projects Management") were conducted in the middle of the semester. The students were supported by the tutors throughout the 9-month duration of the master's programme. In practice, the tutor's function is to supervise and follow up on the students' progress from a transversal point of view, knowing them and supporting them in their personal situations. For instance, tutors may call students if they notice a drop in performance or foresee a potential drop-out. The tutor understands the student's background and approaches them beyond the scope of a single course; this personal comprehension of a student's situation enables the tutor to contextualize any information concerning that student.

3. Research Areas and Related Research Questions

The case study was guided by three research areas, expressed in five research questions. The analysis considered the experimental setting as a whole, seeking context for a better comprehension of the observations. At the same time, the observations sought to address the research areas through a number of research questions that helped the researchers focus their efforts. The five research questions cover the areas of (1) perceived usefulness, (2) usability and (3) success rate of classification.

3.1. Perceived Usefulness

Notwithstanding the precision of the learning analytics algorithm, the adoption process requires the tools to be useful. To this end, and beyond the tool's success rate in classifying students, the case study analysis ought to validate whether the tool was helpful to the tutors and whether it could be integrated into their daily workflow. As a result, the present case study comprises mechanisms to evaluate how frequently the tool was used and whether it actually supported the tutors' tasks. The other key aspect under consideration is the tool's impact on the tutors' workflow, that is, whether the tutors excluded or included activities in their daily duties as a result of using u-Tutor. The study duration of 4 weeks might have been insufficient to produce lasting changes in their workflow, but it offered insight into the impact of the tool. The specific research questions are as follows:
  • [RQ1] What is the perceived usefulness of u-Tutor?
  • [RQ2] How effective is u-Tutor at prompting tutor actions that would not occur without the tool?

3.2. Usability

Consistent with Lukarov and colleagues, usability is a significant topic meriting assessment in learning analytics systems [15]. Thus, the design and implementation of a visualization tool ought to integrate usability by design, and the tool's validation ought to confirm its usability. The design and development of u-Tutor followed an iterative method of refining the user interface [18]. The third specific research question is as follows:
  • [RQ3] Do the tutors understand the visual information and the interface options, and feel comfortable using them?

3.3. Success Rate of Classification

Ultimately, u-Tutor functions as a decision support system that depends on estimating students' results and categorizing them according to this prediction. Therefore, it is imperative to verify the accuracy of the classification. The specific research questions here are:
  • [RQ4] To what extent do the classifications match the actual results?
  • [RQ5] To what extent do the classifications match the tutors’ beliefs?
The analysis of these questions follows a quantitative approach, comparing the estimated marks with the actual marks.

4. Methodology

4.1. Settings of the Case Study

The work presented in this paper follows a case study methodology: the u-Tutor approach was deployed in a real setting on an online learning platform and made available to the tutors. The case study analysis is based on observations and data gathered through the instruments discussed in the Data Capture Methods section. u-Tutor was configured for use in the courses described above: it acquired events from the Learning Management System (LMS) and analyzed them to build the similarity visualization. In addition, as shown in Figure 3, u-Tutor was incorporated into the LMS user interface, facilitating easy access to the tool. Spanish appears in some of the figures because the course was taught in Spanish.
Although each course ran for 4 weeks, the case study itself lasted only two weeks. The main reason is that, since the tool retrieves information from previous courses and compares it with the current students' behavior, it requires user tracking of the current cohort before it can make the comparison and provide useful insight. With insufficient data from the present, the coupling with the past is meaningless. Thus, to avoid the cold-start effect, the tool was set up at the course onset but handed to the tutors at the beginning of the course's third week. The researchers then held a training session in which they explained the tool's functionality and characteristics. Once this was done, the tutors were given login credentials to access the tool and asked to interact with it freely during their daily tasks, using it at their own pace. The tutors were asked to review the status of the students in u-Tutor before getting in touch with them. Following the framework suggested by Drachsler and Greller [20], the learning analytics set-up in the present study uses the dimensions shown in Table 1.

4.2. Data Capture Methods

The researchers observed the case study through the following artefacts:
Throughout the courses, human estimation was integrated into the single-student view of u-Tutor as a simple interface with a slider that let the user provide an estimate of the student's current mark. This interface (portrayed in Figure 4) opened at random upon accessing the single-student view and could be closed or reopened on request. In the present study, the input obtained by this means is referred to as the 'estimation' or 'human' value. Some of the observations made required further description; in those cases, the researchers contacted the tutoring team by email. The communication was reliable and replies did not take more than a day. u-Tutor also includes an interface for reporting a problem (Figure 5). Its primary purpose was to capture functionality that tutors expected but the tool did not provide; however, it was equally used to capture the tutors' generic comments. In addition, the analysis accounted for the final marks attained by the students in the course activities (40%) and in the final assessment (60%).
For each student, the visualization proposed a machine-determined score interval based on similar students from the preceding courses. This estimate was stored for later comparison against the real marks. The u-Tutor user logs were also stored, enabling an assessment of the users' interaction with the tool.
Once the courses finished, the tutors were presented with an anonymous online questionnaire with both open-text and multiple-choice questions, the latter using a Likert scale. The questionnaire addressed the three research areas, (1) perceived usefulness, (2) usability and (3) success rate of classification (accuracy), along with the tutors' overall opinions about u-Tutor. It was followed by personal semi-structured interviews, conducted and recorded as video and audio conferences, which completed the information gathered from the tutors. Since the interviews were conducted after the questionnaires, the researchers prepared for them by reviewing the questionnaire answers, even though the final scores were already available.

4.3. Analysis Methods

Following the principles of the case study approach [21], the analysis respects the uniqueness of the case and does not attempt to generalize the results. For instance, the success in classifying students depends on the characteristics of the present case, so the analysis seeks to establish the circumstances that affected the success rate in this specific case rather than to generalize the attained rate. The case study analysis was guided by the research areas and research questions presented in Section 3, each of a different nature. Each research question helps build the feedback around its related research area, and they were analyzed as follows:
For research questions one and two, the researchers carried out a qualitative analysis. On the one hand, LMS usage and server log statistics were evaluated to identify whether the tutors used the tool on a daily basis; on the other hand, the interview and questionnaire answers captured the tutors' subjective opinions of the tool's utility. Triangulation of the data sources was used to reinforce the findings. For the third research question, a qualitative analysis was conducted on the questionnaire feedback, the issues submitted through the report-a-problem feedback interface, and the interviews, which captured the tutors' subjective opinions of the tool's usability. Brooke's SUS questionnaire [22] was considered but rejected owing to the small number of potential respondents. Finally, the analysis of the last two research questions used three values: machine estimations, human estimations and the real marks. The error and the success rate of the estimations were computed by comparing the estimates with the real outcomes. In addition, the questionnaire analysis assessed the tutors' attitude towards the estimation approach. Figure 6 illustrates the estimated range versus the actual score. In estimating the success rate, both machine and human estimations were based on intervals rather than single values; that is, tutors indicated a score interval within which they expected the student to fall. The success rate was calculated as the number of successful estimations divided by the total number of estimations made, as follows:
$$\text{Success rate} = \frac{\text{Successful estimations}}{\text{Total estimations}}$$
Further, given that scores were estimated as intervals rather than single values, the error measurement must take the interval size into account. For instance, if a student's score is 8.2, then the estimation [8,9] is far more accurate than the estimation [5,9], even though the nearest border of the range is similar in both cases. The formula below incorporates the range size into the error measurement, with $D_b$ being the distance from the actual score to the estimated interval (zero if the score falls inside it) and $D_c$ the distance from the actual score to the center of the estimated interval:
$$\text{Error in estimation} = \frac{D_b + D_c}{2}$$
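A minimal sketch of both metrics follows, assuming (as the text implies) that an estimation counts as successful when the actual score falls inside the estimated interval, in which case $D_b = 0$; the function names are ours. The example reproduces the first row of Table 6.

```python
def error_in_estimation(actual, interval):
    """Error = (D_b + D_c) / 2: D_b is the distance from the actual score
    to the nearest border of the estimated interval (0 if inside), and
    D_c the distance to the interval's centre."""
    low, high = interval
    d_b = max(low - actual, actual - high, 0)
    d_c = abs(actual - (low + high) / 2)
    return (d_b + d_c) / 2

def success_rate(estimations):
    """estimations: (actual_score, (low, high)) pairs. An estimation is
    assumed successful when the actual score falls inside the interval."""
    hits = sum(1 for actual, (low, high) in estimations if low <= actual <= high)
    return hits / len(estimations)

print(round(error_in_estimation(6.92, (7, 8)), 2))  # 0.33, as in Table 6
```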

5. Results

As mentioned in the previous section, the courses were overseen by 12 tutors. Owing to work management issues, responsibilities were shared among them in pairs, which influenced the way u-Tutor was used: one tutor actively engaged with the tool and kept in contact with the students, while another handled the administrative duties without direct involvement with the students or the tool. This affected the piloting of the study, since it reduced the total information available for the case study regarding perceived usefulness and usability. The following sub-sections present the results for the three research areas: (1) perceived usefulness, (2) usability and (3) success rate of classification. As noted above, the author used five research questions as a way to retrieve the users' input towards a more informed analysis focused on those areas. The research questions are addressed in the context of the research areas.

5.1. Perceived Usefulness

The platform log analysis considers how many times the tutors logged in to u-Tutor and which parts of the tool they accessed. The usage analysis is not very meaningful by itself, but it helps in understanding the tutors' perception of the tool's usefulness. The number of views is presented in Table 2 and Table 3. Both tables focus on the last week of each course, since most of the tutors' activity occurred during this period. Table 2 compares the grouped-student view with the single-student view. The results indicate strong platform use on the first day and minimal use on the subsequent days. This is reasonable, since the tool's information is updated daily, whereas the students' status does not change so swiftly and normally requires more time. Moreover, as discussed in the Methodology section, the training session preceded all activity. Therefore, the broad use of the tool at the start of the week (124 single-student views, for 392 students) gives an overall picture that can be complemented daily with lighter use on the remaining days. Throughout the week, the tutors visited approximately 29 pages per day.
Table 3 examines whether the tutors perceived one of the views as more useful than the rest, for instance, whether the 'at severe risk' category attracted more interest (hence more views) than the others. On the first weekday, all classifications were visited. By the last day of the week, the 'at severe risk' and 'at risk' groups received much more attention than the rest.
The number of received estimations (also shown in Table 3) follows the usage statistics; that is, intensive use of the tool occurred at the start of the week. The tutors' use of u-Tutor was deep enough to support the perceived relevance of the tool. Upon completion of the course, a questionnaire (in Appendix A) and a set of interviews followed, in which the tutors gave their feedback. Before the interviews, the researchers analyzed the questionnaire answers. Table 4 summarizes the questionnaire feedback on u-Tutor's perceived usefulness. The answers shown are a sample of significant answers, although the analysis took all the provided answers into account; Table 4 is, therefore, an excerpt of the output. Many of these questions were posed as multiple-choice questions, the goal being to use the interviews to examine the reasons for the answers.
An initial inspection of the answers indicates that the tutors found relevant information while using the tool and used the gathered information in their everyday duties. Based on their use of u-Tutor, they identified certain situations and called the students involved (that being the eventual duty of the tutor). Typically, they used the feedback from the tool to shape their contact with the student and provide personalized support and encouragement regarding specific performance and the overall approach to the learning flow. The tutors acknowledged that they could probably have acquired the same information from other sources, although u-Tutor simplified the task. Besides that, notwithstanding the number of page views in the usage statistics (at least 30 pages daily), the tutors did not perceive the tool as integral to their workflow. Additionally, they suggested a few improvements before integrating u-Tutor in future courses, such as supporting a diversity of learning settings.
The third question sought to determine whether the tutors used the tool merely because they were asked to do so, or whether they actually found relevant information in it. This issue is pertinent to the validity of the evidence regarding perceived usefulness, so several items in the interview also addressed it. When asked about it, the tutors who used u-Tutor said that they initially approached the tool because they were asked to and regarded it as additional work. Nonetheless, they soon realized that identifying inactive students was easy and helpful. That is, they began using u-Tutor hesitantly but eventually recognized its positive utility.
Question 9 is equally important for identifying u-Tutor's usefulness, so the issue was discussed during the interviews. In the questionnaire feedback, the tutors indicated that they would not be willing to use the tool again if deployed under identical settings. During the interviews, however, they suggested features that would motivate them to use u-Tutor in the future. Specifically, they proposed a means to find specific students ('If I could search for a student, u-Tutor would be really useful for me'), a student-centred view with all of a student's courses in a single view ('For the single-student view, I expected information for all the current courses in the same view'), and already-marked activities as an additional source for the similarity calculations ('If the tool included already-marked activities, it would be really accurate and therefore useful').
The interviews also verified one of u-Tutor's design principles as a visual analytics approach: the need for human interpretation to contextualize the data before making a decision. Specifically, the tutors said, 'In some cases, I found severe-risk students, but I knew their personal circumstances and know that they will do a good job,' and 'In some cases, the estimation given for a student was in two score intervals (e.g., 4–5 and 6–7). In these cases, I selected the interval according to what I already knew about the student.' Typically, the tutors viewed u-Tutor as decision-making support, not as an independent decision maker. The study also identified some weaknesses of the tool. First, as mentioned above, the tool lacks the ability to find a student quickly; the tutors noted that '[because of not having a student search tool] using u-Tutor slowed down some of my tasks.' This feature appears to be a first-priority improvement for any upcoming deployment of u-Tutor in the scenario under study. The other issue raised concerns the confidence level, which the tutors did not use. During the interviews there was some confusion regarding the intended functionality of the confidence level selection, and even after clarification the tutors stated that this feature was not useful. From the researchers' point of view, this information is vital for future iterations that integrate confidence levels in a suitable manner. Finally, it is worth noting that the tutors acknowledged that 'A visual representation of the information helped me with understanding the statistical data from the learning management system.'

5.2. Usability

The usability analysis sought to establish the degree to which tutors understood and could use the visual information and interface options. A key concern, evident from previous studies using u-Tutor, is the difficulty of conveying the nature of the information offered in the visualization, specifically, the story behind the data. To use the tool effectively, u-Tutor users must understand that the visualization portrays a measure of similarity to students whose scores fell within a certain interval, and that this is an estimate, not an exact prediction. Only by understanding the story behind the data can tutors contextualize the information; otherwise, misinterpretations occur and lead to faulty decisions. In this study, the tutors received a training session in which the researchers explained the interface and the nature of the visualized data. In the questionnaire, the tutors who used u-Tutor described it as a tool that 'lets you to see the result that a student may have, taking into account students from previous courses that behaved similarly.' During the interviews, the tutors also acknowledged: 'I know that u-Tutor also considers odd cases, because a student from previous courses may also have the same odd conduct.' This quote shows an understanding that the system simply examines similarities; it does not judge whether a behavioral pattern is good or bad. Through it, the tutors showed they realized that the central concern is the similarity measure, not the sheer level of activity. These questionnaire and interview quotations indicate that the tutors genuinely understood the nature of the tool. The report-a-problem feedback interface was used once, to request a student-centric visualization; that is, a tutor wanted to collect a report on a single student (which turned out to be a typical situation). The current version of u-Tutor does not provide such an interface, so this is a key usability concern and a feature for future development. Further questionnaire items explored how difficult the tutors found it to understand the interface. According to the answers, the tutors considered the system simple to use and were able to explain the elements of the u-Tutor interface; they did not need to learn much before using it. Since the tutors did not complain about usability issues (besides the student search) and showed agile use of the tool, the usability of u-Tutor is currently perceived as stable, with users providing positive feedback.

5.3. Success Rate of Classification

The interview data show that tutors were initially reluctant to rely on automated estimates, arguing, for example: 'You may find a student that downloads all the course content on the first day and stops interacting with the LMS. He may achieve good results, and the tool would misinterpret his data.' This is a common argument against the use of analytics; as noted by Leony and colleagues [4], an approach to assess the analytics coverage would diminish the problem. Notwithstanding such adverse opinions, the tutors displayed attitudes and answers revealing their (relative) trust in the obtained estimations. A typical example comprises the answers provided in the questionnaire, as shown in Table 5. Another instance comes from the interviews: '[when I used the tool to make my estimations] I never selected an interval with a white color (lowest similarity).' This suggests that the visualization influenced the tutors' beliefs about the students. The tutors also acknowledged that 'In general terms, the u-Tutor estimations matched my opinion, built upon my conversations with the students.'
The tutors' opinions also highlight one of the key features of u-Tutor as a visual analytics tool: the necessity of human interpretation of machine outcomes to contextualize the data. The numerical breakdown of estimations and results supports this requirement. The precision of the estimations was evaluated by considering the estimations made by the machine (without any human interpretation), the estimations made by the tutors (supported by u-Tutor), and the real obtained scores. Two measures were considered: estimation error and estimation success rate. For the 29 human estimations acquired, a success rate of 27% (8 in total) was obtained. For the same 29 cases, the machine estimation succeeded in 7 of them and, remarkably, the tool and the tutor never succeeded simultaneously. This underlines the importance of analyzing the errors of the estimations.
When comparing the success of automatic and human estimations, the results indicate a better success ratio for tutors with students who passed the course, whereas the automated estimations succeeded in cases where students did not take the final exam. This behavior is portrayed in Figure 7. In other words, u-Tutor can act as an early warning system for at-risk students while the tutors cover the successful students, and u-Tutor remains a decision support tool requiring human contextualization.
Practical examples help in understanding the estimation error, as shown in Table 6. The examples given are illustrative rather than representative. Error values lower than 1 indicate good estimations of the actual score; values between 1 and 1.5 are acceptable; those between 1.5 and 2 are borderline cases; and those over 2 are not acceptable estimations.
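The interpretation bands just listed can be expressed as a small helper (a sketch; the thresholds are as stated above, while the labels and function name are ours):

```python
def judge_estimation(error):
    """Qualitative reading of an estimation error, per the thresholds above."""
    if error < 1:
        return "good"
    if error <= 1.5:
        return "acceptable"
    if error <= 2:
        return "borderline"
    return "not acceptable"

print(judge_estimation(0.33))  # 'good': first row of Table 6
```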
Table 7 shows that the average error, taken at face value, is not acceptable for estimating the score. Nonetheless, a detailed analysis of the cases with higher error values indicated that those exceeding 2 belong to students who did not take the examination even though they were expected to do so. Discarding those extreme cases, the average error falls within the acceptable margins.
To conclude, u-Tutor provided the tutors with information for estimating the students' scores within acceptable error margins. Better estimations were obtained for students who eventually passed the course, whereas u-Tutor served as an early warning system for at-risk students. Note that, regardless of the marks attained during the 4-week course, all students had to pass the final exam in order to pass the course; 'at-risk' students could thus be warned before their examination.

6. Limitations of the Study

The researchers acknowledge that the small sample size (12 tutors) is a limitation of the study. Although the research was initially planned with a higher number of participating tutors, organizational problems prevented most of them from taking part, and the final results therefore cannot be considered generalizable. While this is less critical in qualitative research, it remains the major limitation of the study. In addition, there is a possibility of cultural bias: the study setting was a Spanish institution with Spanish-speaking students from Latin America and Spain. This must be kept in mind when interpreting the results.

7. Conclusions and Future Work

This article offers a case study on the deployment of u-Tutor in a practical learning setting in which 392 students were supported by the faculty, drawing on the 9 preceding course editions, from 2015 to 2019. The tool aimed to help the tutors adapt the tutoring experience to the students' needs, using similarity metrics to compare current students with those from previous editions of the courses and assessing their performance. In essence, the system evaluates the users' current behavior to estimate their future behavior and the associated final outcome. The study's goal was to validate the tool across three research areas: (1) perceived usefulness, (2) usability and (3) success rate of classification. Regarding perceived usefulness, the evaluation of the tutors' use of the tool, the survey responses and the interview responses indicate that the tutors were able to identify cases that they would otherwise not have spotted; u-Tutor was thus considered valuable in enhancing tutoring. Regarding usability, the study did not find any key usability problems, with the users expressing an affirmative view of the interface; the usability of u-Tutor is therefore considered stable. Finally, regarding the success rate of classification, the information offered by u-Tutor, once contextualized by the tutors, enabled them to estimate the students' scores.
The study findings are consistent with existing research, demonstrating the stability of u-Tutor's usability, indicating the absence of key issues, and proposing novel functions to offer improved support. A key lesson from this study is the need to separate estimation from description: u-Tutor offers a visual depiction of the resemblance between current learners and those of previous courses, and the user must understand that information as an approximation. While the results indicated the estimation potential of u-Tutor, the tool is descriptive in nature, so the end user needs to bear this in mind when interpreting the results. According to the data, the predictions by u-Tutor complemented those of the tutors, indicating the ability of the tool to act as a supportive feature.
For a practical application of this research, including the u-Tutor tool, in other contexts or within the same educational context of the host university, a clear recommendation is to scale up the sample of the tutors. In practice, since every tutor is assigned to a group of students, this upscaling would require a complementary increase of students. In doing so, the results of the semi-structured interviews and the questionnaires could show a diversity of situations and user profiles, along with reactions from the tutors that could feed an informed database for further use and comparison. The second practical recommendation is to retrieve as deep a background as possible, so that the actual search for similarities uses a broader spectrum that can better categorize and identify every single case in a present cohort. This fine-tuning process would increase the chances for early prediction and supportive or corrective actions.

Funding

This research received no external funding.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A

Questionnaire delivered to the tutors (question number, question, type of question and possible answers):
1. How often did you use the tool? (Multiple-choice)
  • Always when I worked in the supported courses
  • Most of the time I worked in the supported courses
  • In some occasions that I worked in the supported courses
  • Rarely
  • Never
2. About the information given by u-Tutor. (Multiple-choice)
  • It is redundant to what I already knew
  • I could get the information by myself, but u-Tutor makes the task more agile
  • I would not know how to extract this information
3. When you used the tool, what was your purpose? (Multiple-choice)
  • Obtain information on the students
  • To get useful information to support the pilot
4. Did you decide to actively support any student due to u-Tutor information? (Multiple-choice)
  • Yes, in many cases
  • Yes, some of the students
  • Yes, with just a few students
  • No, never
5. If your previous answer was 'yes', explain what type of support. (Open question)
6. Choose the reason for your support action. (Multiple-choice)
  • I supported the student because u-Tutor warned me about a situation I would not have found by myself
  • I supported the student because u-Tutor helped me confirm critical cases already identified by myself
  • Other reason: __________
7. For what task did u-Tutor support you? (Open question)
8. Did you integrate u-Tutor into your daily workflow? (Open question)
9. Would you like to use u-Tutor in future courses? (Open question)

References

  1. Siemens, G.; Baker, R.S.J.d. Learning analytics and educational data mining. In Proceedings of the 2nd International Conference on Learning Analytics and Knowledge—LAK ’12, Vancouver, BC, Canada, 29 April–2 May 2012; ACM Press: New York, NY, USA, 2012; p. 252. [Google Scholar] [CrossRef]
  2. Vieira, C.; Parsons, P.; Byrd, V. Visual learning analytics of educational data: A systematic literature review and research agenda. Comput. Educ. 2018, 122, 119–135. [Google Scholar] [CrossRef]
  3. Papadakis, S.; Kalogiannakis, M.; Sifaki, E.; Vidakis, N. Access Moodle Using Smart Mobile Phones. A Case Study in a Greek University. In Interactivity, Game Creation, Design, Learning, and Innovation; ArtsIT 2017, DLI 2017; Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering; Brooks, A., Brooks, E., Vidakis, N., Eds.; Springer: Cham, Switzerland, 2018; Volume 229, pp. 376–385. [Google Scholar]
  4. Leony, D.; Crespo, R.M.; Perez-Sanagustin, M.; de la Fuente Valentín, L.; Pardo, A. Coverage metrics for learning-event datasets based on client-side monitoring. In Proceedings of the 2012 IEEE 12th International Conference on Advanced Learning Technologies, Rome, Italy, 4–6 July 2012. [Google Scholar]
  5. Romero-Zaldivar, V.-A.; Pardo, A.; Burgos, D.; Delgado Kloos, C. Monitoring student progress using virtual appliances: A case study. Comput. Educ. 2012, 58, 1058–1067. [Google Scholar] [CrossRef] [Green Version]
  6. Tobarra, L.; Ros, S.; Hernández, R.; Robles-Gómez, A.; Caminero, A.C.; Pastor, R. Integration of multiple data sources for predicting the engagement of students in practical activities. Int. J. Interact. Multimed. Artif. Intell. 2014, 2, 53–62. [Google Scholar] [CrossRef]
  7. Dunn, K.E.; Rakes, G.C.; Rakes, T.A. Influence of academic self-regulation, critical thinking, and age on online graduate students’ academic help-seeking. Distance Educ. 2014, 35, 75–89. [Google Scholar] [CrossRef]
  8. Greene, J.A.; Azevedo, R. A theoretical review of Winne and Hadwin’s model of self-regulated learning: New perspectives and directions. Rev. Educ. Res. 2007, 77, 334–372. [Google Scholar] [CrossRef] [Green Version]
  9. Mangaroska, K.; Giannakos, M.N. Learning analytics for learning design: A systematic literature review of analytics-driven design to enhance learning. IEEE Trans. Learn. Technol. 2018, 12, 516–534. [Google Scholar] [CrossRef] [Green Version]
  10. Jayaprakash, S.M.; Moody, E.W.; Lauría, E.J.M.; Regan, J.R.; Baron, J.D. Early alert of academically at-risk students: An open source analytics initiative. J. Learn. Anal. 2014, 1, 6–47. [Google Scholar] [CrossRef] [Green Version]
  11. Bainbridge, J.; Melitski, J.; Zahradnik, A.; Lauría, E.J.M.; Jayaprakash, S.; Baron, J. Using learning analytics to predict at-risk students in online graduate public affairs and administration education. J. Public Aff. Educ. 2015, 21, 247–262. [Google Scholar] [CrossRef]
  12. Cambruzzi, W.; Rigo, S.J.; Barbosa, J.L.V. Dropout prediction and reduction in distance education courses with the learning analytics multitrail approach. J. Univers. Comput. Sci. 2015, 21, 23–47. [Google Scholar]
  13. Papamitsiou, Z.; Economides, A.A. Learning analytics and educational data mining in practice: A systematic literature review of empirical evidence. J. Educ. Technol. Soc. 2014, 17, 49–64. [Google Scholar]
  14. Prieto, L.P.; Rodríguez Triana, M.J.; Martínez Maldonado, R.; Dimitriadis, Y.A.; Gašević, D. Orchestrating learning analytics (OrLA): Supporting inter-stakeholder communication about adoption of learning analytics at the classroom level. Australas. J. Educ. Technol. 2019, 35, 14–33. [Google Scholar] [CrossRef]
  15. Lukarov, V.; Chatti, M.A.; Schroeder, U. Learning analytics evaluation—Beyond usability. In Proceedings of the DeLFI Workshops; Rathmayer, S., Pongratz, H., Eds.; CEUR Workshop Proceedings: Aachen, Germany, 2015; pp. 123–131. [Google Scholar]
  16. Dawson, S.; Gašević, D.; Siemens, G.; Joksimovic, S. Current state and future trends: A citation network analysis of the learning analytics field. In Proceedings of the Fourth International Conference on Learning Analytics and Knowledge—LAK ’14, Indianapolis, IN, USA, 24–28 March 2014; ACM Press: New York, NY, USA, 2014; pp. 231–240. [Google Scholar] [CrossRef]
  17. De-la-Fuente-Valentín, L.; Burgos, D. Am I doing well? A4Learning as a self-awareness tool to integrate in Learning Management Systems. Campus Virtuales 2014, 3, 32–40. [Google Scholar]
  18. Shneiderman, B. The eyes have it: A task by data type taxonomy for information visualizations. In Proceedings of the 1996 IEEE Symposium on Visual Languages, Boulder, CO, USA, 3–6 September 1996. [Google Scholar]
  19. De-la-Fuente-Valentín, L.; Burgos, D. A4Learning: Un enfoque metodológico iterativo para apoyar mejor el aprendizaje y la enseñanza. IEEE Lat. Am. Trans. 2015, 13, 477–484. [Google Scholar]
  20. Drachsler, H.; Greller, W. The pulse of learning analytics understandings and expectations from the stakeholders. In Proceedings of the 2nd International Conference on Learning Analytics and Knowledge, Vancouver, BC, Canada, 29 April–2 May 2012; ACM: New York, NY, USA, 2012; pp. 120–129. [Google Scholar]
  21. Stake, R.E. The Art of Case Study Research; Sage: Thousand Oaks, CA, USA, 1995. [Google Scholar]
  22. Brooke, J. SUS—A quick and dirty usability scale. In Usability Evaluation in Industry; Taylor & Francis: London, UK, 1996; p. 189. [Google Scholar]
Figure 1. Student-centred view of similarities.
Figure 2. Grid view of similarities.
Figure 3. u-Tutor integration in the LMS.
Figure 4. Human-estimation feedback collector.
Figure 5. Report-a-problem feedback interface. Closed (left) and opened (right).
Figure 6. Estimated range versus actual score.
Figure 7. Distribution of success on estimation.
Table 1. Set-up of learning analytics in this case study.
  • Stakeholders. Data subjects: the students. Data clients: the tutors.
  • Objective. Reflection: the system captures similarities among students to inform the tutor about the marks obtained by those who, in previous courses, behaved similarly to a given student.
  • Data. Protected dataset: students' interactions within the LMS. Time scale: the interactions were analysed within a frame of 3 weeks.
  • Instruments. Algorithms: similarity measurements as described by de-la-Fuente-Valentín and Burgos [17]. Visualization: graphical solution designed to support this tool.
  • External limitations. Ethics: what are the dangers of misinterpreting the data? Data protection: the students have the legal right not to be analyzed.
  • Internal limitations. Required competences: will the visualization be eloquent enough to be easily understood by the tutors?
Table 2. Daily views, grouped by type.
Date          Single Students   Grouped Students
Last day–0    12                13
Last day–1    19                15
Last day–2    19                41
Last day–3    15                19
Last day–4    26                116
Table 3. Views per day, grouped by classified score.
Date / Estimations Made   Severe Risk   Risk   Pass   Outstanding
Last day–0                9             2      3      1
Last day–1                15            3      4      2
Last day–2                7             9      5      7
Last day–3                3             10     4      3
Last day–4                18            16     15     17
Table 4. Summary of perceived usefulness questions.
1. How often did you use the tool? (Multiple-choice) Answer: Most of the time I worked in the supported courses.
2. About the information given by u-Tutor. (Multiple-choice) Answer: I could get the information by myself, but u-Tutor makes the task more agile.
3. When you used the tool, what was your purpose? (Multiple-choice) Answer: Obtain information on the students.
4. Did you decide to actively support any student due to u-Tutor information? (Multiple-choice) Answer: Yes, some of the students.
5. If your previous answer was 'yes', explain what type of support. (Open question) Answer: It was easy to find inactive students. I called them to understand what was happening.
6. Choose the reason for your support action. (Multiple-choice) Answer: I supported the student because u-Tutor warned me about a situation I would not have found by myself.
7. For what task did u-Tutor support you? (Open question) Answer: To find students with low participation.
8. Did you integrate u-Tutor into your daily workflow? (Open question) Answers: No, I did not. / Yes, I have tried to integrate the tool.
9. Would you like to use u-Tutor in future courses? (Open question) Answers: No, because in this case, all the marked activities are delivered at the end of the course, and I do not know if the activity is enough to classify students. It would be preferable to use it in courses with continuous submissions. / Yes, u-Tutor gives me an outstanding view of what is going on with my groups. I need to understand better how to use it more efficiently, but I think that the early results look promising and will help me in improving my support to the students.
Table 5. Summary of accuracy (success rate of classification) questions.
To what extent do you agree with the following assertions? (Likert scale: 1 = strongly disagree, 5 = strongly agree)
10. u-Tutor, without any contextualization, is quite often successful in classifying students. Answer: 4.
11. After contextualizing information from u-Tutor, I often succeed in classifying students. Answer: 4.
12. u-Tutor estimations match my estimations. Answer: 4.
Table 6. Examples of error in estimation values.
Actual Score   Tutor Estimated Interval   Error in Estimation
6.92           [7,8]                      0.33
6.42           [4,6]                      0.92
9.3            [6,9]                      1.05
8.04           [5,7]                      1.54
8.14           [4,6]                      2.64
0              [3,5]                      3.5
8.34           [4,5]                      3.59
Table 7. Obtained error in estimation.
Average error in human estimation if no success (student failure): 2.39
Average error in human estimation if no success (student failure), discarding dropouts: 1.42
