#### **1. Introduction**

Adapting the subjects of any graduate degree to the new European Space for Higher Education, in which new capacities and abilities are evaluated in students, implies a significant change in the traditional teaching methodology, which has mainly relied on master classes. This change in the pedagogical model of teaching–learning is in turn conditioned and reinforced by the current digital society and by the new information and communication technologies available in any space-time framework [1,2]. The availability of information on everyday electronic devices such as smartphones or tablets, together with the internet connectivity of user groups, makes it possible to establish more active work dynamics that give the student greater participation in the teaching–learning process of a subject [3].

Studies evaluating the global impact of technology use on student performance are not conclusive and yield different results [4,5], because this educational research may depend on factors not identified in the analysis itself, such as the educational method or strategy [6] carried out through the electronic medium, or the student's commitment to the learning methodology [7]. However, these multimedia and interactive technologies can be of great help in offering a comprehensive, quality education [8] based on current computer tools that facilitate cognitive learning processes and reinforce the capacities for abstract reasoning and for studying a specific subject, in addition to complementing the traditional forms of learning [9].

**Citation:** López-Tocón, I. Moodle Quizzes as a Continuous Assessment in Higher Education: An Exploratory Approach in Physical Chemistry. *Educ. Sci.* **2021**, *11*, 500. https://doi.org/10.3390/educsci11090500

Academic Editors: Sandra Raquel Gonçalves Fernandes, Marta Abelha and Ana Teresa Ferreira-Oliveira

Received: 22 June 2021 Accepted: 30 August 2021 Published: 3 September 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Different teaching methodologies integrate technological devices into the educational environment, such as blended learning [10–12] and gamification [13–15]. Blended learning, or b-learning, combines face-to-face lessons in the classroom, required for any subject in university study plans, with virtual teaching activities on learning platforms, while gamification techniques try to create experiences similar to those of playing games in order to motivate and engage users. Both methodologies benefit from the presence of the teacher as a transmitter of knowledge and guide of educational activities, and from communication technology that facilitates independent and collaborative learning. In particular, b-learning has been applied in subjects from different areas of knowledge, such as education sciences [12], natural sciences [16], economics [9] and engineering [7], as a proposal for European convergence, given that it allows the student's non-contact work hours to be completed with virtual activities, as established in the new university teaching guides for convergence with the European Space for Higher Education.

The Moodle platform [17] is a virtual learning environment that offers very attractive functionalities from the pedagogical point of view, since it promotes the philosophy of social constructivist education [18,19] and hosts subjects in a way that is easy to handle for both teachers (editing) and students (use). This virtual environment can include teaching resources of different kinds, such as links to web pages, chats, forums, messages, and other specific documents like notes, tutorials and question lists prepared by the teacher. Moreover, it offers the possibility of carrying out online activities through quizzes, which could allow the continuous assessment of students' learning. A great variety of quizzes can be designed with different item types and settings, but not all quizzes can differentiate the skills and competences of students, and thus not all of them can be used as assessment tools. The quality of these quizzes can be analyzed with the statistical and psychometric data reported by the Moodle platform [20,21].

Regarding evaluation methods that use online quizzes, there are studies in diverse disciplines such as engineering, biology, medicine and the social sciences [22–24]. Although there are some objections to the implementation of such systems, related to the confidentiality of the student's identity, the use of the information and its possible impact on the educational process [25,26], they offer advantages such as efficient management of results in very large student groups, fast evaluation, and paper savings [27]. However, quizzes must be carefully designed in order to be used as an assessment tool. Two points are important in the design: the writing of the questions, using different item types, and the quiz settings themselves. The statistical and psychometric data derived from a particular quiz can be of great help in judging its quality. There are some studies that analyze the information generated from test-type quiz evaluations in other scientific subjects [20,21,28–30], showing how such results can be useful for professors and students. No statistical and psychometric studies on physical chemistry quizzes have been found in the literature.

In this work, two types of Moodle quizzes are designed for a physical chemistry subject. The main objective is to establish which type of quiz can be used as an assessment tool on the basis of statistical and psychometric data. The work highlights how Moodle statistics can be used to measure the effectiveness and reliability of a quiz. In addition, the final scores of the students who took these online activities are compared with those obtained under traditional education.

#### **2. Materials and Methods**

The research was designed in three stages: first, the student population was surveyed with a brief poll about their entry into the university; second, the students answered the quizzes during the teaching semester; and finally, the statistical and psychometric parameters of the quizzes were analyzed on the basis of classical test theory [31–33] (see Supplementary Materials). A brief survey was also carried out at the end of the teaching period to learn the students' opinion of this experience. The scores obtained in the two Ordinary Calls of exams are compared with those obtained in previous courses, in which the teaching methodology corresponded to a traditional education based exclusively on master classes.

This research was performed in the general physical chemistry subject over six years, from the 2014–2015 to the 2019–2020 academic years, just before the pandemic. This subject is included in the Basic Module of the Degree in Chemistry at the University of Málaga. It consists of six theoretical credits, and it is taught during the first semester of the first year of the degree.

This subject was chosen because it is difficult for new students in the Degree in Chemistry. It includes topics such as thermodynamics, electrochemistry and kinetics, which are the starting point of other physical chemistry subjects in higher courses, in which a significant dropout of students has been detected. Thus, it seems appropriate to apply a new educational methodology, or new activities using technological devices, in the first course in order to consolidate and strengthen the basic concepts of this subject.

#### *2.1. Sample*

The average number of students in general physical chemistry was about 80 during the last academic years, with equal proportions of men and women in the last four years. All students can freely participate in the quizzes as a single experimental group. No specific sampling method and no control group are established, so that all students are evaluated in a homogeneous way and there are no discrepancies in the final evaluation.

It was not possible to perform a similar study in other courses or scientific areas, or in other degrees, because no other teachers involved in the project were using a similar educational strategy with Moodle quizzes. Although this sample is not representative of the higher education context, the similar results obtained in this experience over different years with different student populations suggest that no significant changes would be expected in a similar scientific setting, which would probably show a similar trend.

At the beginning of the course, a brief survey is carried out to explore the students' admission to the university, including their academic background in chemistry and their enrollment in the degree. Averaged over the last six academic years, the large majority of students, 86–88%, are 18 years old, and the rest, 11–12%, are between 21 and 25 years old, probably because they repeated a year in the secondary or bachelor cycle or come from other degrees. Most of the students, 86–95%, studied a chemistry subject during high school, but about 4–5% had not studied any chemistry subject in any official degree before their admission to the university, although they indicate that they have basic knowledge of chemistry. Only a small proportion, 1–2%, have no knowledge of chemistry. Moreover, a high proportion, around 70–85%, enrolled in the Chemistry degree because it is their vocation and it was their first option in university pre-registration. Only 12–25% of students acknowledge that it is not their vocation and that it was not their first option in the university pre-registration. In addition, this degree was not the first choice of about 1–2% of students, but it was their only option for admission to the university.

#### *2.2. Development of the Experience: Didactic Strategy*

Within the Moodle platform, a question bank has been created and divided into five thematic blocks that cover all topics of the teaching program (Table 1). Each block has more than 50 questions or items, and over 100 items in the case of the Matter and Thermodynamics blocks. The question bank has over 450 items belonging to four types of Moodle questions: true/false, multiple choice (with multiple responses and single response), matching and numerical. All these items were written according to the scientific competencies required for passing this subject.


**Table 1.** Teaching program of the general physical chemistry subject developed in eleven lessons and distributed in five thematic blocks. Available at https://oas.sci.uma.es:8443/ht/2020/ProgramasAsignaturas\_Titulacion\_5004\_AsigUMA\_51635.pdf (accessed on 22 July 2021).

The set of items is in turn classified into two categories: one with the questions that cover the basic knowledge of the subject, and another with more elaborate questions that check the students' skills and abilities in practical reasoning about physical chemistry. In this way, two types of quiz are developed. First, a "basic" quiz (BQ) is proposed for each of the eleven topics. It consists of ten true/false items, with a time limit of one hour. The BQ contains the same questions for all students and is active for one week after the topic is finished in class. Second, a "thematic block" quiz (TBQ) is proposed for each of the five thematic blocks, which are made up of several topics in the teaching program, except those dedicated to chemical kinetics (see Table 1). It has ten items of different types (multiple choice, numerical, matching) chosen at random from a category of the question bank, so it is practically an individual and different test for each student. The multiple choice items have a particular characteristic: correct answers score positively and incorrect answers negatively, with a value proportional to the number of item options. These quizzes are held on a scheduled day.

In both types of quizzes, each item has the same statistical weight of 10% in the final mark. All quizzes are taken outside the classroom and have delayed feedback; that is, the correct answers can only be checked once the test is over for all students. All these activities are carried out continuously throughout the semester according to the physical chemistry program.

All students were informed about the characteristics of Moodle quizzes and how the platform works before doing the activities. In this way, any bias due to changes in students' attitudes towards technology over time would be diminished.
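As an illustration of the TBQ design described above, the following Python sketch draws ten random items from a question bank category and scores a single-response multiple choice item with a penalty for wrong answers. The function names, the bank layout and the penalty of 1/(k − 1) for an item with k options are assumptions made for this example; the text only states that the penalty is proportional to the number of options, and Moodle's own random-question mechanism is not reproduced here.

```python
import random


def draw_quiz(bank, n_items=10, seed=None):
    """Return n_items distinct items drawn at random from one bank category."""
    rng = random.Random(seed)
    return rng.sample(bank, n_items)


def score_single_choice(is_correct, n_options):
    """Score one single-response item.

    Assumed rule (not taken from the paper): +1 for a correct answer,
    -1/(n_options - 1) for a wrong one.
    """
    if n_options < 2:
        raise ValueError("an item needs at least two options")
    return 1.0 if is_correct else -1.0 / (n_options - 1)


def expected_guess_score(n_options):
    """Expected score of a student who picks one option uniformly at random."""
    p_correct = 1.0 / n_options
    return (p_correct * score_single_choice(True, n_options)
            + (1.0 - p_correct) * score_single_choice(False, n_options))


# Hypothetical category holding 60 item identifiers.
bank = [f"item-{i:03d}" for i in range(60)]
quiz_a = draw_quiz(bank, seed=1)  # quiz seen by one student
quiz_b = draw_quiz(bank, seed=2)  # quiz seen by another student
```

Under this assumed rule, blind guessing has an expected score of zero for any number of options, which is the usual motivation for negative marking in multiple choice items.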

#### **3. Results and Discussion**

#### *3.1. Participation in the Quizzes*

Participation was high during the last six academic years (Figure 1), above 50% in every BQ and TBQ, with the exception of the last BQ of the 2017–2018 academic year, which had a participation of 40%.

**Figure 1.** Participation in basic (BQ) and thematic block (TBQ) quizzes during the last six academic years. Own elaboration based on this study.

A detailed analysis by academic year shows the dynamics and evolution of the participation. Participation falls in the last quizzes and is always slightly lower than in the first ones. This decrease is most striking in the 2016–2017 and 2017–2018 academic years, in which participation goes from approximately 85% and 75% in the first BQ to 60% and 40% in the last one, respectively, and from 85% and 70% to 60% and 55% in the TBQs, respectively.

The general trend is a progressive decrease in participation throughout the semester in every academic year. This is due to several factors. First, enrolment in the subject can change: some students are waiting for a possible transfer to another degree at the beginning of the course, a process that does not materialize until about a month later, and in the meantime they have been taking the quizzes. Second, mid-semester partial exams of other subjects are held, so students are immersed in studying for them and end up not doing the quizzes, either because the period for taking them has passed or because they have not studied. Finally, by the end of the semester a large number of students have decided to drop out of the degree in chemistry and no longer take part in the training activities. The dropout rate in this first year of the degree is approximately 20–25%. In the initial survey of the class, up to 25% of the students consider that the degree in chemistry is not their vocation and that it was not their first option in the university pre-registration.

#### *3.2. Statistical and Psychometric Data of the Quizzes and Each Item*

The results provided directly by the Moodle platform (https://docs.moodle.org/dev/Quiz\_statistics\_calculations (accessed on 22 July 2021)) [20,21] have been analyzed and calculated according to classical test theory [32,33]. The Supplementary Materials summarize the definitions of the psychometric parameters. Tables 2 and 3 collect, for each quiz, statistical data such as the average score, the standard deviation (SD), the range of correct answers (maximum and minimum percentage), also called the facility index (FI), and the asymmetry in the distribution of the scores, also called bias, together with the internal consistency coefficient (ICC), or Cronbach's alpha, which gives an idea of the quality of the tests and indicates whether the whole exam is homogeneous.
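To make these parameters concrete, the following Python sketch computes a facility index, the skewness (bias) of the total scores and Cronbach's alpha from a small response matrix. It is only a sketch under stated assumptions: the matrix is invented for illustration, the function names are hypothetical, and the skewness uses the simple moment-based formula, which need not coincide with the sample-corrected estimator that Moodle reports.

```python
from statistics import mean, variance


def facility_index(item_scores, max_score=1.0):
    """Mean item score as a percentage of the maximum item score."""
    return 100.0 * mean(item_scores) / max_score


def skewness(values):
    """Moment-based skewness m3 / m2**1.5 (population moments)."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def cronbach_alpha(matrix):
    """Cronbach's alpha in percent; rows are students, columns are items."""
    k = len(matrix[0])
    item_vars = [variance(column) for column in zip(*matrix)]
    total_var = variance([sum(row) for row in matrix])
    return 100.0 * k / (k - 1) * (1.0 - sum(item_vars) / total_var)


# Invented data: 5 students x 4 items, 1 = correct, 0 = incorrect.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
totals = [sum(row) for row in scores]

fi_first_item = facility_index([row[0] for row in scores])  # 80% answered correctly
alpha = cronbach_alpha(scores)                              # approximately 80
bias = skewness(totals)                                     # 0 for this symmetric example
```

In this formula alpha becomes negative whenever the sum of the item variances exceeds the variance of the total score, that is, when the items covary negatively on average.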


**Table 2.** Statistical data corresponding to the eleven basic quizzes (BQ) in different academic years.

FI (Facility Index); SD (Standard Deviation); ICC (Internal consistency coefficient, Cronbach's alpha). Own elaboration based on this study.

The average score of every BQ in every academic year is high, between remarkable and outstanding (from 6.51 for BQ-10 in the 2015–2016 academic year to 9.83 for BQ-7 in the 2018–2019 academic year), with a high percentage of correct answers in each quiz that ranges from 70% to 100%. The exception is the 2015–2016 academic year, in which minimum success rates of 15%, 47% and 37% were obtained in BQ-6, BQ-7 and BQ-10, respectively, which correspond to the two most difficult topics to assimilate: thermodynamics and electrochemistry. This large range of correct answers yields an asymmetric distribution with a negative bias of magnitude greater than 1 in all academic years. This indicates a lack of discrimination among the students who do better than the average, and it is due both to the fact that most of the items are classified as basic knowledge and to the type of question (true/false), for which a random response is correct 50% of the time. The standard deviation is around 20%, except in some cases with a slightly higher value, between 22% and 28%, in the quizzes corresponding to the topics of thermodynamics and electrochemistry.
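The effect of the 50% random success rate of true/false items can be made concrete with a short calculation. The sketch below assumes a student who guesses each of the ten items independently, and a pass mark of 5 out of 10; the pass mark and the function name are assumptions of this example rather than values stated in the text.

```python
from math import comb


def prob_at_least(k_min, n_items=10, p=0.5):
    """Probability of at least k_min correct answers under pure guessing (binomial)."""
    return sum(
        comb(n_items, k) * p ** k * (1.0 - p) ** (n_items - k)
        for k in range(k_min, n_items + 1)
    )


expected_correct = 10 * 0.5    # 5 items out of 10 on average
p_pass = prob_at_least(5)      # 638/1024, about 0.62
p_high = prob_at_least(9)      # 11/1024, about 0.011
```

Under these assumptions a student who knows nothing would still reach 5 out of 10 in roughly six of every ten attempts, which is consistent with the weak discrimination observed for quizzes built only from true/false items.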


**Table 3.** Statistical data corresponding to the five thematic block quizzes (TBQ) in different academic years.

FI (Facility Index); SD (Standard Deviation); ICC (Internal consistency coefficient, Cronbach's alpha). Own elaboration based on this study.

The ICC of most quizzes in every academic year is higher than 65%, the minimum value proposed as an indicator of the overall homogeneity of a quiz [34]. However, in some cases values lower than 65% have been obtained, for example in BQ-3 and BQ-9 in the 2014–2015 academic year, including a significantly lower value of 27.80% in BQ-11 of the 2015–2016 academic year and even a negative value of −15.84% in BQ-3 of the 2016–2017 academic year. These results show the limitation of this parameter, which assumes that the quiz measures all the evaluated students with the same precision when in fact it depends on the level of each student and, ultimately, on the population used to calculate it.

Moreover, the dispersion of the FI and the discrimination index (DI) of each item in every quiz has been analyzed in order to assess how effectively the item distinguishes between students with different cognitive abilities (Figure S1, left). Most of the questions discriminate adequately, with a DI above 30%. A more detailed analysis of the discriminative efficiency (DE) of each item of the different BQs (Figure S2) shows that the effectiveness of the items depends on the academic year and, therefore, on the student population. For example, in the 2015–2016 academic year none of the items of BQ-1 reaches a DE of 30%, while in the 2018–2019 academic year they are all above 30%. The same behavior has been found for items of other quizzes. Therefore, the same item may or may not be discriminatory for a population of students depending on their level of knowledge, and questions with a low DI should not be discarded. It is concluded that quizzes made only of true/false items serve as continuous training activities in the teaching–learning process of a subject, but are not suitable as assessment activities because they do not discriminate between students.
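To make the item-level analysis concrete, the sketch below estimates a discrimination index as the correlation, in percent, between the score on one item and the score on the rest of the quiz. This is a common classical test theory definition and is used here as an assumption; the exact estimators behind Moodle's DI and DE values may differ, the function names are hypothetical, and the response matrix is invented for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation, or None when either variable has zero variance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


def discrimination_index(matrix, item):
    """Item versus rest-of-quiz correlation in percent (assumed definition)."""
    item_scores = [row[item] for row in matrix]
    rest_scores = [sum(row) - row[item] for row in matrix]
    r = pearson(item_scores, rest_scores)
    return None if r is None else 100.0 * r


# Invented data: 5 students x 4 items, 1 = correct, 0 = incorrect.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
di_first_item = discrimination_index(scores, 0)  # above the 30% threshold
```

An item that every student answers correctly has no variance, so its index is undefined and the function returns `None`. Because the index is a correlation over the students who took the quiz, the same item can give different values for different cohorts.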

Different results are obtained for the TBQs (Table 3). The average score drops significantly with respect to the BQs, ranging from 5.16 in TBQ-3 of the 2014–2015 academic year to 7.01 in TBQ-1 of the 2017–2018 academic year, and in some cases reaching values as low as 3.9 or 4.5, for example in TBQ-4 and TBQ-5 of the 2014–2015 academic year. Moreover, the FI drops significantly and ranges from 28% in TBQ-5 of the 2014–2015 academic year to 81% in TBQ-1 of the 2016–2017 academic year, and in no case does it reach 100% for any item of any quiz. The dispersion of the average scores oscillates around 20%, being slightly higher in the last three TBQs of certain academic years. The asymmetry of the score distribution, the so-called bias, is still negative, but now with a magnitude lower than 1; it even reaches slightly positive values, with an almost symmetric distribution and a bias close to zero, between 0.02 and 0.07, in the last three quizzes of the 2018–2019 academic year.

As a general trend, the bias in the first two quizzes, TBQ-1 and TBQ-2, is somewhat higher than in TBQ-3, TBQ-4 and TBQ-5, which have a bias close to zero. This indicates that the topics of the first two blocks are better assimilated than the topics of the thermodynamics, electrochemistry and kinetics blocks. This effect is probably due to the fact that the first topics have already been studied at the pre-university (bachelor) level, while the topics of the last blocks are entirely new, which demands a greater learning effort.

Therefore, these TBQs discriminate between students better than the BQs; that is, here the cognitive abilities of each student are tested. The FI-DI scatter diagrams are shown in Figure S1, right. Although a low DI value is obtained due to the random nature of the quiz, the detailed analysis of the FI-DI diagrams for all items of every TBQ (Figure S3) reveals that most of the items are discriminative, with a DI above 30% and a wide range of FI values. However, the ICC is always lower than the reference value of 65% [34]. This is because the items of the quiz are randomly selected by the Moodle platform, so each student sees different questions.
