2.2. A Hybrid Model for Evaluating the Learning Outcomes of Students (Pilots)
The stages of designing a hybrid model for evaluating the learning outcomes of students are presented as a sequence of fuzzy and expert mathematical models that together produce an output assessment. The first of these is a fuzzy model for evaluating the results of theoretical training of students.
For this model, a set of evaluation criteria is proposed, representing the disciplines of the theoretical cycle. Theoretical training disciplines are selected according to the accredited curriculum of a higher educational institution or the curriculum of an organization approved for flight training by a national aviation authority, a so-called approved training organization (ATO). Our study uses the disciplines of aviation education in the “PILOT” study program of the Technical University of Košice (TUKE), Slovakia. The evaluation criteria are divided into three groups (years of study at the bachelor’s level).
For each criterion, which is a theoretical discipline, the student receives, after mastering it, a corresponding percentage score and a linguistic expert assessment of the learning results. For example, the results achieved by a student while studying a subject are evaluated according to six classification levels:
“A”—excellent if 91–100%;
“B”—very good if 81–90%;
“C”—good if 71–80%;
“D”—acceptable if 61–70%;
“E”—sufficient if 51–60%;
“FX”—insufficient if 0–50%.
A student completes a subject and receives credit if their results have been graded from “A” to “E”.
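The classification above can be expressed as a small lookup. The sketch below is illustrative only: the function names `grade_for` and `earns_credit` are hypothetical, and the treatment of fractional scores that fall between two bands (for example 90.5, which is assigned to the lower band) is an assumption, since the scale is stated for whole percentages.

```python
def grade_for(percent: float) -> str:
    """Return the classification level for a percentage score in [0, 100]."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must lie in [0, 100]")
    # Lower bound of each band, checked from the top band downward.
    # Assumption: a fractional score between two bands gets the lower band.
    for lower_bound, grade in ((91, "A"), (81, "B"), (71, "C"), (61, "D"), (51, "E")):
        if percent >= lower_bound:
            return grade
    return "FX"


def earns_credit(percent: float) -> bool:
    """A subject is completed with credit for any grade from A to E."""
    return grade_for(percent) != "FX"
```

For instance, `grade_for(78)` gives `"C"`, and `earns_credit(50)` is `False` because 50% falls in the FX band.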
An expert on the didactic system of aviation education receives, for a set of students (pilots) P, the corresponding percentage scores for the mastered subjects of theoretical training. Below is a set of evaluation criteria for the theoretical disciplines of aviation education at the bachelor’s level in the “PILOT” study program.
—The first year of study:
—English 2;
—Physics 2;
—Aviation communication;
—Aviation legislation;
—Aviation meteorology 1;
—Aviation meteorology 2;
—Air navigation 1;
—Aviation rules 1;
—Mathematics 2;
—Physical Education 1;
—Physical Education 2;
—Fundamentals of computer science;
—Basics of flight 1.
—The second year of study:
—Economics;
—Airports and transport infrastructure;
—Organization of air traffic 1;
—Search and rescue service;
—Devices and systems 1;
—Avionics 1;
—Aircraft construction;
—Aeronautics 2;
—Air traffic organization 2;
—Semester project 1;
—Aviation engines.
—The third year of study:
—Flight planning and monitoring;
—Operational procedures in commercial air transport;
—Avionics 2;
—Flight technical characteristics;
—Fundamentals of flight 2;
—Colloquial exam;
—Air transport process;
—Defense of the final thesis;
—Final thesis;
—Weight and balance;
—Human capabilities and limitations.
A set of percentage scores for the mastered subjects of theoretical training is thus obtained for the set of pilot students according to the evaluation criteria. In general, the set of criteria is open and represents the official curriculum, and the number of criteria does not affect the calculation of the complex hybrid model.
A fuzzy model for evaluating the results of theoretical training of students is offered in the form of a step-by-step algorithm.
First step. Introduction of the “Importance of Discipline” for the level of pilot competence
To determine the level of competence for each cycle of theoretical training, the theory of fuzzy sets and fuzzy logic procedures are used, since each assessment on a 100-point scale is given, within a certain range, by subjects (teachers and trainers) and therefore has a vague, fuzzy character. In addition, an analysis of the assessment results of individual subjects shows that some teachers predominantly award grades in the range 80–100 to students of different levels of training, while others award grades in the range 60–90, and so on. All subjects of the accredited training programs are important, but each of them has a certain degree of influence on the formation of each student pilot competency. The objective reality is that assessment results depend on many factors. It is not feasible to study how different teachers teach the same subject to the same students. However, it is known from the theory of expert evaluation that different experts give their conclusions within the limits of their competencies and their individual psychophysiological characteristics. Therefore, to obtain the real level of the quality of student training, the following approach is proposed.
The point “Importance of Discipline” is considered, which represents the evaluation, across all evaluation criteria of the theoretical disciplines of aviation education, that would satisfy the decision maker (DM). “Importance of Discipline” reflects the fact that each theoretical subject has a different degree of influence on the formation of the competence of a pilot student.
The values of “Importance of Discipline” are chosen by the expert independently, by analyzing each theoretical discipline and the teacher who teaches it and choosing the optimal value. For example, some teachers award a maximum score of 85% in their subject. Conversely, in some subjects, the minimum grade is 75%. Taken at face value, such scores cannot reflect the objective reality of the quality of the graduate’s knowledge.
Second step. Calculation of estimates of the proximity of the student’s learning results to the “Importance of Discipline”
The approach to building the membership function is described as follows. A set of values is determined, which are relative estimates of the proximity of each percentage score for a mastered subject of theoretical training to the corresponding element of “Importance of Discipline”. The estimates are calculated using the lowest grade (not necessarily the minimum of the scale) and the highest grade (not necessarily the maximum of the scale) received by students in the corresponding subject.
The resulting matrix characterizes, column by column, the relative proximity of the training results of each student (pilot) to the “Importance of Discipline” T for each specific subject, and removes the issue of different evaluation scales. As a result, all estimates lie in the interval [0; 1].
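To make the second step concrete, the sketch below computes one such proximity estimate. The formula is an assumption, not the source's own expression: it takes the distance between the score and the “Importance of Discipline” value and divides it by the larger of the two distances from that value to the lowest and highest grades observed in the subject. The names `proximity`, `target`, `lowest`, and `highest` are hypothetical.

```python
def proximity(score: float, target: float, lowest: float, highest: float) -> float:
    """Relative closeness of `score` to `target`, scaled into [0, 1].

    ASSUMED formula (not taken from the source):
        1 - |target - score| / max(target - lowest, highest - target)
    `lowest` and `highest` are the lowest and highest grades actually
    awarded in the subject, not the ends of the 0-100 scale.
    """
    if not lowest <= target <= highest:
        raise ValueError("target must lie between lowest and highest")
    if not lowest <= score <= highest:
        raise ValueError("score must lie between lowest and highest")
    spread = max(target - lowest, highest - target)
    if spread == 0:
        # Every student received the same grade, equal to the target.
        return 1.0
    return 1.0 - abs(target - score) / spread
```

Under this assumption, a teacher whose grades range from 60 to 90 and a target of 85 give `proximity(85, 85, 60, 90) == 1.0` and `proximity(60, 85, 60, 90) == 0.0`, so subjects graded on different effective scales become comparable.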
Third step. Fuzzification of input hybrid data
The term set of linguistic expert evaluations of learning outcomes is defined on a percentage scale such that the larger the value, the higher the level of the criterion. In line with the classification above, the following division into intervals is proposed: [0; 50], [51; 60], [61; 70], [71; 80], [81; 90], and [91; 100].
The dependence of the linguistic expert evaluations of the students’ training results on the criterion for evaluating theoretical subjects and on the quantitative evaluation of the “Importance of Discipline” is established by means of intelligent knowledge analysis and membership functions. From a formal point of view, there is an uncertainty of the “high level” type, which is described by a membership function of the kind “the value of x is large”. It is natural to express this with a quadratic S-spline membership function:
Here, the parameters of the spline are the values of the ends of the intervals, which depend on the linguistic variable T. The larger the resulting value, the higher the level of the criterion. In addition, this value represents the disclosure of the uncertainty of the educational achievement of the student (pilot) in the relevant subject.
Thus, a transition was made from the linguistic evaluations and the estimates of the closeness of the learning outcomes of the student (pilot) to the “Importance of Discipline” to a single normalized evaluation.
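For reference, the standard quadratic S-spline used for “value is large” uncertainty can be sketched as follows. This is the textbook form of the function; how the source combines the interval ends of the linguistic term with the proximity estimate to form its argument is not reproduced here, so the parameter names `a` and `b` (the points where membership reaches 0 and 1) are assumptions.

```python
def s_spline(x: float, a: float, b: float) -> float:
    """Quadratic S-spline: 0 up to `a`, 1 from `b`, two parabolic arcs between."""
    if a >= b:
        raise ValueError("a must be smaller than b")
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    midpoint = (a + b) / 2
    if x <= midpoint:
        # Convex arc rising from 0 to 0.5.
        return 2 * ((x - a) / (b - a)) ** 2
    # Concave arc rising from 0.5 to 1.
    return 1 - 2 * ((b - x) / (b - a)) ** 2
```

The function is monotonically increasing and equals 0.5 at the midpoint of the interval, so larger inputs always give a higher membership value.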
Fourth step. Considering the importance of theoretical disciplines in the acquisition of relevant aviation education competencies
Let the DM assign an importance coefficient to each discipline from the interval [1; 10]. If the DM does not need this, the criteria are considered equally important. For data comparison, normalized weighting factors are determined, subject to the condition that they sum to one.
It is noted that the components of the weight vector can be selected in other ways, depending on the specific situation.
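A minimal sketch of the normalization in the fourth step is given below, assuming the usual approach of dividing each importance coefficient by the sum of all coefficients. The function name `normalize_weights` is hypothetical.

```python
def normalize_weights(importances: list[float]) -> list[float]:
    """Scale importance coefficients from [1, 10] so that they sum to one."""
    if not importances:
        raise ValueError("at least one importance coefficient is required")
    if any(not 1 <= p <= 10 for p in importances):
        raise ValueError("each coefficient must lie in [1, 10]")
    total = sum(importances)
    # Assumed normalization: each coefficient divided by the total.
    return [p / total for p in importances]
```

When all coefficients are equal, every discipline receives the same weight, which corresponds to the case where the DM treats the criteria as equally important.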
Fifth step. Defuzzification of the data
An aggregate risk assessment of aviation education results is calculated. For this, it is proposed to use one of four convolutions, chosen according to the considerations of the DM regarding the risk of aviation education results: pessimistic, cautious, average, or optimistic.
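The four attitudes can be illustrated with the family of weighted means. The choice below is an assumption rather than the source's formulas: the weighted harmonic, geometric, arithmetic, and quadratic means are a common way to order aggregates from pessimistic to optimistic, because for the same inputs each is no larger than the next. The function name `convolutions` is hypothetical.

```python
import math


def convolutions(values: list[float], weights: list[float]) -> dict[str, float]:
    """Four weighted aggregates of membership values in (0, 1].

    ASSUMPTION: the mapping of attitudes to means is illustrative.
    `weights` are expected to be normalized (to sum to one).
    """
    if len(values) != len(weights) or not values:
        raise ValueError("values and weights must be non-empty and equal in length")
    if any(v <= 0 for v in values):
        raise ValueError("values must be positive for the harmonic mean")
    pairs = list(zip(values, weights))
    return {
        "pessimistic": 1.0 / sum(w / v for v, w in pairs),      # harmonic mean
        "cautious": math.prod(v ** w for v, w in pairs),        # geometric mean
        "average": sum(w * v for v, w in pairs),                # arithmetic mean
        "optimistic": math.sqrt(sum(w * v * v for v, w in pairs)),  # quadratic mean
    }
```

With this choice the pessimistic aggregate is dominated by the weakest discipline, while the optimistic one gives more credit to the strongest results.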
Note that steps 4–5 are given for the calculation of the entire cycle of theoretical training. When intermediate values are calculated, for example for the 1st or 2nd year of study, the index j runs over the disciplines of the corresponding year of study.
To derive the linguistic level of risk within the framework of the theoretical mastery of subjects, the following linguistic conclusions are proposed: “insignificant risk of aviation education results”; “low risk of aviation education results”; “average risk of aviation education results”; “high risk of aviation education results”; and “critical risk of aviation education results”.
As a result of the verification of the fuzzy model for evaluating the results of the theoretical training of students on real data (the didactic system of aviation education), the score intervals used for matching the ratings with the linguistic conclusions were established as follows: [0.5025; 0.6]; (0.6; 0.7]; (0.7; 0.8]; (0.8; 0.9]; and (0.9; 1].
—Expert model for evaluating the results of student pilot training on flight simulators
For this model, the evaluation criterion “Training on the simulator 3” is proposed. Without loss of generality, this discipline is taught at TUKE (Slovakia) in the “PILOT” study program, with one credit, to pilot students in the winter semester of the 3rd year of study. Similar to the theoretical training, the results achieved by the student in the subject “Training on the simulator 3” are evaluated according to six classification levels defined on a percentage scale: [0; 50], [51; 60], [61; 70], [71; 80], [81; 90], and [91; 100]. The obtained estimate is recorded for each student pilot. In addition, let the training instructor make their judgment about the risks to the student pilot training results in the flight simulator. For such a conclusion, we introduce a linguistic variable with five values: insignificant risk of training results on the flight simulator; low risk of training results on the flight simulator; average risk of training results on the flight simulator; high risk of training results on the flight simulator; and critical risk of training results on the flight simulator.
First, let us fuzzify the learning results given on the percentage scale. For this purpose, it is proposed to use intelligent knowledge analysis with the help of membership functions. For example, it is natural to use a quadratic S-spline:
Thus, we will obtain the normalized output estimates from the interval [0; 1] for student pilots.
Next, the normalized baseline score and the training instructor’s reasoning are aggregated using the following membership function:
where k is the risk threshold of training results on the simulator, the value of which varies depending on the expert opinion of the training instructor. This threshold can be obtained by training on real result data. For example, a separate value of k can be set experimentally for each of the five expert opinions.
Thus, we obtain aggregated normalized estimates from the interval [0; 1] for the evaluation of student pilot training results on flight simulators.
The presented research is universal and is not limited only to studies in higher educational institutions. In this regard, if, when evaluating the training results of students, the training instructor does not have the opportunity to express their considerations regarding the risks of the training results, then the calculation according to Formula (10) is skipped, and the normalized estimate obtained in the previous step is taken for further calculations.
—Expert model for evaluating the results of practical flight training of students
Evaluation of the results of practical flight training of students is carried out by flight training instructors. To obtain a private pilot license (PPL) (min. 45 flight hours), a student needs to acquire skills in two stages: aircraft piloting technique (learning to fly an aircraft, exercises PSA/01 to PSA/17, where PSA stands for piloting a single-engine aircraft) and aircraft navigation control (navigational flights on selected routes, exercises PSA/18 to PSA/24).
Thus, for the expert model of evaluating the results of practical flight training of students, we will have two groups of criteria, one for each stage.
Both stages of the training are performed as flights with an instructor (DUAL) or as separate flights without an instructor (SOLO).
The evaluation of the performed exercises can be “completed” or “not completed”. Another rating scale, used in the “PILOT” study program of TUKE (Slovakia), has five levels of flight training evaluation: “1” excellent (A in the academic scale, 91–100%); “2” very good (B, 81–90%); “3” good (C, 71–80%); “4” acceptable (D, 61–70%); “5” insufficient (FX).
Based on these marks for all of the exercises, the student pilot’s overall flight training score is determined, which is reported by the approved training organization (ATO) to the National Aviation Authority for the purpose of obtaining a pilot license.
Formally, the expert model for evaluating the results of practical flight training of students will be presented as follows.
Let us denote the grades received by the student (pilot) on PSA exercises separately for flights with the instructor (DUAL) and for individual flights without an instructor (SOLO), within each of the two stages, where all the assessments take values on the five-level scale from “1” to “5”.
In the first stage, we will calculate the average values for the exercises within the selected sets:
In the second stage, the overall score in the recommendations of the ATO is calculated:
In the final stage, to compare the data, it is proposed to model the uncertainty with a membership function using a quadratic Z-spline:
The resulting aggregated normalized assessment of the results of practical flight training of students has the following meaning: when the value of the assessment approaches 1, then the student has acquired the best skills in the stages of aircraft piloting technique and aircraft navigation control.
Thus, at the output of the expert model for evaluating the results of practical flight training, we have normalized and comparable ratings for the student pilots.
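The final normalization stage can be sketched with the standard quadratic Z-spline, which is the mirror image of the S-spline and therefore maps the best grade “1” to a value near 1. The sketch makes two assumptions that are not taken from the source: the spline is anchored at the ends of the five-level scale (1 and 5), and the input is a plain mean of exercise grades. The source's formula for the overall ATO score is not reproduced. The names `z_spline` and `normalized_flight_score` are hypothetical.

```python
from statistics import mean


def z_spline(x: float, a: float, b: float) -> float:
    """Quadratic Z-spline: 1 up to `a`, 0 from `b`, two parabolic arcs between."""
    if a >= b:
        raise ValueError("a must be smaller than b")
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    midpoint = (a + b) / 2
    if x <= midpoint:
        # Concave arc falling from 1 to 0.5.
        return 1 - 2 * ((x - a) / (b - a)) ** 2
    # Convex arc falling from 0.5 to 0.
    return 2 * ((b - x) / (b - a)) ** 2


def normalized_flight_score(grades: list[int]) -> float:
    """Average grades on the 1 (best) to 5 (worst) scale and map them to [0, 1].

    ASSUMPTION: the anchors 1 and 5 and the unweighted mean are illustrative.
    """
    if any(g not in (1, 2, 3, 4, 5) for g in grades):
        raise ValueError("grades must be integers from 1 to 5")
    return z_spline(mean(grades), 1, 5)
```

A student graded “1” on every exercise obtains 1.0, and an average grade of 3 maps to 0.5 under these assumed anchors.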
—Expert model for evaluating the competence of practical flight training of students by means of observed behavior
A set of criteria for evaluating the competence of practical flight training of students by means of observed behavior is proposed, which is divided into nine groups. The assessment criteria in each group G are presented in the form of a question to describe the competence. Indicators for “observed behavior” are taken from the officially published document Competency Assessment and Evaluation for Pilots, Instructors and Evaluators: Guidance Material, published by the International Air Transport Association (IATA) [26].
The document is based on the idea of an adapted competency model: a group of competencies, with their associated descriptions and performance criteria, adapted from an ICAO (International Civil Aviation Organization) competency framework, which the approved training organization (ATO) or air operator certificate (AOC) holder (operator) uses to develop competency-based training and assessment for pilots and instructor-evaluators.
Some fragments of the indicators for evaluating the competence of practical flight training of students by means of observed behavior are given below. All other indicators are given in [1].
Group: application of knowledge, that is, demonstrating knowledge and understanding of relevant information, operating instructions, aircraft systems, and environment. This group consists of seven criteria. For example: the student knows where to get the necessary information.
Group: application of procedures and compliance with rules, determined in accordance with official operating instructions and relevant regulations. This group also consists of seven criteria. For example: the student applies appropriate operational instructions, procedures, and methods promptly.
Group: communication. The student communicates using appropriate means in the work environment, in both normal and non-normal situations. This group consists of ten criteria. For example: the student appropriately chooses what, when, how, and with whom to communicate.
Group: aircraft flight path control and automation. The student controls the flight path using automation. This group consists of six criteria. For example: the student safely controls the flight path to achieve optimal performance.
Group: control of the flight path of the aircraft with manual control. This group consists of seven criteria. For example: the student maintains the planned flight path during manual flight while managing other tasks and distractions.
Group: leadership and teamwork. The student influences others to achieve a common goal and collaborates to accomplish team goals. This group consists of eleven criteria. For example: the student pilot gives and receives constructive feedback.
Group: problem-solving and decision-making. This group consists of nine criteria. For example: the student pilot uses appropriate and timely decision-making techniques.
Group: perception, awareness, and management of information to predict its impact on work. This group consists of seven criteria. For example: the student checks the accuracy of the information for errors.
Group: maintaining an available workload by prioritizing and distributing tasks using appropriate resources. This group consists of nine criteria. For example: the student effectively plans, prioritizes, and schedules appropriate tasks.
To present an expert model for assessing the competence of practical flight training of students through observed behavior, we propose the following approach. The idea is that, in some cases, the results of the assessment of competencies and of the management of threats and errors may not be relevant to the learning objectives of the session. In this case, the flight instructor must evaluate the “observable behaviors” associated with each competency using two values: the number of “observable behaviors” demonstrated by the corresponding student pilot when they were required, and the frequency with which the student (pilot) demonstrated the “observed behavior” when it was required.
The following linguistic variables are proposed for the flight training instructor’s assessment of the quantity and frequency of “observed behavior” [1]: {few, hardly any; some; many; most; all, almost all} for quantity and {rarely; occasionally; regularly; very often; always, almost always} for frequency.
Next, it is necessary to associate the results of the evaluation of the quantity and frequency linguistic variables with a certain scale. For this, the following characteristic functions are considered, respectively.
The purpose of this defined normalized numerical scale is to enable further comparison and calculation.
Furthermore, to aggregate the quantity and frequency values within each criterion (h is the number of criteria), intelligent knowledge analysis is similarly used, modeling uncertainty of the “average value” type on the basis of multidimensional membership functions. For example, such modeling can be based on a cone-shaped membership function in which the center of the base of the cone is a unit vector and the scaling is based on the coordinates of a vector:
In this way, an aggregated value is obtained for each criterion. Next, we use the weighted average to obtain one rating for each student:
From a mathematical point of view, the obtained output estimates lie in the interval [0.434; 1], which follows from the choice of the base of the cone and its scaling.
To comply with the relevant standards, the obtained value is matched to one of the following linguistic conclusions about competence: “exemplary manner”; “effectively”; “adequately”; “minimal acceptable”; “ineffectively”.
According to industry best practice, the ATO policy should define a prescribed standard for each pilot and a minimum acceptable standard [1]. As a result of the verification of the expert model for assessing the competence of practical flight training of students using observed behavior on real data (the didactic system of aviation education) and the above industry practices, the score intervals used for matching the scores with the linguistic conclusions are as follows: [0.434; 0.58]; (0.58; 0.64]; (0.64; 0.78]; (0.78; 0.86]; (0.86; 1].
At the same time, depending on the level the student receives, one of three outcomes follows: remedial training is required; attention is needed and remedial training is recommended; or corrective training is not required.
—Model for aggregating raw data for deriving a general assessment of the quality of individual training of a student (pilot) within the framework of aviation education
At the input of the model for aggregating the output data to derive a general assessment of the quality of individual training of a student (pilot) within the framework of aviation education, we have the values obtained for students (pilots) from the above models, namely: the aggregate risk assessment of the results of aviation education within the framework of theoretical mastery of subjects; the aggregated normalized score for evaluating the results of student pilot training on flight simulators; the aggregated normalized evaluation of the results of practical flight training of students; and the output assessment of the competence of practical flight training of students using observed behavior. All input data are normalized and comparable.
The following approach is proposed to obtain the output estimate.
In the first stage, let the DM set a weighting coefficient from the interval [1; 10] for each model of evaluating the learning outcomes of pilot students. Normalized weighting factors are determined for data comparison:
After that, one quantitative overall assessment of the quality of individual training of a student (pilot) in the framework of aviation education is calculated, separately for each student pilot, using a weighted average convolution:
We note that if the DM does not need to distinguish the importance of the assessment models, then the weighting factors are equal, and Formula (19) reduces to the arithmetic mean.
The following term set of linguistic values is proposed to derive the qualitative level of training of the pilot: “high level of individual training of a pilot”; “the level of individual training of a pilot is above average”; “average level of individual training of a pilot”; “low level of individual training of the pilot”; “unacceptable level of individual training of a pilot”. As a result of verification on real data, using the didactic system of aviation education, the score intervals used for matching the y score with the linguistic levels were established as follows: [0.5; 0.6]; (0.6; 0.7]; (0.7; 0.8]; (0.8; 0.9]; (0.9; 1].
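Locating the interval that contains a given y score is a simple search over the right-closed interval bounds listed above. The sketch returns only the position of the interval, counted from the lowest, and deliberately does not attach a linguistic label to it. Rejecting scores below 0.5 is an assumption, since the text does not say how such scores are handled. The name `level_index` is hypothetical.

```python
from bisect import bisect_left

LOWER_LIMIT = 0.5
UPPER_BOUNDS = [0.6, 0.7, 0.8, 0.9, 1.0]  # right-closed upper ends


def level_index(y: float) -> int:
    """Return 0 for [0.5; 0.6], 1 for (0.6; 0.7], ..., 4 for (0.9; 1]."""
    if not LOWER_LIMIT <= y <= UPPER_BOUNDS[-1]:
        # ASSUMPTION: scores outside [0.5, 1] are treated as invalid input.
        raise ValueError("y must lie in [0.5, 1]")
    # bisect_left finds the first upper bound that is >= y, which matches
    # intervals that are closed on the right.
    return bisect_left(UPPER_BOUNDS, y)
```

A score exactly on a boundary, such as 0.6, therefore belongs to the lower interval, in line with the bracket notation used in the text.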
DM decision levels can always be changed without violating the minimum requirements of the approved training standards under the supervision of the national aviation authority.
Thus, the vagueness of input expert evaluations is revealed, thereby improving the effectiveness of the model, which can derive a quantitative overall assessment of the quality of individual training of a student. All this makes it possible to increase the degree of validity of making further management decisions regarding the possibility of improving the quality of pilot training, and individual study counseling. In addition, the level of management of the educational process in the pilot training system increases, which entails a reduction in the risks of training results.
Another important aspect is that the initial grades are stored in the database of the didactic system of aviation education. Once a sufficient number of them has been collected, the settings of the model parameters can be improved by applying neuro-fuzzy networks and training them. The hybrid model is modular: its components can be replaced by other models, or only some of them can be involved in the evaluation process. Therefore, the presented hybrid model for evaluating learning outcomes can be easily developed and adapted for other students, for example, doctors, military personnel, ship captains, and others for whom the acquisition of practical skills is an important component of training.