1. Introduction
The concept of organizational sustainability is currently a focus of interest because management processes and organizational domains have a considerable impact on the social environment and on workers. In this sense, organizational sustainability is a broad construct that incorporates environmental, social, and ethical dimensions [
1,
2] and, specifically, it refers to how organizational and human management practices affect employee health and sustainable performance [
3,
4,
5].
With this in mind, a fundamental requirement for the sustainability of organizations is the accuracy of the selection and assessment processes, in order to hire employees with the knowledge, skills, abilities, personality, and other relevant competencies for doing the job. Therefore, organizations must develop selection and assessment processes that allow for the hiring of employees with competencies that improve organizational performance and productivity, increase job satisfaction and fairness, and reduce adverse occupational outcomes such as turnover or absenteeism [
6].
An important issue related to personnel selection processes is the examination of the validity of the instruments used to predict performance criteria. An extensive number of studies have examined the capacity of personality measures to predict several occupational and academic outcomes since they are a widely used assessment procedure in organizational and educational settings [
7,
8,
9,
10,
11,
12,
13,
14,
15,
16,
17,
18,
19,
20,
21,
22]. However, most of this research has been carried out using traditional single-stimulus (SS) personality measures, which are more susceptible to the potential adverse effects of faking (response distortion) [
23,
24,
25,
26].
Empirical evidence has shown that faking is a type of response bias that can negatively impact hiring decisions in selection processes. Therefore, the control of this phenomenon is crucial for personnel selection and for creating a sustainable organization. Forced-choice (FC) personality inventories could be an alternative to single-stimulus (SS) measures. However, to date, no study has examined the predictive validity of personality variables using a quasi-ipsative FC inventory under faking response conditions. Hence, the aims of this article were: (1) to estimate the predictive validity of a quasi-ipsative FC personality inventory under faking conditions; (2) to examine whether the performance measurement method used is a moderator of predictive validity; and (3) to analyze whether the effects of faking occur independently of the performance measurement method used.
1.1. Personality Variables
Personality inventories are tools which are widely used to evaluate candidates in recruitment processes [
14,
27,
28,
29,
30]. For this reason, and especially after the consolidation of the five-factor model (FFM), multiple researchers have examined the validity of personality measures based on this model as predictors of different occupational and academic criteria. Empirical evidence has shown that the Big Five factors are suitable predictors of several job performance outcomes. Specifically, meta-analytic research has shown that conscientiousness is the best predictor of performance criteria, including general job performance, job satisfaction, counterproductive behavior, contextual performance, and training success, and that its validity generalizes across occupations and criteria. Emotional stability also showed generalized validity across criteria and occupations, but the size of its validity is smaller than that of conscientiousness [
13,
15,
16,
18,
19,
20,
21,
31]. The remaining three factors have been successfully related to performance criteria in specific occupations. Extraversion is a predictor of performance in jobs that require social interaction, such as managerial, commercial, and police occupations, and for the performance of training and teamwork. Agreeableness showed a generalization of validity for occupations oriented to cooperation and to helping others, that is, occupations related to customer service and health occupations, and for the performance of teamwork. Finally, openness to experience is a relevant predictor of training performance and of performance in jobs requiring high levels of creativity [
12,
13,
15,
18,
32]. Regarding the academic context, conscientiousness has proven to be the strongest predictor of relevant criteria, such as grade point average (GPA) or academic dishonesty, among others. Likewise, academic performance, assessed using GPA, also correlated significantly with openness to experience and agreeableness [
1,
10,
11,
17,
33,
34].
Therefore, if the first step to achieving an efficient and sustainable organization is to design personnel selection processes that allow the incorporation of potential high-performance employees, the inclusion of personality measures based on the Big Five in the selection process seems an essential requirement. Organizations need highly innovative and productive workers to survive and be sustainable [
6] and Big Five measurements can help identify those workers since, as we just noted, these measures predict organizational criteria related to individual performance and behavior at work.
1.2. Personality and Faking Behavior
Even though the evidence has supported the predictive validity of the Big Five, the use of personality inventories in personnel selection continues to receive criticism because of their potential sensitivity to faking [
35,
36]. Typically, the Big Five have been measured using SS personality inventories. In these inventories, the individual must rate each statement separately from other statements, indicating the extent to which the statement content describes their personality. Usually, this type of test presents a yes/no, true/false, or Likert scale answer format. For this reason, some authors have noted that this format shows a potential susceptibility to answer distortion [
37,
38,
39,
40,
41]. Specifically, the meta-analytical findings of Birkeland et al. [
23], Salgado [
42], and Viswesvaran and Ones [
26] have indicated that SS personality inventories can be deliberately distorted by individuals if they are motivated to fake.
Faking behavior has been defined as a tendency of individuals to respond in a manner that will offer a portrayal of themselves that favors their evaluation process [
36,
40,
43,
44]. Therefore, faking is an intentional distortion of the response to the selection instruments, especially to the personality inventories [
23,
42,
45,
46]. Consequently, this phenomenon is a serious problem in applied settings when important hiring decisions are made using SS personality measures.
Regarding the adverse effects that faking can produce on personality measures, a great deal of research has shown that faking affects the psychometric properties of SS personality inventories. First, the meta-analyses of Birkeland et al. [
23], Hooper [
47], Salgado [
42], and Viswesvaran and Ones [
26] pointed out that faking increases the scores on SS personality inventories and reduces the magnitudes of the standard deviations. This effect was found to a greater extent in the conscientiousness and emotional stability factors in all cases examined. Second, faking attenuates reliability. Research on this issue has found that when faking occurs, the degree of error in the measure increases. Therefore, the scores obtained using SS personality questionnaires under faking conditions are less reliable [
42,
48,
49,
50]. Third, as the scores are less reliable when individuals fake, the predictive validity of the SS personality measures is also attenuated [
42]. Finally, empirical evidence has suggested that the construct validity (factor structure) of SS questionnaires could also be affected by faking, producing additional factors or decreasing the number of them [
42,
50,
51].
The psychometric theory of faking effects was proposed by Salgado [
42] as a theoretical framework to explain the impact of faking. According to this approach, faking is a source of error variance that produces an artificial homogenization of the samples, causing an increase in the scores and reducing the magnitudes of the standard deviations. These two simultaneous effects produce a reduction in the range of the scores obtained by the individuals. Moreover, this last artifactual effect of faking causes a decrease in reliability and predictive validity and a modification of the factor structure of personality instruments.
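The range-restriction argument can be illustrated with a short sketch. It assumes the classical formula for direct range restriction on the predictor (Thorndike's Case II), which the theory as summarized above does not name explicitly. The function name `restricted_validity` and all numeric values are hypothetical and are not results of this study.

```python
import math


def restricted_validity(rho: float, u: float) -> float:
    """Correlation expected after direct range restriction on the predictor.

    rho: predictor-criterion correlation in the unrestricted group.
    u:   ratio of restricted to unrestricted predictor SD (0 < u <= 1).
    Assumption: classical direct range restriction (Thorndike Case II).
    """
    if not 0 < u <= 1:
        raise ValueError("u must be in (0, 1]")
    return (u * rho) / math.sqrt(1 - rho**2 + (u**2) * (rho**2))


# Hypothetical values: a validity of 0.30 and progressively smaller SD ratios.
for u in (1.0, 0.9, 0.7, 0.5):
    print(f"u = {u:.1f} -> expected r = {restricted_validity(0.30, u):.3f}")
```

Under this assumption, the smaller the standard deviation of the faked scores relative to the honest scores, the lower the correlation that can be observed with the criterion.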
Consequently, if faking affects the magnitude of the scores, the reliability, and validity of the SS measures, it will have a direct impact on the accuracy of the decision-making processes so that individuals who better distort their answers get higher positions in the selection ranking [
42,
52]. Hence, alternatives to SS personality inventories should be considered to reduce the impact of faking on the selection processes. In this sense, forced-choice (FC) inventories are instruments for assessing personality that better control the effects of faking [
25,
53,
54,
55,
56].
1.3. Forced-Choice Inventories and Control of the Effects of Faking
FC personality inventories are characterized by the fact that individuals must choose between several alternatives that have the same degree of social desirability. Usually, the options are presented grouped in pairs, triads, or tetrads. The individuals must choose the alternative that describes them best and, in some cases, the alternative that least describes them. As the options are similar in their level of social desirability, it will be more difficult for the participants to distort their responses. Therefore, the use of FC personality inventories reduces the effects of faking [
37,
57,
58,
59,
60,
61,
62,
63].
FC personality inventories can provide three types of scores, depending on how the answer is chosen (normative, ipsative, and quasi-ipsative scores), each with specific psychometric characteristics [
19,
22,
64,
65,
66]. The normative FC measures are characterized by the presentation of only unidimensional items; that is, each item evaluates just one personality factor. Therefore, the normative scores allow inter-individual comparisons on each personality factor assessed, that is, the scores of an individual are statistically dependent on other individuals in the population and independent of other scores of the assessed individual. An example of a normative FC item would be: "Check the answer that best indicates how you behave. In social meetings, usually: (a) other people introduce you; (b) you introduce yourself to others." The Myers–Briggs Type Indicator (MBTI) [
67] would be an example of a normative FC personality test that is widely used.
In the case of ipsative FC scores, the score for each dimension depends on the individual’s scores on the other rated dimensions. Consequently, the sum of the scores obtained for each individual is a constant. Ipsative scores allow us to compare an individual across different personality factors (intra-individual comparisons), that is, the results are dependent at the individual level but independent of the scores of other subjects in the test. Therefore, it should be noted that they only show the relative importance of each factor. It is precisely for this reason that the use of this type of measure is not recommended in contexts in which it is necessary to compare or rank all participants, such as selection processes, because the information provided by this measure would be biased [
68]. In this category, we can find several personality tests widely applied in professional practice, such as the Occupational Personality Questionnaire (OPQ) [
69], Edwards Personal Preferences Schedule (EPPS) [
70], or the Description en Cinq Dimensions (D5D) [
71].
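The constant-sum property of ipsative scores can be shown with a minimal sketch. The triads, the 2/1/0 scoring key, and the function name `ipsative_scores` are hypothetical and do not reproduce the scoring of any of the inventories cited above.

```python
from collections import Counter

# Hypothetical triads: each option belongs to a different scale.
BLOCKS = [
    ("ES", "EX", "OE"),
    ("A", "C", "ES"),
    ("EX", "OE", "A"),
    ("C", "ES", "EX"),
]


def ipsative_scores(rankings):
    """Score fully ranked triads: 2 points for the top rank, then 1, then 0.

    rankings: one tuple per block, listing option positions (0-2) from
    most to least descriptive, e.g. (2, 0, 1).
    """
    scores = Counter()
    for block, order in zip(BLOCKS, rankings):
        if sorted(order) != [0, 1, 2]:
            raise ValueError("each block needs a complete ranking")
        for points, position in zip((2, 1, 0), order):
            scores[block[position]] += points
    return dict(scores)


respondent_a = ipsative_scores([(0, 1, 2), (1, 0, 2), (2, 1, 0), (0, 2, 1)])
respondent_b = ipsative_scores([(2, 1, 0), (2, 0, 1), (0, 1, 2), (1, 2, 0)])

# Every respondent distributes 3 points per block, so the totals are identical.
print(sum(respondent_a.values()), sum(respondent_b.values()))  # prints "12 12"
```

Because both respondents necessarily end with the same total, a high score on one scale can only be obtained at the expense of another scale, which is why such scores support intra-individual but not inter-individual comparisons.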
Finally, the quasi-ipsative FC scores include those measures that do not meet all the criteria to be ipsative but present some characteristics associated with them [
64]. Specifically, a score is quasi-ipsative when some of the following conditions apply [
65,
66,
72]: (1) individuals only partially order the alternatives; (2) scales have different numbers of items; (3) not all of the items ranked by the respondents are scored; (4) scales are scored differently for differing respondent characteristics; (5) items differ in how they are weighted; (6) some ipsative scales are deleted when data are analyzed; and (7) the inventory includes normative sections. Likewise, Hicks [
65] and Meade [
66] indicate that quasi-ipsative scores are defined by the following conditions: (a) the results for each factor vary between individuals over a certain range of scores, (b) the scores do not add up to the same constant for all people, and (c) increasing the score in one factor does not necessarily produce a decrease in the score in other factors. An example of an item from this type of FC inventory would be the following: In each item, mark the phrase that best describes you and the phrase that least describes you. “I am a person (a) who is open-minded; (b) who is a perfectionist; (c) who does not usually lose their temper.” Tests such as the Gordon Personal Profile-Inventory (GPPI-I) [
73], the IPIP-MFC [
68], or the more recent QI5F-Tri by Salgado [
74] are examples of quasi-ipsative FC inventories.
Furthermore, two types of quasi-ipsative FC scores can be distinguished: (a) algebraically dependent quasi-ipsative FC, when a metric dependence exists between the scores and, therefore, there is some degree of ipsativization of scores; and (b) non-algebraically dependent quasi-ipsative FC, when the score for each personality factor is not influenced by the score in other personality variables [
72].
In summary, quasi-ipsative questionnaires share properties with normative and pure ipsative measures. This uniqueness means that the scores obtained with this forced-choice format can be analyzed at the intra- and inter-individual level [
62]. In other words, the scores allow us to know the individual differences of each subject and at the same time provide us information about his/her differences with respect to a reference group, which makes these measures more appropriate than the ipsative ones for research from a statistical point of view.
1.4. Predictive Validity of Quasi-Ipsative FC Inventories
Empirical evidence has shown that FC inventories are a suitable predictor of occupational and academic performance criteria under honest response conditions. The meta-analyses of Salgado [
62], Salgado et al. [
19], and Salgado and Táuriz [
22] have shown that FC personality measures are a more valid instrument for predicting occupational and academic performance than SS personality measures and that the quasi-ipsative FC inventories are better predictors of performance than the ipsative or normative FC inventories [
19,
22,
75]. More specifically, they found that conscientiousness, evaluated with quasi-ipsative FC inventories, is the best predictor of occupational and academic performance. These findings have recently been replicated by Fisher et al. [
76], in a small-scale meta-analysis, and by Lee et al. [
77]. They also found that quasi-ipsative FC personality measures are the most valid personality assessment instrument for predicting performance. However, to the best of our knowledge, none of the studies published to date have examined the predictive validity of quasi-ipsative FC personality measures under faking response conditions. Hence, this study aims to examine the effects of faking on the predictive validity of a quasi-ipsative FC inventory.
As we mentioned previously, according to the psychometric theory of the effects of faking [
42], the consequences of faking would be twofold: (a) an increase in the mean score; and (b) a reduction in the variance (standard deviation). These effects would, in turn, lead to a decrease in reliability and a decrease in predictive validity. Nevertheless, recent empirical findings have shown the robustness of the quasi-ipsative FC personality inventories (without algebraic dependency) to the effects of faking: (a) the faking effects were considerably less critical to the average score of the evaluated groups [
59]; and (b) the quasi-ipsative FC inventories showed a high degree of measurement invariance (that is, construct validity), under both honest and faking response instructions [
25]. Therefore, considering these results, the negative effects of faking on the predictive validity of quasi-ipsative FC inventories can be expected to be minor or non-existent.
1.5. Aims of the Study and Research Hypotheses
This study aims to examine the predictive validity of a quasi-ipsative FC inventory for predicting academic performance assessed through self-report performance ratings and academic grade point average under honest and faking response conditions. Considering that conscientiousness is the best predictor of occupational and academic performance, as the meta-analyses cited above have found, the following hypothesis is proposed:
Hypothesis 1 (H1): Conscientiousness measured with a quasi-ipsative FC inventory predicts academic performance (grade point average and self-reported performance ratings) under honest and faking response instructions.
On the other hand, previous research, including meta-analyses, has not examined whether personality factors, particularly conscientiousness, predict academic performance in the same way when it is evaluated through performance rating scales or through academic grades. The results produced in other areas of research, for example, in cognitive abilities [
78], have shown that the predictive validity coefficients are different depending on whether performance is evaluated using performance ratings or results data (e.g., production, sales). In the case of cognitive abilities, the validity is higher when rating scales are used to assess performance. Furthermore, meta-analytical research has found that the relationship between data-based performance measures (objective criteria) and rating scale performance measures (subjective criteria) is of a moderate magnitude (around 0.40). In this regard, McDaniel et al. [
79] suggested that distinguishing between types of criterion measures is relevant in predictive validity studies, because the size of the validity usually varies according to the type of criterion and, therefore, separate analyses should be performed to identify potential differences between the criteria. Consequently, considering that the type of performance measure can be a moderating variable of predictive validity, the following hypothesis is proposed:
Hypothesis 2 (H2): The predictive validity of conscientiousness evaluated with a quasi-ipsative FC inventory is higher when performance is evaluated through self-reported ratings than when the grade point average is used.
Regarding the effects of faking, the findings mentioned above show that this phenomenon has a direct impact on the validity of personality measures. Hence, we posit the following hypothesis:
Hypothesis 3 (H3): The effects of faking on the predictive validity of conscientiousness evaluated with a quasi-ipsative FC inventory are independent of the way in which performance is evaluated (grade point average vs. self-reported ratings).
One last goal is to test two predictions derived from the psychometric theory of the effects of faking [
42] because, to date, no study has investigated the predictions of this theory in quasi-ipsative FC inventories. This theory maintains that faking produces a decrease in the reliability of personality measures. Therefore, we propose Hypothesis 4:
Hypothesis 4 (H4): Faking produces a reduction in the reliability of personality measures that can be estimated by comparing the reliability under honest and faking response instructions.
Likewise, this theory proposes that faking causes an increase in the mean scores and a reduction in variability (lower standard deviations), producing range restriction in the scores. In this sense, the last hypothesis of this study is:
Hypothesis 5 (H5): Faking produces range restriction in personality measures that can be observed when the variability under honest and faking response conditions is compared.
2. Materials and Methods
2.1. Sample
The sample consisted of 939 students from the University of Santiago de Compostela belonging to different degrees. There were 657 women (69.96%) and 282 men (30.04%). The average age was 21.62 years old (SD = 3.90).
To carry out this experimental study, the voluntary participation of university students was requested by posting notices in faculties and other public spaces of the University of Santiago de Compostela (e.g., libraries, academic management units, or university residences). The sample collection was carried out between the months of January and June 2017 and between those same months of 2018. In order to attract students, they were offered economic compensation (EUR 10) in exchange for their participation. To perform the study, small face-to-face groups of between 10 and 15 participants were organized and all subjects provided informed consent to participate in the study.
2.2. Measures
QI5F_tri. The quasi-ipsative FC questionnaire QI5F_tri [
74] was used to assess personality. This test consists of 140 items that evaluate the Big Five personality factors (28 items for each factor). Each item presents three response alternatives that are balanced in social desirability. The three alternatives reflect different personality dimensions, but each item is used to assess a single personality factor, that is, the items used to evaluate one factor are not used to evaluate other factors. Therefore, the QI5F_tri implements Horn’s [
72] strategy of quasi-ipsativation, which means that the score for each of the Big Five is algebraically independent of the scores for the other personality factors even though the score format is quasi-ipsative. An example of an item of the QI5F_tri would be: I am a person who is: (a) very imaginative (openness to experience); (b) generous to others (agreeableness); (c) meticulous in every task (conscientiousness). With respect to the reliability of this measure, the internal consistency coefficients (Cronbach’s alpha) were 0.71, 0.73, 0.80, 0.66, and 0.80 for emotional stability (ES), extraversion (EX), openness to experience (OE), agreeableness (A), and conscientiousness (C), respectively. The test–retest reliabilities (for a four-week interval) reported were 0.91, 0.90, 0.79, 0.65, and 0.72 for ES, EX, OE, A, and C, respectively. Otero et al. [
24] also reported evidence of the convergent–discriminant validity of the QI5F_tri using an SS personality inventory. Exploratory factor analyses confirmed the five-factor structure of the QI5F [
25].
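The idea that each triad contributes to a single factor, so that factor scores are algebraically independent, can be sketched as follows. The item list, the 2/1/0 scoring key, and the function name `quasi_ipsative_scores` are assumptions made for illustration and are not taken from the QI5F_tri scoring procedure.

```python
# Hypothetical items: (target factor, position of the target option, 0-2).
ITEMS = [
    ("C", 2),
    ("C", 0),
    ("ES", 1),
    ("ES", 2),
]

# Assumed scoring key, not taken from the QI5F_tri documentation.
POINTS = {"most": 2, "blank": 1, "least": 0}


def quasi_ipsative_scores(responses):
    """Score each triad only for its target factor.

    responses: one (most, least) pair of option positions per item.
    """
    scores = {}
    for (factor, target), (most, least) in zip(ITEMS, responses):
        if most == least:
            raise ValueError("most and least must differ")
        if target == most:
            points = POINTS["most"]
        elif target == least:
            points = POINTS["least"]
        else:
            points = POINTS["blank"]
        scores[factor] = scores.get(factor, 0) + points
    return scores


high = quasi_ipsative_scores([(2, 0), (0, 1), (1, 0), (2, 1)])
low = quasi_ipsative_scores([(0, 2), (1, 0), (1, 2), (0, 2)])
print(high, sum(high.values()))
print(low, sum(low.values()))
```

In this sketch the totals differ between respondents and a change in one factor score leaves the others untouched, which is the property that distinguishes this format from a purely ipsative one.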
GPA. Grade point average (GPA) was used as a measure of academic performance. Each participant had to provide a copy of their official academic transcripts to participate in the study. The mean GPA was 6.98 (SD = 0.88) for this sample. Salgado and Táuriz [
22] developed an empirical distribution of GPA reliability and found an average reliability coefficient of 0.83. Therefore, the reliability coefficient used for GPA was 0.83.
CDTE. The academic task performance was assessed using the Cuestionario de Desempeño de Tarea en Estudiantes (CDTE; Questionnaire of Academic Task Performance) developed by Salgado [
80]. It is a self-report measure composed of 30 items that assess three dimensions: accomplishment; achievement orientation; and implication as a student. The internal consistency coefficient (Cronbach’s alpha) for this measure was α = 0.76 (
N = 803).
CDCE. The Cuestionario de Desempeño Contextual en Estudiantes (CDCE; Questionnaire of Academic Contextual Performance) [
80] was used to assess contextual performance. This scale is composed of 30 items that assess the following behaviors: personal support, organization, and conscious initiative. The internal consistency reliability coefficient was α = 0.75 (
N = 794).
CDAN. Academic dishonesty behavior was measured using the Cuestionario de Desempeño Académico Negativo (CDAN; Questionnaire of Negative Academic Performance) [
80]. This scale consists of a self-report measure composed of 30 items that assess the following dimensions: cheating on examinations, inappropriate use of resources, absenteeism, non-compliance with rules, and low effort. The Cronbach’s alpha was 0.89 (
N = 799).
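For readers unfamiliar with the internal consistency coefficient reported for these scales, the following is a minimal sketch of Cronbach's alpha computed from a respondents-by-items table. The ratings are invented for illustration and the function name `cronbach_alpha` is hypothetical; the coefficients reported above were not obtained with this code.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha from a respondents-by-items table of numeric scores."""
    n_items = len(item_scores[0])
    if n_items < 2 or len(item_scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [
        variance([row[i] for row in item_scores]) for i in range(n_items)
    ]
    # Sample variance of the total (summed) scale score.
    total_variance = variance([sum(row) for row in item_scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical 5-point ratings: four respondents, three items.
ratings = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
]
print(round(cronbach_alpha(ratings), 3))
```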
2.3. Procedure
To carry out the study, small group sessions were organized to respond to the tests and two experimental designs were used. Of all the subjects, 52.18% (N = 490) participated in a within-subject design, in which all participants answered the personality inventory both honestly and under conditions that induced them to commit faking. The remaining 47.82% (449 subjects) responded to the personality measure only under honest response instructions.
In the honest condition, the participants followed the instructions that are described below: “In the following questionnaire you will be presented with sets of phrases grouped into triads. Try to rank them by first identifying the one that best describes you, the one that second best describes you, and finally the one that describes you least. In each item, mark a plus sign (+) next to the phrase that best describes you and a minus sign (−) next to the phrase that least describes you. You should leave blank the one you considered second”. For the faking condition, the test instructions were slightly modified in such a way that participants were encouraged to fake. The following paragraph was added: “When answering, imagine that you are in the last step of a selection process for a very attractive job. Since it offers you a great opportunity to advance your professional career, you want to get that job. To do this, you must answer the test trying to give a better image of yourself.”
In both response conditions, the inventory was administered: (1) in paper-and-pencil format or (2) in computer format using the Inquisit program [
81]. The participants only had access to the test during the time they attended the study and they responded using only one of the administration formats.
Regarding the procedure for answering the three performance measures (CDTE, CDCE, and CDAN), the participants followed the instructions of the questionnaire: “Please indicate the frequency with which you engaged in the behaviors and actions described below in your academic environment using a 5-point Likert scale ranging from 1 = never to 5 = always.”
The three self-reported scales were administered in paper-and-pencil format and participants were asked to be totally honest in their answers. In the within-subject design, the performance ratings were administered between the two response conditions (honest and faking) of the personality measure to separate in time the responses of the participants under each set of instructions. This procedure was intended to ensure that the individuals could not rely on what they answered in the first condition of the personality test to answer in the second.
2.4. Statistical Analyses
To carry out this study, several statistical analyses were performed. First, to test Hypotheses 1 and 2, descriptive statistics were calculated, and a correlational analysis was performed.
To test Hypotheses 3–5, first, a principal component analysis was performed, and a factor score was obtained for each subject from the composite of the three scales. The second step consisted of a correlation analysis between the global factor of academic performance and the five personality dimensions using the SPSS program. The third step consisted of correcting the observed correlations for measurement error in the predictor and the criterion and for direct range restriction in the predictor, for which the VALCOR program was used [
82].
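The two kinds of correction mentioned in the third step can be sketched with the classical formulas: the correction for attenuation and Thorndike's Case II correction for direct range restriction. This is only an illustration under simplifying assumptions. The order of the corrections, the function names, the observed correlation, and the value of u are hypothetical, and the procedure implemented in VALCOR may differ.

```python
import math


def disattenuate(r, rxx=1.0, ryy=1.0):
    """Correct r for unreliability in the predictor (rxx) and criterion (ryy)."""
    return r / math.sqrt(rxx * ryy)


def correct_direct_range_restriction(r, u):
    """Thorndike Case II correction; u = restricted SD / unrestricted SD."""
    if not 0 < u <= 1:
        raise ValueError("u must be in (0, 1]")
    big_u = 1 / u
    return (r * big_u) / math.sqrt(1 + r**2 * (big_u**2 - 1))


# Reliabilities quoted in the text: 0.80 (conscientiousness), 0.83 (GPA).
# The observed r and the u value are hypothetical placeholders.
observed_r = 0.20
operational = correct_direct_range_restriction(
    disattenuate(observed_r, ryy=0.83), u=0.90
)
true_score = disattenuate(operational, rxx=0.80)
print(round(operational, 3), round(true_score, 3))
```

In this sketch, the first value is corrected for criterion unreliability and range restriction only, and the second value is additionally corrected for predictor unreliability.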
3. Results
Next, the results obtained in the analysis of correlations between personality and the set of performance criteria examined under the honest and faking response conditions are presented. Each table shows first the means and standard deviations obtained for each of the variables, and then the observed correlations between the personality variables and the performance criteria (task performance, contextual performance, academic dishonesty, and GPA).
3.1. Results of the Predictive Validity of the QI5F_tri for the Total Sample and the between-Subject Design Sample under Honest Conditions
Table 1 shows the observed predictor–criterion correlations obtained for the total sample and the between-subject sample in the honest condition (the correlation matrices involving all predictor and criterion measures are presented in
Appendix A for the total sample and in
Appendix B for the between-subject sample). As can be seen, the results obtained in both samples are very similar, although the correlations are, in general terms, slightly higher in the case of the between-subject design sample. The results show significant correlations between most of the personality factors and the performance criteria examined.
Emotional stability is negatively and significantly related to task performance (CDTE; r = −0.15 and −0.18, p < 0.01) and contextual performance (CDCE; r = −0.13 and −0.18, p < 0.01), with very similar effect sizes in the two samples. Likewise, in the total sample, this factor also correlates negatively and significantly with the measure of counterproductive academic behaviors, although the effect size is smaller (r = −0.09, p < 0.01). No significant relationship was found between this factor and GPA in either sample.
Regarding extraversion, the results show that it predicts task performance, with negative and significant correlations of r = −0.10 and −0.15 (p < 0.01) for the total sample and the between-subject sample, respectively. A positive and significant relationship with academic dishonesty was also found, with correlations of r = 0.15 (p < 0.01) for the total sample and r = 0.25 (p < 0.01) for the between-subject sample. Finally, extraversion showed a negative and significant correlation with GPA (r = −0.16, p < 0.01), although only in the between-subject sample.
Openness to experience was the only factor that did not significantly correlate with any of the performance criteria analyzed, obtaining effect sizes of less than 0.07 in all cases.
Agreeableness was negatively and significantly correlated with GPA (r = −0.14, p < 0.01) in the total sample. A low but significant correlation of r = 0.07 (p < 0.05) was also found between this factor and contextual performance in the same sample. The correlations with the remaining criterion variables in the two samples analyzed were very low and non-significant.
Finally, the results show that conscientiousness is a robust predictor of performance when a quasi-ipsative FC inventory is used. It is the only factor that presents significant correlations with all the evaluated criteria, with the values obtained for the between-subject sample being slightly higher. Specifically, in relation to GPA, a correlation of r = 0.24 (p < 0.01) was obtained in the total sample and of r = 0.34 (p < 0.01) in the between-subject sample. Regarding the three performance rating scales, a correlation of r = 0.38 (p < 0.01) with task performance, of r = 0.17 and 0.15 (p < 0.01) with contextual performance, and of r = −0.31 and −0.33 (p < 0.01) with the measure of negative academic performance was found for the total sample and the between-subject sample, respectively.
In summary, the results obtained in the honest condition for both samples show that the Big Five, in general, and conscientiousness in particular, are important predictors of multiple performance criteria when evaluated with quasi-ipsative FC inventories. Therefore, these results support Hypothesis 1 under honest response conditions and support the results of the meta-analysis by Salgado and Táuriz [22] on the predictive validity of the quasi-ipsative FC personality inventories.
3.2. Results of the Predictive Validity of the QI5F_tri in the Within-Subject Design Sample under Honest and Faking Conditions
Table 2 includes the predictor–criterion correlations for the within-subject design in the two response conditions examined, honest and faking (the correlation matrices for all predictor and criterion measures in both conditions are presented in Appendix C). Taken together, the results show that personality measures assessed with a quasi-ipsative FC inventory without algebraic dependence predict performance under both conditions. However, the effect sizes in this sample are smaller than those obtained in the total sample and the between-subject design sample.
The results obtained in the honest condition show that emotional stability significantly predicts only academic task performance (CDTE), with a negative correlation of r = −0.13 (p < 0.01); the correlations obtained with the other performance measures are very low.
In the case of extraversion, no significant correlations were obtained between this factor and the evaluated criteria. Therefore, and contrary to the results obtained in the other cited samples, extraversion does not turn out to be a predictor of any of the performance criteria. The opposite occurs with openness to experience, which in this design predicts academic task performance, r = −0.10 (p < 0.05), and deviant behaviors in the academic context, r = 0.12 (p < 0.01).
Agreeableness correlates significantly and negatively with GPA, obtaining a correlation of r = −0.17 (p < 0.01). However, no significant correlations are obtained with the other performance measures, with values very close to zero in all cases.
Finally, conscientiousness is, again, the only factor that obtains significant correlations with all the criterion variables analyzed. The correlations are positive and significant with GPA (r = 0.17, p < 0.01), task performance (r = 0.38, p < 0.01), and contextual performance (r = 0.18, p < 0.01), and negative and significant with academic dishonesty (r = −0.30, p < 0.01). These data, therefore, show the robustness of conscientiousness as a predictor of performance when quasi-ipsative FC measures are used and support Hypothesis 1 under honest response conditions.
Focusing on the faking response condition, the results show lower correlations than those obtained in the remaining experimental conditions and samples examined. Significant correlations were found only for the conscientiousness and emotional stability factors; the correlations obtained for the remaining personality factors were very low and not significant.
Specifically, conscientiousness is the only factor that predicts almost all performance criteria evaluated under faking conditions. It shows a correlation of r = 0.11 (z = 1.105, p > 0.10) with GPA and of r = 0.16 (z = 4.389, p < 0.01) with the CDTE scale. Moreover, a correlation of r = −0.15 (z = −3.017, p < 0.01) with the CDAN scale was obtained; therefore, conscientiousness also predicts people’s propensity to commit negative academic behaviors under faking conditions. However, the results showed statistically significant differences between the two response conditions in the validity of conscientiousness for predicting task performance and contextual performance. The only criterion with which the correlation obtained is weak is contextual performance (CDCE), although even under honest conditions the correlations between conscientiousness and this criterion were lower than those found for the other criteria examined.
For emotional stability, the results show that it is a robust predictor of task performance (r = −0.11; z = −0.421, p > 0.10) and contextual performance (r = −0.15; z = −0.421, p > 0.10) under faking response conditions. In this case, we found no statistically significant differences in predictive validity between the two response conditions.
Therefore, the results obtained for the within-subject design show, once again, that conscientiousness, followed by emotional stability, is the best predictor of academic performance even under faking conditions when evaluated using a quasi-ipsative FC inventory. Thus, the results obtained support Hypothesis 1.
Finally, the results collected in Table 1 and Table 2 show that conscientiousness is a better predictor of criteria based on rating scales (i.e., CDTE, CDCE, and CDAN) than of those based on academic grades (i.e., GPA). Thus, the results in the case of the total sample (Table 1) show a validity for GPA (r = 0.24) which is lower than that found for task performance and academic dishonesty (r = 0.38 and r = −0.31, respectively). In the case of the between-subject design, the validity for predicting GPA (r = 0.34) is less than the validity for predicting task performance (r = 0.38). In the case of the within-subject design, the validity for GPA (r = 0.17) is less than the validity for the three measures of rating scales, with correlations of 0.38, 0.18, and −0.30 for CDTE, CDCE, and CDAN, respectively. In the faking condition, the validity for GPA (r = 0.11) is lower than the validity for predicting task performance and academic dishonesty (r = 0.16 and r = −0.15, respectively). Therefore, these results also support Hypothesis 2.
3.3. Results of the Predictive Validity of the QI5F_tri for Overall Academic Performance Rating and Academic Performance Compound
These results show that the type of performance measure can act as a moderator of the predictive validity of the quasi-ipsative FC measure of personality. However, the previous comparisons are not totally adequate because the results for GPA, which is a broad measure of performance, are being compared with the results for rating scales that evaluate facets or performance sub-dimensions and, therefore, are narrow measures.
Moreover, the correlations observed under faking response instructions may be attenuated by the effects of faking on the reliability of the personality measures and underestimated because of range restriction. A more appropriate comparison is between the results for GPA and the results for a compound of the three facets of academic performance, with the correlations corrected for measurement error in the predictor and the criterion and for range restriction in the predictor.
With this objective in mind, a principal component analysis was carried out. It showed that a single component, with an eigenvalue (latent root) of 2.001 (the next had an eigenvalue of 0.618), explained 66.7% of the variance, and the factor loadings were 0.89, 0.79, and 0.78 for CDTE, CDCE, and CDAN, respectively. Therefore, an overall component of academic performance adequately explains the relationship between the three sub-dimensions of performance. Next, the correlations between overall academic performance (OAP) and the Big Five factors were calculated.
Table 3 shows the reliability and range restriction coefficients used in the analyses. The correlations, both observed and corrected, appear in Table 4.
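The corrections for measurement error and range restriction described above can be illustrated with a minimal sketch. It assumes the classical disattenuation formula and Thorndike's Case II formula for direct range restriction; the text does not state which formulas or which order of correction were applied, so both are assumptions here, and all input values are hypothetical rather than taken from Table 3.

```python
import math


def correct_for_unreliability(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Classical disattenuation: divide the observed correlation by the
    square root of the product of predictor and criterion reliabilities."""
    return r_xy / math.sqrt(r_xx * r_yy)


def correct_for_range_restriction(r: float, u: float) -> float:
    """Thorndike Case II correction for direct range restriction.

    u is SD(restricted) / SD(unrestricted) for the predictor;
    u = 1 means no restriction and leaves r unchanged.
    """
    big_u = 1.0 / u
    return (r * big_u) / math.sqrt(1.0 + r * r * (big_u * big_u - 1.0))


# Hypothetical inputs (NOT values reported in the study).
observed_r = 0.16
predictor_reliability = 0.70
criterion_reliability = 0.80
u_ratio = 0.85

# Assumed order: measurement error first, then range restriction.
r_reliability = correct_for_unreliability(
    observed_r, predictor_reliability, criterion_reliability
)
r_fully_corrected = correct_for_range_restriction(r_reliability, u_ratio)

print(f"observed r = {observed_r:.2f}")
print(f"corrected for measurement error = {r_reliability:.3f}")
print(f"also corrected for range restriction = {r_fully_corrected:.3f}")
```

Both corrections can only increase the absolute size of a correlation when the reliabilities and u are below 1, which is why corrected coefficients exceed the observed ones.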
The results in Table 3 are related to Hypotheses 4 and 5 of this study. As can be seen, faking produces range restriction and reduces the reliability in the five personality factors. Therefore, these two hypotheses derived from the theory of faking have been empirically supported.
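As an illustration of the two quantities involved in these hypotheses, the sketch below computes Cronbach's alpha and a range restriction ratio u. It assumes that u is defined as the standard deviation of the scale scores under faking divided by the standard deviation under honest instructions, which the text does not specify; the function names and the item scores are hypothetical.

```python
from statistics import stdev, variance


def cronbach_alpha(items):
    """Cronbach's alpha for one scale.

    items: one list of scores per item, all in the same respondent order.
    """
    k = len(items)
    totals = [sum(person) for person in zip(*items)]
    sum_item_variances = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1.0 - sum_item_variances / variance(totals))


def range_restriction_ratio(restricted_scores, reference_scores):
    """u = SD of the (possibly) restricted scores / SD of the reference scores."""
    return stdev(restricted_scores) / stdev(reference_scores)


# Hypothetical item scores for six respondents on a three-item scale.
honest_items = [[1, 2, 3, 4, 5, 3], [2, 2, 3, 5, 5, 3], [1, 3, 3, 4, 4, 2]]
faking_items = [[4, 4, 5, 5, 5, 4], [5, 4, 4, 5, 5, 5], [4, 5, 5, 4, 5, 4]]

honest_totals = [sum(person) for person in zip(*honest_items)]
faking_totals = [sum(person) for person in zip(*faking_items)]

print("alpha (honest):", round(cronbach_alpha(honest_items), 3))
print("alpha (faking):", round(cronbach_alpha(faking_items), 3))
print("u (faking / honest):", round(range_restriction_ratio(faking_totals, honest_totals), 3))
```

With real data, a lower alpha under faking and a value of u below 1 would correspond to the two effects predicted by the theory.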
The results displayed in Table 4 indicate that conscientiousness predicts academic performance better when it is evaluated using OAP than when it is assessed using GPA in both response conditions. In the honest response condition, we found a corrected correlation of 0.47 for performance evaluated with rating scales and a corrected correlation of 0.21 when it was assessed by GPA. Under faking response instructions, the corrected correlations were 0.20 for OAP and 0.14 for GPA. Similar results were found for emotional stability, although such results had not been anticipated in our hypotheses. Therefore, regarding Hypothesis 3, the results show that the effects of faking on the validity of the quasi-ipsative FC inventory are independent of whether the criterion is evaluated using rating scales or objective data such as GPA: in both cases, there is a reduction in the observed correlation. However, the decrease is greater when performance is evaluated using rating scales, i.e., OAP.
The last analysis presented in this study concerns the prediction of the broadest possible performance criterion that can be built from the performance measures used. To this end, the four performance measures were combined into a single compound by means of a principal component analysis, and the resulting component was then correlated with the Big Five scores obtained under both response conditions. The principal component analysis showed that there was only one significant component, with an eigenvalue (latent root) of 2.123 (the next component had an eigenvalue of 0.860), which explained 53.1% of the variance. The factor loadings were 0.502, 0.859, 0.750, and 0.756 for GPA, CDTE, CDCE, and CDAN, respectively. Therefore, the academic performance compound (APC) obtained adequately explains the relationship between the four measures.
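The figures reported for the two principal component analyses can be checked for internal consistency with a short sketch. It assumes that both analyses were run on correlation matrices (standardized variables), which the text does not state; under that assumption, the share of variance explained by a component is its eigenvalue divided by the number of variables, and the eigenvalue equals the sum of the squared loadings. The function names are hypothetical.

```python
def explained_share(eigenvalue: float, n_variables: int) -> float:
    """Share of total variance explained by one component when the PCA
    is run on a correlation matrix (total variance = number of variables)."""
    return eigenvalue / n_variables


def eigenvalue_from_loadings(loadings) -> float:
    """Eigenvalue of a component recovered as the sum of squared loadings."""
    return sum(loading * loading for loading in loadings)


# Values reported in the text for the OAP (three rating scales)
# and the APC (GPA plus the three rating scales).
oap_share = explained_share(2.001, 3)
apc_share = explained_share(2.123, 4)
oap_eigenvalue = eigenvalue_from_loadings([0.89, 0.79, 0.78])
apc_eigenvalue = eigenvalue_from_loadings([0.502, 0.859, 0.750, 0.756])

print(f"OAP: share = {oap_share:.1%}, eigenvalue from loadings = {oap_eigenvalue:.3f}")
print(f"APC: share = {apc_share:.1%}, eigenvalue from loadings = {apc_eigenvalue:.3f}")
```

Small gaps between the recovered and the reported eigenvalues are expected because the loadings are rounded, to two decimals for the OAP and three for the APC.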
Table 5 shows the observed and corrected correlations between the five factors and the APC.
As can be seen, the corrected correlation of conscientiousness with APC in the honest condition was 0.50, that is, 6.4% higher than the correlation obtained for OAP. In the case of the faking condition, the corrected correlation was 0.24, that is, 20% higher than the correlation for OAP. These two results, taken together, indicate that conscientiousness better predicts a broad composite of academic performance that includes both self-reported rating scales and academic grades.
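The percentages in the preceding paragraph are relative gains over the OAP coefficient, not differences in correlation points. A minimal sketch of that arithmetic, using the corrected correlations for conscientiousness reported in the text (the function name is hypothetical):

```python
def relative_gain(new_value: float, baseline: float) -> float:
    """Relative increase of new_value over baseline, as a proportion."""
    return (new_value - baseline) / baseline


# Corrected correlations of conscientiousness reported in the text.
honest_gain = relative_gain(0.50, 0.47)  # APC vs. OAP, honest condition
faking_gain = relative_gain(0.24, 0.20)  # APC vs. OAP, faking condition

print(f"honest condition: {honest_gain:.1%}")
print(f"faking condition: {faking_gain:.1%}")
```

In absolute terms, the corresponding differences are 0.03 and 0.04 correlation points.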
4. Discussion
This study had four main objectives. The first was to determine whether a quasi-ipsative FC (algebraically independent) personality measure predicts academic performance ratings and academic grades under honest and faking response instructions and, in particular, to examine the predictive validity of conscientiousness (Hypothesis 1). The second was to check whether the performance measurement method affects predictive validity (Hypothesis 2). The third was to check whether the effects of faking occur independently of the performance measurement method used (Hypothesis 3). The last was to test the effects of faking on the reliability and range restriction of the personality scores that Salgado [42] proposed in his theoretical model of the effects of faking (Hypotheses 4 and 5). In relation to the first objective, the results showed that conscientiousness is the best predictor of performance when quasi-ipsative FC measures are used, with significant r values found in all cases under honest conditions. These results are similar to those obtained by Salgado and Táuriz [22] and Salgado et al. [19], who found that the conscientiousness factor was the best predictor of academic performance. It was also observed that emotional stability and extraversion were predictors of various academic performance criteria. However, the significant correlations obtained between these factors and the performance measures vary across the samples analyzed. A similar situation occurred with openness to experience and agreeableness: although they did not stand out as predictors of academic performance, they obtained significant correlations with some of the performance criteria analyzed. These results reveal the considerable variability that occurs across the experimental designs in the honest condition. This is, therefore, the first unique contribution of this study.
Regarding the faking condition, the results show the robustness of conscientiousness as a predictor of academic performance, showing significant correlations with almost all the analyzed performance variables. Likewise, significant correlations have been found with emotional stability for task and contextual performance criteria, which shows that this factor is a valid predictor of performance even under faking conditions. These results follow the line of those obtained in the honest condition of this study and of the results of previous meta-analyses of the predictive validity of the FC quasi-ipsative inventories (in honest conditions) that indicated that emotional stability is an adequate predictor of performance (Salgado et al., 2015). This is the second unique contribution of this study.
Therefore, these findings show that personality evaluated with an algebraically independent quasi-ipsative FC inventory predicts performance even under faking response conditions. However, there is a reduction in the effect sizes of the correlations, although in the present study it cannot be attributed entirely to faking, since the correlations corrected for measurement error and range restriction still do not approach the values obtained under honest response conditions. As we have indicated, the results of the honest samples vary considerably across the experimental designs, which shows that there are variables, other than faking, that could be affecting the results. Therefore, this is the third unique contribution of this study.
With respect to the second goal, this study showed that conscientiousness is a better predictor of performance when it is assessed with rating scales than with academic grades. This contribution is also unique to this study. Moreover, the results join the growing empirical evidence that indicates that the performance measurement method is a powerful moderator of the validity of predictive instruments, for example, of cognitive ability tests [78] and the selection interview [79]. The evidence provided indicates that in the case of the measurement of personality, the performance measurement method can also have important effects on validity.
In relation to the third objective, this study has contributed by showing that the faking effect occurs independently of the performance measures used, although the reduction in the validity coefficient was greater when performance was measured with performance rating scales. Hence, this is the fifth contribution of this study. Finally, the sixth contribution of this study has to do with the psychometric theory of the effects of faking (Hypotheses 4 and 5) [42]. According to this theory, if subjects distort their answers, the reliability and validity of the questionnaires will be attenuated, due to an increase in measurement error and a reduction in the range of scores. This effect has been verified in the present study. The alpha coefficient of internal consistency obtained under faking instructions was lower than that obtained under honest response conditions. It could also be observed that there was a certain degree of range restriction in four of the personality factors (the exception was conscientiousness). Therefore, the study has contributed in a unique way by testing the predictions of the psychometric theory of the effects of faking under experimental conditions and with a type of personality inventory not examined to date. The results provide empirical evidence to support the theory’s predictions.
In conclusion, this study represents a unique empirical contribution since it is the first study that has simultaneously examined the criterion validity of the quasi-ipsative FC inventory under honest and faking conditions for academic criteria and the results have been compared in three samples.
The results obtained allow us to conclude that the personality measures evaluated with a quasi-ipsative FC inventory (without algebraic dependence) predict performance even under faking conditions. Specifically, it has been found that conscientiousness is the best predictor of academic performance, regardless of the response condition (honest or faking) or the experimental design in which it is evaluated (between or within-subject design).
The current study has also made it possible to analyze the moderating effect that the type of performance measure has on the predictive validity of the quasi-ipsative FC inventory in honest and faking conditions, a topic that has not been analyzed in the field of academic performance. We found that conscientiousness is a better predictor of performance when it is evaluated with self-report rating scales than with academic grades in both response conditions. Therefore, the results have shown that the type of performance measure is a powerful moderator of the validity of the predictive instruments. Regarding this issue, it has also been shown that faking reduces the validity coefficients of both types of criterion measures, although a greater reduction in validity has been observed when using self-report rating data.
Finally, this research has provided empirical evidence that supports the predictions of the psychometric theory of the effects of faking [42]. The results have shown that the reliability coefficients are smaller and there is range restriction in the faking condition compared to the honest one. This is the first study to analyze this effect in a quasi-ipsative FC inventory.
4.1. Theoretical and Practical Implications
The results of this study have implications for both the theory and practice of personality assessment in applied contexts. From a theoretical point of view, this is the first study that provides empirical evidence of the effects of faking on the predictive validity of a quasi-ipsative FC inventory that provides non-algebraically dependent scores. The results obtained suggest that this type of FC questionnaire is a robust instrument that controls faking effects on predictive validity.
Moreover, in relation to the theory of personality assessment, a relevant implication of the results is that the psychometric effects of faking do not seem to be the only factors reducing the predictive validity of the personality measures. Even after the effects of faking on reliability (internal consistency) and range restriction were controlled for and the validity coefficients were corrected, a notable difference remained between the validity coefficients obtained under honest and faking response conditions, a difference that should have been smaller once those effects had been psychometrically corrected. This implies that other variables, not only faking, affect the predictive validity of these measures.
We speculate that the change (reduction) in the predictive validity coefficients may also be due to other idiosyncratic factors (e.g., changes in the response mode of individuals) or fatigue (response to a large questionnaire on two consecutive occasions, which required more than an hour of work) or practice (less involvement on the second occasion, with less elaborate answers). Future studies should examine the potential contribution of these factors (and others) to the reduction in predictive validity.
In relation to the practice of personality assessment, an important implication is that predictive validity is considerable when a quasi-ipsative FC personality inventory is used even in faking response conditions, and that the validity is even better when broad academic performance criteria are examined. For this reason, a recommendation for evaluation professionals in applied contexts (e.g., student admission processes, selection processes for internship and training) is to use quasi-ipsative FC inventories without algebraic dependence since, in addition to being good predictors, they are robust against faking.
Furthermore, based on the findings obtained, we recommend that the observed validity coefficients be corrected for range restriction to establish a less biased estimator of validity.
Finally, the results of this study also suggest that the measurement method of performance is a powerful moderator of the validity of predictive instruments. Therefore, professionals must be aware that the validity of personality instruments (in our case, the quasi-ipsative FC inventories) is not identical for all modes of measuring academic performance. They must use the appropriate coefficient for the type of criterion measure to be used in each case, remembering that the validity coefficients are lower for academic grades than for self-reported performance ratings.
4.2. Limitations of the Study and Future Research
This study is not without its limitations. The first limitation comes from the differences in sample sizes, which produced different sampling errors. Likewise, the characteristics of the participants could also have affected the results; they came from several different degrees, and this could have affected criteria such as GPA, which may have conditioned the results in the honest condition and could have affected the results in the faking condition.
A second limitation arises from the fact that in the faking condition, only the results in a within-subject design have been analyzed, as we were unable to examine the results in a between-subject design. We hope to be able to continue with this research and obtain sufficient data to check whether these results are also maintained in this design.
It should also be noted that the faking condition of this study is a condition of maximum distortion in which the participants are induced to commit a high degree of faking. Therefore, it is expected that the results of faking would be less intense under normal performance contexts. In the present case, it was not possible to control for this variable (faking in maximum performance vs. faking in typical performance), which is a third limitation of the study. It would be advisable to carry out studies that analyze the criterion validity of the quasi-ipsative FC personality inventory in real selection contexts to examine whether the results obtained in this study are reproduced.
Likewise, in this study, a quasi-ipsative FC measure without algebraic dependence has been used to predict exclusively academic performance criteria, which does not allow for the generalization of the results to other types of FC inventories and other criteria. Recent studies [63, 76] have shown that even among different types of quasi-ipsative FC inventories, the method of obtaining the Big Five scores (based on the classical theory of measurement or on Thurstonian item response theory models) can produce remarkably different predictive validity coefficients. In this sense, in future research, it would be of interest to examine whether faking affects the predictive validity of other types of FC inventories (for example, ipsative, normative, or quasi-ipsative with algebraic dependence inventories) and with other criteria such as job performance.