3.1. Responses and Descriptive Statistics
The subjects comprised two main groups: first-year undergraduate accounting students and final-year graduate accounting students. There were also two control groups: first-year undergraduate management students and final-year graduate management students. A total of 432 responses to the questionnaire were received: 146 (a response rate of 90 percent) and 95 (a response rate of 84 percent) from the accounting students, and 84 (a response rate of 93 percent) and 107 (a response rate of 88 percent) from the management students. The demographic details of the respondents are reported in
Table 1.
Among the first-year undergraduate accounting students, most subjects were female (63.0%), aged 20 or younger (95.9%), and had no work experience (78.1%). Comparable demographic statistics were identified for the control group, in which 50% of the respondents were female, 96.4% were aged 20 or younger, and 73.8% had no work experience. In both groups, approximately 1% had professional experience in accounting or auditing.
The second researched group was the final-year graduate accounting students. It was highly dominated by females (83.2%), which is not unusual, as around 90% of the certified accountants in Poland are women [
33]. Most of the subjects in this group were 23 years old or older (94.7%) and had work experience (77.9%). Notably, about half (46.4%) of the final-year graduate accounting students had professional experience in accounting or auditing. In contrast, the control group was less dominated by females (59.8%); it had a similar age structure, with most subjects being 23 years old or older (99.1%), and a comparable percentage of subjects with work experience (79.4%). However, very few subjects from the control group had any professional experience in accounting (2.8%), which is the largest difference between the control group and the final-year graduate accounting students. To control for the effects of the demographic variables, gender, age, work experience (length and profile), and target profession after graduating were included as independent variables in the hypothesis robustness testing.
We calculated Cronbach’s alpha for all 432 responses, even though Hurtt [29] had already tested the scale with this coefficient. We did so because the subjects differ from the ones Hurtt surveyed and, more importantly, because the questionnaire had been translated into Polish. Although the translation was carried out with great diligence and according to commonly accepted standards, there may still be some doubt as to whether the context of the statements was conveyed appropriately. The internal consistency measured with Cronbach’s alpha is 0.81 for all 432 completed questionnaires, which is similar to the values reported by [
29,
31,
34]. In
Table 2, we report the complete results for Cronbach’s alphas for each construct in comparison with Hurtt’s [
29] results.
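As a minimal sketch of how such a coefficient is computed, the standard formula is alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the toy responses below are hypothetical and serve only as illustration; they are not the study data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents' item scores.

    responses: list of rows, one per respondent; each row holds the
    scores of the same k items (reverse-scored items already recoded).
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_variance_sum = sum(variance(column) for column in zip(*responses))
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical toy data: five respondents, three items.
toy_responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]
print(round(cronbach_alpha(toy_responses), 3))
```

Applied per construct, the same function would be called on the columns belonging to that construct only.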
There is considerable debate over the acceptable or desirable ranges of Cronbach’s alpha for research studies in the social sciences. For example, Nunnally [
35] and Carmines and Zeller [
36] suggest 0.80 as a minimum level of reliability for “basic research”. In contrast, DeVellis [
37], Bland and Altman [
38], and Hair et al. [
39] suggest a value ranging from 0.70 to 0.80 for comparing groups as satisfactory. Furthermore, Aron and Aron [
40] and Field [
41] proposed that, for research in psychology, Cronbach’s alpha of 0.60 or even lower could be adequate. On the other hand, very high values of alpha are not desirable either. Steiner [
42] states that a value higher than 0.90 often indicates redundancy and points to an excessive number of items in the scale. Generally, values of alpha between 0.80 and 0.90 are considered to be reliable by most researchers. By this standard, three constructs of the HPSS (Hurtt’s Professional Skepticism Scale) do not fall within the given range. There are several possible reasons why lower values of this coefficient were obtained. If the questionnaire is translated or used in a country that is culturally dissimilar to the one where the scale was developed, it might have a lower reliability [
43]. This negative impact on reliability is even greater when some items of the translated questionnaire are written in the opposite direction [
44]. As all of these factors apply to this study, such a negative impact should be taken into consideration.
Although the value of Cronbach’s alpha for the interpersonal understanding construct (0.76) falls outside the most desirable range, it can still be interpreted as respectable. Researchers agree that a value of this coefficient over 0.75 for social sciences is acceptable [
45,
46]. However, the values of 0.63 and 0.53 calculated for the self-determining and questioning mind constructs show very limited, or poor, reliability. Even though some authors, such as Aron and Aron [
40] and Field [
41], give some credibility to results with a Cronbach’s alpha of 0.60, every value of alpha lower than 0.70 confirms a random error in at least 50% of the scores. Note, however, that other authors who use the HPSS also report alphas lower than 0.70 [
29,
34] for at least one of the constructs.
3.2. Results for H1, H1a, and H1b
H1 states that there is no statistically significant difference in the change of professional skepticism level measured with HPSS between students enrolled in the accounting programs and students enrolled in other programs in the field of economics. The descriptive statistics on professional skepticism measured with HPSS are reported in
Table 3.
The mean scores achieved by the first-year bachelor (undergraduate) accounting students and the control group (management students) are statistically indistinguishable at the 0.05 significance level.
Table 3 also shows that, consistent with the expectation, the mean score of the final-year master students enrolled in the accounting program does not differ significantly from the mean score of the control group at the 0.05 significance level. Although the mean levels of professional skepticism measured with HPSS in both groups have changed during the four years of education, the two accounting programs have, on average, no significant impact on the relative change compared to the control group. Therefore, we consider H1 to be confirmed.
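The text does not name the test used for these mean comparisons. As one possible illustration, the sketch below computes Welch's unequal-variance t statistic from summary statistics (mean, standard deviation, group size) and approximates the two-sided p-value with the standard normal distribution, which is reasonable only for large groups. The function name `welch_t`, the choice of Welch's test, and the means and standard deviations in the example are assumptions; only the group sizes 146 and 84 come from the text.

```python
from math import sqrt
from statistics import NormalDist


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic, its degrees of freedom, and an approximate p-value.

    The p-value uses a normal approximation to the t distribution, so it
    is only a rough guide and is slightly too small for small samples.
    """
    v1 = sd1 ** 2 / n1  # squared standard error of the first mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second mean
    t = (mean1 - mean2) / sqrt(v1 + v2)
    # Welch-Satterthwaite degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    p_approx = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, df, p_approx


# Group sizes from the text; means and SDs are hypothetical placeholders.
t, df, p = welch_t(120.0, 12.0, 146, 121.0, 12.0, 84)
print(f"t = {t:.2f}, df = {df:.1f}, approximate p = {p:.3f}")
```

A statistics package with an exact t distribution would be preferable for reporting; the sketch only shows the arithmetic behind such a comparison.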
With H1a, we distinguish between students of two different master programs in accounting. We also predict that neither of these two programs has a significant impact on the change in the students’ professional skepticism level, comparing one to another. As there is only one accounting program at the undergraduate level, and both groups have an equal initial mean score, the hypothesis is confirmed if there is no statistically significant difference in the mean scores of these two groups at the final-year graduate level.
Table 3 shows that H1a must be initially rejected, as the mean score differs significantly (
p < 0.05). Also of interest, students of the ACCA-accredited accounting program increased their initial mean score by 3.9 points (3.2%), whereas their colleagues enrolled in the standard academic master program in accounting saw their mean score fall by 3.1 points (−2.6%) relative to the first-year undergraduate level.
With H1b, we expect that the master-level accounting program accredited by ACCA had no significant impact on the change in the professional skepticism level measured with HPSS compared to the control group. As the mean scores of these two groups differ significantly (p < 0.01) at the final-year master level but are statistically indistinguishable at the 0.05 significance level at the first-year bachelor level, the hypothesis is initially rejected.
3.3. Robustness Analysis
We performed a robustness analysis to confirm or disconfirm the impact of independent variables on the test results of hypotheses H1, H1a, and H1b. Firstly, we analyzed whether gender significantly affected the mean score and standard deviation of the collected questionnaire results. We expected there to be no relation between gender and the level of skepticism measured with HPSS and, in consequence, that the gender structure of the researched groups did not significantly affect the test results of H1, H1a, and H1b. The mean score of males at the first-year undergraduate level was significantly (
p < 0.01) higher than the mean score of females. At the final-year graduate level, the difference in mean scores was not significant. In order to control for gender, we equalized the gender structures of the first-year undergraduate groups and the respective final-year graduate groups. First, we recalculated the mean scores and standard deviations of the final-year graduate accounting group and control group according to the gender structure of the corresponding first-year undergraduate groups. Second, we did the opposite, recalculating the mean scores and standard deviations of the first-year undergraduate group and control group according to the gender structure of the corresponding final-year graduate groups. The results show there are no significant differences in the mean scores of the accounting groups and control groups with equalized gender structures. Therefore, the gender of the subjects has no significant impact on the test results of H1. Detailed results of the above analysis are presented in
Table 4.
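One plausible reading of this equalization is direct standardization: the female and male subgroup means of one group are re-weighted with the gender shares of the reference group. The sketch below follows that reading, which is an assumption rather than the documented procedure. The helper `reweight`, the subgroup means, and the standard deviations are hypothetical; only the 63.0% female share of the first-year accounting students is taken from the text.

```python
from math import sqrt


def reweight(subgroups, reference_shares):
    """Mean and SD of a group after imposing reference subgroup shares.

    subgroups: {label: (mean, sd)} for each subgroup of the group.
    reference_shares: {label: share}, shares summing to 1.
    The SD uses the mixture formula (within-subgroup variance plus the
    spread of subgroup means), treating the subgroup SDs as given.
    """
    if abs(sum(reference_shares.values()) - 1.0) > 1e-9:
        raise ValueError("reference shares must sum to 1")
    mean = sum(share * subgroups[g][0] for g, share in reference_shares.items())
    var = sum(
        share * (subgroups[g][1] ** 2 + (subgroups[g][0] - mean) ** 2)
        for g, share in reference_shares.items()
    )
    return mean, sqrt(var)


# Hypothetical final-year subgroup statistics (mean, SD) by gender.
final_year = {"female": (118.0, 11.0), "male": (122.0, 13.0)}
# Gender structure of the first-year accounting group (63.0% female).
first_year_shares = {"female": 0.63, "male": 0.37}
print(reweight(final_year, first_year_shares))
```

The re-weighted mean and standard deviation could then be fed into the same significance test used for the unadjusted comparison.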
The robustness analysis of the gender impact on H1a and H1b was performed with the same procedures as for H1. Under H1a, we compared the accounting students following the standard academic program and the ACCA-accredited program. Controlling for gender, we equalized the gender structure threefold, using female and male shares occurring in the first-year undergraduate accounting students, final-year graduate accounting students enrolled in the standard program, and final-year graduate accounting students enrolled in the ACCA-accredited program. The results in
Table A2 show that, under one of the above equalizations of female and male shares in the researched groups of subjects, gender has a significant (
p < 0.05) impact on H1a. For the other two, such significance does not occur. For H1b, we analyzed the impact of gender structure on the main result, with two equalizations of female and male shares in the researched groups, and found it to be insignificant. The details of the analysis are reported in
Table A3.
In the next stage, we measured the impact of work experience on the professional skepticism level. We predicted that such an impact existed, but that it did not significantly affect the PS (Professional Skepticism) level measured with HPSS. The results show that professional experience does not have a significant (
p < 0.05) impact on the mean score for first-year undergraduate students, but that it does for the final-year graduate groups (
Table 5).
Further analysis shows that the impact of professional experience on the final-year students’ mean scores applies only to the control group, and that it is significant at
p < 0.01. Detailed results are presented in
Table A4. We also disaggregated professional experience into the four components that were offered to the subjects as possible choices, in order to verify whether the significant differences in mean scores for the final-year control group applied to any professional experience or only to a particular type of it. However, we decided to re-aggregate the data collected for professional experience in the Accounting Department or Accounting Office, Auditing Firm, and Financial Department into one category (Experience in Accounting or Finance), as only eight subjects out of 219 in the first-year undergraduate groups claimed to have such work experience, and eight subjects out of 201 among the final-year graduate students reported having worked in an Auditing Firm or Financial Department. We then recalculated the means and standard deviations for the new aggregated item, “Experience in Accounting or Finance”, and compared them with the mean scores of the subjects with no professional experience or with other professional experience. The dependence identified earlier for the control group at the final-year graduate level was significant (
p < 0.05 and
p < 0.01) for both groups with professional experience, in comparison to the group with no work experience. The detailed results of the significance tests for the mean comparisons of the three extracted groups of subjects are reported in
Table A5.
In order to control for professional experience, which was the independent variable, we adopted a procedure identical to the case of gender. The mean scores and standard deviations were recalculated under the assumption that the professional experience structure would remain static. The results reported in
Table 6 show that professional experience has a significant (
p < 0.01) impact on H1.
If the work experience structure of the first-year undergraduate groups had persisted in the final-year graduate groups, the result of the H1 test would be the opposite of the one initially obtained. The same procedure was applied in order to test H1a and H1b. For all equalized professional experience structures of first-year undergraduate groups and their corresponding final-year graduate groups, H1a and H1b were rejected. These results support the expectation that work experience does not significantly affect the professional skepticism level of the subjects (measured with HPSS). The detailed results of the H1a and H1b tests controlled for the work experience factor are provided in
Table A6 and
Table A7, respectively.
Aside from the area of professional experience, subjects were asked to provide information on their length of service (years). We expected that this independent variable would have no significant influence on the skepticism level of the subjects measured with HPSS during their university years. One-way ANOVA (
p < 0.05) was carried out to test the significance of such an impact. The scores of the subjects were divided into six groups according to the length of service that subjects reported in the questionnaire. Each of the six dependent variable sets was examined for significant outliers using the interquartile range (IQR), tested for normality with the Shapiro-Wilk test (
p < 0.05), and tested for homogeneity of variances with Levene’s test (
p < 0.05). All the required assumptions to run one-way ANOVA were met. The results reported in
Table 7 show that the length of service did not significantly (
p < 0.05) affect the level of professional skepticism measured with HPSS in the researched subjects.
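The core of this procedure can be sketched as follows: an IQR-based outlier screen followed by the one-way ANOVA F statistic. The 1.5 × IQR fence, the quartile method (the default of Python's `statistics.quantiles`), and the function names are assumptions, and the example scores are hypothetical. The Shapiro-Wilk and Levene pretests and the p-value lookup are omitted here because they need a statistics library or distribution tables.

```python
from statistics import mean, quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def one_way_anova_f(groups):
    """F statistic with between-group and within-group degrees of freedom."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of scores around their own group mean.
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical HPSS scores for three length-of-service groups.
scores = [[118, 121, 125, 119], [120, 117, 123, 122], [124, 119, 121, 126]]
for group in scores:
    print("outliers:", iqr_outliers(group))
print("F, df_between, df_within:", one_way_anova_f(scores))
```

The resulting F value would be compared with the critical value of the F distribution for the given degrees of freedom at the 0.05 level.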
We repeated a similar analysis twice more, for undergraduate subjects and graduate subjects separately. As all sets of dependent data met the assumptions required to run ANOVA (no outliers, normality, and homogeneity of variances), the analysis was carried out in order to verify whether there were statistically significant differences in mean scores between the groups of subjects with various lengths of service. The results of both analyses reported in
Table A8 and
Table A9 support the expectation that length of service has no significant effect (p < 0.05) on the professional skepticism level. It should, however, be mentioned that, for the final-year graduate group, the scores of subjects with no professional experience were removed from the data set analyzed with ANOVA. As reported in
Table 5, there is a statistically significant difference in the mean scores between groups of subjects with and without professional experience. The impact of the absence of professional experience was already tested (
Table 6,
Table A6, and
Table A7), so we did not repeat that analysis. As none of the results reported in
Table 7,
Table A8, and
Table A9 show a statistically significant impact of the length of service on the mean scores of the subjects, we did not carry out further analyses controlling for this independent variable.
We predicted that the age of researched subjects would have no significant impact on the average professional skepticism level measured with HPSS. In order to verify this expectation, one-way ANOVA (
p < 0.05) was carried out for the mean scores of the subjects divided into age groups, as in
Table 1. As with the previous uses of ANOVA, we pretested the data sets for outliers, normality, and homogeneity of variances to confirm that the one-way analysis of variance could be used. The results show that there are no statistically significant differences in the mean score between the groups of subjects of different ages. The detailed results are reported in
Table 8.
We also ran a one-way ANOVA (
p < 0.05) separately for the undergraduate and graduate student groups. The initial pretests showed that all data sets met the assumptions required for such analysis. The results of both analyses are reported in
Table A10 and
Table A11. No statistically significant differences in the mean scores between the groups of different ages were found. Consequently, a controlled analysis was unnecessary.
We also expected that the future profession subjects wished to follow would have very limited influence on the professional skepticism level measured with HPSS. Therefore, we predicted that it would have no statistically significant impact on the subjects’ mean scores. To analyze whether there is a relation between this independent variable and the mean scores, we used one-way ANOVA (
p < 0.05), dividing subjects into three independent groups with respect to the future profession they wished to follow. We pretested the data sets for outliers, normality, and homogeneity of variances. All the assumptions required for implementing ANOVA were met. The results presented in
Table 9 support this expectation, as there is no statistically significant relation between the dependent and independent variables.
In the next stage, we carried out a one-way ANOVA (
p < 0.05) separately for both the first-year undergraduate students and the final-year graduate students after pretesting the assumptions. The results, reported in
Table A12 and
Table A13, also show that the future profession subjects wished to follow has no statistically significant impact on the mean scores. Therefore, controlled analyses were not performed.
In the final stage of analysis, we decomposed the mean scores obtained by each of the researched and control groups into the six characteristics that comprise the HPSS. With respect to H1, we expected that the level of at least one of these characteristics in the researched subjects, measured with HPSS, would significantly change over four years of education, compared to the control group’s scores. In
Table 10, we provide detailed results for the decomposed means, which confirm this expectation. The mean score for the “search for knowledge” characteristic changed significantly in comparison to the control group during the four years of university education. However, it should be noted that the significance of this change results exclusively from the fact that the average value of this characteristic for the group of management students fell from 24.8 in the first year of undergraduate studies to 22.9 in the final year of graduate studies. If only the group of accounting students is considered, the change in this characteristic within four years of studies is positive but statistically insignificant.
We also performed a similar analysis that extends the informative content of the results obtained for H1a and H1b. The detailed statistics are reported in
Table A14 and
Table A15. The comparison of accounting students following a standard academic program with those following an ACCA-accredited program shows significant changes in the mean scores during their four years of education for two characteristics: search for knowledge and a questioning mind. When the ACCA-accredited accounting program group is compared to the management students, four of the six characteristic levels measured with HPSS changed significantly over the four years of education in favor of the former: search for knowledge, interpersonal understanding, self-confidence, and a questioning mind.