#### *2.4. STEM Studies*

The areas of knowledge related to Science, Technology, Engineering, and Mathematics are known as STEM. These areas are directly related to the generation of innovations, improvements in competitiveness, and economic and social growth, which ultimately has a positive impact on well-being [74,75]. Thus, several studies have linked the creation of new technology-based firms focused on STEM areas with economic growth and development [4,76], since innovative entrepreneurship contributes significantly to value creation and the improvement of economic dynamics. This is due, among other things, to its job creation capacity [77]. This explains the growing interest in this type of entrepreneurship, not only within academia but also within national and supranational organizations such as the Organisation for Economic Co-operation and Development (OECD) or the United Nations, which have developed programs focused on attracting and retaining people within STEM areas [21].

However, few studies have related STEM areas to EI [78]. Some select samples of business vs. non-business students [79], others use samples from the field of engineering [28], and very few compare both groups [80,81]. Likewise, few studies treat the degree subject, expressed as the STEM/non-STEM dichotomy, as an element to consider when measuring the level of EI [78]. Given the interest in the STEM collective and its potential in territorial development, this study proposes, in an exploratory way, the existence of a causal relationship between the degree subject and the level of EI. Likewise, combining this condition with others in the analysis will reveal specific profiles that characterize potential entrepreneurs and thus offer useful information for policymakers. In this sense, the relationship between gender and STEM should be highlighted, since, although women are underrepresented in this area in terms of entrepreneurship, "women entrepreneurs have a 5% greater likelihood of innovativeness than men" [82] (p. 9), which reflects the potential of this group.

#### *2.5. Family Entrepreneurial Background (FEB)*

The literature offers evidence that an individual is more likely to take an interest in entrepreneurship if they come from a family in which other members have undertaken entrepreneurial projects [83,84]. Furthermore, various studies find that entrepreneurs often come from entrepreneurial families [85,86], which makes the option of self-employment more attractive [87,88] and increases EI [89]. Thus, the family environment can function as an antecedent of entrepreneurship [24].

The influence of the family, and more specifically in the case of family entrepreneurs, has been studied from a sociological perspective in which entrepreneurs make their social capital and social networks available to potential entrepreneurs [90,91]. However, this is not the only way in which the FEB exerts an influence on individuals, since having entrepreneurial relatives also serves as a learning model for individuals, improving their attitudes and behaviors related to entrepreneurial activity [92]. In this way, the values and norms of close family members can determine the EI of individuals [93]. Thus, parents act as role models for their children [92]. This means that, for example, these children have had more experiences related to proactivity, risk-taking, and innovation [93]. In this context, Marques et al. [93] highlighted that the learning processes that take place in the family environment favor and reinforce the appearance of strong attitudes and intentions related to entrepreneurship, since children who have grown up in business environments have more learning experiences related to entrepreneurship. Thus, living in a family with a business background makes individuals progressively enter the world of entrepreneurship [94] and offers the option of doing things differently, becoming a motivational factor for the child [95]. Mexican culture is very collectivist, which implies that individuals interact regularly with their extended family members with whom they maintain strong ties [96,97]. As a consequence, to study the influence of the FEB, in this research the extended family model was used [98], including grandparents and uncles in the family unit.

Considering the evidence presented, this study proposes the existence of a causal relationship between the university students' FEB and their level of EI.

The theoretical framework allows the formulation of the following causal model and propositions:

> EI = f (COPER, CONOR, GEN, STEM, FEB)

**Proposition 1.** *None of the five causal conditions (COPER, CONOR, GEN, STEM, FEB) is necessary to merit a prediction of high levels of EI among university students*.

**Proposition 2.** *The five causal conditions form multiple configurations that are sufficient to predict a high level of EI among university students*.

#### **3. Materials and Methods**

A 23-question questionnaire (Appendix A) was administered. Its first part included academic (degree name) and socio-demographic (gender, age, household income, and family entrepreneurial background) items.

Following Liñán and Chen [26], the questionnaire included items for the analysis of the EI of university students, rated on a 1–3 Likert-type scale, where 1 meant "totally disagree" and 3 "totally agree". Following the Transparency International Corruption Perceptions Index [99,100], students were asked about their perception of corruption and their degree of normalization when assessing environmental corruption. These questions were rated on a 1–5 Likert-type scale, where 1 meant "totally disagree" and 5 "totally agree". In our study, we used 1–3 Likert scales to obtain the polarized information required for the EI variable and 1–5 Likert scales for the COPER and CONOR variables (Appendix B).

The students' responses to question 1 (degree name) were tabulated to obtain the variable type of university career (STEM or not STEM), generating a dichotomous variable (0, 1). Likewise, the students' responses regarding question 4 (household income) were used to divide the sample (*N* = 380) into two subsamples (medium and high level: *N* = 180; low level: *N* = 200), applying a threshold of MXN 11,600.
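The coding described in this paragraph can be illustrated with a minimal Python sketch. The degree list, the field names (`degree`, `household_income`), and the treatment of incomes exactly equal to the threshold are assumptions made for illustration only; the text gives just the MXN 11,600 threshold and the (0, 1) coding.

```python
# Minimal sketch of the STEM coding and the household-income split.
# STEM_DEGREES and the record field names are hypothetical; the paper
# does not list the degrees that were classified as STEM.

INCOME_THRESHOLD_MXN = 11_600  # threshold stated in the text

STEM_DEGREES = {"mechanical engineering", "mathematics", "computer science"}


def code_stem(degree_name: str) -> int:
    """Return 1 if the degree is treated as STEM, otherwise 0."""
    return 1 if degree_name.strip().lower() in STEM_DEGREES else 0


def split_by_income(respondents):
    """Split respondents into (medium/high income, low income) subsamples.

    Assumption: incomes equal to the threshold fall in the medium/high group.
    """
    hhi = [r for r in respondents if r["household_income"] >= INCOME_THRESHOLD_MXN]
    lhi = [r for r in respondents if r["household_income"] < INCOME_THRESHOLD_MXN]
    return hhi, lhi


# Illustrative records only, not data from the study.
sample = [
    {"degree": "Mathematics", "household_income": 15_000},
    {"degree": "Law", "household_income": 8_000},
]
for row in sample:
    row["STEM"] = code_stem(row["degree"])
hhi_sample, lhi_sample = split_by_income(sample)
```
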

The variables EI, COPER, and CONOR were constructed using the eigenvalues resulting from two exploratory factor analyses (EFAs) performed with the statistical software IBM SPSS (v24). The EI EFA was carried out on questions 6–10 (Appendix B), and the EFA on corruption perception and normalization of environmental corruption (COPER, CONOR) was carried out on questions 11–23 (Appendix B). The first EFA gave rise to one factor (questions 7–10), the eigenvalues of which were taken to configure the EI variable. The second EFA resulted in two factors (questions 12–15 and questions 20–23, respectively), whose eigenvalues were taken to configure the variables COPER and CONOR.

Table 1 reports on the values of the Bartlett sphericity test and the sample adequacy measure for the factors obtained after the analysis of EI and the perception and normalization of environmental corruption (COPER, CONOR). Appendix B reports the methodological detail of the EFAs carried out in this study.


**Table 1.** Exploratory factor analysis (EFA) results.

Source: authors' elaboration.

The correlation analysis between the variables EI and COPER corroborated a positive relationship (r = 0.170; *p* < 0.001). Table 2 reports on the descriptive statistics for the variables differentiated by degree type (STEM vs. not STEM) and by gender.

**Table 2.** Descriptive statistics.


Source: authors' elaboration.

The proposed model, designed to explain the result (EI), included five variables: two continuous variables (COPER and CONOR) and three categorical variables (GEN, STEM, FEB). To test the model (EI = f (COPER, CONOR, GEN, STEM, FEB)), we carried out a fuzzy-set qualitative comparative analysis (fsQCA).

fsQCA is a methodology designed for the systematic analysis of cases that allows researchers to find causal patterns that determine the result or outcome [101,102]. This methodology was originally designed for the analysis of small or medium-sized samples [103,104]. However, fsQCA imposes no mathematical limitation on its application to large samples, for which it also yields valid results [105,106]. This methodology is based on Boolean logic and allows the identification of necessity and sufficiency relationships between a set of independent variables (conditions or attributes) and the dependent variable (outcome). A condition is necessary when it must be present for the outcome studied to occur. The absence of necessary conditions implies the existence of multiple combinations of conditions that can give rise to the outcome studied. The factual analysis of the available cases makes it possible to identify the pathways followed by the profiles of students who manifest a high level of EI.

In addition, fsQCA allows the identification of counterfactual evidence, described as combinations of attributes that could occur and have not been observed within the available sample [102]. Therefore, fsQCA is an adequate methodology for the development and profiling of a theoretical scheme [107], as well as the evolution of multilevel theory [108]. In addition to facilitating the validation of hypotheses, it allows the generation of new knowledge based on the analysis of the different causal relationships of the observed phenomenon. This insight is especially useful in case studies in which the research field falls between diverse and complex theoretical frameworks, when the research areas frequently find contradictory evidence, or, as in our study, when the relationships examined are novel and little explored. In all cases, the research design must guarantee methodological control [109] and facilitate the understanding of the criteria applied throughout the process, especially in the calibration phase of the variables.

fsQCA is a variant of the original QCA methodology that is applied to fuzzy sets. These types of sets retain most of the essential mathematical properties of crisp sets [110]. Unlike other methodologies, fsQCA does not compare individual variables, but rather analyzes complete combinations of simultaneous conditions, and allows researchers to overcome the limitations of inferential statistical techniques [33]. Furthermore, this methodology captures the idea of causal asymmetry [105], because an attribute that occurs in one direction does not necessarily produce the same result as the same attribute in the opposite direction.

Following Ragin [102], fsQCA allows the decomposition of a single symmetric analysis into two different asymmetric set-theoretic analyses, one focused on sufficiency and the other on necessity. In addition, this methodology brings out the advantages of equifinality in the analysis [111], since the same phenomenon (outcome) can be explained through different combinations of attributes grouped in different causal configurations (pathways or recipes). fsQCA has been used in multiple areas of the social sciences (e.g., [112]), especially in management and entrepreneurship [30].

Our fsQCA analysis was performed with the fs/QCA 3.0 software, and the methodological procedure followed four sequential steps [102,105,113]:

1. Calibration: in this step, the variables (dependent and independent) must be calibrated using a logarithmic function so that they can be analyzed based on set theory. The calibration process involves recalculating all continuous variables so that they are integrated into the continuum 0–1, establishing thresholds that allow determining the membership of a value to one of the following three sets: (1) fully inside; (0.5) maximum ambiguity; (0) fully outside. Following Misangyi and Acharya [114], this study used the continuous variable calibration method with percentiles, with the percentiles proposed by Climent-Serrano et al. [115]: 90%, 50%, and 10%, to delimit the thresholds (1), (0.5), and (0) of the variables EI, COPER, and CONOR, respectively.
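The calibration step can be sketched in Python as follows. This is a minimal illustration of Ragin's direct calibration method, which we assume is what the fs/QCA software applies: scores are mapped to log-odds (+3 at the fully-inside anchor, 0 at the crossover, −3 at the fully-outside anchor) and then passed through a logistic function. Under this method the anchors yield memberships of roughly 0.95, 0.5, and 0.05 rather than exactly 1, 0.5, and 0. The function names and the example scores are hypothetical.

```python
import math


def percentile(values, q):
    """Percentile (q in 0..100) of a non-empty list, by linear interpolation."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def calibrate(x, full_in, crossover, full_out):
    """Direct-method calibration of a raw score into fuzzy-set membership.

    Assumes full_out < crossover < full_in (three distinct anchors).
    """
    if x >= crossover:
        log_odds = 3.0 * (x - crossover) / (full_in - crossover)
    else:
        log_odds = -3.0 * (x - crossover) / (full_out - crossover)
    # Clamp to avoid overflow in exp() for extreme scores.
    log_odds = max(-50.0, min(50.0, log_odds))
    return 1.0 / (1.0 + math.exp(-log_odds))


# Illustrative raw scores only, not data from the study.
raw_scores = [float(v) for v in range(101)]
anchors = (
    percentile(raw_scores, 90),  # fully inside
    percentile(raw_scores, 50),  # maximum ambiguity
    percentile(raw_scores, 10),  # fully outside
)
memberships = [calibrate(v, *anchors) for v in raw_scores]
```
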


Tables 3 and 4 show the calibration thresholds and descriptive statistics of the variables used in the model for the high household income (HHI) and low household income (LHI) subsamples.


**Table 3.** Calibration and descriptive statistics (HHI).

Source: authors' elaboration.

**Table 4.** Calibration and descriptive statistics (LHI).


#### **4. Results**

#### *4.1. Analysis of Necessary Conditions*

This study analyzed the EI of Mexican male and female university students, taking as a reference their perception (COPER) and normalization (CONOR) of environmental corruption, together with gender, university degree, and FEB. The analysis was carried out in two subsamples of students, grouped by the household income of their family units. The analysis of necessary conditions (Table 5) reported information on causal conditions that are necessary for the investigated outcome to occur. Ragin [102] accepted that a condition is necessary if its consistency exceeds the threshold of 0.9.
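As a rough illustration of the necessity test, the sketch below uses the standard set-theoretic consistency measure for necessity, the sum of min(X, Y) divided by the sum of Y, and compares it with the 0.9 threshold. The function names and the membership scores are hypothetical and are not taken from Table 5.

```python
def necessity_consistency(condition, outcome):
    """Consistency of 'condition is necessary for outcome': sum(min(x, y)) / sum(y)."""
    overlap = sum(min(x, y) for x, y in zip(condition, outcome))
    return overlap / sum(outcome)


def negate(memberships):
    """Membership in the absence of a condition (fuzzy negation, 1 - x)."""
    return [1.0 - m for m in memberships]


def is_necessary(condition, outcome, threshold=0.9):
    """Apply the 0.9 threshold attributed to Ragin in the text."""
    return necessity_consistency(condition, outcome) > threshold


# Illustrative calibrated memberships only, not data from the study.
ei = [0.9, 0.8, 0.2, 0.6]
feb = [1.0, 1.0, 0.0, 0.0]

presence = necessity_consistency(feb, ei)         # FEB present
absence = necessity_consistency(negate(feb), ei)  # FEB absent
```

In this toy example neither the presence nor the absence of the condition passes the threshold, which mirrors the kind of result the necessity analysis reports.
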


**Table 5.** Analysis of necessary conditions.

Source: authors' elaboration.

The analysis of necessary conditions reports that the presence (or absence) of any condition is not mandatory for a university student (male or female) to increase their EI, for both subsamples (high and low family income). This finding implies the existence of multiple combinations of factors that can lead to high levels of EI. To get closer to a more extensive knowledge beyond the observed reality, we must delve into the analysis of cases by conducting the sufficiency analysis. This fact confirms both Propositions 1 and 2.

#### *4.2. Analysis of Sufficient Conditions*

The consistency of the solutions (complex, intermediate, and parsimonious) was satisfactory. Table 6 reports the results of the analysis of sufficiency for the outcome (EI) for two subsamples (high and low household income). The relevant information was synthesized to understand the solutions of the model (consistency and coverage) and the characteristics that helped us to understand the EI of the different profiles of university students. Black circles indicate the presence of a condition, and white circles indicate its absence. The criterion proposed by Fiss [105] was followed to graphically represent the presence (or absence) of conditions in each pathway, reporting the "core conditions" with large circles and the peripheral conditions with small ones. When a condition is not marked (blank spaces), it indicates "don't care".

Appendices C and D offer more detailed information.

The solutions to the models (HHI and LHI) had an adequate level of consistency (HHI: 0.8301; LHI: 0.7991), higher than the minimum threshold required by Ragin [102]. Both subsamples took cut-off thresholds for the selection of cases in the "truth table" that were higher than 0.75, as established by the best practices in fsQCA [102].

The coverage levels varied depending on the subsample used. The HHI subsample had a coverage of 0.6992, and the LHI subsample had a coverage of 0.2926. In both cases, the minimum threshold of 0.25 was exceeded [102]. The differences in coverage between the two subsamples must be understood in the Mexican social context, in which opportunity-driven entrepreneurship is scarce among people with lower incomes, whose search for resources in many cases leads to necessity-driven entrepreneurship.
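The consistency and coverage figures discussed here follow the usual set-theoretic definitions for sufficiency, which can be sketched as below. A case's membership in a configuration is the minimum across its conditions (with absence taken as 1 − x); consistency is the sum of min(X, Y) over the sum of X, and raw coverage is the sum of min(X, Y) over the sum of Y. The example configuration and the case values are hypothetical and only loosely inspired by the pathways in Table 6.

```python
def configuration_membership(case, present, absent):
    """Fuzzy AND over a configuration: min of present conditions and negated absent ones.

    Assumes at least one condition is listed in `present` or `absent`.
    """
    terms = [case[c] for c in present] + [1.0 - case[c] for c in absent]
    return min(terms)


def sufficiency_fit(cases, present, absent, outcome="EI"):
    """Return (consistency, raw coverage) of a configuration for the outcome."""
    x = [configuration_membership(c, present, absent) for c in cases]
    y = [c[outcome] for c in cases]
    overlap = sum(min(a, b) for a, b in zip(x, y))
    return overlap / sum(x), overlap / sum(y)


# Illustrative calibrated cases only, not data from the study.
cases = [
    {"FEB": 1.0, "COPER": 0.8, "CONOR": 0.1, "EI": 0.9},
    {"FEB": 1.0, "COPER": 0.6, "CONOR": 0.3, "EI": 0.4},
    {"FEB": 0.0, "COPER": 0.9, "CONOR": 0.2, "EI": 0.7},
]

# Hypothetical pathway: FEB and COPER present, CONOR absent.
consistency, coverage = sufficiency_fit(cases, present=["FEB", "COPER"], absent=["CONOR"])
```
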


**Table 6.** Analysis of sufficiency for the outcome (EI).

(1) Black circles indicate the presence of a condition, and white circles indicate its absence. According to Fiss [105], large circles indicate "core conditions", small ones indicate peripheral conditions, and blank spaces indicate "don't care". (2) Given the exploratory nature of this study, the methodology was not conditioned by directionalities established a priori. Source: authors' elaboration.

The sufficiency analysis for the outcome (EI) in the HHI subsample reported six different profiles of university students with high EI. Men with HHI were concentrated in three different profiles (H1, H2, and H4). Pathway H1 explained the behavior of one in five students (unique coverage: 0.1971) and included an archetype of male students with a family background in business creation. They were students with COPER and without corruption normalization. The male students explained by the H2 pathway were not STEM students, and their profile was not permissive regarding environmental corruption either. However, this archetype of entrepreneurial students did not have a family background in the creation of companies, and their perception of environmental corruption was not decisive to explain their EI. One in five college students with HHI fit this archetype (unique coverage: 0.2289).

The H4 solution contributed to improving the understanding of the profile of male students pursuing STEM degrees, with a family background and who were characterized by not having COPER, but having corruption normalization.

The female students in STEM disciplines (H6) had a family background, the entrepreneurial culture was affected by their high COPER, and they were not flexible regarding the corrupt environment, because they did not normalize corruption, showing integrity in the values of social behavior. Solutions H3 and H5 (unique coverage: 0.0492 and 0.1029, respectively) explained the behavior of female students pursuing non-STEM degrees with a family background and high perception and normalization of corruption (H5) and, on the contrary, without a family background and without perception or normalization of corruption (H3).

The analysis of conditions carried out under the approach of core and peripheral conditions [105] reported important findings. The family background was a core condition for generating EI among STEM degree students, both men and women. In the case of men, it was also a core condition not to declare high levels of COPER (H4). In the case of women, conversely, the core sine qua non condition was the presence of COPER and, simultaneously, the absence of CONOR (H6). Women who clearly perceived the level of environmental corruption (core condition) developed high levels of EI when their perception of corruption was internalized and normalized (H5). Other relevant core conditions were the absence of normalization of corruption in the case of men (H1, H2, H3), which occurred particularly in non-STEM degree students (H2, H3), and the absence of an entrepreneurial background in the family of these students (H3).

The analysis of the subsample of students with LHI revealed the core condition for the presence of a family background for all the profiles identified. In this sense, the archetype of male students in STEM degree programs developed high levels of EI, relying on an archetype that does not perceive environmental corruption, precisely because it normalizes it (L5). Non-STEM female students did not normalize corruption, and in fact, their COPER was not significant in inhibiting EI (L4).

The L1 and L2 solutions for the LHI subsample helped define and understand the profiles of female students in non-STEM degree programs who do not perceive environmental corruption (L2) or do not normalize such corruption (L1). Finally, there was a highly relevant profile (consistency: 0.9163), whose causal configuration of attributes grouped male students in non-STEM degree programs that did have a high rate of COPER, which reflected the feeling of many young people in Mexico who do not support corruption, are aware of it, and try to keep on with their lives while assuming that corruption is a part of their context.

#### *4.3. Reliability and Robustness Fit*

A sensitivity test was performed following Skaaning [116] and Schneider and Wagemann [113]. The cut-off points for calibration were modified as suggested by Fiss [105] and Stevens [117]. The percentile that defined the fully inside point was reduced by 10%, and the point that defined the fully outside point was increased by 10%, establishing the following points for fully inside (80%), maximum ambiguity (50%), and fully outside (20%). This new sensitivity analysis involved applying a stress test to the model to validate its robustness fit.

A model has a robust and acceptable fit only if its consistency level remains within the range (+5%, −5%) after performing the stress test, and only if, simultaneously, the consistency of the three solutions (complex, parsimonious, and intermediate) exceeds the threshold of 0.75 and the minimum coverage of the three solutions is 0.25, according to the criteria established by Ragin [102].
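The acceptance rule stated above can be written as a small check. This is a sketch under stated assumptions: the deviation is taken as the relative change of each solution's consistency against its own pre-stress value, and values exactly at 0.75 or 0.25 are treated as acceptable; neither detail is specified in the text. The function name, dictionary keys, and example figures are hypothetical.

```python
def passes_stress_test(solutions, max_rel_deviation=0.05,
                       min_consistency=0.75, min_coverage=0.25):
    """Check the robustness criteria for every solution type.

    `solutions` maps a solution name to a dict with the keys
    'baseline_consistency', 'stressed_consistency', and 'stressed_coverage'.
    """
    for values in solutions.values():
        baseline = values["baseline_consistency"]
        stressed = values["stressed_consistency"]
        # Assumption: deviation is relative to the pre-stress consistency.
        deviation = (stressed - baseline) / baseline
        if abs(deviation) > max_rel_deviation:
            return False
        if stressed < min_consistency:
            return False
        if values["stressed_coverage"] < min_coverage:
            return False
    return True


# Illustrative figures only, not values from Tables 7 or 8.
example = {
    "complex": {"baseline_consistency": 0.83, "stressed_consistency": 0.82, "stressed_coverage": 0.60},
    "parsimonious": {"baseline_consistency": 0.81, "stressed_consistency": 0.82, "stressed_coverage": 0.65},
    "intermediate": {"baseline_consistency": 0.83, "stressed_consistency": 0.84, "stressed_coverage": 0.62},
}
robust = passes_stress_test(example)
```
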

Once the new calibration of the variables was carried out, the model was retested and the evidence found by the new analysis (Table 7) validated the robustness of the proposed model for analyzing the EI of university students in corruption environments.


**Table 7.** Stress test of the calibration process and methodological robustness of the proposed model for the high family income subsample.

Source: authors' elaboration.

The stress test result was positive and guaranteed the robustness of the fit of the model in the subsample of students with high family income. The three solutions generated in fsQCA (complex, parsimonious, and intermediate) had consistency values within the range (0.8140, 0.8716), and a maximum consistency deviation of −1.27%. For more detailed information, see Appendices E and F.

The stress test also reported that the subsample of students with low family income (Table 8) guaranteed the robustness adjustment of the model for the three fsQCA solutions (complex, parsimonious, and intermediate), with consistency values within the range (0.7591, 0.8391), and a maximum consistency deviation of +3.99%. For more detailed information, see Appendix F.


**Table 8.** Stress test of the calibration process and methodological robustness of the proposed model for the low family income subsample.

Source: authors' elaboration.

After applying the stress test, the solutions to the model for both subsamples remained stable, describing student profiles without significant changes (Appendices E and F). Furthermore, the Castelló-Sirvent [118] Robustness Coefficient was calculated to evaluate the model. In addition to the standards set by Ragin [101,102] (consistency ≥ 0.75; coverage ≥ 0.25), and according to Castelló-Sirvent [118], a model is robust only if it reports an adequate RC-value (RC ≥ 0.95) after the stress test carried out on the cut-off points (Appendix G).

The robustness of a model tested in fsQCA can be established by modifying one or more of the cut-off points (fully inside, maximum ambiguity, fully outside), and can be carried out by modifying the calibration, either with percentiles or manually. The stress test aims to identify the robustness of the model by analyzing the average variation of the consistency. If the stress test has been performed by recalibrating the variables, a threshold of ±10% or ±15% can be used. According to the RC-value, if it was recalibrated ±15%, the robustness of the model can be very strong (0.9900 ≤ RC ≤ 1) or strong (0.9500 ≤ RC ≤ 0.9899). If it was recalibrated ±10%, the robustness of the model can be strong (0.9900 ≤ RC ≤ 1), moderate (0.9500 ≤ RC ≤ 0.9899), or weak (0.9000 ≤ RC ≤ 0.9499). Standardized symbols are used together with the RC-value (\*\*\* very strong; \*\* strong; \* moderate) to indicate the robustness of the model tested with fsQCA [118] (Appendix G). In our study, the RC-value supported moderate robustness (RC = 0.9632 \*).
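The RC bands described above can be expressed as a simple lookup, sketched below. The function name is hypothetical, and the sketch only classifies an RC-value that has already been computed; the formula for the coefficient itself is not given in this section. As an assumption, each band is applied through its lower bound, so values that fall between the printed bands (for example, 0.98995) are assigned to the lower band.

```python
def classify_rc(rc, recalibration_pct):
    """Map an RC-value to the robustness label for a ±10% or ±15% recalibration.

    Returns None when the value falls below every band listed in the text.
    """
    if not 0.0 <= rc <= 1.0:
        raise ValueError("RC must lie between 0 and 1")
    if recalibration_pct == 15:
        bands = [(0.99, "very strong"), (0.95, "strong")]
    elif recalibration_pct == 10:
        bands = [(0.99, "strong"), (0.95, "moderate"), (0.90, "weak")]
    else:
        raise ValueError("only ±10% and ±15% recalibrations are described")
    # Bands are ordered from highest to lowest lower bound.
    for lower_bound, label in bands:
        if rc >= lower_bound:
            return label
    return None


# RC-value and recalibration reported in the text (±10% stress test).
label = classify_rc(0.9632, 10)
```

With the reported RC-value and a ±10% recalibration, the lookup returns "moderate", in line with the result stated in the text.
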
