3.1. Research Design and Sampling
This study employed a survey research design. The sample comprised 79 non-native CL students who had learned the CL at school. Because the study focused on those who had shown good achievement in learning the Chinese language, the sample was drawn from former SJKC non-native CL high achievers in Malaysia. Given the low percentage of high-achieving learners, who averaged about 800 students yearly, this study aimed for 10% of that figure (80 students) to respond to the questionnaire. In total, 79 students participated. The participants were then placed in two groups, namely ‘low proficiency’ and ‘high proficiency’.
Snowball sampling with volunteer participation was employed to select the sample. A list of potential participants was first identified by a few educators in Malaysia. The participants’ permission was then obtained to access their data, and those who did not agree to participate were excluded. Inclusion was on a voluntary basis and, most importantly, required consent to complete a Google Form questionnaire.
3.3. Data Analysis
Discriminant Function Analysis (DA) was used to analyse the obtained data. The DA serves the same purpose as multiple linear regression in that it predicts an outcome. However, multiple linear regression is restricted to scenarios where the dependent variable is an interval variable; its regression equation provides an estimated population mean of a numerical dependent variable for specified weighted combinations of independent variable values. The DA was therefore employed in this study because the dependent variable, ‘proficiency level’, was categorical.
This study used two proficiency levels, ‘low’ and ‘high’, to address research question one. To address research question two, six proficiency levels were measured. The first independent variable was motivation, which consisted of attitude, effort, and desire. The second independent variable was strategy, which consisted of basic writing, essay writing, and reading. These independent variables were the discriminators (the counterparts of the predictors in regression analysis). The high achievers were divided into two categories (‘low proficiency’ and ‘high proficiency’) in order to observe the factors that affect their proficiency.
In the DA, the independent variables are combined in weighted combinations to produce a single new composite variable, namely the discriminant score. The weights are chosen so that the discriminant scores separate the groups (low/high proficiency) as far as possible, and each case is classified into a group according to its score. A good DA model shows minimal misclassification, and the analysis detects the variables that contribute most to differentiating the groups.
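The idea of a weighted composite score can be illustrated with a minimal sketch of Fisher's two-group linear discriminant, restricted to two predictors so that it needs only the Python standard library. The scores, the function names, and the midpoint cutoff (which assumes equal group sizes and equal priors) are all illustrative assumptions; this is not the study's data or the software procedure it used.

```python
from statistics import mean

def pooled_cov_2d(groups):
    """Pooled within-group (co)variances for groups of (x1, x2) rows."""
    sxx = sxy = syy = 0.0
    n_total = 0
    for rows in groups:
        m1 = mean(r[0] for r in rows)
        m2 = mean(r[1] for r in rows)
        for x1, x2 in rows:
            sxx += (x1 - m1) ** 2
            sxy += (x1 - m1) * (x2 - m2)
            syy += (x2 - m2) ** 2
        n_total += len(rows)
    df = n_total - len(groups)
    return sxx / df, sxy / df, syy / df

def fisher_weights(low, high):
    """Weights w = S_pooled^-1 (mean_high - mean_low) for two predictors."""
    sxx, sxy, syy = pooled_cov_2d([low, high])
    det = sxx * syy - sxy ** 2
    d1 = mean(r[0] for r in high) - mean(r[0] for r in low)
    d2 = mean(r[1] for r in high) - mean(r[1] for r in low)
    return (syy * d1 - sxy * d2) / det, (-sxy * d1 + sxx * d2) / det

def discriminant_score(weights, row):
    """Weighted combination of the predictors for one case."""
    return weights[0] * row[0] + weights[1] * row[1]

# Made-up (attitude, effort) scores, for illustration only.
low = [(2.0, 3.0), (3.0, 2.0), (3.0, 4.0), (4.0, 3.0)]
high = [(5.0, 6.0), (6.0, 5.0), (6.0, 7.0), (7.0, 6.0)]

w = fisher_weights(low, high)
# Cutoff halfway between the two group mean scores (equal-priors assumption).
cutoff = (mean(discriminant_score(w, r) for r in low)
          + mean(discriminant_score(w, r) for r in high)) / 2

predicted = ["high" if discriminant_score(w, r) > cutoff else "low"
             for r in low + high]
actual = ["low"] * len(low) + ["high"] * len(high)
misclassified = sum(p != a for p, a in zip(predicted, actual))
```

In this toy data set the two groups are well separated, so no case falls on the wrong side of the cutoff; with real questionnaire data some misclassification would be expected.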
This was a simple discriminant analysis, with two groups in the dependent variable. A simple discriminant analysis yields a single set of results: one eigenvalue, together with its Wilks’ Lambda and beta coefficients. The number of sets is one less than the number of DV groups (provided there are at least as many independent variables), so two groups yield one set. In this analysis, the data obtained were the respondents’ demographic data and their questionnaire answers. The ‘proficiency level’ was a nominal variable indicating whether the learner was of high or low proficiency. The other variables were attitude, effort, desire, and writing strategies.
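As a brief sketch of how these quantities relate, the number of discriminant functions is the smaller of (groups minus one) and the number of predictors, and Wilks' Lambda is the product of 1 / (1 + eigenvalue) over the functions. The function names and the eigenvalue used below are hypothetical and are not results from this study.

```python
import math

def n_discriminant_functions(n_groups, n_predictors):
    """Number of discriminant functions: min(groups - 1, predictors)."""
    return min(n_groups - 1, n_predictors)

def wilks_lambda(eigenvalues):
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue)."""
    return math.prod(1.0 / (1.0 + ev) for ev in eigenvalues)

def canonical_correlation(eigenvalue):
    """Canonical correlation associated with one eigenvalue."""
    return math.sqrt(eigenvalue / (1.0 + eigenvalue))

# Two proficiency groups and six predictors give a single function.
n_functions = n_discriminant_functions(n_groups=2, n_predictors=6)

# Hypothetical eigenvalue, chosen only to make the arithmetic easy to follow.
example_lambda = wilks_lambda([1.0])
example_rc = canonical_correlation(1.0)
```

A Wilks' Lambda closer to 0 indicates that the function separates the groups well, while a value near 1 indicates little separation.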
To reiterate, the aim of the analysis was to identify whether these variables discriminate between the participants’ proficiency groups (low or high) and to examine, using group means and ANOVA, whether there were any significant differences between the ‘low’ and ‘high’ proficiency groups on each of the independent variables.
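The group-mean comparison for a single independent variable can be sketched as a one-way ANOVA F statistic computed by hand. The scores below are made up for illustration and the helper name is hypothetical; a statistics package would additionally report the p-value.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up 'effort' scores for the two proficiency groups.
low_effort = [2.0, 3.0, 4.0]
high_effort = [4.0, 5.0, 6.0]

f_stat, df_b, df_w = one_way_anova_f([low_effort, high_effort])
```

With two groups, this F test is equivalent to an independent-samples t-test on the same variable (F equals t squared).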
The 79 respondents were divided into two groups based on their self-reported CL competency, which was based on the Common European Framework of Reference (CEFR) categorization. Low proficiency refers to those who rated themselves A1, A2, or B1, while high proficiency refers to the respondents rating themselves B2, C1, or C2.
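The grouping rule described above can be written as a small lookup. The function and constant names are hypothetical, and the sample responses are illustrative rather than taken from the study.

```python
# CEFR levels assigned to each group, as described in the text.
LOW_LEVELS = {"A1", "A2", "B1"}
HIGH_LEVELS = {"B2", "C1", "C2"}

def proficiency_group(cefr_level: str) -> str:
    """Map a self-reported CEFR level to 'low' or 'high' proficiency."""
    level = cefr_level.strip().upper()
    if level in LOW_LEVELS:
        return "low"
    if level in HIGH_LEVELS:
        return "high"
    raise ValueError(f"Unknown CEFR level: {cefr_level!r}")

# Illustrative self-ratings only.
responses = ["A2", "b1", "B2", "C1"]
groups = [proficiency_group(r) for r in responses]
```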