Article

Analysis of Data from Surveys for the Identification of the Factors That Influence the Migration of Small Companies to eCommerce

by William Villegas-Ch. 1,2,*, Santiago Criollo-C 1, Walter Gaibor-Naranjo 3 and Xavier Palacios-Pacheco 4

1 Escuela de Ingeniería en Tecnologías de la Información, FICA, Universidad de Las Américas, Quito 170125, Ecuador
2 Facultad de Tecnologías de Información, Universidad Latina de Costa Rica, San José 70201, Costa Rica
3 Carrera de Ciencias de la Computación, Universidad Politécnica Salesiana, Quito 170105, Ecuador
4 Departamento de Sistemas, Universidad Internacional del Ecuador, Quito 170411, Ecuador
* Author to whom correspondence should be addressed.
Future Internet 2022, 14(11), 303; https://doi.org/10.3390/fi14110303
Submission received: 20 September 2022 / Revised: 14 October 2022 / Accepted: 20 October 2022 / Published: 25 October 2022
(This article belongs to the Special Issue Trends of Data Science and Knowledge Discovery)

Abstract

Medium and small businesses currently face a significant change in the way consumers purchase services, products, and goods. This change is largely due to the pandemic caused by the 2019 coronavirus disease, during which people were forced to use information and communication technologies to satisfy their needs and interact with others. After the pandemic, people’s dependence on technology increased to such an extent that the Internet has become the channel through which almost any product can be purchased quickly, with wide variety, from the comfort of home, and at any time. For companies, moving from the traditional market to eCommerce is therefore a necessity, but the change must take place efficiently. Identifying the factors that lead consumers to access a brand, a service, or a product is thus a central task in eCommerce. This paper presents an analysis of the factors that influence the use of electronic commerce. To this end, similar works were reviewed to design the surveys and to identify the critical points considered by consumers. These data were analyzed in a granular way with business intelligence tools to improve decision making in the migration to a digital market.

1. Introduction

The Internet is currently the most widely used means of communication in society. Its penetration and the accessibility of applications have made it a distribution channel that allows people and organizations to generate a wide variety of interactions, including commercial transactions. In addition, quarantines and isolation due to the 2019 coronavirus disease (COVID-19) pandemic caused an increase in the use of web applications [1]. These events have led to a sharp increase in business-to-consumer (B2C) as well as business-to-business (B2B) electronic commerce (eCommerce) [2]. Electronic commerce offers a competitive advantage for organizations and opens opportunities for companies that previously lacked access to these services due to a lack of infrastructure or the high cost of acquiring it [3]. In addition, the proper use of eCommerce eliminates intermediaries and distributors, creating a direct relationship between the client and the producer and reducing operating costs [4].
To succeed in the electronic market, organizations must carry out a complete analysis of its characteristics and needs and of how the Internet becomes a channel of influence between buyer and seller [5]. Several of the works reviewed addressed the use of the Internet from the vendor’s point of view. However, in the last two years, and mainly due to the pandemic, research on the electronic market has proliferated. These works sought to improve the customer experience by analyzing the use of the Internet and whether it is accepted as a shopping channel [6]. Before the pandemic, the lack of knowledge about market orientation and the effect of ICTs on customer behavior may have produced a distorted view in certain investigations, which gave greater importance to ICTs as support for business management and its infrastructures [7].
Technologies play an important role in the exchange, promotion, and sale of products and services, and their use is increasing. The use of eCommerce generates profits in the millions and represents a new form of business that is a driving force for the economic development of the business sector in both developed and developing nations [8]. The adoption of eCommerce by the global business sector is continuously growing; however, as organizations increase in size, electronic commerce becomes more complex and challenging [9]. Since the beginning of the 21st century, studies of electronic business have shown the importance of this topic worldwide, and authors such as [10,11] have highlighted its prominence, as reflected in the number of publications on the subject. It is currently considered a significant object of study due to its great impact on the economic and social development of humankind. In the literature review carried out for this study, several review articles on eCommerce and digital marketing were identified. The main topics addressed in these articles were the main publication regions, the most relevant journals, eCommerce adoption, and the benefits of eCommerce [12,13,14,15,16,17]. However, very few articles proposed methodologies that can be applied to small businesses (Lora and Segarra, 2013; Gutiérrez and Nava, 2016; Janita and Chong, 2013; Abed, Dwivedi and Williams, 2015; Sánchez and Juárez, 2017).
It is important to highlight that society has undergone an important change in the way in which it carries out its activities. People’s adaptation to computer technology has been somewhat natural. Large information technology (IT) providers have been able to identify the needs of organizations and individuals and to improve the accessibility and usability of their applications, extending their use to most environments [18]. This work proposes an analysis of the use of the Internet as a shopping channel to provide evidence of the reasons that lead consumers to engage in eCommerce and of those that deter them [19]. To this end, data available on the web and consumer surveys were collected to build a data warehouse. The stored data were processed to identify the factors that measure the motivation and demotivation to use eCommerce. This process is an integral part of a data analytics model that identifies several of the concepts used in eCommerce and converts them into dimensions that can be used by companies entering the digital world [20].
The purpose of this work was to establish a data analysis model that can be used by small- and medium-sized companies [21]. Since the pandemic, these companies have encountered the most difficulty in marketing their products and services. In addition, migrating to digital commerce is costly for these companies, which have been severely hit economically. Therefore, a model that identifies the factors that directly influence consumers to access their products through a digital channel is both motivating and useful for the proper management of resources in their migration to eCommerce.

2. Materials and Methods

The choice of the methodology was based on the identification of the factors that affect the decision of consumers to use eCommerce. To meet this objective, a review of similar works was carried out to determine the factors with the greatest influence on the use of eCommerce. In a second phase of the method, several surveys were developed for the acquisition of data on the motivation of the end-user to use the Internet as a digital channel for the evolution of the market. The data obtained were processed and integrated into an analysis stage with the use of statistical tools that identified patterns in the data and determined the contribution values of each factor to explain the phenomenon studied.

2.1. Preliminary Concepts

To identify the indicators that measure the motivation and demotivation behind the use of eCommerce by the consumer and the dimensions that make up each concept, it is important to establish the conceptual bases for the development of the method as detailed in the following subsections.

2.1.1. Current State of eCommerce

The years 2020–2021 presented challenges for the whole of society, since COVID-19 drastically changed how society develops and how individuals communicate [22]. To overcome store closures and social distancing, companies, especially retailers, had to adapt their business models. This boosted online shopping to unprecedented levels, which suggests that consumer behavior has probably changed permanently. This work was carried out in Ecuador, where eCommerce grew by 43.75% in 2020, with the volume of business reaching 2.3 billion USD, an increase of 700 million USD compared with 2019 [23].
These data mark the growth of eCommerce in Ecuador; however, most of the recorded activity takes place through channels that are already well established for acquiring products or services over the Internet [24]. This channel is mostly exploited by large corporations, which have a robust structure that covers eCommerce end to end, from the visualization and positioning of the brand and products to fulfillment [25]. Small businesses and retailers, by contrast, have several problems in positioning themselves and have sought other means, such as social networks, to gain some access to eCommerce [26].

2.1.2. eCommerce Architecture

Companies that are part of eCommerce have IT architectures that respond to the needs of the company and the portfolio of its customers. However, for retail companies that are in their first steps of migration to a digital channel, making use of a complete and scalable architecture can demand too many resources [27]. Therefore, it is important to establish an architecture that includes the basic components to start using the Internet to enter the digital market [28]. Figure 1 presents a basic eCommerce architecture that is tailored to the needs of small- and medium-sized businesses.
The main component is the use of the Internet as a communication channel and the web as a medium where products or brands are promoted. This is the point at which companies that manage eCommerce in a structured way differ most from small companies. Generally, large companies have design departments, or outsource this work, to build web pages that integrate complete catalogs of services or products aimed at consumers [7]. Retail companies without sufficient financial resources to establish a comprehensive structure for digital commerce use other means, such as social networks, to offer their products and position their brands.
The next component is the devices, among which the use of mobile phones stands out. Several studies carried out in Ecuador in 2020 analyzed the use of the Internet and the new consumption habits generated by the pandemic. Ecuador has a population of around 17.7 million inhabitants, of which 64.3% are residents of urban areas. The number of connected mobile devices in the country is around 13.8 million, which constitutes 77.8% of the population. Therefore, for retail companies, it is important to use these data to establish their target market, by also considering which are the most used devices to access social networks.
The next component considers the location of the product and the consumer, and this is a very important variable that must be considered, at both the seller and buyer levels [29]. Shipping costs and delivery time depend on this information. Aligned to this component within the architecture of eCommerce, the payment process is considered [30]. This is a key point that determines the loyalty of consumers to a brand or a company. Other models use digital platforms that work through an application, as is the case of companies that register with Uber Eats. This is a comprehensive platform that connects users, restaurant partners, and delivery partners to make food delivery an easy and hassle-free service. For their part, retail companies generally do not have a payment model; therefore, they manage this directly with the buyer, who does not have support in receiving the product once the payment is made.
The next component is fulfillment, which for retail companies represents a difficulty, since they depend on third parties for delivery. This stage is critical and not controlled by the seller and consumer, which can cause problems in receiving the product, both in terms of delivery times and the state of the product. To this is added an additional cost that the buyer must assume, without there being a guarantee on what they expect to receive [31].

2.2. Identification of the Factors That Determine the Use of eCommerce

To determine the indicators that measure consumer motivation behind the use of eCommerce and the factors that directly affect it, several indicators identified in the literature review were proposed. In addition, several direct queries for consumers were generated, and these queries were reflected in a survey format. The variables identified and those to be processed in the data analysis are presented as follows [32]:
  • Offer: In eCommerce, there is a greater offer of products or services;
  • Speed: When generating a search on a digital platform, it provides a quick view of all the products of interest that are offered;
  • Hours: Unlike traditional commerce, when using the Internet, products, brands, and services are generally available 24/7, providing greater flexibility;
  • Convenience: The possibility of placing orders by mobile means and from anywhere provides a unique feature for the consumer;
  • Privacy: The use of the Internet as a communication channel provides greater privacy and intimacy when making a purchase;
  • Environment: The experiences of other people count as a fundamental part for the consumer to access eCommerce.
In addition to the indicators identified in similar works, there are other indicators that must be obtained through direct consumer surveys [33]. It is also possible to access data from different platforms or social networks that help to determine the negative points of eCommerce [34]. The survey was designed for a population of 500 people; however, for the presentation of the data, a representative sample was calculated, with the population size used as an input to the formula. This was done to reduce the number of respondents and to be able to use and present objective data in this work. Equation (1) presents the formula for calculating the representative sample, where the parameters are as follows:
  • N = 500, population size;
  • Z = 1.96, z-score corresponding to a 95% confidence level (statistical parameter);
  • p = 50%, percentage of the population that has the desired attribute;
  • q = 50%, percentage of the population that does not have the desired attribute;
  • e = 3%, maximum accepted estimation error.
Representative sample size calculation:
$$ n = \frac{N Z^{2} p q}{e^{2} (N - 1) + Z^{2} p q} \tag{1} $$
The calculation yielded a representative sample of 217.49 individuals; in this work, the data of 217 individuals were considered and were used in the following stages of the analysis.
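As a quick check of Equation (1), the following minimal sketch evaluates the formula in Python. The function name `sample_size` is illustrative and not part of the original study. Under the listed values of N, Z, p, and q, a maximum error of e = 5% reproduces the reported 217.49, whereas e = 3% gives about 341, so the sketch evaluates both for comparison.

```python
def sample_size(N, Z, p, q, e):
    """Finite-population sample size, Equation (1):
    n = N * Z^2 * p * q / (e^2 * (N - 1) + Z^2 * p * q)
    """
    numerator = N * Z**2 * p * q
    denominator = e**2 * (N - 1) + Z**2 * p * q
    return numerator / denominator


# Parameters as listed in the text, evaluated for two error margins.
n_e3 = sample_size(N=500, Z=1.96, p=0.5, q=0.5, e=0.03)
n_e5 = sample_size(N=500, Z=1.96, p=0.5, q=0.5, e=0.05)

print(f"e = 3%: n = {n_e3:.2f}")  # about 340.69
print(f"e = 5%: n = {n_e5:.2f}")  # about 217.49, the value reported in the text
```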
The survey was conducted with the purpose of obtaining relevant answers in the following areas:
  • Experience;
  • Accessibility;
  • Compliance;
  • Privacy;
  • Compensation;
  • Contact;
  • Loyalty;
  • Intention to repurchase;
  • Perceived value.
Figure 2 presents a flowchart where each of the stages of the methodology for the analysis of the indicators that affect the use of eCommerce is explained. In the first phase of the method, the variables that were part of the analysis were identified; in this instance, the variables discovered in the previous works reviewed were considered. These data were processed in the search of patterns that allowed us to determine the factors that directly affect the use of eCommerce [35]. The results obtained were validated, and if they answered the question posed by the interested area or individual, the information was presented in control panels, guaranteeing the understanding of the information.

3. Results

For the development of this work, two types of variables were considered: those identified through the review of previous works and those identified through surveys. The data were obtained to represent a population of 500 people; however, the representative sample calculation was applied to this population to limit the volume of data collected while preserving the validity of the results.

3.1. Mechanism for Data Collection

For data collection, various sources can be used to determine the factors with the greatest influence on the implementation and use of eCommerce. Data sources are varied and depend on the type of analysis to be carried out; for example, social networks are current media that generate large volumes of data [36]. However, the extraction process requires more knowledge and trained personnel for the implementation of the analysis structure [37].
Companies that start an eCommerce integration process can choose to generate their own data sources, with surveys being among the fastest mechanisms for the generation of data and the identification of the degree of customer satisfaction. In this work, a survey model was developed that included questions related to the factors that affected the consumer in the use of digital channels and eCommerce, for example:
  • Consumer experience with online stores;
  • Accessibility;
  • Catalogs;
  • Compliance;
  • Privacy;
  • Compensation;
  • Contact;
  • Medium;
  • Satisfaction;
  • Loyalty;
  • Repurchase intention;
  • Shipment;
  • Perceived value.
In addition to these factors, it was necessary to identify information that allows companies to profile consumers in terms of gender, age, academic level, experience in online shopping, number of online purchases in each period, etc. These factors were added as variables in the analysis to help companies to segment their population according to their line of business.
Surveys seek to determine the level of acceptance of a brand, product, or service. Generally, survey questions are designed to elicit as much real information from consumers as possible, so they need to be clear, direct, and objective. The responses were expressed as levels of acceptance or rejection, so that respondents would not become annoyed or uncomfortable in a way that compromised the veracity of the answers. Below are the questions used in the survey, where 1 means “Totally disagree” and 5 means “Totally agree”:
  • It is easy to find what I need on the website [1,2,3,4,5];
  • It is easy to access any section of the website [1,2,3,4,5];
  • The website allows me to complete a transaction quickly [1,2,3,4,5];
  • The information on the website is well organized [1,2,3,4,5];
  • Pages download quickly [1,2,3,4,5];
  • The website is easy to use [1,2,3,4,5].
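The 1 to 5 response scale above could be encoded and checked before analysis along the following lines. This is only a sketch: the question keys (`easy_to_find`, `easy_to_use`) and the sample answers are invented for illustration and are not taken from the survey data.

```python
from statistics import mean

SCALE = range(1, 6)  # 1 = "Totally disagree", 5 = "Totally agree"


def validate(response):
    """Return True if every answer is an integer on the 1-5 scale."""
    return all(isinstance(a, int) and a in SCALE for a in response.values())


def question_means(responses):
    """Mean score per question, ignoring responses with out-of-scale answers."""
    valid = [r for r in responses if validate(r)]
    if not valid:
        return {}
    return {q: mean(r[q] for r in valid) for q in valid[0]}


# Hypothetical answers; the third response is discarded (7 is off the scale).
responses = [
    {"easy_to_find": 4, "easy_to_use": 5},
    {"easy_to_find": 2, "easy_to_use": 3},
    {"easy_to_find": 7, "easy_to_use": 3},
]
means = question_means(responses)
print(means)
```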

3.2. Application of the Methodology

As an initial step of the methodology, it was important to establish the current state of Internet use and other statistics that describe the environment where the methodology was applied. Because this work was carried out in Ecuador, it used data provided by other investigations [38]. These data pertain to 2021, when the total population was 17.77 million, of which 64.3% were residents of urban areas. Mobile phones have become the most used devices for Internet access, and there were 13.82 million connected mobile devices, equivalent to 77.8% of the population.

3.2.1. Variable Identification

In analyzing the factors that determine electronic commerce, there are a variety of variables that can be considered. In this work, the variables that were considered in the reviewed works were included, and these were:
  • Comfort;
  • Schedules;
  • Speed;
  • Environment;
  • Privacy;
  • Offer.
In this work, the variables were treated as categories, and several questions were created to evaluate each of them. Once the questions were generated, the survey was sent to 217 individuals, i.e., the representative sample. Table 1 presents a sample of 20 of the responses obtained; the first 20 records were simply taken for the table.

3.2.2. Analysis

In this stage, a factor analysis of the data obtained in the 217 surveys was carried out. Several tools, such as SPSS and Excel, can be used to perform a factor analysis and determine the level of influence of each variable. The important thing is to know exactly what the objective of the factor analysis is [39].
The analysis was carried out in SPSS to verify, through a factor analysis, the components that affected the use of eCommerce. Table 2 shows the results obtained through a dimension reduction model. To assess whether a factorial model was appropriate, the Kaiser–Meyer–Olkin (KMO) measure and Bartlett’s sphericity test were used. The KMO measure assesses the partial correlations among variables, while Bartlett’s test checks whether the correlation matrix is an identity matrix. KMO takes values between 0 and 1; according to the works reviewed, a KMO coefficient must be greater than 0.6 to be adequate. The coefficient obtained was 0.788, so the items were taken as adequate, and the analysis continued. Bartlett’s test yields a chi-square value whose sampling distribution gives the probability of error when rejecting the null hypothesis. The significance value is expected to be less than 0.005, which was fulfilled in the analysis, with a value less than 0.001; this indicated that the correlation matrix differed significantly from the identity matrix.
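To make the two statistics concrete, the sketch below computes the overall KMO measure and Bartlett’s chi-square statistic from a correlation matrix using only the Python standard library. It is not the SPSS implementation used in this work: the helper names are hypothetical, the 3 × 3 matrix `R` is invented for illustration, and only the sample size of 217 comes from the text. The sketch uses the standard definitions, in which the partial correlations are obtained from the inverse of the correlation matrix and Bartlett’s statistic is −(n − 1 − (2p + 5)/6) ln|R| with p(p − 1)/2 degrees of freedom.

```python
import math


def invert_with_det(A):
    """Gauss-Jordan inverse and determinant of a small square matrix."""
    n = len(A)
    # Augment A with the identity matrix.
    M = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    det = 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))  # partial pivoting
        if abs(M[piv][c]) < 1e-12:
            raise ValueError("matrix is singular")
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            det = -det
        det *= M[c][c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [row[n:] for row in M], det


def kmo(R):
    """Overall KMO: sum r_ij^2 / (sum r_ij^2 + sum a_ij^2) over i != j,
    where a_ij are the partial correlations derived from the inverse of R."""
    inv, _ = invert_with_det(R)
    r2 = a2 = 0.0
    for i in range(len(R)):
        for j in range(len(R)):
            if i != j:
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                r2 += R[i][j] ** 2
                a2 += partial ** 2
    return r2 / (r2 + a2)


def bartlett_sphericity(R, n_obs):
    """Bartlett's chi-square statistic and its degrees of freedom."""
    p = len(R)
    _, det = invert_with_det(R)
    chi2 = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(det)
    return chi2, p * (p - 1) // 2


# Invented correlation matrix (NOT the survey data); n_obs = 217 as in the text.
R = [[1.0, 0.5, 0.5],
     [0.5, 1.0, 0.5],
     [0.5, 0.5, 1.0]]
kmo_value = kmo(R)
chi2, dof = bartlett_sphericity(R, n_obs=217)
print(f"KMO = {kmo_value:.3f}")              # about 0.692 for this toy matrix
print(f"chi2 = {chi2:.2f}, dof = {dof}")     # about 148.45 with 3 degrees of freedom
```

The significance value is then the upper-tail probability of `chi2` under a chi-square distribution with `dof` degrees of freedom, which a statistics package can supply.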
Table 3 shows the values obtained in the anti-image matrices; the anti-image correlation matrix contains the negatives of the partial correlation coefficients, and the anti-image covariance matrix contains the negatives of the partial covariances. For a factor model to be considered good, most of the off-diagonal elements must be small. In the covariance matrix, the values were good and exceeded the 0.6 established for KMO, except for the value obtained for speed, which was lower, at 0.386; however, in the correlation matrix, it had a value of 0.777. A similar result was obtained for the “environment” item. In addition, the positive values closest to 1 formed a diagonal, and the values above the diagonal were identical to those below it, which meant that the matrix was symmetric.
Table 4 shows the communalities, which measure the percentage of variance in a variable explained by all the factors together and can be interpreted as the reliability of the indicator. The results in the table show that the variables of comfort and privacy had lower reliability. This is because the amount of variance in all variables is explained by the eigenvalue, and if a factor has a low eigenvalue, then it contributes little to explaining the variance in the variables. Another low value was that of schedules. These three parameters had to be analyzed in depth to determine their degree of influence on the research question.
Table 5 shows the matrix of the total explained variance. The results indicated that the final solution was explained by two factors or dimensions. These two factors represented 70.016% of our accumulated variance, with each factor being greater than 40% of the accumulated variance, with this being an adequate value for analysis. The headers are detailed at the end of the table.
Figure 3 shows the scree plot, which represents the magnitude of the eigenvalues and helps to identify the number of factors to extract. The eigenvalues that explain most of the variance are located on the left side, forming a slope. The graph helped us to visually determine the optimal number of factors.
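The relationship between eigenvalues, explained variance, and the number of retained factors can be sketched as follows. The eigenvalues below are made up for illustration and are not those of Table 5, and retaining factors with an eigenvalue above 1 (the Kaiser criterion) is an assumption here, since the text does not state the retention rule that was applied. For a correlation matrix, the eigenvalues sum to the number of variables, so each factor explains its eigenvalue divided by that sum.

```python
def explained_variance(eigenvalues):
    """Rows of (factor, eigenvalue, % of variance, cumulative %)."""
    total = sum(eigenvalues)
    rows, cumulative = [], 0.0
    for k, ev in enumerate(eigenvalues, start=1):
        pct = 100.0 * ev / total
        cumulative += pct
        rows.append((k, ev, pct, cumulative))
    return rows


# Hypothetical eigenvalues for six variables, sorted in descending order
# as they would appear from left to right in a scree plot.
eigenvalues = [2.5, 1.7, 0.7, 0.5, 0.4, 0.2]
table = explained_variance(eigenvalues)

# Assumed retention rule: keep factors whose eigenvalue exceeds 1.
retained = [k for k, ev, _, _ in table if ev > 1.0]

for k, ev, pct, cum in table:
    print(f"factor {k}: eigenvalue={ev:.2f}  variance={pct:.2f}%  cumulative={cum:.2f}%")
print("retained factors:", retained)
```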
Table 6 shows the matrix of rotated factors, with which each item belonging to or influencing each factor was determined. As a characteristic identified in the review of previous works, the minimum number of items for each factor must be three to four to be considered valid. In this case, the analysis generated two factors, which the average and speed items directly affected, as can be seen in the table. When there are items that affect two or more factors, these can create false incidence values or can subtract degrees of influence from other items. Therefore, it is recommended to eliminate the shared items, generate the calculation of the rotated matrix again, and verify the degrees of influence of each item on the generated factors.
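The check for shared items described above can be sketched as a simple scan of the rotated loadings. The loadings below are invented for illustration, and the salience threshold of 0.4 is an assumption, since the text does not state which cutoff was applied; the function name is likewise hypothetical.

```python
def cross_loading_items(loadings, threshold=0.4):
    """Return items whose absolute loading reaches the threshold
    on more than one factor (candidates for removal)."""
    flagged = []
    for item, values in loadings.items():
        salient = [f for f, v in enumerate(values, start=1) if abs(v) >= threshold]
        if len(salient) > 1:
            flagged.append(item)
    return flagged


# Hypothetical rotated loadings on two factors (not the values of Table 6).
loadings = {
    "offer":   [0.85, 0.10],
    "privacy": [0.78, 0.15],
    "comfort": [0.66, 0.21],
    "speed":   [0.55, 0.52],   # loads on both factors
    "hours":   [0.12, 0.88],
}
shared = cross_loading_items(loadings)
print(shared)  # ['speed']

# After removing the shared items, the rotation would be recomputed
# on the remaining items before interpreting the factors again.
remaining = {k: v for k, v in loadings.items() if k not in shared}
```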
According to the results obtained, factor one had the greatest influence on the use of eCommerce, and the perception of the clients was related to the offer item. This result is plausible, considering that the greater the offer of products and the possibility of comparing products in the digital market, the greater the attraction for the consumer. In factor two, the item with the greatest influence was hours, with these being an advantage for traditional companies. Another item identified in factor one was privacy, which is currently a concern due to the large volume of computer attacks that seek to obtain, modify, or delete user data. This item had a great impact on the use of eCommerce by people; therefore, it is important to establish models and architectures for the migration to a digital environment that comply with all information security and privacy measures. To validate the analysis, it is recommended to use Cronbach’s alpha coefficient. Cronbach’s alpha is a reliability indicator for psychometric scales that provides a measure of the internal consistency of the items that make up a scale. A high value is evidence of the homogeneity of the scale, that is, that the items point in the same direction.
For example, in Table 7, the calculation results of Cronbach’s alpha are shown, and the data considered were obtained from the above-mentioned 217 records; however, for the example in the table, only the first 20 are presented. In the table, the dotted spaces mean that there were multiple records in the original repository. The last row contains the results of the calculation of the variance based on the entire population. The total column is the sum of all the values of each of the categories that were included in the analysis.
In the calculation of Cronbach’s alpha, the variances of each category were used plus the total variance shown in the last column. These data were applied using the formula presented below:
$$ \alpha = \frac{k}{k - 1} \left( 1 - \frac{\sum_{i=1}^{k} S_i^{2}}{S_t^{2}} \right) $$
where the following apply:
  • S_i^2 is the variance of item i;
  • S_t^2 is the variance of the total observed values;
  • k is the number of questions or items.
The results of Cronbach’s alpha are presented in Table 8. The procedure consisted of three calculations or iterations, and iteration 1 presents the result of the calculation with the six items. According to the values obtained for the variances, three values were identified as significant for the analyses. These were comfort, privacy, and offer; the variances of these items indicated that they could affect Cronbach’s alpha coefficient. Therefore, in iteration 2, the offer data were eliminated, and the result increased to more than 0.8, which indicated that the analysis was more representative without this item. The process was repeated with each item that presented variance values close to zero; however, removing the privacy item reduced the alpha coefficient, which meant that it was a parameter with a great influence on the analyses. The opposite happened when we removed the comfort item; without this value, the coefficient was closer to one and exceeded 0.8, which implied that the analysis was better and had greater reliability.
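The calculation of Cronbach’s alpha and the item-removal procedure described above can be sketched as follows. The responses are invented for illustration (the survey data are not reproduced here), the function and item names are hypothetical, and population variances are used because the text states that the variance was computed over the entire population.

```python
from statistics import pvariance


def cronbach_alpha(rows):
    """Cronbach's alpha: k/(k-1) * (1 - sum(S_i^2) / S_t^2).
    rows: one sequence of item scores per respondent."""
    k = len(rows[0])
    item_vars = [pvariance([r[i] for r in rows]) for i in range(k)]
    total_var = pvariance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def alpha_if_item_deleted(rows, names):
    """Recompute alpha with each item removed in turn."""
    result = {}
    for i, name in enumerate(names):
        reduced = [list(r[:i]) + list(r[i + 1:]) for r in rows]
        result[name] = cronbach_alpha(reduced)
    return result


# Hypothetical 1-5 answers from five respondents to four items.
# item_d runs against the other three items on purpose.
names = ["item_a", "item_b", "item_c", "item_d"]
rows = [
    [4, 5, 4, 2],
    [3, 3, 4, 5],
    [5, 5, 5, 1],
    [2, 2, 3, 4],
    [4, 4, 4, 3],
]

alpha_all = cronbach_alpha(rows)
deleted = alpha_if_item_deleted(rows, names)
print(f"alpha with all items: {alpha_all:.3f}")
for name, value in deleted.items():
    print(f"alpha without {name}: {value:.3f}")
```

In this toy data set, removing `item_d` raises alpha sharply, which mirrors the logic of the iterations: an item whose removal increases the coefficient contributes little to the internal consistency of the scale, while an item whose removal lowers it is influential.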
This process can answer several research questions about small business migration to eCommerce. However, it is important to point out that this first analysis integrated only the variables identified in previous works. If the answer is not reliable enough, it is necessary to integrate a greater number of variables or data sources. Additional data sources can be of various kinds, as mentioned in the above sections. The most common external sources are surveys that are found on the web and are considered open data. Other possible sources are social networks, which contain a high volume of information that can be analyzed to understand the perception of users relative to a brand, product, or service.
In this work, several additional questions were added to evaluate new categories, and new surveys were designed to obtain data. The new categories added sought to identify all the possible causes that were marked as the main points for people to migrate from traditional markets to eCommerce.
Many categories can be included in this process; however, each of the new categories must be analyzed by each department in charge of market research. In this work, six additional categories were integrated. In each category, two items or questions were included to identify the perception of users. The questions for each category are presented below:
  • Accessibility:
    a. Is it easy to find what I need on the website?
    b. Does the website allow me to complete a transaction quickly?
  • Catalogs:
    a. Did you find enough product details?
    b. Do you usually find several alternatives for the same product?
  • Compliance:
    a. Do you deliver the order when promised?
    b. Does the website have reliable offers?
  • Support:
    a. Do you have an online customer service?
    b. How helpful was the customer service staff?
  • Satisfaction (consumer experience):
    a. Was my decision to buy on this website a wise one?
    b. Does the performance of this website meet my expectations?
  • Shipment:
    a. Did you receive your product at the shipping address?
    b. Has the order been delivered on time?
From the data analysis of the categories attached to the process, several results were obtained that guided the understanding of what consumers looked for in an eCommerce model. Table 9 shows the results; the KMO coefficient was 0.763, a value that allowed the analysis to be defined as valid, considering in addition that the significance value was less than 0.005, reaffirming its validity.
In Table 10, the anti-image values of covariance and correlation are presented. This analysis shows the incidence of each item and how it determines the different factors or dimensions considered for the evaluation of the data. The diagonal values of the correlation matrix exceeded 0.3; however, the values of certain items were close to this minimum, and these items could be eliminated to improve the response of the analysis. In this table, item 1 had a correlation value of 0.450; this low value was kept in mind so that it could be compared with the following tables and the item eliminated if its influence on the object of study was not relevant. In the table, the headings correspond to the questions in each category; these were replaced by the letter “P”, accompanied by a number as an identifier, for example, P1: Is it easy to find what I need on the website? This format was followed in all tables so that they fit the page format.
Table 11 presents the communalities, where items 1 and 2 had low extraction values of 0.012 and 0.044, respectively. For this analysis, an item was considered valid in communality processing only if its extraction value was greater than 0.400.
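The screening rule can be sketched as follows. The sketch assumes the usual relation for this kind of extraction, in which the initial communality of an item is its squared multiple correlation, that is, one minus the corresponding diagonal value of the anti-image covariance matrix (the diagonals of Table 3 and the initial values of Table 4 follow this relation). The diagonal values for P1 and P2 are taken from Table 10; the variable and function names are illustrative.

```python
# Diagonal of the anti-image covariance matrix for items P1 and P2 (Table 10).
anti_image_diag = {"P1": 0.965, "P2": 0.952}
# Extraction communalities quoted in the text for the same items.
extraction = {"P1": 0.012, "P2": 0.044}
THRESHOLD = 0.400  # minimum extraction value treated as valid in the text

def initial_communality(diag_value: float) -> float:
    """Assumed relation: initial communality = 1 - anti-image covariance diagonal."""
    return round(1.0 - diag_value, 3)

initial = {item: initial_communality(v) for item, v in anti_image_diag.items()}
flagged = [item for item, v in extraction.items() if v < THRESHOLD]

print(initial)  # {'P1': 0.035, 'P2': 0.048}
print(flagged)  # ['P1', 'P2']
```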
The analysis defined four factors to reach the percentage of the accumulated variance of 65.492%; this was a valid value to consider in the analysis. Table 12 shows the factors and the items that were directly aligned with each of them. The factors to be accepted within the analysis had to integrate at least three items. This was true in the matrix of rotated factors; however, each factor had to be reviewed, since there were values that did not influence the decision of consumers. The table shows how each item was related to each factor; in the case of items 1 and 2, these did not represent a relationship with any of the factors. In addition, other items did not have a great influence on the factors, so these were excluded, and the analysis was conducted again to verify the changes in the analysis. To validate the effects of the changes, the calculation of Cronbach’s alpha was used, where the least representative items were eliminated from the analysis.
Table 13 shows the results of Cronbach’s alpha. Interaction 1 showed the result with all the items; the result was less than 0.8 and could be considered valid for the analyses. According to the results of the matrix of rotated factors, in the second interaction, item 1, belonging to the accessibility category, was eliminated. The result obtained for Cronbach’s alpha improved the validity of the analyses, and the same process was carried out with P2. In interaction 3, there was an increase in the value of alpha; therefore, these items had little influence on the consumer’s decision. The items linked to the category of “accessibility” could be eliminated or replaced with others among those identified in the above sections. Another important fact to keep in mind is that in this work, the number of questions per category was limited to two, with a total of twelve questions corresponding to six categories. It is possible that by integrating a greater number of questions into the analysis, the interaction results would have greater precision. However, it was necessary to consider the guidelines in the design of surveys to guarantee the information collected.
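For reference, Cronbach’s alpha can be computed from the variance of each item and the variance of the summed score. Since the item variances behind Table 13 are not listed, the sketch below uses the Var.P row of Table 7 (the six variables of the first analysis); with those values it reproduces the Interaction 1 result of Table 8. The function name is illustrative.

```python
def cronbach_alpha(item_variances, total_variance):
    """Cronbach's alpha from per-item variances and the variance of the summed score."""
    k = len(item_variances)
    return (k / (k - 1)) * (1.0 - sum(item_variances) / total_variance)

# Population variances (Var.P row of Table 7) for the six variables.
item_variances = [1.854870564, 2.169678694, 2.159570176,
                  2.137484338, 1.630656841, 1.377858948]
total_variance = 33.66013294  # variance of the Total column in Table 7

alpha = cronbach_alpha(item_variances, total_variance)
print(round(alpha, 3))  # 0.796, in line with Interaction 1 in Table 8
```

Removing an item changes both the sum of item variances and the variance of the total, which is why alpha has to be recomputed after each elimination.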
The proposed method was applied by two small companies dedicated to the sale of school supplies. Their stock includes many products, such as markers, notebooks, pens, and scissors. During the pandemic, these companies were financially compromised, since their line of business was based on direct interaction with the consumer [40]. By applying the method and defining each of the factors that affected their clientele, these companies decided to scale to eCommerce. Three months after the application, the effectiveness was measured through a quick survey of the owners of the companies, who were asked to indicate the result achieved vs. the expected result. Both companies reported that the result achieved was between 70 and 75% vs. the 80% expected result after three months of execution. With these data, the effectiveness of the application of the method was identified as 87.5%.
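The 87.5% figure is consistent with dividing the lower bound of the achieved result by the expected result. This reading is an assumption, since the text does not state the formula; the function name is illustrative.

```python
def effectiveness(achieved_pct: float, expected_pct: float) -> float:
    """Achieved result expressed as a percentage of the expected result."""
    return 100.0 * achieved_pct / expected_pct

print(effectiveness(70, 80))  # 87.5  (lower bound of the reported range)
print(effectiveness(75, 80))  # 93.75 (upper bound of the reported range)
```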
For the evaluation of the results, a machine learning algorithm was applied, with the objective to train an AI model and compare the results obtained from the factorial analysis. The algorithm applied was a decision tree, where the 12 questions posed in the survey and the corresponding data were considered, and to this, the general query on the frequency of use of eCommerce was added.
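The decision tree used here is J48, Weka’s implementation of C4.5, which chooses each split by the gain ratio. The sketch below shows that criterion for a single nominal attribute. It is a simplified illustration with hypothetical toy data; it omits pruning and the handling of numeric thresholds and missing values, and it is not the configuration used in the study.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gain_ratio(feature_values, labels):
    """Information gain of splitting on a feature, divided by the split information."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(feature_values)
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical toy data: one survey answer (1-5) and the eCommerce-use class.
answers = [1, 1, 5, 5]
usage = ["never", "never", "always", "always"]
print(gain_ratio(answers, usage))  # 1.0: the answer separates the classes perfectly
```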
The validity of the models was checked through cross-validation, a very common verification method. N-fold cross-validation divides the training set into N parts, using N − 1 parts to train and one part to test. It repeats this N times and finally calculates the overall result. The dataset with the survey responses was configured for validation as follows: the 723 surveys were divided into two groups, where 70% of the data were used for training and the remaining 30% for tests. To measure quality, precision was used, which expressed the percentage of values that were correctly classified. The training dataset (70%) comprised 506 people, and the test data (30%) were derived from 217 surveys, which corresponded to the most representative population, as calculated in the above sections. When analyzing the model training dataset, the results presented in Table 14 were obtained. The total number of instances was 506, of which 75.4941% were classified as correct and 24.5059% as incorrect. The absolute error was 32.3%, and the root relative squared error was 72.38%.
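The partitioning described above can be sketched as follows. The helper names are illustrative, and the folds are formed from consecutive indices without shuffling or stratification, whereas the study reports stratified cross-validation, so this only illustrates the sizes involved.

```python
def holdout_sizes(n_total: int, train_fraction: float):
    """Sizes of the training and test partitions for a simple hold-out split."""
    n_train = round(n_total * train_fraction)
    return n_train, n_total - n_train

def k_fold_indices(n_samples: int, n_folds: int):
    """Yield (train_indices, test_indices) so that each sample is tested exactly once."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
                  for i in range(n_folds)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

print(holdout_sizes(723, 0.70))  # (506, 217)
folds = list(k_fold_indices(506, 10))
print([len(test) for _, test in folds])  # [51, 51, 51, 51, 51, 51, 50, 50, 50, 50]
```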
Table 15 presents the precision values for each class; it was observed that the enough and always classes had values of less than 60%, and these results were considered in the comparison with the test dataset. Each class represented the use of electronic commerce in the surveyed population; these criteria were enough, some, little, always, and never. Linearly, each class was separable from the other. This predicted which class the observations per question belonged to; therefore the dataset was considered as a multi-class classification model. The attributes corresponded to 12 questions; with these, a 10-fold cross-validation was performed with the use of a training set. Training is important because regardless of the algorithms that are included in the analysis, this process allows one to adjust the parameters and train the models to guarantee the results.
In the next stage, we worked with the test dataset that corresponded to the 217 instances mentioned above. As part of the performance evaluation process, the algorithm was run with two iterations; when it was executed, the percentage of correct classification decreased considerably. When applied with 5 total iterations, the results did not vary too much compared with those obtained with the 10 iterations used for the exercise; therefore, the iteration limit was selected as 10 to avoid overlearning.
Table 16 shows the stratified cross-validation, where the total number of instances was 217, which corresponded to the number of records entered; of these, 171 instances were classified as correct, a validity percentage of 78.8018%. This value was greater than 78%, an indicator that established that the process was sufficiently valid, with 21.1982% of instances being classified as erroneous. Similarly, the table details the values of different variables, such as Kappa, which had a value of 0.7347. This value is close to 1, so it was considered that there was strong agreement among the evaluations made using the algorithm.
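The Kappa statistic compares the observed agreement with the agreement expected by chance from the row and column totals of the confusion matrix. The sketch below uses a hypothetical two-class matrix, because the reported value of 0.7347 requires the full matrix of Table 18; the last line only reproduces the correctly classified percentage from the counts given in the text.

```python
def accuracy(matrix):
    """Fraction of instances on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total

def cohen_kappa(matrix):
    """Cohen's kappa: agreement beyond what the row and column totals give by chance."""
    total = sum(sum(row) for row in matrix)
    observed = accuracy(matrix)
    expected = sum(sum(matrix[i]) * sum(row[i] for row in matrix)
                   for i in range(len(matrix))) / total ** 2
    return (observed - expected) / (1.0 - expected)

# Hypothetical two-class confusion matrix (rows: actual, columns: predicted).
toy = [[20, 5],
       [10, 15]]
print(round(cohen_kappa(toy), 3))  # 0.4
print(round(100 * 171 / 217, 4))   # 78.8018, the correctly classified share
```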
Various metrics were used in this investigation to evaluate the performance of the algorithms. The classifier performance implies precision, error rate, recall, and F-Measure. Table 17 presents the detailed precision data by class.
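These per-class metrics can be derived from a confusion matrix as sketched below. The three-class matrix is hypothetical and is not taken from the study’s tables; the function name is illustrative.

```python
def per_class_metrics(matrix, labels):
    """Precision, recall and F-measure per class (rows: actual, columns: predicted)."""
    metrics = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column total
        actual = sum(matrix[i])                    # row total
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        f_measure = (2 * precision * recall / (precision + recall)
                     if precision + recall else 0.0)
        metrics[label] = (precision, recall, f_measure)
    return metrics

# Hypothetical three-class matrix used only to exercise the function.
toy = [[8, 2, 0],
       [1, 7, 2],
       [1, 1, 8]]
for label, (p, r, f) in per_class_metrics(toy, ["never", "some", "always"]).items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} F={f:.2f}")
```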
In the results obtained from the confusion matrix in Table 18, 46 instances were classified incorrectly, and 171 were correctly identified. In the class that never used eCommerce, there were 40 people classified as correct in the iteration, meaning that their answers in the survey corresponded to their real opinion. Among the erroneous classifications, there were five people, two of whom belonged to group c, two to group d, and one to group e. Of people who used eCommerce little, 29 were classified correctly, and 8 were incorrect. In the class something, there were 41 correct and 4 incorrect classifications. In the class enough, 25 classifications were correct and 21 incorrect; finally, in the class always, 36 people were classified as correct and 5 as incorrect. With these results, it was possible to determine for each class the number of effective surveys that allowed us to evaluate the use of eCommerce; regarding the number of incorrect classifications, a relatively low value was obtained, which could be considered in another analysis for the identification of the reasons why the answers did not represent reality in the categorized classes.
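The per-class counts quoted above can be turned into a share of correctly classified instances per class. The sketch assumes that the "incorrect" counts refer to instances that actually belong to each class (rows of the confusion matrix), which the text does not state explicitly.

```python
# (correct, incorrect) counts per class as quoted in the text.
counts = {
    "never": (40, 5),
    "little": (29, 8),
    "something": (41, 4),
    "enough": (25, 21),
    "always": (36, 5),
}

def correct_rate(correct: int, incorrect: int) -> float:
    """Share of a class's instances that were classified correctly."""
    return correct / (correct + incorrect)

for label, (ok, wrong) in counts.items():
    print(f"{label}: {100 * correct_rate(ok, wrong):.1f}%")

print(sum(ok for ok, _ in counts.values()))  # 171 correctly classified instances
```

Under this assumption, the class enough is the weakest, with little more than half of its instances classified correctly.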
The ROC area measurement is one of the most important values of the algorithm output. The optimal classifier had ROC area values close to 1; therefore, the survey or data collection instrument was taken as valid. In the next phase of the analysis, the SimpleKMeans algorithm was applied to cluster the data. Table 19 presents the results obtained, where two clusters were generated; these evaluated the data when the users indicated that they used an eCommerce channel at the “Something and Enough” level. In each cluster, the existing relationships were indicated by question, together with how these determined the factors behind the use of eCommerce in the selected population. The table shows that there were 13 attributes: the 12 questions that evaluated the six initial categories, plus the general query on the frequency of use of eCommerce. To differentiate the questions, each was labeled with its category and an identifier corresponding to the question number, for example, Accessibility-Preg1–Accessibility-Preg2; this was repeated with each question.
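The clustering step can be illustrated with a plain k-means (Lloyd’s algorithm) sketch. It is not Weka’s SimpleKMeans: here the first k points seed the centroids so that the run is deterministic, only numeric attributes are handled, and the Likert answers are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, n_iterations=10):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(n_iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                      for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment

# Hypothetical Likert answers (1-5) to three questions from six respondents.
answers = [(5, 4, 5), (1, 2, 1), (4, 5, 4), (2, 1, 2), (5, 5, 4), (1, 1, 2)]
centroids, assignment = k_means(answers, k=2)
print(assignment)  # [0, 1, 0, 1, 0, 1]
```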
In attribute analysis, best-first search with a forward search direction was used as the search method. The process generated 79 evaluated subsets, and the merit of the best subset found was 0.233, with question two of the satisfaction category being the attribute with the highest influence on the use of eCommerce. In addition, there were predictive attributes that were established as complementary and had a high incidence value to respond to the phenomenon under study; these attributes are shown below.
The selected attributes were 2, 5, 6, 9, 10, and 13 (six attributes):
  • Accessibility_Preg1;
  • Catalogs_Preg2;
  • Compliance_Preg1;
  • Support_Preg2;
  • Satisfaction_Preg1;
  • Shipment_Preg2.
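The attribute numbers and the names in the list are consistent under one particular ordering, sketched below as an assumption: attribute 1 is the frequency-of-use query (the name Frequency_of_use is hypothetical), followed by two questions per category in survey order.

```python
CATEGORIES = ["Accessibility", "Catalogs", "Compliance",
              "Support", "Satisfaction", "Shipment"]

# Assumed attribute order: index 1 is the frequency-of-use query,
# followed by two questions per category in the order of the survey.
attributes = ["Frequency_of_use"] + [f"{category}_Preg{q}"
                                     for category in CATEGORIES for q in (1, 2)]

selected_indices = [2, 5, 6, 9, 10, 13]  # 1-based, as reported
selected = [attributes[i - 1] for i in selected_indices]
print(len(attributes))  # 13
print(selected)
```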
Among the selected attributes, the analysis considered question 1 in the accessibility category as an attribute that had an impact on a user making use of eCommerce. This comparison was very useful to adjust the proposed method and guarantee the identification of the factors that influenced the use of eCommerce.
Once the parameters were adjusted and the results of the dataset process were verified, the model was re-evaluated. We reassessed the training dataset with the test dataset, and the averaged result allowed us to establish the accuracy of the model and define exactly which categories the respondents considered important when using eCommerce. Table 20 shows the results of the cross-validation matrix. In the re-evaluation, 74.1935% of the instances were classified as correct, which was less than the value obtained with the test dataset. Consequently, the error increased; therefore, it could be established that during the tests, it is better to use the test dataset with the data of the most representative population. These results established which categories responded to the study phenomenon.

4. Discussion

The development of this work was based on two essential characteristics: the first was the review of similar works, and the second characteristic was the collection of information that allowed us to determine the factors that influenced consumers to use eCommerce. The review of previous works was of great value, since eCommerce is a concept that has been developed for many years [41]. This means that there is a lot of information on the subject; however, with the pandemic, eCommerce has become a necessity for all companies. Therefore, the migration of the traditional market to the use of a digital channel is a necessity that must be analyzed under the new conditions given by the new normality [42].
The proposed method is not limited to determining the variables that affect the use of eCommerce. This work, being scalable, can identify the incidence values of each item in a survey that evaluates a factor [43]. With this feature, it is possible to design surveys that guarantee the quality of the data and capture the real impression of the consumer with respect to the digital market [44]. The application of the method and the results obtained made it possible to generate a comparison with related works. The main difference with respect to our proposal was the integral process that was carried out, since this work included the design of the survey and performed a validation analysis of this instrument; in addition, the incidence value of each question and how these could respond to the phenomenon under study were analyzed. Instead, several of the studies reviewed used online tools that directly presented the results [45], without necessarily validating the factors that influenced people’s perceptions of eCommerce [46].
The results identified that the accessibility category did not have a great influence on the decision of consumers regarding eCommerce. The matrix of communalities in Table 11 shows that the initial and extraction values did not reach 0.400, the base value of incidence with which the analysis was performed. The incidence values were very low: between 0.035 and 0.012 for question one on accessibility, and between 0.048 and 0.044 for question two; these results were sufficient for the elimination of the category. However, there is the possibility of increasing the number of questions or creating a survey that only evaluates the accessibility of eCommerce platforms [47]. For any additional solution, it is important to establish the objective pursued and the information that is required to be known; in the designed survey, the selected questions were:
  • Is it easy to find what I need on the website?
  • Does the website allow me to complete a transaction quickly?
These questions allowed us to know exactly how people felt about the web pages that the platforms managed and how they allowed a transaction to be executed. When processing the information, these questions did not cause a greater impact on the users, who, in turn, gave greater importance to issues such as catalogs, existing support, or delivery compliance. Undoubtedly, accessibility for application developers is very important, and the system improvements and updates that they perform mean that accessibility for users is overlooked or the importance of the concept is not understood. Therefore, for the end user, this category is something that must be included innately in eCommerce platforms or applications.

5. Conclusions

After the analysis was carried out, it was found that the use of the variables identified in the literature review allowed a clear panorama of the use of eCommerce to be established. However, in the review of works, it was important to include those that were developed during the pandemic. By giving priority to these works, we sought to establish the new needs that have emerged in society under the new normal.
eCommerce is currently a huge source of economic development for companies and businesses. With the penetration of the Internet, the use of online shopping channels has gained more popularity in the business sector, even more so when society went through the pandemic, where ICTs became the appropriate means to carry out their activities, becoming an object of study of the international scientific community today. In this research study, the literature regarding electronic commerce and digital marketing in the period of 2019–2021 was analyzed. The analysis focused on eCommerce and digital marketing development strategies and models for small- and medium-sized businesses. These companies represent the largest number in the business sector and provide enormous benefits to the economy; boosting their development would be equivalent to developing the economy of each country.
In a migration from a traditional business model to eCommerce, it is important to establish a structure that guides companies in the transition. Therefore, identifying the variables that affect consumers is a fundamental part of the work to be performed. Previous research is an important contribution; however, it is necessary to obtain data that help the analysis directly from the application environment, since the needs in the different sectors of society are not similar.
Companies that migrate to an online store have an advantage, that is, the possibility of collecting useful data from consumers. This allows one to identify how the interested parties browse the web or what type of sites they visit and even the products they purchase. With the results of such an analysis, it is easy to identify, in consumers, what encourages them to buy or what stops them. Using a scalable method, the results can be used both in the migration to eCommerce and in improving the shopping experience and increasing the probability that website visitors become customers. In a physical store, it is very difficult to calculate these data, since there are usually not enough records.
An AI algorithm was added to the analysis process; this allowed us to identify the number of surveys that fit the reality of the surveyed population. This was observed in the confusion matrix, where the results established that there was a group of 46 surveys in which the questions were not clear enough or in which the respondents did not answer assertively everything that was asked regarding the use of eCommerce. The J48 algorithm, when building the decision tree, improved the understanding of the study phenomenon and how the model varied according to the iterations applied to the training. In addition, to improve this process, two datasets were used, one for training and the second for testing. With this reference, it was possible to improve the data acquisition instruments and adapt them to the reality of each sector of the population.
In future work, the integration of artificial intelligence techniques into the method should be considered. The objective is to create an analysis and recommendation architecture that can record the user experience and information found on the web, such as data from social networks. With these data, the architecture can profile consumers and recommend to companies the products, services, or brands that have the greatest value for the client. In this way, it is possible to improve customer service and experience, as well as make better use of resources.

Author Contributions

W.V.-C. contributed to the following: the conception and design of the study, acquisition of data, analysis, and interpretation of data, drafting the article, and approval of the submitted version. The author S.C.-C. contributed to the study by design, conception, interpretation of data, and critical revision. W.G.-N. and X.P.-P. made the following contributions to the study: analysis and interpretation of data, approval of the submitted version. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Vindegaard, N.; Benros, M.E. COVID-19 Pandemic and Mental Health Consequences: Systematic Review of the Current Evidence. Brain Behav. Immun. 2020, 89, 531–542.
  2. Mangiaracina, R.; Perego, A.; Campari, F.; Drivers, A. Factors Influencing B2c ECommerce Diffusion. Eng. Technol. 2012, 6, 844–852.
  3. Sidpra, J.; Gaier, C.; Reddy, N.; Kumar, N.; Mirsky, D.; Mankad, K. Sustaining Education in the Age of COVID-19: A Survey of Synchronous Web-Based Platforms. Quant. Imaging Med. Surg. 2020, 10, 1422–1427.
  4. Bell, L.; McCloy, R.; Butler, L.; Vogt, J. Motivational and Affective Factors Underlying Consumer Dropout and Transactional Success in ECommerce: An Overview. Front. Psychol. 2020, 11, 1546.
  5. Han, Q.; Gu, M.; You, L.; Miao, F. Rumor Spreading from Social Networks to E-Commerce. In Proceedings of the International Conference on Communication Technology, Xi’an, China, 16–19 October 2019.
  6. Shopify Ecommerce Definition – What Is Ecommerce. 2019. Available online: https://www.shopify.com/encyclopedia/what-is-ecommerce (accessed on 21 May 2022).
  7. Arora, S. Devising E-Commerce and Green Ecommerce Sustainability. Int. J. Eng. Dev. Res. 2019, 7, 206–210.
  8. Garín-Muñoz, T.; López, R.; Pérez-Amaral, T.; Herguera, I.; Valarezo, A. Models for Individual Adoption of ECommerce, EBanking and EGovernment in Spain. Telecommun. Policy 2019, 43, 100–111.
  9. Nöldeke, G. ECommerce Report 2020; Statista: Hamburg, Germany, 2020.
  10. Ahmed, R. Ecommerce in Pakistan: Challenges and Opportunities. In Proceedings of the Eighteenth Wuhan International Conference on E-Business, Wuhan, China, 24–26 May 2019.
  11. Estrada, R.; Bayona-Oré, S. Critical Success Factors Associated to Tourism e-Commerce: Study of Peruvian Tourism Operators. Int. J. Adv. Comput. Sci. Appl. 2021, 12, 95–104.
  12. Aghaunor, L.; Fotoh, X. Factors Affecting Ecommerce Adoption in Nigerian Banks. In IT and Business Renewal; Jönköping International Business School: Jönköping, Sweden, 2006.
  13. Louvieris, P.; Driver, J. Avoiding Buyer Behaviour Myopia in Hotel Ecommerce. J. Hosp. Leis. Mark. 2004, 11, 65–84.
  14. Chen, J.; Xu, S.; Tian, Y.; Zhao, C. Logistics Factors Affecting Cross-Border Ecommerce Implementation. Econ. Manag. J. 2020, 9, 8–15.
  15. Liu, G. An Ecommerce Recommendation Algorithm Based on Link Prediction. Alex. Eng. J. 2022, 61, 905–910.
  16. Tsagkias, M.; King, T.H.; Kallumadi, S.; Murdock, V.; de Rijke, M. Challenges and Research Opportunities in ECommerce Search and Recommendations. ACM SIGIR Forum 2020, 54, 1–23.
  17. Molla, A.; Licker, P.S. ECommerce Adoption in Developing Countries: A Model and Instrument. Inf. Manag. 2005, 42, 877–899.
  18. Wang, Z.; Ben, S. Effect of Consumers’ Online Shopping on Their Investment in Money Market Funds on Ecommerce Platforms. Inf. Syst. e-Bus. Manag. 2021, 20, 325–346.
  19. Mäki, M.; Toivola, T. Global Market Entry for Finnish SME Ecommerce Companies. Technol. Innov. Manag. Rev. 2021, 11, 11–21.
  20. Huang, R. Ecommerce in Rural Areas and Environmental Sustainability: The Last-Mile Delivery. In Proceedings of the Sixteenth Wuhan International Conference on E-Business, Wuhan, China, 26–28 May 2017.
  21. Jara, A.J.; Parra, M.C.; Skarmeta, A.F. Participative Marketing: Extending Social Media Marketing through the Identification and Interaction Capabilities from the Internet of Things. Pers. Ubiquitous Comput. 2014, 18, 997–1011.
  22. Donthu, N.; Gustafsson, A. Effects of COVID-19 on Business and Research. J. Bus. Res. 2020, 117, 284–289.
  23. Kemp, S. Digital 2019: Ecuador. Available online: https://datareportal.com/reports/digital-2019-ecuador (accessed on 19 October 2022).
  24. Al-Qirim, N. The Adoption of ECommerce Communications and Applications Technologies in Small Businesses in New Zealand. Electron. Commer. Res. Appl. 2007, 6, 462–473.
  25. Agatz, N.A.H.; Fleischmann, M.; van Nunen, J.A.E.E. E-Fulfillment and Multi-Channel Distribution – A Review. Eur. J. Oper. Res. 2008, 187, 339–356.
  26. Agatz, N.; Fleischmann, M.; van Nunen, J. E-Fulfillment and Multi-Channel Distribution – A Review REPORT SERIES. Erasmus Res. Inst. Manag. 2008, 47, 777–780.
  27. Ghandour, A. Ecommerce Website Value Model for SMEs. Int. J. Electron. Commer. Stud. 2015, 6, 203–222.
  28. Sitepu, R.K.-K. The Impact of Modern Markets on the Performance of Micro, Small and Medium Enterprises. J. Ekon. Bisnis 2011, 16, 10–24.
  29. Ciprian, A. Influence of Adoption Factors and Risks on Ecommerce and Online Marketing. In Proceedings of the International Conference “Marketing – From Information to Decision”, Cluj-Napoca, Romania, 28–29 October 2011.
  30. Marseto, F.; Handayani, P.W.; Pinem, A.A. Push, Pull, and Mooring Evaluation of User Switching Intention from Social Commerce to E-Commerce. In Proceedings of the 2019 International Conference on Information Management and Technology (ICIMTech), Denpasar, Indonesia, 19–20 August 2019.
  31. Villegas-Ch., W.; García-Ortiz, J.; Román-Cañizares, M.; Sánchez-Viteri, S. Proposal of a Remote Education Model with the Integration of an ICT Architecture to Improve Learning Management. PeerJ Comput. Sci. 2021, 7, e781.
  32. Singh, M.; Singh, G. Impact of Social Media on E-Commerce. Int. J. Eng. Technol. 2018, 7, 21–26.
  33. Capece, G.; di Costa, F.; Kurasova, O.; Villegas-Ch, W.; García-Ortiz, J.; Ortiz-Garces, I.; Sánchez-Viteri, S. Identification of the Consequences of COVID-19 through the Analysis of Data Obtained in Surveys of a Specific Population. Informatics 2022, 9, 46.
  34. Amornkitvikai, Y.; Lee, C. Determinants of E-Commerce Adoption and Utilisation by SMEs in Thailand. ISEAS YUSOF ISHAK INSTITUTE: Economics Working Paper. 2020. Available online: https://www.iseas.edu.sg/images/pdf/ISEASEWP2020-1AmornkitvikaiLee.pdf (accessed on 2 May 2022).
  35. Zhang, B.; Yong, R.; Li, M.; Pan, J.; Huang, J. A Hybrid Trust Evaluation Framework for E-Commerce in Online Social Network: A Factor Enrichment Perspective. IEEE Access 2017, 5, 7080–7096.
  36. Larson, D.; Chang, V. A Review and Future Direction of Agile, Business Intelligence, Analytics and Data Science. Int. J. Inf. Manag. 2016, 36, 700–710.
  37. Khalifa, S.; Elshater, Y.; Sundaravarathan, K.; Bhat, A. The Six Pillars for Building Big Data Analytics Ecosystems. ACM Comput. Surv. 2016, 49, 1–36.
  38. Grandon, E.E.; Mykytyn, P.P. Theory-Based Instrumentation to Measure the Intention to Use Electronic Commerce in Small and Medium Sized Businesses. J. Comput. Inf. Syst. 2004, 44, 44–57.
  39. Bengel, A.; Shawki, A.; Aggarwal, D. Simplifying Web Analytics for Digital Marketing. In Proceedings of the 2015 IEEE International Conference on Big Data (Big Data), Santa Clara, CA, USA, 29 October–1 November 2015; pp. 1917–1918.
  40. Villegas-Ch, W.; Palacios-Pacheco, X.; Luján-Mora, S. Application of a Smart City Model to a Traditional University Campus with a Big Data Architecture: A Sustainable Smart Campus. Sustainability 2019, 11, 2857.
  41. Bt Mohd, N.A.; Zaaba, Z.F. A Review of Usability and Security Evaluation Model of E-Commerce Website. In Proceedings of the Procedia Computer Science, Nanning, China, 11–13 January 2019; Volume 161.
  42. Al-Hudhaif, S.A.; Alkubeyyer, A. E-Commerce Adoption Factors in Saudi Arabia. Int. J. Bus. Manag. 2011, 6, 122.
  43. Garín Muñoz, T.; López, R.; Pérez Amaral, T.; Herguera García, I.; Valarezo Unda, A. Models for Individual Adoption of ECommerce, EBanking and EGovernment in Spain. SSRN Electron. J. 2017, 43, 100–111.
  44. Huang, L.; Abnoosian, K. A New Approach for Service Migration in Cloud-Based e-Commerce Using an Optimization Algorithm. Int. J. Commun. Syst. 2020, 33, e4457.
  45. Villegas-Ch, W.; Molina-Enriquez, J.; Chicaiza-Tamayo, C.; Ortiz-Garcés, I.; Luján-Mora, S. Application of a Big Data Framework for Data Monitoring on a Smart Campus. Sustainability 2019, 11, 5552.
  46. Zuniarti, I.; Yuniasih, I.; Martana, I.K.; Setyaningsih, E.D.; Susilowati, I.H.; Pramularso, E.Y.; Astuti, D. The Effect of the Presence of E-Commerce on Consumer Purchasing Decisions. Int. J. Data Netw. Sci. 2021, 5, 479–484.
  47. Safiria, A.; Ditta, A.; Ahmad, A.H.; Ula, R.; Fauzi, A.; Idris, I.; Faizun, M.; Yazid, M. The Role of Perceived Benefits and Perceived Risks Towards the Consumers’ Purchase Intention Via E-Commerce: An Evidence From Indonesia. Solid State Technol. 2020, 63, 3257–3274.
Figure 1. Basic architecture of eCommerce.
Figure 2. Flowchart of methodology for the analysis of the indicators that affect the use of eCommerce.
Figure 3. Sedimentation graph: representation of the magnitude of the eigenvalues.
Table 1. Variables identified in the review of previous works with a sample of data collected through surveys.
Comfort  Schedules  Speed  Around  Privacy  Offer
4        3          3      5       3        5
3        3          4      3       3        3
4        3          5      5       5        4
2        5          4      5       3        3
3        3          4      5       3        3
2        4          5      3       3        5
5        4          3      4       4        5
3        3          4      4       5        4
2        3          3      4       3        5
5        4          4      3       4        5
5        5          3      4       4        5
2        5          3      5       4        3
5        3          4      5       5        3
2        3          5      5       4        5
3        5          3      3       3        3
2        4          3      4       3        3
5        4          4      3       4        5
2        3          3      3       4        4
3        4          4      1       3        4
4        4          4      5       3        3
Table 2. KMO and Bartlett’s test.
Kaiser–Meyer–Olkin Measure of Sampling Adequacy       0.788
Bartlett’s test of sphericity   Approx. chi-square    460.453
                                df                    15
                                Sig.                  <0.001
Table 3. Anti-image matrices: measures of sampling adequacy (MSA).
                         Comfort   Schedules  Speed     Around    Privacy   Offer
Anti-image covariance
  Comfort                0.765     −0.119     −0.044    −0.092    −0.064    0.206
  Schedules              −0.119    0.610      −0.194    −0.040    −0.005    0.066
  Speed                  −0.044    −0.194     0.386     −0.166    −0.057    −0.099
  Around                 −0.092    −0.040     −0.166    0.394     −0.153    −0.125
  Privacy                −0.064    −0.005     −0.057    −0.153    0.617     −0.097
  Offer                  0.206     0.066      −0.099    −0.125    −0.097    0.687
Anti-image correlation
  Comfort                0.731 a   −0.174     −0.081    −0.167    −0.094    0.284
  Schedules              −0.174    0.800 a    −0.401    −0.081    −0.008    0.103
  Speed                  −0.081    −0.401     0.777 a   −0.427    −0.117    −0.192
  Around                 −0.167    −0.081     −0.427    0.789 a   −0.311    −0.240
  Privacy                −0.094    −0.008     −0.117    −0.311    0.867 a   −0.149
  Offer                  0.284     0.103      −0.192    −0.240    −0.149    0.722 a
a: Measures of sampling adequacy (MSA)
Table 4. Communalities obtained using the extraction method of maximum likelihood.
           Initial  Extraction
Comfort    0.235    0.344
Schedules  0.390    0.472
Speed      0.614    0.731
Around     0.606    0.711
Privacy    0.383    0.342
Offer      0.313    0.625
Table 5. Total variance explained obtained using the extraction method of maximum likelihood.
        Initial Eigenvalues                  Extraction Sums of Squared Loadings  Rotation Sums of Squared Loadings
Factor  Total  % of variance  Cumulative %   Total  % of variance  Cumulative %   Total  % of variance  Cumulative %
1       3.046  50.772         50.772         2.643  44.058         44.058         1.660  27.672         27.672
2       1.155  19.244         70.016         0.661  11.021         55.079         1.644  27.407         55.079
3       0.651  10.853         80.870
4       0.507  8.451          89.320
5       0.378  6.306          95.626
6       0.262  4.374          100.000
Table 6. Rotated factor matrix obtained using the extraction method of maximum likelihood and the rotation method of Varimax with Kaiser normalization, with rotation converged in 3 iterations.
           Factor 1  Factor 2
Offer      0.788
Around     0.630     0.560
Privacy    0.515
Schedules            0.648
Speed      0.568     0.639
Comfort              0.585
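Table 7. Calculation of Cronbach’s alpha for the factors identified in the factorial analysis.

| Comfort | Schedules | Speed | Around | Privacy | Offer | Total |
|---|---|---|---|---|---|---|
| 4 | 3 | 3 | 5 | 3 | 5 | 23 |
| 3 | 3 | 4 | 3 | 3 | 3 | 19 |
| 4 | 3 | 5 | 5 | 5 | 4 | 26 |
| 2 | 5 | 4 | 5 | 3 | 3 | 22 |
| 3 | 3 | 4 | 5 | 3 | 3 | 21 |
| 2 | 4 | 5 | 3 | 3 | 5 | 22 |
| 5 | 4 | 3 | 4 | 4 | 5 | 25 |
| 3 | 3 | 4 | 4 | 5 | 4 | 23 |
| 2 | 3 | 3 | 4 | 3 | 5 | 20 |
| 5 | 4 | 4 | 3 | 4 | 5 | 25 |
| 5 | 5 | 3 | 4 | 4 | 5 | 26 |
| 2 | 5 | 3 | 5 | 4 | 3 | 22 |
| 5 | 3 | 4 | 5 | 5 | 3 | 25 |
| 2 | 3 | 5 | 5 | 4 | 5 | 24 |
| 3 | 5 | 3 | 3 | 3 | 3 | 20 |
| 2 | 4 | 3 | 4 | 3 | 3 | 19 |
| 5 | 4 | 4 | 3 | 4 | 5 | 25 |
| 2 | 3 | 3 | 3 | 4 | 4 | 19 |
| 3 | 4 | 4 | 1 | 3 | 4 | 19 |
| 4 | 4 | 4 | 5 | 3 | 3 | 23 |
| 4 | 3 | 3 | 5 | 3 | 5 | 22 |
| Var.P: 1.854870564 | 2.169678694 | 2.159570176 | 2.137484338 | 1.630656841 | 1.377858948 | 33.66013294 |
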
Table 8. Cronbach’s alpha results of three interactions.

| # Calculation | Result |
|---|---|
| Interaction 1 | 0.796075764 |
| Interaction 2 | 0.804694163 |
| Interaction 3 | 0.822225082 |

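
The first result in Table 8 can be reproduced from the Var.P row of Table 7 with the standard variance form of Cronbach’s alpha, alpha = k/(k − 1) × (1 − sum of item variances/variance of the total score). The sketch below is only an illustration of that formula; the function name is hypothetical, and the inputs are the six item variances and the total-score variance transcribed from Table 7.

```python
# Population variances (Var.P) of the six items, from Table 7.
item_variances = [1.854870564, 2.169678694, 2.159570176,
                  2.137484338, 1.630656841, 1.377858948]
# Population variance of the total score, from Table 7.
total_variance = 33.66013294


def cronbach_alpha(item_variances, total_variance):
    """Cronbach's alpha from item variances and the variance of the summed score."""
    k = len(item_variances)
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


alpha = cronbach_alpha(item_variances, total_variance)
print(f"{alpha:.6f}")
```

With these inputs the result agrees with the value listed for Interaction 1 to about six decimal places.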
Table 9. KMO and Bartlett’s test two.

| Test | Statistic | Value |
|---|---|---|
| Kaiser–Meyer–Olkin measure of sampling adequacy | | 0.763 |
| Bartlett’s test of sphericity | Approx. chi-square | 754.059 |
| | df | 66 |
| | Sig. | <0.001 |

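Table 10. Anti-image matrices, measures of sampling adequacy (MSA) with 12 items.

| | | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | P12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Anti-image covariance | P1 | 0.965 | 0.037 | 0.027 | −0.009 | 0.007 | 0.020 | −0.015 | −0.094 | 0.004 | 0.071 | 0.037 | −0.073 |
| | P2 | 0.037 | 0.952 | 0.008 | 0.072 | −0.043 | −0.076 | 0.020 | −0.062 | 0.010 | 0.072 | 0.006 | 0.001 |
| | P3 | −0.27 | −0.008 | −0.643 | −0.254 | −0.075 | −0.069 | 0.012 | 0.048 | 0.148 | −0.100 | 0.048 | 0.007 |
| | P4 | −0.009 | 0.072 | −0.254 | 0.546 | −0.109 | −0.032 | −0.077 | −0.024 | −0.103 | 0.111 | 0.039 | −0.041 |
| | P5 | 0.007 | −0.043 | −0.075 | −0.109 | 0.542 | −0.091 | −0.078 | −0.175 | −0.061 | 0.047 | −0.010 | 0.046 |
| | P6 | 0.020 | −0.076 | −0.069 | −0.032 | −0.091 | 0.529 | −0.240 | −0.005 | −0.016 | −0.006 | −0.031 | 0.055 |
| | P7 | −0.015 | 0.020 | 0.012 | −0.077 | −0.078 | −0.240 | 0.502 | −0.043 | −0.076 | −0.002 | 0.002 | 90.66 × 105 |
| | P8 | −0.094 | −0.069 | 0.048 | −0.024 | −0.175 | −0.005 | −0.043 | 0.696 | −0.098 | −0.053 | −0.028 | 0.041 |
| | P9 | 0.004 | 0.012 | 0.148 | −0.103 | −0.061 | −0.016 | −0.076 | −0.098 | 0.474 | −0.166 | −0.119 | −0.064 |
| | P10 | 0.071 | 0.048 | −0.100 | 0.111 | 0.047 | −0.006 | −0.002 | −0.053 | −0.166 | 0.646 | −0.100 | −0.086 |
| | P11 | 0.037 | 0.148 | 0.048 | 0.039 | −0.010 | −0.031 | 0.002 | −0.028 | −0.119 | −0.100 | 0.488 | −0.248 |
| | P12 | −0.073 | −0.100 | 0.007 | −0.041 | 0.046 | 0.055 | 90.6 × 105 | 0.041 | −0.064 | −0.086 | −0.248 | 0.567 |
| Anti-image correlation | P1 | 0.450 a | 0.048 | −0.034 | −0.013 | 0.010 | 0.028 | −0.021 | −0.115 | 0.006 | 0.089 | 0.055 | −0.098 |
| | P2 | 0.038 | 0.007 | −0.010 | 0.100 | −0.060 | −0.107 | 0.029 | −0.076 | 0.015 | 0.091 | 0.009 | 0.002 |
| | P3 | −0.034 | −0.034 | 0.630 a | −0.428 | −0.128 | −0.119 | 0.022 | 0.072 | 0.267 | −0.155 | 0.086 | 0.012 |
| | P4 | −0.013 | −0.010 | −0.428 | 0.733 a | −0.201 | −0.059 | −0.147 | −0.039 | −0.202 | 0.186 | 0.076 | −0.074 |
| | P5 | 0.010 | 0.630 a | −0.128 | −0.201 | 0.838 a | −0.171 | −0.149 | −0.285 | −0.119 | 0.080 | −0.020 | 0.083 |
| | P6 | 0.028 | −0.428 | −0.119 | −0.059 | −0.171 | 0.786 a | −0.465 | −0.008 | −0.032 | −0.011 | −0.062 | 0.100 |
| | P7 | −0.021 | −0.119 | 0.022 | −0.147 | −0.149 | −0.465 | 0.802 a | −0.072 | −0.157 | −0.003 | 0.004 | 0.000 |
| | P8 | −0.115 | 0.022 | 0.072 | −0.039 | −0.285 | −0.008 | −0.072 | 0.825 a | −0.170 | −0.079 | −0.048 | 0.065 |
| | P9 | 0.006 | 0.072 | 0.267 | −0.202 | −0.119 | −0.032 | −0.157 | −0.170 | 0.774 a | −0.300 | 0.247 | −0.123 |
| | P10 | 0.089 | 0.267 | −0.155 | 0.186 | 0.080 | −0.011 | −0.003 | −0.079 | −0.300 | 0.750 a | 0.178 | −0.143 |
| | P11 | 0.055 | −0.155 | 0.086 | 0.076 | −0.020 | −0.062 | 0.004 | −0.048 | −0.247 | −0.178 | 0.750 a | −0.471 |
| | P12 | −0.098 | 0.012 | 0.012 | −0.074 | 0.083 | 0.100 | 0.000 | 0.065 | −0.123 | −0.143 | −0.471 | 0.724 a |

a: Measures of sampling adequacy (MSA)
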
Table 11. Communalities obtained using the extraction method of maximum likelihood. One or more communality estimates greater than 1 were encountered during iterations; the resulting solution should be interpreted with caution.

| Item | Initial | Extraction |
|---|---|---|
| P1 | 0.035 | 0.012 |
| P2 | 0.048 | 0.044 |
| P3 | 0.357 | 0.411 |
| P4 | 0.454 | 0.753 |
| P5 | 0.458 | 0.544 |
| P6 | 0.471 | 0.959 |
| P7 | 0.498 | 0.535 |
| P8 | 0.304 | 0.434 |
| P9 | 0.526 | 0.633 |
| P10 | 0.354 | 0.368 |
| P11 | 0.512 | 0.661 |
| P12 | 0.433 | 0.583 |

Table 12. Rotated factor matrix obtained using the extraction method of maximum likelihood and the rotation method of Varimax with Kaiser normalization, with rotation converged in 8 iterations.
Question1234
P110.803
P120.752
P90.644 0.425
P100.585
P6 0.9400.259
P7 0.5610.2910.331
P2
P4 0.8030.267
P3−0.208 0.587
P1
P8 0.249 0.578
P5 0.4010.3620.502
Table 13. Second analysis and result of Cronbach’s alpha of three interactions.

| # Calculation | Result |
|---|---|
| Interaction 1 | 0.680688048 |
| Interaction 2 | 0.702051044 |
| Interaction 3 | 0.725779032 |

Table 14. Stratified cross-validation with a training dataset corresponding to 506 instances.

| Stratified Cross-Validation | Value | Percentage |
|---|---|---|
| Correctly classified instances | 382 | 75.4941% |
| Incorrectly classified instances | 124 | 24.5059% |
| Kappa statistic | 0.6936 | |
| Mean absolute error | 0.1035 | |
| Root mean square error | 0.2895 | |
| Relative absolute error | | 32.3515% |
| Relative root square error | | 72.3879% |
| Total number of instances | 506 | |

Table 15. Detailed accuracy by class, with a training dataset corresponding to 506 instances.

| Class | TP Rate | FP Rate | Precision | Recall | F-Measure | MCC | ROC Area | PRC Area |
|---|---|---|---|---|---|---|---|---|
| Never | 1.000 | 0.042 | 0.852 | 1.000 | 0.920 | 0.904 | 0.982 | 0.885 |
| Little | 0.779 | 0.017 | 0.92 | 0.779 | 0.844 | 0.812 | 0.958 | 0.871 |
| Something | 0.934 | 0.025 | 0.908 | 0.934 | 0.921 | 0.900 | 0.966 | 0.884 |
| Enough | 0.569 | 0.121 | 0.542 | 0.569 | 0.555 | 0.439 | 0.81 | 0.470 |
| Always | 0.479 | 0.100 | 0.529 | 0.479 | 0.503 | 0.394 | 0.81 | 0.474 |
| Weighted avg. | 0.755 | 0.060 | 0.754 | 0.755 | 0.752 | 0.694 | 0.906 | 0.720 |

TP Rate, the true positive rate, is the proportion of instances of a given class that are correctly classified as that class. FP Rate, the false positive rate, is the proportion of instances of other classes that are incorrectly classified as that class. Precision is the number of instances that truly belong to a class divided by the total number of instances classified as that class. Recall is the number of instances correctly classified as a given class divided by the actual total in that class; it is equivalent to the TP rate. F-Measure combines precision and recall and is calculated as 2 × Precision × Recall/(Precision + Recall). MCC, the Matthews correlation coefficient, is used in machine learning as a measure of the quality of binary classifications (two classes); it considers true and false positives and negatives and is generally regarded as a balanced measure that can be used even if the classes are of very different sizes. ROC Area is the area under the receiver operating characteristic curve; it is one of the most important WEKA output values, and it gives an idea of how the classifier performs in general. PRC Area is the area under the precision-recall curve, which is more informative than the ROC curve when evaluating binary classifiers on unbalanced datasets.
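
As a worked check of the F-Measure definition, the sketch below recomputes the F-Measure column of Table 15 from the Precision and Recall columns, using the harmonic-mean form 2 × Precision × Recall/(Precision + Recall). The function and variable names are hypothetical; the numbers are transcribed from Table 15, and agreement is only expected to about three decimals because the inputs are rounded.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# class: (precision, recall, reported F-Measure), transcribed from Table 15
table15 = {
    "Never": (0.852, 1.000, 0.920),
    "Little": (0.92, 0.779, 0.844),
    "Something": (0.908, 0.934, 0.921),
    "Enough": (0.542, 0.569, 0.555),
    "Always": (0.529, 0.479, 0.503),
}

for name, (precision, recall, reported) in table15.items():
    computed = f_measure(precision, recall)
    print(f"{name}: computed {computed:.3f}, reported {reported:.3f}")
```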
Table 16. Stratified cross-validation with a training dataset corresponding to 217 instances.

| Stratified Cross-Validation | Value | Percentage |
|---|---|---|
| Correctly classified instances | 171 | 78.8018% |
| Incorrectly classified instances | 46 | 21.1982% |
| Kappa statistic | 0.7347 | |
| Mean absolute error | 0.097 | |
| Root mean square error | 0.2838 | |
| Relative absolute error | | 30.356% |
| Relative root square error | | 70.9858% |
| Total number of instances | 217 | |

Table 17. Detailed accuracy by class.

| Class | TP Rate | FP Rate | Precision | Recall | F-Measure | MCC | ROC Area | PRC Area |
|---|---|---|---|---|---|---|---|---|
| Never | 1.000 | 0.034 | 0.87 | 1.000 | 0.930 | 0.917 | 0.978 | 0.844 |
| Little | 0.690 | 0.017 | 0.906 | 0.690 | 0.784 | 0.750 | 0.877 | 0.737 |
| Something | 0.854 | 0.101 | 0.707 | 0.854 | 0.774 | 0.707 | 0.874 | 0.591 |
| Enough | 0.543 | 0.053 | 0.735 | 0.543 | 0.625 | 0.552 | 0.794 | 0.585 |
| Always | 0.878 | 0.063 | 0.766 | 0.878 | 0.818 | 0.775 | 0.929 | 0.712 |
| Weighted avg. | 0.788 | 0.055 | 0.793 | 0.788 | 0.781 | 0.734 | 0.887 | 0.687 |

TP Rate, the true positive rate, is the proportion of instances of a given class that are correctly classified as that class. FP Rate, the false positive rate, is the proportion of instances of other classes that are incorrectly classified as that class. Precision is the number of instances that truly belong to a class divided by the total number of instances classified as that class. Recall is the number of instances correctly classified as a given class divided by the actual total in that class; it is equivalent to the TP rate. F-Measure combines precision and recall and is calculated as 2 × Precision × Recall/(Precision + Recall). MCC, the Matthews correlation coefficient, is used in machine learning as a measure of the quality of binary classifications (two classes); it considers true and false positives and negatives and is generally regarded as a balanced measure that can be used even if the classes are of very different sizes. ROC Area is the area under the receiver operating characteristic curve; it is one of the most important WEKA output values, and it gives an idea of how the classifier performs in general. PRC Area is the area under the precision-recall curve, which is more informative than the ROC curve when evaluating binary classifiers on unbalanced datasets.
Table 18. Results obtained from the confusion matrix by people classified correctly and incorrectly and the identification of the number of surveys that did not respond to the study phenomenon.

| a | b | c | d | e | Classified as |
|---|---|---|---|---|---|
| 40 | 0 | 0 | 0 | 0 | a = Never |
| 6 | 29 | 7 | 0 | 0 | b = Little |
| 0 | 3 | 41 | 4 | 0 | c = Something |
| 0 | 0 | 10 | 25 | 11 | d = Enough |
| 0 | 0 | 0 | 5 | 36 | e = Always |

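
The summary figures for the 217-instance dataset can be derived directly from the confusion matrix in Table 18. The sketch below reads the rows as the actual classes and the columns as the predicted classes, and computes the overall accuracy, Cohen’s kappa, and the per-class TP rate, FP rate, and precision. It is a plain-Python illustration with hypothetical variable names, not the WEKA implementation.

```python
labels = ["Never", "Little", "Something", "Enough", "Always"]

# Confusion matrix from Table 18: rows = actual class, columns = predicted class.
matrix = [
    [40, 0, 0, 0, 0],
    [6, 29, 7, 0, 0],
    [0, 3, 41, 4, 0],
    [0, 0, 10, 25, 11],
    [0, 0, 0, 5, 36],
]

n = len(labels)
total = sum(sum(row) for row in matrix)
row_totals = [sum(row) for row in matrix]
col_totals = [sum(row[j] for row in matrix) for j in range(n)]
correct = sum(matrix[i][i] for i in range(n))

# Observed agreement and the agreement expected by chance.
accuracy = correct / total
expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
kappa = (accuracy - expected) / (1 - expected)

per_class = {}
for i, label in enumerate(labels):
    tp = matrix[i][i]
    fn = row_totals[i] - tp   # instances of this class assigned elsewhere
    fp = col_totals[i] - tp   # other classes assigned to this class
    tn = total - tp - fn - fp
    per_class[label] = {
        "tp_rate": tp / (tp + fn),
        "fp_rate": fp / (fp + tn),
        "precision": tp / (tp + fp),
    }

print(f"Correct: {correct} of {total} ({100 * accuracy:.4f}%)")
print(f"Kappa: {kappa:.4f}")
for label, m in per_class.items():
    print(label, {k: round(v, 3) for k, v in m.items()})
```

Up to rounding, the results agree with the values reported in Table 16 (171 correct instances, kappa 0.7347) and with the TP Rate, FP Rate, and Precision columns of Table 17.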
Table 19. Training of a data mining algorithm with two clusters to determine the use of eCommerce.

| Attribute | Full Data (217) | Cluster 1 (109) | Cluster 2 (108) |
|---|---|---|---|
| User_eCommerce | Something | Enough | Something |
| Accessibility_Preg1 | 3.2074 | 4.5632 | 2.3 |
| Accessibility_Preg2 | 3.1014 | 4.4368 | 2.2077 |
| Catalogs_Preg1 | 3.2028 | 4.5172 | 2.3231 |
| Catalogs_Preg2 | 3.0461 | 4.4253 | 2.1231 |
| Compliance_Preg1 | 3.2074 | 4.4828 | 2.3538 |
| Compliance_Preg2 | 3.235 | 4.5287 | 2.3692 |
| Support_Preg1 | 3.1843 | 4.4828 | 2.3154 |
| Support_Preg2 | 3.1751 | 4.4023 | 2.3538 |
| Satisfaction_Preg1 | 3.1935 | 4.3908 | 2.3923 |
| Satisfaction_Preg2 | 3.2212 | 4.5862 | 2.3077 |
| Shipment_Preg1 | 3.2673 | 4.5287 | 2.4231 |
| Shipment_Preg2 | 3.2028 | 4.4483 | 2.3692 |

Table 20. Stratified cross-validation with re-evaluation on test set.

| Stratified Cross-Validation | Value | Percentage |
|---|---|---|
| Correctly classified instances | 161 | 74.1935% |
| Incorrectly classified instances | 56 | 25.8065% |
| Kappa statistic | 0.6777 | |
| Mean absolute error | 0.1181 | |
| Root mean square error | 0.308 | |
| Total number of instances | 217 | |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
