Sales Forecasting, Market Analysis, and Performance Assessment for US Retail Firms: A Business Analytics Perspective

Wang, Chih-Hsuan; Gu, Yu-Wei

doi:10.3390/app12178480

Open Access Article

by Chih-Hsuan Wang * and Yu-Wei Gu

Department of Industrial Engineering & Management, National Yang Ming Chiao Tung University, Hsinchu 30013, Taiwan

* Author to whom correspondence should be addressed.

Appl. Sci. 2022, 12(17), 8480; https://doi.org/10.3390/app12178480

Submission received: 5 August 2022 / Revised: 21 August 2022 / Accepted: 22 August 2022 / Published: 25 August 2022

(This article belongs to the Special Issue Advanced Applications of Artificial Intelligence, Data Analytics and Soft Computing)

Abstract:

Retail firms are the best representatives of a developed country’s economic condition because they sell many of the necessary goods used for daily consumption, including food, clothes, shoes, electric appliances, and office supplies. This study presents a novel framework to help retail practitioners achieve the following goals: (1) predict sales revenues by identifying significant economic indicators, (2) estimate stable equilibriums by capturing interactive dynamics between competing firms, and (3) derive operational efficiencies and indicate required improvements by conducting performance assessments. To verify the validity of the research, data pertaining to Walmart, Costco, and Kroger are collected. Specifically, the least absolute shrinkage and selection operator (Lasso) is adopted in order to identify significant economic indicators. Consumer price index and regular wage are two common indicators that affect the three firms’ sales numbers. In sales forecasting, support vector regression (SVR) and multivariate adaptive regression splines (MARS) perform the best in the training set and the testing set, respectively. Finally, the Lotka–Volterra model (LVM) and data envelopment analysis (DEA) are used for competitive analysis and performance assessment. A relationship of economic mutualism has been identified between the three firms. Furthermore, research findings show that Kroger performs inefficiently, though it can expect to increase sales more than the others in stable equilibriums.

Keywords:

economic indicators; retail; sales forecasting; market analysis; performance assessment

1. Introduction

The retail sector dominates a big proportion of the service industry in modern countries because it provides a variety of the goods necessary for daily consumption [1,2]. Generally, retail chains consist of four specific systems: department stores, hypermarkets, supermarkets, and convenience stores. Specifically, households are the primary customers for hypermarkets and supermarkets, while individuals are the primary customers for department stores and convenience stores. Clearly, the product categories, geographical locations, area sizes, and make-up of the main customers are quite different between portions of the retail sector [3,4,5]. In practice, the aggregate sales number is the best representative of consumer shopping, pricing policy, promotion plan, and product strategy performance. Inspired by the concept of business analytics, this research highlights three critical issues: sales forecasting (predictive analytics), market analysis (diagnostic analytics), and performance assessment (prescriptive analytics). For the retail sector, sales forecasting helps managers understand customers’ behaviors and predict their future desires [6]. Then, a firm can optimize storage space, shelf space, and display space to prepare inventories and develop product strategies. Although sales forecasting is critically important, it is extremely challenging due to the lack of any systematic approaches useful for identifying representative or effective predictors.

In sales forecasting, most past studies predict sales by relying on historical data without considering the impacts of the predictors. In particular, significant economic indicators can vary between different portions of the retail sector, or between individual firms [7,8,9]. Economic indicators are generally drawn from one of three categories: leading, lagging, or coincident indicators [10,11,12,13]. Industrial production index (IPI), consumer confidence index (CCI), and purchase manager index (PMI) are usually the leading indicators used in the forecasting of a country’s economy [13,14]. Gross national product (GNP), consumer price index (CPI), producer price index (PPI), and unemployment rates are usually treated as lagging indicators because they can be used to justify economic conditions. Coincident indicators, such as gross domestic product (GDP), personal consumption expenditure (PCE), and regular wage, are concurrent with changes in the economy. In practice, it is not easy to justify the temporal causality (leading, lagging, or coincident) of the indicators. Thus, they are mixed together in this research.

Additionally, market competition is very common in the retail sector [15,16]. Generally, market competition has three types. The first is horizontal competition, meaning competition between homogeneous firms targeting similar segments. The second is vertical competition between the upstream supplier and the downstream retailer. The third is channel competition between online platforms and onsite stores. In reality, the degree of competition depends on the dependences of the interactive firms as related to market segments, customer groups, and product categories [17]. Horizontal competition means that competing firms offer substitutive products or services [18,19] while vertical competition means that each partner takes a slice of the value chain, similar to profit sharing [12,20]. Today, artificial intelligence and computer vision have blurred the boundary between online platforms and onsite retailers. Walmart merged an e-commerce platform in 2018, while Amazon opened cashier-less stores the same year. Clearly, channel competition between Amazon and Walmart has become much more intense than it was before.

This research focuses on a mix of horizontal and vertical competition between Walmart, Costco, and Kroger. Among them, Walmart provides varieties of products ranging from electronic appliances, furniture, office supplies, sport supplies, and clothes, to fresh food. In contrast, Kroger focuses on community stores and offers fresh food, vegetables, fruits, snacks, bread, milk, etc. Costco seems to have positioned itself between Walmart and Kroger, but closer to Walmart. In addition to sales forecasting and market analysis, performance assessments are also important for assisting retailers in understanding how the efficient input of resources can bring about a desired outcome, while indicating required improvements. As a consequence, this research proposes a novel framework to achieve the following goals: (1) identify effective predictors for the execution of sales forecasting for retail firms, (2) analyze market competition between these interactive firms, thereby revealing managerial insights, and (3) derive operational efficiencies, thereby indicating actions required for improvement. For clarity, the research questions are addressed as follows:

What economic indicators are significant predictors that affect retail sales?
What interrelationships exist between Walmart, Costco, and Kroger, and how can stable market equilibriums be estimated?
How can performance assessment, in terms of operational efficiencies, be conducted, and what actions are required to improve inefficient decision management units (DMUs)?

The rest of the paper is organized as follows: Section 2 provides an overview of market competition and sales forecasting. Section 3 details the proposed techniques. Research findings are presented in Section 4. Discussions are presented in Section 5. Conclusions are shown in Section 6.

2. Literature Review

Sales forecasting, market analysis, and performance assessment are three critical issues for retail firms. Sales forecasting can help practitioners achieve better financial budgeting and operation planning [21,22]. Market analysis assists firms in deducing the interrelationships between competitors and estimating stable equilibriums [23,24]. Performance assessment derives operational efficiencies and indicates the actions required to improve input resources and output outcomes. Generally, forecasting techniques can be qualitative or quantitative. Typical qualitative methods include the Delphi method, market research, and panel discussion, while quantitative methods include moving average, exponential smoothing, and time series [6]; however, the above-mentioned quantitative methods do not consider the causalities between the predictors and the outcome [7,25].

To highlight research contributions, Table 1 compares this research to past studies. Clearly, past studies rarely addressed the impacts of dynamic competition (internal effects) and economic indicators (external effects) on retailers. Besides, a process which only derives operational efficiencies is insufficient. The required actions to improve inefficient firms should be clearly indicated. Thus, this research attempts to simultaneously tackle the following issues [12,17,22,26]: (1) What is the causality between economic indicators and aggregate sales (predictive analytics)? (2) How does market competition decide stable equilibriums (diagnostic analytics)? (3) What actions should be taken to improve operational efficiency (prescriptive analytics)?

2.1. Sales Forecasting Based on Economic Indicators

Economic indicators are a collection of aggregate factors [11,13,33] that can denote a country’s economic conditions. Depending on the temporal causalities, economic indicators can be leading, coincident, or lagging signals [34]. A leading indicator is an economic factor that changes before the economy begins to grow or decline. Conversely, a lagging indicator is a measure that moves after a change in the economy has already occurred. In contrast, coincident indicators concurrently reflect the economic condition of a country. Based on economic indicators, economists can help a country predict future conditions, and flash green, red, or yellow lights to alert the government, firms, consumers, and even investors regarding future changes to the economic condition.

In practice, leading indicators help practitioners and policymakers predict significant changes in the economy, while lagging indicators are used to confirm increasing or declining patterns and changes in trends [10,29,35]. Coincident indicators are very powerful because there are no delays between the predictors and the outcome. Regardless of whether leading, lagging, or coincident indicators are used, they must be systematically identified to recognize significant predictors. Since this study aims at the prediction of aggregate sales for retail firms, associated economic indicators, such as CPI (consumer price index), CCI (consumer confidence index), PCE (personal consumption expenditure), non-manufacturing purchase index (NMI), producer price index (PPI), industry production index (IPI), purchase manager index (PMI), regular wage, unemployment rate, oil price, and exchange rate are adopted as potential predictors in sales forecasting.

2.2. Dynamic Competition and Performance Assessment

To model market dynamics, game theory and channel competition are frequently adopted to characterize sequential or concurrent moves between the firms in an oligopoly structure. Specifically, game theory based on mathematical programming has been widely applied to auction, mechanism design, and channel coordination [36,37,38]. To the best of our knowledge, most past studies focused on horizontal competition in which homogeneous firms compete for the same segments of customers [23,24]. In this study, Walmart and Costco resemble large-scale hypermarkets, while Kroger resembles a supermarket. Generally, the customers of hypermarkets or supermarkets are households (weekly purchases), rather than the individuals (daily consumption) served by convenience stores. Because the information available for the three retailers is aggregate sales, it is used to analyze market competition and quantify the relationships between the three firms. For instance, if the sales of one firm increase or decrease, what is the impact on its competing firms? Based on the interrelationships, what are the stable equilibriums for the competing firms? In this study, the Lotka–Volterra model (LVM) is constructed to achieve the above-mentioned goals.

Further, to conduct performance assessment and demonstrate the strengths or weaknesses of a firm, operational efficiencies are derived for competing retailers. Operational efficiency, or the so-called productivity, is used to measure the degree of utilization from input resources to output outcomes. Following past studies [3,5,28], this research considers full-time employees, cost of goods sold (COGS), and operating expenses as the input, and sales revenues as the output. In this study, the three retailers in each year from 2005 to 2021 are treated as decision management units (DMUs). Data envelopment analysis (DEA) is applied to derive operational efficiencies and indicate the actions required to improve inefficient DMUs. Mathematically, the most efficient DMUs have unity operational efficiencies.

3. Proposed Techniques

Figure 1 details the proposed techniques. First, Lasso (least absolute shrinkage and selection operator) is adopted to identify key predictors that significantly affect the sales revenues of Walmart, Costco, and Kroger. Then, machine learning is applied to conduct sales forecasting. Second, the LVM (Lotka–Volterra model) is used to analyze market dynamics between the three retailers and estimate their stable equilibriums. Lastly, DEA (data envelopment analysis) is applied to derive operational efficiencies and indicate necessary actions for the improvement of inefficient firms. Without loss of generality, MARS (multivariate adaptive regression splines), SVR (support vector regression), and DNN (deep neural network) are adopted in sales forecasting.

Specifically, root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) are used to measure forecasting errors [23,29,39]:

\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} e_{i}^{2}},

(1)

\mathrm{MAE} = \frac{1}{n} \sum_{i = 1}^{n} | e_{i} |,

(2)

\mathrm{MAPE} = \frac{1}{n} \sum_{i = 1}^{n} \left| \frac{e_{i}}{y_{i}} \right|,

(3)

where n denotes the number of observations, and $e_{i} = F_{i} - y_{i}$ is the error between a predicted value ($F_{i}$) and the real data ($y_{i}$).
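
As a minimal sketch of Equations (1)–(3), the following Python function computes the three error measures for paired forecasts and observations. The function name and the sample numbers are illustrative assumptions, not values from this study.

```python
import math


def forecast_errors(predicted, actual):
    """Return (RMSE, MAE, MAPE) for paired forecasts F_i and observations y_i."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [f - y for f, y in zip(predicted, actual)]  # e_i = F_i - y_i
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    # MAPE is undefined when any y_i is zero; sales figures are assumed positive.
    mape = sum(abs(e / y) for e, y in zip(errors, actual)) / n
    return rmse, mae, mape


# Hypothetical quarterly sales (illustrative numbers only).
actual = [100.0, 110.0, 120.0, 130.0]
predicted = [98.0, 113.0, 119.0, 135.0]
rmse, mae, mape = forecast_errors(predicted, actual)
print(f"RMSE={rmse:.3f}  MAE={mae:.3f}  MAPE={mape:.2%}")
```

RMSE penalizes large errors more heavily than MAE, while MAPE is scale-free and therefore convenient for comparing firms of very different sizes.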

3.1. Statistical Learning

As opposed to the conventional unbiased regression, biased regression can balance the trade-off between forecasting errors and model complexities. Typical biased regression schemes include Ridge, Lasso, and Elastic Net [40,41]. The differences between them are the regularized distance measures: the L₁ norm is for Lasso (see Equation (4)), the squared L₂ norm is for Ridge (see Equation (5)), and a compromise is for Elastic Net (see Equation (6)). Specifically, the L₁ norm is the Manhattan distance ($ {‖ β ‖}_{1} = \sum_{i} | β_{i} | $) and the L₂ norm is the Euclidean distance ($ {‖ β ‖}_{2} = \sqrt{\sum_{i} β_{i}^{2}} $).

{\hat{β}}_{\mathrm{Lasso}} = \arg \min_{β} \left\{ {‖ Y - X β ‖}_{2}^{2} + λ {‖ β ‖}_{1} \right\},

(4)

{\hat{β}}_{\mathrm{Ridge}} = \arg \min_{β} \left\{ {‖ Y - X β ‖}_{2}^{2} + λ {‖ β ‖}_{2}^{2} \right\},

(5)

{\hat{β}}_{\mathrm{Elastic}} = \arg \min_{β} \left\{ {‖ Y - X β ‖}_{2}^{2} + λ {‖ β ‖}_{1} + (1 - λ) {‖ β ‖}_{2}^{2} \right\},

(6)

where Y is the response, X contains the multivariate predictors, $β$ represents the regression coefficients, and $λ$ is a regularization constant. Lasso is adopted to identify significant economic indicators because it can shrink the coefficients of many redundant predictors to exactly zero.
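
To illustrate how the L₁ penalty removes a weak predictor, the sketch below minimizes the Lasso objective of Equation (4) by cyclic coordinate descent. This is one common solver, not necessarily the one used in this study; the function names, the toy data, and the value of the regularization constant are assumptions made for illustration, and the predictors are assumed to be centred so that no intercept is needed.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0 inside [-threshold, threshold]."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise ||y - X b||_2^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of rows. Columns and y are assumed centred, so no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of predictor j with the partial residual
            z = 0.0    # squared norm of predictor j
            for i in range(n):
                fitted_others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fitted_others)
                z += X[i][j] ** 2
            # Setting the subgradient of the objective to zero gives this update.
            beta[j] = soft_threshold(rho, lam / 2.0) / z if z > 0 else 0.0
    return beta


# Toy data: y = 2.0 * (column 0) + 0.1 * (column 1); column 1 is a weak predictor.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.1, -1.9, 1.9, -2.1]
beta = lasso_coordinate_descent(X, y, lam=2.0)
print(beta)  # the weak predictor's coefficient is driven to zero
```

In this toy case the strong predictor keeps a shrunken coefficient while the weak one is eliminated, which is the selection behaviour exploited when screening economic indicators.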

Unlike multiple linear regression (MLR), multivariate adaptive regression splines (MARS) is a nonparametric and nonlinear methodology. It is defined as follows:

f (x) = a_{0} + \sum_{m = 1}^{M} a_{m} \prod_{k = 1}^{K_{m}} [S_{k, m} (x_{k, m} - t_{k, m})],

(7)

where $a_{0}$ is a constant, $a_{m}$ are the regression coefficients of the model, M is the number of basis functions (degree of nonlinearity), $K_{m}$ is the number of splits for the m-th basis function, $S_{k, m}$ takes values of either 1 or −1 to indicate the right or the left step function, $x_{k, m}$ are input variables, and $t_{k, m}$ are “knot” locations in each interval [40]. In Figure 2, a nonlinear mapping is approximated by using “knots,” in which BF means basis functions.
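
The structure of Equation (7) can be sketched in a few lines of Python. The sketch assumes that each bracketed factor is truncated at zero (the usual hinge form of a MARS basis function); the coefficients and the knot location are hypothetical.

```python
def hinge(sign, x, knot):
    """Truncated linear basis: max(0, sign * (x - knot)), with sign in {+1, -1}."""
    return max(0.0, sign * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate a MARS-style model.

    terms is a list of (coefficient, factors); each factor is
    (sign, variable_index, knot), and factors within a term are multiplied.
    """
    total = intercept
    for coefficient, factors in terms:
        product = 1.0
        for sign, var, knot in factors:
            product *= hinge(sign, x[var], knot)
        total += coefficient * product
    return total


# Hypothetical one-variable model with a single knot at 3.0.
terms = [
    (2.0, [(+1, 0, 3.0)]),   # active to the right of the knot
    (-0.5, [(-1, 0, 3.0)]),  # active to the left of the knot
]
for value in (1.0, 3.0, 5.0):
    print(value, mars_predict([value], 10.0, terms))
```

Each term contributes only on one side of its knot, so the fitted curve is piecewise linear with a change of slope at the knot.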

3.2. Machine Learning

Based on quadratic programming [41,42], Figure 3 shows how support vector regression (SVR) transforms the low-dimensional input space to the high-dimensional feature space using a linear cylindrical tube:

\text{Minimize} \quad \frac{1}{2} {‖ w ‖}^{2} + \frac{C}{2} \sum_{i = 1}^{n} (ξ_{i}^{2} + ζ_{i}^{2})

(8)

\text{subject to} \quad w^{T} ϕ (x_{i}) + b - y_{i} \leq ε + ξ_{i},

(9)

y_{i} - w^{T} ϕ (x_{i}) - b \leq ε + ζ_{i},

(10)

where n denotes the number of samples, w is the slope, and b is the intercept. Taking the derivatives of the Lagrangian with respect to w, b, $ξ$, and $ζ$ gives the KKT conditions:

w = \sum_{i = 1}^{n} (α_{i} - β_{i}) ϕ (x_{i}), \quad ξ_{i} = α_{i} / C, \quad \text{and} \quad ζ_{i} = β_{i} / C,

(11)

where $α_{i} \geq 0$ and $β_{i} \geq 0$ represent the Lagrangian multipliers of Constraints (9) and (10). Plugging Equation (11) back into the primal problem, together with $α_{i} β_{i} = 0$, yields the dual problem. Details can be found in [43,44].
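
To make the role of the tube width and the slack variables concrete, the sketch below evaluates the primal objective of Equation (8) for a given linear model. It assumes the identity feature map, ϕ(x) = x, and uses the smallest slacks that satisfy Constraints (9) and (10); the data and parameter values are hypothetical.

```python
def svr_primal_objective(w, b, X, y, C, eps):
    """Value of Eq. (8) for a linear feature map phi(x) = x.

    For fixed (w, b) the smallest feasible slacks under Constraints (9)-(10) are
    xi_i = max(0, f(x_i) - y_i - eps) and zeta_i = max(0, y_i - f(x_i) - eps).
    """
    penalty = 0.0
    for features, target in zip(X, y):
        prediction = sum(wj * xj for wj, xj in zip(w, features)) + b
        xi = max(0.0, prediction - target - eps)    # over-prediction beyond the tube
        zeta = max(0.0, target - prediction - eps)  # under-prediction beyond the tube
        penalty += xi ** 2 + zeta ** 2
    return 0.5 * sum(wj ** 2 for wj in w) + 0.5 * C * penalty


# Hypothetical data: the third point lies 1.0 above the line y = x.
X = [[1.0], [2.0], [3.0]]
y = [1.0, 2.0, 4.0]
objective = svr_primal_objective(w=[1.0], b=0.0, X=X, y=y, C=10.0, eps=0.5)
print(objective)
```

Points inside the tube contribute nothing to the penalty, so only the third observation, which lies outside the tube, adds to the objective.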

A deep neural network (DNN) consists of the input layer, multiple hidden layers, and the output layer. The error signal defined in Equation (12) needs to be back-propagated to adjust the weights between the neurons. When the mean squared error converges, the updating process stops, and the model is considered trained [44,45]:

\mathrm{Error} = \frac{1}{n} \sum_{s = 1}^{n} {[Y_{k} - f (w_{i j}, θ_{i}, X_{j})]}^{2}, \quad \forall j = 1, \dots, M, \quad \forall k = 1, \dots, L,

(12)

where n is the number of training samples, M is the dimension of the input variables, L is the dimension of the output variables, $w_{i j}$ means the weights from layer i to layer j, and $θ_{i}$ is the intercept. The universal approximation function, f, needs to be learned to conduct nonlinear fitting. As shown in Figure 4, the backward error signal is used to find the weights and intercepts that minimize the forecasting error, whereas the forward working signal produces the forecast. Based on the chain rule, the optimal weights and intercepts are derived to form a universal approximation that best defines the relationships between the predictors and the outcome [45]. Specifically, hyperparameters, such as the number of hidden layers and associated neurons, drop-out rate, activation function (hyperbolic tangent, sigmoid, and ReLU), and optimizer (stochastic gradient descent, gradient descent, AdaDelta, AdaGrad, etc.), need to be selected in model training. Thereafter, forecasting can be realized in model testing.
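
The forward and backward signals described above can be sketched with a very small network. The example below is a toy illustration only: it uses a single hidden layer with a hyperbolic tangent activation and plain per-sample gradient descent, whereas the network topology, hyperparameters, and optimizer used in this study are not reproduced here. All names and values are assumptions.

```python
import math
import random


def train_tiny_network(samples, hidden=3, lr=0.1, epochs=500, seed=0):
    """One-hidden-layer tanh network trained by per-sample gradient descent.

    samples is a list of (x, y) pairs with scalar x and y.
    Returns (predict, history), where history holds the mean squared error per epoch.
    """
    rng = random.Random(seed)
    w1 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # input -> hidden weights
    b1 = [0.0] * hidden                                   # hidden intercepts
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # hidden -> output weights
    b2 = 0.0                                              # output intercept

    def forward(x):
        h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
        return h, sum(w2[j] * h[j] for j in range(hidden)) + b2

    history = []
    for _ in range(epochs):
        squared_error = 0.0
        for x, y in samples:
            h, out = forward(x)  # forward working signal
            err = out - y        # error signal to be back-propagated
            squared_error += err * err
            # Chain rule; the factor 2 from d(err^2) is folded into lr.
            for j in range(hidden):
                grad_hidden = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * grad_hidden * x
                b1[j] -= lr * grad_hidden
            b2 -= lr * err
        history.append(squared_error / len(samples))

    return (lambda x: forward(x)[1]), history


# Toy nonlinear target y = x^2 on a small grid (illustrative only).
samples = [(k / 4, (k / 4) ** 2) for k in range(-4, 5)]
predict, history = train_tiny_network(samples)
print(f"first-epoch MSE={history[0]:.4f}, last-epoch MSE={history[-1]:.4f}")
```

Monitoring the per-epoch mean squared error, as in `history`, corresponds to the convergence criterion stated for Equation (12).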

3.3. Competitive Analysis

Based on the logistic equation, the Lotka–Volterra model (LVM) is adopted to capture the interactions between competing firms [14,18,23]. Differential equations are given as:

\frac{{d x}_{1}}{dt} = a_{1} x_{1} - b_{1} x_{1}^{2} - c_{1} x_{1} x_{2},

(13)

\frac{{d x}_{2}}{dt} = a_{2} x_{2} - b_{2} x_{2}^{2} - c_{2} x_{1} x_{2},

(14)

where $x_{i}$ can be modeled by adopting users, shipments, revenues, etc., $a_{i}$ denotes the growth ability of the firm itself, $b_{i}$ refers to the limitation of the firm during market expansion, and $c_{i}$ describes the interaction between the firm and its competitor. At equilibrium, the derivatives in Equations (13) and (14) are zero, and the two objects can be mutually estimated as $x_{1} = (a_{1} - c_{1} x_{2}) / b_{1}$ and $x_{2} = (a_{2} - c_{2} x_{1}) / b_{2}$. To use discrete data, the differential equations are converted into difference equations:

x_{1} (t + 1) = \frac{α_{1} x_{1} (t)}{1 + β_{1} x_{1} (t) + γ_{1} x_{2} (t)},

(15)

x_{2} (t + 1) = \frac{α_{2} x_{2} (t)}{1 + β_{2} x_{2} (t) + γ_{2} x_{1} (t)},

(16)

where $a_{i} = \ln α_{i}$, $b_{i} = β_{i} \ln α_{i} / (α_{i} - 1)$, and $c_{i} = γ_{i} \ln α_{i} / (α_{i} - 1)$ are used to estimate three important parameters, $α_{i}, β_{i}, γ_{i}$.
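
The difference equations (15) and (16) and the parameter mapping can be sketched as follows. The parameter values and starting sales levels are hypothetical and are not the estimates reported in this study; a negative γ is used here to encode a beneficial interaction, following the sign convention of Equations (13)–(16).

```python
import math


def step(x1, x2, p1, p2):
    """One update of Eqs. (15)-(16); p_i = (alpha_i, beta_i, gamma_i)."""
    alpha1, beta1, gamma1 = p1
    alpha2, beta2, gamma2 = p2
    next_x1 = alpha1 * x1 / (1.0 + beta1 * x1 + gamma1 * x2)
    next_x2 = alpha2 * x2 / (1.0 + beta2 * x2 + gamma2 * x1)
    return next_x1, next_x2


def to_continuous(alpha, beta, gamma):
    """Map (alpha, beta, gamma) to (a, b, c) of Eqs. (13)-(14).

    Requires alpha > 0 and alpha != 1.
    """
    log_alpha = math.log(alpha)
    scale = log_alpha / (alpha - 1.0)
    return log_alpha, beta * scale, gamma * scale


# Hypothetical parameters for two firms.
p1 = (1.2, 0.001, -0.0002)
p2 = (1.1, 0.002, -0.0001)
x1, x2 = 50.0, 20.0
for _ in range(500):
    x1, x2 = step(x1, x2, p1, p2)
print(round(x1, 2), round(x2, 2))  # levels after 500 periods
print(to_continuous(*p1))          # continuous-time (a, b, c) for firm 1
```

Iterating the map until the levels stop changing gives a numerical estimate of the equilibrium, which can be compared against the closed-form solution of the continuous model.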

The original LVM can be generalized to include more objects at a time. For clarity, managerial insights regarding the parameters in LVM are described in Table 2. The relationship between a firm and its rivals can be one of six types: pure competition (mutually harmful), mutualism (win-win), predator-prey (win-loss), amensalism (one-side harmful), commensalism (one-side beneficial), and neutralism (independent). Hence, LVM can clearly explain the market dynamics between firms [18,19]. Further, equilibriums occur when neither of the population levels is changing, that is, when both differential equations equal zero: $\frac{d x_{1}}{dt} = 0$ and $\frac{d x_{2}}{dt} = 0$. In this case, four possible equilibriums, $(x_{1}^{*}, x_{2}^{*})$, are derived: (1) $(x_{1}^{*}, x_{2}^{*}) = (0, 0)$, meaning both species disappear; (2) $(x_{1}^{*}, x_{2}^{*}) = (a_{1} / b_{1}, 0)$, meaning species 1 will survive while species 2 will disappear; (3) $(x_{1}^{*}, x_{2}^{*}) = (0, a_{2} / b_{2})$, meaning species 2 will survive while species 1 will disappear; and (4) $(x_{1}^{*}, x_{2}^{*}) = \left( \frac{a_{1} b_{2} - a_{2} c_{1}}{b_{1} b_{2} - c_{1} c_{2}}, \frac{a_{2} b_{1} - a_{1} c_{2}}{b_{1} b_{2} - c_{1} c_{2}} \right)$, meaning both species can survive. Each equilibrium point is stable only if the real parts of the eigenvalues of the Jacobian matrix,

J (x_{1}, x_{2}) = \begin{pmatrix} a_{1} - 2 b_{1} x_{1} - c_{1} x_{2} & - c_{1} x_{1} \\ - c_{2} x_{2} & a_{2} - 2 b_{2} x_{2} - c_{2} x_{1} \end{pmatrix},

are negative.
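
The coexistence equilibrium and the eigenvalue test can be checked numerically with a short sketch. The parameter values below are hypothetical (negative c values represent mutualism under the sign convention of Equations (13) and (14)); the eigenvalues of the 2×2 Jacobian are obtained from its trace and determinant.

```python
import cmath


def coexistence_equilibrium(a1, b1, c1, a2, b2, c2):
    """Interior equilibrium (case 4) of Eqs. (13)-(14)."""
    det = b1 * b2 - c1 * c2
    if det == 0:
        raise ValueError("b1*b2 - c1*c2 must be non-zero")
    return (a1 * b2 - a2 * c1) / det, (a2 * b1 - a1 * c2) / det


def jacobian_eigenvalues(x1, x2, a1, b1, c1, a2, b2, c2):
    """Eigenvalues of the 2x2 Jacobian via its trace and determinant."""
    j11 = a1 - 2 * b1 * x1 - c1 * x2
    j12 = -c1 * x1
    j21 = -c2 * x2
    j22 = a2 - 2 * b2 * x2 - c2 * x1
    trace = j11 + j22
    det = j11 * j22 - j12 * j21
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


def is_stable(eigenvalues):
    """An equilibrium is stable when every eigenvalue has a negative real part."""
    return all(ev.real < 0 for ev in eigenvalues)


# Hypothetical parameters (a1, b1, c1, a2, b2, c2).
params = (1.0, 0.01, -0.002, 0.8, 0.02, -0.001)
x1_star, x2_star = coexistence_equilibrium(*params)
print(round(x1_star, 3), round(x2_star, 3))
print("coexistence stable:", is_stable(jacobian_eigenvalues(x1_star, x2_star, *params)))
print("origin stable:", is_stable(jacobian_eigenvalues(0.0, 0.0, *params)))
```

With these illustrative values the coexistence point passes the stability test while the origin does not, which matches the intuition that two growing firms do not both vanish.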

3.4. Performance Assessment

Data envelopment analysis (DEA) is one of the most classic techniques used to measure operational efficiencies among the so-called DMUs (decision management units). There are two common measures [27]: one is BCC (Banker, Charnes, Cooper) and the other is CCR (Charnes, Cooper, Rhodes). The selection of input and output variables is critical to the efficiency measures and relative performances of the DMUs. Mathematically, operational efficiency for a specific DMU can be expressed as follows:

\max \; η_{k} = \frac{\sum_{r} u_{r} y_{r k}}{\sum_{i} v_{i} x_{i k} + v_{k o}}, \quad \text{subject to} \quad η_{k} \leq 1, \quad v_{i}, u_{r} \geq ε, \quad \forall i, r,

(17)

where $v_{i}$ and $u_{r}$ are the weights of the input ($x_{i k}$) and output ($y_{r k}$) variables, $ε$ is the so-called non-Archimedean number, an extremely small positive value that is usually set to 10⁻⁶, and i, r, and k, respectively, represent indices for input, output, and DMU. The parameter $v_{k o}$ distinguishes variable returns to scale (VRS) in BCC from constant returns to scale (CRS) in CCR, where $v_{k o} = 0$. For the input-oriented DEA, CCR can be replaced by solving the following formula [28,30]: $\max \sum_{r} u_{r} y_{r k}$, subject to $\sum_{i} v_{i} x_{i k} = 1$, $η_{k} \leq 1$, $v_{i}, u_{r} \geq ε$, $\forall i, r$. Based on the duality theorem in linear programming, the dual form can be solved as follows:

\min \; η_{k} = θ - ε \left[ \sum_{i} s_{i k}^{-} + \sum_{r} s_{r k}^{+} \right], \quad \text{subject to} \quad \sum_{k} λ_{k} x_{i k} - θ x_{i k} + s_{i k}^{-} = 0, \quad \sum_{k} λ_{k} y_{r k} - s_{r k}^{+} = y_{r k}, \quad η_{k}, s_{i k}^{-}, s_{r k}^{+}, λ_{k} \geq 0, \quad \forall i, r,

(18)

where $s_{i k}^{-}$ and $s_{r k}^{+}$ represent the slack (input excess) and the surplus (output shortfall), and $θ$ is a constant ratio of the reduction of input variables used for achieving an efficient DMU.

The main difference between CCR and BCC is the returns-to-scale assumption: CRS in CCR versus VRS in BCC. In simple words, CCR derives operational efficiency (OE) defined by the weighted output over the weighted input. To relax the CRS assumption in CCR, BCC separates OE into two multiplicative parts: scale efficiency (SE) and technical efficiency (TE). Scale efficiency is the ratio of existing inputs (or outputs) of DMUs to the inputs (or outputs) of optimal production scale. Specifically, $\sum_{k} λ_{k}$ can be used to justify the trends of returns to scale: $\sum_{k} λ_{k} < 1$ means increasing returns to scale (IRS), $\sum_{k} λ_{k} = 1$ means constant returns to scale (CRS), and $\sum_{k} λ_{k} > 1$ indicates decreasing returns to scale (DRS). If a DMU achieves Pareto efficiency, $η_{k} = 1$, no adjustment is required ($s_{i k}^{-} = s_{r k}^{+} = 0$). Otherwise, the required adjustments for an inefficient DMU ($η_{k} < 1$) are $x_{i k}^{*} = θ x_{i k} - s_{i k}^{-}$ and $y_{r k}^{*} = y_{r k} + s_{r k}^{+}$, where $x_{i k}$ ($x_{i k}^{*}$) and $y_{r k}$ ($y_{r k}^{*}$) are the input and output variables before (after) adjustment, and $s_{i k}^{-}$ and $s_{r k}^{+}$ are the desired adjustments.
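
The general DEA models above require a linear programming solver. As a simplified sketch only, the special case of one input and one output under constant returns to scale can be computed directly: the input-oriented CCR efficiency of each DMU is its output/input ratio divided by the best ratio observed, and the radial input target is θ times the current input with zero slack. The DMU data below are hypothetical and are not the three-input data used in this study.

```python
def ccr_efficiency_single(inputs, outputs):
    """Input-oriented CCR efficiency when every DMU has one input and one output.

    Under constant returns to scale this reduces to each DMU's output/input
    ratio divided by the best ratio observed.
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]


def input_targets(inputs, efficiencies):
    """Radial input targets x* = theta * x (slacks are zero in this special case)."""
    return [theta * x for x, theta in zip(inputs, efficiencies)]


# Hypothetical DMUs A, B, and C: a single input (e.g., cost) and a single output (e.g., sales).
inputs = [10.0, 20.0, 30.0]
outputs = [20.0, 30.0, 60.0]
efficiencies = ccr_efficiency_single(inputs, outputs)
targets = input_targets(inputs, efficiencies)
for name, theta, x, target in zip("ABC", efficiencies, inputs, targets):
    reduction = (x - target) / x
    print(f"DMU {name}: efficiency={theta:.2f}, required input reduction={reduction:.0%}")
```

The percentage reduction printed for each inefficient DMU is the same kind of quantity as an input adjustment expressed in percent.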

4. Experimental Results

To justify the validity of the presented framework, quarterly sales of the three retailers, Walmart, Costco, and Kroger, are collected from 2005/Q1 to 2021/Q4. For visualization, Figure 5 shows that Walmart significantly surpasses Costco and Kroger, and that all of them demonstrate seasonal variations. To help retail practitioners conduct sales forecasting, economic indicators [8] are treated as potential predictors. In particular, Lasso is applied to identify key performance indicators. As indicated by Table 3, CPI and regular wage are commonly identified for the three firms. PPI and oil price are only critical to Walmart because a hypermarket imports lots of goods (fashion clothes, home appliances, furniture, office supplies, sporting goods, electronic appliances, homemade tools, etc.) from foreign manufacturers, and it is more sensitive to the upstream variations. Specifically, DJT is critical to Costco and Kroger because they need frequent freight transportation to support logistics and inventory management. In contrast, GDP is key to Walmart and Costco, while PCE is influential to Walmart and Kroger.

Interestingly, many indicators, such as CCI, PMI, NMI, IMPI, and EXPI, are not critical to any of the three firms. The major product categories sold by a firm form a basis for the identified key predictors. Kroger is a chain supermarket selling uncooked food, such as vegetables, fruits, drinks, snacks, fish, meat, bread, and milk, and Costco also sells lots of well-cooked food and bath supplies. According to the different product categories sold by the three retailers, Walmart is closer to the upstream (producer) side while Kroger is closer to the downstream (customer) side. In contrast, Costco seems to sit roughly midway between Walmart and Kroger.

4.1. Forecasting Sales Based on Economic Indicators

After the most significant economic indicators have been identified with respect to the three retail firms, they are treated as the predictors for forecasting sales revenues. To justify the validity of these economic indicators, MARS, SVR, and DNN are compared in sales forecasting. MARS, SVR, and DNN originate from statistics, quadratic programming, and deep learning, respectively. Deep learning algorithms like RNN, GRU, and LSTM are not considered in this research, because they require lots of data samples to optimize their network topologies and the associated hyperparameters. Specifically, the training set (Table 4) is from 2005/Q1 to 2019/Q4 and the test set (Table 5) is from 2020/Q1 to 2021/Q4. For all three firms, it is interesting to observe that SVR exhibits the best performance in the training set, while MARS exhibits the best performance in the test set. Generally, the forecasts for Costco and Kroger are less accurate than those for Walmart, and their MAPEs are slightly greater than 10%. In data science, overfitting means good training performance but poor testing performance. The differences between the training set and the test set are limited, which suggests that no serious overfitting occurs in this research.
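
The chronological split described above can be sketched as follows. The helper names are hypothetical; only the date ranges come from the text. A time-ordered split, rather than a random one, keeps future quarters out of the training set.

```python
def quarter_labels(first_year, last_year):
    """List labels such as '2005/Q1' for every quarter in the inclusive year range."""
    return [f"{year}/Q{q}" for year in range(first_year, last_year + 1) for q in range(1, 5)]


def chronological_split(labels, last_training_label):
    """Split an ordered list so that everything up to last_training_label is training data."""
    cut = labels.index(last_training_label) + 1
    return labels[:cut], labels[cut:]


quarters = quarter_labels(2005, 2021)
train, test = chronological_split(quarters, "2019/Q4")
print(len(train), "training quarters;", len(test), "test quarters")
```

With quarterly data from 2005/Q1 to 2021/Q4, this yields 60 training quarters and 8 test quarters, which illustrates why data-hungry recurrent models were not considered.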

4.2. Analyzing the Interrelationships

LVM is adopted to analyze interactions between Walmart, Costco, and Kroger. In Table 6, it is found that the relationship known as mutualism exists between all pairs. This means that each firm can benefit from the existence of the other retailers. This result may imply that the whole retail market is still a growing pie. Since the three firms position themselves in different geographical locations, product categories, and consumer groups, they do not intensively compete with one another. Table 7 further estimates stable sales equilibriums, assuming that the interactive dynamics continue. Compared to the sales in 2021/Q4, Costco (+5.9%) and Kroger (+7.8%) significantly increase sales at market equilibriums, while Walmart (−2.8%) slightly decreases. In Table 7, the MAPEs for the three retailers are around 10%, and these results justify the validity of using LVM in interactive regression. More importantly, they provide a quantitative basis to estimate the degree of change in sales revenues.

In reality, sales revenues are affected by many factors, including manufacturing cost, pricing policies, promotion plans, channel competition, product positioning, geographical location, customer defection, etc. Recently, online e-commerce platforms, such as Amazon, spent lots of resources to compete with traditional retailers. A slogan, “just walk out,” is promoted by Amazon, asking consumers using their smartphone app to pick up their favorite food and simply walk out of the store without having to interact with a cashier. No cashiers are needed to serve on sites because artificial intelligence (AI) technologies automatically detect consumers’ motions and complete all transactions, including bill payments. This paradigm shift deserves observation, in order to evaluate the impact of AI technologies on future developments in the retail industry.

4.3. Deriving Operational Efficiencies and Performance Assessment

To conduct performance assessment, the correlation coefficients between the input variables (cost of goods sold (COGS), full-time employees, and operating expenses) and the output variable (sales revenue) are shown in Table 8. Positive coefficients imply that the output increases with the input, thus justifying the validity of the input and output variables. In terms of BCC measures, the operational efficiencies for Walmart, Costco, and Kroger are shown in Figure 6. Clearly, Kroger has significantly lagged behind Walmart and Costco since 2009. To address hidden causalities, all input and output variables are displayed in Figure 7, Figure 8, Figure 9 and Figure 10. Not surprisingly, Walmart exhibits the largest scales of input and output variables. Kroger shows almost equivalent sales (Figure 7) and COGS (Figure 8) to Costco, though it has more full-time employees (Figure 9) and higher operating expenses (Figure 10) than Costco. These observations clearly account for Kroger’s operational efficiency ranking being the lowest, because it consumes more input resources without generating higher sales revenues. Although Walmart expresses worry about Amazon’s move to retail markets, Kroger can potentially be more impacted by Amazon because it is a chain supermarket with community-based stores. To elicit more insights, partial operational efficiencies with respect to a single input variable are derived and shown in Table 9: Costco performs poorly in “COGS” and Kroger performs the worst in both “full-time employees” and “operating expenses”.

To help Kroger improve operational efficiencies, Table 10 shows the required adjustments of input resources in percentages. In 2005, no adjustments were required for any retailer. As shown in Figure 6, Kroger performed the worst from 2008 to 2019, and hence, it needed to concurrently reduce COGS, full-time employees, and operating expenses during these years. In 2010, 2012, and 2021, Costco and Walmart performed efficiently, and thus, no adjustments were required. Further, the reduction of COGS and operating expenses was more critical to Costco and Kroger, while Walmart seemed to focus on decreasing full-time employees. Although Walmart had the greatest scales of input resources and output revenues, it performed efficiently in many years: 2005, 2010, 2011, 2012, 2013, 2016, 2017, and 2021. In 2018, Walmart spent heavily to acquire an e-commerce platform because it wanted to defend its territory and compete with Amazon. This event, coupled with the US–China trade war, can explain Walmart’s inefficiencies in 2018 and 2019. On average, Costco did not perform as well as Walmart, but it was still more efficient than Kroger. In practice, reducing costs is an easier way to enhance operational efficiency than increasing sales revenues.
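
How an efficiency score translates into a required input adjustment can be sketched as follows. In input-oriented DEA, an inefficient unit with score theta is projected onto the frontier by scaling all of its inputs by theta; the sketch below applies only this radial step and ignores any remaining slacks. The function name, the input figures, and the score of 0.9 are hypothetical and serve only as an illustration.

```python
def input_targets(inputs, efficiency):
    """Radial projection for an input-oriented efficiency score.

    Returns, for every input, the target level (efficiency * current level)
    and the reduction needed to reach it. Slack adjustments are ignored.
    """
    plan = {}
    for name, value in inputs.items():
        target = efficiency * value
        plan[name] = {"target": target, "reduction": value - target}
    return plan


# Hypothetical input levels for one firm-year and a hypothetical score.
firm_inputs = {"cogs": 1000.0, "employees": 400.0, "operating_expenses": 200.0}
theta = 0.9

plan = input_targets(firm_inputs, theta)
for name, values in plan.items():
    print(f"{name}: target {values['target']:.1f}, cut {values['reduction']:.1f}")
```

With a score of 0.9, every input would need to shrink by about 10% while sales stay unchanged; an efficient unit (score 1.0) requires no adjustment.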

5. Discussions

Inspired by the concept of business analytics, this research presents an integrated framework to help retail managers address three critical issues: sales forecasting, market analysis, and performance assessment. Generally, business analytics has four specific modules: descriptive analytics (what happened in the past), diagnostic analytics (why it happened), predictive analytics (what will happen in the future), and prescriptive analytics (how to take action to remedy shortcomings). Specifically, sales forecasting covers diagnostic analytics and predictive analytics, market analysis covers descriptive analytics and predictive analytics, and performance assessment covers diagnostic analytics and prescriptive analytics. In sales forecasting, CPI and regular wage are identified as two common factors affecting retail sales for Walmart, Costco, and Kroger. The economic indicators used in this research are treated as leading signals of retail sales. Due to limited data, quarterly samples are collected from 2005/Q1 to 2021/Q4. However, major external events, such as the US–China trade war, COVID-19, and inflation since 2022, may impact sales revenues differently. Consequently, more data composed of monthly samples are required in order to validate the research findings.

In market analysis, the relationship known as mutualism is found to exist between the three firms. In other words, a firm can expect its sales to vary positively with those of its competitors (when one increases or decreases, the other moves in the same direction). Not surprisingly, this finding implies a common driver affecting retail sales. However, in terms of product varieties and market segmentation, Walmart, Costco, and Kroger are not homogeneous. Walmart possesses the greatest variety of products, such as those found in hypermarkets, while Kroger focuses on community supermarkets. In contrast, Costco seems to position itself midway between Walmart and Kroger. Thus, to reveal more insights, market analysis should be refined to target specific customer groups, and should include product categories and geographic areas. In addition, the results concerning market equilibrium indicate that Kroger has the greatest potential to increase sales. However, this implication does not take the competition from Amazon’s cashier-less stores into account. The paradigm shift arising from artificial intelligence and computer vision deserves observation in order to evaluate its potential impact on future developments in the retail sector.

Finally, regarding performance assessment, operational efficiencies are mathematically derived from input (resource) and output (outcome) variables. Following consultation with domain experts, COGS (cost of goods sold), full-time employees, and operating expenses are used as the inputs, while sales are used as the output. Operational efficiencies are derived annually. As opposed to sales forecasting and market analysis, performance assessment focuses on efficiency: how efficiently does a firm utilize its resources to generate an outcome? Profit margins are usually very low in the retail sector. Thus, to substantially enhance competitive advantage, improving operational efficiency may be more important than increasing sales for retail firms. Possible methods include decreasing input resources while keeping the same outcome, or using the same input resources while generating a higher level of outcome.

6. Conclusions

To help retail firms conduct sales forecasting, market analysis, and performance assessment, this research proposes a novel framework, and the top three US retail firms, Walmart, Costco, and Kroger, are used to evaluate the research validity. More importantly, these three critical issues are framed by the concept of business analytics from start to finish. In summary, the research contributions are outlined as follows:

A regularized regression known as Lasso is used to select the economic indicators for Walmart, Costco, and Kroger, and machine learning methods (MARS, SVR, DNN) are used for sales forecasting;
The Lotka–Volterra model is applied to conduct competitive analysis between the top three US retail firms, and to estimate stable market equilibriums in order to reveal insights;
Data envelopment analysis is used to derive operational efficiencies and to indicate the actions required for inefficient firms to improve their input resource variables.
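
To illustrate how Lasso can act as an indicator selector, the following is a minimal sketch of coordinate descent with soft thresholding, which drives the coefficients of uninformative predictors exactly to zero. It is not the implementation used in this research: the function names, the penalty value, and the tiny data set are hypothetical, and the predictors and response are assumed to be centered so that no intercept is needed.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # covariance of predictor j with the partial residual
            z = 0.0    # sum of squares of predictor j
            for i in range(n):
                pred = sum(X[i][k] * beta[k] for k in range(p))
                partial_resid = y[i] - pred + X[i][j] * beta[j]
                rho += X[i][j] * partial_resid
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n)
    return beta


# Hypothetical centered data: the response depends only on the first predictor.
X = [[-2, 1], [-1, -2], [0, 2], [1, -2], [2, 1]]
y = [-6.0, -3.0, 0.0, 3.0, 6.0]

beta = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print(beta, selected)
```

In this toy case the second predictor receives a coefficient of exactly zero and is dropped, while the coefficient of the first predictor is shrunk below its true value of 3, which is the usual trade-off of the L1 penalty.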

Experimental results show that the identified economic indicators, when incorporated into machine learning models, work well in sales forecasting (the average MAPEs are below or around 10%). In addition, the demonstrated interrelationship known as mutualism indicates that the total market is still a growing pie, and thus, each firm can benefit alongside the others. Finally, from 2009 to 2019, and also in 2021, Kroger performed the worst in operational efficiency. The suggested improvements include a decrease in full-time employees, a reduction of COGS, and a reduction of operating expenses.

This research is not without limitations: (1) due to limited information, only aggregate sales were collected and analyzed for Walmart, Costco, and Kroger. Product sales with respect to detailed categories (perishable food, home supplies, electronic appliances, snacks, etc.) could provide more insights [4]; (2) only onsite retailers with branch stores were analyzed and compared, while the competition arising from online e-commerce platforms, such as Amazon, was omitted. Moving forward, the boundary between onsite stores and online platforms is expected to blur, and hence, their competition deserves to be addressed [2,31]; and (3) the STP issues (market segmentation, customer targeting, and product positioning) should be considered in order to fit consumer preferences and accomplish upselling and cross-selling. Furthermore, purchasing transaction records should be linked to customer demographics to develop attractive product strategies and promotion plans.

Author Contributions

Conceptualization, framework, and article writing, C.-H.W.; data collection and methodology implementation, Y.-W.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Ministry of Science and Technology, Taiwan.

Data Availability Statement

All data used in this research were extracted from public lists of US economic indicators and from the financial reports of US retailers.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Martínez-de-Albéniz, V.; Belkaid, A. Here comes the sun: Fashion goods retailing under weather fluctuations. Eur. J. Oper. Res. 2021, 294, 820–830.
2. Ulrich, M.; Jahnke, H.; Langrock, R.; Pesch, R.; Senge, R. Distributional regression for demand forecasting in e-grocery. Eur. J. Oper. Res. 2021, 294, 831–842.
3. Brviera-Puig, A.; Buitrago-Vera, J.; Escribá-Pérez, C. Internal benchmarking in retailing with DEA and GIS: The case of loyalty-oriented supermarket chain. J. Bus. Econ. Manag. 2020, 21, 1035–1057.
4. Fildes, R.; Ma, S.; Kolassa, S. Retail forecasting: Research and practice. Int. J. Forecast. 2019, in press.
5. Vyt, D. Retail network performance evaluation: A DEA approach considering retailers’ geomarketing. Int. Rev. Retail. Distrib. Consum. Res. 2008, 18, 235–253.
6. Naseri, M.B.; Elliott, G. The diffusion of online shopping in Australia: Comparing the Bass, Logistic and Gompertz growth models. J. Mark. Anal. 2013, 1, 49–60.
7. Alon, I.; Qi, M.; Sadowski, R.J. Forecasting aggregate retail sales: A comparison of artificial neural networks and traditional methods. J. Retail. Consum. Serv. 2001, 8, 147–156.
8. Badorf, F.; Hoberg, K. The impact of daily weather on retail sales: An empirical study in brick-and-mortar stores. J. Retail. Consum. Serv. 2020, 52, 101921.
9. Sagaert, Y.R.; Aghezzaf, E.-H.; Kourentzes, N.; Desmet, B. Tactical sales forecasting using a very large set of macroeconomic indicators. Eur. J. Oper. Res. 2018, 264, 558–569.
10. Baumohl, B. The Secrets of Economic Indicators: Hidden Clues to Future Economic Trends and Investment Opportunities; Pearson: New York, NY, USA, 2007.
11. Ceylan, H.; Ozturk, H.K. Estimating energy demand of Turkey based on economic indicators using genetic algorithm approach. Energy Convers. Manag. 2004, 45, 2525–2537.
12. Choi, T.M.; Yu, Y.; Au, K.F. A hybrid SARIMA wavelet transform method for sales forecasting. Decis. Support Syst. 2011, 51, 130–140.
13. Griliches, Z. Patent Statistics as Economic Indicators: A Survey in R&D and Productivity: The Econometric Evidence; University of Chicago Press: Chicago, IL, USA, 1998.
14. Hua, G.B. Residential construction demand forecasting using economic indicators: A comparative study of artificial neural networks and multiple regression. Constr. Manag. Econ. 1996, 14, 25–34.
15. Tirole, J. The Theory of Industrial Organization; The MIT Press: New York, NY, USA, 1991.
16. Wu, O.; Chen, H. Chain-to-chain competition under demand uncertainty. J. Oper. Res. Soc. China 2016, 4, 49–75.
17. Thomassey, S. Sales forecasts in clothing industry: The key success factor of the supply chain management. Int. J. Prod. Econ. 2010, 128, 470–483.
18. Kreng, V.B.; Wang, T.C.; Wang, H.T. Tripartite dynamic competition and equilibrium analysis on global television market. Comput. Ind. Eng. 2012, 63, 75–81.
19. Tsai, B.H.; Li, Y.; Lee, G.H. Forecasting global adoption of crystal display televisions with modified product diffusion model. Comput. Ind. Eng. 2010, 58, 553–562.
20. Chu, C.W.; Zhang, G.P. A comparative study of linear and nonlinear models for aggregate retail sales forecasting. Int. J. Prod. Econ. 2003, 86, 217–231.
21. Sun, Z.L.; Choi, T.M.; Au, K.F.; Yu, Y. Sales forecasting using extreme learning machine with applications in fashion retail. Decis. Support Syst. 2008, 46, 411–419.
22. Xia, M.; Zhang, Y.; Weng, L.; Ye, X. Fashion retail forecasting based on extreme learning machine with adaptive metrics of inputs. Knowl.-Based Syst. 2012, 36, 253–259.
23. Hung, H.C.; Chiu, Y.C.; Wu, M.C. An enhanced application of Lotka–Volterra model to forecast the sales of two competing retail formats. Comput. Ind. Eng. 2017, 109, 325–334.
24. Hung, H.C.; Tsai, Y.S.; Wu, M.C. A modified Lotka–Volterra model for competition forecasting in Taiwan’s retail industry. Comput. Ind. Eng. 2014, 77, 70–79.
25. Stock, J.H.; Watson, M.W. New indexes of coincident and leading economic indicators. NBER Macroecon. Annu. 1989, 4, 351–394.
26. Wong, W.K.; Guo, Z.X. A hybrid intelligent model for medium-term sales forecasting in fashion retail supply chains using extreme learning machine and harmony search algorithm. Int. J. Prod. Econ. 2010, 128, 614–625.
27. Donthu, N.; Yoo, B. Retail Productivity Assessment Using Data Envelopment Analysis. J. Retail. 1998, 74, 89–105.
28. Vaz, C.B.; Camanho, A.S.; Guimarães, R.C. The assessment of retailing efficiency using Network Data Envelopment Analysis. Ann. Oper. Res. 2010, 173, 5–24.
29. Lin, C.J.; Lee, T.S. Tourism demand forecasting: Econometric model based on multivariate adaptive regression splines, artificial neural network and support vector regression. Adv. Manag. Appl. Econ. 2013, 3, 1–18.
30. Li, S.; Tsai, S. Efficiency of apparel retail at the firm level—An evaluation using data envelopment analysis. J. Text. Eng. Fash. Technol. 2018, 4, 131–138.
31. Qi, Y.; Li, C.; Deng, H.; Cai, M.; Qi, Y.; Deng, Y. A deep neural framework for sales forecasting in E-Commerce. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Beijing, China, 3–7 November 2019; pp. 299–308.
32. Ma, S.; Fildes, R. Retail sales forecasting with meta-learning. Eur. J. Oper. Res. 2021, 288, 111–128.
33. Worrell, E.; Price, L.; Martin, N.; Farla, J.; Schaeffer, R. Energy intensity in the iron and steel industry: A comparison of physical and economic indicators. Energy Policy 1997, 25, 727–744.
34. McLaren, N.; Shanbhogue, R. Using internet search data as economic indicators. Bank Engl. Q. Bull. 2011, 134–140.
35. Clements, M.P.; Galvão, A.B. Forecasting US output growth using leading indicators: An appraisal using MIDAS models. J. Appl. Econom. 2009, 24, 1187–1206.
36. Choi, S.C. Price competition in a channel structure with a common retailer. Mark. Sci. 1991, 10, 271–296.
37. Görsch, D.; Pedersen, M.K. eChannel Competition: A Strategic Approach to Electronic Commerce. In Proceedings of the 8th European Conference on Information Systems, Trends in Information and Communication Systems for the 21st Century, ECIS 2000, Vienna, Austria, 3–5 July 2000; Volume 167.
38. Wang, X.; Ng, C.T. New retail versus traditional retail in e-commerce: Channel establishment, price competition, and consumer recognition. Ann. Oper. Res. 2020, 291, 921–937.
39. Taylor, J.W.; Buizza, R. Using weather ensemble predictions in electricity demand forecasting. Int. J. Forecast. 2003, 19, 57–70.
40. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer Series in Statistics: New York, NY, USA, 2001.
41. Wang, C.H.; Chen, T.Y. Combining biased regression with machine learning to conduct supply chain forecasting and analytics for printing circuit board. Int. J. Syst. Sci. Oper. Logist. 2022, 9, 143–154.
42. Schölkopf, B.; Smola, A.J. Learning with Kernels, Support Vector Machines, Regularization, Optimization, and Beyond; The MIT Press: Cambridge, MA, USA, 2002.
43. Kecman, V. Learning and Soft Computing-Support Vector Machines, Neural Network, and Fuzzy Logic Models; The MIT Press: Cambridge, MA, USA, 2001.
44. Tan, P.N.; Steinbach, M.; Kumar, V. Introduction to Data Mining; Pearson: New York, NY, USA, 2010.
45. Haykin, S.O. Neural Networks and Learning Machines, 3rd ed.; Pearson: New York, NY, USA, 2009.

Figure 1. Proposed research techniques.

Figure 2. Multivariate adaptive regression splines (MARS). Dots denote data samples and the line is the fitted regression.

Figure 3. Support vector regression (SVR). Stars denote representative data samples, the solid line is the central fit, and the dashed lines denote the upper and lower limits.

Figure 4. Deep neural network (DNN).

Figure 5. Quarterly sales revenues for the three US firms (in millions of USD).

Figure 6. Operational efficiencies.

Figure 7. Sales revenues in millions of USD.

Figure 8. Cost of goods sold (COGS) in millions of USD.

Figure 9. Full-time employees.

Figure 10. Operating expenses in millions of USD.

Table 1. Overall comparison between this research and past studies.

References	Economic Indicators	Market Analysis	Performance Assessment	Methods
This research	*	*	*	Lasso, MARS, SVR, DNN, LVM, DEA
Chu and Zhang [20]				ARIMA, BPN
Donthu and Yoo [27]			*	DEA
Sun et al. [21]				Extreme learning
Vyt [5]			*	DEA
Thomassey [17]				ARIMA, BPN, HWS
Tsai et al. [19]		*		LVM
Wong and Guo [26]				ARIMA, BPN
Vaz et al. [28]			*	DEA
Choi et al. [12]				ARIMA, wavelet, ANN
Kreng et al. [18]		*		LVM
Xia et al. [22]				ARIMA, BPN
Lin and Lee [29]	*			MARS, BPN, SVR
Naseri and Elliott [6]				Bass, Logistic, Gompertz
Hung et al. [24]		*		LVM
Hung et al. [23]		*		LVM, ARIMA
Li and Tsai [30]			*	DEA
Sagaert et al. [9]	*			Lasso, ARIMA, HWS, ES, MLR, SR
Qi et al. [31]				DNN, RNN, GB
Brviera-Puig et al. [3]			*	DEA
Ma and Fildes [32]				ARIMA, BPN, ES, MLR, RF, GB, SVR
Ulrich et al. [2]				Demand distribution, QR, QRF, RF

ARIMA: autoregressive integrated moving average, BPN: back propagation neural network, DEA: data envelopment analysis, DNN: deep neural network, ES: exponential smoothing, (X)GB: (extreme) gradient boosting, HWS: Holt-Winters smoothing, Lasso: least absolute shrinkage and selection operator, LVM: Lotka–Volterra model, MARS: multivariate adaptive regression splines, MLR: multiple linear regression, QR: quantile regression, QRF: quantile regression forest, RF: random forest, RT: regression tree, SVR: support vector regression, VAR: vector autoregression. * means one of the three modules (economic indicators, market analysis, performance assessment) has been addressed.

Table 2. Relationship description according to the signs of interaction parameters.

c₁, c₂	Relationship	Explanation
+, +	Pure competition	Both suffer from each other’s existence
+, −	Predator-prey	Entity 1 serves as direct food to entity 2
−, −	Mutualism	The case of symbiosis (win-win)
+, 0	Amensalism	Entity 1 suffers from the existence of entity 2, who is impervious to what is happening
−, 0	Commensalism	Entity 1 benefits from the existence of entity 2, who nevertheless remains unaffected
0, 0	Neutralism	No interaction between each other
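
The sign rules in Table 2 can be expressed as a small lookup, sketched below. The function name is hypothetical, and treating the mirrored sign pairs (for example −, + as predator-prey with the roles reversed) as the same relationship is an assumption added for completeness. In practice an estimated coefficient is rarely exactly zero, so a statistically insignificant estimate would have to be mapped to zero before calling it.

```python
def classify_relationship(c1, c2):
    """Map the signs of two interaction parameters to a Table 2 relationship."""
    def sign(value):
        if value > 0:
            return "+"
        if value < 0:
            return "-"
        return "0"

    # Sorting the sign pair makes the lookup independent of entity order.
    key = tuple(sorted((sign(c1), sign(c2))))
    relationships = {
        ("+", "+"): "Pure competition",
        ("+", "-"): "Predator-prey",
        ("-", "-"): "Mutualism",
        ("+", "0"): "Amensalism",
        ("-", "0"): "Commensalism",
        ("0", "0"): "Neutralism",
    }
    return relationships[key]


# Interaction parameters c reported in Table 6 for the Walmart-Costco pair.
walmart_costco = classify_relationship(-5.697e-6, -4.405e-6)
print(walmart_costco)  # Mutualism
```

Both interaction parameters of the Walmart and Costco pair are negative, which is why Table 6 labels their relationship as mutualism.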

Table 3. Identified economic indicators for the three US retailers using Lasso. * means significant indicators.

Indicators	Walmart	Costco	Kroger
Consumer confidence index (CCI)
Consumer price index (CPI)	*	*	*
Personal consumption expenditure (PCE)	*		*
Gross domestic product (GDP)	*	*
Dow Jones Transportation (DJT)		*	*
Producer price index (PPI)	*
Purchase manager index (PMI)
Non-manufacturing purchase index (NMI)
Import price index (IMPI)
Export price index (EXPI)
Oil price	*
US dollar index (USDX)
Regular wage	*	*	*
Non-farming population (NFP)
Unemployment rate
Interest rate

Table 4. Forecasting errors for the training set (2005/Q1~2019/Q4).

Training	Walmart			Costco			Kroger
	MARS	SVR	DNN	MARS	SVR	DNN	MARS	SVR	DNN
RMSE	6250	6583	6616	3557	3889	3841	3541	3606	3414
MAE	295,647	289,486	5383	171,969	141,518	2526	141,155	136,233	2387
MAPE	4.5%	4.36%	4.86%	11.09%	8.53%	9.43%	12.27%	8.48%	9.41%

Table 5. Forecasting errors for the testing set (2020/Q1~2021/Q4).

Testing	Walmart			Costco			Kroger
	MARS	SVR	DNN	MARS	SVR	DNN	MARS	SVR	DNN
RMSE	7190	8986	9022	8080	10,210	9355	5699	5353	4978
MAE	48,830	61,676	7762	43,001	63,978	6982	29,132	37,298	3835
MAPE	4.27%	5.37%	5.43%	10.22%	15.12%	13.1%	9.4%	13.25%	10.49%
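
For readers who want to reproduce the error measures reported in Tables 4 and 5, the standard definitions of RMSE, MAE, and MAPE can be sketched as below. The actual and predicted values are hypothetical and only demonstrate the arithmetic; they are not the sales figures used in this research.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical quarterly sales and forecasts.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

print(f"RMSE = {rmse(actual, predicted):.3f}")
print(f"MAE  = {mae(actual, predicted):.3f}")
print(f"MAPE = {mape(actual, predicted):.2f}%")
```

RMSE and MAE are expressed in the units of the sales series, whereas MAPE is scale-free, which makes it the more convenient measure for comparing firms of very different sizes.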

Table 6. Pairwise analyses for the three retail firms.

Outcome	Predictor	Parameters (a, b, c)	Relationship
Walmart	Costco	1.466 **, 5.287 × 10⁻⁶, −5.697 × 10⁻⁶ *	Mutualism
Costco	Walmart	0.767 *, 8.976 × 10⁻⁶, −4.405 × 10⁻⁶ *	Mutualism
Costco	Kroger	0.768, 1.188 × 10⁻⁵, −2.266 × 10⁻⁵ **	Mutualism
Kroger	Costco	1.553 *, 3.481 × 10⁻⁵, −1.234 × 10⁻⁵ **	Mutualism
Kroger	Walmart	0.612 *, 2.450 × 10⁻⁵, −8.671 × 10⁻⁶ **	Mutualism
Walmart	Kroger	1.294 *, 4.169 × 10⁻⁶, −8.028 × 10⁻⁶ **	Mutualism

Significance level used: *** < 0.001, ** < 0.01, * < 0.05.

Table 7. Stable equilibriums of sales (million USD) using LVM.

	Walmart	Costco	Kroger
2021/Q4	152,871	51,904	33,048
Stable Equilibriums	148,468.68 (−2.8%)	54,989.79 (+5.9%)	35,637.64 (+7.8%)
R square	0.77	0.91	0.93
MAPE	6.92%	11.03%	5.78%

Table 8. Correlation coefficients between input and output variables.

	Sales Revenues	COGS	Full-Time Employees	Operating Expenses
Sales revenues	1
COGS	0.999	1
Employees	0.978	0.97	1
Operating expenses	0.988	0.98	0.984	1
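
The coefficients in Table 8 are ordinary Pearson correlations, which can be computed as in the minimal sketch below. The two short series are hypothetical and are not the annual figures used in this research.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical annual COGS (input) and sales (output) for one firm.
cogs = [10.0, 20.0, 30.0, 40.0]
sales_linear = [13.0, 26.0, 39.0, 52.0]  # exactly proportional to COGS
sales_noisy = [14.0, 25.0, 41.0, 50.0]   # roughly proportional to COGS

r_linear = pearson(cogs, sales_linear)
r_noisy = pearson(cogs, sales_noisy)
print(f"{r_linear:.3f} {r_noisy:.3f}")
```

A coefficient close to 1, as in Table 8, indicates that the output moves almost in step with the input, although it does not by itself establish causality.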

Table 9. Partial operational efficiencies for the top three US retailers.

	Revenue/COGS			Revenue/Employees			Revenue/Operating Expenses
	Walmart	Costco	Kroger	Walmart	Costco	Kroger	Walmart	Costco	Kroger
2005	0.977	0.942	1	0.52	1	0.427	0.837	1	0.417
2006	0.982	0.857	0.992	0.569	0.953	0.444	0.88	1	0.459
2007	0.982	0.857	0.984	0.631	0.963	0.436	0.922	0.972	0.471
2008	0.992	0.857	0.975	0.685	0.977	0.44	0.921	0.982	0.482
2009	0.996	0.86	0.982	0.657	0.939	0.434	0.903	0.935	0.435
2010	0.998	0.86	0.966	0.733	0.976	0.441	0.927	0.947	0.482
2011	0.994	0.856	0.95	0.753	0.988	0.462	0.961	0.97	0.511
2012	0.989	0.855	0.942	0.855	1	0.478	0.997	0.982	0.501
2013	0.991	0.856	0.943	0.803	0.957	0.489	1	0.973	0.532
2014	0.989	0.856	0.944	0.827	0.951	0.47	0.994	0.968	0.524
2015	0.992	0.86	0.96	0.827	0.939	0.447	0.971	0.944	0.502
2016	1	0.863	0.965	0.793	0.85	0.428	0.921	0.917	0.484
2017	1	0.861	0.959	0.818	0.864	0.431	0.919	0.937	0.48
2018	0.997	0.857	0.959	0.858	0.9	0.441	0.928	0.956	0.473
2019	0.993	0.858	0.96	0.919	0.913	0.428	0.956	0.945	0.461
2020	0.995	0.859	0.972	0.988	0.927	0.469	0.978	0.953	0.459
2021	1	0.854	0.96	1	1	0.451	1	1	0.459

Table 10. Suggested improvements of input resources for inefficient firms.

	Revenue/COGS			Revenue/Employees			Revenue/Operating Expenses
	Walmart	Costco	Kroger	Walmart	Costco	Kroger	Walmart	Costco	Kroger
2005
2006	−256.3			−78,650			−61.6
2007	−561.1	−289.4	−106.6	−137,565	−680	−620	−136	−32.6	−28.4
2008	−602.2	−191.9	−234.5	−55,969	−426	−1292	−154	−21.4	−60.4
2009	−602.4	−315.7	−403.6	−177,184	−735	−2282	−158	−37.0	−116
2010			−625.3			−3340		−251.8	−159.4
2011		−159.8	−770.9		−322	−3718		−244.9	−181.5
2012			−1268.3			−5763			−302.1
2013		−279.1	−472.3		−552	−2058		−31	−105.8
2014	−727.9	−400.1	−759.2	−4400	−780	−3375	−186.6	−44.6	−172.6
2015	−726.2	−101.3	−339.4	−4400	−200	−1600	−191	−11.6	−81.8
2016		−310.7	−529.3		−675	−2586		−36.8	−133
2017		−345.6	−1022		−717	−4873		−39.9	−257.4
2018	−1149.1	−504.6	−1072.9	−6900	−980	−4939	−1249.2	−56.8	−273.9
2019	−1958.8	−403.5	−1515.9	−11,000	−762	−7248	−540	−46	−397.3
2020	−825.6	−150.2		−4400	−273		−974.3	−17
2021			−1266.1			−5580			−332.5

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

