Article

Marketing Mix Modeling Using PLS-SEM, Bootstrapping the Model Coefficients

by
Mariano Méndez-Suárez
Department of Market Research and Quantitative Methods, ESIC Business & Marketing School, Pozuelo de Alarcón, 28223 Madrid, Spain
Mathematics 2021, 9(15), 1832; https://doi.org/10.3390/math9151832
Submission received: 8 July 2021 / Revised: 30 July 2021 / Accepted: 2 August 2021 / Published: 3 August 2021

Abstract

Partial least squares structural equation modeling (PLS-SEM) uses sampling bootstrapping to calculate the significance of the model parameter estimates (e.g., path coefficients and outer loadings). However, when data are time series, as in marketing mix modeling, sampling bootstrapping shows inconsistencies that arise because the series has an autocorrelation structure and contains seasonal events, such as Christmas or Black Friday, especially in multichannel retailing, making the significance analysis of the PLS-SEM model unreliable. The alternative proposed in this research uses maximum entropy bootstrapping (meboot), a technique specifically designed for time series, which maintains the autocorrelation structure and preserves, in the bootstrapped series, the timing of the seasonal events or structural changes that occurred in the original series. The results showed that meboot outperformed sampling bootstrapping in terms of the coherence of the bootstrapped data and the quality of the significance analysis.

1. Introduction

Marketing mix models use multiple regression to measure marketing effectiveness and efficiency [1]. In the case of multichannel retailers that sell online and offline and advertise on both offline and Internet media, a common approach to marketing mix modeling is to chain multiple regression models (based on conversations with consulting experts), i.e., first modeling the impact of advertising on online sales and then using this information to model offline sales. Recent research [2] proposed using partial least squares structural equation models (PLS-SEM) to measure the simultaneous impact of advertising in multichannel retailer contexts and to measure the effectiveness of the different advertising campaigns on web and store sales [3].
PLS-SEM has some desirable properties for marketing mix modeling because it is a causal modeling approach aimed at maximizing the explained variance of the dependent constructs, and because it is similar to multiple regression analysis, it is appropriate for prediction [4]. Moreover, and very relevant, PLS-SEM avoids the problem of indeterminacy and displays the factor scores [5], allowing the use of latent variable scores measured by one or several indicators in subsequent analyses [6]. Consequently, PLS-SEM is particularly useful for measuring the efficiency of marketing campaigns by attributing sales to each of the advertising channels and calculating marketing ROI [3].
However, because PLS-SEM does not assume normality, lack of extreme values, or symmetry in sample data [7], the parametric significance tests usually employed in linear models cannot be applied to test whether outer loadings and path coefficients are significant. Instead, PLS-SEM relies on a nonparametric sampling bootstrapping procedure [8] to test the significance of estimated coefficients. This bootstrapping methodology involves repeated random sampling with replacement from the original sample to create bootstrap samples. It is a good procedure for estimating sampling distributions under independent and identically distributed (i.i.d.) random variables [9], even in situations in which the i.i.d. setup is slightly violated [10], as with cases in which there might be changes in the mean or variance (i.e., the survey is conducted in different countries or with heterogenous respondents) [11,12].
Although sampling bootstrapping is a proper method to measure the significance of the coefficients in most PLS-SEM applications, it is not recommended for marketing mix time series because the data has internal structure and the sampling bootstrapping method can change the dates of events, such as Black Friday or Christmas, or introduce several additional events or none at all in a given year. It also does not respect the time intervals of the structural changes that the series may have.
As an alternative to sampling bootstrapping, we propose maximum entropy (meboot) bootstrapping [13], which maintains the basic shape of each time series and its time dependence structure, as described by the autocorrelation function (ACF) and the partial autocorrelation function (PACF). Additionally, when applying meboot bootstrapping, the results inherit this structure while respecting the dates of special events such as Black Friday as well as any structural changes.
Despite its importance, little research has been done in the area of time series significance analysis using PLS-SEM models, especially with regard to marketing mix analysis. Furthermore, current research does not highlight the relevance and importance of the application of consistent bootstrap methodologies for solving these types of problems; this research makes important contributions by filling this void. For these reasons, the overall aim of this paper is to provide a detailed empirical demonstration of the advantages of the suggested meboot bootstrapping procedure in comparison with sampling bootstrapping to calculate the significance of PLS-SEM model parameter estimates in a time series or marketing mix modeling context. To this end, we based our analysis on standardized data from a European consumer electronics multichannel company [2] containing web and store sales and online and offline advertising activities.
Given this aim, the remainder of this paper is structured as follows. First, the theoretical foundations are explained. Then, the data used in this research is analyzed, and next, both bootstrapping methods are applied to finally discuss the results.

2. Theoretical Foundation

2.1. PLS-SEM

PLS-SEM is a technique appropriate for solving marketing mix problems even when very complex relationships exist [14] because the optimization algorithm maximizes the variance explained of the model’s endogenous constructs, making it especially appropriate to identify key variables in situations of weak theory [15] or verify whether the hypothesized relationships are empirically acceptable [16], for example, those involving marketing mix model variables. Regarding its statistical properties, PLS-SEM admits single item constructs without identification or convergence problems [17]; moreover, PLS-SEM models can handle extremely non-normal data with asymmetries and very high levels of skewness, for example, those corresponding to marketing events such as Black Friday. PLS-SEM is also appropriate for the typical small sample sizes of marketing mix models, such as in our case of 120 weekly observations corresponding to approximately 2.5 years of weekly data.
Earlier applications of PLS-SEM to solve marketing mix problems focused on better understanding the direct and cross effects of advertising on sales. Early research [18] studied the impact of the interaction of radio and print advertising in the opening of checking and savings accounts at a commercial bank, finding evidence of direct and cross effects between both media. More recent research [19] added Internet advertising variables to measure the impact of print advertising and paid search on a service company, finding a crossover effect on online conversions.
Recently, [2] PLS-SEM applied to marketing mix showed evidence of the amplifying effect of organic search queries on the advertising and, consequently, the sales of a multichannel retailer. Additionally, the PLS-SEM [3] model was used to calculate the ROI of offline and Internet advertising campaigns.
To verify the statistical significance of the PLS-SEM model parameters, the literature proposes using sampling bootstrapping; the next section discusses the reasons.

2.2. Sampling Bootstrapping

The term bootstrapping is inspired by the story of the Baron of Munchausen [20], who explained how he pulled himself and his horse out of a swamp by his own hair, meaning that the Baron saved himself by his own means. In this sense, the homonymous statistical technique developed by Efron [9] is similar because bootstrapping draws conclusions about the characteristics of a population using the sample itself; in other words, given the absence of information about the population, the sample is assumed to be the best estimate of the population [21], making this method very appropriate when, as is the case with PLS-SEM, there is no knowledge about the distribution of the parameters.
To find the empirical sampling distribution of a parameter, bootstrapping draws a number of samples with replacement (recommended: 5000) [4], each containing the same number of observations as the original series so that the samples obtained have the same statistical properties as the original sample; i.e., if the data contain 120 observations, as in the present research, 5000 samples of 120 observations each are generated. Sampling with replacement effectively treats the finite sample as an infinite population. For each sample, a PLS-SEM model is estimated, and the coefficients of interest are stored, creating a distribution of 5000 bootstrap estimates for each path coefficient or outer loading of interest. For example, when analyzing the loading λ of an indicator, we obtain 5000 values of the estimate λ*, which are then ordered from smallest to largest:
$$\lambda^{*}_{1}, \lambda^{*}_{2}, \ldots, \lambda^{*}_{5000} \tag{1}$$
Then, the lower and upper bounds of the confidence interval are identified; i.e., if the desired confidence level is 95%, the interval runs from the ordered observation in position 5000 × 0.025 = 125 to the ordered observation in position 5000 × 0.975 = 4875. The resulting confidence interval (CI)
$$CI = \left[\lambda^{*}_{125}, \lambda^{*}_{4875}\right] \tag{2}$$
suggests that the population value of λ lies between $\lambda^{*}_{125}$ and $\lambda^{*}_{4875}$ with 95% confidence. Once the confidence interval is calculated, if it does not include 0, we may consider the coefficient significant at the 95% level.
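As an illustration only, the percentile procedure described above can be sketched in Python. The sketch assumes i.i.d. resampling and, to remain self-contained, uses the sample mean as a stand-in for a PLS-SEM estimate; the names `percentile_ci` and `sample_mean` and the simulated data are hypothetical and are not part of the original analysis, which was carried out in R.

```python
import random


def sample_mean(values):
    """Stand-in statistic; in the article this would be a PLS-SEM coefficient."""
    return sum(values) / len(values)


def percentile_ci(values, statistic, n_boot=5000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval under i.i.d. resampling."""
    rng = random.Random(seed)
    n = len(values)
    # Each resample has the same size as the original sample.
    estimates = sorted(
        statistic([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    tail = (1.0 - level) / 2.0
    lower_rank = max(round(n_boot * tail), 1)   # 125 for 5000 draws at 95%
    upper_rank = round(n_boot * (1.0 - tail))   # 4875 for 5000 draws at 95%
    # Ranks are 1-based in the text; list indices are 0-based.
    return estimates[lower_rank - 1], estimates[upper_rank - 1]


# Hypothetical data: 120 simulated observations.
data_rng = random.Random(42)
data = [data_rng.gauss(0.3, 1.0) for _ in range(120)]
lower, upper = percentile_ci(data, sample_mean)
print(f"95% percentile CI: [{lower:.3f}, {upper:.3f}]")
print("interval excludes 0:", not (lower <= 0.0 <= upper))
```

With 5000 replicates and a 95% level, the function reads off the 125th and 4875th ordered estimates, matching Equations (1) and (2).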
However, as stated previously, in many cases, because of the nature of the data, the distribution of the parameters is asymmetric and the percentile method is subject to coverage error as stated by [7], meaning that, for example, a 95% confidence interval may actually be a 90% confidence interval. Hence, it is recommended to construct bias-corrected percentile confidence intervals to make statistical inferences when using PLS-SEM. Using bias-corrected and accelerated (BCa) bootstrap confidence intervals solves this problem by adjusting for biases and skewness in the bootstrap distribution [22]; for a detailed step-by-step explanation of the methodology in a PLS-SEM context, see [23].
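The BCa adjustment can be sketched as follows, under stated assumptions: i.i.d. resampling, the bias correction taken from the share of bootstrap estimates below the original estimate, and the acceleration estimated by the jackknife, as in the standard construction of [22]. The function name `bca_ci`, the stand-in statistic, and the simulated skewed data are hypothetical; this is not the R code used in the article.

```python
import random
from statistics import NormalDist


def sample_mean(values):
    """Stand-in statistic; in the article this would be a PLS-SEM coefficient."""
    return sum(values) / len(values)


def bca_ci(values, statistic, n_boot=2000, level=0.95, seed=0):
    """Bias-corrected and accelerated bootstrap interval (i.i.d. resampling)."""
    values = list(values)
    rng = random.Random(seed)
    n = len(values)
    theta_hat = statistic(values)
    boot = sorted(
        statistic([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    norm = NormalDist()

    # Bias correction: how far the bootstrap median is from the estimate.
    share_below = sum(b < theta_hat for b in boot) / n_boot
    share_below = min(max(share_below, 1.0 / (n_boot + 1)), n_boot / (n_boot + 1))
    z0 = norm.inv_cdf(share_below)

    # Acceleration from leave-one-out (jackknife) estimates.
    jack = [statistic(values[:i] + values[i + 1:]) for i in range(n)]
    jack_mean = sum(jack) / n
    cubes = sum((jack_mean - j) ** 3 for j in jack)
    squares = sum((jack_mean - j) ** 2 for j in jack)
    accel = cubes / (6.0 * squares ** 1.5) if squares > 0 else 0.0

    def adjusted_level(alpha):
        z = norm.inv_cdf(alpha)
        return norm.cdf(z0 + (z0 + z) / (1.0 - accel * (z0 + z)))

    tail = (1.0 - level) / 2.0
    lower_idx = min(max(int(adjusted_level(tail) * n_boot), 0), n_boot - 1)
    upper_idx = min(max(int(adjusted_level(1.0 - tail) * n_boot), 0), n_boot - 1)
    return boot[lower_idx], boot[upper_idx]


# Hypothetical right-skewed data.
data_rng = random.Random(3)
skewed = [data_rng.expovariate(1.0) for _ in range(60)]
print(bca_ci(skewed, sample_mean))
```

When the bootstrap distribution is symmetric around the estimate, both corrections are close to zero and the interval reduces to the plain percentile interval.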
In the case of time series data, such as marketing mix model variables, this methodology has a major drawback because, by definition, resampling does not preserve the order of the data, the autocorrelation structure, or the exact time of marketing-associated events such as Black Friday. To solve these problems, the present research proposes the maximum entropy bootstrapping methodology for analyzing the significance of time series coefficients, which will be explained next.
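The loss of autocorrelation can be seen in a small sketch. The series below is a hypothetical, noise-free 52-week seasonal cycle rather than the retailer's data, and the helper names are illustrative.

```python
import math
import random


def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation of a sequence of numbers."""
    n = len(series)
    centre = sum(series) / n
    numerator = sum(
        (series[t] - centre) * (series[t - 1] - centre) for t in range(1, n)
    )
    denominator = sum((value - centre) ** 2 for value in series)
    return numerator / denominator


def iid_resample(series, rng):
    """Sampling bootstrap: draw len(series) observations with replacement."""
    return [series[rng.randrange(len(series))] for _ in series]


# Hypothetical weekly series: a smooth 52-week cycle over 120 weeks.
weekly = [math.sin(2.0 * math.pi * week / 52.0) for week in range(120)]
shuffled = iid_resample(weekly, random.Random(0))

print("lag-1 autocorrelation, original:", round(lag1_autocorrelation(weekly), 3))
print("lag-1 autocorrelation, resample:", round(lag1_autocorrelation(shuffled), 3))
```

The original series is strongly autocorrelated by construction, whereas the resampled series has the same values in an arbitrary order, so its lag-1 autocorrelation is expected to be close to zero.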

2.3. Maximum Entropy Bootstrapping

Carlstein [24], aware that time series do not satisfy the i.i.d. assumption required by bootstrapping and that shuffling the data breaks the internal structure of a time series, proposed a solution suitable for stationary time series: bootstrapping nonoverlapping blocks of observations instead of individual observations. On the basis of this idea, the methodology was later improved with the proposal of moving (overlapping) blocks [25,26]. However, even after these improvements, the methods still require the series to be stationary and therefore provide no remedy when this property is violated.
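A minimal sketch of the nonoverlapping block idea follows; the function name and the block length of 12 weeks are arbitrary choices for illustration and are not taken from [24].

```python
import random


def nonoverlapping_block_bootstrap(series, block_length, rng):
    """Resample whole blocks of consecutive observations with replacement."""
    # Cut the series into consecutive, nonoverlapping blocks of equal length.
    blocks = [
        series[start:start + block_length]
        for start in range(0, len(series) - block_length + 1, block_length)
    ]
    resampled = []
    while len(resampled) < len(series):
        resampled.extend(rng.choice(blocks))
    return resampled[:len(series)]


# Week indices stand in for observations so that block moves are visible.
weeks = list(range(120))
replicate = nonoverlapping_block_bootstrap(weeks, 12, random.Random(0))
print(replicate[:24])
```

The order of observations is kept within each block, but the blocks themselves are relocated, so an event such as Black Friday can still land in a different week of the replicate or appear several times.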
As a solution to time series bootstrapping, Vinod and López-de-Lacalle [13] proposed the application of the principle of maximum entropy (ME), explained in depth by [27]. According to Vinod [28], ME is a powerful tool to avoid unnecessary distributional assumptions, such as i.i.d. or stationarity assumptions. ME constructs a population of time series, called ensemble Ω, which can include regime switches, gaps, or jump discontinuities. With f(x) being the density function of $x_t$, the entropy H (Equation (3)) is defined as:
$$H = E\left[-\log f(x)\right], \tag{3}$$
Maximizing the entropy H in a density f(x) function, defined in terms of Shannon information [29], means that we are finding the smoothest possible probability distribution that meets the constraints derived from prior knowledge about the mean and variance of the original series. The meboot algorithm constructs segments of ME density f(x) subject to certain mass- and mean-preserving constraints.
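As a minimal worked case (an illustration, not taken from the original article), consider a density restricted to an interval [a, b] with no further constraints. The uniform density f(x) = 1/(b − a) maximizes H, and its entropy follows directly from Equation (3):

$$H = E\left[-\log f(x)\right] = -\int_a^b \frac{1}{b-a}\,\log\frac{1}{b-a}\,dx = \log(b-a).$$

Any density on [a, b] that concentrates mass in part of the interval has lower entropy, which is the sense in which the ME density is the smoothest choice compatible with the constraints.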
The meboot algorithm [13] is a procedure that generates a large number of replicates, e.g., 5000, of the original series, which can be used for statistical inference; it then applies the “blocking” technique to break the time series into nonoverlapping blocks such that the grand mean of all the simulated samples equals the time average of the original, constructing bootstrap samples, or ensembles, that retain the basic shape and dependence structure of the original data. Figure 1 shows the actual series of web sales used in this research, explained in the next section, as well as two random ensembles generated with the meboot algorithm.
Moreover, the approach can be applied in the presence of structural breaks, such as economic crises or recoveries, as well as jumps due to Black Friday sales in which both offline and online sales may “jump” sharply above the mean. For more information on meboot, Vinod [30] provides extensive Monte Carlo evidence that supports the use of the meboot in empirical work and suggests that the meboot confidence intervals are reliable.
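The way a maximum entropy replicate can keep events at their original dates may be sketched as follows. This is a deliberately simplified illustration and not the meboot package implementation: it sorts the data, draws new values from a piecewise-uniform density over the intervals between sorted observations, and places the draws back according to the original ranks. It omits the mean-preserving constraints and uses a plain average of absolute changes for the tail limits; the function name and the example series are hypothetical.

```python
import math
import random


def max_entropy_style_replicate(series, rng):
    """Simplified rank-preserving replicate (not the full meboot algorithm)."""
    n = len(series)
    # Time indices ordered by ascending value: the ordering index.
    order = sorted(range(n), key=lambda t: series[t])
    sorted_values = [series[t] for t in order]

    # Interval edges: midpoints between sorted values plus two tail limits.
    spread = sum(abs(series[t] - series[t - 1]) for t in range(1, n)) / (n - 1)
    edges = (
        [sorted_values[0] - spread]
        + [(sorted_values[k] + sorted_values[k + 1]) / 2.0 for k in range(n - 1)]
        + [sorted_values[-1] + spread]
    )

    # Each interval carries probability mass 1/n and is uniform inside.
    draws = []
    for _ in range(n):
        u = rng.random() * n
        k = min(int(u), n - 1)
        draws.append(edges[k] + (u - k) * (edges[k + 1] - edges[k]))
    draws.sort()

    # The k-th smallest draw goes to the date of the k-th smallest original.
    replicate = [0.0] * n
    for rank, t in enumerate(order):
        replicate[t] = draws[rank]
    return replicate


# Hypothetical series with a single sales jump in week 47.
original = [math.sin(week / 5.0) for week in range(120)]
original[47] += 5.0
replicate = max_entropy_style_replicate(original, random.Random(1))
print("peak week, original:", max(range(120), key=original.__getitem__))
print("peak week, replicate:", max(range(120), key=replicate.__getitem__))
```

Because the draws are reassigned by rank, the largest value of the replicate falls in the same week as the largest value of the original series, while its size varies from one replicate to the next.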

3. Materials and Methods

3.1. Data

To conduct the present research, we used data from Méndez-Suárez and Monfort [2], which contain a time series over 120 weeks from a European consumer electronics multichannel retailer, including information on investment in offline, Internet, and paid search advertising, as well as Google queries containing the name of the retailer and the online and offline sales. Table 1 depicts the descriptive statistics of the standardized values of the original data; some variables, such as online sales, queries, and retargeting, show high levels of skewness and excess kurtosis.

3.2. Methods

To compare the results of sampling versus meboot bootstrapping, we used the PLS-SEM model from [2], depicted in Figure 2. The online and offline media in which the multichannel retailer advertised during the period are represented as two reflective latent constructs; the rest of the exogenous variables included in the structural model are single item constructs.
The latent variable online advertising included display, Facebook, retargeting, Twitter, and YouTube, and the latent variable offline advertising contained store flyers and TV advertising (Equation (4)).
$$
\begin{aligned}
\text{Online}_t &= \text{Display}_t\,\lambda_1 + \text{Facebook}_t\,\lambda_2 + \text{Retargeting}_t\,\lambda_3 + \text{Twitter}_t\,\lambda_4 + \text{YouTube}_t\,\lambda_5 \\
\text{Offline}_t &= \text{StoreFlyer}_t\,\lambda_6 + \text{TVAdvertising}_t\,\lambda_7
\end{aligned}
\tag{4}
$$
The structural model contained four endogenous variables (Equation (5)): queries, explained by online and offline advertising; web sales and store sales, both explained by queries, online and offline advertising, paid search, and Christmas; and paid search, explained by queries.
$$
\begin{aligned}
\text{Queries}_t &= \text{Online}_t\,\beta_1 + \text{Offline}_t\,\beta_2 \\
\text{WebSales}_t &= \text{Queries}_t\,\beta_3 + \text{Online}_t\,\beta_4 + \text{Offline}_t\,\beta_5 + \text{PaidSearch}_t\,\beta_6 + \text{Christmas}_t\,\beta_7 \\
\text{StoreSales}_t &= \text{Queries}_t\,\beta_8 + \text{Online}_t\,\beta_9 + \text{Offline}_t\,\beta_{10} + \text{PaidSearch}_t\,\beta_{11} + \text{Christmas}_t\,\beta_{12} \\
\text{PaidSearch}_t &= \text{Queries}_t\,\beta_{13}
\end{aligned}
\tag{5}
$$
The PLS-SEM model from Figure 2 was used to bootstrap the latent variable outer loadings and the path coefficients using sampling and meboot; the results are presented in the following section.
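For readers who want to check the specification, the structural model of Equation (5) can be written down as a plain mapping from each endogenous variable to its predictors. The dictionary layout and helper names below are illustrative choices and do not correspond to the plspm input format.

```python
# Endogenous variable -> predictors, following Equation (5).
INNER_MODEL = {
    "Queries": ["Online", "Offline"],
    "WebSales": ["Queries", "Online", "Offline", "PaidSearch", "Christmas"],
    "StoreSales": ["Queries", "Online", "Offline", "PaidSearch", "Christmas"],
    "PaidSearch": ["Queries"],
}


def count_paths(model):
    """Number of path coefficients (beta_1 ... beta_13 in Equation (5))."""
    return sum(len(predictors) for predictors in model.values())


def exogenous_constructs(model):
    """Variables that act only as predictors, in order of first appearance."""
    found = []
    for predictors in model.values():
        for name in predictors:
            if name not in model and name not in found:
                found.append(name)
    return found


print("endogenous variables:", list(INNER_MODEL))
print("exogenous variables:", exogenous_constructs(INNER_MODEL))
print("path coefficients:", count_paths(INNER_MODEL))
```

Each bootstrap replicate, whether generated by sampling or by meboot, re-estimates all of these path coefficients together with the seven outer loadings of Equation (4).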

4. Empirical Results

To compare the results of sampling and meboot, we bootstrapped 5000 subsamples of the PLS-SEM model and calculated the bias-corrected and accelerated (BCa) confidence intervals [7]. Bootstrapping of the structural model employed the R [31] packages plspm [32] and meboot [13]. The BCa confidence interval calculation in R followed that of Streukens and Leroi-Werelds [23]. The discriminant validity of the model, assessed with the heterotrait–monotrait (HTMT) ratio of correlations, was computed with the R semTools package [33].

4.1. Correlations

The correlation of the original series and two random draws of the meboot and sample bootstrap are shown in Table 2a–c. The results showed similar correlations between the original and the bootstrapped variables; there were no significant differences to suggest that one method is better than the other or that one of the methods has major flaws and cannot be used to assess the significance of the results. Next, we analyze the results of the bootstrapped confidence intervals.

4.2. Reliability, Validity, Structural Model, and Fit Assessment

Following [7], to assess the reflective measurement model, we evaluated convergent validity using the average variance extracted (AVE), internal consistency reliability with Cronbach’s α and Jöreskog’s ρ (composite reliability), and discriminant validity using HTMT. The mathematical formulations are represented in Equation (6) (a–d), respectively.
$$
\begin{aligned}
&(a)\quad AVE_{\xi_j} = \frac{1}{K_j}\sum_{k=1}^{K_j}\lambda_{jk}^{2}; \qquad
(b)\quad \text{Cronbach's } \alpha = \frac{N\cdot\bar{c}}{1+(N-1)\cdot\bar{c}}; \\
&(c)\quad \text{Jöreskog's } \rho = \frac{\left(\sum_{i=1}^{N} l_i\right)^{2}}{\left(\sum_{i=1}^{N} l_i\right)^{2}+\sum_{i=1}^{N}\operatorname{var}(e_i)}; \\
&(d)\quad HTMT_{ij} = \frac{1}{K_i K_j}\sum_{g=1}^{K_i}\sum_{h=1}^{K_j} r_{ig,jh} \div \left(\frac{2}{K_i(K_i-1)}\sum_{g=1}^{K_i-1}\sum_{h=g+1}^{K_i} r_{ig,ih}\cdot\frac{2}{K_j(K_j-1)}\sum_{g=1}^{K_j-1}\sum_{h=g+1}^{K_j} r_{jg,jh}\right)^{\frac{1}{2}}
\end{aligned}
\tag{6}
$$
The AVE for construct $\xi_j$ is defined as the average of the squared loadings $\lambda_{jk}^{2}$, i.e., the explained variances, of the $K_j$ indicators of the reflective construct. In Cronbach’s α, N is the number of lower-order components (i = 1, …, N), and $\bar{c}$ is the average correlation between the lower-order components. In Jöreskog’s ρ, $l_i$ is the loading of the lower-order component i on a particular higher-order construct, and $\operatorname{var}(e_i)$ is the variance of the measurement error of the lower-order component i. As explained by [34], the HTMT of constructs $\xi_i$ and $\xi_j$ with $K_i$ and $K_j$ indicators, respectively, is the average of the correlations of indicators across constructs measuring different phenomena relative to the (geometric mean of the) average correlations of indicators within the same construct.
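The four criteria of Equation (6) can be evaluated with a few lines of Python. The sketch below uses hypothetical loadings and correlations, and it assumes standardized indicators so that the error variance in Jöreskog’s ρ can be taken as 1 − l²; the function names are illustrative and this is not the semTools implementation.

```python
import math
from statistics import mean


def ave(loadings):
    """Average of the squared outer loadings, Equation (6a)."""
    return mean(l ** 2 for l in loadings)


def cronbach_alpha(n_items, mean_correlation):
    """Standardized Cronbach's alpha, Equation (6b)."""
    return n_items * mean_correlation / (1.0 + (n_items - 1) * mean_correlation)


def joreskog_rho(loadings, error_variances=None):
    """Composite reliability, Equation (6c)."""
    if error_variances is None:
        # Assumption: standardized indicators, so var(e_i) = 1 - l_i^2.
        error_variances = [1.0 - l ** 2 for l in loadings]
    squared_sum = sum(loadings) ** 2
    return squared_sum / (squared_sum + sum(error_variances))


def htmt(corr, items_i, items_j):
    """Heterotrait-monotrait ratio, Equation (6d); needs >= 2 items each."""
    def monotrait(items):
        return mean(
            corr[g][h] for a, g in enumerate(items) for h in items[a + 1:]
        )

    heterotrait = mean(corr[g][h] for g in items_i for h in items_j)
    return heterotrait / math.sqrt(monotrait(items_i) * monotrait(items_j))


# Hypothetical indicator correlations: items 0-1 form one construct, 2-3 another.
corr = [
    [1.00, 0.64, 0.32, 0.32],
    [0.64, 1.00, 0.32, 0.32],
    [0.32, 0.32, 1.00, 0.64],
    [0.32, 0.32, 0.64, 1.00],
]
print("AVE:", ave([0.8, 0.6]))
print("alpha:", cronbach_alpha(2, 0.5))
print("rho:", joreskog_rho([0.8, 0.6]))
print("HTMT:", htmt(corr, [0, 1], [2, 3]))
```

In a bootstrap setting, each of these functions would be applied to every replicate so that confidence intervals such as those in Table 3b can be formed for the criteria themselves.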
Table 3 shows the BCa confidence intervals of the reflective measurement model assessment using both bootstrapping methodologies. For the outer loadings of the latent variables (Table 3a), the two methods agreed on the significance of the loadings, but the intervals were consistently wider when using sampling bootstrapping, which means that the results are much more dispersed when this methodology is used.
However, the problems become especially severe when assessing the reflective constructs (Table 3b) because the sampling bootstrap intervals are in all cases at least three times as wide as the meboot intervals; consequently, the latent variables are not validated in terms of AVE, Cronbach’s α, and Jöreskog’s ρ; the HTMT is validated but by hundredths of a percent.
The confidence intervals of the regression coefficients (Table 4a) had similar widths and led to similar conclusions about significance for all paths except the path from offline advertising to web sales, for which the sampling bootstrap method yielded a non-significant coefficient; taken at face value, this would imply that offline advertising does not affect the sales of the web store.
As [35] stated, the different meaning of the term fit does not depend on whether covariance-based SEM or variance-based SEM is used but on whether confirmatory or explanatory research is performed (see [36]). Since in explanatory research, as in this case, we would like to explain as much variation as possible in a dependent variable, the R2 is the natural measure of fit; however, as occurred in the assessment of the reflective constructs, the confidence intervals of the R2 (Table 4b) of the sampling bootstrapped values were very wide and invalidated the model, contrary to the meboot values, which showed high levels of fit in line with the results of the model application shown in Figure 3.
To understand what explains the differences between the bootstrapping methodologies, we need to visually inspect the entire time series. Figure 3 shows the original series and two random paths of the sampling and meboot series for both online and offline sales. The sampling bootstrapped series contained sales jumps of the kind associated with events such as Christmas and Black Friday, but at very different times from those in the original series; for example, in the case of offline sales (Figure 3a), it included up to 10 jumps, only one of which fell on the date on which it actually occurred. Conversely, at the times when these events actually occurred, the sampling bootstrapped series in most cases did not reflect them. In the meboot series, on the other hand, the jumps occurred at the same times as in the original series, although, as expected for maximum entropy modeling, some replicas reproduced them more strongly than others.
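The displacement of events under sampling bootstrapping can be reproduced with a toy series. The two event weeks and the threshold below are hypothetical values chosen for illustration; they are not the dates in the retailer's data.

```python
import random


def spike_weeks(series, threshold=3.0):
    """Weeks in which the standardized series exceeds the threshold."""
    return [week for week, value in enumerate(series) if value > threshold]


def iid_resample(series, rng):
    """Sampling bootstrap: draw len(series) observations with replacement."""
    return [series[rng.randrange(len(series))] for _ in series]


# Hypothetical standardized sales: flat except for two event weeks.
sales = [0.0] * 120
for event_week in (47, 99):
    sales[event_week] = 5.0

rng = random.Random(0)
print("original event weeks:", spike_weeks(sales))
for draw in range(3):
    print("resample", draw + 1, "event weeks:", spike_weeks(iid_resample(sales, rng)))
```

Each resample contains a random number of jumps at random weeks, which is the pattern described for the sampling bootstrapped series in Figure 3.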

5. Discussion

PLS-SEM methodology using i.i.d. data has been very successful in areas such as marketing, strategic management, management information systems, production, and operations or accounting [37], and it is a promising methodology for time series, especially marketing mix modeling [2,3]. However, to succeed in these areas, the traditional method to measure the significance of the structural model and the outer loadings using sampling bootstrapping should be reconsidered because this method shuffles the data without considering their internal structure or respecting the order of the sequence, the autocorrelation structure, and the moments of occurrence of special events.
The present research presents a detailed analysis of the consequences of using sampling bootstrapping for time series, especially marketing mix series, and shows the risks of relying on it: the method destroys the internal structure of the series and produces wider confidence intervals for the outer loadings of the models. As a solution for these types of time-series analyses in PLS-SEM contexts, when the exact placement in time of the bootstrapped data is essential, as in marketing mix analyses, this study recommends using meboot bootstrapping as an alternative and provides empirical evidence of its suitability for time series or marketing mix modeling with PLS-SEM.
Additionally, this research contributes to the development of PLS-SEM methodology by providing a technique free of the risks associated with sampling bootstrapping in time series analysis, broadening the scope and accuracy of the methodology in other areas of research. Taken as a whole, the contributions of the present research provide valuable insights into how the evaluation of time series dependencies can be effectively performed using PLS-SEM analysis and why, to measure the significance of outer loadings and path coefficients, it is so relevant to apply a bootstrapping technique that is specifically adapted to time series and compatible with their time structure.
The managerial implications of this work are twofold: (1) practitioners have to be very careful when analyzing time series using PLS-SEM if the data are not i.i.d. and smooth in terms of shape because sampling bootstrapping shuffles the time series and destroys its integrity. In this respect, the present research shows that the use of sampling bootstrapping for time series involves very high risks, especially those associated with the assessment of the reflective constructs and the significance of the path coefficients; this finding constitutes one of the main contributions of this article. (2) The meboot bootstrapping procedure respects the internal structure of the data and maintains the timing of special marketing events, making it a trustworthy technique for time series analysis.
The methodology proposed in the present research can be an excellent source of innovation for PLS-SEM methodology, extending its possible application to all areas that use time series analysis for explanatory purposes and are susceptible to the potential application of PLS-SEM predictive analysis, such as those related, for example, to quality control in industrial processes or the evolution of natural ecosystems.
Three limitations of the present research may become avenues for future research. The model in this study is limited to marketing mix series, and the proposed methodology has not been tested in other time series contexts in which PLS-SEM models may be used. Additionally, the proposed model and meboot bootstrap methodology were tested on a time series of only 120 observations and not on a series with a larger number of observations. In addition, since the model only uses reflective constructs, evaluation of time series models with formative constructs would complement the results of this research.

Funding

This research was funded by ESIC Business and Marketing School, grant number 1-M-2019.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Méndez-Suárez, M.; Estevez, M. Calculation of marketing ROI in marketing mix models, from ROMI, to marketing-created value for shareholders, EVAM. Universia Bus. Rev. 2016, 52, 18–75. [Google Scholar]
  2. Méndez-Suárez, M.; Monfort, A. The amplifying effect of branded queries on advertising in multi-channel retailing. J. Bus. Res. 2020, 112, 254–260. [Google Scholar] [CrossRef]
  3. Méndez-Suárez, M.; Monfort, A. Marketing Attribution in Omnichannel Retailing in Springer Proceedings in Business and Economics; Springer: Cham, Switzerland, 2021; pp. 114–120. ISBN 9783030189105. [Google Scholar]
  4. Hair, J.F.; Ringle, C.M.; Sarstedt, M. PLS-SEM: Indeed a Silver Bullet. J. Mark. Theory Pract. 2011, 19, 139–152. [Google Scholar] [CrossRef]
  5. Fornell, C. A Second Generation of Multivariate Analysis: An Overview; Fornell, C., Ed.; Praeger: New York, NY, USA, 1982. [Google Scholar]
  6. Henseler, J.; Ringle, C.M.; Sinkovics, R.R. The use of partial least squares path modeling in international marketing. Adv. Int. Mark. 2009, 20, 277–319. [Google Scholar]
  7. Hair, J.F.J.; Hult, G.T.; Ringle, C.; Sarstedt, M. A Primer on Partial Least Squares Structural Equation Modeling (PLS-SEM), 2nd ed.; SAGE Publications: Thousand Oaks, CA, USA, 2017; ISBN 9781483377445. [Google Scholar]
  8. Efron, B.; Tibshirani, R.J. An Introduction to the Bootstrap; Monographs on Statistics and Applied Probability; Chapman & Hall/CRC: New York, NY, USA, 1993; ISBN 978-0-412-04231-7. [Google Scholar]
  9. Efron, B. Bootstrap Methods: Another Look at the Jackknife. Ann. Stat. 1979, 7, 1403–1433. [Google Scholar] [CrossRef]
  10. Liu, R.Y. Bootstrap Procedures under some Non-I.I.D. Models. Ann. Stat. 1988, 16, 1696–1708. [Google Scholar] [CrossRef]
  11. Richter, N.F.; Hauff, S.; Schlaegel, C.; Gudergan, S.; Ringle, C.M.; Gunkel, M. Using Cultural Archetypes in Cross-cultural Management Studies. J. Int. Manag. 2016, 22, 63–83. [Google Scholar] [CrossRef]
  12. Benítez-Márquez, M.D.; Bermúdez-González, G.; Sánchez-Teba, E.M.; Cruz-Ruiz, E. Exploring the antecedents of cruisers’ destination loyalty: Cognitive destination image and cruisers’ satisfaction. Mathematics 2021, 9, 1218. [Google Scholar] [CrossRef]
  13. Vinod, H.D.; López-de-Lacalle, J. Maximum Entropy Bootstrap for Time Series: The meboot R Package. J. Stat. Softw. 2009, 29, 1–29. [Google Scholar] [CrossRef] [Green Version]
  14. Ramírez-Orellana, A.; Martínez, M.D.C.V.; Grasso, M. Using Higher-Order Constructs to Estimate Health-Disease Status: The Effect of Health System Performance and Sustainability. Mathematics 2021, 9, 1228. [Google Scholar] [CrossRef]
  15. Wold, H. Partial Least Squares; Wiley: New York, NY, USA, 1985; ISBN 0471667196. [Google Scholar]
  16. Chin, W.W. Issues and Opinion on Structural Equation Modeling. MIS Q. 1998, 22, 7–16. [Google Scholar]
  17. Usakli, A.; Kucukergin, K.G. Using partial least squares structural equation modeling in hospitality and tourism. Int. J. Contemp. Hosp. Manag. 2018, 30, 3462–3512. [Google Scholar] [CrossRef]
  18. Jagpal, H.S. Measuring joint advertising effects in multiproduct firms. J. Advert. Res. 1981, 21, 65–69. [Google Scholar]
  19. Olbrich, R.; Schultz, C.D. Multichannel advertising: Does print advertising affect search engine advertising? Eur. J. Mark. 2014, 48, 1731–1756. [Google Scholar] [CrossRef]
  20. Raspe, R.E. The Surprising Adventures of Baron Munchausen; Standard Ebooks: Nevada County, CA, USA, 1781. [Google Scholar]
  21. Abdi, H.; Chin, W.W.; Vinzi, V.E.; Russolillo, G.; Trinchera, L. New Perspectives in Partial Least Squares and Related Methods. In Springer Proceedings in Mathematics and Statistics; Abdi, H., Chin, W.W., Esposito Vinzi, V., Russolillo, G., Trinchera, L., Eds.; Springer: New York, NY, USA, 2013; Volume 56, pp. 201–208. ISBN 978-1-4614-8282-6. [Google Scholar]
  22. Efron, B. Better bootstrap confidence intervals. J. Am. Stat. Assoc. 1987, 82, 171–185. [Google Scholar] [CrossRef]
  23. Streukens, S.; Leroi-Werelds, S. Bootstrapping and PLS-SEM: A step-by-step guide to get more out of your bootstrap results. Eur. Manag. J. 2016, 34, 618–632. [Google Scholar] [CrossRef]
  24. Carlstein, E. The Use of Subseries Values for Estimating the Variance of a General Statistic from a Stationary Sequence. Ann. Stat. 1986, 14, 1171–1179. [Google Scholar] [CrossRef]
  25. Kunsch, H.R. The Jackknife and the Bootstrap for General Stationary Observations. Ann. Stat. 1989, 17, 1217–1241. [Google Scholar] [CrossRef]
  26. Liu, R.Y.; Singh, K. Moving blocks jackknife and bootstrap capture weak dependence. Explor. Limits Bootstrap 1992, 225, 248. [Google Scholar]
  27. Baldwin, R.A. Use of Maximum Entropy Modeling in Wildlife Research. Entropy 2009, 11, 854–866. [Google Scholar] [CrossRef]
  28. Vinod, H.D. Maximum Entropy Bootstrap Algorithm Enhancements; Discussion Paper Series; Fordham University: Bronx, NY, USA, 2013; Volume 2013-04. [Google Scholar]
  29. Shannon, C.E. A Mathematical Theory of Communication. Bell Syst. Tech. J. 1948, 27, 379–423. [Google Scholar] [CrossRef] [Green Version]
  30. Vinod, H. New bootstrap inference for spurious regression problems. J. Appl. Stat. 2016, 43, 317–335. [Google Scholar] [CrossRef]
  31. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2021.
  32. Sanchez, G. PLS Path Modeling with R; Trowchez Editions: Berkeley, CA, USA, 2013.
  33. Jorgensen, T.D.; Pornprasertmanit, S.; Schoemann, A.M.; Rosseel, Y. semTools: Useful Tools for Structural Equation Modeling, Version 0.5-5; R package, 2021. Available online: https://cran.r-project.org/web/packages/semTools/semTools.pdf (accessed on 1 August 2021).
  34. Henseler, J.; Ringle, C.M.; Sarstedt, M. A new criterion for assessing discriminant validity in variance-based structural equation modeling. J. Acad. Mark. Sci. 2015, 43, 115–135.
  35. Henseler, J.; Hubona, G.; Ray, P.A. Using PLS path modeling in new technology research: Updated guidelines. Ind. Manag. Data Syst. 2016, 116, 2–20.
  36. Henseler, J. Partial least squares path modeling: Quo vadis? Qual. Quant. 2018, 52, 1–8.
  37. Hair, J.F.; Sarstedt, M.; Hopkins, L.; Kuppelwieser, V.G. Partial least squares structural equation modeling (PLS-SEM): An emerging tool in business research. Eur. Bus. Rev. 2014, 26, 106–121.
Figure 1. Plot of the standardized EUR series of web sales data used in this research, explained in the next section, and two random ensembles.
Figure 2. The PLS-SEM model used to illustrate the sample and meboot bootstrapping results comparison. Figure adapted with permission; the article was published in Journal of Business Research, 112, Méndez-Suárez, M.; Monfort, A. The amplifying effect of branded queries on advertising in multichannel retailing, 254–260, Copyright Elsevier (2020).
Figure 3. Original weekly sales series and their respective sampling and meboot counterparts: (a) offline sales; (b) online sales. The horizontal axis represents time in weeks and the vertical axis represents the standardized sales series, in standard deviations.
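The contrast between the original series and its sampling bootstrap counterpart can be illustrated with a minimal sketch. It assumes a simulated autocorrelated series (not the article's sales data) and uses hypothetical helper names; it shows only that drawing observations with replacement, ignoring time order, removes the lag-1 autocorrelation. It does not implement meboot.

```python
import random

def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation of a sequence of numbers."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    den = sum((x - mean) ** 2 for x in series)
    return num / den

def sampling_bootstrap(series, rng):
    """Draw len(series) observations with replacement, ignoring time order."""
    return [rng.choice(series) for _ in series]

rng = random.Random(42)

# Hypothetical autocorrelated weekly series (AR(1) with coefficient 0.8).
series = [0.0]
for _ in range(199):
    series.append(0.8 * series[-1] + rng.gauss(0, 1))

original_acf = lag1_autocorrelation(series)

# Average lag-1 autocorrelation over 200 sampling-bootstrap replicates.
resampled_acfs = [
    lag1_autocorrelation(sampling_bootstrap(series, rng)) for _ in range(200)
]
mean_resampled_acf = sum(resampled_acfs) / len(resampled_acfs)

print(f"original lag-1 autocorrelation:  {original_acf:.2f}")
print(f"mean after sampling bootstrap:   {mean_resampled_acf:.2f}")
```

In this simulated setting the original series is strongly autocorrelated, whereas the resampled replicates are close to uncorrelated over time.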
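Table 1. Descriptive statistics of the data.

| Variables | Median | Min | Max | Skewness | Kurtosis |
| --- | --- | --- | --- | --- | --- |
| Online Sales | −0.2 | −0.6 | 9.0 | 6.8 | 54.5 |
| Offline Sales | −0.3 | −0.7 | 5.5 | 3.4 | 12.4 |
| Queries | −0.3 | −0.8 | 6.7 | 4.7 | 26.7 |
| Paid Search | −0.1 | −1.8 | 5.3 | 1.5 | 6.3 |
| Store flyer | 0.2 | −1.1 | 2.4 | 0.2 | −1.3 |
| TV advertising | 0.1 | −1.4 | 4.6 | 1.1 | 3.7 |
| Display | 0.0 | −1.3 | 3.5 | 1.3 | 2.4 |
| Facebook | −0.2 | −1.5 | 3.7 | 1.0 | 1.2 |
| Retargeting | 0.0 | −1.1 | 7.6 | 3.7 | 25.0 |
| Twitter | 0.0 | −1.2 | 5.0 | 1.8 | 5.8 |
| YouTube | −0.2 | −0.9 | 3.7 | 1.3 | 2.0 |
| Christmas | −0.2 | −0.2 | 5.4 | 5.1 | 24.6 |
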
Note: Data represent standardized EUR with a mean of 0 and standard deviation of 1. Christmas is a dummy binary variable representing Christmas Eve and Epiphany.
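As a rough illustration of how statistics of this kind can be obtained, the following sketch standardizes a series and computes its median, minimum, maximum, skewness, and kurtosis. The data are hypothetical, and two choices are assumptions rather than the article's stated procedure: the population standard deviation is used for standardization, and kurtosis is reported as excess kurtosis (which the negative value for Store flyer in Table 1 suggests).

```python
import statistics

def standardize(values):
    """Return z-scores with mean 0 and (population) standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

def skewness(z):
    """Third moment of an already standardized series."""
    return statistics.fmean([v ** 3 for v in z])

def excess_kurtosis(z):
    """Fourth moment of an already standardized series, minus 3."""
    return statistics.fmean([v ** 4 for v in z]) - 3.0

def describe(values):
    """Descriptive statistics of the standardized series."""
    z = standardize(values)
    return {
        "median": statistics.median(z),
        "min": min(z),
        "max": max(z),
        "skewness": skewness(z),
        "kurtosis": excess_kurtosis(z),
    }

# Hypothetical weekly sales in EUR with one seasonal spike (not the article's data).
weekly_sales = [100, 110, 95, 105, 98, 102, 400, 101, 97, 103]
summary = describe(weekly_sales)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

A single seasonal spike is enough to produce a negative median together with large positive skewness, the pattern seen for the sales series in Table 1.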
Table 2. (a) Correlation coefficients of the time series, (b) correlation coefficients of one randomly selected series from meboot, and (c) correlation coefficients of one randomly selected series from sampling bootstrap.
(a) Correlations of Original Series
123456789101112
1 Online Sales100
2 Offline Sales76100
3 Queries9275100
4 Paid Search696853100
5 Store Flyers33303819100
6 TV Advertising23624646100
7 Display32738124266100
8 Facebook34202728284248100
9 Retargeting57524464118928100
10 Twitter32323191758645112100
11 YouTube36152626294746713452100
12 Christmas3257293462−11239−412100
(b) Correlation of one random series, meboot bootstrapping
123456789101112
1 Online Sales100
2 Offline Sales81100
3 Queries8086100
4 Paid Search727460100
5 Store Flyers35374526100
6 TV Advertising101122841100
7 Display142840143867100
8 Facebook29322829294241100
9 Retargeting4640376713121230100
10 Twitter82820201755625312100
11 YouTube25272426304338703553100
12 Christmas5618223141−21332−414100
(c) Correlation of one random series, sampling bootstrapping
123456789101112
1 Online Sales100
2 Offline Sales90100
3 Queries9391100
4 Paid Search596554100
5 Store Flyers39334018100
6 TV Advertising12420242100
7 Display413346234763100
8 Facebook31242631324353100
9 Retargeting5055547422113339100
10 Twitter442330332854615930100
11 YouTube23112020325655713269100
12 Christmas536734−15−19−7−424−15−11100
Note: Data values are percentages. Bootstrapped.
Table 3. Assessment of the reflective measurement model latent variables by meboot and sampling bootstrapping. (a) Convergent validity of the outer model. (b) Reliability of the outer model. (c) Discriminant validity.

(a) Outer Loading Convergent Validity Bootstrap Results

| Indicators | Loadings | 95% BCa CI Meboot | CI Amplitude (Meboot) | >0.5? (Meboot) | 95% BCa CI Sampling | CI Amplitude (Sampling) | >0.5? (Sampling) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Store flyer | 0.93 | (0.87, 0.93) | 0.10 | Yes | (0.75, 0.97) | 0.22 | Yes |
| TV advertising | 0.75 | (0.64, 0.83) | 0.14 | Yes | (0.58, 0.87) | 0.30 | Yes |
| Display | 0.65 | (0.65, 0.75) | 0.09 | Yes | (0.24, 0.81) | 0.57 | No |
| Facebook | 0.78 | (0.71, 0.80) | 0.11 | Yes | (0.53, 0.87) | 0.34 | Yes |
| Retargeting | 0.66 | (0.64, 0.79) | 0.15 | Yes | (0.50, 0.88) | 0.38 | Yes |
| Twitter | 0.67 | (0.65, 0.76) | 0.05 | Yes | (0.24, 0.86) | 0.62 | No |
| YouTube | 0.80 | (0.67, 0.82) | 0.19 | Yes | (0.63, 0.88) | 0.25 | Yes |

| Latent Variables | AVE | 95% BCa CI Meboot | CI Amplitude (Meboot) | >0.5? (Meboot) | 95% BCa CI Sampling | CI Amplitude (Sampling) | >0.5? (Sampling) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Online ad | 0.51 | (0.51, 0.55) | 0.04 | Yes | (0.35, 0.62) | 0.27 | No |
| Offline ad | 0.72 | (0.66, 0.76) | 0.11 | Yes | (0.63, 0.8) | 0.17 | Yes |

(b) Latent Variables Internal Consistency Reliability Bootstrap Results

| Latent Variables | Cronbach’s Alpha | 95% BCa CI Meboot | CI Amplitude (Meboot) | 0.60–0.90? (Meboot) | 95% BCa CI Sampling | CI Amplitude (Sampling) | 0.60–0.90? (Sampling) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Online ad | 0.78 | (0.77, 0.8) | 0.03 | Yes | (0.70, 0.84) | 0.14 | Yes |
| Offline ad | 0.63 | (0.6, 0.71) | 0.11 | Yes | (0.44, 0.77) | 0.34 | No |

| Latent Variables | Jöreskog’s ρ | 95% BCa CI Meboot | CI Amplitude (Meboot) | >0.7? (Meboot) | 95% BCa CI Sampling | CI Amplitude (Sampling) | >0.7? (Sampling) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Online ad | 0.85 | (0.85, 0.87) | 0.02 | Yes | (0.45, 1) | 0.55 | No |
| Offline ad | 0.84 | (0.82, 0.87) | 0.05 | Yes | (0.50, 1.39) | 0.89 | No |

(c) Latent Variables Discriminant Validity Bootstrap Results

| Latent Variables | HTMT | 95% BCa CI Meboot | CI Amplitude (Meboot) | CI < 1? (Meboot) | 95% BCa CI Sampling | CI Amplitude (Sampling) | CI < 1? (Sampling) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Online ad & Offline ad | 0.80 | (0.73, 0.89) | 0.16 | Yes | (0.61, 0.99) | 0.37 | Yes |

Note: As per Hair et al. [7], bootstrapped confidence intervals are bias-corrected and accelerated (BCa).
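For readers unfamiliar with BCa intervals, the following is a minimal sketch of the standard construction (Efron [22]): a bias-correction term from the bootstrap distribution and an acceleration term from the jackknife adjust the percentile levels. It is an illustration under stated assumptions, not the article's implementation: the function name `bca_interval` is hypothetical, the statistic is a plain mean rather than a re-estimated PLS-SEM parameter, and replicates are drawn by ordinary sampling with replacement (a meboot analysis would replace that resampling step).

```python
import random
from statistics import NormalDist, fmean

def bca_interval(data, stat, n_boot=2000, alpha=0.05, seed=0):
    """Bias-corrected and accelerated bootstrap interval for stat(data)."""
    rng = random.Random(seed)
    n = len(data)
    theta_hat = stat(data)

    # Bootstrap distribution of the statistic (i.i.d. resampling).
    boots = sorted(
        stat([rng.choice(data) for _ in range(n)]) for _ in range(n_boot)
    )

    nd = NormalDist()
    # Bias correction: share of replicates below the point estimate,
    # clamped away from 0 and 1 so the normal quantile is defined.
    prop = sum(b < theta_hat for b in boots) / n_boot
    prop = min(max(prop, 1 / (n_boot + 1)), n_boot / (n_boot + 1))
    z0 = nd.inv_cdf(prop)

    # Acceleration: skewness of the leave-one-out (jackknife) estimates.
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jack_mean = fmean(jack)
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6.0 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    accel = num / den if den > 0 else 0.0

    def adjusted_quantile(q):
        z = nd.inv_cdf(q)
        adj = nd.cdf(z0 + (z0 + z) / (1 - accel * (z0 + z)))
        idx = min(max(int(round(adj * (n_boot - 1))), 0), n_boot - 1)
        return boots[idx]

    return adjusted_quantile(alpha / 2), adjusted_quantile(1 - alpha / 2)

# Hypothetical sample of 30 values (not the article's data).
sample = [0.5 + 0.01 * i for i in range(30)]
lower, upper = bca_interval(sample, fmean)
print(f"95% BCa CI for the mean: ({lower:.3f}, {upper:.3f})")
```

When the bias-correction and acceleration terms are both zero, the adjusted levels reduce to the ordinary percentile levels.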
Table 4. Evaluation of the structural model. (a) The model’s regression coefficients and their significance based on meboot and sampling bootstrapping. (b) The model’s predictive accuracy based on meboot and sampling bootstrapping.

(a) Regression Coefficients Bootstrap Results

| Endogenous Variables | Exogenous Variables | Path Coefficient | 95% BCa CI Meboot | CI Amplitude (Meboot) | Significance (p < 0.05)? (Meboot) | 95% BCa CI Sampling | CI Amplitude (Sampling) | Significance (p < 0.05)? (Sampling) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Web sales | Online ad | 0.14 | (0.07, 0.32) | 0.25 | Yes | (0.05, 0.24) | 0.19 | Yes |
| Web sales | Offline ad | −0.05 | (−0.14, −0.002) | 0.14 | Yes | (−0.12, 0.02) | 0.14 | No |
| Web sales | Queries | 0.75 | (0.58, 0.89) | 0.31 | Yes | (0.62, 0.85) | 0.23 | Yes |
| Web sales | Paid Search | 0.24 | (0.06, 0.32) | 0.26 | Yes | (0.15, 0.39) | 0.23 | Yes |
| Web sales | Christmas | −0.15 | (−0.15, 0.13) | 0.29 | No | (−0.14, 0.12) | 0.26 | No |
| Store sales | Online ad | −0.18 | (−0.24, −0.04) | 0.19 | Yes | (−0.36, −0.01) | 0.35 | Yes |
| Store sales | Offline ad | 0.05 | (−0.02, 0.12) | 0.14 | No | (−0.14, 0.16) | 0.30 | No |
| Store sales | Queries | 0.52 | (0.27, 0.65) | 0.38 | Yes | (0.42, 0.75) | 0.34 | Yes |
| Store sales | Paid Search | 0.39 | (0.31, 0.52) | 0.21 | Yes | (0.12, 0.59) | 0.46 | Yes |
| Store sales | Christmas | 0.32 | (0.23, 0.44) | 0.21 | Yes | (0.08, 0.53) | 0.45 | Yes |
| Queries | Online ad | 0.38 | (0.27, 0.47) | 0.20 | Yes | (0.12, 0.59) | 0.47 | Yes |
| Queries | Offline ad | 0.20 | (0.13, 0.3) | 0.17 | Yes | (0.08, 0.35) | 0.27 | Yes |
| Paid Search | Queries | 0.54 | (0.36, 0.62) | 0.26 | Yes | (0.36, 0.66) | 0.30 | Yes |

(b) Predictive accuracy of the structural model evaluated with the magnitude of the explained variance, R²

| Endogenous Variables | R² | 95% BCa CI Meboot | CI Amplitude (Meboot) | 95% BCa CI Sampling | CI Amplitude (Sampling) |
| --- | --- | --- | --- | --- | --- |
| Queries | 0.26 | (0.17, 0.3) | 0.13 | (0.06, 0.36) | 0.30 |
| Paid Search | 0.29 | (0.13, 0.39) | 0.25 | (0.11, 0.42) | 0.31 |
| Store sales | 0.79 | (0.77, 0.94) | 0.17 | (0.68, 0.93) | 0.25 |
| Web sales | 0.92 | (0.91, 0.95) | 0.04 | (0.92, 0.97) | 0.04 |

Note: As per Hair et al. [7], bootstrapped confidence intervals are bias-corrected and accelerated (BCa).
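The CI amplitude and Yes/No columns can be read as simple rules on the interval bounds. The sketch below is one plausible reading, not a procedure stated in the article: amplitude is taken as upper minus lower bound, a path coefficient is treated as significant at the 5% level when its 95% interval excludes zero, and a loading passes the 0.5 criterion when the lower bound exceeds 0.5. The function names are hypothetical; the intervals are copied from Tables 3 and 4.

```python
def ci_amplitude(lower, upper):
    """Width of a confidence interval."""
    return upper - lower

def excludes_zero(lower, upper):
    """Assumed significance rule: the 95% interval does not contain zero."""
    return lower > 0 or upper < 0

def lower_bound_exceeds(lower, threshold):
    """Assumed threshold rule: the whole interval lies above the threshold."""
    return lower > threshold

# Web sales <- Offline ad, Table 4(a).
meboot_ci = (-0.14, -0.002)
sampling_ci = (-0.12, 0.02)

for label, ci in (("meboot", meboot_ci), ("sampling", sampling_ci)):
    print(
        f"{label}: amplitude={ci_amplitude(*ci):.2f}, "
        f"significant={excludes_zero(*ci)}"
    )

# Display loading under sampling bootstrapping, Table 3(a).
display_sampling_ci = (0.24, 0.81)
print("Display loading > 0.5:", lower_bound_exceeds(display_sampling_ci[0], 0.5))
```

Under these rules the Offline ad path to Web sales is significant with the meboot interval but not with the sampling interval, matching the corresponding row of Table 4(a).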
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Méndez-Suárez, M. Marketing Mix Modeling Using PLS-SEM, Bootstrapping the Model Coefficients. Mathematics 2021, 9, 1832. https://doi.org/10.3390/math9151832

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
