**3. Data Description**

## *3.1. Data and Pretreatment*

For this study, we obtained a real material-procurement dataset from a large grid company in China, covering the 47 months from June 2012 to April 2016. To reduce forecasting errors, we removed the trend and seasonal components from each time series, following [38,45].

Because at least three years of data are needed to estimate the seasonal components, we treat the first 36 months as the training set and the remaining 11 months as the test set, which is used to evaluate the out-of-sample predictive ability of the forecasting approaches. After the trend and seasonal effects are removed, the unit root test results suggest that all processed variables are stationary, so the subsequent forecasting steps can proceed.
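The split and adjustment described above can be illustrated with a minimal sketch. It assumes an additive decomposition with a linear trend and one mean offset per calendar month, both estimated on the training window only; the procedure in [38,45] may differ. The series is synthetic, and all function and variable names are hypothetical. The unit root test is not reproduced here.

```python
from statistics import mean

N_TRAIN = 36   # first 36 months: three full seasonal cycles
PERIOD = 12    # monthly data with yearly seasonality


def fit_linear_trend(y):
    """Least-squares fit of y[t] = a + b * t for t = 0, 1, ..., len(y) - 1."""
    n = len(y)
    t_bar, y_bar = (n - 1) / 2, mean(y)
    sxx = sum((t - t_bar) ** 2 for t in range(n))
    sxy = sum((t - t_bar) * (v - y_bar) for t, v in enumerate(y))
    b = sxy / sxx
    return y_bar - b * t_bar, b


def fit_seasonal(detrended, period=PERIOD):
    """Mean detrended value per month position, centred to average zero."""
    raw = [mean(detrended[k::period]) for k in range(period)]
    centre = mean(raw)
    return [s - centre for s in raw]


def adjust(y, a, b, seasonal, start=0):
    """Subtract the fitted trend and seasonal component from y, whose first
    observation sits at time index `start`."""
    period = len(seasonal)
    return [v - (a + b * (start + i)) - seasonal[(start + i) % period]
            for i, v in enumerate(y)]


# Synthetic 47-month series (illustrative only, not the paper's data).
pattern = [3, -1, 0, 2, -2, 1, 4, -3, 0, 1, -4, -1]
series = [100 + 0.8 * t + pattern[t % PERIOD] for t in range(47)]

train, test = series[:N_TRAIN], series[N_TRAIN:]

# Estimate trend and seasonality on the training window only.
a, b = fit_linear_trend(train)
detrended = [v - (a + b * t) for t, v in enumerate(train)]
seasonal = fit_seasonal(detrended)

# Apply the same fitted components to both windows.
train_adj = adjust(train, a, b, seasonal)
test_adj = adjust(test, a, b, seasonal, start=N_TRAIN)
```

Estimating the components on the first 36 months and then applying them unchanged to the last 11 keeps information from the test window out of the pretreatment step.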

The purchased products are mainly infrastructure materials, such as cables, transformers, and fittings. Note that demand in the dataset is intermittent. A few products even have zero demand at more than 2/3 of all time points. These products are not suitable for forecasting and have already been removed from the data. Products without procurement in the first 12 months and the last 12 months are also excluded. In total, 338 products remain. According to product characteristics, they can be aggregated at different levels, forming a hierarchical structure that runs, from the top level to the bottom level, Family > Category > Subcategory > Product. As shown in Table 1, at the most aggregated level there are two families, namely primary equipment and equipment material, which can be further disaggregated into 15 categories at level 2 and 59 subcategories at level 3. Moreover, the number of products and the value of procurement vary significantly across categories (subcategories).
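The screening rules and the bottom-up aggregation can be sketched as follows. This is an illustrative sketch under two assumptions: a product is dropped when it lacks procurement in either the first 12 or the last 12 months (the text could also be read as requiring both), and each higher-level series is the sum of its product-level series. The toy data, product names, and category names are hypothetical.

```python
def keep_product(demand, max_zero_share=2 / 3, edge=12):
    """Return True if a monthly demand series passes both screening rules."""
    zero_share = sum(1 for v in demand if v == 0) / len(demand)
    if zero_share > max_zero_share:
        return False
    # Assumed reading: require at least one purchase in the first `edge`
    # months and at least one in the last `edge` months.
    return (any(v > 0 for v in demand[:edge])
            and any(v > 0 for v in demand[-edge:]))


def aggregate(series_by_product, hierarchy, depth):
    """Sum product series over the first `depth` hierarchy levels.

    hierarchy maps product -> (family, category, subcategory);
    depth=1 gives families, 2 gives categories, 3 gives subcategories.
    """
    totals = {}
    for product, series in series_by_product.items():
        key = hierarchy[product][:depth]
        running = totals.get(key, [0.0] * len(series))
        totals[key] = [t + v for t, v in zip(running, series)]
    return totals


# Toy data: 24 months per product (hypothetical names and values).
demand = {
    "cable_A": [5, 0, 3, 0] * 6,           # zero in half of the months
    "cable_B": [1, 2, 0, 4] * 6,
    "xfmr_A": [0] * 20 + [7, 0, 0, 2],     # zero in more than 2/3 of months
    "fitting_A": [0] * 12 + [1] * 12,      # nothing in the first 12 months
}
hierarchy = {
    "cable_A": ("equipment material", "cable", "power cable"),
    "cable_B": ("equipment material", "cable", "control cable"),
    "xfmr_A": ("primary equipment", "transformer", "distribution transformer"),
    "fitting_A": ("equipment material", "fitting", "line fitting"),
}

kept = {p: s for p, s in demand.items() if keep_product(s)}
by_category = aggregate(kept, hierarchy, depth=2)
by_family = aggregate(kept, hierarchy, depth=1)
```

Keying each aggregate by a prefix of the (family, category, subcategory) tuple lets the same function produce every level of the hierarchy.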


**Table 1.** Description of hierarchical structure and procurement scale of products.

<sup>1</sup> Unit of average purchase: million yuan per month.
