Investigating System Dynamics of Vegetable Prices Using Complex Network Analysis and Temporal Variation Methods

Karakasidou, Sofia; Charakopoulos, Avraam; Zachilas, Loukas

doi:10.3390/appliedmath4040071


by Sofia Karakasidou ¹, Avraam Charakopoulos ²,* and Loukas Zachilas ¹

¹ Department of Economics, University of Thessaly, 38334 Volos, Greece

² Department of Physics, University of Thessaly, 35100 Lamia, Greece

* Author to whom correspondence should be addressed.

AppliedMath 2024, 4(4), 1328-1357; https://doi.org/10.3390/appliedmath4040071

Submission received: 28 August 2024 / Revised: 3 October 2024 / Accepted: 8 October 2024 / Published: 16 October 2024


Abstract:

In the present study, we analyze the behavior of the price time series of selected vegetable products using complex network analysis in two approaches: (a) correlation complex networks and (b) visibility complex networks based on transformed time series. Additionally, we apply time variability methods, including Hurst exponent and Hjorth parameter analysis. We have chosen products available throughout the year from the Central Market of Thessaloniki (Greece) as a case study. To the best of our knowledge, this kind of study is applied for the first time, both as a type of analysis and on the given dataset. Our aim was to investigate alternative ways of classifying products into groups that could be useful for management and policy issues. The results show that the formed groups present similarities related to their use in dishes as well as to their mode of price variation and variability, depending on the type of analysis performed. The results could be of interest for government policy in various directions, such as promoting greater price stability for certain products, identifying products with fluctuating prices, etc. This work could be extended in the future by including data from other central markets as well as time series with missing values, as is the case for products not available throughout the year.

Keywords:

vegetable prices; temporal variation methods; correlation complex network; visibility graphs

1. Introduction

Vegetable prices play an important role in people’s daily lives worldwide. Thus, there is increasing interest in vegetable price research (see, for example, [1]). The importance of vegetables in people’s lives is reflected in papers addressing social and medical aspects of prices [2,3,4,5,6,7,8] and vegetable consumption.

Numerous studies have analyzed vegetable prices by considering influencing factors, focusing on particular regions and vegetable varieties and exploring the influence of several external parameters.

Several studies have addressed energy issues [9] and factors like petroleum prices [10,11,12,13] and coal prices [14], since energy is necessary for transportation as well as for several processes in the cultivation of many products, such as greenhouse operation, moving water to the fields, etc.

Economic factors like exchange rates have also been examined [11,15], since exchange rates affect the prices of materials necessary for cultivation and production, especially when imported, and also affect the price of vegetables that are themselves imported or exported.

Several researchers have examined the effect of weather (see, for example, [16]). In that study, the prices of several products were investigated using weekly wholesale prices and other factors in several regions in China, and the authors examined the effect of water shocks on prices.

In several research works, models based on demand, supply, import, and export functions have been employed to study vegetable price variance in the Korean vegetable market, specifically for selected vegetables such as cabbage, radish, dried red pepper, garlic, and onions [16]. The results showed that demand, import, and export had a limited impact on price fluctuations, with the exception of very few products. Other similar research for various regions around the world includes [17,18,19].

Other researchers [20] studied, through time-series analysis, onion price behavior, production, and productivity in several Indian markets. The results indicated an increasing trend and seasonal effects on onion price variation, although seasonality was not the only factor. In another study [19] focusing on onions, the authors explored the high price volatility and proposed strategies for managing it.

There are also studies [21] on managing the perishability of vegetable and fruit chains using panel data analysis.

More recent studies [22] have tried to identify the most critical factors influencing vegetable prices to facilitate stakeholders’ decisions and implement strategies for market stabilization and improving farmers’ income. In that study, several machine learning methods were employed, such as a combination of Lasso regression, the back-propagation neural network, and random forest models. The study was performed on a dataset of historical cucumber price data and several variables that are considered to affect prices.

There is also research on automatic pricing and replenishment decisions for selected vegetables [23], which deals with replenishment and pricing strategy under the dual constraints of changing vegetable supply and demand and quality loss, and which produces analyses and forecasts based on the sales data of a batch of vegetable commodities.

In certain studies, methodologies for the prediction of prices are employed. An example of such work concerns predicting wholesale prices in China [16]. In that study, agricultural price fluctuation factors are analyzed, and a Support Vector Regression (SVR) model is employed to predict wholesale agrarian product prices. Other studies use machine learning approaches such as the STL-LSTM method for price prediction for Chinese cabbages and radishes [24], where the effect of input variables on the forecasting of prices was investigated. The authors suggested that their model can be employed to automatically adjust demand and supply and develop policies to save corresponding social costs. There are also more recent studies on price prediction using machine learning techniques, where several methods have been examined based on several error metrics [25].

In summary, the types of research described above, i.e., studying price dynamics and predicting future values, can play a central role in central market management and in government or private policies concerning prices and the consumption of vegetables. The identification of common characteristics in the dynamical behavior of vegetables, especially those of large consumption, can be crucial, since it allows one to investigate the classification of products into groups for market decisions or price planning according to the use of the vegetables or other parameters such as the method of cultivation, e.g., greenhouse cultivation, open-air cultivation, etc. This approach can be employed in several ways to study individual product price dynamics and the correlations among them.

As can be understood from the literature review, the various types of analysis are based on price time-series analysis in order to understand the dynamics of the price system and predict future values. Time-series analysis methods have been applied with success in a wide range of scientific fields (physical, engineering, economic, biological, etc.).

The majority of these methods assume linear behavior of the dynamical system combined with the effect of stochastic noise. Well-known and widely used methods of this kind include the autocorrelation function, Fourier transforms, and ARMA and ARIMA models, which have proven remarkably successful in many cases. However, these methods do not account for nonlinear effects or complex interactions that are present in the majority of real-life dynamical systems.

Other methods consider complexity and nonlinear behavior in dynamical systems. Such methods include recurrence plots based on the phase space concept and various methods based on complex network analysis. In a recent paper, Karakasidou et al. [26], using recurrence plots (RPs) and recurrence quantification analysis, have shown that it is possible to group products based on the dynamical behavior observed in the RPs, which can be employed to classify products into categories based on their use in dishes and their method of cultivation.

Complex network analysis [27,28,29] constitutes a class of methods that are widely employed in many research areas. Since the way in which nodes connect with edges is essential, several methods have been suggested for mapping time series into a complex network. There are correlation-based networks based on the correlation between time series [30], methods based on the phase space reconstruction of the corresponding time series [31], the visibility algorithm for transforming time series to networks [32], and the recurrence-based complex networks [33]. A concise review of the above-mentioned methodologies can be found in [27]. In the present work, we selected the visibility graph (VG) method introduced by Lacasa et al. [32] to employ for the transformation of the time series to networks. In the frame of this method, it was shown that systems with different dynamic behavior are mapped into networks with different topological measures. There are several variations of VG, as can be found in [34]. In the present article, we employed the VG proposed by Lacasa [32] since it is easy to implement and highly computationally efficient. The visibility algorithm has been applied with success in many areas such as finance [35], turbulence [36,37], environmental data [38,39], and biology [40], to mention just a few.

Moreover, the calculation of some non-linear dynamic detectors of the temporal variation of time series such as the Hurst exponent, Detrended Fluctuation Analysis (DFA), and Hjorth parameters has shown significant contributions in time-series analyses [41,42,43].

In addition, the clustering technique permits the separation of items into groups based on common characteristics that may be evident or not. The advantage of hierarchical clustering is that one can group items into classes/groups without any a priori hypothesis on the number of groups into which our data is separated [36].

In the present work, we investigate the price dynamics with the aim of finding a methodology that can separate the various products into groups in a way that reflects characteristics such as use, type of cultivation, and availability throughout the year. Toward this aim, we follow several different approaches. The first can be characterized as conventional, being strictly based on the hierarchical clustering of the time series themselves, a methodology that is supposed to capture the similarity of the prices. The other direction is based on complex network analysis, for which we test two different approaches. In the first, which is a multivariate approach, we employ correlation-based complex networks and extract the networks’ communities, i.e., clusters of products presenting similarities. In the second, univariate approach, we transform each time series into a complex network using the visibility algorithm, extract metrics of the network for each time series, and cluster these metrics in order to better capture the dynamical behavior of each vegetable’s price evolution.

In parallel, we employed Hurst analysis to identify the persistence of price evolution in time and a Hjorth parameter analysis to reflect the variability of the prices. Again, the various metrics are clustered to classify products into groups. The results of the various approaches are compared with one another and with the results of a previous study [26] based on recurrence plots and recurrence quantification analysis, a method capturing system dynamics through phase space.

To our knowledge, this analysis is performed for the first time. The results can be employed in the classification of products into groups based on the similarity of their dynamical behavior or their correlated variation. This grouping procedure could be employed as a guide in designing strategies of buying and selling products from central markets, independent buyers, and farmers. Such information could be of interest for government policy design concerning greater market stability, the identification of more fluctuating products, etc.

For the application, we are going to use data on the price of vegetables from the Thessaloniki Central Market from the point of view of management or of identifying changes in trends that may arise due to several factors like financial change policies, crises, etc. The Central Market of Thessaloniki (C.M.TH. S.A.) is a Société Anonyme with the sole shareholder being the Hellenic Corporation of Assets and Participations (HCAP) and is supervised by the Ministry of Development & Investment. It is the second largest Central Market after that of Athens in Greece and a very large Central Market in the Balkan region.

The paper is organized as follows. Section 1 contains a literature review and the present paper’s aims. In Section 2, the data employed in the present work are presented. In Section 3, the methodologies employed are described, including complex network analysis (correlation networks, visibility networks), Hurst exponent analysis, Hjorth parameters, and clustering. In Section 4, the results are presented, and the possible relations between various vegetables are discussed, trying to explain the resulting groups on the basis of their use and cultivation. In Section 5, the conclusions of the present work are presented.

2. Data

The data were obtained from the Thessaloniki Central Market (Thessaloniki, Greece) covering the period from 1999 to 2016. For the period up to 2010, the data were kept in hand-written form, so hard copies had to be obtained and then the data had to be entered manually and checked for input errors. The remaining data existed in several electronic formats but were stored in separate files with nonuniform formats, requiring data to be reentered into our database, and integrity checks had to be performed. The values correspond only to working days (approximately 250 days per year).

The choice of the time period was based on the fact that (a) before 1999, there were no available data and (b) we would like to avoid possible increased climate effects [44,45], as well as the COVID-19 effect after December 2019. Such effects could be investigated in a future work.

We considered time series that exhibited continuous data without any missing values. This is due to the fact that missing values over extended periods may originate for products that have seasonal production or, in some cases, were not sold systematically, or where weather conditions caused serious destruction of production. Following the above criteria, we have chosen only vegetables that demonstrated continuous values without any missing periods in order to test the methodology. The resulting dataset contains 17 products, represented in Table 1.

In Figure 1, the plots of the 17 products’ time series are presented. It can be seen that several products exhibit periodic behavior in time, such as Knossos cucumber, zucchini, and peppers. For several vegetables, this periodicity is quite evident, while for others, there is a slight modification or it is less clear, such as in the case of cucumber pair, spring onions, garlic, and tomatoes. We attribute this to the fact that these products are mainly cultivated in greenhouses, which affects their price variability.

Among all the products examined in the study, peppers present the highest variation between minimum and maximum prices. Tomatoes seem to exhibit a kind of periodic but noisier behavior, perhaps because they are used mainly in salads, including salads popular with tourists such as “Greek salad”, a dish that many Greeks and tourists prefer, and because they are also cultivated in large quantities in greenhouses, especially in Crete (southern Greece). Additionally, tomatoes are a preferred salad ingredient, especially in the period before Easter, when, traditionally, at least in previous years, people consumed less meat.

For several products, more complex behavior is observed. This is the case for beetroots and garlic. Salads (here corresponding to red leaf lettuce and romaine lettuce) have smaller periodicities since their production has been increased due to the healthier food trend and corresponding increased cultivation. However, extreme weather conditions can significantly affect their production, raising prices.

It is worth noting that there was a general price increase in nearly all products after the introduction of the Euro in 2002. This is particularly noticeable in products with relatively low prices in drachmas (the local currency in Greece before the Euro’s introduction, where 340.75 drachmas equaled 1 Euro), where the rounding effect created an increase in prices. This is more evident in dill and parsley, endives, carrots, cucumber pairs, onions, and celery. Moreover, considering that the Greek crisis began in 2010, we conducted partial analyses for three distinct periods:

A period: 1 January 1999 to 1 March 2002 (time series points 1–780);
B period: 1 January 2002 to 31 December 2009 (time series points 780–2700);
C period: 1 January 2010 up to 31 December 2016 (time series points 2700–4400).

3. Methods

In the present section, the methodologies employed are briefly discussed. First, the hierarchical clustering approach is presented. Then, the complex network analysis, both for the correlation-based networks and the visibility-transformed networks, is discussed. Finally, the Hurst exponent and the Hjorth methods are presented.

3.1. Clustering Analysis

We utilized hierarchical clustering, based either on the whole time series or on metrics extracted from the analysis of the time series, in order to classify the products. The main advantage of hierarchical clustering is that a dendrogram can be constructed to find the appropriate number of clusters corresponding to a dataset. The height at which two clusters are merged in the dendrogram corresponds to the distance between the two clusters. We employed the single-linkage hierarchical clustering algorithm. In this scheme, the distance from each object (point) to all other points is calculated using the Euclidean metric, and a corresponding distance matrix between all elements is constructed. Then, the two clusters with the shortest distance in the matrix are identified and merged. The distance matrix is then recomputed, since the two merged clusters now constitute a single one. This process is repeated until all data are grouped into a single cluster.
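The merging scheme described above can be illustrated with a minimal sketch in plain Python. The function names and the toy data are hypothetical and are not taken from the study; the sketch only assumes the Euclidean metric and the single-linkage rule stated in the text, and it favors readability over efficiency.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def single_linkage(points):
    """Agglomerative single-linkage clustering.

    Returns the merge history as a list of (cluster_a, cluster_b, distance),
    where clusters are tuples of point indices. The distance is the height
    at which the two clusters would be joined in a dendrogram.
    """
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(euclidean(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges

# Hypothetical toy data: two tight pairs of points far from each other.
toy = [[1.0, 1.1], [1.0, 1.0], [5.0, 5.0], [5.2, 5.1]]
history = single_linkage(toy)
for left, right, height in history:
    print(left, right, round(height, 3))
```

In this toy case the two close pairs merge first at small heights, and the final merge occurs at a much larger height, which is the kind of jump that a dendrogram-based choice of the number of clusters relies on.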

3.2. Complex Networks

The way in which edges connect nodes in a network is crucial. Several methods have been proposed to convert time series data into complex networks (see, for example, [27]). In this study, we utilize two different approaches, described below. In the first, the Pearson correlation coefficient is used to establish connections between nodes representing different time series; in the second, each time series is transformed into a complex network, the network metrics of each time series are computed, and a clustering is subsequently performed based on these metrics.

3.2.1. Pearson Coefficient

The Pearson correlation coefficient, r, measures the linear relationship between two variables. It is used to assess the strength and direction of the association between two continuous variables.

The Pearson coefficient is given by

r = \frac{\sum (X_{i} - \bar{X}) (Y_{i} - \bar{Y})}{\sqrt{\sum {(X_{i} - \bar{X})}^{2} \sum {(Y_{i} - \bar{Y})}^{2}}}

(1)

where X_i and Y_i are the individual data points, and \bar{X} and \bar{Y} are the means of X and Y, respectively.

The value of the Pearson coefficient ranges from −1 to 1. A value of r = 1 corresponds to a perfect positive linear relationship, meaning that as one variable increases, the other variable also increases proportionally, while a value of r = −1 corresponds to a perfect negative linear relationship, meaning that as one variable increases, the other variable decreases proportionally. A value of r = 0 indicates no linear relationship between the variables. It should be noted that the trend was removed from the data before calculating the correlations by applying first differences.
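As a small illustration of Equation (1) combined with the first-difference detrending mentioned above, the following sketch computes the coefficient in plain Python. The price values are invented for illustration only, and the helper names are hypothetical.

```python
import math

def first_differences(series):
    """Detrend by first differences: d_t = x_{t+1} - x_t."""
    return [b - a for a, b in zip(series, series[1:])]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x)
                    * sum((b - mean_y) ** 2 for b in y))
    if den == 0:
        raise ValueError("correlation is undefined for a constant series")
    return num / den

# Invented prices: price_b moves exactly with price_a, price_c against it.
price_a = [1.0, 1.2, 1.1, 1.5, 1.4, 1.8]
price_b = [2.0 * v for v in price_a]
price_c = [-v for v in price_a]

r_ab = pearson(first_differences(price_a), first_differences(price_b))
r_ac = pearson(first_differences(price_a), first_differences(price_c))
print(round(r_ab, 6), round(r_ac, 6))
```

The two pairs reproduce the limiting cases discussed above: a coefficient close to 1 for proportional co-movement and close to −1 for opposite movement.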

3.2.2. Correlation-Based Complex Networks

In a correlation network, two nodes, each representing a time series, are connected in the associated graph if the correlation coefficient between them is larger than a threshold value. Each time series corresponds to a network node in a network mapped using this criterion.

From a mathematical perspective, a network is represented by a graph G = (N,E), which consists of a set of N = (n₁, n₂, … n_N) vertices or nodes connected by a set of E = (e₁, e₂, …, e_E) links or edges. A network can be described using its adjacency matrix A = [a_ij], which encodes the connectivity structure of the graph. For a graph with N nodes, the adjacency matrix is an N × N matrix.

The degree k_i of a node i is defined as the total number of edges adjacent to that node and can be calculated as

k_{i} = \sum_{j} a_{i j} = \sum_{j} a_{j i}

(2)

and the average degree <k>, which is the mean value of k_i over all vertices, is a global measure of the connectivity of the network.
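The construction of a correlation network and the degree of Equation (2) can be sketched as follows. The correlation matrix and the threshold of 0.5 are hypothetical values chosen for illustration; the threshold actually used in the study is not assumed here, nor is any treatment of negative correlations.

```python
def threshold_adjacency(corr, threshold):
    """Adjacency matrix: a_ij = 1 if corr_ij > threshold and i != j, else 0."""
    n = len(corr)
    return [[1 if i != j and corr[i][j] > threshold else 0
             for j in range(n)]
            for i in range(n)]

def degrees(adj):
    """Node degrees k_i = sum_j a_ij."""
    return [sum(row) for row in adj]

# Hypothetical symmetric correlation matrix for four time series.
corr = [
    [1.0, 0.8, 0.1, 0.6],
    [0.8, 1.0, 0.2, 0.7],
    [0.1, 0.2, 1.0, 0.3],
    [0.6, 0.7, 0.3, 1.0],
]

adj = threshold_adjacency(corr, threshold=0.5)
k = degrees(adj)
mean_k = sum(k) / len(k)
print(k, mean_k)
```

Here the third series is only weakly correlated with the others, so it ends up as an isolated node with degree zero, while the remaining three form a fully connected group.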

The network’s modularity was introduced in [46] and is a measure for detecting communities/clusters that exist in a network. In fact, it quantifies the density of links between nodes within communities relative to the corresponding situation in a random network. Networks with high modularity present dense connections among vertices within communities/groups, suggesting closely connected community members assuring the efficient transmission of information between them.

Several community detection methods exist in the literature. Widely employed methods are the Newman–Girvan and Louvain algorithms [47]. In the frame of the Newman–Girvan method [48], consider a network with n nodes divided into two groups. Let s_i = 1 if the vertex i belongs to group 1 and s_i = −1 if it belongs to group 2. The modularity Q is defined as

Q = \frac{1}{4 m} \sum_{i j} (A_{i j} - \frac{k_{i} \cdot k_{j}}{2 m}) s_{i} s_{j}

(3)

where A_ij is the adjacency matrix; k_i·k_j/2m is the expected number of edges between vertices i and j if edges are placed at random, where k_i and k_j are the degrees of the vertices; and

m = \frac{1}{2} \sum_{i} k_{i}

is the total number of edges in the network. This method is more efficient for small networks.
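To make Equation (3) concrete, the sketch below evaluates Q for a given two-group split of a small network. It only scores a partition that is supplied by hand; it does not implement the community search itself. The example graph (two triangles joined by one edge) and the function name are hypothetical.

```python
def modularity_two_groups(adj, s):
    """Modularity Q of a two-group split, with s_i = +1 or -1 per node."""
    n = len(adj)
    k = [sum(row) for row in adj]          # node degrees
    m = sum(k) / 2                         # total number of edges
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += (adj[i][j] - k[i] * k[j] / (2 * m)) * s[i] * s[j]
    return total / (4 * m)

# Hypothetical network: triangles {0,1,2} and {3,4,5} joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for i, j in edges:
    adj[i][j] = adj[j][i] = 1

q_natural = modularity_two_groups(adj, [1, 1, 1, -1, -1, -1])
q_mixed = modularity_two_groups(adj, [1, -1, 1, -1, 1, -1])
print(round(q_natural, 4), round(q_mixed, 4))
```

The split along the single bridging edge gives a clearly positive Q, whereas the alternating split that cuts through both triangles gives a negative value, in line with the interpretation of modularity as link density within groups relative to a random network.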

An alternative community detection method is the Louvain Method introduced in [49], which is a fast, computationally efficient algorithm that is based on the maximization of the modularity of nonoverlapping community structure through an iterative, hierarchical optimization process. This method is more efficient in community detection in the case of large networks.

For community detection, the Newman algorithm was employed, which is suggested for small networks, as is the case in the present study [46].

3.2.3. Visibility Algorithm Complex Network Transformed Time Series

In the frame of the visibility algorithm [32], two nodes x(t_i) and x(t_j) are considered to be connected in the corresponding graph if every other data point (t_k, x(t_k)) with t_i < t_k < t_j fulfills the following condition:

x (t_{k}) < x (t_{i}) + (x (t_{j}) - x (t_{i})) \frac{t_{k} - t_{i}}{t_{j} - t_{i}}

(4)

Hence, i and j are connected if no intermediate data height intersects the visibility line between them. In a visibility network, each value of the time series is mapped to a node, and each node is visible at least to its nearest neighbors. The number of edges is related to the dynamic variability of the data and the corresponding system under study. This approach better reflects the system dynamics of the individual time series.
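The visibility criterion of Equation (4) can be sketched with a direct pairwise check. This is a straightforward quadratic-time illustration under the assumption of equally spaced time stamps when none are given; the function name and the toy series are hypothetical.

```python
def visibility_edges(x, t=None):
    """Natural visibility graph edges (i, j), i < j, for a series x.

    Nodes i and j are linked when every intermediate point k lies strictly
    below the straight line joining (t_i, x_i) and (t_j, x_j).
    """
    n = len(x)
    if t is None:
        t = list(range(n))   # assumption: equally spaced observations
    edges = []
    for i in range(n - 1):
        for j in range(i + 1, n):
            visible = all(
                x[k] < x[i] + (x[j] - x[i]) * (t[k] - t[i]) / (t[j] - t[i])
                for k in range(i + 1, j)
            )
            if visible:
                edges.append((i, j))
    return edges

# Toy series with a single peak in the middle.
series = [2.0, 1.0, 3.0, 1.0, 2.0]
edges = visibility_edges(series)

degree = [0] * len(series)
for i, j in edges:
    degree[i] += 1
    degree[j] += 1
print(edges, degree)
```

In this toy series the central peak blocks the view between the two sides, so it becomes the hub of the network, while consecutive points are always linked.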

3.2.4. Topological Measures of Networks

The topological properties examined in this study are briefly described below.

Node degree is the number of links connected to the node.

Eccentricity (E_G(v)) measures how far a node is from the most distant node in the network. The eccentricity of a node v is calculated by computing the shortest paths between v and all other nodes in the graph and then choosing the longest of these shortest paths (say, the path (v,K), where K is the most distant node from v). Once this path, with length dist(v,K), is identified, its reciprocal is calculated:

E_{G} (v) = \frac{1}{\mathrm{dist} (v, K)}

(5)

Closeness centrality (CC_i) is defined as the reciprocal of the average shortest path distance (d_ij) from the given node (i) to all other nodes (j) in the graph. It is a measure used in network analysis to quantify the degree to which a node is close to all other nodes in a network. Nodes with high closeness centrality are those that can reach other nodes in the network with shorter average shortest path distances.

{C C}_{i} = \frac{n - 1}{\sum_{j \in N, j \neq i} d_{i j}}

(6)
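Both distance-based measures above rest on shortest path lengths, which for an unweighted network can be obtained by breadth-first search. The sketch below assumes a connected, unweighted graph stored as a dictionary of neighbor lists and computes closeness as n − 1 divided by the sum of distances, i.e., the reciprocal of the average distance described in the text. The helper names and the four-node path graph are hypothetical.

```python
from collections import deque

def bfs_distances(neighbors, source):
    """Shortest path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in neighbors[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def reciprocal_eccentricity(neighbors, v):
    """1 / dist(v, K), where K is the node farthest from v."""
    dist = bfs_distances(neighbors, v)
    return 1.0 / max(dist.values())

def closeness(neighbors, v):
    """(n - 1) divided by the sum of shortest path distances from v."""
    dist = bfs_distances(neighbors, v)
    return (len(neighbors) - 1) / sum(dist.values())

# Hypothetical path graph 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

for node in path:
    print(node, reciprocal_eccentricity(path, node), closeness(path, node))
```

The inner nodes of the path obtain higher values of both measures than the end nodes, since they are closer on average to the rest of the network.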

Betweenness centrality (C_B) is defined as the fraction of all shortest paths in the network that contain a given node. Nodes with high values of betweenness centrality participate in a large number of shortest paths.

C_{B} (v) = \sum_{i \neq j \neq v \in N} \frac{\sigma_{i j} (v)}{\sigma_{i j}}

(7)

where σ_ij(v) is the number of shortest paths from node i to node j that pass through node v, and σ_ij is the total number of shortest paths from node i to node j.

The clustering coefficient is defined as the fraction of triangles around a node and is equivalent to the fraction of the node’s neighbors that are neighbors of each other. The local clustering coefficient C(i) is the number of edges n_i among the neighbors of i, divided by the maximum possible number of such edges, k_i(k_i − 1)/2, for a node i of degree k_i:

C (i) = \frac{2 n_{i}}{k_{i} (k_{i} - 1)}

(8)

The average clustering coefficient of the network is defined as

C = \frac{1}{n} \sum_{i \in N} C (i)

(9)
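Equations (8) and (9) can be illustrated with a short sketch. The graph used here is a hypothetical five-node network in which node 2 is shared by two triangles; the convention of returning zero for nodes with fewer than two neighbors is an assumption of this sketch.

```python
def local_clustering(neighbors, i):
    """C(i) = 2 * n_i / (k_i * (k_i - 1)), with n_i edges among neighbors."""
    nbrs = sorted(neighbors[i])
    k = len(nbrs)
    if k < 2:
        return 0.0   # assumption: undefined cases are reported as zero
    links = sum(1
                for a in range(k)
                for b in range(a + 1, k)
                if nbrs[b] in neighbors[nbrs[a]])
    return 2.0 * links / (k * (k - 1))

def average_clustering(neighbors):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(neighbors, i) for i in neighbors) / len(neighbors)

# Hypothetical network: triangles {0,1,2} and {2,3,4} sharing node 2.
graph = {
    0: {1, 2},
    1: {0, 2},
    2: {0, 1, 3, 4},
    3: {2, 4},
    4: {2, 3},
}

c_hub = local_clustering(graph, 2)
c_avg = average_clustering(graph)
print(round(c_hub, 4), round(c_avg, 4))
```

The shared node has the lowest local coefficient because only two of the six possible links among its four neighbors exist, whereas every other node sits in a closed triangle.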

Eigenvector centrality (x) is a self-referential measure of centrality: nodes have high eigenvector centrality if they connect to other nodes that have high eigenvector centrality.

x_{v} = \frac{1}{\lambda} \sum_{t \in G} a_{v, t} x_{t}

(10)

where G represents the set of nodes in the network, a_{v,t} denotes the entry of the adjacency matrix corresponding to nodes v and t, x_t represents the eigenvector centrality of node t, and λ denotes the largest eigenvalue of the adjacency matrix.
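Because Equation (10) defines the centralities in terms of themselves, they are usually obtained iteratively. The following sketch uses plain power iteration, which is one common way to do this and is not necessarily the procedure used in the study. It assumes a connected, non-bipartite, undirected network with at least one edge; the example graph and the function name are hypothetical.

```python
def eigenvector_centrality(adj, iterations=200):
    """Power iteration for x = (1 / lambda) * A x.

    Returns (x, lam), where x is scaled so that its largest entry is 1
    and lam approximates the largest eigenvalue of the adjacency matrix.
    """
    n = len(adj)
    x = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        y = [sum(adj[v][t] * x[t] for t in range(n)) for v in range(n)]
        lam = max(y)                  # current eigenvalue estimate
        x = [value / lam for value in y]
    return x, lam

# Hypothetical network: triangles {0,1,2} and {2,3,4} sharing node 2.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]
adj = [[0] * 5 for _ in range(5)]
for i, j in edges:
    adj[i][j] = adj[j][i] = 1

centrality, lam = eigenvector_centrality(adj)
print([round(c, 4) for c in centrality], round(lam, 4))
```

For this graph the shared node receives the highest score, and the returned vector satisfies Equation (10) up to numerical precision.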

Bridging coefficient (BrCo) is often based on various network measures such as betweenness centrality, which measures the number of shortest paths passing through a node. Nodes with high bridging coefficients are those that lie on many shortest paths between different communities or clusters in the network.

\mathrm{BrCo} (v) = \frac{1}{k (v)} \sum_{i \in N (v)} \frac{\delta (i)}{d (i) - 1}

(11)

where k(v) is the degree of node v, N(v) is the set of neighbors of node v, d(i) is the degree of neighbor i, and δ(i) is the number of edges leaving the direct neighbor subgraph of node v among the edges incident to direct neighbor i of node v.

Bridging centrality refers to a measure that quantifies the extent to which a node or a set of nodes serve as bridges connecting different parts or communities within the network.

3.3. Hurst Exponent

The Hurst exponent (H) is employed as a measure for detecting long-range correlations between time series values. Hurst introduced the idea of the Hurst exponent [43] to investigate the River Nile’s water discharge time series. The oldest method for calculating the Hurst exponent, rescaled range or R/S analysis, was proposed in [50], and estimates of H are based on the R/S statistic.

Let us suppose a time series with N successive measurements. The time series is divided into N_s = N/s shorter, non-overlapping subseries of length s, with s = N, N/2, N/4, …. For the n-th subseries, the range R_n is defined as

R_{n} = \max_{1 \leq k \leq s} [\sum_{i = 1}^{k} (x_{n s + i} - {\bar{x}}_{n})] - \min_{1 \leq k \leq s} [\sum_{i = 1}^{k} (x_{n s + i} - {\bar{x}}_{n})]

(12)

where n = 0, 1, …, N_s − 1, N_s = N/s, and

{\bar{x}}_{n} = \frac{1}{s} \sum_{i = 1}^{s} x_{n s + i}

(13)

The sample standard deviation is defined as

S_{n} = \sqrt{\frac{1}{s} \sum_{i = 1}^{s} {(x_{n s + i} - {\bar{x}}_{n})}^{2}}

(14)

Then, the rescaled range is R_n/S_n.

The Hurst exponent is estimated by calculating the average rescaled range (R/S)_s over all subseries of length s. It can be shown that the R/S statistic follows the relation

{(R / S)}_{s} = c \cdot s^{H} \quad \text{as} \quad s \to \infty

(15)

When we plot the (R/S)_s statistic against s on a log–log scale, if the behavior is linear at that scale, then the slope determines the Hurst exponent. A value of the Hurst exponent equal to 0.5 indicates a random series. A value of H in the range 0.5 < H < 1 indicates a persistent behavior, while a Hurst exponent value between 0 and 0.5 indicates an anti-persistent behavior.
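The R/S procedure above can be sketched in plain Python as follows. The window lengths, the synthetic Gaussian noise used as input, and the function names are assumptions made for illustration; the sketch does not reproduce the settings of the study and does not state whether prices or price changes should be supplied as input.

```python
import math
import random

def rescaled_range(segment):
    """R/S of one subseries: range of cumulative deviations over std."""
    s = len(segment)
    mean = sum(segment) / s
    cumulative, running = [], 0.0
    for value in segment:
        running += value - mean
        cumulative.append(running)
    r = max(cumulative) - min(cumulative)
    sd = math.sqrt(sum((value - mean) ** 2 for value in segment) / s)
    return r / sd if sd > 0 else None   # constant segments are skipped

def hurst_rs(x, window_sizes):
    """Slope of log(average R/S) against log(window length)."""
    log_s, log_rs = [], []
    for s in window_sizes:
        values = []
        for n in range(len(x) // s):
            rs = rescaled_range(x[n * s:(n + 1) * s])
            if rs is not None:
                values.append(rs)
        if values:
            log_s.append(math.log(s))
            log_rs.append(math.log(sum(values) / len(values)))
    # Ordinary least-squares slope of the log-log relation.
    mean_s = sum(log_s) / len(log_s)
    mean_rs = sum(log_rs) / len(log_rs)
    num = sum((a - mean_s) * (b - mean_rs) for a, b in zip(log_s, log_rs))
    den = sum((a - mean_s) ** 2 for a in log_s)
    return num / den

# Synthetic uncorrelated noise; H is expected to lie near 0.5.
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(4096)]
h_noise = hurst_rs(noise, window_sizes=[16, 32, 64, 128, 256, 512])
print(round(h_noise, 3))
```

For short windows the R/S estimate is known to be biased somewhat upward, so values slightly above 0.5 for uncorrelated noise are not unusual with this simple estimator.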

3.4. Hjorth Descriptors

The Hjorth descriptors, namely activity, mobility, and complexity, were mainly developed for the quantification of electroencephalograms (EEGs) by Hjorth [41]. The Hjorth parameters are called normalized slope descriptors because they can be defined in terms of the first and second derivatives of the signal and are, respectively, defined as follows:

\mathrm{Activity} = m_{0} = \mathrm{var} (x (t))

(16)

\mathrm{Mobility} = \sqrt{\frac{m_{2}}{m_{0}}} = \sqrt{\frac{\mathrm{Activity} (\frac{d x (t)}{d t})}{\mathrm{Activity} (x (t))}}

(17)

\mathrm{Complexity} = \sqrt{\frac{\frac{m_{4}}{m_{2}}}{\frac{m_{2}}{m_{0}}}} = \frac{\mathrm{Mobility} (\frac{d x (t)}{d t})}{\mathrm{Mobility} (x (t))}

(18)

Activity is the squared standard deviation (variance) of the amplitude and represents the width of variation of the time series. Mobility is defined as the square root of the ratio between the variance of the first derivative and the variance of the amplitude and represents the mean frequency of the time series. Complexity is the ratio between the mobility of the first derivative and the mobility of the time series itself; it reflects the deviation of the slope and can be employed as a measure of change in the frequency of the signal.
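For a sampled series, the derivatives in Equations (16)–(18) are commonly approximated by first differences. The sketch below follows that assumption, which is not stated in the text, and uses a synthetic sine wave as input; the function names are hypothetical.

```python
import math

def variance(x):
    """Population variance of a sequence."""
    mean = sum(x) / len(x)
    return sum((value - mean) ** 2 for value in x) / len(x)

def diff(x):
    """First differences, used here as a discrete derivative."""
    return [b - a for a, b in zip(x, x[1:])]

def hjorth(x):
    """Return (activity, mobility, complexity) of a sampled series."""
    dx = diff(x)
    ddx = diff(dx)
    activity = variance(x)
    mobility = math.sqrt(variance(dx) / activity)
    mobility_dx = math.sqrt(variance(ddx) / variance(dx))
    complexity = mobility_dx / mobility
    return activity, mobility, complexity

# Synthetic sine wave with a period of 50 samples (20 full periods).
omega = 2 * math.pi / 50
wave = [math.sin(omega * t) for t in range(1000)]
activity, mobility, complexity = hjorth(wave)
print(round(activity, 4), round(mobility, 4), round(complexity, 4))
```

A pure sinusoid is the reference case for these descriptors: its complexity is close to 1, and its mobility is close to its angular frequency per sample, so departures from these values indicate a richer frequency content.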

4. Results and Discussion

In this section, we present the results of the time-series analysis using the different approaches mentioned in the Introduction and the Methods section. First (Section 4.1), we present the hierarchical clustering of the time series and discuss the groups obtained, trying to find whether there are common characteristics related to their use and cultivation along with their availability throughout the year. Second, in Section 4.2, we present the Pearson coefficient results between the time series. The correlation complex networks are built on these coefficients (Section 4.3), where we detect the communities formed in the networks and discuss the results. Then (Section 4.4), we present the results for the complex-network-transformed time series, where each time series is first transformed into a complex network using the visibility algorithm, and then, for each network, its metrics are extracted and employed in the clustering procedure. In Section 4.5, we present the results concerning the temporal evolution of the process, where persistence in prices is observed. Section 4.6 contains the Hjorth parameters, describing the variability of the time series, and again, a clustering is performed. Finally, we compare the clustering results obtained in the present paper with those of previous work where a clustering based on system dynamics was performed using the phase space approach [26].

4.1. Clustering

Based on the whole time series (for all the time periods), a hierarchical clustering analysis was performed. Subsequently, the same clustering procedure was applied to three different time periods, and the results are presented in Figure 2. Employing the elbow criterion, the results were separated into six clusters for the whole time examined, eight clusters for period A, and six clusters for periods B and C. Below, a description of the clusters and a corresponding discussion is presented.

For the whole time series, the clusters (groups and corresponding colors) are as follows (Figure 2a). In the group labels, the first letter T corresponds to the whole-period data, A to the period A data, B to the period B data, and C to the third period. G stands for group, and the number identifies the group (for period A only, we also employ lowercase Roman numerals to denote subgroups).

TG1 = {dill parsley, spring onion, onions, cucumber pair, endives, carrots, beetroots};

TG2 = {Knossos cucumber, lettuce, tomatoes, spinach, salads, zucchini};

TG3 = {celery};

TG4 = {long-fruited pepper};

TG5 = {coarse pepper};

TG6 = {garlic}.

Figure 2b represents the case when data only in period A are employed.

AG1 = {dill parsley, spring onions, onion, cucumber pair, endives, carrots, beetroots};

AG2i = {Knossos cucumber, lettuce, tomatoes, spinach};

AG2ii = {zucchini};

AG2iii = {salads};

AG3 = {celery};

AG4 = {long-fruited peppers};

AG5 = {coarse pepper};

AG6 = {garlic}.

Figure 2c represents the case when data only in period B are employed.

BG1 = {dill parsley, spring onions, onions, cucumber pair, endives, carrots, beetroot};

BG2 = {cucumber Knossos, lettuce, tomatoes, spinach, salads, zucchini};

BG3 = {celery};

BG4 = {long-fruited peppers};

BG5 = {coarse pepper};

BG6 = {garlic}.

Figure 2d represents the case when data only in period C are employed.

CG1 = {dill parsley, spring onions, onions, cucumber pair, endives, carrot, beetroot};

CG2 = {cucumber Knossos, lettuce, tomatoes, spinach, salads, zucchini};

CG3 = {celery};

CG4-5 = {long-fruited pepper, coarse pepper};

CG6 = {garlic}.

It is interesting to discuss the various groups formed and their evolution over time.

The first group G1 remains the same in all analyses (see TG1, AG1, BG1, CG1), as we can see below.

TG1 = {dill parsley, spring onion, onions, cucumber pair, endives, carrots, beetroots};

AG1 = {dill parsley, spring onions, onion, cucumber pair, endives, carrots, beetroots};

BG1 = {dill parsley, spring onions, onions, cucumber pair, endives, carrots, beetroot};

CG1 = {dill parsley, spring onions, onions, cucumber pair, endives, carrots, beetroot}.

First, we comment on the properties of the group members related to their use in dishes and their prices.

Dill-parsley: in general, it has low prices and is used in many dishes and salads in Greek cuisine. It is also cultivated all year long.

Spring onions: in general, they have low prices (except for some characteristic periods during the year), and they are used in many dishes and salads.

Onions: they have relatively low prices and are used in the preparation of many dishes and salads (e.g., Greek salad). They are available throughout the year, since they can be used as soon as they are harvested or stored and sold later in the year.

Cucumber pair: they also have relatively low prices and are used in the preparation of salads, especially in summer (e.g., Greek salad, which is also widely consumed by tourists). In recent years, they have also been cultivated in greenhouses and thus are available nearly all year long.

Carrots: they are used in many dishes and salads and are available all year long at relatively low prices.

Endives: they have relatively low prices and are used mainly in salads.

Beetroot: it has relatively low prices, is used in salads, and, like endives, is a rather special product.

Endives and beetroot are used for more special salads (in contrast to lettuce and tomatoes) that are less oriented toward tourists.

In summary, we have a group of vegetables that are available throughout the year at relatively low prices and are used mainly in salads or as elements of dishes (such as carrots, spring onions, dill-parsley, and onions; the other vegetables in the group are not an important constituent of salads).

For the second group, as shown below, the whole-period group is the same as the groups of periods B and C.

TG2: {cucumber Knossos, lettuce, tomatoes, spinach, salads, zucchini};

BG2: {cucumber Knossos, lettuce, tomatoes, spinach, salads, zucchini};

CG2: {cucumber Knossos, lettuce, tomatoes, spinach, salads, zucchini}.

However, in period A, this group splits into subgroups.

AG2i: {cucumber Knossos, lettuce, tomatoes, spinach};

AG2ii: {zucchini};

AG2iii: {salads}.

Cucumber Knossos, lettuce, tomatoes, and spinach form a constant part of the group; they are widely used in dishes and have low prices. However, zucchini presents a higher price in period A, while salads present a less periodic behavior in that time window. Below, we give more detailed comments on the properties of the group members related to their use in dishes and their price.

Cucumber Knossos: they are available for long periods but not as common as the conventional cucumber.

Lettuce: it is available as green salad nearly all year long.

Tomatoes: they are the main constituent of salads (e.g., Greek salad) and of many dishes, and they are also used in sauce preparation, at home or in restaurants.

Spinach: it is also used as food (in dishes and pies) and in some salads. It is not produced all year round, but it is commonly stored in frozen form and used throughout the year. This availability plays a role in the price of fresh spinach.

Zucchini is a different product. It is used in special dishes and as an appetizer, especially in summer. A difference in period A is observed, as mentioned above.

Salads are like lettuce but are in general slightly more expensive.

Group G3 (a single element, celery):

TG3: {celery};

AG3: {celery};

BG3: {celery};

CG3: {celery}.

Celery is quite a special vegetable. It is used in small quantities as an accompanying element of sauces or soups but not as a salad or a dish in itself.

For groups 4 and 5:

TG4: {long-fruited pepper};

AG4: {long-fruited peppers};

BG4: {long-fruited peppers}.

Long-fruited pepper is mostly used seasonally in salads as well as in accompanying dishes, and it has relatively high prices, especially in some periods of the year.

TG5: {coarse pepper};

AG5: {coarse pepper};

BG5: {coarse pepper}.

Coarse peppers change in periodicity due to their production in greenhouses; they are mostly seasonal and are widely used in dishes that are also very popular among tourists.

A difference is observed only in period C:

CG4-5: {long-fruited pepper, coarse pepper}. In period C, the two peppers show closer price variation (in periods A and B, coarse peppers have higher prices).

Group G6:

TG6: {garlic};

AG6: {garlic};

BG6: {garlic};

CG6: {garlic}.

Garlic is a special product used to flavor many dishes. It is available all year long since it can be stored for long periods, which results in constant availability and relative price stability.

In summary, we observe the following characteristics for the groups obtained:

G1 contains mainly salads or vegetables accompanying salads (spring onions, dill-parsley), as well as those used in dish preparation for their taste.

G2 contains mainly vegetables that are also used in dishes, like tomatoes in filled tomatoes and zucchini in filled or fried zucchini (especially in summer).

G3 consists of celery, which is mainly used in soup dishes, while the green part is used a little in salads and in the preparation of several soup-like dishes.

G4 and G5 consist of peppers, which are used in dish preparation but also in salads, with the slight difference that coarse peppers are more widely used in dishes (stuffed coarse peppers are a very common dish).

G6 consists of garlic, which is used for giving taste, mainly in dish preparation and far less in salads.

We must mention that the clustering approach measures how close the price variations of the products are and not the dynamics of each price itself.
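The clustering step used in this subsection can be illustrated with a small, self-contained sketch. It assumes Euclidean distance between price series and average linkage, neither of which is confirmed by the text above, and it uses hypothetical toy prices. The list of merge heights is what an elbow-type criterion would inspect: a sharp jump in the height suggests where to stop merging.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally long price series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerative(series, n_clusters):
    """Average-linkage agglomerative clustering of a dict name -> series.

    Returns the clusters and the distance at which each merge happened.
    """
    clusters = [[name] for name in series]
    heights = []

    def linkage(c1, c2):
        total = sum(euclidean(series[a], series[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        # Merge the two closest clusters.
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        heights.append(linkage(clusters[i], clusters[j]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters, heights


# Hypothetical toy prices: two cheap and two expensive products.
prices = {
    "a": [1.0, 1.0, 1.0],
    "b": [1.1, 1.0, 1.0],
    "c": [5.0, 5.0, 5.0],
    "d": [5.0, 5.1, 5.0],
}
clusters, heights = agglomerative(prices, n_clusters=2)
all_merges = agglomerative(prices, n_clusters=1)[1]
```

Here the last merge height in `all_merges` is far larger than the first two, so an elbow-type rule would keep two clusters. As noted above, such a grouping reflects only the closeness of the price values.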

4.2. Pearson Correlations

In order to detect relations between pairs of variables, the Pearson correlation coefficient was calculated as described in Section 3.2.1. The results are presented in Figure 3 for the whole time period and the subperiods mentioned before. For the complete data set (Figure 3a), the highest correlations between vegetables are listed in Table 2. Only statistically significant correlations (p value < 0.05) are reported, and only these were taken into account for the construction of the corresponding adjacency matrices and networks.

In Table 2, one can see that prices are very highly correlated between the two types of peppers, followed by the cucumbers, along with lettuce and salads, which are similar products; these are mainly members of groups G1 and G2 in the clustering analysis performed in the previous section.

When performing the same analysis for the three different periods (Figure 3b,c), some interesting differences between periods A, B, and C were observed. In general, the correlations in period A are higher than both the overall correlations and those of periods B and C. This suggests that the introduction of the Euro as a currency has influenced the correlation of product prices.
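The step from pairwise correlations to a significance-filtered adjacency structure can be sketched as follows. This is an illustration only: the p value is approximated with the Fisher z-transform and a normal distribution, which is an assumption made here to keep the example dependency-free and not necessarily the test used in the study, and the three toy series are hypothetical.

```python
import math
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided p value from the Fisher z-transform (normal approximation)."""
    r = max(min(r, 0.999999), -0.999999)   # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def correlation_adjacency(series, alpha=0.05):
    """Weighted adjacency dict keeping only significant correlations."""
    names = list(series)
    adjacency = {name: {} for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(series[a], series[b])
            if approx_p_value(r, len(series[a])) < alpha:
                adjacency[a][b] = r
                adjacency[b][a] = r
    return adjacency


# Hypothetical series: y follows x exactly, w merely alternates in sign.
x = [float(t) for t in range(1, 11)]
toy = {"x": x, "y": [2 * v + 1 for v in x], "w": [(-1.0) ** t for t in range(10)]}
adjacency = correlation_adjacency(toy)
```

In this toy case only the pair (x, y) passes the significance filter, so the alternating series `w` remains an isolated node in the resulting network.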

4.3. Correlation Networks

Initially, the networks were constructed using the methodology of correlation networks. Then, for each period/network, the partition into communities was calculated using the Newman algorithm, and the results are presented in Figure 4.
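Assuming that the Newman algorithm refers to modularity-based community detection (the variant is not specified here), the quantity that such methods try to maximize can be sketched as below. The function computes the modularity Q of a given partition of an undirected, unweighted graph; the function name and the two-triangle graph are hypothetical illustrations.

```python
def modularity(adjacency, communities):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    adjacency: dict mapping each node to the set of its neighbours.
    communities: iterable of node collections that partition the graph.
    """
    two_m = sum(len(neigh) for neigh in adjacency.values())  # 2 * number of edges
    q = 0.0
    for community in communities:
        members = set(community)
        # Each internal edge is counted twice (once from each endpoint).
        internal = sum(1 for u in members for v in adjacency[u] if v in members)
        degree_sum = sum(len(adjacency[u]) for u in members)
        q += internal / two_m - (degree_sum / two_m) ** 2
    return q


# Hypothetical graph: two triangles with no edges between them.
graph = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2},
    4: {5, 6}, 5: {4, 6}, 6: {4, 5},
}
q_split = modularity(graph, [{1, 2, 3}, {4, 5, 6}])
q_single = modularity(graph, [{1, 2, 3, 4, 5, 6}])
```

Splitting the two triangles into separate communities gives Q = 0.5, whereas placing all nodes in one community gives Q = 0, so a modularity-maximizing procedure would prefer the split.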

Whole Period (Figure 4a)

Group 1 (onions and garlic);

Group 2 (cucumber pair, Knossos cucumber, zucchini, long-fruited peppers, coarse peppers and tomatoes);

Group 3 (dill parsley bales, endives, carrots, spring onions, lettuce, beetroot, salads, celery, spinach).

Period A (Figure 4b)

Group 1 (cucumber pair, Knossos cucumber, dill-parsley, tomatoes);

Group 2 (onions, beetroot, long-fruited peppers, coarse peppers);

Group 3 (endives, carrots, zucchini, spring onions, lettuce, salads, celery, garlic, spinach).

Very large groups are observed, which we believe are related mostly to the type of price variation before the introduction of the Euro, as well as their variability in time.

Period B (Figure 4c)

Group 1 (cucumber pair, Knossos cucumber, zucchini, tomatoes);

Group 2 (long-fruited peppers, coarse peppers);

Group 3 (dill parsley, spring onions, lettuce, beetroot, salads, spinach);

Group 4 (endives, carrots, onions, celery);

Group 5 (garlic).

Period C (Figure 4d)

Group 1 (cucumber pair, Knossos cucumber, zucchini);

Group 2 (long-fruited peppers, coarse peppers, tomatoes);

Group 3 (dill-parsley, carrots, spring onions, onions, garlic);

Group 4 (endives, lettuce, beetroot, salads, celery, spinach).

The reduction in the correlation between the various products, seen in the correlation matrices, is reflected in the increase in the number of groups.

The groups present some mixing of G1 and G2 members, as previously observed in the clustering-based classification of the whole time series, but one must bear in mind that the correlation-based complex network results are more representative of interdependencies between prices, such as cucumber Knossos with cucumber pair and long-fruited peppers with coarse peppers.

It is also of interest that, in period C, G3 and G4 contain different kinds of products. G3 contains elements that accompany dishes and salads, while G4 contains salad-oriented elements that in some cases are served on their own.

4.4. Complex Networks Transformed Time Series

The time series are transformed to complex networks following the visibility algorithm, and the measures degree, eccentricity, closeness centrality, betweenness centrality, clustering coefficient, eigenvector centrality, bridging coefficient, and bridging centrality are calculated for each product; the results are depicted in Figure 5a–h.

From Figure 5, it can be seen that the network measures of different products exhibit different behavior.

For example, in Figure 5a, it can be seen that nearly all products present an average degree between 0 and 0.4, with both types of peppers and garlic presenting a larger degree than the other products. This means that, on average, the prices of the latter are more linked to other nodes (prices in the time series) than in the case of other products. This happens when nodes with high values exist that can “see” (in the visibility algorithm) the different values. In fact, if one observes the corresponding time series in Figure 1k,l,o, this is the case.
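The visibility construction and the degree measure discussed above can be sketched as follows, assuming the natural visibility criterion (two time points are linked if the straight line joining them passes above every intermediate point). The function names and the four-point price series are hypothetical.

```python
def natural_visibility_graph(y):
    """Edges (a, b), a < b, of the natural visibility graph of series y."""
    edges = set()
    n = len(y)
    for a in range(n):
        for b in range(a + 1, n):
            # a and b "see" each other if every point between them lies
            # strictly below the straight line joining (a, y[a]) and (b, y[b]).
            visible = all(
                y[c] < y[a] + (y[b] - y[a]) * (c - a) / (b - a)
                for c in range(a + 1, b)
            )
            if visible:
                edges.add((a, b))
    return edges


def node_degrees(n, edges):
    """Degree of each of the n nodes given an undirected edge set."""
    degrees = [0] * n
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees


series = [1.0, 3.0, 2.0, 4.0]   # hypothetical prices
edges = natural_visibility_graph(series)
degrees = node_degrees(len(series), edges)
```

In this small example, the high value at index 1 sees three other points and therefore has the largest degree, which mirrors the observation that time series with pronounced peaks produce nodes with many links.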

As far as the eccentricity is concerned (Figure 5b), it can be seen that most products present values varying between 8 and 16. Onions present the largest eccentricity; i.e., they present the largest shortest path lengths. In general, their prices are more “distant from others”; this can be seen since the periodicities that we have seen in many other products are not observed (see Figure 1h). They are followed by spinach.

For betweenness centrality, which measures the number of shortest paths between pairs of nodes in the network that pass through a particular node, we can see that Knossos cucumbers, onions, and spinach present the highest values, while all other products vary between 0 and 0.6. This can be seen from the corresponding time series (Figure 1) since they present some high values that are linked with many values close to them. However, the opposite behavior is observed for closeness centrality, a measure used in network analysis to evaluate the centrality of a node within a network. It quantifies how close a node is to all other nodes in the network on average, based on the shortest path distance. This means that, on average, the nodes do not present high connectivity, as in the case of other products.

Eigenvector centrality can be applied to networks generated from time series data using the visibility algorithm to identify important time points or data points within the time series. These important points can then be further analyzed to understand their influence on the underlying dynamics captured by the time series. Again, it is evident that both types of peppers and garlic present distinctly different behaviors, with important nodes with very high values that can be seen over neighboring maximum values.

The clustering coefficient is defined as the fraction of triangles around a node, which is equivalent to the fraction of node neighbors that are neighbors of each other.

The bridging coefficient is often based on various network measures, such as betweenness centrality, which measures the number of shortest paths passing through a node. Nodes with high bridging coefficients are those that lie on many shortest paths between different communities or clusters in the network. The lowest values are found for peppers, which present significant peaks in successive periods with relatively small fluctuations in successive timesteps.

Bridging centrality refers to a measure that quantifies the extent to which a node or a set of nodes serve as bridges connecting different parts or communities within the network. The values are high for salads and tomatoes, which present more variability (fluctuations) in successive values than the other products.

Hierarchical clustering was then applied to the network measures, taking into account all network measures for each product. The results are presented in dendrogram form in Figure 6a. For the reader's convenience, the previously obtained results from the clustering of the time series are shown again in Figure 6b for comparison.

It can be seen that there are products that are separate from the others, such as garlic, long-fruited peppers, and coarse peppers, as well as two quite separate groups with some common members that are not exactly the same. The network-based results consider the dynamics in a more detailed manner. However, as has also been seen in the case of Hjorth analysis, some products present distinct behavior, namely garlic and peppers. The groups formed are

G1 {cucumber pair, tomatoes, beetroots, celery, dill parsley, spring onions, carrots};

G2 {cucumber Knossos, endives, salads, zucchini, lettuce, onions, spinach};

G3 {long-fruited peppers, garlic};

G4 {coarse peppers}.

One can see that G3 and G4 are similar to the groups of clustering of time series.

4.5. Hurst Results (Temporal Behavior)

The rescaled range analysis (R/S) method was employed in order to calculate the Hurst exponent, which is a measure of the long-range memory of a time series. For values 0 < H < 0.5, there exists anti-persistent behavior, i.e., a large value is followed by a small value and vice versa, while for values 0.5 < H < 1, there is persistent behavior, i.e., a small value tends to be followed by a small value and a large value by a large value. A value equal to 0.5 indicates that there is no autocorrelation in the time series.
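A minimal sketch of an R/S estimate is given below. It is only an illustration: it assumes non-overlapping windows whose sizes double from a hypothetical `min_window`, it applies the statistic directly to the series it is given, and it estimates H as the least-squares slope of log(R/S) against log(window size). The exact windowing and preprocessing used in the study are not stated here.

```python
import math


def rescaled_range(window):
    """R/S statistic of one window: range of cumulative deviations / std."""
    n = len(window)
    mean = sum(window) / n
    deviations = [v - mean for v in window]
    cumulative, total = [], 0.0
    for d in deviations:
        total += d
        cumulative.append(total)
    r = max(cumulative) - min(cumulative)
    s = math.sqrt(sum(d * d for d in deviations) / n)
    return r / s if s > 0 else None


def hurst_rs(x, min_window=8):
    """Estimate H as the slope of log(R/S) against log(window size)."""
    log_sizes, log_rs = [], []
    size = min_window
    while size <= len(x) // 2:
        values = []
        for start in range(0, len(x) - size + 1, size):
            rs = rescaled_range(x[start:start + size])
            if rs is not None:
                values.append(rs)
        if values:
            log_sizes.append(math.log(size))
            log_rs.append(math.log(sum(values) / len(values)))
        size *= 2
    # Ordinary least-squares slope of log(R/S) versus log(size).
    mean_x = sum(log_sizes) / len(log_sizes)
    mean_y = sum(log_rs) / len(log_rs)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_sizes, log_rs))
    den = sum((a - mean_x) ** 2 for a in log_sizes)
    return num / den


trend = [float(t) for t in range(256)]   # steadily rising series
alternating = [1.0, -1.0] * 128          # high value always followed by low
h_trend = hurst_rs(trend)
h_alt = hurst_rs(alternating)
```

For these two extreme toy series, the steadily rising one gives an estimate close to 1 (persistent), while the strictly alternating one gives an estimate below 0.5 (anti-persistent), in line with the interpretation of H given above.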

The results for the whole period as well as for the three different subperiods are presented in Figure 7. For the whole period, it is observed that all exponents are significantly higher than 0.5, indicating a relatively large persistence (i.e., a large value tends to be followed by a large value, and a small value tends to be followed by a small value).

It is of interest that even in the analysis of the three time periods, persistence is observed for all products, although the value of the exponent varies over time for several products.

One can see that the range of values of the Hurst exponent is larger in the first period, before the introduction of the Euro and any perturbation this may have created [51,52]; the range appears to be reduced in the following periods. A closer look shows that several products present a reduction of the Hurst exponent compared to period A.

For period B, the products with the largest reductions are:

{cucumber Knossos (id2), zucchini (id6), long-fruited peppers (id11), salads (id13), and celery (id14)}.

Smaller reductions are observed for:

{cucumber pair (id1), endives (id4), carrots (id5), lettuce (id9), coarse peppers (id12), spinach (id16)}.

The Hurst exponent remains more or less the same for the following products:

{dill-parsley (id3), spring onions (id7), onions (id8), beetroots (id10), garlic (id15), tomatoes (id17)}.

In period C, no significant modification of the values of the Hurst exponent is observed, except for products 1–7, i.e., cucumber pair, Knossos cucumbers, dill-parsley, endives, carrots, zucchini, and spring onions.

The variations observed in period B may be related to the change to the Euro currency and the increase in several prices, which have made the evolution less continuous than it was before.

The persistence in values can be related to the fact that during given periods, there is a given supply of products and, apart from cases of sudden demand for several of the products, such as the Easter holidays, the situation remains much the same. This may also lead to the conclusion that the collective behavior of consumers and producers drives prices. (One must bear in mind that consumers comprise not only independent household consumers but also shops, restaurants, and hotels.)

4.6. Hjorth Parameters

In Figure 8, the estimated results for the various Hjorth parameters for the three time periods studied are presented. The left column of Figure 8 shows the Hjorth activity parameter, whose behavior for the various vegetables is discussed first. It can be seen that in all periods, there are products for which activity values are above 0.2.

In the first period, this occurs in a more pronounced way for coarse peppers (id12), long-fruited peppers (id11), zucchini (id6), and salads (id13).

Coarse peppers also continue in the second period with an increase, while long-fruited peppers, salads, and garlic present such behavior in period B and also in period C (with salads presenting a slightly lower value than 0.2).

As was mentioned in the Methods section, activity (left column in Figure 8) is related to the variance of the time series, and this behavior can be verified from the time-series plot. Coarse peppers present very large variability, from less than 1 Euro up to 3 Euros or more. Zucchini's price also presents more variability in the first period, while it seems to be reduced in the following periods.

The activity of salads is also more pronounced in the first period, since they were not so common in that period but became more widely cultivated in the next periods (a change in people's behavior). Long-fruited peppers also have a relatively lower variability since they have also been cultivated in greenhouses.

It is also of interest to examine the mobility variation (central column in Figure 8). Lettuce presents the largest value in the first period, with a significant difference from the other products. Spring onions, beetroots, and tomatoes present the next largest values of mobility, which is based on the ratio between the variance of the first derivative and the variance of the time series and, as mentioned, represents the mean frequency of the time series. This can be verified since more frequent changes (not necessarily large ones) are observed in the prices of these products (Figure 1).

As far as complexity is concerned (right column in Figure 8), it can be seen that long-fruited peppers (id11) followed by garlic (id15), onions (id8), and zucchini (id6) have values larger than 7. This behavior seems to be persistent for these products with an increase also in the values of complexity. As mentioned, complexity indicates the deviation of the slope and can be seen as a measure of the change in frequency in the signal, something that can be seen in the time-series plot.

This variability based on all three Hjorth parameters is also represented in the hierarchical clustering presented in Figure 9. One can see that long-fruited peppers, garlic, onions, and zucchini separate from the rest in period A; garlic remains separate in all periods, along with peppers (both long-fruited and coarse), onions, and Knossos cucumbers in periods B and C. This difference can also be related to the change in eating habits, along with the change in cultivation processes (use of greenhouses for the cultivation of these products).

The groups and their members are presented in detail below (the first letter A, B, or C corresponds to the period under study; H corresponds to Hjorth; and Gi corresponds to the group formed).

AHG1 = {cucumber pair, spinach, celery, carrots, beetroot};

AHG2 = {Knossos cucumber, dill-parsley, salads, endives, coarse pepper};

AHG3 = {spring onions, lettuce, tomatoes};

AHG4 = {zucchini, onions, garlic};

AHG5 = {long-fruited peppers};

BHG1 = {endives, celery, spring onions, cucumber pair, tomatoes};

BHG2 = {dill-parsley, carrots, lettuce, salads, beetroot, zucchini, spinach};

BHG3 = {Knossos cucumber, long-fruited peppers, coarse peppers};

BHG4 = {onions};

BHG5 = {garlic};

CHG1 = {zucchini, spinach, dill-parsley, beetroot, carrots};

CHG2 = {cucumber pairs, spring onions, endives, lettuce, celery, salads};

CHG3 = {tomatoes};

CHG4 = {Knossos cucumber, long-fruited peppers};

CHG5 = {coarse peppers};

CHG6 = {onions, garlic}.

As can be verified, the members of the groups present similar variabilities, as can be seen from the price time series (Figure 1), for example, for period A for the members of the first group (AHG1), as well as for the members in the first group (BHG1) for period B.

These results can be used to measure the variability of prices and categorize the products as price-variable or not. This can be used as an indicator for large buyers to have an idea about the quantities to buy.

In Table 3, a comparison of the various groups formed for the total period based on time-series clustering (G Tclus), visibility network metrics clustering (G Nvis clust), and correlation network (G cN), along with results from RPs metrics clustering from a previous work (G RPs clust) [22], is presented.

Time-series clustering presents similarity with visibility clustering results. It seems from the network approach that the visibility network better reflects finer differences than the correlation networks. It also seems that clustering and networks tend to form some larger groups than the RP methodology. The reason, perhaps, is that RPs are related to the phase-space reconstruction of the dynamical system, and the metrics are more directly related to the system dynamics. This grouping of products reflects dynamical similarities and not only similarities in prices, which may depend on several external factors.

In the case of clustering, what we identify is products varying in similar ways based on the distance of the prices (thus a kind of correlation). In the correlation network, we employ correlations (linear relations between the products’ prices; RPs take into account nonlinear behavior, too), while in the visibility algorithm, we seek some linear relation, too, although in a more subtle way.

So perhaps, depending on what information one wants to extract for the products under investigation, different methodologies can be chosen.

5. Conclusions

In the present work, we present the results of the time-series analysis of selected vegetable products for the central market of Thessaloniki. A variety of methods was used to analyze the series both independently and through their correlations, together with clustering procedures based on measures obtained from the analysis. The aims were to examine whether automatic clustering procedures can help produce groups appropriate for various management purposes and to assess the advantages of other methods compared to simple time-series clustering, along with other dynamical indices employed in previous works, like recurrence plots and recurrence quantification analysis.

It is of interest that the clusters obtained with the various methodologies reveal some effect of the use of vegetables as food (dishes, salads, or accompanying dishes) but also the effect of the range of their prices.

The analysis over different periods reveals some differences that, especially for the period after the introduction of the Euro in Greece, correspond to the increase in prices but are also related to some changes in cultivation (the increase in greenhouse cultures for some products).

Clustering of time series reveals six groups with different types of products. The cluster to which a product belongs is determined by its use and price. Using time-series clustering, we can group the products into two large clusters and four groups of isolated products. The first of the two large clusters contains widely used vegetables commonly used as accompaniments to many dishes, such as dill-parsley, spring onions, onions, cucumber, endives, carrots, and beetroots, which are often used in salads. The other cluster contains commonly used vegetables such as tomatoes and cucumbers, along with lettuce, salads, zucchinis, and spinach, which are used in pies throughout the year. Another interesting point is that the highest correlations were found between group 1 and 2 products, likely due to use and production timing.

The results of the networks present some differences. The correlation networks are slightly different, with fewer groups (three to five groups depending on the period examined, as listed in Section 4.3). They reveal various forms of “interrelations” between the group members themselves, along with other group members, since they are based on price correlations.

However, the complex networks produced by transforming the time series with the visibility algorithm give rise to various topological properties that, when employed in hierarchical agglomerative clustering, better reveal the differences in dynamics. The results are closer to those of the recurrence plot method (employed in previous work), which also reflects, in a different way, the dynamic behavior of the system. The detailed comparison with results from RPs reveals that RPs take more into account the dynamics of each product price, while the other methods employed in the present work refer more to the correlation between processes and perhaps less to the use of the products in dishes.

It turns out that clustering based on the dynamical behavior of the system (complex networks transformed time series and recurrence plots) better takes into account the dynamical behavior of the price of the products, which is related both to their use as well as their method and duration of cultivation around the year.

In a future work, several points could be examined. First, the above procedure for clustering products could be extended for products presenting missing values, such as for products that are not available throughout the year. Second, different algorithms for visibility graphs should be employed, such as those described in [34].

Author Contributions

Conceptualization, L.Z.; methodology, S.K. and A.C.; software, S.K. and A.C.; formal analysis, S.K.; data curation, S.K.; writing—original draft, S.K.; writing—review and editing, L.Z. and A.C.; visualization, A.C. and S.K.; supervision, L.Z.; project administration, L.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are available on request from the authors.

Acknowledgments

The authors would like to thank the administration of the Thessaloniki Central Market for providing access to the handwritten data used in the research.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Li, Y.; Liu, J.; Yang, H.; Chen, J.; Xiong, J. A Bibliometric Analysis of Literature on Vegetable Prices at Domestic and International Markets—A Knowledge Graph Approach. Agriculture 2021, 11, 951.
Claro, R.M.; do Carmo, H.C.E.; Machado, F.M.S.; Monteiro, C.A. Income, Food Prices, and Participation of Fruit and Vegetables in the Diet. Rev. Saúde Pública 2007, 41, 557–564.
Cassady, D.; Jetter, K.M.; Culp, J. Is Price a Barrier to Eating More Fruits and Vegetables for Low-Income Families? J. Am. Diet. Assoc. 2007, 107, 1909–1915.
Glanz, K.; Yaroch, A.L. Strategies for Increasing Fruit and Vegetable Intake in Grocery Stores and Communities: Policy, Pricing, and Environmental Change. Prev. Med. 2004, 39, 75–80.
Choudhury, S.; Shankar, B.; Aleksandrowicz, L.; Tak, M.; Green, R.; Harris, F.; Scheelbeek, P.; Dangour, A. What Underlies Inadequate and Unequal Fruit and Vegetable Consumption in India? An Exploratory Analysis. Glob. Food Secur. 2020, 24, 100332.
Kasprzak, C.M.; Sauer, H.A.; Schoonover, J.J.; Lapp, M.M.; Leone, L.A. Barriers and Facilitators to Fruit and Vegetable Consumption among Lower-Income Families: Matching Preferences with Stakeholder Resources. J. Hunger Environ. Nutr. 2021, 16, 490–508.
Kalmpourtzidou, A.; Eilander, A.; Talsma, E.F. Global Vegetable Intake and Supply Compared to Recommendations: A Systematic Review. Nutrients 2020, 12, 1558.
Głąbska, D.; Guzek, D.; Groele, B.; Gutkowska, K. Fruit and Vegetable Intake and Mental Health in Adults: A Systematic Review. Nutrients 2020, 12, 115.
Chowdhury, M.A.F.; Meo, M.S.; Uddin, A.; Haque, M.M. Asymmetric Effect of Energy Price on Commodity Price: New Evidence from NARDL and Time Frequency Wavelet Approaches. Energy 2021, 231, 120934.
Hameed, A.A.A. The impact of petroleum prices on vegetable oils prices: Evidence from cointegration tests. In Proceedings of the 3rd International Borneo Business Conference (IBBC), Sabah, Malaysia, 15–17 December 2008.
Bozma, G.; İmamoğlu, İ.K. The effects of gasoline price, real exchange rate and food price on vegetable and fruit export. Uluslar. İktisadi Ve İdari İncelemeler Derg. 2023, 41, 182–198.
Pal, D.; Mitra, S.K. Interdependence between Crude Oil and World Food Prices: A Detrended Cross-Correlation Analysis. Phys. Stat. Mech. Its Appl. 2018, 492, 1032–1044. [Google Scholar] [CrossRef]
Moradi, M.; Salehi, M.; Keivanfar, M. A Study of the Effect of Oil Price Fluctuation on Industrial and Agricultural Products in Iran. Asian J. Qual. 2010, 11, 303–316. [Google Scholar] [CrossRef]
Du, W.; Wu, Y.; Zhang, Y.; Gao, Y. The Impact Effect of Coal Price Fluctuations on China’s Agricultural Product Price. Sustainability 2022, 14, 8971. [Google Scholar] [CrossRef]
Sheldon, I.; Mishra, S.K.; Pick, D.; Thompson, S.R. Exchange Rate Uncertainty and US Bilateral Fresh Fruit and Fresh Vegetable Trade: An Application of the Gravity Model. Appl. Econ. 2013, 45, 2067–2082. [Google Scholar] [CrossRef]
Wang, S.; Li, Y.; Zhuang, J.; Liu, J. Agricultural Price Fluctuation Model Based on SVR. In Proceedings of the 2017 9th International Conference on Modelling, Identification and Control (ICMIC), Kunming, China, 10–12 July 2017; pp. 545–550. [Google Scholar]
Beniwal, A.; Poolsingh, D.; Shastry, S.S. Trends and Price Behaviour Analysis of Onion in India. Indian J. Agric. Econ. 2022, 77, 632–642. [Google Scholar] [CrossRef]
Birthal, P.; Negi, A.; Joshi, P.K. Understanding Causes of Volatility in Onion Prices in India. J. Agribus. Dev. Emerg. Econ. 2019, 9, 255–275. [Google Scholar] [CrossRef]
Sankaran, S. Demand Forecasting of Fresh Vegetable Product by Seasonal ARIMA Model. Int. J. Oper. Res. 2014, 20, 315–330. [Google Scholar] [CrossRef]
Qiao, Y.; Kang, M.; Ahn, B. Analysis of Factors Affecting Vegetable Price Fluctuation: A Case Study of South Korea. Agriculture 2023, 13, 577. [Google Scholar] [CrossRef]
Kirci, M.; Isaksson, O.; Seifert, R. Managing perishability in the fruit and vegetable supply chains. Sustainability 2022, 14, 5378. [Google Scholar] [CrossRef]
Dong, S.; Mo, K. Analysis of factors for vegetable price fluctuation in Lu’an. In Exploring the Financial Landscape in the Digital Age; CRC Press: Boca Raton, FL, USA, 2024; pp. 456–462. [Google Scholar]
Fang, J.; He, Y.; Zhao, J.; Lu, Y. Analysis and Research on the Problem of Automatic Pricing and Replenishment Decision for Vegetable Categories. In Proceedings of the 2024 43rd Chinese Control Conference (CCC), Kunming, China, 28–31 July 2024; pp. 1609–1614. [Google Scholar]
Jin, D.; Yin, H.; Gu, Y.; Yoo, S.J. Forecasting of Vegetable Prices Using STL-LSTM Method. In Proceedings of the 2019 6th International Conference on Systems and Informatics (ICSAI), Shanghai, China, 2–4 November 2019; pp. 866–871. [Google Scholar]
Paul, R.K.; Yeasin, M.; Kumar, P.; Kumar, P.; Balasubramanian, M.; Roy, H.S.; Paul, A.K.; Gupta, A. Machine learning techniques for forecasting agricultural prices: A case of brinjal in Odisha, India. PLoS ONE 2022, 17, e0270553. [Google Scholar] [CrossRef]
Karakasidou, S.; Fragkou, A.; Zachilas, L.; Karakasidis, T. Exploring Price Patterns of Vegetables with Recurrence Quantification Analysis. Appl. Math. 2024, 4, 1012–1046. [Google Scholar] [CrossRef]
Zou, Y.; Donner, R.V.; Marwan, N.; Donges, J.F.; Kurths, J. Complex Network Approaches to Nonlinear Time Series Analysis. Phys. Rep. 2019, 787, 1–97. [Google Scholar] [CrossRef]
Tsiotas, D. Detecting Different Topologies Immanent in Scale-Free Networks with the Same Degree Distribution. Proc. Natl. Acad. Sci. USA 2019, 116, 6701–6706. [Google Scholar] [CrossRef]
Hasson, U.; Iacovacci, J.; Davis, B.; Flanagan, R.; Tagliazucchi, E.; Laufs, H.; Lacasa, L. A Combinatorial Framework to Quantify Peak/Pit Asymmetries in Complex Dynamics. Sci. Rep. 2018, 8, 3557. [Google Scholar] [CrossRef]
Yang, Y.; Yang, H. Complex Network-Based Time Series Analysis. Phys. Stat. Mech. Its Appl. 2008, 387, 1381–1386. [Google Scholar] [CrossRef]
Zhang, J.; Small, M. Complex Network from Pseudoperiodic Time Series: Topology versus Dynamics. Phys. Rev. Lett. 2006, 96, 238701. [Google Scholar] [CrossRef] [PubMed]
Lacasa, L.; Luque, B.; Ballesteros, F.; Luque, J.; Nuño, J.C. From Time Series to Complex Networks: The Visibility Graph. Proc. Natl. Acad. Sci. USA 2008, 105, 4972–4975. [Google Scholar] [CrossRef] [PubMed]
Donner, R.V.; Zou, Y.; Donges, J.F.; Marwan, N.; Kurths, J. Recurrence Networks—A Novel Paradigm for Nonlinear Time Series Analysis. New J. Phys. 2010, 12, 033025. [Google Scholar] [CrossRef]
Silva, V.F.; Silva, M.E.; Ribeiro, P.; Silva, F. Time Series Analysis via Network Science: Concepts and Algorithms. WIREs Data Min. Knowl. Discov. 2021, 11, e1404. [Google Scholar] [CrossRef]
Zhuang, E.; Small, M.; Feng, G. Time Series Analysis of the Developed Financial Markets’ Integration Using Visibility Graphs. Phys. Stat. Mech. Its Appl. 2014, 410, 483–495. [Google Scholar] [CrossRef]
Charakopoulos, A.Κ.; Karakasidis, T.E.; Papanicolaou, P.N.; Liakopoulos, A. The Application of Complex Network Time Series Analysis in Turbulent Heated Jets. Chaos Interdiscip. J. Nonlinear Sci. 2014, 24, 024408. [Google Scholar] [CrossRef] [PubMed]
Iacobello, G.; Scarsoglio, S.; Ridolfi, L. Visibility Graph Analysis of Wall Turbulence Time-Series. Phys. Lett. A 2018, 382, 1–11. [Google Scholar] [CrossRef]
Charakopoulos, A.K.; Katsouli, G.A.; Karakasidis, T.E. Dynamics and Causalities of Atmospheric and Oceanic Data Identified by Complex Networks and Granger Causality Analysis. Phys. Stat. Mech. Its Appl. 2018, 495, 436–453. [Google Scholar] [CrossRef]
Aranburu-Imatz, A.; Jiménez-Hornero, J.E.; Morales-Cané, I.; López-Soto, P.J. Environmental Pollution in North-Eastern Italy and Its Influence on Chronic Obstructive Pulmonary Disease: Time Series Modelling and Analysis Using Visibility Graphs. Air Qual. Atmos. Health 2023, 16, 793–804. [Google Scholar] [CrossRef] [PubMed]
Zheng, M.; Domanskyi, S.; Piermarocchi, C.; Mias, G.I. Visibility Graph Based Temporal Community Detection with Applications in Biological Time Series. Sci. Rep. 2021, 11, 5623. [Google Scholar] [CrossRef]
Hjorth, B. EEG Analysis Based on Time Domain Properties. Electroencephalogr. Clin. Neurophysiol. 1970, 29, 306–310. [Google Scholar] [CrossRef]
Hjorth, B. The Physical Significance of Time Domain Descriptors in EEG Analysis. Electroencephalogr. Clin. Neurophysiol. 1973, 34, 321–325. [Google Scholar] [CrossRef]
Hurst, H.E. Long-Term Storage Capacity of Reservoirs. Trans. Am. Soc. Civ. Eng. 1951, 116, 770–799. [Google Scholar] [CrossRef]
Mandelbrot, B.B.; Wallis, J.R. Noah, Joseph, and Operational Hydrology. Water Resour. Res. 1968, 4, 909–918. [Google Scholar] [CrossRef]
Georgopoulou, E.; Mirasgedis, S.; Sarafidis, Y.; Vitaliotou, M.; Lalas, D.P.; Theloudis, I.; Giannoulaki, K.-D.; Dimopoulos, D.; Zavras, V. Climate Change Impacts and Adaptation Options for the Greek Agriculture in 2021–2050: A Monetary Assessment. Clim. Risk Manag. 2017, 16, 164–182. [Google Scholar] [CrossRef]
Sarantopoulos, A.; Korovesis, S. Effects of Climate Change in Agricultural Areas of Greece, Vulnerability Assessment, Economic-Technical Analysis, and Adaptation Strategies. Environ. Sci. Proc. 2023, 26, 173. [Google Scholar] [CrossRef]
Varsha, K.; Patil, K.K. An Overview of Community Detection Algorithms in Social Networks. In Proceedings of the 2020 International Conference on Inventive Computation Technologies (ICICT), Coimbatore, Tamilnadu, 26–28 February 2020; pp. 121–126. [Google Scholar]
Newman, M.E.J.; Girvan, M. Finding and Evaluating Community Structure in Networks. Phys. Rev. E 2004, 69, 026113. [Google Scholar] [CrossRef] [PubMed]
Blondel, V.D.; Guillaume, J.-L.; Lambiotte, R.; Lefebvre, E. Fast Unfolding of Communities in Large Networks. J. Stat. Mech. Theory Exp. 2008, 2008, P10008. [Google Scholar] [CrossRef]
Newman, M.E.J. Modularity and Community Structure in Networks. Proc. Natl. Acad. Sci. USA 2006, 103, 8577–8582. [Google Scholar] [CrossRef]
Pelagidis, T.; Toay, T.N. Expensive Living: The Greek Experience under the Euro; Levy Economics Institute: Annandale-On-Hudson, NY, USA, 2006. [Google Scholar]
Glauben, T.; Loy, J.-P.; Meyer, J. (Eds.) The Impact of Euro Introduction on the Vertical Price Transmission in the German Food Market—Does Money Illusion Matter? Contributed Paper; European Association of Agricultural Economists (EAAE): Brussels, Belgium, 2005. [Google Scholar]

Figure 1. Time series for the vegetables studied in this article.

Figure 2. Clustering of time series for (a) the whole period, (b) period A, (c) period B, and (d) period C.

Figure 3. Correlation matrix between the products for (a) the whole period, (b) period A, (c) period B, and (d) period C.

Figure 4. Correlation complex network of the product process for (a) the whole period, (b) period A, (c) period B, and (d) period C. Different colors represent different communities.

Figure 5. (a–h) Degree, eccentricity, closeness centrality, betweenness centrality, clustering coefficient, eigenvector centrality, bridging coefficient, and bridging centrality for all products.

Figure 6. (a) Clustering based on all network measures; (b) clustering based on time series as a whole.

Figure 7. Hurst exponent for various periods: (a) the whole period, (b) Period A, (c) Period B, and (d) Period C.

Figure 8. Left column: Hjorth activity for three time periods; center column: Hjorth mobility for three time periods; right column: Hjorth complexity for three time periods.

Figure 9. Clustering of products based on the combined Hjorth parameters.

Table 1. Vegetables for which time-series analysis was performed.

ID	Description
1	Cucumber pair
2	Knossos cucumbers
3	Dill, parsley bales
4	Endives
5	Carrots
6	Zucchini
7	Spring onions, fresh bale
8	Onions
9	Lettuce
10	Beetroot
11	Long-fruited peppers
12	Coarse peppers
13	Salads
14	Celery
15	Garlic
16	Spinach
17	Tomatoes

Table 2. Pearson correlation values for the whole period (only values above 0.10 are presented).

Product 1	Product 2	Correlation value
Long-fruited peppers (id11)	Coarse peppers (id12)	0.34
Cucumber pair (id1)	Knossos cucumbers (id2)	0.24
Lettuce (id9)	Salads (id13)	0.23
Salads (id13)	Spinach (id16)	0.20
Dill, parsley bales (id3)	Spring onions, fresh bale (id7)	0.19
Spring onions, fresh bale (id7)	Lettuce (id9)	0.15
Lettuce (id9)	Celery (id14)	0.15
Cucumber pair (id1)	Zucchini (id6)	0.14
Carrots (id5)	Spring onions, fresh bale (id7)	0.11
Knossos cucumbers (id2)	Zucchini (id6)	0.10
Dill, parsley bales (id3)	Carrots (id5)	0.10
Endives (id4)	Salads (id13)	0.10
Carrots (id5)	Celery (id14)	0.10

Table 3. Global variability comparison for various methods in the paper and comparison with RPs [22].

Vegetables	G1 RPs (Clust)	G2 RPs (Clust)	G3 RPs (Clust)	G4 RPs (Clust)	G5 RPs (Clust)	G1 Tclus	G2 Tclus	G3 Tclus	G4 Tclus	G5 Tclus	G6 Tclus	G1 Nvisib Clust	G2 Nvisib Clust	G3 Nvisib Clust	G4 Nvisib Clust	G1 cN	G2 cN	G3 cN
Garlic
Onions
Carrots
Dill parsley
Celery
Knossos cucumbers
Beetroots
Spring onions
Coarse peppers
Long-fruited peppers
Zucchini
Spinach
Salads
Endives
Lettuce
Tomatoes
Cucumbers

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
