Article

Learning Spatial Density Functions of Random Waypoint Mobility over Irregular Triangles and Convex Quadrilaterals

1 Faculty of Data Science, City University of Macau, Macau SAR, China
2 School of Computer and Mathematical Sciences, The University of Adelaide, Adelaide, SA 5005, Australia
* Author to whom correspondence should be addressed.
Mathematics 2025, 13(6), 927; https://doi.org/10.3390/math13060927
Submission received: 13 February 2025 / Revised: 4 March 2025 / Accepted: 8 March 2025 / Published: 11 March 2025

Abstract:
For the optimization and performance evaluation of mobile ad hoc networks, a beneficial but challenging task is to derive, from nodal movement behavior, the steady-state spatial density function of nodal locations over a given finite area. Such derivation, however, is often intractable once any assumption of the mobility model departs from the basic setting, e.g., when the movement area is irregular in shape. As a first endeavor, we address this density derivation problem for the classic random waypoint mobility model over irregular convex polygons, including triangles (i.e., 3-gons) and quadrilaterals (i.e., 4-gons). By mixing multiple Dirichlet distributions, we first devise a mixture density neural network tailored for density approximation over triangles and then extend this model to accommodate convex quadrilaterals. Experimental results show that our Dirichlet mixture model (DMM) can accurately capture the irregularity of ground-truth density distributions at low training cost, markedly outperforming the classic Gaussian mixture model (GMM).

1. Introduction

In mobile ad hoc networks (MANETs), given the movement behavior of nodes, a common and practical concern is to know the nodal density distribution over time in a certain finite area. Such a priori geospatial knowledge can facilitate online or offline performance evaluation and optimization of the network, enabling connectivity assessment [1], wireless capacity estimation [2], and resource placement optimization [3], to name a few. For instance, imagine a flying ad hoc network (FANET) composed of multiple unmanned aerial vehicles (UAVs) that patrol along a land/coastal border to counter smuggling activities, or a FANET that hovers over a hazardous region (e.g., a forest patch prone to wildfire) to maintain surveillance of ongoing or potential disasters [4]. Each UAV would roam within the target area following a certain movement behavior, e.g., the random waypoint (RWP) mobility [5], one of the most widely used mobility models in MANETs. In either scenario, it is very beneficial to know how the location of a UAV would be distributed probabilistically, such that additional resources or facilities like ground/aerial sinks can be deployed accordingly (e.g., in proportion to the spatial density of the UAV), thus improving the efficiency of data collection at the sinks.
Practically, the movement area of concern is likely to vary over time in both size and shape (e.g., due to the expansion of a wildfire), as a result of which it is hard to retain optimal positioning of the sinks if the UAV's new density distribution cannot be obtained promptly once the area changes. A conventional way to obtain the nodal distribution is via simulation of a mobility model [6], which, however, is mostly too time-consuming to be useful in applications like the aforementioned example. Often, it can take hours for a single run of the RWP model to (closely) reach its steady state (see Section 4), and this time cost can extend to days if the simulation needs to be conducted at a high resolution for fine-grained density evaluation. Other than simulation, it is also possible to mathematically derive exact or approximate probability density functions directly from the mobility model. The analytic expressions can be used not only to produce precise density values at arbitrary locations within the movement area but also to enable theoretical network performance analysis [1]. Nonetheless, such derivation, even inclusive of approximation, is mostly intractable except for a few cases where the mobility model is substantially regulated [1,3,7]. In [1], for instance, the nodes are assumed to follow (variants of) the RWP model, and the density functions are available for approximation in closed form only for areas that are regular in shape, e.g., the unit disk, or regular polygons like an equilateral triangle and a square. Also, in [3], the RWP mobility is extended to incorporate the potential recharging behavior of human agents, but the movement area is still limited to the regular unit disk to retain analytic tractability. The difficulty of deriving spatial density functions for RWP mainly comes from its "border effect" [7], meaning that the nodal trace tends to concentrate around the center of the area even if the waypoints are scattered uniformly at random, with the concentration pattern determined by the specific shape of the border. This effect renders it a formidable task to analyze RWP over irregularly shaped areas, although such areas have broader applications than regular ones.
One may consider circumventing this challenge by assuming other (synthetic) mobility models, e.g., random walk and random direction [8], which appear to be more amenable to mathematical analysis [9,10]. However, we omit these models here, considering their more limited applicability compared to RWP. Because of its ability to model the common behavior of movement driven by destinations, RWP, together with its extensions, remains a popular framework for mimicking various sophisticated mobility patterns [11,12].
In light of the difficulty mentioned above, we propose a new method to derive highly accurate, approximate, closed-form spatial density functions for the classic RWP mobility over movement areas that are irregular convex polygons in shape. For concave polygons, the RWP mobility model has to be redefined to determine the movement behavior around vertices whose interior angles are greater than 180° [13]. Since there is still no generally agreed-upon RWP variant for this scenario, we leave this problem for future work. To the best of our knowledge, such analytic results for RWP are still lacking. Our solution is to construct and train a deep neural network (using only a small sample set of data from simulation) to learn the optimal parameter settings of a probability density function with a predetermined form. This learning model owes its inspiration to the so-called mixture density network (MDN) [14], which mixes multiple simpler (e.g., Gaussian) distributions, $f_k(x)$, to approximate the target complex distribution as $g(x) = \sum_{k=1}^{K} \eta_k f_k(x)$, where $\eta_k$ denotes the mixing weight of the $k$-th component. Notably, to best accommodate the irregularly polygonal movement areas (e.g., triangles, quadrilaterals), we mix Dirichlet distributions, which have a triangular finite support [15,16], instead of the prevalent Gaussian distribution, which is infinitely supported (without non-trivial truncation). Our experiments show that the choice of an MDN adopting the Dirichlet distribution as its kernel function is profound, in the sense that unseen density distributions of RWP over polygons can be approximated accurately, and a small number of training examples is enough for the learning model to generalize from (see Section 4). This relatively high generalizability arises because the neural network only needs to optimize a handful of parameter values for the kernels, which act as constraints imposed on the learning model. Once the model is sufficiently trained and deployed, the target density distribution can be inferred within milliseconds (as in [17]).
The main contributions of this paper are outlined as follows:
  • We propose a Dirichlet-based mixture model (DMM) to address, for the first time, the problem of approximating the spatial density functions of the classic RWP mobility over irregular triangles (i.e., 3-gons). Since the trivariate Dirichlet distribution used for mixture has a support that is not defined in the Cartesian plane, where the movement area resides, we also employ the techniques of conversion between (Cartesian and barycentric) coordinate systems and change of variables to make it feasible to evaluate density values of RWP based on the Dirichlet distribution.
  • Extending the framework of DMM proposed for triangles, we also address the density fitting problem over irregular convex quadrilaterals (i.e., 4-gons). For the extension, we propose a decomposition method that enables approximating the density distribution over a (convex) polygon by decomposing the polygon (together with its overall distribution) into multiple sub-triangles. This method can also be applied to n-gons with n > 4 by combining more sub-triangles.
  • To demonstrate the performance of DMM, we conduct experiments in comparison to the classic Gaussian-based mixture model (GMM) as the benchmark. The experimental results show that the DMM we have proposed can effectively and efficiently approximate the spatial density distribution of RWP over triangular and quadrilateral movement areas, in the sense that it (1) respects the border of any polygonal area exactly, and (2) can achieve markedly lower approximation errors than GMM, (3) while requiring only a sparse sample set of examples for training.

2. Related Work

There are only a couple of recent studies directly related to the problem addressed in this paper, mainly from our previous work, including [2,3,17]. Nonetheless, for thoroughness, we consider an extended scope for the literature review to include noteworthy studies that are potentially related and informative in one or more aspects.
Mathematical analysis of RWP mobility. A closed-form expression of the spatial density distribution of mobility models can not only accelerate simulations of MANETs by initially positioning the nodes according to their steady-state distribution, but also enable theoretical evaluation of the network performance (e.g., connectivity, capacity) based on the density functions obtained. To attain these benefits, a series of studies have considered formal analysis of mobility models, with RWP-related ones being the most appealing and challenging [1,2,3,7,18,19,20]. The studies of [7,18] were among the first to analyze the density distributions of RWP, giving closed-form expressions for circular and rectangular areas. To further simplify the derivation while accommodating more areas in arbitrary convex shapes, the work of [1] provides a general explicit density function, which, however, is still limited to an integral formulation. Based on the integral, the densities over the unit disk, equilateral triangle, square, and hexagon are approximated in the closed form, while such analyticity for irregular areas is still missing. In our previous work [2,3], we adopt the Palm calculus first introduced in [19] to cover the derivation of density for an RWP extension where the nodes assume a type of charging-aware mobility. We can derive exact and closed-form expressions for one-dimensional areas (i.e., a segment or a ring), but in the case of a two-dimensional unit disk, the result is as complicated as a triple integral.
In general, owing to the inherent intractability of the problem, it appears that progress on the analytic formulation of the spatial density functions for RWP (be it exact or approximate) has been relatively stagnant for sophisticated cases, especially those assuming irregularly shaped movement areas. In this paper, we intend to resume this line of work, by newly adopting a machine learning method (i.e., an MDN) instead of pure mathematical analyses.
Estimation of spatial density distributions. Apart from mathematical analyses, there exists work, mainly from fields other than networking (e.g., transportation, urban informatics), on using mixture models (mostly based on Gaussians) for fitting the spatial density of mobility traces not specific to RWP. For instance, recognizing that there are social (e.g., friendship) and temporal factors driving the location transitions of mobile users, the work of [21] proposes the PSMM (Periodic and Social Mobility Model) to capture the dual-faceted movement patterns, based on a mixture of the power-law distribution and Gaussian distributions. The movement area concerned therein is a simple square, with no consideration of density leaking outside. In [22], still for modeling mobile users, Gaussian distributions are used as the kernel for multi-scale spatial density fitting. The novelty of this study lies in the fact that the correlation matrices (or so-called bandwidths) of the Gaussians can be adapted according to the population density of the area considered. In the context of vehicles on roads, a mixture model based on the (infinitely supported) Laplacian distribution is introduced in [23] to predict the future locations of moving agents. Again, the area's border plays no specific role in the solutions proposed in [22,23].
For spatial density estimation, one can also use MDNs, employing a neural network to learn the parameters of the kernel distributions. For instance, a Gaussian-based MDN learning model is proposed in [24] for the stable multimodal prediction of the states of objects (e.g., bicyclists at intersections). Despite its sophistication in addressing the potential instability of MDNs, the distributions used in [24] are still infinitely supported. Another work [25] extends MDNs to recurrent neural networks, devising a recurrent mixture density network to forecast rider demand in urban MoD (Mobility-on-Demand) systems; unlike the others, the focus of this work leans towards temporal rather than spatial density prediction. Moreover, in [26], a novel type of MDN called a graph mixture density network (GMDN) is proposed to better capture the graph-like structure of input data. While this GMDN is able to combine the benefits of graph neural networks [27] and MDNs to produce accurate multimodal distributions, the primary assumption for the kernel is still Gaussian. Lastly, our recent work [17] proposes a new MDN model based on the circularly supported Möbius distribution in order to fit the density distribution of a charging-aware RWP mobility model over a unit disk. Although the border of the area is respected in [17], the shape assumed is still regular (i.e., a unit disk), which limits the applicability of this MDN.
Regardless of the specific mobility and application scenarios, the most salient distinction between the studies above and ours lies in their indifference to the delineation of the movement area, whereas obeying that delineation is the crux of this paper. The prevalent mixture models (be they MDN-type or not) based on infinitely supported kernel distributions cannot suit our need to accommodate irregular finite movement areas.
There also exist alternative methodologies that can potentially accommodate density estimation within a bounded irregular area. For instance, based on a density estimation method called conditional normalizing flows (CNFs) [28], a generative model named the recurrent flow network is proposed in [29] for learning the location-dependent density of urban mobility. Although inattention to the regional border remains in [29], we suspect that adopting a finitely supported prior distribution other than Gaussians may facilitate modeling for an irregular support. Also, we note that another recent work [30] attempts to incorporate environmental constraints into estimation by adopting the particle filter algorithm; however, the density still leaks into the forbidden areas. Lastly, one may also consider the use of spatial point processes for mobility modeling (e.g., [31,32]), which can enable the "density" derivation over a bounded area. Nevertheless, it should be noted that the notion of density concerned in such methods is the point density (or intensity) rather than the probability density of interest.
General non-Gaussian mixture models. In other fields, e.g., economics, physics, and speech processing, where the focus of density estimation is not on geospatial mobility modeling, it appears that mixture models using non-Gaussian kernels (e.g., beta, gamma, or Dirichlet distributions) with (semi-)finite supports are more common [16,33,34,35]. Notably, in [16], the complexity of DMM for model-based clustering is discussed and a new parameter estimation method based on expectation maximization (EM) is proposed for optimal parameterization with higher interpretability. The proposed method has been applied to high-dimensional (e.g., brain cancer) data to demonstrate its versatility. Moreover, for cluster analysis of compositional data (i.e., positive vectors with their component values summing to one), a DMM parameterized by a hard EM algorithm is proposed in [36] to satisfy the sum constraint and also to accommodate the issue of empty clusters. Compared to GMM, they show that DMM is more efficient for clustering high-dimensional data. Also for compositional data, a new Dirichlet-based regression model is proposed in [37] to account for the potential spatial interdependence in data by including the “spatial lag” in the estimation process. No Dirichlet mixture is mentioned in this work.
Note that none of the works above are concerned with mobility modeling, and their applications mostly target parameter estimation for (multivariate) clustering, while we pursue a learning model that can promptly predict parameters fitting the density distributions of RWP confined to different areas.

3. Mixture Density Networks Based on Dirichlet Distributions

Generally, assuming an arbitrary convex polygonal (i.e., triangular or quadrilateral) movement area $\mathcal{P} \subset \mathbb{R}^2$, our main objective over the Cartesian plane of $(x, y) \in \mathbb{R}^2$ is to find an analytic parameterized density function, $\hat{g}(x, y; \theta) \in \mathbb{R}_{\geq 0}$, along with its parameters, $\theta$, such that the distance $\mathcal{L}$ (in terms of a certain metric) between $\hat{g}(x, y; \theta)$ and the ground-truth spatial density distribution of RWP over $\mathcal{P}$, i.e., $g(x, y) \in \mathbb{R}_{\geq 0}$, is minimized over $(x, y) \in \mathcal{P}$, while $\hat{g}(x, y; \theta) = 0$ for $(x, y) \notin \mathcal{P}$. More formally, for any given polygon $\mathcal{P}$, we aim to solve the following optimization problem:
$$\min_{\hat{g},\,\theta} \; \mathcal{L}\big(\hat{g}(x, y; \theta),\, g(x, y)\big) \quad \text{s.t.} \quad \iint_{\mathcal{P}} \hat{g}(x, y; \theta)\, dx\, dy = 1, \qquad \hat{g}(x, y; \theta) = 0 \;\; \text{for } (x, y) \notin \mathcal{P}. \tag{1}$$
Addressing the above general constrained problem "from scratch" can be genuinely difficult, since both the function $\hat{g}$ and its parameterization $\theta$ need to be decided while the finite polygonal support is respected. Instead, we can heuristically predetermine the form of $\hat{g}$, i.e., adopt the Dirichlet distribution (whose unique triangular support provides the inspiration). Given $\hat{g}$ in the form of (mixed) Dirichlet distributions, we then resort to a neural network to learn and quickly induce the optimal values of $\theta$, even for an unseen density distribution over an unseen $\mathcal{P}$ (whose features are given).
Using the Dirichlet distribution as the kernel, we can build MDN-based learning models for our spatial density fitting task over convex polygons. In this section, we first present the design of the learning model for irregular triangles, which is then further adapted to accommodate irregular convex quadrilaterals, exemplifying the extendability of the learning model for other convex n-gons with n > 3 .

3.1. Mixture Density Learning for Triangles

The MDN learning model we have constructed for fitting irregular triangles is shown in Figure 1, where, as an example, we assume $K = 2$ component distributions are used for mixture. In this learning model, a six-dimensional feature vector, denoted by $(a_1, b_1, a_2, b_2, a_3, b_3)$, is first fed as the input. This vector corresponds to the six Cartesian coordinates of the three vertices of the triangular area, within which the node roams along a "zig-zag" trajectory following the classic RWP mobility. That is, the node keeps moving along a straight path until it reaches a "waypoint", at which time a new waypoint is generated uniformly at random in the area and the node then heads straight towards it, and so forth. Without loss of generality, we assume that the triangular movement area (denoted by $\mathcal{T}$) is always encompassed by the unit square $[0, 1]^2$, i.e., $a_1, b_1, a_2, b_2, a_3, b_3 \in [0, 1]$ (or $\mathcal{T} \subseteq [0, 1]^2$), regardless of the configuration of the three vertices.
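To make this behavior concrete, below is a minimal Python sketch (for illustration only, not our actual simulator) of classic RWP over a triangle, together with a grid-histogram estimate of its spatial density; constant nodal speed is assumed, and the step size, travel distance, and run length are placeholder values rather than the settings of Section 4:

```python
import numpy as np

def sample_waypoint(A, B, C, rng):
    """Draw a waypoint uniformly at random inside triangle ABC
    (fold the unit square onto the simplex, then map to the triangle)."""
    r1, r2 = rng.random(), rng.random()
    if r1 + r2 > 1.0:
        r1, r2 = 1.0 - r1, 1.0 - r2
    return A + r1 * (B - A) + r2 * (C - A)

def rwp_density(A, B, C, total_dist=1e4, step=1e-3, m=250, seed=0):
    """Simulate RWP inside triangle ABC and bin the trace on an
    m x m grid over [0, 1]^2, returning a normalized density estimate."""
    rng = np.random.default_rng(seed)
    counts = np.zeros((m, m))
    pos, travelled = sample_waypoint(A, B, C, rng), 0.0
    while travelled < total_dist:
        wp = sample_waypoint(A, B, C, rng)        # next destination
        d = np.linalg.norm(wp - pos)
        for t in np.arange(0.0, d, step):         # walk the straight leg
            p = pos + (wp - pos) * (t / d)
            counts[min(int(p[1] * m), m - 1), min(int(p[0] * m), m - 1)] += 1
        travelled, pos = travelled + d, wp
    return counts / (counts.sum() / m**2)         # each cell has area 1/m^2

A, B, C = np.array([0.0, 0.0]), np.array([0.5, 1.0]), np.array([1.0, 0.0])
g_true = rwp_density(A, B, C)   # short run; Section 4 uses 5e7 travel units
```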
Following the input layer, there are two fully connected hidden layers, each of size $8K$ (recalling that $K \in \mathbb{Z}_{\geq 1}$ denotes how many component distributions are used for mixture). The output layer is of size $4K$ (with $K = 2$ in the example of Figure 1), meaning that each component is determined by four parameters: the mixing coefficient $\eta \in (0, 1]$ and three parameters, $\alpha_1, \alpha_2, \alpha_3 \in \mathbb{R}_{>0}$, for tuning the Dirichlet distribution, formulated as
$$f(u, v, w; \alpha_1, \alpha_2, \alpha_3) = \frac{u^{\alpha_1 - 1}\, v^{\alpha_2 - 1}\, w^{\alpha_3 - 1}}{B(\alpha_1, \alpha_2, \alpha_3)}, \tag{2}$$

where $B(\alpha_1, \alpha_2, \alpha_3) = \frac{\Gamma(\alpha_1)\Gamma(\alpha_2)\Gamma(\alpha_3)}{\Gamma(\alpha_1 + \alpha_2 + \alpha_3)}$ is the (trivariate) beta function for normalization, with $\Gamma(\cdot)$ being the gamma function.
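As a direct transcription of Equation (2), the following minimal sketch evaluates the trivariate Dirichlet density in log-space for numerical stability (inputs are assumed to lie on the open simplex, i.e., $u, v, w > 0$ with $u + v + w = 1$):

```python
import numpy as np
from scipy.special import gammaln  # log-gamma, for a stable beta function

def dirichlet_pdf(u, v, w, a1, a2, a3):
    """Trivariate Dirichlet density of Equation (2); B(a1, a2, a3) is the
    trivariate beta function, computed via log-gamma."""
    log_B = gammaln(a1) + gammaln(a2) + gammaln(a3) - gammaln(a1 + a2 + a3)
    return np.exp((a1 - 1.0) * np.log(u) + (a2 - 1.0) * np.log(v)
                  + (a3 - 1.0) * np.log(w) - log_B)
```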
We use the trivariate Dirichlet distribution because its support is a two-dimensional simplex, i.e., a triangle, thus suiting our movement area in shape. However, it should be noted that the triangular support of the Dirichlet distribution in Equation (2) is defined in the three-dimensional space of $(u, v, w)$, with $u + v + w = 1$. Hence, given a location of the node at $(x, y)$ within the movement area $\mathcal{T}$, its density value cannot be sampled directly from the mixed Dirichlet distributions. To resolve this incompatibility, we resort to the conversion between (two-dimensional) Cartesian and barycentric coordinates [38]. Specifically, given the three vertices of the triangle $\mathcal{T}$, the Cartesian coordinates of the nodal location, $(x, y)$, can be transformed linearly into the barycentric coordinates, $(u, v, w)$, as
$$\begin{aligned} u &= C^{-1} \big[ (x - a_3)(b_2 - b_3) + (a_3 - a_2)(y - b_3) \big], \\ v &= C^{-1} \big[ (x - a_3)(b_3 - b_1) + (a_1 - a_3)(y - b_3) \big], \\ w &= 1 - u - v, \end{aligned} \tag{3}$$

where

$$C = (a_1 - a_3)(b_2 - b_3) + (a_3 - a_2)(b_1 - b_3). \tag{4}$$
Moreover, due to the change of variables from $(x, y)$ to $(u, v, w)$ (or simply $(u, v)$, since $w = 1 - u - v$) in Equation (3), we have to compute the Jacobian accordingly to account for the scaling factor of the distribution after transformation (so that the total probability mass still sums to one) [39]. This Jacobian, denoted by $J_{u,v}(x, y)$, is the determinant of the first-order partial derivatives of $(u, v)$ with respect to $(x, y)$, and is given by
$$J_{u,v}(x, y) = \begin{vmatrix} \partial u / \partial x & \partial u / \partial y \\ \partial v / \partial x & \partial v / \partial y \end{vmatrix} = C^{-2} \cdot \begin{vmatrix} b_2 - b_3 & a_3 - a_2 \\ b_3 - b_1 & a_1 - a_3 \end{vmatrix} = C^{-1}. \tag{5}$$
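Equations (3)-(5) can be combined into a single helper; the sketch below (continuing the Python snippets above) returns the barycentric coordinates along with the absolute Jacobian factor $|C|^{-1}$:

```python
def to_barycentric(x, y, tri):
    """Map Cartesian (x, y) to barycentric (u, v, w) w.r.t. the triangle
    tri = ((a1, b1), (a2, b2), (a3, b3)), per Equations (3) and (4), and
    return the absolute Jacobian scaling factor of Equation (5)."""
    (a1, b1), (a2, b2), (a3, b3) = tri
    C = (a1 - a3) * (b2 - b3) + (a3 - a2) * (b1 - b3)   # Equation (4)
    u = ((x - a3) * (b2 - b3) + (a3 - a2) * (y - b3)) / C
    v = ((x - a3) * (b3 - b1) + (a1 - a3) * (y - b3)) / C
    return u, v, 1.0 - u - v, abs(1.0 / C)              # |J| = |C|^(-1)
```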
Combining the above, we then have the probability density function for the mixture of Dirichlet distributions over the triangular movement area $\mathcal{T}$, i.e.,

$$\hat{g}(x, y; \boldsymbol{\eta}, \boldsymbol{\alpha}_1, \boldsymbol{\alpha}_2, \boldsymbol{\alpha}_3) = \sum_{k=1}^{K} \eta_k\, C^{-1} f_k(x, y; \alpha_{1k}, \alpha_{2k}, \alpha_{3k}), \tag{6}$$

where $(x, y) \in \mathcal{T}$, $\boldsymbol{\eta} = [\eta_k]_{1 \leq k \leq K}$ with $\sum_{k=1}^{K} \eta_k = 1$, $\boldsymbol{\alpha}_i = [\alpha_{ik}]_{1 \leq k \leq K}$ (for $i = 1, 2, 3$), and each $f_k(x, y; \alpha_{1k}, \alpha_{2k}, \alpha_{3k})$ ($k = 1, 2, \ldots, K$) is formed by plugging Equation (3) into the density function of Equation (2).
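For illustration, Equation (6) can be evaluated as in the sketch below, reusing the helpers above (points outside the triangle, where some barycentric coordinate is non-positive, are mapped to zero density for simplicity):

```python
def dmm_density(x, y, tri, etas, alphas):
    """Equation (6): DMM density at Cartesian (x, y) over triangle tri.
    etas: K mixing weights summing to one; alphas: K triples (a1, a2, a3)."""
    u, v, w, jac = to_barycentric(x, y, tri)
    if min(u, v, w) <= 0.0:          # outside the triangle (or on its border)
        return 0.0
    return sum(eta * jac * dirichlet_pdf(u, v, w, *a)
               for eta, a in zip(etas, alphas))

# e.g., a single symmetric component over the triangle used earlier:
# dmm_density(0.5, 0.3, ((0, 0), (0.5, 1), (1, 0)), [1.0], [(2.0, 2.0, 2.0)])
```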
Lastly, for training the MDN learning model of Figure 1, we use the classic metric, the mean squared error (MSE), as the loss function, i.e.,
$$\mathcal{L} = \frac{1}{|\mathcal{S}|} \sum_{(x, y) \in \mathcal{S}} \big( \hat{g}(x, y) - g(x, y) \big)^2, \tag{7}$$

where $\hat{g}(x, y)$ is the density value at location $(x, y)$ estimated by the learning model of Equation (6), and $g(x, y)$ is the corresponding ground-truth density value induced by a simulation that has reached the steady state after a sufficiently long run. For consistency, the set of locations, $\mathcal{S}$, sampled for evaluating the loss function is a grid of $m \times m$ points (with $m = 250$ in this paper) distributed uniformly within the unit square $[0, 1]^2$. The density values outside the triangular movement area $\mathcal{T}$ are all zeros for both $\hat{g}(x, y)$ and $g(x, y)$.
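A minimal sketch of this evaluation procedure follows; the predicted and ground-truth densities are sampled on the same grid, with zeros outside $\mathcal{T}$:

```python
def grid_density(density_fn, m=250):
    """Sample a density function on the m x m uniform grid over [0, 1]^2,
    mirroring the sample set S of Equation (7)."""
    xs = np.linspace(0.0, 1.0, m)
    return np.array([[density_fn(x, y) for x in xs] for y in xs])

def mse_loss(G_hat, G_true):
    """Equation (7): mean squared density error over the grid."""
    return float(np.mean((G_hat - G_true) ** 2))
```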

3.2. Mixture Density Learning for Convex Quadrilaterals

The mixture density learning model designed in Section 3.1 can be further extended for density fitting over convex polygons having more than three sides, e.g., quadrilaterals. The basic idea behind such an extension is to "decompose" the polygon into multiple (possibly overlapping) triangles, such that any point within or on the border of the polygon is covered by at least one triangle, and the density value therein can be approximated by mixing density distributions defined over the triangles individually. Notably, there are various ways to conduct the decomposition.
For instance, as shown in Figure 2a, we can partition any quadrilateral movement area (denoted by $\mathcal{Q}$) into $K = 2$ sub-triangles along an arbitrary diagonal, say $\overline{AC}$, of the quadrilateral. In this case, each point $(x, y) \in \mathcal{Q}$ is covered by one sub-triangle. Further, when more sub-triangles are to be mixed for a finer approximation, e.g., when $K = 4$, we simply stack the additional two sub-triangles along the same diagonal $\overline{AC}$. Alternatively, the quadrilateral can be partitioned such that the added sub-triangles border on the other diagonal $\overline{BD}$, as shown in Figure 2b. If $K = 6$, the diagonal $\overline{AC}$ is used again for further decomposition. Note that the method of Figure 2b, using alternate diagonals for decomposition, is potentially more robust than that of Figure 2a, since the former induces sub-triangles that are more diverse in shape. Other methods to decompose the quadrilateral include making the four sub-triangles share a common vertex at the intersection point of the two diagonals, or even generating triangles with random vertices until all points (and only those) of the quadrilateral are covered. In this paper, as a first step, we only consider the type of decomposition in Figure 2b, leaving the testing of other, more sophisticated methods for future work. Furthermore, the type of polygon examined for density fitting is limited to $n \leq 4$ (i.e., triangles and quadrilaterals), which suffices to showcase how the density distribution over n-gons with $n > 3$ can be approximated by assembling the components of Dirichlet mixtures over triangles.
For completeness, the mixture learning model for irregular (convex) quadrilaterals is shown in Figure 3, supposing the quadrilateral is decomposed into $K = 2$ sub-triangles. The probability density function for quadrilaterals (and any other convex polygons) can still be obtained based on Equation (6). The main difference lies in that the input feature vector now becomes an eight-vector, $(a_1, b_1, a_2, b_2, a_3, b_3, a_4, b_4)$ (assuming $a_1, b_1, \ldots, a_4, b_4 \in [0, 1]$), corresponding to the (Cartesian) coordinates of the four vertices of the quadrilateral movement area. Furthermore, during the conversion between Cartesian and barycentric coordinates, the three vertices assumed for Equations (3)-(5) may vary each time the equations are evaluated, i.e., alternating between two sub-triangles: $\mathcal{T}_1$, with the vertices $(a_1, b_1)$, $(a_2, b_2)$, and $(a_4, b_4)$, and $\mathcal{T}_2$, whose three vertices are $(a_2, b_2)$, $(a_3, b_3)$, and $(a_4, b_4)$. The similarity between Figure 1 and Figure 3 shows that it takes little effort to adapt the framework of our MDN learning model for triangles to accommodate quadrilaterals (or other convex polygons).
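As an illustrative sketch of how Equation (6) extends to a quadrilateral decomposed into the two sub-triangles $\mathcal{T}_1$ and $\mathcal{T}_2$ above, each mixture component can be attached to one sub-triangle and contribute only there; assigning even-indexed components to $\mathcal{T}_1$ and odd-indexed ones to $\mathcal{T}_2$ is merely a convention we adopt here for concreteness:

```python
def quad_dmm_density(x, y, quad, etas, alphas):
    """DMM density over convex quadrilateral quad = (P1, P2, P3, P4),
    decomposed along the diagonal P2-P4 into T1 = (P1, P2, P4) and
    T2 = (P2, P3, P4). Each component integrates to one over its own
    sub-triangle, so the mixture stays normalized when sum(etas) == 1."""
    P1, P2, P3, P4 = quad
    tris = ((P1, P2, P4), (P2, P3, P4))
    total = 0.0
    for k, (eta, a) in enumerate(zip(etas, alphas)):
        u, v, w, jac = to_barycentric(x, y, tris[k % 2])
        if min(u, v, w) > 0.0:       # contributes only on its sub-triangle
            total += eta * jac * dirichlet_pdf(u, v, w, *a)
    return total
```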

4. Experimental Evaluation

4.1. Overall Settings

To demonstrate the efficacy and efficiency of the Dirichlet-based mixture models (DMMs) we have proposed for triangles (Figure 1) and convex quadrilaterals (Figure 3), we compare them against the classic Gaussian mixture model (GMM), which (together with its variants) is arguably the most commonly used MDN in the literature [14,40]. Note that the benchmark model for comparison in this paper is limited to the GMM in its basic version, which assumes an infinite support without truncation of its component Gaussian distributions, since it takes non-trivial effort to truncate the GMM to fit an irregularly shaped support, be it triangular or quadrilateral. Also, note that comparison against the basic GMM, without conditioning it to fit the irregular areas, should not undermine the validity and efficacy of DMM; that is, the advantages of DMM (see the results in Section 4.2 and Section 4.3) are supposed to come mainly from its component Dirichlet distributions, which are inherently more tailored than Gaussians to the problem concerned, while the techniques adopted (e.g., conversion between coordinate systems) are only convenient auxiliaries to let DMM (already having a triangular finite support) interface with the irregular movement areas that are also (composed of) triangles. Such convenience of conditioning distributions to strictly fit irregular supports appears tough to achieve for GMM, due to the "inconvenience" of its generic infinite support. Moreover, we resist the temptation to include any other finitely supported (e.g., beta-type [41]) models for a seemingly more comprehensive evaluation, considering the fact that, as far as we are aware, no mixture models (other than the DMMs proposed) or probability distributions are available if the support is allowed to be an arbitrary polygon in shape.
For mobility modeling, the GMM employs the bivariate Gaussian distribution as its kernel and is constructed following the same framework in Figure 1, with the modification that the output layer now emits $6K$ parameters; that is, each Gaussian component is determined by six parameters, including $\eta$, $(\mu_x, \mu_y) \in \mathcal{T}$, $\sigma_x, \sigma_y \in \mathbb{R}_{>0}$, and $\rho \in (-1, 1)$, corresponding to the mixing coefficient, means, standard deviations, and the correlation coefficient, respectively. The loss function for training is also the MSE as defined in Equation (7), while no Jacobian needs to be computed for GMMs.
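For reference, the standard bivariate Gaussian kernel evaluated by each GMM component is sketched below; unlike DMM, nothing confines its mass to the movement area:

```python
def gaussian2d_pdf(x, y, mu_x, mu_y, sx, sy, rho):
    """Standard bivariate Gaussian density with means (mu_x, mu_y),
    standard deviations (sx, sy), and correlation rho in (-1, 1)."""
    zx, zy = (x - mu_x) / sx, (y - mu_y) / sy
    q = (zx * zx - 2.0 * rho * zx * zy + zy * zy) / (1.0 - rho * rho)
    return np.exp(-0.5 * q) / (2.0 * np.pi * sx * sy * np.sqrt(1.0 - rho * rho))
```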
For training the mixture models, we use an "atypically" sparse sample set of examples (the justifications for which are elaborated later). Specifically, for models over triangular movement areas, we use three examples, each having a different value for the feature $a_1$, i.e., $a_1 = 0, 0.3, 0.6$, while the other five features have their values fixed to $b_1 = 0$, $a_2 = 0.5$, $b_2 = 1$, $a_3 = 1$, and $b_3 = 0$. By varying $a_1$ only, the MDNs can be informed of the different types of triangle (from acute to obtuse) for density fitting. Likewise, for training models over quadrilaterals, the number of examples is also set to three with $a_1$ being the variable, i.e., $a_1 = 0, 0.3, 0.6$, while the values of the other seven features remain constant at $b_1 = 0.2$, $a_2 = 0.6$, $b_2 = 1$, $a_3 = 1$, $b_3 = 1$, $a_4 = 1$, and $b_4 = 0$, so as to exemplify a spectrum of irregularity. Note that variation in a single variable ($a_1$) is enough to induce global changes of the density distribution across the whole movement area, due to the "border effect" of RWP [7]. For testing, the sample set resembles that for training, except that $a_1$ takes twelve new values, i.e., $0.05, 0.1, 0.15, 0.2, \ldots, 0.7$ (excluding 0.3 and 0.6), for both the triangle and quadrilateral cases. All the input feature values of the training/testing examples used here are summarized in Table 1. The target density values come from simulations that have reached the steady state, which corresponds to the moment when the node has traveled a distance of over $5 \times 10^7$ units within the movement area (which is encompassed by the unit square $[0, 1]^2$).
The motivations behind the limited training set and the large ratio (i.e., 4:1) of testing to training examples stem from the following two considerations. On the one hand, a main point of this work is to adopt application-specific probability distributions for mixture such that the mixture model can learn to fit the target density in an efficient but precise manner, without the unnecessary overhead of gathering and circulating large datasets. By selecting a kernel function with high fitness (e.g., the Dirichlet distribution), the mixture model can quickly grasp the key patterns of the spatial density distribution of RWP after observing only a few related examples. On the other hand, the simulation time of mobility models (e.g., RWP) is often lengthy; for instance, using the clang 16.0.0 compiler with -O3 optimization on the Apple M4 Pro chip, it takes about 2 h and 25 min for our C++ simulator to complete a single full run of RWP over the triangular area of $(a_1 = 0, b_1 = 0, a_2 = 0.5, b_2 = 1, a_3 = 1, b_3 = 0)$. In view of this cost, it is desirable in practice to minimize the simulation runs taken for generating training examples.
As will be shown in Section 4.2 and Section 4.3, the learning model can leverage observations from the three selected examples to generate well-approximated probability density functions for many more unseen examples, greatly cutting the costs of training. Note that the variation with respect to $a_1 = 0, 0.3, 0.6$ is only a convenient setting to showcase the efficiency of DMM regarding generalization. The results of training on a different set of triangles with $a_2 = 0, 0.4, 0.8$ while $(a_1 = 0, b_1 = 0, b_2 = 1, a_3 = 1, b_3 = 0)$ can also be found in Appendix A.1, giving more evidence that the high generalizability of the mixture models is an inherent feature.
Other settings pertaining to the experiments are listed as follows. For density fitting over triangles, both DMM and GMM are trained 100 times in total, with $K = 1, 2, \ldots, 10$, each involving ten seeded runs. For quadrilaterals, while it is still ten runs per $K$, we set $K = 2, 4, 6, 8, 10$ to examine the effects when the quadrilateral is decomposed (producing two sub-triangles each time) with different frequencies under DMM. The maximum number of epochs per run is set to 20,000 for all; however, to prevent overfitting, the model version reaching the minimum loss (Equation (7)) on the testing examples is used. In each epoch, all three examples are fed as a single batch to the learning model. Additionally, the Adam (Adaptive Moment Estimation) optimizer is used for updating the weights of both DMM and GMM, with the learning rates set to $5 \times 10^{-4}$ and $2 \times 10^{-5}$ for triangles and quadrilaterals, respectively.
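A minimal PyTorch sketch of the DMM network and optimizer under these settings is given below; the layer sizes and learning rate follow the text, while the hidden activations (tanh) and the output transforms enforcing $\sum_k \eta_k = 1$ and $\alpha_i > 0$ (softmax and softplus) are our assumptions, as is the omitted training loop that evaluates Equation (6) on the grid and backpropagates the loss of Equation (7):

```python
import torch
import torch.nn as nn

class TriangleDMM(nn.Module):
    """MDN of Figure 1: 6 input features (triangle vertices), two hidden
    layers of width 8K, and 4K outputs (eta and three alphas per component)."""
    def __init__(self, K):
        super().__init__()
        self.K = K
        self.body = nn.Sequential(
            nn.Linear(6, 8 * K), nn.Tanh(),
            nn.Linear(8 * K, 8 * K), nn.Tanh(),
            nn.Linear(8 * K, 4 * K))

    def forward(self, feats):                      # feats: (batch, 6)
        out = self.body(feats).view(-1, self.K, 4)
        eta = torch.softmax(out[..., 0], dim=-1)   # mixing weights sum to one
        alpha = nn.functional.softplus(out[..., 1:]) + 1e-6  # alphas > 0
        return eta, alpha

model = TriangleDMM(K=4)
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)  # triangle rate
```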
The specifics of the experimental settings pertaining to RWP simulation (for generating training examples) and the model training/inference, including the platform used for all the experiments, are summarized in Table 2.

4.2. Evaluation Results for Triangles

The training and testing results of DMM versus GMM in fitting the spatial density of RWP over triangular movement areas, using different numbers of components for mixture, are given in Figure 4. The KL (Kullback–Leibler) divergence, $D_{\mathrm{KL}}\big(g(x, y) \,\|\, \hat{g}(x, y)\big)$, is introduced besides MSE, considering that it can better evaluate how well the true density distributions are captured in shape. Firstly, in Figure 4a, it can be seen that the value of MSE tends downwards as $K$ increases, as expected, for both DMM and GMM. In particular, there is a noticeable drop at $K = 4$ for DMM, and for $K \geq 4$, the MSE can fall below $10^{-3}$ or even $10^{-4}$. In contrast, the MSE of GMM remains relatively high, above $10^{-1}$ for $K \leq 2$ and always above $10^{-2}$ for $3 \leq K \leq 10$. Overall, the training performance of DMM (in terms of MSE) is orders of magnitude better than that of GMM, for all $K = 1, 2, \ldots, 10$.
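For clarity, the discrete KL estimate referenced above can be computed on the evaluation grid as in the minimal sketch below (densities are converted to cell masses using the cell area $1/m^2$, and the sum runs over cells where the true density is positive; the exact estimator may be implemented with different numerical guards):

```python
def kl_divergence(G_true, G_hat, m=250, eps=1e-12):
    """Discrete D_KL(g || g_hat) over the m x m grid of Equation (7)."""
    p = G_true / (m * m)             # true cell masses
    q = G_hat / (m * m) + eps        # predicted cell masses (guarded)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))
```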
This marked superiority of DMM over GMM is believed to come mainly from two facts. On the one hand, the GMM is infinitely supported, which inevitably causes non-zero density outside the movement area $\mathcal{T}$, while the support of DMM is finite and matches $\mathcal{T}$ perfectly in shape, such that no density is leaked. On the other hand, although DMM has fewer parameters to learn than GMM (i.e., $4K$ versus $6K$), the former is more flexible in fitting the density distribution of RWP within $\mathcal{T}$. These points are confirmed by the contour plots in Figure 5, which show that DMM indeed respects the border of $\mathcal{T}$ and can well approximate the true density for both acute and obtuse triangles, while GMM evidently underperforms. The advantage of DMM persists under the metric of KL divergence, as shown in Figure 4b. A slight difference from MSE is that the KL divergence of DMM stays around a relatively low level with little decrease for $K \geq 4$, meaning that a small number (e.g., four) of Dirichlet components is sufficient to capture the spatial density of RWP in shape over triangles.
Furthermore, the results of Figure 4c,d follow patterns similar to those of Figure 4a,b, showing that DMM retains its superiority over GMM in prediction for unseen examples. Meanwhile, the closeness between the training and testing results provides evidence that a small sample set for training can supply sufficient clues for the mixture learning model to generate the density distributions for considerably more shapes. It is reasonable to believe that the spatial density over any triangle with $a_1 \in [0, 0.7]$ and $(b_1 = 0, a_2 = 0.5, b_2 = 1, a_3 = 1, b_3 = 0)$ can be well approximated by a DMM trained with the three examples of $a_1 = 0, 0.3, 0.6$. This high efficiency of generalization is due to the fact that the DMM need not learn RWP's distribution "from scratch" but only needs to optimize the configuration of a few parameters (i.e., $\eta$, $\alpha_1$, $\alpha_2$, and $\alpha_3$) for each component Dirichlet distribution, whose formulation already provides a suitable basis for fitting the target distribution over triangles. Additionally, we have intentionally restricted the value of $a_1$ to be no greater than 0.7, to avoid possible "contamination" from the results of triangular areas that are excessively obtuse in the sense that $a_1 \geq 0.8$. While the DMM proposed is still able to fit such obtuse triangles, a larger number of component distributions would be needed to retain the same accuracy of approximation. For the results of especially obtuse triangles with $a_1 \geq 0.8$, see Appendix A.2.
To illustrate the component distributions mixed for $K > 1$, the DMM with $K = 3$ has its mixture dissected into separate contour plots in Figure 6, assuming $a_1 = 0.6$. It can be seen that optimization of the DMM tends to draw the three component Dirichlet distributions close to the three corners of the triangle. This observation is assumed to be due to the relatively greater subtlety, and thus greater difficulty, of approximation around the corners, which is consistent with the findings in [1].

4.3. Evaluation Results for Convex Quadrilaterals

As in the case of triangular movement areas, the training and testing performance of the mixture models in fitting RWP over (convex) quadrilaterals with $K = 2, 4, 6, 8, 10$ is summarized in Figure 7. At first glance, the results of Figure 7 repeat the key finding that DMM consistently outperforms GMM in accuracy of approximation (to different extents), in terms of both MSE and KL divergence and for both the training and testing processes. Nonetheless, there are a few differences from Figure 4 that are worth mentioning. Firstly, it can be seen that the performance gap between DMM and GMM for fitting RWP over quadrilaterals is narrower than in the case of triangles shown in Figure 4, with GMM improving its MSE and KL divergence for all $K = 2, 4, 6, 8, 10$, while DMM shifts in the opposite direction. When $K = 2$, for instance, the gap is minimal. We believe these new observations are caused by two factors. On the one hand, GMM faces less of a challenge when fitting density over a quadrilateral, whose corners tend to be less pointy on average than those of a triangle, due to more vertices and thus more (and wider) interior angles (e.g., imagine a square versus an equilateral triangle). On the other hand, because of the combinatorial complication of composing multiple component Dirichlet distributions supported over different sub-triangles, it can be harder for DMM to optimize its parameter configurations for quadrilaterals.
It is worth noting that both metrics, i.e., MSE and KL divergence, should be regarded as conservative estimates for GMM. For MSE, because it is evaluated only over the sample points of $\mathcal{S}$ within the unit square $[0, 1]^2$, the density leaking outside the unit square is not taken into account for GMM, even though such leakage is inevitable and at times substantial. A similar underestimation issue applies to the KL divergence of GMM, since the true density values outside the quadrilateral (or triangular) movement area are zeros, and thus the discrepancy in density distributions outside is omitted for GMM. For DMM, which strictly obeys the borders of any movement area, all the metric values obtained are legitimate. Despite the closer performance between DMM and GMM, the former is still superior to the latter by a considerable margin: except for the relatively high MSE values of DMM when $K = 2$, the minimum metric value (be it MSE or KL divergence) achieved by GMM over all $K = 2, 4, 6, 8, 10$ is always higher than the maximum achieved by DMM, let alone the larger number of parameters to learn for GMM (i.e., $6K$ versus $4K$ of DMM).
Another insight given by the results of Figure 7 is that four component Dirichlet distributions (as in Figure 2b) seem to correspond to the "sweet spot" for DMM-based density approximation over quadrilaterals. That is, the setting of $K = 4$ can induce an appreciable gain in accuracy compared to $K = 2$, without incurring the extra complexity of combining more component distributions for mixture; indeed, as Figure 7 shows, the performance of DMM tends to become more unstable for $K > 4$, according to the 95% confidence intervals. Comparing this with the results of Figure 4, where the leap in accuracy also occurs at $K = 4$, we would recommend starting from $K = 4$ when training DMM in general cases.
Lastly, in comparison to the ground-truth density distributions obtained from simulation, the isolines of the approximate distributions from mixing four Gaussians and four Dirichlets, respectively, for the two examples with $a_1 = 0$ and $a_1 = 0.6$, are plotted in Figure 8. The contours in Figure 8b,e show that the proposed method of decomposing the quadrilateral into sub-triangles can indeed capture the shapes of the target distributions well (Figure 8a,d), despite some small imperfections along the two diagonal lines reflecting the composition of the Dirichlet components. The four Dirichlet distributions mixed for $a_1 = 0.6$ are illustrated in Figure 9. In contrast, the distributions produced by GMM (Figure 8c,f) not only leak outside the quadrilateral areas (and the unit square) but also exhibit a "wobbly" pattern that is rather distant from the true distributions in shape (as confirmed by the relatively high values of KL divergence and MSE in Figure 7).

5. Additional Discussion

The main results of this work have been provided in Section 4 above. Yet, we are aware that there may exist other relevant concerns that are also worth discussing. For instance, although DMM is able to achieve high accuracy of density approximation at low training cost, it is still possible for DMM to produce results with relatively large errors if the movement area is overly squeezed/narrow (see Appendix A.2). The errors would also rise in our experiments if the set of training examples is too sparse (e.g., fewer than three examples) or if the examples resemble each other too closely. Another evident cause is the use of a small $K < 4$ (e.g., $K = 1$), possibly compounded by the aforementioned conditions or insufficient training epochs.
For comprehensiveness, we provide additional results and discussion in the form of Appendix B (besides Appendix A mentioned in Section 4), as follows:
  • Appendix B.1 gives the analyses of time and memory overheads consumed for training the two mixture models for triangular and quadrilateral areas, respectively.
  • Appendix B.2 gives a brief discussion of the implications when non-convex polygons are considered for the movement area.

6. Conclusions

In this paper, we demonstrate that the spatial density function of RWP over an irregular triangular or (convex) quadrilateral area can be formulated and predicted in an approximate but accurate and efficient manner by a mixture model using the Dirichlet distribution as its kernel. Compared to GMM, the proposed DMM is consistently better in terms of both MSE and KL divergence, owing to the high fitness of the Dirichlet component to both the shape of the movement area's border and the density values of the nodal distribution within. Such efficacy of DMM can be attained by training with a small number of examples, which are sufficient for the model to generalize from and learn good predictions. Considering the balance between training costs and approximation accuracy, we suggest $K = 4$ as a sensible starting point for configuring and training DMM in both the triangular and quadrilateral cases.
Besides further testing of more triangles and quadrilaterals, the next step is to explore density learning for polygons with more sides, i.e., n-gons with $n \geq 5$, whose decomposition into sub-triangles is supposed to be more complex. Another challenging problem is the scenario where the polygonal movement areas are compounded by more sophisticated mobility, e.g., non-uniform waypoints over a polygon.

Author Contributions

Conceptualization, W.G.; methodology, W.G. and Y.F.; software, Y.F.; validation, Y.F. and W.G.; formal analysis, W.G. and Y.F.; investigation, Y.F.; resources, L.Z., M.Q. and N.L.; data curation, Y.F., W.G. and Q.Z.; writing—original draft preparation, W.G. and Y.F.; writing—review and editing, W.G., M.Q. and N.L.; visualization, Y.F. and Q.Z.; supervision, W.G. and L.Z.; project administration, W.G. and L.Z.; funding acquisition, L.Z. All authors have read and agreed to the published version of the manuscript.

Funding

The research was supported by the Hainan Provincial Joint Project of Li’an International Education Innovation Pilot Zone, Grant No: 624LALH010.

Data Availability Statement

The datasets presented in this article are not readily available because the data are part of an ongoing study.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
CNF: Conditional Normalizing Flow
DMM: Dirichlet Mixture Model
EM: Expectation Maximization
FANET: Flying Ad hoc NETwork
GMDN: Graph Mixture Density Network
GMM: Gaussian Mixture Model
KL: Kullback–Leibler
MANET: Mobile Ad hoc NETwork
MDN: Mixture Density Network
MSE: Mean Squared Error
PSMM: Periodic and Social Mobility Model
RWP: Random WayPoint
UAV: Unmanned Aerial Vehicle

Appendix A

Appendix A.1

To demonstrate the efficacy of DMM in predicting the density of RWP over further triangles, still based on a small training sample set, we conduct another experiment and summarize the corresponding results in Figure A1. The examples used for training now vary with respect to $a_2$, i.e., $a_2 = 0, 0.4, 0.8$, while $(a_1 = 0, b_1 = 0, b_2 = 1, a_3 = 1, b_3 = 0)$ are held constant. For testing, $a_2 = 0.05, 0.1, 0.15, 0.2, \ldots, 0.75$ (excluding 0.4), while the other five features take the same constant values as for training. Thus, the ratio of testing to training examples is 14:3.
As Figure A1 shows, the accuracy and generalizability of DMM remain high and superior to GMM, as in Figure 4. In fact, the performance of DMM here appears even better than that of Figure 4, likely owing to the easier learning task in this case (i.e., no obtuse triangles in the training/testing examples). This attests, again, to the practical feasibility of training the mixture model at low costs of simulation and data collection before deploying it for predictions on various unseen examples.
Figure A1. Training and testing performance comparisons (in terms of MSE or KL divergence, with 95% confidence intervals) between DMM and GMM for triangles with $K = 1, 2, \ldots, 10$: (a) training, MSE; (b) training, KL divergence; (c) testing, MSE; (d) testing, KL divergence. The examples used include $a_2 = 0, 0.4, 0.8$ for training and $a_2 = 0.05, 0.1, 0.15, 0.2, \ldots, 0.75$ (excluding 0.4) for testing, while $a_1 = 0$, $b_1 = 0$, $b_2 = 1$, $a_3 = 1$, and $b_3 = 0$ are held constant.

Appendix A.2

It can be more difficult for the mixture models DMM and GMM to fit density over excessively obtuse triangles, because of their extremely small angles (other than the obtuse one). With $a_1 \geq 0.8$ (while $b_1 = 0$, $a_2 = 0.5$, $b_2 = 1$, $a_3 = 1$, and $b_3 = 0$), for instance, more component distributions are needed to approximate with similarly high accuracy. Figure A2 shows the training results of DMM versus GMM, assuming two examples are used for training, i.e., $a_1 = 0.8, 0.9$. Since our main purpose here is only to see how well the mixture models can fit the density distributions in shape, we have not included testing results, which are supposedly similar to those of Figure 4 and Figure A1 in generalizability.
From Figure A2, it can be seen that both DMM and GMM have their performance (in terms of MSE and KL divergence) degraded to a certain extent in the case of obtuse triangles, although the advantage of DMM is retained. Compared to Figure 4 and Figure A1, the degradation incurred by the models is, in general, an order of magnitude or more. Also, GMM appears to become more unstable (in terms of the 95% confidence interval), which is believed to result from the further-squeezed acute angles raising the challenge of approximation. For DMM, this instability issue does not occur despite the lower performance.
Figure A2. Training performance comparisons (in terms of MSE or KL divergence, with 95% confidence intervals) between DMM and GMM for overly obtuse triangles with $K = 1, 2, \ldots, 10$: (a) training, MSE; (b) training, KL divergence. The examples used for training include $a_1 = 0.8, 0.9$, while $b_1 = 0$, $a_2 = 0.5$, $b_2 = 1$, $a_3 = 1$, and $b_3 = 0$ are held constant.

Appendix B

Appendix B.1

Here, we compare DMM and GMM regarding their training time and space complexity for the training examples assumed in Section 4.1. For fairness of comparison, we set an identical "learning target" for DMM and GMM (given the same value of $K$); that is, for each value of $K$, we evaluate the number of epochs required by DMM and GMM, respectively, to achieve the MSE value corresponding to the minimum loss GMM can reach on the testing examples during training. Such pairwise training is run ten times per $K$. As shown in Figure A3a, which covers the case of triangles, it always takes a notably shorter time (in terms of the number of epochs) for DMM to achieve the best generalization/testing loss of GMM for any $K = 1, 2, \ldots, 10$, with DMM consistently remaining below 600 epochs, while GMM has a wider range of variation, from above 10,000 epochs (for $K \leq 2$) to around 2000 epochs as $K$ is increased to 10. This advantage of DMM is supposed to come from its higher fitness and generalizability for the density fitting problem over triangles. For quadrilaterals, the efficiency of DMM in training time remains, as shown in Figure A3c, although the gaps become narrower than those of Figure A3a, still in accordance with the performance gaps between the mixture models (see Figure 7).
Moreover, it is worth noting that DMM can further improve its generalization loss as the training process continues, while GMM quickly hits its capacity, especially when a larger number of components are available for mixture. This difference is exemplified by the plots of Figure A3b, which illustrate the training and generalization losses (in terms of MSE) of the models over time, sampling one training instance with $K = 10$ for the triangular case. In this instance, the generalization loss of GMM reaches its lowest around the 1000th epoch, far sooner than that of DMM, which keeps decreasing in line with the training loss. In other words, DMM is able to learn to effectively utilize more component distributions for finer approximations, while GMM is subject to early overfitting because of its bottleneck in fitness and generalizability.
Lastly, we also present a simple comparison of the space complexity between the two mixture models with respect to their neural network structures. Note that our MDN-based learning models are relatively small-scale and similar between DMM and GMM; that is, both models have an input layer of 6 neurons for triangles (8 for quadrilaterals), followed by two hidden layers, each of size $8K$. The only difference lies in the output layer (see Figure A3d), which is of size $4K$ for DMM (e.g., 40 parameters to learn for the Dirichlet mixture when $K = 10$) and $6K$ for GMM (e.g., 60 parameters when $K = 10$). Hence, both mixture models are manageable regarding their space complexity, with DMM being more so than GMM (by a margin of $2K$ parameters at the output layer).
Figure A3. Comparisons of the training complexity between DMM and GMM for triangles and quadrilaterals, in terms of (a) the training time for triangles, (b) the training and generalization losses over time, (c) the training time for quadrilaterals, and (d) the total number of parameters to learn at the output layer. The examples used include $a_1 = 0, 0.3, 0.6$ for training and $a_1 = 0.05, 0.1, 0.15, 0.2, \ldots, 0.7$ (excluding 0.3 and 0.6) for testing, while $b_1 = 0$, $a_2 = 0.5$, $b_2 = 1$, $a_3 = 1$, and $b_3 = 0$ are held constant.

Appendix B.2

When the polygonal movement area is non-convex, it is possible that the node cannot reach its next waypoint by simply moving straight towards it, in which case we have to define how the node detours around the concave corner of the area to continue its movement. However, to our understanding, there is no commonly assumed definition for such detour behavior, which would instead lend itself to adaptation to specific scenarios. Considering that the spatial density distribution of the node is largely determined by its movement behavior, we argue that a full investigation of the RWP varieties over non-convex areas is non-trivial, and hence we prefer to defer such studies to the future. Still, we give an example of a non-convex area and identify several RWP variants that appear noteworthy for such areas.
Figure A4 shows a non-convex quadrilateral area, within which the node may follow three different patterns to move between its waypoints. In Figure A4a, as in the Swiss flag example assumed in [13], the node stops at the concave corner as the "transit" towards its intended waypoint/destination. Yet, as shown in Figure A4b, it is also logical to assume a randomly generated transit point, as long as the path segments between the waypoints and the transit are all straight. More elaborately still, the node may stop at two (or more) transit points along its way, as in Figure A4c; note that the node is not obligated to always travel along the shortest (Euclidean) path to its waypoints. While our decomposition method should remain valid for non-convex quadrilaterals, the potentially rich patterns of the nodal density distribution (arising from different detour behaviors) can substantially complicate the density learning problem, especially for non-convex polygons with more than four sides.
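To make the detour condition concrete, the following minimal sketch (assuming the third-party shapely library; the polygon coordinates, waypoints, and transit choice are hypothetical) tests whether the straight segment to the next waypoint stays inside the area, which is the test that never fails over convex areas:

```python
from shapely.geometry import LineString, Polygon

# Hypothetical non-convex quadrilateral: the third vertex is a reflex corner.
area = Polygon([(0, 0), (1, 0), (0.5, 0.4), (0, 1)])

p, q = (0.9, 0.05), (0.05, 0.9)   # current position and next waypoint
if area.covers(LineString([p, q])):
    path = [p, q]                 # straight movement suffices (always so if convex)
else:
    # A detour is needed; the transit rule distinguishes the variants in
    # Figure A4 (concave corner vs. one or more random transit points).
    transit = (0.5, 0.4)          # e.g., variant (a): stop at the reflex corner
    path = [p, transit, q]
print(path)
```

In this example, the segment from p to q crosses the notch near the reflex corner, so the node follows variant (a) and transits at that corner.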
Figure A4. Examples of three possible RWP variants over a non-convex quadrilateral movement area, assuming (a) a single transit point at the concave corner, (b) a single random transit point that allows straight path segments, and (c) two random transit points that allow straight path segments, respectively.

References

  1. Hyytia, E.; Lassila, P.; Virtamo, J. Spatial node distribution of the random waypoint mobility model with applications. IEEE Trans. Mob. Comput. 2006, 5, 680–694. [Google Scholar] [CrossRef]
  2. Gao, W.; Nikolaidis, I.; Harms, J.J. On the interaction of charging-aware mobility and wireless capacity. IEEE Trans. Mob. Comput. 2020, 19, 654–663. [Google Scholar] [CrossRef]
  3. Gao, W.; Nikolaidis, I.; Harms, J.J. On the impact of recharging behavior on mobility. IEEE Trans. Mob. Comput. 2023, 22, 4103–4118. [Google Scholar] [CrossRef]
  4. Chriki, A.; Touati, H.; Snoussi, H.; Kamoun, F. FANET: Communication, mobility models and security issues. Comput. Netw. 2019, 163, 106877. [Google Scholar] [CrossRef]
  5. Johnson, D.B.; Maltz, D.A. Dynamic source routing in ad hoc wireless networks. In Mobile Computing; Springer: Boston, MA, USA, 1996; pp. 153–181. [Google Scholar]
  6. Le Boudec, J.Y.; Vojnovic, M. Perfect simulation and stationarity of a class of mobility models. In Proceedings of the IEEE 24th Annual Joint Conference of the IEEE Computer and Communications Societies, Miami, FL, USA, 13–17 March 2005; Volume 4, pp. 2743–2754. [Google Scholar]
  7. Bettstetter, C.; Resta, G.; Santi, P. The node distribution of the random waypoint mobility model for wireless ad hoc networks. IEEE Trans. Mob. Comput. 2003, 2, 257–269. [Google Scholar] [CrossRef]
  8. Camp, T.; Boleng, J.; Davies, V. A survey of mobility models for ad hoc network research. Wirel. Commun. Mob. Comput. 2002, 2, 483–502. [Google Scholar] [CrossRef]
  9. Blough, D.M.; Resta, G.; Santi, P. A statistical analysis of the long-run node spatial distribution in mobile ad hoc networks. In Proceedings of the 5th ACM International Workshop on Modeling Analysis and Simulation of Wireless and Mobile Systems, Atlanta, GA, USA, 28 September 2002; pp. 30–37. [Google Scholar]
  10. Nain, P.; Towsley, D.; Liu, B.; Liu, Z. Properties of random direction models. In Proceedings of the IEEE 24th Annual Joint Conference of the IEEE Computer and Communications Societies, Miami, FL, USA, 13–17 March 2005; IEEE: Piscataway, NJ, USA, 2005; Volume 3, pp. 1897–1907. [Google Scholar]
  11. Soltani, M.D.; Purwita, A.A.; Zeng, Z.; Chen, C.; Haas, H.; Safari, M. An orientation-based random waypoint model for user mobility in wireless networks. In Proceedings of the 2020 IEEE International Conference on Communications Workshops (ICC Workshops), Dublin, Ireland, 7–11 June 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–6. [Google Scholar]
  12. Ammar, H.A.; Adve, R.; Shahbazpanahi, S.; Boudreau, G.; Srinivas, K.V. RWP+: A new random waypoint model for high-speed mobility. IEEE Commun. Lett. 2021, 25, 3748–3752. [Google Scholar] [CrossRef]
  13. Le Boudec, J.Y.; Vojnovic, M. The random trip model: Stability, stationary regime, and perfect simulation. IEEE/ACM Trans. Netw. 2006, 14, 1153–1166. [Google Scholar] [CrossRef]
  14. Bishop, C.M. Mixture Density Networks; Technical Report; Aston University: Birmingham, UK, 1994. [Google Scholar]
  15. Ng, K.W.; Tian, G.L.; Tang, M.L. Dirichlet and Related Distributions: Theory, Methods and Applications; John Wiley & Sons: Hoboken, NJ, USA, 2011. [Google Scholar]
  16. Pal, S.; Heumann, C. Revisiting Dirichlet mixture model: Unraveling deeper insights and practical applications. Stat. Pap. 2025, 66, 2. [Google Scholar] [CrossRef]
  17. Gao, W.; Nikolaidis, I.; Harms, J. Beyond normal: Learning spatial density models of node mobility. arXiv 2024, arXiv:2411.10997. [Google Scholar]
  18. Navidi, W.; Camp, T. Stationary distributions for the random waypoint mobility model. IEEE Trans. Mob. Comput. 2004, 3, 99–108. [Google Scholar] [CrossRef]
  19. Le Boudec, J.Y. Understanding the simulation of mobility models with palm calculus. Perform. Eval. 2007, 64, 126–147. [Google Scholar] [CrossRef]
  20. Hyytia, E.; Lassila, P.; Virtamo, J. A markovian waypoint mobility model with application to hotspot modeling. Proc. IEEE Int. Conf. Commun. 2006, 3, 979–986. [Google Scholar]
  21. Cho, E.; Myers, S.A.; Leskovec, J. Friendship and mobility: User movement in location-based social networks. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, CA, USA, 21–24 August 2011; pp. 1082–1090. [Google Scholar]
  22. Lichman, M.; Smyth, P. Modeling human location data with mixtures of kernel densities. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 24–27 August 2014; pp. 35–44. [Google Scholar]
  23. Zhou, Z.; Wang, J.; Li, Y.H.; Huang, Y.K. Query-centric trajectory prediction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 17863–17873. [Google Scholar]
  24. Makansi, O.; Ilg, E.; Cicek, O.; Brox, T. Overcoming limitations of mixture density networks: A sampling and fitting framework for multimodal future prediction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 7144–7153. [Google Scholar]
  25. Li, X.; Normandin-Taillon, H.; Wang, C.; Huang, X. Demand Density Forecasting in Mobility-on-Demand Systems Through Recurrent Mixture Density Networks. In Proceedings of the 2023 IEEE 26th International Conference on Intelligent Transportation Systems (ITSC), Bilbao, Spain, 24–28 September 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 250–255. [Google Scholar]
  26. Errica, F.; Bacciu, D.; Micheli, A. Graph mixture density networks. In Proceedings of the International Conference on Machine Learning. PMLR, Online, 18–24 July 2021; pp. 3025–3035. [Google Scholar]
  27. Corso, G.; Stark, H.; Jegelka, S.; Jaakkola, T.; Barzilay, R. Graph neural networks. Nat. Rev. Methods Prim. 2024, 4, 17. [Google Scholar] [CrossRef]
  28. Winkler, C.; Worrall, D.; Hoogeboom, E.; Welling, M. Learning likelihoods with conditional normalizing flows. arXiv 2019, arXiv:1912.00042. [Google Scholar]
  29. Gammelli, D.; Rodrigues, F. Recurrent flow networks: A recurrent latent variable model for density estimation of urban mobility. Pattern Recognit. 2022, 129, 108752. [Google Scholar] [CrossRef]
  30. Darányi, A.; Ruppert, T.; Abonyi, J. Particle filtering supported probability density estimation of mobility patterns. Heliyon 2024, 10, e29437. [Google Scholar] [CrossRef] [PubMed]
  31. Zhang, S.; Liu, X.; Tang, J.; Cheng, S.; Wang, Y. Urban spatial structure and travel patterns: Analysis of workday and holiday travel using inhomogeneous Poisson point process models. Comput. Environ. Urban Syst. 2019, 73, 68–84. [Google Scholar] [CrossRef]
  32. Baccelli, F.; Błaszczyszyn, B.; Karray, M. Random Measures, Point Processes, and Stochastic Geometry; Inria: Le Chesnay-Rocquencourt, France, 2024. [Google Scholar]
  33. Delong, Ł.; Lindholm, M.; Wüthrich, M.V. Gamma mixture density networks and their application to modelling insurance claim amounts. Insur. Math. Econ. 2021, 101, 240–261. [Google Scholar] [CrossRef]
  34. Wang, G.J.; Cheng, C.; Ma, Y.Z.; Xia, J.Q. Likelihood-free inference with the mixture density network. Astrophys. J. Suppl. Ser. 2022, 262, 24. [Google Scholar] [CrossRef]
  35. Ma, Z.; Leijon, A.; Kleijn, W.B. Vector quantization of LSF parameters with a mixture of Dirichlet distributions. IEEE Trans. Audio Speech Lang. Process. 2013, 21, 1777–1790. [Google Scholar] [CrossRef]
  36. Pal, S.; Heumann, C. Clustering compositional data using Dirichlet mixture model. PLoS ONE 2022, 17, e0268438. [Google Scholar] [CrossRef] [PubMed]
  37. Nguyen, T.; Moka, S.; Mengersen, K.; Liquet, B. Spatial Autoregressive Model on a Dirichlet Distribution. arXiv 2024, arXiv:2403.13076. [Google Scholar]
  38. Ungar, A.A. Barycentric Calculus in Euclidean and Hyperbolic Geometry: A Comparative Introduction; World Scientific: Singapore, 2010. [Google Scholar]
  39. Lax, P.D. Change of variables in multiple integrals. Am. Math. Mon. 1999, 106, 497–501. [Google Scholar] [CrossRef]
  40. Wang, J.; Li, T.; Li, B.; Meng, M.Q.H. GMR-RRT*: Sampling-based path planning using gaussian mixture regression. IEEE Trans. Intell. Veh. 2022, 7, 690–700. [Google Scholar] [CrossRef]
  41. Hsu, Y.P.; Chen, H.H. Multivariate beta mixture model: Probabilistic clustering with flexible cluster shapes. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining, Taipei, Taiwan, 7–10 May 2024; Springer: Singapore, 2024; pp. 233–245. [Google Scholar]
Figure 1. An example of the MDN model for learning the spatial density distribution of RWP mobility over a triangle, assuming here that K = 2 trivariate Dirichlet distributions are mixed as the components. Note that the input to the fully connected neural network is a six-vector, i.e., (a1, b1, a2, b2, a3, b3), which uniquely identifies a triangular movement area by the six coordinates of its three vertices. Corresponding to the definitions of the output parameters, the activation functions at the output layer are softmax for ηk (such that the ηk sum to 1 over k = 1, …, K) and softplus for α1k, α2k, and α3k, respectively.
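As a usage note for the model in Figure 1, a learned mixture is evaluated at a Cartesian point by converting the point to barycentric coordinates and dividing by the Jacobian determinant of the barycentric-to-Cartesian map (equal to twice the triangle area; cf. the change of variables in [39]). The sketch below, with illustrative parameter values of our own choosing, shows this evaluation in NumPy:

```python
import numpy as np
from math import lgamma

def dirichlet_logpdf(lam, alpha):
    # Log-density of Dirichlet(alpha) at barycentric point lam (sums to 1).
    return (lgamma(alpha.sum()) - sum(lgamma(a) for a in alpha)
            + np.sum((alpha - 1) * np.log(lam)))

def mixture_density(x, y, tri, eta, alpha):
    """Density at Cartesian (x, y) for a Dirichlet mixture over triangle tri.

    tri: 3x2 array of vertices; eta: (K,) weights; alpha: (K, 3) parameters.
    """
    A = np.array([[tri[0, 0] - tri[2, 0], tri[1, 0] - tri[2, 0]],
                  [tri[0, 1] - tri[2, 1], tri[1, 1] - tri[2, 1]]])
    l12 = np.linalg.solve(A, np.array([x - tri[2, 0], y - tri[2, 1]]))
    lam = np.append(l12, 1.0 - l12.sum())  # barycentric coordinates
    if np.any(lam <= 0):
        return 0.0                         # outside the (open) triangle
    jac = abs(np.linalg.det(A))            # = 2 * triangle area
    return sum(e * np.exp(dirichlet_logpdf(lam, a))
               for e, a in zip(eta, alpha)) / jac

# Example: a 2-component mixture evaluated at the triangle's centroid.
tri = np.array([[0.0, 0.0], [0.5, 1.0], [1.0, 0.0]])
eta = np.array([0.6, 0.4])
alpha = np.array([[2.0, 2.0, 2.0], [3.0, 1.5, 1.5]])
print(mixture_density(0.5, 1 / 3, tri, eta, alpha))
```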
Figure 2. Examples of the decomposition of a convex quadrilateral into four sub-triangles for mixture density learning: (a) constantly along the diagonal AC; (b) alternately along the two diagonals AC and BD.
Figure 3. An example of the learning model for fitting RWP mobility over a convex quadrilateral, assuming here that K = 2 trivariate Dirichlet distributions are mixed, each covering one sub-triangle from the decomposition (as in Figure 2a) of the quadrilateral. Note that the input to the fully connected neural network is an eight-vector, i.e., (a1, b1, a2, b2, a3, b3, a4, b4), which uniquely identifies a quadrilateral movement area by the eight coordinates of its four vertices.
Figure 4. Training and testing performance comparisons (in terms of MSE or KL divergence, with 95% confidence intervals) between DMM and GMM for triangles with K = 1, 2, …, 10: (a) training, MSE; (b) training, KL divergence; (c) testing, MSE; (d) testing, KL divergence. The examples used include a1 = 0, 0.3, 0.6 for training and a1 = 0.05, 0.1, 0.15, 0.2, …, 0.7 (excluding 0.3 and 0.6) for testing, with b1 = 0, a2 = 0.5, b2 = 1, a3 = 1, and b3 = 0 held constant.
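For reference, the two metrics reported in Figure 4 can be computed from density values sampled on a common grid, roughly as in the following sketch; the renormalization to probability masses for KL and the smoothing constant are our assumptions, and the values are dummies.

```python
import numpy as np

def grid_metrics(p, q, eps=1e-12):
    """MSE and KL divergence between ground-truth densities p and model
    densities q, both sampled at the same grid points inside the area."""
    mse = np.mean((p - q) ** 2)
    pm, qm = p / p.sum(), q / q.sum()  # renormalize to probability masses
    kl = np.sum(pm * np.log((pm + eps) / (qm + eps)))
    return mse, kl

# Example with dummy density values on a tiny grid:
p = np.array([0.2, 1.1, 2.0, 1.1, 0.2])
q = np.array([0.3, 1.0, 1.8, 1.2, 0.3])
print(grid_metrics(p, q))
```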
Figure 5. Contour plots of the spatial density distributions of RWP over triangles obtained from (b,e) DMM and (c,f) GMM, assuming (a–c) a1 = 0 or (d–f) a1 = 0.6, with K = 1 for both mixture models, against those from (a,d) simulation.
Figure 6. Contour plots of the DMM's component Dirichlet distributions over triangles with a1 = 0.6 and K = 3, for (a) the first, (b) the second, and (c) the third components, respectively.
Figure 7. Training and testing performance comparisons (in terms of MSE or KL divergence, with 95% confidence intervals) between DMM and GMM for quadrilaterals with K = 2, 4, 6, 8, 10: (a) training, MSE; (b) training, KL divergence; (c) testing, MSE; (d) testing, KL divergence.
Figure 8. Contour plots of the spatial density distributions of RWP over (convex) quadrilaterals obtained from (b,e) DMM and (c,f) GMM, assuming (a–c) a1 = 0 or (d–f) a1 = 0.6, with K = 4 for both mixture models, against those from (a,d) simulation.
Figure 9. Contour plots of the DMM's component Dirichlet distributions over (convex) quadrilaterals with a1 = 0.6 and K = 4, for (a) the first, (b) the second, (c) the third, and (d) the fourth components, respectively.
Table 1. Summary of input feature values of the training and testing examples.

For Triangles
             a1                                        b1     a2     b2    a3    b3
Training     0, 0.3, 0.6                               0      0.5    1     1     0
Testing      0.05, 0.1, …, 0.7 (except 0.3, 0.6)       0      0.5    1     1     0

For Quadrilaterals
             a1                                        b1     a2     b2    a3    b3    a4    b4
Training     0, 0.3, 0.6                               0.2    0.6    1     1     1     1     0
Testing      0.05, 0.1, …, 0.7 (except 0.3, 0.6)       0.2    0.6    1     1     1     1     0
Table 2. Summary of the specifics of relevant experimental settings.

Experimental Settings                                    Specifics
Model Training
    Optimizer                                            Adam
    Batch Size (per epoch)                               3
    Total # of Epochs (per run)                          20,000
    Total # of Runs (per example)                        10
    Learning Rates                                       5 × 10⁻⁴ †, or 2 × 10⁻⁵ ††
    Total # of Components                                K = 1, 2, …, 10 †, or K = 2, 4, …, 10 ††
RWP Simulation
    Sampling Space                                       [0, 1]²
    Sampling Grid                                        250 × 250
    Simulation Time                                      5 × 10⁷ (units of distance)
Platform for Model Training/Inference & RWP Simulation
    Python                                               python 3.12.7
    NumPy                                                numpy 2.0.1
    PyTorch                                              torch 2.5.1
    Compiler (for RWP Simulation)                        clang 16.0.0 (w/ -O3 optimization)
    Chip                                                 Apple M4 Pro
    Total # of Cores                                     14 (CPU), 20 (GPU)
    Memory                                               24 GB

† for triangles, and †† for quadrilaterals.