*Article* **Efficient Monte Carlo Methods for Multidimensional Modeling of Slot Machines Jackpot**

**Slavi Georgiev 1,2 and Venelin Todorov 1,3,\***


**Abstract:** Nowadays, entertainment is one of the biggest industries, and it continues to expand. In this study, we deal with the problem of estimating the consolation prize as a fraction of the jackpot, which is an important issue for every casino and gambling club. Solving the problem leads to the computation of multidimensional integrals. For that purpose, modifications of the most powerful stochastic quasi-Monte Carlo approaches are employed, in particular lattice and digital sequences, Halton and Sobol sequences, and Latin hypercube sampling. They show significant improvements over the classical Monte Carlo methods. After accurate computation of the arising integrals, it is shown how to calculate the expectation of the real consolation prize, taking into account the distribution of the time during which different numbers of players are betting. Moreover, a solution to the problem in higher dimensions is also proposed. All the suggestions are verified by computational experiments with real data. Besides gambling, the results obtained in this study have various applications in numerous areas, including finance, ecology and many others.

**Keywords:** gambling; jackpot; multidimensional integrals; Monte Carlo methods; lattice sequences; digital sequences

**MSC:** 62P30; 65C05; 68W20; 91A60

#### **1. Introduction**

The gambling industry plays a significant role in modern life [1,2]. It is one of the most profitable businesses worldwide, with a projected value of more than USD 640 billion by 2027 [3]. In recent years, there have been more and more opportunities for everyone seeking this kind of entertainment. The most popular gaming machines are the slot machines [4], also called "fruit machines" or even "one-armed bandits" [5]. However, due to the competition, the payback rate (return to player, RTP) is so high that it can reach 98% [6]. For every casino and gambling club, it is vital to plan its expenditures in a precise way in order to be both competitive and profitable.

#### *1.1. General Framework*

In this paper, we consider gambling club systems with slot machines [7]. The revenues are formed entirely by the players' bets. The largest share of the expenditures consists of the direct 'payline' wins; the bonuses and the standalone jackpots are also included here. Next comes the linked progressive jackpot (hereinafter called the jackpot). Its size is based on the size of the bets, so it constitutes a deterministic part of the expenses. This is not the case, though, for the consolation prize. Its upper limit is equal to the jackpot or a predefined part of it. However, its particular size depends on the player who won the jackpot and their bet, so its size is a stochastic variable. The aim of this paper is to propose a robust method to compute its expectation. For the sake of completeness, the remaining costs concern the drinks and food in the casino as well as the staff salaries, equipment and housing.

The consolation prize is divided between all the players except the one who won the jackpot. Everyone gets a share proportional to their bet. So, basically, if $\mathbb{E}[X]$ is the expected bet share of the winner, then $1 - \mathbb{E}[X]$ is the expected size (as a percentage) of the consolation prize. How the bet is placed is explained in the following subsection.

#### *1.2. Bet Collection*

Firstly, a player should deposit a cash amount in the machine. This is done directly via the bill validator or by the dealer with the «attendant» electronic key. Then, before choosing a game, the player selects how many credits he/she bets on every spin and sets the credit denomination, i.e., the price of one credit. Afterwards, the player can choose a particular game. From the game settings, the number of lines can be selected. Roughly speaking, a bet of 30 credits on 30 lines means playing 30 games simultaneously, each of them with a bet of 1 credit.

The first slot machines had real mechanical reels, while the recent ones have screens with virtual reels. The player pushes a button, which activates the spinning of the reels. When they stop, the player wins a payline prize if the same symbols appear on an active line (which is in general a pattern rather than a straight line). Regardless of the outcome, the bet is collected for every spin.

Every game is characterized by its volatility [8,9]. It is measured on a discrete scale between 1 and 5. An exact formula does not exist; it should be interpreted simply as: the higher the number, the more volatile the win size. A game has different settings for the RTP level [3,10–12], which often varies between 88% and 96%, but is usually higher than 91–92%. Every RTP level is associated with a different set of reels. It is preset by the owner of the machine and cannot be altered on the go.

#### *1.3. Jackpot Winning*

So, for every bet, a predefined amount is dedicated to the jackpot prize. When the event of hitting the jackpot occurs, it is won by a single player. As discussed earlier, the jackpot win probability of each player is proportional to their bet. The process of hitting is visualized as follows. On the screen, a tube following its perimeter appears (which, obviously, has a rectangular shape). The tube is partitioned such that each player is assigned a segment of proportional length. Then, a ball starts circulating over the tube with decreasing velocity. The segment where it stops determines the jackpot winner. The others receive the consolation prize.

The main novelty of the study is the reformulation of the problem of finding the expectation of the consolation prize into a multidimensional integral evaluation problem, and the design of an algorithm to obtain this value by employing advanced stochastic approaches and numerical techniques. The paper is organized as follows. Section 2 introduces the models of integral representations, the algorithms for point transformations and the stochastic methods used in the integral computation. Section 3 is devoted to the detailed presentation of the obtained results and their thorough explanation. In Section 4, a particular case study is considered with real data, where it is demonstrated how to derive the expected value of the real consolation prize. The paper is concluded with Section 5.

#### **2. Algorithms and Methods**

Before presenting the model, we reveal one more constraint which has to be taken care of. No matter how small a single bet is, the associated probability cannot be less than 0.5%. This automatically implies that the upper limit of a single player's probability is $(100 - 0.5(N - 1))\%$, where *N* is the number of players participating in the play. Henceforward, we define *L* := 0.005 and *U* := 0.995. Obviously, this setting is valid for, at most, *N* ≤ 200 players.

Let the probability for the *i*-th player (*i* = 1, ... , *N*) to win the jackpot be denoted by $x_i$. By the aforementioned arguments, it is obvious that

$$\sum\_{i=1}^{N} x\_i = 1 \tag{1}$$

and

$$L \le x_i \le U - L \cdot N \quad \text{for} \quad i = 1, \dots, N. \tag{2}$$

Equality (1) suggests that the points should be drawn from the *N*-dimensional simplex. Inequalities (2), though, are more complicated to satisfy. In order to cope with it, we will present two approaches. However, to explain them better, firstly we present the expectation operator.

#### *2.1. Integral Representation*

Let us restate our aim: to find the expected size of the consolation prize. If the first player wins the jackpot, the (relative) size of the consolation prize is $1 - x_1$. Moreover, the probability that the first player wins the jackpot is exactly $x_1$. Of course, the same holds for all players. So, the size of the consolation prize (CP), given as a percentage, in the case of *N* players is

$$\begin{aligned} \mathbb{E}[\mathrm{CP}] &= \mathbb{P}[\text{first player wins}] \cdot \mathrm{Size}[\mathrm{CP} \mid \text{first player wins}] \\ &\quad + \mathbb{P}[\text{second player wins}] \cdot \mathrm{Size}[\mathrm{CP} \mid \text{second player wins}] \\ &\quad + \dots \\ &\quad + \mathbb{P}[\text{last player wins}] \cdot \mathrm{Size}[\mathrm{CP} \mid \text{last player wins}] \\ &= \sum_{i=1}^{N} x_i (1 - x_i). \end{aligned} \tag{3}$$

In this case, the expectation would look like

$$\mathbb{E}[\mathrm{CP}] = \int\limits_{\mathbf{x} \in \Delta^{N-1}} \cdots \int \; \sum_{i=1}^{N} x_i (1 - x_i) \, \mathrm{d}x_N \, \mathrm{d}x_{N-1} \cdots \mathrm{d}x_1, \tag{4}$$

where $\Delta^{N-1}$ is the standard $(N-1)$-simplex, transformed according to (2).

However, we can reduce the dimension of the integral (4) by one, using $x_N = 1 - x_1 - x_2 - \dots - x_{N-1}$ from (1):

$$\mathbb{E}[\mathbb{C}P] = \sum\_{i=1}^{N-1} x\_i (1 - x\_i) + \left(1 - \sum\_{i=1}^{N-1} x\_i\right) \sum\_{i=1}^{N-1} x\_i. \tag{5}$$

Setting *D* := *N* − 1, then the integral would be

$$\mathbb{E}[\mathrm{CP}] = \int\limits_{\mathbf{x} \in V^{D}} \cdots \int \left[ \sum_{i=1}^{D} x_i (1 - x_i) + \left(1 - \sum_{i=1}^{D} x_i\right) \sum_{i=1}^{D} x_i \right] \mathrm{d}x_D \cdots \mathrm{d}x_1, \tag{6}$$

where $V^D$ is the *space* between the standard $(D-1)$-simplex and the coordinate hyperplanes in $\mathbb{R}^D$, again transformed according to (2).

For clarity purposes, we write the integrals with their respective limits for the first values of *D*:


• For *D* = 2:

$$\int_{L}^{U-L} \int_{L}^{U-x_1} \Big[ x_1 (1 - x_1) + x_2 (1 - x_2) + (1 - x_1 - x_2)(x_1 + x_2) \Big] \, \mathrm{d}x_2 \, \mathrm{d}x_1$$

• For *D* = 3:

$$\int_{L}^{U-2L} \int_{L}^{U-L-x_1} \int_{L}^{U-x_1-x_2} \Big[ x_1 (1 - x_1) + x_2 (1 - x_2) + x_3 (1 - x_3) + (1 - x_1 - x_2 - x_3)(x_1 + x_2 + x_3) \Big] \, \mathrm{d}x_3 \, \mathrm{d}x_2 \, \mathrm{d}x_1$$

• For *D* = *k*:

$$\int_{L}^{U-(k-1)L} \int_{L}^{U-(k-2)L-x_1} \cdots \int_{L}^{U-\sum_{i=1}^{k-1} x_i} \left[ \sum_{i=1}^{k} x_i (1 - x_i) + \left(1 - \sum_{i=1}^{k} x_i\right) \sum_{i=1}^{k} x_i \right] \mathrm{d}x_k \cdots \mathrm{d}x_2 \, \mathrm{d}x_1.$$

For *D* = 1 and *D* = 2, the integrands are plotted on Figure 1.

**Figure 1.** Integrands for *D* = 1 and *D* = 2.

In the next subsections, we will describe the algorithms for drawing points for both (4) and (6). Henceforward, let *C* denote the total number of points.

#### *2.2. Algorithm for Drawing Points for (4)*

This approach is rather simple. Firstly, we draw *C* *D*-dimensional points $\hat{\mathbf{x}}^D$, uniformly distributed in the *D*-dimensional hypercube. Then, we sort the coordinates of every point, independently of the other points, and name the new points $\tilde{\mathbf{x}}^D$. Subsequently, we map $\tilde{\mathbf{x}}^D$ to the points $\tilde{\mathbf{x}}^{D+2}$ from the $(D+2)$-dimensional hypercube as

$$\tilde{\mathbf{x}}^{D+2}(i) := \tilde{\mathbf{x}}^D(i-1) \quad \text{for} \quad i = 2, \dots, D+1$$

and set

$$\tilde{\mathbf{x}}^{D+2}(1) := 0, \qquad \tilde{\mathbf{x}}^{D+2}(D+2) := 1.$$

Now, we are sure that the coordinates of each point in $\tilde{\mathbf{x}}^{D+2}$ are sorted and lie in the interval [0, 1], including the boundaries 0 and 1. If we take the difference between every two adjacent coordinates and in such a way define a new coordinate, we arrive at points $\hat{\mathbf{x}}^N$ in the $(D+1)$-dimensional hypercube, where

$$\hat{\mathbf{x}}^N(i) := \tilde{\mathbf{x}}^{D+2}(i+1) - \tilde{\mathbf{x}}^{D+2}(i).$$

The points $\hat{\mathbf{x}}^N$ indeed belong to the standard *D*-simplex.

In that way, we fulfilled (1). In order to satisfy (2), we apply a *linear* transformation on $\hat{\mathbf{x}}^N$:

$$\mathbf{x}^{N} := \mathbf{\hat{x}}^{N}(1 - N \cdot L) + L,\tag{7}$$

where the operations are applied element-wise.

Thus, the points $\mathbf{x}^N$ (7) satisfy both (1) and (2) and belong to $\Delta^{D}$.
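As an illustration, the following minimal Python sketch implements this point-drawing procedure; the function name and the use of pseudo-random points (instead of one of the quasi-random sequences discussed below) are our own choices for the example.

```python
import numpy as np

def draw_simplex_points(C, N, L=0.005, rng=None):
    """Draw C points in the transformed (N-1)-simplex, as in Section 2.2 (sketch)."""
    rng = np.random.default_rng(rng)
    D = N - 1
    # C points, uniformly distributed in the D-dimensional hypercube
    x_hat = rng.random((C, D))
    # sort the coordinates of every point independently of the other points
    x_tilde = np.sort(x_hat, axis=1)
    # prepend 0 and append 1, so all coordinates lie sorted in [0, 1]
    padded = np.hstack([np.zeros((C, 1)), x_tilde, np.ones((C, 1))])
    # differences of adjacent coordinates give points on the standard D-simplex
    x_simplex = np.diff(padded, axis=1)            # shape (C, N), rows sum to 1
    # linear transformation (7), enforcing the lower bound L on every coordinate
    return x_simplex * (1 - N * L) + L

points = draw_simplex_points(1000, N=11)
print(points.sum(axis=1)[:3])                      # each row sums to 1
```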

#### *2.3. Algorithm for Drawing Points for (6)*

This algorithm is also not complicated. To begin with, we again draw *C* *D*-dimensional points $\hat{\mathbf{x}}^D$, uniformly distributed in the *D*-dimensional hypercube. Now, it is enough to *linearly* scale the point coordinates as follows:

$$x^{D}(i) := \hat{x}^{D}(i) \cdot \left( U - (D - i + 1)L - \sum_{j=1}^{i-1} x^{D}(j) \right) + L, \tag{8}$$

where, of course, the sum is equal to 0 for *i* = 1.

The points $\mathbf{x}^D$ (8) satisfy

$$\sum_{i=1}^{D} x_i \le U \quad \text{and} \quad x_i \ge L \quad \text{for} \quad i = 1, \dots, D.$$

Thus, they truly belong to *V<sup>D</sup>* and satisfy the requirements to be used for the evaluation of (6).
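A minimal Python sketch of this scaling is given below; it assumes that the sum in (8) runs over the already-scaled coordinates, which is what guarantees the bound on the total.

```python
import numpy as np

def draw_vd_points(C, D, L=0.005, U=0.995, rng=None):
    """Scale uniform points into V^D according to (8); an illustrative sketch."""
    rng = np.random.default_rng(rng)
    x_hat = rng.random((C, D))            # uniform points in the D-dimensional hypercube
    x = np.empty_like(x_hat)
    for i in range(D):                    # i = 0, ..., D-1 corresponds to i = 1, ..., D in (8)
        partial = x[:, :i].sum(axis=1)    # sum of the already-scaled coordinates
        x[:, i] = x_hat[:, i] * (U - (D - i) * L - partial) + L
    return x

points = draw_vd_points(1000, D=10)
assert points.min() >= 0.005 and points.sum(axis=1).max() <= 0.995
```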

Of course, for that purpose, we could use the former algorithm and truncate the last coordinate of **x***<sup>N</sup>* (7). This operation is actually a projection of **x***<sup>N</sup>* onto the *D*-dimensional hyperplane *Ox*1*x*<sup>2</sup> ... *xD*.

It is worth saying that we indeed use the first algorithm with the truncation of the last coordinate since the second one does not distribute the points in an optimal way.

#### *2.4. Monte Carlo Algorithms*

The Monte Carlo methods [13] come in handy when the deterministic methods fail [14–16]. They have tremendous applications in many areas, including financial derivatives evaluation [17] and even slot machines play [18] and reels reconstruction [19]. Now we mention some of the fundamental Monte Carlo approaches.

*Plain (crude) Monte Carlo* is the earliest and probably the most used Monte Carlo (MC) method for solving multidimensional integrals [14]. The MC quadrature formula relies on the probabilistic interpretation of the integral

$$I[f] = \int\_{\Omega} f(\mathbf{x}) p(\mathbf{x}) d\mathbf{x}.$$

Let the random variable *θ* = *f*(*ξ*) be such that

$$\mathbf{E}\theta = \int\_{\Omega} f(\mathbf{x}) p(\mathbf{x}) d\mathbf{x},$$

where the random points *ξ*1, *ξ*2, ... , *ξ<sup>N</sup>* are independent realizations of the random point *ξ* with probability density function *p*(x) and *θ*<sup>1</sup> = *f*(*ξ*1), ... , *θ<sup>N</sup>* = *f*(*ξN*). Then the plain MC approach for the integral *I* is defined as [14]

$$
\overline{\theta}\_N = \frac{1}{N} \sum\_{i=1}^N \theta\_i.
$$
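As a simple illustration of the plain MC quadrature, the sketch below estimates an integral over the unit hypercube with uniformly drawn points; the test integrand is our own example, not one of the integrals studied in this paper.

```python
import numpy as np

def crude_mc(f, d, n, rng=None):
    """Plain Monte Carlo estimate of the integral of f over [0, 1]^d."""
    rng = np.random.default_rng(rng)
    xi = rng.random((n, d))       # n independent realizations, uniform on [0, 1]^d
    return f(xi).mean()           # theta_bar_n = (1/n) * sum_i f(xi_i)

# example: the integral of sum_i x_i (1 - x_i) over [0, 1]^3 equals 3 * 1/6 = 0.5
f = lambda x: (x * (1 - x)).sum(axis=1)
print(crude_mc(f, d=3, n=10**5))  # close to 0.5
```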

*Latin hypercube sampling (LHS)* is a type of stratified sampling (SS) [14]. In the case of SS, one must divide $[0, 1]^d$ into $M^d$ disjoint subdomains, each of volume $1/M^d$, and sample one point in each subdomain. It is proved in [15] that the variance of SS can never exceed the variance of plain random sampling. LHS is a highly researched topic [20–23].

By definition, *quasi-Monte Carlo (QMC) methods* are based on quasi-random sequences that are built in such a way as to minimize a measure of their deviation from uniformity, called discrepancy [24,25].

Let $\mathbf{x}_i = (x_i^{(1)}, x_i^{(2)}, \dots, x_i^{(s)})$, $i = 1, 2, \dots$. The representation of $n$ in base $b$ is given by the following formula [26]: $n = \dots a_3(n)\, a_2(n)\, a_1(n)$, $n > 0$, $n \in \mathbb{Z}$.

The radical inverse sequence is defined as [27]: $n = \sum_{i=0}^{\infty} a_{i+1}(n) b^i$, $\varphi_b(n) = \sum_{i=0}^{\infty} a_{i+1}(n) b^{-(i+1)}$, and its discrepancy satisfies $D_N^* = \mathcal{O}\!\left(\frac{\log N}{N}\right)$. The Van der Corput sequence [28] is obtained when $b = 2$.

*Halton* sequence [29,30] is defined as

$$s_n^{(k)} = \sum_{i=0}^{\infty} \sigma_{i+1}^{(k)}\!\left(a_{i+1}^{(k)}(n)\right) b_k^{-(i+1)},$$

where $(b_1, b_2, \dots, b_s) \equiv (2, 3, 5, \dots, p_s)$, $p_i$ denotes the $i$-th prime, and $\sigma_i^{(k)}$, $i \ge 1$, is a set of permutations on $(0, 1, 2, \dots, p_k - 1)$.

*Sobol* sequence [17,31–34] is defined by

$$\mathbf{x}_k \in \sigma_i^{(k)}, \quad k = 0, 1, 2, \dots,$$

where $\sigma_i^{(k)}$, $i \ge 1$, is a set of permutations on every $2^k$ ($k = 0, 1, 2, \dots$) subsequent points of the Van der Corput sequence. In binary, we have $x_n^{(k)} = \bigoplus_{i \ge 0} a_{i+1}(n) v_i$, where $v_i$, $i = 1, \dots, s$, is a set of direction numbers [34].

To scramble the *Halton sequence*, we used a permutation of the radical inverse coefficients derived by applying a reverse-radix operation to all of the possible coefficient values. The algorithm is proposed in [35]. To scramble the *Sobol sequence*, we used a random linear scramble combined with a random digital shift. The algorithm is suggested in [36].
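For reference, scrambled Halton and Sobol points, as well as Latin hypercube samples, can be generated with SciPy's `scipy.stats.qmc` module; this is only a convenient stand-in for the generators and scrambling algorithms cited above, not the exact implementations used in this paper.

```python
from scipy.stats import qmc

d = 10                                                           # dimension of the integral
halton = qmc.Halton(d=d, scramble=True, seed=1).random(2**10)
sobol = qmc.Sobol(d=d, scramble=True, seed=1).random_base2(10)   # 2^10 points
lhs = qmc.LatinHypercube(d=d, seed=1).random(2**10)
# each array has shape (1024, 10) with points in [0, 1)^10,
# ready to be mapped into the simplex or V^D as in Sections 2.2 and 2.3
```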

When the integrand is sufficiently regular, the *lattice sequences*, using special types of sequences with low discrepancy, generally outperform the basic MC methods. Sloan and Kachoyan [37], Niederreiter [38], Hua and Wang [39], Wang and Hickernell [40] and Sloan and Joe [41] provide comprehensive expositions on the theory of lattice sequences.

We constructed our lattice sequence with an optimal generating vector by using a special algorithmic construction [42–45] of rank-1 lattice rules with a prime number of points and with product weights, using $2^{30}$ points.
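For concreteness, a rank-1 lattice rule with generating vector $\mathbf{z}$ and $n$ points uses the nodes $\mathbf{x}_k = \{k\mathbf{z}/n\}$, $k = 0, \dots, n-1$ (braces denoting the fractional part). A small sketch follows; the generating vector here is arbitrary and chosen only for illustration, not the optimised vector from [42–45].

```python
import numpy as np

def rank1_lattice(n, z):
    """Rank-1 lattice rule nodes x_k = frac(k * z / n), k = 0, ..., n-1."""
    k = np.arange(n).reshape(-1, 1)
    return (k * np.asarray(z) / n) % 1.0

# illustrative generating vector for d = 4; 1021 is prime
nodes = rank1_lattice(1021, z=[1, 433, 301, 709])
print(nodes.shape)   # (1021, 4)
```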

Niederreiter [27,46] introduced a special family of digital (*t*, *m*,*s*)-nets over *Fb*. Those nets are obtained from rational functions over finite fields [47–49]. We will use a special type of digital sequences, namely *interlaced digital sequences* [50–53], a special class of digital nets, a concept analogous to lattice sequences, but based on linear algebra over finite fields [54].

We will use the generating matrices for an implementation of the Sobol' sequence from [55] with 21201 dimensions for the *digital sequence*, as well as the generating matrices for interlaced Sobol' sequences with interlacing factor *d* = 2 for the *interlaced digital sequence*.

#### **3. Results and Discussion**

In this section, we make an overview of the results, obtained by the aforementioned methods. We recall that *C* denotes the total number of points used in the integration.

For the 10-dimensional integral (Table 1), the best approach is the lattice sequence. It achieves the smallest error in 7 of 21 cases. The other suitable method is the interlaced digital sequence, producing the five smallest errors, but they are obtained for large values of *C*, and they are of one order better than the errors produced by the other methods.

**Table 1.** The 10-dimensional integral absolute errors.

Regarding the 20-dimensional integral (Table 2), the best approach is the digital sequence with the five smallest errors. The other suitable approaches for large *C* are the scrambled versions of Halton and Sobol sequences, but the best errors are mostly of the same magnitude with the other errors.

**Table 2.** The 20-dimensional integral absolute errors.


The results for the 10- and 20-dimensional integrals are visualized on Figure 2.

**Figure 2.** The 10- (**left**) and 20-dimensional (**right**) integral absolute errors.

Considering the 30-dimensional integral (Table 3), the scrambled Halton sequence stands out with the 10 smallest errors. For large *C*, only the interlaced digital sequence produces comparable results.


**Table 3.** The 30-dimensional integral absolute errors.

The situation is similar for the 40-dimensional integral (Table 4), where the best methods are the scrambled versions of the Halton and Sobol sequences, with seven best errors each. It is notable that the scrambled Halton sequence achieves errors of order 1e-9 for both integrals.

**Table 4.** The 40-dimensional integral absolute errors.


The results for them are displayed on Figure 3.

**Figure 3.** The 30- (**left**) and 40-dimensional (**right**) integral absolute errors.

Going to higher dimensions, the best approach for the 50-dimensional integral (Table 5) is again the scrambled Halton sequence, with the five smallest errors. For the largest values of *C*, the digital sequence produces the best results. It is also noticeable that the former achieves an error of order 1e-9.


**Table 5.** The 50-dimensional integral absolute errors.

For the last integral tested, the 60-dimensional one (Table 6), the most accurate approaches are the scrambled Sobol sequence and the Latin hypercube sampling, producing six best errors each. The LHS is suitable for lower values of *C*, while the scrambled Sobol sequence and the lattice sequence perform well for larger values of *C*. Most of the methods achieve an error of order 1e-9, but looking at the order of convergence, it is obvious that for the crude Monte Carlo approach this phenomenon is pure luck.


**Table 6.** The 60-dimensional integral absolute errors.

The results for the 50- and 60-dimensional integrals are plotted on Figure 4.

**Figure 4.** The 50- (**left**) and 60-dimensional (**right**) integral absolute errors.

A number of conclusions could be drawn. Firstly, the achieved absolute errors are very small, even for the large-scale problems. Secondly, the lattice and digital sequences perform well for a lower number of dimensions, while the scrambled Halton and Sobol sequences are suitable for a larger number of dimensions. It is difficult to say in general, though, which method performs superior to the others.

There is a reason for this outcome. The integrand functions are smooth multivariate polynomials. Some of the adaptive methods, for example Latin hypercube sampling, have advantages for irregular and non-smooth functions, which is not the case here. Therefore, the scrambled Halton sequence and the lattice sequence with a good generating vector are recommended for accurate evaluation of the studied integrals.

It is worth mentioning a couple of notes about the implementation of the stochastic algorithms. First of all, the two digital sequences (interlaced or not) are implemented in C++, mainly to gain performance, while the other methods and the rest of the operations are implemented in MATLAB. Secondly, it is difficult to keep in RAM $2^{30}$ multidimensional points whose coordinates are stored in the 8-byte 'double' type. Even for the 10-dimensional points, the required storage amounts to 80 GiB (gibibytes). However, this is not needed after all. We have manually implemented a simple variant of a MapReduce model, where we evaluate the integrals over chunks of $2^{10}$ points, save the results, and eventually average all the approximations. Some of the stochastic methods are suitable for parallel implementation, but the elements of the other quasi-random sequences depend on the previous elements in the sequence, which makes an effective parallelization impossible.
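A schematic of this chunked (MapReduce-style) evaluation is sketched below; `next_chunk` stands for any routine that yields the next block of points of the chosen sequence and is our own placeholder name.

```python
import numpy as np

def integrate_in_chunks(integrand, next_chunk, n_chunks, chunk_size=2**10):
    """Average the integrand over many equally sized chunks, so the full
    2^30-point sequence never has to be held in memory at once (sketch)."""
    partial_means = np.empty(n_chunks)
    for c in range(n_chunks):
        pts = next_chunk(chunk_size)               # next 2^10 points of the sequence
        partial_means[c] = integrand(pts).mean()
    # equal chunk sizes, so the overall estimate is the mean of the chunk means
    return partial_means.mean()
```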

#### **4. A Real Case Study**

In this section, we apply the described algorithm for assessment of the consolation prize expectation to real data from gambling clubs in Bulgaria.

Let us also recall that the *D*-dimensional integral *f*(*D*) denotes the consolation prize (CP) expectation in a game with *N* = *D* + 1 players. In a game with one player, (s)he gets the jackpot and *CP* = 0. Now, we plot the results for *D* = 1, ... , 63 on Figure 5, showing the mathematical expectation and the standard deviation of the CP.

**Figure 5.** Consolation prize statistics.

The value of *f*(*D*) represents the share of the jackpot that the casino expects to pay as a consolation prize if *D* + 1 players play all the time. Of course, this is not the case. In particular, in small gambling clubs, low numbers of players are present most of the time. Furthermore, there is a day-and-night cyclicity in the attendance. The distribution could even be bimodal or multimodal. It is given for two gambling clubs in a mid-sized city in Bulgaria for 2017 on Figure 6. In each club, there are 32 slot machines, but it turned out that never more than 25 attendants in the first club and 16 attendants in the second one were playing simultaneously.

The *x*-axis denotes the number of attendants playing, while the *y*-axis shows the relative portion of time. Now, we calculate the *real consolation prize* $f'(D)$, which is defined as the CP expectation when at least 1 and at most *D* + 1 attendants play, weighted according to a predefined distribution. To compute the confidence intervals (CIs), we need the standard deviation, obtained via the standard formulae (9), assuming independence between the different numbers of attendants playing simultaneously.

$$\mathbb{E}[f'(D)] = \sum_{i=1}^{D} w_i\, \mathbb{E}[f(i)] \quad \text{and} \quad \sigma[f'(D)] = \sqrt{\sum_{i=1}^{D} w_i^2\, \sigma^2[f(i)]} \quad \text{for } D = 1, \dots, 24, \tag{9}$$

where $w_i$, $i = 1, \dots, D$, are the relative weights and $\sum_{i=1}^{D} w_i = 1$ must hold for every *D*.
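In code, the weighted expectation and standard deviation (9) reduce to a few lines; `weights`, `means` and `stds` below stand for $w_i$, $\mathbb{E}[f(i)]$ and $\sigma[f(i)]$ respectively and are assumed to be given.

```python
import numpy as np

def real_cp(weights, means, stds):
    """Expectation and standard deviation of the real consolation prize f'(D), Equation (9)."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                                    # enforce sum of weights = 1
    exp_cp = np.dot(w, means)                          # E[f'(D)] = sum_i w_i E[f(i)]
    std_cp = np.sqrt(np.sum(w**2 * np.asarray(stds)**2))
    return exp_cp, std_cp
```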

Finally, assuming the normality of the real CP *f* (*D*) for *D* = 1, ... , 24, its expectation and CIs are plotted for both distributions on Figures 7 and 8, where *n* denotes the number of jackpot hits.

**Figure 7.** Real consolation prize expectation and CIs (first gaming club/distribution).

The real CP of the second gambling club has higher standard deviation and thus broader CIs because low numbers of attendants play for relatively more time; see Figure 6. The lower the number of players, the higher the deviation; see Figure 5. On the other hand, the usually low number of players has a lower expectation of the real CP.

The overall conclusion is that with 99% confidence (*z*<sup>∗</sup> ≈ 2.57), the realized real consolation prize would not be farther than 1% from its expectation, in absolute units. This is true if the jackpot is hit on a daily basis, for less than two years (*n* = 500 days).

We conclude this section with a brief discussion about the bigger casinos. In the case of more machines and a time distribution of simultaneous play with low or even negative skew, the expectation of the real CP would be higher and closer to one. Of course, in this case, the jackpot would be hit more frequently, but also the absolute sizes of the jackpot and CP would be much greater compared to their counterparts of the small gambling clubs. So, the accurate computation of the CP expectation is also of a paramount importance for the big casinos.

**Figure 8.** Real consolation prize expectation and CIs (second gaming club/distribution).

Here arises the question of calculating integrals with hundreds of dimensions. The approaches described in the paper are robust and capable of dealing with large-scale problems, but when the number of dimensions gets huge, all methods become slow. A reasonable question is whether the CP values *f*(*D*) for large *D* could be extrapolated from the CP values for lower *D*, which are already computed.

Our investigation shows that *f*(*D*), see Figure 5, could be approximated by the functional form of the Michaelis–Menten saturation curve (10):

$$g(D; \mathbf{p}) = \frac{aD}{b+D} + c, \tag{10}$$

where the parameters are **p** = {*a*, *b*, *c*}.

We fit the model (10) to only the first ten values of *f*(*D*), *D* = 0, ... , 9, recalling that *f*(0) = 0. The vector of fitted values of **p** is called the nonlinear least-squares estimator and is denoted by $\check{\mathbf{p}}$. Let us also define the least-squares error functional as $\Phi(\mathbf{p}) = \sum_{D=0}^{9} \bigl( f(D) - g(D; \mathbf{p}) \bigr)^2$.

The fit is indeed good, since the norm of the final step $\|\delta\mathbf{p}_k\| = 8.24163\mathrm{e}{-5}$, the first-order optimality measure $\|\nabla\Phi(\check{\mathbf{p}})\|_\infty = 2.35\mathrm{e}{-9}$ and the error functional $\Phi(\check{\mathbf{p}}) = 4.24941\mathrm{e}{-8}$ are all very small. What is more, the variance of the residuals $\tilde{\sigma}^2 = 2.0614\mathrm{e}{-5}$ and the root mean squared error $\hat{\sigma} = 7.7914\mathrm{e}{-5}$ are very small, and the coefficient of determination is practically $R^2 = 1.0000$.

All the parameters **p** are statistically significant, and their fitted values are $\check{\mathbf{p}} = \{a, b, c\} = \{2.006, 1.014, -0.9962\}$. Finally, we evaluate $g(D; \check{\mathbf{p}})$ for *D* = 0, ... , 63 and plot the absolute error $\varepsilon(D) = f(D) - g(D; \check{\mathbf{p}})$ on Figure 9.

The errors are of magnitude 1e-4, but the fit was performed only on the first ten values. If all known values are fitted, the errors will be negligible. This shows that one could use (10) to extrapolate the real consolation prize for higher values of *D* at low computational cost with acceptable error.
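Such an extrapolation can be reproduced with a standard nonlinear least-squares routine. The sketch below uses SciPy's `curve_fit`; since the computed values of *f*(*D*) are not listed here, the data array is a synthetic stand-in generated from the reported parameter values, used purely to make the example runnable.

```python
import numpy as np
from scipy.optimize import curve_fit

def mm_curve(D, a, b, c):
    """Michaelis-Menten form g(D; p) = a*D / (b + D) + c, Equation (10)."""
    return a * D / (b + D) + c

D_fit = np.arange(10, dtype=float)                 # D = 0, ..., 9
rng = np.random.default_rng(0)
# synthetic stand-in for the computed f(D) values (NOT the real data)
f_fit = mm_curve(D_fit, 2.006, 1.014, -0.9962) + rng.normal(0.0, 1e-4, D_fit.size)

p_check, _ = curve_fit(mm_curve, D_fit, f_fit, p0=[2.0, 1.0, -1.0])
extrapolated = mm_curve(np.arange(64, dtype=float), *p_check)   # g(D; p) for D = 0, ..., 63
```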

**Figure 9.** Absolute error between the actual and extrapolated CP values.

#### **5. Conclusions**

In this novel experimental study, the problem of determining the expected value of the consolation prize as a fraction of the jackpot is solved for the first time. This is extremely important for every casino and gambling club, regardless of their size, due to tight budget planning. The problem is formulated as the evaluation of multidimensional integrals. Some of the most advanced quasi-Monte Carlo methods are used, in particular Sobol and Halton sequences with scrambling, lattice and digital sequences with interlacing, and Latin hypercube sampling. All of them are demonstrated to have superior performance compared to the basic Monte Carlo approaches.

The other novel element in the paper is the formulation of the expectation of the real consolation prize, taking into account the temporal distribution of the number of playing attendants. Finally, an approach is suggested to cope with very high dimensions through extrapolation of the already calculated results.

The proposed algorithm is able to calculate the consolation prize not only for linked in-house jackpots, but also for wide-area jackpots (which run across machines from multiple casinos). Another possible way to further develop this investigation is to optimize the stochastic approaches with respect to execution time, accuracy and the number of dimensions that can be handled.

The results, obtained in this investigation, could be used in many areas of life. They could play a significant role in the estimation of Sobol sensitivity indices for large-scale pollution models. Furthermore, such findings are heavily used in quantitative finance to evaluate and calibrate multidimensional financial derivatives. Finally, the results would help other scientists to perform demanding computations in diverse fields of knowledge. **Disclaimer**: This paper must not be understood as advice to play slot games or not to do so. Its aim is to propose an algorithm for practical and applied scientific purposes.

**Author Contributions:** All authors contributed equally. Conceptualization, S.G.; Methodology, V.T.; Software, S.G.; Validation, V.T.; Formal analysis, S.G.; Investigation, S.G.; Resources, V.T.; Writing—review & editing, V.T.; Supervision, V.T.; Funding acquisition, V.T. All authors have read and agreed to the published version of the manuscript.

**Funding:** Venelin Todorov is supported by the Bulgarian National Science Fund under the Bilateral Project KP-06-Russia/17 "New Highly Efficient Stochastic Simulation Methods and Applications" and by the Bulgarian National Science Fund under Project KP-06-N52/5 "Efficient methods for modeling, optimization and decision making". Slavi Georgiev is supported by the Bulgarian National Science Fund under Project KP-06-M32/2—17.12.2019 "Advanced Stochastic and Deterministic Approaches for Large-Scale Problems of Computational Mathematics" and by the National Program "Young Scientists and Postdoctoral Researchers—2"—Bulgarian Academy of Sciences and by the Scientific Research Fund of University of Ruse under FNSE-03.

**Data Availability Statement:** Restrictions apply to the availability of these data.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

#### **Abbreviations**

The following abbreviations are used in this manuscript:


#### **References**


54. Dick, J.; Kuo, F.Y.; Sloan, I.H. High dimensional integration: The quasi-Monte Carlo way. *Acta Numer.* **2013**, *22*, 133–288. [CrossRef]
55. Joe, S.; Kuo, F.Y. Constructing Sobol' sequences with better two-dimensional projections. *SIAM J. Sci. Comput.* **2008**, *30*, 2635–2654. [CrossRef]

**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

*Article* **Surface Approximation by Means of Gaussian Process Latent Variable Models and Line Element Geometry**

**Ivan De Boi 1,\*, Carl Henrik Ek <sup>2</sup> and Rudi Penne <sup>1</sup>**


**Abstract:** The close relation between spatial kinematics and line geometry has been proven to be fruitful in surface detection and reconstruction. However, methods based on this approach are limited to simple geometric shapes that can be formulated as a linear subspace of line or line element space. The core of this approach is a principal component formulation to find a best-fit approximant to a possibly noisy or partial surface given as an unordered set of points (point cloud). We expand on this by introducing the Gaussian process latent variable model, a probabilistic non-linear non-parametric dimensionality reduction approach following the Bayesian paradigm. This allows us to find structure in a lower-dimensional latent space for the surfaces of interest. We show how this can be applied in surface approximation and unsupervised segmentation to the surfaces mentioned above and demonstrate its benefits on surfaces that deviate from these. Experiments are conducted on synthetic and real-world objects.

**Keywords:** surface approximation; surface segmentation; surface denoising; gaussian process latent variable model; line geometry; line elements

**MSC:** 60G15

#### **1. Introduction**

Extracting structural information as shapes or surfaces from an unordered set of 3D coordinates (point cloud) has been an important topic in computer vision [1]. It is a crucial part of many applications such as autonomous driving [2], scene understanding [3], reverse engineering of geometric models [4], quality control [5], simultaneous localization and mapping (SLAM) [6] and matching point clouds to CAD models [7]. Over the last decade, hardware developments have made the acquisition of those point clouds more affordable. As the availability, ease of use and hence the popularity of various 3D sensors increases so does the need for methods to interpret the data they generate.

However, in this work, we mainly focus on detecting simple geometrical surfaces such as planes, spheres, cylinders, cones, spiral and helical surfaces, surfaces of revolution, etc., as described in [8]. Examples of these surfaces can be found in Figure 1. In [8], the close relation between these shapes, spatial kinematics and line geometry is formulated. A point cloud, as a set of noisy points on a surface, is transformed into a set of normals (also referred to as normal lines or normal vectors) that show exploitable properties for that surface. For instance, the normals of a sphere intersect in a single point, the normals of a surface of revolution intersect an axis of rotation, and the normals of a helical surface can be seen as path normals of a helical motion. These insights led to applications in surface reconstruction and robotics [8,9]. Later, their method was refined in [10,11] to address pipe surfaces, profile surfaces and developable surfaces in general. In [12], the authors introduced principal component analysis (PCA) to approximate the set of normals. This laid the groundwork for a more general approach in [13] using so-called *line elements.*


These are constructed for every point of the point cloud. They are formed by the (Plücker) coordinates of the normal line and the surface point which lies on that line. The key insight of their work is that the line elements of simple geometric surfaces lie on a linear subspace in R7, which can be found by solving an ordinary eigenvalue problem. We elaborate more on this approach in Section 2.2.

**Figure 1.** Examples of equiform kinematic surfaces: (**a**) cylinder of revolution; (**b**) cone of revolution; (**c**) spiral cylinder; (**d**) cylinder without revolution; (**e**) cone without revolution; (**f**) surface of revolution; (**g**) helical surface; (**h**) spiral surface.

Although this is a mathematically very elegant way to describe 3D surfaces, the approach does have several drawbacks. First, the surface classification is strict. This means that only very primitive shapes can be treated. Real-world objects do not always fall neatly into one of these categories; consider, for example, imperfect shapes like a slightly bent plane, a sphere with a dent, or shapes found in nature or human anatomy. Blindly following this method results in a misrepresentation of the data. Second, because PCA minimises an L2-norm, it is very sensitive to outliers. This can be mitigated by iterative RANSAC or by downweighting the outliers. However, this comes at the cost of increased computation time. Third, real-world point clouds show various other imperfections like non-uniform sampling, noise, missing regions, etc. This highly affects the quality of the representation. Fourth, the authors of [10–13] propose to look for *small* eigenvalues. The obvious question arises: when is an eigenvalue small? Even though some guidelines are provided, these threshold values remain domain specific and are challenging to set.

Most of these drawbacks can be attributed to the eigenvalue problem (or its PCA formulation) used to find an appropriate linear subspace in R7. In essence, this is a linear dimensionality reduction from the seven-dimensional space of line elements to a lower-dimensional latent space. In this work, we build on that method by introducing the Gaussian process latent variable model (GPLVM) [14] as an alternative. This allows for a non-linear relationship between a latent space and a higher dimensional data space, where observations are made. We implement a multi-output Gaussian process (seven outputs in this case) and try to find a mapping from a lower-dimensional latent space to the line elements. Several variants on this theme exist, which we explain in more depth in Section 2.5. No longer confined by the linearity of PCA, our models can handle a wider range of shapes. Moreover, Gaussian processes handle noise very well, even in low data regimes [15]. The GPLVM places a Gaussian process prior on the mapping from the latent to the observed space. By exploiting strong priors, we can significantly reduce the amount of data needed for equally accurate predictions.

In our approach, we can handle shapes (or surfaces) that fall in the categories described in [8–13] but also shapes that significantly deviate from these. For instance, a surface of revolution whose central axis is not a straight line or an imperfect plane with one of the corners slightly bent. In fact, we drop the strict classification and allow for shapes that can be seen as somewhere in between the categories. This makes our methods more appropriate for handling surfaces that can be found in nature or when modelling the human body. Moreover, our formulation can handle multiple types of subsurfaces at once. This means we can perform segmentation in the latent space.

For completeness, we mention that in recent years various deep learning techniques have been successfully introduced to discover objects in point clouds. A thorough overview can be found in [16] and more recently in [1,17]. These techniques vary from ordinary multilayer perceptron models (MLP) to convolutional- and graph-based models. Numerous datasets have been made public to encourage the field to develop new models further (e.g., ScanObjectNN [18], ShapeNet [19], ScanNet [20], KITTI Vision Benchmark Suite [21], ModelNet40 [22,23], ... ). Generally, these data-driven models are trained on more complex shapes: vehicles, urban objects, furniture, ... They are specifically designed for detecting obstacles, such as vehicles in autonomous driving, but not so much for accurate object reconstruction from detailed 3D scanning. In this work, we focus on the latter. Moreover, whenever a new shape has to be learned, the underlying model has to be trained again.

To summarise, for every point on a given point cloud, we can formulate a so-called line element. Dimensionality reduction on the set of line elements reveals the characteristics of the surface captured by that point cloud. Existing methods rely on PCA, which is a linear mapping. In contrast, our model is built on the Gaussian process latent variable model, which allows for non-linear mapping. This results in a more nuanced way of representing the surface. The main contributions of this work are the following:


All of our 3D models, sets of line elements, trained GPLVMs, notebooks with code and many more experiments and plots can be found in our GitHub repository (https://github.com/IvanDeBoi/Surface-Approximation-GPLVM-Line-Geometry, accessed on 10 January 2023).

The rest of this paper is structured as follows. In the next section, we give some theoretical background on line element geometry, kinematic surfaces, approximating the data, Gaussian processes and Gaussian process latent variable models in particular. The third section describes the results of our method applied to surface approximation, surface segmentation and surface denoising. Section four presents a discussion of our findings.

#### **2. Materials and Methods**

#### *2.1. Line Element Geometry*

In projective 3-space $\mathbb{P}^3$, a straight line **L** can be represented by a direction vector **l** and an arbitrary point *x* on that line. The so-called moment vector $\bar{\mathbf{l}}$ for that line with respect to the origin can be written as

$$
\overline{\mathbf{l}} = \mathbf{x} \times \mathbf{l}, \tag{1}
$$

where **x** are the coordinates of *x* in $\mathbb{R}^3$. The *Plücker coordinates* for a line are defined as $(\mathbf{l}, \bar{\mathbf{l}}) = (l_1 : l_2 : l_3 : l_4 : l_5 : l_6)$ [24]. These are homogeneous coordinates, meaning they are scale invariant. Notice that the scale factor is determined by the norm of the direction vector. Moreover, they are independent of the choice of *x*. Since we are not concerned about the orientation, $(\mathbf{l}, \bar{\mathbf{l}})$ and $(-\mathbf{l}, -\bar{\mathbf{l}})$ describe the same line, which also follows from the homogeneity.

For example, let a line **L** be spanned by two given points *x* and *y*, possibly at infinity. Following the notation in [24], we write the homogeneous coordinates for *x* and *y* as $(x_0, \mathbf{x})$ and $(y_0, \mathbf{y})$ respectively. Then, the homogeneous Plücker coordinates for **L** are found as

$$\mathbf{L} := (\mathbf{l}, \bar{\mathbf{l}}) = (\mathbf{x}\_0 \mathbf{y} - y\_0 \mathbf{x}, \mathbf{x} \times \mathbf{y}) \in \mathbb{R}^6. \tag{2}$$

Not every combination of six numbers yields a straight line. To do so, the following condition is necessary and sufficient:

$$\mathbf{l} \cdot \mathbf{\bar{l}} = 0.\tag{3}$$

This is called the *Grassmann–Plücker relation*. Plücker coordinates can also be regarded as homogeneous point coordinates in projective 5-space $\mathbb{P}^5$, where straight lines are points lying on a quadric given by the equation

$$l\_1 l\_4 + l\_2 l\_5 + l\_3 l\_6 = 0.\tag{4}$$

This quadric is called the *Klein quadric* and is denoted as $M_2^4$. The interpretation of points in projective 5-space has proved useful in a variety of line geometry applications [24].

Plücker coordinates of a line in R<sup>3</sup> can be extended to *line elements* by adding a specific point *x* on that line [13]. To do so, we start by choosing an orientation for the unit direction vector **l** of the line. A seventh coordinate *λ* is needed to locate *x* on that line, which can be defined as

$$
\lambda = \mathbf{x} \cdot \mathbf{l}.\tag{5}
$$

Notice that the norm of **l** matters, which is why we work with the (normalised) unit direction vector. This yields the seven-tuple $(\mathbf{l}, \bar{\mathbf{l}}, \lambda)$ of coordinates for a line element based on a line and a point, in which $\|\mathbf{l}\| = 1$ and $\mathbf{l} \cdot \bar{\mathbf{l}} = 0$.

Each point on a smooth surface Φ of a 3D volume has an outward unit normal vector **n**. For every point *x* on that surface, a line element can be defined as (**n**, **x** × **n**, **x** · **n**). These line elements constitute an associated surface Γ(Φ) in R7. An important property of many simple geometrical shapes in R<sup>3</sup> (planes, spheres, cones, ... ), is that their Γ(Φ) is contained in a linear subspace of R7. We will see in Section 2.2 that this aspect can be exploited in surface approximation, surface segmentation and surface denoising.
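As a concrete illustration, the seven coordinates of such a line element can be assembled from a point and its unit normal in a few lines; this is a sketch with our own variable names, assuming the normals have already been estimated.

```python
import numpy as np

def line_elements(points, normals):
    """Build line elements (n, x x n, x . n) in R^7 from surface points and normals."""
    x = np.asarray(points, dtype=float)                 # shape (N, 3)
    n = np.asarray(normals, dtype=float)                # shape (N, 3)
    n = n / np.linalg.norm(n, axis=1, keepdims=True)    # enforce unit direction vectors
    moment = np.cross(x, n)                             # l_bar = x x n
    lam = np.einsum('ij,ij->i', x, n)                   # lambda = x . n
    return np.hstack([n, moment, lam[:, None]])         # shape (N, 7)
```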

#### *2.2. Kinematic Surfaces*

Rigid body motions can be seen as a superposition of rotations and translations. These can be extended by adding a scaling, making them the family of *equiform motions,* also known as *similarities* [10]. Such a one-parameter motion *M*(*t*) is either a rotation, translation, a central similarity, a spiral motion or a combination of any of them. The velocity vector field of *M*(*t*) is constant (time-independent) and can be written as

$$\mathbf{v}(\mathbf{x}) = \overline{\mathbf{c}} + \gamma \mathbf{x} + \mathbf{c} \times \mathbf{x},\tag{6}$$

where **c**¯, *γ***x** and **c** × **x** are the translation, scale and rotation component of the velocity vector **v** at a point *x*. A curve undergoing an equiform motion forms an *equiform kinematic surface.*

As defined in [13], a *linear complex of line elements* is the set of line elements whose coordinates (**l**, ¯**l**, *λ*) satisfy the linear equation

$$
\vec{\mathbf{c}} \cdot \mathbf{l} + \mathbf{c} \cdot \vec{\mathbf{l}} + \gamma \lambda = 0,\tag{7}
$$

where $(\mathbf{c}, \bar{\mathbf{c}}, \gamma)$ is the coordinate vector of the complex. The following theorem from [25] shows the relation between linear complexes of line elements and equiform kinematics:

**Theorem 1.** *The surface normal elements of a regular C*<sup>1</sup> *surface in* R<sup>3</sup> *are contained in a linear line element complex with coordinates* (**c**, **c**¯, *γ*) *if and only if the surface is part of an equiform kinematic surface. In that case, the uniform equiform motion has the velocity vector field as given in Equation* (6)*.*

Here, we will give an overview of such motions *M*(*t*) and their corresponding surfaces Φ. For a thorough explanation of these (and multiple applications), we refer the reader to the works [8,10–13,25,26].

	- $\mathbf{c} = \mathbf{0}$, $\bar{\mathbf{c}} = \mathbf{0}$: *M*(*t*) is the identical motion (no motion at all).
	- $\mathbf{c} = \mathbf{0}$, $\bar{\mathbf{c}} \neq \mathbf{0}$: *M*(*t*) is a translation along $\bar{\mathbf{c}}$ and Φ is a cylinder (not necessarily of revolution).
	- $\mathbf{c} \neq \mathbf{0}$, $\mathbf{c} \cdot \bar{\mathbf{c}} = 0$: *M*(*t*) is a rotation about an axis parallel to **c** and Φ is a surface of revolution.
	- $\mathbf{c} \neq \mathbf{0}$, $\mathbf{c} \cdot \bar{\mathbf{c}} \neq 0$: *M*(*t*) is a helical motion about an axis parallel to **c** and Φ is a helical surface.
	- $\gamma \neq 0$, $\mathbf{c} \neq \mathbf{0}$: *M*(*t*) is a spiral motion and Φ is a spiral surface.
	- $\gamma \neq 0$, $\mathbf{c} = \mathbf{0}$: *M*(*t*) is a central similarity, and Φ is a conical surface (not necessarily of revolution).

Examples of these surfaces can be found in Figure 1.

This alternative way of describing surfaces as linear complexes of line elements opens up a new way of studying them, as explained below.

#### *2.3. Approximating the Complex*

Suppose a scanning process results in a set of points *X* (a point cloud). The aim is to determine the type of surface Φ on which these points lie. This knowledge would allow us to reconstruct the surface using its underlying geometrical properties. For instance, if we know our points result from the scan of a surface of revolution, we could determine the central axis, etc. So, we are interested in the (linear) complex (of line elements) that best describes the given points. Its coordinates $(\mathbf{c}, \bar{\mathbf{c}}, \gamma)$ determine the type of surface [13].

First, we calculate the unit normal vectors from the point cloud at every point. This topic has been very well documented in the literature. We refer the reader to [27] for a more in-depth discussion. For each *xi* in *X* with *i* = 1, 2, ... , *N* we obtain a unit normal vector **ni**. From these normal vectors and corresponding points, we calculate their line elements (**ni**, **xi** × **ni**, **xi** · **ni**).

Second, according to Equation (7), a complex with coordinates (**c**, **c**¯, *γ*) that best fits these line elements minimises

$$F(\mathbf{c}, \bar{\mathbf{c}}, \gamma) = \sum\_{i=1}^{N} (\bar{\mathbf{c}} \cdot \mathbf{l\_i} + \mathbf{c} \cdot \bar{\mathbf{l\_i}} + \gamma \lambda\_i)^2,\tag{8}$$

under the condition $\mathbf{c}^2 + \bar{\mathbf{c}}^2 + \gamma^2 = 1$. We follow the notation used in [12], in which $\mathbf{a}^2 = \mathbf{a} \cdot \mathbf{a}$. For this condition to make sense, we normalise our point cloud such that $\max\|\mathbf{x}_i\| \approx 1$. We also centre it around the origin. We can rewrite this as

$$F(\mathbf{c}, \overline{\mathbf{c}}, \gamma) = (\mathbf{c}, \overline{\mathbf{c}}, \gamma) M(\mathbf{c}, \overline{\mathbf{c}}, \gamma)^T,\tag{9}$$

where $M = \sum_{i=1}^{N} (\bar{\mathbf{l}}_i, \mathbf{l}_i, \lambda_i)^T (\bar{\mathbf{l}}_i, \mathbf{l}_i, \lambda_i)$. This is an ordinary eigenvalue problem. The smallest eigenvalue of *M* corresponds to an eigenvector $(\hat{\mathbf{c}}, \hat{\bar{\mathbf{c}}}, \hat{\gamma})$ which best approximates Equation (7) for the given $(\mathbf{l}_i, \bar{\mathbf{l}}_i, \lambda_i)$.
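A minimal sketch of this fit is given below: it builds *M* from the line elements (in the $(\bar{\mathbf{l}}, \mathbf{l}, \lambda)$ ordering used above) and takes the eigenvector of the smallest eigenvalue. The helper assumes line elements in the $(\mathbf{n}, \mathbf{x} \times \mathbf{n}, \mathbf{x} \cdot \mathbf{n})$ layout produced by the earlier sketch and is our own illustration, not the authors' implementation.

```python
import numpy as np

def fit_linear_complex(elements):
    """Best-fit complex (c, c_bar, gamma) minimising F in Equation (8)."""
    n, moment, lam = elements[:, :3], elements[:, 3:6], elements[:, 6:7]
    rows = np.hstack([moment, n, lam])        # rows are (l_bar, l, lambda)
    M = rows.T @ rows                         # M = sum_i row_i^T row_i
    eigvals, eigvecs = np.linalg.eigh(M)      # eigenvalues in ascending order
    best = eigvecs[:, 0]                      # eigenvector of the smallest eigenvalue
    c, c_bar, gamma = best[:3], best[3:6], best[6]
    return c, c_bar, gamma, eigvals
```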

Some surfaces are invariant under more than one one-parameter transformation [8,10– 13,25,26]. In that case, *k* small eigenvalues appear as solutions to Equation (8). The corresponding eigenvectors can be seen as a basis for a subspace in R7. We list the possibilities below:


Further examination of the coordinate (**c**, **c**¯, *γ*) determines the exact type of surface, as described in Section 2.2. Multiple examples, applications and variations on this theme can be found in above mentioned references.

Although this is a very elegant and powerful approach, some issues are discussed in the works listed above. First, this method is very sensitive to outliers. A solution proposed by the authors is to apply a RANSAC variant or to downweight the outliers by iteratively reweighting them, replacing *M* by

$$M = \frac{1}{\sum_i \sigma_i} \sum_{i=1}^{N} \sigma_i (\bar{\mathbf{l}}_i, \mathbf{l}_i, \lambda_i)^T (\bar{\mathbf{l}}_i, \mathbf{l}_i, \lambda_i). \tag{10}$$

This obviously results in longer computation times. Second, numerical issues can arise when calculating the eigenvalues (especially for planes and spheres). Third, some shapes do not fall into the classification of these simple geometric forms. This is certainly the case for organic surfaces that can be found in nature or when modelling the human body. Reconstructing a surface based on a simple geometric shape is obviously only valid if the surface resembles the shape well. Fourth, some shapes are either a combination or a composition of the elementary simple shapes (e.g., pipe constructions). In this case, the question arises of what constitutes a small eigenvalue and where to draw the line between steadily increasing values. Even though some guidance is given in the literature, these thresholds are often application-specific parameters.

Our approach provides a solution for these issues, by finding a representative lower dimensional latent space for the line elements in a more flexible non-linear way. This is no longer a linear subspace in R7.

#### *2.4. Gaussian Processes*

By definition, a Gaussian process (GP) is a stochastic process (a collection of random variables), with the property that any finite subset of its variables is distributed as a multivariate Gaussian distribution. It is a generalization of the multivariate Gaussian distribution to infinitely many variables. Here, we only give an overview of the main aspects. We refer the reader to the book [15] for a more detailed treatise.

Let a dataset $\mathcal{D} = \{X, \mathbf{y}\}$ consist of *n* observations, where $X = [\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_n]^T$ is an $n \times d$ matrix of *n* input vectors of dimension *d* and $\mathbf{y} = [y_1, y_2, \dots, y_n]^T$ is a vector of continuous-valued scalar outputs. These data points are also called training points. In regression, the aim is to find a mapping $f : \mathbb{R}^d \to \mathbb{R}$,

$$y = f(\mathbf{x}) + \epsilon, \quad \epsilon \sim \mathcal{N}(0, \sigma_\epsilon^2), \tag{11}$$

with $\epsilon$ being independent, identically distributed observation noise. In this work, this mapping is implemented by a Gaussian process. As stated above, a Gaussian process generalises the multivariate Gaussian distribution to infinitely many variables. Just like the multivariate Gaussian distribution is fully defined by its mean vector and covariance matrix, a Gaussian process is fully defined by its mean function $m(\mathbf{x})$ and *covariance function* $k(\mathbf{x}, \mathbf{x}')$. It is generally denoted as $f(\mathbf{x}) \sim \mathcal{GP}(m(\mathbf{x}), k(\mathbf{x}, \mathbf{x}'))$. The covariance function is parametrised by a vector of hyperparameters $\theta$. By definition, a GP yields a distribution over a collection of functions that have a joint normal distribution [15],

$$\begin{bmatrix} \mathbf{f} \\ \mathbf{f}_* \end{bmatrix} \sim \mathcal{N}\left( \begin{bmatrix} \mathbf{m}_X \\ \mathbf{m}_{X_*} \end{bmatrix}, \begin{bmatrix} \mathbf{K}_{X,X} & \mathbf{K}_{X,X_*} \\ \mathbf{K}_{X_*,X} & \mathbf{K}_{X_*,X_*} \end{bmatrix} \right), \tag{12}$$

where *X* are the input vectors of the *n* observed training points and *X*∗ are the *n*∗ input vectors of the unobserved test points. The mean vector for *X* is given by **μ***X*; likewise, the mean vector for *X*∗ is given by **μ***X*∗. The covariance matrices **K***X*,*X*, **K***X*∗,*X*∗, **K***X*∗,*X* and **K***X*,*X*∗ are constructed by evaluating the covariance function *k* at the respective pairs of points. In real-world applications, we do not have access to the latent function values; we only observe the noisy outputs **y**.

The conditional predictive posterior distribution of the GP can be written as:

$$\mathbf{f}\_\*|X, X\_\*, \mathbf{y}, \theta, \sigma\_\varepsilon^2 \sim \mathcal{N}(\mathbb{E}(\mathbf{f}\_\*), \mathbb{V}(\mathbf{f}\_\*)),\tag{13}$$

$$\mathbb{E}(\mathbf{f}\_\*) = \boldsymbol{\mu}\_{X\_\*} + \mathbf{K}\_{X\_\*,X} \left[ \mathbf{K}\_{X,X} + \sigma\_\varepsilon^2 I \right]^{-1} (\mathbf{y} - \boldsymbol{\mu}\_{X}), \tag{14}$$

$$\mathbb{V}(\mathbf{f}\_\*) = \mathbf{K}\_{X\_\*,X\_\*} - \mathbf{K}\_{X\_\*,X} \left[\mathbf{K}\_{X,X} + \sigma\_\varepsilon^2 I\right]^{-1} \mathbf{K}\_{X,X\_\*}.\tag{15}$$

The hyperparameters *θ* are usually learned by using a gradient-based optimisation algorithm to maximise the log marginal likelihood,

$$\log p(\mathbf{y}|\boldsymbol{\theta}, \boldsymbol{X}) \propto -\frac{1}{2} \left[ \mathbf{y}^T \left[ \mathbf{K}\_{\boldsymbol{X}, \boldsymbol{X}} + \sigma\_\varepsilon^2 I \right]^{-1} \mathbf{y} + \log \left| \mathbf{K}\_{\boldsymbol{X}, \boldsymbol{X}} + \sigma\_\varepsilon^2 I \right| \right],\tag{16}$$

which is a combination of a data fit term and a complexity penalty and, thus, automatically incorporates Occam's razor [15]. This guards the Gaussian process model against overfitting. In our experiments, we use BFGS, a quasi-Newton method described in [28]. The linear system solve [**K***X*,*X* + *σ*<sup>2</sup> *I*]<sup>−1</sup>**y** is often computed by first calculating the Cholesky factor *L* of [**K***X*,*X* + *σ*<sup>2</sup> *I*] and then solving

$$\left[\mathbf{K}\_{X,X} + \sigma\_\varepsilon^2 I\right]^{-1} \mathbf{y} = L^T \backslash \left(L \backslash \mathbf{y}\right). \tag{17}$$
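For readers who want to map Equations (14)–(17) onto code, the following minimal NumPy/SciPy sketch computes the predictive mean and covariance and the (unnormalised) log marginal likelihood for a zero-mean GP. The function names and argument layout (`gp_predict`, `K`, `K_s`, `K_ss`, `noise_var`) are our own illustrative choices, not part of GPy or any other library used in this work.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def gp_predict(K, K_s, K_ss, y, noise_var):
    """Predictive mean and covariance, Equations (14) and (15), for a
    zero-mean GP. K = K_{X,X}, K_s = K_{X*,X}, K_ss = K_{X*,X*}."""
    n = K.shape[0]
    # Cholesky factor of [K_{X,X} + sigma_eps^2 I], cf. Equation (17)
    L = cho_factor(K + noise_var * np.eye(n), lower=True)
    alpha = cho_solve(L, y)                 # [K + sigma^2 I]^{-1} y
    mean = K_s @ alpha                      # Equation (14) with zero mean
    v = cho_solve(L, K_s.T)                 # [K + sigma^2 I]^{-1} K_{X,X*}
    cov = K_ss - K_s @ v                    # Equation (15)
    return mean, cov

def log_marginal_likelihood(K, y, noise_var):
    """Log marginal likelihood up to an additive constant, Equation (16)."""
    n = K.shape[0]
    L = cho_factor(K + noise_var * np.eye(n), lower=True)
    alpha = cho_solve(L, y)
    log_det = 2.0 * np.sum(np.log(np.diag(L[0])))   # log|K + sigma^2 I|
    return -0.5 * (y @ alpha + log_det)
```

In practice, `log_marginal_likelihood` is the objective that is maximised over the hyperparameters *θ* with a quasi-Newton method such as BFGS, as described above.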

In the literature, many kernel functions have been extensively studied and reviewed. An overview can be found in [29]. A very popular kernel is the squared exponential kernel. It is suited for a wide range of applications because it is infinitely differentiable and, thus, yields smooth functions. Moreover, it only has two tunable hyperparameters. It has the form:

$$k\_{SE}(\mathbf{x}, \mathbf{x}') = \sigma\_f^2 \exp\left(-\frac{|\mathbf{x} - \mathbf{x}'|^2}{2l^2}\right),\tag{18}$$

in which *σ*<sup>2</sup><sub>*f*</sub> is a scale factor and *l* is the length-scale that controls how quickly the influence of the training points declines with distance. For the squared exponential kernel, the hyperparameters are *θ*<sub>*SE*</sub> = (*σ*<sup>2</sup><sub>*f*</sub>, *l*). For the function *k* : *X* × *X* → R to be a valid kernel, it must be positive semi-definite (PSD): for any finite set of inputs, the resulting kernel matrix *K* must satisfy **v**<sup>*T*</sup>*K***v** ≥ 0 for every vector **v**.
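As a quick numerical sanity check of the PSD requirement, one can inspect the smallest eigenvalue of a kernel matrix; the following sketch uses an illustrative tolerance of our own choosing.

```python
import numpy as np

def is_psd(K, tol=1e-9):
    """Check positive semi-definiteness of a kernel matrix via its
    smallest eigenvalue; K is symmetrised first to guard against round-off."""
    smallest = np.linalg.eigvalsh((K + K.T) / 2.0).min()
    return smallest >= -tol
```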

In this work, we implement a different length-scale parameter for every input dimension. This technique is called automatic relevance determination (ARD) and allows for functions that vary differently in each input dimension [29]. The kernel has the form:

$$k\_{SEARD}(\mathbf{x}, \mathbf{x}') = \sigma\_f^2 \exp\left(-\frac{1}{2} \sum\_{j=1}^d \left(\frac{\left|\mathbf{x}\_j - \mathbf{x}\_j'\right|}{l\_j}\right)^2\right),\tag{19}$$

in which *lj* is a separate length-scale parameter for each of the *d* input dimensions.
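As an illustration, Equations (18) and (19) can be evaluated with a few lines of NumPy; the function name and argument names below are our own and not those of any library used in this work.

```python
import numpy as np

def k_se_ard(X1, X2, sigma_f2, lengthscales):
    """Squared exponential kernel with ARD, Equation (19).
    X1: (n1, d), X2: (n2, d), lengthscales: (d,). Using the same
    length-scale for every dimension recovers Equation (18)."""
    diff = (X1[:, None, :] - X2[None, :, :]) / lengthscales   # scaled differences
    return sigma_f2 * np.exp(-0.5 * np.sum(diff ** 2, axis=-1))
```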

#### *2.5. Gaussian Process Latent Variable Models*

Principal component analysis (PCA) transforms a set of data points to a new coordinate system, in which the greatest variance is explained by the first coordinate (called the first principal component), the second greatest variance by the second coordinate, and so on. This reprojection of the data can be exploited for dimensionality reduction by dropping the components with the smallest associated variance; the result still contains most of the information of the original data. This method can also be cast as a statistical model known as probabilistic PCA (PPCA) [30], which implies that the principal components associated with the largest variance also maximise the likelihood of the data.

In dimensionality reduction, the representation of the original data by its retained principal components can be interpreted as a latent variable model, in which the *n* latent variables **X** = [**x**1, **x**2, ... , **x***n*]<sup>*T*</sup> are of a dimension *k* that is lower than the dimension *d* of the original data. PPCA requires a marginalisation of those latent variables and an optimisation of the mapping from the latent space to the data (observation) space. For *n* *d*-dimensional observations **Y** = [**y**1, **y**2, ... , **y***n*]<sup>*T*</sup> we can write

$$\mathbf{y}\_{i} = \mathbf{W}\mathbf{x}\_{i} + \boldsymbol{\epsilon}, \quad \boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \sigma\_{\varepsilon}^{2}\mathbf{I}), \tag{20}$$

in which **x***i* is a *k*-dimensional latent variable with *k* < *d*, **W** is a *d* × *k* matrix representing the mapping and **ε** is observation noise.

In [14,31], a dual approach is proposed by marginalising the mapping **W** and optimising the latent variables **X**. This approach is called the Gaussian process latent variable model (GPLVM) and is achieved by maximising the Gaussian process likelihood L with respect to the latent variables. We optimise

$$\mathcal{L}(\mathbf{X}) = -\frac{dn}{2}\log 2\pi - \frac{d}{2}\log|\mathbf{K}| - \frac{1}{2}\text{tr}(\mathbf{K}^{-1}\mathbf{Y}\mathbf{Y}^{\top}),\tag{21}$$

with respect to **X**. It is proved in [14] that this approach is equivalent to PCA when using a linear kernel to compose **K**, which can be written as

$$k\_{linear}(\mathbf{x}, \mathbf{x}') = \mathbf{x}^T \mathbf{x}'.\tag{22}$$

However, by choosing a nonlinear kernel, we can establish a nonlinear relationship between the latent variables **X** and the observations **Y**. This relationship can also be seen as placing a Gaussian process prior on each column of the *n*×*d* matrix **Y** and allows for a more flexible mapping between latent and data space.
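To make Equations (21) and (22) concrete, the sketch below evaluates the GPLVM objective for a given configuration of latent points. We assume, as is common, that the observation noise is folded into **K** as a diagonal term; the function and variable names are illustrative rather than taken from any library used here.

```python
import numpy as np

def gplvm_objective(X, Y, kernel, noise_var):
    """GPLVM log likelihood L(X) of Equation (21); K is built from the
    latent points X with a white-noise term for the observation noise."""
    n, d = Y.shape
    K = kernel(X, X) + noise_var * np.eye(n)
    _, logdet = np.linalg.slogdet(K)
    Kinv_Y = np.linalg.solve(K, Y)
    trace_term = np.sum(Kinv_Y * Y)          # equals tr(K^{-1} Y Y^T)
    return -0.5 * (d * n * np.log(2.0 * np.pi) + d * logdet + trace_term)

# With the linear kernel of Equation (22), maximising this over X recovers PCA;
# a non-linear kernel such as the SE-ARD kernel above gives the non-linear GPLVM.
linear_kernel = lambda A, B: A @ B.T
```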

In the original GPLVM, the unobserved inputs are treated as latent variables which are optimised. Another approach is to variationally integrate them out and compute a lower bound on the exact marginal likelihood of the non-linear latent variable model. This is known as the Bayesian Gaussian process latent variable model (BGPLVM) [32]. It is more robust to overfitting and can automatically detect the relevant dimensions in the latent space. These dimensions are characterised by a larger automatic relevance determination (ARD) contribution, which is the inverse of the length-scale *lj* in Equation (19). Every point in data space is a vector whose components are modelled by separate Gaussian processes, one per data dimension. The ARD contributions determine how strongly each latent (input) dimension influences the outcomes of these Gaussian processes. Less relevant dimensions obtain longer length-scales and are pruned out. In this work, we exploit this by performing this Bayesian non-linear dimensionality reduction on the seven-dimensional line elements. For most shapes in 3D, a lower-dimensional representation in a latent space can be found, as we show below.
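As a rough sketch of how such a model can be set up with the GPy library used in this work: the file name, latent dimensionality, number of inducing points and number of restarts below are placeholders, not the per-model settings reported in Table 1.

```python
import numpy as np
import GPy

# Line elements exported to .csv; "line_elements.csv" is a placeholder name.
Y = np.loadtxt("line_elements.csv", delimiter=",")          # shape (n, 7)

latent_dim = 7                                              # let ARD prune the excess
kernel = GPy.kern.RBF(latent_dim, ARD=True)                 # SE kernel with ARD

m = GPy.models.BayesianGPLVM(Y, latent_dim, kernel=kernel, num_inducing=50)
m.optimize_restarts(num_restarts=5, messages=False)

# ARD contribution of each latent dimension (inverse length-scales):
print(1.0 / m.kern.lengthscale.values)
```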

The GPLVM is a map from latent to data space. As such, it preserves local distances between points in the latent space. However, this does not imply that points close by in the data space are also close by in the latent space. To incorporate this extra feature, one can equip the GPLVM with a so-called back constraint [33]. This constraint is accomplished by introducing a second distance-preserving map from the data to the latent space. We refer to this model as the back-constrained Gaussian process latent variable model (BCGPLVM). A thorough review of GPLVM and its variants, including more than the ones mentioned here, can also be found in [34,35] and more recently in [36,37].

#### *2.6. Our Approach*

In this section, we explain how all of the above-mentioned concepts come together in our approach. To recap, we can represent a straight line in 3D space by Plücker coordinates, which are six-tuples. By adding a seventh component, we can specify a point on that line. We can do this for each of the *n* points in a given point cloud. The line we choose through each point is the normal line to the surface that is captured by the point cloud. We thus obtain a set of seven-dimensional line elements that captures the information about the surface we want to examine.
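A minimal sketch of this construction, assuming the point cloud is given as arrays of points and unit normals, and using the ordering (**l**, **l̄**, *λ*) with *λ* = **x** · **l** (consistent with Equation (23) below); the function name is ours.

```python
import numpy as np

def line_elements(points, normals):
    """7D line elements (l, l_bar, lambda) for a point cloud:
    l is the unit normal direction, l_bar = x x l is the moment vector,
    and lambda = x . l locates the point on the normal line."""
    l = normals / np.linalg.norm(normals, axis=1, keepdims=True)   # enforce ||l|| = 1
    l_bar = np.cross(points, l)                                    # moment of the normal line
    lam = np.sum(points * l, axis=1, keepdims=True)                # position along the line
    return np.hstack([l, l_bar, lam])                              # shape (n, 7)
```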

The theory of kinematic surfaces links the line elements that are contained in a linear line element complex to an equiform kinematic surface. Finding this complex comes down to solving an ordinary eigenvalue problem. The dimensionality of the linear subspace, in which the line elements live, and the resulting eigenvalues determine the type of surface. In essence, this is dimensionality reduction on the seven-dimensional line elements via PCA.

However, PCA is a linear mapping. In contrast, our model is built on the Gaussian process latent variable model (GPLVM), which allows for a non-linear mapping. This results in a more nuanced way of representing the surface. Our model is only given the seven-dimensional line elements and finds the mapping from a latent (unobserved) space to these line elements. Each of the seven dimensions of the line elements is assigned to a Gaussian process (GP). The outputs (predictions) of those GPs are the components of the line elements. The inputs (training data) are the latent points, which are treated as variables of the overall model and are optimised (or integrated out in the Bayesian formulation).

In the next section, we will describe in more detail how we compose the datasets and elaborate on our experiments.

#### **3. Results**

All 3D models, datasets, plots and trained GP models described below can be found in our GitHub repository. These include supplementary experiments and plots that we omitted here so as not to overload the text.

To assess the latent representation of various 3D shapes, we composed a collection of both synthetically generated point clouds and real-world scanned objects. An overview can be found in Table 1. The synthetically generated point clouds are based on objects drawn in the free and open-source Blender 3.3 LTS (https://www.blender.org/, accessed on 10 January 2023). Point clouds resulting from real-world scans were created on an Android mobile phone using the photogrammetry KIRI engine (https://www.kiriengine.com/, accessed on 10 January 2023) and imported into Blender, where they were cleaned up by dissolving disconnected points, removing the background and subsampling using standard Blender tools. However, they still contain some overlapping triangles and other mesh irregularities. Synthetic models were made noisy by first subdividing the mesh several times and then applying the Blender Randomize tool to the vertices. This breaks the lattice structure of the vertices and makes them resemble a real-world scan, where imperfections are inevitable. In this paper, we restrict ourselves to one noise level and leave the effect of the amount of noise on our point clouds as future work. The noiseless and noisy bend torus are the same as the ordinary torus variants with a Simple Deform modifier of 45° applied to them. All models are exported in the Polygon File Format (.ply), resulting in files consisting of points and unit normals for those points.

The line elements are calculated in Matlab R2020b and exported as comma-separated values (.csv). These serve as the data space for the GPLVM models, which are implemented using the Python GPy library (http://sheffieldml.github.io/GPy/, accessed on 10 January 2023). Some point clouds were subsampled uniformly for performance reasons. All GPLVM models were initialised with the results of a PCA. The BCGPLVM models were implemented with an MLP mapping with five hidden layers. The details are given in Table 1. All code for training the models, as well as the trained models themselves, is available via notebooks in the GitHub repository.

**Table 1.** An overview of the surfaces and their corresponding GPLVM and properties. Larger point clouds are subsampled uniformly to a smaller set. The number of points retained for training is given in # Train. Noise is the Gaussian noise variance hyperparameter for the GPLVM model, which is either a fixed value or a value that has to be learned along with the other hyperparameters. The number of inducing points for the Bayesian GPLVM is shown in # Ind. To make sure we do not end up in local minima, we restarted the training of the model a number of times given in # Restarts.


#### *3.1. Surface Approximation*

We trained a Bayesian Gaussian process latent variable model for all examples of the equiform kinematic surfaces listed in Section 2.2. As can be seen by the ARD contributions in Figure 2, these surfaces can be represented by a lower dimensional representation. These dimensions are characterised by the largest ARD contribution and, thus, the smallest length scales. A 2D plot of the surface points in their latent space is given by Figure 3. The Supplementary Material also includes plots in 3D latent space and provides PCA bar charts for comparison. They are omitted here not to overload this text with too many plots. These experiments were repeated for the noisy variants of the surfaces as well. The same underlying structure can be observed. Again, we refer to the Supplementary Material.

All surfaces show a clear structure in their latent space. Note that the number of small eigenvalues does not necessarily correspond to the number of relevant dimensions in latent space. The latter is the result of an optimisation in which both the latent points and the kernel hyperparameters are found. This can be seen in the ARD contribution plot for the helical surface. The plot for the cylinder of revolution even has a significant value for all seven dimensions. This effect can be thought of as overfitting [37], as the model attributes importance to more latent dimensions than needed. In our experiments, we tried to reduce this effect by making the model less flexible: we added a fixed noise term to the hyperparameters and lowered the number of inducing points. For details per surface, we refer to Table 1. All trained models are available in the Supplementary Material on the GitHub repository.

Another important remark is that the mapping from latent to data space is non-linear. Care must be taken when interpreting the 2D latent space plots. For instance, the spiral surface clearly has a one-dimensional subspace. However, the 2D plot shows a scattered cloud of points. This is an artefact of the visualization. The ARD plot indicates that only dimension 1 has a significant contribution. Another example is given by the cylinder without revolution. Its subspace in R<sup>7</sup> is one-dimensional, which manifests itself as a curve-like trail of points in the 2D latent space.

**Figure 2.** ARD contributions for the dimensions of the latent space for the examples of equiform kinematic surfaces. The panels include (**d**) cylinder without revolution, (**e**) cone without revolution, (**f**) surface of revolution, (**g**) helical surface and (**h**) spiral surface.

So far, nothing has been gained by this new Bayesian GPLVM way of representing surfaces. The difference from the approach described in Section 2.2 is that we are no longer restricted to the simple geometric surfaces of Figure 1 and their linear subspaces of R<sup>7</sup>. We can now also describe surfaces that do not fall into the categories listed above. We investigate two cases.

First, we apply our method to a bend torus. This is a surface of revolution, which we altered using the Simple Deform modifier in Blender to bend it 45° around an axis perpendicular to the axis of rotational symmetry of the original torus. This removes the rotational symmetry altogether. The results can be seen in Figure 4. We notice that the BGPLVM deemed only one dimension significant. The 2D plot reveals the latent structure.

**Figure 4.** Results for a bend torus. One latent dimension is found to be dominant.

Second, we look at the surface of the point cloud obtained by scanning a pear as described above. This is an organic shape, so it poses the same challenges as shapes found elsewhere in nature or when modelling the anatomy of humans and animals. We are only interested in the shape of the body, so we removed the stalk and the bottom part when cleaning up the 3D model. In this case, the 3D shape resembles a surface of revolution, but the axis is bent irregularly and the rotational symmetry is broken (not all normals intersect the axis of rotation). The results can be seen in Figure 5. The darker region in the 2D latent plot indicates more posterior uncertainty. In the latent space, we observe a set of points similar to what we saw for a cylinder of revolution, with an additional distortion in a third latent dimension.

**Figure 5.** Results for a real-world pear scanned with a mobile phone app.

Neither the bend torus nor the pear can be described by an equiform kinematic surface. Applying the methods of Section 2.2 (i.e., approximating by a linear complex of line elements) for classification is numerically still possible. A small set of eigenvalues can be found. However, their interpretation would be faulty. The bend torus shows one small eigenvalue, **c** = (0, 0, 0), *γ* = 0 and **c** · **c̄** ≈ 0. Unsurprisingly, these values fit a surface of revolution; they resemble the values for the torus or the torus with noise. However, blindly using the methods from [10] would report a perfect surface of revolution, which the bend torus is not. The same reasoning applies to the scanned pear's point cloud. Below, we show how to exploit our newfound GPLVM representation in surface approximation, surface segmentation and surface denoising.

#### *3.2. Surface Segmentation*

A major challenge in point cloud classification is the segmentation of sub-regions within the cloud. Once points are grouped together into simpler shapes, the underlying structure can be found via either our method or the methods described in [8,10–13,25,26]. In these works, several approaches are described for discovering the sub-regions; mostly, they are based on time-consuming, trial-and-error RANSAC. Here, we show that working in a latent space can be beneficial. The challenge is to group together points whose line elements show similar behaviour.

As we want to separate coherent groups of points in latent space, we care about their local distances: points that are close in the data space should also be close in the latent space. Therefore, we expand our GPLVM with a back constraint as described in Section 2.5. We implement both an RBF kernel with ARD and a multi-layer perceptron (MLP) mapping to capture the back constraint [33,37]. The details for the different 3D models can be found in Table 1. As before, all code is available in the GitHub repository. The notebooks also include 3D plots made with the Python open-source graphing library Plotly (https://plotly.com/python/, accessed on 10 January 2023), which allow user interaction such as 3D rotations. By rotating the viewpoint, we can clearly see how separable the latent points are.

To demonstrate our approach, we first designed three objects composed of different simpler geometric shapes. They can be found in Figure 6. The parts of these three models fall under the different categories described in Section 2.2. The aim of surface segmentation is to find those parts in an unsupervised manner.

**Figure 6.** Three synthetically generated 3D models by combining primitive surfaces. The BCGPLVM is able to show distinguishable structures for points in latent space.

First, we created a 3D model called Mixture 1, which consists of a cylinder and a cone, neither of which has rotational symmetry. Both of those shapes individually show one small eigenvalue and a clearly distinguishable curve in their 2D latent space, as can be seen in Figure 3. Combined, their latent space looks like two curves, shown in Figure 6. Notice that the 3D points that lie both on the cylinder and the cone fall into both categories. Moreover, their normals are inconsistent with either of the two shapes. For this cylinder, all normals are horizontal. For the cone, the normals of points on a line connecting the cone's apex and its generating curve are parallel. For points on the intersection of the cylinder and the cone, the normals are weighted averages over those of their neighbouring points. This results in a latent space that is not easily separable by clustering.

Second, the 3D model named Mixture 2 consists of a noisy cylinder of revolution where one end is closed by a demi-sphere. The former is characterised by two small eigenvalues and the latter by three. Again, this behaviour can be clearly observed in the latent space. Notice how the BCGPLVM forms latent shapes for each part that are consistent with the kinematic surfaces described in Section 2.2. For the sphere, we observe a 2D shape; for the cylinder of revolution, an annulus can be seen. The supplementary material includes an interactively rotatable 3D plot where this cloud of latent points can be observed in more detail. We also see that the region for the tip of the demi-sphere has a darker background in the 2D latent plot, indicating more uncertainty in this region of the posterior. This can be explained by the fact that the normals of a sphere all intersect at the centre of the sphere. As such, no two normals are parallel. This results in line elements whose vector components vary more. Points with normals that lie in parallel planes, as is the case for a cylinder, have more similarity in the direction components of their line elements. Moreover, the hyperparameters in the mapping from latent to data space are optimised globally, that is, for all latent points simultaneously. The strong structure in the cylinder part renders the large variations in the tip of the demi-sphere part less likely, hence the larger posterior variance.

Finally, in the 3D model Mixture 3, we grouped together the upper half of a sphere, a cone of revolution, a cone without revolution and a cylinder without revolution. These parts have three, two, one and one small eigenvalues, respectively. As this model consists of four different parts, the segmentation is more complex. Nonetheless, the BCGPLVM is able to find distinct substructures in the latent space, even in just two dimensions.

For a real-world and more challenging example, we scanned a metal hinge, as described above. It can be found in Figure 7. The original 3D model and the cleaned-up one can be found in the supplementary material. The 3D model is a collection of a cylinder of revolution, two planes and a cone-like aperture. It is important to notice that the scan itself is of poor quality, mainly due to the shininess of the metal and the lack of distinct features. There are holes and bumps in the surface, even after cleaning up the model in Blender. Moreover, the cone-like aperture does not have a lot of vertices (the region around the apex is completely missing). The latent space still shows the formation of clusters, especially when three dimensions are taken into account.

**Figure 7.** The results for a real-world scanned metal hinge. Again, the BGPLVM is able to separate the points in latent space.

Once a latent space is found, the segmentation can be performed via either a manual selection of the latent points or a form of unsupervised learning. In the case of separable clusters, we can apply the well-studied k-means clustering algorithm or draw (hyper)planes determined by support vector machines (SVM). The details of these are outside the scope of this work; the reader is referred to [38] and [39], respectively. This segmentation can then be the basis for fitting simple geometric surfaces to each cluster of points. As we can observe from the plots, some of the latent points do not belong to any of the found substructures. In practice, these can be ignored or filtered out, leaving enough points to perform the best fit. Afterwards, we can determine whether or not such a rogue point belongs to the best-fit subsurface.
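As an illustration of the clustering step, the latent coordinates of a trained (back-constrained) model can be passed directly to an off-the-shelf k-means implementation; the names below are ours and the number of clusters is assumed to be known in advance.

```python
from sklearn.cluster import KMeans

def segment_latent(latent_points, n_clusters):
    """Group latent points into candidate sub-surfaces with k-means.
    latent_points: (n, q) array of latent coordinates."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    labels = km.fit_predict(latent_points)
    return labels, km.cluster_centers_

# e.g. for Mixture 2 (cylinder of revolution closed by a demi-sphere):
# labels, centres = segment_latent(latent_means, n_clusters=2)
```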

#### *3.3. Surface Denoising*

In general, a Gaussian process can handle noise very well, even in low-data regimes [15]. This means our technique is beneficial for denoising point clouds. Once a mapping is found from a latent space to a data space, it can be queried to predict new points in the data space. This can be used to handle missing data [36,37]. Here, we take advantage of this feature by correcting noisy points in the lower-dimensional latent space and predicting their counterparts in the data space. The smooth mapping allows us to re-predict the line elements for every latent point.

From a predicted line element (**l**, **l̄**, *λ*), with ‖**l**‖ = 1, we can calculate the corresponding 3D coordinate **x** of a point *x* using

$$\mathbf{x} = \mathbf{l} \times \mathbf{\bar{l}} + \lambda \mathbf{l}. \tag{23}$$
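A sketch of this back-projection, assuming the same (**l**, **l̄**, *λ*) ordering as above; the predicted directions are re-normalised because the GP outputs need not have exactly unit norm.

```python
import numpy as np

def line_elements_to_points(elements):
    """Recover 3D coordinates from (l, l_bar, lambda) line elements
    via Equation (23): x = l x l_bar + lambda * l."""
    l, l_bar, lam = elements[:, :3], elements[:, 3:6], elements[:, 6:]
    l = l / np.linalg.norm(l, axis=1, keepdims=True)   # re-normalise predicted directions
    return np.cross(l, l_bar) + lam * l
```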

To demonstrate this, we again work on the bend torus model. We introduce random noise with the Blender Randomize tool and select a hundred vertices at random, which we translate to simulate shot noise. The results can be seen in Figure 8. The 3D model, the .ply file with the point coordinates and unit normal vectors, the .csv file with the line elements and the notebook with the executed code for the BGPLVM can be found in the supplementary material. Once the BGPLVM is trained on the noisy point cloud, we use it to predict line elements for the latent points. From these line elements, we extract 3D coordinates for the points via Equation (23). We observe that the BGPLVM is able to smooth out the translated vertices. This approach can also be used to detect and remove outliers.

**Figure 8.** A bend torus: (**a**) the deformed bend torus, with noise added to the entire surface and a hundred vertices translated; (**b**) the denoised bend torus, where the BGPLVM is able to smooth out the surface. Blue are the noisy points; red are the denoised points.

#### **4. Discussion**

This work presented the first findings for this new GPLVM approach to describe 3D surfaces. In this manuscript, we wanted to focus on the theoretical principles themselves and not overload the paper with additional research questions that determine the limits of this idea. Even though these are both interesting and important in real-world applications, we leave them for future work.

We have shown surface segmentation for surfaces that are combinations of a few different simpler geometrical shapes. The question remains of how many sub-regions can be detected and how complex those regions can be.

We presented the Bayesian GPLVM and the GPLVM with back constraints. There are more variations on this topic investigated in the literature. A recent paper describes a generalised GPLVM with stochastic variational inference [37]. The authors also present models for applying these techniques to larger datasets, which would be most applicable to the larger point clouds often obtained in real-world applications.

A line element is formed by a line and a point on that line. By working with normal lines for points on a surface, we effectively introduced a second so-called *view* for those points, where we follow the terminology used in [34,36]. Those works present a multi-view unsupervised learning technique called manifold relevance determination (MRD), which offers another worthwhile approach.

The seven-tuples predicted by the model do not automatically satisfy the Grassmann–Plücker relation in Equation (3) between their direction and moment vectors. This leads to faulty line elements. In other words, the first six components of a line element vector are the Plücker coordinates of the line on which the point of the line element lies, but not every six-tuple represents a straight line in P3. In general, a *screw centre C* can be written as (**c**, **c̄**). The pitch of *C* is defined as

$$\rho = \frac{\mathbf{c} \cdot \overline{\mathbf{c}}}{\|\mathbf{c}\|^2}. \tag{24}$$

This only holds when **c** is not the zero vector; otherwise, *C* would be a line at infinity. The pitch can be thought of as the deviation of the screw from a perfectly straight line. We can always write *C* as

$$\mathbf{C} = (\mathbf{c}, \overline{\mathbf{c}} - \rho \mathbf{c}) + (\mathbf{0}, \rho \mathbf{c}) = A + (\mathbf{0}, \rho \mathbf{c}), \tag{25}$$

in which *A* is called the *Poinsot* or *central axis* of the screw centre *C*. The term (**0**, *ρ***c**) represents the line at infinity where the planes perpendicular to *A* meet. Since *A* does satisfy the Grassmann–Plücker relation, it is a straight line in P3. This allows us to correct the six-tuples predicted by six distinct Gaussian processes into straight lines via

$$A = C - (\mathbf{0}, \rho \mathbf{c}). \tag{26}$$
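A sketch of this correction for a single predicted six-tuple; the tolerance guarding against **c** ≈ **0** is an illustrative choice, not a prescribed value.

```python
import numpy as np

def correct_to_line(c, c_bar, tol=1e-12):
    """Replace a predicted screw centre (c, c_bar) by its central axis,
    Equations (24)-(26), so that the Grassmann-Pluecker relation c . c_bar = 0 holds."""
    norm2 = np.dot(c, c)
    if norm2 < tol:                       # c ~ 0: a line at infinity, no pitch defined
        return c, c_bar
    rho = np.dot(c, c_bar) / norm2        # pitch, Equation (24)
    return c, c_bar - rho * c             # central axis A = C - (0, rho c)
```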

This approach is taken in [40]. Another way to ensure the Grassmann–Plücker relation is given in [41], where the constraints are built into the kernel functions of the Gaussian processes themselves, although at a considerable extra computational cost. A representation that does not suffer from this hurdle is the stereographic projection of a line [24]. In this approach, the line is made to intersect two arbitrary parallel planes, and only the 2D coordinates on those planes are kept as data points. Predictions on those planes, one point on each, are then used to calculate the Plücker coordinates of the predicted line. By doing so, the Grassmann–Plücker relation is always satisfied. The problem here is that a line parallel to the two planes cannot be captured, and a line close to parallel to the two planes also causes numerical issues. As 3D surfaces can have normals in any direction, this latter approach is not recommended in a general setting.

#### **5. Conclusions**

We provided a theoretical introduction to kinematic surfaces and showed how they could be used to perform surface detection. Many simple geometric shapes manifest themselves as linear subspaces of line or line element space. This approach is limited by the linearity of the underlying eigenvalue problem. We expanded on this by reformulating this as a probabilistic non-linear non-parametric dimensionality reduction technique known as the Gaussian process latent variable model. We showed how this could be applied to many simple geometric surfaces, as well as surfaces that do not fall into any of these categories. Moreover, we showed the benefits of unsupervised surface segmentation and surface denoising. We presented findings on synthetically generated surfaces and scanned real-world objects.

The main goal of the current study was to determine the feasibility of applying the Gaussian process latent variable model to line element geometry. Even though several experiments are explained, and several more are included in the supplementary material, considerably more work will need to be done to determine the limits of this method. For instance, it remains an open question how noise affects the overall representation in the latent space. Moreover, we did not implement any optimizations on the training part of the underlying models, which is paramount for real-world settings. We leave this as future work.

Another natural progression of this work is to further exploit the found latent space in the case of missing data. Point clouds sometimes have missing regions, caused by bad lighting conditions, occluded areas or areas that simply cannot be reached by the scanning device. Finding the 3D coordinates for the missing points is a classic example of the missing data problem. In our case, it manifests itself as a region in the latent space that is missing values. If the structure found in the latent space is enough to reconstruct those missing latent points, then the corresponding data-space points can also be inferred by the Gaussian process latent variable model.

More broadly, we plan to study the benefits of working on latent spaces not just for line elements, but for the lines themselves (in which case, we drop the point on the line and only keep the description of the line). This technique could be used in the calibration of various devices that are built on the usage of straight lines. Cameras, for instance, produce images in which a pixel is the result of a single incoming ray of light. Another example is galvanometric laser scanners, which guide a laser beam by means of two rotating mirrors. Calibrating such a device means finding the relationship between the two angles of the mirrors and the outgoing beam. So, in this case, a 2D latent space must exist. This would be a fruitful area for further work.

**Supplementary Materials:** All of our 3D models, sets of line elements, trained GPLVM, notebooks with code and many more experiments and plots can be found on our GitHub repository https://github.com/IvanDeBoi/Surface-Approximation-GPLVM-Line-Geometry (accessed on 10 January 2023).

**Author Contributions:** Conceptualisation, I.D.B., C.H.E. and R.P.; methodology, I.D.B., C.H.E. and R.P.; software, I.D.B.; validation, I.D.B.; formal analysis, I.D.B. and R.P.; investigation, I.D.B.; resources, I.D.B.; data curation, I.D.B.; writing—original draft preparation, I.D.B.; writing—review and editing, I.D.B., C.H.E. and R.P.; visualisation, I.D.B.; supervision, R.P.; project administration, I.D.B.; funding acquisition, R.P. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research has been funded by the University of Antwerp (BOF FFB200259, Antigoon ID 42339).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** All of our 3D models, sets of line elements, trained GPLVM, notebooks with code and many more experiments and plots can be found in the Supplementary Materials.

**Acknowledgments:** We thank the anonymous reviewers whose comments helped improve the quality of this paper.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
