2. Data Analysis from a Mathematical Point of View
Geometry is the branch of mathematics that studies the shapes, sizes, positions, and dimensions of objects in space. Its concepts have been studied for thousands of years, and its principles and formulas are still used today in a wide range of applications. Through the study of geometry, we gain a deeper understanding of the world around us and develop problem-solving skills that apply in numerous contexts. Algebraic geometry combines algebra and geometry to study the solutions of polynomial equations: it deals with geometric objects defined by algebraic equations and seeks to understand their properties and structures. The focus is on the geometric shapes that are solutions to polynomial equations rather than on the specific numerical values of those solutions. The field has numerous applications, including in physics, computer science, and cryptography, and its study requires a solid grounding in abstract algebra, topology, and complex analysis. With its focus on the relationship between algebra and geometry, algebraic geometry has been instrumental in advancing modern mathematics and theoretical physics [2,3].
Without getting into debates about the philosophy of mathematics, it is generally agreed that mathematics is not an exception to the scientific method: the appearance of certain data, whether from physical reality, mathematical reality, or the plane of ideas, leads to a number of conjectures, which are then proved or disproved through further analysis. Until the introduction of the first computers, this entire procedure was traditionally carried out by hand and in the mind.
An interesting approach can be found in [4], from the mathematics mechanization point of view. This paper reviews the state of the art in developing symbolic algorithms for manipulating mathematical objects, aided by computers or artificial intelligence. These methods enable the automated proving or discovery of geometry theorems. Moreover, they always work in the symbolic realm, leaving aside the analysis of numerical datasets arising from geometry-related problems. The Automated Deduction in Geometry conferences [5] are a valuable repository of work in this direction. Nevertheless, as the latest advances in machine learning and artificial intelligence (AI) bring us closer to achieving universal AI, this will mark a significant leap in knowledge discovery in mathematics.
Given that mathematics is the language through which nature is expressed [6], it is not unexpected that there is a significant crossover between mathematics and physics. The works of Kepler, Newton, Fermat, Gauss, and, more recently, Einstein come to mind. In each of them, we see a collection of tables or numerical records from which the mathematical expressions that support, clarify, or model the facts at hand must be deduced. All of this was performed manually before the invention of the computer.
However, computers are also able to produce huge datasets representing a theoretical or practical problem. This is the crucial point: [7,8] are just two examples of the enormous data sets that frequently emerge in mathematics as a result of numerical simulations. Any numerical mathematical simulation generates a great deal of data that, unlike real process data, is exact: it contains no noise at all. An example is the numerical construction of rational curves modeling a specific problem [9]. These days, it is usual to use computers for such tasks, given the low cost of storage and the computing capability of current CPUs and GPUs.
Significant contributions to these developments have also come from theoretical physics and cosmology. On the one hand, some of their problems are conceptually and formally treated from a purely mathematical perspective; on the other hand, numerical simulations have been crucial to scientific advancement almost since the arrival of the computer.
Symmetry is a critical concept in all physical theories, and it is described using mathematical groups. Understanding a theory’s symmetry greatly simplifies calculations and aids in developing intuition [10,11,12]. The theory of relativity was a groundbreaking paradigm shift in the way space and time were perceived. Reference frames were related through their relative velocities, and the speed of light was established as an unbreakable upper limit. As the theory evolved into general relativity, incorporating accelerating objects and massive bodies, it became clear that gravity is rooted in the geometry of spacetime [13,14,15].
Another hot topic in theoretical physics is superfluid vacuum theory (SVT). The vacuum, which appears to be empty, would actually contain a superfluid that permeates the entire cosmos. It is hypothesized that this superfluid, which has peculiar characteristics such as zero viscosity and limitless compressibility, is the cause of a variety of phenomena, including the presence of mass and the functioning of gravity. The scientific community still views SVT as a novel and contentious concept, and much research and discussion is being conducted to examine its potential applications and veracity.
Fundamental particles are promoted to fields in specific representations of the Lorentz group, resulting in quantum field theory, which incorporates the key concepts of special relativity into quantum theory [16,17,18,19,20].
Quantum computing is an advancement in both physics and mathematics. It is a breakthrough area that merges the ideas of quantum physics and computer technology in order to tackle complicated problems at previously unheard-of speeds, and it has the potential to transform many industries, including artificial intelligence, drug development, and cryptography. The development and use of quantum computing rely heavily on mathematics, from creating quantum algorithms to establishing the mathematical framework that characterizes quantum systems. By providing the required tools and insights, mathematicians have considerably aided the development of the field. The connection between mathematics and quantum computing is examined in [21,22,23,24], which highlight some of the important mathematical concepts used in quantum computing.
We conclude that the vast majority of scientific disciplines rely heavily on data and mathematics. It is inevitable that one of the most well-known fields of our era, machine learning, will be confronted with massive mathematical data sets. When we have enormous data sets, the output of simulations, or the result of assigning numerical values to a problem, we can consider a variety of approaches. Every research effort has a target, whether it is to investigate particular high-level concepts or to answer particular questions. Exploratory questions may aim to find unusual data trends or locate anomalous records. Confirmatory questions, on the other hand, are more focused and involve tasks such as identifying group differences or monitoring attribute changes over time.
Intelligent data analysis finds its roots in various disciplines, but statistics and machine learning are arguably the most significant [1,25]. Although statistics is the older of the two, the emergence of machine learning has added a distinct culture, with interests, emphases, aims, and objectives that diverge from those of statistics. This divergence has created a creative tension between the two disciplines at the core of intelligent data analysis, leading to the development of innovative data analytic tools. Despite their differences, both statistics and machine learning contribute to the advancement of intelligent data analysis.
There are different types of machine learning algorithms, such as supervised learning, unsupervised learning, and reinforcement learning, and each is used in different contexts and for different purposes [1]. Supervised learning is a type of machine learning in which a labeled data set is used to train the algorithm. Labeled data are data that have already been “tagged” with the correct answers, and they are used to train the algorithm to make predictions or decisions. Supervised learning is useful when we have labeled data available and want to predict a specific outcome or make a decision based on those data. In a way, we incorporate prior human knowledge and experience of the problem into its solution: the algorithm has all that knowledge as a starting point, and it evolves and learns from what we have provided. Examples of supervised algorithms are linear regression, logistic regression, neural networks, and support vector machines (SVMs).
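To make the supervised setting concrete, the following is a minimal sketch in Python with scikit-learn, on synthetic toy data of our own (not from any of the cited works): points labeled by whether they lie outside the unit disk train an SVM that then classifies unseen points.

```python
# Minimal supervised-learning sketch: labeled 2D points train an SVM
# classifier, which is then evaluated on held-out points (toy data).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(500, 2))               # sample points in the plane
y = (np.linalg.norm(X, axis=1) > 1.0).astype(int)   # label: outside unit disk?

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = SVC(kernel="rbf").fit(X_train, y_train)       # learn from labeled data
print("test accuracy:", clf.score(X_test, y_test))  # generalization estimate
```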
Unsupervised learning is a type of machine learning in which no labeled data are provided to the algorithm. Instead, the algorithm is trusted to discover patterns and relationships in the data on its own. Unsupervised learning is useful when no labeled data set is available and we want to discover patterns and trends in the data. However, unsupervised learning does not allow us to make specific predictions or decisions based on the data, as no correct answers are provided to train the algorithm. Typical unsupervised algorithms are clustering algorithms and dimensionality reduction algorithms.
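As a minimal unsupervised counterpart (again on synthetic toy data), k-means groups unlabeled points purely by proximity, with no correct answers supplied:

```python
# Minimal unsupervised-learning sketch: k-means clusters unlabeled points.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Two unlabeled blobs of 2D points; the algorithm is not told there are two.
X = np.vstack([rng.normal(0, 0.3, (100, 2)), rng.normal(3, 0.3, (100, 2))])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print("cluster sizes:", np.bincount(labels))  # groups found by the algorithm itself
```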
In semi-supervised learning, the data set contains both labeled and unlabeled data. Typically, the amount of unlabeled data is much larger than the number of labeled examples. The goal of a semi-supervised learning algorithm is the same as that of a supervised algorithm. The idea is that using the unlabeled data in addition to the labeled data allows the algorithm to find a better model.
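A hedged sketch of this setting, assuming scikit-learn’s semi_supervised module: most labels are hidden (marked -1), and self-training lets a base classifier exploit the unlabeled points as well.

```python
# Minimal semi-supervised sketch: only ~5% of the labels are kept, and
# SelfTrainingClassifier pseudo-labels the rest during training (toy data).
import numpy as np
from sklearn.datasets import make_moons
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.svm import SVC

X, y = make_moons(n_samples=400, noise=0.1, random_state=0)
rng = np.random.default_rng(0)
y_partial = y.copy()
y_partial[rng.random(len(y)) < 0.95] = -1        # -1 marks "unlabeled"

base = SVC(kernel="rbf", probability=True)       # base model must expose probabilities
model = SelfTrainingClassifier(base).fit(X, y_partial)
print("accuracy against the true labels:", model.score(X, y))
```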
We may also come across reinforcement learning algorithms. In these circumstances, the algorithm is immersed in the problem and may observe the problem’s or the environment’s state, which is encoded as a feature vector. The algorithm can then act on each state and change it. It is rewarded (or punished) based on the outcome of the action, depending on whether it leads to a better or desired state or not. The algorithm’s goal is to learn an action strategy (a policy) that solves the problem.
3. Machine Learning Techniques for Geometry
Machine learning and geometry are two fields that have become increasingly interconnected in recent years. Geometry provides a powerful toolset for understanding and analyzing the structure of data, while machine learning algorithms offer a framework for processing and making predictions based on those data. As a result, machine learning has found many applications in geometry, and geometric methods have become increasingly important in machine learning [8,26,27].
Geometry has been employed in machine learning in a variety of ways, including the development of novel algorithms [7,28]. Geometric algorithms, for example, have been used to build clustering approaches that group similar data points together based on their geometric qualities. They have also been used to create classification algorithms, which use geometric features to assign new data points to one of several pre-defined categories.
Another important application of geometry in machine learning is the analysis of high-dimensional data [29,30,31,32]. High-dimensional data are common in many machine learning applications, but they can be difficult to understand and analyze using traditional statistical methods. Geometric methods provide a way to represent and analyze high-dimensional data that is more intuitive and interpretable. Geometric deep learning is another area of research where machine learning and geometry intersect [33,34,35]. In geometric deep learning, the goal is to develop deep learning algorithms that can operate directly on geometric structures such as graphs, point clouds, and meshes. This approach has demonstrated potential in a number of applications, including 3D object identification and drug development.
Finally, machine learning has also been used to advance the field of geometry itself. Machine learning algorithms have been used to automate the generation of geometric models, to predict geometric properties of materials, and to develop new geometric optimization algorithms [7].
One example of a supervised learning algorithm used in geometry is the support vector machine (SVM) (Figure 2). SVMs are a popular tool for classification and regression tasks. They work by finding the hyperplane that maximally separates two classes in a high-dimensional space (see Figure 3) [1]. An application of SVMs can be found in [36]. In this paper, the authors use the database of weighted-P4s that admit Calabi–Yau 3-fold hypersurfaces. This is a classic problem in string theory, closely related to algebraic geometry, that has been tackled with machine learning tools, mainly due to the existence of numerical datasets [37,38]. Unsupervised techniques identified an unanticipated, almost linear dependence of the topological data on the weights, which in turn revealed a previously unnoticed clustering in the Calabi–Yau data. Supervised techniques were successful in predicting the topological parameters of the hypersurface from its weights with an accuracy of $R^2 > 95\%$. Supervised learning also allowed the authors to identify weighted-P4s admitting Calabi–Yau hypersurfaces with 100% accuracy by making use of a partitioning supported by the clustering behavior.
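The geometric picture behind Figure 3 can be reproduced on toy data of our own (not the Calabi–Yau database of [36]): a linear SVM exposes the separating hyperplane $w \cdot x + b = 0$ and the margin it maximizes.

```python
# Sketch of the maximal-margin hyperplane: a linear SVM separates two
# synthetic point clouds and reports the hyperplane and margin it found.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(-2, 0.5, (50, 2)), rng.normal(2, 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

clf = SVC(kernel="linear").fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]
print(f"hyperplane: {w[0]:.2f}*x1 + {w[1]:.2f}*x2 + {b:.2f} = 0")
print("margin width:", 2 / np.linalg.norm(w))   # the quantity the SVM maximizes
```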
Another example of a supervised learning algorithm used in geometry is the convolutional neural network (CNN). CNNs are a type of deep learning algorithm that use layers of convolutional filters to extract features from images or other spatial data [39]. In geometry, CNNs can be used to segment images of geometric shapes or to recognize patterns in point clouds [40]. For example, a CNN could be trained to identify the boundaries between different regions of a 3D surface [41,42,43,44]. CNNs have also been applied to the amoebae problem in [45]. Amoebae, introduced in [46], are regions in $\mathbb{R}^n$ with several holes and straight narrowing tentacles reaching to infinity, constructed from polynomials in $n$ complex variables. Amoebae from tropical geometry and the Mahler measure from number theory play important roles in quiver gauge theories and dimer models.
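For concreteness, here is a minimal CNN sketch in PyTorch (our own illustration, unrelated to the amoebae experiments of [45]): stacked convolutional filters extract spatial features from small grayscale shape images, and a linear layer classifies them.

```python
# Minimal CNN for 32x32 grayscale shape images with, say, 3 classes.
import torch
import torch.nn as nn

class ShapeCNN(nn.Module):
    def __init__(self, n_classes: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                             # 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                             # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(32 * 8 * 8, n_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = ShapeCNN()
dummy = torch.randn(4, 1, 32, 32)   # a batch of four images
print(model(dummy).shape)           # torch.Size([4, 3]): one score per class
```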
Regression tasks in geometry are also an example of supervised learning. In fact, the aforementioned amoebae problem has also been tackled with regression in [47]. The dependencies of amoebae and of the Mahler measure on the coefficients of the Newton polynomial closely resemble each other, and the two are connected via the Ronkin function. Genetic symbolic regression methods are employed to extract the numerical relationships between the 2D and 3D amoebae components and the Mahler measure.
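The flavor of genetic symbolic regression can be sketched with the gplearn package (an assumption on our part; [47] may use different tooling) on toy data: an evolutionary search looks for a closed-form expression fitting the samples.

```python
# Hedged sketch of genetic symbolic regression on synthetic data.
import numpy as np
from gplearn.genetic import SymbolicRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (200, 2))
y = X[:, 0] ** 2 - X[:, 1] + 0.5    # hidden target relationship to rediscover

est = SymbolicRegressor(population_size=500, generations=10, random_state=0)
est.fit(X, y)
print(est._program)                 # best evolved symbolic expression
```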
Another common unsupervised learning problem in geometry is clustering, where the target is to group data points together based on some measure of similarity. This can be useful for tasks such as image segmentation or identifying patterns in complex data sets. A further popular algorithm is principal component analysis (PCA), which finds the directions of greatest variance in the data and projects the data onto those directions, effectively reducing the dimensionality of the data. Berman, in [36], analyzes the fundamentals of the dataset using PCA, topological data analysis (TDA), and other unsupervised machine learning methods, as a preliminary stage before applying supervised machine learning methods.
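As a minimal PCA sketch (synthetic data of our own, not the dataset of [36]): points in $\mathbb{R}^{10}$ that secretly live near a plane are projected onto their two directions of greatest variance.

```python
# Minimal PCA sketch: recover a 2D structure hidden in 10D data.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
basis = rng.normal(size=(2, 10))                 # a random 2D plane in R^10
X = (rng.normal(size=(200, 2)) @ basis
     + 0.01 * rng.normal(size=(200, 10)))        # points on it, plus noise

pca = PCA(n_components=2).fit(X)
print("explained variance ratio:", pca.explained_variance_ratio_)
X_2d = pca.transform(X)   # low-dimensional representation for later analysis
```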
Unsupervised learning has demonstrated success in generative models, which aim to generate new data similar to the training data. According to [48], one prominent strategy is generative adversarial networks (GANs), in which two neural networks are trained concurrently: one generates new data, and the other attempts to distinguish between the generated data and the real data. This results in a feedback loop in which the generator learns to produce increasingly realistic data and the discriminator improves its ability to discern between real and fake data.

In geometry, GANs have been used to generate 3D shapes and textures, as well as to interpolate between different shapes. For example, GANs have been used to generate realistic 3D models of chairs, cars, and other objects, which can be useful in fields such as architecture and product design. They have also been used to interpolate between different shapes, allowing the creation of new shapes that are similar to existing ones but with variations that may not have been designed manually. More complex approaches related to geometry can be found in [49,50].
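A stripped-down GAN loop in PyTorch conveys the feedback described above, on a toy task far simpler than 3D shape synthesis (our own illustration): the generator learns to emit 2D points on the unit circle while the discriminator tries to tell them from real samples.

```python
# Minimal GAN: generator G maps noise to 2D points; discriminator D scores them.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(2000):
    theta = torch.rand(64, 1) * 2 * torch.pi
    real = torch.cat([theta.cos(), theta.sin()], dim=1)  # real: unit circle
    fake = G(torch.randn(64, 8))                         # fake: from noise

    # Discriminator step: push real toward 1, fake toward 0.
    loss_d = (bce(D(real), torch.ones(64, 1))
              + bce(D(fake.detach()), torch.zeros(64, 1)))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator step: try to make the discriminator output 1 on fakes.
    loss_g = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

print(G(torch.randn(3, 8)).detach())  # samples should lie near the unit circle
```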
Reinforcement learning has also been applied in geometry, e.g., in the optimization of geometric shapes. Given a set of points, an algorithm can use reinforcement learning to find the shape that maximizes a certain criterion, such as the area or perimeter. The algorithm starts with an initial guess of the shape and then iteratively modifies it based on the feedback received from the environment. The feedback could be the value of the criterion, or a measure of the distance between the shape and a target shape. Another application of reinforcement learning in geometry is the discovery of new mathematical structures. An algorithm could learn to generate graphs that satisfy certain properties, such as being planar or having a certain degree distribution. The algorithm starts with a random graph and then iteratively modifies it based on the feedback received from the environment, for instance the value of a metric that measures how well the graph satisfies the desired properties. Some applied examples can be found in [51,52].
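The reward-feedback loop described above can be caricatured in a few lines (a stripped-down, reward-driven local search rather than a full RL agent; toy setup of our own): starting from a random polygon with fixed perimeter, vertex moves are kept whenever the environment rewards them with a larger enclosed area.

```python
# Reward-guided search for the polygon of maximal area at fixed perimeter.
import numpy as np

def area(p):                         # shoelace formula
    x, y = p[:, 0], p[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))

def perimeter(p):
    return np.linalg.norm(p - np.roll(p, 1, axis=0), axis=1).sum()

rng = np.random.default_rng(0)
theta = np.sort(rng.uniform(0, 2 * np.pi, 12))           # initial 12-gon
poly = np.c_[np.cos(theta), np.sin(theta)] * rng.uniform(0.5, 1.5, (12, 1))
poly *= 10.0 / perimeter(poly)                           # fix perimeter = 10

for step in range(5000):
    cand = poly + rng.normal(scale=0.02, size=poly.shape)
    cand *= 10.0 / perimeter(cand)                       # environment constraint
    if area(cand) > area(poly):                          # reward: area improved
        poly = cand

print("final area:", area(poly))   # approaches 10**2/(4*pi) ~ 7.96 (circle bound)
```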
Neural networks and deep learning have changed the field of machine learning, and their impact on geometry has been significant. An artificial neural network (ANN) is a computational model designed to simulate the way the human brain works. It is composed of interconnected nodes or neurons, each of which is assigned a weight and a bias value [1]. These neurons receive inputs from other neurons and perform a computation before passing their output on to other neurons. By adjusting the weights and biases of these neurons, the neural network can be trained to recognize patterns in data. One application can be seen in [53], where machine learning algorithms are applied to the study of lattice polytopes. With ANNs, the authors are able to predict standard properties, such as volume, dual volume, reflexivity, etc., with accuracies up to 100%. The paper covers 2D polygons and 3D polytopes with Plücker coordinates as input, which outperform the usual vertex representation. The same author also applies ANNs to amoebae in [45], using a multilayer perceptron as well as a CNN.
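A toy version of this kind of experiment (our own synthetic data, not the lattice polytopes of [53]): a small feed-forward network learns to predict a geometric property, here the area of a triangle, from its vertex coordinates.

```python
# MLP regression: predict triangle area from the six vertex coordinates.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
V = rng.uniform(-1, 1, size=(5000, 6))           # rows: (x1, y1, x2, y2, x3, y3)
x1, y1, x2, y2, x3, y3 = V.T
areas = 0.5 * np.abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))

V_tr, V_te, a_tr, a_te = train_test_split(V, areas, random_state=0)
net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0)
net.fit(V_tr, a_tr)
print("R^2 on held-out triangles:", net.score(V_te, a_te))
```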
Deep learning takes this concept a step further by using neural networks with many layers [25]. Each layer in a deep neural network performs a different computation on the input data, with the output of one layer serving as the input to the next. This allows the network to learn more complex features and patterns in the data, and can result in more accurate predictions. One of the key benefits of deep learning in geometry is its ability to learn from large amounts of data. This is especially useful in situations where traditional geometric algorithms may be too computationally expensive or too complex to implement. Deep learning algorithms can be trained on large datasets of images or geometric models, allowing them to learn from a vast amount of information and make accurate predictions [8,33,35]. Another advantage is the ability to generalize to new, unseen data: once a neural network has been trained on a particular dataset, it can be applied to new data with similar properties. This has many applications in fields such as computer graphics, computer vision, and robotics [8,54,55] (see Figure 4).
Table 1 summarizes the most popular methods in machine learning, ordered by the function performed (clustering, regression, classification, or dimensionality reduction). Figure 5 offers a short guide to choosing the best-suited algorithm.
The fields of mathematics and geometry are poised for a significant revolution in the form of machine learning, which will impact virtually every area of study. It is essential for mathematicians to remain at the forefront of these inevitable advancements to guide their development.
4. Challenges in Machine Learning for Geometry
Mathematics and geometry are no exception to the way machine learning approaches have altered how we process and interpret data. Machine learning in geometry presents a number of opportunities and challenges, and our survey shows that it has the potential to be both helpful and disruptive in the coming years. However, some well-known weaknesses of ML should be carefully considered when it is used in mathematics and geometry. Despite these challenges, there are also several opportunities in applying machine learning to geometry. For example, machine learning can be used to extract meaningful features from high-dimensional geometric data, such as point clouds, meshes, and curves. These features can then be used for tasks such as classification, segmentation, and reconstruction.
One of the most significant challenges in machine learning for geometry is overfitting and underfitting. Overfitting occurs when a model is too complex and fits the noise in the data, resulting in poor generalization performance. Underfitting, on the other hand, happens when a model is too simple and fails to capture the underlying patterns in the data, leading to poor performance on both the training and the test data. Avoiding overfitting and underfitting in geometry requires careful attention to the choice of model and to the amount of data used for training: it is crucial to find a balance between model complexity and the volume of training data. This is particularly difficult in geometry because the dimensionality of the data can be very high. Additionally, the theoretical basis and prior understanding of the problem should be taken into account; this additional information can be used to improve the models, or even to create new ones grounded in a thorough theoretical understanding of the problem.
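The trade-off can be seen in a few lines (toy data of our own): polynomial models of increasing degree are fit to noisy samples of a curve, and while the training score keeps climbing, the test score eventually turns back down.

```python
# Under- vs. over-fitting: sweep the polynomial degree and compare scores.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, 60).reshape(-1, 1)
y = np.sin(x).ravel() + rng.normal(scale=0.2, size=60)
x_tr, x_te, y_tr, y_te = train_test_split(x, y, random_state=0)

for degree in (1, 3, 15):            # underfit, reasonable fit, overfit
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(x_tr, y_tr)
    print(f"degree {degree:2d}: train R^2 = {model.score(x_tr, y_tr):.3f}, "
          f"test R^2 = {model.score(x_te, y_te):.3f}")
```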
Another drawback in machine learning for geometry is the lack of labeled data. Unlike in other fields, such as computer vision or natural language processing, where large labeled datasets are available, geometry often requires manual labeling, which is time-consuming and expensive. Nevertheless, symbolic and numerical computer software is able to generate datasets that represent a mathematical problem. Machine learning applied to such datasets can offer mathematics solutions regarding classification, prediction, and modeling, three of the key capabilities of ML methods.
Alternatively, unsupervised learning techniques can be used to learn from unlabeled data. Unsupervised learning can be used to discover meaningful structure and patterns in the data without the need for explicit labels. For example, unsupervised learning can be used to learn a low-dimensional representation of high-dimensional geometry data, such as point clouds or meshes. This low-dimensional representation can then be used for downstream tasks such as classification, segmentation, and reconstruction.
One more way to address the limited availability of data is to develop techniques that can learn from few examples, or even none. Few-shot learning aims to learn from a small number of examples, while zero-shot learning aims to recognize classes for which no direct training examples are available. These techniques are particularly useful in geometry, where it is often challenging to obtain large labeled datasets. Few-shot and zero-shot learning can enable machines to recognize new shapes and structures with minimal training data, making them valuable tools in geometry processing and modeling.
Machine learning can also be used to enhance the accuracy and efficiency of traditional geometry processing techniques, such as surface fitting, shape optimization, and geometric modeling. For example, machine learning can be used to predict the behavior of complex geometric structures, such as composite materials, and to optimize their design.
Because machine learning for geometry frequently uses sophisticated mathematical operations and transformations that are difficult to understand or explain, interpretability is a challenge when using machine learning methods. To overcome this difficulty, researchers are creating methods for interpreting and explaining the decisions made by machine learning algorithms. One option for increasing understanding of the models’ decision-making process is to use visualization techniques. Another strategy is to create models with built-in interpretability: decision trees and rule-based models, for instance, are frequently utilized in situations where interpretability is important because they are simple to comprehend and explain. Explainability can support or extend previous knowledge and provide insight into mathematical problems. Previous knowledge of the problem or of similar problems, together with the existing theoretical corpus, can help in choosing between the different solutions machine learning offers for a problem. In mathematics, theoretical knowledge is always a key factor.
One of the opportunities in applying machine learning to geometry is the potential for collaboration and interdisciplinary research. Several fields have a strong relationship with geometry, notably theoretical physics, but fields that rely on the results of mathematical processes, rather than on theoretical developments, can also benefit greatly. Machine learning techniques can be used in conjunction with traditional geometry processing techniques, such as surface reconstruction, shape optimization, and geometric modeling. This collaboration can enable researchers to develop more efficient and accurate techniques for solving challenging geometric problems.
Another opportunity for collaboration is the development of shared benchmarks and datasets. Researchers may compare and assess the performance of various algorithms and models by working together on benchmarks and datasets, allowing the discipline to advance more swiftly and effectively. Mathematicians can generate huge synthetic datasets, and machine learning can look for patterns and anomalies or predict spatial or temporal evolution. This collaboration can bring insight into open mathematical problems, in addition to producing new tailored machine learning algorithms that start from deep mathematical expertise in the problem.
5. A New Practical Application in Algebraic Geometry
As we stated above, machine learning can be used for point cloud reconstruction, which is the process of creating a 2D or a 3D model from a set of 2D or 3D points captured by a scanner or generated by some other means. Point cloud reconstruction is an important step in a wide range of applications, such as 3D printing, virtual reality, and autonomous driving.
There are several machine learning techniques that can be used for point cloud reconstruction. One popular approach is to use deep learning models such as convolutional neural networks (CNNs) or graph neural networks (GNNs) to learn a mapping from the input point cloud to the output model. These models can be trained on a large dataset of point clouds and corresponding models, and can learn to generalize to new, unseen data.
Another approach is to use traditional machine learning algorithms such as k-nearest neighbors or random forests to predict the geometry of the model from the input point cloud. These methods can be effective in certain situations, but may not be as powerful as deep learning models when it comes to handling complex, high-dimensional data.
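A minimal sketch of the nearest-neighbor idea (synthetic scan of our own): the height of a surface at new locations is predicted by averaging the nearby samples of a point cloud.

```python
# k-nearest-neighbors regression on a synthetic scanned surface.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
xy = rng.uniform(-1, 1, size=(2000, 2))        # scanned (x, y) locations
z = xy[:, 0] ** 2 - xy[:, 1] ** 2              # saddle-shaped surface heights

knn = KNeighborsRegressor(n_neighbors=8).fit(xy, z)
query = np.array([[0.25, -0.5]])
print("predicted height:", knn.predict(query))  # true value: 0.0625 - 0.25 = -0.1875
```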
Machine learning has the potential to revolutionize the field of point cloud reconstruction by enabling faster, more accurate, and more automated methods for creating models from point clouds.
However, when point clouds include points at “infinity” (i.e., points having large coordinates), the construction of an effective method needs a different approach, since otherwise the predicted geometry of the 2D or 3D model may not be the expected one. More precisely, let us consider the point clouds given in Figure 6 and Figure 7.
The behavior of the curve we are looking for to model the point clouds is totally different if we look at the squares containing points with smaller coordinates than if we “go to infinity”. Here, the distortion of the model seems to indicate that we have different curves. In a square $[-N, N] \times [-N, N]$ with $N$ small enough, some well-known machine learning techniques allow us to determine the model accurately. However, to correctly predict the geometric object, we need to model the infinity accurately, and the essential tool for this problem is the asymptotes.
Thus, in this section, we use an important new tool from which we can extract geometric information from the point cloud and reconstruct a 2D or a 3D model: the generalized asymptotes or g-asymptotes. A curve may be described at points with “large coordinates” by curves more general than lines. More precisely, a curve $\widetilde{\mathcal{C}}$ is a generalized asymptote (or g-asymptote) of another curve $\mathcal{C}$ if the distance between $\widetilde{\mathcal{C}}$ and $\mathcal{C}$ tends to zero as they tend to infinity, and $\widetilde{\mathcal{C}}$ cannot be approached by a new curve of lower degree. This notion, introduced and studied by S. Pérez-Díaz in previous papers ([56,57,58,59,60]), generalizes the classical concept of an asymptote of a curve $\mathcal{C}$, defined as a line such that the distance between $\mathcal{C}$ and the line approaches zero as they tend to infinity (see [61,62,63]).
The approach using asymptotes for point cloud reconstruction involves fitting a set of asymptotes to the point clouds. The asymptotes can be defined from the infinity branches that can be constructed from the point clouds. Once the asymptotes have been fitted to the point clouds, they can be used to reconstruct a 2D or 3D model by interpolating between the points and generating a curve or a surface that follows the asymptotes. This approach can be particularly useful for reconstructing smooth curves and surfaces, where other methods such as voxel-based reconstruction (see [64]) may not be as effective.
The novelty of this paper is to use asymptotes that are not necessarily lines but g-asymptotes. For this purpose, we first review some preliminary notions and introduce the concepts of infinity branch and g-asymptote, from which one may obtain an algebraic plane curve that follows the point clouds. We present the method for the case of plane curves, but this approach can easily be generalized to $n$-dimensional space (see [58], where g-asymptotes for algebraic curves in $n$-dimensional space are introduced). The case of surfaces can be dealt with in a similar way, but for this purpose we need the corresponding theory of g-asymptotes, which is currently being studied by the authors of this paper (see [65,66]).
The use of infinity branches and g-asymptotes opens up a promising field at the intersection with machine learning. In general, any method that seeks boundaries of separation between classes can rely on these concepts to look for curves, planes, or hypersurfaces that are, in some way, defined by the asymptotic behavior of the point cloud determined by a given class. Nor should we forget the predictive capacity that asymptotes have in themselves, since their very concept is the projection of a tendency towards extreme values of the coordinates.
In the following, let $\mathcal{C}$ be a plane curve over the complex field $\mathbb{C}$ defined by the (irreducible) polynomial $f(x, y) \in \mathbb{C}[x, y]$. Its corresponding projective curve, denoted as $\mathcal{C}^*$, is defined by the (homogeneous) polynomial $F(x, y, z) = f_d(x, y) + z f_{d-1}(x, y) + \cdots + z^d f_0$, where $f_j(x, y)$ are the homogeneous forms of degree $j$, for $j = 0, \ldots, d$. Throughout this section, we assume w.l.o.g. that $(0:1:0)$ is not an infinity point of $\mathcal{C}^*$ (otherwise, we apply a linear change of coordinates).
To obtain the infinity branches of $\mathcal{C}$, we consider the curve defined by the polynomial $g(y, z) = F(1, y, z)$ and we compute the series expansion for the solutions of $g(y, z) = 0$ around $z = 0$. We obtain solutions defined by the (different) Puiseux series, which can be grouped into conjugacy classes. That is, if $\varphi(z) = m + a_1 z^{N_1/N} + a_2 z^{N_2/N} + a_3 z^{N_3/N} + \cdots$, where $m \in \mathbb{C}$, $a_i \in \mathbb{C} \setminus \{0\}$, $N_i \in \mathbb{N}$, and $0 < N_1 < N_2 < \cdots$, is a Puiseux series (i.e., $g(\varphi(z), z) = 0$), and $\gcd(N, N_1, N_2, \ldots) = 1$ ($N$ is the so-called ramification index of $\varphi$), the series $\varphi_j(z) = m + a_1 c_j^{N_1} z^{N_1/N} + a_2 c_j^{N_2} z^{N_2/N} + \cdots$, where $c_j \in \mathbb{C}$ is an $N$-th root of unity, are the conjugates of $\varphi$. The set of all the conjugates of $\varphi$ is called the conjugacy class of $\varphi$, and it contains $N$ different series.
Since $g(\varphi(z), z) = 0$ in some neighborhood of $z = 0$ where $\varphi(z)$ converges, there exists $M \in \mathbb{R}^+$ with $F(1, \varphi(z), z) = 0$ for $z \in \mathbb{C}$ and $|z| < M$, which implies that $F(z^{-1}, z^{-1}\varphi(z), 1) = 0$ for $z \in \mathbb{C}$ and $0 < |z| < M$. Applying the substitution $z \mapsto z^{-1}$, we find that $f(z, r(z)) = 0$ for $z \in \mathbb{C}$ and $|z| > M^{-1}$, where
$$r(z) = z\,\varphi(z^{-1}) = mz + a_1 z^{1 - N_1/N} + a_2 z^{1 - N_2/N} + a_3 z^{1 - N_3/N} + \cdots,$$
with $a_i \in \mathbb{C} \setminus \{0\}$, and $0 < N_1 < N_2 < \cdots$.
One may reason likewise with the $N$ different series in the conjugacy class. However, in [57], it is proved that all the results hold independently of the series chosen in the conjugacy class. Thus, in the following, we consider any representative of the conjugacy class, and we introduce the notion of infinity branch of a plane curve $\mathcal{C}$.
Definition 1. An infinity branch of a plane curve $\mathcal{C}$ associated to the infinity point $P = (1:m:0)$, $m \in \mathbb{C}$, is a set $B = \{(z, r(z)) \in \mathbb{C}^2 : z \in \mathbb{C},\ |z| > M\}$, $M \in \mathbb{R}^+$, where
$$r(z) = z\,\varphi(z^{-1}) = mz + a_1 z^{1 - N_1/N} + a_2 z^{1 - N_2/N} + \cdots, \quad (1)$$
and $N, N_i \in \mathbb{N}$ with $0 < N_1 < N_2 < \cdots$.

Now, we provide the concepts of convergent branches and approaching curves. These notions will allow us to study whether two curves approach each other (Theorem 2). In addition, Theorem 1 characterizes the convergence of two infinity branches (these notions and the proofs of the theorems can be found in [56,57]).
Definition 2. Two infinity branches, $B_1 = \{(z, r_1(z)) \in \mathbb{C}^2 : |z| > M_1\}$ and $B_2 = \{(z, r_2(z)) \in \mathbb{C}^2 : |z| > M_2\}$, are convergent if $\lim_{z \to \infty} (r_2(z) - r_1(z)) = 0$.
Theorem 1. Two branches $B_1$ and $B_2$ are convergent iff the terms with non-negative exponent in $r_1(z)$ and $r_2(z)$ are the same. Therefore, two convergent infinity branches are associated with the same infinity point.
The classical concept of asymptote has to do with a line that approaches a given curve at infinity. In the following, we generalize this idea, and we say that two curves approach each other if they have two infinity branches that converge (see Definition 3 and Theorem 2).
Definition 3. Let $\mathcal{C}$ be a plane curve with an infinity branch $B = \{(z, r(z)) \in \mathbb{C}^2 : |z| > M\}$. A curve $\overline{\mathcal{C}}$ approaches $\mathcal{C}$ at $B$ if $\lim_{z \to \infty} d((z, r(z)), \overline{\mathcal{C}}) = 0$.
Theorem 2. Let $\mathcal{C}$ be a plane curve with an infinity branch $B$. A plane curve $\overline{\mathcal{C}}$ approaches $\mathcal{C}$ at $B$ iff $\overline{\mathcal{C}}$ has an infinity branch, $\overline{B}$, such that $B$ and $\overline{B}$ are convergent.
Now, we consider a plane curve $\mathcal{C}$ and an infinity branch $B$ of $\mathcal{C}$. We have just described how $\mathcal{C}$ can be approached at $B$ by a new curve $\overline{\mathcal{C}}$, and now we consider the case $\deg(\overline{\mathcal{C}}) < \deg(\mathcal{C})$. Then, one may say that $\mathcal{C}$ degenerates, since it behaves at infinity as a curve of smaller degree. For example, one may think of a hyperbola, a curve of degree two having two real asymptotes; this could make us deduce that the hyperbola degenerates at infinity into two lines. Similarly, an ellipse has two asymptotes that, in this case, are complex lines. The asymptotic behavior of a parabola is different, since it cannot be approached at infinity by any line. This leads us to the notions of perfect curve and g-asymptote.
Definition 4. A curve $\mathcal{C}$ of degree $d$ is a perfect curve if it cannot be approached by any curve of degree less than $d$.
Definition 5. Let $\mathcal{C}$ be a curve with an infinity branch $B$. A g-asymptote (or generalized asymptote) of $\mathcal{C}$ at $B$ is a perfect curve that approaches $\mathcal{C}$ at $B$.
The notion of g-asymptote is a generalization of the classical concept of asymptote since, as one may deduce, a g-asymptote is not necessarily a line, but a perfect curve (Definition 4). Throughout this section, we refer to a g-asymptote simply as an asymptote.
Every infinity branch of a given implicitly defined plane curve has at least one asymptote, and now we show how to compute it. For this purpose, we rewrite Equation (1), defining a branch $B$ (Definition 1), as
$$r(z) = mz + a_1 z^{1 - n_1/n} + \cdots + a_k z^{1 - n_k/n} + a_{k+1} z^{1 - n_{k+1}/n} + \cdots, \quad (2)$$
where $a_i \in \mathbb{C} \setminus \{0\}$, $n, n_i \in \mathbb{N}$, and $0 < n_1 < n_2 < \cdots$. That is, we simplify the non-negative exponents such that $\gcd(n, n_1, \ldots, n_k) = 1$. Remark that $1 - n_j/n \ge 0$ for $j = 1, \ldots, k$, and $1 - n_j/n < 0$ for $j \ge k + 1$, i.e., the terms $a_j z^{1 - n_j/n}$ with $j \ge k + 1$ are those which have negative exponent. We denote these terms as
$$q(z) = \sum_{j \ge k+1} a_j z^{1 - n_j/n},$$
so that $r(z) = \tilde{r}(z) + q(z)$, where $\tilde{r}(z)$ collects the terms with non-negative exponent. We say that $n$ is the degree of $B$, and we denote it by $\deg(B) = n$.
Taking into account Theorems 1 and 2, we find that any curve $\overline{\mathcal{C}}$ approaching $\mathcal{C}$ at $B$ should have an infinity branch $\overline{B} = \{(z, \overline{r}(z)) : |z| > \overline{M}\}$ such that the terms with non-negative exponent in $r(z)$ and $\overline{r}(z)$ are the same. In the most simple case, if $q(z) = 0$ (there are no terms with negative exponent; see Equation (2)), we obtain
$$\tilde{r}(z) = mz + a_1 z^{1 - n_1/n} + \cdots + a_k z^{1 - n_k/n},$$
where $a_i \in \mathbb{C} \setminus \{0\}$, $n, n_i \in \mathbb{N}$, $\gcd(n, n_1, \ldots, n_k) = 1$, and $0 < n_1 < \cdots < n_k$. We observe that $\tilde{r}(z)$ has the same terms with non-negative exponent as $r$, and $\tilde{r}(z)$ does not have terms with negative exponent.
Let $\widetilde{\mathcal{C}}$ be the plane curve containing the branch $\widetilde{B} = \{(z, \tilde{r}(z)) : |z| > M\}$. We have that
$$\widetilde{Q}(t) = (t^n,\ m t^n + a_1 t^{n - n_1} + \cdots + a_k t^{n - n_k}) \in \mathbb{C}[t]^2,$$
where $n, n_1, \ldots, n_k \in \mathbb{N}$, $\gcd(n, n_1, \ldots, n_k) = 1$, and $0 < n_1 < \cdots < n_k$, is a polynomial parametrization of $\widetilde{\mathcal{C}}$, and it is proper (see Lemma 3 in [56]). In Theorem 2 in [56], it is proved that $\widetilde{\mathcal{C}}$ is a g-asymptote of $\mathcal{C}$ at $B$.
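As a quick worked instance of this construction (a toy of our own choosing, simpler than the example below), take the hyperbola $\mathcal{C}$ defined by $f(x, y) = y^2 - x^2 - 1$. Its projective curve $F(x, y, z) = y^2 - x^2 - z^2$ has infinity points $(1:1:0)$ and $(1:-1:0)$, and $(0:1:0) \notin \mathcal{C}^*$. At $P = (1:1:0)$, we have $g(y, z) = F(1, y, z) = y^2 - 1 - z^2$, whose Puiseux solution around $z = 0$ with $\varphi(0) = 1$ is $\varphi(z) = \sqrt{1 + z^2} = 1 + \frac{1}{2}z^2 - \frac{1}{8}z^4 + \cdots$, so $N = 1$. The associated infinity branch is $r(z) = z\,\varphi(z^{-1}) = z + \frac{1}{2}z^{-1} - \frac{1}{8}z^{-3} + \cdots$; keeping the terms with non-negative exponent gives $\tilde{r}(z) = z$, i.e., $\widetilde{Q}(t) = (t, t)$, and the g-asymptote at $P$ is the classical asymptote $y = x$ (similarly, $y = -x$ at $(1:-1:0)$).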
In the following, we illustrate this process by means of an example.
Example 1. Let be a curve of degree defined by (see Figure 8). The infinity points are , , and . We first consider and we compute its associated branches and asymptotes.
There exists only one branch associated to , , where (we compute by using the command puiseux, which is included in the algcurves package of the computer algebra system Maple). We obtain and hence, the parametrization of the asymptote is . Now, we analyze the point . We have one infinity branch associated to , , where . We obtain that and thus, the parametrization of the asymptote is given by . Now, we analyze the point . We have one infinity branch associated to , , where . We obtain that . The parametrization of the asymptote is given by . In Figure 9, we plot the curve and the asymptotes , , and . Observe that Figure 9 is plotted in the square . Note that in this square, where the points have “sufficiently large coordinates”, the asymptotes approach the input curve perfectly. However, if we plot the curve with the asymptotes , , and in a smaller square (see Figure 10), one may check that the approximation provided by the asymptotes is worse. That is, as one knows, the approach of the asymptotes to the curve is good at infinity; in fact, the asymptotes are the only tool we have to approach the curve at infinity. Now, let us assume that we are given a point cloud as in
Figure 7, and we are interested in making specific predictions or decisions based on the data. For this purpose, one has to develop methods that generate geometric models and analyze their geometric properties. In fact, as we stated above, although some machine learning algorithms have been developed in this sense, new tools are necessary to understand the behavior at infinity, which comes down to constructing the asymptotes.
For this purpose, the idea we provide in this paper is the following: from the point clouds at infinity and in each of the directions, the infinity branches, , passing through those points are constructed. One can consider branches of degree according to the given point clouds. The ramification index to be considered (i.e., the value of N) and the number of terms in can be as large as one wishes, depending on the number of points one wants to use. The more points one considers, the better the approximation obtained.
As one can deduce, this method only involves linear systems, and their solution provides the infinity branches and hence the asymptotes. That is, we compute , where has as many terms as one wishes. Depending on the approximation purposes, either the whole infinity branch can be used, or only the asymptote determined from it. Note that an infinity branch (with a finite number of terms) is, in the background, a parametrization; however, it is not polynomial (as the asymptote is), and its degree is N, which could be much larger than the degree of the asymptote (n).
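A hedged numerical sketch of this linear-system idea (our reading of the method, with the ramification index assumed to be $N = 1$ and the truncation order chosen by hand): given sample points $(z_i, w_i)$ with large $|z_i|$ from one direction of the cloud, the truncated branch $r(z) = mz + c_0 + c_1 z^{-1} + \cdots + c_k z^{-k}$ is fitted by least squares, and the asymptote keeps the terms with non-negative exponent.

```python
# Fit a truncated infinity branch to points with large coordinates.
import numpy as np

def fit_branch(z, w, k=3):
    """Least-squares coefficients [m, c0, c1, ..., ck] of the truncated branch."""
    A = np.column_stack([z] + [z ** (-j) for j in range(k + 1)])
    coeffs, *_ = np.linalg.lstsq(A, w, rcond=None)
    return coeffs

rng = np.random.default_rng(0)
z = rng.uniform(50, 500, 200)                 # samples with large coordinates
w = 2 * z + 1 + 3 / z - 5 / z ** 2            # hidden toy branch (ground truth)

m, c0, *tail = fit_branch(z, w)
print(f"asymptote: y = {m:.3f} x + {c0:.3f}")   # terms with exponent >= 0
print("negative-exponent coefficients:", np.round(tail, 3))
```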
Additionally, in order to measure the error, for each given point one may compute the distance to the nearest point on the asymptote. An error will be acceptable or not depending on the objectives of the problem being addressed.
This new technique, as we mentioned above, can serve as a basis for new clustering algorithms based on the distance of points to a hypersurface. The approach could also be combined with dimensionality reduction methods or methods based on manifolds to search for spaces in which points can be separated by lower-order surfaces, as SVM kernels do. All the mathematical theory developed so far can be used as a starting point for such algorithms, by means of the numerical approximations presented above, setting as parameters the maximum orders or the errors committed, for example.
Therefore, this method can serve as the basis for a new classification algorithm; moreover, once the families of asymptotes that solve the problem have been determined, they can be used to predict the classes at the extreme locations of the points, whether the variables involved are temporal or of other types. These asymptotes, together with plausibility conditions imposed by the problem, can lead to knowledge discovery in the problems analyzed in this way. This also opens up a new area of work in the study of which asymptotes represent acceptable solutions, as opposed to those that provide false solutions.
This proposal for the use of asymptotes is being developed by the authors and will be the subject of forthcoming publications.
In the following, we illustrate these ideas by means of several examples. In the first one, the asymptotes computed have degree one; the second, however, requires asymptotes of degree two.
Example 2. One should note that the input points are given, in general, in floating-point arithmetic. From the figures, we observe that we have three different directions, and then one should obtain three infinity branches. We consider and (if the approximation is not good enough, one may increase the value of n).
The first branch obtained is , where . We compute , and we have that . Hence, the parametrization of the asymptote is . Now, we determine , where . We obtain that and thus, the parametrization of the asymptote is given by .
Finally, we obtain the infinity branch , where . We obtain that , and the parametrization of the asymptote is given by .
In Figure 11, we plot the curve and the asymptotes , and in . In Figure 12, we plot the curve and the asymptotes in . In Figure 13, we plot the curve and the asymptotes in . In Figure 14, we plot the curve and the asymptotes in . One may observe that, although the approximation in the squares of smaller length is also good, the error in this area is much greater than at infinity. More precisely, if for each given point we calculate the distance to the nearest point on the nearest line, we obtain that for the first case () the error is less than or equal to , for the error is less than or equal to , for the error is less than or equal to , and for the error is less than or equal to . However, it is important to note that, topologically, the asymptotes describe the curve perfectly at infinity but not in the area near the origin.
In the following example, asymptotes of degree two have to be used to obtain a better approximation in one of the directions of the infinity.
Example 3. From these figures, we observe that we have two different directions, and then one should compute two infinity branches. We consider and (if the approximation is not good enough, one can increase the value of n).
The first branch obtained is , where . We compute , and we have that . Hence, the parametrization of the asymptote is .
Now, we determine . If one considers , one does not obtain a nice approximation. Thus, let . We find . We obtain that and thus, the parametrization of the asymptote is given by .
In Figure 17, we plot the curve and the asymptotes , and in . In Figure 18, we plot the curve and the asymptotes in . In Figure 19, we represent the curve and the asymptotes in . In Figure 20, we plot the curve and the asymptotes in . Now, for each given point, we calculate the distance to the nearest point on the asymptotes. We obtain that for the first case () the error is less than or equal to , for the error is less than or equal to , for the error is less than or equal to , and for the error is less than or equal to .