Article

Statistical Modeling of Implicit Functional Relations

Stan Lipovetsky

Independent Researcher, Minneapolis, MN 55305, USA
Stats 2023, 6(3), 889-906; https://doi.org/10.3390/stats6030055
Submission received: 26 July 2023 / Revised: 15 August 2023 / Accepted: 16 August 2023 / Published: 25 August 2023

Abstract

This study considers the statistical estimation of relations presented by implicit functions. Such structures define mutual interconnections of variables rather than the dependence of an outcome variable on predictor variables considered in regular regression analysis. For a simple case of two variables, pairwise regression modeling produces two different lines, one for each variable regressed on the other, but building an implicit relation yields one invertible model that combines the two simple regressions. Modeling an implicit linear relation for multiple variables can be expressed as a generalized eigenproblem of the covariance matrix of the variables in the metric of the covariance matrix of their errors. For unknown errors, this work describes their estimation by the residual errors of each variable in its regression on the other predictors. Then, the generalized eigenproblem can be reduced to the diagonalization of a special matrix built from the variables’ covariance matrix and its inverse. Numerical examples demonstrate the eigenvector solution’s good properties for building a unique equation of the relations between all variables. The proposed approach can be useful in practical regression modeling when all variables contain unobserved errors, which is a common situation in applied problems.

1. Introduction

This paper describes several approaches to the statistical estimation of relations expressed via implicit functions. In regular regression analysis, an outcome variable is observed or measured with errors, and it is presented as an explicit function of the other variables, which are free from errors. The minimization of squared errors of the outcome variable produces an estimation of the parameters in ordinary least-squares (OLS) regression [1]. Depending on which variable is taken as the outcome, its own model can be built based on the other variables. These models are different and cannot be reduced from one to another. Even in the simple case of two variables, y and x, in their linear dependence, there are two solutions corresponding to minimizations of deviations in the y-axis or x-axis directions, respectively. These models can be presented graphically as two different lines, one for each variable's dependence on the other, intersecting at the point of the variables’ mean values. These two relations are not invertible, and only in the case of correlation |r| = 1 do the models coincide. The same two OLS solutions correspond to another scheme of statistical modeling, in which two random variables satisfy the binormal distribution and each regression can be obtained as the conditional expectation of one variable at fixed values of the other variable.
In a general case of n variables, there are n different linear regressions, one of each variable on all the others. These models are not invertible, and thus cannot be reduced from one to another. Such a situation raises a question—is it possible to build a unique model of all the variables connected in one relationship and to make it invertible, so that any needed variable can be expressed as a function of all the other variables? A reasonable approach for simultaneous fitting in all directions consists of minimizing the shortest distances, or the lengths of the perpendiculars, from the observations to the line, plane, or surface of the variables’ connection in their theoretical model. The first results on such modeling were probably obtained by the American statistician Robert Adcock [2,3,4], and their development led to orthogonal regression for any number of variables, proposed by Karl Pearson [5] as the eigenvector with the minimum eigenvalue of the correlation matrix. In contrast to OLS regression, orthogonal regression is not invariant to scaling transformations; therefore, it can be applied only to variables of the same units or to standardized variables, and its properties have been studied further in many works [6,7,8].
If a functional relation, for example, a physics law or an economics supply–demand link, or any other kind of mutual interconnection, can be assumed for variables measured with errors, then a more general approach of the maximum likelihood objective can be applied to building the models. Such models are called deterministic functional relations (notably, this does not concern functional data analyses, which deal with curves, surfaces, or other structures varying over time). For linear or linearized functional relations, such an approach produces a generalized eigenproblem with two covariance matrices—one of the observed variables and one of the errors of observations by the variables [9]. For uncorrelated errors with equal variances, the generalized eigenproblem reduces to the regular eigenproblem for orthogonal regression.
In the simple case of two variables, if the errors’ variances are proportional to the variances of the variables themselves, the result corresponds to the so-called diagonal regression, which can be presented graphically as a diagonal line of the first and third, or otherwise the second and fourth, quadrants in the plane of the standardized variables [10]. Diagonal regression was proposed by Ragnar Frisch, a Nobel laureate in economics, in his work on confluence analysis, in which functional structure relations were built for random or non-random variables observed with additional random noise [11]. In contrast to OLS regression, for which the slope depends on the pair correlation r, the diagonal regression of the standardized variables depends only on the sign of r, and similar constructs have been described for models using two predictors [12,13].
Diagonal regression is also known as geometric mean regression, standard (reduced) major axis regression, and other names, as described in an extensive review [14]. More references on orthogonal and diagonal regressions and related problems can be found in [15,16,17,18,19,20,21,22,23,24]. The so-called total least squares of the errors-in-variables regression and various other techniques have been developed as well [25,26,27,28,29,30,31].
The current paper considers regression modeling for functional relations defined in an implicit form. For two variables, the explicit dependence of one variable on another can be written as y = f(x) or x = g(y), while the implicit function can be presented as F(x, y) = 0. In many cases, it is more convenient to use the implicit form in statistical modeling. For example, if the theoretical relation is assumed to be a circle $x^2 + y^2 = \rho^2$ centered at the origin with the radius ρ, it is easier to use two new variables, $u_1 = x^2$ and $u_2 = y^2$, and to build one simple linear model, $u_1 + u_2 = k$, where the estimated constant is then used to find the radius as $\rho = k^{0.5}$. Otherwise, initially transforming the model to an explicit form, we would have a more complicated problem of splitting the data to fit each portion to one of the two branches of the functions $y_1 = +(\rho^2 - x^2)^{0.5}$ and $y_2 = -(\rho^2 - x^2)^{0.5}$ and then combining the estimations into one value of the radius. A similar problem of restoring a Pythagorean linear relation between the squares of three variables can be considered for measurements of the three sides of right triangles. The well-known ideal gas law PV/T = constant can be restored from data on the pressure, volume, and temperature, which is modeled by a general linear equation after the logarithmic transformation of this relation.
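As a small numeric illustration of this linearization idea, the following sketch (with simulated, purely illustrative data and parameter values that are not taken from the paper) estimates the radius of a circle from noisy coordinates by averaging the new variables u1 + u2:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated noisy points on a circle of radius rho = 2 centered at the origin
# (illustrative values, not taken from the paper).
rho = 2.0
t = np.linspace(0.0, 2.0 * np.pi, 200)
x = rho * np.cos(t) + rng.normal(0.0, 0.05, t.size)
y = rho * np.sin(t) + rng.normal(0.0, 0.05, t.size)

# Linearize the implicit relation x^2 + y^2 = rho^2 with u1 = x^2, u2 = y^2,
# so that u1 + u2 = k, and estimate the constant k by the sample mean.
u1, u2 = x**2, y**2
k = np.mean(u1 + u2)
print("estimated radius:", np.sqrt(k))   # close to 2.0
```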
The implicit function helps to evaluate data corresponding to non-unique but possible multi-valued dependencies. The implicit function of the second-order curve of conic sections, given by two variables in the linear, quadratic, and mixed terms, was applied for statistical modeling in [32]. It helps to build a model with two possible values of y at one given value of x, as observed, for instance, in a parabola extending at some angle to the x-axis. An implicit relations approach was applied to modeling unitary response data, useful for the estimation of shares in a total, and to Thurstone scaling data [33]. An implicit logistic regression was built in [34] to describe the so-called supercritical pitchfork bifurcation known in chaotic systems. Bifurcation is a function behavior in which the number of solutions and their structure can change abruptly, and such modeling can be helpful in dealing with messy data characterized by a wide range of response values at each point of the predictors’ values, for instance, those known in advertising and marketing research. Various other applications of implicit multi-valued functions are also known [35,36,37,38,39]. Data fitting for implicit functions can be performed by various techniques of nonlinear statistical estimation available in modern software packages. It is also often possible to use new variables in which the original implicit function can be transformed into a linear one, similar to reducing a circle to a line in the coordinates of the squared original variables.
The regression model of an implicit linear relation for multiple variables can be expressed as a generalized eigenproblem containing two covariance matrices—of the variables and of the errors of their observations. As shown in this study, in the common case of unknown errors, these can be estimated by the residual errors in each variable's linear regression on the other predictors. Then, the generalized eigenproblem is reduced to the diagonalization of a special matrix constructed from the covariance matrix of the variables and its inverse. The proposed approach enables building a unique equation of relations between all the variables when they contain unobserved errors.
The paper is structured as follows: Section 2 describes a case of two variables in different modeling approaches; Section 3 extends these techniques to multiple variables; Section 4 discusses numerical examples; and Section 5 summarizes.

2. Modeling for Two Variables

Let us briefly consider OLS regressions. As is assumed in applied regression modeling, the outcome variable is observed or measured with errors and depends on predictors that are free from errors. For the simple case of two variables, y and x, when a theoretical unobserved value of the dependent variable z2 is linearly connected to x, the model can be written as
$y_i = z_{2i} + \varepsilon_i = a x_i + b + \varepsilon_i$  (1)
where i denotes observations (i = 1, 2, …, N, with N their total number), a and b are unknown parameters, and ε is the error in y. Minimizing the sum of squared deviations in Equation (1) by OLS yields the solution for the regression of y on x:
$y = m_y + r \dfrac{\sigma_y}{\sigma_x} (x - m_x)$  (2)
where the sample mean values are denoted as $m_x$ and $m_y$, the standard deviations are $\sigma_x$ and $\sigma_y$, and r is the correlation between the variables.
Similarly, assuming errors in the x variable, when a theoretical unobserved dependent variable z1 is linearly related to y, we obtain another model:
$x_i = z_{1i} + \delta_i = c y_i + d + \delta_i$  (3)
where c and d are unknown parameters, and δ is the error by x. The OLS criterion for the total of squared deviations in (3) leads to the regression of x by y:
$x = m_x + r \dfrac{\sigma_x}{\sigma_y} (y - m_y)$  (4)
The two solutions, (2) and (4), correspond to minimizations of deviations in the y-axis and x-axis directions, respectively, and they can be presented graphically as two different lines, one for each variable's dependence on the other. The lines intersect at the point (mx, my) of mean values, and (4) inclines more steeply to the x-axis than (2), because the slope of line (2) is proportional to the correlation |r| < 1, while the slope of line (4) is proportional to the inverse value |1/r| > 1. Only when the correlation reaches one in absolute value, |r| = 1, do both these equations coincide. If y − my is expressed via x − mx from (4), it does not coincide with (2) because there is the term 1/r instead of r as in (2), but taking the geometric mean of the slopes of these equations yields the expression
$y = m_y + \mathrm{sgn}(r) \dfrac{\sigma_y}{\sigma_x} (x - m_x)$  (5)
which does not depend on the value of the correlation, but only on its sign sgn(r). The way this model is built explains its name of “geometric mean regression”; it presents a compromise between the two OLS solutions, is symmetric in both variables, and can be inverted. In the coordinates of the standardized variables (centered and normalized by the standard deviations), its graph is a diagonal line with the slope sgn(r), which explains the name of “diagonal regression”.
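A minimal numeric sketch of these two-variable models, on simulated data rather than any dataset from the paper, compares the two OLS slopes with the geometric mean (diagonal) regression slope of Equation (5):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated correlated pair (x, y); the data are illustrative, not from the paper.
x = rng.normal(0.0, 1.0, 500)
y = 0.7 * x + rng.normal(0.0, 0.5, 500)

mx, my = x.mean(), y.mean()
sx, sy = x.std(), y.std()
r = np.corrcoef(x, y)[0, 1]

slope_y_on_x = r * sy / sx               # slope of regression (2)
slope_from_x_on_y = sy / (r * sx)        # regression (4) solved for y
slope_diagonal = np.sign(r) * sy / sx    # geometric mean regression (5)

print(slope_y_on_x, slope_from_x_on_y, slope_diagonal)
# The diagonal slope equals the geometric mean of the two OLS slopes (up to sign):
print(np.sqrt(slope_y_on_x * slope_from_x_on_y))
```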
In addition to classical regression modeling, relationship models in which both variables are observed with errors have been considered, with their theoretical values z1 and z2 satisfying the general linear equation of their connection:
$\alpha_0 + z_{1i}\alpha_1 + z_{2i}\alpha_2 = 0$  (6)
where the alphas are unknown parameters. Using this general equation for both variables observed with errors (1) and (3), and taking into account the identical relation (6), yields the relation:
$\alpha_0 + x_i\alpha_1 + y_i\alpha_2 = \alpha_0 + (z_{1i} + \delta_i)\alpha_1 + (z_{2i} + \varepsilon_i)\alpha_2 = \delta_i\alpha_1 + \varepsilon_i\alpha_2,$  (7)
and the errors of each variable are assumed to be of zero mean value. To simplify the derivation, let the variables be centered; then the intercept $\alpha_0$ on the left-hand side of (7) can be omitted, and this side in matrix form can be written as $x\alpha_1 + y\alpha_2$, where x and y are the column-vectors of N observations of the variables. Then, the scalar product of this total vector with itself yields the quadratic form:
$(x\alpha_1 + y\alpha_2)'(x\alpha_1 + y\alpha_2) = (\alpha_1\;\alpha_2)\begin{pmatrix} c_{xx} & c_{xy} \\ c_{yx} & c_{yy} \end{pmatrix}\begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} = \alpha' C \alpha$  (8)
where prime denotes transposition, cxx and cyy are the second moments, or sample variances, of the variables x and y, respectively, and cxy (equal to cyx) is their sample covariance. The second-order matrix C on the right-hand side of (8) is the variance–covariance matrix of the variables, and the vector α contains both alpha parameters of the model. Similarly, the right-hand side of (7) in matrix form can be written as $\delta\alpha_1 + \varepsilon\alpha_2$, where the column-vectors contain the N errors of the observed variables. The scalar product of this total vector of errors with itself yields another quadratic form:
$(\delta\alpha_1 + \varepsilon\alpha_2)'(\delta\alpha_1 + \varepsilon\alpha_2) = (\alpha_1\;\alpha_2)\begin{pmatrix} s_{xx} & s_{xy} \\ s_{yx} & s_{yy} \end{pmatrix}\begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} = \alpha' S \alpha$  (9)
where sxx and syy are the second moments, or sample variances for the errors by the variables x and y, respectively, and sxy (or syx) is the sample covariance of the errors. Thus, the second-order matrix S on the right-hand side of (9) is the variance–covariance matrix of the variables’ errors.
Finding the minimum sum of squares of the observations’ deviations from the model on the left-hand side of (7), subject to a fixed value of its right-hand side (needed for the parameters’ identifiability), can be presented as the Rayleigh quotient of two quadratic forms, $\alpha' C \alpha / \alpha' S \alpha$. Minimization of this ratio can be reduced to the conditional least squares problem:
$L = \alpha' C \alpha - \lambda(\alpha' S \alpha - 1)$  (10)
where λ is the Lagrange term. Setting the derivative of (10) with respect to the vector α to zero for finding the extrema yields the relation:
$C\alpha = \lambda S\alpha$  (11)
which is the generalized eigenproblem.
In practical modeling, it is commonly assumed that the errors of different variables are independent, so their covariance equals zero, sxy = 0, and S is a diagonal matrix. Then, problem (11) reduces to the regular eigenproblem:
$\begin{pmatrix} c_{xx}/s_{xx} & c_{xy}/s_{xx} \\ c_{yx}/s_{yy} & c_{yy}/s_{yy} \end{pmatrix}\begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} = \lambda \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix}$  (12)
The characteristic equation for problem (12) is defined by the zero value of the determinant:
$\begin{vmatrix} c_{xx}/s_{xx} - \lambda & c_{xy}/s_{xx} \\ c_{yx}/s_{yy} & c_{yy}/s_{yy} - \lambda \end{vmatrix} = \lambda^2 - \left(\dfrac{c_{xx}}{s_{xx}} + \dfrac{c_{yy}}{s_{yy}}\right)\lambda + \dfrac{c_{xx}}{s_{xx}}\dfrac{c_{yy}}{s_{yy}} - \dfrac{c_{xy}}{s_{xx}}\dfrac{c_{yx}}{s_{yy}} = 0$  (13)
Solving this quadratic equation and taking the minimum root corresponding to the minimization of the objective (10) produces the expression:
$\lambda_{min} = \dfrac{1}{2}\left(\dfrac{c_{xx}}{s_{xx}} + \dfrac{c_{yy}}{s_{yy}}\right) - \sqrt{\dfrac{1}{4}\left(\dfrac{c_{xx}}{s_{xx}} + \dfrac{c_{yy}}{s_{yy}}\right)^2 - \dfrac{c_{xx}c_{yy} - c_{xy}^2}{s_{xx}s_{yy}}}$  (14)
With the minimum eigenvalue (14), the eigenvector can be found. Substituting (14) into one of the equations of the homogeneous system (12), for example, the first one, yields:
$\left(\dfrac{c_{xx}}{s_{xx}} - \lambda_{min}\right)\alpha_1 + \dfrac{c_{xy}}{s_{xx}}\alpha_2 = 0$  (15)
The elements $\alpha_1$ and $\alpha_2$ of the eigenvector α (11) are defined up to a constant; thus, for their identifiability, we can fix one of them, for instance, $\alpha_2 = 1$. Then, the expression (15) reduces to
$\left[\dfrac{1}{2}\left(\dfrac{c_{xx}}{s_{xx}} - \dfrac{c_{yy}}{s_{yy}}\right) + \sqrt{\dfrac{1}{4}\left(\dfrac{c_{xx}}{s_{xx}} - \dfrac{c_{yy}}{s_{yy}}\right)^2 + \dfrac{c_{xy}^2}{s_{xx}s_{yy}}}\,\right]\alpha_1 + \dfrac{c_{xy}}{s_{xx}} = 0$  (16)
with the solution for the unknown element $\alpha_1$ of the eigenvector as follows:
$\alpha_1 = \left[\dfrac{1}{2}\left(\dfrac{c_{xx}}{s_{xx}} - \dfrac{c_{yy}}{s_{yy}}\right) - \sqrt{\dfrac{1}{4}\left(\dfrac{c_{xx}}{s_{xx}} - \dfrac{c_{yy}}{s_{yy}}\right)^2 + \dfrac{c_{xy}^2}{s_{xx}s_{yy}}}\,\right] \Big/ \dfrac{c_{xy}}{s_{yy}}$  (17)
Then, the general linear relation (6) for the centered variables, expressed as the dependence of y by x (for the sake of comparison with the models (2), (4), and (5)), becomes:
$y = m_y + \left[\sqrt{\dfrac{1}{4}\left(\dfrac{c_{xx}}{s_{xx}} - \dfrac{c_{yy}}{s_{yy}}\right)^2 + \dfrac{c_{xy}^2}{s_{xx}s_{yy}}} - \dfrac{1}{2}\left(\dfrac{c_{xx}}{s_{xx}} - \dfrac{c_{yy}}{s_{yy}}\right)\right] \Big/ \dfrac{c_{xy}}{s_{yy}} \cdot (x - m_x)$  (18)
This is the generalized eigenproblem solution for an invertible relation between two measured variables with different errors.
In the special case of the equal variances of the errors, sxx = syy, Equation (18) reduces to the following expression:
$y = m_y + \dfrac{c_{yy} - c_{xx} + \sqrt{(c_{yy} - c_{xx})^2 + 4 c_{xy}^2}}{2 c_{xy}} (x - m_x)$  (19)
Equation (19) defines the orthogonal regression, which is obtained from the general solution (18) under the assumption of equal precision (measured by the variances) in the observations of both variables.
Another special case is defined by the proportion:
$\dfrac{c_{xx}}{s_{xx}} = \dfrac{c_{yy}}{s_{yy}}, \quad \text{or} \quad \dfrac{c_{xx}}{c_{yy}} = \dfrac{s_{xx}}{s_{yy}}$  (20)
This means that the quotient of a variable’s variance by its error’s variance is equal for both variables. The same proportion holds if the ratio of the variances of the two variables equals the ratio of the variances of their errors. Under assumption (20), Equation (18) simplifies to:
$y = m_y + \dfrac{\sqrt{c_{xy}^2}}{c_{xy}}\sqrt{\dfrac{s_{yy}}{s_{xx}}}\,(x - m_x) = m_y + \mathrm{sgn}(c_{xy})\sqrt{\dfrac{c_{yy}}{c_{xx}}}\,(x - m_x)$  (21)
The covariance and correlation are of the same sign, and the standard deviation equals the square root of the variance, $\sigma_x = \sqrt{c_{xx}}$ (and similarly for the other variable); therefore, the result (21) coincides with Equation (5) for diagonal regression. The same results (19)–(21) can be obtained by the maximum likelihood method based on the normal distribution of errors [9,10], but the considered approach dealing with the quadratic forms (6)–(11) does not depend on a specific distribution of the errors.
Another reason for choosing the diagonal regression when the errors of observation are unknown could be as follows: let us estimate the error variance of each variable using the residual variance of this variable regressed on the other available predictor. In the case of two variables, the residual variance of the model (1)–(2) equals the variance of the dependent variable multiplied by the term 1 − r², where r is the coefficient of pair correlation, and a similar relation holds for the second model (3)–(4):
$s_{yy} = c_{yy}(1 - r^2), \quad s_{xx} = c_{xx}(1 - r^2)$  (22)
From (22), the first proportion in (20) can be derived; thus, under the assumption that the residual variances serve as estimates of the unobserved error variances, the diagonal regression (21) is obtained.
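This point can be checked numerically. The sketch below, on simulated data with illustrative parameter values (not taken from the paper), plugs the residual variances (22) into the eigenproblem (12), takes the eigenvector of the minimum eigenvalue, and recovers the diagonal regression slope (21):

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated data (illustrative only) with errors in both variables.
x = rng.normal(0.0, 2.0, 400)
y = -1.3 * x + rng.normal(0.0, 1.5, 400)
xc, yc = x - x.mean(), y - y.mean()

cxx, cyy, cxy = xc @ xc, yc @ yc, xc @ yc
r2 = cxy**2 / (cxx * cyy)

# Error variances estimated by the residual variances, Equation (22)
sxx, syy = cxx * (1.0 - r2), cyy * (1.0 - r2)

# Regular eigenproblem (12); take the eigenvector of the minimum eigenvalue
M = np.array([[cxx / sxx, cxy / sxx],
              [cxy / syy, cyy / syy]])
vals, vecs = np.linalg.eig(M)
a1, a2 = vecs[:, np.argmin(vals.real)].real
slope_implicit = -a1 / a2                       # from alpha1*x + alpha2*y = 0

# Diagonal (geometric mean) regression slope, Equation (21)
slope_diagonal = np.sign(cxy) * np.sqrt(cyy / cxx)
print(slope_implicit, slope_diagonal)           # the two slopes coincide
```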

3. Modeling for Many Variables

In a general case of many variables, an implicit function of their connection can be presented as a series in the variables with linear, quadratic, and higher-order power terms and mixed effects. All these terms can be considered as new additional variables, and the function can be written as a general linear form:
$\alpha_0 + \alpha_1 z_{1i} + \alpha_2 z_{2i} + \dots + \alpha_n z_{ni} = 0$  (23)
where $z_{ji}$ denotes the jth theoretical unobserved variable without errors (j = 1, 2, …, n, with n the total number of variables) in the ith observation, which is an extension of Equation (6) from two to many variables. The observed variables at each ith point (i = 1, 2, …, N, with N the total number of observations in the sample) can be defined as:
$x_{ij} = z_{ji} + \varepsilon_{ij}$  (24)
in which $\varepsilon_{ij}$ is the error in the ith observation of the jth variable, and the errors of each variable are assumed to have zero mean value. Substituting (24) into (23), and taking into account the identical relation (23), yields:
$\alpha_0 + \alpha_1 x_{i1} + \dots + \alpha_n x_{in} = \alpha_1 \varepsilon_{i1} + \dots + \alpha_n \varepsilon_{in}$  (25)
Suppose only one variable is observed with errors, for example, the first variable: $\varepsilon_{i1} \neq 0$, but all other errors equal zero. Dividing by the first alpha coefficient $\alpha_1$ reduces (25) to the explicit model for x1 with the new parameters a:
$x_{i1} = -\dfrac{\alpha_0}{\alpha_1} - \dfrac{\alpha_2}{\alpha_1}x_{i2} - \dots - \dfrac{\alpha_n}{\alpha_1}x_{in} + \varepsilon_{i1} \equiv a_0 + a_2 x_{i2} + \dots + a_n x_{in} + \varepsilon_{i1}$  (26)
Then, we can build the multiple linear OLS regression of the outcome x1 on the other variables. The solution of this model (26) with the centered variables can be presented as the inverted covariance matrix of the n − 1 predictors multiplied by the vector of their covariances with the outcome variable x1; the intercept can be found from this equation at the point of the mean values of the variables. Assuming instead the error in another variable leads to its own regression on the remaining n − 1 variables.
All possible n regressions of each variable on the other predictors can be obtained in one matrix inversion for all n variables. Consider the matrix X of the order N-by-n with elements $x_{ij}$ of the ith observation of the jth variable. Suppose the variables are centered, so there are no intercepts in the models. Then, the matrix C = X′X is the sample variance–covariance matrix of the nth order (a generalization of the matrix in (8)). Let C−1 be the inverted matrix, with its elements denoted by the upper indices, (C−1)jk = cjk (these are proportional to the cofactors of the elements of the covariance matrix), and the diagonal elements presented in the diagonal matrix D:
$D = \mathrm{diag}(C^{-1})$  (27)
As is known, the matrix C−1 contains the parameters of all n regressions of each variable on all the others. More specifically, any of the models of one variable by the rest of them can be written in the so-called Yule’s notation as follows:
$x_j = a_{j0} + a_{j1.}x_1 + a_{j2.}x_2 + \dots + a_{j,j-1.}x_{j-1} + a_{j,j+1.}x_{j+1} + \dots + a_{jn.}x_n + \varepsilon_j$  (28)
where ajk. denotes the coefficient of the jth regression for the kth variable, and the dot denotes the rest of the variables [9,40]. The non-diagonal elements in any jth row of the matrix C−1, divided by the diagonal element in the same row and taken with opposite signs, coincide with the coefficients of the regression of xj on all the other xs (28), which can be presented as the following matrix of the parameters of all the models:
$A \equiv D^{-1}C^{-1} = \begin{pmatrix} c^{11}/c^{11} & c^{12}/c^{11} & \cdots & c^{1n}/c^{11} \\ c^{21}/c^{22} & c^{22}/c^{22} & \cdots & c^{2n}/c^{22} \\ \vdots & \vdots & \ddots & \vdots \\ c^{n1}/c^{nn} & c^{n2}/c^{nn} & \cdots & c^{nn}/c^{nn} \end{pmatrix} = \begin{pmatrix} 1 & -a_{12.} & \cdots & -a_{1n.} \\ -a_{21.} & 1 & \cdots & -a_{2n.} \\ \vdots & \vdots & \ddots & \vdots \\ -a_{n1.} & -a_{n2.} & \cdots & 1 \end{pmatrix}$  (29)
Additionally, the product Am of matrix A (29) by the vector of mean values m of all predictors coincides with the vector of intercepts a0 for all regressions (28):
$a_{j0} = m_j - (a_{j1.}m_1 + a_{j2.}m_2 + \dots + a_{j,j-1.}m_{j-1} + a_{j,j+1.}m_{j+1} + \dots + a_{jn.}m_n)$  (30)
The product of the data matrix X with the transposed matrix (29) yields the total matrix E of the order N-by-n consisting of the residual errors $\varepsilon_{ij}$ in all ith observations of all jth regression models (28):
$E = XA' = XC^{-1}D^{-1}$  (31)
Then, the matrix of the errors’ variance–covariances can be found by the expression:
$S = E'E = D^{-1}C^{-1}X'XC^{-1}D^{-1} = D^{-1}C^{-1}D^{-1}$  (32)
because $X'XC^{-1} = I$ is the identity matrix. Under the assumption of the errors’ independence across different outcome variables, the matrix (32) can be reduced to its diagonal:
$S = \mathrm{diag}(S) = \mathrm{diag}(D^{-1}C^{-1}D^{-1}) = D^{-1}$  (33)
Notably, the values (27) are the so-called variance inflation factors (VIFs), and their reciprocal values (33) are the residual sums of squares in the regressions of each variable xj by the rest of them:
$S_{jj} = D_{jj}^{-1} = \mathrm{VIF}_j^{-1} = c_{jj}(1 - R_{j.}^2)$  (34)
where $R_{j.}^2$ is the coefficient of multiple determination in the model of each xj by all the other variables. In the case n = 2, Equation (34) reduces to the expression (22), because for two variables the coefficients of determination coincide with the squared pair correlation, $R_{1.}^2 = R_{2.}^2 = r^2$.
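A short numpy sketch of these relations, on simulated data used purely for illustration, builds the matrix A of Equation (29) from the inverted covariance matrix, checks one of its rows against a direct OLS fit, and verifies that the residual sums of squares equal the reciprocals of the diagonal of C−1, as in Equation (34):

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative data: four centered variables with some interdependence.
N, n = 300, 4
X = rng.normal(size=(N, n))
X[:, 0] += 0.8 * X[:, 1] - 0.5 * X[:, 2]
X = X - X.mean(axis=0)

C = X.T @ X                       # second-moment (covariance-type) matrix
Cinv = np.linalg.inv(C)
D = np.diag(np.diag(Cinv))        # D = diag(C^{-1}), Equation (27)
Dinv = np.linalg.inv(D)

A = Dinv @ Cinv                   # Equation (29); off-diagonal entries are -a_jk.
E = X @ Cinv @ Dinv               # residuals of all n regressions, Equation (31)

# Check one row of A against a direct OLS fit of x1 on the other columns:
b = np.linalg.lstsq(X[:, 1:], X[:, 0], rcond=None)[0]
print(-A[0, 1:])                  # regression coefficients from C^{-1}
print(b)                          # the same coefficients from lstsq

# Residual sums of squares equal the reciprocals of diag(C^{-1}), Equation (34):
print(np.sum(E**2, axis=0))
print(1.0 / np.diag(Cinv))
```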
Similar to obtaining the geometric mean pair regression (5) from the slopes of the two regressions (2) and (4), if one variable xj is expressed from the regressions (28) of all the other predictors using the coefficients (29), it is easy to see that all these models would coincide if the following relations for the coefficients (29) hold for all the variables:
$a_{jk.}\dfrac{1}{a_{kj.}} = \dfrac{c^{jk}}{c^{jj}}\dfrac{c^{kk}}{c^{kj}} = \dfrac{c^{kk}}{c^{jj}} \equiv (\tilde{a}_{jk.})^2$  (35)
where the transposed elements of the symmetric inverted matrix are equal, $c^{jk} = c^{kj}$, and the coefficient with a tilde denotes the constructed parameter of the geometric mean regression model. These parameters can be defined from (35) as $\tilde{a}_{jk.} = (c^{kk}/c^{jj})^{1/2}$. Using them in Equation (28) for the centered variables yields:
$x_j = \sqrt{\dfrac{c^{11}}{c^{jj}}}\,\mathrm{sgn}(c_{j1})\,x_1 + \sqrt{\dfrac{c^{22}}{c^{jj}}}\,\mathrm{sgn}(c_{j2})\,x_2 + \dots + \sqrt{\dfrac{c^{nn}}{c^{jj}}}\,\mathrm{sgn}(c_{jn})\,x_n$  (36)
This expression presents a heuristic generalization of the geometric mean (5), or diagonal regression (21), to the case of many variables. Recalling that, by Cramer’s rule, a diagonal element $c^{jj}$ of the inverted matrix equals the quotient of the first minor (the determinant of the matrix $C_{(-j)}$ without the jth row and column, which equals the cofactor of the diagonal element) and the determinant of the whole matrix, we can write the relation:
$c^{jj} = \det C_{(-j)} / \det C$  (37)
Substituting this equation into (36) and canceling the total determinant, we can rewrite this equation in a more symmetric form:
$\sqrt{\det C_{(-j)}}\,x_j = \sqrt{\det C_{(-1)}}\,\mathrm{sgn}(c_{j1})\,x_1 + \dots + \sqrt{\det C_{(-n)}}\,\mathrm{sgn}(c_{jn})\,x_n$  (38)
This expression has an interesting interpretation in terms of the generalized variance, defined as the determinant of the covariance matrix [41]: the value of each jth coefficient of the diagonal regression (38) equals the square root of the generalized variance of all the variables additional to the variable xj.
Let us return to the general relation (25) and rewrite it for the centered variables in the matrix form as X α = E α , where E is the matrix of the unknown errors by all observations in the model (25). The scalar products of these vectors by themselves yield the expression:
$\alpha'X'X\alpha = \alpha'E'E\alpha$  (39)
where the matrices C = X′X and S = E′E are the nth order extensions of the second-order matrices in (8) and (9). For solving this problem for the vector α of n parameters, we can find the minimum of the quadratic form on the left-hand side of (39) subject to the normalizing condition on its right-hand side (needed for the identifiability of the parameters, which are defined up to an arbitrary constant). It corresponds to presenting (39) as the Rayleigh quotient of two quadratic forms, $\alpha' C \alpha / \alpha' S \alpha$, reduced to the conditional least squares problem (10), and then to the generalized eigenproblem (11) for the nth order covariance matrices C and S of the variables and their errors, respectively.
For unknown errors of observations, let us estimate the error’s variance for each variable as the residual variance of this variable by all the rest of predictors, as it is considered in the case of two variables in the relation (22). For the case of independent errors, we substitute the diagonal matrix of variances (33) into the generalized eigenproblem (11), which produces the following relation:
$C\alpha = \lambda D^{-1}\alpha$  (40)
For the more general case of the possibly correlated errors, we substitute the variance–covariance matrix (32) into the generalized eigenproblem (11) which yields the expression:
$C\alpha = \mu D^{-1}C^{-1}D^{-1}\alpha$  (41)
where the eigenvalue is renamed to µ to distinguish it from eigenvalue λ in the problem (40). The matrix D is defined by the inverted matrix C−1 (27); therefore, both last generalized eigenproblems can be expressed via the variance–covariance matrix of the variables. Transforming (40) to the regular eigenproblem yields the equation:
$(DC)\alpha = \lambda\alpha$  (42)
and a similar transformation of (41) leads to the equation:
$(DC)^2\alpha = \mu\alpha$  (43)
Multiplying the problem (42) by the matrix DC yields the relation:
$(DC)(DC)\alpha = \lambda(DC)\alpha = \lambda^2\alpha$  (44)
Therefore, the eigenvalues in (43) equal the squared eigenvalues in (42), $\lambda^2 = \mu$. However, the sets of eigenvectors of both problems (42) and (43) coincide; thus, the eigenvector solution related to the minimum eigenvalue in either of these two problems is the same. It is an unexpected result that this solution does not depend on the assumption of non-correlated or correlated errors used in (40) and (41), respectively. Therefore, it is sufficient to solve the simpler problem (42) which, utilizing the definition (27), can be presented as follows:
$(\mathrm{diag}(C^{-1})\,C)\,\alpha = \lambda\alpha$  (45)
The solution (45) for the eigenvector related to the minimum eigenvalue $\lambda_{min}$ defines the functional implicit multiple linear regression (23).
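The coincidence of the eigenvectors of problems (42) and (43) can be checked numerically. The sketch below uses randomly simulated data (an illustrative assumption, not the paper's data) and compares the minimum-eigenvalue eigenvectors of DC and (DC)², normalized by their first elements:

```python
import numpy as np

rng = np.random.default_rng(4)

# Illustrative data with a near-linear relation among the columns.
X = rng.normal(size=(200, 5))
X[:, 0] = X[:, 1] + X[:, 2] + 0.1 * rng.normal(size=200)
X = X - X.mean(axis=0)

C = X.T @ X
D = np.diag(np.diag(np.linalg.inv(C)))

M = D @ C                         # problem (42), non-correlated errors
M2 = M @ M                        # problem (43), correlated errors

w1, v1 = np.linalg.eig(M)
w2, v2 = np.linalg.eig(M2)
a1 = v1[:, np.argmin(w1.real)].real
a2 = v2[:, np.argmin(w2.real)].real

# The same eigenvector (up to scale and sign), so both error assumptions
# lead to one and the same implicit model:
print(a1 / a1[0])
print(a2 / a2[0])
```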
Suppose all diagonal elements of the matrix C−1, that is, the errors’ variances, are the same; then diag(C−1) is a scalar matrix, and the problem (45) is essentially the eigenproblem of a covariance matrix. Then, its maximum $\lambda_{max}$ and several next largest eigenvalues define the main eigenvectors used in the well-known principal component analysis (PCA), and in the regression modeling on the PCA scores employed in place of the original predictors for multicollinearity reduction. On the other hand, $\lambda_{min}$ of the covariance, or correlation, matrix defines the eigenvector which presents the coefficients of the orthogonal regression [5]. Although they are based on the same eigenproblem, the PCA and regression on PCA differ from the orthogonal regression, as well as from the functional implicit regression in the case of different variances of errors.
Problem (45) presents a generalization of the solution (16)–(18) to multiple models with n > 2 variables. Such a model is symmetric in all the variables, and any one of them can be expressed via the others from this unique equation. The eigenproblem (45) with n variables reduces to the explicit eigenproblem (12) if n = 2. The errors’ variances sxx and syy in (12) are extended to many variables with the errors’ variances defined by (34) in the general case of n variables.
The eigenproblem (45) of a non-symmetric matrix can be transformed into the eigenproblem of a symmetric matrix, which is more convenient in numerical calculations because it provides a higher precision of estimation. For this aim, let us rewrite problem (42) as the following expression:
$(D^{1/2}CD^{1/2})\,D^{-1/2}\alpha = \lambda D^{-1/2}\alpha$  (46)
where D1/2 is the square root of the diagonal matrix D (27). The problem (46) can be represented as the following eigenproblem:
$B\beta = \lambda\beta$  (47)
where the symmetric matrix, B, and its eigenvectors are defined by the relations:
$B = D^{1/2}CD^{1/2}, \quad \beta = D^{-1/2}\alpha$  (48)
The regular eigenproblem of the symmetric matrix (47) can be solved by most modern statistical and mathematical software packages. When the eigenvector β for the minimum eigenvalue in (47) is found, the original eigenvector α can be obtained from (48) by the inverse transformation:
$\alpha = D^{1/2}\beta$  (49)
Taking into account the relations (27) and (37), we can find each jth element of the diagonal matrix in (49):
$(D^{1/2})_{jj} = \sqrt{c^{jj}} = \sqrt{\dfrac{\det C_{(-j)}}{\det C}}$  (50)
Then, the original estimated model (25) with parameter alpha (49) can be presented for the centered variables as follows:
$\sqrt{\det C_{(-1)}}\,\beta_1 x_1 + \sqrt{\det C_{(-2)}}\,\beta_2 x_2 + \dots + \sqrt{\det C_{(-n)}}\,\beta_n x_n = 0$  (51)
where the common constant of the determinant det(C) (50) is cancelled. In the case of an ill-conditioned or even singular sample covariance matrix C, the inverted matrix may not exist, but it is still possible to present the solution (51) via the determinants of the matrices without each individual variable.
The functional implicit linear regression (51) obtained with the errors assumed by all variables can be solved with respect to any one variable xj, similar to the heuristically constructed diagonal regression (38). Both models (38) and (51) have parameters proportional to the square root of the generalized variances; however, in place of the sign functions in (38) there are coefficients β (51) of the eigenproblem (47). Finding one of two possible signs for each variable presents an additional problem in building the model (38), while the coefficients β are uniquely defined in solving the eigenproblem (47) for the minimum eigenvalue λ m i n .
As a characteristic of quality, the minimum eigenvalue $\lambda_{min}$ of the standardized matrix B (48) can serve as a measure of the residual sum of squares $S_{resid}^2$, taken by the shortest distances from the observations to the hyperplane of the orthogonal regression in the metric of the matrix $D^{1/2}$. An analogue of the coefficient of determination R² in multiple regression can also be constructed as one minus the quotient of the minimum eigenvalue to the mean of all eigenvalues:
$S_{resid}^2 = \lambda_{min}, \quad R^2 = 1 - \dfrac{\lambda_{min}}{\frac{1}{n}\sum_{j=1}^{n}\lambda_j}$  (52)
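The whole procedure of Equations (45)–(52) fits in a few lines of numpy. The function below is only a sketch, not the author's code; the name implicit_linear_relation and the set of returned quantities are illustrative choices.

```python
import numpy as np

def implicit_linear_relation(X):
    """Sketch of the eigenvector solution (45)-(52): returns the intercept,
    the coefficient vector alpha of the implicit relation, the minimum
    eigenvalue, and the pseudo R-squared of Equation (52)."""
    Xc = X - X.mean(axis=0)
    C = Xc.T @ Xc
    d_sqrt = np.sqrt(np.diag(np.linalg.inv(C)))      # diagonal of D^{1/2}, (27)

    B = d_sqrt[:, None] * C * d_sqrt[None, :]        # B = D^{1/2} C D^{1/2}, (48)
    lam, beta = np.linalg.eigh(B)                    # symmetric eigenproblem (47)
    alpha = d_sqrt * beta[:, 0]                      # alpha = D^{1/2} beta, (49)

    lam_min = lam[0]                                 # eigenvalues are ascending
    r2 = 1.0 - lam_min / lam.mean()                  # quality measures, (52)
    intercept = -alpha @ X.mean(axis=0)              # free term of relation (23)
    return intercept, alpha, lam_min, r2
```

The vector α returned by such a routine is defined only up to an arbitrary scale and sign, as noted above for the eigenvector elements.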

4. Numerical Examples

For the first example, consider the simulated data for the conic section curve defined by the implicit equation of the second order:
$x^2 + 2xy + y^2 + 2x + 10y = 0$  (53)
This corresponds to the parabola with its axis of symmetry extending at the angle −45° to the x-axis. Solving this quadratic equation with respect to the y variable yields two branches of this parabola,
$y_1 = -x - 5 + \sqrt{8x + 25}, \quad y_2 = -x - 5 - \sqrt{8x + 25}$  (54)
which are shown in Figure 1.
Given a set of numeric data on the x and y variables, it is still a problem to restore such a parabola equation because it has not unique but dual values of y at each x, so a regression of y by x could produce a line along the axis of symmetry between the branches. For the same reason, a quadratic regression of x by y also faces the problem of dual x values at the same y on the upper branch, which would distort the model. Separate modeling of the branches (54) requires splitting the data related to each branch and applying nonlinear estimation.
If the numeric values of the x variable and of the y variable defined in (54) are obtained with additional noise, the adequate restoration of the original implicit Equation (53) becomes more difficult. The higher the variability of the errors, the more blurred the scatter plot of the original function hidden behind the empirical data.
For an example with the implicit function (53), the data on x were taken from −3.1 to 6.0 with the step 0.1, for the sample size N = 92, and used to build the values of the functions (54). Random noise of the normal distribution N(0, 1), with zero mean value and standard deviation std = 1, was added to the x and y variables. These values are presented in the scatter plot shown in Figure 2.
To restore the original function from the data obtained with errors, we can rewrite Equation (53) as a generalized linear implicit function:
$a_0 + a_1 u_1 + a_2 u_2 + a_3 u_3 + a_4 u_4 + a_5 u_5 = 0$  (55)
where the notations for the new variables, uj, are:
$u_1 = x^2, \quad u_2 = xy, \quad u_3 = y^2, \quad u_4 = x, \quad u_5 = y$  (56)
The coefficients in (55) for the original implicit function (53) are as follows:
$a_0 = 0, \quad a_1 = 1, \quad a_2 = 2, \quad a_3 = 1, \quad a_4 = 2, \quad a_5 = 10$  (57)
Applying the approach considered above to evaluate these parameters from the data distorted with errors, we can build each variable uj's regression on the other u-variables, find their residual variances, and use them as the errors’ estimates in the generalized eigenproblem. The obtained results are shown in Table 1, whose columns present the intercept and parameters of the original function (53), the five OLS regressions (29), and the general eigenvector solution (51).
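A sketch of this simulation is given below; it regenerates the noisy parabola data and applies the eigenvector solution (45)–(51) to the five u-variables. Because the random noise realization (and possibly the exact sample construction) differs from the paper's, the resulting coefficients approximate, but do not exactly reproduce, the values in Tables 1 and 2.

```python
import numpy as np

rng = np.random.default_rng(6)

# Grid of x values from -3.1 to 6.0 with step 0.1 (92 points, as in the paper);
# both branches (54) are generated here, so the exact sample construction may
# differ from the paper's.
x0 = np.linspace(-3.1, 6.0, 92)
y0 = np.concatenate([-x0 - 5 + np.sqrt(8 * x0 + 25),
                     -x0 - 5 - np.sqrt(8 * x0 + 25)])
x0 = np.concatenate([x0, x0])

# Add the normal random noise N(0, 1) to both coordinates.
x = x0 + rng.normal(0.0, 1.0, x0.size)
y = y0 + rng.normal(0.0, 1.0, y0.size)

# New variables (56): u1 = x^2, u2 = xy, u3 = y^2, u4 = x, u5 = y.
U = np.column_stack([x**2, x * y, y**2, x, y])

# Eigenvector solution (45)-(51) for the implicit relation among the u's.
Uc = U - U.mean(axis=0)
C = Uc.T @ Uc
d = np.sqrt(np.diag(np.linalg.inv(C)))
lam, beta = np.linalg.eigh(d[:, None] * C * d[None, :])
alpha = d * beta[:, 0]
a0 = -alpha @ U.mean(axis=0)

# Normalized by the u1 coefficient; compare with the Original Function and
# Eigenvector columns of Table 2 (values vary with the noise realization).
print(np.append(a0, alpha) / alpha[0])
```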
The last two rows in Table 1 show the residual variance S2resid and the coefficient of multiple determination R2 for each OLS regression, as well as their analogues for the eigenvector solution (52). For the original exact function (53), S2resid = 0, and R2 = 1; therefore, by comparison, it can be determined which models are of a better quality than the others.
Table 2 presents parameters in each model divided by the coefficient for the first variable, which is more convenient for comparison across the models. It is easy to see that some models reconstruct the original model more accurately than the other models.
To find a measure of this closeness, we calculate pair correlations between the solutions. The correlations of the vectors in the columns of Table 2 (except for the intercepts, which depend on the slope parameters) are presented in Table 3.
Table 3 shows that many models are highly correlated with the original function and among themselves. The last row of the mean correlations also indicates that the OLS models for u2, u3, and u5 (56), as well as the eigenvector solution, are the best performers, with high correlations with the original function.
Using any obtained solution for regression parameters from Table 1, we can calculate the values of the parabola (55) and (56) branches by the formula:
$y_{1,2} = \dfrac{-(a_2 x + a_5) \pm \sqrt{(a_2 x + a_5)^2 - 4 a_3 (a_1 x^2 + a_4 x + a_0)}}{2 a_3}$  (58)
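For completeness, a small helper evaluating both branches of Equation (58) from a fitted coefficient vector could look as follows (the function name parabola_branches is a hypothetical choice, not from the paper):

```python
import numpy as np

def parabola_branches(a, x):
    """Both branches y1, y2 of Equation (58) for a fitted coefficient vector
    a = (a0, a1, a2, a3, a4, a5) of the implicit model (55)-(56)."""
    a0, a1, a2, a3, a4, a5 = a
    disc = (a2 * x + a5) ** 2 - 4.0 * a3 * (a1 * x**2 + a4 * x + a0)
    root = np.sqrt(np.clip(disc, 0.0, None))   # guard against tiny negative values
    y1 = (-(a2 * x + a5) + root) / (2.0 * a3)
    y2 = (-(a2 * x + a5) - root) / (2.0 * a3)
    return y1, y2
```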
For illustration, using parameters of the solution by the eigenvector in the last column of Table 1, we can restore the parabola from the data with random noise—Figure 3 presents the original parabola and the restored curve.
The results of the eigenvector solution with the data distorted by the random noise yield a good approximation to the original exact data. The difference between the restored and the original values of the function can be estimated by the mean absolute error (MAE) or by the standard deviation (STD). For the results presented in Figure 3, these estimates are MAE = 0.442 and STD = 0.505, which are acceptable values in comparison with the standard deviation of the random normal noise N(0, 1) added to the x and y variables.
For the second numerical example, the dataset on the physicochemical properties of red wine [42] was taken (freely available from the UCI Machine Learning Repository as the Wine Quality dataset, and also from Kaggle as Red Wine Quality). From that total dataset of 1599 observations and 12 variables, the following 10 variables describing chemical features were used: x1—fixed acidity; x2—volatile acidity; x3—citric acid; x4—residual sugar; x5—chlorides; x6—free sulfur dioxide; x7—total sulfur dioxide; x8—pH; x9—sulphates; and x10—alcohol.
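A sketch of applying the eigenvector solution to these data is shown below; it assumes the semicolon-separated file winequality-red.csv from the UCI repository has been downloaded locally, with the standard column names of that dataset. The computed coefficients should be proportional (up to sign and scale) to the eigenvector column of Table 4.

```python
import numpy as np
import pandas as pd

# Assumes the semicolon-separated file winequality-red.csv (UCI Wine Quality,
# red wines) has been downloaded locally with its standard column names.
wine = pd.read_csv("winequality-red.csv", sep=";")
cols = ["fixed acidity", "volatile acidity", "citric acid", "residual sugar",
        "chlorides", "free sulfur dioxide", "total sulfur dioxide",
        "pH", "sulphates", "alcohol"]
X = wine[cols].to_numpy(dtype=float)

# Eigenvector solution (45)-(51) for the implicit relation among x1, ..., x10.
Xc = X - X.mean(axis=0)
C = Xc.T @ Xc
d = np.sqrt(np.diag(np.linalg.inv(C)))
lam, beta = np.linalg.eigh(d[:, None] * C * d[None, :])
alpha = d * beta[:, 0]
a0 = -alpha @ X.mean(axis=0)

# The result should be proportional (up to sign and scale) to the eigenvector
# column of Table 4.
print(np.round(np.append(a0, alpha), 2))
```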
To predict each of these characteristics from the others, it can be taken as the outcome variable in a regression model, so ten such models can be built. These models differ and cannot be reduced from one to another. However, it is possible to assume that all these characteristics are connected among themselves and to construct one implicit relation of their interdependence by which any variable can be expressed via the rest of them. The next three tables are organized similarly to the first three tables (with the exception of the original function, which is not known for these data).
Table 4 presents all ten OLS regression parameters (29) (the columns in this table correspond to the rows in the matrix on the right-hand side of (29)), as well as the general eigenvector solution (51).
Each column in Table 4 shows the intercept and the coefficients of regression, which are given in the implicit form (23); thus, to obtain the actual OLS model (28) in explicit form, the signs of all coefficients are changed. For instance, the model for the outcome x1 is:
$x_1 = 25.11 + 1.83 x_2 + 5.03 x_3 + \dots - 0.13 x_{10}$  (59)
The eigenvector model given in Table 4 is as follows:
$-1.09 + 0.04 x_1 - 0.27 x_2 + \dots + 0.02 x_{10} = 0$  (60)
Each of the OLS regressions presents a model of the dependence of one particular variable on all the others; for example, the model for x1 in Equation (59) can find the values of the dependent variable from the given values of its predictors. In contrast to the OLS models, the generalized linear relation (60) shows how, and in which proportions, all the chemical components in this example are combined in the relation of their interconnection. This can facilitate the examination and interpretation of how the components combine in a product with better properties.
The two bottom rows in Table 4 present the residual variance and the coefficient of multiple determination R2 for each OLS regression model, as well as their analogues for the eigenvector solution (52). Most of the models were of good quality.
For the sake of comparison, we divided the parameters in each model by its coefficient for the first variable, which yielded the results presented in Table 5.
There is a wide variability in the parameters across the models in Table 5; therefore, to estimate the closeness between these solutions, we calculated pair correlations. Table 6 shows the pair correlations of the vectors from the columns in Table 5.
Correlations were determined by the regressions’ coefficients only, without the intercepts. Many of the models are closely related. The bottom row of the mean correlations over the matrix columns in Table 6 indicates that the last model has the highest mean value; thus, the eigenvector solution presents a good compromise between the partial regression models of each variable on the others.

5. Summary

This paper describes the statistical estimation of interconnections among variables presented by implicit functions. In regular regression analysis for a given dataset, there can be many regressions, with different variables taken as the dependent one; however, building an implicit relation yields one unique invertible model of the connection between all the variables. This kind of statistical modeling corresponds to restoring existing physical laws between the variables rather than to approximating the dependence of one outcome variable on the other predictors.
Modeling of the implicit linear relation is expressed as the generalized eigenproblem of the covariance matrix of the variables in the metric of the covariance matrix of their errors. The derivation was performed using relations between the quadratic forms in (6)–(11), (23)–(25), and (39)–(40), which are free from assumptions on a specific distribution of the errors and thus could find wider practical applications.
For unknown errors, this study suggests estimating them via the residual errors of the ordinary least squares regression of each variable on the other ones. For the simple case of two variables, analytical closed-form solutions are presented, including the orthogonal and diagonal, or geometric mean, regressions. In the case of many variables, the generalized eigenproblem can be reduced to the diagonalization of the matrix built from the covariance matrix of the variables and its inverse. As shown in the derivations (40)–(45), an interesting feature of this generalized eigenvector problem is that the results are the same for both cases of correlated and non-correlated estimated errors. The generalized eigenproblem solution holds the scale-invariance property as well.
Numerical examples show that the eigenproblem solution presents a good compromise between all the particular regular regressions. The considered method of modeling with variables containing errors enables a unique equation of the relations between all the variables to be built, which can be useful in practical regression modeling.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data sources are given in the references.

Acknowledgments

I am thankful to the three reviewers for their comments and suggestions which improved the paper.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Draper, N.R.; Smith, H. Applied Regression Analysis, 3rd ed.; Wiley: New York, NY, USA, 1998. [Google Scholar]
  2. Adcock, R.J. Note on the method of least squares. Analyst 1877, 4, 183–184. [Google Scholar] [CrossRef]
  3. Adcock, R.J. A problem in least squares. Analyst 1878, 5, 53–54. [Google Scholar] [CrossRef]
  4. Adcock, R.J. Extension of the method of least squares to any number of variables. Analyst 1880, 7, 22–23. [Google Scholar] [CrossRef]
  5. Pearson, K. On lines and planes of closest fit to systems of points in space. Lond. Edinb. Dublin Philos. Mag. J. Sci. 1901, 2, 559–572. [Google Scholar] [CrossRef]
  6. Koopmans, T.C. Linear Regression Analysis of Economic Time Series; De Erven F. Bohn: Haarlem, The Netherlands, 1937. [Google Scholar]
  7. Deming, W.E. Statistical Adjustment of Data; Wiley: New York, NY, USA, 1943. [Google Scholar]
  8. Jackson, J.D.; Dunlevy, J.A. Orthogonal least squares and the interchangeability of alternative proxy variables in the social sciences. Statistician 1988, 37, 7–14. [Google Scholar] [CrossRef]
  9. Kendall, M.G.; Stuart, A. The Advanced Theory of Statistics: Inference and Relationship, 2nd ed.; Griffin: London, UK, 1983; Chapter 29. [Google Scholar]
  10. Leser, C.E.V. Econometric Techniques and Problems; Griffin: London, UK, 1974; Chapter 2. [Google Scholar]
  11. Frisch, R.A.K. Statistical Confluence Analysis by Means of Complete Regression Systems; Economics Institute, University of Oslo: Oslo, Norway, 1934. [Google Scholar]
  12. Cobb, C.W. Note on Frisch’s diagonal regression. Econometrica 1939, 7, 77–80. [Google Scholar] [CrossRef]
  13. Cobb, C.W. A regression. Econometrica 1943, 11, 265–267. [Google Scholar] [CrossRef]
  14. Xe, S. A property of geometric mean regression. Am. Stat. 2014, 68, 277–281. [Google Scholar]
  15. Durbin, J. Errors in variables. Rev. Int. Stat. Inst. 1954, 22, 23–32. [Google Scholar] [CrossRef]
  16. Creasy, M.A. Confidence limits for the gradient in the linear functional relationship. J. R. Stat. Soc. Ser. B 1956, 18, 65–69. [Google Scholar]
  17. Sprent, P.; Dolby, G.R. Query: The geometric mean functional relationship. Biometrics 1980, 36, 547–550. [Google Scholar] [CrossRef]
  18. Barker, F.; Soh, Y.C.; Evans, R.J. Properties of the geometric mean functional relationship. Biometrics 1988, 44, 279–281. [Google Scholar] [CrossRef]
  19. Draper, N.R.; Yang, Y.F. Generalization of the geometric mean functional relationship. Comput. Stat. Data Anal. 1997, 23, 355–372. [Google Scholar] [CrossRef]
  20. Leng, L.; Zhang, T.; Kleinman, L.; Zhu, W. Ordinary least square regression, orthogonal regression, geometric mean regression and their applications in aerosol science. J. Phys. Conf. Ser. 2007, 78, 012084. [Google Scholar] [CrossRef]
  21. Tofallis, C. Fitting Equations to Data with the Perfect Correlation Relationship; Working Paper Series; University of Hertfordshire Business School: Hertfordshire, UK, 2015. [Google Scholar] [CrossRef]
  22. Bottai, M.; Kim, T.; Lieberman, B.; Luta, G.; Pena, E. On Optimal Correlation-Based Prediction. Am. Stat. 2022, 76, 313–321. [Google Scholar] [CrossRef]
  23. Christensen, R. Comment on “On Optimal Correlation-Based Prediction,” by Bottai et al. (2022). Am. Stat. 2022, 76, 322. [Google Scholar] [CrossRef]
  24. Lipovetsky, S. Comment on “On Optimal Correlation-Based Prediction,” by Bottai et al. (2022). Am. Stat. 2023, 77, 113. [Google Scholar] [CrossRef]
  25. Tan, C.Y.; Iglewicz, B. Measurement-methods comparisons and linear statistical relationship. Technometrics 1999, 41, 192–201. [Google Scholar] [CrossRef]
  26. Kukush, A.; Mandel, I. Does regression approximate the influence of the covariates or just measurement errors? A model validity test. arXiv 2019, arXiv:1911.07556. [Google Scholar]
  27. Francq, B.G.; Govaerts, B.B. Measurement methods comparison with errors-in-variables regressions. From horizontal to vertical OLS regression, review and new perspectives. Chemom. Intell. Lab. Syst. 2014, 134, 123–139. [Google Scholar] [CrossRef]
  28. Yi, G.Y.; Delaigle, A.; Gustafson, P. Handbook of Measurement Error Models; Chapman and Hall/CRC: Boca Raton, FL, USA, 2021. [Google Scholar]
  29. Lipovetsky, S. The review on the book: “Handbook of Measurement Error Models, by Yi, G.Y. et al., Eds.”. Technometrics 2023, 65, 302–304. [Google Scholar] [CrossRef]
  30. Fuller, W.A. Measurement Error Models; Wiley: New York, NY, USA, 1987. [Google Scholar]
  31. Van Huffel, S.; Lemmerling, P. (Eds.) Total Least Squares and Errors-in-Variables Modeling: Analysis, Algorithms and Applications; Kluwer Academic Publishers: Dordrecht, The Netherlands, 2002. [Google Scholar]
  32. Lipovetsky, S. Regression models of implicit functions. Ind. Lab. 1979, 45, 1136–1141. [Google Scholar]
  33. Lipovetsky, S. Unitary response regression models. Int. J. Math. Educ. Sci. Technol. 2007, 38, 1113–1120. [Google Scholar] [CrossRef]
  34. Lipovetsky, S. Supercritical pitchfork bifurcation in implicit regression modeling. Int. J. Artif. Life Res. 2010, 1, 1–9. [Google Scholar] [CrossRef]
  35. Schmidt, M.; Lipson, H. Symbolic regression of implicit equations. In Genetic Programming Theory and Practice VII; Riolo, R., O’Reilly, U.M., McConaghy, T., Eds.; Springer: Boston, MA, USA, 2010; pp. 73–85. [Google Scholar]
  36. Wooten, R.D.; Baah, K.; D’Andrea, J. Implicit Regression: Detecting Constants and Inverse Relationships with Bivariate Random Error. Cornell University Library, 2015. Available online: https://arxiv.org/abs/1512.05307v1 (accessed on 24 August 2023).
  37. Wooten, R.D. Introduction to Implicit Regression: Extending Standard Regression to Rotational Analysis and Non-Response Analysis. Cornell University Library, 2016. Available online: https://arxiv.org/abs/1602.00158v1 (accessed on 24 August 2023).
  38. Miao, Z.; Zhong, J.; Yang, P.; Wang, S.; Liu, D. Implicit neural network for implicit data regression problems. In Neural Information Processing; Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N., Eds.; Springer: Cham, Switzerland, 2021; pp. 187–195. [Google Scholar]
  39. Zhong, J.; Yang, J.; Chen, Y.; Liu, W.-L.; Feng, L. Mining implicit equations from data using gene expression programming. IEEE Trans. Emerg. Top. Comput. 2022, 10, 1058–1074. [Google Scholar] [CrossRef]
  40. Yule, G.U.; Kendall, M.G. An Introduction to the Theory of Statistics; Griffin: London, UK, 1950. [Google Scholar]
  41. Kocherlakota, S.; Kocherlakota, K. Generalized Variance. In Encyclopedia of Statistical Sciences; Johnson, N.L., Kotz, S., Eds.; Wiley: New York, NY, USA, 1983; Volume 3, pp. 354–357. [Google Scholar]
  42. Cortez, P.; Cerdeira, A.; Almeida, F.; Matos, T.; Reis, J. Modeling wine preferences by data mining from physicochemical properties. Decis. Support Syst. 2009, 47, 547–553. [Google Scholar] [CrossRef]
Figure 1. Parabola defined by Equation (53), with its branches defined in relations (54).

Figure 2. Parabola data with the added normal random errors N(0, 1).

Figure 3. The exact original parabola and the curve restored by the eigenvector.
Table 1. Original function, OLS regressions, and eigenvector (parabola example).

| Parameters | Original Function | Model for u1 | Model for u2 | Model for u3 | Model for u4 | Model for u5 | Eigenvector |
|---|---|---|---|---|---|---|---|
| a0 | 0 | −6.570 | −4.117 | 5.391 | 0.081 | 1.130 | 0.127 |
| a1 | 1 | 1.000 | 0.613 | 0.648 | −0.109 | 0.021 | 0.096 |
| a2 | 2 | 0.364 | 1.000 | 1.459 | 0.045 | 0.078 | 0.150 |
| a3 | 1 | 0.106 | 0.400 | 1.000 | 0.010 | 0.074 | 0.088 |
| a4 | 2 | −2.005 | 1.390 | 1.115 | 1.000 | 0.070 | 0.111 |
| a5 | 10 | 0.544 | 3.392 | 11.629 | 0.097 | 1.000 | 0.974 |
| S²resid | 0 | 54.907 | 92.495 | 337.042 | 2.974 | 2.134 | 0.380 |
| R² | 1 | 0.664 | 0.915 | 0.975 | 0.622 | 0.952 | 0.976 |
Table 2. The models normalized by the first coefficients (parabola example).

| Parameters | Original Function | Model for u1 | Model for u2 | Model for u3 | Model for u4 | Model for u5 | Eigenvector |
|---|---|---|---|---|---|---|---|
| a0 | 0 | −6.570 | −6.719 | 8.315 | −0.747 | 53.435 | 1.318 |
| a1 | 1 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| a2 | 2 | 0.364 | 1.632 | 2.250 | −0.412 | 3.701 | 1.554 |
| a3 | 1 | 0.106 | 0.653 | 1.543 | −0.091 | 3.482 | 0.910 |
| a4 | 2 | −2.005 | 2.268 | 1.720 | −9.211 | 3.305 | 1.151 |
| a5 | 10 | 0.544 | 5.535 | 17.937 | −0.897 | 47.286 | 10.120 |
Table 3. Correlations between the normalized solutions (parabola example).

|  | Original Function | Model for u1 | Model for u2 | Model for u3 | Model for u4 | Model for u5 | Eigenvector |
|---|---|---|---|---|---|---|---|
| Original function | 1.000 | 0.181 | 0.978 | 0.996 | 0.054 | 0.994 | 0.996 |
| Model for u1 | 0.181 | 1.000 | 0.012 | 0.242 | 0.975 | 0.234 | 0.258 |
| Model for u2 | 0.978 | 0.012 | 1.000 | 0.956 | −0.132 | 0.953 | 0.958 |
| Model for u3 | 0.996 | 0.242 | 0.956 | 1.000 | 0.124 | 1.000 | 0.999 |
| Model for u4 | 0.054 | 0.975 | −0.132 | 0.124 | 1.000 | 0.119 | 0.134 |
| Model for u5 | 0.994 | 0.234 | 0.953 | 1.000 | 0.119 | 1.000 | 0.998 |
| Eigenvector | 0.996 | 0.258 | 0.958 | 0.999 | 0.134 | 0.998 | 1.000 |
| Mean Correlation | 0.840 | 0.380 | 0.745 | 0.863 | 0.255 | 0.860 | 0.869 |
Table 4. OLS regressions and eigenvector (wine example).

| Parameters | Model for x1 | Model for x2 | Model for x3 | Model for x4 | Model for x5 | Model for x6 | Model for x7 | Model for x8 | Model for x9 | Model for x10 | Eigenvector |
|---|---|---|---|---|---|---|---|---|---|---|---|
| a0 | −25.11 | 0.06 | 0.26 | 1.42 | −0.42 | 24.88 | −213.19 | −3.59 | −0.19 | −6.08 | −1.09 |
| a1 | 1.00 | −0.03 | −0.06 | −0.10 | 0.01 | −0.52 | 6.09 | 0.05 | −0.01 | 0.11 | 0.04 |
| a2 | −1.83 | 1.00 | 0.44 | −0.66 | −0.08 | 9.91 | −37.49 | −0.09 | 0.20 | 0.19 | −0.27 |
| a3 | −5.03 | 0.66 | 1.00 | −0.89 | −0.09 | 11.15 | −53.99 | 0.03 | −0.04 | −1.85 | −0.38 |
| a4 | −0.06 | −0.01 | −0.01 | 1.00 | 0.00 | −0.53 | −1.72 | 0.00 | 0.01 | −0.07 | 0.00 |
| a5 | 6.74 | −0.97 | −0.75 | −1.95 | 1.00 | −9.83 | 76.88 | 0.58 | −1.46 | 6.03 | 0.85 |
| a6 | −0.01 | 0.00 | 0.00 | −0.02 | 0.00 | 1.00 | −1.97 | 0.00 | 0.00 | −0.01 | 0.00 |
| a7 | 0.01 | 0.00 | 0.00 | −0.01 | 0.00 | −0.22 | 1.00 | 0.00 | 0.00 | 0.01 | 0.00 |
| a8 | 5.19 | −0.16 | 0.04 | −0.23 | 0.08 | −7.89 | 39.37 | 1.00 | −0.02 | −1.48 | 0.23 |
| a9 | −0.49 | 0.17 | −0.02 | 0.63 | −0.10 | −1.74 | −8.30 | −0.01 | 1.00 | −1.02 | −0.11 |
| a10 | 0.13 | 0.00 | −0.03 | −0.14 | 0.01 | −0.47 | 5.23 | −0.02 | −0.03 | 1.00 | 0.02 |
| S²resid | 1.02 | 0.02 | 0.01 | 1.81 | 0.00 | 56.15 | 497.92 | 0.01 | 0.02 | 0.87 | 0.38 |
| R² | 0.66 | 0.43 | 0.68 | 0.09 | 0.32 | 0.49 | 0.54 | 0.55 | 0.25 | 0.23 | 0.81 |
Table 5. The models normalized by the first coefficients (wine example).

| Parameters | Model for x1 | Model for x2 | Model for x3 | Model for x4 | Model for x5 | Model for x6 | Model for x7 | Model for x8 | Model for x9 | Model for x10 | Eigenvector |
|---|---|---|---|---|---|---|---|---|---|---|---|
| a0 | −25.11 | −1.82 | −4.27 | −14.55 | −41.78 | −48.04 | −35.03 | −66.23 | 18.57 | −52.93 | −26.38 |
| a1 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| a2 | −1.83 | −30.48 | −7.33 | 6.69 | −8.07 | −19.12 | −6.16 | −1.74 | −19.40 | 1.69 | −6.56 |
| a3 | −5.03 | −20.12 | −16.68 | 9.09 | −9.40 | −21.52 | −8.87 | 0.62 | 3.81 | −16.14 | −9.11 |
| a4 | −0.06 | 0.20 | 0.10 | −10.22 | −0.16 | 1.02 | −0.28 | −0.02 | −0.72 | −0.60 | −0.04 |
| a5 | 6.74 | 29.62 | 12.57 | 19.89 | 100.30 | 18.97 | 12.63 | 10.65 | 141.36 | 52.50 | 20.56 |
| a6 | −0.01 | −0.10 | −0.04 | 0.17 | −0.03 | −1.93 | −0.32 | −0.03 | 0.06 | −0.06 | −0.09 |
| a7 | 0.01 | 0.04 | 0.02 | 0.06 | 0.02 | 0.43 | 0.16 | 0.02 | 0.03 | 0.08 | 0.04 |
| a8 | 5.19 | 4.91 | −0.64 | 2.33 | 8.20 | 15.23 | 6.47 | 18.46 | 2.30 | −12.87 | 5.61 |
| a9 | −0.49 | −5.20 | 0.37 | −6.41 | −10.31 | 3.36 | −1.36 | −0.22 | −96.79 | −8.89 | −2.59 |
| a10 | 0.13 | −0.12 | 0.43 | 1.46 | 1.04 | 0.91 | 0.86 | −0.33 | 2.42 | 8.71 | 0.47 |
Table 6. Correlations between the normalized solutions (wine example).

|  | Model for x1 | Model for x2 | Model for x3 | Model for x4 | Model for x5 | Model for x6 | Model for x7 | Model for x8 | Model for x9 | Model for x10 | Eigenvector |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Model for x1 | 1 | 0.85 | 0.85 | 0.32 | 0.75 | 0.93 | 0.98 | 0.79 | 0.59 | 0.63 | 0.93 |
| Model for x2 | 0.85 | 1 | 0.88 | 0.25 | 0.79 | 0.91 | 0.93 | 0.55 | 0.65 | 0.69 | 0.93 |
| Model for x3 | 0.85 | 0.88 | 1 | 0.13 | 0.72 | 0.88 | 0.9 | 0.36 | 0.49 | 0.76 | 0.88 |
| Model for x4 | 0.32 | 0.25 | 0.13 | 1 | 0.71 | 0.07 | 0.33 | 0.38 | 0.76 | 0.63 | 0.49 |
| Model for x5 | 0.75 | 0.79 | 0.72 | 0.71 | 1 | 0.63 | 0.81 | 0.52 | 0.89 | 0.92 | 0.93 |
| Model for x6 | 0.93 | 0.91 | 0.88 | 0.07 | 0.63 | 1 | 0.95 | 0.69 | 0.41 | 0.51 | 0.87 |
| Model for x7 | 0.98 | 0.93 | 0.9 | 0.33 | 0.81 | 0.95 | 1 | 0.71 | 0.65 | 0.71 | 0.97 |
| Model for x8 | 0.79 | 0.55 | 0.36 | 0.38 | 0.52 | 0.69 | 0.71 | 1 | 0.45 | 0.21 | 0.65 |
| Model for x9 | 0.59 | 0.65 | 0.49 | 0.76 | 0.89 | 0.41 | 0.65 | 0.45 | 1 | 0.82 | 0.78 |
| Model for x10 | 0.63 | 0.69 | 0.76 | 0.63 | 0.92 | 0.51 | 0.71 | 0.21 | 0.82 | 1 | 0.83 |
| Eigenvector | 0.93 | 0.93 | 0.88 | 0.49 | 0.93 | 0.87 | 0.97 | 0.65 | 0.78 | 0.83 | 1 |
| Mean correlation | 0.76 | 0.74 | 0.68 | 0.41 | 0.77 | 0.68 | 0.8 | 0.53 | 0.65 | 0.67 | 0.82 |
