#### **2. Non-Diophantine Arithmetic and Non-Newtonian Calculus**

The most general form of non-Newtonian calculus deals with functions $A$ defined by the commutative diagram ($f\_{\mathbb{X}}$ and $f\_{\mathbb{Y}}$ are arbitrary bijections)

$$\begin{array}{ccc}\mathbb{X} & \stackrel{A}{\longrightarrow} & \mathbb{Y} \\ f\_{\mathbb{X}}\Big\downarrow & & \Big\downarrow f\_{\mathbb{Y}} \\ \mathbb{R} & \stackrel{\tilde{A}}{\longrightarrow} & \mathbb{R} \end{array} \tag{1}$$

The only assumption about the domain $\mathbb{X}$ and the codomain $\mathbb{Y}$ is that they have the same cardinality as the continuum $\mathbb{R}$. This guarantees that the bijections $f\_{\mathbb{X}}$ and $f\_{\mathbb{Y}}$ exist. The bijections are automatically continuous in the topologies they induce from the open-interval topology of $\mathbb{R}$, even if they are discontinuous in metric topologies of $\mathbb{X}$ and $\mathbb{Y}$ (a typical situation in fractal applications, or in cases where $\mathbb{X}$ or $\mathbb{Y}$ are not subsets of $\mathbb{R}$). In general, one does *not* assume anything else about $f\_{\mathbb{X}}$ and $f\_{\mathbb{Y}}$. In particular, their differentiability in the usual (Newtonian) sense is not assumed. No topological assumptions are made about $\mathbb{X}$ and $\mathbb{Y}$. Of course, the structure of the diagram implies that $\mathbb{X}$ and $\mathbb{Y}$ may be regarded as Banach manifolds with global charts $f\_{\mathbb{X}}$ and $f\_{\mathbb{Y}}$, but one does not make the usual assumptions about changes of charts.

Non-Newtonian calculus begins with (generalized, non-Diophantine) arithmetics in X and Y, induced from R,

$$x\_1 \oplus\_{\mathbb{X}} x\_2 = f\_{\mathbb{X}}^{-1}\big(f\_{\mathbb{X}}(x\_1) + f\_{\mathbb{X}}(x\_2)\big), \tag{2}$$

$$x\_1 \ominus\_{\mathbb{X}} x\_2 = f\_{\mathbb{X}}^{-1}\big(f\_{\mathbb{X}}(x\_1) - f\_{\mathbb{X}}(x\_2)\big), \tag{3}$$

$$x\_1 \odot\_{\mathbb{X}} x\_2 = f\_{\mathbb{X}}^{-1}\big(f\_{\mathbb{X}}(x\_1) \cdot f\_{\mathbb{X}}(x\_2)\big), \tag{4}$$

$$x\_1 \oslash\_{\mathbb{X}} x\_2 = f\_{\mathbb{X}}^{-1}\big(f\_{\mathbb{X}}(x\_1) / f\_{\mathbb{X}}(x\_2)\big) \tag{5}$$

(and analogously in Y).
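The induced operations (2)–(5) can be sketched in a few lines of code. The following is a minimal illustration, not part of the original formalism: the class name `Arithmetic` and the choice $\mathbb{X} = \mathbb{R}\_{+}$, $f\_{\mathbb{X}} = \ln$ are assumptions made only for the example.

```python
import math
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Arithmetic:
    """Arithmetic induced on a set X by a bijection f: X -> R, Eqs. (2)-(5).

    The class name and interface are hypothetical, chosen for illustration.
    """
    f: Callable[[float], float]      # the bijection f_X
    f_inv: Callable[[float], float]  # its inverse f_X^{-1}

    def add(self, x1: float, x2: float) -> float:
        return self.f_inv(self.f(x1) + self.f(x2))   # Eq. (2)

    def sub(self, x1: float, x2: float) -> float:
        return self.f_inv(self.f(x1) - self.f(x2))   # Eq. (3)

    def mul(self, x1: float, x2: float) -> float:
        return self.f_inv(self.f(x1) * self.f(x2))   # Eq. (4)

    def div(self, x1: float, x2: float) -> float:
        return self.f_inv(self.f(x1) / self.f(x2))   # Eq. (5)


# Assumed example: X = R_+ with f_X = ln, so that "addition" in X
# is ordinary multiplication of positive numbers.
log_arith = Arithmetic(f=math.log, f_inv=math.exp)

print(log_arith.add(2.0, 3.0))     # f^{-1}(ln 2 + ln 3)
print(log_arith.mul(math.e, 5.0))  # e = f^{-1}(1) acts as the "one" of X
```

With this choice the element $e = f\_{\mathbb{X}}^{-1}(1)$ is neutral for multiplication, which anticipates the arithmetic-dependent notion of "one" discussed below.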

**Example 1.** *According to one of the axioms of standard quantum mechanics, states of a quantum system belong to a separable Hilbert space. All separable Hilbert spaces are isomorphic, so state spaces of any two quantum systems are isomorphic. Does it mean that all quantum systems are equivalent? No, it only shows that mathematically isomorphic structures can play physically different roles. Similarly, the arithmetic given by (2)–(5) is isomorphic to the standard arithmetic of* R*, but it does not imply that the two arithmetics are physically equivalent.*

**Example 2.** *The origin of Einstein's special theory of relativity goes back to the observation that the velocity of a source of light does not influence the velocity of light itself, contradicting our everyday experiences with velocities in trains or football. Relativistic addition of velocities is based on a fundamental unit $c$ and the dimensionless parameter $\beta$, related to velocity by $v = \beta c$. Here $\beta \in \mathbb{X} = (-1, 1)$, while the bijection reads $f\_{\mathbb{X}}(\beta) = \operatorname{arctanh} \beta$. The velocities are added or subtracted by means of (2) and (3),*

$$
\beta\_1 \oplus\_\mathbb{X} \beta\_2 = \tanh(\operatorname{arctanh} \beta\_1 + \operatorname{arctanh} \beta\_2). \tag{6}
$$

*Interestingly, (4) and (5) are not directly employed in special relativity. The presence of the fundamental unit $c$ is a signature of a general non-Diophantine arithmetic (which typically works with dimensionless numbers). The numbers $\pm 1 \in \mathbb{R}$ play the roles of infinities, $\pm 1\_{\mathbb{R}} = \pm\infty\_{\mathbb{X}}$. The velocity of light is therefore literally infinite in the non-Diophantine sense. The neutral element of multiplication, $1\_{\mathbb{X}} = f\_{\mathbb{X}}^{-1}(1) = \tanh 1 \approx 0.76$ (i.e., $v \approx 0.76c$), does not seem to play any privileged role in relativistic physics.*
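Formula (6) can be checked numerically against the textbook composition law for collinear velocities, $(\beta\_1 + \beta\_2)/(1 + \beta\_1\beta\_2)$, which follows from the addition theorem for $\tanh$. The sketch below is only an illustration; the function names are hypothetical.

```python
import math


def beta_add(b1: float, b2: float) -> float:
    """Eq. (6): addition in X = (-1, 1) induced by f_X = arctanh."""
    return math.tanh(math.atanh(b1) + math.atanh(b2))


def beta_add_textbook(b1: float, b2: float) -> float:
    """Standard relativistic composition of collinear velocities (in units of c)."""
    return (b1 + b2) / (1.0 + b1 * b2)


# Neutral element of multiplication in X, 1_X = tanh(1)
one_X = math.tanh(1.0)

for b1, b2 in [(0.5, 0.5), (0.9, 0.9), (-0.3, 0.99)]:
    print(b1, b2, beta_add(b1, b2), beta_add_textbook(b1, b2))
```

The two functions agree to rounding error, and the result never leaves the interval $(-1, 1)$.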

Sometimes, for example in the context of Bell's theorem, one works with mixed arithmetics of the form [13]

$$\mathbf{x}\_{1}\odot\_{\mathbb{Z}}^{\mathbf{XY}}\mathbf{y}\_{2} = f\_{\mathbb{Z}}^{-1}(f\_{\mathbb{X}}(\mathbf{x}\_{1})\cdot f\_{\mathbb{Y}}(\mathbf{y}\_{2})), \quad \odot\_{\mathbb{Z}}^{\mathbf{XY}}\mathbf{:}\mathbb{X}\times\mathbb{Y}\to\mathbb{Z}, \quad \text{etc.} \tag{7}$$

Mixed arithmetics naturally occur in Taylor expansions of functions whose domains and codomains involve different arithmetics, and in the chain rule for derivatives (see Example 6).

In order to define calculus one needs limits "to zero", and thus the notion of zero itself. In the arithmetic context a zero is a neutral element of addition, for example, $x \oplus\_{\mathbb{X}} 0\_{\mathbb{X}} = x$ for any $x \in \mathbb{X}$. Obviously, such a zero is arithmetic-dependent. The same concerns a "one", a neutral element of multiplication, fulfilling $x \odot\_{\mathbb{X}} 1\_{\mathbb{X}} = x$ for any $x \in \mathbb{X}$. Once the arithmetic in $\mathbb{X}$ is specified, both neutral elements are uniquely given by the general formula $r\_{\mathbb{X}} = f\_{\mathbb{X}}^{-1}(r)$ for any $r \in \mathbb{R}$. Therefore, in particular, $0\_{\mathbb{X}} = f\_{\mathbb{X}}^{-1}(0)$, $1\_{\mathbb{X}} = f\_{\mathbb{X}}^{-1}(1)$. One easily verifies that

$$r\_{\mathbb{X}} \oplus\_{\mathbb{X}} s\_{\mathbb{X}} = (r+s)\_{\mathbb{X}}, \tag{8}$$

$$r\_{\mathbb{X}} \odot\_{\mathbb{X}} s\_{\mathbb{X}} = (rs)\_{\mathbb{X}}, \tag{9}$$

for all $r, s \in \mathbb{R}$, which extends also to mixed arithmetics,

$$r\_{\mathbb{X}} \oplus\_{\mathbb{X}}^{\mathbb{XY}} s\_{\mathbb{Y}} = (r+s)\_{\mathbb{X}}, \tag{10}$$

$$r\_{\mathbb{X}} \oplus\_{\mathbb{Y}}^{\mathbb{XY}} s\_{\mathbb{Y}} = (r+s)\_{\mathbb{Y}}, \tag{11}$$

$$r\_{\mathbb{X}} \oplus\_{\mathbb{Z}}^{\mathbb{XY}} s\_{\mathbb{Y}} = (r+s)\_{\mathbb{Z}}, \quad \text{etc.} \tag{12}$$

If there is no danger of ambiguity one can simplify the notation by writing $\oplus\_{\mathbb{X}}^{\mathbb{XX}} = \oplus\_{\mathbb{X}}$ or $\oplus\_{\mathbb{X}}^{\mathbb{XY}} = \oplus\_{\mathbb{X}}^{\mathbb{Y}}$. Mixed arithmetics can be given an interpretation in terms of communication channels. Mixed multiplication is in many respects analogous to a tensor product [13].

**Example 3.** *Consider $\mathbb{X} = \mathbb{R}\_{+}$, $\mathbb{Y} = -\mathbb{R}\_{+}$, $f\_{\mathbb{X}}(x) = \ln x$, $f\_{\mathbb{X}}^{-1}(r) = e^{r}$, $f\_{\mathbb{Y}}(x) = \ln(-x)$, $f\_{\mathbb{Y}}^{-1}(r) = -e^{r}$. "Two plus two equals four" looks here as follows,*

$$2\_{\mathbb{X}} \oplus\_{\mathbb{X}} 2\_{\mathbb{X}} = f\_{\mathbb{X}}^{-1}(2 + 2) = 4\_{\mathbb{X}} = e^{4}, \tag{13}$$

$$2\_{\mathbb{X}} \oplus\_{\mathbb{X}}^{\mathbb{Y}} 2\_{\mathbb{Y}} = f\_{\mathbb{X}}^{-1}(2 + 2) = 4\_{\mathbb{X}} = e^{4}, \tag{14}$$

$$2\_{\mathbb{Y}} \oplus\_{\mathbb{Y}} 2\_{\mathbb{Y}} = f\_{\mathbb{Y}}^{-1}(2 + 2) = 4\_{\mathbb{Y}} = -e^{4}, \tag{15}$$

$$2\_{\mathbb{X}} \oplus\_{\mathbb{Y}}^{\mathbb{X}} 2\_{\mathbb{Y}} = f\_{\mathbb{Y}}^{-1}(2 + 2) = 4\_{\mathbb{Y}} = -e^{4}, \tag{16}$$

*where $2\_{\mathbb{X}} = f\_{\mathbb{X}}^{-1}(2) = e^{2}$, $2\_{\mathbb{Y}} = f\_{\mathbb{Y}}^{-1}(2) = -e^{2}$. From the point of view of communication channels the situation is as follows. There are two parties ("Alice" and "Bob"), each computing by means of her/his own rules. They communicate their results and agree that the numbers they have found are the same, namely, "two" and "four". However, for an external observer (an eavesdropper "Eve"), their results are opposite, say $e^{4}$ and $-e^{4}$. Mixed arithmetic plays the role of a "connection" relating different local arithmetics. This is why, in the terminology of Burgin, these types of arithmetics are non-Diophantine (from Diophantus of Alexandria, who formalized the standard arithmetic). Similarly to nontrivial manifolds, non-Diophantine arithmetics do not have to admit a single global description (which we nevertheless assume in this paper).*
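The four sums of Example 3 can be reproduced with a single helper implementing mixed addition in the spirit of (10)–(12). This is a sketch under the stated bijections; the helper name `mixed_add` and the variable names are hypothetical.

```python
import math

# Bijections of Example 3
f_X, f_X_inv = math.log, math.exp            # X = R_+


def f_Y(y: float) -> float:                  # Y = -R_+
    return math.log(-y)


def f_Y_inv(r: float) -> float:
    return -math.exp(r)


def mixed_add(f_a, f_b, f_out_inv, a, b):
    """Mixed addition: map both arguments to R, add there, map the sum to the target set."""
    return f_out_inv(f_a(a) + f_b(b))


two_X, two_Y = f_X_inv(2.0), f_Y_inv(2.0)    # "two" in X and in Y

alice = mixed_add(f_X, f_X, f_X_inv, two_X, two_X)       # Eq. (13)
mixed_to_X = mixed_add(f_X, f_Y, f_X_inv, two_X, two_Y)  # Eq. (14)
bob = mixed_add(f_Y, f_Y, f_Y_inv, two_Y, two_Y)         # Eq. (15)
mixed_to_Y = mixed_add(f_X, f_Y, f_Y_inv, two_X, two_Y)  # Eq. (16)

# Alice and Bob agree on the label "four"; Eve sees opposite real numbers.
print(f_X(alice), f_Y(bob), alice, bob)
```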

A limit such as $\lim\_{x' \to x} A(x') = A(x)$ is defined by the diagram (1) as follows,

$$\lim\_{x' \to x} A(x') = f\_{\mathbb{Y}}^{-1} \left( \lim\_{r \to f\_{\mathbb{X}}(x)} \tilde{A}(r) \right), \tag{17}$$

i.e., in terms of an ordinary limit in R. A non-Newtonian derivative is then defined by

$$\frac{\mathrm{D}A(x)}{\mathrm{D}x} = \lim\_{\delta \to 0} \big( A(x \oplus\_{\mathbb{X}} \delta\_{\mathbb{X}}) \ominus\_{\mathbb{Y}} A(x) \big) \oslash\_{\mathbb{Y}} \delta\_{\mathbb{Y}} = f\_{\mathbb{Y}}^{-1} \left( \frac{\mathrm{d}\tilde{A}(f\_{\mathbb{X}}(x))}{\mathrm{d}f\_{\mathbb{X}}(x)} \right), \tag{18}$$

if the Newtonian derivative $\mathrm{d}\tilde{A}(r)/\mathrm{d}r$ exists. It is additive,

$$\frac{\mathrm{D}[A(x) \oplus\_{\mathbb{Y}} B(x)]}{\mathrm{D}x} = \frac{\mathrm{D}A(x)}{\mathrm{D}x} \oplus\_{\mathbb{Y}} \frac{\mathrm{D}B(x)}{\mathrm{D}x},\tag{19}$$

and satisfies the Leibniz rule,

$$\frac{\mathrm{D}[A(\mathbf{x})\odot\_{\mathbf{Y}}B(\mathbf{x})]}{\mathrm{D}\mathbf{x}} = \; \left(\frac{\mathrm{D}A(\mathbf{x})}{\mathrm{D}\mathbf{x}} \odot\_{\mathbf{Y}} B(\mathbf{x})\right) \oplus\_{\mathbf{Y}} \left(A(\mathbf{x}) \odot\_{\mathbf{Y}} \frac{\mathrm{DB}(\mathbf{x})}{\mathrm{D}\mathbf{x}}\right). \tag{20}$$

A general chain rule for compositions of functions involving arbitrary arithmetics in domains and codomains can be derived [12] (see Example 6). It implies, in particular, that the bijections defining the arithmetics are themselves always non-Newtonian differentiable (with respect to the derivatives they define). The resulting derivatives are "trivial",

$$\frac{\mathrm{D}f\_{\mathrm{X}}(\mathrm{x})}{\mathrm{D}\mathbf{x}} = 1 = \frac{\mathrm{D}f\_{\mathrm{Y}}(\mathrm{y})}{\mathrm{D}\mathbf{y}}, \quad \frac{\mathrm{D}f\_{\mathrm{X}}^{-1}(\mathrm{r})}{\mathrm{D}\mathbf{r}} = 1\_{\mathrm{X}}, \quad \frac{\mathrm{D}f\_{\mathrm{Y}}^{-1}(\mathrm{r})}{\mathrm{D}\mathbf{r}} = 1\_{\mathrm{Y}}.\tag{21}$$
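The second form of (18) lends itself to a direct numerical sketch: transport $A$ to $\tilde{A} = f\_{\mathbb{Y}} \circ A \circ f\_{\mathbb{X}}^{-1}$, differentiate in $\mathbb{R}$, and map the result back with $f\_{\mathbb{Y}}^{-1}$. The code below is an illustration only; it replaces the Newtonian derivative by a central finite difference, and the function name `nn_derivative` is hypothetical.

```python
import math


def nn_derivative(A, f_X, f_X_inv, f_Y, f_Y_inv, x, h=1e-6):
    """Eq. (18), second form: f_Y^{-1}( dA~/dr at r = f_X(x) ).

    A~ = f_Y o A o f_X^{-1}; d/dr is approximated by a central difference,
    which is an assumption of this sketch, not part of the definition.
    """
    def A_tilde(r):
        return f_Y(A(f_X_inv(r)))

    r = f_X(x)
    return f_Y_inv((A_tilde(r + h) - A_tilde(r - h)) / (2.0 * h))


def identity(r):
    return r


# A(x) = x from X = (R_+, f_X = ln) to Y = (R, identity):
# here A~(r) = e^r, so the non-Newtonian derivative at x equals x.
d_identity_map = nn_derivative(identity, math.log, math.exp, identity, identity, 3.0)

# The bijection f_X = ln itself, viewed as a map X -> R, has derivative 1, cf. Eq. (21).
d_bijection = nn_derivative(math.log, math.log, math.exp, identity, identity, 3.0)

print(d_identity_map, d_bijection)
```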

A non-Newtonian integral is defined by the requirement that, under typical assumptions paralleling those from the fundamental theorem of Newtonian calculus, one finds

$$\frac{\mathrm{D}}{\mathrm{D} \mathbf{x}} \int\_{y}^{\mathrm{x}} A(\mathbf{x}') \mathrm{D} \mathbf{x}' \quad = \quad A(\mathbf{x}), \tag{22}$$

$$\int\_{y}^{x} \frac{\mathcal{D}A(\mathbf{x}')}{\mathcal{D}\mathbf{x}'} \mathcal{D}\mathbf{x}' \;=\;\quad A(\mathbf{x}) \ominus\_{\mathbb{Y}} A(y),\tag{23}$$

which uniquely implies that

$$\int\_{y}^{x} A(x')\,\mathrm{D}x' = f\_{\mathbb{Y}}^{-1} \left( \int\_{f\_{\mathbb{X}}(y)}^{f\_{\mathbb{X}}(x)} \tilde{A}(r)\,\mathrm{d}r \right). \tag{24}$$

Here, as before, $\tilde{A}$ is defined by (1) and $\mathrm{d}r$ denotes the usual Newtonian (Riemann, Lebesgue, etc.) integration. To get a feel for the potential inherent in this simple formula, let us mention that for a Koch-type fractal (24) turns out to be equivalent to the Hausdorff integral [12,36,37]. In applications, typically the only nontrivial element is to find the explicit form of $f\_{\mathbb{X}}$. It should be stressed that (24) reduces any integral to one over a subset of $\mathbb{R}$. The fact that such a counterintuitive possibility exists was noticed already by Wiener in his 1933 lectures on Fourier analysis [38].
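Formula (24) reduces the non-Newtonian integral to an ordinary one, so it can be sketched with any standard quadrature. The example below uses the composite Simpson rule as a stand-in for the Newtonian integral (an assumption of the sketch) and the illustrative choice $A(x) = x$, $f\_{\mathbb{X}} = \ln$, $f\_{\mathbb{Y}} = \mathrm{id}$, for which $\tilde{A}(r) = e^{r}$ and the integral from $y$ to $x$ equals $x - y$.

```python
import math


def nn_integral(A, f_X, f_X_inv, f_Y, f_Y_inv, y, x, n=2000):
    """Eq. (24): f_Y^{-1}( integral of A~(r) dr from f_X(y) to f_X(x) ).

    The real integral is approximated by the composite Simpson rule
    with an even number n of subintervals.
    """
    def A_tilde(r):
        return f_Y(A(f_X_inv(r)))

    a, b = f_X(y), f_X(x)
    h = (b - a) / n
    total = A_tilde(a) + A_tilde(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * A_tilde(a + i * h)
    return f_Y_inv(total * h / 3.0)


def identity(r):
    return r


value = nn_integral(identity, math.log, math.exp, identity, identity, 1.0, 3.0)
print(value)  # compare with x - y for y = 1, x = 3
```

For this choice $\mathrm{D}A/\mathrm{D}x = A$, so the result also illustrates (23): the integral equals $A(x) \ominus\_{\mathbb{Y}} A(y) = x - y$.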

#### **3. Non-Newtonian Exponential Function and Logarithm**

Once we know how to differentiate and integrate, we can turn to differential equations. The so-called exponential family plays a crucial role in thermodynamics, both standard and generalized [39–43]. Many different deformations of the usual $e^{x}$ can be found in the literature. However, from the non-Newtonian perspective, the exponential function $\operatorname{Exp} : \mathbb{X} \to \mathbb{Y}$ is defined by

$$\frac{\text{DExp}(\mathbf{x})}{\text{Dx}} = \text{Exp}(\mathbf{x}), \quad \text{Exp}(\mathbf{0}\_{\mathbb{X}}) = \mathbf{1}\_{\mathbb{Y}}.\tag{25}$$

Integrating (25) (in a non-Newtonian way) one finds the unique solution

$$\operatorname{Exp}(x) = f\_{\mathbb{Y}}^{-1} \left( e^{f\_{\mathbb{X}}(x)} \right), \quad \operatorname{Exp}(x\_1 \oplus\_{\mathbb{X}} x\_2) = \operatorname{Exp}(x\_1) \odot\_{\mathbb{Y}} \operatorname{Exp}(x\_2). \tag{26}$$

In thermodynamic applications, one often encounters exponents of negative arguments, $e^{-x}$. In a non-Newtonian context the correct form of a minus is $\ominus\_{\mathbb{X}} x = 0\_{\mathbb{X}} \ominus\_{\mathbb{X}} x = f\_{\mathbb{X}}^{-1}\big({-f\_{\mathbb{X}}(x)}\big)$. The example discussed in the next section will involve $\mathbb{X} = \mathbb{R}$ and $f\_{\mathbb{X}}^{-1}(-r) = -f\_{\mathbb{X}}^{-1}(r)$. In consequence, it will be correct to write $\ominus\_{\mathbb{X}} x = -x$, but in general such a simple rule may be meaningless (because "$-$", as opposed to $\ominus\_{\mathbb{X}}$, may be undefined in $\mathbb{X}$).

**Example 4.** *Let $\mathbb{X} = (\mathbb{R}\_{+}, \oplus, \odot)$, with the arithmetic defined by $f\_{\mathbb{X}} : \mathbb{R}\_{+} \to \mathbb{R}$, $f\_{\mathbb{X}}(x) = \ln x$, $f\_{\mathbb{X}}^{-1}(r) = e^{r}$. Then*

$$\ominus\_{\mathbb{X}} x = f\_{\mathbb{X}}^{-1} \left( -f\_{\mathbb{X}}(x) \right) = e^{-\ln x} = 1/x \in \mathbb{R}\_{+}.\tag{27}$$

*The same number can be both positive and negative, depending on the arithmetic.*

A (natural) logarithm is the inverse of Exp, namely, $\operatorname{Ln} : \mathbb{Y} \to \mathbb{X}$,

$$\operatorname{Ln}(y) = f\_{\mathbb{X}}^{-1} \left( \ln f\_{\mathbb{Y}}(y) \right), \quad \operatorname{Ln}(y\_1 \odot\_{\mathbb{Y}} y\_2) = \operatorname{Ln}(y\_1) \oplus\_{\mathbb{X}} \operatorname{Ln}(y\_2). \tag{28}$$
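The defining properties of Exp and Ln in (26) and (28) can be verified numerically for any pair of bijections. In the sketch below the arithmetics are purely illustrative assumptions: $\mathbb{X} = (-1, 1)$ with $f\_{\mathbb{X}} = \operatorname{arctanh}$, and $\mathbb{Y} = \mathbb{R}$ with $f\_{\mathbb{Y}}(y) = y^{3}$.

```python
import math


def cbrt(r: float) -> float:
    """Real cube root, valid for negative arguments as well."""
    return math.copysign(abs(r) ** (1.0 / 3.0), r)


# Assumed bijections, chosen only for illustration
f_X, f_X_inv = math.atanh, math.tanh         # X = (-1, 1)


def f_Y(y: float) -> float:                  # Y = R
    return y ** 3


f_Y_inv = cbrt


def Exp(x: float) -> float:
    """Eq. (26): Exp: X -> Y."""
    return f_Y_inv(math.exp(f_X(x)))


def Ln(y: float) -> float:
    """Eq. (28): Ln: Y -> X, defined where f_Y(y) > 0."""
    return f_X_inv(math.log(f_Y(y)))


def add_X(x1, x2):
    return f_X_inv(f_X(x1) + f_X(x2))


def mul_Y(y1, y2):
    return f_Y_inv(f_Y(y1) * f_Y(y2))


x1, x2 = 0.3, -0.5
lhs = Exp(add_X(x1, x2))           # Exp(x1 (+)_X x2)
rhs = mul_Y(Exp(x1), Exp(x2))      # Exp(x1) (.)_Y Exp(x2)
roundtrip = Ln(Exp(x1))            # should return x1
print(lhs, rhs, roundtrip)
```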

Expressions such as $\operatorname{Exp} x + \operatorname{Ln} y$ are in general meaningless even if $\mathbb{X} \subset \mathbb{R}\_{+}$ and $\mathbb{Y} \subset \mathbb{R}\_{+}$. However, formulas such as

$$(\operatorname{Exp}\,\mathbf{x})\oplus\_{\mathbb{Z}}^{\operatorname{YX}}\left(\operatorname{Ln}\,\mathbf{y}\right) = f\_{\mathbb{Z}}^{-1}\left(e^{f\_{\mathbb{X}}\left(\mathbf{x}\right)} + \ln f\_{\mathbb{Y}}\left(\mathbf{y}\right)\right) \tag{29}$$

make perfect sense. For example, if $p\_k \in \mathbb{X}$, then an entropy can be defined as

$$S = \bigoplus\_{k} p\_k \odot\_{\mathbb{Z}}^{\mathbb{XY}} \operatorname{Ln} \left( 1\_{\mathbb{X}} \oslash\_{\mathbb{X}} p\_k \right) \tag{30}$$

$$= f\_{\mathbb{Z}}^{-1}\left[\sum\_{k} f\_{\mathbb{Z}}\left(p\_{k}\odot\_{\mathbb{Z}}^{\mathbb{XY}}\operatorname{Ln}\left(1\_{\mathbb{X}}\oslash\_{\mathbb{X}} p\_{k}\right)\right)\right] \tag{31}$$

$$= f\_{\mathbb{Z}}^{-1}\left[\sum\_{k} f\_{\mathbb{X}}(p\_k) \ln\left(1/f\_{\mathbb{X}}(p\_k)\right)\right]. \tag{32}$$

Many intriguing questions arise if one asks about normalization of probabilities. We will return to this later.
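The closed form (32) is straightforward to evaluate. The following sketch (with a hypothetical function name `nn_entropy`) first recovers the Shannon entropy for trivial arithmetics and then evaluates (32) for an arbitrarily chosen nontrivial pair $f\_{\mathbb{X}} = \operatorname{arcsinh}$, $f\_{\mathbb{Z}}^{-1} = \sinh$. The sample probabilities are normalized in the ordinary sense only; no claim is made here about the appropriate non-Newtonian normalization.

```python
import math


def nn_entropy(p, f_X, f_Z_inv):
    """Eq. (32): S = f_Z^{-1}( sum_k f_X(p_k) ln(1 / f_X(p_k)) )."""
    return f_Z_inv(sum(f_X(pk) * math.log(1.0 / f_X(pk)) for pk in p))


def identity(r):
    return r


p = [0.5, 0.25, 0.25]

# Trivial arithmetics (all bijections equal to the identity): Shannon entropy in nats
shannon = nn_entropy(p, identity, identity)

# Assumed nontrivial example: f_X = arcsinh, f_Z^{-1} = sinh
deformed = nn_entropy(p, math.asinh, math.sinh)

print(shannon, deformed)
```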

Non-Newtonian constructions of Exp and Ln are systematic, general, and flexible. There seems to exist a relation between the arithmetic formalism and the method of monotone embedding discussed in information geometry [44], but the problem requires further studies.

**Example 5.** *In order to appreciate the difference between Newtonian and non-Newtonian differentiation let us differentiate the function $A(x) = x$, $A : \mathbb{X} \to \mathbb{Y}$, in two cases. The first one is trivial, $\mathbb{X} = \mathbb{Y} = (\mathbb{R}, +, \cdot)$, with the arithmetic defined by the identity $f\_{\mathbb{X}} = f\_{\mathbb{Y}} = \mathrm{id}\_{\mathbb{R}}$. Then, the non-Newtonian and Newtonian derivatives coincide, so*

$$\frac{\mathrm{D}A(x)}{\mathrm{D}x} = \frac{\mathrm{d}A(x)}{\mathrm{d}x} = 1.\tag{33}$$

*The second case involves, as before, the codomain $\mathbb{Y} = (\mathbb{R}, +, \cdot)$, with the arithmetic defined by the identity $f\_{\mathbb{Y}} = \mathrm{id}\_{\mathbb{R}}$. However, as the domain we choose $\mathbb{X} = (\mathbb{R}\_{+}, \oplus, \odot)$, with the arithmetic defined by $f\_{\mathbb{X}} : \mathbb{R}\_{+} \to \mathbb{R}$, $f\_{\mathbb{X}}(x) = \ln x$, $f\_{\mathbb{X}}^{-1}(r) = e^{r}$. Now,*

$$\frac{\mathrm{D}A(x)}{\mathrm{D}x} = \lim\_{\delta \to 0} \big( A(x \oplus\_{\mathbb{X}} \delta\_{\mathbb{X}}) \ominus\_{\mathbb{Y}} A(x) \big) \oslash\_{\mathbb{Y}} \delta\_{\mathbb{Y}} = \lim\_{\delta \to 0} \frac{\left( x \oplus\_{\mathbb{X}} f\_{\mathbb{X}}^{-1}(\delta) \right) - x}{\delta}$$

$$= \lim\_{\delta \to 0} \frac{e^{\ln x + \delta} - x}{\delta} = x = A(x). \tag{34}$$

*As $0\_{\mathbb{X}} = f\_{\mathbb{X}}^{-1}(0) = e^{0} = 1$, we find $A(0\_{\mathbb{X}}) = 0\_{\mathbb{X}} = 1 = 1\_{\mathbb{Y}}$, and conclude that $A(x) = x$, $A : \mathbb{R}\_{+} \to \mathbb{R}$, belongs to the exponential family. Indeed,*

$$A(x\_1 \oplus\_{\mathbb{X}} x\_2) = x\_1 \oplus\_{\mathbb{X}} x\_2 = e^{\ln x\_1 + \ln x\_2} = x\_1 \cdot x\_2 = A(x\_1) \odot\_{\mathbb{Y}} A(x\_2). \tag{35}$$

*To understand the result, write $A(x) = f\_{\mathbb{Y}}^{-1}\big(\tilde{A}(f\_{\mathbb{X}}(x))\big) = \tilde{A}(\ln x) = x$, so that $\tilde{A}(r) = e^{r}$. Then, by the second form of the derivative in (18),*

$$\frac{\mathrm{D}A(x)}{\mathrm{D}x} = f\_{\mathbb{Y}}^{-1} \left( \frac{\mathrm{d}\tilde{A}\left(f\_{\mathbb{X}}(x)\right)}{\mathrm{d}f\_{\mathbb{X}}(x)} \right) = \frac{\mathrm{d}\,e^{f\_{\mathbb{X}}(x)}}{\mathrm{d}f\_{\mathbb{X}}(x)} = e^{f\_{\mathbb{X}}(x)} = e^{\ln x} = x. \tag{36}$$

*Entropy* **2020**, *22*, 1180

*The map A does not affect the value of x, but changes its arithmetic properties. It behaves as if it assigned a different meaning to the same word. The example becomes even more intriguing if one realizes that logarithm is known to approximately relate stimulus with sensation in real-life sensory systems (hence the logarithmic scale of decibels and star magnitudes) [35].*

**Example 6.** *Many calculations in thermodynamics reduce to formulas of the form*

$$\mathrm{d}U(S,V) = \left(\frac{\partial U}{\partial S}\right)\_V \mathrm{d}S + \left(\frac{\partial U}{\partial V}\right)\_S \mathrm{d}V,\tag{37}$$

*being equivalent to the derivative $\mathrm{d}U(S(t), V(t))/\mathrm{d}t$ of a composite function of several variables. The latter has a unique formulation in non-Newtonian calculus: one only needs to specify the arithmetics. For example, let $U$ be a map $U : \mathbb{S} \times \mathbb{V} \to \mathbb{U}$, and let $S : \mathbb{T} \to \mathbb{S}$, $V : \mathbb{T} \to \mathbb{V}$. Then,*

$$\frac{\mathrm{D}U(S(t), V(t))}{\mathrm{D}t} = \lim\_{\delta \to 0} \Big( U\big( S(t \oplus\_{\mathbb{T}} \delta\_{\mathbb{T}}), V(t \oplus\_{\mathbb{T}} \delta\_{\mathbb{T}}) \big) \ominus\_{\mathbb{U}} U\big(S(t), V(t)\big) \Big) \oslash\_{\mathbb{U}} \delta\_{\mathbb{U}}.\tag{38}$$

*As*

$$\lim\_{x'\to x} A(x') \oplus\_{\mathbb{Y}} B(x') = \left(\lim\_{x'\to x} A(x')\right) \oplus\_{\mathbb{Y}} \left(\lim\_{x'\to x} B(x')\right) \tag{39}$$

*(see Appendix A), we rewrite (38) as*

$$\frac{\mathrm{D}U(S(t), V(t))}{\mathrm{D}t} = \lim\_{\delta \to 0} \Big( U\big( S(t \oplus\_{\mathbb{T}} \delta\_{\mathbb{T}}), V(t \oplus\_{\mathbb{T}} \delta\_{\mathbb{T}}) \big) \ominus\_{\mathbb{U}} U\big( S(t \oplus\_{\mathbb{T}} \delta\_{\mathbb{T}}), V(t) \big) \Big) \oslash\_{\mathbb{U}} \delta\_{\mathbb{U}}$$

$$\oplus\_{\mathbb{U}} \lim\_{\delta \to 0} \Big( U\big( S(t \oplus\_{\mathbb{T}} \delta\_{\mathbb{T}}), V(t) \big) \ominus\_{\mathbb{U}} U\big( S(t), V(t) \big) \Big) \oslash\_{\mathbb{U}} \delta\_{\mathbb{U}}.\tag{40}$$

*Under the usual assumptions about continuity of $\tilde{U} : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ in*

$$\begin{array}{ccc} \mathbb{S} \times \mathbb{V} & \stackrel{U}{\longrightarrow} & \mathbb{U} \\ f\_{\mathbb{S}} \times f\_{\mathbb{V}} \Big\downarrow & & \Big\downarrow f\_{\mathbb{U}} \\ \mathbb{R} \times \mathbb{R} & \stackrel{\tilde{U}}{\longrightarrow} & \mathbb{R} \end{array} \tag{41}$$

*we reduce (40) to*

$$\frac{\mathrm{D}U(S(t), V(t))}{\mathrm{D}t} = \lim\_{\delta \to 0} \Big( U\big( S(t), V(t \oplus\_{\mathbb{T}} \delta\_{\mathbb{T}}) \big) \ominus\_{\mathbb{U}} U\big( S(t), V(t) \big) \Big) \oslash\_{\mathbb{U}} \delta\_{\mathbb{U}}$$

$$\oplus\_{\mathbb{U}} \lim\_{\delta \to 0} \Big( U\big( S(t \oplus\_{\mathbb{T}} \delta\_{\mathbb{T}}), V(t) \big) \ominus\_{\mathbb{U}} U\big( S(t), V(t) \big) \Big) \oslash\_{\mathbb{U}} \delta\_{\mathbb{U}}, \tag{42}$$

*and then to two instances of the non-Newtonian chain rule,*

$$\frac{\mathrm{D}(B\circ A)(x)}{\mathrm{D}x} = f\_{\mathbb{Z}}^{-1} \left[ f\_{\mathbb{Z}} \left( \frac{\mathrm{D}B(A(x))}{\mathrm{D}A(x)} \right) f\_{\mathbb{Y}} \left( \frac{\mathrm{D}A(x)}{\mathrm{D}x} \right) \right] = \frac{\mathrm{D}B(A(x))}{\mathrm{D}A(x)} \odot\_{\mathbb{Z}}^{\mathbb{ZY}} \frac{\mathrm{D}A(x)}{\mathrm{D}x},\tag{43}$$

*valid for the composition*

$$\begin{array}{ccccc}\mathbb{X} & \stackrel{A}{\longrightarrow} & \mathbb{Y} & \stackrel{B}{\longrightarrow} & \mathbb{Z} \\ f\_{\mathbb{X}}\Big\downarrow & & f\_{\mathbb{Y}}\Big\downarrow & & f\_{\mathbb{Z}}\Big\downarrow \\ \mathbb{R} & \stackrel{\tilde{A}}{\longrightarrow} & \mathbb{R} & \stackrel{\tilde{B}}{\longrightarrow} & \mathbb{R} \end{array} \tag{44}$$


*of maps. Finally,*

$$\frac{\mathrm{D}U(S(t),V(t))}{\mathrm{D}t} = \frac{\mathrm{D}U(S(t),V(t))}{\mathrm{D}S(t)} \odot\_{\mathbb{U}}^{\mathbb{US}} \frac{\mathrm{D}S(t)}{\mathrm{D}t} \oplus\_{\mathbb{U}} \frac{\mathrm{D}U(S(t),V(t))}{\mathrm{D}V(t)} \odot\_{\mathbb{U}}^{\mathbb{UV}} \frac{\mathrm{D}V(t)}{\mathrm{D}t}.\tag{45}$$

*Effectively,*

$$\mathrm{D}U(S,V) = \left(\frac{\mathrm{D}U}{\mathrm{D}S}\right)\_V \odot\_{\mathbb{U}}^{\mathbb{US}} \mathrm{D}S \oplus\_{\mathbb{U}} \left(\frac{\mathrm{D}U}{\mathrm{D}V}\right)\_{S} \odot\_{\mathbb{U}}^{\mathbb{UV}} \mathrm{D}V \tag{46}$$

*is the non-Newtonian formula for a differential.*
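The chain rule (43) can be tested numerically by comparing the non-Newtonian derivative of $B \circ A$ with the mixed product on the right-hand side. The sketch below uses central differences in place of Newtonian derivatives; the three arithmetics and the maps $A$, $B$ are arbitrary illustrative assumptions.

```python
import math


def newton_d(func, r, h=1e-6):
    """Central-difference approximation of the Newtonian derivative."""
    return (func(r + h) - func(r - h)) / (2.0 * h)


def nn_derivative(A, f_dom, f_dom_inv, f_cod, f_cod_inv, x):
    """Eq. (18), second form, for a map between two arithmetics."""
    def A_tilde(r):
        return f_cod(A(f_dom_inv(r)))

    return f_cod_inv(newton_d(A_tilde, f_dom(x)))


def identity(r):
    return r


# Assumed arithmetics: X = (R_+, ln), Y = (R, id), Z = (R, arcsinh)
f_X, f_X_inv = math.log, math.exp
f_Y, f_Y_inv = identity, identity
f_Z, f_Z_inv = math.asinh, math.sinh


def A(x):        # A: X -> Y
    return x * x


def B(y):        # B: Y -> Z
    return math.sin(y)


x = 1.3
lhs = nn_derivative(lambda s: B(A(s)), f_X, f_X_inv, f_Z, f_Z_inv, x)

DB = nn_derivative(B, f_Y, f_Y_inv, f_Z, f_Z_inv, A(x))   # element of Z
DA = nn_derivative(A, f_X, f_X_inv, f_Y, f_Y_inv, x)      # element of Y
rhs = f_Z_inv(f_Z(DB) * f_Y(DA))                          # mixed product of Eq. (43)

print(lhs, rhs)
```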

The next section shows that the above-mentioned subtleties with arithmetics of domains and codomains have straightforward implications for generalized thermostatistics.

#### **4. Kaniadakis** *κ***-Calculus Versus Non-Newtonian Calculus**

Kaniadakis, in a series of papers [26–34], developed a generalized form of arithmetic and calculus, with numerous applications to statistical physics, and beyond. In the present section, we will clarify links between his formalism and non-Newtonian calculus. As we will see, some of the results have a straightforward non-Newtonian interpretation, but not all.

Assume $\mathbb{X} = \mathbb{R}$, with the bijection $f\_{\mathbb{X}} \equiv f\_{\kappa} : \mathbb{R} \to \mathbb{R}$ given explicitly by

$$f\_{\kappa}(x) = \frac{1}{\kappa} \operatorname{arcsinh} \kappa x, \tag{47}$$

$$f\_{\kappa}^{-1}(x) = \frac{1}{\kappa} \sinh \kappa x. \tag{48}$$

Kaniadakis' *κ*-calculus begins with the arithmetic,

$$x \stackrel{\kappa}{\oplus} y = f\_{\kappa}^{-1} \left( f\_{\kappa}(x) + f\_{\kappa}(y) \right), \tag{49}$$

$$x \stackrel{\kappa}{\ominus} y = f\_{\kappa}^{-1} \left( f\_{\kappa}(x) - f\_{\kappa}(y) \right), \tag{50}$$

$$x \stackrel{\kappa}{\odot} y = f\_{\kappa}^{-1} \left( f\_{\kappa}(x) \cdot f\_{\kappa}(y) \right), \tag{51}$$

$$x \stackrel{\kappa}{\oslash} y = f\_{\kappa}^{-1} \left( f\_{\kappa}(x) / f\_{\kappa}(y) \right). \tag{52}$$

As $f\_{0}(x) = x$, the case $\kappa = 0$ corresponds to the usual field $\mathbb{R}\_{0} = (\mathbb{R}, +, \cdot)$, which we will denote for short by $\mathbb{R}$. The neutral element of addition, $0\_{\kappa} = f\_{\kappa}^{-1}(0) = 0$, is the same for all values of $\kappa$. The neutral element of $\kappa$-multiplication is nontrivial, $1\_{\kappa} = f\_{\kappa}^{-1}(1) \neq 1$ for $\kappa \neq 0$. The fields $\mathbb{R}\_{\kappa} = (\mathbb{R}, \stackrel{\kappa}{\oplus}, \stackrel{\kappa}{\odot})$ are isomorphic to one another due to their isomorphism with $\mathbb{R}\_{0}$,

$$f\_{\kappa}(x \stackrel{\kappa}{\oplus} y) = f\_{\kappa}(x) + f\_{\kappa}(y),\tag{53}$$

$$f\_{\kappa}(x \stackrel{\kappa}{\odot} y) = f\_{\kappa}(x) \cdot f\_{\kappa}(y). \tag{54}$$
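The $\kappa$-arithmetic (47)–(54) is easy to experiment with. The sketch below implements $\kappa$-addition and $\kappa$-multiplication from the bijection and compares the sum with the closed form $x\sqrt{1+\kappa^{2}y^{2}} + y\sqrt{1+\kappa^{2}x^{2}}$, which follows from the addition theorem for $\sinh$. Function names and the value $\kappa = 0.7$ are assumptions of the example.

```python
import math


def f_kappa(x: float, kappa: float) -> float:
    """Eq. (47)."""
    return math.asinh(kappa * x) / kappa


def f_kappa_inv(r: float, kappa: float) -> float:
    """Eq. (48)."""
    return math.sinh(kappa * r) / kappa


def kappa_add(x: float, y: float, kappa: float) -> float:
    """Eq. (49)."""
    return f_kappa_inv(f_kappa(x, kappa) + f_kappa(y, kappa), kappa)


def kappa_mul(x: float, y: float, kappa: float) -> float:
    """Eq. (51)."""
    return f_kappa_inv(f_kappa(x, kappa) * f_kappa(y, kappa), kappa)


def kappa_add_closed(x: float, y: float, kappa: float) -> float:
    """Closed form from sinh(a + b) = sinh a cosh b + cosh a sinh b."""
    return x * math.sqrt(1.0 + (kappa * y) ** 2) + y * math.sqrt(1.0 + (kappa * x) ** 2)


kappa = 0.7                          # illustrative value
zero_k = f_kappa_inv(0.0, kappa)     # neutral element of kappa-addition
one_k = f_kappa_inv(1.0, kappa)      # neutral element of kappa-multiplication

print(zero_k, one_k)
print(kappa_add(1.2, -0.4, kappa), kappa_add_closed(1.2, -0.4, kappa))
```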

Kaniadakis defines his *κ*-derivative of a real function *A*(*x*) as

$$\frac{\mathrm{d}A(x)}{\mathrm{d}\_{\kappa}x} = \lim\_{\delta \to 0} \frac{A(x + \delta) - A(x)}{(x + \delta) \stackrel{\kappa}{\ominus} x} = \frac{\mathrm{d}A(x)}{\mathrm{d}x} \Big/ \frac{\mathrm{d}f\_{\kappa}(x)}{\mathrm{d}x} = \frac{\mathrm{d}A(x)}{\mathrm{d}x} \sqrt{1 + \kappa^{2}x^{2}}.\tag{55}$$

We will now specify in which sense the *κ*-derivative is non-Newtonian. First consider a function *A*,

$$\begin{array}{ccc}\mathbb{R}\_{\kappa\_1} & \stackrel{A}{\longrightarrow} & \mathbb{R}\_{\kappa\_2} \\ f\_{\kappa\_1}\Big\downarrow & & \Big\downarrow f\_{\kappa\_2} \\ \mathbb{R} & \stackrel{\tilde{A}}{\longrightarrow} & \mathbb{R} \end{array} \tag{56}$$

Its non-Newtonian derivative

$$\frac{\mathrm{D}A(x)}{\mathrm{D}x} = \lim\_{\delta \to 0} \left( A(x \stackrel{\kappa\_1}{\oplus} \delta\_{\kappa\_1}) \stackrel{\kappa\_2}{\ominus} A(x) \right) \stackrel{\kappa\_2}{\oslash} \delta\_{\kappa\_2}, \tag{57}$$

if compared with (55), suggests $\kappa\_2 = 0$. Setting $\kappa\_1 = \kappa$, $\kappa\_2 = 0$, we find

$$\frac{\mathrm{D}A(x)}{\mathrm{D}x} = \lim\_{\delta \to 0} \frac{A(x \stackrel{\kappa}{\oplus} \delta\_{\kappa}) - A(x)}{\delta} = \lim\_{\delta \to 0} \frac{A[x \stackrel{\kappa}{\oplus} f\_{\kappa}^{-1}(\delta)] - A(x)}{\delta} = \lim\_{\delta \to 0} \frac{A(x \stackrel{\kappa}{\oplus} \delta) - A(x)}{\delta},\tag{58}$$

as $f\_{\kappa}^{-1}(\delta) \approx \delta$ for $\delta \approx 0$. Denoting $x \stackrel{\kappa}{\oplus} \delta = x + \delta'$ we find $\delta = (x + \delta') \stackrel{\kappa}{\ominus} x$, and

$$\frac{\mathrm{D}A(\mathbf{x})}{\mathrm{D}\mathbf{x}} = \lim\_{\delta' \to 0} \frac{A(\mathbf{x} + \delta') - A(\mathbf{x})}{(\mathbf{x} + \delta') \stackrel{\mathrm{\kappa}}{\ominus} \mathbf{x}} \,\tag{59}$$

in agreement with the Kaniadakis formula. However, as a by-product of the calculation we have proved that $\kappa$-calculus is applicable only to functions mapping $\mathbb{R}\_{\kappa}$ into $\mathbb{R}$. The Kaniadakis exponential function satisfies

$$\frac{\text{DExp}(\mathbf{x})}{\text{Dx}} = \text{Exp}(\mathbf{x}), \quad \text{Exp}(\mathbf{0}) = 1,\tag{60}$$

with $0 = 0\_{\kappa}$, $1 = 1\_{0}$. Accordingly,

$$\operatorname{Exp}(x) = f\_{\mathbb{Y}}^{-1} \left( e^{f\_{\mathbb{X}}(x)} \right) = e^{f\_{\kappa}(x)} = e^{\frac{1}{\kappa} \operatorname{arcsinh} \kappa x},\tag{61}$$

which is indeed the Kaniadakis result. Recalling that $f\_{\mathbb{Y}}(x) = x$, we find the explicit form of the logarithm, $\operatorname{Ln} : \mathbb{R} \to \mathbb{R}\_{\kappa}$,

$$\operatorname{Ln}(y) = f\_{\mathcal{X}}^{-1} \left( \ln f\_{\mathcal{Y}}(y) \right) = \frac{1}{\kappa} \sinh(\kappa \ln y), \tag{62}$$

which again agrees with the Kaniadakis definition.
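A short numerical check of (55), (60), (61), and (62) may be helpful. The sketch below evaluates the $\kappa$-derivative of the $\kappa$-exponential with a central difference and confirms that Ln inverts Exp; it also compares (61) with the equivalent algebraic form $\big(\sqrt{1+\kappa^{2}x^{2}} + \kappa x\big)^{1/\kappa}$, which follows from $\operatorname{arcsinh} z = \ln\big(z + \sqrt{1+z^{2}}\big)$. The function names and sample values are assumptions.

```python
import math


def exp_kappa(x: float, kappa: float) -> float:
    """Eq. (61): Exp: R_kappa -> R."""
    return math.exp(math.asinh(kappa * x) / kappa)


def ln_kappa(y: float, kappa: float) -> float:
    """Eq. (62): Ln: R -> R_kappa."""
    return math.sinh(kappa * math.log(y)) / kappa


def kappa_derivative(A, x: float, kappa: float, h: float = 1e-6) -> float:
    """Eq. (55): (dA/dx) * sqrt(1 + kappa^2 x^2), with dA/dx by central difference."""
    dA = (A(x + h) - A(x - h)) / (2.0 * h)
    return dA * math.sqrt(1.0 + (kappa * x) ** 2)


kappa, x = 0.5, 1.7   # illustrative values
lhs = kappa_derivative(lambda s: exp_kappa(s, kappa), x, kappa)
rhs = exp_kappa(x, kappa)                     # Eq. (60): D Exp / Dx = Exp
algebraic = (math.sqrt(1.0 + (kappa * x) ** 2) + kappa * x) ** (1.0 / kappa)

print(lhs, rhs, algebraic, ln_kappa(rhs, kappa))
```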

Yet, the readers must be hereby warned that it is *not* allowed to apply the Kaniadakis definition of derivative to Ln *x*. The correct non-Newtonian form is

$$\frac{\mathrm{D}\operatorname{Ln}(y)}{\mathrm{D}y} = \lim\_{\delta \to 0} \left( \operatorname{Ln}(y + \delta) \stackrel{\kappa}{\ominus} \operatorname{Ln}(y) \right) \stackrel{\kappa}{\oslash} \delta\_{\mathbb{X}} = f\_{\mathbb{X}}^{-1} \left( 1/f\_{\mathbb{Y}}(y) \right) = \frac{1}{\kappa} \sinh(\kappa/y), \tag{63}$$

because Ln maps $\mathbb{R}$ into $\mathbb{R}\_{\kappa}$. Kaniadakis is aware of the subtlety and thus introduces also another derivative, meant for differentiation of inverse functions,

$$\frac{\mathrm{d}\_{\kappa}A(y)}{\mathrm{d}y} = \lim\_{u \to y} \frac{A(y) \stackrel{\kappa}{\ominus} A(u)}{y - u} = \lim\_{\delta \to 0} \frac{A(y + \delta) \stackrel{\kappa}{\ominus} A(y)}{\delta},\tag{64}$$

a definition which, from the non-Newtonian standpoint, must nevertheless be regarded as incorrect ('/' should be replaced by $\stackrel{\kappa}{\oslash}$, typical of the codomain $\mathbb{R}\_{\kappa}$). As a result,

$$\frac{\mathrm{d}\_{\kappa}\operatorname{Ln}(y)}{\mathrm{d}y} = \frac{1}{y} \neq \frac{\mathrm{D}\operatorname{Ln}(y)}{\mathrm{D}y} = \frac{1}{\kappa}\sinh\frac{\kappa}{y}.\tag{65}$$

This is probably why (64), as opposed to (55), has not found too many applications.
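The inequality (65) can be seen directly by evaluating the two difference quotients for a small but finite $\delta$. The following is a sketch with an illustrative value $\kappa = 0.5$; the function names are hypothetical, and the limits are approximated rather than computed exactly.

```python
import math

kappa = 0.5  # illustrative value


def f_k(x: float) -> float:
    return math.asinh(kappa * x) / kappa


def f_k_inv(r: float) -> float:
    return math.sinh(kappa * r) / kappa


def Ln(y: float) -> float:
    """Eq. (62): Ln: R -> R_kappa."""
    return f_k_inv(math.log(y))


def sub_k(a: float, b: float) -> float:
    return f_k_inv(f_k(a) - f_k(b))


def div_k(a: float, b: float) -> float:
    return f_k_inv(f_k(a) / f_k(b))


def D_Ln(y: float, delta: float = 1e-6) -> float:
    """Difference quotient of Eq. (63): kappa-subtraction and kappa-division."""
    return div_k(sub_k(Ln(y + delta), Ln(y)), f_k_inv(delta))


def d_kappa_Ln(y: float, delta: float = 1e-6) -> float:
    """Difference quotient of Eq. (64): kappa-subtraction, ordinary division."""
    return sub_k(Ln(y + delta), Ln(y)) / delta


y = 2.0
print(D_Ln(y), math.sinh(kappa / y) / kappa)   # non-Newtonian derivative, Eq. (63)
print(d_kappa_Ln(y), 1.0 / y)                  # Kaniadakis' second derivative, Eq. (64)
```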

Let us finally check what would have happened if instead of (61) one considered the exponential function mapping $\mathbb{R}\_{\kappa}$ into itself, $f\_{\mathbb{Y}} = f\_{\mathbb{X}} = f\_{\kappa}$,

$$\operatorname{Exp}(x) = f\_{\mathbb{Y}}^{-1}\left(e^{f\_{\mathbb{X}}(x)}\right) = f\_{\kappa}^{-1}\left(e^{f\_{\kappa}(x)}\right) = \frac{1}{\kappa}\sinh\left(\kappa\, e^{\frac{1}{\kappa}\operatorname{arcsinh}\kappa x}\right). \tag{66}$$

As in thermodynamic applications one typically encounters Exp of a negative argument, one expects that physical differences between $\operatorname{Exp} : \mathbb{R}\_{\kappa} \to \mathbb{R}\_{\kappa}$ and $\operatorname{Exp} : \mathbb{R}\_{\kappa} \to \mathbb{R}$ should not be essential. Indeed, Figure 1 shows that both exponents lead to identical asymptotic tails.

**Figure 1.** Log-log plots of $\operatorname{Exp}(-x)$ for $\kappa\_1 = 1$, $\kappa\_2 = 0$ (black), and $\kappa\_1 = \kappa\_2 = 1$ (red). The tails are identical.
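The behaviour shown in Figure 1 can be reproduced without plotting by tabulating the ratio of the two exponentials (66) and (61) at increasingly large arguments. This is a sketch for $\kappa = 1$, the value used in the figure; the function names and sample points are assumptions.

```python
import math


def exp_to_R(x: float, kappa: float) -> float:
    """Eq. (61): Exp: R_kappa -> R."""
    return math.exp(math.asinh(kappa * x) / kappa)


def exp_to_R_kappa(x: float, kappa: float) -> float:
    """Eq. (66): Exp: R_kappa -> R_kappa."""
    return math.sinh(kappa * exp_to_R(x, kappa)) / kappa


kappa = 1.0
xs = [1.0, 10.0, 100.0, 1000.0]

# Ratio of the two versions of Exp(-x); it approaches 1 in the tail
ratios = [exp_to_R_kappa(-x, kappa) / exp_to_R(-x, kappa) for x in xs]

for x, ratio in zip(xs, ratios):
    print(x, exp_to_R(-x, kappa), ratio)
```

Since $\sinh u \approx u$ for small $u$, the two functions become indistinguishable once $\operatorname{Exp}(-x)$ is small, which is the content of the statement about identical tails.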

#### **5. A Cosmological Aspect of the Kaniadakis Arithmetic**

Kaniadakis explored possible relativistic implications of his formalism. In particular, he noted that fluxes of cosmic rays depend on energy in a way that seems to indicate *κ* > 0. It is therefore intriguing that essentially the same arithmetic was recently shown [14] to have links with the problem of accelerated expansion of the Universe, one of the greatest puzzles of contemporary physics.

Cosmological expansion is well described by the Friedman equation,

$$\frac{\mathrm{d}a(t)}{\mathrm{d}t} = \sqrt{\Omega\_{\Lambda}a(t)^2 + \frac{\Omega\_M}{a(t)}}, \quad a(t) > 0,\tag{67}$$

for a dimensionless scale factor *a*(*t*) evolving in a dimensionless time *t* (in units of the Hubble time *t<sub>H</sub>* ≈ 13.58 × 10<sup>9</sup> yr). The observable parameters are Ω<sub>*M*</sub> = 0.3, Ω<sub>Λ</sub> = 0.7 [45,46]. Ω<sub>Λ</sub> ≠ 0 is typically interpreted as an indication of dark energy. Equation (67) is solved by

$$a(t) = \left(\sqrt{\frac{\Omega\_M}{\Omega\_\Lambda}} \sinh \frac{3\sqrt{\Omega\_\Lambda}t}{2}\right)^{2/3}, \quad t > 0. \tag{68}$$

Now assume that

$$\begin{array}{ccc}\mathbb{X} & \stackrel{a}{\longrightarrow} & \mathbb{R} \\ f\_{\mathbb{X}}\Big\downarrow & & \Big\downarrow f\_{\mathbb{R}} = \mathrm{id}\_{\mathbb{R}} \\ \mathbb{R} & \stackrel{\tilde{a}}{\longrightarrow} & \mathbb{R} \end{array} \tag{69}$$

whereas the Friedman equation involves no Ω<sub>Λ</sub>,

$$\frac{\mathrm{D}a(t)}{\mathrm{D}t} = \sqrt{\frac{\Omega}{a(t)}}, \quad a(t) > 0,\tag{70}$$

for some Ω. Its solution by non-Newtonian techniques reads

$$a(t) \quad = \left(\frac{3}{2}\sqrt{\Omega}f\_{\mathbb{X}}(t)\right)^{2/3},\tag{71}$$

so, comparing (71) with (68), we find

$$f\_{\mathbf{X}}(t) = \frac{2}{3\sqrt{0.7}} \sqrt{\frac{\Omega\_M}{\Omega}} \sinh \frac{3\sqrt{0.7}}{2} t = \sqrt{\frac{\Omega\_M}{\Omega}} f\_{\kappa}^{-1}(t), \quad \text{for } \kappa = \frac{3\sqrt{0.7}}{2} \approx 1.255. \tag{72}$$

Accelerated expansion of the Universe looks like a combined effect of non-Euclidean geometry and non-Diophantine arithmetic. The resulting dynamics is non-Newtonian in both meanings of this term.
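The identification of (71) with (68) is easy to verify numerically. The sketch below uses the quoted values Ω<sub>*M*</sub> = 0.3, Ω<sub>Λ</sub> = 0.7; the function names and the finite-difference step are illustrative assumptions. It also checks that (68) indeed satisfies the Friedman equation (67).

```python
import math

OMEGA_M, OMEGA_L = 0.3, 0.7
KAPPA = 1.5 * math.sqrt(OMEGA_L)   # 3*sqrt(Omega_Lambda)/2, about 1.255

def a_standard(t):
    """Scale factor (68), the solution of the Friedman equation (67)."""
    return (math.sqrt(OMEGA_M / OMEGA_L) * math.sinh(KAPPA * t)) ** (2.0 / 3.0)

def f_X(t, omega):
    """Bijection (72): sqrt(Omega_M/Omega) * sinh(kappa*t)/kappa."""
    return math.sqrt(OMEGA_M / omega) * math.sinh(KAPPA * t) / KAPPA

def a_non_newtonian(t, omega):
    """Scale factor (71), the solution of the matter-only equation (70)."""
    return (1.5 * math.sqrt(omega) * f_X(t, omega)) ** (2.0 / 3.0)

def friedman_residual(t, h=1e-6):
    """Newtonian da/dt (central difference) minus the right-hand side of (67)."""
    da_dt = (a_standard(t + h) - a_standard(t - h)) / (2 * h)
    a = a_standard(t)
    return da_dt - math.sqrt(OMEGA_L * a**2 + OMEGA_M / a)

for t in (0.1, 0.5, 1.0, 2.0):
    print(t, a_standard(t), a_non_newtonian(t, omega=0.3), friedman_residual(t))
```

The value of Ω drops out of (71) once (72) is inserted, so any positive `omega` gives the same scale factor.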

The presence of the inverse bijection *f<sub>κ</sub>*<sup>−1</sup> and *κ* > 1 raises a number of interesting questions. It is related to the fundamental duality between Diophantine and non-Diophantine arithmetics. Namely, any equation of the form, say

$$\mathbf{x}\_1 \oplus \mathbf{x}\_2 \quad = \quad f^{-1}(f(\mathbf{x}\_1) + f(\mathbf{x}\_2)),\tag{73}$$

can be inverted by *f*(*x*) = *y* into

$$y\_1 + y\_2 \quad = \quad f(f^{-1}(y\_1) \oplus f^{-1}(y\_2)), \tag{74}$$

suggesting that it is ⊕ and not + which is the Diophantine arithmetic operation. Having two isomorphic arithmetics we, in general, do not have any criterion telling us which of the two is "normal", and which is "generalized".
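The symmetry between (73) and (74) can be made concrete with a toy bijection. In the sketch below the choice *f*(*x*) = *x*<sup>3</sup> is purely illustrative and is not singled out by the text; the function names are hypothetical.

```python
import math

def f(x):
    """An illustrative bijection R -> R (assumption: cubing)."""
    return x**3

def f_inv(y):
    """Inverse of f; copysign handles negative arguments of the cube root."""
    return math.copysign(abs(y) ** (1.0 / 3.0), y)

def oplus(x1, x2):
    """Non-Diophantine addition (73) induced by f."""
    return f_inv(f(x1) + f(x2))

def plus_from_oplus(y1, y2):
    """Ordinary addition recovered from oplus, as in (74)."""
    return f(oplus(f_inv(y1), f_inv(y2)))

print(oplus(1.0, 2.0))              # cube root of 9
print(plus_from_oplus(1.5, -4.0))   # close to -2.5
```

Either operation can be taken as primitive and the other derived from it, which is the sense in which neither arithmetic is privileged.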

#### **6. Kolmogorov–Nagumo Averages and Non-Diophantine/Non-Newtonian Probability**

Another non-Diophantine/non-Newtonian aspect that can be identified in the context of information theory and thermodynamics is implicitly present in the works of Kolmogorov, Nagumo, and Rényi. Let us recall that a Kolmogorov–Nagumo average is defined as [47–54]

$$\langle a \rangle\_f = \quad f^{-1}\left(\sum\_k p\_k f(a\_k)\right). \tag{75}$$

Rewriting (75) as

$$\langle a \rangle\_f = f^{-1} \left( \sum\_k f(p\_k') f(a\_k) \right) = \bigoplus\_k p\_k' \odot a\_k, \tag{76}$$

where *p*′<sub>*k*</sub> = *f*<sup>−1</sup>(*p<sub>k</sub>*), one interprets the average as the one typical of a non-Diophantine-arithmetic-valued probability. Apparently, neither Kolmogorov nor Nagumo nor Rényi had interpreted their results from this arithmetic point of view [7].
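The equivalence of (75) and (76) can be illustrated as follows. The sketch assumes an exponential *f* (invertible onto the positive half-line, which suffices for the sums and products that occur here); the sample probabilities and values are hypothetical.

```python
import math

def f(x):
    """Illustrative invertible map R -> (0, inf) (assumption)."""
    return math.exp(x)

def f_inv(y):
    """Inverse of f on (0, inf)."""
    return math.log(y)

def oplus(x1, x2):
    """Addition induced by f."""
    return f_inv(f(x1) + f(x2))

def odot(x1, x2):
    """Multiplication induced by f."""
    return f_inv(f(x1) * f(x2))

def kn_average(p, a):
    """Kolmogorov-Nagumo average (75)."""
    return f_inv(sum(pk * f(ak) for pk, ak in zip(p, a)))

def kn_average_arithmetic(p, a):
    """The same average written as in (76): oplus-sum of p'_k odot a_k."""
    p_prime = [f_inv(pk) for pk in p]               # p'_k = f^{-1}(p_k)
    terms = [odot(ppk, ak) for ppk, ak in zip(p_prime, a)]
    total = terms[0]
    for term in terms[1:]:
        total = oplus(total, term)
    return total

p = [0.2, 0.5, 0.3]       # hypothetical probabilities
a = [1.0, -0.5, 2.0]      # hypothetical values of the random variable
print(kn_average(p, a), kn_average_arithmetic(p, a))
```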

The lack of arithmetic perspective is especially visible in the works of Rényi [49] who, while deriving his *α*-entropies, began with a general Kolmogorov–Nagumo average. Trying to derive a meaningful class of functions *f*, he demanded that

$$
\langle a+c \rangle\_f = \langle a \rangle\_f + c \tag{77}
$$

be valid for any constant random variable *c*, and this led him to the exponential family *f<sub>α</sub>*(*x*) = 2<sup>(1−*α*)*x*</sup> (up to a general affine transformation *f* → *Af* + *B*, which does not affect Kolmogorov–Nagumo averages). In physical applications, it is more convenient to work with natural logarithms, so let us replace *f<sub>α</sub>* by *f<sub>q</sub>*(*x*) = *e*<sup>(1−*q*)*x*</sup>, *f<sub>q</sub>*<sup>−1</sup>(*x*) = (1/(1−*q*)) ln *x*, *q* ∈ R. With this particular choice of *f* one finds

$$\langle a \rangle\_{f\_q} = \frac{1}{1-q} \ln \left( \sum\_k p\_k e^{(1-q)a\_k} \right). \tag{78}$$

As is well known, the standard linear average is the limiting case lim<sub>*q*→1</sub>⟨*a*⟩<sub>*f<sub>q</sub>*</sub> = ∑<sub>*k*</sub> *p<sub>k</sub>a<sub>k</sub>*, which includes the Shannon entropy, *S* = ∑<sub>*k*</sub> *p<sub>k</sub>* ln(1/*p<sub>k</sub>*) = *S*<sub>1</sub>, as the limit *q* → 1 of the Rényi entropy

$$S\_q = \frac{1}{1-q} \ln \left( \sum\_k p\_k e^{(1-q)\ln(1/p\_k)} \right) = \frac{1}{1-q} \ln \sum\_k p\_k^q. \tag{79}$$
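Both equalities in (79), and the Shannon limit, can be checked with a short computation. The probability vector below is a hypothetical example and the function names are illustrative.

```python
import math

def kn_average_q(p, a, q):
    """Kolmogorov-Nagumo average (78) for f_q(x) = exp((1 - q) x), q != 1."""
    s = sum(pk * math.exp((1.0 - q) * ak) for pk, ak in zip(p, a))
    return math.log(s) / (1.0 - q)

def renyi_entropy(p, q):
    """Renyi entropy (79) in closed form, ln(sum p_k^q) / (1 - q)."""
    return math.log(sum(pk**q for pk in p)) / (1.0 - q)

def shannon_entropy(p):
    """Shannon entropy S = sum p_k ln(1/p_k), in nats."""
    return sum(pk * math.log(1.0 / pk) for pk in p)

p = [0.5, 0.25, 0.125, 0.125]               # hypothetical distribution
info = [math.log(1.0 / pk) for pk in p]     # random variable a_k = ln(1/p_k)

for q in (0.5, 0.9, 0.999, 1.001, 2.0):
    print(q, kn_average_q(p, info, q), renyi_entropy(p, q))
print("Shannon:", shannon_entropy(p))
```

The first two columns agree for every *q*, and both approach the Shannon value as *q* approaches 1 from either side.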

Still, notice that ⟨*a* ⊕ *b*⟩<sub>*f*</sub> = ⟨*a*⟩<sub>*f*</sub> ⊕ ⟨*b*⟩<sub>*f*</sub> for any *f*, so had Rényi been thinking in arithmetic categories, he would not have arrived at his *f<sub>α</sub>*. Yet, *f<sub>α</sub>* is an interesting special case. For example,

$$p\_k' = f\_q^{-1}(p\_k) = \frac{1}{q-1} \ln(1/p\_k). \tag{80}$$

The random variable *a<sub>k</sub>* = log<sub>*b*</sub>(1/*p<sub>k</sub>*) is, according to Shannon [49,55], the amount of information obtained by observing an event whose probability is *p<sub>k</sub>*. The choice of *b* defines units of information. Therefore, Rényi's non-Diophantine probability *p*′<sub>*k*</sub> is the amount of information encoded in *p<sub>k</sub>*.

#### **7. Escort Probabilities and Quantum Mechanical Hidden Variables**

Non-Diophantine arithmetics have several properties that make them analogous to sets of values of incompatible random variables in quantum mechanics. Generalized arithmetics and non-Newtonian calculi have nontrivial consequences for the problem of hidden variables and completeness of quantum mechanics.

**Example 7.** *Pauli matrices σ*<sub>1</sub> *and σ*<sub>2</sub> *represent random variables whose values are s*<sub>1</sub> = ±1 *and s*<sub>2</sub> = ±1*, respectively. However, it is not allowed to assume that σ*<sub>1</sub> + *σ*<sub>2</sub> *represents a random variable whose possible values are s*<sub>1</sub> + *s*<sub>2</sub> = 0, ±2*, even though an average of σ*<sub>1</sub> + *σ*<sub>2</sub> *is a sum of independent averages of σ*<sub>1</sub> *and σ*<sub>2</sub>*. In non-Diophantine arithmetic one encounters a similar problem. In general it makes no sense to perform additions of the form x*<sub>X</sub> + *y*<sub>Y</sub> *even if x*<sub>X</sub> ∈ R *and y*<sub>Y</sub> ∈ R*. One should not be surprised if non-Diophantine probabilities turn out to be analogous to quantum probabilities, at least in some respects.*

Normalization of probability implies

$$1\_X = f^{-1}(1) = f^{-1}\left(\sum\_k p\_k\right) = f^{-1}\left(\sum\_k f(p\_k')\right) = \bigoplus\_k p\_k'.\tag{81}$$

In principle, 1<sub>X</sub> ≠ 1. An interesting and highly nontrivial case occurs if both *p<sub>k</sub>* and *p*′<sub>*k*</sub> = *f*<sup>−1</sup>(*p<sub>k</sub>*) are probabilities in the ordinary sense, i.e., in addition to (81) one finds 1<sub>X</sub> = 1, 0 ≤ *p*′<sub>*k*</sub> ≤ 1, and ∑<sub>*k*</sub> *p*′<sub>*k*</sub> = 1. What can then be said about *f*? We can formalize the question as follows.

**Problem 1.** *Find a characterization of those functions g* : [0, 1] → [0, 1] *that satisfy*

$$\sum\_{k} g(p\_k) = 1,\quad \text{for any choice of probabilities } p\_k. \tag{82}$$

In analogy to the generalized thermostatistics literature we can term *p*′<sub>*k*</sub> = *g*(*p<sub>k</sub>*) the escort probabilities [56–58]. Notice that we are *not* interested in the trivial solution, often employed in the context of Tsallis and Rényi entropies, where *p<sub>k</sub>* is replaced by *p<sub>k</sub>*<sup>*q*</sup> and then *renormalized*,

$$P\_k = \frac{p\_k^q}{\sum\_j p\_j^q} = g\_k(p\_1, \dots, p\_n, \dots), \tag{83}$$

as *g<sub>k</sub>*(*p*<sub>1</sub>, ... , *p<sub>n</sub>*, ...) ≠ *g*(*p<sub>k</sub>*) for a single function *g* of one variable. As we will shortly see, the solution of (82) turns out to have straightforward implications for the quantum mechanical problem of hidden variables, and relations between classical and quantum probabilities.

The most nontrivial result is found for binary probabilities, *p*<sub>1</sub> + *p*<sub>2</sub> = 1.

**Lemma 1.** *g*(*p*<sub>1</sub>) + *g*(*p*<sub>2</sub>) = 1 *for all p*<sub>1</sub> + *p*<sub>2</sub> = 1 *if and only if*

$$g(p) = \frac{1}{2} + h\left(p - \frac{1}{2}\right) \tag{84}$$

*where h*(−*x*) = −*h*(*x*)*.*

**Proof.** See Appendix B.

The lemma has profound consequences for foundations of quantum mechanics, as it allows one to circumvent Bell's theorem by non-Newtonian hidden variables. For more details the readers are referred to [13,15], but here are just a few examples.

**Example 8.** *The trivial case g*(*p*) = *p implies h*(*x*) = *x, where* 0 ≤ *p* ≤ 1 *and* −1/2 ≤ *x* ≤ 1/2*.*

**Example 9.** *Consider g*(*p*) = sin<sup>2</sup>(*πp*/2)*. Then,*

$$h(\mathbf{x}) = \mathbf{g}\left(\mathbf{x} + \frac{1}{2}\right) - \frac{1}{2} = \frac{1}{2}\sin\pi\mathbf{x}.\tag{85}$$

*Let us cross-check,*

$$g(p) + g(1 - p) = \sin^2 \frac{\pi}{2} p + \sin^2 \frac{\pi}{2} (1 - p) = \sin^2 \frac{\pi}{2} p + \cos^2 \frac{\pi}{2} p = 1. \tag{86}$$

*Now let p* = (*π* − *θ*)/*π be the probability of finding a point belonging to the overlap of two half-circles rotated by θ. Then,*

$$g(p) = \sin^2 \left( \frac{\pi}{2}\, \frac{\pi - \theta}{\pi} \right) = \cos^2 \frac{\theta}{2} \tag{87}$$

*is the quantum-mechanical law describing the conditional probability for two successive measurements of spin-1/2 in two Stern–Gerlach devices placed one after another, with relative angle θ. Escort probability has become a quantum probability.*
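The statements of Example 9 can be cross-checked numerically. The sketch below evaluates the normalization (86), the oddness of *h* from (85), and the quantum law (87); the function names and sample points are illustrative.

```python
import math

def g(p):
    """Escort map of Example 9: g(p) = sin^2(pi p / 2)."""
    return math.sin(0.5 * math.pi * p) ** 2

def h(x):
    """Function h of Lemma 1, obtained from g via (85)."""
    return g(x + 0.5) - 0.5

def overlap_probability(theta):
    """p = (pi - theta)/pi, the overlap of two half-circles rotated by theta."""
    return (math.pi - theta) / math.pi

for p in (0.0, 0.1, 0.37, 0.5, 0.9, 1.0):
    print(p, g(p) + g(1.0 - p))                       # always 1, as in (86)

for theta in (0.0, 0.3, 1.0, math.pi / 2, math.pi):
    print(theta, g(overlap_probability(theta)), math.cos(theta / 2) ** 2)
```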

**Example 10.** *Let us continue the analysis of Example 9. The function g* : [0, 1] → [0, 1]*, g*(*p*) = sin<sup>2</sup>(*πp*/2)*, is one-to-one. It can be extended to a bijection g* : R → R *by periodic repetition,*

$$g(\mathbf{x}) = n + \sin^2 \frac{\pi}{2} (\mathbf{x} - \mathbf{n}), \quad n \le \mathbf{x} \le n + 1, \quad n \in \mathbb{Z}. \tag{88}$$

*Now let f* = *g*<sup>−1</sup>*. Equation (88) leads to a non-Diophantine arithmetic and non-Newtonian calculus. Let θ* = *α* − *β,* 0 ≤ *θ* ≤ *π, be the angle between two vectors representing directions of Stern–Gerlach devices. The quantum conditional probability (87) can be represented in a non-Newtonian hidden-variable form,*

$$\cos^2\frac{\alpha-\beta}{2} = \sin^2\left(\frac{\pi}{2}\,\frac{\pi-(\alpha-\beta)}{\pi}\right) = f^{-1}\left(\frac{1}{\pi}\int\_{\alpha}^{\pi+\beta} \mathrm{d}r\right) = f^{-1}\left(\int\_{f(\alpha')}^{f(\pi'\oplus\beta')} \tilde{\rho}(r)\,\mathrm{d}r\right) = \int\_{\alpha'}^{\pi'\oplus\beta'} \rho(\lambda)\,\mathrm{D}\lambda,\tag{89}$$

*where x*′ = *f*<sup>−1</sup>(*x*)*. Here, ρ is a conditional probability density of non-Newtonian hidden variables (the half-circle is a result of conditioning by the first measurement).*
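The chain of equalities in the hidden-variable representation can be followed step by step in code. The sketch below implements *g* from (88), its inverse *f*, and the induced addition ⊕, and assumes the uniform density 1/*π* under the Newtonian integral; the angles and the helper names are illustrative assumptions.

```python
import math

def g(x):
    """Extension (88) of sin^2(pi x / 2) to a bijection R -> R."""
    n = math.floor(x)
    return n + math.sin(0.5 * math.pi * (x - n)) ** 2

def f(y):
    """f = g^{-1}, inverted on each interval [n, n + 1]."""
    n = math.floor(y)
    return n + (2.0 / math.pi) * math.asin(math.sqrt(y - n))

def prime(x):
    """x' = f^{-1}(x) = g(x)."""
    return g(x)

def oplus(x1, x2):
    """Non-Diophantine addition induced by f."""
    return g(f(x1) + f(x2))

def nn_integral_uniform(lower, upper):
    """Non-Newtonian integral of rho from lower to upper, assuming density 1/pi.

    Evaluated as f^{-1} of the ordinary integral of 1/pi from f(lower) to f(upper).
    """
    return g((f(upper) - f(lower)) / math.pi)

alpha, beta = 1.0, 0.3     # illustrative angles with 0 <= alpha - beta <= pi
lhs = math.cos(0.5 * (alpha - beta)) ** 2
rhs = nn_integral_uniform(prime(alpha), oplus(prime(math.pi), prime(beta)))
print(lhs, rhs)
```

The integration limits are elements of the deformed arithmetic, while the quantum probability appears only after the final application of *f*<sup>−1</sup>.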

Non-Newtonian calculus shifts the discussion on relations between classical and quantum probability, or classical and quantum information, into unexplored areas.

**Example 11.** *In typical Bell-type experiments one deals with four probabilities, corresponding to four combinations* (±, ±)*,* (±, ∓) *of pairs of binary results. The corresponding non-Newtonian model is obtained by rescaling g*(*p<sub>k</sub>*) → *p g*(*p<sub>k</sub>*/*p*)*, with p* = 1/2*. The rescaled bijection satisfies g*(*p*<sub>1</sub>) + *g*(*p*<sub>2</sub>) = *p for any p*<sub>1</sub> + *p*<sub>2</sub> = *p. Explicitly,*

$$g(p\_{++}) + g(p\_{+-}) + g(p\_{-+}) + g(p\_{--}) = 1 = p\_{++} + p\_{+-} + p\_{-+} + p\_{--} \tag{90}$$

*The resulting hidden-variable model is local, but standard Bell's inequality cannot be proved [15]. Why? Mainly because the non-Newtonian integral is not a linear map with respect to the ordinary Diophantine addition and multiplication (unless f is linear), whereas the latter is always assumed in proofs of Bell-type inequalities.*

A generalization to arbitrary probabilities, *p*<sub>1</sub> + ··· + *p<sub>n</sub>* = 1, leads to an affine deformation of arithmetic, an analogue of Benioff number scaling [21–25]. Affine transformations do not affect Kolmogorov–Nagumo averages.

**Lemma 2.** *Consider probabilities p*<sub>1</sub>, ... , *p<sub>n</sub>, n* ≥ 3*. The values g*(*p<sub>k</sub>*) *are probabilities for any choice of p<sub>k</sub> if and only if g*(*p<sub>k</sub>*) = (1 − *a* + 2*ap<sub>k</sub>*)/(*n* + (2 − *n*)*a*)*,* −1 ≤ *a* ≤ 1*.*

**Proof.** See Appendix C.
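The sufficiency part of Lemma 2 is simple to test numerically. In the sketch below the sampling of random probability vectors and the chosen values of *a* are illustrative assumptions.

```python
import random

def g(p, n, a):
    """Affine escort map of Lemma 2 for n outcomes and parameter -1 <= a <= 1."""
    return (1.0 - a + 2.0 * a * p) / (n + (2.0 - n) * a)

def random_probabilities(n, rng):
    """A random point of the probability simplex (illustrative sampling)."""
    w = [rng.random() for _ in range(n)]
    total = sum(w)
    return [x / total for x in w]

rng = random.Random(0)
n = 4
for a in (-1.0, -0.3, 0.5, 1.0):
    p = random_probabilities(n, rng)
    escort = [g(pk, n, a) for pk in p]
    print(a, sum(escort), min(escort), max(escort))
```

For every admissible *a* the escort values sum to 1 and stay within [0, 1], and *a* = 1 reproduces *g*(*p*) = *p*.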

The bijection *g* implied by Lemma 2 depends on *n*. In infinite-dimensional systems, that is, when *n* can be arbitrary, the only option is *a* = 1 and thus *g*(*p*) = *p* is the only acceptable solution. However, in spin systems there exists an alternative interpretation of this property: the dimension *n* grows with spin in such a way that *g<sub>n</sub>*(*p*) → *p* with *n* → ∞ is a correspondence principle, meaning that very large spins are practically classical. The transition non-Diophantine → Diophantine, non-Newtonian → Newtonian becomes an analogue of non-classical → classical.

**Example 12.** *Limitations imposed by Lemma 2 can nevertheless be circumvented in various ways. For example, let g*(1) = 1 *for a solution g from Lemma 1, so that* 1<sub>X</sub> = 1*. Obviously,*

$$1 = 1\_{\mathbb{X}} \odot \cdots \odot 1\_{\mathbb{X}} = 1 \odot \cdots \odot 1 = 1 \cdot \ldots \cdot 1. \tag{91}$$

*Replacing each of the* 1*s by an appropriate sum of binary conditional probabilities*

$$1 = g(p\_{k\_1 \ldots k\_n 1}) + g(p\_{k\_1 \ldots k\_n 2}) = g(p\_{k\_1 \ldots k\_n 1}) \oplus g(p\_{k\_1 \ldots k\_n 2}) \tag{92}$$

*we can generate various conditional classical or quantum probabilities typical of a generalized Bernoulli-type process, representing several classical or quantum filters placed one after another.*

#### **8. Non-Newtonian Maximum Entropy Principle**

Let us finally discuss the implications of our non-Newtonian form (32) of entropy for maximum entropy principles. Assume probabilities belong to X. Define the Massieu function [43] by

$$\Phi = S \ominus\_{\mathbb{Z}} \alpha\_{\mathbb{Z}} \odot\_{\mathbb{Z}} N \ominus\_{\mathbb{Z}} \beta\_{\mathbb{Z}} \odot\_{\mathbb{Z}} H,\tag{93}$$

$$N = \bigoplus\_{k}{}^{\mathbb{X}}\_{\mathbb{Z}}\, p\_k = f\_{\mathbb{Z}}^{-1} \left( \sum\_{k} f\_{\mathbb{X}}(p\_k) \right),\tag{94}$$

$$H = \bigoplus\_{k}{}\_{\mathbb{Z}}\; p\_{k} \odot\_{\mathbb{Z}}^{\mathbb{X}\mathbb{E}} E\_{k} = f\_{\mathbb{Z}}^{-1} \left( \sum\_{k} f\_{\mathbb{X}}(p\_{k}) f\_{\mathbb{E}}(E\_{k}) \right),\tag{95}$$

where *E<sub>k</sub>* ∈ E, and *α*<sub>Z</sub> = *f*<sub>Z</sub><sup>−1</sup>(*α*), *β*<sub>Z</sub> = *f*<sub>Z</sub><sup>−1</sup>(*β*) are Lagrange multipliers. Explicitly,

$$\Phi = f\_{\mathbb{Z}}^{-1} \left[ \sum\_{k} f\_{\mathbb{X}}(p\_k) \ln \left( 1/f\_{\mathbb{X}}(p\_k) \right) - \alpha \sum\_{k} f\_{\mathbb{X}}(p\_k) - \beta \sum\_{k} f\_{\mathbb{X}}(p\_k) f\_{\mathbb{E}}(E\_k) \right]. \tag{96}$$

Vanishing of the derivative of Φ,

$$\frac{\mathrm{D}\Phi}{\mathrm{D}p\_{l}} = 0\_{\mathbb{Z}}, \tag{97}$$

is equivalent to the standard formula for probabilities *f*X(*pk*) (see the second form of non-Newtonian derivative in (18)),

$$\frac{\mathrm{d}}{\mathrm{d}f\_{\mathbb{X}}(p\_l)} \left( \sum\_{k} f\_{\mathbb{X}}(p\_k) \ln \left( 1/f\_{\mathbb{X}}(p\_k) \right) - \alpha \sum\_{k} f\_{\mathbb{X}}(p\_k) - \beta \sum\_{k} f\_{\mathbb{X}}(p\_k) f\_{\mathbb{E}}(E\_k) \right) = 0. \tag{98}$$

Accordingly, the solution reads

$$p\_k = f\_{\mathbb{X}}^{-1}\left(e^{-\beta f\_{\mathbb{E}}(E\_k)}/\tilde{Z}(\beta)\right) = \mathrm{Exp}\left(\ominus\_{\mathbb{E}}\beta\_{\mathbb{E}}\odot\_{\mathbb{E}}E\_k\right)\oslash\_{\mathbb{X}}Z\_{\mathbb{X}}(\beta),\tag{99}$$

$$Z\_{\mathbb{X}}(\beta) = f\_{\mathbb{X}}^{-1}(\tilde{Z}(\beta)) = f\_{\mathbb{X}}^{-1}(\tilde{Z}(f\_{\mathbb{E}}(\beta\_{\mathbb{E}}))) = Z(\beta\_{\mathbb{E}}),\tag{100}$$

and involves the exponential function Exp : E → X we have encountered before. The normalization,

$$1\_{\mathbb{X}} = \bigoplus\_{k}{}\_{\mathbb{X}}\, p\_k = f\_{\mathbb{X}}^{-1} \left( \sum\_{k} e^{-\beta f\_{\mathbb{E}}(E\_k)} / \tilde{Z}(\beta) \right) = f\_{\mathbb{X}}^{-1}(1),\tag{101}$$

implies the usual relation *Z̃*(*β*) = ∑<sub>*k*</sub> *e*<sup>−*β f*<sub>E</sub>(*E<sub>k</sub>*)</sup>.

Equivalently, directly at the level of X,

$$\begin{split} Z\_{\mathbb{X}}(\beta) &= f\_{\mathbb{X}}^{-1} \left( \sum\_{k} e^{-\beta f\_{\mathbb{E}}(E\_{k})} \right) = f\_{\mathbb{X}}^{-1} \left( \sum\_{k} f\_{\mathbb{X}} \circ f\_{\mathbb{X}}^{-1} \left( e^{f\_{\mathbb{E}} \left( \ominus\_{\mathbb{E}} \beta\_{\mathbb{E}} \odot\_{\mathbb{E}} E\_{k} \right)} \right) \right) \\ &= f\_{\mathbb{X}}^{-1} \left( \sum\_{k} f\_{\mathbb{X}} \left( \operatorname{Exp} \left( \ominus\_{\mathbb{E}} \beta\_{\mathbb{E}} \odot\_{\mathbb{E}} E\_{k} \right) \right) \right) = \bigoplus\_{k} \operatorname{Exp} \left( \ominus\_{\mathbb{E}} \beta\_{\mathbb{E}} \odot\_{\mathbb{E}} E\_{k} \right) = Z(\beta\_{\mathbb{E}}). \end{split} \tag{102}$$
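The construction (99)–(102) can be traced in a small numerical experiment. The sketch below assumes a Kaniadakis-type bijection for energies and a cubic bijection for probabilities; both choices, as well as the spectrum and the value of the multiplier, are hypothetical and serve only to illustrate the structure.

```python
import math

KAPPA = 0.5  # illustrative deformation of the energy arithmetic (assumption)

def f_E(x):
    """Assumed bijection for energies (Kaniadakis type)."""
    return math.asinh(KAPPA * x) / KAPPA

def f_E_inv(y):
    return math.sinh(KAPPA * y) / KAPPA

def f_X(x):
    """Assumed bijection for probabilities; cubing is only an example."""
    return x**3

def f_X_inv(y):
    return math.copysign(abs(y) ** (1.0 / 3.0), y)

def Exp(x):
    """Exp : E -> X, Exp(x) = f_X^{-1}(exp(f_E(x)))."""
    return f_X_inv(math.exp(f_E(x)))

def oplus_X(values):
    """Non-Diophantine sum in X."""
    return f_X_inv(sum(f_X(v) for v in values))

def div_X(a, b):
    """Non-Diophantine division in X."""
    return f_X_inv(f_X(a) / f_X(b))

def neg_beta_times_E(beta_E, E):
    """The element (minus beta_E) times E, computed in the arithmetic of E."""
    return f_E_inv(-f_E(beta_E) * f_E(E))

energies = [0.0, 1.0, 2.5, 4.0]   # hypothetical spectrum E_k
beta_E = 0.8                      # hypothetical Lagrange multiplier in E

boltzmann = [Exp(neg_beta_times_E(beta_E, E)) for E in energies]
Z_X = oplus_X(boltzmann)                       # partition function, as in (102)
p = [div_X(b, Z_X) for b in boltzmann]         # probabilities, as in (99)
one_X = f_X_inv(1.0)

print("p_k      :", p)
print("oplus p_k:", oplus_X(p), " 1_X:", one_X)   # normalization (101)
print("sum of f_X(p_k):", sum(f_X(pk) for pk in p), " ordinary sum of p_k:", sum(p))
```

With these choices the probabilities are normalized with respect to ⊕ in X, whereas their ordinary sum differs from 1.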

All the standard tricks one finds in thermodynamics textbooks will work here. For example,

$$\begin{split} H &= f\_{\mathbb{Z}}^{-1} \left( \sum\_{k} f\_{\mathbb{X}}(p\_{k}) f\_{\mathbb{E}}(E\_{k}) \right) = f\_{\mathbb{Z}}^{-1} \left( \sum\_{k} e^{-\beta f\_{\mathbb{E}}(E\_{k})} f\_{\mathbb{E}}(E\_{k}) / \tilde{Z}(\beta) \right) = f\_{\mathbb{Z}}^{-1} \left( -\frac{\mathrm{d} \ln \tilde{Z}(\beta)}{\mathrm{d}\beta} \right) \\ &= \ominus\_{\mathbb{Z}} f\_{\mathbb{Z}}^{-1} \left( \frac{\mathrm{d} \ln \tilde{Z}(\beta)}{\mathrm{d}\beta} \right) = \ominus\_{\mathbb{Z}} f\_{\mathbb{Z}}^{-1} \left( \frac{\mathrm{d} \ln \tilde{Z}(\beta)}{\mathrm{d} f\_{\mathbb{E}}(\beta\_{\mathbb{E}})} \right) = \ominus\_{\mathbb{Z}} f\_{\mathbb{Z}}^{-1} \left( \frac{\mathrm{d} \tilde{A} \left( f\_{\mathbb{E}}(\beta\_{\mathbb{E}}) \right)}{\mathrm{d} f\_{\mathbb{E}}(\beta\_{\mathbb{E}})} \right), \end{split} \tag{103}$$

for some function

$$\begin{array}{ccc} \mathbb{E} & \stackrel{A}{\longrightarrow} & \mathbb{Z} \\ f\_{\mathbb{E}}\Big\downarrow & & \Big\downarrow f\_{\mathbb{Z}} \\ \mathbb{R} & \stackrel{\tilde{A}}{\longrightarrow} & \mathbb{R} \end{array} \tag{104}$$

we yet have to determine. Clearly,

$$\ln \tilde{Z}(\beta) = \tilde{A}\left(f\_{\mathbb{E}}(\beta\_{\mathbb{E}})\right) = \tilde{A}(\beta), \tag{105}$$

$$\begin{aligned} A(x) &= f\_{\mathbb{Z}}^{-1}\left(\tilde{A}\left(f\_{\mathbb{E}}(x)\right)\right) = f\_{\mathbb{Z}}^{-1}\left(\ln \tilde{Z}\left(f\_{\mathbb{E}}(x)\right)\right) = f\_{\mathbb{Z}}^{-1}\left(\ln \left(f\_{\mathbb{X}} \circ f\_{\mathbb{X}}^{-1}\left[\tilde{Z}\left(f\_{\mathbb{E}}(x)\right)\right]\right)\right) \\ &= f\_{\mathbb{Z}}^{-1}\left(\ln f\_{\mathbb{X}}\left(Z(x)\right)\right) = \mathrm{Ln}\, Z(x), \end{aligned} \tag{106}$$

where *Z* : E → X, Ln : X → Z. Ultimately,

$$H = \ominus\_{\mathbb{Z}}\, \mathrm{D} (\mathrm{Ln} \circ Z) (\beta\_{\mathbb{E}}) \tag{107}$$
