1. Introduction
In recent decades, modeling using high-precision measuring instruments and powerful computers has become a comprehensive tool for studying natural phenomena and technological processes, i.e., material objects (MO). By its nature, modeling is an information process [1] realized between an MO and an observer through some kind of connection (channel). It can be shown by the following chain:

MO → channel → observer.
In what follows, a model means a mental structure designed to understand the essence of an observed MO. It is a symbolic representation of a physical system. The model is a framework of ideas and concepts through which a researcher/observer interprets his intuition, experience, observations, and experimental results. It includes a physical structure-model and a mathematical structure-model. The physical model is expressed as the set of natural laws inherent to the recognized object. It interprets the mathematical model, including its assumptions and constraints. The mathematical model is a set of equations using symbolic representations of quantitative variables in a simplified physical system. It helps the modeler to understand and quantify the physical model, thus enabling the physical-mathematical model to make precise predictions and serve different applications [2].
Currently, in most scientific publications, it is presupposed that the achievement of high-precision measurements and a relatively small difference between theoretical and experimental data allow a judgment on the appropriateness of a completed experiment as a criterion of validity of the proposed physical-mathematical model. In fact, uncertainties occur not only during measurements but also in the synthesis of the theoretical model. In this process, significant uncertainties arise when developing the model and in computer analysis or numerical computation, for instance those associated with the finite number of digits used to represent variables in calculations.
Over the past two decades, considerable efforts have been made to develop methods that allow the design of mathematical models with the lowest possible discrepancy from the observed MO. Numerous methods and criteria have been proposed to achieve this goal. However, all of them are focused on identifying the a-posteriori uncertainty caused by the ineradicable gap between a model and a physical system. The same situation exists in measurement theory, which covers only aspects of data analysis and measurement procedures for an observed variable, or aspects arising after a mathematical model has been formulated. Thus, the uncertainty that exists before any experiment or computer simulation, caused solely by the limited number of variables recorded in the mathematical model, is generally ignored in measurement theory.
The present approach focuses on formulating the a-priori relationship between the level of detail in the description of the MO (the number of recorded variables) and the lowest achievable total experimental uncertainty of the main researched parameter.
2. Materials
The above-mentioned aspects are presented and analyzed by the general theory of information (GTI) [1]. According to this theory, the process of physical-mathematical model formulation can be called information processing. It includes information construction, an operation in which the information and/or its initial representations of the MO are not changed but new information and/or representations are created. Physicists and engineers obtain information from the MO and can develop scientific laws and analyze natural phenomena or engineering processes only on the basis of this information.
In other words, an observer knows about a certain MO only if the MO has a name in the observer’s mind and there are data in his mind that represent the properties of the MO. It must be emphasized that no observer or group of scientists is ideal because, in the opposite case, they would have to be capable of acquiring potentially infinite knowledge.
Despite the numerous scientific publications known to the author concerning the possible use of the concepts of “amount of information” and “entropy” in field experiments and computer modeling, examples of the practical use of information theory with concrete numerical calculations in physics and engineering are few. In the context of this work, a number of articles should be noted.
For instance, in one of the first innovative works [3], L. Brillouin related the concept of entropy to the uncertainty of physical experiment results in order to determine the achievable accuracy of an experiment. For a more detailed study of the accuracy achieved in an experiment, an additional metric was proposed: the comparative uncertainty, the ratio of the absolute measurement uncertainty of a variable to the interval over which that variable changes. It was explained in detail that, without any knowledge of this interval, any experimental research loses its physical meaning.
In [4], the Akaike Information Criterion (AIC) was proposed. It is a metric of the relative quality of a statistical model for a chosen set of data. Given a collection of models for the data, AIC estimates the quality of each model relative to each of the others. AIC is founded on the concept of entropy in information theory: it offers a relative estimate of the information lost when a given model is used to represent the process that generates the data. AIC can be conceived of as a theoretical tool for empirical modeling. When choosing among candidate models to represent experimental data, a researcher should usually select the model with the smallest AIC. Unfortunately, AIC does not determine the quality of a model in an absolute sense: if all candidate models fit poorly, AIC gives no indication of this. Although AIC can be used in concrete practical cases, its application is quite different from the approach proposed here.
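As an illustration of how AIC-based selection works in practice, the following minimal sketch (our own illustration, not code from [4]) compares least-squares polynomial fits; for Gaussian errors, AIC reduces to n·ln(RSS/n) + 2k up to a model-independent additive constant, where k is the number of fitted parameters.

```python
# A minimal sketch of AIC-based model selection (our own illustration, not
# code from [4]). For least-squares fits with Gaussian errors, AIC reduces
# to n*ln(RSS/n) + 2k up to a model-independent additive constant.
import numpy as np

def aic_least_squares(y, y_pred, k):
    """AIC (up to an additive constant) for a k-parameter least-squares fit."""
    n = len(y)
    rss = float(np.sum((y - y_pred) ** 2))
    return n * np.log(rss / n) + 2 * k

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.1, x.size)  # data generated by a linear law

for degree in (0, 1, 2):  # constant, linear, quadratic candidate models
    coeffs = np.polyfit(x, y, degree)
    k = degree + 1        # number of fitted parameters
    print(degree, round(aic_least_squares(y, np.polyval(coeffs, x), k), 1))
# The smallest AIC is expected for the linear model (degree 1): extra
# parameters reduce RSS slightly but are penalized by the +2k term.
```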
A study of quantum gates was presented in [5]. The author considered these gates as physical devices characterized by the existence of random uncertainty. The reliability of quantum gates was investigated from the perspective of information complexity. In turn, the complexity of a gate’s operation was determined by the difference between the entropies of the variables characterizing the initial and final states. The study showed that the gate operation may be associated with unlimited entropy, implying the impossibility of realizing the quantum gate’s function under certain conditions. The relevance of this study comes from its conceptual use of variables as a specific metric for calculating the change in information quantity between the input and output of the apparatus model.
Information theory-based principles have been investigated in relation to the uncertainty of mathematical models of water-based systems [6]. In this research, the mismatch between physically-based models and observations was minimized by the use of intelligent data-driven models and methods of information theory. Real successes were achieved in developing forecast models for the Rhine and Meuse rivers in the Netherlands. In addition to the possibility of forecasting the uncertainties and accuracy of model predictions, the application of information theory principles indicates that, alongside appropriate analysis techniques, patterns in model uncertainties can be used as indicators for making further improvements to physically-based computational models. At the same time, there have been no attempts to apply these methodologies to other physical or engineering tasks.
In [7], a systematic review was conducted of the major applications of information theory to physical systems, of its methods in various subfields of physics, and of examples of how specific disciplines adapt this tool. In the context of the approach proposed here for practical purposes in experimental and theoretical physics and engineering, the physics of computation, acoustics, climate physics, and chemistry were mentioned. However, no surveys, reviews, or research studies were found that apply information theory to calculating the uncertainty of a model of a phenomenon or technological process.
The design information entropy was introduced as a state that reflects both complexity and refinement [8]. The author argued that it can be useful as a measure of design efficacy and design quality. The method was applied to the conceptual design of an unmanned aircraft, going through concept generation, concept selection, and parameter optimization. For the purposes of this study, it is important to note that the design information entropy, introduced as a state, can serve as a quantitative description of various aspects of the design process, with regard both to structural information on architecture and connectivity and to parameter values, discrete and continuous.
In [9], an upper limit, called the Bekenstein bound, was calculated for the quantity of information contained within a given bounded object: the maximum amount of information required to perfectly describe a given physical system. It implies that the quantity of information of a physical system must be finite if the space occupied by the object and its energy are finite. In informational terms, this bound is given by

$$Y \le \frac{2\pi R E}{\hbar c \ln 2},$$

where Y is the information, expressed in the number of bits, contained in the quantum states of the chosen object sphere. The ln 2 factor appears because the information is counted in bits, while the entropy is defined via the natural logarithm of the number of quantum states; R is the radius of a sphere that can enclose the given system; E is the total mass-energy, including any rest masses; ħ is the reduced Planck constant; and c is the speed of light. The results are purely theoretical in nature, although it is possible, judging by the numerous references to this article, that applications of the proposed formula may be found in medicine or biology.
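For a sense of scale, the following short numerical sketch (our own illustration; the radius and mass below are arbitrary assumed values, not taken from [9]) evaluates the bound for a sphere of radius 1 m containing a rest mass of 1 kg.

```python
# A short numerical illustration (our own; radius and mass are arbitrary
# assumed values) of the Bekenstein bound Y <= 2*pi*R*E/(hbar*c*ln2).
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
C = 299792458.0          # speed of light, m/s

def bekenstein_bound_bits(radius_m, energy_j):
    """Upper limit, in bits, on the information within a sphere of radius R
    containing total mass-energy E."""
    return 2.0 * math.pi * radius_m * energy_j / (HBAR * C * math.log(2.0))

mass_kg = 1.0                    # assumed rest mass
energy = mass_kg * C ** 2        # total mass-energy E = m*c^2
print(f"{bekenstein_bound_bits(1.0, energy):.2e} bits")  # ≈ 2.58e43 bits
```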
An approach has been proposed [10] that uses the tools of estimation theory to fuse information from multi-fidelity analyses, resulting in a Bayesian-based approach to mitigating risk in complex design. Maximum entropy characterizations of model discrepancies were used to represent epistemic uncertainties due to modeling limitations and model assumptions. The methodology was applied to multidisciplinary design optimization and demonstrated on a wing-sizing problem for a high-altitude, long-endurance aircraft. Uncertainties were explicitly maintained and propagated through the design and synthesis process, resulting in quantified uncertainties on the output estimates of quantities of interest. However, this approach focuses on the optimization of a predefined, computer-ready simulation model.
Thus, there is a very limited amount of literature on how to use the “amount of information” for the practical development of a model that describes an MO with the greatest possible accuracy. At the same time, no one has solved the difficult task of quantifying the uncertainty of a conceptual model based on the amount of information embedded in the model and caused by the choice of a limited number of registered variables.
3. Methods
De facto, information processing is based on two axioms:
1. Observation is framed by a System of Primary Variables (SPV). General knowledge of the world is significantly limited (approximate knowledge) by the act of choosing an SPV. Whatever people know, i.e., all scientific knowledge, depends only on the information they have, and this information is framed by the SPV. The SPV is a set of variables (primary and, constructed on their basis, secondary [11]) that are necessary and sufficient to describe the known laws of nature, both in physical content and quantitatively. It includes the primary and secondary variables used for describing different classes of phenomena. Examples of SPVs are SI (the International System of Units) and CGS (centimeter-gram-second). The number of dimensional variables included in an SPV is finite. Dimensional variables have the capacity to characterize the world’s physical properties and, in particular, the observed MO, qualitatively and quantitatively. In fact, the SPV is an original and unique channel (a generalizing carrier of information) by which the scientist obtains information about the researched object.
2. The number of variables taken into account in a physical-mathematical model is limited. The limits of the description of the studied MO are set by the choice of the class of phenomena (CoP) and by the number of secondary parameters taken into account in the mathematical model.
A CoP is a set of physical phenomena and processes described by a finite number of primary and secondary variables that characterize certain specific features of the MO in qualitative and quantitative respects [12]. For heat and electromagnetic processes, for example, it may be useful to apply the SI dimensions L (length), M (mass), T (time), Θ (temperature), and I (electric current), i.e., CoPSI ≡ LMTΘI. In thermodynamics, the base set of dimensions often includes L, M, T, and Θ (the thermodynamic temperature), i.e., CoPSI ≡ LMTΘ. If the SPV and CoP are not given, then the definition of “information about the MO” loses its force, and the information quantity may increase to infinity or decrease to zero. Without an SPV, the modeling of an MO is impossible. “You can never get something out of nothing, not even watching” [3]. The SPV may be interpreted as the base of all accessible knowledge that humans are able to have about their environment at the moment.
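To make the notion of a CoP concrete, the following minimal sketch (our own construction, not part of the original method) encodes SI dimensional exponents as tuples and checks whether a variable can be described within a chosen CoP:

```python
# A small illustration (our own construction, not part of the original
# method): SI dimensional exponents encoded as tuples (L, M, T, Θ, I), with a
# check of whether a variable can be described within a chosen CoP.

VARIABLES = {                      # exponents of (L, M, T, Θ, I)
    "velocity":             (1, 0, -1, 0, 0),   # m/s
    "thermal conductivity": (1, 1, -3, -1, 0),  # W/(m*K)
    "voltage":              (2, 1, -3, 0, -1),  # kg*m^2/(s^3*A)
}

COP_LMT_THETA = (True, True, True, True, False)  # Θ included, I excluded

def in_cop(exponents, cop_mask):
    """A variable fits a CoP if it uses no dimension outside that CoP."""
    return all(e == 0 or ok for e, ok in zip(exponents, cop_mask))

for name, dim in VARIABLES.items():
    print(name, "->", in_cop(dim, COP_LMT_THETA))
# velocity -> True; thermal conductivity -> True; voltage -> False,
# i.e., voltage requires the CoP LMTI (or larger) rather than LMTΘ.
```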
The declaration of the above axioms means the following:
- the fundamental property of the ultimate completeness of realization of the possible combinations of dimensionless complexes inherent in the chosen SPV;
- the mandatory organization of dimensionless complexes within a given class of phenomena, without which it is impossible to ensure the individuality of a particular model describing the observed object of study.
Taking these axioms into account, we can formulate a minimum set of mandatory defining properties of models within the framework of the proposed method:
- the possibility of the independent, individual existence of different models describing the same object;
- the existence of a quantitative evaluation of the differences between models that belong to dissimilar classes of phenomena and are used for the same observed process or phenomenon;
- the presence of a known structure of the model, whose elements can serve as dimensionless complexes;
- for each selected model, initially, before any experiment or the development and implementation of a computer simulation, a minimum achievable comparative uncertainty can be calculated (see below).
Based on the above reasoning, we denote by Δpmm the uncertainty in determining the dimensionless theoretical field u, “embedded” in the physical-mathematical model and caused only by its dimension. Note that the dimension of a model is its property of reflecting a certain number of characteristics of the MO and of its external and internal connections (links).
The uncertainty Δpmm can be represented as the sum of two terms:

$$\Delta_{\mathrm{pmm}} = \Delta'_{\mathrm{pmm}} + \Delta''_{\mathrm{pmm}},$$

where Δpmm′ is due to the CoP, being associated with the reduction of the number of counted primary parameters compared with the SPV, and Δpmm′′ is due to the choice of the number of counted influencing factors within the framework of the chosen CoP.
It was shown in [13] that the a-priori amount of information about the observed object due to the choice of CoP, ΔA′, is linked to Δpmm′ and S (the dimensionless interval of observation of the field u) by the dependence

$$\Delta A' = k_{b}\ln\frac{S}{\Delta'_{\mathrm{pmm}}} = k_{b}\ln\frac{\aleph_{\mathrm{SI}}}{z'-\beta'}, \qquad \Delta'_{\mathrm{pmm}} = S\,\frac{z'-\beta'}{\aleph_{\mathrm{SI}}},$$

where kb is the Boltzmann constant, z′ is the number of physical dimensional variables in the selected CoP, β′ is the number of primary physical dimensional variables in the selected CoP, and אSI is the maximum number of dimensionless complexes of SI, אSI = 38,265.
Following the same reasoning, it can be shown that Δpmm′′ equals

$$\Delta''_{\mathrm{pmm}} = S\,\frac{z''-\beta''}{z'-\beta'},$$

where z′′ is the number of physical dimensional variables recorded in the mathematical model and β′′ is the number of primary physical dimensional variables recorded in the model. Then, summing Δpmm′ and Δpmm′′, one can estimate the value of Δpmm.
All the above can be summarized in the form of the א-hypothesis [13], based on the main assumptions of GTI:
Let the chosen system of primary variables contain a total of G dimensional physical variables, ξ of which have independent dimensions. Within the framework of a class of phenomena (with a total number of dimensional variables z′ and a number of primary variables β′), there is a dimensionless field u observed in a given range of values S. Then the absolute uncertainty of u, for a given number z′′ of recorded physical dimensional variables, of which β′′ are primary, can be determined from the relationship:

$$\Delta_{\mathrm{pmm}} = S\left[\frac{z'-\beta'}{\aleph_{\mathrm{SI}}}+\frac{z''-\beta''}{z'-\beta'}\right].\qquad(6)$$
Using (6), one can find the uncertainty of calculations in the theoretical analysis of physical phenomena. On the other hand, Equation (6) also sets a limit on the advisability of increasing the measurement accuracy in experimental studies. Equation (6) has a physical meaning: it testifies that in nature there is a fundamental limit to the accuracy with which any observed material object can be measured, a limit that cannot be surpassed by any improvement of instruments and measurement methods. The value of this limit is much larger than that given by the Heisenberg uncertainty relation and places severe restrictions on microphysics.
At its core, Δpmm is the a-priori conceptual “first-born” uncertainty that is inherent to any physical-mathematical model and is independent of the measurement process. The uncertainty determined by the proposed principle is not the result of measurement; it represents an intrinsic property of the model, caused only by the number of selected variables and the chosen CoP. Therefore, the overall model uncertainty, including the additional uncertainties associated with the structure of the model and its subsequent computerization, will be much greater than Δpmm. In fact, Equation (6) can be regarded as an uncertainty principle for the model development process: any change in the level of detail of the description of the observed object (z′′ − β′′; z′ − β′) causes a change in the model uncertainty Δpmm and in the accuracy of each main variable characterizing the properties of the object’s internal structure.
Within the above approach, we can find the relation between (z′′ − β′′) and (z′ − β′) for which the “comparative uncertainty” Δpmm/S [3] is minimal:

$$\frac{z'-\beta'}{\aleph_{\mathrm{SI}}} = \frac{z''-\beta''}{z'-\beta'}.\qquad(9)$$
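As a consistency check, under the reconstructed form of Equation (6), condition (9) follows from setting the derivative of the comparative uncertainty with respect to (z′ − β′) to zero:

$$\frac{\partial}{\partial(z'-\beta')}\left[\frac{z'-\beta'}{\aleph_{\mathrm{SI}}}+\frac{z''-\beta''}{z'-\beta'}\right]=\frac{1}{\aleph_{\mathrm{SI}}}-\frac{z''-\beta''}{(z'-\beta')^{2}}=0\;\Longrightarrow\;z''-\beta''=\frac{(z'-\beta')^{2}}{\aleph_{\mathrm{SI}}}.$$

At this stationary point, the two terms of (6) are equal, and the minimal comparative uncertainty is Δpmm/S = 2(z′ − β′)/אSI.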
According to (9), for SI and the chosen CoP, for example LMTI, the least comparative uncertainty can be reached at (z′′ − β′′)LMTI = 6; for LMTΘ, the number of dimensionless variables (z′′ − β′′)LMTΘ causing a minimum value of Δpmm/S is 19 [13].
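The following short sketch (our own verification code; the z′ − β′ counts per CoP are illustrative assumptions chosen only to be consistent with the optima quoted above, and [13] should be consulted for the actual tabulated values) reproduces these optima numerically:

```python
# A verification sketch (our own code). Assuming the reconstructed Eq. (6),
# the optimal number of dimensionless complexes is (z''-β'') = (z'-β')²/א_SI
# and the minimal comparative uncertainty is 2(z'-β')/א_SI. The z'-β' counts
# per CoP below are illustrative assumptions chosen to be consistent with the
# optima quoted above; see [13] for the actual tabulated values.

ALEPH_SI = 38_265  # maximum number of dimensionless complexes in SI

def optimum(x):
    """Optimal z''-β'' and minimal comparative uncertainty for x = z'-β'."""
    return round(x * x / ALEPH_SI), 2.0 * x / ALEPH_SI

for cop, x in {"LMTI": 469, "LMTΘ": 846}.items():
    y_opt, eps_min = optimum(x)
    print(f"{cop}: optimal z''-β'' = {y_opt}, eps_min = {eps_min:.4f}")
# Prints 6 for LMTI and 19 for LMTΘ, matching the values stated after Eq. (9).
```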
5. Conclusions
From a methodological point of view, Δpmm can be taken as a measure of the adequacy of the measurement accuracy in a physical experiment.
The physical meaning of Δpmm lies in the fact that, in the schematization of any event or process, there is a mismatch between the physical-mathematical model and the MO, called the threshold discrepancy [38].
The value of Δpmm, due to the threshold discrepancy, should always be no greater than the permissible measurement uncertainty. Otherwise, it is necessary to redefine the model before carrying out the experiment. Within the above approach, Δpmm represents a sort of “model noise” (similar to thermal noise).
Along with the functions of Δpmm already mentioned (a validity criterion for the proposed physical-mathematical model, a measure for evaluating the sufficient accuracy of calculations), attention should be drawn to the following fact. The uncertainty Δpmm can also be used in carrying out numerical experiments based on the theory of experiment planning on computers. The feasibility of this approach is dictated by the need to calculate the reproducibility dispersion and, accordingly, the Fisher criterion. In turn, the Fisher criterion determines when the screening of influencing factors that are important for the study may cease.
In line with the aim of this study, an approach was formulated for choosing the dimension of a physical-mathematical model describing the researched natural phenomenon or process, such that the model corresponds to the measurement accuracy achievable in field experiments. For the chosen physical-mathematical model, we proposed a formula for calculating the minimum absolute uncertainty in determining the desired unknown variable, a criterion against which to compare the actual experimental uncertainty.
It would be useful for the practice and theory of measurement to consider the uncertainty of the physical-mathematical model arising from its finiteness (a limited number of chosen variables) as a measure of the adequate precision of physical measurements in the experiment.
The concept of relative uncertainty has been used when considering the accuracy of achieved results (the absolute value and absolute uncertainty of separate variables and criteria) during the measurement process in different applications. However, this method of identifying the measurement accuracy does not indicate the direction of deviation from the true value of the main variable. In addition, it involves an element of subjective judgment. That is why, for the purposes of this approach, alongside the relative uncertainty, this study recommends the comparative uncertainty for analyzing published results.
The introduced analysis is intended to help physicists and designers determine the simplest and most reliable way to select a model with the optimal number of recorded variables, calculated according to the minimum achievable value of the model’s uncertainty.
The information approach and the presented results can be used to predict the discrepancy of models of physical phenomena and technological processes in practical problems of macro- and microphysics.
One important remark should be made about the physical meaning of the proposed information approach. Any physical process, from quantum mechanics to a heartbeat, can be viewed by the observer only through an idiosyncratic “lens”. Its material is an alloy not only of mathematical equations but also, without fail and regardless of the researcher’s desire, of his intuition, experience, and knowledge. These, in turn, are framed by a system of primary variables, which is itself chosen by the universal consensus of human individuals. Thus, an aberration in modeling (a distortion of reality) is inherent before the formulation of any physical, and even more so mathematical, statement. The degree of distortion of the image of the true real object depends precisely on the chosen class of phenomena and the number of variables considered.