Statistics Using Neural Networks in the Context of Sustainable Development Goal 9.5

Okulich-Kazarin, Valery

doi:10.3390/su16198395


by Valery Okulich-Kazarin ¹,²

¹ Faculty of Social and Computer Sciences, Higher School of Business—National Louis University, 33-300 Nowy Sącz, Poland

² Faculty of Social Sciences and Humanities, Humanitas University, 41-200 Sosnowiec, Poland

Sustainability 2024, 16(19), 8395; https://doi.org/10.3390/su16198395

Submission received: 23 July 2024 / Revised: 3 September 2024 / Accepted: 24 September 2024 / Published: 26 September 2024

(This article belongs to the Special Issue AI for Sustainable Real-World Applications)


Abstract:

In recent years, neural networks have been used to achieve all 17 SDGs. This paper is directly related to SDG 9. In particular, the application of neural networks in statistics indicates the creation and development of a scientific research infrastructure (including encouraging innovation, SDG 9.5). Also, this paper shows the possibility of the mass practical application of neural networks for statistics in the context of sustainable development (with the possibility of increasing the number of researchers, SDG 9.5). The paper aims to test the following two hypotheses in the context of SDG 9.5: (1) The rapid growth of scientific interest in neural networks will lead to a decrease in the number of scientific publications in statistics. (2) It is possible to use neural networks for calculating statistical indicators. Bibliometric analysis, mathematical modeling, the calculation of statistical indicators using the new prompt and Excel tables, and z-statistics were used. The scientific novelty lies in the new knowledge obtained by the author for the first time. This study integrates advanced technologies (neural networks) and a traditional field (statistics), which is a significant contribution to innovation and infrastructure development (Indicator 9.5.1). The practical value lies in the ease of the mass use of neural networks for statistical data processing of more than 100,000 units, which is related to Indicator 9.5.2. Thus, this paper represents an important contribution to the stimulation of innovation, thereby building up technological potential and leading to a significant increase in the number of researchers (SDG 9.5).

Keywords:

sustainable development; SDG 9.5; Indicator 9.5.1; Indicator 9.5.2; neural networks; statistics; statistical indicator

1. Introduction

Neural networks’ capabilities in solving the social problems identified by the United Nations (UN) Sustainable Development Goals (SDGs) are constantly expanding [1]. Back in 2018, the authors of article [2] suggested that the existing capabilities of neural networks could contribute to solving problems in all 17 UN SDGs in both developed and developing countries. In 2024 [1], neural networks are already being used to achieve all 17 UN SDGs, ranging from poverty eradication to the creation of sustainable cities and communities.

Real-world examples of the use of neural networks range from cancer diagnosis to helping blind people navigate and from predicting the spatial distribution of environmental risk indicators to assistance in disaster response [2,3]. The development and application of neural networks benefit interdisciplinary fields, such as “renewable energy development, environmental protection and economic analysis” [4,5,6,7,8]. Therefore, the topic of this scientific paper is primarily related to SDG 9, which involves industry, innovation, and infrastructure.

On the one hand, the development of a new approach and the application of neural networks to calculate statistical indicators indicate the creation and development of a scientific research infrastructure, which corresponds to SDG 9.5 ([9], p. 13). On the other hand, the author describes the use of neural networks to calculate statistical indicators, which is an innovative approach in the field of statistics. This corresponds to the objectives of SDG 9.4, SDG 9.a, and SDG 9.b, which are aimed at the development of new technologies and methods [6,9].

In modern sustainable society, the use of neural networks is becoming more widespread and significant in various fields of business [10,11,12,13], politics [14,15,16,17,18], and the social sphere [19,20,21,22]. One of the areas of application of neural networks is the solution of statistical tasks [23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42], including the calculation of various statistical indicators.

In this paper, the author opens up the possibility of the mass use of a new approach for calculating statistical indicators. It also promotes the dissemination of innovations and their implementation in practice (SDG 9) [7,9].

Finally, the author demonstrates the integration of advanced technologies (neural networks) into a traditional field (statistics), which represents a significant contribution to innovation and infrastructure development (Indicator 9.5.1) [9]. At the same time, the author describes a new application of neural networks in the field of statistics, which is the basis for new technological breakthroughs. In addition, the proposed approach to the application of neural networks simplifies the calculation of statistical indicators and may lead to an increase in the number of researchers (Indicator 9.5.2) [9]. Thus, the author demonstrates how advanced scientific research is related to the sustainable development of industry, innovation, and infrastructure (SDG 9) [7].

The final goal of this study was to test the following pair of hypotheses about neural networks and statistics:

(1) The rapid growth of scientific interest in neural networks will lead to a decrease in the number of scientific publications in the field of statistics;
(2) It is possible to use neural networks for calculating statistical indicators.

The first hypothesis can be tested using a bibliometric analysis.

The second hypothesis can be tested by creating a prompt for calculating statistical indicators. The calculation results using the new prompt should be compared with the results obtained in the traditional way (for example, an Excel table).

The scientific novelty of the research lies in the following:

−: the insufficient study of statistics using neural networks in the context of SDG 9.5;
−: new knowledge obtained by the author for the first time, which was not previously known;
−: an innovative approach to the use of neural networks for calculating statistical indicators;
−: two tested hypotheses about statistics using neural networks in the context of SDG 9.5 were substantiated;
−: a new prompt for neural networks adapted for calculating the statistical indicators (M) and (s);
−: proposed measures aimed at the development of statistical science according to global trends and the mass use of neural networks for calculating statistical indicators.

The practical significance of the research lies in a significant simplification of calculating the statistical indicators such as the average value (M) and the standard deviation (s). The prompt promises to significantly simplify the calculation process and improve the quality of the results obtained. A significant advantage of the new prompt is the absence of the need to enter a large amount of data. This prompt is proof of the second hypothesis.

To ensure the high quality of the data collected and used in this study, the following measures were taken:

First, the literature sources of data were carefully selected from reputable and verified databases, such as Scopus and Web of Science.

Second, the elimination of bias was achieved through transparency, careful planning of the thought experiment, and careful evaluation of the empirical results and conclusions.

Third, a standard method (Excel tables) was used to compare the accuracy of the data obtained using the new prompt.

Finally, the author performed statistical hypothesis verification with a high level of verification, namely, 0.99. The alternative hypothesis was accepted. It is a very strong scientific proof with an accurate, predictable probability.

Together, these measures minimized the possibility of errors and distortions.

The article is organized as follows:

−: the “Literature review” section covers the background, a bibliometric analysis for the keywords “Neural networks”, a theoretical framework of neural networks, and other necessary issues;
−: the “Methodology” section prepares a proof of the hypotheses;
−: the “Results” section presents the construction of two trend lines for the proof of the first hypothesis, the proof of the second hypothesis through the creation of the new prompt, five examples of using the new prompt to calculate statistical indicators (M) and (s), and verification of the statistical hypotheses;
−: in the “Discussion” section, the results are discussed and compared with existing methods;
−: finally, the “Conclusion” section summarizes the results of this study and formulates directions for future research.

2. Literature Review

To ensure the reliability and scientific value of the sources (documents), the author used the following methods:

(A): The author tried to use sources published during the last 5 years. This ensures that the study relies on recent rather than outdated scientific data;
(B): The author tried to use sources published in journals that are indexed in the Scopus and Web of Science databases. Such sources make up more than 60% of the total. About half of them have an impact factor. This guarantees reliance on reliable and valuable scientific data.

2.1. Background

The solution of statistical tasks is used in solving sustainable development problems (Switzerland, Poland, Slovakia, Ukraine, and Germany) [23,24,25], nature conservation (Malaysia) [26], higher education (Eastern Europe) [27], the banking sector (Indonesia and China) [28,29], psychology (Canada, Germany, and the USA) [30,31,32], the field of health care and sports (Great Britain, Spain, Kazakhstan, Indonesia, and Malaysia) [33,34,35,36,37], engineering sciences (China) [38], public policy (Great Britain, Poland, and Ukraine) [39,40,41], “clean” energy (Morocco) [42], and weather forecasting (India, Tanzania) [43].

Thus, statistics are used to solve a wide range of problems related to social and sustainable development. However, a review of sources [23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43] showed a rare use of statistics to solve problems related to SDG 9. This is the third reason for the author’s interest in the research topic. This paper discusses the use of neural networks in statistics, specifically to calculate statistical indicators (M and s) [44,45,46].

Thus, statistical indicators play a key role in data analysis and decision-making in various fields, including sustainable development, economics, psychology, public policy, and others [23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43]. The accuracy and efficiency of calculating these indicators have a direct impact on the quality of data analyses and the conclusions that are drawn based on them. Traditional methods of calculating statistical indicators can be time-consuming and error-prone, especially when working with a large amount of data [26,47,48,49,50].

The research [51] has resulted in a new prompt for neural networks, which allows for calculating the mentioned statistical indicators (M and s) in the simplest cases.

2.2. Bibliometric Analysis for the Keywords “Neural Networks”

The bibliometric analysis was performed on 2 February 2024. First, 980,781 document search results for the keywords “Neural networks” were visualized (Figure 1). The analysis was performed for sources indexed in the Scopus database from 1961 to 2023 (https://www.scopus.com/term/analyzer.uri, accessed on 2 February 2024).

Figure 1 shows a steady increase in the number of published documents on the topic of “Neural networks” from 1987 to 2015. After that, a sharp increase in the number of publications begins. Overall, the growth amounted to 109,999 publications from 2015 to 2023. This rapid growth since 2015 may lead to a decline in research activity in other areas of knowledge.

2.3. Theoretical Framework of Neural Networks

One definition of a “Neural network” is “a machine learning program, or model, that makes decisions in a manner similar to the human brain, by using processes that mimic the way biological neurons work together to identify phenomena, weigh options and arrive at conclusions” [52].

The next definition is as follows: “A neural network is a massively parallel distributed processor made up of simple processing units that has a natural propensity for storing experiential knowledge and making it available for use” [53].

Neural networks are used in many scientific disciplines and practical activities, including predictive modeling, facial recognition, handwriting recognition, text translation, email spam filtering, financial and medical diagnostics, management, solving differential equations, technical system diagnostics, the fashion and design industry, the field of sports, etc. [54,55,56,57,58,59,60].

Therefore, the author hypothesized that the rapid growth of scientific interest in neural networks could lead to a decrease in the number of publications in the field of statistics.

2.4. Some Words on Statistical Indicators

Statistical indicators, such as the average value (M) and the standard deviation (s), play an important role in statistical analysis. These parameters provide important information about the distribution of data and characterize their central tendency and spread. The first statistical indicator is the average value (M). It represents the arithmetic mean of all values in the sample [61]. The average value makes it possible to estimate the typical value in the sample and identify common patterns in the data. The second statistical indicator is the standard deviation (s). It reflects the spread of values around the average value and represents a measure of data variability [62]. The standard deviation makes it possible to estimate the degree of data dispersion relative to the average value and identify the presence of anomalies or heterogeneities in the data.
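For data grouped into k distinct values x_i with counts N_i (as in the Likert-type examples used later in the paper), the two indicators can be written compactly. The form below is only a sketch: it assumes the divisor N, which is what step (6) of the prompt in Section 4.2 uses, whereas many textbooks divide by N − 1 when estimating the standard deviation from a sample. The index k and the grouped notation are introduced here for illustration.

```latex
N = \sum_{i=1}^{k} N_i, \qquad
M = \frac{1}{N}\sum_{i=1}^{k} N_i\, x_i, \qquad
s = \sqrt{\frac{1}{N}\sum_{i=1}^{k} N_i\,(x_i - M)^2}
```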

In scientific data analysis, there are several methods for calculating statistical indicators, such as the average value (M) and the standard deviation (s), using various approaches and tools.

The first method is a manual calculation using formulas [63].

The second method is the use of electronic calculators [64]. Scientific calculators provide a convenient and reliable way to perform statistical calculations for datasets. Their use is especially suitable when working with small amounts of data and for quick analysis of experimental results. Such examples may be students’ graduation papers.

The third method is using software such as Microsoft Excel 97-2003 tables [65]. The user enters data into an Excel table and uses the appropriate functions to calculate statistical indicators.

Finally, there are specialized programs for data analysis, such as MATLAB, Python, R, and others [66,67,68,69]. This is the fourth way to calculate statistical indicators. These programs provide a wide range of tools and functions for data analysis and the calculation of statistical indicators.

All four methods have their advantages and limitations, and the choice of a particular method depends on the specifics of the study, the available tools, and the preferences of the researcher. When using specialized programs, it is necessary to take into account the time needed to learn them and get used to their interface, as well as possible limitations and features of the algorithms used inside these programs.

There are also sources describing the experience of using neural networks for statistical calculations.

For example, there are forecasts of monthly and seasonal weather anomalies in the western Indian Ocean using three neural networks [70]. The average correlation coefficients are about 0.88 and 0.98, respectively. More recently, the effectiveness of Bayesian neural network models for virtual monitoring at an operating offshore wind farm has been proven [71].

The authors of papers [72,73] showed satisfactory results when using neural networks in statistics for diagnosis and the mass classification of medical data.

Panda and Warrior (2022) have trained a neural network to develop accurate Reynolds stress models to predict flow [74].

Thus, there are examples of the use of neural networks in statistics [70,71,72,73,74]. The accumulation of such experience can lead to the creation of a fifth method for calculating statistical indicators using neural networks.

Therefore, the author hypothesized that the calculation of statistical indicators using neural networks should become a simple and convenient tool for mass use by people without professional statistical education.

2.5. Short Summary

Concluding the literature review, the author puts forward the following considerations:

1. Statistical indicators are of key importance for data analysis and subsequent management decision-making in various areas of sustainable development and others.

The methods of calculating statistical indicators can range from manual calculations to the use of specialized programs.

2. Neural networks are used in various scientific disciplines and practical activities, including predictive modeling, face recognition, text processing, spam filtering, diagnostics in finance and medicine, management, solving differential equations, diagnostics of technical systems, the fashion and design industry, sports, etc.

The author of this paper suggested that the accumulation of experience in using neural networks in statistics may lead to the creation of a new method for calculating statistical indicators using them (SDG 9.5).

3. A bibliometric analysis of the keywords “Neural networks” shows a steady increase in publications from 1987 to 2015 and a sharp increase in the number of publications beginning in 2015.

The author of this paper suggested that this could lead to a decrease in the activity of research in other fields of knowledge, particularly in statistics (SDG 9.5).

4. The author of this paper sets out to prove the fundamental possibility of using neural networks in statistics to calculate the following statistical indicators: the average value (M) and the standard deviation for the sample (s). The positive results will be important for data analysis and subsequent management decision-making in the context of SDG 9.5.

3. Materials and Methods

3.1. General Information

This research was performed from February 2024 to July 2024.

The first step was to perform a literature review and bibliometric analysis using the keywords “Neural Networks” and “Statistics”. This has resulted in a test of the first hypothesis.

Testing the second hypothesis turned out to be more difficult. Thus, the next step was the creation of a neural network prompt in order to calculate statistical indicators (M) and (s). The author used ChatGPT 3.5 to create the prompt. However, the scientific novelty and practical significance of this study are relevant to all neural networks in general.

Then an experiment was performed comparing the results obtained using the new prompt and Excel tables. In the next step, the author verified the statistical hypotheses. After that, a guide for the use of the new prompt was compiled. Finally, after the discussion, a “Conclusions” section was prepared.

The following standard methods were used to achieve the final goal of this study:

−: a bibliometric analysis and analysis of scientific sources;
−: mathematical modeling for the choice of research boundaries;
−: the calculation of statistical indicators using the new prompt and Excel tables;
−: z-statistics (verification of statistical hypotheses).

3.2. Methodological Framework of the Choice of Research Boundaries

A sample size plays an important role in statistics [75,76]. When calculating the sample size, the volume of the general population (N) can be ignored if it is more than 100,000 [75]. Such situations are often found in the context of sustainable development [23,24,25,36,37,42,43].

A sampling error is selected by the researcher depending on the objectives of the study. It is believed that in order to make managerial decisions, the sampling error should be no more than 4% [75]. This value corresponds to the sample size of 600 respondents with a general population of above 100,000. For important strategic decisions, it is advisable to minimize the sampling error.
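As a rough check of the 4% figure, the sampling error for a sample of 600 can be worked out by hand. The relation below is the usual large-population formula and is an assumption here, since the source only cites [75]; the values p = 0.5 (the most conservative proportion) and z = 1.96 (the 0.95 level) are likewise assumed.

```latex
e = z\sqrt{\frac{p\,(1-p)}{n}}
  = 1.96\sqrt{\frac{0.5 \times 0.5}{600}}
  \approx 1.96 \times 0.0204
  \approx 0.040 \quad (4\%)
```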

Therefore, this paper will consider examples of calculating statistical indicators with a general population of above 100,000 units [75]. The following sampling error values (%) are considered: 1.0, 1.5, 2.0, 3.0, and 4.0 [74,75]. The first and second limitations of this study are as follows: the volume of the general population is more than 100,000, and only sampling error values from 1.0% to 4.0% are accepted.

In this study, the examples relate to a situation in which exactly one of several features corresponds to each unit of the population. In our examples, as a rule, each question provided 5 possible answers. This is done by analogy with the Likert scale [77]. When answering the question, respondents indicate their level of agreement or disagreement on a symmetrical “agree-disagree” scale [77]. The format of a typical five-level Likert scale, for example, can be as follows [77]:

−: I completely disagree;
−: Disagree;
−: Neither agree nor disagree;
−: I agree;
−: I completely agree.

This is the third limitation of this paper.

3.3. Boundaries of the Research

First, sample sizes were modeled for different sampling error values. As shown in Section 3.2, discrete sampling error values from 1.0% to 4.0% were selected. The choice is explained by the fact that for business and sustainable development decision-making, the sampling error should not exceed 4% [75].

Next, from the resulting set of values, five sample sizes will be selected for examples when calculating statistical indicators (M) and (s).

Examples will be considered for the volume of the general population (N) that is more than 100,000 [75].

Sample size modeling was performed using the electronic calculator [78] for the following two values of the standard and high testing levels [44,45,46]: 0.95 and 0.99 (Table 1).

Table 1 shows a set of 10 sample size values. These 10 sample sizes can be considered the number of standard sample sizes when the volume of the population (N) is more than 100,000.

Table 1 shows that the sample size has a minimum value of 600 units and a maximum value of 16,589 units. The values 1067 and 1037 are close to each other. The sample sizes 4268 and 4147 can also be considered close to each other.
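The ten sample sizes can be approximated with a few lines of code. The sketch below is not the calculator cited as [78]; it assumes the common large-population formula n = z²·p·(1 − p)/e² with p = 0.5 and the conventional coefficients z = 1.96 (level 0.95) and z = 2.576 (level 0.99). Under these assumptions it returns the values quoted in the text (600, 1037, 1067, 4147, 4268, 7373, and 16,589). The function names are illustrative.

```python
# Assumed large-population formula: n = z^2 * p * (1 - p) / e^2, with p = 0.5.
Z_VALUES = {0.95: 1.96, 0.99: 2.576}      # conventional two-sided z coefficients
ERRORS = [0.01, 0.015, 0.02, 0.03, 0.04]  # sampling errors from 1.0% to 4.0%


def sample_size(z: float, error: float, p: float = 0.5) -> int:
    """Return the rounded sample size for a large (N > 100,000) population."""
    return round(z ** 2 * p * (1 - p) / error ** 2)


def sample_size_table() -> dict:
    """Map (testing level, sampling error) to the required sample size."""
    return {
        (level, e): sample_size(z, e)
        for level, z in Z_VALUES.items()
        for e in ERRORS
    }


if __name__ == "__main__":
    for (level, e), n in sample_size_table().items():
        print(f"level={level:.2f}  error={e:.1%}  n={n}")
```

Keeping p as a parameter makes it easy to see how a less conservative proportion would shrink the required sample.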

For examples of using the new prompt, the following five sample sizes are taken: 600, 1067, 4147, 7373, and 16,589. The numbers 600 and 16,589 are the boundaries of the research. The numbers 1067 and 4147 are intermediate values that each have a close counterpart in Table 1 (1037 and 4268, respectively). The number 7373 is an intermediate value.

Thus, out of 10 standard sample sizes, five standard values are selected for consideration as an example (Table 2).

Table 2 shows 5 sample sizes, which will be used as examples for calculating statistical indicators.

Calculations of statistical indicators for a sampling error of 4.0% will be performed based on real cases [24].

3.4. Data for Creating the Prompt

When creating the new prompt, the following situation was adopted: a five-level Likert scale is used [77].

The author of this paper digitized the answers as follows:

−: I completely disagree—“0”;
−: Disagree—“1”;
−: Neither agree nor disagree—“2”;
−: I agree—“3”;
−: I completely agree—“4”.

That is, the neutral answer is the number “2”. The degree of negative assessment is shown by the numbers “1” and “0”. The degree of positive assessment is measured by grades “3” and “4”.
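The digitization above can be sketched in code as a mapping from answer text to score, followed by a count per score. The counts are what the prompt in Section 4.2 takes as input. The function name and the sample answers are made up for illustration and are not part of the study.

```python
from collections import Counter

# Coding taken from the list above; names below are illustrative only.
SCORES = {
    "I completely disagree": 0,
    "Disagree": 1,
    "Neither agree nor disagree": 2,
    "I agree": 3,
    "I completely agree": 4,
}


def count_scores(answers):
    """Return {score: count} for scores 4..0, the order used in the prompt."""
    counts = Counter(SCORES[a] for a in answers)
    return {score: counts.get(score, 0) for score in (4, 3, 2, 1, 0)}


if __name__ == "__main__":
    # Made-up answers, for demonstration only.
    demo = ["I agree", "Disagree", "I agree", "Neither agree nor disagree"]
    print(count_scores(demo))
```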

It is true that neural networks easily performed calculations of statistical indicators (M) and (s) for a sample size of less than 10 units [51]. For the general population above 100,000 units, the sample size ranges from 600 to 16,589 units (Table 1). With sample sizes of more than 600 units, neural networks periodically returned errors in the results [51]. For the numbers “−2”, “−1”, “0”, “1”, and “2”, neural networks also failed to obtain stable calculation results [51].

To achieve this goal, the author had to perform multiple refinements by referring to neural networks until an accurate result was obtained.

3.5. Method of Verification of Statistical Hypotheses

Here, the author has checked the accuracy of the calculation of statistical indicators using the neural network prompt. To achieve this, the author compared the result obtained using the new method with the result obtained using the traditional method (Excel table).

The author used the method of comparing the averages of two independent samples to test the statistical hypotheses [44,45,46]. The essence of this method is to calculate the z-statistics. Hypothesis testing is based on an assessment of the difference M₁–M₂.

The second hypothesis of this study was transformed into the following pair of statistical hypotheses: research and alternative.

The research hypothesis is H₀: M₁–M₂ ≠ 0.00 if random deviations are not taken into account.

The alternative hypothesis is H₁: M₁–M₂ = 0.00 if random deviations are not taken into account.

If M₁–M₂ ≠ 0.00, the two samples are not equal to each other. This means that neural networks do not allow you to calculate statistical indicators with high accuracy.

In the other case (M₁–M₂ = 0.00), there is no statistically significant difference between the two averages. It means that the two samples are equal to each other. The results of calculating the statistical indicator (M) using the two methods are equal—their difference is explained by random deviations. This means that neural networks allow you to calculate statistical indicators with high accuracy.

In this case, the difference can be either greater than 0.00 or less than 0.00. Thus, a two-tailed test was applied in our case.
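The comparison described above can be sketched as follows. The paper does not print its formula, so the code assumes the standard z-statistic for two independent samples, z = (M₁ − M₂)/√(s₁²/n₁ + s₂²/n₂), and a two-tailed critical value taken from the standard normal distribution. The function names and the numbers in the demonstration are hypothetical and are not taken from Table 3 or Table 4.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(m1, s1, n1, m2, s2, n2):
    """z for the difference of two independent sample means (assumed formula)."""
    return (m1 - m2) / sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


def critical_z(level):
    """Two-tailed critical value for a testing level such as 0.95 or 0.99."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)


def means_agree(m1, s1, n1, m2, s2, n2, level=0.99):
    """True when |z| is below the critical value, i.e. M1 - M2 = 0 is accepted."""
    return abs(z_statistic(m1, s1, n1, m2, s2, n2)) < critical_z(level)


if __name__ == "__main__":
    # Hypothetical values, NOT taken from the paper's tables.
    print(round(critical_z(0.99), 3))
    print(means_agree(2.0010, 1.10, 4147, 2.0000, 1.10, 4147))
```

With samples of several thousand units, a difference in the third or fourth decimal place of M gives a |z| far below the critical value, which is the situation the paper describes.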

4. Results

4.1. Testing the First Hypothesis

In the second step, 1,416,580 document search results for the keyword “Statistics” are visualized (Figure 2). The analysis was performed for sources indexed in the Scopus database from 1961 to 2023 (https://www.scopus.com/term/analyzer.uri, accessed on 2 February 2024).

Figure 2 shows a steady increase in the number of published documents on the topic of “Statistics” until 2015. After that, fluctuations in the number of publications begin with a general downward trend. Overall, the drop was 24,353 publications from 2015 to 2023.

It was very interesting to compare the number of publications on the keywords “Statistics” and “Neural networks” in the Scopus database. The comparison was made from 2015 to 2023 (Figure 3).

Figure 3 shows an interesting picture. There is a sharp decrease in the number of published documents on the topic of “Statistics” from 2018 to 2020. Since 2020, the number of documents has increased slightly. In general, the trend line tends to decrease.

At the same time, Figure 3 shows a constant increase in the number of published documents on the topic of “Neural networks”. The number of documents on the topic of “Neural networks” exceeds the number of documents on the topic of “Statistics” published since 2019.

Thus, Figure 3 shows an increase in the number of published documents on the topic of “Neural networks” and a decrease in the number of published documents on the topic of “Statistics”. Figure 2 and Figure 3 show that the number of publications on the topic of “Statistics” in the Scopus database has decreased since 2015.

Accordingly, the first hypothesis has been verified.

4.2. New Prompt for Calculating Statistical Indicators

The process of teaching neural networks led to the creation of a successfully working prompt. The following is the prompt for the first sample size value from Table 2 (N = 599 [24]):

“Make Task 1, please. You are a mathematician with a specialization in statistics ::

Be careful with statistical calculations, provide accurate solutions, originality is not needed here ::

The sample of numbers consists of N = 599 digits, of which N1 = 22 digits “4”, N2 = 43 digits “3”, N3 = 124 digits “2”, N4 = 234 digits “1”, and N5 = 176 digits “0” ::

(1): Calculate the sum of A = Ni × xi. Print the value of A :: Calculate the value of the sample mean M(x) = A/N. Print the value of M(x) ::
(2): Calculate Yi as the difference between each element of the sample and the average value of the sample, Yi = xi − M(x). Print the values of Yi in the column ::
(3): Calculate Bi as the squares of the differences between each element of the sample and the average value of the sample, Bi = (xi − M(x))². Print the Bi values in the column ::
(4): Calculate Zi as the product of Ni and Bi, Zi = Ni × Bi. Print the values of Zi in the column ::
(5): Calculate Z as the sum of Z = ∑ Zi. Print the value of Z ::
(6): Calculate C as a quotient of C = Z/N. Print the value of C ::
(7): Calculate the value of the standard deviation for the sample sx as the square root of C, that is, Sx = √ C. Print the value of Sx ::
(8): Print the result in the following form: M(x) = the result of calculations up to the fourth decimal place, Sx = the result of calculations up to the fourth decimal place. Print the letters M(x) and Sx in bold, please.”

The new prompt contains the following input designations [44,45,46]:

−: N—sample size;
−: Ni is the number of the attribute (one of the answers, for example, “I completely disagree”, etc.).

In the described prompt, N1 denotes the attribute “4”, N2 denotes the attribute “3”, etc.

This prompt contains the following numerical input values:

−: X is the sample size, and X1, X2, …, Xi are the numbers of respondents’ responses for each attribute, so that X = X1 + X2 + … + Xi.

The numerical values of Xi are in the prompt after the input designation “Ni = .” In Example 1, the following values are set instead of Xi: N1 = 22, N2 = 43, N3 = 124, N4 = 234, and N5 = 176 (Section 4.4).

This prompt contains the following output notation [44,45,46]:

−: M(x)—the average value;
−: Sx—the standard deviation for the sample.

The rest of the designations do not matter to the user.
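For a cross-check outside the neural network, steps (1) to (7) of the prompt can be followed literally in a short script. This is a sketch of the arithmetic the prompt asks for, not the author's control method (which used Excel tables). As in step (6), the sum of squares is divided by N rather than N − 1. The function name is illustrative; the counts are those of the first example in the prompt.

```python
from math import sqrt


def grouped_mean_sd(counts):
    """counts maps each value x_i to its frequency N_i; returns (A, M, Sx)."""
    n = sum(counts.values())                        # N, the sample size
    a = sum(ni * xi for xi, ni in counts.items())   # step 1: A = sum of Ni * xi
    m = a / n                                       # step 1: M(x) = A / N
    z = sum(ni * (xi - m) ** 2 for xi, ni in counts.items())  # steps 2-5
    c = z / n                                       # step 6: C = Z / N
    return a, m, sqrt(c)                            # step 7: Sx = sqrt(C)


if __name__ == "__main__":
    # Counts from the prompt: 22 x "4", 43 x "3", 124 x "2", 234 x "1", 176 x "0".
    example = {4: 22, 3: 43, 2: 124, 1: 234, 0: 176}
    a, m, sx = grouped_mean_sd(example)
    print(f"A = {a}, M(x) = {m:.4f}, Sx = {sx:.4f}")
```

Because only five (value, count) pairs are processed, the script handles a sample of 599 or 16,589 units with the same effort, which mirrors the advantage the paper claims for the prompt.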

4.3. Guide for the Use of the New Prompt for ChatGPT 3.5

To use the new prompt for neural networks, take the following simple steps:

1. Register on the neural network website;
2. Write down the following six separate numbers in the bold part of the prompt: X, X1, X2, X3, X4, and X5. They represent the sample size and the number of replies for each of the five features;
3. Insert the prompt into the neural network dialog box and click the “Send message” button;
4. Write down the obtained values for statistical indicators (M) and (s). These figures will be given to four decimal places.

The advantage of the new prompt is that there is no need to enter large amounts of data or perform any manipulations with them.

When using a scale other than a Likert scale [69], the number of attributes “i” can be increased or decreased. In the first case, the bold part of the prompt will contain an additional element, “N6 = .” Conversely, the number of elements may be reduced, for example, to “N3 = .”.

Thus, only the following two values are of interest at the output: (M) and (s).
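Filling in the numbers by hand can be avoided by generating the data sentence of the prompt from the counts. The helper below is a hypothetical convenience, not part of the published guide. It assumes at least two attributes and numbers them from the highest score downward, as the prompt in Section 4.2 does.

```python
def data_sentence(counts):
    """Build the data line of the prompt from {score: count}.

    Assumes at least two attributes; scores are listed from highest to lowest.
    """
    ordered = sorted(counts.items(), reverse=True)
    total = sum(n for _, n in ordered)
    parts = [
        f"N{i} = {n} digits “{score}”"
        for i, (score, n) in enumerate(ordered, start=1)
    ]
    listing = ", ".join(parts[:-1]) + ", and " + parts[-1]
    return (
        f"The sample of numbers consists of N = {total} digits, "
        f"of which {listing} ::"
    )


if __name__ == "__main__":
    print(data_sentence({4: 22, 3: 43, 2: 124, 1: 234, 0: 176}))
```

For the counts of the first example, the generated sentence matches the data line of the prompt quoted in Section 4.2; adding a sixth score or removing scores changes the number of “Ni” elements automatically.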

4.4. Five Examples of Calculating Statistical Indicators

The input values for the first example are borrowed from paper [24]. In the paper [24], the author and his colleagues interviewed students from Eastern European universities about their attitudes toward the use of artificial intelligence in education. The authors used a five-level Likert scale [77]. For abstract examples, the same scale was used, with only five invented values of the selected features.

Table 3 shows five examples. The top line of each example shows the values of statistical indicators calculated experimentally (new prompt). The middle line of each example shows the values of statistical indicators calculated using the control method (Excel table). The bottom line of each example shows the difference in the values of statistical indicators calculated using the experimental and control methods.

Table 3 shows that the values of (M) differ only in the third and fourth decimal places. In the first and fifth examples, the results coincide almost perfectly.

The maximum difference in the values of (M) and (s) is observed in the third example. Thus, the verification of statistical hypotheses about the equality of two sample averages will be performed for the third example (Table 3). If the difference is not statistically significant in the third example, then the statistical indicators calculated using the experimental method can be considered to agree well with those calculated using the control method.
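The selection of the worst case can be checked directly from the values in Table 3. The following is a minimal sketch: the dictionary layout and the names `table3`, `worst_m`, and `worst_s` are illustrative assumptions, while the numbers are copied from Table 3.

```python
# (M) and (s) from Table 3 as (new prompt, Excel table) pairs, keyed by example number
table3 = {
    1: {"M": (1.1670, 1.1669), "s": (1.0440, 1.0443)},
    2: {"M": (1.8298, 1.8308), "s": (1.0861, 1.0836)},
    3: {"M": (0.1685, 0.1709), "s": (0.5644, 0.5745)},
    4: {"M": (0.0948, 0.0956), "s": (0.4394, 0.4378)},
    5: {"M": (0.0421, 0.0423), "s": (0.2946, 0.2951)},
}

# Absolute difference between the experimental and control values, to four decimals
differences = {
    example: {name: round(abs(prompt - excel), 4) for name, (prompt, excel) in row.items()}
    for example, row in table3.items()
}

# Example with the largest difference for each indicator
worst_m = max(differences, key=lambda example: differences[example]["M"])
worst_s = max(differences, key=lambda example: differences[example]["s"])

for example, diff in differences.items():
    print(example, diff)
print("largest |difference| in M:", worst_m, "in s:", worst_s)
```

Both maxima fall on the third example, which is why that example is the one carried into the hypothesis test.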

4.5. Verification of Statistical Hypotheses

The author used the high testing level of 0.99 [44,45,46].

The z-statistics for the third example from Table 3 are shown in Table 4.

Table 4 demonstrates that the z-statistic |z_stat| is less than z_tabl (1.43 < 2.58). In this case, the alternative hypothesis is accepted, as follows: the difference μ₁–μ₂ = 0.00. This means that the two sample averages can be considered equal. Acceptance of the alternative hypothesis is stronger evidence than acceptance of the research hypothesis [44,45,46]. This result was obtained at the high testing level of 0.99 and a sampling error of 2.0%. These are very strict conditions for verification. Accordingly, the second hypothesis has been verified.
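The arithmetic of Table 4 can be retraced with a short Python sketch. The variable names are illustrative assumptions; the inputs are the Example 3 values from Table 3. The first statistic follows Table 4 as printed, which takes the square root of the absolute difference of the squared average errors. The textbook two-sample z-statistic uses the sum of the squared standard errors instead, so the sketch computes that variant as well for comparison.

```python
from math import sqrt

# Example 3 inputs (Table 3 and Table 4)
n = 4147
m_prompt, m_excel = 0.1685, 0.1709
s_prompt, s_excel = 0.5644, 0.5745
z_tabl = 2.58  # critical value used in Table 4 for the testing level of 0.99

# Squared average errors, (s / sqrt(n)) ** 2, rows 6 and 7 of Table 4
se_prompt_sq = s_prompt ** 2 / n
se_excel_sq = s_excel ** 2 / n

# Hypothesised difference mu1 - mu2 = 0, so the numerator is |M1 - M2|
diff = abs(m_prompt - m_excel)

# Denominator as printed in Table 4: difference of the squared errors
z_as_printed = diff / sqrt(abs(se_prompt_sq - se_excel_sq))

# Conventional two-sample z-statistic: sum of the squared errors
z_conventional = diff / sqrt(se_prompt_sq + se_excel_sq)

print(f"z (Table 4 formula)    = {z_as_printed:.2f}")
print(f"z (conventional form)  = {z_conventional:.2f}")
print("below z_tabl:", z_as_printed < z_tabl and z_conventional < z_tabl)
```

Without intermediate rounding, the Table 4 formula gives about 1.44 (Table 4 reports 1.43427 after rounding the intermediate rows), and the conventional form gives about 0.19. Both are below 2.58, so the conclusion of the test is the same under either denominator.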

5. Discussion

The results obtained demonstrate the integration of advanced technologies (neural networks) into a traditional field (statistics). This result represents a significant contribution to innovation and infrastructure development (Indicator 9.5.1) [9].

In fact, these two hypotheses suggest that neural networks can be used to calculate statistical indicators. Such hypotheses may seem strange. Indeed, why do we need new tools (neural networks) in statistics if there are already reliable and proven tools [64,65,66,67,68,69]? The author agrees with this objection.

At the same time, the author allows himself to point out that statistics began with manual calculations [63]. This method was definitely reliable, interesting, and important.

Then calculators came to people’s aid [64]. Perhaps someone objected to their use.

Then Excel tables and other automated methods appeared [64,65,66,67,68,69]. Probably, every time they were introduced, someone doubted and objected to them.

The bibliometric analysis showed that the number of scientific publications on statistics in the Scopus database has been decreasing since 2015 (Figure 2). The number of publications on neural networks exceeded the number of publications on statistics in 2018 (Figure 3). The author agrees with his colleagues that these two graphs (Figure 1 and Figure 2) do not appear to be directly related to each other. However, two trend lines (Figure 3) show an increase in the interest of scientists in neural network tools and a decrease in interest in statistics. Moreover, the scientific literature already provides examples of the use of neural networks in statistics [43,70,71,72,73,78,79].

The new prompt makes it possible to calculate statistical indicators, such as the average value (M) and the standard deviation (s). The experiment showed results similar to the results of the control calculation using Excel tables (Table 3). Verification of statistical hypotheses showed the equality of the average sample values (M₁) and (M₂) for a sample size of 4147 units at the high testing level of 0.99 and a sampling error of 2.0% (Table 4).

A comparison of the results obtained with the results of calculations in the Excel table (Table 3) led to the acceptance of the alternative hypothesis (Table 4). Together with the high testing level of 0.99, this is strong statistical evidence with a known probability [44,45,46]. Earlier, the author compared the accuracy of neural networks with the accuracy of calculations using the Windows calculator and Excel tables [51]. That study investigated a simpler statistical case. The verification of statistical hypotheses proved the high accuracy of calculations using the ChatGPT neural network [51].

The new prompt is a simple and accessible method for calculating statistical indicators. The prompt is intended for discrete data.

The new prompt promises to significantly simplify the calculation process and improve the quality of the results obtained. A significant advantage of the new prompt is the absence of the need to enter a large amount of data. In truth, this prompt is proof of the second hypothesis about the possibility of using neural networks in the field of statistical data analysis.

The practical application of the new prompt frees researchers from entering large amounts of data and conducting manual calculations and allows them to focus on conceptual information [50,51,80]. The author made this prompt himself using advice from sources [51,81,82,83,84]. Behind every word of the prompt, there are non-obvious techniques of prompt engineering, dozens of tests, the “prevention” of pitfalls, templates, boring solutions, and all the author’s experience in using AI for higher education. Despite the existing limitations (Section 2.4), the new prompt is the next step in the development of both statistics and civilization. Therefore, solving the problems of sustainable development becomes easier and faster with high accuracy.

It has been proven that the proposed approach to the application of neural networks simplifies the calculation of statistical indicators and therefore can lead to an increase in the number of researchers (Indicator 9.5.2) [9].

At the same time, the author of this paper demonstrated how the scientific research carried out is related to the sustainable development of industry, innovation, and infrastructure. The result obtained by the author is consistent with the prompt previously published in paper [51]. That prompt was applied in study [84], the results of which are relevant to SDG 4.

This paper develops a previously published result [51] in the context of sustainable development. The first difference between the new scientific result and the previous one [51] is the expansion of the scale for the analysis of primary data. Paper [51] describes a prompt for a two-level scale of data processing, as follows: agree–disagree, yes–no. In this paper, data processing is performed on a five-level Likert scale. The author also describes the possibility of using the new prompt for an even wider scale (six levels or more). The second difference is in the number of units of information processed. In the previous paper [51], the author proved the applicability of neural networks in calculating statistical indicators for a general population of less than 50,000 units. The new prompt works for a general population of more than 100,000 units. Therefore, the use of neural networks in the field of statistics has become the basis of new technological breakthroughs (SDG 9 [7]).

Returning to the beginning of the discussion, the author notes that the calculation of statistical indicators using neural networks can be used in the broad context of sustainable development [23,24,25,36,37,42,43,85,86,87].

The integration of neural networks into statistical calculations should not be abrupt and sudden. It should occur in several stages. Therefore, the author follows the path of step-by-step data evaluation.

In the first stage, the author created a prompt for a simpler statistical case [51]. This was the calculation of statistical indicators for a simple choice of “yes-no”, “on-off”, “black-white”, “agree-disagree”, and so on.

In the second stage, the author presented a new prompt to evaluate the data for the Likert scale.

In the third stage, the author will try to perform z-statistics and t-statistics using neural networks.

In the future, the author plans to invite a wide range of specialists to solve other statistical problems using neural networks.

In this paper, we are mostly talking about the tasks related to SDG 9. However, the results obtained can be applied to analytics in areas unrelated to sustainable development.

6. Conclusions

In general, the work opens up a new, revolutionary direction in the use of neural networks for social sustainability. This study covers a wide range of sustainable development goals. However, the main advantages of this work relate to SDG 9.5.

This was tested using the following two hypotheses:

(1) The rapid growth of scientific interest in neural networks will lead to a decrease in the number of scientific publications in the field of statistics;
(2) It is possible to use neural networks for calculating statistical indicators.

The tests showed the correctness of both hypotheses. The results represent an innovative approach to the use of neural networks in the field of statistics. Thus, this study has scientific novelty.

An additional element of scientific novelty is the creation of the new prompt as a simple and accessible method for calculating statistical indicators through the use of neural networks. This new prompt is specially adapted for calculating the following statistical indicators: the average value (M) and the standard deviation for the sample (s). A strong advantage of the new prompt is the ability to perform calculations without entering a large amount of data. That is, the result represents a significant contribution to innovation and infrastructure development (Indicator 9.5.1).

This research is one of the first steps toward applying neural networks in research methods in sustainable development, management, and business, and toward solving other practical problems. The author recommends using this prompt to calculate statistical indicators. Thus, the practical value of this paper lies in the possibility of mass use of the new prompt for calculating statistical indicators. Simplification of the calculation of statistical indicators should lead to an increase in the number of researchers (Indicator 9.5.2). The author recommends that researchers cite this paper when using the new prompt. Additionally, the author would like to give one managerial recommendation for scientific journals in the field of statistics, as follows: add “neural networks in statistics” to the scope of accepted papers. This managerial recommendation allows statistical science to follow global trends.

In general, this article represents an important contribution to the use of neural networks for statistics, not only for SDG 9.5 but also in the broad context of sustainable development.

The purpose of subsequent research is the theoretical justification and empirical development of new concepts for the application of neural networks in statistics. It is also desirable to expand the research, taking into account the three limitations of this paper. In addition, the author would like to explore the integration of neural networks into other statistical methods, such as “Bayesian networks” and “Markov models”, which were not taken into account here. From the point of view of prompt engineering, it is possible to work on creating an AI assistant for performing t-statistics and z-statistics.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Acknowledgments

This study was supported by the Eastern European Scientific Group. The author is very grateful to reviewers for their appropriate and constructive suggestions to improve this paper.

Conflicts of Interest

The author declares no conflicts of interest.

References

1. McKinsey. AI for Social Good: Improving Lives and Protecting the Planet. Report. Available online: https://www.mckinsey.com/capabilities/quantumblack/our-insights/ai-for-social-good#/ (accessed on 10 May 2024).
2. Safarov, R.; Shomanova, Z.; Nossenko, Y.; Mussayev, Z.; Shomanova, A. Digital Visualization of Environmental Risk Indicators in the Territory of the Urban Industrial Zone. Sustainability 2024, 16, 5190.
3. Chui, M.; Harrison, M.; Manyika, J.; Roberts, R.; Chung, R.; van Heteren, A.; Nel, P. Notes from the AI Frontier: Applying AI for Social Good. Discussion Paper. December 2018. Available online: https://www.mckinsey.com/~/media/mckinsey/featured%20insights/artificial%20intelligence/applying%20artificial%20intelligence%20for%20social%20good/mgi-applying-ai-for-social-good-discussion-paper-dec-2018.pdf (accessed on 10 July 2024).
4. Nanping, W.; Lee, T.-J.; Chen, L.-J.; Kung, C.-C. Special Collections for Applying Artificial Intelligence Techniques to Encourage Economic Growth and Maintain Sustainable Societies. Sci. Prog. 2024, 107, 368504231223625.
5. Yoon, J.; Han, S.; Lee, Y.; Hwang, H. Text Mining Analysis of ESG Management Reports in South Korea: Comparison with Sustainable Development Goals. Sage Open 2023, 13, 21582440231202896.
6. Roger, M.; Shulin, L.; Sesay, B. ICT Development, Innovation Diffusion and Sustainable Growth in Sub-Saharan Africa. Sage Open 2022, 12, 21582440221123894.
7. United Nations. 2017 High-Level Political Forum. Thematic Review of SDG-9: Build Resilient Infrastructure, Promote Inclusive and Sustainable Industrialization and Foster Innovation. 2017. Available online: https://sustainabledevelopment.un.org/content/documents/14363SDG9format-revOD.pdf (accessed on 10 July 2024).
8. Thacker, S.; Adshead, D.; Fay, M.; Hallegatte, S.; Harvey, M.; Meller, H.; O’Regan, N.; Rozenberg, J.; Watkins, G.; Hall, J.W. Infrastructure for Sustainable Development. Nat. Sustain. 2019, 2, 324–331.
9. United Nations. Resolution Adopted by the General Assembly on 6 July 2017. Work of the Statistical Commission Pertaining to the 2030, Agenda for Sustainable Development (A/RES/71/313). Available online: https://ggim.un.org/documents/a_res_71_313.pdf (accessed on 6 June 2024).
10. Kaplan, A.D.; Kessler, T.T.; Brill, J.C.; Hancock, P.A. Trust in artificial intelligence: Meta-analytic findings. Hum. Factors 2023, 65, 337–359.
11. Huang, M.-H.; Rust, R.T. Artificial intelligence in service. J. Serv. Res. 2018, 21, 155–172.
12. Woods, R.; Doherty, O.; Stephens, S. Technology driven change in the retail sector: Implications for higher education. Ind. High. Educ. 2022, 36, 128–137.
13. Lerma, D.F.P.; Kwarteng, M.A.; Pílik, M. Influence of Personal Cultural Orientations in Artificial Intelligence Adoption in Small and Medium-Sized Enterprises. In New Sustainable Horizons in Artificial Intelligence and Digital Solutions; I3E 2023. Lecture Notes in Computer Science; Janssen, M., Pinheiro, L., Matheus, R., Frankenberger, F., Dwivedi, Y.K., Pappas, I.O., Mäntymäki, M., Eds.; Springer: Cham, Switzerland, 2023; Volume 14316.
14. Al-Amoudi, I. The politics of post-human technologies: Human enhancements, artificial intelligence and virtual reality. Organization 2023, 30, 1238–1245.
15. Fell, E. Digital citizenship and artificial intelligence: Information and disinformation. Eur. J. Commun. 2022, 37, 563–568.
16. Johnson, B.A.M.; Coggburn, J.D.; Llorens, J.J. Artificial Intelligence and Public Human Resource Management: Questions for Research and Practice. Public Pers. Manag. 2022, 51, 538–562.
17. Schippers, B. Artificial Intelligence and Democratic Politics. Political Insight 2020, 11, 32–35.
18. Damnjanović, I. Polity Without Politics? Artificial Intelligence Versus Democracy: Lessons from Neal Asher’s Polity Universe. Bull. Sci. Technol. Soc. 2015, 35, 76–83.
19. Hockly, N. Artificial Intelligence in English Language Teaching: The Good, the Bad and the Ugly. RELC J. 2023, 54, 445–451.
20. Suh, W.S.; Ahn, S. Development and Validation of a Scale Measuring Student Attitudes Toward Artificial Intelligence. SAGE Open 2022, 12, 21582440221100463.
21. Hayward, K.J.; Maas, M.M. Artificial intelligence and crime: A primer for criminologists. Crime Media Cult. 2021, 17, 209–233.
22. Yang, X. Accelerated Move for AI Education in China. ECNU Rev. Educ. 2019, 2, 347–352.
23. MacFeely, S. Measuring the Sustainable Development Goal Indicators: An Unprecedented Statistical Challenge. J. Off. Stat. 2020, 36, 361–378.
24. Okulich-Kazarin, V.; Artyukhov, A.; Skowron, Ł.; Artyukhova, N.; Dluhopolskyi, O.; Cwynar, W. Sustainability of Higher Education: Study of Student Opinions about the Possibility of Replacing Teachers with AI Technologies. Sustainability 2024, 16, 55.
25. Walter, P.; Groß, M.; Schmid, T.; Weimer, K. Iterative Kernel Density Estimation Applied to Grouped Data: Estimating Poverty and Inequality Indicators from the German Microcensus. J. Off. Stat. 2022, 38, 599–635.
26. Bakar, S.H.; Lola, M.S.; Kamil, A.A.; Zainuddin, N.H.; Abdullah, M.T. Hybrid Correlation Coefficient of Spearman with MM-Estimator. Math. Stat. 2023, 11, 693–702.
27. Okulich-Kazarin, V. Are Students of East European Universities Subjects of Educational Services? Univers. J. Educ. Res. 2020, 8, 3148–3154.
28. Abdurakhman. Asset Allocation in Indonesian Stocks Using Portfolio Robust. Math. Stat. 2022, 10, 1313–1319.
29. Ye, R.; Lin, Y. Relationship between Interest Rate and Risk of P2P Lending in China Based on the Skew-Normal Panel Data Model. SAGE Open 2022, 13, 21582440231201378.
30. Li, M.; Gao, T.; Su, Y.; Zhang, Y.; Yang, G.; D’arcy, C.; Meng, X. The Timing Effect of Childhood Maltreatment in Depression: A Systematic Review and Meta-Analysis. Trauma Violence Abus. 2023, 24, 2560–2580.
31. Eckes, T.; Nestler, S. Do I Like Me Now? An Analysis of Everyday Sudden Gains and Sudden Losses in Self-Esteem and Nervousness. Clin. Psychol. Sci. 2024, 12, 22–36.
32. Branch, M. Malignant side effects of null-hypothesis significance testing. Theory Psychol. 2014, 24, 256–277.
33. Juniarsyah, A.D.; Apriantono, T.; Adnyana, I.K.; Kurniati, N.F.; Hasan, M.F. The Effect of Curcumin and Piperine Supplementation as a Recovery Method after Two Consecutive Futsal Matches. Int. J. Hum. Mov. Sports Sci. 2024, 12, 26–33.
34. Ahmad, M.F.; Zukhi, N.A.M.; Radzi, J.A.; Othman, N.; Azmi, A.M.N.; Mustafa, M.A.; Malek, N.F.A.; Vasanthi, R.K. Analysis of Goal Scoring Pathway for the Winners in UEFA Champions League Competition. Int. J. Hum. Mov. Sports Sci. 2023, 11, 1175–1181.
35. Townsend, P.; Simpson, D.; Tibbs, N. Inequalities in Health in the City of Bristol: A Preliminary Review of Statistical Evidence. Int. J. Health Serv. 1985, 15, 637–663.
36. Muñoz-Bonet, J.I.; López-Prats, J.L.; Flor-Macián, E.M.; Cantavella, T.; Domínguez, A.; Vidal, Y.; Brines, J. Medical complications in a telemedicine home care programme for paediatric ventilated patients. J. Telemed. Telecare 2020, 26, 462–473.
37. Baris, O.F.; Pelizzo, R. Research Note: Governance Indicators Explain Discrepancies in COVID-19 Data. World Aff. 2020, 183, 216–234.
38. Mo, S.; Wang, D.; Hu, X.; Bao, H.; Cen, G.; Huang, Y. Dynamic characteristics of helical gear with root crack and friction. Proc. Inst. Mech. Eng. Part C J. Mech. Eng. Sci. 2023, 237, 3163–3180.
39. Okulich-Kazarin, V.; Zhurba, M.; Pagava, O.; Kalko, I.; Serbin, I. Lecture method preferences, auditory or visual, of Ukrainian consumers of educational services: A statistical analysis. Int. J. Educ. Pract. 2019, 7, 54–65.
40. Coombes, M.; Raybould, S. Public Policy and Population Distribution: Developing Appropriate Indicators of Settlement Patterns. Environ. Plan. C Gov. Policy 2001, 19, 223–248.
41. Okulich-Kazarin, V.; Zhurba, M.; Bokhonkova, Y.; Losiyevska, O. Three Scientific Facts about Ukrainian and Polish Law-students: Verification of statistical hypotheses about their Preferences of Learning at Lectures. Eur. J. Contemp. Educ. 2019, 8, 562–573.
42. Ouahabi, M.H.; Benabdelouahab, F.; Khamlichi, A. Analyzing wind speed data and wind power density of Tetouan city in Morocco by adjustment to Weibull and Rayleigh distribution functions. Wind. Eng. 2017, 41, 174–184.
43. Mahongo, S.B.; Deo, M.C. Using Artificial Neural Networks to Forecast Monthly and Seasonal Sea Surface Temperature Anomalies in the Western Indian Ocean. Int. J. Ocean Clim. Syst. 2013, 4, 133–150.
44. Singpurwalla, D. A Handbook of Statistics: An Overview of Statistical Methods; bookboon: London, UK, 2015.
45. Krylov, V.E. General Theory of Statistics: A Textbook; Publishing House of the VolGU: Vladimir, Russia, 2020.
46. BUS_9641. Business Statistics 5, Textbook for the Program ‘Masters of Business Administration’; Kingston University: London, UK, 2010.
47. Riad, F.H. Statistical Inference of Modified Kies Exponential Distribution Using Censored Data. Math. Stat. 2022, 10, 659–669.
48. Helmus, L.M.; Babchishin, K.M. Primer on Risk Assessment and the Statistics Used to Evaluate Its Accuracy. Crim. Justice Behav. 2017, 44, 8–25.
49. Fawcett, J. Thoughts About Theories and Statistics. Nurs. Sci. Q. 2015, 28, 245–248.
50. Pirlott, A.G.; Hines, J.C. Eliminating ANOVA Hand Calculations Predicts Improved Mastery in an Undergraduate Statistics Course. Teach. Psychol. 2023.
51. Okulich-Kazarin, V. New Instruction (Prompt) to Calculate Statistical Indicators for Student Graduation Projects. WSEAS Trans. Comput. Res. 2024, 12, 307–317.
52. What Is a Neural Network? Available online: https://www.ibm.com/topics/neural-networks#:~:text=A%20neural%20network%20is%20a%C2%A0,layers%2C%20and%20an%20output%20layer (accessed on 15 July 2024).
53. Oleinik, A. What are neural networks not good at? On artificial creativity. Big Data Soc. 2019, 6, 2053951719839433.
54. Garpelli, L.N.; Alves, D.S.; Cavalca, K.L.; de Castro, H.F. Physics-guided neural networks applied in rotor unbalance problems. Struct. Health Monit. 2023, 22, 4117–4130.
55. Collier, Z.K.; Leite, W.L.; Karpyn, A. Neural Networks to Estimate Generalized Propensity Scores for Continuous Treatment Doses. Eval. Rev. 2021.
56. Shi, E.; Xu, C. A comparative investigation of neural networks in solving differential equations. J. Algorithms Comput. Technol. 2021, 15, 1748302621998605.
57. Zhang, C.; Yue, X.; Yu, C.; Wang, Z. Clothing Color Collocation with Deep Neural Networks. AATCC J. Res. 2021, 8, 173–180.
58. Rust, N.C.; Jannuzi, B.G.L. Identifying Objects and Remembering Images: Insights from Deep Neural Networks. Curr. Dir. Psychol. Sci. 2022, 31, 316–323.
59. Han, W.; Luo, W.; Wang, X.; Zhong, Y. Characterization of Woven Fabric Drape Based on Neural Networks. Text. Res. J. 2023, 93, 4971–4982.
60. He, Q.; Li, W.; Tang, W.; Xu, B. Recognition to Weightlifting Postures Using Convolutional Neural Networks with Evaluation Mechanism. Meas. Control 2023, 57, 653–663.
61. Plackett, R.L. Studies in the History of Probability and Statistics: VII. The Principle of the Arithmetic Mean. Biometrika 1958, 45, 130–135.
62. Bland, J.M.; Altman, D.G. Measurement Error. BMJ 1996, 312, 1654.
63. MathIsFun. Standard Deviation Formulas. Available online: https://www.mathsisfun.com/data/standard-deviation-formulas.html (accessed on 15 July 2024).
64. How To Use A Scientific Calculator For Statistics? Available online: https://djst.org/windows/how-to-use-a-scientific-calculator-for-statistics/ (accessed on 15 July 2024).
65. Statistical Functions (Reference). Available online: https://support.microsoft.com/en-us/office/statistical-functions-reference-624dac86-a375-4435-bc25-76d659719ffd/ (accessed on 15 July 2024).
66. MathWorks. Data Import and Analysis. Available online: https://www.mathworks.com/help/matlab/data-import-and-analysis.html (accessed on 16 July 2024).
67. Python 3.12.1 Documentation. Available online: https://docs.python.org/3/ (accessed on 15 July 2024).
68. Pandas 1.2.4 Documentation. Available online: https://pandas.pydata.org/docs/ (accessed on 17 July 2024).
69. The R Project for Statistical Computing. Available online: https://www.R-project.org/ (accessed on 15 July 2024).
70. Hlaing, N.; Morato, P.G.; Santos, F.d.N.; Weijtjens, W.; Devriendt, C.; Rigo, P. Farm-wide Virtual Load Monitoring for Offshore Wind Structures via Bayesian Neural Networks. Struct. Health Monit. 2023, 23, 1641–1663.
71. Goyal, H.; Sherazi, S.A.A.; Gupta, S.; Perisetti, A.; Achebe, I.; Ali, A.; Tharian, B.; Thosani, N.; Sharma, N.R. Application of Artificial Intelligence in Diagnosis of Pancreatic Malignancies by Endoscopic Ultrasound: A Systemic Review. Ther. Adv. Gastroenterol. 2022, 15, 17562848221093873.
72. Erguzel, T.T.; Noyan, C.O.; Eryilmaz, G.; Ünsalver, B.; Cebi, M.; Tas, C.; Dilbaz, N.; Tarhan, N. Binomial Logistic Regression and Artificial Neural Network Methods to Classify Opioid-Dependent Subjects and Control Group Using Quantitative EEG Power Measures. Clin. EEG Neurosci. 2019, 50, 303–310.
73. Panda, J.P.; Warrior, H.V. Modeling the Pressure Strain Correlation in Turbulent Flows Using Deep Neural Networks. Proc. Inst. Mech. Eng. Part C J. Mech. Eng. Sci. 2022, 236, 3447–3458.
74. Sample. Size Is Not the Main Thing. Or the Main Thing. Available online: https://scanmarket.ru/blog/vyborka-razmer-ne-glavnoe-ili-glavnoe (accessed on 9 July 2024).
75. Fleiss, J.L.; Levin, B.; Paik, M.C. Statistical Methods for Rates and Proportions, 3rd ed.; John Wiley & Sons: New York, NY, USA, 2003.
76. Likert, R. A Technique for the Measurement of Attitudes. Arch. Psychol. 1932, 55.
77. A Calculator to Calculate a Sufficient Sample Size. Available online: https://scanmarket.ru/blog/vyborka-razmer-ne-glavnoe-ili-glavnoe#calc1 (accessed on 9 July 2024).
78. Hodges, J.; Mohan, S. Machine Learning in Gifted Education: A Demonstration Using Neural Networks. Gift. Child Q. 2019, 63, 243–252.
79. Blanova, M. Use of artificial intelligence elements in predictive process management. In International Multidisciplinary Scientific GeoConference Surveying Geology and Mining Ecology Management, 22nd SGEM; International Multidisciplinary Scientific Geoconference: Albena, Bulgaria, 2022; pp. 97–104.
80. Jin, H.; Raghavan, K.; Papadimitriou, G.; Wang, C.; Mandal, A.; Kiran, M.; Deelman, E.; Balaprakash, P. Graph Neural Networks for Detecting Anomalies in Scientific Workflows. Int. J. High Perform. Comput. Appl. 2023, 37, 394–411.
81. Prompt Engineering. Available online: https://platform.openai.com/docs/guides/prompt-engineering (accessed on 8 June 2024).
82. Robinson, Reid. How to Write an Effective GPT-3 or GPT-4 Prompt. 2023. Available online: https://zapier.com/blog/gpt-prompt/ (accessed on 8 June 2024).
83. Khalilov, D. AI Employees: Creation, Training, Implementation. 2023 [In Russian]. Available online: https://clockwork-school.com/aistaff?utm_source=vebinar&utm_medium=260324&utm_campaign=workshop#about (accessed on 8 June 2024).
84. Gouws-Stewart, N. The Ultimate Guide to Prompt Engineering Your GPT-3.5-Turbo Model. 2023. Available online: https://masterofcode.com/blog/the-ultimate-guide-to-gpt-prompt-engineering (accessed on 8 June 2024).
85. Okulich-Kazarin, V.; Artyukhov, A.; Skowron, Ł.; Artyukhova, N.; Wołowiec, T. Will AI Become a Threat to Higher Education Sustainability? A Study of Students’ Views. Sustainability 2024, 16, 4596.
86. Oplatkova, Z.; Senkerik, R. Applications of Artificial Intelligence. Book chapter, Computer Science and Software Techniques 2011; pp. 29–41. Available online: https://publikace.k.utb.cz/handle/10563/1005994 (accessed on 31 July 2024).
87. Abreu Lopes, C. Big Data for SDG 3 and SDG 5: Promise and Inequality Traps. Digital Health Week 2019. Available online: https://i.unu.edu/media/iigh.unu.edu/news/6877/CALopes_DigitalHealthWeek.pdf (accessed on 30 June 2024).

Figure 1. Number of documents by topic “Neural networks” from 1961 to 2023.

Figure 2. Number of documents by topic “Statistics” in 1961–2023.

Figure 3. Number of documents from 2015 to 2023 by topics “Statistics” and “Neural networks” in the Scopus database.

Table 1. Results of mathematical modeling to choose the boundaries of this study (general population size >100,000 units).

Sampling error, %	4.0	3.0	2.0	1.5	1.0
Standard testing level = 0.95
Sample size, N	600	1067	2401	4268	9604
High testing level = 0.99
Sample size, N	1037	1843	4147	7373	16,589
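The sample sizes in Table 1 are consistent with the standard sample-size formula for a proportion in a very large population, n = z²·p·(1 − p)/e². The sketch below is offered under stated assumptions rather than as the author's modeling procedure: the function name `sample_size` is hypothetical, p = 0.5 (maximum variability), z = 1.96 for the 0.95 level, z = 2.576 for the 0.99 level, and rounding to the nearest integer are all assumptions, chosen because together they reproduce every value in Table 1.

```python
def sample_size(z, error, p=0.5):
    """Sample size for a proportion in a very large population.

    z     -- critical value for the chosen testing level (assumed)
    error -- sampling error as a fraction, e.g. 0.02 for 2.0%
    p     -- assumed proportion; 0.5 gives the largest sample
    """
    return round(z ** 2 * p * (1 - p) / error ** 2)


errors = [0.04, 0.03, 0.02, 0.015, 0.01]

standard_level = [sample_size(1.96, e) for e in errors]   # testing level 0.95
high_level = [sample_size(2.576, e) for e in errors]      # testing level 0.99

print("0.95:", standard_level)
print("0.99:", high_level)
```

With these assumptions the two lists match the two “Sample size, N” rows of Table 1, including the value of 4147 used for Example 3.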

Table 2. The final list of sample size values for calculating statistical indicators (general population size >100,000 units).

Sampling error, %	4.0	3.0	2.0	1.5	1.0
Sample size, N	599 [24]	1099	4147	7373	16,589

Table 3. Statistical indicators calculated using experimental and control methods.

№	N	Ni	The Tool	(M)	(s)
1	599	22; 43; 124; 234; 176	New prompt	1.1670	1.0440
			Excel table	1.1669	1.0443
			|Difference|	0.0001	0.0003
2	1099	84; 210; 347; 352; 106	New prompt	1.8298	1.0861
			Excel table	1.8308	1.0836
			|Difference|	0.0010	0.0025
3	4147	22; 43; 124; 234; 3724	New prompt	0.1685	0.5644
			Excel table	0.1709	0.5745
			|Difference|	0.0024	0.0101
4	7373	22; 43; 124; 234; 6950	New prompt	0.0948	0.4394
			Excel table	0.0956	0.4378
			|Difference|	0.0008	0.0016
5	16,589	22; 43; 124; 234; 16,166	New prompt	0.0421	0.2946
			Excel table	0.0423	0.2951
			|Difference|	0.0002	0.0005

Table 4. Verification of statistical hypotheses for Example 3 (Sampling error is 2.0%).

№	Calculations	New Prompt	Excel Tables
1	Size of a sample, N	4147	4147
2	Expected value, (M), %	0.1685	0.1709
3	|(M₁)–(M₂)|	0.0024
4	μ₁–μ₂	0.00
5	Standard deviation for the sample, (s)	0.5644	0.5745
6	Average error, Ṡ_Ẋ = (s)/√n	0.00876	0.00892
7	Ṡ_Ẋ²	0.0000768	0.0000796
8	|Ṡ₁²–Ṡ₂²|	0.0000028
9	√(Ṡ₁²–Ṡ₂²)	0.00167
10	|z_stat| = [(M₁)–(M₂) − (μ₁–μ₂)]/√(Ṡ₁²–Ṡ₂²)	1.43427
11	Value z_tabl for the high testing level of 0.99	2.58
12	Result, |z_stat| < z_tabl	Yes

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
