Article

An Analytical Approach for Temporal Infection Mapping and Composite Index Development

1 School of Medicine, Xiamen University, Xiamen 361005, China
2 National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen 361005, China
3 Data Mining Research Center, Xiamen University, Xiamen 361005, China
4 School of Management, Xiamen University, Xiamen 361005, China
* Author to whom correspondence should be addressed.
Mathematics 2023, 11(20), 4358; https://doi.org/10.3390/math11204358
Submission received: 5 September 2023 / Revised: 12 October 2023 / Accepted: 17 October 2023 / Published: 20 October 2023
(This article belongs to the Section Mathematical Biology)

Abstract

Informative composite indices for infectious diseases can guide the development of interventions and public health practice. This paper presents an approach that supports further analysis of the incidence of individual and multiple diseases. The research comprises two steps: first, an automatic and reproducible procedure based on functional data analysis techniques was proposed for analyzing the dynamic properties of each disease; second, orthogonal transformation was adopted for the development of composite indices. Incidence data for nineteen class B notifiable diseases in China between 2000 and 2019 were collected for this study from the National Bureau of Statistics of China. The procedure makes it possible to probe the dynamics underlying the discrete incidence rates of each disease, and combining the derivative features reveals similarities and differences among diseases in detail. Interventions have been highly successful for the majority of notifiable diseases in China, such as bacterial or amebic dysentery and epidemic cerebrospinal meningitis, while more effort is required for some diseases, such as AIDS and viral hepatitis. The composite indices were able to reflect a more complex concept by combining individual incidences into a single value, providing a simultaneous reflection of multiple diseases and facilitating disease comparisons accordingly. For the notifiable diseases included in this study, gastro-intestinal infectious diseases and respiratory infectious diseases were managed better than the other categories from the perspective of composite indices. This study developed a methodology for exploring the prevalence properties of infectious diseases. The development of effective and reliable analytical methods provides insight into the common dynamics and properties of infectious diseases and has implications for their effective intervention.

1. Introduction

Despite remarkable successes in vaccination, health education, and health services, infectious diseases continue to pose great challenges to human health, economic sustainability, public health, and service delivery [1,2,3]. The disease burden caused by infectious diseases varies with the development level and public health investment of each country, and more than 20% of annual deaths worldwide are estimated to be directly related to infectious disease [4,5,6]. The emergence of new infectious diseases (e.g., COVID-19) and the high prevalence of post-infection symptoms have further prompted global research into the management and control of infectious diseases [7]. In addition to health losses, a pandemic can have unprecedented impacts on global activity, causing trade decline and economic disruption [8,9]. Assessing infection prevalence and exploring its underlying properties are therefore essential for informing policy and strategy development [10].
Given the complexity of influencing factors and interactions, there is a concern that we may not observe an instant effect after the implementation of a policy, and a transient indicator is usually insufficient to reflect the kinetic changes underlying emerging infections. Incidence, a commonly used indicator to measure temporal and spatial prevalence, is usually employed to describe the severity and consequences of a certain disease. However, the incidence indicator, while useful for describing the basic properties of a specific disease, has limited utility in probing underlying features and in portraying comprehensive information for a certain cluster of diseases [11]. Faced with the complexity of infectious disease, we need as comprehensive an understanding as possible of information related to disease incidence in order to avoid and prevent outbreaks and epidemics effectively, and to minimize the harm to health and social development. More robust models that encompass the complex interface between pathogen prevalence and management efficiency are needed [12]. Researchers have devoted much effort to investigating oscillatory properties with the aim of mapping disease prevalence [13]. Compared to static incidence values, treating incidence data as time dynamics may have implications for disease development and management, and new analysis indicators are needed to reflect this issue [14,15]. To portray the features of temporal dynamics, like tuning curves and internal kinetic characteristics, more techniques are required to handle temporal data (e.g., statistical parametric mapping) [16]. In light of this, observations based only on apparent incidence trends are insufficient to reflect the essence of disease prevalence, whereas internal kinetic characteristics may provide detailed evidence about the origin and development of a certain disease.
Complex statistical analyses of diseases, taking into consideration mathematical and statistical principles, could generate detailed and in-depth insights for planning programs [17,18]. Statistical indices are fundamental and often shed light on profound data analyses; moreover, the transformation and extension of indices are commonly adopted in a wide range of fields, including environmental assessment and performance evaluation, aiming to provide comprehensive, deep, and scientific evaluation based on aspects of certain characteristics, causes, and influences [19]. Variable transformation provides an approach in solving problems such as distribution and linear characteristics [20]. Owing to the progress in multiple disciplines, including computer science, data science, applied mathematics, and statistics, we have moved closer to innovative data science approaches to obtain significant viewpoints in disease incidence and management [21,22,23,24]. There have been studies that identified the contribution of variable transformation in applications. Considering the sensitivity to sample size when using the classic Jaccard and Sorensen indices, Chao et al. [25] proposed probabilistic derivation estimators for the classic, incidence-based forms of these indices and extended this approach to formulate new Jaccard-type or Sorensen-type indices, which proved to be considerably less biased than classic indices [26,27]. Zhao et al. [16] developed several indices to discuss whether immediate additional benefits for controlling class B infectious diseases have been gained from the prevention policies for COVID-19, and they further analyzed the possible factors that may reduce the occurrence in outbreaks. Han et al. [13] used an improved spectrum analysis to illustrate the time series of infected cases and categorize the diseases accordingly.
The purpose of this study was to develop a more cohesive view of infectious disease and provide materials for management strategy. The first step was to devise an analysis framework and develop a program to probe the internal kinetic characteristics of infectious diseases, including the smoothed trend and derivative features. The benefit of using functional data analysis (FDA) in the analysis of time series data has been observed in previous studies [28]. FDA, generally expressed by a basis system (Fourier basis, polynomial basis, etc.), has particular features including smoothing, aligning, and dimension reduction that make the data analysis strategy exploratory, confirmatory, and predictive, and that have led to a wide range of applications [29,30,31,32]. Because FDA expresses time series data in the form of a function, it is feasible to represent the data as a smooth function and to calculate its derivatives, which yields detailed and in-depth insights into disease prevalence that are usually important for widespread public health management and other applications [28,32,33]. The second step was to develop composite indices that can be used for comparison among diseases and factor exploration through the idea of orthogonal transformation [33]. Composite indices, by combining individual indicators into a single index [34], provide a comprehensive view in which various perspectives are reflected simultaneously, thus facilitating the comparison of prevalence trends between diseases with different transmission routes or pathogen species, and providing an opportunity for further analysis (e.g., analysis of influencing factors). As a special and commonly used kind of matrix calculation, orthogonal transformation plays an important role in calculation methodology and data integration.

2. Materials and Methods

2.1. Data Sources

To ensure the consistency and comparability of incidence data over time, we used nationwide annual incidence data of notifiable infectious diseases in China, whose reporting criteria and procedures are regulated by law. In China, a national standardized reporting system for infectious disease was established in the 1950s [35]. From the 1980s onward, the data collected through this system have become more consistent and reliable [36]. According to relevant laws and regulations in China, a total of 40 diseases have been included as notifiable infectious diseases up to now, classified as A, B, and C according to their prevalence and hazard level (COVID-19 was included as one of the Class B notifications in 2023), with prevalence and severity decreasing from Class A to Class C. As infectious diseases of Class B (27 types) have more varieties than those in either Class A (2 types) or Class C (11 types), and cause more severe consequences than those in Class C, Class B infectious diseases have received particular attention [13]. In this study, the 19 most common types of Class B notifiable diseases were included for analysis, while others were excluded either because of data availability, like schistosomiasis, for which data were not available until the year 2005, and COVID-19, which has been classified as Class B since January 2023, or because of extremely low incidence, like neonatal tetanus. The incidence data were calculated as annual notification incidences per 1,000,000 people using national year-end populations between 2000 and 2019.
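As a small illustration of the incidence calculation described above, the following sketch converts a yearly count of notified cases into an incidence per 1,000,000 people. The function name and the numbers in the usage lines are hypothetical and are not taken from the study data.

```python
def incidence_per_million(notified_cases: int, year_end_population: int) -> float:
    """Annual notification incidence per 1,000,000 people.

    Both arguments refer to the same calendar year; the population is the
    national year-end population, as in the text.
    """
    if year_end_population <= 0:
        raise ValueError("year_end_population must be positive")
    return notified_cases * 1_000_000 / year_end_population


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    rate = incidence_per_million(notified_cases=500, year_end_population=1_000_000_000)
    print(f"incidence per 1,000,000: {rate}")
```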
For composite index construction, we classified the included notifiable diseases into four categories according to transmission route: (1) gastro-intestinal infectious disease, including bacterial or amebic dysentery and typhoid fever and paratyphoid; (2) respiratory infectious disease, including tuberculosis, scarlet fever, pertussis, measles, and epidemic cerebrospinal meningitis; (3) sexually transmitted and blood-borne diseases, including viral hepatitis, syphilis, gonorrhea, and acquired immune deficiency syndrome (AIDS); and (4) vector-borne disease, including brucellosis, malaria, anthrax, leptospirosis, dengue fever, epidemic hemorrhagic fever, epidemic encephalitis, and lyssa (rabies).
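The grouping above can be written down as a simple lookup structure, which is convenient when one composite index is later built per category. This is only a sketch: the dictionary name and the key strings are illustrative choices, while the group membership follows the classification given in the text.

```python
# Transmission-route categories and their member diseases, as listed in the text.
# The variable name and key spellings are illustrative, not from the original program.
TRANSMISSION_GROUPS = {
    "gastro-intestinal": [
        "bacterial or amebic dysentery",
        "typhoid fever and paratyphoid",
    ],
    "respiratory": [
        "tuberculosis",
        "scarlet fever",
        "pertussis",
        "measles",
        "epidemic cerebrospinal meningitis",
    ],
    "sexually transmitted and blood-borne": [
        "viral hepatitis",
        "syphilis",
        "gonorrhea",
        "AIDS",
    ],
    "vector-borne": [
        "brucellosis",
        "malaria",
        "anthrax",
        "leptospirosis",
        "dengue fever",
        "epidemic hemorrhagic fever",
        "epidemic encephalitis",
        "lyssa",
    ],
}

if __name__ == "__main__":
    for route, diseases in TRANSMISSION_GROUPS.items():
        print(f"{route}: {len(diseases)} diseases")
```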

2.2. Analytical Framework

2.2.1. Modeling Program with Functional Data Analysis

In order to obtain prevalence traits for each disease in-depth, we developed a modeling program using Delphi through the FDA concept, which provides techniques to handle time-series data [37]. With the incidence data as $Y_i,\ i = 0, 1, \ldots, n$, we constructed the following model:
$$Y(t_i) = \sum_{j=0}^{m} \alpha_j \varphi_j(t_i) + \varepsilon(t_i), \quad 0 \le t_i \le 1;\ i = 0, 1, \ldots, n;\ j = 0, 1, \ldots, m;\ m < n,$$
where $\varphi_j(t_i)$ ($j = 0, 1, \ldots, m$) is a basis function; $\alpha_j,\ j = 0, 1, \ldots, m$ is the coefficient vector; and $\varepsilon(t_i),\ i = 0, 1, \ldots, n$ is the residual term which accords with normal distribution. To improve the comparability of variables, time series data $Y_i,\ i = 0, 1, \ldots, n$ have been isometrically parameterized. Suppose
$$\Delta_i = \nu_{i+1} - \nu_i = \text{constant},$$
it is convenient to express the form in integer sequences as
$$\nu_i = i, \quad i = 0, 1, \ldots, n,$$
and then the normalized parameterizations can be acquired after normalization,
$$t_i = \nu_i / n, \quad i = 0, 1, \ldots, n.$$
Denote the parameter segmentation as
$$\Delta_t : t_0 < t_1 < \cdots < t_n.$$
The Bernstein basis function was employed to define the basis function $\varphi_{j,m}(t) = C_m^j\, t^j (1 - t)^{m-j},\ j = 0, 1, \ldots, m$, and the orthogonal least squares algorithm was used to determine the function of each category of disease as
$$f(t_i) = \sum_{j=0}^{m} \hat{\alpha}_j \varphi_{j,m}(t_i), \quad i = 0, 1, \ldots, n,$$
and then the model can be expressed as
$$Y(t_i) = \sum_{j=0}^{m} \hat{\alpha}_j \varphi_{j,m}(t_i) + e_i(t_i), \quad i = 0, 1, \ldots, n.$$
For each set of control points of function $\hat{\alpha}_j,\ j = 0, 1, \ldots, m$, calculate the value of
$$SSE = \sum_{i=0}^{n} \left( Y(t_i) - f(t_i) \right)^2,$$
and seek the minimal value of SSE with a loop program and acquire the corresponding parameter as follows:
$$\begin{pmatrix} \hat{\alpha}_0 \\ \hat{\alpha}_1 \\ \vdots \\ \hat{\alpha}_m \end{pmatrix} = (\Sigma^{T} \Sigma)^{-1} \Sigma^{T} \begin{pmatrix} Y(t_0) \\ Y(t_1) \\ \vdots \\ Y(t_n) \end{pmatrix},$$
and
$$\Sigma = \begin{pmatrix} \varphi_{0,m}(t_0) & \varphi_{1,m}(t_0) & \cdots & \varphi_{m,m}(t_0) \\ \varphi_{0,m}(t_1) & \varphi_{1,m}(t_1) & \cdots & \varphi_{m,m}(t_1) \\ \vdots & \vdots & & \vdots \\ \varphi_{0,m}(t_n) & \varphi_{1,m}(t_n) & \cdots & \varphi_{m,m}(t_n) \end{pmatrix},$$
where $\Sigma^{T}$ is the transpose of $\Sigma$.
We obtained the estimation of control points $\hat{\alpha}_0, \hat{\alpha}_1, \ldots, \hat{\alpha}_m$ for the function, and the fitted function can be written as:
$$f(t) = \sum_{j=0}^{m} \hat{\alpha}_j \varphi_{j,m}(t).$$
The first derivative and second derivative were adopted to analyze the intrinsic traits, which gives
$$f'(t) = \sum_{j=0}^{m} \hat{\alpha}_j\, \frac{tm - j}{t(t - 1)}\, \varphi_{j,m}(t),$$
and
$$f''(t) = \sum_{j=0}^{m} \hat{\alpha}_j \left[ \frac{m^2 - m}{(t - 1)^2} + \frac{j^2 - j}{t^2 (t - 1)^2} + \frac{2j(1 - m)}{t (t - 1)^2} \right] \varphi_{j,m}(t).$$
Based on the above steps, we can establish an automatic program to obtain an in-depth description of diseases separately.
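The fitting and differentiation steps above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' Delphi program: the function names are hypothetical, the degree m is supplied by the caller, and `numpy.linalg.lstsq` is used as a stand-in for the orthogonal least squares algorithm (both minimize the same SSE). Derivatives are computed with the standard control-point difference identity for Bernstein-form polynomials, which is algebraically equivalent to the closed-form expressions for 0 < t < 1 and also remains defined at t = 0 and t = 1.

```python
from math import comb

import numpy as np


def bernstein_design(t, m):
    """Matrix with entries phi_{j,m}(t_i) = C(m, j) * t_i**j * (1 - t_i)**(m - j)."""
    t = np.asarray(t, dtype=float)
    return np.column_stack(
        [comb(m, j) * t**j * (1.0 - t) ** (m - j) for j in range(m + 1)]
    )


def fit_control_points(y, m):
    """Least-squares estimate of the control points for equally spaced data."""
    y = np.asarray(y, dtype=float)
    n = len(y) - 1
    if not m < n:
        raise ValueError("the model requires m < n")
    t = np.arange(n + 1) / n  # normalized parameters t_i = i / n
    sigma = bernstein_design(t, m)
    alpha, *_ = np.linalg.lstsq(sigma, y, rcond=None)  # minimizes the SSE
    return t, alpha


def evaluate(alpha, t, order=0):
    """Value (order=0) or derivative (order=1, 2, ...) of the fitted function.

    Uses f'(t) = m * sum_j (alpha_{j+1} - alpha_j) * phi_{j,m-1}(t), applied
    repeatedly for higher orders; requires order <= m.
    """
    coef = np.asarray(alpha, dtype=float)
    m = len(coef) - 1
    scale = 1.0
    for k in range(order):
        coef = np.diff(coef)  # forward differences of the control points
        scale *= m - k
    return scale * (bernstein_design(t, m - order) @ coef)


if __name__ == "__main__":
    # Synthetic series for illustration only (not study data).
    demo = [10.0, 9.1, 8.5, 7.2, 6.8, 6.1, 5.0, 4.6, 4.1, 3.9]
    t, alpha_hat = fit_control_points(demo, m=3)
    print("trend:", evaluate(alpha_hat, t))
    print("first derivative:", evaluate(alpha_hat, t, order=1))
    print("second derivative:", evaluate(alpha_hat, t, order=2))
```

Note that derivatives here are taken with respect to the normalized parameter t; dividing the first derivative by n and the second by n squared would express them per original time step.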

2.2.2. Composite Indices Development through Orthogonal Transformation

Denote disease incidence as $Y_{ij}$, where $i$ signifies disease category and $j$ signifies the time point. In order to improve the comparability of different diseases, we normalized the data as follows:
$$Y_{ij}^{*} = \frac{Y_{ij} - \bar{Y}_i}{S_i}, \quad i = 1, \ldots, p;\ j = 1, \ldots, T,$$
where
$$\bar{Y}_i = \frac{\sum_{j=1}^{T} Y_{ij}}{T}, \quad S_i = \sqrt{\frac{\sum_{j=1}^{T} (Y_{ij} - \bar{Y}_i)^2}{T - 1}}.$$
Based on the normalized disease incidence, the $p$-dimensional parameter vector according to disease category can be expressed as $Y = (Y_1, \ldots, Y_p)^{T}$, where $Y_i = (Y_{i1}, \ldots, Y_{iT})^{T}$ ($i = 1, \ldots, p$).
Let $R = A^{T} Y$, where $A$ represents an orthogonal matrix. The components of $R$ must be mutually uncorrelated, with the largest variance pertaining to the first component, followed by the second component, and the others accordingly. The total variances of $R$ and $Y$ are equal, so that no information is lost.
Linear transformation based on a $p$-dimensional vector $Y = (Y_1, \ldots, Y_p)^{T}$ can be written as:
$$\begin{aligned} R_1 &= \alpha_{11} Y_1 + \alpha_{12} Y_2 + \cdots + \alpha_{1p} Y_p = A_1^{T} Y \\ R_2 &= \alpha_{21} Y_1 + \alpha_{22} Y_2 + \cdots + \alpha_{2p} Y_p = A_2^{T} Y \\ &\ \,\vdots \\ R_p &= \alpha_{p1} Y_1 + \alpha_{p2} Y_2 + \cdots + \alpha_{pp} Y_p = A_p^{T} Y, \end{aligned}$$
and then the orthogonalized $p$-dimensional vector can be expressed as $R_1, R_2, \ldots, R_p$ according to their contributions, where
$$A_i = (\alpha_{i1}, \ldots, \alpha_{ip})^{T}, \quad i = 1, \ldots, p.$$
Given $Y = (Y_1, \ldots, Y_p)^{T}$, the covariance matrix can be computed and denoted as $C$, which may be decomposed so that the eigenvalues and eigenvectors are obtained as follows:
$$\Phi^{T} C \Phi = \lambda,$$
where $\Phi$ designates the eigenvector matrix and $\lambda$ is the diagonal matrix of eigenvalues. Here, we denote $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0$, and the corresponding eigenvectors are defined as $\Phi_1, \Phi_2, \ldots, \Phi_p$. Combining the matrix properties, we may obtain the following:
$$C = \lambda_1 \Phi_1 \Phi_1^{T} + \cdots + \lambda_p \Phi_p \Phi_p^{T},$$
where
$$\Phi_i^{T} \Phi_j = 0, \quad i \ne j.$$
Normalize the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_p$ as
$$w_i = \frac{\lambda_i}{\sum_{k=1}^{p} \lambda_k}, \quad i = 1, 2, \ldots, p,$$
which can be rewritten in the simplified form as
$$W = (w_1, w_2, \ldots, w_p)^{T},$$
and then the synthetic measurement of notifiable infectious disease
$$Z = w_1 R_1 + w_2 R_2 + \cdots + w_p R_p$$
is formed. Here, $Z$ is a composite reflection of diseases within the same category from the viewpoint of incidence variance.
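The construction above can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the function name is hypothetical, the input is assumed to be an array with one row per disease and one column per year, and the orthogonal matrix A is taken to be the eigenvector matrix of the covariance of the normalized series. The sign of each eigenvector is arbitrary, so the sign of each component (and therefore of Z) depends on a convention that the text does not specify.

```python
import numpy as np


def composite_index(incidence):
    """Composite index Z for one disease category.

    incidence: array-like of shape (p, T), rows are diseases, columns are years.
    Returns (z, w, r): the index of shape (T,), the eigenvalue weights of
    shape (p,), and the orthogonal components R of shape (p, T).
    """
    y = np.asarray(incidence, dtype=float)
    # Normalize each disease: subtract its mean, divide by its sample SD (T - 1).
    y_star = (y - y.mean(axis=1, keepdims=True)) / y.std(axis=1, ddof=1, keepdims=True)
    c = np.cov(y_star)  # p x p covariance matrix (rows are variables)
    eigvals, eigvecs = np.linalg.eigh(c)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]  # reorder so lambda_1 >= ... >= lambda_p
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    r = eigvecs.T @ y_star  # R = A^T Y, with A the eigenvector matrix
    w = eigvals / eigvals.sum()  # w_i = lambda_i / sum_k lambda_k
    z = w @ r  # Z = w_1 R_1 + ... + w_p R_p
    return z, w, r


if __name__ == "__main__":
    # Synthetic data for illustration only (3 diseases, 20 years).
    rng = np.random.default_rng(0)
    demo = rng.normal(size=(3, 20)).cumsum(axis=1)
    z, w, _ = composite_index(demo)
    print("weights:", w)
    print("composite index:", z)
```

Because the eigenvectors are orthonormal, the components are uncorrelated and their variances sum to the total variance of the normalized series, which matches the requirement that no information is lost.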

3. Results

3.1. Prevalence and Internal Kinetic Characteristics for Each Disease

A descriptive analysis for each disease was first performed according to FDA, with the incidence trend describing the change tendency for each disease along the time dimension, while the first and second derivatives reflected the pace and curvature change, respectively (Table S1). Using the automatic program with the FDA concept, the discrete incidence for each disease can be expressed in the form of a function, which makes it possible to deduce implicit information from the continuous smooth dynamics by using the parameter estimates generated from the function. According to the model approaches, we initially categorized the diseases into two classes, as shown in Figures S1 and S2, according to the prevalence trend. Most of the diseases included (12 out of 19) showed a descending trend, including bacterial or amebic dysentery, epidemic cerebrospinal meningitis, typhoid fever and paratyphoid, anthrax, leptospirosis, gonorrhea, epidemic encephalitis, epidemic hemorrhagic fever, malaria, tuberculosis, hydrophobia (rabies), and measles. Combining the first and second derivatives, bacterial or amebic dysentery and epidemic cerebrospinal meningitis were consistent with each other. Additionally, a similar feature appeared in derivatives for typhoid fever and paratyphoid, anthrax, and leptospirosis, with the first stationary point observed in the period 2001–2003. Furthermore, epidemic encephalitis and epidemic hemorrhagic fever, both belonging to vector-borne disease, appeared to be similar in all prevalence traits, and the same was true for malaria, tuberculosis, and hydrophobia. The volatility characteristics were prominent for measles in the prevalence trend and first derivative (see Figure S1).
The diseases with an ascending prevalence trend (7 out of 19) were fewer than those with a descending trend, and included scarlet fever, dengue fever, syphilis, AIDS, pertussis, brucellosis, and viral hepatitis (see Figure S2). Scarlet fever and dengue fever were close in prevalence trend, and both had similar internal kinetic characteristics according to the first and second derivatives. Syphilis and AIDS, as the most common sexually transmitted and blood-borne diseases, behaved alike not only in prevalence portraits, but also in derivative properties. Pertussis retained an upward trend, as did brucellosis, and although these were not close in oscillation, both had similar derivative properties. Viral hepatitis remains at a high level of prevalence in China, with an upward velocity change, indicating that it could be the focus of attention in the future.

3.2. Composite Indices

The idea of orthogonal transformation provides a valuable approach for data transformation and composite index construction without information loss. Four composite indices according to transmission route were developed in this study: a composite index for gastro-intestinal infectious disease, a composite index for respiratory infectious disease, a composite index for sexually transmitted and blood-borne disease, and a composite index for vector-borne disease. Detailed information concerning the development of the indices is shown in Table 1, Table 2, Table 3 and Table 4. Each composite index is a linear combination of the diseases included in the category, each with its corresponding coefficient. Taking the expression for gastro-intestinal infectious disease as an example, $Z_1$ in the formula designates the composite index for gastro-intestinal infectious disease, while $X_{11}$ and $X_{12}$ are the components of the composite index, with quantitative weights of 0.707 for bacterial or amebic dysentery and 0.421 for typhoid fever and paratyphoid.
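Written out from the weights quoted in the text, the gastro-intestinal example would take the form below. This is a sketch that assumes $X_{11}$ and $X_{12}$ denote the normalized incidences of bacterial or amebic dysentery and of typhoid fever and paratyphoid, respectively; the authoritative expression is the one reported in Table 1.

```latex
Z_1 = 0.707\, X_{11} + 0.421\, X_{12}
```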
Table 5 details the composite indices of 19 Class B notifiable diseases during the study period. The composite measures comprehensively reflect information about the incidence variance of 19 notifiable infections with different transmission routes. In summary, there was a downward trend in the indices of gastro-intestinal and respiratory infectious disease in China between 2000 and 2019, and the decline in gastro-intestinal infectious disease was especially prominent. In contrast, the indices of vector-borne disease and of sexually transmitted and blood-borne disease showed a fluctuating upward trend, indicating that more effort is needed to improve the situation (Figure 1).

4. Discussion

In measuring the prevalence of diseases, incidence rate has proven to be an intuitive and effective measure, as a comprehensive reflection of the epidemic prevalence trend affected by multiple correlated factors for each specific disease. However, the incidence rate is limited in its ability to provide continuous and extended information regarding a disease. Pathogens causing infectious diseases are diverse, including bacteria, viruses, and parasites [38], and the environmental conditions suited to each pathogen differ [39]. A large body of research has demonstrated that a set of factors linked to the natural environment, as well as social and economic development, including temperature, humidity, air pollution, unsafe water, sanitation, and handwashing, act as risk exposure pathways. It is crucial that infectious diseases be regarded in ecological contexts, which may be interpreted in terms of biotic and abiotic components [1,40]. As the core of a particular infection, the involvement of a pathogen coexisting with multiple hosts, vectors, or parasite species under certain natural and social environments jointly determines the prevalence trend [41]. In addition to biotic factors, the oscillation of infectious disease can also be influenced by natural factors (e.g., seasonal temperature, rainfall, natural disasters) [42,43], human factors (e.g., migration) [44], and social factors (e.g., public health investment, vaccination coverage) [45]. Additionally, the effect of influential factors on infectious diseases varies considerably in immediacy and sustainability. Factors may influence the incidence asynchronously or synchronously, and persistently or transiently. In this regard, we need deeper insights beyond incidence trends to discover the underlying changes.
The annual incidence data of a certain disease, constrained by common determinants, are correlated spatially and temporally. As the data used in this research are nationwide, with no necessity for considering spatial variation, they are suitable to be treated as time series data. A time series is by definition a sequence of observations over time, and has frequently been used to evaluate changes over time in quantities such as disease incidence and stock market prices, especially in the social sciences [46]. Studies have demonstrated that time series analyses have the potential to reveal more information than a cursory observation of data, with applications in describing the dependence of each observation and in future forecasting [47,48]. Many methods have been developed for time series analysis to meet demands in different situations [49]. To probe the underlying information, aside from prevalence trends alone, further data analysis techniques are needed to acquire more information about disease prevalence. In this work, we used two statistical models to derive implications regarding health policies and program developments.
One of the aims of this study was to put forward a statistical procedure for identifying in-depth prevalence portraits of infectious diseases from incidence data. In this regard, a strict and replicable procedure for the statistical analysis of change patterns and underlying features was developed based on FDA techniques. The intrinsic mechanism of FDA is to transform a set of data into a curve with a particular shape; that is, firstly, a function of the research object that changes over time is constructed according to the discrete monitoring sequence, and then the relevant feature information of the objective function is extracted through a series of operations based on the definite function [50]. FDA is an extension and deepening of traditional discrete statistical models for continuous time series data with an assumption that the data correspond to a function over time, and FDA has mostly been used in statistical analysis and data smoothing so far [22]. FDA thus provides new opportunities to understand variation and group comparisons in time series data. The advantages of FDA are prominent. Firstly, the observations do not have to be equally spaced in time, which means that the FDA approach is highly flexible compared to other methods commonly used for modeling trends in time series data. Another key advantage of the FDA approach is that parametric assumptions of time effects are unnecessary. Moreover, there is no need to assume that the values observed at different time points are independent. Additionally, the FDA approach is not concerned with correlations between repeated measurements, as it considers the entire curve as a single entity.
FDA can often extract additional information contained in functions and their derivatives that would not normally be available from the application of traditional statistical methods, which facilitates modeling and forecasting data across a range of health and population issues to better understand prevalence trends, risk factors, and effective measures [33]. Through FDA, data with a large sample size can be explored and provide a useful basis for subsequent observation [51], which represents a change in the concept of processing time series and related data [52].
Using the functional datasets of smooth incidence curves, we can easily classify the disease incidence into two categories according to the variation tendency: a descending trend and an ascending trend. Moreover, by virtue of functional datasets of smooth incidence curves, we were able to acquire derivative features containing implicit information about disease prevalence and influential factors. The essence of a derivative is a local linear approximation of a function through the concept of limit. Approximation theories have been developed quickly and used extensively in various contexts due to the combination of functional analysis concepts and classical analysis techniques. The main concerns in the application of functional linear approximation lie in the choice of the basis function, which remains a gray area to some extent [53], and in the estimation of the control points, since many optimization applications highlight the search for estimates of the fixed point of linear contraction, i.e., the control point. In this study, the Bernstein basis function and the orthogonal least squares algorithm were jointly adopted to confront these two challenges [54]. The Bernstein polynomial was constructed to prove the Weierstrass theorem in 1912 [55], and has been widely used in approximation theory ever since due to favorable properties such as differentiability and convexity. In addition, the extension of such properties brings a broad range of applications in many fields, like curve and surface construction. The problems arising in the process of minimizing residuals can generally be divided into two types, depending on whether the residual is linear in the parameters: linear (ordinary) least squares and nonlinear least squares. As a universal criterion of optimality for the approximation problem, the one that minimizes the sum of squared deviations between the corresponding vectors (least squares, for short) is widely adopted [56].
Concerning the basic descriptive statistics of the diseases analyzed, 12 out of 19 infectious diseases decreased over time, and 7 out of 19 increased. Consistent with other studies, this situation reflects the positive effect of infectious disease management by the Chinese government in recent decades [13]. Incorporating the derivatives extends the interpretation of the incidence trends: diseases with similar derivative features were more alike in temporal dynamics than those whose derivatives differed notably, with similar pairs including bacterial or amebic dysentery and epidemic cerebrospinal meningitis, scarlet fever and dengue fever, malaria and tuberculosis, and syphilis and AIDS. The derivative, as an indicator describing the change in speed or acceleration, can be used to classify different clusters of diseases and appeared to perform better than the prevalence trend alone. This variation has profound implications for the empirical management of infections. For example, a high level of prevalence together with an accelerating trend directs us to give more attention to viral hepatitis prevention and management.
Based on this analysis of prevalence traits, we can grasp the straightforward features together with the internal information for each disease. However, an analysis of individual disease prevalence is not adequate to identify common attributes with sufficient reliability and provide further implications for the effective management of disease, and there is often a need to acquire information that cannot be reflected by a single disease. Thus, comprehensive indices for a certain cluster of diseases appear indispensable, and these are usually achieved through data processing or model construction. Given that the reasons for infection persistence are manifold and the most likely cause of infection emergence is a combination of ecological, environmental, and socioeconomic factors [9,10,11], studies of disease management are extensive, covering areas like natural environment, risk factors, burden of disease, and the implementation of interventions [57,58]. The heterogeneity among diseases is striking, and it is useful to develop composite indices to provide implications for disease occurrence and development, as this allows the study of a certain cluster of diseases as a whole and uses all the diseases of interest to estimate the intrinsic information they have in common. As one of the most powerful and useful tools of numerical linear algebra, orthogonal transformation, with its ability to decorrelate and compact information, arises in many application fields [59]. The application of orthogonal transformation in this study makes full use of these advantages, integrates the incidence data of each cluster into a composite index according to a given clustering criterion, and provides implications for effect evaluation and factor exploration of infectious diseases.
A comprehensive understanding of infectious disease may be interpreted from biotic and abiotic perspectives. Biotic factors, including pathogen heritability, variability, and pathogenicity, play a fundamental role in disease occurrence and development. Of particular note for SARS-CoV-2 was the emergence of mutational variants at a high frequency, which facilitated immune escape for the virus [60]. Corresponding to biotic factors, the abiotic elements, composed of natural and social factors, may accelerate or inhibit the spread of infectious disease. The onset of many diseases characterized by seasonality (e.g., influenza) is mostly attributed to the seasonal laws of the natural environment, influencing the survival, spread, and infection of a particular pathogen. A key step for infection formation is the social transmission of pathogens, which is vulnerable to population density, sanitary conditions, etc. Improved public health conditions and efficient interventions are important for the control of all infectious diseases; however, the transmission and prevalence of different diseases are determined disproportionately by those influences due to the heterogeneity in biological attributes and transmission routes.
In this study, the composite indices are developed according to transmission routes, with the indices for gastro-intestinal and respiratory infectious diseases performing better than the other two. Gastro-intestinal infectious disease is most sensitive to nutrition and living standards, water and food safety regulations, sanitation, public hygiene, and hygiene behaviors [39], and its mitigation has achieved prominent success as a result of improvements in socioeconomic and public hygiene conditions. Respiratory infectious diseases are the most likely to develop into large-scale epidemics because of the speed and wide range of their transmission, as well as the difficulty of prevention, and thus their index did not perform as well as the index for gastro-intestinal infectious disease. The severe acute respiratory syndrome (SARS) crisis in 2003 accelerated the development of the public health information system in China and was followed by increased investment in public health expenditure, surveillance systems, facilities, institutions, and human resources, forming a solid foundation for all infectious disease responses.
Sexually transmitted and blood-borne diseases, especially hepatitis B and HIV, remain health challenges worldwide [61]. Their management depends more heavily on social interventions, such as health education and vaccination, in which the Chinese government has invested significantly, together with proven therapies and prophylaxis measures; however, more efforts are still needed. By calculating the reproduction number for hepatitis B virus transmission within the population, Xu et al. estimated that there would be more than 449,535 potentially infectious hepatitis B virus infections in mainland China, considering the existing vaccine inefficiency rate [62]. Vector-borne diseases, though not the most prominent in the number of people infected, stand out in the difficulty of prevention. Their common feature is that transmission depends on vectors, so the influence of the natural environment is especially important, and the disease spectrum accordingly differs across geographic locations. Cross-regional infections and emerging infectious diseases have become important concerns with population growth, urbanization, cross-regional travel, and commerce [63]. Faced with threats from newly emerging infections, it is important to learn from the management of long-standing infections.
The data-driven approaches presented in this study are reliable and replicable in other settings, and can serve as exploratory analyses to guide administrative data research and intervention implementation. However, limitations also exist in this research. One is that the result only portrays the prevalence properties, with no analysis of the underlying infection mechanisms. Another limitation of our study is that viral hepatitis consists of hepatitis A, hepatitis B, hepatitis C, hepatitis D, and hepatitis E, each with different transmission routes; however, we have no detailed information on each viral hepatitis separately in our study, so we classified all viral hepatitis in the sexual and blood-borne category based on the fact that 96% of deaths caused by viral hepatitis are attributable to hepatitis B and hepatitis C virus [64].

5. Conclusions

More practical analyses that encompass feature extraction from infectious diseases are needed, given the complexity of the biotic and abiotic factors related to disease prevalence. The primary goal of this study was to provide implications for infection prevalence and program development through the analysis of incidence data; the study underscores the importance of the internal kinetics and composite indices of infections and presents a methodological approach for these analyses. We constructed an analysis framework that forms a continuous fit for discrete observations and could be applied in a wide range of settings. Most of the Class B notifiable diseases in mainland China were studied accordingly from the perspective of their internal kinetic portraits. With the procedure designed for this research, discrete incidence rates of infectious diseases can be represented as a continuous function that illustrates the intrinsic dynamics and underlying features. Introducing functional data analysis (FDA) makes it possible to capture the continuous dynamics behind discrete observations and thus to provide implications for the underlying determinants of infection prevalence. This offers a new picture of the significant properties of each disease and more practical guidance on prevention and control measures.
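The idea of representing discrete annual incidence rates as a continuous function with first and second derivatives can be sketched as follows. This is only a minimal illustration under stated assumptions: a least-squares polynomial basis of degree 4 stands in for whatever basis the study's FDA procedure used, and the incidence values are synthetic, not taken from the study.

```python
import numpy as np
from numpy.polynomial import Polynomial

years = np.arange(2000, 2020)

# Hypothetical incidence rates per 100,000 (synthetic, for illustration only).
rng = np.random.default_rng(1)
incidence = 40.0 * np.exp(-0.12 * (years - 2000)) + rng.normal(0.0, 0.5, years.size)

# Least-squares fit of a smooth curve to the discrete observations.
# The degree is an illustrative choice, not a value from the study.
curve = Polynomial.fit(years, incidence, deg=4)

# The derivatives of the fitted curve describe the prevalence dynamics:
# the first derivative is the rate of change, the second its acceleration.
velocity = curve.deriv(1)
acceleration = curve.deriv(2)

# Evaluate on a fine grid to obtain continuous-looking trajectories.
grid = np.linspace(2000, 2019, 200)
fitted = curve(grid)
declining = velocity(grid) < 0   # True where the fitted incidence is falling
print(f"share of the period with falling incidence: {declining.mean():.2f}")
```

A negative first derivative with a positive second derivative, for example, would indicate an incidence that is still falling but at a slowing pace.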
Our study was also intended to provide a foundation for information synthesis in a comprehensive, evidence-based comparison of infectious diseases. Influences on infection incidence are multifactorial, and a comprehensive analysis of the determinants of health is likely to be much more relevant for accelerating global health progress. Synthetic measures of infections, combined with socioeconomic and environmental determinants, can explain infections more comprehensively and objectively, providing implications for epidemic management. To support the synthetic description and quantitative comparison of diseases within and between categories, as well as further exploration, four composite indices were developed according to transmission routes. Comprehensive measurements based on orthogonal transformation are helpful for the in-depth analysis of a cluster of diseases with common characteristics; they not only reflect the common trend but can also be used for further applications such as factor exploration. In this regard, composite indices extend the scientific evidence beyond what either analytical approach supports alone.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/math11204358/s1: Table S1: Corresponding interpretation of function traits and disease incidence change characteristics; Figure S1: The prevalence trend, first derivative, and second derivative features for diseases with a descending prevalence trend: (a) kinetic prevalence features for bacterial or amebic dysentery; (b) kinetic prevalence features for epidemic cerebrospinal meningitis; (c) kinetic prevalence features for typhoid fever and paratyphoid; (d) kinetic prevalence features for anthrax; (e) kinetic prevalence features for leptospirosis; (f) kinetic prevalence features for epidemic encephalitis; (g) kinetic prevalence features for epidemic hemorrhagic fever; (h) kinetic prevalence features for gonorrhea; (i) kinetic prevalence features for malaria; (j) kinetic prevalence features for tuberculosis; (k) kinetic prevalence features for hydrophobia; (l) kinetic prevalence features for measles. For each exhibition of a prevalence trend, the smooth curve represents the fitted function according to functional data analysis and the other is the direct connection of incidence data; Figure S2: The prevalence trend, first derivative, and second derivative features for diseases with an ascending prevalence trend: (a) kinetic prevalence features for scarlet fever; (b) kinetic prevalence features for dengue fever; (c) kinetic prevalence features for epidemic cerebrospinal meningitis; (d) kinetic prevalence features for AIDS; (e) kinetic prevalence features for pertussis; (f) kinetic prevalence features for brucellosis; (g) kinetic prevalence features for virus hepatitis. For each exhibition of a prevalence trend, the smooth curve represents the fitted function according to functional data analysis and the other is the direct connection of incidence data.

Author Contributions

J.Z. initiated the study with W.W. W.W. coordinated and conducted the analyses with F.W. and X.W.; W.W. cowrote the paper with Q.L. and X.W. All authors contributed to result interpretation and the critical evaluation and revision of the article. The corresponding author had full access to all the data in the study. All authors have read and agreed to the published version of the manuscript.

Funding

This project was funded by the National Office for Philosophy and Social Science (20&ZD137).

Data Availability Statement

Aggregated data of the notifiable infectious disease database can be retrieved at http://data.stats.gov.cn (accessed on 5 September 2022). Detailed data are confidential and protected by Chinese law.

Acknowledgments

This work was supported by the phased achievements of the “Cultural Inheritance and Innovation” of the “Double First Class” key construction project of Xiamen University (0610-X2302001). We would like to express our sincere gratitude to J.W. Xu.

Conflicts of Interest

The authors declare that they have no competing interests.

References

  1. Johnson, P.T.J.; de Roode, J.C.; Fenton, A. Why infectious disease research needs community ecology. Science 2015, 349, 1259504.
  2. Jasny, B.; Roberts, L.; Enserink, M.; Smith, O. What works [introduction to Global Health special issue]. Science 2014, 345, 1256–1257.
  3. Hassanzadeh, P.; Atyabi, F.; Dinarvand, R. Nanobionics: From plant empowering to the infectious disease treatment. J. Control Release 2022, 349, 890–901.
  4. GBD 2019 Antimicrobial Resistance Collaborators. Global mortality associated with 33 bacterial pathogens in 2019, a systematic analysis for the Global Burden of Disease Study 2019. Lancet 2022, 400, 2221–2248.
  5. Kirtane, A.R.; Verma, M.; Karandikar, P.; Furin, J.; Langer, R.; Traverso, G. Nanotechnology approaches for global infectious diseases. Nat. Nanotechnol. 2021, 16, 369–384.
  6. Morens, D.; Folkers, G.; Fauci, A. The challenge of emerging and re-emerging infectious diseases. Nature 2004, 430, 242–249.
  7. Chen, C.; Haupert, S.R.; Zimmermann, L.; Shi, X.; Fritsche, L.G.; Mukherjee, B. Global prevalence of post-coronavirus disease 2019 (COVID-19) condition or long covid: A meta-analysis and systematic review. J. Infect. Dis. 2022, 226, 1593–1607.
  8. Bai, L.; Wei, Y.; Wei, G.; Li, X.; Zhang, S. Infectious disease pandemic and permanent volatility of international stock markets: A long-term perspective. Financ. Res. Lett. 2021, 40, 101709.
  9. Padhan, R.; Prabheesh, K.P. The economics of COVID-19 pandemic: A survey. Econ. Anal. Policy 2021, 70, 220–237.
  10. Wolf, J.; Johnston, R.B.; Ambelu, A.; Arnold, B.F.; Bain, R.; Brauer, M.; Brown, J.; Caruso, B.A.; Clasen, T.; Colford, J.M.; et al. Burden of disease attributable to unsafe drinking water, sanitation, and hygiene in domestic settings: A global analysis for selected adverse health outcomes. Lancet 2023, 401, 2060–2071.
  11. European Commission. Handbook on Constructing Composite Indicators: Methodology and User Guide; Organisation for Economic Co-Operation and Development, SourceOECD (Online Service), Ed.; OECD: Paris, France, 2008; p. 158.
  12. Lloyd-Smith, J.O.; George, D.; Pepin, K.M.; Pitzer, V.E.; Pulliam, J.R.; Dobson, A.P.; Hudson, P.J.; Grenfell, B.T. Epidemic dynamics at the human–animal interface. Science 2009, 326, 1362–1367.
  13. Han, C.; Li, M.; Haihambo, N.; Cao, Y.; Zhao, X. Enlightenment on oscillatory properties of 23 class B notifiable infectious diseases in the mainland of China from 2004 to 2020. PLoS ONE 2021, 16, e0252803.
  14. Zhang, X.; Hou, F.; Qiao, Z.; Li, X.; Zhou, L.; Liu, Y.; Zhang, T. Temporal and long-term trend analysis of class C notifiable diseases in China from 2009 to 2014. BMJ Open 2016, 6, e011038.
  15. Wang, Z.; Liu, Y.; Li, Y.; Wang, G.; Lourenço, J.; Kraemer, M.; He, Q.; Cazelles, B.; Li, Y.; Wang, R.; et al. The relationship between rising temperatures and malaria incidence in Hainan, China, from 1984 to 2010: A longitudinal cohort study. Lancet Planet Health 2022, 6, e350–e358.
  16. Zhao, X.; Li, M.; Haihambo, N.; Jin, J.; Zeng, Y.; Qiu, J.; Guo, M.; Zhu, Y.; Li, Z.; Liu, J.; et al. Changes in Temporal Properties of Notifiable Infectious Disease Epidemics in China During the COVID-19 Pandemic: Population-Based Surveillance Study. JMIR Public Health Surveill. 2022, 8, e35343.
  17. The Lancet Infectious Diseases. Designing infectious disease programmes for the future. Lancet Infect. Dis. 2022, 22, 1253.
  18. Schleihauf, E.; Watkins, R.; Plant, A. Heterogeneity in the spatial distribution of bacterial sexually transmitted infections. Sex. Transm. Infect. 2009, 85, 45–49.
  19. Liu, Y.; Hu, Y.; Hu, Y.; Gao, Y.; Liu, Z. Water quality characteristics and assessment of Yongding New River by improved comprehensive water quality identification index based on game theory. J. Environ. Sci. 2021, 104, 40–52.
  20. Vondrak, J.; Penhaker, M. Statistical Evaluation of Transformation Methods Accuracy on Derived Pathological Vectorcardiographic Leads. IEEE J. Transl. Eng. Health Med. 2022, 10, 1900208.
  21. King, D.A.; Peckham, C.; Waage, J.K.; Brownlie, J.; Woolhouse, M.E.J. Infectious Diseases: Preparing for the Future. Science 2006, 313, 1392–1393.
  22. Ladner, J.T.; Grubaugh, N.D.; Pybus, O.G.; Andersen, K.G. Precision epidemiology for infectious disease control. Nat. Med. 2019, 25, 206–211.
  23. Zhang, Q. Data science approaches to infectious disease surveillance. Philos. Trans. A Math. Phys. Eng. Sci. 2022, 380, 20210115.
  24. Ye, Y.; Zhang, Q.; Wei, X.; Cao, Z.; Yuan, H.Y.; Zeng, D.D. Equitable access to COVID-19 vaccines makes a life-saving difference to all countries. Nat. Hum. Behav. 2022, 6, 207–216.
  25. Chao, A.; Chazdon, R.L.; Colwell, R.K.; Shen, T.J. A new statistical approach for assessing similarity of species composition with incidence and abundance data. Ecol. Lett. 2004, 8, 148–159.
  26. Paulson, A.R.; Lougheed, S.C.; Huang, D.; Colautti, R.I. Multiomics Reveals Symbionts, Pathogens, and Tissue-Specific Microbiome of Blacklegged Ticks (Ixodes scapularis) from a Lyme Disease Hot Spot in Southeastern Ontario, Canada. Microbiol. Spectr. 2023, 11, e0140423.
  27. Romero-Vega, L.M.; Piche-Ovares, M.; Soto-Garita, C.; Barantes Murillo, D.F.; Chaverri, L.G.; Alfaro-Alarcón, A.; Corrales-Aguilar, E.; Troyo, A. Seasonal changes in the diversity, host preferences and infectivity of mosquitoes in two arbovirus-endemic regions of Costa Rica. Parasit. Vectors 2023, 16, 34.
  28. Dannenmaier, J.; Kaltenbach, C.; Kölle, T.; Krischak, G. Application of functional data analysis to explore movements: Walking, running and jumping—A systematic review. Gait Posture 2020, 77, 182–189.
  29. Ramsay, J.O.; Silverman, B.W. Functional Data Analysis, 2nd ed.; Springer: New York, NY, USA, 2010.
  30. Dieng, S.; Michel, P.; Guindo, A.; Sallah, K.; Ba, E.H.; Cissé, B.; Carrieri, M.P.; Sokhna, C.; Milligan, P.; Gaudart, J. Application of Functional Data Analysis to Identify Patterns of Malaria Incidence, to Guide Targeted Control Strategies. Int. J. Environ. Res. Public Health 2020, 17, 4168.
  31. Shah, D.A.; De Wolf, E.D.; Paul, P.A.; Madden, L.V. Functional Data Analysis of Weather Variables Linked to Fusarium Head Blight Epidemics in the United States. Phytopathology 2019, 109, 96–110.
  32. Ullah, S.; Finch, C.F. Applications of functional data analysis: A systematic review. BMC Med. Res. Methodol. 2013, 13, 43.
  33. Bernasconi, A.; Canakoglu, A.; Masseroli, M.; Ceri, S. The road towards data integration in human genomics: Players, steps and interactions. Brief. Bioinform. 2021, 22, 30–44.
  34. Wilhelm, D.; Lohmann, J.; De Allegri, M.; Chinkhumba, J.; Muula, A.S.; Brenner, S. Quality of maternal obstetric and neonatal care in low-income countries: Development of a composite index. BMC Med. Res. Methodol. 2019, 19, 154.
  35. Wang, L.; Wang, Y.; Jin, S.; Wu, Z.; Chin, D.P.; Koplan, J.P.; Wilson, M.E. Emergence and control of infectious diseases in China. Lancet 2008, 372, 1598–1605.
  36. Jiang, Y.; Dou, X.; Yan, C.; Wan, L.; Liu, H.; Li, M.; Wang, R.; Li, G.; Zhao, L.; Liu, Z.; et al. Epidemiological characteristics and trends of notifiable infectious diseases in China from 1986 to 2016. J. Glob. Health 2020, 10, 020803.
  37. LoMauro, A.; Colli, A.; Colombo, L.; Aliverti, A. Breathing patterns recognition: A functional data analysis approach. Comput. Methods Programs Biomed. 2022, 217, 106670.
  38. Pons-Salort, M.; Grassly, N.C. Serotype-specific immunity explains the incidence of diseases caused by human enteroviruses. Science 2018, 361, 800–803.
  39. Jones, K.E.; Patel, N.G.; Levy, M.A.; Storeygard, A.; Balk, D.; Gittleman, J.L.; Daszak, P. Global trends in emerging infectious diseases. Nature 2008, 451, 990–993.
  40. Halliday, J.E.B.; Hampson, K.; Hanley, N.; Lembo, T.; Sharp, J.P.; Haydon, D.T.; Cleaveland, S. Driving improvements in emerging disease surveillance through locally relevant capacity strengthening. Science 2017, 357, 146–148.
  41. Townsend, A.K.; Sewall, K.B.; Leonard, A.S.; Hawley, D.M. Infectious disease and cognition in wild populations. Trends Ecol. Evol. 2022, 37, 899–910.
  42. Han, C.; Li, M.; Haihambo, N.; Babuna, P.; Liu, Q.; Zhao, X.; Jaeger, C.; Li, Y.; Yang, S. Mechanisms of recurrent outbreak of COVID-19: A model-based study. Nonlinear Dyn. 2021, 106, 1169–1185.
  43. Suk, J.E.; Vaughan, E.C.; Cook, R.G.; Semenza, J.C. Natural disasters and infectious disease in Europe: A literature review to identify cascading risk pathways. Eur. J. Public Health 2020, 30, 928–935.
  44. Bharti, N.; Tatem, A.J.; Ferrari, M.J.; Grais, R.F.; Djibo, A.; Grenfell, B.T. Explaining seasonal fluctuations of measles in Niger using nighttime lights imagery. Science 2011, 334, 1424–1427.
  45. Buonomo, B.; Della Marca, R. Oscillations and hysteresis in an epidemic model with information-dependent imperfect vaccination. Math. Comput. Simul. 2019, 162, 97–114.
  46. Lee, S.W.; Kim, H.Y. Stock market forecasting with super-high dimensional time-series data using ConvLSTM, trend sampling, and specialized data augmentation. Expert Syst. Appl. 2020, 161, 113704.
  47. Zeger, S.L.; Irizarry, R.; Peng, R.D. On Time Series Analysis of Public Health and Biomedical Data. Annu. Rev. Public Health 2006, 27, 57–79.
  48. Yuan, A.E.; Shou, W. Data-driven causal analysis of observational biological time series. eLife 2022, 11, e72518.
  49. Blázquez-García, A.; Conde, A.; Mori, U.; Lozano, J.A. A review on outlier/anomaly detection in time series data. Comput. Surveys 2021, 54, 1–33.
  50. López-Pintado, S.; Romo, J. On the Concept of Depth for Functional Data. J. Am. Stat. Assoc. 2009, 104, 718–734.
  51. Nanditha, N.G.; Dong, X.; McLinden, T.; Sereda, P.; Kopec, J.; Hogg, R.S.; Montaner, J.S.; Lima, V.D. The impact of lookback windows on the prevalence and incidence of chronic diseases among people living with HIV: An exploration in administrative health data in Canada. BMC Med. Res. Methodol. 2022, 22, 1.
  52. Levitin, D.J.; Nuzzo, R.L.; Vines, B.W.; Ramsay, J.O. Introduction to functional data analysis. Can. Psychol. 2007, 48, 135–155.
  53. Barman, K.; Borkar, V.S. A note on linear function approximation using random projections. Syst. Control. Lett. 2008, 57, 784–786.
  54. Nedic, A.; Bertsekas, D.P. Least-squares policy evaluation algorithms with linear function approximation. Discret. Event Dyn. Syst. 2003, 13, 79–110.
  55. Bernstein, S. Démonstration du théorème de Weierstrass fondée sur le calcul des probabilités. Comm. Soc. Math. Kharkov 1912, 13, 1–2.
  56. Li, Y.; Lei, M.; Cui, W.; Guo, Y.; Wei, H.L. A Parametric Time-Frequency Conditional Granger Causality Method Using Ultra-Regularized Orthogonal Least Squares and Multiwavelets for Dynamic Connectivity Analysis in EEGs. IEEE Trans. Biomed. Eng. 2019, 66, 3509–3525.
  57. Jimenez, C.E.; Keestra, S.; Tandon, P.; Cumming, O.; Pickering, A.J.; Moodley, A.; Chandler, C.I. Biosecurity and water, sanitation, and hygiene (WASH) interventions in animal agricultural settings for reducing infection burden, antibiotic use, and antibiotic resistance: A One Health systematic review. Lancet Planet. Health 2023, 7, e418–e434.
  58. Harder, T.; Takla, A.; Rehfuess, E.; Sánchez-Vivar, A.; Matysiak-Klose, D.; Eckmanns, T.; Krause, G.; de Carvalho Gomes, H.; Jansen, A.; Ellis, S.; et al. Evidence-based decision-making in infectious diseases epidemiology, prevention and control: Matching research questions to study designs and quality appraisal tools. BMC Med. Res. Methodol. 2014, 14, 69.
  59. Yen, J.; Wang, L. Simplifying fuzzy rule-based models using orthogonal transformation methods. IEEE Trans. Syst. Man. Cybern. B Cybern. 1999, 29, 13–24.
  60. Li, J.; Lai, S.; Gao, G.F.; Shi, W. The emergence, genomic diversity and global spread of SARS-CoV-2. Nature 2021, 600, 408–418.
  61. Polaris Observatory Collaborators. Global prevalence, cascade of care, and prophylaxis coverage of hepatitis B in 2022: A modelling study. Lancet Gastroenterol. Hepatol. 2023, 8, 879–907.
  62. Xu, C.; Wang, Y.; Cheng, K.; Yang, X.; Wang, X.; Guo, S.; Liu, M.; Liu, X. A Mathematical Model to Study the Potential Hepatitis B Virus Infections and Effects of Vaccination Strategies in China. Vaccines 2023, 11, 1530.
  63. Rothe, C.; Jong, E.C. Emerging Infectious Diseases and the International Traveler. Travel Trop. Med. Man. 2017, 27–35.
  64. Cooke, G.S.; Andrieux-Meyer, I.; Applegate, T.L.; Atun, R.; Burry, J.R.; Cheinquer, H.; Dusheiko, G.; Feld, J.J.; Gore, C.; Griswold, M.G.; et al. Accelerating the elimination of viral hepatitis: A Lancet Gastroenterology & Hepatology Commission. Lancet Gastroenterol. Hepatol. 2019, 4, 135–184.
Figure 1. Orthogonally synthesized indices of notifiable infectious diseases grouped by transmission route ($Z_1$ represents the synthetic index of gastro-intestinal infectious disease; $Z_2$ represents the synthetic index of respiratory infectious disease; $Z_3$ represents the synthetic index of sexually transmitted and blood-borne disease; and $Z_4$ represents the synthetic index of vector-borne disease).
Table 1. Synthetic expression of gastro-intestinal infectious disease ($P = 2$).

| Items | $Y_{11}$ | $Y_{12}$ |
|---|---|---|
| Bacterial or amebic dysentery $X_{11}$ | 0.707 | 0.707 |
| Typhoid fever and paratyphoid $X_{12}$ | 0.707 | −0.707 |
| Eigenvalue $\lambda$ | 1.371 | 0.346 |
| Weight $w$ | 0.798 | 0.202 |

Synthetic expression: $Z_1 = 0.707 X_{11} + 0.421 X_{12}$ *

* $Z_1$ in the synthetic expression designates the composite index for gastro-intestinal infectious disease, $X_{11}$ contributes the component of bacterial or amebic dysentery with a load of 0.707, and $X_{12}$ contributes the component of typhoid fever and paratyphoid with a load of 0.421.
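The coefficients in the synthetic expression appear to follow from the loadings and weights in Table 1: each weight is the eigenvalue's share of the eigenvalue total, and each coefficient is the weighted sum of that disease's loadings across the orthogonalized variates. This reading is inferred from the tabulated values rather than stated with the table, and the symbols $a_{ij}$ (loading of $X_{1i}$ on $Y_{1j}$) and $c_i$ (coefficient of $X_{1i}$ in $Z_1$) are introduced here only for illustration.

```latex
\begin{aligned}
Y_{1j} &= \sum_i a_{ij} X_{1i}, \qquad
w_j = \frac{\lambda_j}{\sum_k \lambda_k}, \qquad
Z_1 = \sum_j w_j Y_{1j} = \sum_i c_i X_{1i}, \qquad
c_i = \sum_j w_j a_{ij}, \\
w_1 &= \frac{1.371}{1.371 + 0.346} \approx 0.798, \qquad
w_2 = \frac{0.346}{1.371 + 0.346} \approx 0.202, \\
c_1 &= 0.798 \times 0.707 + 0.202 \times 0.707 = 0.707, \\
c_2 &= 0.798 \times 0.707 + 0.202 \times (-0.707) \approx 0.421 .
\end{aligned}
```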
Table 2. Synthetic expression of respiratory infectious disease ($P = 5$).

| Items | $Y_{21}$ | $Y_{22}$ | $Y_{23}$ | $Y_{24}$ | $Y_{25}$ |
|---|---|---|---|---|---|
| Tuberculosis $X_{21}$ | 0.238 | 0.742 | 0.261 | 0.57 | 0.009 |
| Scarlet fever $X_{22}$ | −0.544 | 0.283 | 0.231 | −0.236 | −0.718 |
| Pertussis $X_{23}$ | −0.427 | −0.306 | 0.734 | 0.234 | 0.362 |
| Measles $X_{24}$ | 0.505 | 0.153 | 0.535 | −0.656 | 0.065 |
| Epidemic cerebrospinal meningitis $X_{25}$ | 0.458 | −0.503 | 0.231 | 0.366 | −0.591 |
| Eigenvalue $\lambda$ | 1.629 | 1.153 | 0.824 | 0.473 | 0.34 |
| Weight $w$ | 0.369 | 0.261 | 0.186 | 0.107 | 0.077 |

Synthetic expression: $Z_2 = 0.392 X_{21} - 0.164 X_{22} - 0.048 X_{23} + 0.261 X_{24} + 0.074 X_{25}$ *

* $Z_2$ in the synthetic expression designates the composite index for respiratory infectious disease, $X_{21}$ contributes the component of tuberculosis with a load of 0.392, $X_{22}$ contributes the component of scarlet fever with a load of −0.164, $X_{23}$ contributes the component of pertussis with a load of −0.048, $X_{24}$ contributes the component of measles with a load of 0.261, and $X_{25}$ contributes the component of epidemic cerebrospinal meningitis with a load of 0.074.
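The coefficients of the synthetic expression for respiratory infectious disease can be checked against the loadings, eigenvalues, and weights of Table 2. The sketch below assumes, as the tabulated values suggest, that each weight is the eigenvalue's share of the eigenvalue total and that each coefficient is the weighted sum of a disease's loadings across the orthogonalized variates; the variable names are illustrative.

```python
import numpy as np

# Loadings from Table 2: rows are X21..X25, columns are Y21..Y25.
loadings = np.array([
    [ 0.238,  0.742, 0.261,  0.570,  0.009],   # tuberculosis
    [-0.544,  0.283, 0.231, -0.236, -0.718],   # scarlet fever
    [-0.427, -0.306, 0.734,  0.234,  0.362],   # pertussis
    [ 0.505,  0.153, 0.535, -0.656,  0.065],   # measles
    [ 0.458, -0.503, 0.231,  0.366, -0.591],   # epidemic cerebrospinal meningitis
])
eigenvalues = np.array([1.629, 1.153, 0.824, 0.473, 0.340])
weights = np.array([0.369, 0.261, 0.186, 0.107, 0.077])   # as published

# Assumed rule for the weights: share of each eigenvalue in the total.
shares = eigenvalues / eigenvalues.sum()

# Assumed rule for the coefficients: weighted sum of loadings per disease.
coefficients = loadings @ weights

print(np.round(shares, 3))
print(np.round(coefficients, 3))
```

Under these assumptions the computed shares match the published weights, and the coefficients match 0.392, −0.164, −0.048, 0.261, and 0.074 to within rounding (about 0.001), including the two negative signs.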
Table 3. Synthetic expression of sexually transmitted and blood-borne disease ($P = 4$).

| Items | $Y_{31}$ | $Y_{32}$ | $Y_{33}$ | $Y_{34}$ |
|---|---|---|---|---|
| Virus hepatitis $X_{31}$ | 0.401 | 0.812 | 0.405 | 0.121 |
| Syphilis $X_{32}$ | 0.545 | −0.266 | 0.222 | −0.764 |
| Gonorrhea $X_{33}$ | −0.536 | −0.134 | 0.828 | −0.095 |
| Acquired immune deficiency syndrome (AIDS) $X_{34}$ | 0.505 | −0.501 | 0.318 | 0.627 |
| Eigenvalue $\lambda$ | 1.781 | 0.845 | 0.331 | 0.076 |
| Weight $w$ | 0.587 | 0.279 | 0.109 | 0.025 |

Synthetic expression: $Z_3 = 0.516 X_{31} + 0.275 X_{32} - 0.241 X_{33} + 0.199 X_{34}$ *

* $Z_3$ in the synthetic expression designates the composite index for sexually transmitted and blood-borne disease, $X_{31}$ contributes the component of virus hepatitis with a load of 0.516, $X_{32}$ contributes the component of syphilis with a load of 0.275, $X_{33}$ contributes the component of gonorrhea with a load of −0.241, and $X_{34}$ contributes the component of AIDS with a load of 0.199.
Table 4. Synthetic expression of vector-borne disease ($P = 8$).

| Items | $Y_{41}$ | $Y_{42}$ | $Y_{43}$ | $Y_{44}$ | $Y_{45}$ | $Y_{46}$ | $Y_{47}$ | $Y_{48}$ |
|---|---|---|---|---|---|---|---|---|
| Brucellosis $X_{41}$ | 0.424 | 0.065 | 0.027 | 0.088 | 0.38 | 0.799 | 0.143 | 0.06 |
| Dengue fever $X_{42}$ | 0.202 | 0.342 | 0.898 | 0.017 | 0.012 | −0.167 | −0.055 | 0.06 |
| Epidemic hemorrhagic fever $X_{43}$ | −0.395 | 0.312 | 0.019 | −0.051 | 0.068 | −0.002 | 0.857 | 0.062 |
| Malaria $X_{44}$ | −0.337 | −0.408 | 0.354 | −0.109 | −0.463 | 0.454 | 0.046 | −0.4 |
| Epidemic encephalitis $X_{45}$ | −0.424 | 0.067 | 0.062 | −0.468 | 0.049 | 0.25 | −0.301 | 0.662 |
| Anthrax $X_{46}$ | −0.401 | 0.132 | 0.027 | 0.851 | −0.041 | 0.148 | −0.193 | 0.189 |
| Hydrophobia $X_{47}$ | −0.125 | −0.708 | 0.249 | 0.106 | 0.571 | −0.204 | 0.144 | 0.144 |
| Leptospirosis $X_{48}$ | −0.392 | 0.302 | −0.005 | −0.15 | 0.553 | 0.035 | −0.301 | −0.58 |
| Eigenvalue $\lambda$ | 2.277 | 1.284 | 0.857 | 0.43 | 0.349 | 0.249 | 0.194 | 0.155 |
| Weight $w$ | 0.393 | 0.222 | 0.148 | 0.074 | 0.006 | 0.043 | 0.033 | 0.027 |

Synthetic expression: $Z_4 = 0.255 X_{41} + 0.283 X_{42} - 0.213 X_{43} - 0.036 X_{44} - 0.156 X_{45} - 0.059 X_{46} - 0.127 X_{47} - 0.09 X_{48}$ *

* $Z_4$ in the synthetic expression designates the composite index for vector-borne disease, $X_{41}$ contributes the component of brucellosis with a load of 0.255, $X_{42}$ contributes the component of dengue fever with a load of 0.283, $X_{43}$ contributes the component of epidemic hemorrhagic fever with a load of −0.213, $X_{44}$ contributes the component of malaria with a load of −0.036, $X_{45}$ contributes the component of epidemic encephalitis with a load of −0.156, $X_{46}$ contributes the component of anthrax with a load of −0.059, $X_{47}$ contributes the component of hydrophobia with a load of −0.127, and $X_{48}$ contributes the component of leptospirosis with a load of −0.09.
Table 5. Orthogonally synthesized indices of 19 notifiable infectious diseases categorized by transmission routes.

| Year | $Z_1$ | $Z_2$ | $Z_3$ | $Z_4$ |
|---|---|---|---|---|
| 2000 | 1.6925 | −0.1611 | −2.1279 | −1.5806 |
| 2001 | 1.8862 | −0.0356 | −1.8848 | −1.4797 |
| 2002 | 1.5086 | −0.2673 | −1.7643 | −1.1765 |
| 2003 | 1.3259 | −0.3807 | −1.6958 | −0.9115 |
| 2004 | 1.4395 | 0.5718 | −0.9421 | −0.9119 |
| 2005 | 0.9217 | 1.3731 | −0.5628 | −0.6481 |
| 2006 | 0.5831 | 0.9195 | 0.0242 | −0.6388 |
| 2007 | 0.2136 | 0.9500 | 0.4128 | −0.4043 |
| 2008 | −0.1529 | 1.1096 | 0.5273 | −0.0451 |
| 2009 | −0.2963 | 0.4717 | 0.7088 | 0.1335 |
| 2010 | −0.4502 | 0.1930 | 0.6577 | 0.223 |
| 2011 | −0.5648 | −0.3655 | 0.9253 | 0.3422 |
| 2012 | −0.6686 | −0.2700 | 1.0005 | 0.3961 |
| 2013 | −0.7391 | −0.1532 | 0.5918 | 0.5923 |
| 2014 | −0.8944 | −0.1958 | 0.5712 | 1.9487 |
| 2015 | −1.0073 | −0.4313 | 0.5889 | 0.8935 |
| 2016 | −1.0910 | −0.5233 | 0.5570 | 0.6848 |
| 2017 | −1.1576 | −0.7785 | 0.6963 | 0.6909 |
| 2018 | −1.2365 | −0.9282 | 0.7688 | 0.6359 |
| 2019 | −1.3125 | −1.0982 | 0.9469 | 1.2556 |

Note: $Z_1$ represents the synthetic index of gastro-intestinal infectious disease, $Z_2$ represents the synthetic index of respiratory infectious disease, $Z_3$ represents the synthetic index of sexually transmitted and blood-borne disease, and $Z_4$ represents the synthetic index of vector-borne disease.
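One simple way to compare the four indices in Table 5 is to fit a straight line to each series and inspect the sign of its slope. The sketch below does this with an ordinary least-squares fit; the use of a linear slope as a summary is an illustrative choice made here, not part of the study's method, and how a rising or falling index maps onto disease burden depends on the signs of the coefficients in each synthetic expression.

```python
import numpy as np

years = np.arange(2000, 2020)

# Values transcribed from Table 5; columns are Z1, Z2, Z3, Z4.
Z = np.array([
    [ 1.6925, -0.1611, -2.1279, -1.5806],
    [ 1.8862, -0.0356, -1.8848, -1.4797],
    [ 1.5086, -0.2673, -1.7643, -1.1765],
    [ 1.3259, -0.3807, -1.6958, -0.9115],
    [ 1.4395,  0.5718, -0.9421, -0.9119],
    [ 0.9217,  1.3731, -0.5628, -0.6481],
    [ 0.5831,  0.9195,  0.0242, -0.6388],
    [ 0.2136,  0.9500,  0.4128, -0.4043],
    [-0.1529,  1.1096,  0.5273, -0.0451],
    [-0.2963,  0.4717,  0.7088,  0.1335],
    [-0.4502,  0.1930,  0.6577,  0.2230],
    [-0.5648, -0.3655,  0.9253,  0.3422],
    [-0.6686, -0.2700,  1.0005,  0.3961],
    [-0.7391, -0.1532,  0.5918,  0.5923],
    [-0.8944, -0.1958,  0.5712,  1.9487],
    [-1.0073, -0.4313,  0.5889,  0.8935],
    [-1.0910, -0.5233,  0.5570,  0.6848],
    [-1.1576, -0.7785,  0.6963,  0.6909],
    [-1.2365, -0.9282,  0.7688,  0.6359],
    [-1.3125, -1.0982,  0.9469,  1.2556],
])

# Least-squares line per column; row 0 of the result holds the slopes.
slopes = np.polyfit(years - 2000, Z, deg=1)[0]
for name, slope in zip(["Z1", "Z2", "Z3", "Z4"], slopes):
    print(f"{name}: slope per year = {slope:+.3f}")
```

Over 2000 to 2019 the fitted slopes are negative for $Z_1$ and $Z_2$ and positive for $Z_3$ and $Z_4$, which matches the contrast the discussion draws between the gastro-intestinal and respiratory groups and the other two.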
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Wang, W.; Weng, F.; Zhu, J.; Li, Q.; Wu, X. An Analytical Approach for Temporal Infection Mapping and Composite Index Development. Mathematics 2023, 11, 4358. https://doi.org/10.3390/math11204358
