Correlations in Compositional Data without Log Transformations
Abstract
1. Introduction
2. Materials and Methods
3. Results
3.1. Loss of Degrees of Freedom and Correcting for the Value of r0
3.2. Transformation of Correlation Coefficients
3.3. Neutral Vectors in Compositional Data
3.4. Hybrid Models
3.5. Excess of Zeros
3.6. Rank Correlation Coefficients
4. Discussion
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Chayes, F. On correlation between variables of constant sum. J. Geophys. Res. 1960, 65, 4185–4193.
- Sarmanov, O.V. On spurious correlation between random variables. Tr. Mat. Instituta Im. V. A. Steklova 1961, 64, 173–184. (In Russian)
- Mosimann, J.E. On the compound multinomial distribution, the multivariate β-distribution, and correlations among proportions. Biometrika 1962, 49, 65–82.
- Chayes, F.; Kruskal, W. An approximate statistical test for correlations between proportions. J. Geol. 1966, 74, 692–702.
- Aitchison, J. A new approach to null correlations of proportions. Math. Geol. 1981, 13, 175–189.
- Aitchison, J. The statistical analysis of compositional data. J. R. Stat. Soc. Ser. B 1982, 44, 139–177.
- Friedman, J.; Alm, E.J. Inferring correlation networks from genomic survey data. PLoS Comput. Biol. 2012, 8, 1002687.
- Fang, H.; Huang, C.; Zhao, H.; Deng, M. CCLasso: Correlation inference for compositional data through Lasso. Bioinformatics 2015, 31, 3172–3180.
- Lovell, D.; Pawlowsky-Glahn, V.; Egozcue, J.J.; Bähler, J. Proportionality: A valid alternative to correlation for relative data. PLoS Comput. Biol. 2015, 11, 1004075.
- Lovell, D.; Chua, X.Y.; McGrath, A. Counts: An outstanding challenge for log-ratio analysis of compositional data in the molecular biosciences. NAR Genom. Bioinform. 2020, 2, 5859926.
- Kurtz, Z.D.; Müller, C.L.; Miraldi, E.R.; Littmann, D.R.; Blaser, M.J.; Bonneau, R.A. Sparse and compositionally robust inference of microbial ecological networks. PLoS Comput. Biol. 2015, 11, 1004226.
- Ban, Y.; An, L.; Jiang, H. Investigating microbial co-occurrence patterns based on metagenomic compositional data. Bioinformatics 2015, 31, 3322–3329.
- Erb, I.; Notredame, C. How should we measure proportionality on relative gene expression data? Theory Biosci. 2016, 135, 21–36.
- Schwager, E.; Mallick, H.; Ventz, S.; Huttenhower, C. A Bayesian method for detecting pairwise associations in compositional data. PLoS Comput. Biol. 2017, 13, 1005852.
- Kynčlová, P.; Hron, K.; Filzmoser, P. Correlation between compositional parts based on symmetric balances. Math. Geosci. 2017, 49, 777–796.
- Yoon, G.; Gaynanova, I.; Müller, C.L. Microbial networks in SPRING—Semi-parametric rank-based correlation and partial correlation estimation for quantitative microbiome data. Front. Genet. 2019, 10, 00516.
- Egozcue, J.J. Reply to “On the Harker Variation Diagrams” by J.A. Cortés. Math. Geosci. 2009, 41, 829–834.
- Shaffer, M.; Thurimella, K.; Sterrett, J.D.; Lozupone, C.A. SCNIC: Sparse correlation network investigation for compositional data. Mol. Ecol. Resour. 2023, 23, 312–325.
- Faust, K.; Sathirapongsasuti, J.F.; Izard, J.; Segata, N.; Gevers, D.; Raes, J.; Huttenhower, C. Microbial co-occurrence relationships in the human microbiome. PLoS Comput. Biol. 2012, 8, 1002606.
- Fisher, R.A. Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population. Biometrika 1915, 10, 507–521.
- Olkin, I.; Pratt, J.W. Unbiased estimation of certain correlation coefficients. Ann. Math. Stat. 1958, 29, 201–211.
- Salter, M.J.; Ridler, N.M.; Cox, M.G. Distribution of correlation coefficient for samples taken from a bivariate normal distribution. NPL Rep. CETM 2000, 22, 1–43.
- Connor, R.J.; Mosimann, J.E. Concepts of independence for proportions with a generalization of the Dirichlet distribution. J. Am. Stat. Assoc. 1969, 64, 194–206.
- James, I.R.; Mosimann, J.E. A new characterization of the Dirichlet distribution through neutrality. Ann. Stat. 1980, 8, 183–189.
- Darroch, J.N. Null correlation for proportions. Math. Geol. 1969, 1, 221–227.
- Paula, A.J.; Hwang, G.; Koo, H. Dynamics of bacterial population growth in biofilms resemble spatial and structural aspects of urbanization. Nat. Commun. 2020, 11, 1354.
- Mandakovic, D.; Rojas, C.; Maldonado, J.; Latorre, M.; Travisany, D.; Delage, E.; Bihouée, A.; Jean, G.; Díaz, F.P.; Fernández-Gómez, B.; et al. Structure and co-occurrence patterns in microbial communities under acute environmental stress reveal ecological factors fostering resilience. Sci. Rep. 2018, 8, 5875.
- Lutz, K.C.; Jiang, S.; Neugent, M.L.; De Nisco, N.J.; Zhan, X.; Li, Q. A survey of statistical methods for microbiome data analysis. Front. Appl. Math. Stat. 2022, 8, 884810.

| Mean Values of r (Upper Diagonals) and Shifts by Equation (4) (Lower Diagonals) | | | | | Medians by Equation (7) (Upper Diagonals) and by the Method from [15] (Lower Diagonals) | | | | |
|---|---|---|---|---|---|---|---|---|---|
| | 0.526 | 0.316 | 0.105 | 0.053 | | 0.526 | 0.316 | 0.105 | 0.053 |
| 0.526 | – | −0.710 | −0.358 | −0.246 | 0.526 | – | 0.001 | −0.002 | −0.004 |
| 0.316 | −0.710 | – | −0.226 | −0.156 | 0.316 | 0.580 | – | 0.002 | 0.002 |
| 0.105 | −0.356 | −0.229 | – | −0.080 | 0.105 | 0.243 | 0.183 | – | −0.001 |
| 0.053 | −0.245 | −0.158 | −0.080 | – | 0.053 | −0.171 | −0.193 | −0.292 | – |
| | 0.552 | 0.331 | 0.111 | 0.006 | | 0.552 | 0.331 | 0.111 | 0.006 |
| 0.552 | – | −0.777 | −0.386 | −0.083 | 0.552 | – | −0.000 | 0.000 | −0.002 |
| 0.331 | −0.777 | – | −0.244 | −0.049 | 0.331 | 0.921 | – | −0.000 | 0.005 |
| 0.111 | −0.386 | −0.244 | – | −0.027 | 0.111 | 0.832 | 0.815 | – | −0.000 |
| 0.006 | −0.081 | −0.052 | −0.026 | – | 0.006 | −0.751 | −0.725 | −0.644 | – |

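
As a reference point for the lower-diagonal shifts in the table above, the following is a minimal sketch of the classical closure-induced correlation between two parts of a multinomial or Dirichlet composition, ρ = −sqrt(p_i p_j / ((1 − p_i)(1 − p_j))). It is not a reimplementation of Equation (4), which is not reproduced on this page; the function name `closure_correlation` is hypothetical. For the first pair of proportions it gives about −0.716, close to but not identical with the tabulated −0.710.

```python
import math


def closure_correlation(p_i, p_j):
    """Classical correlation between two parts of a multinomial/Dirichlet
    composition with expected proportions p_i and p_j.

    Illustrative assumption: this textbook formula stands in for the
    article's Equation (4), which is not available here.
    """
    return -math.sqrt((p_i * p_j) / ((1.0 - p_i) * (1.0 - p_j)))


# Proportions taken from the first block of the table above.
proportions = [0.526, 0.316, 0.105, 0.053]

# Pairwise closure correlations for every distinct pair of parts.
pairwise = {
    (a, b): closure_correlation(a, b)
    for i, a in enumerate(proportions)
    for b in proportions[i + 1:]
}

for (a, b), rho in pairwise.items():
    print(f"p_i = {a:.3f}, p_j = {b:.3f}: rho = {rho:.3f}")
```

For D equal parts the same formula reduces to −1/(D − 1), which is a convenient sanity check on any implementation.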
| | Distribution Type; n; µ | Equations; df0 | Deviations at the Below Significance Levels | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | 0.005 | 0.025 | 0.1 | 0.4 | 1 | 0.4 | 0.1 | 0.025 | 0.005 |
| Large loss of df. Shifts from −0.353 to −0.037 (average of 21 series of 11110 coefficients) | | | | | | | | | | | |
| 1 | HG + Gamma (50, 1); n = 10 | Equation (7); df0 = 3.96 | −0.002 | 0.002 | 0.005 | 0.001 | −0.001 | −0.002 | −0.002 | −0.004 | −0.006 |
| | | | −0.003 | −0.001 | 0.003 | 0.001 | −0.001 | −0.001 | −0.004 | −0.007 | −0.008 |
| | | Equation (5); df0 = 8 | 0.010 | 0.134 | 0.159 | 0.133 | −0.020 | 0.101 | 0.142 | 0.126 | 0.096 |
| 2 | HG; n = 18 | Equation (7); df0 = 4.81 | −0.005 | −0.004 | 0.002 | 0.005 | −0.000 | 0.004 | 0.004 | −0.000 | −0.000 |
| | | Equation (5); df0 = 16 | 0.235 | 0.260 | 0.264 | 0.193 | −0.016 | 0.169 | 0.256 | 0.258 | 0.237 |
| Big shifts (1 series of 11110 coefficients) | | | | | | | | | | | |
| 3 | HG + NB (100, 0.5); n = 10; µ = −0.326 | Equation (7); df0 = 3.96 | 0.000 | −0.001 | 0.004 | 0.005 | 0.001 | −0.004 | −0.010 | −0.005 | −0.002 |
| | | Equation (5); df0 = 8 | 0.103 | 0.137 | 0.166 | 0.152 | −0.042 | 0.083 | 0.130 | 0.126 | 0.101 |
| 4 | Dir (αi); n = 30; µ = −0.556 | Equation (7); df0 = 27.85 | 0.001 | −0.002 | −0.001 | 0.002 | −0.002 | 0.003 | 0.002 | 0.001 | 0.002 |
| | | Equation (5); df0 = 28 | 0.009 | 0.007 | 0.009 | 0.012 | −0.013 | −0.007 | −0.006 | −0.006 | −0.003 |
| 5 | HG + B (10, 0.5); n = 16; µ = −0.677 | Equation (7); df0 = 7.14 | −0.003 | 0.008 | 0.003 | −0.006 | −0.002 | −0.010 | −0.013 | −0.005 | 0.002 |
| | | Equation (5); df0 = 14 | 0.134 | 0.159 | 0.156 | 0.117 | −0.050 | 0.033 | 0.089 | 0.116 | 0.120 |
| 6 | HG + B (10, 0.5); n = 20; µ = −0.923 | Equation (7); df0 = 18 | −0.010 | −0.008 | −0.006 | −0.004 | 0.003 | −0.004 | −0.010 | −0.007 | −0.005 |
| | | Equation (5); df0 = 18 | 0.006 | 0.011 | 0.015 | 0.021 | −0.023 | −0.029 | −0.032 | −0.026 | −0.020 |
| 7 | HG; n = 20; µ = −0.979 | Equation (7); df0 = 14.69 | 0.002 | 0.000 | −0.008 | 0.000 | −0.003 | −0.009 | −0.002 | 0.002 | 0.003 |
| | | Equation (5); df0 = 18 | 0.049 | 0.052 | 0.045 | 0.048 | −0.036 | −0.021 | 0.001 | 0.011 | 0.015 |
| Shifts from −0.378 to −0.040 (average of 21 series of 11110 coefficients) | | | | | | | | | | | |
| 8 | HG; n = 7 | Equation (7); df0 = 5 | 0.001 | −0.002 | −0.002 | −0.001 | −0.003 | −0.003 | −0.003 | −0.006 | −0.005 |
| | | Equation (5); df0 = 5 | 0.003 | 0.002 | 0.005 | 0.011 | −0.018 | −0.015 | −0.007 | −0.007 | −0.004 |
| | | z transform | −0.070 | −0.062 | −0.036 | 0.002 | −0.018 | −0.023 | −0.047 | −0.069 | −0.077 |
| 9 | HG; n = 20 | Equation (7); df0 = 18 | −0.006 | −0.006 | −0.005 | −0.002 | −0.000 | −0.002 | −0.004 | −0.004 | −0.003 |
| | | Equation (5); df0 = 18 | −0.003 | −0.002 | −0.002 | 0.002 | −0.005 | −0.007 | −0.008 | −0.006 | −0.005 |
| 10 | HG; n = 42 | Equation (7); df0 = 40 | −0.001 | −0.001 | −0.000 | 0.000 | 0.000 | 0.000 | 0.000 | −0.000 | 0.000 |
| | | Equation (5); df0 = 40 | 0.001 | 0.001 | 0.002 | 0.002 | −0.002 | −0.002 | −0.002 | −0.002 | −0.001 |
| 11 | HG; n = 62 | Equation (7); df0 = 58.18 | −0.002 | −0.002 | −0.001 | −0.001 | −0.000 | −0.001 | −0.000 | 0.001 | −0.001 |
| | | Equation (5); df0 = 60 | 0.002 | 0.002 | 0.002 | 0.002 | −0.001 | −0.001 | 0.001 | 0.003 | 0.002 |

| | n; df0; Loss | Deviations at the Below Significance Levels | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | | 0.001 | 0.01 | 0.05 | 0.5 | 1 | 0.5 | 0.05 | 0.01 | 0.001 |
| (I) Power-law growth of variable | | | | | | | | | | |
| 1 | 30; 26.22; 0.064 | 0.002 | 0.001 | 0.000 | −0.001 | −0.000 | −0.001 | −0.000 | −0.000 | −0.001 |
| 2 | 42; 35.44; 0.114 | 0.000 | 0.004 | 0.002 | −0.000 | 0.000 | −0.000 | 0.001 | 0.001 | 0.005 |
| 3 | 35; 28.93; 0.123 | 0.006 | 0.004 | 0.002 | 0.000 | −0.000 | 0.000 | 0.002 | 0.003 | 0.001 |
| (II) Uniform growth of variable (vj = 60 + 15(j − 1)) | | | | | | | | | | |
| 4 | 6; 3.64; 0.091 | −0.001 | −0.004 | −0.006 | −0.004 | −0.003 | −0.008 | −0.011 | −0.007 | −0.003 |
| 5 | 15; 10.16; 0.218 | −0.000 | −0.006 | −0.006 | −0.005 | 0.000 | −0.005 | −0.007 | −0.007 | −0.007 |
| 6 | 45; 24.70; 0.426 | −0.010 | −0.007 | −0.007 | −0.003 | 0.000 | −0.003 | −0.006 | −0.004 | −0.003 |
| 7 | 102; 41.96; 0.580 | −0.004 | −0.005 | −0.005 | −0.002 | −0.000 | −0.002 | −0.004 | −0.004 | −0.002 |
| (III) Other growth forms of variable | | | | | | | | | | |
| 8 | 10; 5.85; 0.269 | 0.000 | −0.005 | −0.008 | −0.006 | 0.000 | −0.007 | −0.008 | −0.007 | −0.006 |
| 9 | 96; 54.61; 0.419 | −0.013 | −0.008 | −0.006 | −0.002 | −0.000 | −0.002 | −0.004 | −0.005 | −0.001 |
| 10 | 10; 3.77; 0.529 | 0.003 | 0.002 | 0.008 | 0.006 | −0.000 | 0.002 | 0.002 | −0.006 | −0.005 |
| 11 | 14; 4.38; 0.635 | −0.010 | −0.002 | 0.001 | 0.005 | −0.001 | 0.004 | 0.006 | −0.001 | −0.007 |
| (IV) Outliers at the beginning of variable | | | | | | | | | | |
| 12 | 28; 15.12; 0.419 | −0.025 | −0.026 | −0.022 | −0.011 | 0.001 | −0.010 | −0.021 | −0.023 | −0.021 |
| 13 | 45; 24.58; 0.428 | −0.044 | −0.041 | −0.033 | −0.012 | −0.000 | −0.012 | −0.026 | −0.027 | −0.021 |
| 14 | 62; 34.99; 0.417 | −0.054 | −0.047 | −0.037 | −0.014 | 0.000 | −0.013 | −0.032 | −0.038 | −0.030 |

| | Distribution Type; n; df0; r0 | Deviations at the Below Significance Levels | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | | 0.001 | 0.01 | 0.05 | 0.5 | 1 | 0.5 | 0.05 | 0.01 | 0.001 |
| Dirichlet distribution | | | | | | | | | | |
| 1 | Dir (α = 1); n = 20; df0 = 14.80; r0 = −0.1667 | −0.075 | −0.070 | −0.054 | −0.012 | −0.009 | −0.019 | −0.011 | 0.003 | 0.014 |
| | | −0.135 | −0.132 | −0.111 | −0.038 | −0.003 | −0.034 | −0.048 | −0.038 | −0.018 |
| 2 | Dir (α = 5); n = 20; df0 = 17.10; r0 = −0.1667 | −0.012 | −0.014 | −0.011 | −0.002 | −0.001 | −0.004 | −0.001 | 0.003 | 0.009 |
| | | −0.029 | −0.031 | −0.026 | −0.010 | −0.000 | −0.008 | −0.010 | −0.008 | −0.001 |
| 3 | Dir (α = 10); n = 20; df0 = 17.47; r0 = −0.1667 | −0.002 | −0.005 | −0.003 | −0.000 | −0.001 | −0.002 | 0.000 | 0.003 | 0.012 |
| | | −0.016 | −0.019 | −0.015 | −0.006 | −0.001 | −0.005 | −0.005 | −0.003 | −0.006 |
| 4 | Dir (α = 30); n = 20; df0 = 17.77; r0 = −0.1667 | 0.002 | 0.000 | −0.000 | 0.000 | −0.000 | −0.000 | −0.001 | −0.000 | 0.000 |
| | | 0.001 | −0.004 | −0.004 | −0.002 | −0.000 | −0.002 | −0.004 | −0.005 | 0.005 |
| Hypergeometric distribution | | | | | | | | | | |
| 5 | HG (); n = 20; df0 = 18; r0 = −0.1667 | 0.016 | 0.020 | 0.021 | 0.016 | −0.005 | 0.006 | 0.026 | 0.030 | 0.029 |
| | | −0.014 | −0.019 | −0.015 | −0.005 | −0.000 | −0.005 | −0.006 | −0.004 | 0.000 |
| 6 | HG (); n = 20; df0 = 18; r0 = −0.1667 | 0.010 | 0.019 | 0.018 | 0.012 | −0.004 | 0.005 | 0.020 | 0.025 | 0.027 |
| | | −0.014 | −0.012 | −0.010 | −0.002 | −0.001 | −0.004 | −0.002 | −0.001 | 0.002 |
| 7 | HG (); n = 20; df0 = 18; r0 = −0.1667 | 0.018 | 0.011 | 0.010 | 0.007 | −0.002 | 0.003 | 0.010 | 0.015 | 0.012 |
| | | 0.001 | −0.005 | −0.005 | −0.002 | −0.000 | −0.002 | −0.004 | −0.003 | −0.003 |
| Binomial distribution | | | | | | | | | | |
| 8 | B (14, 1/7); n = 20; df0 = 16.05; r0 = −0.1667 | 0.015 | 0.026 | 0.027 | 0.018 | −0.008 | 0.004 | 0.031 | 0.040 | 0.051 |
| | | −0.013 | −0.012 | −0.010 | −0.005 | −0.000 | −0.004 | −0.004 | −0.002 | 0.001 |
| 9 | B (4, 0.5); n = 20; df0 = 16.75; r0 = −0.1667 | 0.037 | 0.037 | 0.035 | 0.017 | −0.004 | 0.010 | 0.035 | 0.046 | 0.052 |
| | | 0.018 | 0.015 | 0.012 | 0.006 | −0.000 | 0.004 | 0.011 | 0.015 | 0.018 |
| 10 | B (100, 0.05); n = 20; df0 = 17.09; r0 = −0.1667 | 0.017 | 0.017 | 0.015 | 0.009 | −0.002 | 0.005 | 0.016 | 0.020 | 0.017 |
| | | −0.000 | −0.005 | −0.007 | −0.003 | −0.000 | −0.003 | −0.004 | −0.002 | 0.006 |
| 11 | B (10, 0.5); n = 20; df0 = 17.42; r0 = −0.1667 | 0.017 | 0.012 | 0.010 | 0.006 | 0.000 | 0.003 | 0.010 | 0.015 | 0.012 |
| | | 0.009 | 0.004 | 0.003 | 0.002 | 0.001 | 0.001 | 0.003 | 0.004 | 0.004 |
| Negative binomial distribution | | | | | | | | | | |
| 12 | NB (10, 0.001); n = 20; df0 = 17.47; r0 = −0.1667 | −0.007 | −0.006 | −0.003 | 0.000 | −0.001 | −0.001 | 0.002 | 0.003 | −0.002 |
| | | −0.016 | −0.017 | −0.015 | −0.005 | −0.001 | −0.004 | −0.006 | −0.007 | −0.007 |
| 13 | NB (10, 0.5); n = 20; df0 = 17.06; r0 = −0.1667 | 0.008 | 0.005 | 0.005 | 0.004 | −0.002 | 0.002 | 0.011 | 0.015 | 0.016 |
| | | −0.015 | −0.017 | −0.016 | −0.007 | −0.001 | −0.006 | −0.006 | −0.002 | −0.001 |
| 14 | NB (30, 0.001); n = 20; df0 = 17.78; r0 = −0.1667 | 0.007 | −0.003 | −0.003 | −0.001 | −0.001 | −0.001 | −0.002 | −0.001 | 0.004 |
| | | −0.004 | −0.007 | −0.005 | −0.002 | 0.000 | −0.002 | −0.004 | −0.003 | 0.004 |
| 15 | NB (100, 0.5); n = 20; df0 = 17.84; r0 = −0.1667 | −0.003 | 0.000 | 0.001 | 0.001 | 0.000 | 0.000 | −0.002 | 0.002 | 0.004 |
| | | 0.003 | −0.004 | −0.001 | −0.001 | −0.000 | −0.000 | −0.004 | −0.001 | 0.003 |
| Uniform and power function distributions | | | | | | | | | | |
| 16 | CU (0, 1); n = 20; df0 = 16.25; r0 = −0.1667 | 0.024 | 0.019 | 0.019 | 0.014 | −0.008 | −0.000 | 0.018 | 0.030 | 0.033 |
| | | 0.010 | 0.003 | 0.001 | 0.002 | −0.003 | −0.002 | 0.006 | 0.013 | 0.014 |
| 17 | ; n = 20; df0 = 17.35; r0 = −0.1667 | 0.013 | 0.012 | 0.012 | 0.006 | −0.002 | 0.003 | 0.010 | 0.014 | 0.018 |
| | | 0.008 | 0.003 | 0.005 | 0.002 | −0.001 | 0.001 | 0.004 | 0.006 | 0.010 |
| 18 | ; n = 20; df0 = 17.58; r0 = −0.1667 | 0.015 | 0.009 | 0.005 | 0.003 | −0.001 | 0.002 | 0.006 | 0.009 | 0.011 |
| | | 0.012 | 0.005 | 0.002 | 0.001 | −0.001 | 0.000 | 0.003 | 0.001 | 0.005 |
| 19 | DU (0, 10); n = 20; df0 = 16.03; r0 = −0.1667 | 0.03 | 0.031 | 0.032 | 0.022 | −0.011 | 0.004 | 0.033 | 0.043 | 0.051 |
| | | 0.009 | 0.005 | 0.003 | 0.002 | −0.003 | −0.001 | 0.009 | 0.014 | 0.015 |
| 20 | DU (0, 100); n = 20; df0 = 16.39; r0 = −0.1667 | 0.031 | 0.026 | 0.025 | 0.015 | −0.008 | 0.002 | 0.021 | 0.024 | 0.031 |
| | | 0.012 | 0.008 | 0.006 | 0.002 | −0.002 | −0.001 | 0.007 | 0.010 | 0.016 |
| 21 | DU (0, 10000); n = 20; df0 = 16.23; r0 = −0.1667 | 0.016 | 0.014 | 0.012 | 0.006 | −0.004 | 0.003 | 0.020 | 0.024 | 0.024 |
| | | 0.009 | 0.008 | 0.002 | 0.001 | −0.002 | −0.003 | 0.005 | 0.010 | 0.015 |
| 22 | []^2; n = 20; df0 = 17.31; r0 = −0.1667 | −0.001 | −0.004 | −0.005 | −0.000 | −0.002 | −0.002 | −0.000 | 0.001 | 0.005 |
| | | −0.005 | −0.014 | −0.012 | −0.004 | −0.001 | −0.004 | −0.006 | −0.007 | −0.004 |
| Normal distribution | | | | | | | | | | |
| 23 | N (50, 64); n = 20; df0 = 17.80; r0 = −0.1667 | 0.008 | 0.003 | 0.003 | 0.001 | −0.000 | 0.001 | 0.002 | −0.000 | 0.002 |
| | | 0.009 | 0.001 | 0.002 | 0.000 | 0.000 | −0.000 | 0.001 | −0.001 | 0.000 |

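
The baseline r0 = −0.1667 used throughout the table above equals −1/6, which is what closure alone produces between any two of seven equal parts. The sketch below checks this by simulation under stated assumptions: a symmetric Dirichlet with α = 10 and n = 20 (as in row 3), seven parts (an assumption, since the number of parts is not stated in the table), and 2000 simulated series (an arbitrary choice). The helper names are hypothetical, and the code does not reproduce the article's Equations (5) or (7).

```python
import math
import random


def dirichlet_sample(alpha, parts, rng):
    """One symmetric Dirichlet draw: normalized independent gamma variates."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(parts)]
    total = sum(draws)
    return [d / total for d in draws]


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_sample_r(alpha=10.0, parts=7, n=20, series=2000, seed=1):
    """Average sample correlation between parts 1 and 2 over many series.

    parts=7 and series=2000 are illustrative assumptions.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(series):
        rows = [dirichlet_sample(alpha, parts, rng) for _ in range(n)]
        total += pearson([row[0] for row in rows], [row[1] for row in rows])
    return total / series


mean_r = mean_sample_r()
print(f"mean sample r = {mean_r:.4f}, closure baseline = {-1 / 6:.4f}")
```

The two raw parts are not independent in any sense that would give a zero mean correlation: the average sample r settles near the negative baseline, which is why the deviations in the table are measured relative to r0 and not to zero.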
| | Distribution Type; n | Equations; df0 | Deviations at the Below Significance Levels | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | 0.001 | 0.01 | 0.05 | 0.5 | 1 | 0.5 | 0.05 | 0.01 | 0.001 |
| Proportions of 0.273, 0.254, 0.136, 0.122, 0.083, 0.066, and 0.065 (shifts from −0.326 to −0.063) | | | | | | | | | | | |
| 1 | HG + NB (100, 0.5); n = 10 | Equation (7); df0 = 3.96 | 0.001 | 0.005 | 0.005 | 0.002 | −0.002 | 0.002 | 0.009 | 0.005 | −0.000 |
| | | | −0.010 | 0.002 | 0.004 | 0.001 | −0.001 | −0.001 | −0.003 | −0.005 | −0.012 |
| | | Equation (5); df0 = 8 | 0.076 | 0.117 | 0.147 | 0.104 | −0.002 | 0.102 | 0.146 | 0.114 | 0.073 |
| | | | 0.072 | 0.117 | 0.151 | 0.118 | −0.020 | 0.085 | 0.139 | 0.110 | 0.071 |
| Proportions of 0.327, 0.250, 0.085, 0.085, 0.085, 0.085, and 0.085 (shifts from −0.402 to −0.092) | | | | | | | | | | | |
| 2 | HG + 2; n = 27 | Equation (7); df0 = 21.52 | 0.007 | 0.008 | 0.004 | 0.000 | −0.000 | 0.000 | 0.004 | 0.006 | 0.008 |
| | | | −0.002 | −0.002 | −0.000 | −0.001 | 0.000 | −0.001 | 0.001 | 0.002 | 0.005 |
| | | Equation (5); df0 = 25 | 0.028 | 0.029 | 0.024 | 0.010 | −0.000 | 0.010 | 0.024 | 0.027 | 0.029 |
| | | | 0.021 | 0.022 | 0.022 | 0.012 | −0.004 | 0.005 | 0.019 | 0.021 | 0.025 |

| | n; df0; Insert; Lowest ME | Deviations at the Below Significance Levels | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | | 0.001 | 0.01 | 0.05 | 0.5 | 1 | 0.5 | 0.05 | 0.01 | 0.001 |
| (I) Binomial, negative binomial, gamma, logistic, and normal distributions | | | | | | | | | | |
| 1 | 30; 28; B (10, 0.5); 8.33 | 0.000 | −0.004 | −0.002 | −0.001 | −0.000 | −0.001 | 0.000 | 0.001 | 0.004 |
| 2 | 25; 23; NB (3, 0.001); 6.64 | 0.003 | −0.002 | −0.003 | −0.002 | 0.000 | −0.001 | −0.002 | −0.003 | 0.004 |
| 3 | 25; 23; Gamma (50, 1); 6.64 | −0.000 | 0.003 | −0.002 | −0.000 | 0.000 | −0.000 | −0.000 | −0.002 | 0.001 |
| 4 | 20; 18; L (1000, 50); 10 | 0.003 | −0.002 | 0.000 | −0.001 | −0.000 | 0.000 | −0.001 | −0.000 | 0.003 |
| 5 | 30; 28; N (50, 64); 8.33 | −0.003 | −0.001 | −0.000 | 0.000 | −0.000 | −0.000 | −0.002 | −0.001 | 0.004 |
| 6 | 30; 26.22; NB (30, 0.01); 12.99+ | 0.002 | −0.002 | −0.003 | −0.002 | 0.000 | −0.001 | −0.003 | −0.005 | 0.000 |
| (II) Exponential function distributions | | | | | | | | | | |
| 7 | 7; 5; 2); 6.64 | −0.001 | −0.004 | −0.004 | −0.002 | −0.001 | −0.003 | −0.006 | −0.006 | 0.001 |
| 8 | 20; 18; 2; 8.33 | 0.003 | −0.005 | −0.006 | −0.003 | −0.001 | −0.002 | −0.002 | −0.000 | 0.004 |
| 9 | 13; 11; 2; 25 | 0.004 | −0.003 | −0.003 | −0.001 | 0.000 | −0.001 | −0.003 | −0.002 | 0.001 |
| 10 | 30; 28; 2; 25 | −0.001 | −0.005 | −0.002 | −0.000 | 0.000 | −0.000 | −0.001 | −0.000 | 0.000 |
| 11 | 30; 28; 2; 25 (15 series) (6 series) | −0.004 | −0.005 | −0.005 | −0.001 | −0.000 | −0.001 | −0.000 | 0.003 | 0.002 |
| | | −0.000 | −0.004 | −0.004 | −0.002 | 0.001 | −0.000 | −0.001 | 0.001 | −0.000 |
| | | −0.013 | −0.008 | −0.006 | 0.001 | −0.002 | −0.003 | 0.002 | 0.007 | 0.006 |
| 12 | 30; 28; exp ; 25 | −0.007 | −0.006 | −0.007 | −0.002 | −0.000 | −0.001 | −0.002 | −0.002 | −0.008 |
| 13 | 30; 28; exp ; 25 (15 series) (6 series) | −0.026 | −0.026 | −0.021 | −0.005 | −0.001 | −0.004 | −0.002 | 0.001 | 0.006 |
| | | −0.020 | −0.022 | −0.019 | −0.005 | 0.000 | −0.003 | −0.002 | −0.001 | −0.000 |
| | | −0.040 | −0.034 | −0.026 | −0.004 | −0.003 | −0.005 | −0.001 | 0.007 | 0.024 |
| 14 | 30; 28; 2^(CU (0, 10)); 25 | −0.004 | −0.004 | −0.004 | −0.000 | −0.000 | 0.000 | −0.001 | −0.000 | 0.001 |
| 15 | 30; 28; exp(CU (0, 10)); 25 | −0.005 | −0.007 | −0.005 | −0.001 | −0.000 | −0.001 | −0.002 | −0.000 | 0.002 |
| (III) Uniform and power function distributions | | | | | | | | | | |
| 16 | 10; 3.96; [CU (0, 1000)]^2; 5.66+ | −0.010 | −0.003 | 0.002 | 0.004 | −0.002 | 0.000 | −0.000 | −0.005 | −0.013 |
| 17 | 20; 18; [CU (0, 1000)]^2; 24.36 | −0.002 | 0.001 | −0.000 | −0.001 | 0.000 | −0.000 | −0.003 | −0.004 | 0.002 |
| 18 | 13; 11; []^2; 54.16 | 0.000 | 0.002 | 0.001 | −0.000 | 0.000 | 0.000 | −0.004 | −0.003 | −0.009 |
| 19 | 30; 28; []^2; 25 | −0.004 | −0.000 | −0.000 | −0.000 | 0.000 | −0.000 | 0.000 | −0.002 | −0.002 |
| 20 | 30; 28; [B (100, 0.5)]^2; 25 | 0.001 | 0.000 | 0.000 | 0.000 | 0.000 | −0.000 | −0.002 | −0.001 | 0.000 |
| 21 | 30; 28; []^3; 25 | 0.000 | −0.003 | −0.001 | 0.000 | 0.000 | 0.000 | 0.001 | −0.002 | −0.000 |
| 22 | 30; 28; [DU (0, 100)]^2; 25 | −0.003 | −0.002 | −0.002 | 0.000 | −0.000 | −0.000 | −0.001 | −0.000 | −0.003 |
| 23 | 30; 28; [DU (0, 100)]^4; 25 | −0.003 | −0.001 | −0.002 | −0.001 | 0.000 | −0.000 | −0.001 | −0.001 | −0.002 |
| 24 | 30; 28; [DU (0, 100)]^6; 25 | −0.003 | −0.002 | −0.002 | −0.001 | 0.000 | −0.001 | −0.001 | −0.005 | −0.002 |
| 25 | 10; 3.77; [DU (0, 100)]; 4+ | −0.007 | 0.001 | 0.008 | 0.003 | −0.001 | 0.002 | 0.002 | −0.003 | −0.003 |
| 26 | 10; 3.77; [DU (0, 100)]^2; 4+ | −0.004 | 0.000 | 0.006 | 0.005 | −0.000 | 0.002 | 0.007 | 0.001 | −0.003 |
| 27 | 10; 3.77; [DU (0, 100)]^4; 4+ | −0.009 | −0.006 | 0.002 | 0.002 | −0.001 | 0.003 | 0.007 | 0.003 | −0.000 |
| 28 | 10; 3.77; [DU (0, 100)]^6; 4+ | −0.014 | −0.011 | −0.005 | 0.003 | −0.001 | 0.004 | 0.008 | −0.000 | −0.000 |
| (IV) Power-law (Pareto Type I) distributions | | | | | | | | | | |
| 29 | 42; 35.44; Pr (2, 100, 3); 13.54+ | −0.004 | −0.004 | −0.002 | −0.001 | 0.000 | −0.001 | −0.001 | 0.002 | 0.005 |
| 30 | 20; 18; Pr (2, 100, 1); 55 | −0.006 | −0.006 | −0.004 | −0.001 | −0.000 | −0.001 | −0.000 | −0.000 | 0.005 |
| 31 | 20; 18; Pr (2, 3); 55 | −0.000 | −0.001 | 0.001 | −0.001 | 0.000 | 0.000 | 0.001 | 0.003 | 0.007 |
| 32 | 20; 18; Pr (2, 3); 253.85 | 0.007 | 0.003 | 0.003 | 0.000 | 0.001 | 0.000 | 0.000 | 0.002 | 0.002 |
| 33 | 20; 18; Pr (2, 2); 253.85 | 0.107 | 0.044 | 0.018 | 0.000 | 0.009 | 0.021 | 0.057 | 0.092 | 0.150 |
| 34 | 20; 18; Pr (2, 1); 253.85 | −0.132 | −0.125 | −0.104 | −0.056 | 0.040 | 0.062 | 0.147 | 0.172 | 0.175 |

| | vj Size in HG; Insert; Lowest ME; Expected Zeros | Deviations at the Below Significance Levels | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | | 0.005 | 0.025 | 0.1 | 0.4 | 1 | 0.4 | 0.1 | 0.025 | 0.005 |
| (I) (1 series); (10 series); (10 series) | | | | | | | | | | |
| 1 | HG () + B (10, 0.5); 0.38; 13.59 (68%) | 0.026 | 0.019 | 0.014 | 0.013 | −0.006 | 0.003 | 0.001 | 0.014 | 0.015 |
| | | −0.006 | −0.007 | −0.006 | −0.003 | 0.002 | −0.001 | −0.003 | −0.004 | −0.008 |
| | | −0.131 | −0.103 | −0.059 | 0.005 | −0.028 | −0.003 | 0.039 | 0.075 | 0.102 |
| 2 | HG () + B (10, 0.5); 0.77; 9.21 (46%) | −0.008 | 0.003 | 0.004 | −0.000 | 0.005 | 0.003 | −0.002 | −0.000 | 0.002 |
| | | 0.001 | 0.000 | 0.001 | −0.000 | −0.000 | −0.001 | −0.000 | 0.001 | 0.002 |
| | | −0.050 | −0.041 | −0.024 | 0.001 | −0.009 | 0.000 | 0.019 | 0.030 | 0.054 |
| 3 | HG () + B (10, 0.5); 2.5; 1.54 (7.7%) | 0.019 | −0.001 | −0.007 | 0.002 | 0.001 | −0.002 | −0.006 | −0.004 | 0.016 |
| | | 0.005 | 0.003 | 0.001 | 0.002 | 0.000 | 0.000 | −0.000 | 0.001 | −0.001 |
| | | −0.011 | −0.012 | −0.008 | 0.001 | −0.004 | −0.000 | 0.006 | 0.011 | 0.020 |
| 4 | HG () + B (10, 0.5); 5; 0.12 (0.59%) | −0.010 | −0.008 | −0.006 | −0.004 | 0.003 | −0.004 | −0.010 | −0.007 | −0.005 |
| | | 0.003 | −0.001 | −0.001 | −0.001 | −0.000 | −0.001 | 0.000 | −0.001 | −0.001 |
| | | −0.005 | −0.006 | −0.004 | −0.001 | −0.001 | −0.001 | 0.001 | 0.003 | 0.000 |
| (II) Shifts from −0.4763 to −0.0901 (10 series); shifts from −0.0598 to −0.0261 (10 series) | | | | | | | | | | |
| 5 | HG () + B (10, 0.5); 4.23; 13.59 (68%) | −0.008 | −0.008 | −0.004 | −0.002 | 0.000 | −0.002 | −0.001 | −0.004 | 0.001 |
| | | −0.028 | −0.015 | −0.012 | −0.003 | −0.003 | −0.002 | 0.002 | 0.007 | 0.015 |
| 6 | HG () + B (10, 0.5); 8.46; 9.21 (46%) | −0.006 | −0.003 | −0.003 | −0.003 | 0.000 | −0.003 | −0.004 | −0.004 | −0.005 |
| | | −0.013 | −0.009 | −0.006 | −0.002 | −0.001 | −0.002 | 0.000 | 0.002 | 0.005 |
| 7 | HG () + B (10, 0.5); 27.5; 1.54 (7.7%) | 0.003 | −0.001 | −0.000 | −0.001 | −0.000 | −0.000 | 0.001 | 0.001 | 0.002 |
| | | −0.001 | −0.004 | −0.003 | −0.001 | 0.000 | −0.001 | −0.001 | 0.001 | 0.005 |
| 8 | HG () + B (10, 0.5); 55; 0.12 (0.59%) | −0.002 | 0.001 | 0.002 | 0.000 | 0.001 | −0.000 | −0.001 | −0.000 | −0.003 |
| | | 0.000 | 0.004 | 0.000 | −0.001 | −0.002 | 0.000 | 0.001 | 0.000 | 0.006 |
| (10 series) | | | | | | | | | | |
| 9 | HG (); 0.77; 9.21 (46%) | 0.020 | 0.009 | 0.005 | 0.003 | −0.001 | 0.001 | −0.007 | −0.009 | −0.004 |
| | | 0.004 | −0.001 | −0.003 | −0.001 | 0.001 | −0.000 | −0.002 | 0.000 | 0.002 |
| | | −0.042 | −0.032 | −0.018 | 0.001 | −0.008 | −0.001 | 0.012 | 0.024 | 0.032 |
| 10 | HG (); 5; 0.12 (0.59%) | −0.000 | −0.005 | −0.002 | 0.000 | 0.001 | −0.001 | −0.005 | −0.009 | −0.010 |
| | | −0.002 | −0.002 | −0.001 | −0.000 | −0.000 | 0.000 | −0.002 | −0.002 | −0.002 |
| | | −0.005 | −0.005 | −0.004 | −0.001 | −0.002 | 0.001 | 0.003 | 0.005 | 0.004 |
| 11 | HG () + [CU (0, 100)]^2; 0.77; 9.21 (46%) | 0.001 | 0.015 | 0.014 | 0.011 | −0.003 | 0.002 | 0.011 | 0.021 | 0.011 |
| | | 0.001 | −0.004 | −0.005 | −0.003 | 0.001 | 0.001 | −0.000 | −0.001 | 0.000 |
| | | −0.132 | −0.101 | −0.058 | 0.003 | −0.028 | −0.005 | 0.039 | 0.075 | 0.098 |
| 12 | HG () + [CU (0, 100)]^2; 5; 0.12 (0.59%) | 0.008 | −0.006 | −0.003 | −0.004 | 0.004 | 0.004 | −0.002 | −0.006 | −0.005 |
| | | 0.001 | −0.003 | −0.000 | 0.001 | −0.000 | −0.000 | 0.001 | −0.001 | 0.004 |
| | | −0.011 | −0.009 | −0.006 | −0.001 | −0.002 | 0.001 | 0.008 | 0.013 | 0.017 |
| 13 | HG () + 2; 0.77; 9.21 (46%) | 0.028 | 0.023 | 0.028 | 0.024 | −0.011 | 0.005 | 0.010 | 0.010 | 0.008 |
| | | −0.001 | −0.005 | −0.008 | −0.004 | 0.002 | −0.000 | −0.004 | −0.006 | −0.003 |
| | | −0.107 | −0.084 | −0.048 | −0.001 | −0.023 | −0.007 | 0.036 | 0.067 | 0.087 |
| 14 | HG () + 2; 5; 0.12 (0.59%) | 0.003 | −0.004 | −0.005 | −0.001 | −0.004 | −0.003 | −0.009 | −0.006 | −0.012 |
| | | −0.003 | 0.002 | −0.001 | −0.000 | 0.000 | 0.000 | −0.003 | −0.003 | 0.005 |
| | | −0.017 | −0.011 | −0.006 | 0.000 | −0.003 | −0.001 | 0.007 | 0.012 | 0.015 |

| | Distribution Type; n; df0 | | Deviations at the Below Significance Levels | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | 0.001 | 0.01 | 0.05 | 0.5 | 1 | 0.5 | 0.05 | 0.01 | 0.001 |
| 1 | Dir (αi); n = 30; df0 = 27.85; df0 = 27.92 | Pearson | −0.008 | −0.006 | −0.003 | −0.000 | −0.001 | −0.000 | −0.001 | −0.002 | 0.003 |
| | | Kendall | 0.009 | 0.006 | 0.005 | 0.003 | −0.001 | 0.003 | 0.006 | 0.006 | 0.004 |
| | | Spearman | 0.017 | 0.007 | 0.004 | 0.000 | 0.000 | −0.000 | −0.001 | −0.001 | −0.004 |
| 2 | HG + N (50, 64); n = 30; df0 = 21.04; df0 = 23.48 | Pearson | −0.005 | −0.003 | −0.005 | −0.003 | 0.001 | −0.003 | −0.004 | −0.001 | 0.009 |
| | | Kendall | −0.005 | −0.003 | −0.003 | −0.002 | −0.001 | −0.003 | −0.006 | −0.001 | 0.003 |
| | | Spearman | 0.003 | −0.002 | −0.008 | −0.007 | 0.001 | −0.005 | −0.014 | −0.010 | −0.004 |
| 3 | HG + Gamma (50, 1); n = 65; df0 = 63; df0 = 63 | Pearson | −0.003 | −0.003 | −0.002 | −0.000 | 0.000 | −0.001 | −0.000 | −0.001 | 0.004 |
| | | Kendall | 0.002 | 0.001 | 0.002 | 0.000 | 0.001 | 0.001 | 0.003 | 0.002 | 0.010 |
| | | Spearman | 0.005 | −0.003 | −0.002 | −0.002 | 0.001 | −0.001 | −0.004 | −0.005 | 0.001 |
| 4 | HG + 2; n = 13; df0 = 7.10; df0 = 8.17 | Pearson | 0.008 | 0.008 | 0.011 | 0.003 | −0.000 | 0.002 | 0.009 | 0.007 | 0.005 |
| | | Kendall | 0.023 | 0.021 | 0.011 | 0.003 | −0.002 | −0.003 | 0.004 | 0.002 | 0.003 |
| | | Spearman | 0.024 | 0.012 | −0.002 | −0.019 | 0.005 | −0.013 | −0.023 | −0.013 | 0.002 |
| 5 | HG + exp (CU (0, 100)); n = 20; df0 = 14.72; df0 = 15.93 | Pearson | −0.205 | −0.190 | −0.155 | −0.047 | −0.010 | −0.043 | −0.029 | −0.004 | 0.019 |
| | | Kendall | −0.014 | −0.037 | −0.050 | −0.065 | 0.066 | 0.062 | 0.056 | 0.044 | 0.041 |
| | | Spearman | −0.016 | −0.042 | −0.057 | −0.072 | 0.067 | 0.058 | 0.043 | 0.034 | 0.034 |

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Monich, Y.V.; Nechipurenko, Y.D. Correlations in Compositional Data without Log Transformations. Axioms 2023, 12, 1084. https://doi.org/10.3390/axioms12121084