Complexity Characteristics of Natural Language

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Complexity".

Deadline for manuscript submissions: 15 May 2025 | Viewed by 4295

Special Issue Editors


Guest Editor
1. Complex Systems Theory Department, Institute of Nuclear Physics, Polish Academy of Sciences, 31-342 Kraków, Poland
2. Faculty of Computer Science and Telecommunications, Cracow University of Technology, 31-155 Kraków, Poland
Interests: complex systems; nuclear physics; quantum mechanics; multifractals; complex networks; nonlinear dynamics; deterministic chaos; random matrix theory; econophysics; quantitative linguistics

Guest Editor
Complex Systems Theory Department, Institute of Nuclear Physics, Polish Academy of Sciences, 31-342 Kraków, Poland
Interests: complex systems; complex networks; financial markets; natural language; fractal analysis

Guest Editor
Complex Systems Theory Department, Institute of Nuclear Physics, Polish Academy of Sciences, ul. Radzikowskiego 152, 31-342 Kraków, Poland
Interests: complex systems; complex networks; numerical analysis; quantitative linguistics; machine learning; statistics

Special Issue Information

Dear Colleagues,

The science of complexity is an interdisciplinary approach to answering the question of the principles by which nature operates when composing the basic elements of matter and energy into dynamic patterns and structures that propagate throughout the entire hierarchy of scales in the universe. One extraordinary emergent phenomenon of this kind, syntactically organized natural language, superbly reflects these patterns and structures through its great ability to encode and transmit information about them and between them. It is therefore highly reasonable to expect that natural language, created spontaneously by nature, best mirrors the laws of nature and carries within it the essence of complexity. Indeed, the complex systems methodology, which includes time series analysis, various variants of the concept of entropy, scale-free laws, fractals and multifractals, and, of course, complex networks, proves to be very effective in quantifying universal as well as system-specific linguistic characteristics. For the most up-to-date related review, see "Complex Systems Approach to Natural Language", https://doi.org/10.1016/j.physrep.2023.12.002.

It can also be added that, from the perspective of human communities, natural language is a fundamental factor shaping the development of civilization in all its aspects. Nowadays, quantitative studies of language, and the need to understand the principles governing it, are gaining particular importance in the context of products such as ChatGPT, from the family of large language models developed by OpenAI. It is expected, in particular, that such studies will contribute to further significant improvements and optimization of the relevant procedures, something that is highly desirable in this context.

We thus invite researchers representing various disciplines, including linguistics, computer science, physics, mathematics, data science, and others, to submit original papers reporting studies, both empirical and model-based, whose results may contribute to a better understanding of the origins of natural language and the principles of its organization. Such research should naturally encompass complementary representations of natural languages belonging to different families, not only major ones such as Indo-European and Sino-Tibetan but also those in less common use.

Prof. Dr. Stanisław Drożdż
Dr. Jarosław Kwapień
Dr. Tomasz Stanisz
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • natural language
  • complexity
  • hierarchical organization
  • long-range correlations
  • time series analysis
  • scaling laws
  • multifractals
  • complex networks
  • large language models
  • natural language generation

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (4 papers)


Research

24 pages, 2736 KiB  
Article
Is Word Order Responsive to Morphology? Disentangling Cause and Effect in Morphosyntactic Change in Five Western European Languages
by Julie Nijs, Freek Van de Velde and Hubert Cuyckens
Entropy 2025, 27(1), 53; https://doi.org/10.3390/e27010053 - 9 Jan 2025
Viewed by 443
Abstract
This study examines the relationship between morphological complexity and word order rigidity, addressing a gap in the literature regarding causality in linguistic changes. While prior research suggests that the loss of inflectional morphology correlates with the adoption of fixed word order, this study shifts the focus from correlation to causation. By employing Kolmogorov complexity as a measure of linguistic complexity alongside Granger Causality to examine causal relationships, we analyzed data from Germanic and Romance languages over time. Our findings indicate that changes in morphological complexity are statistically more likely to cause shifts in word order rigidity than vice versa. The causal asymmetry is robustly borne out in Dutch and German, though waveringly in English, as well as in French and Italian. Nowhere, however, is the asymmetry reversed. Together, these results can be interpreted as supporting the idea that a decline in morphological complexity causally precedes a rise in syntactic complexity, though further investigation into the underlying factors contributing to the differing trends across languages is needed.
(This article belongs to the Special Issue Complexity Characteristics of Natural Language)
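The study above pairs Kolmogorov complexity with Granger causality. Since Kolmogorov complexity is uncomputable, analyses of this kind rely in practice on compression-based upper bounds; the sketch below illustrates that idea with Python's zlib. The choice of compressor and the per-character normalisation are illustrative assumptions, not details taken from the paper.

```python
import zlib

def compression_complexity(text: str) -> float:
    """Upper-bound proxy for Kolmogorov complexity: compressed bytes per
    input byte. zlib at maximum compression is an illustrative choice;
    the paper does not specify its compressor or normalisation."""
    data = text.encode("utf-8")
    return len(zlib.compress(data, level=9)) / max(len(data), 1)

# A repetitive, rigid sequence compresses far better (lower complexity)
# than a varied one.
repetitive = "the cat sat on the mat " * 40
varied = "colourless green ideas sleep furiously beside seven unrelated axioms "
print(compression_complexity(repetitive) < compression_complexity(varied))  # → True
```

Texts with rigid, repetitive structure compress well and thus score low, while varied text scores high; that contrast is exactly what such complexity measures are designed to capture, here for morphology and word order separately.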

26 pages, 766 KiB  
Article
Still No Evidence for an Effect of the Proportion of Non-Native Speakers on Natural Language Complexity
by Alexander Koplenig
Entropy 2024, 26(11), 993; https://doi.org/10.3390/e26110993 - 18 Nov 2024
Viewed by 679
Abstract
In a recent study, I demonstrated that large numbers of L2 (second language) speakers do not appear to influence the morphological or information-theoretic complexity of natural languages. This paper has three primary aims: First, I address recent criticisms of my analyses, showing that the points raised by my critics were already explicitly considered and analysed in my original work. Furthermore, I show that the proposed alternative analyses fail to withstand detailed examination. Second, I introduce new data on the information-theoretic complexity of natural languages, with the estimates derived from various language models—ranging from simple statistical models to advanced neural networks—based on a database of 40 multilingual text collections that represent a wide range of text types. Third, I re-analyse the information-theoretic and morphological complexity data using novel methods that better account for model uncertainty in parameter estimation, as well as the genealogical relatedness and geographic proximity of languages. In line with my earlier findings, the results show no evidence that large numbers of L2 speakers have an effect on natural language complexity.
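The abstract refers to information-theoretic complexity estimates derived from language models "ranging from simple statistical models to advanced neural networks". At the simple end of that range, a smoothed character-bigram model yields a per-character entropy-rate estimate; the sketch below is a minimal illustration of that idea. The character-level granularity and add-one smoothing are assumptions for the example, not the paper's actual estimators.

```python
import math
from collections import Counter

def bigram_entropy_rate(text: str) -> float:
    """Per-character entropy rate (bits) under an add-one-smoothed
    character-bigram model: a minimal stand-in for the 'simple
    statistical models' end of the range of estimators."""
    v = len(set(text))                    # alphabet size for smoothing
    bigrams = Counter(zip(text, text[1:]))
    unigrams = Counter(text[:-1])
    n = len(text) - 1
    h = 0.0
    for (a, b), c in bigrams.items():
        p = (c + 1) / (unigrams[a] + v)   # smoothed P(b | a)
        h -= (c / n) * math.log2(p)
    return h

# A perfectly alternating string is almost fully predictable, so its
# estimated entropy rate is far below that of ordinary prose.
print(bigram_entropy_rate("abababababababab")
      < bigram_entropy_rate("the quick brown fox jumps over the lazy dog"))  # → True
```

More capable models (longer n-grams, neural networks) tighten such estimates toward the true entropy rate, which is why the paper compares a whole spectrum of them.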

21 pages, 5387 KiB  
Article
Language Statistics at Different Spatial, Temporal, and Grammatical Scales
by Fernanda Sánchez-Puig, Rogelio Lozano-Aranda, Dante Pérez-Méndez, Ewan Colman, Alfredo J. Morales-Guzmán, Pedro Juan Rivera Torres, Carlos Pineda and Carlos Gershenson
Entropy 2024, 26(9), 734; https://doi.org/10.3390/e26090734 - 29 Aug 2024
Viewed by 1539
Abstract
In recent decades, the field of statistical linguistics has made significant strides, which have been fueled by the availability of data. Leveraging Twitter data, this paper explores the English and Spanish languages, investigating their rank diversity across different scales: temporal intervals (ranging from 3 to 96 h), spatial radii (spanning 3 km to over 3000 km), and grammatical word ngrams (ranging from 1-grams to 5-grams). The analysis focuses on word ngrams, examining a time period of 1 year (2014) and eight different countries. Our findings highlight the relevance of all three scales with the most substantial changes observed at the grammatical level. Specifically, at the monogram level, rank diversity curves exhibit remarkable similarity across languages, countries, and temporal or spatial scales. However, as the grammatical scale expands, variations in rank diversity become more pronounced and influenced by temporal, spatial, linguistic, and national factors. Additionally, we investigate the statistical characteristics of Twitter-specific tokens, including emojis, hashtags, and user mentions, revealing a sigmoid pattern in their rank diversity function. These insights contribute to quantifying universal language statistics while also identifying potential sources of variation.
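The rank diversity analysed above measures how many distinct words occupy a given frequency rank k across time windows. A minimal sketch of that definition follows, assuming Counter-style ranking with ties broken by first appearance; the paper's exact tie-breaking and normalisation conventions are not reproduced here.

```python
from collections import Counter

def rank_diversity(windows, max_rank=10):
    """Rank diversity d(k): number of distinct words seen at frequency
    rank k across time windows, divided by the number of windows.
    Ties are broken by first appearance (Counter preserves insertion
    order for equal counts), an assumed convention."""
    occupants = {k: set() for k in range(1, max_rank + 1)}
    for tokens in windows:
        ranked = [w for w, _ in Counter(tokens).most_common(max_rank)]
        for k, w in enumerate(ranked, start=1):
            occupants[k].add(w)
    t = len(windows)
    return {k: len(ws) / t for k, ws in occupants.items()}

windows = [
    "the cat saw the dog".split(),
    "the dog bit the cat".split(),
    "a cat and a dog ran".split(),
]
d = rank_diversity(windows, max_rank=3)
print(d)  # rank 1 is occupied by few distinct words; deeper ranks churn more
```

The empirical regularity that d(k) grows with k (head words are stable, tail words churn) is what the rank-diversity curves in the paper quantify across temporal, spatial, and grammatical scales.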

15 pages, 15258 KiB  
Article
Multifractal Hopscotch in Hopscotch by Julio Cortázar
by Jakub Dec, Michał Dolina, Stanisław Drożdż, Jarosław Kwapień and Tomasz Stanisz
Entropy 2024, 26(8), 716; https://doi.org/10.3390/e26080716 - 22 Aug 2024
Viewed by 804
Abstract
Punctuation is the main factor introducing correlations in natural language written texts and it crucially impacts their overall effectiveness, expressiveness, and readability. Punctuation marks at the end of sentences are of particular importance as their distribution can determine various complexity features of written natural language. Here, the sentence length variability (SLV) time series representing Hopscotch by Julio Cortázar are subjected to quantitative analysis with an attempt to identify their distribution type, long-memory effects, and potential multiscale patterns. The analyzed novel is an important and innovative piece of literature whose essential property is freedom of movement between its building blocks given to a reader by the author. The statistical consequences of this freedom are closely investigated in both the original, Spanish version of the novel, and its translations into English and Polish. Clear evidence of rich multifractality in the SLV dynamics, with a left-sided asymmetry, however, is observed in all three language versions as well as in the versions with differently ordered chapters.
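The sentence-length variability (SLV) series studied above is simply the sequence of sentence lengths, in words, through the text. A minimal sketch of its construction, assuming that sentences are delimited by . ! ? marks (the paper's exact tokenisation is not reproduced, and its multifractal analysis of the resulting series is beyond this sketch):

```python
import re
import statistics

def sentence_lengths(text: str) -> list[int]:
    """Build the SLV series: the word count of each sentence, where
    sentence boundaries are end-of-sentence punctuation marks
    (an assumed convention for this illustration)."""
    sentences = re.split(r"[.!?]+", text)
    return [len(s.split()) for s in sentences if s.strip()]

sample = ("It was a dark night. Rain fell. Nobody moved, nobody spoke, "
          "and the city held its breath for a very long time. Why? Silence.")
slv = sentence_lengths(sample)
print(slv)                     # → [5, 2, 15, 1, 1]
print(statistics.pstdev(slv))  # dispersion of the sentence-length series
```

Multifractal detrended fluctuation analysis would then be applied to long series of this kind; for a book-length novel such as Hopscotch the series has thousands of entries rather than five.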
