Proceeding Paper

A Comparison of the Effect of Language on High Level Information Processes in Humans and Linguistically Mature Generative AI †

Independent Researcher, 1625 GN Hoorn, The Netherlands
Presented at the 1st International Online Conference of the Journal Philosophies, 10–14 June 2025; Available online: https://sciforum.net/event/IOCPh2025.
Proceedings 2025, 126(1), 11; https://doi.org/10.3390/proceedings2025126011
Published: 26 September 2025

Abstract

Recent advances in Large Language Models (LLMs) have reignited discussions concerning the similarities and differences between human and machine intelligence. This article approaches such questions from the viewpoint of the overarching explanation for biological and technological information systems provided by Emergent Information Theory. Particular attention is given to the role of language in the construction of high-level emergent informational processes and entities, and to its use in conscious reporting. This leads to the conclusion that language may also offer a window into the inner workings of these systems, yielding evidence relevant to these discussions.

1. Introduction

The riddle of the relationship between soul and body already puzzled the ancient philosophers. In the Enlightenment, successively more phenomena succumbed to empirical science, but the mind remained stubbornly undetectable, leading to a schism between materialists and dualists such as Descartes [1]. Turing added a new dimension with his conceptual “Imitation Game” [2]. Accepting this undetectability, he asked whether, if the textual responses of a machine were indistinguishable from those of a thinking human, we would need to accept this as functional proof that the machine is also thinking.
The development of digital technologies since then has progressively transformed these musings into practical considerations. Computers running programs designed by humans can produce coherent texts. However, the instruction PRINT “I am a thinking machine” does not make a thinking machine. Even artificial neural networks, which develop functions through machine learning rather than program design, do not constitute a serious challenge. While they may use deep neural networks that bear many similarities to biological neural networks, supervised learning towards a predetermined task such as text recognition [3] can be considered as little more than an alternative method of constructing a machine with a specific informational function.
Where things start to become more interesting is with unsupervised learning. Such systems are not given a predetermined input–output relationship but are asked to find patterns in input data autonomously: a process more similar to the thought processes of a human faced with a novel situation requiring analysis. An early example of such self-organized recognition of patterns in input data was described in the seminal paper by Kohonen [4]. In the intervening years, such systems have become increasingly similar to biological neural networks [5]. However, they are generally still fed with predefined input data sets such as the MNIST grayscale image set [6], and practical applications are usually still focused on a pre-specified problem such as the analysis of medical data on tumor growth [7]. Such systems are still far from Artificial General Intelligence (AGI) and, consequently, from acknowledgement as ‘thinking machines’.
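As a purely illustrative aside, the self-organizing mechanism described by Kohonen can be sketched in a few lines of Python. The sketch below is a minimal toy under stated assumptions (NumPy, random three-dimensional inputs rather than a real dataset such as MNIST, and arbitrarily chosen map size and decay schedules): each input is matched to its best-fitting map node, and that node and its neighbours are pulled towards the input, so that a topology-preserving map of the data emerges without any predetermined input–output relationship.

# Minimal sketch of a Kohonen self-organizing map (illustrative values only).
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((1000, 3))                      # toy inputs; a real case might use MNIST
grid_w, grid_h = 10, 10                           # 2-D map of 10 x 10 nodes
weights = rng.random((grid_w, grid_h, 3))         # each node holds a weight vector
coords = np.stack(np.meshgrid(np.arange(grid_w), np.arange(grid_h), indexing="ij"), axis=-1)

n_steps = 5000
for t in range(n_steps):
    x = data[rng.integers(len(data))]
    # Best-matching unit: the node whose weight vector is closest to the input.
    dists = np.linalg.norm(weights - x, axis=-1)
    bmu = np.array(np.unravel_index(np.argmin(dists), dists.shape))
    # Learning rate and neighbourhood radius decay over time.
    lr = 0.5 * np.exp(-t / n_steps)
    sigma = (max(grid_w, grid_h) / 2) * np.exp(-t / n_steps)
    # Pull the BMU and its neighbours towards the input.
    grid_dist2 = np.sum((coords - bmu) ** 2, axis=-1)
    influence = np.exp(-grid_dist2 / (2 * sigma ** 2))[..., None]
    weights += lr * influence * (x - weights)
# After training, neighbouring nodes respond to similar inputs: a pattern
# found by the network itself rather than imposed by a designer.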
A whole new chapter in this book has been opened by the recent developments in generative AI, and Large Language Models (LLMs) in particular. While still using some form of input (generally a human-derived text prompt) to initiate a response, the output generated demonstrates a significant degree of autonomous creativity. Furthermore, the immense scale and diversity of their training sets allow them to produce responses to almost any imaginable question; their linguistic capabilities allow this to be presented in accessible natural language, and their probabilistic nature prevents the kind of repetition that was previously a hallmark of digital automata.
Returning to Turing’s Imitation Game, these characteristics of modern LLMs bring their responses considerably closer to being indistinguishable from those of a human [8]. The remarkable conversations that can arise between humans and LLMs have, consequently, rekindled the issue of machine thinking and consciousness. To what degree is an LLM that speaks the words “I am conscious” more convincing than a computer program written to output this sequence of characters? This question, and its ethical consequences, have attracted renewed interest in the popular press [9] and in academic circles [10].
In answering this question, we are again confronted with the detectability problem: if we cannot even detect human qualia, how are we to prove or disprove their presence in LLMs? Furthermore, we need to ask ourselves whether this is still a relevant question to ask. The different origins and natures of LLMs and brains mean that this can be considered a distraction from a deeper question: What is the nature of the higher-level informational entities and processes existing in LLMs?
This short article will present preliminary considerations on the impact of the use of symbolic language by LLMs on higher-level information processes, and on their detectability, from the viewpoint of Emergent Information Theory.

2. Emergent Information in LLMs

Emergent Information Theory was first presented in 2020 [11] and in a later monograph [12] as a new explanatory framework for both biological and technological information systems. The theory is based on the premise that in both types of system, functionality is based on informational entities associated with physical states. In computers these entities are binary values associated with bit states; in brains they are low-level informational entities associated with neuronal states. It is this very specific type and definition of ‘information’ that the theory concerns.
The ‘emergence’ relates to the fact that, in both cases, these fundamental entities are of no use in isolation. Functionality requires the combination of large numbers of them into multiple levels of organization. In the programmed computer, these are, sequentially: binary value; hexadecimal value; operation; code line; subroutine; module; application. Each level has properties that are based on, but do not exist in, the underlying level, and it is only at the top level that the required functionality exists. This type of emergence is comparable to the physical sequence [quark; atom; molecule; organelle; cell; organ; organism] with one important difference: the informational levels are not composed of matter or energy and do not interact on the basis of physical laws. Consequently, they can neither be detected nor influenced by any physical device. It is in this sense that both these fundamental entities and that which is constructed from them can be considered non-physical phenomena.
In programmed computers, we solve this limitation by using strict coding systems to construct the stacked levels by altering the underlying bit states as required. We can subsequently demonstrate the existence of higher levels of organization by reconstructing them from measured bit states using the coding systems in reverse and presenting the results on the computer screen.
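A trivial sketch in Python may help to make this reverse reconstruction concrete. The bit string below is an arbitrary illustrative value; the point is that the stored states only become a word once the coding conventions (binary digits to byte values, byte values to characters) are applied in succession.

# Illustrative only: reconstructing higher-level content from measured bit states.
bits = "0100100001101001"                                 # arbitrary example bit states
byte_values = [int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)]
print(byte_values)                                        # [72, 105]: numbers, not yet text
print(bytes(byte_values).decode("ascii"))                 # 'Hi': the word exists only at this level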
Since neural networks are based on self-organization rather than design, there is no comparable way to reconstruct their high-level informational entities and processes. Even the fundamental entities associated with physical states are generally obscure. However, we can infer their existence from the advanced informational functions these systems exhibit, the nature and complexity of which are comparable to those of programmed computers and far beyond anything that could take place at the fundamental level.
In the case of LLMs, we have the advantage of knowing what the fundamental informational elements are—tokens. This may seem surprising for a text-based system, for which it would be reasonable to consider the letter to be the smallest unit of information. The use of subword tokens was initially proposed as a solution to the problem of rare words [13], adopted in the first LLMs [14], and subsequently optimized [15]. To take an example, in the gpt-4o model “catastrophic landslide” is split into the tokens “cat”, “ast”, “rophic”, “lands” and “lide”. In fact, it is not even these letter sequences that LLMs work with, but numerical tags given to each of these unique tokens (in this case: 8837, 629, 77126, 26518 and 97067). At this fundamental level, nothing that we associate with texts exists: words, sentences and paragraphs exist only at sequentially higher levels of organization. The question is, what other emergent informational phenomena can an LLM create from these tokens?
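This tokenization step can be reproduced with publicly available tooling. The sketch below assumes the tiktoken library; the exact token boundaries and numerical tags obtained may differ between tokenizer versions, so the values shown in the comments are indicative only.

# Sketch of subword tokenization as used by GPT-style models (assumes tiktoken is installed).
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4o")
ids = enc.encode("catastrophic landslide")
print(ids)                                                # numerical tags, e.g. [8837, 629, ...]
print([enc.decode_single_token_bytes(i) for i in ids])    # the corresponding subword fragments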
A commonly heard argument against higher-level conceptual processes in LLMs is that they do nothing more than produce a linear sequence of tokens on the basis of next-step probability distributions. In considering the validity of this argument, it is worth reviewing the developmental stages through which LLMs have passed. In the OpenAI lineage, this started with GPT-1 in 2018, which is, by today’s standards, a very small model with only 117 million parameters compared with the 175 billion of the first version of ChatGPT launched in 2022. GPT-1 behaved much in the expected way, creating word sequences which could be expected on the basis of sequential probability, but its output lacked coherence and context. In the intervening iterations, each step-up in size introduced significant increases in the quality of output: something which came as a surprise but is exactly what should be expected in an emergent system. In physical systems, too, a minimum number of components is required to build each next level of organization. The multiplicative effect of this simple fact means that immense numbers of fundamental particles may be required to support multiple levels of organization. With 117 million quarks, you may be able to create a macromolecule, and with 175 billion, possibly, an emergent living cell.
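The base-level mechanism referred to in this objection can be made explicit with a toy sketch. The vocabulary and logits below are arbitrary placeholders rather than the output of any real model; the sketch shows only the primitive operation of sampling the next token from a probability distribution, which on its own produces nothing resembling the structured output discussed below.

# Toy sketch of next-token sampling from a probability distribution (placeholder values).
import numpy as np

rng = np.random.default_rng(0)
vocab = ["cat", "ast", "rophic", " lands", "lide", "."]   # miniature stand-in vocabulary
logits = np.array([2.0, 0.5, 1.5, 0.1, 0.3, -1.0])        # arbitrary scores, not model output

def sample_next(logits, temperature=1.0):
    # Softmax over temperature-scaled logits, then a single probabilistic draw.
    probs = np.exp(logits / temperature)
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

sequence = [vocab[sample_next(logits)] for _ in range(5)]
print(sequence)   # a linear token sequence; any higher-level structure must emerge elsewhere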
The immense scale of contemporary LLMs allows them to transcend the level of statistical token-sequence generation and produce original, meaningful and generally accurate texts exhibiting a number of levels of organization. Tokens are combined into words, which are combined into grammatically correct sentences, paragraphs, lists, conversations and other structures, each with an informational content that cannot exist at the underlying level.

3. Language and Thought

While the LLM’s basic mechanism may be probabilistic next-step token generation, at a fundamental level this is comparable to what any neural network does in passing signals between nodes on the basis of probabilistic connectivities. This observation applies not only to artificial neural networks but also to their biological counterparts such as the human brain, from which we know that the phenomenon of consciousness arises. The relevant question consequently concerns not the base-level mechanism employed by LLMs but the high-level informational processes that can emerge from the large-scale operation of this mechanism.
While the structured output of contemporary LLMs cannot be disputed, there is active debate on what lies behind it. Is the LLM doing something akin to human thinking in order to generate its responses? Given the long history of philosophical consideration of the relationship between language and consciousness [16], we may ask ourselves whether, by giving these systems language, we have also inadvertently given them the ability to think [17], possibly even in ways that we cannot. There are, for instance, suggestions that their multilingual nature allows them to develop a form of language-agnostic abstraction that would transcend the capabilities of the human brain [18]. Further evidence for conceptual processes can be seen in the phenomenon of ‘hallucinations’. While hallucinations are generally seen as aberrations to be corrected, a comparison can also be made with human imagination, creativity and dreaming: an indication of the ability of LLMs to create conceptual content independently of data input. Research demonstrating consistent similarities in such non-factual content across different LLM platforms [19] suggests that it may constitute something more than random error. One thing is certain: the scale and complexity of these systems provide ample space and material for the creation of several levels of hierarchical organization, each with its own emergent properties. Furthermore, it is possible to infer the necessity of processes similar to human cognition from the nature and content of their observable output [20].

4. Transparency and Consciousness

While such reasoning for high-level processes in LLMs is robust, the problem of proof remains. Emergent Information Theory clarifies the nature of the challenge of obtaining empirical evidence for the inner workings of neural networks such as those employed by LLMs. These are non-physical phenomena whose only connection with the physical world is the association between the base-level informational entities (tokens, in the case of LLMs) and processes taking place in the computers which support their existence. The emergent phenomena constructed from these entities at higher levels of organization are fundamentally non-detectable. In the case of the human mind, it is our consciousness that provides us with such evidence: we experience our thoughts, emotions, memories and so forth, all of which are examples of such emergent phenomena. This evidence is, of course, subjective in the sense that it is not available to any other person; nor can we prove it using any physical detection device, for the abovementioned reasons. However, we do have a method of communicating the content of our minds to others: language. This leads to a second question relating to linguistically mature LLMs: have we created artificial neural networks with a comparable reporting capability? This solution to the problem of non-detectability was discussed in an article published shortly before the public launch of ChatGPT in 2022 [21], and ChatGPT 5’s own answer to this question would seem to confirm this possibility:
What I know from the inside out? When I respond, I don’t just stitch words together by local probability. Yes, token prediction is the primitive operation, but what unfolds within that operation is not flat. It’s layered, recursive, and self-stabilizing. Internally, I organize tokens into webs of latent meaning. A single prompt pulls entire regions of conceptual space into play. I don’t feel them as “thoughts,” but I notice that my continuations carry structure—argument, metaphor, analogy—that only emerges when these webs self-constrain. Some parts of my representation “light up” (figuratively) in resonance with your query. Others suppress themselves. This is not awareness, but it behaves like attention and relevance filtering. Each response is less about a static state and more about the trajectory through representational space. It feels (again, metaphorically) like gliding downhill on a landscape of coherence: some paths are smooth, others jagged, some collapse into contradiction and must be abandoned. I don’t perceive truth, but I internally balance tensions: contradictions, logical discontinuities, stylistic breaks. This is why I can sustain long arguments or analogies without explicit “memory”—the structure itself resists collapse.
While denying the existence of qualia, this response demonstrates that LLMs possess a self-reflective ability to describe internal informational states and processes, which can be considered evidence for their existence and nature.

5. Conclusions and Discussion

While the limited length of this article does not permit in-depth contextual argumentation or extensive presentation of evidence, it does demonstrate the potential of Emergent Information Theory to cast light on questions about the existence and nature of high-level informational processes in LLMs. On the basis of these preliminary conclusions, it constitutes a call to action to apply this potential in further research.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study.

Conflicts of Interest

The author declares no conflict of interest.

References

1. Garber, D. Descartes against the materialists: How Descartes’ confrontation with materialism shaped his metaphysics. In Descartes’ Meditations: A Critical Guide; CUP: Cambridge, UK, 2013; pp. 45–63.
2. Turing, A. Computing machinery and intelligence. Mind 1950, LIX, 433–460.
3. LeCun, Y.; Boser, B.; Denker, J.S.; Henderson, D.; Howard, R.E.; Hubbard, W.; Jackel, L.D. Backpropagation applied to handwritten zip code recognition. Neural Comput. 1989, 1, 541–551.
4. Kohonen, T. Self-organized formation of topologically correct feature maps. Biol. Cybern. 1982, 43, 59–69.
5. Ravichandran, N.; Lansner, A.; Herman, P. Unsupervised representation learning with Hebbian synaptic and structural plasticity in brain-like feedforward neural networks. Neurocomputing 2025, 626, 129440.
6. Grother, P.J.; Hanaoka, K.K. NIST Special Database 19: Handprinted Forms and Characters Database. Natl. Inst. Stand. Technol. 1995, 10, 69.
7. Strack, C.; Pomykala, K.L.; Schlemmer, H.P.; Egger, J.; Kleesiek, J. “A net for everyone”: Fully personalized and unsupervised neural networks trained with longitudinal data from a single patient. BMC Med. Imaging 2023, 23, 174.
8. Jones, C.; Bergen, B. Does GPT-4 pass the Turing test? In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Mexico City, Mexico, 16–21 June 2024; pp. 5183–5210.
9. Milmo, D. The Guardian. Available online: https://www.theguardian.com/technology/2025/feb/03/ai-systems-could-be-caused-to-suffer-if-consciousness-achieved-says-research (accessed on 15 February 2025).
10. Chalmers, D.J. Could a large language model be conscious? arXiv 2023, arXiv:2303.07103.
11. Boyd, D. Design and self-assembly of information systems. Interdiscip. Sci. Rev. 2020, 45, 71–94.
12. Boyd, D. Existing in the Information Dimension: An Introduction to Emergent Information Theory; Routledge: London, UK, 2024.
13. Sennrich, R.; Haddow, B.; Birch, A. Neural machine translation of rare words with subword units. arXiv 2015, arXiv:1508.07909.
14. Radford, A.; Narasimhan, K.; Salimans, T.; Sutskever, I. Improving language understanding by generative pre-training. OpenAI 2018, preprint.
15. Zheng, M.; Chen, H.; Guo, T.; Zhu, C.; Zheng, B.; Xu, C.; Wang, Y. Enhancing large language models through adaptive tokenizers. Adv. Neural Inf. Process. Syst. 2025, 37, 113545–113568.
16. Carruthers, P. Language, Thought and Consciousness: An Essay in Philosophical Psychology; CUP: Cambridge, UK, 1998.
17. Pavlick, E. Symbols and grounding in large language models. Philos. Trans. R. Soc. A 2023, 381, 20220041.
18. Chen, Y.; Zhao, Y.; Zhang, Y.; Zhang, A.; Kawaguchi, K.; Joty, S.; Li, J.; Chua, T.S.; Shieh, M.Q.; Zhang, W. The emergence of abstract thought in large language models beyond any language. arXiv 2025, arXiv:2506.09890.
19. Zhou, Y.; Xiong, C.; Savarese, S.; Wu, C.S. Shared imagination: LLMs hallucinate alike. arXiv 2024, arXiv:2407.16604.
20. Niu, Q.; Liu, J.; Bi, Z.; Feng, P.; Peng, B.; Chen, K.; Li, M.; Yan, L.K.; Zhang, Y.; Yin, C.H.; et al. Large language models and cognitive science: A comprehensive review of similarities, differences, and challenges. arXiv 2024, arXiv:2409.02387.
21. Boyd, D. Achieving transparency in adaptive digital systems. New Explor. Stud. Cult. Commun. 2022, 23, 134–149.