Peer-Review Record

The Simulative Role of Neural Language Models in Brain Language Processing

Philosophies 2024, 9(5), 137; https://doi.org/10.3390/philosophies9050137
by Nicola Angius *, Pietro Perconti, Alessio Plebe and Alessandro Acciai
Reviewer 2: Anonymous
Submission received: 10 July 2024 / Revised: 23 August 2024 / Accepted: 28 August 2024 / Published: 29 August 2024
(This article belongs to the Special Issue Contemporary Natural Philosophy and Philosophies - Part 3)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

SYNOPSIS

The core focus of this article is the trajectory of artificial intelligence applied to language models in brain language processing within connectionist frameworks, which emphasize the role of prediction (such as predicting the next word in a story) and generalization. Initially, this approach served as a metaphor for the dynamic processes of the cerebral cortex. Subsequently, however, it moved away from the neurobiological metaphor, establishing a more abstract and independent model (DL = deep learning) not inspired by neurophysiology.

I have encountered articles with a comparable structure in the journal Philosophies, where the authors outline a geographical map of contemporary research without delving deeply into specifics. This article, however, distinguishes itself with a thorough and meticulous approach; its rigorous and comprehensive treatment results in a highly competent synthesis.

COMMENTS

(1) Attention mechanisms. Line 103. Gail Carpenter and Stephen Grossberg first introduced attention mechanisms, in the ART networks' processing of information and learning, not [33]. Attention in Transformers was introduced by Vaswani et al. in "Attention Is All You Need" (the formulation is recalled below).
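For reference, the scaled dot-product attention defined in that paper is

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V,$$

where $Q$, $K$, and $V$ are the query, key, and value matrices obtained by linear projections of the token representations, and $d_k$ is the key dimension.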

(2) Language acquisition. A note should be added saying that none of the models reported in the paper explains language acquisition by children in a positive environment. See (3).

(3) Black boxes. This is the key point not highlighted in the articles cited by the authors. The models (NLMs and others) are essentially hypocomputational, leaving a significant margin of computational power available to robots and computers. Assuming Transformers can effectively mimic primary human cognitive functions such as language, vision, and hearing, does this suggest that, aside from memory, human cognitive abilities are inferior to those of machines? I do not have an answer to this question, but the current trajectory of AI development appears to obscure this aspect of the debate. While some limitations are addressed in lines 415 and following, these concerns are retracted in lines 432 and subsequent. Could you expand on these matters?

(4) Activation patterns. Lines 238 and following. The meaning of predicting activation patterns is unclear to me. Generally, some areas in the brain activate simultaneously and others at different times; still, I do not understand how the extent of this activation is compared between the Transformer simulation and the fMRI tests (see the sketch below for the usual procedure). Besides this, presumably the medial orbitofrontal cortex or the amygdala is also activated, in addition to the brain's language-processing areas. Can you clarify this point?
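For orientation: in the brain-scoring literature, this kind of prediction is usually operationalized by fitting a linear map from model activations to recorded voxel responses and correlating predictions with held-out data. Here is a minimal sketch, assuming hypothetical arrays model_acts and fmri and a ridge-regression mapping; it illustrates the common method, not the exact pipeline of the paper under review:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

def brain_score(model_acts, fmri, n_splits=5):
    """model_acts: (n_stimuli, n_features) NLM activations per stimulus;
    fmri: (n_stimuli, n_voxels) recorded responses to the same stimuli."""
    fold_scores = []
    for train, test in KFold(n_splits=n_splits).split(model_acts):
        # Fit a linear map from model activations to voxel responses
        reg = Ridge(alpha=1.0).fit(model_acts[train], fmri[train])
        pred = reg.predict(model_acts[test])
        # Pearson correlation per voxel between predicted and observed
        r = [np.corrcoef(pred[:, v], fmri[test, v])[0, 1]
             for v in range(fmri.shape[1])]
        fold_scores.append(np.nanmean(r))
    return float(np.mean(fold_scores))  # average held-out predictivity
```

The averaged held-out correlation is what is typically reported as a "brain score".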

Bugs: I found almost no syntactic bugs, except in line 64: «conetxt» and «langauge».

Notation: the meaning of the terms and signs, and the purpose of the equation within lines 101 and 102, are missing.

References. Lines 28-29. I missed Edwin Land's monumental contributions to understanding colour vision in humans and machines. His Retinex theory was the first algorithm to explain how colour constancy is maintained under varying lighting conditions. All these contributions predate Transformers.

Lines 167 and 168 seem to regress to Mark Gold's Theorem.

Author Response

Dear Reviewer,

we are very grateful for your useful comments which, we believe, helped us improve the quality of our manuscript. In the following, you will find a point-by-point reply to your remarks. All added parts in the manuscript have been marked in blue, for your perusal.

  1. "Attention mechanisms. Line 103. Gail Carpenter and Stephen Grossberg first introduced attention mechanisms in the ART networks' processing of information and learning, not [33]." Reference to "Carpenter, G.A.; Grossberg, S. ART 2: Self-organization of stable category recognition codes for 584 analog input patterns. Applied optics 1987, 26, 4919–4930. 585" is now made at lines 107-108.
  2. "Language acquisition. A note to say that none of the models reported in the paper explain language acquisition by children in a positive environment. See (3)." This is now explicitely stated in footnote 2 while addressing comment 3 below.
  3. "Assuming Transformers can effectively mimic primary human cognitive functions such as language, vision, and hearing, does this suggest that, aside from memory, human cognitive abilities are inferior to those of machines?" We added a paragraph on page 7; whereas our viewpoint is birefly stated there, we also specify that this is an important topic we would not like to adress, since it is out of the scope of the paper.
  4. "Activation Patterns. Lines 238 and following. The meaning of predicting activation patterns is unclear to me." Technical details on activation patterns and brain score are now included in footnote 3 (in order not to weigh the main text down).
  5. "Bugs: I found almost no syntactic bugs, except in line 64: «conetxt» and «langauge»." They have been corrected and the whole manuscript carefully read again.
  6. "Notation: missing meaning of terms, signs and purpose of the equation within lines 101 and 102." The paragraph contains now a more explicit, term by term, definition of the equation.
  7. "References. Lines 28-29. I missed Edwin Land's monumental contributions to understanding colour vision in humans and machines." Reference to "Land, E.H.; McCann, J.J. Lightness and retinex theory. Josa 1971, 61, 1–11. " and to "McCann, J.J. Retinex at 50: color theory and spatial algorithms, a review. Journal of Electronic 526 Imaging 2017, 26, 031204–031204. 527" have now been added.
  8. "Lines 167 and 168 seem to regress to Mark Gold's Theorem." We respectfully believe that the learning process of Transformer-based models would not fit with Gold's theorem; however, we would like not to include a discussion on this topic since this would need much space and it would take the manuscript away from its focus.

Reviewer 2 Report

Comments and Suggestions for Authors

The paper begins with an interesting discussion of the notion of AI, which first inherited the setting of cybernetics. However, over the last ten years, AI has experienced what is called a true "Renaissance", in two steps: first, thanks to the success of DL, it acquired the faculty of vision; then, five years later, AI also became able to acquire a language possessing both a vectorial aspect (via neural nets) and a symbolic aspect (e.g., via words).


Section 2 recalls the notions needed to develop such a language, namely "Neural Language Models" (NLMs) and the "Transformer" with its "self-attention mechanism".

Let us note that the whole paper essentially gives epistemological and methodological definitions, rather than discussing their mathematical implications. Let us give two examples (a standard formulation of the first is sketched after this list):

     =  What is the probability measured by an NLM?

     =  Figure 1 gives a simplified scheme of the Transformer architecture, pertaining to the streamlined GPT architecture, with input text consisting of tokens t_i. All the components are described by their names, but no mathematical operations on them are presented.
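For context: in a standard autoregressive formulation (a textbook statement, not quoted from the manuscript), the probability an NLM measures is the conditional distribution over the next token, obtained by a softmax over the vocabulary $V$:

$$P(t_{i+1} = w \mid t_1, \dots, t_i) = \mathrm{softmax}(h_i W)_w = \frac{\exp\big((h_i W)_w\big)}{\sum_{w' \in V} \exp\big((h_i W)_{w'}\big)},$$

where $h_i$ is the model's hidden representation of the context $t_1, \dots, t_i$ and $W$ is the output projection onto the vocabulary.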

A valuable aspect of the text is the strength with which the authors insist on the surprising efficacy of the Transformer's linguistic abilities, and this even though there is no real explanation for it, since the Transformer was not initially conceived to this end.

The sequel of the paper recalls the definition of the simulative (or synthetic) method, which is then used to study, via a number of specific experiments, the role of NLMs in brain language processing.

In Section 3, the experiments essentially concern cases where NLMs are used to predict the activation of different parts of the brain.

Similarly, Section 4 uses the simulative method to study different experiments, for instance one (illustrated by Figure 2) that uses a robot lobster to study the chemotaxis of lobsters.

Section 5 introduces the notion of "co-simulation", in which two 'opaque' systems are used to simulate each other, for instance an NLM and the brain. It is illustrated in Figure 3.

 

Comments on the Quality of English Language

 I propose “Minor Editing”.

Author Response

Dear Reviewer,

thank you very much for your valuable comments and suggestions, which helped us improve the quality of our manuscript. In the following, you will find a point-by-point reply to your remarks. All added parts in the manuscript have been marked in red, for your perusal.

  • "Let us note that  the whole paper essentially gives epistemological and methodological definitions , rather than discussing  their mathematical implications." Yes, our main aim is to provide an epistemological and methodological analysis of NLMs and their implications in the philosophy of cognitive science, as what concerns  the synthetic method. However, the introduction of NLMs in section 2 provides now more techncial details (see third comment below).
  •  "What is the probability measured by a NLM.?" This is now more explicitly stated in the added lines 133-141.
  • "Figure 1 gives a simplified scheme of the Transformer architecture pertaining to the streamlined GPT architecture, with input text consisting of tokens ti. All the components are described by their name but no  mathematical operations are presented  on them." A new figure (Fig. 2) has been added, providing details of the attention mechanism, toghether with the definition of the algebraic operations involved (pages 5-7).
  • "Comments on the Quality of English Language. I propose “Minor Editing”." The manuscript has been carefully read and some typos fixed.
