**1. Introduction**

Artificial intelligence in the form of machine learning is currently making impressive progress, especially in the field of deep learning (DL) [1]. Algorithms in deep learning have been inspired by the human brain, even though our knowledge about brain functions is still incomplete, yet steadily increasing. Learning here is a two-way process: computing learns from neuroscience, while neuroscience adopts information-processing models, and this process iterates, as discussed in [2–4].

Deep learning is based on artificial neural networks resembling neural networks of the brain, processing huge amounts of (labelled) data on high-performance GPUs (graphical processing units) with a parallel architecture. It is (typically supervised) machine learning from examples. It is static, based on the assumption that the world behaves in a similar way and that the domain of application is close to the domain from which training data are obtained. However impressive and successful, deep-learning intelligence has an Achilles heel, and that is the lack of common-sense reasoning [5–7]. Its recognition of pictures is based on pixels, and small changes, even invisible to humans, can confuse deep-learning algorithms, leading to very surprising errors. Bengio [5] therefore points out that deep learning is missing the capabilities of out-of-distribution generalization and compositionality.
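The fragility to pixel-level changes can be illustrated with a deliberately simple stand-in for a deep network: a linear classifier over raw "pixels". The classifier, sizes, and numbers below are invented for illustration, not a real DL system. A small per-pixel perturbation, chosen against the model's weights in the style of the fast gradient sign method, is just enough to flip the predicted label:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "pixel classifier": label = sign(w . x).
n = 784                      # e.g., a flattened 28x28 image
w = rng.normal(size=n)       # fixed "learned" weights
x = rng.normal(size=n)       # an input "image"

score = w @ x
label = np.sign(score)

# FGSM-style perturbation: nudge every pixel against the score,
# with eps chosen just large enough to flip the sign of the score.
eps = 2 * abs(score) / np.abs(w).sum()
x_adv = x - label * eps * np.sign(w)

print(label, np.sign(w @ x_adv))  # the two labels differ
print(eps)                        # per-pixel change, small vs. pixel scale ~1
```

Since `w @ x_adv = score - label * eps * np.abs(w).sum() = -score`, the label always flips, while each pixel moves by only `eps`. Real adversarial examples against deep networks work analogously, using the network's gradient instead of fixed linear weights.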

Human intelligence has two distinct mechanisms of learning according to Kahneman [8]: quick, bottom-up, from data to patterns (System 1), and slow, top-down, from language to objects (System 2); both had been recognized and analyzed earlier [8–10]. The starting point of old AI (GOFAI) was System 2: symbolic, language- and logic-based reasoning, planning, and decision making. However, it lacked System 1, so it faced the symbol-grounding problem, as its mappings were always in the space of representations, from symbols to symbols, and never to the physical world itself.

Now deep learning has grounding for its symbols in the observed/collected/measured data, but it lacks the System 2 capabilities of generalization/symbol generation, symbol manipulation, and language that are necessary in order to get to human-level intelligence and the ability of learning and meta-learning, that is, learning to learn. The step from (big) data-based System 1 to the manipulation of a few concepts, as in the high-level reasoning of System 2, is suggested in [5] to proceed via the concepts of agency, attention, and causality. It is expected that the 'agent perspective' will help to put constraints on the learned representations and so to encapsulate causal variables and affordances. Bengio proposes that "the agent perspective on representation learning should facilitate re-use of learned components in novel ways (..), enabling more powerful forms of compositional generalization, i.e., out-of-distribution generalization based on the hypothesis of localized (in time, space, and concept space) changes in the environment due to interventions of agents" [5].

This step, from System 1 (the present state of DL) to System 2 (higher-level cognition), will open new and even more powerful possibilities for AI. It is not a development into the unknown, as some of the System 2 side was proposed earlier by GOFAI, and it is addressed in new developments in cognitive science and neuroscience. In this article, we will focus on the modelling of System 1 and its connections to System 2 within the framework of a computational model of cognition based on natural info-computation [3,4].

It should be acknowledged that the insight about the necessity of linking connectionism and symbolism is old; already in 1990, Minsky formulated the link in his "Logical vs. Analogical or Symbolic vs. Connectionist or Neat vs. Scruffy" [11]. For more recent reflections on the topic, see [12–14].

The article is structured as follows. After the introduction, learning about the world through agency is presented. Learning in the computing nature, including learning in the evolutionary perspective, is outlined in the subsequent section. We then address learning as computation in networks of agents, and info-computational learning by morphological computation. Learning to learn from raw data and up—agency from System 1 to System 2 is the last topic investigated. Conclusions and future work close the article.

#### **2. Learning about the World through Agency**

The framework for the discussion is the computing nature in the form of info-computationalism. It takes the world (*Umwelt*) for an agent to be information [15] with its dynamics seen as computation [16]. Information is observer relative and so is computation [17–19].

When discussing cognition as an embodied bioinformatic process, we use the notion of agent, i.e., a system able to act on its own behalf, pursuing an intrinsic goal [17,20]. Agency in biological systems, in the sense used here, has been explored in [21,22], where arguments are provided that the world as it appears to an agent depends on the type of agent and the type of interaction through which the agent acquires information about the world [17]. Agents communicate by exchanging messages (information), which helps them coordinate their actions based on the information they possess and then share through social cognition.

For something to be information, there must exist an agent for whom that is established as a "difference that makes a difference" [23]. When we argue that the fabric of the world is informational [24], the question can be asked: Who/what is the agent in that context? An agent can be seen as interacting with the points of inhomogeneity (differences that make a difference/data as atoms of information), establishing the connections between those data and the data that constitute the agent itself (a particle, a system). There are myriads of agents for which information of the world makes differences—from elementary particles to molecules, cells, organisms, societies ... —all of them interact and exchange information on different levels of scale, and these information dynamics are natural computation [25,26].

Our definition of agency and cognition as a property of all living organisms builds on Maturana and Varela [27,28], and Stewart [29]. The question relevant for AI is how artifactual agents should be built in order to possess cognition and eventually even consciousness. Is it possible at all, given that cognition in living organisms is a deeply biologically rooted process? Along with reasoning, language is considered a high-level cognitive activity that only humans are capable of. Increasing levels of cognition evolutionarily developed in living organisms, starting from basic automatic behaviors, such as those found in organisms from bacteria [30–33] to insects, up to the increasingly complex behavior of complex multicellular life forms such as mammals [34]. Can AI "jump over" evolutionary steps in the development of cognition to reach, and even exceed, human intelligence?

While the idea that cognition is a biological process in all living organisms has been extensively discussed [27–29], it is not clear on which basis cognitive processes in all kinds of organisms would be accompanied by (some kind of, some degree of) consciousness. Consciousness is, according to Bengio [7], characteristic of System 2: "We closely associate conscious processing to Kahneman's system 2 cognitive abilities [Kahneman, 2011]." Bengio adopts Baars' global workspace theory of consciousness [35]. In the process of learning, and learning to learn, consciousness plays an important role through the process of attention, which selects only a tiny subset of the information/data to be processed, instead of indiscriminately processing huge amounts of data, which is expensive in terms of response time and energy [7].
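Attention's role of selecting a tiny subset of the available information can be sketched with a standard softmax-attention readout. The eight "workspace" items and their relevance scores below are invented toy numbers for illustration, not a model of Bengio's proposal: almost all of the processing weight lands on the single most relevant item.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())   # numerically stable softmax
    return e / e.sum()

# Hypothetical "workspace" of 8 items, each carrying some content value.
values = np.arange(8.0)                                       # stand-in content
scores = np.array([0.1, 0.3, 4.0, 0.2, 0.0, 0.5, 0.1, 0.2])  # relevance scores

weights = softmax(scores)     # attention distribution over the items
readout = weights @ values    # weighted readout for further processing

print(weights.round(3))       # almost all mass on item 2
print(round(readout, 2))      # readout stays close to item 2's value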

If we, in parallel with "minimal cognition" [36], search for "minimal consciousness" in an organism, what would that be? Opinions are divided at what point in the evolution one can say that consciousness emerged. Some, such as Liljenström and Århem, would suggest that only humans possess consciousness, while others are ready to recognize consciousness in animals with emotions [37,38]. From the info-computational point of view, it has been argued that cognitive agents with nervous systems are the step in evolution which first enabled consciousness in the sense of an internal model with the ability to distinguish the "self" from the "other" and to provide a representation of "reality" for an agent based on that distinction [4,39].

#### **3. Learning in the Computing Nature**

For naturalists, nature is the only reality [40]. Nature is described through its structures, processes, and relationships, using a scientific approach [41,42]. Naturalism studies the evolution of the entire natural world, including the life and development of humans and humanity as a part of nature. Social and cultural phenomena are studied through their physical manifestations. An example of a contemporary naturalist approach is the research field of social cognition, with its network-based studies of social behaviors. Turing already emphasized the social character of learning [43], which was later elaborated on by Minsky [44] and Dennett [45].

Computational naturalism (pancomputationalism, naturalist computationalism, computing nature) [46–48], see also [3,4], is the view that the entirety of nature is a huge network of computational processes, which, according to physical laws, computes (dynamically develops) its own next state from the current one. Among prominent representatives of this approach are Zuse, Fredkin, Wolfram, Chaitin, and Lloyd, who proposed different varieties of computational naturalism. According to the idea of computing nature, one can view the time development (dynamics) of physical states as information processing (natural computation). Such processes include self-assembly, self-organization, developmental processes, gene regulation networks, gene assembly, protein–protein interaction networks, biological transport networks, social computing, evolution, and similar processes of morphogenesis (creation of form). The idea of computing nature and the relationships between the two basic concepts of information and computation are explored in [17–19,25].

In the computing nature, cognition is a natural process, seen as a result of natural bio-chemical processes. All living organisms possess some degree of cognition, and for the simplest ones, like bacteria, cognition consists in metabolism and (our addition) locomotion [17]. This "degree" is not meant as a continuous function, but as a qualitative characterization that cognitive capacities increase from the simplest to the most complex organisms. The process of interaction with the environment causes changes in the informational structures that correspond to the body of an agent and its control mechanisms, which define its future interactions with the world and its inner information processing [49]. Informational structures of an agent become semantic information (i.e., get explicit metacognitive meaning through System 2, which generates metacognition for an agent) only in the case of highly intelligent agents capable of reasoning, which we know some birds are.

Recently, empirical studies have revealed an unexpected richness of cognitive behaviors (perception, information processing, memory, decision making) in organisms as simple as bacteria [30–33]. Single bacteria are small, typically 0.5–5.0 micrometers in length, and interact only with their immediate environment. Their life as specific single organisms is too short to memorize a significant amount of data. Biologically, bacteria are immortal at the level of the colony, as the two daughter bacteria from the cell division of a parent bacterium are considered two new individuals. Thus bacterial colonies, swarms, and films, which extend over a bigger space and can survive a longer time, have longer memory and exhibit an unanticipated complexity of behaviors that can undoubtedly be characterized as cognition [50,51], see also [45]. Even more fascinating cases are still simpler agents like viruses, on the border of the living, which are based on the simple principle that the most viable versions persist and multiply while others vanish [52,53]. Memory and learning are the key competences of living organisms [50], and in the simplest case, memory is based on the change of shape [54], which appears on different scales and levels of organization [55]. Fields and Levin add an evolutionary perspective to the characterization of memory and argue that the "genome is only one of several multi-generational biological memories" [56]. Additionally, the cytoplasm and cell membrane, which characterize all of life on an evolutionary timescale, preserve memory [56]. Because of the complex structure of the cell, biological memory cannot be understood at one particular scale; information is propagated and preserved in non-genomic cellular structures, which changes the current understanding of biological memory [55,56]. It also forms at different time scales [57].

Starting with bacteria and archaea [58], all organisms without nervous systems cognize, that is, perceive their environment, process information, learn, memorize, and communicate. As they are natural information processors, some, such as slime molds (multinucleate or multicellular Amoebozoa), have been used as natural computers/information processors to compute shortest paths. Even plants cognize, in spite of often being thought of as living systems without cognitive capacities [59]. Plants have been found to possess memory (in their bodily structures that change as a result of past events), the ability to learn (plasticity, the ability to adapt through morphodynamics), and the capacity to anticipate and direct their behavior accordingly. Plants are also argued to possess rudimentary forms of knowledge, according to [60] (p. 121), [61] (p. 7), and [34] (p. 61).
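The task these organisms have been shown to solve, finding the shortest route between food sources in a maze, is the classic shortest-path problem of conventional computing. A minimal algorithmic counterpart is Dijkstra's algorithm; the maze graph and node names below are invented for illustration, not data from a slime-mold experiment:

```python
import heapq

def dijkstra(graph, source):
    """Shortest-path distances from source in a weighted graph
    given as {node: [(neighbor, weight), ...]}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale queue entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# A hypothetical maze with two food sources and two junctions.
maze = {
    "food_A": [("j1", 1.0), ("j2", 4.0)],
    "j1": [("food_A", 1.0), ("j2", 1.0), ("food_B", 5.0)],
    "j2": [("food_A", 4.0), ("j1", 1.0), ("food_B", 1.0)],
    "food_B": [("j1", 5.0), ("j2", 1.0)],
}

print(dijkstra(maze, "food_A")["food_B"])  # 3.0, via j1 and j2
```

The contrast is instructive: the algorithm explores the graph symbolically, while Physarum arrives at a comparable route through morphological computation, its body reinforcing the tubes along efficient paths and withdrawing from the rest.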

Consequently, in this article we take basic cognition to be the totality of processes of self-generation/self-organization, self-regulation, and self-maintenance that enable organisms to survive by processing information from the environment. The understanding of cognition as it appears in degrees of complexity in living nature can help us better understand the step between inanimate and animate matter, from the first autocatalytic chemical reactions to the first autopoietic proto-cells, as well as the evolution of life and learning.
